LPG crisis: Madhya Pradesh traders body seeks limited gas supply to prevent shutdown of hotels, eateries
The Economic Times· 2026-03-14 05:51
Core Viewpoint
- The Confederation of All India Traders (CAIT) has urged the Madhya Pradesh administration to restore a limited supply of commercial LPG cylinders to prevent the complete shutdown of the hospitality and food service sector in Bhopal, which could threaten thousands of livelihoods and disrupt the food supply [9].

Industry Impact
- The ongoing crisis in West Asia has led to a shortage of LPG in the country, significantly impacting the hospitality sector [4][9].
- If the LPG supply is not restored within days, the entire hospitality and food service sector in Bhopal may face closure, affecting daily food services for citizens [5][9].

Alternative Solutions
- Business owners have ordered induction-based cooking systems as an alternative; however, these systems may take seven to ten days to arrive, and restaurants would need to increase their power capacity to operate them, making this option impractical [6][9].
- The suggestion to switch to wood-fire cooking is deemed unfeasible due to safety and hygiene concerns and the design of modern gas-based kitchens [7][9].

Demand for Essential Services
- CAIT has requested that the food service sector be treated as an essential service, similar to its status during the COVID-19 pandemic, and that it receive limited relief in LPG supply so that the city's food supply system is not disrupted [2][8][9].
DeepSeek Releases Next-Generation Technology, with a Peking University Intern Making a Key Contribution
36Ke· 2026-02-27 09:09
Core Insights
- DeepSeek has introduced a new inference system called DualPath, addressing the I/O bottleneck that current large language models face in intelligent agent applications [1][3][19]
- The DualPath system significantly enhances throughput by implementing a dual-path loading mechanism, effectively eliminating KV cache I/O overhead [1][13]

Group 1: System Innovation
- DualPath opens a new channel from storage directly to the decoding engine, allowing KV cache to be loaded into the decoding engine and efficiently transmitted to the prefilling end via RDMA [1][5]
- The system achieves a maximum offline inference throughput increase of 1.87 times and an average online service throughput increase of 1.96 times in real intelligent agent workload tests [1][13][17]

Group 2: Technical Components
- DualPath consists of three core components: the inference engine, traffic manager, and request scheduler, which work together to optimize data movement and resource utilization [6][7]
- The traffic manager ensures that KV cache traffic does not interfere with latency-sensitive model collective communications, using a compute-network-centric traffic management strategy [11][12]

Group 3: Performance Validation
- Experiments on a GPU server cluster connected via InfiniBand demonstrated that DualPath achieves up to 1.87 times acceleration over baseline inference frameworks, indicating that KV cache I/O overhead has been largely eliminated [13][15]
- The system has been validated for scalability, achieving near-linear scaling from 2P4D (2,000 agents) to 48P96D (48,000 agents) while maintaining consistent task completion times [17][18]

Group 4: Future Directions
- The research team acknowledges the need for more adaptive and flexible configurations for parallelism and P/D ratios in future work, suggesting the potential for simulator-based or online adjustment mechanisms [19]
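The dual-path loading mechanism described above can be illustrated with a small routing sketch: each KV-cache load is sent down whichever path (the traditional Storage-to-Prefill route or the new Storage-to-Decode route) would finish it soonest, so decode-side storage NICs absorb load the prefill NIC would otherwise carry alone. This is a minimal sketch of the idea only; the class and function names, bandwidth figures, and tie-breaking behavior are illustrative assumptions, not details from the paper.

```python
from dataclasses import dataclass


@dataclass
class Path:
    """One loading route for KV-cache blocks (names are illustrative)."""
    name: str
    bandwidth_gbps: float  # storage-NIC bandwidth available on this route
    queued_gb: float = 0.0  # data already queued on this route

    def eta_s(self, size_gb: float) -> float:
        """Estimated seconds to finish `size_gb` given the current queue."""
        return (self.queued_gb + size_gb) / self.bandwidth_gbps


def schedule_load(paths: list[Path], size_gb: float) -> Path:
    """Route a KV-cache load down whichever path would finish it soonest.

    With only a Storage-to-Prefill path the prefill NIC saturates; adding a
    Storage-to-Decode path lets otherwise idle decode-side NICs share the
    load, which is the bandwidth-pooling idea the DualPath bullets describe.
    """
    best = min(paths, key=lambda p: p.eta_s(size_gb))
    best.queued_gb += size_gb
    return best


if __name__ == "__main__":
    prefill = Path("storage-to-prefill", bandwidth_gbps=25.0)
    decode = Path("storage-to-decode", bandwidth_gbps=25.0)
    # Ten 50 GB KV-cache loads spread across both routes instead of
    # piling up on the prefill NIC alone.
    routes = [schedule_load([prefill, decode], 50.0).name for _ in range(10)]
    print(routes.count("storage-to-prefill"), routes.count("storage-to-decode"))
```

With equal hypothetical bandwidths the loads end up split evenly across the two routes; in the real system the scheduler would additionally account for RDMA transfer cost and interference with collective communications.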
New Updates from DeepSeek!
Mei Ri Jing Ji Xin Wen· 2026-02-27 09:06
Core Insights
- The article discusses a new research paper co-authored by DeepSeek, Peking University, and Tsinghua University, focusing on optimizing inference speed for large language models (LLMs) in AI agents [3][4]
- The paper introduces an innovative inference system called DualPath, which enhances LLM performance through a "dual-path reading KV-Cache" mechanism, yielding a 1.87 times increase in offline inference throughput and a 1.96 times increase in the number of AI agent operations per second [3][4]

Group 1
- The transition of large models from single-turn dialogue systems to intelligent agent systems capable of multi-turn interactions is highlighted, necessitating a significant change in inference workloads [3]
- Existing systems face bottlenecks due to bandwidth limitations: the preprocessing engine monopolizes the network bandwidth, leaving the content generation engine underutilized [4]
- DualPath addresses this issue by redesigning the KV-Cache loading logic, using idle bandwidth resources to significantly enhance speed [4]

Group 2
- DeepSeek's approach to performance optimization is noted as a response to hardware limitations, with some industry professionals viewing it as a less innovative but necessary step [5]
- There are ongoing rumors regarding the release timeline of DeepSeek V4, with speculation ranging from February to March, and recent reports indicate testing of a new model called "Sealion-lite" with a context window of 1 million tokens [5]
- DeepSeek has provided early access to the updated V4 version to domestic manufacturers such as Huawei, while competitors like NVIDIA have not received similar access [5]

Group 3
- User feedback indicates a perceived decline in DeepSeek's empathetic communication style, with recent updates leading to a more rigid interaction approach [6]
- The competitive landscape for AI assistants in China is intensifying, with major players like ByteDance, Baidu, and Alibaba rapidly iterating their products, alongside pressure from international competitors like ChatGPT and Claude [6]
DeepSeek Publishes New Paper Jointly with Peking University and Tsinghua University
Cai Jing Wang· 2026-02-27 08:04
Core Insights
- The article discusses a new academic paper released by the DeepSeek team in collaboration with Peking University and Tsinghua University, focusing on inference speed optimization for large language models (LLMs) [1]

Group 1: Innovation and Technology
- The paper introduces an innovative inference system named DualPath, specifically designed to enhance the inference performance of LLMs under agent workloads [1]
- The DualPath system implements a "dual-path reading KV-Cache" mechanism, which reallocates storage network load [1]

Group 2: Performance Improvements
- The offline inference throughput is reported to have increased by up to 1.87 times [1]
- The average number of agent operations per second for online services has improved by 1.96 times [1]
Another New Paper from DeepSeek
Di Yi Cai Jing Zi Xun· 2026-02-27 07:58
Core Viewpoint
- The DeepSeek team has released a new academic paper focusing on optimizing inference speed for large language models (LLMs), which is crucial for the practical application of AI agents [4][5].

Group 1: Research and Innovation
- The paper, co-authored with Peking University and Tsinghua University, introduces an innovative inference system called DualPath, designed to enhance the performance of LLMs under agent workloads [4].
- The DualPath system employs a "dual-path reading KV-Cache" mechanism that redistributes storage network load, yielding an offline inference throughput increase of 1.87 times and an average increase of 1.96 times in the number of agent operations per second for online services [4][5].

Group 2: Industry Context and Expectations
- DualPath addresses the significant changes in inference workloads as LLMs evolve from simple dialogue systems to complex agent systems capable of multi-turn interactions, which can reach dozens or even hundreds of rounds [4].
- There is growing expectation for the release of DeepSeek's next flagship model, DeepSeek V4, with rumors about its launch timeline ranging from early February to March [6].
- Recent leaks suggest that DeepSeek is testing a V4 Lite model, codenamed "Sealion-lite," which supports a context window of 1 million tokens and native multimodal inference [6].

Group 3: Market Reactions and Concerns
- Despite the technical advances presented in the paper, there is a sentiment in the industry that such optimizations are a necessity driven by GPU shortages, with some viewing them as "dirty work" rather than innovation [5].
- Investment institutions have raised concerns that the release of the new model could trigger significant market volatility, similar to the previous year's model launch [6].
Another New Paper from DeepSeek! Is the New V4 Model Closer?
Di Yi Cai Jing· 2026-02-27 07:01
Core Insights
- The paper introduces an innovative inference system called DualPath, aimed at optimizing the inference performance of large language models (LLMs) under agent workloads and significantly enhancing efficiency in AI applications [3][4]
- The DualPath system improves offline inference throughput by 1.87 times and increases the average number of agent operations per second in online services by 1.96 times [3]

Group 1: Technological Advancements
- The introduction of a "dual-path reading KV-Cache" mechanism reallocates storage network load, addressing the core issue of speed being hindered by data reading during agent tasks [4]
- The shift from traditional human-LLM interaction to human-LLM-environment interaction necessitates a transformation in inference workloads, as multiple rounds of interaction can accumulate extensive context [3]

Group 2: Market Reactions and Expectations
- Opinions within the industry are mixed regarding DeepSeek's optimization efforts: some view them as a necessary response to hardware limitations, while others see value in cost reduction for broader AI adoption [5]
- Speculation around the release of DeepSeek's next flagship model, V4, has generated significant market interest, with various timelines being discussed, from early February to March [5][6]
- DeepSeek has not publicly commented on the rumors surrounding the V4 model, leading to heightened anticipation and concern among investors about potential market volatility upon its release [6]
[Rally Explained] Huawei Supply Chain: Huawei Doubles Down on AI Coding, DeepSeek Expected to Be First to Adapt to Domestic Chips, and Ascend Poised to Become the "Second Choice" for AI Computing Power
Xuan Gu Bao· 2026-02-27 03:12
Market Performance
- On February 27, Huawei's supply chain saw significant gains, with Huasheng Tiancai achieving two consecutive daily limit-ups, and stocks like Geer Software, Xinjun Network, and Tuowei Information hitting the daily limit [1].

Event: Huawei's AI Product Launch
- On February 26, Huawei officially released the public beta of its cloud coding solution, integrating a large model, IDE, and autonomous development mode, covering various AI coding technologies and incorporating GLM-5.0, DeepSeek-V3.2, and Huawei's self-developed models, including a HarmonyOS-specific model [4].
- The DeepSeek V4 Lite model demonstrated significant improvements in testing, supporting a 1M context window and native multimodal capabilities, with initial SVG examples widely disseminated and the model currently being tested by Huawei and other chip manufacturers [4].
- Huawei's Chairman Liang Hua stated that 43 mainstream large models are based on Ascend pre-training, with over 200 open-source models adapted to the Ascend ecosystem, facilitating the implementation of more than 6,000 solutions [4].

Institutional Insights
- AI programming is reshaping core productivity methods, with large-model core technologies empowering programming tools. Automated programming and code generation through AI coding enhance software development efficiency and automation levels [5].
- The value of AI programming lies in improving software development efficiency and quality, lowering technical barriers, and accelerating project iteration cycles. The programming capabilities of large models have advanced significantly, with the Claude and GPT series leading in code generation and deployment [5].
- The global AI code tool market is projected at $6.1 billion in 2024 and is expected to reach $26 billion by 2030 [5].
- A trend of deep integration between domestic AI model companies and domestic AI chip enterprises is emerging, with Huawei's Ascend achieving Day 0 support for DeepSeek-V3.2-Exp, completing adaptation and deployment using the vLLM/SGLang inference frameworks [5].
- Huawei plans to launch three series of Ascend chips (Ascend 950PR/950DT, Ascend 960, and Ascend 970) over the next three years, aiming to double computing power each year [5].
- Huawei's 384 super node has surpassed NVIDIA's flagship product GB200 NVL72 in several key metrics [6].
- Huawei's CANN is fully open-sourced, collaborating with the upstream and downstream of the supply chain to build an ecosystem, with the CANN computing architecture comparable to NVIDIA's CUDA core software layer [6].
Breaking with Convention! DeepSeek V4 to Prioritize Adaptation to Domestic Chips; Cloud Computing ETF (159890) Rallies Intraday, Attracting Over 66 Million CNY in Inflows
Sou Hu Cai Jing· 2026-02-27 02:46
Core Viewpoint
- The rise of domestic computing power chains is significantly impacting the cloud computing sector, with the cloud computing ETF (159890) experiencing notable gains and increased investor interest [1]

Group 1: Market Performance
- The cloud computing ETF (159890) opened with a rise of over 1% and is currently up 0.74%, indicating strong performance among its constituent stocks [1]
- Notable stocks include Tuowei Information, which hit the daily limit, and Yuntian Lifei, which surged 13%, while several others, such as Wangsu Science and Technology and Runhe Software, rose over 6% [1]
- The ETF has seen a net inflow of over 66 million CNY during the trading session, with cumulative net subscriptions of approximately 58.9 million CNY over the past five days, breaking previous scale records [1]

Group 2: Industry Developments
- DeepSeek is set to release its V4 model, which prioritizes compatibility with domestic chips, marking an industry shift towards domestic computing power solutions [3]
- This move is expected to advance the domestic computing power ecosystem from mere usability to large-scale commercial application, addressing the previous limitations of domestic chips [3]

Group 3: Demand Insights
- A report from CITIC Securities indicates a significant surge in the token usage of domestic large models, reflecting exponential growth in AI inference demand [4]
- The report highlights that the top three large models by token usage are all domestic, showcasing the competitive edge of domestic computing power due to cost advantages and an improving ecosystem [4]
- The computing power market is also experiencing price increases, indicating supply bottlenecks and suggesting ongoing benefits for the computing power industry [4]

Group 4: ETF Composition
- The cloud computing ETF (159890) tracks the CSI Cloud Computing and Big Data Theme Index, with a focus on both AI computing power (41%) and AI applications (32%) [5]
- The top ten holdings include leading companies such as iFLYTEK, Kingsoft Office, and Inspur Information, positioning the ETF to benefit from the AI infrastructure wave [5][6]
- The current trend led by DeepSeek in reshaping the domestic computing power ecosystem is expected to drive the industry from passive adaptation to active definition, presenting investment opportunities in the synergy between domestic computing power and AI applications [6]
DeepSeek's New Paper Previews V4's New Framework: Idle NICs Accelerate Agent Inference Performance and Break the Prefill-Decode (PD) Separation Bottleneck
36Ke· 2026-02-27 02:29
Core Insights
- A new reasoning framework for agents called DualPath has been introduced, which addresses I/O bottlenecks in long-text reasoning scenarios by optimizing the speed of loading KV-Cache from external storage [1][3].

Group 1: DualPath Framework
- DualPath changes the traditional Storage-to-Prefill loading mode by introducing a second path, Storage-to-Decode, allowing for more efficient data handling [3][6].
- The framework utilizes idle storage network interface card (SNIC) bandwidth on the decoding engine (DE) to read caches and employs the high-speed computing network (RDMA) to transfer data to the prefill engine (PE), achieving global pooling of storage bandwidth and dynamic load balancing [3][13].

Group 2: Performance Improvements
- In tests with a production-level model of 660 billion parameters, DualPath increased offline inference throughput by a remarkable 1.87 times and average online service throughput by 1.96 times [3][14].
- The framework significantly optimizes time to first token (TTFT) under high load while maintaining a stable token generation speed (TPOT) [5][14].

Group 3: Technical Innovations
- DualPath allows KV-Cache to be loaded into the decoding engine first and then transmitted to the prefill engine, alleviating bandwidth pressure on the prefill side [7][9].
- The architecture includes a central scheduler that dynamically allocates tasks based on I/O pressure and computational load, preventing congestion on any single network interface or computational resource [14][18].

Group 4: Research and Development
- The first author of the paper, Wu Yongtong, is a PhD student at Peking University focusing on system software and large-model infrastructure, particularly the optimization of inference systems for large-scale deployment [15][16].
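The bandwidth argument behind the Storage-to-Decode relay described above can be made concrete with a back-of-the-envelope timing model: when half the KV-Cache blocks are routed through the decode engine's otherwise idle SNIC and forwarded over a much faster compute network, total load time is bounded by the slower of the two parallel routes. This is a simplified sketch under assumed numbers (25 Gbps storage NICs, a 400 Gbps RDMA fabric); the function names and the even 50/50 split are illustrative, not taken from the paper.

```python
def load_time_single_path(n_blocks: int, block_gb: float,
                          prefill_snic_gbps: float) -> float:
    """All KV-Cache blocks go Storage -> Prefill over one storage NIC."""
    return n_blocks * block_gb / prefill_snic_gbps


def load_time_dual_path(n_blocks: int, block_gb: float,
                        prefill_snic_gbps: float,
                        decode_snic_gbps: float,
                        rdma_gbps: float) -> float:
    """Half the blocks take the relay route: Storage -> Decode -> RDMA -> Prefill.

    The relay route is limited by the slower of its two links (decode SNIC
    vs. RDMA fabric); the two halves proceed in parallel, so total load time
    is the maximum of the two routes rather than their sum.
    """
    direct = (n_blocks / 2) * block_gb / prefill_snic_gbps
    relay_bw = min(decode_snic_gbps, rdma_gbps)
    relayed = (n_blocks / 2) * block_gb / relay_bw
    return max(direct, relayed)


if __name__ == "__main__":
    # Eight 50 GB KV-Cache loads, 25 Gbps storage NICs, 400 Gbps RDMA fabric
    # (hypothetical figures). The relay path roughly halves total load time,
    # which is consistent in spirit with the ~1.9x throughput gains reported.
    single = load_time_single_path(8, 50.0, 25.0)
    dual = load_time_dual_path(8, 50.0, 25.0, 25.0, 400.0)
    print(single, dual)  # dual-path finishes in about half the time
```

Because the RDMA fabric is far faster than the storage NICs under these assumptions, the relay route's bottleneck is the decode-side SNIC, so pooling the two equal SNICs halves the load time; the real scheduler would split traffic dynamically rather than 50/50.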
DeepSeek, Moonshot AI, and MiniMax Accused of "Illegal Extraction": Did They Do Anything Wrong? | 电厂
Xin Lang Cai Jing· 2026-02-25 10:47
Core Viewpoint
- Anthropic has accused three Chinese AI companies (DeepSeek, Moonshot, and MiniMax) of illicitly extracting data from its model Claude, marking the second controversy involving domestic models within three months [1][9].

Group 1: Allegations and Responses
- Anthropic claims that the three Chinese companies used approximately 24,000 fraudulent accounts to interact with Claude over 16 million times, using these interactions to enhance their own models [1][4].
- The accused companies have remained silent regarding the allegations, with no public response from DeepSeek, MiniMax, or Moonshot [1].
- Anthropic's statement highlighted that the interaction patterns with Claude were abnormal, indicating intentional extraction of Claude's unique capabilities [7].

Group 2: Technical Aspects of Distillation
- The technique attributed to the accused companies is known as "distillation," in which a model learns from a "teacher model" like Claude by interacting with it [4][6].
- Distillation is a common method for rapidly evolving models, enabling smaller models to approximate the performance of larger ones with less data [6].
- Major AI companies, including OpenAI and Google, have included clauses in their usage agreements prohibiting distillation, reflecting growing concern over intellectual property [9].

Group 3: Legal and Ethical Considerations
- The ongoing debate over model distillation raises questions across legal domains, including contract law, copyright law, and unfair competition [10].
- Both Chinese and American companies train on vast amounts of internet data, prompting discussions about authorization and the ethical use of such data [10].
- The narrative of "Chinese companies distilling American models" has become a one-sided discourse, with the potential for a prolonged public relations battle [10].

Group 4: Open Source vs. Closed Source Models
- Many leading Chinese models operate under open-source licenses that permit distillation, in contrast to the closed-source models that prohibit such practices [10][13].
- For instance, DeepSeek's models are released under the MIT license, allowing academic and commercial use, while other models like MiniMax and Qwen3 follow the Apache 2.0 license [10].
- The controversy over distillation also highlights the ongoing debate between open-source and closed-source development paths in the AI industry [13].
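The "teacher model" learning described above follows the classic knowledge-distillation objective: the student is trained to match the teacher's temperature-softened output distribution via a KL-divergence loss. The sketch below shows that logit-based form for clarity; API-based "extraction" as alleged here only sees sampled teacher outputs, not logits, but the learning signal is the same idea. All numbers and names are illustrative.

```python
import math


def softmax(logits: list[float], temperature: float = 1.0) -> list[float]:
    """Temperature-scaled softmax; a higher temperature gives softer targets."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits: list[float],
                      student_logits: list[float],
                      temperature: float = 2.0) -> float:
    """KL(teacher || student) on temperature-softened distributions.

    Minimizing this over many prompts pulls the student's predictions toward
    the teacher's, which is how a smaller model approximates a larger one
    with relatively little data.
    """
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return sum(p * math.log(p / q) for p, q in zip(t, s) if p > 0)


if __name__ == "__main__":
    teacher = [4.0, 1.0, 0.5]       # hypothetical teacher logits for 3 tokens
    aligned = [4.1, 0.9, 0.6]       # student that already tracks the teacher
    uninformed = [0.0, 0.0, 0.0]    # student with a uniform distribution
    # The aligned student incurs a much smaller distillation loss.
    print(distillation_loss(teacher, aligned),
          distillation_loss(teacher, uninformed))
```

A perfectly matched student gives a loss of exactly zero, and the loss grows as the student's distribution drifts from the teacher's, which is why usage agreements that forbid distillation target exactly this kind of large-scale querying.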