Sparse Attention Mechanism
Domestic Chip Makers Rush to Claim Support for the New DeepSeek Release
21世纪经济报道· 2025-10-01 15:00
Core Viewpoint
- The release of the DeepSeek-V3.2-Exp model by DeepSeek marks a significant advancement for the domestic AI chip ecosystem, showcasing a collaborative effort among domestic chip manufacturers [1][4][7]

Group 1: Model Release and Features
- DeepSeek-V3.2-Exp introduces DeepSeek Sparse Attention, which significantly reduces computational resource consumption and enhances inference efficiency [1][7]
- The new model has been accompanied by a 50% to 75% price reduction for API services across DeepSeek's platforms [1]
- The release prompted immediate recognition and adaptation from several domestic chip manufacturers, including Cambricon, Huawei, and Haiguang [2][4]

Group 2: Industry Response and Ecosystem Development
- Cambricon was the first to announce compatibility with DeepSeek-V3.2-Exp, followed by Huawei and Haiguang, indicating a rapid response from the industry [2][4]
- The consensus within the domestic AI industry around DeepSeek's models has enabled the company to take the lead in defining standards for domestic chips [4][7]
- The rapid adaptation of DeepSeek's models by multiple manufacturers suggests a growing synergy within the domestic AI hardware and software ecosystem [9]

Group 3: Future Implications
- Experts attribute the swift development of domestic chips expected by 2025 to the emergence of DeepSeek as a key player in the industry [4][5]
- Collaborative efforts among domestic companies to adapt to DeepSeek's standards may accelerate the growth of the AI chip ecosystem in China [4][9]
- The advances DeepSeek has made in a short time frame highlight the potential for rapid evolution in the domestic AI landscape, in contrast with the decades NVIDIA spent building its ecosystem [9]
DeepSeek and Domestic Chips Begin a "Two-Way Embrace"
Core Insights
- DeepSeek has released the DeepSeek-V3.2-Exp model, introducing a sparse attention mechanism that significantly reduces computational resource consumption and enhances inference efficiency [1]
- The new model has led to a 50% to 75% price reduction for API services [1]
- The release prompted immediate recognition and adaptation from several domestic chip manufacturers, indicating a growing synergy within the domestic AI hardware and software ecosystem [1][2]

Group 1: Model Release and Features
- The DeepSeek-V3.2-Exp model incorporates the DeepSeek Sparse Attention mechanism, optimizing training and inference efficiency for long texts [5]
- The model is compatible with CUDA and uses TileLang for rapid prototyping, aiming for higher efficiency through lower-level language implementations [5][6]
- The release of V3.2-Exp marks a significant shift from the previous version, V3.1, which received no proactive recognition from companies regarding the "UE8M0 floating-point format" [4][5]

Group 2: Industry Response and Ecosystem Development
- Within four minutes of the model's release, Cambricon announced its adaptation of DeepSeek-V3.2-Exp and open-sourced its large-model inference engine [2]
- Huawei and Haiguang quickly followed suit, demonstrating the rapid response of the domestic chip industry to the new model [2]
- The consensus within the domestic AI industry around the DeepSeek model has empowered the company to take the lead in defining standards for domestic chips [3][4]

Group 3: Competitive Landscape
- The rapid development of the domestic chip ecosystem is highlighted by the swift adaptation of major players like Tencent and Alibaba, who are actively integrating domestic chips into their cloud computing services [6]
- Experts believe the emergence of DeepSeek has accelerated the pace of domestic chip development, with expectations of significant advances by 2025 [3]
AI Daily | Another Cash-Out of Over $40 Million! Jensen Huang Keeps Trimming His NVIDIA Stake, Bullish on OpenAI as a Potential Next Trillion-Dollar Giant
美股研究社· 2025-09-30 12:06
Core Insights
- The article discusses the rapid advancements in artificial intelligence (AI) technology and their implications for investment opportunities in AI-related companies and market trends [3]

Group 1: AI Model Developments
- The latest GLM-4.6 model by Zhipu has been launched, showing a 27% improvement in coding capabilities over its predecessor GLM-4.5 and excelling in real programming tasks [5]
- DeepSeek introduced a "sparse attention" mechanism in its experimental AI model, DeepSeek-V3.2-Exp, aimed at enhancing training and inference efficiency in long contexts [5]
- Anthropic released its new AI model, Claude Sonnet 4.5, claiming it to be the "best coding model globally," with significant improvements in reliability and performance across various professional fields [6]

Group 2: Market Trends and Predictions
- OpenAI has launched an "Instant Checkout" feature in ChatGPT, allowing users to purchase items directly through the platform, initially supporting single-item purchases [7]
- NVIDIA CEO Jensen Huang sold 225,000 shares of NVIDIA stock for over $40 million while expressing confidence in AI's future, particularly in OpenAI's potential to become a trillion-dollar company [7][8]
- Huang predicts that OpenAI could achieve unprecedented growth, similar to other tech giants like Meta and Google, by offering both consumer and enterprise services [8]

Group 3: Copyright and Content Usage
- OpenAI's Sora AI video generator will default to using copyrighted content, with an opt-out option for studios, indicating a shift in content-usage policies [12]
- The company has been in discussions with talent agencies and studios about the opt-out mechanism, to ensure copyrighted characters do not appear in its AI tools [13]
DeepSeek and Domestic Chips Begin a "Two-Way Embrace"
Core Insights
- DeepSeek has released the DeepSeek-V3.2-Exp model, introducing a sparse attention mechanism that significantly reduces computational resource consumption and enhances inference efficiency [1][6]
- The new model has led to a 50% to 75% price reduction for API services [1]
- The release prompted immediate recognition and adaptation from several domestic chip manufacturers, including Huawei, Cambricon, and Haiguang, indicating a growing synergy within the domestic AI hardware and software ecosystem [2][4]

Summary by Sections

Model Release and Features
- The DeepSeek-V3.2-Exp model incorporates the DeepSeek Sparse Attention mechanism, optimizing training and inference efficiency for long texts [6]
- The model is compatible with CUDA and uses TileLang, a language designed specifically for AI operator development, for rapid prototyping [6]

Industry Response
- Cambricon was the first to claim adaptation of the new model, followed by Huawei and Haiguang, showcasing a collaborative effort among domestic manufacturers [2]
- The rapid response from these companies indicates a consensus within the domestic AI industry on the significance of the DeepSeek model [6]

Ecosystem Development
- DeepSeek is emerging as a key player in building a new ecosystem for domestic AI, with its model becoming a benchmark for open-source models in China [2][4]
- Collaboration among major internet companies like Tencent and Alibaba in adapting domestic chips further accelerates the establishment of this ecosystem [7]

Historical Context
- The previous version, DeepSeek-V3.1, received no proactive adaptation claims from companies, highlighting the significant shift in industry dynamics with the latest release [5]
- Experts attribute the rapid development of domestic chips expected by 2025 to the emergence of DeepSeek as a standard-setting entity [3]
Huawei Ascend and Cambricon Announce Adaptation of DeepSeek's Latest Model
Core Insights
- DeepSeek officially launched the DeepSeek-V3.2-Exp model on September 29, introducing the self-developed DeepSeek Sparse Attention (DSA) mechanism, which optimizes training and inference efficiency for long texts [1][7]
- The release of the new model has brought a significant reduction in service costs, with DeepSeek API prices dropping by over 50% [2][10]
- The open-sourcing of the TileLang version of the operators has garnered considerable attention within the industry [3]

Technical Innovations
- The DSA mechanism is an optimization of the Transformer architecture that addresses the computational cost of traditional dense attention, which grows quadratically with text length [6][7]
- The V3.2-Exp model achieves substantial improvements in training and inference efficiency for long texts while maintaining performance comparable to the previous V3.1-Terminus model [7]

Market Impact
- DeepSeek has made the V3.2-Exp model fully open source on platforms such as HuggingFace and ModelScope, with the related research papers also published [5]
- Collaboration with domestic hardware providers such as Huawei, Cambricon, and Haiguang demonstrates the synergy between AI software and hardware ecosystems in China [11][12]
- The adoption of TileLang, a programming language designed to simplify GPU operator development, is expected to significantly enhance the efficiency of AI operator development [12]
Huawei Ascend and Cambricon Announce Adaptation of DeepSeek's Latest Model
21世纪经济报道· 2025-09-30 10:13
Core Viewpoint
- DeepSeek has officially released the V3.2-Exp model, introducing the DeepSeek Sparse Attention (DSA) mechanism, which optimizes training and inference efficiency for long texts and has cut DeepSeek API service costs by over 50% [1][5]

Group 1: Model Development
- The V3.2-Exp model builds on the V3.1-Terminus version and incorporates the DSA mechanism, a sparse attention approach that reduces computational complexity when processing long texts [1][4]
- DSA allows adaptive selection of key attention heads and local context windows, improving efficiency and lowering costs compared with traditional dense attention mechanisms [3][4]

Group 2: Cost and Accessibility
- The introduction of the new model has significantly reduced the cost of accessing the DeepSeek API, with prices dropping by more than 50% [5]
- DeepSeek has temporarily retained additional API access for the previous V3.1-Terminus model until October 15, allowing users to conduct comparative testing [2]

Group 3: Open Source and Community Engagement
- DeepSeek has fully open-sourced the V3.2-Exp model on platforms such as HuggingFace and ModelScope, along with the related research papers [2]
- The company has also open-sourced the TileLang version of the operators, which has garnered significant attention in the industry [1][6]

Group 4: Hardware Compatibility
- Following the release of V3.2-Exp, major domestic hardware companies such as Huawei, Cambricon, and Haiguang announced compatibility with the new model, indicating collaborative development within the domestic AI ecosystem [6][10]
- TileLang, a programming language developed to simplify GPU operator development, has been recommended for use in research experiments, enhancing the efficiency of AI operator development [7][10]
DeepSeek's New Model Comes With a Price Cut: Optimized Inference Efficiency, API Prices Down Over 50%
YOUNG财经 漾财经· 2025-09-30 06:25
Core Insights
- DeepSeek has launched the new DeepSeek-V3.2-Exp model, which reduces API costs by over 50% [2][3][4]

Group 1: Model Release and Features
- The DeepSeek-V3.2-Exp model is an experimental version that builds on the previous V3.1-Terminus, introducing the DeepSeek Sparse Attention mechanism to enhance training and inference efficiency for long texts [3][4]
- Despite the introduction of the sparse attention mechanism, the new model maintains performance comparable to V3.1-Terminus across various public evaluation datasets [4]

Group 2: Cost Reduction and Pricing
- The new model has brought a substantial decrease in service costs, with API pricing dropping by more than 50%. Specific changes include input with cache hits falling from 0.5 yuan to 0.2 yuan per million tokens, input with cache misses from 4 yuan to 2 yuan per million tokens, and output from 12 yuan to 3 yuan per million tokens [4]

Group 3: Research and Development
- Development of the DeepSeek-V3.2-Exp model involved designing new GPU operators and using the TileLang programming language for rapid prototyping, which supports deeper exploration of model capabilities [4]
- DeepSeek's research on the DeepSeek-R1 model, which focuses on incentivizing reasoning capabilities in large language models through reinforcement learning, was featured on the cover of the journal Nature [7]
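The per-million-token prices quoted above make the saving easy to quantify. The sketch below is plain Python; the token mix is a made-up example, not a workload from the article:

```python
# Per-million-token prices (yuan) before and after the V3.2-Exp cut,
# as quoted in the article.
OLD = {"cache_hit": 0.5, "cache_miss": 4.0, "output": 12.0}
NEW = {"cache_hit": 0.2, "cache_miss": 2.0, "output": 3.0}

def request_cost(prices, hit_m, miss_m, output_m):
    """Cost in yuan of a workload; token counts are in millions."""
    return (prices["cache_hit"] * hit_m
            + prices["cache_miss"] * miss_m
            + prices["output"] * output_m)

# Hypothetical monthly usage: 10M cached input tokens, 5M uncached
# input tokens, 2M output tokens.
before = request_cost(OLD, 10, 5, 2)  # 0.5*10 + 4*5 + 12*2 = 49.0
after = request_cost(NEW, 10, 5, 2)   # 0.2*10 + 2*5 + 3*2 = 18.0
print(f"before: {before} yuan, after: {after} yuan, "
      f"saving: {1 - after / before:.0%}")
```

For this mix the saving works out to about 63%, consistent with the "more than 50%" headline; the exact figure depends on the cache-hit and output share of a given workload.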
DeepSeek Cuts API Prices for Its New Version; Cambricon: For the New DeepSeek Model
Core Insights
- DeepSeek has released the experimental DeepSeek-V3.2-Exp model, which introduces Sparse Attention for improved training and inference efficiency on long texts [1][2]
- The new model has brought a significant reduction in service costs, with API prices for developers dropping by over 50% [1]
- Cambricon quickly adapted to the new model and open-sourced the vLLM-MLU inference engine, allowing developers to try the new features immediately [1][2]
- Huawei Ascend has also achieved 0-day support for DeepSeek-V3.2-Exp, optimizing deployment on the CANN platform while keeping inference latency low [3]

Group 1
- DeepSeek-V3.2-Exp introduces Sparse Attention for enhanced efficiency [1]
- API costs for developers have been reduced by over 50% [1]
- Cambricon achieved day-0 adaptation of the new model [2]

Group 2
- Huawei Ascend has completed adaptation and optimization for DeepSeek-V3.2-Exp [3]
- The deployment strategy uses DeepSeek's large-scale expert-parallel (EP) scheme [3]
- On long sequences, inference keeps time to first token (TTFT) below 2 seconds and time per output token (TPOT) below 30 milliseconds [3]
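The TTFT and TPOT bounds combine into a simple end-to-end latency estimate: the first token arrives after TTFT, and each subsequent token after one TPOT interval. A minimal sketch; the 500-token reply length is an illustrative assumption, not a figure from the article:

```python
def generation_latency(ttft_s: float, tpot_s: float, n_tokens: int) -> float:
    """End-to-end latency: time to first token, plus one per-token
    interval for each of the remaining n_tokens - 1 tokens."""
    return ttft_s + tpot_s * (n_tokens - 1)

# Worst case under the reported bounds (TTFT < 2 s, TPOT < 30 ms)
# for a hypothetical 500-token reply:
worst = generation_latency(2.0, 0.030, 500)
print(f"{worst:.2f} s")  # 16.97 s
```

The same two numbers also give the steady-state throughput: at 30 ms per output token, decoding sustains roughly 33 tokens per second per request.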
DeepSeek's Latest Model Goes Live; New Attention Mechanism Builds on Peking University ACL Best Paper
36Kr · 2025-09-29 23:39
Core Insights
- DeepSeek has launched its latest experimental model, DeepSeek-V3.2-Exp, featuring a new attention mechanism called DeepSeek Sparse Attention (DSA), which improves training and inference efficiency while reducing API costs by over 50% [1][19]

Model Features
- The V3.2 model builds on DeepSeek-V3.1-Terminus and introduces DSA, achieving faster and more efficient training and inference for long contexts [3][5]
- DSA is the first key technology branded under the "DeepSeek" name and is an improvement on the Native Sparse Attention (NSA) from an earlier collaboration with Peking University [3][5]
- The DSA mechanism lets the model attend to a small subset of important tokens rather than all tokens, reducing computational complexity from O(L²) to O(Lk), where k is much smaller than L [8][10]

Performance Evaluation
- Evaluation results indicate that DeepSeek-V3.2-Exp maintains performance comparable to its predecessor, with no significant decline in effectiveness on either short- or long-text tasks [14][15]
- Specific benchmark results show that while some metrics declined slightly, others improved, indicating balanced performance across tasks [15]

Cost Efficiency
- The introduction of DSA has substantially reduced operational costs, with the API price lowered by over 50% for developers [19]
- Deployment of the model has demonstrated significant end-to-end acceleration and cost savings in inference [18]

Future Implications
- Although still an experimental model, DeepSeek-V3.2-Exp presents a promising engineering path for overcoming long-text processing challenges without sacrificing performance [18]
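The O(L²)-to-O(Lk) reduction can be illustrated with a toy top-k attention step. This is a hedged sketch in plain Python, not DeepSeek's actual DSA (which reportedly uses a learned indexer to select tokens); it only shows how restricting the softmax and weighted sum to k selected tokens shrinks the expensive part of attention from L terms to k per query:

```python
import math

def topk_sparse_attention(q, keys, values, k):
    """Single-query attention over only the top-k keys.
    A cheap score is computed for every key (a stand-in for DSA's
    indexer); the softmax and value-weighted sum then run over the
    k selected tokens instead of all L."""
    d = len(q)
    # Coarse relevance score for each of the L keys.
    scores = [sum(qi * ki for qi, ki in zip(q, key)) for key in keys]
    top = sorted(range(len(keys)), key=scores.__getitem__, reverse=True)[:k]
    # Numerically stable softmax over the k selected tokens only.
    scaled = [scores[i] / math.sqrt(d) for i in top]
    m = max(scaled)
    w = [math.exp(s - m) for s in scaled]
    z = sum(w)
    w = [x / z for x in w]
    # Weighted sum of the k selected value vectors.
    out = [0.0] * len(values[0])
    for wi, i in zip(w, top):
        for j, vj in enumerate(values[i]):
            out[j] += wi * vj
    return out

# Toy example: the query matches the first two keys, so with k=2 the
# output mixes only their values and ignores the other tokens entirely.
q = [1.0, 0.0]
keys = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0], [0.0, 1.0]]
values = [[1.0], [2.0], [100.0], [50.0]]
print(topk_sparse_attention(q, keys, values, k=2))
```

In the toy run the output lies between the selected values 1.0 and 2.0; the unselected tokens (values 100.0 and 50.0) contribute nothing, which is the sparsity at work.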
DeepSeek-V3.2-Exp Released; API Costs to Fall by Over 50%
Feng Huang Wang· 2025-09-29 14:07
Core Insights
- DeepSeek has released the V3.2-Exp model, which introduces a Sparse Attention mechanism aimed at optimizing training and inference efficiency for long texts [1]
- The official app, web version, and mini-program have all been updated to DeepSeek-V3.2-Exp, and the API price has been cut significantly [1]
- Under the new pricing policy, the cost for developers to access the DeepSeek API will decrease by over 50% [1]
- The performance of DeepSeek-V3.2-Exp on various public evaluation datasets is comparable to that of V3.1-Terminus [1]