DeepSeek Sparse Attention (DSA)
DeepSeek V3.2 Released: A New Breakthrough in Long-Text Capability, API Prices Cut in Half
Founder Park· 2025-09-29 10:55
Core Insights
- DeepSeek has launched its latest experimental model, DeepSeek-V3.2-Exp, which incorporates the new DeepSeek Sparse Attention (DSA) technology aimed at significantly enhancing long-text processing efficiency [2][6][7].

Group 1: Technical Innovations
- The DeepSeek Sparse Attention (DSA) mechanism enables fine-grained sparse attention, achieving a substantial increase in long-text training and inference speed with minimal impact on model output quality [6][7].
- A rigorous evaluation aligned the training settings of DeepSeek-V3.2-Exp with those of V3.1-Terminus, showing that DeepSeek-V3.2-Exp performs comparably to V3.1-Terminus across various public benchmarks [10].

Group 2: Cost Reduction
- The efficiency improvements have cut API call costs by over 50%, allowing developers to build more powerful applications at a lower cost [4][12].

Group 3: User Engagement and Testing
- DeepSeek has retained API access to the V3.1 model for a limited time, until October 15, 2025, allowing users to compare the new and old versions at the same price [15][16].
- Users are encouraged to test the experimental version and provide feedback, which is crucial for further refinement [15][18].
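To make the "fine-grained sparse attention" idea above concrete, here is a minimal sketch of one common form of sparsification: for each query, cheaply score every key, keep only the top-k, and run softmax attention over that subset. This is an illustrative assumption about how sparse attention works in general, not DeepSeek's published DSA algorithm (whose selection mechanism is its own design).

```python
import numpy as np

def sparse_attention(q, K, V, k=8):
    """Single-query top-k sparse attention (illustrative sketch, not DSA itself).

    Scores all L keys, keeps the k highest-scoring ones, and computes
    softmax attention over that subset only.
    """
    scores = K @ q / np.sqrt(q.shape[-1])        # (L,) raw attention logits
    top = np.argpartition(scores, -k)[-k:]       # indices of the k best keys
    w = np.exp(scores[top] - scores[top].max())  # numerically stable softmax
    w /= w.sum()
    return w @ V[top]                            # weighted sum of selected values

rng = np.random.default_rng(0)
L, d = 64, 16
q = rng.normal(size=d)
K = rng.normal(size=(L, d))
V = rng.normal(size=(L, d))
out = sparse_attention(q, K, V, k=8)
print(out.shape)  # (16,)
```

When k equals the full context length, this reduces exactly to dense attention; the efficiency win comes from choosing k much smaller than L while the top-k subset captures most of the attention mass.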
DeepSeek's New Model Goes Live! Introducing the New DSA Sparse Attention, and Taking Another Shot at CUDA
量子位· 2025-09-29 10:44
Core Insights
- DeepSeek has launched its latest model, DeepSeek-V3.2-Exp, which introduces a new attention mechanism called DeepSeek Sparse Attention (DSA) [1][6]
- The model aims to enhance long-text processing and inference efficiency without significantly affecting output quality [7]
- A significant price reduction for the official API has been announced, starting at 50% off [3][17]

Model Updates
- DeepSeek-V3.2-Exp is built on the previous version, DeepSeek-V3.1-Terminus, which focused on stability, tool invocation capabilities, language consistency, and error correction [9]
- In benchmark comparisons, DeepSeek-V3.2-Exp shows comparable performance to DeepSeek-V3.1-Terminus across various evaluation sets [10]
- The model demonstrates reduced inference costs when handling 128K long contexts, particularly during the decoding phase [12]

Technical Innovations
- The introduction of DSA enables fine-grained attention, leading to significant improvements in processing efficiency [6][7]
- DeepSeek has open-sourced GPU operators in both TileLang and CUDA versions, facilitating research and development [13][15]
- The company recommends using the TileLang version for debugging and rapid iteration during experimental research [16]

Community Engagement
- The announcement includes a call to action for the community to engage with the new model and take advantage of the promotional pricing [18]
- Links to access the model on platforms like HuggingFace and ModelScope have been provided [19]
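The claim that sparse attention cuts decoding cost at 128K contexts can be illustrated with a back-of-the-envelope FLOP count: at each decode step, dense attention scores all L cached keys, while a sparse step attends only to k selected tokens. The numbers below (k, head dimension) are illustrative assumptions, not DeepSeek's published figures, and they ignore the cost of the selection step itself.

```python
# Back-of-the-envelope per-head attention cost for one decode step,
# dense vs. sparse. All constants are illustrative assumptions.
L = 128_000        # cached context length (128K tokens)
k = 2_048          # assumed number of tokens a sparse step attends to
d = 128            # assumed per-head dimension

# Two matrix-vector products dominate: q.K^T scores, then the weighted
# sum over values. Each costs ~2*n*d FLOPs for n attended tokens.
dense_flops  = 2 * L * d + 2 * L * d
sparse_flops = 2 * k * d + 2 * k * d

print(f"dense : {dense_flops:,} FLOPs")
print(f"sparse: {sparse_flops:,} FLOPs")
print(f"speedup ~ {dense_flops / sparse_flops:.1f}x")
```

The ratio is simply L/k, so the savings grow with context length; in practice the selection mechanism adds overhead, which is why the advantage is most pronounced in the decoding phase of very long contexts.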