DeepSeek's new model is live! It introduces the new DSA sparse attention, and takes another shot at CUDA
量子位 (QbitAI) · 2025-09-29 10:44

Core Insights
- DeepSeek has launched its latest model, DeepSeek-V3.2-Exp, which introduces a new attention mechanism called DeepSeek Sparse Attention (DSA) [1][6]
- The model aims to improve long-text processing and inference efficiency without significantly degrading output quality [7]
- A price cut for the official API has also been announced, with reductions starting at 50% [3][17]

Model Updates
- DeepSeek-V3.2-Exp builds on the previous version, DeepSeek-V3.1-Terminus, which focused on stability, tool-calling ability, language consistency, and error correction [9]
- In benchmark comparisons, DeepSeek-V3.2-Exp performs on par with DeepSeek-V3.1-Terminus across the various evaluation sets [10]
- The model shows lower inference cost on 128K-token long contexts, particularly during the decoding phase [12]

Technical Innovations
- DSA enables fine-grained sparse attention, bringing significant gains in processing efficiency [6][7]; a hedged sketch of the general idea appears at the end of this digest
- DeepSeek has open-sourced the associated GPU operators in both TileLang and CUDA versions to support research and development [13][15]
- For experimental research, the company recommends the TileLang version for debugging and rapid iteration [16]

Community Engagement
- The announcement invites the community to try the new model and take advantage of the promotional pricing [18]
- Links to the model on platforms such as HuggingFace and ModelScope are provided [19]
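The digest does not describe DSA's internals, so the following is only a minimal sketch of generic fine-grained sparse attention, assuming a per-query top-k selection rule; the function name `topk_sparse_attention` and the `top_k` parameter are illustrative, not DeepSeek's actual API. It shows why this family of techniques helps at long context: dense attention scores every key for every query (O(L^2) over a length-L context), while attending to only the k best keys per query moves the attention read toward O(L·k), which matters most during decoding.

```python
# Hedged sketch of fine-grained sparse attention (NOT DeepSeek's DSA;
# the source does not publish DSA's algorithm, so this is a generic
# per-query top-k variant for illustration only).
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, top_k=64):
    """Each query attends only to its top_k highest-scoring keys.

    q: (batch, q_len, dim); k, v: (batch, kv_len, dim)
    Dense attention weights all kv_len keys per query; keeping only the
    top_k keys makes the attention pattern sparse and fine-grained.
    """
    scale = q.shape[-1] ** -0.5
    scores = torch.matmul(q, k.transpose(-1, -2)) * scale  # (b, q_len, kv_len)

    top_k = min(top_k, scores.shape[-1])
    # Values and indices of the top_k keys for every query position.
    topk_vals, topk_idx = scores.topk(top_k, dim=-1)

    # Keep only the selected scores; everything else becomes -inf,
    # so softmax assigns it exactly zero weight.
    masked = torch.full_like(scores, float("-inf"))
    masked.scatter_(-1, topk_idx, topk_vals)
    weights = F.softmax(masked, dim=-1)
    return torch.matmul(weights, v)

# Toy usage: a 128-token context with 32-dim heads.
q = torch.randn(1, 128, 32)
k = torch.randn(1, 128, 32)
v = torch.randn(1, 128, 32)
out = topk_sparse_attention(q, k, v, top_k=16)
print(out.shape)  # torch.Size([1, 128, 32])
```

Note that for clarity this sketch still materializes the dense score matrix, so it saves no compute by itself; a production kernel, such as the TileLang or CUDA operators DeepSeek open-sourced, would fuse the selection so the full matrix is never formed.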