Price Cut! Big News from DeepSeek!
Securities Times (证券时报) · 2025-09-29 11:55

Core Viewpoint
- DeepSeek has launched the DeepSeek-V3.2-Exp model, which introduces a Sparse Attention mechanism to improve training and inference efficiency on long texts while maintaining output quality on par with its predecessor, V3.1-Terminus [2][4].

Group 1: Model Performance
- DeepSeek-V3.2-Exp performs comparably to V3.1-Terminus across a range of benchmark datasets, with only slight variations in certain areas [5].
- On the MMLU-Pro benchmark both models scored 85.0, while on GPQA-Diamond V3.1-Terminus scored 80.7 and V3.2-Exp scored 79.9 [5].
- The Sparse Attention mechanism delivers significant gains in long-text training and inference efficiency without compromising model output (a sketch of the general idea follows after this summary) [4].

Group 2: Recent Developments
- DeepSeek has been active recently: the V3.1-Terminus model, released on August 21, introduced a hybrid reasoning architecture and improved efficiency and agent capabilities [8].
- A research paper on the DeepSeek-R1 reasoning model was published in the prestigious journal Nature, marking a significant achievement for Chinese AI research [8][9].
- The new DeepSeek API pricing policy cuts costs for developers by more than 50%, making the service more accessible [4].
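The article does not detail how DeepSeek's Sparse Attention is built internally. As a rough illustration of the general idea only, the minimal sketch below implements generic top-k sparse attention for a single query: a cheap scoring pass ranks the cached tokens, and full softmax attention runs only over the k highest-scoring ones. The function name topk_sparse_attention, the parameter k, and the dot-product scorer are illustrative assumptions, not DeepSeek's published design.

```python
import numpy as np

def topk_sparse_attention(q, K, V, k=64):
    """Single-query sparse attention: attend only to the top-k keys.

    Illustrative sketch, not DeepSeek's actual mechanism. A real system
    would use a much cheaper indexer than the full-size keys reused here.
    """
    L, d = K.shape
    scores = K @ q / np.sqrt(d)             # relevance score per cached token, shape (L,)
    k = min(k, L)
    idx = np.argpartition(scores, -k)[-k:]  # indices of the k highest-scoring tokens
    sub = scores[idx]
    weights = np.exp(sub - sub.max())
    weights /= weights.sum()                # softmax over the selected tokens only
    return weights @ V[idx]                 # weighted sum of just k value vectors

# Toy usage: one query over a long "context" of 10,000 cached tokens.
rng = np.random.default_rng(0)
L, d = 10_000, 64
q, K, V = rng.normal(size=d), rng.normal(size=(L, d)), rng.normal(size=(L, d))
out = topk_sparse_attention(q, K, V, k=128)
print(out.shape)  # (64,)
```

The point of the pattern is that the expensive softmax-weighted aggregation touches only k tokens rather than the full context length L, which is where the long-text efficiency gains described above would come from.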