DeepSeek-V3.1 Launches with a Bang: Tops Global Open-Source Coding, R1/V3 Merged for the First Time, Training Volume Up 10x
36Kr · 2025-08-21 12:04

Core Insights

- DeepSeek has officially launched DeepSeek-V3.1, a 671-billion-parameter hybrid reasoning model that marks a significant step toward the era of intelligent agents, surpassing previous models such as DeepSeek-R1 and Claude 4 Opus [1][12][18]

Model Performance

- DeepSeek-V3.1 reasons faster than DeepSeek-R1-0528 and excels at multi-step tasks and tool usage, outperforming its predecessors on agentic benchmarks [3][6]
- In benchmark tests, DeepSeek-V3.1 scored 66.0 on SWE-bench, 54.5 on SWE-bench Multilingual, and 31.3 on Terminal-Bench, significantly surpassing its predecessors [4][15]
- The model scored 29.8 on Humanity's Last Exam, showcasing its advanced reasoning capabilities [4][16]

Training and Architecture

- The model uses a hybrid reasoning design, switching seamlessly between thinking and non-thinking modes (see the API sketch after this summary) [6][12]
- DeepSeek-V3.1-Base underwent continued pre-training on an additional 840 billion tokens, extending its long-context support [6][13]
- Training followed a two-stage long-context expansion strategy that significantly enlarged the training dataset [13]

API and Accessibility

- Starting September 5, a new API pricing structure will take effect for DeepSeek [7]
- Two versions of DeepSeek-V3.1, Base and standard, are available on Hugging Face, each supporting a 128k context length (see the loading sketch below) [6][14]

Competitive Landscape

- DeepSeek-V3.1 is positioned as a strong competitor to OpenAI's models, particularly in reasoning efficiency and coding tasks, with notable scores across coding benchmarks [12][20][23]
- On the Aider coding test, the model reached 76.3%, outperforming Claude 4 Opus and Gemini 2.5 Pro [16][19]
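
The mode switch described above is exposed through DeepSeek's OpenAI-compatible API, where, per DeepSeek's public documentation, `deepseek-chat` maps to V3.1's non-thinking mode and `deepseek-reasoner` to its thinking mode. Below is a minimal sketch assuming the `openai` Python SDK and the public `https://api.deepseek.com` endpoint; the API key is a placeholder.

```python
# Minimal sketch: calling DeepSeek-V3.1 in both modes via the
# OpenAI-compatible API. Model names follow DeepSeek's API docs,
# which route both to V3.1 after this release; verify against
# the current documentation before relying on them.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",  # placeholder credential
    base_url="https://api.deepseek.com",
)

prompt = "Write a function that merges two sorted lists."

# Non-thinking mode: the model answers directly, at higher speed.
fast = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": prompt}],
)
print(fast.choices[0].message.content)

# Thinking mode: the model reasons before producing its final answer.
deep = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": prompt}],
)
print(deep.choices[0].message.content)
```

Because both modes sit behind one set of weights, an application can route quick queries to the non-thinking endpoint and reserve the slower thinking mode for multi-step tasks, which is the trade-off the release emphasizes.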
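For local use, the Hugging Face checkpoints expose the same toggle through the chat template. The following is a hypothetical sketch assuming the `deepseek-ai/DeepSeek-V3.1` repository ID and a `thinking` template keyword as shown on the model card; check both against the published template before use.

```python
# Hypothetical sketch: rendering DeepSeek-V3.1 prompts locally with the
# Hugging Face chat template. The `thinking` kwarg is assumed from the
# model card's example, not guaranteed by this article.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-V3.1")

messages = [{"role": "user", "content": "1+1="}]

# thinking=True renders the reasoning-mode prompt prefix;
# thinking=False renders the direct-answer prompt.
for thinking in (True, False):
    text = tokenizer.apply_chat_template(
        messages,
        thinking=thinking,
        add_generation_prompt=True,
        tokenize=False,
    )
    print(text)
```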