Core Viewpoint
- DeepSeek's latest update, V3.1-Terminus, fixes issues reported by users and improves model performance while retaining existing capabilities [2][3][7].

Group 1: Version Improvements
- The Terminus version fixes a notable bug in which the model randomly output the character "极" [3][7].
- Other improvements include better language consistency (fewer mixed-language outputs and stray characters) and optimized Code Agent and Search Agent performance [7][8].

Group 2: Performance Metrics
- The new model scores higher than the previous version on several benchmarks [9]:
  - MMLU-Pro: 85.0 (up from 84.8)
  - GPQA-Diamond: 80.7 (up from 80.1)
  - Humanity's Last Exam: 21.7 (up from 15.9)
  - BrowseComp: 38.5 (up from 30.0)
  - SimpleQA: 96.8 (up from 93.4)
  - SWE Verified: 68.4 (up from 66.0)
  - Terminal-bench: 36.7 (up from 31.3)

Group 3: User Reactions and Future Speculations
- Some users raised concerns about a drop in performance on Codeforces, speculating that safety adjustments may have reduced the model's creativity [10].
- The "Terminus" name has prompted speculation that the next release could be a complete overhaul (V4) [11][14].
DeepSeek V3.1 gets its "final version" update! Is the next release V4/R2???
QbitAI (量子位) · 2025-09-23 03:14