Coding Ability
DeepSeek V4 Still Unreleased: Why Is China's Open-Source Champion Slowing Down?
阿尔法工场研究院· 2026-03-17 09:35
Core Viewpoint
- DeepSeek's development has slowed significantly, raising concerns among developers and the AI community about its future competitiveness against players such as OpenAI and Anthropic [5][8][18].

Group 1: DeepSeek's Development Timeline
- DeepSeek V4 is expected to launch in April 2026, after multiple delays to its announced timeline [6][14].
- The previous version, DeepSeek V3.2, was released on December 1, 2025, a high point for the company marked by rapid updates and strong community engagement [8][11].
- Since V3.2, updates have been minimal, limited to small adjustments rather than major advances, leading to community frustration [12][13].

Group 2: Comparison with Competitors
- OpenAI and Anthropic have kept up a rapid release cadence, with OpenAI shipping updates and products almost monthly, while DeepSeek has released no major update since V3.2 [15][18].
- The competitive landscape has shifted: DeepSeek now lags in update frequency and innovation, which could erode its market position [42].

Group 3: Challenges Faced by DeepSeek
- The transition from releasing base models to developing a comprehensive system has increased the complexity and duration of DeepSeek's development cycles [21][25].
- DeepSeek is under pressure to meet high expectations from the open-source community, where any perceived failure could significantly damage its reputation [28][31].
- Each release must be impactful, as minor updates may not suffice in a competitive environment [32].

Group 4: Strategic and Technical Considerations
- The upcoming V4 is expected to focus on multimodal capabilities, long-term memory, and enhanced coding ability, alongside deep adaptation to domestic chipsets [38][42].
- V4's development is seen as a response to both external technological pressure and internal resource limitations, which may extend the R&D timeline [39][40].
- The ability to adapt to the evolving hardware ecosystem is crucial to DeepSeek's future success in the AI landscape [37].
GPT-5.4 Released: The Model Made for OpenClaw Has Arrived
数字生命卡兹克· 2026-03-05 22:38
Core Viewpoint
- The article discusses the release of GPT-5.4, highlighting its advances in coding ability, world knowledge, and multimodal understanding, which make it a strong fit for applications like OpenClaw [2][11].

Group 1: Model Comparison
- GPT-5.4's coding ability is comparable to GPT-5.3 Codex, with world knowledge improved over GPT-5.2, making it suitable for a range of professional fields [15][25].
- On GDPval, GPT-5.4 scored 83.0%, surpassing Claude Opus 4.6 at 78.0% and GPT-5.3 Codex at 70.9% [16][19].
- On software engineering tasks, GPT-5.4 scored 57.7%, slightly ahead of GPT-5.3 Codex at 56.8% [17].

Group 2: Key Features of GPT-5.4
- GPT-5.4 ships with a 1-million-token context window, a significant upgrade to its ability to maintain task context [25].
- The model has native computer-use capability, executing commands based on visual inputs, a major advance for agent tasks [27].
- It supports tool search, cutting token usage by 47% while maintaining accuracy, which optimizes performance in applications with many tools [30][34].

Group 3: Pricing and Accessibility
- GPT-5.4 input is priced at $2.50 per million tokens, more affordable than Claude Opus 4.6 and accessible to smaller teams [39].
- GPT-5.4 can draw on subscription credits, making it a cost-effective option compared with models that require API access [11][36].
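The pricing and context-window figures above combine into a quick back-of-the-envelope cost estimate. A minimal sketch in Python, assuming a flat per-million-token input rate (the $2.50 figure is from the article; output-token pricing is not given, so only the input side is modeled, and the function name is illustrative):

```python
def input_cost_usd(tokens: int, price_per_million_usd: float = 2.50) -> float:
    """Input-side API cost at a flat per-million-token rate.

    The 2.50 USD/M default is the GPT-5.4 input price cited above;
    output-token cost is not modeled here.
    """
    return tokens / 1_000_000 * price_per_million_usd

# Filling the full 1M-token context window once costs $2.50 on the
# input side; a 200k-token prompt costs $0.50.
print(input_cost_usd(1_000_000))  # 2.5
print(input_cost_usd(200_000))    # 0.5
```

For agent workloads that repeatedly resend a large context, this per-call input cost dominates, which is why the 47% token reduction from tool search matters economically as well as for latency.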
DeepSeek's Minor Version, Major Upgrade: New R1 Model's Coding Ability Rivals OpenAI o3
Di Yi Cai Jing· 2025-05-29 03:04
Core Insights
- DeepSeek has released a minor version upgrade of its R1 model, DeepSeek-R1-0528, with significant improvements in coding capability that nearly match OpenAI's o3-high model [1][5].
- Developers have noted enhancements in writing tasks, with outputs reading more naturally and formatted better than previous versions [7].
- Context recall has improved for contexts up to 32K, though performance declines at 60K [7][8].

Performance Metrics
- On the LiveCodeBench testing platform, DeepSeek-R1-0528 achieved a Pass@1 score of 73.1, ranking fourth among the models tested [3].
- The top three models in the same test were o4-mini (High) at 80.2, o3 (High) at 75.8, and o4-mini (Medium) at 74.2 [3].

Developer Feedback
- Developers see the upgrade as a significant victory for open-source initiatives [4].
- Some developers' personal tests comparing DeepSeek-R1 with Claude-4 found R1 superior in certain aspects, such as the visual effects of a simulated collision [5].
- There is anticipation for the next major version, R2, with hopes for improvements in context length and multimodal capabilities [8].
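The Pass@1 scores above use the standard code-generation metric: the probability that a single sampled solution passes all tests for a problem. It is usually computed with the unbiased pass@k estimator from OpenAI's HumanEval work; a minimal sketch (the function name and the sample counts in the usage comment are illustrative, not from the article):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator.

    Given n sampled solutions per problem, of which c pass the tests,
    estimate P(at least one of k randomly drawn samples passes):
        pass@k = 1 - C(n - c, k) / C(n, k)
    """
    if n - c < k:  # fewer than k failures: some draw must succeed
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# With k = 1 this reduces to the plain pass rate c / n, e.g.
# pass_at_k(10, 7, 1) is 7/10 for 7 passing samples out of 10.
```

A leaderboard Pass@1 of 73.1 therefore means roughly 73% of benchmark problems are solved by a single generation, averaged over the problem set.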