Large Model Coding
The competition is fierce! A new-generation Arena leaderboard dedicated to coding has arrived, and a domestic model has claimed the top spot
机器之心 · 2025-11-13 10:03
Core Insights
- The article highlights the rapid advances in large-model programming, emphasizing the intensifying competition among model vendors as they enhance coding capabilities and roll out new tools [2][3]
- The introduction of the Code Arena by LMArena marks a significant evolution in how the coding capabilities of large models are evaluated, shifting the focus to real-world application development rather than isolated code generation [4][6]

Model Performance
- The new Code Arena ranks the domestic model GLM-4.6 at the top, alongside Claude and GPT-5, showcasing its strong coding abilities [6][10]
- GLM-4.6 achieved a success rate of 94.9% on code modification tasks, closely trailing Anthropic's Claude Sonnet 4.5 at 96.2% [11]
- The performance gap between open-source models and the top proprietary models has narrowed sharply, from 5-10 percentage points to a gap measured in basis points, indicating rapid convergence in capabilities [14]

Industry Trends
- There is a noticeable shift among users toward adopting GLM-4.6 for daily tasks, reflecting its growing acceptance and recognition in the AI programming community [15]
- Cerebras has adopted GLM-4.6 as its default recommended model, phasing out its previous default, which underscores GLM-4.6's rising prominence in the industry [16]
- The article emphasizes the rapid momentum of domestic models, which are transitioning from catching up to leading, particularly in the open-source ecosystem [17][18]