Google's Gemini 2.5 Pro has become the top-performing model on Aider Polyglot (real-world coding), and it is inexpensive, costing only slightly more than DeepSeek R1. (AI寒武纪)

Core Insights
- Google Gemini 2.5 Pro is the highest-performing model on the Aider Polyglot benchmark for real-world coding, at a total cost only slightly higher than DeepSeek R1's [1].

Performance Metrics
- Gemini 2.5 Pro: 72.9% correctness, 92.4% correct edit format, $6.32 total cost [4].
- Other models in comparison:
  - Sonnet-20250219: 64.9% correctness, 97.8% correct edit format, $36.83 total cost [4].
  - DeepSeek R1: 56.9% correctness, 96.9% correct edit format, $5.42 total cost [5].
  - DeepSeek V3 (0324): 55.1% correctness, 99.6% correct edit format, $1.12 total cost [5].
- Gemini 2.5 Pro pairs the highest correctness rate in this comparison with a cost roughly one-sixth of Sonnet-20250219's and $0.90 above DeepSeek R1's [1][4].
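
One way to make the cost comparison concrete is to divide each model's total cost by its correctness rate. The sketch below does that with the figures listed above. The "dollars per percentage point" metric and the helper name `cost_per_point` are assumptions introduced for illustration; they are not part of the Aider Polyglot benchmark itself.

```python
# Figures copied from the comparison above: (correctness %, total cost in USD).
results = {
    "Gemini 2.5 Pro": (72.9, 6.32),
    "Sonnet-20250219": (64.9, 36.83),
    "DeepSeek R1": (56.9, 5.42),
    "DeepSeek V3 (0324)": (55.1, 1.12),
}


def cost_per_point(correctness_pct, cost_usd):
    """Dollars spent per percentage point of correctness (illustrative metric, not from the benchmark)."""
    return cost_usd / correctness_pct


# Sort models from cheapest to most expensive per point of correctness.
ranked = sorted(results.items(), key=lambda item: cost_per_point(*item[1]))

for name, (pct, cost) in ranked:
    print(f"{name:<20} {pct:5.1f}%  ${cost:6.2f}  ${cost_per_point(pct, cost):.3f} per point")
```

By this measure DeepSeek V3 (0324) is the cheapest per point and Gemini 2.5 Pro comes second, ahead of DeepSeek R1 and well ahead of Sonnet-20250219. A single ratio like this ignores the absolute gap in correctness, so it is best read alongside the raw figures rather than in place of them.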