From Gemini to Doubao: Why Are Two Global AI Giants Taking the Same Path?
Yicai (第一财经) · 2026-02-14 15:19

Core Viewpoint - ByteDance has officially launched the Doubao Model 2.0 series, with significant upgrades in multi-modal understanding, enterprise-level agent capabilities, and cost efficiency that position it among the global leaders in AI models [1][2].
Version Iteration Updates - The Doubao 2.0 series comes in three sizes: Pro, Lite, and Mini. It offers enhanced multi-modal understanding and improved handling of real-world long-chain tasks, achieving top-tier performance in high-value economic and research tasks [4][7].
Technical Advancements - Doubao 2.0 Pro is designed for deep reasoning and long-chain task execution and competes directly with models like GPT 5.2 and Gemini 3 Pro, a sign that leading AI laboratories are converging strategically on the pursuit of artificial general intelligence (AGI) [2][4].
Performance Metrics - The flagship Doubao 2.0 Pro model has achieved gold-medal results in the IMO and CMO mathematics competitions and the ICPC programming contest, showcasing top-tier mathematical and reasoning capabilities [4][5].
Multi-Modal Understanding - The model's multi-modal understanding has been significantly upgraded. It excels in visual reasoning, spatial perception, and long-context understanding, achieving the best performance on authoritative benchmarks [5][8].
Cost Efficiency - Doubao 2.0 Pro pricing is based on input length, with costs of 3.2 RMB per million input tokens and 16 RMB per million output tokens, a substantial cost advantage over competitors like Gemini 3 Pro and GPT 5.2 [6][7].
Real-World Task Execution - The core focus of the Doubao 2.0 upgrade is the ability to execute complex real-world tasks. Breakthroughs in multi-modal understanding support this, allowing the model to evolve from a "test-taker" into an "executor" [7][9].
Competitive Landscape - The competition between Doubao 2.0 and Gemini centers on multi-modal capabilities, with both aiming to create AI that comprehends and interacts with the complexities of the physical world, moving beyond mere language processing [9].
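
To make the quoted Doubao 2.0 Pro prices concrete, the short Python sketch below estimates the cost of a single request. It is only an illustration: the helper name `estimate_cost_rmb` and the example token counts are hypothetical, and the flat per-token rate is an assumption, since the summary notes that pricing depends on input length without giving the tiers.

```python
# Prices quoted in the summary (RMB per million tokens).
INPUT_RMB_PER_MILLION = 3.2
OUTPUT_RMB_PER_MILLION = 16.0


def estimate_cost_rmb(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: flat-rate estimate that ignores any input-length tiers."""
    total = (input_tokens * INPUT_RMB_PER_MILLION
             + output_tokens * OUTPUT_RMB_PER_MILLION)
    return total / 1_000_000


# Example workload (made-up numbers): 200k input tokens, 50k output tokens.
cost = estimate_cost_rmb(200_000, 50_000)
print(f"Estimated cost: {cost:.2f} RMB")
```

In this example the output tokens account for more of the bill than the input tokens despite being a quarter of the volume, because the output rate is five times the input rate.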
