Claude Opus 4.6 Claims the Coding Crown! It Storms the Office Suite, Upending Work for 1.5 Billion Office Workers
程序员的那些事· 2026-02-07 01:35
Core Viewpoint
- The release of Claude Opus 4.6 by Anthropic marks a significant advancement in AI programming capabilities, positioning it as a leading competitor against OpenAI and Google, with enhanced features that could revolutionize knowledge work and productivity across industries [2][24].

Group 1: Product Features and Performance
- Claude Opus 4.6 has significantly improved coding skills over its predecessor, Opus 4.5, and can now execute AI agent tasks more reliably in large-scale codebases [4][8].
- The model features enhanced self-correction abilities, including precise code review and debugging [9].
- It supports a 1-million-token context, making it the first Opus-level model to do so and allowing better handling of long-context tasks [10][102].
- In benchmark tests, Claude Opus 4.6 outperformed competitors like Gemini 3 Pro and GPT-5.2, achieving a score of 68.8% on ARC-AGI-2, significantly higher than GPT-5.2-xhigh [11][14].
- The model showed a 23% improvement in real financial task tests compared to Sonnet 4.5, the top industry model [25].

Group 2: Impact on Work Efficiency
- Opus 4.6 is expected to transform the workflow of knowledge workers, particularly in finance and consulting, by automating complex tasks such as financial modeling and building presentations [24][27].
- The model can handle multiple Excel sheets simultaneously, identifying errors and generating visual data representations such as line charts [16][29].
- It is designed to assist with a range of office tasks, including running financial analyses and conducting in-depth research, thereby enhancing overall productivity [29][30].

Group 3: Collaborative AI Development
- Claude Opus 4.6 integrates deeply with Claude Code, allowing developers to create teams of AI agents that collaborate on tasks and improve development efficiency [66][71].
- This collaborative feature lets a lead agent distribute tasks among team members, enabling parallel work on complex projects [75][78].
- An experiment demonstrated 16 Claude Opus 4.6 agents working together to develop a C compiler, showcasing the potential for AI to handle substantial coding tasks autonomously [83][89].

Group 4: Pricing and Accessibility
- Pricing for Claude Opus 4.6 is set at $5 per million tokens for input and $25 for output, with additional costs for extended thinking and adaptive features [101][102].
- The model is accessible through various platforms, allowing users to leverage its capabilities in real time [18][19].

Group 5: Future Outlook
- Anthropic's leadership anticipates that 2025 will be a pivotal year for AI programming, with widespread adoption expected by 2026 across various sectors [111].
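The Group 4 pricing translates into a simple per-request cost formula. A minimal sketch, using only the two prices quoted in the article; the request sizes below are made-up illustrations, and the article's extra surcharges for extended thinking and adaptive features are omitted:

```python
# Illustrative cost estimate from the article's quoted Opus 4.6 prices:
# $5 per million input tokens, $25 per million output tokens.
INPUT_PRICE_PER_M = 5.00    # USD per 1M input tokens (from the article)
OUTPUT_PRICE_PER_M = 25.00  # USD per 1M output tokens (from the article)

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (base rates only)."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Example: a long-context request with 800K input tokens and 20K output tokens.
cost = estimate_cost(800_000, 20_000)
print(f"${cost:.2f}")  # → $4.50
```

At these rates, a request that nearly fills the 1-million-token context window costs a few dollars in input alone, which is why the long-context pricing matters for the agent-team workflows described above.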
The Fast/Slow Thinking Switch That DeepSeek and GPT-5 Are Both Exploring Now Has a Smarter Version, and It's Multimodal
机器之心· 2025-09-01 06:46
Core Insights
- The article discusses the R-4B multimodal large model developed by Tencent and the Institute of Automation, Chinese Academy of Sciences, which addresses the "overthinking" dilemma in AI models by introducing an adaptive thinking mechanism [3][5][10].

Group 1: Model Development and Performance
- R-4B uses an "auto-thinking" mechanism that lets the model switch between direct responses for simple questions and deep reasoning for complex problems, optimizing accuracy while minimizing computational cost [5][21].
- The model sets a new performance benchmark among 4B-scale multimodal models, outperforming larger models such as Keye-VL-8B and Kimi-VL-A3B-Thinking-2506 across various evaluation metrics [7][24].
- R-4B achieved top rankings on the OpenCompass multimodal academic leaderboard, placing first among multimodal models under 20B in size [10][12].

Group 2: Training Methodology
- The core innovation of R-4B is its two-stage training strategy, which includes bi-mode annealing to teach the model both thinking and non-thinking capabilities [16][18].
- Training uses a mix of data types in which the model learns to respond directly to simple queries and to reason in detail on complex tasks, laying a solid foundation for adaptive thinking [18][22].
- The Bi-mode Policy Optimization (BPO) reinforcement learning algorithm lets the model learn when to switch thinking modes without relying on specially designed reward functions [18][24].

Group 3: Applications and Future Prospects
- R-4B's adaptive thinking capability improves automation efficiency in applications such as document content extraction and scientific research, where it can analyze complex data relationships [27][29].
- The model is designed for deployment on consumer-grade devices, making it suitable for low-power scenarios such as smart homes and instant Q&A systems [12][29].
- The lightweight, intelligent design of R-4B contributes to sustainable AI development by addressing the rising costs of computation and reasoning [33][34].
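The auto-thinking idea described above can be sketched as a simple dispatch: route easy inputs to a direct answer and hard ones through an explicit reasoning step. This is a toy illustration only; the heuristic classifier and both response paths below are stand-ins I invented, whereas R-4B learns this switching behavior end-to-end via its bi-mode annealing and BPO training rather than using hand-written rules:

```python
# Toy sketch of adaptive "auto-thinking" dispatch (illustrative, not R-4B's code).

def looks_complex(query: str) -> bool:
    # Hypothetical rule-based stand-in for the learned mode-selection policy.
    keywords = ("prove", "derive", "compare", "analyze", "why")
    return len(query.split()) > 15 or any(k in query.lower() for k in keywords)

def answer(query: str) -> str:
    if looks_complex(query):
        # "Thinking" mode: spend extra tokens on explicit reasoning first.
        return f"[reasoning...] considered answer to: {query}"
    # "Non-thinking" mode: respond directly, saving compute on easy inputs.
    return f"direct answer to: {query}"

print(answer("What is 2+2?"))
print(answer("Analyze the data relationships in this chart"))
```

The compute saving comes from the asymmetry: most real-world queries take the cheap path, and the expensive reasoning path is invoked only when the selection policy deems it worthwhile.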