Trillion-Parameter Thinking Models
Ant Group Open-Sources Ring-2.5-1T, a Trillion-Parameter Thinking Model with Strong Coding and Agent Capabilities
Feng Huang Wang · 2026-02-13 08:07
Feng Huang Wang Tech News, February 13 — Ant Group has open-sourced Ring-2.5-1T, billed as the world's first trillion-parameter thinking model built on a hybrid linear architecture. The model reaches leading open-source performance in long-text generation, mathematical reasoning, and agent task execution, providing a high-performance foundation for the complex task handling of the agent era.

Ring-2.5-1T is built on the Ling 2.5 architecture, whose optimized attention mechanism markedly improves the efficiency and stability of long-context reasoning. Activated parameters grow from 51B in the previous generation to 63B, yet with the support of the hybrid linear attention architecture, inference efficiency still improves substantially over the prior generation. Compared with the KIMI K2 architecture, which activates only 32B parameters, the Ling 2.5 architecture at 1T total parameters retains a clear throughput advantage on long-sequence reasoning tasks, and the advantage keeps widening as generation length grows.

The model weights and inference code for Ring-2.5-1T have been released on mainstream open-source platforms including Hugging Face and ModelScope. The official Chat experience page and API service will go live soon.

On generation efficiency, in long-text generation beyond 32K tokens Ring-2.5-1T cuts memory access by more than 10x and raises generation throughput by more than 3x relative to the previous generation. On deep-thinking ability, the model reached gold-medal level in the company's own tests on the International Mathematical Olympiad (IMO 2025) and the Chinese Mathematical Olympiad (CMO 2025) ...
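The reported efficiency gains follow from how linear attention decodes: softmax attention must re-read a key-value cache that grows with every generated token, while a linear-attention layer folds history into a fixed-size state, so per-token memory traffic stays constant as generation gets longer. The article does not disclose Ring-2.5-1T's exact formulation; the following is a generic single-head linear-attention decode step for illustration only (the elu-based feature map and dimensions are assumptions, not Ant's design).

```python
# Generic single-head linear-attention decode step (illustrative only; not
# Ant's actual hybrid formulation). Softmax attention must read the whole
# KV cache, which grows with sequence length T; linear attention instead
# folds history into a fixed-size state S of shape (d_k, d_v), so per-token
# compute and memory traffic stay constant as generation gets longer.
import torch
import torch.nn.functional as F

d_k, d_v = 128, 128

def linear_attn_step(S, z, q, k, v, eps=1e-6):
    """One decode step. S: (d_k, d_v) running sum of k ⊗ v; z: (d_k,) running sum of k."""
    q, k = F.elu(q) + 1, F.elu(k) + 1  # feature map elu(x) + 1 keeps values positive
    S = S + torch.outer(k, v)          # accumulate key-value association
    z = z + k                          # accumulate normalizer
    out = (q @ S) / (q @ z + eps)      # read-out cost is O(d_k * d_v), independent of T
    return S, z, out

S = torch.zeros(d_k, d_v)
z = torch.zeros(d_k)
for _ in range(4):  # four decode steps; the state size never grows
    q, k, v = torch.randn(d_k), torch.randn(d_k), torch.randn(d_v)
    S, z, out = linear_attn_step(S, z, q, k, v)
print(out.shape)  # torch.Size([128])
```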
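Since the weights and inference code are published on Hugging Face, a standard transformers loading flow should apply. Below is a minimal sketch; the repo id "inclusionAI/Ring-2.5-1T" is a guess for illustration (check the actual release page), and serving a 1T-parameter checkpoint in practice requires a multi-GPU node or a dedicated inference stack.

```python
# Minimal sketch of pulling the open-sourced weights with Hugging Face
# transformers. The repo id below is an assumption, not confirmed by the
# article -- consult the Hugging Face / ModelScope release pages.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "inclusionAI/Ring-2.5-1T"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,  # custom architecture code ships with the repo
    torch_dtype="auto",      # use the dtype stored in the checkpoint
    device_map="auto",       # shard the checkpoint across available GPUs
)

messages = [{"role": "user", "content": "Prove that sqrt(2) is irrational."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```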
Ant Group Open-Sources Ring-1T, a Trillion-Parameter Thinking Model: Overall Capability Approaches GPT-5, Math Performance on Par with an IMO Silver Medal
AI前线 · 2025-10-15 07:45
Core Insights
- Ant Group has officially launched the trillion-parameter thinking model Ring-1T, fully open-sourced including model weights and training recipes [2]
- Ring-1T shows significant improvements in natural-language reasoning and general performance across a range of tasks compared with its preview version [2]
- The model achieved impressive results on International Mathematical Olympiad (IMO) challenges, demonstrating its ability to solve complex mathematical problems [2]

Model Performance
- Ring-1T scored 81.59% on the Arena-Hard V2 human-preference alignment test, ranking first among open-source models and closely approaching GPT-5-Thinking (High) at 82.91% [3]
- On the HealthBench medical Q&A evaluation, Ring-1T likewise scored highest, the best result in the open-source domain [3]

Technical Innovations
- To address the discrepancy between training and inference precision in trillion-parameter models, Ant Group developed the "icepop" algorithm, which stabilizes the training-inference distribution during reinforcement learning (a hedged sketch of this idea follows the summary below) [5]
- The company also built ASystem, a high-performance reinforcement learning system that optimizes memory management and weight exchange for large-scale RL training [6]

Model Architecture
- Ring-1T continues to use the Ling 2.0 architecture, which combines a highly sparse MoE design with mixed-precision training to improve efficiency (see the routing sketch after this summary) [8]
- The model underwent multi-stage training, including LongCoT-SFT, RLVR, and RLHF, significantly improving its complex-reasoning and general capabilities [8]

Product Matrix
- Ant Group has released 18 models in total, spanning 16 billion to 1 trillion parameters; with the introduction of Ring-1T and Ling-1T, its large language model product line enters its 2.0 phase [9]
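For readers unfamiliar with sparse MoE, the efficiency argument in the architecture section comes down to routing: each token is processed by only a few of many expert feed-forward networks, so the parameters activated per token are a small fraction of the total (on the order of tens of billions out of 1T, per the figures cited in the first article). The sketch below is a generic top-k router; the expert count, dimensions, and k are illustrative, not Ling 2.0's published configuration.

```python
# Minimal top-k mixture-of-experts routing sketch (illustrative; the real
# Ling 2.0 expert count, hidden sizes, and router are not given in this
# article). Sparsity comes from each token passing through only k of
# n_experts feed-forward blocks.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=64, k=4):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                     # x: (tokens, d_model)
        logits = self.router(x)               # (tokens, n_experts)
        weights, idx = logits.topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)  # renormalize over the chosen k experts
        out = torch.zeros_like(x)
        for slot in range(self.k):            # only k experts run per token
            for e in idx[:, slot].unique().tolist():
                rows = idx[:, slot] == e
                out[rows] += weights[rows, slot, None] * self.experts[e](x[rows])
        return out

moe = SparseMoE()
print(moe(torch.randn(8, 512)).shape)  # torch.Size([8, 512])
```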
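On "icepop": the article only says it stabilizes the training-inference distribution, without giving a formula. The sketch below is therefore an assumption about the general idea, not Ant's published algorithm: because the training engine and the (differently optimized) inference engine can assign slightly different probabilities to the same sampled token, one simple stabilizer masks gradient contributions from tokens whose train/infer probability ratio drifts outside a trust band.

```python
# Hedged sketch of a training-inference mismatch stabilizer for RL.
# This is an illustrative assumption, NOT Ant's published icepop formula:
# tokens whose training-engine vs inference-engine probability ratio falls
# outside [low, high] are dropped from the policy-gradient update.
import torch

def masked_pg_loss(train_logp, infer_logp, advantages, low=0.5, high=2.0):
    """Policy-gradient loss over sampled tokens.
    train_logp: log pi(token) from the training engine (requires grad)
    infer_logp: log pi(token) from the inference engine that produced the sample
    """
    ratio = torch.exp(train_logp - infer_logp.detach())
    mask = ((ratio >= low) & (ratio <= high)).float()  # drop badly mismatched tokens
    per_token = -(train_logp * advantages.detach()) * mask
    return per_token.sum() / mask.sum().clamp(min=1.0)

train_logp = torch.randn(16, requires_grad=True) - 1.0
infer_logp = train_logp.detach() + 0.1 * torch.randn(16)  # simulated engine mismatch
loss = masked_pg_loss(train_logp, infer_logp, torch.randn(16))
loss.backward()
print(float(loss))
```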