Ant Group Open-Sources Trillion-Parameter Thinking Model Ring-2.5-1T, Breaking the LLM "Impossible Triangle"
Guan Cha Zhe Wang · 2026-02-14 10:25
Core Insights
- Ant Group has developed and open-sourced the world's first trillion-parameter thinking model, Ring-2.5-1T, which achieves fast inference speed, deep reasoning capabilities, and strong long-horizon task execution [1][9]
- The model scored 35 out of 42 in the IMO competition and 105 in the CMO, significantly exceeding the national training team's score line [1][7]

Model Architecture
- Ring-2.5-1T is based on the Ling 2.5 architecture, using a hybrid attention mechanism that combines MLA (Multi-Head Latent Attention) and Lightning linear attention in a 1:7 ratio [2]
- The model's active parameter count increased from 51 billion to 63 billion, yet its inference efficiency improved thanks to the linear attention layers' linear time complexity [2]

Performance and Capabilities
- The model shows significant advantages over other models of similar parameter count on long-sequence reasoning tasks, particularly in throughput as sequence length increases [2]
- Ring-2.5-1T has been benchmarked against a range of models and achieves top performance on high-difficulty reasoning tasks and long-duration task-execution benchmarks [5]

Training Innovations
- The model incorporates a dense reward mechanism based on Reinforcement Learning with Verifiable Rewards (RLVR), strengthening its logical reasoning and proof techniques [4]
- It also employs large-scale, fully asynchronous agentic RL training, improving its autonomous execution of complex tasks [4]

Ecosystem and Future Developments
- Ring-2.5-1T is compatible with major intelligent-agent frameworks and is available on platforms such as Hugging Face and ModelScope [7]
- Ant Group has also released other models, including LLaDA2.1 and Ming-flash-omni-2.0, covering capabilities such as non-autoregressive parallel decoding and multimodal representation [8]
- The company aims to provide reusable foundational solutions for developers, with plans to expand into video understanding, complex image editing, and real-time audio generation [8]
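The 1:7 MLA-to-linear-attention ratio mentioned above can be pictured as an interleaved layer layout: one full (quadratic) MLA layer per block of eight, with linear-attention layers filling the rest. The sketch below is purely illustrative, assuming a simple repeating block pattern; the function name, block size, and 80-layer depth are assumptions for demonstration, not details of Ant Group's actual implementation.

```python
def attention_kind(layer_idx: int, block_size: int = 8) -> str:
    """Return the attention type for a given layer index.

    In each block of `block_size` layers, the first layer uses full
    MLA attention and the remaining seven use linear attention,
    yielding the 1:7 ratio described for Ring-2.5-1T.
    """
    return "MLA" if layer_idx % block_size == 0 else "linear"


# Count layer types over a hypothetical 80-layer stack.
layout = [attention_kind(i) for i in range(80)]
print(layout.count("MLA"), layout.count("linear"))  # 10 MLA, 70 linear
```

Because only one layer in eight pays the quadratic attention cost, throughput degrades far more slowly as sequence length grows, which is consistent with the long-sequence advantages the article reports.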