StepFun's New Model Is So Fast It Seems to "Skip Reasoning"! With Yin Qi at the Helm, a Fresh Momentum Indeed
QbitAI (量子位) · 2026-02-03 07:45

Core Insights
- The article covers the launch of Step 3.5 Flash, a new open-source agent model with 196 billion total parameters, 11 billion active parameters, and a 256K-token context window [2][36].

Model Performance
- The model reaches a peak inference speed of 350 tokens per second (TPS). In agent scenarios and mathematical tasks it is comparable to closed-source models, and it can handle complex, long-chain tasks [5][41].
- Step 3.5 Flash scored 97.3 on AIME 2025 (math), 74.4% on SWE-bench Verified (coding), and 88.2 on τ²-Bench (agent tasks) [7][6].

Technical Architecture
- Step 3.5 Flash uses a sparse mixture-of-experts (MoE) architecture that activates approximately 11 billion parameters during inference, which keeps compute and deployment costs under control [36].
- The model uses a 3:1 sliding window attention mechanism to handle long contexts [37].
- It is trained with MIS-PO, a self-developed reinforcement learning framework aimed at improving reasoning and agent execution. The framework reduces data noise and gradient variance so that optimization stays stable on long-sequence tasks [42].

Ecosystem Integration
- The model has been adapted to AI accelerator platforms from several manufacturers, including Ascend, MetaX (Mu Xi), and Alibaba's T-Head, covering the current mainstream domestic AI hardware [4].
- Step 3.5 Flash emphasizes cloud-edge collaboration: the cloud handles complex planning and reasoning, while the edge handles secure data retrieval and local execution [30][32].

Future Developments
- The development team is already working on Step 4 [43].
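
To put the headline figures above in perspective, the following is a minimal back-of-envelope sketch in Python. The three constants come from the article. Everything else is an assumption made for illustration: the function names are hypothetical, the decoding time is a lower bound that assumes the peak rate is sustained throughout, and `layer_schedule` encodes only one possible reading of "3:1" (three sliding-window layers followed by one full-attention layer), which the article does not confirm.

```python
# Figures reported in the article.
TOTAL_PARAMS_B = 196   # total parameters, in billions
ACTIVE_PARAMS_B = 11   # parameters active per token, in billions
PEAK_TPS = 350         # peak decoding rate, tokens per second


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of the weights that take part in a single forward pass."""
    return active_b / total_b


def min_decode_seconds(num_tokens: int, tps: float = PEAK_TPS) -> float:
    """Lower bound on decoding time, assuming the peak rate is sustained."""
    return num_tokens / tps


def layer_schedule(num_layers: int, local_per_global: int = 3) -> list[str]:
    """Hypothetical reading of '3:1': three sliding-window layers,
    then one full-attention layer, repeated through the stack."""
    period = local_per_global + 1
    return [
        "full" if (i + 1) % period == 0 else "sliding"
        for i in range(num_layers)
    ]


if __name__ == "__main__":
    share = active_fraction(TOTAL_PARAMS_B, ACTIVE_PARAMS_B)
    print(f"active share per token: {share:.1%}")
    print(f"10,000 tokens at peak rate: {min_decode_seconds(10_000):.1f} s")
    print(layer_schedule(8))
```

Under these assumptions, roughly 5.6% of the weights are active for any given token, and a 10,000-token response takes at least about 28.6 seconds to decode. Real-world throughput depends on hardware, batch size, and context length, so these numbers are a ceiling on speed, not an expected value.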