MiniMax Officially Joins the "Spring Festival Season" Race; Coding Model MiniMaxM2.5 Goes Live

Core Viewpoint
- MiniMax has launched its new-generation text model, MiniMaxM2.5, which shows significant improvements in programming capability and performance metrics over its predecessor [2][4].

Group 1: Model Performance
- On programming benchmarks, MiniMaxM2.5 scored 80.2% on SWE-Bench Verified and 51.3% on Multi-SWE-Bench, a significant improvement over the previous generation [2].
- The model outperformed Opus4.6 in complex multi-language environments and demonstrated a "native Spec capability": it actively decomposes the architecture and plans functionality before writing code [2].

Group 2: Tool Utilization and Search Capabilities
- MiniMaxM2.5 can handle complex tasks automatically, achieving a 20% performance improvement over the previous model on tasks such as BrowseComp and WideSearch [4].
- In office scenarios, the model showed significant gains on high-level tasks involving Word, PPT, and Excel financial modeling, achieving an average win rate of 59.0% against mainstream models in the GDPval-MM evaluation framework [4].

Group 3: Speed and Cost Efficiency
- The M2.5-lightning version supports an output speed of over 100 tokens per second (TPS), approximately double that of mainstream models, with input priced at around $0.3 per million tokens and output at about $2.4 per million tokens [4].
- At a continuous output of 100 tokens per second, running M2.5 for one hour would cost approximately $1; at 50 tokens per second, the cost would be around $0.3 [4].
- MiniMax believes that once performance and cost are no longer constraints, the economic model for large-scale deployment of agents will fundamentally change [4].
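
The hourly cost estimate in Group 3 can be checked with simple arithmetic. The sketch below is illustrative only: it uses the M2.5-lightning prices quoted above and counts output tokens only. The function name `hourly_output_cost` is a hypothetical helper, not part of any MiniMax API.

```python
# Prices quoted in the article for M2.5-lightning (USD per million tokens).
INPUT_PRICE_PER_M = 0.3
OUTPUT_PRICE_PER_M = 2.4


def hourly_output_cost(tokens_per_second: float,
                       price_per_million: float = OUTPUT_PRICE_PER_M) -> float:
    """Return the USD cost of one hour of continuous output at the given speed.

    Assumption: only output tokens are billed here; input tokens are ignored.
    """
    tokens_per_hour = tokens_per_second * 3600
    return tokens_per_hour / 1_000_000 * price_per_million


if __name__ == "__main__":
    for tps in (100, 50):
        cost = hourly_output_cost(tps)
        print(f"{tps} tokens/s: ${cost:.3f} per hour (output only)")
```

At 100 tokens per second the output alone comes to about $0.86 per hour, which is consistent with the article's figure of roughly $1 once input tokens are added. At 50 tokens per second the same output price gives about $0.43, which is higher than the quoted $0.3; that figure presumably rests on different pricing for the slower version, which this summary does not state.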