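Hangzhou's Reign Over the Global Open-Source LLM Leaderboard Ends: Shanghai's Minimax M2 Is Swamped With Orders at Launch, at Just 8 RMB per Million Tokens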

Core Insights
- Minimax M2 has taken the top spot among open-source models, overtaking previous leaders DeepSeek and Qwen with a score of 61 in evaluations by Artificial Analysis [1][7].

Performance and Features
- Minimax M2 is designed specifically for agents and programming, with strong results on both coding and agentic tasks. It runs at twice the inference speed of Claude 3.5 Sonnet at only 8% of its API price [3][4].
- The model uses a highly sparse MoE architecture with 230 billion total parameters, of which only 10 billion are activated per token, which allows fast execution, especially when paired with advanced inference platforms [4][6].
- M2's interleaved thinking format lets it plan and verify operations across multiple dialogue turns, which is crucial for agent reasoning [6].

Competitive Analysis
- In the Artificial Analysis tests, which span ten popular datasets, M2 ranked fifth overall and first among open-source models [7].
- M2 is priced at $0.3 per million input tokens and $1.2 per million output tokens, only 8% of Claude 3.5 Sonnet's cost [8][14].

Agent Capabilities
- Minimax has made M2 available for free on its agent platform, showcasing applications that include web development and game creation [23][30].
- Users have built complex applications and games with M2, demonstrating its programming capabilities [36][38].

Technical Aspects
- M2 uses full attention. Minimax initially planned a hybrid design combining full attention with sliding window attention, but abandoned the sliding window component due to performance concerns [39][40].
- This choice reflects Minimax's strategy of optimizing for its specific use cases, even as the research community continues to debate the best approach for long-sequence tasks [47].
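
As a rough illustration of the pricing and sparsity figures quoted above, the following Python sketch estimates the cost of a single request and the share of parameters active per token. Only the four constants come from the summary ($0.3 and $1.2 per million tokens, 230 billion total and 10 billion activated parameters). The function names and the example workload are hypothetical, chosen purely for illustration, and the sketch does not reflect any official Minimax billing logic.

```python
# Figures quoted in the summary above.
INPUT_USD_PER_MILLION = 0.3    # M2 input price, USD per million tokens
OUTPUT_USD_PER_MILLION = 1.2   # M2 output price, USD per million tokens
TOTAL_PARAMS_B = 230           # total parameters, in billions
ACTIVE_PARAMS_B = 10           # activated parameters, in billions


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the quoted per-token prices.

    Hypothetical helper: assumes simple linear pricing with no
    discounts, caching, or minimum charges.
    """
    return (input_tokens * INPUT_USD_PER_MILLION
            + output_tokens * OUTPUT_USD_PER_MILLION) / 1_000_000


def active_fraction() -> float:
    """Share of the model's parameters that are activated."""
    return ACTIVE_PARAMS_B / TOTAL_PARAMS_B


if __name__ == "__main__":
    # Hypothetical workload: 200k input tokens and 50k output tokens.
    cost = request_cost_usd(200_000, 50_000)
    print(f"Estimated cost: ${cost:.2f}")                       # $0.12
    print(f"Active parameter share: {active_fraction():.1%}")   # 4.3%
```

Under these assumptions, roughly 4% of the weights participate in each forward pass, which is consistent with the summary's point that high sparsity allows fast execution.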