Core Viewpoint
- Xiaomi's newly announced open-source model MiMo-V2-Flash has entered the first tier of open-source models, pairing high efficiency with strong performance at a total parameter scale of only 309 billion [2][4].

Technical Innovations
- MiMo-V2-Flash employs a Mixture-of-Experts (MoE) architecture with 256 experts, matching much larger models with a smaller active parameter count [11].
- The model uses a dynamic activation mechanism that routes each token to only 8 experts, bringing its inference cost down to roughly 2.5% of that of the closed-source competitor Claude 4.5 Sonnet [12].
- Key technologies include a 5:1 mixed attention mechanism, a learnable attention aggregation bias, multi-token prediction (MTP) for inference acceleration, and multi-teacher online policy distillation (MOPD) for training efficiency [13][23].

Performance Metrics
- MiMo-V2-Flash scored 86.2 on the Arena-Hard benchmark and 84.9 on the MMLU-Pro complex-reasoning task, demonstrating strong general capabilities [27].
- In coding ability, it achieved 73.4% on SWE-Bench Verified, surpassing competitors such as DeepSeek-V3.2 and Kimi-K2 [28].

Real-World Applications
- The model has performed exceptionally in practical scenarios, such as generating complete code for a web-based macOS-style operating system and implementing complex features like gesture control [30][41].
- Compared with closed-source models, MiMo-V2-Flash produced more functional and interactive web applications [36][40].

Strategic Vision
- Xiaomi's development of MiMo-V2-Flash reflects a strategic shift toward becoming a major player in the AI model space, aiming to create a unified "brain" for its hardware ecosystem [62][63].
- The company envisions AI that integrates seamlessly with physical devices, improving control precision and response speed [60][61].
Xiaomi's large model "charges" into the first tier: No. 1 in coding ability among open-source models, with both IQ and EQ fully online
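The dynamic activation mechanism described under Technical Innovations (8 of 256 experts activated per token) is the standard top-k MoE routing pattern. Below is a minimal sketch of that pattern; the function names, shapes, and router design are illustrative assumptions, not Xiaomi's actual implementation.

```python
# Hypothetical top-k MoE routing sketch: 256 experts total, 8 active per
# token, as the article reports for MiMo-V2-Flash. Not the real model code.
import numpy as np

NUM_EXPERTS = 256  # total experts per MoE layer (from the article)
TOP_K = 8          # experts activated per token (from the article)

def route_token(router_logits: np.ndarray, k: int = TOP_K):
    """Select the k highest-scoring experts and renormalize their gates."""
    top_idx = np.argsort(router_logits)[-k:]             # indices of the k best experts
    scores = router_logits[top_idx]
    gates = np.exp(scores - scores.max())                # numerically stable softmax
    gates /= gates.sum()                                 # gates over the selected k only
    return top_idx, gates

def moe_forward(x, router_w, experts, k: int = TOP_K):
    """Sparse forward pass: only k of the NUM_EXPERTS expert FFNs run per token."""
    idx, gates = route_token(x @ router_w, k)
    return sum(g * experts[i](x) for i, g in zip(idx, gates))

# Fraction of experts active per token under 8-of-256 routing (expert
# parameters only; this is not the same thing as the 2.5% cost figure,
# which the article states relative to Claude 4.5 Sonnet).
print(TOP_K / NUM_EXPERTS)  # 0.03125
```

Because only the 8 selected expert feed-forward networks execute, per-token compute scales with the active experts rather than all 256, which is how a large total parameter count can coexist with a low inference cost.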