Homegrown Chinese Models Have Something to Show

Core Insights
- The article highlights the emergence of homegrown Chinese AI models, focusing on Xiaomi's MiMo-V2-Flash, which shows significant gains in efficiency and performance [1].

Group 1: MiMo-V2-Flash Overview
- MiMo-V2-Flash was officially released and open-sourced on December 17. It uses a mixture-of-experts (MoE) architecture with a total parameter count of 309 billion and 150 billion active parameters [1].
- The model reaches an inference speed of 150 tokens per second, with pricing cut to $0.1 per million input tokens and $0.3 per million output tokens [1].

Group 2: Cost-Efficiency Innovations
- A hybrid sliding-window attention mechanism reduces KV-cache storage by nearly six times while preserving long-text capability, supporting a context window of roughly 256k tokens [9].
- A three-layer MTP (multi-token prediction) setup speeds up coding tasks by about 2.5x and reduces GPU idle time in small-batch on-policy reinforcement learning [10].
- A multi-teacher online policy distillation (MOPD) approach reaches peak performance with only 1/50th of the compute of traditional methods, enabling faster model iteration and self-evolution [10].

Group 3: Competitive Positioning
- MiMo-V2-Flash ranks among the top two on the AIME 2025 mathematics competition and the GPQA-Diamond science-knowledge benchmark, and scores 73.4% on the SWE-bench Verified coding test [6][11].
- Its performance is competitive with leading models, closely approaching GPT-5-High on programming tasks [6].
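The quoted per-token prices ($0.1 per million input tokens, $0.3 per million output tokens) make request costs easy to estimate. A minimal sketch; the 8k/1k request size below is a hypothetical example, not a figure from the article:

```python
# Prices from the article: $0.1 / $0.3 per million input / output tokens.
IN_PRICE = 0.1 / 1_000_000   # USD per input token
OUT_PRICE = 0.3 / 1_000_000  # USD per output token

def request_cost(input_tokens, output_tokens):
    """Dollar cost of one API request at the quoted rates."""
    return input_tokens * IN_PRICE + output_tokens * OUT_PRICE

# Hypothetical request: 8k-token prompt, 1k-token completion.
print(f"${request_cost(8_000, 1_000):.4f}")  # → $0.0011
```

At these rates even a fairly long prompt costs a fraction of a cent, which is the practical meaning of the cost reduction the article emphasizes.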
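The "nearly six times" KV-cache reduction from hybrid sliding-window attention comes from most layers caching only a short window instead of the full ~256k context. A toy sketch of where the ratio comes from; the layer count (48), window size (4,096), and "1 in 6 layers keep full attention" mix are assumptions for illustration, not figures from the article:

```python
def kv_tokens_cached(layer_ctx_lens):
    """Total token slots held in the KV cache, summed over layers."""
    return sum(layer_ctx_lens)

CTX = 262_144   # ~256k context window (from the article)
WINDOW = 4_096  # assumed sliding-window size (not stated in the article)
N_LAYERS = 48   # assumed layer count (not stated in the article)

# Pure full attention: every layer caches the whole context.
full = kv_tokens_cached([CTX] * N_LAYERS)

# Hybrid: assume 1 in 6 layers keeps full attention, the rest use the window.
hybrid = kv_tokens_cached(
    [CTX if i % 6 == 0 else WINDOW for i in range(N_LAYERS)]
)
print(f"KV cache reduction: {full / hybrid:.1f}x")  # → KV cache reduction: 5.6x
```

With these toy numbers the reduction lands around 5.6x, in the ballpark of the article's "nearly six times"; the exact ratio depends on the layer mix and window size, which are not disclosed here.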
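The multi-token prediction (MTP) speedup works by drafting several tokens per forward pass and keeping only the prefix the base model verifies, which is the standard speculative-decoding acceptance rule. A minimal sketch of that accept/verify core with toy token IDs; none of this reflects Xiaomi's actual implementation:

```python
def accept_prefix(draft_tokens, verified_tokens):
    """Keep drafted tokens while the verifier agrees; at the first
    mismatch, take the verifier's token and stop (generic speculative-
    decoding acceptance rule, not Xiaomi's exact scheme)."""
    accepted = []
    for d, v in zip(draft_tokens, verified_tokens):
        if d == v:
            accepted.append(d)
        else:
            accepted.append(v)  # the verifier's correction is always kept
            break
    return accepted

# Toy example: 3 drafted tokens (matching a three-layer MTP depth);
# the verifier disagrees on the last one, so 2 drafts + 1 correction land.
print(accept_prefix([5, 7, 9], [5, 7, 2]))  # → [5, 7, 2]
```

Accepting up to three tokens per verification pass bounds the speedup at ~3x; with occasional rejected drafts, an average around the article's ~2.5x is plausible.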