
5x Faster Long-Text Inference! ModelBest Releases MiniCPM4 On-Device Models, with the 0.5B Model Outperforming Its Class
AI前线 · 2025-06-12 06:07
GitHub: https://github.com/openbmb/minicpm
Hugging Face: https://huggingface.co/collections/openbmb/minicpm-4-6841ab29d180257e940baa9b
ModelScope: https://www.modelscope.cn/collections/MiniCPM-4-ec015560e8c84d

Compiled by | Hua Wei

Recently, ModelBest released MiniCPM 4.0, the new generation of its "little steel cannon" on-device model line, in two parameter scales: 8B and 0.5B. The 8B sparse "lightning" version delivers a major leap in on-device performance, while the 0.5B version "punches above its weight," suiting a wide range of terminal scenarios. To date, the MiniCPM series has accumulated over 10 million downloads across all platforms. According to the release, MiniCPM 4.0-8B is the first natively sparse model: an extremely high 5% sparsity, combined with a burst of system-level innovations, lets long-text and deep-reasoning workloads truly run on device. On benchmarks including MMLU, CEval, MATH500, and HumanEval, MiniCPM 4.0-8B, with only 22% of the training cost, achieves performance on par with ...
MiniCPM 4.0 Released: Performance on Par with Qwen-3-8B, Up to 220x Speedup
Sina Tech · 2025-06-10 09:37
Core Insights
- The fourth generation of the "MiniCPM" model, MiniCPM 4.0, has been released in two parameter scales, 8B and 0.5B, achieving the best performance in its class [2][3]
- The MiniCPM 4.0-8B model uses a sparse attention mechanism, matching Qwen-3-8B's performance while requiring only 22% of the training cost [2][4]
- The model reaches an inference speed of 600 tokens/s, with up to 220x acceleration in extreme scenarios, significantly enhancing long-text processing capabilities [2][3]

Performance and Architecture
- MiniCPM 4.0 delivers a 5x speedup in long-text inference over comparable models such as Qwen-3-8B and Llama-3-8B, and up to 220x under memory-constrained conditions [3][4]
- Its attention architecture, InfLLMv2, reduces sparsity from the industry-standard 40%-50% to just 5%, performing long-text computation with only 1/10 of the computational load [4]
- For 128K long-text scenarios, MiniCPM 4.0-8B needs only 1/4 of the KV-cache storage that Qwen3-8B requires, reflecting substantial model compression and efficiency [4]

Applications and Market Impact
- Building on the 8B version, the company has fine-tuned two capability-specific models: one serving as an MCP Client, and a research tool, MiniCPM4-Survey, which competes with Deep Research [5]
- The MiniCPM series has surpassed 10 million downloads across all platforms, indicating strong market interest and adoption [5]
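The efficiency figures above (5% attention sparsity cutting long-text compute to roughly 1/10 of a 50%-sparse baseline, and a 4x-smaller KV cache at 128K context) can be sanity-checked with back-of-envelope arithmetic. Below is a minimal sketch; the block size, layer count, and head shapes are illustrative assumptions, not MiniCPM's or Qwen3-8B's published configurations, and the helper functions are hypothetical.

```python
# Back-of-envelope arithmetic behind the efficiency claims.
# All shapes here are illustrative assumptions, not any model's real config.

def attention_block_pairs(seq_len: int, block_size: int, keep_frac: float) -> int:
    """Query-block x key-block pairs evaluated when each query block
    attends to only a keep_frac fraction of the key/value blocks."""
    n_blocks = seq_len // block_size
    kept = max(1, round(n_blocks * keep_frac))
    return n_blocks * kept

def kv_cache_bytes(seq_len: int, n_layers: int, n_kv_heads: int,
                   head_dim: int, bytes_per_elem: int = 2) -> int:
    """fp16 K and V tensors cached for every layer."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

SEQ = 128_000  # ~128K-token context

dense = attention_block_pairs(SEQ, 64, 1.0)     # full attention
half = attention_block_pairs(SEQ, 64, 0.5)      # ~50%-sparse baseline
sparse5 = attention_block_pairs(SEQ, 64, 0.05)  # 5% sparsity

print(f"5% vs 50% compute: {sparse5 / half:.2f}x")  # ~1/10 of the baseline

cache = kv_cache_bytes(SEQ, 32, 8, 128)  # a generic 8B-class shape
print(f"full KV cache at 128K : {cache / 2**30:.1f} GiB")
print(f"quarter-size cache    : {cache / 4 / 2**30:.1f} GiB")
```

The ratio between the 5% and 50% settings reproduces the article's "1/10 of the computational load" claim directly; the cache figures show why a 4x reduction in KV storage matters for fitting 128K contexts into on-device memory.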