Edge-side models
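Long-text inference 5x faster! ModelBest releases the MiniCPM4 edge-side model, and the 0.5B model crushes its peers
AI Frontline (AI前线) · 2025-06-12 06:07
GitHub: https://github.com/openbmb/minicpm Hugging Face: https://huggingface.co/collections/openbmb/minicpm-4-6841ab29d180257e940baa9b ModelScope: https://www.modelscope.cn/collections/MiniCPM-4-ec015560e8c84d Compiled by Hua Wei (华卫). ModelBest (面壁智能) has released MiniCPM4.0, the new generation of its "little steel cannon" edge-side models, in two parameter sizes, 8B and 0.5B. The 8B sparse "lightning" edition brings a large jump in edge-side performance; the 0.5B model punches above its weight and fits a wide range of device scenarios. To date, the MiniCPM series has passed 10 million cumulative downloads across all platforms. According to the announcement, MiniCPM4.0-8B is the first native sparse model: its very high sparsity of 5%, combined with a burst of system-level innovations, makes long-text and deep-reasoning workloads truly runnable on-device. On benchmarks such as MMLU, CEval, MATH500, and HumanEval, MiniCPM4.0-8B, with only 22% of the training cost, matches the performance of ...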
ModelBest releases the MiniCPM4 edge-side model: long-text inference 5x faster, and the 0.5B model sets a new SOTA
AI Tech Camp (AI科技大本营) · 2025-06-10 09:31
Core Viewpoint
- The release of MiniCPM4.0 marks a significant advancement in edge-side models, showcasing innovations in performance, speed, and storage efficiency, particularly for long text processing [1][4][32]

Group 1: Model Performance and Efficiency
- MiniCPM4.0-8B is the first native sparse model with a 5% sparsity, achieving a performance comparable to Qwen-3-8B while using only 22% of the training resources [2][5][6]
- MiniCPM4.0-0.5B demonstrates impressive performance with a training cost of just 2.7%, outperforming larger models like Qwen-3-0.6B and Llama 3.2, achieving a speed of 600 Token/s [2][5][9]
- The model's architecture allows for a 5x speed increase in long text inference and up to 220x in extreme scenarios, addressing the industry's challenge of slow long text processing [4][9][16]

Group 2: Technological Innovations
- The introduction of the InfLLM sparse attention architecture significantly reduces computational costs, allowing for efficient long text processing by lowering the sparsity from 40%-50% to 5% [18][19][20]
- MiniCPM4.0 employs a three-tiered self-developed inference framework, CPM.cu, which optimizes performance for edge devices, achieving a 5x speed enhancement [21][22]
- The model utilizes advanced quantization techniques, including P-GPTQ and BitCPM, to minimize computational and memory demands, ensuring efficient deployment [23][24]

Group 3: Data and Training Efficiency
- The company emphasizes the importance of high-quality data, utilizing innovative methods to construct datasets, which significantly reduces validation costs by 90% [29][30]
- The training strategy incorporates the upgraded Model Wind Tunnel v2, optimizing hyperparameter configurations and enhancing GPU resource utilization [30][32]
- MiniCPM4.0's development reflects a commitment to maximizing research investment returns through systematic improvements across data, training, and inference processes [28][32]

Group 4: Market Position and Future Directions
- MiniCPM4.0 has achieved over 10 million downloads across all platforms, indicating strong market acceptance and recognition [32]
- The company plans to continue enhancing model knowledge density and intelligence levels, driving efficient development and large-scale applications in edge-side AI [32]
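
To put the 5% sparsity figure from Group 2 in rough numbers, the sketch below counts query-key pairs for dense attention versus attention restricted to a fixed share of the context. It is back-of-the-envelope arithmetic, not InfLLM's actual selection logic. It assumes that "5% sparsity" means each token scores about 5% of the context positions, and the 128,000-token context length and both function names are hypothetical, chosen only for illustration.

```python
def attended_keys(context_len: int, keep_percent: int) -> int:
    """Keys each query scores when only keep_percent of the context is kept.

    Integer arithmetic with rounding up, so the result is exact.
    """
    return (context_len * keep_percent + 99) // 100


def attention_pair_reduction(context_len: int, keep_percent: int) -> float:
    """Ratio of dense query-key pairs to sparse query-key pairs."""
    dense_pairs = context_len * context_len
    sparse_pairs = context_len * attended_keys(context_len, keep_percent)
    return dense_pairs / sparse_pairs


# Hypothetical 128,000-token context; 5 is the sparsity quoted in the summary.
context_len = 128_000
print(attended_keys(context_len, 5))             # keys scored per query
print(attention_pair_reduction(context_len, 5))  # dense / sparse pair count
```

Under these assumptions the ratio only bounds the saving in attention-score computation. It is not directly comparable to the end-to-end 5x and 220x speedups quoted above, which also depend on the inference framework, quantization, and hardware.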
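0.5B punches above its weight to set a new edge-side SOTA: runs on an RTX 4090, with a typical 5x speedup on long-text processing | Open-sourced by Tsinghua & ModelBest
QbitAI (量子位) · 2025-06-10 07:35
Contributed by Tsinghua University & ModelBest. QbitAI | WeChat official account QbitAI. The value-for-money champion of edge-side models: a team from Tsinghua University and ModelBest has open-sourced a new model, MiniCPM4, in 8B and 0.5B parameter sizes, reaching best-in-class performance with only 22% of the training cost of comparable open-source models. MiniCPM4-8B is the first open-source native sparse model; its very high 5% sparsity makes long-text and deep-reasoning workloads truly runnable on-device. On benchmarks such as MMLU, CEval, MATH500, and HumanEval, it matches Qwen-3-8B and surpasses Gemma-3-12B with only 22% of the training cost. MiniCPM4-0.5B also punches above its weight: on MMLU, CEval, BBH, HumanEval, and other benchmarks it outperforms the similarly sized Qwen-3-0.6B, Llama 3.2, and Gemma3, and native QAT gives it nearly lossless int4 quantization and an inference speed of 600 tokens/s. On common edge-side chips such as Jetson AGX Orin and RTX 4090, MiniCPM4 achieves a typical 5x speedup on long-text processing and a hundredfold speedup in extreme scenarios. The team has now published its technical report, and the ...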
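Opening the era of edge-side long text! ModelBest's all-new architecture makes the "little steel cannon" up to 220x faster
Synced (机器之心) · 2025-06-09 08:03
Reported by Synced. Editor: Zenan (泽南). Edge-side large models are undergoing a qualitative change. Edge-side language models have finally seen a transformative innovation. Last Friday, at the 2025 BAAI Conference (智源大会), the well-known Chinese AI startup ModelBest officially released MiniCPM 4.0, the latest generation of its "little steel cannon" models, shifting AI development into "full ahead". The model, pretraining data, and edge-side inference framework have all been open-sourced. While defending its title as the world's strongest edge-side model, the MiniCPM 4.0 series also shows another technical breakthrough rooted in the underlying architecture, the first in the large-model field since DeepSeek. Hundredfold speedup: at the launch, ModelBest's CEO announced that MiniCPM 4.0 delivers the industry's first system-level context-sparse language model innovation, reaching a very high sparsity of 5%, making long-text inference runnable on-device and opening the era of edge-side long text. MiniCPM 4.0 comes in 8B and 0.5B parameter versions, both of which raise the ceiling for edge-side model capability. According to the company, through innovations across architecture, algorithms, data, and systems, the new context-sparse, efficient-architecture MiniCPM 4.0 8B delivers a stable 5x long-text inference speedup over same-size models such as Qwen-3-8B, Llama-3-8B, and GLM-4-9B, and up to 220x in extreme scenarios, achieving ...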
Guotai Haitong | Electronics: DeepSeek R1 update accelerates expansion into commercial scenarios
Core Viewpoint
- The update of DeepSeek R1 enhances its deep thinking capabilities, positioning it alongside top international models like OpenAI-o3 and Gemini-2.5-Pro-0506, which is expected to accelerate the growth of domestic computing power demand and the implementation of edge models [1][2].

Summary by Sections

Performance Improvement
- DeepSeek R1-0528 has achieved performance iteration through improved training methods, showing significant enhancements in deep thinking capabilities across various benchmarks, closely matching the performance of leading international models [3].
- The distilled model, DeepSeek-R1-0528-Qwen3-8B, demonstrates strong performance in mathematical testing, ranking just below DeepSeek-R1-0528 and comparable to Qwen3-235B [3].
- The updated model has reduced hallucination rates by approximately 45-50% in tasks such as rewriting and reading comprehension, while also optimizing for different writing styles, enabling the generation of more structured long-form content [3].

Commercialization Potential
- The performance improvements in deep thinking and writing, along with reduced hallucination rates, are expected to enhance user experience, potentially increasing user penetration and daily usage frequency, thereby driving growth in the domestic computing power industry [4].
- The outstanding performance of the distilled training model is anticipated to accelerate the deployment of large models on edge devices such as smartphones, PCs, and smart glasses, improving the intelligence level of these devices and enabling AI empowerment [4].

Catalyst
- The iterative upgrade of DeepSeek's large model performance serves as a catalyst for further advancements in the field [5].
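[AI New Generation] Moutai fund joins the round! ModelBest (面壁智能) closes a new funding round worth several hundred million yuan; among large-model companies chasing capital, some celebrate while others struggle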
Hua Xia Shi Bao · 2025-05-22 14:46
Group 1
- The core viewpoint of the articles highlights a significant shift in investment logic within the AI industry, moving from investing in models to prioritizing application-focused investments [1][7][9]
- The "AI Six Tigers" have largely fallen silent in terms of financing, with only a few companies like Zhipu and Mianbi Intelligence (ModelBest) successfully securing funding [1][5]
- Mianbi Intelligence has raised substantial funding, including a recent round worth several hundred million yuan backed by various investors, indicating strong market interest in application-oriented AI solutions [2][5]

Group 2
- Mianbi Intelligence focuses on edge models rather than general-purpose foundational models, having released several iterations of its flagship product, MiniCPM [3][5]
- The company has strategically positioned itself in various sectors, particularly in the automotive industry, by forming partnerships with major tech firms like Intel [5][6]
- Investment in AI applications has shown new characteristics, with a stable number of financing cases but smaller individual investment amounts compared to previous years [7][8]
Huatai Securities | Robotics Industry Tracker
2025-06-30 01:02
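The AI robot that XPeng Motors unveiled at the 2025 Shanghai Auto Show learned to walk autonomously through visual learning; its gait was graceful and the demo exceeded expectations. The market had little visibility into the state and progress of XPeng's robot product and expectations were low, so the showing was all the more striking. XPeng is moving fastest on robots among automakers, its in-house software and autonomous-driving technology are in the first tier, and it has an advantage in cutting hardware supply-chain costs. XPeng is expected to reach mass production in 2026 and to follow Tesla's route by building B2B robots first.

Huatai Securities | Robotics Industry Tracker 20250427

What are the important changes in the hardware of XPeng's AI robot?
There are several: first, the use of lead screws, which makes XPeng the first company in China to use lead screws in volume, a technology previously adopted only by Tesla; second, a high number of degrees of freedom in the hand, with tactile sensors set to become a key area of attention; third, axial-flux (disc) motors ...

• Tesla has released a micro lead screw and tendon-cable design and has assessed the domestic supply chain and clarified its order intentions, driving the development of a localized substitution chain; Rongtai holds a clear positioning advantage in lightweight structural parts and micro lead screws.
• At the Beijing machine tool show, demand for humanoid-robot lead screw equipment was strong; domestic machine tool companies have full order books but cannot meet demand, and domestic firms are actively developing dedicated grinders and turn-mill machining methods to cut costs and raise production efficiency.

Summary
• XPeng Motors, in robot ...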