端侧大模型

面壁智能 Completes a New Round of Financing Worth Hundreds of Millions of Yuan
Sou Hu Cai Jing· 2025-05-21 02:37
Core Insights
- Recently, Mianbi Intelligent (面壁智能) completed a new round of financing amounting to several hundred million yuan, led by Hongtai Fund, Guozhong Capital, Qingkong Jinxin, and Moutai Fund, marking its third round of financing since 2024 [1][2]
- Mianbi Intelligent has rapidly built a complete matrix of full-modal, multi-modal, and foundational models, continuously pushing the boundaries of edge large model capabilities [1]
- The MiniCPM series has surpassed 10 million downloads and was recognized as the most downloaded and most popular Chinese large model on Hugging Face in 2024 [1]

Financing and Investment
- The new financing will further consolidate Mianbi Intelligent's technology and product moat in efficient large models, accelerating industry adoption and ecosystem expansion [2]
- The company aims to promote the large-scale application of "edge brains" across industries by collaborating with upstream and downstream partners [2]

Product Development
- In September 2024, Mianbi Intelligent released the MiniCPM 3.0 model, which outperforms GPT-3.5 with 4 billion parameters [1]
- The MiniCPM-V 2.6 model, launched in August 2024, achieved state-of-the-art results in single-image, multi-image, and video understanding with only 8 billion parameters, matching GPT-4V capabilities [1]
- The first full-modal model, MiniCPM-o 2.6, was introduced in January 2025, enabling real-time interaction with 8 billion parameters [1]

Applications and Collaborations
- Mianbi Intelligent launched the "MiniCPM Super Assistant cpmGO", the world's first purely edge-side intelligent assistant for vehicles [2]
- The company participated in developing the "Faxin Legal Foundation Model", which has been released by the Supreme People's Court [2]
- In collaboration with Tsinghua University, Mianbi Intelligent introduced the AI student growth assistant "Qingxiaoda", providing personalized intelligent assistants for all undergraduate students [2]
Phones Smoothly Handle 128K-Token Long Texts: vivo's New On-Device Algorithm Breaks Through Memory Limits | ACL 2025
量子位· 2025-05-20 05:12
Contributed by the vivo on-device large model team / 量子位 | WeChat official account QbitAI

Processing long texts on edge devices often runs into compute and memory bottlenecks.

EdgeInfinite, an algorithm from vivo AI研究院 (vivo AI Research) designed specifically for edge devices, lets them handle extremely long texts more efficiently and smoothly: the method can process inputs of up to 128K tokens on devices with less than 10 GB of GPU memory.

The work has been accepted to ACL 2025. More details follow.

EdgeInfinite: an efficient algorithm for long-text processing on edge devices

On-device LLMs encounter many long-text input scenarios in practice (such as call summaries and personal document summarization), but because of the resource constraints of edge devices, existing LLMs cannot handle very long contexts once deployed on device.

The reason is that today's LLMs are built on the Transformer architecture, whose compute time and memory footprint grow sharply as the input gets longer; the challenge becomes even more pronounced when Transformer-style models have to run on edge devices.

To address this, vivo AI研究院 proposes EdgeInfinite, a long-text algorithm for edge devices that integrates memory compression into the Transformer architecture through a trainable gated memory module. This method and the native Transformer architecture ...
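To make the gated memory idea concrete, below is a minimal PyTorch sketch of how such a module could be wired into attention: past chunks are compressed into a fixed-size memory tensor, and a learnable gate mixes the memory readout with ordinary attention over the current chunk. The class name, shapes, memory-update rule, and gate form are illustrative assumptions, not vivo's actual EdgeInfinite implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedMemoryAttention(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        # Learnable gate that mixes the memory readout with local attention.
        self.gate = nn.Parameter(torch.zeros(dim))
        self.dim = dim

    def forward(self, x, memory=None, norm=None):
        # x: (batch, chunk_len, dim) -- one chunk of a long input
        q, k, v = self.q(x), self.k(x), self.v(x)

        # Ordinary softmax attention within the current chunk only.
        attn = torch.softmax(q @ k.transpose(-2, -1) / self.dim ** 0.5, dim=-1)
        local_out = attn @ v

        # Fixed-size compressive memory of all previous chunks:
        # its shape never grows with sequence length.
        if memory is None:
            memory = x.new_zeros(x.size(0), self.dim, self.dim)
            norm = x.new_zeros(x.size(0), self.dim, 1)

        # Read from memory using activated queries (linear-attention style).
        sigma_q = F.elu(q) + 1.0
        mem_out = (sigma_q @ memory) / (sigma_q @ norm + 1e-6)

        # Per-channel gate decides how much to trust memory vs. local context.
        g = torch.sigmoid(self.gate)
        out = g * mem_out + (1.0 - g) * local_out

        # Write this chunk's keys/values into memory for later chunks.
        sigma_k = F.elu(k) + 1.0
        memory = memory + sigma_k.transpose(-2, -1) @ v
        norm = norm + sigma_k.sum(dim=1, keepdim=True).transpose(-2, -1)
        return out, memory, norm


# Usage sketch: stream a long input chunk by chunk, carrying the memory along.
block = GatedMemoryAttention(dim=64)
memory = norm = None
for chunk in torch.randn(8, 512, 64).split(128, dim=1):  # 4 chunks of 128 tokens
    out, memory, norm = block(chunk, memory, norm)
```

Because the memory tensor keeps a fixed size no matter how many chunks have been processed, peak memory no longer grows with input length, which is what makes 128K-token inputs plausible within a sub-10 GB budget.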
The Battle for AI-Native Smartphones: A Showdown Among Three Camps
36Ke · 2025-05-07 12:23
Core Insights
- The smartphone industry is undergoing an AI revolution, with manufacturers increasingly integrating AI features into their new products, marking a shift from traditional hardware innovation to AI-driven functionalities [2][5][14]
- IDC forecasts a dramatic increase in AI smartphone shipments in China, with year-on-year growth of 591% in 2024 and a penetration rate rising from 3% in 2023 to 22% [4]
- Competition among smartphone manufacturers is shifting from hardware specifications to AI capabilities, emphasizing the need for end-to-end AI design from chips to operating systems [8][13]

Group 1: Industry Trends
- The AI smartphone market is expected to reach 1.18 billion units by 2025, accounting for 40.7% of the overall market [4]
- High-end smartphones priced above $600 are projected to exceed 30.9% of market share, with AI features contributing 75% of their premium pricing [4]
- The average smartphone replacement cycle has extended to 51 months, prompting manufacturers to focus on AI to drive consumer upgrades [5]

Group 2: Technological Developments
- The new generation of smartphones must feature advanced AI capabilities, including large-model computing power, system-level AI integration, and proactive service in various scenarios [8][16]
- AI's impact on imaging technology is significant, with innovations allowing real-time analysis and optimization of images, enhancing capabilities beyond traditional photography [10][11]
- The relationship between hardware manufacturers and AI developers is evolving, with companies like Qualcomm and Huawei creating ecosystems that support AI development and deployment [17][22]

Group 3: Competitive Landscape
- Major smartphone manufacturers are divided into three camps: Apple, Huawei, and an open ecosystem represented by brands like Xiaomi and Honor, each pursuing different AI strategies [20][22]
- Huawei is positioned to lead the AI smartphone market due to its strong R&D investment and technological capabilities in AI chipsets and cloud collaboration [22][23]
- The future of smartphones may not rely solely on traditional devices, raising questions about the evolution of AI-native smart devices beyond current smartphones [23][24]
ICML 2025 Spotlight | Huawei Noah's Ark Lab Proposes MoLE, a New On-Device Large Model Architecture That Cuts Memory-Transfer Cost by 1000x
机器之心· 2025-05-07 00:33
Mixture-of-Experts (MoE) models activate only the small subset of experts each token needs at inference time; thanks to this sparse activation, MoE has become the mainstream architecture for today's LLMs. However, while MoE significantly cuts inference-time computation, its total parameter count is still larger than that of a Dense model with comparable performance, so it remains difficult to deploy in on-device scenarios where GPU memory is extremely limited.

The idea

The prevailing solution today is Expert Offloading: expert modules are stored in lower tiers of the memory hierarchy (such as CPU memory or even disk), and the activated experts are loaded into GPU memory on demand for computation. But this approach has two major drawbacks:

The paper's core observation is that in expert-offloading schemes, experts are loaded into GPU memory mainly so that efficient matrix operations can run on the GPU. In other words, if the expert computation could bypass the need for matrix operations, the expert weights would never have to be loaded into GPU memory, fundamentally avoiding the overhead of frequent loading. Intuitively, an expert module is just a neural network that models a mapping from inputs to outputs. If all possible input-output pairs could be precomputed before inference and stored as a lookup table, then at inference time a simple lookup could replace the matrix operations.

To address these problems, researchers from Peking University and Huawei Noah's Ark Lab propose Mixture-of-Lo ...
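A toy PyTorch sketch can make the lookup-table replacement concrete: the train-time expert is an ordinary FFN, but if its input is drawn from a finite set (here, the embedding of each vocabulary token), its outputs can be precomputed once offline and stored as a table, so inference becomes an index lookup with no expert weights in GPU memory and no matrix multiplications. The re-parameterization of the expert input, the names, and the sizes are assumptions for illustration, not the MoLE paper's exact design.

```python
import torch
import torch.nn as nn

class ExpertMLP(nn.Module):
    """An ordinary FFN expert used during training."""
    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))

    def forward(self, x):
        return self.net(x)

@torch.no_grad()
def build_lookup_table(expert: ExpertMLP, token_embeddings: torch.Tensor) -> torch.Tensor:
    """Precompute the expert's output for every vocabulary token.

    token_embeddings: (vocab_size, dim) -- the model's embedding matrix.
    Returns a (vocab_size, dim) table that can live in CPU memory or on disk.
    """
    return expert(token_embeddings)

def lookup_expert(table: torch.Tensor, token_ids: torch.Tensor) -> torch.Tensor:
    """Inference-time 'expert computation': just an index into the table."""
    return table[token_ids]  # (batch, seq, dim); no matmul, no weight loading

# Usage sketch: precompute once offline, then serve by indexing.
vocab_size, dim = 32000, 512
embedding = nn.Embedding(vocab_size, dim)
expert = ExpertMLP(dim, hidden=2048)
table = build_lookup_table(expert, embedding.weight)   # offline step
token_ids = torch.randint(0, vocab_size, (2, 16))      # (batch, seq)
expert_out = lookup_expert(table, token_ids)           # replaces the FFN call
```

Under this scheme only the table rows for the tokens actually present in a batch need to be fetched, a few embedding-sized vectors rather than whole expert weight matrices, which is the intuition behind the headline reduction in memory-transfer cost.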
A New Speed Record for Smart Cars: In Just 10 Months, the First Purely On-Device Large Model Reaches Mass Production in a Vehicle
量子位· 2025-04-24 10:29
金磊, reporting from 凹非寺 / 量子位 | WeChat official account QbitAI

The on-device large model world just staged its own episode of "Fast & Furious".

At the Shanghai Auto Show, as Changan Mazda launched its new car, the vehicle's smart cockpit turned out to be one of the biggest highlights.

The reason is sheer speed: going from zero to mass production took only 10 months. In the automotive industry, this kind of work is normally measured in years. The move stunned the industry, set a new record, and brought the unit of measurement down to "months".

What's more, the company behind it is a "newcomer" to the car world: 面壁智能. Yes, the same on-device large model player that made its name in the LLM world by punching above its weight.

Its flagship model line, the "面壁小钢炮" (MiniCPM) series, has long led global on-device AI in performance and algorithmic innovation and is a standard-bearer among edge models. With model sizes from under 1B to 8B parameters, 面壁 has achieved striking on-device GPT-4V- and GPT-4o-level results, and not long ago released the world's first full-modal on-device model. Its stellar reputation comes from high efficiency, low cost, and punching above its weight; since the start of this year it has often been called the "on-device DeepSeek".

Its smart-cockpit product is the MiniCPM Super Assistant cpmGO, the industry's first intelligent assistant driven by a purely on-device large model. Overall, cpmGO has the following characteristics: fast, accurate, and stable. A 91% execution accuracy, with interactions flowing as smoothly as ...