Mianbi "Little Steel Cannon" MiniCPM
Large Models Double in Performance Every Hundred Days: Tsinghua Team's "Density Law" Published in a Nature Sub-Journal
36Kr · 2025-11-20 08:48
Core Insights
- The article discusses the challenges facing large-model development and a new perspective on it: the "Density Law" proposed by Tsinghua University. The law states that the maximum capability density of large language models grew exponentially from February 2023 to April 2025, doubling approximately every 3.5 months [1][8].

Group 1: Scaling Law and Density Law
- Since 2020, OpenAI's Scaling Law has driven the rapid development of large models, but by 2025 the sustainability of this path is in question because training costs keep rising and publicly available internet data is nearly exhausted [1].
- The Density Law offers a new perspective on model development: just as the semiconductor industry advanced by increasing chip density, large models can develop efficiently by increasing capability density [3][4].

Group 2: Implications of Density Law
- The research team hypothesizes that models of different sizes, when adequately trained, have the same capability density, which establishes a baseline against which other models can be measured [4].
- The Density Law implies that the inference cost for models of the same capability decreases exponentially over time. Empirically, the API price for models at the GPT-3.5 level has fallen by a factor of 266.7 over 20 months, roughly halving every 2.5 months [7][8].

Group 3: Acceleration of Capability Density
- An analysis of 51 recent open-source large models found that the maximum capability density has been increasing exponentially, with a doubling time of approximately 3.5 months since 2023 [8][9].
- Since the release of ChatGPT, capability density has grown faster, doubling every 3.2 months compared with every 4.8 months before, a 50% acceleration in density improvement [9][10].

Group 4: Limitations of Model Compression
- The research found that model compression algorithms do not always increase capability density: many compressed models performed worse than their original counterparts because of insufficient training [11][13].

Group 5: Future Prospects
- The intersection of chip circuit density (Moore's Law) and model capability density (Density Law) suggests that edge devices will be able to run higher-performance large models, leading to explosive growth in edge computing and on-device intelligence [14].
- Tsinghua University and the Mianbi Intelligence team are advancing high-density model development. Models such as MiniCPM and VoxCPM have gained global recognition and significant download numbers, indicating a trend towards efficient, low-cost models [16].
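The two rates quoted above (a price halving roughly every 2.5 months, and a 50% acceleration in density growth) follow from simple exponential arithmetic. The sketch below checks them using only the figures reported in the summary; the function names `halving_time` and `speedup` are illustrative and do not come from the paper.

```python
import math


def halving_time(total_factor: float, months: float) -> float:
    """Months per halving for a quantity that shrinks by `total_factor` over `months`.

    Assumes a constant exponential decline: total_factor = 2 ** (months / T).
    """
    return months / math.log2(total_factor)


def speedup(old_doubling: float, new_doubling: float) -> float:
    """Relative increase in the doubling *rate* when the doubling time shortens."""
    return old_doubling / new_doubling - 1.0


# Figures quoted in the summary: a 266.7x price drop over 20 months,
# and doubling times of 4.8 months (before ChatGPT) vs 3.2 months (after).
price_halving = halving_time(266.7, 20)
density_speedup = speedup(4.8, 3.2)

print(f"price halves about every {price_halving:.2f} months")
print(f"density growth accelerated by about {density_speedup:.0%}")
```

Under the constant-rate assumption, a 266.7-fold drop corresponds to about eight halvings in 20 months, which is consistent with the "roughly every 2.5 months" figure.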
Large Models Double in Performance Every Hundred Days! Tsinghua Team's "Densing Law" Published in a Nature Sub-Journal
AI Frontline · 2025-11-20 06:30
Core Insights
- The article discusses the evolution of large AI models, highlighting the challenges posed by rising training costs and the view that, by 2025, pre-training as currently understood may be coming to an end [1].
- It introduces the "Densing Law" from Tsinghua University, which states that the maximum capability density of large language models is growing exponentially, doubling approximately every 3.5 months from February 2023 to April 2025 [1].

Group 1: Scaling Law and Densing Law
- The Scaling Law proposed by OpenAI states that more model parameters and more training data lead to stronger capabilities, but sustainability issues arise as training costs escalate [1].
- The Densing Law provides a new perspective on model development, revealing that the capability density of large models is increasing exponentially over time [1][6].

Group 2: Key Findings from Research
- The research team analyzed 51 recent open-source large models and found that the maximum capability density has been doubling every 3.5 months since 2023, so the same level of intelligence can be reached with fewer parameters [9].
- The inference cost for models of the same capability is decreasing exponentially over time. Empirically, the API price for GPT-3.5 has dropped by a factor of 266.7 over 20 months, approximately halving every 2.5 months [12].

Group 3: Implications of Densing Law
- The growth of capability density is accelerating: the doubling time shortened from 4.8 months before the release of ChatGPT to 3.2 months afterward, a 50% acceleration in density improvement [14].
- Model compression algorithms do not always increase capability density: many compressed models have lower density than their original counterparts, revealing limitations in current compression techniques [16].
- The intersection of chip circuit density (Moore's Law) and model capability density suggests significant potential for edge computing and on-device intelligence, shifting accessible computing power from the cloud to edge devices [18].

Group 4: Future Developments
- Tsinghua University and Mianbi Intelligence are advancing high-density model research based on the Densing Law and have released several efficient models that have gained global recognition, with downloads nearing 15 million and GitHub stars approaching 30,000 by October 2025 [20].
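To make "the same intelligence level with fewer parameters" concrete, the sketch below compounds the reported 3.5-month doubling time. It is a back-of-the-envelope illustration, not the paper's method: it assumes the trend holds at a constant rate, and the 7-billion-parameter starting point is a hypothetical example rather than a figure from the article.

```python
DOUBLING_MONTHS = 3.5  # doubling time of maximum capability density, as reported


def density_growth(months: float, doubling: float = DOUBLING_MONTHS) -> float:
    """Multiplicative growth in maximum capability density after `months`."""
    return 2.0 ** (months / doubling)


def equivalent_params(params: float, months: float) -> float:
    """Parameters needed `months` later for the same capability.

    Assumes required parameters scale inversely with capability density.
    """
    return params / density_growth(months)


# Hypothetical example: a 7B-parameter model, revisited one year later.
one_year_growth = density_growth(12)
params_after_year = equivalent_params(7e9, 12)

print(f"density growth over 12 months: {one_year_growth:.1f}x")
print(f"equivalent parameter count: {params_after_year / 1e9:.2f}B")
```

On this assumption, a one-year gap corresponds to roughly an order-of-magnitude reduction in the parameters needed for the same capability, which is the mechanism behind the article's claim about edge devices.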
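China's First Vertical Large Model for Administrative Review Debuts as Mianbi Intelligence "Digs for Gold" in the Government Digital-Intelligence Market | Focus on CIFTIS 2025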
Hua Xia Shi Bao· 2025-09-12 00:45
Core Insights
- The article highlights the debut of China's first vertical large model for administrative review, developed by Mianbi Intelligence and shown at the 2025 China International Fair for Trade in Services (CIFTIS), where its capabilities across the administrative review process were demonstrated [2][3].
- Mianbi Intelligence has adopted a differentiated strategy focused on smaller models, which are more efficient and cost-effective for the legal and administrative sectors [5][7].
- The company has made significant strides in the automotive sector, establishing partnerships with major automotive brands to deploy its edge models [8][9].

Company Developments
- Mianbi Intelligence's administrative review model covers the entire case-handling process, including case element extraction and document generation, and is currently used by the Beijing Judicial Bureau and its 16 district bureaus [3][4].
- The company has completed multiple rounds of financing since its establishment in August 2022, with significant investments from various venture capital firms [6].
- Mianbi Intelligence's focus on edge models distinguishes it from other large-model companies, emphasizing processing speed and efficiency [7][8].

Industry Context
- The article notes increasing competition in the small-model market, with major tech companies such as Alibaba and Tencent entering the space, which poses challenges for Mianbi Intelligence [9].
- Demand for edge models is expected to grow with the proliferation of smart devices, as they offer advantages in privacy, data protection, and operational performance [7][8].
- Mianbi Intelligence's strategic partnerships in the automotive sector aim to expand the presence and application of its edge models in smart vehicle technologies [8][9].
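[AI New Generation] Moutai Fund Joins In! Mianbi Intelligence Closes a New Round Worth Several Hundred Million Yuan; Large-Model Fundraising Brings Joy to Some and Worry to Others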
Hua Xia Shi Bao· 2025-05-22 14:46
Group 1
- The article highlights a significant shift in investment logic within the AI industry, from investing in models to prioritizing application-focused investments [1][7][9].
- The "AI Six Tigers" have largely fallen silent on the financing front, with only a few companies such as Zhipu and Mianbi Intelligence successfully securing funding [1][5].
- Mianbi Intelligence has raised substantial funding, including a recent round of several hundred million yuan backed by multiple investors, indicating strong market interest in application-oriented AI solutions [2][5].

Group 2
- Mianbi Intelligence focuses on edge models rather than general-purpose foundation models and has released several iterations of its flagship product, MiniCPM [3][5].
- The company has positioned itself in several sectors, particularly the automotive industry, by forming partnerships with major tech firms such as Intel [5][6].
- Investment in AI applications shows new characteristics: the number of financing cases is stable, but individual investment amounts are smaller than in previous years [7][8].
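On-Device Large Models Break Into the Mainstream Faster! Mianbi Intelligence Secures a New Round Worth Several Hundred Million Yuan
Robot Circle · 2025-05-21 09:40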
Group 1
- The article highlights the recent funding rounds completed by Mianbi Intelligence, a Chinese AI startup focused on edge large-model development, which indicate strong investor confidence and growth potential in the AI sector [1][2].
- Mianbi Intelligence has completed three rounds of financing since 2024, with significant investments from various funds. The funding will help it build a robust technological and product barrier around its efficient large-model technology [1].
- The company aims to accelerate industry empowerment and ecosystem expansion, promoting the application of large models across various sectors [1].

Group 2
- Global AI competition intensified in early 2025, with Mianbi Intelligence leading the development of high-efficiency large models that offer better performance, lower costs, and reduced power consumption [2].
- The launch of MiniCPM-o 2.6, Mianbi's first edge omni-modal model, showcases its innovative capabilities, including real-time audio-visual processing and natural language generation, and positions it at the forefront of the industry [2].
- The MiniCPM series has achieved over 10 million downloads across all platforms, reflecting its popularity and effectiveness in the market [2].
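Mianbi Intelligence Closes Another Round of Several Hundred Million Yuan After Six Months; Large Models Accelerate Industry Empowerment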
Nan Fang Du Shi Bao· 2025-05-21 03:04
Core Insights
- The company recently completed a new financing round of several hundred million yuan, led by Hongtai Fund, Guozhong Capital, Qingkong Jinxin, and Moutai Fund, which will strengthen its technological and product barriers in the large-model sector [1][3].
- The CEO emphasized the need to judge technology and market trends ahead of time in order to provide sufficient supply for the large-model industry, indicating a shift towards accelerated industry empowerment [1][4].

Financing Details
- The recent round is part of a series of investments, including an angel round in April 2023 and a significant Series A round in April 2024, showing strong investor interest in the company's growth [3].
- The company has also secured funding from various notable investors, including Zhihu and Springhua Capital, indicating robust financial backing [3].

Product Development
- The company has launched its first omni-modal model, MiniCPM-o 2.6, which integrates multiple capabilities such as real-time audio and visual processing, positioning it at the forefront of the industry [2][4].
- The MiniCPM series has achieved over 10 million downloads across all platforms, reflecting strong market acceptance [1].

Market Positioning
- The company focuses on edge AI models, which are being integrated into devices such as cars, robots, and smartphones, highlighting its commitment to practical applications of AI technology [4][6].
- Collaborations with major automotive manufacturers such as Changan Automobile and SAIC Volkswagen are underway to develop native edge experiences in smart cockpits [4].

Industry Perspective
- The CEO stated that the future of AGI (Artificial General Intelligence) relies on deploying intelligence at the device level, which allows for more responsive and effective decision-making [5][6].
- The company believes that edge models will play a crucial role in the evolution of human-computer interaction, shifting from a traditional GUI to a combination of VUI and GUI [2].