Large models double in performance every hundred days: Tsinghua team's "Density Law" published in a Nature sub-journal
36Kr · 2025-11-20 08:48
Core Insights
- The article discusses the challenges and new perspectives in the development of large models, focusing on the "Density Law" proposed by Tsinghua University, which shows that the maximum capability density of large language models grew exponentially from February 2023 to April 2025, doubling approximately every 3.5 months [1][8]

Group 1: Scaling Law and Density Law
- Since 2020, OpenAI's Scaling Law has driven the rapid development of large models, but by 2025 the sustainability of this path is in question due to rising training costs and the near exhaustion of publicly available internet data [1]
- The Density Law offers a new perspective on model development: just as the semiconductor industry improved chip density, large models can develop efficiently by increasing capability density [3][4]

Group 2: Implications of the Density Law
- The research team hypothesizes that adequately trained models of different sizes have the same capability density, which establishes a baseline for measuring other models [4]
- The Density Law implies that the inference cost for models of a given capability decreases exponentially over time; empirically, the API price for GPT-3.5-level models fell by a factor of 266.7 over 20 months, roughly halving every 2.5 months [7][8]

Group 3: Acceleration of Capability Density
- An analysis of 51 recent open-source large models showed that the maximum capability density has been growing exponentially, with a doubling time of approximately 3.5 months since 2023 (the kind of log-linear fit behind such a claim is sketched after this summary) [8][9]
- After the release of ChatGPT, capability density increased faster, doubling every 3.2 months versus every 4.8 months before, a roughly 50% acceleration in density growth [9][10]

Group 4: Limitations of Model Compression
- The research found that model compression algorithms do not always increase capability density; many compressed models performed worse than their originals because of insufficient training [11][13]

Group 5: Future Prospects
- The intersection of chip circuit density (Moore's Law) and model capability density (Density Law) suggests that edge devices will be able to run higher-performance large models, driving explosive growth in edge computing and on-device intelligence [14]
- Tsinghua University and the Mianbi Intelligence team are advancing high-density model development, with models such as MiniCPM and VoxCPM gaining global recognition and large download numbers, pointing to a trend toward efficient, low-cost models [16]
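As a rough illustration of what the Density Law claims, the sketch below fits a log-linear trend to capability-density measurements and recovers the doubling time as ln(2)/slope. The (month, density) pairs are invented placeholders, not the paper's 51-model dataset; only the fitting procedure mirrors the kind of analysis described above.

```python
import numpy as np

# Minimal sketch of the log-linear fit behind a "doubling every 3.5 months"
# claim. The data points are illustrative placeholders, NOT the paper's
# measurements of 51 open-source models.
months = np.array([0.0, 3.5, 7.0, 10.5, 14.0])   # months since Feb 2023
density = np.array([1.0, 2.1, 3.9, 8.3, 15.6])   # relative capability density

# Fit ln(density) = a * t + b; exponential growth means the doubling
# time is ln(2) / a.
a, b = np.polyfit(months, np.log(density), deg=1)
print(f"doubling time ≈ {np.log(2) / a:.1f} months")  # ≈ 3.5 for this toy data
```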
Large models double in performance every hundred days! Tsinghua team's "Density Law" published in a Nature sub-journal
AI前线 · 2025-11-20 06:30
Core Insights
- The article discusses the evolution of large models in AI, highlighting the challenges posed by rising training costs and the possibility that pre-training as currently practiced could reach its limits by 2025 [1]
- It introduces the "Densing Law" from Tsinghua University, which holds that the maximum capability density of large language models grew exponentially from February 2023 to April 2025, doubling approximately every 3.5 months [1]

Group 1: Scaling Law and Densing Law
- The Scaling Law proposed by OpenAI indicates that larger model parameters and more training data yield stronger intelligence capabilities, but sustainability issues arise as training costs escalate [1]
- The Densing Law offers a new perspective on model development, showing that the capability density of large models increases exponentially over time [1][6]

Group 2: Key Findings from Research
- The research team analyzed 51 recent open-source large models and found that the maximum capability density has doubled every 3.5 months since 2023, allowing the same intelligence level to be reached with fewer parameters [9]
- The inference cost for models of a given capability decreases exponentially over time; empirically, the API price for GPT-3.5-level models fell by a factor of 266.7 over 20 months, roughly halving every 2.5 months (checked in the arithmetic sketch after this summary) [12]

Group 3: Implications of the Densing Law
- Capability density growth is accelerating: the doubling time shortened from 4.8 months before the release of ChatGPT to 3.2 months afterward, a roughly 50% acceleration in density growth [14]
- Model compression algorithms do not always increase capability density; many compressed models have lower density than their originals, revealing limitations of current compression techniques [16]
- The intersection of chip circuit density (Moore's Law) and model capability density suggests significant potential for edge computing and on-device intelligence, shifting computational accessibility from the cloud to edge devices [18]

Group 4: Future Developments
- Tsinghua University and Mianbi Intelligence are advancing high-density model research based on the Densing Law, releasing several efficient models that have gained global recognition, with downloads nearing 15 million and GitHub stars approaching 30,000 as of October 2025 [20]
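The headline numbers quoted in both summaries are internally consistent, which a few lines of arithmetic confirm. All figures below come from the article itself; the code only converts between the stated factors and the stated halving/doubling times.

```python
import math

# A 266.7x price drop over 20 months implies a halving time of
# 20 / log2(266.7) months; the article rounds this to "every 2.5 months".
halving_time = 20 / math.log2(266.7)
print(f"implied halving time ≈ {halving_time:.2f} months")  # ≈ 2.48

# The doubling time shortening from 4.8 to 3.2 months means the growth
# rate (proportional to 1 / doubling time) rose by 4.8 / 3.2 = 1.5x,
# i.e. the ~50% acceleration the article reports.
print(f"rate increase ≈ {4.8 / 3.2:.1f}x")
```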