Core Insights
- The article discusses the evolution of large models in AI, highlighting the challenges posed by escalating training costs and the potential end of pre-training as currently understood by 2025 [1]
- It introduces the "Densing Law" from Tsinghua University, which holds that the maximum capability density of large language models grows exponentially, doubling approximately every 3.5 months between February 2023 and April 2025 [1]

Group 1: Scaling Law and Densing Law
- The Scaling Law proposed by OpenAI indicates that larger model parameters and more training data yield stronger intelligence capabilities, but escalating training costs raise sustainability concerns [1]
- The Densing Law offers a new perspective on model development, revealing that the capability density of large models increases exponentially over time [1][6]

Group 2: Key Findings from Research
- The research team analyzed 51 recent open-source large models and found that the maximum capability density has doubled every 3.5 months since 2023, allowing the same level of intelligence to be reached with fewer parameters [9]
- The inference cost for models of a given capability is decreasing exponentially over time; empirically, the API price for GPT-3.5 dropped by a factor of 266.7 over 20 months, roughly halving every 2.5 months [12]

Group 3: Implications of Densing Law
- The growth of capability density is accelerating: the doubling period shortened from 4.8 months before the release of ChatGPT to 3.2 months afterward, a roughly 50% acceleration in density enhancement [14]
- Model compression algorithms do not always enhance capability density; many compressed models have lower density than their originals, revealing limitations in current compression techniques [16]
- The intersection of chip circuit density (Moore's Law) and model capability density suggests significant potential for edge computing and on-device intelligence, pointing to a transformative shift in computational accessibility from cloud to edge devices [18]

Group 4: Future Developments
- Tsinghua University and Mianbi Intelligence are advancing high-density model research based on the Densing Law, releasing several efficient models that have gained global recognition, with downloads nearing 15 million and GitHub stars approaching 30,000 as of October 2025 [20]
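The exponential figures reported above are internally consistent, and the implied rates can be checked with a few lines of arithmetic. The sketch below assumes the article's stated numbers (a 3.5-month density doubling period; a 266.7x price drop over 20 months); the 70B example and function names are illustrative, not from the article:

```python
import math

# Reported figures (assumptions taken from the article summary):
# - maximum capability density doubles every ~3.5 months
# - GPT-3.5-level API price fell by a factor of 266.7 over 20 months
DOUBLING_MONTHS = 3.5

def relative_density(months: float) -> float:
    """Capability density relative to time zero, given a fixed doubling period."""
    return 2 ** (months / DOUBLING_MONTHS)

def params_for_same_capability(base_params_b: float, months: float) -> float:
    """Parameters (in billions) needed `months` later for the same capability:
    same capability at higher density means proportionally fewer parameters."""
    return base_params_b / relative_density(months)

# Price-halving period implied by a 266.7x drop over 20 months:
halving_months = 20 / math.log2(266.7)

print(f"Density growth after 12 months: {relative_density(12):.1f}x")
print(f"70B-equivalent after 12 months: {params_for_same_capability(70, 12):.1f}B params")
print(f"Implied price-halving period: {halving_months:.2f} months")
```

The last figure comes out near 2.5 months, matching the article's stated halving rate; likewise, 4.8 / 3.2 = 1.5 confirms the claimed 50% acceleration in density doubling after ChatGPT's release.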
Large models double in capability every hundred days! Tsinghua team's "Densing Law" published in a Nature sub-journal
AI前线 (AI Frontline) · 2025-11-20 06:30