Densing Law
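Through the lens of the "Densing Law": Scaling Law hitting a wall, the upper limit of model density, and imagining the edge after the Doubao phone... | DeepTalk recap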
Jinqiu Ji (锦秋集) · 2025-12-15 04:09
Core Insights
- The article discusses the transition from the "Scaling Law" to the "Densing Law," emphasizing the need for sustainable development of AI models as data growth slows and computational costs rise [2][3][15].
- The "Densing Law" states that model capability density grows exponentially, doubling approximately every 3.5 months, while the parameter count and inference cost needed for a given capability level fall significantly [11][28].
Group 1: Scaling Law and Its Limitations
- The "Scaling Law" has run into bottlenecks in training data and computational resources, making it unsustainable to keep increasing model size [15][16].
- Available training data is limited to around 20 trillion tokens, which is insufficient for the growing needs of model scaling [15].
- The computational requirement for larger models is becoming prohibitive, as seen with LLaMA 3, whose 405-billion-parameter model required 16,000 H100 GPUs [16].
Group 2: Introduction of Densing Law
- The "Densing Law" proposes that as data, computation, and algorithms evolve together, the density of model capabilities grows exponentially, allowing more efficient models with fewer parameters [11][28].
- For instance, GPT-3 required over 175 billion parameters, while MiniCPM achieved similar capabilities with only 2.4 billion parameters [24].
Group 3: Implications of Densing Law
- The Densing Law implies that achieving a specific AI capability will require exponentially fewer parameters over time. A notable case is Mistral-level capability, which was matched about four months later with only 35% of the parameters [32][33].
- Inference costs are also expected to decrease exponentially thanks to advances in hardware and algorithms, so the cost of a given capability level drops significantly over time [36][39].
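The doubling claim above can be turned into simple arithmetic. The sketch below is illustrative only: it assumes capability density follows a pure exponential with a constant doubling period (a simplification, not a formula taken from the article), and the function names are hypothetical. It computes how much density grows after a given number of months, what fraction of the original parameters would be needed for the same capability, and what doubling period a single observed data point would imply.

```python
import math

# Doubling period quoted in the summary above (months). Treating the trend
# as a clean exponential is an assumption made for illustration.
DOUBLING_MONTHS = 3.5


def density_multiplier(months: float, doubling_months: float = DOUBLING_MONTHS) -> float:
    """Factor by which capability density grows after `months`."""
    return 2.0 ** (months / doubling_months)


def params_fraction(months: float, doubling_months: float = DOUBLING_MONTHS) -> float:
    """Fraction of the original parameter count needed for the same capability."""
    return 1.0 / density_multiplier(months, doubling_months)


def implied_doubling_months(months: float, fraction: float) -> float:
    """Doubling period implied by matching a capability level with
    `fraction` of the parameters after `months`."""
    return months * math.log(2.0) / math.log(1.0 / fraction)


if __name__ == "__main__":
    for m in (3.5, 7.0, 12.0):
        print(f"{m:>4} months: density x{density_multiplier(m):.2f}, "
              f"parameters needed {params_fraction(m):.1%}")
    # The "35% of the parameters in four months" case from the summary:
    print(f"implied doubling period: {implied_doubling_months(4.0, 0.35):.2f} months")
```

Under this idealized model, the single "35% in four months" data point implies a shorter doubling period than the 3.5-month average, which is unsurprising for a hand-picked example compared against a fitted trend.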
Group 4: Future Directions and Challenges
- Future AI models will focus on raising capability density through a "four-dimensional preparation system" covering efficient architecture, computation, data quality, and learning processes [49][50].
- The article highlights the importance of high-quality training data and of stable environments for generating post-training data, both of which are critical to model performance on complex tasks [68][70].
Group 5: Edge Applications and Market Trends
- By 2026, significant advances in edge intelligence are anticipated, driven by the need to process private data locally and by the development of high-capacity edge chips [11][45][76].
- The article predicts a surge in edge applications, emphasizing the importance of privacy and personalized experiences in AI deployment [76][77].
Mianbi's Li Dahai on edge-model competition: the inaugural year has begun, and the influx of giants confirms limitless prospects
Huan Qiu Wang· 2025-08-15 07:48
Core Insights
- The CEO of Mianbi Intelligent, Li Dahai, announced that 2025 will mark the "Year of Edge Intelligence," indicating a significant opportunity in a market that is still in its formative stages [1].
- Industry consensus is shifting towards the advantages of edge models and "edge-cloud collaboration," with major players increasingly focusing on edge technology [1].
- Mianbi Intelligent aims to establish commercial advantages quickly while balancing technology and user value, emphasizing the need for differentiated user experiences that cloud models cannot replicate [1].
Company Strategy
- Mianbi Intelligent's core competitive advantage lies in efficiency: striving for the best performance with minimal resources, which leads to faster and more cost-effective edge model solutions [1].
- The company introduced the MiniCPM edge model in early 2024. With 2.4 billion parameters, it surpasses the Mistral 7B model, and it has achieved over 13 million downloads [2].
- The MiniCPM model has been successfully integrated with major chip manufacturers such as Qualcomm, NVIDIA, MTK, Intel, Huawei, and Rockchip, and is particularly noted for its application in smart automotive human-machine interaction [2].
Market Dynamics
- The influx of new entrants into the market is seen as validation of Mianbi Intelligent's strategic choices and of the potential for accelerated market growth [1].
- The company has established a dedicated automotive business line to promote the widespread adoption of the MiniCPM model in vehicles [2].