Scaling Law
Who Says the Scaling Law Has Hit Its Limit? New Research: Small Per-Step Improvements Compound into Exponential Growth
机器之心· 2025-09-16 04:01
Core Viewpoint
- The article discusses the ongoing debate over diminishing returns from scaling AI models, particularly large language models (LLMs). It presents a new perspective: even as single-step accuracy improves more slowly, these incremental gains can compound into exponential growth in the length of tasks a model can complete, which may hold greater economic value in real-world applications [1][3].

Group 1: Scaling Law and Economic Value
- The scaling law indicates that while metrics such as test loss may show diminishing returns, the real-world value of LLMs often comes from their ability to complete longer tasks. Larger models compound small improvements in single-step accuracy into exponential increases in achievable task length, as illustrated in the sketch after this summary [3][6].
- The paper titled "The Illusion of Diminishing Returns: Measuring Long Horizon Execution in LLMs" argues that the economic value of an AI agent derives from the length of the tasks it can complete, rather than from short-task benchmarks that may suggest progress has stalled [5][19].

Group 2: Long-Horizon Execution Challenges
- Long-horizon task execution has historically been a significant weakness of deep learning models. The paper notes that while LLMs have improved on complex reasoning tasks, they still struggle to execute long tasks reliably [6][11].
- The authors argue that failures in long-horizon execution are often misattributed to deficiencies in reasoning or planning, when in fact execution itself remains a critical and under-researched challenge [7][22].

Group 3: Self-Conditioning Effect
- The study identifies a self-conditioning effect: in long tasks, the model's per-step error rate rises as its own earlier mistakes accumulate in the context, so errors compound. This contrasts with human performance, where practice typically leads to improvement [9][30].
- The authors found that simply scaling up model size does not mitigate the self-conditioning effect, which can cause performance to decline over extended tasks [29][32].

Group 4: Impact of Thinking Models
- Recent thinking models can correct for the self-conditioning limitation, enabling significantly longer single-round task execution. For instance, the GPT-5 thinking version can execute over 1000 steps, far surpassing competitors [10][36].
- The research emphasizes reasoning before acting: models that use thinking chains execute longer tasks better than those that do not [36][37].

Group 5: Experimental Insights
- The experiments show that increasing model size significantly raises the number of rounds a model can execute successfully, demonstrating a clear scaling trend [27][28].
- The findings suggest that while larger models improve task execution, they still face challenges from self-conditioning, which remains a critical area for future research [29][37].
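The compounding claim in Group 1 can be made concrete with a back-of-the-envelope calculation. The sketch below illustrates the general argument rather than the paper's own model: it assumes each step succeeds independently with probability p and asks how long a task can be finished at a fixed 50% task-level success rate (the function `horizon_at_success_rate` is a name made up for this example).

```python
# Illustrative assumption (not the paper's exact model): if a model succeeds at
# each step independently with probability p, the chance of finishing an H-step
# task is p**H, so the horizon reachable at a fixed task success rate grows
# roughly like 1/(1 - p) as p approaches 1.
import math

def horizon_at_success_rate(p_step: float, task_success: float = 0.5) -> float:
    """Longest task length H such that p_step**H >= task_success."""
    return math.log(task_success) / math.log(p_step)

for p_step in (0.90, 0.99, 0.999, 0.9999):
    h = horizon_at_success_rate(p_step)
    print(f"step accuracy {p_step:.4%} -> ~{h:.0f} steps at 50% task success")

# step accuracy 90.0000% -> ~7 steps at 50% task success
# step accuracy 99.0000% -> ~69 steps at 50% task success
# step accuracy 99.9000% -> ~693 steps at 50% task success
# step accuracy 99.9900% -> ~6931 steps at 50% task success
```

Even at this crude level, raising per-step accuracy from 99% to 99.9% multiplies the reachable horizon by roughly ten, which is the intuition behind treating slow single-step gains as anything but diminishing returns.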
Academician Zhang Hongjiang: Agents Will Replace Enterprise Processes and Reshape How Human Organizations Are Structured
Sina Tech· 2025-09-11 02:34
Core Insights
- The emergence of DeepSeek R1 has significantly reduced the cost of inference models while maintaining performance close to the best available models, indicating potential for increased demand as costs decrease [1]
- The launch of ChatGPT marked a pivotal moment: by March this year its daily active users were nearing 30% of search engine usage, highlighting the integration of large models into daily life [1]
- The rapid improvement in model performance and reduction in usage costs are expected to continue, driving the development of large models and their impact on various industries [1]
- The concept of agents is evolving, with their planning capabilities growing exponentially, suggesting a new phase of AI development referred to as Moore's Law 3.0, in which agent capabilities double every seven months [1]
- AI is transitioning from assistant to partner, indicating a shift in the relationship between humans and machines that will alter organizational structures and employment in the future [2]
Domestic and Overseas AI Giants Bet Big, Startups Go All In: Who Can Ride "Memory" to Become the Next "DeepSeek"?
36Kr· 2025-09-07 09:07
Core Insights
- The concept of "memory" in AI is emerging as a crucial factor for the next wave of advancements, allowing models to learn continuously and adapt without forgetting previous knowledge [2][6][22]
- Major players in the AI industry are increasingly focusing on integrating memory capabilities into their models, with a range of approaches being explored [4][24][30]

Industry Developments
- Companies such as Anthropic, Google, and OpenAI have recently announced memory features in their AI systems, enabling more natural and coherent interactions by recalling past conversations [4][6][31]
- The introduction of memory capabilities is a response to the limitations of current models, which rely heavily on short-term context and cannot retain long-term knowledge [3][19][22]

Technical Approaches
- Different technical routes for implementing memory are being explored, including parameterized memory, context memory, and external databases (a minimal sketch of the external-database route follows this summary) [24][26][29]
- Parameterized memory aims to let models distinguish which information should be retained as memory, enhancing their reasoning capabilities [24][25]
- Context memory injects the necessary information into the prompt before inference, while external databases store information outside the model for retrieval at decision time [26][27]

Competitive Landscape
- The AI market is witnessing a race to establish memory capabilities, with established firms and startups alike vying for dominance [30][33]
- Companies are adopting different business models around memory: larger firms focus on user retention through personalized experiences, while startups aim to build a decentralized memory platform [32][33]

Future Outlook
- The timeline for widespread, effective memory capabilities is estimated at one to two years for practical applications and three to five years for resolving governance and privacy issues [34][35]
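To make the routes above more tangible, here is a minimal sketch of the external-database route combined with context memory: facts live outside the model and are retrieved and prepended to the prompt before inference. The names (`MemoryStore`, `build_prompt`) and the word-overlap retrieval are illustrative assumptions, not any vendor's actual API; real systems typically retrieve with vector embeddings.

```python
# Hypothetical external-memory sketch: facts are stored outside the model's
# weights and context window, then recalled and injected into the prompt.
from dataclasses import dataclass, field

@dataclass
class MemoryStore:
    entries: list[str] = field(default_factory=list)

    def remember(self, fact: str) -> None:
        """Persist a fact outside the model."""
        self.entries.append(fact)

    def recall(self, query: str, k: int = 2) -> list[str]:
        """Return the k stored facts sharing the most words with the query."""
        q = set(query.lower().split())
        scored = sorted(self.entries,
                        key=lambda e: len(q & set(e.lower().split())),
                        reverse=True)
        return scored[:k]

def build_prompt(store: MemoryStore, user_message: str) -> str:
    """Context-memory step: inject retrieved facts before the model reasons."""
    memories = "\n".join(f"- {m}" for m in store.recall(user_message))
    return f"Relevant memories:\n{memories}\n\nUser: {user_message}"

store = MemoryStore()
store.remember("The user prefers answers in Chinese.")
store.remember("The user is building a retrieval pipeline for customer support.")
print(build_prompt(store, "How should I design my customer support retrieval pipeline?"))
```

Swapping the toy word-overlap scorer for embedding search changes only `recall`; the prompt-injection step stays the same, which is why the context-memory and external-database routes are often combined in practice.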
Domestic and Overseas AI Giants Bet Big, Startups Go All In: Who Can Ride "Memory" to Become the Next "DeepSeek"?
机器之心· 2025-09-07 05:12
Core Viewpoint
- The article discusses the emerging importance of "memory" in AI models, suggesting that the ability to possess human-like memory will be a key factor in the next wave of AI advancements [2][6][35].

Group 1: Importance of Memory in AI
- The concept of "memory" is evolving from short-term to long-term or lifelong memory, allowing AI to learn continuously and adapt to new tasks without forgetting previous knowledge [3][7].
- Recent developments in AI memory capabilities have been led by major players such as Anthropic, Google, ByteDance, and OpenAI, all of which have introduced memory features in their AI systems [4][6][35].
- The demand for memory is driven by both technical and application needs, as AI models are increasingly expected to act as long-term partners rather than mere tools [20][21][23].

Group 2: Current Trends and Developments
- AI companies are exploring different approaches to implementing memory, including parameterized memory, context memory, and external databases [26][28][30].
- The industry is seeing a surge of interest and investment in memory-related research, with many companies racing to develop and integrate these capabilities into their products [6][35].
- Competition among AI firms is intensifying, and breakthroughs in memory capabilities could redefine the market landscape, much like past pivotal moments in AI development [35][36].

Group 3: Future Outlook
- Basic memory functionality is estimated to become widespread and effective within one to two years, while addressing governance and privacy issues may take three to five years [36][37].
- The future of AI memory capabilities remains uncertain, with many players vying for dominance, meaning any company could emerge as the leader in this space [38].
Hands-On with Alibaba's Trillion-Parameter Model: Has the Open-Source Route Proven Itself?
TMTPost· 2025-09-06 11:32
Core Insights
- Alibaba has launched its largest model to date, Qwen3-Max-Preview, with over 1 trillion parameters; it surpasses Claude in programming capabilities, demonstrating the continued effectiveness of the Scaling Law [1][4][17]
- The "model + cloud" strategy has created the shortest path from technology development to commercialization, a key factor in Qwen's success as a latecomer [1][19]
- The core challenge of Alibaba's open-source model lies in balancing openness with profitability, requiring continuous technological breakthroughs and proof of commercial viability [1][20]

Model Performance
- Qwen3-Max-Preview has outperformed competitors on a range of benchmarks, including SuperGPQA, AIME2025, LiveCodeBench V6, Arena-Hard V2, and LiveBench [2]
- Its programming capabilities in particular have improved significantly, surprising many users [4][15]

Development Strategy
- Alibaba's approach to model development has been characterized by the rapid open-sourcing of multiple model versions, from 7 billion to 1 trillion parameters, fostering a strong developer community [16][17]
- The company has made substantial investments in computing infrastructure and AI engineering, which were crucial for training large models like Qwen3-Max-Preview [17][18]

Cloud Integration
- Alibaba Cloud plays a vital role in supporting Qwen's development by providing stable, efficient computing infrastructure that reduces the engineering burden on development teams [18]
- The MaaS (Model-as-a-Service) strategy lets Qwen reach various industries quickly, since businesses can use Qwen's API without building from scratch [18][19]

Challenges Ahead
- Open-sourcing presents both opportunities and challenges, as it may make it harder to maintain a significant technological edge over competitors [20]
- Retaining top AI talent is critical for Alibaba, as the departure of key personnel could hurt team morale and project continuity [21][22]

Conclusion
- Overall, Alibaba's Qwen is a leading force in the global AI model landscape, leveraging a clear open-source-plus-self-research strategy supported by Alibaba Cloud's ecosystem [22]
- The release of the trillion-parameter model highlights the company's commitment to the Scaling Law, but the sustainability of its business model and its ability to retain talent will be crucial to future success [22]
They Proposed the Scaling Law Back in 1993
量子位· 2025-09-02 06:17
Core Viewpoint
- The article highlights that the concept behind the Scaling Law was proposed 32 years ago at Bell Labs, long before recent AI advancements, underscoring the historical significance of this line of machine learning research [1][6].

Group 1: Historical Context
- The paper titled "Learning Curves: Asymptotic Values and Rate of Convergence" introduced a method for predicting how training error and test error converge to the same asymptotic error value as training-set size increases, following a power-law form [4][6].
- The authors of the 1993 paper included notable figures such as Vladimir Vapnik and Corinna Cortes, both of whom went on to make major contributions to machine learning [6][25].

Group 2: Methodology and Findings
- The research aimed to save computational resources when training classifiers by predicting their performance on larger datasets from results on smaller training sets (illustrated in the sketch after this summary) [8][10].
- The study found that as training-set size grows, training and test error converge to a common asymptotic value, denoted 'a', with the gap closing as a power law whose exponent typically falls between 0.5 and 1 [10][16].
- The proposed method allows a classifier's performance on larger datasets to be estimated without fully training on them, conserving computational resources [10][14].

Group 3: Implications and Applications
- The predictions were highly accurate for linear classifiers, demonstrating the method's potential for optimizing resource allocation when training models [15][24].
- The research also showed that the harder the task, the higher the asymptotic error and the slower the convergence, indicating a relationship between task difficulty and learning efficiency [22].
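As a hedged illustration of how such a prediction can work in practice (not the paper's original code or data), the sketch below fits the power-law learning curve E(n) ≈ a + b·n^(−α) to synthetic test-error measurements at small training-set sizes and extrapolates to a much larger set; all numbers are made up.

```python
# Illustrative learning-curve extrapolation: fit E(n) ~ a + b * n**(-alpha) on
# cheap small-scale runs, then predict the error at a much larger dataset size
# before paying for the full training run.
import numpy as np
from scipy.optimize import curve_fit

def learning_curve(n, a, b, alpha):
    """Test error as a function of training-set size n."""
    return a + b * n ** (-alpha)

# Synthetic test-error measurements at small training-set sizes.
sizes = np.array([100, 200, 400, 800, 1600])
errors = np.array([0.210, 0.172, 0.148, 0.131, 0.119])

(a, b, alpha), _ = curve_fit(learning_curve, sizes, errors, p0=(0.1, 1.0, 0.5))
print(f"asymptotic error a ~ {a:.3f}, convergence exponent alpha ~ {alpha:.2f}")
print(f"predicted error at n=50000: {learning_curve(50_000, a, b, alpha):.3f}")
```

The fitted a plays the role of the asymptotic error: once it is estimated from small-scale runs, one can judge whether collecting and training on far more data is worth the compute.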
In Depth | Anthropic CEO: AI's Potential Is Enormous, but Disorderly Expansion Is the Real Risk, and I Will Steer It onto the Right Track
Z Potentials· 2025-08-28 03:51
Core Insights
- The article profiles the rapid growth and potential of Anthropic, a leading AI company focused on developing safe and reliable AI systems with human welfare at their core. The company's annual recurring revenue now exceeds $4 billion, making it one of the fastest-growing enterprises in history [12][24].

Group 1: Company Structure and Trust
- Anthropic was founded by seven co-founders, an arrangement often viewed skeptically by outsiders; however, the long-standing trust and familiarity among the founders have allowed the company to maintain cohesion and its core values through rapid expansion [11][10].
- The sibling co-founders, Dario and Daniela Amodei, complement each other in strategic execution and operational management, allowing each to focus on their strengths [9][10].

Group 2: AI Applications and Market Potential
- The fastest-growing application of AI is programming, driven by the close relationship between developers and AI model creators, which accelerates adoption [10][12].
- AI's potential extends well beyond programming, with applications in customer service, biology, and pharmaceuticals showcasing its versatility across sectors [13][14].

Group 3: Business Model and Growth Expectations
- Anthropic positions itself as a platform company, focusing on broad enterprise services rather than solely vertical-specific products, which gives it a better read on user needs and market demand [15][16].
- The company has grown exponentially, with revenue consistently exceeding its initial projections, indicating strong market demand for AI solutions [24][25].

Group 4: Investment and Financial Dynamics
- The financial model of AI companies involves significant upfront investment in model training, with high returns expected over time; this cyclical investment pattern is common in venture capital, where initial losses precede profitability [34][35].
- Current capital expenditures may obscure underlying profitability: individual models can be profitable when analyzed on their own [43][44].

Group 5: Talent and Competitive Advantage
- Competition for AI talent is intense, but Anthropic maintains a high employee retention rate thanks to its strong mission and commitment to its values [51][53].
- Its approach to protecting knowledge combines complex engineering capability with a culture that balances openness against necessary information-security measures [48][49].

Group 6: Future of AI and Market Structure
- The future AI market is expected to consist of a few dominant players capable of building cutting-edge models, with room for new entrants targeting specific use cases [33].
- The article suggests AI's growth trajectory may continue to extend, and that AI companies could become some of the largest enterprises in the world [25][24].
OpenAI's Biggest Mistake Ever: Letting This MIT Standout Go, a "Three-Dynasty Veteran" of American AI and a Real-Life Wei Xiaobao
36Kr· 2025-08-21 00:39
Group 1
- The core argument of the article is that the scale of today's AI infrastructure build-out is unprecedented, surpassing both the Apollo program and the Manhattan Project [1][7]
- Investment in AGI computing power is growing explosively, increasing by as much as three times per year [2]
- Tom Brown, co-founder of Anthropic, is profiled as a key figure in the field, having gone from a self-taught background to leading work on general artificial intelligence [3][4]

Group 2
- Anthropic's Claude has become a preferred choice for developers worldwide, marking a significant achievement in AI infrastructure [7]
- The article traces Tom Brown's path from entrepreneurship to AI research, including his time at OpenAI and the founding of Anthropic [9][10]
- It also discusses the scaling law's impact on AI development, noting that increased computational power has led to significant advances in intelligence [31][32]

Group 3
- On the competitive landscape, Anthropic's Claude is gaining market share, particularly in programming applications, with developer preference shifting toward Claude over competitors such as ChatGPT [37][40]
- The success of Claude Code is attributed to its unexpected emergence as a superior product, driven by a user-centered development approach [41][42]
- Tom Brown's advice to young engineers emphasizes pursuing meaningful projects over traditional career paths, advocating risk-taking and intrinsic motivation [46][49]
GPT-5 Churns Out "Spaghetti Code": 14 Prompts Reveal Seven Years of IQ Evolution from GPT-1 to GPT-5
36Kr· 2025-08-19 08:56
Group 1
- The core viewpoint is that GPT-5 has been released but has drawn criticism for not meeting expectations relative to its predecessor, GPT-4, despite years of advances in AI capabilities [1][3][5].
- A comparison of performance metrics between GPT-4 and GPT-5 suggests the Scaling Law has not hit a wall, indicating ongoing improvement in AI models [3][5].
- The evolution of the GPT family from GPT-1 to GPT-5 over seven years shows significant advances in AI capability, with a series of prompts demonstrating the models' growing sophistication [5][7][8].

Group 2
- The article gives examples of how each GPT version improved at generating creative content such as poetry, with GPT-5 producing more coherent and human-like responses than earlier versions [19][20][40].
- On technical tasks, GPT-5 shows a marked improvement in writing Python code, moving from nonsensical outputs in earlier versions to complex, even humorous code [53][54].
- GPT-5's ability to explain complex concepts, such as integration by parts in mathematics, has also improved significantly, making it more effective as a teaching tool than its predecessors [57][64][69].

Group 3
- GPT-5 can now provide structured, detailed plans for tasks such as building a running habit, showing its potential as a personal coach or advisor [125][126][127].
- The transition from GPT-1 to GPT-5 reflects a shift from random or irrelevant responses to logical, structured, and contextually relevant answers to user queries [70][75][90].
- GPT-5's responses are characterized by a more professional tone and more comprehensive information, indicating its progress in handling complex inquiries compared with earlier models [75][90].
Li Jianzhong: Research and Reflections on Human-Machine Interaction and the Agent Ecosystem in the AI Era
AI科技大本营· 2025-08-18 09:50
Core Insights
- The article discusses the transformative impact of large models on the AI industry, emphasizing the shift from isolated applications to a more integrated model of human-machine interaction, termed "accompanying interaction" [1][5][60].

Group 1: Paradigm Shifts in AI
- The shift from training-centric models to reasoning models has significantly enhanced AI's capabilities, particularly through reinforcement learning, which allows AI to generate synthetic data and innovate beyond existing human knowledge [9][11][13].
- The emergence of "Agentic Models" marks a shift in which AI evolves from merely providing suggestions to actively performing tasks for users [16][18].

Group 2: Application Development Transformation
- "Vibe Coding" has emerged as a new programming paradigm that lets non-professionals build software in natural language, in contrast with traditional programming methods [19][22].
- The concept of "Malleable Software" suggests that future software will let users customize and personalize applications extensively, leading to a more democratized software development landscape [24][26].

Group 3: Human-Machine Interaction Evolution
- The future of human-machine interaction is predicted to be dominated by natural-language interfaces, moving away from traditional graphical user interfaces (GUIs) [36][41].
- The interaction paradigm is expected to evolve so that AI agents seamlessly compose various services, eliminating the need for users to switch between isolated applications [45][48].

Group 4: Intelligent Agent Ecosystem
- Intelligent agents are advancing along five capabilities, namely planning, tool use, collaboration, memory, and action, which together redefine the internet from an "information network" into an "action network" (a minimal agent-loop sketch follows this summary) [66][68].
- Protocols such as MCP (Model Context Protocol) and A2A (Agent to Agent) facilitate interaction between agents and traditional software, strengthening the overall ecosystem [70].
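As a schematic of what an "action network" agent loop looks like, the sketch below is an illustrative assumption rather than anything from the talk or from the MCP/A2A specifications: the model-side planning step is stubbed out with hard-coded rules, and the two tools are placeholders.

```python
# Hypothetical agent loop: plan, choose a tool, observe the result, repeat.
# A real agent would get plan_next_action from an LLM, not hard-coded rules.
from typing import Callable

# Tool registry: plain functions the agent is allowed to call (both are stubs).
TOOLS: dict[str, Callable[[str], str]] = {
    "search": lambda q: f"(stub) top result for '{q}'",
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}}, {})),
}

def plan_next_action(goal: str, history: list[str]) -> tuple[str, str]:
    """Stand-in for the model's reasoning step: decide which tool to call next."""
    if not history:
        return "search", goal
    return "calculator", "3 * 7"

def run_agent(goal: str, max_steps: int = 3) -> list[str]:
    history: list[str] = []
    for _ in range(max_steps):
        tool, arg = plan_next_action(goal, history)
        observation = TOOLS[tool](arg)
        history.append(f"{tool}({arg!r}) -> {observation}")
    return history

for step in run_agent("price of 3 items at 7 yuan each"):
    print(step)
```

Protocols like MCP standardize how such a tool registry and its observations are exchanged between the model and external services, which is precisely the part this toy version hard-codes.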