It's Not Scaling Laws Hitting the Wall, It's AGI.
自动驾驶之心·2025-09-28 23:33

Core Viewpoint
- The article posits that scaling laws do not necessarily lead to AGI and may even diverge from it, arguing that the structure of the underlying data is the critical factor in how effective AI models become [1].

Group 1: Data and Scaling Laws
- Scaling laws are described as an intrinsic property of the underlying data: model performance depends heavily on the quality and distribution of the training corpus [14] (see the scaling-law note after Group 3).
- The raw internet data mix is unlikely to be the optimal distribution for reaching AGI, since not all tokens are equally valuable, yet training allocates the same compute to every token [15] (see the token-weighting sketch after Group 3).
- Internet data, while abundant, is sparse in genuinely useful content, so models trained on it often make superficial gains rather than addressing core problems [8].

Group 2: Model Development and Specialization
- GPT-4 is noted to have largely exhausted the available internet data, yielding an intelligence grounded mainly in language expression rather than specialized knowledge of particular fields [9].
- Anthropic's introduction of synthetic data in models such as Claude Opus 3 improved coding capability, marking a shift toward more specialized training data [10] (see the synthetic-data sketch after Group 3).
- The trend continues with GPT-5, characterized as smaller but more specialized, with a corresponding decline in the general conversational ability users have come to expect [12].

Group 3: Economic Considerations and Industry Trends
- Under cost pressure, AI companies are likely to move away from general-purpose models toward high-value areas such as coding and search, which are projected to command significant market valuations [7][12].
- The article questions whether a single language model offers a sustainable path to AGI, suggesting that reliance on the "you feed me" deep-learning paradigm limits AI's broader global impact [12].
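To make "scaling laws are a property of the data" concrete, the standard Chinchilla-style parametric fit of pretraining loss (Hoffmann et al., 2022) is shown below as background notation; it is not a formula from the article itself. Every constant in it is estimated empirically for a particular training mixture, which is the sense in which the law belongs to the data rather than to the model.

```latex
% Chinchilla-style parametric loss fit (Hoffmann et al., 2022).
% N = parameter count, D = number of training tokens.
% E is the irreducible loss of the data distribution; A, B, \alpha, \beta
% are constants fit to a specific corpus. Change the corpus and all five change.
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
```

On the article's reading, a raw web mixture fixes E, alpha, and beta at values that may be far from those of a distribution better suited to AGI.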
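The "same compute per token" point refers to the standard cross-entropy objective, which weights every token equally regardless of how informative it is. Below is a minimal PyTorch sketch of the contrast; `token_value` is a hypothetical per-token importance score that the article does not specify how to obtain.

```python
import torch
import torch.nn.functional as F

def uniform_loss(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    """Standard LM objective: every token contributes equally to the loss,
    so every token receives the same gradient signal (and, in effect,
    the same training compute)."""
    return F.cross_entropy(logits.view(-1, logits.size(-1)), targets.view(-1))

def value_weighted_loss(
    logits: torch.Tensor, targets: torch.Tensor, token_value: torch.Tensor
) -> torch.Tensor:
    """Hypothetical alternative: upweight tokens judged more valuable.
    `token_value` is an assumed [batch, seq] score in [0, 1]; this only
    illustrates the contrast the article draws, not a proposed method."""
    per_token = F.cross_entropy(
        logits.view(-1, logits.size(-1)), targets.view(-1), reduction="none"
    )
    weights = token_value.view(-1)
    return (per_token * weights).sum() / weights.sum().clamp(min=1e-8)
```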
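On synthetic data for coding: Anthropic's pipeline is not public, but a common published pattern is generate-and-verify, where candidate solutions are sampled from an existing model and only those passing automated checks are kept, so the retained tokens are much denser in verified signal than raw web code. The sketch below illustrates that general pattern, not Anthropic's method; `generate_candidates` and `passes_checks` are hypothetical stubs.

```python
from dataclasses import dataclass

@dataclass
class Sample:
    prompt: str
    solution: str

def generate_candidates(prompt: str, n: int) -> list[str]:
    """Placeholder for sampling n candidate solutions from an existing
    model (e.g., via an inference API); stubbed so the sketch runs."""
    return [f"def solve():\n    return {i}  # candidate {i}" for i in range(n)]

def passes_checks(solution: str, tests: str) -> bool:
    """Placeholder verifier: a real pipeline would execute `tests`
    against `solution` in a sandbox and return the outcome."""
    return "return" in solution  # trivially permissive stub

def build_synthetic_set(prompts_and_tests: list[tuple[str, str]]) -> list[Sample]:
    """Keep only generations that survive verification, so every retained
    token carries checked signal: the density the article says the raw
    internet mix lacks."""
    kept: list[Sample] = []
    for prompt, tests in prompts_and_tests:
        for candidate in generate_candidates(prompt, n=8):
            if passes_checks(candidate, tests):
                kept.append(Sample(prompt=prompt, solution=candidate))
                break  # one verified solution per prompt suffices here
    return kept
```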