Scaling Law
Heard that everyone is going all-in on post-training? The best guide is here
机器之心· 2025-10-09 02:24
Core Insights
- The article emphasizes the shift in focus from pre-training to post-training in large language models (LLMs), highlighting the diminishing returns of scaling laws as model sizes reach hundreds of billions of parameters [2][3][11]

Group 1: Importance of Post-Training
- Post-training is recognized as a crucial phase for enhancing the reasoning capabilities of models such as OpenAI's o-series, DeepSeek R1, and Google Gemini, marking it as a necessary step toward advanced intelligence [3][11]
- The article introduces a range of post-training methods, including Reinforcement Learning from Human Feedback (RLHF), Reinforcement Learning from AI Feedback (RLAIF), and Reinforcement Learning with Verifiable Rewards (RLVR) [2][3][12]

Group 2: Transition from Pre-Training to Post-Training
- The evolution from pre-training to instruction fine-tuning is discussed: foundational models are trained on large datasets to predict the next token, but often lack practical utility in real-world applications [7][8]
- Post-training aims to align model behavior with user expectations, prioritizing quality over quantity in the datasets used, which are typically smaller but more refined than pre-training datasets [11][24]

Group 3: Supervised Fine-Tuning (SFT)
- Supervised Fine-Tuning (SFT) transforms a pre-trained model into one that can follow user instructions effectively, relying on high-quality instruction-answer pairs [21][24]
- The quality of the SFT dataset is critical: even a small number of low-quality samples can degrade the model's performance [25][26]

Group 4: Reinforcement Learning Techniques
- Reinforcement Learning (RL) is highlighted as a complex yet effective method for model fine-tuning, with reward mechanisms such as RLHF, RLAIF, and RLVR employed to improve model performance [39][41]
- The article underscores the importance of reward models in RLHF, which are trained on human preference data to guide model outputs (a minimal sketch of this preference objective follows after this summary) [44][46]

Group 5: Evaluation of Post-Training Models
- Evaluating post-trained models is multifaceted, requiring a combination of automated and human assessments to capture different aspects of quality [57][58]
- Automated evaluations are cheap and fast, while human evaluations provide a more subjective quality measure, especially for nuanced tasks [59][60]
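The RLHF point in Group 4 is concrete enough to illustrate: reward models are typically fit on human preference pairs with a pairwise (Bradley-Terry style) objective. The sketch below shows only that loss in plain PyTorch; the batch of scalar scores is a made-up stand-in for a real reward head over (prompt, response) pairs, not the setup described in the article.

```python
# Minimal sketch of the pairwise reward-model loss used in RLHF
# (Bradley-Terry objective over human preference pairs). Only the loss
# computation reflects the standard formulation; the random scores below
# are hypothetical placeholders for a scalar reward head's outputs.
import torch
import torch.nn.functional as F

def reward_model_loss(reward_chosen: torch.Tensor,
                      reward_rejected: torch.Tensor) -> torch.Tensor:
    """Push the score of the human-preferred response above the rejected one:
    loss = -log(sigmoid(r_chosen - r_rejected))."""
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Toy usage with random scores standing in for reward-head outputs.
r_chosen = torch.randn(8, requires_grad=True)
r_rejected = torch.randn(8, requires_grad=True)
loss = reward_model_loss(r_chosen, r_rejected)
loss.backward()
print(float(loss))
```

Minimizing this loss drives the reward model to rank preferred answers higher, and the resulting scalar reward is what later guides the RL step (e.g., PPO) during post-training.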
"Bigger is better," but tech-geek Alibaba Cloud is not obsessed with grabbing headlines
Guan Cha Zhe Wang· 2025-09-29 09:46
Core Viewpoint
- Alibaba's CEO, Wu Yongming, delivered a notable presentation at the Yunqi Conference, which led to a significant 9.16% increase in Alibaba's stock price, indicating strong investor sentiment despite a generally cautious market environment [1][3]

Group 1: Company Developments
- Wu Yongming argued that large models will dominate software as the next-generation operating system, and Alibaba Cloud plans to invest in AI infrastructure beyond its existing 380 billion yuan three-year commitment [3]
- Alibaba Cloud's Qwen3-Max model has achieved significant advancements, including an increase in pre-training data from 18 trillion to 36 trillion tokens, and a continued bet on scaling laws to improve model performance [6][10]
- The company has positioned itself as a leader in the AI cloud market, with a reported 35.8% market share, significantly ahead of competitors [16][22]

Group 2: Competitive Landscape
- Competition in the AI cloud sector is intensifying, particularly with ByteDance's Volcano Engine, which has captured a 49.2% share of the model-as-a-service (MaaS) market [16][18]
- Despite the competitive pressure, Alibaba Cloud has maintained a strong position, with over 53% of Fortune 500 companies using its services for generative AI [16][22]
- Market dynamics are shifting toward self-deployment of models on Alibaba Cloud rather than reliance on API calls alone, a trend that may not be fully reflected in market-share statistics [16][22]

Group 3: Technological Innovations
- Alibaba Cloud has made significant strides in AI infrastructure, including a new AI chip that approaches NVIDIA's capabilities and a high-performance network architecture that supports large-scale GPU interconnectivity [25][27]
- The company is building a full stack for AI infrastructure, which positions it well amid growing domestic demand for AI capabilities [27]
- Innovations in model architecture, such as the Qwen3-Next model with a sparse MoE architecture, demonstrate Alibaba's commitment to advancing AI technology (an illustrative routing sketch follows after this summary) [6][10]
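The Qwen3-Next item mentions a sparse MoE architecture. As a rough illustration of what sparse routing means in general, the sketch below implements a top-k gated MoE layer in PyTorch; the layer sizes, expert count, and top-k value are arbitrary assumptions and do not reflect Alibaba's actual configuration.

```python
# Illustrative top-k sparse MoE layer: a router scores experts per token and
# only the top-k experts run for each token. Dimensions are made-up values.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    def __init__(self, d_model=512, n_experts=8, top_k=2, d_ff=1024):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, n_experts)        # router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                                # x: (tokens, d_model)
        scores = self.gate(x)                            # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)   # route to k experts
        weights = F.softmax(weights, dim=-1)             # renormalize top-k
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                    # tokens sent to expert e
                if mask.any():
                    out[mask] += weights[mask, k:k+1] * expert(x[mask])
        return out

tokens = torch.randn(16, 512)
print(SparseMoELayer()(tokens).shape)                    # torch.Size([16, 512])
```

The point of the design is that total parameters grow with the number of experts while per-token compute stays roughly proportional to top-k, which is how sparse MoE models scale capacity cheaply.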
How can the humanoid and embodied intelligence industry knock on the door of the "Scaling Law"?
机器人大讲堂· 2025-09-24 11:09
Core Viewpoint
- The humanoid robot industry is at a critical transformation point, moving from early "theme speculation" to "pre-investment in industrial trends" as companies like Tesla and Figure begin small-scale production. The industry's non-linear growth hinges on breakthroughs in hardware cost reduction and advancements in intelligent robotics [1][3]

Group 1: Current Industry Landscape
- The core contradiction in humanoid robotics is not "whether to ship" but "whether a sustainable industrial flywheel can form." By the end of 2024 and early 2025, many domestic companies had delivered hundreds to thousands of units, primarily for research, education, and display purposes [1][3]
- Initial order numbers are not the key signal; the real turning point for the industry lies in the "Scaling Law moment" of the robotic brain, where intelligence improves non-linearly with data volume and model scale, breaking through the bottleneck of scenario generalization [1][3]

Group 2: Challenges to the Scaling Law Moment
- Two major challenges must be addressed: high hardware costs and the lack of standardized solutions. For instance, Tesla's Optimus Gen1 has a high BOM cost, with a target of reducing it to $20,000 per unit; joint modules and sensors are the key components for cost reduction [3]
- The software side lacks a "robotic version of ChatGPT." The robotic brain must combine "perception and decision-making" with "motion control," but current models face data challenges, including complex motion data modalities and the high cost of real-world data collection [3][4]

Group 3: Technological Pathways
- "Big and small brain collaboration" has become the mainstream engineering approach, with three clear paths for the evolution of large models in robotics. The dual-system layered VLA architecture is currently seen as the most practical route to engineering implementation (a schematic sketch of the slow/fast split follows after this summary) [4][5]
- Figure's Helix system exemplifies this collaboration, pairing a slow system that understands natural language with a fast system for real-time control, enabling complex tasks in flexible manufacturing scenarios [7][9]

Group 4: Commercialization Pathways
- Commercialization of humanoid robots is expected to follow a "from easy to difficult" path: first ToG (research and education), then ToB (industrial manufacturing), and finally ToC (household services). The ToB sector is becoming the critical battleground for breakthroughs [8][9]
- Apparel manufacturing is a typical ToB use case, with a large global workforce and high labor costs, yet low penetration of traditional industrial robots due to flexible materials and rapid style changes [8][9]

Group 5: Investment Trends and Future Outlook
- Capital is shifting from hardware to software, with significant investments in embodied-intelligence large models from companies like Google and NVIDIA; domestic startups are also gaining traction in this space [11]
- The ultimate goal of the humanoid robot industry is to replicate the "non-linear growth curve" seen in electric vehicles and smartphones, with the robotic brain's "Scaling Law moment" as the key trigger [13]
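To make the "big brain / small brain" split in Group 3 more concrete, here is a hedged schematic in Python of a dual-rate loop: a slow planner turns an instruction into a latent goal, and a fast controller tracks that goal at a much higher rate. The class names, rates, and toy policies are hypothetical placeholders, not Figure's actual Helix implementation.

```python
# Schematic dual-system ("big brain / small brain") loop: a slow, language-level
# planner updates a latent goal infrequently; a fast controller closes the loop
# continuously. Everything here is an illustrative stand-in.
import time
import random

class SlowPlanner:
    """System 2: interprets an instruction into a latent goal (runs ~1 Hz)."""
    def plan(self, instruction: str) -> list[float]:
        return [random.uniform(-1, 1) for _ in range(8)]   # toy latent goal vector

class FastController:
    """System 1: maps latent goal + joint state to commands (runs ~100 Hz)."""
    def act(self, latent_goal: list[float], joint_state: list[float]) -> list[float]:
        return [0.1 * (g - q) for g, q in zip(latent_goal, joint_state)]  # toy P-control

def control_loop(instruction: str, seconds: float = 0.05) -> list[float]:
    planner, controller = SlowPlanner(), FastController()
    joint_state = [0.0] * 8
    latent_goal = planner.plan(instruction)      # slow, infrequent update
    t_end = time.time() + seconds
    while time.time() < t_end:                   # fast inner control loop
        command = controller.act(latent_goal, joint_state)
        joint_state = [q + c for q, c in zip(joint_state, command)]
    return joint_state

print(control_loop("fold the shirt"))
```

The design choice the article describes is exactly this separation of rates: language understanding is allowed to be slow, while the control loop that touches the world stays fast and simple.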
Baidu and the future of AI
36Ke· 2025-09-24 10:53
Group 1
- Baidu's search engine is undergoing a significant transformation towards AI integration, referred to internally as "Big Search," marking the largest change in a decade [1]
- The AI-driven agent model is expected to assist users in completing tasks beyond traditional keyword searches, indicating a shift in user interaction [1]
- Baidu's Wenku and cloud storage services are also expanding, aiming to create a "one-stop AI creation platform" with a dedicated team of 1,200 [1]

Group 2
- The article discusses the evolution of the internet ecosystem, highlighting the complexity of user needs and the competitive landscape dominated by major players like BAT and FANG [2]
- The historical context of the internet's development is explored, noting the transition from information-centric models to more integrated social and e-commerce platforms [3]

Group 3
- The recommendation engine developed by Baidu is based on user behavior data, aiming to enhance targeted advertising through detailed user profiling [5]
- The article critiques the current state of content production, suggesting that the focus on quantity over quality has led to a decline in meaningful engagement [6]

Group 4
- The dominance of algorithm-driven content distribution is noted, with implications for user experience and the overall information ecosystem [8]
- Baidu's market position is analyzed in light of competition from ByteDance, emphasizing the challenges faced by traditional search models in adapting to new content consumption patterns [8]

Group 5
- The article reflects on the missed opportunities for Baidu in the early days of algorithm distribution, suggesting that a more proactive approach could have altered its competitive stance [11]
- The potential of AI to revolutionize information access and user interaction is highlighted, with a focus on the implications for Baidu's future strategies [19][20]

Group 6
- Baidu's early commitment to AI, including the establishment of a deep learning research institute, is acknowledged, though recent performance in AI competitions has raised questions about its strategic direction [20]
- The article emphasizes the importance of application development in AI, suggesting that successful models will depend on practical use cases rather than theoretical frameworks [32]
At the Bund Summit session "Embodied Intelligence: From Generalization to Action, Reshaping the Future of Industry," what did the big names have to say?
机器之心· 2025-09-16 08:37
Core Viewpoint
- The article discusses the future of AI and embodied intelligence, emphasizing the need for disruptive innovation to enable generalized action capabilities and the transition from technical feasibility to commercial success [2][4]

Group 1: Embodied Intelligence Development
- The concept of embodied intelligence has evolved from simply giving machines a physical body to creating immersive perception processes [6]
- Current challenges in the field include data bottlenecks, which can be addressed through the establishment of training grounds that enhance robustness and generalization capabilities [7]
- The industry is witnessing a surge in the construction of training grounds, which offer benefits such as cost reduction, safety simulation, and unified standards [7]

Group 2: Data Collection and Utilization
- Training grounds are described as new data factories in the AI era, crucial for collecting data to train embodied intelligence models [8][10]
- The development paradigm has shifted to a model where data collection occurs post-robot development, emphasizing the importance of large datasets for effective training [10][11]
- The use of synthetic data is highlighted as a viable solution to the challenges of obtaining real-world data, allowing for scalable and controllable training processes [18][19]

Group 3: Future Prospects and Challenges
- The industry is exploring various paths for embodied intelligence, including the integration of real-world data and simulation data to enhance model performance [30][31]
- Discussions on the potential of humanoid robots reveal that while they may not be the only form of embodied intelligence, their development is crucial for achieving broader applications [34][35]
- The timeline for the integration of embodied intelligence into daily life is projected to be gradual, with significant advancements expected in the next 5 to 10 years [38]

Group 4: Industry Collaboration and Ecosystem
- The need for collaboration across the industry is emphasized, with calls for the establishment of a robust ecosystem to support the development of embodied intelligence [48][49]
- Various stakeholders express the importance of integrating hardware and software capabilities to enhance the overall effectiveness of embodied intelligence solutions [47][49]
- The article concludes with a vision for a future where embodied intelligence significantly transforms industries and daily life, driven by collective efforts from academia and industry [51]
Who says the Scaling Law has run its course? New research: tiny per-step improvements bring exponential gains
36Ke· 2025-09-16 07:46
Core Insights
- The Scaling Law is being questioned due to perceived diminishing returns in model training, but recent research suggests that small improvements in single-step accuracy can lead to exponential growth in the length of tasks a model can complete, which may hold more economic value in real-world applications (see the short calculation after this summary) [1][2][4]

Group 1: Research Findings
- A recent paper from Cambridge University indicates that while there are diminishing returns in metrics like test loss, the real-world value of large language models (LLMs) often comes from their ability to complete longer tasks [2][4]
- The paper highlights that long-horizon execution has been a significant weakness in deep learning, with LLMs struggling to perform complex, lengthy tasks despite improvements in reasoning capabilities [4][6]
- The authors argue that failures on long tasks are primarily due to execution challenges rather than reasoning or planning limitations, emphasizing the need for more focus on execution capabilities in LLM research [6][20]

Group 2: Experimental Insights
- The study measures LLMs' long-horizon execution capabilities by isolating execution from planning and knowledge retrieval, revealing that larger models can significantly increase the number of successfully executed rounds [6][23][25]
- The concept of self-conditioning is introduced, in which the model's performance deteriorates as it conditions on its own previous errors, leading to a decline in accuracy over successive rounds [8][26][30]
- The research shows that while increasing model size improves task execution, it does not alleviate the self-conditioning effect, which remains a challenge for LLMs on long-horizon tasks [27][30]

Group 3: Implications for Investment
- The findings suggest that the economic value of LLMs may not be accurately reflected in short-task benchmarks, as the ability to complete longer tasks is a more reliable indicator of their potential [18][20]
- The paper encourages further investment in scaling models, since the ability to perform longer tasks could justify continued financial commitment even when short-term metrics suggest stagnation [10][18]
- The research calls for new benchmarks that better assess models' execution depth, highlighting a potential area for future investment and development in the AI sector [10][18]
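The compounding claim in the Core Insights bullet can be checked with a one-line formula: if each step succeeds independently with probability p, the longest task a model can finish with at least 50% overall success is ln(0.5)/ln(p), which grows rapidly as p approaches 1. The accuracy values below are illustrative choices, not numbers taken from the paper.

```python
# Back-of-the-envelope horizon calculation under an independent-error model:
# horizon H such that p**H >= target_success, i.e. H = ln(target)/ln(p).
import math

def horizon(p: float, target_success: float = 0.5) -> float:
    """Max number of steps completable with >= target_success overall success."""
    return math.log(target_success) / math.log(p)

for p in (0.90, 0.99, 0.999):
    print(f"per-step accuracy {p:.3f} -> horizon ~{horizon(p):,.0f} steps")
# per-step accuracy 0.900 -> horizon ~7 steps
# per-step accuracy 0.990 -> horizon ~69 steps
# per-step accuracy 0.999 -> horizon ~693 steps
```

Under this simple model, moving per-step accuracy from 99% to 99.9% multiplies the achievable horizon roughly tenfold, which is the intuition behind the "small gains, exponential value" argument.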
Musk's weekend bloodbath: 500 laid off at xAI
Sou Hu Cai Jing· 2025-09-16 06:27
Core Insights
- xAI has implemented a sudden internal assessment leading to a significant layoff of its data annotation team, with a reported attrition rate of 33% and over 500 employees terminated [1][11]

Group 1: Layoff Details
- The data annotation team, crucial for the development of Grok, has seen its size decrease from 1,500 to just over 1,000 employees, indicating a nearly one-third reduction [11]
- The layoffs were preceded by a series of one-on-one discussions with employees, creating a sense of panic within the company [5][7]
- The company announced a strategic shift towards hiring specialized data annotators, planning to expand their numbers tenfold, while reducing the focus on general data annotators [11][12]

Group 2: Strategic Shift
- This shift from general to specialized data annotation reflects a belief that quality is more important than quantity, aiming to enhance Grok's capabilities in specific fields [12][14]
- The decision may limit the diversity of data available for training, which is essential for the growth of AI systems [12][14]
- The move is seen as a significant gamble on vertical industry AI applications, potentially positioning Grok advantageously if successful [14][15]

Group 3: Management Philosophy
- Elon Musk's management style is characterized by a preference for small, high-performing teams, often leading to drastic layoffs to maintain efficiency and performance [22][24]
- This approach has been consistent across Musk's ventures, including Tesla and Twitter, where he has previously enacted similar layoffs to streamline operations [20][24]
- The emphasis on high performance and low tolerance for underachievement is a hallmark of Musk's leadership, which may drive the remaining employees to maximize their potential [22][25]
Musk's weekend bloodbath: 500 laid off at xAI
量子位· 2025-09-16 05:58
Core Insights
- xAI has implemented a drastic layoff strategy, resulting in a 33% attrition rate within its data annotation team, with over 500 employees terminated [2][18]
- The company is shifting its focus from general data annotation to specialized roles, aiming to expand the number of professional data annotators tenfold, indicating a strategic pivot towards vertical AI applications [19][21]

Group 1: Layoff and Testing Strategy
- xAI conducted an internal test with a high elimination rate, leading to significant layoffs in the data annotation team, which was previously the largest team within the company [2][3]
- The layoffs were preceded by one-on-one discussions with employees, creating a sense of panic within the organization [11][12]
- The termination emails indicated a strategic shift to prioritize specialized data annotation roles over general positions, reflecting a change in the company's operational focus [17][18]

Group 2: Shift in Focus to Specialized AI
- The decision to reduce the number of general data annotators in favor of specialized roles suggests a belief that quality is more important than quantity in AI training [21][22]
- This shift aims to enhance Grok's capabilities and credibility in specific fields, although it may limit the diversity of data available for training [22][25]
- The move aligns with a broader trend in which vertical models in industries like finance and healthcare are becoming more prominent than general models [25][27]

Group 3: Elon Musk's Management Style
- Elon Musk's history of aggressive layoffs and restructuring is evident in his management approach, which emphasizes high performance and efficiency [30][35]
- Musk prefers small, highly skilled teams over larger ones, believing they are more creative and efficient [36][37]
- The culture of high expectations and low tolerance for underperformance is a hallmark of Musk's leadership, as seen in previous companies like Tesla and Twitter [40][42]
Who says the Scaling Law has run its course? New research: tiny per-step improvements bring exponential gains
机器之心· 2025-09-16 04:01
Core Viewpoint
- The article discusses the ongoing debate over the diminishing returns of scaling models in AI, particularly for large language models (LLMs). It presents a new perspective: even as single-step accuracy improves more slowly, those incremental gains can compound into exponential growth in the length of tasks a model can complete, which may hold greater economic value in real-world applications [1][3]

Group 1: Scaling Law and Economic Value
- The scaling law indicates that while there may be diminishing returns in metrics like test loss, the real-world value of LLMs often comes from their ability to complete longer tasks. Larger models can compound small improvements in single-step accuracy, yielding exponential increases in achievable task length [3][6]
- The paper, titled "The Illusion of Diminishing Returns: Measuring Long Horizon Execution in LLMs," argues that the economic value of an AI agent derives from the length of tasks it can complete, rather than from short-task benchmarks that may suggest stagnating progress [5][19]

Group 2: Long-Horizon Execution Challenges
- Long-horizon task execution has historically been a significant weakness for deep learning models. The paper highlights that while LLMs have improved on complex reasoning tasks, they still struggle to execute longer tasks reliably [6][11]
- The authors propose that failures in long-horizon execution are often misattributed to reasoning or planning deficiencies, when in fact execution remains a critical and under-researched challenge [7][22]

Group 3: Self-Conditioning Effect
- The study identifies a self-conditioning effect in which the per-step error rate rises as the model conditions on its own earlier mistakes, compounding errors over long tasks. This contrasts with human performance, where practice typically leads to improvement (a toy simulation of this effect follows after this summary) [9][30]
- The authors found that larger models do not necessarily mitigate the self-conditioning effect, which can lead to declining performance over extended tasks [29][32]

Group 4: Impact of Thinking Models
- Recent thinking models have shown the ability to correct for self-conditioning limitations, allowing significantly longer task execution within a single round. For instance, the GPT-5 thinking version can execute over 1,000 steps, far surpassing competitors [10][36]
- The research emphasizes the importance of reasoning before acting: models that use thinking chains execute longer tasks more reliably than those that do not [36][37]

Group 5: Experimental Insights
- The experiments reveal that increasing model size significantly raises the number of rounds a model can successfully execute, demonstrating a clear scaling trend [27][28]
- The findings suggest that while larger models improve task execution, they still face challenges from self-conditioning, which remains a critical area for future research [29][37]
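As a toy illustration of the self-conditioning effect in Group 3, the simulation below lowers per-step accuracy in proportion to the share of earlier errors in the context and compares the result with an independent-error baseline. The base accuracy and penalty coefficient are made-up values chosen only to make the drift visible; they are not measurements from the paper.

```python
# Toy simulation of self-conditioning: per-step accuracy degrades as the
# fraction of earlier mistakes in the context grows, so errors compound
# faster than an independent-error model would predict.
import random

def late_accuracy(n_steps=2000, base_acc=0.95, penalty=0.5,
                  self_conditioned=True, seed=0, window=200):
    """Per-step accuracy over the last `window` steps of a long run."""
    rng = random.Random(seed)
    errors, correct_late = 0, 0
    for step in range(n_steps):
        err_rate_in_context = errors / step if step else 0.0
        acc = base_acc - (penalty * err_rate_in_context if self_conditioned else 0.0)
        ok = rng.random() < acc
        errors += (not ok)
        if step >= n_steps - window:
            correct_late += ok
    return correct_late / window

seeds = range(50)
plain = sum(late_accuracy(self_conditioned=False, seed=s) for s in seeds) / 50
selfc = sum(late_accuracy(self_conditioned=True, seed=s) for s in seeds) / 50
print(f"late-run accuracy, independent errors : {plain:.3f}")  # stays near base_acc
print(f"late-run accuracy, self-conditioned   : {selfc:.3f}")  # drifts lower
```

The qualitative takeaway matches the summary: once earlier mistakes sit in the context, accuracy late in the run settles below the model's nominal per-step accuracy, and simply making the model bigger does not remove that feedback loop.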
Academician Zhang Hongjiang: Agents will replace enterprise processes and reshape how human organizations are structured in the future
Xin Lang Ke Ji· 2025-09-11 02:34
Core Insights
- The emergence of DeepSeek R1 has significantly reduced the cost of inference models while maintaining performance close to the best models available, indicating a potential for increased demand as costs decrease [1]
- The launch of ChatGPT marked a pivotal moment, with its daily active users nearing 30% of search engine usage by March this year, highlighting the integration of large models into daily life [1]
- The rapid improvement in model performance and reduction in usage costs are expected to continue, driving the development of large models and their impact on various industries [1]
- The concept of agents is evolving, with their planning capabilities growing exponentially, suggesting a new phase in AI development referred to as Moore's Law 3.0, where agent capabilities double every seven months [1]
- AI is transitioning from being an assistant to becoming a partner, indicating a shift in the relationship between humans and machines, which will alter organizational structures and employment in the future [2]