
Reinforcement Learning Cloud


In 2026, the Second Half of Large-Model Training Belongs to the "Reinforcement Learning Cloud"

机器之心 · 2026-01-12 05:01

Core Insights

- The article describes a shift in AI model development: away from scaling laws driven by more parameters and more training data, and toward reinforcement learning (RL) and post-training scaling [1][4][10].

Group 1: Scaling Law and Model Development

- By the end of 2024, discussions in Silicon Valley and Beijing reflected concern that scaling laws were hitting a wall, as newer flagship models such as Orion did not show the expected marginal gains from additional parameters and data [1].
- Ilya Sutskever remarked that the field was moving from an age of scaling to an age of wonder and discovery, signaling skepticism about how long the pre-training approach could be sustained [3].
- By early 2025, OpenAI's o1 model had introduced RL-driven reasoning, demonstrating that test-time scaling could yield higher intelligence, while DeepSeek R1 replicated the technique in open-source form [4][6].

Group 2: Reinforcement Learning and Infrastructure

- Compute is shifting from pre-training scaling to post-training and test-time scaling, with deep reasoning capability valued over parameter count alone [8].
- The emergence of DeepSeek R1 showed that deep reasoning, driven by reinforcement learning, matters more for model evolution than simply increasing parameters [4][6].
- The industry is calling for new computational infrastructure to support this shift toward dynamic exploration and reasoning, as existing cloud architectures struggle to meet these demands [11][12].

Group 3: Agentic RL and Its Implications

- Nine Chapters Cloud has positioned itself as a leader in defining "reinforcement learning cloud" infrastructure, which the article treats as essential to the evolving AI landscape [12][14].
- The Agentic RL platform, launched in mid-2025, is described as the first industrial-grade reinforcement learning cloud platform, improving training efficiency and reducing costs [15][19].
- Agentic RL aims to evolve general models into expert models capable of complex decision-making and control, addressing real-world challenges across industries [20][22].

Group 4: Real-World Applications and Economic Impact

- The delivery of a large-scale AI center in Huangshan within 48 days is presented as evidence of Nine Chapters Cloud's engineering capability and operational efficiency [41][43].
- The Huangshan model is projected to add at least 200 million yuan in annual service-industry value [48].
- The integration of AI capabilities into urban management and tourism illustrates how AI infrastructure can drive economic growth and improve operational efficiency [50][51].

Group 5: Future Vision and Market Position

- Nine Chapters Cloud aims to establish itself as a key player in the independent AI cloud sector, advocating an open ecosystem in which it does not compete with its clients [54][60].
- The company emphasizes defining standards for next-generation infrastructure, moving beyond traditional cloud services to enable the rapid evolution of intelligent agents [63][66].
- The article envisions the future of cloud computing as an "evolution era," in which the focus is on enhancing the capabilities of intelligent agents rather than merely providing computational resources [68][69].