Reinforcement Learning
Moonshot AI (月之暗面) Publishes an RL Training Acceleration Method: Training Speed Surges 97%, Long-Tail Latency Plunges 93%
QbitAI (量子位) · 2025-11-27 04:34
Core Viewpoint:
- Seer, a new acceleration engine developed by Moonshot AI (月之暗面) and Tsinghua University, significantly speeds up reinforcement learning (RL) training of large language models (LLMs) without altering the core training algorithms [1][8].
Summary by Sections
Performance Improvement:
- Seer improves the rollout efficiency of synchronous RL by 74% to 97% and reduces long-tail latency by 75% to 93% [3][23].
Technical Architecture:
- Seer consists of three main modules:
1. **Inference Engine Pool**: Comprises multiple inference instances and a global KVCache pool backed by DRAM/SSD, used for load balancing and data reuse [9].
2. **Request Buffer**: Acts as the unified entry point for all rollout requests, managing metadata and request states for precise resource scheduling [10].
3. **Context Manager**: Maintains context views for all requests and generates scheduling decisions based on context signals [11].
Key Technologies:
- **Divided Rollout**: Breaks responses down into independent requests and segments, reducing memory fluctuations and load imbalance [12][13].
- **Context-Aware Scheduling**: Uses a "speculative request" strategy that prioritizes obtaining length features for requests, which alleviates the delay caused by long requests [17].
- **Adaptive Grouped Speculative Decoding**: Exploits similar response patterns within a group to build a dynamic reference library for generating drafts, improving decoding efficiency [19].
Experimental Validation:
- In experiments with models such as Moonlight, Qwen2-VL-72B, and Kimi-K2, Seer increased throughput by 74% to 97% over the baseline system veRL, with significantly reduced long-tail latency [21][23].
- For instance, in the Moonlight task, the last 10% of requests took 3984 seconds with veRL, while Seer reduced this to 364 seconds, which the article reports as an 85% reduction in long-tail latency [23].
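The interplay between divided rollout and the "speculative request" strategy summarized above can be illustrated with a small simulation. This is a sketch under stated assumptions, not Seer's implementation: the names `Request`, `run_chunk`, and `rollout` are hypothetical, the chunk size is arbitrary, and the longest-estimated-first policy in the second phase is an assumed way of using the length signal that the summary does not spell out.

```python
from dataclasses import dataclass


@dataclass
class Request:
    group: str      # prompt group this response belongs to
    true_len: int   # hidden output length; known only to the simulator
    done: int = 0   # tokens generated so far

    @property
    def finished(self) -> bool:
        return self.done >= self.true_len


def run_chunk(req: Request, chunk: int) -> None:
    """Generate at most `chunk` tokens: one divided-rollout segment."""
    req.done = min(req.true_len, req.done + chunk)


def rollout(groups: dict[str, list[Request]], chunk: int = 4) -> list[str]:
    """Run every request to completion; return the group name of each chunk executed."""
    trace: list[str] = []
    estimate: dict[str, int] = {}

    # Phase 1: finish one probe (the "speculative request") per group so that
    # each group has an observed length before its siblings are scheduled.
    for name, reqs in groups.items():
        probe = reqs[0]
        while not probe.finished:
            run_chunk(probe, chunk)
            trace.append(name)
        estimate[name] = probe.done

    # Phase 2 (assumed policy): repeatedly run one chunk of the request with
    # the largest estimated remaining length, so long requests start early.
    pending = [r for reqs in groups.values() for r in reqs[1:]]
    while pending:
        nxt = max(pending, key=lambda r: estimate[r.group] - r.done)
        run_chunk(nxt, chunk)
        trace.append(nxt.group)
        # A sibling that outruns its probe raises the group's estimate.
        estimate[nxt.group] = max(estimate[nxt.group], nxt.done)
        pending = [r for r in pending if not r.finished]
    return trace


groups = {
    "short": [Request("short", 3), Request("short", 3)],
    "long": [Request("long", 12), Request("long", 12)],
}
trace = rollout(groups)
print(trace)
```

In this toy run the probes execute first, after which the remaining request of the long group is drained ahead of the short one. Because work is issued one chunk at a time, a real scheduler could migrate a request between instances at any chunk boundary, which is the load-balancing benefit the summary attributes to divided rollout.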
Financing and Future Plans:
- Moonshot AI is reportedly nearing completion of a new funding round that could raise several hundred million dollars and lift its valuation to $4 billion [32][33].
- The company is in discussions with investment firms, including IDG Capital and existing shareholder Tencent, and plans to close the round by the end of the year and begin an IPO process the following year [36][37].
Audience Seats Going Fast! Save the Date for MEET2026 and Let's Talk AI | Latest Speaker Lineup
QbitAI (量子位) · 2025-11-27 04:34
Core Insights:
- The MEET2026 Smart Future Conference will focus on cutting-edge technologies and industry developments that have drawn significant attention throughout the year [1].
- The theme "Symbiosis Without Boundaries, Intelligence to Ignite the Future" emphasizes how AI and smart technologies are penetrating industries, disciplines, and scenarios, becoming a core driving force for societal evolution [2].
Group 1: Conference Highlights
- The conference will cover this year's hot topics in the tech community, including reinforcement learning, multimodal AI, chip computing power, AI in various industries, and AI going global [3].
- The event will showcase the latest intersections of academic frontiers and commercial applications, featuring leading technological achievements across infrastructure, models, and products [4].
- The conference will also feature the authoritative release of the annual AI rankings and the annual AI trend report [5][93].
Group 2: Notable Speakers
- Zhang Yaqin, President of Tsinghua University's Intelligent Industry Research Institute and an academician of the Chinese Academy of Engineering, has a notable background in AI and digital video technologies [11][12].
- Sun Maosong, Executive Vice President of Tsinghua University's AI Research Institute, has led numerous national projects and has extensive experience in AI research [15].
- Wang Zhongyuan, Director of the Beijing Academy of Artificial Intelligence, has a strong background in AI core technology development and has published over 100 papers [19].
Group 3: AI Trends and Rankings
- The 2025 AI Annual Rankings, initiated by QbitAI, will evaluate companies, products, and individuals across three dimensions, and have become one of the most influential rankings in the AI industry [94].
- The 2025 Annual AI Trend Report will analyze ten significant AI trends based on technological maturity, current implementation, and potential value, highlighting representative organizations and best cases [95].
Group 4: Event Details
- The MEET2026 Smart Future Conference is scheduled for December 10, 2025, at the Beijing Jinmao Renaissance Hotel, with registration now open [96].
- The conference aims to attract thousands of tech professionals and millions of online viewers, establishing itself as an annual barometer for the smart technology industry [98].
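No Body, No AGI! Hillbot's Su Hao in Conversation with Qianxun's (千寻) Gao Yang: Embodied Intelligence Has a Big Bubble, but the Progress Is Real
QbitAI (量子位) · 2025-11-27 03:00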
Core Viewpoints:
- The discussion emphasizes that embodied intelligence is essential for achieving artificial general intelligence (AGI) [2][19].
- The path to AGI requires physical interaction with the environment, which embodied intelligence makes possible [21][23].
Group 1: Insights from Experts
- Su Hao asserts that without embodied intelligence there can be no general physical intelligence and no general intelligence [2][16].
- Gao Yang highlights that scaling data is crucial for solving problems in embodied intelligence, indicating that the essence of the challenge remains unchanged [3][10].
- Both experts agree that embodied intelligence is a key entry point for understanding AGI [3][4].
Group 2: Challenges and Opportunities
- The conversation addresses the technical bottlenecks in the evolution of embodied intelligence and the structural advantages China has in this field [7][24].
- The experts discuss the importance of real-world data for training models, noting that China has a significant advantage over the U.S. in data iteration efficiency [27][28].
- They note that integrating hardware and software design is critical to the success of embodied intelligence [26][30].
Group 3: Future Predictions
- The next significant breakthrough in embodied intelligence may occur within the next 2-3 years, particularly the development of embodied models comparable to GPT-3.5 [41][39].
- The experts believe that achieving AGI will be a continuous process involving multiple breakthroughs rather than a single event [38][40].
- The discussion concludes that the current state of embodied intelligence is characterized by both significant progress and notable hype [31][32].
NeurIPS 2025 Awards Announced: Qwen Wins Best Paper, Faster R-CNN Wins the Test of Time Award
机器之心 (Synced) · 2025-11-27 03:00
Core Insights:
- The NeurIPS 2025 conference awarded four Best Paper awards and three Best Paper Runner-up awards, highlighting significant advances across AI research areas [1][4].
Group 1: Best Papers
- Paper 1: "Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond)" discusses the limitations of large language models in generating diverse content and introduces Infinity-Chat, a dataset of 26,000 diverse user queries for studying model diversity [5][6][9].
- Paper 2: "Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free" examines the impact of gated attention mechanisms on model performance and stability, demonstrating significant improvements in the Qwen3-Next model [11][16].
- Paper 3: "1000 Layer Networks for Self-Supervised RL: Scaling Depth Can Enable New Goal-Reaching Capabilities" shows that increasing network depth to 1024 layers can improve self-supervised reinforcement learning, with performance gains of 2x to 50x [17][18].
- Paper 4: "Why Diffusion Models Don't Memorize: The Role of Implicit Dynamical Regularization in Training" identifies mechanisms that prevent diffusion models from memorizing training data, linking training dynamics to generalization [19][21][22].
Group 2: Best Paper Runner-Up
- Paper 1: "Optimal Mistake Bounds for Transductive Online Learning" solves a 30-year-old problem in learning theory by establishing optimal mistake bounds for transductive online learning [28][30][31].
- Paper 2: "Superposition Yields Robust Neural Scaling" argues that representation superposition is the primary mechanism governing neural scaling laws, supported by multiple experiments [32][34].
Group 3: Special Awards
- The Test of Time Award went to "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," recognized for its foundational impact on modern object detection frameworks since its publication in 2015 [36][40].
- The Sejnowski-Hinton Prize was awarded to "Random synaptic feedback weights support error backpropagation for deep learning," which contributed significantly to understanding biologically plausible learning rules in neural networks [43][46][50].
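Starting Soon! A Small-Group Course on Production-Oriented End-to-End Driving to Help You Land a Senior Algorithm Role
自动驾驶之心 (Heart of Autonomous Driving) · 2025-11-27 00:04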
Core Viewpoint:
- The article stresses the importance of bringing end-to-end autonomous driving to mass production, the scarcity of engineers qualified to do so, and the need for comprehensive training to address the challenges involved [1][3].
Group 1: Course Overview
- The course covers the essential algorithms for production-grade end-to-end systems, including one-stage and two-stage frameworks, reinforcement learning applications, and trajectory optimization [3][9].
- It aims to provide practical experience and insight into production challenges, focusing on real-world applications and expert guidance [3][6].
Group 2: Course Structure
- The course consists of eight chapters, each addressing a different aspect of end-to-end production, such as the task overview, algorithm frameworks, use of navigation information, and trajectory output optimization [9][10][11][12][13][14][15][16].
- The final chapter shares production experience from several perspectives, including data, models, and strategies for system enhancement [16].
Group 3: Target Audience and Requirements
- The course is aimed at advanced learners with a background in autonomous driving, reinforcement learning, and programming, although those with weaker foundations can still participate [17][18].
- Participants need access to a GPU meeting the recommended specifications and should be familiar with the relevant algorithms and programming languages [18].
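The 具身智能之心 (Heart of Embodied Intelligence) Technical Exchange Group Is Now Open!
具身智能之心 (Heart of Embodied Intelligence) · 2025-11-26 10:00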
Group 1
- A technical exchange group focused on embodied intelligence has been set up, covering VLA, VLN, teleoperation, Diffusion Policy, reinforcement learning, VLA+RL, sim2real, multimodal large models, simulation, motion control, target navigation, mapping and localization, and navigation [1].
- Anyone interested can add the assistant's WeChat account, AIDriver005, to join the community [2].
- To speed up the joining process, include a note with your institution or school, name, and research direction [3].
Audience Seats Going Fast! Save the Date for MEET2026 and Let's Talk AI | Latest Speaker Lineup
QbitAI (量子位) · 2025-11-26 09:33
Core Insights:
- The MEET2026 Smart Future Conference will focus on cutting-edge technologies and industry developments that have drawn significant attention throughout the year [1].
- The theme "Symbiosis Without Boundaries, Intelligence to Ignite the Future" emphasizes how AI and smart technologies are penetrating industries, disciplines, and scenarios, becoming a core driving force for societal evolution [2].
Group 1: Conference Highlights
- The conference will cover this year's hot topics in the tech community, including reinforcement learning, multimodal AI, chip computing power, AI in various industries, and AI going global [3].
- The event will showcase the latest intersections of academic frontiers and commercial applications, featuring leading technological achievements across infrastructure, models, and products [4].
- The conference will also feature the authoritative release of the annual AI rankings and the annual AI trend report [5][93].
Group 2: Notable Speakers
- Zhang Yaqin, President of Tsinghua University's Intelligent Industry Research Institute and an academician of the Chinese Academy of Engineering, has a notable background in AI and digital video technologies [11][12].
- Sun Maosong, Executive Vice President of Tsinghua University's AI Research Institute, has led multiple national projects and has extensive experience in AI research [15].
- Wang Zhongyuan, Director of the Beijing Academy of Artificial Intelligence, has a strong background in AI core technology development and has published over 100 papers [19].
Group 3: AI Trends and Rankings
- The "Artificial Intelligence Annual Rankings" initiated by QbitAI have become one of the most influential rankings in the AI industry, evaluating companies, products, and individuals across three dimensions [94].
- The "2025 Annual AI Trend Report" will analyze ten major AI trends based on technological maturity, implementation status, and potential value, highlighting representative institutions and best cases [95].
Group 4: Event Details
- The MEET2026 Smart Future Conference is scheduled for December 10, 2025, at the Beijing Jinmao Renaissance Hotel, with registration now open [96].
- The conference aims to attract thousands of tech professionals and millions of online viewers, establishing itself as an annual barometer for the smart technology industry [98].
Ilya's Latest Take: Scaling Laws Are Nearing Their Limits, and AI's Brute-Force Aesthetic Is Over
36Kr (36氪) · 2025-11-26 08:46
Core Insights:
- Ilya Sutskever, co-founder of OpenAI and a key figure in deep learning, has shifted focus from scaling models to research-driven approaches in AI development [1][2][3].
- The industry is moving away from "scale-driven" methods back to "research-driven" strategies, emphasizing the importance of asking the right questions and developing new methodologies [2][3].
- Sutskever argues that while AI companies may experience stagnation, they can still generate significant revenue despite reduced innovation [2][3].
- The potential for narrow AI models to excel in specific domains suggests that breakthroughs may come from improved learning methods rather than merely increasing model size [3][4].
- The emergence of powerful AI could lead to transformative societal changes, including increased productivity and shifts in political and governance structures [3][4].
- Sutskever emphasizes the importance of aesthetic principles in research, advocating for simplicity and elegance in AI design [4].
Industry Trends:
- The scaling laws that dominated AI development are nearing their limits, prompting a return to foundational research and exploration [2][28].
- The current phase of AI development is characterized by a shift from pre-training to reinforcement learning, which is more resource-intensive [29][30].
- The distinction between effective resource utilization and mere computational waste is becoming increasingly blurred in AI research [30][31].
- The scale of computational resources available today is substantial, but the focus should be on how effectively these resources are utilized for meaningful research [42][44].
Company Insights:
- Safe Superintelligence (SSI) has raised $3 billion, positioning itself to focus on foundational research without the pressures of market competition [45][46].
- SSI's approach to AI development may differ from other companies that prioritize immediate market applications, suggesting a long-term vision for advanced AI [45][46].
- The company believes that the true value lies not in the sheer amount of computational power but in the strategic application of that power to drive research [43][44].
Register Early! MEET2026 Announces Its Latest Speaker Lineup: Let's Talk AI
QbitAI (量子位) · 2025-11-25 09:32
Core Insights:
- The article emphasizes the transformative impact of artificial intelligence (AI) on various industries, marking the beginning of a new era in 2025 [1].
- The MEET2026 Intelligent Future Conference will focus on cutting-edge technologies and industry advancements related to AI [2][3].
- The conference will feature discussions on key topics such as reinforcement learning, multimodal AI, chip computing power, AI applications in various industries, and AI's global expansion [4].
Event Details:
- The conference theme is "Symbiosis Without Boundaries, Intelligence to Ignite the Future," highlighting AI's role as a core driving force for societal evolution [3].
- The event will showcase the latest academic and commercial innovations, featuring leading technologies from infrastructure, models, and products [5].
- An authoritative release of the annual AI rankings and trends report will be a highlight of the conference [6][102].
Notable Speakers:
- The conference will host prominent figures in the AI field, including Zhang Yaqin, a renowned scientist and entrepreneur in digital video and AI [12][13].
- Other notable speakers include Sun Maosong, Wang Zhongyuan, and He Xiaodong, who have made significant contributions to AI research and applications [17][21][30].
- The lineup also features leaders from major tech companies, such as Wang Ying from Baidu and Daniel Povey from Xiaomi, showcasing a diverse range of expertise [26][40].
AI Trends and Rankings:
- The 2025 AI Annual Rankings will evaluate companies, products, and individuals across three dimensions, becoming one of the most influential rankings in the AI industry [103].
- The 2025 Annual AI Trends Report will identify and analyze ten significant AI trends based on technology maturity, current applications, and potential value [104].
Conference Logistics:
- The MEET2026 Intelligent Future Conference is scheduled for December 10, 2025, at the Beijing Jinmao Renaissance Hotel, with registration now open [105].
- The event aims to attract thousands of tech professionals and millions of online viewers, establishing itself as a key annual event in the intelligent technology sector [107].
Liu Qin: Great Companies Don't Just Win One Battle; They Never Leave the Field | Closing Out 2025
36Kr (36氪) · 2025-11-25 00:09
Core Viewpoint:
- The article emphasizes the need for adaptability and continuous learning in the investment landscape, particularly in the context of emerging technologies like AI and biotechnology, highlighting the importance of maintaining a growth mindset amidst uncertainty [6][7][11].
Group 1: Investment Landscape
- The current investment environment is characterized by collective anxiety within the Chinese venture capital community, questioning how to navigate a landscape devoid of simple innovation models [7].
- The transition from traditional investment strategies to hard technology sectors, such as biotechnology, poses significant challenges for seasoned investors who must adapt to new paradigms [9][10].
- The concept of "infinite games" is introduced, suggesting that successful companies focus on continuous evolution rather than short-term victories, which is crucial for long-term sustainability [24][25].
Group 2: Cultural Confidence
- There is a deep-rooted cultural confidence in Chinese entrepreneurship, reflecting a historical resilience and a spirit of innovation that persists despite challenges [12][13].
- The belief in a new cycle of innovation, termed "Innovation 2.0," is gaining traction among investors and entrepreneurs, indicating a shift towards optimism in the market [12][16].
Group 3: AI and Future Trends
- The emergence of AI is seen as a transformative force that will redefine productivity, enabling individuals and small teams to achieve significant market valuations [17].
- The article discusses the potential for AI to integrate into various industries, suggesting that its true impact will be realized when it becomes ubiquitous in decision-making processes [17][19].
Group 4: Narrative and Collaboration
- The ability to create compelling narratives is highlighted as a unique human trait that drives collaboration and innovation, essential for achieving extraordinary outcomes [19][20].
- Successful companies are described as those that not only provide solutions but also construct an attractive vision for the future, fostering a shared sense of purpose among stakeholders [21][22].
Group 5: Learning and Growth
- Continuous learning and iteration are emphasized as critical components of success in an ever-evolving business landscape, with failures viewed as valuable learning experiences [28][30].
- The article concludes with a call for entrepreneurs to embrace challenges and maintain a commitment to growth, underscoring that great companies thrive by remaining engaged in the market and evolving over time [30].