Seer
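Moonshot AI publishes a method for accelerating reinforcement learning training: training speed surges 97%, long-tail latency plunges 93%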
QbitAI (量子位) · 2025-11-27 04:34
Core Viewpoint
- Seer, a new acceleration engine developed by Moonshot AI (月之暗面) and Tsinghua University, significantly speeds up reinforcement learning (RL) training of large language models (LLMs) without altering the core training algorithms [1][8].

Summary by Sections

Performance Improvement
- Seer improves the rollout throughput of synchronous RL by 74% to 97% and reduces long-tail latency by 75% to 93% [3][23].

Technical Architecture
- Seer consists of three main modules:
1. **Inference Engine Pool**: Contains multiple inference instances and a global KVCache pool built on DRAM/SSD, supporting load balancing and data reuse [9].
2. **Request Buffer**: Serves as the unified entry point for all rollout requests and manages their metadata and state so that resources can be scheduled precisely [10].
3. **Context Manager**: Maintains a context view for every request and generates scheduling decisions from context signals [11].

Key Technologies
- **Divided Rollout**: Breaks responses down into independent requests, and requests into segments, which reduces memory fluctuations and load imbalance [12][13].
- **Context-Aware Scheduling**: Uses a "speculative request" strategy that prioritizes obtaining length information for each group of requests, so that long requests can be started earlier and cause less tail delay [17].
- **Adaptive Grouped Speculative Decoding**: Exploits the similar response patterns within a group to build a dynamic reference library for generating drafts, improving decoding efficiency [19].

Experimental Validation
- In experiments with the Moonlight, Qwen2-VL-72B, and Kimi-K2 models, Seer increased throughput by 74% to 97% over the baseline system veRL and significantly reduced long-tail latency [21][23].
- For instance, in the Moonlight task, the last 10% of requests took 3984 seconds with veRL, while Seer reduced this to 364 seconds, a reported 85% reduction in long-tail latency [23].
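The context-aware scheduling idea described under Key Technologies can be illustrated with a toy sketch. The summary gives no implementation detail, so this is only one plausible reading, not Seer's actual scheduler: it assumes one request per group is dispatched first as a probe, that the probe's observed length is a usable estimate for the rest of its group, and that the remaining requests are then dispatched longest-estimate-first. The names `Request`, `schedule`, and `true_length` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Request:
    group_id: int     # prompt group this response belongs to
    index: int        # position of the response within its group
    true_length: int  # unknown to a real scheduler; stands in for the probe's observed length


def schedule(requests):
    """Return a dispatch order: one probe per group, then the rest by estimated length.

    Assumption (not from the source): the first request seen in each group
    acts as the "speculative request" whose length is observed first.
    """
    probes = {}  # group_id -> probe request (dicts keep insertion order)
    rest = []
    for req in requests:
        if req.group_id not in probes:
            probes[req.group_id] = req
        else:
            rest.append(req)

    # Phase 1: probes run first; their observed length becomes the group estimate.
    estimate = {gid: probe.true_length for gid, probe in probes.items()}

    # Phase 2: dispatch the remaining requests longest-estimate-first, so likely
    # long requests start early instead of forming the tail.
    rest.sort(key=lambda r: estimate[r.group_id], reverse=True)
    return list(probes.values()) + rest


demo = [
    Request(0, 0, 10), Request(0, 1, 12),
    Request(1, 0, 100), Request(1, 1, 90),
    Request(2, 0, 50), Request(2, 1, 55),
]
order = [(r.group_id, r.index) for r in schedule(demo)]
print(order)  # [(0, 0), (1, 0), (2, 0), (1, 1), (2, 1), (0, 1)]
```

In this toy run, the sibling of the long probe (group 1) is dispatched before the siblings of the shorter groups, which is the effect the "speculative request" strategy is meant to achieve.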
Financing and Future Plans
- Moonshot AI is reportedly close to completing a new funding round that could raise several hundred million dollars and lift its valuation to $4 billion [32][33].
- The company is in discussions with investment firms, including IDG Capital and existing shareholder Tencent. It plans to close the round by the end of the year and begin an IPO process the following year [36][37].