Post-Training
Building the GitHub for RL Environments: Prime Intellect's Will Brown & Johannes Hagemann
Sequoia Capital· 2026-02-10 13:00
If data is the bottleneck, if having the real expertise is the bottleneck, like, would you rather have the smartest person in history work at your company or someone who's been there for 30 years? Sometimes you really want the person who's been there for 30 years. There's a lot of expertise that comes from really understanding a problem deeply and interacting with it over a long time. And this is really what happens in training that is almost impossible to replicate in a short prompt. You really want the abili ...
Will Mid-Training Become the Pre-Training of the Future?
机器之心· 2025-11-23 01:30
Group 1: Core Concepts of Mid-Training
- The concept of "Mid-Training" is emerging as a potential new phase in the training of large language models (LLMs), positioned between pre-training and post-training, with OpenAI establishing a dedicated department for it in July 2024 [5][6][7]
- Mid-Training is described as a vital stage that enhances specific capabilities of LLMs, such as mathematics, programming, reasoning, and long-context extension, while maintaining the foundational abilities of the model [9][10]
- The definition and implementation of Mid-Training are still not universally agreed upon, with various organizations exploring its effects and mechanisms, indicating a growing interest in this area [8][11]

Group 2: Technical Insights and Strategies
- Research from Peking University and Meituan has attempted to clarify the definition of Mid-Training, focusing on data management, training strategies, and model architecture optimization [8][10]
- Key optimization strategies for Mid-Training include data curation to enhance data quality, training strategies like learning rate annealing and context extension, and architecture optimization to improve model performance [10]
- The exploration of Mid-Training has gained momentum since 2025, with increasing references in research papers from institutions like Microsoft and Zero One [6][7]