Will Mid-Training Become the Pre-Training of the Future?
机器之心· 2025-11-23 01:30
Group 1: Core Concepts of Mid-Training
- "Mid-Training" is emerging as a possible new phase in the training of large language models (LLMs), positioned between pre-training and post-training. OpenAI established a dedicated department for it in July 2024 [5][6][7]
- Mid-Training is described as a stage that strengthens specific LLM capabilities, such as mathematics, programming, reasoning, and long-context extension, while preserving the model's foundational abilities [9][10]
- There is still no universally agreed definition or implementation of Mid-Training. Various organizations are exploring its effects and mechanisms, which indicates growing interest in the area [8][11]

Group 2: Technical Insights and Strategies
- Research from Peking University and Meituan has attempted to clarify the definition of Mid-Training, focusing on data management, training strategies, and model architecture optimization [8][10]
- Key optimization strategies for Mid-Training include data curation to raise data quality, training strategies such as learning rate annealing and context extension, and architecture optimization to improve model performance [10]
- Exploration of Mid-Training has gained momentum since 2025, with a growing number of references in research papers from institutions such as Microsoft and 01.AI [6][7]
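To make the "learning rate annealing" strategy mentioned above more concrete, here is a minimal sketch of a warmup, plateau, and anneal schedule in plain Python. The summary does not specify any schedule, so everything here is an assumption for illustration: the function name `lr_at_step`, the cosine decay shape, and all default values (`peak_lr`, `final_lr`, `warmup_frac`, `anneal_frac`) are hypothetical and not taken from the cited research.

```python
import math


def lr_at_step(step, total_steps, peak_lr=3e-4, final_lr=3e-5,
               warmup_frac=0.01, anneal_frac=0.2):
    """Return the learning rate for a given step (illustrative only).

    Three phases, all with assumed proportions:
      1. linear warmup to peak_lr,
      2. constant plateau at peak_lr,
      3. cosine annealing from peak_lr down to final_lr.
    """
    warmup_steps = int(total_steps * warmup_frac)
    anneal_steps = int(total_steps * anneal_frac)
    anneal_start = total_steps - anneal_steps

    if step < warmup_steps:
        # Linear warmup: ramps up and reaches peak_lr on the last warmup step.
        return peak_lr * (step + 1) / warmup_steps
    if step < anneal_start:
        # Plateau: hold the peak learning rate.
        return peak_lr

    # Annealing phase: progress runs from 0.0 to 1.0.
    progress = min(1.0, (step - anneal_start) / max(1, anneal_steps))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return final_lr + (peak_lr - final_lr) * cosine


if __name__ == "__main__":
    total = 1000
    for s in (0, 9, 500, 800, 900, 1000):
        print(s, lr_at_step(s, total))
```

In this sketch the final 20% of steps stands in for an annealing phase, during which the learning rate decays smoothly. Where the annealing phase actually begins, and which data mixture is used during it, would depend on the specific training recipe.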
Electronics Industry Weekly Tracking Report: Q3 AI Earnings Continue to Deliver, Mid-Training Opens a New Phase of Structured Intelligence - 20251026
Soochow Securities· 2025-10-26 09:38
Investment Rating
- The report maintains an "Overweight" rating for the industry [1]

Core Insights
- The AI industry continues to show strong performance in Q3, with Mid-Training marking a new phase in structured intelligence [1]
- Companies in the AI supply chain have reported robust earnings, contributing to a positive market sentiment [2]
- The demand for AI infrastructure is increasing, as evidenced by Amphenol and Vertiv exceeding revenue guidance for Q3 [2][4]
- The PCB sector, represented by Shengyi Technology, has shown significant growth, indicating a sustained upward trend in AI PCB demand [10]

Summary by Sections

AI Performance and Market Sentiment
- Q3 AI performance has been strong, with notable stock price increases among key players in the AI supply chain [1]
- Companies like Shengyi Electronics reported a revenue increase of 153% year-on-year, reflecting the rising value of PCB in the AI computing cycle [10]

Company Earnings and Projections
- Amphenol's Q3 revenue reached $6.194 billion, a 53.35% increase year-on-year, driven by strong demand in AI servers and high-performance connectors [4]
- Vertiv's Q3 revenue was $2.676 billion, up 60% year-on-year, with a robust order backlog supporting future growth [8][9]

Mid-Training and AI Training Paradigms
- Mid-Training is emerging as a critical phase in AI training, focusing on capital efficiency and quality-driven model improvements [3][12]
- This new paradigm shifts the focus from merely increasing computational power to optimizing data quality and model performance [17][18]

Industry Trends and Future Outlook
- The demand for high-quality AI servers and related materials is expected to grow, with significant implications for the PCB and copper connection sectors [2][10]
- The trend towards high-power AI server cabinets is driving demand for advanced materials, benefiting companies with technological advantages [2][10]