Post-Training Paradigm
Stop LLMs from Overthinking! Shanghai AI Lab's New Post-Training Paradigm Reshapes CoT for Faster, Better Reasoning
QbitAI (量子位) · 2025-12-21 02:00
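Contributed by the RePro team | QbitAI (WeChat official account: QbitAI)
In recent years, with the explosive rise of models such as o1 and DeepSeek-R1, Long Chain-of-Thought (Long CoT) has become standard for improving the complex reasoning ability of LLMs. Yet "long thinking" is not always perfect. Models often fall into the trap of "overthinking": to reach a simple conclusion, a model may generate thousands of redundant tokens, or even backtrack repeatedly along a wrong path. This wastes compute and increases inference latency. How can a model "think deeply" while staying "mentally agile"?
Recently, a research team at the Shanghai Artificial Intelligence Laboratory proposed a new post-training paradigm: RePro (Rectifying Process- ...
The paper treats the reasoning process as an optimization of the model's internal state, which offers a new perspective on how to reshape an LLM's CoT.
Core observation: reasoning as optimization. RePro rests on one core idea: treat the model's reasoning trajectory as a path that searches for the optimum on a loss surface.
RePro's three "rectification" mechanisms. Building on this view, RePro designs a process reward mechanism that is embedded directly into the RLVR pipeline (e.g., PPO, GRPO).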
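The snippet states that a process reward is embedded into an RLVR pipeline such as GRPO, but it does not spell out how the reward is computed. As a rough illustration only, the sketch below shows one generic way a process-level term could be added to a verifiable outcome reward before GRPO-style group normalization. The names `combined_reward`, `process_score`, and `beta` are hypothetical placeholders and are not taken from the RePro paper; population standard deviation is used for simplicity.

```python
from statistics import mean, pstdev


def combined_reward(outcome_correct: bool, process_score: float, beta: float = 0.2) -> float:
    """Verifiable outcome reward plus a weighted process-level term.

    ASSUMPTION: process_score (in [0, 1]) and beta are illustrative
    placeholders, not RePro's actual reward definition.
    """
    outcome = 1.0 if outcome_correct else 0.0
    return outcome + beta * process_score


def group_relative_advantages(rewards, eps=1e-6):
    """GRPO-style normalization: standardize rewards within one prompt's sample group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled responses to the same prompt:
# (is the final answer correct?, hypothetical process score)
samples = [(True, 0.9), (True, 0.3), (False, 0.5), (False, 0.1)]
rewards = [combined_reward(ok, score) for ok, score in samples]
advantages = group_relative_advantages(rewards)

for (ok, score), adv in zip(samples, advantages):
    print(f"correct={ok!s:5} process_score={score:.1f} advantage={adv:+.3f}")
```

Under these assumptions, two responses with the same correct answer receive different advantages when their process scores differ, which is the general way a process-level signal can discourage redundant or backtracking reasoning without changing the outcome check.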
Professor Xiao Yanghua: How Far Is Embodied Intelligence from "Emergence"? | AI&Society 100 People, 100 Questions
Tencent Research Institute · 2025-06-27 06:59
Core Viewpoint
- The article discusses the transformative impact of generative AI and embodied intelligence on technology, business, and society, emphasizing the need for a multi-faceted exploration of AI's opportunities and challenges [1].
Group 1: AI Development Trends
- The development of AI in recent years has followed two clear trajectories: generative AI (AIGC) and embodied intelligence [5][9].
- Generative AI aims to equip machines with human-like cognitive abilities, while embodied intelligence focuses on enabling machines to mimic human sensory and action capabilities [10][11].
- The current AI landscape highlights the importance of data quality and training strategies over sheer data volume and computational power [6][19].
Group 2: Embodied Intelligence
- The next phase of embodied intelligence is expected to involve mind-body coordination, reflecting the philosophical inquiry into how human-level intelligence arises [6][11].
- The application of embodied intelligence in consumer markets hinges on the machine's ability to empathize and understand human emotional needs [6][10].
- There is a significant gap in the data required for embodied intelligence to reach its potential, with current datasets lacking the scale necessary for generalization [7][24].
Group 3: AI as a Technological Revolution
- Generative AI is characterized as a technological revolution based on three criteria: foundational nature, exponential productivity enhancement, and profound societal impact [13][14].
- The societal implications of AI's cognitive capabilities are vast, potentially affecting all human activities and leading to concerns about cognitive laziness among humans [14][16].
- In contrast, the impact of embodied intelligence on productivity is seen as limited compared to the cognitive advancements of generative AI [15][16].
Group 4: Data and Model Relationships
- The relationship between model algorithms and data is crucial, with algorithms determining the lower limit of model performance and data defining the upper limit [20][21].
- The current focus in AI development is on enhancing data quality and training strategies, particularly in the context of embodied intelligence [19][22].
- The industry faces challenges in data acquisition for embodied intelligence, necessitating innovative approaches to data collection and synthesis [25][26].
Group 5: Future Directions
- To overcome the data scarcity in embodied intelligence, strategies such as leveraging real, simulated, and synthetic data are being explored [25][26].
- The development of wearable devices capable of capturing real-world actions could provide a substantial data foundation for embodied intelligence [26].
- The complexity of human experience and environmental interaction presents significant challenges for the data-driven advancement of embodied intelligence [34][35].