Training
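A first-of-its-kind mid-training paradigm cracks the RL puzzle: Llama finally catches up with Qwen!
Jiqizhixin (机器之心) · 2025-06-30 09:49
Paper: https://arxiv.org/abs/2506.20512 Code repository: https://github.com/GAIR-NLP/OctoThinker Recently, a research paper from the Shanghai Innovation Institute and Shanghai Jiao Tong University has drawn wide attention across the AI community. The paper examines why different base language model families (such as Llama and Qwen) behave so differently under reinforcement learning (RL) training, and proposes a mid-training strategy that turns Llama into a reasoning base model well suited to RL. This markedly narrows the performance gap with Qwen models, which scale naturally under RL, and provides a key scientific basis and technical path for building the next generation of reasoning-capable AI systems. After its release the paper drew wide attention on social media. Wenting Zhao, a research scientist at Meta AI who will soon join UMass Amherst as an assistant professor, was among the first to praise it: "Truly impressed by how an academic lab just figured out a lot of mysteries in mid-training to close the RL gap betwee ...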
X @The Wall Street Journal
Vests used for decades in military-type training are now popular with middle-aged women and other power walkers. What does the research say? https://t.co/NwmXh4oqe2 ...
X @The Wall Street Journal
Vests used for decades in military-type training are now popular with middle-aged women and other power walkers. What does the research say? https://t.co/PmOoiklzrp ...
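From post-training back to pre-training: can LLM+RL go further in realizing its potential?
Jiqizhixin (机器之心) · 2025-06-28 05:22
1. From post-training back to pre-training: can LLM+RL go further in realizing its potential? It is all NPT; does using RL for pre-training hold greater potential? Why are there so few pre-trained models in reinforcement learning? What are the theoretical flaws of the most popular RL paradigm? What problems exist in the post-training RL implementations that have already shown results? 2. A roundup of recent "hot takes" from Silicon Valley AI leaders! Will a future ChatGPT subscription come with a humanoid robot? Why might AGI never be achieved? Why is AI more cost-effective than programmers? Are industry-specific large models really unnecessary? Is writing good tweets worth more than doing good research? How do OpenAI's and Nvidia's "AI factories" differ? The full edition of this issue contains 2 feature analyses + 29 news briefs from the AI & Robotics space: 11 on technology, 9 domestic (China), and 9 international. This issue totals 23,143 characters; the first 9% is free to read. Jiqizhixin PRO (机器之心PRO) · Member Newsletter, Week 26 --- This week we break down 2 AI & Robotics industry developments worth a close read --- ① LLM pre-training has an effectively unbounded demand for supervised data, needs that data to cover as many of the problems encountered as possible, and requires the supervision signal to be accurate so that the model is correct. ② Both requirements are hard to meet in practice, because high-quality human-annotated data ...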
SemiAnalysis: Gigawatt-Scale AI Training Load Fluctuations - Power Grid Load Risk
2025-06-26 14:09
AI Training Load Fluctuations at Gigawatt-scale: Risk of Power Grid Blackout? // 108GW Large Load Queue, Tesla Megapacks, Supercapacitors, Gigawatt-scale Batteries, PyTorch No Power Plant Blow Up. June 25, 2025. By Jeremie Eliahou Ontiveros, Ajey Pandey and Dylan Patel. The largest AI labs are racing to build multi-gigawatt-scale datacenters, and stressing our century-old power grid to an unprecedented extent. Not only is the scale massive, but AI training workloads have a very unique load profile, unexpectedly ris ...
Advanced Insights S2E4: Deploying Intelligence at Scale
AMD· 2025-06-25 17:00
Chris Gandolfo, EVP of OCI and AI Sales at Oracle, and Mark Papermaster explore what it really takes to train and deploy large language models at scale. From evolving compute needs and energy efficiency to the rise of inference, this episode dives deep into the future of enterprise AI infrastructure. 00:00 Series Intro & Host Introduction 00:12 Meet Chris Gandolfo – EVP at Oracle 01:19 AI at an Inflection Point – Oracle’s Perspective 04:48 Does Training Ever End? 05:27 OCI’s Late Entry Strategy – Learning f ...
Citi: Key Takeaways from the Dell'Oro Q2 2025 Data Center Capex Report
Citi · 2025-06-23 02:09
Flash | 17 Jun 2025 10:01:01 ET | 9 pages. US Communications Equipment: Dell'Oro 1Q25 Data Center Capex Report Highlights. CITI'S TAKE: We summarize key points from Dell'Oro Group's 1Q data center capex report. – 1Q market up >50% YoY to $134bn, driven by accelerated server spend; servers >50% of DC capex, accelerated > general purpose – General purpose infrastructure was up modestly – Limited impact from tariffs, except components. System vendors and hyperscale cloud SPs increased component inventory purchases ...
Embracing Pain to Unlock Potential | Saad Al-Habsi | TEDxAl Qurum
TEDx Talks· 2025-06-18 16:23
[Music] As I gasped for air, barely able to move after my 10k run, I thought maybe I'm not built for this. But then I realized pain was giving me a choice. I could quit or I could continue. What if I told you that pain is not our enemy but our greatest teacher? The very thing we spend our lives avoiding could be the thing to unlocking our achievement. We all experience pain physically, emotionally, mentally. But the real question is what if we chose to embrace pain? Could it be the gateway to something greater ...
Universal Technical Institute Celebrates First Graduating Classes from Aviation Program at Avondale and Long Beach Campuses
Prnewswire· 2025-06-18 13:15
The Airframe and Powerplant Technician program was launched at the Avondale and Long Beach campuses in 2023 and is designed to be completed in 18 months. The program provides students with the skills to diagnose, repair, and maintain aircraft and powerplant components. Students are prepared to apply and test for FAA mechanic certification upon graduation. Densen Pantal was among the ...
Why Tesla Could Have a Self-Driving Advantage
Bloomberg Technology· 2025-06-16 19:12
The kind of key headline from the React piece is that Tesla's vehicle cost is about one seventh of Waymo's. Why were you so focused on that, Steve? Well, I think that the critical piece is getting it on a commercial basis, on a mass-scale basis, and also the payback is important. If you can produce the cars at a fraction of the price, you know, you can get more of these vehicles out on the road. And also, you know, whoever is running them can actually make their money back much sooner. The is ...