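Single-stage LLM fine-tuning that supervises and reinforces at the same time, ending "memorize first, drill later" and lifting both reasoning and generalization | Chinese Academy of Sciences, Meituan, and others
量子位 (QbitAI) · 2025-07-02 02:02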
Core Viewpoint - The article introduces Supervised Reinforcement Fine-Tuning (SRFT), a single-stage method that combines supervised fine-tuning (SFT) and reinforcement learning (RL) to improve the reasoning performance of large language models (LLMs) [1][22].

Group 1: Methodology
- SRFT uses a dual-strategy design to exploit demonstration data: SFT for coarse-grained approximation of the behavior policy, and RL for fine-grained policy refinement [23][24].
- The method introduces an entropy-aware adaptive weighting mechanism that balances the influence of SFT and RL and keeps training dynamics stable [29][44].
- SRFT trains 2.28 times faster than traditional sequential (SFT, then RL) pipelines [21][44].

Group 2: Performance Results
- SRFT reaches an average accuracy of 59.1% across five mathematical reasoning tasks, outperforming the zero-RL baseline by 9.0% [4][47].
- On out-of-distribution tasks, SRFT reaches an average accuracy of 62.5%, surpassing the best baseline by 10.9% [4][47].
- The method generalizes better, with consistent improvements across benchmarks [47][48].

Group 3: Training Dynamics
- Training with SRFT is more stable and efficient, and response length grows gradually, which suggests a deeper reasoning process [48].
- SRFT keeps entropy more stable during training, which allows continued exploration; pure RL, by contrast, shows a rapid entropy decline [20][48].
- Analysis of training trajectories indicates that SRFT balances knowledge acquisition and self-exploration without drifting far from the initial model [15][45].
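As a rough illustration of the entropy-aware weighting idea in the SRFT summary above, the sketch below scales the SFT term by a factor that shrinks as policy entropy rises. The summary does not state the formula, so the exponential form, the `scale` value, and every function name here are assumptions for illustration, not the authors' implementation.

```python
import math

def token_entropy(probs):
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def mean_entropy(distributions):
    """Average entropy over a list of next-token distributions."""
    return sum(token_entropy(d) for d in distributions) / len(distributions)

def sft_weight(entropy, scale=0.5):
    # Assumed form: the weight decays exponentially with entropy, so the
    # demonstration (SFT) signal is damped when the policy is uncertain.
    return scale * math.exp(-entropy)

def combined_loss(sft_loss, rl_loss, entropy, scale=0.5):
    # Single-stage objective: both terms are optimized in the same step.
    return sft_weight(entropy, scale) * sft_loss + rl_loss

# Two toy policies over a 4-token vocabulary (hypothetical numbers).
confident = [[0.97, 0.01, 0.01, 0.01]] * 3
uncertain = [[0.25, 0.25, 0.25, 0.25]] * 3

for name, dists in (("confident", confident), ("uncertain", uncertain)):
    h = mean_entropy(dists)
    print(f"{name}: entropy={h:.3f}, sft_weight={sft_weight(h):.3f}")
```

In an actual training loop the weight would normally be treated as a constant with respect to gradients, so that it only rescales the SFT term rather than becoming an optimization target itself.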
Tsinghua's computer science "goddess" makes a run at an IPO
量子位 (QbitAI) · 2025-07-01 23:58
Core Viewpoint - The article covers the upcoming IPO of Meijia Technology, founded by Zhuang Li, which focuses on integrated domain control solutions for smart cockpits in the automotive industry [2][5].

Company Overview
- Meijia Technology was established in 2018 after Zhuang Li left NIO, and specializes in integrated domain control solutions for smart cockpits [3][10].
- The company aims to list on the Main Board of the Hong Kong Stock Exchange [2].

Market Position
- Meijia Technology ranks among the top five companies in China's integrated domain control solutions market, with approximately 9.3% market share as of 2024 [9][13].
- New installations of integrated domain control solutions in China's passenger cars are projected to reach 6.8 million units in 2024 [8].

Product Offerings
- The company provides a comprehensive solution covering smart cockpit, ADAS parking, ADAS driving, vehicle networking, and OTA upgrades, integrating multiple AI-driven functions [7][10].

Financial Performance
- Revenue surged from RMB 387.8 million in 2022 to RMB 1.513 billion in 2023, before declining slightly to an expected RMB 1.419 billion in 2024 [20][21].
- Gross profit for 2022, 2023, and 2024 was RMB 73.6 million, RMB 183.1 million, and RMB 309.3 million respectively, with gross margin improving from 12.1% in 2023 to 21.8% in 2024 [24].
- The company reported losses of RMB 422.9 million, RMB 356.6 million, and RMB 291.2 million for 2022, 2023, and 2024 respectively, with the loss rate decreasing each year [27].

Client Base
- As of 2024, Meijia has relationships with 12 major automotive manufacturers, up from 10 in 2023 and 7 in 2022 [30].
- The company has secured a total of 48 contracts with manufacturers including Chery, Changan, Dongfeng, and Ford [12].

Competitive Landscape
- The integrated domain control solutions market is marked by rapid technological change and aggressive pricing, with competition from large suppliers and global automotive electronics companies [16][17].
- Meijia aims to keep its competitive edge through superior customer support and continuous improvement of its offerings [17][18].

Investment and Valuation
- Meijia has completed six rounds of financing, raising a total of USD 230 million, with the latest valuation exceeding USD 930 million [36].
- The company's board includes Zhuang Li and other key executives, with Zhuang holding a 44.85% stake prior to the IPO [38].
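The margin figures quoted in the Meijia summary above can be checked directly from the revenue and gross profit numbers. The short sketch below does that arithmetic; it assumes "loss rate" means reported loss divided by revenue, which the summary does not define, and the variable names are purely illustrative.

```python
# Figures in RMB millions, copied from the summary above.
revenue = {2022: 387.8, 2023: 1513.0, 2024: 1419.0}
gross_profit = {2022: 73.6, 2023: 183.1, 2024: 309.3}
net_loss = {2022: 422.9, 2023: 356.6, 2024: 291.2}

def pct(numerator, denominator):
    """Ratio expressed as a percentage, rounded to one decimal place."""
    return round(100.0 * numerator / denominator, 1)

# Gross margin = gross profit / revenue.
gross_margin = {year: pct(gross_profit[year], revenue[year]) for year in revenue}

# Assumption: "loss rate" = reported loss / revenue.
loss_rate = {year: pct(net_loss[year], revenue[year]) for year in revenue}

for year in sorted(revenue):
    print(year, f"gross margin {gross_margin[year]}%", f"loss rate {loss_rate[year]}%")
```

The computed gross margins for 2023 and 2024 match the 12.1% and 21.8% quoted in the summary. The 2022 margin works out to about 19.0%, so the margin dipped in 2023 before recovering; under the assumed definition, the 2022 loss exceeded that year's revenue.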
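Meta's "USD 100 million salaries" shake Silicon Valley! Altman responds: there are always mercenaries, and none of them were top-tier anyway
量子位 (QbitAI) · 2025-07-01 23:58
By Lei Gang and Bai Jiao, from Niu'aofeisi. 量子位 | WeChat official account QbitAI

Crazy, just crazy.

For the past couple of days, the attention of Silicon Valley and the entire global AI field has been captured by Meta boss Mark Zuckerberg. After all, not just anyone can recruit in person as the boss and poach 8 core OpenAI employees in one sweep.

All the more so because most of them are Chinese researchers, known for being smart, capable, and hardworking.

But the latest leak says Zuckerberg is not recruiting on dreams alone. He is also offering the most sincere form of respect: annual pay of USD 100 million per person, plus priority, unrestricted access to the most advanced compute resources.

What does a USD 100 million salary mean?

LeBron James, the NBA's top active star, earns only a little over USD 50 million a year under his latest contract.

Zuckerberg is simply offering too much, and it has put all of OpenAI on high alert.

After a round of hinted raises and an arranged temporary break, the latest internal letter from OpenAI CEO Sam Altman has also come to light.

Besides lashing out at Zuckerberg and calling Meta's conduct distasteful, Altman took sarcastic, cutting shots at his departed former staff:

"Meta didn't get the top talent and had to settle for second best," "and of course there are always people who are only in it for the money..."

Harsh, very harsh. Whatever else is true, that must have hit a nerve.

Crazy Zuckerberg, sky-high poaching

How lavish are Zuckerberg's offers, and why could they carry off core OpenAI employees in a flash?

Citing sources, Wired revealed the inside story, saying that Zuckerberg, in recruiting for the newly formed "超级智 ...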
At China's first domestic robot football match, the busiest participant was the stretcher
量子位 (QbitAI) · 2025-07-01 23:58
Core Viewpoint - The article covers China's first robot football competition and the excitement around it, showcasing advances in AI and robotics through a 3v3 format in which robots play autonomously, without human control [1][3][10].

Group 1: Event Overview
- The competition took place in Beijing, with four teams from different universities competing [2].
- The event used the Accelerated Evolution T1 robot platform, which is designed for autonomous operation [7][69].

Group 2: Competition Details
- The competition consisted of three rounds, and Tsinghua University's Huoshen ("Fire God") team won the championship with a score of 5:3 [68].
- Each match had two 10-minute halves with a 5-minute halftime break, and each team fielded three players and one substitute [74].

Group 3: Technical Specifications
- The Accelerated Evolution T1 robot stands approximately 1.2 meters tall, weighs 20 kg, and has 23 degrees of freedom, allowing complex movements and autonomous recovery after a fall [69][72].
- The robots carry an Nvidia AGX Orin chip providing 200 TOPS of AI computing power, which supports tasks such as visual recognition and path planning [72][78].

Group 4: Performance Insights
- Despite these advances, the robots showed clear limitations, including difficulty getting up after falling and frequent scoring errors, which points to ongoing challenges in motion control and decision-making [79].
- The event highlighted room for improvement in robot football, with the aspiration that a fully robotic team could compete in the World Cup by 2050 [80].
At China's first domestic robot football match, the busiest participant was the stretcher
量子位 (QbitAI) · 2025-07-01 09:25
Core Viewpoint - The article covers China's first robot football competition and the excitement around it, showcasing advances in AI and robotics through a competitive format that emphasizes autonomous decision-making and teamwork among robots [1][3][10].

Group 1: Event Overview
- The first robot football competition, referred to as "机超," took place in Beijing, with four teams from different universities competing in a 3v3 format [2][3].
- The competition used the Accelerated Evolution T1 robot platform, which operates autonomously without remote control and relies on AI for strategy and execution [4][69].

Group 2: Competition Details
- The Tsinghua Fire God (Huoshen) team emerged as champion after three rounds of matches, showing effective teamwork and strategy [7][68].
- The matches featured frequent incidents, including robots falling and needing assistance, which highlights current limitations in robot mobility and control [5][79].

Group 3: Technical Specifications
- The Accelerated Evolution T1 robot stands approximately 1.2 meters tall, weighs 20 kg, and has 23 degrees of freedom, allowing complex movements and actions [69].
- The robots carry an Nvidia AGX Orin chip providing 200 TOPS of computing power, which supports tasks such as visual recognition and autonomous navigation [72][73].

Group 4: Future Aspirations
- The organizers aim for a fully robotic team to win the World Cup by 2050, a long-term vision for robotics in sports [80].
- The article closes with a humorous note that robots might achieve football success before human teams do, a cultural commentary on sports in China [81][82].
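A historic moment for Chinese GPUs! Moore Threads and MetaX have IPO applications accepted on the same day
量子位 (QbitAI) · 2025-07-01 07:29
By Meng Chen, from Aofeisi. 量子位 | WeChat official account QbitAI

Chinese GPUs have reached a historic moment!

With the first half of 2025 barely over, the STAR Market IPO applications of two companies, Moore Threads (摩尔线程) and MetaX (沐曦股份), were accepted on the same day.

Each company is telling the story of the rise of Chinese GPUs in its own way:

Moore Threads founder Zhang Jianzhong (张建中) previously served as global vice president of Nvidia and general manager for China. In more than 15 years at Nvidia, he built a complete GPU ecosystem in China and developed Greater China into one of Nvidia's most important markets worldwide.

Moore Threads' roadmap targets both the data center (business) market and the consumer gaming graphics card market, an attempt to match the industry leader across the board and build a broad ecosystem.

MetaX's founding team, by contrast, has AMD in its DNA. Founder Chen Weiliang (陈维良) was formerly senior director of GPU design at AMD and AMD's global head of GPU SoC design.

Peng Li (彭莉), CTO and chief hardware architect, was AMD's first female scientist (Fellow) of Chinese descent worldwide. Yang Jian (杨建), CTO and chief software architect, worked at AMD for 14 years and also served as chief GPU architect at Huawei HiSilicon.

The core team averages nearly 20 years of end-to-end R&D experience in high-performance GPUs, with particular strength in GPU architecture definition, IP design, and SoC design.

MetaX's roadmap focuses more on the data center market, entering through the fastest-growing segment, general-purpose GPU computing (GPGPU), and gradually expanding into graphics rendering. ...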
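Controlling phones and computers with your face, turning sign language into speech... these AI projects are redefining "accessibility"
量子位 (QbitAI) · 2025-07-01 07:29
By Xi Feng, from Aofeisi. 量子位 | WeChat official account QbitAI

More than 30 million people in China live with hand impairments. Some have hand disabilities; others have impaired hand function caused by conditions such as ALS, cerebral palsy, or stroke.

For them, using a phone or a computer is a major difficulty in the internet age.

In response, a project called "面面俱控: AI Empowering a New Era for People with Hand Impairments" came into being.

The 面面俱控 team developed "面面俱控," the first product in China for operating phones and computers by face. It uses face recognition to capture facial movements and simulate phone gestures and PC mouse operations.

For example, users can assign actions such as opening the mouth or raising the eyebrows to different operations on a phone or computer, and voice control has also been implemented:

A new use case for AI glasses! They benefit people with hearing impairments and let them "chat freely" with the world.

People with physical disabilities have a new AI tool as well: with facial movements such as opening the mouth, raising the eyebrows, or blinking, they can operate phones and computers without barriers.

At a time when large AI models are competing on industry adoption and efficiency, a group of young people has carried AI out of the world of code and into public-interest work that is closest to people's lives.

Beyond people with hand impairments, the hearing-impaired community is an equally essential part of technology for public good.

There are 250 million people with hearing impairments worldwide, and in 2023 China's hearing-impaired population reached 27.8 million.

Using Tencent Cloud's AI and large-model products, the 知音助聋 team launched the project "AI不释手: 知音 Opens Up Barrier-Free Living for People with Hearing Impairments." They developed AR subtitle and sign-language glasses that can turn able-bodied people's ...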
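Three fusion bets in four years: Silicon Valley giants are scrambling for electricity
量子位 (QbitAI) · 2025-07-01 07:29
By Wen Le, from Aofeisi. 量子位 | WeChat official account QbitAI

The current trend is AI, and the next trend after AI is electricity. More precisely, nuclear power.

In an era of runaway AI growth, demand for compute is climbing exponentially. To power their furiously running servers, the big companies are already planning ahead and have set their sights on nuclear fusion.

This is already Google's third investment in the nuclear power sector in four years.

New progress in Google's partnership with fusion company CFS

Google's investment in fusion dates back to 2021.

In 2021, Google took part in CFS's Series B round, backing high-temperature superconducting tokamak technology. In 2023 it led TAE Technologies' Series E round, supporting breakthroughs in its field-reversed configuration (FRC) technology. In 2025 it invested in both companies again and signed a fusion power purchase agreement with CFS.

Under the agreement, CFS will deliver 200 megawatts of electricity to Google from Arc, its first power plant, which is expected to begin operating in 2030.

Google is the plant's first customer and will also have the right to buy power from the company's other plants in the future.

In the latest news, tech giant Google announced additional funding for fusion startup Commonwealth Fusion Systems (CFS) and signed an agreement with the company to purchase electricity.

The CFS team was spun out of the Massachusetts Institute of Technology (MIT), has raised more than USD 2 billion to date, and is the world's largest and most closely watched nuclear ...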
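Huawei open-sources another big one: a playbook for ultra-large-scale MoE inference
量子位 (QbitAI) · 2025-07-01 05:30
By Jin Lei, from Aofeisi. 量子位 | WeChat official account QbitAI

How should ultra-large-scale MoE models (such as DeepSeek) be served for inference so that they are both fast and stable?

That question now seems to have a standard answer:

A new Huawei project has open-sourced the architecture, techniques, and code behind ultra-large-scale MoE inference, all of it!

The newly open-sourced project is called Omni-Infer, and on the whole it is very good news for enterprise users.

For example, it can provide enterprises with a PD (prefill/decode) separated deployment scheme, applies system-level optimization targeting QPM, and will share its "methodology" for hardware use in large-scale commercial deployment.

Among developers and the open-source community, too, Huawei's call has met with a ready response.

Lin Yonghua (林咏华), vice president and chief engineer of 北京智源研究院 (Beijing Academy of Artificial Intelligence, BAAI), said:

BAAI has long been committed to building the open-source AI ecosystem, and we are delighted to see the Omni-Infer project open-sourced. FlagScale, the multi-chip framework built by the BAAI team, integrated Omni-Infer right away, and we look forward to more ecosystem collaboration.

[Screenshot: FlagOpen / FlagScale public repository page] ...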
Taotian's offbeat tech festival: AI Werewolf, poster roadshows, and the Bojian Society take turns on stage
量子位 (QbitAI) · 2025-07-01 03:51
Core Viewpoint - The "Hardcore Youth Technology Festival" organized by Taotian Group has grown into a significant showcase of technological progress, particularly in AI, and reflects the company's commitment to practical, innovative applications of technology [1][2][29].

Group 1: Event Overview
- The fourth edition of the "Hardcore Youth Technology Festival" took place from June 30 to July 4 and focused on practical technology rather than traditional presentations [1][2].
- The festival included an AI exhibition, AI communication sessions, an AI open day, and AI competitions, with an emphasis on hands-on demonstrations and interaction [3][4].

Group 2: AI Exhibition
- The AI exhibition served as a large technology marketplace, presenting nearly 40 recent results from Taotian Group's AIGX technology system through poster presentations [8][10].
- The AIGX system is closely integrated with e-commerce scenarios and covers operational needs such as indexing, recommendation, bidding, auctioning, creative generation, and data management [9][11].

Group 3: AI Communication
- The "Bojian Society" was established to share technological achievements and trends and to facilitate discussion between academia and industry [16][19].
- This year the event held separate sessions for internal group exchange and academic exchange, focusing on "multimodal intelligence" and on collaboration between industry leaders and academic experts [18][19].

Group 4: AI Competitions
- The competition segment included "AI Hackathon 3.0" and an "AI Werewolf" game in which participants trained AI agents to play various roles, exercising language understanding and strategic reasoning [20][24].
- The AI Werewolf game was designed to challenge AI agents in a social deduction setting, with an emphasis on language generation and logical reasoning [25][26].

Group 5: Technological Advancements
- Taotian Group announced significant progress in its AIGX technology system, including the launch of its self-developed recommendation model RecGPT, which improves user experience by predicting needs from historical data [34][37].
- The rollout of RecGPT has led to double-digit growth in click rates and a 5% increase in add-to-cart actions [39][41].

Group 6: Organizational Philosophy
- The festival reflects Taotian Group's long-term commitment to embedding AI in business processes, with a focus on practical applications rather than short-term trends [44][45].
- The event combines youthful energy with craftsmanship, showing the company's dedication to continuous improvement and innovation in technology [58].