量子位
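A Chinese large model makes the cover of Nature for the first time! DeepSeek discloses for the first time: training R1 cost only 2 million yuan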
量子位· 2025-09-18 00:51
Core Insights
- DeepSeek has become the first Chinese large model company to be featured on the cover of Nature, with founder Liang Wenfeng as the corresponding author [2][3]
- The R1 model has been recognized for its innovative approach, achieving significant performance improvements in reasoning tasks through a pure reinforcement learning framework [19][20]

Group 1: Achievements and Recognition
- DeepSeek's R1 model is the first large language model to undergo peer review, marking a significant milestone in the field [5]
- The model has garnered 3,596 citations on Google Scholar and has been downloaded 10.9 million times from Hugging Face, indicating its widespread acceptance and use [7]
- The training cost of R1 is approximately $294,000, significantly lower than competitors that often exceed $10 million, challenging the notion that high investment is necessary for top-tier AI models [12][13]

Group 2: Training and Data
- R1 was trained using 512 H800 GPUs for 198 hours, with a total training cost of $294,000 [10][11]
- The dataset for R1 includes five types of data: Math, Code, STEM, Logic, and General, with a total of 126,000 prompts [15][18]
- The model's training involved a combination of cold-start data, reinforcement learning, and supervised fine-tuning, enhancing its reasoning capabilities [25][26]

Group 3: Performance Metrics
- DeepSeek-R1-Zero achieved a pass@1 score of 71.0% in AIME 2024, significantly improving from 15.6% [21]
- In comparison to other leading models, DeepSeek-R1 demonstrated competitive performance across various benchmarks, including MATH-500 and LiveCode [23][30]
- The distilled models from DeepSeek-R1 outperformed direct applications of reinforcement learning on the base model, showcasing the effectiveness of the training approach [29]

Group 4: Safety and Transparency
- DeepSeek has released a detailed safety assessment of the R1 model, indicating a moderate inherent safety level comparable to GPT-4o [18][22]
- The company has embraced transparency by open-sourcing the model weights for DeepSeek-R1 and DeepSeek-R1-Zero on Hugging Face, promoting community engagement [30]
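
The training figures quoted in this summary (512 H800 GPUs, 198 hours, $294,000) invite a quick sanity check. The sketch below takes those numbers at face value and assumes they all describe the same run, which the summary does not confirm. The function names are illustrative, and the implied hourly rate is a derived figure, not one the source reports.

```python
def gpu_hours(num_gpus: int, wall_clock_hours: float) -> float:
    """Total GPU-hours consumed: number of GPUs times wall-clock hours."""
    return num_gpus * wall_clock_hours


def implied_rate_usd(total_cost_usd: float, num_gpus: int, wall_clock_hours: float) -> float:
    """Cost per GPU-hour implied by a total cost and a run size."""
    return total_cost_usd / gpu_hours(num_gpus, wall_clock_hours)


if __name__ == "__main__":
    # Figures as stated in the summary; treating them as one run is an assumption.
    total = gpu_hours(512, 198)
    rate = implied_rate_usd(294_000, 512, 198)
    print(f"GPU-hours: {total:,.0f}")
    print(f"Implied cost per GPU-hour: ${rate:.2f}")
```

If the stated cost actually covers more than the 198-hour run, the real per-hour rate would be lower than the value printed here.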
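AI dominates the ICPC World Finals! A GPT-5 ensemble system tops the board with all 12 problems solved, leaving humans to fight over third place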
量子位· 2025-09-18 00:51
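梦晨 (Mengchen), reporting from Aofeisi
量子位 | WeChat official account: QbitAI

College students have it rough this year: after fighting their way into the world finals of a programming contest, they still get shown up by AI.

At the just-concluded 2025 International Collegiate Programming Contest (ICPC) World Finals, OpenAI's system solved all 12 problems perfectly and would have ranked first had it been counted in the standings. Google's Gemini 2.5 Deep Think model solved 10 problems, reaching gold-medal level and ranking second.

The top-tier event brought together 139 elite teams from nearly 3,000 universities in 103 countries. The AI systems competed in a separate "AI experimental track" supervised by ICPC officials, facing the same problems and judging criteria as the human contestants, and performed strikingly well.

One of the harder problems, "Problem C", was not solved by any university team, yet both Gemini and OpenAI's model ensemble solved it.

[Scoreboard excerpt: columns Rank, Name, Solved, Time, followed by one column per problem starting at A; the only row shown begins with St. Petersburg State University ...]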
The Doubao large model is getting into cars! SAIC Roewe is first to reach a new inflection point for the AI smart cockpit
量子位· 2025-09-17 12:09
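一凡 (Yifan), reporting from Aofeisi
量子位 | WeChat official account: QbitAI

The Doubao deep-thinking large model has crossed over into cars.

This is hardly unexpected. AI is reshaping the car inside and out across the whole chain; the AI smart cockpit in particular gives users a new experience, and deep thinking brings powerful intelligence and convenience, none of which is possible without large models.

Doubao's deep thinking and reasoning capabilities are exactly the intelligent foundation that many automakers urgently need.

What did surprise many, however, is that the first to launch with the Doubao large model is SAIC Roewe: an auto giant and a major internet company joining hands to push the AI smart cockpit into a new stage.

Without deep thinking, how can one speak of an AI smart cockpit?

AI is comprehensively reshaping the car, so that a vehicle is no longer just a means of transport but can also be a mobile living space. In that space, users want AI to provide smart, convenient services, ideally personalized for each individual.

To meet this demand, the concept of the AI smart cockpit emerged and all kinds of cockpit features appeared, good and bad mixed together, which raised a new question:

What is a true AI smart cockpit?

I sit in the car and say "turn on the air conditioning", and the head unit turns it on; then I call out "adjust the temperature", and the air conditioning is set to the specified temperature.

That is convenient, but it is not intelligent, and it certainly cannot be called an AI smart cockpit.

In this situation the cockpit only passively accepts commands and cannot actively perceive the intent behind them. It can only map single commands and cannot execute complex operations. It has only momentary memory, with no long-term memory.

This causes users, when expressing ...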
Tencent discloses that Yuanbao is already a top-3 app
量子位· 2025-09-17 11:06
Core Viewpoint
- Tencent is making significant strides in both consumer and business sectors with its AI products, showcasing impressive user engagement and technological advancements while also expanding its global infrastructure with a substantial investment in Saudi Arabia [1][19][24].

Group 1: Consumer Product Developments
- Tencent Yuanbao has become one of the top three AI-native applications in China, with its daily question volume now matching what it previously received in an entire month [5][4].
- The AI meeting summary feature in Tencent Meeting has seen user growth of over 150% in one year [8].
- Hunyuan has launched over 30 models in a year, with the Hunyuan 3D model exceeding 2.6 million downloads [10][12].

Group 2: Business Integration and Applications
- Tencent is successfully carrying its consumer products over to the business sector, with examples like Tencent Cloud CodeBuddy, which generates 50% of new code inside Tencent [18].
- Companies like Midea and AstraZeneca are leveraging Tencent's AI capabilities to enhance operational efficiency and service delivery [18].

Group 3: Global Expansion and Investment
- Tencent Cloud is not merely exporting products but is taking a validated ecosystem abroad, including audio-video technology and mini-program platforms [20][21].
- The company announced a $150 million investment to build a new data center in Saudi Arabia, aiming to enhance its global digital infrastructure [24][19].
- Tencent's strategy emphasizes increasing industrial efficiency through smart solutions and expanding revenue through global outreach [27].
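Xiaohongshu publicly reveals its AI technology system for the first time, going all out for its largest-ever campus recruitment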
量子位· 2025-09-17 11:06
Core Insights
- Xiaohongshu announced its largest-ever campus recruitment for 2026, opening eight major job categories, with a significant increase in technical positions, which surged by 2.5 times [1][3].

Group 1: Recruitment and Talent Development
- The company is in a rapid growth phase, necessitating a large influx of talent due to the emergence of new businesses and functions [3].
- Xiaohongshu places high importance on the potential and growth of campus recruits, as past recruits have quickly developed into key business personnel, reinforcing the commitment to invest in campus recruitment and training [3][42].
- The "Shu Guang Plan" is a two-year growth program for all campus recruits, aimed at helping them quickly understand the company culture and integrate into the organization [46][50].

Group 2: AI Technology System
- Xiaohongshu's AI technology system is divided into five key components, which support its large UGC community of over 350 million monthly active users [10][8].
- The AI infrastructure provides the necessary support for efficient operation of AI models and technologies, enhancing user experience and content accuracy [16].
- The search and recommendation algorithms emphasize community interaction and personalized user experiences, moving beyond traditional keyword matching [15][23].

Group 3: Career Guidance and Skills Development
- During the live session, experts emphasized that potential is more important than experience for young job seekers, highlighting the value of learning and dedication [34][35].
- The balance between cutting-edge research and practical application in the AI field was discussed, with a focus on the greater opportunities in commercial applications compared to academic exploration [38].
- Xiaohongshu encourages recruits to find their interests and develop unique value while remaining aware of external developments in the industry [39].
Zhihui Jun's robot steals the show: a world-first demo of the Webster flip that "every real man must master"
量子位· 2025-09-17 11:06
Core Viewpoint
- The article highlights the achievement of the Lingxi X2 robot, which has become the first robot globally to complete a Webster flip, a complex acrobatic maneuver that demonstrates advanced capabilities in robotics [1][7].

Group 1: Robot Capabilities
- The Lingxi X2 robot stands approximately 1.3 meters tall and has 25-31 degrees of freedom, although it gave up 2 degrees of freedom when its head was removed for the Webster flip [13][14].
- The robot can perform basic movements like running and can navigate various terrains without the need for navigation systems, showcasing its autonomous obstacle avoidance capabilities [16][19].
- Executing the Webster flip successfully required overcoming significant challenges, including highly complex dynamics, real-time perception and feedback, and demanding hardware reliability [23][24].

Group 2: Technological Innovations
- The achievement is attributed to the Lingchuan platform, an AI-enhanced tool for creating robot motions and expressions that allows robot movements to be designed and further developed [20][19].
- The robot's motion capabilities are based on a reinforcement learning strategy that uses human video data to train its movements, ensuring precise execution in real-world scenarios [24].

Group 3: Future Developments
- The Lingxi X2 series includes other models such as Lingxi X2-W and Lingxi X2-N, which are designed for different operational capabilities, including task intelligence and adaptability to various terrains [26][34].
- The company plans to scale production of the Lingxi X2 by the second half of 2025, with an expected output of several thousand units by the end of 2026 [36].
AI instantly "cuts" the part you want out of live video! Give it text, an image, or a video clip and it understands in seconds | ICCV2025
量子位· 2025-09-17 11:06
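Contributed by the OVG-HQ team
量子位 | WeChat official account: QbitAI

Still spending ages hunting for a specific event in live video? The latest technology works like a cheat code.

Imagine a security camera feed in which a few figures flash by briefly; with the new technology, the exact clip of this "suspicious gathering" can be pulled up within seconds.

△ Image generated by AI

In a VR training ground, you put on VR glasses to practice shooting, having entered in a phone app beforehand: "locate actions similar to this demonstration video (a clip of Curry's perfect three-pointer)". Once training begins, the glasses quietly analyze the first-person video stream in the background on every shot. When your motion, force, and arc closely resemble Curry's three-pointer, the glasses immediately highlight that clip in the virtual interface.

△ Image generated by AI

Without further ado, this is a new task proposed by a research team from Shenzhen MSU-BIT University and the University of Adelaide.

It is called Online Video Grounding with Hybrid-modal Queries (OVG-HQ).

In plain terms, the technique lets a system, while it is still streaming or recording, use the various "clues" you provide (text, a reference image, a demonstration video clip, or a combination of these) to instantly find and precisely cut out the complete event you care about from the live video stream.

The paper has been accepted to ICCV 2025.

"Offline" is a fatal flaw: mainstream techniques must wait until the video has been fully recorded before they can do any work. Analysis after the fact is mere hindsight and cannot meet the security sector's "second-level ...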
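
To make the online-versus-offline distinction concrete, here is a minimal sketch of emitting event segments while frames are still arriving. It is not the OVG-HQ method: it assumes some hypothetical upstream model already produces a per-frame relevance score against the query, and it swaps in plain thresholding as the grounding rule. The function name, `threshold`, and `min_len` are all illustrative.

```python
from typing import Iterable, Iterator, Optional, Tuple


def online_segments(
    scores: Iterable[float],
    threshold: float = 0.5,
    min_len: int = 2,
) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) frame indices, inclusive, as soon as each segment closes.

    `scores` is a stream of per-frame relevance scores from a hypothetical
    query-matching model; frames are consumed one at a time, never revisited.
    """
    start: Optional[int] = None
    last = -1
    for i, score in enumerate(scores):
        last = i
        if score >= threshold:
            if start is None:
                start = i  # a candidate segment opens on this frame
        elif start is not None:
            if i - start >= min_len:
                yield (start, i - 1)  # segment closed on the previous frame
            start = None
    # The stream ended while a segment was still open.
    if start is not None and last - start + 1 >= min_len:
        yield (start, last)


if __name__ == "__main__":
    stream = [0.1, 0.7, 0.8, 0.9, 0.2, 0.6, 0.1, 0.9, 0.9]
    for segment in online_segments(stream):
        print(segment)
```

Because the generator yields each segment the moment it closes, a caller can react before the recording ends, which is the property an offline pipeline lacks.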
$39 billion: the world's highest valuation in embodied intelligence is here! Nvidia keeps raising its bet
量子位· 2025-09-17 11:06
Core Viewpoint
- Figure has made significant advancements in technology and financing after parting ways with OpenAI, achieving a post-financing valuation of $39 billion, the highest in the embodied intelligence sector to date [2][32].

Financing and Valuation
- Figure has successfully raised over $1 billion in Series C financing, leading to a post-money valuation of $39 billion [2][32].
- The financing round was led by Parkway Venture Capital, with participation from notable investors including Nvidia, Brookfield Asset Management, and Qualcomm Ventures [4].

Strategic Focus Areas
- The new funding will support Figure's development in three core areas [8].
- The first area is the large-scale penetration of humanoid robots into household and commercial scenarios, with plans to expand the production capacity of its BotQ manufacturing facility [9].
- The second area involves building next-generation GPU infrastructure to accelerate training and simulation for the Helix model [21].
- The third area focuses on launching advanced data collection projects to enhance the robot's understanding and operational capabilities in complex environments [21].

Technological Advancements
- Figure has introduced the Helix architecture, a visual-language-action model that allows robots to perceive, understand, and act like humans [17].
- Helix consists of two systems that communicate and are trained end-to-end, enabling the robot to perform various tasks with a single unified model [18].
- The recent funding will further enhance the capabilities of Helix, which is designed to optimize the performance of embodied intelligent AI systems [20].

Company Background
- Figure was founded in May 2022 by Brett Adcock, a serial entrepreneur [22].
- The company gained attention in the humanoid robotics sector after raising $675 million in Series B financing in February 2024, achieving a valuation of $2.6 billion at that time [22].
- Following a partnership with OpenAI, Figure decided to pursue vertical integration of its AI models, focusing on developing an end-to-end AI model tailored for specific robotic hardware [30][28].
@CEOs: why should your next personal assistant be human?
量子位· 2025-09-17 03:43
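鱼羊 (Yuyang) and 闻乐 (Wenle), reporting from Aofeisi
量子位 | WeChat official account: QbitAI

Even the job of the CEO's personal assistant is now in the sights of Agents.

It can independently produce a company-wide daily-briefing version of "今日头条" (Toutiao) every day, and it is the fully locally deployed, out-of-the-box kind:

The unit itself can even be carried around by the CEO.

That's right, the whole chassis is only A4-sized. Here is how it compares with an iPhone 15 Pro Max:

Without further ado, this newcomer is called the Zhiyue (智跃) Agent all-in-one machine. Interestingly, it is the first integrated software-and-hardware private Agent on the market built specifically for CEOs, so its target user is very clear.

True to the "first year of Agent applications", even new AI hardware has started to show some "personality".

So what is it all about? The 量子位 editorial team got to play CEO first. Let's test it hands-on and see what forms new AI hardware has evolved into in 2025:

An out-of-the-box "information management assistant"

Traditional all-in-one machines are already familiar: broadly a compute-plus-model supply approach, where after buying one you still basically have to assign it a dedicated development team.

By comparison, the Zhiyue Agent all-in-one machine is effectively a brand-new concept with a different positioning.

On the hardware side, it uses a compact 12 L chassis with a single 4090 card, which makes it arguably an ultra-miniaturized Agent solution.

All data processing and storage can be completed locally, without relying on external ...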
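Tencent Hunyuan open-sources a new AI painting framework: aligning with human intent across 24 dimensions so AI can understand complex instructions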
量子位· 2025-09-17 01:42
Core Viewpoint
- The article discusses the challenges faced by AI painting models in accurately interpreting human instructions and presents Tencent's PromptEnhancer framework as a solution to improve text-image alignment without modifying pre-trained models [2][4][12].

Group 1: Challenges in AI Painting
- AI painting models struggle with understanding concise user instructions, leading to inaccuracies in generated images [9][10].
- Common issues include chaotic attribute binding, ineffective negation commands, and failure to comprehend complex spatial relationships [10][11].

Group 2: PromptEnhancer Framework
- PromptEnhancer introduces a decoupled prompt optimization framework consisting of two main modules: CoT-based Rewriter and AlignEvaluator [12][14].
- The CoT-based Rewriter mimics human designers by breaking down instructions into core elements, potential ambiguities, and detailed supplements [15][19].
- AlignEvaluator provides a scoring system across 24 key dimensions to accurately identify errors in generated images [20][21].

Group 3: Performance Improvements
- Testing on the HunyuanImage 2.1 model shows a 5.1% overall accuracy improvement, with significant gains in complex scene understanding [29].
- Specific dimensions such as "similarity relations" and "counterfactual reasoning" saw accuracy increases of 17.3% and 17.2%, respectively [29].

Group 4: Dataset and Research Support
- Tencent's team released a high-quality benchmark dataset containing 6,000 prompts to aid in the training and evaluation of the PromptEnhancer [7][45].
- The dataset covers various complex scenarios, including everyday creative extensions and abstract relationship challenges [46].

Group 5: Future Implications
- The advancements brought by PromptEnhancer position it as a critical tool for enhancing AI painting's applicability in professional fields like industrial design and advertising [54][55].
- The framework's ability to optimize instructions without altering model weights allows for broader adaptability across different T2I models [57].
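
The decoupled rewriter-plus-evaluator idea described above can be illustrated with a small sketch. This is not Tencent's implementation: it swaps in a simple best-of-N selection, where a hypothetical `evaluate` callable stands in for "generate an image from the prompt and score it per dimension", and the dimension names in the usage example are invented for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical evaluator signature: prompt -> {dimension name: score in [0, 1]}.
Evaluator = Callable[[str], Dict[str, float]]


def aggregate(scores: Dict[str, float]) -> float:
    """Collapse per-dimension scores into one number using an unweighted mean."""
    return sum(scores.values()) / len(scores)


def weakest_dimensions(scores: Dict[str, float], k: int = 3) -> List[str]:
    """Return the k lowest-scoring dimensions, i.e. where alignment fails most."""
    return sorted(scores, key=scores.get)[:k]


def pick_best_prompt(candidates: List[str], evaluate: Evaluator) -> Tuple[str, float]:
    """Score every rewritten candidate and keep the best; the image model is untouched."""
    scored = [(aggregate(evaluate(prompt)), prompt) for prompt in candidates]
    best_score, best_prompt = max(scored, key=lambda pair: pair[0])
    return best_prompt, best_score


if __name__ == "__main__":
    # Toy stand-in evaluator over two made-up dimensions.
    def toy_evaluate(prompt: str) -> Dict[str, float]:
        return {
            "counting": 1.0 if "three" in prompt else 0.0,
            "negation": 1.0 if "no " in prompt else 0.0,
        }

    rewrites = ["cats", "three cats", "three cats, no dogs"]
    print(pick_best_prompt(rewrites, toy_evaluate))
```

Keeping the evaluator behind a plain callable mirrors the decoupling the summary describes: the prompt side can change without retraining or modifying the text-to-image model.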