机器之心
NeurIPS 2025 Spotlight | HKU proposes TreeSynth, a method that generates million-scale datasets from a single sentence
机器之心· 2025-10-03 03:39
Core Insights
- TreeSynth is a novel data synthesis method inspired by decision trees, addressing the challenge of generating diverse and high-quality training data from scratch [6][7][25]
- The method ensures systematic coverage of the data space, overcoming limitations of traditional data synthesis approaches [4][25]

Methodology
- TreeSynth employs a two-phase workflow: data space partitioning and subspace data synthesis [8][12]
- In the first phase, the data space is divided into mutually exclusive subspaces using pivot samples and core criteria [9][12]
- The second phase involves generating samples within each atomic subspace based on the path description from the root to the leaf node [13][14]

Performance and Validation
- Experimental results show that TreeSynth consistently outperforms baseline methods in various benchmarks, achieving significant performance improvements [19][23]
- For instance, accuracy on the GSM8K dataset increased from 45.2% to 55.8% using the LLaMA3.1-8B model [19]
- TreeSynth also demonstrated a 45% increase in data diversity compared to baseline methods, with improved distribution in the embedding space [23]

Future Directions
- TreeSynth opens new avenues for synthesizing diverse and comprehensive training datasets, with potential for scalability in large data scenarios [26][27]
- Future exploration may focus on optimizing tree depth and partitioning criteria, as well as adapting to complex real-world scenarios [28]
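The two-phase workflow in the TreeSynth summary (partition the data space into mutually exclusive subspaces, then synthesize inside each leaf from its root-to-leaf path) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the hard-coded `CRITERIA` table is a hypothetical stand-in for the step where criteria are derived from pivot samples, and `synthesize` only builds prompt strings instead of calling a model.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for the step that proposes one splitting criterion
# (with mutually exclusive values) per tree depth.
CRITERIA = [
    ("topic", ["arithmetic", "geometry"]),
    ("difficulty", ["easy", "hard"]),
]


@dataclass
class Node:
    path: list = field(default_factory=list)      # (criterion, value) pairs from the root
    children: list = field(default_factory=list)  # empty for an atomic subspace (leaf)


def partition(node, depth=0):
    """Phase 1: recursively split the data space into exclusive subspaces."""
    if depth >= len(CRITERIA):
        return node  # no criteria left: this node is an atomic subspace
    name, values = CRITERIA[depth]
    for value in values:
        child = Node(path=node.path + [(name, value)])
        node.children.append(partition(child, depth + 1))
    return node


def leaves(node):
    """Yield every leaf (atomic subspace) of the tree, left to right."""
    if not node.children:
        yield node
    for child in node.children:
        yield from leaves(child)


def describe(leaf, task):
    """Turn the root-to-leaf path into a textual description of the subspace."""
    constraints = ", ".join(f"{name}={value}" for name, value in leaf.path)
    return f"Task: {task}. Constraints: {constraints}."


def synthesize(root, task, per_leaf=2):
    """Phase 2: build generation prompts inside each atomic subspace."""
    return [
        f"{describe(leaf, task)} Sample #{i + 1}"
        for leaf in leaves(root)
        for i in range(per_leaf)
    ]


root = partition(Node())
prompts = synthesize(root, "grade-school math word problems")
print(len(prompts))
print(prompts[0])
```

Because every leaf corresponds to a distinct combination of criterion values, each subspace receives the same number of prompts, which is the coverage property the summary emphasizes.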
Turmoil continues inside Meta: FAIR loses its freedom, LeCun considers resigning
机器之心· 2025-10-03 03:39
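Report by 机器之心. Editor: +0

Meta's internal turmoil has a new chapter, and this time the protagonist is the FAIR lab.

According to The Information, citing two people familiar with the matter, Meta recently imposed a new policy on FAIR: all research must pass an additional internal review before it is published.

The policy caused an uproar inside FAIR. Several employees believe the change severely restricts the academic freedom they previously enjoyed, namely the right to share research freely outside Meta.

An open research culture has long been the cornerstone of FAIR's ability to attract top talent. However, as Meta overhauls its AI business, the company has begun asking FAIR to do more in service of internal products while cutting back on external research sharing that might benefit competitors.

These changes have deeply troubled FAIR co-founder Yann LeCun. According to people familiar with the matter, he even told colleagues privately in September that perhaps he should resign as chief scientist.

LeCun's dissatisfaction had been building for some time. For months he has grown increasingly disappointed with the internal state of Meta Superintelligence Labs (MSL), the newly formed group that oversees all of the company's AI efforts.

In July this year, MSL appointed Shengjia Zhao (赵晟佳), a researcher from OpenAI, as chief scientist. One person familiar with the matter said LeCun was quite annoyed by the outside perception that "he had been demoted" ...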
Just in: Anthropic's new CTO takes office as the AI infrastructure battle with Meta and OpenAI looms
机器之心· 2025-10-03 00:24
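Report by 机器之心. Edited by the 机器之心 editorial team

Anthropic has just welcomed a new chief technology officer (CTO): former Stripe CTO Rahul Patil.

According to reports, Rahul Patil joined the company earlier this week, succeeding co-founder Sam McCandlish, who will move to the role of chief architect.

On social media, Rahul Patil shared his excitement about joining Anthropic and his hopes for the future. He said he was delighted to join a new mission and calling. The possibilities of AI are endless, and this will be an extraordinary journey of discovery that takes hard work to turn those possibilities into reality.

More importantly, it will require thoughtful decisions every day to navigate this enormous transformation safely and to ensure that responsible AI ultimately wins.

He is grateful to join the humble, smart, hardworking, and responsible team at Anthropic, which has sparked the imagination of countless people around the world. He also thanked every teammate who built Stripe with him for more than five years of profound change.

As CTO, Rahul Patil will be responsible for compute, infrastructure, inference, and a range of other engineering work. As chief architect, Sam McCandlish will continue working on pre-training and large-scale model training, extending his earlier work. Both will report to Ant ...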
The world's most valuable startup is born: OpenAI's valuation hits a record $500 billion
机器之心· 2025-10-03 00:24
Report by 机器之心. Edited by the 机器之心 editorial team

A few days ago, OpenAI released Sora 2, its new-generation video model, which not only outperforms previous systems in physical accuracy, realism, and controllability but also generates synchronized dialogue and sound effects.

Altman called it the "ChatGPT for creativity" moment.

| Company | Valuation (USD, billions) | Country |
| --- | --- | --- |
| OpenAI | 500 | US |
| SpaceX | 400 | US |
| ByteDance | 220 | China |
| Anthropic | 183 | US |
| Ant Group | 150 | China |
| Reliance Retail | 100 | India |
| Databricks | 100 | US |
| Shein | 66 | China |

...
Sora 2 fumbles finger counting; Altman becomes one of the first "victims," turned by AI into the most miserable wage worker
机器之心· 2025-10-02 06:19
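Report by 机器之心. Editor: 杨文

A large-scale public embarrassment for Altman.

Sora 2, powerful as it is, still cannot count on its fingers.

X user @fofrAI tested Sora 2 with this prompt: a man counts out loud from 1 to 10, using his fingers and holding them up as he goes.

At the start of the video, the man's performance ...

In the video Sora 2 generated, the man counts correctly, but the fingers he holds up do not fully match the numbers.

This is not the first time the blogger has tested a video generation model with this kind of prompt. Back in May of this year, he ran the same prompt on Veo3, which not only got the fingers wrong but also counted only as far as 3.

The blogger later polished the prompt: a man counts out loud from 1 to 10, "1, 2, 3, 4, 6, 7, 8, 9, 10", he counts using his fingers and holds them up as he goes. It still ended in failure.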
Developers rejoice: Thinking Machines releases its first product, Tinker, which takes care of all the post-training hassle
机器之心· 2025-10-02 03:12
Core Insights
- Tinker, the first product launched by Thinking Machines, is an API designed to simplify the fine-tuning of language models for developers and researchers, allowing them to focus on training data and algorithms while Tinker manages infrastructure-related tasks [2][4][16].

Product Features
- Tinker supports various advanced models, including Qwen-235B-A22B, and allows users to switch from small to large models with ease, akin to changing a string in Python code [6][8].
- The API provides low-level primitives such as forward_backward and sample, which are essential for most common post-training methods. An open-source library, Tinker Cookbook, is also available to offer modern implementations of post-training methods [9][11].

Use Cases and Adoption
- Teams from prestigious institutions like Princeton, Stanford, and UC Berkeley are already utilizing Tinker, demonstrating its versatility in supporting both supervised fine-tuning and experimental reinforcement learning pipelines [13].
- The Goedel team at Princeton achieved comparable performance to full-parameter models using only 20% of the data, while Stanford's chemistry group improved accuracy from 15% to 50% in a specific task using Tinker [14].

Market Position and Future Outlook
- Tinker aims to democratize access to fine-tuning capabilities, potentially leading to more diverse product innovations in the AI space [16].
- The initial phase of Tinker will be free, with a usage-based pricing model to be introduced in the coming weeks [15].
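To make the division of labor in the Tinker summary concrete (the user owns the data and the training loop, the service owns the heavy lifting behind a few low-level primitives), here is a self-contained toy. It is emphatically not the Tinker client: only the primitive names `forward_backward` and `sample` come from the summary, while the `ToyTrainer` class, the `optim_step` method, every signature, and the one-weight model are assumptions made up for illustration.

```python
class ToyTrainer:
    """Toy stand-in for a training service. NOT the Tinker client; all signatures are hypothetical."""

    def __init__(self):
        self.w = 0.0       # single weight of the model y = w * x
        self._grad = 0.0   # gradient accumulated since the last optimizer step

    def forward_backward(self, batch):
        """Compute the mean squared error on the batch and accumulate its gradient."""
        n = len(batch)
        loss = sum((self.w * x - y) ** 2 for x, y in batch) / n
        self._grad += sum(2 * (self.w * x - y) * x for x, y in batch) / n
        return loss

    def optim_step(self, lr):
        """Apply one gradient-descent update and clear the accumulated gradient."""
        self.w -= lr * self._grad
        self._grad = 0.0

    def sample(self, x):
        """Query the current model (here: evaluate y = w * x)."""
        return self.w * x


# User-side code: the data and the loop are the only parts the user writes.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # targets follow y = 2 * x
trainer = ToyTrainer()
losses = []
for _ in range(50):
    losses.append(trainer.forward_backward(data))
    trainer.optim_step(lr=0.05)

print(losses[0], losses[-1])
print(trainer.sample(4.0))
```

In this framing, swapping supervised fine-tuning for another post-training method means changing the user-side loop rather than the primitives, which appears to be the design point the summary describes.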
Xiaohongshu releases FireRedChat: the first full-duplex LLM voice interaction system that supports private deployment
机器之心· 2025-10-02 03:12
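Online demo: https://fireredteam.github.io/demos/firered_chat
Open-source code: https://github.com/FireRedTeam/FireRedChat

Xiaohongshu's Intelligent Creation audio team (智创音频团队) has launched FireRedChat, the industry's first full-duplex LLM voice interaction system that supports private deployment. Its in-house streaming pVAD and EoT models make voice interaction more natural. It debuts with two implementations, cascaded and semi-cascaded, and its end-to-end latency approaches that of industrial-grade applications. Fully open source and deployable in private environments, it aims to be a voice AI that truly "senses how you feel, empathizes, and knows how to express itself."

The system directly targets pain points such as high latency, noise sensitivity, poor controllability, and dependence on external APIs.

FireRedChat is built on a complete architecture of "interaction controller + interaction module + dialogue manager," which upgrades any half-duplex pipeline to full duplex in one step. It integrates core models including the in-house streaming personalized barge-in detector pVAD, the semantic end-of-turn detector EoT, FireRedTTS-1s, FireRedASR, and FireRedTTS2. It offers two end-to-end deployment options, cascaded and semi-cascaded, covering needs that range from "stable and easy to deploy" to "warmer," and significantly improves real-time performance, robustness ...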
Everything is possible in dreams? Google's new world model trains purely in "imagination" and learns to mine diamonds in Minecraft
机器之心· 2025-10-02 01:30
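To solve complex tasks in embodied environments, an agent needs a deep understanding of the world and must choose actions that succeed. World models offer a promising approach to this goal: they learn to predict the future outcomes of potential actions from the perspective of an agent, such as a robot or a video game player.

In this way, world models give agents a deep understanding of the world and the ability to choose actions through planning or reinforcement learning in imagination. Moreover, world models can in principle learn from fixed datasets, which allows agents to be trained purely in imagination, without online interaction. Optimizing behavior offline is valuable for many practical applications, such as robots in the physical world, where online interaction with an undertrained agent is often unsafe.

World model agents such as Dreamer 3 are among the best-performing and most robust reinforcement learning algorithms to date in games and robotics. Although these models are fast and accurate in their specific narrow environments, their architectures lack the capacity to fit complex real-world distributions. Controllable video models such as Genie 3 have been trained on diverse real videos and games and achieve diverse scene generation and simple interaction. These models are based on scalable architectures such as the diffusion transformer. However, they still struggle to learn the precise physics of object interactions and game mechanics, which limits their usefulness for training successful agents. In addition, they ...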
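The idea of learning a world model from a fixed dataset and then choosing actions "in imagination" can be illustrated with a deliberately tiny example. This is a toy sketch under stated assumptions, not Dreamer's or Google's method: the corridor environment, the lookup-table "model", and the exhaustive planner are all hypothetical simplifications, and real world models are learned neural networks operating on images.

```python
import itertools


def true_step(state, action):
    """Real environment: a 1-D corridor with positions 0..5 (walls clip movement)."""
    return max(0, min(5, state + action))


# Fixed offline dataset of (state, action, next_state) transitions.
# After this point the planner never calls true_step until evaluation.
dataset = [(s, a, true_step(s, a)) for s in range(6) for a in (-1, 1)]

# "World model": here just a lookup table fitted to the offline data.
model = {(s, a): s_next for s, a, s_next in dataset}


def imagine(state, plan):
    """Roll a candidate action sequence forward inside the model only."""
    for action in plan:
        state = model[(state, action)]
    return state


def plan_in_imagination(start, goal, horizon=5):
    """Score every action sequence by its imagined outcome and keep the best one."""
    candidates = itertools.product((-1, 1), repeat=horizon)
    return min(candidates, key=lambda plan: abs(goal - imagine(start, plan)))


plan = plan_in_imagination(start=0, goal=5)

# Only now is the chosen plan executed in the real environment.
state = 0
for action in plan:
    state = true_step(state, action)

print(plan, state)
```

The point of the sketch is the separation: all trial and error happens inside `model`, so a poorly planned sequence never touches the real environment, which is why offline training matters for settings like physical robots.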
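Does Sora 2 crush Veo 3? A comprehensive hands-on comparison: it can do Chinese stand-up comedy but flops at gymnastics; working invite code included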
机器之心· 2025-10-01 07:26
Core Viewpoint
- The article discusses the advancements of Sora 2, an AI video and audio generation model, highlighting its superior physical accuracy, realism, and controllability compared to its predecessor and competitors like Google's Veo3 [1][6][7].

Comparison with Veo3
- Sora 2 can generate up to 20 seconds of 1080p video, positioning it as a strong competitor to Veo3 [7].
- The audio generation capabilities of Sora 2 are noted to be superior to those of Veo3 [9].
- Sora 2's video generation avoids issues like object disappearance and distortion, which were present in the previous version [5][9].
- Users can access Sora 2 through a web platform or an iOS app, both requiring an invitation and a US IP address [11][12].

Performance Testing
- In various tests, Sora 2 demonstrated impressive capabilities in generating realistic videos, including ASMR and singing performances, with accurate audio-visual synchronization [20][22].
- However, both Sora 2 and Veo3 struggled with generating gymnastics videos, resulting in unrealistic movements [28][33].
- Sora 2 outperformed Veo3 in generating fake news segments, providing a more dynamic presentation [24][25].

User Experience and Accessibility
- The Sora iOS app mimics popular social media platforms like TikTok, featuring a recommendation algorithm and options for user interaction [44].
- OpenAI has implemented safety measures, including watermarks and restrictions on deepfakes of public figures, to prevent misuse of the technology [35].

Market Position and Competition
- The article suggests that while OpenAI's Sora 2 has established a product barrier, competition remains fierce in the AI video generation space, with other companies like Meta and domestic platforms also advancing their offerings [46][47].
God of CUDA kernels, the world's best GPU programmer? Who is this behind-the-scenes master at OpenAI?
机器之心· 2025-09-30 23:49
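Report by 机器之心. Editor: +0

In AI circles, the spotlight always chases star figures with glittering résumés. But a great team has not only stars out front but also countless heroes contributing critical work behind the scenes.

We previously profiled two Polish engineers at OpenAI, and recently another behind-the-scenes OpenAI engineer has come into focus.

It started with a popular post on X claiming that OpenAI sustains trillions of computations per day on key CUDA kernels written by a single engineer.

Commenters speculated that this expert is Scott Gray, a senior engineer at OpenAI.

Why would an engineer who can write CUDA kernels attract so much attention?

Because writing high-performance CUDA kernels for model training is an extremely specialized skill. It requires simultaneous mastery of three demanding fields: parallel computing theory, GPU hardware architecture, and deep learning algorithms. Top talent who can integrate all three is exceedingly rare.

Most developers stay at the application layer and use off-the-shelf tools. Somewhat more people work on inference optimization, because its problem boundaries are clearer. But going down to the lowest level and hand-writing, from scratch, CUDA kernels for complex training processes (especially backpropagation) that outperform existing libraries such as cuDNN requires a grandmaster-level understanding of algorithms, parallel computing, and hardware.

And Scott Gray's ...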