David Silver, the father of AlphaGo, leaves to start a company targeting superintelligence
机器之心· 2026-01-31 02:34
Core Viewpoint
- David Silver, a prominent AI researcher from Google DeepMind, has left the company to establish a new startup named Ineffable Intelligence, focused on solving hard AI challenges and pursuing superintelligence [1][3][4].

Group 1: Company Formation and Background
- Ineffable Intelligence is being founded in London and is actively recruiting AI researchers and seeking venture capital [3].
- Silver was a key figure at Google DeepMind, contributing to landmark achievements such as AlphaGo, AlphaStar, and AlphaZero, which demonstrated AI's capabilities in complex games [9][12][14].
- The company was officially registered in November 2025, with Silver appointed as a director in January 2026 [4].

Group 2: Silver's Contributions and Vision
- Silver's work includes developing AI systems that surpassed human capabilities in games, showcasing AI's potential to learn and adapt [12][14].
- He emphasizes the need for AI to explore and discover knowledge independently, moving beyond human limitations and biases [18][23].
- The vision for Ineffable Intelligence is to create a self-learning superintelligence that can autonomously uncover foundational knowledge [23].

Group 3: Industry Context and Trends
- Silver's departure follows a trend of notable AI researchers leaving established labs for superintelligence-focused startups, with significant funding flowing into the sector [15].
- Other prominent figures, such as Ilya Sutskever and Yann LeCun, are venturing into similar domains, signaling growing interest in the pursuit of advanced AI capabilities [15][16].
Are top models still far from being "scientists"? AI4S urgently needs to enter the 2.0 era
机器之心· 2026-01-30 10:43
Core Insights
- The article discusses the transition from AI for Science (AI4S) to AGI for Science (AGI4S), emphasizing the need for a specialized generalist model to enhance scientific discovery and reasoning capabilities [1][2][71].

Group 1: Current State of AI in Science
- AI for Science, exemplified by AlphaFold, has achieved significant milestones in specific fields like protein folding and weather prediction, but reliance on existing deep learning models may limit the exploration of new knowledge and hinder innovation [1][71].
- A systematic evaluation involving 100 scientists from 10 different scientific fields revealed that cutting-edge models scored 50 out of 100 on general scientific reasoning tasks, but dropped to between 15 and 30 on specialized reasoning tasks [1][71].

Group 2: The Need for AGI4S
- The transition from AI4S 1.0 to AGI4S 2.0 is necessary to integrate general reasoning with specialized capabilities, addressing the limitations of current models in scientific discovery [2][71].
- The concept of a "Specialized Generalist" is proposed as a feasible path to AGI, combining deep specialization with general capabilities [2][90].

Group 3: Technological Framework - SAGE
- The "SAGE" architecture is introduced as a synergistic framework for developing generalizable experts, consisting of three layers: foundational, collaborative, and evolutionary [3][18].
- The foundational layer focuses on decoupling knowledge and reasoning capabilities, while the collaborative layer employs reinforcement learning to balance intuitive and logical reasoning [27][28].
- The evolutionary layer aims to enable self-evolution of models through continuous interaction and feedback, addressing the challenge of adapting to complex tasks [55][56].

Group 4: Innovations in Reinforcement Learning
- The article highlights the PRIME algorithm, which provides dense rewards for reinforcement learning without extensive manual labeling, significantly improving model performance [38][39].
- FlowRL is introduced to enhance the diversity of reasoning paths, allowing models to explore multiple solutions rather than converging on a single answer [47][50].

Group 5: Applications and Case Studies
- The Intern-S1 model is designed as a deeply specialized generalist for scientific applications, demonstrating superior performance across scientific domains compared to existing models [77][79].
- The Intern-Discovery platform integrates the Intern-S1 model with extensive data and tools, enabling a closed loop of hypothesis generation and experimental validation [80][84].

Group 6: Future Directions
- The article calls for collaboration among researchers to fill gaps in the current framework and advance AGI4S, emphasizing AI's potential to revolutionize scientific research [89][90].
Revealed! The long-overlooked critical flaws in RLVR/GRPO
机器之心· 2026-01-30 08:49
In recent years, a key technique behind large models' breakthroughs in tasks such as mathematical reasoning and code generation has been RLVR (Reinforcement Learning with Verifiable Rewards). Put simply, instead of having the model "listen to human scoring," RLVR lets the model try multiple solutions itself and then uses verifiable rules (such as whether the answer is correct) to improve itself. This lets the model keep getting stronger through repeated trial and error, and it is widely used in today's most advanced reasoning models.

In actual training, to keep learning stable and avoid introducing an extra value network, many RLVR methods (such as GRPO) generate a group of responses to the same question and make relative comparisons within the group. The model does not directly judge "is this response good?" but rather "is it relatively good within this group?" This is the so-called group-relative advantage estimation, the core design of nearly all current group-based reinforcement learning methods. Advantage estimation is not merely an "evaluation metric"; it is the core signal that directly determines the direction of policy-gradient updates.

However, a long-overlooked key problem is that group-relative advantage estimation is not "approximately unbiased" in the way intuition suggests. On the contrary, recent work from Beihang University, Peking University, UC Berkeley, and Meituan reveals that, in a statistical sense, this group-relative advantage estimation carries a clear and systematic ...
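The group-relative advantage described above can be made concrete with a minimal sketch. The zero-variance guard and the use of the population standard deviation are illustrative assumptions; exact normalization details vary across GRPO-style implementations.

```python
import statistics

def group_relative_advantage(rewards):
    """Standardize each response's verifiable reward within its own
    group of sampled responses (GRPO-style group-relative advantage)."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0:  # all responses tied: no learning signal from this group
        return [0.0 for _ in rewards]
    return [(r - mean) / std for r in rewards]

# Four sampled answers to one question; reward 1.0 = verified correct.
advantages = group_relative_advantage([1.0, 0.0, 0.0, 1.0])
```

Correct answers receive positive advantage and incorrect ones negative, so the policy gradient pushes probability mass toward the relatively better responses; the cited work argues this within-group standardization is not the unbiased estimator it is commonly assumed to be.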
Google opens up its world model and it goes viral overnight: has the zero-barrier moment for AI games arrived?
机器之心· 2026-01-30 08:49
Core Insights
- Google DeepMind has launched an experimental research prototype called "Project Genie," allowing users to create, edit, and explore virtual worlds using the world model Genie 3 [1][4][19].
- Project Genie is supported by the image generation and editing model Nano Banana Pro and the language model Gemini, enhancing its capabilities [2].

Group 1: Project Features
- The world model Genie 3 can generate diverse interactive environments, enabling users to create immersive experiences and discover new usage methods [4].
- Project Genie focuses on three core capabilities: World Sketching, World Exploration, and World Remixing [9][13][15].
- Users can create environments through text prompts and images, define exploration methods, and adjust camera perspectives [9][12].

Group 2: User Experience
- The prototype is currently available to Google AI Ultra users aged 18 and above in the U.S., expanding its audience [6].
- Users have reported positive experiences, showcasing their creations and expressing excitement about AI's potential in gaming and other fields [21][23].
- Despite its advancements, Genie 3 remains in an early research phase, with limitations in realism, control, and content-generation time [18][20].
First principles of large models (Part 2): signal processing
机器之心· 2026-01-30 08:49
Core Viewpoint
- The article discusses transforming natural language processing problems into signal processing problems via semantic vectorization, emphasizing the importance of token embedding in large models and its connection to signal processing and information theory [2][32].

Semantic Embedding / Vectorization
- The idea of modeling semantics with vectors dates back to Luhn's 1953 paper, but the significant breakthrough came in 2013, when Mikolov and others successfully trained neural network models to convert tokens into semantic vectors [6][9].
- Ideal semantic vectorization has not been fully realized, but the inner product of semantic vectors can represent semantic relevance at the token level [7][11].
- The semantic vector space can be modeled as a probability-inner-product space, balancing complexity and effectiveness by defining the space on a unit sphere [8][10].

Optimal Semantic Vectorization
- Optimal semantic encoding is closely tied to downstream tasks, with the goal of predicting the next token; the semantic encoder should maximize the conditional mutual information between the next token and the current sequence [13][14].
- Existing methods such as Contrastive Predictive Coding (CPC) optimize an upper bound for the semantic encoder but may not reach the optimal solution [15][19].

Transformer as a Nonlinear Time-Varying Vector Autoregressive Time Series
- The Transformer is an autoregressive large language model that predicts the next token from the input token sequence and previously generated tokens [21][30].
- The attention mechanism in Transformers can be expressed mathematically as a nonlinear time-varying vector autoregression, which is crucial for predicting the next token [22][24].
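The autoregressive reading of attention can be sketched in a few lines: the representation for the latest position is a softmax-weighted combination of past value vectors, where the weights themselves depend on the data, which is exactly what makes it a nonlinear, time-varying vector autoregression. The single-head, single-step form and the random projection matrices below are simplifying assumptions for illustration, not the article's exact formulation.

```python
import numpy as np

def causal_attention_step(X, Wq, Wk, Wv):
    """One causal attention step for the latest position: the output is a
    convex combination of past value vectors, with softmax weights that
    depend on the data itself (time-varying "AR coefficients")."""
    q = X[-1] @ Wq                         # query from the latest token
    K, V = X @ Wk, X @ Wv                  # keys/values from the history
    scores = K @ q / np.sqrt(K.shape[1])   # scaled dot-product scores
    w = np.exp(scores - scores.max())      # numerically stable softmax
    w /= w.sum()
    return w @ V                           # weighted sum over the past

rng = np.random.default_rng(0)
d = 8
X = rng.normal(size=(5, d))                # 5 tokens of dimension 8
out = causal_attention_step(X, *(rng.normal(size=(d, d)) for _ in range(3)))
```

A classical vector AR model would use fixed coefficient matrices; here the weights `w` are recomputed from the sequence at every step, which is the "nonlinear time-varying" part of the claim.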
Signal Processing and Information Theory
- The article establishes a relationship between signal processing and information theory, noting that signal processing implements information-theoretic principles in specific computational architectures [32][33].
- The transition from the BIT of the information age to the TOKEN of the AI era is proposed as a way to apply Shannon's information theory to the mathematical principles behind large models [36].
Yao Shunyu presents the awards as 15 young researchers, including Ji Jiaming and Dong Guanting, receive Tencent's "Qingyun Scholarship"
机器之心· 2026-01-30 04:25
机器之心 Editorial Department

Just now, Tencent's "Qingyun Scholarship" was officially awarded in Shenzhen.

As Tencent's program for supporting young talent and scientific research, the first edition of the "Qingyun Scholarship" selected 15 winners, each receiving an incentive package worth 500,000 RMB in total: 200,000 RMB in cash and 300,000 RMB worth of heterogeneous cloud computing resources.

"We hope young researchers will dare to explore the unknown, stay richly innovative, and pursue bold, frontier research directions with long-term impact, jointly exploring the broader frontier of science and technology," said Xi Dan, Senior Vice President and Chief Talent Officer of Tencent Group.

Yao Shunyu (Vinces Yao), who recently joined Tencent as Chief AI Scientist in the "CEO / President's Office," also appeared to present awards.

Bai Yushi

Bai Yushi, from Tsinghua University, works on long-context large models and large-model evaluation. To date he has published 10 first-author papers at top international venues including NeurIPS, ICML, ICLR, and ACL, with 4,000+ total citations, nearly 2,000 of them on first-author papers. His open-source projects LongBench, LongWriter, and LongAlign have together earned 3,000+ stars and 300+ forks on GitHub, and his open-source datasets and models on HuggingFace have been downloaded over 2 million times.

Ji Jiaming, Dong Guanting, Zhang Jintao, and several other 机器之 ...
LLM-in-Sandbox: give a large model a computer to unlock general agent capabilities
机器之心· 2026-01-30 04:25
Core Idea
- The article presents LLM-in-Sandbox, which lets large language models (LLMs) explore tasks in a virtual computer environment, significantly enhancing their performance across non-code domains without additional training [5][40].

Group 1: Technical Advancements
- The evolution of large models has been unlocked through successive paradigms, including In-Context Learning, Chain-of-Thought, and the recent agent framework enabling multi-turn interactions and tool use [2][3].
- LLM-in-Sandbox is proposed as a new paradigm that pairs an LLM with a virtual computer, allowing it to autonomously explore and complete tasks, improving performance in fields such as mathematics, physics, chemistry, and long-text understanding [3][7].

Group 2: Design and Implementation
- LLM-in-Sandbox features a lightweight, general-purpose design, in contrast to existing software-engineering agents that require task-specific environments, thus enhancing generalization and scalability [10][11].
- The environment is based on a Docker Ubuntu setup with minimal pre-installed tools, allowing models to autonomously acquire domain-specific tools as needed [12][13].

Group 3: Experimental Results
- Experiments across six non-code domains showed significant gains for LLMs in LLM-in-Sandbox mode, with improvements in mathematics (+6.6% to +24.2%), physics (+1.0% to +11.1%), and other areas without additional training [20][21].
- Case studies demonstrated the model's ability to use the sandbox autonomously, including external resource access, file management, and computational execution [21][22][23].

Group 4: Reinforcement Learning Integration
- LLM-in-Sandbox RL is introduced to enhance the generalization of weaker models by training them in the sandbox on context-based tasks that require active exploration [26][29].
- The approach has shown consistent performance improvements across various models, indicating broad applicability and effectiveness [31].

Group 5: Efficiency and Performance
- LLM-in-Sandbox demonstrates cross-domain generalization, achieving consistent improvements on multiple downstream tasks, including software engineering [31].
- Deploying LLM-in-Sandbox can cut token consumption in long-text scenarios by up to 8x while maintaining competitive throughput [32][34].

Group 6: Future Prospects
- LLM-in-Sandbox transcends traditional text generation, enabling cross-modal abilities and direct file generation, and could evolve into a universal digital-creation system [35][38].
- The article concludes that LLM-in-Sandbox should become the default deployment paradigm for large models, offering substantial performance gains at minimal deployment cost [40].
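The execute-and-observe loop at the heart of such a sandbox can be sketched in a few lines. This is a hypothetical helper, not the paper's actual harness: a real deployment would run each command inside the Docker Ubuntu container described above rather than on the host, and would iterate over many model-issued commands.

```python
import subprocess

def run_in_sandbox(cmd, timeout=10):
    """Run one model-issued shell command and capture its output, which
    would be fed back to the model as an observation. Isolation (the
    Docker container described above) is assumed, not shown here."""
    result = subprocess.run(
        cmd, shell=True, capture_output=True, text=True, timeout=timeout
    )
    return result.stdout + result.stderr

# The model can inspect its environment, write files, and compute:
observation = run_in_sandbox("echo 42")
```

Each returned observation becomes context for the model's next command, which is what allows it to install tools, manage files, and verify its own intermediate results without task-specific scaffolding.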
After connecting to 10,000+ data sources and tools, Clawdbot monitors stocks 24/7 and is absolutely crushing it!
机器之心· 2026-01-29 11:37
Core Viewpoint
- Clawdbot, now renamed Moltbot, has gained significant popularity in the AI community, particularly through its integration into the Teamo platform, which gives users access to a wide range of professional databases and tools without complex setup [1][7].

Group 1: Clawdbot Overview
- Clawdbot is an open-source AI assistant that can interact with users through platforms such as WhatsApp, Telegram, and Discord, and can also be integrated into domestic platforms like Feishu and WeChat [10].
- It operates 24/7, monitoring markets, responding to messages, reminding users of daily tasks, managing emails, and handling files [11].
- The initial version of Clawdbot was limited, primarily due to the lack of access to specialized data sources, which restricted it to basic interactions [6][14].

Group 2: Teamo Integration
- The Teamo platform has connected Clawdbot to over 10,000 databases and tools across finance, business, and social media, letting users claim a fully configured Clawdbot instance with zero deployment and configuration [7][20].
- Users gain access to financial data sources, cryptocurrency data, social media analytics, and various business analysis tools through this integration [23].
- Acquiring these specialized data sources separately can cost tens of thousands annually, making the Teamo integration a cost-effective alternative [18].

Group 3: Enhanced Features
- The enhanced version of Clawdbot on Teamo supports various skills, allowing users to perform tasks such as stock analysis, market monitoring, and business analysis with real-time data [29][36].
- Users can directly request specific analyses, such as technical analysis of stocks or market trends, and receive professional insights [30][34].
- The platform also lets users create and install custom skills, extending Clawdbot for specific needs [46][48].
Praised by Karpathy, a startup with nothing to show has just raised $180 million to build strong intelligence from small data
机器之心· 2026-01-29 10:26
What do you imagine true AI looks like?

On at least one point, most people would agree: the AI of the future should be able to think the way humans do.

The question is whether the current path of large-model research can lead to genuine "thinking."

Today's most advanced large-model systems are trained on nearly all of the historical data humanity can access: web pages, books, code, papers, conversations, trillions of tokens. The data needed to train a large model far exceeds the total any single human could encounter in a lifetime.

AI needs the entire internet to learn; a human only needs a childhood.

Before adulthood, the language, text, and symbols a human encounters amount to at most a few billion tokens, several orders of magnitude less.

Starting from exactly this question, an AI startup with almost no product, no revenue, and no rush to commercialize has raised $180 million from GV, Sequoia, and Index, along with public backing from Andrej Karpathy.

Its name is Flapping Airplanes.

Flapping Airplanes is a foundational AI research lab focused on the core problem of "data efficiency," exploring ideas that may look strange but could prove crucial: from rethinking loss functions to questioning and even rebuilding gradient descent itself. The company's research team includes IMO, ...
Just now, 创智 and 模思 released an open-source Sora 2: film-grade synchronized audio-video generation breaks the closed-source technology monopoly
机器之心· 2026-01-29 10:26
Editors: Zenan, Panda

This morning, the OpenMOSS team at 上海创智学院, together with the startup 模思智能 (MOSI), officially released MOVA (MOSS-Video-and-Audio), an end-to-end audio-video generation model.

As China's first high-performance open-source audio-video model, MOVA achieves true "sound and picture generated together." It can generate audiovisual clips up to 8 seconds long at resolutions up to 720p, and shows a high industrial standard in multilingual lip sync and the fit of ambient sound effects.

More significant for the industry: at a moment when top systems such as Sora 2 and Veo 3 are generally going closed-source, MOVA has chosen to open-source the full stack: model weights, training code, inference code, and fine-tuning recipes.

Its generated videos convey an immersive sense of realism:

Striking results, arguably the strongest open source

Over the past year, video generation models (Video Generation) have grown explosively. From Sora to Wan to LTX Video, AI-generated footage has become ever more realistic and ever longer. But look closely at AI-generated videos and you will notice that some are "mute" while others have jarring dubbing. Audio-video generation (Video-Audio Generation) models remedy the missing audio dimension of traditional video models through end-to-end modality fusion.

Although with Veo3 ...