VLM (Vision-Language Model)
Jinqiu Select | Physical Intelligence Co-founder: Real Data Is Irreplaceable for AI Training
Jinqiu Ji (锦秋集) · 2025-07-22 15:04
Core Viewpoint
- Over-reliance on alternative data sources can severely limit the ultimate capabilities of models, and true breakthroughs must be built on real data [1][10]

Group 1: The Dilemma of Alternative Data
- Researchers in robotics often seek cheaper alternatives to real data due to high collection costs, leading to a compromise in model performance [2][3]
- Common alternative methods include simulation training, learning from human videos, and using handheld devices to mimic robotic actions, but each method ultimately weakens the model's true potential [3][4]

Group 2: Intersection Dilemma
- The collection of data inevitably involves human judgment, which can limit the problem-solving approach when avoiding real data [4][6]
- As models grow stronger, they can better distinguish between alternative and real data, leading to a smaller intersection of effective behaviors [6][7]

Group 3: The Importance of Real Data
- Attempting to bypass real data results in a "spork" scenario, where neither alternative data nor real data is effectively utilized [10][11]
- To build robust robotic models that generalize well, real data is essential, but it can be complemented with diverse data sources [11][12]

Group 4: The "Spork" Phenomenon
- The concept of "spork" applies to various AI research areas, where attempts to combine manual design with learning systems ultimately create performance bottlenecks [13]
Li Auto Bets Heavily on VLA; Xia Zhongpu, Head of Its "End-to-End" Model, to Leave | 36Kr Exclusive
36Kr · 2025-05-21 11:18
Core Viewpoint
- The article discusses the recent departure of Xia Zhongpu, the head of the end-to-end model for assisted driving at Li Auto, and the implications of this change on the company's strategic direction towards the VLA (Vision-Language-Action) model for autonomous driving technology [3][7][14].

Summary by Sections

Departure of Xia Zhongpu
- Xia Zhongpu, who joined Li Auto in 2023 and was responsible for the planning and control model of the assisted driving system, is set to leave the company. His departure may be linked to a shift in Li Auto's technology strategy [5][7].
- Xia's rapid promotion from P9 to 21st level within two years is noted as unusual within the company [6].

Shift in Technology Strategy
- Li Auto has transitioned its assisted driving technology from a reliance on high-precision maps and rule-based systems to an end-to-end model, and now to the VLA model [9][10].
- The VLA model, which Li Auto is now fully committed to, is seen as a more advanced approach that incorporates action capabilities, allowing for interaction with the physical world [12][14].

VLA Model Advantages
- The VLA model is positioned as superior to the previous end-to-end model, as it combines 3D and 2D visual understanding with the ability to execute actions, aligning more closely with human operational methods [12].
- This model is part of a broader industry trend towards enhancing the world knowledge and reasoning capabilities of assisted driving systems, as seen in recent developments from competitors like NIO and XPeng [12].

Internal Changes and Future Outlook
- The leadership within Li Auto's assisted driving team has also seen changes, with the head of the team, Lang Xianpeng, being promoted to a higher level, indicating a strengthening of resources towards the VLA model [9].
- Despite the enthusiasm for the VLA model, industry insiders caution that it is still in its early stages and has not undergone extensive practical application [13].
ICML Spotlight | MCU: The World's First Generative Open-World Benchmark, Reshaping the Evaluation Paradigm for General AI
Jiqizhixin (机器之心) · 2025-05-13 07:08
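This work is a collaboration between the Institute for General Artificial Intelligence (通用人工智能研究院) and Peking University. The first author, Zheng Xinyue, is a researcher at the Institute for General Artificial Intelligence; the co-first author, Lin Haowei, is a PhD student at the Institute for Artificial Intelligence, Peking University; the corresponding authors are Liang Yitao, an assistant professor at Peking University, and Zheng Zilong, a researcher at the Institute for General Artificial Intelligence.

Developing generalist agents that can complete diverse tasks in an open world is a core challenge in AI. An open world is defined by dynamic environments and tasks that are not specified in advance, so an agent needs genuine generalization ability to cope robustly. Existing evaluation suites, however, are mostly limited by insufficient task diversity, a small number of tasks, and a single type of environment, which makes it hard to tell whether an agent truly "understands" a task or has merely "memorized" a particular solution.

To address this, we built Minecraft Universe (MCU), a generative open-world platform for evaluating generalist agents. MCU can automatically generate an unlimited variety of task configurations, covering rich ecosystems, complex task goals, weather changes, and other environment variables, with the aim of comprehensively assessing an agent's real capability and level of generalization. The platform is built on MineStudio, an efficient and full-featured development toolkit. It supports flexible customization of environment settings and large-scale dataset processing, and it ships with mainstream Minecraft agent models such as VPTs and STEVE-1, which considerably simplifies the evaluation workflow and helps agents iterate and improve quickly.

Open world ...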