RoboBrain
How much are embodied intelligence's big commercial orders really worth? Even practitioners can't tell
Nan Fang Du Shi Bao· 2025-11-23 05:50
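Since the second half of this year, the embodied-intelligence robotics industry has announced a string of commercial orders worth hundreds of millions of yuan, painting an optimistic picture of real-world deployment. Some practitioners, however, say bluntly that they cannot tell how much substance lies behind some of these orders.
At the "Embodied Open Day" held by BAAI (智源研究院, the Beijing Academy of Artificial Intelligence) on November 20, Tang Wenbin, founder and CEO of 原力灵机, raised a series of questions: "What problem do these orders actually solve? Has the (commercial) loop really been closed? Is the scenario value they create real?" 原力灵机 is an embodied-intelligence startup founded in March 2025; in mid-November it completed a Series A+ round of several hundred million yuan led by Alibaba.
Wang Zhongyuan suggested that government should mainly provide policy support and guidance and avoid specifying demand directly, because real demand always comes from enterprises and users.
Behind the slow arrival of embodied-intelligence models, data scarcity is a long-discussed problem. It has set off an industry debate, still ongoing, between the real-robot data route and the simulation data route.
BAAI's embodied training ground. Image: BAAI
Although the technology is still immature, embodied-intelligence companies have pushed into commercialization this year. Behind this lies pressure from investors, who are testing whether startups can generate their own revenue or close the commercial loop, as well as robot makers' practical need to find problems and iterate on products in real scenarios. In terms of applications, many companies are crowding into handling, sorting, and security in industry and logistics, and into guided tours, shopping assistance, and entertainment performances in commercial settings.
The robots' limited capabilities were also within Li Kai's expectations. ...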
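Even 10 billion yuan is not enough to burn! Robotics CEOs offer a new verdict: embodied intelligence can no longer copy the LLM playbook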
Sou Hu Cai Jing· 2025-11-22 02:41
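机器人前瞻 reported on November 20 that at the 2025 BAAI Embodied Open Day held that day, BAAI (智源研究院) systematically presented its latest research progress in embodied intelligence and hosted roundtable discussions on the industry's core questions.
机器人前瞻 (WeChat public account: robot_pro) Author | Jiang Yu Editor | Mo Ying
On site, the roundtable opened with the question "Are world models the key to achieving embodied intelligence?" and then moved on to "Does embodied intelligence need its own unified architecture, an 'embodied Transformer'?" On data, the panelists discussed how real-world data, simulation data, and video data should be combined, given that data is both important and hard to obtain.
The second roundtable went further, asking whether humanoid robots are the final form of embodied intelligence and whether hardware is currently the biggest bottleneck.
The star-studded roundtables put the industry's most pressing and practical issues on the table. Many panelists gave clear, direct answers on several core questions, with disagreement and consensus interleaved.
I. BAAI's full-stack layout: from world models to a cross-embodiment "embodied brain"
In his opening talk, BAAI President Wang Zhongyuan systematically reviewed the past year's key advances in embodied intelligence, which he summarized along two main lines: breakthroughs in world models and the formation of a full-stack system for the embodied brain.
First, BAAI released Emu3.5, a natively multimodal world model. Compared with the previous generation, Emu3, the new model expands its training data from 15 years of video to ...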
A roundup of VLA foundation models and large-scale training tasks
具身智能之心· 2025-10-08 02:49
Core Insights
- The article summarizes several research papers on Vision-Language-Action (VLA) models and their training strategies, highlighting advances in embodied intelligence and robotics [2][3][5][7][9][11][13][15][17][19].
Group 1: Training Strategies and Model Improvements
- The paper "Training strategies for efficient embodied reasoning" discusses the use of Chain of Thought (CoT) reasoning to enhance the performance and generalization of VLA models, achieving a threefold increase in reasoning speed compared to standard methods [3].
- "CAST: Counterfactual labels improve instruction following in vision-language-action models" introduces a method for generating counterfactual labels that significantly improves the instruction-following capabilities of VLA models, with a 27% increase in navigation task success rates [5].
- "RoboBrain: A unified brain model for robotic manipulation" presents a new dataset, ShareRobot, which enhances the planning and trajectory prediction capabilities of robots, leading to state-of-the-art performance on various tasks [7].
Group 2: Dataset Development and Evaluation
- "DROID" is introduced as a large-scale, diverse dataset for robot manipulation, containing 76,000 demonstration trajectories, or 350 hours of interaction data, which improves the performance and generalization of trained policies [9].
- "ViSA-Flow" proposes a framework for learning from large-scale video data, achieving state-of-the-art performance in robot skill learning, particularly in low-data scenarios [11].
- The "CORTEXBENCH" benchmark evaluates pre-trained visual representations for embodied AI, revealing that no single representation excels across all tasks, but task-specific adaptation can lead to significant performance improvements [13].
Group 3: Generalist Robot Policies and Learning Frameworks
- "Effective tuning strategies for generalist robot manipulation policies" identifies key factors influencing the performance of Generalist Manipulation Policies (GMPs) during fine-tuning, establishing a new benchmark for future research [15].
- The "CACTI" framework focuses on scalable multi-task learning in robotic systems, demonstrating effective training across various kitchen tasks in both real and simulated environments [17].
- "R3M: A universal visual representation for robot manipulation" shows that pre-trained visual representations can enable data-efficient learning in real-world environments, improving task success rates by over 20% compared to training from scratch [19].
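A conversation with BAAI's Wang Zhongyuan: the embodied-intelligence "group stage" has only just kicked off, and robots need an "Android", not an iOS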
AI科技大本营· 2025-06-07 09:42
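When Wudao (悟道) 1.0 was released, academia had not yet reached a consensus on whether "large models are the technical route to AGI". Embodied intelligence is at that same stage today.
Author | Wang Qilong Produced by | AI科技大本营 (ID: rgznai100)
Amid the large-model boom, a subtle sense of hitting a bottleneck is becoming an industry consensus.
"The so-called 'battle of a hundred models' was mostly a competition among large language models," BAAI President Wang Zhongyuan said in a conversation with CSDN on the eve of the BAAI Conference, getting straight to the heart of the problem. "And large language models are constrained by their reliance on internet data. Their performance is still improving, but far more slowly than before."
Where is the way out? In Wang Zhongyuan's view, for AI to break through the ceiling, after "reading ten thousand books" (internet data) it must go on to "travel ten thousand miles" (the physical world).
This is not an isolated judgment. In March this year, NVIDIA CEO Jensen Huang pointed the way for AI's second half at the GTC conference: build "AI factories" and usher in the era of "physical AI", so that AI steps out of the screen and interacts with the real world.
As thinking converges, action follows. On June 6, at the BAAI Conference in Beijing, CSDN witnessed the answer Wang Zhongyuan gave in his keynote. If the 2021 "Wudao" series represented an exploration of the technical path (the "dao", or way), then the new "Wujie" (悟界) series he unveiled declares a new ambition: to use ...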
BAAI President Wang Zhongyuan: multimodal large models will bring new variables to embodied intelligence
Xin Jing Bao· 2025-03-30 10:00
Core Insights
- The topic of embodied intelligence is a major focus at the 2025 Zhongguancun Forum, with the introduction of the RoboOS framework and the open-source RoboBrain model [1][3]
- Multi-modal large model technology is expected to enhance the intelligence of robots, allowing them to better understand and interact with the physical world [2][3]
Group 1: Multi-modal Large Models
- Multi-modal large models enable AI to perceive and understand the world through various data types, such as medical imaging and sensor data, facilitating the transition from digital to physical environments [2]
- The performance improvement of large language models has slowed due to the exhaustion of available internet text data, necessitating the integration of multi-modal capabilities [2]
Group 2: RoboBrain and RoboOS
- RoboBrain and RoboOS are designed to support cross-scenario, multi-task deployment and collaboration among different types of robots, enhancing their general intelligence [3]
- RoboBrain can interpret human commands and visual inputs to generate actionable plans based on real-time feedback, supporting various robotic configurations [3]
Group 3: Industry Development and Challenges
- The open-source approach is seen as a key driver for rapid development in the AI industry, allowing for collaboration among hardware, model, and application vendors [4]
- Despite the potential of humanoid robots, there are significant challenges in their industrial application, with many still in the early stages of development [5]
- The realization of Artificial General Intelligence (AGI) is projected to take an additional 5-10 years, influenced by advancements in embodiment capabilities and data accumulation [5]