3DGS Enters Embodied AI! HKU × Yuanli Infinite's RoboTidy to Be Open-Sourced Soon: Helping Robots Handle Household Scenes with Ease
具身智能之心· 2025-11-27 00:04
Core Insights
- The article discusses advancements in Embodied AI, particularly the RoboTidy project, which aims to enhance robots' household-task capabilities through realistic training environments [3][4].

Group 1: Introduction to RoboTidy
- RoboTidy is the first benchmark based on 3D Gaussian Splatting (3DGS) technology, creating 500 photo-realistic interactive 3D environments and providing over 8,000 expert demonstration trajectories [4].
- The project demonstrates significant potential in real-world applications, with a nearly 30% increase in task success rates for real robots after training in the RoboTidy environment [4][16].

Group 2: Importance of 3DGS
- Traditional simulation environments often suffer from low fidelity, which hampers robot performance in real-world scenarios [7].
- 3DGS offers high rendering speeds (over 100 FPS) and realistic scene reconstruction, addressing the limitations of previous methods [8][10].

Group 3: Redefining Organization Tasks
- Organizing a room is a complex long-horizon planning challenge for robots, requiring semantic understanding and common-sense reasoning [13].
- RoboTidy provides a large, high-quality dataset that captures the implicit logic of human organization, enabling robots to learn effective planning strategies [14].

Group 4: Sim-to-Real Validation
- The collaboration with Yuanli Infinite focuses on bridging the Sim-to-Real gap, a critical industry challenge [16].
- Experiments show that models trained in the RoboTidy environment outperform traditional methods, especially on unseen objects and complex backgrounds, with a task success rate improvement of 29.4% [16][17].

Group 5: Standardization and Open Source
- RoboTidy establishes a standardized evaluation system and leaderboard, addressing the lack of uniform assessment criteria for household organization tasks [19].
- The project invites global developers to contribute to advancing household service robots on a more realistic and rigorous platform [21].

Group 6: Conclusion
- The emergence of RoboTidy signifies a paradigm shift in Embodied AI research, emphasizing the need for stronger algorithms and more realistic environments [23].
- The collaboration between industry and academia, exemplified by Yuanli Infinite and top academic institutions, is seen as a catalyst for the evolution of general-purpose humanoid robots [23][24].
AAAI 2026 Oral | HUST & Xiaomi Propose a New Embodied AI Paradigm: Teaching Robots "Time Management"
具身智能之心· 2025-11-27 00:04
This article is shared for academic purposes only; contact us for removal in case of infringement. Author: Dingkang Liang et al. Editor: 具身智能之心. Paper link: https://arxiv.org/abs/2511.19430

In Embodied AI, task planning is key to letting robots understand human instructions and execute actions. However, existing research and datasets often oversimplify tasks, assuming the robot can only complete subtasks sequentially, as in Figure 1(a). For example, consider the instruction: "Turn on the microwave to heat the food (35 minutes), then clean the sink (20 minutes)." The core of this gap is that existing robots lack Operations Research (OR) knowledge and cannot identify which tasks are parallelizable and which require exclusive attention (non-parallelizable). Meanwhile, robots must not only plan over time but also precisely locate objects in complex 3D scenes (3D Grou ...
Beijing Humanoid Robot! WoW: An Omniscient World Model Trained on 2 Million Trajectories
具身智能之心· 2025-11-27 00:04
Core Insights
- The article emphasizes the necessity of large-scale, causally rich interaction data for developing world models with true physical intuition, contrasting with current models that rely on passive observation [2][3]

Group 1: WoW Model Overview
- WoW is a generative world model trained on 2 million robot interaction trajectories, featuring 14 billion parameters [2]
- The model's understanding of physical laws is probabilistic, leading to random instability and physical illusions [2]
- The SOPHIA framework is introduced to evaluate the physical plausibility of generated results and guide the model towards physical reality through iterative language instructions [2]

Group 2: Evaluation and Performance
- The WoWBench benchmark was created to systematically assess the model's physical consistency and causal reasoning capabilities [3]
- WoW achieved leading performance in both manual and automated evaluations, particularly excelling in adherence to physical laws (80.16%) and instruction comprehension (96.53%) [3]
- The research provides solid evidence that large-scale real-world interactions are essential for cultivating AI's physical intuition [3]

Group 3: Live Event and Discussion
- A live session is scheduled to discuss the latest open-source embodied world model WoW 1.0, covering trends in world model development and breakthroughs in causal and physical consistency [7]
- Key highlights include the architecture of agents that imagine, act, and reflect, as well as practical application scenarios [7]
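The evaluate-and-refine loop described for SOPHIA (generate a rollout, score its physical plausibility, feed language feedback back in) can be sketched abstractly. This is a hypothetical illustration of the pattern only; the function names, critic interface, and scoring scale below are assumptions, not WoW's actual API.

```python
# Hypothetical sketch of a SOPHIA-style loop: generate a rollout, score its
# physical plausibility, and append the critic's language feedback to the
# prompt for the next attempt. All names and interfaces are illustrative.
def refine_with_critic(generate, critic, instruction, max_rounds=3, threshold=0.8):
    prompt = instruction
    best = None
    for _ in range(max_rounds):
        rollout = generate(prompt)            # world-model rollout for this prompt
        score, feedback = critic(rollout)     # plausibility in [0, 1] plus a critique
        if best is None or score > best[0]:
            best = (score, rollout)           # keep the most plausible rollout so far
        if score >= threshold:
            break                             # good enough: stop iterating
        # Fold the critic's language feedback into the next prompt.
        prompt = f"{instruction}\nFix: {feedback}"
    return best[1]
```

The loop returns the best rollout seen rather than the last one, so a late refinement that regresses does not discard an earlier, more plausible result.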
How Does SLAM Differ from Visual-Language / Goal Navigation?
具身智能之心· 2025-11-27 00:04
Core Insights
- Goal-oriented navigation empowers robots to autonomously complete navigation tasks based on goal descriptions, marking a significant shift from traditional visual-language navigation [2]
- The technology has been successfully implemented across various verticals, enhancing service efficiency in delivery, healthcare, and hospitality sectors [4]
- The evolution of goal-oriented navigation can be categorized into three generations, each showcasing advancements in methodologies and technologies [6][8][10]

Group 1: Technology Overview
- Goal-oriented navigation is a key aspect of embodied navigation, relying on language understanding, environmental perception, and path planning [2]
- The transition from explicit instruction-based navigation to autonomous decision-making is crucial for robots to interpret and navigate complex environments [2]
- The integration of computer vision, reinforcement learning, and 3D semantic understanding is essential for achieving effective navigation [2]

Group 2: Industry Applications
- The technology has been applied in terminal delivery scenarios, enabling robots to adapt to dynamic environments and human interactions [4]
- Companies like Meituan and Starship Technologies have deployed autonomous delivery robots in urban settings, showcasing the practical application of this technology [4]
- In healthcare and hospitality, companies such as Aethon and Jianneng Technology have successfully implemented service robots for autonomous delivery of medications and meals [4]

Group 3: Technological Evolution
- The first generation of goal-oriented navigation focused on end-to-end methods using reinforcement and imitation learning, achieving significant progress in PointNav and image navigation tasks [6]
- The second generation introduced modular approaches that explicitly construct semantic maps, enhancing performance in zero-shot object navigation tasks [8]
- The third generation incorporates large language models (LLMs) to improve exploration strategies and open-vocabulary target-matching accuracy [10]

Group 4: Learning and Development Challenges
- The complexity of embodied navigation requires knowledge across multiple domains, making it challenging for newcomers to enter the field [11]
- A new course has been developed to address these challenges, providing a structured learning path for mastering goal-oriented navigation technologies [11][12]
- The course emphasizes practical application, helping learners transition from theoretical knowledge to real-world implementation [12][13]

Group 5: Course Structure
- The course is divided into several chapters, covering core frameworks, Habitat simulation, end-to-end methodologies, modular navigation architectures, and LLM/VLM-driven systems [15][17][19][21]
- Practical assignments allow students to apply their knowledge in real-world scenarios, focusing on algorithm replication and deployment [23][27]
- The course aims to equip participants with the skills necessary for independent research and development in goal-oriented navigation [30]
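The second-generation modular, map-based pattern described above can be sketched as a single decision step: consult the semantic map for the goal category; if it is already mapped, navigate to it, otherwise pick a frontier to explore. The map and frontier representations below are hypothetical simplifications for illustration, not the API of any cited system.

```python
# Hypothetical sketch of one decision step in a modular goal-navigation
# pipeline: exploit the semantic map if the goal is known, else explore.
def next_waypoint(semantic_map, frontiers, goal_label):
    """semantic_map: {label: (x, y)} of objects seen so far.
    frontiers: list of (x, y) boundaries between mapped and unmapped space."""
    if goal_label in semantic_map:
        return ("navigate", semantic_map[goal_label])  # goal already mapped: go there
    if frontiers:
        return ("explore", frontiers[0])               # keep expanding the map
    return ("fail", None)                              # nothing left to explore

print(next_waypoint({"chair": (2, 3)}, [(5, 5)], "chair"))  # ('navigate', (2, 3))
print(next_waypoint({}, [(5, 5)], "mug"))                   # ('explore', (5, 5))
```

Real systems replace the first-frontier choice with a learned or heuristic frontier scorer, which is exactly where the third generation's LLM-guided exploration plugs in.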
AAAI'26 Oral | HUST & Xiaomi Propose a New Paradigm: Teaching Robots "Time Management", Boosting Task Efficiency by Over 30%!
具身智能之心· 2025-11-26 10:00
Author: Dingkang Liang et al. Editor: 具身智能之心. Paper link: https://arxiv.org/abs/2511.19430 Code link: https://github.com/H-EmbodVis/GRANT

When cooking, people usually clean the sink while the microwave heats the food, rather than rigidly staring at the microwave countdown. Today's embodied robots, however, tend to single-mindedly finish one task before starting the next. Recently, the paper "Cook and Clean Together: Teaching Embodied Agents for Parallel Task Execution" from Huazhong University of Science and Technology (Xiang Bai's team) and Xiaomi was accepted as an Oral Presentation at AAAI 2026. This work is the first to introduce Operations Research (OR) knowledge into 3D embodied task planning. The research team proposes ...
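The parallelizable vs. non-parallelizable distinction can be made concrete with a small scheduling computation on the paper's own microwave/sink example. This is a hypothetical illustration of the underlying scheduling idea, not the GRANT implementation: a task that runs unattended can overlap with one that demands the agent's full attention.

```python
# Hypothetical sketch of sequential vs. parallel task scheduling for the
# microwave (35 min, unattended) / sink (20 min, attended) example.
# Not the GRANT implementation; names and model are illustrative.
from dataclasses import dataclass

@dataclass
class Subtask:
    name: str
    duration: int      # minutes
    needs_agent: bool  # True if the agent must attend for the whole duration

def sequential_makespan(tasks):
    # Naive plan: execute every subtask strictly one after another.
    return sum(t.duration for t in tasks)

def parallel_makespan(tasks):
    # Overlapping plan: unattended tasks run in the background while the
    # agent works through attended ones; total time is the larger of the
    # attended workload and the longest background task.
    attended = sum(t.duration for t in tasks if t.needs_agent)
    background = max((t.duration for t in tasks if not t.needs_agent), default=0)
    return max(attended, background)

tasks = [
    Subtask("heat food in microwave", 35, needs_agent=False),
    Subtask("clean the sink", 20, needs_agent=True),
]
print(sequential_makespan(tasks))  # 55
print(parallel_makespan(tasks))    # 35
```

Here overlapping the two subtasks cuts total time from 55 to 35 minutes, roughly the 30%-plus efficiency gain the headline refers to; the paper's contribution is teaching an embodied agent to discover such overlaps itself.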
Embodied AI: A Thesis "Rescue" Has Arrived!
具身智能之心· 2025-11-26 10:00
Core Viewpoint
- The article promotes a comprehensive thesis guidance service that addresses various challenges faced by students in research and writing, particularly in advanced fields like multimodal models and robotics.

Group 1: Thesis Guidance Service
- The service offers one-on-one customized guidance in cutting-edge research areas such as multimodal large models, visual-language navigation, and embodied intelligence [1][2].
- It provides a full-process closed-loop support system, covering topic innovation, experimental design, code debugging, writing, and submission strategies to help produce high-quality results quickly [2].
- The guidance is provided by a team of experienced mentors from prestigious institutions like CMU, Stanford, and MIT, with expertise in top-tier conferences [1][3].

Group 2: Dual-Perspective Approach
- The service emphasizes both academic publication and practical application, focusing on real-world value such as improving the robustness of robotic grasping and optimizing navigation in real time [3].
- Students among the first 10 inquiries can receive free matching with dedicated mentors for in-depth analysis and tailored publication advice [4].
The 具身智能之心 Technical Exchange Group Has Been Established!
具身智能之心· 2025-11-26 10:00
Group 1
- A technical exchange group focused on embodied intelligence has been established, covering areas such as VLA, VLN, remote operation, Diffusion Policy, reinforcement learning, VLA+RL, sim2real, multimodal large models, simulation, motion control, target navigation, mapping and localization, and navigation [1]
- Interested individuals can add the assistant's WeChat AIDriver005 to join the community [2]
- To expedite the joining process, include a note with your institution/school, name, and research direction [3]
A 264 Million Yuan Order! Setting a New Global Humanoid Robot Record
具身智能之心· 2025-11-26 04:00
For more industry-related questions, join our embodied intelligence community of nearly 200 organizations and nearly 3,000 members.

On November 25, UBTECH officially announced that it won the bid for the humanoid robot data collection and testing center and AI science-education demonstration project in Fangchenggang, Guangxi, worth 264 million yuan, with its latest Walker S2 humanoid robot as the main product. The deployments cover passenger and personnel guidance at border ports, sentry patrol inspection, logistics, commercial services, and facility inspection at large domestic steel, copper, and aluminum manufacturing bases, with delivery expected in December.

This robot's orders this year now total 1.1 billion yuan, the world's largest sales figure for a single humanoid robot model. Deliveries of the S2 began this month, mainly to manufacturing and logistics, which has brought confidence to the global humanoid robot market.

In closing ...
Is Robot Basketball Worth Pursuing? HKUST Unlocks the World's First Real-World Basketball Robot Demo!
具身智能之心· 2025-11-26 00:05
Editor: 量子位

A 1.3-meter robot "little potato" pulling off a remarkably smooth three-step layup. Don't get the wrong idea: this Unitree G1 is not entering the NBA draft just yet, but the "real-world basketball" skill it has just unlocked puts it not far from a starting spot in a "Village BA" game. Reportedly, this is the world's first robot demo to complete basketball moves in a real-world setting, from a research team at the Hong Kong University of Science and Technology. The team has not yet released full technical details, but given their earlier work on robot basketball, this demo is most likely an improvement built on that prior research. Let's take a closer look.

SkillMimic-v2

First is SkillMimic-V2: Learning Robust and Generalizable Interaction Skills from Sparse and Noisy Demonstrations, accepted at SIGGRAPH 2025. SkillMimic-V2 aims to solve the problem of sparse, noisy, and insufficiently covering demonstration trajectories in reinforcement learning from interaction demonstrations (RLID). It ...
Beyond 270,000 Hours of Real-World Manipulation Trajectories and GEN-0: What Other Generalist AI Highlights Are Worth a Deep Dive?
具身智能之心· 2025-11-26 00:05
On November 4, Generalist AI released GEN-0, an embodied foundation model of unprecedented data scale. Founded by former Google DeepMind senior research scientist Pete Florence, with Andrew Barry as CTO and Andy Zeng as chief scientist, this embodied-AI unicorn has stunned the field twice within just a few months through results published on its website: last time with four dual-arm long-horizon manipulation videos of high task difficulty and precision, and this time with GEN-0.

GEN-0's strength rests on pretraining over Generalist AI's in-house robot dataset: 270,000 hours of real-world manipulation trajectories, currently the largest dataset in the embodied domain, with 300 million trajectories for garment handling alone. By comparison, DROID contains some 70,000 demonstration trajectories, Agibot World and Open X-Embodiment each exceed one million trajectories, and π0.5 collected roughly 400 hours of real-robot data in mobile manipulation settings. In terms of order of magnitude, Gener ...
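The scale gap quoted above can be made concrete with a quick comparison. The counts below are the trajectory figures as reported in the text, treated as order-of-magnitude estimates (the "over one million" and "seventy-odd thousand" figures are rounded):

```python
# Order-of-magnitude comparison of the trajectory counts quoted in the text.
datasets = {
    "GEN-0 (garment handling alone)": 300_000_000,
    "Agibot World / Open X-Embodiment": 1_000_000,  # "over one million"
    "DROID": 70_000,                                # "some 70,000"
}
baseline = datasets["DROID"]
for name, n in datasets.items():
    # Express each dataset as a multiple of DROID's trajectory count.
    print(f"{name}: {n:,} trajectories ({n / baseline:,.0f}x DROID)")
```

Even counting garment handling alone, GEN-0's corpus is several thousand times the size of DROID's, which is the quantitative point the article is making.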