Does it take cutting the suit open to prove it? The whole internet is in an uproar! Is Xiaopeng's humanoid robot actually a real person?
具身智能之心· 2025-11-06 05:28
Has physical AI already reached the point of fooling the eye? Is this a robot or a real person? Since yesterday, much of the global internet has been debating Xiaopeng's humanoid robot IRON, and everyone's inner Sherlock Holmes woke up at once: Xiaohongshu users speculated that the robot in the gait demo at the launch event was really a human in a suit. Facing the flood of discussion, Xiaopeng seemed entirely unbothered. Under one user's comment that "there is 100% a real person inside," He Xiaopeng replied, "Thanks for the recognition." IRON stands about 1.78 m tall, a bit taller than robots such as 1X's NEO, and weighs 70 kg. Its hands use miniature harmonic joints and offer 22 degrees of freedom (DOF), only 5 fewer than a human hand, already enough for fine everyday tasks such as folding clothes, wiping tables, and tidying objects. The full body has 65 degrees of freedom, 10 more than NEO, with human-like spinal motion. On November 6, Xiaopeng Motors held AI Day 202 ... at its new Guangzhou headquarters.
Xiaopeng AI Day unveiled yesterday | Looks, algorithms, and compute all maxed out! "IRON: has the most human-like humanoid robot arrived?!"
具身智能之心· 2025-11-06 03:27
Core Viewpoint - Xiaopeng has launched the next-generation humanoid robot IRON, designed for real-world scenarios with easy data acquisition and strong generalization capabilities [1]. Group 1: Robot Features - The robot adopts a deliberately mechanical design with a 3D curved screen for a face, avoiding an uncannily lifelike appearance [4]. - It is capable of standing, sitting, squatting, lying down, and climbing, and uses soft materials for its skin to mimic human characteristics [6]. - Each hand has 22 degrees of freedom, showcasing its dexterity [8]. Group 2: Strategic Layout - Xiaopeng is building a comprehensive ecosystem that integrates autonomous driving, Robotaxi, and humanoid robots, with IRON as the latest addition [9]. Group 3: Technical Specifications - The robot is equipped with what Xiaopeng describes as the first all-solid-state battery in a humanoid robot, cutting weight by 30% and extending battery life by 30% [11]. - It carries three Turing AI chips delivering 2250 TFLOPS of compute [11]. - It combines VLT, VLA, and VLM models for advanced cognitive capabilities [13]. - It includes active safety protection features [19]. Group 4: Production Timeline - Xiaopeng plans mass production of the IRON robot by 2026, targeting home and industrial applications [21].
Everyone is working on embodied intelligence, but quite a few people get stuck right here...
具身智能之心· 2025-11-06 00:03
Core Insights - The article discusses the challenges faced by individuals in the field of embodied intelligence, particularly in areas such as computational power, data collection, model optimization, and practical project implementation [1][2][6] - It emphasizes the importance of quality data collection and suggests starting with basic teleoperation to mitigate noise in data, which can hinder model training [1] - The community has established a platform for sharing knowledge, resources, and job opportunities in the field of embodied intelligence, aiming to cultivate talent and facilitate industry connections [2][12][16] Data Collection - Recommendations for data collection include focusing on the quality of data and starting with basic teleoperation techniques [1] - The article highlights the potential of using real2sim2real methods to address insufficient data issues [1] Model Optimization - For those using robotic arms, the article suggests exploring RL+VLA approaches, while cautioning against complex models for humanoid robots due to the difficulty in achieving effective results [1] Community and Resources - The community has organized various resources, including technical routes for beginners, industry-related project solutions, and job referral mechanisms with multiple companies in the field [2][10][12] - A comprehensive list of over 40 open-source projects and 60 datasets related to embodied intelligence has been compiled to assist members in their research and development efforts [13][28][34] Learning and Development - The community offers a structured learning path for newcomers, covering various technical stacks and routes to facilitate entry into the field [8] - Members can engage in discussions and seek advice from industry experts, enhancing their understanding and networking opportunities [12][16]
BAAI's embodied framework Thor is open-sourced: toward human-level whole-body control that "stands firm" under strong perturbations
具身智能之心· 2025-11-06 00:03
Core Viewpoint - The article discusses the BAAI Thor framework, which aims to give humanoid robots the ability to perform complex physical interactions in real-world environments, achieving human-level whole-body reactions and dynamic stability [7][8][31]. Group 1: Challenges in Humanoid Robot Control - Humanoid robots face two main challenges in the transition from performers to laborers: the lack of human-like reaction mechanisms and the complexity of high-dimensional coordination control [9]. - Without effective human-like reaction mechanisms, robots perform poorly under large external forces, often falling back on rigid resistance strategies that lead to instability [9][10]. - The high dimensionality of the control problem complicates policy optimization: many degrees of freedom and strong coupling between joints make learning and adaptation difficult [10][11]. Group 2: BAAI Thor Framework - The framework integrates biomechanical principles with a novel network structure so that humanoid robots can respond in a coordinated, stable way during high-intensity force interactions [8][12]. - It has two core components: the Force Adaptive Trunk Tilt Reward (FAT2), which guides the robot to adjust its posture according to external forces, and a decoupled network structure that tackles the high-dimensional coordination challenge [13][17]. Group 3: Experimental Validation - Thor was tested on the Unitree G1 robot, which successfully pulled a car weighing approximately 1400 kg, demonstrating whole-body coordination and dynamic balance under extreme load [18][20]. - Thor outperformed various baseline algorithms on force-interaction tasks, reaching a peak pulling force of 167.7 N, about 48% of the robot's body weight and a 68.9% improvement over the best baseline method [26][30]. - Quantitative analysis showed that the FAT2 reward contributed roughly 80%-90% of the performance gains by markedly improving adaptive posture adjustment [30].
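The summary describes FAT2 only at the level of intent: lean in proportion to the sensed external force, the way a human leans forward when towing a load. Below is a minimal sketch of such a reward term, assuming a Gaussian shaping form; the function name, gains, and functional form are assumptions, not the paper's actual definition.

```python
import numpy as np

def fat2_reward(ext_force_xy, trunk_tilt_xy, k_tilt=0.002, sigma=0.1):
    """Sketch of a force-adaptive trunk-tilt reward (hypothetical form).

    Idea from the article: rather than rewarding an unconditionally upright
    trunk, reward leaning *against* the external pull so body weight helps
    resist it, as a human does when dragging a heavy object.
    """
    ext_force_xy = np.asarray(ext_force_xy, dtype=float)    # sensed force, N
    trunk_tilt_xy = np.asarray(trunk_tilt_xy, dtype=float)  # tilt angles, rad
    # Target tilt grows with force magnitude and points opposite the force.
    desired_tilt = -k_tilt * ext_force_xy
    # Gaussian shaping: reward is maximal when actual tilt matches the target.
    err = np.linalg.norm(trunk_tilt_xy - desired_tilt)
    return float(np.exp(-(err / sigma) ** 2))
```

At the reported 167.7 N peak pull, k_tilt = 0.002 would request roughly a 0.34 rad (about 19 degree) lean into the load, which is in the range a human adopts when towing; in practice such gains would be tuned during RL training.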
Latest from Peking University & BAAI! RoboOS-NeXT: "memory + hierarchical architecture" enables general multi-robot collaboration
具身智能之心· 2025-11-06 00:03
Core Insights - The article discusses the RoboOS-NeXT framework, which addresses the challenges in multi-robot collaboration by integrating a unified memory system and a hierarchical architecture for effective task execution and fault tolerance [1][4][23]. Group 1: Challenges in Multi-Robot Collaboration - Current multi-robot collaboration faces a "triple dilemma": reliance on single-robot memory, difficulty in adapting to heterogeneous robots, and lack of fault recovery capabilities [2][3]. - Existing solutions either fail to accumulate long-term experience or struggle with dynamic task allocation and fault tolerance [2][3]. Group 2: RoboOS-NeXT Framework - RoboOS-NeXT employs a "spatio-temporal entity unified memory (STEM)" and a "brain-cerebellum architecture" to facilitate global memory sharing and dynamic task execution [3][4]. - The framework consists of two core components: STEM for information integration and the brain-cerebellum model for planning and execution [4][9]. Group 3: Core Components of RoboOS-NeXT - **STEM** integrates spatial, temporal, and entity memories, providing a unified interface for all robots and eliminating information silos [6][7][8]. - **Brain-Cerebellum Architecture** separates global planning from local execution, ensuring efficient task decomposition and precise action control [9][10]. Group 4: Execution Workflow - The execution process involves four steps: task decomposition, dynamic scheduling, distributed execution, and dynamic memory updating [10][12]. - This workflow ensures that tasks are efficiently completed, even in the face of robot failures or tool malfunctions [10][12]. Group 5: Experimental Results - RoboOS-NeXT demonstrated superior performance in various scenarios, showing strong lifelong adaptability, collaboration scalability, and fault recovery capabilities [13][14][15]. - In adaptability tests, RoboOS-NeXT maintained a success rate of over 75% in long-sequence tasks, while the baseline without memory failed completely [13][14]. - The framework also showed significant improvements in execution efficiency, with average execution steps per task reduced by 20%-70% compared to the baseline [17][18]. Group 6: Key Conclusions and Future Directions - The unified memory is essential for collaboration, enabling lifelong adaptability and robust scheduling [23][25]. - Future enhancements may include multi-modal memory integration, end-to-end task optimization, and real-time performance improvements [25][26].
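The summary does not give the STEM interface, but its role can be sketched: one shared store for spatial, temporal, and entity memory, queried by both the "brain" planner and each robot's "cerebellum" executor. All class and method names below are hypothetical, not the actual RoboOS-NeXT API.

```python
import time
from dataclasses import dataclass, field

@dataclass
class MemoryEntry:
    kind: str    # "spatial" | "temporal" | "entity"
    key: str     # e.g. an object id, a location id, or an event id
    value: dict  # structured payload (pose, state, task outcome, ...)
    timestamp: float = field(default_factory=time.time)

class STEMMemory:
    """One shared store for all robots, removing per-robot memory silos."""

    def __init__(self):
        self._entries: list[MemoryEntry] = []

    def write(self, kind: str, key: str, value: dict) -> None:
        # Any robot (or the global planner) appends to the same log,
        # so experience accumulates across the whole fleet over time.
        self._entries.append(MemoryEntry(kind, key, value))

    def query(self, kind: str, key: str | None = None) -> list[MemoryEntry]:
        # Unified read interface: planner and executors see one world state.
        return [e for e in self._entries
                if e.kind == kind and (key is None or e.key == key)]
```

The fault-recovery behavior described above then follows naturally: if a robot fails mid-task, the planner re-queries the same shared memory and reassigns the remaining steps to another robot.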
Multi-task, all-scenario, cross-embodiment general mobility: Galaxy General Robotics releases a panoramic navigation foundation model
具身智能之心· 2025-11-06 00:03
Core Viewpoint - The article discusses advances in robot navigation models, focusing on the launch of NavFoM (Navigation Foundation Model) by Galaxy General Robotics, a significant leap in robotic navigation that enables more autonomous, adaptable robots across environments [3][9][27]. Group 1: Technological Advancements - NavFoM is described as the world's first cross-embodiment panoramic navigation foundation model, unifying navigation tasks such as Vision-and-Language Navigation, Object-goal Navigation, Visual Tracking, and Autonomous Driving in a single framework [3][9]. - NavFoM lets robots perceive their environment and make navigation decisions autonomously in unknown settings, moving beyond simple following tasks [9][10]. - A unified learning paradigm shares knowledge across tasks and robot forms, improving the efficiency of training and deployment [13][14]. Group 2: Key Features - NavFoM supports both indoor and outdoor scenarios, runs zero-shot without mapping or additional training data, and adapts to various robot types, including quadrupeds, wheeled humanoids, drones, and cars [11][12]. - The model incorporates two key innovations: TVI Tokens for understanding time and direction, and the BATS strategy for efficient sampling of video frames, allowing real-time responses while conserving computational resources [17][19]. - The training dataset includes over 8 million cross-task navigation samples and 4 million open-ended question-answer pairs, significantly enhancing its learning capabilities [21][23]. Group 3: Application and Impact - NavFoM achieves state-of-the-art results on multiple international benchmarks, generalizing across tasks and environments without task-specific fine-tuning [25]. - The model has driven various robot forms through complex tasks, a significant step toward embodied navigation intelligence [25][27]. - NavFoM is positioned as the foundation of a comprehensive navigation stack spanning indoor navigation to urban environments [29][30].
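The summary only says BATS samples video frames under a fixed compute budget to keep responses real-time. Here is a sketch of one plausible budget-aware scheme (denser sampling near the most recent frames); the recency-biased rule and all names are assumptions, not the published method.

```python
import numpy as np

def bats_sample(num_frames: int, budget: int, recency_bias: float = 2.0) -> np.ndarray:
    """Pick at most `budget` frame indices out of a `num_frames` history."""
    if num_frames <= budget:
        return np.arange(num_frames)  # everything fits: keep all frames
    # Map a uniform grid through u**(1/recency_bias): spacing shrinks as
    # u -> 1, so the selected indices cluster near the newest frames.
    u = np.linspace(0.0, 1.0, budget)
    idx = (num_frames - 1) * u ** (1.0 / recency_bias)
    # Rounding can merge neighbors, so the result may fall slightly under budget.
    return np.unique(np.round(idx).astype(int))
```

For example, bats_sample(600, 64) keeps a sparse long-horizon memory of a 600-frame history while preserving near frame-level detail around the present, which is the trade-off the article attributes to BATS.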
Experts in embodied world models & data collection are welcome to join us!
具身智能之心· 2025-11-05 09:00
Group 1 - The article emphasizes the value of embodied world models, robotic control, and data collection as significant industry directions with certain barriers to entry [2] - The company seeks to collaborate with experts in the field to develop courses or practical projects related to these topics, aiming to provide insights for professionals currently working in these areas [2][3] - Interested individuals with at least one year of industry experience or a publication in a CCF-A level conference are encouraged to participate in the collaboration [3] Group 2 - The company offers competitive salaries and resource sharing for collaborators, with opportunities for part-time involvement [5]
Tsinghua team proposes AirScape: a low-altitude world model with controllable action intent, fully open-sourced!
具身智能之心· 2025-11-05 09:00
A key part of the human sense of space is anticipating how one's own movement will change what one sees; this is essential for task and action decisions while moving through space. Rollout and imagination are therefore among the foundational problems of embodied intelligence, posed as prediction: if the agent executes a movement intent, how will its embodied observations change? Existing world-model research focuses mainly on humanoid robots and autonomous driving; most such models operate on a 2D plane with a limited action space. Specifically, the key challenges include: ... To address this, the Tsinghua University team proposes AirScape, a generative world model designed specifically for six-degree-of-freedom (6DoF) aerial embodied agents. A video-generation foundation model is supervised fine-tuned on a proposed dataset of 11k video-intent pairs; this stage gives the model a basic understanding of low-altitude action intents and the corresponding generation ability. Given the current low-altitude visual observations and an action intent, AirScape can roll out the future sequence of observations. The project's dataset and code are fully open-sourced. Low-altitude world model dataset: to support the low-altitude world ...
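A minimal sketch of what one "video-intent pair" training sample might look like, based only on the description above (clips paired with 6DoF movement intents, used to fine-tune a video-generation model). Field names and shapes are hypothetical; the open-sourced dataset defines the actual schema.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class VideoIntentPair:
    frames: np.ndarray  # (T, H, W, 3) uint8 low-altitude video clip
    intent: np.ndarray  # (6,) 6DoF movement intent: dx, dy, dz, roll, pitch, yaw
    caption: str        # optional natural-language description of the intent

def to_training_example(pair: VideoIntentPair, context_len: int = 8) -> dict:
    """Split a clip into conditioning context and prediction target: the
    world model sees the first `context_len` frames plus the intent, and
    is trained to generate the remaining frames."""
    return {
        "context": pair.frames[:context_len],  # what the agent has observed
        "intent": pair.intent,                 # what it intends to do
        "target": pair.frames[context_len:],   # what it should then observe
    }
```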
This robot dog from Suzhou took the championship at IROS
具身智能之心· 2025-11-05 00:02
Core Viewpoint - The article highlights the rapid development and strategic pivot of Zhishen Technology, which has shifted its focus from humanoid robots to quadruped robotic dogs, achieving significant success in competitions and positioning itself as a comprehensive technology service provider in the field of embodied intelligence [5][28]. Group 1: Company Overview - Zhishen Technology was founded in 2023 and quickly gained attention by winning the IROS 2025 quadruped robot competition with its "Steel Coin L1" model, marking a significant achievement for a newly established startup [5][8]. - The company initially aimed to develop heavy-load humanoid robots but realized the market potential and technological maturity of quadruped robots, leading to a strategic shift that has fueled its rapid growth [5][6]. Group 2: Technological Development - The quadruped robot market is becoming increasingly competitive, with a convergence of technical routes among various manufacturers, making stability a crucial factor for success [9][10]. - Zhishen Technology emphasizes the importance of creating a reliable and stable platform for its robots, which is essential for executing advanced algorithms effectively [12][13]. - The company has developed a high-power density integrated joint, CHAMP P65, which offers a peak torque output of 48N·m and a torque density of 92.3 Nm/kg, positioning it at the forefront of the industry [24]. Group 3: Market Positioning and Strategy - Zhishen Technology positions itself as a "full-chain technology service provider" in embodied intelligence, focusing on the development and manufacturing of robotic platforms while avoiding direct involvement in end-user applications [28][32]. - The company aims to bridge the gap between experimental prototypes and commercially viable products, addressing the engineering challenges that arise in the transition from lab to market [16][30]. - By maintaining a focus on core competencies and avoiding distractions from diverse application scenarios, Zhishen Technology seeks to optimize its resources and enhance product quality [33][34]. Group 4: Future Outlook - The company plans to continue enhancing its motion control capabilities and explore the integration of visual perception and intelligent task execution in its robotic dogs [41][42]. - Zhishen Technology aims to build a technology flywheel that leverages cutting-edge research from universities to iterate on its products and create value in various industry applications [42].
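A quick sanity check on the CHAMP P65 figures quoted above, assuming "torque density" means peak torque divided by joint mass (the article does not define the term):

```python
peak_torque = 48.0      # N*m, quoted peak output of the CHAMP P65 joint
torque_density = 92.3   # N*m/kg, quoted torque density
implied_joint_mass = peak_torque / torque_density
print(f"implied joint mass: {implied_joint_mass:.2f} kg")  # ~0.52 kg per joint
```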
This platform supports pi0 and pi0.5~
具身智能之心· 2025-11-05 00:02
Core Viewpoint - Imeta-Y1 is a lightweight, cost-effective robotic arm designed specifically for beginners and researchers in the field of embodied intelligence, enabling low-cost and efficient algorithm validation and project development [2][5]. Group 1: Product Features - The robotic arm offers a complete open-source toolchain and code examples, facilitating a seamless process from data collection to model deployment [3][16]. - It supports dual-language interfaces in Python and C++, allowing users to quickly get started regardless of their programming background [3][17]. - Compatibility with ROS1 and ROS2 is provided, along with URDF models for smooth transitions between simulation and real-world applications [3][5]. - The arm features high-precision motion control, low power consumption, and an open hardware architecture, supporting seamless integration from simulation to real machine [5][6]. Group 2: Technical Specifications - The robotic arm has a weight of 4.2 kg, a rated load of 3 kg, and 6 degrees of freedom, with a working radius of 612.5 mm and a repeatability precision of ±0.1 mm [8][18]. - It operates at a supply voltage of 24V and communicates via CAN, with external interfaces for power and CAN [8][18]. - The joint motion range includes J1: -165° to 165°, J2: -180° to 0°, J3: 0° to 180°, J4: -128° to 86°, J5: -90° to 90°, and J6: -150° to 150° [8][18]. Group 3: Development and Support - The company provides a comprehensive open-source SDK, including drivers, API interfaces, sample code, and documentation, supporting rapid application development [25][31]. - A full-process toolchain is available for data collection, model training, and inference deployment, compatible with mainstream frameworks like TensorFlow and PyTorch [31][28]. - The company ensures timely after-sales support with a 24-hour response time, and offers bulk purchase discounts and project development support [18][43].
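A hedged sketch of commanding the arm inside its published joint limits. Only the limits, the 24V/CAN transport, and the existence of a Python SDK come from the article; the module, class, and method names below are hypothetical stand-ins for the vendor's real API.

```python
# Joint limits in degrees, from the spec table above.
JOINT_LIMITS_DEG = {
    "J1": (-165, 165), "J2": (-180, 0), "J3": (0, 180),
    "J4": (-128, 86),  "J5": (-90, 90), "J6": (-150, 150),
}

def clamp_targets(targets_deg: dict[str, float]) -> dict[str, float]:
    """Clip each requested joint angle into its allowed range.
    Expects a target for all six joints J1..J6."""
    return {j: max(lo, min(hi, targets_deg[j]))
            for j, (lo, hi) in JOINT_LIMITS_DEG.items()}

# Usage with a hypothetical SDK (not the real module/class names):
# from imeta_y1 import Arm
# arm = Arm(can_interface="can0")  # 24V supply, CAN transport per the spec
# arm.move_joints(clamp_targets({"J1": 30, "J2": -90, "J3": 90,
#                                "J4": 0, "J5": 45, "J6": 0}))
```

Clamping on the host side is a cheap safeguard before commands go out over CAN, regardless of whatever limit enforcement the firmware itself performs.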