A perk for developers! One machine handles humanoid motion control, reinforcement learning, and VLN/VLA
具身智能之心· 2025-07-25 07:11
Core Viewpoint
- TRON1 is a cutting-edge research platform designed for educational and scientific purposes, featuring a modular design that supports multiple robotic forms and algorithms, catering to diverse research needs [1].

Group 1: Product Features
- TRON1 supports humanoid gait development and is well suited to reinforcement learning research, with the EDU version allowing external camera integration for navigation and perception tasks [6][24].
- The platform supports development in both C++ and Python, making it accessible to users without C++ experience [6].
- It features a "three-in-one" modular design that allows quick switching between bipedal, point-foot, and wheeled locomotion [1].

Group 2: Technical Specifications
- The platform is compatible with major simulation platforms such as NVIDIA Isaac, MuJoCo, and Gazebo, improving validation efficiency and lowering research barriers [9]; a minimal simulation sketch follows this summary.
- TRON1 can be equipped with a robotic arm for various mobile manipulation tasks, supporting both single-arm and dual-foot configurations [11].
- It integrates LiDAR and depth cameras for 3D mapping, localization, navigation, and dynamic obstacle avoidance [13].

Group 3: Hardware and Performance
- The TRON1 standard and EDU versions share similar mechanical parameters, with a payload limit of approximately 10 kg and a maximum speed of 5 m/s in wheeled locomotion [26].
- The platform is powered by an 8-core Arm Cortex-A78AE CPU and an NVIDIA Ampere-architecture GPU delivering 157 TOPS (sparse) / 78 TOPS (dense) of AI compute [16][19].
- The battery supports a maximum power draw of 1000 W, with a runtime of over 2 hours under rated conditions [26].

Group 4: User Support and Development
- Comprehensive user manuals and development guides are provided, ensuring ease of use and support for new users [29][33].
- The platform includes one year of after-sales service after acceptance, with paid maintenance and parts support available thereafter [40].
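The summary above notes that TRON1 development is done in C++ or Python and that policies can be validated in MuJoCo, Isaac, or Gazebo before deployment. As a minimal sketch of what such a validation loop looks like, the snippet below steps a MuJoCo model with a placeholder zero action; the MJCF file name `tron1_pointfoot.xml` and the floating-base layout are assumptions for illustration, not part of the vendor's SDK.

```python
import numpy as np
import mujoco

# Hypothetical MJCF file name; the vendor's actual model assets are not assumed here.
MODEL_PATH = "tron1_pointfoot.xml"

model = mujoco.MjModel.from_xml_path(MODEL_PATH)
data = mujoco.MjData(model)

# Roll out a few seconds of simulation with a placeholder action.
while data.time < 5.0:
    # A trained policy (e.g. one trained in Isaac and validated here) would map
    # data.qpos / data.qvel to actuator commands; zeros stand in for that policy.
    data.ctrl[:] = np.zeros(model.nu)
    mujoco.mj_step(model, data)

# Assuming a floating-base model, qpos[2] is the base height after the rollout.
print(f"Simulated {data.time:.2f} s, final base height: {data.qpos[2]:.3f} m")
```

The same loop structure carries over to the real robot: the simulator step is replaced by the platform's C++ or Python control interface while the policy code stays unchanged.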
Preparing to expand the embodied-AI team and bring a few people on board to build something.......
具身智能之心· 2025-07-25 07:11
We have recently been in touch with embodied-AI teams at home and abroad, and some of the problems that existed before are gradually being overcome. It is encouraging to see the field of embodied intelligence developing this quickly, with several star companies starting to prepare for IPOs; serving the community and the industry well throughout this process is something we have always insisted on.

We are recruiting roughly 10 people per city. We expect you to be an academic or engineering expert in the embodied-AI field, with more than two years of experience in embodied algorithms and robotics research.

Embodied education R&D and consulting services

We are inviting experts in embodied AI to build online embodied-education courses, enterprise consulting, and tutoring services for the industry. If you work on large models / multimodal large models, Diffusion, VLA, VLA+RL, sim2real, end-to-end methods, embodied interaction, vision-language navigation, reinforcement learning, robot motion planning, grasping and pose estimation, tactile perception, large-model deployment and quantization-aware inference, robot simulation, or related directions, you are welcome to join us in producing the best tutorials for the industry. We expect a PhD (including those in progress) or above; industry candidates should have more than two years of R&D experience.

Compensation

The more we build this platform, the more we find that the industry cannot do without everyone's joint effort, especially in its early stage. Technical isolation and secrecy can create some barriers to entry, but they are not good for the development of the industry as a whole. We have always encouraged active exchange and hope to serve as a platform that gathers talent from across the industry. We have just published the one-year anniversary post, and after this first year we hope to invite more capable experts ...
Latest from NVIDIA! ThinkAct: few-shot adaptation and long-horizon planning for complex embodied tasks
具身智能之心· 2025-07-24 09:53
Core Insights
- The article introduces ThinkAct, a dual-system framework designed to enhance the reasoning capabilities of multi-modal large language models (MLLMs) in physical environments by connecting high-level reasoning with low-level action execution [4][9][12].
- ThinkAct aims to address the limitations of existing VLA models, which struggle with long-horizon planning and adaptation to complex tasks, by using reinforced visual latent planning [4][6][9].

Group 1: Framework and Methodology
- ThinkAct employs a structured approach to VLA reasoning tasks: the model receives visual observations and textual instructions and predicts actions, effectively linking abstract planning with low-level control [12][21]; a schematic sketch of this dual-system split follows below.
- The framework uses reinforcement learning to strengthen the reasoning capabilities of MLLMs, encouraging them to generate low-level actions after reasoning through the task [13][19].
- A novel action-aligned visual feedback mechanism is introduced to capture long-term goals and encourage visual associations during planning [14][18].

Group 2: Performance Evaluation
- ThinkAct demonstrates superior performance across robotic manipulation tasks, achieving a top success rate of 84.4% on the LIBERO benchmark and outperforming models such as DiT-Policy and CoT-VLA [25][26].
- In the SimplerEnv evaluation, ThinkAct outperformed baseline action models by significant margins, achieving overall scores of 71.5%, 65.1%, and 43.8% across different settings [25].
- The framework also excels in embodied reasoning tasks, showing advantages in long-horizon and multi-step planning, as evidenced by its performance on the EgoPlan-Bench2 and RoboVQA benchmarks [26][27].

Group 3: Qualitative Insights
- The article provides qualitative examples of ThinkAct's reasoning and execution, showing its ability to decompose instructions into meaningful sub-goals and visualize planned trajectories [30][31].
- The reinforcement learning stage significantly enhances reasoning, allowing the model to understand tasks and environments better than cold-start models [31][32].

Group 4: Adaptability and Error Correction
- ThinkAct demonstrates effective few-shot adaptation, generalizing to unseen environments and new skills from a small number of demonstrations [35][37].
- The framework can detect execution errors and perform self-correction, using its structured reasoning to reconsider the task and generate a corrective plan when a failure occurs [37][38].
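ThinkAct's core idea is a dual-system split: a slow reasoning module (the MLLM) produces a visual latent plan, and a fast action module conditions on that plan at every control step. The sketch below renders that split schematically in PyTorch; the module names, dimensions, and MLP stand-ins are invented for illustration and do not reproduce the paper's actual architectures or training objectives.

```python
import torch
import torch.nn as nn

class ReasoningModule(nn.Module):
    """Stand-in for the MLLM: fuses image and instruction features into a latent plan."""
    def __init__(self, feat_dim=512, plan_dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 * feat_dim, 256), nn.ReLU(), nn.Linear(256, plan_dim))

    def forward(self, image_feat, text_feat):
        return self.net(torch.cat([image_feat, text_feat], dim=-1))

class ActionModule(nn.Module):
    """Stand-in for the low-level controller: conditions on observation + latent plan."""
    def __init__(self, obs_dim=32, plan_dim=64, act_dim=7):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim + plan_dim, 128), nn.ReLU(), nn.Linear(128, act_dim))

    def forward(self, obs, plan):
        return self.net(torch.cat([obs, plan], dim=-1))

reasoner, actor = ReasoningModule(), ActionModule()
image_feat, text_feat = torch.randn(1, 512), torch.randn(1, 512)

# Slow loop: reason once per (sub-)task to produce a latent plan.
plan = reasoner(image_feat, text_feat)

# Fast loop: the action module reuses the same plan across many control steps.
for _ in range(10):
    obs = torch.randn(1, 32)
    action = actor(obs, plan.detach())
```

In the paper the latent plan is additionally shaped by reinforcement learning with action-aligned visual feedback; the sketch only shows the interface between the two systems.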
Embodied-AI companies pulled both ways: large funding rounds on one side, unable to hire on the other.......
具身智能之心· 2025-07-24 09:53
It is surreal: embodied AI has a huge number of open positions on one side and companies that cannot hire on the other......

Recently, members of our community came to me to vent: 峰哥, why do so many embodied-AI companies that clearly have money, with more funding than they can spend and plenty of openly posted positions, keep interviewing without extending offers, while telling the outside world they cannot find people???

For someone who went through the full autonomous-driving development cycle, the answer is actually simple. Companies have money in their pockets but no longer dare to spend it casually; they stay cautious and budget carefully for the long haul. This industry cycle will still be long; spending recklessly and without a plan is a quick way to die, and the shakeout will play out within the next 2-3 years. Many embodied-AI companies' products (including hardware, algorithms, and data) are still immature, a point we have analyzed in detail inside the 具身智能之心知识星球 community. As a result, researchers with strong results are the ones every company is competing to recruit, for example in humanoid stability, data scaling, effective data use, and generalization. The inflection point for a breakthrough in the underlying technology is not yet in sight, and everyone wants to stock up on provisions to get through the winter. For job seekers, this means you need solid technical skills on one hand and a research direction that closely fits embodied AI on the other.

具身智能之心知识星球, the largest embodied-AI technology community in China, has long been supplying the industry and individuals with talent as well as industrial and academic information. It now covers nearly all mainstream embodied-AI companies and most well-known research institutions at home and abroad. If you want to be the first to learn about the industry, job opportunities, and industry pain points, you are welcome to join us. An earnest ...
Zebra-CoT: a pioneering visual chain-of-thought dataset arrives, boosting multi-modal reasoning accuracy by 13%
具身智能之心· 2025-07-24 09:53
Core Viewpoint
- The article discusses the development of Zebra-CoT, a large-scale and diverse dataset aimed at enhancing visual reasoning in multi-modal models, addressing the weak performance of existing visual CoT approaches and the lack of high-quality training data [3][4].

Dataset Construction
- Zebra-CoT consists of 182,384 samples, providing logically interleaved text-image reasoning trajectories across four main task categories: scientific reasoning, 2D visual reasoning, 3D visual reasoning, and visual logic and strategy games [6][12].
- The dataset overcomes limitations of existing datasets by offering a diverse range of tasks and ensuring high-quality text reasoning data, unlike previous datasets that focused on single tasks or lacked clear reasoning structures [6][18].

Task Coverage
- Scientific reasoning includes geometry, physics, chemistry, and algorithm problems [9].
- 2D visual reasoning encompasses visual search and visual puzzles [9].
- 3D visual reasoning involves multi-hop object counting and robot planning [9].
- Visual logic and strategy games feature chess, checkers, mazes, and more [9].

Data Sources and Processing
- Real-world data is sourced from online resources, ensuring high-quality problem extraction and addressing the logical connections between modalities [10].
- Synthetic data is generated using templates and vision-language models (VLMs) to enhance reasoning diversity and expressiveness [10].

Model Fine-tuning and Performance
- Fine-tuning the Anole-7B model on Zebra-CoT improved accuracy from 4.2% to 16.9%, a fourfold increase, with notable improvements on visual logic benchmarks [14]; a sketch of what an interleaved sample might look like follows below.
- The Bagel-7B model, after fine-tuning, could generate high-quality interleaved visual reasoning chains, showcasing the dataset's effectiveness in developing multi-modal reasoning capabilities [14].

Limitations
- The dataset relies on template generation for synthetic data, which may limit the diversity and expressiveness of the text reasoning [18].
- Some sub-tasks within the dataset have small sample sizes, potentially affecting model performance in those areas [18].
- Fine-tuning results vary, with some tasks showing insignificant or even decreased performance, indicating a need for further optimization [18].
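The summary describes trajectories that interleave textual reasoning steps with intermediate images. The snippet below sketches one plausible way such a sample could be represented for fine-tuning; the field names, file paths, and the chess example are invented for illustration and are not the dataset's published schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ReasoningStep:
    text: str                          # one textual reasoning step
    image_path: Optional[str] = None   # optional visual "thought" rendered as an image

@dataclass
class InterleavedSample:
    question: str
    steps: List[ReasoningStep] = field(default_factory=list)
    answer: str = ""

# Hypothetical visual-logic sample in the spirit of the strategy-game category.
sample = InterleavedSample(
    question="Which move wins the chess position shown?",
    steps=[
        ReasoningStep("Identify the pinned knight on f6.", "boards/step1.png"),
        ReasoningStep("Check whether Qxf6 is defended before capturing.", "boards/step2.png"),
    ],
    answer="Qxf6#",
)
```

Under this kind of representation, fine-tuning amounts to predicting the next text or image token along such a trajectory.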
The 具身智能之心 job-hunting and networking group is here!!!
具身智能之心· 2025-07-23 15:16
At the request of our readers, we have officially started running a job-hunting community for embodied AI. The group mainly discusses the embodied-AI industry, companies, product R&D, job hunting, and job switching. If you want to meet more peers in the field and stay on top of the industry, you are welcome to join us! Scan the QR code on WeChat to add the assistant and get invited into the group; note your nickname + 具身求职 (embodied job hunting). The 具身智能之心 job-hunting and industry exchange group has been established! ...
The perception module embodied intelligence cannot do without! The best-value 3D laser scanner is here
具身智能之心· 2025-07-23 09:48
Core Viewpoint
- GeoScan S1 is presented as the most cost-effective 3D laser scanner in China, featuring a lightweight design, one-click operation, and centimeter-level precision for real-time 3D scene reconstruction [1][5].

Group 1: Product Features
- The GeoScan S1 generates point clouds at 200,000 points per second, with a maximum measurement distance of 70 meters and 360° coverage, supporting large scenes of over 200,000 square meters [1][28][31].
- It integrates multiple sensors, including a high-precision IMU and RTK, enabling it to handle complex indoor and outdoor environments effectively [33][46].
- The device supports data export in PCD, LAS, and PLY formats and runs on Ubuntu 20.04 with ROS compatibility [22]; a short example of inspecting an exported point cloud follows below.

Group 2: System Specifications
- Relative accuracy is better than 3 cm and absolute accuracy better than 5 cm [22].
- The device measures 14.2 cm x 9.5 cm x 45 cm and weighs 1.3 kg without the battery (1.9 kg with it), with a power input range of 13.8 V to 24 V [22].
- The battery capacity is 88.8 Wh, providing approximately 3 to 4 hours of operation [22][25].

Group 3: Software and Usability
- The GeoScan S1 offers a user-friendly interface and simple operation, allowing quick scanning and immediate data export without complex setup [5][42].
- It includes a 3D Gaussian data collection module for high-fidelity scene reconstruction, enabling digital replication of real-world environments [52].
- The software supports both offline and online rendering, enhancing usability across applications [5][61].

Group 4: Market Position and Pricing
- Multiple versions of the GeoScan S1 are offered, including a basic version priced at 19,800 yuan and a 3DGS offline version at 67,800 yuan, catering to diverse customer needs [61][64].
- The product is positioned as having the best price-performance ratio in the industry, integrating multiple sensors and advanced features [5][61].
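Since the scanner exports standard PCD, LAS, and PLY point clouds, downstream inspection can use common open-source tooling. The snippet below loads a PCD export with the Open3D library; the file name `scan.pcd` and the 5 cm voxel size are assumptions for illustration.

```python
import open3d as o3d

# Load a PCD export from the scanner (file name assumed for illustration).
pcd = o3d.io.read_point_cloud("scan.pcd")
print(pcd)                                    # point count and basic info
print(pcd.get_axis_aligned_bounding_box())    # rough extent of the scanned scene

# Downsample before visualization or registration to keep interaction responsive.
down = pcd.voxel_down_sample(voxel_size=0.05)
o3d.visualization.draw_geometries([down])
```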
Behavior foundation models enable efficient whole-body control of humanoid robots
具身智能之心· 2025-07-23 08:45
Core Viewpoint
- Humanoid robots are gaining unprecedented attention as multifunctional platforms for complex motion control, human-robot interaction, and general physical intelligence, but achieving efficient whole-body control remains a fundamental challenge [1][2].

Group 1: Overview of Behavior Foundation Models (BFM)
- The article discusses the emergence of the Behavior Foundation Model (BFM) as a response to the limitations of traditional controllers, enabling zero-shot or rapid adaptation to various downstream tasks through large-scale pre-training [1][2].
- A BFM is defined as a special type of foundation model aimed at controlling agent behavior in dynamic environments; rooted in the principles of general foundation models such as GPT-4 and CLIP, it is pre-trained on large-scale behavior data [12][13].

Group 2: Evolution of Humanoid Whole-Body Control Algorithms
- The evolution of humanoid whole-body control algorithms is summarized in three stages: model-based controllers, learning-based task-specific controllers, and behavior foundation models [4][6][7].
- Model-based controllers rely heavily on physical models and complex manual design, while learning-based task-specific controllers generalize poorly across tasks [6][7][8].

Group 3: BFM Methodology and Algorithms
- Current BFM construction methods fall into three categories: goal-conditioned learning, intrinsic-reward-driven learning, and forward-backward representation learning [13]; a minimal goal-conditioned policy sketch follows below.
- A notable example of a goal-conditioned method is MaskedMimic, which learns foundational motor skills through motion tracking and supports seamless task switching [18][20].

Group 4: Applications and Limitations of BFM
- BFMs have potential applications in humanoid robotics, virtual agents in games, Industry 5.0, and medical assistance robots, enabling rapid adaptation to diverse tasks [31][33].
- However, BFMs face limitations such as difficult sim-to-real transfer, where discrepancies between simulated and real-world dynamics hinder practical deployment [32][34].

Group 5: Future Research Opportunities and Risks
- Future research opportunities include integrating multimodal inputs, developing advanced machine-learning systems, and establishing standardized evaluation mechanisms for BFMs [36][38].
- Risks include ethical concerns about biases in training data, data bottlenecks, and the need for robust safety mechanisms to ensure reliability in open environments [36][39].
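Goal-conditioned learning, the first of the three BFM construction routes listed above, trains a single policy pi(a | s, g) that is steered to new tasks by changing only the goal. The snippet below is a minimal schematic of that interface in PyTorch; the dimensions and the small MLP are placeholders and do not represent the MaskedMimic architecture.

```python
import torch
import torch.nn as nn

class GoalConditionedPolicy(nn.Module):
    """pi(a | s, g): one pretrained network serves many tasks by swapping the goal vector."""
    def __init__(self, state_dim=60, goal_dim=15, action_dim=23):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + goal_dim, 256), nn.ELU(),
            nn.Linear(256, 256), nn.ELU(),
            nn.Linear(256, action_dim),
        )

    def forward(self, state, goal):
        return torch.tanh(self.net(torch.cat([state, goal], dim=-1)))

policy = GoalConditionedPolicy()
state = torch.randn(1, 60)          # humanoid proprioception (placeholder size)
reach_goal = torch.randn(1, 15)     # e.g. a target end-effector or root pose
walk_goal = torch.randn(1, 15)      # a different task, same pretrained weights

a_reach = policy(state, reach_goal)  # zero-shot task switching happens by changing g
a_walk = policy(state, walk_goal)
```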
Being-H0: a VLA model that learns dexterous manipulation from large-scale human videos
具身智能之心· 2025-07-23 08:45
Core Insights
- The article discusses advances in vision-language-action (VLA) models and the challenges facing robotics, particularly in complex dexterous manipulation tasks, due to data limitations [3][4].

Group 1: Research Background and Motivation
- Large language models and multimodal models have made significant progress, but robotics still lacks a transformative moment akin to "ChatGPT" [3].
- Existing VLAs struggle with dexterous tasks because they rely on synthetic data or limited teleoperation demonstrations, which are especially scarce for fine manipulation given high hardware costs [3].
- Human videos contain rich real-world manipulation data, but learning from them raises challenges such as data heterogeneity, hand-motion quantization, cross-modal reasoning, and transfer to robot control [3].

Group 2: Core Methodology
- The article introduces Physical Instruction Tuning, a paradigm with three phases - pre-training, physical space alignment, and post-training - to transfer knowledge of human hand movement to robotic manipulation [4].

Group 3: Pre-training Phase
- Pre-training treats the human hand as an ideal manipulator, with robotic hands regarded as simplified versions, and trains a foundational VLA on large-scale human videos [6].
- The input includes visual information, language instructions, and parameterized hand movements, optimizing the mapping from vision and language to motion [6][8].

Group 4: Physical Space Alignment
- Physical space alignment addresses the interference caused by differing camera parameters and coordinate systems through weak-perspective projection alignment and motion distribution balancing [10][12].
- The model adapts to specific robots by projecting the robot's proprioceptive state into the model's embedding space and generating executable actions through learnable query tokens [13].

Group 5: Key Technologies
- The article discusses motion tokenization and cross-modal fusion, emphasizing the need to retain fine motion precision while discretizing continuous movements [14][17]; a toy tokenization sketch follows below.
- Hand movements are decomposed into wrist and finger components, each tokenized separately, with reconstruction accuracy ensured by a combination of loss functions [18].

Group 6: Dataset and Experimental Results
- The UniHand dataset, comprising over 440,000 task trajectories and 1.3 billion frames, supports large-scale pre-training and covers diverse tasks and data sources [21].
- Experimental results show that the Being-H0 model outperforms baseline models in hand motion generation and translation tasks, demonstrating better spatial accuracy and semantic alignment [22][25].

Group 7: Long Sequence Motion Generation
- The model effectively generates long motion sequences (2-10 seconds) using soft format decoding, which helps maintain trajectory stability [26].

Group 8: Real Robot Operation Experiments
- In practical grasp-and-place tasks, Being-H0 achieves markedly higher success rates than baseline models, reaching 65% and 60% on unseen-toy and cluttered-scene tasks, respectively [28].
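The summary above says hand motion is decomposed into wrist and finger streams, each discretized with its own tokenizer. As a toy illustration of that idea, the snippet below performs nearest-neighbor vector quantization against two separate random codebooks; the 6 + 45 dimensional split, the codebook sizes, and the random codebooks themselves are assumptions for illustration, not the paper's actual tokenizer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed split: 6-D wrist pose (translation + rotation) and 45-D finger articulation.
WRIST_DIM, FINGER_DIM = 6, 45
wrist_codebook = rng.normal(size=(256, WRIST_DIM))     # codebook sizes are illustrative
finger_codebook = rng.normal(size=(1024, FINGER_DIM))

def quantize(vec: np.ndarray, codebook: np.ndarray) -> int:
    """Return the index of the nearest codebook entry, i.e. one discrete motion token."""
    return int(np.argmin(np.linalg.norm(codebook - vec, axis=1)))

hand_pose = rng.normal(size=WRIST_DIM + FINGER_DIM)
wrist_token = quantize(hand_pose[:WRIST_DIM], wrist_codebook)
finger_token = quantize(hand_pose[WRIST_DIM:], finger_codebook)

# Each motion frame becomes a short sequence of discrete tokens the VLA can attend to
# alongside text and image tokens; a decoder would invert the lookup for reconstruction.
print(wrist_token, finger_token)
```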
How far is it from "thinking well" to "doing well"? Decoding the path to embodied brain-cerebellum collaboration
具身智能之心· 2025-07-23 08:45
Core Viewpoint
- The article discusses the integration of the "brain," "cerebellum," and "body" in embodied intelligent systems, emphasizing the need for improved collaboration between them and better data acquisition to advance artificial general intelligence (AGI) [2][3][4].

Group 1: Components of Embodied Intelligence
- The "brain" is responsible for perception, reasoning, and planning, drawing on large language models and vision-language models [2].
- The "cerebellum" handles movement, using motion-control algorithms and feedback systems to make robotic actions more natural and precise [2].
- The "body" is the physical entity that executes the plans generated by the "brain" and the movements coordinated by the "cerebellum," embodying the principle of unifying knowing and doing [2].

Group 2: Challenges and Future Directions
- The "brain" needs stronger reasoning capabilities so that it can infer task paths without explicit instructions or maps [3].
- The "cerebellum" should become more intuitive, allowing robots to react flexibly in complex environments and handle delicate objects with care [3].
- Collaboration between the "brain" and "cerebellum" requires improvement: current communication is slow and responses are delayed, and the goal is a seamless interaction loop [3].

Group 3: Data Acquisition
- Data collection is often difficult, expensive, and noisy, which hinders the training of intelligent systems [3].
- The article calls for building a training corpus that is realistic, diverse, and transferable, to improve data quality and accessibility [3].

Group 4: Expert Discussion
- A roundtable discussion is planned with experts from the Beijing Academy of Artificial Intelligence and Zhiyuan Robotics to explore recent technological advances and future pathways for embodied intelligence [4].