Vision-Language-Action Model (VLA)
Humanoid robot training ground of more than 10,000 square meters opens in Beijing
Huan Qiu Wang Zi Xun· 2025-09-25 10:04
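The person in charge said that all data produced at the training ground comes from real-robot operation and can be transferred across robot embodiments and across scenarios, which effectively addresses industry pain points such as poor data quality, high acquisition cost, and difficult transfer. Built on a data collection platform developed in-house by the technical team, the process runs through four stages (collection, cleaning, annotation, and export) plus a three-part "automatic + manual + model" quality evaluation to deliver high-quality data. As certified by a professional institution, the pass rate for individual data records reaches 99%.
"In the past, companies collected data and trained separately, like 'small workshop production', and data quality was uneven," a technician said. "Now, through standardized, large-scale data production, we can provide high-quality, low-cost data services for the whole industry." In the future, relying on massive amounts of real data, the team will further advance data standard setting and model training and, through interactive training and other methods, build a complete training system that spans single-robot control to group collaboration.
Breaking the data bottleneck and enabling standardized development of the embodied intelligence industry
The humanoid robot training ground recently went into operation in Shijingshan, Beijing. It covers more than 10,000 square meters. The completion of this key infrastructure is both a "key move" for China's humanoid robot industry and a "Beijing model" for training grounds elsewhere in the country. It will accelerate the evolution of the humanoid robot "embodied brain" and promote early large-scale application in scenarios such as automobile manufacturing and logistics handling, laying a solid foundation for a future trillion-level industry.
Humanoid robot data training center
Diverse scenarios over more than 10,000 square meters, building the future industry ...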
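The four stages and the three-part quality evaluation described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the training ground's actual platform: the `Episode` schema, the cleaning rule, and all three check functions are hypothetical stand-ins chosen only to show how an "all checks must pass" gate fits between annotation and export.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Episode:
    """One recorded real-robot episode (hypothetical schema)."""
    robot_id: str
    frames: List[dict]
    labels: Dict[str, str] = field(default_factory=dict)


def clean(ep: Episode) -> Episode:
    # Illustrative cleaning rule: drop frames with no joint reading.
    kept = [f for f in ep.frames if f.get("joint_pos") is not None]
    return Episode(ep.robot_id, kept, dict(ep.labels))


def annotate(ep: Episode, task: str) -> Episode:
    # Attach a task label without mutating the input episode.
    return Episode(ep.robot_id, ep.frames, {**ep.labels, "task": task})


def auto_check(ep: Episode) -> bool:
    # Automatic check (assumed rule): the episode still has frames.
    return len(ep.frames) > 0


def manual_check(ep: Episode) -> bool:
    # Stand-in for a human reviewer's sign-off.
    return ep.labels.get("reviewed") == "yes"


def model_check(ep: Episode) -> bool:
    # Stand-in for a learned quality scorer.
    return "task" in ep.labels


CHECKS = (auto_check, manual_check, model_check)


def passes_quality(ep: Episode) -> bool:
    # An episode is accepted only if all three checks pass.
    return all(check(ep) for check in CHECKS)


def export(episodes: List[Episode]) -> List[dict]:
    # Export a flat summary record for every accepted episode.
    return [
        {"robot_id": ep.robot_id, "task": ep.labels["task"],
         "num_frames": len(ep.frames)}
        for ep in episodes if passes_quality(ep)
    ]


def pass_rate(episodes: List[Episode]) -> float:
    # Fraction of episodes that clear the three-part gate.
    if not episodes:
        return 0.0
    return sum(passes_quality(ep) for ep in episodes) / len(episodes)
```

A per-record pass rate like the 99% figure quoted above would correspond to `pass_rate` computed over a delivered batch; how the certifying institution actually measured it is not stated in the article.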
Lu Cewu of Shanghai Jiao Tong University: How to solve robot generalization and robustness
Core Insights
- The main focus of the articles is on the advancements in robotics, particularly in embodied intelligence, and the challenges and opportunities within the industry [1][7].
Group 1: Robotics Development
- The key challenges in developing robotic intelligence are not primarily related to chip computing power but rather to the iteration of embodied model architecture and data loops [1][2].
- The "digital gene" framework proposed by the company aims to enhance the understanding and execution capabilities of robots, allowing them to interpret and act upon instructions more effectively [3][4].
- The company has demonstrated significant advancements in robotic applications, such as a robot serving in an ice cream shop, showcasing its ability to perform complex tasks autonomously [6].
Group 2: Market Dynamics
- The robotics industry is experiencing a surge in investment, with various companies seeking funding to demonstrate their commercial potential [7][8].
- Despite the increased interest, the financing scale for Chinese startups in embodied intelligence remains significantly lower compared to their American counterparts, with a reported difference of nearly 12 times in private AI investment [7].
- The industry is characterized by a dual focus on talent and funding, which poses challenges for startups in terms of technology strategy and validation under financial constraints [8].
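Lingbao Robotics team keeps making breakthroughs on the new track of embodied intelligence: making robots "quick-witted and dexterous" (Science and Technology Viewpoint · Frontline Innovation)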
Ren Min Ri Bao· 2025-07-27 22:23
Group 1
- The core message emphasizes the importance of technological innovation in advancing China's modernization and competitiveness in the global arena [1].
- The article introduces a series of reports titled "Frontline Innovation," focusing on the experiences and observations of researchers in the field of scientific innovation [1].
Group 2
- Lingbao Robotics, founded in 2023, specializes in developing general humanoid robots and embodied intelligence products, with a focus on practical applications [3][4].
- The company uses a vision-language-action (VLA) model to enable robots to learn skills through imitation, significantly improving the efficiency of skill acquisition [4][5].
- The robots developed by Lingbao can perform precise tasks, such as assembling computer components with a precision of 0.3 mm [3][4].
Group 3
- Lingbao Robotics is working on flexible automation solutions for the shoe manufacturing industry, addressing the high costs and low adaptability of traditional production lines [6][7].
- The company has developed a system that allows robots to learn to perform tasks in dynamic environments, reducing the time required for training to about one hour [7].
- The humanoid robot developed by Lingbao, CASBOT 01, features a bionic hand capable of executing complex tasks, highlighting the integration of embodied intelligence and precision operation [8].
Group 4
- The domestic development of embodied intelligence is advancing rapidly, with a growing variety of tactile sensors and technologies being integrated into the industry [9].
- Lingbao Robotics emphasizes the importance of collaboration between academia and industry, applying the latest research findings to product development while also contributing to academic research [9].
Learning end-to-end large models, but still not clear on the difference between VLM and VLA...
自动驾驶之心· 2025-06-19 11:54
Core Insights
- The article emphasizes the growing importance of large models (VLM) in the field of intelligent driving, highlighting their potential for practical applications and production [2][4].
Group 1: VLM and VLA
- VLM (Vision-Language Model) focuses on foundational capabilities such as detection, question answering, spatial understanding, and reasoning [4].
- VLA (Vision-Language-Action) is more action-oriented, aimed at trajectory prediction in autonomous driving, requiring a deep understanding of human-like reasoning and perception [4].
- It is recommended to learn VLM first before expanding to VLA, as VLM can predict trajectories through diffusion models, enhancing action capabilities in uncertain environments [4].
Group 2: Community and Resources
- The article invites readers to join a knowledge-sharing community that offers comprehensive resources, including video courses, hardware, and coding materials related to autonomous driving [4].
- The community aims to build a network of professionals in intelligent driving and embodied intelligence, with a target of gathering 10,000 members in three years [4].
Group 3: Technical Directions
- The article outlines four cutting-edge technical directions in the industry: Visual Language Models, World Models, Diffusion Models, and End-to-End Autonomous Driving [5].
- It provides links to various resources and papers that cover advancements in these areas, indicating a robust framework for ongoing research and development [6][31].
Group 4: Datasets and Applications
- A variety of datasets are mentioned that are crucial for training and evaluating models in autonomous driving, including pedestrian detection, object tracking, and scene understanding [19][20].
- The article discusses the application of language-enhanced systems in autonomous driving, showcasing how natural language processing can improve vehicle navigation and interaction [20][21].
Group 5: Future Trends
- The article highlights the potential for large models to significantly impact the future of autonomous driving, particularly in enhancing decision-making and control systems [24][25].
- It suggests that the integration of language models with driving systems could lead to more intuitive and human-like vehicle behavior [24][25].
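The VLM/VLA distinction drawn in this summary comes down to what the model returns: a VLM produces text about a scene, while a VLA produces an action such as a trajectory. The sketch below illustrates only that difference in interface. Everything in it is an assumption made for illustration: the `VLM` and `VLA` protocols, the `ToyVLM` brightness rule, and the straight-line waypoints in `ToyVLA` are hypothetical and do not correspond to any real model or library (in particular, no diffusion model is involved).

```python
from typing import List, Protocol, Sequence, Tuple

Image = Sequence[Sequence[float]]  # placeholder for a camera frame
Waypoint = Tuple[float, float]     # (x, y) in the ego frame, in metres


class VLM(Protocol):
    """Vision-language model: image + question in, text out."""
    def answer(self, image: Image, question: str) -> str: ...


class VLA(Protocol):
    """Vision-language-action model: image + instruction in, trajectory out."""
    def act(self, image: Image, instruction: str, horizon: int) -> List[Waypoint]: ...


class ToyVLM:
    def answer(self, image: Image, question: str) -> str:
        # Toy rule: describe the scene by its mean pixel brightness.
        n_pixels = max(1, sum(len(row) for row in image))
        brightness = sum(map(sum, image)) / n_pixels
        return "bright scene" if brightness > 0.5 else "dark scene"


class ToyVLA:
    """Builds on a VLM's scene understanding and adds an action output."""

    def __init__(self, vlm: VLM) -> None:
        self.vlm = vlm

    def act(self, image: Image, instruction: str, horizon: int) -> List[Waypoint]:
        scene = self.vlm.answer(image, "describe the scene")
        # Toy policy: drive straight, with shorter steps when the scene is dark.
        step = 1.0 if scene == "bright scene" else 0.5
        return [(step * (i + 1), 0.0) for i in range(horizon)]
```

Layering `ToyVLA` on top of a `VLM` mirrors the suggested learning order (VLM first, then VLA): the perception and reasoning interface stays the same, and only the output head changes from text to waypoints.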
Embodied intelligence: a scientific expedition that requires humility and patience
Robot猎场备忘录· 2025-05-20 05:01
Core Viewpoints
- Embodied intelligence is injecting new research vitality into the robotics field and has the potential to break through performance limits [1].
- The development of embodied intelligence relies on breakthroughs in specific scientific problems and should not dismiss contributions from traditional robotics [2].
- General intelligence cannot exist without a focus on specific tasks, as expertise in particular areas leads to advancements in broader capabilities [3].
Group 1: Interdisciplinary Collaboration
- Embodied intelligence is a cross-disciplinary product that requires collaboration with fields such as material science, biomechanics, and design aesthetics [2].
- Breakthroughs often occur at the intersection of disciplines, highlighting the importance of diverse scientific contributions [2].
Group 2: Technology Evolution
- Technological evolution should not be viewed as a complete replacement of old systems; rather, it is a process of sedimentation where foundational technologies continue to support advancements [5].
- The current trend in vision-language-action models may soon be replaced by more efficient alternatives, emphasizing the need for continuous innovation [5].
Group 3: Realistic Expectations for AGI
- Viewing embodied intelligence as the sole path to artificial general intelligence (AGI) is a dangerous oversimplification; AGI development requires a multitude of conditions and interdisciplinary knowledge [6].
- The complexity of embodied systems necessitates a collaborative approach across various fields, rather than relying on a few "genius" individuals [6].
Group 4: Current State of Embodied Intelligence
- The field of embodied intelligence is still in its early stages, with significant challenges remaining in hardware and algorithm development [7].
- Current humanoid robots are not yet fully autonomous and often require human intervention, indicating that the technology is still evolving [7].
Group 5: VLA Technology Pathway
- The development of vision-language-action (VLA) models may not be the most efficient approach, as operational skills often precede language capabilities in learning processes [9].
- Many current VLA models are resource-intensive and may be replaced by more efficient solutions in the future [9].
Group 6: Balancing Short-term and Long-term Goals
- A combination of learning and modeling approaches is seen as more practical in the short term, while pure learning methods may represent the long-term future of robotics [10].
- Successful robotic solutions in industry often rely on model-based methods due to their stability and reliability [10].
Group 7: Humanoid Robots and Practicality
- The design of humanoid robots is driven by emotional projection and environmental adaptability, but specialized non-humanoid forms may offer better efficiency in many applications [11].
- There is a concern about over-investment in humanoid robots at the expense of practical and economically viable solutions [11].
Group 8: Building Technical Barriers
- True competitive advantages in technology arise from extensive practical experience and meticulous attention to detail, rather than solely from innovative algorithms [12].
- Long-term technical barriers are built through consistent effort and iterative improvements in engineering practices [12].
Group 9: Vision and Practicality
- Scientific research requires both grand visions and grounded practices, with embodied intelligence embodying both idealistic aspirations and real-world challenges [13].
- The importance of foundational theories, such as control theory, remains critical in ensuring the safety and functionality of robotic systems [13].