RoboBrain 2.0

A Roundup of Companies Working on Embodied Perception in China and Abroad!
具身智能之心· 2025-10-08 02:49
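Embodied intelligence has become a new global focus. Building a general-purpose body and brain is the breakthrough that startups keep working toward, and the effort is drawing close attention from investors and industry. Today we survey well-known companies in the embodied-brain field in China and abroad, analyzing their technical characteristics, product portfolios, and application scenarios, to give companies a panoramic view of the industry that supports strategic decisions and business expansion.

Focus: companies developing robot "brain" systems, including embodied large models and multimodal perception and decision systems.

(1) Domestic companies

自变量机器人 (CEO 王潜)
星海图 (CEO 高继扬)

Company profile: Founded in 2023, it focuses on R&D of a "general-purpose embodied large model", using real-world data as its main data source to build general-purpose robots capable of fine manipulation. Its technical route leans toward the "brain", and from the start it has committed to an end-to-end general-purpose embodied large model approach. In under two years it has completed 8 rounds of financing.

Representative results:
WALL-A model: In October 2024 it launched the WALL-A model of the Great Wall (GW) series, described as the world's largest embodied general manipulation model by parameter count at the time. It integrates vision, language, and motion control signals into a complete closed loop from perception to execution, with strong cross-task generalization.
Open-source embodied foundation model Wall-O ...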
Embodied Brain Leaderboard! A Look at the Key Figures Behind Embodied Brains in China and Abroad...
自动驾驶之心· 2025-09-14 23:33
Core Viewpoint
The article surveys notable companies in embodied intelligence, covering their technical characteristics, product portfolios, and application scenarios, as input for strategic decisions and business expansion in the industry [2][3].

Domestic Companies
- **X Square Robot (自变量机器人)**: Founded in 2023, it develops a "general embodied large model" trained primarily on real-world data, aiming at robots capable of fine manipulation. The company has completed 8 rounds of financing [5].
- **WALL-A Model**: Launched in October 2024 and described as the largest embodied general manipulation model by parameter count at the time; it integrates visual, language, and motion control signals [5].
- **Wall-OSS**: An open-source foundation model with strong generalization and reasoning capabilities [5].
- **UBTECH**: Established in 2012, a leader in humanoid robot commercialization with full-stack in-house R&D capabilities [6].
- **Thinker Model**: A hundred-billion-parameter multimodal model released in 2025 that achieved top results in three international benchmark tests [6].
- **Zhiyuan Robotics (AgiBot)**: Founded in February 2023, focuses on deep integration of AI and robotics [7].
- **Genie Operator-1**: A multimodal large model released in March 2025, reported to raise task success rates by 32% compared to market models [7].
- **Galaxy General**: Established in May 2023; its core technologies and products are said to form three major technical barriers [8].
- **VLA Model**: Described as the world's first independently developed "general embodied large model", built on a "brain + cerebellum" collaborative framework [8].
- **Qianxun Intelligent**: Founded in 2024, focuses on AI + robotics with a strong technical background [10].
- **Spirit V1 VLA Model**: Described as the first model to tackle long-horizon manipulation of deformable objects, supporting complex task execution through vision-language-action integration [10].
- **Star Motion Era**: A new tech company incubated by Tsinghua University, focusing on general artificial intelligence applications [11].
- **ERA-42 Model**: Described as the first end-to-end native embodied large model in China, capable of learning over 100 dynamic tasks [11].

Foreign Companies
- **Figure AI**: Focuses on embodied manipulation algorithms, improving data training and algorithm performance [16].
- **LimX DreamActor**: A new training paradigm combining simulation and real-world data for embodied intelligence training [16].
- **Physical Intelligence**: Founded in January 2023, aims to develop advanced intelligent software for various robots [21].
- **π0 Model**: Released in October 2024, a general-purpose robot foundation model with pre-training and fine-tuning capabilities [21].
- **Google DeepMind**: Merged with Google Brain in 2023, focusing on general artificial intelligence research [19].
- **Gemini Robotics**: A VLA model that can control robots for complex tasks without specialized training [19].
- **Skild AI**: A leading robotics "brain" development company in the US, aiming to create a universal robot operating system [25].
- **Eureka System**: Based on GPT-4, it can automatically train robots for complex actions and optimize reinforcement learning processes [25].
The Companies Building Embodied Brains in China and Abroad......
具身智能之心· 2025-09-13 04:03
Core Insights
The article covers the emerging field of embodied intelligence, highlighting general-purpose robotic "brain" systems and multimodal perception-decision systems, which are drawing significant attention from both investors and industry [2][3].

Domestic Companies
- **X Square Robot (自变量机器人)**: Founded in 2023, it develops a general embodied large model trained on real-world data, aiming at robots capable of fine manipulation. The company has completed 8 rounds of financing in less than two years. Its representative product, the WALL-A model, launched in October 2024 and is claimed to be the largest embodied intelligence model by parameter count globally, integrating visual, language, and motion control signals [6].
- **UBTECH**: Established in 2012, it is a leader in humanoid robot commercialization with full-stack in-house R&D capabilities. Its Thinker model, released in 2025, has achieved top rankings in international benchmark tests and significantly improves robots' perception and planning in complex environments [10].
- **Zhiyuan Robotics (AgiBot)**: Founded in February 2023, it aims to create world-class general embodied intelligent robots. Its Genie Operator-1 model, released in March 2025, combines multimodal large model and mixture-of-experts technologies and is reported to improve task success rates by 32% compared to market models [12].
- **Galaxy General**: Established in May 2023, it focuses on multimodal large models driven by synthetic data. Its VLA model is described as the first general embodied large model globally, built on a "brain + cerebellum" collaborative framework [14].
- **Qianxun Intelligent**: Founded in 2024, it is an AI + robotics company focused on deformable object manipulation. Its Spirit V1 VLA model is described as the first to tackle long-horizon manipulation of deformable objects [16].
- **Star Motion Era**: A new tech company incubated by Tsinghua University, focusing on general artificial intelligence applications. Its ERA-42 model supports over 100 dynamic tasks through video training [18].
- **LimX Dynamics (逐际动力)**: Concentrates on embodied intelligent robots, developing core technologies for hardware design, full-body motion control, and training paradigms [20].

International Companies
- **Figure AI**: Focuses on embodied manipulation algorithms, improving data training and algorithm performance through video generation technology [17].
- **Physical Intelligence**: Founded in January 2023, it aims to develop advanced intelligent software for various robots. Its π0 model, released in October 2024, is a general-purpose robot foundation model [22].
- **Google DeepMind**: Merged with Google Brain in 2023, it focuses on general artificial intelligence research. Its Gemini Robotics model can control robots to perform complex tasks without specialized training [20].
- **Skild AI**: A leading robotics "brain" development company in the US, aiming to create a universal robot operating system that enables intelligent operation across various scenarios [26].
Computer Industry Weekly: BAAI (智源) and Kimi K2 Post Strong Benchmark Results; OpenAI Officially Launches a General-Purpose Agent (2025-07-24)
Huaxin Securities· 2025-07-24 15:27
Investment Rating
The report maintains a "Buy" investment rating for the companies mentioned [10][57].

Core Insights
- The AI application sector is expected to accelerate, with Kimi K2 demonstrating competitive capabilities and significant commercial potential [7][55].
- OpenAI has launched the ChatGPT Agent, a significant upgrade in AI capabilities that handles complex task management and integrates with various tools [27][30].
- AI financing remains robust, highlighted by Thinking Machines Lab's record $2 billion seed round, which indicates strong investor interest in AI technologies [42][44].

Summary by Sections

Computing Power Dynamics
- Computing power rental prices were stable, with specific configurations priced at 28.64 CNY/hour on Tencent Cloud and 31.58 CNY/hour on Alibaba Cloud [15][17].
- The launch of RoboBrain 2.0 by Zhiyuan Research Institute (BAAI) shows progress in embodied intelligence, with state-of-the-art performance on multiple benchmarks [16][20].

AI Application Dynamics
- Bing's average weekly stay duration increased by 166.21%, indicating growing user engagement [26].
- The ChatGPT Agent scored 41.6% on the HLE (Humanity's Last Exam) benchmark, significantly outperforming previous models [27][31].

AI Financing Trends
- Thinking Machines Lab completed a $2 billion seed round at a valuation of $12 billion, reflecting high demand for AI talent and innovation [42][44].

Investment Recommendations
- Companies such as 嘉和美康 (688246.SH), 科大讯飞 (002230.SZ), and 寒武纪 (688256.SH) are highlighted as key investment opportunities due to their strong market positions and growth potential [8][56].
How Far Is It from "Thinking Well" to "Doing Well"? Decoding the Path to Brain-Cerebellum Coordination in Embodied AI
具身智能之心· 2025-07-23 08:45
Core Viewpoint
The article discusses the integration of "brain," "cerebellum," and "body" in embodied intelligent systems, emphasizing the need for improved collaboration and data acquisition for advancing artificial general intelligence (AGI) [2][3][4].

Group 1: Components of Embodied Intelligence
- The "brain" is responsible for perception, reasoning, and planning, utilizing large language models and visual language models [2].
- The "cerebellum" focuses on movement, employing motion control algorithms and feedback systems to enhance the naturalness and precision of robotic actions [2].
- The "body" serves as the physical entity that executes the plans generated by the "brain" and the movements coordinated by the "cerebellum," embodying the principle of "knowing and doing" [2].

Group 2: Challenges and Future Directions
- There is a need for the "brain" to enhance its reasoning capabilities, enabling it to infer task paths without explicit instructions or maps [3].
- The "cerebellum" should become more intuitive, allowing robots to react flexibly in complex environments and handle delicate objects with care [3].
- The collaboration between the "brain" and "cerebellum" requires improvement, as current communication is slow and responses are delayed, aiming for a seamless interaction system [3].

Group 3: Data Acquisition
- The article highlights the challenges in data collection, noting that it is often difficult, expensive, and noisy, which hinders the training of intelligent systems [3].
- There is a call for the development of a training repository that is realistic, diverse, and transferable to enhance data quality and accessibility [3].

Group 4: Expert Discussion
- A roundtable discussion is planned with experts from Beijing Academy of Artificial Intelligence and Zhiyuan Robotics to explore recent technological advancements and future pathways for embodied intelligence [4].
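The brain / cerebellum / body division of labor described in this summary can be illustrated with a toy control loop. This is only a minimal sketch under stated assumptions: the class names (`Brain`, `Cerebellum`, `Body`), the hard-coded plan table, and the command strings are all hypothetical and do not come from the article or from any real robotics stack. A real "brain" would be an LLM/VLM planner and a real "cerebellum" a motion controller with sensor feedback.

```python
from dataclasses import dataclass, field


@dataclass
class Body:
    """Hypothetical actuator layer: records each low-level command it runs."""
    log: list = field(default_factory=list)

    def execute(self, command: str) -> bool:
        self.log.append(command)
        return True  # a real body would return sensor feedback here


class Brain:
    """Stand-in for an LLM/VLM planner: maps an instruction to ordered subgoals."""
    # Hypothetical lookup table used instead of a real model.
    PLANS = {"fetch cup": ["locate cup", "move to cup", "grasp cup", "return"]}

    def plan(self, instruction: str) -> list:
        return list(self.PLANS.get(instruction, []))


class Cerebellum:
    """Stand-in for motion control: expands one subgoal into motor commands."""

    def control(self, subgoal: str) -> list:
        return [f"{subgoal}:approach", f"{subgoal}:execute"]


def run(instruction: str, brain: Brain, cerebellum: Cerebellum, body: Body) -> list:
    """Plan with the brain, refine with the cerebellum, act with the body."""
    completed = []
    for subgoal in brain.plan(instruction):
        ok = all(body.execute(cmd) for cmd in cerebellum.control(subgoal))
        if not ok:
            break  # stop at the first subgoal the body fails to execute
        completed.append(subgoal)
    return completed


body = Body()
done = run("fetch cup", Brain(), Cerebellum(), body)
print(done)
print(len(body.log), "motor commands executed")
```

The slow brain-to-cerebellum hand-off that the summary calls out corresponds to the boundary between `plan` and `control` here; in this sketch it is a plain function call, which is exactly the part a real system would need to make fast and feedback-driven.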
Industry Watch: [AI Industry Tracker] MiniMax Secures About 2 Billion RMB in Funding
GUOTAI HAITONG SECURITIES· 2025-07-22 11:40
Investment Rating
The report does not explicitly provide an investment rating for the AI industry.

Core Insights
- The AI industry is advancing quickly, with notable developments such as MiniMax's recent financing of approximately 2 billion RMB and its plans for an IPO in Hong Kong [7].
- The emergence of domestic models like Kimi K2, which has topped global open-source rankings, indicates growing competitiveness in the AI sector [9].
- Major companies like MiHoYo are investing heavily in AI, including the recent establishment of a new company focused on AI applications [15].

Summary by Sections

1. AI Industry Dynamics
- MiniMax has secured around 2 billion RMB in financing at a post-investment valuation exceeding 28.7 billion RMB, and is preparing for a Hong Kong IPO. The company has launched the MiniMax-M1 model, described as the world's first large-scale hybrid-architecture reasoning model, with 456 billion parameters and strong results on various benchmark tests [7].
- Jensen Huang (Huang Renxun), CEO of NVIDIA, praised Chinese tech companies during his visit, highlighting the strength of Huawei and the innovation in the electric vehicle sector [8].

2. AI Application Insights
- Meitu has launched an AI imaging agent called "RoboNeo," which integrates various functions for image processing and design, aimed at lowering the production barrier for small and medium-sized businesses [10].
- The domestic AI search platform Mita AI has introduced a free deep research feature that lets users conduct complex inquiries and generate structured reports without a membership [11].
- MiniMax has released an Agent full-stack development feature that enables users to build complete applications without programming knowledge [12].

3. AI Large Model Insights
- Tencent's open-source A13B model features a fine-grained MoE architecture with 80 billion total parameters, significantly improving inference throughput and supporting ultra-long context windows [16].
- The Zhiyuan Research Institute (BAAI) has launched RoboBrain 2.0 and RoboOS 2.0, focusing on overcoming core bottlenecks of AI models in real physical environments [17].
- Tencent's RLVER framework addresses challenges in open-domain reinforcement learning, achieving significant improvements in emotional dialogue capabilities [18].

4. Technology Frontiers
- A team from Beijing Normal University has used AI models to study the cultural emotions behind floral imagery of the Tang and Song dynasties, providing a new quantitative approach to historical studies [20][22].
- A collaborative team from Westlake University and Zhejiang University has proposed a new framework for optimizing generative AI performance, which could significantly improve efficiency in various applications [23].
- The startup HeShan Technology has developed what is described as the world's first AI tactile perception chip, indicating progress in robotics and AI integration [24].
BAAI (智源) Announces Full Open-Sourcing of RoboBrain 2.0 and RoboOS 2.0; World's First Runtime Security Testing Standard for AI Agents Released | AIGC Daily
创业邦· 2025-07-14 23:59
Group 1
- The article highlights recent advances in artificial intelligence and biotechnology, showing how AI is being integrated into sectors such as drug development and education [1][2][3][4].

Group 2
- The PROTEUS system developed by the University of Sydney is a biotechnology breakthrough capable of creating new functional molecules in weeks, with potential impact on drug development and personalized medicine [1].
- The AI STR series of safety testing standards, released by the World Digital Technology Academy (WDTA), addresses the industry gap in safety testing for AI agents and involved collaboration among multiple global institutions [2].
- A joint agreement among ten leading AI companies emphasizes responsible AI in education, aiming to prevent the spread of misinformation and promote a healthy educational environment [3].
- The open-source release of RoboBrain 2.0 and RoboOS 2.0 by the Zhiyuan Research Institute (BAAI) makes advanced AI frameworks and models more accessible, fostering innovation in the field [4].
Tencent Research Institute AI Express 20250715
腾讯研究院· 2025-07-14 14:38
Group 1: Generative AI Developments
- Comet is an "AI Agent native" browser designed to redefine the relationship between users and information, allowing for complex task execution across multiple tabs [1].
- Meta's acquisition of PlayAI for nearly $100 million aims to enhance its audio generation capabilities, complementing its broader AI Superintelligence strategy with a total annual investment of $72 billion [2].
- RoboBrain 2.0, developed by Zhiyuan Research Institute, surpasses GPT-4o in 10 evaluations, breaking through key capabilities in spatial understanding and long-chain reasoning [3].

Group 2: AI Tools and Applications
- Meitu's AI image agent "RoboNeo" allows users to perform various tasks like image retouching and website creation through simple commands, enhancing efficiency in image production [4][5].
- Bilibili's AI voice model IndexTTS2 achieves high-quality voice conversion with precise duration control and emotional expression, setting a new standard in voice synthesis [6].
- PixVerse's new "multi-keyframe generation" feature enables users to create coherent videos from multiple images, enhancing storytelling capabilities in video production [7].

Group 3: AI in Scientific Research
- The LabUtopia platform introduces a new paradigm for intelligent scientific laboratories, integrating cognitive models and robotic agents for closed-loop scientific exploration [9].

Group 4: Perspectives on AI in Programming
- DHH, the creator of Ruby on Rails, expresses disdain for AI programming assistants, advocating for hands-on coding as a means to develop skills and creativity [10].
- Perplexity's CEO emphasizes a strategy of combining a browser with intelligent agents to create a cognitive operating system, aiming to compete with Google through speed and user experience [11].
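BAAI (智源) Releases RoboBrain 2.0 and RoboOS 2.0 Together: The Strongest Embodied Brain on Evaluation Benchmarks, Resetting the Paradigm for Cross-Embodiment Multi-Robot Collaboration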
机器之心· 2025-07-14 11:33
Core Insights
The article covers the release of RoboBrain 2.0 and RoboOS 2.0 and their advances in embodied intelligence and multi-agent collaboration, which are expected to move robotics from "single-agent intelligence" to "collective intelligence" [2][19].

RoboBrain 2.0 Breakthroughs
- RoboBrain 2.0 addresses three major capability bottlenecks: spatial understanding, temporal modeling, and long-chain reasoning, which significantly improves its ability to understand and execute complex embodied tasks [4][6].
- The model uses a modular encoder-decoder architecture that integrates perception, reasoning, and planning, and is designed for complex embodied reasoning tasks beyond the scope of traditional vision-language models [5][6].

Training and Performance
- RoboBrain 2.0 is trained on a comprehensive multimodal dataset that includes high-resolution images, multi-view video sequences, and complex natural language instructions [9][12].
- Training proceeds in three phases, each building on the previous one: foundational spatiotemporal learning, embodied spatiotemporal enhancement, and chain-of-thought reasoning in embodied contexts [12][13][14].
- The model achieves state-of-the-art (SOTA) performance on various benchmarks, including spatial reasoning and multi-robot planning, outperforming competitors such as Gemini and GPT-4o [17][19].

RoboOS 2.0 Framework
- RoboOS 2.0 is described as the world's first embodied intelligence SaaS platform that supports serverless, lightweight robot deployment, facilitating multi-agent collaboration across various scenarios [21][22].
- The framework includes a cloud-based brain model for high-level cognition and multi-agent coordination, a distributed module for executing specialized skills, and a real-time shared memory mechanism that improves environmental awareness [25][26].
- RoboOS 2.0 has optimized its end-to-end inference pipeline, achieving a 30% overall performance improvement and reducing average response latency to below 3ms [25].

Open Source Initiative
- Both RoboBrain 2.0 and RoboOS 2.0 have been fully open-sourced, with model weights, training code, and evaluation benchmarks available to the global community [24][28].
- The release has drawn significant attention on social media and in tech communities, and strategic partnerships have been established with over 20 robotics companies and top laboratories worldwide [28][29].
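The three-part RoboOS 2.0 structure summarized above (a cloud brain for coordination, distributed skill modules on each robot, and a shared memory) can be sketched in a few lines. This is an illustrative toy only, not the RoboOS 2.0 API: `CloudBrain`, `Robot`, `SharedMemory`, the skill names, and the first-match assignment rule are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class SharedMemory:
    """Hypothetical shared state that every agent can read and update."""
    facts: dict = field(default_factory=dict)

    def update(self, key: str, value: str) -> None:
        self.facts[key] = value


@dataclass
class Robot:
    """Hypothetical robot exposing a set of named skills."""
    name: str
    skills: set

    def perform(self, skill: str, target: str, memory: SharedMemory) -> str:
        # Record the outcome so other agents can see the new world state.
        memory.update(target, f"{skill} by {self.name}")
        return f"{self.name}:{skill}({target})"


class CloudBrain:
    """Stand-in coordinator: routes each subtask to a robot that has the skill."""

    def __init__(self, robots: list, memory: SharedMemory):
        self.robots = robots
        self.memory = memory

    def dispatch(self, subtasks: list) -> list:
        results = []
        for skill, target in subtasks:
            # Assumed policy: pick the first robot that advertises the skill.
            robot = next((r for r in self.robots if skill in r.skills), None)
            if robot is None:
                results.append(f"unassigned:{skill}({target})")
                continue
            results.append(robot.perform(skill, target, self.memory))
        return results


memory = SharedMemory()
robots = [Robot("arm", {"pick", "place"}), Robot("cart", {"navigate"})]
brain = CloudBrain(robots, memory)
results = brain.dispatch([("navigate", "kitchen"), ("pick", "cup"), ("fly", "roof")])
print(results)
print(memory.facts)
```

Keeping the world state in one shared object rather than inside each robot is the design point being illustrated: a later subtask can be planned against what an earlier robot already did, whichever robot that was.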
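BAAI (智源) Fully Open-Sources the Embodied Brain RoboBrain 2.0 and the Brain-Cerebellum Coordination Framework RoboOS 2.0: New Records on 10 Evaluation Benchmarks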
具身智能之心· 2025-07-14 11:15
Core Insights
The article discusses the release of RoboBrain 2.0 and RoboOS 2.0, highlighting their advancements in embodied intelligence and multi-agent collaboration capabilities [2][3][30].

Group 1: RoboBrain 2.0 Capabilities
- RoboBrain 2.0 overcomes three major capability bottlenecks: spatial understanding, temporal modeling, and long-chain reasoning, significantly enhancing its ability to understand and execute complex embodied tasks [4].
- The model features a modular encoder-decoder architecture that integrates perception, reasoning, and planning, specifically designed for embodied reasoning tasks [9].
- It utilizes a diverse multimodal dataset, including high-resolution images and complex natural language instructions, to empower robots in physical environments [12][18].

Group 2: Training Phases of RoboBrain 2.0
- The training process consists of three phases: foundational spatiotemporal learning, embodied spatiotemporal enhancement, and chain-of-thought reasoning in embodied contexts [15][17][18].
- Each phase progressively builds the model's capabilities, from basic spatial and temporal understanding to complex reasoning and decision-making in dynamic environments [15][18].

Group 3: Performance Benchmarks
- RoboBrain 2.0 achieved state-of-the-art (SOTA) results across multiple benchmarks, including BLINK, CV-Bench, and RoboSpatial, demonstrating superior spatial and temporal reasoning abilities [21][22].
- The 7B model scored 83.95 in BLINK and 85.75 in CV-Bench, while the 32B model excelled in various multi-robot planning tasks [22][23].

Group 4: RoboOS 2.0 Framework
- RoboOS 2.0 is the first open-source framework for embodied intelligence SaaS, enabling lightweight deployment and seamless integration of robot skills [3][25].
- It features a cloud-based brain model for high-level cognition and a distributed module for executing specific robot skills, enhancing multi-agent collaboration [27].
- The framework has been optimized for performance, achieving a 30% improvement in overall efficiency and reducing average response latency to below 3ms [27][29].

Group 5: Open Source and Community Engagement
- Both RoboBrain 2.0 and RoboOS 2.0 have been fully open-sourced, inviting global developers and researchers to contribute to the embodied intelligence ecosystem [30][33].
- The initiative has garnered interest from over 20 robotics companies and top laboratories worldwide, fostering collaboration in the field [33].