Full-Stack Open! Surpassing π0, an Embodied-Intelligence Foundation Model Goes Truly Open Source, and Developers Are Thrilled
量子位 (QbitAI) · 2025-09-08 05:04

Core Viewpoint
- The article covers the release of WALL-OSS, an open-source embodied intelligence foundation model from China, which is reported to surpass earlier models such as π0 on several metrics [1][5][17].

Group 1: Model Features
- WALL-OSS is a general-purpose embodied model with strong generalization and reasoning capabilities, so it can be fine-tuned quickly on proprietary systems [2].
- It is a multimodal model that takes in and produces language, video, and actions, and it demonstrates strong causal reasoning and spatial understanding [3].
- With 4.2 billion parameters, WALL-OSS is described as the only open-source embodied model that provides end-to-end unified output across language, vision, and action [5][27].

Group 2: Team and Development
- The development team, 自变量机器人, was founded in late 2023 and has focused on end-to-end models; it previously launched WALL-A, described as the largest unified embodied model in the world [9].
- The team recently closed a Series A+ round of nearly 1 billion yuan, with major investors including Alibaba Cloud and Sequoia [13][14].

Group 3: Performance and Evaluation
- WALL-OSS performs better in both in-distribution (ID) and out-of-distribution (OOD) evaluations, maintaining high task success rates even when scenarios vary [17].
- It outperforms baseline models on long-horizon tasks that require breaking down instructions and on reasoning tasks that rely on chain-of-thought (CoT) [19][20].
- Multimodal benchmark tests show that the model retains the core capabilities of a vision-language model (VLM) while strengthening them [22].

Group 4: Technical Innovations
- WALL-OSS addresses the "impossible triangle" of modality unification, action precision, and capability generalization through systematic innovations in architecture and training paradigm [32].
- The model uses a novel architecture that combines shared attention with expert flow mechanisms, allowing information to be processed effectively across modalities [34].
- It uses a two-stage training strategy to strengthen spatial and semantic understanding while preserving the original VLM capabilities [41][45].

Group 5: Open Source Strategy
- WALL-OSS is fully open-sourced as a complete, reproducible model solution that includes pre-trained weights, training code, and deployment documentation [52][53].
- This significantly lowers the barrier to entry for developers, enabling rapid adaptation and deployment of advanced embodied intelligence [56].
- The open-source approach aims to foster industry growth by providing a robust foundation model that can be used across a variety of applications [68].
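
For readers who want a concrete picture of the "shared attention plus expert" idea mentioned under Group 4, the toy sketch below shows one way such a block could be wired. It is not the WALL-OSS implementation. The article gives no architectural details beyond the phrase itself, so everything here is an assumption made for illustration: single-head attention, hard routing of each token by a modality label, one-layer ReLU experts, and all function names (`shared_attention`, `expert_ffn`, `block`, `demo`).

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def shared_attention(tokens, Wq, Wk, Wv):
    """Single-head self-attention applied to all tokens, whatever their modality."""
    q = [matvec(Wq, t) for t in tokens]
    k = [matvec(Wk, t) for t in tokens]
    v = [matvec(Wv, t) for t in tokens]
    scale = math.sqrt(len(q[0]))
    out = []
    for qi in q:
        weights = softmax([sum(a * b for a, b in zip(qi, kj)) / scale for kj in k])
        out.append([sum(w * vj[d] for w, vj in zip(weights, v))
                    for d in range(len(v[0]))])
    return out

def expert_ffn(W, x):
    """One-layer ReLU feed-forward expert (hypothetical, for illustration only)."""
    return [max(0.0, h) for h in matvec(W, x)]

def block(tokens, modalities, attn_weights, experts):
    """Shared attention, then each token goes to the expert for its modality."""
    mixed = shared_attention(tokens, *attn_weights)
    out = []
    for x, a, m in zip(tokens, mixed, modalities):
        h = [xi + ai for xi, ai in zip(x, a)]          # residual around attention
        e = expert_ffn(experts[m], h)                  # assumed hard routing by label
        out.append([hi + ei for hi, ei in zip(h, e)])  # residual around expert
    return out

def random_matrix(rng, dim):
    """Square matrix with small random entries."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(dim)]

def demo(dim=4, seed=0):
    """Build a tiny mixed-modality sequence plus random weights."""
    rng = random.Random(seed)
    modalities = ["language", "vision", "vision", "action"]
    tokens = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in modalities]
    attn = tuple(random_matrix(rng, dim) for _ in range(3))
    experts = {m: random_matrix(rng, dim) for m in ("language", "vision", "action")}
    return tokens, modalities, attn, experts

if __name__ == "__main__":
    tokens, modalities, attn, experts = demo()
    for m, row in zip(modalities, block(tokens, modalities, attn, experts)):
        print(m, [round(x, 3) for x in row])
```

In this sketch every token attends to every other token through the same attention weights, so language, vision, and action information can mix. The feed-forward parameters are kept separate per modality, so changing the action expert leaves the outputs for language and vision tokens untouched within a single block.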