Core Viewpoint
- The company has introduced DM0, a lightweight embodied model with 2.4 billion parameters. It claims this size is sufficient for real-time processing and that the model can keep evolving through reinforcement learning [1][5][4].

Group 1: Model Specifications
- DM0 is designed to process 728x728 images from three camera views with an inference latency of only 60 milliseconds [4].
- The company describes DM0 as the first "embodied-native large model" because it is trained from scratch, unlike the usual industry practice [7][18].
- Training consists of three phases: VLM Train, VLA Pre-Train, and VLA Post-Train, with a focus on multi-source, multi-task training [26][29][30].

Group 2: Technical Framework
- Alongside DM0, the company released the open-source framework Dexbotic 2.0 and the production workflow DFOL, both aimed at improving embodied applications [8][97].
- Dexbotic 2.0 is designed to unify embodied manipulation and navigation within a modular architecture [98][100].
- DFOL aims to bridge the gap between traditional automation and human-like flexibility, with a focus on efficiency and adaptability [101].

Group 3: Data Collection and Training Philosophy
- The company emphasizes a "from zero" training approach, arguing that early exposure to physical-world interaction is crucial for the model's understanding [40][42].
- Data collection is broad, covering internet data, intelligent-driving data, and embodied data, with a focus on high-resolution inputs for precise actions [62][64][66].
- The data collection strategy is dynamic and is adjusted according to experimental results to keep model training effective [68][70].

Group 4: Application and Market Strategy
- The company is initially focusing on logistics as a practical application of embodied intelligence, aiming to refine capabilities in a controlled environment [125][146].
- Logistics was chosen because it is scalable and replicable, which allows rapid data feedback loops that improve model performance [149][150].
- Future plans include expanding from logistics to more complex environments and ultimately targeting consumer applications [155][156].

Group 5: Long-term Vision
- The ultimate goal is to develop robots with broad social identities that can carry out transactions and interactions independently in a variety of environments [168][171].
- The company believes this vision requires a phased approach: hardware and model capabilities must be reliable before the robots take on more complex tasks [169][172].
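A conversation with Zhou Erjin of Dexmal (原力灵机): a 2.4B model is enough, the key is being "embodied-native"; closing the loop is the most efficient method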
QbitAI (量子位) · 2026-02-13 05:42