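In conversation with Zhou Erjin of Dexmal (原力灵机): a 2.4B model is enough, the key is being "embodied native"; closing the loop is the most efficient method
QbitAI (量子位) · 2026-02-13 05:42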
Core Viewpoint - The company has introduced DM0, a lightweight embodied model with 2.4 billion parameters, which it claims is sufficient for real-time processing and can keep evolving through reinforcement learning [1][5][4].

Group 1: Model Specifications
- DM0 is designed to process three camera views of 728x728 images with an inference latency of only 60 milliseconds [4].
- The company calls it the first "embodied-native large model" because it is trained from scratch, which departs from industry norms [7][18].
- Training consists of three phases: VLM Train, VLA Pre-Train, and VLA Post-Train, with a focus on multi-source, multi-task training [26][29][30].

Group 2: Technical Framework
- Alongside DM0, the company released an open-source framework, Dexbotic 2.0, and a production workflow, DFOL, both aimed at improving embodied applications [8][97].
- Dexbotic 2.0 is designed to unify embodied manipulation and navigation within a modular architecture [98][100].
- DFOL aims to bridge the gap between traditional automation and human-like flexibility, with a focus on efficiency and adaptability [101].

Group 3: Data Collection and Training Philosophy
- The company emphasizes a "from zero" training approach, arguing that early exposure to physical-world interaction is crucial for the model's understanding [40][42].
- Data collection is broad, covering internet data, intelligent driving data, and embodied data, with high-resolution inputs prioritized for precise actions [62][64][66].
- The data collection strategy is dynamic and is adjusted according to experimental results to keep model training effective [68][70].

Group 4: Application and Market Strategy
- The company is initially focusing on logistics as a practical application of embodied intelligence, aiming to refine capabilities in a controlled environment [125][146].
- Logistics was chosen for its scalability and replicability, which allow rapid data feedback loops that improve model performance [149][150].
- Future plans include expanding from logistics to more complex environments and ultimately targeting consumer applications [155][156].

Group 5: Long-term Vision
- The ultimate goal is to develop robots with broad social identities that can carry out transactions and interactions independently in a variety of environments [168][171].
- The company believes this vision requires a phased approach: hardware and model capabilities must be reliable before expanding to more complex tasks [169][172].
Lei Jun announces the end of production for the first-generation Xiaomi SU7; Baidu reportedly launches a secret "O Plan"
Group 1: Company Developments
- Xiaomi founder Lei Jun announced that the first-generation Xiaomi SU7 has been discontinued, with nearly 370,000 units delivered [2].
- Baidu has reportedly started a secret project called "O Plan," which is tied to the Baidu APP and aims to strengthen its AI capabilities; the app's monthly active users have surpassed 200 million [3].
- Zhipu's stock surged nearly 200% after the announcement of a new model, speculated to be GLM-5, which has drawn significant interest from the developer community [4].

Group 2: New Product Launches
- Alibaba's Qianwen launched a new image generation model, Qwen-Image-2.0, with API access available to developers [7].
- ByteDance introduced the Seedream 5.0 Preview model, which is now available for testing in various applications, including video editing [8].
- Tencent released a small model, HY-1.8B-2Bit, which occupies only about 600MB of storage and is described as a breakthrough in edge deployment [16].

Group 3: Financial Updates
- Honda reported a third-quarter operating profit of 153.36 billion yen, exceeding expectations, and has drawn up a plan to prevent future chip supply shortages [11].
- "Qingche Intelligent" completed a Series A financing round worth several hundred million yuan, focused on developing large models for robotics [13].
- "Daxiao Robotics" recently completed an angel round led by Ant Group, with participation from several other investment firms [14].

Group 4: Industry Trends
- AI model development continues to accelerate, with multiple companies launching new models and upgrading existing ones to meet market demand [4][7][8][16].
- Competition in AI is intensifying as companies such as Baidu and ByteDance focus on integrating AI capabilities into their existing platforms [3][5].
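
The roughly 600MB figure reported above for Tencent's HY-1.8B-2Bit can be sanity-checked with simple arithmetic. The sketch below is only a back-of-the-envelope estimate: it assumes the "1.8B" and "2Bit" in the model name mean 1.8 billion weights stored at 2 bits each, and the helper name `packed_weight_megabytes` is hypothetical rather than part of any Tencent tooling.

```python
def packed_weight_megabytes(num_params: int, bits_per_weight: int) -> float:
    """Size of the weights alone, tightly packed, in decimal megabytes."""
    total_bits = num_params * bits_per_weight
    return total_bits / 8 / 1_000_000


# Assumptions read off the model name, not confirmed by the summary.
NUM_PARAMS = 1_800_000_000   # "1.8B"
BITS_PER_WEIGHT = 2          # "2Bit"
REPORTED_MB = 600            # approximate storage figure from the summary

weights_mb = packed_weight_megabytes(NUM_PARAMS, BITS_PER_WEIGHT)
remainder_mb = REPORTED_MB - weights_mb

print(f"packed 2-bit weights: {weights_mb:.0f} MB")
print(f"remainder vs. reported size: {remainder_mb:.0f} MB")
```

Under these assumptions the packed weights alone account for about three quarters of the reported size. The summary does not say what makes up the rest; quantization metadata or layers kept at higher precision are common causes in quantized models, but that is a guess rather than something the source states.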
Year one of "embodied native"! An interview with Dexmal's Wang Tiancai on embodied intelligence's "PyTorch moment"
Jiqizhixin (机器之心) · 2026-02-10 08:52
Core Viewpoint - The article discusses significant advances in embodied intelligence, in particular the launch of the Dexbotic 2.0 framework and its collaboration with RLinf, which it presents as a pivotal step for the industry toward an "embodied-native" era of AI [3][5][9].

Group 1: Framework and Collaboration
- The Dexbotic 2.0 framework aims to standardize the infrastructure for embodied intelligence, much as PyTorch did for deep learning [5][16].
- The collaboration with Tsinghua University and RLinf focuses on strengthening embodied AI through a unified framework that integrates perception, decision-making, and execution [3][5][19].
- The introduction of the DM0 model and the DFOL workflow rounds out a comprehensive approach to developing and deploying embodied applications [6][51].

Group 2: Embodied Native Concept
- "Embodied Native" is defined as a closed loop of perception, decision-making, and execution that lets AI interact effectively with the physical world [15][13].
- The framework promotes real-world data and multi-modal training to improve the model's understanding of, and interaction with, its environment [17][41].
- The shift from a "big model brain + mechanical limbs" approach to a fully integrated embodied system is highlighted as a key evolution in the field [12][13].

Group 3: Technical Innovations
- Dexbotic 2.0 has a modular design that stays flexible while preserving end-to-end processing, so the perception, cognition, and control modules can be upgraded independently [21][33].
- The framework integrates various models and capabilities, including vision-language-action (VLA) and navigation, to achieve comprehensive task execution [37][38].
- A standardized data format (Dexdata) and a unified training pipeline address fragmentation in the development of embodied intelligence [45][46].

Group 4: Performance and Evaluation
- The DM0 model, with 2.4 billion parameters, performed strongly in real-world evaluations, in both single-task and multi-task scenarios [57][58].
- The RoboChallenge benchmark was established to evaluate embodied models fairly, so that performance metrics reflect true capabilities rather than optimized scores [46][57].
- The DFOL workflow enables continuous improvement of robotic systems through real-time data feedback, raising their operational efficiency [62][65].

Group 5: Future Insights
- The article stresses the importance of integrating multi-modal sensory inputs, such as touch and hearing, to improve modeling of the physical world [74].
- It notes the rapid evolution of embodied intelligence and expects significant advances in the near future, at a pace similar to that of large-model development [73][75].
- The company advocates an open-source approach to foster collaboration and innovation in the embodied intelligence community and to lower barriers for developers [68][71].