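Jia Peng's GTC 2026 talk on a reinforcement learning framework for dexterous hands: full illustrated version / condensed version / video version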

Core Viewpoint
- The article covers Zhijian Power's progress and methodology in embodied intelligence, focusing on its approach to model design, data collection, and reinforcement learning frameworks.

Group 1: Company Overview
- Zhijian Power has completed five rounds of financing in less than six months, raising a total of 2 billion RMB [1]
- The CEO of Zhijian Power is Jia Peng, formerly head of intelligent driving technology at Li Auto [1]

Group 2: Methodology and Model Design
- Zhijian Power's methodology centers on a unified framework for a general base model that can reach 100% success rates on various downstream tasks while retaining its generalization capabilities [42][43]
- The company believes embodied base models are trending toward unification, integrating multiple modalities and capabilities [12][57]
- The base model needs four key capabilities: understanding language instructions, closed-loop interaction with the world, high real-time performance, and self-evaluation of its own state [9][11][54]

Group 3: Model Architecture
- The company proposes a model architecture called LaST-0, which integrates understanding and generation in a compact latent space, combining the advantages of VLA and world models [20][69]
- LaST-0 shows significant improvements on both simulation and real-world tasks, achieving state-of-the-art results and roughly a 14x speedup over explicit CoT methods [78]

Group 4: Data Collection Strategies
- Zhijian Power identifies four ways to acquire data: synthetic data, real-robot data collection, semi-real-robot data collection, and egocentric data [92]
- The company chooses portable gloves for data collection, which yield high-quality data while remaining adaptable to various modalities [28][95]

Group 5: Reinforcement Learning Framework
- The company introduces the Twin-RL framework, which improves the efficiency of reinforcement learning by creating a virtual digital twin of the environment [105]
- Current reinforcement learning methods often suffer from sparse supervision and overfitting, which Zhijian Power aims to address with this approach [102][106]
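
The digital-twin idea in Group 5 can be illustrated with a toy example. The sketch below is not Twin-RL itself, since the summary gives no details of that framework; it is a minimal model-based illustration under stated assumptions. A hypothetical `RealEnv` (a six-state chain with a sparse reward) stands in for the costly physical setup, a lookup-table "twin" is recorded from a handful of real steps, and tabular Q-learning then trains only against the twin. All class, function, and parameter names are invented for illustration.

```python
import random

N_STATES, GOAL = 6, 5          # toy 1-D chain; state 5 is the goal (assumption)
ACTIONS = (-1, +1)             # move left / move right

class RealEnv:
    """Hypothetical stand-in for the physical setup; counts costly real steps."""
    def __init__(self):
        self.real_steps = 0

    def step(self, state, action):
        self.real_steps += 1
        nxt = min(max(state + action, 0), N_STATES - 1)
        reward = 1.0 if nxt == GOAL else 0.0   # sparse reward: only at the goal
        return nxt, reward

def build_twin(env):
    """Query each (state, action) pair once and store the result as a table."""
    return {(s, a): env.step(s, a) for s in range(N_STATES) for a in ACTIONS}

def train_in_twin(twin, episodes=2000, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning that reads only from the twin, never the real env."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s = rng.randrange(N_STATES - 1)        # start anywhere except the goal
        for _ in range(20):
            if rng.random() < eps:             # epsilon-greedy exploration
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: q[(s, x)])
            nxt, r = twin[(s, a)]
            best_next = max(q[(nxt, b)] for b in ACTIONS)
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            s = nxt
            if s == GOAL:
                break
    return q

def greedy_rollout(env, q, start=0, max_steps=20):
    """Run the learned greedy policy on the real env and report the outcome."""
    s, steps = start, 0
    while s != GOAL and steps < max_steps:
        a = max(ACTIONS, key=lambda x: q[(s, x)])
        s, _ = env.step(s, a)
        steps += 1
    return s == GOAL, steps

env = RealEnv()
twin = build_twin(env)
steps_for_twin = env.real_steps            # real interactions spent on the twin
q = train_in_twin(twin)                    # all training happens in the twin
reached, n_steps = greedy_rollout(env, q)  # only the final check is real
print(steps_for_twin, reached, n_steps, env.real_steps)
```

In this toy setting every training update is served by the twin, so the real-step counter grows only while the twin is built and during the final rollout. A real system would need a far richer twin and some way to handle the gap between twin and reality, which this sketch ignores.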