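A Conversation with Qianxun's Gao Yang: End-to-End Is the Future of Embodied Intelligence; Hierarchical Models Are Only a Short-Term Transition
晚点LatePost · 2025-07-10 12:30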
Core Viewpoint
- Breakthroughs in embodied intelligence will come from practical applications rather than laboratories, which is driving a shift from academic research to entrepreneurial ventures in the field [1][5].

Company Overview
- Qianxun Intelligent was founded by Gao Yang, the company's chief scientist and an assistant professor at Tsinghua University, and Han Fengtao, a veteran of the domestic robotics industry, to explore the potential of embodied intelligence [2][3].
- The company recently demonstrated its new Moz1 robot, which can perform intricate tasks such as organizing office supplies [4][3].

Industry Trends
- Embodied intelligence is at a critical scaling moment, similar to the progress seen with large models such as GPT-4, but significant breakthroughs may still take another four to five years [2][29].
- Development differs notably between China and the U.S.: China has advantages in hardware manufacturing and faster repair times for robots [6][7].

Research and Development
- Gao Yang moved from autonomous driving to robotics, believing that robotics is more versatile and more challenging than specialized applications such as self-driving cars [10][12].
- Ideas in the field are converging: many previously explored paths have been judged unfeasible, which has led to a more focused research agenda [12][13].

Technological Framework
- Gao Yang describes embodied intelligence as progressing through stages. The industry is currently approaching Level 2, at which robots can perform a limited range of tasks in office settings [17][18].
- The industry's preferred approach is end-to-end systems, particularly the vision-language-action (VLA) model, which integrates visual, linguistic, and action components into a unified framework [19][20].
Data and Training
- Training a VLA model begins with extensive data collected from the internet, followed by fine-tuning on real-world operation data and reinforcement learning to improve performance [23][24].
- The scaling law observed in the field indicates that more data significantly improves model performance: each 10-fold increase in data volume yields a substantial performance gain [27][28].

Market Dynamics
- Demand for humanoid robots stems from the need to operate in environments designed for humans, although non-humanoid designs may also be effective depending on the application [33][34].
- The industry is moving toward a model in which the "brain" (AI) and the "body" (robotic hardware) are developed in tandem, much as in the automotive industry, which allows companies to specialize in individual components [39][41].
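
The scaling-law point under Data and Training can be made concrete with a small sketch. The article does not give a functional form or any numbers, so everything here is an assumption for illustration: the sketch supposes that task error follows a power law in dataset size, and the starting error rate and exponent are hypothetical values, not figures from the interview.

```python
def error_after_scaling(base_error: float, data_multiplier: float, exponent: float) -> float:
    """Predicted error after multiplying the dataset size by data_multiplier.

    Assumes error is proportional to N ** (-exponent). This power-law form
    is an illustrative assumption, not something the article specifies.
    """
    if base_error < 0 or data_multiplier <= 0 or exponent <= 0:
        raise ValueError("base_error must be >= 0; data_multiplier and exponent must be > 0")
    return base_error * data_multiplier ** (-exponent)


def data_multiplier_for_target(base_error: float, target_error: float, exponent: float) -> float:
    """How many times more data is needed to reach target_error under the same assumption."""
    if not (0 < target_error <= base_error) or exponent <= 0:
        raise ValueError("need 0 < target_error <= base_error and exponent > 0")
    return (base_error / target_error) ** (1.0 / exponent)


if __name__ == "__main__":
    # Hypothetical values chosen only to show the shape of the curve.
    base_error = 0.40
    exponent = 0.30
    for multiplier in (1, 10, 100):
        predicted = error_after_scaling(base_error, multiplier, exponent)
        print(f"{multiplier:>4}x data -> predicted error {predicted:.3f}")
    needed = data_multiplier_for_target(base_error, base_error / 2, exponent)
    print(f"data needed to halve the error: {needed:.1f}x")
```

Under this assumed form, every 10-fold increase in data multiplies the error by the same constant factor, `10 ** (-exponent)`. That is one way to read a "10-fold" rule of thumb: progress is steady per order of magnitude of data, while the absolute amount of data required grows quickly.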