Questioning the VLA Model and Saying AI Falls Far Short? An Industry Practitioner Responds from Afar to Unitree's Wang Xingxing

Core Viewpoint
- Traditional humanoid robots face three core challenges: perception limitations, decision-making gaps, and generalization bottlenecks [5]

Group 1: Industry Challenges
- The industry cannot yet make effective use of full-parameter models, which points to a need for deeper coordination among the robot's "brain", "cerebellum", and limbs [2]
- Traditional robots often rely on preset rules to execute tasks, which makes it hard for them to adapt to complex, dynamic environments [5]
- When switching between multiple tasks, robots require manual intervention for reprogramming or strategy adjustment [5]

Group 2: Perspectives on the VLA Model
- The VLA (Vision-Language-Action) model is seen as a controversial yet pivotal paradigm for humanoid robot motion control, and many in the industry are betting on its potential [4]
- OpenVLA, built on the 7-billion-parameter Llama 2 language model, is cited as an example of a relatively small model that still faces challenges in making effective use of large language models [4]
- The industry is urged to explore how computing power should be distributed between the cloud and edge devices in order to build a complete deployment architecture [4]

Group 3: Future Directions
- The ideal "brain" for humanoid robots should be more than a large language model: it should be a complete system that deeply integrates hardware and software [4]
- The industry is encouraged to rethink the VLA model and look for new paradigms, potentially drawing on biomimicry to develop original foundation models for embodied intelligence [6]
- Confidence in the humanoid robot industry is growing; many believe it will become a significant sector and that this year could mark a turning point for mass production [6]