Why are pure humanoid VLA solutions so rare? What approaches are these companies taking?

Core Viewpoint
- The industry's current focus is on robotic-arm VLA (Vision-Language-Action) models for tasks such as mobile pick-and-place, while humanoid and quadruped VLA struggle to reach real-world task deployment because of control complexity and data-collection difficulties [1]

Group 1: Application of VLA in Industry
- Robotic-arm VLA is used mainly for simple tasks that rely on visual input, supplemented by tactile or force sensors, which makes it easier to implement [1]
- Humanoid robots are hard to collect data for and highly complex to control: a single dexterous hand can have about 20 degrees of freedom, and the whole body approaches 100 [1]
- Many leading companies are adopting reinforcement learning (RL) to train humanoid VLA for complex tasks, but humanoid models still generalize less well and are less flexible than robotic-arm models [1]

Group 2: Future Directions
- A promising direction is a hybrid architecture that uses VLA for high-level task planning and RL for low-level motion optimization, which many companies are currently pursuing [1]
- A growing number of job openings at unicorn companies target breakthroughs in this combined direction [1]
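
The hybrid architecture described above can be illustrated with a minimal Python sketch. It is not any company's actual system: `plan_with_vla`, `make_rl_policy`, the `SKILLS` table, and the skill names are all hypothetical stand-ins. The "VLA" here is a keyword matcher and the "RL policy" is a simple proportional controller, chosen only to show how a high-level planner might hand subgoals to low-level controllers.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Subgoal:
    skill: str   # name of a low-level skill (hypothetical)
    target: str  # object the skill acts on


def plan_with_vla(instruction: str, scene_objects: List[str]) -> List[Subgoal]:
    """Stand-in for a VLA planner: maps an instruction and scene to subgoals.

    Assumption: a real VLA would consume images and text; this toy version
    only matches object names that appear in the instruction string.
    """
    target = next((obj for obj in scene_objects if obj in instruction), None)
    if target is None:
        return []
    return [Subgoal("walk_to", target), Subgoal("grasp", target)]


Policy = Callable[[List[float], List[float]], List[float]]


def make_rl_policy(gain: float) -> Policy:
    """Stand-in for a trained RL policy: a proportional controller.

    Assumption: a learned policy would replace this function body.
    """
    def policy(joint_pos: List[float], joint_goal: List[float]) -> List[float]:
        return [gain * (g - p) for p, g in zip(joint_pos, joint_goal)]
    return policy


# One low-level controller per skill (hypothetical skill names).
SKILLS: Dict[str, Policy] = {
    "walk_to": make_rl_policy(0.5),
    "grasp": make_rl_policy(1.0),
}


def run(instruction: str, scene_objects: List[str], joint_pos: List[float],
        joint_goal: List[float], steps: int = 20) -> Tuple[List[str], List[float]]:
    """High-level plan first, then low-level control for each subgoal."""
    executed = []
    for subgoal in plan_with_vla(instruction, scene_objects):
        policy = SKILLS[subgoal.skill]
        for _ in range(steps):
            action = policy(joint_pos, joint_goal)
            joint_pos = [p + a for p, a in zip(joint_pos, action)]
        executed.append(subgoal.skill)
    return executed, joint_pos


if __name__ == "__main__":
    skills_run, final_pos = run("pick up the cup", ["cup", "table"],
                                [0.0, 0.0], [1.0, -1.0])
    print(skills_run, final_pos)
```

The split keeps the planner's output small (a list of named subgoals) so that the roughly 100 degrees of freedom mentioned above are handled only inside the low-level policies, which is one plausible reading of why the combined approach is attractive.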