Core Viewpoint
- The debate between the WA (World Behavior Model) and VLA (Vision-Language-Action Model) technology routes in autonomous driving is intensifying: Huawei favors WA as the ultimate path to true autonomous driving, while VLA has more supporters across the industry [2][3]

Group 1: Supporters and Adoption
- As of July this year, vehicles equipped with Huawei's QianKun autonomous driving system reached 1 million, with a cumulative driving distance of 4 billion kilometers [3]
- By the end of August, 28 models developed in collaboration with Huawei had been launched, spanning brands such as "Wujie," Lantu, and Audi [3]
- VLA is viewed as a shortcut rather than the ultimate solution for autonomous driving, whereas WA is seen as the harder but more viable route [3]
- WA integrates perception, prediction, decision-making, and planning into a single model framework, aligning more closely with human cognitive processes [3]
- WA's decision response time is roughly 100 milliseconds versus nearly 200 milliseconds for VLA, allowing the vehicle to adjust more quickly (see the illustrative sketch at the end of this summary) [3]

Group 2: Technical Advantages and Limitations
- WA carries a higher R&D investment threshold and demands more computing power, making it less accessible to smaller carmakers [4]
- WA's hardware costs are more than 40% higher than VLA's, limiting its adoption in mid-to-low-end models [4]
- In adverse weather, WA's recognition accuracy for stationary vehicles at 150 meters is about 37% higher than VLA's [3][4]
- VLA is already being adopted by several carmakers, including the newly launched Xiaopeng P7 and Li Auto's i8 [4]

Group 3: Performance and Future Potential
- WA is believed to have a higher ceiling but requires more time and resources to realize, with a model-matching cycle of 6 to 9 months [7]
- WA aims to build a "digital twin" driving-decision system, targeting 99.999% coverage of real driving scenarios [7][8]
- For L4-level autonomous driving, WA targets a decision-error rate of 0.1 incidents per 1,000 kilometers, outperforming VLA's 1.2 [8]
- VLA is better suited to human-machine collaboration and can use language interaction to make its driving strategy transparent, which fits the L2+ transitional phase [10]

Group 4: Long-term Perspectives
- The future may see a "fusion route" that combines WA's world modeling with VLA's interaction and reasoning capabilities, potentially yielding a new architecture that surpasses both [11]
- The industry consensus is that no single solution is yet the ultimate answer, as multiple technical approaches continue to be explored and refined [12]
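
To make the architectural contrast concrete, here is a minimal, purely illustrative sketch of the two routes described above: a WA-style single-pass world model versus a VLA-style pipeline that inserts a language-reasoning stage between perception and action. All class and function names (SensorFrame, world_model_policy, vla_policy) are hypothetical and do not reflect Huawei's or any other vendor's actual software; only the roughly 100 ms versus 200 ms latency contrast comes from the article.

```python
# Purely illustrative sketch: a WA-style single-pass policy vs. a VLA-style
# staged pipeline. All names and numeric placeholders here are hypothetical;
# only the latency contrast they illustrate (~100 ms vs. ~200 ms) is from the
# article above.
from dataclasses import dataclass
from typing import List, Tuple
import time


@dataclass
class SensorFrame:
    """Stand-in for a fused camera/lidar/radar snapshot."""
    timestamp: float
    features: List[float]


def world_model_policy(frame: SensorFrame) -> List[float]:
    """WA-style route: perception, prediction, decision-making, and planning
    are handled jointly by one model, which emits a trajectory in a single
    pass (the article cites roughly 100 ms per decision for this style)."""
    # Hypothetical single forward pass over the fused scene representation.
    return [f * 0.1 for f in frame.features]


def vla_policy(frame: SensorFrame) -> Tuple[str, List[float]]:
    """VLA-style route: a vision encoder produces scene tokens, a language
    stage reasons over them (adding latency, nearly 200 ms per the article,
    but making the strategy explainable), and an action head decodes a
    trajectory."""
    # Hypothetical stage 1: vision encoding into scene tokens.
    tokens = [round(f, 2) for f in frame.features]
    # Hypothetical stage 2: language-level reasoning; this intermediate text
    # is what lets the system explain its driving strategy to the user.
    reasoning = f"plan computed over {len(tokens)} scene tokens"
    # Hypothetical stage 3: action decoding conditioned on the reasoning.
    trajectory = [t * 0.1 for t in tokens]
    return reasoning, trajectory


if __name__ == "__main__":
    frame = SensorFrame(timestamp=time.time(), features=[1.0, 2.0, 3.0])
    print("WA-style trajectory: ", world_model_policy(frame))
    explanation, plan = vla_policy(frame)
    print("VLA-style reasoning: ", explanation)
    print("VLA-style trajectory:", plan)
```

The extra language stage in the VLA-style pipeline is the source of both its explainability advantage and its added latency, which is the trade-off the article describes.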
Three Perspectives on the WA vs. VLA Debate (三角度看WA与VLA之争)
Zhong Guo Qi Che Bao Wang (China Automotive News Network) · 2025-09-12 10:39