VLA Models
Autonomous driving hardware and software keep iterating; the robotaxi future has arrived
2025-11-03 02:35
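Autonomous driving hardware and software keep iterating; the robotaxi future has arrived (2025-11-02)

What technical routes are the major automakers in the autonomous driving industry pursuing, and how far along are they?

Technical routes in the industry currently fall into three categories.

The first is the end-to-end algorithm, a direction that has drawn wide attention since Tesla's AI Day in 2021. Companies that have adopted end-to-end algorithms and shipped them in mass-production vehicles include Momenta, Tesla, and Zeekr. End-to-end algorithms come in one-stage and two-stage forms. Most of those in mass production today are two-stage, and work on one-stage approaches is expected to continue this year. Their advantage is that urban NOA functions can be delivered with relatively little computing power.

The second is the VLA (Vision Language Action) model, represented by Li Auto and XPeng. A VLA model uses a language model to analyze the environment at the semantic level and passes that information to the downstream decision module, which controls the vehicle. However, VLA models depend on training and developing a language model, which requires substantial resources. VLA also demands high compute, with a minimum requirement above 500 TOPS, and inference is relatively slow: Li Auto, for example, currently achieves an inference speed of about 10 frames per second.

The third is the world model. This route does not conflict with VLA, and the two can be combined. A world model can understand the current environment and predict how the scene will change over the next few seconds. Companies such as Huawei, Momenta, and Horizon Robotics are developing this approach. ...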
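To put the quoted VLA inference speed in perspective, the sketch below converts a frame rate into a per-frame time budget and the distance a vehicle covers between two consecutive model outputs. It reads the "about 10 frames" figure as frames per second, which is an assumption, and the vehicle speeds are hypothetical values chosen only for illustration. It is a back-of-the-envelope calculation, not a description of any vendor's pipeline, and it ignores pipelining, sensor latency, and actuation delay.

```python
def frame_latency_ms(frames_per_second: float) -> float:
    """Time budget per inference frame, in milliseconds."""
    if frames_per_second <= 0:
        raise ValueError("frames_per_second must be positive")
    return 1000.0 / frames_per_second


def distance_per_frame_m(speed_kmh: float, frames_per_second: float) -> float:
    """Distance (in meters) travelled between two consecutive model outputs."""
    if frames_per_second <= 0:
        raise ValueError("frames_per_second must be positive")
    speed_m_per_s = speed_kmh / 3.6  # convert km/h to m/s
    return speed_m_per_s / frames_per_second


if __name__ == "__main__":
    fps = 10.0  # assumed reading of the "about 10 frames" figure
    for speed_kmh in (30.0, 60.0, 120.0):  # hypothetical speeds, illustration only
        print(
            f"{speed_kmh:.0f} km/h: {frame_latency_ms(fps):.0f} ms per frame, "
            f"{distance_per_frame_m(speed_kmh, fps):.2f} m travelled per frame"
        )
```

Under these assumptions, each output covers a 100 ms window, so the gap between decisions grows linearly with vehicle speed. This is one way to see why a low frame rate is treated as a drawback.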
ByteDance releases a new VLA model, and its companion robot becomes a capable household helper
Sou Hu Cai Jing· 2025-07-23 16:51
Core Insights
- ByteDance's Seed team has launched a new VLA model, GR-3, which supports strong generalization, long-horizon tasks, and flexible object manipulation with dual-arm operation [2][4]
- GR-3 is designed to understand abstract language instructions and can adapt efficiently to new tasks with minimal human data, in contrast to previous models that required extensive training [2][7]
- The accompanying robot, ByteMini, is a versatile dual-arm mobile robot designed specifically to work with GR-3, featuring 22 degrees of freedom and advanced sensing capabilities [4][5]

Model Features
- GR-3 performs complex tasks with high robustness and success rates, and follows step-by-step human instructions effectively [4][5]
- The model is trained on a combination of data from teleoperated robots, human VR trajectory data, and publicly available vision-language data [7]
- GR-3's architecture is a 4-billion-parameter end-to-end model that integrates vision-language and action generation modules [7]

Performance Highlights
- In tasks such as table organization, GR-3 achieves high success rates and can accurately interpret and respond to complex instructions, even when given invalid commands [4][5]
- The model excels at coordinated dual-arm operation, manipulating deformable objects and recognizing various clothing arrangements [5]
- GR-3's generalization ability lets it handle previously unseen objects and comprehend abstract concepts during tasks [5][7]

Future Plans
- The Seed team plans to scale up the model and its training data and to incorporate reinforcement learning methods to further improve generalization [7]
- Generalization is identified as a key metric for evaluating VLA models and is crucial for enabling robots to adapt quickly to dynamic real-world scenarios [7]
Experience up, prices down: end-to-end deployment accelerates
HTSC· 2025-03-02 07:30
Investment Rating
- The report maintains a "Buy" rating for several companies in the automotive sector, including XPeng Motors, Li Auto, BYD, SAIC Motor, Great Wall Motors, and Leapmotor [10].

Core Viewpoints
- The report emphasizes that by 2025, advanced intelligent driving (high-level AD) will see improved user experience and reduced prices, transitioning from a trial phase to widespread adoption among consumers [14][20].
- The penetration rates for L2.5 and L2.9 intelligent driving are projected to reach 3.5% and 10.1% respectively by November 2024, with expectations of further growth to 16% for highway NOA and 14% for urban NOA by 2025 [14][24].
- The report highlights the shift towards end-to-end architecture in intelligent driving systems, which allows for higher performance limits and seamless data transmission, enhancing the overall driving experience [30][31].

Summary by Sections

Investment Recommendations
- The report suggests focusing on companies with strong engineering capabilities and advantages in data, computing power, and funding, such as XPeng Motors, Li Auto, and BYD, as well as third-party suppliers like Desay SV and Kobot [5][10].

Market Trends
- The report notes that the intelligent driving market is evolving, with a focus on enhancing user experience through features like "human-like" driving capabilities and the implementation of end-to-end architectures [14][20].
- The price of high-level intelligent driving systems is expected to decrease significantly, with current models priced below 100,000 and 150,000 yuan for highway and urban NOA respectively [24][28].

Technological Developments
- The report discusses the advancements in end-to-end architecture, which is gaining traction among automotive manufacturers, allowing for improved data processing and decision-making capabilities [30][31].
- It also mentions the importance of AI-driven models and the need for automotive companies to adapt their organizational structures to support these technological shifts [15][41].

Competitive Landscape
- The report outlines the competitive dynamics among leading automotive companies, highlighting their respective advancements in intelligent driving technologies and the rapid iteration of their systems [41][45].
- Companies like Tesla, Li Auto, and XPeng Motors are noted for their significant investments in R&D and their ability to push updates and improvements quickly [42][46].