VLA

Classes officially begin! End-to-end and VLA autonomous driving small-group course, discount ends today
自动驾驶之心· 2025-08-13 23:33
Core Viewpoint - The article presents VLA (Vision-Language-Action) as a new milestone in the mass production of autonomous driving technology, traces the progression from E2E (End-to-End) to VLA, and notes growing interest among professionals in moving into this field [1][11].
Course Overview
- The course, titled "End-to-End and VLA Autonomous Driving Small Class", aims to provide in-depth knowledge of E2E and VLA algorithms and to address the challenges faced by people looking to move into this area [1][12].
- The curriculum covers foundational knowledge, advanced models, and practical applications of autonomous driving technology [5][15].
Course Structure
- **Chapter 1**: Introduction to end-to-end algorithms, covering their historical development and the transition from modular to end-to-end approaches, including the advantages and challenges of each paradigm [17].
- **Chapter 2**: Background knowledge on the E2E technology stack, focusing on VLA, diffusion models, and reinforcement learning, which matter for future job interviews [18].
- **Chapter 3**: Two-stage end-to-end methods, discussing notable algorithms and their advantages compared with one-stage methods [18].
- **Chapter 4**: In-depth analysis of one-stage end-to-end methods, including subfields such as perception-based and world-model-based approaches, culminating in the latest VLA techniques [19].
- **Chapter 5**: Practical assignment on RLHF (Reinforcement Learning from Human Feedback) fine-tuning, providing hands-on experience with pre-training and reinforcement learning modules [21].
Target Audience and Learning Outcomes
- The course is aimed at people with a foundational understanding of autonomous driving and related technologies, such as transformer models and reinforcement learning [28].
- Upon completion, participants are expected to reach a level equivalent to one year of experience as an end-to-end autonomous driving algorithm engineer, having mastered the main methodologies and being able to apply them to real-world projects [28].
Traditional perception is falling out of favor; is VLA already in production cars?!
自动驾驶之心· 2025-08-13 06:04
Core Viewpoint - The article discusses the launch of the Li Auto i8, the first model equipped with the VLA driver model, highlighting its advances in semantic understanding, reasoning, and human-like driving intuition [2][7].
Summary by Sections
VLA Driver Model Capabilities
- The VLA model enhances four core capabilities: spatial understanding, reasoning ability, communication and memory, and behavioral ability [2].
- It can comprehend natural language commands during driving, set specific speeds based on past memories, and navigate complex road conditions while avoiding obstacles [5].
Industry Trends and Educational Initiatives
- The VLA model represents a new milestone in the mass production of autonomous driving technology, prompting many professionals from traditional fields to seek a transition into VLA-related roles [7].
- The article introduces a new course titled "End-to-End and VLA Autonomous Driving", designed to help people move into this field by providing in-depth knowledge and practical skills [21][22].
Course Structure and Content
- The course covers end-to-end background knowledge, large language models, BEV perception, diffusion model theory, and reinforcement learning [12][26].
- It aims to build a comprehensive understanding of the research landscape in autonomous driving, covering both theory and practical applications [22][23].
Job Market and Salary Insights
- Demand for VLA/VLM algorithm experts is high, with salaries for positions such as VLA model quantization deployment engineer and VLM algorithm engineer ranging from 40K to 120K [15].
- The course is tailored for people looking to enhance their skills or move into the autonomous driving sector, emphasizing the importance of mastering multiple technical domains [19][41].
VLA R&D progress at automakers and technology companies
Zhong Guo Qi Che Bao Wang· 2025-08-13 01:33
Group 1: Li Auto
- Li Auto's i8 features the VLA "driver model", marking a significant advance in intelligent driving following the earlier introduction of VLM [1]
- The VLA model includes a newly designed spatial encoder and uses language models and logical reasoning to produce driving decisions, predicting the trajectories of other vehicles and pedestrians through a diffusion model [1]
- The inference frame rate of the VLA is approximately 10 Hz, more than triple the previous VLM's rate of 3 Hz [1]
Group 2: XPeng Motors
- The XPeng G7 officially began deliveries on July 7, with a clear timeline for the Ultra version's VLA and VLM software updates [2]
- The VLA software OTA update is scheduled for September 2025, with VLM software upgrades following in November 2025 and personalized recommendations by December 2025 [2]
- The XPeng G7 Ultra is equipped with three self-developed Turing AI chips with a total computing power of 2250 TOPS, positioning it as a leader among mass-produced models [2]
Group 3: Chery Automobile
- Chery plans to bring VLA and world model technology to fuel vehicles by 2025 through its Falcon 900 intelligent driving system, aiming to set a new benchmark for "oil-electric intelligence" [3]
- The Falcon 900 system uses a self-developed VLA model that integrates visual perception, language understanding, and action execution [3]
- The model has been trained on 20 million kilometers of real-world data and can understand over 5000 traffic scenarios, achieving a 92% accuracy rate in recognizing non-standard traffic signals in complex urban conditions, a 37% improvement over traditional systems [3]
Group 4: Geely Automobile
- Geely is actively developing VLA technology, integrating it with world models to create a comprehensive world model system [4]
- The Qianli Haohan system features a "dual end-to-end model" design, in which a multi-modal VLA general scene model and an end-to-end model back each other up [4]
- The system is powered by dual NVIDIA Thor chips with a total computing power of 1400 TOPS and has over 40 perception units capable of detecting objects 0.75 meters in size from 300 meters away [4]
Group 5: Yuanrong Qihang
- Yuanrong Qihang is also investing in the VLA model, with five vehicle models expected to feature it by the third quarter of this year [5]
- The company was among the earliest to publicly announce its VLA development, in June of last year [5]
- Its VLA model focuses on defensive driving with four core functions: spatial semantic understanding, recognition of irregular obstacles, comprehension of text-based guide signs, and voice control of the vehicle, which will be released gradually with mass production [5]
VLA or VTLA? This company is using "superhuman touch" technology to upend the future of robotics!
具身智能之心· 2025-08-13 00:04
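Tactile sensors matter this much, yet many problems remain unsolved: low resolution, no guarantee of real-time performance, units that break shortly after purchase, and poor build quality. On the show floor, however, we found one tactile sensor hardware company that achieved the best balance of resolution, real-time performance, durability, and cost: Daimon Robotics (戴盟机器人).
Over the past few days I toured WRC25 and saw the products and capabilities of the various embodied-robotics companies. Frankly, compared with last year, both hardware and technology have improved considerably. I also saw several companies that were not at WAIC25. The overall conclusion is that today's robot bodies can already largely meet the needs of some scenarios; it is the perception "brain" that lags somewhat behind the hardware.
Many related technologies were on display, VLA models in particular. As a new generation of end-to-end vision-language-action models, VLA is a key focus for companies and research institutions. During the demos, however, we noticed an obvious problem: although vision provides rich information about the environment, in physical interaction (such as grasping and manipulating objects) it cannot precisely sense properties such as an object's material, hardness, and friction. In industrial assembly, medical surgery, and home service scenarios especially, robots must perform high-precision tasks, and accidentally applying too much force can have harmful consequences.
Just in the past few days, Daimon Robotics also announced the completion of an angel++ funding round on the order of 100 million yuan, led by China Merchants Venture (招商局创投), with 东方嘉富 and 架桥资本 participating. This round ...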
The 具身智能之心 technical exchange group is now open!
具身智能之心· 2025-08-11 06:01
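The 具身智能之心 technical exchange group is now open! It focuses on VLA, VLN, teleoperation, Diffusion Policy, reinforcement learning, VLA+RL, sim2real, multimodal large models, simulation, motion control, object-goal navigation, mapping and localization, navigation, and related topics. If you are interested, add our assistant on WeChat (AIDriver005) for an invitation to the community. Note: include "institution/school + name + research direction" in your request to be admitted quickly. ...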
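A conversation with Gao Yang of Qianxun Intelligence (千寻智能): scientists founding companies is not very "reliable", but entrepreneurship is like a game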
36氪· 2025-08-08 09:28
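For an embodied-intelligence startup, the goal is to be Apple, not Android.
By Qiu Xiaofen | Editor: Su Jianxun | Source: 智能涌现 (ID: AIEmergence) | Cover image: 视觉中国
Whether at the just-concluded WAIC (World Artificial Intelligence Conference) or at WRC (World Robot Conference), which opens this week, how can you judge a robot's real capability on the show floor?
Gao Yang, co-founder of the embodied-intelligence company Qianxun Intelligence, offers a few tips:
For a robot that claims it can fold clothes, try crumpling the garment into a ball and tossing it on the table at random, then watch whether it can still complete the task; or hand it trousers and a jacket and see whether it generalizes across garment types.
While the robot is working, watch whether its movements are smooth and fluid rather than halting, which reflects how well its reasoning and motion are coordinated...
Gao Yang, who gave us these pointers, is one of the most sought-after founders in embodied intelligence today. After earning his PhD at the University of California, Berkeley, he returned to China to become an assistant professor at Tsinghua University's Institute for Interdisciplinary Information Sciences.
In 2023, he co-founded Qianxun Intelligence with Han Fengtao, former CTO of Rokae (珞石机器人). Han has deep hardware experience, having overseen mass-production shipments of tens of thousands of robots, while Gao brings an AI research background; this pairing of academia and industry has made Qianxun ...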
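A conversation with Gao Yang of Qianxun Intelligence (千寻智能): scientists founding companies is not very "reliable", but entrepreneurship is like a game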
3 6 Ke· 2025-08-08 01:49
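For an embodied-intelligence startup, the goal is to be Apple, not Android.
By Qiu Xiaofen | Editor: Su Jianxun | Graphic: 智能涌现
Whether at the just-concluded WAIC (World Artificial Intelligence Conference) or at WRC (World Robot Conference), which opens this week, how can you judge a robot's real capability on the show floor?
Gao Yang, co-founder of the embodied-intelligence company Qianxun Intelligence, offers a few tips:
For a robot that claims it can fold clothes, try crumpling the garment into a ball and tossing it on the table at random, then watch whether it can still complete the task; or hand it trousers and a jacket and see whether it generalizes across garment types.
While the robot is working, watch whether its movements are smooth and fluid rather than halting, which reflects how well its reasoning and motion are coordinated...
Gao Yang, who gave us these pointers, is one of the most sought-after founders in embodied intelligence today. After earning his PhD at the University of California, Berkeley, he returned to China to become an assistant professor at Tsinghua University's Institute for Interdisciplinary Information Sciences.
In 2023, he co-founded Qianxun Intelligence with Han Fengtao, former CTO of Rokae (珞石机器人). Han has deep hardware experience, having overseen mass-production shipments of tens of thousands of robots, while Gao brings an AI research background; this pairing of academia and industry has made Qianxun Intelligence one of the hottest companies in the current wave of embodied intelligence.
In the 19 months since its founding, the company has raised more than RMB 1 billion in total. Its investors include Huawei Hubble, JD.com, CATL, Shunwei ...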
The 具身智能之心 technical exchange group is now open!
具身智能之心· 2025-08-07 02:38
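The 具身智能之心 technical exchange group is now open! It focuses on VLA, VLN, teleoperation, Diffusion Policy, reinforcement learning, VLA+RL, sim2real, multimodal large models, simulation, motion control, object-goal navigation, mapping and localization, navigation, and related topics. If you are interested, add our assistant on WeChat (AIDriver005) for an invitation to the community. Note: include "institution/school + name + research direction" in your request to be admitted quickly. ...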
Bombed the early-batch interview at a new EV maker...
自动驾驶之心· 2025-08-06 11:25
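A member of our Knowledge Planet (知识星球) community recently interviewed in the early-admission batch at a new EV maker, and the interviewer closed with three questions in a row, all open-ended and non-technical.
"Near the end of the interview, the interviewer asked me a lot of non-technical questions. I had never thought about them and was caught off guard... I was completely thrown, and I felt the interviewer was not satisfied with my answers. The main problem is that I don't know what the interviewer was looking for with these questions."
The community host's reply:
1. Just answer based on your own interests and experience; there is no standard answer. But you should have thought about it beforehand, because the interviewer is implicitly checking whether you have views of your own. In fact, many campus-recruiting candidates don't know what they want to do... Ask senior students or look online in advance to learn about the business of the department interviewing you; if it looks good, you can lean your answer toward it. On one hand, this will catch the interviewer's interest, and you may even be requested by name when positions are assigned. On the other hand, while answering you can also ask what direction the interviewer's team is working on, which gives you a channel for learning what the industry actually works on.
2. The interviewer wants to learn how well you communicate and whether you are "easy to mentor". People without internship experience may not have a feel for this, but communicating with senior students in a lab is similar. You could say: when I take on a task, I first gauge how familiar I am with it; I usually survey the area as a whole on my own first, note down anything I don't understand, and ...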
The autonomous driving job-hunting group for fall campus recruiting and experienced hires is now open!
自动驾驶之心· 2025-08-04 23:33
Core Viewpoint - The article emphasizes that autonomous driving technology is converging, shifting from many diverse approaches toward a more unified model, which implies higher technical barriers in the industry [1]
Group 1
- The industry is moving toward unified solutions such as the single-model ("one model") approach, VLM, and VLA, suggesting that fewer algorithm engineers will be needed [1]
- The article encourages building a large community to support industry professionals and to facilitate growth and collaboration among peers [1]
- A new job-related community is being launched to discuss industry trends, company developments, product research, and job opportunities [1]