End-to-End

Ren Shaoqing's Non-Consensus Views on Intelligent Driving: World Models, Long-Horizon Agents, and "Fanatical" Engineering
晚点Auto· 2025-10-09 12:17
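The following article is from 晚点LatePost, by the 晚点 team. Staying in intelligent driving is not because it is easy, but because it is harder. By Wei Bing and Song Wei. Edited by Song Wei. Ren Shaoqing's hair is distinctive: thick, slightly curly, with bangs covering his forehead. When he walked into the meeting room, people meeting him for the first time took him for an intern; on learning who he was, they joked that only at an AI startup could you find a technical leader this young. "We are an AI company," Ren replied in all seriousness. But he works at NIO, a carmaker still fighting in a sea of blood, and his battlefield is intelligent driving. This unexpected reply resembles the trajectory of his life: whenever others assume the answer is settled, he insists on heading in another direction. He entered the University of Science and Technology of China (USTC) in 2007 and received his PhD in 2016. During that time he proposed Faster R-CNN (a deep-learning-based object detection framework), and worked on ResNet (residual networks) with Sun Jian and He Kaiming, then of the Visual Computing Group at Microsoft Research Asia, and PhD student Zhang Xiangyu. ResNet solved the problem of neural networks becoming more "forgetful" as they get deeper, letting models stack layers without limit, and is regarded as a milestone in the history of deep learning. Ren was 27 at the time. In 2016 he co-founded the autonomous driving company Momenta with Cao Xudong and lived through the hottest startup years of autonomous driving. Four years later, he left the company he had built ...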
Ren Shaoqing's Non-Consensus Views on Intelligent Driving: World Models, Long-Horizon Agents, and "Fanatical" Engineering
晚点LatePost· 2025-10-09 10:14
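Staying in intelligent driving is not because it is easy, but because it is harder. By Wei Bing and Song Wei. Edited by Song Wei. Ren Shaoqing's hair is distinctive: thick, slightly curly, with bangs covering his forehead. When he walked into the meeting room, people meeting him for the first time took him for an intern; on learning who he was, they joked that only at an AI startup could you find a technical leader this young. "We are an AI company," Ren replied in all seriousness. But he works at NIO, a carmaker still fighting in a sea of blood, and his battlefield is intelligent driving. This unexpected reply resembles the trajectory of his life: whenever others assume the answer is settled, he insists on heading in another direction. He entered the University of Science and Technology of China (USTC) in 2007 and received his PhD in 2016. During that time he proposed Faster R-CNN (a deep-learning-based object detection framework), and worked on ResNet (residual networks) with Sun Jian and He Kaiming, then of the Visual Computing Group at Microsoft Research Asia, and PhD student Zhang Xiangyu. ResNet solved the problem of neural networks becoming more "forgetful" as they get deeper, letting models stack layers without limit, and is regarded as a milestone in the history of deep learning. Ren was 27 at the time. In 2016 he co-founded the autonomous driving company Momenta with Cao Xudong and lived through the hottest startup years of autonomous driving. Four years later, he left the company he had built and turned to NIO, which was still struggling in a trough. The reason was simple: AI development had hit a bottleneck at the time, and he believed the next breakthrough could only come from ...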
How Are Academia and Industry Studying End-to-End and VLA? Master End-to-End Autonomous Driving in Three Months!
自动驾驶之心· 2025-10-09 04:00
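As the core algorithm in today's mass-production autonomous driving, end-to-end involves a very rich technology stack. Many graduate students and industry practitioners switching fields run into a lot of problems when they first encounter it. There are currently two main paradigms in the industry: one-stage and two-stage. The most representative one-stage approach is UniAD, which models the ego-vehicle trajectory output directly from sensor input (vision/Lidar/Radar, etc.); the two-stage approach takes perception results and from them outputs the trajectories of the ego vehicle and other vehicles. One-stage end-to-end can be further divided into perception-based, diffusion-model-based, world-model-based, and VLA-based one-stage algorithms. It is easy to see that end-to-end has spawned many subfields, especially VLA-based algorithms: related papers have been published at an explosive rate over the past two years, and industry is racing to put them into mass production. Production algorithms have evolved from modular pipelines to end-to-end and now to VLA. The core algorithms involve BEV perception, vision-language models (VLM), diffusion models, reinforcement learning, world models, and more. Studying end-to-end and VLA autonomous driving gives a grasp of the most cutting-edge technical directions in academia and industry. In recent months we have received many inquiries from students about how to get started with end-to-end and VLA quickly and efficiently. So we have teamed up with experts from industry and academia to launch the "End-to-End and VLA Autonomous Driving Small-Group Course" and the "Autonomous Driving VLA and Large Model Practical Course"! Scan the QR code to sign up and secure your spot. Course outline: Autonomous driving VL ...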
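The difference between the one-stage and two-stage paradigms described above is mostly a difference in interfaces, which a small sketch may make concrete. Everything here is hypothetical and for illustration only: the names `SensorFrame`, `DetectedAgent`, `one_stage_planner`, `perceive`, and `two_stage_planner` are assumptions, and the stub bodies stand in for trained networks. It is not how UniAD or any production system is implemented.

```python
from dataclasses import dataclass
from typing import List, Tuple

Waypoint = Tuple[float, float]  # (x, y) in metres, ego frame (assumed convention)


@dataclass
class SensorFrame:
    """Hypothetical container for raw sensor input."""
    camera: List[float]
    lidar: List[float]


@dataclass
class DetectedAgent:
    """Hypothetical perception result for one other vehicle."""
    position: Waypoint
    velocity: Waypoint  # metres per step


def one_stage_planner(frame: SensorFrame, horizon: int = 3) -> List[Waypoint]:
    """One-stage: raw sensors in, ego trajectory out, no hand-off in between.
    A trained network would sit here; this stub simply drives straight ahead."""
    return [(float(t), 0.0) for t in range(1, horizon + 1)]


def perceive(frame: SensorFrame) -> List[DetectedAgent]:
    """Stage one of the two-stage pipeline (stub): sensors -> perception results."""
    return [DetectedAgent(position=(10.0, 0.0), velocity=(-1.0, 0.0))]


def two_stage_planner(
    agents: List[DetectedAgent], horizon: int = 3
) -> Tuple[List[Waypoint], List[List[Waypoint]]]:
    """Stage two: consumes perception results and outputs trajectories for the
    ego vehicle and for the other vehicles (constant-velocity stub)."""
    others = [
        [(a.position[0] + a.velocity[0] * t, a.position[1] + a.velocity[1] * t)
         for t in range(1, horizon + 1)]
        for a in agents
    ]
    ego = [(float(t), 0.0) for t in range(1, horizon + 1)]
    return ego, others


frame = SensorFrame(camera=[0.0], lidar=[0.0])
ego_direct = one_stage_planner(frame)
ego_staged, others_staged = two_stage_planner(perceive(frame))
```

The one-stage function never exposes an intermediate perception result, while the two-stage planner only ever sees what `perceive` chose to pass on; that hand-off is the design trade-off between the two paradigms.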
自动驾驶之心 Is Recruiting Partners! 4D Annotation, World Models, Model Deployment, and Other Areas
自动驾驶之心· 2025-10-04 04:04
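Business partners: 自动驾驶之心 is recruiting business partners! Our team plans to recruit 10 outstanding partners from China and abroad this year, responsible for autonomous driving course development, paper-mentoring business development, and hardware R&D. Main areas: if you work on large models/multimodal large models, diffusion models, VLA, end-to-end, embodied interaction, joint prediction, SLAM, 3D object detection, world models, closed-loop simulation with 3DGS, or large-model deployment and quantization-aware inference, you are welcome to join us. Compensation: shared autonomous driving resources (job hunting, PhD applications, study-abroad recommendations, etc.); generous cash incentives; cooperation on and referrals to startup projects. Contact us: for more information, add us on WeChat with the note "organization/company + autonomous driving cooperation inquiry". Requirements: a university ranked within the QS top 200, a master's degree or above; candidates with top-conference publications preferred. ...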
Betting on "End-to-End": As AI Drives into the Physical World, Alibaba Cloud Accelerates the "Closed Loop"
第一财经· 2025-09-27 12:39
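In the wave of artificial intelligence, embodied intelligence and intelligent assisted driving are carrying AI from the digital world into the physical world. The era of Agentic AI is arriving, and a new race is beginning. "Over the past year, the rise of AI has shown us a fairly large track emerging: autonomous driving and embodied intelligence," Wang Junhua, head of Alibaba Cloud's big data and AI platform, told reporters at the 2025 Apsara Conference (云栖大会). He said Alibaba Cloud is paying close attention to these two industries, which shows not only in capital investment but, more importantly, in intensive investment in the infrastructure technology stack. The technical architecture of intelligent assisted driving is shifting from "multiple modules chained in multiple stages" to "end-to-end integration", and the underlying big data and AI engineering architecture faces a constant need for upgrades. The industry can see the inflection point of the "end-to-end" paradigm revolution arriving, and is also heading toward new technical hurdles. Cloud vendors have found similar trends and needs in robotics R&D and deployment. How can intelligent driving and robots be made to "get smarter the more they are used"? Many cloud vendors are betting on solving this industry challenge in order to capture the future market. Achieving a closed loop of big data and AI is seen as one of the key breakthroughs. A battle for position in these two frontier industries of AI deployment is quietly getting underway. 1. "End-to-end" has arrived. "Autonomous driving started developing around 2003, and for nearly twenty years it made little progress, until 'end-to-end' appeared," Chen Xiaozhi, AI chief technology officer of 卓驭, told reporters. At the just-concluded Apsara Conference, Alibaba Cloud ...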
End-to-End Built on Imitation Learning Means Its Ceiling Cannot Exceed Human Performance
自动驾驶之心· 2025-09-24 06:35
Core Viewpoint - The article discusses the evolution of end-to-end (E2E) autonomous driving technology, emphasizing the transition from rule-based to data-driven approaches, and highlights the limitations of current models in handling complex scenarios. It introduces Vision-Language Models (VLM) and Vision-Language-Action models (VLA) as potential solutions to enhance the capabilities of autonomous driving systems [2][3].
Summary by Sections
Introduction to VLA - VLA represents a shift from merely imitating human behavior to understanding and interacting with the physical world, addressing the limitations of traditional E2E models in complex driving scenarios [2].
Challenges in Autonomous Driving - The VLA technology stack is still evolving, with numerous algorithms emerging, indicating that the field has not yet converged [3].
Course Overview - A course titled "Autonomous Driving VLA and Large Model Practical Course" is being prepared to address various aspects of VLA, including its origins, algorithms, and practical applications [5].
Learning Objectives - The course aims to provide a comprehensive understanding of VLA, covering topics such as dataset creation, model training, and performance enhancement [5][17].
Course Structure - The course is structured into several chapters, each focusing on a different aspect of VLA, including algorithm introduction, foundational knowledge, VLM as an interpreter, modular and integrated VLA, reasoning enhancement, and practical assignments [20][26][31][34][36].
Instructor Background - The instructors have extensive experience in multimodal perception, autonomous driving, and large model frameworks, contributing to the course's credibility [38].
Expected Outcomes - Participants are expected to gain a thorough understanding of current advancements in VLA, master core algorithms, and be able to apply their knowledge in practical settings [39][40].
Course Schedule - The course is set to begin on October 20, with a structured timeline for each chapter's release [43].
What Stage Has Autonomous Driving VLA Reached? Is It Still a Good Time to Do Research?
自动驾驶之心· 2025-09-22 08:04
Core Insights
- The article discusses the transition in intelligent driving technology from rule-driven to data-driven approaches, highlighting the emergence of VLA (Vision-Language-Action) as a more straightforward and effective method compared to traditional end-to-end systems [1][2]
- The challenges in the current VLA technology stack are emphasized, including the complexity and fragmentation of knowledge, which makes it difficult for newcomers to enter the field [2][3]
- A new practical course on VLA has been developed to address these challenges, providing a structured learning path for students interested in advanced knowledge in autonomous driving [3][4][5]
Summary by Sections
Introduction to VLA - The article introduces VLA as a significant advancement in autonomous driving, offering a cleaner approach than traditional end-to-end systems, while also addressing corner cases more effectively [1]
Challenges in Learning VLA - The article outlines the difficulties faced by learners in navigating the complex and fragmented knowledge landscape of VLA, which includes a plethora of algorithms and a lack of high-quality documentation [2]
Course Development - A new course titled "Autonomous Driving VLA Practical Course" has been created to provide a comprehensive overview of the VLA technology stack, aiming to facilitate easier entry into the field for students [3][4]
Course Features
- The course is designed to address key pain points, offering quick entry into the subject matter through accessible language and examples [3]
- It aims to build a framework for understanding VLA research and enhance research capabilities by teaching students how to categorize papers and extract innovative points [4]
- The course includes practical components to ensure that theoretical knowledge is effectively applied in real-world scenarios [5]
Course Outline
- The course covers various topics, including the origins of VLA, foundational algorithms, and the differences between modular and integrated VLA systems [6][15][19][20]
- It also includes practical coding exercises and projects to reinforce learning and application of concepts [22][24][26]
Instructor Background - The course is led by experienced instructors with a strong background in multi-modal perception, autonomous driving, and large model frameworks, ensuring high-quality education [27]
Learning Outcomes - Upon completion, students are expected to have a thorough understanding of current advancements in VLA, core algorithms, and the ability to apply their knowledge in practical settings [28][29]
Proposed Cash Dividend of 1.03 Billion Yuan! WuXi AppTec Carries Out Its First Interim Dividend
Xin Lang Cai Jing· 2025-09-22 03:07
Core Viewpoint - WuXi AppTec (603259.SH/2359.HK) announced its first interim dividend plan, distributing a total cash dividend of 1.03 billion yuan, reflecting strong financial performance in the first half of the year [1]
Financial Performance
- For the first half of the year, WuXi AppTec achieved a revenue of 20.799 billion yuan, a year-on-year increase of 20.6% [1]
- The net profit attributable to shareholders reached 8.287 billion yuan, up 95.5% year-on-year [1]
- In Q2, the company reported revenue of 11.145 billion yuan, marking the first time it surpassed 10 billion yuan in a single quarter [1]
Dividend Distribution
- The total cash dividends distributed to investors this year, including annual, special, and interim dividends, amounted to 4.88 billion yuan [1]
- The total cash dividends and share buybacks reached 6.88 billion yuan, accounting for over 70% of the company's projected net profit for 2024 [1]
Order Backlog and Revenue Sources
- As of June 2025, the company had a backlog of orders amounting to 56.69 billion yuan, a year-on-year growth of 37.2% [2]
- Revenue from U.S. clients was 14.03 billion yuan, up 38.4% year-on-year, while revenue from European clients was 2.33 billion yuan, a 9.2% increase [2]
Business Model and Growth Drivers
- The growth is attributed to the focus on an "integrated, end-to-end" CRDMO business model, enhancing operational efficiency and expanding capabilities [4]
- The sale of partial equity in the joint venture WuXi XDC Cayman Inc. is expected to yield an investment income of approximately 3.21 billion yuan [4]
Future Projections
- The company expects revenue growth for its ongoing business to return to double digits, with the growth rate adjusted from 10%-15% to 13%-17% [4]
- Overall revenue projections for the year have been revised from 41.5-43 billion yuan to 42.5-43.5 billion yuan [4]
Accounts Receivable Trends - Accounts receivable increased from 3.665 billion yuan in 2020 to 7.918 billion yuan in Q1 2025, with the proportion of accounts receivable to revenue rising from 15.18% in 2022 to 19.59% in 2023 [5]
Opening Several Autonomous Driving Technical Discussion Groups (World Models / End-to-End / VLA)
自动驾驶之心· 2025-09-20 16:03
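Everyone is welcome to join and discuss related topics. If you are interested, add our assistant on WeChat to join a group: AIDriver005, with the note: nickname + area. The 自动驾驶之心 technical discussion groups are now set up; for the back-to-school and autumn recruitment season we have opened several groups (world models, end-to-end, VLA, and other areas). ...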
After All the Work on VLA So Far, It May Still Be More About Emotional Value......
自动驾驶之心· 2025-09-20 16:03
Core Insights
- The article discusses the current state of end-to-end (E2E) technology in both academia and industry, highlighting the differences in approach and data availability between the two sectors [1][4][5]
- It emphasizes the importance of data iteration speed in the AI model development process, suggesting that slow data iteration can hinder technological advancement [2][4]
- The article also explores the role of reinforcement learning in enhancing Vision-Language-Action (VLA) models, particularly in scenarios where there are no definitive correct answers [6][7][9][10]
Summary by Sections
End-to-End Technology
- The academic field is experiencing a proliferation of end-to-end methodologies, with various approaches emerging [1]
- In contrast, the industrial sector is more pragmatic, facing computational limitations that exclude some popular models, but benefiting from vast amounts of data [4]
- The success of models like ChatGPT is attributed to the internet's ability to provide extensive data, which is also true for the automotive industry, where companies can easily gather massive amounts of driving data [4]
Data and Technology Iteration
- The article stresses that as technology evolves rapidly, the iteration of datasets must keep pace; otherwise, it will impede technological progress [2]
- Research teams are increasingly publishing datasets alongside their papers to maintain high-impact outputs [3]
Reinforcement Learning and VLA
- Reinforcement learning is suitable for problems where there are no correct answers, only characteristics of correct and incorrect answers [7]
- The training process in reinforcement learning allows for the identification of optimal solutions based on reward systems, thus reducing the need for extensive demonstration data [9]
- The article notes that while short-term results of VLA applications may be uncertain, the long-term potential is widely recognized [10][11]
Future of VLA
- The article suggests that the importance of algorithms in VLA models extends beyond mere performance metrics; factors such as data availability and training strategies are crucial [12]
- The community is encouraged to engage in discussions about the development and challenges of autonomous driving technologies [5][13][16]
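The point above about reinforcement learning needing only the characteristics of good and bad answers, rather than demonstrated correct ones, can be illustrated with a toy reward function. This is a minimal sketch under stated assumptions, not a training loop and not any method from the article: the names `reward` and `best_candidate`, the 2-metre safety distance, and the penalty of 100 are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

Trajectory = List[Tuple[float, float]]  # (x, y) waypoints in metres (assumed)


def reward(traj: Trajectory, obstacle: Tuple[float, float],
           safe_dist: float = 2.0) -> float:
    """Score a trajectory by its characteristics: forward progress is good,
    coming closer than `safe_dist` to the obstacle is bad (hypothetical weights)."""
    progress = traj[-1][0] - traj[0][0]
    min_dist = min(
        ((x - obstacle[0]) ** 2 + (y - obstacle[1]) ** 2) ** 0.5 for x, y in traj
    )
    collision_penalty = 100.0 if min_dist < safe_dist else 0.0
    return progress - collision_penalty


def best_candidate(candidates: List[Trajectory],
                   obstacle: Tuple[float, float]) -> Trajectory:
    """Pick the highest-reward candidate; no demonstrated 'correct' path is needed."""
    return max(candidates, key=lambda t: reward(t, obstacle))


obstacle = (5.0, 0.0)
straight = [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0)]  # drives through the obstacle
swerve = [(0.0, 0.0), (5.0, 3.0), (9.0, 0.0)]     # less progress, keeps clear
chosen = best_candidate([straight, swerve], obstacle)
```

In an actual reinforcement learning setup the reward would be used to update a policy over many rollouts; the sketch only shows why a reward signal can rank behaviors without a labelled answer for each scene.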