World Models
Learn at your own pace! The End-to-End and VLA Autonomous Driving small-class course has officially wrapped up
自动驾驶之心· 2025-12-09 19:00
Core Viewpoint
- 2023 marked the first year of end-to-end production, and 2024 is expected to be a significant year for end-to-end production in the automotive industry, as leading new-force carmakers and manufacturers have already achieved it [1][3].

Group 1: End-to-End Production Development
- End-to-end autonomous driving has two main paradigms, single-stage and two-stage. UniAD is a representative single-stage approach that models vehicle trajectories directly from sensor inputs [1].
- Since last year, single-stage end-to-end development has advanced rapidly, producing several derivatives: perception-based, world model-based, diffusion model-based, and VLA-based single-stage methods [3][5].
- Major players in the autonomous driving sector, both solution providers and car manufacturers, are focusing on in-house research and production of end-to-end autonomous driving technologies [3].

Group 2: Course Overview
- A course titled "End-to-End and VLA Autonomous Driving" has been launched. It teaches cutting-edge single-stage and two-stage end-to-end algorithms, with a focus on the latest developments in industry and academia [5][14].
- The course is organized into chapters, starting with an introduction to end-to-end algorithms, followed by background knowledge on technologies such as VLA, diffusion models, and reinforcement learning [8][9].
- The second chapter is highlighted as covering the technical keywords expected to come up most often in job interviews over the next two years [9].

Group 3: Technical Focus Areas
- The course covers the subfields of single-stage end-to-end methods: perception-based (UniAD), world model-based, diffusion model-based, and the currently popular VLA-based approaches [10][12].
- The curriculum includes practical assignments, such as RLHF fine-tuning, and aims to give students hands-on experience building and experimenting with pre-training and reinforcement learning modules [11][12].
- The course emphasizes BEV perception, multi-modal large models, and the latest advances in diffusion models, which it presents as crucial for the future of autonomous driving [12][16].
World Models for Autonomous Driving small-class course! A fast track through Tesla's world model and video & OCC generation
自动驾驶之心· 2025-12-09 19:00
Core Viewpoint
- The article introduces a new course titled "World Models and Autonomous Driving Small Class," focusing on advanced algorithms in autonomous driving, including general world models, video generation, and OCC generation [1][3].

Course Overview
- The course is developed in collaboration with industry leaders and follows the success of a previous course on end-to-end and VLA autonomous driving [1].
- The course aims to build understanding and practical skills in world models, which are crucial for the advancement of autonomous driving technology [11].

Course Structure

Chapter 1: Introduction to World Models
- This chapter covers the relationship between world models and end-to-end autonomous driving, the history of world models, and current application cases [6].
- It discusses the main types of world models: pure simulation, simulation plus planning, sensor-input generation, and perception-result generation [6].

Chapter 2: Background Knowledge of World Models
- The second chapter covers foundational knowledge for world models, including scene representation, Transformer technology, and BEV perception [6][12].
- It highlights key technical terms frequently encountered in job interviews related to world models [7].

Chapter 3: Discussion on General World Models
- This chapter covers popular general world models, including models from Fei-Fei Li's team and DeepMind, along with recent trends in autonomous driving jobs [7].
- It provides insights into the core technologies and design philosophies behind these models [7].

Chapter 4: Video Generation-Based World Models
- The fourth chapter focuses on video generation algorithms, covering significant works such as GAIA-1 and GAIA-2 and recent advances from various institutions [8].
- It includes hands-on practice with open-source projects such as OpenDWM [8].

Chapter 5: OCC-Based World Models
- This chapter explores OCC generation algorithms, discussing three major papers and a practical project that extends to vehicle trajectory planning [9].

Chapter 6: World Model Job Topics
- The final chapter shares practical experience from the instructor's career, addressing industry applications, pain points, and interview preparation for related positions [10].

Target Audience and Learning Outcomes
- The course is designed for people who want to deepen their understanding of end-to-end autonomous driving and world models [11].
- On completion, participants are expected to reach a level equivalent to one year of experience as a world model autonomous driving algorithm engineer, mastering the key technologies and being able to apply them in projects [14].
End-to-End Production small-class course: core algorithms & hands-on walkthroughs (7 projects)
自动驾驶之心· 2025-12-09 19:00
Core Insights
- The article discusses the evolving recruitment landscape in the autonomous driving sector, highlighting a shift in demand from perception roles to end-to-end, VLA, and world model positions [2]
- A new advanced course focused on end-to-end production in autonomous driving has been designed, emphasizing practical applications and real-world experience [2][4]

Course Overview
- The course is structured to cover various core algorithms, including one-stage and two-stage end-to-end methods, navigation information applications, reinforcement learning, and trajectory optimization [2]
- The course aims to provide the in-depth knowledge and practical skills needed for production in autonomous driving, with a focus on real-world applications and challenges [2][4]

Chapter Summaries
- **Chapter 1: Overview of End-to-End Tasks** Discusses the integration of perception tasks and the learning-based design of control algorithms, which are essential skills for companies in the end-to-end era [7]
- **Chapter 2: Two-Stage End-to-End Algorithm Framework** Introduces the modeling methods of two-stage frameworks and the information transfer between perception and planning, including practical examples [8]
- **Chapter 3: One-Stage End-to-End Algorithm** Focuses on one-stage frameworks that allow for lossless information transfer, presenting various methods and practical learning experiences [9]
- **Chapter 4: Production Application of Navigation Information** Covers the critical role of navigation information in autonomous driving, detailing mainstream navigation map formats and their integration into models [10]
- **Chapter 5: Introduction to RL Algorithms in Autonomous Driving** Explains the necessity of reinforcement learning in conjunction with imitation learning to enhance the model's ability to generalize [11]
- **Chapter 6: Trajectory Output Optimization** Engages participants in practical projects focusing on algorithms based on imitation learning and reinforcement learning [12]
- **Chapter 7: Safety Net Solutions - Spatiotemporal Joint Planning** Discusses post-processing logic to ensure model accuracy and stability in trajectory outputs, introducing common smoothing algorithms [13]
- **Chapter 8: Experience Sharing on End-to-End Production** Provides insights on practical experiences in production, addressing data, models, scenarios, and strategies for system capability enhancement [14]

Target Audience
- The course is aimed at advanced learners with a foundational understanding of autonomous driving algorithms, reinforcement learning, and programming skills [15][17]
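The Chapter 7 summary above mentions "common smoothing algorithms" as a post-processing safety net for trajectory outputs, but the summary does not say which algorithms the course teaches. As a rough illustration only, the sketch below assumes one of the simplest options, a centred moving average over planned waypoints. The `Point` representation, the function name `smooth_trajectory`, and the `window` parameter are all hypothetical and are not taken from the course.

```python
from typing import List, Tuple

# Hypothetical representation: an (x, y) waypoint in metres.
Point = Tuple[float, float]


def smooth_trajectory(points: List[Point], window: int = 3) -> List[Point]:
    """Smooth waypoints with a centred moving average.

    The first and last waypoints are kept fixed so the path still starts
    and ends where the upstream planner intended.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    last = len(points) - 1
    smoothed: List[Point] = []
    for i, point in enumerate(points):
        if i == 0 or i == last:
            smoothed.append(point)
            continue
        # Shrink the window near the ends so it stays symmetric around i.
        k = min(half, i, last - i)
        neighbours = points[i - k : i + k + 1]
        mean_x = sum(p[0] for p in neighbours) / len(neighbours)
        mean_y = sum(p[1] for p in neighbours) / len(neighbours)
        smoothed.append((mean_x, mean_y))
    return smoothed


# A jittery lateral profile, as a model might output it (made-up numbers).
raw = [(0.0, 0.0), (1.0, 0.3), (2.0, -0.3), (3.0, 0.3), (4.0, 0.0)]
smoothed = smooth_trajectory(raw, window=3)
```

A moving average only damps jitter. It knows nothing about vehicle dynamics, obstacles, or timing, which is presumably why the chapter frames smoothing as one part of a broader spatiotemporal planning fallback rather than a complete solution.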
Khosla's biggest bet since OpenAI: General Intuition builds world models from 3.8 billion game highlight clips
海外独角兽· 2025-12-09 12:05
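Compiled by: Haozhen, Gemini

One of the reasons behind this big bet is that General Intuition holds a unique dataset that the rest of the industry cannot replicate.

General Intuition was spun out of Medal, a platform for clipping game highlights, and has more than 3.8 billion short game video clips. Unlike traditional robotics or simulation data, Pim argues, highlight clips are humans' episodic memory in simulated environments: the densest digital record of human intuition, reactions, and decisions.

If OpenAI, through ChatGPT, addressed human "cognition and logic" by teaching machines to think, reason, and code in complex ways as people do, General Intuition wants to give machines human-like "intuition and physical common sense," so that they can understand the spatial relationships of the physical world at an instinctive level.

In CEO Pim de Witte's vision, LLMs handle thinking and planning (Next Token), while General Intuition, building on its data advantage, handles action and interaction (Next Action), and the two form a complementary structure of intelligence. The team hopes to start from game scenarios, move through simulated environments to autonomous driving, and then extend to robotics and the physical world. The ultimate vision is to achieve "Atoms to ...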
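The World Models and Autonomous Driving small-class course is officially launched! Tesla's world model and video/OCC generation, all in one place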
自动驾驶之心· 2025-12-09 07:59
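Jason's new course, "World Models and Autonomous Driving Small Class," is officially launched! 自动驾驶之心 is running it jointly with senior industry practitioners. The earlier "End-to-End and VLA Autonomous Driving" small-class course was well received, so we are following it with this world model course. The course focuses on world model algorithms such as general world models, video generation, and OCC generation, and covers Tesla's world model, Marble from Fei-Fei Li's team, and more. You are welcome to join.

Course outline

How the course will unfold

Chapter 1: Introduction to World Models

Chapter 1 gives a high-level overview of world models for autonomous driving. The instructor first reviews the connection between world models and end-to-end autonomous driving, then covers the history of world models and current application cases. Next comes an introduction to the main schools of world models: pure simulation, simulation + Planning, sensor-input generation, perception-result generation, and so on. For each school, the chapter looks at how it is applied in industry today, what problems it solves, and which stage of the autonomous driving stack it sits in. It also covers what academia and industry are working on and which datasets and benchmarks are relevant.

Chapter 2: Background Knowledge of World Models

Early-bird discount! Ends when the course starts.

Instructor

Jason: C9 undergraduate + QS50 PhD; has published 2 CCF-A papers and several CCF-B papers. ...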
500 million yuan in 3 months! The embodied-AI robotics startup heavily backed by Huawei closes another funding round!
Robot猎场备忘录· 2025-12-09 00:03
A fifth round: 极佳视界, a leading general embodied-intelligence company in China, has completed an A2 round in the 200 million yuan range!

On December 8, 北京极佳视界科技有限公司 (short name "极佳视界 GigaAI"), a leading startup in Physical AI, announced the completion of a 200 million yuan A2 round. The round was led by 达晨财智 and co-led by existing shareholder 华控基金, with participation from well-known institutions including 首发展创投, 浦耀信晔, 财鑫资本, 珠海科技产业集团, 张科垚坤, and 复琢创投. Existing shareholder 合鼎共资本 made an oversubscribed follow-on investment.

Funding history (7), © 企查查. Columns: No. | Funding date | Round | Amount | Investors | Affiliated institutions | Source ...
Mid-tier intelligent-driving vendors are quickly snapping up end-to-end talent......
自动驾驶之心· 2025-12-09 00:03
Core Viewpoint
- The article discusses the technological anxiety in intelligent driving, particularly among mid-tier manufacturers, and highlights the anticipated growth in demand for end-to-end (E2E) and VLA (Vision-Language-Action) technologies in the coming year [2].

Group 1: Industry Trends
- The mass production of cutting-edge technologies like end-to-end systems is expected to begin next year, with L2 technology becoming more standardized and moving towards lower-tier markets [2].
- The total sales of passenger vehicles priced above 200,000 are around 7 million, but leading new forces account for less than one-third of this, indicating a slow adoption of end-to-end mass production models [2].
- The maturity of end-to-end technology is seen as a precursor to larger-scale production, with the advancement of L3 regulations prompting urgent upgrades among mid-tier manufacturers [2].

Group 2: Recruitment and Training
- There is a growing demand for positions related to end-to-end and VLA technologies, as many professionals are seeking to quickly learn these advanced skills [3].
- The article mentions the launch of specialized courses aimed at practical applications of end-to-end and VLA technologies, designed for individuals already working in the field [3][6].
- The courses will cover various modules, including navigation information application, reinforcement learning optimization, and production experiences related to diffusion and autoregressive models [3][6].

Group 3: Course Details
- The end-to-end production course will focus on practical implementation, including seven major practical applications, making it suitable for those looking to advance their careers [3][6].
- The VLA course will cover foundational algorithms and theories, including BEV perception and large language models, with practical projects based on diffusion models and VLA algorithms [6][11].
- The instructors for these courses are experienced professionals from top-tier companies and academic institutions, ensuring a high-quality learning experience [5][8][13].
Shifts in the domestic intelligent-driving chip landscape
2025-12-08 15:36
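Shifts in the domestic intelligent-driving chip landscape, 20251208

Summary

NIO's intelligent-driving solution is developed fully in-house across the stack and centers on world models, but its results lag behind comparatively. Its main tasks for next year are to improve the "接网率" of the parking-spot-to-parking-spot function and to handle complex cases. The ONVO (乐道) and Firefly (萤火虫) lines are expected to keep using NVIDIA solutions.

XPeng's mid- to high-end models will carry its in-house Turing (图灵) chip, with 1,000+ TOPS of compute. On the algorithm side the focus is on iterating VLA and world models with deep integration of the BL module, and the company plans to use the Turing chip in its Robotic business line to improve the traffic efficiency and safety of its Robot Taxi.

Li Auto's in-house M100 (苏马赫) chip is expected to enter mass production in Q2 2026, debuting on a refreshed high-end model. The AD Max system will see M100 and Horizon Robotics (地平线) solutions coexisting, while the AD Pro system will continue to use Horizon Robotics, possibly upgraded to the G6H version. On algorithms, the company is firmly committed to the VOL route.

Xiaomi plans to use NVIDIA 42-series chips on its high-end models and has put its in-house 玄戒 O2 chip on hold. Next year it will adopt a Tesla-like architecture, built mainly on world models with a language model as an aid, to address parking-lot entry and exit, road sign recognition, and urban commuting.

BYD's high-end solution will be upgraded to NVIDIA's Thor (索尔) solution, debuting on the refreshed Yangwang (仰望) U8, with Momenta supplying the plus version of its R6 large-model algorithm. 终端方案天翼 ...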
Jijiashijie (极佳视界) completes 200 million yuan Series A2 round led by Dacheng Caizhi (达晨财智)
Xin Lang Cai Jing· 2025-12-08 15:14
Investment Overview
- The company Jijiashijie has recently completed a new round of financing, raising 200 million yuan in Series A2 funding, led by Dacheng Caizhi, with participation from several notable institutions [1][3]
- This round of financing follows three previous rounds (Pre-A, Pre-A+, A1) completed within three months, totaling 500 million yuan in Series A funding [1][3]

Company Focus and Products
- Jijiashijie specializes in general intelligence for the physical world, aiming for physical AGI (Artificial General Intelligence) and has plans to release a corresponding ontology by November 26, 2025 [1][3]
- The company's product offerings include the GigaWorld platform (for driving and embodiment), GigaBrain (general embodied brain), and Maker (general embodied ontology), representing a full-stack approach to physical AI [1][3]

Model Development
- The company has introduced a native paradigm of "world model + action model + reinforcement learning," where each component is driven by the world model [1][3]
- The current trend in model architecture is converging towards general action models, with a shift in data sources to real machine data and world model-generated data [2][4]

Industry Trends
- The company believes that physical AI is entering a new critical era, with the next 2-3 years being a key window for breakthroughs in physical AGI [5]
- The advancements in world models and action models are accelerating the arrival of a "ChatGPT moment" in the physical world [5]
Roblox CEO marvels at the pace of AI research: once a voracious reader, even he can barely keep up
Sou Hu Cai Jing· 2025-12-08 11:28
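Baszucki founded Roblox in 2005. In the company's early days he read nearly all the research there was, from physics simulation to graphics rendering, and could understand all of it. The arrival of the AI era changed everything. He describes today's wave of research as "enormous in scale and astonishing in speed": from Transformers to diffusion models to world models, "there is so much that it is hard to fully grasp."

IT之家, December 8: AI research moves fast, new papers appear almost daily, and the technical concepts keep getting more complex. Roblox CEO David Baszucki knows this firsthand.

According to a Business Insider report today, Baszucki said he set aside a large amount of time during a vacation to read AI research systematically, only to find the process "sobering": truly understanding every paper is "extremely difficult."

Although outside attention is focused on scaling up compute, OpenAI co-founder Ilya Sutskever holds that what really determines where AI goes is still "research itself": "We are back in the age of research, only now with bigger computers."

For Roblox, Baszucki's conclusion is that AI in the "3D world" is still at a very early stage. He points out that AI relies on text and images made by humans: "We are training AI on content we created ourselves, not on raw 3D data from the real world."

As AI expands from academia to the level of national strategy, companies such as Meta and Microsoft have been setting up their own ...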