OmniDrive
A Deep Dive into "Spatial Intelligence" in Academia and Industry: Most of It Still Stays at the Surface...
自动驾驶之心 · 2025-12-28 03:30
Core Viewpoint
- The article emphasizes the transition of autonomous driving from "perception-driven" to "spatial intelligence" by 2025, highlighting the importance of understanding and interacting with the three-dimensional physical world [3].

Group 1: Spatial Intelligence Definition
- Spatial intelligence is defined as the ability to perceive, represent, reason, decide, and interact with spatial information, which is crucial for the interaction between intelligent agents and the physical world [3].
- Current spatial intelligence is primarily focused on perception and representation, with significant room for improvement in reasoning, decision-making, and interaction capabilities [3].

Group 2: World Models and Simulation
- GAIA-2 is a multi-view generative world model for autonomous driving that generates driving videos based on physical laws and conditions, addressing edge cases in driving scenarios [5].
- GAIA-3 enhances GAIA-2 by increasing the scale fivefold and capturing fine-grained spatiotemporal contexts, representing the physical causal structure of the real world [9].
- ReSim combines expert trajectories from the real world with simulated dangerous behaviors to achieve high-fidelity simulations of extreme driving scenarios [11].

Group 3: Multimodal Reasoning
- The SIG framework introduces a structured graph scheme that encodes scene layouts and object relationships, aiming to enhance geometric reasoning in autonomous driving [16].
- OmniDrive generates a large-scale 3D question-answer dataset to align visual language models with 3D spatial understanding and planning [19].
- SimLingo addresses the alignment of driving behavior with semantic instructions through an action dreaming task, demonstrating the potential of general models in real-time decision-making [21].

Group 4: Real-time Digital Twins
- DrivingRecon is a 4D Gaussian reconstruction model that predicts parameters from surround-view videos, enabling efficient dynamic scene reconstruction for autonomous driving [26].
- VR-Drive enhances robustness in driving systems by allowing real-time prediction of new viewpoints without scene optimization [29].

Group 5: Embodied Fusion
- MiMo-Embodied is the first open-source cross-embodied model that integrates autonomous driving with embodied intelligence, showcasing significant transfer effects in spatial reasoning capabilities [31].
- DriveGPT4-V2 is a closed-loop end-to-end autonomous driving framework that outputs low-level control signals, evolving from visual understanding to closed-loop control [36].

Group 6: Industry Trends
- By 2025, the industry is moving towards an end-to-end VLA architecture, leveraging large language models for driving decision-making [40].
- Waymo's EMMA model integrates multimodal inputs and outputs in a unified language space, enhancing complex reasoning in driving tasks [41].
- DeepRoute.ai's DeepRoute IO 2.0 architecture introduces chain-of-thought reasoning to address the "black box" issue in end-to-end models, improving user trust in autonomous systems [44].
From Point Tools to a Full Flow: How 思尔芯 (S2C) Is Breaking Through
半导体芯闻 · 2025-12-09 10:36
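As chips keep growing in scale, as architectures move from monolithic dies to chiplets, and as layouts move from 2D to 3D, chip designers face unprecedented challenges. At the same time, supply-chain participants such as EDA vendors, manufacturers, and packaging houses are also being put to the test.

During the recent ICCAD 2025 summit, Chen Yingren, vice president of 思尔芯 (S2C), shared his views and his approach to breaking through from an EDA vendor's perspective.

Twenty Years of Focus on Prototype Verification

"Functional verification is extremely important to chip development. If a chip has a functional problem, the tape-out goes wrong or fails, which in turn decides whether the project succeeds. We provide a very good solution that lets our users accelerate verification, design, and development," Chen Yingren told 半导体行业观察.

As he said, this is the problem 思尔芯 has focused on solving for more than twenty years.

According to a report by "芯思想" citing related materials, in 2003 Alberto Sangiovanni-Vincentelli, a leading figure in EDA academia and a professor at the University of California, Berkeley (UC Berkeley), gave a talk titled "The Tides of EDA" at the 40th anniversary of DAC. He discussed how to view 40 years of DAC-related research, laid out the future trends and challenges of EDA, and stressed that this was an era of great change for EDA.

After DAC 2003 ended ...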
Fast-Slow Dual-System Evaluation! Bench2ADVLM: Designed Specifically for Autonomous-Driving VLMs (Nanyang Technological University)
自动驾驶之心 · 2025-08-07 23:32
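Paper authors | Tianyuan Zhang et al.
Editor | 自动驾驶之心

Foreword & the Author's Personal Take

Vision-language models (VLMs) have recently emerged as a promising paradigm in autonomous driving (AD). However, current protocols for evaluating VLM-based autonomous driving systems (ADVLMs) are largely confined to open-loop settings with static inputs. They neglect the more realistic and informative closed-loop setting, which captures interactive behavior, feedback resilience, and real-world safety. To address this, we introduce BENCH2ADVLM, a unified hierarchical closed-loop evaluation framework for real-time, interactive evaluation of ADVLMs on both simulation and physical platforms. Inspired by the dual-process theory of cognition, we first adapt a variety of ADVLMs to simulation environments through a dual-system adaptation architecture. In this design, the heterogeneous high-level driving commands generated by the target ADVLMs (the fast system) are interpreted by a general-purpose VLM (the slow system) as ... suitable for execution in simulation ...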
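
The fast-slow adaptation idea and the closed-loop setting described above can be illustrated with a toy sketch. Everything named here is hypothetical and not taken from the paper: `SimAction`, the two stand-in "fast systems", the keyword-based `slow_system` (a real setup would query a general-purpose VLM), and the one-dimensional simulator. The excerpt is truncated before it names the format the slow system produces, so the throttle/brake action is an assumption made only so the example runs.

```python
from dataclasses import dataclass


@dataclass
class SimAction:
    """Hypothetical simulator-side action (assumed; not named in the excerpt)."""
    throttle: float  # 0..1
    brake: float     # 0..1


def fast_system_a(gap_m: float) -> str:
    """Stand-in for one target ADVLM that emits free-text commands."""
    return "slow down" if gap_m < 20.0 else "keep speed"


def fast_system_b(gap_m: float) -> dict:
    """Stand-in for another ADVLM with a different, structured output format."""
    return {"maneuver": "DECELERATE" if gap_m < 20.0 else "CRUISE"}


def slow_system(command) -> SimAction:
    """Stand-in for the general-purpose VLM that normalizes heterogeneous commands.

    Keyword rules replace the VLM call here purely to keep the sketch runnable.
    """
    if isinstance(command, str):
        text = command.lower()
    else:
        text = str(command.get("maneuver", "")).lower()
    if "slow" in text or "decel" in text:
        return SimAction(throttle=0.0, brake=0.5)
    return SimAction(throttle=0.3, brake=0.0)


def run_closed_loop(fast_system, steps: int = 50, dt: float = 0.1) -> float:
    """Toy closed loop: each action changes the state the model observes next.

    Returns the smallest gap (in metres) to a stopped obstacle during the run.
    """
    gap_m, speed = 40.0, 10.0  # initial gap (m) and ego speed (m/s)
    min_gap = gap_m
    for _ in range(steps):
        action = slow_system(fast_system(gap_m))
        accel = 2.0 * action.throttle - 8.0 * action.brake  # toy vehicle model
        speed = max(0.0, speed + accel * dt)
        gap_m -= speed * dt
        min_gap = min(min_gap, gap_m)
    return min_gap


if __name__ == "__main__":
    for name, model in (("A", fast_system_a), ("B", fast_system_b)):
        print(f"fast system {name}: minimum gap = {run_closed_loop(model):.2f} m")
```

Because the model's command feeds back into the next observation, a metric such as the minimum gap reflects the consequences of earlier decisions, which an open-loop evaluation on static inputs cannot capture. The shared `slow_system` layer is what lets two models with different output formats be scored by the same loop.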