Closed-Loop Collision Rate Cut by 50%! DistillDrive: A New End-to-End Approach with Heterogeneous Multimodal Distillation
自动驾驶之心 · 2025-08-11 23:33
Paper authors | Rui Yu et al.
Editor | 自动驾驶之心

Today 自动驾驶之心 shares the latest work from East China University of Science and Technology, SenseTime Research, and the University of Sydney: DistillDrive, a heterogeneous distillation framework that cuts the autonomous-driving collision rate by 50%!

Paper link: https://arxiv.org/abs/2508.05402
Code link: https://github.com/YuruiAI/DistillDrive

Preface & ...

Introduction

End-to-end autonomous driving has made remarkable progress in recent years, driven largely by advances in perception and imitation learning. As shown in Figure 1(b), this approach learns final planning and decision-making directly from complex sensor inputs, eliminating intermediate data hand-offs and target representations and thereby substantially reducing cascading errors. In closed-loop experiments, however, the perception-decoupled planning model of Figure 1(a) outperforms end-to-end models, benefiting from its contrastive learning and simulation experiments. Even so, it faces a coupling barrier between perception and planning.
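To make the distillation idea concrete, here is a minimal sketch, assuming a perception-decoupled teacher planner whose plan scores are imitated by an end-to-end student alongside the expert demonstration. The loss form, tensor shapes, and hyperparameters are illustrative assumptions, not the DistillDrive implementation; the paper's heterogeneous multimodal objectives are defined in the linked arXiv manuscript.

```python
# Minimal knowledge-distillation sketch (PyTorch) -- NOT the official DistillDrive code.
# Assumed setup: the teacher is a perception-decoupled planner scoring K candidate
# plans from ground-truth scene state; the student is an end-to-end model scoring
# the same candidates from raw sensor features.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,   # (B, K) student plan scores
                      teacher_logits: torch.Tensor,   # (B, K) teacher plan scores
                      expert_plan: torch.Tensor,      # (B,) index of the expert plan
                      temperature: float = 2.0,
                      alpha: float = 0.5) -> torch.Tensor:
    # Hard target: imitate the recorded expert decision.
    hard = F.cross_entropy(student_logits, expert_plan)
    # Soft target: match the teacher's plan distribution (temperature-scaled KL).
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    return alpha * hard + (1.0 - alpha) * soft
```

Blending a hard imitation term with a soft teacher-matching term is the standard distillation recipe; the temperature softens the teacher distribution so the student also learns from near-miss candidate plans.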
A Round-Up of Embodied-AI Work Combining LLMs with Reinforcement Learning and World Models
具身智能之心 · 2025-07-29 06:15
Core Viewpoint
- The article surveys recent advances in embodied intelligence, focusing on the integration of large language models (LLMs) with reinforcement learning and world models, and highlights several notable research papers from 2024 [2][3].

Group 1: UniSim
- UniSim aims to learn a general real-world interactive simulator through generative modeling, showing that natural datasets can provide diverse advantages for learning simulators [3].
- Integrating various datasets allows the simulation of both high-level commands and low-level controls, enabling zero-shot application in real-world scenarios [3].

Group 2: Robust Agents
- The Google DeepMind study argues that causal reasoning is essential for robust and general AI, concluding that agents capable of satisfying regret bounds must learn approximate causal models [5].
- This finding has significant implications for transfer learning and causal inference [5].

Group 3: MAMBA
- MAMBA introduces an efficient world-model approach for meta-reinforcement learning, addressing the sample-efficiency issues prevalent in current methods [8].
- The framework improves sample efficiency by up to 15x on high-dimensional tasks [8].

Group 4: EMMA
- EMMA leverages an LLM trained in a text-based world to guide the training of a visual world agent, enhancing its ability to interact with dynamic environments [10].
- The approach improves success rates by 20%-70% across diverse tasks compared with existing VLM agents [10].

Group 5: Text2Reward
- The Text2Reward framework automates the generation of dense reward functions using LLMs, addressing the difficulty of reward-function design in reinforcement learning; a minimal sketch of the pattern follows this list [13][14].
- The method outperforms baselines on 13 of 17 tasks and achieves over 94% success on novel motion behaviors [14].

Group 6: Online Continual Learning
- The research proposes two frameworks for continual learning in interactive instruction-following agents, emphasizing that agents must learn incrementally as they explore their environments [17][18].
- A confidence-aware moving average mechanism updates parameters without relying on task-boundary information; a sketch of this idea also follows the list [18].

Group 7: AMAGO
- AMAGO is a scalable in-context reinforcement learning framework that addresses challenges in generalization, long-term memory, and meta-learning [21].
- The framework trains long-sequence transformers in parallel, enhancing scalability and performance on complex tasks [21].

Group 8: PDDL-based Planning
- The study presents a novel paradigm for task planning with pre-trained LLMs, building explicit world models through PDDL [22][23].
- Allowing the LLM to convert between PDDL and natural language significantly reduces the need for human intervention and makes model correction efficient; a round-trip sketch appears after the list [23].
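Text2Reward's core mechanism, prompting an LLM to emit an executable dense reward function from a task description, can be sketched as below. This is a minimal illustration of the pattern: the prompt template, the `query_llm` helper, and the canned generated function are all hypothetical stand-ins, not the paper's actual prompts or outputs.

```python
# Illustrative sketch of the Text2Reward pattern: an LLM writes a dense reward
# function as Python source, which is compiled and handed to an RL trainer.
# `query_llm` is a hypothetical stand-in for any chat-completion client.
import math
from typing import Callable, Dict

PROMPT_TEMPLATE = """You are writing a dense reward function for a robot task.
Task: {task}
Observation keys: {obs_keys}
Return only a Python function `reward(obs) -> float`."""

def query_llm(prompt: str) -> str:
    # Hypothetical: route to a real LLM here. A canned response keeps the
    # sketch self-contained and runnable.
    return (
        "def reward(obs):\n"
        "    # Dense shaping: negative distance from gripper to target.\n"
        "    dx = obs['gripper_x'] - obs['target_x']\n"
        "    dy = obs['gripper_y'] - obs['target_y']\n"
        "    return -math.hypot(dx, dy)\n"
    )

def synthesize_reward(task: str, obs_keys: list) -> Callable[[Dict], float]:
    source = query_llm(PROMPT_TEMPLATE.format(task=task, obs_keys=obs_keys))
    namespace = {"math": math}
    exec(source, namespace)  # compile the generated function into `namespace`
    return namespace["reward"]

reward_fn = synthesize_reward("reach the target",
                              ["gripper_x", "gripper_y", "target_x", "target_y"])
print(reward_fn({"gripper_x": 0.1, "gripper_y": 0.2, "target_x": 0.0, "target_y": 0.0}))
```

The payoff of the pattern is that the reward is dense (a smooth distance term) rather than a sparse success flag, which is exactly the shaping that is tedious to hand-engineer per task.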
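The confidence-aware moving average from the continual-learning work can likewise be illustrated. The sketch assumes the simplest plausible form, a running parameter average whose blend weight scales with the model's current predictive confidence, so no task-boundary signal is needed; the paper's exact rule may differ.

```python
# Sketch of a confidence-aware moving-average parameter update (assumed form).
# Higher predictive confidence -> larger blend weight for the freshly adapted
# weights; no task-boundary information is required.
import torch

@torch.no_grad()
def cama_update(avg_params, new_params, confidence: float, max_rate: float = 0.1):
    """Blend freshly adapted parameters into a running average.

    confidence: scalar in [0, 1], e.g. mean max-softmax probability on the
    latest batch. The blend weight adapts every step.
    """
    beta = max_rate * confidence  # assumed mapping from confidence to step size
    for avg, new in zip(avg_params, new_params):
        avg.mul_(1.0 - beta).add_(new, alpha=beta)

# Usage after each online adaptation step (model names are illustrative):
# cama_update(stable_model.parameters(), plastic_model.parameters(), confidence=0.8)
```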
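Finally, the PDDL round trip from Group 8, where the LLM drafts a world model in PDDL and renders it back into natural language so a possibly non-expert human can correct it, can be sketched as follows. The PDDL action, the `llm` helper, and its canned replies are hypothetical illustrations of the loop, not the study's actual prompts.

```python
# Sketch of the LLM + PDDL loop: draft a PDDL action, render it back to natural
# language for human review, then patch the PDDL from the human's correction.
# `llm` is a hypothetical chat-completion wrapper; the PDDL text is illustrative.

DRAFT_ACTION = """(:action pick-up
  :parameters (?b - block)
  :precondition (and (clear ?b) (on-table ?b) (hand-empty))
  :effect (and (holding ?b) (not (on-table ?b)) (not (hand-empty))))"""

def llm(prompt: str) -> str:
    # Hypothetical: route to a real LLM in practice; a canned reply keeps the
    # sketch runnable.
    return "Pick up a block only if it is clear, on the table, and the hand is empty."

def pddl_to_english(pddl: str) -> str:
    # Translate the formal model into prose a non-expert can verify.
    return llm(f"Explain this PDDL action in one plain-English sentence:\n{pddl}")

def english_to_pddl_patch(pddl: str, correction: str) -> str:
    # Fold the human's plain-English correction back into the formal model.
    return llm(f"Revise this PDDL action so that: {correction}\n{pddl}")

print(pddl_to_english(DRAFT_ACTION))
patched = english_to_pddl_patch(DRAFT_ACTION, "the block must also not be glued down")
```

Because a classical planner consumes the corrected PDDL, the LLM never has to plan directly; it only authors and repairs the world model, which is what reduces the expert effort the paper highlights.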