Xiaomi Proposes DriveMRP: Synthetic Hard-Case Data + Visual Prompts Push Accident Recognition Rate to 88%!

Core Viewpoint - The article discusses advancements in autonomous driving technology, specifically focusing on the DriveMRP framework, which synthesizes high-risk motion data to enhance the motion risk prediction capabilities of vision-language models (VLMs) [1][4].
Background and Core Objectives - Autonomous driving technology has rapidly developed, but accurately predicting the safety of ego vehicle movements in rare high-risk scenarios remains a significant challenge. Existing trajectory evaluation methods often provide a single reward score, lacking risk type explanation and decision-making support [1].
Limitations of Existing Methods - Rule-based methods rely heavily on external world models and are sensitive to perception errors, making them difficult to generalize to complex real-world scenarios, such as extreme weather conditions [2].
Core Innovative Solutions
- DriveMRP-10K: A synthetic high-risk motion dataset containing 10,000 high-risk scenarios, generated through a "human-in-the-loop" mechanism, enhancing the VLM's motion risk prediction capabilities [4].
- DriveMRP-Agent: A VLM framework that improves risk reasoning by using inputs like BEV layout and scene images [5].
- DriveMRP-Metric: Evaluation metrics that assess model performance through high-risk trajectory synthesis and automatic labeling of motion attributes [5].
Performance Improvement - On the DriveMRP-10K dataset, the DriveMRP-Agent achieved a scene understanding metric (ROUGE-1-F1) of 69.08 and a motion risk prediction accuracy of 88.03%, significantly surpassing other VLMs. The accident identification accuracy improved from 27.13% to 88.03% [7][8].
Dataset Effectiveness - The DriveMRP-10K dataset significantly enhances the performance of various general VLMs, demonstrating its "plug-and-play" enhancement capability [10].
Key Component Ablation Experiments - The inclusion of global context in the model led to significant improvements in scene understanding and risk prediction metrics, highlighting the importance of global information for reasoning [12].
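Metric Illustration - For readers unfamiliar with the two kinds of figures reported above, the following is a minimal sketch of how a ROUGE-1 F1 score and a classification accuracy are commonly computed. It is not the DriveMRP evaluation code: the whitespace tokenization, the function names `rouge1_f1` and `accuracy`, and the example risk labels are assumptions made purely for illustration, and the article does not describe the exact implementation used.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a generated text and a reference text.

    Assumption: lowercase whitespace tokenization with no stemming; the
    tokenizer behind the reported scores is not stated in the article.
    """
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    if not cand_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: a unigram counts only as often as it occurs in both texts.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def accuracy(predicted: list[str], gold: list[str]) -> float:
    """Fraction of samples whose predicted label equals the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


if __name__ == "__main__":
    # Hypothetical scene descriptions and risk labels, not taken from DriveMRP-10K.
    print(rouge1_f1("the car brakes hard", "the car brakes"))
    print(accuracy(["collision", "safe", "collision", "safe"],
                   ["collision", "safe", "safe", "safe"]))
```

A score such as 69.08 corresponds to this F1 value scaled to the 0-100 range and averaged over all samples, while 88.03% corresponds to the accuracy value expressed as a percentage.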