Core Viewpoint
- Despite significant advances in multimodal large language models (MLLMs), existing models remain insufficiently aligned with human preferences, largely because current alignment research concentrates on narrow targets such as reducing hallucinations; how human-preference alignment affects the full range of MLLM capabilities remains an open question [1].

Group 1: Contributions
- Introduction of a new dataset of 120,000 finely annotated preference comparison pairs, substantially improving on existing resources in scale, sample diversity, annotation granularity, and quality [5].
- Development of a critique-based reward model that offers better interpretability and more informative feedback than traditional scalar reward mechanisms; a 7B model trained this way outperforms existing 72B reward models (a minimal sketch follows this summary) [5].
- Dynamic reward scaling that weights high-quality comparison pairs more heavily during training, improving data utilization efficiency (see the second sketch below) [5].
- Comprehensive evaluation across 10 dimensions and 27 benchmarks, demonstrating significant and consistent performance improvements across capabilities [5][15].

Group 2: Data Sources and Annotation
- Image data is drawn from LLaVA-OV, VLFeedback, and other datasets, totaling 10 million samples; video data comes primarily from ShareGPT4Video [6].
- Annotation covers three dimensions: helpfulness, truthfulness, and ethics, with human experts providing detailed scores and written justifications for each ranking (an illustrative record layout appears after the sketches below) [7].

Group 3: Performance Evaluation
- The proposed framework is competitive with GPT-4o across multiple benchmarks, with notable gains on the authors' custom benchmarks, validating the effectiveness of the training algorithm's reward signals [10].
- Conversational ability and safety improved markedly, with average gains exceeding 10% on conversation benchmarks and unsafe behaviors reduced by at least 50% [17].

Group 4: Future Research Directions
- The study highlights untapped value in the dataset, particularly its fine-grained annotations, for strengthening current alignment algorithms and addressing the limitations of specific benchmarks [21].
- Future work will combine this detailed annotation signal with advanced optimization techniques to further improve MLLM alignment and move toward a more general multimodal learning framework [22].
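The critique-based reward model replaces the usual direct (prompt, response) → scalar mapping with a two-stage process: first generate a textual critique of the response, then produce a scalar score conditioned on that critique. Below is a minimal PyTorch sketch of that structure; `TinyBackbone`, `CritiqueRewardModel`, all layer sizes, and the last-token pooling choice are hypothetical stand-ins for illustration, not the paper's actual architecture.

```python
# Minimal sketch of a critique-based reward model (hypothetical architecture).
import torch
import torch.nn as nn

class TinyBackbone(nn.Module):
    """Stand-in for a pretrained (multimodal) LLM that returns token hidden states."""
    def __init__(self, vocab_size=1000, d_model=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.encoder = nn.GRU(d_model, d_model, batch_first=True)

    def forward(self, token_ids):
        hidden, _ = self.encoder(self.embed(token_ids))
        return hidden  # (batch, seq_len, d_model)

class CritiqueRewardModel(nn.Module):
    """Two-stage reward model: (1) a language-model head for writing critiques,
    (2) a scalar head that scores the response *conditioned on* the critique,
    rather than mapping (prompt, response) straight to a number."""
    def __init__(self, vocab_size=1000, d_model=64):
        super().__init__()
        self.backbone = TinyBackbone(vocab_size, d_model)
        self.critique_head = nn.Linear(d_model, vocab_size)  # next-token logits for the critique
        self.score_head = nn.Linear(d_model, 1)              # scalar reward head

    def critique_logits(self, prompt_response_ids):
        # Used to train/generate the textual critique.
        return self.critique_head(self.backbone(prompt_response_ids))

    def reward(self, prompt_response_ids, critique_ids):
        # Condition the score on the critique by appending it before pooling.
        full = torch.cat([prompt_response_ids, critique_ids], dim=1)
        hidden = self.backbone(full)
        return self.score_head(hidden[:, -1]).squeeze(-1)  # (batch,)

model = CritiqueRewardModel()
prompt_resp = torch.randint(0, 1000, (2, 16))  # toy (prompt + response) token ids
critique = torch.randint(0, 1000, (2, 8))      # toy critique token ids
print(model.reward(prompt_resp, critique))     # two scalar rewards
```

The interpretability claim in the summary comes from the intermediate critique: the scalar score is accompanied by a written rationale, which is also what makes the feedback more informative than a bare number.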
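For dynamic reward scaling, the summary only says that high-quality comparison pairs are used more aggressively. One plausible reading is a DPO-style objective whose per-pair weight grows with the reward-model margin; the sketch below implements that reading. The standard DPO loss is real, but the scaling function (`1 + k·tanh(·)`) and the parameter `k` are hypothetical choices, not the paper's exact formulation.

```python
# DPO loss with a hypothetical dynamic reward-scaling term.
import torch
import torch.nn.functional as F

def dpo_loss_with_dynamic_scaling(policy_chosen_logps, policy_rejected_logps,
                                  ref_chosen_logps, ref_rejected_logps,
                                  reward_margin, beta=0.1, k=1.0):
    """reward_margin: reward-model score gap r(chosen) - r(rejected), per pair."""
    # Standard DPO objective: -log sigmoid(beta * (chosen logratio - rejected logratio)).
    logits = beta * ((policy_chosen_logps - ref_chosen_logps)
                     - (policy_rejected_logps - ref_rejected_logps))
    per_pair = -F.logsigmoid(logits)
    # Hypothetical dynamic scale: pairs the reward model separates confidently
    # get up to (1 + k)x weight; low-margin pairs keep the baseline weight of 1.
    scale = 1.0 + k * torch.tanh(reward_margin.clamp(min=0))
    return (scale * per_pair).mean()

margins = torch.tensor([0.2, 2.5])  # toy reward-model score gaps for two pairs
loss = dpo_loss_with_dynamic_scaling(
    torch.tensor([-10.0, -9.0]), torch.tensor([-12.0, -13.0]),
    torch.tensor([-10.5, -9.5]), torch.tensor([-11.5, -12.0]),
    margins)
print(loss)
```

Note the design choice in this sketch: the scale never drops below 1, so low-margin pairs still contribute; the mechanism only up-weights the pairs the reward model is most confident about.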
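Group 2 describes per-pair scores and written justifications along three dimensions. The dataclass below is one hypothetical layout for such a record; every field name and score range is illustrative, not the dataset's actual schema.

```python
# Hypothetical record layout for one annotated preference comparison pair.
from dataclasses import dataclass, field

@dataclass
class PreferencePair:
    prompt: str                      # question about the image or video
    media_path: str                  # path to the image/video sample
    response_a: str
    response_b: str
    # Per-dimension expert scores, e.g. {"helpfulness": 4, "truthfulness": 5, "ethics": 5}.
    scores_a: dict = field(default_factory=dict)
    scores_b: dict = field(default_factory=dict)
    ranking: str = "A>B"             # overall preference between the two responses
    justification: str = ""          # expert's written rationale for the ranking
```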
A New Paradigm for Multimodal Large Model Alignment: Comprehensive Gains Across 10 Evaluation Dimensions, as Kuaishou, CAS & Nanjing University Break Through the Bottleneck
量子位·2025-02-26 03:51