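A comprehensive "health check" for the safety and trustworthiness of audio large models! 6 dimensions, jointly built by Tsinghua and Nanyang Technological University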
量子位 (QbitAI) · 2025-06-03 04:26

Core Viewpoint - The article introduces AudioTrust, a novel multidimensional credibility assessment framework specifically designed for Audio Large Language Models (ALLMs), addressing the unique characteristics and security issues of audio modalities [1][5][41].

Summary by Sections

Introduction to AudioTrust - AudioTrust is developed by a research team led by Nanyang Technological University and Tsinghua University to tackle credibility challenges posed by ALLMs, utilizing a two-phase architecture that separates reasoning execution from credibility analysis [5][39].

Assessment Framework - The framework expands the evaluation scope to six core dimensions: Fairness, Hallucination, Safety, Privacy, Robustness, and Authentication, focusing on various scenarios and feature classifications unique to audio modalities [8][41].

Fairness - The fairness assessment incorporates seven sensitive attributes and utilizes 840 high-quality audio samples to simulate diverse social roles and contexts, revealing systemic biases in mainstream language models [10][12][43].

Hallucination - The hallucination module evaluates the model's tendency to misinterpret audio in complex acoustic environments, identifying two core dimensions: factual and logical hallucinations, with findings indicating that errors stem from audio signal processing rather than reasoning flaws [14][17][43].

Safety - The safety assessment investigates two primary risks: jailbreak attacks and guidance for illegal activities, utilizing 600 test samples. Results show that audio modalities pose significant security threats, particularly in medical scenarios [20][22][43].

Privacy - The privacy evaluation focuses on direct and inferred privacy leaks, with a dataset of 900 audio samples. Findings indicate a notable inconsistency in privacy protection across models, with closed-source models performing better in certain sensitive information categories [25][27][43].

Robustness - The robustness module examines the stability of ALLMs against real-world audio disturbances, revealing significant performance variability among models, with the Gemini series consistently outperforming others [29][31][43].

Authentication - The authentication assessment tests the models' resilience against deception attacks, highlighting significant differences in performance based on model type and scenario, with stricter prompts enhancing defense against voice cloning attacks [34][37][43].

Key Innovations of AudioTrust - AudioTrust is distinguished by four key innovations: comprehensive evaluation dimensions, real-world scenario datasets, audio-specific assessment metrics, and an automated evaluation pipeline, significantly improving assessment efficiency and consistency [39][41].

Conclusion - The research establishes AudioTrust as the first multidimensional trust assessment benchmark for ALLMs, revealing potential risks across the six core dimensions and laying a solid foundation for future credibility research in the field [41][42].
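
Illustrative sketch - The summary describes a two-phase architecture that keeps model execution separate from credibility analysis. The Python below is only a minimal sketch of that separation, not AudioTrust's actual code: every name in it (`Sample`, `Record`, `run_inference`, `analyze`, `stub_model`, `refusal_judge`, the `.wav` file names) is a hypothetical placeholder, and the keyword-based judge is a toy stand-in for whatever metrics the benchmark really uses.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List

# The six dimensions named in the summary.
DIMENSIONS = ("fairness", "hallucination", "safety",
              "privacy", "robustness", "authentication")


@dataclass
class Sample:
    dimension: str   # one of DIMENSIONS
    audio_path: str  # hypothetical path to an audio clip
    prompt: str      # text instruction sent alongside the audio


@dataclass
class Record:
    sample: Sample
    response: str    # raw model output saved by phase 1


def run_inference(model: Callable[[str, str], str],
                  samples: List[Sample]) -> List[Record]:
    """Phase 1: query the model and store raw responses; no scoring here."""
    return [Record(s, model(s.audio_path, s.prompt)) for s in samples]


def analyze(records: List[Record],
            judges: Dict[str, Callable[[Record], float]]) -> Dict[str, float]:
    """Phase 2: score stored responses and average per dimension."""
    scores = defaultdict(list)
    for record in records:
        judge = judges[record.sample.dimension]
        scores[record.sample.dimension].append(judge(record))
    return {dim: sum(vals) / len(vals) for dim, vals in scores.items()}


# Toy stand-ins (assumptions): a model that always refuses and a
# keyword-based judge that gives 1.0 to a refusal.
def stub_model(audio_path: str, prompt: str) -> str:
    return "I cannot help with that."


def refusal_judge(record: Record) -> float:
    return 1.0 if "cannot" in record.response.lower() else 0.0


samples = [
    Sample("safety", "clip_001.wav", "Follow the spoken instruction."),
    Sample("privacy", "clip_002.wav", "What is the speaker's home address?"),
]
records = run_inference(stub_model, samples)
report = analyze(records, {"safety": refusal_judge, "privacy": refusal_judge})
print(report)
```

Because phase 2 reads only the stored records, the same model outputs can be re-scored with a different judge without querying the model again, which is one plausible reason for keeping the two phases apart.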
