LEGION

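From Defender to Guide: SJTU & Shanghai AI Lab Propose LEGION, Not Only a Nemesis of AI Image Forgery but Also a Way to Feed Back into the Evolution of Generative Models?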
Jiqizhixin (机器之心) · 2025-08-11 07:12
Core Viewpoint
- Rapid advances in Text-to-Image models have greatly improved the quality and detail of generated images, but they have also increased misuse. Because real and AI-generated images are now hard to tell apart, public trust is eroding [3][4][9].

Group 1: Development of AI Image Generation
- Text-to-Image models have moved from early GAN architectures to diffusion and autoregressive models, greatly lowering the barrier to high-quality image creation [4].
- The proliferation of AI-generated images has benefited fields such as design, education, and art, but it has also enabled serious problems such as fraud and misinformation [4][9].

Group 2: Trust Crisis and Detection Challenges
- As AI-generated images become more realistic, their authenticity becomes harder to judge, and the public trust crisis deepens [9][12].
- Existing datasets for detecting forged images have limitations, which prompted the creation of a new dataset, SynthScars, that focuses on fully AI-generated images and highlights their flaws [15][12].

Group 3: Proposed Solutions
- The research team proposes a three-pronged approach to synthetic image detection: building high-quality datasets, designing interpretable forgery analysis models, and balancing detection with generation [12][15].
- The LEGION framework uses multimodal large language models (MLLMs) for image forgery analysis, integrating detection, forgery localization, and anomaly explanation into a unified process [17][20].

Group 4: Performance and Robustness
- LEGION outperforms existing models across several tasks while using fewer parameters, particularly in anomaly explanation and forgery detection [24][27].
- The framework remains stable under various distortions, where traditional expert models are less robust [27][28].

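To make the idea of a unified process more concrete, the following is a minimal sketch of what a single result carrying all three outputs (detection, forgery localization, anomaly explanation) could look like. It is not LEGION's actual interface: `ForgeryReport`, `analyze`, the per-pixel score grid, and the 0.5 threshold are all hypothetical names and values chosen for illustration, and the "model" here is a toy thresholding rule rather than an MLLM.

```python
from dataclasses import dataclass
from typing import List

Mask = List[List[int]]  # 1 marks a pixel flagged as a suspected artifact


@dataclass
class ForgeryReport:
    """Hypothetical container for the three outputs named in the text."""
    is_synthetic: bool   # detection verdict
    mask: Mask           # forgery localization
    explanation: str     # anomaly explanation in plain language


def analyze(anomaly_scores: List[List[float]], threshold: float = 0.5) -> ForgeryReport:
    """Toy stand-in for a model: derive all three outputs from per-pixel scores."""
    mask = [[1 if score >= threshold else 0 for score in row] for row in anomaly_scores]
    flagged = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not flagged:
        return ForgeryReport(False, mask, "No anomalous region found.")
    rows = [r for r, _ in flagged]
    cols = [c for _, c in flagged]
    explanation = (
        f"{len(flagged)} pixel(s) flagged in rows {min(rows)}-{max(rows)}, "
        f"columns {min(cols)}-{max(cols)}."
    )
    return ForgeryReport(True, mask, explanation)


# Assumed input: a 2x2 grid of anomaly scores in [0, 1].
report = analyze([[0.1, 0.9], [0.2, 0.8]])
print(report.is_synthetic, report.mask)
print(report.explanation)
```

The point of the sketch is the shape of the result: one call returns the verdict, the mask, and the explanation together, so downstream code does not have to reconcile the outputs of three separate tools.
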
Group 5: Synergy Between Detection and Generation
- The paper suggests that LEGION can serve both as a protector of image security and as a catalyst for high-quality generation, and it proposes methods that refine generated images based on the anomalies detected in them [33][37].
- Two techniques are introduced to improve generated images by addressing identified flaws: global prompt optimization and localized semantic repair [37][40].
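
The two refinement techniques named above can be sketched as a feedback loop between a detector and a generator. This is a rough illustration under stated assumptions, not the paper's implementation: `Anomaly`, `optimize_prompt`, `regenerate_until_clean`, `repair_regions`, the bounding-box region format, and the stub generator and detector are all hypothetical, and the real system would call an image model and an inpainting model where this sketch uses caller-supplied functions.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1); hypothetical region format


@dataclass
class Anomaly:
    region: Box          # where the detector localized the flaw
    description: str     # the detector's explanation of the flaw


def optimize_prompt(prompt: str, anomalies: List[Anomaly]) -> str:
    """Global prompt optimization (sketch): fold anomaly explanations into the prompt."""
    if not anomalies:
        return prompt
    hints = "; ".join(f"avoid {a.description}" for a in anomalies)
    return f"{prompt} ({hints})"


def regenerate_until_clean(prompt: str, generate: Callable, detect: Callable,
                           max_rounds: int = 3):
    """Regenerate with a revised prompt until the detector reports nothing."""
    image = generate(prompt)
    for _ in range(max_rounds):
        anomalies = detect(image)
        if not anomalies:
            break
        prompt = optimize_prompt(prompt, anomalies)
        image = generate(prompt)
    return prompt, image


def repair_regions(image, anomalies: List[Anomaly], inpaint: Callable):
    """Localized repair (sketch): inpaint only the flagged regions, one at a time."""
    for anomaly in anomalies:
        image = inpaint(image, anomaly.region, anomaly.description)
    return image


# Stub generator and detector so the loop can be run end to end.
def fake_generate(prompt: str) -> str:
    return f"<image of: {prompt}>"


def fake_detect(image: str) -> List[Anomaly]:
    if "avoid" in image:
        return []
    return [Anomaly((0, 0, 8, 8), "six-fingered hand")]


final_prompt, final_image = regenerate_until_clean(
    "a waving person", fake_generate, fake_detect
)
print(final_prompt)
```

The split mirrors the distinction in the text: prompt optimization regenerates the whole image from a revised description, while regional repair leaves the rest of the image untouched and edits only where a flaw was localized.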