FLUX Series Models

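ICCV 2025 | Reducing Spatial and Temporal Redundancy in Diffusion Models: Shanghai Jiao Tong University's EEdit Delivers Training-Free Acceleration for Image Editing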
机器之心 (Jiqizhixin) · 2025-07-05 02:46
Core Viewpoint

- The article covers recent research from Professor Zhang Linfeng's team at Shanghai Jiao Tong University. It introduces EEdit, a framework that makes diffusion-based image editing more efficient by reducing spatial and temporal redundancy, reaching an inference speedup of over 2.4 times relative to the unaccelerated baseline [1][6][8].

Summary by Sections

Research Motivation

- The authors identified significant spatial and temporal redundancy in image editing tasks using diffusion models. This redundancy causes unnecessary computational overhead, particularly in regions that are not being edited [12][14].
- The study finds that temporal redundancy is higher in the inversion process, which suggests that skipping redundant timesteps there can significantly accelerate editing tasks [14].

Method Overview

- EEdit is a training-free, cache-based acceleration framework. It reuses output features to compress the number of timesteps in the inversion process, and it uses region score rewards to control how often the tokens of each region are updated [15][17].
- The framework adapts to several kinds of editing input, including reference images, prompt-based editing, and drag-region guidance [10][15].

Key Features of EEdit

- EEdit achieves over 2.4X faster inference than the unaccelerated version and up to 10X speedup compared to other image editing methods [8][9].
- The framework removes the computational waste caused by spatial and temporal redundancy, optimizing the editing process without compromising quality [9][10].
- EEdit supports multiple input guidance types, which makes it versatile across image editing tasks [10].

Experimental Results

- EEdit was evaluated on several benchmarks, where it showed better efficiency and quality metrics than existing methods [26][27].
- EEdit outperformed other methods on PSNR, LPIPS, SSIM, and CLIP metrics, indicating that it is competitive in both speed and quality [27][28].
- The spatial locality caching algorithm (SLoC) used in EEdit was found to be more effective than other caching methods, achieving better acceleration and foreground preservation [29].
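
The region-scored caching idea from the Method Overview can be illustrated with a toy sketch. This is not the authors' SLoC implementation: the class name `RegionScoredCache`, the parameters `edit_bonus` and `refresh_ratio`, and the staleness-plus-bonus scoring rule are all assumptions made for illustration. The sketch only shows the general mechanism, in which each token gets a score, tokens in the edited region receive a reward, and only the top-scoring tokens are recomputed at each step while the rest reuse cached outputs.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RegionScoredCache:
    """Toy token cache: recompute the highest-scoring tokens, reuse the rest.

    Hypothetical illustration only; not the EEdit/SLoC code.
    """

    edit_mask: List[bool]        # True where a token lies in the edited region
    edit_bonus: float = 2.0      # assumed score reward for edited-region tokens
    refresh_ratio: float = 0.25  # assumed fraction of tokens recomputed per step
    cache: List[float] = field(default_factory=list)
    age: List[int] = field(default_factory=list)
    refresh_counts: List[int] = field(default_factory=list)

    def step(self, tokens: List[float],
             compute: Callable[[float], float]) -> List[float]:
        n = len(tokens)
        if not self.cache:
            # First step: nothing is cached yet, so compute every token.
            self.cache = [compute(t) for t in tokens]
            self.age = [0] * n
            self.refresh_counts = [1] * n
            return list(self.cache)

        # Score = staleness (steps since last refresh) + region reward.
        scores = [
            self.age[i] + (self.edit_bonus if self.edit_mask[i] else 0.0)
            for i in range(n)
        ]
        k = max(1, int(n * self.refresh_ratio))
        chosen = set(sorted(range(n), key=lambda i: scores[i], reverse=True)[:k])

        for i in range(n):
            if i in chosen:
                self.cache[i] = compute(tokens[i])  # fresh computation
                self.age[i] = 0
                self.refresh_counts[i] += 1
            else:
                self.age[i] += 1                    # reuse the cached output
        return list(self.cache)


if __name__ == "__main__":
    calls = 0

    def expensive(x: float) -> float:
        global calls
        calls += 1
        return 2.0 * x  # stand-in for a costly per-token computation

    mask = [True, True] + [False] * 6  # first two tokens are "edited"
    demo = RegionScoredCache(edit_mask=mask)
    for _ in range(12):
        demo.step([1.0] * 8, expensive)
    print("per-token refresh counts:", demo.refresh_counts)
    print("compute calls:", calls, "vs. full recompute:", 12 * 8)
```

In this toy setup the staleness term ensures background tokens are still refreshed occasionally, while the bonus makes edited-region tokens refresh more often. How EEdit actually defines and combines its region scores is described in the paper and may differ from this rule.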