CVPR 2025 Oral | DiffFNO: Fourier Neural Operators Empower Diffusion, Opening a New Chapter for Arbitrary-Scale Super-Resolution
机器之心· 2025-05-04 04:57
Core Viewpoint
- The article discusses DiffFNO, a novel method that strengthens diffusion models with neural operators to achieve high-quality, efficient image super-resolution (SR) at any continuous scaling factor, addressing the challenges faced by traditional models [2][4].

Group 1: Methodology Overview
- DiffFNO consists of three main components: the Weighted Fourier Neural Operator (WFNO), the Gated Fusion Mechanism, and the Adaptive ODE Solver, which together improve the quality and efficiency of image reconstruction [2][5].
- The WFNO captures global information through frequency-domain convolution and amplifies high-frequency components using learnable frequency weights, improving PSNR by approximately 0.3–0.5 dB in high-magnification tasks [10].
- The Gated Fusion Mechanism integrates a lightweight attention operator (AttnNO) that captures local spatial features, allowing a flexible combination of spectral and spatial information [12][13].

Group 2: Adaptive ODE Solver
- The Adaptive ODE Solver converts the diffusion model's reverse process from a stochastic differential equation (SDE) into a deterministic ordinary differential equation (ODE), reducing the number of denoising steps from over a thousand to about thirty and thereby speeding up inference [15].
- This roughly halves inference time, from 266 ms to approximately 141 ms, while maintaining image quality; the method performs even better at larger scaling factors [15].

Group 3: Experimental Validation
- DiffFNO outperforms various state-of-the-art (SOTA) methods by 2–4 dB in PSNR across multiple benchmark datasets, and excels in high-magnification scenarios such as ×8 and ×12 [17][20].
- The method retains the complete Fourier spectrum, balancing overall image structure against local detail, and uses learnable frequency weights to dynamically adjust the influence of different frequency bands [18].
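To make the idea of frequency weighting over a complete spectrum more concrete, here is a minimal, dependency-free sketch in 1D. It is an illustration under simplifying assumptions, not the WFNO layer itself: the function names (`dft`, `idft`, `weighted_spectral_filter`) are hypothetical, the weights are fixed real scalars rather than learned parameters, and there is no channel mixing or 2D image handling.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def weighted_spectral_filter(signal, weights):
    """Scale every frequency bin by a per-frequency weight; no mode is truncated.

    `weights` has length n // 2 + 1 (one entry per non-negative frequency).
    In a trained model these would be learnable; here they are plain numbers
    (an assumption made for illustration).
    """
    n = len(signal)
    spectrum = dft(signal)
    # Bins k and n - k carry the same physical frequency, so both receive the
    # same weight; this keeps the output real for a real input.
    scaled = [spectrum[k] * weights[min(k, n - k)] for k in range(n)]
    return [value.real for value in idft(scaled)]


# Toy signal: a low-frequency wave plus a weaker high-frequency detail.
n = 16
signal = [
    math.cos(2 * math.pi * t / n) + 0.5 * math.cos(2 * math.pi * 3 * t / n)
    for t in range(n)
]
weights = [1.0] * (n // 2 + 1)
weights[3] = 2.0  # amplify only the frequency-3 component
boosted = weighted_spectral_filter(signal, weights)
```

Because every bin is kept and only rescaled, the low-frequency structure passes through unchanged while the chosen high-frequency band is amplified, which is the balance between global structure and local detail described above.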
Group 4: Conclusion
- The introduction of DiffFNO provides a new approach to reconcile the trade-off between high precision and low computational cost in super-resolution tasks, making it suitable for fields requiring high image quality, such as medical imaging, exploration, and gaming [22].
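The deterministic sampling idea from Group 2 can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the data are 1D Gaussian so the score is known in closed form (standing in for a trained network), the noise model is a variance-exploding one parameterized by the noise level sigma, and the integrator is a fixed 30-step Heun scheme rather than the adaptive solver the article describes. The names `score_gaussian`, `ode_drift`, and `sample_ode_heun` are hypothetical.

```python
import math


def score_gaussian(x, sigma, mu=0.0, s0=1.0):
    """Exact score of N(mu, s0^2 + sigma^2); a stand-in for a learned score model."""
    return -(x - mu) / (s0 ** 2 + sigma ** 2)


def ode_drift(x, sigma):
    """Probability-flow ODE in the noise level: dx/dsigma = -sigma * score(x, sigma)."""
    return -sigma * score_gaussian(x, sigma)


def sample_ode_heun(x_start, sigma_max=10.0, num_steps=30):
    """Integrate the ODE from sigma_max down to 0 with fixed-step Heun updates.

    No random noise is injected during sampling, so the same starting point
    always yields the same output.
    """
    sigmas = [sigma_max * (1 - i / num_steps) for i in range(num_steps + 1)]
    x = x_start
    for s_cur, s_next in zip(sigmas[:-1], sigmas[1:]):
        h = s_next - s_cur  # negative: the noise level decreases
        d_cur = ode_drift(x, s_cur)
        x_pred = x + h * d_cur  # Euler predictor
        d_next = ode_drift(x_pred, s_next)
        x = x + 0.5 * h * (d_cur + d_next)  # trapezoidal corrector
    return x


x_noisy = 5.0
x_denoised = sample_ode_heun(x_noisy)
# Closed-form ODE solution for this Gaussian toy case (mu = 0, s0 = 1).
x_exact = x_noisy / math.sqrt(1.0 + 10.0 ** 2)
```

For this toy problem the 30-step result can be compared with the closed-form solution, which shows how a deterministic trajectory can be followed in a few dozen steps instead of a long stochastic chain. An adaptive solver would additionally choose the step sizes from an error estimate; that part is not shown here.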