New work from NUS LV Lab | FeRA: Dynamic routing based on "frequency-domain energy" breaks the static bottleneck of diffusion model fine-tuning
机器之心 (Jiqizhixin) · 2025-12-12 03:41
Core Viewpoint
- The article introduces FeRA (Frequency-Energy Constrained Routing), a framework that addresses the limitations of static parameter-efficient fine-tuning (PEFT) methods for diffusion models by routing dynamically on frequency-energy signals [3][23].

Group 1: Research Background and Limitations
- Current PEFT methods such as LoRA and AdaLoRA are static: they apply the same low-rank matrices at every time step. This misaligns the parameters responsible for structure with those responsible for detail and wastes computational resources [8][9].
- The research team identifies a pronounced "low-frequency to high-frequency" evolution pattern in the denoising process of diffusion models: the process is not isotropic and has distinct stage-wise characteristics [7][23].

Group 2: FeRA Framework Components
- FeRA consists of three core components:
  - Frequency-Energy Indicator (FEI), which extracts frequency-energy distribution features in latent space using difference-of-Gaussians (DoG) operators [11].
  - Soft Frequency Router, which dynamically computes the weights of the different LoRA experts from the energy signals provided by FEI [12].
  - Frequency-Energy Consistency Loss (FECL), which keeps the parameter updates in the frequency domain aligned with the model's original residual error, improving training stability [13].

Group 3: Experimental Validation
- The research team ran extensive tests on multiple mainstream base models, including Stable Diffusion 1.5, 2.0, 3.0, SDXL, and FLUX.1, focusing on style adaptation and subject customization tasks [19].
- In style adaptation tasks, FeRA achieved the best or near-best results in FID (image quality), CLIP Score (semantic alignment), and Style (MLLM scoring) across various style datasets [20].
- In the DreamBooth task, FeRA showed strong text controllability, executing specific prompts effectively [21][26].
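To make the Frequency-Energy Indicator and Soft Frequency Router from Group 2 more concrete, here is a minimal sketch, not the authors' implementation. It assumes three frequency bands built from differences of Gaussian-blurred copies of a single-channel latent, and a softmax over a fixed linear gate. The names `band_energies`, `soft_route`, and `GATE`, the blur scales, and the gate values are all hypothetical; in FeRA the router would be learned and would weight real LoRA experts. Pure Python is used only to keep the example self-contained.

```python
import math


def gaussian_kernel(sigma):
    """Return a 1D Gaussian kernel normalised to sum to 1."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(grid, kernel):
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(grid[0])
    return [
        [sum(k * row[min(max(x + j - radius, 0), width - 1)]
             for j, k in enumerate(kernel))
         for x in range(width)]
        for row in grid
    ]


def transpose(grid):
    return [list(col) for col in zip(*grid)]


def gaussian_blur(grid, sigma):
    """Separable 2D Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(grid, kernel)), kernel))


def band_energies(latent, sigmas=(1.0, 2.0)):
    """Split a latent into DoG bands (high to low frequency) and return
    each band's share of the summed band energy (assumed scales)."""
    levels = [latent] + [gaussian_blur(latent, s) for s in sigmas]
    # Each band is the difference between two consecutive blur levels.
    bands = [
        [[a - b for a, b in zip(row_a, row_b)]
         for row_a, row_b in zip(fine, coarse)]
        for fine, coarse in zip(levels, levels[1:])
    ]
    bands.append(levels[-1])  # the coarsest blur is the low-frequency residual
    energy = [sum(v * v for row in band for v in row) for band in bands]
    total = sum(energy) or 1.0  # avoid dividing by zero on an all-zero latent
    return [e / total for e in energy]


def soft_route(energy, gate_weights, temperature=1.0):
    """Softmax over linear gate logits: one mixing weight per expert."""
    logits = [sum(w * e for w, e in zip(row, energy)) / temperature
              for row in gate_weights]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(logit - peak) for logit in logits]
    norm = sum(exps)
    return [x / norm for x in exps]


# Hypothetical fixed gate: expert i prefers band i. A real router is learned.
GATE = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]

checkerboard = [[1.0 if (x + y) % 2 == 0 else -1.0 for x in range(8)]
                for y in range(8)]
flat = [[1.0] * 8 for _ in range(8)]

for name, latent in (("checkerboard", checkerboard), ("flat", flat)):
    shares = band_energies(latent)
    print(name, [round(s, 3) for s in shares],
          [round(w, 3) for w in soft_route(shares, GATE)])
```

In this toy setup the rapidly alternating latent puts most of its energy in the highest band and the constant latent in the lowest, so the router favours a different expert in each case. A static adapter would apply the same weights to both.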
Group 4: Conclusion and Future Implications
- FeRA advances diffusion model fine-tuning by aligning the tuning mechanism with the physical laws of the generation process, offering a path to efficient, high-quality fine-tuning [23][27].
- The work sets new state-of-the-art (SOTA) results and offers insights for fine-tuning on more complex tasks such as video and 3D generation [27].