Improved MeanFlow (iMF)
The younger generation is formidable: Kaiming He's team releases new work, with a Tsinghua Yao Class sophomore as co-first author
36Kr · 2025-12-04 02:21
Core Insights
- The article discusses the introduction of Improved MeanFlow (iMF), an enhanced version of the original MeanFlow (MF) that addresses key issues in training stability, guidance flexibility, and architectural efficiency [1][4].

Model Performance
- iMF significantly improves model performance by reformulating the training objective as a more stable instantaneous-velocity loss and introducing flexible classifier-free guidance (CFG) [2][12].
- On the ImageNet 256x256 benchmark, the iMF-XL/2 model achieves an FID of 1.72 at 1-NFE (a single network function evaluation; see the one-step sampling sketch after this summary), roughly a 50% improvement over the original MF [2][18].

Model Configuration and Efficiency
- The reported configurations of the MF and iMF models show reduced parameter counts alongside improved performance for iMF [3][19].
- For instance, iMF-B/2 has 89 million parameters and an FID of 3.39, whereas MF-B/2 has 131 million parameters and an FID of 6.17 [3][19].

Training Methodology
- iMF's core improvement is a reconstructed prediction function that turns training into a standard regression problem, which stabilizes optimization; a minimal loss sketch follows this summary [4][11].
- The training loss is now based on instantaneous velocity, yielding a more stable, standard regression training process [10][11].

Guidance Flexibility
- iMF introduces a flexible classifier-free guidance mechanism in which the guidance scale is learned as a condition, improving the model's adaptability at inference time [12][14].
- This flexibility lets the model learn average velocity fields under varying guidance strengths, unlocking CFG's full potential [12].

Contextual Conditioning
- The iMF architecture employs an efficient in-context conditioning mechanism, replacing the large adaLN-zero module with a few learnable tokens for the various conditions, which improves efficiency and reduces parameter count [15][17].
- This change lets iMF handle multiple heterogeneous conditions more effectively, significantly shrinking model size and increasing design flexibility [17].

Experimental Results
- iMF performs strongly on challenging benchmarks: iMF-XL/2 reaches an FID of 1.72 at 1-NFE, surpassing many pretrained multi-step models [18][20].
- At 2-NFE, iMF further narrows the gap between single-step and multi-step diffusion models, reaching an FID of 1.54 [20].
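The digests describe the training change only at a high level. The sketch below is a minimal, illustrative rendering of an instantaneous-velocity regression loss in the MeanFlow family, assuming the common flow-matching convention z_t = (1 - t)·x + t·ε with conditional target v = ε - x, and a network net(z, r, t) that predicts the average velocity over [r, t]; the function names and exact parameterization are assumptions, not the paper's verbatim formulation.

```python
import torch
from torch.func import jvp

def imf_style_loss(net, x, eps, r, t):
    """Illustrative instantaneous-velocity regression loss (not the paper's
    exact objective). net(z, r, t) is assumed to predict the average
    velocity u over [r, t]; the MeanFlow identity
        v(z_t, t) = u + (t - r) * du/dt
    converts it into an implied instantaneous velocity, which is regressed
    against the known conditional target v = eps - x."""
    t_ = t.view(-1, 1, 1, 1)
    r_ = r.view(-1, 1, 1, 1)
    z = (1 - t_) * x + t_ * eps      # noisy point on the interpolation path
    v = eps - x                      # ground-truth conditional velocity

    # Total derivative du/dt along the path via a JVP:
    # dz/dt = v, dr/dt = 0, dt/dt = 1.
    u, dudt = jvp(net, (z, r, t),
                  (v, torch.zeros_like(r), torch.ones_like(t)))

    v_pred = u + (t_ - r_) * dudt    # implied instantaneous velocity
    # Plain regression: the target is data, not a network output.
    return ((v_pred - v) ** 2).mean()
```

Because the regression target here is fixed data (ε - x) rather than a stop-gradient copy of the network's own prediction, optimization behaves like standard supervised regression, which is consistent with the stability claim in the summaries.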
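The 1-NFE numbers quoted above rely on the average-velocity formulation admitting single-jump sampling. A minimal sketch, assuming the same hypothetical net(z, r, t) interface as above:

```python
import torch

@torch.no_grad()
def sample_1nfe(net, shape, device="cuda"):
    """One network function evaluation: integrate from t=1 (pure noise)
    to r=0 (data) in a single jump using the predicted average velocity:
    z_0 = z_1 - (1 - 0) * u(z_1, 0, 1)."""
    z1 = torch.randn(shape, device=device)
    r = torch.zeros(shape[0], device=device)
    t = torch.ones(shape[0], device=device)
    return z1 - net(z1, r, t)
```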
The younger generation is formidable! Kaiming He's team releases new work, with a Tsinghua Yao Class sophomore as co-first author
量子位 (QbitAI) · 2025-12-03 09:05
Core Viewpoint
- The article discusses the introduction of Improved MeanFlow (iMF), which addresses key issues in the original MeanFlow (MF) model, enhancing training stability, guidance flexibility, and architectural efficiency [1].

Group 1: Model Improvements
- iMF reformulates the training objective as a more stable instantaneous-velocity loss and introduces flexible classifier-free guidance (CFG) and efficient in-context conditioning, significantly improving model performance [2][14].
- On the ImageNet 256x256 benchmark, the iMF-XL/2 model achieves an FID of 1.72 at 1-NFE, roughly a 50% improvement over the original MF, demonstrating that single-step generative models can match the performance of multi-step diffusion models [2][25].

Group 2: Technical Enhancements
- The core improvement of iMF is a reconstructed prediction function that turns training into a standard regression problem [4].
- iMF builds the loss from the perspective of instantaneous velocity, stabilizing the training process [9][10].
- The model simplifies the input to a single noisy data point and modifies how the prediction function is computed, removing the dependency on external approximations [11][12][13].

Group 3: Flexibility and Efficiency
- iMF internalizes the guidance scale as a learnable condition, allowing the model to learn average velocity fields under varying guidance strengths and making CFG flexible at inference time; a sketch of this conditioning follows this summary [15][16][18].
- The improved in-context conditioning architecture eliminates the need for the large adaLN-zero mechanism, optimizing model size and efficiency, with iMF-Base reducing parameters by about one third (see the conditioning sketch after this summary) [19][24].

Group 4: Experimental Results
- iMF performs strongly on challenging benchmarks: iMF-XL/2 reaches an FID of 1.72 at 1-NFE, outperforming many pretrained multi-step models [26][27].
- At 2-NFE, iMF further narrows the gap between single-step and multi-step diffusion models, reaching an FID of 1.54 [29].
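Neither digest spells out how the guidance scale reaches the network. A minimal sketch of the general idea of internalizing CFG, assuming a continuous scale omega sampled per training example and embedded like any other condition; GuidanceEmbed, the MLP shape, and the sampling range are all illustrative assumptions:

```python
import torch
import torch.nn as nn

class GuidanceEmbed(nn.Module):
    """Illustrative: map a continuous CFG scale omega to a condition
    embedding, so one network can represent a family of omega-guided
    velocity fields and inference needs a single forward pass (no
    separate conditional/unconditional evaluations)."""
    def __init__(self, dim):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(1, dim), nn.SiLU(), nn.Linear(dim, dim))

    def forward(self, omega):             # omega: (B,)
        return self.mlp(omega[:, None])   # (B, dim)

# Hypothetical training-time usage: draw omega at random so the model
# sees many guidance strengths, and build the regression target with
# that same omega, e.g.:
#   omega = torch.empty(batch_size).uniform_(1.0, 4.0)
#   cond = class_embedding + guidance_embed(omega)
```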
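The in-context conditioning described above replaces per-block adaLN-zero modulation with condition tokens that plain Transformer blocks attend to. A sketch under assumptions (InContextBackbone, the token count, depth, and layer sizes are illustrative, not the paper's configuration):

```python
import torch
import torch.nn as nn

class InContextBackbone(nn.Module):
    """Illustrative: conditions (class, time, guidance scale, ...) are
    projected to a few condition tokens and concatenated with the patch
    tokens, so vanilla Transformer blocks replace adaLN-zero modulation
    and the per-block modulation parameters disappear."""
    def __init__(self, dim=768, n_cond_tokens=4, depth=12, heads=12):
        super().__init__()
        self.n = n_cond_tokens
        # One projection fuses all conditions into a handful of tokens.
        self.cond_proj = nn.Linear(dim, dim * n_cond_tokens)
        layer = nn.TransformerEncoderLayer(
            dim, heads, 4 * dim, batch_first=True, norm_first=True)
        self.blocks = nn.TransformerEncoder(layer, depth)

    def forward(self, patch_tokens, cond):    # (B, L, D), (B, D)
        B, _, D = patch_tokens.shape
        cond_tokens = self.cond_proj(cond).view(B, self.n, D)
        seq = torch.cat([cond_tokens, patch_tokens], dim=1)
        out = self.blocks(seq)
        return out[:, self.n:]                # keep only the patch positions
```

This is one plausible reading of the parameter savings reported above: adaLN-zero adds a modulation MLP to every block, while condition tokens add only a single projection, which fits the roughly one-third reduction mentioned for iMF-Base.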