Core Viewpoint
- The article introduces Improved MeanFlow (iMF), which addresses key issues in the original MeanFlow (MF) model, improving training stability, guidance flexibility, and architectural efficiency [1].

Group 1: Model Improvements
- iMF reformulates the training objective as a more stable instantaneous-velocity loss and introduces flexible classifier-free guidance (CFG) and efficient in-context conditioning, significantly improving model performance [2][14].
- On the ImageNet 256x256 benchmark, the iMF-XL/2 model achieves an FID of 1.72 at 1 NFE (a single network evaluation), roughly a 50% improvement over the original MF, demonstrating that single-step generative models can match multi-step diffusion models [2][25].

Group 2: Technical Enhancements
- The core improvement of iMF is a reconstruction of the prediction function that turns training into a standard regression problem [4].
- iMF constructs the loss from the perspective of instantaneous velocity, which stabilizes training (see the loss sketch after this summary) [9][10].
- The model simplifies the input to a single noisy data point and modifies how the prediction function is computed, removing the dependency on external approximations [11][12][13].

Group 3: Flexibility and Efficiency
- iMF internalizes the guidance scale as a learnable condition, letting the model learn average velocity fields under varying guidance strengths and making CFG flexible at inference time (see the guidance-embedding sketch below) [15][16][18].
- The improved in-context conditioning architecture eliminates the heavyweight adaLN-zero mechanism, optimizing model size and efficiency: iMF-Base uses about one-third fewer parameters (see the in-context block sketch below) [19][24].

Group 4: Experimental Results
- iMF performs strongly on challenging benchmarks: iMF-XL/2 achieves an FID of 1.72 at 1 NFE, outperforming many pre-trained multi-step models [26][27].
- At 2 NFE, iMF further narrows the gap between single-step and multi-step diffusion models, reaching an FID of 1.54 [29].
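To make the Group 2 points concrete, below is a minimal sketch of what regressing from the instantaneous-velocity perspective could look like, assuming the linear flow-matching path z_t = (1 - t)x + t·eps (so the conditional instantaneous velocity is v = eps - x) and the MeanFlow identity v = u + (t - r)·du/dt. The function name `u_theta`, the stop-gradient placement, and the time-sampling details are assumptions for illustration, not the paper's exact formulation.

```python
# A minimal sketch (PyTorch) of an instantaneous-velocity regression loss,
# assuming the linear flow-matching path z_t = (1 - t) * x + t * eps, so the
# conditional instantaneous velocity is v = eps - x. `u_theta(z, r, t)` is a
# hypothetical average-velocity network; the exact iMF loss may differ.
import torch
import torch.nn.functional as F

def imf_style_loss(u_theta, x, eps, r, t):
    """Regress the model's implied instantaneous velocity onto v = eps - x.

    x, eps: (B, C, H, W); r, t: (B, 1, 1, 1) with r <= t, for broadcasting.
    """
    z_t = (1.0 - t) * x + t * eps        # noisy point on the linear path
    v = eps - x                          # ground-truth conditional velocity

    # u(z_t, r, t) and its total time derivative d/dt u, obtained via a
    # jacobian-vector product along (dz/dt, dr/dt, dt/dt) = (v, 0, 1).
    u, du_dt = torch.func.jvp(
        u_theta, (z_t, r, t), (v, torch.zeros_like(r), torch.ones_like(t))
    )

    # MeanFlow identity: v(z_t, t) = u + (t - r) * d/dt u. Treating the
    # reconstructed instantaneous velocity as the prediction turns training
    # into standard regression; detaching the derivative term (an assumption
    # here) keeps gradients flowing only through u.
    v_pred = u + (t - r) * du_dt.detach()
    return F.mse_loss(v_pred, v)
```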
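For the Group 3 point about internalizing CFG, one plausible mechanism is to embed the guidance scale ω the same way a timestep is embedded and feed it to the network as an extra condition, sampling ω randomly during training so the model learns the guided average-velocity field over a range of strengths. The Fourier-feature scheme below is an assumption about the mechanism, not iMF's documented design.

```python
# Hedged sketch: the CFG scale omega becomes a learned input condition, so a
# single network evaluation covers any guidance strength at inference. The
# embedding (Fourier features + MLP) is an assumed design, not iMF's exact one.
import math
import torch
import torch.nn as nn

class GuidanceEmbedding(nn.Module):
    def __init__(self, dim: int = 256):
        super().__init__()
        # Log-spaced frequencies, as in common timestep embeddings.
        self.register_buffer(
            "freqs", torch.exp(torch.linspace(0.0, math.log(1000.0), dim // 2))
        )
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim), nn.SiLU(), nn.Linear(dim, dim)
        )

    def forward(self, omega: torch.Tensor) -> torch.Tensor:
        # omega: (B,) guidance scales, drawn randomly during training so the
        # model learns average-velocity fields under varying guidance.
        ang = omega[:, None] * self.freqs[None, :]
        emb = torch.cat([ang.sin(), ang.cos()], dim=-1)
        return self.mlp(emb)  # added to the class/time conditioning downstream
```

At inference the same single forward pass then serves any chosen ω, which is what removes the usual second unconditional pass that standard CFG requires.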
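To illustrate the architecture point in Group 3: DiT-style adaLN-zero injects conditioning through per-block modulation MLPs (a Linear producing six modulation vectors per block, roughly a third of each block's parameters, consistent with the reported ~one-third reduction), whereas in-context conditioning simply prepends the condition embeddings as tokens and uses plain transformer blocks. The sketch below shows the idea with a standard pre-LN ViT block; it is not the exact iMF architecture.

```python
# Hedged sketch of in-context conditioning: time/class/guidance embeddings are
# prepended as tokens to the patch sequence and processed by plain pre-LN ViT
# blocks, replacing DiT's adaLN-zero modulation MLPs. Standard block, not the
# exact iMF design.
import torch
import torch.nn as nn

class InContextBlock(nn.Module):
    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )

    def forward(self, x):
        h = self.norm1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]
        return x + self.mlp(self.norm2(x))

def forward_with_context(blocks, patch_tokens, cond_tokens):
    # cond_tokens: (B, K, D) stack of time / class / guidance-scale embeddings.
    x = torch.cat([cond_tokens, patch_tokens], dim=1)  # in-context conditioning
    for blk in blocks:
        x = blk(x)
    return x[:, cond_tokens.shape[1]:]  # drop conditioning tokens at the output
```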
Source article: "The Younger Generation Is Formidable! Kaiming He's Team Releases New Work; the Co-First Author Is a Sophomore in Tsinghua's Yao Class"
量子位 (QbitAI) · 2025-12-03 09:05