Kaiming He's Latest CVPR Lecture Slides Now Online: Toward End-to-End Generative Modeling
机器之心· 2025-06-19 09:30
Core Viewpoint
- The article discusses the evolution of generative models, focusing on the transition from diffusion models to end-to-end generative modeling and highlighting the potential for generative models to replicate the historical advances seen in recognition models [6][36][41].

Group 1: Workshop Insights
- The workshop led by Kaiming He at CVPR focused on the evolution of visual generative modeling beyond diffusion models [5][7].
- Diffusion models have become the dominant method in visual generative modeling, but they face limitations such as slow generation speed and difficulty in modeling complex distributions [6][36].
- Kaiming He's presentation emphasized the need for end-to-end generative modeling, contrasting it with the layer-wise training methods that were prevalent before AlexNet [10][11][41].

Group 2: Recognition vs. Generation
- Recognition and generation can be viewed as two sides of the same coin: recognition abstracts features from raw data, while generation concretizes abstract representations into detailed data [41][42].
- The article highlights the fundamental difference between recognition tasks, which have a clear mapping from data to labels, and generation tasks, which involve complex, non-linear mappings from simple distributions to intricate data distributions [58].

Group 3: Flow Matching and MeanFlow
- Flow Matching is presented as a promising way to address these challenges because it constructs ground-truth fields that are independent of any specific neural network architecture [81].
- The MeanFlow framework introduced by Kaiming He targets single-step generation by modeling average velocity rather than instantaneous velocity, providing a theoretical basis for network training [83][84].
- Experimental results show that MeanFlow significantly outperforms previous single-step diffusion and flow models, achieving an FID score of 3.43, more than 50% better than the previous best [101][108].

Group 4: Future Directions
- The article concludes with a discussion of ongoing research directions, including Consistency Models, two-time-variable models, and revisiting Normalizing Flows, suggesting that the field is still at an early stage akin to the pre-AlexNet era of recognition models [110][113].
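The Flow Matching idea summarized above, a ground-truth field defined independently of any network, can be made concrete with a small sketch. Assuming the common linear interpolation path z_t = (1 - t)·x + t·ε between data x and Gaussian noise ε (a standard choice in the flow matching literature, not necessarily the exact convention used in the talk), the training target is the instantaneous velocity dz_t/dt:

```python
import numpy as np

def flow_matching_target(x, eps, t):
    """Linear-interpolation path z_t = (1 - t) * x + t * eps.

    Returns the noisy sample z_t and the ground-truth instantaneous
    velocity dz_t/dt = eps - x, which the network regresses onto.
    """
    # Broadcast per-example times over the remaining data dimensions.
    t = t.reshape(-1, *([1] * (x.ndim - 1)))
    z_t = (1 - t) * x + t * eps
    v = eps - x
    return z_t, v

def flow_matching_loss(v_pred, v_target):
    # Plain mean-squared regression of the network output onto the field.
    return np.mean((v_pred - v_target) ** 2)
```

Note that the target v depends only on the sampled pair (x, ε), not on any model: this is what "ground-truth fields independent of specific neural network architectures" means in practice.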
Kaiming He's Team Publishes New Work: MeanFlow Sets Single-Step Image Generation SOTA with up to 50% Improvement
机器之心· 2025-05-21 04:00
Core Viewpoint
- The article discusses a new generative modeling framework called MeanFlow, which significantly improves on existing flow matching methods by introducing the concept of average velocity, achieving an FID score of 3.43 on the ImageNet 256×256 dataset without pre-training, distillation, or curriculum learning [3][5][7].

Methodology
- MeanFlow introduces a new ground-truth field representing average velocity, in place of the instantaneous velocity commonly used in flow matching [3][8].
- The average velocity is defined as displacement over a time interval, and the relationship between average and instantaneous velocity is derived to guide network training [9][10].

Performance Results
- MeanFlow demonstrates strong one-step generative modeling, achieving an FID score of 3.43 with only 1-NFE, a 50% improvement over the best previous methods [5][16].
- In 2-NFE generation, MeanFlow achieves an FID score of 2.20, comparable to leading multi-step diffusion/flow models [18].

Comparative Analysis
- A comparison against previous single-step diffusion/flow models shows that MeanFlow outperforms them significantly, with an FID score of 3.43 versus 7.77 for IMM [16][17].
- The results indicate that the proposed method effectively narrows the gap between single-step and multi-step diffusion/flow models [18].
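The "average velocity" definition and its relationship to the instantaneous velocity, summarized in the Methodology bullets above, can be written out explicitly. In notation that follows standard flow matching conventions (the symbols here are a reconstruction and should be checked against the original paper), the average velocity over an interval [r, t] is the displacement divided by the elapsed time:

```latex
u(z_t, r, t) \;\triangleq\; \frac{1}{t - r} \int_{r}^{t} v(z_\tau, \tau)\, d\tau
```

Differentiating the rearranged form $(t - r)\, u(z_t, r, t) = \int_{r}^{t} v(z_\tau, \tau)\, d\tau$ with respect to $t$ gives a relation between the two fields:

```latex
v(z_t, t) \;=\; u(z_t, r, t) + (t - r)\, \frac{\mathrm{d}}{\mathrm{d}t}\, u(z_t, r, t)
```

Only instantaneous quantities appear on the left, so this identity can serve as a training target for a network that parameterizes $u$: once $u$ is learned, a single evaluation over the full interval yields the displacement directly, which is what enables one-step (1-NFE) generation.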