End-to-End Generative Modeling
Kaiming He's New Role: Distinguished Scientist at Google DeepMind
机器之心· 2025-06-26 00:30
Core Viewpoint - The article covers the news of Kaiming He joining Google DeepMind as a part-time Distinguished Scientist, highlighting his significant contributions to AI and computer vision [2][4][24].

Group 1: Kaiming He's Background and Achievements
- Kaiming He earned the top score in the 2003 Guangdong Province college entrance examination and was admitted to Tsinghua University [8].
- He completed his PhD at the Chinese University of Hong Kong under the supervision of Xiaoou Tang, and has held positions at Microsoft Research Asia, Facebook AI Research, and MIT [9].
- His research has received multiple awards, including best paper awards at CVPR 2009 and CVPR 2016, and his work has accumulated over 710,000 citations according to Google Scholar [10][12].

Group 2: Research Contributions
- His most notable work is the ResNet paper, published in 2016, which has been cited over 280,000 times and is regarded as one of the most cited papers of the 21st century [15][18].
- ResNet addressed the gradient-propagation problem in deep networks, establishing a fundamental building block of modern deep learning models [18][19].
- He also led the development of the Masked Autoencoders (MAE) model, which has been widely adopted in the computer vision community [20].

Group 3: Future Prospects at Google
- The article expresses anticipation for Kaiming He's contributions at Google, particularly in generative modeling, as suggested by his recent research [6][24].
Kaiming He's Latest CVPR Lecture Slides Are Online: Toward End-to-End Generative Modeling
机器之心· 2025-06-19 09:30
Core Viewpoint - The article discusses the evolution of generative models, focusing on the transition from diffusion models toward end-to-end generative modeling and on the potential for generative models to repeat the historical advances seen in recognition models [6][36][41].

Group 1: Workshop Insights
- The CVPR workshop led by Kaiming He examined how visual generative modeling may evolve beyond diffusion models [5][7].
- Diffusion models are currently the dominant approach in visual generative modeling, but they suffer from slow generation and difficulty in modeling complex distributions [6][36].
- Kaiming He's talk emphasized the need for end-to-end generative modeling, drawing a contrast with the layer-wise training methods that prevailed before AlexNet [10][11][41].

Group 2: Recognition vs. Generation
- Recognition and generation can be viewed as two sides of the same coin: recognition abstracts features from raw data, while generation concretizes abstract representations into detailed data [41][42].
- The article highlights a fundamental difference between recognition tasks, which have a clear mapping from data to labels, and generation tasks, which require complex, non-linear mappings from simple distributions to intricate data distributions [58].

Group 3: Flow Matching and MeanFlow
- Flow Matching is presented as a promising way to address these challenges because it constructs ground-truth velocity fields that are independent of any specific neural network architecture [81].
- The MeanFlow framework introduced by Kaiming He targets single-step generation by modeling the average velocity over a time interval rather than the instantaneous velocity, which yields a principled training objective for the network [83][84] (a minimal sketch of this objective is given after this summary).
- Experimental results show that MeanFlow significantly outperforms previous single-step diffusion and flow models, achieving an FID of 3.43, an improvement of more than 50% over the previous best [101][108].

Group 4: Future Directions
- The article concludes with ongoing research directions, including Consistency Models, two-time-variable models, and revisiting Normalizing Flows, suggesting that the field is still at an early stage akin to the pre-AlexNet era of recognition models [110][113].
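The summary above does not spell out the MeanFlow objective. For reference, the MeanFlow paper defines the average velocity over an interval [r, t] as u(z_t, r, t) = 1/(t - r) · ∫_r^t v(z_τ, τ) dτ and derives the identity u(z_t, r, t) = v(z_t, t) - (t - r) · d/dt u(z_t, r, t), whose right-hand side serves as a regression target. Below is a minimal JAX sketch of such a loss under a linear interpolation path z_t = (1 - t)·x + t·ε (so the conditional velocity is v = ε - x); the network `u_theta`, its parameters, and the toy usage at the end are illustrative placeholders rather than the paper's actual implementation.

```python
# Illustrative sketch of a MeanFlow-style training loss (not the official code).
import jax
import jax.numpy as jnp


def meanflow_loss(params, u_theta, x, eps, r, t):
    """u_theta(params, z, r, t) predicts the average velocity on [r, t].

    x, eps: data and noise batches of the same shape; r, t: times broadcastable
    to x (e.g. shape [batch, 1]) with r <= t.
    """
    # Linear interpolation path between data x and noise eps; its
    # conditional (instantaneous) velocity is v = eps - x.
    z_t = (1.0 - t) * x + t * eps
    v = eps - x

    # Total derivative d/dt u(z_t, r, t) along the path, computed with a single
    # Jacobian-vector product w.r.t. (z, r, t) using tangents (v, 0, 1).
    u_fn = lambda z, r_, t_: u_theta(params, z, r_, t_)
    u, dudt = jax.jvp(u_fn, (z_t, r, t), (v, jnp.zeros_like(r), jnp.ones_like(t)))

    # MeanFlow identity: u(z_t, r, t) = v(z_t, t) - (t - r) * d/dt u(z_t, r, t).
    # The right-hand side is used as a stop-gradient regression target.
    u_target = jax.lax.stop_gradient(v - (t - r) * dudt)
    return jnp.mean((u - u_target) ** 2)


if __name__ == "__main__":
    # Toy check with a stand-in linear "network"; shapes and names are hypothetical.
    x = jax.random.normal(jax.random.PRNGKey(0), (8, 16))
    eps = jax.random.normal(jax.random.PRNGKey(1), (8, 16))
    t = jax.random.uniform(jax.random.PRNGKey(2), (8, 1))
    r = t * jax.random.uniform(jax.random.PRNGKey(3), (8, 1))  # ensures r <= t

    def toy_u(params, z, r_, t_):
        return z @ params["w"] + r_ + t_

    params = {"w": jnp.eye(16)}
    print(meanflow_loss(params, toy_u, x, eps, r, t))
```

Once such a network is trained, single-step sampling as described in the paper amounts to mapping noise to data in one evaluation, z_0 ≈ z_1 - u(z_1, 0, 1), which is what enables the one-step generation results quoted above.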