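Interview with 27-year-old doctoral supervisor Zhang Linfeng: the perfect CVPR score for model compression was a bit of a surprise, and Shanghai Jiao Tong University has many young faculty like me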

Core Viewpoint
- Zhang Linfeng, a young professor at Shanghai Jiao Tong University, has made significant contributions to model compression, particularly through data distillation methods that improve model efficiency and reduce training costs [2][4][27].

Group 1: Model Compression Techniques
- Zhang Linfeng's team developed a new data distillation method that earned a perfect review score at CVPR 2025. It runs on a six-year-old 2080Ti GPU, uses only 1/300 of the memory required by previous state-of-the-art methods, and is 20 times faster [2][4].
- The team introduced a new distribution difference metric (NCFD) that recasts data distillation as a min-max optimization problem. This significantly improves the quality of the synthetic data and scales across a range of benchmark datasets [6][7].
- The approach focuses on using data efficiently to cut the training cost of large AI models. The stated goal is for the training cost saved to exceed the cost of selecting the data, that is, a ratio of savings to selection cost greater than 1 [9][10].

Group 2: Token Reduction Strategies
- The team has explored token-level feature caching, achieving up to 9 times acceleration in diffusion language models with minimal performance loss. Extending the idea to multimodal models, up to 90% of tokens can be removed without sacrificing accuracy [11][12].
- The ToCa method adaptively selects which tokens to cache depending on the task. In image editing, for example, only the regions relevant to the edit need to be recomputed [16][20].
- The latest method, TaylorSeer, predicts the next features instead of reusing previous ones, achieving close to 5 times acceleration across various models, including video generation models [18][20][24].
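The difference between reusing cached features and forecasting the next ones can be illustrated with a toy sketch. This is not the team's ToCa or TaylorSeer code: the `feature` function is a hypothetical stand-in for an expensive layer output, the names `run` and `mean_abs_error` are made up for illustration, and the forecast is a plain first-order extrapolation from the two most recent full computations, assumed here only to show the general idea.

```python
import math


def feature(t):
    """Hypothetical stand-in for an expensive layer output at timestep t."""
    return math.sin(0.1 * t)


def run(num_steps, interval, forecast):
    """Fully compute the feature every `interval` steps and fill the gaps.

    forecast=False: reuse the most recent fully computed feature (caching).
    forecast=True:  extrapolate linearly from the two most recent full
                    computations (falls back to reuse until two exist).
    Returns the per-step outputs and the number of full computations.
    """
    outputs = []
    prev = last = None  # two most recent fully computed features
    last_t = None       # timestep at which `last` was computed
    full_calls = 0
    for t in range(num_steps):
        if t % interval == 0:
            prev, last = last, feature(t)
            last_t = t
            full_calls += 1
            outputs.append(last)
        elif forecast and prev is not None:
            slope = (last - prev) / interval  # finite-difference derivative
            outputs.append(last + slope * (t - last_t))
        else:
            outputs.append(last)
    return outputs, full_calls


def mean_abs_error(outputs):
    """Average gap between the approximated and the true features."""
    return sum(abs(o - feature(t)) for t, o in enumerate(outputs)) / len(outputs)


if __name__ == "__main__":
    for mode in (False, True):
        outs, calls = run(num_steps=40, interval=4, forecast=mode)
        label = "forecast" if mode else "reuse"
        print(label, "full computations:", calls, "error:", mean_abs_error(outs))
```

Both modes perform the same number of full computations, so the cost is identical; on this smooth toy trajectory the forecasting variant tracks the true features more closely than plain reuse. Real diffusion features are high-dimensional tensors, and how well they can be extrapolated depends on how smoothly they evolve across timesteps.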
Group 3: Future Directions and Industry Impact
- The overarching goal of Zhang Linfeng's research is to lower the deployment cost of large models so that they can be applied more widely in real-world scenarios, particularly in video generation, where the aim is real-time generation speed [25][27].
- The evolution of model compression is seen as a response to the growing size of AI models, with a shift from traditional methods to data-centric approaches that minimize knowledge loss during compression [38][44].
- The research outcomes have been open-sourced and are gradually being integrated into various models, indicating a significant impact on the industry and the potential for widespread application [23][26].