Model Compression

Lenovo files a patent application for "Data Processing Method, Model Compression Method and Device", disclosing a data processing method, model compression method, and apparatus
Jin Rong Jie · 2025-05-31 00:32
Group 1
- Lenovo (Beijing) Co., Ltd. has applied for a patent titled "Data Processing Method, Model Compression Method and Device," publication number CN120068971A, with an application date of February 2025 [1]
- The patent abstract describes a data processing method that begins by obtaining input data for a target task, which can be image, text, voice, or video data [1]
- The method processes the target task using two different parameter sets that represent the target model, where the first and second types of tasks satisfy similarity conditions [1]

Group 2
- Lenovo (Beijing) Co., Ltd. was established in 1992 and is primarily engaged in manufacturing computers, communications equipment, and other electronic devices [2]
- The company has a registered capital of 565 million Hong Kong dollars and has invested in 102 enterprises [2]
- Lenovo (Beijing) has participated in 5,000 bidding projects and holds 1,730 trademark records and 5,000 patent records, along with 237 administrative licenses [2]
Interview with 27-year-old doctoral advisor Zhang Linfeng: the perfect CVPR score for model compression was a bit of a surprise; Shanghai Jiao Tong University has many young faculty like me
QbitAI · 2025-05-27 01:07
Core Viewpoint
- Zhang Linfeng, a young professor at Shanghai Jiao Tong University, has made significant contributions to model compression, particularly through innovative data distillation methods that improve model efficiency and reduce training costs [2][4][27]

Group 1: Model Compression Techniques
- Zhang Linfeng's team developed a new data distillation method that received a perfect score at CVPR 2025, running on a six-year-old 2080 Ti GPU with only 1/300 of the memory of previous state-of-the-art methods while running 20 times faster [2][4]
- The team introduced a novel distribution-difference metric (NCFD) that recasts data distillation as a min-max optimization problem, significantly improving the quality of synthetic data and demonstrating scalability across benchmark datasets [6][7]
- The approach focuses on using data efficiently to cut the training costs of large AI models, targeting a ratio greater than 1 between training expenses saved and data selection costs [9][10]

Group 2: Token Reduction Strategies
- The team has explored token-level feature caching, achieving up to 9x acceleration in diffusion language models with minimal performance loss, and extended it to multimodal models, where up to 90% of tokens can be removed without sacrificing accuracy [11][12]
- The ToCa method adaptively selects which tokens to cache, optimizing for the task at hand; in image editing, for example, only the edited regions need recomputation [16][20]
- The latest TaylorSeer model predicts upcoming features instead of reusing previous ones, achieving close to 5x acceleration across various models, including video generation tasks [18][20][24]
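The feature-forecasting idea behind TaylorSeer can be illustrated with a small sketch: instead of reusing the last cached feature at a skipped step, extrapolate from a short history of cached features via backward differences. This is a minimal illustration of the general idea, not the paper's code; `taylor_forecast` and the polynomial example are hypothetical.

```python
import numpy as np

def taylor_forecast(history, k):
    """Predict a feature k steps past the newest entry of `history`.

    history: feature arrays at consecutive past steps, oldest first.
    Uses Newton's backward-difference extrapolation, which is the
    finite-difference analogue of a Taylor-series forecast.
    """
    level = [np.asarray(h, dtype=float) for h in history]
    derivs = [level[-1]]  # 0th "derivative": the latest cached feature
    # Build successive backward differences: f, ∇f, ∇²f, ...
    while len(level) > 1:
        level = [b - a for a, b in zip(level[:-1], level[1:])]
        derivs.append(level[-1])
    # f(t+k) ≈ Σ_i C(k+i-1, i) · ∇^i f(t)
    pred = np.zeros_like(derivs[0])
    coeff = 1.0
    for i, d in enumerate(derivs):
        if i > 0:
            coeff *= (k + i - 1) / i  # binomial coefficient C(k+i-1, i)
        pred += coeff * d
    return pred

# A quadratic feature trajectory is recovered exactly by a 2nd-order forecast:
feats = [np.array([t ** 2]) for t in range(3)]  # f(t) = t² at t = 0, 1, 2
print(taylor_forecast(feats, 1))                # prints [9.], i.e. f(3) = 9
```

The point of the forecast is that naive caching (reusing `feats[-1]`) would predict 4 here, while the extrapolation tracks the trajectory exactly; in a diffusion model, features evolve smoothly enough across steps for a low-order forecast to stay close to the true recomputation.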
Group 3: Future Directions and Industry Impact
- The overarching goal of Zhang Linfeng's research is to lower the deployment costs of large models so they are more applicable in real-world scenarios, particularly video generation, where the aim is real-time generation speed [27][25]
- The evolution of model compression is seen as a response to ever-larger AI models, with a shift from traditional methods toward data-centric approaches that minimize knowledge loss during compression [38][44]
- The research outcomes have been open-sourced and are gradually being integrated into various models, indicating significant industry impact and potential for widespread adoption [23][26]
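The token-level feature caching summarized in Group 2 can also be sketched compactly: cache per-token features from a full pass, then on later steps recompute only the tokens whose inputs drifted the most and reuse the cache for the rest. This is a hypothetical illustration of the general mechanism, not the ToCa implementation; `cached_forward`, the drift score, and `recompute_ratio` are all assumptions.

```python
import numpy as np

def cached_forward(layer_fn, tokens, cache, recompute_ratio=0.1):
    """Recompute features only for the most-changed tokens; reuse the rest.

    layer_fn: expensive per-token feature function (stand-in for a heavy layer)
    tokens:   (n_tokens, dim) current token inputs
    cache:    dict with 'inputs' and 'features' from the last computation
    """
    n = tokens.shape[0]
    k = max(1, int(n * recompute_ratio))
    # Score each token by how far its input drifted since it was cached.
    drift = np.linalg.norm(tokens - cache["inputs"], axis=1)
    hot = np.argsort(drift)[-k:]        # top-k most-changed tokens
    feats = cache["features"].copy()
    feats[hot] = layer_fn(tokens[hot])  # recompute only those tokens
    cache["inputs"][hot] = tokens[hot]  # refresh the cache for them
    cache["features"] = feats
    return feats

# Usage: one full pass to fill the cache, then cheap passes as tokens drift.
layer = lambda x: x * 2.0                                # toy stand-in layer
tokens = np.zeros((8, 4))
cache = {"inputs": tokens.copy(), "features": layer(tokens)}
tokens[2] += 1.0                                         # only token 2 changed
out = cached_forward(layer, tokens, cache, recompute_ratio=0.2)
```

With a 10% recompute ratio this matches the article's "up to 90% of tokens removed" regime: only the hot tokens (say, an edited image region) pay for the layer, while static tokens ride on the cache.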