Model Compression
Breaking | Wang Yunhe, Director of Huawei's Noah's Ark Lab, Departs
机器之心· 2026-03-28 04:45
Core Viewpoint
- The departure of Wang Yunhe, director of Huawei's Noah's Ark Lab, marks a significant shift in the AI industry, signaling a profound structural transformation within the sector since 2026 [3][25].

Group 1: Wang Yunhe's Background
- Wang Yunhe, born in 1991, earned a Bachelor's degree in Mathematics from Xidian University and a PhD in Intelligent Science from Peking University in 2018, focusing on deep learning, model compression, machine learning, and computer vision [5][8].
- He has over 8 years of experience at Huawei, starting as an intern at Noah's Ark Lab and progressing through Senior Engineer and Chief Engineer to director of the lab [8][25].

Group 2: Contributions and Achievements
- Wang has a notable academic record, with over 33,000 citations on Google Scholar, highlighting his influence in the field of AI [13].
- His research includes GhostNet, a lightweight neural network architecture that achieved 75.7% Top-1 accuracy on ImageNet classification, surpassing MobileNetV3 [15][16].
- He has contributed significantly to Vision Transformer research; his survey article has received 5,528 citations and is a key reference in the field [18].

Group 3: Insights on AI Models
- Wang has offered distinctive views on mainstream technology routes in the era of large models, discussing the potential impact of diffusion models on autoregressive models and emphasizing the need for structural thinking in model design [21].
- His recent work on the DLLM Agent explores how different generative paradigms affect agent planning and decision-making, demonstrating the proposed model's efficiency in global planning and interaction [22][24].
Group 4: Industry Impact
- Wang's departure from Huawei is a focal point for the industry, as he led several internationally influential algorithm innovations during his tenure [25].
- His future career path, particularly his thinking on unifying architectures for diffusion language models and general artificial intelligence, remains a topic of industry interest [26].
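The efficiency claim behind GhostNet can be illustrated with a back-of-envelope multiply-accumulate (MAC) count: a Ghost module produces only a fraction of the output channels with an ordinary convolution and generates the rest as "ghost" feature maps via cheap depthwise operations. The sketch below is illustrative only; the layer shape, kernel sizes, and split ratio `s` are assumed values, not the paper's exact configuration.

```python
def conv_flops(c_in, c_out, k, h, w):
    # MAC count of a standard k*k convolution over an h*w output map
    return c_in * c_out * k * k * h * w

def ghost_module_flops(c_in, c_out, k, d, s, h, w):
    # primary convolution produces c_out/s "intrinsic" feature maps
    m = c_out // s
    primary = conv_flops(c_in, m, k, h, w)
    # each intrinsic map spawns s-1 "ghost" maps via a cheap d*d depthwise op
    cheap = m * (s - 1) * d * d * h * w
    return primary + cheap

std = conv_flops(128, 256, 3, 56, 56)
ghost = ghost_module_flops(128, 256, 3, 3, 2, s=2 and 2, w=56, h=56) if False else \
        ghost_module_flops(128, 256, 3, 3, 2, 56, 56)
print(f"standard: {std:,} MACs, ghost: {ghost:,} MACs, speedup ~ {std / ghost:.2f}x")
```

With a split ratio of s = 2, roughly half the output channels come from cheap depthwise ops, so the MAC count drops by a factor approaching s.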
Want to Join OpenAI? Solve This Problem First; $1 Million in Compute Is Already in Place
机器之心· 2026-03-19 06:49
Core Insights
- OpenAI has launched the Model Craft Challenge "Parameter Golf," which focuses on training optimal small models under strict resource constraints, emphasizing efficiency and innovative design [3][4][5].

Group 1: Challenge Overview
- Participants must minimize validation loss on a fixed dataset while keeping model artifacts (weights and training code) under 16 MB and completing training on 8 H100 GPUs within 10 minutes [1][5].
- The challenge is open globally and aims to explore more efficient pre-training under strict resource constraints; outstanding participants may receive interview opportunities at OpenAI [4][8].

Group 2: Design and Structure
- The challenge draws inspiration from NanoGPT Speedrunning, which targets a specified validation loss in the shortest possible time, while adding exploration of efficient model design under parameter constraints [5].
- OpenAI has set aside a total of $1 million in compute credits to help participants start and advance their model training [6].

Group 3: Participation Guidelines
- Participants can fork a GitHub repository provided by OpenAI, which includes baseline models, fixed datasets, and evaluation scripts, and submit their improvements as pull requests [10].
- The challenge runs from March 18 to April 30; participants must be at least 18 years old and located in OpenAI-supported regions [9][12].

Group 4: Community Reactions
- Reactions have been mixed: some praise the challenge as a true test of engineering talent under constraints, while others worry about potential exploitation by companies [20][22].
- Notable AI researchers suggest the challenge could be completed by AI agents, reflecting a belief that automated solutions could outperform human participants [25].
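The 16 MB artifact budget above translates directly into a parameter ceiling. The following is a rough, hypothetical back-of-envelope estimate; the reservation for training-code size is an assumption for illustration, not a challenge rule.

```python
BUDGET_MB = 16  # artifact budget from the challenge rules (weights + training code)

def max_params(budget_mb, bytes_per_param, code_overhead_kb=64):
    # hypothetical estimate: how many parameters fit once some space is
    # reserved for the training code that ships inside the artifact
    usable = budget_mb * 1024 * 1024 - code_overhead_kb * 1024
    return usable // bytes_per_param

for name, width in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1)]:
    print(f"{name}: ~{max_params(BUDGET_MB, width) / 1e6:.1f}M parameters")
```

At fp32 the ceiling sits near 4M parameters, which is why lower-precision weight storage is an obvious first move under this rule set.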
Over Half the Model Pruned Away, Yet Accuracy Rises 15%! New Research from HUST & Alibaba Security Achieves Near-Lossless Class-Specific Compression of ViT | ICLR'26
量子位· 2026-03-05 06:33
Core Viewpoint
- The article emphasizes the limitations of large, general-purpose visual models in real-world applications, advocating for smaller, specialized models that are more efficient and better suited to specific tasks [1][2].

Group 1: Limitations of Large Models
- Large visual models, while powerful, have high computational costs and are not optimal for deployment in resource-constrained environments [1][4].
- Many applications only require a few key target categories, making the extensive knowledge in general models unnecessary and even counterproductive [1][8].

Group 2: Advantages of Customized Models
- Customized, "small and specialized" models align better with practical needs, reducing deployment costs and enhancing long-term operational stability [2].
- The new paradigm proposed by Huazhong University of Science and Technology and Alibaba Security, named Vulcan, derives specialized models from general ones, focusing on key target categories while minimizing knowledge loss [3].

Group 3: Methodology of Vulcan
- Vulcan introduces a "train-then-prune" approach, a departure from traditional prune-then-train methods, thereby preserving critical information related to the target categories [3][13].
- The methodology combines two main components, Class-Centric Neuron Collapse (CCNC) and Truncated Nuclear Norm Regularization (TNNR), which work together to concentrate the model on relevant information [15][16].

Group 4: Experimental Results
- Vulcan-derived models achieved accuracy improvements of up to 15.12% on ImageNet tasks while shrinking the model to 20%-40% of its original size [19].
- Across tests on different datasets and model sizes, Vulcan outperformed existing structured pruning methods, achieving up to 13.92% higher accuracy on class-specific tasks [19][21].

Group 5: Practical Deployment
- In practical deployment scenarios, Vulcan achieved inference speedups of 1.23× to 3.02× and reduced memory usage by 20.59% to 76.47% on edge devices [22][23].
- The research indicates that understanding the internal knowledge structure of models is crucial for reliable lightweight deployment [25].
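The TNNR component named above has a standard definition in the low-rank literature: the truncated nuclear norm of a matrix is the sum of its singular values excluding the largest r. Penalizing it drives the spectral tail toward zero while leaving the dominant r directions free. The article does not detail exactly how Vulcan applies it, so treat this as a generic numerical sketch of the regularizer itself.

```python
import numpy as np

def truncated_nuclear_norm(W, r):
    # sum of all singular values minus the r largest: penalizing this
    # shrinks the tail of the spectrum (redundant directions) while the
    # top-r components (e.g. class-relevant structure) remain unpenalized
    s = np.linalg.svd(W, compute_uv=False)  # sorted in descending order
    return s.sum() - s[:r].sum()

rng = np.random.default_rng(0)
# a nearly rank-2 weight matrix plus small noise
W = rng.normal(size=(64, 2)) @ rng.normal(size=(2, 64)) \
    + 0.01 * rng.normal(size=(64, 64))
print(truncated_nuclear_norm(W, r=2))  # small: only the noisy tail remains
print(truncated_nuclear_norm(W, r=0))  # full nuclear norm: much larger
```

Because `W` is nearly rank-2, excluding its top 2 singular values leaves almost nothing, which is exactly the state such a regularizer pushes a weight matrix toward.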
Rejection ≠ Failure! These High-Impact Papers Were All Rejected by Top Conferences
机器之心· 2025-12-11 02:47
Core Insights
- Waymo has released a detailed blog on its AI strategy centered on its foundational model, emphasizing the use of distillation to create efficient models for onboard operation [1].
- Jeff Dean highlighted the significance of knowledge distillation in AI, recalling its initial rejection by NeurIPS 2014, which underestimated its potential impact [3][4].

Group 1: Historical Context of Rejected Papers
- Many foundational AI technologies, such as optimizers for large models and computer vision techniques, were initially rejected by top conferences, exposing a systemic lag in recognizing groundbreaking innovations [6].
- Notable figures in AI, including Geoffrey Hinton and Yann LeCun, faced rejection for their pioneering work, often for reasons that seem absurd in hindsight, such as claims of lacking theoretical basis or being overly simplistic [6].

Group 2: Specific Case Studies of Rejected Innovations
- LSTM, a milestone in handling sequential data, was rejected by NIPS in 1996, when statistical methods were favored, only to later dominate fields like speech recognition [8].
- The SIFT algorithm, which ruled computer vision for 15 years, was rejected by ICCV and CVPR for perceived complexity and lack of elegance, ultimately proving the value of robust engineering design [11].
- Dropout, a key regularization method for deep neural networks, was rejected by NIPS in 2012 for being too radical, yet became crucial to the success of models like AlexNet [17].
- Word2Vec, despite its revolutionary impact on NLP, received a strong rejection at ICLR 2013 for perceived lack of scientific rigor, but quickly became a cornerstone of text representation [19][20].

Group 3: Reflection on Peer Review Limitations
- The peer review system often struggles to recognize disruptive innovations, leading to a "simplicity trap" in which reviewers equate mathematical complexity with research contribution [40].
- Reviewers tend to defend existing paradigms, which can hinder acceptance of novel ideas that challenge traditional metrics of success [40].
- Demanding rigorous theoretical proof in an experimental field like deep learning can stifle practical breakthroughs, as seen in the initial skepticism toward methods like the Adam optimizer [40].

Group 4: Broader Implications
- The experiences of rejected papers illustrate the nonlinear nature of scientific progress: peer review, while essential, is limited by human cognitive biases [41].
- Historical anecdotes, such as the rejection of Einstein's paper on gravitational waves, underscore that the true measure of a work's impact is its long-term relevance rather than immediate acceptance [42][44].
Lenovo Files a Patent Application for a Data Processing Method, Model Compression Method, and Device
Jin Rong Jie· 2025-05-31 00:32
Group 1
- Lenovo (Beijing) Co., Ltd. has applied for a patent titled "Data Processing Method, Model Compression Method and Device," publication number CN120068971A, filed in February 2025 [1].
- The patent abstract describes a data processing method that begins by obtaining input data for a target task, which can be image, text, voice, or video data [1].
- The method processes the target task using two different parameter sets that represent the target model, where the first and second types of tasks satisfy similarity conditions [1].

Group 2
- Lenovo (Beijing) Co., Ltd. was established in 1992 and is primarily engaged in manufacturing computers, communications equipment, and other electronic devices [2].
- The company has a registered capital of 565 million Hong Kong dollars and has invested in 102 enterprises [2].
- Lenovo (Beijing) has participated in 5,000 bidding projects and holds 1,730 trademark records, 5,000 patent records, and 237 administrative licenses [2].
A Conversation with 27-Year-Old Doctoral Advisor Zhang Linfeng: The CVPR Perfect Score for Model Compression Was a Bit of a Surprise; Shanghai Jiao Tong Has Many Young Faculty Like Me
量子位· 2025-05-27 01:07
Core Viewpoint
- Zhang Linfeng, a young professor at Shanghai Jiao Tong University, has made significant contributions to model compression, particularly through innovative data distillation methods that enhance model efficiency and reduce training costs [2][4][27].

Group 1: Model Compression Techniques
- Zhang Linfeng's team developed a new data distillation method that received a perfect score at CVPR 2025, running on a six-year-old 2080 Ti GPU with only 1/300 of the memory required by the previous state of the art while being 20 times faster [2][4].
- The team introduced a novel distribution difference metric (NCFD) that recasts data distillation as a min-max optimization problem, significantly improving the quality of synthetic data and scaling across various benchmark datasets [6][7].
- The approach focuses on using data efficiently to reduce the training costs of large AI models, aiming for a cost-saving ratio greater than 1 between training expenses saved and data selection costs [9][10].

Group 2: Token Reduction Strategies
- The team has explored token-level feature caching, achieving up to 9× acceleration in diffusion language models with minimal performance loss, and extended this to multimodal models, where up to 90% of tokens can be removed without sacrificing accuracy [11][12].
- The ToCa method adaptively selects which tokens to cache, optimizing performance for the specific task; in image editing, for example, only the relevant regions need computation [16][20].
- The latest TaylorSeer model predicts upcoming features instead of reusing previous ones, achieving close to 5× acceleration across a range of models, including video generation tasks [18][20][24].
Group 3: Future Directions and Industry Impact
- The overarching goal of Zhang Linfeng's research is to lower the deployment costs of large models and make them practical in real-world scenarios, particularly in video generation, where the aim is real-time generation speed [25][27].
- The evolution of model compression is seen as a response to ever-larger AI models, with a shift from traditional methods to data-centric approaches that minimize knowledge loss during compression [38][44].
- The research outcomes have been open-sourced and are gradually being integrated into various models, indicating significant industry impact and potential for widespread application [23][26].
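The idea of predicting the next feature instead of reusing a stale cached one can be sketched with polynomial extrapolation over cached features. The snippet below is a generic second-order backward-difference extrapolator, the Taylor-style idea the passage describes, not TaylorSeer's actual implementation; the function name and step layout are assumptions.

```python
import numpy as np

def extrapolate_next(history):
    # predict the feature at step t+1 from cached features at steps
    # t, t-1, t-2 using Newton backward differences -- a finite-difference
    # stand-in for a second-order Taylor expansion along the timestep axis
    f_t, f_tm1, f_tm2 = history[-1], history[-2], history[-3]
    d1 = f_t - f_tm1                # ~ first derivative of the feature
    d2 = f_t - 2 * f_tm1 + f_tm2    # ~ second derivative of the feature
    return f_t + d1 + d2            # exact for quadratic trajectories

# toy trajectory: 4-dim features evolving quadratically over timesteps
feats = [0.5 * t**2 + 2.0 * t + 1.0 + np.zeros(4) for t in range(5)]
pred = extrapolate_next(feats[:4])  # predict step 4 from steps 1-3
print(np.allclose(pred, feats[4]))  # the quadratic is recovered exactly
```

When features drift smoothly across diffusion timesteps, such an extrapolation tracks them far better than naive reuse of the last cached value, which is the intuition behind predicting rather than caching.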