Model Compression
Breaking | Wang Yunhe, Director of Huawei's Noah's Ark Lab, Departs
机器之心· 2026-03-28 04:45
Core Viewpoint
- The departure of Wang Yunhe, director of Huawei's Noah's Ark Lab, marks a significant shift in the AI industry, signaling a profound structural transformation within the sector since 2026 [3][25].

Group 1: Wang Yunhe's Background
- Wang Yunhe, born in 1991, earned a Bachelor's degree in Mathematics from Xidian University and a PhD in Intelligent Science from Peking University in 2018, focusing on deep learning, model compression, machine learning, and computer vision [5][8].
- He has over 8 years of experience at Huawei, starting as an intern at Noah's Ark Lab and progressing through Senior Engineer and Chief Engineer to director of the lab [8][25].

Group 2: Contributions and Achievements
- Wang has a notable academic record, with over 33,000 citations on Google Scholar, highlighting his influence in the field of AI [13].
- His research includes GhostNet, a lightweight neural network architecture that achieved 75.7% Top-1 accuracy on ImageNet classification, surpassing MobileNetV3 [15][16].
- He has contributed significantly to Vision Transformer research; his survey article has received 5,528 citations and is a key reference in the field [18].

Group 3: Insights on AI Models
- Wang has offered distinctive views on mainstream technology routes in the era of large models, discussing the potential impact of diffusion models on autoregressive models and emphasizing the need for structural thinking in model design [21].
- His recent work on the DLLM Agent explores how different generative paradigms affect agent planning and decision-making, demonstrating the proposed model's efficiency in global planning and interaction [22][24].
Group 4: Industry Impact
- Wang's departure from Huawei is a focal point for the industry, as he led several internationally influential algorithm innovations during his tenure [25].
- His future career path, particularly his thinking on unifying architectures for diffusion language models and general artificial intelligence, remains a topic of industry interest [26].
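The efficiency claim behind GhostNet can be illustrated with a back-of-envelope multiply-accumulate (MAC) count: a Ghost module produces only a fraction of the output channels with an ordinary convolution and generates the rest as "ghost" feature maps via cheap depthwise operations. The sketch below is illustrative only; the layer shape, kernel sizes, and split ratio `s` are assumed values, not the paper's exact configuration.

```python
def conv_flops(c_in, c_out, k, h, w):
    # MAC count of a standard k*k convolution over an h*w output map
    return c_in * c_out * k * k * h * w

def ghost_module_flops(c_in, c_out, k, d, s, h, w):
    # primary convolution produces c_out/s "intrinsic" feature maps
    m = c_out // s
    primary = conv_flops(c_in, m, k, h, w)
    # each intrinsic map spawns s-1 "ghost" maps via a cheap d*d depthwise op
    cheap = m * (s - 1) * d * d * h * w
    return primary + cheap

std = conv_flops(128, 256, 3, 56, 56)
ghost = ghost_module_flops(128, 256, 3, 3, 2, s=2 and 2, w=56, h=56) if False else \
        ghost_module_flops(128, 256, 3, 3, 2, 56, 56)
print(f"standard: {std:,} MACs, ghost: {ghost:,} MACs, speedup ~ {std / ghost:.2f}x")
```

With a split ratio of s = 2, roughly half the output channels come from cheap depthwise ops, so the MAC count drops by a factor approaching s.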
Want to Join OpenAI? Solve This Problem First; $1 Million in Compute Is Already in Place
机器之心· 2026-03-19 06:49
Core Insights
- OpenAI has launched the Model Craft Challenge "Parameter Golf," which focuses on training optimal small models under strict resource constraints, emphasizing efficiency and innovative design [3][4][5].

Group 1: Challenge Overview
- Participants must minimize validation loss on a fixed dataset while keeping model artifacts (weights and training code) under 16 MB and completing training on 8 H100 GPUs within 10 minutes [1][5].
- The challenge is open globally and aims to explore more efficient pre-training under strict resource constraints; outstanding participants may receive interview opportunities at OpenAI [4][8].

Group 2: Design and Structure
- The challenge draws inspiration from NanoGPT Speedrunning, which targets a specified validation loss in the shortest possible time, while adding exploration of efficient model design under parameter constraints [5].
- OpenAI has set aside a total of $1 million in compute credits to help participants start and advance their model training [6].

Group 3: Participation Guidelines
- Participants can fork a GitHub repository provided by OpenAI, which includes baseline models, fixed datasets, and evaluation scripts, and submit their improvements as pull requests [10].
- The challenge runs from March 18 to April 30; participants must be at least 18 years old and located in OpenAI-supported regions [9][12].

Group 4: Community Reactions
- Reactions have been mixed: some praise the challenge as a true test of engineering talent under constraints, while others worry about potential exploitation by companies [20][22].
- Notable AI researchers suggest the challenge could be completed by AI agents, reflecting a belief that automated solutions could outperform human participants [25].
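The 16 MB artifact budget above translates directly into a parameter ceiling. The following is a rough, hypothetical back-of-envelope estimate; the reservation for training-code size is an assumption for illustration, not a challenge rule.

```python
BUDGET_MB = 16  # artifact budget from the challenge rules (weights + training code)

def max_params(budget_mb, bytes_per_param, code_overhead_kb=64):
    # hypothetical estimate: how many parameters fit once some space is
    # reserved for the training code that ships inside the artifact
    usable = budget_mb * 1024 * 1024 - code_overhead_kb * 1024
    return usable // bytes_per_param

for name, width in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1)]:
    print(f"{name}: ~{max_params(BUDGET_MB, width) / 1e6:.1f}M parameters")
```

At fp32 the ceiling sits near 4M parameters, which is why lower-precision weight storage is an obvious first move under this rule set.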
Over Half the Model Pruned Away, Yet Accuracy Rises 15%! New Research from HUST & Alibaba Security Achieves Near-Lossless Class-Specific Compression of ViT | ICLR'26
量子位· 2026-03-05 06:33
Core Viewpoint
- The article emphasizes the limitations of large, general-purpose visual models in real-world applications, advocating for smaller, specialized models that are more efficient and better suited to specific tasks [1][2].

Group 1: Limitations of Large Models
- Large visual models, while powerful, have high computational costs and are not optimal for deployment in resource-constrained environments [1][4].
- Many applications only require a few key target categories, making the extensive knowledge in general models unnecessary and even counterproductive [1][8].

Group 2: Advantages of Customized Models
- Customized, "small and specialized" models align better with practical needs, reducing deployment costs and enhancing long-term operational stability [2].
- The new paradigm proposed by Huazhong University of Science and Technology and Alibaba Security, named Vulcan, derives specialized models from general ones, focusing on key target categories while minimizing knowledge loss [3].

Group 3: Methodology of Vulcan
- Vulcan introduces a "train-then-prune" approach, a departure from traditional prune-then-train methods, thereby preserving critical information related to the target categories [3][13].
- The methodology combines two main components, Class-Centric Neuron Collapse (CCNC) and Truncated Nuclear Norm Regularization (TNNR), which work together to concentrate the model on relevant information [15][16].

Group 4: Experimental Results
- Vulcan-derived models achieved accuracy improvements of up to 15.12% on ImageNet tasks while shrinking the model to 20%-40% of its original size [19].
- Across tests on different datasets and model sizes, Vulcan outperformed existing structured pruning methods, achieving up to 13.92% higher accuracy on class-specific tasks [19][21].

Group 5: Practical Deployment
- In practical deployment scenarios, Vulcan achieved inference speedups of 1.23× to 3.02× and reduced memory usage by 20.59% to 76.47% on edge devices [22][23].
- The research indicates that understanding the internal knowledge structure of models is crucial for reliable lightweight deployment [25].
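The TNNR component named above has a standard definition in the low-rank literature: the truncated nuclear norm of a matrix is the sum of its singular values excluding the largest r. Penalizing it drives the spectral tail toward zero while leaving the dominant r directions free. The article does not detail exactly how Vulcan applies it, so treat this as a generic numerical sketch of the regularizer itself.

```python
import numpy as np

def truncated_nuclear_norm(W, r):
    # sum of all singular values minus the r largest: penalizing this
    # shrinks the tail of the spectrum (redundant directions) while the
    # top-r components (e.g. class-relevant structure) remain unpenalized
    s = np.linalg.svd(W, compute_uv=False)  # sorted in descending order
    return s.sum() - s[:r].sum()

rng = np.random.default_rng(0)
# a nearly rank-2 weight matrix plus small noise
W = rng.normal(size=(64, 2)) @ rng.normal(size=(2, 64)) \
    + 0.01 * rng.normal(size=(64, 64))
print(truncated_nuclear_norm(W, r=2))  # small: only the noisy tail remains
print(truncated_nuclear_norm(W, r=0))  # full nuclear norm: much larger
```

Because `W` is nearly rank-2, excluding its top 2 singular values leaves almost nothing, which is exactly the state such a regularizer pushes a weight matrix toward.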
Rejection ≠ Failure! These High-Impact Papers Were All Rejected by Top Conferences
机器之心· 2025-12-11 02:47
Core Insights
- Waymo has released a detailed blog on its AI strategy centered on its foundational model, emphasizing the use of distillation to create efficient models for onboard operation [1].
- Jeff Dean highlighted the significance of knowledge distillation in AI, recalling its initial rejection by NeurIPS 2014, which underestimated its potential impact [3][4].

Group 1: Historical Context of Rejected Papers
- Many foundational AI technologies, such as optimizers for large models and computer vision techniques, were initially rejected by top conferences, exposing a systemic lag in recognizing groundbreaking innovations [6].
- Notable figures in AI, including Geoffrey Hinton and Yann LeCun, faced rejection for their pioneering work, often for reasons that seem absurd in hindsight, such as claims of lacking theoretical basis or being overly simplistic [6].

Group 2: Specific Case Studies of Rejected Innovations
- LSTM, a milestone in handling sequential data, was rejected by NIPS in 1996, when statistical methods were favored, only to later dominate fields like speech recognition [8].
- The SIFT algorithm, which ruled computer vision for 15 years, was rejected by ICCV and CVPR for perceived complexity and lack of elegance, ultimately proving the value of robust engineering design [11].
- Dropout, a key regularization method for deep neural networks, was rejected by NIPS in 2012 for being too radical, yet became crucial to the success of models like AlexNet [17].
- Word2Vec, despite its revolutionary impact on NLP, received a strong rejection at ICLR 2013 for perceived lack of scientific rigor, but quickly became a cornerstone of text representation [19][20].

Group 3: Reflection on Peer Review Limitations
- The peer review system often struggles to recognize disruptive innovations, leading to a "simplicity trap" in which reviewers equate mathematical complexity with research contribution [40].
- Reviewers tend to defend existing paradigms, which can hinder acceptance of novel ideas that challenge traditional metrics of success [40].
- Demanding rigorous theoretical proof in an experimental field like deep learning can stifle practical breakthroughs, as seen in the initial skepticism toward methods like the Adam optimizer [40].

Group 4: Broader Implications
- The experiences of rejected papers illustrate the nonlinear nature of scientific progress: peer review, while essential, is limited by human cognitive biases [41].
- Historical anecdotes, such as the rejection of Einstein's paper on gravitational waves, underscore that the true measure of a work's impact is its long-term relevance rather than immediate acceptance [42][44].
Lenovo Files a Patent Application for a Data Processing Method, Model Compression Method, and Device
Jin Rong Jie· 2025-05-31 00:32
Group 1
- Lenovo (Beijing) Co., Ltd. has applied for a patent titled "Data Processing Method, Model Compression Method and Device," publication number CN120068971A, filed in February 2025 [1].
- The patent abstract describes a data processing method that begins by obtaining input data for a target task, which can be image, text, voice, or video data [1].
- The method processes the target task using two different parameter sets that represent the target model, where the first and second types of tasks satisfy similarity conditions [1].

Group 2
- Lenovo (Beijing) Co., Ltd. was established in 1992 and is primarily engaged in manufacturing computers, communications equipment, and other electronic devices [2].
- The company has a registered capital of 565 million Hong Kong dollars and has invested in 102 enterprises [2].
- Lenovo (Beijing) has participated in 5,000 bidding projects and holds 1,730 trademark records, 5,000 patent records, and 237 administrative licenses [2].
A Conversation with 27-Year-Old Doctoral Advisor Zhang Linfeng: The CVPR Perfect Score for Model Compression Was a Bit of a Surprise; Shanghai Jiao Tong Has Many Young Faculty Like Me
量子位· 2025-05-27 01:07
Core Viewpoint
- Zhang Linfeng, a young professor at Shanghai Jiao Tong University, has made significant contributions to model compression, particularly through innovative data distillation methods that enhance model efficiency and reduce training costs [2][4][27].

Group 1: Model Compression Techniques
- Zhang Linfeng's team developed a new data distillation method that received a perfect score at CVPR 2025, running on a six-year-old 2080 Ti GPU with only 1/300 of the memory required by the previous state of the art while being 20 times faster [2][4].
- The team introduced a novel distribution difference metric (NCFD) that recasts data distillation as a min-max optimization problem, significantly improving the quality of synthetic data and scaling across various benchmark datasets [6][7].
- The approach focuses on using data efficiently to reduce the training costs of large AI models, aiming for a cost-saving ratio greater than 1 between training expenses saved and data selection costs [9][10].

Group 2: Token Reduction Strategies
- The team has explored token-level feature caching, achieving up to 9× acceleration in diffusion language models with minimal performance loss, and extended this to multimodal models, where up to 90% of tokens can be removed without sacrificing accuracy [11][12].
- The ToCa method adaptively selects which tokens to cache, optimizing performance for the specific task; in image editing, for example, only the relevant regions need computation [16][20].
- The latest TaylorSeer model predicts upcoming features instead of reusing previous ones, achieving close to 5× acceleration across a range of models, including video generation tasks [18][20][24].
Group 3: Future Directions and Industry Impact
- The overarching goal of Zhang Linfeng's research is to lower the deployment costs of large models and make them practical in real-world scenarios, particularly in video generation, where the aim is real-time generation speed [25][27].
- The evolution of model compression is seen as a response to ever-larger AI models, with a shift from traditional methods to data-centric approaches that minimize knowledge loss during compression [38][44].
- The research outcomes have been open-sourced and are gradually being integrated into various models, indicating significant industry impact and potential for widespread application [23][26].
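The idea of predicting the next feature instead of reusing a stale cached one can be sketched with polynomial extrapolation over cached features. The snippet below is a generic second-order backward-difference extrapolator, the Taylor-style idea the passage describes, not TaylorSeer's actual implementation; the function name and step layout are assumptions.

```python
import numpy as np

def extrapolate_next(history):
    # predict the feature at step t+1 from cached features at steps
    # t, t-1, t-2 using Newton backward differences -- a finite-difference
    # stand-in for a second-order Taylor expansion along the timestep axis
    f_t, f_tm1, f_tm2 = history[-1], history[-2], history[-3]
    d1 = f_t - f_tm1                # ~ first derivative of the feature
    d2 = f_t - 2 * f_tm1 + f_tm2    # ~ second derivative of the feature
    return f_t + d1 + d2            # exact for quadratic trajectories

# toy trajectory: 4-dim features evolving quadratically over timesteps
feats = [0.5 * t**2 + 2.0 * t + 1.0 + np.zeros(4) for t in range(5)]
pred = extrapolate_next(feats[:4])  # predict step 4 from steps 1-3
print(np.allclose(pred, feats[4]))  # the quadratic is recovered exactly
```

When features drift smoothly across diffusion timesteps, such an extrapolation tracks them far better than naive reuse of the last cached value, which is the intuition behind predicting rather than caching.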