Cut more than half the model and accuracy still rises 15%! New research from HUST & Alibaba Security achieves near-lossless class-specific compression for ViT | ICLR'26
QbitAI · 2026-03-05 06:33

Core Viewpoint
- The article emphasizes the limitations of large, general-purpose visual models in real-world applications, advocating for smaller, specialized models that are more efficient and better suited for specific tasks [1][2].

Group 1: Limitations of Large Models
- Large visual models, while powerful, have high computational costs and are not optimal for deployment in resource-constrained environments [1][4].
- Many applications only require a focus on a few key target categories, making the extensive knowledge in general models unnecessary and counterproductive [1][8].

Group 2: Advantages of Customized Models
- Customized models, described as "small and specialized," align better with practical needs, reducing deployment costs and enhancing long-term operational stability [2].
- The new paradigm proposed by Huazhong University of Science and Technology and Alibaba, named Vulcan, derives specialized models from general ones, focusing on key target categories while minimizing knowledge loss [3].

Group 3: Methodology of Vulcan
- Vulcan introduces a "train-then-prune" approach, a departure from traditional methods that prune first and then fine-tune, thus preserving critical information related to the target categories [3][13].
- The methodology includes two main components, Class-Centric Neuron Collapse (CCNC) and Truncated Nuclear Norm Regularization (TNNR), which work together to concentrate the model's capacity on class-relevant information [15][16].

Group 4: Experimental Results
- The Vulcan-derived models demonstrated an accuracy improvement of up to 15.12% on ImageNet tasks while reducing the model size to 20%-40% of the original [19].
- Across different datasets and model sizes, Vulcan outperformed existing structured pruning methods, achieving up to 13.92% higher accuracy on class-specific tasks [19][21].
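Truncated nuclear norm regularization, as the name suggests, penalizes all singular values of a weight matrix except the largest r, which drives the matrix toward rank r while leaving its dominant directions (the retained knowledge) unpenalized. A minimal sketch of that quantity, assuming a plain numpy formulation; the function name and setup are illustrative, not the paper's implementation:

```python
import numpy as np

def truncated_nuclear_norm(W: np.ndarray, r: int) -> float:
    """Sum of all singular values of W except the largest r.

    Minimizing this as a regularizer pushes W toward rank <= r
    while the top-r directions remain unpenalized.
    """
    s = np.linalg.svd(W, compute_uv=False)  # singular values, descending
    return float(s[r:].sum())

# Example: a rank-2 matrix has ~zero truncated nuclear norm for r >= 2,
# but a strictly positive one for r = 1.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 2)) @ rng.normal(size=(2, 8))  # rank-2 by construction
print(truncated_nuclear_norm(W, 2))  # numerically ~0
print(truncated_nuclear_norm(W, 1))  # > 0: second singular value remains
```

Because the top-r singular values are excluded, the regularizer can be driven all the way to zero without shrinking the directions the compressed model actually needs, unlike the plain nuclear norm.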
Group 5: Practical Deployment
- In practical deployment scenarios, Vulcan achieved inference speedups ranging from 1.23× to 3.02× and reduced memory usage by 20.59% to 76.47% on edge devices [22][23].
- The research indicates that understanding the internal knowledge structure of models is crucial for achieving reliable lightweight deployment [25].
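The class-centric idea behind such compression can be illustrated by ranking hidden neurons by how strongly they respond to the target classes alone, then keeping only the top fraction. Everything below (the function name, scoring by mean absolute activation) is an illustrative sketch, not Vulcan's actual CCNC procedure:

```python
import numpy as np

def select_neurons_for_classes(activations: np.ndarray,
                               labels: np.ndarray,
                               target_classes: list[int],
                               keep_ratio: float) -> np.ndarray:
    """Return sorted indices of neurons to keep, ranked by mean
    absolute activation over samples of the target classes only.

    activations: (num_samples, num_neurons) hidden activations
    labels:      (num_samples,) integer class label per sample
    """
    mask = np.isin(labels, target_classes)           # target-class samples
    score = np.abs(activations[mask]).mean(axis=0)   # per-neuron relevance
    k = max(1, int(keep_ratio * activations.shape[1]))
    return np.sort(np.argsort(score)[::-1][:k])      # top-k neuron indices

# Toy example: neuron 1 fires only for class 0, neuron 0 only for class 1.
acts = np.array([[0.0, 5.0],
                 [0.0, 4.0],
                 [3.0, 0.0]])
labels = np.array([0, 0, 1])
kept = select_neurons_for_classes(acts, labels, target_classes=[0],
                                  keep_ratio=0.5)
print(kept)  # [1]: only the class-0 neuron survives
```

Scoring on target-class samples only is what makes the pruning class-specific: neurons that serve the discarded classes receive low scores and are removed, which is one way a smaller model can match or exceed the original on the narrow task.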
