Out-of-distribution generalization

Institute of Software Proposes Mini-Batch Data Sampling Strategy
Jing Ji Guan Cha Wang · 2025-05-27 07:50
Core Insights
- A research team from the Institute of Software, Chinese Academy of Sciences, proposed a mini-batch data sampling strategy that removes the interference of unobservable semantic variables in representation learning, improving the out-of-distribution generalization of self-supervised learning models [1][2]

Group 1: Research Findings
- Out-of-distribution generalization refers to a model's performance on test data whose distribution differs from that of the training data; it is crucial for a model to remain effective on unseen data [1]
- The study found that self-supervised learning models are disturbed by unobservable semantic variables during training, which weakens their out-of-distribution generalization [1]

Group 2: Methodology
- The proposed strategy uses causal effect estimation to remove the confounding effect of the unobservable semantic variables [1]
- By learning a latent variable model, the strategy estimates the posterior probability distribution of the unobservable semantic variables given an "anchor" sample, termed the balance score [1]
- Samples with similar balance scores are grouped into the same mini-batch, so that within each batch the unobservable semantic variables are conditionally independent of the "anchor" samples (a sketch of this sampling procedure is given after the results below) [1]

Group 3: Experimental Results
- Extensive experiments on benchmark datasets showed that the sampling strategy improved mainstream self-supervised learning methods by at least 2% across various evaluation tasks [2]
- In classification on ImageNet100 and ImageNet, both Top-1 and Top-5 accuracy surpassed state-of-the-art self-supervised methods [2]
- In semi-supervised classification, Top-1 and Top-5 accuracy increased by more than 3% and 2%, respectively [2]
- The strategy also delivered stable gains in average precision on object detection and instance segmentation transfer learning tasks [2]
- Gains exceeded 5% on few-shot transfer learning tasks on datasets such as Omniglot, miniImageNet, and CIFAR-FS [2]
- The work was accepted by the International Conference on Machine Learning (ICML-25), a top-tier conference in artificial intelligence [2]
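
The sketch below illustrates the balance-score mini-batch sampling idea described under Group 2. The article does not specify the latent variable model or the grouping rule, so this is only a minimal illustration under assumptions: a Gaussian mixture fitted on anchor features stands in for the latent variable model, its per-sample posterior (responsibility) vector stands in for the balance score, and samples are grouped by their most probable latent component; the function names and parameters are hypothetical, not the authors'.

```python
"""Hedged sketch: group samples with similar balance scores into mini-batches.

Assumption: the "latent variable model" is approximated by a Gaussian mixture,
and the "balance score" is its posterior over components for each anchor sample.
"""
import numpy as np
from sklearn.mixture import GaussianMixture


def balance_scores(features: np.ndarray, n_latent: int = 8, seed: int = 0) -> np.ndarray:
    """Estimate P(latent semantic variable | anchor) for every sample.

    `features` is an (N, D) array of anchor representations; the returned
    (N, n_latent) array of posteriors serves as the balance scores.
    """
    gmm = GaussianMixture(n_components=n_latent, random_state=seed)
    gmm.fit(features)
    return gmm.predict_proba(features)


def grouped_batches(scores: np.ndarray, batch_size: int, seed: int = 0):
    """Yield index batches whose members share similar balance scores.

    Samples are grouped by their most probable latent component, shuffled
    within each group, and emitted batch by batch, so the unobserved semantic
    variable is (approximately) constant inside every mini-batch.
    """
    rng = np.random.default_rng(seed)
    groups = scores.argmax(axis=1)  # hard assignment per sample
    for g in np.unique(groups):
        idx = np.flatnonzero(groups == g)
        rng.shuffle(idx)
        for start in range(0, len(idx), batch_size):
            yield idx[start:start + batch_size]


if __name__ == "__main__":
    # Toy usage: 1,000 random "anchor" features, 64-sample mini-batches.
    X = np.random.default_rng(0).normal(size=(1000, 32))
    scores = balance_scores(X, n_latent=4)
    for batch in grouped_batches(scores, batch_size=64):
        pass  # feed `batch` indices to the self-supervised training loop
```

In practice the grouping could also be done by clustering the full posterior vectors rather than taking the argmax; the hard assignment is used here only to keep the sketch short.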