Batch Normalization (批次归一化)

Ten years, 60,000 citations: BatchNorm enters the pantheon as ICML grants it the Test of Time Award
36Kr · 2025-07-17 08:52
A paper published in 2015 was, ten years later, awarded the Test of Time Award at the International Conference on Machine Learning (ICML) 2025. That paper is one every deep learning practitioner knows: "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift". Its arrival fundamentally changed how researchers train deep neural networks and became a key milestone in the development of AI.

1. A milestone that cannot be bypassed

ICML's Test of Time Award honors papers published ten years earlier that have had a profound impact on the field over the decade since. Winning it means a piece of research was not only groundbreaking at the time but, more importantly, that its ideas and methods have withstood the test of time and become the foundation for countless subsequent studies. The award for Batch Normalization (批次归一化, BatchNorm for short) is richly deserved.

In engineering practice, BatchNorm has become a "default option": when building a neural network, developers almost reflexively add a BatchNorm layer after a convolutional or fully connected layer, as in the sketch below. ...
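To make that "default option" concrete, here is a minimal sketch of the conventional layer ordering, assuming PyTorch is available; the channel counts and kernel size are illustrative only and are not taken from the article.

```python
import torch.nn as nn

# Conventional block ordering described in the article: convolution -> BatchNorm -> nonlinearity.
# bias=False is common here because BatchNorm's own shift parameter makes a conv bias redundant.
block = nn.Sequential(
    nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, padding=1, bias=False),
    nn.BatchNorm2d(num_features=64),  # normalizes each channel over the mini-batch
    nn.ReLU(inplace=True),
)
```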
A paper whose core theory was later shown to be flawed wins the ICML 2025 Test of Time Award
量子位 (QbitAI) · 2025-07-15 08:31
Core Insights
- The Batch Normalization paper, published in 2015, has been awarded the Test of Time Award at ICML 2025, highlighting its significant impact on deep learning [1]
- With over 60,000 citations, the work is considered a milestone in the development of deep learning, making deep neural networks far easier to train and deploy [2][4]
- Batch Normalization is a key technology that enabled deep learning to move from small-scale experiments to large-scale practical applications [3]

Group 1
- In 2015, training deep neural networks was difficult: optimization was often unstable and highly sensitive to parameter initialization [5][6][7]
- Researchers Sergey Ioffe and Christian Szegedy attributed the problem to Internal Covariate Shift, the phenomenon in which the distribution of each layer's inputs changes during training, complicating optimization [8][11]
- Their solution was to normalize the activations at every layer, analogous to normalizing the network's inputs, which significantly improved training speed and stability [12] (a minimal sketch of the computation follows this summary)

Group 2
- The original paper showed that with Batch Normalization an advanced image classification model could reach the same accuracy in only 1/14 of the training steps [13]
- Batch Normalization not only accelerated training but also acted as a regularizer, improving the model's generalization ability [14][15]
- After its introduction, Batch Normalization became a foundational component of mainstream convolutional networks such as ResNet and DenseNet [18]

Group 3
- In 2018, a paper from MIT challenged the original explanation, showing that even when noise was deliberately injected to re-introduce distribution shift, models with Batch Normalization still trained faster than those without it [21][23]
- That work argued that Batch Normalization instead smooths the optimization landscape, making gradient behavior more predictable and stable [24]
- It has also been suggested that Batch Normalization acts as a form of unsupervised learning, letting the network adapt to the data's inherent structure early in training [25]

Group 4
- More recent studies have analyzed Batch Normalization from a geometric perspective, deepening the theoretical understanding further [29]
- Both authors have continued their careers in AI: Szegedy joined xAI, and Ioffe later followed [30][32]
- Szegedy has since moved to a new role at Morph Labs, focused on pursuing "verifiable superintelligence" [34]
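As a rough illustration of the per-layer normalization described in Group 1, here is a minimal NumPy sketch of the training-time BatchNorm computation for a fully connected layer: normalize each feature over the mini-batch, then apply a learned scale (gamma) and shift (beta). This is a simplified sketch; it omits the running statistics that the full method uses at inference time.

```python
import numpy as np

def batchnorm_forward(x, gamma, beta, eps=1e-5):
    """Training-time batch normalization over a mini-batch x of shape (N, D)."""
    mu = x.mean(axis=0)                     # per-feature mini-batch mean
    var = x.var(axis=0)                     # per-feature mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)   # zero mean, unit variance per feature
    return gamma * x_hat + beta             # learned scale and shift restore expressiveness

# Illustrative usage with random data (shapes only; values are arbitrary)
x = np.random.randn(32, 8)                  # mini-batch of 32 examples, 8 features
gamma, beta = np.ones(8), np.zeros(8)
y = batchnorm_forward(x, gamma, beta)
```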