A paper whose theory was "proven wrong" wins the ICML 2025 Test of Time Award
猿大侠· 2025-07-17 03:11
Core Viewpoint
- The Batch Normalization paper, published in 2015, has been awarded the Test of Time Award at ICML 2025, highlighting its significant impact on deep learning and its widespread adoption in the field [1][2].

Group 1: Impact and Significance
- The Batch Normalization paper has been cited over 60,000 times, marking it as a milestone in the history of deep learning [2][4].
- It has been a key technology that enabled deep learning to transition from small-scale experiments to large-scale practical applications [3][4].
- The introduction of Batch Normalization drastically accelerated the training of deep neural networks, allowing models to reach the same accuracy with significantly fewer training steps [13][14].

Group 2: Challenges Addressed
- In 2015, training deep neural networks was difficult: optimization became unstable as the number of layers increased [5][6].
- The researchers observed that the distribution of activations inside the network shifts as training progresses, which complicates optimization [11][12].
- Batch Normalization addresses this by normalizing the activations of the hidden layers, thereby stabilizing the training process (a minimal code sketch follows this summary) [12][14].

Group 3: Theoretical Developments
- The original explanation for Batch Normalization was challenged in 2018: follow-up research showed that its main effect is to make the optimization landscape smoother, which makes gradients more predictable and training more stable [22][24].
- Newer research suggests that Batch Normalization also functions as an unsupervised learning technique, allowing networks to adapt to the inherent structure of the data from the start of training [25][26].

Group 4: Authors' Current Endeavors
- The authors of the Batch Normalization paper, Sergey Ioffe and Christian Szegedy, have continued their careers in AI, with Szegedy joining xAI and Ioffe following suit [30][31].
- Szegedy has since moved to Morph Labs, focusing on achieving "verifiable superintelligence" [33].
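The Group 2 summary above describes the mechanism only in words. The sketch below is a minimal, assumed illustration in plain NumPy (not the authors' code, with arbitrary example shapes) of the transform the 2015 paper proposes: normalize each feature over the mini-batch, then rescale and shift with the learnable parameters gamma and beta.

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """x: (batch_size, num_features) activations of one hidden layer."""
    mu = x.mean(axis=0)                      # per-feature mean over the mini-batch
    var = x.var(axis=0)                      # per-feature variance over the mini-batch
    x_hat = (x - mu) / np.sqrt(var + eps)    # zero mean, unit variance per feature
    return gamma * x_hat + beta              # learnable scale and shift

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(32, 4))             # a shifted, scaled batch
y = batch_norm_forward(x, gamma=np.ones(4), beta=np.zeros(4))
print(y.mean(axis=0).round(3), y.std(axis=0).round(3))       # roughly 0 and 1 per feature
```

Because the statistics are computed per mini-batch, each example's normalized value depends on the other examples in the same batch, which is the source of the mild regularization effect mentioned in the summaries.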
A paper whose theory was "proven wrong" wins the ICML 2025 Test of Time Award
量子位· 2025-07-15 08:31
Core Insights
- The Batch Normalization paper, published in 2015, has been awarded the Test of Time Award at ICML 2025, highlighting its significant impact on deep learning [1]
- With over 60,000 citations, the work is considered a milestone in the development of deep learning, facilitating the training and application of deep neural networks [2][4]
- Batch Normalization is a key technology that enabled deep learning to transition from small-scale experiments to large-scale practical applications [3]

Group 1
- In 2015, training deep neural networks was difficult: optimization was often unstable and highly sensitive to parameter initialization [5][6][7]
- Researchers Sergey Ioffe and Christian Szegedy identified the problem of Internal Covariate Shift, whereby the distribution of each layer's inputs changes during training, complicating optimization [8][11]
- Their solution normalizes the data at every layer, analogous to normalizing the network's inputs, which significantly improved training speed and stability [12]

Group 2
- The original paper demonstrated that with Batch Normalization, a state-of-the-art image classification model reached the same accuracy in only 1/14 of the training steps [13]
- Batch Normalization not only accelerated training but also introduced a regularization effect, enhancing the model's generalization ability [14][15]
- Following its introduction, Batch Normalization became a foundational component of mainstream convolutional neural networks such as ResNet and DenseNet (see the usage sketch after this summary) [18]

Group 3
- In 2018, a paper from MIT challenged the core theory behind Batch Normalization, showing that models with Batch Normalization still trained faster even when distribution-shifting noise was deliberately injected [21][23]
- That research revealed that Batch Normalization smooths the optimization landscape, making gradient behavior more predictable and stable [24]
- It has also been suggested that Batch Normalization acts as an unsupervised learning technique, allowing networks to adapt to the data's inherent structure early in training [25]

Group 4
- Recent studies have provided deeper insights into Batch Normalization from a geometric perspective [29]
- Both authors, Ioffe and Szegedy, have continued their careers in AI, with Szegedy joining xAI and Ioffe following suit [30][32]
- Szegedy has since transitioned to a new role at Morph Labs, focusing on achieving "verifiable superintelligence" [34]
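As a hedged usage sketch (not code from ResNet, DenseNet, or the original paper), the following shows how a batch-normalization layer is typically placed between a convolution and its activation in a modern framework; PyTorch and the layer sizes are assumed purely for illustration. Switching between model.train() and model.eval() moves the layer from per-mini-batch statistics to the accumulated running mean and variance used at inference time.

```python
import torch
import torch.nn as nn

# Conv -> BatchNorm -> ReLU: the ordering popularized by ResNet-style networks.
block = nn.Sequential(
    nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, padding=1, bias=False),
    nn.BatchNorm2d(num_features=64),   # normalizes each channel over the mini-batch
    nn.ReLU(inplace=True),
)

x = torch.randn(8, 3, 32, 32)          # a mini-batch of 8 RGB images, 32x32 pixels

block.train()                          # training mode: use current mini-batch statistics
y_train = block(x)

block.eval()                           # inference mode: use accumulated running mean/variance
with torch.no_grad():
    y_eval = block(x)

print(y_train.shape, y_eval.shape)     # both torch.Size([8, 64, 32, 32])
```

The convolution's bias is disabled because the normalization's learnable shift (beta) makes it redundant, a common convention in networks that pair convolutions with Batch Normalization.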