Neural Scaling Laws
NeurIPS 2025 Best Paper Awards Announced: Decade-Old Classic by Kaiming He, Jian Sun, and Colleagues Takes a Prize
36Kr · 2025-11-27 07:27
Core Insights
- NeurIPS 2025 announced its best paper awards, recognizing four papers, including significant contributions from Chinese researchers [1][2]
- The "Test of Time Award" went to Faster R-CNN, highlighting its lasting impact on computer vision [1][50]

Best Papers
- The first best paper, "Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond)," was authored by a team from several institutions, including the University of Washington and Carnegie Mellon University [5][6]
- The second best paper, "Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free," was a collaboration among researchers from Alibaba, the University of Edinburgh, Stanford University, MIT, and Tsinghua University [14][15]
- The third best paper, "1000 Layer Networks for Self-Supervised RL: Scaling Depth Can Enable New Goal-Reaching Capabilities," was authored by researchers from Princeton University and Warsaw University of Technology [21][24]
- The fourth best paper, "Why Diffusion Models Don't Memorize: The Role of Implicit Dynamical Regularization in Training," was a collaboration between PSL University and Bocconi University [28][29]

Runners Up
- Three runner-up papers were also recognized. The first, "Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?", came from Tsinghua University and Shanghai Jiao Tong University [33][34]
- The second, "Optimal Mistake Bounds for Transductive Online Learning," was authored by researchers from Kent State University, Purdue University, Google Research, and MIT [38][39]
- The third, "Superposition Yields Robust Neural Scaling," came from MIT [42][46]

Test of Time Award
- The "Test of Time Award" went to the paper "Faster R-CNN," which has been cited more than 56,700 times and has significantly influenced computer vision [50][52]
- The paper introduced a fully learnable two-stage detection pipeline that replaced traditional methods, achieving high detection accuracy at near real-time speeds [50][52]