Geometry
Parameter Space Symmetry: A Unified Geometric Framework for Deep Learning Theory
机器之心 · 2025-10-29 09:25
Core Insights
- The article discusses the evolution of deep learning models from millions to billions of parameters, highlighting the lack of a systematic understanding of why they work so well [2]
- Its central concept is parameter space symmetry: the existence of multiple parameter configurations that yield the same model function, which complicates the analysis of optimization and generalization [4][6]

Group 1: Parameter Space Symmetry
- Parameter space symmetry allows different parameter combinations to produce identical outputs, exemplified by the interchange of neurons in hidden layers (a minimal sketch follows this summary) [4][6]
- This symmetry is mathematically defined as the set of transformations that leave the loss function invariant; these transformations form a group whose orbits partition the parameter space into equivalent configurations [6]

Group 2: Types of Symmetry
- Beyond discrete symmetries, most neural network architectures also exhibit continuous symmetries, such as scaling and linear transformations, that preserve the network function (see the scaling sketch below) [8]
- Complex architectures like Transformers combine the symmetries of their components, including those of the multi-head attention mechanism [8]

Group 3: Impact on the Loss Landscape
- Symmetry creates a complex yet structured optimization landscape: continuous symmetries stretch isolated minima into flat manifolds, which affects the interpretation of flatness-based generalization metrics [10]
- Observed phenomena such as "mode connectivity," where independently trained models can be joined by low-loss paths, are partially attributed to continuous symmetries [10]

Group 4: Optimization Methods
- Symmetry gives rise to the phenomenon of "equal loss, different gradients," suggesting new algorithms that search within an equivalent orbit for points with more favorable gradients (see the teleportation-style sketch below) [15][19]
- Some optimization strategies exploit symmetry as a degree of freedom, while others remove it as redundancy; either way it is a key consideration in algorithm design [19]

Group 5: Learning Dynamics
- Continuous symmetries correspond to conserved quantities that remain constant during training, shedding light on the stability of the training process and the implicit bias of optimization (see the conservation sketch below) [21][23]
- The structure of parameter space symmetry shapes the statistical distribution of learning trajectories and their outcomes [23]

Group 6: Connections Across Spaces
- Parameter space symmetry is interconnected with symmetries of the data space and of internal representation spaces; model parameters often inherit the symmetry present in the data distribution [27][28]
- Emerging directions such as Weight Space Learning treat network weights as a new kind of data, using symmetry to analyze and generate model properties [28][29]

Group 7: Future Directions
- The pervasiveness of parameter space symmetry offers a new mathematical language for deep learning, linking the complex behavior of models to established tools from group theory and geometry [30]
- This perspective is already influencing practice, from optimization acceleration to model fusion and new model designs, turning theoretical concepts into actionable algorithmic principles [30]
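The permutation symmetry of Group 1 can be checked in a few lines. Below is a minimal NumPy sketch, not taken from the article: the two-layer ReLU network, its sizes, and the helper `mlp` are all hypothetical. It verifies that permuting the hidden neurons, together with the matching rows and columns of the weight matrices, leaves the network function unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-layer ReLU network: f(x) = W2 @ relu(W1 @ x + b1) + b2
d_in, d_hidden, d_out = 4, 8, 3
W1, b1 = rng.normal(size=(d_hidden, d_in)), rng.normal(size=d_hidden)
W2, b2 = rng.normal(size=(d_out, d_hidden)), rng.normal(size=d_out)

def mlp(x, W1, b1, W2, b2):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

# Permute the hidden neurons: rows of (W1, b1) and the matching
# columns of W2 move together, so the summation over hidden units
# is merely reordered and the output is unchanged.
perm = rng.permutation(d_hidden)
W1p, b1p, W2p = W1[perm], b1[perm], W2[:, perm]

x = rng.normal(size=d_in)
assert np.allclose(mlp(x, W1, b1, W2, b2), mlp(x, W1p, b1p, W2p, b2))
print("permuted parameters compute the same function")
```

With d_hidden = 8 there are already 8! = 40,320 such equivalent parameter settings from permutations alone.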
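Group 2's continuous symmetries can be illustrated the same way. The sketch below (same hypothetical network shape, biases omitted for brevity) uses the positive homogeneity of ReLU: scaling a hidden neuron's incoming weights by any α > 0 and its outgoing weights by 1/α traces out a continuous family of parameters that all compute the same function.

```python
import numpy as np

rng = np.random.default_rng(1)
d_in, d_hidden, d_out = 4, 8, 3
W1 = rng.normal(size=(d_hidden, d_in))
W2 = rng.normal(size=(d_out, d_hidden))

def mlp(x, W1, W2):
    return W2 @ np.maximum(W1 @ x, 0.0)

# relu(alpha * z) = alpha * relu(z) for alpha > 0, so a per-neuron
# rescaling of incoming weights cancels against an inverse rescaling
# of the outgoing weights.
alpha = rng.uniform(0.1, 10.0, size=d_hidden)
W1s = alpha[:, None] * W1   # scale each neuron's incoming weights
W2s = W2 / alpha[None, :]   # inverse-scale its outgoing weights

x = rng.normal(size=d_in)
assert np.allclose(mlp(x, W1, W2), mlp(x, W1s, W2s))
print("every alpha > 0 gives an equivalent network on a continuous orbit")
```

Because α ranges over a continuum, any minimum of the loss extends along this orbit into a flat manifold of equal-loss points, as Group 3 describes.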
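For Group 4, "equal loss, different gradients" means that moving along a symmetry orbit can change the gradient without changing the loss; the optimization literature calls exploiting this "teleportation". The sketch below is an illustration under the same hypothetical setup, a two-layer ReLU network with a squared loss and gradients written out by hand; it is not the article's algorithm.

```python
import numpy as np

rng = np.random.default_rng(2)
d_in, d_hidden, d_out = 4, 8, 3
W1 = rng.normal(size=(d_hidden, d_in))
W2 = rng.normal(size=(d_out, d_hidden))
x, y = rng.normal(size=d_in), rng.normal(size=d_out)

def loss_and_grads(W1, W2):
    z = W1 @ x
    h = np.maximum(z, 0.0)
    r = W2 @ h - y                              # residual
    loss = 0.5 * r @ r
    gW2 = np.outer(r, h)                        # dL/dW2
    gW1 = np.outer((W2.T @ r) * (z > 0), x)     # dL/dW1 (ReLU mask)
    return loss, gW1, gW2

# "Teleport" along the scaling orbit: loss is identical, gradient is not.
alpha = rng.uniform(0.5, 2.0, size=d_hidden)
W1t, W2t = alpha[:, None] * W1, W2 / alpha[None, :]

l0, g1, g2 = loss_and_grads(W1, W2)
l1, g1t, g2t = loss_and_grads(W1t, W2t)
print(f"loss before/after teleport: {l0:.6f} / {l1:.6f}")   # equal
print(f"grad norm before/after:     "
      f"{np.hypot(np.linalg.norm(g1), np.linalg.norm(g2)):.4f} / "
      f"{np.hypot(np.linalg.norm(g1t), np.linalg.norm(g2t)):.4f}")  # differ
```

A teleportation-style optimizer would search over α for the point on the orbit with the steepest (or otherwise most favorable) gradient before taking the next descent step.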
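A standard example of Group 5's conserved quantities, not spelled out in the summary itself, comes from a two-layer linear network f(x) = W2 W1 x: under gradient flow the matrix W1 W1ᵀ − W2ᵀ W2 is exactly conserved, and under small-step gradient descent it drifts only slowly. The sketch below verifies this numerically on hypothetical random data.

```python
import numpy as np

rng = np.random.default_rng(3)
d_in, d_hidden, d_out, n = 4, 6, 3, 32
X = rng.normal(size=(d_in, n))
Y = rng.normal(size=(d_out, n))
W1 = 0.1 * rng.normal(size=(d_hidden, d_in))
W2 = 0.1 * rng.normal(size=(d_out, d_hidden))

def conserved(W1, W2):
    # Quantity tied to the continuous (GL) symmetry of the linear network:
    # under gradient flow, d/dt (W1 W1^T - W2^T W2) = 0.
    return W1 @ W1.T - W2.T @ W2

Q0 = conserved(W1, W2)
lr = 1e-3
for _ in range(2000):
    E = W2 @ W1 @ X - Y              # residuals of f(x) = W2 W1 x
    gW2 = E @ (W1 @ X).T             # dL/dW2 for L = 0.5 * ||E||^2
    gW1 = W2.T @ E @ X.T             # dL/dW1
    W1 -= lr * gW1
    W2 -= lr * gW2

drift = np.linalg.norm(conserved(W1, W2) - Q0)
print(f"conserved-quantity drift after training: {drift:.2e}")  # small, O(lr)
```

Such invariants pin down which point on an orbit training converges to, which is one way symmetry shapes the implicit bias mentioned above.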
Peking University Alumna Wang Hong Will Become a Permanent Professor at France's Institut des Hautes Études Scientifiques! Two-Thirds of Her Predecessors Were Fields Medalists
量子位 · 2025-05-28 05:59
Core Viewpoint
- The article highlights the appointment of Chinese mathematician Wang Hong, known for solving the Kakeya conjecture, as a permanent professor at the Institut des Hautes Études Scientifiques (IHES) in France, a significant milestone for her career and for the mathematics community [1][2][10]

Group 1: Appointment Details
- Wang Hong will officially join IHES on September 1, 2025, while also holding a position as professor of mathematics at New York University's Courant Institute of Mathematical Sciences [6]
- IHES currently has only seven permanent professors, five of them prominent mathematicians, including two Fields Medal winners [3][4]

Group 2: Academic Background
- Wang Hong was born in 1991 in Guilin, Guangxi; she showed exceptional academic ability from a young age and entered Peking University at 16 [15]
- She earned her bachelor's degree in mathematics in 2011, followed by an engineering degree from École Polytechnique and a master's degree from Paris XI University in 2014, and completed her PhD at MIT in 2019 [16]

Group 3: Research Contributions
- Together with UBC mathematics associate professor Joshua Zahl, Wang Hong solved the Kakeya conjecture, a long-standing problem with implications across fields such as harmonic analysis and number theory [10][12]
- In three dimensions, the conjecture asserts that a set containing unit-length line segments in every direction must have Minkowski and Hausdorff dimensions equal to three (see the formal statement below) [11]

Group 4: Community Reception
- The announcement was met with enthusiasm in the mathematics community, with notable figures voicing support and anticipation for her contributions [7][9]
- Many believe this result positions her as a strong candidate for the Fields Medal [14]
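For reference, the three-dimensional statement summarized in Group 3 can be written formally as follows; this is a standard formulation reconstructed from the description in the text, not quoted from the article.

```latex
% Kakeya set conjecture in R^3, as described above: a set K containing a
% unit line segment in every direction must be full-dimensional.
\[
  \bigl(\forall\, e \in S^{2}\ \exists\, x \in \mathbb{R}^{3}:\
    \{\, x + t e : t \in [0,1] \,\} \subseteq K\bigr)
  \;\Longrightarrow\;
  \dim_{\mathrm{Mink}}(K) = \dim_{\mathrm{Haus}}(K) = 3 .
\]
```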