Do large models grow brains? Research finds LLM middle layers spontaneously mimic human brain evolution
机器之心·2026-01-15 00:53

Core Insights
- The article discusses the emergence of a "Synergistic Core" structure in large language models (LLMs), similar to the human brain's organization [1][2][17].
- The research indicates that this structure is not inherent to the Transformer architecture but develops through the learning process [18][19].

Model Analysis
- Researchers applied the Partial Information Decomposition (PID) framework to models such as Gemma, Llama, Qwen, and DeepSeek, revealing strong synergistic processing in the middle layers, while lower and upper layers exhibited redundancy [5][6][8]; a minimal PID sketch appears after this summary.
- The study used cognitive tasks across six categories; the models generated responses and the corresponding activation values were analyzed [9][10] (see the activation-collection sketch below).

Experimental Methodology
- The Integrated Information Decomposition (ΦID) framework was applied to quantify interactions between attention heads, yielding a Synergy-Redundancy Rank that indicates whether components aggregate signals independently or integrate them deeply [12][13] (see the ranking sketch below).

Findings on Spatial Distribution
- The experiments revealed a consistent "inverted U-shape" curve in the distribution of synergy across different model architectures, indicating a common organizational pattern [14].
- This pattern suggests that synergistic processing may be a computational necessity for achieving advanced intelligence, paralleling the human brain's structure [17].

Core Structure Characteristics
- The "Redundant Periphery" consists of early and late layers with low synergy that handle basic tasks, while the "Synergistic Core" in the middle layers shows high synergy and is crucial for advanced semantic integration and reasoning [21][23].
- The Synergistic Core is identified as a hallmark of the model's capabilities, exhibiting high global efficiency for rapid information integration [23].

Validation of Synergistic Core
- Ablation experiments showed that removing high-synergy nodes caused significant performance declines, confirming the Synergistic Core as a driving force behind model intelligence [25] (an ablation sketch follows below).
- Fine-tuning experiments showed that training focused on the Synergistic Core produced larger performance gains than training on redundant nodes [27] (a selective fine-tuning sketch follows below).

Implications for AI and Neuroscience
- Identifying the Synergistic Core can aid in designing more efficient compression algorithms and targeted parameter updates that accelerate training [29].
- The findings suggest a convergence in the organizational patterns of large models and biological brains, offering insight into the nature of general intelligence [29].
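The summary does not specify exactly which activations were recorded, so the following is a minimal sketch of one way to collect per-layer activations while a model answers task prompts. It assumes a Hugging Face causal LM, uses last-token hidden states as the recorded activation, and takes the model name as a placeholder; none of these choices come from the paper.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def collect_hidden_states(model_name, prompts):
    """Run task prompts through a causal LM and record per-layer activations,
    the raw material for the synergy/redundancy analysis sketched below."""
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, output_hidden_states=True)
    model.eval()
    per_prompt = []
    with torch.no_grad():
        for p in prompts:
            inputs = tok(p, return_tensors="pt")
            out = model(**inputs)
            # out.hidden_states: one (batch, seq, hidden) tensor per layer
            # (plus the embedding layer); keep the last-token vector of each.
            per_prompt.append([h[0, -1].numpy() for h in out.hidden_states])
    return per_prompt
```

Stacking these vectors across prompts yields, for every layer, an (n_prompts, hidden) activation matrix that the information-theoretic sketches below can operate on.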
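The article does not say which redundancy function its PID analysis used. The sketch below uses the widely used minimum-mutual-information (MMI) redundancy together with a joint-Gaussian estimate of mutual information; `gaussian_mi` and `pid_mmi` are illustrative names, and treating two attention-head activations as sources and a downstream activation as target is an assumption, not the paper's exact setup.

```python
import numpy as np

def gaussian_mi(a, b):
    """Mutual information I(a; b) in nats under a joint-Gaussian assumption.

    a, b: arrays of shape (n_samples, d_a) and (n_samples, d_b).
    Uses I = 0.5 * (log det(Sigma_a) + log det(Sigma_b) - log det(Sigma_ab)).
    """
    ab = np.hstack([a, b])
    cov = np.cov(ab, rowvar=False)
    da = a.shape[1]
    # slogdet is numerically safer than det for high-dimensional activations
    _, logdet_a = np.linalg.slogdet(cov[:da, :da])
    _, logdet_b = np.linalg.slogdet(cov[da:, da:])
    _, logdet_ab = np.linalg.slogdet(cov)
    return 0.5 * (logdet_a + logdet_b - logdet_ab)

def pid_mmi(x1, x2, y):
    """Partial information decomposition of I(x1, x2; y) with MMI redundancy.

    Returns redundancy, unique_1, unique_2, synergy (all in nats), so that
    red + unq1 + unq2 + syn equals the joint mutual information.
    """
    i1 = gaussian_mi(x1, y)
    i2 = gaussian_mi(x2, y)
    i_joint = gaussian_mi(np.hstack([x1, x2]), y)
    red = min(i1, i2)              # information both sources carry about y
    unq1 = i1 - red                # unique to source 1
    unq2 = i2 - red                # unique to source 2
    syn = i_joint - i1 - i2 + red  # information only available jointly
    return red, unq1, unq2, syn
```

A large `syn` relative to `red` means the two sources carry information about the target that is only available when they are read together, which is the sense of "synergy" used throughout the article.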
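The Synergy-Redundancy Rank is not defined in this summary; one plausible reading, sketched here on top of `pid_mmi` from the previous block, scores each layer by its average pairwise synergy minus average pairwise redundancy across attention-head activations and then ranks layers by that gap. The names and the pair-subsampling heuristic are assumptions for illustration.

```python
from itertools import combinations
import numpy as np

def layer_synergy_redundancy(head_acts, target, max_pairs=50, seed=0):
    """Mean pairwise synergy and redundancy of one layer's heads w.r.t. a target.

    head_acts: list of (n_samples, d) arrays, one per attention head.
    target:    (n_samples, d_t) array, e.g. pooled activations of the next layer.
    """
    rng = np.random.default_rng(seed)
    pairs = list(combinations(range(len(head_acts)), 2))
    if len(pairs) > max_pairs:  # subsample head pairs to keep the estimate cheap
        idx = rng.choice(len(pairs), size=max_pairs, replace=False)
        pairs = [pairs[i] for i in idx]
    syn_vals, red_vals = [], []
    for i, j in pairs:
        red, _, _, syn = pid_mmi(head_acts[i], head_acts[j], target)
        syn_vals.append(syn)
        red_vals.append(red)
    return float(np.mean(syn_vals)), float(np.mean(red_vals))

def synergy_redundancy_rank(per_layer):
    """Rank layer indices by (synergy - redundancy): the top of the ranking is
    the candidate 'Synergistic Core', the bottom the 'Redundant Periphery'."""
    gap = {layer: syn - red for layer, (syn, red) in per_layer.items()}
    return sorted(gap, key=gap.get, reverse=True)
```

Plotting the per-layer mean synergy against depth is where the "inverted U-shape" described above would show up: low in the first and last layers, peaking in the middle of the stack.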
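The ablation experiments are only described at a high level, so this is a sketch of one common way to knock out specific attention heads: masking their slice of the input to each layer's output projection with a forward pre-hook. It assumes a Llama-style module layout (`model.model.layers[i].self_attn.o_proj`) with head outputs laid out contiguously before `o_proj`; other architectures need different paths.

```python
import torch

def ablate_heads(model, heads_to_ablate, head_dim):
    """Zero out selected attention heads by masking the input of o_proj.

    heads_to_ablate: dict mapping layer index -> list of head indices.
    Returns the hook handles so the ablation can be undone with handle.remove().
    """
    handles = []
    for layer_idx, heads in heads_to_ablate.items():
        o_proj = model.model.layers[layer_idx].self_attn.o_proj

        def pre_hook(module, args, heads=tuple(heads)):
            hidden = args[0].clone()  # (batch, seq, n_heads * head_dim)
            for h in heads:
                hidden[..., h * head_dim:(h + 1) * head_dim] = 0.0
            return (hidden,) + args[1:]

        handles.append(o_proj.register_forward_pre_hook(pre_hook))
    return handles
```

Evaluating the same task suite with high-synergy heads masked, then with an equal number of low-synergy or random heads masked (removing the hooks between runs), reproduces the comparison the article describes: the high-synergy ablation should hurt much more.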
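The fine-tuning comparison can be approximated by restricting which parameters receive gradients. The helper below freezes everything outside a chosen set of transformer layers; the `.layers.<idx>.` naming pattern is a Llama-style assumption, and the split into "core" versus "peripheral" layers is whatever the synergy ranking above suggests.

```python
def freeze_except_layers(model, trainable_layers):
    """Freeze all parameters except those inside the given transformer layers,
    concentrating gradient updates on the hypothesized Synergistic Core.

    trainable_layers: iterable of layer indices, e.g. the middle third of the stack.
    """
    wanted = tuple(f".layers.{i}." for i in trainable_layers)
    for name, param in model.named_parameters():
        param.requires_grad = any(tag in name for tag in wanted)

# Example: fine-tune only the middle third of a 32-layer model, then repeat
# with the outer layers trainable instead, and compare downstream gains.
# freeze_except_layers(model, trainable_layers=range(11, 22))
```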
