OpenAI suddenly open-sources a new model: 99.9% of the weights are zero, with a new sparsity method to replace MoE
36Ke·2025-12-15 03:29

Core Insights
- The article discusses the open-source release of Circuit Sparsity technology, which aims to improve the interpretability of large language models by imposing a sparse structure that makes internal decision-making processes easier to understand [2][4].

Group 1: Circuit Sparsity Technology
- Circuit Sparsity is a variant of large language models that enforces sparsity on internal connections, making the model's computation easier to follow and interpret (a minimal illustration appears after this summary) [4].
- The technology targets the "black box" problem of traditional dense Transformers, giving clearer insight into how the AI reaches its decisions and reducing reliance on potentially misleading outputs [4][10].

Group 2: Comparison with MoE Models
- The article suggests that the extreme sparsity and functional decoupling of Circuit Sparsity may threaten the current popularity of Mixture of Experts (MoE) models, which rely on a coarser approximation of sparsity [5][12].
- MoE models face challenges such as fragmented feature flow and knowledge redundancy, whereas Circuit Sparsity allows a more precise dissection of model mechanisms [12][14].

Group 3: Performance and Efficiency
- Experimental data indicate that the task-specific circuits of the sparse models are 16 times smaller than those of dense models at the same pre-training loss, allowing individual logical steps to be tracked precisely [12].
- However, Circuit Sparsity currently has significant drawbacks, including extremely high computational cost, roughly 100 to 1,000 times that of traditional dense models [14].

Group 4: Future Directions
- The research team plans to scale the technology to larger models to uncover more complex reasoning circuits, describing this work as an early step in exploring AI interpretability [14][16].
- Two potential routes around the training-efficiency problem of sparse models are identified: extracting sparse circuits from existing dense models, and optimizing the training mechanisms of new, interpretable sparse models [16].
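To make the headline figure concrete, the sketch below shows one common way weight-level sparsity can be imposed on a single layer: keep only the largest-magnitude fraction of weight entries (here 0.1%, matching the "99.9% of weights are zero" claim) and zero out the rest. This is a minimal PyTorch illustration of the general idea, not OpenAI's released implementation; the WeightSparseLinear class, the per-layer 0.1% density target, and the magnitude-based top-k mask are assumptions made for demonstration.

```python
import torch
import torch.nn as nn


class WeightSparseLinear(nn.Module):
    """Hypothetical linear layer whose weight matrix is constrained to a
    fixed fraction of nonzero entries (illustrating weight-level sparsity)."""

    def __init__(self, in_features: int, out_features: int, density: float = 0.001):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)
        self.bias = nn.Parameter(torch.zeros(out_features))
        self.density = density  # 0.001 -> 99.9% of the entries are forced to zero

    def weight_mask(self) -> torch.Tensor:
        # Keep only the largest-magnitude entries; everything else is zeroed.
        k = max(1, int(self.weight.numel() * self.density))
        flat = self.weight.abs().flatten()
        threshold = torch.topk(flat, k).values.min()
        return (self.weight.abs() >= threshold).float()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Each output unit depends on only a handful of inputs, which is what
        # makes individual "circuits" easier to trace through the network.
        return nn.functional.linear(x, self.weight * self.weight_mask(), self.bias)


if __name__ == "__main__":
    layer = WeightSparseLinear(512, 512, density=0.001)
    mask = layer.weight_mask()
    print(f"nonzero weights: {mask.mean().item():.4%}")  # ~0.10%, i.e. ~99.9% zeros
    y = layer(torch.randn(4, 512))
    print(y.shape)  # torch.Size([4, 512])
```

An MoE layer, by contrast, is sparse only at the granularity of whole experts (a router activates a few expert networks per token), which is the coarser form of sparsity the article contrasts with per-weight sparsity.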