Core Insights
- OpenAI has introduced a new model called Circuit-Sparsity, which aims to make AI models more interpretable through a sparse architecture in which 99.9% of the weights are zero, leaving only 0.1% non-zero [1][2][11]

Group 1: Model Characteristics
- The Circuit-Sparsity model is a sparse Transformer that simplifies the internal workings of AI, addressing the "black box" nature of large language models (LLMs) [1][6]
- The model's architecture allows compact, readable "circuits" to form, significantly reducing the complexity of tracing the model's decision-making [11][18]
- Compared to traditional dense models, the sparse model's circuits are roughly 16 times smaller while task performance remains similar [11][13]

Group 2: Technical Innovations
- Key technical methods include dynamic pruning, activation sparsification, and architectural adjustments that maintain sparsity without sacrificing performance (a generic pruning step is sketched after this summary) [10][11]
- The model employs a new activation function, AbsTopK, which retains only the top 25% of activation values in critical areas, enhancing interpretability (see the sketch below) [10]

Group 3: Performance and Limitations
- Despite its interpretability advantages, the sparse model is significantly slower, running 100 to 1000 times slower than dense models because of computational-efficiency bottlenecks [4][17]
- OpenAI has proposed "Bridges" networks to let sparse and dense models interact, so that modifications made in the sparse model can be reflected in the dense model (a toy sketch appears below) [17][18]

Group 4: Future Directions
- OpenAI plans to extend the technique to larger models and to further explore the logic behind various model behaviors [18]
- Future research will focus on extracting sparse circuits from existing dense models and on developing more efficient training techniques for interpretable models [18]
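As a rough illustration of the 99.9% weight sparsity described above, here is a minimal magnitude-pruning sketch in PyTorch. It shows one generic way to zero all but the largest 0.1% of a weight matrix; the function name and `keep_frac` parameter are illustrative, and this is not OpenAI's actual training procedure.

```python
import torch

def magnitude_prune(w: torch.Tensor, keep_frac: float = 0.001) -> torch.Tensor:
    """Zero out all but the largest-magnitude `keep_frac` of the entries.

    A generic magnitude-pruning step (hypothetical helper), keeping 0.1%
    of the weights non-zero, matching the sparsity level described above.
    """
    k = max(1, int(w.numel() * keep_frac))  # number of weights to keep
    # Threshold = the k-th largest absolute value in the whole matrix.
    threshold = w.abs().flatten().kthvalue(w.numel() - k + 1).values
    return torch.where(w.abs() >= threshold, w, torch.zeros_like(w))

# Example: prune a 1024x1024 weight matrix down to ~0.1% non-zeros.
w = torch.randn(1024, 1024)
w_sparse = magnitude_prune(w)
print((w_sparse != 0).float().mean())  # prints roughly 0.001
```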
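The AbsTopK activation mentioned in Group 2 can be sketched as follows, assuming (from the name) that it keeps the activations with the largest absolute values along the feature dimension, preserving their signs and zeroing the rest; OpenAI's exact definition may differ.

```python
import torch

def abs_topk(x: torch.Tensor, k: int) -> torch.Tensor:
    """Keep the k activations with the largest absolute value along the
    last dimension, zeroing the rest (signs of survivors are preserved)."""
    _, idx = x.abs().topk(k, dim=-1)       # indices of the top-k |values|
    mask = torch.zeros_like(x, dtype=torch.bool)
    mask.scatter_(-1, idx, True)           # mark the surviving positions
    return x * mask

# Example: retain the top 25% of a 16-dim activation vector (4 values).
x = torch.randn(2, 16)
print(abs_topk(x, k=4))
```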
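For the "Bridges" idea in Group 3, the summary only states that sparse and dense models interact so that edits made in the sparse model carry over to the dense one. Below is a toy sketch under the assumption that a bridge is a pair of learned linear maps between the two models' hidden states at one layer; the class name, shapes, and coupling scheme are all hypothetical, not OpenAI's published design.

```python
import torch
import torch.nn as nn

class Bridge(nn.Module):
    """Hypothetical bridge between one layer of a dense model and the
    corresponding layer of a sparse model: two learned linear maps."""

    def __init__(self, d_dense: int, d_sparse: int):
        super().__init__()
        self.to_sparse = nn.Linear(d_dense, d_sparse)  # dense -> sparse space
        self.to_dense = nn.Linear(d_sparse, d_dense)   # sparse -> dense space

    def edit_via_sparse(self, h_dense: torch.Tensor, edit_fn) -> torch.Tensor:
        """Map a dense hidden state into the sparse model's space, apply an
        interpretable edit there, and map the result back."""
        h_sparse = self.to_sparse(h_dense)
        return self.to_dense(edit_fn(h_sparse))

# Example: ablate one sparse feature (index 7, chosen arbitrarily) and
# push the resulting change back into the dense hidden state.
bridge = Bridge(d_dense=512, d_sparse=2048)
h = torch.randn(1, 512)

def ablate_feature(h_sparse: torch.Tensor) -> torch.Tensor:
    h_sparse = h_sparse.clone()
    h_sparse[..., 7] = 0.0
    return h_sparse

h_edited = bridge.edit_via_sparse(h, ablate_feature)
```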
OpenAI open-sources again: at just 0.4B parameters, a major slim-down for models
36Kr·2025-12-15 08:14