Investment Rating
- The report takes a positive view of the industry, driven by advances in Native Sparse Attention (NSA) technology, which is expected to lower model-training costs and improve algorithmic efficiency in AI models [1].

Core Insights
- NSA represents a breakthrough in long-context processing, a critical bottleneck in large-model development. In the standard softmax attention architecture, attention computation accounts for 70%-80% of total decoding latency at 64k context length. NSA uses three parallel attention branches to improve efficiency while maintaining performance comparable to full-attention models [1].
- Reduced computational requirements for pre-training large models will democratize AI technology, enabling more small and medium-sized enterprises to participate in foundational AI development and broadening the market beyond a handful of tech giants [1].
- Stronger long-text processing will open new application scenarios and drive business-model innovation. NSA enables models to handle entire books, code repositories, or extended customer-service dialogues, significantly expanding AI's reach in document analysis and code generation [2].

Summary by Sections
- Investment Rating: Positive outlook on the industry due to NSA advancements [1].
- Long Context Processing: NSA improves efficiency on long contexts, addressing a key bottleneck in large-model development [1].
- Democratization of AI: Lower computational barriers will enable broader participation in AI development beyond the major tech companies [1].
- New Application Scenarios: Enhanced long-text processing will foster new business models and market opportunities [2].
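To make the "three parallel attention branches" concrete: NSA combines a compressed (block-summary) branch, a fine-grained selected-block branch, and a sliding-window branch, gated together per query. The toy sketch below illustrates that structure only; the branch definitions, block sizes, and fixed gates are simplifying assumptions for illustration, not the paper's actual learned kernels.

```python
# Illustrative three-branch sparse attention in the spirit of NSA.
# Assumptions: single head, single query, block-mean compression, fixed gates.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend(q, K, V):
    # Scaled dot-product attention of one query against a key/value set.
    scores = (K @ q) / np.sqrt(q.shape[-1])
    return softmax(scores) @ V

def nsa_like_attention(q, K, V, block=4, top_blocks=2, window=8):
    T, d = K.shape
    nb = T // block
    # Branch 1: compressed attention over coarse block-mean summaries.
    Kc = K[:nb * block].reshape(nb, block, d).mean(axis=1)
    Vc = V[:nb * block].reshape(nb, block, d).mean(axis=1)
    out_cmp = attend(q, Kc, Vc)
    # Branch 2: fine-grained attention over only the top-scoring blocks.
    block_scores = Kc @ q
    top = np.argsort(block_scores)[-top_blocks:]
    idx = np.concatenate([np.arange(b * block, (b + 1) * block) for b in top])
    out_sel = attend(q, K[idx], V[idx])
    # Branch 3: sliding-window attention over the most recent tokens.
    out_win = attend(q, K[-window:], V[-window:])
    # Gated combination (learned per-query in the real model; fixed here).
    g = np.array([1 / 3, 1 / 3, 1 / 3])
    return g[0] * out_cmp + g[1] * out_sel + g[2] * out_win

rng = np.random.default_rng(0)
T, d = 32, 16
q = rng.standard_normal(d)
K = rng.standard_normal((T, d))
V = rng.standard_normal((T, d))
out = nsa_like_attention(q, K, V)
print(out.shape)  # (16,)
```

The efficiency intuition is visible even in this toy: each branch attends over far fewer than T positions (8 block summaries + 8 selected tokens + 8 window tokens instead of all 32), which is what cuts the attention share of decoding latency at long context lengths.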
Guotai Junan Computer Sector | DeepSeek's NSA Architecture Leads an AI Efficiency Revolution
Guotai Junan Securities·2025-02-21 02:03