Huawei Ascend and Cambricon Announce Support for DeepSeek's Latest Model

Core Viewpoint
- DeepSeek has officially released the V3.2-Exp model, which introduces the DeepSeek Sparse Attention (DSA) mechanism. DSA improves training and inference efficiency on long texts, and DeepSeek API prices have been cut by more than 50% [1][5].

Group 1: Model Development
- V3.2-Exp builds on V3.1-Terminus and adds DSA, a sparse attention approach that reduces computational complexity when processing long texts [1][4].
- DSA adaptively selects key attention heads and local context windows, which improves efficiency and lowers cost compared with traditional dense attention [3][4].

Group 2: Cost and Accessibility
- With the new model, the price of accessing the DeepSeek API has dropped by more than 50% [5].
- DeepSeek is temporarily keeping an additional API endpoint for the previous V3.1-Terminus model available until October 15 so that users can run comparative tests [2].

Group 3: Open Source and Community Engagement
- DeepSeek has fully open-sourced the V3.2-Exp model on platforms such as HuggingFace and ModelScope, along with the related research paper [2].
- The company has also open-sourced the TileLang version of the operators, which has drawn significant industry attention [1][6].

Group 4: Hardware Compatibility
- Following the release of V3.2-Exp, major domestic hardware vendors including Huawei, Cambricon, and Hygon (Haiguang) have announced support for the new model, a sign of coordinated development across the domestic AI ecosystem [6][10].
- TileLang, a programming language designed to simplify GPU operator development, is recommended for research experiments because it makes AI operator development more efficient [7][10].
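
To make the contrast between dense and sparse attention under Model Development concrete, here is a minimal sketch of generic top-k sparse attention for a single query, using only the Python standard library. It is not DeepSeek's DSA implementation: the function names `dense_attention`, `topk_sparse_attention`, and `_weighted_sum`, the parameter `k`, and the choice of "keep the k highest-scoring keys" as the selection rule are all assumptions made for illustration.

```python
import math


def dense_attention(query, keys, values):
    """Standard softmax attention for one query over every key."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * x for q, x in zip(query, key)) for key in keys]
    return _weighted_sum(scores, values)


def topk_sparse_attention(query, keys, values, k):
    """Attend only to the k highest-scoring keys (illustrative, not DSA itself)."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * x for q, x in zip(query, key)) for key in keys]
    # Indices of the k keys with the largest scores; all other keys are ignored.
    keep = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
    return _weighted_sum([scores[i] for i in keep], [values[i] for i in keep])


def _weighted_sum(scores, values):
    """Softmax the scores, then return the weighted average of the value vectors."""
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total for d in range(dim)]
```

With `k` equal to the number of keys, the sparse version reduces to dense attention; with a small `k`, the softmax and the weighted sum touch only `k` value vectors instead of all of them. Note that this sketch still scores every key before selecting, so a practical design would need a cheaper selection step to realize the savings on long texts.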