Core Viewpoint
- The release of the DeepSeek-V3.2-Exp model on the Hugging Face platform introduces a sparse attention architecture that reduces computational resource consumption and improves inference efficiency [1]

Group 1: Model Deployment and Adaptation
- Huawei's Ascend has quickly adapted and deployed the DeepSeek-V3.2-Exp model on the vLLM/SGLang inference frameworks, providing open-source inference code and operator implementations for developers [1]
- Cambricon announced adaptation of the latest DeepSeek-V3.2-Exp model and has open-sourced its vLLM-MLU inference engine source code; the model's new DeepSeek Sparse Attention mechanism significantly reduces training and inference costs in long-sequence scenarios [1]
- Haiguang Information announced seamless adaptation and deep optimization for its DCUs, achieving "zero-wait" deployment of large-model computing power and demonstrating strong performance of DeepSeek-V3.2-Exp on the Haiguang DCU [1]
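The brief does not describe how DeepSeek Sparse Attention works internally, but the long-sequence cost savings it cites come from the general idea of sparse attention: each query attends to only a small subset of keys instead of all of them. Below is a minimal, hypothetical top-k sparse attention sketch in NumPy for illustration only; it is not DeepSeek's actual design (the function name and parameters are assumptions).

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def topk_sparse_attention(q, k, v, top_k):
    """Illustrative sparse attention: each query attends only to its
    top_k highest-scoring keys (ties may admit a few extra keys).

    Dense attention does O(L^2) softmax/weighted-sum work over a length-L
    sequence; restricting each query to top_k keys cuts that part to
    O(L * top_k), which is where long-sequence savings come from.
    (This toy version still computes the full score matrix; a real
    kernel avoids that with an indexing/selection stage.)
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                     # (Lq, Lk) scaled dot products
    # Per row, find the top_k-th largest score and mask everything below it.
    kth = np.partition(scores, -top_k, axis=-1)[:, -top_k:].min(axis=-1, keepdims=True)
    masked = np.where(scores >= kth, scores, -np.inf)  # non-selected keys get zero weight
    return softmax(masked, axis=-1) @ v

rng = np.random.default_rng(0)
L, d = 16, 8
q, k, v = (rng.normal(size=(L, d)) for _ in range(3))
out = topk_sparse_attention(q, k, v, top_k=4)
print(out.shape)  # (16, 8)
```

As a sanity check on the sketch, setting `top_k` equal to the sequence length keeps every key and reduces it to ordinary dense attention.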
Cambricon and Huawei Ascend Adapt DeepSeek's Latest Model
Cailianshe (CLS) · 2025-09-30 00:59