Core Insights

- DeepSeek has launched the DeepSeek-V3.2-Exp model on Hugging Face, introducing the DeepSeek Sparse Attention (DSA) mechanism to improve training and inference efficiency on long texts [1][3]
- Huawei Cloud has adapted DeepSeek-V3.2-Exp, supporting a maximum context length of 160K tokens [2]
- DSA significantly improves training and inference efficiency in long-text scenarios with minimal impact on model output [3] (a minimal illustrative sketch follows this list)
- The training configuration of DeepSeek-V3.2-Exp was strictly aligned with the previous version, V3.1-Terminus, and the two models perform comparably across various benchmarks [5]
- The new model cuts API costs by more than 50%, with the price adjustments taking effect immediately [8]
- DeepSeek has fully open-sourced DeepSeek-V3.2-Exp on Hugging Face and ModelScope, together with the accompanying research paper [9]
- API access to V3.1-Terminus remains available for comparison purposes until October 15, 2025 [9]
- DeepSeek has also open-sourced GPU operators designed for the new model, recommending the TileLang version for research experiments [10]
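As a rough illustration of the sparse-attention idea behind DSA, the sketch below implements a generic top-k sparse attention step in Python (PyTorch). This is not DeepSeek's implementation: the function name, tensor shapes, and the use of the full score matrix for token selection are assumptions made for clarity, and DSA's actual selection mechanism and the open-sourced TileLang/CUDA kernels differ.

```python
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, q_len=None, k=None, v=None, top_k=64):
    """Illustrative top-k sparse attention (hypothetical; not DeepSeek's DSA).

    Each query attends only to its top_k highest-scoring keys, so the
    softmax and value aggregation cost scales with top_k rather than
    with the full sequence length.

    Shapes: q is (batch, n_q, d); k and v are (batch, n_kv, d).
    """
    d = q.size(-1)
    # Dense scores are computed here purely for clarity; a production
    # sparse-attention design would use a cheap selector to avoid this step.
    scores = q @ k.transpose(-2, -1) / d ** 0.5            # (batch, n_q, n_kv)
    top_k = min(top_k, k.size(-2))
    top_scores, top_idx = scores.topk(top_k, dim=-1)       # keep k best keys per query
    weights = F.softmax(top_scores, dim=-1)                # softmax over kept keys only
    # Gather the selected value vectors: (batch, n_q, top_k, d)
    v_sel = torch.gather(
        v.unsqueeze(1).expand(-1, q.size(1), -1, -1),
        2,
        top_idx.unsqueeze(-1).expand(-1, -1, -1, d),
    )
    return (weights.unsqueeze(-1) * v_sel).sum(dim=2)      # (batch, n_q, d)

if __name__ == "__main__":
    q = torch.randn(1, 8, 32)
    k = torch.randn(1, 1024, 32)
    v = torch.randn(1, 1024, 32)
    out = topk_sparse_attention(q, k=k, v=v, top_k=16)
    print(out.shape)  # torch.Size([1, 8, 32])
```

The point of the sketch is the cost structure: once each query is restricted to a small, fixed set of keys, the expensive softmax-and-aggregate step no longer grows with context length, which is the general mechanism by which sparse attention reduces long-context training and inference cost.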
A big move ahead of National Day: DeepSeek-V3.2-Exp released and open-sourced, with API costs to fall by more than 50%
Source: 华尔街见闻 (Wallstreetcn) · 2025-09-29 11:12