Core Insights
- DeepSeek has released DeepSeek-V3.2-Exp, an experimental version that introduces Sparse Attention to improve training and inference efficiency on long texts [1] (see the illustrative sketch after Group 2)
- API pricing has been reduced by over 50%, passing on the significant cost savings from the new model [1]
- Cambricon has adapted to DeepSeek-V3.2-Exp and open-sourced the vLLM-MLU inference engine, allowing developers to try the new model on its platform [1][2]
- Huawei Ascend has also quickly adapted to DeepSeek-V3.2-Exp, open-sourcing all inference code and achieving optimized deployment on the CANN platform [3]

Group 1
- DeepSeek-V3.2-Exp is an experimental version that builds on the previous V3.1-Terminus, focusing on optimizing long-text processing [1]
- The API price cut is a strategic move to boost developer engagement and usage [1]
- Cambricon's rapid adaptation to the new model demonstrates its commitment to software-ecosystem development and performance optimization [2]

Group 2
- Huawei's deployment of the DeepSeek-V3.2-Exp BF16 model showcases its ability to handle long-sequence processing with low latency and high throughput [3]
- The continuous iteration of DeepSeek models indicates a proactive approach to incorporating user feedback and improving model performance [3]
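The efficiency gain behind Sparse Attention comes from each query attending to only a small subset of keys rather than the full sequence. The summary does not describe DeepSeek's exact mechanism, so the sketch below is a generic top-k sparse attention in NumPy, offered purely as an illustration; the function name `topk_sparse_attention` and all parameters are assumptions for this example, not DeepSeek's implementation.

```python
import numpy as np

def topk_sparse_attention(Q, K, V, k=4):
    """Illustrative top-k sparse attention: each query attends only to
    its k highest-scoring keys instead of the full sequence.
    Q, K, V: (seq_len, d) arrays. Returns (seq_len, d).
    Generic sketch only -- not DeepSeek's Sparse Attention."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)  # dense (seq_len, seq_len) scores, for clarity
    # Keep only the top-k scores per query; mask out everything else.
    kth = np.partition(scores, -k, axis=-1)[:, -k][:, None]
    masked = np.where(scores >= kth, scores, -np.inf)
    # Softmax over the surviving entries only.
    weights = np.exp(masked - masked.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
seq_len, d = 16, 8
Q, K, V = (rng.standard_normal((seq_len, d)) for _ in range(3))
out = topk_sparse_attention(Q, K, V, k=4)
print(out.shape)  # (16, 8)
```

Note that this sketch still materializes the full score matrix for readability, so it is O(n^2) in sequence length; production sparse-attention kernels avoid computing most of those scores in the first place, which is where the actual training and inference savings on long texts come from.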
DeepSeek, new version