DeepSeek-R2 Release Imminent: Parameter Count Doubles, Huawei Ascend Chip Utilization Hits 82%!
Sohu Caijing · 2025-04-29 07:17
Core Insights
- The next-generation AI model DeepSeek-R2 is reportedly set to be released, featuring an expanded parameter count and a new architecture [1][5]
- DeepSeek-R2 will use a mixture-of-experts (MoE) architecture with an intelligent gating network, significantly improving performance on high-load inference tasks [5]
- The total parameter count of DeepSeek-R2 is expected to reach 1.2 trillion, roughly double the 671 billion parameters of DeepSeek-R1, putting it in the same class as GPT-4 Turbo and Google's Gemini 2.0 Pro [5]

Cost Efficiency
- DeepSeek-R2's unit inference cost is projected to be 97.4% lower than GPT-4's, at approximately $0.07 per million tokens, versus a reported $0.27 per million tokens for GPT-4 [8]
- The model's cost efficiency is attributed to its use of Huawei's Ascend 910B chip cluster, which reaches a computational throughput of 512 PetaFLOPS at an 82% resource utilization rate [7][8]

Hardware and Infrastructure
- DeepSeek-R2's training framework runs on Huawei's Ascend 910B chip cluster, which has been validated to deliver 91% of the performance of an NVIDIA A100 training cluster of the previous generation [7]
- Huawei's Ascend 910C chip, now entering mass production, may offer a domestic alternative to NVIDIA's high-end AI chips, strengthening hardware autonomy in China's AI sector [10]
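The mixture-of-experts design described above routes each token through only a small subset of experts, selected by a gating network, so total parameters can grow far faster than per-token compute. A minimal sketch of top-k gated routing, in NumPy; all sizes, the linear "experts", and the top-2 rule are illustrative assumptions, not DeepSeek's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 16, 8, 2  # toy sizes, not R2's real configuration

# Each "expert" is a plain linear map here; real MoE experts are FFN blocks.
experts = [rng.standard_normal((d_model, d_model)) * 0.02 for _ in range(n_experts)]
gate_w = rng.standard_normal((d_model, n_experts)) * 0.02

def moe_forward(x):
    """Route each token to its top-k experts and mix their outputs."""
    logits = x @ gate_w                              # (tokens, n_experts)
    probs = np.exp(logits - logits.max(-1, keepdims=True))
    probs /= probs.sum(-1, keepdims=True)            # softmax over experts
    chosen = np.argsort(probs, axis=-1)[:, -top_k:]  # top-k expert indices per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        w = probs[t, chosen[t]]
        w /= w.sum()                                 # renormalize over chosen experts
        for weight, e in zip(w, chosen[t]):
            out[t] += weight * (x[t] @ experts[e])   # only top_k experts run per token
    return out, chosen

tokens = rng.standard_normal((4, d_model))
y, routing = moe_forward(tokens)
print(y.shape, routing.shape)  # (4, 16) (4, 2)
```

The point of the gate is the compute saving: with 8 experts and top-2 routing, each token activates only a quarter of the expert parameters, which is how a 1.2-trillion-parameter MoE can keep inference cost low.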
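The headline hardware figures can be restated as simple arithmetic. The snippet below only combines numbers reported in the article (512 PetaFLOPS cluster throughput, 82% utilization, 91% of A100-cluster performance), which are rumored and not independently verified:

```python
# Figures as reported in the article (rumored, not independently verified).
cluster_pflops = 512          # Ascend 910B cluster peak throughput, in PetaFLOPS
utilization = 0.82            # reported resource utilization rate
a100_relative = 0.91          # reported performance vs. an NVIDIA A100 cluster

effective_pflops = cluster_pflops * utilization
print(f"effective throughput ~ {effective_pflops:.1f} PFLOPS")  # ~ 419.8 PFLOPS
print(f"per-unit performance ~ {a100_relative:.0%} of an A100 cluster")
```

At 82% utilization the cluster's sustained throughput works out to roughly 420 PetaFLOPS, which is the number the cost-per-token claims rest on.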