"Training Costs This Low? American Peers Fall into Self-Doubt"
Guan Cha Zhe Wang·2025-09-19 11:28

Core Insights
- DeepSeek has achieved a significant breakthrough in AI model training costs, with the DeepSeek-R1 model's training cost reported at only $294,000, substantially lower than the costs disclosed by American competitors [1][2][4]
- The model was trained on 512 NVIDIA H800 chips and has been recognized as the first mainstream large language model to undergo peer review, marking a notable advancement in the field [2][4]
- The cost efficiency of DeepSeek's model challenges the notion that only countries with the most advanced chips can dominate the AI race, as highlighted by various media outlets [1][2][6]

Cost and Performance
- The training cost of DeepSeek-R1 is significantly lower than that of OpenAI's models, which have been reported to exceed $100 million [2][4]
- DeepSeek's approach emphasizes the use of open-source data and efficient training methods, allowing for high performance at a fraction of the cost of traditional models [5][6]

Industry Impact
- The success of DeepSeek-R1 is seen as a potential game-changer in the AI landscape, suggesting that AI competition is shifting from resource quantity to resource efficiency [6][7]
- The model's development has sparked discussions about China's position in the global AI sector, particularly in light of U.S. export restrictions on advanced chips [1][4]

Technical Details
- The latest research paper provides more detailed insight into the training process and acknowledges the use of A100 chips in earlier stages, although the final model was trained exclusively on H800 chips [4][5]
- DeepSeek has defended its use of "distillation" techniques, which are common in the industry, to enhance model performance while reducing costs [5][6]