DeepSeek Responds for the First Time to Allegations of "Distilling OpenAI"

Core Viewpoint
- DeepSeek's R1 model drew significant attention after its paper was published in the journal "Nature." The paper shows that reinforcement learning can improve a model's reasoning capabilities without relying heavily on supervised data [3][11].

Group 1: Model Development and Training
- Training DeepSeek-R1 cost approximately $294,000 in total: $202,000 for R1-Zero training, $10,000 for creating the SFT dataset, and $82,000 for R1 training [10].
- Training ran on 64×8 H800 GPUs and took about 198 hours for R1-Zero and about 80 hours for R1 [10].
- Even when the roughly $6 million spent training the earlier V3 model is added to the $294,000 for R1, the total remains significantly lower than competitors' training costs [10].

Group 2: Model Performance and Validation
- Large-scale reinforcement learning significantly improves the model's reasoning capabilities, even without supervised fine-tuning [13].
- The model's ability to verify and reflect on its own answers improves its performance on complex programming and scientific problems [13].
- According to the research, R1 has become the most popular open-source reasoning model globally, with over 10.9 million downloads on Hugging Face [10].

Group 3: Industry Impact and Peer Review
- Publishing R1 in "Nature" sets a precedent for transparency in AI research and addresses concerns that benchmark tests can be unreliable or manipulated [11].
- The research emphasizes that independent peer review is important for validating the capabilities of AI systems, particularly in an industry facing scrutiny over its performance claims [11].
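
The training-cost figures reported under Group 1 can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it uses just the numbers quoted above (GPU count, approximate hours, and per-stage costs), and the names `GPUS`, `STAGES`, `gpu_hours`, and `implied_rate` are hypothetical helpers introduced here, not anything from DeepSeek's paper. The per-GPU-hour rate it prints is derived from those figures rather than taken from the source.

```python
# Figures as quoted in the summary above; the hours are approximate ("about").
GPUS = 64 * 8  # 64 nodes x 8 H800 GPUs each = 512 GPUs

STAGES = {
    "R1-Zero": {"hours": 198, "cost_usd": 202_000},
    "R1": {"hours": 80, "cost_usd": 82_000},
}
SFT_DATASET_COST_USD = 10_000  # no GPU hours are given for this item


def gpu_hours(hours: float, gpus: int = GPUS) -> float:
    """Total GPU-hours for a run of `hours` wall-clock hours on `gpus` GPUs."""
    return hours * gpus


def implied_rate(cost_usd: float, hours: float, gpus: int = GPUS) -> float:
    """Cost per GPU-hour implied by a reported cost and run length."""
    return cost_usd / gpu_hours(hours, gpus)


# Sum of the three reported components.
total_cost_usd = sum(s["cost_usd"] for s in STAGES.values()) + SFT_DATASET_COST_USD

if __name__ == "__main__":
    for name, s in STAGES.items():
        print(
            f"{name}: {gpu_hours(s['hours']):,.0f} GPU-hours, "
            f"implied rate ${implied_rate(s['cost_usd'], s['hours']):.2f}/GPU-hour"
        )
    print(f"Total reported R1 cost: ${total_cost_usd:,}")
```

The three components sum to the reported $294,000, and both training stages work out to roughly $2 per H800 GPU-hour, so the quoted hours and costs are mutually consistent.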