Led by Liang Wenfeng, DeepSeek Responds to the "Distillation" Controversy for the First Time

Core Viewpoint
- The article highlights the breakthrough of DeepSeek-AI's open-source model DeepSeek-R1, which sharply reduces the cost of AI model training and strengthens reasoning capabilities through new training methods, marking a pivotal moment for AI development in China and globally [5][20].

Group 1: Cost and Methodology
- DeepSeek-R1's reported training cost is $294,000, far below the estimated $100 million OpenAI spent on GPT-4 [11].
- The research team employed a pure reinforcement learning framework and introduced the Group Relative Policy Optimization (GRPO) algorithm, rewarding the model solely on the correctness of its final answers rather than on how closely it mimics human reasoning paths [12].
- The model demonstrated advanced behaviors such as self-reflection and self-verification, reaching 77.9% accuracy on the American Invitational Mathematics Examination (AIME 2024), which rose to 86.7% with self-consistency decoding [15].

Group 2: Impact and Future of AI
- DeepSeek-R1 serves as a methodological declaration: it shows a sustainable path for AI evolution that does not rely on vast amounts of labeled data, shifting the focus from funding barriers to scientific innovation [20].
- The success of DeepSeek-R1 points to a potential shift in AI competition, from a race for data and computational power to one centered on algorithmic and intellectual innovation [21].
- The model's development is seen as a significant milestone in the global AI landscape, with experts suggesting it could initiate a "reasoning revolution" in AI [21].
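
The outcome-only, group-relative reward idea described under Group 1 can be illustrated with a minimal Python sketch. It is not DeepSeek's implementation: the function names, the exact-match reward, the use of population standard deviation, and the sample answers are all assumptions made for illustration. The sketch only shows the core idea that each sampled answer is scored on final-answer correctness and then compared against the other answers in its own group.

```python
from statistics import mean, pstdev


def correctness_reward(final_answer: str, reference: str) -> float:
    """Hypothetical outcome-only reward: 1.0 for a matching final answer, else 0.0.

    The reasoning path is never inspected, only the final answer.
    """
    return 1.0 if final_answer.strip() == reference.strip() else 0.0


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize each reward against the mean and spread of its own group.

    Assumption: population standard deviation is used here; real
    implementations may differ in this detail.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps avoids division by zero when every answer in the group scores the same.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Illustrative group of answers sampled for one prompt (made-up data).
sampled_answers = ["42", "41", "42", "7"]
rewards = [correctness_reward(a, "42") for a in sampled_answers]
advantages = group_relative_advantages(rewards)

for answer, adv in zip(sampled_answers, advantages):
    print(f"answer={answer!r} advantage={adv:+.3f}")
```

In this sketch, correct answers receive a positive advantage and incorrect ones a negative advantage, and a group in which every answer scores the same yields zero advantage for all of them, so it contributes no learning signal.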