DeepSeek
Liang Wenfeng Leads Team in First Response to the "Distillation" Controversy
阿尔法工场研究院· 2025-09-19 00:05
Core Viewpoint - DeepSeek-AI's open-source model DeepSeek-R1 sharply reduces the cost of training a reasoning model and strengthens reasoning through reinforcement learning, which the article presents as a pivotal moment for AI development in China and globally [5][20].
Group 1: Cost and Methodology
- DeepSeek-R1's training cost was about $294,000, far below the estimated $100 million that OpenAI spent on GPT-4 [11].
- The research team used a pure reinforcement learning framework built on the Group Relative Policy Optimization (GRPO) algorithm, rewarding the model solely for the correctness of its final answers rather than for mimicking human reasoning paths [12].
- The model developed advanced behaviors such as self-reflection and self-verification, reaching 77.9% accuracy on the American Invitational Mathematics Examination (AIME 2024), which rose to 86.7% with self-consistency decoding [15].
Group 2: Impact and Future of AI
- DeepSeek-R1 amounts to a methodological declaration: it shows a sustainable path for AI evolution that does not depend on vast amounts of labeled data, shifting the focus from funding barriers to scientific innovation [20].
- Its success suggests that AI competition may move from a race for data and computing power to one centered on algorithmic and intellectual innovation [21].
- The model is seen as a significant milestone in the global AI landscape, with experts suggesting it could start a "reasoning revolution" in AI [21].
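The jump from 77.9% to 86.7% comes from self-consistency decoding: sample several solutions to the same problem and keep the final answer that appears most often. The following is a minimal sketch of that voting step only; the function name and the sample answers are hypothetical, and this is not DeepSeek's actual implementation.

```python
from collections import Counter


def self_consistency_vote(final_answers):
    """Return the most frequent final answer among sampled completions."""
    if not final_answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(final_answers)
    # most_common(1) yields [(answer, count)]; on a tie, the answer
    # that was seen first is returned.
    answer, _ = counts.most_common(1)[0]
    return answer


# Hypothetical final answers extracted from five sampled solutions.
samples = ["204", "204", "197", "204", "210"]
print(self_consistency_vote(samples))
```

A single sample can land on a wrong answer by chance, but independent samples rarely agree on the same wrong answer, which is why the majority answer tends to be more accurate than any one sample.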
DeepSeek Founder Liang Wenfeng Responds to Doubts in Nature: R1 Training Really Did Cost $294,000
Xin Lang Cai Jing· 2025-09-19 00:03
Core Insights - DeepSeek-R1 has been featured on the cover of Nature for its approach to strengthening the reasoning capabilities of large language models (LLMs) through reinforcement learning (RL) [1][3][5].
Group 1: Achievements and Recognition
- The paper "DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning" was first released in January and has now appeared on the cover of Nature [3].
- Since its open-source release, DeepSeek-R1 has become the most popular model on Hugging Face, with over 10.9 million downloads [5].
- Training DeepSeek-R1 cost about $294,000, far less than the costs incurred by competitors such as OpenAI and Google [6][7].
Group 2: Training Methodology
- DeepSeek-R1 uses an RL framework that constrains only the task format and bases the reward signal on the correctness of the final answer, letting reasoning capabilities develop on their own [10].
- During training, the model's reasoning accuracy rose from 15.6% to 77.9%, peaking at 86.7% when combined with self-consistency decoding [10].
Group 3: Self-Evolution and Advanced Strategies
- The model showed self-evolving behavior, generating progressively longer responses and adopting advanced reasoning strategies such as self-reflection and systematic exploration of alternative solutions [12][14].
- A notable "Aha Moment" was observed when the model began using the word "wait" more frequently, indicating a shift in its reasoning approach [15][17].
Group 4: Future Development Plans
- To address DeepSeek-R1's limitations, the team adopted a multi-stage refinement plan: a cold start on high-quality conversational data, followed by multiple rounds of RL and supervised fine-tuning [18][19].
- After this multi-stage training, the model's performance improved by 17%-25% on various benchmarks [21].
Group 5: Algorithm and Reward System
- DeepSeek uses the GRPO (Group Relative Policy Optimization) algorithm, which scores each answer relative to a group of sampled answers rather than searching for a single best answer, reducing resource consumption while keeping training stable [23][24].
- A dual reward system combines rule-based rewards for reasoning tasks with model-based rewards for general tasks, so the model aligns with human preferences while retaining its reasoning capabilities [25][26].
Group 6: Challenges and Limitations
- DeepSeek-R1 still struggles with structured outputs and tool use, and it is sensitive to prompts, which limits its effectiveness in complex scenarios [35][37].
- Reward hacking remains a risk, particularly in subjective tasks, and could undermine the model's performance if the reward signals are not robust [37].
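The group-based scoring in GRPO can be illustrated with a short sketch: for one prompt, several answers are sampled and each answer's reward is normalized against the mean and spread of its own group. The helper name, the use of the population standard deviation, and the `eps` guard are assumptions made for illustration; the source does not give these details.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its group's mean and standard deviation."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    # eps avoids division by zero when every answer in the group scores the same.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rule-based rewards for four sampled answers to one prompt:
# 1.0 for a correct final answer, 0.0 otherwise.
rewards = [1.0, 0.0, 0.0, 1.0]
for reward, advantage in zip(rewards, group_relative_advantages(rewards)):
    print(f"reward={reward:.1f} advantage={advantage:+.3f}")
```

Because the baseline is the group's own mean, correct answers receive positive advantages and incorrect ones negative advantages without a separately trained value model, which is one plausible reading of the reduced resource consumption described above.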
Lujiazui Financial Breakfast: Friday, September 19, 2025
Wind万得· 2025-09-18 22:35
Group 1
- The Ministry of Commerce stated that China will not sacrifice principles or corporate interests to reach any agreement on TikTok, and that it hopes the EU will not weaponize tariffs against Chinese electric vehicles [2].
- The latest issue of Nature featured a research paper on the DeepSeek-R1 reasoning model, a significant achievement for China's AI technology on an international platform [2].
- The Ministry of Science and Technology announced that China's R&D investment exceeded 3.6 trillion yuan in 2024, a 48% increase from 2020, with R&D intensity reaching 2.68% [3].
Group 2
- The "2025 Top 500 Chinese Service Industry Enterprises" report showed that the listed companies' total revenue reached 51.1 trillion yuan in 2024, with average revenue exceeding 100 billion yuan [3].
- Beijing and Shanghai announced their 2025 social security contribution base limits, with Beijing's upper limit set at 35,811 yuan and Shanghai's at 37,302 yuan [3].
- Shanghai is soliciting opinions on guidelines to support high-growth enterprises, offering rewards of up to 1 million yuan for "gazelle" companies and 2 million yuan for "unicorn" companies [3].
Group 3
- A-shares were volatile, with the Shanghai Composite Index closing down 1.15% at 3,831.66 points; the Shenzhen Component Index and the ChiNext Index also fell [4].
- Hong Kong's Hang Seng Index dropped 1.35% to 26,544.85 points, with significant declines in cyclical stocks and financials, while the semiconductor and robotics sectors showed resilience [4].
- The Shanghai Stock Exchange reported abnormal trading in Tianpu Co., leading to regulatory measures against certain investors [4].
Group 4
- Goldman Sachs maintained an "overweight" rating on A-shares and H-shares, predicting upside of 8% and 3% respectively over the next 12 months [5].
- DWS announced plans to launch an ETF tracking the CSI A500 Index in Europe, giving overseas investors a new investment tool [5].
Group 5
- Cumulative sales of new energy vehicles in China surpassed 40 million units, with China holding the world's leading position for ten consecutive years [8].
- The retail market for narrowly defined passenger vehicles in September is expected to reach approximately 2.15 million units, with new energy vehicles accounting for about 1.25 million units, a penetration rate of 58.1% [8].
- As of August 2025, China had 17.348 million electric vehicle charging facilities, a 53.5% year-on-year increase [8].
Group 6
- The postal industry generated revenue of 142.99 billion yuan in August, a 4.4% year-on-year increase, with express delivery services contributing 118.96 billion yuan [9].
- China's property insurance industry recorded premium growth of 4.2% in the first half of 2025, with underwriting profit reaching 26 billion yuan, a historical high [9].
- An international standard for oil and gas pipelines, developed by China, was released, unifying 287 terms related to pipeline transportation [9].
Group 7
- CoGoLinks International obtained a money service license in the UAE, allowing it to operate payment accounts and transactions, and becoming the first Chinese cross-border payment platform to do so [10].
- Huawei announced a series of upcoming product launches, including new Ascend chips and Atlas products, with specific release timelines [11].
- Nvidia announced a $5 billion investment in Intel, which will customize x86 CPUs for Nvidia [11].
Group 8
- US stock indices reached new closing highs, with the Dow Jones up 0.27% and the Nasdaq up 0.94%, driven by strong performances from companies such as Caterpillar and Nvidia [16].
- US initial jobless claims fell to 231,000, the largest drop in nearly four years, although continuing claims remain above 1.9 million [13].
- The UK government announced significant investments from BP and CoreWeave in the US, supporting job creation and AI data center expansion [13].
DeepSeek Team Publishes Landmark Paper; Nature Runs Editorial Praising It and Urging Peers to Follow Suit
Yang Zi Wan Bao Wang· 2025-09-18 13:19
Group 1
- The research paper on the DeepSeek-R1 reasoning model has been published as a cover article in Nature, making DeepSeek-R1 the first mainstream large language model (LLM) to undergo peer review, a significant step for AI model development [2][4].
- The paper discloses more training details than the initial version released in January and shows that the reasoning capabilities of LLMs can be improved through pure reinforcement learning, reducing the human input needed to raise performance [2][9].
- Since its release in January, DeepSeek-R1 has become the most downloaded model for solving complex problems on Hugging Face, and the paper was evaluated by eight experts for originality, methodology, and robustness [9].
Group 2
- Nature's editorial stresses the importance of peer review for AI models, noting that almost no mainstream large model had undergone independent peer review until DeepSeek filled this gap [4][6].
- Peer review helps clarify how LLMs work and assess whether they actually deliver their claimed functionality, which matters all the more given the significant implications and potential risks of LLMs [6][10].
- The editorial calls on other AI companies to follow DeepSeek's example, suggesting that if the practice becomes a trend it could greatly promote the healthy development of the AI industry [10].
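Krypton Evening News | Huawei Announces Ascend Chip Roadmap and Targets for the Next Three Years: 950PR to Launch in Q1 Next Year; Tesla Is Redesigning Its Safety-Plagued Door Handles; "731" Tops 136 Million Yuan at the Box Office Today, Becoming Mainland Film...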
3 6 Ke· 2025-09-18 10:32
Group 1: Meta and AI Technology
- Meta launched its first smart glasses with a built-in screen, priced at $799, which can display messages, video calls, and navigation instructions [1].
- The glasses integrate with Meta's AI services to show visual results for queries [1].
Group 2: Coffee Industry
- Lucky Coffee, a brand under Mixue Group, has over 70 stores in Beijing and signed more than 1,200 new stores nationwide in July, a record for monthly new store signings [2].
- As of the end of August, Lucky Coffee had over 8,200 stores across the country [2].
Group 3: Huawei's Technological Advancements
- At the Huawei Connect 2025 event, Huawei unveiled what it calls the world's most powerful computing supernodes and clusters, with the Atlas 950 SuperPoD and Atlas 960 SuperPoD supporting 8,192 and 15,488 Ascend cards, respectively [3].
- The supernodes and clusters aim to provide sustainable and abundant computing power for the long-term development of artificial intelligence [3].
Group 4: Electric Vehicle Industry
- Rivian is advancing its factory plans in Georgia, with the first phase expected to start next year and production of customer vehicles slated for 2028, targeting an annual capacity of 400,000 vehicles [4].
- The factory investment could reach several billion dollars and is projected to create 7,500 jobs by 2030 [4].
Group 5: Semiconductor and AI Development
- Huawei announced its roadmap for Ascend chips over the next three years, including the launch of the 950PR chip in Q1 2026, which will use Huawei's self-developed HBM [4].
Group 6: Digital Content Creation
- Keling AI launched a new digital human feature that generates 1080p/48FPS videos up to one minute long from a character image plus text or audio, significantly lowering industry barriers [6].
Group 7: Postal Industry Performance
- In August, China's postal industry achieved business revenue of 142.99 billion yuan, a year-on-year increase of 4.4%, with express delivery revenue reaching 118.96 billion yuan, up 4.2% [8].
- The total volume of postal services in August was 17.62 billion items, up 10.5%, with express delivery volume increasing by 12.3% [8].
Group 8: Film Industry
- The film "731" took in over 136 million yuan at the box office on its opening day and set the mainland China record for the most screenings on an opening day [9].
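Krypton Evening News | Huawei Announces Ascend Chip Roadmap and Targets for the Next Three Years: 950PR to Launch in Q1 Next Year; Tesla Is Redesigning Its Safety-Plagued Door Handles; "731" Tops 136 Million Yuan at the Box Office Today, Setting the Mainland Record for Most Opening-Day Screenings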
3 6 Ke· 2025-09-18 10:19
Group 1: Meta's Smart Glasses
- Meta launched its first smart glasses with a built-in screen, named "Meta Ray-Ban Display" and priced at $799, which can display messages, video calls, navigation, and AI service queries [1].
Group 2: Baidu's PaddleOCR
- Baidu's PaddleOCR has surpassed 9 million downloads since its open-source launch in 2020, is used by over 5,900 open-source projects, and is the only Chinese OCR project on GitHub with over 50,000 stars [1].
Group 3: Lucky Coffee's Expansion
- Lucky Coffee, under Mixue Group, has opened over 70 stores in Beijing and signed over 1,200 new stores nationwide in July, a record for new store signings in a month [2].
Group 4: Huawei's Computing Power
- At the Huawei Connect 2025 event, Huawei unveiled what it calls the world's most powerful computing supernodes and clusters, with the Atlas 950 SuperPoD and Atlas 960 SuperPoD supporting 8,192 and 15,488 Ascend cards respectively, aiming to provide sustainable computing power for AI development [3].
Group 5: Rivian's Factory Plans
- Rivian is advancing its factory plans in Georgia, with the first phase expected to start next year and production of customer vehicles by 2028, aiming for an annual capacity of 400,000 vehicles and 7,500 jobs by 2030 [4].
Group 6: Tesla's Door Handle Redesign
- Tesla is redesigning its controversial door handle system to improve safety and usability in emergencies, following an NHTSA investigation into complaints about door malfunctions in approximately 174,000 vehicles [4].
Group 7: Huawei's Ascend Chip Evolution
- Huawei announced its roadmap for Ascend chips over the next three years, including the launch of the 950PR chip in Q1 2026, which will use Huawei's self-developed HBM [4].
Group 8: AI Digital Human Launch
- Keling AI introduced a new digital human feature that generates 1080p videos up to one minute long from a character image plus text or audio, significantly lowering industry barriers and applicable in various sectors [7].
Group 9: DeepSeek's Fraud Warning
- DeepSeek issued a statement warning users about fraudsters impersonating the company, clarifying that it has never requested payments to personal or unofficial accounts [8].
Group 10: Postal Industry Performance
- In August, China's postal industry generated revenue of 142.99 billion yuan, a year-on-year increase of 4.4%, with express delivery revenue reaching 118.96 billion yuan, up 4.2% [9].
Group 11: Film Box Office Success
- The film "731" took in over 136 million yuan at the box office on its opening day and set the mainland China record for the most screenings on an opening day [10].
Just In: Liang Wenfeng Publishes in Nature
36氪· 2025-09-18 10:18
Core Viewpoint - DeepSeek's R1 reasoning model has been published in Nature, a milestone for AI research and for transparency in the industry [4][22][36].
Group 1: Model Development and Achievements
- The DeepSeek-R1 model, developed by Liang Wenfeng's team, is the first mainstream large language model to undergo peer review, filling a significant gap in the AI industry [4][11][22].
- The model has become the most popular open-source reasoning model globally, with over 10.9 million downloads on Hugging Face [4].
- The research addresses a major problem in AI: improving reasoning capabilities through reinforcement learning without relying on extensive human labeling [14][16].
Group 2: Transparency and Peer Review
- Nature's editorial highlights the role of peer-reviewed publication in clarifying how large models work and in checking that their performance matches vendor claims [24][25][34].
- The peer review of DeepSeek-R1 involved eight external experts who provided over a hundred specific comments, improving the paper's clarity and credibility [26][29][34].
- DeepSeek's commitment to transparency shows in its detailed disclosures about model training and safety assessments, which are crucial for mitigating the risks of AI technologies [11][18][36].
Group 3: Safety and Data Integrity
- DeepSeek conducted a comprehensive safety evaluation of the R1 model and reported that it is safer than contemporaneous models [11][18].
- The training data went through rigorous decontamination so that evaluation results reflect the model's actual problem-solving ability rather than exposure to benchmark content [17][20].
- While acknowledging that some benchmark tests may still be contaminated, DeepSeek has added external risk control systems to improve safety during deployment [18][19].
Group 4: Industry Impact and Future Directions
- DeepSeek's open-source model is positioned as a representative of domestic AI technology on the global stage and could set a standard for research transparency in the AI industry [36].
- The call for more AI companies to submit their models for peer review reflects growing recognition that claims in AI research need verification to be credible [36].
Chinese Scholars Published 11 Nature Papers in a Single Day
生物世界· 2025-09-18 10:05
Core Insights - On September 17, 2025, a total of 24 papers were published in Nature, with 10 of them authored by Chinese scholars, highlighting the contribution of Chinese researchers to global science [2][5][7][9][12][14][16][18][21].
Group 1: Research Contributions
- "Toughened self-assembled monolayers for durable perovskite solar cells", co-authored by scholars from City University of Hong Kong and the Chinese Academy of Sciences, focuses on improving the durability of perovskite solar cells [2].
- "A movable long-term implantable soft microfibre for dynamic bioelectronics", by researchers from the Chinese Academy of Sciences and Donghua University, contributes to the field of bioelectronics [5].
- "Atomic-scale imaging of frequency-dependent phonon anisotropy", by researchers from the University of California, Irvine, provides insights into phonon behavior at the atomic level [7].
- "Covariation mass spectrometry uncovers a protein that controls cysteine catabolism", led by a researcher from Dana-Farber Cancer Institute, reports findings in protein metabolism [9].
- "A room temperature rechargeable all-solid-state hydride ion battery", by scholars from the Dalian Institute of Chemical Physics, advances battery technology [12].
- "High-density soft bioelectronic fibres for multimodal sensing and stimulation", by researchers from Stanford University, contributes to the development of bioelectronic devices [14].
- "DeepSeek-R1 incentivizes reasoning in LLMs through reinforcement learning", by DeepSeek, explores advances in large language models [16].
- "Structural basis for mTORC1 activation on the lysosomal membrane", by researchers from the University of California, Berkeley, provides insights into cellular signaling mechanisms [17].
- "Peroxisomal metabolism of branched fatty acids regulates energy homeostasis", by scholars from Washington University in St. Louis, contributes to the understanding of metabolic processes [18].
- "Delta-type glutamate receptors are ligand-gated ion channels", from Johns Hopkins University, adds to knowledge in neurobiology [21].
DeepSeek Makes Its First Nature Cover: A Chinese Large Model Makes History, Doing What OpenAI Did Not Dare to Do
3 6 Ke· 2025-09-18 09:56
Core Insights
- DeepSeek's AI model R1 has been featured on the cover of Nature, underlining its impact on the AI industry [2][10][12].
- The training cost for R1 was notably low at $294,000, in sharp contrast to the multi-million dollar investments typical for models from companies such as OpenAI [7][48].
- The model went through rigorous peer review, setting a new standard for transparency and scientific validation in AI [11][15][16].
Group 1: Model Development and Training
- R1's training process was first described in a paper posted on arXiv and later expanded in the Nature article, which lays out a comprehensive methodology [6][7].
- The model was trained with a pure reinforcement learning framework, allowing it to develop reasoning capabilities without relying on human-annotated data [19][41].
- R1 achieved an accuracy of 77.9% on the AIME 2024 math competition, surpassing the average score of human participants and outperforming GPT-4 on certain tasks [23][31].
Group 2: Peer Review and Industry Impact
- The peer review of R1 involved independent experts scrutinizing the model, a departure from the usual practice of major AI companies, which often do not submit their models for academic evaluation [10][11][15].
- Nature's editorial team has called on other companies to submit their models for peer review, emphasizing transparency and accountability in AI development [15][16].
- The recognition from Nature validates R1's scientific contributions and positions DeepSeek as a leader in the push for more rigorous standards in AI research [12][50].
Group 3: Technical Innovations
- R1 is built on a mixture-of-experts (MoE) model with 671 billion parameters, pre-trained on a vast dataset of web pages and e-books [25].
- During reinforcement learning the model was rewarded solely for the correctness of its answers, which encouraged self-reflection and dynamic adjustment during problem-solving [29][38].
- The final version of R1 was produced by a multi-stage training process combining reinforcement learning with supervised fine-tuning, improving both reasoning and general capabilities [39][47].
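A reward based solely on answer correctness can be as simple as a rule that extracts the final answer and compares it with a reference. The sketch below assumes, purely for illustration, that the final answer is wrapped in `\boxed{...}` and that an exact string match counts as correct; the function name and answer format are hypothetical, and the real reward rules may differ.

```python
import re

# Matches \boxed{...} with no nested braces; the answer format is an assumption.
BOXED = re.compile(r"\\boxed\{([^{}]*)\}")


def accuracy_reward(completion: str, reference: str) -> float:
    """Return 1.0 if the last boxed answer equals the reference, else 0.0."""
    matches = BOXED.findall(completion)
    if not matches:
        return 0.0  # no answer in the expected format
    predicted = matches[-1].strip()  # the last box wins if the model revises itself
    return 1.0 if predicted == reference.strip() else 0.0


print(accuracy_reward(r"Wait, let me recheck. The answer is \boxed{42}", "42"))
print(accuracy_reward("The answer is 42", "42"))
```

Nothing in this rule inspects the reasoning text itself, so the model is free to reflect, backtrack, or try alternatives as long as the final boxed answer is right.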
Training Cost of $294,000: DeepSeek-R1 Makes the Nature Cover; First Mainstream Large Model to Pass Peer Review at an Authoritative Journal Wins Praise
3 6 Ke· 2025-09-18 07:55
Core Insights
- DeepSeek-R1's research results have been published in Nature, making it the first mainstream large model to undergo peer review at a reputable journal and sparking significant discussion in the academic community [1][14][17].
- The training cost of DeepSeek-R1 is reported to be only $294,000. Even counting the roughly $6 million spent on the foundational LLM it builds on, this is far below the tens of millions of dollars typical for leading models [1][2][17].
Training Costs
- The training costs for DeepSeek-R1 break down as follows:
  - DeepSeek-R1-Zero: $202,000
  - SFT data creation: $10,000
  - DeepSeek-R1: $82,000
  - Total: $294,000
- Training used 512 (64 × 8) H800 GPUs for approximately 198 hours for DeepSeek-R1-Zero and around 80 hours for DeepSeek-R1 [2].
Reinforcement Learning and Reasoning Capabilities
- The model uses Group Relative Policy Optimization (GRPO) to strengthen reasoning without traditional supervised fine-tuning, allowing more exploratory learning [3][4].
- DeepSeek-R1-Zero shows complex reasoning behaviors, generating longer responses that include verification and exploration of different solutions [4][6].
Performance Metrics
- DeepSeek-R1-Zero achieved a pass@1 score of 77.9% on the AIME 2024 math competition, rising to 86.7% with self-consistency decoding, which surpasses average human performance [6][8].
- The model also excelled in programming competitions and on graduate-level questions in biology, physics, and chemistry, supporting the effectiveness of reinforcement learning for reasoning [6].
Development Pipeline
- DeepSeek-R1 was developed in multiple stages, starting with the collection of human-like dialogue data and continuing through reinforcement learning and sampling, which improved the model's usefulness and safety [9][11].
- Experimental results show significant improvements in instruction following across the development stages, with DeepSeek-R1 outperforming its predecessors on benchmark tests [11][13].
Industry Impact
- The peer review of DeepSeek-R1 is seen as a positive trend for AI research, promoting the transparency and standardization that many mainstream AI models have lacked [14][16][17].
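The reported cost breakdown can be checked with a few lines of arithmetic. The sketch below uses only the three stage costs quoted above; the variable names are illustrative.

```python
# Stage costs in US dollars, as quoted in the breakdown above.
stage_cost_usd = {
    "DeepSeek-R1-Zero": 202_000,
    "SFT data creation": 10_000,
    "DeepSeek-R1": 82_000,
}

total = sum(stage_cost_usd.values())
shares = {stage: cost / total for stage, cost in stage_cost_usd.items()}

print(f"Total: ${total:,}")  # Total: $294,000
for stage, share in shares.items():
    print(f"{stage}: {share:.1%}")
```

The three stages add up to the $294,000 headline figure, and the reinforcement learning run for DeepSeek-R1-Zero accounts for roughly two thirds of it.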