DeepSeek-R1 Makes the Cover of Nature: A Welcome Step Toward AI Transparency
36Kr · 2025-09-18 02:02
Core Insights
- The value of open-source artificial intelligence (AI) is gaining broader recognition, highlighted by the publication of the DeepSeek-R1 paper in the prestigious journal Nature, with founder Liang Wenfeng as the corresponding author [1][5].

Research Findings
- The research team hypothesized that human-defined reasoning patterns might limit model exploration, and that unrestricted reinforcement learning (RL) training could better stimulate the emergence of new reasoning capabilities in large language models (LLMs) [3][8].
- Experiments demonstrated that the reasoning ability of LLMs can be enhanced through pure RL, reducing the need for human input, and that the resulting models outperform traditionally trained LLMs on tasks such as mathematics, programming competitions, and graduate-level STEM problems [3][9].

Model Evaluation
- After its launch, DeepSeek-R1 received widespread acclaim from developers worldwide, reaching 91.1k stars on GitHub [4].
- Nature's editorial recognized DeepSeek-R1 as the first mainstream LLM to be published after peer review, marking a significant step toward transparency in AI [5][17].
- The editorial emphasized the importance of peer-reviewed publications in clarifying how LLMs operate and assessing their authenticity [6][17].

Methodology
- The research introduced a new paradigm within the RL framework, minimizing reliance on human-annotated reasoning processes and exploring the potential for LLMs to develop reasoning capabilities through self-evolution [9][10].
- The team proposed an RL algorithm called "Group Relative Policy Optimization" (GRPO) and trained several models, including DeepSeek-R1-Zero and DeepSeek-R1, on top of the base model DeepSeek-V3 Base [10][12].

Training Phases
- The training process involved multiple stages, with each successive model improving on the previous one in reasoning and instruction-following capabilities [14].
- DeepSeek-R1 demonstrated strong reasoning abilities aligned with human preferences, achieving superior performance across 21 mainstream benchmarks and validating the effectiveness of the RL framework [15][16].

Industry Implications
- The editorial raised concerns about the lack of independent peer review for many widely used LLMs, highlighting the need for transparency and accountability in the AI industry [17][18].
- Nature called on more AI companies to submit their models for publication review, emphasizing that peer review can enhance trust and credibility in AI research [18][19].
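The summary names GRPO without describing it. As a rough illustration only, the "group relative" part can be sketched as follows: several answers are sampled for the same prompt, and each answer's reward is normalized against the mean and standard deviation of its own group. The function name, the use of the population standard deviation, and the `eps` term are assumptions made for this sketch; this is not DeepSeek's training code, and the policy-gradient update that consumes these advantages is omitted.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the other samples drawn for the same prompt.

    Hypothetical helper: real implementations may differ in the choice of
    standard deviation (population vs. sample) and in the stabilizing eps.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division defined when every sample gets the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers to one prompt: two correct (reward 1), two wrong (reward 0).
advantages = group_relative_advantages([1.0, 0.0, 1.0, 0.0])

# If every answer gets the same reward, no sample is preferred over another.
flat = group_relative_advantages([1.0, 1.0, 1.0])
```

In this sketch, correct answers end up with positive advantages and wrong ones with negative advantages, so no separately trained value model is needed to provide the baseline.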
DeepSeek Makes the Cover of Nature: Liang Wenfeng's Team Responds to Doubts, and R1 Training Really Did Cost $294,000
36Kr · 2025-09-18 01:32
Core Insights
- The paper "DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning" has gained significant recognition, being featured on the cover of a leading global journal, Nature [2][4].
- DeepSeek-R1 is noted as the first mainstream large language model (LLM) to undergo a peer review process, which has set a precedent for transparency in AI development [7].

Model Performance and Popularity
- After its open-source release, DeepSeek-R1 became the most downloaded model on Hugging Face, surpassing 10.9 million downloads [4].
- The model demonstrated a remarkable improvement in reasoning capabilities, achieving an average problem-solving accuracy (pass@1) of 77.9%, and up to 86.7% with self-consistency decoding [10].

Training Costs and Efficiency
- The training cost for DeepSeek-R1 was reported at $294,000, significantly lower than the costs incurred by companies such as OpenAI and Google [5][6].
- The training process involved 147,000 GPU hours, with a breakdown of costs for the different training phases [6].

Innovative Training Approach
- DeepSeek-R1-Zero was developed by completely discarding human reasoning patterns and using a simplified reinforcement learning framework [8][10].
- Training was constrained in only two respects: the task format and reward signals based on the correctness of final answers [10].

Self-Evolution and Advanced Reasoning
- During training, the model exhibited self-evolution behaviors, increasing the length of the text generated inside the "think" tag and developing advanced reasoning strategies [12][15].
- A notable "Aha Moment" was observed when the model began using the word "wait" more frequently, indicating a shift in its reasoning process [16][18].

Multi-Stage Training Process
- The training process consists of multiple stages: cold start, reinforcement learning, large-scale supervised fine-tuning, and a second round of reinforcement learning [19][20].
- Each stage is designed to enhance different aspects of the model's capabilities, from initial fine-tuning to improving language consistency and general knowledge [20][35].

Reward System Design
- DeepSeek implemented a dual-track reward system, combining rule-based rewards for reasoning tasks with model-based rewards for general tasks [27][30].
- The rule-based rewards focus on accuracy and format compliance, while the model-based rewards assess the usefulness and safety of the outputs [28][31].

Challenges and Future Directions
- Despite its advanced reasoning capabilities, DeepSeek-R1 faces limitations in structured outputs and tool usage, and it is sensitive to prompt variations [43].
- The dependence on reliable reward signals poses challenges, particularly for subjective tasks, where it may lead to reward hacking [44].
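To make the rule-based reward and the two accuracy figures more concrete, here is a minimal sketch under stated assumptions. The output layout (reasoning inside a `<think>` tag followed by the final answer), the reward weights, and all function names are hypothetical choices for illustration; the summary does not give DeepSeek's actual rules. Here pass@1 is estimated as the share of sampled outputs whose answer is correct, and self-consistency decoding is shown as a majority vote over the sampled answers.

```python
import re
from collections import Counter

# Assumed layout: "<think>reasoning</think> final answer" (hypothetical format).
THINK_PATTERN = re.compile(r"^<think>(.*?)</think>\s*(.+)$", re.DOTALL)


def extract_answer(output):
    """Return the text after the think block, or None if the format is violated."""
    match = THINK_PATTERN.match(output.strip())
    return match.group(2).strip() if match else None


def rule_based_reward(output, reference):
    """Format reward plus accuracy reward; the weights are illustrative only."""
    answer = extract_answer(output)
    if answer is None:
        return 0.0  # no parsable answer, so neither component is granted
    format_reward = 0.5
    accuracy_reward = 1.0 if answer == reference else 0.0
    return format_reward + accuracy_reward


def pass_at_1(outputs, reference):
    """Fraction of sampled outputs whose extracted answer matches the reference."""
    return sum(extract_answer(o) == reference for o in outputs) / len(outputs)


def majority_vote(outputs):
    """Self-consistency: pick the most common answer among well-formed samples."""
    answers = [a for a in map(extract_answer, outputs) if a is not None]
    return Counter(answers).most_common(1)[0][0] if answers else None


samples = [
    "<think>6 * 7 = 42</think> 42",
    "<think>6 * 7 = 41</think> 41",
    "<think>7 * 6 = 42</think> 42",
    "42",  # correct value, but missing the think block
]
rewards = [rule_based_reward(s, "42") for s in samples]
accuracy = pass_at_1(samples, "42")
voted = majority_vote(samples)
```

Because the vote aggregates many samples, it can land on the right answer even when individual samples are wrong, which is why a self-consistency score can exceed pass@1 for the same model.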
DeepSeek-R1 Paper Makes the Cover of Nature, with Liang Wenfeng as Corresponding Author
36Kr · 2025-09-18 00:45
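What a surprise, and yet so well deserved! The cover of the latest issue of Nature is none other than the DeepSeek-R1 research.

This is the paper DeepSeek posted on arXiv in January this year, "DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning". The corresponding author of the Nature paper is Liang Wenfeng.

Paper link: https://www.nature.com/articles/s41586-025-09422-z

In its cover blurb, Nature writes:

Large models tend to solve problems better if they are trained to plan the steps needed to solve them. This "reasoning" resembles how humans tackle more complex problems, but it poses a major challenge for AI, requiring human intervention to add labels and annotations. In this week's issue, researchers at DeepSeek reveal how they were able to train a model to reason with minimal human input.

The DeepSeek-R1 model was trained using reinforcement learning, in which the model earns a high-scoring reward for solving a math problem correctly and is penalized for a wrong answer. As a result, it learned to reason (working through problems step by step and revealing those steps), making it more likely to arrive at the correct ...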
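Fed Announces 25-Basis-Point Rate Cut; EU Announces Sanctions on Israel, Israeli Foreign Minister Responds; DeepSeek-R1 Makes History as Liang Wenfeng's Paper Lands on the Cover of Nature | Morning Briefing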
Di Yi Cai Jing· 2025-09-18 00:20
Group 1
- The Federal Reserve announced a 25 basis point interest rate cut, bringing the target range for the federal funds rate to 4.00%-4.25%, marking the first rate cut since December 2024 [2].
- The European Union has implemented sanctions against Israel, including the partial suspension of trade-related aspects of the EU-Israel Association Agreement, in response to the situation in Gaza [3].
- The Ministry of Commerce plans to conduct pilot projects for new consumption formats and models in approximately 50 cities to enhance quality consumption supply and stimulate economic growth [6].

Group 2
- In August, China's fiscal revenue reached 148.198 billion yuan, with a year-on-year growth of 0.3%, marking the first positive growth in tax revenue this year [5].
- The China Association of Automobile Manufacturers reported that domestic sales of new energy vehicles reached 1.171 million units in August, a year-on-year increase of 18.3%, with total sales for the first eight months reaching 8.088 million units, up 30.1% [10].
- Vanke has undergone its largest organizational restructuring in recent years, adjusting its management structure to include 16 regional companies directly managed by headquarters [22].

Group 3
- Dongfeng Group announced the establishment of a new joint venture company with a registered capital of 8.47 billion yuan, focusing on the Hummer brand and incorporating various intangible assets [23].
- The Ministry of Industry and Information Technology is soliciting public opinions on mandatory national standards for safety requirements of intelligent connected vehicles [9].
- The World Trade Organization predicts that AI could increase cross-border goods and services flow by nearly 40% by 2040, provided that policies are implemented to bridge the digital divide [21].
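8点1氪 | Xibei Responds to "Serving Chopsticks Used to Feed a Dog" Incident; Fed Announces 25-Basis-Point Rate Cut; DeepSeek's Liang Wenfeng Paper Lands on the Cover of Nature
36Kr · 2025-09-18 00:06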
Group 1
- An incident in which a customer fed a dog with the restaurant's serving chopsticks at a Xibei restaurant raised concerns about dining safety, leading to the disposal of all utensils involved and a thorough sanitation of the premises [1].
- The Federal Reserve announced a 25 basis point cut in the federal funds rate, its first rate cut since December 2024, bringing the target range to 4.00%-4.25% [1].

Group 2
- DeepSeek's research paper on the DeepSeek-R1 reasoning model was featured on the cover of the prestigious journal Nature, highlighting its significance as the first mainstream large language model to undergo peer review [2][3].
- The U.S. government extended the grace period for the TikTok ban for the fourth time; it is now set to expire on December 16, 2025 [4].
- A Chinese restaurant chain, Green Tea, was reported to have removed its "no pre-made dishes" signage, raising questions about its food preparation practices [6].

Group 3
- The Chinese government announced that funds in personal medical insurance accounts can now be transferred to family members' accounts for medical expenses, enhancing the utility of these accounts [5].
- A man discovered two salary payments linked to his identity, suggesting potential misuse of personal information by a company to evade taxes [5].
- Japan Airlines faced significant delays due to a pilot's pre-flight drinking, resulting in salary cuts for 37 executives, including the company president [9].

Group 4
- Google Pay will be fully launched in Saudi Arabia, with Alipay expected to be integrated by 2026, enhancing digital payment options in the region [6].
- The Chinese government plans to distribute over 330 million yuan in consumer subsidies during the upcoming National Day holiday, aiming to boost tourism and cultural consumption [6].
- Peak Sports was reported to have implemented salary cuts across the board, with reductions reaching up to 50% for certain employees [7].

Group 5
- China's bicycle and electric bicycle ownership has reached approximately 580 million, with significant reductions in carbon emissions attributed to two-wheeled transportation [12].
- The U.S. stock market showed mixed results, with Baidu's shares rising over 11%, indicating positive market sentiment for certain Chinese tech stocks [13].

Group 6
- NIO Inc. successfully raised $1.16 billion through a public stock offering, aimed at advancing its electric vehicle technology and infrastructure [14].
- AI chip startup Groq completed a $750 million funding round, achieving a post-money valuation of $6.9 billion, reflecting strong investor interest in AI technology [14].
- "Qingyun New Materials" announced the completion of a multi-billion C round financing to expand its production capacity and enhance its position in the high-end fiber materials market [14].
DeepSeek-R1 Makes History: Liang Wenfeng's Paper Lands on the Cover of Nature
Di Yi Cai Jing· 2025-09-17 23:09
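Compared with the initial version of the DeepSeek-R1 paper released in January this year, the new paper discloses more details of the model's training and directly addresses the distillation concerns raised when the model was first released.

DeepSeek-R1 is also the world's first mainstream large language model to undergo peer review. Nature commented that almost none of today's mainstream large models has been through independent peer review, a gap that "has finally been broken by DeepSeek".

The research paper on the DeepSeek-R1 reasoning model, completed jointly by the DeepSeek team with Liang Wenfeng as corresponding author, has made the cover of the authoritative international journal Nature. ...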
DeepSeek-R1 Makes History: Liang Wenfeng's Paper Lands on the Cover of Nature
Di Yi Cai Jing· 2025-09-17 23:07
Core Insights
- The research paper on the DeepSeek-R1 reasoning model, with Liang Wenfeng as corresponding author, has been featured on the cover of the prestigious journal Nature [1].
- Compared with the initial version released in January, the paper provides more detail on model training and addresses the distillation concerns raised when the model was first released [1].
- DeepSeek-R1 is recognized as the first mainstream large language model to undergo peer review, filling a significant gap in the field, as noted by Nature [1].
From Generics to Innovative Drugs: China's Biopharma Sector Has Its "DeepSeek Moment"
Sou Hu Cai Jing· 2025-09-17 15:23
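(CCTV Finance, "Economic News Broadcast") In biopharma, some Chinese-developed innovative drugs have shown efficacy comparable to the star products of international drugmakers. In recent years, a growing number of international pharmaceutical companies have spent heavily on partnerships with Chinese drugmakers.

Early this year, China's AI model DeepSeek drew worldwide attention for its very low development cost and strong performance. In biotechnology, China is having a similar "DeepSeek moment": Chinese drugmakers are moving from the era of generics into the era of innovative drugs and attracting numerous international partnerships. The main form of cooperation is licensing, under which the foreign partner obtains the rights to develop, manufacture, and commercialize the Chinese company's drug or related technology in markets outside China.

At the macro level, a report released in May this year by US biopharma data company DealForma shows that Chinese biopharma companies now account for 42%, by number, of large global licensing deals, meaning deals with upfront payments of US$50 million or more. That share is up sharply from 27% last year and 20% the year before.

Among the standout cases, in May this year Pfizer paid an upfront fee of US$1.25 billion to license an innovative drug from 3SBio for overseas markets. In addition, in the first half of this year US drugmakers including AbbVie, Merck, and Regeneron signed licensing agreements with Chinese drugmakers worth billions of dollars in total.

Also worth noting is the drugmaker AstraZeneca, which over the past two years has, with more than ten Chinese innovative drug companies, reached ...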
China's Innovative Drug Out-Licensing Accelerates! China's Biotech Sector Has Its DeepSeek Moment
Xin Lang Cai Jing· 2025-09-17 15:00
Core Insights
- China's innovative drug sector is experiencing explosive growth, with overseas licensing transaction amounts significantly increasing [1].
- In the first half of this year, the transaction amount for innovative drug licensing has already exceeded $66 billion, surpassing the total for the entire previous year [1].
- The emergence of the AI model DeepSeek has drawn global attention, paralleling the advancements in China's biotechnology sector, which is transitioning from generic drugs to innovative drugs [1].

Industry Developments
- Chinese pharmaceutical companies are attracting numerous international collaborations, indicating a shift in the global perception of China's drug innovation capabilities [1].
- Notable partnerships include Pfizer's payment of $1.25 billion in May for the overseas licensing of an innovative drug from 3SBio [1].
- Major US pharmaceutical companies such as AbbVie, Merck, and Regeneron have signed licensing agreements worth billions with Chinese firms in the first half of this year [1].
Tang Daosheng: Tencent Was the Earliest to Embrace DeepSeek, Driven by User Needs
Xin Lang Ke Ji· 2025-09-17 04:37
Group 1
- The role of the subscription model in cloud vendors' future business development is closely tied to customer needs and industry strategies [1].
- Currently, most cloud products are charged based on usage, including storage, computing, and bandwidth [1].
- Implementing a performance-based payment model for marketing cloud services is challenging because numerous factors influence marketing outcomes [1].

Group 2
- The company's strategic choices are centered on user needs, focusing on identifying pain points and providing effective solutions [2].
- The company embraced the DeepSeek model in response to strong user interest, indicating a commitment to addressing real user demands [2].
- The company will continue to pursue a multi-model strategy based on user needs, ensuring that the best technology solutions are provided to customers [2].