Liang Wenfeng's Paper Makes the Cover of Nature
Mei Ri Jing Ji Xin Wen· 2025-09-18 00:42
(Source: Mei Ri Jing Ji Xin Wen) Compared with the initial DeepSeek-R1 paper released in January this year, this paper discloses more details of the model's training and directly addresses the distillation allegations raised when the model was first released. DeepSeek-R1 is also the world's first mainstream large language model to undergo peer review. Nature commented that almost no mainstream large model has yet undergone independent peer review, and that this gap has "finally been broken by DeepSeek." The DeepSeek-R1 reasoning-model research paper, completed jointly by the DeepSeek team with Liang Wenfeng as corresponding author, made the cover of issue 645 of the authoritative international journal Nature. ...
8点1氪: Xibei responds to the "communal chopsticks used to feed a dog" incident; the Federal Reserve announces a 25-basis-point rate cut; DeepSeek's Liang Wenfeng paper makes the cover of Nature
36氪· 2025-09-18 00:19
Group 1
- The incident at Xibei restaurant involved customers using restaurant utensils to feed a pet dog, raising concerns about dining safety [4]
- The restaurant confirmed that all utensils used by the customers were discarded and a thorough disinfection of the premises was conducted [4]
- Local authorities stated there are currently no legal grounds to penalize the restaurant for allowing pets, as the customer's actions were deemed personal behavior [4]

Group 2
- The Federal Reserve announced a 25 basis point cut in the federal funds rate, marking its first rate decrease since December 2024 [4]

Group 3
- NIO Group successfully completed a financing round of $1.16 billion, aimed at enhancing its technological capabilities and expanding charging infrastructure [20]
- AI chip startup Groq raised $750 million in a new funding round, achieving a post-money valuation of $6.9 billion [20]
- "Qingyun New Materials" announced the completion of a multi-hundred-million-yuan Series C financing to support the development of advanced materials [20]

Group 4
- September saw a significant rise in lemon prices, which have doubled over the past year from 7.83 yuan per kilogram to 15 yuan per kilogram, leading to supply shortages at some stores [15]
- The mooncake industry in China is transitioning from seasonal demand to year-round consumption, with over 20,000 related enterprises currently registered [24]
Just Now: Liang Wenfeng Publishes in Nature
36Ke· 2025-09-17 23:43
Last night, DeepSeek made history once again. Zhidongxi reported on September 18 that on September 17, the DeepSeek-R1 reasoning-model research paper, completed jointly by the DeepSeek team with Liang Wenfeng as corresponding author, made the cover of the authoritative international journal Nature. The DeepSeek-R1 paper was the first to publicly demonstrate that reinforcement learning alone can elicit reasoning ability in large models, a finding that has inspired AI researchers worldwide; the model has also become the world's most popular open-source reasoning model, with more than 10.9 million downloads on Hugging Face. The recognition from Nature is well deserved. At the same time, DeepSeek-R1 is the world's first mainstream large language model to undergo peer review. In an editorial, Nature noted that almost no mainstream large model has yet undergone independent peer review, and that this gap has "finally been broken by DeepSeek." Nature observed that unverified claims and hype have become "commonplace" in the AI industry, and that everything DeepSeek has done is "a welcome step toward transparency and reproducibility." The Nature cover title reads: "Self-help: reinforcement learning teaches large models to improve themselves." The new version of the DeepSeek-R1 paper published in Nature differs considerably from the January version, which had not been peer reviewed; it discloses more details of the model's training and directly addresses the distillation allegations raised at the model's release. | https:// ...
Just In! DeepSeek's Liang Wenfeng Paper Makes the Cover of Nature!
是说芯语· 2025-09-17 23:35
Core Viewpoint
- The DeepSeek-R1 inference model research paper, led by Liang Wenfeng, has been published in the prestigious journal Nature, marking a significant milestone in the field of AI and large language models [1][3]

Group 1: Model Development and Validation
- The latest paper provides more detailed insights into the training of the DeepSeek-R1 model compared to its initial version released in January [3]
- DeepSeek-R1 is recognized as the first mainstream large language model to undergo peer review, addressing previous concerns regarding its distillation process [3]
- The peer review process is seen as a necessary step to mitigate the risks associated with unverified claims in the AI industry, as highlighted by Nature [5]

Group 2: Data and Safety Assessment
- DeepSeek-V3 Base, the foundational model for DeepSeek-R1, utilized data sourced entirely from the internet, which may include outputs generated by GPT-4, though this was not intentional [5]
- The company has provided a detailed process in supplementary materials to demonstrate how data contamination was minimized during training, ensuring that benchmark tests were not deliberately included to enhance model performance [5]
- A comprehensive safety assessment of DeepSeek-R1 has been conducted, showing that its safety features are superior to those of contemporaneous models [5]
DeepSeek's Liang Wenfeng Paper Makes the Cover of Nature
第一财经· 2025-09-17 23:23
2025.09.18 | Word count: 307 | Reading time: about 1 minute
By Yicai Tech
The DeepSeek-R1 reasoning-model research paper, completed jointly by the DeepSeek team with Liang Wenfeng as corresponding author, made the cover of the authoritative international journal Nature. Compared with the initial DeepSeek-R1 paper released in January this year, this paper discloses more details of the model's training and directly addresses the distillation allegations raised when the model was first released. DeepSeek-R1 is also the world's first mainstream large language model to undergo peer review. Nature commented that almost no mainstream large model has yet undergone independent peer review, and that this gap has "finally been broken by DeepSeek."
WeChat editor | Qisan
Yicai continuously tracks financial news. If you have valuable leads on company developments, industry trends, or financial events, you are welcome to share them. Dedicated mailbox: bianjibu@yicai.com (Note: leads will be verified; your privacy will be strictly protected.) ...
DeepSeek-R1 Makes History: Liang Wenfeng's Paper on the Cover of Nature
Di Yi Cai Jing· 2025-09-17 23:09
The DeepSeek-R1 reasoning-model research paper, completed jointly by the DeepSeek team with Liang Wenfeng as corresponding author, made the cover of the authoritative international journal Nature. Compared with the initial DeepSeek-R1 paper released in January this year, this paper discloses more details of the model's training and directly addresses the distillation allegations raised when the model was first released. DeepSeek-R1 is also the world's first mainstream large language model to undergo peer review. Nature commented that almost no mainstream large model has yet undergone independent peer review, and that this gap has "finally been broken by DeepSeek." ...
Just Now: DeepSeek-R1 Paper Makes the Cover of Nature, Corresponding Author Liang Wenfeng
机器之心· 2025-09-17 17:00
Core Viewpoint
- The article highlights the significance of DeepSeek-R1, recognized as the first large language model (LLM) to pass peer review at a prestigious academic journal, Nature. This achievement marks a pivotal shift in the AI industry toward more rigorous scientific validation of AI models, moving from mere technical competition to a focus on scientific discipline and public trust [5][11][12]

Summary by Sections

DeepSeek-R1 Overview
- DeepSeek-R1 is trained using reinforcement learning, where the model receives rewards for correct answers and penalties for incorrect ones, enabling it to develop reasoning capabilities similar to human problem-solving [7][8]
- The model's ability to self-validate and reflect on its performance enhances its effectiveness in programming and advanced scientific inquiries [7]

Peer Review Significance
- The peer review process serves as a critical gatekeeper, requiring AI companies to substantiate their claims with solid evidence rather than self-promotion [10]
- The rigorous evaluation of DeepSeek-R1's methodology and limitations by external experts helps to mitigate inflated claims in the AI industry [9][10]

Training Methodology
- DeepSeek-R1 employs a novel multi-stage pipeline that enhances reasoning capabilities without relying heavily on supervised data [15]
- The model utilizes Group Relative Policy Optimization (GRPO) to reduce training costs and incorporates a dual reward mechanism based on accuracy and format [16][17]
- A structured training template guides the model to articulate its reasoning process before providing final answers, allowing for clear observation of its learning progress [18]

Performance and Limitations
- DeepSeek-R1 demonstrates advanced self-evolution capabilities, developing higher-order reasoning skills autonomously during training [20]
- Despite its advancements, the model still faces challenges such as poor readability and language mixing in its outputs [21][26]

Cold Start and Reinforcement Learning
- The development team collected a small amount of long Chain of Thought (CoT) data to stabilize the model during the early stages of reinforcement learning [22]
- The integration of language consistency rewards during training aims to improve the model's readability, although it may slightly affect performance [23]

Distillation and Model Efficiency
- The team successfully distilled the reasoning capabilities of DeepSeek-R1 into smaller models, significantly enhancing their performance [29]
- Benchmark tests indicate that DeepSeek-R1 competes effectively with state-of-the-art models in reasoning tasks, showcasing its robust capabilities [30][31]
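The GRPO and dual-reward scheme described in the summary above can be sketched minimally in Python. This is an illustration of the idea only: the helper names, the reward weights, and the group contents are assumptions for the example, not DeepSeek's published values. The key point GRPO makes is that each sampled answer's advantage is measured against its own group's reward statistics, so no separate value (critic) network is needed.

```python
import statistics

def grpo_advantages(rewards):
    """Group Relative Policy Optimization, in miniature: normalize each
    sampled output's reward against the mean and standard deviation of
    its own group, instead of training a critic to estimate a baseline."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against all-equal groups
    return [(r - mean) / std for r in rewards]

def reward(answer, reference, uses_think_tags):
    """Illustrative dual rule-based reward: accuracy (does the final
    answer match the reference?) plus format (did the model wrap its
    reasoning as the training template asks?). The 1.0/0.5 weights
    here are arbitrary placeholders."""
    acc = 1.0 if answer == reference else 0.0
    fmt = 0.5 if uses_think_tags else 0.0
    return acc + fmt

# A group of 4 sampled answers to one prompt: two correct, two wrong.
group = [reward("42", "42", True), reward("41", "42", True),
         reward("42", "42", True), reward("40", "42", False)]
print(grpo_advantages(group))
```

Because the baseline comes from the group itself, outputs that beat their siblings get positive advantage and the rest get negative advantage, which is what pushes the policy toward better reasoning traces without a critic's memory and compute cost.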
Financial Observation: China and ASEAN Join Hands to Build a "Digital Future"
Huan Qiu Shi Bao· 2025-09-16 22:42
Group 1: Core Insights
- The 22nd China-ASEAN Expo focuses on digital economy and AI collaboration, showcasing achievements in these fields [1][3]
- The China-ASEAN Free Trade Area 3.0 negotiations have been completed, with the digital economy identified as a key area for cooperation [1][5]
- China and ASEAN aim to enhance digital infrastructure, e-commerce, and AI collaboration to foster new growth points such as the blue and green economies [1][3]

Group 2: Digital Economy and AI Cooperation
- China-ASEAN cross-border e-commerce has grown at an annual rate exceeding 20%, becoming a significant driver of trade [3]
- ASEAN countries are increasing investments in digital technology, with Indonesia and Vietnam setting ambitious digital-economy targets [3][4]
- The establishment of the AI Innovation Cooperation Center aims to connect Chinese enterprises with ASEAN's AI needs across various sectors [5][6]

Group 3: Market Opportunities and Challenges
- Chinese high-tech companies are increasingly looking to enter the ASEAN market, with significant interest in AI and robotics [7][8]
- The ASEAN market presents both opportunities and challenges for automation, with levels of technological advancement varying widely across countries [7][8]
- The automotive sector is a focal point for collaboration, with Chinese companies introducing AI solutions to enhance competitiveness in ASEAN [9][10]

Group 4: Regional Development and Integration
- Guangxi is positioned as a "bridgehead" for China-ASEAN cooperation in the digital economy and AI [5][12]
- The region is actively promoting AI applications across industries, including agriculture and smart cities, through initiatives such as the "AI Empowerment Super League" [11][12]
- Collaboration between Chinese and ASEAN entities is expected to yield mutual benefits, leveraging each side's strengths in technology and market access [10][11]
X @外汇交易员
外汇交易员· 2025-09-16 06:33
Industry Development & Technology
- Tencent has fully adapted to mainstream domestic chips, aiming to provide cost-effective AI computing power through full-stack optimization of software and hardware [1]
- The industry is addressing the challenge of computing power supply by integrating different types of chips [1]
- DeepSeek-V3.1 uses UE8M0 FP8 scale parameter precision and has made significant adjustments to the tokenizer and chat template [1]
- UE8M0 FP8 is designed for the next generation of domestic chips [1]

Company Strategy
- Tencent is pursuing a software-hardware co-design strategy for full-stack optimization [1]
- Tencent aims to provide cost-effective AI computing power [1]
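As a brief illustration of what "UE8M0 FP8 scale" means: Unsigned, 8 Exponent bits, 0 Mantissa bits, so each scale factor fits in one byte and is a pure power of two. The bias of 127 used below follows the OCP Microscaling (MX) E8M0 convention and is an assumption here, since the post does not spell it out; the function names are illustrative, not from any DeepSeek release.

```python
import math

def ue8m0_to_scale(byte: int) -> float:
    """Decode a UE8M0 byte into its scale factor.

    With zero mantissa bits, the byte is just a biased exponent, so
    every representable scale is an exact power of two. That makes
    applying the scale cheap in hardware: an exponent add, no multiplier.
    Bias of 127 is assumed, following the OCP MX E8M0 format.
    """
    if not 0 <= byte <= 255:
        raise ValueError("UE8M0 is a single byte")
    return 2.0 ** (byte - 127)

def scale_to_ue8m0(scale: float) -> int:
    """Encode a positive power-of-two scale back into a UE8M0 byte."""
    e = int(math.log2(scale)) + 127
    if not 0 <= e <= 255:
        raise ValueError("scale out of UE8M0 range")
    return e

print(ue8m0_to_scale(130))   # byte 130 -> 2**3
print(scale_to_ue8m0(0.5))   # 2**-1 -> byte 126
```

Restricting block scales to powers of two trades a little quantization granularity for much simpler arithmetic, which is one reason such formats suit low-precision accelerators.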
OpenAI Releases GPT-5-Codex: 7 Hours of Independent Coding, Dynamic Resource Adjustment, Lower Token Consumption
Founder Park· 2025-09-16 03:24
Core Insights
- OpenAI has released a new model specifically designed for programming tasks, named GPT-5-Codex, a specialized version of GPT-5 [3][4]
- GPT-5-Codex features a "dual-mode" capability, being both fast and reliable, with improved responsiveness on both small and large tasks [5][6]
- The model can execute large-scale refactoring tasks for up to 7 hours continuously, showcasing its endurance [7]

Performance and Features
- In SWE-bench verification and code-refactoring tasks, GPT-5-Codex outperformed the previous GPT-5-high model, achieving an accuracy of 51.3% compared to 33.9% [9][10]
- The model dynamically adjusts resource allocation based on task complexity, cutting token consumption by 93.7% on simpler tasks while spending twice as long on more complex requests [12][13]
- Code-review capability has improved significantly, with incorrect comments dropping from 13.7% to 4.4% and high-impact comments rising from 39.4% to 52.4% [16][18]

Integration and User Experience
- The model supports multiple modes of interaction, including terminal "vibe coding", IDE editing, and GitHub integration, catering to varied developer preferences [32]
- OpenAI emphasizes the importance of "harnessing" the model, integrating it with infrastructure so it can execute real-world tasks [29][34]
- User experience is enhanced by code-completion response times under 1.5 seconds, crucial for maintaining developer productivity [30]

Competitive Landscape
- The release of GPT-5-Codex intensifies competition in programming AI, with various domestic and international players developing similar programming agents [45][46]
- Notable competitors include Cursor, Gemini CLI, and Claude Code, which focus on execution capability and seamless integration with development environments [51][52]
- The market is evolving rapidly, with many companies racing to establish their programming AI solutions, pointing to a significant shift in software development practices by 2030 [43][54]