8点1氪: Xibei responds to the "serving chopsticks used to feed a dog" incident; the Federal Reserve announces a 25-basis-point rate cut; DeepSeek's Liang Wenfeng paper makes the cover of Nature
36Kr· 2025-09-18 00:19
Group 1
- The incident at a Xibei restaurant involved customers using restaurant utensils to feed a pet dog, raising concerns about dining safety [4]
- The restaurant confirmed that all utensils used by the customers were discarded and that the premises were thoroughly disinfected [4]
- Local authorities stated there are currently no legal grounds to penalize the restaurant for allowing pets, as the customers' actions were deemed personal behavior [4]

Group 2
- The Federal Reserve announced a 25-basis-point cut in the federal funds rate, its first rate decrease since December 2024 [4]

Group 3
- NIO Group completed a $1.16 billion financing round aimed at enhancing its technological capabilities and expanding its charging infrastructure [20]
- AI chip startup Groq raised $750 million in a new funding round at a post-money valuation of $6.9 billion [20]
- "Qingyun New Materials" announced the completion of a Series C round worth several hundred million yuan to support the development of advanced materials [20]

Group 4
- Lemon prices have roughly doubled over the past year, from 7.83 yuan per kilogram to 15 yuan per kilogram, leading to supply shortages at some stores [15]
- China's mooncake industry is shifting from seasonal demand to year-round consumption, with more than 20,000 related enterprises currently registered [24]
Just now: Liang Wenfeng publishes in Nature
36Kr· 2025-09-17 23:43
Last night, DeepSeek made history once again! Zhidx reported on September 18 that on September 17, the DeepSeek-R1 reasoning model research paper, completed jointly by the DeepSeek team with Liang Wenfeng as corresponding author, appeared on the cover of the authoritative international journal Nature.

The DeepSeek-R1 paper was the first to publicly show that reinforcement learning alone can elicit reasoning capabilities in large models, a result that has inspired AI researchers worldwide. The model has also become the world's most popular open-source reasoning model, with more than 10.9 million downloads on Hugging Face. The recognition from Nature is well deserved.

At the same time, DeepSeek-R1 is the world's first mainstream large language model to undergo peer review. In an editorial, Nature commented that almost none of the mainstream large models have been independently peer reviewed, a gap that "has finally been broken by DeepSeek." Nature also noted that unverified claims and hype have become "commonplace" in the AI industry, and that everything DeepSeek has done is "a welcome step toward transparency and reproducibility."

Nature cover title: Self-help: reinforcement learning teaches large models to improve themselves.

The new version of the DeepSeek-R1 paper published in Nature differs considerably from the un-peer-reviewed initial version from January this year, disclosing more details of model training and directly addressing the distillation allegations raised when the model was first released. ...
Just now! DeepSeek's Liang Wenfeng paper makes the cover of Nature!
是说芯语· 2025-09-17 23:35
Core Viewpoint
- The DeepSeek-R1 reasoning model research paper, led by Liang Wenfeng, has been published in the prestigious journal Nature, marking a significant milestone for AI and large language models [1][3]

Group 1: Model Development and Validation
- The latest paper provides more detailed insight into the training of the DeepSeek-R1 model than the initial version released in January [3]
- DeepSeek-R1 is recognized as the first mainstream large language model to undergo peer review, addressing earlier concerns about its distillation process [3]
- Nature frames peer review as a necessary step to curb the risks posed by unverified claims in the AI industry [5]

Group 2: Data and Safety Assessment
- DeepSeek-V3 Base, the foundation model for DeepSeek-R1, was trained entirely on data sourced from the internet, which may include outputs generated by GPT-4, though not intentionally [5]
- Supplementary materials describe in detail how data contamination was minimized during training and state that benchmark test items were not deliberately included to inflate model performance (a generic sketch of such a contamination filter follows this summary) [5]
- A comprehensive safety assessment of DeepSeek-R1 shows that its safety performance is superior to that of contemporaneous models [5]
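The contamination-minimization step mentioned above is usually implemented as an n-gram overlap filter between training documents and benchmark test sets. The Python sketch below shows one common form of such a filter; it is a generic illustration rather than DeepSeek's documented procedure, and the n-gram size, function names, and toy data are assumptions.

```python
def ngrams(text: str, n: int = 10) -> set[tuple[str, ...]]:
    """Lower-cased word n-grams used as a fingerprint of a document."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def is_contaminated(train_doc: str, benchmark_items: list[str], n: int = 10) -> bool:
    """Flag a training document that shares any long n-gram with a benchmark item."""
    doc_grams = ngrams(train_doc, n)
    return any(doc_grams & ngrams(item, n) for item in benchmark_items)


# Toy usage: drop training documents that overlap with held-out benchmark questions.
benchmark = ["What is the integral of x squared from zero to one?"]
corpus = [
    "A blog post about training tips for large models.",
    "Exam answer key: what is the integral of x squared from zero to one? It is 1/3.",
]
clean_corpus = [doc for doc in corpus if not is_contaminated(doc, benchmark, n=8)]
print(len(clean_corpus))  # 1, the overlapping document was removed
```

In practice the n-gram length and the overlap threshold are tuning choices: a longer n reduces false positives but can miss paraphrased leaks.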
DeepSeek's Liang Wenfeng paper makes the cover of Nature
Di Yi Cai Jing· 2025-09-17 23:23
2025.09.18 | Author: Yicai Tech

The DeepSeek-R1 reasoning model research paper, completed jointly by the DeepSeek team with Liang Wenfeng as corresponding author, has made the cover of the authoritative international journal Nature.

Compared with the initial DeepSeek-R1 paper released in January this year, this paper discloses more details of model training and directly addresses the distillation allegations raised when the model was first released.

DeepSeek-R1 is also the world's first mainstream large language model to undergo peer review. Nature commented that almost none of the mainstream large models have yet been independently peer reviewed, a gap that "has finally been broken by DeepSeek." ...
DeepSeek-R1 makes history: Liang Wenfeng's paper lands on the cover of Nature
Di Yi Cai Jing· 2025-09-17 23:09
Compared with the initial DeepSeek-R1 paper released in January this year, this version discloses more details of model training and directly addresses the distillation allegations raised when the model was first released. DeepSeek-R1 is also the world's first mainstream large language model to undergo peer review. Nature commented that almost none of the mainstream large models have yet been independently peer reviewed, a gap that "has finally been broken by DeepSeek." The DeepSeek-R1 reasoning model research paper, completed jointly by the DeepSeek team with Liang Wenfeng as corresponding author, has made the cover of the authoritative international journal Nature. ...
Just now: the DeepSeek-R1 paper makes the cover of Nature, with Liang Wenfeng as corresponding author
机器之心· 2025-09-17 17:00
Core Viewpoint
- The article highlights the significance of DeepSeek-R1, recognized as the first large language model (LLM) to pass peer review at a prestigious academic journal, Nature. The achievement marks a pivotal shift in the AI industry toward rigorous scientific validation, moving from mere technical competition to scientific discipline and public trust [5][11][12]

Summary by Sections

DeepSeek-R1 Overview
- DeepSeek-R1 is trained with reinforcement learning: the model is rewarded for correct answers and penalized for incorrect ones, enabling it to develop reasoning capabilities similar to human problem-solving [7][8]
- The model's ability to self-validate and reflect on its own outputs enhances its effectiveness in programming and advanced scientific inquiries [7]

Peer Review Significance
- The peer review process serves as a critical gatekeeper, requiring AI companies to substantiate their claims with solid evidence rather than self-promotion [10]
- Rigorous evaluation of DeepSeek-R1's methodology and limitations by external experts helps mitigate inflated claims in the AI industry [9][10]

Training Methodology
- DeepSeek-R1 employs a novel multi-stage pipeline that enhances reasoning capabilities without relying heavily on supervised data [15]
- The model uses Group Relative Policy Optimization (GRPO) to reduce training costs and a dual reward mechanism based on answer accuracy and output format (a sketch of these reward and advantage computations follows this summary) [16][17]
- A structured training template guides the model to articulate its reasoning process before providing the final answer, allowing clear observation of its learning progress [18]

Performance and Limitations
- DeepSeek-R1 demonstrates advanced self-evolution, developing higher-order reasoning skills autonomously during training [20]
- Despite these advances, the model still suffers from poor readability and language mixing in its outputs [21][26]

Cold Start and Reinforcement Learning
- The development team collected a small amount of long chain-of-thought (CoT) data to stabilize the model during the early stages of reinforcement learning [22]
- A language-consistency reward added during training improves the model's readability, though it may slightly reduce raw performance [23]

Distillation and Model Efficiency
- The team distilled the reasoning capabilities of DeepSeek-R1 into smaller models, significantly enhancing their performance [29]
- Benchmark tests indicate that DeepSeek-R1 competes effectively with state-of-the-art models on reasoning tasks, showcasing its robust capabilities [30][31]
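To make the GRPO and dual-reward description above concrete, here is a minimal Python sketch of rule-based accuracy and format rewards combined with group-relative advantage normalization, the core idea behind Group Relative Policy Optimization. It is an illustration under stated assumptions rather than DeepSeek's actual implementation: the template wording, function names, and toy completions are hypothetical, and a real training loop would feed these advantages into a clipped policy-gradient update.

```python
import re
import statistics

# Hypothetical prompt template in the spirit of the structured template described above:
# the model reasons inside <think> tags before answering inside <answer> tags.
TEMPLATE = (
    "A conversation between User and Assistant. The Assistant first reasons inside "
    "<think> ... </think> and then gives the final answer inside <answer> ... </answer>.\n"
    "User: {question}\nAssistant:"
)


def format_reward(completion: str) -> float:
    """Rule-based format reward: 1.0 if the output follows the <think>/<answer> layout."""
    ok = re.fullmatch(r"(?s)\s*<think>.*</think>\s*<answer>.*</answer>\s*", completion)
    return 1.0 if ok else 0.0


def accuracy_reward(completion: str, reference: str) -> float:
    """Rule-based accuracy reward: 1.0 if the extracted answer matches the reference."""
    m = re.search(r"(?s)<answer>(.*?)</answer>", completion)
    answer = m.group(1).strip() if m else ""
    return 1.0 if answer == reference.strip() else 0.0


def group_relative_advantages(rewards: list[float]) -> list[float]:
    """Normalize each sampled completion's reward against the mean and standard
    deviation of its own group, so no separately trained value model is needed."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # avoid division by zero for uniform groups
    return [(r - mean) / std for r in rewards]


# Toy usage: build a prompt and score a group of sampled completions for it.
prompt = TEMPLATE.format(question="What is 3 * 4?")
completions = [
    "<think>3 * 4 = 12</think><answer>12</answer>",
    "<think>maybe 13?</think><answer>13</answer>",
    "The answer is 12.",  # correct content but wrong format, so it earns no reward here
]
rewards = [accuracy_reward(c, "12") + format_reward(c) for c in completions]
print(group_relative_advantages(rewards))
```

The design point the summary alludes to is that both reward terms are cheap, deterministic rules, and the group-relative normalization replaces a learned critic, which is what keeps training costs down.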
Financial Observation: China and ASEAN join hands to build a "digital future"
Huan Qiu Shi Bao· 2025-09-16 22:42
Group 1: Core Insights
- The 22nd China-ASEAN Expo focuses on the digital economy and AI collaboration, showcasing achievements in these fields [1][3]
- Negotiations on the China-ASEAN Free Trade Area 3.0 have been completed, with the digital economy emphasized as a key area for cooperation [1][5]
- China and ASEAN aim to enhance digital infrastructure, e-commerce, and AI collaboration to foster new growth points such as the blue and green economies [1][3]

Group 2: Digital Economy and AI Cooperation
- China-ASEAN cross-border e-commerce has grown at an annual rate exceeding 20%, becoming a significant driver of trade [3]
- ASEAN countries are increasing investment in digital technology, with Indonesia and Vietnam setting ambitious digital economy targets [3][4]
- The establishment of the AI Innovation Cooperation Center aims to connect Chinese enterprises with ASEAN's AI needs across various sectors [5][6]

Group 3: Market Opportunities and Challenges
- Chinese high-tech companies are increasingly looking to enter the ASEAN market, with significant interest in AI and robotics [7][8]
- The ASEAN market presents both opportunities and challenges for automation, given the varying levels of technological advancement across countries [7][8]
- The automotive sector is a focal point for collaboration, with Chinese companies introducing AI solutions to enhance competitiveness in ASEAN [9][10]

Group 4: Regional Development and Integration
- Guangxi is positioned as a "bridgehead" for China-ASEAN cooperation in the digital economy and AI [5][12]
- The region is actively promoting AI applications in industries such as agriculture and smart cities through initiatives like the "AI Empowerment Super League" [11][12]
- Collaboration between Chinese and ASEAN entities is expected to yield mutual benefits by leveraging each other's strengths in technology and market access [10][11]
X @外汇交易员
外汇交易员· 2025-09-16 06:33
Industry Development & Technology
- Tencent has fully adapted its stack to mainstream domestic chips, aiming to deliver cost-effective AI computing power through full-stack optimization of software and hardware [1]
- The industry is addressing constrained computing power supply by integrating different types of chips [1]
- DeepSeek-V3.1 uses the UE8M0 FP8 scale parameter precision and makes significant adjustments to its tokenizer and chat template (a sketch of the E8M0 scale format follows this summary) [1]
- UE8M0 FP8 is designed for the next generation of domestic chips [1]

Company Strategy
- Tencent is pursuing full-stack optimization through software-hardware co-design [1]
- Tencent aims to provide cost-effective AI computing power [1]
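The "UE8M0 FP8 scale" mentioned above refers to an 8-bit, exponent-only scale format: each block of FP8 values shares a single power-of-two scale factor, which is cheap to decode in hardware. The sketch below illustrates the idea under the assumption that UE8M0 follows the OCP Microscaling E8M0 convention (no sign, 8 exponent bits, no mantissa, bias 127); it is not taken from DeepSeek's or Tencent's code, and the function names and numbers are hypothetical.

```python
import math

E8M0_BIAS = 127  # assumed bias, following the OCP Microscaling (MX) E8M0 convention


def encode_ue8m0(scale: float) -> int:
    """Encode a positive block-scale factor as an 8-bit exponent-only (E8M0) code.

    E8M0 stores only an exponent, with no sign and no mantissa, so every representable
    scale is a power of two: scale = 2 ** (code - bias).
    """
    if scale <= 0:
        raise ValueError("E8M0 scales must be positive")
    code = round(math.log2(scale)) + E8M0_BIAS
    return max(0, min(254, code))  # clamp to the representable exponent codes


def decode_ue8m0(code: int) -> float:
    """Decode an E8M0 code back into its power-of-two scale."""
    return 2.0 ** (code - E8M0_BIAS)


# Toy usage: pick a per-block scale so the block's values fit the FP8 E4M3 range,
# then snap that scale to the nearest power of two as E8M0 requires.
block_absmax = 48.0   # hypothetical max magnitude within one tensor block
fp8_max = 448.0       # max normal magnitude of FP8 E4M3
code = encode_ue8m0(block_absmax / fp8_max)
print(code, decode_ue8m0(code))  # 124 -> scale of 0.125
```

Because the scale is a pure power of two, applying it reduces to an exponent adjustment rather than a full multiply, which is part of why this format suits low-precision accelerator hardware.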
OpenAI releases GPT-5-Codex: 7 hours of independent coding, dynamic resource adjustment, lower token consumption
Founder Park· 2025-09-16 03:24
Core Insights
- OpenAI has released GPT-5-Codex, a version of GPT-5 specialized for programming tasks [3][4]
- GPT-5-Codex offers a "dual-mode" capability, both fast and reliable, with improved responsiveness on small and large tasks alike [5][6]
- The model can execute large-scale refactoring tasks continuously for up to 7 hours, showcasing its efficiency [7]

Performance and Features
- On SWE-bench validation and code-refactoring tasks, GPT-5-Codex outperformed the previous GPT-5-high model, reaching 51.3% accuracy versus 33.9% [9][10]
- The model dynamically adjusts resource allocation to task complexity, cutting token consumption by 93.7% on simpler tasks while doubling the processing time spent on more complex requests [12][13]
- Code review quality improved markedly: incorrect comments dropped from 13.7% to 4.4%, and high-impact comments rose from 39.4% to 52.4% [16][18]

Integration and User Experience
- The model supports multiple modes of interaction, including terminal vibe coding, IDE editing, and GitHub integration, catering to varied developer preferences [32]
- OpenAI emphasizes "harnessing" the model, integrating it with infrastructure so it can execute real-world tasks [29][34]
- Code-completion responses arrive in under 1.5 seconds, which is crucial for maintaining developer productivity [30]

Competitive Landscape
- The release of GPT-5-Codex intensifies competition in programming AI, with domestic and international players building similar programming agents [45][46]
- Notable competitors include Cursor, Gemini CLI, and Claude Code, which focus on execution capability and seamless integration with development environments [51][52]
- The market is evolving rapidly, with many companies racing to establish their programming AI solutions, pointing to a significant shift in software development practices by 2030 [43][54]
'DeepSeek is only the beginning' for #China says professor #tech
Bloomberg Television· 2025-09-15 21:00
How should we look at the Chinese economy right now? Um, still tested. I'd say resilient in some ways if you look at the macro numbers, but still tested by deflationary pressures, with real estate way down. But you know, what I found out this summer was that there's a real dichotomy between how strong high-tech is going forward, DeepSeek is really only the beginning, and how weak, uh, the micro-level economy is on consumption and all that. Like, you know, they're leaning off of policy bu ...