DeepSeek
DeepSeek paper with Liang Wenfeng as corresponding author lands on the cover of Nature
第一财经· 2025-09-17 23:23
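2025.09.18 Author | Yicai Tech (一财科技) The DeepSeek-R1 reasoning model research paper, completed jointly by the DeepSeek team with Liang Wenfeng as corresponding author, has appeared on the cover of the leading international journal Nature. Compared with the initial version of the DeepSeek-R1 paper released in January this year, this paper discloses more details of model training and directly addresses the distillation doubts raised when the model was first released. DeepSeek-R1 is also the world's first mainstream large language model to undergo peer review. Nature commented that almost none of today's mainstream large models have been through independent peer review, a gap that "has finally been broken by DeepSeek". WeChat editor | 七三 ...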
DeepSeek-R1 makes history: Liang Wenfeng's paper lands on the cover of Nature
Di Yi Cai Jing· 2025-09-17 23:09
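Compared with the initial version of the DeepSeek-R1 paper released in January this year, this paper discloses more details of model training and directly addresses the distillation doubts raised when the model was first released. DeepSeek-R1 is also the world's first mainstream large language model to undergo peer review. Nature commented that almost none of today's mainstream large models have been through independent peer review, a gap that "has finally been broken by DeepSeek". The DeepSeek-R1 reasoning model research paper, completed jointly by the DeepSeek team with Liang Wenfeng as corresponding author, has appeared on the cover of the leading international journal Nature. ...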
Just in: DeepSeek-R1 paper lands on the cover of Nature, with Liang Wenfeng as corresponding author
机器之心· 2025-09-17 17:00
Core Viewpoint
- The article highlights the significance of DeepSeek-R1, which is recognized as the first large language model (LLM) to pass peer review in a prestigious academic journal, Nature. This achievement marks a pivotal shift in the AI industry towards more rigorous scientific validation of AI models, moving from mere technical competition to a focus on scientific discipline and public trust [5][11][12].
Summary by Sections
DeepSeek-R1 Overview
- DeepSeek-R1 is trained using reinforcement learning, where the model receives rewards for correct answers and penalties for incorrect ones, enabling it to develop reasoning capabilities similar to human problem-solving [7][8].
- The model's ability to self-validate and reflect on its performance enhances its effectiveness in programming and advanced scientific inquiries [7].
Peer Review Significance
- The peer review process serves as a critical gatekeeper, requiring AI companies to substantiate their claims with solid evidence rather than self-promotion [10].
- The rigorous evaluation of DeepSeek-R1's methodology and limitations by external experts helps to mitigate inflated claims in the AI industry [9][10].
Training Methodology
- DeepSeek-R1 employs a novel multi-stage pipeline that enhances reasoning capabilities without relying heavily on supervised data [15].
- The model utilizes Group Relative Policy Optimization (GRPO) to reduce training costs and incorporates a dual reward mechanism based on accuracy and format [16][17].
- A structured training template guides the model to articulate its reasoning process before providing final answers, allowing for clear observation of its learning progress [18].
Performance and Limitations
- DeepSeek-R1 demonstrates advanced self-evolution capabilities, developing higher-order reasoning skills autonomously during training [20].
- Despite its advancements, the model still faces challenges such as poor readability and language mixing in its outputs [21][26].
Cold Start and Reinforcement Learning
- The development team collected a small amount of long Chain of Thought (CoT) data to stabilize the model during the early stages of reinforcement learning [22].
- The integration of language consistency rewards during training aims to improve the model's readability, although it may slightly affect performance [23].
Distillation and Model Efficiency
- The team successfully distilled the reasoning capabilities of DeepSeek-R1 into smaller models, significantly enhancing their performance [29].
- Benchmark tests indicate that DeepSeek-R1 competes effectively with state-of-the-art models in reasoning tasks, showcasing its robust capabilities [30][31].
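The dual reward and the group-relative scoring described in the summary can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not DeepSeek's implementation: the `<think>`/`<answer>` tag names, the exact-match answer check, the equal weighting of the two rewards, and all function names are hypothetical. It shows only the advantage step of GRPO, where each sampled answer is scored against the other answers in its group (reward minus group mean, divided by group standard deviation), so no separate value model has to be trained. The clipped policy update and the KL penalty are omitted.

```python
import re
from statistics import mean, pstdev

# Assumed template: reasoning first, then the final answer.
# The tag names are illustrative, not taken from the article.
TEMPLATE = re.compile(r"<think>(.+?)</think>\s*<answer>(.+?)</answer>", re.DOTALL)


def format_reward(completion: str) -> float:
    """Return 1.0 if the completion follows the reasoning-then-answer template."""
    return 1.0 if TEMPLATE.fullmatch(completion.strip()) else 0.0


def accuracy_reward(completion: str, reference: str) -> float:
    """Return 1.0 if the extracted final answer equals the reference string."""
    match = TEMPLATE.fullmatch(completion.strip())
    if match is None:
        return 0.0
    return 1.0 if match.group(2).strip() == reference.strip() else 0.0


def total_reward(completion: str, reference: str) -> float:
    """Sum of the two rule-based rewards (equal weights are an assumption)."""
    return accuracy_reward(completion, reference) + format_reward(completion)


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize each reward against the mean and std of its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # A group of sampled completions for one prompt ("What is 2 + 2?").
    group = [
        "<think>2 + 2 = 4</think><answer>4</answer>",
        "<think>a guess</think><answer>5</answer>",
        "4",
    ]
    rewards = [total_reward(c, "4") for c in group]
    print(rewards)
    print(group_relative_advantages(rewards))
```

Completions that are both correct and well-formed end up with positive advantages, while answers below the group average receive negative ones, which is the "reward for correct, penalty for incorrect" behaviour the summary describes.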
Financial Watch: China and ASEAN join hands to build a "digital future"
Huan Qiu Shi Bao· 2025-09-16 22:42
Group 1: Core Insights
- The 22nd China-ASEAN Expo focuses on digital economy and AI collaboration, showcasing achievements in these fields [1][3]
- The China-ASEAN Free Trade Area 3.0 negotiations have been completed, emphasizing digital economy as a key area for cooperation [1][5]
- China and ASEAN aim to enhance digital infrastructure, e-commerce, and AI collaboration to foster new growth points like blue and green economies [1][3]
Group 2: Digital Economy and AI Cooperation
- China-ASEAN cross-border e-commerce has grown at an annual rate exceeding 20%, becoming a significant driver of trade [3]
- ASEAN countries are increasing investments in digital technology, with Indonesia and Vietnam setting ambitious digital economy targets [3][4]
- The establishment of the AI Innovation Cooperation Center aims to connect Chinese enterprises with ASEAN's AI needs across various sectors [5][6]
Group 3: Market Opportunities and Challenges
- Chinese high-tech companies are increasingly looking to enter the ASEAN market, with significant interest in AI and robotics [7][8]
- The ASEAN market presents both opportunities and challenges for automation, with varying levels of technological advancement across countries [7][8]
- The automotive sector is a focal point for collaboration, with Chinese companies introducing AI solutions to enhance competitiveness in ASEAN [9][10]
Group 4: Regional Development and Integration
- Guangxi is positioned as a "bridgehead" for China-ASEAN cooperation in the digital economy and AI [5][12]
- The region is actively promoting AI applications in various industries, including agriculture and smart cities, through initiatives like the "AI Empowerment Super League" [11][12]
- The collaboration between Chinese and ASEAN entities is expected to yield mutual benefits, leveraging each other's strengths in technology and market access [10][11]
X @外汇交易员
外汇交易员· 2025-09-16 06:33
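Qiu Yuepeng, Vice President of Tencent Group and President of Tencent Cloud, announced on Tuesday that Tencent has fully adapted to mainstream domestic chips. The move aims to integrate different types of chips through a full-stack optimization strategy of hardware-software co-design and to offer cost-effective AI computing power externally, in response to the closely watched challenge of compute supply. "Today our GPU computing resources are increasingly heterogeneous, and many domestic chips are steadily improving their computing performance." 外汇交易员 (@myfxtrader): In its official announcement of DeepSeek-V3.1, DeepSeek mentioned that DeepSeek-V3.1 uses the UE8M0 FP8 Scale parameter precision. In addition, V3.1 makes substantial changes to the tokenizer and chat template, which differ markedly from those of DeepSeek-V3. In a pinned comment, DeepSeek's official account stated that UE8M0 FP8 is designed for the next generation of domestic chips, which is due to be released soon. https://t.co/ydxMxF53VL ...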
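For readers unfamiliar with the term, the following is a rough sketch of what a "UE8M0" scale could look like in code. It assumes that UE8M0 denotes an unsigned value with 8 exponent bits and 0 mantissa bits, so that every representable scale is a power of two, and that the exponent bias is 127 as in the E8M0 scale type of the OCP Microscaling formats. The post does not spell out these details, and the function names and the round-up policy are hypothetical, not DeepSeek's method.

```python
import math

# Assumption: exponent bias of 127, as in the E8M0 scale type of the
# OCP Microscaling formats; the post itself does not state the bias.
E8M0_BIAS = 127
MAX_CODE = 254  # assumption: code 255 is reserved (NaN) and never produced


def encode_ue8m0(scale: float) -> int:
    """Round a positive scale up to a power of two and return its exponent byte."""
    if not math.isfinite(scale) or scale <= 0.0:
        raise ValueError("scale must be a positive finite number")
    exponent = math.ceil(math.log2(scale))
    return min(max(exponent + E8M0_BIAS, 0), MAX_CODE)


def decode_ue8m0(code: int) -> float:
    """Turn an exponent byte back into the power-of-two scale it represents."""
    if not 0 <= code <= MAX_CODE:
        raise ValueError("code out of range")
    return 2.0 ** (code - E8M0_BIAS)


if __name__ == "__main__":
    code = encode_ue8m0(0.3)
    print(code, decode_ue8m0(code))
```

Because such a scale has no mantissa, applying it only shifts the exponent of the scaled values, which is one plausible reason a format of this kind would be attractive for hardware.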
OpenAI releases GPT-5-Codex: codes independently for 7 hours, dynamically adjusts resources, and consumes fewer tokens
Founder Park· 2025-09-16 03:24
Core Insights
- OpenAI has released a new model specifically designed for programming tasks, named GPT-5-Codex, which is a specialized version of GPT-5 [3][4]
- GPT-5-Codex features a "dual-mode" capability, being both fast and reliable, with improved responsiveness for both small and large tasks [5][6]
- The model can execute large-scale refactoring tasks for up to 7 hours continuously, showcasing its efficiency [7]
Performance and Features
- In SWE-bench validation and code refactoring tasks, GPT-5-Codex outperformed the previous model, GPT-5-high, achieving an accuracy rate of 51.3% compared to 33.9% [9][10]
- The model dynamically adjusts resource allocation based on task complexity, reducing token consumption by 93.7% for simpler tasks while doubling the processing time for more complex requests [12][13]
- GPT-5-Codex has significantly improved code review capabilities, with incorrect comments dropping from 13.7% to 4.4% and high-impact comments increasing from 39.4% to 52.4% [16][18]
Integration and User Experience
- The model supports multi-modal interactions, including terminal vibe coding, IDE editing, and GitHub integration, catering to various developer preferences [32]
- OpenAI emphasizes the importance of "harnessing" the model, integrating it with infrastructure to enable real-world task execution [29][34]
- The user experience is enhanced with a response time of less than 1.5 seconds for code completion, crucial for maintaining developer productivity [30]
Competitive Landscape
- The release of GPT-5-Codex intensifies the competition in the programming AI space, with various domestic and international players developing similar programming agents [45][46]
- Notable competitors include Cursor, Gemini CLI, and Claude Code, which focus on execution capabilities and seamless integration with development environments [51][52]
- The market is rapidly evolving, with many companies racing to establish their programming AI solutions, indicating a significant shift in software development practices by 2030 [43][54]
'DeepSeek is only the beginning' for #China says professor #tech
Bloomberg Television· 2025-09-15 21:00
How should we look at the Chinese economy right now? Um, still tested. I'd say resilient in some ways if you look at the macro numbers, but still tested by deflationary pressures, real estate way down. But you know what I found out this summer was that there's a real dichotomy between how good high-tech is, how really strong high-tech is going forward. DeepSeek is really only the beginning, but how weak, uh, the micro-level economy is on consumption, all that. Like, you know, they're leaning off of policy bu ...
X @Bloomberg
Bloomberg· 2025-09-15 12:47
"DeepSeek is just the beginning." Economics professor and author Keyu Jin tells @flacqua that China is "going after a drastic cost-cutting innovation" for its economic path forward https://t.co/3e1Reko1kT https://t.co/WWhtv2aGnU ...
Luo Yonghao proposes a public livestream face-off with Jia Guolong; Unitree named one of MIT's "Smart Companies"
21 Shi Ji Jing Ji Bao Dao· 2025-09-15 02:53
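Compiled by the New Quality Productive Forces Research Institute of 21 Shi Ji Jing Ji Bao Dao. Good morning, a new day has begun. What interesting things happened in the tech industry over the past 24 hours? Take a look with 21tech. [Big Tech Watch] Luo Yonghao proposes a public livestream face-off with Jia Guolong. On September 14, a screenshot of remarks made by Xibei founder Jia Guolong in an industry group chat was leaked. Jia Guolong said: "The way I responded was wrong, and I will change it. Those who cook revolve around those who eat; whatever you say is good, that is what we will do." He also said that "Luo Yonghao is an online smear merchant, an online mafia, truly awful. But he woke me up, which in a roundabout way helps Xibei improve." In the early hours of September 15, Luo Yonghao responded in a post to Jia Guolong's remarks in the industry group that mentioned him: "Mr. Jia, you say I am an online mafia; I consider that slander and defamation. In this affair, with me saying a few words and you saying a few words, it is easy for each side to talk past the other, and as the media relay things back and forth the information easily gets distorted. Let us find a major online platform and have a livestreamed conversation, face to face, fair, impartial, calm and rational. I believe this can also clarify the truth about Xibei and contribute something to the healthy development of China's prepared-meal industry and restaurant industry @西贝贾国龙." DeepSeek, Unitree (宇树科技) and others named Smart Companies by MIT Technology Review. On September 12, the latest results of MIT Technology Review's "50 Smart Companies" selection were announced; DeepSeek, Unitree and other star startups ...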
OpenAI to cut its revenue share with Microsoft to 8%, gaining $50 billion; DeepSeek, Unitree and others named Smart Companies by MIT Technology Review | AIGC Daily
创业邦· 2025-09-15 00:08
Group 1
- Penske Media Corporation (PMC) has filed a lawsuit against Google, accusing the tech giant of illegally using its news content to generate AI summaries, resulting in decreased website traffic [2]
- DeepSeek and Unitree Robotics (宇树科技) have been recognized as "smart companies" by MIT Technology Review, highlighting their innovative use of technology and understanding of market opportunities [2]
- A study from the University of Turku in Finland reveals that GPT-4V can assess social situations similarly to humans, which could enhance efficiency in brain science experiments and have applications in medical, security, and market analysis fields [2]
Group 2
- OpenAI plans to reduce its revenue-sharing ratio with Microsoft to 8%, which is expected to generate an additional $50 billion in revenue [2]