DeepSeek
DeepSeek R1 Hallucination Rate Cut by Up to 50%; Users Call for an R2 Model
Di Yi Cai Jing· 2025-05-29 14:10
Core Insights
- The updated R1 model from DeepSeek has significantly improved capabilities, most notably a lower "hallucination" rate, which previously stood at around 21% [1][4].

Model Performance
- The new R1 model achieved top-tier results on various benchmark tests, surpassing all domestic models and approaching international leaders such as o3 and Gemini-2.5-Pro [4].
- The hallucination rate fell by approximately 45%-50% in tasks such as rewriting, summarization, and reading comprehension, giving more accurate and reliable results [4][18].
- On the AIME 2025 test of complex reasoning, the model's accuracy improved from 70% to 87.5% [18].

Model Features and Capabilities
- The updated R1 model can generate longer and more structured writing, including essays, novels, and prose, while aligning more closely with human writing styles [18].
- The model's coding capabilities have also improved significantly, performing nearly on par with OpenAI's o3-high model in code testing environments [18].
- The new model has 685 billion parameters, and the open-source version supports a context length of 128K [19].

Future Developments
- The industry is eagerly anticipating the next-generation R2 model, and users have been calling for its release [19].
- DeepSeek has not commented on speculation about R2, but competition among foundation models remains intense [19].
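The percentages in the summary above mix relative and absolute changes, so a short arithmetic sketch may help in reading them. It assumes, purely for illustration, that the "around 21%" baseline and the 45%-50% reduction refer to the same tasks, which the summary does not state; the function names are hypothetical.

```python
def apply_relative_reduction(rate: float, reduction: float) -> float:
    """Return what remains of `rate` after a relative cut of `reduction` (both fractions)."""
    return rate * (1.0 - reduction)


def relative_gain(before: float, after: float) -> float:
    """Return the change from `before` to `after` as a fraction of `before`."""
    return (after - before) / before


# Figures quoted in the summary above.
BASELINE_HALLUCINATION = 0.21    # "around 21%" for the previous R1
REDUCTION_RANGE = (0.45, 0.50)   # "approximately 45%-50%"
AIME_BEFORE, AIME_AFTER = 70.0, 87.5

if __name__ == "__main__":
    # Assumption: the baseline and the reduction describe the same task mix.
    for cut in REDUCTION_RANGE:
        remaining = apply_relative_reduction(BASELINE_HALLUCINATION, cut)
        print(f"{cut:.0%} relative cut -> {remaining:.2%} hallucination rate")

    points = AIME_AFTER - AIME_BEFORE
    gain = relative_gain(AIME_BEFORE, AIME_AFTER)
    print(f"AIME 2025: +{points:.1f} points, {gain:.0%} relative gain")
```

Under that assumption, a 45%-50% relative cut would leave a hallucination rate of roughly 10.5% to 11.6%, and the AIME 2025 change amounts to 17.5 percentage points, or a 25% relative gain.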
DeepSeek R1 Update Officially Announced: Markedly Deeper Thinking and Stronger Reasoning, with the "Hallucination" Problem Reduced
Xin Lang Ke Ji· 2025-05-29 12:40
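Sina Tech, evening of May 29: DeepSeek announced today that the DeepSeek R1 model has completed a minor version upgrade; the current version is DeepSeek-R1-0528. Users who open the chat interface through the official website, app, or mini program can try the latest version by turning on the "Deep Thinking" feature. The API has been updated at the same time, and the way it is called is unchanged.

Tool calling: DeepSeek-R1-0528 supports tool calling (tool calls inside thinking are not supported).

According to the announcement, DeepSeek-R1-0528 still uses the DeepSeek V3 Base model released in December 2024 as its base, but more compute was invested in post-training, which significantly improved the model's depth of thinking and reasoning ability. DeepSeek says the updated R1 achieved the best results among all current domestic models on multiple benchmarks covering mathematics, programming, and general logic, and that its overall performance is close to that of other top international models such as o3 and Gemini-2.5-Pro.

Other capability updates include hallucination improvement: the new DeepSeek R1 has been optimized for the "hallucination" problem. Compared with the old version, the updated model's hallucination rate is about 45-50% lower in scenarios such as rewriting and polishing, summarization, and reading comprehension, so it provides more accurate and reliable results. Creative writing: building on the old R1 ...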
DeepSeek R1 Quietly Updates, Beating Large Models with a "Minor Version"
Hu Xiu· 2025-05-29 09:52
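The upgraded DeepSeek-R1-0528 has now been fully rolled out on the official web page, app, mini program, and so on, and the API is also available.

As for how much goodwill DeepSeek shows its users, we already saw it with the V3 upgrade: the large jump in model performance was only the appetizer, and the cost-to-price ratio was optimized yet again. This update is the same: the new DeepSeek-R1 improves mainly, and substantially, in programming ability. According to OpenRouter, an LLM API access site, the input and output prices of the new R1 are almost unchanged from the previous version!

| | DeepSeek: R1 0528 | DeepSeek: R1 |
| --- | --- | --- |
| Author | deepseek | deepseek |
| Context Length | 164K | 164K |
| | May 28th update to the original DeepSeek R1 Performance ... | DeepSeek R1 is here: Performance on par with OpenAI o1. ... |

...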
Multiple Catalysts Arrive! Hang Seng TECH Index ETF (513180) Opens High and Keeps Climbing; XPeng Surges Nearly 7%
Mei Ri Jing Ji Xin Wen· 2025-05-29 05:41
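In addition, on the evening of May 28 the XPeng MONA M03 Max was officially launched in two versions by range: the 502 km version is priced at RMB 129,800 and the 600 km version at RMB 139,800. XPeng also launched the MONA M03 515 Long Range Plus at RMB 119,800 and the 620 Ultra Long Range Plus at RMB 129,800. According to XPeng Motors' official Weibo account, after the Max and Plus variants were added, the MONA M03 took 12,566 firm orders within one hour of launch, more than in the same period after last year's launch, with the Max version accounting for 83% of orders.

On May 29, Hong Kong's three major stock indexes kept rising, and the Hang Seng TECH Index was up more than 2.5% at one point in the afternoon. Across the market, tech and internet stocks rose broadly, biotech stocks surged, and Chinese brokerage stocks rose broadly. Among mainstream ETFs, the Hang Seng TECH Index ETF (513180) climbed strongly with the index. Holdings including XPeng Motors, Meituan, Tongcheng Travel, Sunny Optical Technology, and Kingdee International led the gains, with XPeng up nearly 7% at one point in the afternoon.

On the news front, DeepSeek recently announced in its official user group that the DeepSeek R1 model has completed a trial minor-version upgrade. Users can test it on the official web page, app, and mini program (with Deep Thinking turned on); the API interface and usage remain unchanged. DeepSeek has also open-sourced the new R1 model (R1-0528) on the open-source community Hugging Face. At present, the market is ...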
DeepSeek Ships a New Release: Another "Huge Win for Open Source"
Di Yi Cai Jing· 2025-05-29 04:52
Core Viewpoint
- The recent upgrade of the DeepSeek R1 model, released as DeepSeek-R1-0528, has significantly improved its coding capabilities, making it competitive with OpenAI's o3-high model [1].

Group 1: Model Performance
- DeepSeek-R1-0528 shows marked improvements in code execution and generation, performing at a level comparable to leading models on the LiveCodeBench testing platform [1].
- On the current leaderboard, DeepSeek-R1-0528 ranks fourth with a Pass@1 score of 73.1, indicating strong performance on coding tasks [4].

Group 2: Developer Feedback
- Developers have described the upgrade as a significant victory for open source, highlighting the model's enhanced writing capabilities and more natural output [6].
- Developer testing shows the new model performs better on specific tasks, such as text recall within a 32K context, although performance declines at a 60K context [6].

Group 3: Future Expectations
- Developers are looking ahead to the next version, R2, and hope for improvements in context length and multimodal capabilities, which are crucial for practical applications [7].
- The industry speculates that DeepSeek's approach to versioning may differ from competitors', focusing on training data adjustments rather than structural updates [7].
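Net Inflows for 9 Straight Days; Computer ETF (159998) Up More Than 2.3% at Midday; Institutions: Domestic and Overseas Computing-Power Demand Expected to Rise in Tandem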
Group 1
- A-share indices opened higher and computing-power concept stocks performed strongly, with the Computer ETF (159998) rising 2.36% and trading volume exceeding 500 million yuan [1][2].
- The Computer ETF (159998) has recorded net inflows for 9 consecutive days, totaling over 125 million yuan, and its latest size reached 2.791 billion yuan, making it the largest ETF in its category [2].
- The Computer ETF tracks the CSI Computer Index, which includes companies in information technology services, application software, system software, and computer hardware; top holdings include Hikvision and iFlytek [2].

Group 2
- Nvidia reported first-fiscal-quarter revenue of 44.1 billion USD, up 69% year on year and above market expectations [3].
- The U.S. Court of International Trade blocked President Trump's tariff policy, which could affect semiconductor market dynamics [3].
- Domestic chip manufacturers are accelerating breakthroughs in performance and capacity, indicating significant growth potential for the domestic AI chip market [3][4].

Group 3
- Domestic and overseas computing-power demand is expected to rise in tandem, with infrastructure demand remaining strong, particularly for domestic computing power and the entire AIDC industry chain [4].
DeepSeek-R1's "Minor Update" Today Upends the Large-Model Landscape; Netizens: Release R2 Soon
Ji Qi Zhi Xin· 2025-05-29 03:04
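Report by Ji Qi Zhi Xin
Editors: Zenan, Panda

Beyond everyone's expectations.

After a long wait, DeepSeek has updated its reasoning model. Last night, DeepSeek officially announced that its R1 reasoning model had been upgraded to the latest version (0528), and it released the model and weights in the early hours of today.

HuggingFace link: https://huggingface.co/deepseek-ai/DeepSeek-R1-0528

The model files were uploaded at 1 a.m., which makes one wonder whether DeepSeek's engineers worked overtime until the last moment. Some netizens also remarked that releasing a new model right before the Dragon Boat Festival holiday, once again, is more dependable than the holiday notice itself.

The upgraded R1 has as many as 685 billion parameters. It is enormous: although it has been open-sourced, most people can only look on. Without distillation, the "full-strength" version certainly cannot run locally on consumer-grade hardware.

Still, this approach of posting the link without a word was widely welcomed by netizens.

According to a notice DeepSeek circulated to a small group, the updated R1 is released under the MIT license, which means it can be used commercially. The version number suggests a "minor" upgrade, but after extensive hands-on testing people found that the new model's performance improvement is quite noticeable.

The new DeepSeek-R1 model's configuration file also reveals more information, none of it surprising, including ...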
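The claim that the full 685-billion-parameter model cannot run on consumer hardware can be checked with rough arithmetic. The sketch below counts weight storage only and ignores activations and the KV cache; the precisions listed and the 24 GB card size are illustrative assumptions, not figures from the article, and the function names are hypothetical.

```python
import math

NUM_PARAMS = 685e9  # parameter count reported in the article

# Assumed storage precisions; the article does not say which one is used.
BYTES_PER_PARAM = {"FP32": 4.0, "BF16": 2.0, "FP8": 1.0, "4-bit": 0.5}

CONSUMER_GPU_GB = 24.0  # hypothetical consumer-card memory, for comparison only


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def gpus_needed(total_gb: float, gpu_gb: float) -> int:
    """Smallest number of cards whose combined memory covers `total_gb`."""
    return math.ceil(total_gb / gpu_gb)


if __name__ == "__main__":
    for name, size in BYTES_PER_PARAM.items():
        total = weight_memory_gb(NUM_PARAMS, size)
        cards = gpus_needed(total, CONSUMER_GPU_GB)
        print(f"{name:>5}: {total:7.1f} GB of weights, at least {cards} x 24 GB cards")
```

Even at an aggressive 4-bit quantization the weights alone would come to about 342.5 GB under these assumptions, far beyond a single consumer card, which is consistent with the article's point that most people would need a distilled variant to run the model locally.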
DeepSeek's Minor Version Is a Major Upgrade: New R1 Model's Coding Ability Rivals OpenAI o3
Di Yi Cai Jing· 2025-05-29 03:04
Core Insights
- DeepSeek has released a minor version upgrade of its R1 model, named DeepSeek-R1-0528, which shows significant improvements in coding capabilities, nearly matching the performance of OpenAI's o3-high model [1][5].
- Developers have noted improvements in writing tasks, with outputs appearing more natural and better formatted than in previous versions [7].
- Context recall has improved for contexts up to 32K, although performance declines at 60K [7][8].

Performance Metrics
- On the LiveCodeBench testing platform, DeepSeek-R1-0528 achieved a Pass@1 score of 73.1, ranking fourth among the models tested [3].
- The top three models in the same test were o4-Mini (High) with 80.2, o3 (High) with 75.8, and o4-Mini (Medium) with 74.2 [3].

Developer Feedback
- Developers have said the upgrade represents a significant victory for open source [4].
- Some developers ran their own tests comparing DeepSeek-R1 with Claude-4 and found R1 superior in certain respects, such as the visual effects of a simulated collision [5].
- Developers are looking ahead to the next major version, R2, and hope for improvements in context length and multimodal capabilities [8].
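The ranking described above can be reproduced from the four quoted Pass@1 scores. The sketch below is illustrative only: it covers just the four models named in the summary rather than the full leaderboard, and the helper names are hypothetical.

```python
# Pass@1 scores quoted in the summary (only the four models it names).
PASS_AT_1 = {
    "o4-Mini (High)": 80.2,
    "o3 (High)": 75.8,
    "o4-Mini (Medium)": 74.2,
    "DeepSeek-R1-0528": 73.1,
}


def rank_models(scores: dict) -> list:
    """Return (model, score) pairs sorted from highest to lowest score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def rank_of(model: str, scores: dict) -> int:
    """Return the 1-based position of `model` in the sorted ranking."""
    ordered = [name for name, _ in rank_models(scores)]
    return ordered.index(model) + 1


def gap_to_leader(model: str, scores: dict) -> float:
    """Return how many points `model` trails the top score."""
    return max(scores.values()) - scores[model]


if __name__ == "__main__":
    for position, (name, score) in enumerate(rank_models(PASS_AT_1), start=1):
        print(f"{position}. {name}: {score}")
    gap = gap_to_leader("DeepSeek-R1-0528", PASS_AT_1)
    print(f"DeepSeek-R1-0528 trails the leader by {gap:.1f} points")
```

On these numbers DeepSeek-R1-0528 sits 7.1 points behind the top entry and 1.1 points behind third place, which gives a sense of how close the "nearly matching" comparison is.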
24-Hour Global Political and Economic News Roundup | May 29
Sou Hu Cai Jing· 2025-05-29 03:04
Market Summary
- Major US indices declined, with the Dow Jones Industrial Average down 244.95 points (-0.58%) to 42098.7 and the Nasdaq down 98.22 points (-0.51%) to 19100.94 [2].
- European indices also fell, with the Euro Stoxx 50 down 37.06 points (-0.68%) to 5378.39 and the German DAX down 188.3 points (-0.78%) to 24038.19 [2].
- Asian markets were mixed, with the Hang Seng Index down 123.68 points (-0.53%) to 23258.31, while the Korean KOSPI rose 32.93 points (1.25%) to 2670.15 [2].

Legal and Regulatory Developments
- The US Court of International Trade blocked the tariff policy President Trump announced on April 2, ruling that the president overstepped his authority by imposing comprehensive tariffs on countries that export more to the US than they import [3].
- The lawsuit was brought by the Liberty Justice Center on behalf of five affected small businesses, marking a significant legal challenge to Trump's tariff policies [3].

Visa and Immigration Policies
- The US State Department announced plans to revoke visas for Chinese students, particularly those studying in key fields, and will tighten scrutiny of all visa applications from China and Hong Kong [4].
- Secretary of State Rubio confirmed the announcement, although the affected fields and the number of students were not specified [4].

Education Sector Pressures
- President Trump pressed Harvard University to limit its foreign student population to 15%, citing that approximately 31% of its students are international [5].
- The US government has recently taken measures against Harvard, including revoking its eligibility for the Student and Exchange Visitor Program [5].

Federal Reserve Insights
- The Federal Reserve's meeting minutes indicated a cautious approach to interest rate cuts because of increased economic uncertainty, particularly over the impact of Trump's tariff policies [6].
- Officials expressed concerns about potential inflation and rising unemployment in the coming months [6].

NATO Military Expansion
- NATO plans to raise its target for combat brigades from approximately 80 to between 120 and 130, potentially adding up to 350,000 soldiers [7].
- The alliance will also propose raising defense spending targets from 2% to 5% of GDP, with specific allocations for defense and broader security-related expenditures [7].

Nvidia Financial Performance
- Nvidia reported Q1 revenue of $44.1 billion, a 69% year-over-year increase, surpassing market expectations of $43.1 billion [8].
- The company's net profit was $18.775 billion, below the expected $20.767 billion, while data center revenue grew 73% to $39.1 billion [8].
- Nvidia expects Q2 revenue of around $45 billion, while analysts predict $45.5 billion [8].

OpenAI IPO Plans
- OpenAI's CFO indicated that the company's restructuring is paving the way for a potential IPO, contingent on market conditions and the company's readiness [9].
- The public benefit corporation structure allows OpenAI to pursue an IPO if desired [9].
DeepSeek's New R1 Closes In on OpenAI o3! Hands-On Tests Are In: The "Minor Version Upgrade" Is Anything but Minor
Liang Zi Wei· 2025-05-29 01:08
Core Viewpoint
- DeepSeek has released a significant update, version R1-0528, which is comparable to leading models such as OpenAI's o3-high, indicating a major advance in capabilities [1][10].

Group 1: Model Performance
- The new R1 model can solve complex numerical problems that challenge top models such as o3, Gemini 2.5 Pro, and Claude 4 [4].
- The model shows improved reasoning abilities, allowing deeper analysis similar to Google's models [10].
- In practical tests, the R1 model demonstrated stronger programming skills and generated executable solutions in less time [17][20].

Group 2: Features and Improvements
- The R1 model is better at writing tasks, producing more natural and better-formatted outputs [10].
- It can think for extended periods, up to 30-60 minutes per task [10].
- The model's reasoning style is quick yet thoughtful, and it considers how interesting an answer will be to the user [14][15].

Group 3: Community and Open Source Impact
- The release of R1-0528 is seen as a significant victory for open-source AI, as it competes effectively with closed-source models [31].
- The community has actively engaged with the new model, sharing insights and test results, which highlights the collaborative nature of open-source development [9][28].