DeepSeek

DeepSeek open-sources a new version of R1, rivaling OpenAI's top o3 model
news flash· 2025-05-28 21:41
Core Viewpoint - DeepSeek has released R1-0528, the latest version of its open-source R1 model, which reportedly rivals OpenAI's top o3 model [1]
Group 1: Model Performance
- The new R1 model has been tested on LiveCodeBench, showing performance comparable to OpenAI's o3 model [1]
- In the model ranking, DeepSeek-R1-0528 achieved a Pass@1 score of 73.1, placing it fourth overall [1]
- The performance metrics for DeepSeek-R1-0528 include an Easy-Pass@1 score of 98.7 and a Medium-P score of 8 [1]
Group 2: Comparison with Other Models
- The top-ranked model, o4-Mini (High), has a Pass@1 score of 80.2, a lead of 7.1 points over DeepSeek-R1-0528 [1]
- Other notable models in the ranking include o3 (High) with a Pass@1 score of 75.8 and o4-Mini (Medium) with a score of 74.2, both ahead of DeepSeek-R1-0528 [1]
- Models ranked near DeepSeek-R1-0528 include o3-Mini-2025-01-31 (High) and Grok-3-Mini (High), which score 67.4 and 66.7 respectively [1]
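The Pass@1 figures quoted above can be tabulated to make the gaps concrete. The following is only a minimal sketch: it covers just the six models named in this summary, writes the OpenAI model names with the letter "o", and the helper name `gaps_to` is a hypothetical one chosen for illustration. Ranks are positions within this six-model excerpt, not necessarily within the full leaderboard.

```python
# Pass@1 scores exactly as quoted in the summary above.
# Only the six models named there are included (an excerpt, not the full leaderboard).
scores = {
    "o4-Mini (High)": 80.2,
    "o3 (High)": 75.8,
    "o4-Mini (Medium)": 74.2,
    "DeepSeek-R1-0528": 73.1,
    "o3-Mini-2025-01-31 (High)": 67.4,
    "Grok-3-Mini (High)": 66.7,
}


def gaps_to(reference, table):
    """Return (rank, model, score, gap) rows sorted by score, highest first.

    gap is the model's score minus the reference model's score,
    rounded to one decimal place.
    """
    ref = table[reference]
    ranked = sorted(table.items(), key=lambda kv: kv[1], reverse=True)
    return [
        (rank, name, score, round(score - ref, 1))
        for rank, (name, score) in enumerate(ranked, start=1)
    ]


rows = gaps_to("DeepSeek-R1-0528", scores)
for rank, name, score, gap in rows:
    print(f"{rank}. {name:<28} {score:5.1f} {gap:+5.1f}")
```

Read this way, the three models ahead of DeepSeek-R1-0528 lead it by 7.1, 2.7, and 1.1 points, while the two models listed after it trail by 5.7 and 6.4 points.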
DeepSeek upgrades its model ahead of Nvidia's earnings report
Zhong Guo Ji Jin Bao· 2025-05-28 15:12
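According to user feedback, the chain-of-thought (CoT) behavior of DeepSeek's upgraded model appears to have changed significantly.
Hello everyone, here is the latest news on DeepSeek! On May 28, DeepSeek officially announced that the DeepSeek R1 model has completed a minor-version trial upgrade. Users are invited to test it on the official website, app, and mini-program (with Deep Thinking turned on); the API interface and usage remain unchanged.
According to a message posted by the DeepSeek assistant in the official WeChat group, DeepSeek has completed a "minor-version trial upgrade" and has told users they can begin testing. The company has not disclosed the details of the upgrade.
The Hangzhou-based startup stunned the global tech industry in January when it released the original R1 model, which outperformed Western peers on several standardized benchmarks at a reported development cost of only a few million dollars. The news triggered sharp swings in global tech stocks, as investors began to question whether big tech companies still need to spend huge sums to build AI services.
R1's debut turned founder Liang Wenfeng into a star of the tech world and a symbol of China's ability to compete with Silicon Valley's top companies. The release also set off a new round of competition among AI models.
DeepSeek announced the upgrade hours before Nvidia was due to publish its latest earnings. Nvidia, the world's leading AI chipmaker, saw its share price plunge in January following the release of R1.
Some users have also summarized the updated ...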
Tencent Research Institute AI Express 20250529
Tencent Research Institute (腾讯研究院)· 2025-05-28 15:06
Group 1
- Salesforce acquired Informatica for $8 billion, its largest deal since the acquisition of Slack in 2021 [1]
- The acquisition aims to integrate both companies' AI engines to create a trusted data infrastructure that supports enterprise-level deployment of agent-based AI systems [1]
- Data management capabilities are becoming a key differentiator for enterprise AI products, and Salesforce is strengthening its data management strategy through this acquisition [1]
Group 2
- DeepSeek's R1 model has completed a minor version upgrade and can now be tried on its official website, app, and mini-program [2]
- The upgraded R1 model shows a significant improvement in programming ability, quickly generating high-quality dynamic weather cards with detailed design and interactive animations [2]
- The update may have been built on the DeepSeek-V3-0324 model, while the anticipated R2 version has yet to be released [2]
Group 3
- Anthropic launched a voice mode for Claude, allowing users to discuss documents and images by voice, with five distinct voices available [3]
- Users can switch freely between text and voice, and can view a transcript and summary after each conversation [3]
- The voice feature has usage limits: voice conversations count toward regular usage limits, and the Google Workspace connector is available only to paid users [3]
Group 4
- AKOOL released AKOOL Live Camera, billed as the world's first real-time camera, which supports low-latency digital humans, multilingual translation, face replacement, and AI video generation [4]
- The technology goes beyond traditional video generation by using 4D facial mapping and neural voice engines to achieve environment perception and emotional response; in 94% of blind tests, viewers could not distinguish real from fake [4][5]
- The product marks a shift in AI video from "pre-fabrication" to "intelligent response," heralding a second revolution in AI video after Sora [5]
Group 5
- Tencent Hunyuan released an open-source voice-driven digital human model, HunyuanVideo-Avatar, which can generate videos of characters speaking or singing naturally from a single image and a single audio clip [6]
- The model supports various framing options and can understand the environment in an image and the emotion in the audio, automatically generating natural expressions, lip-syncing, and full-body movements [6]
- The technology has been applied in Tencent's music products; it is suited to short video creation and e-commerce advertising, and supports multiple styles and interactive scenarios [6]
Group 6
- ByteDance's Coze Space (扣子空间) launched a one-click text-to-podcast feature that generates "human-level" multi-character dialogue audio in minutes, a task that previously took hours [7]
- The feature has broad applications: converting hot news into podcasts, turning course notes into audio lessons, creating audio summaries of meeting minutes, and providing emotional counseling and shopping guides [7]
- Coze Space can also combine podcast production with website creation, opening up multi-functional applications and marking an era in which AI works for the general public [7]
Group 7
- SpAItial, founded by Synthesia co-founder Matthias Niessner, raised $13 million in seed funding and focuses on generating realistic 3D environments from text [8]
- The company has assembled a high-profile technical team from Meta and Google, aiming to create 3D worlds that are interactive as well as realistic, in competition with Odyssey and World Labs [8]
- The team targets applications in game development, entertainment, and architectural visualization, with long-term goals that include letting ordinary users create games quickly and potentially replacing CAD software [8]
Group 8
- Tencent Yuanbao has integrated with WeChat Reading and Qidian Reading, allowing users to click on underlined book titles to jump directly to reading [9]
- Users can get book recommendations with one click, and each book comes with a jump link, easing the transition from "book hoarding" to "reading" [10]
- The integration lets users chat with Yuanbao while reading, have concepts explained, generate mind maps, and even simulate conversations in the author's tone [10]
Group 9
- SpaceX's ninth Starship flight ended in an explosion during the recovery landing, despite successfully flying a reused B14.2 booster [11]
- The test focused on validating booster reuse, spacecraft payload deployment, and design optimizations intended to shorten launch intervals and reduce costs [11]
- SpaceX is expanding its manufacturing and launch capacity through new facilities in Florida and new designs meant to improve system efficiency [11]
Group 10
- Anthropic's Claude 4 core team emphasizes the model's ability to work independently and to handle long-horizon tasks [12]
- The team predicts that by 2025, reinforcement learning will significantly enhance large language model training, improving the model's ability to handle long-horizon tasks [12]
- The researchers believe the focus should be on raising the model's baseline rather than pursuing extremes, with user interactions evolving from minute-level to hour-level engagements [12]
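DeepSeek R1 gets a new upgrade!
Yicai (第一财经)· 2025-05-28 14:15
On the evening of May 28, a Yicai reporter learned that the DeepSeek assistant had posted a notice in the official user group stating that the DeepSeek R1 model has completed a minor-version trial upgrade and inviting users to test it on the official website, app, and mini-program (with Deep Thinking turned on); the API interface and usage remain unchanged. There is still no news about the DeepSeek R2 model that the market has been waiting for. ...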
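Interview with Claude 4 core team members: improving agents' ability to work independently and strengthening models' long-horizon task capabilities are key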
Founder Park· 2025-05-28 13:13
Core Insights - The main change expected in 2025 is the effective application of reinforcement learning (RL) in language models, particularly through verifiable rewards, leading to expert-level performance in competitive programming and mathematics [4][6][7].
Group 1: Reinforcement Learning and Model Development
- Reinforcement learning has activated existing knowledge in models, allowing them to organize solutions rather than learn from scratch [4][11].
- The introduction of Opus 4 has significantly improved context management for multi-step actions and long-horizon tasks, enabling models to reason and act meaningfully over extended periods without frequent user intervention [4][32].
- The current industry trend prioritizes computational power over data and human feedback, which may change as models become more capable of learning in real-world environments [4][21].
Group 2: Future of AI Agents
- The potential for AI agents to automate intellectual tasks could lead to significant changes in the global economy and labor market, with predictions of "plug-and-play" white-collar AI employees emerging within the next two years [7][9].
- The interval between user interactions with models is expected to shift from seconds and minutes to hours, allowing users to manage multiple models simultaneously, akin to a "fleet management" approach [34][36].
- The development of AI agents capable of completing tasks independently is expected to accelerate, with models handling several hours of work autonomously by the end of the year [36][37].
Group 3: Model Capabilities and Limitations
- Current models still lack self-awareness in the philosophical sense, although they exhibit a form of meta-cognition by expressing uncertainty about their answers [39][40].
- The models can simulate self-awareness but do not possess a continuous identity or memory unless explicitly designed with external memory systems [41][42].
- The understanding of model behavior and decision-making processes is still evolving, with ongoing research into interpretability mechanisms and the identification of features that drive model outputs [46][48].
Group 4: Future Developments and Expectations
- The frequency of model releases is expected to increase significantly, with advances in reinforcement learning leading to rapid improvements in model capabilities [36][38].
- Long-term learning mechanisms, and the ability of models to evolve through practical experience, are a key focus for future research [30][29].
- The ultimate goal of model interpretability is to establish a clear understanding of how models make decisions, which is crucial for ensuring their reliability and safety in various applications [46][47].
Still waiting for DeepSeek R2? DeepSeek R1 has just completed a minor-version trial upgrade! Here is what was optimized
Mei Ri Jing Ji Xin Wen· 2025-05-28 13:03
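National Business Daily editor | Huang Sheng
On May 28, DeepSeek officially announced that the DeepSeek R1 model has completed a minor-version trial upgrade. Users are invited to test it on the official website, app, and mini-program (with Deep Thinking turned on); the API interface and usage remain unchanged.
As for what the trial upgrade contains, the editor asked DeepSeek itself and was told that, based on DeepSeek's internal optimization direction and the model's own perception, the upgrade focuses mainly on the following areas:
1. Response quality: more accurate complex reasoning and multi-step calculation; more coherent and logically clearer long-text understanding and generation; more reliable specialized output in areas such as mathematics and programming.
3. More stable conversations: context memory is more stable, especially in very long conversations (up to 128K context is supported), with fewer cases of occasionally "forgetting the setup" or "drifting off topic".
4. API and interface compatibility remains stable: as the announcement said: API ...
The reply also included the following notes:
2. The company will collect feedback and roll the upgrade out fully only once it is confirmed to be stable;
3. If you use the official app, website, or mini-program and turn on "Deep Thinking" mode now, you are very likely already using the upgraded version of me!
Meanwhile, when the DeepSeek R2 model will actually be released has been a constant focus of attention. Earlier, on March 11, in response to rumors that DeepSeek would release its next-generation R2 model on March 17, DeepSeek's official enterprise-consulting account told a user group: "Rumor debunked: the R2 release is fake news."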
Why did Tsinghua prodigy Yang Zhilin's "ideal republic" lose to Liang Wenfeng?
Phoenix Finance (凤凰网财经)· 2025-05-28 12:51
Core Viewpoint - The article traces the journey of Yang Zhilin, a prominent figure in the AI industry, highlighting the challenges facing the younger generation of entrepreneurs in a rapidly evolving tech landscape, particularly in the context of AI 2.0 and competition with players like DeepSeek [6][28].
Group 1: Background and Early Career
- Yang Zhilin, born in 1992, was influenced by cultural icons such as Haruki Murakami and Pink Floyd, which shaped his artistic and entrepreneurial aspirations [4].
- He pursued a PhD at Carnegie Mellon University, where he made significant contributions to AI, including the development of Transformer-XL and XLNet, which have been widely adopted in major AI products [9][10].
Group 2: AI Industry Landscape
- The industry's focus has shifted from mobile internet and blockchain to AI 2.0, marked by OpenAI's launch of ChatGPT in November 2022, which generated significant interest and investment in AI technologies [6][7].
- The post-90s generation, including Yang, feels a sense of urgency to seize AI as an opportunity for success, having gained limited economic benefit from earlier tech trends [7][8].
Group 3: Company Development and Challenges
- Yang founded Moonshot AI (月之暗面) in 2023, focusing on AGI (Artificial General Intelligence), and secured $200 million in initial funding from prominent investors [13][14].
- The company faced challenges, including a public relations crisis over a reported $40 million cash-out after a $1 billion funding round led by Alibaba, which raised questions about its operational focus [14][15].
Group 4: Competition with DeepSeek
- Yang's company struggled to compete with DeepSeek, founded by Liang Wenfeng, which adopted a more pragmatic approach to commercialization and technology development [13][28].
- DeepSeek's rapid success and user acquisition contrasted with Yang's strategy, which relied heavily on large-scale advertising and user data collection without significant product iteration [18][21].
Group 5: Ideological Divide
- The competition between Yang and Liang represents a clash between idealism in technology development and the practical realities of business [22][23].
- Yang's focus on AGI and a long-term vision may hinder immediate product development and market competitiveness, while DeepSeek's approach emphasizes rapid commercialization and user engagement [24][25].
Group 6: Future Outlook
- The article suggests that despite current setbacks, opportunities still exist for Yang and other young entrepreneurs in AI, as the industry continues to evolve and new technological paradigms emerge [29][30].
- The narrative emphasizes the importance of balancing idealism with practical business strategy to achieve sustainable success in the competitive AI landscape [27][28].
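DeepSeek writes lyrics for the first ASEAN-China-GCC Summit
Fortune China (财富FORTUNE)· 2025-05-28 10:01
On May 27, the first ASEAN-China-GCC Summit was held in Kuala Lumpur, Malaysia. Chinese Premier Li Qiang and Malaysian Prime Minister Anwar attended the opening gala dinner together and listened to performances by seven top artists from different countries.
The first artist on stage was Dalia Mubarak, "the first female singer in Saudi history"; born in the 1990s, she is a cultural icon for Saudi Arabia's younger generation. Shang Wenjie, representing Chinese singers, performed 《甜蜜蜜》 and 《不鼓自鸣》.
More importantly, DeepSeek and human artists jointly wrote the summit's theme song, 《命运共同体》 ("Community of Shared Destiny"): witnessed by hundreds of guests, the organizers fed visual material from 18 postcards, each representing a participating country, into an AI system with DeepSeek built in, which generated superb lyrics.
All the singers who performed at the dinner were women, and attendees praised both the female artists and China's DeepSeek AI. (Fortune China)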
Yang Zhilin: a post-90s idealist left suspended
Hu Xiu· 2025-05-28 06:01
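Beyond the "genius" label, Yang Zhilin is also a devoted lover of literature and the arts. Most of the post-90s generation were at some point infatuated with Haruki Murakami, and Yang, born in 1992, is no exception. In one of Murakami's novels, a scene of a programmer writing code late at night left a deep impression on him and filled him with longing, foreshadowing his later move into AI.
In high school and university he loved rock music; his favorite band is Pink Floyd. While studying at Tsinghua he founded the rock band Splay, which reached the original-song final of Tsinghua University's campus singer competition. Tsinghua has long had a musical tradition: besides producing Gao Xiaosong and Shuimu Nianhua, Yang's well-known junior schoolmate Yao Shunyu (who works at OpenAI) founded the Tsinghua University rap club as an undergraduate.
Playing rock and rap is the science student's form of rebellion and romance. The post-90s generation's sense of being lost comes from how few dividends the era has left them, and music happens to give vent to that resentment. Yang's band wrote a song that tells the story of "a daydream of starting a company and getting rich overnight." They always wavered between pursuing ideals and making money, which is a common state of youth; longing to get rich overnight may be an effective way to fend off the collapse of idealism.
In terms of timing, the post-90s generation did catch the tail end of the mobile-internet boom. Dai Wei is a Tsinghua alumnus only one year older than Yang Zhilin. In 2015, Dai Wei's ofo bike-sharing service officially launched and pioneered the "dockless bike sharing" model worldwide, making him an undisputed startup star. of ...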
Japanese media: the US needs a smarter, more sustainable AI strategy
Huan Qiu Wang Zi Xun· 2025-05-27 23:12
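Source: Global Times
Article in Japan's Nikkei Asia (《日经亚洲评论》), May 26, original title: Targeting DeepSeek won't fix the flaws in Washington's China AI strategy. The US government appears poised to take a series of actions against DeepSeek (深度求索), a fast-rising Chinese artificial intelligence (AI) startup whose advanced AI models have quickly attracted attention from developers and tech enthusiasts worldwide. Recently, the US began revising its AI diffusion rule in the face of opposition from allies and industry.
On the face of it, a blanket ban could also backfire. If the US goes too far, for example by pressuring cloud providers to take down open-source models or by blocking AI tools hosted on GitHub, it risks damaging its own credibility as a defender of internet openness and innovation. It would also hand critics an argument: that the US lacks the confidence to compete on a level playing field and is resorting to bans rather than breakthroughs to stay competitive. This feels less like a principled stance and more like a targeted strike that the US would launch against any Chinese AI company that first gains global attention. Such escalating restrictions may end up punishing American companies more than Chinese ones.
US policy needs to keep evolving. A strategy that keeps tightening hardware export controls while ignoring how quickly open-source models can spread is not only incomplete but is now outdated and even regressive. Compared with across-the-board bans, the US can compete with China in more effective ways. The US can work with other countries, especially ...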