DeepSeek

GPT-5 arrives! Chinese large models "launch in droves," and DeepSeek needs to speed up
Hua Xia Shi Bao· 2025-08-08 05:04
Core Insights
- OpenAI has officially launched its new flagship AI model, GPT-5, which it presents as a significant step towards artificial general intelligence (AGI) [2]
- The release emphasizes practical applications rather than technical specifications, showcasing improvements in programming, creative writing, and health consultation [3][5]
- The launch of GPT-5 has heightened expectations for competing models, particularly DeepSeek's upcoming R2 model, which has faced delays [2][8]
Group 1: GPT-5 Features and Performance
- GPT-5 shows significant gains in three key areas: programming, creative writing, and health consultation, with capabilities such as creating responsive websites and identifying potential health issues [3][5]
- OpenAI has not disclosed the model's parameter count, focusing instead on its ability to integrate into real-world applications [3][5]
- The model is available in four versions: GPT-5, GPT-5 mini, GPT-5 nano, and GPT-5 chat, with different usage limits and subscription options for consumers [5][6]
Group 2: Market Impact and Competition
- Following the release of GPT-5, OpenAI's dominance in the AI model market is expected to strengthen, as evidenced by ChatGPT's leading position in user traffic [7][8]
- DeepSeek, previously a leader, has seen a decline in user engagement and is under pressure to release its R2 model to remain competitive [8][10]
- Other companies in the industry are rapidly launching new models, indicating a highly competitive landscape in which DeepSeek must accelerate its development to keep pace [9][10]
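OpenAI's GPT-5 delivers the next direction in the evolution of large models worldwide
36Kr· 2025-08-08 03:57
(OpenAI CEO Sam Altman announcing GPT-5. Image source: OpenAI official website livestream) Each generation of flagship model from OpenAI, the star US AI (artificial intelligence) startup, sets the global technology trend for the following six months. On August 7, US Pacific time, the company released GPT-5. OpenAI CEO (chief executive officer) Sam Altman said that GPT-3 felt like talking to a high school student: there were occasional flashes of brilliance, but also plenty of annoyances. GPT-4o was perhaps like talking to a college student, with real intelligence and practical usefulness. Now, with GPT-5, it is like talking to an expert: a professional, PhD-level expert in any field who is always on call and can help you achieve any goal. GPT-5 can not only chat but also do things for you. GPT-5 is a system made up of two models (a long-thinking version and a high-efficiency version; the former can reason in depth, the latter can answer questions efficiently). When a user asks a question, it automatically decides which version to switch to. Performance benchmark results published on OpenAI's website show that GPT-5 surpasses the previous flagship model, OpenAI o3, and that GPT-5 (long-thinking version) produces six times fewer hallucinations than o3. Artificial Analysis, an international market research firm, has long run performance benchmarks on the world's mainstream models; its results as of August 8 show that GPT-5 is currently ...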
When China's geeks stop looking up to Silicon Valley: the era of homegrown tech idols has arrived | Shenwang
Jin Shi Shu Ju· 2025-08-07 12:06
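Produced by Shenwang, Tencent News Xiaoman Studio. "After the Spring Festival I noticed a change," said Han Bicheng, founder of BrainCo. "Many colleagues' computer screensavers quietly switched from Musk to DeepSeek founder Liang Wenfeng." DeepSeek's explosive popularity made Liang Wenfeng's story go viral. His low-key, modest, and unassuming image stood out amid the frenzy for attention and unexpectedly produced the hero narrative this era most lacks: changing the world purely through technology. Rising to fame alongside Liang Wenfeng were Hangzhou's "Six Little Dragons." BrainCo's exhibition area on Wenyi West Road in Yuhang District is crowded every day, with investors, suppliers, regional industry funds, and government departments arriving in a steady stream; long queues have likewise formed in front of the Unitree Robotics building in Binjiang District... Without exception, each company has a friendly notice at its entrance: "No paid visit services! Beware of scams!" Source: Shenwang, Tencent News. Image: In the early days of his startup, Han Bicheng built a robot in a basement in Boston that could be controlled by thought. By Xue Fang; edited by Zhang Rui. Half a year on, this "tech pilgrimage fever" continues. For the surging crowds, the shrine of their pilgrimage is shifting from Silicon Valley to home. Once, Silicon Valley's light was kindled from the glimmer of the HP garage and burned on until OpenAI pushed open the Stargate; now, homegrown technology heroes have ascended the idol's altar. Han Bicheng, at the center of the storm, keeps himself out of the spotlight. "I still spend 80% of my time on product R&D," he said firmly. It is precisely this focus that has ...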
Does DeepSeek's GRPO cause model collapse? A look at GSPO, Qwen3's new paradigm
Jiqizhixin (机器之心)· 2025-08-07 09:42
Core Viewpoint
- The article discusses the evolution of reinforcement learning techniques in the post-training phase of large language models (LLMs), highlighting the introduction of Group Sequence Policy Optimization (GSPO) as a solution to the instability issues associated with Group Relative Policy Optimization (GRPO) [2][10][31].
Group 1: Training Phases and Techniques
- The training of large language models typically consists of two phases, pre-training and post-training; the latter focuses on improving the model's understanding and execution of human instructions [1].
- The post-training phase employs reinforcement learning; initial methods such as Reinforcement Learning from Human Feedback (RLHF) were time-consuming and costly because they relied on human annotators [2][3].
Group 2: Innovations and Comparisons
- DeepSeek introduced an automated approach to RLHF, significantly reducing costs and improving efficiency by allowing the model to learn through reward signals rather than manual evaluations [2].
- The DeepSeek team proposed the Group Relative Policy Optimization (GRPO) algorithm, which they believe is more effective than the Proximal Policy Optimization (PPO) used by OpenAI in ChatGPT [3][5].
Group 3: Issues with GRPO
- The Qwen team identified serious stability issues with GRPO, particularly due to its reliance on token-level importance sampling, which can lead to high variance and training instability [10][11][12].
- The instability arises from applying importance sampling weights incorrectly at the token level; the resulting variance accumulates over long sequences and exacerbates the training challenges [15][16][17].
Group 4: Introduction of GSPO
- To address the issues with GRPO, the Qwen team proposed Group Sequence Policy Optimization (GSPO), which uses sequence-level importance sampling to enhance training stability [10][22][31].
- GSPO's design mitigates the accumulation of variance seen in token-level sampling, leading to improved training efficiency and stability [23][24].
Group 5: Experimental Evidence and Advantages
- Experimental results demonstrated that GSPO outperformed GRPO in various tasks, showing better scalability and training efficiency [20][30].
- The Qwen team highlighted that GSPO simplifies the training of Mixture-of-Experts (MoE) models by eliminating the need for auxiliary strategies such as Routing Replay, which GRPO required to achieve stable convergence [25][27][30].
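The contrast between token-level and sequence-level importance sampling described in this item can be illustrated with a minimal Python sketch. It is not the Qwen or DeepSeek implementation: the function names and the log-probability values are hypothetical, and the length normalisation (a geometric mean of the token ratios) reflects one reading of the GSPO formulation that the summary itself does not spell out.

```python
import math

def token_level_ratios(logp_new, logp_old):
    """One importance ratio per token (the token-level weighting attributed to GRPO)."""
    return [math.exp(n - o) for n, o in zip(logp_new, logp_old)]

def sequence_level_ratio(logp_new, logp_old):
    """One importance ratio per response: the geometric mean of the token ratios.

    The length normalisation is an assumption of this sketch; it keeps long
    and short responses on a comparable scale.
    """
    if not logp_new or len(logp_new) != len(logp_old):
        raise ValueError("need two non-empty log-probability lists of equal length")
    mean_log_ratio = sum(n - o for n, o in zip(logp_new, logp_old)) / len(logp_new)
    return math.exp(mean_log_ratio)

# Hypothetical per-token log-probabilities for one sampled response,
# under the old (sampling) policy and the new (updated) policy.
logp_old = [-1.2, -1.0, -0.5, -2.0]
logp_new = [-1.0, -2.0, -0.5, -1.6]

per_token = token_level_ratios(logp_new, logp_old)
per_sequence = sequence_level_ratio(logp_new, logp_old)
print("token-level ratios:", [round(r, 3) for r in per_token])
print("sequence-level ratio:", round(per_sequence, 3))
```

In this toy case the per-token ratios swing well above and below 1, while the single sequence-level ratio sits between the extremes, which is the intuition behind the claim that sequence-level weighting accumulates less variance on long responses.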
OpenAI plans to sell shares; valuation could jump to $500 billion
Guo Ji Jin Rong Bao· 2025-08-07 09:32
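Under competitive pressure, OpenAI has not stopped innovating. According to Altman, OpenAI is preparing to release an upgraded ChatGPT model; he also shared a screenshot hinting at the imminent launch of GPT-5. On August 6 local time, multiple overseas media outlets reported that OpenAI is in preliminary talks with existing investors over a sale of employee-held shares. If the deal goes through, OpenAI's valuation is expected to jump from the current $300 billion to $500 billion, surpassing Elon Musk's aerospace company SpaceX ($350 billion) and making it one of the world's most valuable artificial intelligence companies. Bloomberg was the first to report the share-sale talks between OpenAI and investors. Existing investors, including venture capital firm Thrive Capital, have approached OpenAI about buying employee shares. OpenAI's other investors include SoftBank and Microsoft. The Guardian noted that OpenAI not only faces pressure from investors such as Microsoft but is also locked in fierce competition with other tech giants in the war for talent. For example, Facebook's parent company Meta is actively expanding its AI team, and to secure top AI talent it has offered signing bonuses of up to $100 million. In response, OpenAI CEO Sam Altman said that although Meta offered generous packages, it had not managed to poach Open ...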
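DeepSeek and Kimi eliminated in the first round, Musk's Grok 4 storms into the final: upsets abound at the first global AI showdown
36Kr· 2025-08-07 08:27
The feud between Musk and Altman may now be settled on the 64 squares of a chessboard. In the semifinals of the AI chess tournament just held at Kaggle Game Arena, o3 swept o4 mini 4:0, while Grok 4 and Gemini 2.5 Pro fought a fierce five games, with Grok 4 narrowly winning in a tiebreak. The tournament brought together eight mainstream language models from around the world, including popular contenders such as Moonshot AI's Kimi K2 and DeepSeek R1; unfortunately, both were eliminated in the first round and failed to reach the final four. With Grok 4 performing strongly, Musk, who was following the matches in real time, could not resist showing off: "xAI spent almost no effort on chess." Eight AI models gather at the board for a chess-king showdown. The tournament runs for three days (August 5-7 local time): the first day decides the final four, the second day produces the finalists, and the third day stages the gold- and bronze-medal matches. The eight AI contestants are: Anthropic's Claude Opus 4; DeepSeek's DeepSeek-R1. Interestingly, after the lineup and rules were announced, Kimi, one of the contestants, publicly "complained" on social media about the matchmaking, saying its reasoning version had not yet been released. This AI chess tournament is hosted by Google's Kagg ...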
OpenAI launches open-source model gpt-oss to counter Chinese companies
Nikkei Chinese Edition (日经中文网)· 2025-08-07 08:00
Core Viewpoint
- OpenAI has launched an open-source AI model named "gpt-oss," which developers can use and modify for free. It is the company's first open-source large language model in nearly six years, since "GPT-2" [2][4].
Group 1
- OpenAI CEO Sam Altman announced the release of the open-source AI model on August 5, 2025, to counter emerging competitors such as China's DeepSeek [2][5].
- The newly released models are designed to operate efficiently with fewer computational resources, making them suitable for devices like laptops and smartphones [4].
- The open-source model is characterized by its logical reasoning capabilities, excelling in mathematics and programming tasks [4].
Group 2
- Sharing research and technology has been a core principle of OpenAI since its inception, but competition has led companies to share less information [5].
- The rise of Chinese companies in the open-source model space, particularly DeepSeek's release of the "R1" model, prompted OpenAI to consider launching its own open-source models [5].
- Other Chinese players, such as Alibaba's Tongyi Qianwen and emerging firms like Moonshot AI, have also entered the open-source model market, intensifying competition [5].
First large-model chess championship: Grok 4 and o3 advance to the final; DeepSeek and Kimi knocked out
36Kr· 2025-08-07 06:16
Core Insights
- The AI chess tournament hosted on Kaggle featured eight large language models (LLMs) competing in a knockout format, with Grok 4 and o3 advancing to the finals after defeating Gemini 2.5 Pro and o4-mini respectively [1][3][8]
Group 1: Tournament Structure and Results
- The tournament lasted three days and involved eight AI models: Grok 4 (xAI), Gemini 2.5 Pro (Google), o4-mini (OpenAI), o3 (OpenAI), Claude 4 Opus (Anthropic), Gemini 2.5 Flash (Google), DeepSeek R1 (DeepSeek), and Kimi K2 (Moonshot AI) [1]
- The competition used a single-elimination format in which each AI had up to four attempts to make a legal move; failure to do so resulted in an immediate loss [1]
- On the first day, Grok 4, o3, Gemini 2.5 Pro, and o4-mini all achieved 4-0 victories, advancing to the semifinals [3][11][22]
Group 2: Semifinal Highlights
- In the semifinals, o3 won 4-0 against o4-mini in a dominant performance, reaching a perfect accuracy score of 100 in one of the games [5]
- The match between Grok 4 and Gemini 2.5 Pro ended in a tie after regular play, leading to an Armageddon tiebreaker in which Grok 4 emerged victorious [8]
- The semifinals highlighted the strengths and weaknesses of the AI models, with Grok 4 overcoming early mistakes to secure its place in the finals [8][19]
Group 3: Performance Analysis
- The tournament revealed that while some AI models performed exceptionally well, others struggled with basic tactical sequences and context understanding, indicating areas for improvement in AI chess capabilities [22]
- The performance of Grok 4 attracted attention from industry figures, including Elon Musk, who commented on its impressive gameplay [19]
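The four-attempt rule mentioned in this summary can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Kaggle's tournament code: `request_legal_move`, the `propose_move` callback, and the move strings are hypothetical names introduced here, and the set of legal moves is assumed to be supplied by some external chess engine.

```python
from typing import Callable, Iterable, Optional

MAX_ATTEMPTS = 4  # the summary states up to four attempts per move

def request_legal_move(
    propose_move: Callable[[int], str],
    legal_moves: Iterable[str],
    max_attempts: int = MAX_ATTEMPTS,
) -> Optional[str]:
    """Ask a model for a move, retrying until it is legal or attempts run out.

    `propose_move` is a hypothetical callback that receives the attempt
    number (starting at 1) and returns the model's move as a string.
    Returns the first legal move, or None if every attempt was illegal,
    which under the stated rule means the model loses the game.
    """
    legal = set(legal_moves)
    for attempt in range(1, max_attempts + 1):
        move = propose_move(attempt)
        if move in legal:
            return move
    return None

# Hypothetical model that only produces a legal move on its third try.
flaky_model = lambda attempt: "e4" if attempt >= 3 else "Ke9"
print(request_legal_move(flaky_model, ["e4", "d4", "Nf3"]))

# Hypothetical model that never produces a legal move: it forfeits.
print(request_legal_move(lambda attempt: "Ke9", ["e4", "d4", "Nf3"]))
```

Returning `None` rather than raising keeps the forfeit decision with the caller, which would also need to record the loss for the bracket.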
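OpenAI revives its open-source strategy: expanding its influence and responding to the new landscape of global AI open-source competition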
Sou Hu Cai Jing· 2025-08-07 05:39
Core Viewpoint
- OpenAI is re-entering the open-source arena by launching two significant models, gpt-oss-120b and gpt-oss-20b, amid growing competition between open-source and closed-source AI solutions [1][2].
Group 1: OpenAI's Historical Context
- Since its establishment in 2015, OpenAI has been at the forefront of AI technology, with the launch of ChatGPT in 2022 marking a significant milestone in user growth [1].
- Initially, OpenAI embraced open-source principles with the releases of GPT-1 and GPT-2, but it shifted to a closed-source model with GPT-3 in 2020, which drew criticism for contradicting its mission to benefit humanity [1][2].
Group 2: New Open-Source Models
- The newly launched models, gpt-oss-120b and gpt-oss-20b, are designed for high inference in cloud environments and low latency on edge devices, providing developers with more options [2].
- The release has generated significant interest in the AI open-source community, leading to a surge in downloads on the Hugging Face platform and prompting requests to manage server load [2].
Group 3: Industry Reactions and Implications
- Opinions on OpenAI's approach are mixed; some view it as a protective measure for core assets, while others argue it limits developers' ability to conduct in-depth research and hinders the open-source ecosystem [4].
- OpenAI's collaboration with cloud providers such as Amazon AWS is seen as a strategy to broaden the distribution and application of its open-source models [4].
- The performance gap between open-source and closed-source models is narrowing, and competition from global open-source players is increasing, particularly in the Chinese market, where Alibaba's Qwen series had reached over 300 models and 400 million downloads by July 2025 [4].
Group 4: Future Outlook
- OpenAI's re-entry into open-source presents both challenges and opportunities, suggesting that AI giants may adopt more flexible strategies in response to the rapidly changing market [7].
X @Decrypt
Decrypt· 2025-08-07 03:55
Lawmakers Call for Inquiry Into China's DeepSeek Over National Security, Data Risks ► https://t.co/M37LVXXyVt ...