DeepSeek
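OpenAI to cut the revenue share it pays Microsoft to 8%, gaining US$50 billion; DeepSeek, Unitree Robotics and others named Smart Companies by MIT Technology Review | AIGC Daily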
创业邦· 2025-09-15 00:08
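1. [US media company PMC sues Google over AI summary infringement] September 14: According to reports, US media company Penske Media Corporation (PMC) has sued Google, accusing the technology company of illegally using its news content to generate AI summaries and thereby reducing traffic to its websites. PMC filed the suit on Friday (the 12th) in the US District Court for the District of Columbia, naming itself and its media outlets as plaintiffs, including The Hollywood Reporter, Rolling Stone and Billboard. (界面新闻)
2. [DeepSeek, Unitree Robotics and others named Smart Companies by MIT Technology Review] On September 12, MIT Technology Review announced the latest results of its "50 Smart Companies" selection, with star startups such as DeepSeek and Unitree Robotics making the list. By MIT Technology Review's definition, a smart company should have two traits: it develops and uses new technology smartly, and it understands markets and business opportunities smartly. Such companies lead with technological innovation, pair it with a sustainable business model, and extend the technology's impact worldwide. (澎湃新闻)
3. [Scientists find AI can assess social situations like humans, with research efficiency far beyond manual work] September 14: According to Xinhua, a new study from the University of Turku in Finland shows that GPT-4V can, like humans, recognize and interpret complex social information between people in images and video, with accuracy nearly comparable to that of humans. The findings were published on September 2 in the international ...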
Large models meet truly hard problems: tested on 500 questions, o3 Pro passes only 15%
机器之心· 2025-09-14 03:07
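Report by 机器之心 (editorial team)

Benchmarks are one way to test the capabilities of large models. Generally, a useful benchmark must be both hard enough and close to reality: its questions should challenge frontier models while reflecting real-world usage.

Existing tests, however, face a difficulty-versus-realism tension: exam-style benchmarks are often made artificially hard yet have limited practical value, while benchmarks built on real user interactions tend to skew toward easy, high-frequency questions.

Against this backdrop, researchers from Stanford University, the University of Washington and other institutions explored a very different approach: evaluating models on unsolved questions.

Unlike static benchmarks that are scored once, the study continually collects unsolved questions and then uses validator-assisted screening and community verification to evaluate models continuously and asynchronously.

Specifically, the paper proposes UQ (Unsolved Questions), a test set of 500 questions covering topics such as computer science theory, mathematics, science fiction and history, used to probe models' reasoning, factual accuracy and browsing abilities. UQ is designed to be both difficult and realistic: most of these questions are problems that people have actually run into but not yet solved, so cracking them yields direct real-world value.

Dataset overview

The UQ dataset consists of 500 challenging unsolved questions, sourced from the Q&A community Stack Exchange and obtained through three rounds of filtering. In the filtering ...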
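Top teams from Tsinghua, Shanghai AI Lab and others release a comprehensive survey of RL for reasoning models, exploring the road to superintelligence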
机器之心· 2025-09-13 08:54
Core Insights
- The article emphasizes the significant role of Reinforcement Learning (RL) in enhancing the reasoning capabilities of large language models (LLMs), marking a pivotal shift in artificial intelligence development [2][5][16]
- It highlights the emergence of Large Reasoning Models (LRMs) that utilize RL to improve reasoning through verifiable rewards, showcasing advancements in complex tasks such as mathematics and programming [3][5][10]

Summary by Sections

Introduction
- The introduction outlines the historical context of RL since its inception in 1998 and its evolution into a crucial method for training intelligent agents to surpass human performance in complex environments [2]

Recent Trends
- A new trend is emerging where researchers aim to enhance models' reasoning abilities through RL, moving beyond mere compliance to actual reasoning skills [3][5]

Overview of RL in LRM
- The article reviews recent advancements in RL applied to LLMs, noting significant achievements in complex logical tasks, and identifies RL as a core method for evolving LLMs into LRMs [5][12]

Foundational Components
- The foundational components of RL for LRMs include reward design, policy optimization, and sampling strategies, which are essential for effective model training [13][14]

Foundational Problems
- Key challenges in RL for LRMs include the design of appropriate reward signals, efficient scaling under computational and data constraints, and ensuring reliability in practical applications [12][16]

Training Resources
- The article discusses the necessary training resources, including static corpora, dynamic environments, and RL infrastructure, emphasizing the need for standardization and development [13][15]

Applications
- RL has been applied across various tasks, including coding, agentic tasks, multimodal tasks, and robotics, showcasing its versatility and potential for broader applications [13][15]

Future Directions
- Future research directions for RL in LLMs include the development of new algorithms, mechanisms, and functionalities to further enhance reasoning capabilities and address existing challenges [15][16]
How Baidu (BIDU) Is Positioning Its AI Against OpenAI, Google, and DeepSeek
Yahoo Finance· 2025-09-12 21:33
Baidu, Inc. (NASDAQ:BIDU) is one of the AI Stocks In The Spotlight For Investors. On September 10, the company released an updated version of its proprietary reasoning model, showcasing capabilities similar to those of advanced AI systems from DeepSeek, OpenAI and Google. According to Baidu chief technology officer Wang Haifeng, third-party AI benchmarks show that the firm’s X1.1 reasoning model surpassed the performance of DeepSeek-R1 and matched OpenAI’s GPT-5 and Google’s Gemini 2.5 Pro. Wang further said ...
Wu Shichun: In 2025, AI Reshapes Everything
FOFWEEKLY· 2025-09-12 10:01
Core Viewpoint
- The year 2025 is seen as a watershed moment for the AI era, with a strong emphasis on the necessity of believing in trends to capitalize on opportunities, particularly in AI [3][11].

Investment Landscape
- Early-stage investment is crucial in the equity investment market, as it initiates entrepreneurial ventures [7].
- In the robotics sector, funding has surged, with the financing amount in the first eight months of this year exceeding the total for the previous year by 80% [4].
- The focus of capital has shifted from "technology stories" to "mass production capabilities," indicating a preference for commercial viability [4].

AI Trends and Opportunities
- The rise of DeepSeek is prompting a global reassessment of Chinese tech assets, marking 2025 as the true beginning of the AI era [6][10].
- AI is driving a transformation in the physical world, necessitating a redesign of all hardware, including toys, intelligent robots, drones, and autonomous vehicles [9][10].
- The "Artificial Intelligence +" strategy has been elevated to a national strategy, pushing for industrial upgrades [11].

Competitive Landscape
- To avoid the pitfalls of homogenized competition, companies must engage in differentiated competition, focusing on personalized demand-side strategies [15][16].
- The essence of "involution" is profit shrinkage due to homogeneous competition, necessitating a shift towards unique value propositions [15].

Entrepreneurial Strategies
- Entrepreneurs are encouraged to focus on niche markets and create unique value propositions rather than relying on low-cost competition [16].
- The importance of organizational capability is emphasized, with a need for companies to leverage AI to streamline processes and enhance collaboration [17].

Investment Directions
- The investment focus is on two main areas: AI agents' application fields and verticalized AI infrastructure [20].
- In the robotics sector, several innovative companies are being supported, including those specializing in humanoid robots and industrial automation [21].

Conclusion
- The entrepreneurial journey is challenging, and the goal is to assist aspiring entrepreneurs in becoming impactful leaders in the AI era [23].
Why has GPT-5 stopped "talking nonsense"? OpenAI's new paper explains it in full
腾讯研究院· 2025-09-12 08:58
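Source: 腾讯科技; author: 博阳

After GPT-5's release, although its performance fell short of the "leap" the industry had hoped for, its most eye-catching feature was a sharp drop in the hallucination rate.

According to figures from OpenAI, GPT-5 is about 45% less likely to make factual errors than GPT-4o, and about 80% less likely than OpenAI o3.

Although OpenAI has not disclosed every technical detail, the official paper together with the published technical documentation gives a glimpse of the core idea.

| Adam Tauman Kalai | Ofir Nachum | Santosh S. Vempala | Edwin Zhang |
| --- | --- | --- | --- |
| OpenAI | OpenAI | Georgia Tech | OpenAI |

Hallucinations are unavoidable in the pre-training stage

That hallucinations are unavoidable is not a new conclusion. Past research, however, has rarely examined the question from the mechanics of the language model itself, focusing instead on problems with the training data.

OpenAI's new paper proves at the outset that "hallucination" is a predictable by-product that necessarily arises from the statistical-learning nature of LLMs.

But the reasons behind this improvement have ...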
Is your AI getting dumber? It has learned to treat users differently depending on who they are
创业邦· 2025-09-12 03:14
Source: 差评前沿部; author: 江江. Originally from 差评 (ID: chaping321)

Here is what happened: a few days ago I bit the bullet and paid OpenAI US$200 for a membership, to see how capable ChatGPT has become.

I tossed it an arithmetic problem, solve 5.9 = x + 5.11, and it botched the calculation.

"This one really is kindergarten level"?

ChatGPT's reply, as transcribed from the screenshot:

Fine, this one really is kindergarten level:
Equation: 5.9 = x + 5.11
Move the 5.11 on the right to the other side: x = 5.9 - 5.11
Subtract digit by digit: 5.90 - 5.11 = 0.79
Then add a negative sign: x = -0.21
The answer is x = -0.2 ...
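For reference, the equation quoted above can be checked mechanically. The following is a minimal sketch, not part of the original article: the helper name `solve_for_x` is hypothetical, and Python's standard `decimal` module is used so that the decimal digits are handled exactly rather than as binary floating point.

```python
from decimal import Decimal


def solve_for_x(total: str, addend: str) -> Decimal:
    """Solve `total = x + addend` for x using exact decimal arithmetic.

    Hypothetical helper for illustration only; inputs are passed as
    strings so that Decimal sees the digits exactly as written.
    """
    return Decimal(total) - Decimal(addend)


# The equation from the article: 5.9 = x + 5.11
x = solve_for_x("5.9", "5.11")
print(x)  # 0.79

# Substituting back confirms the solution satisfies the equation.
assert x + Decimal("5.11") == Decimal("5.9")
```

The correct value is x = 0.79; 5.9 is the larger number, so no negative sign belongs in the result.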
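JD.com's intelligent-driving general manager Liu Dong teams up with a Peking University associate professor to found a startup, entering the embodied-AI large-model race!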
Robot猎场备忘录· 2025-09-12 00:03
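One month after its founding, 星源智机器人, an embodied-AI large-model startup incubated by 智源研究院 (Beijing Academy of Artificial Intelligence), has completed a 200 million yuan first funding round!

Recently, embodied-AI large-model (general-purpose robot brain) startup 北京星源智机器人科技有限公司 (hereafter "星源智机器人") announced the completion of a 200 million yuan angel round. Investors include well-known institutions such as 中科创星, 高瓴 (Hillhouse), 元禾原点, 元生创投, 慕华科创, 力合资本 and 华金资本, as well as industry investors 智元机器人 (AgiBot), 芯联资本, 国汽投资, 中力实桥, 长飞基金 and 灵初智能.

With embodied AI as hot as it is today, founders starting up with funding already in hand are nothing unusual, but having this many institutions in one round is rare and shows how much capital favors the company. Most notable is that the company's shareholder list included 智元机器人 and its ecosystem partner 灵初智能 from the day it was founded. It is most likely one of the 50+ early-stage projects incubated under 智元's "A计划" (Plan A), which may also be why 高瓴 joined the first round. (Note: 智元 and 高瓴资本 have already set up an embodied-AI industry fund.)

| No. | Shareholder name | Holding ...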
Claude cuts off access; domestic AI coding tools step in
Core Insights
- Anthropic has announced a complete ban on the use of its AI programming tool Claude Code by companies with over 50% ownership by Chinese entities, which is expected to accelerate the development of domestic AI programming tools [1][2]
- Claude Code processes nearly 200 million lines of code weekly and generates an annual revenue of approximately $500 million [1]
- Domestic companies such as Tencent, DeepSeek, and Alibaba are actively developing AI programming tools, with Tencent's CodeBuddy Code recently entering public testing [1][2]

Company Developments
- DeepSeek V3.1 has gained significant attention in the international developer community for its performance in AI programming [1]
- Tencent's CodeBuddy Code supports multiple formats including plugins, IDE, and CLI, allowing developers to automate the entire development and operations process using natural language [1][2]
- Over 90% of Tencent's engineers are currently using CodeBuddy, resulting in an average coding time reduction of over 40% [2]

Industry Trends
- The ban by Anthropic highlights the risks of over-reliance on foreign AI services, prompting a push for a more robust domestic AI service ecosystem [2]
- The emergence of domestic AI programming tools is seen as a counter to the dominance of OpenAI, with a growing demand for self-sufficient and controllable tools in the market [2]
Early-2025 AI Landscape Report: The Rise of Reasoning Models, Sovereign AI and Agentic AI (English edition) - Lablup
Sou Hu Cai Jing· 2025-09-11 09:17
Group 1: Core Insights
- The global AI ecosystem is undergoing a fundamental paradigm shift driven by geopolitical competition, technological innovation, and the rise of reasoning models [10][15][25]
- The transition from "Train-Time Compute" to "Test-Time Compute" has led to the emergence of reasoning models, enhancing AI capabilities while reducing development costs [11][18][24]
- The "DeepSeek Shock" in January 2025 marked a significant moment in AI competition, showcasing China's advancements in AI technology and prompting a response from the U.S. government with substantial investment plans [25][30][31]

Group 2: Technological Developments
- AI models are increasingly demonstrating improved reasoning capabilities, with OpenAI's o1 model achieving a 74.4% accuracy in complex reasoning tasks, while DeepSeek's R1 model offers similar performance at a significantly lower cost [19][20][24]
- The performance gap between top-tier AI models is narrowing, indicating intensified competition and innovation in the AI landscape [22][23]
- Future AI architectures are expected to adopt hybrid strategies, integrating both training and inference optimizations to enhance performance [24]

Group 3: Geopolitical and National Strategies
- "Sovereign AI" has become a central focus for major nations, with the U.S., U.K., France, Japan, and South Korea announcing substantial investments to develop their own AI capabilities and infrastructure [2][5][13][51]
- The U.S. has initiated the $500 billion "Stargate Project" to bolster its AI leadership in response to emerging competition from China [25][51]
- South Korea aims to invest 100 trillion won (approximately $72 billion) over five years to position itself among the top three global AI powers [55]

Group 4: Market Dynamics and Applications
- The AI hardware market is projected to grow from $66.8 billion in 2024 to $296.3 billion by 2034, with GPUs maintaining a dominant market share [39]
- AI applications are becoming more specialized, with coding AI evolving from tools to autonomous teammates, although challenges such as the "productivity paradox" persist [14][63]
- Major AI companies are focusing on integrating their models into broader ecosystems, with Microsoft, Google, and Meta leading the charge in enterprise and consumer applications [61]
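The hardware-market projection in Group 4 (from $66.8 billion in 2024 to $296.3 billion by 2034) can be restated as an implied compound annual growth rate. The sketch below is a derived back-of-the-envelope calculation, not a figure from the report: the function name `implied_cagr` is hypothetical, and it assumes smooth compounding over the ten years between the two stated endpoints.

```python
def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by two endpoint values.

    Hypothetical helper for illustration; assumes constant yearly
    compounding, which the report itself does not state.
    """
    return (end_value / start_value) ** (1 / years) - 1


# Endpoints quoted in the summary, in billions of US dollars.
cagr = implied_cagr(start_value=66.8, end_value=296.3, years=2034 - 2024)
print(f"Implied CAGR: {cagr:.1%}")
```

Under these assumptions the projection works out to roughly 16% growth per year.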