Artificial General Intelligence (AGI)
Ahead of the GPT-5 Announcement, Google and Anthropic Keep the Pressure on OpenAI
36Kr · 2025-08-07 09:01
Core Insights
- OpenAI announced the release of GPT-5 in three versions (GPT-5, GPT-5-mini, and GPT-5-nano) amid heightened competition from Google and Anthropic, both of which launched significant products just before OpenAI's event [1][2]
- Google DeepMind's Genie 3 stood out for generating interactive 3D worlds from a single sentence, while Anthropic's Claude Opus 4.1 claimed the title of strongest AI programming model [1][2][31]
- OpenAI returned to open-sourcing after six years with two models, gpt-oss-120b and gpt-oss-20b, aiming to re-establish its leadership in the open-source community, although it appears slightly behind in the overall AI race [1][20][32]

Group 1: Product Releases
- OpenAI's GPT-5 release event is scheduled for Friday at 1 AM Beijing time, and the three versions have been confirmed [2]
- Google DeepMind's Genie 3 can generate realistic 3D worlds at 720p resolution and 24 frames per second [5][7]
- Anthropic's Claude Opus 4.1 set a new state of the art (SOTA) in AI programming, outperforming competitors on several benchmarks [14][15]

Group 2: Competitive Landscape
- Competition intensified as Google and Anthropic both showcased new releases over a span of roughly three hours [2][4]
- Genie 3 is viewed as a critical step toward artificial general intelligence (AGI) because it can create interactive virtual environments that retain memory of earlier states [11][13][31]
- Claude Opus 4.1 scored 74.5% on SWE-bench Verified, positioning it as a strong contender in AI programming [14][18]

Group 3: OpenAI's Strategy
- OpenAI's decision to open-source gpt-oss-120b and gpt-oss-20b reflects a strategic move to engage the developer community and extend its influence [20][32]
- gpt-oss-120b, with 117 billion parameters, is designed to run on high-performance GPUs, while gpt-oss-20b is optimized for consumer-grade devices [20][22]
- OpenAI's open-source models are reported to perform comparably to its closed-source models [22][25]
Why Is Reinforcement Learning Sweeping Silicon Valley? A Key Step Toward AGI
Huxiu · 2025-08-07 07:46
Group 1
- Reinforcement learning (RL) has become a mainstream approach in Silicon Valley for building technical architectures and for model pre-training, its second wave of popularity after the AlphaGo era [1][2][3]
- Top RL talent is highly sought after by major tech companies and investors in Silicon Valley [1][2]

Group 2
- The discussion covers the evolution of models and the commercialization of AI agents, focusing on the latest technical directions [2][3]
- Meta's acquisition of Scale AI is driven by the need for high-quality data annotation, particularly for multimodal data such as video and images [31][36]

Group 3
- There are two main decision-making frameworks in RL: one built on large language models (LLMs), and another that operates on actions rather than language tokens [5][6]
- RL is particularly effective for goal-driven tasks such as coding, mathematics, and financial analysis, where data may be scarce [10][11]

Group 4
- The consensus is that supervised learning works well for tasks with abundant labeled data, while RL from human feedback (RLHF) can improve model performance by aligning it with human preferences [8][9]
- The challenges of RL pre-training include the need for counterfactual learning and the difficulty of generating data for unique tasks [27][28]

Group 5
- The conversation touches on the five levels of AGI as defined by OpenAI, focusing on the significant gap between agent-level AI and innovator-level AI [15][21]
- The potential for RL to discover new knowledge and contribute to superintelligence is discussed, with emphasis on the importance of verification mechanisms [12][13]

Group 6
- Reward design is highlighted as a major factor in the behavior and outcomes of AI agents [55][56]
- The future of AI agents will depend on their ability to balance multiple objectives and optimize performance across varied tasks [56][63]

Group 7
- The landscape of AI companies is evolving, with significant mergers and acquisitions possible in the near future [64][65]
- Companies need to focus on technical paths that ensure profitability and sustainability, since high operating costs can constrain growth [63][64]
DeepMind Chief Warns Musk: If AI Goes Wrong, Going to Mars Won't Help
36Kr · 2025-08-07 07:05
Core Insights
- Demis Hassabis, the head of Google DeepMind, argues that AI will transform society at ten times the scale and ten times the speed of the Industrial Revolution [1][16]
- Google DeepMind has integrated its advanced AI models, particularly Gemini, into the Google ecosystem, significantly increasing user engagement while maintaining a strong presence in academic research [1][10]

Group 1: Company Overview
- Google DeepMind was formed by the merger of DeepMind and Google Brain in April 2023, with Hassabis at the helm [1]
- The company has made significant advances in AI, including the release of AlphaFold 3, which predicts the structures of protein complexes and has been cited more than 4,000 times [1][10]
- Google acquired DeepMind for £400 million in 2014, driven by a shared vision of integrating AI into Google's core mission [9]

Group 2: Industry Impact
- The release of ChatGPT in 2022 dramatically changed the AI landscape, prompting major tech companies to accelerate their AI investment and hiring [10][11]
- Competitors including Meta, Amazon, Apple, and Microsoft are investing heavily in AI; Microsoft recently hired more than 20 engineers from DeepMind [11][12]
- Hassabis believes the next five to ten years will be crucial for achieving AGI, which could exhibit human-like cognitive abilities [12]

Group 3: Future Outlook
- Hassabis envisions a future of "extreme abundance" enabled by AI, with significant societal benefits if resources are distributed equitably [13][14]
- He acknowledges challenges such as energy consumption and job displacement, but remains optimistic about humanity's ability to adapt and thrive [14][15]
- He sees the changes brought by AI as necessary and inevitable, and calls for minimizing disruption while embracing progress [16]
Is GPT-5 About to Arrive?
Cailian Press · 2025-08-07 02:58
Core Viewpoint
- OpenAI's highly anticipated GPT-5 is expected to be released soon; the company has scheduled a livestream that hints at the launch [1]

Group 1: GPT-5 Release and Features
- OpenAI CEO Sam Altman indicated that GPT-5 is likely to be released this summer, with mini and nano versions planned for the API [2]
- GPT-5 is described as an integrated system that combines several technologies, intended to simplify the product line and move toward AGI [2][3]
- There is no indication that GPT-5 will be open-sourced, but Altman has previously promised that users will have free access to the model [3]

Group 2: Competitive Landscape and Market Implications
- GPT-5 arrives amid a flurry of updates to other major AI models, such as Google's Genie 3 and Moonshot AI's Kimi K2, in a rapidly evolving competitive landscape [3]
- Analysts believe the next generation of models, including GPT-5, could be 2 to 3 times larger in scale, yielding a nearly 10-fold improvement in intelligence [3]
- Advances in logical reasoning, multimodal capabilities, and memory systems are expected to accelerate AI adoption in high-value, complex industries, improving profitability and raising demand for compute [3]
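China's First Full-Stack, Hands-On Tutorial on Embodied "Cerebrum + Cerebellum" Algorithms
具身智能之心 (Heart of Embodied Intelligence) · 2025-08-07 02:38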
Core Insights
- In the pursuit of AGI, embodied intelligence is seen as a key direction, focusing on how intelligent agents interact with and adapt to physical environments [1]
- The field has evolved from low-level perception toward high-level task understanding and generalization [6][9]

Industry Analysis
- Over the past two years, numerous star teams have emerged in embodied intelligence and founded valuable companies such as Xinghaitu, Galaxy General, and Zhujidongli, moving from laboratories into commercial and industrial applications [3]
- Major domestic companies including Huawei, JD, Tencent, Ant Group, and Xiaomi are actively investing and collaborating to build an embodied intelligence ecosystem, while international players such as Tesla and investment firms support advances in autonomous driving and warehouse robotics [5]

Technological Evolution
- Embodied intelligence technology has progressed through several stages:
  - The first stage focused on grasp pose detection, which struggled with complex tasks because it lacked context modeling [6]
  - The second stage used behavior cloning, which lets robots learn from expert demonstrations but generalizes poorly and performs weakly in multi-target scenarios [6]
  - The third stage introduced Diffusion Policy methods, which improve stability and generalization through sequence modeling [7]
  - The fourth stage, emerging in 2025, explores combining VLA models with reinforcement learning and tactile sensing to overcome current limitations [8]

Product Development and Market Growth
- These advances have produced a range of products, including humanoid robots, robotic arms, and quadruped robots, serving industries such as manufacturing, home services, and healthcare [9]
- As the industry shifts from research to deployment, demand for engineering and systems skills is rising [13]

Educational Initiatives
- A comprehensive curriculum has been developed to help learners master the full spectrum of embodied intelligence algorithms, from basic tasks to advanced models such as VLA and its extensions [9][13]
Google Launches Genie 3: The ChatGPT Moment for World Models?
Huxiu · 2025-08-06 12:13
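On August 5, 2025, Google DeepMind announced Genie 3, a general-purpose world model that can generate a wide variety of interactive 3D environments from text prompts. The model generates environments in real time at 720p resolution and 24 frames per second; users can move around freely as if playing a game, and scenes stay consistent for several minutes. The release marks another major leap for DeepMind in world models, coming less than a year after the previous generation, Genie 2.

We have reviewed Google's official report, feedback from users in the early-access test, and an in-depth interview with the team behind Genie 3, and summarize the key information here to help readers better understand the model.

Google official blog: from text to worlds, what is Genie 3?

1. Toward world simulation

At Google DeepMind, we have been pioneering research in simulated environments for over a decade, from training agents to master real-time strategy games to developing simulated environments for open-ended learning and robotics. This work motivated our development of world models: AI systems that use their understanding of the world to simulate aspects of it, enabling agents to predict how an environment will evolve and how their actions will affect it.

World models are also a key milestone on the path to AGI, because they make it possible to train AI agents on an unlimited curriculum of rich simulated environments. Last year, we ...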
OpenAI Gets "Cut Off": The AI World Turns to Beggar-Thy-Neighbor Tactics
36Kr · 2025-08-06 11:29
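In the second half of the mobile internet era, as the traffic dividend dried up, nearly every giant began "building walls", openly or quietly. The spirit of the internet, embodied by openness, inevitably began to wither, and beggar-thy-neighbor tactics became the dominant theme. The most typical example is "pick one of two", where a platform forces partners into exclusivity.

It is easy to see that, with GPT-5 about to launch, Anthropic's move struck OpenAI where it hurts most. Anthropic's conduct is also hard to fault. A company spokesperson said in a statement that "OpenAI's own technical staff were also using our coding tools ahead of the launch of GPT-5, which is a direct violation of our terms of service." Anthropic's commercial terms prohibit other companies from using the Claude API to build competing services.

An OpenAI spokesperson defended the practice as "industry standard", expressed disappointment at Anthropic's decision, and stressed that "our API remains available to them."

This is not the first time Anthropic has shown hostility toward OpenAI. It previously cut off the AI coding startup Windsurf's access to Claude models without warning, a move that observers at the time unanimously attributed to OpenAI's plan to acquire Windsurf.

Another fact suggesting that Anthropic is singling out OpenAI is that Google had previously done the same thing, yet Anthr ...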
Foreign Media: Google DeepMind Announces Next-Generation World Model Genie 3
Huanqiu.com News · 2025-08-06 09:21
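[Huanqiu.com Technology, compiled report] August 6: According to PANews, Google DeepMind today announced its latest-generation world model, Genie 3. Genie 3 is a general-purpose world model that generates diverse interactive virtual environments in real time from text prompts, producing interactive 3D environments at 720p resolution and 24 frames per second.

Genie 3 also introduces "promptable world events", which let users modify the virtual world on the fly with simple text instructions, for example by adding a herd of deer or changing the weather.

According to foreign media, DeepMind regards the release of Genie 3 as an important step toward artificial general intelligence (AGI). The model gives AI agent training a much broader simulation space and opens new possibilities for game development, education, and creative design. For example, a robot could learn to handle unpredictable scenarios in a simulated warehouse without the cost of real-world trial and error.

Despite its significant technical advances, Genie 3 still has limitations. The model currently supports only a few minutes of continuous interaction, far short of the hours that would be ideal. In addition, AI agents have limited ability to interact within the simulated environments, and complex multi-agent interaction needs further exploration. Google DeepMind says Genie 3 is currently available as a research preview to a small group of academics and creators, so that the model can be further refined and its potential risks assessed. (Qingyun)

Source: Huanqiu.com ...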
Ant Group and the Chinese Association for Artificial Intelligence Launch a Dedicated AGI Research Fund
Securities Daily Online · 2025-08-06 05:16
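The fund reportedly focuses on key technologies and frontier directions in artificial general intelligence, working from the foundations up to raise models' overall intelligence. Track 1, AGI Data and Evaluation, opens 3 topics covering AIGC video evaluation, efficient data distillation for large models, and dynamic evaluation and contamination detection for large models; it aims to improve the accuracy, intelligence, and stability of AGI technology through varied approaches to data production, data processing, and model evaluation. Track 2, AGI Foundation Models, opens 18 topics, including the interactive experience of multimodal large models, unified multimodal generation and understanding, and efficient attention mechanisms. Track 3, AGI Infra, opens 6 topics, including unified RL training and inference, high-performance agentic RL, and inference acceleration for RL on large models.

In recent years, Ant Group has pursued an "AI First" strategy and continued to invest in AGI, making progress in basic research, industry-academia collaboration, and open source. The Ant-initiated InclusionAI open-source community has released the Bailing (百灵) foundation models, the reinforcement learning reasoning framework AReaL, and the multi-agent framework AWorld, among other work, earning a place on the China open-source heatmap published by the model hub Hugging Face; the Bailing multimodal model also ranked first on the trending chart for any-to-any models. Since last year, Ant Group has set up joint laboratories with Shanghai Jiao Tong University, Zhejiang University, Nanjing University, and other leading universities ...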
Open-Sourcing for the First Time in Six Years, OpenAI Releases Two o4-mini-Class Reasoning Models
Jin10 Data · 2025-08-06 03:47
Core Insights
- OpenAI has launched two open-source AI reasoning models, gpt-oss-120b and gpt-oss-20b, which are comparable in capability to its existing o-series models [1][2]
- The release marks OpenAI's return to open-source language models after six years and is aimed at both developers and policymakers [2][3]

Model Performance
- On the Codeforces competitive programming benchmark, gpt-oss-120b and gpt-oss-20b scored 2622 and 2516 respectively, outperforming DeepSeek's R1 but falling slightly short of OpenAI's own o3 and o4-mini [2]
- On Humanity's Last Exam (HLE), the models scored 19% and 17.3%, surpassing DeepSeek and Qwen but still below o3 [3]
- The gpt-oss models hallucinate far more often than OpenAI's recent reasoning models: 49% and 53% of the time, compared with 16% for o1 and 36% for o4-mini [3]

Model Training Methodology
- The gpt-oss models use a Mixture-of-Experts architecture, activating only a portion of their parameters for each token to improve efficiency [5]
- Although gpt-oss-120b has 117 billion parameters in total, it activates only about 5.1 billion per token; both models were trained with high-compute reinforcement learning [5]
- The models currently support text input and output only and have no multimodal capabilities [5]

Licensing and Data Transparency
- gpt-oss-120b and gpt-oss-20b are released under the Apache 2.0 license, which permits commercial use without needing OpenAI's permission [5]
- OpenAI has chosen not to disclose its training data sources, a decision influenced by ongoing copyright litigation in the AI sector [6]

Competitive Landscape
- OpenAI faces growing competition from Chinese AI labs such as DeepSeek and Alibaba's Tongyi (Qwen), which have released leading open-source models [2]
- Industry attention is shifting to upcoming models from DeepSeek and Meta's Superintelligence Lab [6]
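
The Mixture-of-Experts idea mentioned in the gpt-oss summary above, where only part of the model runs for each token, can be illustrated with a toy top-k router. This is a minimal sketch under stated assumptions, not OpenAI's implementation: the experts are plain scalar functions, and the names `top_k_route`, `moe_forward`, and `make_expert` are hypothetical helpers introduced here for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(router_logits, k):
    """Pick the k highest-scoring experts and normalize their weights.

    Returns a list of (expert_index, weight) pairs. Ties keep the
    lower index first because Python's sort is stable.
    """
    ranked = sorted(
        range(len(router_logits)),
        key=lambda i: router_logits[i],
        reverse=True,
    )[:k]
    weights = softmax([router_logits[i] for i in ranked])
    return list(zip(ranked, weights))


def moe_forward(x, experts, router_logits, k):
    """Run only the k selected experts and mix their outputs."""
    output = 0.0
    for index, weight in top_k_route(router_logits, k):
        output += weight * experts[index](x)
    return output


def make_expert(index, scale, calls):
    """Hypothetical toy expert: multiplies by `scale` and logs that it ran."""
    def expert(x):
        calls.append(index)
        return scale * x
    return expert


if __name__ == "__main__":
    calls = []
    scales = (1.0, 2.0, 3.0, 4.0)
    experts = [make_expert(i, s, calls) for i, s in enumerate(scales)]
    # Illustrative router scores: experts 1 and 3 tie for the top spot.
    logits = [0.0, 5.0, 1.0, 5.0]
    print(moe_forward(1.0, experts, logits, k=2))  # prints 3.0
    print(sorted(calls))  # prints [1, 3]: only two of four experts ran
```

In a real model the router scores come from a learned layer and each expert is a feed-forward network, but the cost argument is the same: compute per token scales with the k experts that run, not with the total number of experts.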