Artificial General Intelligence (AGI)
Tutorials on VLA, VLA + Tactile Sensing, VLA + RL, Embodied World Models, and More Are Here!
具身智能之心· 2025-08-18 00:07
Core Viewpoint
- The exploration of Artificial General Intelligence (AGI) is increasingly focusing on embodied intelligence, which emphasizes the interaction and adaptation of intelligent agents within physical environments, enabling them to perceive, understand tasks, execute actions, and learn from feedback [1].
Industry Analysis
- In the past two years, numerous star teams in embodied intelligence have emerged, leading to the founding of valuable companies such as Xinghaitu, Galaxy General, and Zhujidongli, which are advancing embodied intelligence technology [3].
- Major domestic companies like Huawei, JD.com, Tencent, Ant Group, and Xiaomi are actively investing and collaborating to build a robust ecosystem for embodied intelligence, while international players like Tesla and investment firms are supporting companies like Wayve and Apptronik in the development of autonomous driving and warehouse robots [5].
Technological Evolution
The development of embodied intelligence has progressed through several stages:
- The first stage focused on grasp pose detection, which struggled with complex tasks due to a lack of context modeling [6].
- The second stage involved behavior cloning, allowing robots to learn from expert demonstrations but revealing weaknesses in generalization and in multi-target scenarios [6].
- The third stage introduced Diffusion Policy methods, enhancing stability and generalization by modeling action sequences, followed by the emergence of Vision-Language-Action (VLA) models that integrate visual perception, language understanding, and action generation [7].
- The fourth stage, starting in 2025, aims to integrate VLA models with reinforcement learning, world models, and tactile sensing to overcome current limitations [8].
Product and Market Development
- The evolution of embodied intelligence technologies has led to the emergence of various products, including humanoid robots, robotic arms, and quadrupedal robots, serving industries such as manufacturing, home services, dining, and medical rehabilitation [9].
- The demand for engineering and system capabilities is increasing as the industry shifts from research to deployment, which calls for training in platforms like MuJoCo, Isaac Gym, and PyBullet for policy training and simulation testing [23].
Educational Initiatives
- A comprehensive curriculum has been developed to cover the entire technology route of the embodied "brain + cerebellum," including practical applications and advanced topics, aimed at both beginners and those seeking to deepen their knowledge [10][20].
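Silicon Valley's King of Empty Promises "Falls from Grace": Altman's "Lying" Micro-Expressions Picked Apart as Netizens Call for Him to Step Down
36Kr · 2025-08-17 23:50
AGI is coming. We are now confident that we know how to build AGI as traditionally understood. GPT-5 is a major upgrade... an important step toward AGI. Actually, the term AGI is not very useful. Within just six months, OpenAI CEO Sam Altman put forward these views one after another: the first thrilled the whole world, the second and third stirred up users and investors, and the fourth all but negated everything that came before. In his mouth, the definition of AGI has become Schrödinger's cat: it both exists and does not exist, and it is both important and irrelevant. Altman's persona collapsed with GPT-5. Image from: @tsarnick. These subtle pieces of body language make it hard to tell whether he is gazing at the ceiling in search of inspiration or avoiding the camera out of a guilty conscience. Once netizens dug up these details and parsed them over and over, they became the last straw that broke Altman's persona. Before GPT-5 was released, Altman's teaser tweets grew more cryptic one after another. To this day, we still cannot figure out why seeing GPT-5 would leave him "dizzy, weak, and collapsed on the floor." That famous Death Star image, originally meant to create a sense of mystery, has now become a joke among netizens. In particular, after stressing the importance of AGI for most of the past year, he turned around last weekend and told foreign outlet CNBC: "I don't think this (AGI) is a very useful term." He explained that there are too many definitions of AGI ...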
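Will It No Longer Be Humans Who Change the World? OpenAI's Chief Scientist Says Bluntly: AI Is the Key Force
36Kr · 2025-08-17 23:47
August 16: In the latest episode of OpenAI's podcast, host Andrew Mayne (a former OpenAI engineer) was joined by the company's star duo, chief scientist Jakub Pachocki and researcher Szymon Sidor. The pair looked back on their path from high school classmates in Poland to colleagues at OpenAI, and discussed in depth key issues in the development of artificial intelligence, including the definition and measurement of artificial general intelligence (AGI), landmark technical breakthroughs, the challenges facing benchmarks, and AI's real-world impact on education, scientific research, and society. The key points: ● How the definition and measurement of AGI have evolved: AGI has been refined from an abstract concept into a multidimensional set of capabilities. Milestones such as an IMO gold medal are meaningful, but isolated breakthroughs are no longer enough; the focus should shift to AGI's impact on automated scientific research and real-world applications. ● The trajectory of AI breakthroughs: from the limitations of early sentiment analysis to the successive iterations of the GPT series, models can now take part in competitions such as the IMO, ICPC, and Japan's AtCoder, showing strong reasoning and creative thinking. ● Benchmark challenges and "saturation": many benchmarks have become "saturated," with models approaching or exceeding human level, yet they struggle to reflect intelligence comprehensively. Measurement needs to shift toward practical utility and the ability to discover new insights. Below are highlights from the podcast: Andrew Mayne: Hello everyone, I'm Andrew Mayne, ...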
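The Epochal Conflict Between AI and the Internet
36Kr · 2025-08-17 23:41
Current discussion of how AI will reshape the world is full of noise and fog. Some hype singularity anxiety about artificial general intelligence (AGI), and others count how many jobs AI has replaced, but these discussions often overlook a more fundamental and more urgent point: AI that fully represents an individual's will is heading toward an unavoidable, epochal conflict with the platform-centrism of today's internet. The core entry point of this conflict can be put in a single sentence: when the first instruction your personal AI agent (Personal Agent) receives is "I don't want to see any ads," while the core objective of the service platform's agent is "increase the ad conversion rate," what happens? The answer is that the advertising model, the cornerstone of internet business, will collapse in an instant. The lifeblood of the existing model lies in the platform's absolute control over the information feed, in its power to decide what you see and what you do not. Once AI gives every user a loyal "agent" whose highest directive is the individual's interest, that control is dismantled and the model breaks down. And if your agent listened to the platform instead of to you, would you use it? These are therefore two completely incompatible models. This is not a simple technical iteration but a redistribution of market dominance. 1. The Old Order Under Platform Dominance. To understand how disruptive this conflict is, one must first see the internet's underlying power structure over the past twenty years: an asymmetry of rights over attention. In the past, users were the suppliers of attention, while platforms (all the big internet companies) were attention's ...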
Tens of Millions of Dollars Raised: Former Bilibili VP, Out of the ICU, Builds a Startup That Has Passed 8 Million Users
Sou Hu Cai Jing· 2025-08-17 21:36
Core Insights
- Binson, a veteran of the internet industry, founded a new AI companionship product called "Doudou Game Partner," which gained 8 million users during its testing phase and received several rounds of funding totaling tens of millions of dollars [1][28].
- The product aims to provide not just companionship but also practical assistance in gaming, differentiating itself from traditional virtual pets by offering strategic advice and real-time game support [3][5].
- Binson's personal experience with a life-threatening accident has influenced his perspective on the importance of emotional connection and companionship in AI products [1][11].
Product Overview
- "Doudou Game Partner" is an AI companion designed to assist users while they play games, offering strategic insights and reminders during gameplay [3][5].
- The AI supports various popular games, providing tailored advice and emotional engagement, making it feel more like a gaming coach than a simple virtual pet [5][9].
- The product features voice interaction, allowing users to engage without diverting their attention from the game [5][11].
Market Positioning
- The company targets a large user base, aiming for "at least tens of millions, even hundreds of millions" of users, reflecting the potential market size in the gaming industry [11][67].
- Binson believes that the AI companionship market will expand as societal loneliness increases, positioning the product as a solution for emotional support [39][48].
Technology and Development
- The product uses advanced AI technologies, including vision-language models (VLMs) and real-time inference, to enhance user interaction and experience [31][34].
- Continuous improvements are being made to the AI's understanding and contextual awareness, with a focus on long-term user engagement and emotional connection [37][38].
User Engagement and Feedback
- The company emphasizes user satisfaction, monitoring retention rates and user engagement to gauge emotional connections with the AI [46].
- Users have expressed a willingness to wait for further improvements, indicating strong demand for the product despite its current limitations [29][28].
Competitive Landscape
- Binson acknowledges competition from both game developers and larger tech companies but believes that the unique focus on emotional companionship and cross-game support sets "Doudou Game Partner" apart [47][48].
- The company has established a strong emotional bond with its users, which is seen as a significant competitive advantage [49][50].
Future Outlook
- The company plans to expand its offerings beyond gaming, potentially integrating AI companionship into users' offline lives, such as managing daily tasks [27][39].
- Binson envisions a future where AI companionship becomes a standard part of life, addressing users' emotional needs in various contexts [39][48].
"Think-on-Demand" GPT-5 Sparks Controversy, but It May Be the Future of AI
财富FORTUNE· 2025-08-17 13:04
Core Viewpoint
- OpenAI's release of GPT-5 has turned into a public relations and trust crisis due to user dissatisfaction with the model's performance and the introduction of a new routing technology that users feel has stripped them of control [1][2].
Summary by Sections
Release and Initial Reactions
- The launch of GPT-5 was expected to solidify OpenAI's leadership in the AI field, but it faced backlash from users who mourned the loss of their favorite model, which served multiple roles [1].
- Critics, including Gary Marcus, labeled GPT-5 as "overhyped" and "lackluster," indicating a decline in user satisfaction [1].
Technical Aspects of GPT-5
- GPT-5 utilizes a routing technology that automatically allocates tasks to different sub-models, which has led to performance inconsistencies [1][3].
- Users were surprised to learn that GPT-5 is not a single model but a network of multiple models, some of which are less capable and cheaper [1][3].
User Response and Company Actions
- In response to the backlash, OpenAI re-enabled the earlier model GPT-4o for professional users and promised to fix routing issues and improve system stability [2].
- Anand Choudhury from FirstQuadrant commented on the dual nature of routing technology, highlighting its potential for both magic and failure [2].
Future of Model Routing Technology
- Experts believe that model routing technology will become standard due to the limitations of single models and the economic benefits of reusing older models [3][4].
- The physical limitations of GPU memory and the challenges in scaling models further support the need for routing technology [4].
Historical Context and Criticism
- The concept of model integration is not new, having emerged around 2018, but the current implementation in GPT-5 has been criticized for being overhyped [5].
- William Falken noted that the improvements from GPT-4 to GPT-5 are minimal compared to previous iterations, which has contributed to user dissatisfaction [6].
AGI Aspirations and Industry Perspectives
- The debate surrounding model routing has led to skepticism about the imminent realization of Artificial General Intelligence (AGI), with some experts questioning the feasibility of achieving AGI through current models [6][7].
- The industry recognizes that while routing technology offers advantages, the path to AGI remains complex and uncertain, with a need for a balance between specialized models and unified large models [7].
Altman's Persona Collapsed with GPT-5
虎嗅APP· 2025-08-17 10:23
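This article comes from the WeChat public account APPSO (ID: appsolution); original title: "Silicon Valley's King of Empty Promises 'Falls from Grace': Altman's 'Lying' Micro-Expressions Picked Apart as Netizens Call for Him to Step Down"; header image from Visual China. AGI is coming. We are now confident that we know how to build AGI as traditionally understood. GPT-5 is a major upgrade... an important step toward AGI. Actually, the term AGI is not very useful. Within just six months, OpenAI CEO Sam Altman put forward these views one after another: the first thrilled the whole world, the second and third stirred up users and investors, and the fourth all but negated everything that came before. In his mouth, the definition of AGI has become Schrödinger's cat: it both exists and does not exist, and it is both important and irrelevant. Altman's persona collapsed with GPT-5. Although everyone had long since come to expect Altman's "master marketer" persona, the GPT-5 flop still came as a shock. Beyond their disappointment with the product, netizens also dug up an interesting detail about Altman. Noted scholar Gary Marcus, on X ...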
Inside Google's Genie 3: The Hottest AI Hit Since Sora, Ushering in a New Era of World Models
36Kr · 2025-08-17 08:44
Core Insights
- Genie 3 is one of the most advanced world models ever created, capable of generating fully interactive and highly consistent environments in real time from text input, marking a significant step towards AGI and embodied agents [1][6][26].
Group 1: Development and Features
- Genie 3 is the result of collaboration between two DeepMind projects, Veo 2 and Genie 2, and is designed to retain spatial memory for up to one minute [4][6].
- The model can generate dynamic worlds at a resolution of 720p and up to 24 frames per second, allowing for real-time exploration [6][9].
- Spatial memory is a key feature, enabling the model to remember actions taken in the environment, such as painting a wall and finding the marks still there when returning to the same spot [10][11].
Group 2: Performance and Capabilities
- Genie 3 has achieved breakthroughs in video generation duration, world consistency, content diversity, and spatial memory [8][16].
- The model demonstrates high consistency, maintaining the appearance of objects throughout interactions, even when they temporarily leave the field of view [11][12].
- The model's ability to simulate physical effects, such as water dynamics and lighting changes, has significantly improved, making generated content nearly indistinguishable from real video [17][18][20].
Group 3: Future Prospects and Applications
- The team emphasizes the importance of enhancing the model's capabilities to create broader impact, with plans to eventually open access to Genie 3 [26][27].
- Future developments will focus on improving realism and interactivity, with the potential for robots to learn in virtually generated environments, overcoming the limitations of real-world data collection [32][33].
- The philosophical question of whether humans live in a simulation is addressed, with the suggestion that if so, it would run on hardware fundamentally different from current computers [34][36].
Altman's Persona Collapsed with GPT-5
Hu Xiu· 2025-08-16 11:03
Core Viewpoint
- The article discusses the recent controversies surrounding OpenAI CEO Sam Altman, particularly in relation to the launch of GPT-5 and the concept of AGI, highlighting the disconnect between his promises and the actual product performance, leading to a loss of credibility and trust among users and investors [4][20][34].
Group 1: AGI and GPT-5
- Altman claims that GPT-5 is a significant upgrade and a crucial step towards AGI, yet he later downplays the importance of AGI as a term, stating it is not very useful [2][11].
- The definition of AGI has become ambiguous in Altman's statements, leading to confusion about its significance [5][11].
- Despite the hype, the release of GPT-5 has not met user expectations, resulting in a sharp decline in OpenAI's perceived credibility [12][15].
Group 2: Altman's Leadership and Public Perception
- Altman's persona as a "marketing master" has been challenged following the disappointing reception of GPT-5, with users expressing disappointment and scrutinizing his behavior during public appearances [6][10].
- Observations of Altman's body language during discussions about GPT-5 suggest a tendency to avoid direct engagement when making bold claims, raising questions about his sincerity [7][9].
- Criticism of Altman's leadership style has intensified, with calls for his resignation, as some believe he is more suited for sales than for leading OpenAI [20][24].
Group 3: OpenAI's Business Model and Market Position
- OpenAI's initial mission to create AGI for the benefit of humanity has shifted towards a more profit-driven approach, leading to skepticism about its original ideals [11][18].
- The company has seen significant user growth, with ChatGPT's weekly active users reaching 700 million, but this growth is now threatened by increasing competition from rivals like Google and Anthropic [12][34].
- The article suggests that the current marketing-driven approach may not be sustainable, as unmet expectations could lead to a backlash from users [14][35].
Group 4: Industry Context and Future Implications
- The article reflects on the broader implications of Altman's leadership and OpenAI's trajectory for the AI industry, suggesting that a more competitive landscape could foster genuine innovation [36][37].
- The narrative surrounding Altman and OpenAI serves as a cautionary tale about the risks of over-promising and under-delivering in high-tech industries, where founder personas often serve as trust proxies for investors and consumers [31][32].
As the AI Race Intensifies, Meta Restructures Its AI Team for the Fourth Time in Six Months
Feng Huang Wang· 2025-08-16 09:21
Group 1
- Meta is planning a comprehensive restructuring of its artificial intelligence team, marking the fourth major overhaul in six months [1].
- The new Superintelligence Labs will be divided into four groups: a TBD Lab, a product team that includes the Meta AI assistant, an infrastructure team, and the Fundamental AI Research (FAIR) lab, which focuses on long-term research [1].
- The restructuring follows the formation of Superintelligence Labs in July, a high-risk move made after senior employee departures and the poor reception of the Llama 4 model [1].
Group 2
- Meta has been actively pursuing advances in artificial intelligence, with CEO Mark Zuckerberg accelerating the development of artificial general intelligence amid increasing competition in Silicon Valley [2].
- The company plans to invest hundreds of billions of dollars in building several large AI data centers, and recently secured $29 billion in financing from PIMCO and Blue Owl Capital for expansion in rural Louisiana [2].
- Meta has raised its annual capital expenditure forecast by $2 billion to a range of $66 billion to $72 billion, citing rising costs for data center infrastructure and employee salaries, which will drive expense growth rates in 2026 [2].