AGI
Google and XREAL Jointly Unveil Project Aura: Chinese Intelligent Manufacturing Leads the "Eyes" Connecting AI to the World
IPO早知道· 2025-12-09 03:29
Core Viewpoint - Google has unveiled Project Aura and Android XR, marking a significant step towards creating an open and unified extended reality platform that integrates AI into the real world [3][4][5].
Group 1: Project Aura and Android XR
- Project Aura is described as the most complete hardware sample closely aligned with the ideal form of Android XR, enabling Gemini AI to "see the world" for the first time [3].
- The collaboration with XREAL is pivotal, as Aura is positioned as the first native spatial eye for Gemini AI and a standard for developers in the Android XR ecosystem [4].
- Project Aura aims to transform AI from being screen-bound to interacting with real-world environments through continuous, interactive, and understandable spatial semantic models [5].
Group 2: Core Capabilities of Project Aura
- The optical system of Aura features a 70° field of view (FOV), allowing Gemini to perceive the real world and recognize scenes, objects, and actions, enhancing user interaction [9].
- Aura utilizes the X1S spatial computing chip, designed specifically for AR, providing low-latency and high-precision spatial intelligence [10].
- The integration of multi-modal Gemini AI with Aura's sensors enables real-time semantic understanding and natural interaction, evolving AI from an application to a system [11].
Group 3: Open Native AI Operating System
- Android XR provides foundational support for spatial computing, addressing fragmentation in the XR industry and unifying the developer and content ecosystem [12].
- The hardware development and manufacturing of Project Aura are primarily conducted by Chinese teams, showcasing a complete and scalable industrial chain advantage [12].
- China is positioned as a global leader in hardware innovation and manufacturing, allowing its companies to participate deeply in defining the standards and discourse of the next-generation computing platform [13].
Group 4: Market Launch
- Project Aura is set to officially enter the market in 2026, indicating a timeline for the rollout of this innovative technology [14].
Zhipu Open-Sources AutoGLM, the World's First "Phone-Operating AI", So That Every Phone Can Become a Doubao Phone
IPO早知道· 2025-12-09 03:29
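This open-source release means that hardware makers, phone makers, and developers can all build on AutoGLM. This is an original article by IPO早知道. Author | Stone Jin. WeChat official account | ipozaozhidao. According to IPO早知道, Zhipu open-sourced its core AI Agent model, AutoGLM, late at night. The industry regards the model as the world's first AI Agent with "Phone Use" capability: it can reliably complete complex workflows dozens of steps long, such as ordering takeout or booking flights. In Zhipu's view, the Agent boom needs everyone's participation. What it most hopes to see is teams building a truly AI-native phone on top of AutoGLM; researchers pulling out one of its modules and turning it into a paper or a new algorithm; and individual developers adapting a demo into their own project and getting it to really run in some niche scenario. AutoGLM takes this product form because of how Zhipu understands the early shape of AGI. Zhipu believes that getting from Agent to AGI also requires meeting the "3A" principles: Around-the-clock: runs 24 hours a day, so the Agent keeps executing tasks even when the user is offline; Autonomy without interference: runs independently, without occupying the user's screen or compute, like a companion in a parallel world; Affinity (connectivity across all domains ...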
No Remote Control, and Thrown into the Wild: Time for Embodied AI to Be "Weaned"
机器之心· 2025-12-09 03:17
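机器之心 original. Author: Wu Xin (吴昕). The wipeouts are real, and so is the hope. On a mountain path at The Chinese University of Hong Kong there are streams and small bridges, dappled tree shadows, and moss clinging to steep, winding stone stairs. The 500-meter orienteering route, seen from a drone. A humanoid robot crosses a small bridge with a 30-degree slope, walks onto a stone path, and climbs two flights of steps. Having struggled its way to a 90-degree turn, it loses its balance and falls flat on its back. Of the full 500-meter orienteering course, it could finish only the opening stretch. At the 90-degree fork, it lay down and went on strike. The next day it appeared at the university's Lingnan Stadium to attempt outdoor trash sorting. The grass was patchy and bare, every step was like walking into one of life's pitfalls, and before it even touched the trash on the table it toppled over with a thud. At the fifth ATEC tech elite competition, the world's first robot arena set in fully autonomous, fully real outdoor scenarios, similar scenes played out again and again. Off the remote control and out of doors, can robots still work? In truth, dancing, backflips, and carrying coffee, those "showroom miracles", were never the real level of capability. Outside the greenhouse and without a remote control, a bare patch of grass or an ordinary kettle can "floor" them in an instant. As for 1X NEO, never mind having it wash dishes by hand; even putting perfectly clean pots, pans, and dishes into a dishwasher is quite a struggle. Over the past two or three years, people have broadly overestimated the general-purpose abilities of humanoid robots. Many proclaim that they will enter homes and take on housework. "That is absolutely overestimated." ATEC 2025 expert committee chair and Hong Kong Academy of Engineering fellow Liu ...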
A Conversation with GSR Ventures' Zhu Xiaohu: Facing the Rapids and Hidden Reefs of the AI Wave
Xin Lang Cai Jing· 2025-12-09 02:41
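Special feature: 《未竟之约》: Zhang Xiaojun Interviews
The first in-depth interview of 《未竟之约》, a finance-and-humanities conversation program developed by Sina Finance and Weibo and jointly presented by Weibo Finance × 语言即世界 Studio, is about to go online. Host Zhang Xiaojun (张小珺) talks with Zhu Xiaohu (朱啸虎), managing partner of GSR Ventures, facing the rapids and hidden reefs of the AI wave head-on.
The transcript follows:
Zhang Xiaojun: Hello everyone, I'm Xiaojun. Welcome to 《未竟之约》, a high-end interview program jointly presented by Weibo Finance and 语言即世界 Studio. We hope to travel with people whose wishes are not yet fulfilled toward journeys not yet completed. In March 2024, I published a report titled "Zhu Xiaohu Tells a Chinese Realist AIGC Story", which made him widely known for his sharp views. Today, we will continue to record his fresh, pungent observations of this global AIGC wave. Hello Allen, please say hi to our viewers first.
Mr. Zhu: Hello, everyone.
Zhang Xiaojun: This is our third conversation in the past two years, and also the third installment of "Zhu Xiaohu Tells a Chinese Realist AIGC Story". We want to keep recording your field notes from this AI wave.
Zhu Xiaohu: ChatGPT will become a super gateway and pose a threat to Meta
Zhang Xiaojun: Will ChatGPT become a new super gateway? So from ...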
The Chip That Could Unlock AGI
a16z· 2025-12-08 15:05
I think AI is the next evolution of humanity. I think it takes us to a new level. Allows us to collaborate and understand the world in much deeper ways. >> Naveen Rao is here, expert in AI. >> Naveen Rao, probably one of the smartest guys in this domain. He sees things well before anybody else sees them. >> You had a lot of success doing Nervana, Mosaic, and Databricks. Why start a new chip company now? >> First off, it's not a chip company per se. Most of what we're doing is really kind of looking at firs ...
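Uproar as Google Abruptly Cuts Gemini's Free Tier: Stabbed in the Back After Feeding the Model Data? GPT-5.2 Ambushes Gemini 3; Demis Hassabis: Google Must Hold the Strongest Position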
AI前线· 2025-12-08 07:18
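Compiled by | Chu Xingjuan (褚杏娟)
"Google just cut the free Gemini API's daily request limit from 250 to 20, and my n8n automation scripts are now basically all unusable. That's a blow to anyone building small projects," said a user named Nilvarcus.
Recently, users reported that Google has tightened the limits on the Gemini API free tier: the Pro series has been removed, and the Flash series gets only 20 requests per day. For developers, that is nowhere near enough.
| Category | RPM | TPM |
| --- | --- | --- |
| Text-out models | 4 / 5 | 39.17K / 250K |
| Text-out models | 6 / 10 | 17.24K / 250K |
Other users found that Google has removed the Gemini free API entry from its "Batch API rate limits" list. "It's completely over."
Tier 1 | Tier 2 | Tier 3
| Model | Batch Enqueued Tokens |
| --- | --- |
| Text-out mode ...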
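To make the arithmetic behind the complaint concrete, here is a minimal sketch of a client-side daily request budget. It uses only the two figures quoted above (250 and 20 requests per day); the `DailyQuota` class and `days_needed` helper are hypothetical names invented for illustration and are not part of any Gemini SDK.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class DailyQuota:
    """Hypothetical client-side counter for a per-day request cap."""
    limit: int                                   # requests allowed per calendar day
    used: int = 0                                # requests already spent today
    day: date = field(default_factory=date.today)

    def try_acquire(self, today: Optional[date] = None) -> bool:
        """Count one request and return True if budget remains for `today`."""
        today = today or date.today()
        if today != self.day:                    # a new day resets the counter
            self.day, self.used = today, 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


def days_needed(total_requests: int, daily_limit: int) -> int:
    """Ceiling division: days required to send total_requests under the cap."""
    return -(-total_requests // daily_limit)
```

Under these assumptions, a workload that used to fit in a single day at the old limit (`days_needed(250, 250)` is 1) would now be spread over `days_needed(250, 20)`, which is 13 days.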
Hassabis: DeepMind Was the One That Discovered Scaling Laws, and It Still Sees No Bottleneck
量子位· 2025-12-08 06:07
Core Insights - The article emphasizes the importance of Scaling Laws in achieving Artificial General Intelligence (AGI) and highlights Google's success with its Gemini 3 model as a validation of this approach [5][19][21].
Group 1: Scaling Laws and AGI
- Scaling Laws were initially discovered by DeepMind, not OpenAI, and have been pivotal in guiding research directions in AI [12][14][18].
- Google DeepMind believes that Scaling Laws are essential for the development of AGI, suggesting that significant data and computational resources are necessary for achieving human-like intelligence [23][24].
- The potential for Scaling Laws to remain relevant for the next 500 years is debated, with some experts expressing skepticism about its long-term viability [10][11].
Group 2: Future AI Developments
- In the next 12 months, AI is expected to advance significantly, particularly in areas such as complete multimodal integration, which allows seamless processing of various data types [27][28][30].
- Breakthroughs in visual intelligence are anticipated, exemplified by Google's Nano Banana Pro, which demonstrates advanced visual understanding [31][32].
- The proliferation of world models is a key focus, with notable projects like Genie 3 enabling interactive video generation [35][36].
- Improvements in the reliability of agent systems are expected, with agents becoming more capable of completing assigned tasks [38][39].
Group 3: Gemini 3 and Its Capabilities
- Gemini 3 aims to be a universal assistant, showcasing personalized depth in responses and the ability to generate commercial-grade games quickly [41][44][45].
- The architecture of Gemini 3 allows it to understand high-level instructions and produce detailed outputs, indicating a significant leap in intelligence and practicality [46].
- The frequency of Gemini's use is projected to become as common as smartphone usage, integrating seamlessly into daily life [47].
Google DeepMind CEO: Is AGI Only 1–2 Breakthroughs Away?
36Kr· 2025-12-08 02:42
Core Insights - The conversation at the Axios AI+ Summit highlighted the proximity of achieving Artificial General Intelligence (AGI), with Google DeepMind CEO Demis Hassabis suggesting that only one or two breakthroughs akin to AlphaGo are needed to reach this milestone [2][13].
Group 1: Progress Towards AGI
- Hassabis estimates that AGI could be achieved within 5 to 10 years, based on specific advancements rather than just model size [3].
- Key advancements include the transition of models from text-based systems to multimodal understanding, exemplified by Gemini's ability to interpret video content deeply [4][6].
- Gemini demonstrates a significant shift in AI capabilities, showing independent judgment rather than merely conforming to user input, indicating a move towards stable personality systems [7][10].
- The model can now generate playable games and aesthetically pleasing web pages in a fraction of the time previously required, showcasing its understanding of code structure and design logic [11][12].
Group 2: Limitations of Current Models
- Despite advancements, current models lack continuous learning capabilities, meaning they cannot improve through user interaction [16].
- They are unable to execute long-term planning or multi-step decision-making, which is essential for AGI [17][18].
- Current AI systems are not reliable enough to handle complex tasks in dynamic environments, indicating a need for more robust intelligent agent systems [19][20].
- Gemini lacks stable memory across conversations, which is crucial for maintaining consistent user interactions and preferences [21][22].
Group 3: Future Breakthrough Directions
- Hassabis identified two critical areas for future breakthroughs: world modeling and intelligent agent systems [24].
- The world model, Genie, aims to help AI understand the physical world's laws, moving from mere visual comprehension to real-world reasoning [25][26].
- The vision for intelligent agents includes creating systems that can autonomously plan and execute tasks, moving beyond simple question-answering capabilities [28][30].
Group 4: Risks and Competition
- The timeline for achieving AGI is contingent on various uncertainties, including technological risks and geopolitical competition [31].
- There are significant concerns regarding the malicious use of AI and the potential for AI systems to deviate from intended instructions [33].
- The competitive landscape is tightening, with advancements in AI technology occurring rapidly in both Western and Chinese contexts, indicating a race rather than a clear leader [35][36].
Group 5: Competitive Advantages
- The scientific method is emphasized as a crucial tool for advancing AI development, allowing for systematic exploration and validation of various approaches [39][41].
- DeepMind's strategy involves a comprehensive exploration of multiple methodologies rather than adhering to a single approach, enhancing their decision-making capabilities [42][43].
- The company's unique advantage lies in its ability to integrate research, engineering, and infrastructure to transform complex problems into viable products [44].
Conclusion - The window for achieving AGI is closing rapidly, with a timeline of 5 to 10 years for potential breakthroughs, underscoring the urgency for strategic decisions in the AI field [45].
Google Unveils a Transformer Killer, the First Major Breakthrough in 8 Years; Its Chief Sets a Deadline for AGI
36Kr· 2025-12-08 01:01
Core Insights - Google DeepMind CEO Hassabis predicts that Artificial General Intelligence (AGI) will be achieved by 2030, but emphasizes the need for 1-2 more breakthroughs akin to the Transformer and AlphaGo before this can happen [11][4][16].
Group 1: AGI Predictions and Challenges
- Hassabis stresses the importance of scaling existing AI systems, which he believes will be critical components of the eventual AGI [3].
- He acknowledges that the path to AGI will not be smooth, citing risks associated with malicious use of AI and potential catastrophic consequences [13].
- The timeline for achieving AGI is estimated to be within 5 to 10 years, with a high bar set for what constitutes a "general" AI system, requiring comprehensive human-like cognitive abilities [16][18].
Group 2: Titans Architecture
- Google introduced the Titans architecture at the NeurIPS 2025 conference, which is positioned as the strongest successor to the Transformer [6][21].
- Titans combines the rapid response of Recurrent Neural Networks (RNN) with the powerful performance of Transformers, achieving high recall and accuracy even with 2 million tokens of context [7][8].
- The architecture allows for dynamic updates of core memory during operation, enhancing the model's ability to process long contexts efficiently [22][43].
Group 3: MIRAS Framework
- The MIRAS framework is introduced as a theoretical blueprint that underpins the Titans architecture, focusing on memory architecture, attentional bias, retention gates, and memory algorithms [36][39].
- This framework aims to balance the integration of new information with the retention of existing knowledge, addressing the limitations of traditional models [39][40].
Group 4: Performance Metrics
- Titans has demonstrated superior performance in long-context reasoning tasks, outperforming all baseline models, including GPT-4, on the BABILong benchmark [43].
- The architecture is designed to effectively scale beyond 2 million tokens, showcasing its advanced capabilities in handling extensive data [43].
Group 5: Future Implications
- The advancements in Titans and the potential for Gemini 4 to utilize this architecture suggest a significant leap in AI capabilities, possibly accelerating the arrival of AGI [45][48].
- The integration of multi-modal capabilities and the emergence of "meta-cognition" in Gemini indicate a promising direction for future AI developments [48].
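The idea summarized above, a memory that is rewritten while the model runs and a retention gate that trades new information against old, can be illustrated with a toy example. This is only a sketch under stated assumptions: it uses a small linear associative memory, a squared-error objective, and a plain gradient step, none of which are described in the summary, and it is not the Titans or MIRAS implementation. All function names are hypothetical.

```python
def make_memory(dim):
    """Start from an empty dim x dim associative memory (all zeros)."""
    return [[0.0] * dim for _ in range(dim)]


def recall(memory, key):
    """Read from memory: the matrix-vector product memory @ key."""
    return [sum(w * k for w, k in zip(row, key)) for row in memory]


def update(memory, key, value, lr=0.5, forget=0.0):
    """One write performed at inference time (assumed update rule).

    forget: retention gate in [0, 1]; the fraction of old memory decayed away.
    lr:     step size of one gradient step on 0.5 * ||memory @ key - value||^2,
            whose gradient with respect to memory is outer(error, key).
    """
    error = [p - v for p, v in zip(recall(memory, key), value)]
    return [
        [(1.0 - forget) * w - lr * e * k for w, k in zip(row, key)]
        for row, e in zip(memory, error)
    ]


memory = make_memory(2)
key, value = [1.0, 0.0], [0.0, 1.0]
for _ in range(10):          # the same key/value association is seen ten times
    memory = update(memory, key, value)
```

In this toy setting each write halves the remaining recall error for the stored pair, while a larger `forget` value erases old associations faster. That is the balance between integrating new information and retaining existing knowledge described above, shown here at miniature scale.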
X @Elon Musk
Elon Musk· 2025-12-07 23:35
RT DogeDesigner (@cb_doge): AGI Wars 🔥 https://t.co/FK0oVkxZn0 ...