World Models

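Tencent doubles down on spatial-intelligence large models, a track that is becoming the next big trend
Chief Business Review (首席商业评论) · 2025-08-09 04:17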
Core Viewpoint
- Tencent's Hunyuan 3D model represents a significant advancement in the creation of immersive 3D environments, allowing users to generate complete scenes from text or images, thus democratizing access to 3D content creation [3][4][5].
Group 1: Hunyuan 3D Model Features
- The Hunyuan 3D World Model 1.0 supports 360° immersive roaming, asset export in standard mesh format, and editing with mainstream modeling software, marking a leap from "AI can draw" to "humans can use" [3][7].
- The model has surpassed state-of-the-art (SOTA) open-source models in quality across various evaluation dimensions, including texture detail and aesthetic quality [7].
- Tencent plans to release a series of open-source initiatives, including multimodal understanding models and game vision models, to create a comprehensive ecosystem for 3D AIGC creation [7][9].
Group 2: User Experience and Accessibility
- Users can generate a 360-degree immersive scene from simple text descriptions or images, enabling the creation of complex environments with dynamic elements [8].
- The model allows for the construction of "walkable" scene maps, enhancing interactivity and user experience compared to previous models that lacked spatial continuity [8][9].
- The hybrid approach of combining 2D and 3D elements in scene generation addresses the limitations of purely 3D or 2D models, providing a more stable and diverse creative output [8].
Group 3: Impact on Game Development
- The Hunyuan 3D model significantly reduces the time required to create high-quality scene prototypes, shortening development cycles and lowering trial-and-error costs [9].
- It lowers the barrier for 3D enthusiasts and content creators, allowing them to create virtual worlds without needing advanced modeling skills [9].
Group 4: Future of Spatial Intelligence
- The development of spatial intelligence models, like the Hunyuan 3D model, is seen as a precursor to more complex world models that incorporate physical and causal reasoning [11][12].
- The concept of world models is gaining traction as a critical breakthrough in AI, enabling machines to understand and simulate complex physical environments [11][12][14].
- Major tech companies, including Google and Nvidia, are investing in world models, indicating a competitive landscape focused on advancing spatial intelligence capabilities [14][22].
Group 5: Tencent's Strategic Position
- Tencent's capital expenditure for AI initiatives reached 76.7 billion yuan in 2024, a 221% increase year-on-year, reflecting its commitment to AI development [24].
- The company has established a comprehensive model system, with its Hunyuan models ranking among the top globally, showcasing its competitive edge in the AI landscape [24][27].
- Tencent aims to create a supportive infrastructure for small developers, emphasizing collaboration and ecosystem building rather than monopolistic practices [24][27].
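A conversation with Gao Yang of Qianxun Intelligence (千寻智能): scientists founding startups is not very "reliable", but entrepreneurship is like a game
36Kr · 2025-08-08 09:28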
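For embodied-intelligence startups, the goal is to be Apple, not Android.
By Qiu Xiaofen | Editor: Su Jianxun | Source: 智能涌现 (ID: AIEmergence) | Cover image: 视觉中国
Whether at the just-concluded WAIC (World Artificial Intelligence Conference) or at WRC (World Robot Conference), which opens this week, how can you judge a robot's real capability on the show floor?
Gao Yang, co-founder of the embodied-intelligence company Qianxun Intelligence (千寻智能), offers a few tips:
- For a robot that claims it can fold clothes, try crumpling a garment into a ball and tossing it on the table, then watch whether it can still complete the task; or hand it trousers and a jacket to see whether it generalizes across categories.
- While the robot is operating, watch whether its movements are smooth and fluid rather than jerky, since this reflects how well its reasoning and motion are coordinated...
Gao Yang, who offered these pointers, is one of the most sought-after founders in embodied intelligence today. After earning his PhD at the University of California, Berkeley, he chose to return to China as an assistant professor at Tsinghua University's Institute for Interdisciplinary Information Sciences.
In 2023, he co-founded Qianxun Intelligence with Han Fengtao, former CTO of Rokae Robotics (珞石机器人). Han has deep hardware experience and has overseen the mass production and shipment of tens of thousands of robots, while Gao brings an AI research background; this pairing of academia and industry has made Qianxun ...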
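A conversation with Gao Yang of Qianxun Intelligence (千寻智能): scientists founding startups is not very "reliable", but entrepreneurship is like a game
36Kr · 2025-08-08 01:49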
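For embodied-intelligence startups, the goal is to be Apple, not Android.
By Qiu Xiaofen | Editor: Su Jianxun
Whether at the just-concluded WAIC (World Artificial Intelligence Conference) or at WRC (World Robot Conference), which opens this week, how can you judge a robot's real capability on the show floor?
Gao Yang, co-founder of the embodied-intelligence company Qianxun Intelligence (千寻智能), offers a few tips:
- For a robot that claims it can fold clothes, try crumpling a garment into a ball and tossing it on the table, then watch whether it can still complete the task; or hand it trousers and a jacket to see whether it generalizes across categories.
- While the robot is operating, watch whether its movements are smooth and fluid rather than jerky, since this reflects how well its reasoning and motion are coordinated...
Gao Yang, who offered these pointers, is one of the most sought-after founders in embodied intelligence today. After earning his PhD at the University of California, Berkeley, he chose to return to China as an assistant professor at Tsinghua University's Institute for Interdisciplinary Information Sciences.
In 2023, he co-founded Qianxun Intelligence with Han Fengtao, former CTO of Rokae Robotics (珞石机器人). Han has deep hardware experience and has overseen the mass production and shipment of tens of thousands of robots, while Gao brings an AI research background; this pairing of academia and industry has made Qianxun Intelligence one of the hottest companies in the current wave of embodied intelligence.
In the 19 months since its founding, the company has raised more than RMB 1 billion in total. Its investors include Huawei Hubble, JD.com, CATL, 顺为资 ...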
When AI "sees" the world, the future of business is being completely reshaped | 两说 (Liang Shuo)
Di Yi Cai Jing · 2025-08-07 10:20
Group 1: AI Impact on Labor Market
- AI is predicted to take over creative tasks, not just repetitive jobs, with experts suggesting that roles such as financial analysts and scriptwriters may be at risk [7][9].
- Those who do not understand or utilize AI are likely to be the first to be eliminated from the workforce [7].
Group 2: Integration of AI with Navigation Systems
- The integration of AI with China's BeiDou navigation system is expected to create a trillion-dollar industry, enhancing capabilities beyond navigation to include disaster response and urban planning [10].
Group 3: World Models as a Key to Physical Interaction
- The concept of world models is introduced as the next generation of AI, enabling machines to understand spatial relationships and perform complex tasks in physical environments [13].
Group 4: Revolution in Content Creation
- AI-generated content (AIGC) is set to revolutionize the content industry, with AI tools allowing creators to produce high-quality content significantly faster than traditional methods [15].
Group 5: Ethical Governance of AI
- The ultimate challenge for AI development is governance, focusing on ensuring AI does not become a tool for domination, with a call for global participation in AI governance [18].
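[In-Depth Report / Pony.ai] Revolutionizing transportation: Robotaxi drives toward the future
Soochow Securities Auto Research, Huang Xili's team (东吴汽车黄细里团队) · 2025-08-06 13:52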
Investment Highlights
- The cost of Robotaxi is decreasing, with BOM costs dropping to around 300,000 yuan, aided by mass production of autonomous driving kits and reductions of 80% and 68% in the costs of onboard computing units and LiDAR, respectively [3][48].
- The company has a strong technical foundation and is leading in commercialization, with over 10 billion kilometers of testing data generated through its PonyWorld platform [4][66].
- The company is expanding its operations in major cities like Beijing, Shanghai, Guangzhou, and Shenzhen, while also pursuing international markets, having obtained Robotaxi licenses in the US, South Korea, and Luxembourg [5][62].
Business Model and Financials
- The company's revenue from autonomous driving truck logistics is expected to grow significantly, with a 61.3% increase projected for 2024 [23].
- The company's total revenue is forecasted to reach 78 million USD in 2025, with a rapid scale-up expected as the Robotaxi business model matures [6].
- The gross margin is under pressure due to the increasing share of lower-margin autonomous truck logistics revenue, but there is potential for improvement as operational efficiency increases [26].
Market Potential
- The Robotaxi market in China is projected to reach 200 billion yuan, with significant growth expected as it replaces traditional shared mobility services [52].
- The company is well-positioned to benefit from a supportive policy environment and advancements in autonomous driving technology, which are expected to drive down costs and enhance profitability [59][60].
Technological Advancements
- The company's latest generation of Robotaxi vehicles features advanced sensor configurations, including 9 LiDARs and 14 cameras, enabling 360-degree detection and a range of up to 650 meters [70].
- The integration of multi-modal language models into the autonomous driving system enhances its ability to understand complex traffic scenarios and improve decision-making [34][38].
Regulatory Environment
- The regulatory framework for autonomous vehicles in China is evolving, with increasing support for testing and commercial operations, which is expected to accelerate the industry's growth [59][62].
- The company is actively participating in pilot programs across various cities, contributing to the establishment of a robust operational framework for autonomous driving [62].
Computer industry major-event commentary: Genie 3 achieves world interaction, a key step toward AGI
Huachuang Securities· 2025-08-06 09:34
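Securities research report: computer industry major-event commentary
Genie 3 achieves world interaction, a key step toward AGI
Industry research · Computer · 2025-08-06 · Rating: Recommend (maintained)
Huachuang Securities Research Institute · Securities analyst: Wu Mingyuan · Email: wumingyuan@hcyjs.com · License no.: S0360523040001 · Contact: Zhou Zhihao
Event:
❑ On August 5, Google DeepMind released the world model Genie 3. The new version is the first in the model series to achieve real-time interactive simulation, and it can generate highly diverse virtual environments.
Commentary:
❑ Investment advice and related stocks: we suggest watching AI application directions:
Performance relative to index
| % | 1M | 6M | 12M |
| --- | --- | --- | --- |
| Absolute performance | 9.1% | 12.5% | 77.7% |
| Relative performance | 6.1% | 4.4% | 54.9% |
[Chart: computer sector vs. CSI 300 (沪深300), 2024-08-06 to 2025-08-05]
Related research reports
"Computer industry in-depth research report: Kimi: K2 model joins the ranks of global open-source SOTA" ...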
OpenAI, Google and others update multiple models late at night, showcasing progress in open source, agents, and world models
Di Yi Cai Jing· 2025-08-06 04:59
Core Insights
- Major AI companies released new products, showcasing shifts in product strategies, particularly OpenAI's transition to open-source models and Anthropic's focus on incremental updates [1][3].
OpenAI
- OpenAI launched two open-source models: gpt-oss-120b with 117 billion parameters and gpt-oss-20b with 21 billion parameters, both utilizing MoE architecture [2].
- The gpt-oss-120b model can run on an 80GB GPU, while gpt-oss-20b can operate on consumer devices with 16GB memory, allowing local deployment on laptops and mobile phones [2].
- These models achieved top-tier performance in benchmark tests, with gpt-oss-120b scoring close to or exceeding the closed-source o4-mini model [2].
Anthropic
- Anthropic introduced Claude Opus 4.1, marking a shift towards more frequent, incremental updates rather than focusing solely on major version releases [3].
- The new model demonstrated improved capabilities in complex multi-step problem-solving and coding tasks, with a SWE-bench Verified score of 74.5%, surpassing the previous version [4].
Google
- Google launched Genie 3, its first world model allowing real-time interaction, building on previous models Genie 1 and Genie 2 [5].
- Genie 3 can simulate diverse environments and natural phenomena, maintaining visual consistency for up to several minutes at 720p resolution [6].
- Despite advancements, Genie 3 has limitations, such as restricted action space and challenges in simulating multiple agents in shared environments [9].
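The OpenAI hardware figures in this summary (117 billion parameters on an 80GB GPU, 21 billion parameters in 16GB of memory) invite a quick sanity check. The sketch below is a back-of-the-envelope calculation only, not something stated by OpenAI or the article: it assumes 1 GB = 10^9 bytes and that the weights alone must fit in the quoted memory, ignoring activations, KV cache, and runtime overhead. The helper names are hypothetical.

```python
def max_bits_per_param(memory_gb: float, params_billion: float) -> float:
    """Upper bound on average bits per parameter if the weights alone must fit.

    Assumption: 1 GB = 1e9 bytes; activations and caches are ignored.
    """
    memory_bits = memory_gb * 1e9 * 8
    return memory_bits / (params_billion * 1e9)


def weights_gb(params_billion: float, bits_per_param: float) -> float:
    """Weight storage in GB (1e9 bytes) at a given average precision."""
    return params_billion * 1e9 * bits_per_param / 8 / 1e9


# Figures quoted in the summary: (model name, memory in GB, parameters in billions)
quoted = [("gpt-oss-120b", 80, 117), ("gpt-oss-20b", 16, 21)]

for name, mem_gb, params_b in quoted:
    budget = max_bits_per_param(mem_gb, params_b)
    at_16_bit = weights_gb(params_b, 16)
    print(f"{name}: <= {budget:.2f} bits/param to fit {mem_gb} GB; "
          f"16-bit weights would need {at_16_bit:.0f} GB")
```

Under these assumptions both budgets come out below 8 bits per parameter, so plain 16-bit weights would not fit in the quoted memory; the figures imply some form of low-precision weight storage, which the summary itself does not describe.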
OpenAI, Google and others update multiple models late at night, showcasing progress in open source, agents, and world models
Di Yi Cai Jing· 2025-08-06 04:49
Core Insights
- The recent product launches by OpenAI, Anthropic, and Google indicate a shift in product strategies among major AI model developers, with a focus on open-source models and incremental updates [1][3][5].
OpenAI
- OpenAI has released two open-source models, gpt-oss-120b with 117 billion parameters and gpt-oss-20b with 21 billion parameters, both utilizing the MoE architecture [2].
- The gpt-oss-120b model can run on a single 80GB GPU, while gpt-oss-20b can operate on consumer devices with 16GB memory, allowing for local deployment on laptops and mobile devices [2].
- OpenAI's CEO, Sam Altman, emphasized the importance of releasing powerful open-source models, which are the result of billions of dollars in research [1][2].
Anthropic
- Anthropic has shifted its strategy to focus on more frequent incremental updates rather than solely major version releases, exemplified by the launch of Claude Opus 4.1 [3].
- Claude Opus 4.1 shows improvements in coding capabilities, scoring 74.5% on the SWE-bench Verified benchmark, surpassing its predecessor [4].
- The new model is designed to handle complex multi-step problems more effectively, positioning it as a more capable AI agent [3][4].
Google
- Google introduced Genie 3, its first world model that supports real-time interaction, building on previous models like Genie 1 and Genie 2 [5].
- Genie 3 can simulate diverse interactive environments and model physical properties, allowing for realistic navigation and interaction within generated worlds [5][6].
- Despite its advancements, Google acknowledges limitations in Genie 3, such as restricted action spaces and challenges in simulating multiple agents in shared environments [9].
Stunning: a world model simulates the real world hyper-realistically for the first time, as Google's Genie 3 stole OpenAI's thunder last night
36Kr · 2025-08-06 03:17
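At 10 p.m. last night, Google DeepMind announced that its Genie world model series has officially reached its third generation.
"Genie 3 is our groundbreaking world model that creates interactive, playable environments from a single text prompt. From photorealistic landscapes to fantasy realms, the possibilities are endless."
Compared with the previous-generation Genie 2 world model, the diffusion-based game generation engine GameNGen, and the video generation model Veo, the new Genie 3 has clear advantages across several features.
| | GameNGen | Genie 2 | Veo | Genie 3 |
| --- | --- | --- | --- | --- |
| Resolution | 320p | 360p | 720p to 4K | 720p |
| Domain | Game-specific | 3D Environments | General | General |
| Control | Game-specific | Limited keyboard / mouse actions | Video-level description* | Navigation; Promptable world events ...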
First time in six years! OpenAI releases two open-weight AI reasoning models, which Altman calls "the best open models in the world"
Mei Ri Jing Ji Xin Wen· 2025-08-05 22:57
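OpenAI has taken an important step toward open models: for the first time in six years, it has released open-weight models.
OpenAI CEO Sam Altman announced on August 5 local time that the company would bring many new things in the coming days, with Tuesday delivering a "small but significant" update: the long-teased open model GPT-OSS.
Both models are released under the permissive Apache 2.0 license, so enterprises do not need to pay or obtain permission before commercial use.
Altman wrote on social media that gpt-oss is a major breakthrough: a state-of-the-art open-weight reasoning model with strong real-world performance comparable to o4-mini, which can run locally on your own computer (or, for the smaller version, on your phone). "We believe this is the best and most usable open model in the world."
In short, OpenAI released two open-weight AI reasoning models on August 5. The more capable gpt-oss-120b, with 117 billion parameters, can run on a single Nvidia professional data-center GPU; gpt-oss-20b, with 21 billion parameters, can run on a consumer laptop with 16GB of memory.
Meanwhile, Amazon announced that it will offer OpenAI's models to its customers for the first time, planning to make OpenAI's new open-weight models available on its Bedrock and SageMaker platforms. This is the first time the cloud computing giant has offered OpenAI products.
gpt-oss-20b and 1 ...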