World Models

Kunlun Wanwei officially releases and open-sources the "Matrix-Game 2.0" model
Zheng Quan Shi Bao Wang· 2025-08-12 03:52
Core Insights - Kunlun Wanwei has launched "Matrix-Game 2.0", an upgraded version in its self-developed Matrix series of world models, which it describes as the first open-source solution for real-time, long-sequence interactive generation in general scenarios [1] - The new version emphasizes low-latency, high-frame-rate long-sequence interaction, generating stable, continuous video at 25 FPS across a variety of complex scenes, with generation duration extendable to minutes [1] - "Matrix-Game 2.0" breaks down barriers between content generation and interaction, opening new possibilities for applications in virtual humans, game engines, and embodied intelligence, and providing a technical foundation for building a universal virtual world [1] Industry Impact - The world model is considered the next frontier towards embodied intelligence and advanced spatial reasoning [2] - "Matrix-Game 2.0" is expected to have a transformative impact on training and data generation for embodied intelligence, rapid construction of virtual game worlds, and content production for film and the metaverse [2]
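A conversation with Star Motion Era's Chen Jianyu: world models are one path to VLA, and household robots will take off within the next five years
Tai Mei Ti APP· 2025-08-12 02:00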
Core Insights - The future trend in AI technology is the development of general humanoid robots, which will significantly enhance productivity and social service capabilities [2][4] - The VLA model is a broader concept that encompasses various applications of visual perception, language, and actions in robotics, with the world model being a pathway within this framework [3][4] Company Overview - Star Motion Era was established in August 2023 as a project incubated at Tsinghua University's Institute for Interdisciplinary Information Sciences, focusing on creating general intelligent agents in the physical world [5] - The company has completed three rounds of financing within two years, raising nearly 500 million yuan in Series A funding led by Dinghui VGC and Haier Capital [5] Product Development - Star Motion Era is developing embodied intelligent robots that integrate a general-purpose brain with the robot body, with the VLA model ERA-42 unifying functions like vision, understanding, prediction, and action into an end-to-end model [5][6] - The company has introduced the Star Motion L7, a full-size bipedal humanoid robot, and the Star Motion Q5, designed for service industries, showcasing capabilities in logistics and daily tasks [6] Market Potential - The next five years are anticipated to be a breakthrough period for household robots, with simpler forms entering homes and high-net-worth individuals potentially using more advanced humanoid robots [4][9] - The humanoid robot's ultimate application is expected to be in households, although initial deployments will focus on B2B scenarios to refine the technology and accumulate data [9][10] Industry Insights - Current intelligent robots achieve about 70% of human efficiency, with projections to reach 90% in the coming year, indicating significant advancements in software and hardware [8] - The industry has not yet reached a "bubble" phase, as valuations have not matched those of sectors like smart vehicles, with potential for a capital explosion once leading companies achieve scalable commercial applications [8]
Kunlun Wanwei releases and open-sources the Matrix-Game 2.0 model
Xin Lang Cai Jing· 2025-08-12 01:22
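On August 12, Kunlun Wanwei released and open-sourced "Matrix-Game 2.0", an upgraded version of the Matrix-Game interactive world model in its self-developed Matrix series of world models. According to the company, Matrix-Game 2.0 can generate long videos across scenes, keeps actions and visuals temporally consistent, and accepts continuous instruction input from the user during interaction. ...
CMU's latest: cross-embodiment world models help few-shot robot learning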
具身智能之心· 2025-08-12 00:03
Core Viewpoint - The article discusses a novel approach to training visuomotor policies for robots by leveraging existing low-cost data sources, which significantly reduces the need for expensive real-world data collection [2][11]. Group 1: Methodology - The proposed method rests on two key insights: (1) embodiment-agnostic world model pretraining that uses optical flow as the action representation, so the world model can be trained on cross-embodiment datasets and then fine-tuned with a small amount of target-embodiment data [3][12]; (2) Latent Policy Steering (LPS), which improves policy outputs by searching for better action sequences in the latent space of the world model [3][12]. Group 2: Experimental Results - Real-world experiments showed that combining the policy with a world model pretrained on existing datasets led to significant performance improvements, with 30 demonstrations yielding over 50% relative improvement and 50 demonstrations yielding over 20% relative improvement [3][9]. Group 3: Challenges and Solutions - The article highlights the challenges posed by embodiment gaps when pretraining models across different robots, and emphasizes that world models are better suited to cross-embodiment pretraining followed by fine-tuning for new embodiments [11][12].
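The idea of searching for better action sequences in a world model's latent space can be illustrated with a toy best-of-N search. The sketch below is not the method from the article: the additive dynamics in `world_model_step`, the goal-distance score in `value`, and the random proposals in `policy_sample` are all hypothetical stand-ins chosen only so the example runs with the Python standard library.

```python
import random


def world_model_step(latent, action):
    """Toy latent dynamics (assumption): next latent = current latent + action."""
    return [z + a for z, a in zip(latent, action)]


def value(latent, goal):
    """Toy score (assumption): negative squared distance to a goal latent."""
    return -sum((z - g) ** 2 for z, g in zip(latent, goal))


def policy_sample(latent, horizon, rng, noise=0.5):
    """Stand-in for a learned policy: propose one random action sequence."""
    return [[rng.gauss(0.0, noise) for _ in latent] for _ in range(horizon)]


def rollout_score(latent, actions, goal):
    """Imagine the action sequence inside the world model and score the final latent."""
    for action in actions:
        latent = world_model_step(latent, action)
    return value(latent, goal)


def steer(latent, goal, num_candidates=32, horizon=4, seed=0):
    """Sample candidate plans from the policy and keep the highest-scoring one."""
    rng = random.Random(seed)
    candidates = [policy_sample(latent, horizon, rng) for _ in range(num_candidates)]
    scores = [rollout_score(latent, plan, goal) for plan in candidates]
    best = max(range(num_candidates), key=scores.__getitem__)
    return candidates[best], scores[best], scores


if __name__ == "__main__":
    plan, best_score, scores = steer([0.0, 0.0], [1.0, -1.0])
    print("best score:", best_score, "first candidate score:", scores[0])
```

The point of the sketch is the division of labor: the policy only proposes, while the world model and the score decide which proposal is executed, so a better world model can improve behavior without retraining the policy.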
I had decided to move into embodied AI, but now I'm having second thoughts...
自动驾驶之心· 2025-08-11 12:17
Core Insights - Embodied intelligence is a hot topic this year, transitioning from previous years' silence to last year's frenzy, and now gradually cooling down as the industry realizes that embodied robots are far from being productive [1] Group 1: Industry Trends - The demand for multi-sensor fusion and positioning in robotics is significant, with a focus on SLAM and ROS technologies [3] - Many robotics companies are rapidly developing and have secured considerable funding, indicating a promising future for the sector [3] - Traditional robotics remains the main product line, despite the excitement around embodied intelligence [3] Group 2: Community and Resources - The community has established a closed loop across various fields including industry, academia, and job seeking, aiming to create a valuable exchange platform [4][6] - The community offers access to over 40 technical routes and invites industry leaders for discussions, enhancing learning and networking opportunities [6][20] - Members can freely ask questions regarding job choices or research directions, receiving guidance from experienced professionals [83] Group 3: Educational Content - Comprehensive resources for beginners and advanced learners are available, including technical stacks and learning roadmaps for autonomous driving and robotics [13][16] - The community has compiled a list of notable domestic and international research labs and companies in the autonomous driving and robotics sectors, aiding members in their academic and career pursuits [27][29]
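OpenAI releases its most powerful AI model, GPT-5; Intel CEO sends an all-staff letter responding to calls for his resignation; WeChat employee responds to "changing the phone date can recover expired files" | Q News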
Sou Hu Cai Jing· 2025-08-10 02:43
Group 1: OpenAI and AI Models - OpenAI has officially released its latest AI model, GPT-5, which features intelligent model version switching, lower hallucination rates, enhanced coding capabilities, and personalized settings [1][3] - GPT-5 achieved state-of-the-art scores in key coding benchmarks, scoring 74.9% on SWE-bench Verified and 88% on Aider polyglot, positioning it as a strong coding collaborator [3] - The model excels in front-end coding tasks, outperforming previous versions in 70% of internal tests [3] Group 2: Intel and CEO Response - Intel CEO Lip-Bu Tan addressed employees in a letter, clarifying misconceptions and indicating he will not resign, emphasizing his commitment to the company's future goals and investments [4][5] - Intel has a 56-year history of semiconductor production in the U.S. and plans to invest billions in semiconductor R&D and manufacturing, including a new fab in Arizona [4] Group 3: Microsoft Layoffs - Microsoft has initiated a new round of layoffs in Washington state, reducing approximately 40 positions, bringing the total layoffs in the state to 3,160 this year [6] - The layoffs are part of a broader plan to cut over 15,000 jobs globally, with the latest round being relatively small compared to previous months [6] Group 4: ByteDance Recruitment - ByteDance has launched its 2026 campus recruitment, offering over 5,000 positions, a significant increase from the previous year's 4,000+ offers [10] - The recruitment focuses on various roles, with a 23% increase in R&D positions, particularly in algorithms and front-end development [10] Group 5: Gaming and Service Outages - Multiple games under NetEase experienced login issues, leading to a significant outage that lasted over 2 hours, attributed to internal server problems [8][9] - The outage affected several popular titles, causing widespread player frustration and highlighting the challenges in troubleshooting large-scale service disruptions [8][9] Group 6: AI Developments - OpenAI released two open-weight AI models, gpt-oss-120b and gpt-oss-20b, which can mimic human reasoning and perform complex tasks, although they are not fully open-source [13] - Google DeepMind introduced Genie 3, a universal world model capable of generating interactive 3D environments in real-time, marking a significant advancement in world modeling technology [14][15]
Tencent doubles down on spatial-intelligence large models, and the field is becoming the next big trend
首席商业评论· 2025-08-09 04:17
Core Viewpoint - Tencent's Hunyuan 3D model represents a significant advancement in the creation of immersive 3D environments, allowing users to generate complete scenes from text or images, thus democratizing access to 3D content creation [3][4][5]. Group 1: Hunyuan 3D Model Features - The Hunyuan 3D World Model 1.0 supports 360° immersive roaming, asset export in standard mesh format, and editing with mainstream modeling software, marking a leap from "AI can draw" to "humans can use" [3][7]. - The model has surpassed state-of-the-art (SOTA) open-source models in quality across various evaluation dimensions, including texture detail and aesthetic quality [7]. - Tencent plans to release a series of open-source initiatives, including multimodal understanding models and game vision models, to create a comprehensive ecosystem for 3D AIGC creation [7][9]. Group 2: User Experience and Accessibility - Users can generate a 360-degree immersive scene based on simple text descriptions or images, enabling the creation of complex environments with dynamic elements [8]. - The model allows for the construction of "walkable" scene maps, enhancing interactivity and user experience compared to previous models that lacked spatial continuity [8][9]. - The hybrid approach of combining 2D and 3D elements in scene generation addresses the limitations of purely 3D or 2D models, providing a more stable and diverse creative output [8]. Group 3: Impact on Game Development - The Hunyuan 3D model revolutionizes game development by significantly reducing the time required to create high-quality scene prototypes, thus shortening development cycles and lowering trial-and-error costs [9]. - It lowers the barrier for 3D enthusiasts and content creators, allowing them to create virtual worlds without needing advanced modeling skills [9]. 
Group 4: Future of Spatial Intelligence - The development of spatial intelligence models, like the Hunyuan 3D model, is seen as a precursor to more complex world models that incorporate physical and causal reasoning [11][12]. - The concept of world models is gaining traction as a critical breakthrough in AI, enabling machines to understand and simulate complex physical environments [11][12][14]. - Major tech companies, including Google and Nvidia, are investing in world models, indicating a competitive landscape focused on advancing spatial intelligence capabilities [14][22]. Group 5: Tencent's Strategic Position - Tencent's capital expenditure for AI initiatives reached 76.7 billion yuan in 2024, a 221% increase year-on-year, reflecting its commitment to AI development [24]. - The company has established a comprehensive model system, with its Hunyuan models ranking among the top globally, showcasing its competitive edge in the AI landscape [24][27]. - Tencent aims to create a supportive infrastructure for small developers, emphasizing collaboration and ecosystem building rather than monopolistic practices [24][27].
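A conversation with Qianxun Intelligent's Gao Yang: scientists starting companies is not very "reliable", but entrepreneurship is like a game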
36氪· 2025-08-08 09:28
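In embodied-intelligence startups, be Apple, not Android. By Qiu Xiaofen | Editor: Su Jianxun | Source: Intelligent Emergence (ID: AIEmergence), an account under 36Kr. Whether at the just-concluded WAIC (World Artificial Intelligence Conference) or at WRC (World Robot Conference), which opens this week, how can you judge a robot's real capability on the show floor? Gao Yang, co-founder of the embodied-intelligence company Qianxun Intelligent, offers a few tips. For a robot that claims it can fold clothes, try crumpling a garment into a ball and tossing it casually onto the table, then watch whether the robot can still complete the task; or hand it trousers and a jacket and see whether it generalizes across categories. While the robot is operating, watch whether its movements are smooth and fluid rather than stuttering, which reflects how well its reasoning and motion are coordinated... Gao Yang is one of the most sought-after founders in embodied intelligence today. After earning his PhD at the University of California, Berkeley, he chose to return to China as an assistant professor at Tsinghua University's Institute for Interdisciplinary Information Sciences. In 2023, he co-founded Qianxun Intelligent with Han Fengtao, former CTO of ROKAE. Han Fengtao has deep hardware experience and has overseen the mass production and shipment of tens of thousands of robots, while Gao Yang brings a foundation in AI research. This pairing of academia and industry has made Qianxun ...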
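A conversation with Qianxun Intelligent's Gao Yang: scientists starting companies is not very "reliable", but entrepreneurship is like a game
36 Ke· 2025-08-08 01:49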
Core Viewpoint - The article discusses the emergence of embodied intelligence in robotics, emphasizing the importance of creating integrated hardware and software solutions, akin to Apple's approach, rather than a fragmented one like Android's [5][6]. Group 1: Company Overview - Qianxun Intelligent, co-founded by Gao Yang and Han Fengtao, has raised over 1 billion RMB in funding within 19 months, with investors including Huawei Hubble, JD.com, and CATL [4]. - Gao Yang, a former assistant professor at Tsinghua University, transitioned from academia to entrepreneurship, highlighting the challenges and learning experiences in this shift [5][12]. Group 2: Market Insights - The robotics market is currently competitive, with established companies focusing on hardware while neglecting the software aspect, which Gao Yang believes is crucial for long-term success [9]. - The potential for embodied intelligence is seen as inevitable, driven by advancements in AI technologies like ChatGPT, which have shifted perceptions about the capabilities of AI [8]. Group 3: Technical Perspectives - The integration of hardware and software is deemed essential in the early stages of robotics development, as seen in historical examples like IBM's approach to personal computers [6][7]. - Gao Yang emphasizes the importance of algorithms and data in evaluating the performance of robotic systems, noting that models must be capable of handling complex tasks rather than just simple ones [28][29]. Group 4: Future Outlook - The anticipated development of robots capable of performing complex tasks, referred to as Robot GPT-3.5, is expected to significantly enhance their functionality in everyday scenarios [32]. - The article suggests that the current focus on large-scale data collection in robotics may not be as valuable due to the rapid evolution of robot forms, indicating a need for more effective pre-training methods [41][42].
When AI "sees" the world, the future of business is being thoroughly reshaped | Liang Shuo
Di Yi Cai Jing Zi Xun· 2025-08-07 11:15
Group 1: AI Impact on Employment - AI is predicted to take over creative tasks, with experts suggesting that jobs such as delivery personnel, cashiers, financial analysts, and scriptwriters may be restructured by AI within the next five years [5][7] - Those who do not understand or utilize AI are likely to be the first to be eliminated from the workforce [5] Group 2: AI and the Beidou System - The Beidou satellite system, when empowered by AI, is expected to enhance its capabilities beyond navigation to include precise disaster response and urban traffic optimization [7] - AI will facilitate a high-precision, low-latency economic evolution through the integration of the Beidou system into various industries [7] Group 3: World Models and AI - The concept of "world models" is identified as a key direction for the next generation of AI, enabling machines to perceive space, reason relationships, and execute tasks [9] - This shift represents a transition from digital to physical environments, opening new opportunities for AI entrepreneurs [9] Group 4: Revolution in Content Industry - AI-generated content (AIGC) is set to revolutionize the content industry, with AI-assisted video production significantly increasing efficiency compared to traditional methods [11] - The content industry, valued at three trillion dollars, is expected to undergo a major transformation due to AIGC [11] Group 5: AI Governance - The ultimate challenge for AI is governance, ensuring that AI systems do not seek to dominate the world [13] - There is a growing consensus on the need for global participation in AI governance, with China beginning to assert its voice in international discussions [13][14]