World Models
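Built for Robots! NVIDIA's Ultimate Brain Opens a New Era of Embodied Intelligence
机器人大讲堂· 2025-12-01 01:30
"We are moving from perceptual intelligence into a new era of action intelligence." This is the next milestone of the robotics era as foreseen by Professor Fei-Fei Li, co-director of Stanford HAI and a pioneer of embodied intelligence. In her view, the next challenge for robots is not how to see more accurately but how to make the right decisions and take the right actions based on what they see, and that requires a new, general-purpose AI capability framework.
For decades, robots were confined to fixed work zones, performing precise but narrow repetitive tasks. Today, driven by breakthroughs in AI technologies led by large models and by new ideas such as embodied intelligence, the global robotics industry is approaching a historic "singularity": a paradigm shift from special-purpose to general-purpose machines.
People have begun to hope that robots will no longer be tools custom-built for a particular production line, but general-purpose agents that can adapt to complex, unstructured environments and carry out many kinds of tasks, that is, General-Purpose Robots.
▍ How can robots reach the "general-purpose" tipping point faster?
To realize this ambitious "general-purpose" vision, the industry places unprecedentedly strict demands on the underlying technologies, and four technical pillars may all be indispensable.
Training a general-purpose robot "brain" that can understand an ever-changing physical world requires processing far more vision, language, and action data than before. This requires ...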
CICC Robotics Podcast #6 | Zhu Zheng: Routes and Frontiers of "World Models"
中金点睛· 2025-11-30 23:49
Core Viewpoint
- The podcast explores the development and application of world models in robotics, emphasizing their significance in embodied intelligence and autonomous driving [6].
Summary by Sections
- World Models: World models are essential for understanding and simulating environments, which is crucial for the advancement of robotics [6].
- Applications in Embodied Intelligence: The application of world models in embodied intelligence is discussed, highlighting their role in enhancing robot capabilities [6].
- Challenges in Application: Various challenges associated with the implementation of world models are identified, indicating the complexities involved in their practical use [6].
- Differences in Applications: The podcast differentiates between the applications of world models in embodied intelligence and autonomous driving, noting the unique requirements of each field [6].
- Evolution of Simulation: The evolution of simulation techniques from 1.0 to 2.0 is explained, showcasing advancements in how world models are utilized [6].
- Understanding Robot World Models: Insights into how to comprehend the world models used in robotics are provided, emphasizing their foundational role in robot functionality [6].
- Data Sources and Limitations: The sources of data for world models and their capability boundaries are discussed, underlining the importance of accurate data in model effectiveness [6].
- Future Development Trends: Future trends in the development of world models are anticipated, suggesting potential advancements and innovations in the field [6].
- Ensuring Physical Consistency: The importance of ensuring physical consistency in world models is highlighted, which is critical for their reliability in real-world applications [6].
- Technological Projections for 2030: Projections regarding technological advancements by 2030 are made, indicating the expected growth and evolution of robotics and world models [6].
Beijing AI Industry White Paper: All Kinds of AI Agents Are Set for Explosive Growth
Xin Jing Bao· 2025-11-29 07:55
Core Insights
- The Beijing Artificial Intelligence Industry White Paper (2025) predicts explosive growth in various AI agents capable of serving as personal assistants, automating enterprise processes, and acting as scientific research assistants [1][3]
- The development of embodied intelligence will enable a transition from information processing to physical tasks [3]
Industry Overview
- Beijing has registered 183 large models, maintaining its position as the national leader [2]
- The AI core industry in Beijing is projected to reach a scale of 215.22 billion yuan in the first half of 2025, reflecting a year-on-year growth of 25.3% [2]
- The total industry scale is expected to exceed 450 billion yuan by the end of 2025, with over 2,500 AI companies operating in the region [2]
Technological Advancements
- Various innovative entities in Beijing are producing leading-edge results, including the launch of FlagOS by the Beijing Academy of Artificial Intelligence (BAAI) and the introduction of "Tongtong 2.0" by the Beijing Institute for General Artificial Intelligence (BIGAI) [3]
- The launch of the Bohr Research Space Station established the world's first AI research platform covering literature review, computation, experimentation, and multidisciplinary collaboration [3]
Future Trends
- The white paper outlines future trends in the AI industry, indicating that AI agents will experience significant growth and that embodied intelligence will bridge the gap between information processing and physical operations [3]
- The development of world models is expected to enhance the generalization capabilities and reliability of AI systems [3]
- The "AI for Science" initiative is anticipated to accelerate scientific discovery and lead to breakthroughs across various fields [3]
Are World Models Approaching Their Own "ChatGPT Moment"?
机器之心· 2025-11-29 01:49
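Report by 机器之心, 机器之心 editorial team. World models are the startup direction that top scholars such as Fei-Fei Li have committed to. Are they the next stop for AI?
"AI is the only field of technology in human history that truly deserves the phrase 'changing with each passing day,'" said Zhang Qunying, editor-in-chief of the 茶思屋 technology website, opening the roundtable at the NeurIPS 2025 paper-sharing event recently hosted by 机器之心. The remark resonated with the experts present.
Moderated by the editor-in-chief of 黄大年茶思屋, the discussion brought together experts from the Institute of Automation of the Chinese Academy of Sciences, Nanjing University, the Beijing Institute for General Artificial Intelligence, 极佳科技, and other institutions, and went straight at the hottest direction in AI today: world models. Recently, from Google's release of Genie 3 to Fei-Fei Li's long-form essay, concepts such as world models and spatial intelligence have become a new focus.
Over more than forty minutes, the experts discussed the definition of world models, directions for data and architecture, disagreements over technical routes, and commercialization prospects. They agreed on some topics but held clearly different views on many important directions. In this fast-moving emerging field, much remains to be explored and validated, in both the technology and the evaluation criteria.
First, what exactly is a world model?
The panelists each offered a definition from a different angle. Zhu Zheng, co-founder and chief scientist of 极佳科技, argued that a world model is essentially a predictive model: "Given the current state and a sequence of actions, predict the next state." He pointed out ...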
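The "given the current state and a sequence of actions, predict the next state" definition quoted above can be illustrated with a small sketch. Everything here is an assumption made for illustration: the toy state `(position, velocity)`, the action as an acceleration, and the hand-written `toy_dynamics` function standing in for what would be a learned model in practice. It is not any panelist's or company's actual system.

```python
from typing import Callable, List, Sequence, Tuple

State = Tuple[float, float]  # hypothetical toy state: (position, velocity)
Action = float               # hypothetical toy action: an acceleration


def toy_dynamics(state: State, action: Action, dt: float = 0.1) -> State:
    """Stand-in for a learned predictor: map (state, action) to the next state."""
    position, velocity = state
    new_velocity = velocity + action * dt
    new_position = position + new_velocity * dt
    return (new_position, new_velocity)


def rollout(
    model: Callable[[State, Action], State],
    state: State,
    actions: Sequence[Action],
) -> List[State]:
    """Apply the one-step predictor repeatedly over an action sequence."""
    trajectory = [state]
    for action in actions:
        state = model(state, action)  # predicted next state becomes the new input
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    for step, predicted in enumerate(rollout(toy_dynamics, (0.0, 0.0), [1.0, 1.0])):
        print(step, predicted)
```

Because each prediction is fed back in as the next input, errors in a learned one-step model would compound over a long rollout, which is one reason physical consistency comes up repeatedly in discussions of world models.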
Bezos and Yann LeCun Both Return to Launch Startups: AI's Golden Decade or the Eve of a Bubble?
Sou Hu Cai Jing· 2025-11-28 15:03
Core Insights
- The return of Jeff Bezos and Yann LeCun to the AI sector marks a significant shift in the industry, with their contrasting approaches aiming to address the real bottlenecks in AI technology and its application in creating tangible value [1][3][4]
Group 1: Major Players Re-entering the AI Arena
- Jeff Bezos has taken on a leadership role in the AI startup "Project Prometheus," securing $6.2 billion in funding, making it one of the best-funded early-stage AI companies globally [4][5]
- Yann LeCun, a Turing Award winner, is establishing a new company focused on Advanced Machine Intelligence (AMI), with Meta as a strategic partner, emphasizing foundational research over immediate commercialization [5][6]
Group 2: Diverging Paths in AI Development
- Bezos's "physical AI" approach targets the optimization of engineering and manufacturing in sectors like hardware, automotive, and aerospace, aiming to reduce production cycles significantly [5][7]
- LeCun's focus on AMI seeks to address the fundamental challenges of AI, such as understanding the physical world and developing reasoning capabilities, which he believes are essential for the next AI revolution [8][9]
Group 3: Capital and Talent Dynamics
- The influx of capital into the AI sector is accelerating, with 57% of new unicorns being AI companies, and the funding environment becoming increasingly competitive [10][11]
- Talent acquisition has intensified, with companies offering substantial compensation packages to attract top AI researchers, further reshaping the competitive landscape [11][12]
Group 4: Industry Trends and Future Outlook
- The AI industry is transitioning from a phase of technological explosion to one of deep industry engagement, characterized by a focus on foundational innovation and vertical integration [9][10]
- The potential for AI to drive significant advancements in manufacturing and healthcare is evident, with applications already demonstrating substantial efficiency gains and cost reductions [13][14]
Group 5: Balancing Opportunities and Risks
- While the enthusiasm for AI's potential is high, concerns about a possible bubble due to overvaluation and a lack of sustainable business models are emerging [14][16]
- The industry's future will depend on maintaining a balance between innovation quality and commercial viability, as well as navigating regulatory uncertainties [14][16]
A Post-90s Chinese Scientist: The Power Game Behind an Annual Salary of Over $100 Million
创业邦· 2025-11-28 10:14
Core Insights
- The departure of Yann LeCun, a Turing Award winner and AI pioneer, from Meta marks a significant shift in the company's AI strategy towards a more pragmatic, product-oriented approach [5][6][27]
- The recruitment of Shengjia Zhao, a former key developer at OpenAI, highlights the intense competition for AI talent in Silicon Valley and reflects a deeper power struggle within Meta [6][17][30]
Group 1: Key Events
- Yann LeCun announced his departure from Meta after 12 years, indicating a shift from long-term idealism to practical application in AI [5][6]
- Shengjia Zhao joined Meta with a reported annual salary exceeding $100 million, showcasing the aggressive talent acquisition strategies employed by tech giants [6][10][20]
- Zhao's rapid rise within Meta, including his appointment as Chief Scientist of the newly formed Meta Super Intelligence Lab (MSL), underscores the company's urgent need to enhance its AI capabilities [19][20][30]
Group 2: Internal Dynamics
- Meta's internal turmoil is evident as Zhao faced management chaos and cultural clashes shortly after joining, leading him to consider returning to OpenAI [19][21]
- The establishment of MSL and Zhao's leadership role have exacerbated existing tensions between new and old factions within Meta, as evidenced by the departure of other top researchers [22][25]
- The marginalization of the FAIR lab, previously led by LeCun, reflects a broader shift in Meta's AI focus, moving away from academic ideals towards commercial viability [26][27]
Group 3: Future Implications
- The challenges faced by Zhao in navigating Meta's bureaucratic environment while striving to advance AI technology signal a critical juncture for the company [30]
- The competition for AI talent and the strategic shifts within Meta may influence the broader AI industry, as companies seek to balance idealism with practical outcomes [30]
Looking Ahead to 2026: What Innovation Opportunities Does the AI Industry Hold?
36Kr· 2025-11-28 08:37
Core Insights
- The AI industry is entering a rapid change cycle, with 2025 being a pivotal year for the development of large models, particularly with the emergence of DeepSeek, which is reshaping the global landscape and promoting open-source initiatives [1][10][18]
- The dual-core driving force of AI development is characterized by the United States and China, each following distinct paths, with key technologies accelerating towards engineering applications [1][10][11]
- Despite advancements in model capabilities, challenges in real-world application remain prevalent, indicating a shift in focus from "large models" to "AI+" [1][10][19]
Group 1: Global Large Model Landscape
- The global large model development is driven by a dual-core approach, with the U.S. leading in closed-source models and China focusing on open-source models [10][11][13]
- OpenAI, Anthropic, and Google represent the leading trio in the large model arena, each adopting differentiated strategic paths [17]
- DeepSeek's emergence marks a significant breakthrough for China's large model development, showcasing the potential of open-source models [18][19]
Group 2: Key Technological Evolution
- The evolution of large models is marked by four major technological trends: native multimodal integration, reasoning capabilities, long context memory, and agentic AI [22][24]
- Native multimodal architectures are replacing text-centric models, allowing for seamless integration of various modalities [23]
- Reasoning capabilities are becoming a core feature of advanced models, enabling them to demonstrate their thought processes [24][26]
Group 3: Industry Chain and Infrastructure
- The AI infrastructure is still dominated by Nvidia, with a slow transition towards a multi-polar ecosystem despite the emergence of alternatives like Google’s TPU and AMD’s chips [47][48]
- The AI industry is shifting from reliance on a few cloud providers to a more collaborative funding model, with Nvidia and OpenAI acting as dual cores driving the ecosystem [51][52]
Group 4: Application Layer Opportunities
- Large model companies are positioning themselves as "super assistants" while also aiming to control user entry points through various products and services [53][54]
- Independent application companies can find opportunities in vertical markets that require deep industry understanding and complex workflow integration [55][56]
- The evolution of AI applications is moving towards intelligent agents capable of autonomous operation, indicating a significant shift in application development paradigms [61][62]
Turing Award Winner Yann LeCun: Large Models Are a "Dead End"; Which Path Is the Next Bet?
36Kr· 2025-11-28 01:43
Core Insights
- Yann LeCun, a Turing Award winner, announced his departure from Meta to establish a new company focused on Advanced Machine Intelligence (AMI), marking a significant shift in his career and the AI landscape [1][2]
- LeCun criticizes large language models (LLMs), labeling them as a "dead end" for achieving human-like intelligence, emphasizing their lack of real-world understanding and limitations in reasoning and action [3][4]
Group 1: Critique of Large Language Models
- LeCun argues that while LLMs perform well in language tasks, they do not possess true understanding of the world, lacking common sense and causal reasoning [5][6]
- He highlights that the performance of LLMs is reaching a saturation point, where increasing model size does not equate to enhanced intelligence [6][7]
- The training data and computational costs are approaching their limits, leading to diminishing returns in understanding [7][8]
- LLMs are described as being unable to plan or take action effectively, with LeCun providing examples of how human-like intelligence involves more than just language skills [12][13]
Group 2: The Concept of World Models
- LeCun proposes that the next generation of AI should focus on building "world models" that allow AI to understand and interact with the physical world [14][15]
- He introduces the Joint Embedding Predictive Architecture (JEPA) as a new learning paradigm that contrasts with LLMs by enabling AI to learn from multi-modal inputs and develop an internal representation of the world [16][17]
- JEPA emphasizes the importance of action and planning, moving beyond mere language processing to a more holistic understanding of the environment [18][19]
Group 3: Diverging Paths in AI Development
- Both LeCun and former OpenAI chief scientist Ilya Sutskever are questioning the current trajectory of AI, but they propose different solutions: LeCun focuses on world models, while Sutskever emphasizes safety and control in AI systems [25][26]
- The industry is witnessing a shift towards new architectures and approaches, as evidenced by significant investments and developments in embodied intelligence and robotics [34][35]
- The future of AI is seen as a marathon rather than a sprint, with both LeCun and Sutskever acknowledging that their proposed directions will take years to mature [38][40]
Group 4: Implications for Entrepreneurs and Developers
- LeCun's transition signals that larger models do not necessarily equate to better intelligence, highlighting the need for architectural innovation [41]
- There are opportunities in vertical applications, particularly in fields requiring physical interaction, such as robotics and autonomous driving [42]
- The importance of open-source development is emphasized, as LeCun's new company will continue to support this approach, allowing smaller teams to contribute to new paradigms [43]
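The summary above describes JEPA only at a high level. As a rough illustration of the general idea behind a joint-embedding predictive setup (encode a context input and a target input, then predict the target's embedding from the context's embedding and measure the error in embedding space rather than in the raw input space), here is a minimal sketch. The linear "encoders", the `linear` helper, and all weight values are hypothetical placeholders, not LeCun's architecture; training and the mechanisms that prevent representation collapse are deliberately omitted.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]


def linear(weights: Matrix, x: Vector) -> Vector:
    """Matrix-vector product, used here as a stand-in for a neural network."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def embedding_prediction_loss(
    context: Vector,
    target: Vector,
    context_encoder: Matrix,
    target_encoder: Matrix,
    predictor: Matrix,
) -> float:
    """Mean squared error between predicted and actual target embeddings."""
    s_context = linear(context_encoder, context)  # embed the observed part
    s_target = linear(target_encoder, target)     # embed the part to be predicted
    s_predicted = linear(predictor, s_context)    # predict in embedding space
    squared = [(p - t) ** 2 for p, t in zip(s_predicted, s_target)]
    return sum(squared) / len(squared)


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(embedding_prediction_loss([1.0, 0.0], [0.0, 1.0], identity, identity, identity))
```

The point of the sketch is only where the loss is computed: the model is never asked to reconstruct the raw target, just its representation.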
Li Auto Discloses Some New Technical Information
自动驾驶之心· 2025-11-28 00:49
Core Insights
- The article discusses the advancements and challenges faced by Li Auto in the development of its autonomous driving technology, particularly focusing on the end-to-end model and VLA (Vision-Language-Action) integration [2][5][9].
Group 1: Model Performance and Data Utilization
- The performance improvement of end-to-end models slows down after reaching a certain amount of training data, specifically after 10 million clips, where the model's MPI (Miles Per Intervention) only doubled in five months [5].
- To enhance model performance, Li Auto adjusted the training data mix, increasing the quantity of generated data, including corner cases, and implementing manual rules for safety and compliance in special scenarios [5][9].
Group 2: VLA Integration and Decision-Making
- The introduction of VLA aims to enhance the decision-making capabilities of the end-to-end model, addressing issues such as illogical behavior, lack of deep thinking in decision-making, and insufficient preventive judgment based on scenarios [5][6].
- VLA incorporates spatial intelligence, linguistic intelligence, and action policy, allowing the model to understand and communicate spatial information effectively, and generate smooth driving trajectories using diffusion models [6][9].
Group 3: Simulation and Testing Efficiency
- Li Auto upgraded its model evaluation methods by utilizing a world model for closed-loop simulation and testing, significantly reducing testing costs from 18.4 per kilometer to 0.53 per kilometer [9][11].
- The closed-loop training framework AD-R1 was introduced, allowing for efficient data management and reinforcement learning, with high-value data being processed through a series of steps back to the cloud platform [11][12].
Group 4: Computational Power and Resources
- Li Auto's total computational power is 13 EFLOPS, with 3 EFLOPS dedicated to inference and 10 EFLOPS for training, utilizing 50,000 training and inference cards [13].
- The emphasis on inference power is crucial in the VLA era, as it is necessary for generating simulation training environments [13].
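To put the figures quoted above in proportion, the short calculation below works out the relative reduction in per-kilometer testing cost and the share of compute devoted to inference. It uses only the numbers stated in the summary; the currency unit of the cost figures is not given in the source, so the values are treated as unitless, and the helper names are made up for this sketch.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fraction by which a quantity fell, e.g. 0.25 means a 25% reduction."""
    return (before - after) / before


def share(part: float, total: float) -> float:
    """Fraction of the total accounted for by one part."""
    return part / total


# Figures as stated in the summary (cost unit not specified in the source).
cost_before_per_km = 18.4
cost_after_per_km = 0.53
inference_eflops = 3.0
training_eflops = 10.0
total_eflops = 13.0

if __name__ == "__main__":
    print(f"cost reduction: {relative_reduction(cost_before_per_km, cost_after_per_km):.1%}")
    print(f"cost ratio: {cost_before_per_km / cost_after_per_km:.1f}x")
    print(f"inference share of compute: {share(inference_eflops, total_eflops):.1%}")
```

On these numbers the testing cost falls by roughly 97% (about a 35-fold reduction), and inference accounts for a little under a quarter of the stated 13 EFLOPS.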
From Game Factory to Spatial-Intelligence Simulation: Why Hunyuan 3D Is Tencent AI's "Flanking Breakout"
AI前线· 2025-11-27 04:02
Core Insights
- Tencent's "Hunyuan 3D" has accelerated its global outreach by launching an international version of its creative engine and achieving over 3 million downloads of its open-source model, marking a significant step in its AI strategy [2][3][21]
- Tencent's unique position as a technology company lies in its combination of massive 3D demand from various sectors, mature multi-modal capabilities of its Hunyuan model, and a comprehensive distribution network through WeChat, QQ, and Tencent Cloud [3][4]
Group 1: Business and Technology Integration
- The traditional 3D industry faces challenges of high costs and long production times, with art costs in game development often accounting for 50%-80% of total expenses, and 3D asset creation being the most resource-intensive [6][7]
- Hunyuan 3D aims to address these issues by enhancing the efficiency of 3D asset production and solving scene-level construction problems through two main technical lines [8][9]
- The integration of Hunyuan 3D into Tencent's internal game projects has shown promising results, significantly reducing the time required to create 3D assets from days to mere hours [12][14]
Group 2: Market Applications and Expansion
- Hunyuan 3D's applications extend beyond gaming, with over 150 companies across various industries, including e-commerce, film, advertising, and 3D printing, utilizing its models to enhance production efficiency [25][27]
- The technology has enabled a shift in consumer 3D printing, allowing users to generate personalized models with minimal expertise, thus expanding the market [26]
- In advertising and content creation, Hunyuan 3D is poised to transform how brands engage with consumers by moving from static displays to interactive experiences [27][29]
Group 3: Strategic Vision and Competitive Edge
- Tencent's AI strategy focuses on building ecological barriers rather than merely scaling operations, emphasizing quality, controllability, and cost-effectiveness as foundational capabilities [31][32]
- The company has achieved recognition for its Hunyuan image model, which topped global rankings, indicating its leadership in multi-modal technology [31]
- Tencent's approach to 3D generation is characterized by a commitment to understanding industry pain points and fostering an ecosystem that supports sustainable growth [39][40]