Core Insights
- The AI industry is experiencing a chaotic evolution of "world models," with various interpretations and definitions emerging from leading figures in the field, all agreeing that world models are essential for achieving AGI [2][20][22]
- The concept of world models has expanded significantly, encompassing a wide range of technologies and applications, from embodied intelligence to video generation and 3D modeling [18][20]

Group 1: Definition and Evolution of World Models
- The term "world model" refers to the ability of AI to understand external world rules and predict changes, rather than a specific technical path [3][6]
- The idea of world models dates back to 1943 with Kenneth Craik's "mental models," which posited that the brain constructs miniature models of the external world for prediction [4]
- The modern framework for neural network world models was established by Jürgen Schmidhuber in 2018, defining a structure that includes visual and memory components [4]

Group 2: Different Approaches to World Models
- Current world models can be categorized into two main schools: the Representation school, which focuses on abstract state predictions, and the Generation school, which aims to reconstruct and simulate visual worlds [6][13]
- Yann LeCun represents the Representation school, emphasizing a minimalist approach that predicts abstract states rather than visual details [7][9]
- The Generation school, exemplified by OpenAI's Sora, focuses on creating visual simulations and understanding physical laws through video generation [13][14]

Group 3: Emerging Technologies and Concepts
- Interactive Generative Video (IGV) represents an advanced form of the Generation school, allowing real-time user interaction with generated environments, as seen in Google DeepMind's Genie 3 [14]
- Li Fei-Fei's concept of "Spatial Intelligence" aims to create a persistent, downloadable 3D environment, represented by the Marble project, which focuses on high-precision physical accuracy [16]
- The rise of world models is driven by a collective anxiety in the AI industry regarding the limitations of large language models (LLMs) and a shift towards understanding and simulating the physical world [23][20]
The World Is Too Small for All the World Models