Physical World Foundation Models
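After Unitree, Zivariable (自变量) is the only company backed by three major tech firms: an embodied model is not DeepSeek stuffed into a robot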
机器之心· 2026-01-14 07:18
Core Viewpoint
- The article discusses the evolution of embodied intelligence and argues that the next battleground will be the robot's "brain", which is crucial for autonomous operation in the physical world [1][4].

Group 1: Investment and Development
- Zivariable recently raised $1 billion in funding from ByteDance and Sequoia, indicating strong investor interest in its approach to robotic intelligence [1].
- Zivariable focuses on developing a foundation model for physical intelligence that operates independently of existing AI models, aiming for a paradigm shift in how robots interact with the physical world [7][12].

Group 2: Challenges in Embodied Intelligence
- Because physical tasks are complex, a robot needs a brain backed by a physical-world foundation model, which is distinct from merely applying existing AI models [1][4].
- Current AI models struggle to understand subtle physical differences that only become apparent through real-world interaction, highlighting the need for a model that can process long sequences of actions and understand causality over time [6][7].

Group 3: Model Development Approach
- Zivariable advocates an end-to-end architecture that allows a holistic understanding of physical interactions, in contrast to the modular approach, which often loses critical details [9][10].
- The company emphasizes the importance of a general-purpose model that can learn the common structures of the physical world, similar to how language models have evolved [11].

Group 4: Unique Characteristics of Zivariable
- Zivariable is committed to in-house research, particularly on foundation models, believing that the next phase of competition in embodied intelligence will revolve around the ability to build data loops and evolve models [15][16].
- The company has developed two core models, WALL-A and WALL-OSS, which integrate various aspects of embodied intelligence and have been deployed in real-world scenarios [16][13].

Group 5: The Path Forward
- Building a physical-world foundation model is likened to retracing the developmental path of human infants, because it involves learning complex physical interactions that are not easily articulated [22].
- Zivariable's journey in this domain is characterized as long and challenging but ultimately rewarding, as the company aims to redefine the capabilities of robots in the physical world [23].
Zivariable's Wang Qian: Embodied intelligence is an independent foundation model for the physical world | MEET2026
具身智能之心· 2025-12-22 01:22
Core Viewpoint
- The article discusses the debate over whether embodied intelligence should be viewed as an application or as an independent foundation model. It asserts that embodied intelligence is a foundation model designed specifically for the physical world, parallel to language and multimodal models [6][12][60].

Group 1: Differences Between Physical and Virtual Worlds
- There is a fundamental difference between the physical world, characterized by randomness and continuous processes, and the virtual world, which is highly reproducible and low in randomness [2][10].
- Existing models based on language and visual modalities are inadequate for accurately representing the complexity and randomness of physical interactions [16][22].

Group 2: Need for a Separate Foundation Model
- A separate foundation model for embodied intelligence is necessary because of the unique characteristics of the physical world, where outcomes are often unpredictable even under identical conditions [10][11].
- Current architectures and training methods struggle to capture the high randomness present in physical events, necessitating a new approach to model design [12][20].

Group 3: Future of Multimodal Models
- Viewing embodied intelligence as an independent foundation model can lead to significant changes in model architecture and data utilization [9][23].
- Learning and perception in the physical world differ fundamentally from those in the virtual world, suggesting that future multimodal models should account for these differences [24][29].

Group 4: Scaling Laws and Data Utilization
- The article emphasizes the importance of scaling laws in the development of large models, particularly in robotics, where data acquisition and utilization are critical [46][51].
- A phased approach to training, using both pre-training and post-training data, is recommended to improve model performance [48][52].

Group 5: Hardware and AI Integration
- Letting AI define the hardware is crucial for the development of embodied intelligence, and the article advocates evolving software and hardware together [53][54].
- The article highlights the potential for embodied intelligence to drive exponential growth in resources and capabilities, suggesting a transformative impact on the future of artificial general intelligence (AGI) [59][60].