Embodied intelligence steps into the Scaling Law era: a 10B+ foundation model and 270,000 hours of real-world data
机器之心·2025-11-05 06:30

Core Viewpoint
- The article covers the breakthrough by the AI robotics startup Generalist with its new embodied foundation model, GEN-0, which is designed for multimodal training on high-fidelity physical interaction data and aims to advance robotic intelligence through scalable data and compute [2][5].

Group 1: GEN-0 Model Features
- GEN-0 is built to capture human-level reflexes and physical common sense, with a parameter count exceeding 10 billion [3][4].
- A core feature of GEN-0 is "Harmonic Reasoning," which lets the model think and act simultaneously and seamlessly, a requirement for real-world physical systems [5].
- The model exhibits strong scaling laws: adding pre-training data and compute predictably improves performance across a wide range of tasks [6][10] (see the fitting sketch after these lists).

Group 2: Data and Training Insights
- Generalist pre-trained GEN-0 on more than 270,000 hours of diverse real-world manipulation data, and the dataset is growing at roughly 10,000 hours per week [23][24].
- The company stresses that data quality and diversity matter more than sheer quantity; different data mixes yield models with distinctly different characteristics [33].
- Scaling experiments showed that smaller models "ossify" while larger models keep improving, underscoring the importance of model capacity for absorbing complex sensorimotor data [10][11].

Group 3: Applications and Future Directions
- GEN-0 has been tested successfully on a variety of robotic platforms, including humanoid robots with differing degrees of freedom [6].
- The company is building the largest and most diverse real-world manipulation dataset to extend GEN-0's capabilities across a wide range of tasks and environments [28].
- Generalist aims to build robust infrastructure to support the large-scale data collection and processing required for training large robotic models [31].
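The scaling-law claim, that downstream performance improves predictably as pre-training data grows, is usually summarized by fitting a saturating power law to loss-versus-data measurements. The sketch below shows that fitting procedure on hypothetical numbers: the hours and loss values, the functional form L(D) = a·D^(-b) + c, and the initial guesses are illustrative assumptions, not figures reported by Generalist or the article.

```python
# Minimal sketch: fit a saturating power law L(D) = a * D**(-b) + c to
# hypothetical (data-hours, validation-loss) pairs. All numbers below are
# illustrative placeholders, not results published by Generalist.
import numpy as np
from scipy.optimize import curve_fit

def power_law(d, a, b, c):
    # d: pre-training data in hours; a, b, c: constants to be fitted
    return a * np.power(d, -b) + c

# Hypothetical measurements: hours of pre-training data vs. downstream loss
hours = np.array([1e3, 5e3, 2e4, 7e4, 2.7e5])
loss = np.array([0.92, 0.71, 0.55, 0.46, 0.40])

params, _ = curve_fit(power_law, hours, loss, p0=[5.0, 0.3, 0.2], maxfev=10000)
a, b, c = params
print(f"fit: L(D) ~ {a:.2f} * D^(-{b:.2f}) + {c:.2f}")

# Extrapolate to a larger (hypothetical) data budget
print(f"predicted loss at 1M hours: {power_law(1e6, *params):.3f}")
```

A fit of this kind is what makes the curve "predictable": it lets one extrapolate how much additional data or compute a target performance level would require, which is why a collection rate such as the 10,000 hours per week mentioned above translates directly into expected model improvement.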