Core Insights
- The event showcased the Zhiyuan Research Institute's latest advances in embodied intelligence, with a focus on the importance of world models and the development of a comprehensive embodied brain system [2][3]

Group 1: Zhiyuan's Full-Stack Layout
- Zhiyuan introduced Emu3.5, a native multimodal world model. Its training data grew from 15 years of video to 790 years, its parameter count rose from 8 billion to 34 billion, and its video and image generation became faster [5]
- The institute is building an embodied intelligence system that spans heterogeneous robot bodies. It includes RoboBrain, RoboOS, and RoboBrain-0, and is deployed across various robot forms for tasks ranging from navigation to complex interaction [5]

Group 2: Key Elements of Embodied Intelligence
- The role of world models in embodied intelligence was debated. Experts emphasized that a world model should predict the next state given the robot's form and goals, rather than merely generate video [7][10]
- There was consensus that embodied intelligence should not follow the current language-first paradigm and should instead adopt a structure centered on action and perception [10][12]
- The importance of real data was highlighted, with discussion of the need to combine real, simulated, and video data for robots to learn effectively [15][17]

Group 3: Investment Priorities
- When asked how they would allocate 10 billion, experts named talent, computing power, and data engines as the key investment areas [19][21]
- Views differed on the relative importance of infrastructure and model development, with some advocating a comprehensive data engine for continuous digitalization [21][22]

Group 4: Humanoid Robots and Hardware Limitations
- The debate on whether humanoid robots are the ultimate form of embodied intelligence concluded that neither the model nor the hardware defines the other; the specific application scenario dictates the requirements [22][24]
- Experts suggested a layered structure for embodied intelligence, in which higher-level models are reused across different robot forms while lower-level models are tailored to specific hardware [23][24]

Conclusion
- The discussions signaled an active search for ways to close the loop in embodied intelligence, and emphasized that models, hardware, and scaling need to evolve together [24]
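Even 10 billion is not enough to burn! Robotics company CEOs deliver a new verdict: embodied intelligence can no longer simply copy LLMs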