Core Insights
- The article discusses the rising prominence of embodied robots, highlighting their potential to transform industries and the ongoing technical advances in the field [2][3][36].

Industry Overview
- The embodied robotics sector is attracting significant interest from both large and small companies, with capital investment and media attention driving the narrative [2].
- The industry holds two perspectives: one celebrates the visually impressive robot demonstrations, while the other asks how well robots perform practical tasks in real-world settings [3].

Technological Innovations
- Xiaomi's embodied VLA model, Xiaomi-Robotics-0, addresses the problem of robots frequently pausing during tasks, achieving an 80 ms inference latency and a 30 Hz real-time control frequency on consumer-grade hardware [4][14].
- The model incorporates three core innovations: a dual-brain architecture, a two-stage pre-training strategy, and an improved asynchronous mechanism [6][22][36].

Dual-Brain Architecture
- The architecture splits the model into a "brain" (VLM) for understanding and decision-making and a "small brain" (DiT) for generating continuous action blocks, which improves the precision and fluidity of movements [8][10][14].

Two-Stage Pre-Training
- The pre-training process lets the model retain its visual understanding capabilities while it learns to execute actions, preventing a decline in its cognitive abilities [16][19].

Improved Asynchronous Mechanism
- A Lambda-shaped attention mechanism allows the model to keep action transitions smooth while adapting to real-time visual feedback, addressing the problem of action inertia [22][24].

Performance Metrics
- Xiaomi-Robotics-0 has demonstrated superior performance in various benchmarks, surpassing approximately 30 existing models in environments such as LIBERO and CALVIN [27][28].
- The model achieved a 100% success rate on the LIBERO-Object task and maintained high throughput in real-world tasks such as towel folding and LEGO disassembly [34][30].

Strategic Direction
- Xiaomi's approach to embodied robotics appears to focus on practical applications, aiming to address complex industrial challenges rather than merely showcasing advanced hardware capabilities [36][39].
- The company has also committed to open-sourcing its technologies, giving developers broader access and fostering innovation within the industry [5][41][42].
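
To make the relationship between the 80 ms latency and the 30 Hz control rate concrete, the following is a minimal Python sketch of the timing arithmetic. It is not Xiaomi's implementation: the function names and the chunk length of 10 actions are hypothetical, chosen only for illustration. It assumes that, in a synchronous loop, the robot waits for each inference to finish before moving, whereas an asynchronous loop overlaps inference with execution of the current action block.

```python
import math

CONTROL_HZ = 30     # control frequency reported in the summary
INFERENCE_MS = 80   # inference latency reported in the summary
CHUNK_LEN = 10      # hypothetical actions per block (not from the source)


def latency_in_ticks(inference_ms: int, control_hz: int) -> int:
    """Control ticks that pass while one inference call runs (rounded up)."""
    return math.ceil(inference_ms * control_hz / 1000)


def idle_ticks_between_chunks(chunk_len: int, latency_ticks: int,
                              asynchronous: bool) -> int:
    """Ticks at each block boundary during which the robot has no action."""
    if not asynchronous:
        # Synchronous: inference starts only after the block is exhausted,
        # so the robot pauses for the full inference latency.
        return latency_ticks
    # Asynchronous: inference overlaps with execution of the current block,
    # so the robot waits only if the block is shorter than the latency.
    return max(0, latency_ticks - chunk_len)


if __name__ == "__main__":
    ticks = latency_in_ticks(INFERENCE_MS, CONTROL_HZ)
    print(f"control period: {1000 / CONTROL_HZ:.1f} ms")
    print(f"one inference spans {ticks} control ticks")
    print("sync idle ticks per block: ",
          idle_ticks_between_chunks(CHUNK_LEN, ticks, asynchronous=False))
    print("async idle ticks per block:",
          idle_ticks_between_chunks(CHUNK_LEN, ticks, asynchronous=True))
```

Under these assumptions, one control period is about 33.3 ms, so a single 80 ms inference covers three control ticks. A synchronous loop would pause for those three ticks at every block boundary, while an asynchronous loop avoids the pause as long as each action block contains at least three actions.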
Xiaomi's first-generation robot VLA large model is here: smoother than Dove chocolate, with an inference latency of just 80 ms