Ant Group makes its VLA move: an open-source foundation model that surpasses Pi0.5
机器之心· 2026-01-28 03:36
Core Viewpoint
- The article discusses the challenges and advancements in embodied intelligence, focusing on the new model LingBot-VLA, which surpasses the previous benchmark Pi0.5 in generalization capability and training efficiency [2][26].

Group 1: Current Challenges in Embodied Intelligence
- The current state of embodied intelligence is characterized by a lack of generalization: robots can perform specific tasks but struggle with broader applications [2].
- The industry consensus is that larger and more diverse real-world robot data are needed to improve model training and task understanding [2][10].
- High-quality data collection is costly, and the difficulty of reusing data across different robot configurations limits the effectiveness of many models [2][8].

Group 2: Introduction of LingBot-VLA
- LingBot-VLA is trained on approximately 20,000 hours of real-world data from nine different robot configurations, allowing it to perform over 100 tasks effectively [2][10].
- The model shows significant improvements over Pi0.5, with an average success-rate increase of 4.28% and a partial success-rate increase of 7.76% [14][17].
- The architecture combines a pre-trained vision-language model with a specialized action generator, enhancing its ability to understand and execute complex tasks [20][21].

Group 3: Performance and Testing
- LingBot-VLA was tested on the GM-100 benchmark, which includes 100 diverse real-world tasks, demonstrating superior performance across different robot platforms [12][13].
- Its success rates were the highest among the tested models, indicating robustness on complex and varied tasks [14][17].
- Testing involved 25 robots from three different platforms, underscoring its cross-platform and cross-task capabilities [13].

Group 4: Efficiency and Scalability
- LingBot-VLA exhibits higher data utilization and computational efficiency, outperforming Pi0.5 while training on fewer data samples [17][19].
- The training framework scales well, maintaining high training throughput as GPU resources are increased [19][24].
- Systematic optimization of the training codebase has notably increased training speed, reaching 261 samples per second per GPU [24].

Group 5: Implications for the Future
- The advances represented by LingBot-VLA not only set a new industry benchmark but also provide empirical evidence that scaling real-world data leads to stronger model generalization [26][28].
- The open-source release of LingBot-VLA and its associated tools fosters a collaborative environment for further development in embodied intelligence [28][29].
- The model's development is seen as a strategic move by Ant Group toward integrating embodied intelligence into broader artificial general intelligence (AGI) work [28][29].
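The reported 261 samples per second per GPU invites a quick back-of-envelope estimate of aggregate training throughput. A minimal sketch, assuming a hypothetical 95% scaling efficiency (the article says throughput is maintained as GPUs are added, but gives no exact efficiency figure):

```python
# Back-of-envelope estimate of training throughput at scale. The per-GPU
# rate of 261 samples/s comes from the article; the 0.95 scaling-efficiency
# factor is an assumption for illustration, not a reported figure.

PER_GPU_SAMPLES_PER_SEC = 261
SCALING_EFFICIENCY = 0.95   # assumed fraction of linear scaling retained
SECONDS_PER_DAY = 24 * 60 * 60

def daily_samples(num_gpus: int) -> float:
    """Estimated training samples processed per day across num_gpus GPUs."""
    return PER_GPU_SAMPLES_PER_SEC * num_gpus * SCALING_EFFICIENCY * SECONDS_PER_DAY

for n in (8, 64, 256):
    print(f"{n:4d} GPUs -> ~{daily_samples(n):,.0f} samples/day")
```

At this assumed efficiency, a single GPU works through roughly 21 million samples a day, which is what makes 20,000-hour-scale datasets tractable.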
An embodied robotic arm that keeps evolving...
具身智能之心· 2026-01-22 09:42
Core Viewpoint
- The article emphasizes the importance of continuous evolution and adaptability in robotics, introducing the Imeta-Y1, a lightweight, cost-effective robotic arm designed for beginners and researchers in embodied intelligence [2].

Group 1: Product Introduction
- Imeta-Y1 is designed specifically for novices and researchers, providing a low-cost, efficient solution for algorithm validation and project development [2].
- The arm features high-precision motion control, low power consumption, and an open hardware and software architecture, enabling seamless transfer from simulation to real-world applications [5].

Group 2: User-Friendly Features
- The product ships with a comprehensive open-source toolchain and code examples, letting users complete the entire pipeline from data collection to model deployment [3][17].
- It supports both Python and C++ and is compatible with ROS1 and ROS2, so users can adapt quickly regardless of their programming background [3][18].

Group 3: Technical Specifications
- The Imeta-Y1 weighs 4.2 kg, carries a rated load of 3 kg, and has 6 degrees of freedom, with a working radius of 612.5 mm and a repeat positioning accuracy of ±0.1 mm [8][19].
- The arm runs on a 24 V supply and uses CAN communication; control modes include trajectory tracking, teaching, and an API [19].

Group 4: Development and Support
- The product provides a full-pipeline toolchain for data collection, model training, and inference deployment, supporting multi-modal data fusion and compatibility with major frameworks such as TensorFlow and PyTorch [36].
- The company offers customer support with a 24-hour response time, bulk purchase discounts, and project development and training services [19][48].
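The article mentions a Python/C++ API with a trajectory-tracking control mode. As an illustration only, the class below is entirely hypothetical, not the vendor's actual SDK; it sketches what a rate-limited joint-space move for a 6-DOF arm might look like:

```python
# Hypothetical sketch of a rate-limited joint-space move, in the spirit of
# the trajectory-tracking control mode the article mentions. The ImetaY1
# class and its methods are invented for illustration; the real arm ships
# with its own Python/C++ SDK and ROS1/ROS2 interfaces.

from dataclasses import dataclass, field
from typing import List

@dataclass
class ImetaY1:
    """Stand-in for a 6-DOF arm controller (not the real API)."""
    joints: List[float] = field(default_factory=lambda: [0.0] * 6)

    def step_toward(self, target: List[float], max_step: float = 0.1) -> List[float]:
        # Clamp each joint's motion to max_step radians per control tick,
        # mimicking rate-limited trajectory tracking.
        for i, goal in enumerate(target):
            delta = goal - self.joints[i]
            self.joints[i] += max(-max_step, min(max_step, delta))
        return self.joints

arm = ImetaY1()
arm.step_toward([0.05, -0.05, 0.2, 0.0, 0.0, 0.0])
print(arm.joints)  # joint 3 advances only 0.1 rad this tick (rate limit)
```

Calling `step_toward` repeatedly converges each joint to its target while bounding per-tick motion, which is the basic shape of any trajectory-tracking loop regardless of the concrete SDK.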
Express | A Chinese company beats Silicon Valley: first place in global embodied intelligence, fully open source
未可知人工智能研究院· 2026-01-13 03:02
Core Viewpoint
- A Chinese company, Qianxun Intelligent, has surpassed the Silicon Valley company Physical Intelligence in the global embodied intelligence model rankings, taking the top position with its open-source model Spirit 1.5 [1][2].

Group 1: Global Ranking and Performance
- RoboChallenge Table30 is the first large-scale real-robot evaluation leaderboard, assessing robots on 30 practical tasks and providing a more accurate measure of performance than previous self-reported rankings [2].
- Qianxun's Spirit 1.5 scored 66.09 with a success rate above 50%, surpassing the previous leader Pi0.5, which scored 61.84 with a success rate of 42.67% [3].

Group 2: Technological Innovation
- Qianxun's success is attributed to its "Diverse Collection" approach, which lets robots learn from real-world scenarios without predefined scripts, yielding a 40% gain in transfer-learning efficiency over traditional methods [4][5].
- The company emphasizes training models on diverse, real-world data to enhance their adaptability and performance on complex tasks [5].

Group 3: Team and Funding
- Qianxun's founding team includes Han Fengtang, a former CTO of Lush Robotics, and Gao Yang, a Tsinghua University graduate and Berkeley PhD, both bringing extensive industry experience and expertise [6][8].
- The company has completed multiple financing rounds since its founding in February 2024, with the latest round raising nearly 600 million, led by JD.com [8][9].

Group 4: Market Position and Strategy
- Qianxun is a "soft-hard integrated" player in the embodied intelligence sector, owning both proprietary software and hardware, which enables a data feedback loop that improves model training and hardware performance [12][13].
- The company is positioned to capitalize on growing demand for robots in logistics and retail, as evidenced by JD.com's investment, which reflects the need for functional robots in real-world applications [9][11].

Group 5: Future Opportunities
- The rise of embodied intelligence is expected to create significant employment opportunities, particularly for robot algorithm engineers, scene-solution experts, and data-collection engineers [17][19].
- The industry is expected to differentiate clearly within the next one to two years: companies without access to sufficient real-world data will likely fall behind, while those like Qianxun that integrate hardware and software effectively will thrive [21][22].
A new champion among open-source embodied models: Qianxun's Spirit v1.5 tops RoboChallenge, ending the Pi0.5 era
量子位· 2026-01-12 00:37
Core Viewpoint
- The article highlights the achievement of Spirit v1.5 from Qianxun Intelligent, which has topped the RoboChallenge leaderboard, surpassing the American model Pi0.5 and marking a milestone for embodied intelligence models [1][5][9].

Performance Summary
- Spirit v1.5 scored 66.09 with a success rate of 50.33%, outperforming Pi0.5, which scored 61.84 with a success rate of 42.67% [2][5].
- The model excelled across tasks, including stacking bowls (100% success), putting cups on coasters (90%), and searching for green boxes (90%) [3][10][11].
- Spirit v1.5 is the first domestic embodied model to exceed a 50% success rate on RoboChallenge since the leaderboard's launch [3][9].

Data Strategy Innovation
- The core innovation of Spirit v1.5 is its diverse data strategy in the pre-training phase, shifting from highly controlled "clean data" to a more varied and open data-collection approach [33][34].
- This strategy covers a broader range of actions and adapts better to real-world uncertainty, improving the model's transfer and generalization capabilities [40][44].

Open Source Contribution
- Spirit v1.5 has been open-sourced, including model weights, inference code, and usage examples, facilitating further research and development in embodied intelligence [7][68].
- The open-source release aligns with the goal of promoting reproducible and verifiable advances in embodied intelligence [71][72].

Company Background
- Qianxun Intelligent, established in January 2024, is recognized for its comprehensive AI and robotics capabilities, focusing on general humanoid robots and large-scale models [58][59].
- The company has secured significant funding, including over 1.5 billion yuan in 2025, indicating strong investor confidence and growth potential [61].
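The per-task results above can be related to an aggregate success rate with a simple equal-weight average. A minimal sketch, using only the three task rates the article reports; the equal-weight scheme is an assumption for illustration, not RoboChallenge's actual scoring formula (its published score of 66.09 is computed differently):

```python
# Illustrative only: an equal-weight average success rate over the three
# per-task results the article reports. The weighting scheme is assumed,
# not RoboChallenge's actual formula.

task_success = {
    "stack bowls": 1.00,
    "put cup on coaster": 0.90,
    "search for green box": 0.90,
}

average = sum(task_success.values()) / len(task_success)
print(f"equal-weight average over {len(task_success)} tasks: {average:.2%}")
```

On these three tasks alone the average sits well above the 50.33% reported over the full Table30 set, a reminder that headline tasks are typically among a model's strongest.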