Orbbec Gemini 330 Series Stereo 3D Camera
Letting Robots "See" the 3D World Clearly: Ant Lingbo Open-Sources a Spatial Perception Model
Core Insights
- Ant Group's Lingbo Technology has open-sourced LingBot-Depth, a high-precision spatial perception model that improves depth perception and 3D spatial understanding for robots and autonomous vehicles [1]

Group 1: Model Performance
- In benchmark evaluations, LingBot-Depth reduces relative error (REL) by more than 70% in indoor scenes compared with mainstream models such as PromptDA and PriorDA, and reduces RMSE by 47% on challenging sparse Structure from Motion (SfM) tasks; the source describes this as a generational advantage [1]
- The model handles transparent and reflective objects, which are common in household and industrial environments and which traditional depth cameras measure poorly [1][2]

Group 2: Technology and Innovation
- The "Masked Depth Modeling" (MDM) technology developed by Lingbo Technology lets the model infer and complete missing depth data by integrating texture, contours, and contextual information from RGB images, producing clearer and more complete 3D depth maps [2]
- LingBot-Depth has been certified by the Orbbec Depth Vision Laboratory as reaching industry-leading levels of accuracy, stability, and adaptability to complex scenes [2]

Group 3: Data and Collaboration
- The model's performance is attributed to a large training dataset of approximately 10 million raw samples and 2 million high-value depth pairs, which will soon be open-sourced to accelerate community work on complex spatial perception problems [3]
- Ant Group's Lingbo Technology has reached a strategic cooperation intention with Orbbec to launch a new generation of depth cameras based on LingBot-Depth's capabilities [3]
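The REL and RMSE figures above are relative reductions in an error metric. The sketch below shows how such numbers are typically computed, assuming REL means the common mean absolute relative error and that a ground-truth depth of 0 marks a pixel to skip. The function names and the toy depth values are illustrative assumptions, not taken from the LingBot-Depth evaluation.

```python
import math

def abs_rel(pred, gt):
    """Mean of |pred - gt| / gt over pixels with valid ground truth (gt > 0)."""
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)

def rmse(pred, gt):
    """Root mean squared error over pixels with valid ground truth (gt > 0)."""
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    return math.sqrt(sum((p - g) ** 2 for p, g in pairs) / len(pairs))

def relative_reduction(baseline, improved):
    """Fraction by which the error dropped, e.g. 0.7 means a 70% reduction."""
    return (baseline - improved) / baseline

# Toy depths in meters (made up for illustration); the last pixel has no ground truth.
gt = [1.0, 2.0, 4.0, 0.0]
baseline_pred = [1.2, 2.4, 3.2, 1.0]
improved_pred = [1.05, 2.1, 3.8, 1.0]

rel_drop = relative_reduction(abs_rel(baseline_pred, gt), abs_rel(improved_pred, gt))
rmse_drop = relative_reduction(rmse(baseline_pred, gt), rmse(improved_pred, gt))
print(f"REL reduction: {rel_drop:.0%}, RMSE reduction: {rmse_drop:.0%}")
```

A "70% reduction in REL" therefore means the new model's REL is about 0.3 times the baseline's, not that the error fell by 70 percentage points.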
Letting Robots "See" the 3D World Clearly: Ant Lingbo Open-Sources a Spatial Perception Model
Core Insights
- Ant Group's Lingbo Technology has open-sourced LingBot-Depth, a high-precision spatial perception model aimed at improving depth perception and 3D spatial understanding for robots and autonomous vehicles [1]

Group 1: Model Capabilities
- Using raw data from the Orbbec Gemini 330 series stereo 3D camera, LingBot-Depth reduces relative error (REL) by more than 70% in indoor scenes compared with mainstream models such as PromptDA and PriorDA [1]
- The model reduces RMSE by 47% on challenging sparse Structure from Motion (SfM) tasks, which the source describes as a generational advantage in performance [1]

Group 2: Addressing Industry Challenges
- Traditional depth cameras struggle with transparent and reflective objects, which cause missing data or noise in depth maps. Lingbo Technology developed "Masked Depth Modeling" (MDM) to address this problem [3]
- LingBot-Depth can infer and complete missing depth data by integrating texture, contours, and contextual information from RGB images, producing clearer and more complete 3D depth maps [3]

Group 3: Performance Validation
- In tests, the Gemini 330 series paired with LingBot-Depth produced smooth and complete depth maps even in challenging optical scenarios, outperforming Stereolabs' ZED stereo depth camera [4]
- This indicates that LingBot-Depth can significantly improve the output of consumer-grade depth cameras without hardware changes [4]

Group 4: Data and Collaboration
- The model was trained on approximately 10 million raw samples and 2 million high-value depth pairs, which strengthens its generalization in extreme environments [6]
- Ant Group's Lingbo Technology has established a strategic partnership with Orbbec to develop a new generation of depth cameras based on LingBot-Depth's capabilities, and plans to open-source multiple models in the field of embodied intelligence [6]
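The "missing depth data" discussed under Group 2 can be made concrete with a small sketch. It assumes the common depth-camera convention that a pixel value of 0 means no measurement, and it measures how complete a depth map is before and after some completion step. The function names and the toy arrays are hypothetical; this only quantifies holes and does not implement MDM or any part of LingBot-Depth.

```python
def valid_mask(depth, invalid=0):
    """True where the sensor returned a measurement, False where depth is missing."""
    return [[d != invalid for d in row] for row in depth]

def completeness(depth, invalid=0):
    """Fraction of pixels that carry a depth measurement."""
    total = sum(len(row) for row in depth)
    valid = sum(1 for row in depth for d in row if d != invalid)
    return valid / total

def holes_filled(raw, completed, invalid=0):
    """Count pixels that were missing in the raw map and are present after completion."""
    return sum(
        1
        for raw_row, done_row in zip(raw, completed)
        for r, c in zip(raw_row, done_row)
        if r == invalid and c != invalid
    )

# Toy 2x4 depth maps in millimeters (made up): a glass object leaves a hole in the middle.
raw_depth = [
    [1200, 0, 0, 1210],
    [1195, 0, 0, 1205],
]
# Hypothetical output of a depth-completion model for the same frame.
completed_depth = [
    [1200, 1203, 1207, 1210],
    [1195, 1198, 1202, 1205],
]

print(completeness(raw_depth), completeness(completed_depth))
print(holes_filled(raw_depth, completed_depth))
```

Completeness alone says nothing about whether the filled values are accurate, so it is usually reported alongside an error metric such as REL or RMSE.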