OVSeg3R
Vision Future: the "spatial intelligence model" company favored by two leading AI figures
投中网 (ChinaVenture) · 2026-01-26 02:12
Core Viewpoint - The article discusses advances in "spatial intelligence models" led by the company Vision Future, highlighting its competitive edge in AI, particularly in visual models, and the support it has from prominent figures in AI research [2][5][6].
Group 1: Company Background
- Vision Future was founded by Dr. Zhang Lei, a prominent figure in AI, who developed the state-of-the-art visual model Grounding DINO 1.5, which outperforms models from major competitors such as Google and Meta [5][6].
- The company has received significant backing from renowned AI experts, including Academicians Zhang Bo and Shen Xiangyang, who serve as advisors [6][8].
Group 2: Technological Advancements
- Vision Future's DINO-X model has unique "generalized perception" capabilities, which have led to partnerships with major companies such as China Merchants Group and Meituan Robotics [8][9].
- The company aims to integrate spatial perception models with Visual-Language-Action (VLA) frameworks to create intelligent systems that behave consistently with the laws of the physical world [9][11].
Group 3: Research and Development
- The core research direction includes upgrading 2D perception to 3D understanding, addressing key challenges in embodied intelligence [11][12].
- The OVSeg3R model, developed under Dr. Zhang's guidance, has achieved significant breakthroughs in 3D object detection and segmentation, enhancing the capabilities of embodied intelligence [12][13].
Group 4: Market Potential
- The article emphasizes the dual benefits of technological iteration and industrial integration in the spatial intelligence model sector, suggesting a bright future for Vision Future as a potential unicorn in this field [14].
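Automatically generating 3D annotations from 2D priors: good news for autonomous driving and embodied intelligence | Open-sourced by the IDEA team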
量子位 (QbitAI) · 2026-01-17 02:53
Core Viewpoint - The article introduces OVSeg3R, a new paradigm for open-set 3D instance segmentation that significantly reduces training costs and improves performance by leveraging mature 2D instance segmentation data and 3D reconstruction techniques [2][3][10].
Group 1: Challenges in 3D Perception
- 3D instance segmentation is crucial for applications such as autonomous driving and robotics, as it enables machines to understand and delineate objects in 3D space [4].
- The high cost and difficulty of acquiring and annotating 3D data have been major bottlenecks in the development of 3D perception models [5][6].
- Existing methods for integrating 2D models into 3D tasks often fail to improve the 3D model's own ability to recognize new objects, which limits their effectiveness [7][8][9].
Group 2: OVSeg3R's Technical Principles
- OVSeg3R connects 3D models with 2D models through 3D reconstruction, allowing the 3D model to learn from the rich data available in 2D segmentation [10].
- The learning process involves three stages: data preparation, model input and annotation preparation, and model learning [12][19].
- Key innovations include the use of Instance-Boundary-aware Superpoints (IBSp) to improve training stability and the generation of high-quality semantic labels from raw video data [16][19].
Group 3: Performance Metrics
- OVSeg3R achieved a significant performance leap on the ScanNet200 benchmark, reducing the performance gap between long-tail and head classes from 11.3 mAP to 1.9 mAP [21][22].
- In open-set settings, OVSeg3R outperformed previous methods, achieving an mAP of 24.6 and a notable improvement of 7.7 mAP on novel categories [23].
Group 4: Application Scenarios
- OVSeg3R's capabilities are expected to drive advances in open-set 3D instance segmentation, particularly in embodied intelligence, by reducing reliance on expensive manual 3D annotations [27].
- The model's open-set recognition ability allows for precise identification of previously unseen "long-tail" objects, enhancing safety in robotic navigation and operation [28].
- OVSeg3R also compensates for the limitations of 3D geometry in recognizing non-rigid objects, providing a solid foundation for robotic grasping and navigation applications [29].
Group 5: Industry Progress
- The technology is being advanced toward industrial application by Vision Future, a company incubated by IDEA, indicating a move toward practical deployment [30].
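The 2D-to-3D link described under "OVSeg3R's Technical Principles" can be illustrated with a small sketch. This is not the OVSeg3R implementation, which the summary does not detail. It only assumes that 3D reconstruction yields, for each pixel of a frame, the index of the reconstructed 3D point seen at that pixel, so that 2D instance ids can be carried over to points by voting. The function name `lift_masks_to_points` and the `pixel_to_point` map are hypothetical names introduced for illustration.

```python
import numpy as np


def lift_masks_to_points(instance_mask, pixel_to_point, num_points):
    """Carry 2D instance ids over to 3D points for a single frame.

    instance_mask  : (H, W) int array; 0 = background, k > 0 = 2D instance id.
    pixel_to_point : (H, W) int array; index of the reconstructed 3D point
                     seen at each pixel, or -1 where no point was reconstructed
                     (an assumed output of the reconstruction step).
    num_points     : number of points in the reconstructed cloud.

    Returns a (num_points,) int array of instance ids. Points that no pixel
    observes stay at 0 (unlabeled).
    """
    num_ids = int(instance_mask.max()) + 1
    votes = np.zeros((num_points, num_ids), dtype=np.int64)

    # Only pixels that map to a reconstructed point can vote.
    valid = pixel_to_point >= 0
    points = pixel_to_point[valid]
    ids = instance_mask[valid]

    # np.add.at accumulates correctly when several pixels hit the same point.
    np.add.at(votes, (points, ids), 1)

    # Majority vote per point; ties resolve to the lowest id.
    return votes.argmax(axis=1)


if __name__ == "__main__":
    mask = np.array([[1, 1, 0],
                     [2, 2, 0]])
    p2p = np.array([[0, 0, 1],
                    [2, 3, -1]])
    print(lift_masks_to_points(mask, p2p, num_points=5))
```

Labels produced this way are only as good as the 2D masks and the reconstruction. The summary attributes OVSeg3R's training stability to further components such as IBSp, which this sketch does not attempt to model.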