Manycore Tech (Qunhe Technology) Releases Two Open-Source Spatial Models, Commits to Building a Technology Ecosystem Through Open Source

Core Insights
- Qunhe Technology has launched two models, SpatialLM 1.5 and SpatialGen, aimed at improving 3D scene understanding and generation and at addressing consistency problems in AI-generated video [1][2][3].

Group 1: SpatialLM 1.5
- SpatialLM 1.5 is a spatial language model that lets users generate interactive 3D scenes through a dialogue system, overcoming the limitations traditional models have in understanding physical geometry and spatial relationships [2].
- The model produces scenes that carry physically accurate, structured information. This allows diverse scenarios to be generated quickly for applications such as robot path planning and obstacle avoidance, which helps address the scarcity of data for robot training [2].
- A demonstration showed the model understanding commands and autonomously planning optimal action paths in complex environments, highlighting its potential for practical use [2].

Group 2: SpatialGen
- SpatialGen is a multi-view image generation model built on a diffusion architecture. It creates temporally consistent multi-view images from text descriptions and 3D layouts [3].
- The model keeps the spatial properties and physical relationships of the same object accurate across different views, making generated scenes more realistic [3].
- Qunhe Technology plans to release a 3D-integrated AI video generation product by the end of the year, aiming to address current limitations in the consistency of AI-generated video [3].

Group 3: Open Source Strategy
- Qunhe Technology regards open source as the way to maximize the value of its technology and to contribute to the growth of the spatial intelligence sector [4].
- The company has built an ecosystem that links spatial editing tools, synthetic spatial data, and spatial large models, using data to accelerate model training and improve the user experience [4].
- As of June 30, the company had accumulated over 441 million 3D models and more than 500 million structured 3D spatial scenes [4].

Group 4: Future Developments
- SpatialLM 1.5 and SpatialGen will be open-sourced gradually on platforms such as HuggingFace, GitHub, and Modao Community, making them available to developers worldwide [5].