SpatialLM 1.5
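Video generation bids farewell to "teleporting and warping": behind Qunhe Technology's rise to the top of Hugging Face, spatial language rewrites the rules of AI in the physical world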
TMTPost APP · 2025-09-01 03:18
Core Insights
- AIGC technology is moving beyond text and image generation into the more complex domains of 3D space and video, where it struggles to understand the structure of the physical world and to keep generated video temporally consistent [2][6]
- Spatial intelligence is identified as a crucial bridge for AI to move from the digital to the physical world, which requires AI to learn the "language" of space [2][9]

Model Developments
- The newly released models, SpatialLM 1.5 and SpatialGen, target 3D scene generation and video creation respectively: SpatialLM 1.5 focuses on structured generation through "spatial language", while SpatialGen keeps scenes spatially coherent across multiple viewpoints [3][4]
- SpatialLM 1.5 encodes spatial relationships as "language", allowing end-to-end generation of 3D scenes from user input in the form of structured scripts that carry physical parameters [4][5]

Data and Training
- The scarcity of high-quality 3D data is a significant bottleneck for spatial intelligence; by mid-2025 the company had accumulated over 441 million 3D models and 500 million structured 3D scenes [5]
- The company uses its platform, Kujiale, to accumulate data for training spatial understanding and generation models, creating a feedback loop between tools, data, and models [5]

Video Generation Innovations
- Current AI video generation tools struggle with spatial logic because they rely on 2D image sequences, which leads to object distortion and inconsistency between frames [6][7]
- SpatialGen works around this by using a 3D Gaussian scene as an intermediate representation, so that images can be rendered from any viewpoint while objects stay consistent across frames [6][7]

Market Strategy and Ecosystem
- The company emphasizes open-sourcing its models and data to foster collaboration and innovation in the spatial intelligence market, aiming to expand the ecosystem rather than monopolize it [9][10]
- The open-source strategy has drawn international attention; the company released what it describes as the world's first 3D Gaussian dataset, which has implications for industries including autonomous driving [9][10]

Differentiation and Future Directions
- The company's focus on interactive, functional scenes differentiates it from models that may lack spatial consistency and positions it for industrial applications [10][11]
- By offering a new path for industrial software development, the company aims to create "AI-native" design tools that bypass traditional, complex geometric algorithms [11]
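To make the idea of a "structured script with physical parameters" more concrete, here is a minimal sketch in Python. The line format (`label cx cy cz sx sy sz`, in metres), the `Box` class, and the helper names are all hypothetical and chosen for illustration; they are not SpatialLM's actual output format. The point is only that a scene expressed as structured text can be parsed and checked for physical plausibility, here with a simple axis-aligned overlap test.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Box:
    """Hypothetical scene-script record: an axis-aligned box in metres."""
    label: str
    cx: float  # centre coordinates
    cy: float
    cz: float
    sx: float  # full extent along each axis
    sy: float
    sz: float

    def overlaps(self, other: "Box") -> bool:
        """True if the two boxes intersect with positive volume."""
        return (
            abs(self.cx - other.cx) * 2 < self.sx + other.sx
            and abs(self.cy - other.cy) * 2 < self.sy + other.sy
            and abs(self.cz - other.cz) * 2 < self.sz + other.sz
        )


def parse_script(script: str) -> List[Box]:
    """Parse lines of the assumed form 'label cx cy cz sx sy sz'."""
    boxes = []
    for line in script.strip().splitlines():
        label, *numbers = line.split()
        boxes.append(Box(label, *map(float, numbers)))
    return boxes


def collisions(boxes: List[Box]) -> List[Tuple[str, str]]:
    """Return every pair of labels whose boxes interpenetrate."""
    return [
        (a.label, b.label)
        for i, a in enumerate(boxes)
        for b in boxes[i + 1:]
        if a.overlaps(b)
    ]


# Illustrative script, not model output.
SCRIPT = """
bed   1.0 1.5 0.25  2.0 1.6 0.5
desk  3.5 0.4 0.375 1.2 0.6 0.75
chair 3.5 1.0 0.45  0.5 0.5 0.9
"""

scene = parse_script(SCRIPT)
print(collisions(scene))  # an empty list means no furniture interpenetrates
```

Because the scene is plain structured data rather than pixels, downstream tools (a layout editor, a robot simulator) could validate or edit it directly, which is the property the article attributes to the "spatial language" approach.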
Qunhe Technology releases spatial large models aimed at solving the spatial-consistency problem in AI video
36Kr · 2025-08-29 04:00
Core Insights
- Qunhe Technology launched its latest spatial models, SpatialLM 1.5 and SpatialGen, at its first Tech Day on August 25, emphasizing an open-source strategy to engage global developers [1][4]
- SpatialLM 1.5 is designed to understand and generate spatial language, producing structured 3D scene scripts from users' text input and showing potential in robotics [1][2]
- SpatialGen focuses on generating multi-view images with temporal consistency, addressing a current weakness of AI-generated video [2][3]

Group 1: Model Features
- SpatialLM 1.5 builds on a large language model that learns a new "spatial language", allowing it to describe the spatial structures and relationships in 3D scenes accurately [1]
- The model can generate structured 3D scene scripts and assist with robot path planning and task execution, helping to address the scarcity of interactive 3D data [2]
- SpatialGen uses a diffusion model architecture to create multi-view images from text and 3D layouts while maintaining spatial logic and consistency [2][3]

Group 2: Strategic Vision
- Qunhe Technology's strategy revolves around a "space editing tool - space synthesis data - space large model" framework, which creates a positive feedback loop between model training and the tool experience [3]
- As of June 30, 2025, the company had accumulated over 441 million 3D models and 500 million structured 3D spatial scenes, which it uses for model development [3]
- The open-source initiative, which began in 2018, aims to advance spatial model technology together with global developers [3][4]
After Qunhe Technology's return to profit: expanding while cutting costs
Beijing Business Today · 2025-08-28 17:24
Core Viewpoint
- Qunhe Technology has submitted an updated prospectus to the Hong Kong Stock Exchange after its first filing lapsed. The company reported revenue of 399 million yuan for the first half of 2025, up 9% year-on-year, and turned an adjusted net profit, but it carries high redemption liabilities and has cut spending on sales, marketing, and R&D [1][3][8]

Revenue and Profitability
- Revenue for the first half of 2025 reached 399 million yuan, a 9% year-on-year increase and slower than the 10.5% and 13.8% growth recorded in 2023 and 2024 respectively [3][6]
- The revenue structure remains heavily reliant on subscriptions, with 97.7% of revenue coming from software subscriptions, up from 90.6% in 2022 [3][4]
- Adjusted net profit for the first half of 2025 was 17.825 million yuan, a significant turnaround from an adjusted net loss of 73.196 million yuan in the same period last year [6]

Cost Management
- Sales and marketing expenses fell from 171 million yuan in the first half of 2024 to 136 million yuan in the first half of 2025, and the sales team shrank from 615 to 501 employees [8][9]
- R&D expenses fell by 16.8% to 150 million yuan in the first half of 2025, primarily because of the optimization of R&D personnel [8]

Product Development and Market Strategy
- Qunhe Technology launched two new open-source spatial models, SpatialLM 1.5 and SpatialGen, aimed at enhancing AI video generation capabilities [6][7]
- The company plans to use the IPO proceeds for international expansion, product launches, and improvements to existing products, targeting South Korea, Southeast Asia, India, the US, and Japan [8][9]

Industry Context
- The company operates in a challenging environment: its clients in the real estate and construction sectors face significant pressure, which may weigh on demand for AI-driven design solutions [9]
Behind the return to profit of Qunhe Technology, one of the "Six Little Dragons": expanding while cutting costs
Beijing Business Today · 2025-08-27 14:39
Core Viewpoint
- Qunhe Technology has submitted an updated prospectus to the Hong Kong Stock Exchange after its first filing lapsed. The updated financials show revenue of 399 million yuan for the first half of 2025, a 9% year-on-year increase, and a return to adjusted profitability, alongside significant cuts to sales, marketing, and R&D expenses and a high redemption liability of 4 billion yuan [1][4][10]

Revenue Structure
- The revenue structure remains heavily reliant on subscriptions, with 97.7% of total revenue coming from software subscriptions in the first half of 2025, up from 90.6% in 2022 [4][5]
- The company also provides professional services to enterprise clients, which contributed only 2.3% of total revenue in the first half of 2025 [6][10]

Financial Performance
- Qunhe Technology achieved an adjusted net profit of 17.825 million yuan in the first half of 2025, a significant turnaround from an adjusted net loss of 73.196 million yuan in the same period last year [7]
- The company had previously reported adjusted net losses of 338 million yuan, 242 million yuan, and 70.049 million yuan for 2022, 2023, and 2024 respectively [7]

Cost Management
- Sales and marketing expenses fell from 171 million yuan in the first half of 2024 to 136 million yuan in the first half of 2025, and sales headcount fell from 615 to 501 [10]
- R&D expenses fell by 16.8%, from 180 million yuan in the first half of 2024 to 150 million yuan in the first half of 2025 [10]

Future Plans
- The company plans to use the IPO proceeds for international expansion, product launches, and improvements to existing products, with a particular focus on AIGC and geometric modeling [10]
- Qunhe Technology aims to build a sales team of approximately 250 people and to allocate around 20 million yuan to marketing activities over the next 3-5 years [10]

Product Development
- Qunhe Technology recently launched two new open-source spatial models, SpatialLM 1.5 and SpatialGen, aimed at enhancing AI-generated content capabilities [7]
- The company also plans to release an AI video generation product built on 3D technology in 2025 to address current limitations in AI video generation [7][9]
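The cost-management figures above can be cross-checked with simple percentage-change arithmetic. The short Python sketch below uses only the rounded values quoted in the summary (in millions of yuan, plus headcount); the function name `pct_change` is an illustrative choice. With these rounded inputs the R&D reduction comes out at about 16.7%, slightly below the reported 16.8%, a gap that would be consistent with the source computing from unrounded figures (this is an assumption, not something the summary states).

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new; negative means a reduction."""
    return (new - old) / old * 100.0


# (H1 2024, H1 2025), taken from the rounded figures quoted in the summary.
figures = {
    "sales and marketing expenses (million yuan)": (171, 136),
    "R&D expenses (million yuan)": (180, 150),
    "sales headcount": (615, 501),
}

for name, (h1_2024, h1_2025) in figures.items():
    print(f"{name}: {pct_change(h1_2024, h1_2025):+.1f}%")
# sales and marketing expenses (million yuan): -20.5%
# R&D expenses (million yuan): -16.7%
# sales headcount: -18.5%
```

On these numbers, sales and marketing spending fell somewhat faster than sales headcount, and both fell faster than R&D spending.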
Qunhe Technology open-sources two spatial large models, aiming to solve the problem Genie 3 did not fully solve
Founder Park· 2025-08-27 11:41
Core Viewpoint
- The article discusses the emergence of "world models" in AI, highlighting Google DeepMind's release of Genie 3 and Qunhe Technology's advances in 3D spatial models, which aim to solve the problem of spatial consistency in AI-generated environments [2][8]

Group 1: Types of World Models
- There are two main types of world models: video models such as Sora and Genie 3, which simulate the physical world using 2D image sequences, and large-scale 3D models, which focus on reconstructing 3D scenes [4][5]
- Video models struggle to maintain spatial consistency because they rely on 2D images, while 3D models find it hard to produce complete spatial content that holds up from multiple angles [6][8]

Group 2: Qunhe Technology's Innovations
- Qunhe Technology introduced SpatialGen, which it describes as the first 3D indoor scene cognition and generation model; it addresses spatial consistency by generating a navigable 3D space that supports switching to any viewpoint [8][10]
- SpatialLM 1.5, a spatial language model, lets users generate interactive 3D scenes through natural language commands, making the process far more accessible to non-experts [10][11]

Group 3: Technical Foundations
- SpatialGen combines multi-view diffusion with 3D Gaussian reconstruction so that lighting and texture remain consistent across viewpoints [14][15]
- The models are built on extensive 3D spatial data: Qunhe's tools generate structured 3D data that includes physical parameters and spatial relationships [16][18]

Group 4: Market Opportunities and Challenges
- The current state of spatial models is likened to early versions of GPT: they have foundational capabilities but are not yet universally applicable [20]
- Demand for AI-generated short films presents a significant opportunity, since these models can improve scene coherence and production efficiency, two common weak points of existing AI tools [21][22]

Group 5: Future Directions
- Qunhe Technology is developing an AI video generation product that integrates 3D capabilities to further improve the spatial consistency of generated content [24]
- The company aims to bridge the gap between virtual and real-world applications, particularly in robotics, by providing structured 3D data that can be used for training [41]
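The contrast drawn above between 2D video models and 3D-backed generation can be illustrated with a small geometric sketch. The Python code below is not SpatialGen's pipeline: it replaces diffusion and 3D Gaussian rendering with a plain pinhole camera, and the `Camera` class, its default intrinsics, and the sample point are all assumptions made for illustration. It shows the underlying reason a shared 3D scene yields consistent views: a pixel observed in view A, lifted back to 3D with its depth, lands exactly where view B sees the same point.

```python
import math
from typing import NamedTuple, Tuple

Point3 = Tuple[float, float, float]


class Camera(NamedTuple):
    """Pinhole camera at (x, y, z), rotated by `yaw` radians about the y axis."""
    x: float
    y: float
    z: float
    yaw: float
    focal: float = 500.0  # focal length in pixels (assumed)
    cx: float = 320.0     # principal point (assumed 640x480 image)
    cy: float = 240.0


def project(cam: Camera, p: Point3) -> Tuple[float, float, float]:
    """World point -> (u, v, depth) in the camera's image."""
    dx, dy, dz = p[0] - cam.x, p[1] - cam.y, p[2] - cam.z
    c, s = math.cos(cam.yaw), math.sin(cam.yaw)
    xc, yc, zc = c * dx - s * dz, dy, s * dx + c * dz  # world -> camera frame
    return cam.focal * xc / zc + cam.cx, cam.focal * yc / zc + cam.cy, zc


def unproject(cam: Camera, u: float, v: float, depth: float) -> Point3:
    """(u, v, depth) in the camera's image -> world point."""
    xc = (u - cam.cx) * depth / cam.focal
    yc = (v - cam.cy) * depth / cam.focal
    c, s = math.cos(cam.yaw), math.sin(cam.yaw)
    return c * xc + s * depth + cam.x, yc + cam.y, -s * xc + c * depth + cam.z


cam_a = Camera(0.0, 0.0, 0.0, 0.0)
cam_b = Camera(1.0, 0.0, 0.5, math.radians(-15.0))
corner = (0.5, 0.2, 4.0)  # e.g. one corner of a table, in metres

u, v, depth = project(cam_a, corner)
lifted = unproject(cam_a, u, v, depth)  # back to the shared 3D scene
print(project(cam_b, corner))           # view B rendered directly
print(project(cam_b, lifted))           # view B reached via view A: same pixel
```

A model that generates each frame as an independent 2D image has no such shared point to anchor on, which is one way to read the "teleporting" and distortion artifacts the article describes.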
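Pushing a data advantage to its limit: one of the "Hangzhou Six Little Dragons" takes an open-source first step toward spatial intelligence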
Synced (机器之心) · 2025-08-26 09:38
Core Insights
- The article emphasizes the importance of high-quality spatial data for developing AI models, particularly for understanding three-dimensional (3D) space [1][4][6]
- It discusses the emergence of models such as SpatialLM and SpatialGen, which draw on large volumes of spatial data to improve AI's ability to understand and generate 3D environments [10][20]

Group 1: Spatial Data and AI Models
- Extensive spatial data is crucial for training robust AI models, which in turn improve tools and applications across fields [2][4]
- The article highlights the concept of a "data flywheel", in which tools, data, and models continuously reinforce one another, particularly in spatial intelligence [4][6]
- The launch of SpatialLM 1.5 marks a significant advance in spatial language understanding, allowing the model to interpret and generate structured spatial information [13][15]

Group 2: Model Features and Capabilities
- SpatialLM 1.5 can generate structured scene scripts from simple text descriptions, letting users create and manipulate 3D environments interactively [16][17]
- SpatialGen focuses on generating multi-view images that stay spatially consistent across viewpoints, addressing a weakness of traditional 3D scene generation [20][21]
- The models are trained on extensive datasets, such as the SpatialGen dataset, which includes over 1 million images [22][28]

Group 3: Open Source and Collaboration
- The company aims to foster collaboration by open-sourcing its models and datasets, encouraging innovation and development within the AI community [32][36]
- The leadership expresses a commitment to making spatial intelligence accessible, emphasizing that no single company can dominate this emerging market [33][36]
- The open-source approach is expected to stimulate advances in AI by giving researchers and developers opportunities to contribute to the field [36]
Meta partners with Midjourney to develop AI image and video models; Qunhe Technology releases spatial large models | AIGC Daily
Cyzone (创业邦) · 2025-08-26 00:04
Group 1
- Meta is collaborating with Midjourney to develop AI image and video generation technologies, aiming to integrate them into future AI models and products [2]
- DingTalk launched its next-generation AI office application "DingTalk ONE", designed as a unified entry point for natural language dialogue between humans and AI and focused on creating an agent-driven work information flow [2]
- Baidu's AI search app "梯子AI" (Tizzy.ai) has been officially renamed and positioned as an intelligent search assistant, emphasizing ad-free smart search and integrating deep thinking, resource retrieval, and entertainment features [2]
- Qunhe Technology released its latest spatial large models: SpatialLM 1.5, an interactive spatial language model, and SpatialGen, a multi-view image generation model based on a diffusion architecture [2]
Qunhe Technology's Huang Xiaohuang: actively embrace open source to bring about the "DeepSeek moment" for spatial large models
IPO Zaozhidao (IPO早知道) · 2025-08-25 13:10
Core Viewpoint
- Qunhe Technology aims to accelerate global spatial intelligence technology through open-source initiatives, and showcased its latest spatial models, SpatialLM 1.5 and SpatialGen, at its first Tech Day event [3][4]

Group 1: Spatial Models
- Qunhe Technology introduced SpatialLM 1.5, a spatial language model that lets users generate structured scene scripts and layouts through natural language interaction, addressing the difficulty traditional language models have with spatial relationships [4][6]
- SpatialGen, a multi-view image generation model, generates images that are temporally and spatially consistent from text descriptions and 3D layouts, enabling immersive experiences in generated 3D environments [7][8]

Group 2: Open Source Strategy
- The company has pursued an open-source strategy since 2018, gradually releasing its data and algorithm capabilities to foster innovation in spatial intelligence technology [4][10]
- Qunhe Technology's spatial intelligence ecosystem follows a "space editing tool - spatial synthesis data - spatial large model" framework, in which widespread use of the tools drives data accumulation and model training [4]

Group 3: Data and Model Performance
- As of June 30, 2025, Qunhe Technology held over 441 million 3D models and more than 500 million structured 3D spatial scenes, which contribute significantly to the training and performance of its spatial models [4]
- The previous version, SpatialLM 1.0, quickly climbed the Hugging Face trending list after its open-source release, which the company cites as evidence that the open-source approach works [6]

Group 4: AI Video Generation
- The company is developing an AI video generation product that integrates 3D capabilities, aiming to address the temporal-consistency problems of current AI-generated video [10]
- Existing AI video tools often suffer from object displacement and confused spatial logic because they lack an understanding of 3D structure, which Qunhe Technology seeks to overcome with its new model [10]
Qunhe Technology releases two open-source spatial models and will keep building its technology ecosystem through open source
Securities Daily Online · 2025-08-25 11:18
Core Insights
- Qunhe Technology launched two models, SpatialLM 1.5 and SpatialGen, aimed at improving 3D scene understanding and generation and at addressing consistency problems in AI video [1][2][3]

Group 1: SpatialLM 1.5
- SpatialLM 1.5 is a spatial language model that lets users generate interactive 3D scenes through a dialogue system, overcoming the limits of traditional models in understanding physical geometry and spatial relationships [2]
- The model produces scenes with physically accurate structured information, enabling rapid generation of diverse scenarios for applications such as robot path planning and obstacle avoidance and helping to relieve the data scarcity in robot training [2]
- A demonstration showed the model understanding commands and autonomously planning optimal action paths in complex environments, highlighting its potential in practical applications [2]

Group 2: SpatialGen
- SpatialGen is a multi-view image generation model built on a diffusion architecture, capable of creating temporally consistent multi-view images from text descriptions and 3D layouts [3]
- The model ensures that the same object keeps accurate spatial properties and physical relationships across different views, making generated scenes more realistic [3]
- Qunhe Technology plans to release a 3D-integrated AI video generation product by the end of the year to address the consistency limitations of current AI-generated video [3]

Group 3: Open Source Strategy
- Qunhe Technology emphasizes open source as a way to maximize the value of its technology and contribute to the growth of the spatial intelligence sector [4]
- The company has developed a "space editing tool - space synthesis data - space large model" ecosystem, using data to accelerate model training and improve the user experience [4]
- As of June 30, 2025, the company had amassed over 441 million 3D models and more than 500 million structured 3D spatial scenes [4]

Group 4: Future Developments
- The two models, SpatialLM 1.5 and SpatialGen, will be open-sourced gradually on platforms including Hugging Face, GitHub, and ModelScope, making them accessible to developers worldwide [5]