Fengrui Capital Invests in a Smart Hardware Company Doing Spatial 3D Reconstruction; the Founder Is a Former Qunhe Technology VP | Early-Stage Watch
36氪· 2026-03-26 04:35
Core Viewpoint
- The article discusses the emergence of Hangzhou Zhuma Innovation Technology Co., Ltd., which has completed a multi-million-yuan angel round of financing to develop consumer-grade 3D reconstruction and spatial intelligence products, filling the market gap between expensive industrial-grade equipment and limited consumer applications [5][6].

Group 1: Company Overview
- Zhuma Innovation was founded in November 2025 and focuses on consumer-grade 3D reconstruction and spatial intelligence products [5].
- The founder and CEO, Zhang Ji, has over 20 years of experience in the 3D graphics industry and has held senior roles at companies such as Qunhe Technology and Glodon [5].
- The company aims to leverage three major trends: the drop in sensor costs driven by the boom in smart automobiles, breakthroughs in 3DGS technology enabling real-time, high-quality 3D reconstruction, and the integration of spatial intelligence with generative AI [5][6].

Group 2: Market Opportunity
- Consumer-grade 3D reconstruction is currently a blank space: industrial-grade equipment is too expensive and complex for widespread use, while mobile AR applications remain superficial [6].
- Zhuma Innovation's entry point is the market vacuum characterized as "industrial-grade too expensive, consumer-grade nonexistent" [6].

Group 3: Product Development
- The first product, codenamed "Pebble," is a professional-grade 3DGS camera targeting overseas home designers, space designers, video producers, and independent game developers [6].
- Pebble is designed to be consumer-friendly, priced below traditional industrial-grade equipment (which averages over 50,000 yuan), and requires no high-performance computer to operate [6].
- The second product will be a "spatial memory camera" aimed at ordinary consumers, letting them capture and replay 3D memories of events such as family gatherings and trips [6].
Group 4: Development Strategy
- The company has a clear development path: a short-term focus on the Pebble product, mid-term expansion into mainstream markets in Europe and the U.S., and the long-term establishment of a "hardware + software + community" business loop around 3DGS technology [7].
- The goal is to make the 3DGS camera affordable for the average consumer, comparable in price to a regular camera [7].

Group 5: Investor Insights
- Investors express confidence in Zhuma Innovation's potential, highlighting the team's strong technical background in multi-sensor fusion, 3DGS algorithms, and cloud optimization [9].
- 3DGS is seen as a potential foundational technology that could democratize 3D spatial modeling for the consumer market [10].
36Kr Exclusive | Fengrui Capital Invests in a Smart Hardware Company Doing Spatial 3D Reconstruction; the Founder Is a Former Qunhe Technology VP
36氪· 2026-03-23 01:36
Core Insights
- Hangzhou Zhuma Innovation Technology Co., Ltd. has completed a multi-million-yuan angel round of financing led by Fengrui Capital, with funds primarily allocated to R&D team recruitment, mass-production preparation, and initial marketing [1].

Company Overview
- Zhuma Innovation was established in November 2025, focusing on consumer-grade 3D reconstruction and spatial intelligence products. The founder and CEO, Zhang Ji, has over 20 years of experience in the 3D graphics industry [1].
- The company aims to fill the market gap in consumer-grade 3D reconstruction, as industrial-grade equipment is expensive and complex, while mobile AR applications are limited [2].

Technology and Product Development
- Zhuma Innovation's core technology is 3D Gaussian Splatting (3DGS), which enables high-fidelity reconstruction and real-time rendering while significantly lowering hardware barriers [2].
- The first product, "Pebble," is a professional-grade 3DGS camera targeting overseas home designers and independent game developers, priced competitively against traditional industrial equipment [2].
- The second product will be a "spatial memory camera" for general consumers, enabling users to capture and replay 3D memories of events [2].

Strategic Vision
- The company has a clear development path: a short-term focus on the Pebble product, mid-term expansion into mainstream markets, and the long-term establishment of a "hardware + software + community" business ecosystem [3].
- The goal is to create a new content category rather than just a hardware product [4].

Investor Perspectives
- Investors believe 3DGS has the potential to become a foundational technology for consumer markets, and that Zhuma's team combines strong technical capability with a deep understanding of spatial design [5][6].
- The expectation is for Zhuma to penetrate fields such as space design, gaming, AR/VR, and physical AI, ultimately creating breakthrough products [6].
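The summary above credits 3DGS with high-fidelity reconstruction and real-time rendering. One reason the representation optimizes stably is that each Gaussian's covariance is factored as Σ = R S SᵀRᵀ (rotation times per-axis scale), which keeps it symmetric positive semi-definite by construction. A minimal 2D sketch in plain Python (the function name and parameters are illustrative, not taken from any 3DGS codebase):

```python
import math

def gaussian_covariance_2d(theta, sx, sy):
    """Build Sigma = R S S^T R^T from a rotation angle and per-axis scales,
    the factorization 3DGS uses so the covariance stays a valid (symmetric,
    positive semi-definite) matrix throughout gradient-based optimization."""
    c, s = math.cos(theta), math.sin(theta)
    R = [[c, -s], [s, c]]       # 2D rotation matrix
    S = [[sx, 0.0], [0.0, sy]]  # diagonal scale matrix
    # M = R @ S
    M = [[sum(R[i][k] * S[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    # Sigma = M @ M^T
    return [[sum(M[i][k] * M[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]

cov = gaussian_covariance_2d(theta=0.3, sx=2.0, sy=0.5)
```

Because the determinant of Σ equals (sx·sy)², the Gaussian's "area" is controlled purely by the scales while the angle only reorients it.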
Gauging Job Demand for GS Reconstruction in the Autonomous Driving Industry...
自动驾驶之心· 2026-01-19 09:04
Core Viewpoint
- The article discusses the growing demand for 3DGS (3D Gaussian Splatting) algorithm teams in autonomous driving, emphasizing the need for skilled professionals to support closed-loop simulation and scene reconstruction [2][3].

Group 1: Industry Demand and Job Roles
- Companies are looking to hire algorithm teams of 5-20 members to support the optimization of closed-loop simulation [3].
- There is specific demand for cloud-side data-production roles, such as static road-surface reconstruction from a BEV perspective, indicating a growing market for these skills [3].
- The field is relatively new, making it hard for beginners to find effective learning resources and highlighting a gap in the market for educational programs [3].

Group 2: Educational Initiatives
- The article introduces a course titled "3DGS Theory and Algorithm Practical Tutorial," designed to provide a comprehensive learning path for 3DGS technology [3].
- The course covers background knowledge, principles, algorithms, and important research directions in 3DGS, aiming to equip participants with a solid grasp of the technology stack [8][9][10][11][12].

Group 3: Course Structure and Content
- The course is structured into six chapters, starting with foundational knowledge in computer graphics and progressing to advanced topics such as feed-forward 3DGS [8][9][10][11][12].
- Each chapter includes practical assignments and discussions to reinforce understanding and application of the concepts learned [10][11][12].
- The course begins on December 1 and runs for roughly two and a half months, with offline video lectures and online Q&A sessions [15].

Group 4: Target Audience and Prerequisites
- The course is aimed at people with a background in computer graphics, visual reconstruction, and programming, particularly those familiar with Python and PyTorch [17].
- Participants are expected to have a foundational understanding of probability and linear algebra, ensuring they can engage with the course material effectively [17].
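To give a flavor of what such a 3DGS curriculum covers: the renderer blends depth-sorted Gaussians per pixel by front-to-back alpha compositing, C = Σᵢ cᵢ αᵢ Πⱼ<ᵢ (1 − αⱼ). A minimal single-channel sketch in plain Python (the data layout and names are illustrative, not from any course material):

```python
def composite_front_to_back(splats):
    """Front-to-back alpha compositing over depth-sorted splats, the
    per-pixel blending rule used by 3DGS rasterizers:
    C = sum_i c_i * a_i * prod_{j<i} (1 - a_j)."""
    splats = sorted(splats, key=lambda s: s["depth"])  # nearest first
    color, transmittance = 0.0, 1.0
    for s in splats:
        color += s["color"] * s["alpha"] * transmittance
        transmittance *= (1.0 - s["alpha"])  # light surviving past this splat
    return color, transmittance

# A mostly opaque near splat should dominate a far one.
c, t = composite_front_to_back([
    {"depth": 2.0, "color": 0.2, "alpha": 0.5},
    {"depth": 1.0, "color": 1.0, "alpha": 0.8},
])
```

Here the near splat contributes 0.8 of the final color and the far splat only sees the remaining 0.2 transmittance, which is why correct depth sorting matters.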
What Are the Difficulties of Deploying Feed-Forward GS in Autonomous Driving Scenarios?
自动驾驶之心· 2025-12-26 03:32
Core Viewpoint
- The article discusses the challenges and advances in 3DGS (3D Gaussian Splatting) for autonomous driving, emphasizing the importance of a structured learning path for newcomers to the industry [2][6].

Group 1: Course Overview
- The course titled "3DGS Theory and Algorithm Practical Tutorial" aims to provide a comprehensive learning roadmap for 3DGS, covering both theoretical foundations and practical applications [2][6].
- The course was designed in collaboration with industry algorithm experts and spans over two and a half months, starting December 1 [13].

Group 2: Course Structure
- Chapter 1 introduces background knowledge for 3DGS, including basic concepts of computer graphics, implicit and explicit representations of 3D space, and common development tools such as SuperSplat and COLMAP [6][7].
- Chapter 2 delves into the principles and algorithms of 3DGS, covering dynamic reconstruction, surface reconstruction, and ray tracing, with practical exercises using NVIDIA's open-source 3DGRUT framework [7][8].
- Chapter 3 focuses on applying 3DGS to autonomous driving simulation, highlighting key works and practical tools such as DriveStudio for further learning [8][9].
- Chapter 4 discusses important research directions in 3DGS, including extensions of COLMAP and depth estimation, and their relevance to both industry and academia [9].
- Chapter 5 covers feed-forward 3DGS, detailing its development history and algorithmic principles, along with discussion of recent algorithms such as AnySplat and WorldSplat [10].

Group 3: Interaction and Support
- Chapter 6 is dedicated to online discussions and Q&A sessions, allowing participants to engage with instructors on industry pain points and job-market demands [11].
- The course encourages continuous interaction between students and professionals from academia and industry, enhancing networking opportunities [15].
Meta's "Segment Anything" Enters the 3D Era! Image Segmentation Results Go Straight to 3D, with Occluded Parts Recovered
量子位· 2025-11-20 07:01
Core Viewpoint
- Meta's new 3D modeling paradigm allows image segmentation results to be converted directly into 3D models, enhancing 3D reconstruction from 2D images [1][4][8].

Summary by Sections

3D Reconstruction Models
- Meta's MSL lab has released SAM 3D, which includes two models: SAM 3D Objects for object and scene reconstruction, and SAM 3D Body for human modeling [4][8].
- SAM 3D Objects can reconstruct 3D models and estimate object poses from a single natural image, overcoming challenges such as occlusion and small objects [10][11].
- SAM 3D Objects outperforms existing methods, achieving a win rate at least five times higher than leading models in direct user comparisons [13][14].

Performance Metrics
- SAM 3D Objects shows significant improvements in 3D shape and scene reconstruction, with an F1 score of 0.2339 and a 3D IoU of 0.4254 [15].
- SAM 3D Body also achieves state-of-the-art (SOTA) results in human modeling, with an MPJPE of 61.7 and a PCK of 75.4 across various datasets [18].

Semantic Understanding
- SAM 3 introduces a concept-segmentation feature that allows flexible object segmentation from user-defined prompts, overcoming the limitations of fixed label sets [21][23].
- The model can identify and segment objects from textual descriptions or selected examples, significantly enhancing its usability [26][31].

Benchmarking and Results
- SAM 3 sets a new SOTA in promptable segmentation, achieving 47.0% accuracy in zero-shot segmentation on the LVIS dataset, surpassing the previous SOTA of 38.5% [37].
- On the new SA-Co benchmark, SAM 3 performs at least twice as well as baseline methods [38].

Technical Architecture
- SAM 3's architecture is built on a shared Perception Encoder, improving the consistency and efficiency of feature extraction for both detection and tracking tasks [41][43].
- SAM 3D Objects employs a two-stage generative approach, using a 1.2-billion-parameter flow-matching transformer for geometric predictions [49][50].
- SAM 3D Body uses a distinctive Momentum Human Rig representation to decouple skeletal pose from body shape, enhancing detail in human modeling [55][60].
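The human-modeling metrics cited above, MPJPE (mean per-joint position error) and PCK (percentage of correct keypoints), are simple functions of predicted versus ground-truth joint positions. A small sketch in plain Python (the sample joints are invented for illustration; units follow whatever the inputs use, typically millimeters):

```python
import math

def mpjpe(pred, gt):
    """Mean Per-Joint Position Error: average Euclidean distance between
    predicted and ground-truth joints."""
    dists = [math.dist(p, g) for p, g in zip(pred, gt)]
    return sum(dists) / len(dists)

def pck(pred, gt, threshold):
    """Percentage of Correct Keypoints: share of joints (in %) whose error
    falls within the given distance threshold."""
    correct = sum(1 for p, g in zip(pred, gt) if math.dist(p, g) <= threshold)
    return 100.0 * correct / len(pred)

# Two 3D joints: one nearly right (3 units off), one badly off (50 units).
gt   = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0)]
pred = [(3.0, 0.0, 0.0), (100.0, 50.0, 0.0)]
```

Lower is better for MPJPE, higher for PCK, which is why the article reports them as a pair.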
How Does Gaode Help the Museum and Cultural Heritage Sector "Break the Infinite"?
21世纪经济报道· 2025-09-30 11:56
Core Viewpoint
- The article emphasizes the importance of cultural relic protection and the urgent need for digital transformation in the museum sector to strengthen preservation and public engagement [1][3].

Digital Transformation in Cultural Heritage
- Recent policies have highlighted the need to accelerate digitization of collections and improve database accessibility, with a focus on integrating digital technology into cultural heritage management [1][3].
- Digitalization of cultural heritage has advanced significantly, with many museums adopting 3D online exhibitions and AI-guided tours, making digital experiences the norm [3][4].

Challenges in Digitalization
- The sector faces substantial challenges, including the sheer volume of collections, the fragility of artifacts, and the high cost and lengthy process of traditional 3D modeling [5][6].
- Three main pain points are identified: limited physical access to popular museums, bottlenecks in large-scale digitization, and the ongoing operational pressure of balancing artifact protection with visitor capacity [5][6].

Technological Solutions
- Companies such as Gaode are applying their expertise in digital-twin technology and AI to these challenges, aiming to lower costs and improve operational efficiency in cultural heritage management [6][8].
- Gaode's "Cloud Realm" platform improves the efficiency of 3D reconstruction and digitalization, enabling a more interactive and engaging public experience with cultural artifacts [8][9].

Innovative Collaborations
- Gaode has partnered with institutions such as the Palace Museum to establish an AI 3D reconstruction innovation lab, focused on overcoming data-collection challenges in complex environments [8][12].
- The collaboration aims to create a replicable model for digital management and services across cultural institutions, expanding the role of intelligent technology in museums [11][12].
Future Directions
- The article suggests that Gaode's approach to cultural heritage digitalization could serve as a model for smaller museums, emphasizing the role of technology in making cultural knowledge more accessible [12][13].
- Gaode's commitment to remaining a technology-focused platform aims to provide a neutral, serious approach to cultural heritage management, distinguishing it from content-driven organizations [13][14].
World Robot Conference Sparks a 3D Vision Revolution, with Spatial Intelligence in the Spotlight
自动驾驶之心· 2025-08-11 05:45
Core Viewpoint
- The 2025 World Robot Conference (WRC) in Beijing highlights 3D perception as a key focus, showcasing advances in spatial memory modules and multi-modal sensors that extend robotic capabilities across industries [2][4].

Group 1: 3D Reconstruction Technology
- The ultimate goal of 3D reconstruction technology is to enable robots to understand, navigate, and operate in any environment [4].
- The latest handheld laser scanner, the D-H100, achieves centimeter-level precision at a scanning range of 120 meters, improving efficiency in complex environments by 300% [4].
- Integrating laser-scanning capability into robots can enable real-time mapping of disaster areas and improve operational efficiency in industrial settings [4][5].

Group 2: GeoScan S1 Laser Scanner
- The GeoScan S1 is presented as the most cost-effective handheld 3D laser scanner in China, featuring a lightweight design and one-button operation for efficient 3D workflows [7][12].
- The device supports real-time reconstruction of 3D scenes with centimeter-level accuracy and can cover areas exceeding 200,000 square meters [7][25].
- It integrates multiple sensors and offers high-bandwidth connectivity, making it suitable for a range of research and industrial applications [7][9].

Group 3: Technical Specifications and Features
- The GeoScan S1 runs Ubuntu 20.04 and exports data in formats including PCD, LAS, and PLY, with relative accuracy better than 3 cm and absolute accuracy better than 5 cm [25][28].
- The scanner measures 14.2 cm x 9.5 cm x 45 cm, weighs 1.3 kg without the battery, and runs for roughly 3 to 4 hours per charge [25][27].
- It includes advanced multi-sensor time synchronization, ensuring precise mapping in complex indoor and outdoor environments [33][34].
Group 4: Market Position and Pricing
- The GeoScan S1 is available in multiple versions, priced from 19,800 yuan for the basic model up to 67,800 yuan for the offline version [60].
- The product is backed by extensive research and validation by teams at Tongji University and Northwestern Polytechnical University [14][18].
- The scanner is designed for cross-platform integration, compatible with drones, unmanned vehicles, and humanoid robots for automated operation [45][48].
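Of the export formats listed above, ASCII PLY is the simplest to produce by hand: a text header declaring the vertex count and properties, followed by one line per point. A minimal sketch (XYZ-only, no colors or binary encoding; the function name is illustrative, not from any scanner SDK):

```python
def points_to_ascii_ply(points):
    """Serialize (x, y, z) tuples to a minimal ASCII PLY string: a fixed
    header declaring the vertex element and its float properties, then one
    whitespace-separated line per point."""
    header = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(points)}",
        "property float x",
        "property float y",
        "property float z",
        "end_header",
    ]
    body = [f"{x} {y} {z}" for x, y, z in points]
    return "\n".join(header + body) + "\n"

ply = points_to_ascii_ply([(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)])
```

Binary PLY, LAS, and PCD carry the same point data but need packed binary layouts, which is why tools usually lean on dedicated libraries for those.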
The "Heart of Autonomous Driving" Project and Paper Mentoring Program Is Here
自动驾驶之心· 2025-08-07 12:00
Core Viewpoint
- The article announces the launch of the "Heart of Autonomous Driving" project and paper mentoring program, aimed at students facing challenges in autonomous driving research and development [1].

Group 1: Project and Guidance Overview
- The program supports students who run into difficulties in their research, such as environment-configuration issues and debugging challenges [1].
- Last year's outcomes were positive, with several students publishing papers at top conferences such as CVPR and ICRA [1].

Group 2: Guidance Directions
- Direction 1: Multi-modal perception and computer vision, end-to-end autonomous driving, large models, and BEV perception. The supervising mentor has published over 30 papers at top AI conferences, with more than 6,000 citations [3].
- Direction 2: 3D object detection, semantic segmentation, occupancy prediction, and multi-task learning based on images or point clouds. The supervising mentor is a top-tier PhD with multiple publications at ECCV and CVPR [5].
- Direction 3: End-to-end autonomous driving, OCC, BEV, and world models. The supervising mentor is also a top-tier PhD who has contributed to several mainstream perception solutions [6].
- Direction 4: NeRF / 3DGS neural rendering and 3D reconstruction. The supervising mentor has published four CCF-A papers, including two in CVPR and two in IEEE Transactions [7].
Goodbye, Artifacts! HKU Open-Sources GS-SDF: SDF-Based Gaussian Initialization Can Be This Stable
自动驾驶之心· 2025-07-24 06:46
Core Viewpoint
- The article presents a unified LiDAR-visual system that addresses geometric inconsistencies in Gaussian splatting for robotic applications, combining Gaussian splatting with a Neural Signed Distance Field (NSDF) to achieve geometrically consistent rendering and reconstruction [52].

Group 1: Unified LiDAR-Visual System
- The proposed system uses registered images and low-cost LiDAR data to reconstruct both the appearance and the surface structure of scenes under arbitrary trajectories [5][6].
- The importance of Gaussian initialization in achieving good structure is emphasized, highlighting its role in the optimization process [22].

Group 2: Geometric Regularization
- The article discusses introducing geometric regularization into the 3D Gaussian Splatting (3DGS) framework to address geometric inconsistencies that manifest as rendering distortions [3][6].
- Depth cameras and LiDAR can provide direct structural priors, which can be integrated into the 3DGS framework for improved geometric regularization [3].

Group 3: Methodology
- The overall pipeline has three stages: training a Neural Signed Distance Field (NSDF) from point clouds, initializing Gaussian primitives from the NSDF, and jointly optimizing the Gaussian primitives and the NSDF through SDF-assisted shape regularization [8][6].
- 2D Gaussian splatting is used to represent 3D scenes, with each disk defined by a center point, orthogonal tangent vectors, a scaling factor, opacity, and view-dependent color [10].

Group 4: Experimental Results
- The proposed method demonstrates superior reconstruction accuracy and rendering quality across various trajectories, as shown by extensive experiments [52].
- Quantitative results indicate the method outperforms existing techniques on metrics such as C-L1, F-Score, SSIM, and PSNR across multiple datasets [46][49].
Group 5: Limitations and Future Work
- The method is limited in extrapolative novel-view synthesis, suggesting that more advanced neural rendering techniques should be explored to address this [53].
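The SDF-assisted shape regularization described above can be pictured as a penalty on the signed distance evaluated at each Gaussian center: the penalty vanishes exactly when every primitive sits on the zero level set (the surface). A toy sketch in plain Python, using an analytic sphere SDF as a stand-in for the learned NSDF (the names and the simple squared-distance loss are illustrative, not the paper's exact formulation):

```python
def sphere_sdf(p, radius=1.0):
    """Signed distance from point p to a unit sphere -- a stand-in for the
    learned neural SDF (negative inside, zero on the surface, positive outside)."""
    return (p[0] ** 2 + p[1] ** 2 + p[2] ** 2) ** 0.5 - radius

def sdf_shape_loss(centers, sdf):
    """Mean squared SDF value at the Gaussian centers: zero only when every
    primitive lies on the surface, so gradients pull off-surface
    primitives toward the zero level set."""
    return sum(sdf(c) ** 2 for c in centers) / len(centers)

on_surface  = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]   # both on the unit sphere
off_surface = [(2.0, 0.0, 0.0), (0.0, 0.0, 0.0)]   # one outside, one at center
```

In the real system this term is one part of a joint objective alongside the photometric rendering loss, so appearance and geometry are optimized together.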
Results Are Out! The Latest ICCV 2025 Roundup (Autonomous Driving / Embodied AI / 3D Vision / LLM / CV, etc.)
自动驾驶之心· 2025-06-28 13:34
Core Insights
- The article surveys the recent ICCV acceptances, highlighting the excitement around newly released works in autonomous driving and related fields [2].

Group 1: Autonomous Driving Innovations
- DriveArena is introduced as a controllable generative simulation platform aimed at enhancing autonomous driving capabilities [4].
- Epona presents an autoregressive diffusion world model designed specifically for autonomous driving [4].
- SynthDrive offers a scalable Real2Sim2Real sensor-simulation pipeline for high-fidelity asset generation and driving-data synthesis [4].
- StableDepth targets scene-consistent, scale-invariant monocular depth estimation, which is crucial for improving perception in autonomous vehicles [4].
- CoopTrack explores end-to-end learning for efficient cooperative sequential perception, enhancing the collaborative capabilities of autonomous systems [4].

Group 2: Image and Vision Technologies
- CycleVAR repurposes autoregressive models for unsupervised one-step image translation, which can benefit visual recognition tasks in autonomous driving [5].
- CoST pursues efficient collaborative perception from a unified spatiotemporal perspective, essential for real-time decision-making in autonomous vehicles [5].
- Hi3DGen generates high-fidelity 3D geometry from images via normal bridging, improving spatial understanding of environments for autonomous systems [5].
- GS-Occ3D scales vision-only occupancy reconstruction for autonomous driving using Gaussian splatting [5].

Group 3: Large Model Applications
- ETA introduces a dual approach to self-driving with large models, improving the efficiency and effectiveness of autonomous driving systems [5].
- Taming the Untamed discusses graph-based knowledge retrieval and reasoning for multimodal large language models (MLLMs), which can significantly improve decision-making in autonomous driving [7].