3D Gaussian Splatting (3DGS)
Training 100 Million Gaussians on a Single GPU and Reconstructing a 25 km² City: A CPU "Add-On" Breaks Through the 3DGS Memory Wall
36Kr · 2025-12-23 07:27
Core Insights

- The article discusses CLM (CPU-offloaded Large-scale 3DGS training), a system developed by a research team at New York University. By offloading memory-intensive parameters to CPU memory, it enables city-scale 3D reconstruction on a single consumer-grade GPU, specifically the RTX 4090 [1][20].

Group 1: 3D Gaussian Splatting (3DGS) Challenges

- 3DGS has become a significant technology in neural rendering thanks to its rendering quality and speed, but it scales poorly to complex scenes such as urban areas, primarily because of GPU memory limits [2].
- A high-precision 3DGS model typically contains tens of millions to over a hundred million Gaussian points. Each point carries parameters and gradients that must be stored during training, so such a model is difficult to train on a single GPU [2][3].

Group 2: CLM System Design

- CLM addresses the GPU memory bottleneck by loading Gaussian parameters from CPU memory dynamically, only when they are needed, instead of keeping all parameters in GPU memory [3][4].
- The system employs three key mechanisms:
  1. **Attribute Segmentation**: Only the "key attributes" needed to determine visibility are stored in GPU memory; the majority of parameters are offloaded to CPU memory [5][6].
  2. **Pre-rendering Visibility Culling**: CLM computes which Gaussian points are visible before rendering, which reduces unnecessary computation and GPU memory usage [7][8].
  3. **Efficient CPU Utilization**: CLM minimizes data-transfer delays through micro-batching, caching, and intelligent scheduling, so the CPU assists training without slowing it down [10][12].

Group 3: Performance Results

- On an RTX 4090, CLM trained 102.2 million Gaussian points, a 6.7-fold increase over the traditional method, which could handle only 15.3 million points [13][14].
- Despite communication overhead, CLM achieved a training throughput of 55% to 90% of the enhanced baseline on the RTX 4090, and 86% to 97% on the slower RTX 2080 Ti [16].
- Reconstruction quality improved as well: the 102.2 million point model reached a PSNR of 25.15 dB, compared with 23.93 dB for the 15.3 million point model [18].

Group 4: Broader Implications

- CLM is a cost-effective route to large-scale 3D reconstruction that avoids the deployment burden of multi-GPU setups, which benefits both academic and industrial users [20].
- Growing demand for efficient, low-cost 3D reconstruction tools in areas such as digital twins and large-scale mapping makes CLM's approach particularly relevant [20].
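The attribute-splitting and pre-rendering culling ideas summarized under Group 2 can be illustrated with a toy Python sketch. It is not the CLM implementation: the `Camera` class, the `visible_indices` and `fetch_visible` helpers, the choice of position plus a bounding radius as the "key attributes", and the axis-aligned square frustum are all assumptions made for illustration, and plain Python lists stand in for GPU and CPU memory.

```python
import math
from dataclasses import dataclass


@dataclass
class Camera:
    # Hypothetical camera at `origin` looking down +z with a square field of view.
    origin: tuple
    tan_half_fov: float
    near: float
    far: float


def visible_indices(key_attrs, cam):
    """Return indices of Gaussians whose bounding sphere may touch the frustum.

    key_attrs holds only (x, y, z, radius) per Gaussian: the small set of
    "key attributes" assumed here to stay resident in GPU memory.
    """
    # Converts a sphere radius into padding of the frustum half-width
    # (radius divided by cos of the half field of view).
    pad_scale = math.sqrt(1.0 + cam.tan_half_fov ** 2)
    visible = []
    for i, (x, y, z, r) in enumerate(key_attrs):
        dx, dy, dz = x - cam.origin[0], y - cam.origin[1], z - cam.origin[2]
        if dz + r < cam.near or dz - r > cam.far:
            continue  # entirely in front of the near plane or beyond the far plane
        half_extent = cam.tan_half_fov * dz + r * pad_scale
        if abs(dx) <= half_extent and abs(dy) <= half_extent:
            visible.append(i)
    return visible


def fetch_visible(cpu_store, indices):
    """Simulate copying only the visible Gaussians' bulk attributes to the GPU."""
    return {i: cpu_store[i] for i in indices}


key_attrs = [
    (0.0, 0.0, 5.0, 0.1),    # in front of the camera: kept
    (100.0, 0.0, 5.0, 0.1),  # far off to the side: culled
    (0.0, 0.0, -5.0, 0.1),   # behind the camera: culled
    (0.0, 0.0, 500.0, 0.1),  # beyond the far plane: culled
]
# Placeholder bulk attributes held in host memory (8 floats each, arbitrary size).
cpu_store = [[0.0] * 8 for _ in key_attrs]

cam = Camera(origin=(0.0, 0.0, 0.0), tan_half_fov=1.0, near=0.1, far=100.0)
working_set = fetch_visible(cpu_store, visible_indices(key_attrs, cam))
print(sorted(working_set))  # [0]
```

In this sketch only one of the four Gaussians is transferred for the given view, which is the effect the summary attributes to culling before rendering: the per-view working set, rather than the whole model, has to fit on the GPU.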
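Latest in IJRR! Sun Yat-sen University Proposes 3D Gaussian Splatting-Based Robot Self-Modeling: High-Fidelity Reconstruction of Morphology, Motion, and Color from RGB Images Alone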
机器人大讲堂 · 2025-11-17 09:00
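Robot self-modeling is one of the core technologies underpinning autonomous robot intelligence. With self-modeling, a robot can, like a human, learn its own structure and appearance through vision and so achieve self-awareness. In practice, precise collision avoidance for industrial manipulators, high-fidelity simulation for digital twins, and adaptive adjustment of service robots in dynamic environments all depend on accurate modeling of the robot's own structure, motion state, and appearance. Traditional methods, however, are often limited by high equipment cost, low modeling accuracy, or incomplete feature coverage, which makes them hard to apply widely in real-world settings.

Early techniques were mostly designed for specific tasks and could model only local information such as end-effector position or joint velocity, so they lacked generality. Approaches that rely on depth cameras, LiDAR, or large numbers of inertial measurement units (IMUs) require expensive equipment and complex data collection. Methods based on neural radiance fields (NeRF) can achieve 3D reconstruction, but training and rendering are slow, the models are hard to interpret, and most ignore surface color. NeRF-based methods that use only RGB images, in turn, produce blurry morphology reconstructions and cannot capture the link structure. How to achieve high-precision, multi-feature, link-level self-modeling at low cost has become a pressing technical problem for the field.

To address these challenges, a research team from the School of Computer Science at Sun Yat-sen University (first author: master's student Hu Kejun (胡可钧); corresponding author: Professor Tan Ning (谭宁)) has proposed a self-modeling technique based on 3D Gaussian Splatting (3DGS). ...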