3D Gaussian Splatting
Sun Yat-sen & HKUST pure-vision approach: high-precision trajectory video generation with 3DGS
自动驾驶之心· 2025-12-22 00:42
Source: 深蓝AI (author: 深蓝学院), a learning platform focused on AI, robotics, and autonomous driving. Original article: "A pure-vision approach! New work from Sun Yat-sen University & HKUST: high-precision trajectory video generation based on 3DGS". 「No image retouching, no reliance on LiDAR」 In autonomous driving, multi-trajectory, multi-view video data is practically a hard requirement: it determines the completeness of 3D reconstruction and directly affects how well world models and planning systems generalize. Reality, however, is harsh. Collecting multiple strictly synchronized driving videos of the same road at different lateral positions is extremely costly: it requires either multi-vehicle coordination or repeated runs over the same road segment, and it introduces inconsistencies in timing and dynamic objects. So researchers began to ask: can a single real driving video be used to automatically "generate" a video along an adjacent trajectory? It sounds simple, but in practice it runs into two major pitfalls. Sun Yat-sen University and HKUST propose ReCamDriving, a purely vision-based method for novel-trajectory video generation with precise camera-trajectory control: no patching, no reliance on LiDAR, just a different approach to camera control. Title: ...
Breaking the vicious cycle! CoherentGS: high-definition reconstruction even from sparse, blurry images
自动驾驶之心· 2025-12-20 02:16
Core Viewpoint
- The article discusses CoherentGS, a breakthrough technology developed by Peking University that enables high-quality 3D scene reconstruction from a limited number of blurry images, addressing the twin challenges of sparse views and motion blur [5][33].

Group 1: CoherentGS Technology Overview
- CoherentGS utilizes a "dual prior guidance" strategy, allowing the reconstruction of high-definition, coherent 3D scenes from just 3 to 9 blurry images [5][7].
- The system addresses both "deblurring" and "geometric completion" through a collaborative optimization process [7][12].
- The core framework integrates deblurring and geometric completion into the entire 3D Gaussian optimization process, ensuring clear and coherent reconstruction results [10][12].

Group 2: Key Technologies of CoherentGS
- The deblurring prior restores clear details and provides photometric guidance, essential for extracting reliable details from blurry images [13].
- The diffusion prior completes geometric gaps, ensuring global coherence by filling in unobserved areas with structured images [18].
- Consistency-guided camera exploration intelligently selects valuable viewpoints, improving optimization efficiency without blindly adding perspectives [19][21].
- Joint optimization incorporates geometric regularization to avoid distortion, ensuring the reliability of the reconstructed geometry [24][26].

Group 3: Performance Validation
- CoherentGS outperforms existing methods, achieving a PSNR improvement of up to 2.78 dB and reducing LPIPS by over 40% on the Deblur-NeRF and DL3DV-BLUR datasets with 3 to 9 sparse blurry inputs [26].
- Qualitative results show that CoherentGS recovers texture details and maintains structural coherence, where other methods produce either blur or fragmented structures [29].
- Frequency analysis shows that CoherentGS retains natural high-frequency details, confirming that the restored details are genuine rather than artificially generated [32].

Group 4: Future Implications
- CoherentGS represents a significant advance in 3D reconstruction, breaking the dependency on dense, clear inputs, with the potential to extend to real-world scenarios involving defocus blur and exposure anomalies [33].
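For orientation, a PSNR gain like the 2.78 dB quoted above translates directly into a reduction in reconstruction error. A minimal sketch using the standard PSNR definition (illustrative arithmetic, not code from the paper):

```python
import math

def psnr(mse, max_val=1.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10.0 * math.log10(max_val ** 2 / mse)

# A +2.78 dB gain means the MSE shrinks by a factor of 10 ** (2.78 / 10),
# i.e. the squared reconstruction error is roughly cut in half.
mse_ratio = 10 ** (2.78 / 10)  # ≈ 1.9
```

So the headline number corresponds to nearly halving the per-pixel squared error relative to the baselines.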
Tsinghua team open-sources the DISCOVERSE framework: using 3D Gaussian rendering to bridge the "last mile" from robot simulation to reality!
机器人大讲堂· 2025-11-10 04:07
Core Insights
- The article discusses the challenges in end-to-end robot learning, focusing on the "Sim2Real" gap, which stems primarily from simulation environments failing to accurately replicate real-world scenarios [1][6][10].

Group 1: Challenges in Robot Simulation
- Current simulation environments struggle with three main issues: insufficient realism in replicating real-world scenarios, high costs of scene asset acquisition and system configuration, and time-consuming data collection [1][5].
- The core obstacle is the performance drop during Sim2Real transfer, which stems from fundamental differences between simulated and real environments, such as object appearance, lighting effects, and spatial geometry [1][6].

Group 2: Existing Simulation Frameworks
- Various simulation frameworks have been developed, but none meet all three critical requirements: high visual fidelity, accurate physical interaction, and efficient parallel scalability [3][6].
- Traditional simulators often compromise on either visual realism or physical accuracy, leading to ineffective robot training [6][7].

Group 3: DISCOVERSE Framework
- DISCOVERSE is an open-source simulation framework developed by Tsinghua University in collaboration with other institutions, integrating 3D Gaussian Splatting (3DGS), the MuJoCo physics engine, and control interfaces into a unified architecture [5][10].
- The framework aims to bridge the Sim2Real gap through a three-layer innovation approach focused on accurate digital representation of real-world scenes and objects [10][12].

Group 4: Performance and Efficiency
- DISCOVERSE significantly improves simulation speed, achieving rendering speeds of up to 650 FPS on high-end hardware, three times faster than competing solutions [19][20].
- The framework supports a wide range of asset formats and robot models, improving compatibility and reducing the need for extensive configuration [21][22].

Group 5: Testing and Results
- In comparative tests, DISCOVERSE outperformed other mainstream simulators in zero-shot transfer success rates across various tasks, demonstrating its effectiveness in real-world applications [24][27].
- The framework also improves data collection efficiency, cutting the time to gather demonstration data from 146 minutes to just 1.5 minutes and thus accelerating algorithm iteration [29].

Group 6: Future Implications
- DISCOVERSE is positioned as a versatile robot simulation framework capable of supporting various complex tasks, with potential applications in robotics, drones, and autonomous-driving sensors [30].
- The release of the framework's code and API aims to facilitate adoption by developers and enterprises, marking a significant milestone for the robotics industry [30].
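The throughput figures quoted above can be put in concrete terms with some simple arithmetic (illustrative only, derived from the numbers in this summary, not from the DISCOVERSE codebase):

```python
# At 650 FPS, the full physics-step-plus-render loop must fit in ~1.5 ms per frame.
frame_budget_ms = 1000.0 / 650  # ≈ 1.54 ms

# Cutting demonstration-data collection from 146 minutes to 1.5 minutes
# is roughly a 97x speedup per batch of demonstrations.
collection_speedup = 146 / 1.5  # ≈ 97x
```

That per-frame budget is what makes the 3DGS rasterization approach attractive: splatting is cheap enough to leave headroom for the MuJoCo physics step within the same loop.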
ICCV 2025 | RobustSplat: Decoupling Densification and Dynamics for Transient-Robust 3DGS Reconstruction
具身智能之心· 2025-08-20 00:03
Core Viewpoint
- The article discusses the RobustSplat method, which addresses the challenge of rendering dynamic objects in 3D Gaussian Splatting (3DGS) while maintaining high-quality static scene reconstruction [1][4][19].

Research Motivation
- The motivation stems from the dual role of Gaussian densification in 3DGS: it enhances scene detail but can overfit to dynamic areas, producing artifacts and scene distortion [4][6].

Methodology
- **Transient Mask Estimation**: Uses a Mask MLP to output pixel-wise transient masks, distinguishing transient from static regions [9].
- **Feature Selection**: DINOv2 features are chosen for their balance of semantic consistency, noise resistance, and computational efficiency, outperforming alternative feature sets [10].
- **Supervision Design**: Combines an image residual loss with a feature cosine similarity loss for mask optimization, improving dynamic-area recognition [10].
- **Delayed Gaussian Growth Strategy**: The core strategy postpones densification to prioritize optimizing the static scene structure, reducing the risk of misclassifying static areas as transient [12].
- **Mask Regularization**: Minimizes the misclassification of static regions during the early optimization stages [12].
- **Scale-Cascaded Mask Guidance**: Initially estimates transient masks from low-resolution features, then transitions to high-resolution supervision for improved accuracy [14].

Experimental Results
- Experiments on the NeRF On-the-go and RobustNeRF datasets show that RobustSplat outperforms baseline methods such as 3DGS, SpotLessSplats, and WildGaussians in PSNR, SSIM, and LPIPS metrics [16][20].

Conclusion
- RobustSplat effectively reduces rendering artifacts caused by transient objects while preserving scene details, demonstrating robustness in complex scenarios [18][19].
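The mask supervision described above (an image residual combined with feature cosine similarity) can be sketched as follows. This is a minimal illustration, not RobustSplat's implementation: the function name, the weighting `alpha`, and the toy feature maps standing in for DINOv2 features are all assumptions.

```python
import numpy as np

def transient_score(render_img, gt_img, render_feat, gt_feat, alpha=0.5):
    """Per-pixel transient score: a weighted mix of the photometric residual
    and feature cosine dissimilarity (higher = more likely transient)."""
    # Photometric residual, averaged over color channels, normalized to [0, 1].
    residual = np.abs(render_img - gt_img).mean(axis=-1)
    residual = residual / (residual.max() + 1e-8)
    # Cosine similarity between per-pixel feature vectors (DINOv2 stand-ins).
    dot = (render_feat * gt_feat).sum(axis=-1)
    norms = np.linalg.norm(render_feat, axis=-1) * np.linalg.norm(gt_feat, axis=-1)
    cos_sim = dot / (norms + 1e-8)
    dissim = (1.0 - cos_sim) / 2.0  # map similarity [-1, 1] to dissimilarity [0, 1]
    return alpha * residual + (1.0 - alpha) * dissim

# Toy 4x4 scene: pixel (0, 0) holds a "transient" object that differs in both
# appearance and features; everywhere else the render matches the ground truth.
render_img, gt_img = np.zeros((4, 4, 3)), np.zeros((4, 4, 3))
gt_img[0, 0] = 1.0
render_feat, gt_feat = np.ones((4, 4, 8)), np.ones((4, 4, 8))
gt_feat[0, 0] = -1.0
score = transient_score(render_img, gt_img, render_feat, gt_feat)
mask = score > 0.5  # thresholding yields a per-pixel transient mask
```

The intuition matches the paper's supervision design: a pixel that disagrees photometrically *and* semantically with the static render is flagged as transient, while pixels that differ only in appearance (e.g. lighting) score lower.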