Computer Graphics
SIGGRAPH Asia 2025 | Creating and Rendering High-Quality 3D Digital Humans with Just a Smartphone
机器之心· 2025-12-18 10:15
Core Insights
- The article discusses advances in 3D digital human reconstruction and rendering, focusing on the HRM²Avatar system developed by the Taobao Meta technology team, which allows high-fidelity, real-time 3D digital humans to be created using only a smartphone [4][5][6].

Group 1: Technology Overview
- HRM²Avatar is a system for high-fidelity, real-time 3D digital human reconstruction and rendering that combines a two-stage capture method, an explicit clothing mesh representation, and Gaussian-based dynamic detail modeling [12][36].
- The system reconstructs body shape, clothing structure, and detailed appearance from ordinary smartphone footage, balancing visual realism, cross-pose consistency, and real-time rendering on mobile devices [6][12].

Group 2: Methodology
- Capture consists of a static and a dynamic scanning phase: users hold a fixed pose for the static scan and perform natural movements for the dynamic scan, giving the system the signals it needs for reconstruction and dynamic modeling [18][28].
- The system employs a hybrid representation that attaches Gaussian points to the clothing mesh, providing controllable parameters for pose-dependent deformation and lighting modeling [40][46].

Group 3: Performance Evaluation
- On mobile hardware, HRM²Avatar sustains stable real-time performance with roughly 530,000 Gaussian points, reaching 2K resolution at 120 FPS on the iPhone 15 Pro Max and 2K at 90 FPS on Apple Vision Pro [87][89].
- Comparative evaluations show HRM²Avatar outperforms existing methods in static reconstruction quality and in appearance consistency under pose variation, as evidenced by higher PSNR and SSIM scores [76][80].
Group 4: Future Directions
- The article emphasizes the ongoing need for optimization, particularly for complex clothing structures and extreme lighting conditions, and frames HRM²Avatar as a significant milestone in making high-quality digital humans accessible to ordinary users [90].
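The hybrid representation described in Group 2, in which Gaussian points ride on a clothing mesh, can be sketched minimally: each Gaussian stores a parent triangle and barycentric weights, so it follows the mesh through any deformation. Everything below (function name, toy mesh, data layout) is a hypothetical illustration, not HRM²Avatar's actual code.

```python
import numpy as np

def attach_gaussians(face_idx, bary, faces, vertices):
    """Place each Gaussian at its barycentric point inside its parent triangle.

    face_idx : (G,)   index of the triangle each Gaussian is bound to
    bary     : (G, 3) barycentric weights inside that triangle (sum to 1)
    faces    : (F, 3) vertex indices per triangle
    vertices : (V, 3) current mesh vertex positions
    """
    corners = vertices[faces[face_idx]]          # (G, 3, 3) triangle corners
    return np.einsum("gc,gcd->gd", bary, corners)  # (G, 3) world positions

# Toy mesh: a single triangle carrying two Gaussians.
verts_rest = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
faces = np.array([[0, 1, 2]])
g_faces = np.array([0, 0])
g_bary = np.array([[1/3, 1/3, 1/3],   # triangle centroid
                   [0.5, 0.5, 0.0]])  # midpoint of edge v0-v1

p_rest = attach_gaussians(g_faces, g_bary, faces, verts_rest)

# Deform the mesh (stretch one vertex); the Gaussians follow automatically.
verts_posed = verts_rest.copy()
verts_posed[2] = [0.0, 2.0, 0.0]
p_posed = attach_gaussians(g_faces, g_bary, faces, verts_posed)
```

Because the binding is purely barycentric, any pose-driven mesh deformation transports the Gaussians for free; learned per-Gaussian offsets and appearance parameters would sit on top of this.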
SIGGRAPH 2025: Moore Threads Wins 3DGS Challenge Award, LiteGS Fully Open-Sourced
具身智能之心· 2025-12-18 00:07
Core Insights
- The article highlights Moore Threads' achievement at SIGGRAPH Asia 2025, where the company won a silver medal in the 3D Gaussian Splatting Reconstruction Challenge, showcasing its algorithm capabilities and hardware-software co-optimization in next-generation graphics rendering [1][17].

Group 1: 3D Gaussian Splatting Technology
- 3D Gaussian Splatting (3DGS), introduced in 2023, is a 3D scene representation and rendering technique that strikes a strong balance between image quality, efficiency, and resource usage, improving rendering efficiency by hundreds to thousands of times over traditional NeRF [4][8].
- The technique adapts and scales well to ray tracing, real-time VR/AR rendering, and multimodal fusion, making it a foundational technology for embodied AI, which requires high-quality, low-latency 3D environment modeling [7][8].

Group 2: Competition Details
- The 3DGS Reconstruction Challenge required participants to complete high-quality 3DGS reconstruction within 60 seconds from real-device video sequences with imperfect camera trajectories, stressing both reconstruction quality and speed [10][12].
- Evaluation metrics were PSNR (Peak Signal-to-Noise Ratio) for reconstruction quality and elapsed time, ensuring a fair and transparent ranking [12][14].

Group 3: Moore Threads' Performance
- Moore Threads' AI team, competing as "MT-AI," balanced reconstruction accuracy and efficiency, taking second place with an average PSNR of 27.58 and a reconstruction time of 34 seconds [17][21].
- The results were competitive: the top team achieved a PSNR of 28.43 with a reconstruction time of 57 seconds [18].
Group 4: LiteGS Library
- Moore Threads developed the LiteGS library, which optimizes the full pipeline from GPU systems to data management and algorithm design, achieving the 27.58 PSNR / 34-second result well ahead of many competitors [21][24].
- LiteGS achieves up to 10.8× training acceleration while reducing parameter count by more than 50%, demonstrating its engineering practicality and technological foresight [25][31].
- The library has been fully open-sourced on GitHub to promote collaborative development and the continued evolution of 3D reconstruction and rendering technology [27].
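PSNR, the challenge's quality metric, is straightforward to compute from the mean squared error between a rendered image and its ground truth. The sketch below is a generic illustration (the function name and toy images are assumptions, not LiteGS code).

```python
import numpy as np

def psnr(reference, rendered, max_val=1.0):
    """Peak Signal-to-Noise Ratio (dB) between two images valued in [0, max_val]."""
    mse = np.mean((np.asarray(reference, float) - np.asarray(rendered, float)) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# Toy check: additive Gaussian noise with std 0.01 lands near 40 dB.
rng = np.random.default_rng(0)
gt = rng.random((64, 64, 3))
noisy = np.clip(gt + rng.normal(0.0, 0.01, gt.shape), 0.0, 1.0)
score = psnr(gt, noisy)
```

As the logarithmic scale suggests, the gap between the winning 28.43 and Moore Threads' 27.58 corresponds to roughly a 22% difference in mean squared error, while the 34-second versus 57-second runtimes differ by a much larger factor.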
SIGGRAPH 2025 | CLR-Wire: Generatable, Interactive Curve Wireframes? Shenzhen University's VCC Shows You the Magic
机器之心· 2025-05-28 08:09
Core Insights
- The article discusses CLR-Wire, an innovative technology developed by a team at Shenzhen University that encodes complex 3D wireframe structures into a continuous latent space, addressing the challenge of capturing geometric and topological information jointly [1][5].

Group 1: Technology Overview
- CLR-Wire enables efficient generation and smooth interpolation of complex 3D structures, with applications in industrial design, 3D reconstruction, and content creation [1].
- The technology integrates geometric curves and topological structure into a single continuous latent space, enabling smooth transitions between different 3D wireframes [5][14].
- A multi-layer cross-attention mechanism encodes neural parametric curves and their discrete topological relationships into fixed-length latent vectors, with a variational autoencoder constructing the continuous latent distribution [8][14].

Group 2: Key Modules
- The CurveVAE module normalizes 3D geometric curves to stabilize training and uses cross-attention for dimensionality reduction, ultimately achieving continuous reconstruction of curves [13][14].
- The WireframeVAE module fuses curve latent vectors, vertex coordinates, and adjacency relationships into a global latent vector, ensuring efficient integration of geometric and topological information for high-quality reconstruction [15][17].
- The Flow Matching module generates wireframe samples from noise by training a velocity-field network, supporting both unconditional generation and conditional generation from point clouds or images [17][18].

Group 3: Performance Evaluation
- CLR-Wire outperforms existing wireframe-generation methods, demonstrating better coverage, smaller distribution gaps, and high fidelity in geometric detail [19][21].
- In unconditional generation, CLR-Wire shows significant advantages over methods such as 3DWire, DeepCAD, and BrepGen, particularly in generating diverse, detailed freeform wireframes [19][21].
- In conditional generation, it effectively reconstructs wireframes from sparse point clouds and single-view images, showcasing its robustness to incomplete data [24][26].

Group 4: Future Directions
- While CLR-Wire demonstrates smooth interpolation, further research is needed on controllable generation and editing [28].
- Future work may align the latent space more closely with textual descriptions to enable stronger semantic-driven control [28].
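The idea behind the Flow Matching module (train a velocity field so that integrating it carries noise samples to data samples) can be sketched on a toy 2-D "latent" distribution. The straight-line path and its velocity target x1 - x0 are the standard flow-matching construction; the least-squares linear-feature model standing in for a neural network, and all names below, are illustrative assumptions, not CLR-Wire's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "wireframe latents": data distribution N(mu, I) in 2-D.
mu = np.array([2.0, -1.0])
n = 50_000
x1 = mu + rng.standard_normal((n, 2))   # data samples
x0 = rng.standard_normal((n, 2))        # noise samples
t = rng.random((n, 1))

# Straight-line probability path and its conditional velocity target.
xt = (1 - t) * x0 + t * x1
target = x1 - x0

def feats(x, t):
    """Simple feature map: affine in x with t-dependent coefficients."""
    return np.concatenate([x, x * t, t, t**2, np.ones_like(t)], axis=1)

# Fit the velocity field by least squares (a stand-in for training a network).
W, *_ = np.linalg.lstsq(feats(xt, t), target, rcond=None)

# Sample: integrate dx/dt = v(x, t) from noise (t = 0) toward data (t = 1).
x = rng.standard_normal((2048, 2))
steps = 100
for k in range(steps):
    tk = np.full((len(x), 1), k / steps)
    x += (1 / steps) * (feats(x, tk) @ W)
```

A real system would replace the linear model with a network conditioned on point-cloud or image features, which is what enables CLR-Wire's conditional generation.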