ArtGS: 3DGS Enables Precise Manipulation of Articulated Objects, with SOTA Performance Validated in Both Simulation and the Real World!

Group 1
- A core challenge in robotics is articulated-object manipulation: such objects impose complex kinematic constraints, and existing methods have limited physical reasoning capabilities [3][4]
- The proposed ArtGS framework integrates 3D Gaussian Splatting (3DGS) with visual-physical modeling to improve understanding of and interaction with articulated objects while enforcing physically consistent motion constraints [3][4][20]
- ArtGS consists of three key modules: static 3D Gaussian reconstruction, VLM-based skeletal inference, and dynamic 3D Gaussian articulation modeling [4]

Group 2
- Static 3D Gaussian reconstruction uses 3DGS to build a high-fidelity 3D scene from multi-view RGB-D images, representing the scene as a collection of 3D Gaussians [5]
- VLM-based skeletal inference uses a fine-tuned vision-language model (VLM) to estimate joint parameters, generating target views to assist visual question answering [6][8]
- Dynamic 3D Gaussian articulation modeling implements impedance control for interaction with the environment and optimizes joint parameters through differentiable rendering [10]

Group 3
- Experimental validation shows that ArtGS significantly outperforms baseline methods in joint parameter estimation, with lower angular error (AE) and origin error (OE) [12]
- In simulation, ArtGS achieves manipulation success rates ranging from 62.4% to 90.3%, substantially higher than other methods such as TD3 and Where2Act [14]
- In real-world experiments, the optimized version of ArtGS succeeds in 10/10 drawer operations and 9/10 cabinet operations [14][17]

Group 4
- Ablation studies show that even when the initial axis estimation error exceeds 20°, ArtGS can still raise manipulation success rates through 3DGS optimization [19]
- ArtGS exhibits cross-embodiment adaptability, accurately reconstructing various robotic arms and rendering gripper details particularly well [19][20]
- The core contribution of ArtGS lies in turning 3DGS into a visual-physical model for articulated objects, ensuring spatiotemporal consistency in differentiable manipulation trajectories [20]

Group 5
- Future directions for ArtGS include handling more complex scenarios and improving the modeling and manipulation of multi-joint, highly dynamic targets [21]
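
The angular error (AE) and origin error (OE) reported under Group 3 are not defined in this summary. The sketch below shows one common way such joint-axis metrics are computed, under stated assumptions: AE is taken as the sign-invariant angle between the predicted and ground-truth axis directions, and OE as the perpendicular distance from the predicted pivot point to the ground-truth axis line. The function names are hypothetical, and ArtGS's actual evaluation code may differ.

```python
import math


def _normalize(v):
    """Return v scaled to unit length; reject the zero vector."""
    n = math.sqrt(sum(c * c for c in v))
    if n == 0.0:
        raise ValueError("axis direction must be non-zero")
    return tuple(c / n for c in v)


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def angular_error_deg(pred_dir, gt_dir):
    """Assumed AE: angle in degrees between two axis directions.

    The absolute value of the dot product makes the metric ignore the
    sign of the axis, since d and -d describe the same line.
    """
    p, g = _normalize(pred_dir), _normalize(gt_dir)
    cos_angle = min(1.0, abs(_dot(p, g)))  # clamp against rounding error
    return math.degrees(math.acos(cos_angle))


def origin_error(pred_origin, gt_origin, gt_dir):
    """Assumed OE: distance from the predicted pivot to the true axis line."""
    g = _normalize(gt_dir)
    offset = tuple(p - o for p, o in zip(pred_origin, gt_origin))
    perpendicular = _cross(offset, g)  # |offset x g| = distance to the line
    return math.sqrt(_dot(perpendicular, perpendicular))
```

Measuring OE against the axis line rather than a single point reflects that any point along a revolute axis is an equally valid pivot; a point-to-point distance would penalize predictions that slide along the correct axis.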
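
Group 2 mentions impedance control for interaction with the environment but gives no controller details. As a rough illustration only, the sketch below uses the generic textbook form of a one-dimensional impedance law, where the commanded force behaves like a virtual spring and damper pulling the end effector toward a target. The gains, mass, time step, and function names are all assumptions chosen for the example, not values or APIs from ArtGS.

```python
def impedance_force(x_des, x, v_des, v, stiffness, damping):
    """Textbook impedance law: F = K * (x_des - x) + D * (v_des - v)."""
    return stiffness * (x_des - x) + damping * (v_des - v)


def simulate(x0, x_des, stiffness=100.0, damping=20.0, mass=1.0,
             dt=0.001, steps=5000):
    """Integrate a point mass driven by the impedance force.

    Uses semi-implicit Euler: velocity is updated first, then position
    is advanced with the new velocity. Default gains are hypothetical
    and chosen so the system is critically damped.
    """
    x, v = x0, 0.0
    for _ in range(steps):
        f = impedance_force(x_des, x, 0.0, v, stiffness, damping)
        v += (f / mass) * dt
        x += v * dt
    return x
```

Calling `simulate(0.0, 0.3)` drives the simulated end effector from rest toward a target 0.3 m away. Lowering `stiffness` makes the contact more compliant, which is the usual reason to prefer impedance control over stiff position control when pulling a drawer or door along a constrained path.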