Embodied Synthetic Data
Still don't know which direction to publish a paper in? Others have already submitted to CCF-A...
具身智能之心· 2025-06-18 03:03
Group 1
- The article announces a mentoring program for students aiming to publish papers at top conferences such as CVPR and ICRA, building on last year's successful outcomes [1]
- The mentoring directions include multimodal large models, VLA, robot navigation, robot grasping, embodied generalization, embodied synthetic data, end-to-end embodied intelligence, and 3DGS [2]
- The mentors have published papers at top conferences such as CVPR, ICCV, ECCV, ICLR, RSS, ICML, and ICRA, which indicates extensive mentoring experience [3]

Group 2
- Students are required to submit a resume and must come from a domestic top 100 university or an international university ranked within the QS 200 [4][5]
In Depth | The Battle of Routes for Embodied Synthetic Data: Who Will Break Out of the Predicament First?
Z Potentials· 2025-04-08 12:30
Core Viewpoint
- The article discusses the competition between two main technical routes for embodied synthetic data: "Video Synthesis + 3D Reconstruction" and "End-to-End 3D Generation" [1][49].

Group 1: Challenges in Embodied Intelligence
- Robots' physical capabilities have advanced faster than their cognitive abilities, which leaves them struggling in unfamiliar environments [3].
- Embodied intelligence requires an integrated ability of perception, reasoning, and decision-making, which depends on a clear understanding of spatial structures [4].
- Current AI advancements are hindered by a lack of high-quality spatial data, which is essential for effective cognitive functioning [5].

Group 2: Data Dilemma
- Existing data for embodied intelligence is scarce. It falls into three types: real scanned data, game engine environments, and open-source synthetic datasets, all of which have significant limitations [6].
- Every home has its own layout and usage patterns, so collecting comprehensive training data is difficult and traditional data collection methods are impractical [8].

Group 3: Technical Routes
- The two main technical paths for synthetic data generation are:
  1. Video Synthesis + 3D Reconstruction: This method generates video or images first, then reconstructs them into 3D data. It faces issues with accuracy and physical consistency [11][13].
  2. End-to-End 3D Generation: This approach directly synthesizes structured spatial data using techniques such as Graph Neural Networks (GNNs) and diffusion models, but struggles with generating high-quality outputs [22][39].

Group 4: Innovations in 3D Generation
- New methods such as "modal encoding" aim to integrate design knowledge into the generation process, enhancing the model's ability to create reasonable spatial structures [2][44].
- The Sengine SimHub framework incorporates training processes that improve the stability and adaptability of the generated data, aligning it more closely with real-world logic and semantics [45][48].

Group 5: Future Directions
- The industry faces a "data drought" compared with the more established data loops in autonomous driving, which calls for innovative approaches to spatial understanding and generation [49].
- The future of embodied intelligence may hinge on how spatial concepts are defined and understood, emphasizing the need for a system that embeds rules and preferences into spatial data generation [50].