Segmenting Everything Is Not Enough; We Also Need to Reconstruct Everything in 3D: SAM 3D Is Here
具身智能之心· 2025-11-21 00:04
Edited by 机器之心. Late at night, Meta shipped a major update, releasing SAM 3D and SAM 3 (Segment Anything Model, SAM) back to back. SAM 3D, the newest member of the SAM series, makes 3D understanding of images broadly accessible and comprises two models:

- SAM 3D Objects: object and scene reconstruction
- SAM 3D Body: human shape and pose estimation

Both models deliver strong, stable SOTA (state-of-the-art) performance, turning static 2D images into detailed 3D reconstructions. SAM 3 can detect, segment, and track objects in images and videos via text, exemplar, and visual prompts. As part of this release, Meta has open-sourced the model weights and inference code for SAM 3D and SAM 3. Meta has also launched a new platform, Segment Anything Playground, where users can easily try out the capabilities of SAM 3D and SAM 3. ...
Meta's "Segment Anything" Enters the 3D Era! Image Segmentation Results Convert Directly to 3D, Even Recovering Occluded Regions
量子位· 2025-11-20 07:01
Core Viewpoint
- Meta's new 3D modeling paradigm allows direct conversion of image segmentation results into 3D models, enhancing the capabilities of 3D reconstruction from 2D images [1][4][8].

Summary by Sections

3D Reconstruction Models
- Meta's MSL lab has released SAM 3D, which includes two models: SAM 3D Objects for object and scene reconstruction, and SAM 3D Body focused on human modeling [4][8].
- SAM 3D Objects can reconstruct 3D models and estimate object poses from a single natural image, overcoming challenges like occlusion and small objects [10][11].
- SAM 3D Objects outperforms existing methods, achieving a win rate at least five times higher than leading models in direct user comparisons [13][14].

Performance Metrics
- SAM 3D Objects shows significant performance improvements in 3D shape and scene reconstruction, with metrics such as an F1 score of 0.2339 and a 3D IoU of 0.4254 [15].
- SAM 3D Body also achieves state-of-the-art (SOTA) results in human modeling, with an MPJPE of 61.7 and a PCK of 75.4 across various datasets [18].

Semantic Understanding
- SAM 3 introduces a concept segmentation feature that allows flexible object segmentation based on user-defined prompts, overcoming the limitations of fixed label sets [21][23].
- The model can identify and segment objects based on textual descriptions or selected examples, significantly enhancing its usability [26][31].

Benchmarking and Results
- SAM 3 has set a new SOTA in promptable segmentation tasks, achieving an accuracy of 47.0% in zero-shot segmentation on the LVIS dataset, surpassing the previous SOTA of 38.5% [37].
- On the new SA-Co benchmark, SAM 3's performance is at least twice as strong as baseline methods [38].

Technical Architecture
- SAM 3's architecture is built on a shared Perception Encoder, which improves consistency and efficiency of feature extraction for both detection and tracking tasks [41][43].
- SAM 3D Objects employs a two-stage generative approach, utilizing a 1.2-billion-parameter flow-matching transformer for geometric predictions [49][50].
- SAM 3D Body utilizes a unique Momentum Human Rig representation to decouple skeletal pose from body shape, enhancing detail in human modeling [55][60].
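The pose metrics cited above are standard and easy to state precisely: MPJPE (mean per-joint position error) is the average Euclidean distance between predicted and ground-truth 3D joints, and PCK (percentage of correct keypoints) is the share of joints whose error falls under a distance threshold. A minimal sketch of both definitions; the joint layout and the 150 mm threshold here are illustrative, not the papers' evaluation protocol:

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean per-joint position error: average Euclidean distance
    between predicted and ground-truth joints (same units as input)."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

def pck(pred, gt, threshold=150.0):
    """Percentage of correct keypoints: share of joints whose error
    is below the threshold (e.g. 150 mm), reported in percent."""
    errors = np.linalg.norm(pred - gt, axis=-1)
    return float((errors < threshold).mean() * 100.0)

# deterministic toy check: 2 joints, one off by 100 mm, one by 200 mm
gt = np.array([[0.0, 0.0, 0.0], [1000.0, 0.0, 0.0]])
pred = np.array([[100.0, 0.0, 0.0], [1200.0, 0.0, 0.0]])
print(mpjpe(pred, gt))   # 150.0
print(pck(pred, gt))     # 50.0 (only the first joint is within 150 mm)
```

Lower MPJPE and higher PCK are better, which is why the reported MPJPE of 61.7 alongside a PCK of 75.4 indicates strong performance.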
Segmenting Everything Is Not Enough; We Also Need to Reconstruct Everything in 3D: SAM 3D Is Here
机器之心· 2025-11-20 02:07
Core Insights
- Meta has launched significant updates with the introduction of SAM 3D and SAM 3, enhancing the understanding of images in 3D [1][2]

Group 1: SAM 3D Overview
- SAM 3D is the latest addition to the SAM series, featuring two models that convert static 2D images into detailed 3D reconstructions [2][5]
- SAM 3D Objects focuses on object and scene reconstruction, while SAM 3D Body specializes in human shape and pose estimation [5][28]
- Meta has made the model weights and inference code for SAM 3D and SAM 3 publicly available [7]

Group 2: SAM 3D Objects
- SAM 3D Objects introduces a novel technical approach for robust and realistic 3D reconstruction and object pose estimation from a single natural image [11]
- The model can generate detailed 3D shapes, textures, and scene layouts from everyday photos, overcoming challenges like small objects and occlusions [12][13]
- Meta has annotated nearly 1 million images, generating approximately 3.14 million 3D meshes, leveraging a scalable data engine for efficient data collection [17][22]

Group 3: SAM 3D Body
- SAM 3D Body addresses the challenge of accurate 3D human pose and shape reconstruction from a single image, even in complex scenarios [28]
- The model supports interactive input, allowing users to guide and control predictions for improved accuracy [29]
- A high-quality training dataset of around 8 million images was created to enhance the model's performance across various 3D benchmarks [31]

Group 4: SAM 3 Capabilities
- SAM 3 introduces promptable concept segmentation, enabling the model to identify and segment instances of specific concepts based on text or example images [35]
- The architecture of SAM 3 builds on previous AI advancements, utilizing Meta's Perception Encoder for enhanced image recognition and object detection [37]
- SAM 3 has achieved a twofold improvement in concept segmentation performance compared to existing models, with rapid inference times even for images with numerous detection targets [39]
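To make "promptable concept segmentation" concrete: instead of looking a class up in a fixed label list, the prompt (text or exemplar) is embedded and matched against every candidate instance in the image, and all sufficiently similar instances are kept. The sketch below is a toy illustration of that open-vocabulary matching step with made-up embeddings; the `segment_by_concept` helper and the 0.5 threshold are hypothetical and are not SAM 3's actual API:

```python
import numpy as np

def segment_by_concept(mask_embeddings, prompt_embedding, threshold=0.5):
    """Toy open-vocabulary filter (illustrative, not SAM 3's API):
    keep every candidate mask whose embedding is cosine-similar to
    the prompt embedding -- no fixed label set is involved."""
    m = mask_embeddings / np.linalg.norm(mask_embeddings, axis=1, keepdims=True)
    p = prompt_embedding / np.linalg.norm(prompt_embedding)
    scores = m @ p                        # cosine similarity per candidate
    return [i for i, s in enumerate(scores) if s >= threshold]

# three candidate masks; the prompt embedding points along the first axis,
# so candidates 0 and 2 match while candidate 1 does not
masks = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]])
prompt = np.array([1.0, 0.0])
print(segment_by_concept(masks, prompt))  # [0, 2]
```

Because every instance above the threshold is returned, a single prompt like "striped cat" can yield all matching instances at once, which is the behavior the digest describes.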