Video Diffusion Models
SIGGRAPH Asia 2025 | Recovering 200 FPS Detail from Ordinary 30 FPS Cameras: A 4D Reconstruction Solution Arrives
机器之心· 2025-12-14 04:53
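Hardware innovation: asynchronous capture lets cameras "stagger their shots". The first author of this paper, Yutian Chen (陈羽田), is a second-year PhD student at MMLab, The Chinese University of Hong Kong, working on 3D reconstruction and generation under Professor Tianfan Xue (薛天帆). Homepage: https://yutian10.github.io
When a robe in a period drama flares into a striking arc for 0.01 seconds as a martial artist flips through the air, when a VR player wants to reach out and grab an opponent's sword blade "frozen in mid-air", when the crown-shaped splash of a milk drop in a viral TikTok video needs to be replayed from every angle in 360°: how to use ordinary cameras to "freeze" the fleeting high-speed world into a digital 4D space-time that can be repeatedly dissected, transmitted, and interacted with has become a hard problem in 3D vision.
However, limited by hardware cost and data transmission bandwidth, the vast majority of today's 4D capture arrays top out at only about 30 FPS, whereas traditional high-speed photography usually requires 120 FPS or more. Simply upgrading the camera hardware is not only expensive but also brings exponentially growing data throughput, which makes large-scale deployment impractical. Another line of attack is to "interpolate frames" at the reconstruction stage. Recent dynamic scene reconstruction methods such as 4D Gaussian Splatting can synthesize continuous frames from sparse temporal input for simple motion, effectively raising the frame rate, but for nonlinear, complex motion such as swaying cloth or high-speed rotation, the intermediate ...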
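The excerpt above is cut off before it describes the capture scheme, so the following is only a minimal sketch of the general idea behind staggered ("asynchronous") triggering, not the paper's actual setup. It assumes that cameras are split into groups whose start times are evenly offset within one frame period; the function names, the even spacing, and the numbers (30 FPS cameras, 4 groups) are all hypothetical.

```python
from fractions import Fraction


def staggered_offsets(base_fps: int, num_groups: int) -> list[Fraction]:
    """Start-time offset in seconds for each camera group.

    Assumption: offsets are spread evenly across one frame period.
    """
    period = Fraction(1, base_fps)
    return [period * g / num_groups for g in range(num_groups)]


def merged_timestamps(base_fps: int, num_groups: int,
                      frames_per_group: int) -> list[Fraction]:
    """All capture times across groups, merged into one sorted timeline."""
    period = Fraction(1, base_fps)
    stamps = [
        offset + k * period
        for offset in staggered_offsets(base_fps, num_groups)
        for k in range(frames_per_group)
    ]
    return sorted(stamps)


# Illustrative numbers only: 30 FPS cameras in 4 staggered groups.
stamps = merged_timestamps(base_fps=30, num_groups=4, frames_per_group=2)
effective_fps = 1 / (stamps[1] - stamps[0])
print(effective_fps)  # prints 120
```

Under these assumptions the array as a whole samples time at `base_fps * num_groups`, but each individual viewpoint still sees the scene only at the base rate, so every time step is covered by just a subset of the views. A reconstruction stage would have to fill in the views that are missing at each step.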
Can Models with Any Skeleton Be Animated? AnimaX Proposes a New World-Model-Based Paradigm for 3D Animation Generation
机器之心· 2025-09-06 03:14
Core Viewpoint
- The article discusses the development of AnimaX, an efficient feed-forward 3D animation generation framework that supports arbitrary skeletal topologies while combining the diversity of video priors with the controllability of skeletal animation [2][8].
Group 1: Limitations of Traditional Methods
- Traditional 3D animation relies on rigging and keyframe design, which provides high quality and control but requires significant human labor and time [11].
- Existing methods based on diffusion or autoregressive models trained on motion capture data are limited to fixed skeletal topologies and focus primarily on humanoid motion, making them difficult to generalize to a wider range of character types [3][11].
- Video generation models can produce diverse dynamic sequences but often depend on optimizing high-degree-of-freedom 3D deformation fields, leading to high computational cost and unstable results [3][11].
Group 2: AnimaX Framework
- AnimaX integrates motion priors from video diffusion models with the low-degree-of-freedom control of skeletal animation, representing 3D motion as multi-view, multi-frame 2D pose maps [5][12].
- The framework employs a joint video-pose diffusion model that simultaneously generates RGB videos and the corresponding pose sequences, achieving spatiotemporal alignment through shared positional encodings and modality-specific embeddings [5][12][14].
- AnimaX can generate natural, coherent animation videos for various categories of 3D meshes, including humanoid characters, animals, and mechanical structures, completing animation sequence generation in minutes while maintaining motion diversity and realism [9][10].
Group 3: Performance and Comparisons
- AnimaX has been compared quantitatively and qualitatively with several leading open-source models, demonstrating superior results across multiple metrics, particularly in appearance quality [18][21].
- In user preference tests, AnimaX achieved the highest preference rates across all evaluated aspects, including motion-text matching, shape consistency, and overall motion quality [24].
- The model's design allows robust transfer of motion priors from video diffusion models to skeleton-driven 3D animation synthesis, showing its advantages over existing methods [21][24].
Group 4: Future Prospects
- The AnimaX research team suggests that the method can be extended beyond skeletal animation to scene-level dynamic modeling, potentially advancing broader 4D content generation [30].
- Future developments may involve integrating long-sequence video generation to enhance the continuity and detail fidelity of long-range animations, supporting more complex and richer 3D animation generation [30].
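To make the "multi-view, multi-frame 2D pose maps" representation from Group 2 more concrete, here is a minimal sketch of how a 3D joint sequence could be flattened into per-view 2D joint positions. It is an illustration under stated assumptions, not AnimaX's implementation: the orthographic cameras orbiting the vertical axis, the azimuth convention, and the names `project_orthographic` and `pose_maps` are all hypothetical, and real pose maps would be rendered as images rather than kept as coordinate lists.

```python
import math

Joint3D = tuple[float, float, float]
Point2D = tuple[float, float]


def project_orthographic(joint: Joint3D, azimuth_deg: float) -> Point2D:
    """Project one 3D joint into a camera rotated about the vertical (y) axis.

    Assumption: orthographic camera; azimuth 0 looks along -z, so u equals x.
    """
    x, y, z = joint
    theta = math.radians(azimuth_deg)
    u = x * math.cos(theta) + z * math.sin(theta)
    return (u, y)


def pose_maps(motion: list[list[Joint3D]],
              azimuths_deg: list[float]) -> list[list[list[Point2D]]]:
    """Return maps[view][frame][joint] = (u, v) for every view and frame."""
    return [
        [[project_orthographic(j, az) for j in frame] for frame in motion]
        for az in azimuths_deg
    ]


# Hypothetical two-joint "bone" moving over two frames.
motion = [
    [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)],  # frame 0
    [(0.0, 0.0, 0.0), (1.5, 2.0, 2.5)],  # frame 1
]
maps = pose_maps(motion, azimuths_deg=[0.0, 90.0])
front_u, front_v = maps[0][0][1]  # front view sees (x, y)
side_u, side_v = maps[1][0][1]    # side view sees (z, y)
```

Because the front view keeps `(x, y)` and the side view keeps `(z, y)`, two such views already determine each joint's 3D position in this toy setup, which hints at why a set of consistent 2D pose sequences can stand in for 3D skeletal motion.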