Core Viewpoint
- Alibaba has open-sourced a film-level video generation model named Wan2.2, which can generate high-definition videos of 5 seconds in length [1]

Group 1: Model Details
- The Wan2.2 release includes three variants: text-to-video (Wan2.2-T2V-A14B), image-to-video (Wan2.2-I2V-A14B), and unified video generation (Wan2.2-TI2V-5B) [1]
- Both the text-to-video and image-to-video models are the first in the industry to use an MoE (Mixture of Experts) architecture for video generation [1]
- The total parameter count is 27 billion, with 14 billion active parameters, split between a high-noise expert model and a low-noise expert model [1]

Group 2: Efficiency and Resource Consumption
- At the same parameter scale, the design saves approximately 50% of computational resource consumption [1]
- The high-noise expert model is responsible for the overall layout of the video, while the low-noise expert model focuses on detail enhancement [1]
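The expert split described above can be illustrated with a minimal sketch: in a diffusion-style MoE, the denoiser routes each step to one expert based on the current noise level, so only one 14B expert is active per step even though the total parameter count is 27B. The function and expert names, and the 0.5 boundary, are illustrative assumptions, not Wan2.2's actual implementation.

```python
def route_expert(t: float, boundary: float = 0.5) -> str:
    """Pick an expert by the current noise level t in [0, 1].

    t near 1.0 = heavily noised latent (early steps, overall layout);
    t near 0.0 = nearly clean latent (late steps, detail refinement).
    The boundary value is a hypothetical choice for this sketch.
    """
    return "high_noise_expert" if t >= boundary else "low_noise_expert"

# Walk a denoising schedule from pure noise (t=1.0) toward t=0.0.
schedule = [1.0, 0.8, 0.6, 0.4, 0.2, 0.0]
for t in schedule:
    print(t, route_expert(t))
```

Because only one expert's weights are loaded into the forward pass at each step, per-step compute matches a 14B dense model, which is the intuition behind the roughly 50% resource saving at the same total parameter scale.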
Alibaba open-sources film-level video generation model Tongyi Wanxiang 2.2 (Wan2.2)