Tencent Yuanbao Launches AI Video Generation Capability

Core Insights
- Tencent's HunyuanVideo 1.5, a lightweight video generation model based on the Diffusion Transformer architecture, has been officially released and open-sourced, featuring 8.3 billion parameters and the capability to generate 5-10 seconds of high-definition video [1][2]

Group 1: Model Capabilities
- HunyuanVideo 1.5 supports both Chinese and English input for text-to-video and image-to-video generation, showcasing high consistency between images and videos [2][3]
- The model demonstrates strong instruction understanding and adherence, enabling it to execute diverse scenarios such as camera movements, smooth motion, realistic characters, and emotional expressions [2]
- It supports various styles including realistic, animation, and block-based, and can generate text in both Chinese and English within the videos [2]

Group 2: Video Quality
- The model can natively generate 5-10 seconds of high-definition video at 480p and 720p, with the option to enhance quality to 1080p cinematic level through a super-resolution model [2]

Group 3: Performance Comparison
- In the T2V (Text-to-Video) task, HunyuanVideo outperformed several comparison models, with a winning margin of up to 17.12% against models like Wan2.2 [4]
- In the I2V (Image-to-Video) task, HunyuanVideo also showed competitive results, achieving a winning margin of 12.65% against Wan2.2 [4]