Virtual Human Video Generation
Meituan Launches Virtual Human Video Generation Model LongCat-Video-Avatar
Bei Jing Shang Bao· 2025-12-18 13:03
Core Viewpoint
- Meituan has released and open-sourced LongCat-Video-Avatar, a virtual human video generation model that improves action realism, long-video stability, and identity consistency over its predecessor [1]

Group 1: Product Development
- LongCat-Video-Avatar is an upgraded model built on LongCat-Video, focused on optimizing the core pain points of practical deployment [1]
- The model was developed in response to needs identified in the InfiniteTalk and LongCat-Video projects [1]

Group 2: Community Engagement
- In August, Meituan's open-sourced InfiniteTalk project drew participation from hundreds of thousands of developers worldwide [1]
- The LongCat team open-sourced the LongCat-Video model at the end of October, emphasizing long-video generation capabilities [1]
Fund Flows | Southbound Funds Sweep Up Over HK$900 Million of Xiaomi in a Single Day, Trim CNOOC for a Sixth Consecutive Day
Ge Long Hui APP · 2025-12-18 12:23
Group 1
- On December 18, southbound funds net bought 1.257 billion HKD of Hong Kong stocks, with heavy purchases of Xiaomi Group (903 million HKD) and Meituan (434 million HKD) [1]
- Xiaomi Group has seen net buying for 15 consecutive days, totaling 14.75053 billion HKD, while Meituan has seen net buying for 7 days, totaling 5.99241 billion HKD [1]
- Major net selling was observed in the Tracker Fund of Hong Kong (14.22 billion HKD) and China Mobile (12.94 billion HKD) [1]

Group 2
- Xiaomi Group announced a buyback of 3.75 million shares for roughly 151 million HKD at a price range of 40.12-40.24 HKD per share [3]
- Meituan's LongCat team launched an open-source SOTA-level virtual human video generation model, improving video generation capability and stability [3]
- Tencent Holdings repurchased shares worth approximately 640 million HKD; Goldman Sachs maintains a "buy" rating with a target price of 770 HKD, citing Tencent Cloud's expansion into key international markets [3]
Meituan's LongCat-Video-Avatar Released and Open-Sourced, with a Focus on Improved Action Realism
Feng Huang Wang· 2025-12-18 10:19
Core Viewpoint
- Meituan's LongCat team has officially released and open-sourced the LongCat-Video-Avatar model, enhancing its virtual human video generation capabilities [1]

Group 1: Model Features
- LongCat-Video-Avatar is built on the previously open-sourced LongCat-Video base and supports video generation from audio, text, or images, along with video continuation [1]
- The model significantly improves action realism, long-video generation stability, and identity consistency [1]

Group 2: Technical Innovations
- A "decoupled unconditional guidance" technique lets the virtual human exhibit natural states, such as blinking and posture adjustments, during pauses in speech [1]
- To counter the quality degradation common in long-video generation, the team introduced a "cross-segment latent space stitching" strategy that avoids the cumulative error of repeatedly encoding and decoding between segments; the team claims it can generate videos up to 5 minutes long while keeping visuals stable [1]

Group 3: Performance Metrics
- For identity consistency, the model uses reference-frame injection with positional encoding and a "reference jump attention" mechanism to preserve character traits while reducing motion stiffness [1]
- The model achieves advanced levels of lip-sync accuracy and consistency metrics on public benchmarks such as HDTF and CelebV-HQ, and shows leading performance in comprehensive tests covering commercial promotion and education scenarios [1]
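The article describes cross-segment latent stitching only at a high level: conditioning the next segment on the previous segment's latents directly, rather than decoding to pixels and re-encoding, avoids compounding reconstruction error. A minimal toy sketch of that intuition, assuming each encode/decode round trip injects a small reconstruction error (the noise model and all names here are illustrative, not LongCat's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
SIGMA = 0.05  # per-round-trip reconstruction error of the toy codec

def encode(x, rng):
    # Toy lossy encoder: identity plus reconstruction noise,
    # standing in for a VAE's encode/decode error.
    return x + rng.standard_normal(x.shape) * SIGMA

def decode(z):
    return z  # identity decoder; the loss is modeled in encode()

def generate_segment(latent):
    # Stand-in for the generation step: it carries the conditioning
    # latent forward unchanged so only handoff error is measured.
    return latent

x0 = rng.standard_normal(256)   # "ground truth" content to preserve
n_segments = 20

# Strategy A: pixel-space handoff (decode, then re-encode each segment).
z = encode(x0, rng)
for _ in range(n_segments):
    frames = decode(generate_segment(z))
    z = encode(frames, rng)      # each re-encode injects fresh error
err_pixel = np.abs(decode(z) - x0).mean()

# Strategy B: latent-space stitching (condition directly on latents).
z = encode(x0, rng)              # encode exactly once
for _ in range(n_segments):
    z = generate_segment(z)      # stay in latent space across segments
err_latent = np.abs(decode(z) - x0).mean()

print(f"pixel handoff drift:  {err_pixel:.3f}")
print(f"latent stitch drift:  {err_latent:.3f}")
```

In this toy model, the pixel-space handoff accumulates one unit of codec noise per segment, while latent stitching pays the encoding cost only once, which is the failure mode the stitching strategy is said to address.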
Meituan Releases and Open-Sources SOTA-Level Virtual Human Video Generation Model LongCat-Video-Avatar
Di Yi Cai Jing· 2025-12-18 09:17
Group 1
- The core news is the official release and open-sourcing of the SOTA-level virtual human video generation model LongCat-Video-Avatar by Meituan's LongCat team [2]
- The model is built on the LongCat-Video foundation and continues the core design of "one model supports multiple tasks" [2]
- LongCat-Video-Avatar natively supports key functionalities such as Audio-Text-to-Video (AT2V), Audio-Text-Image-to-Video (ATI2V), and video continuation [2]

Group 2
- The underlying architecture has been comprehensively upgraded, achieving significant breakthroughs in three dimensions: action realism, long-video stability, and identity consistency [2]
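The "one model supports multiple tasks" design can be pictured as a single entry point that routes a request to AT2V, ATI2V, or video continuation based on which conditioning inputs are supplied. A hypothetical sketch of that dispatch (the names `GenRequest` and `route_task` are illustrative; LongCat-Video-Avatar's actual API may differ):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class GenRequest:
    """Conditioning inputs for one generation call (all optional)."""
    audio: Optional[bytes] = None
    text: Optional[str] = None
    image: Optional[bytes] = None
    video: Optional[bytes] = None  # prior clip, for continuation

def route_task(req: GenRequest) -> str:
    """Select the generation task from the supplied conditions."""
    if req.video is not None:
        return "video_continuation"   # extend an existing clip
    if req.audio is not None and req.image is not None:
        return "ATI2V"                # Audio-Text-Image-to-Video
    if req.audio is not None:
        return "AT2V"                 # Audio-Text-to-Video
    raise ValueError(
        "need audio (plus optional text/image) or a prior video clip"
    )

print(route_task(GenRequest(audio=b"wav", text="hello")))        # AT2V
print(route_task(GenRequest(audio=b"wav", image=b"portrait")))   # ATI2V
print(route_task(GenRequest(video=b"clip")))                     # video_continuation
```

The point of such a scheme is that one checkpoint serves all three tasks, with the task implied by the conditioning signals rather than by separate models.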