MEITUAN - Meituan Releases and Open-Sources LongCat-Video-Avatar, a SOTA-Level Virtual Human Video Generation Model

Group 1
- Meituan's LongCat team has officially released and open-sourced LongCat-Video-Avatar, a SOTA-level virtual human video generation model [2]
- The model is built on the LongCat-Video foundation and retains its core design of "one model supports multiple tasks" [2]
- LongCat-Video-Avatar natively supports Audio-Text-to-Video (AT2V), Audio-Text-Image-to-Video (ATI2V), and video continuation [2]

Group 2
- The model's underlying architecture has been comprehensively upgraded, achieving significant breakthroughs in three dimensions: action realism, long-video stability, and identity consistency [2]