Image-to-Video
Tencent Yuanbao Gets a Major Update: Generate a Video from a Single Sentence
Xin Lang Ke Ji· 2025-11-21 04:42
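Sina Tech, afternoon of November 21: Yuanbao today officially announced its "one-sentence video generation" capability, offering users a new experience in which "everyone is a video creator." Users need no video-editing background: with Yuanbao they can quickly turn a one-sentence idea, or a still photo on their phone, into a vivid video. The capability is built on HunyuanVideo 1.5, the latest model open-sourced by Tencent Hunyuan.
After updating Yuanbao to the latest version, users can try the feature in the chat box in two ways. The first is "text-to-video": entering a text description (prompt) in the chat box, such as "a cat strolling through a cyberpunk city," turns the written idea into a video. The second is "image-to-video": uploading a photo from the phone together with a simple instruction makes the still image "move," whether that means setting the clouds in a landscape photo drifting or adding playful motion to a pet photo.
According to the report, the feature is powered by the open-source HunyuanVideo 1.5, which supports text-to-video and image-to-video in both Chinese and English, keeps the generated video highly consistent with the input image in color tone and detail, and accurately follows a range of instructions such as camera movement and smooth motion. At a lightweight 8.3B parameters, the model is said to deliver the strongest results among open-source models and can run smoothly on a consumer-grade GPU with 14 GB of VRAM. (Luo Ning) Editor in charge: Wang Xiang ...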
Shengshu Technology Releases Image-to-Video Model Vidu Q2
Jing Ji Guan Cha Wang· 2025-09-25 05:58
Core Insights: Shengshu Technology has officially launched Vidu Q2, its next-generation image-to-video model, with improvements in facial expression changes, camera movement, generation speed, and semantic understanding [1].
Group 1: Product Features
- Vidu Q2 supports image-to-video generation, video generation from a first and last frame, and selectable durations from 2 to 8 seconds [1].
- The model offers two modes: a cinematic "blockbuster" mode and a "lightning-fast" production mode [1].
Side-by-Side Review of 9 Image-to-Video Models: Which Can Shoot an Ad, and Which Are Still Just Dabbling?
锦秋集· 2025-09-01 04:32
Core Viewpoint: The article evaluates nine representative image-to-video AI models, highlighting their advances and their persistent problems with semantic understanding and logical coherence in video generation [2][7][50].
Group 1: Evaluation of AI Models
- Nine models were tested, including Google Veo3, Kuaishou Kling 2.1, and Baidu Steam Engine 2.0, covering both newly launched and mature products [7][8].
- The evaluation focused on real-world creative scenarios, assessing the models on image quality, action organization, style continuity, and overall usability [9][14].
- Testing took place in August 2025, with a standardized prompt and identical conditions for all models to ensure comparability [13][9].
Group 2: User Perspectives
- Young users who are not professional video creators expressed a need for easy-to-use tools that can help with everyday content creation [3][4].
- The evaluation was conducted from a practical and aesthetic perspective and reflects a generally positive attitude toward AI products [5].
Group 3: Performance Metrics
- The models were assessed on three main criteria: semantic adherence, physical realism, and visual expressiveness [14][21].
- Veo3 and Hailuo performed best on structural integrity and visual quality, while the other models struggled with semantic accuracy and physical logic [17][21].
Group 4: Specific Use Cases
- The models were tested across several scenarios, including workplace branding, light creative expression, and conceptual demonstrations [11][16].
- In the workplace scenario, the models were asked to generate videos for corporate events; in creative contexts, they were judged on their ability to produce engaging and entertaining content [11][16].
Group 5: Limitations and Future Directions
- The evaluation revealed significant limitations, particularly in generating coherent narrative sequences and in obeying physical laws in complex scenes [39][50].
- Future development is expected to focus on helping models create logically complete segments, integrate into creative workflows, and support collaborative storytelling [53][54][55].
A Beginner's Hands-On Test of 8 AI Text-to-Video Models: Which Can Shoot an Ad, and Which Are Just Along for the Ride
锦秋集· 2025-08-26 12:33
Core Viewpoint: Rapid iteration of AI video models has made it easy for users to generate videos, but putting them to practical use remains a challenge for ordinary users [2][3][4].
Group 1: User Needs and Model Evaluation
- Many users need clear narratives, plausible actions, and smooth visuals rather than complex effects [4][6].
- The evaluation focuses on whether these models can solve real problems in practical applications, particularly for novice content creators [5][7].
- A series of assessments was designed to test the models in real-world scenarios, with an emphasis on practical video content creation [8][9].
Group 2: Model Selection and Testing
- Eight popular video generation models were selected for testing, including Veo3, Hailuo02, and Jimeng3.0, which represent the core capabilities of the current video generation landscape [11].
- Testing took place in July 2025, with particular attention to how well the models generate videos from text prompts [11].
Group 3: Evaluation Criteria
- Five core evaluation dimensions were established: semantic adherence, adherence to physical laws, motion amplitude, camera language, and overall expressiveness [20][25].
- The models were assessed on their ability to understand prompts, maintain physical logic, and produce coherent, stable video output [21][22][23][24][25].
Group 4: Practical Application and Limitations
- The models can generate usable visual material but cannot yet produce fully deliverable commercial videos [57].
- Current models are better suited to creative sketching and visual exploration than to high-precision commercial content [65].
Group 5: Future Directions
- Future improvements may focus on structural integrity, semantic understanding, and detail stability in video generation [60][61][62].
- The rise of image-to-video models may offer a more practical route to commercial applications, bypassing some of the challenges that text-to-video models face [62].