AI Video Generation
Tencent Yuanbao Launches AI Video Generation Capability
Guan Cha Zhe Wang· 2025-11-21 08:58
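On November 21, Tencent's Hunyuan large-model team officially released and open-sourced HunyuanVideo 1.5, a lightweight video generation model with 8.3B parameters built on the Diffusion Transformer (DiT) architecture, which supports generating 5 to 10 seconds of high-definition video.

The latest version of Tencent Yuanbao now offers the model's capabilities. Users can try it in two ways: enter a text description (prompt) to generate a video directly from text, or upload an image together with a prompt to turn a static picture into a dynamic video.

[Image: usage illustration. Source: Tencent]

According to Tencent, HunyuanVideo 1.5 supports text-to-video and image-to-video generation from Chinese and English input. Its image-to-video output stays highly consistent with the source image. The model also shows strong instruction understanding and following, and can accurately realize varied scenes, including instructions for camera movement, smooth motion, realistic people, and emotional facial expressions. It supports multiple styles such as realistic, animation, and building-block styles, and can render Chinese and English text within the video. In terms of picture quality, the model natively generates 480p and 720p HD video 5 to 10 seconds long, which a super-resolution model can upscale to 1080p cinematic quality.

HunyuanVideo 1.5 GSB (Good Same Bad) evaluation results

| | T2V task GSB comparison | | | | --- | ...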
Yuanbao Launches AI Video Capability
Bei Ke Cai Jing· 2025-11-21 08:40
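[Screenshot.]

Beijing News Shell Finance (reporter Luo Yidan): On November 21, Yuanbao announced a "generate a video from one sentence" capability, built on Tencent Hunyuan's newly open-sourced HunyuanVideo 1.5 model.

In a test by a Shell Finance reporter, after updating Yuanbao to the latest version and entering the prompt "animation of a giant panda eating bamboo on the Great Wall," Yuanbao took about 3 minutes to generate a video of about 6 seconds that met the prompt and was modeled on DreamWorks' panda character "Po".

Tencent said that the open-source HunyuanVideo 1.5 behind the feature supports text-to-video and image-to-video generation in Chinese and English, keeps image and video highly consistent in color tone and detail, and accurately follows varied instructions such as camera movement and smooth motion. According to Tencent, the model achieves the strongest results among open-source models at a lightweight size of only 8.3B parameters, and runs smoothly on consumer graphics cards with 14 GB of VRAM.

The launch marks Yuanbao's full-modality coverage across text, image, audio, and video.

Proofreader: Liu Baoqing ...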
Parallel Diffusion Architecture Pushes Past the Limits to Achieve 5-Minute AI Video Generation, "Challenging" OpenAI and Google?
机器之心· 2025-11-20 09:35
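Reported by 机器之心 (editorial team)

Recently, an AI startup called CraftStory launched its Model 2.0 video generation system. By generating expressive, professional-grade, human-centric videos up to five minutes long, it tackles the "video duration" problem that has long troubled the AI video generation industry. The launch has drawn wide attention, and the system is seen as a potential strong competitor to OpenAI's Sora and Google's Veo.

According to available information, CraftStory was founded by Victor Erukhimov, an early contributor to OpenCV, the world's most widely used computer vision library, who took part in its development and maintenance. He also co-founded Itseez, which focused on computer vision solutions for embedded platforms (especially automotive safety systems), where he served as chief technology officer, chief executive officer, and president. Intel acquired Itseez in 2016.

The video-length breakthrough in CraftStory's Model 2.0 could bring substantial commercial value to enterprises that struggle to scale video production for training, marketing, and customer education.

As is well known, even for the current industry leader, OpenAI's Sora 2, the upper limit on generated video length is also ...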
Turning a Dragon into Dishes: How Did an Accountant Use AI to Make a Video with 7.4 Million Views?
后浪研究所· 2025-11-17 09:35
Core Viewpoint
- The article discusses the viral success of an AI-generated video titled "Making Six Dishes from the Ancient Canglong" (Canglong is the Chinese name for the mosasaur), highlighting the innovative use of AI in content creation and the creator's strategy for engaging viewers and riding trending topics [5][12][14].

Group 1: Video Content and Creation
- The video reached 7 million views within three days, built on the unusual concept of cooking an extinct creature, the Canglong, which captivated audiences [5][11].
- The creator, known as "Huangpu River Salmon," used popular memes and engaging storytelling techniques to hold viewer interest throughout the 6-minute video [8][12].
- Production involved generating over 1,000 video clips, with a focus on realism in the AI-generated visuals and a target of 90% authenticity [9][28].

Group 2: Strategic Approach and Audience Engagement
- Before the viral video, the creator ran A/B tests with three themed cooking videos to refine the formula, incorporating audience feedback and trending elements [12][18].
- The creator intentionally left "flaws" in the video to spark discussion among viewers, which in turn increased engagement and visibility on the platform [12][20].
- Acceptance of AI-generated content has increased significantly across major platforms, with many creators exploring AI tools to enhance their productions [12][40].

Group 3: Future Prospects and Industry Trends
- The creator aims to become a full-time AI designer, reflecting a broader trend in which AI is increasingly replacing traditional filming methods in content creation [13][40].
- The article suggests a promising future for AI-generated media, as brands and creators are willing to invest in AI capabilities to streamline production [40].
- The creator plans to explore more imaginative concepts in future videos, potentially featuring entirely fictional creatures, to sustain viewer interest and creativity [36][39].
Turning a Dragon into Dishes: How Did an Accountant Use AI to Make a Video with 7.4 Million Views?
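36Kr· 2025-11-14 08:41
"You need memes to keep people watching."

In late October, a video titled "Making Six Dishes from an Ancient Mosasaur (Part 1)" (《把远古沧龙做成六道菜(上)》) went viral on Bilibili, with views surging to 7 million within three days of release. Notably, the video is entirely AI-generated and runs 6 minutes 23 seconds. Going by past patterns, a video with both of those traits rarely attracts traffic.

After all, many people are "resistant" to AI-made content. Yet most of the nearly 5,000 comments under this video express astonishment on two counts: how quickly AI image quality has improved, and how much control the creator has over the AI.

The film is indeed unlike most earlier AI videos. It is neither stone-cutting nor a kitten cooking. Instead, chefs from several countries compete in a cooking contest, and the ingredient is a mosasaur. Yes, a mosasaur, an ancient creature that went extinct 65 million years ago, turned into dishes. You have not seen that before.

It opens with a grand scene of a group of foreigners sawing meat and chopping ribs taller than a person. The camera pushes in, rotates, and cuts quickly; characters enter and conflict erupts; the plot is tight and tense, and it grabs the audience's attention at once.

● Opening of the video "Making Six Dishes from an Ancient Mosasaur (Part 1)"

Keeping viewers from leaving during those 6 minutes 23 seconds is the hardest and most important part. To that end, the Bilibili uploader "黄浦江三文鱼" ("Huangpu River Salmon", hereafter "Salmon") used many "tactics".

One is the trending memes that run through the video. The Indian chef who appears first makes "九转大肠" (nine-turn braised intestines); the Chinese chef is "辛西娅" (Cynthia) from Shanghai, who enters with her own background music and narration, such as the familiar line "最 ...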
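NeurIPS'25 Oral: Why Bother with DiT? ByteDance Uses Autoregression for the First Time to Generate a 5-Second 720p Video in One Minute on a Single GPU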
36Kr· 2025-11-14 08:35
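A paper accepted as a NeurIPS'25 Oral has hit back hard at DiT (Diffusion Transformer).

Ever since DiT appeared, it has held a firm grip on video generation.

But a firm footing does not mean there are no problems: its computational complexity is high, which creates many challenges in resource consumption and speed.

This paper, from ByteDance's commercialization technology team, proposes a method called InfinityStar that achieves both quality and efficiency in video generation and opens up more possible paths for video generation methods.

Fun animation clips like the ones below were produced by InfinityStar:

Overall, InfinityStar's highlights can be summarized in three points:

1. It is the first discrete autoregressive video generator to surpass diffusion models on VBench.
2. Video generation no longer has to be a slow grind: moving from hundreds of denoising steps to autoregression leaves the latency behind.
3. It handles a full range of tasks: text-to-image, text-to-video, image-to-video, interactive long-video generation, and more.

Notably, InfinityStar's paper, code, and demo have all been released (links at the end of the article). Next, let's put it through a hands-on test.

Hands-on with the AI video generation that taught DiT a lesson

First, a quick look at how to try InfinityStar.

Its entry point is in the Discord community; everyone can log ...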
Kling 2.5 Turbo Model Launches First-and-Last-Frame Feature
Xin Lang Ke Ji· 2025-11-12 12:27
Core Insights
- The new Kling 2.5 Turbo model adds a first-and-last-frame feature that significantly enhances video generation compared to the previous 2.1 model [1]
- Improvements in dynamic effects, text responsiveness, style consistency, and aesthetic quality reinforce the controllability, stability, and consistency of AI video generation [1]
- This advancement lays a foundation for broader use in professional creative content production across film, short dramas, gaming, animation, and advertising [1]

Summary by Categories
- **Product Development**
  - The 2.5 Turbo model adds a first-and-last-frame function that improves video generation [1]
  - Video generation has improved significantly along several dimensions compared to the 2.1 model [1]
- **Performance Improvements**
  - The model shows notable advances in dynamic effects, text responsiveness, style retention, and aesthetic quality [1]
  - These improvements contribute to better controllability, stability, and consistency in AI-generated videos [1]
- **Market Applications**
  - The enhanced capabilities of the 2.5 Turbo model support applications in film, short dramas, gaming, animation, and advertising and marketing [1]
  - The model gives creators a higher-quality video generation solution [1]
This Hollywood Company Offers a Brand-New AI Solution for the Film and TV Industry
Tai Mei Ti APP· 2025-11-11 09:33
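Public data show that the global AI video generation market exceeded US$30 billion in 2025, with a compound annual growth rate holding above 40%, and that the market is split between short-video vendors and general-purpose large-model vendors.

Short-video platforms (such as Kuaishou's Kling and Douyin's Jimeng), with their huge traffic base and a closed usage loop of templated creation plus community distribution, have even overtaken technology leaders such as Sora and Google Veo in global market share. This trend has pushed mainstream AI video models worldwide to pursue extreme detail in short clips in order to attract as much paid usage from consumer users as possible.

As a result, mainstream video models on the market fall short when faced with "long-form" work, especially industrial-grade demands such as feature films.

The first problem is consistency. Mainstream video models can just about maintain consistency across shot transitions in short videos with few characters and simple scenes. Once long-range video, multiple characters, and complex scenes are involved, they struggle to keep character appearance, clothing, and scene elements stable. The second is a lack of narrative ability: video models have difficulty understanding the causal chains and narrative techniques in a script, let alone matching them with fitting camera language, so the generated content is often far from the director's intent. In addition, mainstream models have an insufficient grasp of physical rules. For "shallow content" such as short videos, minor physical "hallucinations" are tolerable, and the physical rules a model infers from statistical regularities in 2D pixels are sufficient. But ...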
A Conversation with the Sora Core Team: Sora Is Actually a Social Product, and Video Generation Models Will Bring Scientific Breakthroughs
海外独角兽· 2025-11-09 08:17
Core Insights
- Sora 2 has rapidly gained popularity, topping the Apple App Store charts shortly after its launch, attributed to its unique features and viral potential [2][3]
- The product emphasizes creativity and social interaction, distinguishing it from traditional video generation tools [3][4]
- The Cameos feature allows users to integrate their likeness into AI-generated videos, enhancing personalization and engagement [5][8]
- The long-term vision for Sora includes evolving into a "world simulator," capable of generating extensive video content for various applications, including scientific research [2][29]

Group 1: Product Features and Development
- Sora is designed as a social product, focusing on user creativity rather than passive content consumption [3][4]
- The Cameos feature emerged unexpectedly as a core highlight, showcasing the product's ability to blend real and virtual elements [5][6]
- The Storyboard function allows for the generation of coherent video segments from natural language, marking a significant advancement in video generation technology [6][8]

Group 2: User Engagement and Community
- The application aims to democratize content creation, enabling users of all skill levels to participate and grow as creators [10][11]
- The recommendation system is designed to support creative expression rather than merely driving consumption, addressing concerns about algorithmic content overload [8][9]
- The platform encourages remixing and collaborative creativity, fostering a community-driven environment [9][10]

Group 3: Commercialization and Market Position
- Sora is exploring monetization strategies, including a potential fee structure after a certain usage threshold, while ensuring a beneficial ecosystem for all participants [16][17]
- The platform's unique features, such as Cameos, present new opportunities for brand marketing and content monetization [19][20]
- The team is committed to maintaining a competitive edge in the rapidly evolving video generation market, focusing on user engagement and innovative features [25][26]

Group 4: Future Prospects and Technological Advancements
- The next breakthroughs in video generation technology are expected to involve longer-duration content and enhanced realism, with applications in various scientific fields [29][30]
- The integration of Sora with other OpenAI projects, such as ChatGPT, is anticipated to create new interactive experiences for users [21][22]
- The ongoing development of video models is seen as a key driver for advancements in robotics and other complex tasks, highlighting the potential for significant breakthroughs in these areas [31][32]
360 Million: Former Tencent Hunyuan Technical Lead Starts a Company and Raises Funding with Zero Products
36Kr· 2025-11-07 07:57
Core Insights
- Video Rebirth, a video generation startup founded by Dr. Liu Wei, has completed a $50 million financing round to accelerate the development of its AI video generation model, "Bach" [2][3][4]

Company Overview
- Video Rebirth was established in October 2024 and is headquartered in Singapore, focusing on creating a "world model" for AI video generation [3]
- The company plans to shift its focus from consumer-level tools to professional creative fields such as advertising, e-commerce, film, and animation [3][8]

Technology and Development
- The financing will support the development of the "Bach" model and the company's proprietary "Physics Native Attention" (PNA) architecture, which aims to address challenges in AI-generated entertainment by achieving accurate modeling of light, motion, and interaction [3][6]
- Video Rebirth's previous model, Avenger 0.5 Pro, ranked second in the Artificial Analysis video arena, indicating its competitive position in the market [7]

Market Position and Competition
- The company aims to differentiate itself in the professional video generation market, which is highly competitive with major players like ByteDance and Kuaishou offering similar services [8]
- The focus on high fidelity and physical consistency in video generation may provide Video Rebirth with a unique value proposition in a crowded landscape [8]