Video Generation Large Models
Kling 3.0 Model Tops Global Video Generation Model Ranking
Zhi Tong Cai Jing· 2026-02-26 01:25
Core Insights
- The latest global video generation model ranking by Artificial Analysis highlights the Kling 3.0 Pro model, which achieved a score of 1240 on the Arena ELO benchmark, placing it first in the text-to-video sector [1]
- A total of seven models from the Kling series are included in the top 15 of the ranking, indicating a strong presence in the market [1]
- The Kling 3.0 model is noted for its industry-leading advantages in video realism, consistency, and controllability, marking a significant advancement for AI in the core aspects of film and visual production [1]
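To give a sense of what an Arena ELO gap means, here is a minimal sketch of the standard Elo expected-score formula. It assumes the ranking follows the conventional Elo model with a 400-point scale, which the summary above does not confirm. The rival rating of 1200 is purely hypothetical; only the 1240 figure comes from the report.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo model.

    Assumption: conventional 400-point logistic scale. The ranking's
    actual computation may differ.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# 1240 is the reported text-to-video score for Kling 3.0 Pro.
kling_rating = 1240
# Hypothetical rival rating, chosen only for illustration.
hypothetical_rival_rating = 1200

preference_rate = elo_expected_score(kling_rating, hypothetical_rival_rating)
print(f"Expected preference rate: {preference_rate:.3f}")
```

Under these assumptions, a 40-point lead corresponds to being preferred in roughly 56% of head-to-head comparisons, so a first-place score does not by itself imply a large margin.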
Doubao Seedance 2.0 Launches Across All Platforms
Xin Lang Cai Jing· 2026-02-12 15:27
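Zhongjing reporter Li Jing, reporting from Beijing.
On February 12, Doubao, owned by ByteDance, announced that the video generation model Seedance 2.0 has been integrated into the Doubao app, the desktop client, and the web version, and is open to users on all platforms.
In the Doubao app chat box, users can select the new Seedance 2.0 entry and enter a prompt to generate a 5-second or 10-second short video. An avatar video feature is also supported: after completing real-person verification, users can create a personal video avatar, which broadens the range of creative scenarios.
On the technical side, Seedance 2.0 upgrades three core capabilities: native audio-video synchronization, multi-shot long-form narrative, and multimodal controllable generation. Given a prompt and reference images, it generates a multi-shot video with a complete native audio track in one click. The model automatically parses the narrative logic and keeps characters, lighting, style, and atmosphere highly consistent.
According to the official notice, Seedance 2.0 currently does not support uploading images of real people as subject references.
(Editor: Zhang Jingchao; Reviewer: Li Zhenghao; Proofreader: Yan Jingning) ...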
Seedance 2.0 Suspends Real-Person Reference Capability
YOUNG财经 漾财经· 2026-02-10 02:30
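File photo. Source: Lanjing News, Sina Finance.
Recently, the release of ByteDance's Seedance 2.0 video generation model sparked heated discussion. Tim (Pan Tianhong), founder of the self-media channel 影视飓风, published a review in which he said: "Without my giving any prompt, any words, or any information, and without my providing a voice file, I only uploaded my face, and this AI somehow knew the voice that goes with this face." Tim said he had never personally given official authorization or received any payment, and that this frightened him. The fact that the model generated a highly similar voice and visual style without the corresponding material being uploaded has also drawn wide attention.
According to Sina Tech, as the incident unfolded, ByteDance's Seedance 2.0 urgently suspended its real-person reference capability. An official operations staff member said: "Hello, Jimeng creators. Seedance 2.0 received far more attention than expected during its internal beta, and we thank you for using it and for your feedback. To keep the creative environment healthy and sustainable, we are urgently optimizing in response to that feedback, and inputting real-person material as a subject reference is currently not supported. We understand deeply that the boundary of creativity is respect. We apologize for any inconvenience during this product adjustment and look forward to officially meeting you in a more complete form." ...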
ByteDance's Seedance 2.0 Suspends Real-Person Reference Capability
Xin Lang Cai Jing· 2026-02-10 01:03
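Sina Tech, morning of February 10: Recently, the release of ByteDance's Seedance 2.0 video generation model sparked heated discussion. Tim (Pan Tianhong), founder of the self-media channel 影视飓风, published a review in which he said: "Without my giving any prompt, any words, or any information, and without my providing a voice file, I only uploaded my face, and this AI somehow knew the voice that goes with this face." Tim said he had never personally given official authorization or received any payment, and that this frightened him. The fact that the model generated a highly similar voice and visual style without the corresponding material being uploaded has also drawn wide attention.
Sina Tech noted that as the incident unfolded, ByteDance's Seedance 2.0 urgently suspended its real-person reference capability. An official operations staff member said: "Hello, Jimeng creators. Seedance 2.0 received far more attention than expected during its internal beta, and we thank you for using it and for your feedback. To keep the creative environment healthy and sustainable, we are urgently optimizing in response to that feedback, and inputting real-person material as a subject reference is currently not supported. We understand deeply that the boundary of creativity is respect. We apologize for any inconvenience during this product adjustment and look forward to officially meeting you in a more complete form." ...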
ByteDance's Seedance 2.0 Urgently Suspends Real-Person Reference Capability
Xin Lang Cai Jing· 2026-02-10 00:57
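Sina Tech, morning of February 10: Recently, the release of ByteDance's Seedance 2.0 video generation model sparked heated discussion. Tim (Pan Tianhong), founder of the self-media channel 影视飓风, published a review in which he said: "Without my giving any prompt, any words, or any information, and without my providing a voice file, I only uploaded my face, and this AI somehow knew the voice that goes with this face." Tim said he had never personally given official authorization or received any payment, and that this frightened him. The fact that the model generated a highly similar voice and visual style without the corresponding material being uploaded has also drawn wide attention.
Sina Tech noted that as the incident unfolded, ByteDance's Seedance 2.0 urgently suspended its real-person reference capability. An official operations staff member said: "Hello, Jimeng creators. Seedance 2.0 received far more attention than expected during its internal beta, and we thank you for using it and for your feedback. To keep the creative environment healthy and sustainable, we are urgently optimizing in response to that feedback, and inputting real-person material as a subject reference is currently not supported. We understand deeply that the boundary of creativity is respect. We apologize for any inconvenience during this product adjustment and look forward to officially meeting you in a more complete form."
Editor in charge: Shi Xiuzhen ...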
Hong Kong Stock Movers | Kuaishou Climbs Nearly 4% as Kling AI Monthly Active Users Top 12 Million
Ge Long Hui· 2026-01-21 06:47
Group 1
- Kuaishou-W (1024.HK) shares rose nearly 4% to HKD 78.9 [1]
- The monthly active users (MAU) of Kuaishou's AI video generation model, Kling, surpassed 12 million in January this year [1]
- Kling's annual revenue for 2025 is projected to reach USD 140 million, significantly exceeding Kuaishou's initial revenue target of USD 60 million set for early 2025 [1]
Gai Kun Interview: Winning on Judgment and Timing, Kling AI Keeps Accelerating in the Global Market
华尔街见闻· 2026-01-07 12:43
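Over the past year, the focus of global AI competition has been quietly shifting. As the "parameter race" among large models cools, capital markets have started to care more about one question: which AI capabilities have actually been turned into products and can close the commercial loop? As generative text, image, and video technologies mature, the market's attention is moving from model performance to whether these capabilities can be applied at scale and monetized reliably.
Against this backdrop, Bloomberg recently conducted an in-depth interview with Gai Kun, Kuaishou senior vice president, head of the Kling AI business unit, and head of the Community Science line. The report noted that few large companies have made the strategic shift to artificial intelligence as quickly as Kuaishou. Over the past 18 months, Kuaishou's AI strategy has become increasingly clear, and the company has moved rapidly into the global market with Kling.
According to Sensor Tower data, as of January 2, Kuaishou's Kling AI app was the top-grossing Graphics & Design app on iPhone in South Korea and Russia, and ranked in the top ten in markets including the United States, the United Kingdom, Japan, Australia, and Turkey. Bloomberg expects Kling AI's commercial revenue to reach USD 140 million in 2025. As Kling's commercialization and overseas expansion accelerate, market expectations for Kuaishou's AI strategy are also rising. Its share price has gained a cumulative 88% over the past year, making it one of the most closely watched stocks among Chinese AI-related companies.
In the interview, Gai Kun repeatedly emphasized ...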
Meituan Open-Sources Its First Video Model, with Speed Surging 900%
36Kr· 2025-10-27 09:13
Core Insights
- Meituan has launched its first video generation model, LongCat-Video, designed for multi-task video generation, supporting text-to-video, image-to-video, and video continuation capabilities [1][2]
- LongCat-Video addresses the challenge of generating long videos, natively supporting outputs of up to 5 minutes, while maintaining high temporal consistency and visual stability [1]
- The model significantly enhances inference efficiency, achieving a speed increase of over 900% by employing a two-stage generation strategy and block sparse attention mechanisms [1][10][13]
Model Features
- LongCat-Video utilizes a unified task framework that allows it to handle three types of video generation tasks within a single model, reducing complexity and enhancing performance [9][10]
- The model architecture is based on a Diffusion Transformer structure, integrating diffusion model capabilities with long-sequence modeling advantages [7]
- A three-stage training process is implemented, progressively learning from low to high-resolution video tasks, and incorporating reinforcement learning to optimize performance across diverse tasks [9][10]
Performance Evaluation
- In the VBench public benchmark test, LongCat-Video scored second overall, with a notable first place in "common sense understanding" at 70.94%, outperforming several closed-source models [2][20]
- The model demonstrates strong performance in visual quality and motion fluidity, although there is room for improvement in text alignment and image consistency [19][20]
- LongCat-Video's visual quality score is nearly on par with Google's Veo3, indicating competitive capabilities in the video generation landscape [17][20]
Future Implications
- Meituan views LongCat-Video as a foundational step towards developing "world models," which could enhance its capabilities in robotics and autonomous driving [22]
- The model's ability to generate realistic video content may facilitate better modeling of physical knowledge and integration with large language models in future applications [22]
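A "speed increase of over 900%" is easy to misread, so the arithmetic is sketched below. It assumes the percentage describes a gain in throughput relative to the baseline, which the summary does not state explicitly. The 600-second baseline is a hypothetical figure used only for illustration.

```python
def percent_increase_to_multiplier(percent_increase: float) -> float:
    """Convert a percentage speed increase into a speedup multiplier.

    Assumption: the percentage is a gain over the baseline speed,
    so a 900% increase means running at 10 times the baseline.
    """
    return 1.0 + percent_increase / 100.0


def time_after_speedup(baseline_seconds: float, percent_increase: float) -> float:
    """Wall-clock time after applying the speedup to a baseline duration."""
    return baseline_seconds / percent_increase_to_multiplier(percent_increase)


multiplier = percent_increase_to_multiplier(900)
# Hypothetical baseline: a generation job that previously took 600 seconds.
new_time = time_after_speedup(600, 900)
print(f"Speedup: {multiplier:.1f}x, new time: {new_time:.0f} s")
```

Under this reading, a 900% increase is a tenfold speedup, so the hypothetical 600-second job would finish in about 60 seconds.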
Invite Codes Hard to Come By! Invite-Only Sora Tops Apple's US App Chart, Pushing ChatGPT Aside
Ge Long Hui· 2025-10-04 11:08
Core Insights
- OpenAI launched the iOS social application "Sora" powered by the new video generation model Sora 2, which quickly topped the Apple App Store's free app chart in the U.S. within days of its release [1][3]
- The application has gained significant popularity, with 56,000 downloads on its first day, surpassing competitors like Claude and Copilot, and achieving a total of 164,000 installations in the first two days [1][2]
- Sora 2 features significant advancements in physical simulation accuracy and controllability, allowing for realistic failure scenarios and complex multi-shot instructions [2][3]
Application Features
- Sora 2 can simulate realistic physical interactions, such as a basketball rebounding off the backboard when missed, enhancing the realism of generated content [2]
- The application allows users to create and remix videos collaboratively, fostering deeper interaction through features like cameo appearances [2][3]
- Users can share access through invitation codes, with each new user receiving four codes to distribute [3]
Commercial Strategy
- OpenAI is exploring monetization strategies, considering options for users to pay for additional video generation if demand exceeds available computational capacity [3]
- The company plans to share revenue with copyright holders of characters used in user-generated content, although the specific business model is still under development [3][4]
- OpenAI has announced a massive $850 billion investment in AI infrastructure, aiming to build a large-scale AI computing facility with a total power of 17GW [5]
Kling 2.5 Turbo Model Tops Global Video Generation Model Ranking
Ge Long Hui· 2025-10-02 06:48
Core Insights
- The latest global video generation model ranking by Artificial Analysis highlights Kuaishou's Kling 2.5 Turbo model as the leader in both image-to-video and text-to-video categories with Arena ELO scores of 1329 and 1252 respectively, surpassing competitors like Veo3, Ray3, and PixVerse V5 [1]
Group 1
- Kuaishou launched the Kling 2.5 Turbo model on September 23, and within just 10 days, it has taken the top position, succeeding the Kling 1.6 and Kling 2.0 models [1]
- The Kling 2.5 Turbo model maintains a global lead in various dimensions including text response, dynamic effects, style retention, and aesthetic quality [1]