AI Video Generation
Sou Hu Cai Jing· 2025-12-27 11:03
Phenomenon: The release of Sora2 immediately set off a storm in the market. According to industry statistics, within just one week of launch, discussion of related topics on social media exceeded 1 billion mentions as users flooded in to try it. The app quickly climbed to the top of the download charts, with daily downloads exceeding 620,000, making it a phenomenon-level product and drawing worldwide attention to AI video generation.

Strategic decoding: Commercially, building Sora2 is a key strategic move for the company to close the loop on its content ecosystem. Its monetization is no longer limited to traditional software sales; instead it spans subscriptions, customized enterprise solutions, and advertising revenue sharing. The capital dynamics behind it are also plain: developing Sora2 required an enormous investment and puts the company under heavy financial pressure. Some analyses put the development cost as high as $8.5 billion, which in turn is pushing the company to accelerate commercialization and reach profitability as soon as possible.

Risks and challenges: This high-stakes bet is not without hazards. First is copyright: content generated by Sora2 may infringe on others' copyrights and trigger legal disputes. Second, regulation remains uncertain; as AI technology develops, governments may introduce stricter policies that constrain its growth. In addition, the high development cost and rate of cash burn challenge the company's financial sustainability. ...
A "DeepSeek Moment" for Video Generation: Tsinghua & Shengshu's Open-Source Framework Delivers a 200x Speedup and 2k GitHub Stars in a Week
机器之心· 2025-12-26 04:35
This means AI video creation has broken further out of the traditional "render and wait" pattern and reached a key turning point toward an era of "real-time generation." The breakthrough quickly drew broad attention from the research community. Editor | Du Wei.

In the closing days of 2025, the open-sourcing of a brand-new video-generation acceleration framework announced that the era of "waiting several minutes to generate a single video" is over. The framework is TurboDiffusion, jointly released by Tsinghua University's TSAIL team and Shengshu Technology. How dramatic is the speedup? With almost no impact on generation quality, mainstream video-generation models can generate a 5-second 720p video roughly 200x faster on a single RTX 5090, and the time to generate a 5-second 480p video can be compressed to under 2 seconds (shown as an animated demo in the original post). Users can now try the text-to-video and image-to-video model versions accelerated by TurboDiffusion.

| Model Name | Checkpoint Link | Best Resolution |
| --- | --- | --- |
| TurboWan2.2-I2V-A14B-720P | Huggingface Model | 720p |
| TurboWan2.1-T2V-1.3B-480P | Huggingfac ... |
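The reported numbers are end-to-end, and the post includes no code. Acceleration frameworks of this kind typically pull several levers at once (fewer sampling steps, faster attention and kernels, quantization); the toy Python harness below illustrates only the step-count lever with a fake denoising function, so the printed ratio reflects the assumed step counts rather than TurboDiffusion's real API, kernels, or measured 200x figure.

```python
# Toy illustration only: it mimics how cutting the number of denoising passes
# shortens video-diffusion latency. It does NOT call TurboDiffusion's real API;
# the step counts and the fake per-step work below are assumptions for illustration.
import time
import numpy as np

def fake_denoise_step(latent: np.ndarray) -> np.ndarray:
    """Stand-in for one denoising pass over a (frames, height, width) video latent."""
    return latent * 0.98 + np.random.randn(*latent.shape).astype(np.float32) * 0.01

def generate(latent_shape=(16, 60, 64), steps=50) -> float:
    """Run `steps` denoising passes and return the wall-clock time in seconds."""
    latent = np.random.randn(*latent_shape).astype(np.float32)
    start = time.perf_counter()
    for _ in range(steps):
        latent = fake_denoise_step(latent)
    return time.perf_counter() - start

baseline = generate(steps=50)   # conventional multi-step sampling
few_step = generate(steps=4)    # few-step sampling, as step-distilled methods use
print(f"baseline: {baseline:.3f}s  few-step: {few_step:.3f}s  "
      f"~{baseline / few_step:.0f}x speedup from step count alone (toy numbers)")
```

Running it prints a baseline-versus-few-step timing; in real systems the remaining speedup comes from kernel- and model-level optimizations that this sketch deliberately does not model.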
Cracking the Memory Problem in Long-Video Generation: HKU and Kuaishou Kling's MemFlow Uses Dynamic, Adaptive Long-Term Memory to End Rapid Forgetting and Plot Confusion
量子位· 2025-12-25 00:27
Core Viewpoint
- The article discusses the challenges of AI-generated long videos, particularly issues with narrative coherence and character consistency, and introduces MemFlow, a new memory mechanism designed to address these problems [1][2][3].

Group 1: Challenges in AI Video Generation
- AI-generated long videos often suffer from narrative inconsistencies, such as characters appearing different after a scene change or the AI confusing multiple characters [1].
- Traditional models use a "chunk generation" strategy, which leads to difficulties in maintaining continuity across video segments [4][6].
- Existing memory strategies have significant limitations, including only remembering the first segment, fixed-size memory compression, and independent processing of segments, all of which contribute to narrative disjointedness [5][6].

Group 2: Introduction of MemFlow
- MemFlow is a novel adaptive memory mechanism that enhances AI's long-term memory and narrative coherence, aiming to resolve the aforementioned issues [3][7].
- It establishes a dynamic memory system that maintains visual consistency and narrative clarity, even in complex scenarios with multiple characters [8][9].

Group 3: Mechanisms of MemFlow
- MemFlow employs two core designs: Narrative Adaptive Memory (NAM) and Sparse Memory Activation (SMA), which allow for efficient retrieval of relevant visual memories and reduce computational load [11].
- NAM intelligently retrieves the most relevant memories based on current prompts, while SMA activates only the most critical information, enhancing both speed and quality of video generation [11].

Group 4: Performance Evaluation
- MemFlow demonstrated significant improvements in key performance metrics, achieving a quality consistency score of 85.02 and an aesthetic score of 61.07, outperforming other models in long video generation tasks [13][14].
- The model maintained high semantic consistency throughout the video, particularly in the latter segments, which is crucial for narrative coherence [15][17].
- In terms of subject and background consistency, MemFlow achieved scores of 98.01 and 96.70 respectively, showcasing its ability to maintain visual unity amidst complex narrative changes [18][17].

Group 5: Visual Comparisons and Efficiency
- Visual comparisons highlighted MemFlow's superiority in maintaining character consistency and avoiding narrative confusion, unlike other models that struggled with character drift and inconsistencies [19][21][23].
- MemFlow operates efficiently on a single NVIDIA H100, achieving a real-time inference speed of 18.7 FPS, with minimal performance loss compared to baseline models [25].

Group 6: Future Implications
- MemFlow represents a significant advancement in AI video generation, transitioning from simple video creation to complex narrative storytelling [26][27].
- This innovation indicates a shift towards AI systems capable of understanding, remembering, and coherently narrating stories, marking the dawn of a new era in AI video creation [28].
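The summary names NAM and SMA but gives no implementation detail. As a minimal sketch of the general pattern it describes (write a compressed memory per generated chunk, then retrieve only the top-k entries most similar to the current prompt before generating the next chunk), the Python below is an assumption-level illustration; the class name, embedding sizes, and k are invented and do not reflect MemFlow's actual design.

```python
# Minimal sketch of the general pattern described above -- a memory bank written per
# chunk and read back by top-k similarity to the current prompt embedding. This is an
# assumption-level illustration, not MemFlow's implementation; the class name,
# embedding sizes, and k are invented for the example.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

class AdaptiveMemoryBank:
    def __init__(self, top_k: int = 2):
        self.keys: list[np.ndarray] = []    # per-chunk prompt/content embeddings
        self.values: list[np.ndarray] = []  # compressed visual features of past chunks
        self.top_k = top_k

    def write(self, key: np.ndarray, value: np.ndarray) -> None:
        self.keys.append(key)
        self.values.append(value)

    def read(self, query: np.ndarray) -> list[np.ndarray]:
        """Sparse activation: return only the k stored features most relevant to `query`."""
        if not self.keys:
            return []
        scores = [cosine(query, k) for k in self.keys]
        best = np.argsort(scores)[::-1][: self.top_k]
        return [self.values[i] for i in best]

# Usage: condition each new chunk on a handful of retrieved memories instead of everything.
rng = np.random.default_rng(0)
bank = AdaptiveMemoryBank(top_k=2)
for chunk_id in range(5):
    prompt_emb = rng.normal(size=64)           # embedding of the current chunk's prompt
    retrieved = bank.read(prompt_emb)          # memories the generator would attend to
    chunk_features = rng.normal(size=128)      # stand-in for the newly generated chunk
    bank.write(prompt_emb, chunk_features)
    print(f"chunk {chunk_id}: conditioned on {len(retrieved)} retrieved memories")
```

The design point this illustrates is that retrieval cost stays bounded by k no matter how long the video grows, which is the efficiency argument the article attributes to sparse activation.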
MiniMax and Zhipu Vie to Become the "World's First Large-Model Stock"
Hua Er Jie Jian Wen· 2025-12-22 11:14
Core Insights
- The competition for the title of "the first global large model stock" is intensifying, with MiniMax releasing its IPO prospectus shortly after Zhipu [1].

Group 1: MiniMax's Business Developments
- MiniMax has made significant progress in the AI video generation sector, despite challenges in monetizing user subscriptions for large language models [2].
- The company has developed a core suite of self-researched large models, including MiniMax M2, Hailuo-02, and Speech-02, leading to applications like MiniMax and Hailuo AI [2].
- Hailuo, launched in August 2024, has already become a key revenue source, generating $0.17 billion (1.2 billion RMB) in the first three quarters of 2025, accounting for 32.6% of total revenue [2].

Group 2: Market Performance and User Engagement
- Hailuo's paid user base reached 310,000, with an average revenue contribution of $56 per user [2].
- However, Hailuo's revenue still lags behind Kuaishou's AI video generation app "Ke Ling" (Kling), which achieved over $0.25 billion in revenue in the second quarter of this year [2].
- Hailuo's pricing strategy includes "Basic" and "Premium" packages priced at $9.99/month and $199.99/month, respectively, targeting overseas markets where users' willingness to pay is higher [2].

Group 3: User Retention Challenges
- The AI video generation sector faces significant uncertainty regarding user retention, with early data showing low retention rates for similar applications like Sora [3].
- Hailuo's user retention rates in Singapore are also concerning, with 1-day, 7-day, 30-day, and 60-day retention rates at 22.57%, 4.62%, 0.8%, and 0.66%, respectively [4].

Group 4: Financial Performance and Strategic Adjustments
- MiniMax reported net losses of $0.465 billion in 2024 and $0.512 billion in the first three quarters of 2025 [6].
- To mitigate losses, MiniMax has reduced its promotional spending, with sales expenses in the first three quarters of 2025 at $0.039 billion, a decrease of over 25% year-on-year [6].
- Despite these efforts, the company still struggles to cover its computing costs; its combined sales and R&D expenses for the first three quarters of 2025 totaled $0.18 billion [6].
Burning 50 Trillion Tokens a Day: Volcano Engine's Battle for AI Consumer Products
36氪· 2025-12-19 10:31
Core Viewpoint
- The AI market is rapidly evolving, with major players like Volcano Engine leading the way in model consumption and innovation, particularly in the areas of multi-modal capabilities and AI agents [3][5][51].

Group 1: Market Growth and Trends
- As of December, the daily token usage of the Doubao model has surpassed 50 trillion, representing growth of over 10 times compared to the same period last year [3].
- By 2025, token usage is projected to reach 16.4 trillion, indicating significant growth potential in the AI market [4].
- The competition among cloud vendors for "AI cloud supremacy" is intensifying, with major updates from companies like Google and OpenAI [4].

Group 2: Product Innovations
- Volcano Engine has released key products focusing on multi-modal capabilities and AI agents, including the Doubao flagship model 1.8 and the video generation model Seedance 1.5 pro [5][6].
- The Seedance 1.5 pro model emphasizes the ability to produce "publishable complete works," showcasing advancements in video generation technology [10][11].
- The model's improvements in voice and image synchronization have made it a standout in the market, achieving high levels of usability with minimal input [11][18].

Group 3: Business Model and Strategy
- Volcano Engine aims to simplify model usage by integrating multiple capabilities into a single API, reducing complexity for clients [38][39].
- The company is focusing on enhancing the efficiency of model training and deployment, with the Seedance 1.5 pro achieving over a 10-fold increase in inference speed [46].
- A new billing model, the "AI Savings Plan," has been introduced to help enterprises save up to 47% on costs, reflecting a shift towards value-based pricing [47][48].

Group 4: System Engineering and Infrastructure
- The competition in AI infrastructure has shifted from merely comparing model capabilities to a broader system engineering challenge [51].
- Volcano Engine is developing a comprehensive AI infrastructure that includes both the core model (Doubao) and operational tools (AgentKit) to facilitate easier deployment for enterprises [53].
- The goal is to enable every enterprise to have its own AI assistant, akin to having a website or app, supported by a complete ecosystem [54].
Inference Costs Cut in Half: a Hundred-Episode Short Drama Without Continuity Slips
Nan Fang Du Shi Bao· 2025-12-18 23:15
Core Insights
- The release of Seko 2.0 by SenseTime marks a shift in AI video generation from a "show-off" phase to a commercially viable stage, focusing on consistency in multi-episode content generation [2].
- The adaptation of Seko to domestic AI chips, particularly Cambricon, has led to a significant reduction in inference costs by approximately 50%, indicating a competitive shift in the AI video sector towards cost efficiency and content consistency [2][3].

Group 1: Technological Advancements
- Seko 2.0 introduces a multi-episode generation capability, addressing the challenge of maintaining character consistency across different scenes and episodes [5].
- The integration of SekoIDX (consistency model) and SekoTalk (audio-visual synchronization) technologies aims to enhance the coherence of character portrayal and narrative continuity in long-form content [5].
- The collaboration with Cambricon signifies a move towards a more resilient domestic supply chain for AI video generation, reducing reliance on imported computing power [4].

Group 2: Market Dynamics
- The reduction in computing costs is particularly crucial for B-end users, such as short drama studios, where profitability is heavily influenced by operational expenses [4].
- The platform has attracted over 200,000 creators since its launch in July, with 50% of them focusing on short dramas and comic dramas, indicating a growing user base and market interest [2].
- The hybrid model of "AI for the main structure, human for details" is emerging as a new norm in film production, reflecting a shift in how content is created and monetized [5][6].
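The summary credits SekoIDX with cross-episode character consistency but does not explain how it works. A generic way to get that effect, sketched below under stated assumptions, is to encode each character once into a frozen identity embedding and feed the same embedding into every episode's conditioning; the encoder stand-in, the field names, and the "Lin Yan" character are placeholders, not SenseTime's actual method.

```python
# Generic sketch of one common way to keep a character consistent across episodes:
# encode the character once into a frozen identity embedding and reuse it as
# conditioning for every episode, instead of re-describing the character per prompt.
# This illustrates the pattern only; the hashing-based "encoder", field names, and the
# character are placeholders, not SenseTime's SekoIDX.
import hashlib
import numpy as np

def identity_embedding(character_sheet: str, dim: int = 32) -> np.ndarray:
    """Deterministic stand-in for an encoder mapping a character sheet to a vector."""
    seed = int.from_bytes(hashlib.sha256(character_sheet.encode()).digest()[:8], "big")
    return np.random.default_rng(seed).normal(size=dim)

def episode_conditioning(script: str, cast: dict[str, np.ndarray]) -> dict:
    """Bundle a per-episode script with the shared, frozen identity vectors."""
    return {"script": script, "identities": {name: emb.tolist() for name, emb in cast.items()}}

cast = {"Lin Yan": identity_embedding("female lead, short hair, red jacket")}
ep1 = episode_conditioning("Episode 1: Lin Yan arrives in the city.", cast)
ep2 = episode_conditioning("Episode 2: Lin Yan confronts her rival.", cast)

# Both episodes reuse the exact same identity vector, so the generator always sees an
# unchanged description of the character -- that reuse is what keeps appearance stable.
assert ep1["identities"] == ep2["identities"]
```

Whatever the real mechanism, the practical point matches the article's: consistency is enforced by shared state carried across episodes rather than by prompt wording alone.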
Altman Blurts Out Henan Dialect, Zuckerberg and Musk Square Up in Person: Doubao's New Model Turns AI Video into "Living People"
Sou Hu Cai Jing· 2025-12-18 12:26
Xinzhiyuan report. Editor: editorial team. [Xinzhiyuan digest] The moment ByteDance's Seedance 1.5 pro went live, users went wild. Audio-visual sync and dialect output straight from the model are stunning: museum artifacts hosting livestreams, pandas chatting, and Zuckerberg and Musk staging a real-life gladiator match. This upgrade will thoroughly change future AI video production workflows.

In the recent melee of AI video models, Doubao has now entered the fight. Today, at the FORCE conference, Volcano Engine officially released the Doubao video generation model Seedance 1.5 pro, and the results immediately floored us. For example, OpenAI CEO Sam Altman, worn down by Google, clutches his forehead in pain and blurts out Henan dialect: "Aiya, why has Google been so strong lately? The model they released flattened us outright. Nobody even looked at yesterday's image-generation model!"

Influencer accounts have already used it to make viral videos. Ancient artifacts walk into a livestream room and start swaying on their own while singing the hottest current songs; clips this imaginative look set to spread virally on Xiaohongshu. Make no mistake, results this lifelike all come from Seedance 1.5 pro. This all-around upgrade puts it firmly in the lead among AI video models. First, Seedance 1.5 pro now supports joint audio-video generation, no longer limited to the visual dimension. Second, the model's visual impact and motion quality have once again broken through their previous ceiling. Multilingual, hyper-natural dialogue ...
AI Video Generation: How Does It Tear Open the Boundaries of Creation?
36氪· 2025-12-18 09:30
01. When New Technology Meets an Old Problem

If one had to pick the most closely watched direction in the AI industry for the second half of 2025, video generation is almost unavoidable as the answer. After OpenAI released Sora 2 and launched its app version, enthusiasm for AI video spread around the world at an almost "viral" pace. But tracing the industry's trajectory shows this was no accidental product hit. Behind it lie two years of steady progress in video-generation technology across image quality, temporal modeling, and usability. Sora, Veo, Tongyi Wanxiang: the accumulating technical contributions of large companies and startups alike have markedly accelerated the iteration pace of AI video capabilities worldwide.

When those technical breakthroughs converged with large-scale domestic demand, the content industry gradually reached a clear judgment: AI video generation has become a key part of next-generation content infrastructure, and more stable technology and faster tools alone are far from enough; what creators may need is a more foundational, scalable productivity stack.

The deeper impact is gradually surfacing inside the industry. Model progress is no longer limited to image quality itself; it is extending to narrative ability, character and style consistency, audio-visual sync, cross-shot logical continuity, and other elements closer to industrialized production. Once results cross the threshold from "watchable" toward "usable" and "good to use," AI video truly enters the public eye and becomes one of the tracks with the greatest room for imagination today.

At the same time, the video industry itself is facing a kind of structural ...
AI Video Generation: How Does It Tear Open the Boundaries of Creation?
36氪· 2025-12-18 09:26
The era in which everyone can create video has arrived.

Cover image | generated with Tongyi Wanxiang

When New Technology Meets an Old Problem

If one had to pick the most closely watched direction in the AI industry for the second half of 2025, video generation is almost unavoidable as the answer. After OpenAI released Sora 2 and launched its app version, enthusiasm for AI video spread around the world at an almost "viral" pace. But tracing the industry's trajectory shows this was no accidental product hit. Behind it lie two years of steady progress in video-generation technology across image quality, temporal modeling, and usability. Sora, Veo, Tongyi Wanxiang: the accumulating technical contributions of large companies and startups alike have markedly accelerated the iteration pace of AI video capabilities worldwide.

The deeper impact is gradually surfacing inside the industry. Model progress is no longer limited to image quality itself; it is extending to narrative ability, character and style consistency, audio-visual sync, cross-shot logical continuity, and other elements closer to industrialized production. Once results cross the threshold from "watchable" toward "usable" and "good to use," AI video truly enters the public eye and becomes one of the tracks with the greatest room for imagination today.

At the same time, the video industry itself is facing a structural problem. For more than a decade, the industries built around video have consistently been among the fastest-growing, most capital-intensive, and most innovation-active fields in the world. From film and entertainment, advertising and marketing, to e-commerce content, social platforms, and the creator economy, video has gradually ...
No Way, Who Can Still Tell This Video Was Acted by AI?
量子位· 2025-12-18 09:26
Jin Lei, from Aofeisi. QbitAI | WeChat official account QbitAI

This time, I genuinely cannot tell whether the video is AI-generated or not. First, take a look at this clip of sharply improved acting. Prompt: "A woman sobs uncontrollably and delivers the line: 'Jiang Chen... you must come back alive, okay?... Promise me.' As she speaks, she raises her right hand to caress the man's face. Sad background music. Cinematic quality." The line delivery, the acting, the eyes, the lip sync: without being told it was AI-generated, most people would assume it was a clip from some film. But realism is not even the main point, because in this 10-second clip the character dialogue and voice-over, the background music, and the sound effects were all produced in one pass from the prompt above.

This is the newest Doubao video-generation model, Seedance 1.5 Pro, just launched by Volcano Engine at the FORCE conference. Its headline feature is high-precision audio-visual sync that puts you in the scene in a single shot. With this capability, putting together a fun little short film takes just minutes. For example, starting from this AI female lead as the prototype, Seedance 1.5 Pro can be used to make a "Sichuan opera" piece, 《至辣园》. From these two hands-on cases, the overall highlights of the Doubao video-generation model Seedance 1.5 Pro can be summarized as:

Currently, Seedance 1.5 Pro is live on Jimeng AI and Doubao ...