AI Video Generation
Media ETF (159805) Rises Over 2.3% as ByteDance's Large Model Goes Viral Overseas
Xin Lang Cai Jing· 2026-02-09 02:22
On the news front, an AI video generation model called Seedance 2.0 has recently gone viral once again across the Chinese and international internet. Launched by ByteDance, Seedance 2.0 creates cinema-grade video from text or images. It uses a dual-branch diffusion transformer architecture that generates video and audio simultaneously: given a detailed prompt or a single uploaded image, Seedance 2.0 can produce a multi-shot video sequence with native audio within 60 seconds. Notably, the model's distinctive multi-shot narrative feature automatically generates multiple interconnected scenes from a single prompt.

Analysts note that Tencent fell for several consecutive sessions last week for three reasons: market worries about additional taxes on internet platforms (in fact there is no room to raise VAT on games, and no new tax has been confirmed); the suspension of a Yuanbao promotional campaign; and rumors of a Q4 earnings downgrade. At roughly 15x PE the stock still offers good value; Yuanbao downloads remain steady, the gap between Tencent's AI and the other major players may narrow, and the analysts maintain their recommendation.

Data show that as of January 30, 2026, the top ten constituents of the CSI Media Index (399971) were BlueFocus, Focus Media, Leo Group, Giant Network, Yanshan Technology, Kunlun Wanwei, Kingnet Network, 37 Interactive Entertainment, Enlight Media, and Perfect World, together accounting for 53.71% of the index.

As of 09:51 on February 9, 2026, the CSI Media Index (399971) was up a strong 2.30%, with constituents Chinese Online up 20.00%, Haikan Shares up 19.99%, and Jetsen Technology up ...
Another ByteDance AI Product Goes Viral
财联社· 2026-02-09 01:49
The following article comes from Cailian Press AI Daily (author: Zhang Zhen), a product of Cailian Press and the Star Market Daily.

Recently, an AI video generation model called Seedance 2.0 has once again gone viral across the Chinese and international internet.

According to official materials, Seedance 2.0 was launched by ByteDance and creates cinema-grade video from text or images. It uses a dual-branch diffusion transformer architecture that generates video and audio simultaneously: given a detailed prompt or a single uploaded image, Seedance 2.0 can produce a multi-shot video sequence with native audio within 60 seconds.

Notably, the model's distinctive multi-shot narrative feature automatically generates multiple interconnected scenes from a single prompt. The AI keeps characters, visual style, and atmosphere consistent across every scene transition, with no manual editing required. Officially, it is "ideal for creating complete narrative sequences from opening to climax, with professional-grade coherence."

Upon release, large numbers of users rushed to try Seedance 2.0, producing results like the image below. (Image source: MediaStorm / 影视飓风)

Meanwhile, a review video by the well-known channel MediaStorm (影视飓风) further accelerated Seedance 2.0's breakout. The review found the model commendable in large-range motion, shot composition, and audio-visual matching; its shot work, for instance, shows "clear angle changes" and can, "like a human director ...
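The "dual-branch diffusion transformer that generates video and audio simultaneously" is described only at a high level. A hedged sketch of one common way to couple two token streams (this is not ByteDance's published code; all shapes, weights, and names are illustrative) is joint self-attention over the concatenated video and audio tokens, so that each branch conditions on the other at every denoising step, which is what keeps sound and picture aligned:

```python
import numpy as np

# Illustrative sketch only: a single joint-attention layer over concatenated
# video and audio tokens. Each audio token can attend to video tokens and
# vice versa, coupling the two branches in one forward pass.

rng = np.random.default_rng(0)
d = 16                        # model width (toy size)
n_video, n_audio = 12, 4      # token counts for each modality

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def joint_attention(video_tok, audio_tok, Wq, Wk, Wv):
    """Self-attention over [video; audio] tokens, then split back per branch."""
    x = np.concatenate([video_tok, audio_tok], axis=0)   # (n_video+n_audio, d)
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    attn = softmax(q @ k.T / np.sqrt(d))                 # cross-modal mixing
    out = attn @ v
    return out[:n_video], out[n_video:]                  # per-branch outputs

video_tok = rng.standard_normal((n_video, d))
audio_tok = rng.standard_normal((n_audio, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))

v_out, a_out = joint_attention(video_tok, audio_tok, Wq, Wk, Wv)
print(v_out.shape, a_out.shape)   # (12, 16) (4, 16)
```

In a real dual-branch design each modality would additionally have its own projection and feed-forward stack; the shared attention shown here is only the synchronization mechanism.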
The Time Is Ripe for AI Applications: ByteDance Releases Seedance 2.0, and AI Video Generation Reaches a New Level
Changjiang Securities· 2026-02-09 01:19
Securities Research Report | Industry Research | Comment Report | Software & Services

Report highlights: ByteDance recently published a product-introduction document for its latest video model, Seedance 2.0, on Feishu. The model is now live on the Jimeng platform, where members (from RMB 69) can use it directly; it supports text-to-video and image-to-video, and also accepts video and audio as reference inputs. In 2026, large models enter a stage of deep "model × scenario" integration: internet platforms hold systematic advantages in models, compute, and data, so the contest for consumer (2C) entry points is likely to begin in earnest; on the B side, well-defined, high-value scenarios already meet the conditions for large-scale deployment and are the focus of leading overseas vendors.

Analysts: Zong Jianshu, Guo Jingchao, Liu Siyuan (SAC: S0490520030004; SAC: S0490525120002; SFC: BUX668). research.95579.com

Please read the rating notes and important disclosures at the end of the report. ...
Your Childhood Koromon "Steps Into" Reality? A Former Huawei "Genius Youth" Founds a Startup, and the World's First Real-Time Interactive Video Model Blending the Virtual and the Real Is Here
机器之心· 2026-02-09 01:18
Core Viewpoint - The article discusses the emergence of Xmax AI's real-time interactive video model X1, which allows users to seamlessly integrate virtual characters into their real-world environment, marking a significant advancement in the field of AI video generation and interaction [3][10][26]. Group 1: Technology and Innovation - Xmax AI has developed the X1 model, which enables real-time interaction with virtual characters using just a smartphone camera, eliminating the need for complex prompts or lengthy rendering times [4][10]. - The global AI video generation market is projected to grow from $614.8 million in 2024 to $2.5629 billion by 2032, indicating strong demand and competition in the sector [8]. - Xmax AI's approach focuses on making AI video generation accessible to the general public by lowering interaction barriers and enhancing real-world integration [10][26]. Group 2: Features of X1 Model - The X1 model offers four core functionalities: dimensional interaction, world filters, touch animations, and expression capture, allowing users to interact with virtual characters in a natural and engaging manner [10][11][14][16]. - Dimensional interaction allows users to summon characters into their environment using a reference image, while world filters enable real-time transformation of video styles based on uploaded images [11][14]. - Touch animations bring static images to life, allowing users to control movements through touch, and expression capture generates dynamic emojis based on real-time facial recognition [15][16]. Group 3: Technical Challenges and Solutions - Xmax AI faces significant technical challenges, including achieving ultra-low latency for real-time interactions, understanding user intent, and addressing data scarcity for training models [19][20]. - The company has innovated an end-to-end streaming re-rendering video model architecture to meet the demand for real-time responsiveness, reducing latency to milliseconds [24]. 
- To tackle the issue of intent understanding, Xmax AI has developed a unified interaction model that comprehensively interprets user gestures and actions [24]. Group 4: Team and Expertise - The founding team of Xmax AI comprises individuals with strong technical backgrounds, including experience at leading AI companies and academic institutions, which enhances their capability to address complex engineering challenges [22][23]. - The team has successfully built a robust technical foundation that combines algorithmic knowledge with practical engineering skills, positioning them well to innovate in the AI video generation space [22][24]. Group 5: Future Vision - Xmax AI aims to redefine user interaction with AI-generated content, envisioning a future where virtual characters can seamlessly integrate into daily life, serving as virtual companions or pets [26][28]. - The company's slogan, "Play the World through AI," encapsulates its mission to make the virtual world more interactive and accessible, allowing users to engage with digital content in a tangible way [28].
Taking On Musk Head-On: A Domestic Model Surpassing Sora 2 Makes a Strong Debut, With 16-Second Synchronized Audio-Video Output
Sou Hu Cai Jing· 2026-01-30 14:40
Core Viewpoint - The AI video model Vidu Q3 Pro from Shenshu Technology has achieved significant recognition, ranking first in China and second globally on the Artificial Analysis leaderboard, marking a key advancement in domestic AI video generation technology [2][3]. Group 1: Model Performance and Features - Vidu Q3 Pro is the first domestic video generation model to break into the international first tier, following only Musk's xAI Grok [2][3]. - The model supports up to 16 seconds of synchronized audio and video output, allowing for high-quality voice, narration, dialogue, sound effects, and music to be generated simultaneously [9]. - It features automatic camera angle switching based on content, enhancing the storytelling aspect by simulating professional directing techniques [10]. - Vidu Q3 can render text in multiple languages directly within the video, eliminating the need for post-production text integration [11]. Group 2: Overcoming Limitations - The model addresses three major limitations in AI video generation: sound synchronization, camera language diversity, and text rendering [4][5][8]. - By integrating sound, camera, and text rendering, Vidu Q3 transforms from a simple video generator to a comprehensive creative engine capable of storytelling [12]. Group 3: Practical Applications - Vidu Q3 is suitable for various content creation scenarios, including short dramas, advertisements, and animated content, effectively covering the entire production process from script to output [16]. - The model enhances efficiency in advertising and product demonstration by automating the video creation process, reducing the need for multiple rounds of scripting, shooting, and editing [18]. - It also shows strong applicability in self-media and podcasting, allowing for batch production of engaging content [20]. 
Group 4: Industry Impact - Vidu Q3 represents a significant upgrade in creative capabilities, redefining the roles of content creators, advertisers, and marketers [21][22]. - The evolution of AI video models from mere "cameras" to "directors" signifies a new phase in industrial-level content production [24].
Musk Is Still Grinding at 10 Seconds While Chinese AI Flips the Table: 16-Second Single Takes, the Only One Worldwide
Sou Hu Cai Jing· 2026-01-30 11:04
Core Insights - The AI video generation industry is witnessing intense competition, particularly with the launch of Vidu Q3, which introduces a new era of "audio-visual generation" [2][8] - Vidu Q3 is the first model capable of generating a complete 16-second audio-visual output in a single instance, significantly enhancing narrative capabilities [7][11] - The model's advanced features include multi-language text rendering, professional-level production capabilities, and precise control over camera angles and transitions, setting it apart from competitors [7][17][24] Group 1: Industry Competition - Silicon Valley giants are heavily competing in the AI video space, with Google’s Veo 3.1 and other models like Grok Imagine and Runway Gen 4.5 making significant advancements [4][7] - Vidu Q3 has emerged as a strong contender, ranking first in China and second globally, surpassing notable models from Google and OpenAI [7][8] Group 2: Technological Advancements - Vidu Q3's ability to generate 16-second videos without the need for post-production or stitching is a groundbreaking achievement in the industry [11][23] - The model addresses previous limitations in AI video generation, such as short video lengths and lack of audio-visual synchronization, by providing a cohesive storytelling experience [11][23] Group 3: Creative Potential - The introduction of Vidu Q3 allows creators to produce high-quality content with minimal effort, enabling a new wave of creativity among individual content creators and marketers [26][28] - The model's capabilities facilitate a shift from traditional video production processes to a more streamlined and efficient approach, empowering users to become directors of their own stories [28][24]
This Live-Action Naruto Was Actually Made by AI, Courtesy of China's New AI Video Champion Vidu Q3
量子位· 2026-01-30 11:02
Core Viewpoint - The article highlights the rapid advancements in AI video generation technology, particularly focusing on the capabilities of Vidu Q3, which can generate 16-second audio and video outputs seamlessly, showcasing significant improvements in narrative and visual quality [2][5][40]. Group 1: Vidu Q3 Features - Vidu Q3 is the first AI model globally to support the simultaneous generation of 16 seconds of audio and video, producing outputs that closely resemble original anime scenes [2][5]. - The model supports multiple languages, including Chinese, English, and Japanese, enhancing its usability across different markets [3]. - Vidu Q3 has achieved recognition from Artificial Analysis, ranking first in China and second globally, surpassing competitors like Elon Musk's Grok and Google's Veo [5]. Group 2: Technical Capabilities - The AI can generate video and audio in one go, with features like free switching of camera angles and transitions, and it supports a resolution of 1080P, which can be enhanced to 4K [6]. - The model demonstrates complete narrative capabilities, with precise text rendering and the ability to understand and incorporate contextual audio effects, such as background sounds and character expressions [19][22]. Group 3: Industry Evolution - The evolution of AI video generation has been rapid, with significant advancements occurring in less than nine months, contrasting sharply with the historical timeline of human cinema development [33][35]. - The introduction of audio-video integration marks a shift from visual-only generation to a multi-modal approach, indicating a deeper understanding of the relationship between sound and visuals [38][40]. - Vidu Q3's ability to produce coherent narratives within a 16-second timeframe signifies a leap in AI's storytelling capabilities, suggesting that future developments in AI video generation may come even faster than anticipated [40][41].
Kuaishou: Kling AI Creative Productivity Platform Lands, Continuing to Lead the Global Video Generation Model Race
Jing Ji Guan Cha Wang· 2026-01-30 04:31
AI video generation currently faces common pain points: stiff motion, poor style consistency, and weak responses to complex instructions. To address them, the new-generation creative productivity platform Kling AI has upgraded its video generation capability across the board through breakthroughs in underlying technology. Kling AI's core competitiveness stems from innovation along four technical dimensions.

In model design, it adopts a Sora-like DiT structure, replacing the traditional U-Net with a Transformer, which resolves convolutional networks' trade-off between receptive field and localization precision in complex tasks, while also upgrading the latent-space encoder-decoder and temporal modeling modules. It has developed a computationally efficient full 3D Attention mechanism as its spatiotemporal modeling module, which captures complex motion trajectories precisely while keeping compute costs in check, making video dynamics more natural.

On the data side, it built a fine-grained labeling system to filter training data and developed a dedicated video-captioning model that generates structured text, substantially improving the model's accuracy in following text instructions and avoiding the "text-visual mismatch" problem.

For computational efficiency, it abandons the industry-mainstream DDPM approach in favor of a flow model with a shorter transport path as the diffusion backbone, boosting speed without sacrificing generation quality.

For capability extension, it processes data at different aspect ratios directly to preserve original composition, developed an autoregressive temporal-extension scheme to handle minutes-long video generation, and can incorporate camera movement, frame rate, edge/depth information, and other types of ...
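The paragraph above contrasts DDPM with a flow-based backbone whose "transport path is shorter." A minimal sketch of why that matters (this is not Kling's actual code; the velocity field and sizes are toy assumptions): a rectified flow moves noise to data along a near-straight line, so a simple Euler integrator needs very few steps. In this toy the true velocity is exactly constant, so even a single step lands on the target, whereas DDPM sampling classically takes hundreds to thousands of denoising steps.

```python
import numpy as np

# Illustrative sketch: Euler integration of a flow model's ODE
#   dx/dt = v(x, t),  from t=0 ("noise") to t=1 ("data").
# For a straight-line (rectified) path the true velocity is constant,
# v(x, t) = x1 - x0, independent of x and t.

def euler_flow_sample(x0, velocity_fn, n_steps=8):
    """Integrate dx/dt = v(x, t) with n_steps uniform Euler steps."""
    x = x0.copy()
    dt = 1.0 / n_steps
    for i in range(n_steps):
        t = i * dt
        x = x + dt * velocity_fn(x, t)
    return x

x0 = np.zeros(4)                       # toy "noise" sample
x1 = np.array([1.0, -2.0, 0.5, 3.0])   # toy "data" sample
v = lambda x, t: x1 - x0               # constant velocity field

print(euler_flow_sample(x0, v, n_steps=1))   # one step already reaches x1
```

Real flow models learn v with a network and the path is only approximately straight, but the same logic is why they sample well in a handful of steps.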
Kunlun Wanwei's Open-Source SkyReels-V3 Gets "Musk" to Hawk Products
机器之心· 2026-01-29 10:26
Core Viewpoint - The rise of AI-generated virtual influencers is transforming social media, with brands collaborating and millions of followers engaging as if they were real celebrities [1][2]. Group 1: Technology and Features - Kunlun Wanwei has launched the open-source SkyReels-V3, a multi-modal video generation model that includes capabilities for reference image-to-video, video extension, and audio-driven virtual avatars [3][9]. - The model allows users to create high-fidelity videos from a single image and audio, maintaining accurate lip-sync and expressions [4][35]. - SkyReels-V3 can generate coherent videos by uploading 1-4 reference images and using text prompts, ensuring narrative logic and visual consistency [11][42]. Group 2: Practical Applications - The model has been tested in e-commerce scenarios, successfully generating videos that showcase products in various settings, such as a model displaying a handbag in an urban environment [12][19]. - It can extend video clips while preserving motion dynamics and visual style, offering both single-shot and multi-angle transition modes [26][31]. - The virtual avatar model can create synchronized audio-visual content, supporting multiple characters in interactive scenes without synchronization issues [38][47]. Group 3: Technical Insights - SkyReels-V3 integrates three core modules within a single architecture, achieving high fidelity and flexible multi-modal applications [40][41]. - The video extension feature employs a dual-mode mechanism for seamless transitions, enhancing narrative continuity and visual engagement [45][46]. - The model's modular design allows for independent use of its components or flexible combinations, catering to various application scenarios [49]. Group 4: Market Position and Future Outlook - The open-source strategy reflects the competitive landscape in AI video generation, enabling rapid ecosystem development and feedback loops [51][52]. 
- Kunlun Wanwei's history of technological advancements in video generation, including previous models like SkyReels-V1 and SkyReels-V2, showcases its commitment to innovation [53][54]. - The launch of SkyReels-V3 signals an intensifying competition in AI video generation, with diminishing technical barriers and the onset of more significant challenges [56].
What Is It Like When Anything Can Be a Reference? Vidu Q2 Reference-to-Video Pro: Effects, Acting, and Detail, All at Once
机器之心· 2026-01-28 04:59
Editor: +0

Recently, a then-and-now comparison of the "Will Smith eating spaghetti" video went viral on social media, prompting no small amount of reflection.

Two years ago, fledgling AI video was a byword for "surreal glitch footage": facial features flying everywhere, logic falling apart. Just two years later, with the same subject re-rendered, from the play of muscles during a swallow to the subtle shift of light across the face, AI has evolved to genuinely lifelike quality.

These two years compress an earth-shaking technical leap in AI video generation. Yet the industry has not stopped at a picture-quality arms race. As vendors compete for the high ground of "controllability," AI video stands at a key turning point: from solving "can we do it at all" to pursuing "how well can we do it."

Recapping Vidu's evolution: in September 2025, Vidu Q2 debuted globally, impressing with its image-to-video and reference-to-video capabilities; in December, the Q2 "image-generation suite" went live, topping 500,000 uses on its first day and confirming the market's appetite for high-quality generation.

Yesterday, Vidu Q2 Reference-to-Video Pro officially launched. Visit Vidu.cn or the Vidu API (platform.vidu.cn) to try the latest features.

In just a few months it has closed the loop from "generation" to "editing" and launched the world's first "anything-as-reference" video model, extending the reference modality from static images to dynamic video and multi-dimensional elements. Its new slogan, " ...