AI Video Generation
China's Three-Way AI Video Battle: Kling, Jimeng, and Vidu. Who Will Be the Biggest Winner?
36Kr · 2025-07-30 00:16
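1. This article analyzes three leading Chinese players, Jimeng (即梦), Kling (可灵), and Vidu, along three dimensions (hands-on product testing, technical approach, and commercial prospects) and asks who will be the biggest winner. 2. In testing, Kling's strength is expressiveness and its weakness is a tendency to "overdo it"; Vidu's strength is realism and fine detail and its weakness is slow pacing and a lack of punch; Jimeng's strength is balance and controllability and its weakness is that it comes across as somewhat "mediocre." 3. The key technology behind AI video generation is DiT (Diffusion Transformer). Kling AI chose the same DiT architecture as Sora; Vidu's U-ViT takes a different, hybrid route; DiT also figures in Jimeng, which relies mainly on ByteDance's self-developed Seedance 1.0 series of models. 4. If technology sets the floor for these products, then market, ecosystem, and promotion strategy set the ceiling. The eventual winner will most likely be either Kling or Jimeng, for a simple reason: the ultimate battleground for AI video is applications and ecosystem. 5. We lean toward Jimeng, which has Jianying (剪映) behind it, because Kling's success depends more on "hit content" emerging, while Jimeng's success rests on the spread of "enabling tools." Tool penetration is usually more durable and stickier than a burst of hit content. Of course, this is only a logical inference from the current state of play. Chinese AI video generation has made new progress. During the 2025 World Artificial Intelligence Conference (WAIC), Kuaishou's ...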
Musk Has Quietly Been Preparing a Big Move! Grok Produces "Avatar"-Quality Footage in Seconds: Is Hollywood Trembling?
Sohu Finance · 2025-07-29 12:28
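Reported by 新智元. Editor: KingHZ. [新智元 digest] Musk has made another big move! This time it is not a rocket or an IQ upgrade for Grok, but "Imagine," an AI video generator that can practically shoot a film. It can add sound effects and matching visuals, and it supports generation in multiple styles. Users' hands-on results are stunning! Musk's Grok can now generate video too! Grok is about to launch its "Imagine" video feature, directly challenging Google's Veo 3. Musk said the related bugs are being fixed and attached a video of a robot repairing a mechanical bird. The video is so dazzling that Michael Hyacinth wondered whether it came from a scene in a film. The glittering golden "mechanical dove" being repaired by the robot reminded users of the legend of the mechanical flying bird built by Archytas, the ancient Greek mathematician, philosopher, and pioneer of mathematical mechanics. A flight of fancy from the ancient skies: Archytas's flying dove, possibly the world's earliest "robot"? It was the first self-propelled flying device in human history. Although it would not count as true flight by today's standards, the invention marked an epoch-making step in understanding the mechanics of bird flight and aerodynamics. Users said Musk is getting serious about video this time. Cinema-grade quality, with details realistic to an absurd degree. Users who got trial access used Grok to make cyberpunk-style videos. Code dances in a blood-red darkroom, and a mechanical hand stirs up metal on the keyboard ...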
Alibaba Open-Sources a Cinema-Grade AI Video Model! MoE Architecture, with a 5B Version That Runs on Consumer GPUs
量子位 (QbitAI) · 2025-07-29 00:40
Core Viewpoint - Alibaba has launched and open-sourced a new video generation model, Wan2.2, which utilizes the MoE architecture to achieve cinematic-quality video generation, including text-to-video and image-to-video capabilities [2][4][5]. Group 1: Model Features and Performance - Wan2.2 is the first video generation model to implement the MoE architecture, allowing for one-click generation of high-quality videos [5][24]. - The model shows significant improvements over its predecessor, Wan2.1, and the benchmark model Sora, with enhanced performance metrics [6][31]. - Wan2.2 supports a 5B version that can be deployed on consumer-grade graphics cards, achieving 24fps at 720P, making it the fastest base model available [5][31]. Group 2: User Experience and Accessibility - Users can easily create videos by selecting aesthetic keywords, enabling them to replicate the styles of renowned directors like Wong Kar-wai and Christopher Nolan without needing advanced filmmaking skills [17][20]. - The model allows for real-time editing of text within videos, enhancing the visual depth and storytelling [22]. - Wan2.2 can be accessed through the Tongyi Wanxiang platform, GitHub, Hugging Face, and the ModelScope (魔搭) community, making it widely available for users [18][56]. Group 3: Technical Innovations - The introduction of the MoE architecture allows Wan2.2 to handle longer token sequences without increasing computational load, addressing a key bottleneck in video generation models [24][25]. - The model has achieved the lowest validation loss, indicating minimal differences between generated and real videos, thus ensuring high quality [29]. - Wan2.2 has significantly increased its training data, with image data up by 65.6% and video data up by 83.2%, focusing on aesthetic refinement [31][32].
Group 4: Aesthetic Control and Dynamic Capabilities - Wan2.2 features a cinematic aesthetic control system that incorporates lighting, color, and camera language, allowing users to manipulate over 60 professional parameters [37][38]. - The model enhances the representation of complex movements, including facial expressions, hand movements, and interactions between characters, ensuring realistic and fluid animations [47][49][51]. - The model's ability to follow complex instructions allows for the generation of videos that adhere to physical laws and exhibit rich details, significantly improving realism [51]. Group 5: Industry Impact and Future Prospects - With the release of Wan2.2, Alibaba has continued to build a robust ecosystem of open-source models, with cumulative downloads of the Qwen series exceeding 400 million [52][54]. - The company is encouraging creators to explore the capabilities of Wan2.2 through a global creation contest, indicating a push towards democratizing video production [54]. - The advancements in AI video generation technology suggest a transformative impact on the film industry, potentially starting a new era in AI-driven filmmaking from Hangzhou [55].
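The Wan2.2 summary above credits the MoE (mixture-of-experts) architecture with adding capacity without adding per-token compute. The summary does not describe how Wan2.2 routes its experts, so the following is only a minimal sketch of the generic idea, assuming top-1 routing: a gate scores every expert, but only the highest-scoring expert runs for a given token. The names `gate_scores`, `moe_forward`, the toy experts, and the gate weights are all hypothetical and chosen purely for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Expert = Callable[[Vector], Vector]


def gate_scores(token: Vector, gate_weights: Sequence[Vector]) -> List[float]:
    """Score each expert as a dot product between the token and that expert's gate row."""
    return [sum(t * w for t, w in zip(token, row)) for row in gate_weights]


def moe_forward(
    token: Vector, experts: Sequence[Expert], gate_weights: Sequence[Vector]
) -> Tuple[Vector, int]:
    """Run only the highest-scoring expert (top-1 routing) and report which one ran."""
    scores = gate_scores(token, gate_weights)
    chosen = max(range(len(experts)), key=lambda i: scores[i])
    return experts[chosen](token), chosen


# Two toy experts (assumption); a real model would use neural sub-networks here.
experts: List[Expert] = [
    lambda v: [2.0 * x for x in v],  # expert 0: doubles every value
    lambda v: [-x for x in v],       # expert 1: negates every value
]
gate_weights = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical gate parameters

out_a, used_a = moe_forward([3.0, 1.0], experts, gate_weights)
out_b, used_b = moe_forward([0.5, 2.0], experts, gate_weights)
print(used_a, out_a)
print(used_b, out_b)
```

Adding more experts to the list increases the total parameter count, but each call to `moe_forward` still executes exactly one expert, which is the property the summary attributes to MoE.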
Aishi Technology Debuts 拍我AI (PixVerse) and Its Open Platform at WAIC
Group 1 - The 2025 World Artificial Intelligence Conference (WAIC 2025) was held in Shanghai from July 26 to 28, showcasing the domestic version of the AI video generation platform "拍我AI" (PixVerse) by the company Aishi Technology [1] - Aishi Technology, founded in April 2023 by Wang Changhu, former head of visual technology at ByteDance, focuses on AI video generation technology and serves industries such as marketing, advertising, and gaming [1] - The PixVerse platform, launched in January 2024, has gained significant traction, reaching the fourth position in the US iOS app store and exceeding 60 million global users as of May 2025 [1] Group 2 - The core features of the "拍我AI" open platform include multi-frame generation, intelligent lip-syncing, creative video continuation, cinematic camera movements, and professional audio-visual integration, all of which are now available on the domestic web and API platforms [2] - Recent updates to the platform have enhanced narrative capabilities for AI video creation, significantly improving efficiency in high-narrative demand scenarios such as movie trailers, animated novels, advertisements, and short films [2] - The company claims that its model training costs are significantly lower than industry standards, allowing for more efficient model iterations and global deployment, which is supported by their effective "data alchemy" approach [2]
UBS Securities' Xiong Wei: Chinese Companies Are Emerging in AI Video Generation Models
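On advertising, AI-driven technical improvements have been shown to raise the monetization efficiency of leading media platforms; company commentary and channel checks indicate that click-through rates, conversion rates, and eCPM have risen by 5% to 10%, and innovation in ad formats may further improve user experience and drive long-term growth. The 2025 World Artificial Intelligence Conference is about to open. Xiong Wei, China internet analyst at UBS Securities, said in a recent thematic report that, in the short term, enterprise-facing AI agent services have stronger monetization potential, with cloud and advertising expected to be the two clearest areas of AI monetization; in addition, vertical agents are likely to monetize earlier than general-purpose agents; and some Chinese companies are leading the competition in AI video generation models. Monetization models for enterprise AI agents are maturing. Cloud vendors will benefit from fast-growing AI demand. Given China's rapidly changing AI landscape, Xiong noted that unlisted companies and tech startups have made initial progress in AI monetization, but still favors incumbent internet companies. In the short term, AI has stronger monetization potential in enterprise-facing services, with cloud and advertising expected to be the two clearest areas of AI monetization. On cloud, Xiong estimates that in the first quarter of this year AI-related revenue accounted for 10% to 20% of revenue on average at major Chinese cloud providers, and that market expectations for 2025 have risen by 6 to 13 percentage points year to date. Cloud vendors are expected to benefit from rapid growth in AI demand and increased cross-selling of traditional cloud services, especially as rising AI adoption drives more inference demand ...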
A-Shares Fall Below 3,600 Points: What Happened?
Sohu Finance · 2025-07-25 07:57
Market Overview - A-shares experienced a slight decline today, with all three major indices falling. The Shanghai Composite Index closed below 3600 points, down 0.33%, the Shenzhen Component down 0.22%, and the ChiNext Index down 0.23% [1] - The market is characterized by rotation among sectors, with AI concept stocks rebounding collectively and healthcare equipment showing strength, while previously leading sectors like Hainan Free Trade and hydropower concepts faced declines [1] Key Factors Influencing A-shares - The drop below 3600 points is attributed to heavy selling pressure above this level, leading to divergent market opinions. The 3600-point mark is seen as a significant psychological barrier, with previous attempts to break through failing [1] - Despite a series of positive news, market sentiment has become fragmented, with concerns over second-quarter earnings and a need for adjustments in previously high-performing sectors [1] Sector Performance - AI concept stocks showed a collective rebound, with projections indicating that the global market for AI video generation will grow from $615 million in 2024 to $717 million in 2025, a year-on-year increase of 17%, and reach $2.563 billion by 2032, with a compound annual growth rate of 20% from 2025 to 2032 [2] - Huawei's computing stocks performed actively, driven by the upcoming World Artificial Intelligence Conference where Huawei will showcase its Ascend 384 super node technology [2] - The medical device sector showed signs of strength, with reports suggesting a shift away from a solely low-price focus in procurement, potentially leading to valuation and performance recovery [2] - The liquor sector, particularly Moutai, is facing downward pressure, with prices for 500ml bottles dropping to 1870 yuan and box prices to 1920 yuan, indicating ongoing weakness in the sector [3]
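The AI video generation market projections quoted above can be sanity-checked with two standard formulas: year-on-year growth, `curr / prev - 1`, and compound annual growth rate, `(end / start) ** (1 / years) - 1`. The short sketch below plugs in the figures from the summary (in millions of US dollars); the function names are illustrative only.

```python
def yoy_growth(prev: float, curr: float) -> float:
    """Year-on-year growth as a fraction (0.10 means 10%)."""
    return curr / prev - 1.0


def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate over `years` compounding periods."""
    return (end / start) ** (1.0 / years) - 1.0


# Figures from the summary, in millions of US dollars.
market_2024 = 615.0
market_2025 = 717.0
market_2032 = 2563.0

yoy = yoy_growth(market_2024, market_2025)
rate = cagr(market_2025, market_2032, 2032 - 2025)

print(f"2024 -> 2025 growth: {yoy:.1%}")
print(f"2025 -> 2032 CAGR:   {rate:.1%}")
```

Both results round to the quoted figures (about 17% and about 20%), so the projections are internally consistent.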
Google's Veo 3 Tricks Go Viral! A Domestic Counterpart Can Replicate Them
AI研究所 (AI Research Institute) · 2025-07-24 10:09
Core Viewpoint - The article discusses the rising popularity of Google's video generation model, Veo 3, and its impact on content creation, particularly in the home furnishing and ASMR sectors, highlighting the creative potential of AI in video production [1][11]. Group 1: Veo 3 and Its Impact - Veo 3 has gained significant traction, with over 40 million videos created since its launch, showcasing its ability to transform spaces creatively, such as turning an empty room into a Nordic-style bedroom [1][11]. - The model has sparked a wave of creative content on social media, with users producing various engaging videos, including humorous takes on historical events and absurd news reports [4][7][9]. Group 2: User Experience and Limitations - Despite the excitement, users have expressed dissatisfaction with the limitations of the Pro and Ultra versions, which restrict daily video generation and video length [4][11]. - The demand for creative content remains high, as evidenced by the ongoing competition among creators to produce stunt content ("整活"), pushing the boundaries of what Veo 3 can achieve [4][7]. Group 3: Domestic AI Tools - The article raises questions about whether domestic AI tools can replicate the success of Veo 3, introducing a new platform called 讯飞绘镜, which offers a comprehensive AI video creation experience [11][12]. - 讯飞绘镜 allows users to generate scripts and storyboards based on initial ideas, enhancing the creative process and making it easier for creators to bring their visions to life [12][16].
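Interview with 与光同尘 Founder Chen Faling: AI Is Rebuilding the Production Logic of the Film and TV Industry, and Chinese Production Has a Chance to "Overtake on the Curve"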
Core Insights - The article discusses how AI technology, particularly AIGC (AI-generated content), is revolutionizing the film and video production industry by significantly reducing costs and production time while enhancing creative processes [1][2][3]. Group 1: AI's Impact on Production Efficiency - Traditional film production often required months or years, but with AIGC, a five-person team can now complete a project in as little as two weeks [2][3]. - The cost of producing a video has decreased dramatically; for example, a traditional advertising project that cost 1 million yuan and took 90 days can now be done for 300,000 yuan in 20 days using AI [3]. - AI allows directors to communicate directly with AI models to generate scenes, which streamlines the creative process and reduces the time needed for revisions [2][3]. Group 2: Market Opportunities and Future Projections - By 2030, it is expected that AI-generated content could account for over 30% of the market, with the potential for half of all online videos to be produced by AI [7][8]. - The year 2024 is identified as a pivotal year for AI video application development, with a growing number of leading players emerging in the market [7][8]. - The company aims to establish a global presence, having already set up a subsidiary in the U.S. and collaborating with local industries in Southeast Asia [8][9]. Group 3: Educational and Developmental Initiatives - The company collaborates with universities to develop a curriculum that integrates real-world project experience into educational content, addressing the talent shortage in the AI field [9]. - An intelligent platform has been developed to embed professional knowledge into the system, enabling newcomers to produce quality videos after training [9]. - The integration of education, research, and application is seen as a unique approach that differentiates the company in the competitive landscape of AI video generation [8][9].
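The advertising example in Group 1 above (1 million yuan and 90 days, versus 300,000 yuan and 20 days) can be expressed as percentage reductions. A minimal sketch of that arithmetic follows; the helper name `reduction` is illustrative only.

```python
def reduction(before: float, after: float) -> float:
    """Fractional reduction from `before` to `after` (0.25 means 25% lower)."""
    return 1.0 - after / before


cost_cut = reduction(1_000_000, 300_000)  # yuan, from the summary
time_cut = reduction(90, 20)              # days, from the summary

print(f"cost reduction: {cost_cut:.0%}")
print(f"time reduction: {time_cut:.0%}")
```

On the summary's own numbers, that is a 70% cut in cost and roughly a 78% cut in production time for this single example; it says nothing about other projects.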
The World's First "Real-Time, Infinite" Diffusion Video Generation Model, with Karpathy as Investor and Backer
机器之心 (Synced) · 2025-07-19 03:13
Core Viewpoint - The article discusses the revolutionary breakthrough in AI video generation with the launch of Decart's MirageLSD, which allows real-time, unlimited-length video transformation from any video stream with a latency of 40 milliseconds [3][18]. Group 1: Technology and Features - MirageLSD is the first video generation model capable of producing unlimited-length videos, overcoming previous limitations of error accumulation in traditional models [23][24]. - The technology achieves zero-latency video generation, allowing real-time interaction by generating each frame based on previous frames and user prompts, thus enabling continuous video creation without pre-set endpoints [28][32]. - The model utilizes a causal autoregressive structure, which supports immediate feedback and adapts to changes in video content and user input [34][35]. Group 2: Applications and Potential - The technology opens up new applications such as transforming camera footage into alternate realities, real-time movie production, and simplified game development [7][8][9]. - It also enables innovative uses in video conferencing backgrounds, virtual try-ons, and augmented reality enhancements [11][12]. - The potential for "killer applications" remains vast, with the technology being compared to concepts from popular culture, such as "Sword Art Online" [15]. Group 3: Future Developments - Decart plans to continue releasing model upgrades and new features, including facial consistency, voice control, and precise object manipulation [16]. - The platform will also introduce streaming support for live broadcasts and game integration, expanding its functionality [16].
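The summary above says the model generates each frame from previous frames and the user prompt, with no preset endpoint. The general shape of such a causal, autoregressive streaming loop can be sketched as follows. This is a toy illustration under stated assumptions, not MirageLSD's implementation: `toy_generate_frame` is a hypothetical stand-in for the model, frames are plain lists of numbers, and the bounded `context` window is an assumed way to keep memory fixed over an unbounded stream.

```python
from collections import deque
from itertools import count, islice, repeat
from typing import Deque, Iterable, Iterator, List

Frame = List[float]  # toy stand-in for an image: a flat list of pixel values


def toy_generate_frame(history: Deque[Frame], source: Frame, prompt: str) -> Frame:
    """Hypothetical stand-in for the model; NOT MirageLSD's actual network."""
    tint = len(prompt) / 100.0  # toy way to let the prompt affect the output
    if history:
        previous = history[-1]
        base = [0.5 * s + 0.5 * p for s, p in zip(source, previous)]
    else:
        base = list(source)
    return [b + tint for b in base]


def stream_video(
    frames: Iterable[Frame], prompts: Iterable[str], context: int = 4
) -> Iterator[Frame]:
    """Yield one output frame per input frame, using only past outputs (causal)."""
    history: Deque[Frame] = deque(maxlen=context)  # bounded memory, unbounded stream
    for source, prompt in zip(frames, prompts):
        output = toy_generate_frame(history, source, prompt)
        history.append(output)
        yield output


# An endless toy camera feed: frame i is two pixels with value i.
camera = ([float(i), float(i)] for i in count())
first_three = list(islice(stream_video(camera, repeat("anime")), 3))
print(first_three)
```

Because `stream_video` is a generator that never looks ahead, it can consume an endless input and respond to a changed prompt on the very next frame, which is the behavior the summary describes as continuous creation with immediate feedback.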
Making Money from Large Video Models Is Still a Dream
投中网 (ChinaVenture) · 2025-07-18 06:10
Core Viewpoint - The AI video generation sector is experiencing intense competition among major players, with significant advancements in technology and commercial viability, yet challenges remain in achieving consistent output and cost-effectiveness for creators [4][6][19]. Group 1: Industry Overview - The AI video generation market has seen rapid product iterations from major companies like Kuaishou, ByteDance, Alibaba, and Tencent, leading to improvements in semantic response, image quality, and overall realism [4][6]. - Kuaishou's Kling AI has gained a significant market share, surpassing competitors like Runway and Veo-2, with a user base of 22 million globally within a year of launch [8][9]. - ByteDance's Jimeng AI is catching up, with its app ranking first in downloads on the Apple App Store, indicating strong user engagement [10][12]. Group 2: Competitive Landscape - The competition is characterized by a lack of significant technological gaps among the leading models, with each platform focusing on different strengths, such as consistency and realism [11][19]. - Kling AI's early market entry provided it with a first-mover advantage, but newer entrants are quickly closing the gap [8][21]. - The commercial models of Kling and Jimeng are similar, offering both free and subscription-based services, with Jimeng focusing on user growth while Kling targets professional users [12][14]. Group 3: Challenges in AI Video Generation - Despite lower production costs compared to traditional methods, creators face challenges in achieving consistent quality and managing unpredictable costs associated with AI video generation [14][15]. - Technical limitations, such as maintaining consistency across frames and generating complex motion shots, hinder the effectiveness of current AI models [16][19]. - The industry is encountering a plateau in technological advancements, with key constraints being architectural limitations, computational power, and the scarcity of high-quality training data [19][20].
Group 4: Future Outlook - The future of AI video generation will likely depend on the ability of companies to enhance user experience and optimize workflows rather than solely focusing on technological breakthroughs [20][21]. - Kling is investing in creator ecosystems through competitions and talent support, while ByteDance leverages its extensive ecosystem to enhance content creation capabilities [22].