Integrated Audio-Video Generation
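Baidu cuts video generation prices to 70% of the industry level in 50 days! Internal lead: there is still room for cost optimization
AI前线 · 2025-08-28 07:31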
Core Viewpoint
- Baidu's MuseSteamer has received a significant upgrade, becoming the first in the industry to achieve integrated generation of multi-person video with sound, which improves the video creation experience [2][4].
Group 1: Product Features
- MuseSteamer comes in four versions (Turbo, Lite, Pro, and Voice) that differ in image quality and features, ranging from high cost-performance to integrated voice capabilities [3].
- The model supports environmental sound effects and multi-character voice generation, so creators can produce a video from just an image and a prompt [4][10].
Group 2: Technological Breakthroughs
- The upgrade includes five core technological breakthroughs, with a focus on the distinctive pronunciation habits and context-dependent expressions of the Chinese language [4].
- End-to-end training enables integrated content generation, replacing the traditional modular approach and improving dialogue logic and emotional interaction [5].
Group 3: Cost Efficiency
- Baidu has introduced competitive pricing, with prices as low as 70% of those of comparable industry products, making high-quality video production more accessible [8][9].
- The team has optimized GPU computing and engineering processes, significantly improving efficiency and reducing costs [9].
Group 4: Market Impact
- The introduction of MuseSteamer has increased internal usage and advertising revenue, indicating a positive effect on overall business performance [13].
- More than 60% of search traffic now incorporates AIGC content, improving video quality and distribution [13][14].
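Baidu launches MuseSteamer 2.0 video generation model, achieving integrated generation of multi-person video with sound
第一财经网 · 2025-08-21 08:46
On August 21, Baidu's MuseSteamer (蒸汽机) integrated audio-video model completed an upgrade, achieving integrated generation of multi-person video with sound. MuseSteamer is an I2V model for integrated Chinese audio-video generation; its multimodal latent-space planning technology autonomously coordinates the identities, emotions, and interaction logic of multiple characters. The model series has already been deployed in Baidu Search, marketing, and other scenarios. ...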
Baidu upgrades MuseSteamer video generation model to version 2.0, with prices as low as 70% of the industry level
Xin Lang Ke Ji · 2025-08-21 07:33
Core Insights
- Baidu's MuseSteamer integrated audio-video model has completed an upgrade, achieving the industry's first integrated generation of multi-person video with sound [2].
- The upgraded versions (Turbo, Lite, Pro, and a full audio version) are now available to users through Baidu Search or the "Hui Xiang" platform [2].
- The model is the world's first Chinese integrated audio-video I2V model and features the new Latent Multi-Modal Planner technology [2].
Group 1
- MuseSteamer autonomously coordinates the identities, emotions, and interaction logic of multiple characters, and renders the details of Chinese speech and emotional expression with over 98% accuracy [2].
- It outputs cinematic-quality HD video, realistic environmental sound effects, and natural character voices in sync [2].
- The model has been applied in scenarios including Baidu Search and marketing, with pricing as low as 70% of the industry level [2].
Group 2
- Industry experts note that the upgrade both improves quality and significantly lowers creative costs [2].
- Renowned visual effects supervisor Yao Qi showcased the sci-fi short film "Return", created with MuseSteamer 2.0, and said that Hollywood-level shots no longer require million-dollar budgets [2].