Video Generation Models
OpenAI's "Douyin" mocked as "so awkward"?! Altman shows off Sora 2 and catches up with Google's Veo 3, but you need an invite code to try it?
AI前线 · 2025-10-01 02:24
Core Viewpoint
- OpenAI has launched a new application named Sora, which integrates the new model Sora 2 and is aimed at enhancing video creation, sharing, and viewing experiences [2].
Group 1: Sora 2 Model
- OpenAI expresses strong confidence in Sora 2, likening it to a pivotal moment in video technology, similar to GPT-3.5 for text [2].
- Sora 2 has undergone significant optimizations in understanding the physical world, positioning it as the best video generation model available [2].
- Despite its advancements, OpenAI acknowledges that the model is not perfect and still makes mistakes, indicating that further training on video data is necessary to better simulate reality [4].
Group 2: Sora Application Features
- The core of the Sora application is the "Cameos" feature, which lets users create and remix videos, discover personalized video streams, and embed themselves into Sora scenes [5].
- Users can verify their identity and capture their likeness through a short video and audio recording, which enhances the interactive experience [5].
- Initial testing of the "upload yourself" feature has been well received, with users reporting new friendships formed through the application [5].
Group 3: Community Reception
- The community's response to OpenAI's demonstrations has been mixed, with some users expressing excitement while others find the output awkward or unsatisfactory [6][9].
- Specific feedback includes criticism of the editing and audio quality, with some users feeling discomfort due to the unnaturalness of the content [9].
Sora 2's first hands-on test in China? OpenAI really pulled it off this time!
歸藏的AI工具箱 · 2025-09-30 20:32
Core Viewpoint
- Sora 2 is presented as the world's most advanced video generation model, capable of creating high-quality videos with minimal input, including voice cloning and multi-language support, and it features a social app for collaborative video creation [1][17].
Group 1: Model Features
- Sora 2 allows users to generate videos by simply recording three numbers, showcasing its advanced voice and video synthesis capabilities [1].
- The model can maintain character consistency while changing backgrounds and scenarios, demonstrating its versatility in video generation [6][7].
- It incorporates automatic camera cuts and scene changes, reflecting an understanding of video composition and storytelling logic [8][11].
Group 2: User Interaction
- Users can remix videos by providing simple prompts, allowing for creative alterations to existing content [5].
- The platform supports image uploads for scene generation, enhancing the customization options for users [6].
- Sora 2 includes a social aspect where users can invite friends to collaborate on video projects, resembling a social media experience [1][17].
Group 3: Content Limitations
- The model has strict copyright restrictions, preventing the generation of copyrighted content, although it appears to allow some exceptions [11].
- There are challenges with maintaining consistency in certain product representations, indicating areas for improvement in commercial applications [9].
Group 4: Overall Impact
- Sora 2 is positioned as a groundbreaking tool for end users, combining audio, visual, and narrative elements to create complete videos from minimal input [17].
- The model's capabilities suggest a significant advancement in video generation technology, potentially transforming user engagement in content creation [17].
A future unicorn emerges from Beijing: building a robot brain with an "embodied Sora," with tens of millions already raised
Sou Hu Cai Jing· 2025-08-28 00:03
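At the World Robot Conference in early August, Unitree Robotics (宇树科技) founder Wang Xingxing put forward a view in his speech that sparked heated debate across the industry. He argued that the core reason robots have not yet been deployed at scale is not inadequate hardware ... the biggest problem is the model. At this stage, he said, the video generation model route looks more likely to converge than VLA.
Interestingly, at almost the same time, Lingsheng Technology (灵生科技) announced that it was open-sourcing RealDualVLA, described as the industry's first fast-slow dual-system vision-language-action framework that supports asynchronous operation, offering a new and efficiently coordinated solution for complex robot manipulation tasks. The data generation technology behind this solution is Lingsheng's own video generation model, which it calls "embodied Sora."
In 2023, Yang Hongbing, who had years of AI algorithm and industry experience at Tencent and other major internet companies, founded Beijing Lingsheng Technology Co., Ltd. (Lingsheng Technology for short) to focus on developing brains for embodied-intelligence robots. Its core product is LingBrain, an integrated cloud-edge-device brain system, and the company has raised tens of millions in funding so far.
Yang Hongbing believes the real transformation in embodied intelligence lies in giving robots a "brain" that can think and act independently, and that the evolution of this robot "brain" comes from the thriving ecosystem that open source brings. Lingsheng Technology has not only open-sourced its self-developed VLA model but also proposed training large robot models through generated video in a "learn from me" fashion, letting them first "rehearse" the operating procedure in their minds as a person would, and then ...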
Kling AI posts RMB 250 million in quarterly revenue as the earning power of video generation models improves
Xin Lang Cai Jing· 2025-08-22 01:51
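Zhi Tong Cai Jing reporter | Xiao Fang
Zhi Tong Cai Jing editor | Wen Shuqi
The earning power of Kuaishou's Kling AI is rising sharply. Kuaishou's earnings report for the second quarter of 2025 shows that Kling AI's revenue for the quarter reached RMB 250 million. According to information disclosed in earlier Kuaishou earnings reports, Kling AI's cumulative revenue from the start of commercialization last July through this February was RMB 100 million. This means Kling AI's revenue-generating capacity has increased several times over.
Zhi Tong Cai Jing learned from a Kuaishou insider that Kling AI's monthly payments exceeded RMB 100 million in both April and May this year. On the second-quarter earnings call, Kuaishou CFO Jin Bing also disclosed that Kling AI's commercial revenue is progressing faster than expected and that, based on current revenue, Kling AI's full-year revenue is expected to be double the target set at the beginning of the year.
The report shows that Kuaishou's total second-quarter revenue was RMB 35.0 billion. Of this, online marketing services revenue reached RMB 19.8 billion and live-streaming revenue reached RMB 10.0 billion; these are Kuaishou's main revenue sources. In terms of revenue mix, Kling AI's contribution to Kuaishou's overall revenue is still very limited, but the greater value of its rapid revenue growth is that it is gradually validating the commercial viability of video generation models.
Since Sora was released early last year, many major Chinese internet companies have also begun investing resources in developing video generation models, but there are skeptical voices in the industry as well. For example, Baidu CEO Robin Li (李彦宏) said last year that the investment cycle for video generation models like Sora is too long, and that even in 10 or 20 years it might not ...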
Baidu debunks multiple overseas websites impersonating its MuseSteamer video generation model
Xin Lang Cai Jing· 2025-08-19 11:37
Core Viewpoint
- Baidu has issued a warning regarding the proliferation of fake websites related to its video generation model, MuseSteamer, urging users to be cautious and discerning [1].
Group 1
- Baidu's MuseSteamer has garnered significant attention since its launch, with an upgrade event scheduled for August 21 to introduce version 2.0, which will include Turbo, Lite, Pro, and audio versions of the model [1].
- MuseSteamer was officially launched on July 2. On its first day it received over 100 applications per minute, and it accumulated more than 300,000 registered users within two weeks [1].
Impersonated by multiple overseas websites: Baidu's latest statement on its MuseSteamer video generation model
Xin Lang Ke Ji· 2025-08-19 11:28
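Sina Tech, evening of August 19: Baidu Marketing has issued an official statement saying that a large number of fake websites related to its video generation model, Baidu MuseSteamer (百度蒸汽机), have recently appeared overseas, and urgently reminding users to stay alert and avoid being deceived.
The statement also noted that MuseSteamer has drawn wide attention since it went live, and that an upgrade launch event will be held on August 21 to introduce Baidu MuseSteamer 2.0, a full lineup that includes Turbo, Lite, Pro, and audio versions of the model.
MuseSteamer was reportedly released officially on July 2. On launch day an average of more than 100 people applied per minute, and registered users exceeded 300,000 within two weeks.
The upcoming 2.0 version builds on leading technical capabilities such as multimodal spatiotemporal planning, deep optimization for Chinese-language scenarios, and end-to-end audio-video modeling. It can deliver integrated multi-person audio and video generation, complex camera movement, cinema-grade nuanced character performance, rich shot composition, and smooth image quality.
Editor in charge: He Junxi ...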
SiliconFlow's SiliconCloud launches Alibaba Tongyi Wanxiang's Wan2.2
Di Yi Cai Jing· 2025-08-15 13:19
Group 1
- SiliconCloud has launched Wan2.2, the latest open-source video generation foundation model from Alibaba's Tongyi Wanxiang team [1].
- The models include the text-to-video model Wan2.2-T2V-A14B and the image-to-video model Wan2.2-I2V-A14B, both priced at 2 yuan per video [1].
WRC 2025 Focus (2): Humanoid robots near their "ChatGPT moment" as model architecture becomes the core breakthrough point
Xin Lang Cai Jing· 2025-08-12 06:33
Core Insights
- The humanoid robot industry is on the brink of a "ChatGPT moment," with significant breakthroughs expected within 1-2 years, driven by policy and demand [1].
- The average growth rate for domestic humanoid robot manufacturers and component suppliers is projected to be between 50% and 100% in the first half of 2025 [1].
- The main challenge in the industry is not hardware but the architecture of embodied intelligent AI models, with the VLA model having inherent limitations [1][4].
Short-term Outlook (1-2 years)
- The domestic market is expected to maintain rapid growth due to policy subsidies and the expansion of application scenarios, with high visibility of orders for complete machines and core components [2].
- Key players like Tesla and Figure AI could accelerate global supply chain division and standardization once they achieve mass production [2].
Mid-term Outlook (2-5 years)
- The integration of end-to-end embodied intelligent models with world models and an RL scaling law could become the mainstream architecture, facilitating the transition from prototype to large-scale commercialization [2].
- Distributed computing is anticipated to become a critical supporting infrastructure, working alongside 5G/6G and edge computing providers [2].
- Investment opportunities include hardware manufacturers entering the mass production phase, AI companies with video generation world model capabilities, and distributed computing centers and edge cloud service providers [2].
Long-term Outlook (5+ years)
- If end-to-end embodied intelligence and low-latency distributed computing are realized, the market for household and industrial humanoid robots could expand rapidly, potentially reaching annual shipment volumes in the millions [2].
- The focus of competition is expected to shift from technological breakthroughs to cost control and ecosystem development [2].
Hardware Status
- Current humanoid robot hardware can meet most application needs, although optimization is still required in mass production and engineering [3].
AI Model Challenges
- The VLA model is considered a "foolproof architecture" but struggles with real-world interactions due to insufficient data, and its effectiveness remains limited even after reinforcement learning training [4].
- The video generation/world model approach is seen as more promising, allowing for task simulation before real-world application, which may lead to faster convergence [4].
RL Scaling Law
- Current reinforcement learning training lacks transferability, requiring new tasks to be trained from scratch, which is inefficient [5].
- Achieving a scaling law similar to that of language models could significantly accelerate the learning speed of new skills [5].
Distributed Computing Trends
- Humanoid robots are limited by size and power consumption, with onboard computing equivalent to a few smartphones [6].
- Future developments will rely on localized distributed servers to reduce latency, ensure safety, and lower the cost of individual computing units [6].
Unitree Robotics' Wang Xingxing: robot data is getting a bit too much attention; the biggest problem is the model
Group 1
- The core viewpoint is that the most important aspect for the robotics industry in the next 2 to 5 years is the development of end-to-end embodied intelligent AI models [1][24].
- The current challenge in the robotics field is not hardware performance, which is deemed sufficient, but rather the inadequacy of embodied intelligent AI models [1][18].
- There is a misconception that data is the primary concern; the real problem lies in the model architecture, which is not yet good or unified enough [1][21].
Group 2
- The VLA (Vision-Language-Action) model combined with Reinforcement Learning (RL) is seen as insufficient and requires further upgrades and optimization [2][21].
- The company has developed various models of quadruped and humanoid robots, with the quadruped model GO2 being the most shipped globally in recent years [3][4].
- The humanoid robot G1 has become a representative model in the humanoid robot sector, achieving significant sales and market presence [5][6].
Group 3
- The company emphasizes the importance of making robots capable of performing tasks rather than just serving entertainment or display purposes [9][14].
- Recent advancements in AI technology have led to improved performance in robot movements, including complex terrain navigation [11][12].
- The company has focused on developing its own core components, including motors and sensors, to enhance the performance and cost-effectiveness of its robots [10][24].
Group 4
- The robotics industry is experiencing significant growth, with many companies reporting a 50% to 100% increase in business due to rising demand and supportive policies [16][17].
- Global interest in humanoid robots is increasing, with major companies like Tesla planning to mass-produce humanoid robots [17][18].
- The future of robotics will likely involve distributed computing to manage the computational demands of robots effectively [25][26].
Citi: expects Q2 results in line with expectations, raises Kuaishou target price to HK$88 and lifts its P/E valuation multiple from 13x to 15x
Zhi Tong Cai Jing· 2025-07-30 09:16
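On July 30, Hong Kong's three major stock indexes all closed lower: the Hang Seng Index fell 0.43%, the Hang Seng China Enterprises Index fell 0.43%, and the Hang Seng Tech Index fell 1.57%. With the internet sector under pressure, Kuaishou bucked the trend and showed resilience, rising more than 2% at its intraday high. Its gains narrowed late in the session, but it still closed up 0.42% at HK$72.4, with turnover of HK$2.91 billion.
For the upcoming second-quarter results, Citi forecasts that Kuaishou's revenue will grow 11% year over year to RMB 34.5 billion, with adjusted net profit of about RMB 5.1 billion, both in line with market expectations. The report specifically emphasizes that, with deep optimization of the shelf-based e-commerce advertising system and continued revenue from Kling AI, the company has ample growth momentum for the second half, and its full-year target of 13% growth in gross merchandise volume (GMV) looks steadily achievable.
Following the practice of previous years, Kuaishou is due to release its second-quarter 2025 earnings report in late August. Several institutions have recently published bullish reports on Kuaishou, expecting its second-quarter results to be in line with expectations.
Among them, Citi, in a report published on July 28, raised its target price for Kuaishou from HK$66 to HK$88, implying potential upside of 21% based on the latest closing price.
Citi analysts gave two main reasons for their bullish view in the report. First, the commercialization of the video generation model Kling AI is running ahead of expectations: the company previously disclosed that monthly revenue exceeded RMB 100 million in each of April and May, and combined with first-quarter revenue of more than RMB 150 million, full-year revenue is likely to far exceed management's guidance of US$100 million. Second, ad monetization in shelf-based e-commerce has improved, and is expected ...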
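As a rough check on the upside figure quoted in this item, the minimal Python sketch below computes the implied upside from the HK$88 target price and the HK$72.4 closing price reported above. The function name `implied_upside` is illustrative and does not come from the source. The result is about 21.5%, which is consistent with the roughly 21% cited.

```python
def implied_upside(target_price: float, last_close: float) -> float:
    """Return the fractional upside of a target price over the last close."""
    if last_close <= 0:
        raise ValueError("last_close must be positive")
    return (target_price - last_close) / last_close


# Figures taken from the item above: HK$88 target, HK$72.4 close.
upside = implied_upside(88.0, 72.4)
print(f"Implied upside: {upside:.1%}")  # prints "Implied upside: 21.5%"
```

The same function applied to the earlier HK$66 target gives a negative value, which shows why the target had to be raised once the share price moved above it.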