Xverse

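Tech Weekly | Agibot and Unitree win a RMB 120 million humanoid robot procurement order from a China Mobile subsidiary; Meituan steps up "zero-yuan purchases" and an Auntea Jenny store gets so busy it closes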
Di Yi Cai Jing· 2025-07-13 04:03
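Agibot and Unitree win a RMB 120 million humanoid robot procurement order from a China Mobile subsidiary
Recently, Agibot (智元机器人) and Unitree Robotics (宇树科技) won bids in the 2025-2027 humanoid bipedal robot contract manufacturing service procurement project of China Mobile (Hangzhou) Information Technology Co., Ltd. (中移(杭州)信息技术有限公司), which is wholly owned by China Mobile Communications Co., Ltd. According to the project's earlier comparative selection announcement, the total budget for this procurement is RMB 124.05 million (tax included). Package 1 covers full-size humanoid bipedal robots, with a budget of RMB 78 million (tax included), and was awarded to Agibot. Package 2 covers small-size humanoid bipedal robots, computing backpacks, and five-finger dexterous hands, with a budget of RMB 46.05 million (tax included), and was awarded to Unitree.
On July 11, a reporter visiting milk tea shops in Beijing found that, as Meituan stepped up its "zero-yuan purchase" (0元购) promotion, one Auntea Jenny (沪上阿姨) store was thrown into chaos by a flood of orders and had closed and stopped accepting orders by around noon. A delivery rider said that every milk tea shop with many "zero-yuan purchase" offers had started to be swamped with orders.
Commentary: Compared with the earlier contest among platforms over red-packet subsidies, Meituan appears to be changing tactics, focusing on building user mindshare for discount channels such as "神枪手". The platforms' tactics are starting to diverge. Rather than a blitz, Taobao prefers to run sustained weekly and monthly campaigns, trying to cover more of users' consumption scenarios. JD.com, for its part, is aiming its subsidies at goods with a high average order value.
Agibot gains actual control of listed company Swancor Advanced Materials
On the evening of July 8, Swancor Advanced Materials (上纬新材) (68 ...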
Baidu open-sources its ERNIE 4.5 model series; ByteDance releases new image generation model Xverse
GOLDEN SUN SECURITIES· 2025-07-07 00:31
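Securities Research Report | Industry Weekly
July 7, 2025
Media
Baidu open-sources its ERNIE 4.5 model series; ByteDance releases new image generation model Xverse
Market overview: This week (June 30 to July 4), the CITIC level-1 media sector rose 2.39%. The media sector continued to rise this week, led by gaming stocks; with the interim reporting season approaching, investors should pay attention to companies with strong interim earnings expectations. For the second half of 2025, we remain positive on fundamentals-driven segments such as gaming and, for upside potential, on AI applications and IP monetization. In AI applications, the focus is on mapping-based investment in new applications and on data tracking for some of the more mature applications, with particular attention to AI companionship, AI education, and AI toys. In IP monetization, the focus is on companies with IP advantages and full-industry-chain potential; trendy toys, film and television content, and similar areas offer opportunities.
Sector views and stocks to watch: 1) Gaming: focus on ST 华通, 吉比特, 恺英网络, 巨人网络, 神州泰岳, 心动公司, and others, and also watch 完美世界, 冰川网络, 华立科技, and others; 2) AI: 豆神教育, 盛天网络, 上海电影, 荣信文化, 中文在线, 易点天下, 视觉中国, 盛通股份, 焦点科技, 世纪天鸿, 佳发教育, and others; 3) Resource-integration expectations: 中视传媒, 国新文化, 广西广电, 华智数媒, 吉视传媒, 游族网络, and others; 4) State-owned enterprises: 慈文传媒, 皖新传媒, 中文传媒, 南方传媒, 凯 ...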
The video model race is heating up, but monetization remains a major challenge
Huan Qiu Wang· 2025-07-06 02:16
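[Huanqiu.com finance roundup] Over the past month, the video model field seems to have seen a long-absent burst of activity. Shengshu Technology (生数科技) updated its video model Vidu to generate 32-second videos with one click and to support audio-video synthesis and 4D generation; MiniMax launched Hailuo-02 (海螺), which delivers end-to-end generation of ultra-clear video at up to 1080P and up to 10 seconds long; Baidu also released its first image-to-video large model, MuseSteamer, aimed at advertisers and other professional video content creators.
Although AI agents are being eagerly pursued by capital, interest in video models is comparatively limited. A UBS research report notes that limits on the video corpus needed to train video models mean competition in this field is expected to be less intense than in large language models. Even so, a "team" led by large internet and technology companies and supplemented by star startups such as Aishi Technology (爱诗科技), Shengshu Technology, and MiniMax is riding improvements in foundation model efficiency to speed up product iteration and commercialization.
Looking back, Sora's popularity has already spawned a wave of new products, from Aishi Technology's PixVerse in early 2024 to today's Vidu from Shengshu Technology, Qingying from Zhipu (智谱清影), and PixelDance from ByteDance, and competition is growing fiercer. According to AGI-Eval evaluations, some models, such as PixVerse-V3, have surpassed Sora. Compared with the startup boom at the AI application layer, however, video model startups remain restrained, mainly because of technological maturity, high costs, and an unclear path to commercialization.
M ...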
The video model race has heated up, but monetization is still not easy
Di Yi Cai Jing· 2025-07-05 08:19
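The video large-model industry will not see a single dominant player for the time being.
Before the past month's round of video model product updates, Sora's popularity had already brought a batch of new products to market. These included Aishi Technology's (爱诗科技) PixVerse, Runway Gen-3, and Luma Dream Machine in the first half of 2024, and Shengshu Technology's (生数科技) Vidu, Zhipu's Qingying (智谱清影), ByteDance's PixelDance, and MiniMax's Hailuo (海螺) in the second half.
On one hand, video models are chasing one another. According to AGI-Eval (a large-model evaluation community jointly released by Shanghai Jiao Tong University, Tongji University, and other universities and institutions), in December 2024 the scores of PixVerse-V3, Kling 1.5 (可灵1.5), and Video-01 surpassed Sora's; the evaluation dimensions include video-text consistency, video quality, and motion quality.
At the same time, constrained by technological maturity, commercialization, and high costs, startup interest in video models falls short of that at the AI application layer. The field consists mainly of large internet and technology companies, supplemented by star startups such as Aishi Technology, Shengshu Technology, Pika, Runway, and MiniMax.
Earlier, MiniMax founder Yan Junjie (闫俊杰) said that video work is more complex than text and that its context is very long: a 5-second video is several MB (megabytes) in size, whereas in text, the 100 characters a person reads in 5 seconds may take up less than 1 KB (kilobyte), a storage gap of several thousand times. The challenge this gap creates is that if, through basic text ...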
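The storage comparison attributed to MiniMax founder Yan Junjie (a 5-second video of a few MB versus 100 characters of text under 1 KB) can be checked with a rough calculation. The sketch below is illustrative only: the 3 MB video size is an assumed stand-in for "a few MB", the placeholder character and the use of UTF-8 are assumptions, and binary units (1 KB = 1024 bytes) are used.

```python
KB = 1024          # bytes per kilobyte (binary units, an assumption)
MB = 1024 * KB     # bytes per megabyte


def size_ratio(video_bytes: int, text_bytes: int) -> float:
    """Return how many times larger the video is than the text."""
    return video_bytes / text_bytes


# Assumption: "a few MB" for a 5-second clip is taken as 3 MB.
video_bytes = 3 * MB

# Upper bound quoted in the article: 100 characters take less than 1 KB.
text_upper_bound = 1 * KB

# 100 Chinese characters encoded as UTF-8 (3 bytes each for this character).
text_bytes = len(("字" * 100).encode("utf-8"))

ratio_vs_bound = size_ratio(video_bytes, text_upper_bound)
ratio_vs_utf8 = size_ratio(video_bytes, text_bytes)

print(f"text size: {text_bytes} bytes")
print(f"video / 1 KB bound: {ratio_vs_bound:.0f}x")
print(f"video / UTF-8 text: {ratio_vs_utf8:.0f}x")
```

Under these assumptions the gap comes out in the thousands, which is consistent with the "several thousand times" figure in the quote; a larger clip or a more compact text encoding would widen it further.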
Kling quietly earned 100 million
36Kr· 2025-07-03 23:59
Core Viewpoint
- The commercialization of video generation has made significant progress, with revenue from marketing and promotion now balancing out investment [1][24].
Group 1: AI Video Generation Trends
- AI-generated ASMR and animal sports videos have become hugely popular on short video platforms, attracting millions of likes and shares [2][3].
- The release of Google's video generation model Veo3 in May was a game-changer: it enables high-quality AI videos with synchronized audio, transforming content creation [5][11].
- The rapid advance of AI content creation tools has led to a surge of creators using them, with many new accounts emerging on short video platforms [3][6].
Group 2: Market Dynamics and Competition
- The competitive landscape is evolving, with players such as Kling (可灵) and Jimeng (即梦) making strides in AI video generation alongside Google [10][14].
- Kling's video generation model has achieved a market share of over 30%, surpassing competitors such as Runway and Veo-2 [14].
- User preferences are shifting, with creators increasingly relying on video generation tools for efficiency, as evidenced by a threefold increase in download rates for generated images [15][19].
Group 3: Financial Performance and Projections
- Kling reached an annual recurring revenue (ARR) of more than $100 million by March 2025, outpacing other AI products such as Cursor [17][19].
- Annual revenue for leading video generation products is expected to reach $1 billion this year, with potential growth to $5-10 billion next year [19].
- Despite the positive outlook, industry leaders acknowledge that commercialization is still in its early stages and that many challenges remain [25][26].
2025 Global Digital Economy Innovation Competition AIGC Creation Competition launches in Beijing
Zheng Quan Ri Bao Wang· 2025-07-02 13:14
Group 1
- The 2025 Global Digital Economy Innovation Competition AIGC Creation Competition was launched to promote the integration of AI-generated content (AIGC) technology with creative industries, providing a platform for global creators to showcase their creativity and realize value [1].
- The competition features six distinctive tracks, including short video dramas, fashion design, digital IP, sound creation, and code generation, and uses a dual-track model of "scenario-based propositions + free propositions" [1].
- The competition aims to establish a sustainable talent ecosystem for digital content creation through a "Future Creators" training program, fostering collaboration between industry, academia, and research [1].
Group 2
- The competition introduces three major innovations. The first is a "demand-driven" competition mechanism in which industry leaders take part in setting the topics, so that entries have commercial viability from the outset [2].
- The second is a global creative track, set up to facilitate international dialogue and showcase China's AIGC technological innovation while engaging top international creative talent [2].
- The third is an open and inclusive technical philosophy: participants may choose their creative tools freely, and the organizing committee provides support from specialized tools such as PixVerse [2].
2025 Global Digital Economy Innovation Competition AIGC Creation Competition launches
Xin Jing Bao· 2025-07-02 10:00
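Beijing News (reporter Wu Tingting): On July 2, the 2025 Global Digital Economy Innovation Competition AIGC Creation Competition was launched. The competition aims to promote the deep integration of AI-generated content (AIGC) technology with artistic creation, and outstanding entries can be connected directly to industry resources for rapid commercialization.
At the event, the Beijing Informatization Association (北京信息化协会) signed strategic cooperation agreements with a number of organizations. These industry-leading companies will draw on their respective strengths in content production, technology R&D, and channel distribution, each serving as the lead organization for one track and providing participants with technical support and guidance on commercial deployment.
Liu Weiliang (刘维亮), deputy director of the Beijing Municipal Bureau of Economy and Information Technology, said that revenue from Beijing's core AI industry exceeds RMB 350 billion, up more than 12% year on year, and that the city has more than 2,400 core AI companies. Beijing continues to lead in mainstream technical routes such as general-purpose large models; to date, 132 models have been registered in Beijing, more than 30% of the national total and the most of any region in the country. "This AIGC creation competition is an important measure for us to promote the adoption of AIGC technology, stimulate innovation across society, and discover outstanding talent and projects."
The competition's track design is a major innovation. It offers distinctive tracks including short video and short drama, fashion design, digital IP, sound creation, and code generation, and runs a dual-track model of "scenario-based topics + open topics". The competition also sets up an "industry-demand-oriented" mechanism: the topics for each track are drawn up with deep involvement from that track's lead company, so that entries carry commercial potential from the very start of creation, and outstanding results can be connected directly ...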
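ByteDance's new image generation model: focused on multi-subject consistency, with a new benchmark dataset debuting alongside it
QbitAI (量子位)· 2025-07-02 09:33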
Core Viewpoint
- ByteDance has introduced Xverse, a multi-subject controlled generation model that allows precise control over each subject without compromising image quality [2][6].
Group 1: Xverse Overview
- Xverse uses a method based on the Diffusion Transformer (DiT) to achieve consistent control over the identities and semantic attributes of multiple subjects [6].
- The model comprises four key components: the T-Mod adapter, a text-stream modulation mechanism, a VAE-encoded image feature module, and regularization techniques [8][10][11].
Group 2: Key Components
- The T-Mod adapter uses a perceiver resampler to combine CLIP-encoded image features with text prompt features, generating cross offsets for precise control [8].
- The text-stream modulation mechanism converts reference images into modulation offsets, ensuring accurate control during generation [9].
- The VAE encoding module improves detail retention, producing more realistic images while minimizing artifacts [10].
Group 3: Regularization Techniques and Benchmark
- Xverse introduces two regularization techniques to improve generation quality and consistency: an area retention loss and a text-image attention loss [11][12].
- The accompanying benchmark, XVerseBench, includes a diverse dataset with 20 human identities, 74 unique objects, and 45 different animal species, and features 300 unique test prompts [11].
Group 4: Evaluation Metrics
- The evaluation metrics include the DPG score, Face ID similarity, DINOv2 similarity, and an aesthetic score [12][13].
- These metrics assess the model's editing capabilities, identity maintenance, object feature retention, and the overall aesthetic quality of generated images [13].
Group 5: Comparative Performance
- Xverse has been compared with leading multi-subject generation technologies and shows superior performance in maintaining identity and object correlation in generated images [14][15].
- Quantitative data shows Xverse achieving an average score of 73.40 across the metrics, outperforming several other models [15].
Group 6: Research Background
- The ByteDance Intelligent Creation Team has a history of work on AIGC consistency, developing advanced generation models and algorithms for multi-modal content creation [17].
- Previous work includes DreamTuner for high-fidelity identity retention and DiffPortrait3D for 3D modeling, which laid the groundwork for Xverse [18][19][21].
Group 7: Future Directions
- The team aims to enhance AI creativity and engagement, aligning with everyday needs and aesthetic experiences [22].
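The summary above describes Xverse as converting reference images into modulation offsets applied on the text side. As a rough illustration of that idea only, the toy sketch below applies DiT-style modulation, `x * (1 + scale) + shift`, to every text token and adds an extra offset to one chosen token. Everything here is an assumption made for illustration: the function names, the hand-written offset values, and the two-dimensional features are hypothetical, and this is not ByteDance's implementation. In a real system the offsets would be predicted by a learned adapter from the reference image.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def modulate(x: Vector, scale: Vector, shift: Vector) -> Vector:
    """DiT-style modulation, element-wise: x * (1 + scale) + shift."""
    return [xi * (1.0 + sc) + sh for xi, sc, sh in zip(x, scale, shift)]


def modulate_tokens(
    tokens: List[Vector],
    base_scale: Vector,
    base_shift: Vector,
    offsets: Dict[int, Tuple[Vector, Vector]],
) -> List[Vector]:
    """Apply shared modulation to all tokens, plus per-token offsets.

    `offsets` maps a token index to (scale_offset, shift_offset).
    Tokens without an entry receive only the shared modulation.
    """
    zeros = [0.0] * len(base_scale)
    out = []
    for i, tok in enumerate(tokens):
        d_scale, d_shift = offsets.get(i, (zeros, zeros))
        scale = [s + d for s, d in zip(base_scale, d_scale)]
        shift = [s + d for s, d in zip(base_shift, d_shift)]
        out.append(modulate(tok, scale, shift))
    return out


# Toy prompt: three text tokens with 2-dimensional features (hypothetical).
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
base_scale = [0.0, 0.0]
base_shift = [0.0, 0.0]

# Only token 1 (say, the word naming a subject) receives an offset.
offsets = {1: ([1.0, 0.0], [0.0, 0.5])}

result = modulate_tokens(tokens, base_scale, base_shift, offsets)
print(result)
```

The point of the sketch is the locality: tokens 0 and 2 pass through unchanged, while only the token tied to a subject is altered, which is one plausible reading of how per-subject control could avoid disturbing the rest of the prompt.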
Domestic video generation models keep gaining momentum, driving industry development
Huajin Securities· 2025-06-29 13:47
Investment Rating
- The industry investment rating is "Outperform the Market" (maintained) [2][8].
Core Insights
- Domestic video generation models are advancing continuously and driving industry development. The world's first AI anthology series, "New World Loading," premiered recently, showcasing the capabilities of Kuaishou's Kling AI model [5].
- Kling AI has achieved significant revenue growth, reaching 150 million yuan in Q1 2025, with nearly 70% of its revenue coming from paid subscriptions by professional users in the self-media and marketing sectors [5].
- The top five domestic video generation models have made notable progress, with ByteDance's Seedance 1.0 ranked first, followed by MiniMax Hailuo 02 and Kuaishou Kling 2.0 [5].
Summary by Sections
Industry Performance
- The industry has shown strong returns over the past 12 months, with a relative return of 34.3% and an absolute return of 47.59% [4].
Investment Recommendations
- The report suggests focusing on companies such as Zhongwen Online, Yuedu Group, Kaiying Network, Shanghai Film, Kunlun Wanwei, and others, which are expected to benefit from ongoing advances in video generation applications [5].
Aishi Technology co-hosts the Second Workshop on Efficient On-Device Generation (EDGE) at CVPR 2025
Cai Fu Zai Xian· 2025-06-17 08:15
Group 1
- The CVPR 2025 Second Workshop on Efficient On-Device Generation (EDGE) concluded successfully in Nashville, Tennessee, USA [2].
- Two papers, "AdaVid: Adaptive Video-Language Pretraining" and "Scaling On-Device GPU Inference for Large Generative Models," were recognized as the top contributions at the workshop [2].
Group 2
- Aishi Technology's AI video generation platform, PixVerse, co-hosted the workshop and collaborated with leading global scholars and experts [4].