Vidu Q1

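Hands-on with Vidu Q1's reference-to-video feature: Zhuge Liang, Churchill, and Napoleon pose for a souvenir photo on the Great Wall
Jiqizhixin (机器之心) · 2025-07-11 08:27
Report by Jiqizhixin. Editor: Youli

This time it really is different: we have met a "god of imagination"! People used to say you should "turn yourself into a whole team"; now, thanks to AI, that has actually come true. Recently, Vidu Q1, the AI video model from Shengshu Technology, launched a reference-to-video feature that greatly simplifies the traditional content production process and truly makes "one person a whole film crew"!

First, let's watch a video: These figures should be familiar to everyone. Zhuge Liang, who shows up in countless meme remix videos waving his feather fan and saying "I never thought there could be someone so shameless in this world"; Britain's iron-willed Prime Minister Churchill; and Napoleon, whose battle record is there for anyone to check. Now they cross time and space to sit around a conference room in close conversation, holding a "summit of the century"!

With conventional AI image-to-video, this would usually mean going through scriptwriting, text-to-image / photo editing / image blending, image generation, image-to-video, and final assembly. In fact, only three images and Vidu Q1's reference-to-video feature were used here! Just as putting an elephant in a refrigerator takes only three steps, this takes only three as well: find and upload the photos, write the prompt, and get the finished clip. At this point, you can probably see what is unusual about Vidu Q1's reference-to-video feature.

An even flashier trick comes from X user Alex, an artist and programmer: in her hands, the 1989 Batman and the Tyrannosaurus rex from the 1993 Jurassic Park not only share the frame but also stage a fierce "fight", ...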
Tencent Research Institute AI Express 20250710
Tencent Research Institute · 2025-07-09 14:49
Group 1: Veo 3 Upgrade
- The Google Veo 3 upgrade allows audio and video generation from a single image, maintaining high consistency across multiple angles [1]
- The new feature is implemented through the Flow platform's "Frames to Video" option, enhancing camera movement capabilities, although the Gemini Veo 3 entry point is currently unavailable [1]
- User tests indicate natural expressions and effective performances, marking a significant breakthrough in AI storytelling applicable in advertising and animation [1]

Group 2: Hugging Face 3B Model
- Hugging Face has released the open-source 3B-parameter model SmolLM3, which outperforms Llama-3.2-3B and Qwen2.5-3B and supports a 128K context window and six languages [2]
- The model features a dual-mode system allowing users to switch between deep thinking and non-thinking modes [2]
- It employs a three-stage mixed training strategy and was trained on 11.2 trillion tokens, with all technical details, including architecture and data mixing methods, made available [2]

Group 3: Kunlun Wanwei Skywork-R1V 3.0
- Kunlun Wanwei has open-sourced the Skywork-R1V 3.0 multimodal model, which scored 142 in high school mathematics and 76 in the MMMU evaluation, surpassing some closed-source models [3]
- The model utilizes a reinforcement learning strategy (GRPO) and key entropy-driven mechanisms, achieving high performance with only 12,000 supervised samples and 13,000 reinforcement learning samples [3]
- It excels in physical reasoning, logical reasoning, and mathematical problem-solving, setting a new performance benchmark for open-source models and demonstrating cross-disciplinary generalization capabilities [3]

Group 4: Vidu Q1 Video Creation
- Vidu Q1's multi-reference video feature allows users to upload up to seven reference images, enabling strong character consistency and zero-storyboard video generation [4]
- Users can combine multiple subjects with simple prompts, with resolution upgraded to 1080P and support for storing character material for repeated use [5]
- Test results show it is suitable for creating multi-character animation trailers, supporting frame extraction and quality enhancement, and reducing video production costs to less than 0.9 yuan per video [5]

Group 5: vivo BlueLM-2.5-3B Model
- vivo has launched the BlueLM-2.5-3B on-device multimodal model, which excels in over 20 evaluations and supports GUI understanding [6]
- The model allows flexible switching between long and short thinking modes, introducing a thinking budget control mechanism to balance reasoning depth against computational cost [6]
- It employs a ViT+Adapter+LLM structure and a four-stage pre-training strategy, enhancing efficiency and mitigating the loss of text capability that multimodal models tend to suffer [6]

Group 6: DeepSeek-R1 System
- The X-Masters system, developed by Shanghai Jiao Tong University and DP Technology (深势科技), has achieved a score of 32.1 on "Humanity's Last Exam" (HLE), surpassing OpenAI and Google [7]
- The system is built on the DeepSeek-R1 model, enabling smooth transitions between internal reasoning and external tool usage, using code as an interactive language [7]
- X-Masters employs a decentralized-stacked multi-agent workflow, enhancing reasoning breadth and depth through collaboration among solvers, critics, rewriters, and selectors, with the solution fully open-sourced [7]

Group 7: Zhihui Jun's Acquisition
- Zhiyuan Robot, the company associated with "Zhihui Jun", is acquiring control of the listed company Shangwei New Materials (上纬新材) for 2.1 billion yuan, aiming for a 63.62%-66.99% stake [8]
- Following the announcement, Shangwei New Materials' stock resumed trading at limit-up, reaching a market value of 3.77 billion yuan, with the actual controller changing to Zhiyuan CEO Deng Taihua and core team members including "Zhihui Jun" Peng Zhihui [8]
- This acquisition, conducted through "agreement transfer + voluntary tender offer," is seen as a landmark case for new-productivity enterprises in A-shares following the implementation of national policies [8]

Group 8: AI Model Usage Trends
- In the first half of 2025, the Gemini series models captured nearly half of the large model API market, with Google leading at 43.1%, followed by DeepSeek and Anthropic at 19.6% and 18.4% respectively [9]
- DeepSeek V3 has maintained a high user retention rate since its launch, ranking among the top five in usage, while OpenAI's model usage has fluctuated significantly [9]
- The competitive landscape shows differentiation: Claude-Sonnet-4 leads in programming (44.5%), Gemini-2.0-Flash excels in translation, GPT-4o leads in marketing (32.5%), and role-playing remains highly fragmented [9]

Group 9: AI User Trends
- A report by Menlo Ventures indicates that there are 1.8 billion AI users globally, with a low paid user rate of only 3% and a high student usage rate of 85%, while parents are becoming heavy users [10]
- AI is primarily used for email writing (19%), researching topics of interest (18%), and managing to-do lists (18%), with no single task accounting for more than one-fifth of usage [10]
- The next 18-24 months are expected to see six major trends in AI: the rise of vertical tools, complete process automation, multi-person collaboration, an explosion of voice AI, physical AI in households, and diversification of business models [10]
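Shengshu Technology's video model Vidu Q1 launches reference-to-video, reshaping traditional video production
Securities Times Online (Zheng Quan Shi Bao Wang) · 2025-07-08 13:45
Vidu Q1's reference-to-video feature skips the relatively complex intermediate step of storyboarding entirely. Users only need to upload reference images of characters, props, scenes, and so on; drawing on the feature's deep understanding of these elements and of how they interact, Vidu Q1 can fuse multiple reference elements directly into a video clip, truly achieving zero-storyboard generation.

Compared with text-to-video, which is hard to control, and image-to-video, which depends heavily on storyboards, reference-to-video offers both controllability and flexibility. The more important innovation, however, is that text-to-video and image-to-video are still built on the traditional way of making video, whereas Vidu Q1's reference-to-video does more than significantly improve the efficiency of that traditional process: it breaks with the established approach to content creation and builds an AI-native workflow. Going from reference image elements to generated video footage takes just one step, which greatly lowers the barrier to creation.

In addition, the launch of the feature gives creators more flexibility. The uploaded characters, props, and scenes serve as the creator's actor library, prop library, and scene library; as tireless "digital actors", they form a large "virtual film crew" that can be deployed at will.

Creators can call up any of this material at any time with Vidu Q1's reference-to-video feature: multiple characters in one scene, the same scene with different characters or props, or the same character in different scenes. The possible combinations are countless, and each combination produces different video content. This clearly improves the reusability of material, requiring only ...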
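Video generation models at the 2025 midyear "match point": chasing leaderboard scores on one side, chasing viral volume on the other
36Kr · 2025-05-29 01:59
Sure enough, it landed with as much of a bang as Sora's release once did. When it comes to AI video generation, do we still have to watch overseas vendors show off?!

At the 2025 Google I/O developer conference, Google unveiled Veo 3, another heavyweight product in the video generation model field. Only about half a year after the previous-generation Veo 2 was released, Veo 3 is a striking update: it achieves native integration of video and audio, including music and background sound effects, and can even generate dialogue between characters naturally, with lip movements synchronized to the picture.

Video generation models have fully entered the "sound era". With Veo 3's stronger understanding and simulation of physical laws, the realism and immersion of AI video generation have moved up another level.

Faced with results like these, can Chinese video generation models still pull ahead? Setting the outcome aside and looking only at the industry's course in the half year after Veo 2's release: on the authoritative global evaluation leaderboards VBench Leaderboard and Artificial Analysis, the competitive landscape in this field has not been static. Chinese vendors' models, such as Kuaishou's Kling 1.6 Pro and Kling 2.0, Alibaba's Tongyi Wanxiang, and Shengshu Technology's Vidu Q1, have each topped the charts at one point or another.

Video, as the main carrier of content consumption today, enjoys very high traffic and attention in many fields. Even within AI large models, competition in the video generation track seems fiercer than in other segments, with vendors "brawling" with one another especially ...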
Why do AI video tools look more and more alike?
36Kr · 2025-05-07 07:50
Core Insights
- The AI video sector has seen a shift in focus from OpenAI's Sora to new players like Kling (可灵) and Jimeng (即梦), with industry players now prioritizing the reduction of the gap between AI video production and consumption [4][5][6]
- The competition among AI video players is intensifying, with frequent updates and new model releases expected in 2025, indicating a rapid evolution in the industry [4][12][26]
- There is growing concern among mid-tier AIGC entrepreneurs regarding the commercial viability of AI video, as production costs remain high while client budgets are decreasing [4][16][18]

Group 1: Industry Dynamics
- The AI video landscape is becoming increasingly crowded, with numerous players emerging and competing for market share [23][26]
- The focus of competition has shifted from model parameters to three key dimensions: consistency, usability, and playability [6][13][14]
- Many AI video products are becoming homogenized in terms of functionality, leading to increased competition on quality, cost, and interaction forms [5][16]

Group 2: Technological Advancements
- AI video players are enhancing video generation consistency by improving frame transitions and scene realism, which are critical for quality [9][11]
- Major players are iterating their foundational models regularly, with updates occurring at least every six months to maintain competitive advantage [11][12]
- New features such as dynamic editing capabilities and end-to-end production tools are being developed to improve usability for creators [13][14]

Group 3: Market Challenges
- Despite the proliferation of tools and features, many creators express anxiety over rising production costs and decreasing project budgets [16][18][21]
- The pricing strategies in the AI video market are not leading to significant reductions in costs, with many companies maintaining high prices for advanced models [20][21]
- The complexity of video creation demands a multi-platform approach, as no single company currently meets all needs in the market [27]
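[Industrial Internet Weekly] China has become the world's largest holder of AI patents; Manus reportedly raises US$75 million; US analyst: H20 export controls are pointless and will have little impact on China's AI development
TMTPost App (Tai Mei Ti APP) · 2025-04-28 03:16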
Group 1
- China has become the world's largest holder of artificial intelligence patents, accounting for 60% of the total [2]
- The National Intellectual Property Administration is advancing the innovation of intellectual property systems in the AI field and plans to establish new protection rules for AI and big data [2]
- The report from the World Intellectual Property Organization highlights the positive momentum in China's AI development [2]

Group 2
- Manus AI, a Chinese startup, has raised $75 million in a new funding round led by Benchmark, increasing its valuation to nearly $500 million [3]
- The company plans to use the new funds to expand its services into markets including the US, Japan, and the Middle East [3]

Group 3
- iFlytek reported revenue of 4.658 billion yuan for Q1 2025, a year-on-year increase of 27.74%, with net profit growth of 35.68% [6]
- The company's net profit excluding non-recurring items increased by 48.29%, and operating cash flow rose by 48.54% [6]

Group 4
- ByteDance's Agent product "Coze Space" (扣子空间) has entered internal testing, focusing on solving complex work tasks with multiple expert agents [4]
- The product is driven by domestic models and integrates various tools to enhance task-solving capabilities [4]

Group 5
- Shenzhen University has officially established an Artificial Intelligence College, collaborating with Tencent Cloud to build an industry academy [9]
- The college includes a research team of approximately 80 members, including two academicians from the Chinese Academy of Sciences [9]

Group 6
- Lenovo and Xinhua Union Culture, along with Hanshe Culture Group, have launched China's first intelligent agent for the cultural tourism industry [10]
- The intelligent agent is based on large models and aims to enhance operational management and industry empowerment [10]

Group 7
- Ant Group has established two operational centers in Guangzhou, focusing on digital finance and cross-border payment [11]
- The centers are part of a strategic cooperation agreement with the Guangzhou municipal government [11]

Group 8
- Alibaba has announced the cancellation of the "refund only" policy across multiple e-commerce platforms, marking a significant shift in consumer rights [13]
- This change aims to balance merchant rights protection with consumer experience improvement [13]

Group 9
- Huawei has officially launched its high-speed L3 commercial solution, preparing for the commercial capabilities of L3 by 2025 [14]
- The company emphasizes the challenges of transitioning from L2 to L3 automation [14]

Group 10
- Tencent Cloud has introduced an in-cabin large model that provides precise Q&A services for driving behavior and vehicle operation [15]
- This model is designed to enhance user experience in the automotive sector [15]

Group 11
- Yandex has launched a new-generation AI in-car platform tailored for the Russian-speaking market, featuring smart voice interaction [16]
- The platform has already gained over 70 million monthly active users in Russia [16]

Group 12
- ZTE Corporation reported a net profit decline of 10.5% year-on-year for Q1 2025, despite a revenue increase of 7.82% [20]
- The company's revenue reached 32.968 billion yuan [20]

Group 13
- The first humanoid robot half marathon concluded in Beijing, with the top three companies being clients of Feishu [7]
- These companies utilized AI products for management and efficiency improvements [7]

Group 14
- The establishment of the Greater Bay Area (Dongguan) AI Alliance aims to enhance AI development and application scenarios by 2027 [26]
- The alliance includes major tech companies and aims to utilize over 10,000 P of intelligent computing power [26]

Group 15
- The launch of the "Deep Small Note" application in Shenzhen allows users to apply for business licenses using AI [27]
- This marks a significant step towards fully intelligent government service applications [27]

Group 16
- OceanBase has announced a comprehensive entry into the AI era, appointing its CTO as the head of AI strategy [57]
- The company aims to build a data foundation for the AI era [57]
Media industry weekly report: actively watch high-growth overseas social apps, Agent, and multimodal AI applications
KAIYUAN SECURITIES· 2025-04-28 00:55
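Date: April 27, 2025
Investment rating: Positive (maintained)

[Figure: industry performance chart, Media vs. CSI 300, 2024-04 to 2024-12. Source: Juyuan]

Related research reports
- "Multimodal AI breakthroughs keep coming, and continued policy tailwinds support IP and experience consumption: industry weekly report", 2025.4.13
- "MCP and policy support AI development; continue to watch the high-growth IP track: industry commentary report", 2025.4.21

| Fang Guangzhao (analyst) | Tian Peng (analyst) | Xiao Jiangjie (contact) |
| --- | --- | --- |
| fangguangzhao@kysec.cn | tianpeng@kysec.cn | xiaojiangjie@kysec.cn |
| License No.: S0790520030004 | License No.: S0790523090001 | License No.: S0790124070035 |

Risk warning: overseas social product revenue, AI application commercialization progress, game gross billings, and similar metrics may fall short of expectations.

Social and gaming businesses expanding into the Middle East, North Africa, and other regions remain in a high-growth phase; watch companies with positioning and operational advantages
According to ...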
Industry weekly report: actively watch high-growth overseas social apps, Agent, and multimodal AI applications - 20250427
KAIYUAN SECURITIES· 2025-04-27 14:34
Investment Rating
- The industry investment rating is "Positive" (maintained) [2]

Core Viewpoints
- The report emphasizes the continued high growth in the social and gaming sectors, particularly in the MENA region, and suggests focusing on companies with operational advantages and market positioning [4]
- The report highlights the advancements in domestic video models and the ongoing expansion of AI applications, recommending continued investment in AI-related sectors [5]

Summary by Sections

Industry Data Overview
- "Peace Elite" ranks first in the iOS free chart in mainland China, while "Honor of Kings" holds the top position in the iOS revenue chart [12][16]
- The film "Sunshine Flower" achieved the highest box office for the week, grossing 0.39 billion CNY [26]

Industry News Overview
- Coze, an AI tool, entered the domestic top ten rankings, while Photoroom improved its position in the overseas rankings [33]
- The report notes the approval of 118 games by the National Press and Publication Administration in April [33]

Company Performance Highlights
- Newborn Town (赤子城科技) reported total revenue of 5.09 billion CNY for 2024, a year-on-year increase of 53.9%, with social business revenue reaching 4.63 billion CNY, up 58.1% [4]
- Yalla Technology reported revenue of 339.7 million USD for 2024, with a net profit of 134.2 million USD, reflecting an 18.7% year-on-year increase [4]

Recommendations
- The report recommends focusing on companies with strong market positioning and local operational capabilities, highlighting Tencent Holdings and ShengTian Network as key recommendations, with beneficiaries including Newborn Town and Yalla Technology [4][5]
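Shengshu Technology's new video model Vidu Q1 goes live: No. 1 globally in anime video generation
IPO Zaozhidao (IPO早知道) · 2025-04-23 10:25
Locking in the top spot for value for money with top-tier results.

This is an original article by IPO Zaozhidao. Author: Stone Jin. WeChat official account: ipozaozhidao

According to IPO Zaozhidao, Shengshu Technology's new video model Vidu Q1 recently launched globally.

According to results published by VBench-1.0 and VBench-2.0, authoritative evaluation benchmarks for video generation models, Vidu Q1 surpassed top Chinese and international models such as Runway, OpenAI Sora, and Kuaishou's Kling on both VBench leaderboards, taking first place on both text-to-video leaderboards.

Meanwhile, in the image-to-video leaderboard from SuperCLUE, an authoritative Chinese large-model evaluation organization, Vidu Q1 also ranked first in both the anime style and realistic style categories.

Notably, Vidu Q1 reached SOTA level (state of the art, i.e. the most advanced current model) on composite dimensions such as video quality and video-semantic consistency in VBench-1.0 and commonsense reasoning and physical understanding in VBench-2.0, making it the strongest model globally for video generation quality.

• Cinematic HD image quality: Vidu Q1 text-to-video and image-to-video support direct 1080P output; whether it is a grand sci-fi narrative or the subtle expressions of a character close-up, everything is rendered clearly;

In fact, in improving creators' productivity and creativity, Shengshu's Vidu has consistently led the world in technology and products: this time ...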