Multimodal Generation
First Confirmation That RL Can Teach 3D Models to Reason, with a Leap in Generation Quality Under Complex Text Descriptions
36Kr · 2026-02-27 02:33
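RL has already posted impressive results in image generation. What about 3D generation? As GRPO brings qualitative gains to large models in math and code reasoning, the research team is the first to offer an answer: the first study to systematically introduce reinforcement learning into text-to-3D autoregressive generation, which has been accepted to CVPR 2026. Rather than simply porting 2D experience, the work targets the challenges specific to 3D generation and carries out a complete, systematic exploration of reward design, algorithm choice, evaluation benchmarks, and training paradigms. The core tension: a 3D object has no "canonical view". Whether an image is right can be judged at a glance, but a 3D object must be evaluated from multiple views at once for geometric consistency, texture quality, and semantic alignment. If any one of these dimensions is poorly designed, training collapses. A deeper problem is that during autoregressive decoding in a 3D generation model, every token carries an implicit commitment to the overall structure. This long-range dependency makes reward sparsity more pronounced in 3D than in 2D: the model can hardly sense, midway through, where things went wrong. The team breaks the problem into four dimensions for systematic study: How should the reward model be designed, and which type of reward signal is most effective for 3D generation? How should the RL algorithm be chosen, and which GRPO variants suit the sequence characteristics of 3D? Why is 3D so much harder than 2D? RL works reliably for text and image generation, but moving it directly to 3D does not work. The most surprising finding: when a general-purpose large model (Qwen2.5-VL) evaluates 3D consistency, compared with specialized ...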
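To make the GRPO and multi-view evaluation ideas in this excerpt concrete, here is a minimal sketch of a group-relative advantage computation. It is not the paper's method: the excerpt does not say how per-view scores are combined, so the plain averaging in `multiview_reward`, the example scores, and all function names are assumptions made for illustration. Only the standardization of rewards within a sampled group reflects how GRPO is generally described.

```python
from statistics import mean, pstdev


def multiview_reward(view_scores):
    """Collapse per-view scores into one scalar reward.

    Assumption: a plain average. The excerpt does not specify the aggregation.
    """
    return mean(view_scores)


def group_relative_advantages(rewards, eps=1e-6):
    """GRPO-style advantages: standardize each reward within its sampled group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical scores: 4 sampled 3D outputs for one prompt, each rated from 3 views.
group = [
    [0.9, 0.8, 0.7],
    [0.4, 0.5, 0.3],
    [0.6, 0.6, 0.6],
    [0.2, 0.1, 0.3],
]
rewards = [multiview_reward(scores) for scores in group]
advantages = group_relative_advantages(rewards)
```

Samples scoring above the group mean receive positive advantages and those below receive negative ones, so no separate value network is needed. When every sample in a group gets the same reward, all advantages are zero and the group contributes no learning signal, which is one way the reward-sparsity concern mentioned above can show up in practice.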
ByteDance Releases Seedance 2.0; Real-Person Verification Required to Generate an Avatar
Sou Hu Cai Jing· 2026-02-12 19:57
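On February 12, ByteDance released its latest video generation model, Seedance 2.0, and its AI products Doubao and Jimeng announced that they have integrated it. Users can currently try the model in the Doubao app, desktop client, and web version, as well as in the Jimeng app and Jimeng web version. The Doubao app and Jimeng app support real people on camera: users must first complete real-person verification through audio and video recording before they can generate a digital avatar of themselves and use that avatar to generate AI videos. In the Doubao desktop client and web version and the Jimeng web version, the platforms explicitly state that uploading real human face material is not yet supported. According to the official technical report, Seedance 2.0 adopts a highly sparse architecture to improve training and inference efficiency. Built on a unified multimodal video generation architecture, the model shows strong emergent generalization: it can generate high-quality audio and video with synchronized sound and picture, and it also supports complex functions such as combined multimodal references, video editing, and video extension. In multi-dimensional evaluations covering multimodal-reference generation, complex audio-video instruction following, complex motion stability, professional cinematic language, audio-video expressiveness, and integrated audiovisual coordination, Seedance 2.0 performs at an industry-leading level. It shows significant improvements in motion stability, instruction following, and visual aesthetics; the complex motions it generates are smooth and detailed, and it supports professional-grade combined camera movement and control of narrative pacing. Text: Beijing Youth Daily reporter Wen Jing. Editor: Zhang Li. Seedance 2.0 can support image, video, audio ...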
Tsinghua-Affiliated Startup Secures the Largest Single Funding Round in China's Video Generation Sector
36Kr · 2026-02-05 08:50
Core Insights - The article highlights that Shengshu Technology has completed over 600 million RMB in A+ round financing, setting a record for the largest single financing in China's video generation sector [1] - The company aims to achieve over tenfold growth in users and revenue by 2025, with a global reach across more than 200 countries and regions [1] Financing Details - The financing round was led by Zhongguancun Science City and Xinglian Capital, with strategic investments from companies like Wanxing Technology, Visual China, and Tuolisi [1] - Shengshu Technology has completed a total of six financing rounds and one equity transfer, with notable investors including Huawei, Ant Group, and Baidu [7][9] Product Development - Shengshu Technology focuses on developing multimodal general large models and applications, providing video generation and multimodal generation products through various platforms [2] - The company is recognized as one of the earliest teams to research multimodal generation algorithms, having introduced the U-ViT architecture ahead of OpenAI's DiT [2] Model Performance - The Vidu Q3 model, aimed at professional film production, ranked first in China and second globally in a recent AI benchmark test, surpassing competitors like Runway Gen-4.5 and Google Veo3.1 [3] - Vidu Q3 supports features such as 16-second audio-visual synchronization, 1080P quality, and multilingual output [4] Market Presence - Vidu has established a strong presence in the film industry, covering over 90% of content providers and production institutions, with clients including Sony Pictures and Tencent Animation [7] - The company also serves clients in the internet and smart hardware sectors, including ByteDance and Samsung, focusing on content production and product interaction innovation [7] Competitive Landscape - The video generation sector remains highly competitive, with significant investments flowing into startups, while major companies like Kuaishou and Google are also expanding their influence [10] - The article emphasizes the need for startups to differentiate themselves in technology, application scenarios, or ecosystems to succeed in this competitive environment [10]
Professor Jun Zhu, Chief Scientist of Jinqiu Portfolio Company Shengshu Technology, Elected ACM Fellow | Jinqiu Spotlight
锦秋集· 2026-01-22 06:26
Core Insights - The article highlights the announcement of the 2025 ACM Fellow list, featuring notable scholars, including Professor Jun Zhu from Tsinghua University, recognized for his contributions to machine learning and Bayesian methods [2][11]. Group 1: ACM Fellow Announcement - The 2025 ACM Fellow list includes 19 Chinese scholars, accounting for approximately 27% of the total [6][14]. - The ACM Fellow designation is a prestigious honor, representing the top 1% of ACM members, with over 100,000 members globally [7][11]. - The contributions of the 2025 Fellows span various fields, including medical AI, computer graphics, data management, human-computer interaction, and robotics [12]. Group 2: Contributions of Notable Scholars - Jun Zhu is recognized for his work in probabilistic machine learning theories and methods, particularly in representation learning and sparse topic coding [103]. - Baoquan Chen from Peking University is acknowledged for his contributions to large-scale scene reconstruction and discrete geometry processing [20]. - Pei Cao, currently at YouTube, is honored for her advancements in network caching and search engine efficiency [15][19]. Group 3: Industry Implications - The article discusses the potential impact of video generation technology, with a focus on the U-ViT architecture developed by Shengshu Technology, which is expected to revolutionize content production by 2026 [4]. - The shift in focus from model breakthroughs to deeper integration into production scenarios is anticipated as the industry evolves [4].
"AI 100" List Opens for Submissions: The AI Product "Annual Gala" Must Go On | Quantum Bit Think Tank
量子位· 2026-01-15 08:53
Core Viewpoint - The article discusses the launch of the "AI 100" list by Quantum Bit Think Tank, aimed at recognizing and evaluating the most impactful AI products in China for 2025, highlighting the rapid evolution and potential of AI technologies in various sectors [4][12]. Group 1: AI 100 List Overview - The "AI 100" list is divided into three main categories: "Flagship AI 100," "Innovative AI 100," and the top three products in ten popular sub-sectors [6]. - The "Flagship AI 100" will focus on the strongest AI products of 2025, showcasing those that have achieved significant technological breakthroughs and practical application value [7]. - The "Innovative AI 100" aims to identify emerging products in 2025 that have the potential to lead industry changes in 2026, representing cutting-edge AI technology [8]. Group 2: Sub-sector Focus - The ten hottest sub-sectors for the top three products include AI browsers, AI agents, AI smart assistants, AI workstations, AI creation, AI education, AI healthcare, AI entertainment, Vibe Coding, and AI consumer hardware [9]. Group 3: Application and Evaluation Criteria - The evaluation of the "AI 100" list employs a dual assessment system combining quantitative and qualitative measures, focusing on user data and expert evaluations to ensure objectivity and accuracy [13]. - Quantitative metrics include user scale, growth, activity, and retention, with over 20 specific indicators such as total downloads and active user numbers [13]. - Qualitative assessments consider long-term development potential, including underlying technology, market space, functionality, monetization potential, team background, and growth speed [13].
"AI 100" List Opens for Submissions: The AI Product "Annual Gala" Must Go On | Quantum Bit Think Tank
量子位· 2026-01-14 08:10
Core Insights - The article discusses the emergence of numerous keywords in the AI product sector by 2025, highlighting transformative AI products that are leading the market [4] - The "AI 100" list by Quantum Bit Think Tank aims to evaluate and recognize the top AI products in China, reflecting the industry's evolution and future trends [4][12] Group 1: AI 100 List Overview - The "AI 100" list is divided into three main categories: "Flagship AI 100," "Innovative AI 100," and the top three products in ten popular sub-sectors [6] - The "Flagship AI 100" will focus on the strongest AI products of 2025, showcasing those that have achieved significant technological breakthroughs and practical application value [7] - The "Innovative AI 100" aims to identify products that are expected to emerge in 2026, representing cutting-edge AI technology and potential industry disruptors [8] Group 2: Sub-sector Focus - The ten hottest sub-sectors for the top three products include AI browsers, AI agents, AI smart assistants, AI workstations, AI creation, AI education, AI healthcare, AI entertainment, Vibe Coding, and AI consumer hardware [9] Group 3: Application and Evaluation - The evaluation of the "AI 100" list employs a dual assessment system combining quantitative and qualitative metrics, focusing on user data and long-term development potential [13] - Quantitative metrics include user scale, growth, activity, and retention, while qualitative metrics consider technology, market space, design, monetization potential, team background, and growth speed [13]
"AI 100" List Opens for Submissions: The AI Product "Annual Gala" Must Go On | Quantum Bit Think Tank
量子位· 2026-01-12 04:13
Core Insights - The article discusses the emergence of numerous keywords in the AI product sector by 2025, highlighting transformative AI products that are leading the market [4] - The "AI 100" list by Quantum Bit Think Tank aims to evaluate and recognize the top AI products in China, reflecting the industry's evolution and future trends [4][12] Group 1: AI 100 List Overview - The "AI 100" list is divided into three main categories: "Flagship AI 100," "Innovative AI 100," and the top three products in ten popular sub-sectors [6] - The "Flagship AI 100" will focus on the strongest AI products of 2025, showcasing those that have achieved significant technological breakthroughs and practical application value [7] - The "Innovative AI 100" aims to identify products that are expected to emerge in 2026, representing cutting-edge AI technology and potential industry disruptors [8] Group 2: Sub-sector Focus - The ten sub-sectors for the top three products include AI Browser, AI Agent, AI Smart Assistant, AI Workbench, AI Creation, AI Education, AI Healthcare, AI Entertainment, Vibe Coding, and AI Consumer Hardware [9] - This categorization is designed to provide a more precise reflection of development trends within each specific field [9] Group 3: Application and Evaluation Process - The application period for the "AI 100" list runs from now until January 15, 2026, with the results to be published in mid to late January 2026 [10] - The evaluation system combines quantitative and qualitative assessments, focusing on user data and expert evaluations to ensure objectivity and accuracy [13]
"AI 100" List Opens for Submissions: The AI Product "Annual Gala" Must Go On | Quantum Bit Think Tank
量子位· 2026-01-10 03:07
Core Insights - The article discusses the emergence of numerous keywords in the AI product sector by 2025, highlighting transformative AI products that are leading the market [4] - The "AI 100" list by Quantum Bit Think Tank aims to evaluate and recognize the top AI products in China, reflecting the industry's evolution and future trends [4][12] Group 1: AI 100 List Overview - The "AI 100" list is divided into three main categories: "Flagship AI 100," "Innovative AI 100," and the top three products in ten popular sub-sectors [6] - The "Flagship AI 100" will focus on the strongest AI products of 2025, showcasing those that have achieved significant technological breakthroughs and practical application value [7] - The "Innovative AI 100" aims to identify products that are expected to emerge in 2026, representing cutting-edge AI technology and potential industry disruptors [8] Group 2: Sub-sector Focus - The ten hottest sub-sectors for the top three products include AI browsers, AI agents, AI smart assistants, AI workstations, AI creation, AI education, AI healthcare, AI entertainment, Vibe Coding, and AI consumer hardware [9] Group 3: Application and Evaluation Criteria - The evaluation of the "AI 100" list employs a dual assessment system combining quantitative and qualitative measures, focusing on user data and expert evaluations [13] - Quantitative metrics include user scale, growth, activity, and retention, while qualitative assessments consider long-term potential, technology, market space, and user experience [13]
"AI 100" List Opens for Submissions: The AI Product "Annual Gala" Must Go On | Quantum Bit Think Tank
量子位· 2026-01-09 04:09
Core Insights - The article discusses the emergence of numerous keywords in the AI product sector in China by 2025, highlighting the rapid evolution and innovation in AI technologies [4] - The "AI 100" list by Quantum Bit Think Tank aims to evaluate and recognize the top AI products that represent China's AI capabilities [4][12] Group 1: AI 100 List Overview - The "AI 100" list is divided into three main categories: "Flagship AI 100," "Innovative AI 100," and the top three products in ten popular sub-sectors [6] - The "Flagship AI 100" will focus on the strongest AI products of 2025, showcasing those that have achieved significant technological breakthroughs and practical application value [7] - The "Innovative AI 100" aims to identify products that are expected to emerge in 2025 and have the potential to lead industry changes in 2026 [8] Group 2: Sub-sector Focus - The ten hottest sub-sectors for the top three products include AI browsers, AI agents, AI smart assistants, AI workstations, AI creation, AI education, AI healthcare, AI entertainment, Vibe Coding, and AI consumer hardware [9] Group 3: Application and Evaluation - The evaluation of the "AI 100" list employs a dual assessment system combining quantitative and qualitative measures, focusing on user data and expert evaluations [13] - Quantitative metrics include user scale, growth, activity, and retention, while qualitative assessments consider long-term potential, technology, market space, and user experience [13]
"AI 100" List Opens for Submissions: The AI Product "Annual Gala" Must Go On | Quantum Bit Think Tank
量子位· 2026-01-06 01:01
Core Insights - The article discusses the emergence of numerous keywords in the AI product sector by 2025, highlighting transformative AI products that are leading the market [4] - The "AI 100" list by Quantum Bit Think Tank aims to evaluate and recognize the top AI products in China, reflecting the industry's evolution and future trends [4][12] Group 1: AI 100 List Overview - The "AI 100" list is divided into three main categories: "Flagship AI 100," "Innovative AI 100," and the top three products in ten popular sub-sectors [6] - The "Flagship AI 100" will focus on the strongest AI products of 2025, showcasing those that have achieved significant technological breakthroughs and practical application value [7] - The "Innovative AI 100" aims to identify products that are expected to emerge in 2026, representing cutting-edge AI technology and potential industry disruptors [8] Group 2: Sub-sector Focus - The ten hottest sub-sectors for the top three products include AI Browser, AI Agent, AI Smart Assistant, AI Workbench, AI Creation, AI Education, AI Healthcare, AI Entertainment, Vibe Coding, and AI Consumer Hardware [9] - This targeted approach aims to provide a clearer picture of development trends within specific AI fields [9] Group 3: Application and Evaluation - The evaluation of the "AI 100" list employs a dual assessment system combining quantitative and qualitative metrics, focusing on user data and long-term development potential [13] - Quantitative metrics include user scale, growth, activity, and retention, while qualitative assessments consider technology, market space, design, monetization potential, and team background [13]