Diffusion (diffusion models)
Why Chinese Models Are Taking the Lead in AI Video
Hua Er Jie Jian Wen· 2026-02-11 04:25
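It was not until ByteDance's Seedance 2.0 broke into the mainstream that many people truly realized, for the first time, that Chinese models in the AI video race seem to be no longer just catching up but starting to run ahead.
Seedance 2.0 did not break out on the strength of a single stunning frame. It brought a subtler but deeper change: for the first time, AI video looks like an industrial product that can be delivered reliably. Multimodal input, automatic camera movement, and long-duration consistency, stacked together, mean that creators can avoid the pain of repeatedly re-rolling generations like a gacha draw and can instead advance a reusable production pipeline.
But rewind the timeline and it becomes clear that Chinese companies' lead in AI video did not appear suddenly. Even earlier, Chinese models had already opened a clear window of leadership in the field. For example, Kuaishou's Kling 2.0 (可灵 2.0), released in April of last year, posted a 367% win-loss ratio against Sora in text-to-video, led across the board in character consistency, generation stability, and reproducibility, and was the first to deliver commercially usable AI video production capability.
Stability matters enormously for AI video: whether characters stay consistent, whether the picture falls apart midway, and whether a result can be reproduced again and again. These are precisely the metrics that decide whether video can enter real production.
Since then, a group of Chinese companies has kept pushing along the same path. ByteDance has continued to strengthen narrative and shot logic within the Seedance system, while some smaller startup teams have even embedded video generation directly into workflows for e-commerce, advertising, and game user acquisition. These ...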
A Li Auto Paper Is Selected Among the 10 Most Recommended End-to-End Autonomous Driving Papers of the Past Six Months
理想TOP2· 2025-06-18 11:43
Core Viewpoint
- The article discusses the top 10 recommended papers in the field of end-to-end autonomous driving, highlighting the increasing presence of Li Auto in the discourse surrounding autonomous driving technology and research [2][20][22].

Group 1: Overview of Recommended Papers
- The article presents a list of 10 highly recommended papers in the end-to-end autonomous driving domain, compiled from interviews with leading researchers [22][26].
- The papers cover various innovative approaches, including reinforcement learning, vision-language models, and multimodal frameworks [27][29][35][40].

Group 2: Key Innovations and Technologies
- The paper "TransDiffuser" introduces an encoder-decoder model for trajectory generation, utilizing multimodal perception information to create diverse and high-quality trajectories [10][42].
- The diffusion model is highlighted for its ability to generate trajectories by learning from noise, significantly improving the model's performance in complex traffic environments [6][7][13][16].
- The architecture of TransDiffuser includes a scene encoder for processing multimodal data and a denoising decoder for trajectory generation [11][12][14].

Group 3: Performance Metrics and Results
- TransDiffuser achieved a Predictive Driver Model Score (PDMS) of 94.85 on the NAVSIM benchmark, outperforming existing methods [15][42].
- The model's efficiency is enhanced through the use of ordinary differential equations (ODE) sampling, allowing for rapid trajectory generation [7][13].

Group 4: Future Directions and Challenges
- The authors of the papers acknowledge challenges in fine-tuning models and suggest future work could involve integrating reinforcement learning and exploring models like OpenVLA [17][18].
- The article emphasizes the ongoing evolution in the field, with a shift towards more integrated and robust approaches to autonomous driving [70].
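
The idea summarized under Group 2, generating a trajectory by starting from noise and denoising it step by step, can be illustrated with a small toy sketch. This is not TransDiffuser's implementation. The linear noise schedule, the function names, and the flat waypoint layout are all assumptions made for illustration, and `oracle_noise_predictor` is a hypothetical stand-in for a trained denoising decoder: it is handed the target trajectory directly, which a real model would have to infer from scene features. The update rule is a DDIM-style deterministic step, one common way to discretize an ODE-based sampler.

```python
import math
import random


def make_alpha_bars(num_steps):
    """Assumed linear schedule: alpha_bar[0] = 1 (clean), alpha_bar[T] close to 0 (noise)."""
    return [1.0 - 0.999 * t / num_steps for t in range(num_steps + 1)]


def oracle_noise_predictor(x_t, alpha_bar_t, target):
    """Hypothetical stand-in for a trained network.

    Returns the noise that would turn `target` into `x_t` under the forward
    process x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps.
    """
    signal = math.sqrt(alpha_bar_t)
    noise = math.sqrt(1.0 - alpha_bar_t)
    return [(x - signal * x0) / noise for x, x0 in zip(x_t, target)]


def sample_trajectory(target, num_steps=20, seed=0):
    """Deterministic (DDIM-style) reverse pass from Gaussian noise to a trajectory."""
    rng = random.Random(seed)
    alpha_bars = make_alpha_bars(num_steps)
    # Start from pure noise with the same shape as the flattened trajectory.
    x = [rng.gauss(0.0, 1.0) for _ in target]
    for t in range(num_steps, 0, -1):
        a_t, a_prev = alpha_bars[t], alpha_bars[t - 1]
        eps = oracle_noise_predictor(x, a_t, target)
        # Estimate the clean trajectory implied by the predicted noise.
        x0_pred = [
            (xi - math.sqrt(1.0 - a_t) * e) / math.sqrt(a_t)
            for xi, e in zip(x, eps)
        ]
        # Re-noise the estimate to the previous (less noisy) level, without fresh noise.
        x = [
            math.sqrt(a_prev) * x0 + math.sqrt(1.0 - a_prev) * e
            for x0, e in zip(x0_pred, eps)
        ]
    return x


# Flattened (x, y) waypoints of a made-up target path.
target_waypoints = [0.0, 0.0, 1.0, 0.5, 2.0, 1.5]
generated = sample_trajectory(target_waypoints)
```

Because the oracle predicts the noise exactly, the sampler recovers the target up to floating-point error. With a learned predictor conditioned on scene encodings, different noise seeds would instead yield different plausible trajectories, which is the diversity the summary refers to.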