video generation
How OpenAI's Sora 2 goes beyond video generation.
Sequoia Capital· 2025-11-06 17:01
When you put enough compute and data into these systems in order to actually solve this task of predicting the next token, you need to develop an internal representation of how the world functions, right? You need to, like, simulate things. The models make lots of mistakes right now at, like, low compute scales. But as you continue pushing, you know, from 3 to 4 to 5, you just see these internal world models get more and more robust. And it's really analogous for video, right? And in many ways more explici ...
X @Elon Musk
Elon Musk· 2025-10-26 15:18
Grok Imagine video generation understands physics. Luis Batalha 🇵🇹🇺🇸 (@luismbat): Yesterday a friend asked how to spot AI-generated videos. I said: physics anomalies are the ultimate tell. But it’s getting harder. Here’s Grok Imagine: a ball and a feather, dropped in air vs. in a vacuum. https://t.co/yZFLik3hmP ...
X @Elon Musk
Elon Musk· 2025-10-18 18:59
Product Update - Grok Imagine video generation has been upgraded [1] - Updating the Grok app is recommended to get the latest version [1]
X @Elon Musk
Elon Musk· 2025-10-05 12:59
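Product Features - Grok offers instant text, image, and video generation [1] - Grok 4 offers the fastest text generation [1] - Grok Imagine Video completes video generation within 15 seconds [1] - Grok Imagine offers fast image generation [1]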
X @Elon Musk
Elon Musk· 2025-08-05 07:57
Product Features - Grok Imagine is recognized for its speed in generating images and videos [1] - The tool is also noted for its high production quality [1]
X @Elon Musk
Elon Musk· 2025-07-28 21:07
Product Update - Grok app is available for download with a subscription model [1] - Video generation beta and Valentine/Ani features are available for trial [1]
A whistle stop tour of AI creation with Paige Bailey
Google DeepMind· 2025-07-10 13:06
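Gemini Model Progress and Features
- Google DeepMind released the upgraded Veo 3 model, which improves markedly on both visuals and audio and can generate more realistic, more immersive video content [1][2]
- Veo 3 introduces prompt rewriting, which refines the user's input prompt so that it is more detailed and closer to what the user has in mind, improving the quality of the generated video [1]
- Clips generated by Veo 3 are typically 8 seconds long, which is intended to leave ample room for creative control in the public version; longer internal versions also exist [2]
- Gemini models can output text, code, images, and audio, and can edit images and control audio, because multiple modalities are integrated into a single model rather than stitched together from separate models [3]
- Gemini models are trained on multimodal data that combines video, audio, and detailed frame-level descriptions, which lets them generate more natural, more realistic sound and responses [3]
Gemini in AI Studio and Flow
- AI Studio provides an experimental platform where users can try the latest Gemini models, including text-to-speech that generates audio in different emotions and languages [5][12]
- Flow is a professional filmmaking tool developed by the Google Labs team; it provides a dedicated development environment where users can stitch video clips together, control the camera, and perform other advanced editing [3][4]
- The Gemini Live feature in AI Studio, combined with Project Astra's real-time visual understanding, can analyze on-screen content in real time and provide relevant information [14][16]
Gemini's Potential in Application Development
- AI Studio offers a new build feature that lets even users with no programming experience build applications with Gemini models; the generated code is optimized for the latest SDK [28][29]
- Applications created with the build feature can be deployed directly to Cloud Run, which makes them easy to share with others [39][40]
- Gemini models can help developers focus on building and ideating product experiences instead of spending large amounts of time on code maintenance and upgrades [42][44]
Safety and Ethical Considerations
- Veo includes safety filters to prevent the generation of inappropriate content, such as images involving children or specific public figures [20][21]
- Videos generated through the Gemini app carry a dedicated watermark indicating that they are AI-generated, which reduces the risk of deepfakes and scams [20][21]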
Snap CEO Evan Spiegel on using AI to build image and video models
CNBC Television· 2025-07-09 16:33
AI Strategy & Competitive Advantage
- Snapchat focuses on image, video, and 3D generation, leveraging its camera-centric platform where users create billions of snaps daily [1]
- Building small, on-device image and video models provides a competitive edge by enabling technology deployment to hundreds of millions of users without server-side costs [2]
- The company is exploring AI user interfaces, particularly within the context of its Spectacles glasses [2]
Future of AI & User Experience
- AI is currently constrained to text boxes, but its potential will be unlocked when it becomes contextually aware and integrated into the real world [4]
- The evolution of AI is compared to the early days of social media, where the true potential wasn't realized until the advent of the smartphone [3]