Veo 3
The First AI-Generated Mealtime Variety Show Hooks 7 Million Viewers
36Kr· 2025-11-10 04:11
For some netizens, awareness and acceptance of AI-made video still stops at "Zhen Huan eating a hamburger," the clip that swept the internet two years ago, consumed as a fragmentary "guichu" remix novelty. Recently, a nearly seven-minute "pure AI variety show" appeared on Bilibili, depicting how six chefs from around the world turn a mosasaur, extinct for 65 million years, into six dishes; it has drawn more than 7 million views. Some viewers never realized it was AI-made at all, assuming the American culinary competition reality show Hell's Kitchen had put out a sequel. The show's biggest implication for the industry: could long-form AI video infused with an author's own perspective become an upgraded content track for platform uploaders and creators? Before this, reportedly 50% of netizens were "averse" to AI-made content.

A tech variety show that fooled 7 million netizens

The mosasaur was a marine reptile, nicknamed the "prehistoric aquatic lizard," ferocious enough at times to swallow its own kind. In this AI show, titled "Turning the Ancient Mosasaur into Six Dishes," the promisingly meaty mosasaur is cooked by head chefs from six countries, each preparing a hometown dish for a culinary contest to vie for the black jacket.

Image | Screenshot from the show

The first two episodes, released in advance, feature four chefs. A chef from India prepares a "pork-intestine curry" with tomato and yogurt and is eliminated for being "too mushy." The chefs from China, Japan, and South Korea in Northeast Asia serve up, respectively, the Shanghai fusion dish "dinosaur braised pork," "kimchi-stewed mosasaur tail," and "mosasaur sashimi plus potato ...
OpenAI Launches Sora for Android: Rolling Out in Multiple Markets, Invite Code Still Required
36Kr· 2025-11-05 01:59
Core Insights
- OpenAI's AI video generation app Sora has officially launched on the Android platform, expanding its reach to major markets including the US, Canada, Japan, South Korea, Thailand, and Vietnam [1][3]
- Sora was initially released on iOS at the end of September, achieving over 1 million downloads within five days and maintaining a top position in the App Store [3]
- The launch of the Android version signifies OpenAI's comprehensive strategy in the mobile sector, providing a more accessible AI video creation tool for users [3]

Product Features
- Sora allows users to generate complete videos from text prompts or images, with automatic voiceovers [6]
- Key functionalities include:
  - Instant video generation: users can create videos with sound effects in seconds [6]
  - "Cameos" feature: users can embed their own or others' likenesses into videos [7]
  - Diverse style options: supports various visual styles including cinematic, animated, realistic, cartoonish, and hyperreal [8]
  - Creation and interaction: users can remix others' works, altering characters and storylines [9]
  - Community sharing: supports browsing and sharing AI-generated videos in a social media-like feed [10]

Market Position and Competition
- Sora aims to compete with TikTok and Instagram by integrating generative video technology with social interaction [4]
- The app's video generation speed and quality are noted to be competitive with Google's Veo 3, enhancing its potential for viral sharing [4]

Future Developments
- OpenAI plans to introduce enhanced editing features for Sora, including the ability to splice multiple clips and adjust visual themes [11]
- Upcoming features include "Character Cameos," allowing users to create anthropomorphized videos featuring pets or inanimate objects [11]
- Future versions will also support customized feeds, prioritizing content from followed creators over trending videos [11]

Regulatory and Legal Considerations
- Sora faced criticism for generating inappropriate videos involving historical figures, leading to the suspension of certain content generation features and the implementation of deepfake detection mechanisms [11]
- The app's content policy has shifted to an "opt-in" model for copyright holders, requiring explicit permission for generating related content [11]
- Legal disputes have arisen over the "Cameo" feature's name, which resembles that of an existing video greeting service; the case is still ongoing [11]
Google's First AI Ad Avoids the Uncanny Valley by Casting a Turkey
WSJ· 2025-10-31 10:00
Core Insights
- The search giant has become the largest entity to create an advertisement entirely using its Veo 3 and other artificial intelligence tools [1]

Group 1
- The company is leveraging advanced AI technology to innovate in advertising [1]
- This move signifies a growing trend in the industry towards automation and AI-driven content creation [1]
- The use of AI tools like Veo 3 may set a precedent for future advertising strategies across various sectors [1]
How Does AIGC "Break Boundaries"? Industry Leaders Break Down the Playbook for Going Global, from Model Capabilities to Business Growth
Sou Hu Cai Jing· 2025-10-28 11:06
Core Insights
- The rapid development of AI technology is reshaping global industry dynamics, evolving from an "auxiliary tool" to a "core engine" driving business growth, particularly through AIGC [2]
- The focus of the upcoming closed-door conference "Fusion Without Boundaries: New Pathways for AIGC Going Global" is the deep application of AIGC in cross-border scenarios, addressing compliance, payment technology adaptation, and content localization [2]
- The Vidu model by Shengshu Technology demonstrates advanced capabilities in multimodal generation, achieving significant breakthroughs in video generation, including features like video extension and emotional rendering [6][9]

AI and Video Generation
- The emergence of multimodal generative models, particularly in video generation, is leading to a transformative shift in social media interactions, moving from "few creators" to "everyone co-creating" [4]
- Shengshu Technology's Vidu model supports various forms of video generation, significantly lowering content creation barriers by allowing users to generate coherent videos from multiple images [9]
- The competitive landscape in video generation is intense, with around 10 leading companies continuously iterating their models; Shengshu Technology holds a significant market share in niche areas like comic production [11]

AI in Content Creation
- AI is fundamentally changing work processes, enhancing productivity while still requiring human oversight for creative aspects, as seen in the case of TVB's AI drama [13]
- AI enables individuals without specialized skills to quickly engage in content creation, reducing costs and entry barriers, but ultimate success still relies on core content production capabilities [13]

Cross-Border Payment Challenges
- Cross-border payment processes are complex, with varying consumer preferences across countries impacting conversion rates and necessitating localized payment experiences [23]
- Tax and compliance risks are significant, with over 80 countries imposing VAT or GST on digital goods, leading to potential legal and financial repercussions for non-compliance [25]
- FastSpring's merchant-of-record model alleviates the burden of compliance and risk management for businesses, allowing them to focus on product and market strategies [30]
A16Z's Latest Insights: Video Models Shift from Breakneck Growth to Divergence, and Productization Is the Next Opportunity
36Kr· 2025-10-28 00:18
Core Insights
- The video generation model industry is transitioning from a phase of rapid performance improvement to a "product era," focusing on diversity and specialization rather than just model parameters and benchmark scores [2][4][12]
- There is a growing realization that no single model can dominate all video generation tasks, leading to a trend of specialization where different models excel in specific areas [4][11][12]
- The need for better integrated products to simplify the creative process is becoming increasingly apparent, as many creators still rely on multiple tools to achieve their desired outcomes [13][15][16]

Group 1: Industry Trends
- The pace of progress in video generation models has slowed, with most mainstream models now capable of generating impressive 10-15 second videos with synchronized audio [1][6]
- The concept of a "superior model" in the video domain is being challenged, as recent releases like Sora 2 have not consistently outperformed predecessors like Veo 3 [4][11]
- The industry is witnessing a shift towards models tailored for specific capabilities, such as physical simulation and multi-shot editing, rather than one-size-fits-all solutions [2][11][12]

Group 2: Product Development
- The current landscape shows that while video generation capabilities have improved, the corresponding product development has not kept pace, leading to a gap in user experience and creative efficiency [13][15]
- Companies are beginning to address this gap by developing tools that allow users to modify video elements more intuitively, such as Runway's suite of tools and OpenAI's Sora Storyboard [15][16]
- The future is expected to see more specialized models for specific industries or scenarios, along with comprehensive creative toolkits that integrate various media elements into a cohesive workflow [16]
A Batch of New Models Debuts, Several Robotics Technologies Go Open Source, and More Recent AI News...
红杉汇· 2025-10-17 00:04
Group 1
- The emergence of large language models (LLMs) has significantly advanced the automation of scientific discovery, with AI Scientist systems leading the exploration [5][6]
- Current AI Scientist systems often lack clear scientific goals, resulting in research outputs that may seem immature and lack true scientific value [5]
- A new AI Scientist system, DeepScientist, has achieved research progress equivalent to three years of human effort in just two weeks, demonstrating its capability across various fields [6]

Group 2
- OpenAI recently held a developer conference with around 1,500 attendees and tens of thousands of online viewers, showcasing its achievements and new tools [8]
- OpenAI's platform has attracted 4 million developers, with ChatGPT reaching 800 million weekly active users and processing nearly 6 billion tokens per minute [8]
- New tools and models were introduced, including the Apps SDK and AgentKit, enhancing the capabilities of ChatGPT and facilitating rapid prototyping for developers [8]

Group 3
- The latest version of the image generation model, Hunyuan Image 3.0, has topped the LMArena leaderboard, outperforming 26 other models [11][12]
- Hunyuan Image 3.0 is the largest open-source image generation model, with 80 billion parameters and 64 expert networks, showcasing advanced capabilities in knowledge reasoning and aesthetic performance [12]

Group 4
- NVIDIA has open-sourced several key technologies at the Conference on Robot Learning, including the Newton physics engine and the GR00T reasoning model, aimed at addressing challenges in robot development [13][15]
- These technologies are expected to significantly shorten the robot development cycle and accelerate the implementation of new technologies [15]

Group 5
- The newly released GLM-4.6 model has 355 billion total parameters and a context window expanded to 200,000 tokens, enhancing its performance across various tasks [16]
- GLM-4.6 has achieved over 30% improvement in token efficiency and a 27% increase in coding capabilities compared to its predecessor, making it one of the strongest coding models available [16]

Group 6
- Anthropic has launched Claude Sonnet 4.5, which excels in programming accuracy and maintains stability during complex tasks, outperforming previous models [20][22]
- Claude Sonnet 4.5 achieved an 82.0% accuracy rate on the SWE-bench Verified benchmark, surpassing competitors and emphasizing its alignment and safety features [22]

Group 7
- DeepMind's new video model, Veo 3, demonstrates zero-shot learning capabilities, allowing it to perform complex visual tasks without prior training [24][28]
- Veo 3's understanding of physical laws and abstract relationships indicates its potential to evolve into a foundational visual model, similar to LLMs [28]
X @Demis Hassabis
Demis Hassabis· 2025-10-15 16:44
Product Upgrade
- Google DeepMind has released Veo 3.1, a major upgrade to the Veo video model [1][2]
- Veo 3.1 adds enhanced realism, richer audio, scene extension, better narrative control, and more precise editing capabilities [1]
- Veo 3.1 ships with improved creative controls for filmmakers, storytellers, and developers, many of them with audio support [2]

Product Features
- Veo 3.1 is a state-of-the-art video model [1]
Veo 3.1 and more artistic control in Flow
Google DeepMind· 2025-10-15 15:56
We’re introducing Veo 3.1, which brings richer audio, more narrative control, and enhanced realism that captures true-to-life textures. Veo 3.1 is state-of-the-art and builds on Veo 3, with stronger prompt adherence and improved audiovisual quality when turning images into videos. We’re also introducing new capabilities, and bringing audio to existing capabilities for the first time. Learn more: https://blog.google/technology/ai/veo-updates-flow
The Internet's Last Batch of "Easy-Money" Bloggers Have Also Been Squeezed Out of Work by AI
创业邦· 2025-10-14 03:12
Core Viewpoint
- The article discusses the rising trend of AI-generated ASMR videos as a means for individuals to escape from the pressures of modern life, highlighting the emotional and psychological impacts of these videos on viewers [5][14][21]

Group 1: AI-Generated ASMR Videos
- AI-generated ASMR videos have evolved significantly, offering surreal experiences that allow viewers to escape reality, such as videos featuring the cutting of glass or diamonds [6][7]
- These videos provide a sense of control and order, contrasting with the chaos of everyday life, and cater to the growing demand for personalized and immersive content [19][20]
- The technology behind creating these videos has become more accessible, with tools like Google's Veo 3 enabling creators to produce high-quality content with minimal investment [24][28]

Group 2: Psychological Impact
- The use of AI-generated content serves as a coping mechanism for individuals facing anxiety and stress, allowing them to temporarily disconnect from their real-life challenges [12][14]
- Viewers often find comfort in the predictability and perfection of AI-generated environments, which fulfill their desire for a safe and controlled space [19][30]
- However, there is a concern that reliance on these videos may lead to a diminished ability to find joy in everyday experiences, as the threshold for pleasure increases with exposure to more stimulating content [35][36]

Group 3: Commercial Implications
- The rise of AI-generated ASMR videos has created a new value chain in the content industry, where platforms benefit from increased user engagement and creators can quickly produce content to capture audience attention [35]
- This trend reflects a shift in consumer behavior towards seeking quick and effective ways to alleviate stress, aligning with the fast-paced nature of modern life [21][30]
- The low-cost production of these videos has led to a rapid proliferation of content, catering to the insatiable curiosity of viewers and creating a unique community culture around these experiences [30][35]
Musk's xAI Joins the "World Model" Race: Will "Vision Models" Be the Next "Large Language Models"?
硬AI· 2025-10-13 14:23
Core Viewpoint
- The competition in artificial intelligence is shifting from text-based models to "world models" that can understand and simulate the physical world, with xAI entering the race alongside major players like Google and Meta [2][3][4]

Group 1: xAI's Entry and Strategy
- xAI has hired AI experts from Nvidia to focus on the development of world models, which are trained on vast amounts of video and robotic data to understand physical laws [3][6]
- The first commercial application planned by xAI for world models is in the gaming sector, aimed at generating interactive 3D environments, signaling a clear path for commercialization [6][8]
- xAI is actively recruiting for its "omni team" with salaries ranging from $180,000 to $440,000, indicating a strong commitment to creating advanced AI experiences beyond text [8]

Group 2: Advancements in Video Models
- Google researchers predict that future video models will become as intelligent as language models, showcasing the potential for significant advancements in AI capabilities [4][11]
- Video models are beginning to unlock surprising abilities through "next frame prediction," similar to how language models learn additional skills through simple tasks (a toy illustration of this objective follows after this item) [11][14]
- The development of smarter video models could lead to the creation of highly capable robotic agents, enhancing the interaction between AI and the physical world [15]

Group 3: Challenges and Industry Perspectives
- Despite the promising outlook, the path to world models is fraught with challenges, particularly the high costs associated with acquiring and processing sufficient training data [17]
- Industry experts express skepticism about AI's ability to address fundamental issues in gaming, emphasizing the need for leadership and vision rather than just technical advancements [17]
- The entry of xAI into the world model competition adds momentum to the shift from digital information processing to the simulation of, and interaction with, complex physical realities [18]
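To make the "next frame prediction" point above concrete, here is a minimal toy sketch of the objective, assuming a PyTorch setup: a small model is trained to predict frame t+1 from the preceding frames, just as a language model is trained to predict the next token. The TinyFramePredictor module, tensor shapes, and random clips are hypothetical placeholders for illustration only, not the architecture or training pipeline of Veo 3, Sora, or any xAI system.

```python
# Toy next-frame-prediction objective (illustrative placeholder, not any real model).
import torch
import torch.nn as nn

class TinyFramePredictor(nn.Module):
    """Predicts the next video frame from a short window of past frames."""
    def __init__(self, frames_in: int = 4, channels: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(frames_in * channels, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(64, channels, kernel_size=3, padding=1),
        )

    def forward(self, past_frames: torch.Tensor) -> torch.Tensor:
        # past_frames: (batch, frames_in * channels, H, W) -> predicted next frame
        return self.net(past_frames)

model = TinyFramePredictor()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Fake clip batch: 8 clips of 5 frames each (4 context frames + 1 target), 3x32x32 pixels.
clips = torch.rand(8, 5, 3, 32, 32)
context = clips[:, :4].flatten(1, 2)   # (8, 12, 32, 32): stack context frames along channels
target = clips[:, 4]                   # (8, 3, 32, 32): the frame to predict

prediction = model(context)
loss = loss_fn(prediction, target)     # "predict the next frame" training signal
loss.backward()
optimizer.step()
print(f"toy next-frame loss: {loss.item():.4f}")
```

The bet described in the article is that scaling this same objective to internet-scale video forces a model to internalize physics and object permanence, much as next-token prediction led language models to acquire skills well beyond the task itself.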