Kling 2.1 (可灵2.1)
Too Early to Talk About an "AI Douyin": Sora 2 and Its Peers Will Change the Film and TV Industry First
Hu Xiu· 2025-10-04 01:01
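The National Day holiday had barely begun when Sora 2 set the entire AI community alight. The new video model renders the real world more accurately, offers stronger controllability, can create complex audio, and can easily insert real-world people and objects into AI-generated video, pulling off a demanding cameo that would be very hard to stage in reality.
As a result, over the past two days we have seen a flood of AI videos starring OpenAI CEO Sam Altman: he chats with Rick and Morty, knocks over a bathtub full of colorful bubble balls with his colleagues, and joins a discussion on a podcast. These clips were posted on the Sora App, which launched alongside Sora 2, and they are seen as a sign that an "AI Douyin" is on its way.
Many people believe that cameos will spur people to create more AI videos. In our view, however, even with a full content recommendation feature, today's Sora App is still essentially a tool rather than a platform. It belongs in the same category as Higgsfield, another AI video generation product that has recently taken off: both use AI to offer a more advanced kind of filter that triggers people's urge to follow trends and imitate.
The gains in model capability that Sora 2 brings are more likely to speed up adoption on the To B side, drive technical upgrades across the whole video large-model industry, and let AI better serve people who have the urge to create. We do not know when a To C "AI Douyin" will appear, or what business model it will bring, but we can ...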
Hands-On With Kling AI's New Video Model: The Action Scenes It Generates Are Legendary.
数字生命卡兹克· 2025-09-22 01:33
Core Viewpoint - The article reviews Kling 2.5 (可灵 2.5), an AI video generation model, highlighting its significant gains in motion and acting performance over its predecessor, Kling 2.1, and its potential to give young creators more creative freedom [1][54].
Group 1: Motion Evolution
- Kling 2.5 substantially improves motion, moving seamlessly between complex actions such as falling, running, and riding a motorcycle with a high level of realism [2][5].
- The model generates dynamic, fluid movement in scenarios such as parkour and sports, with results comparable to professional films [10][18][20].
- By contrast, Kling 2.1 struggled to keep interactions with the environment realistic, often producing disjointed or implausible movement [6][12].
Group 2: Performance Evolution
- Kling 2.5 is markedly more accurate in emotional expression and character acting, allowing nuanced portrayals of complex emotions [29][45].
- The model conveys subtle emotional transitions, such as a character's shift from anger to calm, which Kling 2.1 handled less successfully [29][42].
- The range of emotional expression it can generate is much wider, making character interactions more relatable and engaging [35][50].
Group 3: Overall Improvements
- Beyond motion and acting, the update improves the model's grasp of context and detail, addressing earlier limitations in generating coherent narratives [54][56].
- Stronger text-to-video capabilities let creators generate content from minimal input, giving them greater creative freedom [55][57].
All You Can Eat! A Collection of the Nano Banana Tricks That Won Master Zang (藏师傅) a Flood of Followers, Part 02
歸藏的AI工具箱· 2025-09-05 09:12
Core Insights - The article discusses the rising popularity of Nano Banana, highlighting its widespread use and the innovative applications users are exploring [1][3].
Group 1: AI Applications
- The article shows how to create AI-generated dance videos using calligraphy as a reference, showcasing the creative potential of Nano Banana [4][10].
- It details the process of converting architectural floor plans into 3D renderings, emphasizing the versatility of Nano Banana in architectural visualization [17][20].
- It explains how to generate exaggerated visual effects for video thumbnails, using creative imagery to boost engagement [33][35].
Group 2: User Engagement and Community
- The article notes a significant increase in user engagement across platforms such as Twitter, Xiaohongshu, and Douyin, indicating a growing community around Nano Banana [1].
- It highlights the collaborative nature of that community, where users share tutorials and innovative uses of Nano Banana, fostering a culture of creativity and experimentation [1][3].
Group 3: Technical Guidance
- The article gives detailed instructions for generating videos with specific AI models, emphasizing the importance of prompt engineering for getting the desired result [12][16].
- It outlines the steps for creating 3D models from 2D images, showcasing Nano Banana's ability to transform visual content [24][30].
- It discusses combining Nano Banana with other software tools, pointing to a trend toward multi-software workflows in creative projects [28][32].
One-Click, Stunning Cinematic Transitions Straight From AI: I Really Can Uninstall My PR (Premiere Pro) Now.
数字生命卡兹克· 2025-08-21 13:48
Core Viewpoint - The article discusses advances in AI video generation, focusing on a new feature in Kling 2.1 (可灵 2.1): "head and tail frames", that is, user-supplied first and last frames, for stronger video effects and storytelling [5][10][40].
Group 1: AI Video Generation Features
- Kling 2.1 introduces head and tail frame functionality, letting users set precise start and end points for a generated video and giving them more control over visual style and narrative [5][10][11].
- A comparison between Kling 1.6 and 2.1 shows significant improvements in motion dynamics and visual quality, with 2.1 producing more fluid and impactful video [7][9][40].
- The article highlights the role of head and tail frames in storytelling, where visual cues carry an emotional narrative [11][12][14].
Group 2: Applications and Creative Possibilities
- The head and tail frame feature applies to many kinds of video, from cinematic productions to simple effects that everyday users can create [18][19].
- Creative examples include turning ordinary scenes into striking transitions, such as a car morphing into a Transformer or a sketch growing into a skyscraper [21][27][29].
- The article stresses that the potential of Kling 2.1 depends on the user's imagination, because the technology simplifies the creation of complex visual effects [19][37].
Group 3: Technical Insights and Tips
- For best results, the article suggests that the motion between the head and tail frames should be dynamic and engaging, which strengthens the overall impact of the video [38][40].
- The AI's ability to keep the output consistent and stable is highlighted as a key factor in achieving high-quality results [44].
- Users are encouraged to experiment with the technology, which narrows the gap between imagination and reality in video creation [44].
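Kling AI's Technology Department Changes Leadership; Unitree Robot "Hit-and-Run" Becomes a Trending Topic; G.E.M. (邓紫棋) Reveals a 10x Return on Her AI Company Investment | AI Weekly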
AI前线· 2025-08-17 05:33
Group 1
- The first humanoid robot sports event took place on August 14, featuring 280 teams from 16 countries and showcasing the capabilities of humanoid robots in various competitions [3][4].
- The Unitree H1 robot won the 1500 meters race with a time of 6:34.40, taking the first gold medal of the event [3].
- The TianGong robot team lost to Unitree in both the 1500 meters and 400 meters races, and TianGong's CTO expressed a desire to learn from Unitree's performance [3][4].
Group 2
- A corruption scandal involving DeepSeek's parent company has emerged, revealing that over 1.18 billion yuan was illicitly obtained through a kickback scheme over six years [8][9].
- Reports indicate that DeepSeek's next-generation model, R2, will not be released in August as previously speculated, with the focus instead on iterative improvements to existing products [10].
- The company has faced challenges due to supply chain issues related to AI chips, which have affected its development timeline [10].
Group 3
- Manus faces the potential forced withdrawal of a $75 million investment from Benchmark due to regulatory scrutiny over compliance with U.S. investment restrictions on Chinese AI firms [11].
- The company has shifted its focus from domestic expansion to international markets, particularly Singapore, following the investment controversy [11][12].
Group 4
- Kuaishou announced a leadership change in its AI division, with Gai Kun taking over the technical department amid rumors that the previous head is leaving [12][13].
- The CEO of Laifen publicly criticized a former employee over product performance comparisons, pointing to internal conflicts and challenges to the company's public image [14].
Group 5
- OpenAI employees are seeking to sell approximately $6 billion in stock at a valuation of $500 billion, indicating strong investor interest despite the company's current losses [15].
- The company is also exploring advertising as a revenue stream while maintaining a focus on subscription growth [38].
Group 6
- Cai Jingxian, Alibaba's "Sweeping Monk" (扫地僧) and the first programmer for Taobao, has reportedly left the company, marking a significant personnel change [17][18].
- G.E. has launched a new open-source platform for robotics, aiming to integrate various aspects of robot control and learning [36].
Group 7
- The National Data Bureau reported a dramatic increase in daily token consumption in AI applications, reflecting rapid growth in the sector [30].
- Alibaba's international platform has gained popularity with its AI agent, prompting plans to expand capacity to meet increased demand [31].
Tencent Research Institute: Top 50 AI Keywords of the Week
腾讯研究院· 2025-05-30 18:51
Group 1: Key Trends in AI
- The article highlights the emergence of various AI models and applications, indicating a rapid evolution in the AI landscape, with significant contributions from companies like Google, OpenAI, and Tencent [2][3].
- Notable advancements include the release of new models such as QwenLong-L1-32B by Alibaba and the introduction of the RLVR paradigm by Claude, showcasing the competitive nature of AI development [2][3].
- The article also emphasizes the importance of AI applications across different sectors, including updates to existing products and the launch of innovative tools like AI Scientist and real-time camera features [2][3].
Group 2: Corporate Activities and Acquisitions
- The acquisition of Informatica by Salesforce is mentioned, reflecting ongoing consolidation in the tech industry as companies seek to enhance their AI capabilities [3].
- The article notes the merger of Haiguang Information with Zhongke Shuguang, indicating strategic moves to bolster computational power and resources in the AI sector [2].
Group 3: Industry Perspectives
- Insights from industry leaders suggest a transformative shift in AI platforms, with Google and Anthropic providing perspectives on automation in white-collar jobs and the growth logic of AI products [3].
- The article discusses the implications of AI for employment, with NVIDIA offering recommendations for adapting to a job landscape changed by AI advancements [3].
Tencent Research Institute AI Express 20250530
腾讯研究院· 2025-05-29 15:55
Group 1: DeepSeek-R1 and AI Developments
- The new version of DeepSeek-R1 has been officially open-sourced, surpassing Claude 4 Sonnet in programming capabilities and performing comparably to o4-mini (Medium) [1].
- DeepSeek-R1's core advantages include deep reasoning capabilities, natural text generation, and support for long-duration thinking of 30-60 minutes, allowing for the execution of complex code in a single run [1].
- Tencent has integrated multiple products with the latest DeepSeek R1 model within a day, offering users free and unlimited access to the model [3].
Group 2: Kling 2.1 Launch
- Kling 2.1 has been launched with a price reduction of 65%, featuring improved performance and speed, and comes in standard, high-quality, and master versions [2].
- The high-quality version (35 inspiration points) matches the old master version in quality and supports 1080P video, but only for image-to-video generation [2].
- The new version significantly enhances cost-effectiveness, making AI video creation more accessible for ordinary users [2].
Group 3: Opera Neon Browser
- Opera has introduced Opera Neon, the first "AI Agent" browser, aiming to redefine the role of browsers in the network [4].
- Opera Neon consists of three main features: Neon Chat (chatting), Neon Do (executing web tasks), and Neon Make (complex creation), which can understand user intent and convert it into actions [4].
- The Neon Make feature utilizes cloud technology to execute complex tasks, such as generating reports and designing game prototypes, even while the user is offline [4].
Group 4: VAST's Tripo Studio Upgrade
- VAST has upgraded Tripo Studio with four core functionalities: intelligent component segmentation, texture magic brush, intelligent low-poly generation, and automatic rigging for all objects [5].
- Intelligent component segmentation allows for one-click disassembly, accurately identifying different parts of a model [5].
- The automatic rigging feature can recognize various biomechanical characteristics and quickly allocate skeletal weights, enabling non-professionals to complete the entire 3D creation process with over a tenfold efficiency increase [5].
Group 5: Odyssey's World Model
- Odyssey, founded by autonomous driving experts, has launched a world model capable of real-time video generation at 40 milliseconds per frame, supporting real-time interaction [6].
- This technology differs from traditional video models by learning pixel and motion data from real-life videos, using a narrow distribution model architecture to address autoregressive modeling challenges [6].
- Odyssey has secured $27 million in funding, with the current preview version supported by H100 GPU clusters, outputting 30 FPS for 5-minute coherent interactive videos [6].
Group 6: AI Scientist Zochi
- The AI scientist Zochi's paper has been accepted by the top-tier conference ACL, marking it as the first AI system to independently pass peer review at an A* level conference [7].
- Zochi's paper demonstrates a multi-round attack method with a success rate of 100% on GPT-3.5 and 97% on GPT-4 [7].
- Zochi can autonomously complete the scientific research process from literature analysis to peer review, although its company has faced criticism regarding the misuse of the scientific peer review process [7].
Group 7: Wanda 2.0 Robot
- Youliqi has launched the Wanda 2.0 wheeled dual-arm robot, priced from 88,000 yuan, capable of autonomously completing complex long-sequence tasks [8].
- Wanda 2.0 is equipped with a pre-trained multimodal large model UniTouch and a long-sequence task planning model UniCortex, learning new actions with only 5-10 demonstrations [8].
- Youliqi has reduced costs by 70% through full-stack self-research, targeting the C-end and small B customer market, and has completed several hundred million yuan in financing [8].
Group 8: Boston Dynamics Atlas Robot
- Boston Dynamics has upgraded the Atlas robot, which now features 3D spatial perception and real-time object tracking capabilities, allowing it to perform complex industrial tasks in automotive factories [9].
- The core technology includes a 2D object detection system, 3D spatial positioning based on key points, and a SuperTracker object pose tracking system, capable of handling object occlusion and positional changes [9].
- The system integrates kinematic data, visual data, and force feedback to estimate poses accurately, with the team working on building a unified foundational model to enhance perception and action integration [9].
Group 9: Google CEO's Perspective on AI
- Google CEO Pichai believes AI represents a platform-level transformation larger than the internet, entering a phase where research is becoming reality [10].
- AI is transitioning into the second stage of building usable products, with search evolving into an agent that can execute tasks on behalf of users, potentially creating Web 2.0-level killer applications [10].
- The key transformation brought by AI lies in the change of interaction methods and the lowering of creative barriers, with the third stage involving the integration of AI with the physical world to form universal robotic systems [10].
Kling 2.1 Just Launched: 65% Cheaper, Faster, More Obedient, and More Powerful.
数字生命卡兹克· 2025-05-29 03:42
Core Insights
- The launch of Kling 2.1 introduces significant improvements in effectiveness, speed, and pricing, making it a compelling option for users [1][27].
- Kling 2.1 offers three distinct models: Standard, High Quality, and Master, catering to different user needs and budgets [10][28].
Pricing and Value
- The pricing structure has been adjusted, with the High Quality version of Kling 2.1 being 65% cheaper than the previous Master version, making it more accessible for everyday users [10][27].
- The Standard version is priced at 20 inspiration points for 720P, the High Quality version at 35 inspiration points for 1080P, and the Master version at 100 inspiration points for high-end cinematic effects [10][28].
Performance Comparison
- Kling 2.1 High Quality and Master versions outperform previous models in terms of visual quality and dynamic motion, with the Master version providing superior results for professional-grade projects [27][28].
- Speed tests indicate that Kling 2.1 performs comparably to Kling 1.6, with both completing tasks in under one minute, while the Master versions take over three minutes [18][27].
User Experience
- Users have reported that the Professional Mode of Kling 2.1 is sufficient for most casual video styles, while the Master version is better suited for action scenes and high-intensity projects [2][28].
- The updates have made it possible for a broader range of creators to access high-quality video generation tools, enhancing the overall user experience [27][28].
Market Positioning
- Kling 2.1 aims to fill the gap between affordability and quality, allowing users to choose models based on their specific creative needs and budget constraints [28].
- The differentiation between the three models allows for targeted marketing towards various segments, from casual creators to professional filmmakers [28].
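The pricing figures in this summary can be checked with a little arithmetic. The sketch below is illustrative only: the per-tier point costs (20, 35, 100) come from the summary, while the assumption that the previous Master version also cost 100 inspiration points, the 700-point budget, and all function and variable names are hypothetical and not taken from Kling's own documentation.

```python
# Illustrative sketch: tier costs are the figures quoted in the summary.
TIER_COST_POINTS = {"Standard": 20, "High Quality": 35, "Master": 100}

# Assumption (not confirmed by the source): the previous Master version
# cost the same 100 inspiration points per generation.
ASSUMED_OLD_MASTER_COST = 100


def percent_cheaper(new_cost: int, old_cost: int) -> float:
    """Return how much cheaper new_cost is than old_cost, in percent."""
    return (old_cost - new_cost) * 100 / old_cost


def generations_for_budget(budget_points: int) -> dict[str, int]:
    """Return how many whole generations each tier allows within a budget."""
    return {tier: budget_points // cost for tier, cost in TIER_COST_POINTS.items()}


if __name__ == "__main__":
    saving = percent_cheaper(TIER_COST_POINTS["High Quality"], ASSUMED_OLD_MASTER_COST)
    print(f"High Quality vs. assumed old Master: {saving:.0f}% cheaper")
    print(generations_for_budget(700))  # hypothetical 700-point budget
```

Under that assumption, the 35-point High Quality tier works out to the 65% reduction the article cites, and a fixed point budget stretches furthest on the Standard tier.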