歸藏的AI工具箱
Gemini's PPT Generation: Usage Tips and Template Prompts
歸藏的AI工具箱· 2025-11-05 06:02
The Gemini app rolled out PPT generation a few days ago, and after trying it yesterday I found it genuinely impressive. Because it is implemented with front-end code, we can control it very precisely with prompts, including every stylistic detail of the deck, and the output quality is far better than Anthropic's equivalent. On top of that, the feature can tie into Gemini itself and the other features of Google's products. In this post I'll first walk through how to use the feature, then share the various Gemini PPT-generation prompts I've worked out. If you don't need anything complicated, getting Gemini to generate a PPT is simple: turn on Canvas mode in the input box and just tell it to generate a PPT on the topic "XXXX". You can then open the result in Google Slides to edit the details and export it in PPT format, or turn Deep Research results into a PPT. And since Gemini has built-in search, you can even let it fill in all the content itself; here, for instance, I had it search for OpenAI's recent compute-investment news and then generate a PPT. You may then notice a download option in the top-right corner of the result, but the downloaded file is in PDF format. At this point you might object that PDF is useless and you need PPT format. Don't worry, ...
All Your Brand Marketing Materials Covered | Google Ships Another Major AI Design Product
歸藏的AI工具箱· 2025-10-29 07:59
Group 1
- Google Labs has introduced a new AI design product called Pomelli, which focuses on generating marketing materials that align with brand aesthetics at a low cost [4][30].
- Pomelli extracts brand-related elements from a company's website, such as theme colors, product capabilities, and positioning, to create marketing content [4][11].
- The product is currently available in the United States, Canada, Australia, and New Zealand [4].

Group 2
- Users can input their website URL, and Pomelli will analyze the site to create a brand DNA card, detailing elements like logos, fonts, and color schemes [11][30].
- The tool allows for the generation of marketing content by inputting specific campaign details, optimizing text, and providing design previews [15][19].
- Users can customize generated images by adjusting backgrounds, titles, content, and call-to-action buttons, ensuring brand consistency [23][25].

Group 3
- The advantages of Pomelli include its user-friendly interface and the ability to quickly produce advertising content, which is more efficient than traditional agency methods [30].
- However, the tool relies heavily on the quality of the website's information; if the site lacks comprehensive content, the output may be limited [31].
- Current limitations include a lack of aesthetic variety in generated images, weak control over background images, and no support for controlling image ratios, which is crucial for advertising [32][30].
AI Music Has Come This Far? Guizang Shows You How to Generate Hit AI Music in One Click
歸藏的AI工具箱· 2025-10-16 13:19
Core Insights
- The article discusses the rapid rise of AI-generated music, particularly focusing on the capabilities of the Suno V5 model, which allows for advanced customization and control over music generation [5][21].
- The author highlights the potential of AI in transforming the music industry, enabling users to create high-quality remixes and original compositions without extensive musical knowledge [6][21].

Summary by Sections

AI Music Generation
- The Suno V5 model has evolved significantly, allowing users to control various elements of music creation, including style, lyrics, and audio modifications [5][6].
- AI-generated music has gained immense popularity, with numerous tracks receiving hundreds of thousands of likes on social media platforms [3][21].

Workflow and Features
- A simple workflow has been developed for generating music using Suno, which includes two main approaches: remixing existing tracks and creating original compositions based solely on prompts [6][18].
- The model allows for detailed customization, including specifying vocal gender, style influences, and even the "weirdness" factor to create unique sounds [7][8].

Prompt Creation
- Users can create structured prompts for the AI by defining global style characteristics and providing detailed instructions for each section of the song [10][11].
- The prompts must include specific elements such as core genre, instrumentation, vocal style, and production characteristics to guide the AI effectively [10][11].

Industry Impact
- The article suggests that the advancements in AI music generation could revitalize the stagnant music industry by enabling more creative expressions and reducing reliance on traditional music production methods [21][23].
- The potential for AI to remix classic songs in various styles is seen as a positive development, offering fresh interpretations of well-known tracks [21][23].
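The structured-prompt approach described above (a global style block plus per-section instructions) can be sketched in a few lines of Python. The field names and output layout below are illustrative assumptions, not Suno's actual prompt syntax:

```python
# Minimal sketch of a structured music-generation prompt builder.
# Section headers and field names are hypothetical, for illustration only.

def build_song_prompt(global_style: dict, sections: list) -> str:
    """Assemble a prompt from a global style block plus per-section notes."""
    lines = ["[Global Style]"]
    for key, value in global_style.items():
        lines.append(f"{key}: {value}")
    for section in sections:
        lines.append("")  # blank line between blocks
        lines.append(f"[{section['name']}]")
        lines.append(section["instructions"])
    return "\n".join(lines)

prompt = build_song_prompt(
    global_style={
        "core genre": "synthwave pop",
        "instrumentation": "analog synths, gated drums",
        "vocal style": "female, airy, close-mic",
        "production": "wide stereo image, heavy sidechain",
    },
    sections=[
        {"name": "Verse 1", "instructions": "sparse arrangement, spoken-word delivery"},
        {"name": "Chorus", "instructions": "full mix, layered harmonies, anthemic"},
    ],
)
```

The point of the structure is that the global block pins down the song-wide identity while each named section gets its own performance note, which matches the two-level prompt layout the article recommends.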
Guizang Wants to Fix Claude Code's Most Annoying Problem
歸藏的AI工具箱· 2025-10-14 13:12
Core Viewpoint
- The article discusses the development of an open-source project called "ai-claude-start" aimed at simplifying the configuration and management of multiple Claude Code models, addressing the challenges users face in managing environment variables and API integrations [2][22].

Group 1: Project Introduction
- The project "ai-claude-start" allows users to quickly configure multiple Claude Code model APIs and select which model to start when launching Claude Code [2][4].
- It provides a user-friendly solution for managing environment variables without affecting the original settings of Claude Code, ensuring safety and ease of use [4].

Group 2: Installation and Usage
- Installation is straightforward, supporting npm and npx commands for users who have Node.js installed [5][6].
- Users can initiate the setup process by running the command "ai-claude-start setup", which guides them through configuring API addresses, API keys, and model names [7][14].
- The project includes pre-configured API addresses for Anthropic, Zhipu (GLM), and Kimi, allowing users to easily select from these options or input custom configurations [9][11].

Group 3: Development Process
- Development involved collaboration with GPT-5 and Sonnet 4.5, focusing on creating a solution to the problem of environment variable management [16][19].
- The project was designed to let users select profiles and manage API keys securely, with features for setup, listing, and deleting profiles [16][19].
- The final product includes automated testing and documentation to ensure functionality and ease of use for the community [20][22].
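The core idea behind the tool — keeping several endpoint profiles and exporting one as environment variables at launch, without touching Claude Code's global settings — can be sketched like this. The profile fields, placeholder keys, and endpoint URLs are assumptions for illustration, not the project's actual implementation (Claude Code does read `ANTHROPIC_BASE_URL` and `ANTHROPIC_API_KEY` from the environment):

```python
# Sketch of profile-based environment switching, as "ai-claude-start" is
# described to do. Profiles, keys, and URLs below are hypothetical examples.

def activate(profile_name: str, profiles: dict) -> dict:
    """Return the env vars to set for the chosen profile, leaving any
    existing global Claude Code configuration untouched."""
    p = profiles[profile_name]
    return {
        "ANTHROPIC_BASE_URL": p["base_url"],
        "ANTHROPIC_API_KEY": p["api_key"],
        "ANTHROPIC_MODEL": p["model"],
    }

profiles = {
    "anthropic": {
        "base_url": "https://api.anthropic.com",
        "api_key": "sk-placeholder-1",
        "model": "claude-sonnet-4-5",
    },
    "kimi": {
        "base_url": "https://api.moonshot.cn/anthropic",  # assumed endpoint
        "api_key": "sk-placeholder-2",
        "model": "kimi-k2",
    },
}

# Pick a profile at launch time; a real launcher would inject these vars
# into the child process that starts Claude Code.
env = activate("kimi", profiles)
```

Scoping the variables to the launched child process, rather than writing them into shell profiles, is what keeps switching between providers safe and reversible.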
Incredible! Someone Is Finally Doing Something About Voice and Performance in AI Video: GAGA AI Hands-On
歸藏的AI工具箱· 2025-10-10 10:03
Core Viewpoint
- The article discusses the capabilities of the GAGA-1 model developed by Sand.ai, highlighting its advanced performance in character dialogue and expression, surpassing previous models like Sora 2 in nuanced facial expressions and voice synchronization [1][2][15].

Performance Testing
- Initial tests showed GAGA-1's ability to generate detailed facial expressions and voice synchronization, particularly in nuanced scenarios [2][5].
- The model demonstrated clear lip movements and voice output, even in complex scenarios involving environmental sounds [4][6].
- GAGA-1 supports multilingual output, performing well in English, Japanese, and Spanish, with accurate lip synchronization and expression [8][16].

Emotional Expression
- The model effectively conveyed complex emotions, such as shame and desperation, with natural voice modulation and facial expressions [9][10].
- In a dual-character scenario, GAGA-1 maintained emotional intensity and expression accuracy, even under challenging conditions [14][15].

Usage Guidelines
- Suggestions for optimal use include specifying emotional changes in prompts and limiting complex body movements to avoid performance issues [16].
- The model currently supports a 16:9 aspect ratio, with plans for future vertical-format support [16].

Industry Implications
- The development of GAGA-1 signifies a shift in AI video models toward enhanced emotional expression and multimodal output, moving beyond basic content generation [16][17].
- The model's advancements suggest a need for industry professionals to adapt to the evolving capabilities of AI in video production [17].
First Test of Sora 2 in China? OpenAI Really Pulled It Off This Time!
歸藏的AI工具箱· 2025-09-30 20:32
Core Viewpoint
- Sora 2 is presented as the world's most advanced video generation model, capable of creating high-quality videos with minimal input, including voice cloning and multi-language support, and it features a social app for collaborative video creation [1][17].

Group 1: Model Features
- Sora 2 allows users to generate videos of themselves by simply recording themselves reading three numbers, showcasing its advanced voice and video synthesis capabilities [1].
- The model can maintain character consistency while changing backgrounds and scenarios, demonstrating its versatility in video generation [6][7].
- It incorporates automatic camera cuts and scene changes, reflecting an understanding of video composition and storytelling logic [8][11].

Group 2: User Interaction
- Users can remix videos by providing simple prompts, allowing for creative alterations to existing content [5].
- The platform supports image uploads for scene generation, enhancing the customization options for users [6].
- Sora 2 includes a social aspect where users can invite friends to collaborate on video projects, resembling a social media experience [1][17].

Group 3: Content Limitations
- The model has strict copyright restrictions, preventing the generation of copyrighted content, although it appears to allow some exceptions [11].
- There are challenges with maintaining consistency in certain product representations, indicating areas for improvement in commercial applications [9].

Group 4: Overall Impact
- Sora 2 is positioned as a groundbreaking tool for end users, combining audio, visual, and narrative elements to create complete videos from minimal input [17].
- The model's capabilities suggest a significant advancement in video generation technology, potentially transforming user engagement in content creation [17].
Say Goodbye to Slot-Machine Rerolls! All-Round and Highly Controllable | Guizang Teaches You Jimeng Digital Human 1.5
歸藏的AI工具箱· 2025-09-29 10:10
Core Viewpoint
- The article discusses the launch of the Omnihuman 1.5 model in ByteDance's Jimeng (即梦), highlighting its enhanced capabilities in generating dynamic videos with lip-syncing and improved control over character actions and emotions, making it a powerful tool for creating engaging content [1][30].

Group 1: Features and Enhancements
- The Omnihuman 1.5 version allows users to define character performances and movements, significantly improving the quality of AI-generated videos compared to the previous version [1][4].
- The update introduces action-description input, expanding the use cases for digital humans and making them highly customizable [2][4].
- The model now supports natural lip-syncing for non-human characters and various styles, enhancing the overall visual appeal [5][8].

Group 2: User Experience and Functionality
- Users can control multiple characters in a scene, allowing for more complex dialogues and interactions, which increases the model's usability [7][8].
- The system requires three main components to create a video: an initial image, audio, and corresponding action/emotion prompts, which can be organized in a structured format for better results [9][12].
- The article provides a detailed tutorial on how to prepare materials and use the platform effectively, emphasizing the importance of clear and specific prompts [16][19].

Group 3: Market Position and Future Developments
- The advancements in Omnihuman 1.5 position it as a sophisticated tool for content creators, transforming the creative process from an unpredictable art form into a more structured engineering task [30].
- The new model is set to be available on mobile platforms by September 30, further broadening its accessibility and user base [30].
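The three-part input structure described above (initial image, audio, and per-character action/emotion prompts) can be sketched as a small payload builder. The field names and layout are assumptions for illustration, not Jimeng's actual request format:

```python
# Hypothetical sketch of bundling the three inputs a digital-human video
# needs: an initial image, an audio track, and structured performance notes.
import json

def build_request(image: str, audio: str, characters: list) -> str:
    """Serialize the three inputs into one structured payload string."""
    payload = {
        "initial_image": image,
        "audio": audio,
        "performances": [
            {
                "character": c["name"],
                "action": c["action"],
                "emotion": c["emotion"],
            }
            for c in characters
        ],
    }
    return json.dumps(payload, ensure_ascii=False, indent=2)

request = build_request(
    image="scene.png",
    audio="dialogue.mp3",
    characters=[
        {"name": "host", "action": "leans toward camera, gestures with right hand",
         "emotion": "excited"},
        {"name": "guest", "action": "nods slowly, folds arms",
         "emotion": "skeptical"},
    ],
)
```

Keeping each character's action and emotion in its own labeled slot is what the article means by "organized in a structured format": the model gets an unambiguous mapping from performer to performance instead of one mixed prose prompt.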
Figma MCP + GPT-5 Codex: The New King of Vibe Coding
歸藏的AI工具箱· 2025-09-25 10:25
Core Viewpoint
- The article discusses the recent updates to Figma's remote MCP service and how it enhances the integration with AI tools like GPT-5 Codex, improving design and coding efficiency.

Group 1: Figma MCP Service Update
- The new Figma remote MCP service eliminates the need for complex installation processes and local clients, streamlining the user experience [5][21].
- Users can easily set up the MCP by copying a JSON snippet into the Cursor settings, simplifying the connection process [6][7].
- The service requires a subscription, and alternative access methods are mentioned [8].

Group 2: Integration with AI Tools
- The integration with AI IDEs like Cursor allows for direct usage of GPT-5 Codex, enhancing design capabilities [5][9].
- Users can utilize commands in Claude Code to access Figma MCP, facilitating the design process [10].
- The AI can generate web pages from design drafts, but the quality of the output depends on the original design's structure [15][16].

Group 3: Design and Development Process
- The article emphasizes the importance of using high-quality design drafts to ensure effective AI output [15][16].
- It suggests a step-by-step approach for complex designs, allowing the AI to handle components incrementally [15].
- The article provides specific design guidelines for creating a visually appealing web page, including color schemes and layout styles [19][20].

Group 4: Future Implications
- The update indicates significant growth potential for Vibe Coding infrastructure, enhancing efficiency in design and coding [21].
- The integration of AI does not eliminate the need for design skills; rather, it enhances productivity while maintaining the necessity for aesthetic judgment and foundational knowledge [21].
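In Cursor, a remote MCP server is registered with a small JSON entry in its MCP settings. A sketch of what that entry might look like is below; the server name and URL are assumptions, so copy the exact snippet from Figma's own documentation:

```json
{
  "mcpServers": {
    "figma": {
      "url": "https://mcp.figma.com/mcp"
    }
  }
}
```

Because the server is remote, there is no local process to install or keep running; Cursor connects over the URL, which is the "no local client" simplification the article highlights.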
Kling 2.5 Turbo Hands-On | Can a Top AI Video Model Really Match CG?
歸藏的AI工具箱· 2025-09-23 10:37
Core Viewpoint
- The release of Kling 2.5 Turbo marks significant advancements in AI video generation, showcasing improved understanding of complex prompts and dynamic video stability, while offering competitive pricing for high-quality outputs [1][17].

Group 1: Performance Improvements
- The model demonstrates enhanced comprehension of complex prompts, particularly those involving intricate causal and temporal relationships [1][17].
- Video generation stability has improved, especially in high-speed dynamic scenarios, maintaining consistent style throughout the video [1][17].
- The cost of generating a 5-second high-quality video has decreased from 35 points in the previous model to 25 points in the new version [1].

Group 2: Testing and Comparisons
- Various tests were conducted to evaluate the model's performance, including scenes with complex actions and dynamic camera movements, which were executed smoothly without distortion [2][3][7].
- The model successfully generated videos in different artistic styles while maintaining consistency across the outputs, showcasing its versatility [6][7].
- Comparisons with top CG works from the World Rendering Competition indicate that Kling 2.5 Turbo can compete with high-quality CG productions in specific scenarios [10][11][17].

Group 3: Understanding of Motion and Physics
- The model exhibits a deeper understanding of the underlying physics of motion, as evidenced by its ability to incorporate realistic movements and transitions, such as the gradual unfolding of a princess dress [17][18].
- The model's ability to add natural movements, like staggering after dodging an attack, reflects its comprehension of physical logic beyond simple prompt adherence [17][18].
- The synchronization of visual effects with character movements, such as the transformation of a warrior into a wolf, indicates an advanced level of cognitive processing in the AI's creative approach [18].
Notion 3.0 |AI转型最成功的互联网产品是怎么做的?
歸藏的AI工具箱· 2025-09-19 13:26
Core Viewpoint
- Notion has successfully transformed into a versatile AI-driven tool with the release of Notion 3.0, integrating advanced AI capabilities to enhance user experience and productivity [2][30].

AI Capabilities
- Notion AI now supports top models like GPT-5 and Claude 4.1, allowing users to add context through file uploads and database selections [2][4].
- Users can link Notion with other software like Gmail and GitHub to enrich the context for AI tasks [4][9].
- The AI can assist in generating and modifying database formats, creating visual representations like bar charts based on user requests [9][10].

Meeting and Writing Enhancements
- Notion AI includes features for real-time transcription and summarization of meetings, making it easier to create meeting records [13].
- Users can customize AI prompts for specific tasks, allowing for collaborative input and visibility of AI-generated content [14][15].
- The AI can refine selected text, enhancing the writing process [16].

Custom Agent Features
- Notion 3.0 introduces customizable Agents, allowing users to define their names, icons, and interaction styles, enhancing personalization [18][20].
- Agents can be designed to automate tasks, such as summarizing reports and generating discussion frameworks for meetings, significantly reducing workload [25][28].
- The ability to publish Agent templates on Notion's marketplace provides monetization opportunities for creators [22].

Integration and Functionality
- The updated Notion MCP can now not only query information but also modify and write content, improving integration with other AI tools [27][28].
- Users can leverage AI to create complex functions in tables using natural language, simplifying the process of function creation [30].

Market Position and Strategy
- Notion's transformation highlights the importance of context and supportive features in maximizing AI capabilities [31].
- The combination of strong template distribution and monetization strategies positions Notion favorably in the competitive landscape of AI tools [32].