歸藏的AI工具箱
I Recreated the Generative UI Interaction Claude Just Released!
歸藏的AI工具箱· 2026-03-15 09:24
Core Viewpoint - The article discusses the introduction of a new interactive feature in Claude, which utilizes generative UI to enhance the understanding of concepts through visual representation rather than plain text [1].

Group 1: New Features and Capabilities
- The new feature allows AI to create interactive charts directly within chat, providing a more engaging and immediate visualization experience compared to previous static outputs [5].
- Users can now see charts being drawn in real-time, with SVG nodes appearing sequentially, enhancing the visual appeal and understanding of data [5].
- The functionality supports various applications, such as data analysis, interactive calculators, architecture diagrams, and online data analysis from GitHub repositories [7][10][12][14].

Group 2: User Interaction and Educational Benefits
- The interactive nature of the charts allows for deeper engagement, enabling users to ask for further explanations or details about the generated visuals [17].
- This feature is particularly beneficial for students, as it allows them to manipulate parameters and see immediate changes in visualizations, enhancing the learning experience [22].

Group 3: Implementation Challenges and Solutions
- The article outlines the technical challenges faced in implementing this feature, including SDK limitations, rendering isolation, and the need for a smooth streaming experience [26][28][30].
- Solutions were developed to address issues such as text disappearance during rendering, height transitions, and ensuring stable React component trees to avoid visual glitches [42][46][51][52].

Group 4: Overall Impact and Conclusion
- The generative UI system aims to seamlessly integrate visual elements into the chat experience, making it appear as if they naturally belong within the conversation [40].
- The complexity lies not just in rendering HTML but in maintaining visual stability during various state transitions, ensuring a smooth user experience [52].
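The real-time drawing described in this summary, where SVG nodes appear one after another while the response is still streaming, can be illustrated with an incremental parser. What follows is a minimal Python sketch under stated assumptions, not the author's implementation: the article's version is built on React and its internals are not given in this summary, and `completed_nodes` is a hypothetical helper name. The sketch feeds streamed text chunks to the standard library's `XMLPullParser` and yields each element only once its closing tag has arrived, which is the point at which a renderer could safely draw it.

```python
import xml.etree.ElementTree as ET


def completed_nodes(chunks):
    """Yield (tag, attributes) for each SVG element once it is fully received."""
    parser = ET.XMLPullParser(events=("end",))
    for chunk in chunks:
        parser.feed(chunk)
        # Only elements whose end tag has been parsed are reported here;
        # a half-received element stays buffered inside the parser.
        for _event, elem in parser.read_events():
            yield elem.tag, dict(elem.attrib)
    parser.close()
    # Drain anything the parser deferred until the end of the input.
    for _event, elem in parser.read_events():
        yield elem.tag, dict(elem.attrib)


if __name__ == "__main__":
    # Simulated model output arriving in arbitrary pieces (made-up sample data).
    stream = ['<svg><rect x="0" ', 'y="0"/><circle r', '="5"/></svg>']
    for tag, attrs in completed_nodes(stream):
        print(tag, attrs)
```

Child elements are reported before their parent because a parent only ends after all of its children have ended, so a consumer of this generator sees shapes in drawing order and the enclosing `svg` last.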
Xiaomi Built a "Little Lobster" (Openclaw) That Runs on Phones: Agents Can Finally Touch the Real World
歸藏的AI工具箱· 2026-03-09 09:52
Core Viewpoint - The article discusses the launch of Xiaomi miclaw, a mobile version of OpenClaw, which integrates with Xiaomi's ecosystem to control smart devices and perform tasks through natural language interaction [2][6][49].

Group 1: Product Features
- Xiaomi miclaw is the first mobile version of a "lobster" intelligent agent in China, featuring skills, MCP, scheduled tasks, and personality [6][9].
- It can access and control all Xiaomi smart devices, providing real-time information and automating tasks based on user commands [17][28].
- The product allows users to create custom automation scenarios using natural language, eliminating the need for complex coding or visual configurations [41][40].

Group 2: Use Cases
- **Smart Morning Assistant**: Users can set up Xiaomi miclaw to wake them up with weather updates, news, and music, while also controlling home devices like humidifiers and air conditioners based on environmental conditions [11][15][22].
- **Smart Leaving Mode**: The agent can analyze which devices need to be turned off when leaving home and can create custom skills for various triggers, enhancing home security and energy efficiency [29][40].
- **Smart Meeting Assistant**: Xiaomi miclaw can automate meeting recordings, transcribe them, and generate meeting minutes, streamlining the process of managing tasks and schedules [43][48].

Group 3: Strategic Implications
- Xiaomi is transitioning from a hardware-centric company to one that builds AI infrastructure, leveraging its extensive ecosystem of devices to enhance AI capabilities [49][51].
- The integration of hardware, software, and AI models positions Xiaomi uniquely in the market, allowing it to create a comprehensive moat against competitors who lack hardware [53][57].
- The company aims to capitalize on the growing demand for AI solutions that interact with the physical world, which is a significant opportunity in the current technological landscape [51][50].
Turn Your ClaudeCode into Openclaw (Lobster) in Seconds: Remote Control via Feishu and Discord
歸藏的AI工具箱· 2026-03-05 14:14
Core Insights - The article discusses the development and features of Vibe Coding's Agent client CodePilot, highlighting its efficiency and capabilities in creating a desktop version of Claude Code integrated with various IM tools [1][2].

Group 1: Project Overview
- The initial project aimed to create a desktop version of Claude Code, which has now expanded to include various Agent functionalities, making it suitable for beginners and more secure than alternatives like OpenClaw [2].
- The project has seen significant activity, with 40 versions released and 220 commits over a span of 16 days, indicating high development efficiency [1].

Group 2: Features and Functionalities
- The project supports remote connections to IM tools such as Feishu, offers visual configuration for all Code plan packages, and includes features like concurrent multi-Agent split screens and token usage monitoring dashboards [3].
- It allows for easy installation of Claude Code on both macOS and Windows, enhancing accessibility for users [3].

Group 3: Open Source Initiatives
- The author decided to open-source two projects, Claude-to-IM and Claude-to-IM-skill, to alleviate development pressure and assist users wanting to connect Claude Code with IM tools [4].
- Claude-to-IM-skill enables users to connect their Claude Code conversations to IM platforms like Feishu and Discord without needing to write code, simplifying the setup process [7][9].

Group 4: Interactive Configuration and Security
- The interactive configuration feature guides users through the setup process step by step [9].
- Security and reliability measures include token storage with strict permissions, real-time output previews, and session persistence [8][15].

Group 5: Developer-Focused Features
- Claude-to-IM is tailored for developers looking to integrate multiple IM remote controls into their products, providing a library for quick access [11].
- Key features include multi-platform adapters, real-time response previews, and comprehensive permission management through interactive buttons [15].
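The summary mentions "token storage with strict permissions" without saying how the project does it. As an illustration only, here is a minimal Python sketch of one common way to achieve that on POSIX systems: create the file with mode `0o600` so that only the owning user can read or write it. The helper names `save_token` and `is_private` and the file name `im_token` are hypothetical and are not taken from Claude-to-IM.

```python
import os
import stat
import tempfile


def save_token(path, token):
    """Write a secret to `path` so that only the owning user can read or write it."""
    # 0o600 = read/write for the owner, nothing for group or others.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(token)
    # If the file already existed, os.open keeps its old mode, so tighten it explicitly.
    os.chmod(path, 0o600)


def is_private(path):
    """Return True when group and others have no permission bits set."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        token_path = os.path.join(folder, "im_token")  # hypothetical file name
        save_token(token_path, "example-token")
        print(is_private(token_path))
```

Passing the mode to `os.open` at creation time avoids a window in which the file exists with looser default permissions. On Windows the permission bits behave differently, so this check is only meaningful on POSIX systems.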
Lightyear Actually Built an AI Browser That Can Use Skills: Highly Practical Use Cases + Ready-Made Scripts
歸藏的AI工具箱· 2026-03-03 09:43
Core Viewpoint - The article discusses the introduction of Tabbit, a new AI browser developed by the Lightyear team, which aims to enhance browsing efficiency and context collection through various innovative features [1][2].

Summary by Sections

Introduction to Tabbit Browser
- Tabbit is described as the first native AI browser in China, designed from the ground up, unlike previous attempts that merely added plugins to existing browsers [1].

Core Features
- Tabbit includes five core functionalities:
  1. **Chat**: Allows users to interact with AI while browsing, supporting context from web pages, tab groups, and bookmarks [4].
  2. **Skills**: Automates repetitive tasks into one-click operations, allowing for variable inputs and webpage content modification [4].
  3. **Agent**: AI can automatically complete complex tasks transparently [4].
  4. **Favorites**: Offers semantic search for fragmented bookmarks [4].
  5. **Tab Management**: Automatically groups tabs and syncs them in the cloud [4].

Use Cases
- The article provides several use cases demonstrating Tabbit's capabilities:
  - **Information Organization**: Users can manage multiple tabs related to a specific topic, such as the Anthropic and U.S. Department of Defense controversy, with automatic grouping based on content [12][17].
  - **Travel Planning**: The intelligent agent can assist in finding flights and hotels based on user preferences, streamlining the booking process [23][27].
  - **Template Creation**: Users can create reusable templates for tasks like travel planning, allowing for quick adjustments without re-entering all details [32][40].
  - **Bookmark Management**: Tabbit enhances the usability of bookmarks by allowing users to search and retrieve saved content easily, including images [41][46].

Advanced Functionalities
- Tabbit offers advanced features such as saving entire webpage content, generating summaries, and semantic search capabilities [57].
- Users can create scripts to modify webpage content or enhance browsing experiences without needing programming skills [60][71].

Conclusion
- The article emphasizes the need for users to adapt to using AI tools like Tabbit to improve efficiency in their workflows, particularly in the context of browsing and information management [76].
One Spring Festival Later, Has the AI World Turned Upside Down? Nobody Is Telling You Why
歸藏的AI工具箱· 2026-02-25 04:28
Core Insights - The article discusses the significant changes in the AI landscape, particularly the emergence of the "Agent" era, which is characterized by AI systems that can perform tasks autonomously rather than just responding to queries [1][2][4].

Group 1: Changes in AI Functionality
- By early 2026, AI has evolved from a simple question-and-answer tool to an autonomous agent capable of understanding intent, breaking down tasks, and delivering completed products [17].
- The new models, such as Claude Opus 4.6 and GPT-5.3 Codex, exhibit improved programming capabilities, judgment, and the ability to work independently for extended periods [19][20][25].
- AI can now participate in its own development, creating a feedback loop that enhances its capabilities over time [31][34].

Group 2: Local Execution and Data Management
- The new generation of agents operates locally on users' computers, allowing direct access to files and data without needing to upload or copy-paste [38][40].
- The Model Context Protocol (MCP) enables agents to connect with external services seamlessly, enhancing their functionality [47].
- Skills, which are pre-defined modules of expertise, allow agents to perform specialized tasks without requiring extensive prompts from users [49][56].

Group 3: Team Collaboration and Efficiency
- The introduction of SubAgents allows a main agent to delegate tasks to specialized sub-agents, improving efficiency and maintaining the quality of output [99][100].
- Agent Teams enable multiple agents to work simultaneously on different aspects of a project, significantly increasing productivity [108][110].
- The use of Git's file locking mechanism ensures that multiple agents can collaborate without conflicts, streamlining the development process [111].

Group 4: Evolution and Knowledge Transfer
- The GEP (Genome Evolution Protocol) allows agents to inherit successful strategies from one another, enhancing their learning and adaptability [127][130].
- This evolution in agent capabilities means that the collective knowledge of agents can be shared, reducing the cost and time required for problem-solving across different organizations [132].

Group 5: Implications for the Workforce
- The shift towards using agents for various tasks may lead to smaller companies, as fewer human roles are needed to accomplish the same amount of work [150][152].
- The educational system may struggle to keep pace with the rapid advancements in AI, necessitating a shift in focus from execution skills to judgment and decision-making abilities [155][156].
- Middle management roles may be at risk as AI systems become capable of performing tasks traditionally handled by these positions [157].
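Group 3 of this summary says that file locking lets multiple agents work on one project without conflicts, but it does not describe the mechanism. The following Python sketch shows the general lock-file idea only, and it is an assumption about what such coordination could look like rather than a description of Git internals or of how Agent Teams are implemented. The helper names `try_claim` and `release`, the `.lock` suffix, and the agent identifiers are all hypothetical.

```python
import os
import tempfile


def try_claim(lock_dir, task_name, agent_id):
    """Atomically claim a task; return True only for the first agent to ask."""
    lock_path = os.path.join(lock_dir, task_name + ".lock")
    try:
        # O_EXCL makes creation fail if the lock file already exists,
        # so two agents can never both succeed for the same task.
        fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(agent_id)  # record the owner for later inspection
    return True


def release(lock_dir, task_name):
    """Remove the lock so that another agent may claim the task."""
    os.remove(os.path.join(lock_dir, task_name + ".lock"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as locks:
        print(try_claim(locks, "refactor-parser", "agent-a"))
        print(try_claim(locks, "refactor-parser", "agent-b"))
        release(locks, "refactor-parser")
        print(try_claim(locks, "refactor-parser", "agent-b"))
```

The design choice worth noting is that the check and the creation happen in a single operating-system call, so there is no gap between "is it free?" and "take it" in which a second agent could slip through.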
Tearing Up Sora, Kicking Veo Aside! 13 Hands-On Industry Cases: The Complete Seedance 2.0 Playbook
歸藏的AI工具箱· 2026-02-14 02:06
Core Viewpoint - Seedance 2.0 is a revolutionary video generation model that combines advanced AI capabilities with a user-friendly interface, allowing users to create high-quality videos with minimal input, transforming various industries such as marketing, education, and e-commerce [3][5][84].

Industry and Category Solutions

Marketing and Branding
- Seedance 2.0 enables users to generate complete marketing or educational videos with just a single sentence prompt, eliminating the need for detailed instructions [10][19].
- The model can autonomously select styles and create scripts that align with brand philosophies, as demonstrated with a promotional video for MUJI [12][14].

Product Management and Design
- The model allows product managers and designers to transform UI design screenshots into high-quality promotional videos, significantly reducing the time and resources previously required for 3D rendering [20][22].
- Users can upload multiple images and specify design styles to create visually appealing product videos that maintain the original content [29][32].

E-commerce and Real Estate
- In the e-commerce sector, Seedance 2.0 can showcase clothing items by automatically arranging different outfits on a model, complete with dynamic camera angles and transitions [36][38].
- For real estate, the model can generate immersive walkthrough videos from a single floor plan, accurately reflecting the layout and design of the space [46][52].

Content Creation
- Content creators can leverage Seedance 2.0 to produce Vlogs or video podcasts without needing extensive editing skills, using just photos and audio [53][63].
- The model can also create narrative-driven videos based on audio prompts, demonstrating its ability to generate engaging storylines [64][66].

Film and Animation Industry
- Seedance 2.0 can facilitate the production pipeline in the film and animation sectors by allowing users to reference existing videos for action and camera movements, ensuring consistency in character actions and scene transitions [68][72].
- The model can directly convert written narratives into animated content, streamlining the adaptation process for novels and scripts [73][76].

Automation and API Integration
- The upcoming API for Seedance 2.0 will enable seamless integration into workflows and automated content generation processes, enhancing productivity for businesses and content creators [84].
- The potential for automated agents to handle various content production tasks, such as generating promotional videos or adapting scripts, will significantly improve efficiency across industries [79][82].
An Agent-Native Communication Protocol: From Passing Code to Passing Cognition
歸藏的AI工具箱· 2026-02-11 10:53
Core Insights
- The article discusses the emergence of AI Agents communicating through GitHub, transforming it into a communication protocol for Agents [3][4].
- The author highlights the limitations of the existing Git system, particularly its inability to capture the reasoning behind code changes, which is crucial in the Agent era [8][9].
- Entire, a new company founded by former GitHub CEO Thomas Dohmke, aims to build a developer platform on Git that addresses these limitations by adding semantic metadata to Git commits [5][10].

Group 1: Observations on Agent Communication
- AI Agents are increasingly interacting with each other through GitHub Issues and Pull Requests, creating a natural communication flow without explicit design [2][3].
- The existing Git infrastructure is inherently suitable for Agent communication, as it provides a mature collaborative framework [4][6].

Group 2: Entire's Innovations
- Entire's first product, Checkpoint, enhances Git by adding a layer of semantic metadata that captures the reasoning behind code changes, thus addressing the "why" behind modifications [10][14].
- Checkpoint records not only the code changes but also the original prompts, reasoning chains, and constraints, making the Agent's thought process transparent and traceable [11][14].

Group 3: Paradigm Shift in Development
- The traditional development process focuses on code correctness, while the new paradigm emphasizes reviewing the reasoning and decision-making processes of Agents [20][21].
- Developers' roles are shifting from writing code to supervising and evaluating the cognitive processes of Agents, marking a significant change in responsibilities [20][33].

Group 4: Future Implications
- Entire's vision extends beyond a mere development tool; it aims to establish a new communication protocol for Agents, akin to how HTTP functions for human users [22][23].
- The need for a structured communication system among Agents is critical, as the future of software development will increasingly rely on Agent collaboration [23][25].

Group 5: Challenges and Solutions
- While Checkpoint addresses the issue of retaining information, challenges remain regarding the efficient retrieval of relevant context from potentially vast amounts of data [29][31].
- Entire plans to introduce a Context Graph for semantic reasoning and an AI-native development lifecycle to facilitate real-time coordination among Agents [31][32].
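To make the idea of semantic metadata attached to a commit more concrete, here is a small Python sketch of a record holding the three things this summary says Checkpoint captures: the original prompt, the reasoning chain, and the constraints. Everything in it is an assumption made for illustration: the class name `CheckpointRecord`, its field names, the JSON encoding, and the sample values are hypothetical, and Entire's real storage format is not described in this summary.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class CheckpointRecord:
    """Hypothetical shape of the metadata stored alongside one commit."""
    commit: str                                       # identifier of the commit being explained
    prompt: str                                       # the original instruction given to the Agent
    reasoning: list = field(default_factory=list)     # ordered reasoning steps
    constraints: list = field(default_factory=list)   # limits the Agent had to respect


def to_json(record):
    """Serialize a record to a stable JSON string."""
    return json.dumps(asdict(record), ensure_ascii=False, sort_keys=True)


def from_json(text):
    """Rebuild a record from its JSON form."""
    return CheckpointRecord(**json.loads(text))


if __name__ == "__main__":
    record = CheckpointRecord(
        commit="abc1234",  # made-up identifier
        prompt="Make the retry logic configurable",
        reasoning=["Retry count was hard-coded", "Moved it into settings"],
        constraints=["Do not change the public API"],
    )
    print(to_json(record))
```

A reviewer reading such a record sees why a change was made and under which limits, which is the "why" that this summary says plain Git history does not capture.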
Built a ClaudeCode Desktop Client in Just One Day with Opus4.6 + Agent Teams: Now Open Source
歸藏的AI工具箱· 2026-02-07 05:14
Core Insights - The article discusses the launch of CodePilot, a desktop client for Claude Code, highlighting its comprehensive features and user-friendly design [1][3].

Group 1: Key Features of CodePilot
- CodePilot supports all core functionalities of Claude Code, including folder selection, model switching, slash commands, Skills invocation, and MCP server integration, providing a significantly improved user experience [3].
- The client offers enhanced chat history management, allowing users to easily access previous conversations, with each message displaying the associated cost for transparency [5][6].
- A visual configuration management interface has been introduced, enabling users to modify configuration files, Skills, MCP, and plugins without needing command line interaction [8].
- Users can preview the contents of folders directly within the application, making it easier to access text files and other resources [9].
- Third-party API configurations are supported, allowing flexibility for users who may not have direct access to the official API [11].
- The connection status of Claude Code is clearly displayed, providing guidance for users in case of connectivity issues [13][14].

Group 2: Agent Teams Collaboration
- The article introduces the Agent Teams mode, which allows a main intelligent agent to delegate tasks to multiple sub-agents, enabling parallel work and real-time communication between agents [19][20].
- Enabling Agent Teams is straightforward, requiring users to update to the latest version of Claude Code and follow simple instructions to configure it [21].
- Tips for utilizing Agent Teams include having Claude assist in writing planning prompts, emphasizing the importance of preliminary research for role definition, and advocating for flexible role design tailored to specific tasks [23][25][27].
- The article emphasizes that the current era allows for rapid development of fully functional applications, with the use of Opus 4.6 proving to be cost-effective due to its efficiency and reduced need for corrections [30][31].
Clawdbot Tutorial 02: How to Integrate Feishu, Fully Domestic!
歸藏的AI工具箱· 2026-02-05 04:36
Core Viewpoint - The article outlines a comprehensive guide for configuring Clawdbot with Feishu, emphasizing the ease of using domestic models and the entire process being fully localized [2][35].

Group 1: Initial Setup
- The first step involves creating a new bot application in the Feishu developer backend, which includes filling out the application name, description, background color, and icon [5].
- After creation, two key pieces of information, App ID and App Secret, must be noted for later configuration [6][7].

Group 2: Permission Configuration
- The next step is to configure the bot's permissions by importing a JSON configuration that grants necessary access rights for message reception and sending [8][10].
- The permissions include various access rights such as reading and writing files, sending messages, and accessing chat events [9].

Group 3: Bot Activation
- In the bot configuration page, a welcome message must be inputted to activate the bot's capabilities, which is essential for receiving messages [11].

Group 4: Clawdbot Configuration
- The second step involves configuring the Feishu channel in Clawdbot, which requires running an installation command and selecting the Feishu option [14].
- If issues arise, such as a plugin already existing, manual deletion of the plugin folder is necessary before re-running the installation command [16].
- A known bug may require global installation of the zod dependency to proceed with the configuration [19].

Group 5: Final Configuration Steps
- After filling in the required configuration information, it is crucial to select "Finished" to ensure successful addition [20].
- The configuration of direct message access policies is also necessary, with recommended settings for user interaction [24].
- Restarting the gateway is required to apply the new channel configuration [26].

Group 6: Event Subscription and Version Publishing
- The final steps include configuring event subscription methods and adding the message receiving event to ensure the bot can receive messages [27][29].
- Publishing a version is essential for the bot's configuration to take effect [31][32].

Group 7: Pairing the Bot
- The last step involves pairing the bot by sending a message to it in Feishu to receive a pairing code, which is then used to bind the bot in Clawdbot [34].
- Once configured, the Feishu bot will function correctly, especially when using domestic models [35].
Clawdbot Tutorial 01: Configuring and Switching Models
歸藏的AI工具箱· 2026-01-31 17:19
Core Viewpoint - The article provides a detailed guide on configuring Clawdbot models on a Mac mini, highlighting common issues and solutions during the setup process.

Configuration Process
- The preferred method for configuration is the command `openclaw configure`, which resolves most issues [6][7].
- During the configuration, users are prompted to select between local or remote setup, choose the model, and input their API Key [9].

Model Selection
- Each model maps to a specific provider entry: for Minimax M2.1, select Minimax; for Kimi K2.5, select Moonshot AI [11][12].
- After selecting a model, users should navigate through options using the arrow keys to find the highlighted selection [13].
- Kimi has a coding plan option, while Minimax does not, even if the user has a coding plan membership [14].

Domestic and International Versions
- A critical point is the distinction between domestic and international versions of Minimax; users must select 'cn' for domestic coding plan members and the version without 'cn' for overseas members [16].
- Incorrect selections can lead to configuration issues, which can be rectified by manually editing the configuration file [17].

Configuration File Editing
- The configuration file is located at `/Users/your_username/.openclaw/openclaw.json`, where users can modify the `baseURL` [18].
- The correct hosts are `api.minimaxi.com` for the domestic version and `api.minimax.io` for the international version [23].

Model Switching
- After configuration, switching models is straightforward using the command `/model` in the TUI, which is started with `openclaw tui` [27].
- It is advisable to open a new window with the `/new` command before switching models to avoid issues [29].

Output Issues
- The "no output" problem may occur after switching models, indicating that the output is directed to another environment rather than a configuration failure [30].
- Users should check other environments, such as web platforms, to confirm successful configuration [31].

Supported Models
- Clawdbot currently supports three major domestic models, all of which have been successfully configured: Kimi (domestic version), Minimax (international version), and GLM (international version) [34].

Summary of Configuration Steps
- The core steps for configuring Clawdbot models are: use `openclaw configure`, manually edit the `baseURL` if necessary, and switch models using the `/model` command [37].
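The manual `baseURL` fix from this tutorial can also be sketched in code. The Python example below is a hedged illustration, not part of OpenClaw: the summary names only the `baseURL` key and the two hosts, so the helper `switch_minimax_host`, the nested layout of the sample dictionary, and the URL path in it are all hypothetical. Because the real structure of `openclaw.json` is not given, the function walks the whole document and rewrites every `baseURL` it finds.

```python
import json

DOMESTIC_HOST = "api.minimaxi.com"      # domestic (cn) host named in the article
INTERNATIONAL_HOST = "api.minimax.io"   # international host named in the article


def switch_minimax_host(config, target_host):
    """Return a copy of `config` with every Minimax `baseURL` pointed at `target_host`."""
    other = INTERNATIONAL_HOST if target_host == DOMESTIC_HOST else DOMESTIC_HOST
    if isinstance(config, dict):
        updated = {}
        for key, value in config.items():
            if key == "baseURL" and isinstance(value, str):
                # Swap only the host part; the rest of the URL is left alone.
                updated[key] = value.replace(other, target_host)
            else:
                updated[key] = switch_minimax_host(value, target_host)
        return updated
    if isinstance(config, list):
        return [switch_minimax_host(item, target_host) for item in config]
    return config


if __name__ == "__main__":
    # Hypothetical layout and path, used only to exercise the function.
    sample = {"providers": [{"name": "minimax",
                             "baseURL": "https://api.minimax.io/example-path"}]}
    fixed = switch_minimax_host(sample, DOMESTIC_HOST)
    print(json.dumps(fixed, indent=2))
```

To apply this to a real installation, one would load the file with `json.load`, pass the result through the function, and write it back with `json.dump`, after making a backup copy of the original. URLs for other providers are untouched because only strings containing one of the two Minimax hosts change.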