Huawei Upgrades Its Industry Agent Algorithm Architecture! MindScale Writes Its Own Prompts and Workflows, and KV Cache Cuts Tokens by 5.7x
Sina Finance (Xin Lang Cai Jing) · 2026-02-12 12:13
Core Insights
- The article covers the launch of Huawei's MindScale algorithm package, which targets the development of industry-specific agents. Such agents are seen as vital for improving productivity and creating value across sectors [1][13].

Group 1: Challenges in Industry Agent Development
- MindScale identifies four core challenges to the widespread adoption of industry agents: self-evolving workflows, automated prompt optimization, reuse of historical knowledge, and efficient training and inference [3][16].
- The project includes solutions such as the EvoFabric algorithm for self-evolving agents and SOP2Workflow, which generates executable workflows from natural language documents [3][16].

Group 2: Workflow and Memory Optimization
- The framework provides a state graph engine that tightly combines multiple agents, tools, and forms of memory, so that complex intelligent processes can be quickly replicated, migrated, and deployed [7][20].
- A memory module improves agent performance over time: it draws on trajectory memory and evaluation results to build an optimized, experience-based context [7][20].

Group 3: Prompt Optimization Techniques
- The SCOPE algorithm performs online prompt optimization between inference steps and achieves an accuracy improvement of more than 20% in specific reasoning scenarios [7][21].
- The C-MOP model introduces a feedback loop for prompt optimization. It addresses conflicts between text gradients and adjusts prompts automatically based on positive and negative feedback [8][21].

Group 4: Efficiency and Hardware Adaptation
- MindScale emphasizes training and inference efficiency. The TrimR algorithm reduces inference latency by up to 70% in high-concurrency scenarios without compromising accuracy [10][23].
- KV-Embeddings redefines how the KV Cache is used, allowing representations to be reused efficiently during inference. This can reduce the number of generated tokens by up to a factor of 5.7 [12][25].
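
As an illustration of the state graph idea described under Group 2, here is a minimal Python sketch, assuming a graph whose nodes are plain functions sharing one state dictionary. The `StateGraph` class and the `planner`, `tool`, and `reviewer` nodes are hypothetical names invented for this example. They are not EvoFabric's API, which the summary does not describe.

```python
from typing import Any, Callable, Dict, Optional

State = Dict[str, Any]
# A node reads and writes the shared state, then names the next node
# (or returns None to stop the run).
Node = Callable[[State], Optional[str]]


class StateGraph:
    """Toy state graph (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}

    def add_node(self, name: str, fn: Node) -> None:
        self.nodes[name] = fn

    def run(self, start: str, state: State, max_steps: int = 20) -> State:
        current: Optional[str] = start
        for _ in range(max_steps):  # hard cap guards against cycles
            if current is None:
                break
            current = self.nodes[current](state)
        return state


def planner(state: State) -> Optional[str]:
    # Stand-in for an agent that decides what to do next.
    state["plan"] = "lookup"
    return "tool"


def tool(state: State) -> Optional[str]:
    # Stand-in for a tool call; the result is also appended to a memory list.
    state["result"] = str(state["query"]).upper()
    state.setdefault("memory", []).append(state["result"])
    return "reviewer"


def reviewer(state: State) -> Optional[str]:
    # Stand-in for an evaluation step; returning None ends the run.
    state["done"] = bool(state["result"])
    return None


graph = StateGraph()
graph.add_node("planner", planner)
graph.add_node("tool", tool)
graph.add_node("reviewer", reviewer)

final = graph.run("planner", {"query": "reset router"})
print(final)
```

Because the nodes share nothing except the state dictionary, a graph like this can be copied or moved by re-registering the same functions elsewhere, which is one plausible reading of the "replicate, migrate, deploy" claim.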
Huawei Upgrades Its Industry Agent Algorithm Architecture! MindScale Writes Its Own Prompts and Workflows, and KV Cache Cuts Tokens by 5.7x
QbitAI (量子位) · 2026-02-12 07:52
Core Viewpoint
- The article emphasizes the role of industry-specific agents in raising productivity and creating value as large models are applied across sectors [1].

Group 1: Challenges in Industry Agent Development
- The MindScale project identifies four core challenges to the widespread application of agents across industries: self-evolving workflows, automated prompt optimization, reuse of historical knowledge, and evaluation of complex reasoning [4].
- The project aims to address these challenges with solutions developed in collaboration with various partners [4].

Group 2: Workflow Development and Automation
- The algorithm package includes the EvoFabric agent algorithm, which supports self-evolving workflows. Using SOP2Workflow, it can quickly generate executable workflows from natural language documents and historical tool libraries [5][6].
- Traditional, manually maintained workflows rely heavily on expert experience, which makes it hard to reuse historical knowledge and to keep training and inference efficient [7].

Group 3: Prompt Optimization Techniques
- The article discusses SCOPE, a prompt optimization algorithm that lets developers optimize prompts between inference steps and achieves an accuracy improvement of more than 20% in specific scenarios [11].
- The C-MOP model introduces a feedback loop for prompt optimization. It addresses conflicts between text gradients and optimizes prompts automatically based on positive and negative feedback [11][14].

Group 4: Efficiency and Performance Enhancements
- MindScale focuses on optimizing training and inference efficiency for industry-specific models. The TrimR algorithm reduces inference latency by up to 70% in high-concurrency scenarios without compromising accuracy [14][16].
- KV-Embeddings redefines how the KV Cache is used, improving performance in chain-of-embedding scenarios and reducing the number of generated tokens by up to a factor of 5.7 [16].

Group 5: Hardware Adaptation and Implementation
- MindScale includes code implementations compatible with Ascend hardware, so that industry developers can build high-precision, efficient agents on domestic computing power [18].
- The TrimR algorithm uses a lightweight verifier to detect and truncate unnecessary intermediate thoughts. Neither the large model nor the verifier needs fine-tuning, which makes the approach suitable for high-concurrency production environments [19].
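
The truncation idea behind TrimR can be pictured with a small Python sketch. It assumes the reasoning trace is already split into discrete "thoughts" and that the verifier only needs to spot a repeated candidate answer. The regex-based `extract_answer` verifier, the `trim_thoughts` helper, and the `patience` parameter are all hypothetical stand-ins, since the summary does not specify how the real verifier works.

```python
import re
from typing import Callable, List, Optional


def extract_answer(thought: str) -> Optional[str]:
    """Hypothetical verifier: return the candidate answer in a thought, if any."""
    match = re.search(r"answer is\s+(-?\d+)", thought)
    return match.group(1) if match else None


def trim_thoughts(
    thoughts: List[str],
    verifier: Callable[[str], Optional[str]] = extract_answer,
    patience: int = 2,
) -> List[str]:
    """Keep thoughts until the same candidate answer has been seen
    `patience` times in a row, then drop everything after that point."""
    kept: List[str] = []
    last: Optional[str] = None
    streak = 0
    for thought in thoughts:
        kept.append(thought)
        candidate = verifier(thought)
        if candidate is None:
            continue  # no candidate: the streak is neither extended nor reset
        if candidate == last:
            streak += 1
        else:
            last, streak = candidate, 1
        if streak >= patience:
            break  # answer has stabilised; later thoughts are treated as redundant
    return kept


thoughts = [
    "Let me compute 6 * 7.",
    "So the answer is 42.",
    "Double-checking: 6 * 7 = 42, so the answer is 42.",
    "Let me verify once more, the answer is 42.",
    "Maybe I should try another method as well.",
]
trimmed = trim_thoughts(thoughts)
print(len(thoughts), "->", len(trimmed))
```

The model itself is untouched in this sketch: the verifier only inspects generated text and decides when to stop, which is consistent with the summary's point that no fine-tuning is required.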