Workflow
Prompt Optimization
Huawei upgrades its industry Agent algorithm architecture! MindScale writes its own prompts and workflows; KV Cache cuts tokens by 5.7x
Xin Lang Cai Jing· 2026-02-12 12:13
Core Insights
- The article discusses the launch of Huawei's MindScale algorithm package aimed at enhancing the development of industry-specific agents, which are seen as vital for improving productivity and value creation in various sectors [1][13].
Group 1: Challenges in Industry Agent Development
- MindScale identifies four core challenges in the widespread adoption of industry agents: the need for self-evolving workflows, automation of prompt optimization, historical knowledge reuse, and efficient training and inference processes [3][16].
- The project includes solutions such as the EvoFabric algorithm for self-evolving agents and SOP2Workflow for generating executable workflows from natural language documents [3][16].
Group 2: Workflow and Memory Optimization
- The framework supports a state graph engine that allows for deep mixing of multiple agents, tools, and memory forms, facilitating rapid copying, migration, and deployment of complex intelligent processes [7][20].
- A memory module enhances agent performance over time by utilizing trajectory memory and evaluation results to create an optimized context for experience [7][20].
Group 3: Prompt Optimization Techniques
- The SCOPE algorithm enables online prompt optimization between inference steps, achieving over 20% accuracy improvement in specific reasoning scenarios [7][21].
- The C-MOP model introduces a feedback loop for prompt optimization, addressing conflicts in text gradients and enabling automatic prompt adjustments based on positive and negative feedback [8][21].
Group 4: Efficiency and Hardware Adaptation
- MindScale emphasizes training and inference efficiency, with the TrimR algorithm significantly reducing inference latency by up to 70% in high-concurrency scenarios without compromising accuracy [10][23].
- The introduction of KV-Embeddings redefines the use of KV Cache, allowing for efficient representation reuse during inference, which can reduce the number of generated tokens by up to 5.7 times [12][25].
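The idea of optimizing a prompt online, between inference steps, can be illustrated with a toy loop. This is only a sketch under stated assumptions, not the SCOPE or C-MOP algorithm: the function `run_with_prompt_updates`, the stub `fake_execute` (standing in for a model call), and the stub `fake_derive_rule` (standing in for whatever turns negative feedback into a guideline) are all hypothetical names introduced for illustration.

```python
from typing import Callable, List, Tuple


def run_with_prompt_updates(
    tasks: List[str],
    base_prompt: str,
    execute: Callable[[str, str], Tuple[str, bool]],
    derive_rule: Callable[[str, str], str],
) -> Tuple[str, List[str]]:
    """Run tasks in order; after each failed step, append a guideline to the prompt."""
    prompt = base_prompt
    outputs: List[str] = []
    for task in tasks:
        output, ok = execute(prompt, task)
        outputs.append(output)
        if not ok:
            # Negative feedback: derive a guideline and add it once.
            rule = derive_rule(task, output)
            if rule not in prompt:
                prompt = f"{prompt}\n- {rule}"
    return prompt, outputs


# Hypothetical guideline used by the stubs below.
RULE = "Show intermediate arithmetic before the final number."


def fake_execute(prompt: str, task: str) -> Tuple[str, bool]:
    """Stub for a model call: numeric tasks fail until the prompt contains RULE."""
    needs_math = any(ch.isdigit() for ch in task)
    if needs_math and RULE not in prompt:
        return ("guessed", False)
    return ("worked", True)


def fake_derive_rule(task: str, output: str) -> str:
    """Stub for feedback-to-guideline conversion: always proposes RULE."""
    return RULE


final_prompt, outputs = run_with_prompt_updates(
    ["Sum 2 and 3", "Greet the user", "Multiply 4 by 5"],
    "You are a careful assistant.",
    fake_execute,
    fake_derive_rule,
)
print(outputs)
print(final_prompt)
```

In this toy run the first numeric task fails, the guideline is appended, and the later numeric task succeeds with the updated prompt. A real system would replace both stubs with model calls and an evaluator.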
Huawei upgrades its industry Agent algorithm architecture! MindScale writes its own prompts and workflows; KV Cache cuts tokens by 5.7x
QbitAI (量子位)· 2026-02-12 07:52
Core Viewpoint
- The article emphasizes the significance of industry-specific agents in enhancing productivity and value creation through the application of large models in various sectors [1].
Group 1: Challenges in Industry Agent Development
- The MindScale project identifies four core challenges in the widespread application of agents across industries: self-evolving workflows, automated prompt optimization, historical knowledge reuse, and complex reasoning evaluation [4].
- The project aims to address these challenges by providing solutions in collaboration with various partners [4].
Group 2: Workflow Development and Automation
- The algorithm package includes the EvoFabric agent algorithm, which facilitates self-evolving workflows, allowing for rapid generation of executable workflows from natural language documents and historical tool libraries using SOP2Workflow [5][6].
- The traditional manual maintenance of workflows relies heavily on expert experience, which poses challenges in reusing historical knowledge and maintaining efficiency in training and inference [7].
Group 3: Prompt Optimization Techniques
- The article discusses the implementation of a prompt optimization algorithm, SCOPE, which allows developers to optimize prompts between inference steps, achieving over 20% accuracy improvement in specific scenarios [11].
- The C-MOP model introduces a feedback loop for prompt optimization, addressing conflicts in text gradients and enabling automatic prompt optimization based on positive and negative feedback [11][14].
Group 4: Efficiency and Performance Enhancements
- MindScale focuses on optimizing training and inference efficiency for industry-specific models, with the TrimR algorithm significantly reducing inference latency by up to 70% in high-concurrency scenarios without compromising accuracy [14][16].
- The introduction of KV-Embeddings redefines the use of KV Cache, enhancing performance in chain-of-embedding scenarios and reducing the number of generated tokens by up to 5.7 times [16].
Group 5: Hardware Adaptation and Implementation
- MindScale includes code implementations that are compatible with Ascend hardware, enabling industry developers to build high-precision and efficient agents based on domestic computing power [18].
- The TrimR algorithm employs a lightweight verifier to detect and truncate unnecessary intermediate thoughts without requiring fine-tuning of the large model or verifier, suitable for high-concurrency production environments [19].
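The TrimR description above (a lightweight verifier that detects and truncates unnecessary intermediate thoughts) can be illustrated with a small sketch. It is not the TrimR implementation: it assumes, purely for illustration, that a thought is unnecessary once the same candidate answer has appeared in `patience` consecutive answer-bearing thoughts, and it uses a regex helper, `extract_answer`, as a hypothetical stand-in for the verifier.

```python
import re
from typing import List, Optional


def extract_answer(thought: str) -> Optional[str]:
    """Hypothetical verifier stand-in: pull the token after 'answer:' from a thought."""
    match = re.search(r"answer\s*[:=]\s*(\S+)", thought, flags=re.IGNORECASE)
    return match.group(1) if match else None


def truncate_thoughts(thoughts: List[str], patience: int = 2) -> List[str]:
    """Keep thoughts until the same answer appears in `patience` consecutive
    answer-bearing thoughts, then drop everything after that point."""
    kept: List[str] = []
    last_answer: Optional[str] = None
    streak = 0
    for thought in thoughts:
        kept.append(thought)
        answer = extract_answer(thought)
        if answer is None:
            continue  # no candidate answer here; the streak is left unchanged
        if answer == last_answer:
            streak += 1
        else:
            last_answer, streak = answer, 1
        if streak >= patience:
            break  # the answer has stabilized; later thoughts are treated as redundant
    return kept


chain = [
    "Let x = 3, so answer: 12",
    "Wait, recheck the product. answer: 12",
    "Double-check again. answer: 12",
    "Once more to be sure. answer: 12",
]
trimmed = truncate_thoughts(chain)
print(len(chain), "->", len(trimmed))
```

Because the check runs outside the large model, a scheme of this shape needs no fine-tuning of the model itself, which matches the property the summary attributes to TrimR. The stopping rule and `patience` value here are illustrative choices only.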
Build a Prompt Learning Loop - SallyAnn DeLucia & Fuad Ali, Arize
AI Engineer· 2026-01-06 17:30
[music] Hey everyone, gonna get started here. Thanks so much for joining us today. Um, I'm Sally. I'm a director at Arize. I'm going to be walking you through some of crowd prompt learning. Uh, we're actually going to be building a driven optimization loop for the part of the workshop. Um, I come from a technical background and started off in data science before I made my way over to product. Uh, I do like to still be touching code today. I think one of my favorite projects that I work on is building our own ag ...