Self-Evolution of Large Models
Industrial-Grade LLM Self-Evolution: BUPT and Tencent AI Lab Propose the MoE-CL Architecture to Solve Core Pain Points of Continual Learning in Large Models
机器之心· 2025-09-30 00:27
Core Insights
- The article discusses the urgent need for "self-evolution" in industrial-grade large language models (LLMs): adapting dynamically to new tasks while retaining existing capabilities [2][6]
- The proposed solution is the MoE-CL framework, which combines task-specific and shared LoRA experts with a GAN-based mechanism to ensure efficient knowledge transfer and retention [2][6][28]

Group 1: Introduction and Background
- The rapid growth of the digital economy and the diversity of text data make cross-domain processing challenging, calling for a solution that handles new tasks efficiently while preserving knowledge from old tasks [5][6]
- Traditional approaches either demand extensive resources to train a separate model for each text type or suffer performance imbalances when a single model is shared across them [5][6]

Group 2: Methodology
- MoE-CL targets knowledge accumulation and task adaptation in multi-task learning, using LoRA to augment Transformer blocks while keeping parameter updates small [8][10]
- The framework pairs task-specific LoRA experts with a shared LoRA expert, and a GAN module separates and optimizes task-specific versus shared knowledge; illustrative sketches of both components follow this summary [8][12][14]

Group 3: Experimental Results
- In A/B tests on Tencent's real business scenarios, MoE-CL reduced manual intervention costs by 15.3% and reached a high removal rate of 28.8% on task A, demonstrating significant operational efficiency [3][26]
- MoE-CL outperformed existing methods in accuracy and stability across tasks, showing robust performance in dynamic environments [21][22]

Group 4: Conclusion
- Through its architecture, the MoE-CL framework addresses catastrophic forgetting and knowledge transfer, enabling continuous learning and adaptation in LLMs [28]
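To make the Group 2 summary concrete, here is a minimal sketch, assuming a PyTorch backbone, of how a frozen projection inside a Transformer block could be wrapped with one task-specific LoRA expert per task plus a shared LoRA expert, in the spirit of MoE-CL. This is not the authors' code: the names LoRAExpert and MoECLAdapter, the rank r, and the plain summation of expert outputs are illustrative assumptions.

```python
# Hedged sketch, not the authors' implementation: a frozen linear projection
# augmented with per-task LoRA experts and one shared LoRA expert.
import torch
import torch.nn as nn


class LoRAExpert(nn.Module):
    """Low-rank adapter: scale * B(A(x)), initialized as a no-op."""

    def __init__(self, d_in: int, d_out: int, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.A = nn.Linear(d_in, r, bias=False)   # down-projection
        self.B = nn.Linear(r, d_out, bias=False)  # up-projection
        nn.init.zeros_(self.B.weight)             # zero contribution at start
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.B(self.A(x)) * self.scale


class MoECLAdapter(nn.Module):
    """Frozen base projection + task-specific LoRA experts + one shared LoRA expert."""

    def __init__(self, base: nn.Linear, num_tasks: int, r: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():          # backbone weights stay frozen
            p.requires_grad_(False)
        d_in, d_out = base.in_features, base.out_features
        self.task_experts = nn.ModuleList(
            [LoRAExpert(d_in, d_out, r) for _ in range(num_tasks)]
        )
        self.shared_expert = LoRAExpert(d_in, d_out, r)

    def forward(self, x: torch.Tensor, task_id: int) -> torch.Tensor:
        # The task expert keeps knowledge private to one task; the shared
        # expert carries knowledge transferred across tasks.
        return self.base(x) + self.task_experts[task_id](x) + self.shared_expert(x)


if __name__ == "__main__":
    layer = MoECLAdapter(nn.Linear(768, 768), num_tasks=3)
    out = layer(torch.randn(2, 16, 768), task_id=1)   # (batch, seq, hidden)
    print(out.shape)
```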
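The GAN module in the summary can be read, under assumptions, as adversarial training that keeps the shared expert task-invariant while task-specific signal stays in the task experts. The sketch below uses a gradient-reversal layer and a small task discriminator; the actual adversarial objective in MoE-CL may differ, and GradReverse, TaskDiscriminator, the pooling over the sequence dimension, and the reversal strength lam are all hypothetical choices.

```python
# Hedged sketch of the GAN-style separation: a discriminator tries to recover
# the task id from the shared expert's output, while a gradient-reversal layer
# penalises the shared expert for leaking task identity.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies gradients by -lam in backward."""

    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.clone()

    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lam * grad_out, None


class TaskDiscriminator(nn.Module):
    """Predicts which task produced a shared-expert representation."""

    def __init__(self, d_model: int, num_tasks: int, lam: float = 1.0):
        super().__init__()
        self.lam = lam
        self.net = nn.Sequential(
            nn.Linear(d_model, d_model // 2),
            nn.ReLU(),
            nn.Linear(d_model // 2, num_tasks),
        )

    def forward(self, shared_feat: torch.Tensor) -> torch.Tensor:
        # Discriminator parameters receive normal gradients; only the gradient
        # flowing back into the shared expert is reversed.
        return self.net(GradReverse.apply(shared_feat, self.lam))


def adversarial_loss(disc: TaskDiscriminator,
                     shared_feat: torch.Tensor,
                     task_id: int) -> torch.Tensor:
    """Cross-entropy over task ids; shared_feat assumed (batch, seq, d_model)."""
    pooled = shared_feat.mean(dim=1)
    labels = torch.full((pooled.size(0),), task_id,
                        dtype=torch.long, device=pooled.device)
    return F.cross_entropy(disc(pooled), labels)


if __name__ == "__main__":
    disc = TaskDiscriminator(d_model=768, num_tasks=3)
    feat = torch.randn(2, 16, 768, requires_grad=True)  # e.g. shared_expert(x)
    loss = adversarial_loss(disc, feat, task_id=1)
    loss.backward()
    print(loss.item())
```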