Large Model Memory
Chen Tianqiao and Deng Yafeng join forces to crack the large-model memory problem! SOTA system built in 4 months, plus an $80,000 global memory challenge
量子位· 2026-02-05 06:01
Lu Yu, from Aofeisi. QbitAI | WeChat official account QbitAI. At the start of the year, a DeepSeek paper went viral across the internet, focused on large-model memory. Not coincidentally, Google also recently published a heavyweight paper hailed as "Attention Is All You Need" V2 (Nested Learning: The Illusion of Deep Learning Architectures), whose core likewise targets the memory bottleneck. Even OpenClaw (formerly Clawdbot), the AI lobster that recently broke into the mainstream, lists memory among its highlights. In other words, memory ≈ the technical trend the global AI community is collectively betting on this year ≈ the jewel in the crown. Nearly every large-model team you can think of is working overtime to bolt memory features onto its models... But this time, if we shift our gaze away from the tech giants for a moment, we find a rising contender that is just as formidable: EverMind, led by Chen Tianqiao and Deng Yafeng. Why say so? First, look at the product: their newly released world-class long-term memory system, EverMemOS, was SOTA on release, breaking multiple memory benchmarks while far surpassing all prior baseline methods. Second, it is genuinely usable: not a benchmark-only "showpiece," it performs just as well in real deployments. And the team has the confidence that its technical ...
There is now a cure for large models' "amnesia"
虎嗅APP· 2025-12-01 13:21
Core Viewpoint
- The article discusses the limitations of large models in retaining long-term memory, highlighting the challenges faced in practical applications and the need for a more human-like memory system in AI [3][10][25].

Group 1: Limitations of Current AI Models
- The large-model industry is experiencing a "memory loss" issue: AI struggles to retain information over extended interactions, leading to repeated questions and irrelevant responses [4][6].
- The technical architecture of current models, such as the Transformer, suffers from attention decay over long sequences, so earlier instructions are lost as a conversation progresses [5][7].
- The lack of a shared memory mechanism among different AI agents leads to fragmented interactions, causing inefficiencies and confusion in customer-service scenarios [6][7].

Group 2: Need for Improved Memory Mechanisms
- A more sophisticated memory system is essential for AI to evolve beyond simple question answering and to develop understanding and reasoning abilities [15][26].
- Memory in AI should not just mean storing more data but retaining valuable information that can guide decision-making [11][12].
- A memory infrastructure that makes memory shared, manageable, and traceable among AI agents is crucial for enhancing their collaborative capabilities [10][22].

Group 3: Redefining AI Memory with "Memory Bear"
- The company "Red Bear AI" is building a product called "Memory Bear," which aims to mimic human memory processes for better retention and use of information [10][28].
- The system combines short-term working memory for task connections with long-term memory for knowledge retention, enabling AI to respond more accurately and contextually [14][18].
- A structured memory graph supports the analysis and retrieval of relevant memories, significantly improving the efficiency and accuracy of AI responses [17][18].

Group 4: Implications for Business and the Future of AI
- AI that retains memory will fundamentally change its role in business, allowing it to take over human-like interactions in customer service and other sectors [21][22].
- With continuous memory, AI can understand user context and history, enhancing trust and effectiveness across applications [22][26].
- The evolution of memory systems in AI is seen as a critical step toward artificial general intelligence (AGI), where memory plays a vital role in reasoning and learning [26][28].
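The short-term/long-term split and graph-style retrieval described above can be sketched in a few lines of Python. This is an illustrative toy only, not Memory Bear's actual implementation (which is not public): every name here (`MemoryGraph`, `Agent.observe`, the tag-overlap ranking) is a hypothetical stand-in for the article's short-term working memory, long-term store, and structured memory graph.

```python
from dataclasses import dataclass

@dataclass
class MemoryNode:
    """One long-term memory entry; tags act as the graph's edges."""
    text: str
    tags: frozenset

class MemoryGraph:
    """Long-term store: memories linked and retrieved by shared tags."""
    def __init__(self):
        self.nodes = []

    def add(self, text, tags):
        self.nodes.append(MemoryNode(text, frozenset(tags)))

    def retrieve(self, query_tags, k=3):
        # Rank by tag overlap with the query -- a toy stand-in for the
        # structured-graph retrieval the article describes.
        scored = sorted(self.nodes,
                        key=lambda n: len(n.tags & query_tags),
                        reverse=True)
        return [n.text for n in scored[:k] if n.tags & query_tags]

class Agent:
    """Short-term working memory (sliding window) + long-term graph."""
    def __init__(self, window=5):
        self.window = window
        self.working = []            # short-term: recent turns only
        self.long_term = MemoryGraph()

    def observe(self, turn, tags=()):
        self.working = (self.working + [turn])[-self.window:]
        if tags:                     # only tagged (valuable) info persists
            self.long_term.add(turn, tags)

    def context_for(self, query_tags):
        # Context = recent turns + relevant long-term memories.
        return self.working + self.long_term.retrieve(frozenset(query_tags))
```

The design choice mirrors the article's point: rather than storing everything, only turns judged valuable (here, anything tagged) enter long-term memory, while the working window keeps the current task connected.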
Reinforcement learning + large-model memory: Mem-α teaches agents "how to remember" for the first time
机器之心· 2025-11-07 07:17
Core Insights
- The article emphasizes that "memory" is becoming a crucial factor for intelligent agents to achieve long-term intelligence, especially in the context of rapidly evolving large language models [2].
- Mem-α addresses the limitations of existing memory-enhanced agents, which often rely on manual rules and prompts, by incorporating reinforcement learning for autonomous memory management [2][9].

Memory Management Challenges
- Existing memory-enhanced agents face three main challenges: not knowing which information to retain long-term, when to update old memories, and how to allocate different types of memory effectively [8].
- Prior to Mem-α training, models like Qwen3-4B struggled with memory updates, leading to frequent errors in question answering [6].

Mem-α Contributions
- Mem-α casts memory construction as a sequence decision problem optimized through reinforcement learning, allowing agents to autonomously explore optimal memory-management strategies [9].
- Its architecture is inspired by cognitive science, featuring a three-layer memory system that enables flexible use of different memory types [15].

Training and Evaluation
- Mem-α's training dataset is constructed along four dimensions, focusing on accurate retrieval, test-time learning, and long-range understanding, while excluding conflict resolution because real-world benchmarks for it are lacking [17].
- Experimental results show that Mem-α significantly outperforms existing methods across all evaluation tasks, particularly in accurate retrieval and long-range understanding [22].

Key Findings
- Mem-α generalizes strongly, managing memory usage effectively while maintaining high performance and reducing memory consumption by nearly 50% compared to other models [22].
- Its structured memory architecture improves the organization and retrieval of complex information, outperforming flat-memory baselines [24].
- Mem-α extrapolates robustly, generalizing well to extremely long sequences despite being trained on shorter samples [24].

Ablation Study
- Before Mem-α training, models had low accuracy and struggled with memory management; after training, accuracy improved significantly, showing the effectiveness of reinforcement learning for memory management [25].

Future Implications
- Mem-α points to a trend in which memory management evolves from an engineering problem into a learnable one, with potential applications in multimodal memory and personalized memory strategies [27].
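Mem-α's framing of memory construction as a sequence decision problem can be illustrated with a toy loop: for each incoming fact, a policy chooses a memory-management action, and the reward is downstream question-answering accuracy. This is a hedged sketch under stated assumptions, not the actual Mem-α system (whose policy is a language model trained with reinforcement learning over a three-layer store); the action set, `rollout` loop, and `greedy_policy` here are all hypothetical simplifications.

```python
# Toy action space: retain a new fact, overwrite a stale one, or do nothing.
ACTIONS = ("ADD", "UPDATE", "NOOP")

def step(memory, fact, action):
    """Apply one memory-management action to a key-value memory store."""
    key, value = fact
    if action == "ADD" and key not in memory:
        memory[key] = value
    elif action == "UPDATE":
        memory[key] = value          # overwrite the outdated entry
    return memory

def reward(memory, qa_pairs):
    """Downstream QA accuracy -- the kind of signal an RL trainer optimizes."""
    correct = sum(memory.get(q) == a for q, a in qa_pairs)
    return correct / len(qa_pairs)

def rollout(stream, qa_pairs, policy):
    """One episode: the policy manages memory over a stream of facts,
    then is scored on questions about the final memory state."""
    memory = {}
    for fact in stream:
        memory = step(memory, fact, policy(memory, fact))
    return reward(memory, qa_pairs)

def greedy_policy(memory, fact):
    """Hand-written baseline: update on conflict, add when new."""
    key, value = fact
    if key in memory and memory[key] != value:
        return "UPDATE"
    return "ADD" if key not in memory else "NOOP"
```

In the real system the hand-written `greedy_policy` is replaced by a learned one: reinforcement learning searches over such action sequences so the agent discovers for itself which facts to retain, when to update, and what to ignore.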