Memory
Reshaping the Memory Architecture: LLMs Are Installing an "Operating System"
机器之心· 2025-07-16 04:21
Core Viewpoint
- The article examines the limitations of large language models (LLMs) in context-window size and memory management, arguing that better memory systems are needed to sustain long-term interaction [5][6][9]

Context Window Evolution
- Modern LLMs have limited context windows: early models like GPT-3 handled around 2,048 tokens, while newer models such as Meta's Llama 4 Scout claim up to 10 million tokens [2][4]

Memory Management in LLMs
- LLMs face an inherent "memory defect" due to their limited context window, which hampers consistency in long-term interactions [5][6]
- Recent research has focused on memory management systems such as MemOS, which treat memory as a critical resource alongside computational power, enabling continuous updates and self-evolution of LLMs [9][49]

Long Context Processing Capabilities
- Long-context processing rests on several capabilities:
  - Length generalization, which lets models extrapolate to sequences longer than those seen during training [12]
  - Efficient attention mechanisms that reduce computational and memory costs [13]
  - Information retention, the model's capacity to use distant information effectively [14]
  - Prompt design that maximizes the advantages of long context [15]

Types of Memory in LLMs
- Memory can be categorized as:
  - Event memory, which records past interactions and actions [18]
  - Semantic memory, covering accessible external knowledge and the model's understanding of its own capabilities [19]
  - Procedural memory, related to the operational structure of the system [20]

Methods to Enhance Memory and Context
- Approaches to improving LLM memory and context include:
  - Retrieval-augmented generation (RAG), which supplements the model with retrieved knowledge [27][28]
  - Hierarchical summarization, which recursively summarizes content to handle inputs exceeding the model's context length [31]
  - Sliding window inference, which processes long texts in overlapping segments [32]

Memory System Design
- Memory systems in LLMs are akin to databases, integrating lifecycle management and persistent representation capabilities [47][48]
- Recent advances include memory operating systems such as MemOS, which use a layered memory architecture to manage short-term, medium-term, and long-term memory [52][54]

Innovative Memory Approaches
- New memory systems such as MIRIX and Larimar draw inspiration from human memory structures, improving LLMs' ability to update and generalize knowledge rapidly [58][60]
- These systems aim to improve memory efficiency and inference performance through flexible memory mechanisms [44]
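The sliding-window idea mentioned above can be sketched in a few lines. This is a minimal illustration of overlapping segmentation, not MemOS's or any specific model's implementation; the function name and parameters are assumptions for the example:

```python
def sliding_windows(tokens, window_size, overlap):
    """Split a token sequence into overlapping windows.

    Each window shares `overlap` tokens with its predecessor, so
    information near a segment boundary appears in two consecutive
    windows and is not lost at the cut.
    """
    if overlap >= window_size:
        raise ValueError("overlap must be smaller than window_size")
    step = window_size - overlap
    windows = []
    for start in range(0, len(tokens), step):
        windows.append(tokens[start:start + window_size])
        if start + window_size >= len(tokens):
            break  # the last window already covers the tail
    return windows
```

With a 10-token input, a window of 4, and an overlap of 2, each consecutive pair of windows shares two tokens, which is what lets a model carry local context across segment boundaries.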
Context Engineering for Agents
LangChain· 2025-07-02 15:54
Context Engineering Overview
- Context engineering is defined as the art and science of filling the context window with the right information at each step of an agent's trajectory [2][4]
- The industry groups context engineering strategies into writing context, selecting context, compressing context, and isolating context [2][12]
- Context engineering is critical for building agents because agents typically handle longer contexts [10]

Context Writing and Selection
- Writing context means saving information outside the context window, such as using scratchpads for note-taking or memory for retaining information across sessions [13][16][17]
- Selecting context means pulling relevant context into the context window, including instructions, facts, and tools [12][19][20]
- Retrieval-augmented generation (RAG) augments the knowledge available to LLMs, with code agents being its largest-scale application [27]

Context Compression and Isolation
- Compressing context retains only the most relevant tokens, often through summarization or trimming [12][30]
- Isolating context splits context up to help an agent perform a task, with multi-agent systems being the primary example [12][35]
- Sandboxing can isolate token-heavy objects from the LLM context window [39]

LangGraph Support for Context Engineering
- LangGraph, a low-level orchestration framework, supports context engineering through features such as state objects for scratchpads and built-in long-term memory [44][45][48]
- LangGraph facilitates context selection from state or long-term memory and offers utilities for summarizing and trimming message history [50][53]
- LangGraph supports context isolation through multi-agent implementations and integration with sandboxes [55][56]
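Trimming message history is one concrete form of context compression. The sketch below is a hypothetical helper, not LangGraph's actual API; `trim_history` and the word-count token estimate are assumptions for illustration:

```python
def trim_history(messages, budget,
                 count_tokens=lambda m: len(m["content"].split())):
    """Keep the system message plus the most recent messages that fit
    within `budget` tokens -- a simple form of context compression.

    `messages` is a list of {"role": ..., "content": ...} dicts;
    `count_tokens` is a crude word-count stand-in for a real tokenizer.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    kept, used = [], sum(count_tokens(m) for m in system)
    # Walk backwards from the newest message until the budget is spent.
    for m in reversed(rest):
        cost = count_tokens(m)
        if used + cost > budget:
            break
        kept.append(m)
        used += cost
    return system + list(reversed(kept))
```

The design choice worth noting is the traversal order: trimming from the oldest end preserves recency, while the system message is always retained because it carries the agent's standing instructions.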
Architecting Agent Memory: Principles, Patterns, and Best Practices — Richmond Alake, MongoDB
AI Engineer· 2025-06-27 09:56
AI Agents and Memory
- The presentation focuses on the importance of memory in AI agents, arguing that memory is what makes agents reflective, interactive, proactive, reactive, and autonomous [6]
- Different forms of memory are highlighted, including short-term, long-term, conversational entity memory, knowledge data stores, caches, and working memory [8]
- The industry is moving toward AI agents and agentic systems, with a focus on building believable, capable, and reliable agents [1][21]

MongoDB's Role in AI Memory
- MongoDB is positioned as a memory provider for agentic systems, offering the features needed to turn data into memory and enhance agent capabilities [20][21][31]
- MongoDB's flexible document data model and retrieval capabilities (graph, vector, text, and geospatial queries) are highlighted as key advantages for AI memory management [25]
- MongoDB acquired Voyage AI to reduce hallucination in AI systems through better embedding models and re-rankers [32][33]
- Voyage AI's embedding models and re-rankers will be integrated into MongoDB Atlas to simplify data chunking and retrieval strategies [34]

Memory Management and Implementation
- Memory management spans generation, storage, retrieval, integration, updating, and forgetting mechanisms [16][17]
- Retrieval-augmented generation (RAG) is discussed, with MongoDB providing retrieval mechanisms beyond vector search alone [18]
- The presentation introduces "Memoriz", an open-source library with design patterns for various memory types in AI agents [21][22][30]
- Memory types explored include persona memory, toolbox memory, conversation memory, workflow memory, episodic memory, long-term memory, and entity memory [23][25][26][27][29][30]
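The lifecycle stages above (storage, retrieval, updating, forgetting) can be illustrated with a toy in-memory store. `AgentMemory` and its keyword-overlap retrieval are hypothetical constructs for this sketch, not MongoDB's or the library's API:

```python
class AgentMemory:
    """Toy memory store illustrating storage, retrieval,
    updating, and forgetting for an agent."""

    def __init__(self):
        self._clock = 0          # insertion counter, stands in for time
        self._items = {}         # key -> (text, insertion_order)

    def store(self, key, text):
        """Storage and updating: re-storing a key overwrites it."""
        self._clock += 1
        self._items[key] = (text, self._clock)

    def retrieve(self, query, top_k=2):
        """Retrieval: rank memories by keyword overlap with the query
        (a crude stand-in for vector or text search)."""
        q = set(query.lower().split())
        scored = [
            (len(q & set(text.lower().split())), key, text)
            for key, (text, _) in self._items.items()
        ]
        scored.sort(reverse=True)
        return [(key, text) for score, key, text in scored[:top_k] if score > 0]

    def forget(self, keep_latest):
        """Forgetting: drop all but the most recently stored items."""
        ordered = sorted(self._items.items(),
                         key=lambda kv: kv[1][1], reverse=True)
        self._items = dict(ordered[:keep_latest])
```

A production system would replace the overlap score with embedding similarity and the recency cutoff with a richer relevance-plus-age policy, but the lifecycle shape stays the same.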
X @Sam Altman
Sam Altman· 2025-06-03 21:02
also, today we are making a lightweight version of memory available to the free tier of chatgpt! memory has probably become my favorite feature in chatgpt; excited for us to improve this a lot over time. ...
New Speculative Novel The Version Who Stayed by Krispy Launches on Amazon — A Hauntingly Beautiful Exploration of Identity, Memory, and Emotional Truth
GlobeNewswire News Room· 2025-04-28 15:04
Core Insights
- Krispy's debut novel, *The Version Who Stayed*, is the first in a series titled *The Mirror Archive*, exploring themes of identity, regret, and emotional truth through the character Auren Solven [1][9]
- The narrative is set in a world similar to our own but with metaphysical elements, focusing on the emotional consequences of choices rather than technical explanations [2][4]

Summary by Sections

Narrative Overview
- The story begins with a cryptic letter that prompts Auren to reflect on a life-altering decision, leading to a journey through alternate realities [2]
- Auren's exploration reveals a life they could have lived, highlighting the complexities of joy and regret [3]

Emotional Depth
- The novel emphasizes emotional realism, prioritizing personal truth and the courage to accept chosen lives over fantastical elements [4][5]
- Krispy's storytelling philosophy centers on the quiet moments that shape individuals, suggesting that transformation often occurs in stillness [5]

Reader Reception
- Early responses have been overwhelmingly positive, with readers praising the novel's emotional intelligence and relatability [6][7]
- The book has been described as accessible even to readers unfamiliar with literary sci-fi, resonating with a broad audience [7]

Unique Positioning
- *The Version Who Stayed* distinguishes itself by reclaiming softness as a form of resistance, focusing on self-acceptance rather than triumph [8]
- Krispy's work invites readers to reflect on their own lives and decisions, making Auren's journey a lens for personal examination [8]

Publication and Availability
- The novel is now available on Amazon, marking the beginning of Krispy's literary journey and inviting readers to engage with its themes [9][10]
An In-Depth Review of Deep Research Products: Has the Next Leap Point for LLM Products Arrived?
Founder Park· 2025-04-23 12:37
This article originally appeared on 海外独角兽, by 拾象.

- Deep Research products can be understood as end-to-end systems, built on large-model capabilities, that combine retrieval with report generation: they iteratively search and analyze information and output a detailed report.
- Following Han Lee's 2x2 framework, current Deep Research products differentiate along two dimensions: output depth and degree of training.
- Output depth is how many iterative loops the product runs over prior research results to gather more information; it can be seen as the necessary foundation of agentic capability.
- A low degree of training denotes a system shaped by manual intervention and tuning, such as hand-tuned prompts; a high degree of training means the system is trained with machine learning.
- Compared with traditional LLM search products, Deep Research is a leap toward an early form of Agent product, and may become a classic product form representative of this stage.
- Through the integration of reasoning models, Deep Research products have already grown into Agent product ...
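The iterative search-analyze-refine loop that defines Deep Research products can be sketched as follows. `deep_research`, the `search` callback, and the naive follow-up heuristic are illustrative assumptions, not any vendor's implementation:

```python
def deep_research(question, search, max_iterations=3):
    """Iterative search-and-report loop in the spirit of Deep Research
    products: each round retrieves snippets, accumulates findings, and
    derives follow-up queries, up to an iteration budget.

    `search` is a caller-supplied function: query -> list of text snippets.
    """
    findings, queries = [], [question]
    for _ in range(max_iterations):
        if not queries:
            break
        query = queries.pop(0)
        for snippet in search(query):
            if snippet not in findings:
                findings.append(snippet)
                # Naive refinement: drill deeper into each new finding.
                queries.append(f"more about: {snippet}")
    # Compile the accumulated findings into a simple report.
    return f"Q: {question}\n" + "\n".join(f"- {s}" for s in findings)
```

The iteration budget corresponds to the "output depth" axis in the 2x2 framework: more loops mean deeper, more agentic research; a real product would also judge coverage to decide when to stop early.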