Workflow
Context Compaction
icon
Search documents
Build Hour: Agent Memory Patterns
OpenAI· 2025-12-04 20:28
Agent Memory Patterns & Context Engineering - Context engineering is both an art and a science, requiring judgment to decide what matters most while employing concrete patterns and methods for systematic context management [1] - Modern LLMs perform based on the context provided, making context engineering a broader discipline than prompt engineering or retrieval, encompassing prompt engineering, structured output, RAG, state and history management, and memory [1] - Core strategies for effective context management include reshape and fit (context trimming, compaction, summarization), isolate and route (offloading context to sub-agents), and extract and retrieve (memory extraction, state management, memory retrieval) [1] - Short-term memory (in-session techniques) focuses on maximizing the context window during active interaction, while long-term memory (cross-session) builds continuity across sessions [1] - Context management challenges include context burst (sudden token spikes), context conflict (contradictory information), context poisoning (incorrect information propagation), and context noise (redundant tool definitions) [2] Techniques and Solutions - Solutions involve managing context efficiently using techniques like trimming, compaction, state management, and memories, moving beyond prompt engineering [3] - AI agents can be grouped into RAG-heavy assistants, tool-heavy workflows, and conversational concierges, each with different context profiles [3] - Prompting best practices include being explicit and structured, giving room for planning and self-reflection, and avoiding conflicts in tool definitions [3] - Engineering techniques include context trimming (dropping older turns), context compaction (dropping tool calls from older turns), and context summarization (compressing prior messages into structured summaries) [3][4] - Memory shapes can range from simple JSON notes to complex paragraphs, with extraction using memory tools and state management involving defining state objects [26][27][28][29] Best Practices and Evaluation - Best practices in agent memory design include understanding the typical context, deciding when and how to remember and forget, and continuously cleaning and consolidating memories [31][32] - Evaluation involves running evals with and without memory, building memory-specific evals for long-running tasks, and finding the right heuristics for context engineering techniques [36][37][38][39][40] - Strategies for keeping memory fresh include temporal tags and weighted decay to manage stale memories and prioritize recent information [46][47][48][49] - Scaling agent memory systems involves considering whether to use a retrieval-based approach (scaling vector databases) or a summarization approach (scaling data storage) [51][52][53][54][55]