ICLR 2026 | Renmin University & Tongyi: Stop Just Stacking Context! IterResearch Sustains 2048 Interaction Rounds Without Degradation on a 40K Context

Core Viewpoint
- The article discusses the limits of traditional Search Agents, which stack context linearly and therefore degrade as interaction rounds accumulate. It introduces IterResearch, an iterative deep research paradigm that lets Agents interact for up to 2048 rounds within a fixed 40K context while maintaining performance [4][19].

Group 1: Limitations of Traditional Approaches
- Traditional Search Agents accumulate context round after round, which leads to "context suffocation" and "noise contamination" and ultimately hinders the Agent's ability to give accurate responses [15].
- Because memory grows linearly under the traditional ReAct paradigm, the "generation budget" is squeezed, forcing Agents to produce shorter and less thoughtful answers [15][19].
- Existing strategies such as context folding and summarization do not change the underlying linear growth; they only delay the eventual collapse of the context [9][10].

Group 2: IterResearch Paradigm
- IterResearch shifts from context stacking to context reconstruction: the Agent continuously "cleans up" its workspace, so the complexity of that workspace stays constant [11][14].
- The core mechanism is an evolving report that summarizes past findings, compresses irrelevant information, and updates the reasoning state, so the Agent always operates in a stable environment [13][16].
- The iterative process keeps the state space constant, O(1), in contrast to the linear growth, O(t), of traditional methods, which lets the Agent scale to many more interactions [14].

Group 3: Performance Outcomes
- In experiments, IterResearch's accuracy on the BrowseComp benchmark rose from 3.5% to 42.5% as the interaction budget was scaled up to 2048 rounds, with no signs of performance degradation [19].
- The findings suggest that the difficulty of long-horizon tasks may stem more from limited exploration depth than from insufficient reasoning capability [19].
- Interestingly, Agents averaged only about 80 rounds of interaction, indicating that they stop once sufficient information has been gathered rather than exhausting the budget [19].

Group 4: Applicability and Future Directions
- IterResearch's iterative logic can be applied as a prompting strategy to closed-source models without retraining, yielding improvements of 12.7 percentage points for o3 and 19.2 percentage points for DeepSeek-V3.1 compared to traditional methods [21].
- The structural cognitive mechanism of IterResearch addresses common bottlenecks in long-range reasoning, making it applicable across model architectures [23].
- The article concludes that IterResearch opens new possibilities for long-horizon Agents and points to a promising direction for future research and development [24].
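
The contrast between context stacking, O(t), and context reconstruction, O(1), described under Group 2 can be illustrated with a toy loop. This is only a minimal sketch under stated assumptions, not the authors' implementation: `Workspace`, `update_report`, `fake_tool`, and `REPORT_BUDGET` are hypothetical names, the report update is a simple string trim standing in for a model rewriting its own report, and sizes are measured in characters rather than tokens.

```python
from dataclasses import dataclass

REPORT_BUDGET = 200  # hypothetical cap on the evolving report, in characters


@dataclass
class Workspace:
    """Everything the agent sees in one round; rebuilt from scratch each round."""
    question: str
    report: str = ""            # evolving report: compressed findings so far
    last_action: str = ""       # most recent tool call
    last_observation: str = ""  # most recent tool result

    def render(self) -> str:
        return "\n".join(
            [self.question, self.report, self.last_action, self.last_observation]
        )


def fake_tool(action: str) -> str:
    """Stand-in for a search or browse tool; returns a small noisy result."""
    return f"result of {action}: " + "x" * 50


def update_report(report: str, observation: str) -> str:
    """Stand-in for the model rewriting its report (an assumption of this sketch).

    Appends a short digest of the observation, then keeps only the most
    recent REPORT_BUDGET characters so the report never grows past the cap.
    """
    merged = (report + " | " + observation[:20]).strip(" |")
    return merged[-REPORT_BUDGET:]


def run_stacking(question: str, rounds: int) -> list[int]:
    """ReAct-style baseline: every action and observation is appended forever."""
    context = question
    sizes = []
    for t in range(rounds):
        action = f"search#{t}"
        context += "\n" + action + "\n" + fake_tool(action)
        sizes.append(len(context))
    return sizes


def run_reconstruction(question: str, rounds: int) -> list[int]:
    """Iterative variant: the workspace is rebuilt each round from the question,
    the updated report, and only the latest action and observation."""
    ws = Workspace(question)
    sizes = []
    for t in range(rounds):
        # Fold the previous observation into the report, then discard it.
        report = update_report(ws.report, ws.last_observation)
        action = f"search#{t}"
        observation = fake_tool(action)
        ws = Workspace(question, report, action, observation)
        sizes.append(len(ws.render()))
    return sizes


if __name__ == "__main__":
    stacked = run_stacking("Who founded X?", 2048)
    rebuilt = run_reconstruction("Who founded X?", 2048)
    print("stacking, final context size:", stacked[-1])
    print("reconstruction, largest workspace:", max(rebuilt))
```

In this toy setting the stacked context grows with every round, while the rebuilt workspace stays below a fixed bound determined by the question, the report cap, and one action/observation pair. What the sketch cannot show is the hard part: deciding which findings survive each report rewrite, which in the article is the Agent's own job.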