A 64k-star open-source agent framework gets a full rebuild: OpenHands' major upgrade takes on OpenAI and Google
机器之心·2025-11-08 04:02

Core Insights
- The OpenHands development team announced the completion of an architectural rebuild of the OpenHands Software Agent SDK, evolving it from V0 to V1. The new SDK provides a practical foundation for prototyping, unlocks new custom applications, and supports large-scale, reliable deployment of agents [1][2].

Design Principles
- OpenHands V1 introduces a new architecture based on four design principles that address the limitations of V0:
  1. Sandboxed execution should be optional rather than mandatory, allowing flexibility without sacrificing security [9].
  2. Sessions are stateless by default, with a single source of truth for session state; this isolates changes and enables deterministic replay and strong consistency [10].
  3. Strict separation of concerns isolates the agent core into a "software engineering SDK," so research and applications can evolve independently [11].
  4. Everything should be composable and safely extensible, with modular packages that support local, hosted, or containerized execution [12][13].

Ecosystem and Features
- OpenHands V1 is a complete software agent ecosystem, including CLI and GUI applications built on the OpenHands Software Agent SDK [15][16].
- The SDK offers deterministic replay, immutable agent configuration, and an integrated tool system that supports both local prototyping and secure remote execution with minimal code changes [18][20].

Comparison with Competitors
- The team compared the OpenHands SDK with the OpenAI, Claude, and Google SDKs, highlighting that OpenHands uniquely combines 16 additional features, including native remote execution and multi-LLM routing across more than 100 vendors [21][22].

Reliability and Evaluation
- The SDK's reliability and performance are assessed through continuous testing and benchmark evaluations; automated tests cost only $0.50–$3 per run and complete in about 5 minutes [24][25].
- The SDK demonstrates competitive performance in software engineering and general agent benchmarks, achieving a 72% solution rate on SWE-Bench and a 67.9% accuracy on GAIA using Claude Sonnet 4.5 [29][30].
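The "single source of truth with deterministic replay" principle described above can be sketched generically in Python. This is an illustrative assumption, not the actual OpenHands SDK API: the `Event` and `Session` names and the append-only log design are hypothetical, chosen only to show how rebuilding state from an immutable event log makes replay deterministic.

```python
# Hypothetical sketch (not the OpenHands SDK API): event-sourced session
# state, where the append-only event log is the single source of truth.
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    """An immutable record of one agent step (action or observation)."""
    kind: str
    payload: str


@dataclass
class Session:
    """All session state lives in the event log; nothing is stored elsewhere."""
    events: list[Event] = field(default_factory=list)

    def append(self, event: Event) -> None:
        self.events.append(event)

    def replay(self) -> "Session":
        # Rebuilding from the log is deterministic: replaying the same
        # events always reconstructs an identical session state.
        fresh = Session()
        for event in self.events:
            fresh.append(event)
        return fresh


session = Session()
session.append(Event("action", "run tests"))
session.append(Event("observation", "2 passed"))
assert session.replay().events == session.events
```

Because every change is an appended event rather than an in-place mutation, two replicas that replay the same log converge on the same state, which is what enables the strong-consistency and replay guarantees the design principles call for.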