LLM Agents

Open-source RL framework Verlog arrives: built for LLM agents, 400 turns is no problem
机器之心 (Jiqizhixin) · 2025-10-08 04:13
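Building on VeRL and BALROG and following the proven design principles of pytorch-a2c-ppo-acktr-gail, Verlog introduces a set of dedicated optimizations so that training stays stable and efficient as task horizons range from brief interactions to hundreds of turns.
Earlier frameworks such as VeRL and RAGEN handle tasks of about 10 turns well, and verl-agent scales to 50 turns. Verlog is designed for environments with more than 400 turns, which gives it a distinct advantage in complex long-horizon decision-making tasks.
This capability has been validated in difficult domains such as BabyAI, BabaIsAI, and Crafter. In Crafter, for example, episode lengths range from 70 to 400 steps, averaging about 190. In these challenging environments, Verlog delivers strong performance out of the box.
Report by 机器之心 (机器之心 editorial team)
In the AI era, handling short conversations is no longer a hard problem for agents. The real challenge is keeping an agent's reasoning clear and its decisions robust across hundreds of steps of exploration.
Traditional reinforcement learning frameworks cope within a few dozen steps, but once a task stretches to hundreds of steps, sparse rewards, lengthy histories, and policy collapse follow in quick succession.
To address these challenges, researchers from Carnegie Mellon University, the University of Hong Kong, and other institutions proposed ...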
How do you write tools for LLM agents? Anthropic's official tutorial is here
机器之心 (Jiqizhixin) · 2025-09-12 11:31
Core Insights
- The article emphasizes the need to rethink tool development for agentic AI systems, moving away from traditional deterministic logic to accommodate the non-deterministic nature of AI agents [1][3][10]
- It highlights that the effectiveness of AI agents depends heavily on the tools provided to them, and outlines a path for optimizing these tools [1][3][4]
Tool Definition and Development
- Tools for AI agents are defined as a new form of software that bridges deterministic systems and non-deterministic agents, requiring a different approach to design [8][9][10]
- The article suggests a rapid prototyping approach for tool development, followed by comprehensive evaluations to assess performance and make iterative improvements [12][14]
Evaluation Process
- Evaluation tasks should be generated from real-world scenarios and data sources, with each prompt paired with a verifiable response [23][25]
- The article advises against overly simplistic testing environments, advocating for complex conditions that can effectively stress-test the tools [27]
Tool Design Principles
- It is recommended to build a limited number of well-thought-out tools that align with high-value workflows, rather than creating numerous redundant tools [43][47]
- Tools should be designed with clear and independent objectives to prevent confusion among AI agents when selecting the appropriate tool [45][50]
Naming and Response Optimization
- Implementing namespaces for tools can help clarify their functions and reduce confusion for AI agents [48][51]
- Tools should return high-signal information, prioritizing context relevance over flexibility, to enhance the agent's performance [52][56]
Future Outlook
- The article concludes that developing efficient tools for AI agents requires a shift from predictable deterministic patterns to non-deterministic approaches, with a focus on iterative, evaluation-driven processes [66]
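The naming and evaluation points in the summary above can be illustrated with a small Python sketch. It is not taken from the Anthropic tutorial: `ToolRegistry`, `EvalTask`, `evaluate`, the `calendar`/`contacts` namespaces, and the stub agent are all hypothetical names chosen for illustration, and the "agent" is a deterministic stand-in rather than an LLM. The sketch only shows two ideas: prefixing tool names with a namespace so that similar tools stay distinguishable, and pairing each evaluation prompt with a verifiable expected answer.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Tool:
    name: str                      # namespaced name, e.g. "calendar_find_event"
    description: str               # what the agent reads when choosing a tool
    run: Callable[[str], str]      # returns a short, high-signal string


class ToolRegistry:
    """Hypothetical registry that enforces namespaced, unique tool names."""

    def __init__(self) -> None:
        self._tools: Dict[str, Tool] = {}

    def register(self, namespace: str, action: str, description: str,
                 run: Callable[[str], str]) -> Tool:
        name = f"{namespace}_{action}"
        if name in self._tools:
            raise ValueError(f"duplicate tool name: {name}")
        tool = Tool(name, description, run)
        self._tools[name] = tool
        return tool

    def names(self, namespace: str = "") -> List[str]:
        # With no namespace, list every tool; otherwise only that namespace.
        prefix = f"{namespace}_" if namespace else ""
        return sorted(n for n in self._tools if n.startswith(prefix))

    def call(self, name: str, arg: str) -> str:
        return self._tools[name].run(arg)


@dataclass
class EvalTask:
    prompt: str      # task given to the agent
    expected: str    # verifiable answer used for scoring


def evaluate(agent: Callable[[str], str], tasks: List[EvalTask]) -> float:
    """Return the fraction of tasks whose answer matches exactly."""
    if not tasks:
        return 0.0
    passed = sum(1 for t in tasks if agent(t.prompt).strip() == t.expected)
    return passed / len(tasks)


def build_demo_registry() -> ToolRegistry:
    # Toy in-memory data; a real tool would query an actual service.
    events = {"2025-09-12": "Design review"}
    contacts = {"Ada": "ada@example.com"}
    registry = ToolRegistry()
    registry.register("calendar", "find_event",
                      "Return the event title for an ISO date.",
                      lambda date: events.get(date, "no event"))
    registry.register("contacts", "find_email",
                      "Return the email address for a contact name.",
                      lambda name: contacts.get(name, "unknown"))
    return registry


def make_stub_agent(registry: ToolRegistry) -> Callable[[str], str]:
    # Deterministic stand-in for an LLM: routes "kind: argument" prompts.
    def agent(prompt: str) -> str:
        kind, _, arg = prompt.partition(":")
        tool = "calendar_find_event" if kind.strip() == "event" else "contacts_find_email"
        return registry.call(tool, arg.strip())
    return agent


if __name__ == "__main__":
    registry = build_demo_registry()
    tasks = [
        EvalTask("event: 2025-09-12", "Design review"),
        EvalTask("email: Ada", "ada@example.com"),
        EvalTask("email: Bob", "bob@example.com"),  # not in the toy data, so this fails
    ]
    print(registry.names())
    print(f"accuracy: {evaluate(make_stub_agent(registry), tasks):.2f}")
```

In a real setup the stub would be replaced by an agent loop that reads the tool descriptions, and the failing task is the kind of result an iterative, evaluation-driven process would use to decide which tool or description to revise next.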