Workflow
Agents
X @Avi Chawla
Avi Chawla· 2025-06-25 06:30
How Agents test Agents, clearly explained (with code): ...
Model Context Protocol: Origins and Requests For Startups — Theodora Chu, MCP PM, Anthropic
AI Engineer· 2025-06-18 22:55
MCP Origins and Goals
- MCP was created to address the problem of constantly copying and pasting context into LLMs; it aims to give models the ability to interact with the outside world [4][5][6]
- The goal is an open-source, standardized protocol for model agency that lets more of the ecosystem participate [7][8]
- Anthropic believes that enabling model agency is crucial for LLMs to reach the next level of usefulness and intelligence [8]
MCP Development and Adoption
- MCP was initially developed internally and gained traction during a company hack week [9][10]
- Early feedback questioned why a new protocol was needed, and why it should be open source, given existing tool-calling capabilities [12][13]
- Adoption by coding tools such as Cursor marked a turning point, followed by broader adoption from Google, Microsoft, and OpenAI [14]
Protocol Principles and Updates
- The protocol prioritizes server simplicity, even at the cost of client complexity, on the belief that there will be more servers than clients [20][21]
- Recent updates include support for streamable HTTP, which enables more bidirectionality in agent communication [19]
- Future development focuses on the agent experience, including elicitation, which lets servers request more information from end users [26][27]
- Plans include a registry API so that models can find MCP servers on their own, further supporting model agency [28]
Ecosystem Opportunities
- The industry needs more high-quality servers in verticals beyond dev tools, such as sales, finance, legal, and education [31][34]
- There is a significant opportunity in making servers easier to build, through tooling for hosting, testing, evaluation, and deployment [36]
- Automated MCP server generation is a potential future direction as models become more capable [37]
- Tooling for AI security, observability, and auditing is crucial as applications gain more access to external data [38]
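To make the "simple server" principle above concrete, here is a minimal sketch of how little a tool server has to do. It assumes only that MCP messages are JSON-RPC 2.0 and that `tools/list` and `tools/call` are the tool methods. The `add` tool, the `TOOLS` table, and `handle_request` are hypothetical names invented for illustration. This is not the official SDK, and it omits the initialization handshake, tool input schemas, and the transport layer.

```python
import json

# Hypothetical tool table: the server just maps tool names to plain functions.
TOOLS = {
    "add": {
        "description": "Add two integers.",
        "handler": lambda args: args["a"] + args["b"],
    },
}


def _error(req_id, code, message):
    """Build a JSON-RPC 2.0 error response."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": req_id, "error": {"code": code, "message": message}}
    )


def handle_request(raw: str) -> str:
    """Handle one JSON-RPC 2.0 request string and return the response string."""
    req = json.loads(raw)
    req_id = req.get("id")
    method = req.get("method")

    if method == "tools/list":
        # A real server would also include an input schema for each tool.
        result = {
            "tools": [
                {"name": name, "description": tool["description"]}
                for name, tool in TOOLS.items()
            ]
        }
    elif method == "tools/call":
        params = req.get("params", {})
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            return _error(req_id, -32602, "Unknown tool")
        value = tool["handler"](params.get("arguments", {}))
        # Tool results are returned as a list of content items.
        result = {"content": [{"type": "text", "text": str(value)}]}
    else:
        return _error(req_id, -32601, "Method not found")

    return json.dumps({"jsonrpc": "2.0", "id": req_id, "result": result})


request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "add", "arguments": {"a": 2, "b": 3}},
}
response = json.loads(handle_request(json.dumps(request)))
print(response["result"]["content"][0]["text"])
```

Everything a client would need beyond this (capability negotiation, retries, choosing which tool to call) lives on the client side, which is the trade-off the summary describes.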
No Code LangSmith Evaluations
LangChain· 2025-06-18 15:10
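LangChain Agent Evaluation
- LangChain has lowered the barrier to agent evaluation so that non-developers can run evaluations easily [1]
- LangGraph Studio adds a feature for quickly evaluating LangGraph agents [3]
- Users can select a dataset in LangGraph Studio and start an evaluation experiment [3][4]
- Results can be viewed in LangSmith, including model outputs and evaluation scores [5]
Evaluation Importance and Accessibility
- Evaluation is essential to building effective agents [7]
- Traditional evaluation asks a lot of developers, who need to know the SDK, Pytest, and the Evaluate API [7]
- LangChain aims to provide a no-code way for anyone to evaluate LangGraph agents [8]
- Non-technical users can judge choices such as model selection and prompts based on intuition [9]
Configuration and Customization
- Users can switch graph configurations in the Studio interface and start an evaluation from the selected configuration [9]
- Developers can prepare datasets in advance that contain input topics and reference outputs [10]
- Evaluators can be bound to a dataset, with custom evaluation criteria and scoring rules [11][12][13]
- Users can modify the graph configuration in Studio (for example, the model or the prompt) and start a new evaluation experiment [15][16][17]
- Studio provides a no-code way to configure all of this, which makes fast iteration easy [18]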
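The workflow summarized above (a dataset of inputs and reference outputs, an evaluator bound to it, and an experiment that produces scores) can be sketched in plain Python. This is an illustrative sketch only and does not use the LangSmith SDK or Studio: `Example`, `exact_match`, `run_experiment`, and `toy_agent` are all hypothetical names, and the exact-match scoring rule is an assumption standing in for whatever criteria a real evaluator would apply.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Example:
    """One dataset row: the agent's inputs plus a reference output."""
    inputs: dict
    reference: str


def exact_match(output: str, reference: str) -> float:
    """Hypothetical evaluator: 1.0 if output equals the reference, ignoring case and whitespace."""
    return 1.0 if output.strip().lower() == reference.strip().lower() else 0.0


def run_experiment(
    target: Callable[[dict], str],
    dataset: list[Example],
    evaluator: Callable[[str, str], float],
) -> dict:
    """Run the target on every example, score each output, and average the scores."""
    rows = []
    for example in dataset:
        output = target(example.inputs)
        rows.append(
            {
                "inputs": example.inputs,
                "output": output,
                "score": evaluator(output, example.reference),
            }
        )
    mean_score = sum(row["score"] for row in rows) / len(rows) if rows else 0.0
    return {"rows": rows, "mean_score": mean_score}


def toy_agent(inputs: dict) -> str:
    """Stand-in for an agent; a real target would invoke the graph under test."""
    answers = {"capital of France": "Paris"}
    return answers.get(inputs["topic"], "unknown")


dataset = [
    Example({"topic": "capital of France"}, "paris"),
    Example({"topic": "capital of Peru"}, "Lima"),
]
report = run_experiment(toy_agent, dataset, exact_match)
print(report["mean_score"])
```

Changing the model or prompt corresponds to swapping `toy_agent` for a differently configured target and rerunning `run_experiment` against the same dataset, so the scores stay comparable across experiments.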
Gateway to AGI
Matthew Berman· 2025-06-17 14:07
AlphaEvolve: AI that can discover new knowledge. It really feels like we're at this inflection point of the intelligence explosion. Are we at that inflection point, given that this seems to be self-improving artificial intelligence? You're spot on about the potential of something like AlphaEvolve. It's amazing. We launched it in this low-key way. Yeah, it's some of the most groundbreaking work we are doing. The fact that, you know, you can have these agents which can go improve code, make discoveries. What an extraordin ...
Exposing Agents as MCP servers with mcp-agent: Sarmad Qadri
AI Engineer· 2025-06-11 16:57
My name is Sarmad, and today I want to talk about building effective agents with Model Context Protocol, or MCP. A lot has changed in the last year, especially as far as agent development is concerned. I think 2025 is the year of agents, and things like MCP make agent design simpler and more robust than ever before. So I want to talk about what the agent tech stack looks like in 2025. The second thing is that a lot of MCP servers today are just, you know, one-to-one mappings of existing REST API servic ...
From Search to Solutions: Unlocking the "Triple-Jump" MCP Playbook for Volcano Engine DeepSearch
歸藏的AI工具箱· 2025-04-24 09:34
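Lately it really feels like I've kicked over an MCP nest. Last week Volcano Engine held a developer meetup and announced quite a lot, mainly:
- The official release of the Doubao deep-thinking model Doubao-1.5-thinking-pro and the new visual understanding model Doubao-1.5-vision-pro. I covered these last week; the visual reasoning is very strong, and if you're interested you can read my test.
- Ark × RTC hardware: on-device automatic wake-up and cloud LLM voice capabilities packaged together, so that toys, home devices, wearables, and other devices can be upgraded in one click to hold natural, real-time conversations with people ...
The RTC hardware is hard to test, mainly because I don't understand it well and it requires physical hardware, so this time I'm mostly trying out the DeepSearch service.
What people currently call agent services are still mainly about processing and reorganizing information gathered through AI search. That is the core of the work, and it is also where technical capability matters most.
Now that Volcano Engine has turned this capability into an application, developers are spared a lot of work, and anyone can build DeepSearch.
How well does it work
Let's start with one of the most common questions and tests: travel planning.
Even with a seemingly simple task like this, many AI search products do poorly. The output looks long, but much of it is filler describing each attraction.
What users actually need is timely information: how to arrange transport, how to order stops so they fall along the same route, and what to prepare for risky activities.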