LangSmith
LangChain Founder Warns: 2026 Is the Watershed for "Agent Engineering," and the Survival Test for Traditional Software Companies Has Begun
AI前线· 2026-01-31 05:33
Core Viewpoint
- The emergence of "long-horizon agents" is reshaping the software engineering paradigm, moving from deterministic systems defined in code to systems that also depend on a black-box model, whose behavior can only be understood by running them and observing real executions [2][3][6].

Group 1: Long-Horizon Agents
- Long-horizon agents are seen as a turning point in AI, with their adoption predicted to accelerate from late 2025 into 2026 [2].
- These agents function more like "digital employees," capable of executing tasks over extended periods, learning from trial and error, and self-correcting [2][3].
- The transition to long-horizon agents may challenge traditional software companies, similar to the shift from on-premises to cloud solutions, where not all companies successfully adapted [2][3].

Group 2: Differences in Software Development
- Traditional software development relies on deterministic logic written in code, while agent-based systems introduce non-deterministic behavior, making it necessary to observe their real-time execution to understand their operations [30][32].
- The concept of "tracing" has become crucial in agent systems, allowing developers to track internal processes and understand the context at each step, which differs significantly from traditional software debugging methods [31][32].
- The iterative process of developing agents is more complex, as developers cannot predict behavior before deployment, necessitating more rounds of refinement and adjustment [34][36].

Group 3: The Role of Data and Instructions
- Existing software companies possess valuable data and APIs that can be leveraged in the agent era, but the ability to use these assets effectively will depend on new engineering approaches [37][38].
- Instructions on how to use data effectively are becoming increasingly important, as work previously executed by humans is being automated through agents [38].
- The integration of domain-specific knowledge into agent systems is essential for their effectiveness, as seen in examples from the financial sector [38].

Group 4: Future of Agent Development
- Memory capabilities in agents are anticipated to become a significant competitive advantage, allowing them to learn and improve over time [51][52].
- The development of user interfaces for long-horizon agents will likely require both synchronous and asynchronous management to handle tasks effectively [53][54].
- Code sandboxes are expected to become a critical component of agent capabilities, enabling safe execution and verification of scripts [56].
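As a rough illustration of the "tracing" idea in Group 2, the sketch below records the inputs and output of every step in a run so the context at each step can be inspected afterwards. It is a minimal standard-library sketch, not the LangSmith API; `Trace`, `Step`, `fake_model`, and `lookup` are hypothetical names introduced only for this example.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Step:
    name: str
    inputs: dict
    output: Any
    seconds: float


@dataclass
class Trace:
    steps: list = field(default_factory=list)

    def record(self, name: str, fn: Callable[..., Any], **inputs: Any) -> Any:
        """Run one step and store the context it saw and what it returned."""
        start = time.perf_counter()
        output = fn(**inputs)
        elapsed = time.perf_counter() - start
        self.steps.append(Step(name, dict(inputs), output, elapsed))
        return output


def fake_model(prompt: str) -> str:
    # Hypothetical stand-in for a non-deterministic LLM call.
    return "lookup:weather"


def lookup(query: str) -> str:
    # Hypothetical tool.
    return f"result for {query}"


trace = Trace()
decision = trace.record("model", fake_model, prompt="What's the weather?")
tool_query = decision.split(":", 1)[1]
answer = trace.record("tool", lookup, query=tool_query)

# Inspect what each step received and produced, in execution order.
for step in trace.steps:
    print(step.name, step.inputs, "->", step.output)
```

The point of the design is that the record is produced by the run itself, which matches the claim that agent behavior has to be observed in execution and cannot be read off the code alone.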
LangChain Founder Warns: 2026 Is the Watershed for "Agent Engineering," and the Survival Test for Traditional Software Companies Has Begun
程序员的那些事· 2026-01-31 03:16
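Reposted from: InfoQ, compiled by Tina. For the past few decades, software engineering has rested on one stable premise: a system's behavior is written in its code. By reading the code, an engineer can infer how the system will run in most scenarios; testing, debugging, and release all revolve around "determinism." The arrival of agents is shaking that premise: in an agent application, behavior is no longer decided by code alone but also by the model itself, a non-deterministic black box that runs outside the code. You cannot understand it just by reading the code; you have to run it and watch what it does on real inputs to know what the system is "actually doing." In the podcast, LangChain founder Harrison Chase also treats the recent wave of coding agents that "can keep running continuously," Deep Research, and similar phenomena as an inflection point, and predicts that the rollout of this kind of "long-horizon agent" will accelerate further from late 2025 into 2026. This pushes a question to the forefront: with 2026 viewed by many as "the first year of long-horizon agents," can existing software companies survive? Just as not every software company made the move from on-prem to the cloud successfully, a change of engineering paradigm re-selects the participants. A long-horizon agent is more like a "digital employee": it is not simply multi-turn chat, but something that can keep executing over a longer period, try and fail repeatedly, and continuously self- ...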
Sequoia in Conversation with LangChain's Founder: In 2026 AI Leaves the Chat Box Behind and Enters the First Year of Long-Horizon Agents
36Kr · 2026-01-28 01:01
Group 1
- The core assertion of the article is that AGI (Artificial General Intelligence) represents the ability to "figure things out," marking a shift from the era of "Talkers" to "Doers" by 2026, driven by Long Horizon Agents [1][2]
- Long Horizon Agents are characterized by their ability to autonomously plan, operate over extended periods, and exhibit expert-level features, expanding their capabilities from specific verticals to complex tasks across various domains [1][2]
- The article highlights that the value of Long Horizon Agents lies in their ability to produce high-quality drafts for complex tasks, with a focus on the need for opinionated software harnesses and file system permissions as standard features for all agents [1][2][3]

Group 2
- Harrison Chase emphasizes that the recent advancements in models and the understanding of effective harnessing have led to the successful implementation of Long Horizon Agents, particularly in the coding domain, which is rapidly expanding to other fields [2][4]
- The article discusses the importance of Scaffolding and Harness in the development of agents, where Scaffolding refers to auxiliary code structures that guide model outputs, while Harness encompasses the software environment that manages context and tool interactions [3][8]
- The emergence of AI Site Reliability Engineers (AI SREs) is noted as a significant application of Long Horizon Agents, capable of handling long-duration tasks and generating comprehensive reports for human review [5][6]

Group 3
- The article outlines the evolution of agent frameworks, transitioning from general frameworks to more opinionated harness architectures, with a focus on the integration of planning tools and file system interactions [8][10]
- The concept of Deep Agents is introduced, which represents the next generation of autonomous agent architecture built on LangGraph, emphasizing the need for effective context management and compression techniques [9][12]
- The discussion includes the challenges of context management in Long Horizon Agents, particularly the need for efficient compression strategies as task cycles extend [11][18]

Group 4
- The article identifies the critical role of Memory in Long Horizon Agents, allowing them to self-improve and adapt over time, which is essential for maintaining performance in long-duration tasks [36][37]
- The future interaction models for Long Horizon Agents are anticipated to combine asynchronous and synchronous modes, allowing for effective management and collaboration between agents and users [38][39]
- The necessity for agents to have access to file systems is emphasized, as it enhances context management and operational capabilities, particularly in coding tasks [41][42]
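The context-compression point in Group 3 can be sketched as follows: once the message history exceeds a budget, older messages are replaced by a single summary while the most recent ones are kept verbatim. This is a minimal sketch under stated assumptions, not how Deep Agents or LangGraph implement it; `compact`, `count_tokens`, and `naive_summary` are hypothetical, and counting whitespace-separated words is only a stand-in for a real tokenizer.

```python
from typing import Callable


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words approximate tokens.
    return len(text.split())


def compact(messages: list[str], budget: int, keep_recent: int,
            summarize: Callable[[list[str]], str]) -> list[str]:
    """Replace older messages with one summary when the history exceeds budget."""
    if keep_recent < 1:
        raise ValueError("keep_recent must be at least 1")
    total = sum(count_tokens(m) for m in messages)
    if total <= budget or len(messages) <= keep_recent:
        return list(messages)
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    return [summarize(old)] + recent


def naive_summary(old: list[str]) -> str:
    # Hypothetical summarizer; a real harness would likely call a model here.
    return f"[summary of {len(old)} earlier messages]"


history = [f"step {i}: tool output with several words" for i in range(10)]
compacted = compact(history, budget=30, keep_recent=3, summarize=naive_summary)
print(compacted)
```

Keeping the latest messages untouched is one possible design choice: the most recent tool outputs are usually what the next step needs, while earlier steps can often be carried as a summary.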
Sequoia in Conversation with LangChain's Founder: In 2026 AI Leaves the Chat Box Behind and Enters the First Year of Long-Horizon Agents
海外独角兽· 2026-01-27 12:33
Core Insights
- The article asserts that AGI represents the ability to "figure things out," marking a shift from the era of "Talkers" to "Doers" in AI by 2026, driven by Long Horizon Agents [2]
- Long Horizon Agents are characterized by their ability to autonomously plan, operate over extended periods, and exhibit expert-level features across complex tasks, expanding from coding to various domains [3][4]
- The emergence of these agents is seen as a significant turning point, with the potential to revolutionize how complex tasks are approached and executed [3][21]

Long Horizon Agents' Explosion
- Long Horizon Agents are finally beginning to work effectively, with the core idea being to allow LLMs to operate in a loop and make autonomous decisions [4]
- The ideal interaction with agents combines asynchronous management and synchronous collaboration, enhancing their utility in various applications [3][4]
- The coding domain has seen the most rapid adoption of these agents, with examples like AutoGPT demonstrating their capabilities in executing complex multi-step tasks [4][5]

Transition from General Framework to Harness Architecture
- The distinction between models, frameworks, and harnesses is crucial, with harnesses being more opinionated and designed for specific tasks, while frameworks are more abstract [8][9]
- The evolution of harness engineering is particularly advanced in coding companies, which have successfully integrated these concepts into their products [12][14]
- The integration of file system permissions into agents is essential for effective context management and task execution [24]

Future Interactions and Production Forms
- Memory is identified as a critical component for self-improvement in agents, allowing them to retain and utilize past interactions to enhance performance [35]
- The future of agent interaction is expected to blend asynchronous and synchronous modes, facilitating better user engagement and task management [36]
- The necessity for agents to access file systems is emphasized, as it significantly enhances their operational capabilities [39]
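The "LLM in a loop" idea mentioned above can be sketched in a few lines: the model picks the next action, the harness runs the matching tool, and the observation is fed back until the model decides to finish. This is a toy sketch, not any particular framework's loop; `scripted_model`, `run_agent`, the `search` tool, and the `call`/`finish` action format are all hypothetical, and the "model" is a scripted function so the example runs without an API key.

```python
from typing import Callable

Tool = Callable[[str], str]


def scripted_model(history: list[str]) -> str:
    # Hypothetical stand-in for an LLM: picks the next action from the history.
    if not any(h.startswith("observation:") for h in history):
        return "call search long-horizon agents"
    return "finish Draft report based on the search result."


def run_agent(task: str, model: Callable[[list[str]], str],
              tools: dict[str, Tool], max_steps: int = 5) -> str:
    """Loop: ask the model for an action, run the tool, feed the result back."""
    history = [f"task: {task}"]
    for _ in range(max_steps):
        action = model(history)
        kind, _, rest = action.partition(" ")
        if kind == "finish":
            return rest
        tool_name, _, arg = rest.partition(" ")
        observation = tools[tool_name](arg)
        history.append(f"observation: {observation}")
    # The step cap keeps a model that never finishes from looping forever.
    return "stopped: step limit reached"


tools = {"search": lambda q: f"3 articles about {q}"}
print(run_agent("write a report", scripted_model, tools))
```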
Tracing Claude Code to LangSmith
LangChain· 2025-12-19 21:05
Are you curious about what Claude Code is doing behind the scenes? Or do you want observability in the critical workflows that you've set up with Claude Code? Hey, I'm Tanish from LangChain, and we built a Claude Code to LangSmith integration so that you can see each step that Claude Code takes, whether that be an LLM call or tool calls. Um, it's pretty fascinating to see the entire trace. So I want to show you what this looks like. Um, uh, I have, uh, a project here. It's a very, very, very simple, uh, agent that I build with u ...
The agent development loop with LangSmith + Claude Code / Deepagents
LangChain· 2025-12-17 17:53
Hey, this is Lance. Recently put out this blog post called Debugging Deep Agents with LangSmith. And the big idea here was connecting LangSmith as a system of record for your traces with code agents like Deep Agents, but it could be other code agents like Claude Code, to create kind of an iterative feedback loop. So you're having a code agent produce some LangGraph code that's being run. Traces are going to LangSmith. And there's a way for the code agents to pull traces back, reflect on them, and update your lane share langra ...
Observing & Evaluating Deep Agents Webinar with LangChain
LangChain· 2025-12-12 21:40
Explore the unique challenges of observing and evaluating Deep Agents in production. Deep Agents represent a shift in how AI systems operate – unlike simple chatbots or basic RAG applications, these agents run for extended periods, execute multiple sub-tasks, and make complex decisions autonomously. In this session, we'll dive into practical approaches for gaining visibility into Deep Agent behavior and measuring their effectiveness using LangSmith. Learn more about Deep Agents here: https://blog.langchain. ...
How to debug voice agents with LangSmith
LangChain· 2025-12-09 21:39
Voice is one of the most natural ways to interact with AI. And as the models are getting better, I'm excited about new use cases and interaction patterns that it's going to unlock, especially in industries like education and customer service. It's surprisingly easy to get started building a voice agent. And so let's go through that in this video. I'm Tannushri and I'm going to show you how to build a voice agent, specifically a French tutor with this framework called Pipecat. I'm going to walk through how it wor ...
LangChain Academy New Course: LangSmith Essentials
LangChain· 2025-11-13 17:24
I'm excited to announce the release of our latest LangChain Academy course, LangSmith Essentials. In this quickstart course, you'll learn to observe, evaluate, and deploy an AI agent in less than 30 minutes. Testing applications is an essential part of the development lifecycle, but LLM systems are non-deterministic, meaning we can't predict exactly what output a given input will produce. When you add multi-turn interactions and tool-calling agents into the mix, the process becomes even more complex and less ...
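One common way to test a non-deterministic system of the kind described above is to run the same input several times and report a pass rate, since a single exact-match assertion can pass or fail by chance. The sketch below assumes this approach; it is not taken from the course, and `flaky_agent` and `pass_rate` are hypothetical names. The random generator is seeded only so the sketch is repeatable.

```python
import random
from typing import Callable


def flaky_agent(question: str, rng: random.Random) -> str:
    # Hypothetical non-deterministic system: usually right, sometimes not.
    return "Paris" if rng.random() < 0.8 else "Lyon"


def pass_rate(run: Callable[[], str], check: Callable[[str], bool],
              trials: int) -> float:
    """Run the system several times and report the fraction of passing outputs."""
    passed = sum(1 for _ in range(trials) if check(run()))
    return passed / trials


rng = random.Random(0)  # Seeded only so the sketch is repeatable.
rate = pass_rate(lambda: flaky_agent("Capital of France?", rng),
                 lambda out: out == "Paris", trials=50)
print(f"pass rate: {rate:.2f}")
```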
Why We Built LangSmith for Improving Agent Quality
LangChain· 2025-11-04 16:04
LangSmith Platform Updates
- LangChain is launching new features for LangSmith, a platform for agent engineering, focusing on tracing, evaluation, and observability to improve agent reliability [1]
- LangSmith introduces "Insights," a feature designed to automatically identify trends in user interactions and agent behavior from millions of daily traces, helping users understand how their agents are being used and where they are making mistakes [1]
- Insights is inspired by Anthropic's work on understanding conversation topics, but adapted for LangSmith's broader range of agent payloads [5][6]

Evaluation and Testing
- LangSmith emphasizes the importance of methodical testing, including online evaluations, to move beyond simple "vibe testing" and add rigor to agent development [1][33]
- LangSmith introduces "thread evals," which allow users to evaluate agent performance across entire user interactions or conversations, providing a more comprehensive view than single-turn evaluations [16][17]
- Online evals measure agent performance in real time using production data, complementing offline evals that are based on known examples [24]
- The company argues against the idea that offline evals are obsolete, highlighting their continued usefulness for regression testing and ensuring agents perform well on known interaction types [30][31]

Use Cases and Applications
- Insights can help product managers understand which product features are most frequently used with an agent, informing product roadmap prioritization [2][12]
- Insights can assist AI engineers in identifying and categorizing agent failure modes, such as incorrect tool usage or errors, enabling targeted improvements [3][13]
- Thread evals are particularly useful for evaluating user sentiment across an entire conversation or tracking the trajectory of tool calls within a conversation [21]

Future Development
- LangSmith plans to introduce agent and thread-level metrics into its dashboards, providing greater visibility into agent performance and cost [26]
- The company aims to enable more flows with automation rules over threads, such as spot-checking threads with negative user feedback [27]
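To make the difference between single-turn and thread-level evaluation concrete, the sketch below scores the same conversation both ways: each turn is checked on its own, then the whole thread is checked for its tool-call trajectory and for how the user ended the conversation. It is an illustration only, not LangSmith's thread evals API; `Turn`, `turn_eval`, `thread_eval`, and the tool names are hypothetical, and a crude keyword check stands in for a real sentiment judge.

```python
from dataclasses import dataclass, field


@dataclass
class Turn:
    user: str
    agent: str
    tool_calls: list = field(default_factory=list)


def turn_eval(turn: Turn) -> bool:
    # Single-turn check: did the agent answer at all?
    return bool(turn.agent.strip())


def thread_eval(thread: list, expected_tools: list) -> dict:
    """Score the whole conversation: tool-call trajectory and final user mood."""
    trajectory = [name for turn in thread for name in turn.tool_calls]
    # Assumption: a crude keyword check stands in for a sentiment judge.
    negative = any(w in thread[-1].user.lower() for w in ("wrong", "useless"))
    return {
        "trajectory_ok": trajectory == expected_tools,
        "ended_negative": negative,
    }


thread = [
    Turn("Find my order", "Looking it up.", ["search_orders"]),
    Turn("That's the wrong order", "Sorry, retrying.", ["search_orders"]),
]
print([turn_eval(t) for t in thread])
print(thread_eval(thread, ["search_orders", "get_order"]))
```

In this example every individual turn passes, while the thread-level check flags both an unexpected tool trajectory and a negative ending, which is the kind of signal a single-turn evaluation would miss.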