Tracing
Why We Built LangSmith for Improving Agent Quality
LangChain· 2025-11-04 16:04
LangSmith Platform Updates
- LangChain is launching new features for LangSmith, a platform for agent engineering, focused on tracing, evaluation, and observability to improve agent reliability [1]
- LangSmith introduces "Insights," a feature that automatically identifies trends in user interactions and agent behavior across millions of daily traces, helping users understand how their agents are being used and where they make mistakes [1]
- Insights is inspired by Anthropic's work on understanding conversation topics, adapted for LangSmith's broader range of agent payloads [5][6]

Evaluation and Testing
- LangSmith emphasizes methodical testing, including online evaluations, to move beyond "vibe testing" and add rigor to agent development [1][33]
- LangSmith introduces "thread evals," which evaluate agent performance across entire user interactions or conversations, giving a more complete view than single-turn evaluations [16][17]
- Online evals measure agent performance in real time on production data, complementing offline evals, which are based on known examples [24]
- The company argues that offline evals are not obsolete: they remain useful for regression testing and for ensuring agents perform well on known interaction types [30][31]

Use Cases and Applications
- Insights can help product managers see which product features are most frequently used with an agent, informing product roadmap prioritization [2][12]
- Insights can help AI engineers identify and categorize agent failure modes, such as incorrect tool usage or errors, enabling targeted improvements [3][13]
- Thread evals are particularly useful for evaluating user sentiment across an entire conversation or tracking the trajectory of tool calls within it [21]

Future Development
- LangSmith plans to add agent- and thread-level metrics to its dashboards, giving greater visibility into agent performance and cost [26]
- The company aims to enable more flows with automation rules over threads, such as spot-checking threads with negative user feedback [27]
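The thread-eval idea above can be sketched with a toy evaluator that scores a whole conversation rather than a single turn. This is an illustrative stand-in, not the LangSmith evaluator API: the `thread_sentiment` function and its keyword lists are assumptions made for the example.

```python
# Toy thread-level evaluator: scores an entire conversation, not one turn.
# Illustrative only -- not the LangSmith SDK.

NEGATIVE = {"wrong", "broken", "frustrated", "useless"}
POSITIVE = {"thanks", "great", "perfect", "helpful"}

def thread_sentiment(thread: list[dict]) -> float:
    """Score user sentiment across all turns, in [-1.0, 1.0]."""
    user_words = [
        w.strip(".,!?").lower()
        for turn in thread
        if turn["role"] == "user"
        for w in turn["content"].split()
    ]
    pos = sum(w in POSITIVE for w in user_words)
    neg = sum(w in NEGATIVE for w in user_words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total

thread = [
    {"role": "user", "content": "The export tool is broken and I'm frustrated."},
    {"role": "assistant", "content": "Sorry about that, try re-authenticating."},
    {"role": "user", "content": "That worked, thanks! Great support."},
]
print(thread_sentiment(thread))  # 0.0: two negative vs. two positive user words
```

A production thread eval would replace the keyword counting with an LLM judge, but the shape is the same: the unit of evaluation is the whole thread, not a single request/response pair.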
Getting Started with LangSmith (2/8): Types of Runs
LangChain· 2025-09-29 04:27
Hi, welcome back. In this video, we're going to talk about the types of runs you can create while tracing in LangSmith. We'll then show how these runs can help you understand your application's execution. Traces can be thought of as logs for your application, and the LangChain team has put a lot of effort into the UX for displaying traces. This is because traditional logs can be difficult to parse for LLM applications. If you've ever had to dig through huge unformatted stack traces for an LLM application, you k ...
Tracing Claude Code to LangSmith
LangChain· 2025-08-06 14:32
Setup and Configuration
- Setting up tracing from Claude Code to LangSmith requires creating a LangSmith account and generating an API key [1]
- Telemetry for Claude Code is enabled by setting the `CLAUDE_CODE_ENABLE_TELEMETRY` environment variable to 1 [3]
- The OTLP (OpenTelemetry Protocol) exporter must be configured with HTTP transport and JSON encoding for LangSmith ingestion [4]
- The LangSmith Cloud endpoint needs to be specified for logs from Claude Code, or a self-hosted instance URL if applicable [5]
- Setting the API key in the headers authenticates the connection to LangSmith, along with specifying a tracing project [5]
- Logging of user prompts and inputs is enabled by setting the appropriate environment variable to true [6]

Monitoring and Observability
- LangSmith collects and displays events from Claude Code, providing detailed logs of Claude Code sessions [3]
- Traces in LangSmith show individual actions performed by Claude Code, including model names, token usage, and latency [8]
- Claude Code sends cost information associated with each request to LangSmith [8]
- LangSmith's waterfall view groups runs by timestamp, showing the sequence of user prompts and Claude Code actions [13]
- LangSmith provides pre-built dashboards for monitoring general usage, including total traces, token usage, and costs over time [14]
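The configuration steps above amount to a handful of environment variables. Here is a sketch that emits them as shell exports; beyond `CLAUDE_CODE_ENABLE_TELEMETRY`, the `OTEL_*` names follow standard OpenTelemetry conventions, but treat the exact set and endpoint as assumptions and confirm them against the LangSmith and Claude Code docs for your deployment.

```python
# Sketch of the telemetry environment described above, emitted as shell
# exports. Variable names other than CLAUDE_CODE_ENABLE_TELEMETRY are
# standard OpenTelemetry conventions; verify against the docs.
telemetry_env = {
    "CLAUDE_CODE_ENABLE_TELEMETRY": "1",         # turn on Claude Code telemetry
    "OTEL_LOGS_EXPORTER": "otlp",                # export logs via OTLP
    "OTEL_EXPORTER_OTLP_PROTOCOL": "http/json",  # HTTP transport, JSON encoding
    # Assumed LangSmith Cloud endpoint; use your self-hosted URL if applicable:
    "OTEL_EXPORTER_OTLP_ENDPOINT": "https://api.smith.langchain.com/otel",
    # Placeholder key; the API key in the headers authenticates ingestion:
    "OTEL_EXPORTER_OTLP_HEADERS": "x-api-key=<your-langsmith-api-key>",
    "OTEL_LOG_USER_PROMPTS": "1",                # include user prompts (assumed name)
}

for name, value in telemetry_env.items():
    print(f"export {name}='{value}'")
```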
n8n Tracing to LangSmith
LangChain· 2025-08-05 14:30
AI Workflow Automation & Observability
- n8n is an AI workflow builder that lets users string together nodes into AI agents and set up external triggers for automated execution [1]
- LangSmith is an AI observability and evaluation product designed to monitor the performance of AI applications [2]

Integration & Setup
- Connecting n8n to LangSmith requires generating a LangSmith API key and setting it in the n8n deployment environment [3][8]
- Additional environment variables enable tracing to LangSmith, specify the trace destination, and define the project name [4]

Monitoring & Debugging
- LangSmith traces provide visibility into the workflow, including requests to OpenAI, model usage, latency, and token consumption [6]
- LangSmith offers a monitoring view to track app usage, latency spikes, error rates, and LLM usage/spending [7]
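The "additional environment variables" mentioned above typically look like the following. The names shown (`LANGSMITH_TRACING`, `LANGSMITH_ENDPOINT`, `LANGSMITH_API_KEY`, `LANGSMITH_PROJECT`) are the standard LangSmith variables, but which of them a given n8n deployment reads is an assumption; check both products' documentation.

```python
# Sketch of the LangSmith-related environment for an n8n deployment.
# Standard LangSmith variable names; the project name is a made-up example.
import os

os.environ["LANGSMITH_TRACING"] = "true"                              # enable tracing
os.environ["LANGSMITH_ENDPOINT"] = "https://api.smith.langchain.com"  # trace destination
os.environ["LANGSMITH_API_KEY"] = "<your-langsmith-api-key>"          # placeholder
os.environ["LANGSMITH_PROJECT"] = "n8n-workflows"                     # assumed project name

print(os.environ["LANGSMITH_PROJECT"])
```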
X @Avi Chawla
Avi Chawla· 2025-06-30 19:06
LLM Application Evaluation
- DeepEval enables component-level evaluation and tracing of LLM applications, addressing the need to identify issues within retrievers, tool calls, or the LLM itself [1]
- The `@observe` decorator allows tracing of individual LLM components like tools, retrievers, and generators [2]
- Metrics can be attached to each component for detailed analysis [2]
- DeepEval provides a visual breakdown of component performance [2]

Open Source and Data Control
- DeepEval is 100% open source with over 8.5 thousand stars [2]
- Users can self-host DeepEval to maintain control over their data [2]

Ease of Use
- Implementing DeepEval requires only 3 lines of code [1]
- No refactoring of existing code is needed [1]
X @Avi Chawla
Avi Chawla· 2025-06-30 06:33
Core Functionality
- DeepEval provides open-source tracing for LLM applications using a Python decorator, `@observe` [1]
- It enables component-level evaluations of LLM apps, addressing issues within retrievers, tool calls, or the LLM itself [1]
- Different metrics can be attached to each component of the LLM application [1]
- DeepEval offers a visual breakdown of each component's performance [1]

Open Source and Hosting
- DeepEval is 100% open source with over 8,500 stars [2]
- The solution can be self-hosted, ensuring data privacy [2]
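The decorator pattern behind component-level tracing can be illustrated with a minimal stand-in. This is a toy sketch of the idea, not DeepEval's actual `@observe` implementation: the `TRACE` list, the `component` label, and the example functions are all assumptions made for the demo.

```python
# Toy component tracer in the spirit of an @observe decorator.
# Not the DeepEval API -- a self-contained illustration of the pattern.
import functools
import time

TRACE: list[dict] = []  # collected spans; a real tracer would ship these to a backend

def observe(component: str):
    """Record a component's inputs, output, and latency on every call."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            TRACE.append({
                "component": component,          # e.g. retriever / tool / llm
                "name": fn.__name__,
                "inputs": {"args": args, "kwargs": kwargs},
                "output": result,
                "latency_s": time.perf_counter() - start,
            })
            return result
        return wrapper
    return decorator

@observe(component="retriever")
def retrieve(query: str) -> list[str]:
    return [f"doc about {query}"]

@observe(component="llm")
def generate(query: str, docs: list[str]) -> str:
    return f"Answer to '{query}' using {len(docs)} doc(s)"

docs = retrieve("tracing")
answer = generate("tracing", docs)
print([span["component"] for span in TRACE])  # ['retriever', 'llm']
```

The appeal of this pattern, and why "no refactoring" is plausible, is that tracing attaches at the function boundary: existing retrievers and generators keep their signatures, and per-component metrics can later be keyed off the `component` label.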
Getting Started with LangSmith (1/7): Tracing
LangChain· 2025-06-25 00:47
LangSmith Platform Overview
- LangSmith is an observability and evaluation platform for AI applications, focused on tracing application behavior [1]
- The platform uses tracing projects to collect logs associated with applications, with each project corresponding to an application [2]
- LangSmith is framework agnostic, designed to monitor AI applications regardless of the underlying build [5]

Tracing and Monitoring AI Applications
- Tracing is enabled by setting environment variables, including the LangSmith tracing flag, the LangSmith endpoint, and the API key [6]
- The `traceable` decorator is added to functions to enable tracing within the application [8]
- LangSmith provides a detailed breakdown of each step within the application, known as the run tree, showing inputs, outputs, and telemetry [12][14]
- Telemetry includes token cost and latency for each step, visualized through a waterfall view to identify latency sources [14][15]

Integration with LangChain and LangGraph
- LangChain and LangGraph, LangChain's open-source libraries, work with LangSmith out of the box, simplifying tracing setup [17]
- When using LangGraph or LangChain, the `traceable` decorator is not required, streamlining the tracing process [17]
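The run tree described above can be illustrated with a toy tracer that records nested steps and their latency, which is exactly the data a waterfall view is drawn from. This is a pedagogical stand-in, not the langsmith SDK's `@traceable`; the `run` context manager and `RUN_TREE` list are assumptions made for the sketch.

```python
# Toy run-tree recorder: captures name, inputs, latency, and nesting depth
# for each step. Not the langsmith SDK -- an illustration of the concept.
import time
from contextlib import contextmanager

RUN_TREE: list[dict] = []  # flat list of runs; "depth" encodes nesting
_depth = 0

@contextmanager
def run(name: str, inputs: dict):
    global _depth
    node = {"name": name, "inputs": inputs, "depth": _depth}
    RUN_TREE.append(node)
    _depth += 1
    start = time.perf_counter()
    try:
        yield node
    finally:
        node["latency_s"] = time.perf_counter() - start
        _depth -= 1

def answer_question(question: str) -> str:
    with run("answer_question", {"question": question}) as root:
        with run("retrieve", {"query": question}):
            docs = [f"doc about {question}"]
        with run("generate", {"docs": len(docs)}):
            result = f"Answer: {question}"
        root["output"] = result
        return result

answer_question("what is tracing?")
for node in RUN_TREE:  # indentation mirrors the run tree / waterfall nesting
    print("  " * node["depth"] + f"{node['name']} ({node['latency_s']:.6f}s)")
```

Because each child's latency is contained within its parent's, sorting the children of any node by `latency_s` immediately points at the slowest step, which is the core diagnostic the waterfall view provides.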