Debugging
How to Debug, Evaluate, and Ship Reliable AI Agents with LangSmith
LangChain· 2026-03-12 02:12
Hi everyone. Uh, just going to give it a few minutes to let everyone kind of filter in and we'll get started. All right. Um, so we'll go ahead and get started. Um, today we're going to be talking primarily about how to, um, essentially debug, evaluate, and ship reliable AI agents. Um, and we're going to talk about it in the context of LangSmith. Um, so a lot of what we see today, um, is built into our LangSmith product in terms of the way to ship reliable and production agents. So I'm excited to jump into all of tho ...
Observability and Evals for AI Agents: A Simple Breakdown
LangChain· 2026-02-17 16:30
Two of the most crucial things when building production agents are setting up proper observability and setting up proper evaluation. And these are actually tied and coupled, and this is different than in software engineering, and the role observability and evaluation play when building agents is different than in software engineering as well. So I wanna talk a little bit about how we view observability and how it powers a lot of agent evaluation. Maybe starting briefly highlighting some of the things that we t ...
Agent Observability Explained
LangChain· 2026-02-04 18:56
Your agent worked perfectly yesterday. Today it's crashed. You pull up the logs and you scan the code base. Maybe it's hallucination, maybe the context window overflowed. The real problem is that you can't tell. That's because you're still debugging agents like traditional software, and that mindset's already obsolete. In order to understand, debug and iterate on your agents, you need something new. You need tracing. See, with traditional software, we can design and predict behavior. If I process a refund, I e ...
How to stop fearing mistakes and start celebrating them | Hannah Sabo | TEDxBodø
TEDx Talks· 2026-01-29 17:25
I thought that I was bad at physics. I was actually terrified of making mistakes. I had never taken a physics course before college and I was bad at it. Or at least I thought I was. I made a lot of mistakes. I felt like my predictions never were right. Like maybe I would have better odds if I just guessed. And over the course of the year, I lost my confidence. I became scared to share my ideas with my classmates, my teaching assistants, the instructor, and even myself. Despite this rocky start, I actually got ...
The agent development loop with LangSmith + Claude Code / Deepagents
LangChain· 2025-12-17 17:53
Hey, this is Lance. Recently put out this blog post called debugging deep agents with LangSmith. And the big idea here was connecting LangSmith as a system of record for your traces with code agents like deep agents, but it could be other code agents like Claude Code, to create kind of an iterative feedback loop. So you're having a code agent produce some LangGraph code that's being run. Traces are going to LangSmith. And there's a way for the code agents to pull traces back, reflect on them, and update your lane share langra ...
How to debug voice agents with LangSmith
LangChain· 2025-12-09 21:39
Voice is one of the most natural ways to interact with AI. And as the models are getting better, I'm excited about new use cases and interaction patterns that it's going to unlock, especially in industries like education and customer service. It's surprisingly easy to get started building a voice agent. And so let's go through that in this video. I'm Tannushri and I'm going to show you how to build a voice agent, specifically a French tutor with this framework called Pipecat. Going to walk through how it wor ...
X @Avi Chawla
Avi Chawla· 2025-12-05 06:31
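Core Problem & Solution
- AI code generation has sped up, but the engineering bottleneck has shifted to code review; developers spend 90% of their debugging time on AI-generated code [1]
- AI code review has blind spots and shares the same fundamental flaws as AI code generators [1]
- SonarQube MCP Server provides enterprise-grade code analysis, with instant feedback on vulnerabilities, code smells, and similar issues [1]
SonarQube Capabilities
- SonarQube processes more than 750 billion lines of code per day and has accumulated extensive experience with bug patterns [2]
- SonarQube detects security vulnerabilities (SQL injection, XSS, hardcoded secrets, etc.) [4]
- SonarQube identifies code smells and technical debt [4]
- SonarQube finds gaps in test coverage [4]
- SonarQube assesses maintainability issues [4]
AI Reviewer Limitations
- AI reviewers pattern-match rather than verify [3]
- AI reviewers validate syntax, not system behavior [3]
- AI reviewers review the code, not its consequences [3]
Setup
- Install the SonarQube MCP Server [4]
- Add it to the AI assistant's configuration [4]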
X @Avi Chawla
Avi Chawla· 2025-11-26 19:28
RT Avi Chawla (@_avichawla) You're in a tech lead interview at Google. The interviewer asks: "AI generates 30% of our code now. But our engineering velocity has only increased by 10%. How would you fill this gap?" You: "Using AI code reviewers will solve this." Interview over! Here's what you missed: Many engineers think the solution to AI bugs is more AI. Their mental model is simple: "If AI can write it, AI can review it." But if AI could catch these issues, why didn't it write correct code in the first place? There' ...
X @Avi Chawla
Avi Chawla· 2025-11-26 06:31
You're in a tech lead interview at Google. The interviewer asks: "AI generates 30% of our code now. But our engineering velocity has only increased by 10%. How would you fill this gap?" You: "Using AI code reviewers will solve this." Interview over! Here's what you missed: Many engineers think the solution to AI bugs is more AI. Their mental model is simple: "If AI can write it, AI can review it." But if AI could catch these issues, why didn't it write correct code in the first place? There's enough evidence to sugges ...
Vibe Debugging Explained
Greylock· 2025-09-30 19:53
What does vibe debugging look like in my mind? To perform these kinds of tasks, like help me understand, uh, what commit has landed in production or is this feature flag enabled, right? An engineer needs understanding of code but also understanding of production, and production is composed of all of these different tools that each has a silo of data, but the tools don't really talk to each other, right? And so it falls upon a human to bring their tribal knowledge and also, you know, knowledge of how to operate ...