Orchestrating Agents at Scale

Agent Kit Overview
- Introduces Agent Kit as a complete set of building blocks for building, deploying, and optimizing agentic workflows [1]
- Agent Kit includes an agent builder for visually designing workflows, ChatKit for pre-built UI components, and updated evaluation and tracing patterns for monitoring and optimization [2][3][4]

Agent Builder Features
- Allows users to drag and drop nodes onto a canvas to visually design workflows [2]
- Enables exporting workflows as code (JavaScript or Python) using the OpenAI Agents SDK for self-hosting [2][28]
- Includes a node picker with LLMs, agent tools (like file search), and logical nodes (like while loops and if/else statements) [14]

ChatKit Functionality
- Provides pre-built UI components like chat interfaces, streaming, and customizable widgets for agentic workflows [3]
- Simplifies UI development for complex dynamic workflows [6]
- Supports features like streaming tokens, reasoning summaries, and widgets, even when self-hosting [34]

Evaluation and Optimization
- Offers tracing to monitor agent behavior, including agent order, duration, model details, and token usage [4][38]
- Includes graders to evaluate workflow correctness and readability, with the ability to grade all workflows for a bird's eye view [39][42]
- Provides a visual eval builder for optimizing individual agents, including editing prompts, adding tools, or changing the model [44][48]
- Allows for automatic prompt optimization using ground truth data and grader results [51][52]

Deployment and Self-Hosting
- Enables deployment to OpenAI's cloud without managing servers [26][54]
- Supports self-hosting by implementing the chat protocol and exporting workflows as code [26][27]
- Allows switching to local tools and backends for accessing data within private clouds [30][31]

Use Case: Semi-Truck Maintenance
- Demonstrates a use case for a semi-truck manufacturer handling maintenance inquiries with an agentic workflow [4]
- The workflow retrieves relevant maintenance procedures, identifies necessary parts, and provides instructions to maintenance engineers [9][10]
- Shows how to tweak workflows, add widgets, and change prompts for improved output and user experience [13][18]
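
The tracing and grading ideas summarized under Evaluation and Optimization can be illustrated with a small, self-contained Python sketch. It is not the OpenAI tracing or grader API. The `Span` and `Trace` record shapes, the `summarize` and `pass_rate` helpers, the toy `mentions_part` grader, and all sample data (agent names, model name, part number) are hypothetical and chosen only for illustration. The sketch records agent order, duration, model, and token usage per run, then applies one grader across all runs to get a bird's eye pass rate.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Span:
    """One agent step in a workflow run (hypothetical record shape)."""
    agent: str
    model: str
    duration_ms: int
    tokens: int


@dataclass
class Trace:
    """All spans for one workflow run, plus the final answer text."""
    spans: List[Span]
    output: str


def summarize(trace: Trace) -> dict:
    """Collect agent order, total duration, and total token usage for one run."""
    return {
        "agent_order": [s.agent for s in trace.spans],
        "total_ms": sum(s.duration_ms for s in trace.spans),
        "total_tokens": sum(s.tokens for s in trace.spans),
    }


def pass_rate(traces: List[Trace], grader: Callable[[Trace], bool]) -> float:
    """Apply one grader to every run and return the fraction that pass."""
    if not traces:
        return 0.0
    return sum(1 for t in traces if grader(t)) / len(traces)


def mentions_part(trace: Trace) -> bool:
    """Toy correctness grader: the answer must name at least one part."""
    return "part" in trace.output.lower()


# Made-up sample runs for a maintenance-inquiry workflow (illustration only).
runs = [
    Trace(
        [Span("retriever", "model-a", 120, 300), Span("writer", "model-a", 480, 900)],
        "Replace the filter (part BF-100) and torque to spec.",
    ),
    Trace(
        [Span("retriever", "model-a", 110, 280), Span("writer", "model-a", 450, 700)],
        "Check the manual.",
    ),
]

print(summarize(runs[0]))
print(pass_rate(runs, mentions_part))
```

A real grader would typically be model-based or compare against ground truth data rather than match a keyword. The keyword check here only keeps the example runnable without external services.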