Building Applications with AI Agents — Michael Albada, Microsoft

Agentic Development Landscape - The adoption of agentic technology is rapidly increasing, with a 254% increase in companies self-identifying as agentic in the last three years based on Y Combinator data [5] - Agentic systems are complex, and while initial prototypes may achieve around 70% accuracy, reaching perfection is difficult due to the long tail of complex scenarios [6][7] - The industry defines an agent as an entity that can reason, act, communicate, and adapt to solve tasks, viewing the foundation model as a base for adding components to enhance performance [8] - The industry emphasizes that agency should not be the ultimate goal but a tool to solve problems, ensuring that increased agency maintains a high level of effectiveness [9][11][12] Tool Use and Orchestration - Exposing tools and functionalities to language models enables agents to invoke functions via APIs, but requires careful consideration of which functionalities to expose [14] - The industry advises against a one-to-one mapping between APIs and tools, recommending grouping tools logically to reduce semantic collision and improve accuracy [17][18] - Simple workflow patterns, such as single chains, are recommended for orchestration to improve measurability, reduce costs, and enhance reliability [19][20] - For complex scenarios, the industry suggests considering a move to more agentic patterns and potentially fine-tuning the model [22][23] Multi-Agent Systems and Evaluation - Multi-agent systems can help scale the number of tools by breaking them into semantically similar groups and routing tasks to appropriate agents [24][25] - The industry recommends investing more in evaluation to address the numerous hyperparameters involved in building agentic systems [27][28] - AI architects and engineers should take ownership of defining the inputs and outputs of agents to accelerate team progress [29][30] - Tools like Intel Agent, Microsoft's Pirate, and Label Studio can aid in generating synthetic inputs, red teaming agents, and building evaluation sets [33][34][35] Observability and Common Pitfalls - The industry emphasizes the importance of observability using tools like OpenTelemetry to understand failure modes and improve systems [38] - Common pitfalls include insufficient evaluation, inadequate tool descriptions, semantic overlap between tools, and excessive complexity [39][40] - The industry stresses the importance of designing for safety at every layer of agentic systems, including building tripwires and detectors [41][42]