AI Engineer
What does Enterprise Ready MCP mean? — Tobin South, WorkOS
AI Engineer· 2025-06-27 09:31
MCP and AI Agent Development
- MCP is presented as a way of interfacing between AI and external resources, enabling functionalities like database access and complex computations [3]
- The industry is currently focused on building internal demos and connecting them to APIs, but needs to move towards robust authentication and authorization [9][10]
- Existing tooling needs to be adapted for MCP's dynamic client registration, which can flood developer dashboards [12]

Enterprise Readiness and Security
- Scaling MCP servers requires addressing free-credit abuse, bot blocking, and robust access controls [12]
- Selling MCP solutions to enterprises necessitates SSO, lifecycle management, provisioning, fine-grained access controls, audit logs, and data loss prevention [12]
- Regulations like GDPR impose specific logging requirements for AI workloads, which are not widely supported [12]

Challenges and Future Development
- Passing scope and access control between different AI workloads remains a significant challenge [13]
- The MCP spec is actively developing, with features like elicitation (the AI asking humans for input) still unstable [13]
- Cloud vendors are solving cloud hosting, but authorization and access control are the hardest parts of enterprise deployment [13]
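The scope-passing challenge above is often framed as scope attenuation: a downstream workload should receive at most the intersection of the scopes its parent holds and the scopes the task needs. A minimal sketch of that rule (all names hypothetical; this is not a WorkOS or MCP API):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentToken:
    """Hypothetical credential carrying the scopes granted to an AI workload."""
    subject: str
    scopes: frozenset

def delegate(token: AgentToken, requested: set) -> AgentToken:
    """Down-scope a token before handing it to a downstream workload.
    A delegated token can never carry scopes the parent lacks; anything
    not explicitly held is silently dropped."""
    return AgentToken(token.subject, token.scopes & frozenset(requested))

def authorize(token: AgentToken, required: str) -> bool:
    """Check a single scope at the point of use."""
    return required in token.scopes

# A planner agent holds broad scopes; the sub-agent it spawns gets only
# what its task needs -- the escalation to billing:admin is dropped.
planner = AgentToken("user:42", frozenset({"crm:read", "crm:write", "billing:read"}))
sub_agent = delegate(planner, {"crm:read", "billing:admin"})

assert authorize(sub_agent, "crm:read")
assert not authorize(sub_agent, "crm:write")
assert not authorize(sub_agent, "billing:admin")
```

Real deployments would express this with signed tokens and a policy engine rather than in-process sets, but the attenuation invariant is the same.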
CI in the Era of AI: From Unit Tests to Stochastic Evals — Nathan Sobo, Zed
AI Engineer· 2025-06-27 09:22
Software engineers have long understood that high-quality code requires comprehensive automated testing. For decades, our industry has relied on deterministic tests with clear pass/fail outcomes to ensure reliability. That's certainly true at Zed, where we're building a next-generation native IDE in Rust. Zed runs at 120 frames per second, but it would also crash once a second if we didn't maintain and run a comprehensive suite of unit tests on every chang ...
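The shift the title describes is from a single pass/fail bit to a pass *rate* over repeated trials, since the model under test is stochastic. A minimal sketch of that harness shape (the flaky check is a stand-in for a model call, not Zed's actual eval framework):

```python
import random

def run_eval(task_fn, n_trials: int, pass_threshold: float) -> bool:
    """Run a nondeterministic check n_trials times and gate CI on the
    observed pass rate rather than on any single outcome."""
    passes = sum(1 for _ in range(n_trials) if task_fn())
    return passes / n_trials >= pass_threshold

# Stand-in for a model-backed assertion: succeeds roughly 90% of the time.
rng = random.Random(0)
flaky_check = lambda: rng.random() < 0.9

# A deterministic unit test would flake here; a rate-based eval passes reliably.
assert run_eval(flaky_check, n_trials=200, pass_threshold=0.7)
```

Choosing `n_trials` and `pass_threshold` is a cost/statistical-power trade-off: more trials give a tighter estimate of the true pass rate but cost more model calls per CI run.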
Building AI Agents that actually automate Knowledge Work - Jerry Liu, LlamaIndex
AI Engineer· 2025-06-24 00:16
Agents are all the rage in 2025, and every B2B SaaS startup and incumbent promises AI agents that can "automate work" in some way. But how do you actually build this? The answer is twofold: 1. really, really good tools, and 2. carefully tailored agent reasoning over those tools, with UXs that range from assistant to automation. The main goal of this talk is to give a practical overview of agent architectures that can automate real-world work, with a focus on document-centric tasks. Learn the core building blocks o ...
Large Scale AI on Apple Silicon (as mentioned by @AndrejKarpathy ) — Alex Cheema, EXO Labs
AI Engineer· 2025-06-20 22:52
Scientific Rigor and Progress
- Scientific progress is not always linear, and inertia within the scientific community can hinder the adoption of new ideas, even with sound methodology [8][18]
- Questioning assumptions is crucial for scientific advancement, as illustrated by historical examples in physics and rat experiments [9][16]
- Oversimplification of science is a common pitfall, and publishing both successful and unsuccessful results is important for transparency and learning [18][35]

AI Development and Hardware
- The "hardware lottery" suggests that the best research ideas in AI don't always win, due to factors including hardware limitations and existing paradigms [22]
- Large language models (LLMs) can create inertia by reinforcing existing practices, such as the dominance of Python, making it harder for new programming languages to gain adoption [23][24]
- GPUs addressed the von Neumann bottleneck of CPUs by changing the ratio of bytes loaded to floating-point operations executed, enabling significant performance improvements in AI [21]

Exo's Solution and Research
- Exo is developing an orchestration layer for AI that runs on different hardware targets, aiming to provide a reliable system for managing distributed devices in ad hoc mesh networks [25]
- Exo models everything as a causally consistent set of events, creating a causal graph to reason about the system and ensure data consistency across distributed systems [26][27]
- Exo's technology enables efficient utilization of diverse hardware, such as combining Nvidia Spark (high compute) and Studio (high memory bandwidth) for LLM generation [28][29]
- Exo is researching new optimizers that are more efficient per FLOP than Adam but require more memory, leveraging the memory-to-FLOPs ratio of Apple silicon [31][32][33]
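The bytes-to-FLOPs ratio argument above is the roofline model: a kernel's achievable throughput is the lesser of peak compute and what memory bandwidth can feed at the kernel's arithmetic intensity. A rough sketch with purely illustrative numbers (not measured figures for any real device):

```python
def attainable_flops(peak_flops: float, mem_bw_bytes_per_s: float,
                     intensity_flops_per_byte: float) -> float:
    """Roofline bound: min(compute roof, bandwidth * arithmetic intensity)."""
    return min(peak_flops, mem_bw_bytes_per_s * intensity_flops_per_byte)

# Two hypothetical machine profiles: one compute-heavy, one bandwidth-heavy.
compute_box = {"peak_flops": 1000e12, "mem_bw": 273e9}   # illustrative
memory_box  = {"peak_flops":   30e12, "mem_bw": 800e9}   # illustrative

# Batch-1 LLM decoding reloads every weight per token, so intensity is low
# (~1 FLOP/byte here); batched prefill reuses weights heavily, so it is high.
decode_ai, prefill_ai = 1.0, 5000.0

for name, m in [("compute-heavy", compute_box), ("memory-heavy", memory_box)]:
    for phase, ai in [("decode", decode_ai), ("prefill", prefill_ai)]:
        tflops = attainable_flops(m["peak_flops"], m["mem_bw"], ai) / 1e12
        print(f"{name:13s} {phase:7s} {tflops:8.3f} TFLOP/s")
```

With these toy numbers, the bandwidth-heavy machine wins memory-bound decoding while the compute-heavy machine wins prefill, which is the logic behind pairing a high-compute box with a high-bandwidth one for LLM generation.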
Building Agents with Amazon Nova Act and MCP - Du'An Lightfoot, Amazon (Full Workshop)
AI Engineer· 2025-06-19 02:04
Workshop Overview
- The workshop focuses on building AI agents using Amazon's agent technologies [1]
- Participants will gain hands-on experience building sophisticated AI agents [1]
- The workshop is two hours long [1]

Technologies Highlighted
- Amazon Nova Act is used for reliable web navigation [1]
- Model Context Protocol (MCP) connects agents to external data sources and APIs [1]
- Amazon Bedrock Agents orchestrate complex workflows [1]

Skills Acquired
- Participants will learn to build agents that can navigate the web like humans [1]
- Participants will learn to perform complex multi-step tasks [1]
- Participants will learn to leverage specialized tools through natural-language commands [1]
Model Context Protocol: Origins and Requests For Startups — Theodora Chu, MCP PM, Anthropic
AI Engineer· 2025-06-18 22:55
MCP Origins and Goals
- MCP was created to address the challenge of constantly copying and pasting context into LLMs, aiming to give models the ability to interact with the outside world [4][5][6]
- The goal is to establish an open-source, standardized protocol for model agency, enabling broader participation in the ecosystem [7][8]
- Anthropic believes that enabling model agency is crucial for LLMs to reach the next level of usefulness and intelligence [8]

MCP Development and Adoption
- MCP was initially developed internally and gained traction during a company hack week [9][10]
- Early feedback questioned the need for a new protocol and its open-source nature, given existing tool-calling capabilities [12][13]
- Adoption by coding tools like Cursor marked a turning point, followed by broader adoption from Google, Microsoft, and OpenAI [14]

Protocol Principles and Updates
- The protocol prioritizes server simplicity, even if it increases client complexity, based on the belief that there will be more servers than clients [20][21]
- Recent updates include support for streamable HTTP to enable more bidirectionality in agent communication [19]
- Future development focuses on enhancing the agent experience, including elicitation to allow servers to request more information from end users [26][27]
- Plans include a registry API to facilitate models finding MCPs independently, further supporting model agency [28]

Ecosystem Opportunities
- The industry needs more high-quality servers across various verticals beyond dev tools, such as sales, finance, legal, and education [31][34]
- There is a significant opportunity in simplifying server building through tooling for hosting, testing, evaluation, and deployment [36]
- Automated MCP server generation is a potential future direction, leveraging increasing model intelligence [37]
- Tooling around AI security, observability, and auditing is crucial as applications gain more access to external data [38]
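Elicitation, mentioned above, is a server-to-client round trip: the server asks the client to collect structured input from the end user. A sketch of the message shapes as plain JSON-RPC dicts; the field names follow the MCP spec as published in mid-2025, but treat the exact shapes as an approximation and check the current spec revision before relying on them:

```python
import json

# Server -> client: request a missing piece of context from the end user.
elicitation_request = {
    "jsonrpc": "2.0",
    "id": 7,
    "method": "elicitation/create",
    "params": {
        "message": "Which GitHub org should I search?",
        "requestedSchema": {            # flat object of primitives only
            "type": "object",
            "properties": {"org": {"type": "string"}},
            "required": ["org"],
        },
    },
}

# Client -> server: the user accepted the request and supplied a value.
elicitation_response = {
    "jsonrpc": "2.0",
    "id": 7,
    "result": {"action": "accept", "content": {"org": "modelcontextprotocol"}},
}

def extract_answer(response: dict):
    """Use the supplied content only when the user explicitly accepted;
    'decline' and 'cancel' actions yield nothing."""
    result = response.get("result", {})
    return result.get("content") if result.get("action") == "accept" else None

assert extract_answer(elicitation_response) == {"org": "modelcontextprotocol"}
print(json.dumps(elicitation_request, indent=2))
```

The accept/decline/cancel distinction matters for server simplicity: the server never has to guess whether an empty answer means refusal or an aborted dialog.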
Building Protected MCP Servers — Den Delimarsky and Julia Kasper, Microsoft DevDiv
AI Engineer· 2025-06-18 18:50
Join us to see how VS Code and GitHub Copilot's expanding suite of AI features can match or even surpass the benefits of other popular AI developer tools. We'll focus on practical scenarios to ensure immediate applicability and work through live demos of Copilot features such as: Code generation using Edits, Planning/problem solving using Chat, Inline terminal command generation, Boilerplate code generation using Agent mode, Improving boilerplate with custom instructions and then refactoring using Agent m ...
The Geopolitics of AI Infrastructure - Dylan Patel, SemiAnalysis
AI Engineer· 2025-06-18 00:55
As AI reshapes the global balance of power, the infrastructure behind it—chips, data centers, power, and supply chains—has become a new arena for geopolitical competition. This talk explores how nations are racing to secure critical AI hardware, control compute capacity, and assert influence over the technologies and talent that define the future. About Dylan Patel Dylan is the founder, CEO, and Chief Analyst for SemiAnalysis, the preeminent authority on all things AI and semiconductors. Through Dylan’s unw ...
Case Study + Deep Dive: Telemedicine Support Agents with LangGraph/MCP - Dan Mason
AI Engineer· 2025-06-17 18:58
Industry Focus: Autonomous Agents in Healthcare
- The workshop explores building autonomous agents for managing complex processes like multi-day medical treatments [1]
- The system aims to help patients self-administer medication regimens at home [1]
- A key challenge is enabling agents to adhere to protocols while handling unexpected patient situations [1]

Technology Stack
- The solution utilizes a hybrid system of code and prompts, leveraging LLM decision-making to drive a web application, message queue, and database [1]
- The stack includes LangGraph/LangSmith, Claude, MCP, Nodejs, React, MongoDB, and Twilio [1]
- Treatment blueprints, designed in Google Docs, guide LLM-powered agents [1]

Agent Evaluation and Human Support
- The system incorporates an agent evaluation system using LLM-as-a-judge to assess interaction complexity [1]
- The evaluation system escalates complex interactions to human support when needed [1]

Key Learning Objectives
- Participants will learn how to build a hybrid system of code and prompts that leverages LLM decisioning [1]
- Participants will learn how to design and maintain flexible agentic workflow blueprints [1]
- Participants will learn how to create an agent evaluation system [1]
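The judge-plus-escalation flow described above reduces to a complexity score compared against a threshold. In this sketch the judge is a keyword heuristic standing in for an actual LLM-as-a-judge call, and every name and the threshold value are hypothetical, not the workshop's implementation:

```python
from dataclasses import dataclass

@dataclass
class Judgment:
    complexity: float   # 0.0 (routine) .. 1.0 (needs a human)
    rationale: str

def judge_interaction(transcript: str) -> Judgment:
    """Stand-in judge: a real system would prompt a model with a rubric
    and parse its score. Here a red-flag phrase count approximates it."""
    red_flags = ("side effect", "missed dose", "pain", "confused")
    hits = sum(flag in transcript.lower() for flag in red_flags)
    return Judgment(min(1.0, hits / 2), f"{hits} red-flag phrase(s)")

ESCALATION_THRESHOLD = 0.5   # hypothetical cutoff

def route(transcript: str) -> str:
    """Send complex interactions to human support; keep the rest automated."""
    verdict = judge_interaction(transcript)
    return "human_support" if verdict.complexity >= ESCALATION_THRESHOLD else "agent"

assert route("Reminder sent; patient confirmed the 8am dose.") == "agent"
assert route("Patient reports pain and a missed dose.") == "human_support"
```

Keeping the judge separate from the agent that produced the transcript is the important design choice: the evaluator can be changed or tightened without touching the treatment workflow itself.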
The Web Browser Is All You Need - Paul Klein IV
AI Engineer· 2025-06-17 18:47
Company Overview
- Browserbase provides infrastructure connecting large language models and the web, enabling end-to-end workflow automation [1]
- Browserbase views itself as the "last-mile" interface between large language models and the web [1]

Funding & Investment
- Browserbase raised $27.5 million in its first 12 months [1]
- The funding includes a $6.5 million seed round and a $21 million Series A [1]
- CRV, Kleiner Perkins, and Okta Ventures led the Series A funding [1]

Technology & Innovation
- The web browser may become the default MCP server for the internet, enabling production AI agents [1]
- Browserbase offers fast, reliable, multi-region headless-browser infrastructure for developers and AI agents [1]