AI Engineer
12-Factor Agents: Patterns of reliable LLM applications — Dex Horthy, HumanLayer
AI Engineer· 2025-07-03 20:50
Core Principles of Agent Building
- The talk calls for rethinking agent development from first principles, applying established software engineering practices to build reliable agents [11]
- It stresses owning the control flow in agent design, which allows flexibility in managing execution state and business state [24][25]
- It suggests that agents should be stateless, with state managed externally for greater flexibility and control [47][49]

Key Factors for Reliable Agents
- The ability of LLMs to convert natural language into JSON is treated as the fundamental capability for building effective agents [13]
- Direct tool use by agents can be harmful; the talk advocates a more structured approach in which the model emits JSON and deterministic code acts on it [14][16]
- Teams need to own and optimize their prompts and context windows to ensure the quality and reliability of agent outputs [30][33]

Practical Applications and Considerations
- Use small, focused "micro agents" within deterministic workflows to improve manageability and reliability [40]
- Integrate agents with the communication channels people already use (email, Slack, Discord, SMS) to meet users where they are [39]
- Focus on the "hard AI parts" of agent development, such as prompt engineering and flow optimization, rather than relying on frameworks to abstract away complexity [52]
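The points about JSON output, owned control flow, and stateless agents can be sketched together as a single "thread in, thread out" step function. This is a minimal illustration, not code from the talk: `Thread`, `fake_llm`, `issue_refund`, and `HANDLERS` are hypothetical names, and `fake_llm` stands in for a real model call.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Thread:
    """All agent state lives here, outside the agent, so any store can hold it."""
    events: tuple = ()

    def append(self, event: dict) -> "Thread":
        return Thread(self.events + (event,))


def fake_llm(thread: Thread) -> str:
    """Hypothetical stand-in for a model call: returns the next step as JSON."""
    if any(e["type"] == "tool_result" for e in thread.events):
        return json.dumps({"intent": "done", "message": "Refund issued."})
    return json.dumps({"intent": "issue_refund", "order_id": "A-1", "amount": 20})


def issue_refund(order_id: str, amount: int) -> dict:
    """Deterministic business code; the model never calls this directly."""
    return {"order_id": order_id, "refunded": amount}


HANDLERS = {"issue_refund": issue_refund}


def step(thread: Thread, llm=fake_llm) -> Thread:
    """One stateless step: parse the model's JSON, then dispatch in plain code."""
    call = json.loads(llm(thread))
    intent = call.pop("intent")
    thread = thread.append({"type": "llm_call", "intent": intent, "args": call})
    if intent == "done":
        return thread
    result = HANDLERS[intent](**call)
    return thread.append({"type": "tool_result", "data": result})


def run(thread: Thread, max_steps: int = 5) -> Thread:
    """The loop is ours: we decide when to stop, pause, or ask a human."""
    for _ in range(max_steps):
        thread = step(thread)
        last = thread.events[-1]
        if last["type"] == "llm_call" and last["intent"] == "done":
            break
    return thread
```

Because `step` takes the whole thread and returns a new one, the thread can be serialized between steps and resumed elsewhere, which is one way to read the "stateless agent" recommendation.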
MCP Is Not Good Yet — David Cramer, Sentry
AI Engineer· 2025-07-03 16:00
MCP Overview & Architecture
- MCP (Model Context Protocol) is described as a pluggable architecture for agents, framed here in the context of an enterprise cloud service [5][6]
- Sentry's MCP server was initially built as a fun project and is biased towards Sentry's application monitoring services [4][5]
- The talk presents MCP as a potential way to integrate services into various agents, enabling bug fixes and workflow enhancements within editors [7][8][25]

Implementation & Challenges
- Implementing MCP involves complexities around OAuth 2.1, requiring solutions such as a Cloudflare shim that proxies an OAuth 2 API [16][17]
- A key challenge is that MCP cannot simply sit on top of OpenAPI; systems need to be designed around how agents and models react to the context they are given [19][20][21]
- Client support for native authentication is still evolving, with some clients such as Cursor experiencing breakage [22]

Security & Best Practices
- Security is a major concern, particularly with the standard I/O (stdio) interface, and random MCP tools should not be allowed within organizations [27]
- For B2B SaaS companies, focusing on OAuth with remote environments is crucial for integrating services into agents [25]
- Companies should avoid simply proxying an OpenAPI spec and exposing it as tools, as this yields poor results; intentional design and context provision are necessary [30]

Agent-Centric Approach
- The industry should focus on building agents, viewing MCP as a plug-in architecture that leverages the value of LLMs [39][40]
- Exposing agents through the MCP architecture, particularly in B2B settings, is seen as a significant value unlock [42]
- Optimizing for context in workflows and understanding the data is crucial when designing agents, with a focus on providing structured information such as Markdown to language models [31][50]
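To make the "design for the model, return Markdown" advice concrete, here is a small sketch of a tool result formatter. It is an illustration under assumptions: the `issue` dictionary and its field names are hypothetical and are not Sentry's API or the MCP server's actual output.

```python
def issue_to_markdown(issue: dict) -> str:
    """Render a (hypothetical) issue record as Markdown for a model to read.

    Instead of passing a raw API payload through as a tool result, pick the
    fields the model needs and lay them out with headings and lists.
    """
    lines = [
        f"# {issue['title']}",
        "",
        f"- **ID**: {issue['id']}",
        f"- **Status**: {issue['status']}",
        f"- **Events**: {issue['count']}",
        "",
        "## Stack trace",
    ]
    # Number the frames so the model can refer to them unambiguously.
    lines += [f"{i}. `{frame}`" for i, frame in enumerate(issue["frames"], start=1)]
    return "\n".join(lines)


example_issue = {
    "id": "PROJ-123",  # hypothetical identifier
    "title": "TypeError in checkout",
    "status": "unresolved",
    "count": 42,
    "frames": ["app/views.py:42 in checkout", "app/cart.py:17 in total"],
}
```

Calling `issue_to_markdown(example_issue)` produces a short document with a title, a field list, and a numbered stack trace, which is the kind of deliberately shaped context the summary contrasts with proxying an API one-to-one.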
The New Lean Startup — Sid Bendre, Oleve
AI Engineer· 2025-07-01 16:57
Company Overview & Vision
- Oleve is building consumer software products that aim to improve users' lives [3]
- The company's vision is to create a portfolio of "one person billion-dollar companies" [34]
- Oleve emphasizes a lean startup approach, focusing on small teams and early profitability [1][2]

Key Achievements & Metrics
- Oleve profitably scaled a portfolio of products to $6 million in annual recurring revenue (ARR) [3]
- The company has generated over 500 million views across social media [3]
- One product, Unstuck AI, reached 1 million users in under nine weeks [8]
- Another product launch reached 10,000 users in less than 30 hours [4]

Lean Operating Principles
- Prioritizes hiring "10xer generalists" with complementary skills [10][11]
- Emphasizes a "profit-first mentality" to guide decision-making [11][12]
- Focuses on continuous process refinement and learning from failures [13]
- Leverages "super tools" by reinventing how old tools are used and consolidating workflows [14][15]
- Believes in building compounding benefits through technical playbooks and operational blueprints [14][15]

Organizational Structure
- Adopts a "harvester and cultivator" model for its engineering organization, inspired by Palantir [21][22]
- Harvesters are product engineers who own and manage their products end-to-end [22][23]
- Cultivators are AI software engineers focused on building the company's agentic operating system and automation [24]

AI Tooling & Automation
- Uses AI tooling to augment existing talent, not to compensate for shortcomings [25]
- Implements a three-stage automation strategy: human-led tooling, workflow automation, and autonomous decision-making systems [28][29][30]
- Aims to build a company where strategic insights are provided by people but operations are run by AI agents [30]
- Explores using AI agents for market research, acquisition target scoring, and growth system automation [30][31]
Intro to GraphRAG — Zach Blumenfeld
AI Engineer· 2025-06-30 22:56
So, as you come in, we have here a server set up with everything you'll need. If you want to follow along, you should have gotten a post-it note. If you didn't, just raise your hand and my colleague Alex over here will come find you and we'll provide you with one. Basically, what you're going to do is, if you have a number 160 or below, you go to this link here, the QR code on top as well. And if you have a number that's 2011 or above, you go to the second link or the QR ...
The emerging skillset of wielding coding agents — Beyang Liu, Sourcegraph / Amp
AI Engineer· 2025-06-30 22:54
AI Coding Agents: Efficacy and Usage
- Coding agents are substantively useful, though opinions vary on their best practices and applications [1]
- The number one mistake people make with coding agents is using them the same way they used AI coding tools six months ago [1]
- The evolution of frontier model capabilities drives distinct eras in generative AI, influencing application architecture [1]

Design Decisions for Agentic LLMs
- Agents should make edits to files without constant human approval [2]
- The necessity of a thick client (e.g., forked VS Code) for manipulating LLMs is questionable [2]
- The industry is moving beyond the "choose your own model" phase due to deeper coupling in agentic chains [2]
- Fixed pricing models for agents introduce perverse incentives to use dumber models [2]
- The Unix philosophy of composable tools will be more powerful than vertical integration [2]

Best Practices and User Patterns
- Power users write very long prompts to program LLMs effectively [4]
- Directing agents to relevant context and feedback mechanisms is crucial [5]
- Constructing front-end feedback loops (e.g., using Playwright and Storybook) accelerates development [6]
- Agents can be used to better understand code, serving as an onboarding tool and enhancing code reviews [9][11]
- Sub-agents are useful for longer, more complex tasks by preserving the context window [12][13]
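The sub-agent point can be illustrated with a toy sketch: the sub-agent accumulates bulky intermediate context in its own message list, and only a short result flows back to the parent. Everything here is hypothetical (`fake_model`, `run_subagent`, `main_agent`); it shows the shape of the idea, not how Amp or any specific agent implements it.

```python
def fake_model(messages: list[str]) -> str:
    """Hypothetical stand-in for a model call that condenses its input."""
    return f"summary of {len(messages)} messages"


def run_subagent(task: str, files: list[str]) -> str:
    """Work in a private message list; return only the final summary."""
    messages = [task]
    for path in files:
        # Bulky intermediate context stays local to the sub-agent.
        messages.append(f"contents of {path}")
    return fake_model(messages)


def main_agent(task: str, files: list[str]) -> list[str]:
    """The parent context grows by one short entry per delegated task."""
    context = [task]
    context.append(run_subagent(f"investigate: {task}", files))
    return context
```

With three files, the sub-agent sees four messages while the parent's context holds only two entries, which is the sense in which delegation preserves the main context window.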
Agents, Access, and the Future of Machine Identity — Nick Nisi (WorkOS) + Lizzie Siegle (Cloudflare)
AI Engineer· 2025-06-30 22:52
Agent & MCP Server Development
- Cloudflare and WorkOS are collaborating to promote the idea that agents acting on behalf of users need the same credentials and authorization as user-facing projects [1]
- The industry is moving towards more fine-grained authorization for AI agents, potentially authorizing per-line changes, per-tool changes, or even network connections [20]
- Cloudflare offers a free tier for Durable Objects, which can be used for persistent storage in agents [3]

Cloudflare's Offerings
- Cloudflare provides compute (Workers), AI model hosting and inference, a vector database (Vectorize), a SQL database, Durable Objects, video streaming, and image optimization [2]
- Cloudflare Workers have bindings that allow interaction with other Cloudflare products and other companies' products [3]
- Cloudflare's agents framework includes an OAuth framework for setting up authorization, making it easy to identify the worker or agent acting on behalf of a user [5]

MCP Server Demo & Use Case
- A basic MCP server was built using Cloudflare and WorkOS and is available for users to check out and run [6]
- The demo shows a shirt being ordered via an agent, demonstrating how agents can act on behalf of users with proper authorization [9][10][11]
- The demo uses Cloudflare's key-value storage to save order data, accessible through the interface [12]
- Durable Objects can store data directly on the context associated with a worker object, unique for each user [14][16]

Security & Authorization
- The talk emphasizes the importance of audit trails with OAuth tools to track agent interactions, including the reason for the interaction, the user on whose behalf the agent acted, and the outcome [21]
- The industry needs to consider users as deputies who have access to tools and can potentially misuse them [21]
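The audit-trail fields named above (reason, acting user, outcome) can be captured with a small wrapper around each tool call. This is a sketch under assumptions: `AuditRecord` and `record_tool_call` are hypothetical names, the log is an in-memory list, and nothing here reflects the WorkOS or Cloudflare APIs.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Callable


@dataclass(frozen=True)
class AuditRecord:
    agent_id: str        # which agent or worker made the call
    on_behalf_of: str    # the user the agent was acting for
    tool: str            # the tool that was invoked
    reason: str          # why the agent invoked it
    outcome: str         # "success" or an error description
    timestamp: str       # UTC, ISO 8601


def record_tool_call(
    log: list,
    *,
    agent_id: str,
    on_behalf_of: str,
    tool: str,
    reason: str,
    call: Callable[[], object],
) -> AuditRecord:
    """Run a tool call and append one JSON audit line whether it succeeds or fails."""
    try:
        call()
        outcome = "success"
    except Exception as exc:  # record the failure instead of losing it
        outcome = f"error: {exc}"
    record = AuditRecord(
        agent_id=agent_id,
        on_behalf_of=on_behalf_of,
        tool=tool,
        reason=reason,
        outcome=outcome,
        timestamp=datetime.now(timezone.utc).isoformat(),
    )
    log.append(json.dumps(asdict(record)))
    return record
```

In a real deployment the `on_behalf_of` value would come from the verified OAuth token rather than from the caller, and the log would go to durable storage.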
Containing Agent Chaos — Solomon Hykes, Dagger
AI Engineer· 2025-06-28 16:30
AI agents promise breakthroughs but often deliver operational chaos. Building reliable, deployable systems with unpredictable LLMs feels like wrestling fog – testing outputs alone is insufficient when the underlying workflow is opaque and flaky. How do we move beyond fragile prototypes? This talk, from the creator of Docker, argues the solution lies outside the model: engineering reproducible execution workflows built on rigorous architectural discipline. Learn how containerization, applied not just to depl ...
The Build-Operate Divide: Bridging Product Vision and AI Operational Reality
AI Engineer· 2025-06-28 02:49
Product leaders see AI possibilities. Operations teams see implementation chaos. That disconnect can kill promising AI features before they ever reach users. In this session, Chris Hernandez (Chime) and Jeremy Silva (Freeplay) share an integrated framework that bridges product strategy and operational reality. You'll learn how they transformed fragmented AI workflows into a unified approach—from prototyping and prompt testing to human review loops and model benchmarking. We’ll explore how to build evaluatio ...
Optimizing inference for voice models in production - Philip Kiely, Baseten
AI Engineer· 2025-06-28 02:39
Key Optimization Goal
- Aims to achieve Time To First Byte (TTFB) below 150 milliseconds for voice models [1]

Technology and Tools
- Leverages open-source TTS models like Orpheus, which have an LLM backbone [1]
- Employs tools and optimizations such as TensorRT-LLM and FP8 quantization [1]

Production Challenges
- Client code, network infrastructure, and other outside-the-GPU factors can introduce latency [1]
- Common pitfalls exist when integrating TTS models into production systems [1]

Scalability and Customization
- Focuses on scaling TTS models in production [1]
- Extends the system to serve customized models with voice cloning and fine-tuning [1]
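Since the target is stated as a TTFB number, it helps to measure it from the client side, where network and client-code latency are included. The sketch below times the arrival of the first audio chunk from any streaming iterator; `fake_tts_stream` is a hypothetical stand-in for a real streaming TTS response, and the chunk size and delay are arbitrary.

```python
import time
from typing import Iterable, Iterator


def fake_tts_stream(n_chunks: int = 3, delay_s: float = 0.01) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming TTS response."""
    for _ in range(n_chunks):
        time.sleep(delay_s)    # simulated inference + network time per chunk
        yield b"\x00" * 320    # placeholder audio bytes


def measure_ttfb(stream: Iterable[bytes]) -> tuple[float, int]:
    """Return (time to first chunk in ms, total bytes received)."""
    start = time.perf_counter()
    ttfb_ms = None
    total_bytes = 0
    for chunk in stream:
        if ttfb_ms is None:
            # The first chunk has arrived: this is the client-observed TTFB.
            ttfb_ms = (time.perf_counter() - start) * 1000.0
        total_bytes += len(chunk)
    if ttfb_ms is None:
        raise ValueError("stream produced no audio")
    return ttfb_ms, total_bytes
```

Comparing this client-side number with server-side inference timing is one way to see how much latency comes from outside the GPU.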
Conquering Agent Chaos — Rick Blalock, Agentuity
AI Engineer· 2025-06-28 00:15
Agent deployments can be dicey, especially at first. This session goes over all the things that cause headaches with deployments, from serverless issues to networking issues, and how we fix them. About Rick Blalock: Seasoned founder with exit. Developer at night and during the day if I can fit it in meetings... Scaled a mobile developer platform from hundreds to 800,000 developers. Successfully started and sold a fisheries platform & app to the world's largest fishing app with 15m+ users, and then led that co ...