AI Engineer
Why Your Agent’s Brain Needs a Playbook: Practical Wins from Using Ontologies - Jesús Barrasa, Neo4j
AI Engineer· 2025-06-27 09:53
Knowledge Graph & LLM Application
- Knowledge graphs combined with large language models (LLMs) can be used to build AI applications, particularly with a graph retrieval-augmented generation (GraphRAG) architecture [2]
- GraphRAG replaces vector databases with knowledge graphs built on graph databases, enhancing retrieval strategies [3]
- Using a knowledge graph provides richer retrieval strategies beyond vector semantic search, including contextualization and structured queries [4]
- The property graph model is built from nodes and relationships: nodes represent entities, and relationships connect them [4][5]

Ontology & Schema
- Ontologies provide an implementation-agnostic approach to representing schemas, facilitating knowledge graph creation for both structured and unstructured data pipelines [14][17]
- Ontologies describe a domain with definitions of classes and relationships, matching well with graph models [15]
- The Financial Industry Business Ontology (FIBO) is an example of a public financial industry ontology [15]
- Storing ontologies in the graph can drive dynamic behavior in retrievers, allowing for on-the-fly adjustments by modifying the ontology [29][30]

Retrieval Strategies
- The graph stores text chunks together with their embeddings, creating a new search space for vector search [20]
- Vector search finds nearby vectors, which can be dereferenced back to the graph for contextualization, navigation, and enrichment [20]
- Dynamic queries, driven by ontologies, can be used to create dynamic retrievers, enabling data-driven behavior [26][29]
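The retrieval flow summarized above (vector search over chunk embeddings, dereferencing hits back to the graph, and expansion governed by an ontology stored as data) can be sketched roughly as follows. This is a minimal in-memory illustration, not the speaker's implementation or Neo4j's API: the node ids, relationship types (`MENTIONS`, `ISSUED`), embeddings, and the `ONTOLOGY` dictionary are all hypothetical.

```python
import math

# Toy property graph (hypothetical data): nodes carry a label and properties,
# edges are (source, relationship type, target) triples.
NODES = {
    "c1": {"label": "Chunk", "text": "Acme issued a bond in 2024.", "embedding": [1.0, 0.0]},
    "c2": {"label": "Chunk", "text": "The weather was mild.", "embedding": [0.0, 1.0]},
    "acme": {"label": "Company", "name": "Acme"},
    "bond1": {"label": "Bond", "name": "Acme 2030 Bond"},
}
EDGES = [
    ("c1", "MENTIONS", "acme"),
    ("acme", "ISSUED", "bond1"),
]

# Hypothetical ontology kept as data: for each class, the relationship
# types the retriever is allowed to follow.
ONTOLOGY = {"Chunk": ["MENTIONS"], "Company": ["ISSUED"]}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_search(query_embedding, k=1):
    """Return the ids of the k chunk nodes closest to the query embedding."""
    scored = [
        (node_id, cosine(query_embedding, node["embedding"]))
        for node_id, node in NODES.items()
        if node["label"] == "Chunk"
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [node_id for node_id, _ in scored[:k]]


def expand(start_id, ontology, max_hops=2):
    """Walk outward from a hit, following only relationships the ontology allows."""
    seen, frontier = [start_id], [start_id]
    for _ in range(max_hops):
        next_frontier = []
        for node_id in frontier:
            allowed = ontology.get(NODES[node_id]["label"], [])
            for src, rel, dst in EDGES:
                if src == node_id and rel in allowed and dst not in seen:
                    seen.append(dst)
                    next_frontier.append(dst)
        frontier = next_frontier
    return seen


def retrieve(query_embedding, ontology=ONTOLOGY):
    """Vector search first, then ontology-driven graph expansion per hit."""
    return [expand(hit, ontology) for hit in vector_search(query_embedding)]
```

Because the traversal rules live in `ONTOLOGY` rather than in code, passing a different dictionary changes what the retriever returns without touching `expand`, which is the kind of data-driven behavior the summary describes.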
GraphRAG methods to create optimized LLM context windows for Retrieval — Jonathan Larson, Microsoft
AI Engineer· 2025-06-27 09:48
GraphRAG Applications & Performance
- GraphRAG is a key enabler for building effective AI applications, especially when paired with agents [1]
- GraphRAG excels at semantic understanding and can perform global queries over a code repository [2][3]
- GraphRAG can be used for code translation from Python to Rust, outperforming direct LLM translation [4][9]
- GraphRAG can be applied to large codebases like Doom (100,000 lines of code, 231 files) for documentation and feature development [10][12][13]
- GraphRAG, when combined with the GitHub Copilot coding agent, enables complex multi-file modifications, such as adding jump capability to Doom [18][20]

BenchmarkQED & LazyGraphRAG
- BenchmarkQED is a new open-source tool for measuring and evaluating GraphRAG systems, focusing on local and global quality metrics [21][22]
- BenchmarkQED includes AutoQ (query generation), AutoE (evaluation using an LLM as a judge), and AutoD (dataset summarization and sampling) [22]
- LazyGraphRAG demonstrates dominant performance against vector RAG on data-local questions, winning 92%, 90%, and 91% of the time against 8K, 120K, and 1 million token context windows respectively [29][30]
- LazyGraphRAG can achieve this performance at a tenth of the cost of using a 1 million token context window [32]
- LazyGraphRAG is being incorporated into Azure AI and the Microsoft Discovery Platform [34]
Agentic GraphRAG: Simplifying Retrieval Across Structured & Unstructured Data — Zach Blumenfeld
AI Engineer· 2025-06-27 09:44
Knowledge Graph Architecture & Agentic Workflows
- Knowledge graphs can enhance agentic workflows by enabling reasoning and question decomposition, moving beyond simple vector searches [4]
- Knowledge graphs make it easy to express simple data models to agents, which helps them retrieve information accurately and expand the graph with more data [5]
- The integration of knowledge graphs allows for more precise question answering through a more expressive data model [22]

Data Modeling & Entity Extraction
- Data modeling should focus on defining key entities and their relationships, such as people, skills, and activities [17]
- Entity extraction from unstructured documents, like resumes, can be used to create a graph database representing these relationships [18]
- Pydantic classes and LangChain can be used for entity extraction workflows to decompose documents and extract JSON data containing skills and accomplishments [19][20]

Benefits of Graph Databases
- Graph databases enable flexible queries and high performance for complex traversals across skills, systems, domains, and accomplishments [30]
- Graph databases allow for easy addition of new data and relationships, which is crucial for rapid iteration and adaptation in agentic systems [37]
- Graph databases facilitate the creation of tools to find collaborators based on shared projects and domains [39]

Practical Application: Employee Skills Analysis
- The presentation uses an employee graph example to demonstrate skills analysis, similarity searches, and identification of skill gaps [5]
- Initial attempts to answer questions using only document embeddings are inaccurate, highlighting the need for entity extraction and metadata [9]
- By leveraging a knowledge graph, the system can accurately answer questions about the number of developers with specific skills, such as Python, and identify similar employees based on skill sets [24][25]
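To make the employee-skills example concrete, the sketch below shows why extracted entities answer counting and similarity questions that raw document embeddings handle poorly. It is an assumption-laden toy: standard-library dataclasses stand in for the Pydantic models mentioned in the talk, the `PEOPLE` records are invented extraction output, and Jaccard overlap is one possible similarity measure, not necessarily the one used in the presentation.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    """Stand-in for an extracted entity (the talk uses Pydantic classes)."""
    name: str
    skills: set[str] = field(default_factory=set)


# Hypothetical output of an entity-extraction step over resumes.
PEOPLE = [
    Person("Ana", {"Python", "Cypher", "AWS"}),
    Person("Ben", {"Python", "AWS"}),
    Person("Cy", {"Rust"}),
]


def build_skill_index(people):
    """Map each skill to the set of people who have it (a skill -> person edge list)."""
    index = {}
    for person in people:
        for skill in person.skills:
            index.setdefault(skill, set()).add(person.name)
    return index


def count_with_skill(index, skill):
    """Exact count of people with a skill: a structured query, not a similarity guess."""
    return len(index.get(skill, set()))


def most_similar(people, name):
    """Return the person whose skill set has the highest Jaccard overlap with `name`."""
    target = next(p for p in people if p.name == name)

    def jaccard(other):
        union = target.skills | other.skills
        return len(target.skills & other.skills) / len(union) if union else 0.0

    others = [p for p in people if p.name != name]
    return max(others, key=jaccard).name
```

With the sample data, `count_with_skill(build_skill_index(PEOPLE), "Python")` is an exact lookup over extracted facts, whereas a pure embedding search would only return the resumes that look most like the question.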
Revenue Engineering: How to Price (and Reprice) Your AI Product — Kshitij Grover, Orb
AI Engineer· 2025-06-27 09:41
Pricing Principles for AI Products
- Pricing is a form of friction that can either enable or prevent product adoption, requiring careful consideration of value delivery and target audience [2]
- Traditional pricing principles emphasize simplicity, value signaling through willingness to pay, and margin protection [8]
- AI-native pricing prioritizes predictability for mature companies that need to budget, speed for early-stage products, and adaptation to variable costs [10][11][12]

Key Considerations for AI Pricing
- Audience understanding is crucial, considering their buying journey, value expectations, and decision-making processes [15][16]
- Packaging and pricing tiers influence user perception and incentives, shaping how users interact with the product [18][19]
- Because underlying costs change rapidly, margin structure should focus on axes of scaling and flexibility to experiment rather than on fixed margins [13][14]

Strategies for Margin Management and Flexibility
- Differentiate through R&D innovation and pass technical advantages to users as pricing leverage [23][24]
- Implement rate limits or guardrails to prevent degenerate workloads and incentivize reasonable usage, rather than linearly scaling costs [25]
- Incrementally evolve pricing in response to R&D investments, aligning monetization with the value perceived by end users [30][31]

Future Trends in AI Agent Pricing
- Expect continued price wars and a move towards effectively unlimited plans with caps and guardrails [36][37]
- Outcome-based pricing will become more prevalent, requiring clear definitions of success and measurable SLAs [37][38]
- Real-time visibility, spend management, and balance alerts will become more sophisticated, offering users greater control over spend [38][39][40]
"Data readiness" is a Myth: Reliable AI with an Agentic Semantic Layer — Anushrut Gupta, PromptQL
AI Engineer· 2025-06-27 09:40
Problem Statement
- Data readiness is a myth, and achieving perfect data for AI is an unattainable pipe dream [1][2][3]
- Fortune 500 companies lose an average of $250 million due to poor data quality [7]
- Traditional semantic layers and knowledge graphs are insufficient for capturing the nuances of business language and tribal knowledge [8][9][10][11][12][13][14]

Solution: Agentic Semantic Layer (PromptQL)
- PromptQL is presented as a "day zero smart analyst" AI system that learns and improves over time through course correction and steering [17][18][19][20]
- It uses a domain-specific language (DSL) for data retrieval, computation, aggregation, and semantics, decoupling LLM plan generation from execution [21][22]
- The system allows for editing the AI's "brain" to correct its understanding and guide its learning [28]
- It incorporates a prompt learning layer to improve the semantic graph and create a company-specific business language [31]
- The semantic layer is version controlled, allowing for fallback to previous builds [33]

Key Features and Benefits
- Correctable, explainable, and steerable AI that improves with use [19]
- Ability to handle messy data and understand business context [24][25]
- Replaces months of preparatory work with an immediate start, enabling faster AI deployments [37]
- Is self-improving and achieves 100% accuracy on complex tasks, according to the presentation [37]

Demonstrated Capabilities
- The system can understand what revenue means and perform calculations [23]
- It can identify and correct errors in data, such as incorrect status values [24]
- It can integrate data from multiple databases and SaaS applications [25][27]
- It can summarize support tickets and extract sentiment [26][29]
- It can learn the meaning of custom terms and relationships between tables [35][36]

Customer Validation
- A Fortune 500 food chain company and a high-growth fintech company achieved 100% accurate AI using PromptQL [38]
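The idea of decoupling LLM plan generation from execution can be illustrated with a tiny plan interpreter. This is only a sketch of the general pattern, not the product's DSL: the plan format, the `filter` and `sum` operations, and the sample rows are all hypothetical, and the plan is hard-coded where an LLM would normally emit it.

```python
# Hypothetical order data the plan runs against.
ROWS = [
    {"order_id": 1, "status": "paid", "amount": 120.0},
    {"order_id": 2, "status": "refunded", "amount": 80.0},
    {"order_id": 3, "status": "paid", "amount": 50.0},
]

# Hypothetical plan, as an LLM might emit it for "what is our revenue?".
# The model only produces this data structure; it never touches the rows.
PLAN = [
    {"op": "filter", "field": "status", "equals": "paid"},
    {"op": "sum", "field": "amount"},
]


def execute(plan, rows):
    """Run each plan step deterministically; unknown operations are rejected."""
    data = rows
    for step in plan:
        if step["op"] == "filter":
            data = [row for row in data if row[step["field"]] == step["equals"]]
        elif step["op"] == "sum":
            data = sum(row[step["field"]] for row in data)
        else:
            raise ValueError(f"unknown op: {step['op']}")
    return data
```

Because the plan is plain data, it can be inspected, corrected, and re-run: changing the business definition of revenue means editing one `filter` step rather than re-prompting and hoping for the same arithmetic.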
Building Agentic Applications w/ Heroku Managed Inference and Agents — Julián Duque & Anush Dsouza
AI Engineer· 2025-06-27 09:38
Heroku Managed Inference and Agents Platform Overview
- The Heroku Managed Inference and Agents platform enables developers to build agentic applications that can reason, make decisions, and trigger actions [1]
- The platform allows for provisioning and deploying LLMs, running untrusted code securely in multiple languages, and extending agents with the Model Context Protocol (MCP) [1]

Key Capabilities
- Heroku Managed Inference and Agents facilitates the deployment and management of LLMs [1]
- The platform supports secure execution of untrusted code in Python, Node.js, Go, and Ruby [1]
- The Model Context Protocol (MCP) can be used to extend agent capabilities [1]

Target Applications
- The platform is suitable for building internal tools, developer assistants, or customer-facing AI features [1]
Events are the Wrong Abstraction for Your AI Agents - Mason Egger, Temporal.io
AI Engineer· 2025-06-27 09:35
Core Argument
- The presentation argues that event-driven architecture (EDA), while seemingly loosely coupled at runtime, is tightly coupled at design time, leading to complexities and challenges in AI agent development [21][22]
- It proposes a shift in focus from events to durable execution as the core of AI agent architecture, which simplifies development and handles failures more effectively [26][27]

Problems with Event-Driven Architecture
- EDA sacrifices clear APIs, as events lack the documentation and structure of traditional APIs [15]
- Business logic becomes fragmented and scattered across multiple services, making debugging and understanding the system more difficult [16]
- Services become ad hoc state machines, leading to potential race conditions and difficult-to-debug issues [18][19]
- EDA can lead to reluctance to iterate on architecture due to fear of breaking existing functionality [25]

Durable Execution as a Solution
- Durable execution is presented as a crash-proof execution environment that automatically preserves application state, virtualizes execution, and is not limited by time or hardware [27][28][29][30][31][32][33][34]
- It allows developers to focus on business logic rather than managing events and queues [38]
- Temporal provides durable execution as an open-source, MIT-licensed product with SDKs for multiple programming languages [38][39]
- Durable execution abstracts away the complexities of events into the software layer [40][43]

Temporal's Offering
- Temporal's durable execution system offers automatic retries for failures, such as LLM downtime or rate limits [36]
- It supports polyglot programming, allowing functions written in different languages to be called seamlessly [39]
- Temporal is available for demonstration and further discussion at the company's booth and Slack channel [44][45]
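The two properties attributed to durable execution above, preserved state and automatic retries, can be approximated in a few lines to show the idea. This is a conceptual sketch only and does not use or imitate Temporal's SDK: the `journal` dictionary stands in for persisted workflow state, and `run_step`, `TransientError`, and `make_flaky_llm` are hypothetical names.

```python
import time


class TransientError(Exception):
    """Stands in for a recoverable failure such as LLM downtime or a rate limit."""


def run_step(journal, name, fn, max_attempts=3, base_delay=0.0):
    """Run a step at most once per workflow.

    If the journal already holds a result for this step, replay it instead of
    re-running; otherwise retry with exponential backoff until it succeeds.
    """
    if name in journal:
        return journal[name]
    for attempt in range(1, max_attempts + 1):
        try:
            result = fn()
            journal[name] = result  # checkpoint so a restart skips this step
            return result
        except TransientError:
            if attempt == max_attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))


def make_flaky_llm(failures):
    """Build a fake LLM call that fails `failures` times before succeeding."""
    state = {"calls": 0}

    def call():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TransientError("rate limited")
        return "summary text"

    return call, state


def workflow(journal, llm_call):
    """Business logic reads as straight-line code; retries and replay are hidden."""
    draft = run_step(journal, "draft", llm_call)
    return run_step(journal, "publish", lambda: f"published: {draft}")
```

Running `workflow` a second time with the same journal returns the recorded results without calling the LLM again, which mirrors how a restarted process could resume after a crash if the journal were stored durably.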
Prompt Engineering is Dead — Nir Gazit, Traceloop
AI Engineer· 2025-06-27 09:34
Core Argument
- The presentation challenges the notion of "prompt engineering" as a true engineering discipline, suggesting that iterative prompt improvement can be automated [1][2]
- The speaker advocates for an alternative approach to prompt optimization, emphasizing the use of evaluators and automated agents [23]

Methodology & Implementation
- The company developed a chatbot for its website documentation using a Retrieval-Augmented Generation (RAG) pipeline [2]
- The RAG pipeline consists of a Chroma database, OpenAI, and prompts to answer questions about the documentation [7]
- An evaluator was built to assess the RAG pipeline's responses, using a dataset of questions and expected answers [5][7]
- The evaluator uses a ground-truth-based LLM-as-a-judge approach, checking whether the generated answers contain specific facts [10][13]
- An agent was created to automatically improve prompts by researching online guides, running evaluations, and regenerating prompts based on failure reasons [5][18][19]
- The agent uses CrewAI to think, call the evaluator, and regenerate prompts based on best practices [20]

Results & Future Considerations
- The initial score of the prompt was 0.4 (40%), and after two iterations with the agent, the score improved to 0.9 (90%) [21][22]
- The company acknowledges the risk of overfitting to the training data (20 examples) and suggests splitting the data into train/test sets for better generalization [24][25]
- Future work may involve applying the same automated optimization techniques to the evaluator and agent prompts [27]
- The demo is available in the traceloop/autoprompting-demo repository [27]
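The evaluation loop described above, scoring answers by whether they contain expected facts and holding out a test split to detect overfitting, might look like the sketch below. It is a simplified stand-in rather than the speaker's code: a case-insensitive substring check replaces the LLM judge, and `contains_facts`, `score`, and `split` are hypothetical helper names.

```python
import random


def contains_facts(answer, facts):
    """Pass only if every expected fact appears in the answer (substring check
    used here in place of an LLM judge)."""
    text = answer.lower()
    return all(fact.lower() in text for fact in facts)


def score(answer_fn, dataset):
    """Fraction of examples whose generated answer contains all expected facts."""
    passed = sum(
        contains_facts(answer_fn(example["question"]), example["facts"])
        for example in dataset
    )
    return passed / len(dataset)


def split(dataset, test_fraction=0.25, seed=0):
    """Shuffle deterministically and hold out a test set, so a prompt tuned on
    the training examples can be checked on examples it never saw."""
    items = list(dataset)
    random.Random(seed).shuffle(items)
    cut = int(len(items) * (1 - test_fraction))
    return items[:cut], items[cut:]
```

In this setup an optimizing agent would only ever see failures from the training split; a large gap between the training score and the held-out score would suggest the prompt has been fitted to the specific examples rather than improved in general.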
The Eyes Are The (Context) Window to The Soul: How Windsurf Gets to Know You — Sam Fertig, Windsurf
AI Engineer· 2025-06-27 09:34
Core Problem in AI Coding Space
- Generating code is not difficult; the challenge is generating code that fits into existing codebases, adheres to organizational policies and personal preferences, and is future-proof [13][14][15]
- The magic of AI coding tools like Windsurf lies in context, specifically "what context" and "how much" [16]

Windsurf's Context Philosophy
- "What context" is divided into two buckets: heuristics (user behavior) and hard evidence (environment/codebase) [17][18][19]
- Relevant output is determined by the prompt, the state of the codebase, and the user state [20]
- To address latency, Windsurf prioritizes optimizing the relevance of context over simply increasing the size of the context window [21][22]

Windsurf's Capabilities
- Windsurf excels at finding relevant context quickly due to its background in GPU optimization [23]
- Windsurf provides connectors for users to perform context retrieval at their level, including embedding search, memories, rules, and custom workspaces [24]

Data Privacy
- Windsurf processes information only within the user's editor and does not access the rest of the user's machine [31]
- Windsurf's servers are stateless, and the company does not store or train on user data [31][32]
What does Enterprise Ready MCP mean? — Tobin South, WorkOS
AI Engineer· 2025-06-27 09:31
MCP and AI Agent Development
- MCP is presented as a way of interfacing between AI and external resources, enabling functionalities like database access and complex computations [3]
- The industry is currently focused on building internal demos and connecting them to APIs, but needs to move towards robust authentication and authorization [9][10]
- The industry needs to adapt existing tooling for MCP due to its dynamic client registration, which can flood developer dashboards [12]

Enterprise Readiness and Security
- Scaling MCP servers requires addressing free credit abuse, bot blocking, and robust access controls [12]
- Selling MCP solutions to enterprises necessitates SSO, lifecycle management, provisioning, fine-grained access controls, audit logs, and data loss prevention [12]
- Regulations like GDPR impose specific logging requirements for AI workloads, which are not widely supported [12]

Challenges and Future Development
- Passing scope and access control between different AI workloads remains a significant challenge [13]
- The MCP spec is actively developing, with features like elicitation (AI asking humans for input) still unstable [13]
- Cloud vendors are solving cloud hosting, but authorization and access control are the hardest parts of enterprise deployment [13]