AI Engineer

Revenue Engineering: How to Price (and Reprice) Your AI Product — Kshitij Grover, Orb
AI Engineer· 2025-06-27 09:41
Thanks for coming to my talk. I'm Kshitij, one of the co-founders at Orb, and I'm going to be talking about how to think about pricing. Maybe the top-level takeaway from this talk is that pricing is a deep, complicated topic. We're going to cover some examples and some tactical advice. But in general, the way you should think about pricing is that pricing is a form of friction for your product, and sometimes that friction can be applied for very good reason. Sometime ...
"Data readiness" is a Myth: Reliable AI with an Agentic Semantic Layer — Anushrut Gupta, PromptQL
AI Engineer· 2025-06-27 09:40
Problem Statement
- Data readiness is a myth; achieving perfect data for AI is an unattainable pipe dream [1][2][3]
- Fortune 500 companies lose an average of $250 million due to poor data quality [7]
- Traditional semantic layers and knowledge graphs are insufficient for capturing the nuances of business language and tribal knowledge [8][9][10][11][12][13][14]

Solution: Agentic Semantic Layer (PromptQL)
- PromptQL is presented as a "day zero smart analyst" AI system that learns and improves over time through course correction and steering [17][18][19][20]
- It uses a domain-specific language (DSL) for data retrieval, computation, aggregation, and semantics, decoupling LLM plan generation from execution [21][22]
- The system allows for editing the AI's "brain" to correct its understanding and guide its learning [28]
- It incorporates a prompt learning layer to improve the semantic graph and create a company-specific business language [31]
- The semantic layer is version controlled, allowing fallback to previous builds [33]

Key Features and Benefits
- Correctable, explainable, and steerable AI that improves with use [19]
- Ability to handle messy data and understand business context [24][25]
- Reduces months of setup work to an immediate start, enabling faster AI deployments [37]
- Self-improving; achieves 100% accuracy on complex tasks [37]

Demonstrated Capabilities
- The system can understand what revenue means and perform calculations [23]
- It can identify and correct errors in data, such as incorrect status values [24]
- It can integrate data from multiple databases and SaaS applications [25][27]
- It can summarize support tickets and extract sentiment [26][29]
- It can learn the meaning of custom terms and relationships between tables [35][36]

Customer Validation
- A Fortune 500 food chain company and a high-growth fintech company achieved 100% accurate AI using PromptQL [38]
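The decoupling of LLM plan generation from execution described above can be sketched with a tiny interpreter: the model emits steps in a restricted DSL, and a deterministic runtime executes them. All names here (`run_plan`, the toy `filter`/`sum` ops, the sample table) are hypothetical illustrations, not PromptQL's actual language.

```python
# Minimal sketch of a plan-then-execute DSL runtime (hypothetical; not
# PromptQL's real DSL). The LLM would emit only the plan; execution is
# deterministic and auditable.

OPS = {
    "filter": lambda rows, field, value: [r for r in rows if r.get(field) == value],
    "sum":    lambda rows, field: sum(r.get(field, 0) for r in rows),
}

def run_plan(plan, data):
    """Execute a list of (op, *args) steps against an in-memory table."""
    result = data
    for op, *args in plan:
        result = OPS[op](result, *args)
    return result

orders = [
    {"status": "complete", "amount": 120},
    {"status": "pending",  "amount": 40},
    {"status": "complete", "amount": 80},
]

# "What is revenue?" -> planner output: keep completed orders, sum amounts.
plan = [("filter", "status", "complete"), ("sum", "amount")]
print(run_plan(plan, orders))  # -> 200
```

Because the plan is data, it can be inspected, version controlled, and replayed against corrected tables, which is what makes the layer steerable.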
Building Agentic Applications w/ Heroku Managed Inference and Agents — Julián Duque & Anush Dsouza
AI Engineer· 2025-06-27 09:38
Heroku Managed Inference and Agents Platform Overview
- The Heroku Managed Inference and Agents platform enables developers to build agentic applications that can reason, make decisions, and trigger actions [1]
- The platform allows for provisioning and deploying LLMs, running untrusted code securely in multiple languages, and extending agents with the Model Context Protocol (MCP) [1]

Key Capabilities
- Heroku Managed Inference and Agents facilitates the deployment and management of LLMs [1]
- The platform supports secure execution of untrusted code in Python, Node.js, Go, and Ruby [1]
- The Model Context Protocol (MCP) can be used to extend agent capabilities [1]

Target Applications
- The platform is suitable for building internal tools, developer assistants, or customer-facing AI features [1]
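The reason-decide-act loop such a platform runs on a developer's behalf can be sketched generically: the model proposes a tool call, the runtime executes it, and the observation is fed back until the model answers. Everything here (`fake_model`, the `run_code` tool) is a toy stand-in, not Heroku's actual API; a managed platform would run the code in an isolated sandbox rather than in-process.

```python
# Generic sketch of an agentic reason -> act -> observe loop
# (hypothetical names; not the Heroku Managed Inference API).

def fake_model(messages):
    """Stand-in for an LLM: requests a tool once, then answers."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "run_code", "args": {"source": "2 + 3"}}
    return {"answer": f"The result is {messages[-1]['content']}."}

TOOLS = {
    # A managed platform would execute this in a sandboxed language runtime.
    "run_code": lambda args: str(eval(args["source"])),  # toy only: never eval untrusted code
}

def run_agent(user_msg):
    messages = [{"role": "user", "content": user_msg}]
    while True:
        step = fake_model(messages)
        if "answer" in step:
            return step["answer"]
        result = TOOLS[step["tool"]](step["args"])
        messages.append({"role": "tool", "content": result})

print(run_agent("What is 2 + 3?"))  # -> "The result is 5."
```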
Events are the Wrong Abstraction for Your AI Agents - Mason Egger, Temporal.io
AI Engineer· 2025-06-27 09:35
Welcome, everyone. My name is Mason Egger. I work at Temporal, and today we're going to talk about why events are the wrong abstraction for your AI agents. Who here, by a raise of hands, recognizes what this diagram is, out of curiosity? Okay. This is a map of our solar system in a geocentric projection: Earth is at the center of the solar system, and this is how celestial objects move around the Earth. This was used to calculate celestial trajectories prior to, like, the ...
Prompt Engineering is Dead — Nir Gazit, Traceloop
AI Engineer· 2025-06-27 09:34
Core Argument
- The presentation challenges the notion of "prompt engineering" as a true engineering discipline, suggesting that iterative prompt improvement can be automated [1][2]
- The speaker advocates an alternative approach to prompt optimization, emphasizing the use of evaluators and automated agents [23]

Methodology & Implementation
- The company developed a chatbot for its website documentation using a Retrieval-Augmented Generation (RAG) pipeline [2]
- The RAG pipeline consists of a Chroma database, OpenAI, and prompts to answer questions about the documentation [7]
- An evaluator was built to assess the RAG pipeline's responses, using a dataset of questions and expected answers [5][7]
- The evaluator uses a ground-truth-based LLM as a judge, checking whether the generated answers contain specific facts [10][13]
- An agent was created to automatically improve prompts by researching online guides, running evaluations, and regenerating prompts based on failure reasons [5][18][19]
- The agent uses CrewAI to think, call the evaluator, and regenerate prompts based on best practices [20]

Results & Future Considerations
- The initial prompt scored 0.4 (40%); after two iterations with the agent, the score improved to 0.9 (90%) [21][22]
- The company acknowledges the risk of overfitting to the training data (20 examples) and suggests splitting the data into train/test sets for better generalization [24][25]
- Future work may involve applying the same automated optimization techniques to the evaluator and agent prompts [27]
- The demo is available in the Traceloop autoprompting demo repository [27]
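The evaluate-then-regenerate loop above can be sketched end to end. A toy scorer and a toy rewriter stand in for the LLM judge and the CrewAI agent; the eval set, prompts, and scoring rule are all illustrative assumptions, not the talk's actual code.

```python
# Sketch of automated prompt optimization: score a prompt against a small
# eval set, then let a "rewriter" (an LLM agent in the talk; a fixed rule
# here) revise the prompt until the score clears a threshold.

EVALSET = [
    {"question": "What does the SDK export?", "must_contain": "trace"},
    {"question": "How do I install it?",      "must_contain": "pip"},
]

def fake_pipeline(prompt, question):
    """Stand-in for the RAG pipeline: richer prompts yield fuller answers."""
    answer = "You can trace calls" if "cite" in prompt else "See docs"
    if "step-by-step" in prompt:
        answer += "; install with pip"
    return answer

def score(prompt):
    hits = sum(case["must_contain"] in fake_pipeline(prompt, case["question"])
               for case in EVALSET)
    return hits / len(EVALSET)

def improve(prompt, target=0.9):
    additions = ["Always cite sources.", "Answer step-by-step."]
    for extra in additions:
        if score(prompt) >= target:
            break
        prompt = f"{prompt} {extra}"  # the real agent rewrites via an LLM
    return prompt, score(prompt)

prompt, final = improve("You are a docs assistant.")
print(final)  # -> 1.0
```

Splitting `EVALSET` into train/test halves, as the talk suggests, would guard against the optimizer overfitting to these few examples.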
The Eyes Are The (Context) Window to The Soul: How Windsurf Gets to Know You — Sam Fertig, Windsurf
AI Engineer· 2025-06-27 09:34
Core Problem in AI Coding Space
- Generating code is not difficult, but generating code that fits into existing codebases, adheres to organizational policies and personal preferences, and is future-proof is challenging [13][14][15]
- The magic of AI coding tools like Windsurf lies in context, specifically "what context" and "how much" [16]

Windsurf's Context Philosophy
- "What context" is divided into two buckets: heuristics (user behavior) and hard evidence (environment/codebase) [17][18][19]
- Relevant output is determined by the prompt, the state of the codebase, and the user state [20]
- Windsurf prioritizes optimizing the relevance of context over simply increasing the size of the context window, to address latency [21][22]

Windsurf's Capabilities
- Windsurf excels at finding relevant context quickly due to its background in GPU optimization [23]
- Windsurf provides connectors for users to perform context retrieval at their level, including embedding search, memories, rules, and custom workspaces [24]

Data Privacy
- Windsurf processes information only within the user's editor and does not access the user's operating machine [31]
- Windsurf's servers are stateless, and the company does not store or train on user data [31][32]
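The "relevance over raw context size" philosophy can be sketched as ranking candidate snippets by cosine similarity to the query embedding and packing only what fits a fixed token budget. The hand-made two-dimensional vectors and the budget are toy assumptions, not Windsurf's retrieval stack.

```python
import math

# Sketch of relevance-first context selection: rank candidate snippets by
# cosine similarity to the query, then fill a fixed token budget greedily.
# Embeddings here are toy hand-made vectors, not a real embedding model.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def select_context(query_vec, snippets, budget_tokens):
    ranked = sorted(snippets, key=lambda s: cosine(query_vec, s["vec"]), reverse=True)
    chosen, used = [], 0
    for s in ranked:
        if used + s["tokens"] <= budget_tokens:
            chosen.append(s["text"])
            used += s["tokens"]
    return chosen

snippets = [
    {"text": "def parse_config(...)", "vec": [0.9, 0.1], "tokens": 300},
    {"text": "README intro",          "vec": [0.1, 0.9], "tokens": 500},
    {"text": "config schema docs",    "vec": [0.8, 0.2], "tokens": 400},
]

# A config-related query keeps the two config snippets and drops the README.
print(select_context([1.0, 0.0], snippets, budget_tokens=800))
```

A smaller, more relevant context both cuts latency and leaves headroom for the model's own output, which is the trade-off the talk emphasizes.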
What does Enterprise Ready MCP mean? — Tobin South, WorkOS
AI Engineer· 2025-06-27 09:31
MCP and AI Agent Development
- MCP is presented as a way of interfacing between AI and external resources, enabling functionalities like database access and complex computations [3]
- The industry is currently focused on building internal demos and connecting them to APIs, but needs to move towards robust authentication and authorization [9][10]
- The industry needs to adapt existing tooling for MCP due to its dynamic client registration, which can flood developer dashboards [12]

Enterprise Readiness and Security
- Scaling MCP servers requires addressing free credit abuse, bot blocking, and robust access controls [12]
- Selling MCP solutions to enterprises necessitates SSO, lifecycle management, provisioning, fine-grained access controls, audit logs, and data loss prevention [12]
- Regulations like GDPR impose specific logging requirements for AI workloads, which are not widely supported [12]

Challenges and Future Development
- Passing scope and access control between different AI workloads remains a significant challenge [13]
- The MCP spec is actively developing, with features like elicitation (AI asking humans for input) still unstable [13]
- Cloud vendors are solving cloud hosting, but authorization and access control are the hardest parts of enterprise deployment [13]
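Fine-grained access control plus audit logging for MCP-style tool calls can be sketched as a scope check before every dispatch. The scope names and tool names here are hypothetical; the real MCP spec delegates authorization to OAuth-based flows rather than defining its own scope model.

```python
# Sketch of scoped authorization with an audit trail for tool calls
# (hypothetical scope/tool names; not the MCP spec's auth model).

REQUIRED_SCOPE = {
    "read_table":  "db:read",
    "run_query":   "db:read",
    "write_table": "db:write",
}

audit_log = []

def call_tool(caller, tool, granted_scopes):
    needed = REQUIRED_SCOPE[tool]
    allowed = needed in granted_scopes
    # Every decision is logged, allow or deny, to satisfy audit requirements.
    audit_log.append({"caller": caller, "tool": tool, "allowed": allowed})
    if not allowed:
        raise PermissionError(f"{caller} lacks scope {needed!r} for {tool}")
    return f"{tool}: ok"

print(call_tool("agent-42", "read_table", {"db:read"}))   # -> "read_table: ok"
try:
    call_tool("agent-42", "write_table", {"db:read"})
except PermissionError as err:
    print(err)
```

The hard part the talk points at is propagating `granted_scopes` faithfully when one AI workload calls another, so a downstream agent can never exceed the original caller's rights.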
CI in the Era of AI: From Unit Tests to Stochastic Evals — Nathan Sobo, Zed
AI Engineer· 2025-06-27 09:22
Software engineers have long understood that high-quality code requires comprehensive automated testing. For decades, our industry has relied on deterministic tests with clear pass/fail outcomes to ensure reliability. That's certainly true at Zed, where we're building a next-generation native IDE in Rust. Zed runs at 120 frames per second, but it would also crash once a second if we didn't maintain and run a comprehensive suite of unit tests on every chang ...
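The shift from deterministic unit tests to stochastic evals can be sketched as sampling a nondeterministic system many times and gating CI on a pass rate rather than on a single pass/fail outcome. The seeded coin flip, sample count, and 0.8 threshold below are illustrative assumptions, not Zed's harness.

```python
import random

# Sketch of a stochastic eval: instead of one pass/fail assertion, sample
# the nondeterministic system N times and gate CI on the observed pass rate.
# The "model" is a seeded coin flip standing in for an LLM-backed feature.

def flaky_model(rng):
    """Stand-in for a feature that succeeds ~90% of the time."""
    return rng.random() < 0.9

def pass_rate(n=200, seed=0):
    rng = random.Random(seed)  # seeded, so the CI run itself is reproducible
    return sum(flaky_model(rng) for _ in range(n)) / n

rate = pass_rate()
assert rate >= 0.8, f"pass rate {rate:.2f} below CI threshold"
print(f"eval passed with rate {rate:.2f}")
```

Seeding keeps the harness reproducible while still measuring a distribution, so a regression shows up as a threshold breach rather than a flaky red build.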
Building AI Agents that actually automate Knowledge Work - Jerry Liu, LlamaIndex
AI Engineer· 2025-06-24 00:16
Agents are all the rage in 2025, and every single B2B SaaS startup/incumbent promises AI agents that can "automate work" in some way. But how do you actually build this? The answer is twofold: 1. really, really good tools, and 2. carefully tailored agent reasoning over these tools, ranging from assistant-based to automation-based UXs. The main goal of this talk is to give a practical overview of agent architectures that can automate real-world work, with a focus on document-centric tasks. Learn the core building blocks o ...
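The two ingredients above can be sketched together: a reliable document tool, plus a reasoning loop whose autonomy is a dial from assistant (confirm each step with a human) to automation (run straight through). The invoice format, field names, and steps are illustrative, not LlamaIndex's API.

```python
# Sketch of tools + tailored reasoning for document work (hypothetical names).

def extract_fields(doc):
    """A 'really good tool': deterministic field extraction from a toy invoice."""
    fields = {}
    for line in doc.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip().lower()] = value.strip()
    return fields

def process_invoice(doc, mode="automation", confirm=lambda step: True):
    fields = extract_fields(doc)
    steps = [f"file under vendor {fields['vendor']}",
             f"queue payment of {fields['total']}"]
    done = []
    for step in steps:
        # assistant UX: ask the human; automation UX: proceed unconditionally
        if mode == "automation" or confirm(step):
            done.append(step)
    return done

doc = "Vendor: Acme Corp\nTotal: $1,250.00"
print(process_invoice(doc))
```

The same loop serves both UXs: wiring `confirm` to a UI prompt gives an assistant, while `mode="automation"` gives unattended processing once the tool is trusted.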
Large Scale AI on Apple Silicon (as mentioned by @AndrejKarpathy ) — Alex Cheema, EXO Labs
AI Engineer· 2025-06-20 22:52
Scientific Rigor and Progress
- Scientific progress is not always linear, and inertia within the scientific community can hinder the adoption of new ideas, even with sound methodology [8][18]
- Questioning assumptions is crucial for scientific advancement, as illustrated by historical examples in physics and rat experiments [9][16]
- Oversimplification of science is a common pitfall, and publishing both successful and unsuccessful results is important for transparency and learning [18][35]

AI Development and Hardware
- The "hardware lottery" suggests that the best research ideas in AI don't always win, due to various factors including hardware limitations and existing paradigms [22]
- Large Language Models (LLMs) can create inertia by reinforcing existing practices, such as the dominance of Python, making it harder for new programming languages to gain adoption [23][24]
- GPUs addressed the von Neumann bottleneck of CPUs by changing the ratio of bytes loaded to floating-point operations executed, enabling significant performance improvements in AI [21]

Exo's Solution and Research
- Exo is developing an orchestration layer for AI that runs on different hardware targets, aiming to provide a reliable system for managing distributed devices in ad hoc mesh networks [25]
- Exo models everything as a causally consistent set of events, creating a causal graph to reason about the system and ensure data consistency across distributed systems [26][27]
- Exo's technology enables the efficient utilization of diverse hardware, such as combining Nvidia Spark (high compute) and Studio (high memory bandwidth) for LLM generation [28][29]
- Exo is researching new optimizers that are more efficient per flop than Adam but require more memory, leveraging the memory-to-flops ratio of Apple silicon [31][32][33]
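Modeling everything as a causally consistent set of events can be sketched as a DAG where each event names the events it depends on, and any device replays the set in an order that respects causality. The event names and the use of Python's standard-library `graphlib` are illustrative assumptions, not Exo's implementation.

```python
from graphlib import TopologicalSorter

# Sketch of a causal event graph: each event lists its causal predecessors,
# and replay applies causes before effects. (Toy model; not Exo's code.)

events = {
    "load_weights": [],                # no dependencies
    "shard_model":  ["load_weights"],  # depends on weights being loaded
    "token_0":      ["shard_model"],
    "token_1":      ["token_0"],       # each token depends on the previous
}

def replay(events):
    """Return an order in which causes always precede effects."""
    return list(TopologicalSorter(events).static_order())

order = replay(events)
print(order)  # -> ['load_weights', 'shard_model', 'token_0', 'token_1']
```

Because the graph encodes only causal edges, events with no dependency between them can be applied in any interleaving on different devices while every node still converges to a consistent state.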