AI Engineer
Building a Smarter AI Agent with Neural RAG - Will Bryk, Exa.ai
AI Engineer· 2025-07-29 07:01
Core Problem & Solution
- The presentation introduces Exa, a search engine designed for AI, addressing the limitations of traditional search engines built for human users [5][23]
- Exa aims to provide an API that delivers any information from the web, catering to the specific needs of AI systems [22][41]
- Exa uses transformer-based embeddings to represent documents, capturing meaning and context beyond keywords [11][12]

AI vs Human Search
- Traditional search engines are optimized for humans, who use simple queries and want a few relevant links, while AIs require complex queries, vast amounts of knowledge, and precise, controllable information [23][24]
- AI agents need search engines that can handle multi-paragraph queries, search with extensive context, and provide comprehensive knowledge [31][32][33]
- Exa offers features like adjustable result counts (10, 100, 1000), date ranges, and domain-specific searches, giving AI systems full control [44]

Market Positioning & Technology
- Exa launched in November 2022 and gained traction for its ability to handle complex queries that traditional search engines struggle with [15]
- The company recognized the need for AI-driven search after the emergence of ChatGPT, realizing that LLMs need external knowledge sources [17][18]
- Exa combines neural and keyword search methods to provide comprehensive results, allowing agents to use different search types based on the query [47][48]

Future Development
- Exa is developing a "research endpoint" that uses multiple searches and LLM calls to generate detailed reports and structured outputs [51]
- The company envisions a future where AI agents have full access to the world's information through a versatile search API [48]
- Exa aims to handle a wider range of queries, including semantic and complex ones, turning the web into a controllable database for AI systems [38][39][40]
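To make the summary's "neural versus keyword" distinction and its control knobs (result count, date range, domain filter) concrete, here is a minimal sketch in plain Python. It is not Exa's API: the `Doc` type, the `search` function, its parameter names, and the two-dimensional vectors standing in for transformer embeddings are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Doc:
    url: str
    domain: str
    published: date
    text: str
    embedding: list  # toy stand-in for a transformer embedding


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def keyword_score(query, text):
    """Fraction of query terms that literally appear in the text."""
    terms = set(query.lower().split())
    words = set(text.lower().split())
    return len(terms & words) / len(terms) if terms else 0.0


def search(docs, query, query_embedding, *, mode="neural", num_results=10,
           start_date: Optional[date] = None, end_date: Optional[date] = None,
           include_domains: Optional[set] = None):
    """Filter by date and domain, then rank by the chosen scoring mode."""
    candidates = [
        d for d in docs
        if (start_date is None or d.published >= start_date)
        and (end_date is None or d.published <= end_date)
        and (include_domains is None or d.domain in include_domains)
    ]
    if mode == "neural":
        score = lambda d: cosine(query_embedding, d.embedding)
    elif mode == "keyword":
        score = lambda d: keyword_score(query, d.text)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return sorted(candidates, key=score, reverse=True)[:num_results]


# Hypothetical corpus; the vectors are hand-picked, not real embeddings.
DOCS = [
    Doc("https://a.example/1", "a.example", date(2024, 1, 5),
        "startups working on battery recycling", [0.9, 0.1]),
    Doc("https://b.example/2", "b.example", date(2025, 3, 1),
        "a recipe for sourdough bread", [0.1, 0.9]),
    Doc("https://a.example/3", "a.example", date(2025, 6, 1),
        "companies reclaiming lithium from old cells", [0.8, 0.2]),
]
QUERY = "battery recycling startups"
QUERY_VEC = [1.0, 0.0]

if __name__ == "__main__":
    for d in search(DOCS, QUERY, QUERY_VEC, mode="neural"):
        print(d.url)
```

In this toy corpus the third document shares no words with the query, so keyword scoring gives it zero, while the embedding comparison still ranks it second. That is the kind of gap the summary attributes to keyword-only search.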
Make your LLM app a Domain Expert: How to Build an Expert System — Christopher Lovejoy, Anterior
AI Engineer· 2025-07-28 19:55
Core Problem & Solution
- Vertical AI applications face a "last mile problem" in understanding industry-specific context and workflows, which is more critical than model sophistication [4][6]
- Anterior proposes an "adaptive domain intelligence engine" to convert customer-specific domain insights into performance improvements [17]
- The engine consists of measurement (performance evaluation) and improvement (iterative refinement) components [17]

Measurement & Metrics
- Defining key performance metrics that users care about is crucial, such as minimizing false approvals in healthcare or preventing dollar loss from fraud [18][19][20]
- Developing a failure mode ontology helps categorize and analyze different ways the AI can fail, enabling targeted improvements [21][22]
- Combining metric tracking with failure mode analysis allows prioritization of development efforts based on the impact on key metrics [26][27]

Iteration & Improvement
- Failure mode labeling creates ready-made datasets for iterative model improvement, using production data to ensure relevance [29]
- Domain experts can suggest changes to the application pipeline and provide new domain knowledge to enhance performance [32][33]
- This process enables rapid iteration, potentially fixing issues the same day by adding relevant domain knowledge and validating with evals [37]

Domain Expertise
- The level of domain expertise required depends on the specific workflow and optimization goals, with clinical reasoning requiring experienced doctors [38][39]
- Bespoke tooling is recommended for integrating domain expert feedback into the platform and workflows [41]
- Domain expert reviews provide performance metrics, failure modes, and suggested improvements, all in one [38]

Results & Performance
- Anterior achieved a 95% accuracy baseline in approving care requests, which was further improved to 99% through iterative refinement using the described system [14][15]
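The measurement loop described above (a key metric, failure-mode labels, and prioritization by impact) can be sketched in a few lines of Python. This is an illustration only, not Anterior's system: the `Review` record, the failure-mode names, and the sample data are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Review:
    case_id: str
    ai_decision: str             # "approve" or "escalate"
    expert_decision: str         # the domain expert's judgment
    failure_mode: Optional[str]  # label from the ontology, None if correct


def false_approval_rate(reviews):
    """Share of AI approvals that the expert would not have approved."""
    approvals = [r for r in reviews if r.ai_decision == "approve"]
    if not approvals:
        return 0.0
    wrong = sum(1 for r in approvals if r.expert_decision != "approve")
    return wrong / len(approvals)


def rank_failure_modes(reviews):
    """Count failure modes among false approvals, most frequent first."""
    counts = Counter(
        r.failure_mode for r in reviews
        if r.ai_decision == "approve"
        and r.expert_decision != "approve"
        and r.failure_mode
    )
    return counts.most_common()


def eval_set_for(reviews, failure_mode):
    """Case IDs carrying a label: a ready-made eval set for that failure."""
    return [r.case_id for r in reviews if r.failure_mode == failure_mode]


# Hypothetical expert reviews of production cases.
REVIEWS = [
    Review("c1", "approve", "approve", None),
    Review("c2", "approve", "escalate", "misread_guideline"),
    Review("c3", "approve", "escalate", "missing_record"),
    Review("c4", "approve", "escalate", "misread_guideline"),
    Review("c5", "escalate", "approve", "missing_record"),
]

if __name__ == "__main__":
    print(false_approval_rate(REVIEWS))
    print(rank_failure_modes(REVIEWS))
    print(eval_set_for(REVIEWS, "misread_guideline"))
```

Ranking failure modes by how often they cause the metric the customer cares about (here, false approvals) is one simple way to decide what to fix first; the labeled cases then double as a regression set for validating the fix.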
Shipping something to someone always wins — Kenneth Auchenberg (ex. Stripe, VSCode)
AI Engineer· 2025-07-28 19:54
Core Product Development Principle
- Shipping something to someone always wins, emphasizing rapid iteration and feedback loops over big launches [1][34]
- The key is enabling rapid iterative loops to get feedback from real users and maximize shots on goal [1]
- In the age of AI, this translates to building a "skateboard" first, then evolving it into a "car," ensuring a continuously viable product [2][4]
- A continuously viable solution is significantly more valuable because it provides feedback along the way, avoiding building in a vacuum [5][6]

Feedback Loop Implementation
- Establish a feedback loop with real users who can see something and provide feedback, allowing for iterative improvements, ideally within a day [7]
- Being able to ship every day is crucial for a fast feedback loop, requiring specific focus on the target customers [9]
- Work with real people (not just personas) to understand their problems and build empathy [10][11]
- Write the PR FAQ (press release and FAQ) or launch blog post early to sanity-check the idea and communicate the product effectively [12]

Navigating Constraints and AI Integration
- Design the best product first, before considering constraints like legal, compliance, and financial aspects [15]
- AI accelerates all aspects of product building, but the fundamental process of talking to users and getting feedback remains the same [26]
- Product management becomes more critical as the cost of writing code approaches zero, emphasizing customer knowledge and rapid feedback [28][29]
Why your product needs an AI product manager, and why it should be you — James Lowe, i.AI
AI Engineer· 2025-07-28 19:53
[Music] Hi everyone. Thanks for that welcome. Uh, as you just heard, my name is James Lowe. I'm head of AI engineering at the Incubator for AI. We're a small team of experts uh, in the UK government. We were created by 10 Downing Street to deliver public good using AI and we do that via experimentation and product building. The UK government delivers uh for its citizens. It spends over a trillion pounds delivering for its over 70 million citizens. So there's a lot to play for. At the Incubator for AI, uh we del ...
What Is a Humanoid Foundation Model? An Introduction to GR00T N1 - Annika & Aastha
AI Engineer· 2025-07-28 16:29
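Market Trends & Industry Dynamics
- A McKinsey report found that in the world's 30 most developed economies there are more jobs than people able to fill them; over the past decade, job growth outpaced population growth by 420% [2][3]
- Physical AI is essential for solving problems in industries such as leisure, hospitality, healthcare, construction, transportation, and manufacturing, which cannot be addressed by chatbots like ChatGPT alone [3][4]
- NVIDIA's Project GR00T is the strategy for bringing humanoid robots and other forms of robotics into the world, covering the compute infrastructure, software, and research required [11]

Robotics Foundation Model & Technology
- NVIDIA's GR00T N1 robotics foundation model is open source and highly customizable; a key feature is that it is cross-embodiment, and the model has 2 billion parameters [1][12]
- The robot data strategy combines a small amount of expensive real-world data (robots performing real tasks), a large amount of unstructured internet video data (humans solving tasks), and theoretically unlimited synthetic data [14][16][17][18]
- Project GR00T's data solution includes a data pyramid strategy, which emphasizes augmenting and multiplying high-quality data through simulation and world foundation models [13][18][19]
- GR00T N1 introduces a dual-system architecture: System 1 executes tasks quickly (120 Hz), while System 2 slowly plans complex tasks, inspired by Daniel Kahneman's "Thinking, Fast and Slow" [23][24][25]
- GR00T N1 uses diffusion transformer blocks, combined with a vision encoder, a VLM (vision-language model), and a text tokenizer to process image and text inputs, and an action decoder generates action vectors usable by a specific robot [27][28][29][30]
- The two main approaches to robot learning are imitation learning (copying human experts) and reinforcement learning (maximizing reward through trial and error); GR00T N1 uses both [32][33][36]

Deployment & Compute Infrastructure
- The physical AI lifecycle consists of generating data, consuming data, and deployment. NVIDIA calls this the "three-computer problem," with each stage having different compute characteristics: simulation (OVX, Omniverse), training (DGX), and edge deployment (AGX) [9][10]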
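The dual-system idea (a slow planner setting goals for a fast 120 Hz controller) can be illustrated with a toy control loop in Python. This is a sketch under stated assumptions and not GR00T N1's implementation: the one-dimensional "robot", the lookup-table planner standing in for the VLM, the clamped-step controller standing in for the diffusion-based action head, and the `REPLAN_EVERY` ratio are all invented for illustration; only the 120 Hz figure comes from the summary.

```python
FAST_HZ = 120      # System 1 rate, as stated in the summary
REPLAN_EVERY = 12  # assumption: System 2 runs once per 12 fast ticks


def system2_plan(instruction: str, position: float) -> float:
    """Slow path: turn an instruction into a target (stand-in for a VLM)."""
    targets = {"reach left": -1.0, "reach right": 1.0}
    return targets[instruction]


def system1_act(target: float, position: float, max_step: float = 0.01) -> float:
    """Fast path: one small, clamped motor step toward the current target."""
    error = target - position
    return max(-max_step, min(max_step, error))


def run(instruction: str, seconds: float = 1.0, position: float = 0.0):
    """Simulate the loop; returns the final position and number of plans."""
    ticks = int(seconds * FAST_HZ)
    target = position
    plans = 0
    for tick in range(ticks):
        if tick % REPLAN_EVERY == 0:
            # The slow system refreshes the goal only occasionally.
            target = system2_plan(instruction, position)
            plans += 1
        # The fast system acts on every tick using the latest goal.
        position += system1_act(target, position)
    return position, plans


if __name__ == "__main__":
    final_position, plan_count = run("reach right", seconds=1.0)
    print(final_position, plan_count)
```

The point of the split is that the expensive reasoning step runs far less often than the motor loop, which keeps acting on the most recent plan between updates.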
Real-time Experiments with an AI Co-Scientist - Stefania Druga, fmr. Google Deepmind
AI Engineer· 2025-07-28 16:29
[Music] My name is Stefania. I'm so glad you made it until the yeah uh last day of the conference and came to the robotics track. So, we're going to start with a live demo uh and then we'll switch to the presentation just like to to kind of like swap things around. So, I'm going to try to connect the microscope over here. Uh and let's see the other camera and some sensors. So, my talk is about real time uh science co-scientist. So, think about pair programmers. How many of you use any form of copilot for codi ...
Scaling AI Agents Without Breaking Reliability — Preeti Somal, Temporal
AI Engineer· 2025-07-28 15:15
Temporal's Value Proposition for Agentic AI
- Temporal positions itself as a solution for building reliable and scalable agentic AI applications, addressing the inherent unreliability of LLMs and the complexity of distributed systems [2][3][7]
- The company's mission is to let developers outsource reliability and scalability so they can focus on business logic [7][8]
- Temporal offers language-idiomatic SDKs and handles plumbing code, ensuring reliable process execution and providing guardrails [8][9]
- Customers are currently running agents on Temporal at scale in production, experiencing agility and speed in development [12][13]

Technical Architecture and Impact
- Temporal introduces a workflow abstraction to simplify complex interactions between LLMs, chat history databases, and tools [15][17]
- Using Temporal can accelerate development, with case studies showing feature delivery velocity improvements of over 6x [18]
- Temporal Cloud handles the heavy lifting of reliability and scalability, while the agent's code runs in the user's environment [28][29]
- Temporal is an open-source product, offering a code exchange for developers to explore and utilize [31]

Customer Success and Adoption
- Dust and other companies are building agents on top of Temporal [11]
- Gorgias, a customer service provider for brands like Reebok and Timbuk2, uses AI agents built on Temporal [11][12]
- A consumer application customer is scaling with events without needing to handle scale logic [19]
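To show why a workflow abstraction helps with unreliable LLM and tool calls, here is a toy sketch of durable execution in plain Python: each completed step is recorded in a history, failed steps are retried, and a re-run replays recorded results instead of calling the step again. This is not Temporal's SDK or API; `run_workflow`, the in-memory `history` dict, and the demo steps are all hypothetical, and a real system would persist the history outside the process.

```python
from typing import Any, Callable

Step = tuple[str, Callable[[dict[str, Any]], Any]]


def run_workflow(steps: list[Step], history: dict[str, Any],
                 max_attempts: int = 3) -> dict[str, Any]:
    """Run steps in order, retrying failures and skipping recorded steps."""
    results: dict[str, Any] = {}
    for name, fn in steps:
        if name in history:
            # Replay: reuse the recorded result, do not call the step again.
            results[name] = history[name]
            continue
        for attempt in range(1, max_attempts + 1):
            try:
                value = fn(results)
                break
            except Exception:
                if attempt == max_attempts:
                    raise  # out of retries, surface the failure
        history[name] = value  # record only after the step succeeds
        results[name] = value
    return results


# Demo steps: a stand-in "LLM" call and a tool that fails once, then works.
CALLS = {"llm": 0, "tool": 0}


def fake_llm(results: dict[str, Any]) -> str:
    CALLS["llm"] += 1
    return "look up order 42"


def flaky_tool(results: dict[str, Any]) -> str:
    CALLS["tool"] += 1
    if CALLS["tool"] == 1:
        raise TimeoutError("transient failure")
    return f"done: {results['llm']}"


STEPS: list[Step] = [("llm", fake_llm), ("tool", flaky_tool)]

if __name__ == "__main__":
    history: dict[str, Any] = {}
    print(run_workflow(STEPS, history))
    print(run_workflow(STEPS, history))  # replayed from history
    print(CALLS)
```

The agent code itself stays a plain sequence of steps; retries and recovery live in the runner, which is the division of labor the summary describes.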
Government Agents: AI Agents vs Tough Regulations — Mark Myshatyn, Los Alamos National Laboratory
AI Engineer· 2025-07-28 04:15
https://www.linkedin.com/in/markmyshatyn/ ...
Ship Agents that Ship: A Hands-On Workshop - Kyle Penfound, Jeremy Adams, Dagger
AI Engineer· 2025-07-27 22:30
Coding agents are transforming how software gets built, tested, and deployed, but engineering teams face a critical challenge: how to embrace this automation wave without sacrificing trust, control, or reliability. In this 80-minute workshop, you’ll go beyond toy demos and build production-minded AI agents using Dagger, the programmable delivery engine designed for real CI/CD and AI-native workflows. Whether you're debugging failures, triaging pull requests, generating tests, or shipping features, you'll le ...
The AI Engineer’s Guide to Raising VC — Dani Grant (Jam), Chelcie Taylor (Notable)
AI Engineer· 2025-07-27 18:00
A no-fluff, all-tactics discussion. More AI engineers should build startups; the world needs more software. But there’s a way to raise VC, and it’s hard to do if you’ve never seen it done. We are going to walk through the exact playbook to raise your first round of funding. We will show you real pitch decks, real cold emails and real term sheets so when you go out to raise your first round of funding, you are set up to do it. Every AI Engineer should be equipped to start their own company and this session mak ...