AI Engineer
How agents will unlock the $500B promise of AI - Donald Hruska, Retool
AI Engineer· 2025-07-23 15:51
AI Market Growth & Trends
- AI infrastructure spending has reached $0.5 trillion, yet many companies are still limited to basic chatbots and code generation [2]
- Anthropic's annualized revenue tripled in 5 months, reaching $3 billion by the end of May [3]
- OpenAI is projected to reach $12 billion in revenue by the end of 2025, a 3x increase over the previous year, driven by enterprise AI spending [4]
- The cost per token for AI inference dropped by 99.7% from 2022 to 2024 [33]
- Google searches for "AI agents" increased 11x in the last 16 months [34]

Retool's Agentic AI Solution
- Retool is entering agentic AI with the release of Retool Agents, which lets enterprises build agents with guardrails that integrate into production systems [2]
- Retool customers have automated over 100 million hours of work, freeing up human potential [31]
- Retool's cheapest agent is priced at $3 per hour [33]

Agent Development Strategies
- Companies have four options for agent development: building from scratch, using a framework like LangGraph, using an agent platform like Retool Agents, or using verticalized agents [16][17][18][19]
- The decision to build or buy agents depends on whether the agent is part of the core product, involves regulated data, or is a commodity workflow needed quickly [21]
- When considering a managed platform, evaluate the breadth of connectors, built-in permissioning, compliance, audit trails, and observability [22][23]

Enterprise Considerations for AI Agents
- Enterprises need to consider single sign-on, role-based access control, secure integration with external services, audit logs, compliance, and internationalization when deploying AI agents [13][14]
- Risks of using AI-generated code in production include hallucinations, unpredictable results, security vulnerabilities, and cost overruns [15]
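To put two of the market figures above in more familiar terms, the short sketch below converts a percentage price drop into a "times cheaper" factor and a revenue multiple into an implied monthly growth rate. The function names are illustrative, and the monthly rate assumes smooth compounding, which the talk summary does not claim.

```python
def price_reduction_factor(percent_drop: float) -> float:
    """Return how many times cheaper something is after a percentage drop."""
    remaining = 1.0 - percent_drop / 100.0
    if remaining <= 0:
        raise ValueError("percent_drop must be below 100")
    return 1.0 / remaining


def implied_monthly_growth(multiple: float, months: int) -> float:
    """Return the compound monthly growth rate that yields `multiple` over `months`.

    Assumption: growth compounds evenly each month.
    """
    if multiple <= 0 or months <= 0:
        raise ValueError("multiple and months must be positive")
    return multiple ** (1.0 / months) - 1.0


if __name__ == "__main__":
    # A 99.7% drop in cost per token leaves 0.3% of the original price.
    print(f"{price_reduction_factor(99.7):.0f}x cheaper")
    # Tripling in 5 months, expressed as a compound monthly rate.
    print(f"{implied_monthly_growth(3, 5):.1%} per month")
```

Read this way, a 99.7% drop means inference costs roughly one three-hundredth of what it did, and tripling in five months corresponds to roughly a quarter more revenue each month under the compounding assumption.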
How Intuit uses LLMs to explain taxes to millions of taxpayers - Jaspreet Singh, Intuit
AI Engineer· 2025-07-23 15:51
Intuit's Use of LLMs in TurboTax
- Intuit processed 44 million tax returns for tax year 2023, aiming to give users high confidence in their filings and ensure they receive the best deductions [2]
- Intuit's GenAI experiences are built on GenOS, a proprietary generative OS platform designed to address the limitations of out-of-the-box tooling, especially around regulatory compliance, safety, and security in the tax domain [4][5]
- Intuit uses Anthropic's Claude for static queries related to tax refunds and OpenAI's GPT-4 for dynamic question answering, such as user-specific tax inquiries [9][10][12]
- Intuit is one of the biggest users of Claude, with a multi-million-dollar contract [9][10]

Development and Evaluation
- Intuit emphasizes a phased evaluation system, starting with manual evaluations by tax analysts and transitioning to automated evaluations using LLM-as-a-judge [16][17]
- Tax analysts also serve as prompt engineers, applying their expertise to evaluation and prompt design [16][17]
- Key evaluation pillars are accuracy, relevancy, and coherence, with a strong focus on tax accuracy [20][24]
- Intuit uses AWS Ground Truth to create golden datasets for evaluations [22]

Challenges and Learnings
- LLM contracts are expensive; long-term contracts are slightly cheaper but create vendor lock-in [25][26]
- LLMs have higher latency (3-10 seconds) than backend services, and peak tax season can make it worse [27][28]
- Intuit employs safety guardrails and ML models to prevent hallucinated numbers in LLM responses, ensuring data accuracy [40][41]
- Graph RAG outperforms regular RAG in providing personalized and helpful answers to users [42][43]
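The summary does not describe how Intuit's guardrails or ML models work. As one simple illustration of the idea of blocking hallucinated numbers, the sketch below checks that every number in a generated response appears in an allow-list of values taken from the user's actual data. The function names and the regex approach are assumptions made for this example, not Intuit's method.

```python
import re
from decimal import Decimal
from typing import Iterable, Union

# Matches integers and decimals, with optional thousands separators.
_NUMBER = re.compile(r"\d[\d,]*(?:\.\d+)?")


def extract_numbers(text: str) -> set[Decimal]:
    """Return every numeric value found in the text as a Decimal."""
    return {Decimal(token.replace(",", "")) for token in _NUMBER.findall(text)}


def find_unsupported_numbers(
    response: str, allowed_values: Iterable[Union[int, float, str]]
) -> list[Decimal]:
    """Return numbers in the response that are not in the allow-list.

    Hypothetical guardrail: a non-empty result means the response
    should be blocked or regenerated.
    """
    allowed = {Decimal(str(value)) for value in allowed_values}
    return sorted(extract_numbers(response) - allowed)


if __name__ == "__main__":
    facts = [2400, "315.50"]  # values taken from the user's return (made up here)
    ok = "Your refund is $2,400 after a $315.50 credit."
    bad = "Your refund is $2,900."
    print(find_unsupported_numbers(ok, facts))
    print(find_unsupported_numbers(bad, facts))
```

An exact-match allow-list like this is deliberately strict: it would also flag legitimate derived values such as sums or percentages, so a real system would need a way to whitelist computed figures.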
From Hype to Habit: How We’re Building an AI-First SaaS Company—While Still Shipping the Roadmap
AI Engineer· 2025-07-23 15:51
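Strategy
- Being AI-first means moving beyond adding AI features to a product and rethinking, through an AI lens, how value is planned, built, and delivered [4]
- An AI-first company needs the curiosity and agility of a startup together with the discipline of an enterprise, with both running in parallel [12]
- Companies need to balance current customer needs against investment in future AI, since over-focusing on either side means falling behind [11]
- Planning has to embrace uncertainty: learning and discovery shape the path forward, and the destination itself evolves as understanding of what is possible grows [13]

Ways of Working
- Discovery should be treated as a repeatable, intentional process, with time for experiments, hackathons, and learning built into planning cycles [19][20]
- Treat process as a product and judge it by its outcomes: if a process does not improve clarity of direction, help teams, or speed up decisions, iterate on it or remove it entirely [23]
- Shifting from speed to smart speed means building the ability to move fast with purpose, working with clarity, momentum, and adaptability [25]

People
- Becoming an AI-first company is primarily a cultural transformation that requires rethinking what great talent looks like in the AI era, not only on AI teams but across the whole company [26][27]
- Invest in T-shaped people: those with deep expertise who can also stretch broadly, prototype quickly, collaborate fluently across functions, and deliver end-to-end systems [29]
- Build AI fluency across the organization so that every team feels equipped to understand AI and confident enough to build with it [33][34]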
Machines of Buying and Selling Grace - Adam Behrens, New Generation
AI Engineer· 2025-07-23 15:51
E-commerce Evolution with AI
- E-commerce has evolved from physical stores to online platforms, and AI is now digitizing participants and their interactions, moving from static websites to merchant and consumer agents [1][2][5]
- The goal remains transaction completion, but the focus shifts to dynamic, real-time, and generative interfaces for both human and agentic consumers [6][7]

Challenges and Solutions in Agentic Commerce
- The industry faces challenges in enabling software agents to complete transactions, with solutions including delegated authentication via partners like Visa [13][14][15]
- Moving from inferred buyer intent (keyword searches, click data) to explicitly captured intent through conversation data is crucial [16]
- Merchants are exploring how to convert fuzzy intent into specific product SKUs, noting higher conversion rates, dollar values, and lifetime values from AI channels [17][18]
- Ensuring product availability across numerous stores requires moving beyond existing product feed infrastructure and web scraping towards a unified API for product data [20][21][22]
- Representing buyer and seller preferences needs to evolve from siloed data to rich context across all aspects of their lives, with market design challenges addressed by third-party institutions [23][24][26]

The Future of Retail and Brand Strategy
- Fortune 500 companies are adapting to technological shifts, with examples like Samsung evolving from a fish merchant to a technology leader [29][30]
- Brands are creating APIs and MCP servers for chat clients, abstracting complex product systems into consistent APIs [31][32]
- Companies are connecting product data with brand and design systems to experiment with generative interfaces and conversational commerce [33][34]
- Enabling payment flows for bot traffic is essential, as AI chat users demonstrate higher intent and conversion rates [35][36]
- The industry believes stores will evolve back to their original form: a conversation, with brands owning surfaces in various applications [36][40]
How to Build Planning Agents without losing control - Yogendra Miraje, FactSet
AI Engineer· 2025-07-23 15:51
[Music] Hi everyone, I'm Yogi. I work at FactSet, a financial data and software company, and today I'll be sharing some of my experience building agents. In the last few years we have seen tremendous growth in AI, and especially in the last couple of years we have been on an exponential curve of intelligence growth. And yet, when we develop AI applications, it feels like driving a monster truck through a crowded mall with a tiny joystick. So AI applications have not seen their ChatGPT moment yet. There are many reasons ...
POC to PROD: Hard Lessons from 200+ Enterprise GenAI Deployments - Randall Hunt, Caylent
AI Engineer· 2025-07-23 15:50
Core Business & Services
- Caylent builds custom solutions for clients ranging from Fortune 500 companies to startups, focusing on app development and database migrations [1][2]
- The company uses generative AI to automate business functions, such as intelligent document processing for logistics management, achieving faster and better results than human annotators [20][21]
- Caylent offers services ranging from chatbot and co-pilot development to AI agent creation, tailoring solutions to specific client needs [16]

Technology & Architecture
- The company uses multimodal search and semantic understanding of video, employing models like Nova Pro and Titan v2 to index and search video content [6][7]
- Caylent uses several databases for vector search implementations, including Postgres with pgvector and OpenSearch [13]
- The company builds AI systems on AWS, using services like Bedrock and SageMaker and custom silicon like Trainium and Inferentia for price-performance improvements of approximately 60% over Nvidia GPUs [27]

AI Development & Strategy
- Prompt engineering has proven highly effective, sometimes removing the need to fine-tune models [40]
- Context management is crucial for differentiating applications, using user data and history to make strategic inferences [33][34]
- UX design is important for mitigating slow inference, with techniques like caching and UI spinners improving the user experience [36][37]
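The caching technique mentioned above can be sketched as a small in-memory cache placed in front of a slow model call. This is a minimal illustration, not Caylent's implementation: the `ResponseCache` class, its prompt normalization, and the `generate` callable are all assumptions made for this example.

```python
import hashlib
from typing import Callable


class ResponseCache:
    """In-memory cache that avoids repeating slow inference for repeated prompts."""

    def __init__(self, generate: Callable[[str], str]) -> None:
        # `generate` stands in for any slow model call (hypothetical).
        self._generate = generate
        self._store: dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(prompt: str) -> str:
        # Normalize case and whitespace so trivially different prompts share an entry.
        normalized = " ".join(prompt.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get(self, prompt: str) -> str:
        key = self._key(prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = self._generate(prompt)
        self._store[key] = result
        return result


if __name__ == "__main__":
    def slow_model(prompt: str) -> str:
        return f"answer to: {prompt}"

    cache = ResponseCache(slow_model)
    cache.get("What is pgvector?")
    cache.get("what is   pgvector?")  # served from the cache
    print(cache.hits, cache.misses)
```

Exact-match caching only helps when prompts repeat; applications with user-specific context would need to include that context in the cache key or skip caching for those requests.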
From Copilot to Colleague: Building Trustworthy Productivity Agents for High-Stakes Work - Joel Hron
AI Engineer· 2025-07-23 12:15
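AI Transformation and Strategy
- The industry's North Star is shifting from "helpfulness" to "productivity", which requires AI systems to produce outputs and make decisions [1][7]
- Agentic AI is viewed as an adjustable spectrum, with levers such as autonomy, context, memory, and coordination tuned to the use case [9][10][11][12][13]
- When building agentic AI systems, look at the whole problem rather than over-focusing on the MVP (minimum viable product): build the complete system first, then optimize [21][31]

Industry Applications and Technology
- Thomson Reuters has 4,500 domain experts and more than 1.5 terabytes of proprietary content supporting its software products [4]
- Thomson Reuters invests more than $200 million a year in AI product development [5]
- Decomposing traditional applications and exposing their components as tools for agents gives legacy systems new life [20][31]

Evaluation and Challenges
- Evals are the hardest part of AI development: users expect determinism, which is at odds with how AI systems work [15]
- Human evaluation results are highly variable: even the same group of domain experts can vary by more than 10% when evaluating the same data [15]
- In systems with higher agency, citing source material becomes more challenging; agents can drift, and the cause is hard to trace [17]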
How to Hire AI Engineers when EVERYONE is cheating with AI — Beth Glenfield, DevDay
AI Engineer· 2025-07-22 19:55
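AI's Impact on Technical Hiring
- AI cheating services are on the rise; Cluely, for example, raised $5.3 million in funding and reached close to $1 million in annual recurring revenue (ARR) [4]
- In Google and Meta interviews, 93% of candidates pass the LeetCode questions, which suggests traditional algorithm questions are losing their power to differentiate [4]
- AI assistants show up in more and more interviews, so the interview ends up testing who has the best AI coding assistant [4]
- Salesforce announced it will not hire software engineers this year because AI has raised its productivity by 30% [5]

New Directions for Technical Hiring
- The industry needs people with creative problem solving, collaborative leadership, and the ability to work alongside AI, not people who are merely good at LeetCode [7][8]
- Observe how candidates collaborate with AI teammates in realistic business scenarios instead of asking them to solve puzzles they will never face on the job [8]
- Measure how candidates delegate tasks, handle ambiguity, and respond to changing requirements rather than how well they memorize algorithms [9]
- DevDay is reimagining the technical hiring process for the AI era by building realistic workplace simulations to assess candidates [10][11]
- Measure candidates' ability to collaborate with AI, handle ambiguity, communicate technical decisions, and mentor others [12][13]

Hiring Challenges and Responses
- Small companies are under heavy pressure in the war for talent because they cannot match the salaries and brand pull of Google or Meta [3][7][13]
- Hiring someone who is not up to AI product development can cost a company $20,000 to $60,000 [14]
- Future engineering roles will require creativity, collaboration, and the ability to act on business judgment, not just coding [15]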
Stateful environments for vertical agents — Josh Purtell, Synth Labs
AI Engineer· 2025-07-22 19:52
Hey All - gave a talk on building stateful environments for vertical agents at AI Tinkerers and ppl really liked it, happy to do again. Here's the repo - general code that endows environments like Pokemon Red, Minecraft, SWE-bench, and others with the same interface for development and agent training. github.com/synth-laboratories/Environments Recorded at the AI Engineer World's Fair in San Francisco. Stay up to date on our upcoming events and content by joining our newsletter here: https://www.ai.engineer/ ...
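The idea of giving very different environments "the same interface" can be sketched roughly as follows. This is not the API of the linked repository, which is not shown here: the `Environment` protocol, `StepResult`, `CounterEnv`, and `run_episode` are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Protocol


@dataclass
class StepResult:
    observation: Any
    reward: float
    done: bool


class Environment(Protocol):
    """Hypothetical shared interface: any stateful environment exposes reset and step."""

    def reset(self) -> Any: ...

    def step(self, action: Any) -> StepResult: ...


class CounterEnv:
    """Toy stateful environment: the agent adds +1 or -1 until it hits a target."""

    def __init__(self, target: int = 3) -> None:
        self.target = target
        self.value = 0

    def reset(self) -> int:
        self.value = 0
        return self.value

    def step(self, action: int) -> StepResult:
        self.value += action  # state persists between steps
        done = self.value == self.target
        return StepResult(self.value, 1.0 if done else 0.0, done)


def run_episode(
    env: Environment, policy: Callable[[Any], Any], max_steps: int = 10
) -> float:
    """Run one episode against any environment that satisfies the interface."""
    observation = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        result = env.step(policy(observation))
        total_reward += result.reward
        observation = result.observation
        if result.done:
            break
    return total_reward


if __name__ == "__main__":
    print(run_episode(CounterEnv(target=3), policy=lambda obs: 1))
```

Because `run_episode` depends only on `reset` and `step`, the same loop could in principle drive a game emulator or a coding benchmark, which is the benefit a shared interface is meant to provide for both development and agent training.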
Books reimagined: use AI to create new experiences for things you know — Lukasz Gandecki, Xolvio
AI Engineer· 2025-07-22 19:50
[Music] So, my name is Lukasz Gandecki, and I've been programming since I was a little kid, and I want to tell you about my newest project, um, Books Reimagined: how to use AI to create new experiences for things you already know. So, how it all started: I was reading a book about, uh, Donald Trump's re-election, and since, as you can hear, I'm not from the United States, um, there were a few too many characters for me. I didn't follow everyone. Uh, so I decided to vibe code my way through the understanding. I ...