AI Engineer
Useful General Intelligence — Danielle Perszyk, Amazon AGI
AI Engineer· 2025-08-02 13:15
We’re all hearing that AI agents will enable AGI, but they can’t yet reliably perform even basic computer tasks. It turns out that getting AI to click, type, and scroll is more challenging than getting it to generate code. How can we build general-purpose agents that can do anything we can do on a computer? This is our goal at the Amazon AGI SF Lab. In this talk, I’ll propose a new approach to agents that we call Useful General Intelligence. After describing how we’re solving the biggest challenges in compu ...
The 2025 AI Engineering Report — Barr Yaron, Amplify
AI Engineer· 2025-08-01 22:51
AI Engineering Landscape
- The AI engineering community is broad, technical, and growing, with the "AI Engineer" title expected to gain more ground [5]
- Many seasoned software developers are AI newcomers: nearly half of those with 10+ years of experience have worked with AI for three years or less [7]

LLM Usage and Customization
- Over half of respondents use LLMs for both internal and external use cases, with OpenAI models dominating external, customer-facing applications [8]
- LLM users apply them across multiple use cases, with 94% using them for at least two and 82% for at least three [9]
- Retrieval-Augmented Generation (RAG) is the most popular customization method, used by 70% of respondents [10] (a minimal sketch follows this summary)
- Parameter-efficient fine-tuning methods like LoRA/Q-LoRA are strongly preferred, mentioned by 40% of fine-tuners [12]

Model and Prompt Management
- Over 50% of respondents update their models at least monthly, with 17% doing so weekly [14]
- 70% of respondents update prompts at least monthly, and 10% do so daily [14]
- A significant 31% of respondents lack any system for managing their prompts [15]

Multimodal AI and Agents
- Image, video, and audio usage lag text usage significantly, indicating a "multimodal production gap" [16][17]
- Audio has the highest intent to adopt among those not currently using it, with 37% planning to eventually adopt audio [18]
- While 80% of respondents say LLMs are working well, less than 20% say the same about agents [20]

Monitoring and Evaluation
- Most respondents use multiple methods to monitor their AI systems, with 60% using standard observability and over 50% relying on offline evaluation [22]
- Human review remains the most popular method for evaluating model and system accuracy and quality [23]
- 65% of respondents are using a dedicated vector database [24]

Industry Outlook
- The mean guess for the percentage of the US Gen Z population that will have AI girlfriends/boyfriends is 26% [27]
- Evaluation is the number one most painful thing about AI engineering today [28]
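The report measures adoption rather than mechanics, but as a rough illustration of what RAG-style customization involves, here is a minimal, self-contained sketch. The embedding function and documents are toy stand-ins invented for this example; a production system would use a real embedding model and a dedicated vector database (which 65% of respondents report using).

```python
# Minimal, self-contained sketch of retrieval-augmented generation (RAG).
# The embedder below is a toy bag-of-words hasher for illustration only.
import numpy as np

def embed(text: str, dim: int = 256) -> np.ndarray:
    """Toy hashing embedder: maps text to a normalized vector."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

# Hypothetical knowledge-base chunks.
documents = [
    "Our refund policy allows returns within 30 days.",
    "Enterprise plans include SSO and audit logs.",
    "The API rate limit is 100 requests per minute.",
]
doc_vectors = np.stack([embed(d) for d in documents])

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query (cosine similarity)."""
    scores = doc_vectors @ embed(query)
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

def build_prompt(query: str) -> str:
    """Augment the user query with retrieved context before calling an LLM."""
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("What is the API rate limit?"))
```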
Agents vs Workflows: Why Not Both? — Sam Bhagwat, Mastra.ai
AI Engineer· 2025-08-01 16:00
Okay. Agents or workflows. Why not both? Thank you, Alex, for the nice intro. Like he said, I used to be the co-founder of Gatsby. I wrote a book called Principles of AI Agents, which is floating around; hopefully many of you have gotten a copy, and we have more around the conference. There was a big debate on Twitter a couple of months ago over this question, which people may have noticed, and which I just referenced. I think that's a big reason why we're ...
Why We Don’t Need More Data Centers - Dr. Jasper Zhang, Hyperbolic
AI Engineer· 2025-08-01 15:00
Nice meeting you guys. Great to be here. I'm here to present Hyperbolic, which is an AI cloud for developers, and my topic is why we don't need more data centers. It's a very eye-catching title, but what I want to clarify is that I still think building data centers is important; it's just that building data centers alone can't solve the problem. Before we get started, let me introduce myself. I'm Jasper, the CEO and co-founder of Hyperbolic. I did my math PhD at UC Berkele ...
Flipping the Inference Stack — Robert Wachen, Etched
AI Engineer· 2025-08-01 14:30
Flipping the Inference Stack: Why GPUs Bottleneck Real Time AI at Scale. Current AI inference systems rely on brute-force scaling: adding more GPUs for each user, which creates unsustainable compute demands and spiraling costs. Real-time use cases are bottlenecked by their latency and costs per user. In this talk, AI hardware expert and founder Robert Wachen will break down why the current approach to inference is not scalable, and how rethinking hardware is the only way to unlock real-time AI at scale.
Infrastructure for the Singularity — Jesse Han, Morph
AI Engineer· 2025-08-01 14:30
We're at an inflection point where AI agents are transitioning from experimental tools to practical coworkers. This new world will demand new infrastructure for RL training, test-time scaling, and deployment. This is why Morph Labs developed Infinibranch last year, and we are excited to finally unveil what's next. About Jesse Han Jesse Han is the Founder and CEO of Morph Labs, a company building the infrastructure for the singularity. Morph is the creator of Infinibranch, a breakthrough in cloud technology ...
Hacking the Inference Pareto Frontier - Kyle Kranen, NVIDIA
AI Engineer· 2025-08-01 13:45
Your model works! It aces the evals! It even passes the vibe check! All that's required is inference, right? Oops, you've just stepped into a minefield:
- Not low-latency enough? Choppy experience. Users churn from your app.
- Not cheap enough? You're losing money on every query.
- Not high enough output quality? Your system can't be used for that application.
A model and the inference system around it form a "token factory" associated with a Pareto frontier: a curve representing the best possible trade-offs b ...
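To make the Pareto-frontier framing concrete, here is a short sketch that filters a set of deployment configurations down to the ones offering the best possible trade-offs across latency, cost, and quality. The configurations and numbers are invented for illustration, not real benchmarks.

```python
# Toy sketch: find Pareto-optimal inference configurations.
# Each config is (name, latency_ms, cost_per_1k_tokens, quality_score);
# lower latency and cost are better, higher quality is better.
configs = [
    ("small-model, 1 GPU",   120, 0.10, 0.78),
    ("small-model, 2 GPUs",  130, 0.20, 0.78),  # dominated: slower and pricier than 1 GPU
    ("small-model, batched", 300, 0.04, 0.78),
    ("large-model, 1 GPU",   450, 0.40, 0.91),
    ("large-model, 4 GPUs",  180, 0.95, 0.91),
]

def dominates(a, b) -> bool:
    """a dominates b if it is at least as good on every axis and strictly better on one."""
    _, la, ca, qa = a
    _, lb, cb, qb = b
    no_worse = la <= lb and ca <= cb and qa >= qb
    strictly_better = la < lb or ca < cb or qa > qb
    return no_worse and strictly_better

# The Pareto frontier is the set of configs not dominated by any other config.
pareto = [c for c in configs if not any(dominates(other, c) for other in configs)]
for name, latency, cost, quality in pareto:
    print(f"{name}: {latency} ms, ${cost}/1k tokens, quality {quality}")
```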
Pipecat Cloud: Enterprise Voice Agents Built On Open Source - Kwindla Hultman Kramer, Daily
AI Engineer· 2025-07-31 18:56
Hi everybody. My name is Kwin. I'm a co-founder of a company called Daily; Daily's other founder, Nina, is in the back there. I'm stepping in for my colleague Mark, who couldn't make it today, so we're going to do this fast and very informally, but I think that's a good way to do it at an engineering conference. I don't have as much code to show as the last awesome presentation, but I'll try to show a little bit. We're going to talk about building voice agents today. I work on an open source vendor n ...
[Voice Keynote] Your realtime AI is ngmi — Sean DuBois (OpenAI), Kwindla Kramer (Daily)
AI Engineer· 2025-07-31 16:00
Sean DuBois of OpenAI and Pion, and Kwindla Hultman Kramer of Daily and Pipecat, will talk about why you have to design realtime AI systems from the network layer up. Most people who build realtime AI apps and frameworks get it wrong. They build from either the model out or the app layer down. But unless you start with the network layer and build up, you'll never be able to deliver realtime audio and video streams reliably. And perhaps even worse, you'll get core primitives wrong: interruption handling, con ...
Why ChatGPT Keeps Interrupting You — Dr. Tom Shapland, LiveKit
AI Engineer· 2025-07-31 16:00
ChatGPT Advanced Voice Mode isn’t interrupting just you. Interruptions, and turn-taking in general, are unsolved problems for all Voice AI agents. Nobody likes being cut short – and people have much less patience for machines than they do for other humans. Turn-taking failures take many forms (e.g., the agent interrupts the user, the agent mistakes a cough for an interruption), and all of them lead to users immediately hanging up the phone. In this talk, we use human conversation as a framework for understa ...
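To make those failure modes concrete, here is a naive baseline, not LiveKit's method: end-of-turn detection driven purely by a silence timeout from a voice-activity detector. The thresholds and frame loop are invented for illustration; this is exactly the kind of heuristic that cuts off a user who pauses mid-sentence or treats a cough as speech.

```python
# Naive end-of-turn detection: the agent takes its turn after N ms of VAD silence.
# A thoughtful pause longer than the timeout triggers a false interruption.
from dataclasses import dataclass

SILENCE_TIMEOUT_MS = 700  # illustrative threshold before the agent starts talking

@dataclass
class TurnDetector:
    last_speech_ms: float = 0.0

    def on_vad_frame(self, timestamp_ms: float, is_speech: bool) -> bool:
        """Returns True when the agent should take its turn."""
        if is_speech:
            self.last_speech_ms = timestamp_ms
            return False
        return (timestamp_ms - self.last_speech_ms) >= SILENCE_TIMEOUT_MS

detector = TurnDetector()
# User speaks for 600 ms, then pauses mid-sentence; the agent barges in anyway.
frames = [(t, t < 600) for t in range(0, 1500, 100)]
for t, is_speech in frames:
    if detector.on_vad_frame(t, is_speech):
        print(f"agent starts speaking at {t} ms (user may not be done)")
        break
```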