Real-time AI
Flipping the Inference Stack — Robert Wachen, Etched
AI Engineer · 2025-08-01 14:30
Flipping the Inference Stack: Why GPUs Bottleneck Real Time AI at Scale. Current AI inference systems rely on brute-force scaling, adding more GPUs for each user, which creates unsustainable compute demands and spiraling costs. Real-time use cases are bottlenecked by per-user latency and cost. In this talk, AI hardware expert and founder Robert Wachen will break down why the current approach to inference is not scalable, and how rethinking hardware is the only way to unlock real-time AI at scale. ...
Realtime Conversational Video with Pipecat and Tavus — Chad Bailey and Brian Johnson, Daily & Tavus
AI Engineer · 2025-06-27 10:30
[Music] We're here to talk about real-time conversational video, uh, with Pipecat, that's me, and with Tavus, that's Brian. We'll introduce ourselves a little bit more, but in the interest of keeping it moving, let's talk about what we're here for. If anybody, have any of you ever seen one of these robot concierge things? Do they work? No, they don't. They're terrible, right? Um, it's actually possible nowadays to build this kind of thing, but actually good. Um, it's a little bit tricky, but that's what we're he ...