Real-time AI
Flipping the Inference Stack — Robert Wachen, Etched
AI Engineer· 2025-08-01 14:30
Scalability Challenges in AI Inference
- Current AI inference systems rely on brute-force scaling, adding more GPUs per user, leading to unsustainable compute demands and spiraling costs [1]
- Real-time use cases are bottlenecked by latency and cost per user [1]
Proposed Solution
- Rethinking hardware is the only way to unlock real-time AI at scale [1]
Key Argument
- The current approach to inference is not scalable [1]
From Vegas to Velocity: VIP Play, Inc. CEO Les Ottolenghi Reveals the Next Wave of Real-Time Intelligence at AI4 2025
Prnewswire· 2025-07-29 15:00
Company Overview
- VIP Play, Inc. is focused on pioneering mobile sports wagering and operates a proprietary technology platform in Tennessee, holding an interim iGaming and mobile sports-betting license in West Virginia [5]
- The company offers a modern sportsbook with unique wager offerings, sweepstakes contests, and social features, leveraging cloud-native architecture and an AI-driven product roadmap [5]
Leadership and Innovation
- Les Ottolenghi, CEO of VIP Play, Inc., is recognized for his expertise in innovative and disruptive technologies, having transitioned from Chief Commercial and Transformation Officer at Lee Enterprises to lead VIP Play [2]
- Ottolenghi will present at the AI4 2025 Conference, focusing on how real-time AI technologies are transforming player interaction in gaming and entertainment [1][4]
Strategic Framework
- At the AI4 conference, Ottolenghi will introduce the Information Refinery Model, a strategic enterprise framework aimed at delivering scalable, intelligent experiences in milliseconds [3]
- This model is based on Ottolenghi's extensive experience in gaming technology, including his previous roles at Caesars Entertainment and Las Vegas Sands [3]
Industry Insights
- The keynote will emphasize that speed is crucial for engagement in the gaming and tech industries, highlighting how real-time systems and predictive modeling can create new revenue and retention models [4]
- The AI4 conference is a significant global forum for applied AI, attracting leaders from gaming, streaming, fintech, and mobile who are interested in deploying real-time AI at scale [4]
Realtime Conversational Video with Pipecat and Tavus — Chad Bailey and Brian Johnson, Daily & Tavus
AI Engineer· 2025-06-27 10:30
Core Technology & Products
- Tavus offers a conversational video interface, an end-to-end pipeline for conversations with AI replicas, with a response time of around 600 milliseconds [9]
- Tavus's proprietary models, Sparrow Zero and Raven Zero, are being integrated into Pipecat [10][11]
- Pipecat is an open-source framework designed as an orchestration layer for real-time AI, handling input, processing, and output of media [15][18]
- Pipecat uses frames, processors, and pipelines to manage data flow, with processors handling frames of audio, video, or voice activity detection [23][24] (see the sketch after this list)
Strategic Partnership & Integration
- Tavus and Pipecat are partnering to enhance conversational AI, leveraging Pipecat's capabilities for real-time observability and control [8]
- Enterprise customers already using Pipecat want Tavus's technology available within it, which is why Tavus is moving its best models into Pipecat [39]
- Tavus is integrating its Phoenix rendering model, along with its turn-taking, response-timing, and perception models, into Pipecat [39][40]
Future Development & Deployment
- Tavus is developing a multilingual turn detection model to improve conversational AI speed and prevent interruptions [41]
- Tavus is working on a response timing model that adjusts response speed based on conversation context [42][43]
- Tavus's multimodal perception model will analyze emotions and surroundings to provide more nuanced conversational flow [44]
- Pipecat Cloud offers a solution for deploying bots at scale, simplifying the process without requiring Kubernetes expertise [49]
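To make the frames/processors/pipelines model concrete, here is a minimal sketch of a custom frame processor wired into a pipeline: it uppercases text frames and passes every other frame through unchanged, then runs the pipeline with a task and runner. The module paths, class names, and the UppercaseProcessor example itself are assumptions based on Pipecat's public Python API as I recall it, not code shown in the talk; verify them against the Pipecat version you install.

```python
import asyncio

from pipecat.frames.frames import EndFrame, Frame, TextFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor


class UppercaseProcessor(FrameProcessor):
    """Hypothetical processor: uppercases TextFrames, passes other frames through."""

    async def process_frame(self, frame: Frame, direction: FrameDirection):
        await super().process_frame(frame, direction)
        if isinstance(frame, TextFrame):
            # Replace the frame with a transformed copy before pushing it downstream.
            frame = TextFrame(text=frame.text.upper())
            print(f"processed: {frame.text}")
        await self.push_frame(frame, direction)


async def main():
    # A pipeline is an ordered list of processors; frames flow through them in order.
    pipeline = Pipeline([UppercaseProcessor()])
    task = PipelineTask(pipeline)

    # Queue a text frame, then an EndFrame so the run terminates cleanly.
    await task.queue_frames([TextFrame(text="hello real-time ai"), EndFrame()])
    await PipelineRunner().run(task)


if __name__ == "__main__":
    asyncio.run(main())
```

In a real voice or video agent, transport, speech-to-text, LLM, and text-to-speech services occupy the same pipeline as additional processors, and that processor slot is where models such as Tavus's Phoenix, Sparrow, and Raven are being integrated.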