Speech to Text
How to debug voice agents with LangSmith
LangChain · 2025-12-09 21:39
Voice is one of the most natural ways to interact with AI. And as the models get better, I'm excited about the new use cases and interaction patterns they're going to unlock, especially in industries like education and customer service. It's surprisingly easy to get started building a voice agent, so let's go through that in this video. I'm Tannushri and I'm going to show you how to build a voice agent, specifically a French tutor, with a framework called Pipecat. I'm going to walk through how it wor ...
Express | ElevenLabs releases a standalone speech detection model for fine-grained speech understanding and transcription
Z Potentials · 2025-02-27 04:09
Core Viewpoint
- ElevenLabs, best known for its audio generation capabilities, has raised $180 million in funding and is now entering the speech detection market, where it competes with Gladia, Speechmatics, AssemblyAI, Deepgram, and OpenAI's Whisper model [1][2].

Group 1: Company Overview
- ElevenLabs is valued at $3.3 billion and has a large voice library that supports various enterprises in providing speech-to-text services [1].
- The company has launched its first standalone speech-to-text model, Scribe, which supports over 99 languages, with more than 25 categorized as having "excellent accuracy" (word error rate below 5%) [1].

Group 2: Model Performance
- In benchmark tests, Scribe outperformed Google Gemini 2.0 Flash and Whisper Large V3 across multiple languages [2].
- The model features intelligent speaker separation, provides word-level timestamps, and automatically marks sound events such as audience laughter [3].

Group 3: Pricing and Availability
- Scribe is currently priced at $0.40 per hour of transcribed audio, which is competitive, although some competitors offer lower prices with different functionality [3].
- The model currently supports only pre-recorded audio, with a low-latency real-time version planned for release soon [3].
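The "excellent accuracy" threshold above is stated as a word error rate (WER) below 5%. The brief does not say how ElevenLabs measures it, so the following is only a minimal sketch of the conventional definition: the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference, divided by the number of reference words. The function name `word_error_rate` is hypothetical, and the sketch assumes plain whitespace tokenization with no text normalization (casing, punctuation), which real benchmarks usually apply first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Conventional WER: word-level edit distance / number of reference words.

    Assumption: words are split on whitespace and compared exactly;
    no normalization of case or punctuation is applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
print(f"{wer:.3f}")  # 0.167
```

Under this definition, a transcript would fall below the 5% mark only if it contained fewer than one word-level error per twenty reference words on average.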