OpenAI Releases End-to-End Speech Model GPT-Realtime to Help Developers Build Voice Agents
36Kr · 2025-08-30 16:34
Core Insights
- OpenAI has launched its most advanced end-to-end speech model, GPT-Realtime, which aims to give developers a more efficient and cost-effective way to build voice agents [1][3][11]
- Pricing for GPT-Realtime has been reduced by 20% compared to the previous model, GPT-4o-Realtime-Preview [1][11]
- The new model demonstrates substantial improvements in performance, including better audio quality, expressiveness, and the ability to follow complex instructions [3][5][7][10]

Pricing and Cost Efficiency
- GPT-Realtime is priced at $32 per million audio input tokens and $64 per million audio output tokens, compared to the previous model's $40 and $80 respectively [1]
- The new pricing structure allows developers to create efficient voice agents at a lower cost while enjoying superior performance [1]

Model Performance Enhancements
- GPT-Realtime achieved an accuracy of 82.8% in the Big Bench Audio reasoning test, up from 65.6% for the previous model [5]
- The model's instruction-following accuracy reached 30.5% in the MultiChallenge Audio test, surpassing the previous model's performance [7]
- In the ComplexFuncBench Audio test, GPT-Realtime achieved a function call accuracy of 66.5%, indicating improved capabilities in using external tools [10]

Developer Empowerment and API Upgrades
- The Realtime API has reached production-level standards, allowing for direct audio processing and reducing latency [11]
- New features include support for remote Model Context Protocol (MCP) servers, enabling easier integration with external data sources [12]
- The API now supports image input, allowing for multimodal conversations and expanding use cases for voice agents [12]

Competitive Landscape
- The release of GPT-Realtime occurs amid intense competition in the voice AI market, with companies like Anthropic and Meta making significant advancements [13][14]
- OpenAI's enhancements aim to provide a more user-friendly and cost-effective solution, positioning the company favorably in the competitive landscape [14]
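
To make the pricing comparison concrete, the per-token rates quoted above can be turned into a rough cost estimate. The following is a minimal sketch, not an official calculator: the prices are taken from the article, while the function names (`audio_cost`, `savings_pct`) and the example token counts are hypothetical and chosen only for illustration. It also ignores any text-token, cached-input, or image charges, which the article does not cover.

```python
# Audio prices in USD per million tokens, as quoted in the article.
PRICES = {
    "gpt-realtime": {"input": 32.0, "output": 64.0},
    "gpt-4o-realtime-preview": {"input": 40.0, "output": 80.0},
}


def audio_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the audio cost in USD for a given token volume (hypothetical helper)."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


def savings_pct(old_cost: float, new_cost: float) -> float:
    """Percentage saved when moving from old_cost to new_cost."""
    return (old_cost - new_cost) / old_cost * 100


if __name__ == "__main__":
    # Illustrative workload only: 500k audio input tokens, 250k audio output tokens.
    new = audio_cost("gpt-realtime", 500_000, 250_000)
    old = audio_cost("gpt-4o-realtime-preview", 500_000, 250_000)
    print(f"gpt-realtime:            ${new:.2f}")
    print(f"gpt-4o-realtime-preview: ${old:.2f}")
    print(f"savings:                 {savings_pct(old, new):.1f}%")
```

Because both the input and output rates dropped by the same proportion ($40 to $32 and $80 to $64), the saving works out to 20% regardless of the mix of input and output tokens, which matches the figure reported in the article.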