New voice model
OpenAI consolidates teams, plans to release new voice model in Q1, paving the way for a screenless personal AI device
Zhitong Finance (智通财经网) · 2026-01-01 23:28
Core Insights
- OpenAI is optimizing its audio AI model, which it plans to release in Q1 2026, to prepare for a planned voice-driven personal device [2][3]
- The new audio model aims to improve emotional expression and real-time conversation, addressing the current voice model's shortfall in accuracy and response speed relative to text models [2][3]

Group 1: Technical Developments
- OpenAI has consolidated its engineering, product, and research teams to tackle audio interaction technology bottlenecks [2]
- The new audio model architecture is meant to generate more precise responses and handle complex scenarios such as conversation interruptions [3]
- The company is focusing on a screenless interaction model, believing that voice communication aligns more closely with human interaction instincts [3]

Group 2: Hardware and User Experience
- OpenAI plans to launch a series of screenless devices, including smart glasses and smart speakers, positioning them as "collaborative companions" rather than mere application interfaces [2][3]
- The company faces the challenge of changing user behavior: most ChatGPT users have not yet adopted voice interaction, either because the audio model's quality is insufficient or because they are unaware of the feature [4]

Group 3: Strategic Initiatives
- OpenAI has spent nearly $6.5 billion to acquire a company co-founded by former Apple design chief Jony Ive, with a focus on supply chain, industrial design, and model development [4]
- The timeline suggests that, before the product launch, OpenAI must improve the existing ChatGPT voice features to build a user base and validate the practicality of audio interaction in everyday scenarios [5]
Report: OpenAI consolidates teams, plans to release new voice model in Q1, paving the way for a screenless personal AI device
Hua Er Jie Jian Wen · 2026-01-01 22:27
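OpenAI is optimizing its audio AI model in preparation for a planned voice-driven personal device.

According to the report, the new voice model will offer more natural emotional expression and real-time conversation, including the ability to handle interruptions, a key capability that existing models lack. It is scheduled for release in the first quarter of 2026.

Citing people familiar with the matter, the report says OpenAI also plans to launch a series of screenless devices, including smart glasses and smart speakers, positioning them as "collaborative companions" for users rather than mere app entry points.

Before it launches consumer AI hardware that supports voice commands, however, OpenAI first needs to change users' habits.

On January 1, The Information reported that over the past two months OpenAI has consolidated its engineering, product, and research efforts to tackle the technical bottlenecks in audio interaction, with the goal of building a consumer device that can be operated through natural voice commands.

Team consolidation focuses on screenless interaction

Researchers inside the company believe that ChatGPT's current voice model lags behind its text model in both accuracy and response speed, and that the two use different underlying architectures.

According to the report, because OpenAI's current voice and text models are built on different architectures, users who talk to ChatGPT by voice get answers that are lower in quality and slower than those from the text model.

To address this, OpenAI completed a key team consolidation over the past two months.

At the organizational level, joining from Character.AI this summer, the ...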