Voice AI

Feicheng City breaks new ground with an English "human-machine dialogue" assessment, exploring a path that integrates educational digitalization with evaluation reform
Qi Lu Wan Bao Wang· 2025-06-30 06:43
Core Insights
- The article highlights the innovative approach taken by Feicheng City in integrating digital transformation with educational assessment, specifically through a systematic evaluation of English listening and speaking skills among students [1][5].

Group 1: Digital Transformation in Education
- Feicheng City has implemented a large-scale intelligent assessment of English speaking and listening abilities across 44 primary schools, involving 25,000 students from grades three to five [1][5].
- The assessment system utilizes AI technology to generate personalized reports that include voice diagnostics, capability analysis, and learning suggestions [5][6].

Group 2: Overcoming Challenges
- The city faced challenges such as network issues in rural schools, limited teacher familiarity with digital tools, and students' weak foundational skills [3].
- To address these, Feicheng established a special fund to upgrade network bandwidth in rural schools and implemented a three-tiered research mechanism for teacher training, achieving full digital-skill coverage for 217 English teachers [3][8].

Group 3: Impact and Results
- The success rate of the assessment system in rural schools improved from 65% to 98%, laying the groundwork for broader implementation [3].
- Data-driven teaching methods have led to significant improvements, such as a 21-percentage-point increase in phonetic reading accuracy among rural students and a reduction in the score gap for situational responses from 12.5 to 5.2 points [8].

Group 4: Future Directions
- Feicheng's practices are seen as a replicable model for high-quality educational development in rural areas, focusing on the integration of technology with educational scenarios [8].
- The city plans to expand its focus from English language assessment to a comprehensive evaluation system across all subjects, continuously exploring new educational assessment frameworks guided by core competencies [8].
The first comprehensive survey tracing the development of speech large models, accepted to the ACL 2025 main conference
机器之心· 2025-06-17 04:50
Imagine if AI could hold a spoken conversation as naturally as a human, without the cumbersome traditional pipeline of speech-to-text (ASR), text processing by a large language model (LLM), and text-to-speech (TTS), but instead understood and generated speech directly. What would that experience be like? This is the core problem that speech large models (speech language models, SpeechLMs) aim to solve.

Traditional speech interaction systems suffer from three major pain points: information loss, severe latency, and error accumulation. When speech is converted to text, paralinguistic information such as pitch, tone, and emotion is lost entirely; chaining multiple modules together produces noticeable response delays; and errors at each stage accumulate, ultimately degrading the overall result.

SpeechLMs fundamentally change this picture. They process speech end to end, preserving the rich information in the speech signal while greatly reducing latency, paving the way for truly natural spoken human-machine interaction.

First author of this article: Wenqian Cui (崔文谦), a PhD student at The Chinese University of Hong Kong, whose research focuses on speech large models, multimodal large models, and AI music generation.

The survey paper "Recent Advances in Speech Language Models: A Survey", written by a team from The Chinese University of Hong Kong, has been accepted to the ACL 2025 main conference. It is the first comprehensive and systematic survey in this field, charting the direction for the future development of speech AI.

ArXiv link: https: ...
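The contrast between the cascaded pipeline and an end-to-end SpeechLM can be made concrete with a short sketch. The classes and method names below (SpeechSegment, CascadedPipeline, SpeechLM, and the tokenizer stubs) are hypothetical placeholders illustrating the data flow described above, not any real system's API: the cascaded path collapses speech into text between stages, while the end-to-end path keeps a speech representation throughout.

```python
# Minimal sketch, assuming hypothetical component names; not a real library API.
from dataclasses import dataclass
from typing import List


@dataclass
class SpeechSegment:
    """Raw audio plus the paralinguistic cues (pitch, emotion) that a plain
    text transcript would discard."""
    samples: List[float]
    pitch_contour: List[float]
    emotion: str


class CascadedPipeline:
    """Traditional ASR -> LLM -> TTS chain: three modules in series."""

    def respond(self, speech: SpeechSegment) -> SpeechSegment:
        text = self._asr(speech)        # step 1: speech -> text (prosody and emotion are lost here)
        reply_text = self._llm(text)    # step 2: text-only reasoning
        return self._tts(reply_text)    # step 3: text -> speech (latency of all three stages adds up)

    def _asr(self, speech: SpeechSegment) -> str:
        return "transcribed text"       # placeholder for a real speech recognizer

    def _llm(self, text: str) -> str:
        return f"reply to: {text}"      # placeholder for a text LLM

    def _tts(self, text: str) -> SpeechSegment:
        return SpeechSegment(samples=[0.0], pitch_contour=[0.0], emotion="neutral")


class SpeechLM:
    """End-to-end speech language model: consumes and produces speech
    (typically as discrete audio tokens) within a single model, so prosody is
    preserved and per-stage error accumulation is avoided."""

    def respond(self, speech: SpeechSegment) -> SpeechSegment:
        tokens = self._tokenize_audio(speech)   # speech -> discrete audio tokens
        reply_tokens = self._generate(tokens)   # one autoregressive model, no text bottleneck
        return self._detokenize(reply_tokens)   # tokens -> waveform

    def _tokenize_audio(self, speech: SpeechSegment) -> List[int]:
        return [1, 2, 3]                         # placeholder token ids

    def _generate(self, tokens: List[int]) -> List[int]:
        return list(reversed(tokens))            # placeholder "generation"

    def _detokenize(self, tokens: List[int]) -> SpeechSegment:
        return SpeechSegment(samples=[0.0], pitch_contour=[0.1], emotion="friendly")


if __name__ == "__main__":
    query = SpeechSegment(samples=[0.2, -0.1], pitch_contour=[180.0, 175.0], emotion="curious")
    print(CascadedPipeline().respond(query).emotion)  # emotion was dropped at the ASR stage
    print(SpeechLM().respond(query).emotion)          # paralinguistic cues carried end to end
```

The point of the sketch is structural: in the cascaded design every stage both adds latency and narrows the signal to text, whereas the end-to-end design gives one model access to the full speech representation on both input and output.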