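Tsinghua University and Giant Network Jointly Pioneer a Multi-Dialect Speech Synthesis Framework, with Data, Code, and Methods Fully Open-Sourced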
Xin Lang Ke Ji · 2025-10-15 06:28
Core Insights
- The collaboration between Giant Network AI Lab and Tsinghua University's SATLab has led to the development of DiaMoE-TTS, an open-source framework for multi-dialect speech synthesis aimed at promoting fairness and accessibility in dialect voice synthesis [1][2]
- Existing industrial-grade models for speech synthesis often rely on large proprietary datasets, making it difficult for practitioners and researchers in dialect TTS to access necessary resources [1]
- DiaMoE-TTS is designed to be a comprehensive solution that rivals industrial-grade dialect TTS models, utilizing a unified IPA expression system based on linguistic expertise and relying solely on open-source dialect ASR data [1]

Summary by Sections
- **Development and Purpose**
  - DiaMoE-TTS is a pioneering open-source framework for multi-dialect speech synthesis created by Giant Network AI Lab and Tsinghua University [1]
  - The initiative aims to ensure that researchers, developers, and language preservationists can freely use, improve, and expand the framework [2]
- **Technical Aspects**
  - The framework is built on a unified IPA expression system and has been validated in multilingual scenarios, including English, French, German, and Dutch [1]
  - Before launching dialect versions for Cantonese, Sichuanese, and Shanghainese, the research team ensured the method's scalability and robustness across various languages [1]