Core Viewpoint
- The article covers Huoshan Engine's (Volcano Engine) latest AI voice upgrades, centered on the Doubao voice synthesis and voice replication models, which improve emotional expression and contextual understanding in AI-generated speech [5][11][41].

Group 1: AI Voice Technology Upgrades
- Huoshan Engine has upgraded its Doubao voice synthesis model to version 2.0, which expresses emotion more convincingly and understands dialogue better [7][11].
- The upgrade comprises two models, Doubao voice synthesis model 2.0 and Doubao voice replication model 2.0, which let AI replicate a voice while capturing emotional nuance [7][8].
- The new models follow user instructions on emotion, dialect, tone, and speech rate, significantly improving the quality of AI-generated speech [12][21].

Group 2: Contextual Understanding and Emotional Expression
- The models can now draw on context from earlier dialogue, making the generated speech more coherent and emotionally richer [12][23].
- Formula reading has improved: the Doubao model reads complex formulas from school subjects with around 90% accuracy, versus under 50% for comparable models [24][25].
- Together, these advances move interaction from merely sounding human to understanding human emotion and context [11][41].

Group 3: Technological Innovations and Applications
- The Doubao large model 1.6 has been upgraded to support adjustable thinking lengths, letting users balance quality, latency, and cost [30][33].
- Huoshan Engine has introduced a Smart Model Router that matches each user task to the most suitable model, cutting costs by up to 71% in cost-priority mode [39][41].
- The technology is already deployed commercially, improving the user experience in products from companies such as Xiaomi and OPPO and helping platforms such as Dongchedi respond to complex requests [45][46].

Group 4: Growth and Infrastructure
- Daily token usage of the Doubao large model has surged from 120 billion to over 30 trillion, a 253-fold increase in just over a year [47][48].
- This growth is supported by Huoshan Engine's AI cloud infrastructure, which supplies the compute and high-quality data needed for model training and inference [48].
New Doubao Model Has Guo Degang Shouting "Going-Crazy Literature": (This Job,) I'm Done! I'm Done! I'm Done!!!
QbitAI · 2025-10-16 06:11