Disco
Crushing ChatGPT: Google's Ruthless Move Can Faithfully Reproduce Even Your Sarcastic Tone
36Kr· 2025-12-15 02:04
Core Insights
- Google has launched the Gemini 2.5 Flash Native Audio model, which enables real-time voice translation while preserving tone and delivering a more natural conversational experience, marking a significant advancement in AI interaction [1][3][10].

Group 1: Technological Advancements
- The new model allows for direct audio processing without converting speech to text, enhancing the speed and emotional nuance of interactions [6][8].
- Gemini 2.5 Flash supports real-time speech translation, allowing for continuous listening and automatic language switching during conversations, effectively acting as an invisible translator [11][19].
- The model captures emotional nuances in speech, translating not just words but also the speaker's tone and attitude, which is crucial in contexts like business negotiations [12][14][15].

Group 2: Developer and Business Implications
- The update improves the accuracy of function calls and command adherence, increasing the compliance rate from 84% to 90%, which is vital for enterprise-level applications [18][23].
- Gemini 2.5 enhances multi-turn dialogue capabilities, allowing for more coherent and logical conversations, making AI interactions feel more human-like [24].
- The introduction of Gemini API in 2026 will expand these capabilities to more products, lowering the barrier for businesses to create advanced AI customer service solutions [28][29].

Group 3: Future Outlook
- The advancements signal a shift towards voice interaction as a primary interface for technology, moving AI beyond screens and into everyday life [25][27].
- The potential for users to communicate across language barriers with ease suggests a transformative impact on global communication [28].
X @Demis Hassabis
Demis Hassabis· 2025-12-14 10:08
RT Google AI (@GoogleAI) Our teams have been cooking 🧑‍🍳! Here’s a recap of updates that went out this week:
- Gemini models are bringing state-of-the-art text translations to Search and the Translate app across nearly 20 languages. We also made updates to our Gemini audio models and introduced new speech-to-speech capabilities in beta in the Google Translate app
- Google AI Pro members can share your plans with up to 5 people at no extra cost, or send a friend an extended 4-month trial
- @GoogleLabs launched D ...
X @TechCrunch
TechCrunch· 2025-12-11 18:04
Google debuts ‘Disco,’ a Gemini-powered tool for making web apps from browser tabs https://t.co/c4IJsZ1yz5 ...