MiniMax Speech 2.6
Jensen Huang Backs an AI Company That Cloned Musk's Voice
Sou Hu Cai Jing· 2025-11-03 04:14
Core Insights
- Cartesia, a voice AI company, has recently launched its new voice model Sonic-3 and completed a $100 million Series B funding round, with NVIDIA among the investors [1][3][12]

Company Overview
- Cartesia was founded by Karan Goel, a talented individual from Stanford AI Lab, who has previously excelled in the field of state space models (SSM) [2][10]
- The company has a strong academic foundation, with its core team primarily composed of members from Stanford AI Lab, including co-founder Albert Gu, a notable figure in the development of the Mamba architecture [3][4]

Product Development
- Cartesia has rapidly progressed since its inception, launching its first product, the Sonic voice model, shortly after securing seed funding. The company has since released multiple iterations, including Sonic-2.0 and the latest Sonic-3 [6][12]
- Sonic-3 features significant upgrades, including improved emotional expression and faster response times, with a latency of only 90 milliseconds and an end-to-end response time of 190 milliseconds, making it one of the fastest voice generation systems available [8][12]

Technology Differentiation
- Unlike traditional voice AI models that rely on Transformer architecture, Sonic-3 is built on SSM, allowing for more natural and context-aware interactions without the need to revisit the entire conversation history [8][12]
- This innovative approach enhances the model's ability to capture emotional nuances and respond more fluidly, positioning Cartesia as a leader in real-time voice AI technology [8][12]

Market Context
- The voice AI sector is witnessing significant advancements, with other companies like MiniMax also launching competitive products, indicating a growing market for voice models that can handle diverse languages and accents [14]
Jensen Huang Backs an AI Company That Cloned Musk's Voice
量子位 (QbitAI) · 2025-11-03 03:12
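By Mengyao, reporting from Aofeisi. 量子位 | WeChat official account QbitAI

If I didn't tell you, could you tell which one is Musk's own voice? A big, emphatic NO! In fact, neither of them is. This audio, close enough to pass for the man himself, comes from Sonic-3, the speech model just released by voice AI company Cartesia. The new model arrived together with new funding: Cartesia disclosed that it has completed a $100 million Series B round, with NVIDIA prominently among the investors. The company also draws this much attention because of its founder. Founder and CEO Karan Goel, an Indian prodigy from Stanford AI Lab, had already made his mark in state space models (SSM). Time to get to know Cartesia.

Funding plus a launch: two big events at once

Cartesia started out with the classic Silicon Valley elite script. Its initial core members all came from Stanford AI Lab, a thoroughly academic, heavyweight lineup. From its release cadence to its funding cadence, Cartesia has taken "pushing the technology while raising money" to the extreme… Among them, Cartesia's chief scientist and co-founder Albert Gu is of Chinese descent and is one of the co-inventors of the Mamba architecture.

△ Albert Gu is third from the left

In fact, from the very beginning Cartesia did not follow the mainstream ...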
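[Industrial Internet Weekly] "15th Five-Year Plan" proposals: fully implement the "AI+" initiative and seize the commanding heights of AI industry applications; Jensen Huang's latest GTC keynote outlines an AI blueprint; Exiting the Chinese market? SA...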
Tai Mei Ti A P P· 2025-11-03 02:12
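Domestic News

Zhiyuan (BAAI) releases the multimodal world model Wujie·Emu3.5, capable of cross-scenario embodied operation
Zhiyuan released the multimodal world model Emu3.5, which performs "Next-State Prediction (NSP)" over multimodal sequences in an autoregressive manner and thereby gains generalizable world-modeling capability. At the application level, the model can carry out cross-scenario embodied operations with generalized action planning and complex interaction, and it can also handle text-to-image generation, image editing, and spatiotemporal transformation.

01.AI and OSCHINA release the "Open AgentKit Platform"
November 1: at the global open-source technology summit GOTC 2025, 01.AI and OSCHINA jointly released the "Open AgentKit Platform" (OAK for short). OAK is a one-stop open-source solution built for developers. Through four core modules designed to be open, composable, and observable (Framework, Runtime, Builder, and Studio), it aims to cover the full AI Agent development lifecycle, from building and testing to deployment and monitoring. The platform emphasizes cross-ecosystem compatibility and low-barrier development, and hopes that community co-building will let the agent ecosystem grow freely and make AI Agents efficient and widely accessible.

[Industrial Internet Weekly brings together the week's most important frontier trends, major policies, and industry research reports in enterprise services, cloud computing, and big data.]

玻色量 ...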
Tencent Research Institute AI Weekly Keywords Top 50
Tencent Research Institute · 2025-11-01 02:33
Core Insights
- The article presents a weekly roundup of the top 50 keywords related to AI developments, highlighting significant trends and innovations in the industry [2].

Group 1: Chips
- Vera Rubin is a notable keyword associated with NVIDIA, indicating advancements in chip technology [3].
- Qualcomm has introduced a new AI inference solution, showcasing its commitment to enhancing AI capabilities [3].

Group 2: Models
- OpenAI has developed a safety classification model, emphasizing the importance of security in AI applications [3].
- Cursor has launched its self-developed Composer model, reflecting the trend of companies creating proprietary AI models [3].
- NVIDIA's OmniVinci model and MiniMax's M2 model are also highlighted, indicating ongoing innovation in AI modeling [3][4].

Group 3: Applications
- Sora has introduced a role cameo feature, enhancing user interaction with AI [3].
- MiniMax Speech 2.6 and Beijing Zhiyuan's WuJie·Emu3.5 are examples of new AI applications aimed at improving communication [3].
- Adobe's Firefly Image 5 and Tencent's interactive AI podcast demonstrate the growing integration of AI in creative and media sectors [3][4].

Group 4: Technology
- The NEO home robot by 1X Technologies and the LeRobot v0.4.0 by Hugging Face represent advancements in consumer robotics [4].
- Neuralink's PRIMA artificial vision and Merge Labs' ultrasound brain-machine interface highlight significant technological innovations in AI and neuroscience [4].

Group 5: Capital
- OpenAI is undergoing a capital structure reorganization and has plans for an IPO, indicating its growth and potential market impact [4].

Group 6: Events and Opinions
- There is a call for copyright protection in Japan, reflecting ongoing discussions about intellectual property in the AI space [4].
- Yoshua Bengio's new definitions of AGI and insights on mental health data from OpenAI indicate evolving perspectives on AI's role in society [4].
Tencent Research Institute AI Express 20251031
Tencent Research Institute · 2025-10-30 16:06
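https://mp.weixin.qq.com/s/_dmZj9IwtbRLpvXHulQ_8g

Generative AI

I. OpenAI has just open-sourced two reasoning models built for safety classification
1. OpenAI open-sourced the gpt-oss-safeguard safety classification models (120b and 20b versions) under the Apache 2.0 license; they can read a policy document directly and classify content without retraining;
2. The models outperform GPT-5-thinking on multiple benchmarks and achieve industry-best cost-performance on the content moderation evaluation set and the ToxicChat dataset;
3. OpenAI already uses this technology internally (the Safety Reasoner prototype) for products such as image generation and Sora 2, with safety reasoning accounting for as much as 16% of compute.

II. Cursor 2.0 update: in-house model Composer and parallel multi-agent execution
1. Cursor released version 2.0 and launched Composer, its first in-house coding model, which generates 250 tokens per second, 4 times the speed of comparable frontier systems, marking a shift from "AI shell" to "AI-native platform";
2. Composer uses a mixture-of-experts (MoE) architecture optimized for software engineering through reinforcement learning; it reaches frontier level on the Cursor Bench evaluation and is already used by the team in daily development;
3. New ...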