CoCo

1 Million Tokens! M1, the World's First Hybrid-Architecture Model, Is Now Open Source! Plus More Recent AI News…
红杉汇· 2025-06-25 11:06
Group 1
- MiniMax-M1 is described as the world's first open-source hybrid-architecture model. It supports a context window of 1 million input tokens and 80,000 output tokens, and its training was completed in 3 weeks at a cost of 3.8 million yuan [3][6]
- The model matches or outperforms open-source models such as DeepSeek-R1 and Qwen3 on a range of benchmarks, and exceeds OpenAI's o3 and Claude 4 Opus on some complex tasks [4][6]
- A key innovation of MiniMax-M1 is the Lightning Attention mechanism, which lowers computational complexity by splitting the attention calculation into intra-block and inter-block components [5][7]
Group 2
- The 1-million-token input length is roughly 8 times that of DeepSeek R1, and the 80,000-token output length exceeds Gemini 2.5 Pro's 64,000 tokens [6]
- Lightning Attention uses tiling to optimize GPU memory usage, so training does not slow down as sequence length increases [7]
- The new CISPO algorithm doubles training speed compared with traditional methods, reaching comparable performance in half the training steps [7]
Group 3
- Microsoft has published more than 700 real-world Agent applications, showing how AI is changing work in industries including finance, healthcare, technology, and education [10][12]
- Notable examples include Accenture's autonomous agent, which automates overdue payment collection and cuts days sales outstanding by up to 20%, and KPMG's ComplyAI, which improves compliance maturity and reduces ongoing compliance work by 50% [12]
Group 4
- Zhipu AI has launched CoCo, an enterprise-level intelligent assistant with memory, which lets it tailor its services to each employee's interactions and department function [14]
- CoCo integrates into existing workflows and offers task planning and editing options to improve operational efficiency [14]
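The intra-block / inter-block split attributed to Lightning Attention in Group 1 can be illustrated with a minimal sketch. It assumes a plain, un-normalized causal linear attention with no decay term, written in pure Python for readability; it is not MiniMax's implementation, which runs as GPU kernels, and all function names here are hypothetical. The point it shows is that earlier blocks can be summarized by a single running K^T V matrix, so only the current block needs a masked quadratic computation.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def add(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def causal_linear_attention_naive(q, k, v):
    """Reference version: (Q K^T with future positions zeroed) V, quadratic in length."""
    n = len(q)
    scores = matmul(q, transpose(k))
    masked = [[scores[i][j] if j <= i else 0.0 for j in range(n)] for i in range(n)]
    return matmul(masked, v)


def causal_linear_attention_blocked(q, k, v, block):
    """Same output, computed block by block with a running K^T V state."""
    d_k, d_v = len(k[0]), len(v[0])
    kv = [[0.0] * d_v for _ in range(d_k)]  # sum of K_s^T V_s over earlier blocks
    out = []
    for start in range(0, len(q), block):
        qb = q[start:start + block]
        kb = k[start:start + block]
        vb = v[start:start + block]
        # Inter-block part: queries see all earlier blocks through the kv state.
        inter = matmul(qb, kv)
        # Intra-block part: masked attention restricted to the current block.
        intra = causal_linear_attention_naive(qb, kb, vb)
        out.extend(add(inter, intra))
        # Fold the current block into the state for the blocks that follow.
        kv = add(kv, matmul(transpose(kb), vb))
    return out
```

In this simplified form the per-block work depends on the block size and head dimensions rather than on how many tokens came before, which is the intuition behind the claim that cost does not grow quadratically with sequence length.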
Group 5
- OpenAI has introduced the o3-pro model, which surpasses Google's Gemini 2.5 Pro on mathematical benchmarks, underscoring its lead among reasoning models [16][19]
- o3-pro is now available to ChatGPT Pro and Team users, with API access for developers priced at $20 per million input tokens and $80 per million output tokens [19]
Group 6
- Zhiyuan Research Institute has released Video-XL-2, a lightweight model for long-video understanding that significantly improves processing efficiency and can handle videos of up to 10,000 frames [21][23]
- The model's architecture allows efficient processing on a single GPU, making it suitable for applications such as content analysis and behavior monitoring [23]
Group 7
- Google has launched the Google AI Edge Gallery, which lets users run AI models locally on their phones for tasks such as image generation and code editing without an internet connection [27]
- The app is positioned as an experimental release and is open-sourced under the Apache 2.0 license, promoting privacy and offline use [27]
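As a worked example of the o3-pro API pricing quoted in Group 5, the sketch below estimates the cost of a single request. The two rates come from the summary above; the function name and the sample token counts are hypothetical, and actual billing may include factors not covered here.

```python
# Rates as quoted in the summary: USD per one million tokens.
INPUT_USD_PER_MILLION = 20.0
OUTPUT_USD_PER_MILLION = 80.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its input and output token counts."""
    total = input_tokens * INPUT_USD_PER_MILLION + output_tokens * OUTPUT_USD_PER_MILLION
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 10,000 input tokens and 2,000 output tokens.
    print(f"${estimate_cost_usd(10_000, 2_000):.2f}")
```

At these rates an output token costs four times as much as an input token, so long answers tend to dominate the bill even when prompts are large.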
How Can AI Agents Double Enterprise Efficiency?
Sou Hu Cai Jing· 2025-06-09 16:35
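Author | Yingmu  Editor | Jiuli
In May 2025, the Sequoia Capital AI Summit concluded in San Francisco. The conference, which brought together 150 founders of the world's leading AI companies, reached an important consensus: the core of the next round of AI competition is no longer the tools themselves but the actual benefit created for users. Against this backdrop, the importance of Agents has been pushed to the forefront as never before.
Silicon Valley's big companies set off the first wave of acceleration. Microsoft CEO Satya Nadella announced in a keynote: "We have entered the era of AI Agents and are witnessing how AI systems help us solve problems in entirely new ways." OpenAI CEO Sam Altman announced a new Codex Agent for developers, saying that "this may be the biggest change in the history of programming."
Turning to China, the major tech companies have all entered the field themselves. No one doubts that AI Agents are here to stay, yet the practicality problem persists. Users like to describe the current capabilities of AI Agents as those of an "AI intern."
In practice, the "AI intern" label reveals that today's AI Agents cannot meet the execution and deployment requirements of enterprises and individuals in generalized scenarios. As enterprises enter the deep-water zone of digital transformation, where scenarios and data become more fine-grained, problems such as traditional AI's difficulty in breaking down data silos and employees being trapped in repetitive work become especially acute, and an upgrade of AI Agents looks urgent.
As for how to resolve this dilemma, Zhipu, which has long focused on commercializing large models ...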
Tencent Research Institute AI Express 20250610
Tencent Research Institute · 2025-06-09 14:06
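Generative AI
I. ChatGPT 4o gets a quiet update: it now thinks first, then searches the web
1. Before answering a complex question, ChatGPT 4o now pauses for a few seconds to "think," with the page showing "Thought for a few seconds," and then decides whether to search or answer directly;
2. This "understand first, search second" ability improves answer accuracy, but users have to wait longer, and it is triggered more often on mobile;
3. OpenAI has not officially announced the feature, but it has extended this thinking ability to non-reasoning models such as GPT-4.1 and GPT-4.5.
https://mp.weixin.qq.com/s/ZxkMFmjp6dYRaf6EyVgp4A
II. Google Veo 3 Fast cuts the price to one-fifth; the "360°" keyword unlocks 3D effects
1. Google's Veo 3 model adds a "360°" keyword that generates videos with a 3D orbiting effect, although physical realism still has flaws;
2. A Veo 3-Fast version has been released, supporting text-to-video and automatic voice-over generation, with faster output and an 80% lower price;
3. The Fast version needs only 20 credits to generate an 8-second 720p video (one-fifth the cost of the standard version), but facial detail and lighting quality are slightly reduced.
https://mp.weixin.qq.com/s/Vw9C6MHOT43yqVl6tsw ...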