Google AI Edge Gallery

1 Million Tokens! M1, the World's First Hybrid-Architecture Model, Is Now Open Source! Plus More Recent AI News…
红杉汇 · 2025-06-25 11:06
Group 1
- MiniMax-M1, described as the world's first hybrid-architecture model with the longest supported context window, accepts 1 million input tokens and produces up to 80,000 output tokens; its training was completed in 3 weeks at a cost of 3.8 million yuan [3][6]
- The model matches or outperforms open-source models such as DeepSeek-R1 and Qwen3 on various benchmarks, and even exceeds OpenAI's o3 and Claude 4 Opus on complex tasks [4][6]
- A key innovation of MiniMax-M1 is the Lightning Attention mechanism, which reduces computational complexity and improves efficiency by splitting the attention calculation into intra-block and inter-block components [5][7]
Group 2
- The model's input length of 1 million tokens is approximately 8 times that of DeepSeek R1, while its output length of 80,000 tokens surpasses Gemini 2.5 Pro's 64,000 tokens [6]
- The Lightning Attention mechanism uses tiling to optimize GPU memory usage, so training does not slow down as sequence length increases [7]
- The new CISPO algorithm improves training efficiency, doubling training speed compared with traditional methods: the same performance is reached in half the training steps [7]
Group 3
- Microsoft has released more than 700 real-world agent applications, showing how AI is transforming work across industries including finance, healthcare, technology, and education [10][12]
- Notable examples include Accenture's autonomous agent, which automates overdue payment collection and reduces days sales outstanding by up to 20%, and KPMG's ComplyAI, which improves compliance maturity and cuts ongoing compliance work by 50% [12]
Group 4
- Zhiyuan AI has launched CoCo, an enterprise-level intelligent assistant with memory capabilities, which lets it tailor its services to employee interactions and departmental functions [14]
- CoCo integrates into existing workflows and offers task planning and editing options, improving operational efficiency [14]
Group 5
- OpenAI has introduced the o3-pro model, which surpasses Google's Gemini 2.5 Pro on mathematical benchmarks, demonstrating leading performance among reasoning models [16][19]
- The o3-pro model is now available to ChatGPT Pro and Team users, with API access for developers priced at $20 per million input tokens and $80 per million output tokens [19]
Group 6
- Zhiyuan Research Institute has released Video-XL-2, a lightweight model for long-video understanding, which significantly improves processing efficiency and can handle videos of up to 10,000 frames [21][23]
- The model's architecture allows efficient processing on a single GPU, making it suitable for applications such as content analysis and behavior monitoring [23]
Group 7
- Google has launched the Google AI Edge Gallery, which lets users run AI models locally on their phones for tasks such as image generation and code editing without internet connectivity [27]
- The application is positioned as an experimental release and is open-sourced under the Apache 2.0 license, promoting privacy and offline usage [27]
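
The intra-block / inter-block split attributed to Lightning Attention in Group 1 above can be illustrated with a small sketch. This is not MiniMax's implementation: it assumes a plain causal linear attention (no softmax), and it leaves out the decay factors, normalization, and GPU tiling that a production kernel would likely include. The function names are hypothetical and chosen only for this example. The point it shows is that tokens inside the current block are handled directly, while all earlier blocks are summarized in one fixed-size state, so the per-block cost does not grow with sequence length.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def naive_causal_linear_attention(q, k, v):
    """Reference version: out[t] = sum over s <= t of (q[t] . k[s]) * v[s]."""
    d_v = len(v[0])
    out = []
    for t in range(len(q)):
        row = [0.0] * d_v
        for s in range(t + 1):
            score = dot(q[t], k[s])
            for j in range(d_v):
                row[j] += score * v[s][j]
        out.append(row)
    return out


def blockwise_causal_linear_attention(q, k, v, block_size):
    """Same result, computed block by block (hypothetical helper).

    inter-block: q[t] times a running d_k x d_v state that summarizes
                 every token in the blocks before the current one.
    intra-block: direct causal sum over tokens inside the current block.
    """
    n, d_k, d_v = len(q), len(k[0]), len(v[0])
    kv_state = [[0.0] * d_v for _ in range(d_k)]  # sum of k[s]^T v[s] over past blocks
    out = []
    for start in range(0, n, block_size):
        end = min(start + block_size, n)
        for t in range(start, end):
            # Inter-block part: one small matrix-vector product, independent of n.
            row = [sum(q[t][i] * kv_state[i][j] for i in range(d_k))
                   for j in range(d_v)]
            # Intra-block part: causal attention restricted to this block.
            for s in range(start, t + 1):
                score = dot(q[t], k[s])
                for j in range(d_v):
                    row[j] += score * v[s][j]
            out.append(row)
        # Fold the finished block into the running state for later blocks.
        for s in range(start, end):
            for i in range(d_k):
                for j in range(d_v):
                    kv_state[i][j] += k[s][i] * v[s][j]
    return out


if __name__ == "__main__":
    # Toy inputs (made up for illustration): 5 tokens, d_k = 2, d_v = 3.
    q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [0.0, 2.0]]
    k = [[1.0, 2.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
    v = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0], [3.0, 1.0, 0.0],
         [1.0, 1.0, 1.0], [2.0, 0.0, 1.0]]
    reference = naive_causal_linear_attention(q, k, v)
    blocked = blockwise_causal_linear_attention(q, k, v, block_size=2)
    print("block-wise result matches reference:", blocked == reference)
```

In this simplified form the state `kv_state` has a fixed size of d_k x d_v regardless of how many tokens came before, which is the property that lets a block-wise scheme keep its cost per block constant as the context grows.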
Tencent Research Institute AI Express 20250603
Tencent Research Institute · 2025-06-02 15:08
Group 1: AI Mechanisms and Tools
- Mamba's core authors introduced two attention mechanisms, GTA and GLA, designed for inference, which can double decoding speed and throughput [1]
- Flowith launched Agent Neo, the world's first AI agent capable of infinite execution and output, with a million-token context capability [2]
- FLUX.1 Kontext is a unified framework for various image tasks, excelling in character consistency and rapid generation speed [3]
Group 2: General AI Agents
- Fairies, a general AI agent developed by Peking University alumni, can perform 1,000 operations without an invitation code [4][5]
- ElevenLabs released Conversational AI 2.0, enhancing voice assistants' ability to understand user intent and manage multi-modal interactions [6]
Group 3: AI Applications and Market Trends
- Google launched the experimental Google AI Edge Gallery, allowing local execution of AI models on mobile devices [7]
- Hugging Face introduced two open-source humanoid robots, with prices starting at $250, aimed at AI application development [8]
- Mary Meeker's AI trends report highlighted a 99.7% drop in AI inference costs over two years, with Chinese models emerging at significantly lower costs [9]
Group 4: Future of AI
- OpenAI's COO Lightcap discussed the transition from conversational models to general AI agents, with over 3 million paid seats for ChatGPT Enterprise [10]
- LeCun's research indicated that large language models struggle with nuanced semantic tasks, questioning their path to artificial general intelligence [11]
Express | Google Quietly Launches AI Edge Gallery, an Open-Source Local AI Runner
Z Potentials · 2025-06-02 04:18
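Image source: Google

Last week, Google quietly released an app that lets users run a range of publicly available AI models from the AI development platform Hugging Face on their phones.

The app, called Google AI Edge Gallery, currently supports Android and is coming soon to iOS. Users can use it to find, download, and run compatible models for tasks such as image generation, question answering, and code writing and editing. All models run offline with no internet connection required, using the phone's processor directly for computation.

AI models that run in the cloud are usually more powerful than local versions, but they have clear drawbacks. Some users may be unwilling to send personal or sensitive data to a remote data center, or may want to keep using AI models in places without Wi-Fi or a mobile network.

[Image: Google AI Edge Gallery. Image source: Google]

Google AI Edge Gallery also offers a "Prompt Lab" feature, where users can launch model-driven "single-turn" tasks such as text summarization and rewriting. The lab comes with several task templates and configurable parameters for fine-tuning model behavior.

Google cautions that performance may vary by device. Modern devices with stronger hardware will naturally run models faster, but model size matters just as much. Compared with smaller models, larger models performing the same task (for example ...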