FastVLM
AI Weekly Watch: NVIDIA's Saudi Deal Lifts Risk Appetite, On-Device AI Penetration Accelerates
SINOLINK SECURITIES· 2025-05-18 14:39
Investment Rating
- The report does not explicitly state an investment rating for the industry.

Core Insights
- Activity in AI-related applications worldwide, particularly chat assistants, has increased significantly: overseas applications such as ChatGPT and Gemini grew by approximately 6%-8%, while domestic applications such as Doubao and ChatGLM surged by around 20% [2][10]
- NVIDIA is responding to tighter export restrictions by launching a downgraded version of its H20 chip; backorders from China have reached $18 billion, exceeding its total revenue from China in FY2024 [2][12]
- CoreWeave reported Q1 revenue of $982 million, a 420% year-over-year increase, and raised its full-year revenue guidance to $4.9-5.1 billion, despite a net loss of $315 million [2][19]
- Global smartphone sales reached approximately 301 million units in Q1 2025, up 0.38% year over year, while AI-enabled smartphone sales increased by about 89% [2][23]
- AI laptop shipments reached around 18 million units in Q1 2025, up approximately 201% year over year, for a penetration rate of 40.74% [2][35]

Summary by Sections

Overseas Market Review
- The report highlights rising activity in AI-related applications, particularly chat assistants, with notable growth in both overseas and domestic markets [5][10]

NVIDIA Insights
- NVIDIA's stock price has risen on policy relaxations, but earnings expectations remain unverified, and backorders from China are significant [12][16]

CoreWeave Financial Performance
- CoreWeave's Q1 revenue significantly exceeded expectations, and the company has strong growth prospects despite a wider net loss [19][22]

Consumer Electronics Dynamics
- The global smartphone market shows modest growth with a notable increase in AI-enabled devices, while AI laptops are growing rapidly in both shipments and market penetration [23][35]
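Crushing the Competition at 85x Speed: Apple Open-Sources FastVLM, a Vision-Language Model That Runs Directly on iPhone
机器之心 (Jiqizhixin) · 2025-05-16 16:31
Reported by 机器之心 (Jiqizhixin)

FastVLM: giving the iPhone ultra-fast visual understanding

When you snap a photo on your iPhone and ask the AI "What is this?", the FastVLM model is quietly decoding behind the scenes.

Apple recently open-sourced FastVLM (Fast Vision Language Model), an efficient vision-language model that runs directly on iPhone.

Code: https://github.com/apple/ml-fastvlm

The repository also includes an iOS/macOS demo app built on the MLX framework and optimized for performance on Apple devices.

In the demo, the responses are strikingly "Fast". That is what sets FastVLM apart.

Compared with conventional models, FastVLM focuses on two problems: size and speed. Relative to comparable models, its time to first token is 85x faster.

The model introduces a new hybrid vision encoder, FastViTHD, which combines convolutional layers with Transformer blocks and uses multi-scale pooling and downsampling to cut the number of "visual tokens" needed to process an image to a very low level: 16x fewer than a conventional ViT and 4x fewer than FastViT. With its outstanding speed and ...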
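The token-reduction claim above can be made concrete with a little arithmetic. The sketch below assumes a square input image and an encoder that shrinks each side by a fixed downsampling factor before handing tokens to the language model. The function names and the factors 16, 32 and 64 are illustrative assumptions, chosen only so that the ratios match the 16x and 4x figures quoted in the article; they are not taken from Apple's code.

```python
def visual_tokens(image_size: int, stride: int) -> int:
    """Count visual tokens for a square image whose sides are
    each downsampled by `stride` before reaching the language model."""
    if image_size % stride != 0:
        raise ValueError("image_size must be divisible by stride")
    side = image_size // stride
    return side * side


# Hypothetical downsampling factors (assumptions for illustration only).
# Doubling the per-side factor cuts the token count by 4x.
STRIDES = {"ViT": 16, "FastViT": 32, "FastViTHD": 64}


def token_table(image_size: int = 1024) -> dict[str, int]:
    """Return the token count per encoder for one image size."""
    return {name: visual_tokens(image_size, s) for name, s in STRIDES.items()}


if __name__ == "__main__":
    for name, count in token_table(1024).items():
        print(f"{name:10s} {count:5d} tokens")
```

Under these assumed factors, a 1024x1024 image yields 4096, 1024 and 256 tokens respectively. Fewer visual tokens means less work for the language model before it can emit its first token, which is why token count matters for the latency figures quoted above.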
iOS 19 Isn't Here Yet, but I've Already Tried Apple's Latest AI on My iPhone
Hu Xiu· 2025-05-15 12:04
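It's 2025. Is there anyone who still hasn't gotten to use Apple's AI? The keynotes get the blood pumping, but reality is a letdown.

Just when I assumed Apple would most likely coast through this year as well, I noticed that it had quietly open-sourced a small model: FastVLM.

There was no launch event and no big push on the official website. I didn't pay much attention at first, but once a colleague from our tech team got the model running on a top-spec iPhone 16 Pro Max, I admit I couldn't sit still.

In one sentence: this model is very "Apple".

It starts up extremely fast, its image recognition is decent, and everything runs locally with no cloud round trips. It doesn't look stunning, but in use... it is rather interesting.

I admit that, for a moment, I felt Apple's AI was back on its feet.

Model download on GitHub: https://github.com/apple/ml-fastvlm

As a family of vision-language models (Vision-Language Model) that run locally on iPhone, iPad, Mac and other devices, FastVLM comes in three parameter sizes: FastVLM-0.5B, 1.5B and 7B.

Ordinary users can also deploy it to an iPhone, though doing so takes some technical skill. Apple's research team provides a complete installation guide on GitHub, which users with a technical background can follow:

Originally, only our China-market (国行) ...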
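OpenAI Launches HealthBench, an Open-Source Medical Benchmark; Apple Releases FastVLM, an Ultra-Fast Vision-Language Model That Runs on iPhone | Global Tech Morning Brief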
Mei Ri Jing Ji Xin Wen· 2025-05-12 23:53
Group 1
- OpenAI has launched HealthBench, an open-source benchmark designed to measure AI systems' capabilities in healthcare, developed with input from 262 doctors across 60 countries, featuring 5,000 real health dialogues and 48,562 unique scoring criteria [2]
- Apple's FastVLM, a visual language model optimized for high-resolution image processing, has been released, achieving up to 85 times faster encoding speed, paving the way for real-time multimodal AI applications on mobile devices [3]
- The FDA has announced the immediate integration of AI technology across all its centers to expedite drug approval processes, significantly enhancing review efficiency by reducing repetitive tasks [4]

Group 2
- Tesla is launching an AI agent to improve customer communication, capable of detecting delays and monitoring conversation sentiment, with a pilot program starting at ten locations [5]
- Google's Gemini 2.5 Pro has upgraded its video understanding capabilities, supporting analysis of videos up to 6 hours long and achieving an accuracy rate of 84.7% in benchmark tests, indicating a shift towards video-driven multimodal products [6][7]
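Tencent Research Institute AI Digest 20250513
Tencent Research Institute · 2025-05-12 14:46
Generative AI

I. Sakana AI, a startup co-founded by one of the eight Transformer authors, proposes the "Continuous Thought Machine" (CTM)
1. CTM makes the synchronization of neuron activity its core mechanism and uses timing information to achieve more complex neural behavior, so its reasoning process more closely resembles human thinking;
2. Each neuron can access its own history and learn to use that information to compute its next output; all behaviors emerge naturally rather than being designed in advance;
3. On tasks such as maze solving and image recognition, CTM shows a human-like thinking process: the longer it thinks, the higher its accuracy, and it can adjust its thinking time to the difficulty of the task.
https://mp.weixin.qq.com/s/hxL8ylal_4gY8IUIL7TWWA

II. Apple releases FastVLM, an ultra-fast vision-language model that runs directly on iPhone
1. Apple has released FastVLM, a vision-language model for mobile devices. It uses two-stage processing (image to tokens, then tokens to generated language) and can be deployed directly on devices such as the iPhone;
2. FastVLM stands out for efficiency: the 0.5B version produces its first token 85x faster than LLaVA and is 3.4x smaller, while the 7B version paired with Qwen2 is 7.9x faster than the Cambrian model;
3. FastVLM handles high-resolution images efficiently and, combined with its lightweight design, shows potential for use on mobile devices such as smart glasses.
https ...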