Z Tech | A Conversation with a Meta FAIR Research Scientist: Dynamic Filtering with Confidence to End Inefficient Reasoning
Z Potentials· 2025-09-05 02:27
Core Viewpoint - The article discusses the emergence of the Deep Think with Confidence (DeepConf) method, which enhances the inference efficiency and performance of large language models (LLMs) by dynamically filtering low-quality inference trajectories using internal confidence signals during the inference process [1][5].

Group 1: DeepConf Methodology
- DeepConf addresses the limitations of existing methods by utilizing model internal confidence signals to filter out low-quality inference trajectories, thereby improving both inference efficiency and performance [1][10].
- The method can be seamlessly integrated into existing serving frameworks without requiring additional model training or hyperparameter tuning, making it user-friendly for developers [8][10].
- DeepConf operates in both offline and online modes, allowing for flexibility in application depending on the use case [8].

Group 2: Performance Metrics
- In offline mode, DeepConf@512 achieved 99.9% accuracy on the GPT-OSS-120B model, significantly surpassing the 97.0% accuracy of traditional majority voting [10].
- In online mode, DeepConf can reduce the number of generated tokens by up to 84.7% compared to full parallel inference while simultaneously improving accuracy, effectively balancing performance and efficiency [10].

Group 3: Contributors and Research Background
- Jiawei Zhao, a research scientist at Meta FAIR and a Caltech PhD, focuses on optimization methods for LLMs and deep learning [5][6].
- Yichao Fu, a PhD student at UCSD, specializes in LLM inference optimization and has contributed to multiple research projects aimed at improving LLM scheduling and breaking sequential dependencies in inference [8][10].
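To make the filtering idea above concrete, here is a minimal, hypothetical sketch of confidence-based trace filtering followed by a confidence-weighted majority vote, in the spirit of DeepConf's offline mode. The per-trace confidence used here (exponentiated mean token log-probability) and the `keep_ratio` parameter are illustrative assumptions, not the paper's exact formulation, which relies on the model's internal confidence signals.

```python
# Hypothetical sketch of DeepConf-style offline filtering + weighted voting.
# Assumptions (not from the article): confidence = exp(mean token logprob),
# and a fixed keep_ratio controls how many traces survive filtering.
import math
from collections import defaultdict

def trace_confidence(token_logprobs):
    """Map a trace's token log-probabilities to a confidence score in (0, 1]."""
    return math.exp(sum(token_logprobs) / len(token_logprobs))

def deepconf_vote(traces, keep_ratio=0.5):
    """Drop low-confidence traces, then take a confidence-weighted
    majority vote over the remaining answers.

    `traces` is a list of (answer, token_logprobs) pairs.
    """
    scored = [(trace_confidence(lps), ans) for ans, lps in traces]
    scored.sort(reverse=True)                       # highest confidence first
    kept = scored[: max(1, int(len(scored) * keep_ratio))]
    votes = defaultdict(float)
    for conf, ans in kept:
        votes[ans] += conf                          # weight votes by confidence
    return max(votes, key=votes.get)

# Example: three confident traces answer "42", two low-confidence traces
# answer "41"; filtering and weighting select "42".
traces = [
    ("42", [-0.1, -0.2, -0.1]),
    ("42", [-0.2, -0.1, -0.3]),
    ("41", [-2.0, -1.5, -1.8]),
    ("42", [-0.3, -0.2, -0.2]),
    ("41", [-1.9, -2.1, -1.7]),
]
print(deepconf_vote(traces))  # → 42
```

The online mode described in the article would instead apply such a confidence check while a trace is still being generated, terminating it early, which is what yields the reported token savings.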
Xiaomi Granted Patent for a Memory Optimization Method
Jin Rong Jie· 2025-08-26 05:28
Group 1
- The core point of the article is that Beijing Xiaomi Mobile Software Co., Ltd. has been granted a patent for a "memory optimization method, device, and computer storage medium" with authorization announcement number CN113722080B, applied for in May 2020 [1]
- Beijing Xiaomi Mobile Software Co., Ltd. was established in 2012 and is primarily engaged in software and information technology services, with a registered capital of 148.8 million RMB [1]
- The company has invested in 4 enterprises, participated in 139 bidding projects, and holds 5,000 patent records along with 123 administrative licenses [1]
A New Breakthrough in Memory Compression Technology Improves AI Inference Efficiency
半导体芯闻· 2025-04-25 10:19
Source: compiled from EE Times.

ZeroPoint Technologies and Rebellions aim to develop an AI accelerator that lowers the cost and power consumption of AI inference. ZeroPoint Technologies' memory optimization technology is said to compress data quickly, increase data-center memory capacity, and improve AI inference performance per watt.

In April 2025, Swedish memory-optimization IP vendor ZeroPoint Technologies (hereafter ZeroPoint) announced a strategic partnership with Rebellions to jointly develop a next-generation memory-optimized AI accelerator for AI inference. The companies plan to release a new product in 2026, claiming it is "expected to achieve unprecedented tokens/second/watt performance levels."

As part of the collaboration, the two companies will use ZeroPoint's memory compression and memory management technologies to increase the memory bandwidth and capacity available to foundation-model inference workflows. ZeroPoint CEO Klas Moreau claims the company's hardware-based memory optimization engine is 1,000 times faster than existing software compression methods.

The value proposition of ZeroPoint's memory compression IP: first, compression and decompression; second, the compressed ...