Large Language Models (LLMs)
Models Compressed to 70% of Their Size While Keeping 100% Accuracy: The Lossless Compression Framework DFloat11 Arrives
机器之心· 2025-04-28 04:32
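Reported by 机器之心. Editors: Chen Ping (陈萍), +0. Large language models (LLMs) have shown remarkable capabilities across a wide range of natural language processing (NLP) tasks. However, their rapidly growing size poses a major obstacle to efficient deployment and inference, especially in environments with limited compute or memory. For example, Llama-3.1-405B has 405 billion parameters in BFloat16 (16-bit Brain Float) format and needs roughly 810GB of memory for full inference, which exceeds the capacity of a typical high-end GPU server (for example, a DGX A100/H100 with eight 80GB GPUs). Deploying the model therefore requires multiple nodes, which makes it expensive and hard to access. In this paper, researchers from Rice University and other institutions propose a solution that can compress any BFloat16 model to 70% of its original size while keeping 100% accuracy on tasks. Paper title: 70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float. To cope with the growing size of LLMs, quantization is commonly used to convert high-precision weights into low-bit representations. This significantly reduces memory ...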
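The memory figures quoted for Llama-3.1-405B can be checked with a quick back-of-the-envelope calculation. The sketch below is illustrative only: the helper name `weights_gb` is hypothetical, 1 GB is assumed to be 10^9 bytes, and the 70% ratio is applied to the weights alone, ignoring activations, KV cache, and any decompression overhead.

```python
# Rough memory arithmetic for the figures quoted in the article.
# Assumptions (not from the source): 1 GB = 10**9 bytes, weights only.

BYTES_PER_BF16 = 2   # BFloat16 stores each parameter in 16 bits
GB = 10**9


def weights_gb(num_params: int, bytes_per_param: int = BYTES_PER_BF16) -> float:
    """Return the memory needed to hold the weights, in GB."""
    return num_params * bytes_per_param / GB


num_params = 405 * 10**9           # Llama-3.1-405B: 405 billion parameters
full_gb = weights_gb(num_params)   # BFloat16 footprint
compressed_gb = full_gb * 70 / 100 # 70% of the original size
server_gb = 8 * 80                 # one node with eight 80GB GPUs

print(f"BFloat16 weights:   {full_gb:.0f} GB")
print(f"Compressed to 70%:  {compressed_gb:.0f} GB")
print(f"Single 8x80GB node: {server_gb} GB")
```

Under these assumptions the uncompressed weights come to 810 GB, above the 640 GB available on one node, while 70% of that is 567 GB. Whether the model actually runs on a single node also depends on memory for activations and the KV cache, which this calculation leaves out.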
Google Jeopardy: Advertising, DOJ Threats Pressure Alphabet Stock
ZACKS· 2025-04-23 18:40
As we head into Alphabet's (GOOGL) earnings report tomorrow, some of the biggest questions analysts will be asking on the conference call should surround Wiz and NVIDIA and those big plans I discussed in my earlier article. But there are two more issues burning like a GPU at OpenAI: their ad model in the age of LLMs and their (undesired) spotlight from the DOJ on antitrust action. It's fair to ask if Google’s advertising model is in jeopardy since Alphabet still generates the majority of its revenue from Sea ...
Google GenAI, AI Cloud Services Drive Analyst Confidence In Long-Term Growth
Benzinga· 2025-04-16 18:02
Core Insights
- Google's primary upside valuation driver over the next three to five years will be its proprietary large language models (LLMs) [1]
- GenAI is expected to enhance Google's internal operations and revenue growth, with Google Cloud benefiting from LLMs and related applications [1]
- The Gemini LLM has competitive advantages due to the vast data from Google's search engine and YouTube, positioning it well in the market [2]

Group 1
- GenAI is anticipated to disrupt content creation, user behavior, and business models [2]
- Google's "Zero Click" strategy is identified as a significant disruption for 2024, raising concerns about consumer trust and monetization of high-quality content [4]
- The proliferation of low-value content may affect session time and ad monetization for publishers reliant on traffic referrals [5]

Group 2
- GenAI's summaries in search results are reducing the need for users to click through to links, impacting traditional ad revenue models [5]
- The risks associated with GenAI include copyright violations, hallucinations, and deepfakes, which could threaten the integrity of media [5]
- A virtual panel on the impact of GenAI on media and the internet will be hosted by Needham analyst Laura Martin, discussing various stocks including GOOGL [3]