H100 GPU
X @Cathie Wood
Cathie Wood· 2025-07-12 16:06
.@ARKInvest has been wondering how much of a sleeper #Tesla's Dojo is. Interesting perspective here.

RimRunner (@samwose): @StockSavvyShay
1. Breaking Free from Nvidia
Tesla currently relies on Nvidia's A100 and H100 GPUs for training its massive video-based neural networks. While powerful, these chips are general-purpose and optimized for broader markets like LLMs and gaming. Dojo 2, by contrast, is Tesla's ...
这种大芯片,大有可为
半导体行业观察· 2025-07-02 01:50
AI models are growing exponentially and have now reached trillions of parameters, exposing the marked limitations of traditional single-chip graphics processing unit (GPU) architectures in scalability, energy efficiency, and compute throughput. Wafer-scale computing has emerged as a transformative paradigm that integrates many chiplets onto a single monolithic wafer to deliver unprecedented performance and efficiency. The Cerebras Wafer-Scale Engine (WSE-3), with 4 trillion transistors and 900,000 cores, and Tesla's Dojo, with 1.25 trillion transistors and 8,850 cores per training tile, exemplify the potential of wafer-scale AI accelerators to meet the demands of large-scale AI workloads. This review offers a comprehensive comparative analysis of wafer-scale AI accelerators and single-chip GPUs, focusing on their relative performance, energy efficiency, and cost-effectiveness in high-performance AI applications. It also examines emerging technologies such as TSMC's Chip-on-Wafer-on-Substrate (CoWoS) packaging, which promises to raise compute density by as much as 40x. The study further discusses key challenges, including fault tolerance, software optimization, and economic viability, and explores the trade-offs and synergies between these two hardware paradigms. In addition, it ...
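Those headline specs are easier to compare side by side. A minimal sketch of the arithmetic (the per-core breakdown is derived from the figures quoted above, not stated in the review):

```python
# Side-by-side of the wafer-scale specs quoted above. The Dojo figures
# correspond to a training tile as described in the text; the per-core
# numbers are simple derived arithmetic, not claims from the review.

chips = {
    "Cerebras WSE-3":        {"transistors": 4.00e12, "cores": 900_000},
    "Tesla Dojo (per tile)": {"transistors": 1.25e12, "cores": 8_850},
}

for name, spec in chips.items():
    per_core = spec["transistors"] / spec["cores"]
    print(f"{name}: {spec['transistors'] / 1e12:.2f}T transistors, "
          f"{spec['cores']:,} cores, ~{per_core / 1e6:.1f}M transistors/core")
```

The contrast (roughly 4M transistors per core on WSE-3 versus about 140M per core on a Dojo tile) reflects the two designs' very different core granularities rather than a quality difference.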
26-Day Countdown: OpenAI to Retire the GPT-4.5 Preview API
36Kr· 2025-06-18 07:34
OpenAI recently sent developers an email announcing that the GPT-4.5 Preview API will be removed on July 14.

Caption: OpenAI's email. Image source: internet.

For developers who have deeply integrated GPT-4.5 into their products or workflows, this is nothing short of a shock: in less than a month they must pick a replacement from among the nearly 40 models OpenAI offers.

Why must it go?

Many point to the high compute cost. A model that performs well but is commercially uneconomical will not survive long on any company's books.

Caption: Overview of GPT models.

GPT-4.5 API pricing runs as high as $75 per million input tokens and $150 per million output tokens, many times that of GPT-4.1.

OpenAI says the removal was announced back in April, when GPT-4.1 was released. GPT-4.5 was an "experimental" product from the start; its mission was to provide lessons for future model iterations, especially in the nuances of creativity and writing. The email was simply the scheduled reminder.

However, the GPT-4.5 preview will remain available to individual ChatGPT users as an option in the model dropdown at the top of the app.

Caption: Users say GPT-4.5 is one of their favorite models.

Recently, OpenAI ...
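To make the price gap concrete, here is a quick back-of-the-envelope comparison (a sketch; the GPT-4.1 prices of $2 and $8 per million tokens are an assumption taken from OpenAI's public price list, not from the article):

```python
# Rough per-request cost comparison using the GPT-4.5 Preview prices quoted
# above and assumed GPT-4.1 list prices ($2 / $8 per million tokens).

def request_cost(input_tokens: int, output_tokens: int,
                 in_price_per_m: float, out_price_per_m: float) -> float:
    """Return USD cost for one request given per-million-token prices."""
    return (input_tokens / 1e6) * in_price_per_m + (output_tokens / 1e6) * out_price_per_m

# Example workload: 2,000 input tokens and 500 output tokens per request.
gpt45 = request_cost(2_000, 500, 75.0, 150.0)
gpt41 = request_cost(2_000, 500, 2.0, 8.0)

print(f"GPT-4.5 Preview: ${gpt45:.4f} per request")   # $0.2250
print(f"GPT-4.1:         ${gpt41:.4f} per request")   # $0.0080
print(f"Ratio: {gpt45 / gpt41:.0f}x")                 # ~28x
```

Under these assumptions a GPT-4.5 call costs on the order of 25-30x a comparable GPT-4.1 call, which is consistent with the cost argument the article makes.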
Altman Claims a ChatGPT Query Uses Only 0.34 Wh of Electricity. Is That Figure Credible?
36Kr· 2025-06-17 12:27
June 17 news: OpenAI CEO Sam Altman has for the first time disclosed concrete energy figures for ChatGPT queries. In a blog post he said that a ChatGPT query consumes on average 0.34 watt-hours of electricity (0.00034 kWh) and about 0.000085 gallons of water, roughly the electricity an energy-saving light bulb uses in two minutes, or about 1/15 of a teaspoon of water.

Coming from a leading company in the AI industry, OpenAI's disclosure is a landmark step that offers an important reference point for assessing AI's environmental impact, and it has sparked heated debate. This article examines the figure and lays out the arguments on both sides.

Is the 0.34 Wh figure credible?

The main support for the figure is corroboration from third-party research:

1) Independent research lines up

Industry reports put ChatGPT's query volume at about 1 billion per day. At 0.34 Wh per query, total daily consumption would be roughly 340 MWh. Technical analysts estimate that OpenAI would then need a cluster of about 3,200 servers built on Nvidia DGX A100 systems, which implies each server handling roughly 4.5 queries per second.

The figure's credibility rests first on its close agreement with third-party research. In a 2025 report, the research institute Epoch.AI estimated that a single GPT-4o query consumes about 0.0003 kWh, essentially consistent with OpenAI's number. Epoch. ...
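The arithmetic behind those estimates is easy to reproduce. A minimal sketch using the article's own assumptions (note that spreading 1 billion daily queries evenly over 3,200 servers comes out closer to 3.6 queries per second per server, so the quoted 4.5 presumably assumes less than full-time utilization):

```python
# Sanity-check of the back-of-the-envelope numbers quoted above. The
# 1 billion queries/day and 3,200-server figures are the article's
# estimates, not measurements.

QUERIES_PER_DAY = 1_000_000_000
WH_PER_QUERY = 0.34
SERVERS = 3_200
SECONDS_PER_DAY = 24 * 60 * 60

daily_energy_mwh = QUERIES_PER_DAY * WH_PER_QUERY / 1e6            # Wh -> MWh
queries_per_sec_total = QUERIES_PER_DAY / SECONDS_PER_DAY
queries_per_sec_per_server = queries_per_sec_total / SERVERS
avg_power_per_server_kw = daily_energy_mwh * 1e3 / SERVERS / 24    # kWh/day -> kW

print(f"Total daily energy:       {daily_energy_mwh:.0f} MWh")         # 340 MWh
print(f"Aggregate query rate:     {queries_per_sec_total:,.0f}/s")     # ~11,574/s
print(f"Per-server query rate:    {queries_per_sec_per_server:.1f}/s") # ~3.6/s
print(f"Per-server average power: {avg_power_per_server_kw:.1f} kW")   # ~4.4 kW
```

The derived per-server average power of about 4.4 kW is at least plausible for a DGX-class system, which is likely why the 3,200-server figure circulates as a consistency check.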
Prediction: Nvidia Stock Is Going to Hit $200 in 2025
The Motley Fool· 2025-06-12 08:55
Nvidia (NVDA -0.85%) stock has soared by 870% since the start of 2023, catapulting its market capitalization to a whopping $3.5 trillion. Demand continues to exceed supply for the company's graphics processing units (GPUs) for the data center, which are the most powerful chips in the world for developing artificial intelligence (AI) models. Nvidia CEO Jensen Huang says new AI reasoning models, which spend more time thinking to produce accurate responses, require up to 1,000 times more computing capacity than ...
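A quick consistency check on those figures (a sketch; the implied starting value is derived arithmetic, not stated in the article):

```python
# An 870% gain means the stock trades at 9.7x its start-of-2023 level, so
# the $3.5 trillion market cap implies a starting cap of roughly $360B.

current_market_cap = 3.5e12   # $3.5 trillion, per the article
gain = 8.70                   # 870% rise since the start of 2023

implied_start_cap = current_market_cap / (1 + gain)
print(f"Implied start-of-2023 market cap: ${implied_start_cap / 1e9:.0f}B")  # ~$361B
```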
Five Reasons Why Nvidia Cannot Be Replaced
半导体芯闻· 2025-06-06 10:20
In the increasingly heated global AI chip market, Huawei's Ascend 910C GPU was expected to help China reduce its reliance on Nvidia, but it has run into clear resistance.

Wccftech reports that major Chinese tech companies such as ByteDance, Alibaba, and Tencent have yet to place large orders for Huawei's AI chips, owing to Nvidia's deeply entrenched ecosystem (notably CUDA software) and the shortcomings of Huawei's products. Lacking orders from tech firms, the 910C GPU is instead being sold to large Chinese state-owned enterprises (SOEs) and local governments. This shift in market strategy underscores how hard it is for Huawei's AI chips to capture the mainstream market.

Source: Wccftech.

Huawei's AI chip push faces five major obstacles: multiple intertwined factors together create enormous resistance to the Ascend 910C GPU's market adoption. These obstacles not only limit Huawei's market penetration but also make Chinese tech giants wary of the product.

First, Nvidia's CUDA ecosystem is deeply entrenched. Many Chinese tech giants have invested heavily, in both money and time, in Nvidia's CUDA ecosystem. CUDA is the parallel computing platform and programming model Nvidia developed for its GPUs; it is widely used in AI training and high-performance computing, and its mature tools, libraries, and huge developer community form a moat that is hard to break. For these tech companies, once they leave ...
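To make that "moat" concrete at the code level, here is a minimal sketch (not from the article) of the device-selection logic found throughout typical PyTorch training scripts; migrating it to a non-CUDA accelerator means swapping this path for a vendor-specific backend and re-validating any custom CUDA kernels a project relies on:

```python
# Minimal sketch of how CUDA is baked into everyday framework code.
# Typical PyTorch training scripts pick their device like this; moving to
# a non-CUDA accelerator (Huawei ships a separate PyTorch plugin for
# Ascend) means replacing this path and re-testing every custom kernel.
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(1024, 1024).to(device)  # weights placed via the CUDA runtime
x = torch.randn(8, 1024, device=device)         # tensors allocated on the GPU
y = model(x)                                    # matmul dispatched to cuBLAS on CUDA devices

print(f"Running on: {device}, output shape: {tuple(y.shape)}")
```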
Morgan Stanley: DeepSeek R2 - A Next-Generation AI Reasoning Giant?
Morgan Stanley· 2025-06-06 02:37
Investment Rating
- The semiconductor production equipment industry is rated as Attractive [5][70].

Core Insights
- The imminent launch of DeepSeek R2, which features 1.2 trillion parameters and significant cost efficiencies, is expected to positively impact the Japanese semiconductor production equipment (SPE) industry [3][7][11].
- The R2 model's capabilities include enhanced multilingual support, broader reinforcement learning, multi-modal functionalities, and improved inference-time scaling, which could democratize access to high-performance AI models [7][9][11].
- The development of efficient AI models like R2 is anticipated to increase demand for AI-related SPE, benefiting companies such as DISCO and Advantest [11].

Summary by Sections

DeepSeek R2 Launch
- DeepSeek's R2 model is reported to have 1.2 trillion parameters, a significant increase from R1's 671 billion parameters, and utilizes a hybrid Mixture-of-Experts architecture [3][7].
- The R2 model offers cost efficiencies with input costs at $0.07 per million tokens and output costs at $0.27 per million tokens, compared to R1's $0.15-0.16 and $2.19 respectively [3][7].

Industry Implications
- The launch of R2 is expected to broaden the use of generative AI, leading to increased demand for AI-related SPE across the supply chain, including devices like dicers, grinders, and testers [11].
- The report reiterates an Overweight rating on DISCO and Advantest, which are positioned to benefit from the anticipated increase in demand for AI-related devices [11].

Company Ratings
- DISCO (6146.T) is rated Overweight with a target P/E of 25.1x [12].
- Advantest (6857.T) is also rated Overweight, with a target P/E of 14.0x [15].
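The reported pricing implies a steep cost reduction from R1 to R2. A minimal sketch of the arithmetic (the 3:1 input/output token mix used for the blended figure is an illustrative assumption, not from the report):

```python
# Rough check of the cost reduction implied by the reported R2 pricing.
# Prices are as quoted in the report; treat them as reported, not confirmed.

r1_in, r1_out = 0.155, 2.19   # $ per million tokens (midpoint of $0.15-0.16 for input)
r2_in, r2_out = 0.07, 0.27

print(f"Input cost reduction:  {r1_in / r2_in:.1f}x cheaper")    # ~2.2x
print(f"Output cost reduction: {r1_out / r2_out:.1f}x cheaper")  # ~8.1x

# Blended cost for a workload with an assumed 3:1 input-to-output token mix.
def blended(in_price: float, out_price: float, in_share: float = 0.75) -> float:
    return in_share * in_price + (1 - in_share) * out_price

print(f"Blended: ${blended(r2_in, r2_out):.3f} vs ${blended(r1_in, r1_out):.3f} per M tokens")
```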
Morgan Stanley: DeepSeek R2 May Be Launching Soon - Implications for Japan's SPE Industry
Morgan Stanley· 2025-06-06 02:37
Investment Rating
- The semiconductor production equipment industry is rated as Attractive [5]

Core Insights
- The imminent launch of DeepSeek R2, which features 1.2 trillion parameters and significant cost efficiencies, is expected to positively impact the Japanese semiconductor production equipment (SPE) industry [3][7]
- The development of lightweight, high-performing AI models like DeepSeek R2 is anticipated to democratize access to generative AI, thereby expanding the market for AI-related SPE [11]

Summary by Sections

DeepSeek R2 Characteristics
- DeepSeek R2 is reported to have 1.2 trillion parameters, with 78 billion active parameters, and utilizes a hybrid Mixture-of-Experts architecture [3]
- The input cost for R2 is $0.07 per million tokens, significantly lower than R1's $0.15-0.16, while the output cost is $0.27 compared to R1's $2.19 [3][7]
- Enhanced multilingual capabilities and broader reinforcement learning are key upgrades in R2, allowing it to handle various data types including text, image, voice, and video [9][11]

Market Implications
- The anticipated launch of R2 is expected to boost demand for AI-related devices, including GPU and HBM, as well as custom chips and other AI devices [11]
- The report reiterates an Overweight rating on DISCO and Advantest, which are expected to benefit from increased demand for AI-related devices [7][11]

Company Ratings
- Advantest (6857.T) is rated Overweight with a target price of ¥10,300 based on expected earnings peak [16]
- DISCO (6146.T) is also rated Overweight with a target P/E of 25.1x based on earnings estimates [13]
The ASIC Market Keeps Getting Bigger
36Kr· 2025-06-05 11:05
That the ASIC market is growing has long been industry consensus. What is surprising is how fast it is growing. Morgan Stanley expects the AI ASIC market to expand from $12 billion in 2024 to $30 billion in 2027, a compound annual growth rate of 34%. For comparison, the high-performance computing GPU market's CAGR for 2023-2029 is 25%, while CPUs and APUs are growing at only 5% and 8% respectively.

01 The ASIC market: the pie is expanding

The ASIC market is growing. TrendForce's latest research notes that, as demand for AI servers surges, major US cloud service providers (CSPs) are accelerating in-house development of application-specific integrated circuits (ASICs), releasing a new generation on average every one to two years. In China, the AI server market is adapting to the new US export controls in effect since April 2025. These measures are projected to cut the market share of imported chips (such as NVIDIA and AMD products) from 63% in 2024 to about 42% in 2025. Meanwhile, with policies actively promoting domestic AI processors, Chinese chipmakers' share is expected to climb to about 40%, roughly on par with imported chips.

Custom silicon is an economic choice, not a technological one. The single most important driver behind the growing ASIC pie is money. For now, GPU servers remain end users' first ...
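A quick check that the quoted market sizes and growth rate hang together (a sketch; the derived numbers are simple arithmetic on the figures above):

```python
# Consistency check of the $12B (2024) -> $30B (2027) forecast and the
# quoted 34% CAGR.

start, end, years = 12e9, 30e9, 3
cagr = (end / start) ** (1 / years) - 1
print(f"Implied CAGR: {cagr:.1%}")   # ~35.7%, close to the quoted 34%

# Conversely, 34% compounded over three years from $12B:
print(f"$12B at 34% for 3 years: ${12e9 * 1.34**3 / 1e9:.1f}B")  # ~$28.9B
```

The quoted 34% and the endpoint figures are consistent to within rounding.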
NVIDIA Powers World's Largest Quantum Research Supercomputer
GlobeNewswire News Room· 2025-05-19 04:43
Core Insights
- NVIDIA has launched the Global Research and Development Center for Business by Quantum-AI Technology (G-QuAT), featuring the ABCI-Q supercomputer, which is the largest research supercomputer dedicated to quantum computing globally [1][14]
- The ABCI-Q supercomputer integrates 2,020 NVIDIA H100 GPUs connected via the NVIDIA Quantum-2 InfiniBand networking platform, facilitating unprecedented quantum-GPU computing capabilities [3][2]
- The collaboration between NVIDIA and Japan's National Institute of Advanced Industrial Science and Technology (AIST) aims to advance quantum error correction and application development, essential for building practical quantum supercomputers [4][5]

Industry Impact
- Quantum processors are expected to enhance AI supercomputers in addressing complex challenges across various sectors, including healthcare, energy, and finance [2]
- The integration of quantum hardware with AI supercomputing is anticipated to accelerate the realization of quantum computing's potential [4]
- ABCI-Q will enable researchers to tackle core challenges in quantum computing technologies, expediting the development of practical use cases [5]