AI Inference
This Technology Upends Chip Stacking
半导体行业观察· 2026-01-09 01:53
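MIT researchers have proposed a new approach to one of the most stubborn efficiency problems in modern computing: the energy consumed in moving data between logic circuits and memory. The team recently showed that adding extra layers of active devices in the back end of line (BEOL) of a conventional CMOS chip can turn an area normally used for wiring into a stack that holds both logic transistors and memory transistors. The researchers presented two related papers at IEEE IEDM, one centered on BEOL indium oxide transistors and the other on BEOL nanoscale ferroelectric memory devices. 3D stacking is not new, but applying monolithic stacking directly on top of finished logic circuits is limited by temperature. Standard silicon device fabrication requires a thermal budget that can damage previously built transistors and metal layers. The MIT team's core strategy is to avoid building new silicon devices in the front end and instead add active layers at the back end of the chip, where wires and metal bonds traditionally sit. This "flip" matters because it turns the BEOL into additional device space without subjecting the underlying CMOS to further high-temperature front-end processing. It also shortens the physical paths between compute, embedded memory, and interconnect, avoiding the energy wasted in conventional layouts. The architecture MIT proposes is a vertically integrated device stack, fabricated on ...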
A Heavyweight Indeed! One Remark From Jensen Huang Ignites the Market: A Hot Stock Has Soared 1,080% While Another Group Slumps
Xin Lang Cai Jing· 2026-01-07 05:37
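Source: 美股财经社. A few brief remarks by Nvidia CEO Jensen Huang in his CES 2026 keynote sent US storage stocks soaring and pushed data center cooling stocks lower across the board. Storage stocks surge: on Tuesday, shares of SanDisk Corp. (SNDK) jumped nearly 28% to $349.63, another record high. The stock has risen roughly tenfold since last February, giving the company a market capitalization of $51.24 billion. Earlier, Huang had spoken at the CES technology show and stressed the need for memory and storage. Huang said: "In terms of storage, this is currently a completely untapped market. This is a market that has never existed, and it will very likely become the largest storage market in the world, holding the working memory of the world's AI." He stressed that AI-related storage demand already exceeds what existing infrastructure can handle and said the volume of data to be processed "is just too large now." Since it began trading last February, SanDisk has shown strong momentum: it gained more than 47% in the first three trading days of 2026 and has soared a cumulative 1,080% since bottoming on April 22 last year. On Tuesday the stock was the best performer in the S&P 500, followed by storage companies Western Digital (WDC), Seagate Technology (STX), and Micron Technology (MU), all three of which posted double-digit percentage gains, collectively ...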
Google AI Paper Trends: Inference Is King
Huafu Securities· 2025-12-31 02:43
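Key investment points. Viewed through Google's frontier product, Deep Think mode: long thinking during inference. On December 4, 2025, Google rolled out Gemini 3 Deep Think mode in the Gemini app to Google AI Ultra subscribers, pushing the frontier of reasoning capability through parallel thinking techniques. Its predecessor, Gemini 2.5 Deep Think, encourages the model to use extended reasoning paths so that, over time, Deep Think becomes a better and more intuitive problem solver. By lengthening inference time, or thinking time, the AI gets more time to explore different hypotheses and propose creative solutions to complex problems. Viewed through Google's recent papers: new algorithmic advances on the inference side. A new architecture supports learning and memory at inference time. Google's paper "Titans: Learning to Memorize at Test Time" proposes a new neural long-term memory module that learns to memorize historical context and helps the attention mechanism focus on the current context while drawing on information from the distant past. The neural memory module offers both fast, parallelizable training and efficient inference. From a memory perspective, attention, with its limited but precise modeling of context dependencies, can be seen as short-term memory, whereas neural memory, by virtue of its data ...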
Bernstein: The Nvidia-Groq Deal Is Strategically Significant
Xin Lang Cai Jing· 2025-12-29 12:39
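Source: 滚动播报. Bernstein analyst Stacy A. Rasgon reiterated an Outperform rating on Nvidia and maintained a $275 price target. Reports said Nvidia had reached a $20 billion agreement with AI chip startup Groq, which was later confirmed to be a non-exclusive license to Groq's inference technology. Groq's core management team will join Nvidia, while Groq itself will continue to operate independently under new CEO Simon Edwards and keep running its cloud business. Bernstein sees the deal as strategically significant. First, it can shore up Nvidia's position in AI inference, a market that is more competitive than model training. Second, as inference demand keeps growing, it will further reinforce Nvidia's position as the industry leader. ...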
Muxi Co. (沐曦股份), the Second Listed Domestic GPU Maker, Soars Nearly 560%; One Winning Lot Earns Nearly 300,000 Yuan
Xin Hua Cai Jing· 2025-12-17 01:54
Group 1
- The article highlights the successful IPO of Muxi Co., a leading domestic high-performance general-purpose GPU company, whose stock surged 559% on its debut, reaching approximately 690 yuan per share and a total market capitalization close to 280 billion yuan [1][2].
- Muxi Co. issued shares at 104.66 yuan each, the second-highest IPO price on the STAR Market this year, behind Moore Threads (摩尔线程) [2].
- The funds raised from the IPO will be allocated to the development and industrialization of new high-performance general-purpose GPUs, AI inference GPUs, and advanced GPU technology for emerging applications [2].

Group 2
- Muxi Co. focuses on the independent research and development of full-stack high-performance GPU chips and computing platforms. Its key products include the Xisi N series for intelligent computing inference and the Xiyun C series for training and general computing [2].
- The latest product, the Xiyun C600 series, is positioned between NVIDIA's A100 and H100 in performance. It is expected to enter risk production by the end of this year, with formal mass production slated for the first half of next year [2].
- In the A-share market, seven semiconductor stocks have listed this year, with an average first-day gain of approximately 242.94% [3].
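The first-day figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only the issue price and the percentage gain from the summary. The lot size of 500 shares is an assumption, not stated in the summary, used to see whether the headline's profit per winning lot is plausible.

```python
# Figures quoted in the summary above.
ISSUE_PRICE_YUAN = 104.66  # IPO price per share
FIRST_DAY_GAIN = 5.59      # a 559% surge, written as a fraction

# Assumption (not stated in the summary): one winning lot is 500 shares.
SHARES_PER_LOT = 500


def first_day_price(issue_price: float, gain: float) -> float:
    """Price per share after rising by `gain` from the issue price."""
    return issue_price * (1.0 + gain)


def profit_per_lot(issue_price: float, gain: float, shares: int) -> float:
    """Paper profit on one lot bought at the issue price."""
    return issue_price * gain * shares


price = first_day_price(ISSUE_PRICE_YUAN, FIRST_DAY_GAIN)
profit = profit_per_lot(ISSUE_PRICE_YUAN, FIRST_DAY_GAIN, SHARES_PER_LOT)

print(f"Implied first-day price: {price:.2f} yuan")
print(f"Implied profit per lot:  {profit:,.0f} yuan")
```

Under that lot-size assumption, the result lands close to both the roughly 690 yuan share price and the headline's profit of nearly 300,000 yuan per lot.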
Apple's First Server Chip: More Details Emerge
半导体行业观察· 2025-12-16 01:22
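Apple is well known for its enthusiasm for vertical integration, keeping key technology nodes in-house wherever possible, and its vast custom chip design effort is perhaps the best illustration of that model. Of course, the architecture of an inference chip differs fundamentally from that of a chip used to train AI models: the former places more emphasis on latency and throughput. AI inference chips also rely on lower-precision arithmetic, such as INT8. Given this background, it is reasonable to infer that Apple and Broadcom are likely to focus on these aspects as they advance the overall design of Baltra. Meanwhile, Apple's large custom chip lineup keeps expanding. Beyond the well-known A-series and M-series chips, Apple now also uses its in-house C1 modem chip. The Cupertino giant may also introduce a derivative of its Apple Watch S-series chip for the AI smart glasses it plans to release next year. Reference link: https://wccftech.com/apples-ai-server-chip-baltra-likely-to-be-used-primarily-for-ai-inference/ (Source: compiled from wccftech) Of course, actual deployment of these custom AI chips is expected to take place in 2027 ...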
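To make the INT8 point concrete, here is a minimal sketch of symmetric per-tensor INT8 quantization in plain Python. It is a generic illustration of low-precision arithmetic, not a description of Baltra or of any Apple or Broadcom design. The function names and sample weights are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the INT8 codes."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.27, 0.03, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("INT8 codes:", q)
print("scale:", scale)
print("max round-trip error:", max_error)
```

Each value is stored in one byte instead of four, and the round-trip error is bounded by half the scale. That trade of a small accuracy loss for lower memory traffic is why inference hardware commonly favors such formats.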
Tomorrow (December 5): Moore Threads Debuts on the A-Share Market and Muxi Co. Opens Subscriptions
Xin Hua Cai Jing· 2025-12-04 14:25
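Tomorrow (December 5), Moore Threads will list on the STAR Market. On the same day, Muxi Co. (沐曦股份), another domestic GPU company, will open its IPO for subscription at an issue price of 104.66 yuan per share, corresponding to a diluted static price-to-sales ratio of 56.35 for 2024. At this issue price, the expected market capitalization at listing is about 41.874 billion yuan. Muxi Co. said the offering comprises 40.10 million shares, or 10.02% of total share capital after the offering. If the offering succeeds, it expects gross proceeds of 4.197 billion yuan, to be invested in the "New High-Performance General-Purpose GPU R&D and Industrialization Project," the "Next-Generation AI Inference GPU R&D and Industrialization Project," and the "High-Performance GPU Technology R&D Project for Frontier Fields and Emerging Application Scenarios." According to the prospectus, Muxi Co. is one of the leading domestic makers of high-performance general-purpose GPUs and is committed to independently developing full-stack high-performance GPU chips and computing platforms. Its flagship Xiyun C series of combined training-and-inference GPU chips is built on fully self-developed GPU IP, instruction set, and architecture. It has reached a leading domestic level in generality, single-card performance, cluster performance and stability, and ecosystem compatibility and migration efficiency, giving it strong overall competitiveness. According to data calculated by Bernstein Research on a sales-revenue basis, combined with IDC data and results calculated on a computing-power basis, Muxi Co. in China's 2024 AI chip ...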
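The offering figures above are linked by simple arithmetic, sketched below. All inputs come from the summary: the issue price, the number of shares offered (40.10 million, 4010万股), and the offered shares' 10.02% share of post-offering capital. Only the variable names are introduced here. Because 10.02% is itself a rounded figure, the implied market capitalization is expected to match the quoted 41.874 billion yuan (418.74亿元) only approximately.

```python
# Inputs quoted in the summary above.
ISSUE_PRICE_YUAN = 104.66           # yuan per share
SHARES_OFFERED = 40_100_000         # 40.10 million shares
OFFERED_FRACTION_OF_TOTAL = 0.1002  # 10.02% of post-offering share capital

BILLION = 1e9

# Gross proceeds = issue price x shares offered.
gross_proceeds = ISSUE_PRICE_YUAN * SHARES_OFFERED

# Total post-offering shares implied by the (rounded) 10.02% figure.
implied_total_shares = SHARES_OFFERED / OFFERED_FRACTION_OF_TOTAL

# Market capitalization at the issue price.
implied_market_cap = ISSUE_PRICE_YUAN * implied_total_shares

print(f"Gross proceeds:     {gross_proceeds / BILLION:.3f} billion yuan")
print(f"Implied market cap: {implied_market_cap / BILLION:.3f} billion yuan")
```

The proceeds reproduce the quoted 4.197 billion yuan (41.97亿元). The implied market capitalization differs from the quoted figure by roughly 0.01 billion yuan, which is consistent with rounding in the 10.02% input.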
Broadcom: AI Inference Demand Is Surging, With Room for a Sharp Rise
美股研究社· 2025-11-28 11:06
Core Viewpoint
- The artificial intelligence ecosystem is transitioning from the training phase to the inference phase, becoming a strong revenue engine for large tech companies and providing structural growth benefits for Broadcom's custom chips and networking products [1][22].

Group 1: AI Demand and Market Trends
- There is a significant increase in demand for AI inference, which is expected to drive custom chip demand in the second half of 2026, leading to revenue growth in AI business [1][5].
- Major tech companies, including Google and ByteDance, are increasingly adopting Broadcom's custom chips, which are more cost-effective compared to Nvidia's GPUs [2][4].

Group 2: Custom Chip Advantages
- Broadcom's custom accelerators are significantly cheaper than Nvidia's GPUs, with performance improvements in each generation [2].
- Google's upcoming seventh-generation Tensor Processing Unit (TPU) Ironwood is designed specifically for inference, showcasing the trend towards more efficient custom solutions [4].

Group 3: Financial Performance
- Broadcom reported a 22% year-over-year revenue growth in Q3, reaching $15.95 billion, driven by strong performance in custom AI accelerators and networking switches [11].
- The AI semiconductor business saw a 63% year-over-year revenue increase, contributing significantly to overall revenue [13].

Group 4: Future Projections
- Broadcom anticipates a substantial increase in AI business revenue, projecting it could reach nearly $54 billion by FY2027, accounting for about 50% of total revenue [5][12].
- The company expects to see a 34.9% year-over-year revenue growth in FY2026, reaching $85.4 billion [12].

Group 5: Networking Solutions
- Broadcom is focusing on its Tomahawk 6 switch, which is the first Ethernet switch with a capacity of 102.4 Tbps, facilitating the deployment of large-scale AI accelerator clusters [9][10].
- The shift from Nvidia's GPU+InfiniBand ecosystem to Ethernet is beneficial for Broadcom, as demand for Ethernet solutions is on the rise [8].

Group 6: Cash Flow and Valuation
- Broadcom has a strong cash flow generation capability, converting 44% of revenue into free cash flow, which supports its valuation premium [18][19].
- The company maintains a competitive valuation compared to Nvidia, with a forward P/E ratio of 36.9, indicating strong profit margins and growth potential [19][20].
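The growth rates and revenue levels quoted in the Financial Performance and Future Projections items imply some base figures that the summary does not state. The sketch below derives them. The derived numbers are arithmetic consequences of the quoted figures, not figures reported by Broadcom, and the helper name `implied_base` is introduced here for illustration.

```python
def implied_base(value: float, growth: float) -> float:
    """Prior-period value implied by a current value and its growth rate."""
    return value / (1.0 + growth)


# Quoted: Q3 revenue of $15.95B, up 22% year over year.
prior_year_q3 = implied_base(15.95, 0.22)

# Quoted: FY2026 revenue of $85.4B, up 34.9% year over year.
implied_fy2025 = implied_base(85.4, 0.349)

# Quoted: FY2027 AI revenue near $54B, about 50% of total revenue.
implied_fy2027_total = 54.0 / 0.50

print(f"Implied prior-year Q3 revenue: ${prior_year_q3:.2f}B")
print(f"Implied FY2025 revenue:        ${implied_fy2025:.2f}B")
print(f"Implied FY2027 total revenue:  ${implied_fy2027_total:.0f}B")
```

Since the summary describes the AI share as "about 50%" and the AI revenue as "nearly $54 billion," the implied FY2027 total should be read as a rough order of magnitude only.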
From Hot iPhone 17 Sales to the "AI Inference Blue Ocean": Apple (AAPL.US) Quietly Moves Onto a New Bull-Market Track
智通财经网· 2025-09-30 04:43
Core Viewpoint
- Bank of America highlights strong demand for Apple's iPhone 17 series, despite initial user criticism regarding lack of standout features, driven by significant upgrades in AI capabilities and key performance metrics [1][2].

Group 1: iPhone 17 Demand and Delivery
- The delivery cycle for the iPhone 17 series is significantly longer than last year's models, indicating strong demand, with the average delivery time around 19 days compared to 5 days for the iPhone 16 series [2][3].
- In China, the standard iPhone 17 has a delivery time of up to 25 days, while other international regions average about 18 days, reflecting robust demand [3].
- The iPhone 17 Pro and Pro Max models have delivery times similar to last year, with Pro Max slightly longer at 21 days, while the Pro model remains at 14 days [3].

Group 2: Market Sentiment and Stock Performance
- Apple's stock has rebounded over 10% since September, driven by strong iPhone 17 demand and market optimism regarding its potential benefits from the AI sector, with analysts projecting a target price of $300 [2].
- As of the latest market close, Apple's stock price was $254.43, with a market capitalization of $3.8 trillion, ranking just behind Nvidia and Microsoft [2].

Group 3: AI Market Potential
- Bernstein's report anticipates a massive $1 trillion opportunity in AI inference systems by 2030, benefiting large tech companies like Apple focused on IT hardware and consumer electronics [1][5].
- The AI infrastructure market is expected to see exponential growth, with Nvidia's CEO predicting AI infrastructure spending could reach $3 trillion to $4 trillion by 2030 [5][6].
- Apple is positioned as a key player in the AI inference revolution, with its extensive ecosystem of 2.35 billion active devices providing a significant advantage for integrating AI capabilities [6][7].
NPUs Have a Bright Future
半导体行业观察· 2025-08-28 01:14
Core Insights
- The global AI inference market is expected to grow rapidly: it reached approximately $10.6 billion in 2023 and is projected to increase to about $25.5 billion by 2030, with a CAGR of around 19% [2].
- The NPU market is anticipated to expand due to the demand for higher inference throughput, lower latency, and improved energy efficiency, which NPU technology is well suited to meet [2].
- Companies like SambaNova and Groq are leading the NPU market, focusing on specialized AI applications and cloud-based services [3].

Group 1
- The AI inference market is projected to grow from $10.6 billion in 2023 to $25.5 billion by 2030, indicating a significant market opportunity [2].
- NPU technology is emerging as a viable alternative to traditional GPUs, offering low power consumption and high efficiency tailored for AI applications [2].
- The semiconductor industry is shifting towards application-specific integrated circuits (ASICs) for AI, moving away from mature CPU and GPU technologies [2].

Group 2
- SambaNova integrates its dataflow-architecture NPU with proprietary software, targeting major clients including the U.S. government and financial institutions [3].
- Groq specializes in real-time inference with its custom-designed chips, focusing on cloud-based LLM services for high-speed data center applications [3].
- AI semiconductor companies must prioritize energy efficiency and target customized markets to compete effectively against general-purpose GPUs like those from Nvidia [3].
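A compound annual growth rate (CAGR) ties together a start value, an end value, and a number of years, so the market figures above can be checked against each other. The sketch below assumes the growth period runs from 2023 to 2030, that is, seven years. The summary does not state the base period behind its CAGR figure, so this is an assumption.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values."""
    return (end / start) ** (1.0 / years) - 1.0


def project(start: float, rate: float, years: int) -> float:
    """Value after compounding `start` at `rate` for `years` years."""
    return start * (1.0 + rate) ** years


START_2023 = 10.6    # $ billions, quoted
END_2030 = 25.5      # $ billions, quoted
YEARS = 2030 - 2023  # assumed growth period

implied_rate = cagr(START_2023, END_2030, YEARS)
value_at_19_percent = project(START_2023, 0.19, YEARS)

print(f"CAGR implied by the endpoints: {implied_rate:.1%}")
print(f"2030 value at a 19% CAGR:      ${value_at_19_percent:.1f}B")
```

Under the seven-year assumption, the two endpoints imply a rate noticeably below 19%, and compounding at 19% would overshoot the 2030 figure. The quoted CAGR may therefore rest on a different base year or market definition, which cannot be determined from the summary alone.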