A Replacement for the GPU: What Is an LPU?
半导体行业观察· 2025-08-03 03:17
Moonshot's Kimi K2 recently launched in preview on GroqCloud, and developers keep asking us: how does Groq run a 1-trillion-parameter model so fast?

Traditional hardware forces a choice: faster inference at the cost of quality, or more accurate inference at unacceptable latency. This trade-off exists because GPU architectures are optimized for training workloads. The LPU, hardware designed specifically for inference, preserves quality while removing the architectural bottlenecks that cause latency.

The key numerics and memory choices:
- Memory architecture: SRAM as the primary memory
- FP32 for attention logits, where single-bit errors propagate
- Block floating point for mixture-of-experts (MoE) weights, where robustness studies show no measurable degradation
- FP8 storage for activations in error-tolerant layers

Traditional accelerators inherit a memory hierarchy designed for training: DRAM and HBM as main storage, backed by complex cache systems. Both DRAM and HBM add significant latency to every weight fetch, hundreds of nanoseconds per access. That is acceptable for high-batch training, where temporal locality is predictable and arithmetic intensity is high, but inference executes layers sequentially at much lower arithmetic intensity, which exposes the latency penalty of DRAM and HBM.

Accuracy without trade-offs: TruePoint Numerics
Traditional acc ...
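To make the block floating point point concrete, here is a minimal sketch of a block-shared-exponent format and why its round-trip error can be small enough for MoE weights. This is an illustrative assumption, not Groq's TruePoint implementation: the block size, mantissa width, and the names `bfp_quantize` and `bfp_dequantize` are hypothetical.

```python
# Minimal sketch, not Groq's TruePoint implementation: a block floating point
# (BFP) format in which each block of values shares one exponent and each
# value keeps only a small signed-integer mantissa. Block size, mantissa
# width, and function names here are illustrative assumptions.
import numpy as np

def bfp_quantize(x: np.ndarray, block_size: int = 32, mantissa_bits: int = 8):
    """Quantize a 1-D float array into blocks that share a single exponent."""
    pad = (-len(x)) % block_size
    blocks = np.pad(x, (0, pad)).reshape(-1, block_size)
    max_abs = np.abs(blocks).max(axis=1, keepdims=True)
    max_abs[max_abs == 0] = 1.0                      # avoid log2(0) for all-zero blocks
    exponents = np.ceil(np.log2(max_abs))            # one shared exponent per block
    scale = 2.0 ** exponents
    qmax = 2 ** (mantissa_bits - 1) - 1              # signed mantissa range, e.g. +/-127
    mantissas = np.clip(np.round(blocks / scale * qmax), -qmax, qmax).astype(np.int8)
    return mantissas, exponents, len(x)

def bfp_dequantize(mantissas, exponents, length, mantissa_bits: int = 8):
    """Reconstruct float values from shared-exponent blocks."""
    qmax = 2 ** (mantissa_bits - 1) - 1
    blocks = mantissas.astype(np.float32) / qmax * (2.0 ** exponents)
    return blocks.reshape(-1)[:length]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal(4096).astype(np.float32) * 0.02   # toy stand-in for an MoE weight slice
    m, e, n = bfp_quantize(w)
    w_hat = bfp_dequantize(m, e, n)
    # Error measured relative to the largest weight magnitude stays well under 1%.
    print("max error / max |w|:", np.abs(w - w_hat).max() / np.abs(w).max())
```

Sharing one exponent across a block keeps per-value storage close to the mantissa width while bounding the error relative to the block's largest value, which is broadly consistent with the article's claim that robustness studies show no measurable degradation for MoE weights stored this way.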
Chip Upstarts Pivot Together
半导体芯闻· 2025-05-12 10:08
Core Viewpoint
- The AI chip market is shifting focus from training to inference, as companies find it increasingly difficult to compete in the training space dominated by Nvidia and others [1][20].

Group 1: Market Dynamics
- Nvidia continues to lead the training chip market, while companies like Graphcore, Intel Gaudi, and SambaNova are pivoting towards the more accessible inference market [1][20].
- The training market requires significant capital and resources, making it challenging for new entrants to survive [1][20].
- The shift towards inference is seen as a strategic move to find more scalable and practical applications in AI [1][20].

Group 2: Graphcore's Transition
- Graphcore, once a strong competitor to Nvidia, is now focusing on inference as a means of survival after facing challenges in the training market [6][4].
- The company has optimized its Poplar SDK for efficient inference tasks and is targeting sectors like finance and healthcare [6][4].
- Graphcore's previous partnerships, such as with Microsoft, have ended, prompting a need to adapt to the changing market landscape [6][5].

Group 3: Intel Gaudi's Strategy
- Intel's Gaudi series, initially aimed at training, is now being integrated into a new AI acceleration product line that emphasizes both training and inference [10][11].
- Gaudi 3 is marketed for its cost-effectiveness and performance in inference tasks, particularly for large language models [10][11].
- Intel is merging its Habana and GPU departments to streamline its AI chip strategy, indicating a shift in focus towards inference [10][11].

Group 4: Groq's Focus on Inference
- Groq, originally targeting the training market, has pivoted to provide inference-as-a-service, emphasizing low latency and high throughput [15][12].
- The company has developed an AI inference engine platform that integrates with existing AI ecosystems, aiming to attract industries sensitive to latency [15][12].
- Groq's transition highlights the growing importance of speed and efficiency in the inference market [15][12].

Group 5: SambaNova's Shift
- SambaNova has transitioned from a focus on training to offering inference-as-a-service, allowing users to access AI capabilities without complex hardware [19][16].
- The company is targeting sectors with strict compliance needs, such as government and finance, providing tailored AI solutions [19][16].
- This strategic pivot reflects the broader trend of AI chip companies adapting to market demands for efficient inference solutions [19][16].

Group 6: Inference Market Characteristics
- Inference tasks are less resource-intensive than training, allowing companies with limited capabilities to compete effectively [21][20].
- The shift to inference is characterized by a focus on cost, deployment, and maintainability, moving away from the previous emphasis on raw computational power [23][20].
- The competitive landscape is evolving, with smaller teams and startups finding opportunities in the inference space [23][20].
Chip Upstarts Pivot Together
半导体行业观察· 2025-05-10 02:53
In the sweeping arena of AI chips, large-scale training, once hailed as the technology's "holy grail," is quietly giving way to the lower-profile but more practical inference market.

Nvidia still runs far ahead in the training chip market, and Cerebras continues its all-in bet on ultra-large-scale computing platforms. But other players who once fought fiercely over training chips, including Graphcore, Intel Gaudi, and SambaNova, are quietly shifting to a different battlefield: AI inference.

This trend is no accident.

AI training is a capital-, compute-, and software-ecosystem-intensive business. Nvidia's CUDA toolchain, mature GPU ecosystem, and broad framework compatibility give it nearly all of the leverage in training chips. Cerebras has taken a different route with a training platform built around its giant chip, but it remains confined to research institutions and a handful of commercial use cases.

In this landscape, new chip companies have almost no room to survive in the training market. "The training chip market is not an arena for most players," one AI infrastructure founder admits. "Just landing a single large-model training order means burning tens of millions of dollars, and you may not even win."

Precisely for this reason, the startups that once went head-to-head with Nvidia on training chips are now looking for applications that are easier to enter and easier to scale. Inference chips have become the obvious choice.

Gr ...