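LPU Special Report (1): Architectural Innovation Breaks the Large-Model Inference Latency Bottleneck; Broad Market Poised for Rapid Volume Ramp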
CAITONG SECURITIES·2026-03-16 06:45

Investment Rating
- The report maintains a "Positive" investment rating for the industry [2]

Core Insights
- LPU is a new generation chip designed for large model inference, centered around the TSP architecture, which optimizes the execution order and timing of instructions, enhancing performance and reducing hardware complexity [3][11]
- LPU can significantly reduce inference latency in large models, improving user experience by addressing memory bandwidth bottlenecks during the decoding phase [7][41]
- The LPU market is poised for rapid growth, having entered the initial production phase, with a substantial increase in token consumption driving demand for inference chips [7][69]

Summary by Sections

Section 1: LPU and TSP Architecture
- LPU is a custom chip for large model inference, designed for compute-intensive tasks with a focus on optimizing inference efficiency [11]
- The TSP architecture includes five functional slices, allowing for deterministic instruction execution and improved performance [17][28]
- The design enables software-defined hardware, where the compiler can directly control the chip's hardware state [30]

Section 2: Reducing Inference Latency
- Inference latency is closely linked to user experience, primarily occurring during the decoding phase, which is bandwidth-constrained [41][61]
- LPU's faster memory bandwidth addresses these latency issues, enhancing the overall performance of large models [62]
- The LPU-based models offer faster inference speeds and cost-effectiveness, with significant performance metrics reported [64][67]

Section 3: Market Potential and Production
- The rapid growth in token consumption indicates a high growth potential for the inference chip market, with projections showing a significant increase in market size by 2031 [69][70]
- LPU has entered the initial production phase, with both international and domestic companies advancing in the market [71][74]
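
The bandwidth constraint on decoding noted in Section 2 can be illustrated with a back-of-envelope estimate. The sketch below is not taken from the report. It assumes a simplified model in which generating each token requires reading all model weights from memory once, and it ignores KV-cache traffic, batching, and compute time. The function name and all numeric values (parameter count, bytes per parameter, bandwidth figures) are hypothetical and chosen only for illustration.

```python
def decode_tokens_per_second(param_count, bytes_per_param, bandwidth_bytes_per_s):
    """Rough upper bound on single-stream decode speed.

    Assumption: every generated token requires streaming all weights
    from memory once, so time per token = weight bytes / bandwidth.
    """
    bytes_per_token = param_count * bytes_per_param
    seconds_per_token = bytes_per_token / bandwidth_bytes_per_s
    return 1.0 / seconds_per_token


# Hypothetical inputs, not figures from the report.
PARAMS = 70e9            # 70 billion parameters (assumed)
BYTES_PER_PARAM = 2      # 16-bit weights (assumed)
LOWER_BANDWIDTH = 1e12   # 1 TB/s memory bandwidth (assumed)
HIGHER_BANDWIDTH = 10e12 # 10 TB/s memory bandwidth (assumed)

slow = decode_tokens_per_second(PARAMS, BYTES_PER_PARAM, LOWER_BANDWIDTH)
fast = decode_tokens_per_second(PARAMS, BYTES_PER_PARAM, HIGHER_BANDWIDTH)

print(f"lower-bandwidth bound:  {slow:.1f} tokens/s")
print(f"higher-bandwidth bound: {fast:.1f} tokens/s")
print(f"speedup from bandwidth alone: {fast / slow:.1f}x")
```

Under this simplified model the decode speed bound scales linearly with memory bandwidth, which is consistent with the summary's point that higher bandwidth lowers decode latency.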
