National Model and National Chip (国模国芯)
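China's first domestic thousand-card AI inference cluster is deployed, using Yuntian Lifei's fully self-developed AI inference chips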
IPO早知道· 2026-03-12 05:38
Core Viewpoint
- The article discusses the establishment of an AI inference cluster by Yuntian Lifei in Zhanjiang, which aims to create a "national model and national chip" ecosystem, leveraging domestic AI technologies to support various industry applications and enhance local digital transformation [3][14].

Group 1: Project Overview
- Yuntian Lifei won a bid for the Zhanjiang AI penetration support project with a contract amount of 420 million yuan, focusing on building a domestic AI inference cluster based on self-developed AI inference acceleration cards [3].
- The cluster will utilize domestic large models like DeepSeek to provide AI capabilities for government and industry applications, aiming to create a model for the "national model and national chip" ecosystem [3][14].

Group 2: Technical Architecture
- The AI inference cluster is designed to meet high concurrency, high throughput, and low latency requirements, employing a "Prefill-Decode separation" architecture to optimize resource allocation during different processing stages [6].
- The system architecture prioritizes optimizing the Prefill phase while balancing the Decode phase, ensuring high throughput efficiency even in long-context inference scenarios [7].

Group 3: Chip Development and Cost Efficiency
- The AI inference cluster will be built in three phases, all utilizing Yuntian Lifei's self-developed AI inference acceleration cards, with the first phase deploying the X6000 inference acceleration card [10].
- Future plans include launching three generations of AI inference chips over the next three years, focusing on optimizing both Prefill and Decode phases to achieve millisecond-level inference latency [11].

Group 4: Industry Implications
- The establishment of the Zhanjiang AI inference cluster represents a shift in the AI infrastructure development logic, moving from merely pursuing computational scale to emphasizing efficiency and cost [13].
- The cluster is expected to provide a significant computational foundation for local industry digital transformation and facilitate the collaborative development of domestic models and chips [14].
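
The "Prefill-Decode separation" idea mentioned under Technical Architecture can be illustrated with a toy scheduler. This is only a minimal sketch under stated assumptions, not Yuntian Lifei's implementation: the names `Request`, `prefill`, `decode_step`, and `serve` are hypothetical, the KV cache is modeled as a plain list rather than tensors, and the two "pools" are simple queues in one process. It shows prompt processing (typically compute-bound) and token generation (typically memory-bandwidth-bound) being handled by separate stages, so each could be sized and scheduled independently.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Request:
    request_id: int
    prompt_tokens: list[str]
    max_new_tokens: int  # assumed to be >= 1 in this sketch
    kv_cache: list[str] = field(default_factory=list)  # stand-in for real KV tensors
    output_tokens: list[str] = field(default_factory=list)


def prefill(request: Request) -> Request:
    """Prefill stage (hypothetical): process the whole prompt in one pass."""
    request.kv_cache = list(request.prompt_tokens)
    return request


def decode_step(request: Request) -> bool:
    """Decode stage (hypothetical): emit one placeholder token.

    Returns True once the request has produced max_new_tokens tokens.
    """
    token = f"tok{len(request.output_tokens)}"
    request.output_tokens.append(token)
    request.kv_cache.append(token)
    return len(request.output_tokens) >= request.max_new_tokens


def serve(requests: list[Request]) -> list[Request]:
    """Run requests through separate prefill and decode queues."""
    prefill_queue = deque(requests)
    decode_queue: deque[Request] = deque()
    finished: list[Request] = []
    while prefill_queue or decode_queue:
        # Prefill pool: admit one new prompt per scheduling round.
        if prefill_queue:
            decode_queue.append(prefill(prefill_queue.popleft()))
        # Decode pool: advance every in-flight request by one token.
        for _ in range(len(decode_queue)):
            req = decode_queue.popleft()
            if decode_step(req):
                finished.append(req)
            else:
                decode_queue.append(req)
    return finished


if __name__ == "__main__":
    done = serve([Request(1, ["a", "b", "c"], 2), Request(2, ["x"], 3)])
    for r in done:
        print(r.request_id, r.output_tokens, len(r.kv_cache))
```

In a real cluster the hand-off between the two stages would involve transferring the KV cache between devices, which this sketch reduces to moving an object from one queue to the other.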