China Telecom Completes the Industry's First Technology Verification of Heterogeneous Computing Power Collaboration for Large Model Inference
Xin Lang Cai Jing · 2025-10-13 23:42
Group 1
- The core point of the coverage is that China Telecom Research Institute, working with industry partners, deployed the DeepSeek series of models on a combination of NVIDIA and domestic computing power, cutting costs and improving efficiency for large model inference [1][2]
- The DeepSeek 671B model showed throughput improvements of 30% to 72% across multiple scenarios, with concurrency doubled and inference costs reduced by up to 42% at the same throughput [1]
- The successful verification of heterogeneous computing power collaboration for large model inference reflects China Telecom's deep understanding of intelligent computing optimization and its practical work in adapting domestic computing power [2]

Group 2
- Industry consensus is shifting toward chip designs optimized separately for the Prefill and Decode stages of inference, with NVIDIA and Huawei each releasing chip design plans built around "high compute, low storage" and "low compute, high storage" strategies; a minimal sketch of this Prefill/Decode split appears after this list [2]
- China Telecom Research Institute has built a full-stack, self-developed heterogeneous mixed inference system with three core advantages: efficient transfer between heterogeneous chip PD (Prefill/Decode) pools, automatic recommendation and real-time optimization of PD resource allocation, and dynamic scheduling of inference tasks (an illustrative allocation heuristic also follows the list) [2]
- China Telecom aims to keep advancing the high-quality development of domestic computing power and to build a "connected and efficiently collaborative" heterogeneous computing ecosystem for large model training and inference [2]
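The Prefill/Decode split referenced above separates the compute-bound processing of the prompt from the memory-bound, token-by-token generation that follows, which is why "high compute, low storage" and "low compute, high storage" hardware can be paired. The sketch below is an illustrative Python outline of that idea; the class names, pools, and placeholder KV-cache hand-off are assumptions, not China Telecom's or any vendor's actual system.

```python
# Minimal sketch (illustrative only) of PD-disaggregated inference:
# prefill runs on a compute-heavy pool, decode on a memory-bandwidth-heavy pool.
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Request:
    prompt_tokens: list[int]                 # input handled in the prefill phase
    kv_cache: dict | None = None             # produced by prefill, consumed by decode
    output_tokens: list[int] = field(default_factory=list)


class PrefillPool:
    """Compute-heavy pool ("high compute, low storage"), e.g. flagship GPUs."""

    def run(self, req: Request) -> Request:
        # Process the whole prompt in one compute-bound pass and emit a KV cache.
        req.kv_cache = {"len": len(req.prompt_tokens)}   # placeholder KV cache
        return req


class DecodePool:
    """Memory-heavy pool ("low compute, high storage"), e.g. domestic accelerators."""

    def run(self, req: Request, max_new_tokens: int = 4) -> Request:
        # Generate tokens one at a time, reading the KV cache at every step.
        for step in range(max_new_tokens):
            req.output_tokens.append(step)               # placeholder token id
        return req


def serve(req: Request, prefill: PrefillPool, decode: DecodePool) -> Request:
    # The KV cache produced by the prefill pool must be moved to the decode pool;
    # efficient transfer between heterogeneous pools is one of the claimed advantages.
    req = prefill.run(req)
    return decode.run(req)


if __name__ == "__main__":
    done = serve(Request(prompt_tokens=list(range(128))), PrefillPool(), DecodePool())
    print(len(done.output_tokens), "tokens decoded after prefill of", done.kv_cache["len"])
```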
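For the automatic PD resource allocation mentioned in the list, one simple way to frame the problem is to split a fixed number of instances between the two stages in proportion to how much time an average request spends in each. The heuristic below is a hypothetical sketch with assumed throughput parameters; it is not the real recommendation or scheduling policy of the system described in the article.

```python
# Hypothetical sketch of "automatic PD resource allocation": balance prefill and
# decode capacity for the observed workload. Numbers and the rule are assumptions.

def recommend_split(total_instances: int,
                    avg_prompt_tokens: float,
                    avg_output_tokens: float,
                    prefill_tok_per_s: float,
                    decode_tok_per_s: float) -> tuple[int, int]:
    """Return (prefill_instances, decode_instances) that balance per-request time."""
    # Time one instance spends on each stage of an average request.
    prefill_time = avg_prompt_tokens / prefill_tok_per_s
    decode_time = avg_output_tokens / decode_tok_per_s
    # Allocate instances proportionally to the work each stage must absorb.
    prefill_share = prefill_time / (prefill_time + decode_time)
    prefill_n = max(1, round(total_instances * prefill_share))
    prefill_n = min(prefill_n, total_instances - 1)      # keep at least one decoder
    return prefill_n, total_instances - prefill_n


if __name__ == "__main__":
    # Token-by-token decoding is slower per token, so most instances go to decode.
    print(recommend_split(16, avg_prompt_tokens=2048, avg_output_tokens=256,
                          prefill_tok_per_s=8000, decode_tok_per_s=400))
```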