Storage Power China Tour and Advanced Storage AI Inference Workshop Successfully Held in Beijing
Zheng Quan Ri Bao Wang · 2025-11-07 07:29
Core Insights
- The conference focused on the role of advanced storage in empowering AI model development in the AI era [1][2]
- Key experts from various organizations discussed the challenges and solutions related to AI inference and storage technology [2][3][4]

Group 1: Advanced Storage and AI Inference
- The chief expert from the China Academy of Information and Communications Technology emphasized that advanced storage is crucial for improving AI inference efficiency and controlling costs [2]
- National policies highlight the importance of advancing storage technology and enhancing the storage industry's capabilities [2]
- A working group was established to promote collaboration and innovation in storage technology within the AI inference sector [2]

Group 2: Technical Challenges and Solutions
- Current challenges in AI inference include the need for upgraded KV Cache storage, multi-modal data collaboration, and bandwidth limitations [3]
- China Mobile is implementing layered caching, high-speed data interconnects, and proprietary high-density servers to enhance storage efficiency and reduce costs [3]
- Huawei's UCM inference memory data management technology addresses the challenges of data management, computational power supply, and cost reduction in AI applications [4]

Group 3: Industry Collaboration and Future Directions
- The conference facilitated discussions among industry experts from various companies, contributing to a consensus on the future direction of the storage industry [5]
- The focus is on enhancing computational resource utilization and addressing issues related to high concurrency and low latency in AI inference [4][5]
- The successful hosting of the conference is seen as a step toward fostering innovation and collaboration in the storage industry [5]
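The layered caching mentioned above can be sketched as a tiered lookup in which hits are promoted to the fastest tier and evictions cascade downward. This is a minimal illustration under assumed design choices (per-tier LRU, promote-on-hit); the class and its policy are hypothetical, not China Mobile's or Huawei's actual implementation:

```python
from collections import OrderedDict

class TieredKVCache:
    """Minimal multi-level cache sketch: lookups search fast tiers first;
    hits are promoted to the fastest tier, LRU victims are demoted downward."""

    def __init__(self, capacities):
        # capacities: max entries per tier, fastest first, e.g. [2, 4]
        self.tiers = [OrderedDict() for _ in capacities]
        self.capacities = capacities

    def get(self, key):
        for tier in self.tiers:
            if key in tier:
                value = tier.pop(key)
                self._insert(0, key, value)  # promote to fastest tier
                return value
        return None  # miss: the caller must recompute the KV entry

    def put(self, key, value):
        self._insert(0, key, value)

    def _insert(self, level, key, value):
        if level >= len(self.tiers):
            return  # evicted past the slowest tier entirely
        tier = self.tiers[level]
        tier[key] = value
        tier.move_to_end(key)  # mark as most recently used
        if len(tier) > self.capacities[level]:
            old_key, old_value = tier.popitem(last=False)  # LRU victim
            self._insert(level + 1, old_key, old_value)    # demote one tier down

cache = TieredKVCache([2, 4])
for k, v in [("a", 1), ("b", 2), ("c", 3)]:
    cache.put(k, v)
print(cache.get("a"))  # hit in the slower tier, promoted on access
```

In a real inference stack the tiers would be HBM, host DRAM, and SSD rather than Python dicts, but the promote/demote logic carries over.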
In the Token Economy Era, Is "Storage Power" the Bottleneck Keeping AI Inference from Running Faster?
Tai Mei Ti APP · 2025-11-07 04:08
Core Insights
- The AI industry is undergoing a structural shift, moving from a focus on GPU scaling to the importance of storage capabilities in enhancing AI performance and cost efficiency [1][10]
- Demand for advanced storage solutions is expected to rise with the increasing requirements of AI applications, and storage prices are projected to remain bullish through Q4 2025 [1][10]
- The transition from a "parameter scale" arms race to an "inference efficiency" commercial competition is anticipated to begin in 2025, underscoring the significance of token usage in AI inference [2][10]

Storage and Inference Changes
- Fundamental changes in inference loads are driven by three factors: exponential growth of KVCache capacity due to longer contexts, the complexity of multi-modal data requiring advanced I/O capabilities, and the need for consistent performance under high-load conditions [4][10]
- The bottleneck in inference systems is increasingly storage capability rather than GPU power: GPUs often sit idle waiting for data rather than lacking compute [5][10]
- Raising GPU utilization by 20% can yield a 15%-18% reduction in overall costs, highlighting the importance of efficient data supply over merely adding GPUs [5][10]

New Storage Paradigms
- Storage is evolving from a passive role to an active component in AI inference, focusing on data flow management rather than just capacity [6][10]
- Traditional storage architectures struggle to meet the demands of high throughput, low latency, and heterogeneous data integration, which hinders AI application deployment [7][10]
- New technologies, such as CXL and multi-level caching, are being developed to optimize data flow and improve the efficiency of AI inference systems [6][10]

Future Directions
- The next three years are expected to bring consensus on four key directions: scarcity will shift from GPUs themselves to the ability to supply data to GPUs efficiently, data management will become central to AI systems, real-time storage capabilities will become essential, and CXL architecture will redefine the boundary between memory and storage [10][11][12]
- Competition in AI will extend beyond model performance to the underlying infrastructure, emphasizing effective data management and flow [12]
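The KVCache growth and the cost arithmetic cited above (a 20% GPU-utilization gain yielding roughly a 15%-18% cost cut) can be made concrete with back-of-the-envelope numbers. The model shape below (32 layers, 32 KV heads, head dimension 128, fp16, roughly a 7B-class model) and the 50% baseline utilization are illustrative assumptions, not figures from the article:

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch, bytes_per_elem=2):
    """Per-request KV cache size: two tensors (K and V) per layer,
    each of shape [kv_heads, seq_len, head_dim], stored in fp16 (2 bytes)."""
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_elem

# KV cache grows with context length (hypothetical 7B-class shape):
for ctx in (4_096, 32_768, 128_000):
    gib = kv_cache_bytes(32, 32, 128, ctx, batch=1) / 2**30
    print(f"context {ctx:>7}: {gib:6.2f} GiB of KV cache per request")

# Cost intuition: if GPUs spend less time waiting on data, the same fleet
# serves more tokens. A 20% relative utilization gain (assumed 50% baseline)
# shrinks the fleet needed for fixed throughput to 1/1.2 of its old size:
baseline_util = 0.50
improved_util = baseline_util * 1.20
fleet_ratio = baseline_util / improved_util
print(f"GPU fleet shrinks to {fleet_ratio:.1%} of baseline")
```

A fleet at about 83% of its former size is a ~17% cost reduction, consistent with the 15%-18% range the article cites; the per-request cache sizes show why long contexts push the KV cache out of HBM and into tiered storage.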