Core Insights

- Qualcomm has launched next-generation AI inference optimization solutions for data centers, including accelerator cards and rack systems based on the Qualcomm AI200 and AI250 chips, with a focus on rack-level performance and memory capacity to support generative AI inference across industries [1][3].

Group 1: Qualcomm AI200 and AI250 Solutions

- The Qualcomm AI200 solution is designed for rack-level AI inference, targeting large language models (LLMs), large multimodal models (LMMs), and other AI workloads, with advantages in low total cost of ownership and optimized performance. Each accelerator card supports 768 GB of LPDDR memory, meeting high memory-capacity needs while keeping costs under control [3][4].
- The Qualcomm AI250 solution introduces a near-memory computing architecture that delivers more than a 10x improvement in effective memory bandwidth while significantly reducing power consumption, raising both efficiency and performance for AI inference workloads. It also supports disaggregated AI inference for efficient utilization of hardware resources [3][4].

Group 2: Common Features and Software Support

- Both the AI200 and AI250 rack solutions share several design elements: direct liquid cooling for thermal efficiency, PCIe for scale-up and Ethernet for scale-out to fit varied deployment needs, and built-in confidential computing to secure AI workloads. Total power consumption per rack is capped at 160 kilowatts, in line with data center energy-management standards [3][4].
- Qualcomm provides an end-to-end AI software stack, spanning the application layer down to the system software layer, optimized for AI inference scenarios. The stack supports mainstream machine learning frameworks, inference engines, generative AI frameworks, and disaggregated serving for LLM/LMM inference optimization [4][5].
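The 768 GB-per-card figure can be put in perspective with a back-of-envelope sizing sketch: how many cards would a given model's weights occupy? The model sizes and precisions below are illustrative assumptions for the arithmetic, not Qualcomm specifications, and the estimate ignores KV cache, activations, and runtime overhead.

```python
import math

GB = 1e9  # using decimal gigabytes, matching the "768 GB" spec sheet convention

def weight_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    """Memory needed to hold model weights at a given precision."""
    return params_billion * 1e9 * bytes_per_param / GB

def cards_needed(params_billion: float, bytes_per_param: float,
                 card_capacity_gb: float = 768.0) -> int:
    """Minimum number of accelerator cards to hold the weights alone
    (KV cache, activations, and runtime overhead are not counted)."""
    return math.ceil(weight_memory_gb(params_billion, bytes_per_param)
                     / card_capacity_gb)

# A hypothetical 400B-parameter LLM served at 8-bit precision:
# 400e9 params * 1 byte = 400 GB -> fits on a single 768 GB card.
print(cards_needed(400, 1.0))   # 1
# The same model at 16-bit needs 800 GB -> spills onto a second card.
print(cards_needed(400, 2.0))   # 2
# A hypothetical 1T-parameter model at 16-bit needs 2000 GB -> 3 cards.
print(cards_needed(1000, 2.0))  # 3
```

The sketch illustrates why the per-card memory capacity matters for total cost of ownership: fewer cards per model instance means fewer racks for the same serving fleet.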
Group 3: Future Plans

- The Qualcomm AI200 is expected to be commercially available in 2026, with the AI250 planned for market launch in 2027. Qualcomm aims to advance its data center product roadmap on an annual cadence, focusing on AI inference performance, energy efficiency, and total cost of ownership to meet the evolving demands of generative AI [5].
Qualcomm Launches the AI200 and AI250, Upgrading Its Data Center AI Inference Solutions