Qualcomm Launches AI200 and AI250 to Power High-Speed Generative AI Inference

Core Insights
- Qualcomm has launched next-generation AI inference solutions for data centers, the Qualcomm AI200 and AI250 chips, which deliver rack-level performance and memory capacity for high-performance generative AI inference [1][2]

Group 1: Product Features
- The Qualcomm AI200 is a purpose-built rack-level AI inference solution for large language models (LLMs) and multimodal models (LMMs), with a total memory capacity of 768 GB of LPDDR, offering low total cost of ownership (TCO) and optimized performance [1]
- The Qualcomm AI250 introduces an innovative near-memory computing architecture that delivers more than 10x higher effective memory bandwidth at significantly lower power consumption, improving efficiency and performance for AI workloads [1][2]

Group 2: System Capabilities
- Both solutions support direct liquid cooling for thermal efficiency, PCIe for scale-up, Ethernet for scale-out, and confidential computing to secure AI workloads, with rack-level power consumption of 160 kW [2]
- Qualcomm emphasizes that these AI infrastructure solutions let customers deploy generative AI at industry-leading TCO while meeting modern data centers' demands for flexibility and security [2]

Group 3: Software Ecosystem
- Qualcomm provides a comprehensive, inference-optimized AI software stack spanning the application and system software layers, with support for mainstream machine learning frameworks and inference engines [2]
- Developers can use Qualcomm's Efficient Transformers Library and the Qualcomm AI Inference Suite for seamless model integration and one-click deployment of Hugging Face models, along with ready-to-use AI applications and operational services; a usage sketch follows this list [2]

Group 4: Future Plans
- The Qualcomm AI200 and AI250 are expected to be commercially available in 2026 and 2027, respectively, and the company is committed to advancing its data center product roadmap on an annual cadence, focusing on AI inference performance, energy efficiency, and TCO advantages [3]
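To illustrate what onboarding a Hugging Face model through the Efficient Transformers Library might look like, here is a minimal sketch based on the usage pattern in Qualcomm's public efficient-transformers repository (github.com/quic/efficient-transformers). The class QEFFAutoModelForCausalLM, the compile parameters, and the generate call are assumptions drawn from that README; they may differ by library version and have not been confirmed against the AI200/AI250 toolchain.

```python
# Hedged sketch: export, compile, and run a Hugging Face causal LM with
# Qualcomm's efficient-transformers library (QEfficient). Names and
# parameters follow the public README and are assumptions, not a
# confirmed AI200/AI250 workflow.
from transformers import AutoTokenizer
from QEfficient import QEFFAutoModelForCausalLM

model_name = "gpt2"  # illustrative: any supported Hub checkpoint

# Pull the checkpoint from the Hugging Face Hub and wrap it for
# Qualcomm hardware; the library handles the export internally.
model = QEFFAutoModelForCausalLM.from_pretrained(model_name)

# Compile to a hardware binary; core count and context length here
# are illustrative values, not AI200/AI250 specifications.
model.compile(num_cores=14, ctx_len=1024)

# Run inference with the standard Hub tokenizer.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model.generate(
    prompts=["Explain near-memory computing in one sentence."],
    tokenizer=tokenizer,
)
```

The intended effect is the "one-click" path the article describes: a standard Hub checkpoint goes in, and a hardware-compiled inference workload comes out without manual model conversion.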