Sunrise Launches the S3: Inference Cost Down About 90% From the Previous Generation, Betting on an "Extreme Cost-Performance" GPU and a New Compute Paradigm
IPO早知道·2026-01-29 00:15

Core Viewpoint

The article discusses the AI industry's transition from "training-driven" to "inference-driven" models, highlighting the importance of cost efficiency and system stability in delivering inference capability, particularly through the launch of the new inference GPU, the Sunrise S3 [2][5].

Group 1: Product Launch and Features
- Sunrise officially launched its new inference GPU, the S3, at the first Sunrise GPU Summit, marking its first public appearance since raising approximately 3 billion yuan in strategic financing [2].
- The S3 chip is designed specifically for large-model inference; its system-level design improves both performance and cost-effectiveness, achieving more than a 10-fold improvement in overall cost-performance ratio over its predecessor [5][6].
- The S3 supports precision switching from FP16 down to FP4, significantly improving low-precision inference efficiency, and quadruples memory capacity compared to the previous generation [5][6].

Group 2: Cost Reduction and Efficiency
- In typical inference scenarios, the unit cost of token inference on the S3 has decreased by approximately 90% compared to the previous generation, enabling scalable deployment of AI applications [5][6].
- The overall delivery cost of the new SC3-256 super-node solution is kept in the tens-of-millions-of-yuan range, far below comparable industry solutions that cost over one hundred million yuan [6].

Group 3: Ecosystem and Cloud Strategy
- Sunrise aims to build a collaborative inference cloud to address challenges such as resource fragmentation and operational complexity in deploying inference capability [8][9].
- The inference cloud will use the S3 as its foundation, pooling distributed computing resources into a unified inference-compute pool so that enterprises can access model capabilities on demand without worrying about hardware configuration [9].
- The company has launched a "one cent per million tokens" inference cost plan in collaboration with partners, signaling a shift toward economically viable large-model inference [9].

Group 4: Strategic Collaborations
- Sunrise has signed a strategic cooperation agreement with Zhejiang University to establish a joint research center focused on advanced topics such as optical-interconnect GPU architecture and AI high-precision weather forecasting [10].
- The company has also formed strategic partnerships with various enterprises to promote the application of inference capability across industries such as transportation, manufacturing, and healthcare [10].
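To put the "one cent per million tokens" plan in concrete terms, a minimal sketch of the cost arithmetic follows. The plan rate (0.01 yuan per million tokens) is taken from the article; the workload figures (requests per day, tokens per request) are illustrative assumptions, not numbers from the source.

```python
# Cost arithmetic for the "one cent per million tokens" plan cited in the
# article: 0.01 yuan per 1,000,000 tokens. Workload numbers below are
# hypothetical, chosen only to illustrate the scale.

YUAN_PER_MILLION_TOKENS = 0.01  # plan rate from the article


def inference_cost_yuan(tokens: int) -> float:
    """Return the cost in yuan for a given token volume at the plan rate."""
    return tokens / 1_000_000 * YUAN_PER_MILLION_TOKENS


# Assumed workload: a service handling 1 million requests per day,
# averaging ~1,000 tokens per request -> 1 billion tokens/day.
daily_tokens = 1_000_000 * 1_000
print(inference_cost_yuan(daily_tokens))  # 10.0 (yuan per day)
```

Under these assumed numbers, a billion tokens a day would cost about 10 yuan, which is the kind of order-of-magnitude shift the article frames as making large-model inference "economically feasible."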
