Inference Cloud Ecosystem
Sunrise Launches the Qiwang S3: Inference Costs Down About 90% from the Previous Generation, Betting on "Ultimate Cost-Performance" GPUs and a New Computing-Power Paradigm
IPO早知道 (IPO Zao Zhi Dao) · 2026-01-29 00:15
Core Viewpoint
- The article discusses the AI industry's transition from "training-driven" to "inference-driven" models, highlighting the importance of cost efficiency and system stability in delivering inference capability, particularly through the launch of the new inference GPU, the Qiwang S3 [2][5].

Group 1: Product Launch and Features
- Sunrise officially launched its new inference GPU, the S3, at the first Sunrise GPU Summit, marking its first public appearance since raising approximately 3 billion yuan in strategic financing [2].
- The S3 chip is designed specifically for large model inference, featuring a system-level design that improves performance and cost-effectiveness and achieving an overall cost-performance improvement of more than 10x over its predecessor [5][6].
- The S3 supports precision switching from FP16 down to FP4, significantly improving low-precision inference efficiency, and quadruples memory capacity relative to the previous generation [5][6].

Group 2: Cost Reduction and Efficiency
- In typical inference scenarios, the unit cost of token inference on the S3 is roughly 90% lower than on the previous generation, enabling scalable deployment of AI applications [5][6].
- The overall delivery cost of the new SC3-256 ultra-node solution is kept within the ten-million-yuan range, far below comparable industry solutions that cost more than one hundred million yuan [6].

Group 3: Ecosystem and Cloud Strategy
- Sunrise aims to build a collaborative inference cloud to address challenges such as resource fragmentation and operational complexity in deploying inference capability [8][9].
- The inference cloud will use the S3 as its foundation, pooling distributed computing resources into a unified inference compute pool so that enterprises can access model capabilities on demand without worrying about hardware configuration [9].
- Together with partners, the company has launched a "one cent per million tokens" inference cost plan, signaling a shift toward economically viable large model inference (a back-of-envelope sketch of what this pricing implies follows this summary) [9].

Group 4: Strategic Collaborations
- Sunrise has signed a strategic cooperation agreement with Zhejiang University to establish a joint research center focused on advanced topics such as optical-interconnect GPU architecture and AI-based high-precision weather forecasting [10].
- The company has also formed strategic partnerships with a range of enterprises to promote the application of inference capability across industries such as transportation, manufacturing, and healthcare [10].
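As a rough illustration of what the "one cent per million tokens" plan implies economically, the sketch below amortizes the reported ten-million-yuan ultra-node cost over an assumed service life and asks what throughput would break even on hardware alone. Apart from the node price and the one-cent price point taken from the article, every figure here (the amortization window, the 256-chip guess from the "SC3-256" name, and the neglect of power and operations costs) is an assumption for illustration, not a claim from the article.

```python
# Back-of-envelope check of the "one cent (RMB 0.01) per million tokens" target.
# The ~10 million yuan ultra-node price comes from the article; the amortization
# window and chip count per node are assumptions made up for this sketch.

NODE_COST_RMB = 10_000_000            # SC3-256 ultra-node, order of magnitude per the article
AMORTIZATION_YEARS = 3                # assumed depreciation window
PRICE_PER_MILLION_TOKENS_RMB = 0.01   # the "one cent per million tokens" plan
ASSUMED_CHIPS_PER_NODE = 256          # guessed from the "SC3-256" name

seconds = AMORTIZATION_YEARS * 365 * 24 * 3600
hardware_cost_per_second = NODE_COST_RMB / seconds

# Throughput the node must sustain just to cover hardware amortization
# (ignores power, networking, staffing, and any idle time).
break_even_tokens_per_second = (
    hardware_cost_per_second / PRICE_PER_MILLION_TOKENS_RMB * 1_000_000
)

print(f"hardware cost per second  : {hardware_cost_per_second:.4f} RMB")
print(f"break-even node throughput: {break_even_tokens_per_second:,.0f} tokens/s")
print(f"per chip (assuming 256)   : {break_even_tokens_per_second / ASSUMED_CHIPS_PER_NODE:,.0f} tokens/s")
```

The point of the exercise is only that such pricing presupposes both very cheap hardware per unit of throughput and very high sustained utilization, which is consistent with the article's emphasis on cost-performance and pooled scheduling.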
Sunrise Launches Inference GPU Chip Qiwang S3, Advancing the Joint Build-Out of an Inference Cloud Ecosystem
Zheng Quan Ri Bao Wang · 2026-01-28 12:53
Core Insights
- Sunrise launched its new inference GPU chip, the Qiwang S3, at the first Sunrise GPU Summit, marking its first public appearance since raising approximately 3 billion yuan in strategic financing over the past year [1].
- The company emphasizes an "all-in inference" approach focused on long-term delivery capability, unit cost, and system stability, as inference becomes the primary consumer of computing power in the AI industry [1][3].
- The Qiwang S3 chip is designed for large model inference, achieving an overall cost-effectiveness improvement of more than 10x over its predecessor in typical inference scenarios [1][2].

Product Features
- The Qiwang S3 supports precision switching from FP16 down to FP4, significantly improving low-precision inference efficiency while preserving model quality (a generic quantization sketch appears after this summary) [2].
- It is the first domestic GPU product to adopt LPDDR6 memory, quadrupling memory capacity relative to the previous generation and addressing the memory bottlenecks common in large model inference [2].
- The unit cost of token inference in mainstream large-model scenarios is roughly 90% lower than on the previous generation, making the "one cent per million tokens" concept deployable at scale [2].

Ecosystem Development
- Sunrise aims to build a full "chip + system + ecosystem" layout around inference scenarios, positioning itself as more than a chip manufacturer [4].
- The company is developing a collaborative inference cloud that consolidates dispersed computing resources into a unified inference compute pool, giving enterprises on-demand access to large model inference services [3].
- The inference cloud is built on the Qiwang S3 and uses GPU pooling and elastic scheduling, letting businesses scale computing power flexibly with their workload (a toy scheduling sketch appears after this summary) [3].

Strategic Vision
- The company believes the AI industry is transitioning from a "training-driven" to an "inference-driven" model, with long-term delivery capability and system stability mattering more than one-time training investment [3][4].
- Sunrise's chairman stated that whoever can keep driving inference costs down will control the AI industry's cost curve, underscoring the importance of systematic innovation in the inference compute system for sustainable growth in AI applications [4].
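For readers unfamiliar with what "precision switching from FP16 to FP4" buys, the snippet below is a generic 4-bit weight-quantization sketch, not Sunrise's actual number format or kernels: it shows why packing weights into 4 bits roughly quarters memory footprint and traffic, which is where most of the low-precision speed-up on memory-bound inference comes from, at the cost of some accuracy.

```python
import numpy as np

# Generic illustration of 4-bit weight quantization (not Sunrise's actual scheme).
# FP16 stores 2 bytes per weight; packing two 4-bit codes per byte cuts weight
# memory, and hence memory traffic, by roughly 4x.

rng = np.random.default_rng(0)
weights_fp16 = rng.standard_normal((4096, 4096)).astype(np.float16)

# Simple symmetric quantization to 16 integer levels as a stand-in for FP4.
scale = float(np.abs(weights_fp16).max()) / 7.0
q = np.clip(np.round(weights_fp16 / scale), -8, 7).astype(np.int8)

# Pack two 4-bit codes into each byte.
nibbles = (q & 0x0F).astype(np.uint8).reshape(-1, 2)
packed = (nibbles[:, 0] | (nibbles[:, 1] << 4)).astype(np.uint8)

print(f"FP16 weights : {weights_fp16.nbytes / 2**20:.1f} MiB")
print(f"4-bit packed : {packed.nbytes / 2**20:.1f} MiB")

# Dequantize for compute; the error below is the accuracy cost that
# per-workload "precision switching" lets operators trade against speed.
dequantized = q.astype(np.float32) * scale
max_err = np.abs(weights_fp16.astype(np.float32) - dequantized).max()
print(f"max abs error: {max_err:.4f}")
```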
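The "GPU pooling and elastic scheduling" described for the inference cloud can be pictured as a shared capacity pool that tenants borrow from and return to, with queued requests served as capacity frees up. The toy scheduler below is purely conceptual; the class, tenant names, and the first-come-first-served policy are inventions for illustration and say nothing about Sunrise's actual cloud software.

```python
from collections import deque

class InferencePool:
    """Toy model of a pooled-GPU inference cloud; all names here are hypothetical."""

    def __init__(self, total_gpus: int):
        self.free_gpus = total_gpus
        self.waiting: deque[tuple[str, int]] = deque()

    def request(self, tenant: str, gpus: int) -> bool:
        """Grant capacity on demand; queue the request if the pool is exhausted."""
        if gpus <= self.free_gpus:
            self.free_gpus -= gpus
            print(f"{tenant}: granted {gpus} GPUs ({self.free_gpus} free)")
            return True
        self.waiting.append((tenant, gpus))
        print(f"{tenant}: queued for {gpus} GPUs")
        return False

    def release(self, tenant: str, gpus: int) -> None:
        """Return capacity, then serve queued tenants (the elastic part)."""
        self.free_gpus += gpus
        print(f"{tenant}: released {gpus} GPUs ({self.free_gpus} free)")
        while self.waiting and self.waiting[0][1] <= self.free_gpus:
            queued_tenant, need = self.waiting.popleft()
            self.free_gpus -= need
            print(f"{queued_tenant}: granted {need} GPUs from queue")


pool = InferencePool(total_gpus=256)    # e.g. one SC3-256-sized slice of the pool
pool.request("chatbot-service", 96)
pool.request("doc-summarizer", 128)
pool.request("weather-forecast", 64)    # only 32 free, so this request is queued
pool.release("chatbot-service", 96)     # freed capacity flows to the queued tenant
```

A real system would also have to account for model placement, memory, and fairness across tenants; the sketch only captures the borrow-and-return shape of pooled, elastic capacity that lets utilization stay high enough to drive per-token cost down.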