Core Insights
- The core message: whoever masters efficient, controllable, and sustainable inference infrastructure will dominate the pace of AI deployment [3][5].

Group 1: Company Overview
- The company, known as Xi Wang, is positioned as a leading GPU chip company focused on inference, aiming to optimize large model inference [4].
- Xi Wang's mission is to excel at large model inference as the AI industry transitions from being training-driven to inference-driven [4][5].
- The company was established in 2020, evolving from the chip division of SenseTime, and has accumulated a decade of experience in AI applications [5][6].

Group 2: Market Trends
- By 2026, inference is projected to account for 66% of AI computing workloads, surpassing training and signaling a structural shift in the industry [4].
- Demand for real-time interaction and complex scenarios, such as 3D and video generation, is driving the need for high-frequency response in AI applications [4][5].

Group 3: Cost Structure and Strategy
- Inference costs currently account for 70% of AI application expenses, making them critical to profitability and commercial success [4][5].
- The company aims to cut inference costs sharply, targeting a reduction from the "unit" level to the "fraction" level and making AI infrastructure as accessible as a utility [4][7].

Group 4: Product Development and Innovation
- Xi Wang has invested 2 billion in R&D over the past eight years, producing the S1 and S2 chips, with the S3 chip recently launched [7][8].
- The company aims to set a new industry benchmark of "one cent per million tokens" for inference [7][8].

Group 5: Business Model
- The company is not merely a chip vendor; it aims to build a comprehensive "chip + system + ecosystem" offering [8][9].
- Xi Wang intends to collaborate with major AI firms and a range of computing power providers to optimize existing systems and improve cost efficiency [8][9].

Group 6: Future Vision
- The company envisions becoming the foundational infrastructure for affordable, stable computing power in the AI era, linking technology, policy, and commercial models [9].
- The future of AI in China is expected to rely on scalable, cost-effective inference infrastructure, marking a transition from following to leading in the domestic AI chip market [9].
Xi Wang Chairman Xu Bing: Taking Large Model Inference to the Extreme
Sou Hu Cai Jing·2026-01-29 11:35