Intellifusion (云天励飞) Discloses Large-Scale Computing Chip Strategy, Aiming to Cut Inference Costs by Over 100x

Core Insights
- The company announced a strategic focus on large-scale AI inference chips, aiming to cut the inference cost per million tokens by more than 100x within the next three years [2][6]
- The global computing power industry is shifting toward inference, with major players such as Google and NVIDIA emphasizing system-level optimization for efficiency and cost reduction [4][5]

Group 1: Company Strategy
- The company has established the GPNPU technology route, defined as GPGPU + NPU + 3D stacked memory, to address the challenges of portability, deployability, and sustained cost reduction [5]
- The CEO highlighted five pillars of the company's competitive advantage: technology, production capacity, ecosystem, market, and capital, which together support its strategic goals [5]
- The company is one of the few in China with sufficient domestic production capacity, giving high certainty for large-scale chip production and delivery [5]

Group 2: Industry Trends
- Competition in the inference era is shifting from scaling model parameters to improving application efficiency, with a focus on lower inference costs and delivery efficiency [4]
- The roadmap aims to align with international mainstream platforms, optimizing key inference stages such as long-context prefill and low-latency decoding to achieve cheaper, more stable, and easier deployment [6]
- The essence of competition in the inference era is the cost per inference unit, which must become affordable and stable for AI to move from a visible capability to an accessible productivity tool [6]