Yuntian Lifei's Luo Yi: Inference Surpasses Training, and the Real Battleground for Domestic Computing Power Lies in Ecosystem and Cost | GAIR 2025

Core Insights
- The article covers the AI industry's shift from training to inference, noting that inference now surpasses training in both power consumption and importance to the industry [22][24].
- It also examines the evolution of AI technology in China, where companies such as Yuntian Lifei are building their own AI technology stacks by investing in both algorithms and chips [5][6].

Group 1: AI Industry Evolution
- The AI industry has changed significantly since 2014, with the pace of technological development accelerating markedly after the advent of large models [18][20].
- Demand for inference capability has grown dramatically, reportedly increasing nearly 100-fold from last year to this year [8][28].
- By the end of 2024, domestic AI chips are expected to account for over 50% of China's AI chip market, surpassing non-domestic high-end GPUs [28][24].

Group 2: Yuntian Lifei's Strategy
- Yuntian Lifei has pursued a dual approach spanning both algorithms and chips, which has helped the company navigate the complexities of the AI landscape [5][6].
- The company emphasizes integrating into existing ecosystems, particularly the CUDA ecosystem, to reduce adaptation costs for clients [8][9].
- Yuntian Lifei aims to strengthen its core inference capabilities, ensuring that its technology is both reusable and deliverable and thus provides clear value to customers [13][31].

Group 3: Challenges and Opportunities
- The primary challenge for AI inference is cost, as companies strive to make AI more accurate while keeping expenses under control [11][12].
- The article highlights the need for a robust ecosystem that supports the integration of diverse technologies, including standards and protocols for AI chips [12][30].
- AI infrastructure is expected to move toward heterogeneity and high cost-effectiveness, addressing the performance-cost-accuracy trade-off [39][41].