Chinese Data Accounts for Over 60% of Training Data in Most Domestic Models
Ren Min Ri Bao·2025-08-18 22:31
Core Insights
- Chinese data plays an important role in the training performance of domestic AI models: most models use Chinese data for over 60% of their training data, and some reach 80% [1]
- The development and supply of high-quality Chinese data continue to improve, driving rapid gains in the performance of AI models in China [1]

Token Consumption Growth
- China's average daily token consumption was 100 billion at the beginning of 2024 and exceeded 30 trillion by the end of June 2025, a more than 300-fold increase in one and a half years [1]