National Data Bureau Tackles the AI Data Shortage: 7 Annotation Bases Have Served 163 Large Models
21 Shi Ji Jing Ji Bao Dao·2025-07-22 23:18

Core Insights
- High-quality, multi-modal, well-annotated data is crucial to the development of artificial intelligence, and industry feedback indicates that a shortage of quality datasets is currently hindering progress [1][5]
- The National Data Bureau is actively promoting the construction of data industry clusters and strengthening the data element market, recognizing data as a key driver of the digital economy [2][3]

Data Industry Development
- As of mid-2025, seven data annotation bases had been established in cities including Hefei and Chengdu, producing 524 datasets totaling more than 29PB and serving 163 large models [1]
- The upcoming 2025 China International Big Data Industry Expo will feature activities focused on high-quality datasets and data annotation, including supply-demand matching events [1][5]

Policy and Regional Initiatives
- The National Data Bureau, together with other departments, has issued guidelines to promote high-quality development of the data industry and foster a competitive, vibrant data ecosystem [3]
- Regions such as Shanghai and Henan have implemented policies to support data industry growth, while Guizhou has established a platform for sharing government data [3][4]

Infrastructure and Resources
- Guizhou, a key hub for the "East Data West Computing" strategy, has 28 large data centers with a storage capacity of 25EB and a computing power scale of 85EFLOPS, accounting for over 98% of the province's intelligent computing capacity [3][4]

Future Plans
- The National Data Bureau plans to optimize data industry planning and move from "point breakthroughs" to "full-scale development," with initiatives to establish data industry cluster pilot projects in the second half of the year [4][6]
- An ecosystem cultivation initiative will be launched to collect and promote high-quality dataset case studies, hold technical exchange activities, and create a regular supply-demand matching platform [5][6]