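Guangdong Officially Releases Its First Batch of High-Quality Dataset Challenges, Exploring New Paths for Converting Data into Value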
Core Viewpoint
- Data has become a core production factor driving industrial transformation, and high-quality datasets are essential for unlocking its value. Guangdong is hosting its first high-quality dataset innovation competition as part of its effort to build a new high ground for digital and intelligent development [1].

Group 1: Competition Overview
- The first high-quality dataset innovation competition in Guangdong was launched on December 2, 2023, in Dongguan, with the theme "Data Gathering in the Bay Area, Intelligent Creation for the Future" [1].
- The competition uses a "challenge and leaderboard" mechanism to promote the discovery, supply, circulation, innovative application, and transformation of high-quality datasets, aiming to inject momentum into the digital transformation of the Guangdong-Hong Kong-Macao Greater Bay Area [1][2].
- It focuses on key sectors such as industrial manufacturing, healthcare, technological innovation, urban governance, and transportation, with the goal of creating reusable high-quality datasets for AI model training and industry applications [2].

Group 2: Key Participants and Structure
- The initial batch of high-quality dataset challenges was announced, covering major sectors such as energy, biomedicine, finance, and education, with participation from organizations including State Grid, Guangzhou Laboratory, and Ping An Insurance [4].
- The competition is organized as "1 leaderboard mechanism + 3 competition phases + N supply-demand matchmaking events," forming a closed loop from data supply to technological research and industry upgrading [4].
- The event aims to replicate and scale mature data application scenarios while exploring the potential of emerging fields such as the low-altitude economy and the industrial internet [4].

Group 3: Expert Insights
- Experts highlighted that data preprocessing, annotation, synthesis, and quality assessment are critical steps in building high-quality datasets, ensuring that the datasets effectively support AI model training and application [4][5].
- Organizations including Baidu and China Telecom are collaborating to establish standardized production processes and quality certification for high-quality datasets, addressing challenges in data collection and compliance [5].
- Guangdong's sustained investment and practical experience are moving high-quality dataset construction from isolated breakthroughs toward broad-based development, providing robust data support for innovation in the AI industry [5].
Data Elements and Industry Accelerate Their Integration: China's Data Industry to Reach 7.5 Trillion Yuan by 2030
Yang Shi Wang · 2025-05-18 03:46
Core Insights
- China aims to cultivate a robust data element industry chain, projecting a data industry scale of 7.5 trillion yuan by 2030 [1][3].
- As the first country to treat data as a production factor, China has established a comprehensive data industry chain. Total data production reached 41.06 zettabytes in 2024, up 25% year on year [3].
- China has more than 190,000 data-related enterprises, and the current data industry scale exceeds 2 trillion yuan [3].

Data Sharing and Integration
- Public data sharing is a crucial breakthrough for the marketization of data elements. In 2024, the number of local public data open platforms rose 7.5% and the volume of open data rose 7.1% [5].
- The number of high-quality datasets grew 27.4% year on year, indicating a significant push toward integrating public and enterprise data [5].
- The National Data Administration plans to establish a comprehensive data infrastructure by 2029 [5].

Artificial Intelligence and Data Quality
- Data has surpassed traditional production factors, becoming a core driver of AI breakthroughs and industrial transformation [6].
- High-quality datasets are essential for enhancing AI model performance and reshaping the entire industry chain from R&D to commercial application [6].
- Building high-quality datasets involves critical processes such as data collection, cleaning, annotation, and quality assessment [8].

Data Annotation Industry
- China's data annotation industry has surpassed 8 billion yuan in value, entering a new phase of scale and standardization [10].
- The number of companies developing or applying AI has increased 36% year on year, with high-quality datasets growing 27.4% [10].
- Companies using large model data technologies and data application enterprises have seen year-on-year growth of 57.21% and 37.14%, respectively [10].

Challenges in Data Development
- Despite accelerating innovation in high-quality datasets, challenges remain, including low data stock and production, inconsistent data quality, a lack of mainstream high-value data, and low data utilization efficiency [12].