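Big Data Expo Dialogue | Beijing Jiaotong University Professor Zhang Xianghong: High-Quality Datasets Are the Key Factor Determining Large Model Quality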
Huan Qiu Wang·2025-08-19 08:02

Core Insights
- The 2025 China International Big Data Industry Expo aims to promote efficient aggregation and utilization of data resources, driving industrial transformation and high-quality economic development [1]
- The event serves as a global platform for showcasing data achievements, facilitating technical exchanges, and fostering collaboration between academia and industry [6]

Digital Transformation
- Digital transformation has shown effectiveness in various sectors, categorized into four tiers:
  - First tier includes industries like internet, finance, and commerce, where digital transformation is deepening and broadening [2]
  - Second tier encompasses meteorology, healthcare, and transportation, with increasing data application [2]
  - Third tier involves government services and high-end manufacturing, still in the early stages of data utilization [2]
  - Fourth tier consists of SMEs and agriculture, where data resources are underutilized [2]

Data Element Marketization
- Data is a new type of production factor, facing challenges in rights confirmation, pricing, and trading due to its unique characteristics [3]
- The uncertainty of data value complicates its pricing and market transactions [3]

High-Quality Data Sets
- Constructing high-quality data sets for AI applications is essential for realizing the value of data elements [4]
- The data industry chain includes data resource supply, high-quality data set production, and application across various sectors [4]

Infrastructure Challenges
- The explosive growth of data presents new challenges in storage, computing power, and algorithm optimization [5]
- Key breakthroughs in technology are needed to address these challenges [5]

Private Data Security
- Private data security is a critical bottleneck in building high-quality data sets, with 80% of global data being non-circulating [5]
- Ensuring the safe and efficient circulation of private data is a universal issue that needs to be addressed [5]

Role of the Expo
- The expo plays a significant role in leading industry technology directions and promoting the application of cutting-edge technologies [6]
- It facilitates collaboration between educational institutions and enterprises, enhancing the integration of education and industry [6]
- The expo encourages cross-regional and cross-sector collaborative innovation [6]

Future Optimization
- Future expos should emphasize the construction of high-quality data sets for AI applications and traditional industry digital transformation [7]
- Establishing a national data infrastructure to ensure safe and efficient data circulation is crucial for achieving "data freedom" [7]

Long-term Vision for Guizhou
- The expo aims for Guizhou to become a leading hub for data technology and policy, with aspirations to be a model for data application and security [8]