Huawei's Yuan Yuan: China Is a Data Powerhouse, but Data Corpus Construction Still Faces Key Challenges
Guan Cha Zhe Wang·2025-12-18 13:34

Core Insights
- The 2025 Global Data Technology Conference (GDTC) was held in Beijing, focusing on building advanced data infrastructure to unlock the value of data in the AI era [1][3]
- Yuan Yuan, Huawei Vice President and President of the Data Storage Product Line, highlighted the challenges in China's data corpus construction, including a data retention rate of only 2.8% and a data sharing rate of less than 25% [1][4]

Group 1: Data Challenges
- China is a global data powerhouse, yet it faces significant challenges in data corpus construction: its data retention rate is only 2.8% [4]
- High-quality data is scarce, with China's model training data volume only about 10% of that of leading Western countries [4]
- Data sharing remains insufficient: much urban and enterprise data is still stored in "silos," leaving the data sharing rate below 25% [4]
- Global data breaches have reached 47.16 billion records per year, posing significant risks across industries [4]

Group 2: Recommendations for Data Infrastructure
- At the city level, cities are advised to use their role as urban hubs to build advanced storage centers that promote the aggregation, governance, and trusted circulation of public and industry data [4][5]
- At the industry level, building collaborative data sharing platforms is essential for moving from fragmented data use to intelligent integration and for strengthening high-quality industry knowledge bases [5]
- At the enterprise level, companies should focus on building AI data lakes to strengthen data sharing, management, and agile usage; one example is the integration of diverse data types for autonomous driving [5]

Group 3: Future Directions
- Continuous technological innovation is crucial for developing advanced data infrastructure, with plans to enhance AI data lake capabilities and address issues in data collection, storage, governance, and utilization [6]
- The company aims to improve and open-source end-to-end AI toolsets to enrich China's AI tool ecosystem, emphasizing that practical tools are important for sustaining intelligent capabilities [6]
- For trusted cross-border data flows, research will focus on compliance governance, secure data flow, and cross-border auditing [6]