Core Insights
- Shandong province will initiate a project to develop a corpus database focusing on high-end equipment and several other industries, aiming to enhance data technology and standards [2]

Group 1: Project Overview
- The project will target industries such as high-end equipment, tobacco products, agricultural and sideline food processing, furniture manufacturing, wood processing, leather and feather products, footwear manufacturing, instrument manufacturing, and waste resource utilization [2]
- The project emphasizes the creation of high-quality industry-specific corpus databases to support natural language processing, computer vision, machine learning, and deep learning tasks [2]

Group 2: Data Requirements and Standards
- The corpus database must contain no less than 100,000 entries at the time of project acceptance, ensuring high data quality, coverage, potential value, and application effectiveness [2]
- Acceptance of the project will require third-party evaluation to verify the quality and standards of the corpus [2]

Group 3: Encouragement for Resource Optimization
- Shandong encourages industries to accelerate the optimization and integration of corpus resources and actively open public corpora [2]
Shandong to Launch Open-Competition ("Jiebang Guashuai") Corpus Projects in High-End Equipment and Other Fields
Da Zhong Ri Bao·2026-02-06 01:06