Data Annotation
Datong, Shanxi: Writing "Three Answer Sheets" for High-Quality Development
Ren Min Ri Bao· 2025-11-23 22:52
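Today, across Datong, the Shanxi city once known as the "coal capital," photovoltaic panels roll on like blue waves, indicator lights in data centers flicker like a brilliant river of stars, and traditional culture and modern civilization set each other off beneath the ancient city walls.
Facing the test of the times posed to resource-based cities in transition, Datong is carrying out a profound urban revolution and writing an excellent set of answers: a green transformation as an energy city, a foundation for the future as a computing-power city, and a new pulse of life as a cultural city.
The first answer sheet: the green transformation of an energy city. During the 14th Five-Year Plan period, Datong, with the tenacity of "following one blueprint through to the end," has shifted from sole reliance on traditional energy to building a new energy system that is diversified and complementary, clean and efficient, and safe and reliable. As an important national energy base, Datong has kept a firm hold on its energy "rice bowl": annual raw coal output has held steady above 150 million tonnes, and from 2021 to 2024 the city produced a cumulative 640 million tonnes of raw coal, of which 430 million tonnes was thermal coal supplied at long-term contract prices, providing solid support for energy security.
In upgrading traditional energy, Datong has turned to intelligent technology for a breakthrough. In the dispatch center of China Coal Datong Energy Company (中煤大同能源公司), staff can see real-time working conditions 500 meters underground with a few mouse clicks, achieving "fewer people without less output, higher efficiency with greater safety." The city has so far built 14 intelligent coal mines, advanced capacity accounts for more than 85 percent of the total, and 9 coal-fired power plants have completed ultra-low-emission retrofits, so the clean and efficient use of coal power continues to improve.
The secret of culture "breaking out of its circle" is deeply rooted in Datong's rich historical heritage. Over more than 2,000 years of city-building history, the blending and collision of multi-ethnic cultures has shaped Datong's distinctive urban character. The Yungang Grottoes' ...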
Three Northeastern Provinces Jointly Build a Data Annotation Industry Cluster
Liao Ning Ri Bao· 2025-11-23 00:48
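In May of last year, the National Data Administration planned the construction of seven national-level data annotation bases, and Shenyang, Liaoning, is one of them. To date, Shenyang has annotated more than 8,323 TB of data in total, forming 134 high-quality datasets that are used in 76 large models, and has taken part in drafting 2 national standards and 4 industry standards. It has attracted and cultivated 65 data annotation enterprises employing more than 11,800 people, with an industry scale of about 2.59 billion yuan. It is also accelerating the release of the core value of data as a factor of production: 28 datasets have been traded, for a total of 102 million yuan.
Reportedly, Northeast China will take specialization, intelligence, and internationalization as its direction. Through a collaborative model of government guidance, enterprises as the main players, and market-based operation, it will build a distinctive and functionally complementary regional system of industrial clusters, form an industrial ecosystem of specialized division of labor and interconnection, build a data annotation industry covering Northeast Asia, and create a globally competitive Northeast China data annotation industry cluster.
The Northeast Data Annotation Solutions Consortium will integrate the resources and strengths of each locality to provide customers nationwide with full-stack, high-value solutions. It will jointly open up application scenarios and "bundle" the Northeast's region-wide industrial upgrading needs in industry, agriculture, culture, and tourism, giving annotation enterprises in the region a testing ground and a place of first use, so that together they play the "big chess game" of coordinated development well.
Opening up scenarios, linking demand, and building a collective "Northeast Data Annotation" brand: a symposium on the high-quality development of the data annotation industry in Northeast China, hosted by the China Information Association (中国信息协会) and the Shenyang Data Bureau, was recently held in Shenyang. The three provinces of Liaoning, Jilin, and Heilongjiang and the relevant prefectures and cities jointly ...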
Building High-Quality Datasets: For Jiangsu, It Is Imperative and It Must Lead the Way
Xin Hua Ri Bao· 2025-11-06 08:16
Core Insights
- The "2025 National High-Quality Data Set and Data Annotation Industry Supply and Demand Matching Conference" held in Nanjing successfully attracted over 500 companies and resulted in more than 90 collaborations with a transaction value exceeding 900 million yuan [1]
- Jiangsu province aims to leverage its rich data resources to enhance the construction of high-quality data sets, which is essential for seizing opportunities in artificial intelligence development [1][2]
- The definition of high-quality data sets varies across industries, but they must meet the training needs of AI large models [2]
Industry Overview
- Jiangsu has established 321 high-quality data sets across key sectors such as healthcare, transportation, industry, energy, and cultural tourism, with a total data scale exceeding 93PB [1]
- The province has implemented a "1+N" policy framework to optimize the environment for artificial intelligence development, focusing on collaboration between supply and demand enterprises [2][7]
Challenges in Data Annotation
- Data annotation is crucial for AI development, requiring specialized knowledge and skills, particularly in complex fields like medical data [3][4]
- The industry faces challenges such as insufficient data supply and a lack of skilled data annotators, which hinder the progress of large models in niche areas [4]
Cost Considerations
- The high costs associated with data storage and processing are significant challenges for companies, with many high-quality data sets being discarded due to storage expenses [5][6]
- Companies are exploring solutions like establishing cold storage centers in less developed regions to reduce costs associated with data storage [5]
Financial Support and Standards
- The data industry is knowledge and capital-intensive, with a significant portion of costs tied to acquiring raw data [6]
- Financial institutions are encouraged to provide support for data collection and annotation, potentially through innovative financing models [6]
- The establishment of standards for high-quality data sets is underway, with guidelines and quality assessment protocols being developed to address current challenges [6]
Three Post-2000s Founders, a Valuation of 70 Billion Yuan
3 6 Ke· 2025-10-28 12:09
Core Insights
- Mercor, an AI recruitment startup, has raised $250 million in new funding at a valuation of $10 billion, five times its valuation of $2 billion earlier this year [1][3]
- Founded in 2023 by three college dropouts, Mercor has developed a large professional talent network and has seen its annual recurring revenue grow from $1 million to $500 million in just 17 months [1][3]
Company Overview
- Mercor specializes in AI-driven recruitment, utilizing AI to screen resumes and match candidates to job positions quickly [3][5]
- The company has expanded its services to include data annotation and large model evaluation, leveraging its extensive network of 30,000 experts [3][9]
- The startup's revenue has quadrupled since the turmoil at Scale AI, a competitor, leading to an influx of Scale's former employees and clients [13][14]
Business Model and Revenue
- Mercor's annual recurring revenue reached $70 million by February, driven by its new business in large model evaluation [3][9]
- The company manages a network of experts who can earn significant daily wages, with total earnings exceeding $1.5 million daily [9][10]
- The new funding will be allocated to expanding the talent network, enhancing the matching system, and improving delivery speed [3][4]
Competitive Landscape
- Mercor's main competitor, Scale AI, faced challenges after being acquired by Meta, which led to concerns about data neutrality and client trust [13][14]
- The controversy surrounding Scale AI has inadvertently benefited Mercor, resulting in a significant increase in its revenue and client base [14][15]
Future Prospects
- Mercor's AI-driven recruitment model has positioned it as a key player in the large model evaluation space, filling a critical gap in the industry [15][16]
- The company aims to continue leveraging its talent network to support the growing demand for high-quality data and expert feedback in AI model development [16]
Tai'an Builds a Full-Process Data Annotation Ecosystem
Da Zhong Ri Bao· 2025-10-27 03:26
Group 1
- The article highlights the emergence of Xiaohongshu as a platform for young people to explore diverse interests, supported by precise data annotation and review processes [1]
- Shandong Feilixin Digital Technology Co., Ltd. is collaborating with major companies like Tencent and Alibaba to enhance content operations through its technological advantages [1]
- Data annotation involves labeling raw data such as images, text, audio, and video, enabling machine learning models to understand and learn from the data [1]
Group 2
- The data annotation industry in Tai'an has developed a solid foundation, characterized by a leading enterprise, two major annotation clusters, and a complete industrial chain [2]
- Taiying Technology is identified as a leader in the digital middle and back-office operation service industry, rapidly expanding its presence in the data annotation market [2]
- Tai'an has gathered over 30 data annotation companies, forming a comprehensive industrial chain from upstream data collection and governance to downstream AI training and application [2]
Dalian Digital and Software Service Trade Fair Opens
Liao Ning Ri Bao· 2025-10-25 00:59
Core Insights
- The 2025 Dalian Digital and Software Service Trade Fair commenced with the theme "Digital Intelligence Empowering Industry, Innovation Leading the Future," focusing on six cutting-edge sectors: artificial intelligence, data labeling, industrial internet, vehicle networking, low-altitude economy, and cross-border e-commerce [1]
Industry Developments
- Dalian High-tech Zone announced ecological planning for nine industrial parks, emphasizing the establishment of a data labeling industry park as a crucial platform for high-quality data sets and a solid foundation for artificial intelligence [1]
- The Dalian Data Labeling Industry Park was inaugurated, integrating public service platforms, talent training centers, and office spaces for data labeling companies, targeting sectors such as intelligent driving, healthcare, embodied intelligence, marine economy, and financial regulation [1]
- The park aims to develop into a regional data service hub with a workforce of nearly 10,000 by 2027, positioning itself as one of the most influential data labeling industry bases in China [1]
Project Collaborations
- During the trade fair, eight key digital economy cooperation projects were signed, covering essential areas such as digital technology research and development, software innovation applications, and industrial ecosystem construction [1]
- The event also organized various industry matching and inspection activities to facilitate the implementation of projects and establish a tracking mechanism for project execution, ensuring that cooperation intentions translate into tangible results [1]
In the US, How Many Master's and PhD Graduates Are Being Used as Porn Content Reviewers?
Hu Xiu APP· 2025-10-19 13:20
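The following article comes from 差评前沿部, by 世超.
This article is from the WeChat official account 差评前沿部. Authors: 纳西 & 西西. Editors: 江江 & 面线. Header image: AI-generated.
As everyone knows, today's large AI models each seem capable of anything and can hardly wait to take over the Earth tomorrow.
Investors are throwing money around as well, offering 100 million US dollars at the drop of a hat to poach talent, more than even 爽子 earned at her peak.
Some readers may have the impression that AI is just a matter of these big names building a framework and getting people to pour the whole internet's material into it, after which it suddenly achieves enlightenment.
In fact, though, artificial intelligence depends first of all on human labor. Even you may have become free labor for AI without realizing it.
If you don't believe it, think back: you are scrolling through short videos when a window suddenly pops up asking whether the video you just watched was an ad, or whether you were satisfied with the previous video. Whether you meant well or your finger slipped, once you hit submit, my friend, you became a part of the algorithm's evolution.
It is the same as the human-verification test that asks you to find traffic lights in a pile of pictures: in essence, it is harvesting your labor.
It is not only ordinary people who have been pressed into service. Behind the scenes there is also a crowd of AI teachers, nannies, and even psychologists, working day and night to clean up after AI and feed it material.
Yet compared with the glamorous AI and its star scientists, these people live like beasts of burden.
A few days ago, a foreign media outlet revealed ...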
In the US, How Many Master's and PhD Graduates Are Being Used as Porn Content Reviewers?
Hu Xiu· 2025-10-19 10:55
Core Insights
- The article discusses the disparity between the high valuations of AI companies and the low wages of the human labor force that supports them, highlighting the exploitation of skilled workers in the AI training process [1][12][48]
Group 1: AI Workforce and Compensation
- AI evaluators at Google, despite being highly educated, earn only $16 to $21 per hour, translating to about $3,000 per month, which is significantly lower than the salaries of AI engineers [23][25]
- The article emphasizes that many AI trainers are experienced professionals, including writers and educators, yet their compensation does not reflect their qualifications [22][27]
- The disparity in pay raises questions about the value placed on different skill sets within the tech industry, particularly the undervaluation of humanities and social sciences [28][30]
Group 2: Nature of AI Training Work
- The work involved in training AI, such as data labeling and content evaluation, is often tedious and resembles assembly line work, with low pay and high expectations [15][16][35]
- The article describes the rigorous standards for AI training tasks, where even minor errors can lead to significant penalties, further emphasizing the exploitative nature of the work [17][40]
- The industry relies heavily on outsourcing, creating a pyramid structure where a few top engineers benefit while a large number of lower-tier workers are underpaid and overworked [36][43]
Group 3: Global Context and Ethical Concerns
- The article highlights that the exploitation of labor in AI training is not limited to the U.S., with similar practices observed in other countries, where workers face harsh conditions and low pay [31][45]
- It points out that the psychological toll on workers, especially those handling sensitive content, is often overlooked, raising ethical concerns about the treatment of labor in the tech industry [44][48]
- The narrative draws parallels between modern AI labor practices and historical labor exploitation, suggesting that the advancements in technology should not come at the cost of human dignity [50][52]
Developing Data Annotation Technology to Refine Data "Crude Oil" into "Gasoline"
Ren Min Ri Bao· 2025-10-15 06:46
Core Insights
- The Chinese government is actively promoting the development of the data labeling industry as part of its "Artificial Intelligence+" initiative, emphasizing the importance of data labeling in enhancing AI capabilities and creating high-quality datasets [1][2].
Group 1: Industry Growth and Projections
- By 2027, the data labeling industry is expected to see significant improvements in specialization, intelligence, and technological innovation, with an annual compound growth rate exceeding 20% [2].
- As of mid-2023, seven data labeling bases have been established in cities like Hefei and Chengdu, generating over 8.3 billion yuan in related industry output [2].
Group 2: Industry Trends
- The data labeling industry is evolving with technological advancements, including intelligent labeling techniques and human-machine collaboration, which enhance efficiency and accuracy [3].
- The industry is transitioning from labor-intensive to knowledge-intensive, requiring higher professional standards for practitioners, especially in specialized fields like medical imaging and autonomous driving [3].
- The scope of data being labeled is expanding from single-modal to multi-modal, with applications extending into specialized sectors such as healthcare and industrial manufacturing [3].
Group 3: Collaborative Ecosystem Development
- There is a call for collaborative efforts to strengthen the data labeling ecosystem, with local governments encouraged to implement policies and facilitate cooperation among industry stakeholders [4].
- Companies are urged to align their data labeling capabilities with actual market demands and collaborate on tool development and process optimization to establish industry standards [4].
Developing Data Annotation Technology to Refine Data "Crude Oil" into "Gasoline" (New Viewpoints)
Ren Min Ri Bao· 2025-10-14 22:12
Core Insights
- The Chinese government is actively promoting the development of the data labeling industry as part of its "Artificial Intelligence+" initiative, emphasizing the importance of data labeling in enhancing AI capabilities and creating high-quality datasets [1][2].
Group 1: Industry Growth and Projections
- By 2027, the data labeling industry is expected to see significant improvements in specialization, intelligence, and technological innovation, with an annual compound growth rate exceeding 20% [2].
- As of mid-2023, seven data labeling bases have been established in cities like Hefei and Chengdu, generating over 8.3 billion yuan in related industry output [2].
Group 2: Industry Trends
- The data labeling industry is evolving with technological advancements, including intelligent labeling techniques and human-machine collaboration, which enhance efficiency and accuracy [3].
- The industry is transitioning from labor-intensive to knowledge-intensive, requiring higher professional standards for practitioners, especially in specialized fields like medical imaging and autonomous driving [3].
- The scope of data being labeled is expanding from single-modal to multi-modal, with applications extending into specialized sectors such as healthcare and industrial manufacturing [3].
Group 3: Collaborative Ecosystem Development
- The data labeling industry is still in its early stages and requires collaborative efforts to build a robust ecosystem, with local governments encouraged to strengthen policy implementation and industry cooperation [4].
- Companies are urged to align their data labeling capabilities with actual market demands and collaborate on tool development and process optimization to establish industry standards [4].