Data Annotation
In human-machine collaboration, they teach machines to "read" the world
Xin Lang Cai Jing· 2026-01-28 22:02
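(Source: Xinhua Daily) □ By staff reporter Zhou Xian and intern Ren Xinyi. At 9 a.m., in the Jiangsu Huaihai Science and Technology City park in Quanshan District, Xuzhou, the sound of keyboards rises on schedule like a tide in the offices of Jiangsu Jingshu Intelligent Technology Co., Ltd. Nearly 50 young people sit at their computers, their fingers repeating the motions of clicking, dragging, and classifying: using professional annotation tools, they attach precise "labels" to one product image after another. From product titles and main images to SKU (stock keeping unit) attributes, every detail passes through their hands and is converted, one by one, into a "language" that machines can understand. They teach machines to "read" the world. The data annotation industry enters the "fast lane". "At present, more than 20 data annotation companies have clustered in Jiangsu Huaihai Science and Technology City; the smaller ones have fewer than 50 people, while the larger ones have more than 200," said a person in charge at Jiangsu Huaihai Science and Technology City. These companies' annotation work centers on three categories of general-purpose models: one serves automakers' autonomous driving systems, one performs text and image annotation for large language models such as Doubao and Qianwen, and one focuses on product information annotation for e-commerce platforms such as JD.com and Taobao. Data from market research firm iResearch show that by 2025, China's market for AI data collection and annotation services is expected to exceed 12 billion yuan. In Jiangsu, job postings for data annotation positions can be found everywhere: a research institute in Nanjing is recruiting annotation engineers at a monthly salary of up to 10,000 yuan, with two-day weekends and full social insurance and housing fund benefits; a company in Xuzhou has opened positions to interns, ...
With data close at hand, people with disabilities can also become "oil refiners" of the AI era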
Hua Xia Shi Bao· 2026-01-13 12:41
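Data annotators at work. By China Times (chinatimes.net.cn) reporters Li Shiqiong and Wang Xiaohui, reporting from Shenyang. Developing the AI industry requires both strategic talent for top-level design and skilled talent rooted in practice. Data annotation requires workers to sit down and keep "labeling" existing data with patience, care, and a sense of responsibility, which happens to match the needs of people with disabilities for work that is "heavy on mental focus and light on physical exertion." For this reason, more and more people with disabilities are joining the "oil refining." Prominent advantages and employment empowerment. In the data annotation industry, people with disabilities have particular advantages. For example, people with hearing impairments have keener visual perception and can quickly catch subtle differences in image annotation; people with limited mobility have steadier hand movements, which suits long hours of keyboard and mouse work; people with cerebral palsy have restricted movement, but in repetitive tasks with a clear rhythm and a well-defined process they show focus and endurance far beyond the average. Moreover, people with disabilities who take part in data annotation are often quicker to spot potentially discriminatory expressions or inappropriate labels, which in turn improves AI as a tool, raises overall annotation quality, and makes large AI models more inclusive and better matched to society's diverse needs. In recent years, with the continued advance of the "East Data, West Computing" project and the successive completion of seven national data annotation bases, data resources have shifted heavily toward central and western China. Drawing on these regions' labor cost advantages, data annotation jobs have been created there in large numbers, which has also eased the problems of scarce employment and working far from home for many people with disabilities. "Many people here ...
The "New Pulse" of High-Quality Development in Guiyang and Gui'an | How one Guiyang company serves as the "digital cornerstone" behind the AI industry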
Sou Hu Cai Jing· 2026-01-07 22:05
Core Insights - The article highlights the importance of data annotation as a critical bridge between raw data and intelligent algorithms in the AI industry, particularly as AI moves from technical exploration to large-scale industrial application [1] Group 1: Company Overview - Guizhou Dinglian Data Co., Ltd. is a big data company focused on the data annotation industry, providing AI data annotation and review services across various sectors such as smart transportation, smart education, and new retail [1][4] - Established in 2014, the company operates with the mission of "making data more valuable" and set up a platform operation center in Guizhou in 2023 [3] Group 2: Services and Capabilities - The company offers comprehensive data solutions, including image, text, audio, and video annotation, as well as 3D point cloud processing, covering all stages from data collection to AI application [6] - Dinglian Data has formed deep partnerships with well-known internet companies and new energy vehicle enterprises, solidifying its position in the smart transportation sector [4] Group 3: Workforce and Collaboration - The platform has registered nearly 160,000 online data annotators, providing a robust service support capability [6] - The company collaborates with over 10 universities and 30 colleges in Guizhou to enhance students' practical skills through training and practice [8] Group 4: Future Goals - The company aims to achieve significant growth in the new year, targeting over 500,000 registered users and 20 strategic partners, while expanding its operational base to at least 4,000 square meters [10] - Plans include establishing a data operation center in Guiyang and a data base in Anshun to promote collaborative development between the two locations [10]
The Jensen Huang of AI startups: a 37-year-old Chinese American builds a $24 billion company in 5 years with zero funding, with Google and OpenAI as customers
量子位· 2025-12-27 04:59
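Jay, reporting from Aofeisi. 量子位 | WeChat official account QbitAI. A 37-year-old Chinese American high achiever founded an AI startup that, with zero funding, is now valued at $24 billion. Yes, he started from scratch and did not take a cent from investors. More impressive still, relying purely on its own efforts, the company easily won major contracts from AI giants such as Google and OpenAI and grew into a super unicorn valued at $24 billion. The company's founder, Edwin Chen, with a net worth of $18 billion, is now the youngest billionaire on the Forbes 400 and the wealthiest of this wave of newcomers. AI startup founder becomes the youngest new billionaire. The youngest newcomer to the Forbes 400, Edwin Chen, is a Chinese American and only 37 years old. At that point, Edwin suddenly drew inspiration from the story on which the science fiction film Arrival is based. Arrival tells of a human linguist who tries to establish communication with an alien civilization by deciphering its writing. As her understanding deepens, however, she gradually acquires an ability beyond language: a nonlinear perception of time, and even the ability to "foresee the future". In Edwin's view, in our world humans are the aliens with superpowers. By learning from annotated data, AI can absorb our patterns of thought and ultimately acquire the superpower unique to humans: intelligence. From big-tech employee to founder of a Silicon Valley super unicorn valued at $24 billion, it took him just 5 years. Edwin graduated from MIT and worked at Twitter, Google, and Facebook, serving as ...
19-year-old Asian woman turns "bounty hunter" and raises 100 million yuan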
虎嗅APP· 2025-11-08 09:29
Core Insights - Datacurve is a new company in the high-quality data labeling sector, aiming to challenge established players like Scale AI, with a unique "gamified labeling" approach that has attracted significant investment and participation from skilled engineers [3][4][12]. Group 1: Company Overview - Datacurve has raised a total of $17.7 million (approximately 120 million RMB) in funding, with a recent $15 million Series A round led by notable investors from top AI companies [4][12]. - The company operates a platform called Shipd, which gamifies data labeling tasks by packaging them as "quests" that engineers can complete for cash rewards [3][10]. Group 2: Unique Business Model - The platform has attracted over 14,000 engineers, who are motivated by the challenge and gaming experience rather than just monetary compensation [7][8]. - Datacurve emphasizes an "engineer-first culture," creating a community that values recognition and professional identity, distinguishing it from traditional data labeling platforms [11][12]. Group 3: User Experience Optimization - The tasks on Shipd are designed to be technically challenging, with multiple validation mechanisms to ensure high data quality [8][10]. - The platform incorporates competitive elements such as leaderboards and rewards for consecutive task completions, enhancing engagement among participants [10][11]. Group 4: Market Position and Competition - Datacurve faces competition from other data labeling companies like Surge AI, which also focus on high-quality data, but Datacurve's unique model may provide a competitive edge if it can maintain data quality and engineer participation [25]. - The company is not solely reliant on data labeling for its future; it plans to expand into other verticals such as finance, medicine, and marketing [25].
Haitian Ruisheng: New AI technologies such as DeepSeek have not reduced demand for data annotation
Sou Hu Cai Jing· 2025-07-04 07:41
Core Viewpoint - The company, Haitian Ruisheng, reassures investors that recent share reductions by major shareholders and executives are driven by personal financial needs rather than a lack of confidence in the company's future growth. The company emphasizes its commitment to maintaining core competitiveness through strategic investments and highlights the ongoing demand for data labeling in the AI sector despite advancements in technology [1]. Group 1: Shareholder Actions - The share reduction actions by shareholders and executives comply with regulations set by the China Securities Regulatory Commission and the stock exchange, with plans disclosed in advance [1]. - The company clarifies that the recent share reductions were primarily due to personal financial needs of the shareholders [1]. - The company has adopted both centralized bidding and block trading methods for share reductions, with block trading not directly impacting the secondary market prices [1]. Group 2: Industry Outlook - The introduction of AI technologies like DeepSeek has not diminished the need for data labeling; instead, it has driven the industry towards higher specialization and increased demand for quality labeled data [1]. - The acceleration of large model industrialization in sectors such as finance, healthcare, and law is leading to a growing need for high-quality labeled data, requiring deeper involvement from industry experts [1]. - The evolution of AI from single-modal to multi-modal applications (including voice and visual data) is expected to create additional data demand [1]. Group 3: Company Performance - The company reports that its operational performance in the first half of the year remains stable and continues to improve, with specific financial data to be disclosed in future reports [1]. - The company prioritizes the rights of minority shareholders and has recently returned value to investors through dividends, with plans to enhance management of share reductions to minimize market impact [1].
Chinese-founded AI company Surge seeks to raise $1 billion at a $15 billion valuation; Grammarly acquires Superhuman; Figma files to go public
投资实习所· 2025-07-02 03:54
Group 1: Figma's Market Position and Financials - Figma has submitted its IPO application after a failed acquisition by Adobe, which was canceled due to regulatory issues [1] - The company reported a revenue of $821 million over the past 12 months, representing a 46% year-over-year growth, with a gross margin of 91% [1] - 78% of Fortune 2000 companies are using Figma, and 76% of its customers utilize at least two of its products [1] - Figma currently holds $1.54 billion in cash and has received a $1 billion breakup fee from Adobe [1] Group 2: Figma's Strategic Moves and AI Integration - Figma is expanding its offerings by integrating AI capabilities, launching products like Figma Sites, Figma Make, Figma Buzz, and Figma Draw [1] - The company has invested $70 million in Bitcoin ETFs and plans to purchase an additional $30 million in Bitcoin using USDC [1] Group 3: Grammarly's Acquisitions and Market Strategy - Grammarly has acquired the AI email product Superhuman, which was valued at $825 million in 2021 and has an ARR of approximately $35 million [2][3] - The acquisition aims to enhance Grammarly's AI-driven productivity platform and improve collaboration and communication experiences [3] - Superhuman's technology will be leveraged to develop advanced AI agents, focusing on the email sector [3] Group 4: Superhuman's Unique Approach and Market Impact - Superhuman has a unique approach to customer onboarding, taking 18 months to build its MVP and intentionally onboarding only 4 to 5 new customers weekly [6] - The company has helped users save over 4 hours per week on email processing and has sent over 500 million messages [2][5] Group 5: Surge AI's Growth and Market Position - Surge AI, founded by Edwin Chen, focuses on data annotation and reinforcement learning, serving high-profile clients like Google and OpenAI [8] - The company is preparing to raise $1 billion at a valuation potentially exceeding $15 billion [8]
The founder of an unfunded rival that out-earns Scale AI is also Chinese; a 16-year-old raises $1 million
投资实习所· 2025-06-20 05:37
Core Insights - The article highlights the rapid growth and potential of AI as a new wealth lever, exemplified by the acquisition of AI Coding product Base44 by Wix for $80 million just six months after its founding [1] - Surge AI has emerged as a hidden champion in the AI training data sector, achieving a $1 billion ARR without external funding and surpassing the revenue of competitors like Scale AI [3][13] Company Overview - Surge AI was founded by Edwin Chen, who has a unique background in mathematics and linguistics from MIT, which has contributed to the company's success in the AI field [3] - The company has a team of around 100 people and has been profitable since its inception, focusing on high-quality data annotation services [3][5] Market Opportunity - Edwin Chen identified a significant gap in the availability of high-quality annotated data, even among tech giants like Google and Facebook, which struggle with data annotation challenges [4] - Surge AI was established during the pandemic, leveraging the availability of skilled individuals to build a high-quality annotation workforce [5] Technological Advantages - Surge AI has developed proprietary quality control technologies to ensure high-quality data for training AI models, addressing the sensitivity of large language models to low-quality data [6] - The company employs domain expert annotation teams across various fields, providing the necessary depth and breadth for training advanced language models [7] - Surge AI offers a rapid experimentation interface, allowing clients to quickly design and launch new tasks without lengthy guidelines [9] - The company also conducts red team testing to identify and address security vulnerabilities in AI models [10] Strategic Partnerships - A key breakthrough for Surge AI was its collaboration with Anthropic, which has validated its technical capabilities and established its authority in AI safety and alignment [11] Competitive Positioning - Unlike competitors such as Scale AI, Surge AI positions itself as a high-end data annotation service, focusing on the most complex AI training tasks [13] - Surge AI achieved a tenfold growth within six months of its founding, with an ARR of $1 billion, surpassing Scale AI's revenue of $870 million during the same period [13]
Designating demonstration parks, setting up industry-education integration training centers...: how Wuhan is developing its data annotation industry
Chang Jiang Ri Bao· 2025-06-13 07:23
Core Viewpoint - Wuhan is promoting the integration of technological innovation and industrial innovation through the "Three-Year Action Plan for the Development of the Data Annotation Industry (2025-2027)" to elevate the data annotation industry to new heights [1][5]. Group 1: Industry Development - The data annotation industry in Wuhan has rapidly developed, gathering over 60 key enterprises and creating high-quality datasets and annotation tool platforms [5]. - Two projects from Wuhan were selected as part of the first batch of excellent data annotation cases at the 8th Digital China Construction Summit [5]. - The Wuhan Data Bureau has established a project and enterprise database, identifying 57 key enterprises and 37 key projects in the data annotation sector [5][6]. Group 2: Support Measures - Wuhan will create an online information platform for supply-demand matching in the data annotation industry and organize offline matching activities to enhance collaboration across the industry chain [5]. - The city plans to establish data annotation demonstration parks in collaboration with districts, providing comprehensive support in talent, financing, and R&D innovation [5][6]. - A training center for data annotation will be established to train at least 600 skilled talents annually [6]. Group 3: Technological Focus - The focus will be on supporting original and secondary development in various technical directions, including text annotation, audio annotation, video annotation, point cloud annotation, and motion capture [6]. - The city aims to secure policy and financial support for data annotation projects, encouraging enterprises to increase innovation investment and drive industry upgrades [6].