AI Crawlers

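39,000 requests a minute: websites "crushed" by AI crawlers, Meta and OpenAI named, and developers roll out one formidable anti-crawler "weapon" after another
36Kr · 2025-08-22 11:28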
Core Viewpoint
- The rise of AI crawlers is putting significant strain on websites, with Meta, Google, and OpenAI identified as the main sources of the traffic [1][4][5].

Group 1: AI Crawler Impact
- AI crawlers account for 80% of AI bot traffic; the remaining 20% comes from fetchers [2][4].
- A few large companies dominate crawler traffic: Meta holds 52%, Google 23%, and OpenAI 20%, together accounting for 95% [4].
- Peak traffic can reach 39,000 requests per minute, placing severe strain on websites [1][13].

Group 2: Real-World Examples
- Triplegangers, a website specializing in 3D models, illustrates how destructive crawlers can be: the site went down under excessive scraping by OpenAI [10].
- Fastly's report notes that even a peak of 1,000 requests per minute can disrupt service for database-dependent sites [13].

Group 3: Developer Responses
- Developers are deploying countermeasures such as the "Anubis" system, which uses proof-of-work to raise the cost of scraping [19].
- Other tactics include "ZIP bombs" that overwhelm crawlers with excessive data, and gamified CAPTCHAs that require users to complete challenges to prove they are human [20][21].
- Cloudflare has introduced AI Labyrinth to mislead crawlers; its network sees over 50 billion requests from AI crawlers every day [24].

Group 4: Future Considerations
- The battle between developers and crawlers is expected to continue, with crawlers evolving to bypass new defenses [26].
- Fastly suggests that smaller websites use robots.txt to manage crawler traffic and consider deploying more advanced systems such as Anubis for finer control [25].
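
To make the proof-of-work idea in Group 3 concrete, here is a minimal hashcash-style sketch in Python. The summary does not describe Anubis's actual challenge format, so everything below is an assumption for illustration: the function names, the `challenge:nonce` string layout, and the difficulty values are hypothetical. The point it tries to show is the asymmetry: the visitor has to try many hashes, while the server checks the answer with one.

```python
import hashlib
import secrets


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # For a nonzero byte, bit_length() gives the position of its top set bit.
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side (hypothetical): a single hash checks the client's answer."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: str, difficulty: int) -> int:
    """Client side (hypothetical): brute-force a nonce.

    Each extra bit of difficulty roughly doubles the expected number of hashes.
    """
    nonce = 0
    while not verify(challenge, nonce, difficulty):
        nonce += 1
    return nonce


if __name__ == "__main__":
    challenge = secrets.token_hex(16)  # fresh random challenge per visitor
    nonce = solve(challenge, difficulty=12)  # about 4,096 hashes on average
    print(nonce, verify(challenge, nonce, 12))
```

For a single human visitor the delay is small, but a crawler requesting tens of thousands of pages per minute would have to pay it on every challenge, which is roughly how such schemes raise the cost of scraping.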
AI's all-out war begins with crawlers destroying the internet
Huxiu · 2025-03-24 14:13
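Yesterday I came across something very interesting. For the first time, Cloudflare, one of the world's largest internet infrastructure companies, has started fighting magic with magic, using AI to counter AI crawlers. It is interesting enough to go down in the history of AI. This is an all-out war in the AI field.

You may still have plenty of questions: what Cloudflare is, what an AI crawler is, what the AI Labyrinth is, and what exactly makes this so interesting.

To start from the beginning, I want to tell you a story that took place this January at a Ukrainian company of just seven people. The company is called Triplegangers, and its business is very simple: it sells 3D digital models of people.

Triplegangers specializes in selling "digital twin" model assets of the human body. These high-definition 3D model photos come from scans of real people and are enormously valuable. Founder Tomchuk has always been happy with the business; the company is small, but it is the work he loves most.

The site has 65,000 product pages in total, each with at least three high-definition photos. Every image is meticulously labeled with age, skin color, tattoos, and even scars.

Then, on an ordinary Saturday morning, that calm was abruptly shattered by a storm. Tomchuk received an urgent notification: the company's website had gone down under a massive DDoS attack. He was stunned, because ...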