Tootfinder

Opt-in global Mastodon full text search.

AI bots that scrape the internet for training data are hammering the servers of libraries, archives, museums, and galleries,
and are in some cases knocking their collections offline,
according to a new survey published today.
While the impact of AI bots on open collections has been reported anecdotally,
this survey is the first attempt at measuring the problem,
which in the worst cases can make valuable, public resources unavailable to humans
because the…

@grumpybozo@toad.social
2025-06-18 01:35:26

They (or an intentional DDoS) have been pounding the #SpamAssassin RuleQA site into catatonia. They construct URLs which are legitimate and which each cause the site to go digging for the specific performance of a rule on an arbitrary date in the past. Hundreds of rules tested daily for ~20 years.

@tante@tldr.nettime.org
2025-06-17 12:40:05

"AI bots that scrape the internet for training data are hammering the servers of libraries, archives, museums, and galleries, and are in some cases knocking their collections offline"
#AI is ruining our digital world
(Original title: AI Scraping Bots Are Breaking Open Libraries, Archives, and Museums)

@johl@mastodon.xyz
2025-06-17 12:15:44

AI bots that scrape the Internet for training data are stressing out the servers of galleries, libraries, archives, and museums. In some cases they take #GLAM collections offline.

@erc_bk@fosstodon.org
2025-04-08 02:49:17

Another damn dynamic website bit me today. Tried to scrape some data with a POST request, but it was sneakily generated by a secondary GET request that was triggered by a button I didn't even click.