AI web crawlers are a menace.
https://thelibre.news/foss-infrastructure-is-under-attack-by-ai-companies/
Literally no reason they can't just make their scraping a bit less evil to avoid this. Just add some cache logic and respect robots.txt 🤷
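For anyone wondering what "some cache logic and respect robots.txt" could look like in practice, here is a minimal sketch, not anyone's actual crawler. It uses Python's standard `urllib.robotparser` for the robots.txt check and a plain in-memory dict for ETag-based conditional requests. The `PoliteFetcher` class, the `ExampleBot` user agent, and the method names are all hypothetical, and the network call itself is left out so the logic stays easy to follow.

```python
from urllib.robotparser import RobotFileParser

USER_AGENT = "ExampleBot"  # hypothetical crawler name, for illustration only


class PoliteFetcher:
    """Sketch of the two checks: obey robots.txt, and revalidate instead of re-downloading."""

    def __init__(self, robots_txt: str, user_agent: str = USER_AGENT):
        self.user_agent = user_agent
        self.robots = RobotFileParser()
        # Parse robots.txt content that was fetched once for this host.
        self.robots.parse(robots_txt.splitlines())
        self.cache = {}  # url -> (etag, body)

    def allowed(self, url: str) -> bool:
        # True only if robots.txt permits this user agent to fetch the URL.
        return self.robots.can_fetch(self.user_agent, url)

    def request_headers(self, url: str) -> dict:
        # Send If-None-Match when we already hold a copy, so the server
        # can answer with a cheap 304 instead of the full page.
        headers = {"User-Agent": self.user_agent}
        cached = self.cache.get(url)
        if cached is not None:
            headers["If-None-Match"] = cached[0]
        return headers

    def handle_response(self, url: str, status: int, etag, body: bytes) -> bytes:
        # 304 Not Modified: reuse the cached body.
        if status == 304 and url in self.cache:
            return self.cache[url][1]
        # 200 with an ETag: remember it for the next visit.
        if status == 200 and etag:
            self.cache[url] = (etag, body)
        return body
```

A real crawler would also wait between requests; `self.robots.crawl_delay(self.user_agent)` returns the `Crawl-delay` value from robots.txt if the site sets one.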
@Kiloku It's so frustrating that people with foolish assumptions are being given so much money and zero accountability.
@mauve It's not a matter of difficulty. They are desperate for more and more data because they already gobbled up everything they had permission to use, and it didn't give them their "AGI" dream. They *want* to ignore every restriction because they stupidly believe more data will make their bullshit generators better.