markwyner@mas.to
@markwyner@mas.to
Posts
-
RE: https://tldr.nettime.org/@tante/116605858023186072
@inthehands For a while I was hesitant to block Google. They have a psychological grip on us. We’re made to feel like we must play their game or our site doesn’t exist.
Fuck that. I’m out. I’m gonna block all of their bots. It’s gonna be 403 city.
-
RE: https://tldr.nettime.org/@tante/116605858023186072
@inthehands Crawlers choose whether or not to honor robots.txt and meta noindex/nofollow.
The proper way to do this is to add user-agent detection on the server side and force a 403, which refuses the request outright.
This only works if you know all of the user agents and the crawlers aren’t hiding behind spoofed ones. Anyone can send any user-agent string when crawling the web.
But the 403 solution is pretty solid overall.
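For anyone who wants to see what the server-side check could look like, here is a minimal sketch as a plain WSGI wrapper in Python. It is not tied to any particular server or framework. The token list is illustrative and incomplete, and the names `BLOCKED_AGENT_TOKENS`, `is_blocked`, `block_agents`, and `site` are hypothetical ones made up for this example.

```python
# Assumption: an illustrative, incomplete list of substrings that appear in
# the User-Agent headers of crawlers you want to refuse. Matched in lowercase.
BLOCKED_AGENT_TOKENS = (
    "googlebot",
    "adsbot-google",
    "mediapartners-google",
    "googleother",
)


def is_blocked(user_agent):
    """Return True if the User-Agent header contains a blocked token."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in BLOCKED_AGENT_TOKENS)


def block_agents(app):
    """Wrap a WSGI app so blocked agents get a 403 before the app runs."""
    def middleware(environ, start_response):
        if is_blocked(environ.get("HTTP_USER_AGENT")):
            body = b"Forbidden\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain; charset=utf-8"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Everyone else is passed through to the real site untouched.
        return app(environ, start_response)
    return middleware


def site(environ, start_response):
    """Hypothetical stand-in for the real site."""
    body = b"Hello\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


# The object a WSGI server would be pointed at.
application = block_agents(site)
```

As said above, this only catches crawlers that announce themselves honestly. A request with a spoofed or missing User-Agent header goes straight through to `site`.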