Working on some poison-as-a-service (PaaS).
-
Working on some poison-as-a-service (PaaS). Looking to launch in the next few days.
@JulianOliver personally, I'd just prefer using some passive sensors like @stratosphere and then basically flag all the "#AI" #crawlers and #scrapers to add them to a public blocklist for everyone to block!
-
@JulianOliver You mean Instagram Lifecoach generator

-
@JulianOliver Nepenthes GeoCities Edition?

-
@JulianOliver solid gold
-
Looking forward to it!! Also, you might find this great list helpful: https://algorithmic-sabotage.gitlab.io/asrg/sabot-in-the-age-of-ai/ via @asrg. It brings together a range of related approaches and could be useful as you refine things before launch.
-
Also working on a zip bomb, to randomly scatter in among the links.
Thanks to @anaiscrosby I came across this excellent method, using LZ77:
https://natechoe.dev/blog/2025-08-04.html
TBH I was just going to `dd if=/dev/urandom` my way to a titanic RAM flooding *.gz, but am getting great results with the above, and with bonus site data honey inside to keep bots on the chase.
-
This '#antiAI content' that reads exactly like bad #LLM output on a gradient that looks like a 90s sticker book had a stroke. The snake isn't just eating its own tail, it's leaving a five-star review of the experience.
It won't work because scrapers don't care about your CSS. The text is still plaintext in the HTML. The gibberish doesn't poison anything, models already train on billions of tokens of garbage and route around it. And if your adversarial content is indistinguishable from the thing you're fighting, you're just contributing to the slop pile for free.
#adversarialAi not
-
@n_dimension Oh the CSS is only there for me and any other human that likes to look at it.
I am already seeing bots chewing into my tarpit, & spending time++ getting lost, wasting cycles, and soon I'll be flooding their RAM. While they keep coming back (they are, over and over), my little swamp will be waiting.
If I & others are managing to contribute to the "slop pile", that's great. We're helping keep genAI text distinguishable from human generated content in an era of unregulated deception
-
How do you differentiate AI bots from search engine bots from sploit scrapers?
I've seen sploit scrapers in my logs, but I wouldn't be able to disambiguate AI from indexer.
-
@n_dimension I do not see a way to differentiate, and suspect there is no way.
-
@anaiscrosby After seeing ChatGPTBot blow 123 seconds on my drip-feed poison tarpit and then never come back, I got reading on how modern LLM scrapers might employ mechanisms to detect tarpits and blacklist.
During research I came across this tarpit evading scraper that provides some interesting insights into how modern LLM scrapers might do this.
https://github.com/Draconiator/Ipema
This gives me pause and has me looking at other solutions for counter-detection.
The GeoCities CSS is going nowhere.
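
For readers unfamiliar with the term, here is a minimal sketch of what a "drip-feed" response can look like, assuming the tarpit simply sends a page in small chunks with a pause between them. The names `drip_feed` and `total_hold_time`, the chunk size, and the delay are all hypothetical and are not taken from the tarpit described above.

```python
import time
from typing import Callable, Iterator


def drip_feed(body: bytes, chunk_size: int = 16, delay_s: float = 1.0,
              sleep: Callable[[float], None] = time.sleep) -> Iterator[bytes]:
    """Yield `body` in small chunks, pausing before every chunk after the first.

    `sleep` is injectable so the pacing can be checked without really waiting.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for offset in range(0, len(body), chunk_size):
        if offset:  # no pause before the very first chunk
            sleep(delay_s)
        yield body[offset:offset + chunk_size]


def total_hold_time(body_len: int, chunk_size: int, delay_s: float) -> float:
    """Seconds a client is held if it reads the whole response."""
    chunks = -(-body_len // chunk_size)  # ceiling division
    return max(chunks - 1, 0) * delay_s
```

Under these assumed settings, a 2048-byte page sent 16 bytes at a time with a one-second pause holds a patient client for `total_hold_time(2048, 16, 1.0)`, which is 127 seconds. A scraper with a short read timeout would drop the connection well before that, which is one way a tarpit could be detected.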
-
@JulianOliver @anaiscrosby thank you
-
Interesting!! Based on my little experience implementing a similar tarpit using spigot (https://github.com/gw1urf/spigot) via @pengfold, I’ve noticed something pretty similar - bursts of activity (millions of hits/day) followed by long stretches of silence. From the intensity and patterns, it does seem like many scrapers aren’t consistently avoiding the tarpit, at least initially.
That said, I’d be a bit cautious about that conclusion. What you might be seeing isn’t necessarily "they can’t avoid it," but more like:
- some scrapers don’t try to detect tarpits (they just brute-force crawl and eat the cost)
- others probe once, flag it, and then blacklist it, hence the sudden silence
- and some operate in waves (rotating IPs / infrastructure), which can look like on/off behavior
-
@JulianOliver Did you see this paper by Anthropic researchers? https://arxiv.org/abs/2510.07192
250 samples can poison even the largest models. That’s one webring! Even if detectable, might be a good way to avoid getting scraped?
-
@JulianOliver @anaiscrosby I haven't looked into tarpits but it smells to me very much like an "arms race" situation and there's no reason to think your side could prevail.
-
@danstowell @anaiscrosby Winning would be nice, but I don't think it's always about prevailing. Just as a likelihood of failing need not undermine the will to act. Resistance, doing something, standing ground, rather than letting this predatorial broligarchy have their way.
Much of the time it's just about pushing back. If concerted, and at scale, it can indeed bring about tangible change.
-
@danstowell @JulianOliver It’s about pushing back, not prevailing. “Tarpitting” has already emerged as a widely adopted response to AI, both a strategic approach and a meaningful act of resistance.
-
@anaiscrosby @JulianOliver Thanks. I see that it's been adopted. My concern is that it might cost us a lot developing these tarpits that have very little strategic effect if they become outmoded v quickly. But I really don't know - it's a very murky phase rn