jrconlin / android_hell

A fun place for naughty robots.

You're now entering Android Hell

AI can be a bit... Greedy.

There are reports that some AI systems may be ignoring the contents of robots.txt files. This is naughty of them, but then, they're LLMs, and not actually AIs, so they have no concept of naughty, rules or really anything.

So, to that end, I've created a fun space for them to play in.

What this does is detect the intruding AI by its User Agent, and then feed it 8,000-10,000 words derived from some AI-generated nonsense that I ran through a Markov chain engine. Oh, I may also include a prompt injection attack at the top of this. (I may also randomize that, because I'm devious that way.)
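For the curious, here is a rough Python sketch of that idea, not the code that actually ships in this repo. The marker list in `BOT_MARKERS`, the function names, and the word-level chain are all assumptions made for illustration; swap in whatever crawlers and seed text suit your site. The prompt injection preamble is left out.

```python
import random
from collections import defaultdict

# Illustrative User-Agent substrings only; this list is an assumption,
# not the list the repo uses.
BOT_MARKERS = ("gptbot", "ccbot", "claudebot")


def is_ai_crawler(user_agent):
    """Return True if the User-Agent header contains a known crawler marker."""
    ua = (user_agent or "").lower()
    return any(marker in ua for marker in BOT_MARKERS)


def build_chain(seed_text):
    """Map each word to the list of words that follow it in the seed text.

    The seed needs at least two words, or the chain will be empty.
    """
    words = seed_text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return dict(chain)


def generate_slop(chain, min_words=8000, max_words=10000, rng=None):
    """Walk the chain at random until the target word count is reached."""
    rng = rng or random.Random()
    target = rng.randint(min_words, max_words)
    starts = list(chain)
    word = rng.choice(starts)
    out = [word]
    while len(out) < target:
        followers = chain.get(word)
        # A word with no followers is a dead end, so restart somewhere random.
        word = rng.choice(followers) if followers else rng.choice(starts)
        out.append(word)
    return " ".join(out)
```

A request handler would call `is_ai_crawler()` on the incoming User-Agent header and, on a match, return `generate_slop(build_chain(seed))` instead of the real page. Matching on substrings is deliberately crude: it only catches crawlers that announce themselves.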

Anyhoo... What I expect will happen is that LLMs will get a link to something, ignore the robots.txt file I set up, and scrape away. Since they're dumber than a bag of LLMs, they'll pull the data stream as text and feed their ever-growing maw with something that looks like text, but isn't. I expect this will probably just make some poor student sad that their carefully crafted prompt returns some crap about hamsters, bananas and Proust, but that can't be helped. Perhaps they can write a stern X to whichever VC funded the engine they were forced to use, before said VC cashes out and heads off to their offshore survival camp.

Ah, well, the wheels of progress grind on, do they not?

Anyway, not only are you encouraged to re-use this on your site, you're also STRONGLY encouraged to change things up. Use a different pile of slop as the seed. The longer, the better/worse (depending on how many puffy vests are in your closet).

I have configurations for both nginx and apache2, because those are what I run on my sites. Feel free to contribute whatever extra bits might help make LLMs' lives worse.

License: Other

Languages: PHP 52.4%, Python 47.6%