How to Build A Network of Robot Sleeper Agents

And they won’t even have red glowing spines that give them away…

Why *did* their spines glow red? Was that ever explained?

I’ve been somewhat unconcerned about the misinformation applications of LLMs (ChatGPT and cousins). After all, people are perfectly capable of generating misinformation, in quite sufficient volumes to do harm, relatively cheaply.

However, after witnessing some experiments where folks were using LLM-powered Bing to do Twitter-based “style transfer” (in other words, asking the LLM to read a particular person’s tweets and then generate tweets in the style of that person), it occurs to me that there is something an LLM could do that would be a genuinely novel form of misinformation: the generation of synthetic “personalities” and the insertion of these personalities into online communities. Here’s how you would do it:

  1. Find a whole lot of online communities (subreddits, forums, YouTube content communities, Twitter cliques, etc.) you want to infiltrate. What these communities are about isn’t all that important; you want a broad base of lots of communities. Knitting subreddits, YouTube gamer circles, Star Wars fan forums, parenting Twitter, all of it.
  2. Sort through these communities and generate corpora of language from each (this actually could be the most “interesting” part of this process; you might need to do some network analysis).
  3. Use your corpora to fine-tune an LLM to generate social media posts in the typical fashion of your communities (this could be computationally intensive… or not, if you can just send the ChatGPT API a block of posts and say “write like this please”).
  4. Feed your fine-tuned LLMs posts from their communities, and have them write responses that match the recent discussions on the forum. At this point, you just let your LLM-based “community members” blend in; you don’t ask them to say anything in particular, just keep up with the chit-chat.
  5. (optional) Write another deep learning tool that watches engagement with posts and tries to steer the LLM output towards high-engagement contributions (risky, could backfire, potentially computationally intensive)
  6. At either a predetermined time (e.g. ok, it’s October in an election year!) or in response to particular topics (e.g. somebody on this forum wants to install a heat pump), your LLM-based sleepers start delivering posts in the style of their community but with content of your choosing (e.g. “Did you hear, candidate X has been lying about her emails,” “everyone knows heat pumps don’t really work and they are even worse for the environment!”).
  7. You now have a distributed faux-grassroots message network that would put most previous forms of astroturf to shame, both in terms of distribution and flexibility.

So yeah, that’s how you would do it, if you were some sort of LLM-powered supervillain. Which I am not… yet…
