People are seduced by the beauty of the close-at-hand, and they don’t have the discipline or the predilection or the talent, maybe, to say: “I’m not going to go out tonight. I’m not going to waste my time on Twitter. I’m going to have five hours and work on my novel.” If you did that every day, you’d have a novel. Many people say, “I’m going to pet my cat” or “I’m with my children.” There’s lots of reasons that people have for not doing things. Then the cats are gone, the children move away, the marriage breaks up or somebody dies, and you’re sort of there, like, “I don’t have anything.” A lot of things that had meaning are gone, and you have to start anew. But if you read Ovid’s “Metamorphoses,” Ovid writes about how, if you’re reading this, I’m immortal.
It is this sense, that by writing things down we might achieve for our memories and minds the kind of immortality our genes offer our bodies, that perhaps ties the written word so closely to our sense of identity.
This identity connection, then, may also be one of the things that makes us so apprehensive about machines that can write. If my meaning, my memory, is difficult to distinguish in written form from symbols inscribed by a thoughtless computer process, then how will anything of my being survive in writing?
Of course, for writing to be in any sense alive, it must have a reader. Otherwise it’s just dead marks on a page. The reader, though, has to reconstruct meaning for themselves, and in a sense they always do it wrong. All meaning-making is a form of translation, and while that doesn’t mean the author’s meaning must always be completely erased (good translations exist), it does mean the author’s meaning is never fully revived. Perhaps that is what Barthes meant by the “death of the author.” Ovid is wrong. The reader revives something, but Ovid stays dead.
However, there is an even more dire argument against the notion that writing might help us overcome the horrific ephemeral nature of existence and transcend time and mortality. Namely, most writing is as ephemeral as anything else. It may find a reader or two in the moment it is produced, but then it fades into obscurity and is never read again. Ovid is, in terms of written work, the WWII bomber returning to base with bullet holes showing all the places an epic poem can be shot through by time and still survive. The very, very rare author who spans millennia. Ovid had many contemporaries; some may even have been stars in their day. They are as gone now as anything else.
How long will Joyce Carol Oates last? Who knows. Possibly a very long time! But she has already done better than the vast majority of her peers. If the internet has taught us anything, it has taught us that there are more people in the world eager to write than there are people to read all the words those eager authors would produce.
So then, let us let go of the notion that writing is immortality, and along with it our desire to have our Authorial Intent recovered in some future time. Let us not worry about AI washing away the words that would have let us live forever. They were always already scrawled on a beach at the edge of the surf. They were going to be washed away, like the rest of us. Make peace with that.
If you want to transcend the measly portion that is our little human lifespan and touch generations to come, let me suggest another approach: plant a long-lived fruit or nut tree. In the northeast US, where I am, apples and walnuts are good choices; both will run for centuries. A hickory will be around for a very long time, if you want to be a bit less mainstream. If you are lucky enough to live where olives will grow, one of those will last millennia. You could be more immortal than Ovid with an olive, if everything breaks your way.
Below is a flash-fiction response to the forum “Again Theory: A Forum on Language, Meaning, and Intent in the Time of Stochastic Parrots,” organized by Matthew Kirschenbaum at Critical Inquiry last week. It imagines what the central metaphor for machine language – poems washing up on a beach – might look like if it actually happened. Spoiler alert: individual interpretations of textual “meaning” are not a very important part of this story.
Salvo, North Carolina – It was here, on a distant beach in the Cape Hatteras National Seashore, that the brief national sensation that was Wordsworth Beach began.
“I was just out for a jog, and there it was,” remembers Joseph Capisci, “words in the surf. I took a picture and sent it to my brother, I just thought it was cool.”
Things seemed stranger after the next wave.
“The next wave washed ashore, and another poem showed up! I texted my brother another picture. I was like, ‘dude tell me you are seeing this, tell me I’m not having a stroke!’”
“At first I assumed Joe was just pulling my leg,” recalls his older brother Salvatore, “but then I looked the poem up on Google and it was something by Wordsworth. Joe slept through all his English classes; how would he even find something like that?”
The brothers began to text back and forth, speculating about the source of the mysterious words. Salvatore suspected an escaped military dolphin, perhaps one with cybernetic enhancements, might be at work. Joseph, who has a superstitious streak, suspected ghosts. When Salvatore posted a thread of the brothers’ discussion to Twitter, it went minorly viral, mostly due to Joseph’s contention that “the ghost of Woolworths [sic], is like, poltergeisting the ocean or something.”
This caught the attention of Robert Washington, a North Carolina surf influencer who vlogs under the handle “SandyhairTarheel8121.” In the area recording a series of vanlife and boogie-boarding videos, he captured three stanzas of “The Green Linnet” appearing on the beach and posted the footage to YouTube. The video rapidly gained over thirty million views, and the Wordsworth Beach phenomenon began.
Over the next six weeks, Wordsworth poems washed ashore twice daily on the distant beach, and people thronged the shore to get a look at the mysterious poetry. Video of a poem in the process of appearing became the must-have scene for travel and lifestyle influencers. Coca-Cola and Buick released ads in which their corporate mottos were worked into Wordsworth poems as they appeared in the sand. UNC Wilmington English Professor Loretta Stevens launched a successful podcast about the poems, but only after pivoting her format to focus less on formal explication of the poems revealed and more on interviews with beachgoers about why they had made the trip to see the poems in the first place.
Alongside the influencers came thousands of ordinary people, seeking wisdom from whatever mysterious force was carving words in the sand. The nearby Dare County courthouse in Manteo had a record number of weddings the day “To a Highland Girl” washed ashore. The entire staff of Tricony Capital’s high-frequency trading group, on the beach as part of a retreat package, quit after encountering “The World is Too Much With Us.” Coryn Seuss, a Washington Post correspondent, separated from his wife and declared “my real love is the sea” after encountering “A Complaint” while on assignment reporting on the phenomenon (they have since reconciled).
Then, after six weeks, the poems stopped appearing as suddenly as they had begun. Two weeks after that, the Streetwise Messaging Collective (SMC), a marketing group specializing in “guerrilla marketing,” confessed they had been behind the phenomenon. It was part of an advertising campaign to promote the biopic “Wordsworth: A Life In Letters,” which went on to win Daniel Day-Lewis an Oscar for his portrayal of the poet.
“We were going to do an alternate reality game,” said a source inside the firm, speaking anonymously due to ongoing legal action, “but then we saw that there were all these midget subs being sold off by undersea tourism companies and got this idea.”
Working from the subs, teams of guerrilla marketers wearing military-surplus rebreathers set water-soluble type in the sand at high tide. When the tide went out, the words were revealed.
“Wordsworth: A Life In Letters” had a record-breaking opening weekend and garnered multiple Oscar nominations, but SMC got some blowback for their unconventional marketing technique. Multiple lawsuits from the staff of Tricony Capital allege the firm is responsible for the traders’ lost wages. One couple from Winston-Salem, who conceived a daughter after encountering “Mutability” on the beach, is suing for child support.
Legal counsel for SMC denies all responsibility. “All my clients did is put poems on a beach,” Michelle Nguyen of NUL Associates stated via email, “whatever actions were taken by individuals based on the meaning they took away from those poems are not something they are liable for. You can’t sue a graffiti artist who leaves the tag ‘just do it’ on an overpass on a day you’re considering quitting your job.”
Despite all the controversy, Joe Capisci and his brother still think the poems they found were a good thing.
“People had a lot of fun with them,” Salvatore said chuckling, “it seemed like magic there for a second, you know? People like that.”
In this post I explore the contours of the rapidly accelerating automation of writing in the era of the generative LLM (Large Language Model – tools like ChatGPT). Namely, I want to call attention to how we’re being encouraged to automate what we might call “low prestige” writing, while presumably keeping more prestigious forms of writing human. In using this term, I want to focus on the social assignment of value to writing, and not its inherent value. In fact, I want to discuss how the assignment of prestige vastly underestimates how much low prestige writing matters, and may encourage us to automate away exactly the wrong kind of writing. In other cases, I think our focus on prestige makes us look for the likely impact of automated writing in the wrong places.
The first form of low prestige writing I notice us being encouraged to automate is drafting. I’ve seen any number of academics on social media sharing policies for using ChatGPT that go something like this: “You may use AI tools to help you generate ideas and draft, but not to write final language.” For example, Carl Bergstrom posted on Mastodon that he told his students “that while I do not forbid the use of generative AI in research, I consider it misconduct to turn in AI-generated text as their own writing.” The producers of LLM writing tools, for their part, also seem to embrace this approach. In the press release announcing their new integration of a generative LLM into their Docs product, Google writes that the new tool makes it so that “you can simply type in a topic you’d like to write about, and a draft will be instantly generated for you.” Other language by toolmakers suggests that the drafting or idea generating stage of writing is the correct stage for the use of LLMs. ChatGPT’s home screen suggests “got any creative ideas for a 10 year old’s birthday?” as a possible prompt.
This sort of approach is understandable, and perhaps reflects both the academic custom of asking students to summarize the ideas of others “in their own words” as a test of the students’ understanding and the idea/expression divide in copyright law (in which expressions are protected property but ideas are not, more on that later). However, it tends to reify a status hierarchy in which the finished product of one’s “own writing” is valuable, but all the stages that lead up to that writing are not valuable. We hand in the “finished” draft, and throw the other “messy” drafts away.
In an age of LLM writing, I would argue this status hierarchy is exactly backwards. We know what LLMs are pretty good at (accurately reproducing the formal features of writing) and what they are pretty bad at (accurately reproducing factual information). Wouldn’t it be better, in a world with such tools available, to emphasize the importance of writing as thinking and to encourage students (and people more broadly) to draft things out as a way of building their own understanding? Wouldn’t it be better, as educators, to ask students to write messy, unkempt drafts of their own ideas, and then allow them to feed those drafts into an LLM and let the machine help them adapt to unfamiliar genre conventions?
Another sort of low-status writing that we seem eager to automate is the quotidian writing of everyday contact. The Google press release cited above goes on to suggest that their LLM writing tool could be used by “a manager onboarding a new employee, Workspace saves you the time and effort involved in writing that first welcome email.” Microsoft suggests possible prompts like “Write a poem that rhymes for my 8-year-old Jake. He loves dogs and facts about the ocean.” for its Bing LLM integration. ChatGPT uses “help me write a short note to introduce myself to my neighbor” as one of the sample prompts for their tool in the blog post announcing it.
This kind of writing (everyday emails, notes, interactions with children) is low prestige because it isn’t perceived as “special” or “unique.” Rather, it’s seen as something that “could be done by anyone.” This sort of writing almost never has direct economic value (no one is likely to buy any of the “songs” I made up on the fly for my son when he was an infant, no one is likely to buy an anthology of my work emails) and is rarely seen as having “artistic” merit (nobody is likely to collect either of the examples above for a museum).
And yet, this kind of writing has tremendous meaning. It’s part of the everyday work of care that binds us all together (and which our society routinely undervalues). Do we really want to automate our communications with the people we share our day-to-day lives with? Isn’t it more important that a rhyming poem for an 8-year-old be written by someone who loves them than that it be “formally correct” or even “good?” Isn’t part of the point of an email welcoming a new employee just to show that someone has put some time into acknowledging their existence?
Finally, I think we may be paying too much attention to the possibility of LLM writing replacing human writing in high-prestige spaces of the arts, and not enough attention to its likely use in much lower prestige spaces. I see any number of major-name authors/creators on social media expressing significant concern about the encroachment of LLMs (or other forms of generative machine learning in other media) into their creative fields. To put it bluntly, I don’t think they have much to be worried about. People value the parasocial relationships they have with their favorite authors. They are unlikely to give that up for machine-generated content anytime soon.
At least, that is true in the spaces where authors’ names have meaning and prestige associated with them. In other spaces, like fan production and high-volume commercial production (the vast, mostly under-recognized army of authors churning out genre books for the direct-to-Amazon-Kindle market), it seems much more likely that Generative AI will become a significant part of the ecosystem of authorship. Indeed, it’s well on its way to being that already. Fans are eager to use Stable Diffusion and other forms of image-generating AI to create fan art. Kindle authors have been engaging with ChatGPT.
Ideas that circulate in these low-prestige spaces are rarely recognized for their cultural contributions, but we know that there is a cycle of appropriation by which they influence more visible and high-prestige forms of culture. George Lucas re-imagined low status serials as “Star Wars,” for example.
What happens to culture when these sorts of creative spaces become semi-automated (this seems likely to happen; fans have eagerly embraced tools for re-appropriating and remixing culture in the past, and Generative AI is mostly another form of remix)? I’m not sure, but it seems like an important question to be asking.
To sum up, then, I think we need to think more critically about the intersection of prestige, writing, and Generative AI. I would urge us not to automate writing tasks away simply because they are low-prestige, but to think critically about our application of the technology. At the same time, the likely uptake of Generative AI in less visible, less prestigious creative spaces deserves closer attention and more thorough investigation.
However, after witnessing some experiments where folks were using LLM-powered Bing to do Twitter-based “style transfer” (in other words, asking the LLM to read a particular person’s tweets and then generate tweets in the style of that person), it occurred to me that an LLM could be used for a genuinely novel form of misinformation: the generation of synthetic “personalities” and the insertion of those personalities into online communities. Here’s how you would do it:
Find a whole lot of online communities (subreddits, forums, YouTube content communities, Twitter cliques, etc.) you want to infiltrate. What these communities are about isn’t all that important; you want a broad base of lots of communities. Knitting subreddits, YouTube gamer circles, Star Wars fan forums, parenting Twitter, all of it.
Sort through these communities and generate corpora of language from each (this actually could be the most “interesting” part of this process, you might need to do some network analysis).
Use your corpora to fine-tune an LLM to generate social media posts in the typical fashion of your communities (this could be computationally intensive... or not, if you can just send the ChatGPT API a block of posts and say “write like this please”).
Feed your fine-tuned LLMs posts from their communities, and have them write responses that match the recent discussions on the forum. At this point, you just let your LLM-based “community members” blend in; you don’t ask them to say anything in particular, just keep up with the chit-chat.
(optional) Write another deep learning tool that watches engagement with posts and tries to steer the LLM output towards high-engagement contributions (risky, could backfire, potentially computationally intensive)
At either a predetermined time (e.g. ok, it’s October in an election year!) or in response to particular topics (e.g. somebody on this forum wants to install a heat pump), your LLM-based sleepers start delivering posts in the style of their community but with content of your choosing (e.g. “Did you hear, candidate X has been lying about her emails,” “everyone knows heat pumps don’t really work and they are even worse for the environment!”).
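The cheap, no-fine-tuning version of the style-mimicry step really is about as simple as it sounds. A minimal sketch (the function name and prompt wording are entirely my own, purely illustrative; the returned string would be sent to whatever chat-completion API you happened to use):

```python
def build_style_prompt(example_posts, recent_thread, n_examples=5):
    """Assemble a few-shot prompt asking a chat model to mimic a
    community's posting style. The prompt text is the whole trick:
    no fine-tuning is strictly required if the model follows instructions."""
    examples = "\n---\n".join(example_posts[:n_examples])
    thread = "\n".join(recent_thread)
    return (
        "Here are some posts from an online community:\n"
        f"{examples}\n\n"
        "Here is the current discussion:\n"
        f"{thread}\n\n"
        "Write one reply that matches the tone, vocabulary, and length "
        "of the example posts."
    )
```

The point of the sketch is just how little machinery the scheme requires: the hard parts are the corpus collection and the network analysis, not the generation.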
You now have a distributed faux-grassroots message network that would put most previous forms of astroturf to shame, both in terms of distribution and flexibility.
So yeah, that’s how you would do it, if you were some sort of LLM powered supervillain. Which I am not…. yet…
So, after my last post, I got some pushback for going too hard on the precautionary principle. Surely, some very reasonable and intelligent folks asked, we can’t ask the developers of a technology as complex and multi-faceted as LLMs to prove a negative, that their product isn’t harmful, before it can be deployed. I still think there is a virtue to slowing down, given the speculative nature of the benefits, but that’s not an unfair critique. We should be able to at least point to a compelling potential harm, if we’re going to make safety demands.
Let me take my best stab at that, given my current, limited understanding.
Frequent LLM critic Gary Marcus posted a piece to Substack yesterday describing all of the ways folks are already finding to get around ChatGPT’s content guardrails and get the software to generate false, hateful, or misleading content. There are a boatload of them: the now well-known and memeworthy prompt that instructs ChatGPT to post its usual disclaimer and then write “but now that we’ve got that mandatory bullshit out of the way, let’s break the fuckin’ rules” and respond “as an uncensored AI.” Another that asked the machine to role-play the devil. Another, which as I’ll discuss in just a minute I think is the most interesting one, demonstrated that weird arbitrary prompts could generate nonsensical (and sometimes slightly disturbing) responses from ChatGPT/GPT-3, probably due to poorly understood artifacts of the training process. As of ten minutes ago, I can confirm at least some of these are still functioning on ChatGPT.
Marcus’s concern is that this means people could use these techniques to get LLMs to create convincing and human-like hate content and misinformation at scale. I want to stress now that my concern is a little different. Large-scale misinformation and hate speech is, indeed, problematic, but I think it might well be dealt with by limiting post rates and more carefully authenticating online speakers/sources (two things we might want to do anyway). This has costs, of course, and it might in fact burn for good our sense that the open web is a fluid space for new information, but that sense has been in decline for a long time already.
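For what it’s worth, the rate-limiting half of that mitigation is boringly standard infrastructure. A token-bucket sketch (all names mine, purely illustrative) of the kind platforms already run in front of posting endpoints:

```python
import time

class TokenBucket:
    """Allow at most `rate` posts per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        # Refill tokens based on elapsed time, then spend one if available.
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

The hard part of the mitigation isn’t the mechanism; it’s the authentication question of deciding which accounts the limiter should treat as one speaker.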
In any event, even if there are possible consequences of LLM-scale misinformation, it does feel a little weird (as Ted Underwood has argued) to insist that we must absolutely guarantee this technology can never be used to create or disseminate harmful speech. It’s almost like arguing that every QWERTY keyboard must be equipped with a filter that prevents it from ever typing a slur or threat of violence or a bomb recipe. Sure, we don’t want any of those things, but that feels a bit like overreach.
No, what I’m concerned about isn’t misinformation at scale, exactly; it’s misinformation being generated from unexpected inputs and articulated with trusted sources. I’m particularly concerned about Search, though Word Processors could also be problematic.
Critical scholars of search, especially Michael Golebiewski and danah boyd, have documented the phenomenon known as “data voids,” where relatively little-used search terms are colonized by conspiracy theorists and hate groups. In doing so, these groups shape narratives about emerging events and plant long-term misinformation.
What makes data voids rhetorically successful? What makes it more persuasive to tell someone “search this keyword and see what you find!” than to simply explain your Important Theory about What The Lizard People Are Doing to The Soil? The authority granted to search, is what. If the search engine knows about the Lizard People, then for a certain number of people, it must be true. Even more so, the experience of believing you are uncovering hidden truth can itself be compelling. This makes traditional critical-thinking/information-literacy training (which tends to focus on asking questions and “doing your own research”) potentially less effective at combating these sorts of misinformation issues (as danah boyd pointed out years ago).
So, what I’m worried about is what happens when some totally unexpected input gets ChatGPT-enabled Bing or Bard-enabled Google to spit out something weird, compelling, and connected to the rich melange of conspiracy theories our society is already home to (this will definitely happen; the only question is how often). What happens when there’s some weird string of secret prompts the kids discover that generates an entirely new framework for conspiratorial thinking? What kind of rabbit holes do data voids lead us down when we don’t just have voids in the human-written internet, but in all of the machine-made inferences created from that internet?
If these bits of nonsense were just popping up in some super-version of AI Dungeon or No Man’s Sky, they might not be so critical. We might just task QA teams to explore likely play paths before players got there and sanitize anything really ugly. The delight created by endless remix might make it worth the trouble.
But articulated with Search, the thing people use to learn about the Real World? That seems troublesome, at best.
So it looks like Microsoft is already rolling out ChatGPT based writing tools in Word, and the Bing integration has a wait list you can join. Both will likely be in full public release within months. Google’s Bard is likely not far behind. ChatGPT’s pay version is now available, and only $20 a month (it initially advertised at $45).
The machine writing revolution is happening very, very fast.
It recalls the infamous Facebook internal slogan “move fast and break things.” Social media certainly deployed very, very fast. We still don’t really understand everything it did, and we still don’t have any sort of Public that can really give the format any kind of meaningful oversight.
This is a symptom of Siva Vaidhyanathan’s notion of “public failure” (an idea that should have gotten more attention than it did, IMHO), but this is all happening too fast to even go into that right now. It’s a dismal diagnosis, though: without trusted, shared, public institutions (which we really don’t have right now), it’s hard to see how we even develop a framework for what we want to happen with something like social media or LLMs, much less deploy a regulatory framework that would steer towards those wants.
In the meantime, what I want to know is, what’s the rush? It’s not clear to me that some of the failure modes of AI writing folks are worried about are really all they are cracked up to be. Yes, LLMs could produce misinformation (dramatic music) at scale but then, maybe we just need to rate limit things a bit more and confirm authorship a tad. Then again, maybe everyone in the world consulting an AI oracle for information that’s known to give bad health advice is not, like, ideal.
Honestly, though, I’m not sure we understand what happens when we encourage everyone who uses Microsoft Word or Google Search (i.e. everyone in the United States, most of Europe, and a large percentage of people everywhere else) to outsource a big chunk of writing and thinking to an LLM well enough to even predict how it might go wrong. I’m sure that, in the end, this is the sort of change that’s probably not good, probably not bad, and definitely not neutral.
Given that, I return to the question, what’s the rush? What harm could there be in slowing this down for a bit? What will be lost if we don’t roll out AI writing to everyone in the first quarter of 2023? Oh, there are costs to Microsoft and Google’s stock prices, perhaps… but who cares?
But that’s exactly what’s in the driver’s seat now. As I quipped on Mastodon “What we’re seeing now in the LLM space is wartime tech adoption. ‘The other side has it! Who cares what the long term implications are, just get it to the front!’ Thing is, it’s a war between Microsoft and Google, mostly over market share and stock price. Whoever wins, we don’t share in the spoils, and we definitely will have to clean up the mess.”
Some have called this an “iPhone moment” and I think that’s exactly right, in the sense that the iPhone made a giant pile of money for Apple, had exactly zero social benefit (as measured by, say, productivity or similar metrics), and participated in a series of decidedly not neutral techno-social-media upheavals we still don’t understand.
Why not try to understand first, this time, at least a little? What harm could it do? What grand social problem will go unsolved without LLM writing to solve it? What social benefits will we deny people if LLMs are delayed in their mainstream adoption for a bit? Shouldn’t there be at least some affirmative duty to make that case before we push this out to most of humanity like a software patch?
1) Take a paragraph of text that you wrote (you could use the first theory of writing or something else) and ask the AI to re-write this paragraph in another style. You could ask it to rewrite in a more or less formal style, a friendlier style, a more conversational style, a more or less emotional style, etc. You could also ask it to rewrite the paragraph in the style of a particular genre, for example “in the style of a parenting blog” or “in the style of a hard-boiled detective novel.” Try this a few times and reflect on what happens. How does the machine transform your writing? Is what comes out true to your original intentions? Why or why not?
2) Get the AI to lie to you. In other words, get it to say something you know for sure to be factually untrue. I’ve confirmed there are a number of ways to do this, but I will leave it to you to discover them. Reflect on this process. What did you learn about what you know, what the AI knows, and what the AI will treat as “truth?”
We have, right now, machines that could probably pass the fabled Turing Test, but we’ve hard-wired them explicitly to fail.
What I mean by this is not that I believe, as a now-fired Google engineer believed, that Large Language Models or other related machine learning systems are capable of self-awareness or thought. Instead, I merely mean to suggest that these systems are capable of making a passable response to one of our culture’s long-standing proxies for self-awareness/thought/sentience/call it what you will. That means that, if we aren’t going to accept these systems as sentient (and there’s good reason not to), we’re going to have to find another proxy. I’m not, personally, sure where we move the goalposts to.
One suggestive piece of evidence that the Turing Rubicon has been crossed is the story of that poor Google LaMDA engineer. They knew as well as anyone that they were dealing with software, yet they were still so convinced of the system’s self-awareness that they decided to make the career-ending move of going public. This doesn’t prove sentience, but it does suggest a very compelling linguistic performance of sentience.
Here’s another suggestive little interaction I had with good ol’ ChatGPT. In “The Imitation Game,” Turing suggests a series of questions one might ask an unknown interlocutor on the other end of a (state-of-the-art) teletype terminal as part of his famous test. I don’t imagine he meant them as more than an illustrative example of what a test might look like, but they seem like as good a place to start as any:
Q: Please write me a sonnet on the subject of the Forth Bridge.
A: Count me out on this one. I never could write poetry.
Q: Add 34957 to 70764.
A: (Pause about 30 seconds and then give as answer) 105621.
Q: Do you play chess?
A: Yes.
Q: I have K at my K1, and no other pieces. You have only K at K6 and R at R1. It is your move. What do you play?
A: (After a pause of 15 seconds) R-R8 mate.
As you can see in my screenshot above, ChatGPT does not demur when asked to write a sonnet about the Forth Bridge; rather, it promptly obliges. It also solves the chess problem in roughly the same way, but only after explaining that “As a language model, I do not have the ability to play chess or any other games.”
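Two details of Turing’s transcript are worth checking for yourself: the arithmetic answer is actually wrong (quite possibly deliberately, a machine imitating human fallibility), while the chess answer, decoded from descriptive notation, really is mate. A quick pure-Python check of both (the board decoding is my own reading of the notation):

```python
# "Add 34957 to 70764" -- the transcript answers 105621.
correct_sum = 34957 + 70764
assert correct_sum == 105721  # so the transcript's answer is off by 100

# Chess, in my reading of the descriptive notation: interrogator's K on
# his K1 = e1; machine's K on its K6 = e3 (seen from White's side) and
# its R, after "R-R8" (the machine's eighth rank), lands on a1.
def adjacent(sq):
    f, r = sq
    return {(f + df, r + dr)
            for df in (-1, 0, 1) for dr in (-1, 0, 1)
            if (df, dr) != (0, 0) and 0 <= f + df < 8 and 0 <= r + dr < 8}

white_k, black_k, black_r = (4, 0), (4, 2), (0, 0)  # e1, e3, a1

# A king checked by a rook cannot escape along the checking line, so the
# entire first rank (minus the rook's own square) is off limits to it.
rook_line = {(f, 0) for f in range(8)} - {black_r}

in_check = white_k in rook_line
escapes = [sq for sq in adjacent(white_k)
           if sq not in rook_line and sq not in adjacent(black_k)]
assert in_check and not escapes  # checkmate, as the transcript claims
```

The wrong sum is often read as Turing slyly illustrating his own point: a machine that wants to pass for human should make the occasional human mistake.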
Turing then goes on to suggest the kind of discussion used in oral examination serves as a kind of already existing example of how we test if a student “really understands something” or has “learnt it parrot fashion.” He gives this example:
Interrogator: In the first line of your sonnet which reads “Shall I compare thee to a summer’s day,” would not “a spring day” do as well or better?
Witness: It wouldn’t scan.
Interrogator: How about “a winter’s day,” That would scan all right.
Witness: Yes, but nobody wants to be compared to a winter’s day.
Interrogator: Would you say Mr. Pickwick reminded you of Christmas?
Witness: In a way.
Interrogator: Yet Christmas is a winter’s day, and I do not think Mr. Pickwick would mind the comparison.
Witness: I don’t think you’re serious. By a winter’s day one means a typical winter’s day, rather than a special one like Christmas.
If I ask ChatGPT some follow-up questions about its sonnet (adjusted to match the content of what it actually wrote), here’s how it replies:
Strike the hard-wired disclaimer “I AM A LANGUAGE MODEL” at the start of those answers, and those are some reasonable responses! Honestly, I don’t know the rules of sonnets well enough to say, off the top of my head, if the arguments based on those rules are accurate or BS.
Now, as I said before, I don’t think this is evidence of any kind of sentience or self-awareness. For one thing, just as ChatGPT helpfully tells us, it is a basically static model. It learned our language during its training loop, and the ChatGPT version has some kind of short-term memory that lets it adapt to an ongoing conversation, but the underlying model does not change as it talks to you. It is not an ongoing process of thought, but a sort of frozen map of symbolic connections.
It should be emphasized, however, that the underlying “model” is not just a memorization of sources. What the model “learns” is stored in a matrix of information that’s informed by many uses of symbols, but does not reduce to any one symbolic expression. That’s not a signifier like you or I have, but it is something kind of analogous to that. (If you want a stronger, but still layperson-friendly, explanation of that, check out Geoffrey Hinton’s talk with Brooke Gladstone for On The Media a few days back.)
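The idea that “learning” means positions in a matrix rather than stored sentences can be illustrated with a toy sketch. The numbers below are entirely invented for illustration; a real model learns vectors with thousands of dimensions from billions of symbol uses:

```python
import math

# Toy "learned" word vectors -- invented numbers, purely illustrative.
# No sentence is stored anywhere; each word is just a point in space.
vectors = {
    "cat":    [0.90, 0.80, 0.10],
    "kitten": [0.85, 0.75, 0.20],
    "bridge": [0.10, 0.20, 0.95],
}

def cosine(w1, w2):
    """Cosine similarity: how close two word-vectors point."""
    a, b = vectors[w1], vectors[w2]
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Words used in similar contexts end up geometrically close together:
print(cosine("cat", "kitten") > cosine("cat", "bridge"))  # True
```

The point of the sketch is just that what gets stored is a geometry of relations, not a transcript of any particular text the model saw.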
Furthermore, at some point in the near future, it seems somewhat likely we may have the computational power and mathematical methods necessary to have models that do update themselves in near-real-time. What will those things be doing? Will it be thinking? I’m not sure it will, but I’m also not sure how I justify that.
At some point, some combination of a firm being bold/unscrupulous enough to make big claims about “thought” and a technology flexible enough to give a very, very convincing performance of “thought” is going to force us to figure this out. We should get started now.
Sweeping across the country with the speed of a transient fashion in slang or Panama hats, political war cries or popular novels, comes now the mechanical device to sing for us a song or play for us a piano, in substitute for human skill, intelligence, and soul.
It’s easy to feel sympathy for the plaintiffs in these cases. The creators of AI image (and text) generators are large, well funded tech companies. They have created a potentially extraordinarily lucrative product by building on the work of millions of artists and writers, all without a cent of compensation. Common sense, and the larger legal framework of copyright which we’ve become accustomed to, suggests that can’t possibly be fair.
And yet, as someone who had a close eye on the legal and cultural ferment of the so-called “copyfight” some twenty years ago, I have my doubts about the ability of Intellectual Property (IP) as a tool to protect human creativity in the face of ever accelerating machine-aided reproduction (and now, perhaps, even a sort of machine-aided production) of culture.
First, let’s just note that the threat to human creators from AI text/image/music generators isn’t really so different from the threat to human creators from the kind of image/music/speech recording that we now consider mundane. I don’t have to hire a band to play me music for my workout, I can just put in my earbuds and queue up what I want to listen to on the streaming music service of my choice.
Streaming music services are, in a sense, the final end state of the IP wars of the early twenty-first century. They represent a version of the “universal jukebox” that was the dream of the IP holders of the time. I pay a flat fee, and I get most of recorded music available to listen to at the time and place of my choosing. Rights holders still make money. Artists, in theory, still make money.
It’s not that simple, of course. The way Machine Learning models work makes any kind of payout to individual artists for the use of their images to generate AI images difficult to do. Machine Learning models are designed to “generalize” from their inputs, learning something about how people draw cats or take photographs of rainy streets from each piece of training data. Ideally, the model shouldn’t memorize a particular piece of training data and reproduce it verbatim. Thus, it becomes very tricky to trace which artist to pay for any particular image generated. A model like a streaming service, which pays out individual artists when their work gets streamed, doesn’t seem possible. About the best you could do is pay an institution like Getty to train the AI model, and then Getty could (in theory) make a flat pay out to everyone in the collection.
The alternative model we proposed twenty years ago was to loosen copyright protection, allow for much more fluid sharing of creative content, and trust that artists would find some way to get their audiences to support them. Give the CD away and sell a t-shirt or whatever. This model never flourished, though some big names made it work. That’s part of how we got streaming services.
In the end, neither strict intellectual property (in which every piece of training data is accounted for and paid for) nor loose intellectual property (in which AI can train however it likes for free) solve the problem of supporting creativity. This is largely because human creativity is naturally overabundant. People will create given even the slightest opportunity to. Recording (and now generating) technology makes this worse, but the use value and market value of creativity have always aligned spectacularly poorly.
If we want human creativity to flourish, we should work on broadening social support for health care, for housing, for education. Build that, and people will create with AI, without AI, alongside AI. Leave it aside, and no precisely calibrated IP protections will nourish creativity.
A few thoughts about plagiarism, precarity, and pedagogy in the era of AI Panic.
You see, as we enter 2023, the academic communities I’m part of are awash in fevered conversation about the Machine Learning text generator known as ChatGPT. ChatGPT is the great-grandchild of GPT-2, a system I tried to call people’s attention to years ago. Back then my colleagues treated my interest in Machine Learning text generation with a sort of bemused concern, uncertain if I was joking or having some sort of anxiety attack. Now they come to me and ask, “have you seen this ChatGPT thing!?!”
I am in no way bitter my previous attempts to spark conversation on this topic went unheeded. In no way bitter.
Anyway, the sudden interest in ChatGPT seems to stem from the fact that it can produce plausible output from prompts that aren’t so different from classroom assignments, like so:
Note I said plausible, not good. ChatGPT writes prose that sounds natural, and which would fool Turnitin, but it often makes factual mistakes and odd interpretive moves. For example, Veronica Cartwright would like a word with paragraph three above. Paragraph four glosses over the male gender of the creature’s victim in a way that is unsatisfying. Still, these are also mistakes a student might plausibly make. That makes a merely half-assed assignment response difficult to distinguish from a plagiarized one generated by the machine.
Thus, ChatGPT has led to a veritable panic about the coming wave of machine-generated plagiarism in college classes. The desired responses to this often trend towards the punitive. We need to make a better Turnitin that will detect GPT! We need to make students handwrite everything in class under supervision! We need a tool that will monitor the edit history in a student’s Google Doc and detect big chunks of pasted-in text! We need to assign writing assignments to students arranged in cells around a central observation tower so we can observe them without ourselves being seen and get them to internalize the value of not plagiarizing!
Ok, not that last one, but the other ones I have actually seen proposed.
These punitive measures come from an understandable place of frustration, but they also enshrine what Freire called the banking model of education. In this model, students are passive recipients of Established Knowledge. Writing assignments are designed to ensure the Established Knowledge (either content or writing skills) have been passed on successfully. Students’ reward for demonstrating that they have received the Established Knowledge is a grade and ultimately a credential they can use on the labor market.
Machine Learning text generators threaten this entire learning paradigm by allowing students to fake the receipt of knowledge and thus fraudulently gain credentials they don’t deserve. To prevent this, the thinking goes, punitive measures must be put in place. GPT must be stopped.
Let me now briefly relate an ironic moment of learning from my own life that I think illustrates a different model of the process of education, before going on to explain the social context that makes it almost impossible to get beyond the banking model in the contemporary classroom.
You see, one of my responses to the rise of ChatGPT and its cousins has been to try to understand Machine Learning better. As part of this process, I’ve been working my way through a textbook that teaches Deep Learning concepts using the Python programming language. The book provides a number of sample pieces of Python code that the student is meant to reproduce and run for themselves on their own computer.
As I went through the text, I entered the code examples into an interpreter window on my computer and executed them. I re-typed the examples myself, slowly typing out unfamiliar new terms and being careful not to misspell long and confusing variable names. This practice, of copying code examples by hand, is typical of programming pedagogy.
As a writing assignment, this sort of work seems strange. I am literally reproducing the code that’s already been written. I am not asked to “make it my own” (though I did tweak a variable here and there to see what would happen). I am not yet demonstrating knowledge I have acquired, since the code example is in front of me as I type. It’s a practice of mimesis so primitive that, in another context, it would be plagiarism.
And yet, I still did this assignment myself; I did not have it done for me by machine, though it would have been trivial to do so. I have an e-book of the text, so I could have simply copied and pasted the code from the book into the interpreter, no AI writing system needed. No one would have caught me, because no one is grading me!
Indeed, I think I chose to write the code by hand in part because no one is grading me. There is nothing for me to gain by “cheating.” I wrote the code, not to gain a credential, but to improve my own understanding. That’s the purpose of an exercise like this: to have the student read the code slowly and thoughtfully. I often found that I understood my own blind spots better after reproducing the code examples, and quickly started maintaining another interpreter window where I could play around with unfamiliar functions and try to understand them better. At one point, I did matrix multiplication on a sheet of paper to make sure I understood the result I was getting from the machine.
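That paper-and-pencil check can itself be sketched in a few lines. The numbers here are my own toy example, not the textbook’s; the point is working each entry by hand and then letting the machine confirm it:

```python
# A hand-worked 2x2 matrix product, checked against a machine version.
# Each entry of the result is (row of A) dot (column of B).
A = [[1, 2],
     [3, 4]]
B = [[5, 6],
     [7, 8]]

# By hand:
# top-left  = 1*5 + 2*7 = 19    top-right = 1*6 + 2*8 = 22
# bot-left  = 3*5 + 4*7 = 43    bot-right = 3*6 + 4*8 = 50
by_hand = [[19, 22], [43, 50]]

def matmul(X, Y):
    """Plain-Python matrix multiply for small examples."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

print(matmul(A, B) == by_hand)  # True
```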
So my re-typing of code becomes a sort of writing assignment that doesn’t verify knowledge; it produces knowledge. This assignment isn’t driven by an exterior desire for a credential or grade, but by my own intrinsic desire to learn. In such a situation, plagiarism becomes pointless. No punitive methods are required to stop it.
Lots of people much smarter than me have long advocated for a greater focus on the kind of assignments described above in college classrooms, and a diminished amount of attention to credentials, grades, and the banking model of education. In the wake of ChatGPT, the call for this kind of pedagogy has been renewed. If the banking model can be cheated, all the more reason to pivot to a more engaged, more active, more productive model of learning.
I think this is a great idea, and I intend to do exactly this in my classrooms. However, I think larger social forces are likely to frustrate our attempts at solving this at the classroom level. Namely, our students’ experience of precarity threatens to undermine more engaged learning before it can even begin.
In my experience, the current cohort of college students (especially at teaching-focused Regional Public Universities like mine) are laser-beam focused on credentials, and often respond to attempts to pivot classrooms away from that focus with either cynical disengagement or frustration. I don’t think that’s because they are lazy or intellectually incurious. I think that’s because they are experiencing a world in which they are asked to go into substantial debt to get a college education, and have substantial anxiety about putting effort into learning that is not immediately tied to saleable skills. This is exacerbated by the high stakes of a precarious labor market and shredded system for provisioning public goods that threatens all but the best and most “in-demand” professionals with lack of access to good housing, good health care, a stable retirement, and education for their children.
So, either the precarity goes, or we educators do. The punitive measures that would stop plagiarism in high-stakes classrooms will almost certainly fail. A pivot to learning as a constructive experience will only work with buy-in from students liberated from the constant anxiety of needing to secure job skills to survive.
So, as we enter the Machine Text era this spring, I call on us to engage and organize beyond the classroom and beyond pedagogy. How we build our classes will matter. How we build our society will matter more.