Media, Risk, Regulation, and Utopia

The NY Times has a special section today on privacy. One piece proclaims “My Phone Knows All, and That’s Great”. It’s satire, but satire so dry that in our Poe’s Law-dominated age it is destined to be taken as sincere opinion over and over and over again. Indeed, it’s not so different from sincere arguments I hear from students all the time: “I wasn’t using my data, why do I care if Google vacuums it all up for ads? What bad thing is going to happen to me if I get a targeted ad, anyway?”

The honest answer to that question is “probably nothing.” Probably nothing bad will happen to you. It’s important to point out, however, that this is also the honest answer to the question “what bad thing will happen if I don’t put my baby in a rear-facing car seat?” Probably nothing. Probably you will go about your day and never have a car accident and the baby will be fine. That’s usually what happens. Almost all the time. Almost.

But of course, that “almost,” that small sliver of a chance that something bad could happen (even though, at the scale of n=1, it probably won’t), scaled to 300 million people, means thousands of children saved by rear-facing car seats. Thus we regulate, and mandate that manufacturers produce safe car seats, and parents install them. It’s an awkward, imperfect, ungainly system. It’s understandable that, as they spend their 20th awkward minute in the driveway trying to install an ungainly child safety device, many parents may briefly entertain conspiracy theories that the whole system is a profit-making ploy on the part of the Graco and Chicco corporations. Nonetheless, we do it, and it basically works.
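For anyone who wants to see the arithmetic behind that “almost,” here is a minimal sketch. The 1-in-100,000 risk is a made-up number chosen only to illustrate the scaling, not a real accident statistic, and the function names are my own. The calculation also assumes each person’s risk is independent of everyone else’s, which is a simplification.

```python
def expected_cases(population: int, one_in_n: int) -> float:
    """Expected number of people affected when each faces a 1-in-`one_in_n` risk."""
    return population / one_in_n


def chance_of_at_least_one(population: int, one_in_n: int) -> float:
    """Probability that at least one person is affected, assuming independent risks."""
    return 1 - (1 - 1 / one_in_n) ** population


# HYPOTHETICAL numbers for illustration only; not real accident statistics.
POPULATION = 300_000_000  # the scale used in the text above
ONE_IN_N = 100_000        # made-up individual risk: 1 in 100,000

print(f"Individual risk: 1 in {ONE_IN_N:,}")
print(f"Expected cases across {POPULATION:,} people: "
      f"{expected_cases(POPULATION, ONE_IN_N):,.0f}")
print(f"Chance that at least one person is affected: "
      f"{chance_of_at_least_one(POPULATION, ONE_IN_N):.3f}")
```

With these invented inputs, a risk that any one person can reasonably shrug off still works out to 3,000 expected cases across the whole population. That gap between the view at n=1 and the view at scale is the whole argument.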

In the same way, online surveillance is probably mostly harmless at an individual level. At a systemic level, the harms become more plausible. Some individuals may be uniquely likely to be harmed by ads that trigger traumas or prey on vulnerabilities (think of the ads targeted at soon-to-be parents, at the sick, at the depressed). At a society-wide level, better, slicker ads could further fuel the churn of over-consumption that seems to exacerbate, if not cause, so many social ills.

Of course, we’ve dealt with potential harms of ads before. At the dawn of Television, Groucho Marx would open “You Bet Your Life” with an integrated ad for Old Gold cigarettes. We eventually decided that both tobacco ads and integrated ads were bad ideas, and regulated against them (though the latter is on its way back). TV was still able to use advertising as a business model for funding a fundamentally public good (broadcast, over-the-air TV, which anyone could pick up with an antenna, an idea that seems vaguely scandalous in today’s premium-subscriptions-for-everything world). In the same way, we could put regulatory limits on what advertisers can do with our data, how they can target ads, etc. It wouldn’t kill the business model. Oh, the platforms and the ad folks will scream bloody murder about it, because their margins will suffer, but they will survive.

I, personally, would have preferred a slightly more radical option: call it the BBC model for internet content, where every purchase of an internet-connected device would pay a small fee towards a collective fund to pay internet content providers. Again, this is a reasonable adaptation of public-goods provisioning models from the broadcast age. A flawed mechanism, but one we know to work.

Fifteen years ago, there were serious proposals for such a plan, which would have avoided the era of targeted advertisers (and the surveillance system they have built) entirely. It never really got any traction, though. Instead, there was an idea in the air that the Internet Was Different. That it would be a mistake to try the techniques of the past on this radical, decentralized medium. That, rather than developing a centralized mechanism for collecting and distributing fees as a business model for content creation, it would be better to allow flexible, horizontal associations of volunteers to build things on their own. These volunteers could build amazing things for free, just look at Wikipedia! Clay Shirky argued that if we could just harness some of the cognitive surplus being wasted on the “heat sink” of Television we could easily build “100 Wikipedia projects per year.”

But, of course, we didn’t build hundreds or thousands of Wikipedias. We built one Wikipedia, and we built Facebook. In retrospect, the radically decentralized Utopia seems like a cover story for the advertising free-for-all we were actually building.

Writing Instruction in the Age of Mechanical Composition

I’m sorry Dave, this assignment is too important for me to allow you to jeopardize it…

Our students are about to start using Machine-Learning driven semi-automated composition tools, first as illicit tools designed specifically for cheating, and then as a normal part of the everyday composition process. As writing teachers, we need to think about how we’ll respond to this.

First as plagiarism….

What makes me so sure automated composition tools are coming? A recent machine learning system for generating text developed by OpenAI, called GPT2, generates uncannily good text output with minimal human intervention.

While the full version of the GPT2 language model has not been released, and presumably requires some fairly serious computer hardware, a simpler version of GPT2, called GPT2-117M, is available for anyone to download and play with on GitHub. It took me, a self-trained coder and system administrator of fairly modest skill, about an hour to get the software up and running on a very standard Windows PC. I fed this simpler version of the software the prompt for our First Year Writing program’s standard writing reflection exercise: “What is Your Definition of Good, Effective Writing?” and got output like this:

My definition of good effective writing is that it may be written with as much care as possible for an author’s pleasure, the same quality which applies to a child. I should therefore be wary of making use of this method when it should serve as a source of truth. I must, in fact, like to point out this method and take advantage of its advantages. It will be an error to say that all good works are good works, but in every case there will always be an out-of-hand way where the good works that might exist ought to be, and where the out-of-hand way is a far better choice. To make good work of a great author is the main virtue of all good works, whether it be of children or of poets or theologians, when there can be any good way from it to any good work, although in rare instances, and not only in cases where there may be a special need to find good works, there may be no other way, and there will always be a special need. In short, a good work is simply written without attention, not without effort. Nothing is more difficult to write, as one might suppose, and the whole idea of writing cannot take place without the use of written prose. If any other way were possible I might consider myself quite open to using this method. No more should people say, ‘We are going to write this book by this night! It is good for us!’ No more should anybody say, ‘It’s good for our children! It’s good for us! It should only be done by one who loves his children!’ This method will be adopted as a rule and used wherever possible.

(GPT2 – 117 Output)

The relative simplicity of setting up the available GPT2 tool, and the relatively modest computer required to run the tool, both suggest that “auto-generate your assignment” websites will likely crop up in the next few months. Students are likely to use such a resource, much as they already use tools like auto-paraphrasing websites.

Using auto-generated text to cheat on writing assignments is, I would argue, a symptom of larger failures in the way we teach and assess writing. As administrations pack writing classrooms with ever more students, and assign ever more contingent faculty to do the work of first-year writing, the amount of time instructors have to devote to reviewing each writing assignment dwindles. This encourages the use of automated plagiarism-detection tools, like Turnitin, which in turn legitimize the use of automated plagiarism-detection-avoidance tools, like auto-paraphrasers and now, likely, GPT2-based text generators. Students likely think, “If the instructor can’t take the time to read my assignment, why should I take the time to write it?” Machine reading begets machine writing and vice-versa, just as in the now decades-long war of spammer against spam detector.

Legitimate Cyborg Writers and Bullshit Writing Work

While automated writing tools may start out functioning as illicit plagiarism aids, they are likely to spread to the world of legitimate writing tools in short order. In many ways, automated writing is already a part of how we compose. Autocomplete in smartphone messaging apps is the most everyday form of this, and tools like Google’s email auto-response have begun to extend the role of cyborg writing in our everyday lives. It isn’t hard to imagine a new and improved form of Microsoft Word’s infamous “Clippy” tool that would allow writers to compose various genres of text by selecting the desired sort of document, entering some key facts, and then using GPT2 or a similar machine-learning driven text-generation tool to create a draft document the author could then revise (or perhaps tweak by setting further parameters for the tool, “Siri, make this blog post 16% snarkier”).

Such a cyborg writing environment may strike some as unsettling. Surely, critics might say, the process of composition is too important to our identity and sense of self to be automated away like this. I think there is some important truth in this critique, which I’ll elaborate on later, but I also think that the world is awash in what we might call (to paraphrase David Graeber) “Bullshit Writing Work.” Writing done, not because any actual audience wants to read it, but because some requirement somewhere says it simply must be done. Work emails, report summaries, box blurbs, website filler, syllabus mandatory policies, etc., etc. We’ve all at some point written something we knew no one would ever read, just because The Requirement Says There Must be text there. If automated tools can do the bullshit writing work, we should let them. There is no implicit honor in drudgery.

I know that the practice of teaching writing has wrestled for a long time with the problem of bullshit writing assignments, and that many people have done a lot of thinking about how to make student writing feel like something composed with a real purpose and audience in mind, rather than something that simply Must Be Done Because Syllabus. I also know that, in my own experience as a teacher, I often struggle to put these ideas into practice successfully. Too often I find that I try to assign something I mean to be Real Writing (here’s a scenario, now compose a blog post, a grant proposal, a tweet!) that ends up feeling to students like writing they must do for class, and then also pretend that they are doing it for some other reason. Bullshit on bullshit.

I can’t help but wonder if, as we think about the imminent arrival of even-more-automated cyborg writing tools than the ones we already have, we might use this as an opportunity to think about how and why writing matters. In short, as machines begin to take an ever-increasing role in creating the products of writing, I wonder if we could redouble our efforts to help students understand the importance of the process of writing. In particular, I think we need to focus on the value the writing process has in and of itself, and not as a means for creating a written product. In other words, we might:

  1. Explicitly emphasize writing-to-think and writing-to-learn. Writing is a process, first and foremost, of composing the self (a lesson I learned from Alexander Reid’s The Two Virtuals). Even as “compositions” become automated, the process of self-composition remains something we do in and through written symbols, and keeping those symbols close to the self, in plain text rather than in the black boxes of machine-learning algorithms, remains a powerful tool for thought.
  2. Spend even more time working on pre-writing, planning, outlining, note-taking. Often, students are simply told to do this work, with the expectation that they can figure that out on their own, and that they will really need our help when they get to the drafting and revising stage.
  3. Embrace cyborg text, and allow it into our classrooms. This doesn’t mean we should abandon writing exercises that might help students build hands-on experience with text. Such exercises will help them build important instincts that will continue to serve them well. But it does mean we should consider teaching how to plan for, engage with, and revise the products of machine-assisted writing as it enters the mainstream.

The ultimate effects of semi-automated writing are far more profound than these few pragmatic steps. Still, these are some ways we might adjust our classrooms in the near term, as we continue to wrestle with larger shifts.