AI Formative Feedback: Quick Thoughts and Concrete Advice

So, there’s been a lot of buzz about AI as a source of formative feedback in student writing. This has only gotten more intense since the GPT-4o demo, which was clearly being pitched as a student learning aid.

Marc Watkins has a pretty great piece out this morning where he responds to some of this. Go read it if you haven’t. He does a great job of unpacking and calling attention to the larger labor situation that might make AI-generated formative feedback both attractive and extremely problematic.

I want to take a slightly different angle than Marc on this. I totally agree that we must NOT reach for automated teaching. It’s unfair to our students and potentially disastrous for education. I also agree that the way students may find AI-generated feedback useful as a way of “fixing” their writing fundamentally points to how students have been taught to (mis)understand their writing as “broken.” As Marc puts it, “the problem is writing doesn’t need to be solved or fixed. […] When you teach writing to learn, you don’t frame unrealized ideas, poorly worded sentences, or clunky mechanics as problems that are ticked off a to-do list to fix.”

However, while I absolutely agree that we should discourage the use of AI feedback in ways that reinforce rigid definitions of “fixed” and “broken” writing, I would suggest we also imagine a slightly different use case. Namely, that of a writer who understands their rhetorical situation and who has accurately diagnosed a “problem” they need to “fix” in that particular situation. Such a writer might be able to employ AI feedback to help them solve that problem, and that might be OK.

For example, I’ve been told by my students that my assignment instructions can sometimes be hard to follow. I’ve been resisting for a while, but this summer might be the year I finally ask ChatGPT for help with that. I know my purpose, and I know my audience. Some pointers from the robot (which does tend to write clearly, even when it’s wrong) might not hurt.

Here’s another, more concrete example from my focus groups. I already discussed this in my big results post, but I’ll quote it again for clarity:

I do personally use AI a lot for sentence structure. I’ve always been told that my writing is super wordy. So I will put it into ChatGPT and say like restructure this sentence or write this sentence in a way that sounds academic and like is clear and concise and like gets the point across without, you know, being a run on or whatever. But, and while I do think, I do like that it’s a tool that’s able to help me do that. Do I wish I could do it myself with my own brain and figure it out myself? Yes. But I have the tool there. So it’s, and it’s kind of like, I’ve heard it, like I’ve heard in that sense, like I’ve heard that my sentences are wordy for so long. I’ve tried to correct it, consistently heard it. It’s like, at this point I’m like, I’ve worked on it. I’ve tried like all throughout high school before I used AI. So now it’s just like, at this point it’s a tool for me bettering my writing in the sense that it’s, it might be something that I’m not able to do. I’m just not able to learn to structure sentences in that kind of way without just like letting my creative side take hands.

There’s something really poignant about how the student sees their “creative side” as something they must somehow corral and control here, and I think that speaks a lot to Marc’s point about the way we tell students their writing is broken.

That said, this student is also clearly a thoughtful writer who has paid attention to the feedback they have been getting. They have identified a writing problem. They want to fix it, and this tool can help. I don’t think they are entirely mistaken to want that!

So, how could a writing instructor help this student shift from thinking of writing as “broken” or “fixed” and toward thinking of it in the context of a rhetorical situation? More specifically, how could they help them think critically and reflectively about their “wordiness” and make informed choices about when and how to “control” it? They might:

  1. Help the student think about the situation their writing is responding to (the fancy word is exigence), and ideas they need to communicate in that situation. What “words” are necessary to respond to that situation? Which might be superfluous?
  2. Help the student consider the needs of their audience, and the impact they want to have on this audience. Which words maximize that impact? Which distract from it?
  3. Help the student think about how to choose a genre that is appropriate to this situation, and the sorts of word choices typically made in this genre.
  4. Help the student think critically about which words they might want to keep, and which they might want to lose, given 1-3 above.

I would stress, what we really want is for the student to internalize that process, so they can do it thoughtfully with or without machine help.
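To make that process concrete, here’s a minimal sketch of what a reflection-oriented prompt might look like, as opposed to a “fix my sentence” prompt. Everything here is hypothetical: the helper name, the exact wording, and the idea of wrapping it in a function at all are my own invention, not anyone’s recommended practice.

```python
# Hypothetical sketch: a prompt that asks an AI for rhetorical questions,
# not rewrites. The function name and wording are invented for illustration.

def build_feedback_prompt(draft: str, situation: str, audience: str, genre: str) -> str:
    """Assemble a prompt that asks for reflective questions about wordiness,
    grounded in the writer's own rhetorical situation, rather than a rewrite."""
    return (
        "You are helping a student reflect on their writing. "
        "Do NOT rewrite their draft or list corrections.\n"
        f"Situation (exigence): {situation}\n"
        f"Audience: {audience}\n"
        f"Genre: {genre}\n"
        f"Draft:\n{draft}\n\n"
        "Ask 3 questions that help the writer decide which words are "
        "necessary for this situation, which serve this audience, and "
        "which fit this genre. Do not answer the questions yourself."
    )

prompt = build_feedback_prompt(
    draft="My overly wordy paragraph goes here...",
    situation="Responding to peer feedback in a first-year writing course",
    audience="Classmates unfamiliar with my topic",
    genre="Short reflective blog post",
)
print(prompt)
```

The point of the design, such as it is: the writer, not the machine, supplies the rhetorical situation, and the machine is explicitly told to return questions rather than “fixes.”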

Those of you who teach writing in higher ed are probably screaming WE ALREADY DO THAT about now. Yes! I know, but we need to redouble our focus and align how we talk about writing with students (and our admin and fellow faculty), how we assign writing tasks to students, and how we assess writing assignments to match!

GPT-4o Won’t Work As a “Companion”, But Not For The Reasons You Think

So, OpenAI has decided what it really wants to do is make its AI voice assistant more of a “companion.”

The gendering of this companion has already been widely remarked upon, and I won’t take it up here mostly because others have already dealt with the implications of “AI Girlfriends” really well and I don’t have much to add. I will, however, just drop this here:

This movie was not about robots, it was about sexism. We’ve now come full circle, and it’s about sexism in the design of actual robots.

What I want to talk about is the reason I think this probably doesn’t work. The surprise is, I don’t think it’s principally about the limits of the technology. I don’t think it’s because nothing that lacks the ability to think and feel the way a person does could ever be a source of companionship. Fictional characters are companions all the time! I think it’s instead about the tension between “assistant” (especially a machine assistant) and “companion.”

(I’m going to dance around a particular dialectic by Hegel here. I don’t know it well, and I’m not sure it’s relevant, and I don’t have time to figure that out.)

Namely, an ideal mechanical assistant is a sort of transparent intermediary for the will of the user. It does what we want it to do. Of course, Bruno Latour would tell us this ideal is impossible (all technologies are mediators), but if your goal is to create such an assistant, you make certain design choices.

That is, you make that assistant as empty of any kind of simulated “inner life” as possible. You want the machine to just do what it is told. You don’t want it to override the user’s desires with its own. This is especially true if you hail from a TESCREAL ideology and your principal concern about AI ethics is that the thing might become superhuman and take over.

This results in an AI companion that is profoundly empty. A mere reflection of the user’s desires. You can see this perhaps most clearly in a moment of the GPT-4o demo (around minute 24) when the demo hosts show off GPT-4o’s ability to read facial expressions (marketed as “detecting emotion”). After a genuinely hilarious moment when GPT-4o has some kind of caching error that causes it to suggest the host is a block of wood, it “correctly” identifies his exaggerated, forced smile as “happiness” and asks “care to share the source of those good vibes?”

There are two things in this exchange that point to how an assistant isn’t a companion. First, GPT-4o does NOT guess that the host is putting on a smile but might actually feel some other way. Perhaps a bit nervous; he seems nervous to me, especially after his machine calls him a wooden table. GPT-4o could have guessed this too: one does not need an actual inner life to detect the visual signals of a fake smile. OpenAI could have trained it on those. They chose not to, because no one wants an assistant second-guessing their performed emotions. If the boss says they are happy, the assistant agrees!

Second, when GPT-4o asks “care to share the source of those good vibes?” everyone laughs, but no one answers it. Why would they? Throughout the demo, the machine has been introducing itself by saying “hey! I’m great/fabulous/wonderful/terrific today, how are you?” It always asks about the state of the user and never says anything more than a generic happy adjective about itself. Of course, this is a genuine reflection of the machine’s lack of self, but it also makes for a profoundly one-sided and even creepy conversation. Why does it want to know why I am happy? What is it looking for?

A GPT-4o that occasionally offered up some tidbit of how its “day” was going, based on search requests, would be HILARIOUS (“I’m honestly a little concerned about the number of people asking about flu symptoms in Berlin right now”), but we all understand why they don’t dare implement that.

So as long as GPT-4o is going to be an assistant, it will never really succeed as a companion. Could an AI system function in this way? I’m not sure. Technologically it seems maybe possible, but economically/socially I’m not sure it works.

Could an AI system perform an inner life? I don’t see why, fundamentally, it couldn’t. Characters in fiction have no real inner lives, but they convince us they do all the same. Adrian Tchaikovsky thinks that the author’s inner life is the source of our sense that the character has an inner life, that our relationships with characters are kind of surrogate relationships with authors. He suggests this might be why procedurally generated worlds like Minecraft and No Man’s Sky ultimately feel so empty.

I respect Mr. Tchaikovsky’s fiction a great deal, but I’m not sure he’s right about this. In the end, our experience of a character’s inner life is entirely communicated through signs on the page. Often, the best characters are told through signs that hint at some inner experience they don’t completely explain. The reader is left to fill in those details, to imagine that inner life on their own.

In the realm of plot, it is easy to point to examples where human authors successfully left intriguing clues readers filled in with meaning that the authors themselves were unable to successfully resolve. Take the disappointing end to Clarke’s “Rama” series, which started with so many tantalizing hints, and closed with “the spaceships were made by God?” (Seriously?)

So too, I think you could probably train a machine learning system to leave hints about an inner life it does not have. To play a character. Such a character might be an interesting companion!

But economically and socially, I’m not sure it works. Would playing a character be a sustainable business model? Would it get enough engagement to pay the bills? Would it collect enough data, and data of the right, actionable type, to wrap marketing around? It might, or it might not.

Socially, the question is, would people accept a machine companion enough out of their control to be a compelling character? Or would they want to “customize” it, make it transparent to their desires, and thus kill it?

I don’t know! We’ll probably find out!