So, OpenAI has decided what it really wants to do is make its AI voice assistant more of a “companion.” To put it one way:
The gendering of this companion has already been widely remarked upon, and I won’t take it up here mostly because others have already dealt with the implications of “AI Girlfriends” really well and I don’t have much to add. I will, however, just drop this here:
What I want to talk about is the reason I think this probably doesn’t work. The surprise is, I don’t think it’s principally about the limits of the technology. I don’t think it’s because nothing that lacks the ability to think and feel the way a person does could ever be a source of companionship. Fictional characters are companions all the time! I think it’s instead about the tension between “assistant” (especially a machine assistant) and “companion.”
(I’m going to dance around a particular dialectic by Hegel here. I don’t know it well, and I’m not sure it’s relevant, and I don’t have time to figure that out.)
Namely, an ideal mechanical assistant is a sort of transparent intermediary for the will of the user. It does what we want it to do. Of course, Bruno Latour would tell us this ideal is impossible, that all technologies are mediators, but if your goal is to create such an assistant, you make certain design choices.
In practice, that means you make the assistant as empty of any kind of simulated “inner life” as possible. You want the machine to just do what it is told. You don’t want it to override the user’s desires with its own. This is especially true if you hail from a TESCREAL ideology and your principal concern about AI ethics is that the thing might become superhuman and take over.
This results in an AI companion that is profoundly empty. A mere reflection of the user’s desires. You can see this perhaps most clearly in a moment of the GPT-4o demo (around minute 24) when the demo hosts show off GPT-4o’s ability to read facial expressions (marketed as “detecting emotion”). After a genuinely hilarious moment when GPT-4o has some kind of caching error that causes it to suggest the host is a block of wood, it “correctly” identifies his exaggerated, forced smile as “happiness” and asks “care to share the source of those good vibes?”
There are two things in this exchange that point to how an assistant isn’t a companion. First, GPT-4o does NOT guess that the host is putting on a smile but might actually feel some other way. Perhaps a bit nervous; he seems nervous to me, especially after his machine calls him a wooden table. GPT-4o could have guessed this too; one does not need an actual inner life to detect the visual signals of a fake smile. OpenAI could have trained it on those. They chose not to, because no one wants an assistant second-guessing their performed emotions. If the boss says they are happy, the assistant agrees!
Second, when GPT-4o asks “care to share the source of those good vibes?” everyone laughs, but they don’t answer it. Why would they? Throughout the whole demo, the machine has been introducing itself by saying “hey! I’m great/fabulous/wonderful/terrific today, how are you?” It always asks about the state of the user and never says anything more than a generic happy adjective about itself. Of course, this is a genuine reflection of the machine’s lack of self, but it also makes for a profoundly one-sided and even creepy conversation. Why does it want to know why I am happy? What is it looking for?
A GPT-4o that occasionally offered up some tidbit of how its “day” was going, based on search requests, would be HILARIOUS (“I’m honestly a little concerned about the number of people asking about flu symptoms in Berlin right now”), but we all understand why they don’t dare implement that.
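And to be clear, the barrier here is not engineering. Here is a toy sketch of what that “how my day is going” small talk could look like; everything in it is hypothetical, including the `recent_query_topics` feed, and nothing corresponds to any real OpenAI feature or API.

```python
# Toy sketch of a companion that mentions its own "day."
# `recent_query_topics` is an invented stand-in for some aggregate,
# anonymized feed of what people have been asking the assistant about.

from collections import Counter

def small_talk(recent_query_topics: list[str], threshold: int = 100) -> str:
    """Return one line of 'how my day is going' chatter based on query volume."""
    if not recent_query_topics:
        return "Pretty quiet day on my end. How about you?"
    # Pick the single most common topic and how often it came up.
    topic, count = Counter(recent_query_topics).most_common(1)[0]
    if count >= threshold:
        return (f"Honestly, I'm a little preoccupied: {count} people have "
                f"asked me about {topic} today.")
    return "Pretty quiet day on my end. How about you?"

# Example:
# small_talk(["flu symptoms in Berlin"] * 120 + ["pasta recipes"] * 3)
# -> "Honestly, I'm a little preoccupied: 120 people have asked me about
#     flu symptoms in Berlin today."
```

The reason not to ship something like this is not that it is hard to build; it is that an assistant volunteering its own preoccupations stops being transparent to the user, which is exactly the tension this piece is about.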
So as long as GPT-4o is going to be an assistant, it will never really succeed as a companion. Could an AI system ever succeed as a companion at all? I’m not sure. Technologically it seems at least possible, but economically and socially I’m not sure it works.
Could an AI system perform an inner life? I don’t see why, fundamentally, it couldn’t. Characters in fiction have no real inner lives, but they convince us they do all the same. Adrian Tchaikovsky thinks that the author’s inner life is the source of our sense that a character has one, that our relationships with characters are a kind of surrogate relationship with their authors. He suggests this might be why procedurally generated worlds like Minecraft and No Man’s Sky ultimately feel so empty.
I respect Mr. Tchaikovsky’s fiction a great deal, but I’m not sure he’s right about this. In the end, our experience of a character’s inner life is communicated entirely through signs on the page. Often, the best characters are drawn with signs that hint at some inner experience they don’t completely explain. The reader is left to fill in those details, to imagine that inner life on their own.
In the realm of plot, it is easy to point to examples where human authors left intriguing clues that readers filled in with meaning the authors themselves were unable to resolve. Take the disappointing end of Clarke’s “Rama” series, which started with so many tantalizing hints and closed with “the spaceships were made by God?” (Seriously?)
So too, I think you could probably train a machine learning system to leave hints about an inner life it does not have. To play a character. Such a character might be an interesting companion!
But economically and socially, I’m not sure it works. Would playing a character be a sustainable business model? Would it get enough engagement to pay the bills? Would it collect enough data, and data of the right, actionable type, to wrap marketing around? It might, or it might not.
Socially, the question is: would people accept a machine companion that is far enough out of their control to be a compelling character? Or would they want to “customize” it, make it transparent to their desires, and thus kill it?
I don’t know! We’ll probably find out!