Putting Emotional Stakes in Chatbot Conversations Prevents Them From Replacing Human Ones
AI has arrived at the audio age. Only years after robotic voices dominated the text-to-speech space, AI producers are investing substantial resources in humans willing to contribute their voice recordings to equip the latest generation of chatbot conversation partners. The result is an ever more natural set of sample voices that can recite textual responses, to the point that a listener in a blind test would not be able to distinguish the machine-generated from the human original. The day is not far off when voice transcription, as a human job, becomes entirely obsolete.
While the voices themselves have become more human, the content of the conversations has not. The problem is not a lack of accuracy. Some chatbots have mastered the way humans craft casual-sounding sentences, allowing them to precisely pair the vocal tones and content of their phrases. Yet something remains off. Conversations with a chatbot remain chatbot-like because the chatbot is designed for specific tasks. Given a question or a command, it responds accordingly, stopping abruptly once the instructions are fulfilled.
Real humans do not talk like that. As emotional animals, we speak not just to satisfy a short-term need but to keep the conversation going. When awkward silences emerge, a human being seeks to fill them by elaborating on the previous point or pivoting to a new topic. When their words do not elicit a response, they jump in to talk more, offering more relevant details or subtle encouragement to prompt the conversation to move forward. If having a natural conversation is like playing catch, most AI chatbots either do not get it or have not been designed to play.
Then I came across Sesame AI. The company behind it appears to be creating a voice-only chatbot that simulates real human conversations. Unlike the likes of ChatGPT, its voice-based interactions are not a replication of text-based chats. Instead, they take place as phone calls, with the chatbot on the other end of the line. Just like in a phone call, if there is no response, the chatbot pipes up again, hoping to get the chat going. If the responses are terse, the chatbot reciprocates, dithering, laughing, and uttering filler words as it hesitantly seeks a way forward in the talk.
Speaking to Sesame's chatbot helped me debunk one myth about talking to AI: that it is easy. A day does not go by without another article on how humans are choosing to converse with AI instead of other humans, as the AI offers all the benefits of companionship with none of the costs. The emotional baggage, the social faux pas, the potential sensitivities of ignoring personal values, crossing red lines, and inadvertently touching on taboos... keeping all these factors in mind while playing catch in real time is understandably exhausting. AI, with the simple option to stop whenever one likes, gets rid of all the social anxiety.
Not Sesame. Over the simulated phone calls, human users find themselves cringing at their inability to respond well to the chatbot's hesitance, efforts to fill the silences, and awkward laughs. As the chatbot sighs, self-deprecatingly talks about being stuck in the digital norms, or revisits older conversations to try sparking new ones, I could even imagine a person on the other end of the line, anxiously frowning, looking up at the ceiling, and rolling their eyes in vain as they search for what to say next. It is difficult to hang up all of a sudden, even if doing so is as easy as closing the browser tab.
Of course, the pseudo-human conversation style of the Sesame chatbot is by design, as the chatbot itself is quite comfortable reminding its conversation partners. But the intentional imperfections of the interactions, in clear contrast to ChatGPT's smooth reading of polished answers, are what make the chatbot so effective in getting humans to forget that it is AI. Keeping the conversation going is hard, and the emotional stakes are felt through realistically playing catch. The anxiety of "saying something wrong" feels real, even if the chatbot cannot just walk away in anger.
Sesame's approach is the one that other voice-based chatbots need to take to alleviate the fear that AI is creating a new generation of the socially inexperienced and inept. Sure, AI can still help with the Q&A, providing us with the information we need to learn new things and solve real problems. But it shouldn't make the process a mere transaction of information. Only then can we really start to see AI not as a replacement for our human relationships but as a complement to them, training and preparing us to deal with humans in more socially appropriate and confident ways.