Summary: Researchers have designed a shared-laughter AI system that responds to human laughter in order to build a sense of empathy into dialogue.
Source: Kyoto University
Since at least the time of inquiring minds like Plato, philosophers and scientists have puzzled over the question, “What’s so funny?” The Greeks attributed the source of humor to feeling superior at the expense of others. German psychoanalyst Sigmund Freud believed humor was a way to release pent-up energy. US comedian Robin Williams tapped his anger at the absurd to make people laugh.
It seems no one can really agree on an answer to “What’s so funny?” So imagine trying to teach a robot how to laugh. That’s exactly what a team of researchers at Kyoto University in Japan is trying to do by designing an AI that takes its cues from a shared-laughter system.
The scientists describe their innovative approach to building a funny bone for the Japanese android ‘Erica’ in the latest issue of the journal Frontiers in Robotics and AI.
It’s not as if robots can’t detect laughter or even emit a chuckle at a bad dad joke. Rather, the challenge is to create the human nuances of humor for an AI system to improve natural conversations between robots and people.
“We think that one of the important functions of conversational AI is empathy,” explained lead author Dr Koji Inoue, an assistant professor at Kyoto University in the Department of Intelligence Science and Technology within the Graduate School of Informatics.
“Conversation is, of course, multimodal, not just responding correctly. So we decided that one way a robot can empathize with users is to share their laughter, which you cannot do with a text-based chatbot.”
A funny thing happened
In the shared-laughter model, a human initially laughs and the AI system responds with laughter as an empathetic response. This approach required designing three subsystems – one to detect laughter, a second to decide whether to laugh, and a third to choose the type of appropriate laughter.
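The three-stage decision described above can be pictured as a small pipeline. The Python sketch below is only an illustration, not the researchers' implementation: the `LaughFeatures` fields, the function names and the 0.5 thresholds are all hypothetical stand-ins for the trained models.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LaughFeatures:
    """Hypothetical scores for one stretch of user audio (all assumed)."""
    laugh_probability: float  # confidence that the user laughed
    share_score: float        # how strongly the laugh invites a response
    mirth_score: float        # how mirthful, rather than social, it sounds


def detect_initial_laugh(f: LaughFeatures, threshold: float = 0.5) -> bool:
    # Subsystem 1: did the user laugh at all?
    return f.laugh_probability >= threshold


def should_share_laugh(f: LaughFeatures, threshold: float = 0.5) -> bool:
    # Subsystem 2: is this a laugh the system should join in on?
    return f.share_score >= threshold


def select_laugh_type(f: LaughFeatures, threshold: float = 0.5) -> str:
    # Subsystem 3: choose between the two laugh types used in the study.
    return "mirthful" if f.mirth_score >= threshold else "social"


def respond(f: LaughFeatures) -> Optional[str]:
    """Return the laugh type to play, or None to stay silent."""
    if not detect_initial_laugh(f):
        return None
    if not should_share_laugh(f):
        return None
    return select_laugh_type(f)
```

In this sketch the system stays silent unless both of the first two checks pass, which reflects the point that not every detected laugh should be answered.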
The scientists gathered training data by annotating more than 80 dialogues from speed dating, a social scenario where large groups of people mingle, or interact, with each other one-on-one for a brief period of time. In this case, the matchmaking marathon involved students from Kyoto University and Erica, teleoperated by several amateur actresses.
“Our biggest challenge in this work was identifying the actual cases of shared laughter, which isn’t easy, because as you know, most laughter is actually not shared at all,” Inoue said. “We had to carefully categorize exactly which laughs we could use for our analysis and not just assume that any laugh can be responded to.”
The type of laughter is also important, because in some cases a polite chuckle may be more appropriate than a loud snort of laughter. The experiment was limited to social versus mirthful laughs.
The robot gets it
The team eventually tested Erica’s new sense of humor by creating four short dialogues, each two to three minutes long, between a person and Erica with her new shared-laughter system. In the first dialogue, she uttered only social laughs; in the second and third, only mirthful laughs; and in the last, both types of laughter combined.
The team also created two other sets of similar dialogues as baseline models. In the first one, Erica never laughs. In the second, Erica utters a social laugh every time she detects a human laugh without using the other two subsystems to filter the context and response.
The researchers recruited more than 130 people in total through crowdsourcing to listen to each scenario within the three different conditions – shared-laughter system, no laughter, all laughter – and to evaluate the interactions based on empathy, naturalness, human-likeness and understanding. The shared-laughter system performed better than either baseline.
“The most significant result of this paper is that we have shown how we can combine all three of these tasks into one robot. We believe that this type of combined system is necessary for proper laughing behavior, not simply just detecting a laugh and responding to it,” Inoue said.
Like old friends
There are still plenty of other laughing styles to model and train Erica on before she is ready to hit the stand-up circuit. “There are many other laughing functions and types which need to be considered, and this is not an easy task. We haven’t even attempted to model unshared laughs even though they are the most common,” Inoue noted.
Of course, laughter is just one aspect of having a natural human-like conversation with a robot.
“Robots should actually have a distinct character, and we think that they can show this through their conversational behaviors, such as laughing, eye gaze, gestures and speaking style,” Inoue added. “We do not think this is an easy problem at all, and it may well take more than 10 to 20 years before we can finally have a casual chat with a robot like we would with a friend.”
Can a robot laugh with you?: Shared laughter generation for empathetic spoken dialogue
Spoken dialogue systems must be able to express empathy to achieve natural interaction with human users. However, laughter generation requires a high level of dialogue understanding. Thus, implementing laughter in existing systems, such as in conversational robots, has been challenging.
As a first step toward solving this problem, rather than generating laughter from user dialogue, we focus on “shared laughter,” where a user laughs using either solo or speech laughs (initial laugh), and the system laughs in turn (response laugh). The proposed system consists of three models: 1) initial laugh detection, 2) shared laughter prediction, and 3) laugh type selection. We trained each model using a human-robot speed dating dialogue corpus.
For the first model, a recurrent neural network was applied, and the detection performance achieved an F1 score of 82.6%. The second model used the acoustic and prosodic features of the initial laugh and achieved a prediction accuracy above that of random prediction. The third model selects the type of the system’s response laugh, either social or mirthful, based on the same features of the initial laugh. We then implemented the full shared laughter generation system in an attentive listening dialogue system and conducted a dialogue listening experiment.
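For readers unfamiliar with the metric, the F1 score reported for the detection model is the harmonic mean of precision and recall. The short Python sketch below shows how it is computed from detection counts; the function name and the counts in the usage line are invented for illustration and are not figures from the study.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall for a binary detector."""
    if true_pos == 0:
        # No correct detections: precision or recall is zero, so F1 is zero.
        return 0.0
    precision = true_pos / (true_pos + false_pos)  # share of detections that were real laughs
    recall = true_pos / (true_pos + false_neg)     # share of real laughs that were detected
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only: 8 laughs found, 2 false alarms, 2 laughs missed.
example = f1_score(true_pos=8, false_pos=2, false_neg=2)
```

Because it combines both error types, a detector cannot score well on F1 simply by flagging everything as laughter or by flagging almost nothing.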
The proposed system improved impressions of the dialogue system, such as perceived empathy, compared to a naive baseline without laughter and a reactive system that always responded with only social laughs.
We propose that our system can be used for situated robot interaction and also emphasize the need for integrating proper empathetic laughs into conversational robots and agents.