Summary: EmoNet, a new convolutional neural network, can accurately decode images into eleven distinct emotional categories. Training the AI on over 25,000 images, researchers demonstrate image content is sufficient to predict the category and valence of human emotions.
Source: University of Colorado at Boulder
Could a computer, at a glance, tell the difference between a joyful image and a depressing one?
Could it distinguish, in a few milliseconds, a romantic comedy from a horror film?
Yes, and so can your brain, according to research published this week by University of Colorado Boulder neuroscientists.
“Machine learning technology is getting really good at recognizing the content of images – of deciphering what kind of object it is,” said senior author Tor Wager, who worked on the study while a professor of psychology and neuroscience at CU Boulder. “We wanted to ask: Could it do the same with emotions? The answer is yes.”
Part machine-learning innovation, part human brain-imaging study, the paper, published Wednesday in the journal Science Advances, marks an important step forward in the application of “neural networks” – computer systems modeled after the human brain – to the study of emotion.
It also sheds a new, different light on how and where images are represented in the human brain, suggesting that what we see – even briefly – could have a greater, more swift impact on our emotions than we might assume.
“A lot of people assume that humans evaluate their environment in a certain way and emotions follow from specific, ancestrally older brain systems like the limbic system,” said lead author Philip Kragel, a postdoctoral research associate at the Institute of Cognitive Science. “We found that the visual cortex itself also plays an important role in the processing and perception of emotion.” THE BIRTH OF EMONET
For the study, Kragel started with an existing neural network, called AlexNet, which enables computers to recognize objects. Using prior research that identified stereotypical emotional responses to images, he retooled the network to predict how a person would feel when they see a certain image.
He then “showed” the new network, dubbed EmoNet, 25,000 images ranging from erotic photos to nature scenes and asked it to categorize them into 20 categories such as craving, sexual desire, horror, awe, and surprise.
EmoNet could accurately and consistently categorize 11 of the emotion types. But it was better at recognizing some than others. For instance, it identified photos that evoke craving or sexual desire with more than 95 percent accuracy. But it had a harder time with more nuanced emotions like confusion, awe, and surprise.
Even a simple color elicited a prediction of an emotion: When EmoNet saw a black screen, it registered anxiety. Red conjured craving. Puppies evoked amusement. If there were two of them, it picked romance. EmoNet was also able to reliably rate the intensity of images, identifying not only the emotion it might illicit but how strong it might be.
When the researchers showed EmoNet brief movie clips and asked it to categorize them as romantic comedies, action films or horror movies, it got it right three-quarters of the time.
WHAT YOU SEE IS HOW YOU FEEL
To further test and refine EmoNet, the researchers then brought in 18 human subjects.
As a functional magnetic resonance imaging (fMRI) machine measured their brain activity, they were shown 4-second flashes of 112 images. EmoNet saw the same pictures, essentially serving as the 19th subject.
When activity in the neural network was compared to that in the subjects’ brains, the patterns matched up.
“We found a correspondence between patterns of brain activity in the occipital lobe and units in EmoNet that code for specific emotions. This means that EmoNet learned to represent emotions in a way that is biologically plausible, even though we did not explicitly train it to do so,” said Kragel.
The brain imaging itself also yielded some surprising findings. Even a brief, basic image – an object or a face – could ignite emotion-related activity in the visual cortex of the brain. And different kinds of emotions lit up different regions.
“This shows that emotions are not just add-ons that happen later in different areas of the brain,” said Wager, now a professor at Dartmouth College.
“Our brains are recognizing them, categorizing them and responding to them very early on.”
Ultimately, the researchers say, neural networks like EmoNet could be used in technologies to help people digitally screen out negative images or find positive ones. It could also be applied to improve computer-human interactions and help advance emotion research.
The takeaway, for now, says Kragel:
“What you see and what your surroundings are can make a big difference in your emotional life.”
Emotion schemas are embedded in the human visual system
Theorists have suggested that emotions are canonical responses to situations ancestrally linked to survival. If so, then emotions may be afforded by features of the sensory environment. However, few computational models describe how combinations of stimulus features evoke different emotions. Here, we develop a convolutional neural network that accurately decodes images into 11 distinct emotion categories. We validate the model using more than 25,000 images and movies and show that image content is sufficient to predict the category and valence of human emotion ratings. In two functional magnetic resonance imaging studies, we demonstrate that patterns of human visual cortex activity encode emotion category–related model output and can decode multiple categories of emotional experience. These results suggest that rich, category-specific visual features can be reliably mapped to distinct emotions, and they are coded in distributed representations within the human visual system.