To Identify a Voice, Brains Rely on Sight

Summary: Voice and face recognition may be linked even more intimately than previously thought.

Source: University of Pittsburgh

To recognize a famous voice, the human brain uses the same center that becomes active when the speaker’s face is presented, according to a neuroscience study in which participants were asked to identify U.S. presidents.

The new study, published last week in the Journal of Neurophysiology, suggests that voice and face recognition are linked even more intimately than previously thought.

It raises the intriguing possibility that visual and auditory information relevant to identifying someone feeds into a common brain center, allowing for more robust recognition by integrating separate modes of sensation.

“From behavioral research, we know that people can identify a familiar voice faster and more accurately when they can associate it with the speaker’s face, but we never had a good explanation of why that happens,” said senior author Taylor Abel, M.D., associate professor of neurological surgery at the University of Pittsburgh School of Medicine.

“In the visual cortex, specifically in the part that typically processes faces, we also see electrical activity in response to famous people’s voices, highlighting how deeply the two systems are interlinked.”

Although the interplay between the brain’s auditory and visual processing systems has been widely acknowledged and investigated, those systems were traditionally thought to be structurally and spatially distinct.

Until recently, few studies attempted to directly measure activity from the brain center whose primary role is to consolidate and process visual information, in order to determine whether this center is also engaged when participants hear famous voices.

Researchers at Pitt had a unique opportunity to study that interaction in patients with epilepsy who, as part of their medical care, were temporarily implanted with electrodes measuring brain activity to determine the source of their seizures.

Image: one head with a mouth and another with a giant eyeball. Image is in the public domain.

Five adult patients consented to participate in the study, in which Abel and his team showed them photographs of three U.S. presidents (Bill Clinton, George W. Bush and Barack Obama) or played short recordings of the presidents’ voices, and asked them to identify each one.

Recordings of electrical activity from the fusiform gyri (FG), the region of the brain responsible for processing visual cues, showed that the same region became active when participants heard familiar voices, though that response was lower in magnitude and slightly delayed.

“This is important because it shows that auditory and visual areas interact very early when we identify people, and that they don’t work in isolation,” said Abel. “In addition to enriching our understanding of the basic functioning of the brain, our study explains the mechanisms behind disorders where voice or face recognition is compromised, such as in some dementias or related disorders.”

Ariane Rhone, Ph.D., of the University of Iowa, and Kyle Rupp, Ph.D., of Pitt, are co-first authors. Additional authors of the study are Dan Tranel, Ph.D., and Matthew Howard, III, Ph.D., both of the University of Iowa; and Jasmine Hect, Ph.D., and Emily Harford, Ph.D., both of Pitt.

Funding: This work was supported by National Institutes of Health grants R01 DC004290 and R21 DC019217.

About this auditory and visual neuroscience research news

Author: Anastasia Gorelova
Source: University of Pittsburgh
Contact: Anastasia Gorelova – University of Pittsburgh
Image: The image is in the public domain

Original Research: Closed access.
“Electrocorticography reveals the dynamics of famous voice responses in human fusiform gyrus” by Ariane Rhone et al. Journal of Neurophysiology

Abstract

Electrocorticography reveals the dynamics of famous voice responses in human fusiform gyrus

Voice and face processing occur through convergent neural systems that facilitate speaker recognition. Neuroimaging studies suggest that familiar voice processing engages early visual cortex, including the bilateral fusiform gyri (FG) on the basal temporal lobe.

However, what role the FG plays in familiar voice processing and whether it is driven by bottom-up or top-down mechanisms is unresolved. In this study we directly examined neural responses to familiar voices and faces in human FG using direct cortical recordings in epilepsy surgery patients.

We tested the hypothesis that neural populations in human FG respond to familiar voices and investigated the temporal properties of voice responses in FG. Recordings were acquired from five adult epilepsy surgery patients during a person identification task using visual and auditory stimuli from famous speakers (U.S. presidents Barack Obama, George W. Bush, and Bill Clinton).

Patients were presented with images of presidents or clips of their voices and asked to identify the portrait/speaker. Our results demonstrate that a subset of face responsive sites in and near FG also exhibit voice responses that are both lower in magnitude and relatively delayed (300-600 ms) relative to visual responses.

The dynamics of voice responses revealed by direct cortical recordings suggests a top-down feedback-mediated response to familiar voices in FG that may facilitate speaker identification.

One Comment

kafantaris George says:
January 4, 2023 at 9:15 pm
“From behavioral research, we know that people can identify a familiar voice faster and more accurately when they can associate it with the speaker’s face, but we never had a good explanation of why that happens.” [Now we learn that the] “auditory and visual areas [are shared brain locations that] interact very early when we identify people, and that they don’t work in isolation.”
Good work, Dr. Abel.
And if both visual and auditory information feeds into a common brain center, it seems reasonable to try and integrate visual information (such as lip movements) into voice recognition programs.
