Revealing How the Brain Understands One Voice in a Noisy Crowd

Summary: Study reveals how the brain is able to focus its attention on a single speaker in a crowded, noisy environment.

Source: University of Rochester

In a crowded room where many people are talking, such as a family birthday party or busy restaurant, our brains have the ability to focus our attention on a single speaker. Understanding this scenario and how the brain processes stimuli like speech, language, and music has been the research focus of Edmund Lalor, Ph.D., associate professor of Neuroscience and Biomedical Engineering at the University of Rochester Medical Center.

Recently, his lab found a new clue into how the brain is able to unpack this information and intentionally hear one speaker while weeding out or ignoring a different speaker. The brain actually takes an extra step to understand the words coming from the speaker being listened to, and does not take that step with the other words swirling around the conversation.

“Our findings suggest that the acoustics of both the attended story and the unattended or ignored story are processed similarly,” said Lalor. “But we found there was a clear distinction between what happened next in the brain.”

For this study, recently published in The Journal of Neuroscience, participants simultaneously listened to two stories, but were asked to focus their attention on only one.

Using EEG brainwave recordings, the researchers found the story that participants were instructed to pay attention to was converted into linguistic units known as phonemes—these are units of sound that can distinguish one word from another—while the other story was not.

“That conversion is the first step towards understanding the attended story,” Lalor said. “Sounds need to be recognized as corresponding to specific linguistic categories like phonemes and syllables, so that we can ultimately determine what words are being spoken—even if they sound different—for example, spoken by people with different accents or different voice pitches.” Co-authors on this paper include Rochester graduate student Farhin Ahmed, and Emily Teoh of Trinity College, University of Dublin.

The research was funded by Science Foundation Ireland, the Irish Research Council Government of Ireland, the Del Monte Institute for Neuroscience Pilot Program, and the National Institute on Deafness and Other Communication Disorders.

This work is a continuation of a 2015 study led by Lalor that was published in the journal Cerebral Cortex. That research was recently awarded the 2021 Misha Mahowald Prize for Neuromorphic Engineering for its impact on technology aimed at helping people with disabilities improve their sensory and motor interaction with the world, such as the development of better wearable devices, e.g., hearing aids.

The research originated at the 2012 Telluride Neuromorphic Engineering Cognition Workshop and led to the multi-institution Cognitively Controlled Hearing Aid project funded by the European Union, which successfully demonstrated a real-time auditory attention decoding system.


“Receiving this prize is a great honor in two ways. First, it is generally very nice to be recognized by one’s peers for having produced valuable and impactful work. This community is made up of neuroscientists and engineers, so to be recognized by them is very gratifying,” Lalor said.

“And, second, it is a great honor to be connected to Misha Mahowald—who was a pioneer in the field of neuromorphic engineering and who passed away far too young.”

John Foxe, Ph.D., director of the Del Monte Institute for Neuroscience, was a co-author on that study, which showed it was possible to use EEG brainwave signals to determine who a person was paying attention to in a multi-speaker environment. It was novel work in that it went beyond the standard approach of looking at effects on average brain signals.

“Our research showed that—almost in real time—we could decode signals to accurately figure out who you were paying attention to,” said Lalor.
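The idea of decoding attention from EEG is often implemented as "stimulus reconstruction": a linear decoder is trained to reconstruct a speech envelope from the EEG, and the reconstruction is then correlated with each candidate talker's envelope, with the higher correlation indicating the attended talker. The sketch below illustrates that general approach on synthetic data; the signals, dimensions, and noise level are all hypothetical, and this is not the exact pipeline used in the studies described here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: two speech envelopes, and EEG that mostly tracks envelope A
# (i.e., speaker A is the "attended" talker in this simulation).
n_samples, n_channels = 2000, 8
env_a = rng.standard_normal(n_samples)   # attended envelope (hypothetical)
env_b = rng.standard_normal(n_samples)   # ignored envelope (hypothetical)
mixing = rng.standard_normal(n_channels)
eeg = np.outer(env_a, mixing) + 0.5 * rng.standard_normal((n_samples, n_channels))

# Train a linear decoder (least-squares weights mapping EEG channels
# to the attended envelope) on the first half of the data.
train, test = slice(0, 1000), slice(1000, 2000)
w, *_ = np.linalg.lstsq(eeg[train], env_a[train], rcond=None)

# Reconstruct the envelope from held-out EEG and correlate it with
# both candidate envelopes; the larger correlation wins.
recon = eeg[test] @ w
r_a = np.corrcoef(recon, env_a[test])[0, 1]
r_b = np.corrcoef(recon, env_b[test])[0, 1]
decoded_attended = "A" if r_a > r_b else "B"
print(decoded_attended)
```

Because the simulated EEG contains speaker A's envelope, the decoder's reconstruction correlates far more strongly with envelope A, and the decoder correctly identifies the attended talker.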

About this auditory neuroscience research news

Author: Press Office
Source: University of Rochester
Contact: Press Office – University of Rochester
Image: The image is in the public domain

Original Research: Closed access.
“Attention differentially affects acoustic and phonetic feature encoding in a multispeaker environment” by Emily S. Teoh et al. Journal of Neuroscience


Attention differentially affects acoustic and phonetic feature encoding in a multispeaker environment

Humans have the remarkable ability to selectively focus on a single talker in the midst of other competing talkers. The neural mechanisms that underlie this phenomenon remain incompletely understood. In particular, there has been longstanding debate over whether attention operates at an early or late stage in the speech processing hierarchy.

One way to better understand this is to examine how attention might differentially affect neurophysiological indices of hierarchical acoustic and linguistic speech representations. In this study, we do this by using encoding models to identify neural correlates of speech processing at various levels of representation. Specifically, we recorded EEG from fourteen human subjects (nine female and five male) during a “cocktail party” attention experiment. Model comparisons based on these data revealed phonetic feature processing for attended, but not unattended speech.
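An encoding-model comparison of the kind described above can be sketched as follows: fit one forward model that predicts EEG from acoustic features alone, fit a second that adds phonetic features, and ask whether the richer model predicts held-out EEG better. The example below uses simulated data and ordinary least squares; the feature sets, weights, and noise level are illustrative assumptions, not the study's actual regressors or fitting procedure (which typically involves time-lagged, regularized models).

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: one EEG channel driven by an acoustic envelope plus a
# phonetic-feature regressor (all signals and weights are hypothetical).
n = 3000
envelope = rng.standard_normal(n)
phonetic = rng.standard_normal(n)          # e.g., a phoneme-category feature
eeg = 1.0 * envelope + 0.6 * phonetic + rng.standard_normal(n)

train, test = slice(0, 2000), slice(2000, 3000)

def fit_predict(X):
    """Least-squares encoding model: stimulus features -> EEG.
    Returns the correlation between predicted and actual held-out EEG."""
    w, *_ = np.linalg.lstsq(X[train], eeg[train], rcond=None)
    pred = X[test] @ w
    return np.corrcoef(pred, eeg[test])[0, 1]

X_acoustic = envelope[:, None]                       # acoustics only
X_combined = np.column_stack([envelope, phonetic])   # acoustics + phonetics

r_acoustic = fit_predict(X_acoustic)
r_combined = fit_predict(X_combined)
# If phonetic features are encoded in the EEG, the combined model
# predicts held-out responses better than the acoustic-only model.
print(r_combined > r_acoustic)
```

In the study's logic, finding this kind of improvement for the attended story but not the unattended one is the evidence that phonetic features are processed only for attended speech.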

Furthermore, we show that attention specifically enhances isolated indices of phonetic feature processing, but that such attention effects are not apparent for isolated measures of acoustic processing. These results provide new insights into the effects of attention on different pre-lexical representations of speech, insights that complement recent anatomical accounts of the hierarchical encoding of attended speech.

Furthermore, our findings support the notion that – for attended speech – phonetic features are processed at a distinct stage, separate from the processing of the speech acoustics.


Humans are very good at paying attention to one speaker in an environment with multiple speakers. However, the details of how the brain processes attended and unattended speech differently are not completely clear.

Here we explore how attention affects the processing of the acoustic sounds of speech as well as the mapping of those sounds onto categorical phonetic features.

We find evidence of categorical phonetic feature processing for attended, but not unattended, speech. Furthermore, we find evidence that attention enhances categorical phonetic feature processing but not acoustic processing.

These findings add an important new layer to our understanding of how the human brain solves the cocktail party problem.
