Summary: The brain processes speech by using a buffer, maintaining a “time stamp” of the past three speech sounds. Findings also reveal the brain processes multiple sounds at the same time without mixing up the identity of each sound by passing information between neurons in the auditory cortex.
Source: NYU
Our brains “time-stamp” the order of incoming sounds, allowing us to correctly process the words that we hear, shows a new study by a team of psychology and linguistics researchers.
Its findings, which appear in the journal Nature Communications, offer new insights into the intricacies of neurological function.
“To understand speech, your brain needs to accurately interpret both the speech sounds’ identity and the order that they were uttered to correctly recognize the words being said,” explains Laura Gwilliams, the paper’s lead author, an NYU doctoral student at the time of the research and now a postdoctoral fellow at the University of California, San Francisco.
“We show how the brain achieves this feat: Different sounds are responded to with different neural populations. And, each sound is time-stamped with how much time has gone by since it entered the ear. This allows the listener to know both the order and the identity of the sounds that someone is saying to correctly figure out what words the person is saying.”
While the brain’s role in processing individual sounds has been well researched, there is much we don’t know about how we manage the fast auditory sequences that constitute speech. Additional understanding of the brain’s dynamics can potentially lead to addressing neurological afflictions that diminish our ability to understand the spoken word.
In the Nature Communications study, the scientists aimed to understand how the brain processes the identity and order of speech sounds, given how quickly they unfold. Both pieces of information matter: the brain must accurately interpret the speech sounds’ identity (e.g., l-e-m-o-n) and the order in which they were uttered (e.g., 1-2-3-4-5) to correctly recognize the word being said (e.g., “lemon” and not “melon”).
To do so, they recorded the brain activity of 21 human subjects, all native English speakers, while these subjects listened to two hours of an audiobook. Specifically, the researchers related the subjects’ brain activity to the properties of the speech sounds that distinguish one sound from another (e.g., “m” vs. “n”).
The researchers found that the brain processes speech using a buffer, thereby maintaining a running representation—i.e., time-stamping—of the past three speech sounds.
The results also showed that the brain processes multiple sounds at the same time without mixing up the identity of each sound by passing information between neurons in the auditory cortex.
“We found that each speech sound initiates a cascade of neurons firing in different places in the auditory cortex,” explains Gwilliams, who will return to NYU’s Department of Psychology as an assistant professor in 2023.
“This means that the information about each individual sound in the phonetic word ‘k-a-t’ gets passed between different neural populations in a predictable way, which serves to time-stamp each sound with its relative order.”
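As a loose computational analogy only, and not the study's analysis or a model of neurons, the idea of a three-sound buffer in which each sound carries its own elapsed time can be sketched in a few lines of Python. The class and function names, and the 80-millisecond spacing between sounds, are assumptions made up for illustration.

```python
from collections import deque


class SoundBuffer:
    """Toy analogy: keeps the most recent sounds, each with its onset time."""

    def __init__(self, capacity=3):
        # A deque with maxlen drops the oldest entry automatically.
        self.traces = deque(maxlen=capacity)

    def hear(self, sound, now):
        """Store a sound together with the time at which it arrived."""
        self.traces.append((sound, now))

    def snapshot(self, now):
        """Return an unordered set of (sound, elapsed time) pairs."""
        return {(sound, now - onset) for sound, onset in self.traces}


def heard_order(sounds, step=0.08):
    """Recover order from elapsed time alone: longest elapsed = heard first.

    `step` is a hypothetical, fixed gap between sounds, in seconds.
    """
    buf = SoundBuffer()
    for i, sound in enumerate(sounds):
        buf.hear(sound, now=i * step)
    pairs = buf.snapshot(now=len(sounds) * step)
    ranked = sorted(pairs, key=lambda pair: pair[1], reverse=True)
    return [sound for sound, _ in ranked]


print(heard_order("lem"))  # same three sounds as "mel" ...
print(heard_order("mel"))  # ... told apart only by their elapsed times
```

In this sketch, `heard_order("lem")` yields l, e, m and `heard_order("mel")` yields m, e, l, even though the snapshot itself is unordered: the elapsed time attached to each sound is enough to recover the sequence.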
The study’s other authors were Jean-Rémi King of École normale supérieure in Paris; Alec Marantz, a professor in NYU’s Department of Linguistics and NYU Abu Dhabi Institute; and David Poeppel, a professor in NYU’s Department of Psychology and managing director of the Ernst Struengmann Institute for Neuroscience in Frankfurt, Germany.
About this auditory neuroscience research news
Author: Press Office
Contact: Press Office – NYU
Original Research: Open access.
“Neural dynamics of phoneme sequences reveal position-invariant code for content and order” by Laura Gwilliams et al. Nature Communications
Abstract
Speech consists of a continuously-varying acoustic signal. Yet human listeners experience it as sequences of discrete speech sounds, which are used to recognise discrete words.
To examine how the human brain appropriately sequences the speech signal, we recorded two-hour magnetoencephalograms from 21 participants listening to short narratives.
Our analyses show that the brain continuously encodes the three most recently heard speech sounds in parallel, and maintains this information long past its dissipation from the sensory input. Each speech sound representation evolves over time, jointly encoding both its phonetic features and the amount of time elapsed since onset.
As a result, this dynamic neural pattern encodes both the relative order and phonetic content of the speech sequence. These representations are active earlier when phonemes are more predictable, and are sustained longer when lexical identity is uncertain.
Our results show how phonetic sequences in natural speech are represented at the level of populations of neurons, providing insight into what intermediary representations exist between the sensory input and sub-lexical units.
The flexibility in the dynamics of these representations paves the way for further understanding of how such sequences may be used to interface with higher order structure such as lexical identity.