Summary: A new study reveals how the brain processes sound and how quickly neurons transition from processing the sound of speech to processing its language-based words.
Source: University of Maryland.
You’re walking along a busy city street. All around you are the sounds of subway trains, traffic, and music coming from storefronts. Suddenly, you realize one of the sounds you’re hearing is someone speaking, and that you are listening in a different way as you pay attention to what they are saying.
How does the brain do this? And how quickly does it happen? Researchers at the University of Maryland are learning more about the automatic process the brain goes through when it picks up on spoken language.
Neuroscientists have understood for some time that when we hear the sounds of a language we understand, our brains react differently than they do when we hear non-speech sounds or people talking in languages we do not know. When we hear someone talking in a familiar language, our brain quickly shifts to pay attention, process the speech sounds by turning them into words, and understand what is being said.
In a new paper published in the Cell Press/Elsevier journal Current Biology, “Rapid transformation from auditory to linguistic representations of continuous speech,” Maryland researchers were able to see where in the brain, and how quickly – in milliseconds – the brain’s neurons transition from processing the sound of speech to processing the language-based words of the speech.
The paper was written by Institute for Systems Research (ISR) Postdoctoral Researcher Christian Brodbeck, L. Elliot Hong of the University of Maryland School of Medicine, and Professor Jonathan Z. Simon, who has a triple appointment in the Departments of Biology and Electrical and Computer Engineering as well as ISR.
“When we listen to someone talking, the change in our brain’s processing from not caring what kind of sound it is to recognizing it as a word happens surprisingly early,” said Simon. “In fact, this happens pretty much as soon as the linguistic information becomes available.”
When it is engaging in speech perception, the brain’s auditory cortex analyzes complex acoustic patterns to detect words that carry a linguistic message. It seems to do this so efficiently, at least in part, by anticipating what it is likely to hear: by learning what sounds signal language most frequently, the brain can predict what may come next. It is generally thought that this process, localized bilaterally in the brain’s superior temporal lobes, involves recognizing an intermediate, phonetic level of sound.
In the Maryland study, the researchers mapped and analyzed participants’ neural activity while the participants listened to a single talker telling a story. They used magnetoencephalography (MEG), a common non-invasive neuroimaging method that employs very sensitive magnetometers to record the naturally occurring magnetic fields produced by electrical currents inside the brain. The subject typically sits under or lies down inside the MEG scanner, which resembles a whole-head hair dryer but contains an array of magnetic sensors.
The study showed that the brain quickly recognizes the phonetic sounds that make up syllables and transitions from processing merely acoustic information to processing linguistic information in a highly specialized and automated way. The brain has to keep up with people speaking at a rate of about three words a second. It achieves this, in part, by distinguishing speech from other kinds of sound about a tenth of a second after the sound enters the ears.
“We usually think that what the brain processes this early must be only at the level of sound, without regard for language,” Simon notes. “But if the brain can take knowledge of language into account right away, it would actually process sound more accurately. In our study we see that the brain takes advantage of language processing at the very earliest stage it can.”
In another part of the study, the researchers found that people selectively process speech sounds in noisy environments.
Here, participants heard a mixture of two speakers in a “cocktail party” scenario, and were told to listen to one and ignore the other. The participants’ brains only consistently processed language for the conversation to which they were told to pay attention, not the one they were told to ignore. Their brains stopped processing unattended speech at the level of detecting word forms.
“This may reveal a ‘bottleneck’ in our brains’ speech perception,” Brodbeck says. “We think lexical perception works by our brain considering the match between the incoming speech signal and many different words at the same time. It could be that this mechanism involves mental resources that have limitations on how many different options can be tried simultaneously, making it impossible to attend to more than one speaker at the same time.”
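To make the idea of weighing many candidate words at once more concrete, here is a minimal Python sketch. It is not the researchers' model or analysis code. It is a toy illustration under loud assumptions: letters stand in for speech sounds, and the tiny lexicon and its frequency counts are invented for the example. As each sound arrives, the set of words still consistent with the input shrinks, and the uncertainty about which word is being said (measured here as Shannon entropy) falls.

```python
import math

# Hypothetical toy lexicon: word -> relative frequency (invented numbers).
LEXICON = {
    "cat": 40,
    "cap": 20,
    "can": 30,
    "dog": 10,
}


def cohort(prefix, lexicon=LEXICON):
    """Return the words still consistent with the sounds heard so far."""
    return {w: f for w, f in lexicon.items() if w.startswith(prefix)}


def entropy(candidates):
    """Shannon entropy in bits of the frequency-weighted candidate set."""
    total = sum(candidates.values())
    return -sum((f / total) * math.log2(f / total) for f in candidates.values())


def track(word, lexicon=LEXICON):
    """Yield (prefix, candidate words, entropy) after each incoming sound."""
    for i in range(1, len(word) + 1):
        prefix = word[:i]
        candidates = cohort(prefix, lexicon)
        yield prefix, sorted(candidates), entropy(candidates)


if __name__ == "__main__":
    # Letters are used as a stand-in for phonemes (an assumption of this sketch).
    for prefix, words, bits in track("cat"):
        print(prefix, words, round(bits, 2))
```

In this sketch every candidate is re-evaluated on every new sound, so the work grows with the number of options in play. That is one way to picture why a limit on how many options can be tried simultaneously could restrict word-level processing to a single talker.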
This study lays the foundation for additional research into how our brains interpret sounds as words. For example, how and when does the brain decide which word is being said? There is evidence that the brain actually sifts through possibilities, but it is currently unknown how the brain successfully narrows down the choices to a single word and connects it with the meaning of the ongoing discourse. Also, since it is possible to measure what fraction of the speech sounds are clear enough to be processed as being components of words, the researchers may be able to test listening comprehension when subjects can’t, or don’t understand how to, report it properly.
Funding: The NIH funded this study.
Source: Rebecca Copeland – University of Maryland
Publisher: Organized by NeuroscienceNews.com.
Original Research: Open access research for “Rapid Transformation from Auditory to Linguistic Representations of Continuous Speech” by Christian Brodbeck, L. Elliot Hong, and Jonathan Z. Simon in Current Biology. Published November 29, 2018.
Rapid Transformation from Auditory to Linguistic Representations of Continuous Speech
During speech perception, a central task of the auditory cortex is to analyze complex acoustic patterns to allow detection of the words that encode a linguistic message. It is generally thought that this process includes at least one intermediate, phonetic, level of representations, localized bilaterally in the superior temporal lobe. Phonetic representations reflect a transition from acoustic to linguistic information, classifying acoustic patterns into linguistically meaningful units, which can serve as input to mechanisms that access abstract word representations. While recent research has identified neural signals arising from successful recognition of individual words in continuous speech, no explicit neurophysiological signal has been found demonstrating the transition from acoustic and/or phonetic to symbolic, lexical representations. Here, we report a response reflecting the incremental integration of phonetic information for word identification, dominantly localized to the left temporal lobe. The short response latency, approximately 114 ms relative to phoneme onset, suggests that phonetic information is used for lexical processing as soon as it becomes available. Responses also tracked word boundaries, confirming previous reports of immediate lexical segmentation. These new results were further investigated using a cocktail-party paradigm in which participants listened to a mix of two talkers, attending to one and ignoring the other. Analysis indicates neural lexical processing of only the attended, but not the unattended, speech stream. Thus, while responses to acoustic features reflect attention through selective amplification of attended speech, responses consistent with a lexical processing model reveal categorically selective processing.