An illustration of two heads and sound waves. Credit: Neuroscience News

How the Brain Processes Speech in Real Time

Summary: Researchers have developed a computational framework that maps how the brain processes speech during real-world conversations. Using electrocorticography (ECoG) and AI speech models, the study analyzed over 100 hours of brain activity, revealing how different regions handle sounds, speech patterns, and word meanings.

The findings show that the brain processes language in a sequence—moving from thought to speech before speaking and working backward to interpret spoken words. The framework accurately predicted brain activity even for new conversations, surpassing previous models.

These insights could improve speech recognition technology and aid individuals with communication disorders. The study provides a deeper understanding of how the brain effortlessly engages in conversation.

Key Facts:

  • Layered Processing: The brain handles speech at three levels—sounds, speech patterns, and word meanings.
  • Sequential Processing: Before speaking, the brain moves from words to sounds; after hearing, it works backward to decipher meaning.
  • Real-World Insights: The AI model accurately predicted brain activity during natural conversations.

Source: Hebrew University of Jerusalem

A new study led by Dr. Ariel Goldstein of the Department of Cognitive and Brain Sciences and the Business School at the Hebrew University of Jerusalem, conducted with Google Research and in collaboration with the Hasson Lab at the Neuroscience Institute at Princeton University and with Dr. Flinker and Dr. Devinsky of the NYU Langone Comprehensive Epilepsy Center, has developed a unified computational framework to explore the neural basis of human conversations.

This research bridges acoustic, speech, and word-level linguistic structures, offering unprecedented insights into how the brain processes everyday speech in real-world settings.

The study, published in Nature Human Behaviour, recorded brain activity during more than 100 hours of natural, open-ended conversations using a technique called electrocorticography (ECoG).

To analyze this data, the team used a speech-to-text model called Whisper, which represents language at three levels: low-level sounds, mid-level speech patterns, and the contextual meaning of words. These layered representations were then mapped onto brain activity using encoding models.
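As a rough illustration of this step, here is a minimal sketch of how multi-level embeddings can be extracted from a pretrained Whisper model with the Hugging Face transformers library. The model size, layer choices, and function names are illustrative assumptions, not the study's exact configuration.

```python
# Sketch: pulling acoustic, speech, and language representations
# from a pretrained Whisper model (layer choices are illustrative;
# the study's exact configuration may differ).
import torch
from transformers import WhisperProcessor, WhisperModel

processor = WhisperProcessor.from_pretrained("openai/whisper-base")
model = WhisperModel.from_pretrained("openai/whisper-base")
model.eval()

def extract_embeddings(audio, transcript, sampling_rate=16000):
    """Return three levels of representation for one audio segment."""
    inputs = processor(audio, sampling_rate=sampling_rate, return_tensors="pt")
    token_ids = processor.tokenizer(transcript, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(
            input_features=inputs.input_features,
            decoder_input_ids=token_ids,
            output_hidden_states=True,
        )
    acoustic = inputs.input_features        # low-level: log-mel spectrogram
    speech = out.encoder_last_hidden_state  # mid-level: encoder speech embeddings
    language = out.last_hidden_state        # word-level: decoder contextual embeddings
    return acoustic, speech, language
```

In the study, representations like these were compared with electrode activity around the time each word was spoken or heard.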

The results showed that the framework predicted brain activity with high accuracy. Even when applied to conversations held out from the training data, the model correctly matched different parts of the brain to specific language functions.

For example, regions involved in hearing and speaking aligned with sound and speech patterns, while areas involved in higher-level understanding aligned with the meanings of words.
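In the paper's terms, these are linear encoding models that map each embedding onto the activity of each electrode and are then scored on held-out conversations. Below is a minimal sketch of such a model using ridge regression; the data shapes, variable names, and regularization grid are assumptions for illustration, not the authors' exact pipeline.

```python
# Sketch of a linear encoding model: ridge regression from word-level
# embeddings to per-electrode neural activity, scored on held-out
# conversations (shapes and names here are hypothetical).
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import RidgeCV

def fit_encoding_model(X_train, Y_train, X_test, Y_test):
    """X: (n_words, embedding_dim) embeddings;
    Y: (n_words, n_electrodes) neural responses.
    Returns the prediction correlation for each electrode."""
    scores = []
    for e in range(Y_train.shape[1]):
        model = RidgeCV(alphas=np.logspace(-2, 6, 9))  # cross-validated ridge
        model.fit(X_train, Y_train[:, e])
        r, _ = pearsonr(model.predict(X_test), Y_test[:, e])
        scores.append(r)
    return np.array(scores)
```

Running this separately for acoustic, speech, and language embeddings and comparing the per-electrode scores is one way the regional alignment described above can be quantified.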

The study also found that the brain processes language in a sequence. Before we speak, our brain moves from thinking about words to forming sounds, while after we listen, it works backward to make sense of what was said.
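One generic way to probe such a sequence (a sketch of a standard lag analysis, not necessarily the authors' exact procedure) is to re-fit the encoding model at a series of time lags around each word's onset and look at where prediction accuracy peaks: before onset during production, after onset during comprehension.

```python
# Sketch of a lag analysis: average the neural signal in a short
# window shifted by each lag relative to word onset, producing one
# regression target per (word, lag). Details are illustrative.
import numpy as np

def lagged_responses(signal, onsets, lags_ms, sfreq, win_ms=200):
    """signal: (n_samples,) one electrode's trace;
    onsets: word-onset sample indices; lags_ms: lags to test."""
    half = int(win_ms / 2 / 1000 * sfreq)
    out = np.zeros((len(onsets), len(lags_ms)))
    for j, lag in enumerate(lags_ms):
        shift = int(lag / 1000 * sfreq)
        for i, onset in enumerate(onsets):
            center = onset + shift
            lo, hi = max(center - half, 0), min(center + half, len(signal))
            out[i, j] = signal[lo:hi].mean() if hi > lo else 0.0
    return out

# Fitting fit_encoding_model (sketched above) on out[:, j] for each
# lag j traces prediction accuracy over time around word onset.
```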

The framework used in this study was more effective than older methods at capturing these complex processes.

“Our findings help us understand how the brain processes conversations in real-life settings,” said Dr. Goldstein.

“By connecting different layers of language, we’re uncovering the mechanics behind something we all do naturally—talking and understanding each other.”

This research has potential practical applications, from improving speech recognition technology to developing better tools for people with communication challenges. It also offers new insights into how the brain makes conversation feel so effortless, whether it’s chatting with a friend or engaging in a debate.

The study marks an important step toward building more advanced tools to study how the brain handles language in real-world situations.

About this speech processing and neuroscience research news

Author: Yarden Mills
Source: Hebrew University of Jerusalem
Contact: Yarden Mills – Hebrew University of Jerusalem
Image: The image is credited to Neuroscience News

Original Research: Open access.
“A unified acoustic-to-speech-to-language embedding space captures the neural basis of natural language processing in everyday conversations” by Ariel Goldstein et al. Nature Human Behaviour


Abstract

A unified acoustic-to-speech-to-language embedding space captures the neural basis of natural language processing in everyday conversations

This study introduces a unified computational framework connecting acoustic, speech and word-level linguistic structures to study the neural basis of everyday conversations in the human brain.

We used electrocorticography to record neural signals across 100 h of speech production and comprehension as participants engaged in open-ended real-life conversations. We extracted low-level acoustic, mid-level speech and contextual word embeddings from a multimodal speech-to-text model (Whisper).

We developed encoding models that linearly map these embeddings onto brain activity during speech production and comprehension.

Remarkably, this model accurately predicts neural activity at each level of the language processing hierarchy across hours of new conversations not used in training the model.

The internal processing hierarchy in the model is aligned with the cortical hierarchy for speech and language processing, where sensory and motor regions better align with the model’s speech embeddings, and higher-level language areas better align with the model’s language embeddings.

The Whisper model captures the temporal sequence of language-to-speech encoding before word articulation (speech production) and speech-to-language encoding post articulation (speech comprehension). The embeddings learned by this model outperform symbolic models in capturing neural activity supporting natural speech and language.

These findings support a paradigm shift towards unified computational models that capture the entire processing hierarchy for speech comprehension and production in real-world conversations.
