AI Turns Brain Waves into Spoken Words

Summary: Researchers converted brain signals into audible speech, decoding individual spoken words with 92 to 100% accuracy. The team used brain implants and artificial intelligence to directly map brain activity to speech in patients with epilepsy.

This technology aims to give a voice back to people in a locked-in state, who are paralyzed and cannot speak. The researchers believe that the success of this project marks a significant advance in the realm of Brain-Computer Interfaces.

Key Facts:

Using a combination of brain implants and AI, the researchers were able to predict spoken words with an accuracy of 92-100%.
The team focused their experiments on non-paralyzed people with temporary brain implants, decoding what they said out loud based on their brain activity.
While the technology currently focuses on individual words, future goals include the ability to predict full sentences and paragraphs based on brain activity.

Source: Radboud University

Researchers from Radboud University and the UMC Utrecht have succeeded in transforming brain signals into audible speech.

By decoding signals from the brain through a combination of implants and AI, they were able to predict the words people said out loud with an accuracy of 92 to 100%.

Their findings are published in the Journal of Neural Engineering this month.

This shows a person's head. Researchers around the world are working on ways to recognize words and sentences in brain patterns. Credit: Neuroscience News

The research indicates a promising development in the field of Brain-Computer Interfaces, according to lead author Julia Berezutskaya, researcher at Radboud University’s Donders Institute for Brain, Cognition and Behaviour and UMC Utrecht. Berezutskaya and colleagues at the UMC Utrecht and Radboud University used brain implants in patients with epilepsy to infer what people were saying.

Bringing back voices

“Ultimately, we hope to make this technology available to patients in a locked-in state, who are paralyzed and unable to communicate,” says Berezutskaya.

“These people lose the ability to move their muscles, and thus to speak. By developing a brain-computer interface, we can analyse brain activity and give them a voice again.”

For the experiment in their new paper, the researchers asked non-paralyzed people with temporary brain implants to speak a number of words out loud while their brain activity was being measured.

Berezutskaya: “We were then able to establish direct mapping between brain activity on the one hand, and speech on the other hand. We also used advanced artificial intelligence models to translate that brain activity directly into audible speech.

“That means we weren’t just able to guess what people were saying, but we could immediately transform those words into intelligible, understandable sounds. In addition, the reconstructed speech even sounded like the original speaker in their tone of voice and manner of speaking.”

Researchers around the world are working on ways to recognize words and sentences in brain patterns.

The researchers were able to reconstruct intelligible speech with relatively small datasets, showing their models can uncover the complex mapping between brain activity and speech with limited data.

Crucially, they also conducted listening tests with volunteers to evaluate how identifiable the synthesized words were.

The positive results from those tests indicate the technology isn’t just succeeding at identifying words correctly, but also at getting those words across audibly and understandably, just like a real voice.

Limitations

“For now, there’s still a number of limitations,” warns Berezutskaya. “In these experiments, we asked participants to say twelve words out loud, and those were the words we tried to detect.

“In general, predicting individual words is less complicated than predicting entire sentences. In the future, large language models that are used in AI research can be beneficial.

“Our goal is to predict full sentences and paragraphs of what people are trying to say based on their brain activity alone. To get there, we’ll need more experiments, more advanced implants, larger datasets and advanced AI models.

“All these processes will still take a number of years, but it looks like we’re heading in the right direction.”
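To illustrate why a closed set of twelve words is a comparatively easy target, the sketch below decodes words from purely synthetic "brain activity" with a nearest-centroid classifier. It is a toy stand-in, not the researchers' method, which relied on optimized deep learning models and real electrocorticography recordings. The feature count, number of repetitions, and noise level are all hypothetical values chosen for illustration.

```python
import random

random.seed(0)  # fixed seed so the toy run is repeatable

N_WORDS = 12      # closed vocabulary size, as stated in the article
N_FEATURES = 20   # hypothetical number of neural features per trial
N_TRAIN = 10      # hypothetical training repetitions per word
N_TEST = 5        # hypothetical test repetitions per word
NOISE_SD = 0.5    # hypothetical trial-to-trial noise level


def noisy_trial(prototype):
    """One synthetic trial: the word's prototype pattern plus Gaussian noise."""
    return [x + random.gauss(0.0, NOISE_SD) for x in prototype]


def mean_vector(trials):
    """Element-wise mean of equally long feature vectors."""
    return [sum(col) / len(trials) for col in zip(*trials)]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Each word gets a random "true" activity pattern (purely synthetic).
prototypes = [[random.gauss(0.0, 1.0) for _ in range(N_FEATURES)]
              for _ in range(N_WORDS)]

# Training: estimate one centroid per word from noisy repetitions.
centroids = [mean_vector([noisy_trial(p) for _ in range(N_TRAIN)])
             for p in prototypes]


def decode(trial):
    """Return the index of the word whose centroid is closest to the trial."""
    return min(range(N_WORDS),
               key=lambda w: squared_distance(trial, centroids[w]))


# Testing: decode fresh trials and count correct predictions.
n_correct = 0
for word, proto in enumerate(prototypes):
    for _ in range(N_TEST):
        if decode(noisy_trial(proto)) == word:
            n_correct += 1

accuracy = n_correct / (N_WORDS * N_TEST)
print(f"accuracy: {accuracy:.0%} (chance: {1 / N_WORDS:.0%})")
```

With a fixed vocabulary the decoder only has to pick one of twelve candidates per trial. Full sentences remove that constraint, which is one reason the researchers expect to need larger datasets and more advanced models.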

About this AI and neurotech research news

Author: Thomas Haenen
Source: Radboud University
Contact: Thomas Haenen – Radboud University
Image: The image is credited to Neuroscience News

Original Research: Open access.
“Direct speech reconstruction from sensorimotor brain activity with optimized deep learning models” by Julia Berezutskaya et al. Journal of Neural Engineering

Abstract

Direct speech reconstruction from sensorimotor brain activity with optimized deep learning models

Development of brain-computer interface (BCI) technology is key for enabling communication in individuals who have lost the faculty of speech due to severe motor paralysis.

A BCI control strategy that is gaining attention employs speech decoding from neural data.

Recent studies have shown that a combination of direct neural recordings and advanced computational models can provide promising results.

Understanding which decoding strategies deliver best and directly applicable results is crucial for advancing the field.

In this paper, we optimized and validated a decoding approach based on speech reconstruction directly from high-density electrocorticography recordings from sensorimotor cortex during a speech production task.

We show that 1) dedicated machine learning optimization of reconstruction models is key for achieving the best reconstruction performance; 2) individual word decoding in reconstructed speech achieves 92-100% accuracy (chance level is 8%); 3) direct reconstruction from sensorimotor brain activity produces intelligible speech.

These results underline the need for model optimization in achieving best speech decoding results and highlight the potential that reconstruction-based speech decoding from sensorimotor cortex can offer for development of next-generation BCI technology for communication.
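The 8% chance level quoted in the abstract is consistent with the twelve-word vocabulary mentioned in the article, assuming a decoder that guesses uniformly at random among the twelve words. The short sketch below works through that arithmetic and, as a rough sanity check, computes how unlikely a high score would be under pure guessing. The trial counts in the second part are hypothetical and are not taken from the paper.

```python
from math import comb

N_WORDS = 12  # vocabulary size reported in the article


def chance_level(n_classes: int) -> float:
    """Accuracy expected from uniform random guessing among n_classes."""
    return 1.0 / n_classes


def p_at_least(k: int, n: int, p: float) -> float:
    """Binomial tail: probability of k or more correct out of n trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


chance = chance_level(N_WORDS)
print(f"chance level: {chance:.1%}")  # 1/12, which rounds to the quoted 8%

# Hypothetical numbers for illustration only (not from the paper):
# 120 test trials, of which 110 (about 92%) are decoded correctly.
n_trials = 120
n_correct = 110
print(f"P(>= {n_correct}/{n_trials} by chance): "
      f"{p_at_least(n_correct, n_trials, chance):.3g}")
```

Under these assumptions the probability of reaching 92% by guessing alone is vanishingly small, which is why the gap between the 8% chance level and the reported 92-100% matters.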