Summary: Imagine speaking in total silence and having a machine recreate your exact voice in real time. Researchers have developed a wearable “Multiaxial Strain Mapping Sensor” that reads microscopic movements in the neck muscles and skin to reconstruct speech.
This AI-powered technology can “hear” words without a single vibration of the vocal cords, offering a lifeline to those who have lost their voices to disease or surgery.
Key Points
- Noise-Immune Communication: Because the sensor reads skin movement rather than sound waves, it works reliably even in extremely loud environments, such as factories or construction sites, where traditional microphones fail.
- Restoring Identity: For patients who have undergone laryngeal surgery (removal of the voice box), this technology does not just provide a robotic output; it can synthesize their actual pre-surgery voice.
- Silent Communication: The technology enables “silent speech” in sensitive environments like libraries, theaters, or secret military operations, allowing for clear communication without making a sound.
- Daily Life Integration: The device is designed for the “real world,” maintaining high accuracy even when the wearer is moving or working in high-stress industrial settings.
Source: POSTECH
A newly developed technology can “hear” words even when they are spoken in silence: it uses light to read the subtle movements of the neck muscles and employs AI to reconstruct them into an actual voice.
A research team led by Professor Sung-Min Park (Department of IT Convergence Engineering, Mechanical Engineering, Electrical Engineering, and the Graduate School of Convergence) and Dr. Sunguk Hong (Department of Mechanical Engineering) at POSTECH (Pohang University of Science and Technology) conducted this study.
The findings were published in the online edition of Cyborg and Bionic Systems, a Science Partner Journal in the field of biomedical engineering.
The research began with tiny changes that occur around the neck when a person speaks. It is not just the vocal cords that create sound. Whenever we speak, the muscles and skin around the neck move together, drawing an invisible “movement map” on the skin. The research team focused on the fact that these microscopic movements contain information about what the person intends to say.
To capture this information, the research team developed a “Multiaxial Strain Mapping Sensor.” This sensor, which combines a miniature camera with small reference markers on a soft silicone material, can be conveniently worn on the neck and detects even the most minute skin movements.
The wearing position and tightness can be adjusted for the individual, and an algorithm automatically corrects errors that may occur when the device is reattached, allowing it to operate stably in daily environments.
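As a rough illustration of how camera-tracked micromarkers can be turned into a strain signal and rebaselined after the device is reattached, here is a minimal Python sketch. The function names, array shapes, and centroid-alignment correction are illustrative assumptions, not the team’s published CVOS pipeline.

```python
# Illustrative sketch only: per-marker displacement as a stand-in for the
# multiaxial strain map, plus a toy reattachment correction. None of this
# is the authors' published implementation.
import numpy as np

def strain_map(rest_pts: np.ndarray, cur_pts: np.ndarray) -> np.ndarray:
    """Displacement of each marker from its rest position.

    rest_pts, cur_pts: (N, 2) arrays of marker (x, y) image coordinates.
    """
    return cur_pts - rest_pts

def recalibrate_baseline(rest_pts: np.ndarray, session_pts: np.ndarray) -> np.ndarray:
    """Toy automated baseline calibration: after the sensor is reattached,
    remove the rigid shift between sessions by aligning marker centroids."""
    offset = session_pts.mean(axis=0) - rest_pts.mean(axis=0)
    return rest_pts + offset

rest = np.array([[10.0, 20.0], [30.0, 40.0]])  # marker positions at rest
now = np.array([[10.5, 19.2], [31.0, 40.6]])   # positions while "speaking"
print(strain_map(rest, now))                   # per-marker displacement
```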
The strain patterns collected by the sensor are analyzed by AI, which estimates the words or sentences the user intends to say and combines them with voice-synthesis technology trained on the individual’s vocal characteristics to reproduce the person’s actual voice. Even without producing sound, the system “reads” the speech and converts it into a voice.
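Conceptually, the pipeline takes strain-map frames in and produces text, which personalized speech synthesis then voices. A minimal sketch of the decoding stage follows, assuming a simple GRU classifier over flattened strain frames; the actual model, its inputs, and the TTS stage differ from this illustration.

```python
# Illustrative decoding stage: a sequence model maps strain-map frames to a
# text unit (here, a single letter). The architecture is an assumption for
# illustration, not the paper's model.
import torch
import torch.nn as nn

class StrainDecoder(nn.Module):
    def __init__(self, n_features: int, n_classes: int, hidden: int = 128):
        super().__init__()
        self.rnn = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, strain_seq: torch.Tensor) -> torch.Tensor:
        # strain_seq: (batch, time, n_features) flattened strain-map frames
        out, _ = self.rnn(strain_seq)
        return self.head(out[:, -1])  # logits over, e.g., alphabet classes

decoder = StrainDecoder(n_features=128, n_classes=26)
frames = torch.randn(1, 40, 128)          # 40 frames from a hypothetical map
letter_idx = decoder(frames).argmax(-1)   # decoded letter index
# A personalized text-to-speech model (not shown) would then voice the
# decoded text using the user's own vocal characteristics.
```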
Existing voice-restoration technologies relied on biological signals such as “EMG (electromyography)” or “EEG (electroencephalography),” but complex equipment and uncomfortable wearability limited their use in daily life. The research team solved this problem with a wearable sensor and confirmed through experiments that speech could be reconstructed with high accuracy even in noisy environments such as factories.
The scope of application is also broad. It is expected to be used in various fields, such as communication assistance for patients who have lost their voices due to vocal cord diseases or laryngeal surgery, communication technology for industrial sites without microphones or radios, and even “silent communication” in libraries or conference rooms.
Professor Sung-Min Park, who led the study, said, “We hope this technology will accelerate the day when patients with speech disorders can reclaim their voices,” adding, “It is a noteworthy technology because it has a wide range of potential applications, including assisting laryngectomized patients, communicating in noisy industrial environments, and even supporting silent conversations.”
Funding: This research was conducted with support from the Doctoral Course Research Grant Program and the Mid-career Researcher Program of the Ministry of Education, and the Bio & Medical Technology Development Program and the Pioneering Convergence Science and Technology Development Program of the Ministry of Science and ICT.
Key Questions Answered:
Q: Does the device read your mind?
A: No. The device only works when you are physically moving your neck muscles to form words (subvocalization). It reads intent through muscle action, not by reading your mind.
Q: How is this different from existing voice-restoration devices?
A: Traditional “electrolarynx” devices produce a very robotic, buzzing sound and require the user to hold a device to their throat. This new sensor is wearable, hands-free, and creates a natural-sounding voice that sounds like the user’s own.
Q: Can it be used for silent communication?
A: Absolutely. One of the highlighted use cases is “silent communication” for libraries or noisy industrial sites where you need to relay complex instructions without a microphone or without disturbing others.
Editorial Notes:
- This article was edited by a Neuroscience News editor.
- Journal paper reviewed in full.
- Additional context added by our staff.
About this AI and neurotech research news
Author: Yung-Eui Kang
Source: POSTECH
Contact: Yung-Eui Kang – POSTECH
Image: The image is credited to Neuroscience News
Original Research: Open access.
“Soft Multiaxial Strain Mapping Interface with AI-Driven Decoding for Silent Speech in Noise” by Sunguk Hong, Junyoung Yoo, and Sung-Min Park. Cyborg and Bionic Systems
DOI: 10.34133/cbsystems.0536
Abstract
Soft Multiaxial Strain Mapping Interface with AI-Driven Decoding for Silent Speech in Noise
Silent speech interfaces (SSIs) offer a viable alternative to traditional microphones in capturing clear audio in noisy environments. We propose a reconceptualized SSI that reproduces voice by monitoring continuous multiaxial strain maps induced by throat muscle movements.
The system integrates a computer vision-based optical strain (CVOS) sensor with deep learning-based voice reconstruction, enabling clear alphabetic communication under extreme noise conditions.
The CVOS sensor, comprising a soft silicone substrate with micromarkers and a tiny camera, achieves high-sensitivity marker detection and captures complex strain patterns with higher scalability and reliability compared to conventional wearable sensors.
The inference pipeline of the CVOS-based SSI incorporates physics-based automated baseline calibration and content-adaptive temporal attention, enabling robust analysis of the captured strain patterns.
Based on the inference results, a personalized text-to-speech model subsequently reconstructs the speaker’s voice. These algorithmic features ensure robustness under dynamic conditions by employing real-time adaptive signal processing that compensates for inter- and intrasubject anatomical variability.
Alphabet-based communication is achieved through the synergy between optimized algorithms and interface design.
The performance of the CVOS-based SSI was validated in real-world noisy scenarios, confirming its practical applicability.
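For readers curious about the abstract’s “content-adaptive temporal attention,” one way to picture it is an attention-pooling layer that learns which strain frames matter most for a given utterance. The sketch below is a generic attention pool written under that assumption; the paper’s actual mechanism may differ.

```python
# Generic temporal attention pooling: learns per-frame weights so that
# informative strain frames dominate the pooled representation. This is an
# illustrative stand-in, not the paper's exact design.
import torch
import torch.nn as nn

class TemporalAttentionPool(nn.Module):
    def __init__(self, n_features: int):
        super().__init__()
        self.score = nn.Linear(n_features, 1)

    def forward(self, seq: torch.Tensor) -> torch.Tensor:
        # seq: (batch, time, n_features)
        weights = torch.softmax(self.score(seq), dim=1)  # (batch, time, 1)
        return (weights * seq).sum(dim=1)                # (batch, n_features)
```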

