New Study Redefines Human Sound Localization

Summary: Researchers challenged a longstanding theory from the 1940s about how humans locate sounds, a discovery with profound implications for hearing technology. The study reveals that humans utilize a simpler, more energy-efficient neural network for spatial hearing, similar to other mammals like rhesus monkeys and gerbils.

This network not only locates sounds but also separates speech from background noise, addressing the ‘cocktail party problem’ faced by hearing devices and smartphones. These insights could revolutionize the design of hearing aids and digital assistants, making them more adaptable and efficient.

Key Facts:

  1. Debunking Old Theories: The study disproves the old assumption that a complex, dedicated neural network is necessary for sound localization in humans.
  2. Simpler Neural Networks: Findings show humans use sparse, energy-efficient neural circuitry for spatial hearing, akin to that found in other mammals.
  3. Implications for Technology: The research indicates that hearing aids and smart devices could be improved not by mimicking complex human language processes but by adopting simpler, more direct sound processing techniques.

Source: Macquarie University

Macquarie University researchers have debunked a 75-year-old theory about how humans determine where sounds are coming from, and it could unlock the secret to creating the next generation of more adaptable and efficient hearing devices, ranging from hearing aids to smartphones.

In the 1940s, an engineering model was developed to explain how humans can locate a sound source based on differences of just a few tens of millionths of a second in when the sound reaches each ear.

Figure: A head and sound waves. All types of machine hearing struggle with the challenge of hearing in noise, known as the ‘cocktail party problem’. Credit: Neuroscience News

This model worked on the theory that we must have a set of specialised detectors whose only function was to determine where a sound was coming from, with location in space represented by a dedicated neuron.
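In aggregate, such an array of delay-tuned coincidence detectors behaves much like a cross-correlator: each detector responds maximally when its internal delay cancels the interaural time difference (ITD). The sketch below illustrates that idea only; the function, stimulus, and sign convention are invented for this example and are not the study's method.

```python
import numpy as np

def estimate_itd(left, right, fs):
    """Estimate the interaural time difference (in seconds) as the
    cross-correlation lag that best aligns the two ear signals --
    the aggregate behaviour of an array of delay-tuned detectors."""
    corr = np.correlate(left, right, mode="full")
    lag = int(np.argmax(corr)) - (len(right) - 1)  # lag in samples
    return lag / fs

# ITDs are tens of microseconds, so use a high sample rate.
fs = 96_000
rng = np.random.default_rng(0)
source = rng.standard_normal(960)  # 10 ms noise burst

delay = 48                   # 48 samples = 500 microseconds
left = source[:-delay]       # sound reaches the right ear first...
right = source[delay:]       # ...and the left ear half a millisecond later

print(estimate_itd(left, right, fs))  # recovers ~0.0005 s
```

Under this (assumed) convention, a positive lag means the left ear's signal arrives later, placing the source on the listener's right.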

Its assumptions have been guiding and influencing research – and the design of audio technologies – ever since.

But a new research paper published in Current Biology by Macquarie University Hearing researchers has finally revealed that the idea of a neural network dedicated to spatial hearing does not hold.

Lead author David McAlpine, Macquarie University Distinguished Professor of Hearing, has spent the past 25 years showing that one animal after another actually uses a much sparser neural network, with neurons on both sides of the brain performing this function in addition to others.

Showing this in action in humans was more difficult.

Now through the combination of a specialised hearing test, advanced brain imaging, and comparisons with the brains of other mammals including rhesus monkeys, he and his team have shown for the first time that humans also use these simpler networks.

“We like to think that our brains must be far more advanced than other animals in every way, but that is just hubris,” Professor McAlpine says.

“We’ve been able to show that gerbils are like guinea pigs, guinea pigs are like rhesus monkeys, and rhesus monkeys are like humans in this regard.

“A sparse, energy efficient form of neural circuitry performs this function – our gerbil brain, if you like.”

The research team also proved that the same neural network separates speech from background sounds – a finding that is significant for the design of both hearing devices and the electronic assistants in our phones.

All types of machine hearing struggle with the challenge of hearing in noise, known as the ‘cocktail party problem’. This makes it difficult for people with hearing devices to pick out one voice in a crowded space, and for our smart devices to understand when we talk to them.

Professor McAlpine says his team’s latest findings suggest that rather than focusing on the large language models (LLMs) that are currently used, we should be taking a far simpler approach.

“LLMs are brilliant at predicting the next word in a sentence, but they’re trying to do too much,” he says.

“Being able to locate the source of a sound is the important thing here, and to do that, we don’t need a ‘deep mind’ language brain. Other animals can do it, and they don’t have language.

“When we are listening, our brains don’t keep tracking sound the whole time, which the large language processors are trying to do.

“Instead, we, and other animals, use our ‘shallow brain’ to pick out very small snippets of sound, including speech, and use these snippets to tag the location and maybe even the identity of the source.

“We don’t have to reconstruct a high-fidelity signal to do this, but instead understand how our brain represents that signal neurally, well before it reaches a language centre in the cortex.

“This shows us that a machine doesn’t have to be trained for language like a human brain to be able to listen effectively.

“We only need that gerbil brain.”

The next step for the team is to identify the minimum amount of information a sound needs to convey while still supporting the maximum amount of spatial listening.

About this auditory neuroscience research news

Author: Georgia Gowing
Source: Macquarie University
Contact: Georgia Gowing – Macquarie University
Image: The image is credited to Neuroscience News

Original Research: Open access.
“The neural representation of an auditory spatial cue in the primate cortex” by Jaime Undurraga et al. Current Biology

The neural representation of an auditory spatial cue in the primate cortex
  • Neural representation of ITD in human cortex is sparse and sound-frequency dependent
  • Cortical activation to ITD is correlated with spatial perceptual qualities
  • Scaled for head size, humans and macaques show a common cortical representation of ITD
  • Our findings suggest a common representation of ITDs across mammalian cortex


Humans make use of small differences in the timing of sounds at the two ears—interaural time differences (ITDs)—to locate their sources.

Despite extensive investigation, however, the neural representation of ITDs in the human brain is contentious, particularly the range of ITDs explicitly represented by dedicated neural detectors.

Here, using magneto- and electro-encephalography (MEG and EEG), we demonstrate evidence of a sparse neural representation of ITDs in the human cortex.

The magnitude of cortical activity to sounds presented via insert earphones oscillated as a function of increasing ITD—within and beyond auditory cortical regions—and listeners rated the perceptual quality of these sounds according to the same oscillating pattern.

This pattern was accurately described by a population of model neurons with preferred ITDs constrained to the narrow, sound-frequency-dependent range evident in other mammalian species.

When scaled for head size, the distribution of ITD detectors in the human cortex is remarkably like that recorded in vivo from the cortex of rhesus monkeys, another large primate that uses ITDs for source localization.

The data solve a long-standing issue concerning the neural representation of ITDs in humans and suggest a representation that scales for head size and sound frequency in an optimal manner.
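The "sound-frequency-dependent range" above refers to what the binaural literature often calls the π-limit: preferred ITDs confined to within half a period of the stimulus frequency, beyond which interaural phase becomes ambiguous. A back-of-the-envelope sketch comparing that limit with the largest ITD a head can physically generate (all numbers are generic textbook values, not taken from the study):

```python
# Back-of-the-envelope: the physically available ITD range vs. the
# frequency-dependent ("pi-limit") range of preferred ITDs.
SPEED_OF_SOUND = 343.0   # m/s, air at room temperature
HEAD_WIDTH = 0.18        # m, a generic adult human head (assumption)

# Largest ITD the head can generate: the extra straight-line path
# across the head, divided by the speed of sound (~525 microseconds).
max_itd = HEAD_WIDTH / SPEED_OF_SOUND

def pi_limit(frequency_hz):
    """Half a period: beyond this ITD, interaural phase wraps around."""
    return 1.0 / (2.0 * frequency_hz)

for f in (250, 500, 1000, 1500):
    print(f"{f} Hz: pi-limit {pi_limit(f) * 1e6:.0f} us, "
          f"head max {max_itd * 1e6:.0f} us")
```

At low frequencies the π-limit far exceeds what the head can produce, so only a narrow range of detectors is ever needed; shrinking HEAD_WIDTH models a smaller animal, consistent with the abstract's point that the representation scales with head size and sound frequency.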
