Summary: Researchers report that introducing a slight delay between visual and audio signals, tailored to each individual, may improve speech comprehension.
Source: City University of London.
A new paper has shed light on sensory timing in the brain by showing that there is a lag between us hearing a person’s voice and seeing their lips move, and these delays vary depending on what task we’re doing.
The research from City, University of London – co-authored by academics from University of Sussex, Middlesex University and Birkbeck, University of London – also showed potential benefits of tailoring the time delay between audio and video signals to each individual, finding that speech comprehension improved in 50 per cent of participants by 20 words in every 100, on average. The paper is published in the journal Scientific Reports.
Dr Elliot Freeman, author of the paper and Senior Lecturer in Psychology at City, University of London, said: “Our study sheds new light on a controversy spanning over two centuries, over whether senses can really be out of sync in some individuals. Speech comprehension was at its best in our study when there was a slight auditory lag, which suggests that perception can actually be less-than-optimal when we converse or watch TV with synchronised sound and vision. We think that by tailoring auditory delays to each individual we could improve understanding and correct their sub-optimal perception.”
To investigate the effect, the team showed 36 participants audiovisual movies depicting the lower half of a face speaking words or syllables, with the audio degraded by background noise. Stimuli were presented with a range of audiovisual asynchronies, spanning nine equally spaced levels from 500ms auditory lead to 500ms auditory lag, including simultaneous. In two separate tasks, participants had to identify which phoneme (‘ba’ or ‘da’), or which word the speaker said.
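As a rough illustration of this design, the nine equally spaced asynchrony levels can be reconstructed with a short calculation. This is a sketch only: the function name and the sign convention (negative for auditory lead, positive for auditory lag) are assumptions made here, not details taken from the paper.

```python
def asynchrony_levels(max_offset_ms=500, n_levels=9):
    """Return equally spaced audio offsets in milliseconds.

    Assumed convention: negative = audio leads the video,
    positive = audio lags the video, 0 = simultaneous.
    """
    step = 2 * max_offset_ms / (n_levels - 1)  # 1000 ms range over 8 intervals
    return [-max_offset_ms + i * step for i in range(n_levels)]


levels = asynchrony_levels()
print(levels)
# [-500.0, -375.0, -250.0, -125.0, 0.0, 125.0, 250.0, 375.0, 500.0]
```

With nine levels spanning 1,000ms, the spacing works out to 125ms, and the middle level is the simultaneous condition.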
The team found that performance peaked at an average auditory lag of 91ms for the phoneme identification task (based on the famous McGurk effect) and 113ms for word identification, but this varied widely between individuals.
Interestingly, each participant's measurements were similar across repetitions of the same task, but not across different tasks. This suggests that each of us may have not one general delay between seeing and hearing, but different delays for different tasks.
The authors speculate that these task-specific delays arise because neural signals from eyes and ears are integrated in different brain areas for different tasks, and that they must each travel via different routes to arrive there. Individual brains may differ in the length of these connections.
Dr Freeman said: “What we found is that sight and sound are in fact out of sync, and that each time delay is unique and stable for each individual. This can have a real impact on the interpretation of audiovisual speech.
“Poor lip-sync, as often experienced on cable TV and video phone calls, makes it harder to understand what people are saying. Our research could help us better understand why our natural lip-syncing may not always be perfect, and how this may affect everyday communication. We might be able to improve communication by artificially correcting sensory delays between sight and sound. Our work might be translated into novel diagnostics and therapeutic applications benefiting individuals with dyslexia, autism spectrum or hearing impairment.”
Readers are invited to take the test for themselves.
Source: Marcia Goodrich – City University of London
Image Source: NeuroscienceNews.com image is adapted from the City University of London news release.
Original Research: Full open access research for “Sight and sound persistently out of synch: stable individual differences in audiovisual synchronisation revealed by implicit measures of lip-voice integration” by Alberta Ipser, Vlera Agolli, Anisa Bajraktari, Fatimah Al-Alawi, Nurfitriani Djaafara & Elliot D. Freeman in Scientific Reports. Published online April 21 2017 doi:10.1038/srep46413
Sight and sound persistently out of synch: stable individual differences in audiovisual synchronisation revealed by implicit measures of lip-voice integration
Are sight and sound out of synch? Signs that they are have been dismissed for over two centuries as an artefact of attentional and response bias, to which traditional subjective methods are prone. To avoid such biases, we measured performance on objective tasks that depend implicitly on achieving good lip-synch. We measured the McGurk effect (in which incongruent lip-voice pairs evoke illusory phonemes), and also identification of degraded speech, while manipulating audiovisual asynchrony. Peak performance was found at an average auditory lag of ~100 ms, but this varied widely between individuals. Participants’ individual optimal asynchronies showed trait-like stability when the same task was re-tested one week later, but measures based on different tasks did not correlate. This discounts the possible influence of common biasing factors, suggesting instead that our different tasks probe different brain networks, each subject to their own intrinsic auditory and visual processing latencies. Our findings call for renewed interest in the biological causes and cognitive consequences of individual sensory asynchronies, leading potentially to fresh insights into the neural representation of sensory timing. A concrete implication is that speech comprehension might be enhanced, by first measuring each individual’s optimal asynchrony and then applying a compensatory auditory delay.
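The two-step idea in the final sentence (measure an individual's optimal asynchrony, then compensate) might be sketched as follows. This is a minimal illustration, not the authors' analysis: the peak is estimated by fitting a parabola through the best-scoring level and its two neighbours, which is a stand-in technique chosen here, and the function names, sign convention (positive = auditory lag) and accuracy values are all hypothetical.

```python
def optimal_asynchrony(levels_ms, accuracy):
    """Estimate the asynchrony (ms) at which accuracy peaks.

    Picks the best-scoring level, then refines it by parabolic
    interpolation through that point and its two neighbours.
    Assumes the levels are equally spaced.
    """
    i = max(range(len(accuracy)), key=accuracy.__getitem__)
    if i == 0 or i == len(accuracy) - 1:
        return float(levels_ms[i])  # peak at an edge: nothing to interpolate
    left, mid, right = accuracy[i - 1], accuracy[i], accuracy[i + 1]
    denom = left - 2 * mid + right
    if denom == 0:
        return float(levels_ms[i])  # flat neighbourhood: keep the raw peak
    step = levels_ms[i + 1] - levels_ms[i]
    return levels_ms[i] + 0.5 * step * (left - right) / denom


def compensatory_delays(optimal_ms):
    """Return (audio_delay_ms, video_delay_ms) that shift synchronised
    material to the listener's estimated optimum. A stream can only be
    delayed, so an optimal auditory lead is achieved by delaying the video."""
    if optimal_ms >= 0:
        return optimal_ms, 0.0
    return 0.0, -optimal_ms


# Hypothetical participant: made-up accuracy scores at each asynchrony level.
levels = [-500, -375, -250, -125, 0, 125, 250, 375, 500]
scores = [0.41, 0.48, 0.55, 0.63, 0.72, 0.78, 0.70, 0.58, 0.45]
best = optimal_asynchrony(levels, scores)
audio_delay, video_delay = compensatory_delays(best)
```

For these invented scores the estimate falls between the 0ms and 125ms levels, so the sketch would delay the audio and leave the video untouched.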