Prior multisensory learning can facilitate auditory-only voice-identity and speech recognition in noise

Research output: Contribution to journal, Research article, Contributed, peer-review

Abstract

Seeing the visual articulatory movements of a speaker while hearing their voice helps with understanding what is said. This multisensory enhancement is particularly evident in noisy listening conditions. Multisensory enhancement also occurs in auditory-only conditions: auditory-only speech and voice-identity recognition are superior for speakers previously learned with their face, compared to control learning, an effect termed the “face-benefit”. Whether the face-benefit can help maintain robust perception in increasingly noisy listening conditions, as concurrent multisensory input does, is unknown. Here, we examined this question in two behavioural experiments. In each experiment, participants learned a series of speakers’ voices together with either their dynamic face or a control image. Following learning, participants listened to auditory-only sentences spoken by the same speakers and recognised the content of the sentences (speech recognition, Experiment 1) or the voice-identity of the speaker (Experiment 2) in increasing levels of auditory noise. For speech recognition, 14/30 participants (47%) showed a face-benefit, while for voice-identity recognition, 19/25 participants (76%) showed a face-benefit. For those participants who demonstrated a face-benefit, the face-benefit increased with auditory noise levels. Taken together, the results support an audio-visual model of auditory communication and suggest that the brain can develop a flexible system in which learned facial characteristics are used to deal with varying auditory uncertainty.

Details

Original language: English
Journal: Quarterly Journal of Experimental Psychology
Publication status: E-pub ahead of print - 20 Aug 2024
Peer-reviewed: Yes

External IDs

ORCID /0000-0002-2531-4175/work/166324445
ORCID /0000-0001-7989-5860/work/166324980
unpaywall 10.1177/17470218241278649
Scopus 85204594463

Keywords

  • audio-visual, multisensory, person recognition, speech in noise, voice identity