Early auditory sensory processing of voices is facilitated by visual mechanisms

Publication: Contribution to journal › Research article › Contributed › Peer-reviewed

Contributors

  • Sonja Schall, Max-Planck-Institut für Kognitions- und Neurowissenschaften (Author)
  • Stefan J. Kiebel, Max-Planck-Institut für Kognitions- und Neurowissenschaften, Universitätsklinikum Jena (Author)
  • Burkhard Maess, Max-Planck-Institut für Kognitions- und Neurowissenschaften (Author)
  • Katharina von Kriegstein, Max-Planck-Institut für Kognitions- und Neurowissenschaften, Humboldt-Universität zu Berlin (Author)

Abstract

How do we recognize people who are familiar to us? There is overwhelming evidence that our brains process voice and face in a combined fashion to optimally recognize both who is speaking and what is said. Surprisingly, this combined processing of voice and face seems to occur even if one stream of information is missing. For example, if subjects only hear someone who is familiar to them talking, without seeing their face, visual face-processing areas are active. One reason for this crossmodal activation might be that it is instrumental for early sensory processing of voices, a hypothesis that is contrary to current models of unisensory perception. Here, we test this hypothesis by harnessing a temporally highly resolved method, i.e., magnetoencephalography (MEG), to identify the temporal response profile of the fusiform face area in response to auditory-only voice recognition. Participants briefly learned a set of voices audio-visually, i.e., together with a talking face. After learning, we measured subjects' MEG signals in response to the auditory-only, now familiar, voices. The results revealed three key mechanisms that characterize the sensory processing of familiar speakers' voices: (i) activation in the face-sensitive fusiform gyrus at very early auditory processing stages, i.e., only 100 ms after auditory onset, (ii) a temporal facilitation of auditory processing (M200), and (iii) a correlation of this temporal facilitation with recognition performance. These findings suggest that a neural representation of face information is evoked before the identity of the voice is even recognized and that the brain uses this visual representation to facilitate early sensory processing of auditory-only voices.

Details

Original language: English
Pages (from-to): 237-245
Number of pages: 9
Journal: NeuroImage
Volume: 77
Publication status: Published - 5 Aug. 2013
Peer-review status: Yes
Externally published: Yes

External IDs

PubMed 23563227
ORCID /0000-0001-7989-5860/work/142244397

Keywords

  • Audiovisual, FFA, Human, Multisensory, Speaker recognition
