
Seeing our own voice: an electrophysiological study of audiovisual speech integration during self perception


HAL Id: hal-01297672

https://hal.archives-ouvertes.fr/hal-01297672

Submitted on 5 Apr 2016

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.


Seeing our own voice: an electrophysiological study of audiovisual speech integration during self perception

Avril Treille, Coriandre Vilain, Sonia Kandel, Marc Sato

To cite this version:

Avril Treille, Coriandre Vilain, Sonia Kandel, Marc Sato. Seeing our own voice: an electrophysiological study of audiovisual speech integration during self perception. IMRF 2015 - 16th International Multisensory Research Forum, Jun 2015, Pisa, Italy. ⟨hal-01297672⟩


SEEING OUR OWN VOICE: AN ELECTROPHYSIOLOGICAL STUDY OF AUDIOVISUAL SPEECH INTEGRATION DURING SELF PERCEPTION

Avril Treille¹, Coriandre Vilain¹, Sonia Kandel¹ & Marc Sato²

1 GIPSA-lab, Département Parole & Cognition, CNRS & Grenoble Université, Grenoble, France

2 Laboratoire Parole & Langage, CNRS & Aix Marseille Université, France

Introduction

Recognizing one's own face and voice is key to self-awareness and to our ability to communicate effectively with others. Interestingly, recent studies suggest that better recognition of one's own actions may result from the integration of sensory inputs with our own sensory-motor knowledge. However, whether hearing our own voice and seeing our own articulatory gestures facilitate audiovisual speech integration is still debated.

 Participants

18 healthy adults, right-handed native French speakers.

 Stimuli

Syllables: /pa/, /ta/, /ka/

Modalities: auditory (A), visual (V), audio-visual (AV) and incongruent audio-visual (AVi: self auditory signal, other visual signal)

Half of the stimuli were related to the participant (self condition), the other half to an unknown speaker (other condition).

A total of 1176 stimuli were created.

 Tasks

1) Before the experiment, participants completed a short training session.

3) EEG session: a three-alternative forced-choice identification task, in which participants categorized each perceived syllable with their right hand after an auditory “beep”.

 Data acquisition

EEG data were continuously recorded from 64 scalp electrodes (international 10–20 system) using the Biosemi ActiveTwo AD-box EEG system operating at a sampling rate of 256 Hz.

Two additional electrodes (CMS and DRL) served as reference.

One additional external reference electrode was placed at the top of the nose. Horizontal (HEOG) and vertical (VEOG) eye movements were monitored with electrodes placed at the outer canthus of each eye and above and below the right eye. Before the experiment, the impedance of each electrode was adjusted to obtain a low offset voltage and a stable DC.
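As an illustration of this acquisition setup (not the authors' actual scripts), a minimal MNE-Python sketch for loading a BioSemi ActiveTwo recording is given below; the file name and the external EOG channel labels are assumptions.

    import mne

    # Load a 64-channel BioSemi ActiveTwo recording sampled at 256 Hz.
    # The file name and the EXG labels used for the EOG channels are hypothetical.
    raw = mne.io.read_raw_bdf(
        "sub01_av_speech.bdf",
        eog=["EXG1", "EXG2", "EXG3", "EXG4"],  # assumed HEOG/VEOG electrodes
        preload=True,
    )
    print(raw.info["sfreq"], len(raw.ch_names))  # sampling rate and channel count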

 Analysis

Behavioral analyses:

EEG analyses on fronto-central electrodes (F3/F4/C3/C4/Fz/Cz):

Correlations between EEG and behavioral data: integration measure (EEG; AV − (A+V)) vs. visual identification (%)
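A minimal sketch of this correlation analysis, assuming the per-participant integration measure (AV minus A+V, e.g. the N1 latency difference) and the visual-only identification scores have already been extracted; the names are illustrative, not the authors' code.

    import numpy as np
    from scipy.stats import pearsonr

    def integration_visual_correlation(integration_measure, visual_identification_pct):
        """Pearson correlation between an EEG integration measure
        (AV - (A+V), one value per participant) and visual-only
        identification accuracy (%)."""
        r, p = pearsonr(np.asarray(integration_measure),
                        np.asarray(visual_identification_pct))
        return r, p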


Discussion

1) Behavioral results: syllables were identified almost perfectly in every modality except the visual-only modality, in which /pa/ was better identified than /ta/ and /ka/.

2) Integration for both self and other signals: early integration processing (on P2) during AV-self and AV-other speech perception compared to the summed unimodal responses (A+V).

3) Speaker effect on N1 latency: compared to visual-other stimuli, visual-self stimuli induced a temporal facilitation on N1 during integration.

4) Correlations on N1 latency for visually ambiguous self syllables: a negative correlation was observed between visual-self identification and the integration effect on N1 latency.

=> In line with previous EEG studies on multimodal speech perception, our results point to early integration mechanisms for auditory and visual speech information. Crucially, they also provide evidence for a processing advantage when the perceptual situation involves our own speech productions, especially for visually ambiguous syllables. Viewing our own utterances leads to a temporal facilitation of the integration of auditory and visual speech signals.

Behavioral: ANOVA on the % of correct responses with factors Modality (A, AV, V, AVi), Speaker (self/other) and Syllable (/pa/, /ta/, /ka/)
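A hedged sketch of how such an ANOVA could be run with statsmodels, assuming a long-format table with one accuracy value per participant, modality, speaker and syllable; the column names are assumptions.

    import pandas as pd
    from statsmodels.stats.anova import AnovaRM

    def behavioral_anova(df: pd.DataFrame):
        """Repeated-measures ANOVA on the % of correct responses.
        Expected columns: participant, modality (A/AV/AVi/V),
        speaker (self/other), syllable (pa/ta/ka), pct_correct."""
        return AnovaRM(
            df,
            depvar="pct_correct",
            subject="participant",
            within=["modality", "speaker", "syllable"],
        ).fit()  # .summary() lists F and p values for each factor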

Pre-processing:
- Re-referenced off-line to the nose
- Filtering: 2-30 Hz
- Epochs: 1000 ms relative to the acoustic syllable onset (baseline from -500 to -400 ms)
- Rejection: ±60 µV
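A minimal MNE-Python sketch of these pre-processing steps, assuming a loaded Raw object with a nose electrode and trigger events coding the acoustic syllable onsets; the channel label and trigger coding are assumptions.

    import mne

    def preprocess(raw: mne.io.BaseRaw) -> mne.Epochs:
        """Nose reference, 2-30 Hz band-pass, 1000 ms epochs around the
        acoustic syllable onset, baseline -500 to -400 ms, ±60 µV rejection."""
        raw = raw.copy().set_eeg_reference(["Nose"])  # assumed label of the nose electrode
        raw.filter(l_freq=2.0, h_freq=30.0)           # 2-30 Hz band-pass
        events = mne.find_events(raw)                 # assumed trigger channel for syllable onsets
        return mne.Epochs(
            raw, events,
            tmin=-0.5, tmax=0.5,                      # 1000 ms epochs
            baseline=(-0.5, -0.4),                    # -500 to -400 ms baseline
            reject=dict(eeg=60e-6),                   # ±60 µV rejection criterion
            preload=True,
        )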

N1 & P2 amplitude and latency

Speaker effect: ANOVA with factors Auditory modality (self/other) and Visual modality (self/other/none)

Audio-visual integration: ANOVA with factors Signal type (bimodal AV / summed A+V), Auditory modality (self/other) and Visual modality (self/other)
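A minimal sketch of the additive-model comparison underlying these ANOVAs: the bimodal response (AV) is compared with the sum of the unimodal responses (A + V) over the fronto-central electrodes. The Evoked inputs and the N1 window are assumptions; the poster's statistics are ANOVAs on N1/P2 peak amplitude and latency, not this exact code.

    import numpy as np
    import mne

    FRONTO_CENTRAL = ["F3", "F4", "C3", "C4", "Fz", "Cz"]

    def av_vs_sum(evoked_av: mne.Evoked, evoked_a: mne.Evoked, evoked_v: mne.Evoked):
        """Return the AV and A+V waveforms averaged over fronto-central electrodes."""
        a_plus_v = mne.combine_evoked([evoked_a, evoked_v], weights=[1, 1])
        picks = mne.pick_channels(evoked_av.ch_names, include=FRONTO_CENTRAL)
        av = evoked_av.data[picks].mean(axis=0)
        summed = a_plus_v.data[picks].mean(axis=0)
        return av, summed, evoked_av.times

    def negative_peak(waveform, times, tmin=0.05, tmax=0.15):
        """Amplitude and latency of the most negative point in an assumed
        N1 window (50-150 ms); the exact windows used on the poster are not given."""
        mask = (times >= tmin) & (times <= tmax)
        idx = np.argmin(waveform[mask])
        return waveform[mask][idx], times[mask][idx]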

Part of this research was supported by a grant from the European Research Council (FP7/2007-2013 Grant Agreement no. 339152, "Speech Unit(e)s").

Correspondence: avril.treille@gipsa-lab.inpg.fr

1) Behavioral - % of correct responses (p < .001)

2) EEG - Integration (AV vs. A+V)

3) EEG - Self effect on integration

4) Correlations between integration (EEG; AV − (A+V)) and visual identification (%)

[Bar plots: N1 latency (visual-self vs. visual-other, visual effect) and P2 amplitude (sum A+V vs. bimodal AV, signal-type effect).]

[Scatter plots: correlations between the integration measure (N1 and P2 amplitude and latency) and visual identification (50-100%), separately for V-self and V-other.]

- Amplitude N1 & P2 : No efffect - Latence P2 : No effetc

- Latence N1 : Self : r=.41, p<.02; other : r=.01, p<.94

NS

NS NS

r=.41, p<.02

Self : N1 latency is negatively correlated with visual saliency

[Bar plot: % of correct responses (60-100%) for A-self, A-other, AV-self, AV-other, AVi-self, AVi-other, V-self and V-other, per syllable /pa/, /ta/, /ka/.]

• A = AV = AVi > V

• /pa/ = /ta/ = /ka/, except for V-self and V-other

• No effect of the speaker

[ERP waveforms (-500 to 484 ms) comparing bimodal and summed unimodal responses: AsVs vs. As+Vs, AoVs vs. Ao+Vs, AsVo vs. As+Vo, AoVo vs. Ao+Vo.]

Visual-self: reduced N1 latency (p < .02)

P2 amplitude: AV < A+V (p < .02) => integration

