HAL Id: hal-02068138
https://hal.archives-ouvertes.fr/hal-02068138
Submitted on 14 Mar 2019
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
The role of the P-center in cortical tracking of speech
Vincent Aubanel
To cite this version:
Vincent Aubanel. The role of the P-center in cortical tracking of speech. Perceptuo-motor relationships in speech communication, Jan 2018, Genève, Switzerland. ⟨hal-02068138⟩
The role of the P-center in cortical tracking of speech
Vincent Aubanel
University of Grenoble Alpes, CNRS, GIPSA-lab, Grenoble, France
Introduction
• Cortical oscillations track speech [1]
• Phase-resetting mechanisms at play
• Cortical tracking is maintained in noise [2]
• Speech is quasi-periodic
Q1: Which temporal cue is most relevant for driving cortical tracking?
- Amplitude envelope
- P-center
Q2: Does more regular speech help in noisy conditions?
- naturally timed
- isochronous
- anisochronous
[Figure, panels A to F: spectrograms (frequency, 0 to 8 kHz; time, 0.5 to 2.5 s) of the example sentence "The LATCH on the BACK GATE NEEDS a NAIL" with phone-level labels, the amplitude envelope (0 to 1), and the time scale function (0.5 to 1.5).]
[Figure: mean frequency (Hz) and temporal distortion (δ) for four anchor types: stressed syllables, all syllables, low number of peaks, high number of peaks.]
Discussion
Listeners' performance
Methods
Naturally timed speech (P-centers in red)
Isochronously retimed speech (anchored to P-centers)
WSOLA time transformation
Anchor points:
• stressed syllables
• unstressed syllables
• primary peaks
• secondary peaks
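A minimal sketch of how a set of anchor points could be turned into an isochronous time scale function. It assumes that the first and last anchors stay in place (so the anchored stretch keeps its duration) and that the local factor is the ratio of target to original interval; the function names `isochronous_time_scale` and `warp_time` are hypothetical and are not taken from the poster. The actual resynthesis step (WSOLA) is not shown.

```python
import numpy as np

def isochronous_time_scale(anchors):
    """Map anchor times (s) onto equally spaced targets.

    Assumption: the first and last anchors are kept fixed, so the
    anchored stretch of speech keeps its original duration.
    Returns (targets, factors), where factors[i] is the local
    stretch factor for the interval anchors[i]..anchors[i+1]
    (> 1 lengthens the interval, < 1 shortens it).
    """
    anchors = np.asarray(anchors, dtype=float)
    if anchors.ndim != 1 or anchors.size < 2:
        raise ValueError("need at least two anchor times")
    if np.any(np.diff(anchors) <= 0):
        raise ValueError("anchor times must be strictly increasing")
    # Equally spaced target times between the two end anchors.
    targets = np.linspace(anchors[0], anchors[-1], anchors.size)
    # Piecewise-constant time scale function: one factor per interval.
    factors = np.diff(targets) / np.diff(anchors)
    return targets, factors

def warp_time(t, anchors, targets):
    """Piecewise-linear mapping of original times t onto retimed times."""
    return np.interp(t, anchors, targets)

# Illustrative anchor times only, not data from the study.
anchors = [0.2, 0.5, 1.4]
targets, factors = isochronous_time_scale(anchors)
```

With these illustrative anchors, the first interval is stretched and the second compressed, while the overall span from 0.2 s to 1.4 s is unchanged.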
• 55 participants (Exp. I: 26, Exp. II: 29)
• Material: 180 Harvard sentences mixed with speech-shaped noise at −3 dB SNR
• Task: Speech intelligibility in noise
• 5 keywords scored per sentence
• Timescale function applied to speech with WSOLA
• Anisochronous condition obtained by reversing the isochronous timescale function
• Same-duration transformation
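As an illustration of the noise-mixing step, the sketch below scales a noise signal so that the speech-to-noise power ratio equals a requested value in dB. It assumes SNR is defined on mean power over the whole sentence and that the noise is already speech-shaped; the function name `mix_at_snr` is hypothetical, and white noise stands in for real signals in the usage lines.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db=-3.0):
    """Add noise to speech at the requested SNR (dB).

    Assumption: SNR is the ratio of mean signal power to mean
    noise power, computed over the full length of the speech.
    """
    speech = np.asarray(speech, dtype=float)
    noise = np.asarray(noise, dtype=float)
    if noise.size < speech.size:
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:speech.size]
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    if p_noise == 0:
        raise ValueError("noise has zero power")
    # Gain that brings the noise to the power implied by the target SNR.
    gain = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + gain * noise

# Placeholder signals: white noise stands in for speech and masker.
rng = np.random.default_rng(0)
speech = rng.standard_normal(16000)
noise = rng.standard_normal(20000)
mixed = mix_at_snr(speech, noise, snr_db=-3.0)
```

At −3 dB SNR the scaled noise carries about twice the power of the speech.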
Sentence distortion
References
All figures from: Aubanel et al. (2016) Front. Human Neurosci. 10:430
[1] Ding et al. (2016) Nature Neurosci. 19:158
[2] Fuglsang et al. (2017) Neuroimage 156:435
Acknowledgments
European Research Council (FP7 Programme, Grant no. 339152, "Speech Unit(e)s")