Orofacial somatosensory inputs enhance speech intelligibility in noisy environments


HAL Id: hal-03083564

https://hal.archives-ouvertes.fr/hal-03083564

Submitted on 19 Dec 2020

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.



To cite this version:

Rintaro Ogane, Jean-Luc Schwartz, Takayuki Ito. Orofacial somatosensory inputs enhance speech intelligibility in noisy environments. ISSP 2020 - 12th International Seminar on Speech Production, Dec 2020, Providence (virtual), United States. ⟨hal-03083564⟩



Orofacial somatosensory inputs enhance speech intelligibility in noisy environments

Rintaro Ogane¹, Jean-Luc Schwartz¹, Takayuki Ito¹,²

1 Univ. Grenoble Alpes, CNRS, Grenoble INP*, GIPSA-lab, Grenoble, France, 2 Haskins Laboratories, New Haven, USA

* Institute of Engineering Univ. Grenoble Alpes

Acknowledgments

This work was supported by the European Research Council under the European Community's Seventh Framework Program (FP7/2007-2013, Grant Agreement no. 339152) and by the National Institute on Deafness and Other Communication Disorders (R01DC017439).

We thank Gurvan Quiniou for data collection and analysis, and Silvain Gerber for statistical analysis. We also thank Coriandre Vilain for his technical support.

Summary

Somatosensory inputs associated with facial skin deformation enhance speech intelligibility in noise when the somatosensory stimulation is compatible with the articulatory nature of the corresponding speech sound.

The orofacial somatosensory system may intervene in the process of speech detection in noisy environments.

Methods

Data analysis: mean probability of correct response across all SNR conditions.

Introduction

Speech perception is an interactive process involving multiple modalities and perceptuo(multisensory)-motor connections (Schwartz et al., 2012): the auditory processing system interacts with visual information (auditory-visual interaction) and with somatosensory information (auditory-somatosensory interaction).

How does somatosensory input affect the processing of speech?

Q1: Do somatosensory inputs associated with facial skin deformation enhance speech intelligibility in noise?

Experimental setup (Ito et al., 2009)

Speech intelligibility in noise was increased in SKIN compared to CTL (an increase of ≈ 3% for the speech target /pa/). This somatosensory effect is consistent with audio-visual speech processing.

Participants: 22 native French speakers (14 for Exp. 1 and 8 for Exp. 2).

Speech materials: /pa/ for Exp. 1 and /py/ for Exp. 2.

Speech detection test.

Task: identify which noise period includes the target speech sound.

The speech stimulus was embedded in background noise (80 dB SPL) at 8 SNR levels: -8 dB to -15 dB for target /pa/, and -10 dB to -17 dB for target /py/.
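As a concrete illustration of embedding a target in noise at a prescribed SNR, here is a minimal numpy sketch. The helper name, sampling rate, and toy signals are assumptions for illustration; the poster specifies only the SNR ranges and the 80 dB SPL noise level.

```python
import numpy as np

def scale_to_snr(target, noise, snr_db):
    """Scale `target` so its RMS level sits `snr_db` dB relative to `noise`.

    Hypothetical helper: the poster states only the SNR ranges,
    not how the stimuli were actually generated.
    """
    rms_target = np.sqrt(np.mean(target ** 2))
    rms_noise = np.sqrt(np.mean(noise ** 2))
    gain = (rms_noise / rms_target) * 10.0 ** (snr_db / 20.0)
    return target * gain

# Toy example: a 440 Hz tone standing in for the speech target,
# embedded in white noise at -8 dB SNR (the easiest /pa/ level).
fs = 16000
rng = np.random.default_rng(0)
noise = rng.standard_normal(fs)
speech = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)
scaled = scale_to_snr(speech, noise, snr_db=-8.0)
mixture = scaled + noise
```

Scaling the target rather than the noise keeps the masker at a fixed overall level, consistent with the fixed 80 dB SPL noise described above.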

Two experimental conditions were alternated every 8 trials:

SKIN: with somatosensory stimulation.

CTL: auditory-alone.

Q2: Do somatosensory inputs provide different effects for different types of auditory stimulation?

Somatosensory stimulation on the face (SKIN):

Upward direction.

A half-wave 6 Hz sinusoidal pattern.

Applied in both noise periods.

The timing was adjusted to align the peak amplitudes of the somatosensory and auditory stimuli.
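The stimulation waveform described above, one half cycle of a 6 Hz sinusoid, can be sketched as follows. The sampling rate and function name are assumptions; only the 6 Hz half-wave shape comes from the poster.

```python
import numpy as np

def half_wave_pulse(freq_hz=6.0, fs=2000):
    """One half cycle of a sinusoid at `freq_hz`: the waveform rises from
    zero to a single peak and returns to zero, lasting 1 / (2 * freq_hz)
    seconds (about 83 ms at 6 Hz)."""
    t = np.arange(0.0, 1.0 / (2.0 * freq_hz), 1.0 / fs)
    return np.sin(2.0 * np.pi * freq_hz * t)

pulse = half_wave_pulse()
```

The single peak of this pulse is what would be aligned with the peak amplitude of the auditory stimulus, per the timing description above.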

The somatosensory effect may appear when the somatosensory stimulation matches the articulatory gesture of the speech sound (Ogane et al., 2019; 2020).


Contact : rintaro.ogane@gipsa-lab.grenoble-inp.fr

McGurk effect (McGurk & MacDonald, 1976). Word segmentation (Sell & Kaschak, 2009).

Lexical processing in French (Strauß et al., 2015).

Speech detection in noise (Sumby & Pollack, 1954; Erber, 1969; Grant & Seitz, 2000; Bernstein et al., 2004; Kim & Davis, 2004).

Note: two possible onsets of the target speech sound were used to prevent participants from anticipating the target's timing.

Audio-somatosensory detection advantage

Statistical results (mean probability of correct response rate).

The mean probability for /pa/ was significantly different from zero (t(13) = 2.33, p = 0.036, one-sample t-test).

No significant difference was found for /py/ (t(7) = 1.77, p > 0.12, one-sample t-test) (**).
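The form of the reported one-sample t-tests can be reproduced with a short numpy sketch. The scores below are placeholders invented for illustration, not the study's measurements; only the sample size of Exp. 1 (14 participants, hence df = 13) comes from the poster.

```python
import numpy as np

def one_sample_t(x, mu0=0.0):
    """One-sample t statistic for H0: mean(x) == mu0, with df = n - 1."""
    x = np.asarray(x, dtype=float)
    n = x.size
    t_stat = (x.mean() - mu0) / (x.std(ddof=1) / np.sqrt(n))
    return t_stat, n - 1

# Placeholder per-participant detection advantages (SKIN minus CTL) --
# NOT the study's data. With 14 participants (Exp. 1) the test has
# df = 13, matching the t(13) reported above.
example_scores = np.array([0.05, 0.01, 0.04, -0.02, 0.06, 0.03, 0.02,
                           0.00, 0.05, 0.04, -0.01, 0.03, 0.06, 0.02])
t_stat, df = one_sample_t(example_scores)
```

Testing the per-participant SKIN-minus-CTL advantage against zero is the standard way to ask whether the somatosensory condition improves detection on average.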

Speech intelligibility increased only when the speech target was /pa/, indicating a relationship between the somatosensory stimulation and the articulatory gesture underlying the auditory stimulus.

Additional sensory inputs associated with auditory information increase the intelligibility of speech sounds in noisy environments: visual information (Sumby and Pollack, 1954; Bernstein et al., 2004; Schwartz et al., 2004) and somatosensory information (this experiment) both enhance speech intelligibility in noise.

Vowel perception (Ito et al., 2009; Trudeau-Fisette et al., 2017). Lexical perception (Ogane et al., 2019; 2020).

Figure: upward somatosensory stimulation compared with the articulatory gestures producing the targets /pa/ and /py/.

(**) To be confirmed with an increased number of participants for /py/ (delayed by COVID).
