HAL Id: hal-03215820
https://hal.archives-ouvertes.fr/hal-03215820
Submitted on 4 May 2021
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
Tongue motor control: deriving articulator trajectories and muscle activation patterns from an optimization principle
Pierre Baraduc, Tsiky Rakotomalala, Pascal Perrier
To cite this version:
Pierre Baraduc, Tsiky Rakotomalala, Pascal Perrier. Tongue motor control: deriving articulator trajectories and muscle activation patterns from an optimization principle. Neural Control of Movement, Apr 2021, Virtual, France. 2021. ⟨hal-03215820⟩
[Figure: vowels /i/, /a/, /o/ and /e/ in the F2-F1 and F2-F3 planes (Hz), for a postural goal and for an acoustic goal. Data, perception: [5].]
Pierre Baraduc, Tsiky Rakotomalala, Pascal Perrier
GIPSA-lab, UMR 5216 CNRS / Univ. Grenoble-Alpes / Grenoble-INP
{pierre.baraduc, ny-tsiky.rakotomalala, pascal.perrier}@gipsa-lab.grenoble-inp.fr
INTRODUCTION
Key features of sensorimotor systems:
• Multisensory integration
• Use of internal models to predict the sensory outcomes of actions
• Comparison of the sensory input with the internal prediction to optimally update the internal estimate of the system
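The last point can be illustrated with a scalar Kalman-style update. This is a minimal sketch and not the poster's estimator: the function names, the scalar state and the Gaussian noise assumption are all illustrative choices.

```python
def kalman_update(x_pred, p_pred, z, r):
    """Fuse an internal prediction with one noisy observation (scalar case).

    x_pred, p_pred: predicted state and its variance (internal model output)
    z, r: sensory observation and its noise variance
    """
    gain = p_pred / (p_pred + r)          # weight given to the sensory error
    x_new = x_pred + gain * (z - x_pred)  # correct the prediction by the error
    p_new = (1.0 - gain) * p_pred         # uncertainty shrinks after fusion
    return x_new, p_new


def fuse_two_senses(z1, r1, z2, r2):
    """Multisensory integration: treat modality 1 as the prior, then add modality 2."""
    return kalman_update(z1, r1, z2, r2)
```

The more reliable signal (smaller variance) receives the larger weight, which is the sense in which the update is optimal for Gaussian noise.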
Speech production
• Coordination task: lips, jaw, tongue
• Resistance to external disturbances (inertial forces, objects in mouth, distorted audio feedback...)
• Can optimal feedback control theory illuminate the control of tongue movements during speech (tongue kinematics, coarticulation, use of feedback...)?
CONCLUSIONS
• Minimization of effort produces plausible tongue trajectories (kinematics, EMG)
• Part of phonemic variability linked to aspects of sensorimotor control?
• Toy model suggests coarticulation can be tackled by this method
• Model predictions should be validated with formant tracking, EMA recordings and intramuscular EMG
REFERENCES
[1] Payan, Y., and Perrier, P. (1997). Synthesis of VV sequences with a 2D biomechanical tongue model controlled by the Equilibrium Point Hypothesis. Speech Comm 22, 185–205.
[2] Badin, P., Elisei, F., Bailly, G., and Tarabalka, Y. (2008). An audiovisual talking head for augmented speech generation: models and animations based on a real speaker's articulatory data. In Vth Conference on Articulated Motion and Deformable Objects, pp. 132–143.
[3] Badin, P., and Fant, G. (1984). Notes on vocal tract computation. STL-QPSR 25, 53–108.
[4] Bryson, A.E. (1999). Dynamic optimization. Addison-Wesley.
[5] Patri, J.-F., Diard, J., and Perrier, P. (2015). Optimal speech motor control and token-to-token variability: a Bayesian modeling approach. Biol Cybern 109, 611–626.
Funded by MIAI grant
METHODS
[Figure: block diagram of the control loop. Blocks: Brain, cost function, optimal controller, optimal estimator, biomechanical model of tongue, vocal tract acoustics. Signals: acoustic goal, motor commands, vocal tract shape, system state, acoustic feedback (formant positions), proprioceptive feedback (tongue position/velocity).]
Tongue biomechanics:
• Finite element (FE) model of tongue deformation (sagittal 2D model)
• Seven muscles modelled: anterior genioglossus, posterior genioglossus, hyoglossus, styloglossus, verticalis, inferior longitudinalis, superior longitudinalis [1]
• Hill-type muscle model
• Activity-dependent tissue elasticity, small deformation approximation
• Fixed tongue floor
• Contacts with the palate, velum and pharyngeal wall modeled as elastic interactions (wall: high stiffness)
• Continuous-time system ODE solved with robust Runge-Kutta integration
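To make the integration step concrete, here is a minimal sketch of a classical fourth-order Runge-Kutta scheme applied to a one-degree-of-freedom toy tissue whose stiffness grows with muscle activation. The poster does not specify which Runge-Kutta variant is used, and `toy_tissue` with all its parameter values is a hypothetical stand-in for the FE model, chosen only for illustration.

```python
import numpy as np


def rk4_step(f, t, y, h):
    """One classical 4th-order Runge-Kutta step for y' = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)


def toy_tissue(t, y, activation=0.5, mass=1.0, damping=4.0,
               k0=10.0, k1=20.0, f_max=30.0):
    """Hypothetical 1-DOF plant: active force, activity-dependent stiffness."""
    pos, vel = y
    stiffness = k0 + k1 * activation          # stiffer when the muscle is active
    acc = (f_max * activation - stiffness * pos - damping * vel) / mass
    return np.array([vel, acc])


def integrate(f, y0, t_end, h):
    """Integrate from t = 0 to t_end with fixed step h; return the final state."""
    y = np.asarray(y0, dtype=float)
    t = 0.0
    for _ in range(int(round(t_end / h))):
        y = rk4_step(f, t, y, h)
        t += h
    return y
```

With constant activation, the toy plant settles where the active force balances the elastic force, at `f_max * activation / (k0 + k1 * activation)`.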
Vocal tract, from tongue shape to acoustics:
• For a given external tongue contour, a fixed jaw position, and a fixed lip aperture, we deduce the shape of the complete vocal tract using anatomical reference data (MRI, [2])
• We then compute the resonances of the vocal tract with a harmonic model following [3], after discretizing the tract into 44 tubes of identical length, and keep the first three formants.
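The step from tube areas to formants can be sketched with a simplified concatenated-tube computation. This is not the model of [3], which accounts for losses: the sketch assumes lossless plane-wave propagation, a closed glottis, an ideal open end at the lips and a sound speed of 350 m/s. The function names and constants are illustrative.

```python
import numpy as np

C_SOUND = 35000.0   # assumed speed of sound (cm/s)
RHO = 1.14e-3       # assumed air density (g/cm^3)


def chain_d(freq_hz, areas_cm2, section_len_cm):
    """D element of the glottis-to-lips chain matrix of a lossless tube chain."""
    k = 2.0 * np.pi * freq_hz / C_SOUND
    cs, sn = np.cos(k * section_len_cm), np.sin(k * section_len_cm)
    m = np.eye(2, dtype=complex)
    for area in areas_cm2:                      # ordered from glottis to lips
        z = RHO * C_SOUND / area                # characteristic impedance
        m = m @ np.array([[cs, 1j * z * sn], [1j * sn / z, cs]])
    return m[1, 1].real                         # D is real without losses


def formants(areas_cm2, section_len_cm, n=3, f_max=5000.0, df=5.0):
    """First n resonances: zeros of D, where U_lips / U_glottis = 1 / D peaks."""
    freqs = np.arange(df, f_max, df)
    d = np.array([chain_d(f, areas_cm2, section_len_cm) for f in freqs])
    idx = np.where(d[:-1] * d[1:] < 0)[0]       # sign changes bracket the zeros
    # refine each zero by linear interpolation inside its bracket
    roots = freqs[idx] - d[idx] * df / (d[idx + 1] - d[idx])
    return roots[:n]
```

For a uniform tube, the zeros fall at odd multiples of c/(4L), the quarter-wavelength resonances, which gives a quick sanity check before using a realistic area function.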
[Figure: tube area function (area in cm² vs. distance from glottis in cm) and amplitude of the transfer function (dB vs. frequency in kHz), with F1, F2 and F3 marked.]
Optimal control model:
Plant:
• Continuous-time dynamics over fixed time T
• Proprioceptive feedback p modeled as 4D projection on principal axes of position and velocity, 3D acoustic feedback a (formant values)
• Acoustic F1-F2 goal
• Variability study:
- linear reduced plant model identified over ~50,000 FEM simulations
- only motor noise: additive (SD σA) and multiplicative (SD σM) Gaussian white noise on motor command
- internal state estimate via extended Kalman filtering
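The motor noise model of the variability study can be sketched as follows. This is a discrete-sample illustration under stated assumptions: independent Gaussian draws per sample, no scaling of the noise with the integration step, and a hypothetical function name `noisy_command`.

```python
import numpy as np


def noisy_command(u, sigma_a, sigma_m, rng):
    """Corrupt a motor command with additive and multiplicative Gaussian noise.

    u: nominal motor command (array)
    sigma_a: SD of the additive, signal-independent noise
    sigma_m: SD of the multiplicative, signal-dependent noise (relative to u)
    rng: numpy.random.Generator
    """
    u = np.asarray(u, dtype=float)
    additive = sigma_a * rng.standard_normal(u.shape)
    multiplicative = sigma_m * u * rng.standard_normal(u.shape)
    return u + additive + multiplicative
```

Because the two sources are independent, the SD of the corrupted command at amplitude u is sqrt(σA² + σM² u²): larger commands are noisier.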
Optimization:
• Unconstrained optimization: the cost function combines a neuromotor effort term and a precision penalty
• Indirect optimal control (Pontryagin based), solved by gradient descent and/or the Newton-Raphson method [4]
• Some checks of sensitivity to initial parameters
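The optimization can be illustrated on a toy problem. The sketch below assumes a generic cost, effort (sum of squared commands) plus a weighted terminal error on the goal, and a discrete-time linear plant. The poster's actual cost function and plant are not reproduced here, and the matrices `A`, `B`, `C` and all parameter values are hypothetical. The gradient is obtained by a backward costate recursion, in the spirit of indirect methods, and then used in plain gradient descent.

```python
import numpy as np


def rollout_end(u, a, b, x0):
    """Final state of x[k+1] = a @ x[k] + b * u[k]."""
    x = x0
    for uk in u:
        x = a @ x + b * uk
    return x


def cost(u, a, b, x0, c, goal, w, dt):
    """Assumed generic cost: effort term plus terminal precision penalty."""
    err = c @ rollout_end(u, a, b, x0) - goal
    return dt * np.sum(u ** 2) + w * err ** 2


def gradient(u, a, b, x0, c, goal, w, dt):
    """Gradient of the cost with respect to u via a backward costate recursion."""
    err = c @ rollout_end(u, a, b, x0) - goal
    lam = 2.0 * w * err * c                     # costate at the final time
    grad = np.empty_like(u)
    for k in range(len(u) - 1, -1, -1):
        grad[k] = 2.0 * dt * u[k] + lam @ b     # dJ/du[k]
        lam = a.T @ lam                         # propagate the costate backward
    return grad


def optimize(a, b, x0, c, goal, w, dt, n_steps, lr=0.1, n_iter=300):
    """Plain gradient descent on the command sequence, starting from rest."""
    u = np.zeros(n_steps)
    for _ in range(n_iter):
        u = u - lr * gradient(u, a, b, x0, c, goal, w, dt)
    return u


# Hypothetical damped point mass: state = (position, velocity)
DT = 0.1
A = np.array([[1.0, DT], [0.0, 1.0 - 0.5 * DT]])
B = np.array([0.0, DT])
C = np.array([1.0, 0.0])
X0 = np.zeros(2)
```

A call such as `optimize(A, B, X0, C, goal=1.0, w=10.0, dt=DT, n_steps=20)` returns the command sequence that trades effort against terminal accuracy; raising `w` pushes the final position closer to the goal at the price of larger commands.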
[Figure: sagittal tongue model, X (mm) vs. Y (mm), showing muscle fibers, the hyoid bone and activity-dependent muscle tissue elasticity. Muscles shown: anterior genioglossus, posterior genioglossus, verticalis, inferior longitudinalis, superior longitudinalis, hyoglossus, styloglossus.]
RESULTS
[Figure: key to simulation results.]
Exploring the control of contacts:
• One degree of freedom, corresponding to a principal component
• Tube model with auditory, proprioceptive and tactile feedback
• Muscular redundancy, inertia, elasticity towards neutral
• Gives intuition for more complex models, with easier convergence
• Coarticulation emerges from effort optimization
• Delayed effects of earlier constraints
• (and large differences between optimization algorithms)
• Effort optimization leads to loopy trajectories in formant space
• Though goals are acoustic, intermediate postures (red) seem very similar to final postures reached from /ə/
[Figure: toy model with muscle A and muscle B acting on position x. Position (mm) vs. time (ms) for the sequence /ə i k i/: baseline, first /i/ extended after, first /i/ extended before.]
[Figure: simulated movements from /ə/ to /i/, /e/, /a/ and /ɔ/. For each movement, four panels: predicted EMG (muscle activation, A.U., vs. time in s), acoustic trajectory (F1, F2 and F3 in Hz), tongue surface points, and sagittal trajectory (X vs. Y in mm). Loops are marked on the trajectories ("loop", "acoustic loop").]
[Figure: control of contacts (VCV): trajectory and muscle activation (A.U.) of muscle A and muscle B vs. time (ms).]
[Figure: vowel sequences (not center-out): ə → i → ɐ → ɔ → e.]
[Figure: acoustic variability (linear reduced model).]
[Figure caption: vocal tract slicing, from tube areas (top) to acoustic transfer function (bottom).]