
2.5.1. Overview of linear source-filter model

The source-filter model (Fant, 1960) has enabled great progress in our understanding of voice production, particularly for voice analysis and synthesis. The simplest model treats the source (from the vocal folds) and the filter (the effect of the vocal tract) as independent.

Under this assumption, aerodynamic and acoustic effects due to the vocal tract have no influence on the vibration of the vocal folds, and the details of vocal fold vibration do not affect the filtering. This approximation allows a qualitative and semi-quantitative understanding of voice production, particularly when f0 is lower than the frequency of the first acoustic resonance of the vocal tract, as is typically the case in speech.

Consider phonation with f0 = 150 Hz: a vocal tract with resonances as described in section 2.3.1 would boost the transmission of the third harmonic 3f0 = 450 Hz and the tenth harmonic 10f0 = 1500 Hz, so that the envelope of the frequency spectrum measured outside the mouth would have peaks at these frequencies, as shown in Figure 2-12. The peaks in the spectral envelope (the formants Fi) in this example are close to the resonances of the vocal tract. However, since the spectrum is sampled only at multiples of f0, the formant frequencies fFi do not provide precise information about vocal tract resonances, notably in the case of high female voices. Broadband noise due to airflow through the glottis provides some additional information at frequencies other than the harmonics of the voice. Even so, the formant frequencies fFi have an expected error of at least 10% of f0 when determined with standard methods of linear predictive coding (Monsen and Engebretson, 1983; Vallabha and Tuller, 2002).
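The sampling effect described above can be illustrated numerically. The following is a minimal sketch, not a method from the cited studies: the resonance values of 500 Hz and 1500 Hz are assumed for illustration only (the values of section 2.3.1 are not reproduced here), and the function name is hypothetical. It finds the harmonic of f0 closest to each resonance and reports how far that harmonic lies from the resonance.

```python
def nearest_harmonics(f0, resonances):
    """For each resonance, return (harmonic number, harmonic frequency, offset in Hz).

    The offset is harmonic frequency minus resonance frequency, so its
    magnitude can never exceed f0 / 2.
    """
    results = []
    for f_res in resonances:
        n = max(1, int(round(f_res / f0)))  # closest harmonic, at least the fundamental
        results.append((n, n * f0, n * f0 - f_res))
    return results


# Assumed (illustrative) resonance frequencies in Hz.
assumed_resonances = [500.0, 1500.0]

for f0 in (150.0, 400.0):  # a speech-like f0 and a much higher one
    for n, f_harm, offset in nearest_harmonics(f0, assumed_resonances):
        print(f"f0 = {f0:5.0f} Hz: harmonic {n:2d} at {f_harm:6.0f} Hz, offset {offset:+6.0f} Hz")
```

With the assumed values, the closest harmonics at f0 = 150 Hz are the third and the tenth, matching the example in the text, whereas at f0 = 400 Hz the closest harmonics lie 100 Hz away from both resonances. The wider harmonic spacing at high f0 is why the spectral peaks become a poorer guide to the resonances.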

Figure 2-12 Source-filter model for phonated speech. Reproduced with permission from Wolfe et al. (2009a).

The glottal source spectrum (top) can be either periodic (harmonic spectrum) or broadband (continuous spectrum). The source is filtered by the gain of the vocal tract and the radiation impedance of the open mouth (centre), to produce the output sound spectrum (bottom). The y-axes show sound pressure level, which is a logarithmic quantity, so the + denotes that the underlying quantities are multiplied. The formants (peaks in the output sound spectrum) occur at frequencies close to those of the vocal tract resonances, that is, the peaks in the vocal tract gain function.
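The meaning of the + in the figure can be written out explicitly. As a sketch, with symbols that are assumptions of this note rather than notation from the source: let S(f) be the source spectrum, T(f) the vocal tract gain, R(f) the radiation characteristic of the open mouth and P(f) the output spectrum. Multiplication of magnitudes then becomes addition of levels, up to a constant reference level:

```latex
\begin{align}
|P(f)| &= |S(f)|\,|T(f)|\,|R(f)| \\
20\log_{10}|P(f)| &= 20\log_{10}|S(f)| + 20\log_{10}|T(f)| + 20\log_{10}|R(f)|
\end{align}
```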

2.5.2. Limitations to the independent source-filter model

A theory of voice production in which the source and filter are independent cannot adequately explain subtleties of voice quality and extreme cases of what is now termed source-filter interaction. These interactions are categorised below.

2.5.2.1. Source influences on the filter

Source-filter interaction includes effects of the source on the filter. The properties of the vibrating vocal folds (e.g. glottal aperture and open quotient) influence the acoustic resonances of the vocal tract (Barney et al., 2007; Lulich et al., 2009; Swerdlin et al., 2010).

2.5.2.2. Filter influences on the source

The models of the source discussed in 2.2.2 include some dependence between source and filter, as the acoustic load of the vocal tract is important. For example, the simple mass-spring model requires the presence of a vocal tract to oscillate (Flanagan, 1968). An inertive load upstream means that the airflow continues through the glottis, giving rise to the Bernoulli force. This is the behaviour highlighted by the Fletcher model (Fletcher, 1993; Tarnopolsky et al., 2000).

The two-mass model, which is more frequently used today, demonstrates bifurcations and pitch jumps when f0 and fR1 are comparable (Ishizaka and Flanagan, 1972; Pelorson et al., 1994; Ruty, 2007) and is predicted to benefit from inertance downstream (Titze, 2008).

There is no strong experimental evidence for the impedance ‘preferences’ of the vocal folds, and they likely depend on the mode of oscillation. However, when fRi and nf0 are close, the vocal tract applies a large load impedance on the vocal folds, and the sign of the reactance changes across the resonance. This may be the cause of unintended effects reported in this range, such as pitch jumps or reduction in sound intensity (Hatzikirou et al., 2006; Titze et al., 2008). Additionally, the constriction of the aryepiglottic larynx affects f0 (Bailly et al., 2008; Lucero et al., 2012; Ingo R. Titze, 2004; I. R. Titze, 2004; Titze, 2008; Titze and Story, 1997).
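The large load and the change in sign of the reactance near a resonance can be illustrated with a lumped model. The sketch below assumes, purely for illustration, that the load seen by the vocal folds near one resonance behaves like a single parallel resonator; the resonance frequency, quality factor and normalised peak impedance are hypothetical values, not measurements from the cited studies.

```python
import numpy as np


def parallel_resonator_impedance(f, f_res, q_factor, z_peak=1.0):
    """Impedance of a lumped parallel resonator (an assumed stand-in for the load).

    Z(f) = z_peak / (1 + j * Q * (f / f_res - f_res / f)),
    which has magnitude z_peak and zero reactance exactly at f = f_res.
    """
    f = np.asarray(f, dtype=float)
    return z_peak / (1 + 1j * q_factor * (f / f_res - f_res / f))


# Hypothetical values: a resonance at 500 Hz with a quality factor of 10.
freqs = np.array([450.0, 500.0, 550.0])
z = parallel_resonator_impedance(freqs, f_res=500.0, q_factor=10.0)

for f, zi in zip(freqs, z):
    kind = "inertive" if zi.imag > 0 else "compliant" if zi.imag < 0 else "resistive"
    print(f"{f:5.0f} Hz: |Z| = {abs(zi):.3f}, reactance = {zi.imag:+.3f} ({kind})")
```

In this simplified model the reactance is positive (inertive) just below the resonance and negative (compliant) just above it, while the magnitude peaks at the resonance itself. A harmonic nf0 that crosses fRi therefore sees a load that is both large and rapidly changing in character.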

Note that the resonance tuning mentioned in 2.3.2 could in principle be explained by the independent source-filter model. However, since fR1 approaches f0, some non-linear interaction may be present.

2.6. Straw Phonation: Possible Source-Filter Interaction

In speech therapy and voice training, various methods of occluding or partially occluding the vocal tract are used, from placing the hand over the mouth, to lip trills, tongue trills, and their combination as ‘blowing raspberries’ (for an overview of the various methods, see Titze (2006)). Using straws to provide the partial occlusion allows the aerodynamic and acoustic load on the vocal tract to be controlled more precisely than with other methods, by selecting the appropriate straw dimensions (Pillot-Loiseau et al., 2009; Titze, 2002).
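To give a rough sense of how straw dimensions set the load, the sketch below uses two textbook lumped estimates that are assumptions of this example and are not taken from the cited studies: the acoustic inertance of a narrow tube, ρl/A, and the Poiseuille resistance for steady laminar flow, 8μl/(πr⁴). End corrections, entry and exit losses and turbulence are all neglected, so the numbers indicate scaling only. The straw sizes and air properties are hypothetical.

```python
import math

RHO_AIR = 1.2    # kg/m^3, assumed density of air
MU_AIR = 1.8e-5  # Pa.s, assumed dynamic viscosity of air


def straw_loads(length_m, diameter_m):
    """Return (acoustic inertance, laminar flow resistance) of a cylindrical straw.

    Inertance in kg/m^4 and resistance in Pa.s/m^3, using lumped estimates
    that ignore end corrections and any non-laminar losses.
    """
    radius = diameter_m / 2
    area = math.pi * radius**2
    inertance = RHO_AIR * length_m / area
    resistance = 8 * MU_AIR * length_m / (math.pi * radius**4)
    return inertance, resistance


# Hypothetical straws: same length, two diameters.
for diameter in (0.006, 0.003):
    inertance, resistance = straw_loads(0.15, diameter)
    print(f"d = {diameter * 1000:.0f} mm: inertance = {inertance:.3e} kg/m^4, "
          f"resistance = {resistance:.3e} Pa.s/m^3")
```

Under these assumptions, halving the diameter at fixed length raises the inertance fourfold and the flow resistance sixteenfold, so the steady and oscillatory parts of the load do not scale together when the straw is changed.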

Protocols for using the straws vary, but the intended effect is to raise the pressure in the vocal tract (the intra-oral pressure Pio), to increase source-filter coupling, and to improve vocal intensity and economy, i.e. using the minimum vocal fold collision necessary to produce the voice (Pillot-Loiseau et al., 2009; Titze, 2006). Several physiological aspects of this technique have been studied on human subjects, such as the impact on laryngeal muscle activity and glottal adduction (Laukkanen et al., 2008), and the articulatory and acoustical adjustments such as closing the velum (Laukkanen et al., 2012; Vampola et al., 2011).

Further investigations have suggested that phonation into a straw could be a useful diagnostic tool to provide an estimation of phonation threshold pressure (Titze, 2009), or

The physics of phonation into a straw is complex, since it changes both the DC and AC loads on the vocal folds. Further, human subjects can rapidly and unconsciously make adjustments to adapt their vocal gesture to changes in the phonatory situation (Baer, 1979).

However, the implication of straw phonation as a therapeutic technique is that the person using the straw can learn to retain the changes after removing the straw. To simplify the problem, in vitro experiments provide a system to assess the underlying physics.

Using an in vitro model, Bailly et al. (2008) showed that a vocal-tract constriction in the region of the aryepiglottic folds can either facilitate or impede vocal fold vibration by changing Pio. Chapter 5 describes experiments performed on the same model system, using latex vocal folds and a rigid vocal tract replica, with straws at the ‘lips’. These focus on three aspects of the physics of straw phonation: the aerodynamic effects of the downstream constriction; the effect of the acoustic load provided by the straw; and the mechanical resonances of the vocal folds. These in vitro experiments suggest several ways in which the straws may influence the vocal fold behaviour via source-filter interaction.

This chapter describes the in vivo application of the three-microphone three-calibration technique to the vocal tract to measure its acoustic and mechanical resonances with the glottis closed. The measurements are used with a simple model of the vocal tract to determine the effective visco-thermal losses and the mechanical properties of the yielding walls.