
Speech quality in mobile devices is degraded by acoustic echo and ambient noise. The annoyance due to both disturbances increases as their level increases. Speech processing algorithms need to be implemented within mobile phones to maintain good speech quality.

While both are addressed at some point in this thesis, we focus on approaches to improve echo control for single- and dual-microphone devices.

7.1 Single microphone echo control

Existing approaches to single microphone echo control are based on adaptive filtering followed by echo postfiltering. Adaptive echo cancellation aims to estimate the echo signal recorded by the microphone, while echo postfiltering consists in attenuating the residual echo in the frequency or subband domain. Similarly, noise reduction approaches aim to attenuate the noise in the frequency or subband domain.
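The adaptive echo cancellation stage described above can be illustrated with a minimal NLMS sketch (NLMS is the adaptive filter named in Section 7.3). This is an illustration under stated assumptions, not the implementation evaluated in the thesis: the function name `nlms_echo_canceller`, the filter length, the step size and the synthetic echo path are all hypothetical, and the example is noise-free with no near-end speech.

```python
import numpy as np

def nlms_echo_canceller(far_end, mic, num_taps=32, step=0.5, eps=1e-8):
    """Estimate the echo from the far-end signal and subtract it from the microphone signal."""
    w = np.zeros(num_taps)       # current estimate of the echo path
    x_buf = np.zeros(num_taps)   # most recent far-end samples, newest first
    error = np.zeros(len(mic))   # AEC output (residual echo plus near-end signal)
    for n in range(len(mic)):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = far_end[n]
        echo_hat = w @ x_buf                 # echo estimate
        error[n] = mic[n] - echo_hat         # echo-cancelled sample
        # Normalized LMS update: step scaled by the far-end energy in the buffer
        w += step * error[n] * x_buf / (x_buf @ x_buf + eps)
    return error, w

# Synthetic test case (assumed): white far-end signal, decaying random echo path
rng = np.random.default_rng(1)
far_end = rng.standard_normal(4000)
echo_path = rng.standard_normal(32) * np.exp(-0.2 * np.arange(32))
mic = np.convolve(far_end, echo_path)[:len(far_end)]

error, w = nlms_echo_canceller(far_end, mic)
```

In a complete system, the `error` signal would then be passed to the echo postfilter, which attenuates whatever residual echo remains.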

A simple approach to reducing the computational complexity of a speech enhancement scheme consists in combining echo postfiltering and noise reduction. In our case, we focus mainly on efficient filtering methods (i.e. perturbation suppression). Subband domain filtering suffers from frequency domain aliasing, which can be avoided by performing the filtering in the time domain. Our assessment shows that subband and time domain filtering are equivalent in terms of perturbation attenuation. The perceived distortion nevertheless differs: filtering in the time domain tends to introduce crackling noise during double-talk (DT) periods, whereas subband filtering results in musical noise.

Echo postfiltering and noise reduction can also be combined in the frequency domain.

Filtering in the frequency domain sometimes corresponds to circular convolution, which suffers from time domain aliasing. We show how this aliasing can be avoided with appropriate zero-padding. Filtering with this zero-padding, also referred to as linear convolution, requires additional DFTs and is thus computationally very demanding. We propose an alternative, scalable filtering method whose scalability can be exploited to reduce the computational complexity. A comparison of these frequency domain filtering methods shows that they are similar in terms of perturbation attenuation and near-end speech distortion. The proposed method is therefore a good alternative to circular convolution in the frequency domain.
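The time domain aliasing of circular convolution, and its removal by zero-padding, can be illustrated with a short NumPy sketch. It only contrasts the two standard filtering modes and does not implement the scalable method proposed in the thesis. The function names, the frame length (64) and the filter length (16) are assumptions chosen for illustration.

```python
import numpy as np

def circular_filter(frame, h):
    """Multiply DFTs of the frame length: equivalent to circular convolution."""
    n = len(frame)
    return np.fft.irfft(np.fft.rfft(frame) * np.fft.rfft(h, n), n)

def linear_filter(frame, h):
    """Zero-pad both signals to len(frame) + len(h) - 1 before the DFT,
    so that the product of the DFTs corresponds to linear convolution."""
    n = len(frame) + len(h) - 1
    return np.fft.irfft(np.fft.rfft(frame, n) * np.fft.rfft(h, n), n)

rng = np.random.default_rng(0)
frame = rng.standard_normal(64)   # one signal frame (assumed length)
h = rng.standard_normal(16)       # time domain filter (assumed length)

reference = np.convolve(frame, h)  # direct linear convolution, 79 samples
y_lin = linear_filter(frame, h)
y_circ = circular_filter(frame, h)

# The tail of the linear result wraps around onto the first len(h) - 1 samples
aliased = reference[:15] + reference[64:]
```

Here `y_lin` matches `reference`, whereas the first 15 samples of `y_circ` equal `aliased`: the tail of the filtered frame is folded back onto its beginning. The zero-padded version avoids this at the cost of longer transforms.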

Another contribution reported in this thesis concerns a synchronized echo control system.

Our approach to synchronizing the AEC and echo postfiltering is based on the link between the AEC system mismatch and its power spectrum. Assessments show that the proposed synchronization method results in a significant reduction in the convergence time of the AEC and in better echo suppression.

7.2 Dual-microphone echo control

Part II of this thesis focuses on echo control for dual-microphone (DM) mobile devices. An analysis of the echo problem based on recordings with DM devices is reported. This study shows that, by placing the transducers in a certain configuration, significant level differences can be observed between the microphone signals depending on whether the device is in an echo-only, near-end speech only or DT period.

Unlike existing multi-microphone echo control systems, we choose to use an adaptive filter followed by echo postfiltering. In Chapter 5, we show how the level difference observed in our recordings can be exploited for DT detection. A new DM echo postfiltering gain rule is also proposed. Assessments show that the proposed DT detector (DTD) can be used efficiently to improve any existing SM echo control algorithm. We also show that using the proposed DTD in conjunction with the new gain rule yields good echo attenuation as well as low distortion of the useful near-end speech signal. The performance of the proposed DTD and gain rule increases significantly as the level difference between the microphone signals increases. This dependence on the level difference leads to recommendations regarding the hardware configuration of the transducers on the device: the primary microphone should be placed as far as possible from the loudspeaker, while the secondary microphone should be placed as close as possible to it, to guarantee maximum level difference between the microphone signals.
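The idea of detecting DT from the level difference between the two microphones can be sketched as follows. This is a hypothetical illustration, not the DTD of Chapter 5: the normalized form (P1 - P2) / (P1 + P2), the threshold value, the coupling gains and the function names are all assumptions. The sketch also assumes that the far-end signal is active, so that a frame is either echo-only or DT.

```python
import numpy as np

def normalized_pld(primary, secondary, eps=1e-12):
    """Assumed normalized level difference in [-1, 1]: (P1 - P2) / (P1 + P2)."""
    p1 = np.mean(np.square(primary))
    p2 = np.mean(np.square(secondary))
    return (p1 - p2) / (p1 + p2 + eps)

def detect_double_talk(primary, secondary, threshold=-0.5):
    """Hard decision: declare DT when the level difference exceeds a threshold.
    The threshold is a hypothetical tuning value."""
    return bool(normalized_pld(primary, secondary) > threshold)

rng = np.random.default_rng(0)
echo = rng.standard_normal(4096)     # stand-in for the loudspeaker echo
speech = rng.standard_normal(4096)   # stand-in for near-end speech

# Assumed couplings: the secondary microphone sits close to the loudspeaker,
# the primary microphone far from it and closer to the talker.
echo_only = (0.1 * echo, 1.0 * echo)
double_talk = (0.1 * echo + 1.0 * speech, 1.0 * echo + 0.5 * speech)

pld_echo = normalized_pld(*echo_only)    # close to -1
pld_dt = normalized_pld(*double_talk)    # noticeably higher
```

With these couplings, the echo-only frame yields a value near -1 and the DT frame a clearly higher one. A larger level difference between the microphones widens this gap, which is consistent with the placement recommendation above.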

The methods presented in Chapter 5 are restricted to certain transducer configurations. In Chapter 6, we propose a more general approach to DM echo postfiltering. This method exploits the correlation between the microphone signals and can be extended to non-linear echo suppression. We show through experiments that this method outperforms the SM approach in terms of echo attenuation and speech distortion.

7.3 Perspectives

While SM echo control solutions have been investigated extensively in the literature, our contributions show that there is still room for speech quality improvement. The synchronization approach presented in this thesis is based on an NLMS adaptive filter. It could also be of interest for improving the convergence speed of the affine projection algorithm (APA).

The proposed DM approaches to echo postfiltering all use estimates of the echo and near-end RTFs. Our analyses show that the accuracy of these RTF estimates strongly affects the echo suppression performance of the proposed methods. RTF estimators can be found in the literature: they are used within multi-microphone speech enhancement algorithms based on beamforming [Gannot et al., 2001, Reuven et al., 2007a]. One weakness of the RTF is that it is difficult to define its target or ideal value, since it is a ratio of impulse responses. As a result, RTF estimates cannot be evaluated separately but only as part of a speech enhancement module, as we did in Chapters 5 and 6. Efforts to improve the proposed DM postfilters should focus on more accurate RTF estimation.

The DM methods presented in this thesis do not account for the presence of ambient noise. Future investigations should assess their performance and robustness in the presence of noise. Moreover, the proposed DM methods have been compared to SM methods. Further validation should include a comparison of their performance with existing beamforming approaches.

It is well known that hard-decision systems such as the proposed DTD suffer from false positive errors (i.e. DT declared whereas there is no DT) and false negative errors (i.e. DT not declared whereas there is DT), both of which are due to thresholding. The proposed measurement of the normalized PLD could also be used directly to control the AEC and/or echo postfiltering through soft-decision methods. In such a case, the stepsize or echo suppression gain could be computed or postprocessed based on the value of the PLD. Such soft-decision methods make even more sense given that our analysis shows the normalized PLD to be a function of the signal-to-echo ratio (see Section 5.3).
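The difference between hard and soft decisions can be sketched with a mapping from the normalized PLD to the AEC stepsize. This is only one possible illustration of the perspective outlined above, not a method from the thesis: the linear mapping, the anchor values `pld_echo` and `pld_near`, the threshold and the function names are all hypothetical. The sketch assumes a normalized PLD in [-1, 1] that is close to -1 in echo-only periods.

```python
import numpy as np

def hard_step_size(pld, mu_max=1.0, threshold=-0.5):
    """Hard decision: adapt at full speed below the threshold, freeze above it."""
    return mu_max if pld <= threshold else 0.0

def soft_step_size(pld, mu_max=1.0, pld_echo=-0.9, pld_near=0.0):
    """Soft decision: reduce the stepsize gradually as the PLD moves from
    its echo-only value (pld_echo) towards near-end dominance (pld_near)."""
    t = np.clip((pld - pld_echo) / (pld_near - pld_echo), 0.0, 1.0)
    return float(mu_max * (1.0 - t))

# Two PLD values just either side of the hard threshold
mu_hard_below = hard_step_size(-0.51)   # full stepsize
mu_hard_above = hard_step_size(-0.49)   # adaptation frozen
mu_soft_below = soft_step_size(-0.51)
mu_soft_above = soft_step_size(-0.49)
```

Around the threshold, the hard rule jumps from full adaptation to none, so a small estimation error in the PLD changes the behaviour completely. The soft rule changes only slightly over the same interval, which limits the cost of such errors.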
