
FIGURE 3.2: Block diagram of a basic CELP codec

3.6 OTHER ERROR-RESILIENT CODING TECHNIQUES

In the previous sections we looked at two main ways to alleviate the consequences of packet loss: concealment and forward error correction (FEC). With concealment, the receiver tries to synthesize the missing information from the surrounding (i.e., received) blocks. With FEC, the sender transmits additional redundant information, which helps recover the missing information, either in its original form or with reduced fidelity.
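The difference between the two approaches can be illustrated with a minimal sketch. It assumes a toy scheme in which each packet carries its own frame plus a coarsely quantized copy of the previous frame; the names `coarse`, `packetize`, and `receive`, the dictionary packet layout, and the quantization step are all hypothetical and do not correspond to any particular standard. A lost frame is recovered with reduced fidelity when the following packet arrives, and is otherwise concealed by repeating the last decoded frame.

```python
def coarse(frame, step):
    """Reduced-fidelity copy: floor each sample to a multiple of `step`."""
    return [step * (s // step) for s in frame]


def packetize(frames, step=4):
    """Packet n carries frame n plus a coarse copy of frame n-1 (toy FEC)."""
    packets = []
    for n, frame in enumerate(frames):
        redundant = coarse(frames[n - 1], step) if n > 0 else None
        packets.append({"seq": n, "primary": frame, "redundant": redundant})
    return packets


def receive(packets, lost):
    """Rebuild all frames and record how each one was obtained."""
    arrived = {p["seq"]: p for p in packets if p["seq"] not in lost}
    frame_len = len(packets[0]["primary"])
    frames, how = [], []
    for n in range(len(packets)):
        if n in arrived:
            # Normal case: the packet itself arrived.
            frames.append(arrived[n]["primary"])
            how.append("received")
        elif n + 1 in arrived:
            # FEC: use the coarse copy piggybacked on the next packet.
            frames.append(arrived[n + 1]["redundant"])
            how.append("fec")
        else:
            # Concealment: repeat the last decoded frame (or output silence).
            frames.append(list(frames[-1]) if frames else [0] * frame_len)
            how.append("concealed")
    return frames, how


original = [[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]
decoded, how = receive(packetize(original), lost={1, 3})
print(how)      # ['received', 'fec', 'received', 'concealed']
print(decoded)  # [[1, 5, 9], [0, 4, 8], [3, 7, 11], [3, 7, 11]]
```

Frame 1 comes back with reduced fidelity from the redundant copy, while frame 3 has no later packet to draw on and is concealed. Note that using the redundant copy requires the receiver to wait for the next packet, so this kind of recovery costs additional delay.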

A few other techniques fall somewhere between these two, in the sense that instead of adding redundancy, they leave some redundancy in the signal at coding time. Ideally this is done in a well-planned way, leaving only the redundancy that will be most effective in recovering the lost packets. This is in contrast to some older techniques (e.g., G.711), where redundancy was left in the signal mostly to simplify computation. An example of such an error-resilient technique can be seen in the Siren codec (G.722.1), which intentionally does not use differential coding, in order to increase robustness to noise. Other techniques used to improve error resilience include Multiple Description Coding (Chapter 17) and unequal error protection (Chapter 9), which can be used with standard codecs but are particularly useful with scalable codecs (Chapter 6).
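To see why avoiding differential coding helps, consider the following toy comparison. It is only a scalar illustration under simplifying assumptions (one sample per packet, lossless coding, a lost packet concealed by repeating the previous value); it is not the Siren algorithm, and all function names are hypothetical. With independent coding a single loss damages a single sample, whereas with differential coding the missing difference leaves an offset in every later sample until the decoder is resynchronized.

```python
def encode_absolute(samples):
    """Each packet carries the sample value itself (no inter-packet dependency)."""
    return list(samples)


def encode_differential(samples):
    """Each packet carries the difference from the previous sample."""
    deltas, prev = [], 0
    for s in samples:
        deltas.append(s - prev)
        prev = s
    return deltas


def decode_absolute(packets, lost):
    """A lost packet is concealed by repeating the last decoded value."""
    out, prev = [], 0
    for i, p in enumerate(packets):
        value = prev if i in lost else p
        out.append(value)
        prev = value
    return out


def decode_differential(packets, lost):
    """A lost difference is treated as zero; the error stays in the running sum."""
    out, prev = [], 0
    for i, d in enumerate(packets):
        value = prev if i in lost else prev + d
        out.append(value)
        prev = value
    return out


def count_errors(original, decoded):
    """Number of positions where the decoded value differs from the original."""
    return sum(1 for a, b in zip(original, decoded) if a != b)


signal = [10, 12, 15, 15, 13, 10, 8, 9]
lost = {2}  # index of the single lost packet

abs_out = decode_absolute(encode_absolute(signal), lost)
diff_out = decode_differential(encode_differential(signal), lost)

print(abs_out, count_errors(signal, abs_out))    # [10, 12, 12, 15, 13, 10, 8, 9] 1
print(diff_out, count_errors(signal, diff_out))  # [10, 12, 12, 12, 10, 7, 5, 6] 6
```

The price of independence is bit rate: the differences are typically smaller than the sample values and therefore cheaper to code, which is exactly the redundancy that an error-resilient design chooses to leave in the signal.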

3.7 SUMMARY AND FURTHER READING

In this chapter we have looked at Error Concealment Strategies and Error Resilient Coding for Audio Communication. We covered the basic techniques used to conceal packet losses, as applied to several kinds of codecs, including frame-independent codecs, overlapped transform codecs, and fully predictive codecs. We also examined some of the techniques incorporated into international standards, along with a few additional techniques. We saw that many codecs are available and can be used for specific applications. The particular choice of a codec will generally involve system design issues, for example, computational complexity, bandwidth availability, backward compatibility, and so on.

Furthermore, commercial considerations often play a major role as well. These include existing intellectual property rights, licensing terms, availability of source code, and so on. For example, many of the codecs mentioned were designed for a specific application. As a general rule, CELP codecs tend to perform well in terms of rate/distortion for most rates above 2400 bps, as long as they are encoding clean speech. For example, almost all codecs used in cellular phone systems are CELP based. However, when coding music, or when background noise becomes more prevalent, waveform codecs start to perform well. Indeed, while the ITU and the GSM have standardized several CELP codecs for use at different rates in telecommunication systems, some of the primary VoIP systems use waveform-based codecs. For example, Microsoft Messenger uses Siren/G.722.1 as the default codec. This can be attributed partly to the fact that bandwidth constraints on VoIP are not as severe as in cellular systems, and partly to the fact that use of a close-talking microphone is not expected in the desktop environment.

The main objective of this chapter was to look at the different techniques available. A number of subsequent chapters will look at related topics, in particular FEC, scalable audio coding, and adaptive playout. These techniques are particularly important for speech communication.



Source document: MULTIMEDIA OVER IP AND WIRELESS NETWORKS (pages 99-102)