

From the document Data Mining (pages 158-162)

Multimedia Data Compression

3.12 TEXT COMPRESSION

3.12.4 Other applications of Lempel-Ziv coding

LZ coding techniques are not limited to text compression. Variants of LZ coding have proved effective for compressing many other data types, and they are widely used to compress general-purpose data for archival and storage. LZ coding can be applied to databases (both numeric and text), graphical charts, geographical maps, and many other special kinds of images. LZ-based coding schemes have also been adopted in many international coding standards.
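As a rough illustration of LZ-based compression of general-purpose data, the sketch below uses Python's standard zlib module, whose DEFLATE format combines LZ77-style string matching with Huffman coding. The sample record and the helper name compression_ratio are assumptions made for this illustration; they do not come from the text.

```python
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size for DEFLATE at level 9."""
    compressed = zlib.compress(data, 9)
    # Lossless check: decompression must reproduce the input exactly.
    if zlib.decompress(compressed) != data:
        raise ValueError("round trip failed")
    return len(compressed) / len(data)


# Hypothetical database-like content: one short text record repeated many times.
record = b"2001-01-01,sensor-7,OK\n"
table = record * 1000

print(f"ratio: {compression_ratio(table):.4f}")
```

Highly repetitive input such as this table compresses to a small fraction of its size, because later copies of the record are encoded as references to earlier ones. Data with little repetition would give a ratio much closer to 1.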

LZW-based coding has been found to be effective for lossless compression of different kinds of images. The widely used image file format GIF (Graphics Interchange Format) uses the LZW algorithm, in a manner very similar to the popular compress utility in UNIX. GIF is very effective in compressing computer-generated graphical images and pseudo-color or color-mapped images. TIFF (Tagged Image File Format) is another industry-standard format that supports LZ-based coding. It is useful for compressing dithered binary images, which simulate gray scale images by varying the density of black dots. The CCITT (now ITU-T) Recommendation V.42 bis is a standard for compressing data sent over a telephone network. The compression mode of this standard uses the LZW algorithm to compress data to be transmitted through the modem.
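The dictionary-building idea behind LZW can be sketched in a few lines. The following is a minimal illustration only, assuming byte input and an unbounded dictionary; the function names lzw_encode and lzw_decode are chosen here for illustration. It is not the GIF, compress, or V.42 bis implementation: practical codecs also pack codes into variable-width bit fields and limit or reset the dictionary, and those details are omitted.

```python
from typing import List


def lzw_encode(data: bytes) -> List[int]:
    """Encode bytes into a list of LZW dictionary codes."""
    # Start with one dictionary entry per possible byte value.
    dictionary = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in dictionary:
            # Keep extending the longest match found so far.
            current = candidate
        else:
            # Emit the code for the longest match and learn the new string.
            codes.append(dictionary[current])
            dictionary[candidate] = next_code
            next_code += 1
            current = bytes([value])
    if current:
        codes.append(dictionary[current])
    return codes


def lzw_decode(codes: List[int]) -> bytes:
    """Rebuild the original bytes from LZW codes, regrowing the dictionary."""
    if not codes:
        return b""
    dictionary = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # Code refers to the string the encoder had only just added.
            entry = previous + previous[:1]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        output.append(entry)
        dictionary[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(output)


message = b"TOBEORNOTTOBEORTOBEORNOT"
codes = lzw_encode(message)
print(len(message), "bytes ->", len(codes), "codes")
```

The decoder never receives the dictionary; it reconstructs the same entries in the same order as the encoder, which is why only the code sequence has to be stored or transmitted.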

3.13 CONCLUSIONS AND DISCUSSION

In this chapter we have introduced the fundamental principles behind multimedia data compression. Data compression has great potential to improve the efficiency of data mining systems in the near future by exploiting the benefits of a more compact representation of data. This is particularly important because data mining techniques typically deal with large databases, for which data storage management is a major issue.

However, the data mining community has so far failed to take advantage of the knowledge in the area of data compression and to develop special data mining techniques based on its principles. Nevertheless, there have been limited efforts to use data compression to reduce the high dimensionality of multimedia datasets, with some applications to mining multimedia information. Multimedia data mining is covered in detail in Chapter 9.

We have discussed various issues of multimedia data compression, along with some theoretical foundations. We presented some basic source coding algorithms that are often used in data compression, in order to introduce this area of development to readers. We have described the principles behind the popular algorithms for image and text data. We did not discuss compression of other data types such as video, audio, and speech because that is beyond the scope of this book. The advantages of data compression are manifold: it will enable more multimedia applications at reduced cost, thereby making them available to a larger population, with newer applications, in the near future.

