
2.3 Principal Component Analysis (PCA)

2.3.5 Modifications

The transformation defined by the principal component analysis – when a joint transformation for all object classes is used, e.g., for object classification – maximizes the overall variance of the training samples. Hence, there is no mechanism which guarantees that just the inter-class variance is maximized, which would be advantageous. A modification proposed by Fisher [7] has been used by Belhumeur et al. for face recognition [2]. They point out that it is desirable to maximize the inter-class variance by the transformation, whereas the intra-class variance is to be minimized at the same time. As a result, the transformed training samples should build compact clusters located far from each other in the transformed space. This can be achieved by solving the generalized eigenvalue problem

$S_B \cdot w_i = \lambda_i \cdot S_C \cdot w_i \quad (2.22)$

with $S_B$ denoting the between-class scatter matrix and $S_C$ the within-class scatter matrix.
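As a minimal sketch (NumPy/SciPy assumed; the function and variable names are illustrative, not the book's), the scatter matrices can be built from labeled training samples and Eq. (2.22) solved as a generalized symmetric eigenproblem:

```python
import numpy as np
from scipy.linalg import eigh

def fisher_projection(X, y, n_components):
    """X: (n_samples, n_features) training data; y: class labels.
    Returns W whose columns are the leading generalized eigenvectors
    of S_B w = lambda * S_C w, cf. Eq. (2.22)."""
    classes = np.unique(y)
    mean_total = X.mean(axis=0)
    d = X.shape[1]
    S_B = np.zeros((d, d))  # between-class scatter
    S_C = np.zeros((d, d))  # within-class scatter
    for c in classes:
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        diff = (mean_c - mean_total)[:, None]
        S_B += Xc.shape[0] * (diff @ diff.T)
        S_C += (Xc - mean_c).T @ (Xc - mean_c)
    # In practice S_C may need a small ridge, e.g. S_C += 1e-6 * np.eye(d),
    # to remain positive definite when samples are scarce.
    # eigh returns eigenvalues in ascending order, so take the last columns.
    eigvals, W = eigh(S_B, S_C)
    return W[:, ::-1][:, :n_components]
```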

References

1. Ballard, D.H. and Brown, C.M., “Computer Vision”, Prentice-Hall, Englewood Cliffs, NJ, 1982, ISBN 0-131-65316-4

2. Belhumeur, P., Hespanha, J. and Kriegman, D., “Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7):711–720, 1997

3. Canny, J.F., “A Computational Approach to Edge Detection”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 8(6):679–698, 1986

4. Chellappa, R. and Bagdazian, R., “Optimal Fourier Coding of Image Boundaries”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 6(1):102–105, 1984

5. Chuang, G. and Kuo, C.C., “Wavelet Descriptor of Planar Curves: Theory and Applications”, IEEE Transactions on Image Processing, 5:56–70, 1996

6. Duda, R.O., Hart, P.E. and Stork, D.G., “Pattern Classification”, Wiley, New York, 2000, ISBN 0-471-05669-3

7. Fisher, R.A., “The Use of Multiple Measurements in Taxonomic Problems”, Annals of Eugenics, 7:179–188, 1936

8. Flusser, J. and Suk, T., “Pattern Recognition by Affine Moment Invariants”, Pattern Recognition, 26(1):167–174, 1993

9. Miyazawa, K., Ito, K., Aoki, T., Kobayashi, K. and Nakajima, N., “An Effective Approach for Iris Recognition Using Phase-Based Image Matching”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(10):1741–1756, 2008

10. Murase, H. and Nayar, S., “Visual Learning and Recognition of 3-D Objects from Appearance”, International Journal of Computer Vision, 14:5–24, 1995

11. Niblack, C.W., Barber, R.J., Equitz, W.R., Flickner, M.D., Glasman, D., Petkovic, D. and Yanker, P.C., “The QBIC Project: Querying Images by Content Using Color, Texture and Shape”, In Electronic Imaging: Storage and Retrieval for Image and Video Databases, Proceedings SPIE, 1908:173–187, 1993

12. Steger, C., “Occlusion, Clutter and Illumination Invariant Object Recognition”, International Archives of Photogrammetry and Remote Sensing, XXXIV(3A):345–350, 2002

13. Takita, K., Aoki, T., Sasaki, Y., Higuchi, T. and Kobayashi, K., “High-Accuracy Subpixel Image Registration Based on Phase-Only Correlation”, IEICE Transactions on Fundamentals, E86-A(8):1925–1934, 2003

14. Turk, M. and Pentland, A., “Face Recognition Using Eigenfaces”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Maui, HI, USA, 586–591, 1991

15. Ulrich, M. and Steger, C., “Performance Comparison of 2D Object Recognition Techniques”, International Archives of Photogrammetry and Remote Sensing, XXXIV(5):99–104, 2002

16. Van Otterloo, P., “A Contour-Oriented Approach to Shape Analysis”, Prentice Hall Ltd., Englewood Cliffs, 1992

17. Zhang, D. and Lu, G., “Generic Fourier Descriptor for Shape-Based Image Retrieval”, IEEE International Conference on Multimedia and Expo, 1:425–428, 2002

Chapter 3

Transformation-Search Based Methods

Abstract Another way of object representation is to utilize object models consisting of a finite set of points and their positions. By the usage of point sets, recognition can be performed as follows: First, a point set is extracted from a scene image. Subsequently, the parameters of a transformation which defines a mapping of the model point set to the point set derived from the scene image are estimated. To this end, the so-called transformation space, which comprises the set of all possible transform parameter combinations, is explored. By adopting this strategy, occlusion (resulting in missing points in the scene image point set) and background clutter (resulting in additional points in the scene image point set) both lead to a reduction of the percentage of points that can be matched correctly between the scene image and the model. Hence, occlusion and clutter can be controlled by the definition of a threshold for the portion of the point sets which has to be matched correctly. After introducing some typical transformations used in object recognition, some examples of algorithms exploring the transformation space, including the so-called generalized Hough transform and the Hausdorff distance, are presented.

3.1 Overview

Most of the global appearance-based methods presented so far suffer from their lack of invariance with respect to occlusion and background clutter, because both of them can lead to a significant change in the global data representation, resulting in a mismatch between model and scene image.

Most of the methods presented in this chapter utilize object models consisting of a finite set of points together with their positions. In the recognition phase, a point set is extracted from a scene image first.

Subsequently, transformation parameters are estimated by means of maximizing the similarity between the scene image point set and the transformed model point set (or minimizing their distance, respectively). This is done by exploring the so-called transformation space, which comprises the set of all possible transform parameter combinations. Each parameter combination defines a transformation between the model data and the scene image. The aim is to find a combination which maximizes the similarity (or minimizes a distance, respectively). Finally, it can be checked whether the similarity is high enough, i.e., whether the searched object is actually present at the position defined by the transformation parameters.

Occlusion (leading to missing points in the scene image point set) and background clutter (leading to additional points in the scene image point set) both result in a reduction of the percentage of points that can be matched correctly between the scene image and the model. Hence, the amount of occlusion and clutter which is still acceptable can be controlled by the definition of a threshold for the portion of the point sets which has to be matched correctly, as sketched below.
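To make this concrete, here is a minimal sketch (NumPy assumed; names such as match_fraction, eps, and the 0.7 threshold are illustrative, not the book's): a candidate transformation is scored by the fraction of transformed model points that find a scene point within a small tolerance, and a pose is accepted only if that fraction exceeds the threshold.

```python
import numpy as np

def match_fraction(model_pts, scene_pts, transform, eps=2.0):
    """model_pts: (N, 2) model point set; scene_pts: (M, 2) scene point set.
    transform: callable mapping an (N, 2) array to transformed points.
    Returns the portion of model points matched within tolerance eps."""
    mapped = transform(model_pts)
    # Distance of each transformed model point to its nearest scene point.
    dists = np.linalg.norm(mapped[:, None, :] - scene_pts[None, :, :], axis=2)
    nearest = dists.min(axis=1)
    return float(np.mean(nearest <= eps))

# A candidate pose T is kept only if enough of the model is matched, which
# tolerates missing points (occlusion) and additional points (clutter):
# if match_fraction(model, scene, T) >= 0.7: accept pose T
```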

The increased robustness with respect to occlusion and clutter is also due to the fact that, with the help of point sets, local information can be evaluated, i.e., it can be estimated how well a single point or a small fraction of the point set located in a small neighborhood fits to a specific object pose, independent of the rest of the image data (in contrast to, e.g., global feature vectors, where any discrepancy between model and scene image affects the global features). Additionally, it is possible to concentrate the point set on characteristic parts of the object (in contrast to gray value correlation, for example).

After a brief discussion of some transformation classes, some methods adopting a transformation-based search strategy are discussed in more detail. The degrees of freedom in algorithm design for this class of methods are:

• Detection method for the point set (e.g., edge detection as proposed by Canny [3], see also Appendix A). The point set must be rich enough to provide discriminative information about the object. On the other hand, however, large point sets lead to infeasible computational complexity.

• Distance metric for measuring the degree of similarity between the model and the content of the scene image at a particular position.

• Matching strategy for searching the transformation space in order to detect the minimum of the distance metric. A brute-force approach which exhaustively evaluates a densely sampled search space is usually not acceptable because the algorithm runtime is too long. As a consequence, a more intelligent strategy is required.

• Class of transformations which is evaluated, e.g., affine or similarity transforms (see the sketch below).
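For illustration, the following sketch (NumPy assumed; names are hypothetical) builds a 2D similarity transform from scale, rotation, and translation parameters. One point of the transformation space is exactly one such parameter combination; an affine transform would simply replace the constrained 2×2 matrix with an arbitrary one.

```python
import numpy as np

def similarity_transform(s, theta, tx, ty):
    """One point of the transformation space: scale s, rotation theta,
    translation (tx, ty). Returns a callable applying the transform."""
    c, si = s * np.cos(theta), s * np.sin(theta)
    A = np.array([[c, -si],
                  [si,  c]])   # rotation combined with isotropic scaling
    t = np.array([tx, ty])
    def apply(points):          # points: (N, 2) array
        return points @ A.T + t
    return apply

# Example: rotate the model by 30 degrees, scale by 1.2, shift by (5, -3),
# then score the pose with the matching fraction sketched above:
# T = similarity_transform(1.2, np.deg2rad(30.0), 5.0, -3.0)
# score = match_fraction(model, scene, T)
```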
