Literature review - XIV Spanish Meeting on Computational Geometry

Considerable progress has been made in the object recognition domain using feature-based classification [6, 7]. However, these methods often fail to detect objects defined by geometric shapes, such as those commonly found in signs and landmarks [7]. Recent studies confirm that most state-of-the-art computer vision techniques struggle to detect geometric properties in objects, resulting in confusion and misclassification [7, 8]. Some authors have proposed discarding rotation invariance and using feature directionality to match the object. In such approaches, the template is rotated at regular intervals and each rotated instance is inserted into the database [4, 7, 8]. The matching is then performed against a single rotated (and/or scaled) instance of a template. The problem, however, is that this method only works when the shape contains enough distinctive features relative to other objects in the database. Also, the accuracy of the method depends on the frequency of the rotation

2Partially supported by NSERC Discovery grant.

CRM Documents, vol.8, Centre de Recerca Matemàtica, Bellaterra (Barcelona), 2011


166 Object recognition using DT

interval. To address the above problems, we propose an original method based on Delaunay triangulation (DT), which considers topological relationships among objects. DT has proven highly beneficial in many applied problems pertaining to space decomposition, biometric processing, information systems, navigation, clustering, and pattern analysis [3, 10].

Methodology based on Delaunay triangulation

To establish the uniqueness of an object class, we present an original method based on topological relationships to incorporate the relative positions of the geometric features.

At first, we considered establishing a topology through a boundary-arc relational graph, or through clustering from the transitive closure of the relations graph followed by the original Maximum Flow Bipartite Matching algorithm [10]. However, these methods proved computationally expensive and impractical. To provide an optimal topological representation of features, we propose a training and matching algorithm that identifies objects based on Delaunay triangulation. Each vertex represents a matched 2D object feature, called a CART feature (introduced by the authors in [1]), and has three directional vectors that define the orientation and characteristics of the edge. The DT is constructed with these feature vectors as sites. Next, each graph edge in the DT represents an augmented feature vector, which is constructed from the edge and the CART feature vectors (Fig. 1). The entries are the two Simpletones of the CART feature vectors that are joined by the DT edge, the angles of the directional vectors (three for each CART feature), and the length of the DT edge. Then, the characteristics vector of each CART feature is added, containing information such as sharpness, skew, scale, region influence and rotation histogram.

The weights are assigned in a consistent manner that depends on the type of feature being compared. For Simpletone matching, three weights are needed. Two weights are for the matching priority of the RGB colors, and the third weight is for the dot product of the 2-Simpletone (rod-like patterns in color subspace) directions. To compute the distance between two feature vectors, we use the sum of the weighted distances of all sub-components. Most distance computations are fairly simple, except for the Simpletone distances. These are the “S-tone 1” and “S-tone 2” components of the feature vector.

The distance between two 2-Simpletones (or 2.5-Simpletones) is computed as the area of the minimal surface (or the maximum tension surface) that contains both Simpletones. A good way to approximate this surface is to triangulate the four endpoints of the Simpletones [2]. Four such triangulations are possible. The area of the triangulation with the minimum total surface area is chosen as the distance. The area of a triangle in three-dimensional space is computed using the cross product of the two edge vectors. When computing distances between clusters in 2.5-Simpletones, the closest distance between P_i and Q_j is chosen as the edge length.
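The Simpletone distance and the weighted sum of sub-component distances described above can be illustrated with a minimal sketch. It rests on several assumptions that are not stated in the text: a 2-Simpletone is modelled as a segment between two RGB points, and the four triangulations are taken to be the four ways of pairing one triangle that contains the first segment with one triangle that contains the second. The function names are hypothetical.

```python
from itertools import product
from math import sqrt


def sub(p, q):
    """Component-wise difference of two 3D points."""
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])


def cross(u, v):
    """Cross product of two 3D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def triangle_area(p, q, r):
    """Area of triangle pqr: half the norm of the cross product of two edges."""
    c = cross(sub(q, p), sub(r, p))
    return 0.5 * sqrt(c[0] ** 2 + c[1] ** 2 + c[2] ** 2)


def simpletone_distance(a, b):
    """Minimum total area over the four triangulations of the endpoints.

    a and b are segments ((r, g, b), (r, g, b)) in color space (assumed model).
    Each triangulation joins segment a to one endpoint of b, and segment b
    to one endpoint of a.
    """
    return min(
        triangle_area(a[0], a[1], bj) + triangle_area(b[0], b[1], ai)
        for ai, bj in product(a, b)
    )


def weighted_distance(component_distances, weights):
    """Sum of weighted sub-component distances between two feature vectors."""
    return sum(w * d for w, d in zip(weights, component_distances))


if __name__ == "__main__":
    s1 = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
    s2 = ((0.0, 1.0, 0.0), (1.0, 1.0, 0.0))
    print(simpletone_distance(s1, s2))
    print(weighted_distance([1.0, 2.0], [0.5, 0.25]))
```

For two parallel unit segments one unit apart, every triangulation covers the unit square, so the sketch gives a distance equal to the area of that square. Identical segments give a distance of zero.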

For training, we assume that we have a number of instances of the object, which we call “the templates”. Each of these templates contains a single instance of the object, which occupies most of the area. A large number of templates should be collected for training in such a way that the background is randomized and the instances cover small variations in perspective, lighting, etc. We devised a method based on Delaunay triangulation to adjust the weight matrix in such a way that the impact of the background and of the unstable clutter of features on the matching cost is attenuated. The weight is then assigned to be the inverse of the computed cumulative cost. After the training process is completed, we have a set of templates each consisting of a Delaunay

XIV Spanish Meeting on Computational Geometry, 27–30 June 2011 167

Figure 1. The feature vector for a Delaunay edge consists of elements from the vertices (green and red represent the two endpoints) and the edge (blue).

triangulation and feature vectors for all vertices and edges. In addition, each DT carries a weight matrix that maps each edge to an array of weights w_k(e, k), where e belongs to DT(t) for the given template t and k is the index of subcomponent S_k. For matching, we consider a test image against a given template. The test image is processed using the Simpletone analysis, followed by CART and the feature vector extraction. The template is rescaled in the range [0..1] with respect to the test image at regular intervals.
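The inverse-cost weighting used in training can be sketched as follows. The text does not specify how the cumulative cost is accumulated, so this sketch simply assumes that each training pass yields a cost per (edge, subcomponent index) pair and that these costs are summed. The function names, the key layout and the small constant eps (added to avoid division by zero) are all assumptions.

```python
def accumulate_costs(per_pass_costs):
    """Sum the costs observed for each (edge, k) key over all training passes.

    per_pass_costs is an iterable of dicts mapping (edge, k) -> cost, where
    k is the subcomponent index (assumed data layout).
    """
    total = {}
    for costs in per_pass_costs:
        for key, cost in costs.items():
            total[key] = total.get(key, 0.0) + cost
    return total


def inverse_cost_weights(cumulative_cost, eps=1e-9):
    """Assign each (edge, k) the inverse of its cumulative cost.

    Unstable features (high cumulative cost) receive small weights, so they
    contribute less to the matching cost. eps is an assumed safeguard.
    """
    return {key: 1.0 / (cost + eps) for key, cost in cumulative_cost.items()}


if __name__ == "__main__":
    passes = [
        {("e1", 0): 1.0, ("e2", 0): 4.0},
        {("e1", 0): 1.0, ("e2", 0): 4.0},
    ]
    weights = inverse_cost_weights(accumulate_costs(passes))
    for key in sorted(weights):
        print(key, round(weights[key], 3))
```

In this toy input the edge "e2" varies more across passes than "e1", so it ends up with the smaller weight, which is the attenuation effect described for background and clutter features.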

For higher efficiency, we consider a correspondence matrix and examine scale at several levels. Initially, we consider a few evenly distributed scales and pick the best match.

In the next level, we reduce the range and further refine the scale (in our case, 32 different scales in 3 levels). For each scale, a matching cost is computed using the Maximum Bipartite Matching Algorithm [2] (Figure 3). One crucial optimization is to prune Delaunay edges using the connectivity graph and randomization.
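The multi-level scale refinement can be sketched as a coarse-to-fine search. The text gives only the totals (32 scales in 3 levels), so the split into 12, 10 and 10 samples per level is an assumption, as is the rule for narrowing the range to one step on either side of the best scale. The cost callback is a hypothetical stand-in for the bipartite matching cost at a given scale.

```python
def coarse_to_fine_scale(cost, lo=0.0, hi=1.0, samples_per_level=(12, 10, 10)):
    """Search for the template scale with the lowest matching cost.

    cost: function mapping a scale to a matching cost (hypothetical stand-in
          for the bipartite matching cost of the rescaled template).
    samples_per_level: evenly spaced scales tried per level (assumed split of
          the 32 scales; each entry must be at least 2).
    """
    best_scale, best_cost = None, float("inf")
    for n in samples_per_level:
        step = (hi - lo) / (n - 1)
        for i in range(n):
            scale = lo + i * step
            c = cost(scale)
            if c < best_cost:
                best_scale, best_cost = scale, c
        # Narrow the range to one step on either side of the best scale so far.
        lo = max(lo, best_scale - step)
        hi = min(hi, best_scale + step)
    return best_scale, best_cost


if __name__ == "__main__":
    # Toy cost with its minimum at scale 0.37 (illustrative only).
    scale, value = coarse_to_fine_scale(lambda s: (s - 0.37) ** 2)
    print(round(scale, 3), value)
```

Compared with sampling 32 evenly spaced scales in a single pass, the same number of cost evaluations yields a finer final resolution, on the assumption that the cost behaves reasonably smoothly near its minimum.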

Experimentation. We implemented the proposed method and performed exhaustive experimental studies on the performance of the object recognition system.

Figure 2. The two cases describe the response of the training algorithm for two sets of training templates for a feature L.

First, training is carried out by providing positive and negative examples of the object class we are attempting to detect. After this, the object recognition system takes as


input a test image and decides whether the image contains an instance of the object class. We analyzed the detection characteristics of the proposed DT-based method and compared them with the results obtained from the Haar Cascade [9], using a training database of 128 stop signs cropped from natural images. In a trial with 20 signs and non-signs, a detection rate of 80% was accompanied by a 65% false alarm rate (FAR), while our DT CART method showed much better performance. The runtime profile indicates that our method outperforms the Haar Cascade for the given set of test images. At a 90% detection rate, the Haar method had a 28.67% error rate. In comparison, the false alarm rate for our CART-based DT method was only 5.33% at the same 90% detection rate. The training time for the Haar method was very high (five days), whereas the training time for our method was 2 hours. The method achieves near real-time performance, averaging 6.73 seconds per frame, or about 9 frames per minute.

Figure 3. Example of a template and corresponding DT.

Conclusion

We presented a novel method based on DT for shape recognition. The method is able to represent invariant arcs in a topological context and use that representation for general sign recognition. Delaunay triangulation was utilized for optimal weight assignment, and the maximum bipartite matching algorithm was then applied for shape matching. Extensive runtime analysis indicates that the proposed DT-based method has a significantly better detection rate for the test images than popular methods such as the Haar Cascade.

References

[1] R. A. Apu and M. L. Gavrilova, Shape matching through contour extraction using Circular Augmented Rotational Trajectory (CART) algorithm, IJBIDM Journal, 4, 2 (2010), 192–210.

[2] R. A. Apu and M. L. Gavrilova, Flexible tracking of object contours using LR-traversing algorithm, IGIV 2006, IEEE-CS (2006), 503–513.

[3] M. Eck, T. DeRose, T. Duchamp, H. Hoppe, A. Lounsbery, and W. Stuetzle, Multiresolution analysis of arbitrary meshes, ACM SIGGRAPH (1995), 173–182.

[4] R. Fergus, P. Perona, and A. Zisserman, Object class recognition by unsupervised scale-invariant learning, IEEE-CS, CVPR (2003), II-264 – II-271.

[5] M. T. Fischler and R. A. Elschlager, The representation and matching of pictorial structures, IEEE Transactions on Computers, C-22, 1 (1973), 67–92.

[6] P. E. Forssen and D. G. Lowe, Shape descriptors for maximally stable extremal regions, ICCV’07 (2007), 1–8.

[7] D. G. Lowe, Distinctive image features from scale-invariant keypoints, International Journal of Computer Vision, 60, 2 (2004), 91–110.

[8] A. Vedaldi, G. Guidi, and S. Soatto, Relaxed matching kernels for object recognition, IEEE Conference on Computer Vision and Pattern Recognition (2008), 1–8.

[9] P. A. Viola and M. D. Jones, Rapid object detection using a boosted cascade of simple features, CVPR 1 (2001), 511–518.

[10] C. Wang and M. L. Gavrilova, Delaunay triangulation algorithm for fingerprint matching, 3rd International Symposium on Voronoi Diagrams in Science and Engineering (2006), 208–216.
