**3.3 Enriched Markov Random Fields and C-ICM algorithm**

**3.3.4 Experimental results**

3.3.4.1 Experimental results on the VHR Strasbourg data set

In this subsection, we show the results of our MRF-based clustering on the VHR Strasbourg data set using our proposed C-ICM algorithm. To evaluate the quality of our results, we used three different quality measures:

• The Davies-Bouldin Index, see section 2.6.1: an unsupervised quality measure that evaluates the quality of the clusters based on their internal variance and the distance between the prototypes of the different clusters. For this index a lower value is better.

• The Rand Index, see Appendix B: this index measures how similar two partitions are. In these experiments, we compared our results with a ground truth built from the expert maps. For the Rand Index, a value of 1 means that the two partitions are identical.

• The Probabilistic Rand Index [137, 138]: this index is similar to the original Rand Index, but has been specifically modified to compare partitions of image data sets, which makes it relevant for our experiments. It also takes its values between 0 and 1, with 1 being a full match.

We did not use the Adjusted Rand Index because our clustering results and the ground truth differ in their number of clusters. This difference tended to bias the Adjusted Rand Index and to favor solutions featuring a larger number of clusters.
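As an illustration of the pair-counting idea behind the regular Rand Index, the following minimal Python sketch computes it from two label sequences. It follows the standard definition (proportion of object pairs on which the two partitions agree) and is not the code used for the experiments; the function name `rand_index` is chosen here for illustration only.

```python
from collections import Counter
from math import comb


def rand_index(labels_a, labels_b):
    """Rand Index between two partitions given as equal-length label sequences."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same objects")
    n = len(labels_a)
    if n < 2:
        raise ValueError("at least two objects are needed to form a pair")

    total_pairs = comb(n, 2)
    # Pairs grouped together in both partitions (cells of the contingency table).
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs grouped together in each partition taken separately.
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    # Pairs separated in both partitions, obtained by inclusion-exclusion.
    apart_both = total_pairs - together_a - together_b + together_both
    # Agreements (together in both, or apart in both) over all pairs.
    return (together_both + apart_both) / total_pairs


# The two partitions may use different label names and different numbers of clusters.
score = rand_index([0, 0, 1, 1], [0, 0, 1, 2])
```

Because only pairs of objects are compared, the index does not require the clustering and the ground truth to have the same number of clusters.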

In Table (3.1), we give the results of three different instances of the algorithm searching for 7, 8 and 9 clusters. As one can see, the Probabilistic Rand Index results compared with the ground truth are quite good (around 80% similarity), with the 7-cluster segmentation being the closest one to the ground truth, followed by the 9-cluster segmentation which is the best one from a clustering point of view (lowest Davies-Bouldin Index). The results of the regular Rand Index are similar except that the 9-cluster segmentation has the best score with this index.

These results are surprisingly good given that the C-ICM is unsupervised, which tends to support the effectiveness of the semantically rich version of this algorithm for processing this type of satellite image. Extracts of the 9-cluster segmentation are shown in Figure 3.13.

It is however important to emphasize that all three clustering results suffer from problems that are very common in image segmentation. Despite the 27 attributes, it seems that the original colors are still the most important factors. For instance, water areas, shadow areas and dark roofs are often grouped in a single cluster. With our current knowledge of this data set, it is difficult to say whether this problem comes from the segmentation algorithm, from the data preprocessing that created segments from these shadow areas, or a bit of both. These segmentation defects are quite easy to spot on full scale color versions of the image extracts as shown in Figures 3.13(a) and 3.13(d).

| C-ICM | Davies-Bouldin Index | Rand Index | Probabilistic Rand Index |
|---|---|---|---|
| 7 clusters | 2.76±0.31 | 0.783±0.036 | 0.826±0.040 |
| 8 clusters | 3.33±0.24 | 0.785±0.023 | 0.796±0.017 |
| 9 clusters | 3.11±0.27 | 0.792±0.013 | 0.817±0.025 |

Table 3.1: Results achieved by the C-ICM algorithm searching for 7, 8 and 9 clusters: Davies-Bouldin Index, Rand Index, and Probabilistic Rand Index

Figure 3.13: Extracts of the 9-cluster segmentation: (a) European Parliament, (b) Urban area, (c) A sports complex, (d) Strasbourg residential area, (e) A graveyard

On the semantic side, after linking each cluster to the corresponding expert label from the visual results, the following properties were found:

• There is a strong spatial connection between the tree areas and grass areas (dark green and light green respectively), with a transition probability of ≈ 0.65 from a tree segment to a grass segment. This interesting property from the neighborhood matrix *A* could be translated as “Trees are often surrounded by grass”.

• The transition probability between modern urban building segments (in yellow) and water segments is low in both directions (< 0.02). This matches the fact that modern buildings are rarely built directly adjacent to a river, or in the middle of it.

• The water cluster has a moderate transition probability to itself (≈ 0.47). This is consistent with the river and water areas of the city of Strasbourg being rather linear and small.

While these results may seem obvious, they are still quite impressive considering that they come from an unsupervised algorithm that has no external knowledge of the true nature of each cluster.
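To make the reading of such transition probabilities concrete, the sketch below builds a row-normalized matrix from a cluster labeling and a list of adjacent segment pairs. This is only an assumed, simplified way of obtaining a matrix with the same interpretation as *A*; the matrix used by the C-ICM algorithm is estimated within the algorithm itself, and the function name and input format here are hypothetical.

```python
def neighborhood_transition_matrix(labels, edges, n_clusters):
    """Row-normalized counts of cluster co-occurrence over adjacent segments.

    labels: cluster index of each segment
    edges: iterable of (i, j) pairs of adjacent segments (undirected)
    Entry [a][b] estimates the probability that a neighbor of a segment
    from cluster a belongs to cluster b (assumed interpretation).
    """
    counts = [[0.0] * n_clusters for _ in range(n_clusters)]
    for i, j in edges:
        # Each undirected adjacency is counted once in each direction.
        counts[labels[i]][labels[j]] += 1
        counts[labels[j]][labels[i]] += 1

    matrix = []
    for row in counts:
        total = sum(row)
        # Clusters without any neighbor keep an all-zero row.
        matrix.append([c / total if total > 0 else 0.0 for c in row])
    return matrix


# Toy chain of five segments: two in cluster 0, two in cluster 1, one in cluster 2.
toy_matrix = neighborhood_transition_matrix(
    labels=[0, 0, 1, 1, 2],
    edges=[(0, 1), (1, 2), (2, 3), (3, 4)],
    n_clusters=3,
)
```

Since each row is normalized separately, the matrix is generally not symmetric: the probability of going from cluster a to cluster b can differ from the probability of going from b to a, which is why a low value in both directions is worth noting.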

| Cluster number | Computation time |
|---|---|
| C-ICM 7 clusters | 27,630 ms |
| C-ICM 8 clusters | 31,495 ms |
| C-ICM 9 clusters | 34,438 ms |

Table 3.2: Average computation times of the C-ICM algorithm for different numbers of clusters

Finally, in Table (3.2), we give the computation times associated with the previous experiments for 10 iterations of the EM algorithm followed by the C-ICM algorithm iterating until convergence. These computation times were measured on an i5-3210M 2.5 GHz processor with an OpenMP-parallelized version of our C++ code. The data set processed by the algorithms was around 60 MB in size and included the 27 attributes of the 187,058 segments and their neighborhood dependency graph.

As one can see, the computation times are quite low given the size and complexity of the data set as well as the quality of the results.

3.3.4.2 Experimental results on a pixel-based satellite image

For comparison purposes, we used our algorithm on a pixel-based satellite image, so that we could compare the result of our proposed energy model from Equation (3.8) with the Potts model (Equation (3.3)) and the model adapted for the GMM from Equation (3.7). The original image that we used is made of 1131×575 pixels and can be seen in Figure (3.14) (Copyright 2014 Cnes/Spot Image, DigitalGlobe, Google).

Figure 3.14: From left to right, and from top to bottom: the original image, the result of our algorithm and Equation (3.8), the result using energy Equation (3.3) and *β* = 1, the result using energy Equation (3.7) and *β* = 1.

As can be seen in Figure (3.14), our energy model achieves a decent segmentation with a fair number of easily visible elements such as roads, rocky areas and some buildings. On the other hand, the classical HMRF-EM algorithm using Equation (3.7) fails to aggregate neighboring pixels from common elements despite a relatively high value for *β*, while the algorithm based on the Potts model tends to give results that are too coarse, in which the different objects of interest in this image are blurry.
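For readers unfamiliar with the role of *β* in this kind of model, the sketch below shows one ICM sweep over a pixel grid with a Potts-style prior. It uses the common textbook form of the Potts energy (a penalty of *β* for each 4-connected neighbor carrying a different label), which may differ in detail from Equation (3.3); the data costs are assumed to be given, and all names are hypothetical.

```python
def potts_icm_sweep(labels, data_cost, beta):
    """One in-place ICM sweep over a 2D label grid with a Potts-style prior.

    labels: list of rows holding the current cluster index of each pixel
    data_cost: data_cost[r][c][k] is the cost of giving label k to pixel (r, c)
    beta: weight of the smoothness term (assumed textbook Potts form)
    Returns the number of pixels whose label changed during the sweep.
    """
    height, width = len(labels), len(labels[0])
    changed = 0
    for r in range(height):
        for c in range(width):
            # Labels of the 4-connected neighbors that lie inside the image.
            neighbors = [
                labels[rr][cc]
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < height and 0 <= cc < width
            ]

            def local_energy(k):
                # Data term plus beta for every neighbor with a different label.
                return data_cost[r][c][k] + beta * sum(1 for n in neighbors if n != k)

            best = min(range(len(data_cost[r][c])), key=local_energy)
            if best != labels[r][c]:
                labels[r][c] = best
                changed += 1
    return changed


# Toy 3x3 image: the center pixel slightly prefers label 1, its surroundings prefer label 0.
toy_cost = [[[0.0, 1.0] for _ in range(3)] for _ in range(3)]
toy_cost[1][1] = [0.5, 0.0]
toy_labels = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
n_changed = potts_icm_sweep(toy_labels, toy_cost, beta=1.0)
```

In this toy case the smoothness term outweighs the data term at the center pixel, so the isolated label is absorbed by its surroundings. This is the same mechanism that, on a real image, can smooth out small objects when the prior dominates.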

While this was not the goal of this experiment, it clearly shows why satellite images should be preprocessed and segmented first, instead of applying the clustering directly to the pixels.

### 3.4 Discussion

In this chapter, we have introduced image segmentation as a specific application of clustering. In particular, our goal was to propose a method suited to the clustering of data acquired from very high resolution satellite images. To this end, we have studied several existing approaches and adapted one of them (the ICM algorithm using an MRF model) to our problem. The resulting algorithm has shown promising results both on a very high resolution satellite image data set and on a regular satellite image data set. Furthermore, our proposed method gives extra semantic information on the clusters that would not be available when using previously existing algorithms. These pieces of information give clues on how the clusters geographically relate to each other in the analyzed image.

We have also shown that while our proposed method and many other methods give relatively good results, image data sets remain difficult to tackle. The whole process from the raw image to the final clustering is a difficult one where initial noise, shadows, deformations, and then segmentation errors and clustering errors accumulate as we move through the different steps. Furthermore, with high resolution satellite images, there are different possible scales of interest available, which makes it difficult to choose parameters as simple as the number of clusters to look for.

In this context, we can see how it could be useful to combine several clustering methods to tackle this kind of image data set. The benefits would include a higher chance of achieving good results by combining several methods, and could extend to the possibility of a cross-scale analysis. Various ways of applying these ideas are discussed in detail in the next chapter on collaborative clustering.

## Chapter 4

## Collaborative Clustering

### Contents

4.1 Introduction to collaborative clustering
4.1.1 Problematic and applications
4.1.2 Horizontal and Vertical Collaboration
4.2 State of the art in collaborative clustering
4.2.1 Collaborative Fuzzy C-Means
4.2.2 Prototype-based collaborative algorithms
4.2.3 The SAMARAH Method
4.3 Horizontal probabilistic collaborative clustering guided by diversity
4.3.1 Context and Notations
4.3.2 Algorithm
4.3.3 Convergence Properties
4.3.4 Experimental results
4.4 Discussion
