Region-based approaches - The DART-Europe E-theses Portal


3.1.2 Region-based approaches

Region-based approaches are the basis of Object-Based Image Analysis (OBIA) [21].

The main idea behind these methods is that, since pixels themselves carry no semantic meaning, a first step is required to group them into regions that represent the real objects of interest to be identified or clustered. For region-based approaches, the data mining process therefore consists of two steps instead of one:

1. Segmentation of the original image to determine the borders of the objects of interest

2. Clustering or classification of the newly identified regions

These regions have new and unique characteristics that derive not only from the characteristics of the pixels they are made of, but also from shape, size and texture features.
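The two steps above can be sketched on a toy grayscale image. This is a minimal illustration under stated assumptions, not the method used in this work: the names `segment` and `classify_regions`, the tolerance `tol`, the threshold of 128 and the pixel values are all hypothetical, and the segmentation is a simple seed-based region growing chosen for brevity.

```python
from collections import deque


def segment(image, tol):
    """Step 1: grow 4-connected regions of pixels within `tol` of a seed value."""
    h, w = len(image), len(image[0])
    labels = [[-1] * w for _ in range(h)]
    next_label = 0
    for r in range(h):
        for c in range(w):
            if labels[r][c] != -1:
                continue  # pixel already belongs to a region
            seed = image[r][c]
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and labels[ny][nx] == -1
                            and abs(image[ny][nx] - seed) <= tol):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels, next_label


def classify_regions(image, labels, n_regions, threshold):
    """Step 2: assign a class to each region from its mean pixel value."""
    sums = [0.0] * n_regions
    counts = [0] * n_regions
    for row_values, row_labels in zip(image, labels):
        for value, label in zip(row_values, row_labels):
            sums[label] += value
            counts[label] += 1
    return ["bright" if sums[i] / counts[i] > threshold else "dark"
            for i in range(n_regions)]


# Hypothetical image: a bright object on a dark background.
image = [
    [10, 12, 11, 10, 12],
    [11, 200, 210, 12, 10],
    [10, 205, 198, 11, 12],
    [12, 11, 10, 10, 11],
]
labels, n_regions = segment(image, tol=20)
classes = classify_regions(image, labels, n_regions, threshold=128)
print(n_regions, classes)  # 2 ['dark', 'bright']
```

The classification step works on one mean value per region rather than on individual pixels, which is the point of the two-step process; a real pipeline would likely use a richer feature vector per region.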

Figure 3.1: Example of a crude segmentation to detect a white lamb on a grass background

3.1.2.1 Image Segmentation

Image segmentation is a process that groups neighboring pixels together with the goal of finding homogeneous segments whose borders are a good approximation of the objects present in the image [52]. The resulting segments are expected to be relevant and to match the real objects that can be found in the picture.

The definition of a proper image segmentation has been formalized by Pavlidis and Zucker [100, 162] in the form of the following four axioms:

• Each pixel of the image must belong to one and only one segment.

• Each segment must be continuous, i.e. made of connected neighboring pixels.

• Each segment must be a homogeneous entity.

• Two adjacent segments must be two distinct homogeneous entities.
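The four axioms can be checked mechanically on a label map. The sketch below is only an illustration under assumptions: the function `check_axioms`, the 4-connectivity, and the homogeneity predicate (value range of at most 20) are hypothetical choices, and the fourth axiom is read as "the union of two adjacent segments must not be homogeneous".

```python
def neighbors(r, c, h, w):
    """Yield the 4-connected neighbors of (r, c) that lie inside the image."""
    for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
        if 0 <= nr < h and 0 <= nc < w:
            yield nr, nc


def check_axioms(image, labels, homogeneous):
    h, w = len(image), len(image[0])
    # Axiom 1: a label map gives at most one label per pixel by construction,
    # so it is enough to check its shape and that no pixel is left unlabeled.
    partition = (len(labels) == h
                 and all(len(row) == w for row in labels)
                 and all(l is not None for row in labels for l in row))
    segments = {}
    for r in range(h):
        for c in range(w):
            segments.setdefault(labels[r][c], []).append((r, c))

    def connected(pixels):
        # Axiom 2: a flood fill from one pixel must reach the whole segment.
        pixel_set = set(pixels)
        seen = {pixels[0]}
        stack = [pixels[0]]
        while stack:
            r, c = stack.pop()
            for n in neighbors(r, c, h, w):
                if n in pixel_set and n not in seen:
                    seen.add(n)
                    stack.append(n)
        return len(seen) == len(pixel_set)

    values = {l: [image[r][c] for r, c in px] for l, px in segments.items()}
    adjacent = set()
    for r in range(h):
        for c in range(w):
            for nr, nc in neighbors(r, c, h, w):
                if labels[r][c] != labels[nr][nc]:
                    adjacent.add(frozenset((labels[r][c], labels[nr][nc])))
    pairs = [tuple(pair) for pair in adjacent]
    return {
        "partition": partition,
        "connected": all(connected(px) for px in segments.values()),
        "homogeneous": all(homogeneous(v) for v in values.values()),      # axiom 3
        "distinct_neighbors": all(not homogeneous(values[a] + values[b])  # axiom 4
                                  for a, b in pairs),
    }


def small_range(vals):
    """Hypothetical homogeneity criterion: values spread over at most 20."""
    return max(vals) - min(vals) <= 20


image = [[10, 10, 200],
         [10, 10, 200]]
good = check_axioms(image, [[0, 0, 1], [0, 0, 1]], small_range)
split = check_axioms(image, [[0, 2, 1], [0, 2, 1]], small_range)
fused = check_axioms(image, [[0, 0, 0], [0, 0, 0]], small_range)
print(good)
print(split["distinct_neighbors"], fused["homogeneous"])  # False False
```

With this criterion, splitting the uniform dark area in two violates the fourth axiom, while fusing everything into one segment violates the third. A different homogeneity predicate would give different verdicts on the same label maps.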

Among these four conditions, the third and fourth axioms rely on a notion of homogeneity that, not unlike the notion of similarity in clustering, is rather difficult to assess. Image segmentation is thus a difficult process whose results vary in quality depending on the homogeneity criterion and the algorithm used. Over-segmentation and under-segmentation are the two most common problems:

• Over-Segmentation: The image contains too many segments after the segmentation process. In this case, many of the objects to be found remain spread over several small segments that do not contain enough pixels. This problem can generally be solved by merging together segments that are too similar or do not represent anything.

• Under-Segmentation: The image does not contain enough segments. The resulting segments are so big that each of them contains several objects. This problem is more difficult to solve than over-segmentation.
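The merging remedy mentioned for over-segmentation can be sketched as follows. This is a hedged illustration only: `merge_similar_segments`, the similarity test on mean values and the threshold `max_mean_diff` are assumptions made for the example, not a method taken from the literature cited here.

```python
def merge_similar_segments(image, labels, max_mean_diff):
    """Greedily merge adjacent segments whose mean values are close."""
    h, w = len(image), len(image[0])
    total, count = {}, {}
    for r in range(h):
        for c in range(w):
            label = labels[r][c]
            total[label] = total.get(label, 0) + image[r][c]
            count[label] = count.get(label, 0) + 1

    parent = {label: label for label in total}  # union-find forest

    def find(label):
        while parent[label] != label:
            parent[label] = parent[parent[label]]  # path halving
            label = parent[label]
        return label

    merged = True
    while merged:  # repeat until a full pass produces no merge
        merged = False
        for r in range(h):
            for c in range(w):
                for nr, nc in ((r + 1, c), (r, c + 1)):
                    if nr >= h or nc >= w:
                        continue
                    a, b = find(labels[r][c]), find(labels[nr][nc])
                    if a == b:
                        continue
                    mean_a = total[a] / count[a]
                    mean_b = total[b] / count[b]
                    if abs(mean_a - mean_b) <= max_mean_diff:
                        parent[b] = a  # statistics are kept on the root only
                        total[a] += total[b]
                        count[a] += count[b]
                        merged = True
    return [[find(label) for label in row] for row in labels]


# Hypothetical over-segmentation: the dark area is split into segments 0 and 1.
image = [[10, 12, 11, 200],
         [11, 13, 12, 205]]
labels = [[0, 0, 1, 2],
          [0, 1, 1, 2]]
merged_labels = merge_similar_segments(image, labels, max_mean_diff=10)
print(merged_labels)  # [[0, 0, 0, 2], [0, 0, 0, 2]]
```

The greedy pass depends on the scan order and on the threshold, so it is a heuristic. No comparable operation exists for under-segmentation, because splitting a segment requires information that the label map no longer contains.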

In Figure 3.2, we show an example of over-segmentation: the river and some of the buildings are clearly over-segmented. The colors in the image represent the real objects of interest that “should” be found.

Figure 3.2: Example of an over-segmentation

While both cases should be avoided when possible, there is no generic method that solves these problems. In any case, over-segmentation is always preferable to under-segmentation. In the case of over-segmentation, the real objects may still be found during the clustering or classification process, even if they are split between several segments. However, when several objects are merged into a single segment because of under-segmentation, there is no way to fix it during the clustering/classification process, and some classes or clusters may be lost for good. Over-segmentation is therefore a far preferable preprocessing result.

More details on the different segmentation algorithms can be found in the literature [99].

3.1.2.2 Limitations of region-based approaches

While region-based approaches are better suited than pixel-based approaches to VHR images, they also have their limits and disadvantages.

The first obvious limitation is the segmentation process needed to create the regions. As we have shown in the previous subsection, this process can be cumbersome and requires the user to carefully choose a potentially large number of parameters to achieve acceptable results. Because segmentation is a mandatory step for region-based approaches, its quality has a huge impact on the subsequent clustering or classification process.

Another important aspect is that, once the segments and regions are created, they add a large number of new attributes that may have to be taken into consideration: surface of the segments, perimeter and elongation, extrema, variance and average values of the attributes in a given segment, contrast with the neighboring segments, etc. In her PhD thesis on the redundancy of geometric attributes (surface, perimeter, elongation, etc.), Anne Puissant [107] concludes that these attributes may or may not be relevant, and may be redundant, depending on the type of objects that one wants to identify.
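A few of these attributes can be computed directly from a label map, as in the sketch below. The definitions are assumptions made for illustration: the perimeter counts pixel edges facing another segment or the image border, and the elongation is taken as the ratio of the long side to the short side of the bounding box; other definitions exist. The function name `region_attributes` is hypothetical.

```python
def region_attributes(image, labels):
    """Compute simple geometric and radiometric attributes for each segment."""
    h, w = len(image), len(image[0])
    pixels = {}
    for r in range(h):
        for c in range(w):
            pixels.setdefault(labels[r][c], []).append((r, c))

    attributes = {}
    for label, px in pixels.items():
        values = [image[r][c] for r, c in px]
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        # Perimeter (assumed definition): pixel edges on the segment boundary.
        perimeter = 0
        for r, c in px:
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                outside = not (0 <= nr < h and 0 <= nc < w)
                if outside or labels[nr][nc] != label:
                    perimeter += 1
        # Elongation (assumed definition): bounding-box aspect ratio.
        rows = [r for r, _ in px]
        cols = [c for _, c in px]
        height = max(rows) - min(rows) + 1
        width = max(cols) - min(cols) + 1
        attributes[label] = {
            "area": len(px),
            "perimeter": perimeter,
            "elongation": max(height, width) / min(height, width),
            "min": min(values),
            "max": max(values),
            "mean": mean,
            "variance": variance,
        }
    return attributes


# Hypothetical image: a bright, elongated object (segment 1) on a dark background.
image = [[10, 10, 10, 10],
         [10, 200, 200, 200],
         [10, 10, 10, 10]]
labels = [[0, 0, 0, 0],
          [0, 1, 1, 1],
          [0, 0, 0, 0]]
attrs = region_attributes(image, labels)
print(attrs[1]["area"], attrs[1]["perimeter"], attrs[1]["elongation"])  # 3 8 3.0
```

Even this short list yields seven values per segment, which gives a sense of how quickly the attribute space grows and why redundancy between geometric attributes becomes a concern.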

Finally, another obvious limitation of region-based approaches lies in the fact that, particularly with satellite pictures, there may be several levels of objects of interest to be found depending on the desired level of detail during the clustering process. However, it is not yet possible for that kind of hierarchy, in which objects are made of other objects, to be represented in a segmentation. The risk of under-segmentation therefore remains high at a fine level of detail, while the image may conversely end up over-segmented at a coarser, broader level of detail.

Example: An urban area is made of several different urban sectors that in turn are made of different buildings and streets.
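The nesting in this example can be written down as a small tree, which makes the mismatch with a flat segmentation explicit: a label map corresponds to one cut through the tree, at one level of detail. The `Region` class, the `cut` function and the region names below are hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    name: str
    children: list = field(default_factory=list)


def cut(region, depth):
    """Return the region names found `depth` levels below `region`.

    Regions without children are kept as they are when the requested
    depth goes deeper than the tree.
    """
    if depth == 0 or not region.children:
        return [region.name]
    names = []
    for child in region.children:
        names.extend(cut(child, depth - 1))
    return names


urban_area = Region("urban area", [
    Region("sector A", [Region("building A1"), Region("street A2")]),
    Region("sector B", [Region("building B1")]),
])

print(cut(urban_area, 0))  # ['urban area']
print(cut(urban_area, 1))  # ['sector A', 'sector B']
print(cut(urban_area, 2))  # ['building A1', 'street A2', 'building B1']
```

A segmentation tuned for the level of sectors is under-segmented with respect to buildings, and one tuned for buildings is over-segmented with respect to sectors.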

In the document The DART-Europe E-theses Portal (pages 42-45)