Hyperspectral image segmentation: The butterfly approach

(1)

HYPERSPECTRAL IMAGE SEGMENTATION: THE BUTTERFLY APPROACH

Gorretta,N., Roger, JM., Rabatel, G., Bellon-Maurel, V.

Cemagref,

UMR Itap,

Montpellier, France

Fiorio, C.

Lirmm,

CNRS UMR 5506

Montpellier, France

Lelong, C.

Cirad,

UMR TETIS,

Montpellier France

ABSTRACT

Few methods are proposed in the litterature for coupling the spectral and the spatial dimension available on hyperspectral images. This paper proposes a generic segmentation scheme named butterfly based on an iterative process and a cross anal-ysis of spectral and spatial information. Indeed, spatial and spatial structures are extracted in spatial and spectral space respectively both taking into account the other one. To apply this layout on hyperspectral imgages, we focus particulary on spatial and spectral structures i.e. topologic concepts and la-tent variable for the spatial and the spectral space respectively. Moreover, a cooperation scheme with these structures is pro-posed. Finally, results obtained on real hyperspectral images using this specific implementation of the butterfly approach are presented and discussed.

Index Terms— Image segmentation, chemometrics

1. INTRODUCTION

Most of the methods devoted to hyperspectral imaging pro-cessing conduct data analysis without taking into account spatial information. Pixels are processed individually, as an array of spectral data without any spatial structure. Standard classifications approaches are thus widely used (k-means, fuzzy-c-means, hierarchical classification...). Linear mod-elling methods such as Partial Least Square analysis (PLS) or non linear approaches like Support Vector Machine (SVM) are also used at different scales (remote sensing or laboratory applications). However, with the development of high resolu-tion sensors, coupling spectral and spatial informaresolu-tion when processing complex images appears to be a very relevant approach. Indeed, the integration of the spatial and spectral information in a processing scheme appears important to take advantage of the complementaries that both sources of infor-mation can provide [1].

Actually, a few methods for coupling the spectral and the spatial dimensions are proposed in the litterature. The most recent approaches can be broadly classified in two main categories. The first one is related to a direct extension of individual pixel classification methods using just the spectral dimension (k-means, fuzzy-c-means or FCM, SVM). Spatial

dimension is integrated as an additionnal classification pa-rameter (Markov fields with local homogeneity constraints [2], SVM with spectral and spatial kernels combination [3], geometrically guided fuzzy C-means [4]...). The second cate-gory combines the two fields related to each dimension (spec-tral and spatial), namely chemometric and image analysis. Various strategies have been attempted. The first one relies on chemometrics methods (Principal Component Analysis or PCA, Independant Component Analysis or ICA, Curvilinear Component Analysis...) to reduce the spectral dimension and then to apply standard image processing techniques on the resulting score images. Another approach is to extend the definition of basic image processing operators to this new dimensionality (extended morphological profiles [5] or wa-tershed segmentation approaches [6] for example).

However, the approaches mentioned above tend to favour only one description either directly or indirectly (spectral or spatial). The purpose of this article is to propose a generic hy-perspectral processing approach that strikes a better balance in the treatment of both kinds of information. This method, called butterfly [7], aims to perform an iterative and a cross analysis of data in the spectral and the spatial domains, lead-ing to the segmentation of a hyperspectral image.

This article is organized as follows. Section 2 is devoted to the presentation of the butterfly approach. An implementa-tion of this generic processing scheme is also given. In the following section, experiments performed using the butterfly approach on two hyperspectral images at different resolution scales (proxi-detection and airborne) are described. Results are then presented and discussed.

2. THE BUTTERFLY APPROACH 2.1. Conventions

Capital bold characters will be used for matrices, e.g. Z ; small bold characters for column vectors, e.g. z_iwill denote theithcolumn ofZ ; row vectors will be denoted by the trans-pose notation, e.g.zT_j will denote thejthrow ofZ ; non bold characters will be used for scalars, e.g. matrix elementsz_ijor indicesi.

(2)

image (HSI) is a functionR2 toRplinking a pixel(x, y) to a spectrums(x, y), signal vector measured at p wavelength. When the image topology is considered, HSI is an × m rect-angular set with a partitionP including k connected regions. When the spatial dimension is promoted, a HSI is a 3D matrix notedZ_(n,m,p), withn the number of lines, m, the number of columns andp the number of wavelength. For computing rea-sons, the matrix could be reshaped in a 2D matrix(n × m, p) ofn × m spectra built on p wavelengths. Any partition P is associated with a membership matrixQ_(n×m,k). Each line of

Q, corresponding to one pixel, contains the boolean

member-ship degrees to thek regions.

Let Z be a HSI with a partition P. Let z be the mean spectrum of the image andZ the image of the same size as Z with all pixel values equal toz. Let Z be the mean centered matrix, i.e. Z = Z − Z. The total inertia matrix of image is defined by: T = ZTZ, the between-region inertia matrix

is given byB = ZTQ (QTQ)−1QTZ and the within-region

inertia matrix is given byW = T − B. The Wilks Lambda ([8]) is defined as: Λ(Z, P_k) = trace(B)/trace(T). This in-dex increases from0 to 1 measuring the gap between regions in the spectral space.

2.2. Theory

Given a hyperspectral imageZ, the purpose of the butterfly approach is to conduct a segmentation ofZ, i.e. a partition

P including k regions Ri, withi = 1 . . . k. To perform this

segmentation, a spatial and a spectral structure are identified in the spatial and the spectral space respectively. Next, the available information is reduced accordling to each structure. To ensure cooperation between each space, the butterfly ap-proach is based on an iterative process used to extract, at each round, a spectral structure taking into account a spatial struc-ture and vice-versa. Both extracted strucstruc-tures are thus refined gradually to lead to a final segmentation of the hyperspectral image.

The scheme proposed above is a generic one. To apply it, we have to focus on specific spatial and spectral structures and to define cooperation between them. In this article, we will focus solely on particular notions of topology for the spatial space i.e. the notions of connectivity and adjacency and the building of latent variables for the spectral space. In addition, to ensure collaboration, concepts of regions and false colors or score images (spatial space) are transfered to concepts of classes and latent variables (spectral space) respectively. The above is summarized in figure 1.

Thus, one ”butterfly” round is made up of two steps: 1. Extraction of a spatial structure (topology) based on a

spectral structure,

2. Extraction of a spectral structure (latent variables) based on a spatial structure.

Fig. 1. The butterfly approach

These steps are processed in a successive, iterative and inter-dependent way.

The first step deals with the use of commonly used image processing tools (region segmentation algorithms) on a lim-ited number of score images. This makes it relatively easy to process. To carry out the second step, chemometric tools are used to reveal a subspace (latent variables) which enables us to characterize the data according to the membership ma-trixQ and the optimization of a criterion based on the be-tween (B) or the within inertia (W) of the data. The choice of this criterion is dependent on the region segmentation strategy used i.e. either a top down approach (splitting) or a bottom-up approach (merging). According to the theorical principles of these approaches, the used criterion will be the maximi-sation ofB and W for the bottom-up and top down strategy respectively.

2.3. Implementation

In this paper, the butterfly approach is implemented using a split and merge strategy. First, we over-segment the image by a serie of split based on spectral and spatial features following the butterfly approach. Secondly, we merge back the regions, in order to get the number of expected regions or to satisfy a stop criterion.

Each split operation is conducted by projecting the image data onkslatent variable determined by the diagonalisation of the within inertia matrix (the firstk_seigenvectors). Next, split

(3)

of all exiting region inp regions are tested and the one that maximizes the Wilks Lambda (Λ(Z, P_k)) criterion is chosen. Besides, to ensure a progressive segmentation process and a good complementarity between extrated structures (spectral, spatial) at each round,p is adjusted to 2.

Each merge operation is conducted by projecting the im-age data onkmlatent variables determined by the diagonalisa-tion of the between inertial matrix (the firstkmeigenvectors). Next, merge of all adjacent regions are tested and the one that maximises the Wilks Lambda is chosen.

3. MATERIAL AND METHODS

In order to evaluate the performance of the proposed ap-proach, two data sets were used:

• The first one is a DAIS (Digital Airborne Imaging

Spectrometer) hyperspectral image data in the reflec-tive and thermal wavelength ranges (496 − 20000 nm with80 bands) from the city of Pavia in Italy. To re-duce processing time, this image was rere-duced to obtain a final image size(n = 140) × (m = 160) × (p = 80). Black and white image (band50) is given Figure 2-a.

• The second one was acquired in field (plant

identifica-tion) with a Hyspex VNIR-1600 hyperspectra system (Norsk Elecktro Optikk) in160 narrow bands (400-980 nm with 3.6 nm spectral resolution). To reduce pro-cessing time, this image was binned to obtain a final image size (n = 120) × (m = 150) × (p = 160). Color image is given Figure 2-b.

(a) Image DAIS (band50) (b) Image Hyspex

Fig. 2. Hyperspectral images

For the split phase, we used the normalized cut algorithm as proposed by Shi [9]. A weighted graph of the image score is then constructed by taking each pixel as a node, and con-necting each pair of pixels by an edge. The edge weight

w(i, j) between pixel i and j is defined as the product of an

intensity similarity term and spatial proximity term and mea-sures the likelihood fo pixeli and j belonging to the same image region: w(i, j) = ⎧ ⎨ ⎩e −Xi−Xj2 σx e −Ii−Ij2 σI if Xi− Xj < R, 0 otherwise (1) where,XiandIidenote pixel location and intensity (im-age score),R a radius, σxandσI regularisation parameters.

For both imagesksandkmare adjusted to1 and 99, 99% respectively. In fact, for the split phase, one latent variable is enough to decide if two regions are not similar. Even more, two regions that are different on one latent variable will be necessary on more. On the contrary, for the merge phase, to be sure that two regions are similar, it is necessary to look at many latent variables. Indeed,k_mis adjusted to explain a large amount of explained variance.

4. RESULTS

Figures 3-a and b show segmentation results obtained on the DAIS image for the split and merge phase respectively. Regularisation (σI andσx) andR parameters were fixed to 1.0, 15 and 20. According to the number of urban structures included in the image, split and merge phase was stopped as soons as15 and 12 regions have been found respectively. That corresponds to14 split rounds and 3 merge rounds.

On this image, obtained segmentation allows to extract a majority of urban structures and more particulary buildings. However, the split phase leads to the segmentation of the in-land waterways and the road is not wholly detected. These errors could be explained by the poor quality of some spectral bands which do have not been left out for the treatment.

(a) End of split stage (b) End of merge stage

Fig. 3. Results on DAIS image

Figures 4-a and b show segmentation results obtained on the hyspex image for the split and merge phase respectively. Regularized (σ_I andσ_x) andR parameters were fixed to 0.8,

(4)

20 and 15. Split and merge phase was stopped as soons as 10 and5 regions have been found respectively. That corresponds to9 split rounds and 5 merge rounds. On this image, segmen-tation results are particulary relevants. Indeed, vegesegmen-tation and background are correctly segmented.

(a) End of split stage (b) End of merge stage

Fig. 4. Results on Hyspex image

5. CONCLUSION

In this paper, a generic approach to segment hyperspectral im-ages has been proposed. This method called butterfly is based on an iterative process and a cross analysis of spectral and spatial structures. To demonstrate the relevance of this layout , we have focused on particular spectral and spatial structures i.e. topologic (connectivity) and latent variables concepts for the spatial and the spectral space respectively. In this case, cooperation between extracted structures was made using het-erogeneity concepts. In addition, we have implemented this approach by using a split and merge segmentation process and a normalized cuts segmentation algorithm. Results obtained on in field and airbone hyperspectral images are promising. They must be confirmed by additional tests on other images and with ground truth for a quantitative evaluation.

However, as mentionned above, butterfly approach is a generic one. Indeed, a perspective of this work lies in the determination of other spectral and spatial structures and col-loboration processes to address a wide variety of image seg-mentation problems.

6. ACKNOWLEDGMENT

The authors would like to thank Dr. Paolo Gamba from the University of Pavia, Italy, for the DAIS data set.

7. REFERENCES

[1] Antonio Plaza, Jon Atli Benediktsson, Joseph W. Board-man, Jason Brazile, Lorenzo Bruzzone, Gustavo Camps-Valls, Jocelyn Chanussot, Mathieu Fauvel, Paolo Gamba,

Anthony Gualtieri, Mattia Marconcini, James C. Tilton, and Giovanna Trianni, “Recent advances in techniques for hyperspectral image processing,” Remote Sensing of

Environment, vol. In Press, pp. –, 2009.

[2] O. Prony, X. Descombes, and J. Zerubia, “Classification des images satellitaires hyperspectrales en zone rurale et priurbaine,” Rapport de recherche INRIA 4008, INRIA, septembre 2000.

[3] G. Camps-Valls, L. Gomez-Chova, J. Munoz-Mari, J. Vila-Frances, and J. Calpe-Maravilla, “Composite ker-nels for hyperspectral image classification,” Geoscience

and Remote Sensing Letters, vol. 3, no. 1, pp. 93 – 97,

2006.

[4] J. C. Noordam and W.H.A.M. Van den. Broek, “Mul-tivariate image segmentation based on geometrically guided fuzzy c-means clustering,” Journal of

Chemomet-rics, vol. 16, pp. 1–11, 2002.

[5] J.A. Benediktsson, J.A. Palmason, and J Sveinsson, “Classification of hyperspectral data from urban areas based on extended morphological profiles,” IEEE

Trans-actions on Geoscience and Remote Sensing, vol. 43, no.

3, pp. 480–491, 2005.

[6] G. Noyel, J. Angulo, and D. Jeulin, “Morphological seg-mentation of hyperspectral images,” Image Analysis and

Stereology, vol. 26, pp. 101–109, 2007.

[7] N. Gorretta, J-M Roger, C. Fiorio, V. Bellon-Maurel, G. Rabatel, and C. Lelong, “Proposition d’une strat´egie de segmentation d’images hyperspectrales,” ”Revue de

traitement du signal”, vol. ”In Press”, pp. ”–”, 2009.

[8] S Wilks, Multidimensional scatter, Standford Press, 1960.

[9] J. Shi and J. Malik, “Normalized cuts and image segmen-tation,” in IEEE Conference Computer Vision and Pattern