Statistical modeling for the interpretation of polarimetric radar images



Statistische modellering voor de interpretatie van polarimetrische radarbeelden

Chi Liu

Supervisors: prof. dr. ir. W. Philips, dr. W. Liao, prof. dr. H. Li
Dissertation submitted to obtain the degree of Doctor of Computer Science Engineering
Department of Telecommunications and Information Processing
Chair: prof. dr. ir. J. Walraevens
Faculty of Engineering and Architecture
Academic year 2019 - 2020


Legal deposit: D/2020/10.500/6


Ghent University
Faculty of Engineering and Architecture
Department of Telecommunications and Information Processing

Supervisors: prof. dr. ir. Wilfried Philips, dr. ir. Wenzhi Liao, prof. dr. ir. Heng-Chao Li

Chair of the jury: prof. dr. ir. Patrick De Baets
Members of the jury: dr. ir. Ljubomir Jovanov (secretary), prof. dr. ir. Paul Scheunders, prof. dr. ir. Jocelyn Chanussot, prof. dr. ir. Ann Franchois

Ghent University
Faculty of Engineering and Architecture
Research group: Image Processing and Interpretation (IPI)
Department of Telecommunications and Information Processing
Sint-Pietersnieuwstraat 41, B-9000 Gent, Belgium
Tel.: +32-9-264.34.12
Fax.: +32-9-264.42.95

Chair: prof. dr. ir. Joris Walraevens

This work was carried out within the framework of a specialization scholarship from the CSC (China Scholarship Council).

Dissertation submitted to obtain the degree of Doctor of Computer Science Engineering
Academic year 2019 - 2020


Acknowledgements

This work would not have been possible without the guidance, support, and help of many people. I am especially indebted to my supervisors, prof. Wilfried Philips, dr. Wenzhi Liao, and prof. Heng-Chao Li, for their continued support of my doctoral research. They have offered me a great deal of freedom to carry out the doctoral research. I would also like to thank them for the extended and numerous discussions about my work. Their immense knowledge and insightful feedback have benefited me greatly.

I want to express my gratitude to the jury members of the Examination Board for their thoughtful evaluation of my thesis and their valuable comments, which broadened my perspective.

My sincere appreciation also goes to my (ex-)colleagues at TELIN at Ghent University; they have created such a nice and pleasant atmosphere in the workplace. Special thanks are due to Ljubomir Jovanov for his time and help in translating the required parts of this thesis into Dutch. I am deeply grateful to all my friends; thank you for sharing enjoyable moments and helping me maintain a good work-life balance.

I would like to thank my parents for always showing faith in me and encouraging me to do what I desired. They have supported me with their patience, love, and trust throughout the years of my PhD.

Ghent, December 2019.

Chi Liu


List of Figures

1.1 Space-borne Earth observation
1.2 An excerpt of a PolSAR image
2.1 An illustration of speckle in a PolSAR image
2.2 The speckle formation
2.3 Example marginal PDFs from the complex Wishart models
2.4 Fitting results for the PolSAR data in the ocean area
2.5 Fitting results for the PolSAR data in the woodland
2.6 Fitting results for the PolSAR data in the urban area
3.1 An illustration of marginal densities from WMMs
3.2 Probabilistic graph model for the formulated Bayesian model
3.3 Images for experiments
3.4 Effect of speckle filtering on the “Kakadu” image
3.5 Effect of speckle filtering on the “Vancouver” image
3.6 Effect of the model parameter α_i0
3.7 Effect of the model parameter W_i0
3.8 Classification maps for the “Vancouver” image using the compared methods
3.9 Classification maps for the “San Francisco Bay” image using the compared methods
3.10 Classification maps for the “Kakadu” image using the compared methods
3.11 Time consumption
4.1 Test regions for the evaluation of the compound distributions
4.2 Fitting results obtained by the compared texture PDFs
4.3 Fitting results obtained by the compared compound PDFs
4.4 A general flowchart for the H/α-Wishart method and the Chernoff-Wishart method
4.5 A comparison of methods for unsupervised classification based on the “Flevoland” image
4.6 A comparison of methods for unsupervised classification based on the “Kakadu” image
4.7 A comparison of methods for unsupervised classification based on the “Oberp” image
4.8 A comparison of methods for unsupervised classification based on the “San Francisco” image
5.1 Effect of the spatial relationships on classification of the “Flevoland” image
5.2 Effect of the spatial relationships on classification of the “Kakadu” image
5.3 Effect of the spatial relationships on classification of the “San Francisco” image
5.4 Effect of the spatial relationships on classification of the “Oberp” image
5.5 Effect of the patch size and the neighborhood size
5.6 Classification results obtained by the compared methods based on the “Flevoland” image
5.7 Classification results obtained by the compared methods based on the “Kakadu” image
5.8 Classification results obtained by the compared methods based on the “San Francisco” image
5.9 Classification results obtained by the compared methods based on the “Oberp” image
6.1 Images for experiments
6.2 Curves of overall accuracy using KNN classifiers
6.3 Effect of unlabeled samples in the proposed method
6.4 Effect of the many-to-one correspondence in the proposed method
6.5 Effect of the number of components for the additional class-joint distribution
6.6 Classification results for the “local Flevoland” image
6.7 Classification results for the “complete Flevoland” image
6.8 Classification results for the “Foulum” image


List of Tables

1.1 Advantages and Drawbacks of the Image Sensors for Earth Observation
2.1 Fitting Performance of the Theoretical Model
4.1 Models for the PolSAR data
4.2 Performance Evaluation of the Texture Models
4.3 Comparison of the Compound Distributions
4.4 Classification Accuracy on the “Flevoland” image
4.5 Classification Accuracy on the “Kakadu” image
4.6 Classification Accuracy on the “Oberp” image
4.7 Classification Accuracy on the “San Francisco” image
5.1 Settings of Patch Size and Neighborhood Size
5.2 Classification Accuracy on the “Flevoland” Image Using the Proposed Method in the Different Settings
5.3 Classification Accuracy on the “Kakadu” Image Using the Proposed Method in the Different Settings
5.4 Classification Accuracy on the “San Francisco” Image Using the Proposed Method in the Different Settings
5.5 Classification Accuracy on the “Oberp” Image Using the Proposed Method in the Different Settings
5.6 Classification Accuracy Using the Proposed Method with Different Patch Sizes and Neighborhood Sizes
5.7 Quantitative Evaluation of the Compared Methods Based on the “Flevoland” Image
5.8 Quantitative Evaluation of the Compared Methods Based on the “Kakadu” Image
5.9 Quantitative Evaluation of the Compared Methods Based on the “San Francisco” Image
5.10 Quantitative Evaluation of the Compared Methods Based on the “Oberp” Image
6.1 Classification Accuracy on the “Local Flevoland” Image
6.2 Classification Accuracy on the “Complete Flevoland” Image
6.3 Classification Accuracy on the “Foulum” Image


List of Abbreviations

CDF Cumulative Distribution Function

CRF Conditional Random Field

DPMM Dirichlet Process Mixture Model

EM Expectation-Maximization

ENL Equivalent Number of Looks

EO Earth Observation

FMM Finite Mixture Model

GFD Generalized Fisher Distribution

GMM Gaussian Mixture Model

i.i.d. Independent and Identically Distributed

KL Kullback-Leibler

KSD Kolmogorov–Smirnov Distance

LiDAR Light Detection and Ranging

MAP Maximum A Posteriori Probability

MCMC Markov Chain Monte Carlo

MLC Multi-Look Complex

MoLC Method of Log-Cumulant

MoMLC Method of Matrix Log-Cumulant

MRF Markov Random Field

NoG Number of Groups

OA Overall Accuracy

PDF Probability Density Function

PolSAR Polarimetric Synthetic Aperture Radar

RWMM Relaxed Wishart Mixture Model

RW-SS Relaxed-Wishart-Based Semi-Supervised


SAR Synthetic Aperture Radar

SKLD Symmetric Kullback–Leibler Divergence

SLC Single-Look Complex

SVFMM Spatially Variant Finite Mixture Model

SVM Support Vector Machine

tDPMM Textured Dirichlet Process Mixture Model

TSVM Transductive Support Vector Machine

WMM Wishart Mixture Model

W-SS Wishart-Based Semi-Supervised


Samenvatting (Dutch Summary)

In recent decades, Earth observation (EO) technologies have improved greatly owing to the rapid development of remote sensing satellites and powerful remote sensing sensors. The increasing availability of EO data makes it possible to use EO technologies on a large scale in practical applications, such as agricultural monitoring and urban planning.

One of the most important types of EO data is imagery acquired with polarimetric synthetic aperture radar (PolSAR), which emits its own radio waves to illuminate regions of interest and receives the backscattered signals to form remote sensing images. PolSAR images reflect the physical scattering properties of the illuminated regions. Because of the imaging mechanisms of PolSAR, an inherent phenomenon (i.e., speckle) is usually present, and it manifests itself as granular patterns in PolSAR images. The granular patterns caused by speckle reduce the quality of PolSAR images and complicate their interpretation. Moreover, the characteristics of speckle can differ significantly between different types of regions. For example, speckle in urban regions can lead to a larger fluctuation of PolSAR data values than in grassland.

As a consequence, PolSAR images always come with phenomena that are undesirable for their interpretation, such as the granular patterns due to speckle and the strong fluctuation of PolSAR data in urban regions. It is crucial to take these phenomena carefully into account in order to achieve good performance in the interpretation of PolSAR images.

In this thesis, we develop methods for the interpretation of PolSAR images by considering specific scenarios and intrinsic properties of PolSAR data. We investigate, theoretically and empirically, the statistical characteristics of the PolSAR data in homogeneous regions (e.g., agricultural areas) and heterogeneous regions (e.g., urban areas). On this basis, we develop new statistical models for the PolSAR data and apply these models to the interpretation of PolSAR images.

We focus on three specific topics: statistical modeling of the PolSAR data with high scene heterogeneity, incorporation of local correlations in PolSAR images, and dealing with a common scenario in which labeled samples are insufficient. Regarding the first topic, high scene heterogeneity can lead to heavy fluctuations in the PolSAR data values. If high scene heterogeneity in PolSAR data is not properly considered, the noisy appearance of the interpretation results in heterogeneous regions can be conspicuous, which leads to reduced interpretation performance. Regarding the second topic, local correlations in PolSAR images provide useful prior knowledge, namely that neighboring pixels are likely to belong to the same land cover type. This prior knowledge helps to improve the performance of the interpretation of PolSAR images. Finally, we investigate the use of statistical modeling techniques in the interpretation of PolSAR images when labeled PolSAR samples are insufficient. This is a common scenario, and it is an open challenge to exploit both labeled and unlabeled samples effectively to achieve good classification performance. The research on these topics results in the main contributions of this thesis, as follows.

The first contribution of this thesis is a novel probability distribution for the PolSAR data with high scene heterogeneity. We first investigate the mechanism for characterizing the scene heterogeneity of the PolSAR data, that is, observed PolSAR data with high scene heterogeneity are usually characterized by the multiplication of two random variables (i.e., the framework of the product model): a random matrix representing “ideal” speckle and a scalar variable representing the spatial variation. On this basis, we propose a novel matrix-variate probability distribution for the PolSAR data, which performs well in accurately fitting the PolSAR data in heterogeneous scenes. The idea behind the proposed distribution is to use flexible probability distributions for the hidden variable (i.e., the scalar variable in the framework of the product model), to formulate the joint probability distribution of the PolSAR data and the hidden variable, and to marginalize out the hidden variable to obtain a marginal (compound) distribution of the PolSAR data. Owing to the hidden variable, the resulting model has additional flexibility for describing the heavily fluctuating PolSAR data in heterogeneous scenes.

The second contribution concerns the consideration of scene heterogeneity in the unsupervised classification of PolSAR images: we propose to incorporate the mechanism for characterizing scene heterogeneity into a framework for unsupervised classification, rather than directly using the compound distributions derived from the framework of the product model. Since the use of the mechanism in the proposed method has the advantage of eliminating the marginalization over the hidden variable, the proposed method does not need to deal with potentially complicated functions arising from the marginalization and is easy to implement. Moreover, the proposed method provides an interpretable data-generating procedure and explicitly characterizes the scene heterogeneity.

Another important contribution of this thesis is a patch-based statistical model that exploits local correlations simultaneously at the pixel level and the patch level. At the pixel level (within patches), the proposed method handles the heavy fluctuation of PolSAR data values by reducing the role of the heavily fluctuating PolSAR data; at the patch level (between patches), the proposed method allows neighboring patches to be assigned to the same land cover type with high probability. We apply this model to the unsupervised classification of PolSAR images. The experimental results demonstrate the effectiveness of the proposed method in incorporating local correlations.

The techniques for statistical modeling of the PolSAR data have enormous potential for many applications. We also investigate the use of statistical modeling techniques in the semi-supervised classification of PolSAR data. The idea is to learn class-joint distributions (i.e., joint distributions of PolSAR data and class labels) from labeled and unlabeled PolSAR samples. Each class-joint distribution is the product of a prior probability of a specific class of land cover types and a class-conditional distribution of the PolSAR data given that specific class. We propose to approximate each class-joint distribution by a weighted sum of relaxed Wishart distributions; this weighted sum can accurately fit the PolSAR data belonging to each class of land cover types and has the advantage of capturing the potential multimodal property within each land cover type. Moreover, in the proposed method we take into account the situation in which some unlabeled samples belong to unknown land cover types that are not discovered from the labels in the training sets. To account for this situation, we use an additional class collectively for the unknown land cover types. Experimental results show that the proposed semi-supervised method can make effective use of labeled and unlabeled PolSAR samples, and that it benefits from the use of the additional class and of the many-to-one correspondence.

This doctoral research resulted in the publication of three first-author papers in peer-reviewed journals and one first-author paper at a peer-reviewed international conference.


Summary

In the last decades, Earth observation (EO) technologies have been greatly advanced by the rapid development of remote sensing satellites and high-performance remote sensing sensors. The growing availability of EO data makes it possible to use EO technologies widely in practical applications, such as agricultural monitoring and urban planning.

One of the main types of EO data is the image acquired by polarimetric synthetic aperture radar (PolSAR), which emits its own radio waves to illuminate regions of interest and receives backscattered signals to form remote sensing images. PolSAR images reflect the physical scattering properties of the illuminated regions. Due to the imaging mechanisms of PolSAR, an inherent phenomenon (i.e., speckle) usually exists, and it manifests itself as a granular pattern in PolSAR images. The granular patterns resulting from speckle degrade the quality of PolSAR images and complicate their interpretation. Moreover, the characteristics of speckle can differ significantly between different types of regions. For example, speckle in urban regions can result in heavier fluctuation of PolSAR data values than speckle in grassland.

As a consequence, PolSAR images always come with phenomena that are undesirable for their interpretation, such as the granular patterns due to speckle and the heavy fluctuation of PolSAR data values in urban regions. It is of crucial importance to carefully consider those phenomena so as to achieve good performance in the interpretation of PolSAR images.

In this thesis, we develop methods for the interpretation of PolSAR images by considering specific scenarios and intrinsic properties of PolSAR data. We theoretically and empirically explore the statistical characteristics of the PolSAR data in homogeneous regions (e.g., agricultural areas) and heterogeneous regions (e.g., urban areas). On this basis, we further develop new statistical models for the PolSAR data and apply these models to the interpretation of PolSAR images.

We focus on three specific topics: statistical modeling of the PolSAR data with high scene heterogeneity, incorporation of local correlations in PolSAR images, and dealing with a common scenario in which labeled samples are not sufficient. For the first topic, high scene heterogeneity can lead to heavy fluctuation of the PolSAR data values. If high scene heterogeneity in PolSAR data is not appropriately considered, a noisy appearance can be conspicuous in heterogeneous regions, leading to degraded interpretation performance. For the second topic, local correlations in PolSAR images provide useful prior knowledge, namely that neighboring pixels are likely to belong to the same land cover type. This prior knowledge is helpful in improving the performance of the interpretation of PolSAR images. Finally, we explore the use of statistical modeling techniques in the interpretation of PolSAR images when labeled PolSAR samples are insufficient. This is a common scenario, and it is an open challenge to effectively exploit both labeled and unlabeled samples to achieve good classification performance. The research on these topics results in the main contributions of this thesis as follows.

The first contribution of this thesis is that we propose a novel probability distribution for the PolSAR data with high scene heterogeneity. We first explore the mechanism for characterizing scene heterogeneity of the PolSAR data, that is, observed PolSAR data with high scene heterogeneity are usually characterized by the multiplication of two random variables (i.e., the framework of the product model): a random matrix representing “ideal” speckle and a scalar variable representing the spatial variation of the scattering properties. On this basis, we propose a novel matrix-variate probability distribution for the PolSAR data, which performs well in accurately fitting the PolSAR data in heterogeneous scenes. The idea behind the proposed distribution is to use flexible probability distributions for the hidden variable (i.e., the scalar variable in the framework of the product model), formulate the joint probability distribution of the PolSAR data and the hidden variable, and marginalize out the hidden variable to obtain a marginal (compound) distribution of the PolSAR data. Due to the hidden variable, the resulting model has additional flexibility for describing the heavily fluctuating PolSAR data in heterogeneous scenes.

The second contribution is about considering scene heterogeneity in unsupervised classification of PolSAR images: we propose to incorporate the mechanism for characterizing scene heterogeneity into a framework for unsupervised classification, rather than directly using the compound distributions derived from the framework of the product model. Since the use of the mechanism in the proposed method has the advantage of eliminating the marginalization over the hidden variable, the proposed method does not need to deal with potentially complicated functions arising from the marginalization and is easy to implement. Moreover, the proposed method provides an interpretable data-generating procedure and explicitly characterizes scene heterogeneity.

Another major contribution in this thesis is that we propose a patch-based statistical model to exploit local correlations simultaneously at the pixel level and the patch level. At the pixel level (within patches), the proposed method deals with the heavy fluctuation of PolSAR data values by reducing the role played by the heavily fluctuating PolSAR data; at the patch level (between patches), the proposed method allows the neighboring patches to be assigned to the same land cover type with high probability. We apply this model to unsupervised classification of PolSAR images. The experimental results demonstrate the effectiveness of the proposed method in incorporating local correlations.

The techniques of statistical modeling of the PolSAR data have immense potential for many applications. We also explore the use of the techniques of statistical modeling in semi-supervised classification of PolSAR data. The idea is to learn class-joint distributions (i.e., joint distributions of PolSAR samples and class labels) based on labeled and unlabeled PolSAR samples. We propose to approximate each class-joint distribution by using a weighted sum of relaxed Wishart distributions; the weighted sum of relaxed Wishart distributions can provide accurate fits to the PolSAR data belonging to each class of land cover types, and it has the advantage of capturing the potential multimodal property in each land cover type. Moreover, in the proposed method, we take into account the situation in which some of the unlabeled samples belong to unknown land cover types that are not discovered from the labels in training sets. To allow for this situation, we use an additional class-joint distribution collectively for unknown land cover types. Experimental results show that the proposed semi-supervised method can effectively exploit labeled and unlabeled PolSAR samples, and that it benefits from the use of the additional class-joint distribution and the use of the many-to-one correspondence.

This doctoral research resulted in the publication of three first-authored papers in peer-reviewed journals and one first-authored paper at a peer-reviewed international conference.


1

Introduction

Earth-observing sensors, which are mounted on satellites (see Fig. 1.1), aircraft, and ships, collect various types of information about the Earth, such as temperature, wind speed, building heights, ocean salinity, and images of the Earth surface. These Earth Observations (EOs) of the land, the ocean, and the atmosphere are helpful in understanding how the Earth works in the long term, predicting short-term changes (e.g., weather shifts), evaluating climate change, and exploring the impact of human activities on the environment.

Among the various types of EOs, images have provided broad support for many applications such as urban planning, fast response to natural disasters and hazards, and the sustainable management of natural resources.

There are many Earth-observing sensors that acquire images, such as optical hyperspectral sensors, polarimetric synthetic aperture radar (PolSAR), and Light Detection and Ranging (LiDAR). These sensors have different advantages and disadvantages, as summarized in Table 1.1. PolSAR is one of the main sources of EO images. It is an active imaging system carried on a moving platform: it emits its own radio waves and receives echoes from the illuminated area. PolSAR can therefore acquire images of the Earth surface day and night and in all weather conditions, which is an advantage over optical sensors (e.g., hyperspectral imaging sensors). Compared with the distance/height data acquired by LiDAR, the dual/quad-polarimetric (multi-channel) data acquired by PolSAR reflect information about the shape, the orientation, and the roughness of the Earth surface. These data provide insights into the physical scattering mechanisms and help in understanding the area of interest.

The value of PolSAR has been widely recognized. Over the past few decades, the applications of PolSAR have advanced through the rapid development of PolSAR systems, such as AIRSAR by the Jet Propulsion Laboratory (NASA-JPL), E-SAR and TerraSAR-X by the German Aerospace Center (DLR), EMISAR by the Electromagnetics Institute (EMI) of the Technical University of Denmark, and RADARSAT-2 by the Canadian Space Agency. A huge number of PolSAR images have been used in various applications ranging from man-made target extraction [Wu 15] and ship detection [Wei 14b] to glacier monitoring [Akbari 14] and damage assessment [Li 12]. The central task in the applications of PolSAR images is the interpretation, which usually associates the pixels and the regions in PolSAR images with appropriate land-use categories.

Figure 1.1: Space-borne Earth observation.

Table 1.1: Advantages and Drawbacks of the Image Sensors for Earth Observation.

Sensors   Advantages                                   Disadvantages
Optical   • High spectral resolution                   • Weather dependent
          • A few free data sets
LiDAR     • High accuracy                              • Expensive
          • 3-dimensional representation of targets
PolSAR    • Weather and daylight independent           • Intrinsic phenomena complicate
          • Detect underground objects                   interpretation
          • Combination of polarizations provides
            additional information

1.1 Problem Statement

PolSAR images usually exhibit phenomena that are undesirable for the interpretation task (e.g., the granular patterns in Fig. 1.2).

The granular pattern is the result of an intrinsic phenomenon, i.e., speckle.

Figure 1.2: An excerpt of a PolSAR image acquired by the airborne E-SAR of the German Aerospace Center (DLR) over Oberpfaffenhofen, Germany. The excerpt is displayed by mapping the polarimetric channels of the PolSAR data to the red, green, and blue channels.

Speckle is formed by the combination of reflected signals from elementary targets/scatterers. Due to the random locations of the scatterers within a resolution cell (or a pixel), the reflected signals from the scatterers come with different phases [Lee 09]. Thus, the signal received by PolSAR is strong if the reflected signals add constructively and weak if they add destructively. The received signals can therefore differ greatly from pixel to pixel. This leads to the granular patterns (i.e., the phenomenon of speckle) in PolSAR images.
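The formation of speckle described above can be illustrated with a minimal simulation. The sketch below is only an illustration under simplifying assumptions that are not taken from this thesis: every scatterer reflects with unit amplitude, phases are uniform on [0, 2π), and a single polarimetric channel is considered. The function name `simulate_speckle` and all parameter values are hypothetical.

```python
import numpy as np

def simulate_speckle(n_pixels, n_scatterers, rng):
    """Return single-look intensities from a coherent sum of random phasors."""
    # Assumption: each scatterer contributes a unit-amplitude echo whose phase
    # is uniformly random because of its random position in the resolution cell.
    phases = rng.uniform(0.0, 2.0 * np.pi, size=(n_pixels, n_scatterers))
    # Coherent (complex) sum per pixel, scaled so that the expected intensity is 1.
    field = np.exp(1j * phases).sum(axis=1) / np.sqrt(n_scatterers)
    return np.abs(field) ** 2

rng = np.random.default_rng(0)
intensity = simulate_speckle(n_pixels=20000, n_scatterers=50, rng=rng)

mean = intensity.mean()
# Ratio of standard deviation to mean: a simple measure of the granularity.
contrast = intensity.std() / mean
```

Although every simulated pixel has the same underlying reflectivity, the intensities vary strongly from pixel to pixel: `contrast` comes out close to 1, which is the granular behavior attributed to speckle in the text.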

PolSAR images can include spatially changing surfaces (i.e., heterogeneous regions) [Vasile 11], such as forest and urban regions. In these regions, speckle originates not only from the combination of reflected signals with different phases and amplitudes but also from the spatial variations of the scattering properties [Posner 93]. Speckle in these regions usually results in heavier fluctuation of the PolSAR data values.
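The heavier fluctuation in heterogeneous regions can be made concrete with a small numerical sketch. It assumes, purely for illustration, that the spatial variation of the scattering properties acts as a unit-mean gamma-distributed multiplier on unit-mean exponential single-look speckle; the gamma choice and the shape value 2.0 are hypothetical and are not the model proposed in this thesis.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50000

# "Ideal" single-look speckle intensity: unit-mean exponential.
speckle = rng.exponential(scale=1.0, size=n)

# Hypothetical spatial variation of the scattering properties:
# unit-mean gamma variable (the shape value is an arbitrary choice).
shape = 2.0
texture = rng.gamma(shape, scale=1.0 / shape, size=n)

homogeneous = speckle              # no spatial variation
heterogeneous = texture * speckle  # spatial variation modulates the speckle

def coeff_of_variation(x):
    """Standard deviation divided by mean."""
    return x.std() / x.mean()

cv_homogeneous = coeff_of_variation(homogeneous)
cv_heterogeneous = coeff_of_variation(heterogeneous)
```

Both data sets have the same mean intensity, but the coefficient of variation rises from about 1 in the homogeneous case to roughly 1.4 in the heterogeneous case under these assumed settings.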

Those intrinsic phenomena degrade the quality of PolSAR images and complicate their interpretation. On the other hand, they are tied to the imaging mechanism of PolSAR and contain potentially useful information. Research is therefore needed on how to account for those intrinsic phenomena in the interpretation of PolSAR images.

Moreover, PolSAR systems have acquired a huge number of large-scale images over the past decades. A common challenge is to efficiently and effectively interpret a rapidly growing number of PolSAR images.

This thesis focuses on the computer-aided interpretation of PolSAR images, which provides classification results to facilitate the interpretation. We will investigate effective methods so as to obtain good performance in the interpretation of PolSAR images. The research will carefully consider specific scenarios and the intrinsic phenomena. We will conduct experiments to evaluate the performance of methods for the interpretation task.


1.2 Contributions and Publications

This thesis investigates specific topics related to the interpretation of PolSAR images; these topics are 1) statistical modeling of PolSAR data in heterogeneous scenes, 2) incorporating local correlations in PolSAR images, and 3) applying the techniques of statistical modeling of PolSAR data to semi-supervised classification. The contributions are summarized as follows.

• A novel probability density function for the PolSAR data with high scene heterogeneity

Speckle in PolSAR images is the result of combining signals reflected from a number of elementary targets/scatterers in the illuminated regions. For spatially changing surfaces (e.g., forest regions) [Vasile 11], the number of scatterers can differ significantly from pixel to pixel, leading to heavy fluctuation of PolSAR data values. Theoretical models for speckle usually assume a large number of scatterers in a homogeneous medium [Xie 02, Lee 09]. This assumption, however, does not consider the spatial fluctuation of the number of scatterers, so the theoretical models cannot accurately characterize the PolSAR data in heterogeneous scenes. Therefore, we develop a novel probability density function of the PolSAR data, which is flexible enough to accurately characterize the PolSAR data in heterogeneous scenes of a single land cover type. This model is developed within the framework of the product model for the PolSAR data [Lopès 97], within which the PolSAR covariance matrix is expressed as the product of a complex-Wishart-distributed matrix representing “ideal” speckle and a scalar variable representing the spatial fluctuation of the number of targets/scatterers [Sheen 92, Lopès 97]. We use a flexible probability distribution for the hidden variable (i.e., the scalar variable in the framework of the product model), then formulate a joint/bivariate probability distribution of the PolSAR data and the hidden variable, and finally marginalize out the hidden variable to obtain the proposed distribution of the PolSAR data.

This work was published in a peer-reviewed journal [Liu 17].

• Explicitly considering heterogeneity in unsupervised classification of PolSAR data

Within the framework of the product model, scene heterogeneity is taken into account by introducing a hidden variable that is associated with the spatial fluctuation of the number of scatterers. This results in a joint/bivariate distribution of the PolSAR data and the hidden variable, and this joint distribution can be expressed as the multiplication of a distribution of the hidden variable and another distribution of the PolSAR data conditioned on the hidden variable [Lopès 97]. Thus, the joint/bivariate probability distribution can represent a data-generating procedure, that is, the PolSAR data with scene heterogeneity can be obtained by 1) first generating a sample of the scalar hidden variable according to a univariate distribution and 2) then generating an observed PolSAR data point according to a conditional distribution given the sample of the hidden variable. This data-generating procedure explicitly considers scene heterogeneity through the hidden variable.

We propose to consider scene heterogeneity in unsupervised classification of the PolSAR data by incorporating the aforementioned data-generating procedure into a framework for unsupervised classification. Compared with existing methods that directly use the marginal (compound) distributions derived from the framework of the product model, the proposed method explicitly characterizes scene heterogeneity by the hidden variable. Moreover, since the data-generating procedure does not involve the marginalization over the hidden variable, the proposed method eliminates the need to deal with complicated functions that arise from the marginalization.

This work resulted in one peer-reviewed journal paper [Liu 19a].

• A patch-based statistical model applied to unsupervised classification of PolSAR data

Local spatial correlations in PolSAR images have been shown to improve classification performance [Wu 08, Liu 13]. We propose a patch-based method to exploit local correlations simultaneously at two spatial levels (i.e., pixel level and patch level). At the pixel level (within patches), the proposed method assumes that pixels within a patch belong to the same land cover type, and we also deal with the heavy fluctuation of PolSAR data values by reducing the role played by the heavily fluctuating PolSAR data. At the patch level (between patches), the proposed method gives neighboring patches a high probability of being assigned to the same land cover type.

This work was published in a peer-reviewed journal [Liu 18].

• Semi-supervised classification of PolSAR data considering the many-to-one correspondence

In semi-supervised scenarios, a large number of unlabeled samples are readily available, yet there are only a small number of labeled samples. With a small number of labeled samples, classifiers may not be well trained, failing to provide good classification performance. In this situation, we propose a semi-supervised method for classification of PolSAR data. We aim to learn a class-joint distribution of PolSAR samples and class labels based on labeled and unlabeled PolSAR samples. Once this class-joint distribution is learned, we can evaluate a marginal distribution of the PolSAR data and further calculate the posterior class probability according to Bayes’ theorem. This forms the basis for the classification of the PolSAR data. It may happen that some of the unlabeled PolSAR samples belong to unknown land cover classes, for example, when ground truth is not well established due to incomplete information. To allow for this situation, we use an additional class-joint distribution collectively for unknown land cover types. Moreover, in view of the potential multi-modal property within land cover types, we propose to approximate each class-joint distribution by a weighted sum of relaxed Wishart distributions; the weighted sum of relaxed Wishart distributions has the potential to accurately fit the PolSAR data belonging to each land cover class.

This work resulted in one peer-reviewed conference paper [Liu 19b].


1.3 Outline

Chapter 2 introduces the statistics of PolSAR data as well as the corresponding theoretical models (statistical models). The theoretical models for PolSAR data in homogeneous areas (e.g., water and farmlands) are explained based on the physical formulation of the received signals. This physical formation forms the basis for exploring the statistical characteristics of PolSAR data and for understanding intrinsic phenomena that complicate the interpretation of PolSAR images.

Chapter 3 investigates the effectiveness of statistical modeling techniques in the interpretation of PolSAR images. We explore a data-generating procedure that describes how data in PolSAR images are generated from multiple land cover types. The data-generating procedure is associated with a statistical model which includes the theoretical model (i.e., the homogeneous clutter model) as an element, and we apply this statistical model to unsupervised classification of PolSAR images.

Chapter 4 deals with the intrinsic phenomenon of scene heterogeneity in the interpretation of PolSAR images by using statistical modeling techniques. We first introduce the mechanism and the key factors that lead to scene heterogeneity in PolSAR images. On this basis, we investigate how to accurately model PolSAR data with high scene heterogeneity, and we propose a novel probability distribution for the PolSAR data. For the interpretation of PolSAR images, we propose to consider scene heterogeneity by explicitly modeling it. This idea is tested by experiments.

Chapter 5 focuses on incorporating local correlations so as to facilitate the interpretation of PolSAR images. To this end, we propose a patch-based statistical model and apply it to the interpretation task. It is assumed in the proposed model that PolSAR data within each patch belong to the same type of land covers. We will give details about the techniques that are used in the proposed method to consider the heavy fluctuation of PolSAR data values within patches and to incorporate local correlations.

Chapter 6 investigates the use of the techniques of statistical modeling in semi-supervised classification of PolSAR data. We propose a semi-supervised method by estimating class-joint distributions (of the PolSAR samples and class labels) based on labeled and unlabeled PolSAR samples. In the proposed method, each class-joint distribution is approximated by a weighted sum of probability distributions (component densities), which establishes the many-to-one correspondence between component densities and classes. We also test the idea of introducing an additional class-joint distribution to take into account the situation in which some of the unlabeled PolSAR samples belong to unknown land cover types. Moreover, we empirically explore the effectiveness of the proposed method in exploiting labeled and unlabeled samples, and the effect of the many-to-one correspondence on classification performance.

Chapter 7 presents the general conclusions of this thesis and discusses possible directions for future work.


2 Statistics of PolSAR Data

PolSAR is usually carried by an aircraft, a satellite, or an unmanned aerial vehicle. While these platforms move, PolSAR transmits its own electromagnetic waves sideways and downwards by an antenna, which is directed perpendicular to the flight path (i.e., azimuth) to illuminate the Earth surface. In this case, a PolSAR antenna also moves together with the platform along the flight path. This movement of an antenna allows PolSAR to construct a long virtual antenna (or a large “synthetic aperture”) by combining the received echoes, which facilitates achieving high resolution in the flight direction [Moreira 13].

Nowadays, almost all fully polarimetric SAR systems operate in the monostatic case [Lee 09]. That is, a fully polarimetric SAR system alternately transmits electromagnetic waves in horizontal polarization and vertical polarization, and it receives the backscattered signals in both polarizations with the same antenna that is used for transmission. The general scattering process is characterized by establishing the relation between the transmitted and the scattered signals, which is expressed as follows [Tragl 90, Lee 09, Moreira 13]:

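$$
\underbrace{\begin{bmatrix} E_h^s \\ E_v^s \end{bmatrix}}_{\text{Scattered signal}}
= \frac{\exp\{-j \cdot k l\}}{l}
\underbrace{\begin{bmatrix} S_{hh} & S_{hv} \\ S_{vh} & S_{vv} \end{bmatrix}}_{\text{Complex-valued scattering matrix } \mathbf{S}}
\underbrace{\begin{bmatrix} E_h^t \\ E_v^t \end{bmatrix}}_{\text{Transmitted signal}},
\tag{2.1}
$$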

where the factor $\exp\{-j \cdot k l\}/l$ represents the attenuation and the phase shift due to the propagation of the transmitted radio waves between PolSAR and a given target, $j = \sqrt{-1}$ is the imaginary unit, and $l$ denotes the distance between PolSAR and the center of a given target. $k = 2\pi/\lambda$ is the wave number, which depends on the wavelength $\lambda$ of the radio waves emitted by PolSAR. The superscripts and the subscripts in Equation (2.1) indicate the scattered/received signal ($s$), the transmitted signal ($t$), the horizontal polarization ($h$), and the vertical polarization ($v$).
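As a rough numerical illustration of Equation (2.1), the sketch below applies a scattering matrix and the propagation factor to a transmitted field. The function name `scattered_field` and the example wavelength, range, and matrix entries are illustrative assumptions; they do not describe a particular sensor or target.

```python
import cmath
import math

def scattered_field(S, E_t, l, wavelength):
    """Evaluate Equation (2.1): E_s = exp(-j*k*l) / l * S * E_t.

    S is a 2x2 nested list [[S_hh, S_hv], [S_vh, S_vv]] and E_t is
    [E_h^t, E_v^t]; both may hold complex numbers.
    """
    k = 2.0 * math.pi / wavelength          # wave number k = 2*pi/lambda
    factor = cmath.exp(-1j * k * l) / l     # attenuation and phase shift
    E_h = factor * (S[0][0] * E_t[0] + S[0][1] * E_t[1])
    E_v = factor * (S[1][0] * E_t[0] + S[1][1] * E_t[1])
    return [E_h, E_v]

# Hypothetical values: horizontally polarized transmission, l = 1000 m,
# and a 0.24 m wavelength.
S_example = [[1.0 + 0.5j, 0.1j], [0.1j, 0.8 - 0.2j]]
E_s = scattered_field(S_example, [1.0, 0.0], l=1000.0, wavelength=0.24)
```

With a purely horizontal transmission, the vertical component of the received signal comes only from the cross-polarized entry $S_{vh}$, which is why both polarizations have to be transmitted to measure the full scattering matrix.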

The scattering matrix $\mathbf{S}$ in Equation (2.1) transforms the emitted electromagnetic wave into the received signal. All of its four complex-valued entries provide amplitude and phase information, which are determined by the interaction of the electromagnetic waves with a given target. The scattering matrix provides rich information about the scattering mechanism of a given target.

Given the scattering matrix, a target can be characterized by a vector

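$$
\mathbf{s} =
\begin{bmatrix} S_{hh} \\ S_{hv} \\ S_{vh} \\ S_{vv} \end{bmatrix}
=
\begin{bmatrix}
S_{hh,r} + j \cdot S_{hh,i} \\
S_{hv,r} + j \cdot S_{hv,i} \\
S_{vh,r} + j \cdot S_{vh,i} \\
S_{vv,r} + j \cdot S_{vv,i}
\end{bmatrix},
\tag{2.2}
$$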

where $j = \sqrt{-1}$ is the imaginary unit, and the subscripts $r$ and $i$ indicate the real part and the imaginary part. Each entry corresponds to a polarimetric channel, which is a combination of the transmitting polarization and the receiving polarization. For example, the entry $S_{hv}$ is associated with the combination of the horizontal transmitting polarization and the vertical receiving polarization. The vector in Equation (2.2) is referred to as a scattering vector. This scattering vector represents the single-look complex (SLC) PolSAR data. In the monostatic case, reciprocity ensures that the scattering matrix is symmetric, i.e., $S_{hv} = S_{vh}$ [Tragl 90]. Thus, the scattering vector reduces to $\mathbf{s} = [S_{hh}, \sqrt{2} S_{hv}, S_{vv}]^T$, where the superscript $T$ is the transpose operator. The entries in the scattering vector are also referred to as complex scattering amplitudes [Moreira 13], which can be used in classification and interpretation tasks.
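The reduction from four entries to three can be sketched in a few lines. The factor $\sqrt{2}$ keeps the total power of the vector unchanged, since $|S_{hv}|^2 + |S_{vh}|^2 = 2|S_{hv}|^2$ when $S_{hv} = S_{vh}$. The helper names and the choice to average the two cross-polarized entries are assumptions made for illustration; the averaging has no effect when reciprocity holds exactly.

```python
import math

def reciprocal_scattering_vector(S_hh, S_hv, S_vh, S_vv):
    """Return the 3-D scattering vector [S_hh, sqrt(2)*S_hv, S_vv].

    Assumption: the cross-polarized entries are averaged, which leaves
    them unchanged when reciprocity (S_hv == S_vh) holds exactly.
    """
    S_x = 0.5 * (S_hv + S_vh)
    return [S_hh, math.sqrt(2.0) * S_x, S_vv]

def total_power(vec):
    """Sum of the squared magnitudes of the entries."""
    return sum(abs(z) ** 2 for z in vec)

# Hypothetical reciprocal target in the order [S_hh, S_hv, S_vh, S_vv].
s4 = [0.9 + 0.1j, 0.2 - 0.3j, 0.2 - 0.3j, 0.5 + 0.4j]
s3 = reciprocal_scattering_vector(*s4)
```

For this example, `total_power(s3)` and `total_power(s4)` agree up to rounding, which is the property the scaling is meant to preserve.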

Based on the scattering process characterized by Equation (2.1), we will introduce in Section 2.1 the speckle effect in PolSAR data and explain the formation of speckle. Section 2.2 and Section 2.3 introduce the speckle statistics of single-look complex (SLC) PolSAR data (i.e., the scattering vector in Equation (2.2)) and multi-look complex (MLC) PolSAR data. In Section 2.4, we evaluate the statistical model for the MLC PolSAR data. This chapter is summarized in Section 2.5.

2.1 Speckle and Its Formation

Speckle is an inherent phenomenon in PolSAR images, which manifests itself as a granular pattern as shown in Fig. 2.1. Speckle in PolSAR images is the result of combining the backscattered waves with different phases and amplitudes emanating from different elementary targets/scatterers in the illuminated area. The speckle effect causes significant pixel-to-pixel variation in scattering vector values.

A pixel in a PolSAR image corresponds to a small area of the PolSAR illumination, which contains a number of randomly located elementary targets, such as spheres and dipoles [Cloude 96]. These types of elementary targets/scatterers represent different scattering mechanisms. The received signal for a pixel is the combination of electromagnetic waves reflected from the independent scatterers as shown in Fig. 2.2. Therefore, the scattering vector that characterizes the overall targets in a pixel is also the sum of several scattering vectors. For a pixel including $M$ elementary scatterers, each entry in its scattering vector is given by

$$
S_{pq} = S_{pq,r} + j \cdot S_{pq,i}
= \sum_{m=1}^{M} S_{pq,r}^{m} + j \cdot \sum_{m=1}^{M} S_{pq,i}^{m},
\tag{2.3}
$$

where $p$ and $q$ indicate the polarization states of the transmitting signal and the backscattered signal, and $S_{pq,r}^{m} + j \cdot S_{pq,i}^{m}$ represents the contribution from the $m$-th elementary scatterer. The subscripts $r$ and $i$ indicate the real and imaginary parts. The entry $S_{pq}$ could be of any polarimetric channel in Equation (2.2).

Figure 2.2: The speckle formation.

Within a resolution cell (or a pixel), since the elementary scatterers are located in different places, the reflected waves by the scatterers travel different distances to the PolSAR, leading to different phases [Sarabandi 92]. If the reflected waves from the scatterers are added constructively, the PolSAR receives a strong signal; otherwise, a weak signal could be received. Thus, even two (adjacent) pixels of the same object could exhibit a large variation in scattering vector values due to the random locations of the elementary scatterers. This leads to the speckle effect in a PolSAR image.
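The coherent sum in Equation (2.3) is easy to simulate. The sketch below is a simplified illustration under stated assumptions: every scatterer has unit amplitude, the phases are uniform on $[0, 2\pi)$, and the number of scatterers per pixel and the number of pixels are arbitrary choices. All pixels represent the same kind of surface, yet the resulting intensities differ strongly from pixel to pixel.

```python
import cmath
import math
import random

def simulate_pixel(M, rng):
    """Coherent sum of M unit-amplitude scatterers with uniform random phases."""
    return sum(cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi)) for _ in range(M))

rng = random.Random(0)   # fixed seed so the sketch is repeatable
M = 50                   # scatterers per resolution cell (assumed)

# Intensity of each simulated pixel, normalized by the number of scatterers.
intensities = [abs(simulate_pixel(M, rng)) ** 2 / M for _ in range(5000)]

mean_I = sum(intensities) / len(intensities)
var_I = sum((x - mean_I) ** 2 for x in intensities) / (len(intensities) - 1)
contrast = math.sqrt(var_I) / mean_I   # standard deviation over mean
```

In this setting the ratio `contrast` should come out close to one, i.e., the standard deviation of the intensity is about as large as its mean, which is the strong pixel-to-pixel variation described above.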

Speckle is also referred to as “noise” due to the noise-like/“salt-and-pepper” appearance of PolSAR images [Qiu 04]. However, the earlier discussion shows that speckle includes the overall information about the scatterers’ locations and the magnitude of the reflected signals. Therefore, speckle should be treated carefully rather than being simply removed as noise.

2.2 Statistics of Single-Look Complex Data

In view of the speckle effect, it could be erroneous to characterize a measured scattering vector as a deterministic value. A more accurate way is to treat it as a random vector with an appropriate probability distribution. This section introduces a theoretical probability density function (PDF), which is derived based on the physical formation of speckle, to describe the speckle statistics for the single-look complex (SLC) PolSAR data (i.e., the scattering vectors).

The distribution of speckle depends on the number of individual scatterers in a resolution cell (or a pixel). Fully developed speckle requires that a resolution cell of a homogeneous medium includes a large number of scatterers, and the individual scatterers result in highly different phases [Sarabandi 92, Xie 02, Lee 09]. In that case, the central limit theorem applies in Equation (2.3), and the real part $S_{pq,r}$ and the imaginary part $S_{pq,i}$ of each scattering vector element $S_{pq}$ can be approximated as zero-mean independent Gaussian variables [Sarabandi 92, Lee 09]. Thus, it is reasonable for these real and imaginary parts to jointly follow a multivariate Gaussian distribution. It is worth noting that the real part and imaginary part of each $S_{pq}$ are independent, yet any two complex-valued entries of $\mathbf{s}$ (e.g., $S_{pq}$ and $S_{km}$ with $p, q, k, m \in \{h, v\}$) could be correlated.

Goodman [Goodman 63] showed that the aforementioned real-valued multivariate Gaussian distribution can be equivalently expressed as a multivariate complex Gaussian distribution of the corresponding complex-valued random vector. That is, the complex-valued scattering vector $\mathbf{s}$ (i.e., the SLC PolSAR data) can be characterized by a multivariate complex Gaussian distribution [Goodman 63, Lopès 97] as

$$
p_{\mathbf{s}}(\mathbf{s}) = \frac{1}{\pi^{d} |\Omega|} \exp\left\{ -\mathbf{s}^{*T} \Omega^{-1} \mathbf{s} \right\},
\tag{2.4}
$$


where $|\cdot|$ evaluates the determinant, and the superscript $*T$ is the conjugate transpose operator. $\Omega = E[\mathbf{s}\mathbf{s}^{*T}]$ is the covariance matrix of the random scattering vectors. $d$ is the dimension of the scattering vectors: $d = 3$ under the assumption of reciprocity (i.e., $S_{hv} = S_{vh}$ in Equation (2.2)) [Tragl 90, Lee 09]; $d = 4$, otherwise.
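Equation (2.4) can be evaluated numerically in the log domain. The following is a minimal sketch that uses only the Python standard library; the helper `solve_with_logdet` (Gaussian elimination with partial pivoting) and both function names are assumptions introduced here for illustration, and a linear algebra library would normally take their place.

```python
import math

def solve_with_logdet(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting.

    Returns (x, log|det A|). A is a list of rows of complex numbers.
    """
    n = len(A)
    aug = [list(A[i]) + [b[i]] for i in range(n)]
    logdet = 0.0
    for k in range(n):
        pivot = max(range(k, n), key=lambda r: abs(aug[r][k]))
        if aug[pivot][k] == 0:
            raise ValueError("covariance matrix is singular")
        aug[k], aug[pivot] = aug[pivot], aug[k]
        logdet += math.log(abs(aug[k][k]))
        for r in range(k + 1, n):
            f = aug[r][k] / aug[k][k]
            for c in range(k, n + 1):
                aug[r][c] -= f * aug[k][c]
    x = [0j] * n
    for k in range(n - 1, -1, -1):
        tail = sum(aug[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (aug[k][n] - tail) / aug[k][k]
    return x, logdet

def complex_gaussian_logpdf(s, omega):
    """Logarithm of Equation (2.4) for a scattering vector s."""
    d = len(s)
    x, logdet = solve_with_logdet(omega, s)   # x = Omega^{-1} s
    # Quadratic form s^{*T} Omega^{-1} s; it is real for Hermitian Omega.
    quad = sum(complex(a).conjugate() * b for a, b in zip(s, x)).real
    return -d * math.log(math.pi) - logdet - quad
```

For a Hermitian positive definite $\Omega$ the determinant is real and positive, so the absolute value taken inside the helper does not change the result.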

The multivariate complex Gaussian distribution is suitable for the SLC PolSAR data in homogeneous areas, where speckle is fully developed and the pixel-to-pixel variation in scattering vector values is due to fully developed speckle. In heterogeneous regions (e.g., forest and urban regions), the conditions of fully developed speckle are not necessarily fulfilled, leading to non-Gaussian clutter. In Chapter 4, we will give more details about speckle in heterogeneous scenes.

2.3 Statistics of Multilook Complex Data

The speckle effect complicates the interpretation of PolSAR images. To reduce the speckle effect, (single-polarization) SAR systems acquire multiple independent images (i.e., looks) of the same scene and incoherently average the looks (e.g., averaging real-valued intensity data over looks instead of complex-valued data) to form multilook processed images [Moreira 91, Moreira 13]. The averaging operation over the looks is called multilook processing and can be done in the process of the formation of SAR images. Multilook processing can also be done after images are formed. In this situation, multilook processing is performed by incoherently averaging image data of non-overlapping neighborhoods to form multilooked data [Lee 09].

Multilook PolSAR data are usually obtained by spatially averaging the neighboring covariance matrices that are formed from the scattering vector [Anfinsen 11b, Argenti 13]. This multilook processing results in the multilook complex (MLC) PolSAR covariance matrix as

$$
\mathbf{C} = \frac{1}{L} \sum_{i=1}^{L} \mathbf{s}_i \mathbf{s}_i^{*T},
\tag{2.5}
$$

where $L$ is the number of the neighboring pixels for the averaging operation (i.e., the number of looks), and $\mathbf{s}_i \mathbf{s}_i^{*T}$ is a single-look covariance matrix formed from the scattering vector $\mathbf{s}_i$. The diagonal elements in the multilook covariance matrix $\mathbf{C}$ provide the $L$-look intensities of polarimetric channels. The off-diagonal elements preserve the mean values of the phase difference between the polarimetric channels, which are helpful in understanding the scattering mechanisms of the illuminated area. With this multilook covariance matrix, it is also possible to further reduce the speckle noise by using speckle filters, since speckle filters are generally designed based on MLC PolSAR data (second-order statistics) rather than the SLC PolSAR data [Lee 99b, Nie 15, Ma 18].
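Equation (2.5) amounts to averaging outer products. The sketch below forms a multilook covariance matrix from a handful of scattering vectors; the function name `multilook_covariance` and the four example vectors are hypothetical and serve only to show the mechanics.

```python
def multilook_covariance(vectors):
    """Equation (2.5): average of the outer products s_i s_i^{*T}."""
    L = len(vectors)
    d = len(vectors[0])
    C = [[0j] * d for _ in range(d)]
    for s in vectors:
        for a in range(d):
            for b in range(d):
                C[a][b] += s[a] * complex(s[b]).conjugate() / L
    return C

# Four hypothetical scattering vectors [S_hh, sqrt(2)*S_hv, S_vv] (L = 4 looks).
looks = [
    [1.0 + 0.2j, 0.1 - 0.1j, 0.7 + 0.3j],
    [0.8 - 0.4j, 0.0 + 0.2j, 0.9 - 0.1j],
    [1.1 + 0.1j, -0.1 + 0.0j, 0.6 + 0.5j],
    [0.9 + 0.3j, 0.2 + 0.1j, 0.8 + 0.2j],
]
C = multilook_covariance(looks)
```

The result is Hermitian: the diagonal entries are real and equal the averaged channel intensities, while the argument of an off-diagonal entry such as `C[0][2]` reflects the averaged phase difference between the corresponding channels.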

Figure 2.3: The marginal PDFs of single-channel PolSAR data. (a) The PDFs of the multilook intensity with σ = 1 and L = {2, 4, 8, 16}. σ = 1 implies the unit mean of these PDFs. (b) The PDFs of the multilook amplitude with σ = 1 and L = {2, 4, 8, 16}. A larger number of looks results in a higher peak of the distribution for both the multilook intensity and the multilook amplitude.

The MLC PolSAR covariance matrix $\mathbf{C}$ in Equation (2.5) is the average of the second-order statistic of several random scattering vectors $\{\mathbf{s}_i\}_{i=1}^{L}$. Under the conditions of fully developed speckle, those random scattering vectors individually follow the multivariate complex Gaussian distributions [Goodman 63, Lee 09]. In this case, that second-order statistic (i.e., the MLC PolSAR covariance matrix $\mathbf{C}$ in Equation (2.5)) was investigated in [Goodman 63], and it was shown that the second-order statistic follows the complex Wishart distribution [Goodman 63]. Thus, an MLC PolSAR covariance matrix $\mathbf{C}$ can be characterized by the complex Wishart distribution, i.e.,

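$$
p_{\mathbf{C}}(\mathbf{C}; L, \Omega) = \frac{L^{Ld}}{\Gamma_d(L)} \frac{|\mathbf{C}|^{L-d}}{|\Omega|^{L}} \exp\left\{ -L \cdot \mathrm{tr}(\Omega^{-1} \mathbf{C}) \right\},
\tag{2.6}
$$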

where $|\cdot|$ and $\mathrm{tr}(\cdot)$ evaluate the determinant and the trace, respectively. $\Gamma_d(L) = \pi^{d(d-1)/2} \prod_{i=0}^{d-1} \Gamma(L-i)$, $d$ is the number of rows in the square matrix $\mathbf{C}$, and $\Gamma(x) = \int_{0}^{+\infty} z^{x-1} \exp\{-z\}\,\mathrm{d}z$ is the Gamma function. $\Omega = E[\mathbf{C}]$ is the scale matrix, which represents the expectation of the covariance matrix $\mathbf{C}$. This distribution is suitable to characterize the covariance matrices in homogeneous areas of PolSAR images.

Based on the complex Wishart distribution, the marginal PDF of single-polarization SAR data (i.e., the multilook intensity) takes the form of

$$
p_I(I; L, \sigma) = \frac{L^{L}}{\Gamma(L)} \frac{I^{L-1}}{\sigma^{L}} \exp\left\{ -\frac{L I}{\sigma} \right\},
\tag{2.7}
$$

where $\sigma$ is the scale parameter. The expectation and the variance of the intensity are given by $E[I] = \sigma$ and $\mathrm{Var}[I] = \sigma^{2}/L$. Furthermore, according to the relation $I = A^{2}$ between the intensity $I$ and the amplitude $A$, the PDF of the multilook amplitude $A$ can also be obtained as

$$
p_A(A; L, \sigma) = \frac{2 L^{L}}{\Gamma(L)} \frac{A^{2L-1}}{\sigma^{L}} \exp\left\{ -\frac{L A^{2}}{\sigma} \right\}.
\tag{2.8}
$$
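For completeness, the two steps behind Equations (2.7) and (2.8) can be written out. This is a short sketch that relies only on the definitions above: first set $d = 1$ in Equation (2.6), so that the $1 \times 1$ matrices reduce to $\mathbf{C} = I$ and $\Omega = \sigma$, and then apply the change of variables $I = A^{2}$ with $A \geq 0$.

```latex
\begin{align*}
\Gamma_1(L) &= \pi^{0}\,\Gamma(L) = \Gamma(L), \\
p_I(I; L, \sigma) &= \frac{L^{L}}{\Gamma(L)} \frac{I^{L-1}}{\sigma^{L}}
  \exp\left\{-\frac{L I}{\sigma}\right\}, \\
p_A(A; L, \sigma) &= p_I(A^{2}; L, \sigma) \left|\frac{\mathrm{d}I}{\mathrm{d}A}\right|
  = \frac{L^{L}}{\Gamma(L)} \frac{A^{2(L-1)}}{\sigma^{L}}
    \exp\left\{-\frac{L A^{2}}{\sigma}\right\} \cdot 2A \\
  &= \frac{2 L^{L}}{\Gamma(L)} \frac{A^{2L-1}}{\sigma^{L}}
    \exp\left\{-\frac{L A^{2}}{\sigma}\right\}.
\end{align*}
```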


Example (marginal) PDFs of the multilook intensity and the multilook amplitude are shown in Fig. 2.3.
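The marginal PDFs in Equations (2.7) and (2.8) can be evaluated directly. The sketch below is an illustrative check under stated assumptions: the function names, the trapezoidal integration helper, and the evaluation grid are choices made here and are not part of the original experiments. It evaluates the intensity PDF for σ = 1 and the same numbers of looks as in Fig. 2.3 and records the peak height for each.

```python
import math

def intensity_pdf(I, L, sigma):
    """Equation (2.7), evaluated in the log domain for numerical stability."""
    if I <= 0.0:
        return 0.0
    log_p = (L * math.log(L) - math.lgamma(L) + (L - 1) * math.log(I)
             - L * math.log(sigma) - L * I / sigma)
    return math.exp(log_p)

def amplitude_pdf(A, L, sigma):
    """Equation (2.8), obtained from Equation (2.7) with I = A^2."""
    return 2.0 * A * intensity_pdf(A * A, L, sigma)

def integrate(f, lo, hi, n=20000):
    """Plain trapezoidal rule (an assumption; any quadrature would do)."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi)) + sum(f(lo + i * h) for i in range(1, n))
    return total * h

# Peak height of the intensity PDF for sigma = 1 and several numbers of looks.
grid = [0.001 * i for i in range(1, 4000)]
peaks = {L: max(intensity_pdf(x, L, 1.0) for x in grid) for L in (2, 4, 8, 16)}
```

Integrating `intensity_pdf` or `amplitude_pdf` over a sufficiently wide interval with `integrate` gives a value close to one, and the entries of `peaks` grow with the number of looks, in line with the caption of Fig. 2.3.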

2.4 Assessment of the Theoretical Distribution

In this section, we assess how well the complex Wishart distribution fits measured PolSAR data. Note that the complex Wishart distribution is a matrix-variate distribution, which complicates visualization. For this reason, we show plots of marginal distributions as well as the empirical histograms associated with the corresponding polarimetric channels. Following the work [Lee 94b], we plot the fitted PDFs of the multilook amplitude.

To acquire the fitted marginal PDF for each polarimetric channel, the scale parameter $\sigma$ and the number of looks $L$ in Equation (2.8) are obtained according to their maximum likelihood estimates. Since the parameters in the PDF of the multilook amplitude [see Equation (2.8)] are inherited from the PDF of the multilook intensity [see Equation (2.7)], it is convenient to obtain an estimate of the scale parameter as [Lee 09]

$$
\hat{\sigma} = \langle I \rangle = \langle A^{2} \rangle,
\tag{2.9}
$$

where $\hat{\sigma}$ is the estimate, $\langle I \rangle$ evaluates the mean of the sample intensities, and we exploit the relation between the intensity and the amplitude $I = A^{2}$.

The (equivalent) number of looks $L$ can be obtained in advance according to PolSAR systems or by the estimation using the PolSAR data of homogeneous regions [Frery 07]. Note that the estimate $\hat{L}$ of the (equivalent) number of looks is a constant for all polarimetric channels and all regions in a PolSAR image.

Specifically, we select multilook PolSAR data from homogeneous areas of test images, independently compute the equivalent number of looks (ENL) for each channel [Anfinsen 09a], and evaluate the mean of all the ENLs as the estimate $\hat{L}$ of the number of looks [Frery 07]. That is, the estimate $\hat{L}$ of the number of looks for a PolSAR image is given by

$$
\hat{L} = \frac{\sum_{p,q \in \{h,v\}} \hat{L}_{pq}}{d},
\tag{2.10}
$$

where $d$ is the number of polarimetric channels, and $\hat{L}_{pq}$ is the estimate of ENL for the polarimetric channel indicated by the subscript “pq” ($p, q \in \{h, v\}$). $\hat{L}_{pq}$ in Equation (2.10) is calculated by

$$
\hat{L}_{pq} = \frac{\langle I_{pq} \rangle^{2}}{\langle (I_{pq} - \langle I_{pq} \rangle)^{2} \rangle}
= \frac{\langle A_{pq}^{2} \rangle^{2}}{\langle (A_{pq}^{2} - \langle A_{pq}^{2} \rangle)^{2} \rangle},
\tag{2.11}
$$

where the denominator and the numerator are, respectively, the sample variance and the square of the sample mean associated with a polarimetric channel.
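The estimators in Equations (2.9) to (2.11) can be sketched as follows. The data here are synthetic and purely illustrative: intensities are drawn from a gamma distribution with shape $L$ and scale $\sigma/L$, which matches Equation (2.7), and the true number of looks, the scale parameters, the sample size, and the function names are all assumptions made for this example.

```python
import random

def estimate_scale(intensities):
    """Equation (2.9): the sample mean of the intensities."""
    return sum(intensities) / len(intensities)

def estimate_enl(intensities):
    """Equation (2.11): squared sample mean over sample variance."""
    mean = estimate_scale(intensities)
    var = sum((x - mean) ** 2 for x in intensities) / len(intensities)
    return mean * mean / var

def average_enl(channels):
    """Equation (2.10): mean of the per-channel ENL estimates."""
    return sum(estimate_enl(ch) for ch in channels) / len(channels)

# Synthetic homogeneous area (assumption): three channels with a common
# number of looks and different scale parameters.
rng = random.Random(1)
true_L, sigmas = 4.0, (1.0, 0.3, 0.8)
channels = [[rng.gammavariate(true_L, s / true_L) for _ in range(20000)]
            for s in sigmas]
L_hat = average_enl(channels)
sigma_hat = [estimate_scale(ch) for ch in channels]
```

On such homogeneous synthetic data the estimates should land near the values used for generation. On real images the ENL estimate is only meaningful for regions that are actually homogeneous, which is why the text restricts the estimation to such areas.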

Table 2.1: Fitting Performance of the Theoretical Model. Entries are given as SKLD (KSD).

Polarimetric channel    Ocean            Woodland         Urban Area
HH                      0.082 (0.057)    0.125 (0.083)    2.109 (0.349)
HV/VH                   0.047 (0.025)    0.119 (0.082)    1.391 (0.261)
VV                      0.036 (0.030)    0.085 (0.065)    1.531 (0.289)

Note: Smaller values of the SKLD and the KSD indicate better fitting to samples.

The goodness of fit is assessed according to two metrics: the symmetric Kullback-Leibler divergence (SKLD) [Martínez-Usó 07] and the Kolmogorov-Smirnov distance (KSD) [Cui 14]. In the experiment, the SKLD measures the overall distance between a fitted curve and an empirical histogram, which is evaluated according to

$$
D_{SKL} = \sum_{i=1}^{M} p_i \log_2 \frac{p_i}{h_i} + \sum_{i=1}^{M} h_i \log_2 \frac{h_i}{p_i},
\tag{2.12}
$$

where $M$ is the number of histogram bins, $h_i$ is the empirical probability in the $i$-th bin, and $p_i$ is the probability of a fitted distribution in the $i$-th bin. The KSD provides the maximum discrepancy between a cumulative distribution function (CDF) from the fitted PDF and an empirical CDF from the samples. For both metrics, small values indicate good fits to samples.
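Both metrics are simple to compute once the histogram and the fitted distribution are available. The sketch below is one possible implementation under stated assumptions: the function names are hypothetical, bin probabilities are floored at a small `eps` so that empty bins do not cause a logarithm of zero (the thesis does not state how empty bins are handled), and the model CDF is passed in as a callable.

```python
import math

def skld(p, h, eps=1e-12):
    """Equation (2.12) for two binned probability vectors p and h.

    Assumption: probabilities are floored at eps so that empty bins
    do not produce log(0) or a division by zero.
    """
    total = 0.0
    for pi, hi in zip(p, h):
        pi, hi = max(pi, eps), max(hi, eps)
        total += pi * math.log2(pi / hi) + hi * math.log2(hi / pi)
    return total

def ksd(samples, cdf):
    """Largest gap between the empirical CDF of samples and a model CDF."""
    xs = sorted(samples)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs):
        F = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x.
        gap = max(gap, abs(F - i / n), abs((i + 1) / n - F))
    return gap
```

The SKLD is zero when the two binned distributions coincide and does not change when its two arguments are swapped, which is the symmetry that distinguishes it from the ordinary Kullback-Leibler divergence.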

The test areas include ocean, woodland, and urban areas, which are selected from two test PolSAR images acquired by different PolSAR sensors. For each test area, the original intensity values are normalized by the division over the sample mean, and the normalized amplitude is obtained as the square root of the normalized intensity.

2.4.1 Fitting results

The fitted curves and the corresponding empirical histograms for the test areas are shown in Figs. 2.4–2.6. The SKLD and KSD values are shown in Table 2.1.

In Fig. 2.4 for the ocean area, the fitted PDFs coincide well with the empirical histograms for the polarimetric channels. The test ocean area is an (approximately) homogeneous area, where the conditions of fully developed speckle are well fulfilled and the granular pattern in Fig. 2.4(a) is mainly due to the speckle. Thus, the theoretical model for speckle provides good fits to the ocean area; this is also confirmed by the small values of the SKLD and the KSD in Table 2.1.

It is shown in Table 2.1 that the theoretical model results in relatively large values of the SKLD and KSD for the woodland and the urban area. In the fitted PDFs, the (equivalent) number of looks is determined according to Equation (2.10). This estimate is used for all polarimetric channels of a PolSAR image, since the (equivalent) number of looks is a (imaging) constant for the whole image. For the woodland in Fig. 2.5, the peaks of the histograms are not well captured by the fitted PDFs. In Fig. 2.6, the deviation between the fitted PDFs and the empirical histograms is pronounced for the urban area. In these two types of (extremely) heterogeneous areas, the pixel-to-pixel variation in the PolSAR images arises from not only the combination of the complex-valued backscattered signals but also additional sources, such as spatial structures. Thus, the theoretical model may not be accurate in characterizing the data in the woodland and urban areas.

Figure 2.4: The fitting results to the multilook PolSAR data in the ocean area. (a) The Pauli-RGB image of the ocean area. (b), (c), and (d) show the histograms of the normalized amplitudes and the fitted PDFs for polarimetric channels HH, HV/VH, and VV. [Plot axes: Normalized Amplitude (horizontal), Probability Density (vertical); legend: Normalized histogram, Estimated marginal PDF.]

Figure 2.5: The fitting results to the multilook PolSAR data in the woodland. (a) The Pauli-RGB image of the woodland. (b), (c), and (d) show the histograms of the normalized amplitudes and the fitted PDFs for polarimetric channels HH, HV/VH, and VV.

Figure 2.6: The fitting results to the multilook PolSAR data in the urban area. (a) The Pauli-RGB image of the urban area. (b), (c), and (d) show the histograms of the normalized amplitudes and the fitted PDFs for polarimetric channels HH, HV/VH, and VV.

2.5 Conclusion

In this chapter, we introduced the SLC PolSAR data (i.e., the scattering vector) and the MLC PolSAR data (i.e., the covariance matrix), both of which can be used to characterize a target. Since PolSAR covariance matrices include information about the intensity and the phase difference between polarimetric channels, we always deal with the covariance matrix or its equivalent format rather than the SLC PolSAR data.

Speckle in PolSAR data was formulated as a random walk process in the complex plane. Under the conditions of fully developed speckle, theoretical models were introduced for the SLC PolSAR data and the MLC PolSAR data; the theoretical model for MLC PolSAR data is a PDF of a complex-valued matrix variate. We conducted experiments to fit the theoretical model to the MLC PolSAR data of several patches of PolSAR images, and tested the theoretical model. It was shown that the theoretical model fails to accurately fit the PolSAR data in the woodland and urban areas, where the pixel-to-pixel variation in PolSAR images is due to the combination of the complex-valued backscattered signals as well as additional sources. For the ocean area, where the conditions of fully developed speckle can be well fulfilled, the theoretical model provides accurate fits. The theoretical model can be used to characterize PolSAR data and is useful in the interpretation of PolSAR images.
