Supervised classification of multidimensional and irregularly sampled signals

HAL Id: hal-02092347

https://hal.archives-ouvertes.fr/hal-02092347

Submitted on 8 Apr 2019

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

Alexandre Constantin, Mathieu Fauvel, Stéphane Girard, Serge Iovleff

To cite this version:

Alexandre Constantin, Mathieu Fauvel, Stéphane Girard, Serge Iovleff. Supervised classification of multidimensional and irregularly sampled signals. Statlearn 2019 - Workshop on Challenging problems in Statistical Learning, Apr 2019, Grenoble, France. pp.1. ⟨hal-02092347⟩

Alexandre Constantin¹, Mathieu Fauvel², Stéphane Girard¹ and Serge Iovleff³

¹ Université Grenoble Alpes, Inria, CNRS, Grenoble INP, LJK, Grenoble, France

² CESBIO, Université de Toulouse, CNES/CNRS/IRD/UPS/INRA, Toulouse, France

³ Laboratoire Paul Painlevé - Université Lille 1, CNRS, Inria, France

Introduction

Background:

Recent space missions, such as Copernicus Sentinel-2ᵃ, provide high-resolution Satellite Image Time Series (SITS) to study continental surfaces, with a very short revisit period (5 days for Sentinel-2). Statistical models are regularly used to process such data [1, 2], and they usually require regular temporal sampling. For SITS, however, clouds and shadows (e.g. figure from [3]), as well as the satellite orbit, make irregular temporal sampling common.

Contribution:

A new statistical approach using Gaussian processes is proposed to classify irregularly sampled signals without temporal rescaling. Moreover, the model offers a theoretical framework to impute missing values such as cloudy pixels.

ᵃ https://www.esa.int/Our_Activities/Observing_the_Earth/Copernicus/Sentinel-2

Model

Gaussian Processes (GP) model:

Let $S = \{(y_i, z_i)\}_{i=1}^{n}$ be a set of multidimensional and irregularly sampled signals. A signal $Y$ is modeled as a vector of $p$ independent random processes $\mathcal{T} \to \mathbb{R}^p$, with $\mathcal{T} = [0, T]$. The associated label is modeled by a discrete random variable $Z$ taking its values in $\{1, \dots, C\}$. The model introduced here is based on two assumptions: 1) the coordinate processes $Y_b$, $b \in \{1, \dots, p\}$, of $Y$ are independent; 2) each process $Y_b$ is, conditionally on $Z = c$, a Gaussian process. Then

$$Y_b(t) \mid Z = c \sim \mathcal{GP}\big(m_{b,c}(t), K_{b,c}(t, s)\big),$$

where $m_{b,c} : \mathcal{T} \to \mathbb{R}$ is a mean function, and $K_{b,c}$ is a covariance kernel with hyperparameters $\theta_{b,c}$. For example, $\theta_{b,c} = \{\gamma_{b,c}^2, h_{b,c}, \sigma_{b,c}^2\}$ with

$$K_{b,c}(t, s) = \gamma_{b,c}^2 \, k(t, s \mid h_{b,c}) + \sigma_{b,c}^2 \, \delta_{t,s}.$$
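As an illustration of the example kernel above, the following minimal sketch evaluates $K_{b,c}$ on a set of irregular time stamps. The poster does not specify $k$; a squared-exponential kernel with length scale $h$ is assumed here, and the function name `kernel_matrix` and all numerical values are hypothetical.

```python
import numpy as np

def kernel_matrix(t, s, gamma2, h, sigma2=0.0):
    """Evaluate K(t, s) = gamma2 * k(t, s | h) + sigma2 * delta_{t,s}.

    Assumption: k is the squared-exponential kernel
    exp(-(t - s)^2 / (2 h^2)); the source leaves k unspecified.
    """
    t = np.asarray(t, dtype=float)
    s = np.asarray(s, dtype=float)
    diff = t[:, None] - s[None, :]
    K = gamma2 * np.exp(-0.5 * (diff / h) ** 2)
    # The Kronecker delta adds the noise variance only where time stamps coincide.
    return K + sigma2 * (diff == 0.0)

# Irregular time stamps of one hypothetical signal on [0, 50].
t_i = np.array([0.0, 3.0, 4.5, 17.0, 31.0, 49.0])
Sigma_i = kernel_matrix(t_i, t_i, gamma2=2.0, h=5.0, sigma2=0.1)
```

Because the kernel is evaluated directly at the observed time stamps, no resampling onto a regular grid is needed.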

An irregularly sampled noisy signal $y_i$ is observed on $T_i$ time stamps $\{t_1^i, \dots, t_{T_i}^i\} \subset \mathcal{T}$ and its $b$th coordinate is represented by a vector in $\mathbb{R}^{T_i}$. We write $y_{i,b} = [Y_b^i(t_1^i), \dots, Y_b^i(t_{T_i}^i)]^\top$, with

$$y_{i,b} \mid Z_i = c \sim \mathcal{N}_{T_i}\big(\mu_{i,b,c}, \Sigma^i_{b,c}\big).$$

Here $\mu_{i,b,c} = B_b^i \alpha_{b,c}$ is the sampled mean projected on a finite-dimensional space ($B_b^i$ is the fixed design matrix, $\alpha_{b,c}$ is the unknown vector of coordinates), and $\Sigma^i_{b,c}$ is the matrix of evaluations of the kernel $K_{b,c}$ at $\{t_1^i, \dots, t_{T_i}^i\}$.
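To make the sampled model concrete, the sketch below draws one vector $y_{i,b}$ from $\mathcal{N}_{T_i}(B_b^i \alpha_{b,c}, \Sigma^i_{b,c})$ at irregular time stamps. The cosine basis used for the design matrix, the squared-exponential kernel, and every numerical value are assumptions made for illustration; the poster does not state which basis or kernel is used.

```python
import numpy as np

def design_matrix(t, n_basis=3, T=50.0):
    """Hypothetical cosine basis on [0, T]: column j holds cos(pi * j * t / T)."""
    t = np.asarray(t, dtype=float)
    j = np.arange(n_basis)
    return np.cos(np.pi * t[:, None] * j[None, :] / T)

def se_covariance(t, gamma2, h, sigma2):
    """gamma2 * exp(-(t - s)^2 / (2 h^2)) + sigma2 * I (assumed kernel).

    The identity term stands for the Kronecker delta and assumes
    that the time stamps are distinct.
    """
    t = np.asarray(t, dtype=float)
    diff = t[:, None] - t[None, :]
    return gamma2 * np.exp(-0.5 * (diff / h) ** 2) + sigma2 * np.eye(t.size)

rng = np.random.default_rng(0)
# Each signal has its own set of T_i irregular time stamps.
t_i = np.sort(rng.uniform(0.0, 50.0, size=12))
alpha_bc = np.array([1.0, 0.5, -0.3])          # illustrative coordinates
B_i = design_matrix(t_i)                       # shape (T_i, n_basis)
mu_i = B_i @ alpha_bc                          # sampled mean B alpha
Sigma_i = se_covariance(t_i, gamma2=1.0, h=5.0, sigma2=0.1)
y_ib = rng.multivariate_normal(mu_i, Sigma_i)  # one draw of y_{i,b} | Z_i = c
```

The design matrix and the covariance matrix both depend on the signal index $i$ only through its time stamps, which is what lets signals of different lengths share the same parameters $\alpha_{b,c}$ and $\theta_{b,c}$.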

Estimation:

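• $\alpha_{b,c}$ and $\theta_{b,c}$ are estimated by maximizing the log-likelihood,

$$-\frac{1}{2} \sum_{i \mid Z_i = c} \Big[ \log\big|\Sigma^i(\theta_{b,c})\big| + (y_{i,b} - B_b^i \alpha_{b,c})^\top \, \Sigma^i(\theta_{b,c})^{-1} \, (y_{i,b} - B_b^i \alpha_{b,c}) \Big].$$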

• $\alpha_{b,c}$ is given by an explicit formula, while $\theta_{b,c}$ is computed with a gradient-based method.
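The estimation step can be sketched as follows for one class $c$ and one band $b$. The poster does not give the explicit formula for $\alpha_{b,c}$; the sketch assumes it is the generalized least squares solution, which is the maximizer of the log-likelihood above in $\alpha_{b,c}$ when $\theta_{b,c}$ is held fixed. The cosine basis, the squared-exponential kernel, and the function names are hypothetical.

```python
import numpy as np

def design_matrix(t, n_basis=3, T=50.0):
    """Hypothetical cosine basis on [0, T]."""
    t = np.asarray(t, dtype=float)
    return np.cos(np.pi * t[:, None] * np.arange(n_basis)[None, :] / T)

def covariance(t, theta):
    """Assumed kernel: gamma2 * squared-exponential + sigma2 * identity."""
    gamma2, h, sigma2 = theta
    t = np.asarray(t, dtype=float)
    diff = t[:, None] - t[None, :]
    return gamma2 * np.exp(-0.5 * (diff / h) ** 2) + sigma2 * np.eye(t.size)

def gls_alpha(signals, theta):
    """Generalized least squares estimate of alpha for fixed theta.

    signals is a list of (t_i, y_ib) pairs, all from the same class and band.
    """
    A, r = 0.0, 0.0
    for t, y in signals:
        B = design_matrix(t)
        SB = np.linalg.solve(covariance(t, theta), B)  # Sigma^{-1} B
        A = A + B.T @ SB                               # sum of B^T Sigma^{-1} B
        r = r + SB.T @ y                               # sum of B^T Sigma^{-1} y
    return np.linalg.solve(A, r)

def log_likelihood(signals, alpha, theta):
    """Log-likelihood of the poster, up to an additive constant."""
    total = 0.0
    for t, y in signals:
        S = covariance(t, theta)
        resid = y - design_matrix(t) @ alpha
        _, logdet = np.linalg.slogdet(S)
        total += logdet + resid @ np.linalg.solve(S, resid)
    return -0.5 * total

# Simulated signals with different numbers of irregular time stamps.
rng = np.random.default_rng(1)
theta = (1.0, 5.0, 0.1)
alpha_true = np.array([1.0, 0.5, -0.3])
signals = []
for _ in range(20):
    t = np.sort(rng.uniform(0.0, 50.0, size=rng.integers(5, 15)))
    y = rng.multivariate_normal(design_matrix(t) @ alpha_true, covariance(t, theta))
    signals.append((t, y))

alpha_hat = gls_alpha(signals, theta)
```

In a full procedure, `gls_alpha` would be called inside the objective passed to a gradient-based optimizer over $\theta_{b,c}$, so that only the kernel hyperparameters are searched numerically; that outer loop is not shown here.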

Classification and Imputation of missing values

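The assigned class is given by the MAP rule from the posterior probability

$$P(Z = c \mid y_j) = \frac{\hat{\pi}_c \prod_{b=1}^{p} f_{T_j}\big(y_{j,b};\, B_b^j \hat{\alpha}_{b,c},\, \Sigma^j(\hat{\theta}_{b,c})\big)}{\sum_{\ell=1}^{C} \hat{\pi}_\ell \prod_{b=1}^{p} f_{T_j}\big(y_{j,b};\, B_b^j \hat{\alpha}_{b,\ell},\, \Sigma^j(\hat{\theta}_{b,\ell})\big)},$$

where $f_{T_j}(\cdot\,; \mu, \Sigma)$ denotes the density of the $T_j$-dimensional Gaussian distribution with mean $\mu$ and covariance $\Sigma$, and $\hat{\pi}_c$ is the estimated prior probability of class $c$.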
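A minimal sketch of the MAP rule is given below. It works with log-densities and normalizes with a max-shift for numerical stability, which is an implementation choice and not something stated in the poster. The cosine basis, the squared-exponential kernel, the parameter values, and the function names are assumptions; classes are indexed from 0 in the code.

```python
import numpy as np

def design_matrix(t, n_basis=3, T=50.0):
    """Hypothetical cosine basis on [0, T]."""
    t = np.asarray(t, dtype=float)
    return np.cos(np.pi * t[:, None] * np.arange(n_basis)[None, :] / T)

def covariance(t, theta):
    """Assumed kernel: gamma2 * squared-exponential + sigma2 * identity."""
    gamma2, h, sigma2 = theta
    t = np.asarray(t, dtype=float)
    diff = t[:, None] - t[None, :]
    return gamma2 * np.exp(-0.5 * (diff / h) ** 2) + sigma2 * np.eye(t.size)

def gaussian_logpdf(y, mean, cov):
    """Log-density of a multivariate Gaussian evaluated at y."""
    resid = y - mean
    _, logdet = np.linalg.slogdet(cov)
    quad = resid @ np.linalg.solve(cov, resid)
    return -0.5 * (y.size * np.log(2.0 * np.pi) + logdet + quad)

def posterior(y, t, priors, alphas, thetas):
    """Posterior class probabilities for one signal.

    y has shape (T_j, p); alphas[c][b] and thetas[c][b] hold the
    estimated parameters of class c and band b.
    """
    B = design_matrix(t)
    log_post = np.log(np.asarray(priors, dtype=float))
    for c in range(len(priors)):
        for b in range(y.shape[1]):
            # Bands are independent, so their log-densities add up.
            log_post[c] += gaussian_logpdf(
                y[:, b], B @ alphas[c][b], covariance(t, thetas[c][b])
            )
    log_post -= log_post.max()  # avoid underflow before exponentiating
    post = np.exp(log_post)
    return post / post.sum()

# Two classes, one band; the test signal equals the mean of class 0.
t_j = np.array([1.0, 6.0, 7.5, 20.0, 33.0, 48.0])
alphas = [[np.array([0.0, 1.0, 0.0])], [np.array([2.0, -1.0, 0.5])]]
thetas = [[(1.0, 5.0, 0.1)], [(1.0, 5.0, 0.1)]]
priors = [0.5, 0.5]
y_j = (design_matrix(t_j) @ alphas[0][0])[:, None]

post = posterior(y_j, t_j, priors, alphas, thetas)
label = int(np.argmax(post))  # MAP class
```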

When the class is known to be $c$, the missing value at $t^*$ is estimated through the computation of the conditional expectation:

$$\begin{aligned}
\hat{Y}^i_{b,c}(t^*) &= B_b^i(t^*)\,\hat{\alpha}_{b,c} + K_{b,c}(t^*, t_{1:T_i})^\top \, \Sigma^i(\hat{\theta}_{b,c})^{-1} \, (y_{i,b} - B_b^i \hat{\alpha}_{b,c}), \\
\operatorname{var}\big(\hat{Y}^i_{b,c}(t^*)\big) &= K_{b,c}(t^*, t^*) - K_{b,c}(t^*, t_{1:T_i}) \, \Sigma^i(\hat{\theta}_{b,c})^{-1} \, K_{b,c}(t_{1:T_i}, t^*).
\end{aligned}$$
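The two imputation formulas above can be sketched as below for a known class and a single band. As elsewhere, the cosine basis, the squared-exponential kernel, and the numerical values are assumptions, and the names `impute` and `se_kernel` are hypothetical. Following the formula, $K_{b,c}(t^*, t^*)$ includes the noise variance $\sigma^2_{b,c}$, while the cross-covariance with the observed time stamps does not, since $t^*$ is assumed to differ from all of them.

```python
import numpy as np

def design_matrix(t, n_basis=3, T=50.0):
    """Hypothetical cosine basis on [0, T]."""
    t = np.asarray(t, dtype=float)
    return np.cos(np.pi * t[:, None] * np.arange(n_basis)[None, :] / T)

def se_kernel(t, s, gamma2, h):
    """Assumed noise-free part of the kernel: gamma2 * exp(-(t - s)^2 / (2 h^2))."""
    t = np.asarray(t, dtype=float)
    s = np.asarray(s, dtype=float)
    diff = t[:, None] - s[None, :]
    return gamma2 * np.exp(-0.5 * (diff / h) ** 2)

def impute(t_obs, y_obs, t_star, alpha, theta):
    """Conditional mean and variance at unobserved time stamps t_star."""
    gamma2, h, sigma2 = theta
    Sigma = se_kernel(t_obs, t_obs, gamma2, h) + sigma2 * np.eye(len(t_obs))
    k_star = se_kernel(t_obs, t_star, gamma2, h)   # shape (T_i, n_star)
    resid = y_obs - design_matrix(t_obs) @ alpha
    mean = design_matrix(t_star) @ alpha + k_star.T @ np.linalg.solve(Sigma, resid)
    # Diagonal of K(t*, t*) - K(t*, t) Sigma^{-1} K(t, t*), one entry per t*.
    var = gamma2 + sigma2 - np.sum(k_star * np.linalg.solve(Sigma, k_star), axis=0)
    return mean, var

t_obs = np.array([0.0, 4.0, 9.0, 15.0, 32.0, 47.0])
alpha = np.array([1.0, 0.5, -0.3])
theta = (1.0, 5.0, 0.1)
y_obs = design_matrix(t_obs) @ alpha + np.array([0.3, -0.2, 0.1, 0.4, -0.1, 0.2])

t_star = np.array([12.0, 25.0, 40.0])              # missing dates
y_hat, y_var = impute(t_obs, y_obs, t_star, alpha, theta)
lower = y_hat - 2.0 * np.sqrt(y_var)               # band of 2 standard deviations
upper = y_hat + 2.0 * np.sqrt(y_var)
```

Far from any observation the cross-covariance vanishes, so the prediction falls back to the parametric mean $B_b^i(t^*)\hat{\alpha}_{b,c}$ and the variance returns to its prior value.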

We also generalized this imputation to the case where the class is unknown.

Validation (Synthetic data)

[Figure: Example of two signals (dots) that belong to two different classes. Horizontal axis: t, temporal instants; vertical axis: Y(t), amplitude (1 band).]

Classification rate based on average time samples

n_t           5     10    25    50    75
Acc_exp (%)   52.8  52.9  74.3  93.9  94.2
Acc_sin (%)   64.3  85.3  100   100   100

[Figure: Imputation on two signals belonging to the same class. Horizontal axis: time; vertical axis: amplitude. Legend: noisy observed signal, predicted signal, 2 * standard deviation, missing values.]

Future work

We are now implementing the model for massive real data (Sentinel-2).

We are also working on a new model for the case where the bands are correlated.

This work is supported by the French National Research Agency in the framework of the Investissements d’Avenir program (ANR-15-IDEX-02) and by the Centre National d’Etudes Spatiales (CNES).

[1] P. J. Brockwell and R. A. Davis. Time Series: Theory and Methods. Springer-Verlag, Berlin, Heidelberg, 1986.

[2] C. K. Williams and C. E. Rasmussen. Gaussian Processes for Machine Learning. The MIT Press, 2006.

[3] Sentinel Hub blog. https://medium.com/sentinel-hub. Accessed: 2019-03-21.
