HAL Id: hal-02092347
https://hal.archives-ouvertes.fr/hal-02092347
Submitted on 8 Apr 2019
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
Supervised classification of multidimensional and irregularly sampled signals
Alexandre Constantin, Mathieu Fauvel, Stéphane Girard, Serge Iovleff
To cite this version:
Alexandre Constantin, Mathieu Fauvel, Stéphane Girard, Serge Iovleff. Supervised classification of multidimensional and irregularly sampled signals. Statlearn 2019 - Workshop on Challenging problems in Statistical Learning, Apr 2019, Grenoble, France. pp.1. ⟨hal-02092347⟩
Supervised classification of multidimensional and irregularly sampled signals.
Alexandre Constantin (1), Mathieu Fauvel (2), Stéphane Girard (1) and Serge Iovleff (3)
(1) Université Grenoble Alpes, Inria, CNRS, Grenoble INP, LJK, Grenoble, France
(2) CESBIO, Université de Toulouse, CNES/CNRS/IRD/UPS/INRA, Toulouse, France
(3) Laboratoire Paul Painlevé - Université Lille 1, CNRS, Inria, France
Introduction
Background:
Recent space missions, such as Copernicus Sentinel-2 (a), provide high-resolution Satellite Image Time Series (SITS) to study continental surfaces, with a very short revisit period (5 days for Sentinel-2). To process such data, statistical models are regularly used [1, 2], and these usually require regular temporal sampling. For SITS, however, clouds and shadows (e.g. figure from [3]), as well as the satellite orbit, commonly lead to irregular temporal sampling.
Contribution:
A new statistical approach using Gaussian processes is proposed to classify irregularly sampled signals without temporal rescaling. Moreover, the model offers a theoretical framework to impute missing values such as cloudy pixels.
(a) https://www.esa.int/Our_Activities/Observing_the_Earth/Copernicus/Sentinel-2
Model
Gaussian process (GP) model:
Let $S = \{(y_i, z_i)\}_{i=1}^{n}$ be a set of multidimensional and irregularly sampled signals. A signal $Y$ is modeled as a vector of $p$ independent random processes, $Y: \mathcal{T} \to \mathbb{R}^p$, with $\mathcal{T} = [0, T]$. The associated label is modeled by a discrete random variable $Z$ taking its values in $\{1, \ldots, C\}$. The model introduced here is based on two assumptions: 1) the coordinate processes $Y_b$, $b \in \{1, \ldots, p\}$, of $Y$ are independent; 2) each process $Y_b$ is, conditionally on $Z = c$, a Gaussian process. Then
\[ Y_b(t) \mid Z = c \sim \mathcal{GP}\big(m_{b,c}(t), K_{b,c}(t, s)\big), \]
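where $m_{b,c}: \mathcal{T} \to \mathbb{R}$ is a mean function and $K_{b,c}$ is a covariance kernel with hyperparameters $\theta_{b,c}$. For example, $\theta_{b,c} = \{\gamma_{b,c}^2, h_{b,c}, \sigma_{b,c}^2\}$ with
\[ K_{b,c}(t, s) = \gamma_{b,c}^2 \, k(t, s \mid h_{b,c}) + \sigma_{b,c}^2 \, \delta_{t,s}. \]
An irregularly sampled noisy signal $y_i$ is observed at $T_i$ time stamps $\{t_1^i, \ldots, t_{T_i}^i\} \subset \mathcal{T}$, and its $b$th coordinate is represented by a vector in $\mathbb{R}^{T_i}$. We write $y_{i,b} = [Y_b^i(t_1^i), \ldots, Y_b^i(t_{T_i}^i)]^\top$, with
\[ y_{i,b} \mid Z_i = c \sim \mathcal{N}_{T_i}\big(\mu_{i,b,c}, \Sigma_{b,c}^i\big). \]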
Here $\mu_{i,b,c} = B_b^i \alpha_{b,c}$ is the sampled mean projected onto a finite-dimensional space ($B_b^i$ is the fixed design matrix, $\alpha_{b,c}$ is the unknown vector of coordinates), and $\Sigma_{b,c}^i$ is the matrix of evaluations of the kernel $K_{b,c}$ at $\{t_1^i, \ldots, t_{T_i}^i\}$.
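As an illustration of the sampled model, the following minimal Python sketch builds the covariance matrix and a design matrix for one coordinate observed at irregular time stamps. The squared-exponential form of $k$, the polynomial basis, the function names and all numerical values are assumptions made for the example; the source does not specify them.

```python
import numpy as np

def rbf(t, s, h):
    """Base kernel k(t, s | h); the squared-exponential form is an assumption."""
    d = np.subtract.outer(t, s)
    return np.exp(-0.5 * (d / h) ** 2)

def covariance(t, gamma2, h, sigma2):
    """Sigma = gamma^2 k(t, t | h) + sigma^2 I at the observed time stamps.

    The Kronecker delta reduces to the identity because the time stamps
    are assumed to be distinct.
    """
    t = np.asarray(t, dtype=float)
    return gamma2 * rbf(t, t, h) + sigma2 * np.eye(t.size)

def polynomial_design(t, degree=2):
    """Hypothetical design matrix B: polynomial basis evaluated at t."""
    return np.vander(np.asarray(t, dtype=float), degree + 1, increasing=True)

# Irregular time stamps in [0, 50] (illustrative values)
t_i = np.array([0.0, 3.0, 4.5, 11.0, 27.0, 50.0])
Sigma = covariance(t_i, gamma2=1.0, h=5.0, sigma2=0.1)
B = polynomial_design(t_i)
```

No resampling onto a regular grid is needed: each signal carries its own time stamps, and the matrices simply change size with $T_i$.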
Estimation:
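• $\alpha_{b,c}$ and $\theta_{b,c}$ are estimated by maximizing the log-likelihood,
\[ -\frac{1}{2} \sum_{i \mid Z_i = c} \Big[ \log \big|\Sigma^i(\theta_{b,c})\big| + (y_{i,b} - B_b^i \alpha_{b,c})^\top \Sigma^i(\theta_{b,c})^{-1} (y_{i,b} - B_b^i \alpha_{b,c}) \Big]. \]
• $\alpha_{b,c}$ is given by an explicit formula, while $\theta_{b,c}$ is computed with a gradient-based technique.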
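The two estimation steps can be sketched as follows for a single class and coordinate. The explicit formula for $\alpha_{b,c}$ is not given in the source; the sketch assumes it is the standard generalized least squares solution for fixed $\theta_{b,c}$. SciPy's L-BFGS-B with numerical gradients stands in for the gradient technique, and the squared-exponential kernel, linear basis, log-parametrization, bounds and simulated data are all assumptions made for the example.

```python
import numpy as np
from scipy.optimize import minimize

def covariance(t, log_theta):
    """gamma^2 k(t, t | h) + sigma^2 I, with theta = exp(log_theta) kept positive."""
    gamma2, h, sigma2 = np.exp(log_theta)
    d = np.subtract.outer(t, t)
    return gamma2 * np.exp(-0.5 * (d / h) ** 2) + sigma2 * np.eye(t.size)

def design(t, degree=1):
    """Hypothetical polynomial design matrix B."""
    return np.vander(t, degree + 1, increasing=True)

def gls_alpha(log_theta, times, signals):
    """Closed-form alpha for fixed theta (generalized least squares, assumed)."""
    A, rhs = 0.0, 0.0
    for t, y in zip(times, signals):
        B = design(t)
        S_inv_B = np.linalg.solve(covariance(t, log_theta), B)
        A = A + B.T @ S_inv_B          # accumulates B^T Sigma^{-1} B
        rhs = rhs + S_inv_B.T @ y      # accumulates B^T Sigma^{-1} y
    return np.linalg.solve(A, rhs)

def neg_log_lik(log_theta, times, signals):
    """Negative log-likelihood (up to a constant) with alpha profiled out."""
    alpha = gls_alpha(log_theta, times, signals)
    total = 0.0
    for t, y in zip(times, signals):
        S = covariance(t, log_theta)
        r = y - design(t) @ alpha
        _, logdet = np.linalg.slogdet(S)
        total += 0.5 * (logdet + r @ np.linalg.solve(S, r))
    return total

# Simulated signals of one class and one coordinate (synthetic, for illustration)
rng = np.random.default_rng(0)
true_alpha, true_log_theta = np.array([1.0, 0.1]), np.log([1.0, 5.0, 0.1])
times, signals = [], []
for _ in range(40):
    t = np.sort(rng.uniform(0.0, 50.0, size=rng.integers(8, 21)))
    L = np.linalg.cholesky(covariance(t, true_log_theta))
    times.append(t)
    signals.append(design(t) @ true_alpha + L @ rng.standard_normal(t.size))

x0 = np.zeros(3)
res = minimize(neg_log_lik, x0, args=(times, signals),
               method="L-BFGS-B", bounds=[(-5.0, 5.0)] * 3)
alpha_hat = gls_alpha(res.x, times, signals)
```

Profiling out $\alpha_{b,c}$ leaves only the few kernel hyperparameters for the numerical optimizer, which is one plausible reason to separate the two steps.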
Classification and Imputation of missing values
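The assigned class is given by the maximum a posteriori (MAP) rule applied to the posterior probability
\[ P(Z = c \mid y_j) = \frac{\hat{\pi}_c \prod_{b=1}^{p} f_{T_j}\big(y_{j,b};\, B_b^j \hat{\alpha}_{b,c},\, \Sigma^j(\hat{\theta}_{b,c})\big)}{\sum_{\ell=1}^{C} \hat{\pi}_\ell \prod_{b=1}^{p} f_{T_j}\big(y_{j,b};\, B_b^j \hat{\alpha}_{b,\ell},\, \Sigma^j(\hat{\theta}_{b,\ell})\big)}, \]
where $f_{T_j}(\cdot\,; \mu, \Sigma)$ denotes the density of the Gaussian distribution $\mathcal{N}_{T_j}(\mu, \Sigma)$ and $\hat{\pi}_c$ is the estimated prior probability of class $c$.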
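A minimal sketch of the MAP rule, assuming the class-conditional means $B_b^j \hat{\alpha}_{b,c}$ and covariances $\Sigma^j(\hat{\theta}_{b,c})$ have already been evaluated at the time stamps of the signal to classify. The function name, the data layout and the toy values are hypothetical. The product over coordinates is computed as a sum of log-densities and normalized with log-sum-exp to avoid numerical underflow.

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import multivariate_normal

def log_posterior(y, means, covs, priors):
    """Log of P(Z = c | y) for every class c.

    y:      (p, T) array, one row per coordinate b
    means:  means[c][b] is the (T,) sampled mean for class c, coordinate b
    covs:   covs[c][b] is the (T, T) covariance for class c, coordinate b
    priors: (C,) array of class proportions
    """
    log_joint = np.empty(len(priors))
    for c in range(len(priors)):
        log_joint[c] = np.log(priors[c]) + sum(
            multivariate_normal.logpdf(y[b], mean=means[c][b], cov=covs[c][b])
            for b in range(y.shape[0])
        )
    return log_joint - logsumexp(log_joint)

# Toy example: two classes, p = 2 coordinates, T_j = 5 time stamps
rng = np.random.default_rng(1)
T_j, p = 5, 2
class_levels = [0.0, 3.0]
means = [[np.full(T_j, m) for _ in range(p)] for m in class_levels]
covs = [[np.eye(T_j) for _ in range(p)] for _ in class_levels]
priors = np.array([0.5, 0.5])

y_j = 3.0 + 0.1 * rng.standard_normal((p, T_j))  # signal close to class 1
log_post = log_posterior(y_j, means, covs, priors)
z_hat = int(np.argmax(log_post))                 # MAP class index
```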
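When the class is known to be $c$, the missing value at $t_*$ is estimated by the conditional expectation, and its uncertainty by the conditional variance:
\[ \hat{Y}_{b,c}^i(t_*) = B_b^i(t_*)\, \hat{\alpha}_{b,c} + K_{b,c}(t_*, t^i_{1:T_i})\, \Sigma^i(\hat{\theta}_{b,c})^{-1} (y_{i,b} - B_b^i \hat{\alpha}_{b,c}), \]
\[ \operatorname{var}\big(\hat{Y}_{b,c}^i(t_*)\big) = K_{b,c}(t_*, t_*) - K_{b,c}(t_*, t^i_{1:T_i})\, \Sigma^i(\hat{\theta}_{b,c})^{-1} K_{b,c}(t^i_{1:T_i}, t_*), \]
where $K_{b,c}(t_*, t^i_{1:T_i})$ is the row vector of kernel evaluations between $t_*$ and the observed time stamps, and $K_{b,c}(t^i_{1:T_i}, t_*)$ is its transpose.
We also generalized this imputation to the case where the class is unknown.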
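The known-class imputation can be sketched in a few lines of Python. The squared-exponential kernel, the linear basis, the function names and the numerical values are assumptions for illustration, and $t_*$ is assumed not to coincide with an observed time stamp, so that the noise term only enters $K_{b,c}(t_*, t_*)$.

```python
import numpy as np

def k_base(t, s, h):
    """Base kernel k(t, s | h); the squared-exponential form is an assumption."""
    d = np.subtract.outer(np.atleast_1d(t), np.atleast_1d(s))
    return np.exp(-0.5 * (d / h) ** 2)

def impute(t_star, t_obs, y_obs, alpha, gamma2, h, sigma2, degree=1):
    """Conditional mean and variance at t_star given one observed coordinate."""
    t_star = np.atleast_1d(np.asarray(t_star, dtype=float))
    B_obs = np.vander(t_obs, degree + 1, increasing=True)
    B_star = np.vander(t_star, degree + 1, increasing=True)
    Sigma = gamma2 * k_base(t_obs, t_obs, h) + sigma2 * np.eye(t_obs.size)
    K_star = gamma2 * k_base(t_star, t_obs, h)      # rows: K(t*, t_{1:T})
    resid = y_obs - B_obs @ alpha
    mean = B_star @ alpha + K_star @ np.linalg.solve(Sigma, resid)
    prior_var = gamma2 + sigma2                     # K(t*, t*), noise included
    reduction = np.einsum("ij,ji->i", K_star, np.linalg.solve(Sigma, K_star.T))
    return mean, prior_var - reduction

# Illustrative signal with a gap between t = 20 and t = 40
t_obs = np.array([0.0, 4.0, 9.0, 15.0, 20.0, 40.0, 44.0, 50.0])
alpha = np.array([1.0, 0.1])
y_obs = 1.0 + 0.1 * t_obs + 0.5 * np.sin(t_obs / 3.0)
mean, var = impute([22.0, 30.0], t_obs, y_obs, alpha,
                   gamma2=1.0, h=5.0, sigma2=0.1)

# Sanity check: with zero residual the prediction falls back to the mean function
mean0, _ = impute(30.0, t_obs, 1.0 + 0.1 * t_obs, alpha, 1.0, 5.0, 0.1)
```

The variance grows with the distance to the nearest observation, which is what a band of two standard deviations around the predicted signal would display.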
Validation (Synthetic data)
[Figure: example of two signals (dots) that belong to two different classes. x-axis: t, temporal instants (0 to 50); y-axis: Y(t), amplitude (1 band).]

Classification rate based on the average number of time samples n_t:

n_t           5      10     25     50     75
Acc_exp (%)   52.8   52.9   74.3   93.9   94.2
Acc_sin (%)   64.3   85.3   100    100    100
[Figure: amplitude versus time (0 to 50). Legend: noisy observed signal; predicted signal; 2 × standard deviation; missing values.]