Quand les données sont des courbes

(1)

l’Analyse de Données Fonctionnelles en Océanographie

Quand les données sont des courbes

David NERINI

Mediterranean Institute of Oceanography UMR 7294 CNRS, Aix-Marseille University

david.nerini@univ-amu.fr

MQEM, Marseille, 2017

D. NERINI (MIO) FDA MQEM, 2017 1 / 70

(2)

Qu’est ce qu’une donnée fonctionnelle ? Quelques exemples

Courbes échantillonnées

En oceanographie, les données sont souvent des courbes échantillonnées

Distribution de taille de zooplancton

(3)

Courbes échantillonnées Spectre d’électrophorèse

control sediment

contaminated sediment

single sample

continuous distribution

approximated distribution

(4)

Courbes multiples échantillonnées

Profils physicochimiques

5 10 15 20 25 30 35

-10-8-6-4-20

temperature (°C)

(m)

5 10 15 20 25 30 35 40

-10-8-6-4-20

salinity (SI)

prof.(m)

0 50 100 150

-10-8-6-4-20

dissolved oxygens (%)

prof.(m)

depth

(5)

Courbes multiples échantillonnées

Emprunte optique d’un dinoflagellé Ceratium

(6)

Courbes appariés échantillonnées Contour d’otolithe

(7)

Images satellites

(SST) Brute, lissée & dérivée spatiale

84 86 88 90 92 94 96

−54

−52

−50

−48

−46

−44

−42

long

lat

84 86 88 90 92 94 96

−54

−52

−50

−48

−46

−44

−42

xgrid

ygrid

84 86 88 90 92 94 96

−54

−52

−50

−48

−46

−44

−42

longx

laty

(8)

Qu’est ce qu’une donnée fonctionnelle ? Définition

Definition

Une données fonctionnelles est l’échantillon d’une variable aléatoire indicée par un argument

Nombreuses observations bruitées

Beaucoup de points échantillonnés

La grille d’échantillonnage peut changer d’une courbe à l’autre

ATTENTION

Impossibilité de créer un tableau de données ! Hypothèse clef : régularité

Y

_i

t

_j

= ˆ Y

_i

t

_j

+ ε

_i

t

_j

où Y

_i

t

_j

est la donnée brute, Y ˆ

_i

(t) une courbe régulière et ε

_i

t

_j

un résidu.

(9)

Definition

Une données fonctionnelles est l’échantillon d’une variable aléatoire indicée par un argument

Nombreuses observations bruitées Beaucoup de points échantillonnés

La grille d’échantillonnage peut changer d’une courbe à l’autre ATTENTION

Impossibilité de créer un tableau de données ! Hypothèse clef : régularité

Y

_i

t

_j

= ˆ Y

_i

t

_j

+ ε

_i

t

_j

où Y

_i

t

_j

est la donnée brute, Y ˆ

_i

(t) une courbe régulière et ε

_i

t

_j

un résidu.

(10)

Definition

Une données fonctionnelles est l’échantillon d’une variable aléatoire indicée par un argument

Nombreuses observations bruitées Beaucoup de points échantillonnés

La grille d’échantillonnage peut changer d’une courbe à l’autre ATTENTION

Impossibilité de créer un tableau de données !

Hypothèse clef : régularité Y

_i

t

_j

= ˆ Y

_i

t

_j

+ ε

_i

t

_j

où Y

_i

t

_j

est la donnée brute, Y ˆ

_i

(t) une courbe régulière et ε

_i

t

_j

un résidu.

(11)

Definition

Une données fonctionnelles est l’échantillon d’une variable aléatoire indicée par un argument

Nombreuses observations bruitées Beaucoup de points échantillonnés

La grille d’échantillonnage peut changer d’une courbe à l’autre ATTENTION

Impossibilité de créer un tableau de données ! Hypothèse clef : régularité

Y

_i

t

_j

= ˆ Y

_i

t

_j

+ ε

_i

t

_j

où Y

_i

t

_j

est la donnée brute, Y ˆ

_i

(t) une courbe régulière et ε

_i

t

_j

un résidu.

(12)

De la données brute à une fonction Régression linéaire

A partir de l’échantillon Y = (Y (t

₁

) , · · · , Y (t

p

))

⁰

, on crée une combinaison linéaire de K fonctions de base fixées d’avance {φ

₁

, · · · , φ

_K

} tel que

Y (t ) =

K

X

k=1

c

_k

φ

_k

(t) + ε (t) .

L’estimation des paramètres c = (c

₁

, · · · , c

_K

)

⁰

est réalisée par régression des moindres carrés en minimisant :

SSE (c) = (Y − Φc)

⁰

(Y − Φc) Solution similaire à celle d’une régression classique :

ˆ c = Φ

⁰

Φ

−1

ΦY Finalement la courbe est reconstituée avec :

Y ˆ (t) =

K

X

k=1

c ˆ

_k

φ

_k

(t) Dimension réduite quand K < p

(13)

A partir de l’échantillon Y = (Y (t

₁

) , · · · , Y (t

p

))

⁰

, on crée une combinaison linéaire de K fonctions de base fixées d’avance {φ

₁

, · · · , φ

_K

} tel que

Y (t ) =

K

X

k=1

c

_k

φ

_k

(t) + ε (t) .

L’estimation des paramètres c = (c

₁

, · · · , c

_K

)

⁰

est réalisée par régression des moindres carrés en minimisant :

SSE (c) = (Y − Φc)

⁰

(Y − Φc) Solution similaire à celle d’une régression classique :

c ˆ = Φ

⁰

Φ

−1

ΦY

Finalement la courbe est reconstituée avec : Y ˆ (t) =

K

X

k=1

c ˆ

_k

φ

_k

(t)

Dimension réduite quand K < p

(14)

A partir de l’échantillon Y = (Y (t

₁

) , · · · , Y (t

p

))

⁰

, on crée une combinaison linéaire de K fonctions de base fixées d’avance {φ

₁

, · · · , φ

_K

} tel que

Y (t ) =

K

X

k=1

c

_k

φ

_k

(t) + ε (t) .

L’estimation des paramètres c = (c

₁

, · · · , c

_K

)

⁰

est réalisée par régression des moindres carrés en minimisant :

SSE (c) = (Y − Φc)

⁰

(Y − Φc) Solution similaire à celle d’une régression classique :

c ˆ = Φ

⁰

Φ

−1

ΦY Finalement la courbe est reconstituée avec :

Y ˆ (t) =

K

X

k=1

c ˆ

_k

φ

_k

(t)

Dimension réduite quand K < p

(15)

A partir de l’échantillon Y = (Y (t

₁

) , · · · , Y (t

p

))

⁰

, on crée une combinaison linéaire de K fonctions de base fixées d’avance {φ

₁

, · · · , φ

_K

} tel que

Y (t ) =

K

X

k=1

c

_k

φ

_k

(t) + ε (t) .

L’estimation des paramètres c = (c

₁

, · · · , c

_K

)

⁰

est réalisée par régression des moindres carrés en minimisant :

SSE (c) = (Y − Φc)

⁰

(Y − Φc) Solution similaire à celle d’une régression classique :

c ˆ = Φ

⁰

Φ

−1

ΦY Finalement la courbe est reconstituée avec :

Y ˆ (t) =

K

X

k=1

c ˆ

_k

φ

_k

(t)

Dimension réduite quand K < p

(16)

On peut rajouter des contraintes (régularité, positivité, convexité, monotonie, . . .)

SSE (f ) = 1 n

n

X

i=1

(y

_i

− f (t

_i

))

²

+ λ × kLf k

²

Gorgones soumises à un stress thermique

● ● ●

●

● ●

●

● ●

●

● ●

● ● ● ● ● ● ● ●

● ● ● ●

● ● ●

● ● ● ●

0 5 10 15 20 25 30 35

0.0 0.2 0.4 0.6 0.8 1.0

Time (d)

death prob.

0 5 10 15 20 25 30 35

−0.02 0.00 0.02 0.04 0.06 0.08

Time (d)

Velocity

0 5 10 15 20 25 30 35

−0.2

−0.1 0.0 0.1

Time (d)

Acceleration

(17)

Traitement des courbes Notions de base

Statistiques de base

Temp. / Prec. en France (moyenne 1960-2000)

−5 0 5 10

4244464850

FRANCE

long.

lat.

ABBV

AGEN

AJAC ALNC

ANGS

ANGM

BAST BSCO

BIAZ BORD

BGES BRST

CAEN

CHRX CHBG

CLFD DIJO DUNK

GOUR EMBR LPUY GREN LILL

LIMG LYON

MRS METZ

MLAU MTDM

MONT

MTPL MULH NACY

NTES

NIMS NICE ORLS

PARI

PERP PLUM

PTRS REIM

RNES RUEN

STDZ

STGI SETE

STAB STBG

TRBS TLN

TLSE TOUR

TROY

VCHY

●●

●

JANFEBMAR APRMAYJUNJULAUGSEPOCTNOV DEC

0510152025

Temperature (°C)

●

●●

●

●●

●

●●

●

JANFEBMAR APRMAYJUNJULAUGSEPOCTNOV DEC

050100150

Precipitation (mm)

(18)

Fonction moyenne et fonction variance Température en France

Y ¯ (t) = 1 n

n

X

i=1

Y

_i

(t) , σ

_Y

(t) = 1 n

n

X

i=1

Y

_i

(t) − Y ¯ (t)

2

(19)

Opérateur de corrélation & de Variance Fonction bivariée de corrélation

ρ (s, t) = 1 n

n

X

i=1

Y

_i

(t) − Y ¯ (t) σ (t)

Y

_i

(s) − Y ¯ (s) σ (s)

(20)

Traitement des courbes Choix de la base

Ajustement des données

L’augmentation du nombre K de fonctions de base conduit à un meilleur ajustement

Le choix de la base dépend des données (base de Fourier, B-splines, polynômes, ...)

Température ajustée à Marseille \ Base de Fourier

(21)

Analyse de la variabilité Courbes uniques

ACP fonctionnelles

Objectifs : En utilisant les données n

Y ˆ

₁

, · · · , Y ˆ

_n

o

trouver la meilleure projection 2D des observations qui sont des courbes La position des individus dans l’espace factoriel est interprétée comme un changement dans la forme des courbes

Le problème de projection se résume à la décomposition spectrale de l’opérateur de variance-covariance V

_Y

(décomposition aux valeurs propres)

V

_Y

ξ

_k

= λ

_k

ξ

_k

ce qui est équivalent à

Z

D

v

_Y

(s, t) ξ

_k

(s) ds = λ

_k

ξ

_k

(t) , t ∈ D

Fonctions ξ

_k

(t) associées aux valeurs propres λ

_k

.

Décomposition aux v. p. des courbes équivalente à celle réalisée

sur les coefs de la décomposition à une métrique Φ prêt.

(22)

ACP fonctionnelles

Objectifs : En utilisant les données n

Y ˆ

₁

, · · · , Y ˆ

_n

o

trouver la meilleure projection 2D des observations qui sont des courbes La position des individus dans l’espace factoriel est interprétée comme un changement dans la forme des courbes

Le problème de projection se résume à la décomposition spectrale de l’opérateur de variance-covariance V

_Y

(décomposition aux valeurs propres)

V

_Y

ξ

_k

= λ

_k

ξ

_k

ce qui est équivalent à

Z

D

v

_Y

(s, t) ξ

_k

(s) ds = λ

_k

ξ

_k

(t) , t ∈ D

Fonctions ξ

_k

(t) associées aux valeurs propres λ

_k

.

Décomposition aux v. p. des courbes équivalente à celle réalisée sur les coefs de la décomposition à une métrique Φ prêt.

(23)

ACP fonctionnelles

Objectifs : En utilisant les données n

Y ˆ

₁

, · · · , Y ˆ

_n

o

trouver la meilleure projection 2D des observations qui sont des courbes La position des individus dans l’espace factoriel est interprétée comme un changement dans la forme des courbes

Le problème de projection se résume à la décomposition spectrale de l’opérateur de variance-covariance V

_Y

(décomposition aux valeurs propres)

V

_Y

ξ

_k

= λ

_k

ξ

_k

ce qui est équivalent à

Z

D

v

_Y

(s, t) ξ

_k

(s) ds = λ

_k

ξ

_k

(t) , t ∈ D

Fonctions ξ

_k

(t) associées aux valeurs propres λ

_k

.

Décomposition aux v. p. des courbes équivalente à celle réalisée

sur les coefs de la décomposition à une métrique Φ prêt.

(24)

Représentations de l’ACPF

La représentation 2D des individus est classique Représentation factorielle 2D

(25)

Représentation de l’ACPF

Représentation 2D des variables comme une perturbation de la courbe moyenne Y ¯ (t) ± √

λ

_k

ξ

_k

(t) lors d’un déplacement le long

d’un axe factoriel

(26)

Analyse de la variabilité Fonction bivariée

Autre representation

Analyse procustéenne + SPM entre espace géographique et espace projectif

De la position GPS à l’espace factoriel

(27)

Analyse de la variabilité Fonction bivariée

Extension : splines vectorielles

Comparaison sorties de modèles / images MODIS

Permet d’évaluer spatialement l’efficacité d’un modèle

Trouver une mesure de distance entre deux surfaces

Nécessité de régulariser les deux objets

(28)

Analyse de la variabilité smoothing with TPS

Modèle OPA, champ observé & approximé

min (

_n

X

i=1

kf(x

_i

) − f

_i

k

²

+ λ × Z

αk∇divfk

²

+ βk∇rotfk

²

)

α >> β pour un système hautement turbulent (variabilité forte du rotationnel)

α << β pour une activité verticale intense (flot localement

divergent).

(29)

Functional linear prediction introduction

Prec. profile / Temp. profile

Starting from a sample (P

_i

, T

_i

)

_i=1,···_,n

, we want to explain variations of profile P using functional variable T as predictor.

2 4 6 8 10 12

0510152025

temperature curves

time

T (°C)

2 4 6 8 10 12

050100150

Precipitation curves

time

P (mm)

(30)

The simplest form of the model is

P

_i

(t) = α + βT

_i

(t) + ε

_i

(t)

Example : P(t) = 3 × T (t) + 4

(31)

OK but . . . limited flexibility

and curve shape cannot be modified. Try a more complex version :

Example : P(t) = β(t) × T (t) + α(t)

(32)

OK but . . . limited flexibility

and curve shape cannot be modified. Try a more complex version : Example : P(t) = β(t) × T (t) + α(t)

(33)

Functional linear prediction Basics

In both cases, solutions are those of standard linear regression Interactions at same time t

Full Functional Linear Model :

P

_i

(t) = α (t) + B (T

_i

) (t) + ε

i

(t) where

B (T

_i

) (t) = Z

β(s, t)T

_i

(s)ds

is a linear operator of finite trace (HS) with kernel function β. Functional coefficients α and β are solution of the normal equations :

( B ˆ V ˆ

_T

= V ˆ

_TP

ˆ

α = µ ˆ

_P

− B ˆ (ˆ µ

_T

)

(34)

In both cases, solutions are those of standard linear regression Interactions at same time t

Full Functional Linear Model :

P

_i

(t) = α (t) + B (T

_i

) (t) + ε

i

(t) where

B (T

_i

) (t) = Z

β(s, t)T

_i

(s)ds

is a linear operator of finite trace (HS) with kernel function β.

Functional coefficients α and β are solution of the normal equations :

( B ˆ V ˆ

_T

= V ˆ

_TP

ˆ

α = µ ˆ

_P

− B ˆ (ˆ µ

_T

)

(35)

PROBLEM : In infinite dimension, no solution can be found when computing :

B ˆ = ˆ V

_TP

V ˆ

_T⁻¹

SOLUTION : the use of regularization method

1

Tikhonov

B ˆ = ˆ V

TP

κI + ˆ V

T

−1

2

Spectral cut regularized inverse B ˆ = ˆ V

_TP

V ˆ

_T^†

with

V ˆ

_T^†

=

k

X

j=1

λ

⁻¹_j

(ξ

_j

⊗ ξ

_j

)

Precipitation curve in new town n + 1 is predicted with

P ˆ

_n+1

= ˆ α(t) + ˆ B(T

_n+1

)

(36)

PROBLEM : In infinite dimension, no solution can be found when computing :

B ˆ = ˆ V

_TP

V ˆ

_T⁻¹

SOLUTION : the use of regularization method

1

Tikhonov

B ˆ = ˆ V

TP

κI + ˆ V

T

−1

2

Spectral cut regularized inverse B ˆ = ˆ V

_TP

V ˆ

_T^†

with

V ˆ

_T^†

=

k

X

j=1

λ

⁻¹_j

(ξ

_j

⊗ ξ

_j

)

Precipitation curve in new town n + 1 is predicted with P ˆ

_n+1

= ˆ α(t) + ˆ B(T

_n+1

)

(37)

Example : Regularized V

_T

Vx with 3 eigenfunctions

Month

2.2 2.2

2.2

2.4 2.4

2.4

2.4 2.6

2.6 2.6

2.6 2.8

2.8 2.8

2.8 3

3

3 3.2

3.2

3.2 3.4

3.4

3.4 3.6

3.6

3.6 3.6

3.6 3.8

3.8

3.8 3.8

4 4

4

4 4

4.2 4.2

4.2

4.2 4.4

4.4 4.4

4.4

Jan Feb Mar Avr May Jun Jul Aug Sep Oct Nov Dec

JanFebMarAvrMayJunJulAugSepOctNovDec

2.2

2.2 2.2

2.2

2.4 2.4

2.4

2.4 2.6

2.6 2.6

2.6 2.8

2.8 2.8

2.8 3

3

3 3.2

3.2

3.2 3.4

3.4

3.4 3.6

3.6

3.6 3.6

3.6 3.8

3.8

3.8 4

4

4 4.2

4.2

4.2 4.4

4.4 4.4

4.4

Month

Month Co

var iance

(38)

Functional linear prediction Predictions

2 4 6 8 10 12

050100150200

TOUR

time

value

●

● ●●

2 4 6 8 10 12

050100150200

EMBR

time

value

●

●●

●

● ●

●

● ●

2 4 6 8 10 12

050100150200

NTES

time

value

●

● ●●

● ● ●

●

● ●

●

2 4 6 8 10 12

050100150200

LYON

time

value

● ● ●

●

●●

●

2 4 6 8 10 12

050100150200

CHRX

time

value

●

● ●

●

● ●●

●

●●

2 4 6 8 10 12

050100150200

RUEN

time

value

●

● ●

●

2 4 6 8 10 12

050100150200

MULH

time

value

●●

●

● ● ●● ●

●

2 4 6 8 10 12

050100150200

CLFD

time

value

●

● ●

●

● ●

●

(39)

Functional linear prediction Predictions

−5 0 5 10

4244464850

RMSE

long.

lat.

ABBV

AGEN

AJAC ALNC

ANGS

ANGM

BAST BSCO

BIAZ BORD

BGES BRST

CAEN

CHRX CHBG

CLFD DIJO DUNK

GOUR EMBR LPUY GREN LILL

LIMG LYON

MRS METZ

MTDM MLAU

MONT

MTPL

MULH NACY

NTES

NIMS NICE ORLS

PARI

PERP PLUM

PTRS

REIM

RNES RUEN

STDZ

STGI SETE

STAB STBG

TRBS TLN

TLSE TOUR

TROY

VCHY

(40)

Elephant seals dataset introduction

Since a decade, marine mammals constitute valuable auxiliaries for operational oceanography

(41)

Since a decade, marine mammals constitute valuable auxiliaries

for operational oceanography

(42)

(43)

Elephant seals dataset

(44)

(45)

(46)

(47)

(48)

(49)

(50)

(51)

(52)

(53)

(54)

(55)

(56)

(57)

(58)

(59)

Elephant seals dataset Light & ChlA data

Sampled data

C(p) = Z

Pmax

0

β(s, t)L

⁰

(s)ds + ε(t)

(60)

Elephant seals dataset Variance Operators

ChlA Variance function

Variance function Chl−A (µg/l)

Depth (m)

0.05

0.1

0.15 0.2 0.25

0.3 0.35 0.4

0.45 0.5

0.55

−175 −141 −107 −90 −73 −56 −39 −22 −5

−175

−158

−141

−124

−107

−90

−73

−56

−39

−22

−5

Depth (m) Depth (m)

Co var iance

(61)

Elephant seals dataset Variance Operators

Cross-covariance function

(62)

Elephant seals dataset Fitted data

Using the model Prediction

0.0 1.0 2.0 3.0

−150−100−500

Chl a (µg/l)

Depth (m)

1

0.0 1.0 2.0 3.0

−150−100−500

Chl a (µg/l)

Depth (m)

2

0.0 1.0 2.0 3.0

−150−100−500

Chl a (µg/l)

Depth (m)

3

0.0 1.0 2.0 3.0

−150−100−500

Chl a (µg/l)

Depth (m)

4

0.0 1.0 2.0 3.0

−150−100−500

Chl a (µg/l)

Depth (m)

5

0.0 1.0 2.0 3.0

−150−100−500

Chl a (µg/l)

Depth (m)

6

0.0 1.0 2.0 3.0

−150−100−500

Chl a (µg/l)

Depth (m)

7

0.0 1.0 2.0 3.0

−150−100−500

Chl a (µg/l)

Depth (m)

8

0.0 1.0 2.0 3.0

−150−100−500

Chl a (µg/l)

Depth (m)

9

0.0 1.0 2.0 3.0

−150−100−500

Chl a (µg/l)

Depth (m)

10

0.0 1.0 2.0 3.0

−150−100−500

Chl a (µg/l)

Depth (m)

11

0.0 1.0 2.0 3.0

−150−100−500

Chl a (µg/l)

Depth (m)

12

(63)

Spatial functional linear model Position of the problem

Consider a collection of curves E = {Y

_i

, i = 1, . . . , n} sampled at n random spatial position inside a domain D

Each observation is a realization of a functional random variable Y

_i

at position x

_i

∈ D, where H is an Hilbert space equipped with standard inner product h·, ·i

_H

and the associated norm k·k

_H

.

?

(64)

If we admit second order stationarity assumptions, the mean function µ (t) is the same at any point of the domain

E (Y

_i

) = µ, ∀x

_i

∈ D

The covariance operator C

_ij

such that C

_ij

= E

(Y

_i

− µ) ⊗ Y

_j

− µ

only depends on the vector h between a function Y

_i

at position x

_i

and Y

_j

at position x

_j

= x

_i

− h

C

_ij

(f ) = E

(Y

_i

− µ) ⊗ Y

_j

− µ (f)

= E

hf, Y

_i

− µi

_H

Y

_j

− µ

, f ∈ H

(65)

If we admit second order stationarity assumptions, the mean function µ (t) is the same at any point of the domain

E (Y

_i

) = µ, ∀x

_i

∈ D

The covariance operator C

_ij

such that C

_ij

= E

(Y

_i

− µ) ⊗ Y

_j

− µ

only depends on the vector h between a function Y

_i

at position x

_i

and Y

_j

at position x

_j

= x

_i

− h

C

_ij

(f ) = E

(Y

_i

− µ) ⊗ Y

_j

− µ (f)

= E

hf, Y

_i

− µi

_H

Y

_j

− µ

, f ∈ H

(66)

If we admit second order stationarity assumptions, the mean function µ (t) is the same at any point of the domain

E (Y

_i

) = µ, ∀x

_i

∈ D

The covariance operator C

_ij

such that C

_ij

= E

(Y

_i

− µ) ⊗ Y

_j

− µ

only depends on the vector h between a function Y

_i

at position x

_i

and Y

_j

at position x

_j

= x

_i

− h

C

_ij

(f ) = E

(Y

_i

− µ) ⊗ Y

_j

− µ (f )

= E

hf, Y

_i

− µi

_H

Y

_j

− µ

, f ∈ H

(67)

We seek to estimate Y

₀

, the curve at unknown location x

₀

, with the linear model

Y b

₀

=

n

X

i=1

B

_i

(Y

_i

)

with B

_i

:H → H being linear operators such that B

_i

(f ) (t) =

Z

τ

β

_i

(s, t) f (s) ds, ∀f ∈ H, ∀t ∈ τ.

The B

_i⁰

s ∈ B (H) the space of HS operators equipped with the inner product

∀ (A, B) ∈ B

²

(H) , hA, Bi

_B

= X

k,l

hA (u

_k

) , u

_l

i

_H

hB (u

_k

) , u

_l

i

_H

where the functions u

_k

form an orthonormal basis of H.

(68)

We seek to estimate Y

₀

, the curve at unknown location x

₀

, with the linear model

Y b

₀

=

n

X

i=1

B

_i

(Y

_i

)

with B

_i

:H → H being linear operators such that B

_i

(f ) (t) =

Z

τ

β

_i

(s, t) f (s) ds, ∀f ∈ H, ∀t ∈ τ.

The B

⁰_i

s ∈ B (H) the space of HS operators equipped with the inner product

∀ (A, B) ∈ B

²

(H) , hA, Bi

_B

= X

k,l

hA (u

_k

) , u

_l

i

_H

hB (u

_k

) , u

_l

i

_H

where the functions u

_k

form an orthonormal basis of H.

(69)

Spatial functional linear model Conditions of unbiasedness

Looking for an unbiased estimator leads to develop the following equation

E

Y b

₀

− Y

₀

= 0,

which, under second order stationarity assumptions, gives the unbiasedness condition

"

_n

X

i=1

B

_i

#

(µ) = µ.

Interesting, but the mean function µ is unknown !

(70)

Looking for an unbiased estimator leads to develop the following equation

E

Y b

₀

− Y

₀

= 0,

which, under second order stationarity assumptions, gives the unbiasedness condition

"

_n

X

i=1

B

_i

#

(µ) = µ.

Interesting, but the mean function µ is unknown !

(71)

Trying to extend the previous condition to the more general constraint

"

_n

X

i=1

B

_i

#

(f) = f, ∀f ∈ H

leads to find operators B

_i

checking

n

X

i=1

B

_i

= I.

Damn it ! The Identity operator is NOT an integral operator over H

(72)

Trying to extend the previous condition to the more general constraint

"

_n

X

i=1

B

_i

#

(f) = f, ∀f ∈ H

leads to find operators B

_i

checking

n

X

i=1

B

_i

= I.

Damn it ! The Identity operator is NOT an integral operator over H

(73)

Trying to extend the previous condition to the more general constraint

"

_n

X

i=1

B

_i

#

(f) = f, ∀f ∈ H

leads to find operators B

_i

checking

n

X

i=1

B

_i

= I.

Damn it ! The Identity operator is NOT an integral operator over H

(74)

Spatial functional linear model A smoother Hilbert space

Reconsider the last condition with

n

X

i=1

B

_i

= K

where K would be a linear operator which should verify

K (f) = f , ∀f ∈ H K ∈ B (H) ⇔ kK k

²_B

< ∞

Can we found a Hilbert space where such an operator exists ? The answer is YES.

(75)

Spatial functional linear model A smoother Hilbert space

Reconsider the last condition with

n

X

i=1

B

_i

= K

where K would be a linear operator which should verify

K (f) = f , ∀f ∈ H K ∈ B (H) ⇔ kK k

²_B

< ∞

Can we found a Hilbert space where such an operator exists ?

The answer is YES.

(76)

Spatial functional linear model Reproducing Kernel Hilbert space

Let now H be a RKHS. This space owns a reproducing kernel function κ defined as

κ : τ × τ → R

(s, t) 7−→ κ (s, t)

which checks the following conditions

∀t ∈ τ, κ (·, t) ∈ H

∀t ∈ τ, ∀f ∈ H, hκ (., t) , f i

_H

= f (t) .

The operator K attached to the kernel function κ verifies K (f ) = f It is a HS operator ⇔ kK k

²_B

< ∞

The form of K is entirely attached to the choice of h·, ·i

_H

.

(77)

Spatial functional linear model Reproducing Kernel Hilbert space

Let now H be a RKHS. This space owns a reproducing kernel function κ defined as

κ : τ × τ → R

(s, t) 7−→ κ (s, t)

which checks the following conditions

∀t ∈ τ, κ (·, t) ∈ H

∀t ∈ τ, ∀f ∈ H, hκ (., t) , f i

_H

= f (t) .

The operator K attached to the kernel function κ verifies K (f ) = f It is a HS operator ⇔ kK k

²_B

< ∞

The form of K is entirely attached to the choice of h·, ·i

_H

.

(78)

Spatial functional linear model A broader form for the functional linear model

We now consider the problem of finding an estimator

Y b

₀

=

n

X

i=1

B

_i

(Y

_i

)

for which B

_i

(Y

_i

) (t) = hβ

_i

(., t) , Y

_i

i

_H

The constraint for unbiasedness can be extended to

n

X

i=1

B

_i

= K

for suitable choice of a RKHS H possessing a kernel operator K which plays the role of an identity operator

The objective is now to find estimates of B

_i

, i = 1, ..., n by constrained minimization of

E

Y b

₀

− Y

₀

2 H

(79)

We now consider the problem of finding an estimator

Y b

₀

=

n

X

i=1

B

_i

(Y

_i

)

for which B

_i

(Y

_i

) (t) = hβ

_i

(., t) , Y

_i

i

_H

The constraint for unbiasedness can be extended to

n

X

i=1

B

_i

= K

for suitable choice of a RKHS H possessing a kernel operator K which plays the role of an identity operator

The objective is now to find estimates of B

_i

, i = 1, ..., n by constrained minimization of

E

Y b

₀

− Y

₀

2 H

(80)

We now consider the problem of finding an estimator

Y b

₀

=

n

X

i=1

B

_i

(Y

_i

)

for which B

_i

(Y

_i

) (t) = hβ

_i

(., t) , Y

_i

i

_H

The constraint for unbiasedness can be extended to

n

X

i=1

B

_i

= K

for suitable choice of a RKHS H possessing a kernel operator K which plays the role of an identity operator

The objective is now to find estimates of B

_i

, i = 1, ..., n by constrained minimization of

E

Y b

₀

− Y

₀

2 H

(81)

Spatial functional linear model Constrained optimization

Seeking the minimizer of E

Y b

₀

− Y

₀

2

H

under the constraint P

n

i=1

B

_i

= K leads to solve the constrained problem min

(B₁,...,Bn)⁰∈Bⁿ(H)

X

i

X

j

B

_j

, C

_ij

B

_i

B

+ hK , C

₀₀

i

_B

− 2 X

i

hC

_i0

, B

_i

i

_B

The constraint can be included into the minimization problem by constructing the functional F such that

F (B

₁

, ..., B

_n

, Λ) = X

i

X

j

B

_j

, C

_ij

B

_i

B

+ hK , C

₀₀

i

_B

− 2 X

i

hC

_i0

, B

_i

i

_B

+2 ×

* Λ, X

i

B

_i

− K +

B

where Λ ∈ B (H) acts as a Lagrange multiplier

(82)

Seeking the minimizer of E

Y b

₀

− Y

₀

2

H

under the constraint P

n

i=1

B

_i

= K leads to solve the constrained problem min

(B₁,...,Bn)⁰∈Bⁿ(H)

X

i

X

j

B

_j

, C

_ij

B

_i

B

+ hK , C

₀₀

i

_B

− 2 X

i

hC

_i0

, B

_i

i

_B

The constraint can be included into the minimization problem by constructing the functional F such that

F (B

₁

, ..., B

_n

, Λ) = X

i

X

j

B

_j

, C

_ij

B

_i

B

+ hK , C

₀₀

i

_B

− 2 X

i

hC

_i0

, B

_i

i

_B

+2 ×

* Λ, X

i

B

_i

− K +

B

where Λ ∈ B (H) acts as a Lagrange multiplier

(83)

Let δ

^∆_B

i

F be the Gateaux differential of F at B

_i

in direction ∆ such that

δ

_B^∆

i

F (B

₁

, ..., B

_i

, ..., B

n

, Λ) =

lim

_ε→0¹_ε

[F (B

₁

, ..., B

_i

+ ε∆, ..., B

_n

, Λ) − F (B

₁

, ..., B

_i

, ..., B

_n

, Λ)]

This limit exists for all ∆ and, thanks to the self-adjoint property of C

_ij

, can easily be computed for all i = 1, ..., n as

δ

^∆_B

i

F (B

₁

, ..., B

_i

, ..., B

_n

, Λ) = 2

* X

j

C

_ij

B

_j

− C

_i0

+ Λ, ∆ +

B

δ

_Λ^∆

F (B

₁

, ..., B

_i

, ..., B

_n

, Λ) =

* X

i

B

_i

− K , ∆ +

B