Functional PLS regression with functional response
C. PREDA1, J. SCHILTZ2
1École Polytechnique Universitaire de Lille / INRIA Lille Nord Europe, France
2Luxembourg School of Finance, University of Luxembourg, Luxembourg
Introduction
Predictor: $X = \{X_t,\ t \in T_X\}$, $X_t : \Omega \to \mathbb{R}$; response: $Y = \{Y_t,\ t \in T_Y\}$, $Y_t : \Omega \to \mathbb{R}$.
Hypotheses:
- $E(X_t^2) < \infty$ and $E(Y_t^2) < \infty$,
- $X$ and $Y$ are $L_2$-continuous,
- $\forall \omega \in \Omega$: $\{X_t(\omega),\ t \in T_X\} \in L_2(T_X)$ and $\{Y_t(\omega),\ t \in T_Y\} \in L_2(T_Y)$,
- $E(X_t) = 0\ \forall t \in T_X$ and $E(Y_t) = 0\ \forall t \in T_Y$.
The linear model is written as
$$Y(s) = \int_{T_X} \beta(s, t)\, X(t)\, dt + \varepsilon_s, \qquad s \in T_Y,$$
where $\beta$ is the regression coefficient function and $\{\varepsilon_s\}_{s \in T_Y}$ is the residual process.
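To fix ideas, here is a minimal numerical sketch of this model on a discrete grid (all sizes, grids, and the noise level are hypothetical; rectangle-rule quadrature throughout):

```python
import numpy as np

n, p, q = 100, 50, 60                  # hypothetical sample size and grid sizes
t_x = np.linspace(0.0, 1.0, p)         # grid on T_X
t_y = np.linspace(0.0, 1.0, q)         # grid on T_Y
dt = t_x[1] - t_x[0]                   # quadrature weight on T_X

rng = np.random.default_rng(0)
X = rng.standard_normal((n, p))        # placeholder centered predictor curves
beta = (t_y[:, None] - t_x[None, :]) ** 2   # an example surface beta(s, t)

# Y(s) = int_{T_X} beta(s, t) X(t) dt + eps_s, approximated by a Riemann sum
Y = X @ beta.T * dt + 0.05 * rng.standard_normal((n, q))
```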
Least squares criterion: the Wiener–Hopf equation
$$\mathrm{Cov}(Y(s), X(t)) = \int_{T_X} \mathrm{Cov}(X(t), X(r))\, \beta(s, r)\, dr, \qquad t \in T_X,\ s \in T_Y.$$
In general, this equation has no unique solution!
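A tiny numerical illustration of the non-uniqueness (hypothetical sizes): with fewer curves than grid points, the empirical covariance operator is rank-deficient, so the discretized Wiener–Hopf system is singular.

```python
import numpy as np

n_obs, p_pts = 30, 50                   # fewer curves than grid points
W = np.random.default_rng(0).standard_normal((n_obs, p_pts))
Cxx = (W.T @ W) / n_obs                 # empirical Cov(X(t), X(r)) on the grid
print(np.linalg.matrix_rank(Cxx))       # rank <= n_obs < p_pts: singular system
```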
Solutions:
- Functional principal components
- Basis expansion approximation (Malfait and Ramsay, 2003)
- Penalized functional linear model (Harezlak et al., 2007; Eilers and Marx, 1996)
Partial Least Squares: Tucker criterion
$$\max_{\substack{w \in L_2(T_X),\ \|w\|_{L_2(T_X)} = 1 \\ c \in L_2(T_Y),\ \|c\|_{L_2(T_Y)} = 1}} \mathrm{Cov}^2\left( \int_{T_X} X_t\, w(t)\, dt,\ \int_{T_Y} Y_t\, c(t)\, dt \right).$$
Consider the cross-covariance operators
$$C_{XY} : L_2(T_Y) \to L_2(T_X), \quad (C_{XY} f)(t) = \int_{T_Y} E(X_t Y_s)\, f(s)\, ds, \qquad f \in L_2(T_Y),\ t \in T_X,$$
$$C_{YX} : L_2(T_X) \to L_2(T_Y), \quad (C_{YX} g)(t) = \int_{T_X} E(Y_t X_s)\, g(s)\, ds, \qquad g \in L_2(T_X),\ t \in T_Y.$$
The optimal weight functions $w$ and $c$ of the Tucker criterion are the eigenfunctions associated with the largest eigenvalue of
$$U_X = C_{XY} \circ C_{YX} :\ U_X w = \lambda w, \qquad U_Y = C_{YX} \circ C_{XY} :\ U_Y c = \lambda c.$$
First PLS component:
$$t_1 = \int_{T_X} X_t\, w(t)\, dt.$$
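Numerically, $t_1$ can be obtained by discretizing $U_X$ through the empirical cross-covariance; a sketch continuing the grid setup above (uniform grids assumed, so the discretized operator stays symmetric):

```python
dy = t_y[1] - t_y[0]                   # quadrature weight on T_Y
Cxy = (X.T @ Y) / n                    # p x q estimate of E(X_t Y_s)

Ux = (Cxy * dy) @ (Cxy.T * dt)         # discretized U_X = C_XY o C_YX
eigvals, eigvecs = np.linalg.eigh(Ux)  # ascending eigenvalues

w = eigvecs[:, -1]                     # eigenfunction of the largest eigenvalue
w /= np.sqrt(np.sum(w ** 2) * dt)      # normalize: ||w||_{L2(T_X)} = 1

t1 = X @ w * dt                        # first PLS component (one score per curve)
```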
Escoufier's operators:
$$W^X Z = \int_{T_X} E(X_t Z)\, X_t\, dt, \qquad W^Y Z = \int_{T_Y} E(Y_t Z)\, Y_t\, dt,$$
for every real random variable $Z$. The first PLS component satisfies
$$W^X W^Y t_1 = \lambda^{(1)}_{\max}\, t_1.$$
PLS iteration:
Let $X_0 = \{X_{0,t} = X_t,\ \forall t \in T_X\}$ and $Y_0 = \{Y_{0,t} = Y_t,\ \forall t \in T_Y\}$. At step $h$, $h \geq 1$, the component $t_h$ is the eigenvector associated with the largest eigenvalue of $W^X_{h-1} W^Y_{h-1}$:
$$W^X_{h-1} W^Y_{h-1}\, t_h = \lambda^{(h)}_{\max}\, t_h.$$
Simple linear regression on $t_h$ then deflates both processes:
$$X_{h,t} = X_{h-1,t} - p_h(t)\, t_h, \quad t \in T_X, \qquad Y_{h,t} = Y_{h-1,t} - c_h(t)\, t_h, \quad t \in T_Y.$$
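The whole iteration fits in a few lines; below is a minimal discretized sketch under the same uniform-grid assumptions as above (not the exact implementation behind the results here). The orthogonality stated in a) of the Proposition below can be checked directly on the returned components.

```python
import numpy as np

def functional_pls(X, Y, dt, dy, n_comp):
    """Minimal discretized functional PLS: returns the scores t_h and the
    grid values of the regression functions p_h and c_h."""
    n = X.shape[0]
    Xh, Yh = X.copy(), Y.copy()
    T, P, C = [], [], []
    for _ in range(n_comp):
        Cxy = (Xh.T @ Yh) / n                  # current cross-covariance
        _, vecs = np.linalg.eigh((Cxy * dy) @ (Cxy.T * dt))
        w = vecs[:, -1]                        # dominant weight function
        w /= np.sqrt(np.sum(w ** 2) * dt)      # normalize in L2(T_X)
        t = Xh @ w * dt                        # component t_h
        p = (Xh.T @ t) / (t @ t)               # p_h(.) from regression on t_h
        c = (Yh.T @ t) / (t @ t)               # c_h(.) from regression on t_h
        Xh = Xh - np.outer(t, p)               # X_{h,t} = X_{h-1,t} - p_h(t) t_h
        Yh = Yh - np.outer(t, c)               # Y_{h,t} = Y_{h-1,t} - c_h(t) t_h
        T.append(t); P.append(p); C.append(c)
    return np.array(T).T, np.array(P).T, np.array(C).T
```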
Proposition.
a) $\{t_h\}_{h \geq 1}$ forms an orthogonal system in the linear space spanned by $\{X_t\}_{t \in T_X}$;
b) $Y_t = c_1(t)\, t_1 + c_2(t)\, t_2 + \dots + c_h(t)\, t_h + Y_{h,t}$, $t \in T_Y$;
c) $X_t = p_1(t)\, t_1 + p_2(t)\, t_2 + \dots + p_h(t)\, t_h + X_{h,t}$, $t \in T_X$;
d) $E(Y_{h,t}\, t_j) = 0$, $\forall t \in T_Y$, $\forall j = 1, \dots, h$;
e) $E(X_{h,t}\, t_j) = 0$, $\forall t \in T_X$, $\forall j = 1, \dots, h$.
The PLS approximation of $Y$ at step $h$ is
$$\hat{Y}_t = c_1(t)\, t_1 + c_2(t)\, t_2 + \dots + c_h(t)\, t_h = \int_{T_X} \beta_{PLS(h)}(t, s)\, X(s)\, ds.$$
PLS and basis expansion.
$$X(t) \approx \sum_{i=1}^{K} \alpha_i\, \phi_i(t),\ \forall t \in T_X, \qquad Y(t) \approx \sum_{i=1}^{L} \gamma_i\, \psi_i(t),\ \forall t \in T_Y.$$
Metrics:
$$\Phi = \{\phi_{i,j}\}_{1 \leq i,j \leq K},\ \phi_{i,j} = \langle \phi_i, \phi_j \rangle_{L_2(T_X)}, \qquad \Psi = \{\psi_{i,j}\}_{1 \leq i,j \leq L},\ \psi_{i,j} = \langle \psi_i, \psi_j \rangle_{L_2(T_Y)}.$$
Coefficients:
$$\Lambda = \Phi^{1/2} \alpha, \qquad \Pi = \Psi^{1/2} \gamma.$$
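A sketch of this step, with toy monomial bases standing in for the actual $\phi$ and $\psi$ and random coefficients standing in for the fitted $\alpha$, $\gamma$ (everything here is illustrative; `scipy.linalg.sqrtm` provides the matrix square roots):

```python
import numpy as np
from scipy.linalg import sqrtm

grid = np.linspace(0.0, 1.0, 50)
dt = grid[1] - grid[0]
K, L, n = 4, 3, 100

phi_vals = np.array([grid ** i for i in range(K)])   # toy basis on T_X (K x p)
psi_vals = np.array([grid ** i for i in range(L)])   # toy basis on T_Y (L x q)

rng = np.random.default_rng(2)
alpha = rng.standard_normal((n, K))    # basis coefficients of the X curves
gamma = rng.standard_normal((n, L))    # basis coefficients of the Y curves

Phi = phi_vals @ phi_vals.T * dt       # <phi_i, phi_j>_{L2(T_X)}
Psi = psi_vals @ psi_vals.T * dt       # <psi_i, psi_j>_{L2(T_Y)}

Lam = alpha @ np.real(sqrtm(Phi))      # rows of Lambda = Phi^{1/2} alpha
Pi = gamma @ np.real(sqrtm(Psi))       # rows of Pi = Psi^{1/2} gamma
```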
Equivalence with finite-dimensional PLS regression
i) The PLS regression of $Y$ on $X$ is equivalent to the PLS regression of $\Pi$ on $\Lambda$, in the sense that at each step $h$ of the PLS algorithm, $1 \leq h \leq K$, both regressions yield the same PLS components.
ii) If $S$ is the $L \times K$ matrix of regression coefficients of $\Pi$ on $\Lambda$ obtained with the PLS regression at step $h$, $1 \leq h \leq K$,
$$\Pi = S \Lambda + \varepsilon,$$
then the PLS approximation at step $h$ of the regression coefficient function $\beta$ is given by
$$\hat{\beta}_{PLS(h)}(t, s) = \sum_{i=1}^{L} \sum_{j=1}^{K} S_{i,j}\, \psi_i(t)\, \phi_j(s), \qquad (t, s) \in T_Y \times T_X.$$
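With the matrix $S$ from a standard finite-dimensional PLS routine, the formula in ii) becomes one line on the grid (continuing the sketch above; the zero matrix is only a placeholder for fitted coefficients):

```python
S = np.zeros((L, K))                   # placeholder for the fitted L x K matrix
beta_hat = psi_vals.T @ S @ phi_vals   # beta_hat(t, s) = sum_ij S_ij psi_i(t) phi_j(s)
```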
Simulation study
$$X_t = \sum_{i=1}^{K} \alpha_i\, \phi_i(t), \qquad t \in T_X = [0, 1],\ K = 7,$$
where the $\alpha_i$, $i = 1, \dots, K$, are independent random variables uniformly distributed on $[-1, 1]$, and $\phi = \{\phi_i\}_{i=1,\dots,K}$ is a cubic B-spline basis on $[0, 1]$ with equidistant knots. Define
$$Y(t) = \int_0^1 \beta(t, s)\, X_s\, ds + \varepsilon_t, \qquad t \in T_Y = [0, 1],$$
where $\beta(t, s) = (t - s)^2$, $\forall (t, s) \in [0, 1]^2$, and $\varepsilon_t$ is the residual.
One obtains
$$E(X_t) = 0, \quad V(X_t) = \frac{1}{3} \sum_{i=1}^{K} \phi_i^2(t), \qquad \forall t \in [0, 1],$$
$$E(Y_t) = 0, \quad V(Y_t) = \frac{1}{3} \sum_{i=1}^{K} \left( \int_0^1 \beta(t, s)\, \phi_i(s)\, ds \right)^2, \qquad \forall t \in [0, 1].$$
The residual $\{\varepsilon_t\}_{t \in [0,1]}$ is a zero-mean random process such that each $\varepsilon_t$ is normally distributed with variance $\sigma_t^2 > 0$, and $\varepsilon_t$ and $\varepsilon_s$ are independent $\forall (s, t) \in [0, 1]^2$, $s \neq t$. The residual variance $\sigma_t^2$ is chosen so that the signal-to-noise ratio $V(Y_t)/V(\varepsilon_t)$ is controlled; in our simulation we set $V(Y_t)/V(\varepsilon_t) = 0.9$, $\forall t \in [0, 1]$.
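A sketch of this data-generating process (scipy's `BSpline` assumed for the cubic B-spline basis; the sample size and grid are hypothetical):

```python
import numpy as np
from scipy.interpolate import BSpline

K, deg, n, p = 7, 3, 100, 101
grid = np.linspace(0.0, 1.0, p)
dt = grid[1] - grid[0]

# clamped equidistant knots giving K = 7 cubic B-spline basis functions on [0, 1]
knots = np.concatenate((np.zeros(deg), np.linspace(0.0, 1.0, K - deg + 1), np.ones(deg)))
phi_vals = np.array([BSpline(knots, np.eye(K)[i], deg)(grid) for i in range(K)])

rng = np.random.default_rng(1)
alpha = rng.uniform(-1.0, 1.0, size=(n, K))   # alpha_i i.i.d. uniform on [-1, 1]
X = alpha @ phi_vals                          # X_t = sum_i alpha_i phi_i(t)

beta = (grid[:, None] - grid[None, :]) ** 2   # beta(t, s) = (t - s)^2
signal = X @ beta.T * dt                      # int_0^1 beta(t, s) X_s ds

# V(Y_t) from the closed form above; sigma_t^2 chosen so V(Y_t)/V(eps_t) = 0.9
var_Y = np.sum((phi_vals @ beta.T * dt) ** 2, axis=0) / 3.0
Y = signal + rng.standard_normal((n, p)) * np.sqrt(var_Y / 0.9)
```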
Error criteria:
$$SSE_Y = \int_0^1 E(Y_t - \hat{Y}_t)^2\, dt, \qquad SSE_\beta = \int_0^1 \int_0^1 (\beta(t, s) - \hat{\beta}(t, s))^2\, dt\, ds, \qquad V_Y = \int_0^1 V(Y_t)\, dt.$$
$SSE_Y$ is computed using leave-one-out cross-validation, whereas $SSE_\beta$ is computed from the model fitted on all $n$ observations.
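Continuing the simulation sketch, the three criteria translate directly; `Y_hat` and `beta_hat` are assumed to come from a fitted model (e.g. the `functional_pls` sketch above), and the leave-one-out loop for $SSE_Y$ is omitted for brevity:

```python
SSE_beta = np.sum((beta - beta_hat) ** 2) * dt * dt   # double integral over [0, 1]^2
SSE_Y = np.mean((Y - Y_hat) ** 2, axis=0).sum() * dt  # int_0^1 E(Y_t - Y_hat_t)^2 dt
V_Y = Y.var(axis=0).sum() * dt                        # int_0^1 V(Y_t) dt
```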
From the cross-validation scores obtained for each response in the vector $\Pi$, $h = 5$ appears to be a good choice for the number of PLS components. The model with $h = 5$ PLS components gives the following matrix $S$:
$$S = \begin{pmatrix}
-0.0551 & 1.4171 & 1.7687 & 1.5224 & 0.2926 & -0.9046 & -0.6530 \\
0.0559 & 0.4597 & 1.2733 & 0.8860 & 0.0741 & -0.5878 & -0.5195 \\
0.0262 & -0.5313 & 0.4191 & -0.0219 & -0.2213 & -0.0678 & -0.2143 \\
0.8245 & 1.1117 & -0.9000 & 0.3107 & -0.2631 & -0.1077 & 0.2155 \\
1.1491 & 2.3503 & -1.4923 & 0.6788 & -0.2762 & -0.2187 & 0.4060
\end{pmatrix}.$$
One obtains $V_Y \approx 0.00273$, $SSE_Y \approx 0.00038$ and $SSE_\beta \approx 0.00042$. The ratio $SSE_Y / V_Y \approx 0.13919$ shows a good fit of the approximated model.