
Bootstrap of constraint estimators with application to rank estimation.


(1)

Bootstrap of constraint estimators with application to rank estimation.

François Portier

IRMAR-University of Rennes 1

April 17, 2012

François Portier (IRMAR) Bootstrap of constraint estimators April 17, 2012 1 / 20

(2)

Table of contents

1 Bootstrap, hypothesis testing, estimation under constraint

2 Application to rank estimation


(3)

Introduction to Bootstrap

Goal: To reproduce the asymptotic behavior of some estimators.

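Means: Creation of a new sample which "looks like" the original one.

• Bootstrap of Efron [Efron(1982)]:

Suppose that $(X_1, \dots, X_n)$ are i.i.d. with law $P$. We draw $(X_1^*, \dots, X_n^*)$ with respect to the law

$$\hat P = n^{-1} \sum_{i=1}^n \delta_{X_i}.$$

Define $\theta_0 = E[X]$, $\bar X = \frac{1}{n} \sum_{i=1}^n X_i$ and

$$\bar X^* = \frac{1}{n} \sum_{i=1}^n X_i^* = \frac{1}{n} \sum_{i=1}^n N_i X_i \quad \text{with } (N_1, \dots, N_n) \sim \mathrm{mult}(n; 1/n, \dots, 1/n).$$

• Another bootstrap method:

$$\bar X^* = \bar X + \frac{1}{n} \sum_{i=1}^n \varepsilon_i X_i = \frac{1}{n} \sum_{i=1}^n (\varepsilon_i + 1) X_i$$

with $(\varepsilon_i)$ any i.i.d. sequence of standard random variables.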
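The two resampling schemes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the data are simulated, and the weights $\varepsilon_i$ are taken to be standard Gaussian, which is one admissible choice among others.

```python
import numpy as np


def efron_bootstrap_means(x, n_boot, rng):
    """Bootstrap means (1/n) * sum_i N_i X_i with multinomial counts N."""
    n = len(x)
    # Each row holds (N_1, ..., N_n) ~ mult(n; 1/n, ..., 1/n).
    counts = rng.multinomial(n, np.full(n, 1.0 / n), size=n_boot)
    return counts @ x / n


def multiplier_bootstrap_means(x, n_boot, rng):
    """Bootstrap means (1/n) * sum_i (eps_i + 1) X_i with i.i.d. weights."""
    n = len(x)
    # Assumption: eps_i ~ N(0, 1); the slide only asks for "standard" weights.
    eps = rng.standard_normal((n_boot, n))
    return (eps + 1.0) @ x / n


# Simulated sample (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.0, size=200)
efron = efron_bootstrap_means(x, 2000, rng)
multiplier = multiplier_bootstrap_means(x, 2000, rng)

# Both sets of bootstrap means are centred near the sample mean.
print(x.mean(), efron.mean(), multiplier.mean())
# Efron's bootstrap standard deviation is close to sd(x) / sqrt(n).
print(x.std() / np.sqrt(len(x)), efron.std())
```

Drawing multinomial counts is equivalent to resampling the indices with replacement; the count form is used here because it matches the $N_i$ notation of the slide.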


(4)

Introduction to Bootstrap

Example of results with both previous bootstrap methods

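If $\varphi$ is continuously differentiable on a neighborhood of $\theta_0 = E[X]$, and if $P$ has a finite second-order moment, then $\sqrt{n}(\varphi(\bar X^*) - \varphi(\bar X))$ bootstraps $\sqrt{n}(\varphi(\bar X) - \varphi(\theta_0))$, i.e.

$$\mathcal{L}\big(\sqrt{n}(\varphi(\bar X^*) - \varphi(\bar X)) \mid \hat P\big) \underset{n \to \infty}{=} \mathcal{L}\big(\sqrt{n}(\varphi(\bar X) - \varphi(\theta_0)) \mid P\big).$$

Why the bootstrap? It is an alternative to the use of the asymptotic law ([Hall(1992)]) for

Building confidence intervals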

Hypothesis testing (for the choice of quantile)


(5)

Test of equal means: classical bootstrap works

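Assume $\theta_0 \in \mathbb{R}$,

$$H_0: \theta_0 = \theta \quad \text{against} \quad H_1: \theta_0 \neq \theta.$$

To arbitrate, $\|\sqrt{n}(\bar X - \theta)\|^2$ is compared to

• $q_\alpha$, a quantile of the limiting distribution, or
• $q_\alpha^*$, a quantile of the bootstrap statistic.

Level and power:

$$P_{H_0}\big(\|\sqrt{n}(\bar X - \theta)\|^2 > q_\alpha \text{ or } q_\alpha^*\big) \quad \text{and} \quad P_{H_1}\big(\|\sqrt{n}(\bar X - \theta)\|^2 > q_\alpha \text{ or } q_\alpha^*\big)$$

For $q_\alpha$: OK.

For $q_\alpha^*$: $\sqrt{n}(\bar X^* - \bar X)$, which does not depend on $H_0$ or $H_1$, bootstraps $\sqrt{n}(\bar X - \theta_0)$ ⇒ OK.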


(6)

In general classical bootstrap fails

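Assume $\theta_0 \in \mathbb{R}^2$ and $\mathcal{C}$ is the unit circle,

$$H_0: \theta_0 \in \mathcal{C} \quad \text{against} \quad H_1: \theta_0 \notin \mathcal{C}$$

⇒ Constraint estimators:

$$\hat T_n = n \min_{g(\theta) = 0} \|\bar X - \theta\|^2$$

Does the classical bootstrap work?

Under $H_0$, $\hat T_n = |\sqrt{n}(\varphi(\bar X) - \varphi(\theta_0))|^2$ with $\varphi : x \mapsto \min_{g(\theta) = 0} \|x - \theta\|$.

Bootstrap candidate:

$$T_n^* = |\sqrt{n}(\varphi(\bar X^*) - \varphi(\bar X))|^2$$

It cannot work because $\varphi$ is not $C^1$.

⇒ Even if we can bootstrap $\sqrt{n}(\bar X - \theta_0)$, it is not clear that we are able to bootstrap some constraint estimators.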
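To make the lack of differentiability concrete, here is a short worked computation. It assumes the constraint is written as $g(\theta) = \|\theta\|^2 - 1$, which the slide does not state explicitly.

```latex
% Assumption: g(\theta) = \|\theta\|^2 - 1, so \{ g = 0 \} is the unit circle.
\[
  \varphi(x) = \min_{\|\theta\| = 1} \|x - \theta\| = \bigl|\, \|x\| - 1 \,\bigr| .
\]
% For \theta_0 on the circle and a direction h \in \mathbb{R}^2, as t \to 0:
\[
  \|\theta_0 + t h\| = 1 + t \langle \theta_0, h \rangle + O(t^2)
  \quad \Longrightarrow \quad
  \varphi(\theta_0 + t h) = |t| \, |\langle \theta_0, h \rangle| + O(t^2) .
\]
```

The first-order term $|\langle \theta_0, h \rangle|$ is not linear in $h$, so $\varphi$ has a kink along the circle, which is exactly where $\theta_0$ lies under $H_0$.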


(7)

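From now on, $\theta_0 \in \mathbb{R}^p$ is the parameter of interest and there exists $\hat\theta$, a consistent estimator of $\theta_0$. Define the random function

$$\hat Q_n(\theta) = (\hat\theta - \theta)^T \hat S (\hat\theta - \theta).$$

Question

If we can bootstrap $\sqrt{n}(\hat\theta - \theta_0)$, can the law under $H_0$ of

$$\sqrt{n}(\hat\theta_c - \theta_0) \quad \text{with} \quad \hat\theta_c = \operatorname*{argmin}_{g(\theta) = 0} \hat Q_n(\theta)$$

be bootstrapped?

Applications

Statistics of the kind $\min_{g(\theta) = 0} \hat Q_n(\theta)$ to arbitrate between

$$H_0: g(\theta_0) = 0 \quad \text{and} \quad H_1: g(\theta_0) \neq 0$$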


(8)

Intuitively

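Define

$$\theta_c^* = \operatorname*{argmin}_{g(\theta) = 0} Q^*(\theta) \quad \text{and} \quad Q^*(\theta) = (\theta^* - \theta)^T S^* (\theta^* - \theta).$$

As in the traditional bootstrap, we expect results such as: $\sqrt{n}(\theta_c^* - \hat\theta_c)$ bootstraps $\sqrt{n}(\hat\theta_c - \theta_0)$.

The idea: a good choice of $\theta^*$

We try to "reproduce" $H_0$ with

$$\theta^* = \hat\theta_c + \text{"something going to 0 with good speed and variance"}$$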


(9)

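$$\hat\theta_c = \operatorname*{argmin}_{g(\theta) = 0} (\hat\theta - \theta)^T \hat S (\hat\theta - \theta) \quad \text{and} \quad \theta_c^* = \operatorname*{argmin}_{g(\theta) = 0} (\theta^* - \theta)^T S^* (\theta^* - \theta)$$

Assumptions:

1 $\hat S \xrightarrow{P} S$ and $S^* \xrightarrow{P} S$.

2 $S$ is full rank.

3 $g : \mathbb{R}^p \to \mathbb{R}^q$ is $C^1$ on a neighborhood of $\theta_0$ and $J_g(\theta_0)$ is of full rank.

Theorem

Under $H_0$, if $\sqrt{n}(\theta^* - \hat\theta_c)$ bootstraps $\sqrt{n}(\hat\theta - \theta_0) \xrightarrow{d} X$ (Gaussian), then

$$\sqrt{n}(\theta_c^* - \hat\theta_c) \text{ bootstraps } \sqrt{n}(\hat\theta_c - \theta_0) \text{ under } H_0, \tag{1.1}$$

where $\sqrt{n}(\hat\theta_c - \theta_0)$ has a Gaussian limit under $H_0$.

Under $H_1$, we additionally need to assume the existence of $\theta_c$ such that $\hat\theta_c \xrightarrow{a.s.} \theta_c$ with $g(\theta_c) = 0$, to get (1.1).

⇒ $\theta^* = \hat\theta_c + (\theta^*_{\text{classical}} - \hat\theta)$, where $\theta^*_{\text{classical}}$ comes from any classical bootstrap method.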


(10)

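What about $T^*$?

Corollary

Under the previous set of assumptions, under $H_0$ and $H_1$,

$$T^* = n \min_{g(\theta) = 0} (\theta^* - \theta)^T S^* (\theta^* - \theta) \text{ bootstraps } \hat T \text{ under } H_0,$$

where $\hat T$ has a weighted chi-squared limit under $H_0$.

Problem

The assumption for convergence under $H_1$, $\hat\theta_c \xrightarrow{a.s.} \theta_c$, needs to be checked in each case.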


(11)

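Assumptions:

1 $\sqrt{n}(\theta^* - \hat\theta_c)$ bootstraps $\sqrt{n}(\hat\theta - \theta_0) \xrightarrow{d} X$ (Gaussian).

2 $\hat S \xrightarrow{P} S$ and $S^* \xrightarrow{P} S$.

3 $g : \mathbb{R}^p \to \mathbb{R}^q$ is $C^1$ on a neighborhood of $\theta_0$ and $J_g(\theta_0)$ is of full rank.

4 $S$ is full rank.

Corollary 2

The test with null hypothesis

$$H_0: g(\theta_0) = 0 \quad \text{against} \quad H_1: g(\theta_0) \neq 0$$

and associated statistic $\hat T$, with bootstrap calculation of the quantile, is consistent.

For the test procedure, one can draw $T_1^*, \dots, T_B^*$ to estimate $q_\alpha$ by $q_\alpha^*$; we do not reject $H_0$ if $\hat T \leq q_\alpha^*$, and reject $H_0$ otherwise.

In other words, Corollary 2 means:

⇒ The asymptotic level of the test is $\alpha$.

⇒ The power of the test goes to 1.

Rk: This kind of test is pivotal (chi-squared) when $S = \mathrm{Var}(X)^{-1}$.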
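The test procedure can be sketched on the unit-circle example from earlier in the talk. This is a minimal illustration under assumptions that are not in the slides: $S$ is taken to be the identity, $\hat\theta = \bar X$, the classical bootstrap is Efron's, and all function names are hypothetical.

```python
import numpy as np


def project_to_circle(x):
    """Closest point of the unit circle to x (x != 0)."""
    return x / np.linalg.norm(x)


def constrained_stat(theta, n):
    """n * min_{|t| = 1} |theta - t|^2, i.e. the statistic with S = identity."""
    return n * (np.linalg.norm(theta) - 1.0) ** 2


def constrained_bootstrap_test(x, alpha=0.05, n_boot=999, rng=None):
    """Return (T_hat, bootstrap quantile, reject?) for H0: theta_0 on the circle."""
    rng = np.random.default_rng() if rng is None else rng
    n = x.shape[0]
    theta_hat = x.mean(axis=0)
    theta_hat_c = project_to_circle(theta_hat)  # constrained estimator
    t_hat = constrained_stat(theta_hat, n)
    t_star = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # Efron resampling
        theta_classical = x[idx].mean(axis=0)
        # Recentre the bootstrap draw at the constrained estimator.
        theta_star = theta_hat_c + (theta_classical - theta_hat)
        t_star[b] = constrained_stat(theta_star, n)
    q_star = np.quantile(t_star, 1.0 - alpha)
    return t_hat, q_star, t_hat > q_star


# Simulated data under H1: the mean (2, 0) is off the unit circle.
rng = np.random.default_rng(1)
x_h1 = rng.normal(size=(400, 2)) + np.array([2.0, 0.0])
t_hat, q_star, reject = constrained_bootstrap_test(x_h1, rng=rng)
print(t_hat, q_star, reject)
```

Because the bootstrap draws are recentred at $\hat\theta_c$, which lies on the circle, the bootstrap statistics stay bounded even under $H_1$ while $\hat T$ grows with $n$. A naive bootstrap centred at $\hat\theta$ would instead produce bootstrap statistics of the same order as $\hat T$.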


(12)

Application to rank estimation


(13)

Framework and notation

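Goal: Estimation of the rank of a matrix.

Means: Hypothesis testing.

Assumptions

$\hat M$ and $M$ are matrices in $\mathbb{R}^{p \times H}$ such that

$$\sqrt{n}\big(\operatorname{vec}(\hat M) - \operatorname{vec}(M)\big) \xrightarrow{d} \mathcal{N}(0, \Gamma), \qquad \hat\Gamma \xrightarrow{P} \Gamma,$$

together with one of: nothing more, or $\Gamma$ invertible, or $\Gamma = FF^T \otimes GG^T$ invertible.

Notation: $\operatorname{rank}(M) = d_0$, SVD of $M$ and $\hat M$:

$$M = (U_1 \; U_0) \begin{pmatrix} D_1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} V_1^T \\ V_0^T \end{pmatrix} \quad \text{and} \quad \hat M = (\hat U_1 \; \hat U_0) \begin{pmatrix} \hat D_1 & 0 \\ 0 & \hat D_0 \end{pmatrix} \begin{pmatrix} \hat V_1^T \\ \hat V_0^T \end{pmatrix}$$

$P_1 = U_1 U_1^T$, $Q_1 = U_0 U_0^T$, $P_2 = V_1 V_1^T$, $Q_2 = V_0 V_0^T$, and similarly $\hat P_1$, $\hat P_2$, $\hat Q_1$, $\hat Q_2$.

$(\hat\lambda_1, \dots, \hat\lambda_p)$ (resp. $(\lambda_1, \dots, \lambda_p)$) are the singular values of $\hat M$ (resp. $M$) in ascending order.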


(14)

Short review

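For $d = 0, \dots, d_0$, we test

$$H_0: d_0 = d \quad \text{against} \quad H_1: d_0 > d$$

Some statistics

[Li(1991)] $\hat T_1 = n \sum_{k=1}^{p-d} \hat\lambda_k^2 \; \big(= n \|\operatorname{vec}(\hat Q_1 \hat M \hat Q_2)\|^2\big)$

[Bura and Yang(2011)] $\hat T_2 = n \operatorname{vec}(\hat Q_1 \hat M \hat Q_2)^T \, \hat\Gamma^{+} \operatorname{vec}(\hat Q_1 \hat M \hat Q_2)$

[Cragg and Donald(1997)] $\hat T_3 = n \min_{\operatorname{rank}(M) = d} \operatorname{vec}(\hat M - M)^T \, \hat\Gamma^{-1} \operatorname{vec}(\hat M - M)$

By noting that

Lemma (From PCA)

$$\hat P_1 \hat M \hat P_2 = \operatorname*{argmin}_{\operatorname{rank}(M) = d} \|\hat M - M\|_F^2 = \operatorname*{argmin}_{\operatorname{rank}(M) = d} \|\operatorname{vec}(\hat M - M)\|^2$$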
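The lemma can be checked numerically. The sketch below uses NumPy's SVD on a simulated matrix; the function names are hypothetical. Note that `np.linalg.svd` returns singular values in descending order, whereas the slides order them ascending, so the smallest singular values are the trailing entries `s[d:]`.

```python
import numpy as np


def rank_d_projection(m_hat, d):
    """Best rank-d approximation of m_hat in Frobenius norm (truncated SVD)."""
    u, s, vt = np.linalg.svd(m_hat, full_matrices=False)
    # Keep the d largest singular values and their singular vectors.
    return (u[:, :d] * s[:d]) @ vt[:d, :]


def li_statistic(m_hat, d, n):
    """n times the sum of the squared singular values beyond the d largest."""
    s = np.linalg.svd(m_hat, compute_uv=False)
    return n * np.sum(s[d:] ** 2)


# Simulated matrix (hypothetical values, for illustration only).
rng = np.random.default_rng(2)
m_hat = rng.normal(size=(5, 4))
n, d = 100, 2

# n * ||M_hat - P1_hat M_hat P2_hat||_F^2 coincides with the singular-value form.
gap = n * np.linalg.norm(m_hat - rank_d_projection(m_hat, d), "fro") ** 2
print(gap, li_statistic(m_hat, d, n))
```

The truncated SVD plays the role of $\hat P_1 \hat M \hat P_2$ here, so the two printed values agree up to rounding.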


(15)

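we get

[Li(1991)] $\hat T_1 = n \min_{\operatorname{rank}(M) = d} \|\operatorname{vec}(\hat M - M)\|^2$

[Bura and Yang(2011)] $\hat T_2 = n \operatorname{vec}(\hat M - \hat M_c)^T \, \hat\Gamma^{+} \operatorname{vec}(\hat M - \hat M_c)$

[Cragg and Donald(1997)] $\hat T_3 = n \min_{\operatorname{rank}(M) = d} \operatorname{vec}(\hat M - M)^T \, \hat\Gamma^{+} \operatorname{vec}(\hat M - M)$

with $\hat M_c = \operatorname*{argmin}_{\operatorname{rank}(M) = d} \|\operatorname{vec}(\hat M - M)\|^2$.

Application of the results

$\{\operatorname{rank}(M) = d\}$ is a smooth submanifold.

Example of sufficient conditions for the bootstrap: there exist $\xi_1, \dots, \xi_n$ i.i.d. with $E[\|\xi_1\|_F^2] < +\infty$ such that $\hat M = \frac{1}{n} \sum_{i=1}^n \xi_i$.

⇒ Example for $\hat T_1$ and $\hat T_2$:

$$M^* = \hat P_1 \hat M \hat P_2 + \frac{1}{n} \sum_{i=1}^n \varepsilon_i \xi_i$$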
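A possible implementation of this bootstrap for Li's statistic is sketched below. Several choices are assumptions made for the illustration and are not taken from the slides: the weights $\varepsilon_i$ are standard Gaussian, the $\xi_i$ are centred at $\hat M$ before being multiplied by the weights (the slide writes $\varepsilon_i \xi_i$ without centring), and the function names and simulated data are hypothetical.

```python
import numpy as np


def rank_d_projection(m, d):
    """Best rank-d approximation in Frobenius norm (truncated SVD)."""
    u, s, vt = np.linalg.svd(m, full_matrices=False)
    return (u[:, :d] * s[:d]) @ vt[:d, :]


def li_statistic(m, d, n):
    """n times the sum of the squared singular values beyond the d largest."""
    s = np.linalg.svd(m, compute_uv=False)
    return n * np.sum(s[d:] ** 2)


def bootstrap_rank_test(xi, d, alpha=0.05, n_boot=499, rng=None):
    """Test H0: rank(M) = d against H1: rank(M) > d, with M_hat = mean of xi."""
    rng = np.random.default_rng() if rng is None else rng
    n = xi.shape[0]
    m_hat = xi.mean(axis=0)
    m_hat_c = rank_d_projection(m_hat, d)  # constrained estimator
    t_hat = li_statistic(m_hat, d, n)
    # Assumption: centre the xi_i so the perturbation has the variance of M_hat.
    centred = xi - m_hat
    t_star = np.empty(n_boot)
    for b in range(n_boot):
        eps = rng.standard_normal(n)  # assumption: Gaussian multiplier weights
        m_star = m_hat_c + np.tensordot(eps, centred, axes=1) / n
        t_star[b] = li_statistic(m_star, d, n)
    q_star = np.quantile(t_star, 1.0 - alpha)
    return t_hat, q_star, t_hat > q_star


# Simulated example: a 3 x 4 mean matrix of rank 2 observed with noise.
rng = np.random.default_rng(3)
m_true = np.zeros((3, 4))
m_true[0, 0], m_true[1, 1] = 3.0, 2.0
xi = m_true + rng.normal(size=(500, 3, 4))
for d in (1, 2):
    print(d, bootstrap_rank_test(xi, d, rng=rng))
```

In this simulated setting the hypothesis $d = 1$ should be rejected clearly, since the second singular value of the mean matrix is far from zero, while for $d = 2$ the statistic and the bootstrap quantile are of the same order.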


(16)

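Example under $H_1$, i.e. $d < d_0$, when $\hat\theta_c$ does not converge

We need to ensure the a.s. convergence of

$$\hat\theta_c = \operatorname*{argmin}_{\operatorname{rank}(M) = d} \|\operatorname{vec}(\hat M) - \operatorname{vec}(M)\|^2 = \hat P_1 \hat M \hat P_2$$

⇒ problem of convergence of eigenprojectors

Riesz formula:

$$P_\lambda = \frac{1}{2 i \pi} \oint_{C_\lambda} (Iz - M)^{-1} \, dz.$$

Suppose that $M$ and $\hat M$ are symmetric with $\hat M \xrightarrow{a.s.} M$. Then, if $\lambda_{p-d+1} \neq \lambda_{p-d}$,

$$\hat P = \frac{1}{2 i \pi} \oint_{\hat C} (Iz - \hat M)^{-1} \, dz = \frac{1}{2 i \pi} \oint_{C} (Iz - \hat M)^{-1} \, dz,$$

where the second equality holds from a certain rank on.

If $\lambda_{p-d+1} = \lambda_{p-d}$, then $P$ does not exist.

Rk: Application of the previous results to $\hat M \hat M^T$ and $\hat M^T \hat M$.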


(19)

Conclusion

Concluding remarks

We provide a general bootstrap procedure for constraint estimators associated with a quadratic function.

The associated test procedure is consistent.

Wide range of applications through hypothesis testing.

As an example, it can easily be applied to rank estimation.

Work in progress

Relax the assumption $\hat\theta_c \xrightarrow{a.s.} \theta_c$ required under $H_1$ for the statistic $\hat T$.

Possibility of extending such results to M-estimators and Z-estimators.

Simulation study : bootstrap vs asymptotic,

and also constraint bootstrap vs traditional bootstrap.


(20)

[Bura and Yang(2011)] E. Bura and J. Yang. Dimension estimation in sufficient dimension reduction: a unifying approach. J. Multivariate Anal., 102(1):130–142, 2011.

[Cragg and Donald(1997)] John G. Cragg and Stephen G. Donald. Inferring the rank of a matrix. J. Econometrics, 76(1-2):223–250, 1997.

[Efron(1982)] Bradley Efron. The jackknife, the bootstrap and other resampling plans, volume 38 of CBMS-NSF Regional Conference Series in Applied Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, Pa., 1982.

[Hall(1992)] Peter Hall. The bootstrap and Edgeworth expansion. Springer Series in Statistics. Springer-Verlag, New York, 1992.

[Li(1991)] Ker-Chau Li. Sliced inverse regression for dimension reduction. J. Amer. Statist. Assoc., 86(414):316–342, 1991.


(21)

SIR

Sufficient dimension reduction (SDR), introduced by [Li(1991)]: we assume the following regression model,

$$Y = g(PZ, \varepsilon), \quad Z \perp\!\!\!\perp \varepsilon$$

where $Y \in \mathbb{R}$, $Z \in \mathbb{R}^p$, $P$ is an orthogonal projector of rank $d_0$ and $g$ is unknown.

Goal of SDR: Estimation of $P$.

Stakes: Obtaining a better rate when estimating $g$.

Inference on $P$ is based on

$$E[Z \mid Y] \in \operatorname{Im}(P) \quad \text{a.s.}$$


(22)

SIR

We partition the range of $Y$ into $H$ slices called $I(h)$.

Aim of SIR

Estimate the space spanned by the vectors

$$E[Z \mid Y \in I(1)], \dots, E[Z \mid Y \in I(H)]$$

SIR procedure:

1/ Estimation of

$$C_h = E[Z \, 1_{\{Y \in I(h)\}}] \in E_c \quad \text{for } h = 1, \dots, H.$$

2/ Extract a basis of $\operatorname{span}(\hat C_1, \dots, \hat C_H)$: eigen-elements of the matrix

$$\hat M_{SIR} = \sum_h \hat p_h^{-1} \hat C_h \hat C_h^T \quad \text{with } p_h = P(Y \in I(h)).$$
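The two steps of the SIR procedure can be sketched as follows. The slicing rule (equal-count slices of the sorted responses), the empirical centring of $Z$, the function name and the simulated data are all assumptions made for this illustration.

```python
import numpy as np


def sir_matrix(z, y, n_slices):
    """Empirical SIR matrix: sum over slices of C_h C_h^T / p_h."""
    n, p = z.shape
    # Assumption: Z is centred empirically before slicing.
    z = z - z.mean(axis=0)
    # Assumption: slices I(h) contain (almost) equal numbers of sorted responses.
    slices = np.array_split(np.argsort(y), n_slices)
    m = np.zeros((p, p))
    for idx in slices:
        p_h = len(idx) / n              # estimate of P(Y in I(h))
        c_h = z[idx].sum(axis=0) / n    # estimate of E[Z 1{Y in I(h)}]
        m += np.outer(c_h, c_h) / p_h
    return m


# Simulated single-index model: Y depends on Z only through its first coordinate.
rng = np.random.default_rng(5)
z = rng.normal(size=(2000, 4))
y = z[:, 0] + 0.1 * rng.normal(size=2000)
m_sir = sir_matrix(z, y, n_slices=10)

# np.linalg.eigh returns eigenvalues in ascending order.
eigval, eigvec = np.linalg.eigh(m_sir)
print(eigval)
print(eigvec[:, -1])
```

In this simulation the eigenvector attached to the largest eigenvalue should be close to the first coordinate axis, up to sign, while the remaining eigenvalues are small.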


(23)

Finding the dimension

Denoting by $\hat\eta_1, \dots, \hat\eta_p$ the eigenvectors of $\hat M_{SIR}$ in increasing order of the eigenvalues, $P$ can be consistently estimated by

$$\hat P = \sum_{k=1}^{d_0} \hat\eta_k \hat\eta_k^T,$$

but $d_0$ is unknown.

Importance of estimating $d_0$ well

Loss in the explanatory value of the model.

Poor nonparametric rate.
