Testing for equality of means of a Hilbert space valued random variable

Statistics/Probability Theory

Jean Gérard Aghoukeng Jiofack (a,b), Guy Martial Nkiet (a)

(a) Université des sciences et techniques de Masuku, URMI, BP 943, Franceville, Gabon

(b) Université de Yaoundé 1, faculté des sciences, département de mathématiques, BP 812, Yaoundé, Cameroun

Received 10 May 2009; accepted after revision 23 September 2009

Available online 30 October 2009

Presented by Paul Deheuvels

Abstract

We propose a test for equality of means of a random variable taking values in a real separable Hilbert space. The test statistic is based on projections of empirical means onto spaces spanned by principal directions obtained from principal component analysis of the random variable. The asymptotic distribution of this test statistic is derived under the null hypothesis, and the consistency of the resulting test is proved. An application to functional variables is indicated.

To cite this article: J.G. Aghoukeng Jiofack, G.M. Nkiet, C. R. Acad. Sci. Paris, Ser. I 347 (2009).

Résumé

A test for equality of means of a Hilbert space valued random variable. We propose a test for equality of means for a random variable taking values in a real separable Hilbert space. The test statistic is based on projections of the empirical means onto directions obtained from the principal component analysis of the random variable. The limiting distribution of this test statistic under the null hypothesis is obtained, and the consistency of the resulting test is proved. The application to functional variables is indicated.

To cite this article: J.G. Aghoukeng Jiofack, G.M. Nkiet, C. R. Acad. Sci. Paris, Ser. I 347 (2009).

Abridged French version

Let $X$ and $Y$ be two random variables (r.v.) taking values in $(H, \mathcal{B}_H)$ and $(F, \mathcal{P}(F))$ respectively, where $H$ is a real separable Hilbert space, $\mathcal{B}_H$ is its Borel σ-field, $F$ is the set $\{1, \dots, q\}$ and $\mathcal{P}(F)$ is the σ-field of all its subsets. Denoting by $\|\cdot\|$ the norm induced by the inner product, assuming that $E(\|X\|^4) < +\infty$, and considering $m = E(X)$ and $m_\ell = E(X \mid Y = \ell)$, we propose in this note a test of the null hypothesis $H_0$ defined in (1) against the alternative hypothesis denoted by $H_1$. To this end, we assume that the decreasing eigenvalues $\lambda_i$ of the covariance operator $V$ of $X$ are distinct, and we use the statistic defined in (2), computed from an i.i.d. sample $\{(X_i, Y_i)\}_{1 \le i \le n}$ of $(X, Y)$. Considering the diagonal matrices $D_V(p) = \mathrm{diag}(\lambda_i;\ 1 \le i \le p)$ and $\Delta = \mathrm{diag}(p_\ell;\ 1 \le \ell \le q)$, where $p_\ell = P(Y = \ell) > 0$, the square matrix $\Sigma$ of order $pq$ defined in (3), and denoting by $\otimes^K$ the Kronecker matrix product, we have:

Theorem 2.1. Under $H_0$, $nT_n(p) \xrightarrow{\ \mathcal{L}\ } Q = U^T \Gamma U$ as $n \to +\infty$, where $\Gamma = (D_V(p))^{-1} \otimes^K \Delta^{-1}$ and $U \sim N_{\mathbb{R}^{pq}}(0, \Sigma)$.

For a given level $\alpha \in\, ]0,1[$, $H_0$ is rejected when $F_Q(nT_n(p)) > 1 - \alpha$, where $F_Q$ denotes the cumulative distribution function of $Q$. The latter can be computed using approximation formulas given in [13], which require the eigenvalues of $\Sigma^{1/2} \Gamma \Sigma^{1/2}$. Since the matrices $\Sigma$ and $\Gamma$ are unknown in practice, they are replaced by the consistent estimators obtained by replacing, in the formulas that define them, each parameter by its empirical counterpart. The following theorem shows the consistency of the proposed test for sufficiently large values of $p$.

Theorem 2.2. Under $H_1$, there exists $p_0 \in \mathbb{N}^*$ such that, for all $p \ge p_0$ and $\alpha \in\, ]0,1[$, we have:

$$\lim_{n \to +\infty} P\big(F_Q(nT_n(p)) > 1 - \alpha\big) = 1.$$

This test can be applied to the case of a functional variable. Indeed, when $X = (X(t))_{t \in [0,1]}$ takes values in $L^2([0,1])$, and if each $X_i$ is assumed to be observed at points $t_1, \dots, t_L$ such that $0 \le t_1 < t_2 < \cdots < t_L \le 1$, we obtain the matrix $\mathbf{X} = (X_i(t_j))$, from which the $\hat\lambda_i$ and the principal components $\hat c_i(k) = \langle X_i, \hat u_k \rangle$ can be computed using, for example, the function pca.fd available in the R package fda. This yields the matrices $\hat\Gamma$ and $\hat\Sigma$, the latter being constructed as in (3) with the terms $\hat\sigma_{ijk\ell}$ given in (6).

doi:10.1016/j.crma.2009.10.010

1. Introduction

Letting $(\Omega, \mathcal{A}, P)$ be a probability space, we consider two random variables $X$ and $Y$ defined on this space and taking values in $(H, \mathcal{B}_H)$ and $(F, \mathcal{P}(F))$ respectively, where $H$ is a real separable Hilbert space, $\mathcal{B}_H$ is the related Borel σ-field, $F$ is the set $\{1, \dots, q\}$ and $\mathcal{P}(F)$ denotes the σ-field of all subsets of $F$. Denoting by $\langle \cdot, \cdot \rangle$ the inner product of $H$ and by $\|\cdot\|$ the related norm, we suppose that $E(\|X\|^4) < +\infty$. Then we consider the mean $m = E(X)$ and, for any $\ell \in F$, the conditional mean $m_\ell = E(X \mid Y = \ell)$ and the probability $p_\ell = P(Y = \ell)$, which is assumed, without loss of generality, to be non-null. We are interested in testing the hypothesis

$$H_0 : m_1 = \cdots = m_q \qquad (1)$$

against the alternative $H_1 : \exists\, \ell \in F,\ \tau_\ell \ne 0$, where $\tau_\ell = m_\ell - m$.

This problem belongs to a field, called Functional Data Analysis, that has emerged in the last decade. It consists of statistical methods for the analysis of data that are curves, including finely sampled records over some time interval. The book by Ramsay and Silverman [14] gives a comprehensive introduction to this field of statistics, whereas nonparametric approaches are tackled in [9]. Several recent papers deal with statistical testing procedures for functional variables, mainly in the context of functional linear regression models, for which tests for no effect are considered (see, e.g., [3,4,11]). However, only a few papers propose tests on the means of functional variables. More precisely, Mas [12] introduced a procedure for testing the equality of the mean to a fixed value, whereas tests for comparison of means, that is, of the hypothesis $H_0$ given in (1), were proposed in [2,5] and [8]. The practical interest of these methods was illustrated in [2] on electroencephalogram data, and in [5] on data from experimental cardiology. Recent works also show the interest of making statistical inference on other location parameters in the context of functional variables. In this vein, mode and median were tackled in [9], whereas a general depth measure is proposed in [6], which also contains a discussion of the applicability of this idea for inference in functional data analysis. Our purpose in this paper is to introduce a new method for testing $H_0$; this method is based on projections of the $\tau_\ell$'s onto principal directions, that is, lines spanned by unit eigenvectors of the covariance operator of $X$. Our approach is described in the general framework of Hilbert space valued random variables and is easily applied when functional variables are considered.

2. Main results

The covariance operator of $X$ is defined by $V = E((X - m) \otimes (X - m))$, where $\otimes$ denotes the tensor product such that, for any $(x, y)$, $x \otimes y$ is the linear map $h \mapsto \langle x, h \rangle y$. It is known that, under the above assumption, $V$ is a Hilbert–Schmidt self-adjoint operator acting from $H$ to itself. Letting $(\lambda_i)_{i \ge 1}$ be the decreasing sequence of eigenvalues of $V$, we consider an orthonormal basis $(u_i)_{i \ge 1}$ of $H$ such that $u_i$ is an eigenvector of $V$ associated with $\lambda_i$. We assume throughout the paper that the following assumption holds:

(A) $\forall i \ge 1,\ \lambda_i > \lambda_{i+1} > 0$.

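Given an i.i.d. sample $\{(X_i, Y_i)\}_{1 \le i \le n}$ of $(X, Y)$, we consider the empirical counterpart $\hat V_n$ of $V$, defined as $\hat V_n = n^{-1} \sum_{i=1}^{n} (X_i - \bar X^{(n)}) \otimes (X_i - \bar X^{(n)})$, where $\bar X^{(n)} = n^{-1} \sum_{i=1}^{n} X_i$, and we denote by $(\hat\lambda_i, \hat u_i)_{i \ge 1}$ a family of eigenelements of $\hat V_n$ such that $(\hat\lambda_i)_{i \ge 1}$ is the decreasing sequence of eigenvalues of $\hat V_n$ and $(\hat u_i)_{i \ge 1}$ is an orthonormal system of related eigenvectors, $\hat u_i$ being associated with $\hat\lambda_i$. In addition, for $\ell \in F$, we put

$$\hat n_\ell = \sum_{i=1}^{n} 1_{\{Y_i = \ell\}}, \quad \hat p_\ell = \frac{\hat n_\ell}{n}, \quad \bar X_\ell^{(n)} = \frac{1}{\hat n_\ell} \sum_{i=1}^{n} 1_{\{Y_i = \ell\}} X_i \quad \text{and} \quad \hat\tau_\ell^{\,n} = \bar X_\ell^{(n)} - \bar X^{(n)}.$$

For testing $H_0$ against $H_1$, we make use of the statistic given by

$$T_n(p) = \sum_{i=1}^{p} \sum_{\ell=1}^{q} \hat\lambda_i^{-1}\, \hat p_\ell\, \langle \hat\tau_\ell^{\,n}, \hat u_i \rangle^2, \qquad (2)$$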

where $p$ is sufficiently large. This statistic is a consistent estimator of $T(p) = \sum_{i=1}^{p} \sum_{\ell=1}^{q} \lambda_i^{-1} p_\ell \langle \tau_\ell, u_i \rangle^2$. Indeed, a direct application of the strong law of large numbers gives the almost sure convergence of $\hat p_\ell$ (respectively $\hat\tau_\ell^{\,n}$; respectively $\hat V_n$) to $p_\ell$ (respectively $\tau_\ell$; respectively $V$ in $\mathcal{L}(H)$) as $n \to +\infty$. Further, Lemma 1 in [10] gives

$$|\hat\lambda_i - \lambda_i| \le \|\hat V_n - V\| \quad \text{and} \quad \|\hat u_i - u_i\| \le 2\sqrt{2}\, \|\hat u_i \otimes \hat u_i - u_i \otimes u_i\|.$$

Then, using Proposition 3 in [7], which ensures the almost sure convergence in $\mathcal{L}(H)$ of $\hat u_i \otimes \hat u_i$ to $u_i \otimes u_i$ as $n \to +\infty$, we deduce from the preceding inequalities that $\hat\lambda_i$ and $\hat u_i$ converge almost surely, as $n \to +\infty$, to $\lambda_i$ and $u_i$ respectively. Consequently, $T_n(p)$ converges almost surely to $T(p)$ as $n \to +\infty$. The asymptotic distribution of this statistic under $H_0$ is given in the theorem stated below. Let us introduce the operators

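$$V^{(\ell)} = E\big((X - m) \otimes (X - m) \mid Y = \ell\big) \quad \text{and} \quad \Theta_{j\ell} = \delta_{j\ell}\, p_j V^{(j)} + p_j p_\ell \big(V - V^{(j)} - V^{(\ell)}\big),$$

where $(j, \ell) \in F^2$ and $\delta_{j\ell}$ denotes the Kronecker delta. This allows us to consider the $pq \times pq$ matrix

$$\Sigma = \begin{pmatrix} \sigma_{1111} & \sigma_{1112} & \cdots & \sigma_{11pq} \\ \sigma_{1211} & \sigma_{1212} & \cdots & \sigma_{12pq} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{pq11} & \sigma_{pq12} & \cdots & \sigma_{pqpq} \end{pmatrix}, \quad \text{where } \sigma_{ijk\ell} = \langle u_i, \Theta_{j\ell}\, u_k \rangle. \qquad (3)$$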

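Considering the diagonal matrices $D_V(p) = \mathrm{diag}(\lambda_i;\ 1 \le i \le p)$ and $\Delta = \mathrm{diag}(p_\ell;\ 1 \le \ell \le q)$, and denoting by $\otimes^K$ the Kronecker matrix product, we have:

Theorem 2.1. Under $H_0$, $nT_n(p)$ converges in distribution, as $n \to +\infty$, to $Q = U^T \Gamma U$, where $\Gamma = (D_V(p))^{-1} \otimes^K \Delta^{-1}$ and $U \sim N_{\mathbb{R}^{pq}}(0, \Sigma)$.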

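Sketch of the proof. We first express the test statistic as $T_n(p) = \mathrm{tr}(R_p^n)$, where $R_p^n = (V_p^n)^{-1} B_n$ with $V_p^n = \sum_{i=1}^{p} \hat\lambda_i\, \hat u_i \otimes \hat u_i$ and $B_n = \sum_{\ell=1}^{q} \hat p_\ell\, \hat\tau_\ell^{\,n} \otimes \hat\tau_\ell^{\,n}$. Then, letting $\{f_1, \dots, f_q\}$ be the canonical basis of $\mathbb{R}^q$, putting $W = \sum_{\ell=1}^{q} 1_{\{Y = \ell\}} f_\ell$, $W_i = \sum_{\ell=1}^{q} 1_{\{Y_i = \ell\}} f_\ell$ and

$$Z = \begin{pmatrix} X - m \\ W \end{pmatrix}, \quad Z_i = \begin{pmatrix} X_i - \bar X^{(n)} \\ W_i \end{pmatrix}, \quad V_Z = E(Z \otimes Z), \quad V_Z^{(n)} = \frac{1}{n} \sum_{i=1}^{n} Z_i \otimes Z_i,$$

we show that $nR_p^n = \phi_n(K_n)$, where $K_n = \sqrt{n}\,(V_Z^{(n)} - V_Z)$ and $\phi_n$ is the random map from $\mathcal{L}(H \times \mathbb{R}^q)$ to $\mathcal{L}(H)$ given by $\phi_n(S) = \sum_{\ell=1}^{q} (\hat p_\ell)^{-1} (\Pi_{12}(S) f_\ell) \otimes ((V_p^n)^{-1} (\Pi_{12}(S) f_\ell))$. Here, $\Pi_{12}$ is defined by $\Pi_{12}(S) = P_1 S P_2^*$, where $P_1$ (respectively $P_2$) is the canonical projection from $H \times \mathbb{R}^q$ to $H$ (respectively $\mathbb{R}^q$). On the one hand, we show that $K_n$ converges in distribution as $n \to +\infty$ to a r.v. $K$ having a centered normal distribution in $\mathcal{L}(H \times \mathbb{R}^q)$ with covariance operator equal to that of

$$\mathcal{S}\Big( \tfrac{1}{2} \big( Z_0 \otimes Z_0 - E(Z_0 \otimes Z_0) \big) - (Z_0 - \mu) \otimes \Pi_0(\mu) - \Pi_0(Z_0 - \mu) \otimes \mu + \Pi_0(Z_0 - \mu) \otimes \Pi_0(\mu) \Big),$$

where $\mathcal{S} : A \in \mathcal{L}(H \times \mathbb{R}^q) \mapsto A + A^*$ and

$$Z_0 = \begin{pmatrix} X \\ W \end{pmatrix}, \quad \mu = E(Z_0), \quad \Pi_0 : \begin{pmatrix} u \\ v \end{pmatrix} \in H \times \mathbb{R}^q \longmapsto \begin{pmatrix} u \\ 0 \end{pmatrix} \in H \times \mathbb{R}^q.$$

On the other hand, using this convergence result and the almost sure convergences of $\hat p_\ell$ and $(V_p^n)^{-1}$ to $p_\ell$ and $(V_p)^{-1}$ respectively, where $V_p = \sum_{i=1}^{p} \lambda_i\, u_i \otimes u_i$, it is easy to check that $\phi_n(K_n) - \phi(K_n)$ converges in probability to 0, $\phi$ being defined by $\phi(S) = \sum_{\ell=1}^{q} (p_\ell)^{-1} (\Pi_{12}(S) f_\ell) \otimes ((V_p)^{-1} (\Pi_{12}(S) f_\ell))$. Therefore, $\phi_n(K_n)$ has the same asymptotic distribution as $\phi(K_n)$. From the continuity of $\phi$ we deduce (see [1]) that this asymptotic distribution is the distribution of $\phi(K)$, where $K$ has the centered normal distribution described above. Consequently, $nT_n(p)$ converges in distribution, as $n \to +\infty$, to $Q = \mathrm{tr}(\phi(K))$. From some easy calculations, we finally obtain $Q = U^T \Gamma U$ where, denoting by vec the usual vectorization operator which stacks the columns of a matrix into one vector, and considering the $p \times q$ matrix $M(K) = (\langle u_i, \Pi_{12}(K) f_j \rangle)_{1 \le i \le p,\, 1 \le j \le q}$, $U$ is defined by $U = \mathrm{vec}(M(K)^T)$. This random vector is a linear function of $K$ and has, therefore, a centered normal distribution in $\mathbb{R}^{pq}$. Its covariance matrix is obtained from further calculations that show that it is equal to $\Sigma$. □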
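The calculation leading from $\mathrm{tr}(\phi(K))$ to the quadratic form can be sketched as follows. This sketch assumes that $(V_p)^{-1}$ is read as $\sum_{i=1}^{p} \lambda_i^{-1} u_i \otimes u_i$, that is, the inverse of $V_p$ on the span of $u_1, \dots, u_p$; this is an interpretation of the notation rather than something stated explicitly in the proof. It also uses $\mathrm{tr}(x \otimes y) = \langle x, y \rangle$, and $M_{i\ell}$ is shorthand introduced here for the $(i, \ell)$ entry of $M(K)$.

```latex
\begin{align*}
Q = \operatorname{tr}\bigl(\phi(K)\bigr)
  &= \sum_{\ell=1}^{q} p_\ell^{-1}\,
     \bigl\langle \Pi_{12}(K) f_\ell,\ (V_p)^{-1}\,\Pi_{12}(K) f_\ell \bigr\rangle \\
  &= \sum_{\ell=1}^{q} \sum_{i=1}^{p} \lambda_i^{-1}\, p_\ell^{-1}\,
     \bigl\langle u_i,\ \Pi_{12}(K) f_\ell \bigr\rangle^{2}
   = \sum_{i=1}^{p} \sum_{\ell=1}^{q} \lambda_i^{-1}\, p_\ell^{-1}\, M_{i\ell}^{2}.
\end{align*}
```

Since $\mathrm{vec}(M(K)^T) = (M_{11}, \dots, M_{1q}, M_{21}, \dots, M_{pq})^T$ and $\Gamma$ is diagonal with entry $\lambda_i^{-1} p_\ell^{-1}$ in position $(i-1)q + \ell$, the last sum is exactly $U^T \Gamma U$.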

For a given significance level $\alpha \in\, ]0,1[$, the hypothesis $H_0$ will be rejected if $F_Q(nT_n(p)) > 1 - \alpha$, where $F_Q$ denotes the cumulative distribution function of $Q$. Since $Q$ is a quadratic form of a normally distributed random vector, $F_Q$ can be computed or approximated by using formulas given in [13], which involve the eigenvalues of $\Sigma^{1/2} \Gamma \Sigma^{1/2}$. In practice $\Sigma$ and $\Gamma$ are unknown, so they are replaced by consistent estimators. To estimate $\Gamma$, we consider $\hat\Gamma = (\hat D_V(p))^{-1} \otimes^K \hat\Delta^{-1}$, where $\hat D_V(p) = \mathrm{diag}(\hat\lambda_i;\ 1 \le i \le p)$ and $\hat\Delta = \mathrm{diag}(\hat p_\ell;\ 1 \le \ell \le q)$.
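One simple way to approximate $F_Q$, different from the formulas of [13], is Monte Carlo simulation. It relies on the standard fact that a quadratic form $U^T \Gamma U$ with $U \sim N(0, \Sigma)$ has the same distribution as $\sum_k \mu_k Z_k^2$, where the $\mu_k$ are the eigenvalues of $\Sigma^{1/2} \Gamma \Sigma^{1/2}$ and the $Z_k$ are independent standard normal variables. The sketch below assumes that these eigenvalues (in practice, those of the estimated matrix) have already been obtained with some linear algebra routine; the function names, the number of draws and the seed are hypothetical choices made for illustration.

```python
import random


def mc_cdf_weighted_chi2(x, weights, n_draws=20000, seed=0):
    """Monte Carlo estimate of P(sum_k w_k * Z_k**2 <= x), Z_k i.i.d. N(0, 1)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    hits = 0
    for _ in range(n_draws):
        q_draw = sum(w * rng.gauss(0.0, 1.0) ** 2 for w in weights)
        if q_draw <= x:
            hits += 1
    return hits / n_draws


def reject_h0(n, t_n, weights, alpha=0.05):
    """Decision rule F_Q(n * T_n(p)) > 1 - alpha, with F_Q estimated by simulation."""
    return mc_cdf_weighted_chi2(n * t_n, weights) > 1.0 - alpha
```

The accuracy of such an estimate is limited by the number of draws, so an analytic approximation such as those in [13] may be preferable when very small levels $\alpha$ are used.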

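A consistent estimator of $\Sigma$ is obtained by taking the $pq \times pq$ matrix $\hat\Sigma$ constructed as in (3) but with elements given by $\hat\sigma_{ijk\ell} = \langle \hat u_i, \hat\Theta_{j\ell}\, \hat u_k \rangle$, where

$$\hat V_n^{(\ell)} = \frac{1}{\hat n_\ell} \sum_{i=1}^{n} 1_{\{Y_i = \ell\}} (X_i - \bar X^{(n)}) \otimes (X_i - \bar X^{(n)}) \quad \text{and} \quad \hat\Theta_{j\ell} = \delta_{j\ell}\, \hat p_j \hat V_n^{(j)} + \hat p_j \hat p_\ell \big( \hat V_n - \hat V_n^{(j)} - \hat V_n^{(\ell)} \big). \qquad (4)$$

The following theorem shows the consistency of the test for a sufficiently large $p$: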

Theorem 2.2. Under $H_1$, there exists $p_0 \in \mathbb{N}^*$ such that, for any $p \ge p_0$ and any $\alpha \in\, ]0,1[$, we have:

$$\lim_{n \to +\infty} P\big(F_Q(nT_n(p)) > 1 - \alpha\big) = 1.$$

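Proof. Since $\|\tau_\ell\|^2 = \sum_{i=1}^{+\infty} \langle \tau_\ell, u_i \rangle^2$ then, under $H_1$, there exists $(\ell_0, p_0) \in F \times \mathbb{N}^*$ such that $\langle \tau_{\ell_0}, u_{p_0} \rangle^2 > 0$. Therefore, we have for any $p \ge p_0$, $T(p) \ge \lambda_{p_0}^{-1}\, p_{\ell_0} \langle \tau_{\ell_0}, u_{p_0} \rangle^2 > 0$. For $p \ge p_0$, let us consider a real $\varepsilon$ such that $0 < \varepsilon < T(p)$. From the inequality

$$P\big(|T_n(p) - T(p)| < \varepsilon\big) \le P\big(T_n(p) > T(p) - \varepsilon\big)$$

and from the convergence in probability of $T_n(p)$ to $T(p)$ as $n \to +\infty$, we deduce that

$$\lim_{n \to +\infty} P\big(T_n(p) > T(p) - \varepsilon\big) = 1. \qquad (5)$$

Let $n_0$ be an integer such that $\forall n \ge n_0$, $n^{-1} F_Q^{-1}(1 - \alpha) < T(p) - \varepsilon$. Then, for all $n \ge n_0$, we have

$$P\big(T_n(p) > T(p) - \varepsilon\big) \le P\big(T_n(p) > n^{-1} F_Q^{-1}(1 - \alpha)\big)$$

and the required result is obtained from (5) and this latter inequality. □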

3. Application to functional variables

Here we show how the test introduced in the previous section can be applied to functional variables, thus introducing a new approach for testing homogeneity of means for such variables. Let $X = (X(t))_{t \in [0,1]}$ be a random variable taking values in $L^2([0,1])$ and let $Y$ be as previously defined. We consider an i.i.d. sample $\{(X_i, Y_i)\}_{1 \le i \le n}$ of $(X, Y)$, assuming that each function $X_i$ is observed on a fine grid of points $t_1, \dots, t_L$ satisfying $0 \le t_1 < t_2 < \cdots < t_L \le 1$. This leads to the $n \times L$ matrix $\mathbf{X} = (X_i(t_j))_{1 \le i \le n,\, 1 \le j \le L}$ that contains all observations of $X$. To show how the previous test for $H_0$ can be applied here, we just have to explain how to obtain the matrices $\hat\Gamma$ and $\hat\Sigma$ in practice. First note that the $\hat\lambda_i$'s and the scores $\hat c_i(k) = \langle X_i, \hat u_k \rangle$ are outputs of the function pca.fd, available in the R package fda, which is applied to the observations contained in the matrix $\mathbf{X}$. So, by using this function, the matrix $\hat\Gamma$ can be easily computed. Moreover, it is easy to check that $\langle \hat u_i, \hat V_n \hat u_k \rangle = \hat\beta_{ik}^{\,n}$ and $\langle \hat u_i, \hat V_n^{(j)} \hat u_k \rangle = \hat\beta_{ik|j}^{\,n}$, where

$$\hat\beta_{ik}^{\,n} = \frac{1}{n} \sum_{r=1}^{n} \big(\hat c_r(i) - \bar c(i)\big)\big(\hat c_r(k) - \bar c(k)\big) \quad \text{and} \quad \hat\beta_{ik|j}^{\,n} = \frac{1}{\hat n_j} \sum_{r=1}^{n} 1_{\{Y_r = j\}} \big(\hat c_r(i) - \bar c(i)\big)\big(\hat c_r(k) - \bar c(k)\big),$$

with $\bar c(i) = n^{-1} \sum_{r=1}^{n} \hat c_r(i)$. Therefore, from (4) we deduce that $\hat\Sigma$ is constructed as in (3) with entries given by:

$$\hat\sigma_{ijk\ell} = \delta_{j\ell}\, \hat p_j\, \hat\beta_{ik|j}^{\,n} + \hat p_j \hat p_\ell \big( \hat\beta_{ik}^{\,n} - \hat\beta_{ik|j}^{\,n} - \hat\beta_{ik|\ell}^{\,n} \big). \qquad (6)$$
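Formula (6) translates directly into a computation on the score matrix. The Python sketch below is one possible illustration, not the authors' implementation: the function name `sigma_hat` is hypothetical, the scores are assumed to be stored as `scores[r][i]` for observation $r$ and component $i$ (0-based), every group is assumed to contain at least one observation, and the row associated with the pair $(i, j)$ is placed at position $iq + (j - 1)$ to match the ordering used in (3).

```python
def sigma_hat(scores, labels, q):
    """Sketch of the pq x pq matrix whose entries follow formula (6)."""
    n, p = len(scores), len(scores[0])
    # c_bar(i): overall mean of the i-th score, then centered scores.
    cbar = [sum(row[i] for row in scores) / n for i in range(p)]
    centered = [[row[i] - cbar[i] for i in range(p)] for row in scores]
    counts = [sum(1 for y in labels if y == j) for j in range(1, q + 1)]
    p_hat = [c / n for c in counts]

    def beta(i, k, j=None):
        # j is None: beta_ik; otherwise beta_{ik|j} for group j (1-based).
        if j is None:
            return sum(c[i] * c[k] for c in centered) / n
        total = sum(c[i] * c[k] for c, y in zip(centered, labels) if y == j)
        return total / counts[j - 1]

    size = p * q
    sigma = [[0.0] * size for _ in range(size)]
    for i in range(p):
        for j in range(1, q + 1):
            for k in range(p):
                for ell in range(1, q + 1):
                    delta = 1.0 if j == ell else 0.0
                    pj, pl = p_hat[j - 1], p_hat[ell - 1]
                    value = delta * pj * beta(i, k, j) + pj * pl * (
                        beta(i, k) - beta(i, k, j) - beta(i, k, ell)
                    )
                    sigma[i * q + (j - 1)][k * q + (ell - 1)] = value
    return sigma


# Toy usage with p = 1 component and q = 2 groups.
toy_sigma = sigma_hat([[1.0], [3.0], [5.0], [7.0]], [1, 1, 2, 2], 2)
```

The nested loops recompute the $\hat\beta$ terms many times; this keeps the sketch close to (6), and caching them would be a natural refinement for larger $p$ and $q$.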

Acknowledgements

J.G. Aghoukeng Jiofack is grateful for financial support from the Agence Universitaire de la Francophonie (AUF).

G.M. Nkiet’s work was partly supported by the Academy of Sciences for the Developing World (TWAS) through grant 05-154 RG/MATHS/AF/AC.

References

[1] P. Billingsley, Convergence of Probability Measures, Wiley, 1968.

[2] C. Bugli, P. Lambert, Functional ANOVA with random functional effects: An application to event-related potentials modelling for electroencephalograms analysis, Statistics in Medicine 25 (2006) 3718–3739.

[3] H. Cardot, F. Ferraty, A. Mas, P. Sarda, Testing hypotheses in the linear functional model, Scand. J. Statist. 30 (2003) 241–255.

[4] H. Cardot, A. Goia, P. Sarda, Testing for no effect in functional linear regression models, some computational approaches, Commun. Stat. Simulation Comput. 33 (2004) 179–199.

[5] A. Cuevas, M. Febrero, R. Fraiman, An ANOVA test for functional data, Comput. Statist. Data Anal. 47 (2004) 111–122.

[6] A. Cuevas, R. Fraiman, On depth measures and dual statistics. A methodology for dealing with general data, J. Multivariate Anal. 100 (2009) 753–766.

[7] S. Dossou-Gbete, A. Pousse, Asymptotic study of eigenelements of a sequence of random selfadjoint operators, Statistics 22 (1991) 479–491.

[8] J. Fan, S.K. Lin, Test of significance when data are curves, J. Amer. Statist. Assoc. 93 (1998) 1007–1021.

[9] F. Ferraty, Ph. Vieu, Nonparametric Functional Data Analysis: Theory and Practice, Springer, 2006.

[10] L. Ferré, A.F. Yao, Functional sliced inverse regression analysis, Statistics 37 (2003) 475–488.

[11] P. Kokoszka, I. Maslova, J. Sojka, L. Zhu, Testing for lack of dependence in the functional linear model, Can. J. Statist. 36 (2008) 1–16.

[12] A. Mas, Testing for the mean of random curves: A penalization approach, Stat. Inference Stoch. Process. 10 (2007) 147–163.

[13] A.M. Mathai, S.B. Provost, Quadratic Forms in Random Variables: Theory and Applications, Dekker, 1992.

[14] J.O. Ramsay, B.W. Silverman, Functional Data Analysis, Springer, 2005.