ESTIMATION IN A FAMILY OF LINEAR MIXED MODELS



GABRIELA BEGANU and ION PURCARU

We deal with the estimation of the unknown parameters in a family of multivariate linear growth curve models with random effects, which occupy a central role in longitudinal studies. It is proved that the best linear unbiased estimators of the fixed effects are identical to the ordinary least squares estimators. The generalized Henderson method III (see [4]) is used to derive the quadratic unbiased estimators of the covariance components by means of the orthogonal projection operators on finite-dimensional Hilbert spaces corresponding to the linear models considered.

AMS 2000 Subject Classification: Primary 62J05; Secondary 47A05.

Key words: best linear unbiased estimator, quadratic unbiased estimator.

1. INTRODUCTION

Analysis of repeated-measurement or growth curve models has been considered by many authors, including Lange and Laird [8], Rao [9], Sala-i-Martin et al. [11] and Szatrowski [12].

In this paper, a generalization of this topic is presented as a mixed linear model with multivariate random effects for which the unknown parameters are the fixed effects and the covariance components.

Our purpose is to derive the best linear unbiased estimator (BLUE) of the expected value of observations and the quadratic unbiased estimators (QUE) of the covariance components corresponding to the linear regression model considered.

The fixed effects are estimated by the BLUE assuming that the covariance structure of the model is known. We show in Section 2 that the BLUE is the same as the ordinary least squares estimator (OLSE) and that their equality is independent of the between-individuals design matrix of the linear model.

The necessary and sufficient condition for the BLUE to be equal to the OLSE of the mean is one of the conditions given by Zyskind [13]. Since this condition yields efficient estimators in several problems arising in econometrics ([1]), it was chosen for its practicability.

REV. ROUMAINE MATH. PURES APPL., 53 (2008), 2–3, 125–130


In Section 3 the generalized Henderson method III (see [4]) is applied to the linear regression model considered in order to derive the quadratic unbiased estimators of the covariance components.

The estimation of the unknown parameters is necessary, among other purposes, in problems of prediction of the future responses.

2. BLUE OF THE EXPECTED VALUE

It is supposed that for a given individual or experimental unit, m distinct characteristics are measured at each of p different occasions under given experimental conditions. Assuming that n individuals are assigned randomly according to some experimental design, the measurements are independent and can be represented by the relation

(1) Y = AB(X′ ⊗ I_m) + λ(X′ ⊗ I_m) + E,

where A and X are the between-individuals and the within-individual design matrices of full column rank, respectively, B is the r × qm unknown matrix of fixed effects, λ is the n × qm matrix of random effects and E is the n × pm matrix of disturbances. (The symbol "⊗" is the usual Kronecker matrix product and I_m is the m × m identity matrix.)

We assume that the rows of λ and E are independent random vectors, identically distributed with zero means and the same covariance matrices I_q ⊗ Σ_λ and I_p ⊗ Σ_e, respectively. Then the expected value of the observations Y is

(2) E(Y) = µ = AB(X′ ⊗ I_m),

while the covariance matrix is

(3) cov(vec Y) = V ⊗ I_n = Σ,

where

(4) V = (XX′) ⊗ Σ_λ + I_p ⊗ Σ_e

(vec Y is the npm × 1 vector obtained by rearranging the columns of Y one below the other).
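As a concrete illustration of the dimensions in model (1), the following NumPy sketch simulates one data set; all dimensions, the design matrices and the particular Σ_λ, Σ_e are illustrative choices, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q, m, r = 8, 5, 2, 3, 4        # individuals, occasions, cols of X, characteristics, cols of A

A = rng.standard_normal((n, r))      # between-individuals design (full column rank a.s.)
X = rng.standard_normal((p, q))      # within-individual design (full column rank a.s.)
B = rng.standard_normal((r, q * m))  # fixed effects

Sl = np.array([[2.0, 0.5, 0.0],      # Sigma_lambda (illustrative, positive definite)
               [0.5, 1.0, 0.3],
               [0.0, 0.3, 1.5]])
Se = np.eye(m)                       # Sigma_e (illustrative)

XtI = np.kron(X.T, np.eye(m))        # X' ⊗ I_m, shape (qm, pm)

# model (1): Y = AB(X' ⊗ I_m) + lambda(X' ⊗ I_m) + E, rows of lambda and E i.i.d.
lam = rng.multivariate_normal(np.zeros(q * m), np.kron(np.eye(q), Sl), size=n)
Err = rng.multivariate_normal(np.zeros(p * m), np.kron(np.eye(p), Se), size=n)
Y = A @ B @ XtI + lam @ XtI + Err

# eq. (4): common covariance matrix of each row of Y
V = np.kron(X @ X.T, Sl) + np.kron(np.eye(p), Se)

assert Y.shape == (n, p * m) and V.shape == (p * m, p * m)
```

The row covariance V follows because a row of λ(X′ ⊗ I_m) has covariance (X ⊗ I_m)(I_q ⊗ Σ_λ)(X′ ⊗ I_m) = (XX′) ⊗ Σ_λ.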

Notice that the individual's regression coefficients in (1) are composed of both fixed and random effects. This means that the regression coefficients are influenced by different experimental design conditions over individuals or other relevant background covariates (such as initial age, sex or socioeconomic status), which are specific to the individual but the same over all occasions for a given individual.

Under assumptions (2), (3) and (4) and considering that Σ is known, it is easy to prove that the BLUE

(5) µ̂_BLUE = A(A′A)^{−1}A′ Y V^{−}(X ⊗ I_m)[(X′ ⊗ I_m)V^{−}(X ⊗ I_m)]^{−1}(X′ ⊗ I_m)


of µ is the same as the OLSE

(6) µ̂_OLSE = A(A′A)^{−1}A′ Y [X(X′X)^{−1}X′ ⊗ I_m].

V^{−} in (5) stands for any generalized inverse of V if V is non-negative definite, or for its inverse V^{−1} if V is positive definite.

If we denote by L_{r,s} the finite-dimensional Hilbert space of all linear transformations from R^r to R^s, and R(A), R(X ⊗ I_m) are the ranges of the linear operators A ∈ L_{r,n} and X ⊗ I_m ∈ L_{qm,pm}, respectively, then P_A = A(A′A)^{−1}A′ and P_X = [X(X′X)^{−1}X′] ⊗ I_m are the orthogonal projections on R(A) and R(X ⊗ I_m), respectively.

Therefore, using the Kronecker operator product ((R ⊗̂ S)T = RTS′, where R ∈ L_{r,r}, S ∈ L_{s,s} and T ∈ L_{s,r}), estimators (5) and (6) can be expressed as

µ̂_BLUE = (P_A ⊗̂ (X ⊗ I_m)[(X′ ⊗ I_m)V^{−}(X ⊗ I_m)]^{−1}(X′ ⊗ I_m)V^{−})Y

and

µ̂_OLSE = (P_A ⊗̂ P_X)Y.
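The projection form of the OLSE can be checked numerically; a minimal NumPy sketch with arbitrary illustrative dimensions (the noise-free check at the end uses Y = µ so that the estimator reproduces the mean exactly):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, q, m, r = 8, 5, 2, 3, 4
A = rng.standard_normal((n, r))
X = rng.standard_normal((p, q))
B = rng.standard_normal((r, q * m))
Y = rng.standard_normal((n, p * m))       # an arbitrary observation matrix

# orthogonal projections on R(A) and R(X ⊗ I_m)
PA = A @ np.linalg.inv(A.T @ A) @ A.T
PX = np.kron(X @ np.linalg.inv(X.T @ X) @ X.T, np.eye(m))

# eq. (6): the OLSE of mu is P_A Y P_X
mu_olse = PA @ Y @ PX

assert np.allclose(PA @ PA, PA) and np.allclose(PX @ PX, PX)   # idempotent

# sanity check: on noise-free data Y = mu, the OLSE reproduces mu exactly
mu = A @ B @ np.kron(X.T, np.eye(m))
assert np.allclose(PA @ mu @ PX, mu)
```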

The equality of µ̂_BLUE and µ̂_OLSE can be established under different conditions. Here, we shall use the necessary and sufficient condition given by Zyskind [13].

Proposition 1. Let the linear model (1) satisfy assumptions (2), (3) and (4). Then

(7) µ̂_BLUE = µ̂_OLSE.

Proof. It is known (see [13]) that (7) holds if and only if there exists a matrix R such that

(8) Σ(X ⊗ I_m ⊗ A) = (X ⊗ I_m ⊗ A)R.

This is Zyskind's condition ([13]) corresponding to model (1), where we allow for the vec operation (vec(ABC) = (C′ ⊗ A) vec B).

Replacing Σ and V given by (3) and (4), respectively, in equation (8), we obtain

[((XX′) ⊗ Σ_λ + I_p ⊗ Σ_e)(X ⊗ I_m)] ⊗ A = [(XX′X) ⊗ Σ_λ + X ⊗ Σ_e] ⊗ A =

= (X ⊗ I_m ⊗ A)[((X′X) ⊗ Σ_λ + I_q ⊗ Σ_e) ⊗ I_r].

Hence there exists a qmr × qmr matrix

(9) R = [(X′X) ⊗ Σ_λ + I_q ⊗ Σ_e] ⊗ I_r

that satisfies (8).

Corollary 1. For the linear model (1) with assumptions (2), (3) and (4), equation (7) holds independently of A.


Proof. If we set

(10) Q = (X′X) ⊗ Σ_λ + I_q ⊗ Σ_e

in (9), then the condition (8) becomes

V(X ⊗ I_m) = (X ⊗ I_m)Q,

which means that (7) holds regardless of the between-individuals matrix A of model (1).
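The identity behind the corollary is easy to verify numerically; a minimal sketch with illustrative X, Σ_λ and Σ_e:

```python
import numpy as np

rng = np.random.default_rng(2)
p, q, m = 5, 2, 3
X = rng.standard_normal((p, q))
Sl = np.diag([2.0, 1.0, 0.5])       # illustrative Sigma_lambda
Se = 0.7 * np.eye(m)                # illustrative Sigma_e

V = np.kron(X @ X.T, Sl) + np.kron(np.eye(p), Se)   # eq. (4)
Q = np.kron(X.T @ X, Sl) + np.kron(np.eye(q), Se)   # eq. (10)
XI = np.kron(X, np.eye(m))                          # X ⊗ I_m

# Corollary 1: V (X ⊗ I_m) = (X ⊗ I_m) Q, with no reference to A
assert np.allclose(V @ XI, XI @ Q)
```

Both sides reduce to (XX′X) ⊗ Σ_λ + X ⊗ Σ_e, which is why A never enters.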

It can be easily seen that the equality of the two estimators is very helpful in the computation of the estimator of µ (or B).

Assuming that the random matrix Y is distributed as N(AB(X′ ⊗ I_m), Σ), the BLUE of µ is also normally distributed, with expected value µ and covariance matrix

cov(vec µ̂_BLUE) = cov(vec µ̂_OLSE) =

= [(XX′) ⊗ Σ_λ + X(X′X)^{−1}X′ ⊗ Σ_e] ⊗ A(A′A)^{−1}A′.

Consequently, the BLUE of the fixed effects in the family of models (1) is

B̂_BLUE = B̂_OLSE = (A′A)^{−1}A′ Y [X(X′X)^{−1} ⊗ I_m]

and its normal distribution has mean B and covariance matrix cov(vec B̂_BLUE) = [(X′X)^{−1} ⊗ I_m]Q ⊗ (A′A)^{−1}, where Q is given by (10).

It follows that the BLUE of the fixed effects is identical to the OLSE for every member of family (1).
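As a quick numerical check of the fixed-effects estimator, the following sketch applies it to noise-free data (so the recovery is exact); dimensions are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, q, m, r = 8, 5, 2, 3, 4
A = rng.standard_normal((n, r))
X = rng.standard_normal((p, q))
B = rng.standard_normal((r, q * m))

Y = A @ B @ np.kron(X.T, np.eye(m))     # noise-free observations, Y = mu

# B_hat = (A'A)^{-1} A' Y [X (X'X)^{-1} ⊗ I_m]
B_hat = (np.linalg.inv(A.T @ A) @ A.T @ Y
         @ np.kron(X @ np.linalg.inv(X.T @ X), np.eye(m)))

assert np.allclose(B_hat, B)            # exact recovery in the noise-free case
```

The recovery is exact because (A′A)^{−1}A′A = I_r and (X′ ⊗ I_m)(X(X′X)^{−1} ⊗ I_m) = I_q ⊗ I_m.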

3. QUE OF THE COVARIANCE COMPONENTS

The covariance components Σ_λ and Σ_e of model (1) can be estimated by various methods, such as maximum likelihood estimation [5] or restricted maximum likelihood estimation [6], [8]. However, the Henderson method III, a computational technique similar to the EM algorithm [7], is generally preferred.

Corresponding to model (1), the quadratic forms used in the generalized Henderson method III (see [3]) are

Q_1 = ([I_p − X(X′X)^{−1}X′] ⊗ I_m) Y′Y ([I_p − X(X′X)^{−1}X′] ⊗ I_m)

and

Q_2 = [X(X′X)^{−1}X′ ⊗ I_m] Y′[I_n − A(A′A)^{−1}A′]Y [X(X′X)^{−1}X′ ⊗ I_m].

Considering the orthogonal projections M_A = I_n − P_A and M_X = I_p ⊗ I_m − P_X on the orthogonal complements of R(A) and R(X ⊗ I_m), respectively, these


quadratic forms become

Q_1 = (M_X ⊗̂ M_X)Y′Y

and

Q_2 = (P_X ⊗̂ P_X)Y′M_A Y.

They can also be obtained by an iterative method of estimation based on the Gram-Schmidt orthogonalizing process of the design matrices of model (1) (see [2]).

According to assumptions (2), (3) and (4), the random quadratic forms Q_1 and Q_2 have the expected values

E(Q_1) = ([I_p − X(X′X)^{−1}X′] ⊗ I_m)[(X ⊗ I_m)B′A′AB(X′ ⊗ I_m) + nV]([I_p − X(X′X)^{−1}X′] ⊗ I_m) = n[I_p − X(X′X)^{−1}X′] ⊗ Σ_e

and

E(Q_2) = [X(X′X)^{−1}X′ ⊗ I_m]{(X ⊗ I_m)B′A′[I_n − A(A′A)^{−1}A′]AB(X′ ⊗ I_m) + (n − r)V}[X(X′X)^{−1}X′ ⊗ I_m] =

= (n − r)[(XX′) ⊗ Σ_λ + X(X′X)^{−1}X′ ⊗ Σ_e] = (n − r)[X(X′X)^{−1} ⊗ I_m]Q(X′ ⊗ I_m),

where Q is given by (10).

The generalized Henderson method III (see [4]) consists of equating the quadratic forms Q1 and Q2 to their expected values, respectively. So, we can state

Proposition 2. The QUE of the covariance components in the linear model (1) are the solutions Σ̂_λ and Σ̂_e of the equations

Q_1 = n[I_p − X(X′X)^{−1}X′] ⊗ Σ_e,

Q_2 = (n − r)[X(X′X)^{−1} ⊗ I_m]Q(X′ ⊗ I_m).

The quadratic random forms Q_1 and Q_2 are independently distributed as Wishart(I_p ⊗ Σ_e, n(p − q)) and Wishart([(X′X)^{−1} ⊗ I_m]Q, n − r), respectively.

They are also independent of B̂_OLSE.
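One concrete way to solve the system of Proposition 2 is by partial traces over the Kronecker factors. The following sketch is our own solving procedure, not the computational scheme of [4], and it substitutes the expected values for the random quadratic forms so that Σ_e and Σ_λ are recovered exactly:

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, q, m, r = 8, 5, 2, 3, 4
X = rng.standard_normal((p, q))
Sl = np.array([[2.0, 0.5, 0.0], [0.5, 1.0, 0.3], [0.0, 0.3, 1.5]])  # Sigma_lambda
Se = np.array([[1.0, 0.2, 0.0], [0.2, 0.8, 0.1], [0.0, 0.1, 1.2]])  # Sigma_e

G = X.T @ X
Mx = np.eye(p) - X @ np.linalg.inv(G) @ X.T          # projection on R(X)-complement
Q = np.kron(G, Sl) + np.kron(np.eye(q), Se)          # eq. (10)

# right-hand sides of Proposition 2, with Q1, Q2 replaced by their expected values
Q1 = n * np.kron(Mx, Se)
Q2 = (n - r) * np.kron(X @ np.linalg.inv(G), np.eye(m)) @ Q @ np.kron(X.T, np.eye(m))

def ptrace(M, d1, d2):
    """Partial trace over the first (d1-dim) Kronecker factor of a (d1*d2)x(d1*d2) matrix."""
    return np.einsum('iaib->ab', M.reshape(d1, d2, d1, d2))

# first equation: tr(Mx) = p - q, so Sigma_e = ptrace(Q1) / (n (p - q))
Se_hat = ptrace(Q1, p, m) / (n * (p - q))

# second equation: transform to I_q ⊗ Sigma_lambda + G^{-1} ⊗ Sigma_e, then partial-trace
T = (np.kron(np.linalg.inv(G) @ X.T, np.eye(m)) @ Q2
     @ np.kron(X @ np.linalg.inv(G), np.eye(m)) / (n - r))
Sl_hat = (ptrace(T, q, m) - np.trace(np.linalg.inv(G)) * Se_hat) / q

assert np.allclose(Se_hat, Se)
assert np.allclose(Sl_hat, Sl)
```

With observed data, Q_1 and Q_2 would replace their expectations in the same formulas, yielding the (random) quadratic unbiased estimators.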

A disadvantage of the Henderson method III (though not only of this estimation technique) is that the solution Σ̂_λ can be negative definite. In this case one can use procedures such as that suggested in [7] to construct a quadratic estimator of Σ_λ which is at least non-negative definite. Some sufficient conditions are given in [2] for the QUE of the covariance components to be non-negative definite quadratic forms.

It can be shown ([4]) that the quadratic unbiased estimators of the covariance structure exist for every member of the family (1) and, as a consequence, closed-form expressions for the estimated covariance matrix of the OLSE of B also exist for the entire family of linear regression models.


The given estimators of the fixed effects and covariance components can be used in the problem of prediction [9] concerning the components of future response vectors of a further individual, given the values of the components of its previous response vectors.

REFERENCES

[1] B.N. Baltagi, On the efficiency of two stage and three stage least squares estimators. Econometric Rev. 7(2) (1989), 165–169.

[2] G. Beganu, A Gram-Schmidt orthogonalizing process of design matrices in linear models as estimating procedure of covariance components. Rev. R. Acad. Cien. Ser. A Mat. 99(2) (2005), 187–194.

[3] G. Beganu, A two-stage estimator of individual regression coefficients in multivariate linear growth curve models. Rev. Acad. Colombiana Cienc. Exact. Fis. Natur. 30(117) (2006), 549–554.

[4] G. Beganu, Quadratic estimators of covariance components in a multivariate mixed linear model. Statist. Methods Appl. 16(3) (2007), 347–356.

[5] N. Chatterjee, A two-stage regression model for epidemiological studies with multivariate disease classification data. J. Amer. Statist. Assoc. 99 (2004), 127–138.

[6] H. Cui, K.W. Ng and L. Zhu, Estimation in mixed effects model with errors in variables. J. Multivariate Anal. 91 (2004), 53–61.

[7] A.P. Dempster, D.B. Rubin and R.K. Tsutakawa, Estimation in covariance components models. J. Amer. Statist. Assoc. 76 (1981), 341–353.

[8] N. Lange and N.M. Laird, The effect of covariance structure on variance estimation in balanced growth-curve models with random parameters. J. Amer. Statist. Assoc. 84 (1989), 241–247.

[9] C.R. Rao, Prediction of future observations in growth curve models. Statist. Sci. 4 (1987), 434–471.

[10] C.G. Reinsel, Multivariate repeated-measurement or growth curve models with multivariate random effects covariance structure. J. Amer. Statist. Assoc. 77 (1982), 190–195.

[11] X. Sala-i-Martin, G. Doppelhofer and R.I. Miller, Determinants of long-term growth: A Bayesian averaging of classical estimates (BACE) approach. American Econ. Rev. 94 (2004), 813–835.

[12] T.H. Szatrowski, Necessary and sufficient conditions for explicit solutions in the multivariate normal estimation problem for patterned mean and covariances. Ann. Statist. 8 (1980), 802–810.

[13] G. Zyskind, On canonical forms, negative covariance matrices and best and simple linear least squares estimators in linear models. Ann. Math. Statist. 38 (1967), 1679–1699.

Received 15 July 2007

Academy of Economic Studies
Department of Mathematics
Piața Romană, nr. 6
010374 Bucharest, Romania
gabrielabeganu@yahoo.com
ionpurcaru@yahoo.fr
