
MOUSTAKI, Irini and VICTORIA-FESER, Maria-Pia. Robust Estimation and Inference for Generalised Latent Trait Models. London School of Economics, 2002. Available at: http://archive-ouverte.unige.ch/unige:6508



Robust Estimation and Inference for Generalised Latent Trait Models

Irini Moustaki (London School of Economics) and Maria-Pia Victoria-Feser (University of Geneva)

Statistics Research Report LSERR76, LSE March 2002

Abstract

The paper discusses the effect of model deviations such as data contamination on the maximum likelihood estimator (MLE) for a general class of latent trait models (Moustaki and Knott 2000). This is done with the use of the influence function (Hampel 1968, 1974), a mathematical tool for assessing the robustness properties of any statistic, such as an estimator. Simulation studies show that the MLE can be seriously biased by model deviations. We therefore propose alternative robust estimators that are less influenced by data contamination.

The performance of the robust estimators in terms of bias and variance is compared to that of the MLE, both analytically and through simulation studies.

Keywords: Generalized latent trait, mixed items, influence function, robust estimation


1 Introduction

The study of relationships among variables is a common research problem in the social sciences. Theoretical constructs such as intelligence, ability, emotion and stress are not directly measurable; they can only be measured indirectly through observed variables that are supposed to be indicators of those unobserved constructs.

Therefore, models such as factor analysis or structural equation models that allow measurements of unobservable variables or latent variables by means of observable or manifest variables (also called items) are very important.

In practice any type of manifest variable can be collected: multiple-choice or binary data (e.g. correct/incorrect), ordinal data such as Likert-type scales (e.g. opinions), metric data (e.g. scores obtained on a test), etc. To analyse the relationships between these different types of variables, one needs the appropriate methodology. In the literature there are two main approaches for analyzing the interrelationships among a set of observed variables. The first, the underlying variable approach, assumes that underlying each discretely observed (binary or ordinal) variable there is a normal variable that, for practical reasons, is observed only on a discrete scale. Tetrachoric, polychoric (Olsson 1979) or polyserial (Olsson, Drasgow, and Dorans 1982) correlations are computed among those underlying continuous variables. Poon and Lee (1987) developed a method for finding the maximum likelihood estimates of the parameters of a multivariate normal distribution in these situations. Their method can be used for structural equation modelling, as proposed in Muthén (1984) and Lee, Poon, and Bentler (1992). This is not the approach we consider here, because of the relatively strong assumptions on the underlying distributions.

The other approach, the item response theory approach, models the observed variables as they are, by postulating distributions on the observed variables. A generalized model framework for any type of observed data in the exponential family is discussed in Moustaki and Knott (2000). Their paper gives an estimation procedure for the maximum likelihood (ML) estimator of generalized latent trait (GLT) models. They extend the work of Moustaki (1996) for mixed binary and metric variables and Bartholomew and Knott (1999) for categorical variables. O'Muircheartaigh and Moustaki (1999) also consider the case of missing values.

However, a classical ML approach makes the fundamental assumption that the data are generated exactly from the model and, in particular, that there are no errors in the set of responses. For example, in the case of normal variables, a subject with a response more than 3 standard deviations away from the mean has an unexpected response under the normal model, which is considered to be either an error (e.g. a recording error) or just an unusual subject not representative of the sampled population. For binary variables it is harder to define when a data set is contaminated. For example, if the assumed model is a Guttman model, then any positive/correct response that is followed by a negative/wrong response does not comply with the assumed model. As we deviate from the deterministic nature of the response patterns under the Guttman model, it becomes more difficult to detect response patterns that are not generated by the assumed model. Subjects that indicate the presence of model deviation, i.e. that are highly improbable under the assumed model, might have been generated by another (unassumed) model.

The question addressed in this paper is: what is the effect of these unexpected response patterns on the ML estimator? Do the parameter estimates change radically if subjects that do not "fit the model" are present in the sample? In other words, is the ML estimator for the GLT model robust?

If the ML estimator is not robust, then in principle one subject can change the conclusions drawn from the data analysis. This is obviously an undesirable property of the estimation procedure. In that case, a robust estimator built to be resistant to model deviations should first be developed and then used in practice. The aims of the paper are first to investigate the robustness properties of the ML estimator for the GLT model and then to propose robust estimators.

General robustness theory can be found in Huber (1981) and Hampel, Ronchetti, Rousseeuw, and Stahel (1986), who have set the foundations. We adopt here the approach based on the influence function (IF) (Hampel 1968, 1974), a mathematical tool for assessing the robustness properties of any statistic, such as an estimator or a test statistic. Let F_θ denote a parametric model such as the GLT model, where θ is a vector of parameters. Let also T be an estimator of θ which can be written as a functional of any distribution F, i.e. T(F). The theory of robust statistics starts by enlarging the hypothesis about the distribution of the data: it is supposed that the data-generating distribution lies in a neighbourhood (1−ε)F_θ + εG of the true model. In other words, a large proportion (1−ε) of the data is generated by the model F_θ, whereas a small fraction ε > 0 comes from an arbitrary distribution G.

To assess the behaviour of T in the neighbourhood of the hypothetical model, with usually infinitesimal values of ε, one uses the IF. Formally, the IF is defined as

\[
\mathrm{IF}(x; T, F_\theta) = \lim_{\varepsilon \downarrow 0} \frac{T(F_\varepsilon) - T(F_\theta)}{\varepsilon}
\]

with F_ε = (1−ε)F_θ + ε∆_x, where ∆_x is the distribution that assigns probability 1 to an arbitrary point x. Data generated from F_ε are usually said to be generated under model contamination. The IF then measures the influence of an infinitesimal amount of contaminated data at the arbitrary point x on


the value of the statistic T. In fact, Hampel et al. (1986) show that the IF gives information on the behaviour of T for any "contamination" distribution G, since one has that

\[
\sup_G \left\| T\big((1-\varepsilon)F_\theta + \varepsilon G\big) - T(F_\theta) \right\|
\approx \varepsilon \, \sup_x \left\| \mathrm{IF}(x; T, F_\theta) \right\|
\]

The IF can also be seen as a first-order approximation of the asymptotic bias of T (see Hampel et al. 1986). This means that the IF can be used to assess the robustness properties of T: if it is unbounded, then the asymptotic bias of T can be infinite (or very large) under model contamination, meaning that T is not robust. If it can take large values, then although the bias is in this case finite, it might nevertheless be large. In the latter case, in order to measure the maximal size of the bias, one can use the self-standardized sensitivity (Hampel et al. 1986) given by

\[
\gamma(T, F_\theta) = \sup_x \left[ \mathrm{IF}(x; T, F_\theta)^T \, V(T, F_\theta)^{-1} \, \mathrm{IF}(x; T, F_\theta) \right]^{1/2}
\qquad (1)
\]

where V(T, F_θ) is the asymptotic variance of T. Hence the upper bound of the asymptotic bias is measured in the metric given by the asymptotic covariance matrix of the estimator. We also have the result that

\[
\gamma(T, F_\theta)^2 \ge s
\]

where s = dim(θ) is the number of parameters. The IF and γ will be used to assess the robustness properties of the ML estimator for the GLT model.
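As a toy illustration of these ideas outside the latent trait setting (our own sketch, not from the paper), the finite-sample analogue of the IF, the sensitivity curve, contrasts an estimator with unbounded IF (the sample mean) with one whose IF is bounded (the median):

```python
import numpy as np

def sensitivity_curve(estimator, sample, x):
    """Finite-sample analogue of the IF: n * (T(sample + {x}) - T(sample))."""
    n = len(sample)
    return n * (estimator(np.append(sample, x)) - estimator(sample))

rng = np.random.default_rng(0)
sample = rng.normal(0.0, 1.0, size=200)

# The mean's influence grows linearly in the contamination point x ...
sc_mean_near = sensitivity_curve(np.mean, sample, 5.0)
sc_mean_far = sensitivity_curve(np.mean, sample, 50.0)
# ... while the median's influence is bounded: any point far above the
# median shifts it by the same small amount.
sc_med_near = sensitivity_curve(np.median, sample, 5.0)
sc_med_far = sensitivity_curve(np.median, sample, 50.0)

print(sc_mean_near, sc_mean_far)   # roughly 5 and 50: unbounded influence
print(sc_med_near, sc_med_far)     # identical, small values: bounded influence
```

Moving the contamination point ten times further multiplies the mean's influence by roughly ten but leaves the median's influence unchanged, which is exactly the bounded/unbounded distinction the IF formalises.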

The paper is organised as follows. The GLT model and the ML estimator of its parameters are presented in Section 2. In Section 3, the robustness properties of the ML estimator are studied by means of the IF and the self-standardized sensitivity. Several robust estimators are presented in Section 4, and their robustness, efficiency and consistency properties are studied. In Section 5, the behaviour of the ML and a robust estimator under model contamination is studied through a simulation study. Finally, Section 6 concludes.

2 The generalized latent trait model

A latent variable model aims to explain the interrelationships among p manifest response variables x_1, ..., x_p with q latent variables z_1, ..., z_q, where q is much smaller than p. The conditional distribution of x_i|z (z = [z_1, ..., z_q]) is taken from the exponential family, i.e.

\[
g_i(x_i \mid z, \alpha_i) = \exp\left\{ \frac{x_i \theta_i - b(\theta_i)}{\phi_i} + c(x_i, \phi_i) \right\}
\]


where θ_i = θ_i(z, α_i) is called a canonical parameter, and b(θ_i) and c(x_i, φ_i) are specific functions whose form depends on the distribution of the response variable x_i (see Moustaki and Knott 2000). φ_i is a scale parameter, usually estimated separately from the rest of the parameters α_i. The link between the manifest and latent variables is defined through the p link functions b(θ_i) such that we have the following relationship

\[
\theta_i(z, \alpha_i) = \alpha_{i0} + \sum_{j=1}^{q} \alpha_{ij} z_j = \alpha_i z
\]

where α_i = [α_{i0}, ..., α_{iq}] and z = [1, z_1, ..., z_q]^T. The functions b(θ_i) = b(θ_i(z, α_i)) are defined for the different types of data and distributions as follows:

Binary responses with logit link: b(θ_i) = log(1 + exp(θ_i))

Poisson responses with log link: b(θ_i) = exp(θ_i)

Normal responses with identity link: b(θ_i) = θ_i²/2

Gamma responses with reciprocal link: b(θ_i) = −log(−θ_i)

In the ordinal case, the canonical parameter θi(z, αi) is not linear with respect to the latent variables. For a generalized framework for modelling ordinal data see Moustaki (2000).
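As a numerical sanity check on the four b(θ_i) above (our own sketch, not code from the paper), the derivative ḃ(θ_i) should reproduce the conditional mean E[x_i|z] in each case: the logistic function for the logit link, exp(θ) for the log link, θ itself for the identity link, and −1/θ for the reciprocal link:

```python
import math

# b(theta) for each response type (theta < 0 required in the gamma case)
b_funcs = {
    "binary_logit":     lambda t: math.log(1.0 + math.exp(t)),
    "poisson_log":      lambda t: math.exp(t),
    "normal_identity":  lambda t: t * t / 2.0,
    "gamma_reciprocal": lambda t: -math.log(-t),
}
# The implied conditional mean E[x|z] = db/dtheta for each case
mean_funcs = {
    "binary_logit":     lambda t: 1.0 / (1.0 + math.exp(-t)),  # sigmoid
    "poisson_log":      lambda t: math.exp(t),
    "normal_identity":  lambda t: t,
    "gamma_reciprocal": lambda t: -1.0 / t,
}

def num_deriv(f, t, h=1e-6):
    """Central finite-difference approximation of f'(t)."""
    return (f(t + h) - f(t - h)) / (2.0 * h)

theta = {"binary_logit": 0.7, "poisson_log": 0.7,
         "normal_identity": 0.7, "gamma_reciprocal": -0.7}
for name, b in b_funcs.items():
    t = theta[name]
    assert abs(num_deriv(b, t) - mean_funcs[name](t)) < 1e-5
print("all four mean functions match the derivative of b")
```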

The aim of a GLT model is to reduce the p manifest variables to a smaller number of latent variables while taking into account the correlation structure of the x_i's. An assumption made for a GLT model is that of conditional independence, which says that the manifest variables are conditionally independent given the latent variables, i.e.

\[
g(x \mid z, \alpha) = \prod_{i=1}^{p} g_i(x_i \mid z, \alpha_i)
\]

with x = [x_1, ..., x_p], α = [α_1^T, ..., α_p^T]^T and g the conditional distribution of x given z. The joint distribution of the manifest variables can thus be written as

\[
f(x; \alpha) = \int \cdots \int \left[ \prod_{i=1}^{p} g_i(x_i \mid z, \alpha_i) \right] h(z)\, dz
\]

where the z_j in z are assumed to follow independent standard normal distributions, i.e. h(z) = \prod_{j=1}^{q} \varphi(z_j).

For a sample of size n, the log-likelihood is then

\[
L(\alpha, \phi) = \frac{1}{n} \sum_{h=1}^{n} \log f(x_h; \alpha)
\]


The partial derivatives are

\[
\frac{\partial L(\alpha,\phi)}{\partial \alpha_i^T}
= \frac{1}{n}\sum_{h=1}^{n} s_i(x_h;\alpha)
= \frac{1}{n}\sum_{h=1}^{n} \frac{1}{f(x_h;\alpha)} \int \cdots \int g(x_h \mid z,\alpha)\,
\mathrm{vec}\!\left[\frac{x_{ih} - \dot b(\theta_i)}{\phi_i}\, z\right] h(z)\, dz
\qquad (2)
\]

where \(\dot b(\theta_i) = \partial b(\theta_i)/\partial \theta_i\). The derivative of the specific function b(θ_i) with respect to the canonical parameter θ_i is equal to E[x_i|z]. Note also that the second derivative of b(θ_i) with respect to θ_i, multiplied by φ_i, is equal to var[x_i|z]. The roots of (2) define the ML estimator \(\hat\alpha_i\), ∀i.

Differentiating the log-likelihood with respect to the scale parameter leads to

\[
\frac{\partial L(\alpha,\phi)}{\partial \phi_i}
= \frac{1}{n}\sum_{h=1}^{n} s_i(x_h;\phi)
= \frac{1}{n}\sum_{h=1}^{n} \frac{1}{f(x_h;\alpha)} \int \cdots \int g(x_h \mid z,\alpha)
\left[ -\frac{x_{ih}\theta_i - b(\theta_i)}{\phi_i^2} + \dot c(\phi_i, x_{ih}) \right] h(z)\, dz
\qquad (3)
\]

For the binomial, multinomial and Poisson distributions the scale parameter is φ = 1. For the normal distribution, we have

\[
\dot c(\phi_i, x_i) = 0.5\left( \frac{x_i^2}{\phi_i^2} - \frac{1}{\phi_i} \right)
\]

so that

\[
s(x;\phi_i) = \frac{1}{f(x;\alpha)} \int \cdots \int g(x \mid z,\alpha)\,
\frac{0.5}{\phi_i^2}\left[ (x_i - \theta_i)^2 - \phi_i \right] h(z)\, dz
\]

In order to find the ML estimator, one has to rely on an iterative process described by Moustaki and Knott (2000), who propose to approximate the integrals by Gauss–Hermite quadrature with k weights ϕ(z_{tj}) and abscissae z_{tj} for each latent variable j = 1, ..., q, giving

\[
f(x; \alpha) = \sum_{t_1=1}^{k} \cdots \sum_{t_q=1}^{k} h(z_t) \left[ \prod_{i=1}^{p} g_i(x_i \mid z_t, \alpha_i) \right]
\]

and

\[
s_i(x;\alpha) = \sum_{t_1=1}^{k} \cdots \sum_{t_q=1}^{k}
\frac{h(z_t)\, g(x \mid z_t, \alpha)}{f(x;\alpha)}\,
\mathrm{vec}\!\left[ \frac{x_i - \dot b_i(\theta_{it})}{\phi_i}\, z_t \right]
\qquad (4)
\]

with h(z_t) = \prod_{j=1}^{q} \varphi(z_{t_j}) and z_t = [z_{t_1}, \ldots, z_{t_q}].
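The quadrature step can be illustrated with a minimal sketch (one latent variable, two binary logit items; the α values are the illustrative ones used in Section 3, while the implementation details are ours, not the authors'). Probabilists' Gauss–Hermite nodes are used so that the N(0,1) density h(z) is absorbed into the weights, and the check is that the four response-pattern probabilities sum to one:

```python
import numpy as np
from itertools import product

# One latent variable, two binary (logit) items; rows are [alpha_i0, alpha_i1].
alpha = np.array([[1.0, 0.7],
                  [0.8, 1.0]])

k = 20                                         # number of quadrature points
z_t, w_t = np.polynomial.hermite_e.hermegauss(k)
w_t = w_t / np.sqrt(2.0 * np.pi)               # weights now integrate against N(0,1)

def f_approx(x):
    """Quadrature version of f(x; alpha) = int prod_i g_i(x_i|z) phi(z) dz."""
    theta = alpha[:, [0]] + alpha[:, [1]] * z_t[None, :]   # p x k canonical params
    p1 = 1.0 / (1.0 + np.exp(-theta))                      # P(x_i = 1 | z_t)
    g = np.where(np.array(x)[:, None] == 1, p1, 1.0 - p1)  # g_i(x_i | z_t)
    return float(np.sum(w_t * np.prod(g, axis=0)))

total = sum(f_approx(x) for x in product([0, 1], repeat=2))
print(total)   # the four pattern probabilities sum to ~1
```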

3 Robustness properties of the ML estimator for the GLT model

In this section we study the robustness properties of the ML estimator for the GLT model by means of the IF and the self-standardized sensitivity. The features of the GLT model are relatively complicated, which is why both approaches are necessary. We study in turn the model parameters α and the scale parameters, and restrict the study to the case of a mixture of normal and binary variables with one latent variable.

3.1 Model parameters α

To study the robustness properties of the ML estimator for the GLT model as defined by (2), we use the IF. For ML estimators with score function s, it is given by

\[
\mathrm{IF}(x; \hat\alpha, F) = M(s, F)^{-1}\, s(x, \alpha)
\]

where

\[
M(s, F) = \int \cdots \int s(x, \alpha)\, s^T(x, \alpha)\, f(x; \alpha)\, dx
\]

and s(x;α)^T = [s_i(x;α)^T]_{i=1,...,p} (see Hampel et al. 1986). It is therefore proportional to the score function. For the GLT model, the score function, given in (4), depends on the point of contamination x through the quantities f(x;α), g(x|z,α) = \prod_{i=1}^{p} g_i(x_i|z,α_i) and x_i. For example, an extreme value for the ith manifest variable influences not only the ML estimator of α_i corresponding to that manifest variable, but also the other estimates; the ML estimator of the whole vector α can, in principle, be influenced by extreme data. What is not clear is the size of the IF for different types of variables. Indeed, the quantity

\[
\frac{x_i - \dot b_i(\theta_i)}{\phi_i}
\]

can be very large if x_i is far away from its expectation, but at the same time its density g_i(x_i|z,α_i) becomes very small, and the behaviour of g(x|z,α)/f(x;α) is not straightforward to study. One could also expect the IF to be bounded,


since for extreme values in x the corresponding conditional density g(x|z, α) should be very small or even zero.

In order to investigate this point, we computed the IF for each parameter as a function of one of the x_i in x. The model we chose is a one-factor model fitted to two binary (i = 1, 2) and three normal (i = 3, 4, 5) manifest variables, with parameter values

α1 = [1.0,0.7]

α2 = [0.8,1.0]

α3 = [2.0,0.6] and φ3 = 1

α4 = [2.5,0.7] and φ4 = 1

α5 = [3.0,0.8] and φ5 = 1

Figure 1 shows the IF for each parameter of the model when the third manifest variable (i.e. the first normal variable) takes values between -50 and 50 (the other manifest variables are set to a value of 1). We suppose the scale parameters known. One can see that the IF is bounded in a natural way when x_3 becomes really extreme, which can be explained by the conditional density g(x|z) going to 0. However, the size of the bias, which is proportional to the IF, can be quite large for all parameters. It is largest for the parameters corresponding to the first normal variable, i.e. the one that carries the contamination.

In order to have an idea of the size of the (asymptotic) bias of the ML estimator, we can compute the self-standardized sensitivity given in (1) under different contamination settings and also different parameter values. The asymptotic covariance matrix of the ML estimates is the inverse of the information matrix

\[
\left[ \int \cdots \int s(x;\alpha)\, s(x;\alpha)^T f(x;\alpha)\, dx \right]^{-1}
\]

Therefore, γ is

\[
\gamma(T, F_\theta) = \sup_x \left[ s(x, \alpha)^T
\left( \int \cdots \int s(x, \alpha)\, s^T(x, \alpha)\, f(x;\alpha)\, dx \right)^{-1}
s(x, \alpha) \right]^{1/2}
\]

For different combinations of contaminations (i.e. the 1st, 2nd and/or 3rd normal variable taking extreme values), and for the case when the scale parameters are known, we found that γ ≈ 388, which means that for a small amount of contamination, say 1%, the bias on the ML estimates can be as large as 3.88!


3.2 Scale parameter

The scale parameter φi is also estimated via the ML equation given by (3).

As for the α parameters, it is difficult to study the impact of model deviations such as extreme observations on the scale estimates by just looking at the expression of the IF. As before, and for the same model, we computed the IF for the scale parameter when the value of one of the manifest variables is varied (here x_3). As one can see in Figure 2, the score function is bounded for extreme values in x_3, probably because the conditional density g(x|z) goes to 0. However, the value of the IF can be very large, especially for the scale estimate corresponding to the variable whose values are varied.

The self-standardized sensitivity when the scale parameter is also estimated by means of the ML estimator is this time γ ≈ 1402, i.e. more than 3 times larger than in the case where the scale parameters are known.

The study of the IF and the self-standardized sensitivity gives an idea of the asymptotic bias of the ML estimator. To make the point even stronger, we perform simulation studies in the next section, together with simulation studies for robust estimators.

4 Robust estimation for the GLT model

4.1 Robust M-estimators

Several classes of estimators in which one can find robust estimators have been defined (see e.g. Hampel et al. 1986). The best-known class is that of M-estimators, defined by Huber (1964) as a generalisation of the ML estimator: a relatively general function ψ (see Huber 1981) replaces the score function, leading to an M-estimator defined implicitly as the solution in α of

\[
\sum_{h=1}^{n} \psi(x_h; \alpha) = 0 \qquad (5)
\]

It is known that the IF of M-estimators is proportional to ψ (see Hampel et al. 1986) so that choosing a bounded ψ or controlling the bound on ψ defines a robust estimator. For GLT models, we can easily generalize the ML estimator to M-estimators. In the following subsections we present and analyze several proposals of M-estimators for the α parameters.
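Since the IF of an M-estimator is proportional to ψ, bounding ψ bounds the influence. The classic special case (our own illustration, outside the GLT setting) is Huber's location estimator, solved here by iteratively reweighted averaging; c = 1.345 is a conventional tuning constant, not a value from the paper:

```python
import numpy as np

def huber_location(x, c=1.345, tol=1e-8, max_iter=200):
    """Solve sum_h psi(x_h - mu) = 0 with psi(r) = clip(r, -c, c),
    via iteratively reweighted averaging (weights w = psi(r)/r)."""
    mu = np.median(x)                      # robust starting point
    for _ in range(max_iter):
        r = x - mu
        w = np.where(np.abs(r) <= c, 1.0, c / np.abs(r))  # Huber weights
        mu_new = np.sum(w * x) / np.sum(w)
        if abs(mu_new - mu) < tol:
            break
        mu = mu_new
    return mu

rng = np.random.default_rng(1)
contaminated = rng.normal(0.0, 1.0, 500)
contaminated[:15] = 20.0                   # 3% gross errors at the value 20

print(np.mean(contaminated))               # pulled far toward 20
print(huber_location(contaminated))        # stays near the true value 0
```

The bounded ψ caps each observation's contribution at c, so the fifteen gross errors move the estimate by at most a small amount, exactly as the IF argument predicts.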


4.1.1 Optimal bias-robust estimator

Among the M-estimators, the one which has the smallest covariance matrix under the constraint of a bounded IF (see Hampel et al. 1986) is the optimal bias-robust estimator (OBRE). The latter is defined for any score function as

\[
\frac{1}{n}\sum_{h=1}^{n} A(\alpha)\left[ s(x_h;\alpha) - a(\alpha) \right] w_c(x_h) = 0
\]

where the weight function w_c is deduced from the Huber function with parameter c and is given by

\[
w_c(x;\alpha) = \min\left\{ 1\,;\; \frac{c}{\left\| A(\alpha)\left[ s(x;\alpha) - a(\alpha) \right] \right\|} \right\}
\]

and the p(q+1) × p(q+1) matrix A(α) and the p(q+1) vector a(α) are implicitly defined through

\[
\int \cdots \int \left[ s(x;\alpha) - a(\alpha) \right]\left[ s(x;\alpha) - a(\alpha) \right]^T w_c(x)^2 f(x;\alpha)\, dx
= \left[ A(\alpha)^T A(\alpha) \right]^{-1}
\]

\[
\int \cdots \int \left[ s(x;\alpha) - a(\alpha) \right] w_c(x)\, f(x;\alpha)\, dx = 0 \qquad (6)
\]

Needless to say, solving the implicit equations for A(α) and a(α) is a rather complicated procedure. The parameter space becomes very large even in rather simple problems. For example, if we have p = 10 and we fit only one latent variable, then A(α) is of dimension 20×20 (and even bigger if the scale parameters need to be estimated as well)! We therefore propose to explore other, simpler M-estimators.

4.1.2 Residual-robust estimator

If one looks at each element of the score function defining the ML estimator, i.e.

\[
\frac{1}{n}\sum_{h=1}^{n} \frac{1}{f(x_h;\alpha)} \int \cdots \int g(x_h \mid z,\alpha)\,
\mathrm{vec}\!\left[ \frac{x_{ih} - \dot b_i(\theta_{it})}{\phi_i}\, z \right] h(z)\, dz,
\]

one notices that each observation of a variable i contributes to the score function proportionally to the typical deviation (x_{ih} − ḃ_i(θ_{it}))/φ_i. If the observed value x_{ih} for the ith variable is large with respect to its conditional expectation ḃ_i(θ_{it}), then the ML estimator could be attracted by this observation, in that the estimated values would be biased towards it. This is then a


possible quantity to bound. However, in order to be able to choose a value for the bounding constant, it is better to standardize the deviations by the square root of the conditional variances, var[x_i|z] = \ddot b_i(\theta_{it})\phi_i, where \ddot b_i(\theta_{it}) = \partial^2 b(\theta_{it})/\partial \theta_i^2. We therefore propose to bound the following quantity: ∀ x_i, z_t,

\[
\frac{\left| x_i - \dot b_i(\theta_{it}) \right|}{\sqrt{\ddot b_i(\theta_{it})\,\phi_i}} \le c \qquad (7)
\]

so that a unique value for the bounding parameter can be set. The M-estimator we propose is then given implicitly by

\[
\sum_{h=1}^{n} \psi(x_h;\alpha) = \sum_{h=1}^{n} \left[ \psi_i(x_h;\alpha) \right]_{i=1,\ldots,p} = 0 \qquad (8)
\]

with

\[
\psi_i(x;\alpha) = \int \cdots \int \frac{g(x \mid z,\alpha)}{f(x;\alpha)}\,
\mathrm{vec}\!\left[ \frac{x_i - \dot b_i(\theta_{it})}{\phi_i}\, w_c(x_i,\alpha_i)\, z \right] h(z)\, dz
\]

and where

\[
w_c(x_i,\alpha_i) =
\begin{cases}
1 & \text{if } \left| x_i - \dot b_i(\theta_{it}) \right| \le c \sqrt{\ddot b_i(\theta_{it})\,\phi_i} \\[1ex]
\dfrac{c\sqrt{\ddot b_i(\theta_{it})\,\phi_i}}{\left| x_i - \dot b_i(\theta_{it}) \right|} & \text{otherwise}
\end{cases}
\qquad (9)
\]

It should be stressed that the residual-robust estimator is not defined for the scale parameters. If the scale parameters need to be estimated, one has to look for a robust estimator of them as well. We do not propose one here, because later in the paper another robust estimator will be proposed for all the model parameters, including the scale.
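For a normal item with identity link, ḃ(θ) = θ and b̈(θ) = 1, so the weight in (9) reduces to a Huber weight on the residual measured in conditional standard deviations; a minimal sketch (the numerical values are illustrative, not from the paper):

```python
import math

def w_c(x_i, bdot, bddot, phi_i, c=2.0):
    """Huber-type residual weight of equation (9): full weight inside
    c conditional standard deviations, downweighted outside."""
    sd = math.sqrt(bddot * phi_i)          # sqrt of var[x_i|z] = bddot * phi_i
    resid = abs(x_i - bdot)
    return 1.0 if resid <= c * sd else c * sd / resid

# Normal item with identity link: bdot = theta, bddot = 1; say theta = 2, phi = 1.
print(w_c(2.5, bdot=2.0, bddot=1.0, phi_i=1.0))   # small residual -> weight 1
print(w_c(20.0, bdot=2.0, bddot=1.0, phi_i=1.0))  # gross outlier -> 2/18 ~ 0.11
```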

To see whether the estimator really downweights extreme values, one can look at the IF, presented in Figure 3 for the parameters of the mixed GLT model with two binary and three normal manifest variables, when the third manifest variable (i.e. the first normal variable) takes values between -50 and 50 (the other manifest variables are set to a value of 1). It should be compared to Figure 1. One can see that, overall, the influence of extreme data is much more limited than with the ML estimator. It is, however, not certain that the maximal bias is small, and hence the self-standardized sensitivity needs to be computed. We found γ ≈ 69 (when the scale parameters are known), which means that the bias can be as large as 0.69 for 1% of contaminated data. Is this a large quantity? In several simulation studies, we have found that the residual-robust estimator can be biased even with bounding constants c as low as 2. The reason is that the weights in (9) do not take into account the ratio of the conditional to the joint probability, g(x_h|z,α)/f(x_h;α), which can in practice take very large values for some atypical observations. It is therefore important to propose an M-estimator that limits the influence of all elements of the score function.

4.1.3 Globally weighted robust estimator

It is becoming clear that in order to limit the influence of an extreme observation on the resulting estimator, one has to bound the whole score function. This is actually what the OBRE does but, as we have seen, it is probably too complicated to implement for GLT models. What we propose here is a similar version given by

\[
\frac{1}{n}\sum_{h=1}^{n} \psi_c(x_h;\alpha)
= \frac{1}{n}\sum_{h=1}^{n} \left[ s_i(x_h;\alpha)\, w_i(x_h, c) \right]_{i=1,\ldots,p} = 0
\]

where the weight function w_i is the Huber function with parameter c, given by

\[
w_i(x; c) = \min\left\{ 1\,;\; \frac{c}{\left\| s_i(x;\alpha) \right\|} \right\}
\]

and

\[
s_i(x;\alpha) = \frac{1}{f(x;\alpha)} \int \cdots \int g(x \mid z,\alpha)\,
\mathrm{vec}\!\left[ \frac{x_i - \dot b(\theta_i)}{\phi_i}\, z \right] h(z)\, dz
\]

Note that one could choose different bounding constants c, one for each manifest variable. We did not, however, explore this possibility because we want to keep the robust estimator as simple as possible. Note also that this robust estimator is also defined for the scale parameters. The IF of this globally weighted robust (GWR) estimator with bounding constant c = 4 for the parameters of the mixed GLT model with two binary and three normal manifest variables, when the third manifest variable (i.e. the first normal variable) takes values between -50 and 50 (the other manifest variables are set to a value of 1), are given in Figure 4. One can notice that the IF are now bounded at a lower level, so that the bias should be smaller. One can also set the bounding constant to a lower value, say c = 2, and get the IF given in Figure 5. The information we can extract from these IF is that, globally, the bias should be limited, except maybe for the parameters of the binary variables α_{11} and α_{12}.

Finally, the self-standardized sensitivity for the GWR estimator with c = 4 and c = 2 is γ ≈ 23.78 and γ ≈ 11.89, respectively. The bias can then be limited to about 0.1 or 0.3 for 1% or 3% of contaminated data.


4.2 Consistency

For an M-estimator defined generally through a ψ-function as

\[
\sum_{h=1}^{n} \psi(x_h;\alpha) = 0,
\]

Fisher consistency implies

\[
\int \cdots \int \psi(x;\alpha)\, f(x;\alpha)\, dx = 0
\]

When this is not the case, one can make the M-estimator Fisher consistent by adding a proper correction term to its definition, i.e.

\[
\frac{1}{n}\sum_{h=1}^{n} \psi(x_h;\alpha) - a(\alpha) = 0
\]

such that

\[
a(\alpha) = \int \cdots \int \psi(x;\alpha)\, f(x;\alpha)\, dx
\]

For the GWR estimator, we have

\[
a(\alpha) = \left[ \int \cdots \int g(x \mid z,\alpha)\,
\mathrm{vec}\!\left[ \frac{x_i - \dot b_i(\theta_i(z))}{\phi_i}\, w_i(x, c)\, z \right] dx\, h(z)\, dz
\right]_{i=1,\ldots,p}
\]

This quantity is not obvious to compute but, as we will see from our simulations, if there is a bias (under the true model), then it is very small.

4.3 Efficiency

The asymptotic covariance matrix of M-estimators defined in (5) is given by

\[
V(\psi, \alpha) = M^{-1}(\psi, \alpha)\, Q(\psi, \alpha)\, M^{-T}(\psi, \alpha) \qquad (10)
\]

where

\[
M(\psi, \alpha) = \int \cdots \int \psi(x;\alpha)\, s^T(x;\alpha)\, f(x;\alpha)\, dx
\]

and

\[
Q(\psi, \alpha) = \int \cdots \int \psi(x;\alpha)\, \psi^T(x;\alpha)\, f(x;\alpha)\, dx
\]

These quantities are very complicated to compute unless some simplifications are made. First, the quadrature points are used to compute the integrals involved in s(x;α) and in ψ(x;α), as given in (4) and (8) respectively. Then we propose to compute the sample version of V(ψ, α), obtained with

\[
M(\psi, \alpha) = \frac{1}{n}\sum_{h=1}^{n} \psi(x_h;\alpha)\, s^T(x_h;\alpha)
\]

and

\[
Q(\psi, \alpha) = \frac{1}{n}\sum_{h=1}^{n} \psi(x_h;\alpha)\, \psi^T(x_h;\alpha)
\]

The sample is simulated given the values of the model parameters. With the parameter values used in our previous simulation studies and a simulated (uncontaminated) sample of 1000 observations, we found the relationship between the efficiency of the GWR estimator and the bounding constant c given in Figure 6. In particular, for an efficiency ratio of 95%, one can use a bounding constant of c = 1.1, whereas a bounding constant of c = 2 leads to an efficiency ratio of 98.7%. It should be noted that in principle the efficiency depends on the parameter values. A strategy that is often adopted in such cases is to try different bounding constants c and compute the efficiency given the values of the estimates.

Note that inference for each estimated parameter can be performed, since the asymptotic covariance matrix for any M-estimator is given by (10). It can be estimated in the same way as is done for computing the efficiency.
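The sample sandwich formula can be illustrated in the simplest scalar case, Huber's location estimator at a standard normal model (a sketch under our own simplifications; the score there is s(x) = x − μ, and c = 1.345 is the classical tuning constant, not a value from the paper):

```python
import numpy as np

def sandwich_variance(x, mu, c=1.345):
    """Sample version of V = M^{-1} Q M^{-T} for Huber's location
    estimator at the normal model, where the score is s(x) = x - mu."""
    r = x - mu
    psi = np.clip(r, -c, c)               # bounded psi-function
    M = np.mean(psi * r)                  # (1/n) sum psi(x_h) s(x_h)
    Q = np.mean(psi ** 2)                 # (1/n) sum psi(x_h)^2
    return Q / M ** 2                     # scalar case of M^{-1} Q M^{-T}

rng = np.random.default_rng(2)
x = rng.normal(0.0, 1.0, 100_000)
V = sandwich_variance(x, mu=0.0)
print(V)   # close to 1/0.95 ~ 1.053, the well-known 95%-efficiency value
```

Here the sample average of ψ(r)·s(x) estimates M: for the normal location score s(x) = x − μ, Stein's identity gives E[ψ(r) r] = E[ψ′(r)], the usual denominator of the Huber asymptotic variance.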

5 Simulation study

In this section, we present a small simulation study that should enable one to confirm the results found theoretically. In particular, we would like to check that the bias under contaminated data is smaller with the GWR estimator and that in some settings the ML estimator can be seriously biased.

Moreover, we would like to have an idea of the bias of the GWR estimator under no contamination, i.e. at the true model. To that end, we simulated 100 samples of size 500 from the mixed GLT model used previously. We also contaminated the data in two different ways. In one case we chose 3% of the observations of the first normal variable (i.e. x_3) at random and set them to an arbitrary value (20), whereas in the second case we chose 3% of the subjects at random and set their responses on all the normal variables to the arbitrary value (20). We then estimated the 10 parameters of the GLT model; the results are presented as boxplots in Figure 7 for the binary items, in Figure 8 for the means (α_{0i}) of the normal items, and in Figure 9 for the latent variable parameters (α_{1i}) of the normal items. The horizontal lines correspond to the true parameter values. The first two boxplots in each graph are the distributions of the estimators under no model contamination, the following two when only one normal variable is contaminated (3%), and the last two when all three normal variables are contaminated (3%). One can first notice that at the model, i.e. when there is no contamination, the GWR estimator is unbiased or its bias is very small. When there is contamination, the ML estimator is biased, not always in the same manner, but the GWR estimator is either unbiased or at least less biased. Therefore, for a small efficiency loss (less than 2%), there is a clear gain in using the GWR estimator we proposed.
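The two contamination schemes can be sketched as follows (a minimal sketch; the seed and variable names are ours, and the clean part uses the normal-item means α_{i0} = 2.0, 2.5, 3.0 with unit scale, as in Section 3):

```python
import numpy as np

rng = np.random.default_rng(3)
n, eps, spike = 500, 0.03, 20.0
m = int(round(eps * n))                    # 15 contaminated cases

# Clean normal items x3, x4, x5 with the means used previously.
X = rng.normal(loc=[2.0, 2.5, 3.0], scale=1.0, size=(n, 3))

# Scheme 1: set 3% of the observations of x3 only to the value 20.
X1 = X.copy()
X1[rng.choice(n, size=m, replace=False), 0] = spike

# Scheme 2: set all normal responses of 3% of the subjects to the value 20.
X2 = X.copy()
X2[rng.choice(n, size=m, replace=False), :] = spike

print((X1 == spike).sum(), (X2 == spike).sum())   # 15 vs 45 contaminated cells
```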

We also tried a more robust estimator by setting the bounding constant to c = 1.1. The results are presented in Figures 10, 11 and 12. With this lower bounding constant, the GWR estimator for the binary parameters becomes unbiased under contamination. It behaves similarly to the c = 2 version for the parameters (α_{1i}) of the normal items, except that under the model it seems more biased. A correction for the bias therefore seems necessary.

6 Conclusion

In this paper we have shown that the ML estimator for the GLT model, at least when binary and normal manifest variables are mixed, can be biased when the data are not exactly generated by the postulated model. This is for example the case when there are extreme observations, i.e. subjects not behaving like the majority. We have investigated this robustness problem by means of the IF and the self-standardized sensitivity. We have also proposed some robust alternative estimators for the α parameters, one of which seems satisfactory in terms of limited bias, efficiency and computational complexity. What we have not yet investigated is the estimation of the scale parameters, which are needed to compute confidence intervals for the α parameters. This problem is left for future research.


References

Bartholomew, D. J. and M. Knott (1999). Latent Variable Models and Factor Analysis. Kendall's Library of Statistics 7. London: Arnold.

Hampel, F. R. (1968). Contribution to the Theory of Robust Estimation. Ph.D. thesis, University of California, Berkeley.

Hampel, F. R. (1974). The influence curve and its role in robust estimation. Journal of the American Statistical Association 69, 383–393.

Hampel, F. R., E. M. Ronchetti, P. J. Rousseeuw, and W. A. Stahel (1986). Robust Statistics: The Approach Based on Influence Functions. New York: John Wiley.

Huber, P. J. (1964). Robust estimation of a location parameter. Annals of Mathematical Statistics 35, 73–101.

Huber, P. J. (1981). Robust Statistics. New York: John Wiley.

Lee, S.-Y., W.-Y. Poon, and P. M. Bentler (1992). Structural equation models with continuous and polytomous variables. Psychometrika 57, 89–105.

Moustaki, I. (1996). A latent trait and a latent class model for mixed observed variables. British Journal of Mathematical and Statistical Psychology 49, 313–334.

Moustaki, I. (2000). A latent variable model for ordinal variables. Applied Psychological Measurement 24, 211–223.

Moustaki, I. and M. Knott (2000). Generalized latent trait models. Psychometrika 65, 391–411.

Muthén, B. (1984). A general structural equation model with dichotomous, ordered categorical and continuous latent variable indicators. Psychometrika 49, 115–132.

Olsson, U. (1979). Maximum likelihood estimation of the polychoric correlation coefficient. Psychometrika 44, 443–460.

Olsson, U., F. Drasgow, and N. Dorans (1982). The polyserial correlation coefficient. Psychometrika 47, 337–347.

O'Muircheartaigh, C. and I. Moustaki (1999). Symmetric pattern models: A latent variable approach to item non-response in attitude scales. Journal of the Royal Statistical Society, Series A 162, 177–194.

Poon, W.-Y. and S.-Y. Lee (1987). Maximum likelihood estimation of multivariate polyserial and polychoric correlation coefficients (corr: V53 p301). Psychometrika 52, 409–430.

Figure 1: IF for the ML estimator of a mixed GLT model with two binary and three normal manifest variables.

Figure 2: IF for the ML estimator of the scale parameter of a mixed GLT model with two binary and three normal manifest variables.

Figure 3: IF for the residual-robust estimator (c = 4) of a mixed GLT model with two binary and three normal manifest variables.

Figure 4: IF for the GWR estimator (c = 4) of a mixed GLT model with two binary and three normal manifest variables.

Figure 5: IF for the GWR estimator (c = 2) of a mixed GLT model with two binary and three normal manifest variables.

Figure 6: Efficiency versus bounding constant c for the GWR estimator.

Figure 7: Distribution of the ML and GWR (c = 2) estimators for the binary parameters under different data contamination.

Figure 8: Distribution of the ML and GWR (c = 2) estimators for the means (α_{0i}) of the normal items under different data contamination.

Figure 9: Distribution of the ML and GWR (c = 2) estimators for the latent variable parameters (α_{1i}) of the normal items under different data contamination.

Figure 10: Distribution of the ML and GWR (c = 1.1) estimators for the binary parameters under different data contamination.

Figure 11: Distribution of the ML and GWR (c = 1.1) estimators for the means (α_{0i}) of the normal items under different data contamination.

Figure 12: Distribution of the ML and GWR (c = 1.1) estimators for the latent variable parameters (α_{1i}) of the normal items under different data contamination.
