1.3 Measures

1.3.1 Measures of model adequacy only and of model adequacy and model

In this section, we introduce and compare the measures belonging to categories A, A&S and F&S (cf. Table 1.1).

1.3.1.1 Gelman and Pardoe (2006)

For a LMM with L variance components, Gelman and Pardoe (2006) presented two Bayesian measures that summarize information in the data at each “level” of the model.

We write level in quotation marks because it corresponds to the separate variance components, rather than to the more usual definition based on the hierarchy of the data. For instance, consider this varying-intercept model:

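$$y_{ij} = \beta_0 + b_{i0} + \beta_1 x_{ij} + \epsilon_{ij}, \quad i = 1, \ldots, m, \quad j = 1, \ldots, n_i,$$

$$b_{i0} \sim N(0, \tau^2), \quad \epsilon_{ij} \sim N(0, \sigma^2), \qquad (1.2)$$

with a predictor $x_{ij}$. Model (1.2) can be written hierarchically with $\beta_{0i} = \beta_0 + b_{i0}$, as $y_{ij} \sim N(\beta_{0i} + \beta_1 x_{ij}, \sigma^2)$, $\beta_{0i} \sim N(\beta_0, \tau^2)$.

This model has two “levels” (data $y_{ij}$, intercepts $\beta_{0i}$) with a different variance component at each “level” ($\sigma^2$, $\tau^2$).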

The model is defined at each “level” $l = 1, \ldots, L$ as

$$\zeta_k^{(l)} = \nu_k^{(l)} + e_k^{(l)}$$

for $k = 1, \ldots, K^{(l)}$. $\zeta_k^{(1)}$ corresponds to $y_{ij}$ in (1.1) and, for $l > 1$, $\zeta_k^{(l)}$ are the random coefficients. $\nu_k^{(l)}$ are the linear predictors and $e_k^{(l)}$ are the errors, which follow a distribution with mean 0 and standard deviation $\sigma^{(l)}$. For instance, for model (1.2) and for $l = 1$, we have $\zeta_k^{(1)} = y_{ij}$, $\nu_k^{(1)} = \beta_0 + b_{i0} + \beta_1 x_{ij}$, $e_k^{(1)} = \epsilon_{ij}$ and $\sigma^{(1)} = \sigma$. For $l = 2$, we have $\zeta_k^{(2)} = \beta_{0i}$, $\nu_k^{(2)} = \beta_0$, $e_k^{(2)} = b_{i0}$ and $\sigma^{(2)} = \tau$.

Subsequently, we suppress the superscripts $(l)$, as in Gelman and Pardoe (2006), because we work with each “level” separately. The variation explained by the linear predictors $\nu_k$ for each “level” is defined in the population by $1 - \mathrm{E}(\mathrm{Var}(e_k)) / \mathrm{E}(\mathrm{Var}(\zeta_k))$ and is computed as

$$R^2_{\text{“level”}} = 1 - \frac{\mathrm{E}\left(V_{k=1}^{K} \hat{e}_k\right)}{\mathrm{E}\left(V_{k=1}^{K} \hat{\zeta}_k\right)},$$

where “E” is the posterior mean, “Var” is the posterior variance, “V” is the finite-sample variance operator ($V_{i=1}^{m}(x_i) = (m-1)^{-1} \sum_{i=1}^{m} (x_i - \bar{x})^2$), and $\hat{e}_k$ and $\hat{\zeta}_k$ are the estimates of $e_k$ and $\zeta_k$, respectively. The expectations are estimated by averaging over posterior simulation draws, which gives rise to a “Bayesian adjusted $R^2$”, a generalization of the classical adjusted $R^2$ in regression. $R^2_{\text{“level”}}$ usually takes values between 0 and 1.

For each “level,” the values of 0 and 1 indicate, respectively, a poor and a perfect fit with respect to the error variance explained at that “level.” If $R^2_{\text{“level”}}$ is negative, the prediction is so poor that the estimated error variance is larger than the variance of the data.

The measure that summarizes the average amount of pooling at each “level” is the pooling factor $\lambda_{\text{“level”}}$, which is defined in the population by $1 - \mathrm{Var}(\mathrm{E}(e_k)) / \mathrm{E}(\mathrm{Var}(e_k))$ and is computed as

$$\lambda_{\text{“level”}} = 1 - \frac{V_{k=1}^{K}\left(\mathrm{E}(\hat{e}_k)\right)}{\mathrm{E}\left(V_{k=1}^{K} \hat{e}_k\right)}.$$

This measure ranges from 0 to 1, where 0 corresponds to no pooling and 1 to complete pooling. For model (1.2), the complete-pooling model is $y_{ij} = \beta_0 + \beta_1 x_{ij} + \epsilon_{ij}$, with common estimates for all $i$, and the no-pooling model is $y_{ij} = \beta_{0i} + \beta_1 x_{ij} + \epsilon_{ij}$, with $m$ intercepts $\beta_{0i}$ estimated by least squares. A low pooling factor, $\lambda_{\text{“level”}} < 0.5$, indicates a higher degree of within-group information than population-“level” information. A high pooling factor, $\lambda_{\text{“level”}} > 0.5$, indicates a higher degree of population-“level” information than within-group information.
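As a rough illustration of how the two Gelman and Pardoe (2006) summaries can be computed from posterior simulation output, the following Python sketch assumes that the draws of the errors and of the modeled quantities at one “level” are already available as arrays of shape (S, K), with S draws and K units. The function name and the array layout are assumptions made for this example, not part of the original paper.

```python
import numpy as np

def level_summaries(e_draws, zeta_draws):
    """Sketch of R^2 and pooling factor at one "level".

    e_draws, zeta_draws: arrays of shape (S, K) holding S posterior
    draws of the K errors e_k and of the K modeled quantities zeta_k
    (hypothetical layout chosen for this example).
    """
    e = np.asarray(e_draws, dtype=float)
    zeta = np.asarray(zeta_draws, dtype=float)

    # Finite-sample variance V over the K units, within each draw.
    v_e = e.var(axis=1, ddof=1)
    v_zeta = zeta.var(axis=1, ddof=1)

    # E(.) is approximated by averaging over the posterior draws.
    r2 = 1.0 - v_e.mean() / v_zeta.mean()

    # V over units of the posterior mean errors E(e_k).
    v_mean_e = e.mean(axis=0).var(ddof=1)
    pooling = 1.0 - v_mean_e / v_e.mean()
    return r2, pooling
```

When the posterior means of the errors are all equal, the numerator of the pooling term vanishes and the pooling factor is 1. When the errors do not vary across draws, the pooling factor is 0.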

1.3.1.2 Snijders and Bosker (1994)

Snijders and Bosker (1994) defined two measures of modeled variation, one at each level of a two-level LMM.

This model is equivalent to model (1.1), in which levels 1 and 2 correspond to the subject level j and group level i, respectively, R_i = σ²I_n_i and D = τ²I_q, where I_q is the q×q identity matrix. These measures are defined for two-level models only and they require a null model, which contains a fixed intercept and a random intercept, with variance τ₀²I_q, and for which R_i0 =σ²₀I_n_i.

The level-1 modeled proportion of variation is defined in the population as the proportional reduction in mean squared prediction error for $y_{ij}$, $1 - \mathrm{var}(y_{ij} - X_{ij}\beta) / \mathrm{var}(y_{ij})$. The corresponding criterion is

$$R_1^2 = 1 - \frac{\widehat{\mathrm{var}}(y_{ij} - X_{ij}\beta)}{\widehat{\mathrm{var}}(y_{ij})},$$

where $\widehat{\mathrm{var}}$ is obtained by plugging the parameter estimates into the population formula and $X_{ij}$ is the $1 \times p$ vector of fixed effects for subject $j$ in group $i$.

The level-2 modeled proportion of variation is defined in the population as the proportional reduction in mean squared prediction error for $\bar{y}_{i.}$, $1 - \mathrm{var}(\bar{y}_{i.} - X_{i.}\beta) / \mathrm{var}(\bar{y}_{i.})$. The corresponding criterion is

$$R_2^2 = 1 - \frac{\widehat{\mathrm{var}}(\bar{y}_{i.} - X_{i.}\beta)}{\widehat{\mathrm{var}}(\bar{y}_{i.})},$$

where $\bar{y}_{i.}$ and $X_{i.}$ are the group means of $y_{ij}$ and $X_{ij}$, respectively.

To estimate population parameters, the random effects and the errors are assumed normally distributed. For $R_1^2$, and for $R_2^2$ for balanced data ($n_i = n$ for all $i$), the numerator is the sample variance of the model of interest and the denominator is the sample variance of the null model. For example, for model (1.2), the criteria that estimate the population parameters are $R_1^2 = 1 - (\hat{\sigma}^2 + \hat{\tau}^2) / (\hat{\sigma}_0^2 + \hat{\tau}_0^2)$ and $R_2^2 = 1 - (\hat{\sigma}^2/n + \hat{\tau}^2) / (\hat{\sigma}_0^2/n + \hat{\tau}_0^2)$, where $\hat{\sigma}^2$, $\hat{\tau}^2$, $\hat{\sigma}_0^2$ and $\hat{\tau}_0^2$ are the estimates of $\sigma^2$, $\tau^2$, $\sigma_0^2$ and $\tau_0^2$, respectively.

In the case of unbalanced data, the authors advise using a representative value of $n_i$, such as the harmonic mean $\left(m^{-1} \sum_i n_i^{-1}\right)^{-1}$. $R_1^2$ and $R_2^2$ are interpreted in the same way as the traditional coefficient of determination (Draper and Smith, 1998). $R_1^2$ and $R_2^2$ identify which predictors are useful to predict $y_{ij}$ and $\bar{y}_{i.}$, respectively. Population values lie between 0 and 1. Negative values of the estimates are possible when the fixed part of the model is misspecified.
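The plug-in criteria for the varying-intercept model (1.2) are simple enough to sketch in a few lines of Python. The function below is a minimal illustration that assumes the four variance estimates have already been obtained from the fitted model of interest and from the null model. Its name and arguments are hypothetical, and the harmonic mean of the group sizes is used as the representative group size.

```python
def snijders_bosker_r2(sigma2, tau2, sigma2_0, tau2_0, group_sizes):
    """Sketch of the level-1 and level-2 criteria for model (1.2).

    sigma2, tau2     : variance estimates under the model of interest
    sigma2_0, tau2_0 : variance estimates under the null model
    group_sizes      : list of the n_i (hypothetical argument name)
    """
    m = len(group_sizes)
    # Harmonic mean of the n_i; equals n when the data are balanced.
    n_rep = m / sum(1.0 / n for n in group_sizes)

    r2_level1 = 1.0 - (sigma2 + tau2) / (sigma2_0 + tau2_0)
    r2_level2 = 1.0 - (sigma2 / n_rep + tau2) / (sigma2_0 / n_rep + tau2_0)
    return r2_level1, r2_level2
```

For instance, with variance estimates 2 and 1 under the model of interest, 4 and 2 under the null model, and two groups of size 4, both criteria equal 0.5.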

1.3.1.3 Vonesh et al. (1996)

The model concordance correlation coefficient for generalized nonlinear mixed-effects models is defined as

$$r_c = 1 - \frac{\sum_{i=1}^{m} (y_i - \hat{y}_i)'(y_i - \hat{y}_i)}{\sum_{i=1}^{m} (y_i - \bar{y} 1_{n_i})'(y_i - \bar{y} 1_{n_i}) + \sum_{i=1}^{m} (\hat{y}_i - \hat{\bar{y}} 1_{n_i})'(\hat{y}_i - \hat{\bar{y}} 1_{n_i}) + N(\bar{y} - \hat{\bar{y}})^2}.$$

Initially introduced by L. I. Lin (1989) to measure the degree of agreement between pairs of observations, $r_c$ is interpretable as a concordance correlation coefficient between observed and predicted values.

To assess the GOF associated with the fixed effects and to select the best set of fixed effects, a marginal model concordance correlation is obtained by setting $\hat{b}_i = 0$ in $\hat{y}_i$. If the $\hat{b}_i$ are not set to zero, as in the original definition, $r_c$ is referred to as the conditional model concordance correlation and it assesses the GOF associated with fixed and random effects. The values of $r_c$ range between −1 and 1, like the usual Pearson correlation, but with a slightly different interpretation. Indeed, $r_c$ measures the level of agreement, or concordance, between $y_i$ and $\hat{y}_i$: a value of 1 indicates perfect fit, while a value smaller than, or equal to, zero indicates lack of fit. Adjusted values for the number of parameters in $\beta$ are defined as $r_{c,a} = 1 - N(N - p)^{-1}(1 - r_c)$, so that the conditional $r_{c,a}$ can be used for model selection.
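Because the sums over groups in the definition of r_c are sums of squared differences over all N observations, the coefficient can be sketched in Python on the stacked vectors of observed and predicted values. The function name and the optional argument p are assumptions made for this example. Passing marginal predictions (random effects set to zero) would give the marginal version, and passing conditional predictions the conditional one.

```python
import numpy as np

def model_concordance(y, y_hat, p=None):
    """Sketch of the model concordance correlation r_c.

    y, y_hat : stacked observed and predicted values (length N)
    p        : number of fixed-effects parameters; if given, the
               adjusted value r_{c,a} is returned instead of r_c
    """
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    n_obs = y.size

    numerator = np.sum((y - y_hat) ** 2)
    denominator = (
        np.sum((y - y.mean()) ** 2)
        + np.sum((y_hat - y_hat.mean()) ** 2)
        + n_obs * (y.mean() - y_hat.mean()) ** 2
    )
    r_c = 1.0 - numerator / denominator
    if p is None:
        return r_c
    # Adjustment for the number of parameters in beta.
    return 1.0 - n_obs / (n_obs - p) * (1.0 - r_c)
```

Perfect predictions give a value of 1, while predictions that exactly reverse the observed values around their mean give −1.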

1.3.1.4 Vonesh and Chinchilli (1996)

Another measure of explained residual variation for generalized nonlinear mixed-effects models, which requires the specification of a null model, is introduced as follows:

$$R^2_{VC} = 1 - \frac{\sum_{i=1}^{m} (y_i - \hat{y}_i)' V_i^{-1} (y_i - \hat{y}_i)}{\sum_{i=1}^{m} (y_i - \hat{y}_{i0})' V_i^{-1} (y_i - \hat{y}_{i0})},$$

for any positive definite matrix $V_i$. Sensible choices for $V_i$ are either $\hat{R}_i$ or $\hat{R}_{i0}$, the covariance estimates of $R_i$ or $R_{i0}$, respectively, obtained by plugging in the parameter estimates. For a measure that can be compared across different candidate models, the most reasonable choice is $\hat{R}_{i0}$.

This measure can be interpreted as the proportional decrease in residual variability compared with the residual variability of a null model, which is either one containing only a fixed intercept, or one with a fixed intercept and a random intercept. As for $r_c$, a marginal and a conditional $R^2_{VC}$ can be defined, depending on whether the $\hat{b}_i$ in $\hat{y}_i$ are set to zero or not (the original definition). The adjusted values are denoted $R^2_{VC,a}$ and are defined as $1 - N(N - p)^{-1}(1 - R^2_{VC})$. As for $r_c$, the marginal $R^2_{VC}$ can be used for fixed effects selection, and the adjusted conditional $R^2_{VC,a}$ can be used for model selection.
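The weighted form of this measure can be sketched in Python by looping over groups and accumulating the two quadratic forms. In the sketch below, the function name and the list-of-arrays layout are assumptions made for illustration. The weighting matrices would typically be the plug-in estimates under the null model, as recommended above, but any positive definite matrices can be passed.

```python
import numpy as np

def r2_vonesh_chinchilli(y_groups, fit_groups, null_fit_groups, v_groups):
    """Sketch of R^2_VC from per-group vectors and weighting matrices.

    Each argument is a list with one entry per group i:
    y_i, the predictions y_hat_i, the null-model predictions
    y_hat_i0, and a positive definite n_i x n_i matrix V_i.
    """
    numerator = 0.0
    denominator = 0.0
    for y, fit, fit0, v in zip(y_groups, fit_groups, null_fit_groups, v_groups):
        y = np.asarray(y, dtype=float)
        v = np.asarray(v, dtype=float)
        res = y - np.asarray(fit, dtype=float)
        res0 = y - np.asarray(fit0, dtype=float)
        # Quadratic forms r' V^{-1} r, computed without an explicit inverse.
        numerator += res @ np.linalg.solve(v, res)
        denominator += res0 @ np.linalg.solve(v, res0)
    return 1.0 - numerator / denominator
```

With identity-proportional weighting matrices, the scale cancels and the measure reduces to one minus the ratio of the two residual sums of squares.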

1.3.1.5 Zheng (2000)

For generalized LMMs with normally distributed random effects, we consider three measures for assessing model adequacy among a set proposed by Zheng (2000). The first measure is the proportional reduction in deviance $D_{rand}$, defined as

$$D_{rand} = 1 - \frac{\sum_{i=1}^{m} d_i(y_i, \hat{y}_i)}{\sum_{i=1}^{m} d_i(y_i, \bar{y} 1_{n_i})},$$

where the numerator is the deviance under the model of interest and the denominator is the deviance under the null model that fits only a fixed intercept. For a fixed dispersion parameter $\phi$, the $N \times 1$ vector of responses $y = (y_1, \ldots, y_m)'$, the $N \times p$ design matrix for fixed effects $X = (X_1, \ldots, X_m)'$, the $mq \times 1$ vector of random effects $b = (b_1, \ldots, b_m)'$, $\mu = \mathrm{E}(y \mid X, b)$, and the log of the joint likelihood function given $X$ and $b$, $l(\mu, \phi; y)$, the deviance $d(y, \mu)$ is defined as $d(y, \mu) = -2\phi\,[l(\mu, \phi; y) - l(y, \phi; y)]$. $D_{rand}$ is an extension of the coefficient of determination, as, in the LMM with normal random effects and errors, $\sum_{i=1}^{m} d_i(y_i, \hat{y}_i) = \sum_{i=1}^{m} (y_i - \hat{y}_i)'(y_i - \hat{y}_i)$ and $\sum_{i=1}^{m} d_i(y_i, \bar{y} 1_{n_i}) = \sum_{i=1}^{m} (y_i - \bar{y} 1_{n_i})'(y_i - \bar{y} 1_{n_i})$. The measure $D_{rand}$ ranges between 0 and 1, with 0 meaning that the model of interest provides no improvement in prediction over the null model and 1 meaning perfect prediction.

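The second measure is the proportional reduction in Penalized Quasi-Likelihood (PQL) $P_{rand}$, defined as

$$P_{rand} = 1 - \frac{\sum_{i=1}^{m} d_i(y_i, \hat{y}_i) / (2\hat{\phi}) + \hat{b}'(\hat{D} \otimes I_m)^{-1} \hat{b} / 2}{\sum_{i=1}^{m} d_i(y_i, \bar{y} 1_{n_i}) / (2\hat{\phi})},$$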

where I_m is the m×m identity matrix and ˆb = (ˆb₁, . . . ,bˆ_m)⁰ is the mq×1 vector of predicted random effects. P_rand ranges between 0 and 1, equals 0 when the prediction is poor and/or the random effects are large, and equals 1 when the prediction is perfect and the random effects are 0. Due to the penalty for a large number of random effects, P_rand allows for model selection.

The third measure by Zheng (2000) is the concordance index c, a measure of cross-sectional agreement. It equals the proportion of pairs of observations with unequal y values for which the ranks of ˆy and of y are concordant. As c depends on ranking information only, it cannot distinguish between models that yield the same ranking of the fitted values.

In order to identify the best set of fixed effects, we further consider the marginal versions of these three measures obtained by setting $\hat{b}_i = 0$ in $\hat{y}_i$ for $D_{rand}$ and $c$, and, for $P_{rand}$, by setting $\hat{b}_i = 0$ in $\hat{y}_i$ and in the second term of its numerator, which simplifies to the marginal $D_{rand}$, showing that $P_{rand}$ is also an extension of the coefficient of determination.
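The concordance index c described above depends only on the ordering of the fitted values, which a small Python sketch makes easy to check. The function below is an illustrative implementation under one stated assumption: pairs with tied fitted values are counted as not concordant, since the text does not say how such ties are handled. The function name is hypothetical.

```python
from itertools import combinations

def concordance_index(y, y_hat):
    """Sketch of the concordance index c.

    Proportion of pairs with unequal observed values whose fitted
    values are ordered the same way. Assumption: pairs with tied
    fitted values are counted as not concordant.
    """
    concordant = 0
    usable = 0
    for (y_a, f_a), (y_b, f_b) in combinations(zip(y, y_hat), 2):
        if y_a == y_b:
            continue  # pairs with equal observed values are skipped
        usable += 1
        if (y_a - y_b) * (f_a - f_b) > 0:
            concordant += 1
    return concordant / usable
```

Rescaling the fitted values, for example multiplying them all by 100, leaves c unchanged, which illustrates why c cannot distinguish between models that produce the same ranking.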

1.3.1.6 Xu (2003)

For LMMs with normal errors with constant variance $R_i = \sigma^2 I_{n_i}$ and with normal random effects, Xu (2003) presented three measures that require a null model specified either with only a fixed intercept, or with also a random intercept. For both null models, $R_{i0}$ is equal to $\sigma_0^2 I_{n_i}$.

The two measures of explained variation are defined as

$$r^2 = 1 - \frac{\hat{\sigma}^2}{\hat{\sigma}_0^2},$$

where $\hat{\sigma}^2$ and $\hat{\sigma}_0^2$ are the estimates of $\sigma^2$ and $\sigma_0^2$, and

$$R^2 = 1 - \frac{\sum_{i=1}^{m} (y_i - \hat{y}_i)'(y_i - \hat{y}_i)}{\sum_{i=1}^{m} (y_i - \hat{y}_{i0})'(y_i - \hat{y}_{i0})}.$$

A measure of explained randomness using the conditional likelihood of the observed data given the predicted random effects is also proposed as

$$\rho^2 = 1 - \frac{\hat{\sigma}^2}{\hat{\sigma}_0^2} \exp\left( \frac{\sum_{i=1}^{m} (y_i - \hat{y}_i)'(y_i - \hat{y}_i)}{N \hat{\sigma}^2} - \frac{\sum_{i=1}^{m} (y_i - \hat{y}_{i0})'(y_i - \hat{y}_{i0})}{N \hat{\sigma}_0^2} \right).$$

These measures are interpretable as the classical coefficient of determination, with range [0,1]. Xu (2003) emphasized that in ordinary linear regression, if the REML estimates are used for the variance components, $r^2$ is the adjusted $R^2$. As an extension of the classical adjusted coefficient of determination, $r^2$ can be used for model selection.

For fixed effects selection, we further consider the marginal version of $R^2$ by setting $\hat{b}_i = 0$ in $\hat{y}_i$.
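The three measures of Xu (2003) only require the residual sums of squares and the error variance estimates under the model of interest and under the null model. The following Python sketch takes these as inputs. The function and argument names are assumptions for this example, and how the quantities are obtained from a fitted LMM is left to the user's software.

```python
import math

def xu_measures(rss, rss0, sigma2, sigma2_0, n_obs):
    """Sketch of r^2, R^2 and rho^2.

    rss, rss0        : residual sums of squares under the model of
                       interest and under the null model
    sigma2, sigma2_0 : corresponding error variance estimates
    n_obs            : total number of observations N
    """
    r2_variance = 1.0 - sigma2 / sigma2_0
    r2_residual = 1.0 - rss / rss0
    rho2 = 1.0 - (sigma2 / sigma2_0) * math.exp(
        rss / (n_obs * sigma2) - rss0 / (n_obs * sigma2_0)
    )
    return r2_variance, r2_residual, rho2
```

When each variance estimate equals the corresponding residual sum of squares divided by N, the exponent is zero and the explained randomness coincides with the first measure.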

1.3.1.7 H. Liu et al. (2008)

To estimate the portion of explained variation of the modeled data by the fitted LMM, H. Liu et al. (2008) introduced three $R^2$ statistics. They assumed that the errors and the random effects are normally distributed with a constant variance $R_i = \sigma^2 I_{n_i}$ for the errors. The measures compare the residuals of an intercept-only model with the residuals of the model of interest considering, respectively, the fixed effects only ($R^2_F$), the fixed and random effects ($R^2_T$), and the total fixed effects, where all variables and individuals are treated as fixed ($R^2_{TF}$):

$$R^2_F = 1 - \frac{\sum_{i=1}^{m} (y_i - X_i\hat{\beta})'(y_i - X_i\hat{\beta})}{\sum_{i=1}^{m} (y_i - \bar{y} 1_{n_i})'(y_i - \bar{y} 1_{n_i})},$$

$$R^2_T = 1 - \frac{\sum_{i=1}^{m} (y_i - \hat{y}_i)'(y_i - \hat{y}_i)}{\sum_{i=1}^{m} (y_i - \bar{y} 1_{n_i})'(y_i - \bar{y} 1_{n_i})},$$

$$R^2_{TF} = 1 - \frac{\sum_{i=1}^{m} (y_i - (X_i, Z_i)\hat{\eta})'(y_i - (X_i, Z_i)\hat{\eta})}{\sum_{i=1}^{m} (y_i - \bar{y} 1_{n_i})'(y_i - \bar{y} 1_{n_i})},$$

where $\hat{\eta} = \sum_{i=1}^{m} ((X_i, Z_i)'(X_i, Z_i))^{-1} (X_i, Z_i)' y_i$. $R^2_F$ is obtained by setting $\hat{b}_i = 0$ in $R^2_T$. Thus, it can be seen as a marginal measure and can be used for fixed effects selection. $R^2_T$ can be seen as a conditional measure, as in Sections 1.3.1.3 and 1.3.1.4. H. Liu et al. (2008) noted that $R^2_{TF}$ is more a theoretical upper bound for $R^2$ than a practical $R^2$ measure for modeling with LMMs. We nevertheless consider it for completeness.

To adjust for the dimensions of the design matrices, adjusted values for R²_F and R²_TF, but not for R²_T, are proposed: R²_F,a = 1 − N(N − (p +q))⁻¹(1− R²_F) and R²_TF,a = 1−N(N −rank((X_i,Z_i)))⁻¹(1−R²_TF). R²_TF,a can thus be used for model selection.

These measures usually range between 0, indicating complete lack of fit, and 1, indicating perfect fit. Negative values are also possible and would indicate that the model of interest provides no improvement, in terms of explained variation, over the intercept-only model.
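The marginal and conditional statistics of H. Liu et al. (2008) differ only in which predictions enter the residual sum of squares, as the following Python sketch illustrates on stacked vectors. The function name and arguments are hypothetical, and the marginal and conditional predictions are assumed to come from an already fitted LMM. The adjustment shown is the one stated above for the fixed-effects statistic.

```python
import numpy as np

def liu_r2(y, fixed_fit, full_fit, p, q):
    """Sketch of R^2_F, its adjusted value, and R^2_T.

    y         : stacked responses (length N)
    fixed_fit : marginal predictions X_i beta_hat
    full_fit  : conditional predictions including random effects
    p, q      : numbers of fixed and random effects
    """
    y = np.asarray(y, dtype=float)
    n_obs = y.size
    # Residual sum of squares of the intercept-only model.
    total_ss = np.sum((y - y.mean()) ** 2)

    r2_f = 1.0 - np.sum((y - np.asarray(fixed_fit, dtype=float)) ** 2) / total_ss
    r2_t = 1.0 - np.sum((y - np.asarray(full_fit, dtype=float)) ** 2) / total_ss
    r2_f_adj = 1.0 - n_obs / (n_obs - (p + q)) * (1.0 - r2_f)
    return r2_f, r2_f_adj, r2_t
```

If the predictions fit worse than the overall mean of the responses, the corresponding statistic is negative, in line with the interpretation given above.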

1.3.1.8 Measures comparison

When the data $y_{ij}$ are normally distributed in the LMM, some of the measures discussed above are equivalent. This is the case for (a) the marginal $R^2_{VC}$, the marginal $D_{rand}$, the marginal $P_{rand}$, the marginal $R^2$ and $R^2_F$; (b) the conditional $R^2_{VC}$ and $R^2$ based on the null model with a fixed intercept, $D_{rand}$ and $R^2_T$; and (c) the conditional $R^2_{VC}$ and $R^2$ based on the null model with a fixed intercept and a random intercept. When we consider the adjusted values for the marginal $R^2_{VC}$, the conditional $R^2_{VC}$'s and $R^2_F$, these equalities no longer hold.

Some other measures, while they are not equivalent, are defined similarly and will thus give close values. This is the case for (a) the two measures of Snijders and Bosker (1994); (b) the marginal $r_c$, the marginal $R^2_{VC}$, the marginal $D_{rand}$, the marginal $P_{rand}$, the marginal $R^2$ and $R^2_F$; (c) the conditional $r_c$, the conditional $R^2_{VC}$, $R^2$ and $\rho^2$ based on the null model with a fixed intercept, $D_{rand}$, $P_{rand}$, $R^2_T$ and $R^2_{TF}$; and (d) the conditional $R^2_{VC}$, $R^2$ and $\rho^2$ based on the null model with a fixed intercept and a random intercept.

From the document: Measures of model adequacy and model selection in mixed-effects models (pages 26-31)