
Students’ performance on a math test

The data are the scores on a math test, assessed only once, of 519 students in 23 schools (yij, where i indexes schools and j indexes students). Kreft and Leeuw (1998) analyzed these data with different models; for illustration, we selected seven of them. The predictors at the student level are the time spent on math homework each week (homeworkij) and the parent’s highest education level (parentedij). The explanatory variables at the school level are the total school enrollment (schoolsizei) and the school type (publici), which distinguishes public (publici = 1) from private (publici = 0) schools. We do not center any of the predictors, as the transformation of explanatory variables is not the topic of this thesis and centering does not affect the models’ fit to the data (cf. Chapter 5 of Kreft and Leeuw (1998) for a discussion of centering).

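\[
y_{ij} \sim N\left(\alpha_i + \beta_i\,\mathrm{homework}_{ij},\ \sigma_y^2\right), \qquad
\begin{pmatrix} \alpha_i \\ \beta_i \end{pmatrix} \sim N\left( \begin{pmatrix} \gamma_0 + \gamma_1\,\mathrm{public}_i \\ \delta_0 + \delta_1\,\mathrm{public}_i \end{pmatrix}, \begin{pmatrix} \sigma_\alpha^2 & \rho\sigma_\alpha\sigma_\beta \\ \rho\sigma_\alpha\sigma_\beta & \sigma_\beta^2 \end{pmatrix} \right).
\]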
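To make the two-level structure of the model above concrete, the following minimal sketch draws data from it using only the Python standard library. The function name `simulate_model7`, its argument layout and every numerical value in the usage note are hypothetical choices made for illustration; they are not estimates from the math test data.

```python
import math
import random

def simulate_model7(schools, gamma0, gamma1, delta0, delta1,
                    sigma_alpha, sigma_beta, rho, sigma_y, seed=0):
    """Draw (school index, homework, y) triples from the two-level model.

    `schools` is a list of (public, homework_values) pairs, one per school.
    """
    rng = random.Random(seed)
    rows = []
    for i, (public, homework_values) in enumerate(schools):
        # Correlated school effects, built from two independent standard
        # normals (Cholesky factor of the 2x2 covariance matrix).
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        alpha = gamma0 + gamma1 * public + sigma_alpha * z1
        beta = (delta0 + delta1 * public
                + sigma_beta * (rho * z1 + math.sqrt(1 - rho ** 2) * z2))
        for hw in homework_values:
            # Student-level score around the school-specific regression line.
            y = rng.gauss(alpha + beta * hw, sigma_y)
            rows.append((i, hw, y))
    return rows
```

For instance, `simulate_model7([(1, [0, 2, 4]), (0, [1, 3])], 50.0, -4.0, 1.0, 0.5, 5.0, 1.0, -0.5, 7.0)` would generate five scores for one public and one private school under these made-up parameter values.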

The fixed-effect predictors of model 7 are homework, public and the cross-level interaction between homework and public, whereas its random part contains a random intercept and a random slope for homework. The alternative models considered are the following; models 1, 2, 3 and 6 are obtained by restricting model 7:

Model 1 contains only a fixed and a random intercept (γ1 = 0; βi = 0 ∀i).

Model 2 corresponds to model 7 without the predictor public, the cross-level interaction between homework and public and the random slope for homework (γ1 = 0; βi = β ∀i).

Model 3 corresponds to model 7 without the predictor public and the cross-level interaction (γ1 = δ1 = 0).

Model 4 is equivalent to model 3 with the addition of the predictor parented.

Model 5 is equivalent to model 3 with the addition of the predictor schoolsize.

Model 6 corresponds to model 7 without the cross-level interaction (δ1 = 0).
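The nesting structure of the seven models can be checked mechanically. The sketch below encodes each model by its sets of fixed-effect and random-effect terms; the term labels (`"1"` for an intercept, `"homework:public"` for the cross-level interaction) are a notation borrowed from common formula syntax and are an assumption of this example, not the thesis's own notation.

```python
# Fixed-effect terms and school-level random-effect terms of each model.
MODELS = {
    1: {"fixed": {"1"}, "random": {"1"}},
    2: {"fixed": {"1", "homework"}, "random": {"1"}},
    3: {"fixed": {"1", "homework"}, "random": {"1", "homework"}},
    4: {"fixed": {"1", "homework", "parented"}, "random": {"1", "homework"}},
    5: {"fixed": {"1", "homework", "schoolsize"}, "random": {"1", "homework"}},
    6: {"fixed": {"1", "homework", "public"}, "random": {"1", "homework"}},
    7: {"fixed": {"1", "homework", "public", "homework:public"},
        "random": {"1", "homework"}},
}

def is_nested(small, big):
    """True if model `small` is obtained from model `big` by dropping terms."""
    return (MODELS[small]["fixed"] <= MODELS[big]["fixed"]
            and MODELS[small]["random"] <= MODELS[big]["random"])

nested_in_7 = [m for m in MODELS if m != 7 and is_nested(m, 7)]
print(nested_in_7)  # [1, 2, 3, 6]
```

Models 4 and 5 are not restrictions of model 7 because they contain parented and schoolsize, respectively, which model 7 does not include.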

Models 1 to 4 allow for testing the hypotheses that math achievement can be explained by the time spent on math homework each week, and that the effect of this time differs among schools. Figure A.2 seems to support both hypotheses. In addition, models 1 to 4 allow for testing the hypothesis that math achievement can be enhanced by help offered by parents. Models 5 to 7 allow for identifying school characteristics that can explain the math score. More precisely, we test the hypotheses that the math score is influenced by the total school enrollment or by the school type, and that the influence of the time spent on homework on the math score depends on the school type. The last hypothesis seems relevant in view of Figure A.3, which shows different patterns for private and public schools. However, there is a lot of variability in the data, which could lead us to reject this hypothesis. The results are presented in Table A.2. For the measures requiring a null model, we additionally fit null model 0, which contains only a fixed intercept.

As in the previous illustration, the measures evaluating overall model adequacy (categories A and A&S), which are all extensions of the classical R² except λy, R²α, λα, R²β and λβ, give similar results. Larger values are observed for models 3 to 7, indicating that the proportion of variation in the data explained by the model is higher for models including the predictor homework and its random slope. Furthermore, the measures of Gelman and Pardoe (2006) allow for understanding the relative importance of predictors and errors at each “level” of the considered models. For the first “level,” the pooling factor λy is close to zero for all models, as expected. For the second “level,” R²α is equal to zero for models 1 to 4, as no predictor at the school level is included. For models 5 to 7, R²α measures the proportion of the variation among schools explained by the predictor at the school level (schoolsize for model 5, public for models 6 and 7); this proportion is close to zero and even negative, indicating that the school-level predictor explains essentially none of the variation across schools. The pooling factor λα is less than 0.5

[Figure A.2 plot: one panel per school (school IDs 6053, 6327, 6467, 7194, 7472, 7474, 7801, 7829, 7930, 24371, 24725, 25456, 25642, 26537, 46417, 47583, 54344, 62821, 68448, 68493, 72080, 72292, 72991); x-axis: homework (0 to 6); y-axis: y (30 to 70).]

Figure A.2: Scores on a math test (y) as a function of the time spent on math homework each week (homework), for each of the 23 schools.

[Figure A.3 plot: two panels, titled 0 (private) and 1 (public); x-axis: homework (0 to 6); y-axis: y (30 to 70).]

Figure A.3: Scores on a math test (y) as a function of the time spent on math homework each week (homework), for private (publici = 0, left side) and public (publici = 1, right side) schools.

Table A.2: Measures for models of math test scores data. For the measures computed with several null models, (0) indicates that the considered null model is model 0 and (1) indicates that the considered null model is model 1. The dashed line separates null models. m.=marginal; c.=conditional.

| Measure | Category | Model 0 | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 | Model 6 | Model 7 |
|---|---|---|---|---|---|---|---|---|---|
| R²y | A | | 0.291 | 0.378 | 0.534 | 0.556 | 0.534 | 0.534 | 0.533 |
| R²α | A | | 0 | 0 | 0 | 0 | -0.051 | 0.029 | -0.004 |
| R²β | A | | | | 0 | 0 | 0 | 0 | -0.046 |
| λy | A | | 0.038 | 0.039 | 0.078 | 0.079 | 0.078 | 0.077 | 0.080 |
| λα | A | | 0.146 | 0.156 | 0.127 | 0.155 | 0.137 | 0.141 | 0.173 |
| λβ | A | | | | 0.112 | 0.132 | 0.111 | 0.113 | 0.157 |
| Drand | A | | 0.317 | 0.402 | 0.571 | 0.590 | 0.571 | 0.570 | 0.570 |
| c | A | | 0.697 | 0.732 | 0.784 | 0.792 | 0.784 | 0.783 | 0.783 |
| R² (0) | A | | 0.317 | 0.402 | 0.571 | 0.590 | 0.571 | 0.570 | 0.570 |
| ρ² (0) | A | | 0.317 | 0.401 | 0.569 | 0.589 | 0.569 | 0.569 | 0.569 |
| R² (1) | A | | | 0.124 | 0.371 | 0.400 | 0.371 | 0.370 | 0.370 |
| ρ² (1) | A | | | 0.124 | 0.369 | 0.399 | 0.370 | 0.368 | 0.369 |
| R²T | A | | 0.317 | 0.402 | 0.571 | 0.590 | 0.571 | 0.570 | 0.570 |
| c. rc,a | A&S | | 0.460 | 0.558 | 0.713 | 0.730 | 0.713 | 0.712 | 0.712 |
| c. R²VC,a (0) | A&S | | 0.316 | 0.400 | 0.569 | 0.588 | 0.568 | 0.567 | 0.567 |
| c. R²VC,a (1) | A&S | | | 0.121 | 0.369 | 0.397 | 0.368 | 0.366 | 0.365 |
| Prand | A&S | | 0.292 | 0.380 | 0.536 | 0.559 | 0.537 | 0.537 | 0.538 |
| r² (0) | A&S | | 0.292 | 0.378 | 0.535 | 0.557 | 0.535 | 0.535 | 0.535 |
| r² (1) | A&S | | | 0.123 | 0.344 | 0.375 | 0.344 | 0.343 | 0.343 |
| R²TF,a | A&S | | 0.290 | 0.377 | 0.534 | 0.557 | 0.534 | 0.534 | 0.534 |
| R²₁ | F&S | | | 0.137 | -1.771 | -1.232 | -1.775 | -1.692 | -1.803 |
| R²₂ | F&S | | | 0.174 | -6.071 | -4.475 | -6.081 | -5.816 | -6.152 |
| m. rc,a | F&S | | -0.002 | 0.256 | 0.214 | 0.396 | 0.220 | 0.316 | 0.326 |
| m. R²VC,a | F&S | | -0.010 | 0.171 | 0.140 | 0.290 | 0.143 | 0.212 | 0.217 |
| m. Drand | F&S | | -0.008 | 0.174 | 0.143 | 0.294 | 0.148 | 0.216 | 0.223 |
| m. Prand | F&S | | -0.008 | 0.174 | 0.143 | 0.294 | 0.148 | 0.216 | 0.223 |
| m. c | F&S | | 0.5 | 0.629 | 0.629 | 0.695 | 0.636 | 0.658 | 0.655 |
| m. R² | F&S | | -0.008 | 0.174 | 0.143 | 0.294 | 0.148 | 0.216 | 0.223 |
| R²F,a | F&S | | -0.012 | 0.169 | 0.136 | 0.288 | 0.140 | 0.209 | 0.214 |
| -2LL ML | S | 3933.1064 | 3800.797 | 3730.494 | 3639.036 | 3602.350 | 3638.613 | 3634.841 | 3634.771 |
| -2LL REML | S | | 3798.679 | 3729.319 | 3635.562 | 3600.043 | 3634.231 | 3628.388 | 3625.150 |
| mAIC | S | 3937.1064 | 3806.776 | 3738.494 | 3651.036 | 3616.350 | 3652.613 | 3648.841 | 3650.771 |
| BIC | S | 3945.568 | 3819.532 | 3755.502 | 3676.548 | 3646.114 | 3682.376 | 3678.604 | 3684.786 |
| cAIC ML | S | | 3777.448 | 3710.464 | 3580.288 | 3555.187 | 3580.642 | 3580.678 | 3580.888 |
| cAIC REML | S | | 3777.448 | 3710.474 | 3580.191 | 3555.050 | 3580.548 | 3580.519 | 3580.593 |
| DIC | S | 3936.979 | 3777.452 | 3710.774 | 3580.010 | 3554.891 | 3563.811 | 3490.515 | 3485.402 |

for all models, meaning that there is more within-school information than population-level information. In other words, the estimates of γ0 and γ1 are closer to the 23 predictions α̂i obtained by least squares for the no-pooling model αi = γ0 + γ1schoolsizei + ei (or αi = γ0 + γ1publici + ei, respectively) than to the common estimates obtained for the complete-pooling model. For the third “level,” R²β is equal to zero for models 3 to 6, as no explanatory variable is included at that “level.” For model 7, R²β is negative, meaning that no variation in the homework effects across schools is explained by public. The pooling factor λβ is less than 0.5 for models 3 to 7, indicating again that there is more within-school information than population-level information.
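As a reminder of what the two level-wise quantities discussed above measure, the following block restates the explained-variance and pooling-factor definitions as we recall them from Gelman and Pardoe (2006). The symbols θ_k, ε_k and the index k are introduced here for illustration only and should be checked against the definitions given earlier in the thesis.

```latex
% theta_k: quantity modeled at a given level (y_ij, alpha_i or beta_i here);
% epsilon_k: its error term at that level; K: number of units at that level.
% V denotes the finite-sample variance over k = 1, ..., K and E the posterior mean.
R^2 = 1 - \frac{\mathrm{E}\left[\mathrm{V}_{k=1}^{K}\,\epsilon_k\right]}
               {\mathrm{E}\left[\mathrm{V}_{k=1}^{K}\,\theta_k\right]},
\qquad
\lambda = 1 - \frac{\mathrm{V}_{k=1}^{K}\,\mathrm{E}\left[\epsilon_k\right]}
                   {\mathrm{E}\left[\mathrm{V}_{k=1}^{K}\,\epsilon_k\right]}.
```

Under these definitions, a pooling factor below 0.5 corresponds to more within-group than population-level information, which is the reading used for λα and λβ above.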

The measures allowing for selecting the best fixed part (category F&S) underscore the relevance of homework and parented, or of homework and public. The difference between the model containing homework and public and the model that additionally contains the interaction is very small and, on grounds of parsimony, we should not include the interaction. The negative values of R²₁ and R²₂ for models 3 to 7 are due to the fact that the estimate of the variance of the random intercept for these models is much greater than the corresponding estimate for the null model, which is model 1. Kreft and Leeuw (1998) argued that this increase is the result of multicollinearity and advised centering the student-level predictors around the school means but, despite centering, R²₁ and R²₂ remain negative (results not shown). According to Snijders and Bosker (1994), negative values may indicate a misspecification of the fixed part of the model but, considering the other measures, this does not seem to be the case here. Indeed, when we exclude the random slope for homework, we obtain positive values for R²₁ and R²₂ (results not shown).

Among the measures of model selection (category S), the LRT built on -2LL ML is used to compare fixed-effects structures. Given that homework appears to be necessary, we compare model 3 with models 4 to 6 to test whether an additional predictor (parented, schoolsize or public, respectively) is needed. At a significance level of 5%, the additional explanatory variables parented and public seem useful, whereas schoolsize does not (models 3 and 4: LRT of 36.684 for 1 df, p < 0.001; models 3 and 5: LRT of 0.423 for 1 df, p = 0.515; models 3 and 6: LRT of 4.195 for 1 df, p = 0.040).
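The likelihood ratio tests quoted above can be recomputed from the -2LL ML row of Table A.2. The sketch below is a minimal standard-library version; the helper name `lrt_pvalue` is hypothetical, and the closed-form chi-square tail probabilities it uses are only valid for 1 or 2 degrees of freedom.

```python
import math

def lrt_pvalue(m2ll_small, m2ll_big, df):
    """Chi-square p-value for the drop in -2LL between two nested models."""
    stat = m2ll_small - m2ll_big
    if df == 1:
        p = math.erfc(math.sqrt(stat / 2))  # chi-square tail probability, 1 df
    elif df == 2:
        p = math.exp(-stat / 2)             # chi-square tail probability, 2 df
    else:
        raise ValueError("this sketch only handles df = 1 or df = 2")
    return stat, p

# -2LL ML values of models 3 to 6, copied from Table A.2.
M2LL_ML = {3: 3639.036, 4: 3602.350, 5: 3638.613, 6: 3634.841}

for big in (4, 5, 6):
    stat, p = lrt_pvalue(M2LL_ML[3], M2LL_ML[big], df=1)
    print(f"model 3 vs model {big}: LRT = {stat:.3f}, p = {p:.3f}")
```

Because the table values are rounded to three decimals, the recomputed statistics can differ from the quoted ones in the last digit.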

To test whether the interaction between homework and public is needed, we compare models 6 and 7, and the LRT indicates that model 6 is to be preferred (LRT of 0.07 for 1 df, p = 0.791). The random slope for homework seems necessary, as confirmed by the LRT (models 2 and 3: LRT of 91.458 for 2 df, p < 0.001). Note that we again use -2LL ML to perform the LRT, but using -2LL REML is also feasible and gives similar results. Our results are thus the same as those obtained by Kreft and Leeuw (1998). Among models 1 to 4, all the information criteria point to model 4 as the most appropriate, as it has the smallest value. It is more difficult to choose among models 5 to 7, which allow the identification of school characteristics influencing the math score, as all the information criteria except the DIC have close values. Indeed, when comparing model 5 with models 6 and 7, the difference between the DIC values clearly indicates that models 6 and 7 are the most appropriate. In comparison with model 3, models 5 to 7 do not seem to improve the fit of the model to the data, as the differences between information criterion values are small, except for the DIC.
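As an illustration of how one of the information criteria relates to the likelihood, the sketch below recomputes the mAIC of models 2 to 7 as -2LL ML plus twice the number of estimated parameters. The parameter counts in `N_PARAMS` are an assumption of this example: they were inferred from the model definitions (fixed effects plus variance components) and are not stated in the text.

```python
# -2LL ML values copied from Table A.2.
M2LL_ML = {2: 3730.494, 3: 3639.036, 4: 3602.350,
           5: 3638.613, 6: 3634.841, 7: 3634.771}

# ASSUMED parameter counts: fixed effects + variance components.
# Example: model 3 has 2 fixed effects, 2 random-effect variances,
# 1 correlation and 1 residual variance, hence 6 parameters.
N_PARAMS = {2: 4, 3: 6, 4: 7, 5: 7, 6: 7, 7: 8}

def maic(model):
    """Marginal AIC: -2 log-likelihood (ML) plus twice the parameter count."""
    return M2LL_ML[model] + 2 * N_PARAMS[model]

for m in sorted(M2LL_ML):
    print(f"model {m}: mAIC = {maic(m):.3f}")

best = min(M2LL_ML, key=maic)
print("smallest mAIC: model", best)
```

With these counts, the recomputed values match the mAIC row of Table A.2 for models 2 to 7, and model 4 has the smallest value.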

In summary, the measures evaluating model adequacy and the LRT show that the fit of the model to the data improves if a random slope for homework is included. The indices comparing the fixed part of the model and the LRT favor the inclusion of parented or of public in addition to homework. The information criteria also favor models with a random slope for homework. Moreover, they all favor models containing the explanatory variables homework and parented but, for models containing school characteristics, the DIC and the other information criteria lead to different conclusions.

Following the recommendations given in Section 1.6 (discussion of Chapter 1), we would consider the cAIC REML, the marginal Prand and Prand to select the most appropriate model, select the best set of fixed effects, and evaluate the fit of the model at hand, respectively. Considering the cAIC REML, we would thus conclude that model 4 is the best in terms of fit to the data and that models 3, 5, 6 and 7 are equivalently appropriate. Considering the marginal Prand, we conclude that homework and parented should be included, and that the school characteristic to include in addition to homework is public. Combining these results, models 4 and 6 seem the most appropriate for these data and, considering Prand, these models reduce the PQL by 56% and 54%, respectively, compared to the null model containing only a fixed intercept. In the end, according to model 4, the math score increases with the time spent on math homework, and this effect differs among schools. Moreover, students whose parents have more education tend to have higher math scores. Model 6 gives the additional information that students in public schools tend to have lower math scores than students in private schools.

A.2 ANOVAs and measures of Gelman and Pardoe