Conference Presentation
Reference
Bayesian Structural Equation Modeling of the WISC-IV with a Large Referred US Sample
GOLAY, Philippe, et al.
Abstract
Numerous studies have supported exploratory and confirmatory bifactor structures of the WISC-IV in US, French, and Irish samples. When investigating the structure of cognitive ability measures like the WISC-IV, subtest scores theoretically associated with one latent variable could also be related to other factors. A major drawback of classical confirmatory factor analysis (CFA) is that the majority of factor loadings must be fixed to zero to estimate the model parameters. This unnecessarily strict parameterization can lead to model rejection and cause researchers to perform many exploratory modifications to achieve acceptable model fit. Bayesian structural equation modeling (BSEM) overcomes this limitation by replacing fixed-to-zero loadings with "approximate" zeros, which translate into small, but not necessarily zero, cross-loadings. Because all relationships between factors and subtest scores are estimated, both the number of models to be tested and the risk of capitalizing on the chance characteristics of the data are decreased. The objective of this study was to determine whether secondary interpretation of the 10 [...]
GOLAY, Philippe, et al. Bayesian Structural Equation Modeling of the WISC-IV with a Large Referred US Sample. In: 9th Conference of the International Test Commission, San Sebastian (Spain), 2-5 July, 2014
Available at:
http://archive-ouverte.unige.ch/unige:38747
Disclaimer: layout of this document may differ from the published version.
Philippe Golay1,2, Thierry Lecerf1, Marley W. Watkins3
& Gary L. Canivez4
1University of Geneva, 2University of Lausanne, 3Baylor University, 4Eastern Illinois University
THE 9TH CONFERENCE OF THE INTERNATIONAL TEST COMMISSION SAN SEBASTIAN, SPAIN 2-5 JULY, 2014
• The Wechsler Intelligence Scale for Children remains the most widely used test in the field of intelligence assessment.
• General intelligence (g) has traditionally been conceptualized as a superordinate factor (higher-order model).
• But most recent research has shown better support for g as a breadth factor (bifactor model): exploratory and confirmatory bifactor structures have been supported in US, French, Swiss, and Irish samples.
[Figure: path diagrams of the higher-order model and the bifactor model for the 10 WISC-IV core subtests. In both models, the subtests load on four first-order factors: Verbal Comprehension (VCI: Similarities, Vocabulary, Comprehension), Perceptual Reasoning (PRI: Block Design, Matrix Reasoning, Picture Concepts), Working Memory (WMI: Digit Span, Letter-Number Sequencing), and Processing Speed (PSI: Coding, Symbol Search). In the higher-order model, g is a superordinate factor influencing the subtests only indirectly through the four indices; in the bifactor model, g loads directly on every subtest alongside the index factors.]
Goal 1: Compare higher-order (indirect hierarchical) versus bifactor (direct hierarchical) models of the 10 WISC-IV core subtests in a large referred US sample.
Many controversies remain about the nature of the constructs measured by each subtest score: subtest scores theoretically associated with one latent variable could also be related to other factors. In particular, disagreements persist about which constructs contribute, at a secondary level, to each subtest score.
• Contribution of fluid reasoning to the Similarities verbal subtest score.
• Contribution of general verbal information and crystallized intelligence to performance on the Picture Concepts subtest score.
• Contribution of visual abilities to the Symbol Search processing speed subtest score.
• …
Goal 2: Determine more precisely which constructs are adequately measured by the WISC-IV core subtests, and whether secondary interpretation of some subtest scores is supported by the data.
EFA is not very restrictive because the relationships between all items and all factors are estimated.
Two decisions remain for selecting a proper solution:
Number of factors
▪ on the basis of theoretical and statistical considerations.
Rotation method
▪ Orthogonal rotations vs oblique rotations.
▪ Hypothesized complexity of the factorial structure.
Most rotation methods are designed to seek a simple structure with low factorial complexity.
When several subtest scores are expected to load on more than one factor, these rotations are inefficient and cannot recover the correct structure.
Expected factor complexity is not always easy to determine a priori.
[Illustration of two factor loading matrices Λ for six indicators and three factors. Simple-structure pattern (CFA): each indicator loads on a single factor (*) and every other loading is fixed to 0. Complex pattern: each indicator is allowed to load on more than one factor, a structure that simple-structure rotations cannot recover.]
Contrary to EFA, CFA estimates only some of the model parameters, on the basis of theoretical knowledge.
With CFA, the majority of factor loadings must be fixed to zero to estimate the model parameters.
Although needed for model identification, these restrictions do not always faithfully reflect the researchers' hypotheses.
Small but not necessarily zero loadings could be equally or even more compatible with theory.
This unnecessarily strict parameterization can contribute to poor model fit, distorted factors, and biased factor correlations (Marsh et al., 2010).
It may also cause researchers to perform many exploratory modifications to achieve acceptable model fit (risk of overfitting and loss of meaning for indices of statistical significance).
• Build on the strengths of both methods and avoid their weaknesses: a Bayesian approach to model estimation.
• BSEM can be seen as an intermediate approach between CFA and EFA:
• Like CFA, it allows the expected loadings to be specified.
• Like EFA, it also makes it possible to maintain a certain level of uncertainty and estimate all loadings.
[Figure: two latent variables, each measured by three subtests, with secondary (cross-) loadings from each latent variable to the other's subtests. Under classical CFA, most secondary loadings are fixed to exactly zero. Under diffuse non-informative priors (zero mean and infinite variance), all cross-loadings are freely estimated, as in EFA. Under informative priors (zero mean and small variance), cross-loadings are estimated but constrained to remain close to zero.]
Bayesian estimation combines prior distributions for all parameters with the experimental data and forms posterior distributions via Bayes' theorem:

posterior ∝ likelihood × prior

The prior variance was 0.01, which results in a 95% credibility interval of ±0.20 (small cross-loadings).
MCMC estimation was performed with Mplus 7.0.
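As a quick check of the prior specification above (a sketch, not the authors' code): a zero-mean normal prior with variance 0.01 has standard deviation 0.1, so its 95% credibility interval is approximately ±1.96 × 0.1 ≈ ±0.20.

```python
import math

# Prior for cross-loadings: Normal(0, 0.01), i.e. standard deviation 0.1.
prior_variance = 0.01
prior_sd = math.sqrt(prior_variance)

# 97.5th percentile of the standard normal distribution.
z_975 = 1.96

# Half-width of the 95% prior credibility interval: about 0.196, i.e. ±0.20.
half_width = z_975 * prior_sd
print(f"95% prior credibility interval: +/-{half_width:.2f}")
```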
• BSEM overcomes CFA's limitations by replacing fixed-to-zero loadings with "approximate" zeros, which translate into small, but not necessarily zero, cross-loadings.
• Approximate zeros often reflect theoretical assumptions more accurately and facilitate unbiased estimation of the model parameters.
• BSEM allows the estimation of many parameters without depending on the selection of a rotation method, as is needed when performing an EFA.
• Because all relationships between factors and subtest scores are estimated, this approach eliminates the need for comparisons of many competing models.
• It also makes it possible to determine the precise nature of the constructs measured by the core subtest scores of the WISC-IV.
• WISC-IV data were obtained from 1130 US children who were referred for evaluation of learning difficulties.
• As appears common in clinical assessments, only the 10 core subtests were administered.
Sample   N      % Male      Age: Mean (SD)   Age: Min/Max   IQ: Mean (SD)   IQ: Min/Max
US       1130   62% (696)   10.24 (2.51)     6-0/16-11      89.94 (17.16)   40/147
Model                                                 Free params   PPP     Observed − replicated χ² diff., 95% CI   DIC       Est. no. of params
1. WISC-IV higher-order model                         34            0.000   [34.732, 90.485]                         25983.2   33.785
2. WISC-IV higher-order model with cross-loadings
   (prior variance = 0.01)                            64            0.151   [-15.168, 45.585]                        25941.9   40.471
3. WISC-IV bifactor model                             40            0.000   [21.281, 76.965]                         25963.1   27.042
4. WISC-IV bifactor model with cross-loadings
   (prior variance = 0.01)                            70            0.388   [-27.095, 35.638]                        25913.3   23.009

The WISC-IV bifactor model with small cross-loadings showed the best overall fit to the data.
Note. A higher posterior predictive p-value (PPP) and a lower DIC indicate better fit to the data.
Loading estimates (median [95% CI]), higher-order model with cross-loadings:

Subtest                    VCI                      PRI                      WMI                      PSI
Similarities               0.798* [0.679, 0.943]    0.134* [0.015, 0.249]    -0.016 [-0.123, 0.088]   -0.065 [-0.154, 0.013]
Vocabulary                 0.932* [0.797, 1.085]    -0.007 [-0.137, 0.114]   0.033 [-0.078, 0.143]    -0.035 [-0.124, 0.046]
Comprehension              0.821* [0.695, 0.963]    -0.056 [-0.180, 0.061]   0.010 [-0.099, 0.116]    0.021 [-0.060, 0.107]
Block Design               -0.040 [-0.159, 0.074]   0.870* [0.730, 1.041]    -0.096 [-0.266, 0.011]   0.052 [-0.039, 0.140]
Picture Concepts           0.103* [0.001, 0.207]    0.535* [0.404, 0.670]    0.081 [-0.024, 0.203]    0.009 [-0.070, 0.091]
Matrix Reasoning           0.010 [-0.105, 0.121]    0.772* [0.620, 0.933]    0.042 [-0.074, 0.160]    0.007 [-0.079, 0.096]
Digit Span                 0.005 [-0.134, 0.141]    -0.026 [-0.173, 0.113]   0.713* [0.467, 0.931]    0.023 [-0.072, 0.112]
Letter-Number Sequencing   0.032 [-0.110, 0.163]    -0.016 [-0.168, 0.123]   0.779* [0.559, 1.016]    0.035 [-0.062, 0.134]
Coding                     -0.050 [-0.173, 0.065]   -0.051 [-0.194, 0.082]   0.039 [-0.099, 0.171]    0.708* [0.481, 0.944]
Symbol Search              -0.017 [-0.142, 0.104]   0.110 [-0.038, 0.259]    -0.005 [-0.130, 0.125]   0.780* [0.533, 1.021]
Loading estimates (median [95% CI]), bifactor model with cross-loadings:

Subtest                    g                        VCI                      PRI                      WMI                      PSI
Similarities               0.756* [0.680, 0.823]    0.380* [0.202, 0.504]    0.052 [-0.074, 0.184]    -0.008 [-0.134, 0.132]   -0.051 [-0.156, 0.054]
Vocabulary                 0.796* [0.713, 0.887]    0.488* [0.267, 0.606]    0.007 [-0.142, 0.145]    0.043 [-0.127, 0.187]    -0.014 [-0.138, 0.102]
Comprehension              0.691* [0.603, 0.779]    0.394* [0.160, 0.521]    -0.049 [-0.186, 0.153]   0.007 [-0.141, 0.157]    0.020 [-0.095, 0.133]
Block Design               0.699* [0.592, 0.800]    -0.025 [-0.176, 0.130]   0.408 [-0.198, 0.608]    -0.050 [-0.209, 0.153]   0.034 [-0.109, 0.162]
Picture Concepts           0.691* [0.601, 0.767]    0.012 [-0.150, 0.163]    0.148 [-0.159, 0.394]    0.001 [-0.165, 0.188]    -0.024 [-0.146, 0.087]
Matrix Reasoning           0.752* [0.671, 0.822]    0.005 [-0.127, 0.125]    0.285 [-0.097, 0.479]    0.029 [-0.116, 0.162]    0.002 [-0.119, 0.112]
Digit Span                 0.637* [0.566, 0.710]    0.022 [-0.116, 0.149]    -0.010 [-0.151, 0.134]   0.256 [-0.178, 0.502]    0.026 [-0.084, 0.131]
Letter-Number Sequencing   0.738* [0.663, 0.816]    0.039 [-0.113, 0.175]    -0.006 [-0.156, 0.141]   0.311 [-0.200, 0.600]    0.044 [-0.088, 0.162]
Coding                     0.691* [0.603, 0.779]    -0.030 [-0.171, 0.108]   -0.027 [-0.172, 0.126]   0.036 [-0.127, 0.179]    0.554* [0.341, 0.811]
Symbol Search              0.621* [0.539, 0.699]    -0.002 [-0.135, 0.124]   0.071 [-0.085, 0.204]    0.019 [-0.116, 0.151]    0.472* [0.302, 0.714]
Results of the higher-order models (Models 1 and 2) highlighted two theoretically meaningful cross-loadings:
The loading from VCI to Picture Concepts was considered substantial.
The cross-loading from PRI to Similarities was also substantive.
No other hypothesized cross-loadings were supported.
In contrast, results of the bifactor models (Models 3 and 4) revealed no cross-loadings.
The breadth conception of the g factor left less unmodeled complexity than the higher-order structure.
Loadings of the subtest scores on the g factor were systematically higher than their respective loadings on the four index scores.
Index scores represented rather small deviations from unidimensionality and did not necessarily provide information that was additional to and separate from the Full Scale IQ score (FSIQ).
Results from a sample of 1130 referred US children showed that the bifactor model fit the data better than the higher-order solution. Models including small cross-loadings were more adequate.
BSEM allowed us to estimate models that were closer to theoretical assumptions. BSEM also permitted testing more complex models that could not be estimated through maximum likelihood.
BSEM suggested a simple and parsimonious interpretation of the subtest scores.
Thank you very much for your attention
Contact: philippe.golay@unil.ch
BSEM was conducted using Mplus 7.0 with a Markov chain Monte Carlo (MCMC) estimation algorithm with a Gibbs sampler.
Three chains with 50,000 iterations, different starting values, and different random seeds were estimated.
The convergence of the chains was verified using the Potential Scale Reduction factor (PSR; Gelman & Rubin, 1992).
A Kolmogorov-Smirnov test of equality of the posterior parameter distributions across the three chains was also performed for all models.
The first half of each chain was discarded (burn-in phase) and the remaining iterations were used to form the posterior distributions.
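The PSR convergence criterion above can be sketched as follows. This is a generic implementation of the Gelman-Rubin diagnostic, not the authors' code, and the simulated chains are purely illustrative: PSR compares between-chain and within-chain variance, and values close to 1 indicate that the chains have mixed.

```python
import random
import statistics


def psr(chains):
    """Potential Scale Reduction factor (Gelman & Rubin, 1992).

    chains: list of equal-length lists of post-burn-in parameter draws.
    """
    m = len(chains)      # number of chains
    n = len(chains[0])   # draws per chain
    chain_means = [statistics.fmean(c) for c in chains]
    grand_mean = statistics.fmean(chain_means)
    # Between-chain variance B and mean within-chain variance W.
    b = n / (m - 1) * sum((mu - grand_mean) ** 2 for mu in chain_means)
    w = statistics.fmean(statistics.variance(c) for c in chains)
    # Pooled estimate of the marginal posterior variance.
    var_hat = (n - 1) / n * w + b / n
    return (var_hat / w) ** 0.5


random.seed(0)
# Three well-mixed chains drawn from the same distribution (illustrative).
chains = [[random.gauss(0.5, 0.1) for _ in range(2000)] for _ in range(3)]
print(round(psr(chains), 3))  # close to 1.0
```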
Second-order loadings on g (median [95% CI]):

Verbal Comprehension Index   0.874* [0.793, 0.932]
Perceptual Reasoning Index   0.893* [0.809, 0.943]
Working Memory Index         0.921* [0.816, 0.977]
Processing Speed Index       0.709* [0.509, 0.827]
Loadings estimates g VCI PRI WMI PSI
Similarities 0.756 0.380 0.052 -0.008 -0.051
Vocabulary 0.796 0.488 0.007 0.043 -0.014
Comprehension 0.691 0.394 -0.049 0.007 0.020
Block Design 0.699 -0.025 0.408 -0.050 0.034
Picture Concepts 0.691 0.012 0.148 0.001 -0.024
Matrix Reasoning 0.752 0.005 0.285 0.029 0.002
Digit Span 0.637 0.022 -0.010 0.256 0.026
Letter-Number Sequencing 0.738 0.039 -0.006 0.311 0.044
Coding 0.691 -0.030 -0.027 0.036 0.554
Symbol Search 0.621 -0.002 0.071 0.019 0.472
Omega-Hierarchical 0.875 0.215 0.109 0.104 0.311
Omega-hierarchical coefficients for group (index) factors were likely too low for interpretation.
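The omega-hierarchical row in the table can be reproduced from the reported bifactor loadings. The sketch below uses the standard model-based bifactor reliability formulas (ωH for the general factor, ωHS for each group factor), which may differ in detail from the authors' exact computation; the near-zero cross-loadings are ignored, and subtests are treated as standardized.

```python
# g and salient group-factor loadings from the bifactor table above.
g = [0.756, 0.796, 0.691, 0.699, 0.691, 0.752, 0.637, 0.738, 0.691, 0.621]
groups = {
    "VCI": ([0.380, 0.488, 0.394], [0, 1, 2]),  # Sim, Voc, Comp
    "PRI": ([0.408, 0.148, 0.285], [3, 4, 5]),  # BD, PCn, MR
    "WMI": ([0.256, 0.311], [6, 7]),            # DS, LNS
    "PSI": ([0.554, 0.472], [8, 9]),            # Cd, SS
}

# Unique variances under standardized subtests: 1 - g^2 - group loading^2.
theta = [1 - gi ** 2 for gi in g]
for lams, idx in groups.values():
    for lam, i in zip(lams, idx):
        theta[i] -= lam ** 2

# Omega-hierarchical for g: general-factor variance over total variance.
total = (sum(g) ** 2
         + sum(sum(lams) ** 2 for lams, _ in groups.values())
         + sum(theta))
omega_h = sum(g) ** 2 / total

# Omega-hierarchical-subscale for each index factor.
omega_hs = {}
for name, (lams, idx) in groups.items():
    sub_total = (sum(g[i] for i in idx) ** 2
                 + sum(lams) ** 2
                 + sum(theta[i] for i in idx))
    omega_hs[name] = sum(lams) ** 2 / sub_total

print(round(omega_h, 3))                         # 0.875
print({k: round(v, 3) for k, v in omega_hs.items()})
# matches the table: VCI 0.215, PRI 0.109, WMI 0.104, PSI 0.311
```

The recovered values match the table, illustrating why only g (ωH = 0.875) supports confident interpretation while the index factors contribute little reliable variance beyond g.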