Conference Presentation
Reference
Bayesian Structural Equation Modeling of the WISC-IV with a Large Referred US Sample
GOLAY, Philippe, et al.
Abstract
Numerous studies have supported exploratory and confirmatory bifactor structures of the WISC-IV in US, French, and Irish samples. When investigating the structure of cognitive ability measures like the WISC-IV, subtest scores theoretically associated with one latent variable could also be related to other factors. A major drawback of classical confirmatory factor analysis (CFA) is that the majority of factor loadings must be fixed to zero to estimate the model parameters. This unnecessarily strict parameterization can lead to model rejection and cause researchers to perform many exploratory modifications to achieve acceptable model fit. Bayesian structural equation modeling (BSEM) overcomes this limitation by replacing fixed-to-zero loadings with "approximate" zeros, which translate into small, but not necessarily zero, cross-loadings. Because all relationships between factors and subtest scores are estimated, both the number of models to be tested and the risk of capitalizing on the chance characteristics of the data are decreased. The objective of this study was to determine whether secondary interpretation of the 10 [...]
GOLAY, Philippe, et al. Bayesian Structural Equation Modeling of the WISC-IV with a Large Referred US Sample. In: 9th Conference of the International Test Commission, San Sebastian (Spain), 2-5 July, 2014
Available at:
http://archive-ouverte.unige.ch/unige:38747
Disclaimer: layout of this document may differ from the published version.
Philippe Golay1,2, Thierry Lecerf1, Marley W. Watkins3
& Gary L. Canivez4
1University of Geneva, 2University of Lausanne, 3Baylor University, 4Eastern Illinois University
THE 9TH CONFERENCE OF THE INTERNATIONAL TEST COMMISSION SAN SEBASTIAN, SPAIN 2-5 JULY, 2014
• The Wechsler Intelligence Scale for Children remains the most widely used test in the field of intelligence assessment.
• General intelligence (g) has traditionally been conceptualized as a superordinate factor (higher-order model).
• But most recent research has shown better support for g as a breadth factor (bifactor model): exploratory and confirmatory bifactor structures have been supported in US, French, Swiss, and Irish samples.
[Figure: path diagrams of the higher-order model and the bifactor model for the 10 WISC-IV core subtests. In both models, the subtests load on four first-order factors: Verbal Comprehension (VCI: Similarities, Vocabulary, Comprehension), Perceptual Reasoning (PRI: Block Design, Matrix Reasoning, Picture Concepts), Working Memory (WMI: Digit Span, Letter-Number Sequencing), and Processing Speed (PSI: Coding, Symbol Search). In the higher-order model, g is a superordinate factor influencing the subtests only indirectly through the four indices; in the bifactor model, g loads directly on every subtest alongside the index factors.]
Goal 1: Compare higher-order (indirect hierarchical) versus bifactor (direct hierarchical) models of the 10 WISC-IV core subtests in a large referred US sample.
Many controversies remain about the nature of the constructs measured by each subtest score: subtest scores theoretically associated with one latent variable could also be related to other factors. In particular, disagreements persist about which constructs contribute, at a secondary level, to each subtest score.
• Contribution of fluid reasoning to the Similarities verbal subtest score.
• Contribution of general verbal information and crystallized intelligence to performance on the Picture Concepts subtest score.
• Contribution of visual abilities to the Symbol Search processing speed subtest score.
• …
Goal 2: Determine more precisely which constructs are adequately measured by the WISC-IV core subtests, and whether secondary interpretation of some subtest scores is supported by the data.
EFA is not very restrictive because the relationships between all items and all factors are estimated.
Two decisions remain for selecting a proper solution:
Number of factors
▪ on the basis of theoretical and statistical considerations.
Rotation method
▪ Orthogonal rotations vs oblique rotations.
▪ Hypothesized complexity of the factorial structure.
Most rotation methods are designed to seek a simple structure with low factorial complexity.
When several subtest scores are expected to load on more than one factor, these rotations are inefficient and cannot recover the correct structure.
Expected factor complexity is not always easy to determine a priori.
[Illustration of two factor loading matrices Λ for six indicators and three factors. Simple-structure pattern (CFA): each indicator loads on a single factor (*) and every other loading is fixed to 0. Complex pattern: each indicator is allowed to load on more than one factor, a structure that simple-structure rotations cannot recover.]
Contrary to EFA, CFA estimates only some of the model parameters, on the basis of theoretical knowledge.
With CFA, the majority of factor loadings must be fixed to zero to estimate the model parameters.
Although needed for model identification, these restrictions do not always faithfully reflect the researchers' hypotheses.
Small but not necessarily zero loadings could be equally or even more compatible with theory.
This unnecessarily strict parameterization can contribute to poor model fit, distorted factors, and biased factor correlations (Marsh et al., 2010).
It may also cause researchers to perform many exploratory modifications to achieve acceptable model fit (risk of overfitting and loss of meaning for indices of statistical significance).
• Build on the strengths of both methods and avoid their weaknesses: a Bayesian approach to model estimation.
• BSEM can be seen as an intermediate approach between CFA and EFA:
• Like CFA, it allows the expected loadings to be specified.
• Like EFA, it also makes it possible to maintain a certain level of uncertainty and estimate all loadings.
[Figure: two latent variables, each measured by three subtests, with secondary (cross-) loadings from each latent variable to the other's subtests. Under classical CFA, most secondary loadings are fixed to exactly zero. Under diffuse non-informative priors (zero mean and infinite variance), all cross-loadings are freely estimated, as in EFA. Under informative priors (zero mean and small variance), cross-loadings are estimated but constrained to remain close to zero.]
Bayesian estimation combines prior distributions for all parameters with the experimental data and forms posterior distributions via Bayes' theorem:

posterior ∝ likelihood × prior

The prior variance was 0.01, which results in a 95% credibility interval of ±0.20 (small cross-loadings).
MCMC estimation was performed with Mplus 7.0.
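As a quick check of the prior specification above (a sketch, not the authors' code): a zero-mean normal prior with variance 0.01 has standard deviation 0.1, so its 95% credibility interval is approximately ±1.96 × 0.1 ≈ ±0.20.

```python
import math

# Prior for cross-loadings: Normal(0, 0.01), i.e. standard deviation 0.1.
prior_variance = 0.01
prior_sd = math.sqrt(prior_variance)

# 97.5th percentile of the standard normal distribution.
z_975 = 1.96

# Half-width of the 95% prior credibility interval: about 0.196, i.e. ±0.20.
half_width = z_975 * prior_sd
print(f"95% prior credibility interval: +/-{half_width:.2f}")
```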
• BSEM overcomes CFA's limitations by replacing fixed-to-zero loadings with "approximate" zeros, which translate into small, but not necessarily zero, cross-loadings.
• Approximate zeros often reflect theoretical assumptions more accurately and facilitate unbiased estimation of the model parameters.
• BSEM allows the estimation of many parameters without depending on the selection of a rotation method, as is needed when performing an EFA.
• Because all relationships between factors and subtest scores are estimated, this approach eliminates the need for comparisons of many competing models.
• It also makes it possible to determine the precise nature of the constructs measured by the core subtest scores of the WISC-IV.
• WISC-IV data were obtained from 1130 US children who were referred for evaluation of learning difficulties.
• As appears common in clinical assessments, only the 10 core subtests were administered.
Sample   N      % Male      Age: Mean (SD)   Age: Min/Max   IQ: Mean (SD)   IQ: Min/Max
US       1130   62% (696)   10.24 (2.51)     6-0/16-11      89.94 (17.16)   40/147
Model                                                 Free params   PPP     Observed − replicated χ² diff., 95% CI   DIC       Est. no. of params
1. WISC-IV higher-order model                         34            0.000   [34.732, 90.485]                         25983.2   33.785
2. WISC-IV higher-order model with cross-loadings
   (prior variance = 0.01)                            64            0.151   [-15.168, 45.585]                        25941.9   40.471
3. WISC-IV bifactor model                             40            0.000   [21.281, 76.965]                         25963.1   27.042
4. WISC-IV bifactor model with cross-loadings
   (prior variance = 0.01)                            70            0.388   [-27.095, 35.638]                        25913.3   23.009

The WISC-IV bifactor model with small cross-loadings showed the best overall fit to the data.
Note. A higher posterior predictive p-value (PPP) and a lower DIC indicate better fit to the data.
Loading estimates (median [95% CI]), higher-order model with cross-loadings:

Subtest                    VCI                      PRI                      WMI                      PSI
Similarities               0.798* [0.679, 0.943]    0.134* [0.015, 0.249]    -0.016 [-0.123, 0.088]   -0.065 [-0.154, 0.013]
Vocabulary                 0.932* [0.797, 1.085]    -0.007 [-0.137, 0.114]   0.033 [-0.078, 0.143]    -0.035 [-0.124, 0.046]
Comprehension              0.821* [0.695, 0.963]    -0.056 [-0.180, 0.061]   0.010 [-0.099, 0.116]    0.021 [-0.060, 0.107]
Block Design               -0.040 [-0.159, 0.074]   0.870* [0.730, 1.041]    -0.096 [-0.266, 0.011]   0.052 [-0.039, 0.140]
Picture Concepts           0.103* [0.001, 0.207]    0.535* [0.404, 0.670]    0.081 [-0.024, 0.203]    0.009 [-0.070, 0.091]
Matrix Reasoning           0.010 [-0.105, 0.121]    0.772* [0.620, 0.933]    0.042 [-0.074, 0.160]    0.007 [-0.079, 0.096]
Digit Span                 0.005 [-0.134, 0.141]    -0.026 [-0.173, 0.113]   0.713* [0.467, 0.931]    0.023 [-0.072, 0.112]
Letter-Number Sequencing   0.032 [-0.110, 0.163]    -0.016 [-0.168, 0.123]   0.779* [0.559, 1.016]    0.035 [-0.062, 0.134]
Coding                     -0.050 [-0.173, 0.065]   -0.051 [-0.194, 0.082]   0.039 [-0.099, 0.171]    0.708* [0.481, 0.944]
Symbol Search              -0.017 [-0.142, 0.104]   0.110 [-0.038, 0.259]    -0.005 [-0.130, 0.125]   0.780* [0.533, 1.021]
Loading estimates (median [95% CI]), bifactor model with cross-loadings:

Subtest                    g                        VCI                      PRI                      WMI                      PSI
Similarities               0.756* [0.680, 0.823]    0.380* [0.202, 0.504]    0.052 [-0.074, 0.184]    -0.008 [-0.134, 0.132]   -0.051 [-0.156, 0.054]
Vocabulary                 0.796* [0.713, 0.887]    0.488* [0.267, 0.606]    0.007 [-0.142, 0.145]    0.043 [-0.127, 0.187]    -0.014 [-0.138, 0.102]
Comprehension              0.691* [0.603, 0.779]    0.394* [0.160, 0.521]    -0.049 [-0.186, 0.153]   0.007 [-0.141, 0.157]    0.020 [-0.095, 0.133]
Block Design               0.699* [0.592, 0.800]    -0.025 [-0.176, 0.130]   0.408 [-0.198, 0.608]    -0.050 [-0.209, 0.153]   0.034 [-0.109, 0.162]
Picture Concepts           0.691* [0.601, 0.767]    0.012 [-0.150, 0.163]    0.148 [-0.159, 0.394]    0.001 [-0.165, 0.188]    -0.024 [-0.146, 0.087]
Matrix Reasoning           0.752* [0.671, 0.822]    0.005 [-0.127, 0.125]    0.285 [-0.097, 0.479]    0.029 [-0.116, 0.162]    0.002 [-0.119, 0.112]
Digit Span                 0.637* [0.566, 0.710]    0.022 [-0.116, 0.149]    -0.010 [-0.151, 0.134]   0.256 [-0.178, 0.502]    0.026 [-0.084, 0.131]
Letter-Number Sequencing   0.738* [0.663, 0.816]    0.039 [-0.113, 0.175]    -0.006 [-0.156, 0.141]   0.311 [-0.200, 0.600]    0.044 [-0.088, 0.162]
Coding                     0.691* [0.603, 0.779]    -0.030 [-0.171, 0.108]   -0.027 [-0.172, 0.126]   0.036 [-0.127, 0.179]    0.554* [0.341, 0.811]
Symbol Search              0.621* [0.539, 0.699]    -0.002 [-0.135, 0.124]   0.071 [-0.085, 0.204]    0.019 [-0.116, 0.151]    0.472* [0.302, 0.714]
Results of the higher-order models (Models 1 and 2) highlighted two theoretically meaningful cross-loadings:
The loading from VCI to Picture Concepts was considered substantial.
The cross-loading from PRI to Similarities was also substantive.
No other hypothesized cross-loadings were supported.
In contrast, results of the bifactor models (Models 3 and 4) revealed no cross-loadings.
The breadth conception of the g factor left less unmodeled complexity than the higher-order structure.
Loadings of the subtest scores on the g factor were systematically higher than their respective loadings on the four index scores.
Index scores represented rather small deviations from unidimensionality and did not necessarily provide information that was additional to and separate from the Full Scale IQ score (FSIQ).
Results from a sample of 1130 referred US children showed that the bifactor model fit the data better than the higher-order solution. Models including small cross-loadings were more adequate.
BSEM allowed us to estimate models that were closer to theoretical assumptions. BSEM also permitted testing more complex models that could not be estimated through maximum likelihood.
BSEM suggested a simple and parsimonious interpretation of the subtest scores.
Thank you very much for your attention
Contact: philippe.golay@unil.ch
BSEM was conducted using Mplus 7.0 with a Markov chain Monte Carlo (MCMC) estimation algorithm with a Gibbs sampler.
Three chains with 50,000 iterations, different starting values, and different random seeds were estimated.
The convergence of the chains was verified using the Potential Scale Reduction factor (PSR; Gelman & Rubin, 1992).
A Kolmogorov-Smirnov test of equality of the posterior parameter distributions across the three chains was also performed for all models.
The first half of each chain was discarded (burn-in phase) and the remaining iterations were used to form the posterior distributions.
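The PSR convergence criterion above can be sketched as follows. This is a generic implementation of the Gelman-Rubin diagnostic, not the authors' code, and the simulated chains are purely illustrative: PSR compares between-chain and within-chain variance, and values close to 1 indicate that the chains have mixed.

```python
import random
import statistics


def psr(chains):
    """Potential Scale Reduction factor (Gelman & Rubin, 1992).

    chains: list of equal-length lists of post-burn-in parameter draws.
    """
    m = len(chains)      # number of chains
    n = len(chains[0])   # draws per chain
    chain_means = [statistics.fmean(c) for c in chains]
    grand_mean = statistics.fmean(chain_means)
    # Between-chain variance B and mean within-chain variance W.
    b = n / (m - 1) * sum((mu - grand_mean) ** 2 for mu in chain_means)
    w = statistics.fmean(statistics.variance(c) for c in chains)
    # Pooled estimate of the marginal posterior variance.
    var_hat = (n - 1) / n * w + b / n
    return (var_hat / w) ** 0.5


random.seed(0)
# Three well-mixed chains drawn from the same distribution (illustrative).
chains = [[random.gauss(0.5, 0.1) for _ in range(2000)] for _ in range(3)]
print(round(psr(chains), 3))  # close to 1.0
```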
Second-order loadings on g (median [95% CI]):

Verbal Comprehension Index   0.874* [0.793, 0.932]
Perceptual Reasoning Index   0.893* [0.809, 0.943]
Working Memory Index         0.921* [0.816, 0.977]
Processing Speed Index       0.709* [0.509, 0.827]
Loadings estimates g VCI PRI WMI PSI
Similarities 0.756 0.380 0.052 -0.008 -0.051
Vocabulary 0.796 0.488 0.007 0.043 -0.014
Comprehension 0.691 0.394 -0.049 0.007 0.020
Block Design 0.699 -0.025 0.408 -0.050 0.034
Picture Concepts 0.691 0.012 0.148 0.001 -0.024
Matrix Reasoning 0.752 0.005 0.285 0.029 0.002
Digit Span 0.637 0.022 -0.010 0.256 0.026
Letter-Number Sequencing 0.738 0.039 -0.006 0.311 0.044
Coding 0.691 -0.030 -0.027 0.036 0.554
Symbol Search 0.621 -0.002 0.071 0.019 0.472
Omega-Hierarchical 0.875 0.215 0.109 0.104 0.311
Omega-hierarchical coefficients for group (index) factors were likely too low for interpretation.
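The omega-hierarchical row in the table can be reproduced from the reported bifactor loadings. The sketch below uses the standard model-based bifactor reliability formulas (ωH for the general factor, ωHS for each group factor), which may differ in detail from the authors' exact computation; the near-zero cross-loadings are ignored, and subtests are treated as standardized.

```python
# g and salient group-factor loadings from the bifactor table above.
g = [0.756, 0.796, 0.691, 0.699, 0.691, 0.752, 0.637, 0.738, 0.691, 0.621]
groups = {
    "VCI": ([0.380, 0.488, 0.394], [0, 1, 2]),  # Sim, Voc, Comp
    "PRI": ([0.408, 0.148, 0.285], [3, 4, 5]),  # BD, PCn, MR
    "WMI": ([0.256, 0.311], [6, 7]),            # DS, LNS
    "PSI": ([0.554, 0.472], [8, 9]),            # Cd, SS
}

# Unique variances under standardized subtests: 1 - g^2 - group loading^2.
theta = [1 - gi ** 2 for gi in g]
for lams, idx in groups.values():
    for lam, i in zip(lams, idx):
        theta[i] -= lam ** 2

# Omega-hierarchical for g: general-factor variance over total variance.
total = (sum(g) ** 2
         + sum(sum(lams) ** 2 for lams, _ in groups.values())
         + sum(theta))
omega_h = sum(g) ** 2 / total

# Omega-hierarchical-subscale for each index factor.
omega_hs = {}
for name, (lams, idx) in groups.items():
    sub_total = (sum(g[i] for i in idx) ** 2
                 + sum(lams) ** 2
                 + sum(theta[i] for i in idx))
    omega_hs[name] = sum(lams) ** 2 / sub_total

print(round(omega_h, 3))                         # 0.875
print({k: round(v, 3) for k, v in omega_hs.items()})
# matches the table: VCI 0.215, PRI 0.109, WMI 0.104, PSI 0.311
```

The recovered values match the table, illustrating why only g (ωH = 0.875) supports confident interpretation while the index factors contribute little reliable variance beyond g.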