• Aucun résultat trouvé

Eduardo Doval

N/A
N/A
Protected

Academic year: 2022

Partager "Eduardo Doval"

Copied!
16
0
0

Texte intégral

(1)

aberrant response patterns on real data

Eduardo Doval1, M. Dolors Riba1, Rebeca García-Rueda 2 & Jordi Renom3

1Departament de Psicobiologia i de Metodologia de les Ciències de la Salut.

Facultat de Psicologia, Universitat Autònoma de Barcelona.

2Servei de Llengües-UAB idiomes, Universitat Autònoma de Barcelona.

1Departament de Metodologia de les Ciències del Comportament.

Facultat de Psicologia, Universitat de Barcelona.

(2)

INTRODUCTION

• In a competence test, the total score is usually used as an indicator of the test-taker’s level of competence.

• A total score is only a valid indicator as long as it follows the

expected response pattern (Evidence based on response processes;

AERA, APA & NCME, 2014).

Aberrant response patterns (ARP) + Total Score

(3)

• A variety of person-fit indices are available (Meijer and Sitjsma, 2001; Karabatsos, 2003).

• Person-fit indices may be parametric or non-parametric.

• The non parametric group-based are easy to calculate and are based on feasible assumptions: HT, U3, MCI.

• Given a fixed Type I error rate of .05, HT had the highest power to detect ARP, followed by U3, and MCI (Tendeiro and Meijer (2014) .

• These group-based person-fit indices outperformed parametric statistics like lz (Karabatsos, 2003).

(4)

AIM

To evaluate, with real data, the sensitivity of the extreme values ​​of the indices to detected different types of ARP:

Harnisch and Linn' Modified Caution Index (MCI) Van der Flier' U3

Sijtsma' HT

(5)

Data

Test of Basic Language Skills in Catalan.

Target population 72.153 students.

65.767 students (91%).

Participants

6th grade primary students in Catalonia (2013-2014).

(6)

Test

Basic Language Skills in Catalan.

RESEARCH METHOD

30 MCQ

Reading comprehension

Retrieving information

Interpreting

information Reflecting Literary Text Non Literary Text

(7)

Overview of the analysis

1. MCI, U3 and HT (PerFit R-package; Tendeiro, 2015).

2. Selection of 5% extreme cases.

3. Identification of the ARP types.

4. Comparision of % the ARP types detected by each index.

(8)

Identification of the three groups of items

0 0,1 0,2 0,3 0,4 0,5 0,6 0,7 0,8 0,9 1

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

proportion of correct responses

Items ordered by difficulty

Lower difficulty

(L)

Medium difficulty

(M)

Higher difficulty

(H)

33 rd percentile

.894

66 th percentile

.787

RESEARCH METHOD

Identification of the ARP type

(9)

Lower difficulty Medium difficulty Higher difficulty

Difficulty .97 .97 .93 .93 .91 .91 .91 .9 .9 .9 .89 .89 .88 .88 .86 .86 .84 .83 .81 .8 .78 .78 .75 .74 .7 .68 .68 .68 .6 .59 Score=16 Expected 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Observed 1 0 0 1 0 0 1 1 1 1 0 1 0 0 0 1 0 0 1 1 1 1 1 1 0 0 1 0 0 1

Proportion of correct responses

Expected 1 0.6 0

Observed 0.6 0.4 0.6

Differences

(observed-expected) Lower than expected (L) Lower than expected (L) Higher than expected (H)

-0,8 -0,6 -0,4 -0,2 0 0,2 0,4 0,6 0,8

LD MD HD

Observed - Expected

Identification of the ARP type

(10)

RESEARCH METHOD

6 response pattern types

E-E-E

L-L-H

-0,8 -0,6 -0,4 -0,2 0 0,2 0,4 0,6 0,8

LD MD HD

O-E

-0,8 -0,6 -0,4 -0,2 0 0,2 0,4 0,6 0,8

LD MD HD

O-E

-0,8 -0,6 -0,4 -0,2 0 0,2 0,4 0,6 0,8

LD MD HD

O-E

L-E-H L-H-H

-0,8 -0,6 -0,4 -0,2 0 0,2 0,4 0,6 0,8

LD MD HD

O-E

E-L-H L-H-E

-0,8 -0,6 -0,4 -0,2 0 0,2 0,4 0,6 0,8

LD MD HD

O-E

-0,8 -0,6 -0,4 -0,2 0 0,2 0,4 0,6 0,8

LD MD HD

O-E

Identification of the ARP type

(11)

• 4,475 ARP (6.8% of the total).

• ARP detected:

Three indices = 47,4% (2120) Two indices = 31.7% (1420) One index = 20.9% (935)

MCI U3 HT

(12)

L-L-H

-0,8 0,2

LD MD HD

O-E

-0,8 0,2

LD MD HD

O-E L-E-H

48%

73.8%

E-L-H

-0,8 0,2

LD MD HD

O-E

RESULTS

32.6%

MCI+U3+HT MCI+HT MCI+U3 HT +U3 MCI HT U3 42.2%

(1890)

23.6%

(1054)

29.5%

(1319)

L-H-H

-0,8 0,2

LD MD HD

O-E 1,9%4.7%

(211)

9.2%

5.4%

64.1%

0%

0.2%

1.1%

0%

0%

9.4%

5.9%

0%

55.9%

0.4%

2.6%

3.1%

0%

0.5%

0.3%

0.2%

6%

32.2%

13.9%

0%

36%

(13)

• Tendeiro and Meijer (2014) found that HT outperformed U3 and MCI.

• In our study U3 is more sensitive to patterns involving extreme difficulty values.

L-L-H L-E-H L-H-H

HT is more sensitive to patterns involving medium and high difficulty values.

E-L-H

-0,8 0,2

LD MD HD

O-E

-0,8 0,2

LD MD HD

O-E

-0,8 0,2

LD MD HD

O-E

-0,8 0,2

LD MD HD

O-E

(14)

• The use of both indices, HT and U3, is recommended to detect the greatest number of ARP cases and a wider variety of ARP types.

CONCLUSION

(15)

AERA, APA y NCME (2014). The Standards for Educational and Psychological Testing. Washington, DC: AERA.

International Test Comission. (2012). Quality Control in Scoring, Test Analysis, and Reporting of Test Scores. [www.intestcom.org]

Karabatsos, G. (2003). Comparing the aberrant response detection performance of thirty-six person-fit statistics. Applied Measurement in Education 16, 277–298.

Meijer, R.R. and Sitjsma, K. (2001). Methodology Review: Evaluating Person Fit. Applied Psychological Measurement,25(2), pp. 107–135.

Tendeiro, J.N. (2015). Package ‘PerFit’ [Sotfware]. University of Groningen.

Available at http://cran.r-project.org/web/packages/PerFit

Tendeiro, J. N., & Meijer, R. R. (2014). Detection of invalid test Scores: The usefulness of simple nonparametric statistics. Journal of Educational Measurement, 51, 239-259. doi:10.1111/jedm.12046

(16)

Thank you for your attention

[email protected]

This work received financial support from the Dirección General de Investigación y Gestión del Plan Nacional de I+D+I, of the Spanish Ministry for Economy and Competitiveness (project EDU2013-41399-P).

We thank the "Consell d’Avaluació de Catalunya " the support given to this study.

[email protected]

[email protected] [email protected]

Références

Documents relatifs

To determine whether a small organic sugarcane farming system is profitable or not, especially in terms of productivity, control of bioagressors (weeds, rats and pests) and

We show applications of this method to the optimization over the space of curves, where this Finsler gradient allows one to construct piecewise regular curve evolutions.. 1.1

C’est aussi pour cela que nous intervenons dans les fêtes populaires, au sein même des communautés indigènes, en perpétuant l’héritage de nos prédécesseures

En effet, parmi les femmes de notre étude qui ont eu un suivi gynécologique insuffisant (profil A), si seule une minorité d’entre elles s’est fait dépister pour

When combined with the immense light-gathering power of LUVOIR, it will deliver unprecedented views of the disks, winds, chromospheres and magnetospheres around a broad range

models, the Lokal-Modell (LM) and the Regional Atmospheric Modeling System (RAMS), coupled with an explicit dust emission model have previously been used to simulate mineral dust

In this paper, we investigate the relationships between crop biophysical parameters (LAI, SPAD, Humidity) and two airborne-derived spectral indices (NDVI: Normalized Vegetation

We obtain (a) for antimitotic products, the percentage of resistant strains in the popu- lation analysed; (b) for SBI (Sterol Biosyn- thesis Inhibitor) fungicides, the variation