A controversy in PISA and other large-scale assessments: the trade-off between model fit, invariance and validity
David Andrich
CEM: 30 years of Evidence in Education
London : 23 September 2014
Programme for International Student Assessment - PISA
Many uses and misuses
e.g. may reject the program
e.g. may reject the methodology
Will consider one methodological attack here
General assessment plan in PISA
To cover the curriculum, multiple booklets (16) with links are used in each country
Students do different booklets
All countries receive the same booklets
Place different booklets on the same scale
Use a probabilistic model for this purpose
The model estimates are then involved in comparing countries
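The slides do not name the probabilistic model at this point. Assuming it is the dichotomous Rasch model (an assumption, suggested by the Stenner et al. reference to Rasch analysis), the probability that student n answers item i correctly could be written as follows. The symbols \beta_n and \delta_i are introduced here for illustration only.

```latex
% Assumed model: dichotomous Rasch model (not stated explicitly in the slides).
% \beta_n = proficiency of student n, \delta_i = difficulty of item i.
\Pr\{X_{ni} = 1\} = \frac{\exp(\beta_n - \delta_i)}{1 + \exp(\beta_n - \delta_i)}
```

Under such a model, students and items are located on one scale, which is what would allow different booklets with common link items to be placed on the same scale.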
A methodological attack - DIF
Valid comparisons - items should work invariantly among countries (same relative difficulty in all countries)
Not invariant - said to have differential item functioning (DIF)
If DIF – what can be done about it?
If DIF – can comparisons be made valid?
It depends!
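One way to make the invariance requirement concrete is to let item difficulty depend on country. The sketch below assumes a Rasch-type model with student proficiency \beta_n and item difficulty \delta_{ig} for item i in country g; both the model and the notation are assumptions made for illustration, not taken from the slides.

```latex
% Illustrative notation only: g indexes countries,
% \delta_{ig} is the difficulty of item i in country g.
\Pr\{X_{ni} = 1 \mid g\} = \frac{\exp(\beta_n - \delta_{ig})}{1 + \exp(\beta_n - \delta_{ig})}

% Invariance (no DIF): relative difficulties are the same in every country.
\delta_{ig} - \delta_{jg} = \delta_{ig'} - \delta_{jg'}
\quad \text{for all items } i, j \text{ and all countries } g, g'
```

An item whose relative difficulty differs between countries violates the second line and would be said to show DIF.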
The presentation
1. Distinguish between causal and index variables
2. Imagine the assessment of physics in multiple domains
3. Set up an idealised assessment design in three countries
4. Illustrate the model used and the concepts of
(a) fit to the model (b) DIF
5. Show tension between model fit and validity.
Causal and Index Variables
Stenner, A. J., et al. (2008). Formative and reflective models: Can a Rasch analysis tell the difference? Rasch Measurement Transactions, 22, 1059–1060.
Causal and Index variables
Causal Example
E.g. heat, indicated by thermometers
A change in heat causes a change on the thermometer
(i) Same changes on all thermometers
(ii) Thermometers are exchangeable
Index Example
E.g. indicators of SES:
education, occupational prestige, income, and neighbourhood
(iii) Change in one indicator does not change other indicators
(iv) Indicators not exchangeable
Science proficiency – in light
Assessing understanding of light (a relatively thin variable)
Items of a test related to the curriculum on light
Causal variable – understanding of light governs performance on all items of the test
Items in principle exchangeable (avoid effect of teaching to the test)
Students from three countries
Simulation of an idealisation of the PISA controversy
All countries of equal proficiency
Item difficulties similar in the 5 domains – 8 items each
All items administered to all countries
Have some DIF by domains
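The design above can be sketched in a few lines of Python. This is a minimal illustration, not the simulation used in the presentation: it assumes a dichotomous Rasch model, and the sample size, the difficulty values, the size of the DIF shift and the choice of which domain is harder in which country are all hypothetical.

```python
import math
import random

N_DOMAINS = 5
ITEMS_PER_DOMAIN = 8
COUNTRIES = ["A", "B", "C"]


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def simulate(n_students=500, dif_shift=0.5, seed=0):
    """Return the proportion correct for each (country, domain) pair."""
    rng = random.Random(seed)
    # Same spread of item difficulties in every domain (hypothetical values).
    base = [-1.75 + 0.5 * i for i in range(ITEMS_PER_DOMAIN)]
    # Hypothetical DIF pattern: one domain is harder in each country.
    dif = {("A", 0): dif_shift, ("B", 1): dif_shift, ("C", 2): dif_shift}
    proportions = {}
    for country in COUNTRIES:
        # All countries have the same proficiency distribution.
        abilities = [rng.gauss(0.0, 1.0) for _ in range(n_students)]
        for domain in range(N_DOMAINS):
            shift = dif.get((country, domain), 0.0)
            correct = 0
            # Every student answers every item in the domain.
            for theta in abilities:
                for b in base:
                    if rng.random() < rasch_probability(theta, b + shift):
                        correct += 1
            proportions[(country, domain)] = correct / (n_students * ITEMS_PER_DOMAIN)
    return proportions


if __name__ == "__main__":
    for (country, domain), p in sorted(simulate().items()):
        print(f"country {country}, domain {domain}: proportion correct {p:.3f}")
```

In the printed table, each country should show a lower proportion correct on its shifted domain than on the others, even though the proficiency distributions are identical. Whether that difference is an artefact or substantive is the question the simulation itself cannot answer.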
Summary: DIF and Interpretation
Split on a domain is equivalent to deleting it
Most valid interpretation? Depends on the source of DIF! Artefact or substantive
Cannot be answered only statistically.
Understand DIF, test and curriculum implications