A controversy in PISA and other large-scale assessments: the trade-off between model fit, invariance and validity David Andrich CEM: 30 years of Evidence in Education London : 23 September 2014

A controversy in PISA and other large-scale assessments: the trade-off between model fit, invariance and validity

David Andrich

CEM: 30 years of Evidence in Education

London : 23 September 2014

Programme for International Student Assessment (PISA)

Many uses and misuses

e.g. may reject the program

e.g. may reject the methodology

Will consider one methodological attack here

General assessment plan in PISA

To cover the curriculum, multiple booklets (16) with links are used in each country

Students do different booklets

All countries receive the same booklets

Place different booklets on the same scale

Use a probabilistic model for this purpose

The model estimates are then involved in comparing countries
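The slides do not write the model out. As a sketch only, assuming it is the dichotomous Rasch model (suggested by the Rasch references later in the talk), the probability that student n answers item i correctly would be as follows, where the symbols \beta_n (student proficiency) and \delta_i (item difficulty) are notation introduced here:

```latex
\Pr\{X_{ni} = 1\} = \frac{\exp(\beta_n - \delta_i)}{1 + \exp(\beta_n - \delta_i)}
```

Because only the difference \beta_n - \delta_i enters, students who take different booklets can be placed on one scale as long as the booklets are linked through common items.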

A methodological attack - DIF

Valid comparisons - items should work invariantly among countries (same relative difficulty in all countries)

Not invariant - said to have differential item functioning (DIF)

If DIF – what can be done about it?

If DIF – can comparisons be made valid?

It depends!
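A crude way to see what "same relative difficulty in all countries" means is to turn each country's item proportions correct into logits, centre them within country, and compare. This is only an illustrative sketch and not the analysis used in the talk: the function name `centered_item_logits` and all the proportions are made up.

```python
import numpy as np

def centered_item_logits(prop_correct):
    """Difficulty-like logits from item proportions correct, centred at zero."""
    p = np.asarray(prop_correct, dtype=float)
    logits = np.log((1.0 - p) / p)   # higher value = harder item
    return logits - logits.mean()    # centring removes the country's overall level

# Hypothetical proportions correct on four items in two countries (made-up numbers)
country_a = [0.80, 0.65, 0.50, 0.35]
country_b = [0.70, 0.53, 0.55, 0.25]

# Relative difficulty in country A minus relative difficulty in country B.
# Values near zero suggest invariance; a large value suggests DIF on that item.
contrast = centered_item_logits(country_a) - centered_item_logits(country_b)
flagged = int(np.argmax(np.abs(contrast)))  # zero-based index of the largest contrast

print(np.round(contrast, 2), "largest contrast at item index", flagged)
```

Centring within country matters: it separates a country being better overall from an item being relatively easier or harder in that country, and only the second is DIF.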

The presentation

1. Distinguish between causal and index variables

2. Imagine the assessment of physics in multiple domains

3. Set up an idealised assessment design in three countries

4. Illustrate the model used and the concepts of

(a) fit to the model (b) DIF

5. Show tension between model fit and validity.

Causal and Index Variables

Stenner, A. J., et al. (2008). Formative and reflective models: Can a Rasch analysis tell the difference? Rasch Measurement Transactions, 22, 1059-1060.

Causal and Index variables

Causal Example

E.g. heat, indicated by thermometers

Change in heat causes change on the thermometer

(i) Same changes on all thermometers

(ii) Thermometers are exchangeable

Index Example

E.g. indicators of SES

education, occupational prestige, income, and neighbourhood

(iii) Change in one indicator does not change other indicators

(iv) Indicators not exchangeable

Science proficiency – in light

Assess understanding of light (a relatively thin variable)

Items of a test related to the curriculum on light

Causal variable – understanding of light governs performance on all items of the test

Items in principle exchangeable (avoid effect of teaching to the test)

Assess a broad physics construct

Students from three countries

Simulation of an idealisation of the PISA controversy

All countries of equal proficiency

Item difficulties similar in the 5 domains – 8 items each

All items administered to all countries

Have some DIF by domains
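The design above can be sketched as a small simulation. The following is a minimal sketch, not the simulation from the talk. It assumes a dichotomous Rasch model, and the sample size, difficulty values, normal proficiency distribution and DIF size are all invented for illustration. The DIF pattern (items 17-24, Sound, easier in country 1; items 25-32, Electricity and Magnetism, easier in country 3) is one reading of the "C1 > C2, C3" and "C3 > C1, C2" labels on the later slides.

```python
import numpy as np

rng = np.random.default_rng(0)

N_PER_COUNTRY = 2000                 # assumed sample size per country
N_DOMAINS, ITEMS_PER_DOMAIN = 5, 8   # 5 domains of 8 items, as on the slide
N_ITEMS = N_DOMAINS * ITEMS_PER_DOMAIN
DIF_SIZE = 0.5                       # assumed DIF magnitude in logits

# Similar item difficulties in every domain: the same spread repeated 5 times
delta = np.tile(np.linspace(-1.5, 1.5, ITEMS_PER_DOMAIN), N_DOMAINS)

# Country-specific difficulty shifts (rows: countries 1-3, columns: items 1-40)
shift = np.zeros((3, N_ITEMS))
shift[0, 16:24] = -DIF_SIZE          # items 17-24 (Sound) easier in country 1
shift[2, 24:32] = -DIF_SIZE          # items 25-32 (Elec & Mag) easier in country 3

def simulate(country):
    """Simulate 0/1 responses for one country under a Rasch model."""
    beta = rng.normal(0.0, 1.0, size=N_PER_COUNTRY)   # equal proficiency everywhere
    logits = beta[:, None] - (delta + shift[country])[None, :]
    prob = 1.0 / (1.0 + np.exp(-logits))
    return (rng.random(prob.shape) < prob).astype(int)

responses = [simulate(c) for c in range(3)]            # all items, all countries
prop = np.array([r.mean(axis=0) for r in responses])   # proportions, shape (3, 40)

sound = prop[:, 16:24].mean(axis=1)   # mean proportion correct on Sound, per country
elec = prop[:, 24:32].mean(axis=1)    # same for Electricity and Magnetism
print("Sound:", np.round(sound, 3), "Elec & Mag:", np.round(elec, 3))
```

Even though the three countries are generated with the same proficiency distribution, country 1 scores higher on Sound and country 3 on Electricity and Magnetism, which is the kind of domain-level DIF the talk goes on to examine.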

Model and Fit of Item 21 - Sound

Model and DIF, 17 – 24, Sound

C1 > C2, C3

Model and Fit Item 29 – Electricity and Magnetism

Model and DIF, 25 – 32, Elec & Mag

C3 > C1, C2

Resolve items by country: Sound

Split items by country: Elec & Mag

Summary of Means

Summary: DIF and Interpretation

Split on a domain is equivalent to deleting it

Most valid interpretation? Depends on source of DIF!

Artefact or substantive

Cannot be answered only statistically.

Understand DIF, test and curriculum implications
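The claim that splitting a domain by country is equivalent to deleting it can be motivated with a short argument. This is a sketch under the assumption of a dichotomous Rasch model, with notation introduced here: \beta_n is the proficiency of student n, g is the student's country, and \delta_{ig} is the difficulty of split item i in country g. For any constant c_g,

```latex
\Pr\{X_{ni} = 1\}
  = \frac{\exp(\beta_n - \delta_{ig})}{1 + \exp(\beta_n - \delta_{ig})}
  = \frac{\exp\big((\beta_n + c_g) - (\delta_{ig} + c_g)\big)}
         {1 + \exp\big((\beta_n + c_g) - (\delta_{ig} + c_g)\big)}
```

Shifting every student in country g by c_g and the country's own difficulty for the split item by the same amount leaves the probabilities unchanged. Responses to split items therefore cannot locate one country relative to another; that location is fixed only by the items kept common across countries, just as if the split domain had been removed from the comparison.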

Thank You

