Copyright © 2006 Educational Testing Service

Listening. Learning. Leading.

Using Differential Item Functioning to Investigate the Impact of Accommodations on the Scores of Students with Disabilities on English-Language Arts Assessments

Mary Pitoniak, Linda Cook, Frederic Cline, and Cara Cahalan-Laitusis

Educational Testing Service

NCME Presentation

April 10, 2006

Purpose and Overview of the Study

• The purpose of this study was to examine differential item functioning on the English-Language Arts assessment described by Linda

• DIF analyses are statistical procedures that are used to identify items that function differently for different subgroups of examinees

• DIF “exists when examinees of equal ability differ, on average, according to their group membership in their responses to a particular item” (Standards)

Purpose and Overview of the Study (continued)

• Issues investigated:

– Do 2 different DIF detection methods yield the same results?

– Are the results interpretable in terms of a priori or a posteriori evaluation of item content?

– Of particular interest: When the read-aloud modification is used, do the items function differentially for students?

Purpose and Overview of the Study (continued)

• Features of study:

– 2 DIF detection methods

– Large enough sample sizes (not always the case)

– Looked at 3 different criteria (total score, Reading score, Writing score); we decided to go with total score for several reasons

– Used purification step, as recommended by literature
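
The purification step can be illustrated with a short sketch. The slides do not describe how purification was carried out in this study, so this is only one common reading of the idea: items flagged for DIF are dropped from the matching criterion and the analysis is rerun until the set of unflagged items stops changing. The callback `flag_fn` is hypothetical and stands in for whichever DIF method is applied.

```python
def purify(items, flag_fn, max_rounds=10):
    """Iteratively remove DIF-flagged items from the matching criterion.

    items: iterable of item identifiers.
    flag_fn: hypothetical callback; given the set of items that currently
        make up the matching score, it returns the set of flagged items.
    Returns (purified matching set, items flagged in the last round).
    """
    all_items = set(items)
    matching = set(all_items)
    flagged = set()
    for _ in range(max_rounds):
        flagged = set(flag_fn(matching))
        next_matching = all_items - flagged
        if next_matching == matching:
            break  # the matching set is stable, so stop
        matching = next_matching
    return matching, flagged
```

With a real DIF routine plugged in as `flag_fn`, the returned matching set is the criterion on which the final DIF statistics would be computed.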

Comparisons Made in the Study

Comparison Number | Reference Group | Focal Group
1.3 | Without disabilities | LD no accommodations
1.4 | Without disabilities | LD IEP/504 accommodations
1.5 | Without disabilities | LD read-aloud modification (& IEP/504 accommodations)
3.1 | LD no accommodations | LD IEP/504 accommodations
3.2 | LD no accommodations | LD read-aloud modification (& IEP/504 accommodations)

DIF Methods Used

• Mantel-Haenszel

• Logistic Regression

• For both methods, we used the ETS classification system:

– Category A contains items with negligible DIF;

– Category B contains items with slight to moderate values of DIF;

– Category C contains items with moderate to large values of DIF.
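
As an illustration of how a Mantel-Haenszel result maps onto the A/B/C categories, here is a minimal sketch. It assumes the commonly cited ETS delta-scale convention (MH D-DIF = -2.35 ln α, with cut points at 1.0 and 1.5); the slides do not state the thresholds used in this study. The full ETS rules also involve significance tests, which this sketch deliberately omits, so `ets_category_by_size` is a magnitude-only approximation. The example counts are made up.

```python
import math

def mh_common_odds_ratio(strata):
    """Mantel-Haenszel common odds ratio across matched score levels.

    Each stratum is a tuple of counts at one score level:
    (ref_correct, ref_wrong, focal_correct, focal_wrong).
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # skip empty score levels
        num += a * d / n
        den += b * c / n
    return num / den

def mh_d_dif(alpha_mh):
    """Delta-scale index; negative values mean the item is harder
    for the focal group than for matched reference-group examinees."""
    return -2.35 * math.log(alpha_mh)

def ets_category_by_size(d_dif):
    """Magnitude-only approximation of the A/B/C rules.

    Assumption: significance tests in the full ETS rules are ignored.
    """
    size = abs(d_dif)
    if size < 1.0:
        return "A"
    if size < 1.5:
        return "B"
    return "C"

# Hypothetical counts for two score levels (not data from the study).
example_strata = [(30, 20, 20, 30), (40, 10, 30, 20)]
example_alpha = mh_common_odds_ratio(example_strata)
example_category = ets_category_by_size(mh_d_dif(example_alpha))
```

In practice the strata would be the levels of the matching criterion (total score in this study), one 2 × 2 table per level.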

Comparison of Mantel-Haenszel vs. Logistic Regression

Mantel-Haenszel

Advantages:

1. Computationally simple

2. Most powerful DIF detection method when DIF is constant (or uniform) and group mean abilities are equal

3. Low Type I error rates when the compared groups have equal mean abilities

4. A relatively small sample size is needed (e.g., 200-250 per group) for reasonable power

5. Effect size measure is relatively more sensitive to actual DIF conditions

Disadvantages:

1. Power is negatively affected by inadequate sample size and unequal group mean abilities

Logistic Regression

Advantages:

1. Relatively powerful in detecting uniform DIF

2. Superior power in detecting nonuniform DIF

3. Method is very flexible: it allows different models to be specified and can handle multiple ability estimates

Disadvantages:

1. Computationally intensive

2. The effect size measure (ΔR²) and set guidelines are not as sensitive to the actual magnitude of DIF
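
To make the logistic regression approach concrete, the sketch below fits three nested models for a single item (total score only; score plus group; score, group, and their interaction) and compares them with likelihood-ratio tests. This is one common formulation and is an assumption here: the slides do not give the model specification used in the study, and the ΔR² effect size is not computed. The fitting routine is a plain Newton-Raphson written for illustration, and all function names are hypothetical.

```python
import math

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_loglik(X, y, iters=25):
    """Fit a logistic regression by Newton-Raphson and return the
    maximized log-likelihood. Assumes the data are not separable."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(X, y):
            z = sum(b * x for b, x in zip(beta, xi))
            p = 1.0 / (1.0 + math.exp(-z))
            w = p * (1.0 - p)
            for i in range(k):
                grad[i] += (yi - p) * xi[i]
                for j in range(k):
                    hess[i][j] += w * xi[i] * xi[j]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    loglik = 0.0
    for xi, yi in zip(X, y):
        z = sum(b * x for b, x in zip(beta, xi))
        loglik += yi * z - math.log1p(math.exp(z))
    return loglik

def lr_dif(scores, groups, responses):
    """Likelihood-ratio tests for uniform and non-uniform DIF on one item.

    scores: matching criterion (e.g., total score) per examinee.
    groups: 0 for the reference group, 1 for the focal group.
    responses: 1 for a correct answer, 0 otherwise.
    """
    m1 = [[1.0, s] for s in scores]
    m2 = [[1.0, s, g] for s, g in zip(scores, groups)]
    m3 = [[1.0, s, g, s * g] for s, g in zip(scores, groups)]
    ll1, ll2, ll3 = (fit_loglik(m, responses) for m in (m1, m2, m3))
    uniform = max(2.0 * (ll2 - ll1), 0.0)     # group main effect
    nonuniform = max(2.0 * (ll3 - ll2), 0.0)  # score-by-group interaction
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    return {
        "uniform_chi2": uniform,
        "uniform_p": math.erfc(math.sqrt(uniform / 2.0)),
        "nonuniform_chi2": nonuniform,
        "nonuniform_p": math.erfc(math.sqrt(nonuniform / 2.0)),
    }
```

A production analysis would more likely rely on an established statistics package; the hand-written fit is used here only to keep the example self-contained.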

Example of Uniform DIF

[Figure: percent correct (0% to 100%) plotted against total score (0 to 64)]

Example of Non-Uniform DIF

[Figure: percent correct (0% to 100%) plotted against total score (0 to 64)]
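
Since the plotted curves did not survive in this transcript, the contrast between uniform and non-uniform DIF can be sketched numerically. The logistic curves and their (intercept, slope) values below are hypothetical, chosen only to show the two patterns: with equal slopes one group is favored at every score level, while with unequal slopes the curves cross, so the favored group changes along the score scale.

```python
import math

SCORES = [0, 16, 28, 40, 52, 64]  # tick marks on the slides' total-score axis

def p_correct(score, intercept, slope):
    """Probability of a correct response under a logistic curve."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * score)))

def gaps(ref, focal):
    """Reference-minus-focal difference in P(correct) at each score."""
    return [p_correct(s, *ref) - p_correct(s, *focal) for s in SCORES]

def curves_cross(ref, focal):
    """True if the favored group changes somewhere along the score scale."""
    g = gaps(ref, focal)
    return min(g) < 0 < max(g)

# Hypothetical (intercept, slope) pairs, not estimates from the study.
REFERENCE = (-3.0, 0.10)
FOCAL_UNIFORM = (-4.0, 0.10)     # same slope, shifted: uniform DIF
FOCAL_NONUNIFORM = (-1.4, 0.05)  # different slope: curves cross near score 32
```

Calling `gaps(REFERENCE, FOCAL_UNIFORM)` gives differences that all share one sign, whereas `gaps(REFERENCE, FOCAL_NONUNIFORM)` starts negative and ends positive.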

Results

• Within this presentation, I will present results only for Reading items (and not Writing), both for time reasons and because we were most interested in the effects of the accommodations on performance on the Reading items

Results (continued)

• Overall

– No items flagged as “C”

– Each method flagged 9 items as “B” (out of 42 items × 5 comparisons, or 210 possible flags)

– However, those 9 items were not the same items—in all, 12 different items were flagged by at least one of the methods

– There were inconsistencies between methods

Number of Items Flagged by Each Method

Reference Group | Focal Group | Mantel-Haenszel | Logistic Regression
Non-LD | LD no acc | 1 | 6
Non-LD | LD IEP/504 | 1 | 3
Non-LD | LD read-aloud | 6 | 0
LD no acc | LD IEP/504 | 0 | 0
LD no acc | LD read-aloud | 1 | 0
Total | | 9 | 9

Agreement Between Flags for Methods by Comparison Type

Reference Group | Focal Group | Agreement | Discrepancy: Uniform vs. no flag | Discrepancy: Non-uniform vs. uniform
Non-LD | LD no acc | 36 | 1 | 5
Non-LD | LD IEP/504 | 39 | 2 | 1
Non-LD | LD read-aloud | 36 | 6 | 0
LD no acc | LD IEP/504 | 42 | 0 | 0
LD no acc | LD read-aloud | 41 | 1 | 0
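
The agreement counts above can be turned into rates with a few lines of arithmetic. The sketch below only restates the numbers in the table; the variable names are illustrative.

```python
# (reference group, focal group, agreement,
#  uniform vs. no flag, non-uniform vs. uniform), copied from the table
ROWS = [
    ("Non-LD", "LD no acc", 36, 1, 5),
    ("Non-LD", "LD IEP/504", 39, 2, 1),
    ("Non-LD", "LD read-aloud", 36, 6, 0),
    ("LD no acc", "LD IEP/504", 42, 0, 0),
    ("LD no acc", "LD read-aloud", 41, 1, 0),
]

def agreement_rates(rows):
    """Proportion of items on which both methods gave the same flag."""
    rates = {}
    for ref, focal, agree, disc_uniform, disc_nonuniform in rows:
        n_items = agree + disc_uniform + disc_nonuniform
        rates[(ref, focal)] = agree / n_items
    return rates

total_agree = sum(r[2] for r in ROWS)
total_items = sum(r[2] + r[3] + r[4] for r in ROWS)
overall_rate = total_agree / total_items  # 194 of 210 item-by-comparison pairs
```

Each row sums to the 42 items analyzed, and the methods agree on 194 of the 210 item-by-comparison pairs overall, or about 92%.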

Non-LD vs. LD No Accommodation

Rows: Logistic Regression flag. Columns: Mantel-Haenszel flag.

Logistic Regression | No flag | Favors non-LD | Favors LD no accomm | Total
No flag | 36 | | | 36
Favors non-LD | | | | 0
Favors LD no accomm | 1 | | | 1
Non-uniform 1 | 1 | 1 | | 2
Non-uniform 2 | 3 | | | 3
Total | 41 | 1 | 0 | 42

Non-LD vs. LD IEP/504 Accommodation

Rows: Logistic Regression flag. Columns: Mantel-Haenszel flag.

Logistic Regression | No flag | Favors non-LD | Favors LD IEP/504 | Total
No flag | 39 | | | 39
Favors non-LD | | | | 0
Favors LD IEP/504 | 2 | | | 2
Non-uniform 1 | | 1 | | 1
Non-uniform 2 | | | | 0
Total | 41 | 1 | 0 | 42

Non-LD vs. LD Read-Aloud Modification

Rows: Logistic Regression flag. Columns: Mantel-Haenszel flag.

Logistic Regression | No flag | Favors non-LD | Favors LD Read-Aloud | Total
No flag | 36 | 1 | 5 | 42
Favors non-LD | | | |
Favors LD Read-Aloud | | | |
Non-uniform 1 | | | |
Non-uniform 2 | | | |
Total | 36 | 1 | 5 | 42

LD Non-Accommodated vs. LD IEP/504 Accommodation

Rows: Logistic Regression flag. Columns: Mantel-Haenszel flag.

Logistic Regression | No flag | Favors LD Non-Acc | Favors LD IEP/504 | Total
No flag | 42 | | | 42
Favors LD Non-Acc | | | |
Favors LD IEP/504 | | | |
Non-uniform 1 | | | |
Non-uniform 2 | | | |
Total | | | | 42

LD Non-Accommodated vs. LD Read-Aloud Modification

Rows: Logistic Regression flag. Columns: Mantel-Haenszel flag.

Logistic Regression | No flag | Favors LD Non-Acc. | Favors LD Read-Aloud | Total
No flag | 40 | | | 42
Favors LD Non-Acc. | 1 | | |
Favors LD Read-Aloud | 1 | | |
Non-uniform 1 | | | |
Non-uniform 2 | | | |
Total | 42 | | | 42

Example of Discrepancies in Flags

Item Flags: M-H: Uniform; LR: No flag

The items flagged by MH (but not LR) as favoring students with the read-aloud modification did show differences such as these graphically for LR

[Figure: percent correct (0% to 100%) plotted against total score (0 to 64)]

Example of Discrepancies in Flags

Item Flags: M-H: Uniform; LR: No flag

[Figure: percent correct (0% to 100%) plotted against total score (0 to 64)]

A Priori Theories About Read-Aloud Modification Results

• 5 items were easier for students who received the read-aloud modification than for non-LD students.

• A priori theories were not that accurate!

– Item A: harder (refer back)

– Item B: easier (short item; intonation/body language)

– Item C: easier (intonation/body language)

– Item D: harder (char. of options)

– Item E: harder (length of options)

A Posteriori Interpretation About Read-Aloud Modification Results

• The reasons why these 5 items were easier with read-aloud accommodation were not obvious to test developers

What Do the Results Say About the 3 Questions Posed?

• Do 2 different DIF detection methods yield the same results?

– Neither flagged an item as “C.”

– There were discrepancies in “B” flags, however.

– Some discrepancies are explainable in terms of advantages/disadvantages of methods as listed earlier.

3 Questions (continued)

• Are the results interpretable in terms of a priori or a posteriori evaluation of item content?

– Not consistently

• Of particular interest: When the read-aloud modification is used, do the items function differentially for students?

– Yes, some items were easier when read aloud, which supports this state’s decision to view read-aloud as a modification

Next Steps

• ELL and ELL/LD groups to be compared

• Grade 8 ELA to be evaluated

• DIF analysis paradigm to be utilized

