
Creating Tests that Measure Well and that Model Good Instructional and Learning Practice

National Conference on Student Assessment, Minneapolis, MN
June 2012

Session Outline

• Presenters
  – Randy Bennett (ETS)
  – Edys Quellmalz (WestEd)
• Discussant
  – Brian Gong (NCIEA)
• Q&A


CBAL: Modeling Good Instructional and Learning Practice through Assessment

Randy Bennett (ETS)

[email protected]

Presentation at the National Conference on Student Assessment, Minneapolis, MN, June 2012

Overview

• Brief description of CBAL’s goal and design characteristics
• Brief outline of pilot results
• Examples of how we try to model good teaching and learning practice
• List of outstanding issues
• Summary


Cognitively Based Assessment of, for, and as Learning

• Began in 2007
• Goal: Create knowledge and capability, grounded in the learning sciences, that can be configured in different ways to address the assessment innovation needs of the field
• CBAL assessment prototypes attempt to:
  – Document what students have achieved (“of learning”)
  – Help identify how to plan instruction (“for learning”)
  – Offer worthwhile educational experiences (“as learning”)
• R&D covers reading, writing, mathematics, and science from elementary school through adult education


Key Design Characteristics

• Summative and formative assessment built as part of a coherent system
• System model was created from a detailed theory of action
• Assessment designs are grounded in principles and domain conceptions from learning-sciences research
• Assessment prototypes are computer-delivered and make heavy use of structured, scenario-based task sets
• Summative assessments use a distributed design
• Assessment prototypes are built to measure well and to model good instructional and learning practice

Copyright © 2011 by Educational Testing Service.

6

Summary of Results from 16 Online Summative Assessment Pilots


Content Area (& # of Form Administrations) | # of Tests | Median (M) of the M p+ Values | M of the M Percent Omitted/Missing | M Coeff. Alpha | Most Frequent Factor-Analytic Result | M r with Other Tests of the Same Skill | Diff. Btwn Auto-Human (H) & H-H Agreement
Reading (6) | 3,062 | .51 | 0% | .88 | 1 F within & across forms | .74 | M = 6 k pts
Writing (9) | 5,410 | .57 | 1% | .82 | 1 F within & across forms | --- | 3 r pts
Math (12) | 1,347 | .45 | 6% | .92 | 1 F within forms | .76 | M = 15 k pts

Note. M = median; F = factor; k = kappa; r = correlation coefficient.
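
For readers who want to see how the quantities in the table are defined, the sketch below is a minimal Python illustration; it is not the presenters' analysis code, and all data in it are simulated. It computes item p+ values (proportion correct), coefficient alpha (internal-consistency reliability), and Cohen's kappa, the chance-corrected agreement statistic behind the final column.

```python
# Minimal, hypothetical sketch (not ETS's analysis code; data are simulated)
# of the statistics reported in the table: p+, coefficient alpha, Cohen's kappa.
import numpy as np

def p_plus(scores: np.ndarray) -> np.ndarray:
    """p+ (item difficulty): proportion answering each item correctly.
    `scores` is an examinees-by-items matrix of 0/1 item scores."""
    return scores.mean(axis=0)

def coefficient_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total-score variance)."""
    k = scores.shape[1]
    item_var_sum = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var_sum / total_var)

def cohens_kappa(a: np.ndarray, b: np.ndarray) -> float:
    """Chance-corrected agreement between two scorers (e.g., automated vs. human)."""
    observed = np.mean(a == b)
    expected = sum(np.mean(a == c) * np.mean(b == c) for c in np.union1d(a, b))
    return (observed - expected) / (1 - expected)

rng = np.random.default_rng(0)
items = (rng.random((200, 30)) < 0.5).astype(int)   # 200 examinees x 30 items, simulated
human = rng.integers(0, 4, size=100)                # human essay scores on a 0-3 scale
auto = np.where(rng.random(100) < 0.8, human, rng.integers(0, 4, size=100))
print(p_plus(items).mean().round(2),
      round(coefficient_alpha(items), 2),
      round(cohens_kappa(auto, human), 2))
```

The "M = 6 k pts" entry, for example, reports a median difference of 6 kappa points between automated-human and human-human agreement computed this way.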


Modeling Good Teaching and Learning Practice

• CBAL summative (and formative) assessments try to:
  – Give students something substantive and reasonably realistic with which to reason, read, write, or do mathematics or science
  – Include tools and representations similar to ones proficient performers tend to use
  – Connect qualitative (conceptual) understanding with formalism
  – Use “lead-in” and “culminating” tasks to suggest how the skills required for more complex performance might be decomposed for instruction
  – Use (CCSS-aligned) learning progressions to denote and measure levels of qualitative change in student understanding


[Example task screenshot: “Will the lake become so shallow that water can no longer flow through the dam?”]


CBAL Definition of “Learning Progression”

• A description of qualitative change in a student’s level of sophistication for a key concept, process, strategy, practice, or habit of mind. Change in a student’s standing on such a progression may be due to a variety of factors, including maturation and instruction. Each progression is presumed to be modal, that is, to hold for most, but not all, students. Finally, each progression is provisional: subject to empirical verification and theoretical challenge.


Provisional Learning Progression for Argument-Building (Deliberation)

• PRELIMINARY: Can distinguish reasons from non-reasons and infer whether reasons would be used to support or oppose a position.
• FOUNDATIONAL: Can self-generate multiple reasons to support an opinion.
• BASIC: Can rank and select reasons by how convincing they seem; can distinguish facts and details that strengthen a point from those that weaken it; can distinguish between reasoning that seems convincing because one agrees with it and reasoning that seems convincing because of the content of the argument.
• INTERMEDIATE: Can recognize counterexamples; can distinguish valid from invalid arguments and recognize unsupported claims and obvious fallacies.
• ADVANCED: Can identify and question the warrants of arguments, distinguish necessary from sufficient evidence, and synthesize a position from many sources of evidence, using that synthesis to identify key evidence and propose new lines of argument.

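
As an aside for implementers: one way to make a progression like this operational is to treat the levels as an ordered scale with cut scores. The Python sketch below is purely illustrative; the level names follow the slide, but the enum, cut scores, and placement rule are invented here and are not CBAL's.

```python
# Hypothetical sketch only: CBAL's scoring models are not described in this deck.
# It shows one way a provisional progression could be operationalized, as an
# ordered set of levels plus invented cut scores mapping a task score to a level.
from enum import IntEnum

class ArgumentBuilding(IntEnum):
    PRELIMINARY = 1   # distinguish reasons from non-reasons
    FOUNDATIONAL = 2  # self-generate multiple supporting reasons
    BASIC = 3         # rank/select reasons; separate agreement from argument quality
    INTERMEDIATE = 4  # recognize counterexamples, invalid arguments, fallacies
    ADVANCED = 5      # question warrants, weigh evidence, synthesize a position

# Invented cut scores: minimum total score on a task set needed for each level.
CUTS = {
    ArgumentBuilding.PRELIMINARY: 0,
    ArgumentBuilding.FOUNDATIONAL: 8,
    ArgumentBuilding.BASIC: 15,
    ArgumentBuilding.INTERMEDIATE: 22,
    ArgumentBuilding.ADVANCED: 28,
}

def place_student(total_score: int) -> ArgumentBuilding:
    """Return the highest level whose cut score the student's total meets."""
    return max(level for level, cut in CUTS.items() if total_score >= cut)

print(place_student(17))  # ArgumentBuilding.BASIC
```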


Outstanding Issues

• Do our modeling strategies affect classroom teaching and learning practice?
  – Do teachers change their instructional practice in the intended ways?
  – Do students change their learning practice in the intended ways?
  – Does achievement improve?
• Are our learning progressions useful for measurement and for instruction?
• Do the modeling strategies and learning progressions appear to be of benefit for students from special populations, as well as for those from the general population?


Summary

• In CBAL we are:
  – Designing assessment prototypes to measure well and have positive impact
  – Attempting to have positive impact by modeling good teaching and learning practice:
    • Give students something substantive and reasonably realistic with which to work
    • Include tools and representations similar to ones used by proficient performers
    • Connect qualitative understanding with formalism
    • Use “lead-in” and “culminating” tasks to suggest how complex performance might be decomposed
    • Use provisional learning progressions to denote and measure levels of qualitative change
• Data on “measuring well” appear promising
• Much more work needs to be done to verify the effectiveness of our “practice-modeling” attempts


For More About CBAL

• Overview Papers
  – Bennett, R. E., & Gitomer, D. H. (2009). Transforming K-12 assessment: Integrating accountability testing, formative assessment, and professional support. In C. Wyatt-Smith & J. Cumming (Eds.), Educational assessment in the 21st century (pp. 43-61). New York: Springer.
  – Bennett, R. E. (2010). Cognitively Based Assessment of, for, and as Learning: A preliminary theory of action for summative and formative assessment. Measurement: Interdisciplinary Research and Perspectives, 8, 70-91.
  – Bennett, R. E. (2011). CBAL: Results from piloting innovative K-12 assessments (RR-11-23). Princeton, NJ: Educational Testing Service.
• Commentaries
  – Embretson, S. (2010). Cognitively based assessment and the integration of summative and formative assessments. Measurement: Interdisciplinary Research and Perspectives, 8, 180-184.
  – Linn, R. L. (2010). Commentary: A new era of test-based educational accountability. Measurement: Interdisciplinary Research and Perspectives, 8, 145-149.
• www.ets.org/research/topics/cbal/initiative
