Applications in educational monitoring and practice: the logic of standard-setting. Mark Wilson, University of California, Berkeley. Presented at the Standard-setting in the Nordic Countries Conference, CEMO, University of Oslo, September 22, 2015
Page 1:

Applications in educational monitoring and practice:

the logic of standard-setting

Mark Wilson University of California, Berkeley

Presented at the Standard-setting in the Nordic Countries Conference, CEMO, University of Oslo, September 22, 2015

Page 2:

Outline

• The logic of standard-setting

• Applied to two traditional standard-setting methods

• An alternative: Unidimensional constructs

• An alternative: Multidimensional constructs

• Summary, Conclusions, etc.

Page 4:

The logic of standard-setting

I. A way to define the outcome objectives

II. A way to decide what is (qualitatively) “enough” of the standards

III. A way to make manifest student performance on the standards (i.e., a “test”)

IV. A way to decide which performances are acceptable

Page 5:

I. A way to define the outcome objectives: The “Standards”

• Students in Grade X studying subject Y should be able to succeed on the following “standards” ...

• A. First standard

• B. Second standard

• :

• N. Nth standard

Page 6:

II. A way to decide what is (qualitatively) “enough” of the standards

• Success on every standard in {A, B, ...N}?

• Enough success on every standard in {A, B, ...N}?

• Enough success on enough of the standards in {A, B, ...N}?
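
The three candidate rules above differ only in where the thresholds sit. A minimal sketch, assuming each student is summarized by a proportion-correct per standard; the function names, the 0.7 per-standard threshold, and the "2 standards" requirement are hypothetical illustrations, not values from the talk.

```python
from typing import Dict

def success_on_every(props: Dict[str, float]) -> bool:
    """Rule 1: complete success on every standard."""
    return all(p == 1.0 for p in props.values())

def enough_on_every(props: Dict[str, float], per_standard: float = 0.7) -> bool:
    """Rule 2: at least `per_standard` (hypothetical threshold) on every standard."""
    return all(p >= per_standard for p in props.values())

def enough_on_enough(props: Dict[str, float],
                     per_standard: float = 0.7,
                     min_standards: int = 2) -> bool:
    """Rule 3: at least `per_standard` on at least `min_standards` standards."""
    met = sum(p >= per_standard for p in props.values())
    return met >= min_standards

# Hypothetical student: proportion correct on standards A, B, C.
student = {"A": 1.0, "B": 0.75, "C": 0.5}
```

The same student can fail under the first two rules and pass under the third, which is why the qualitative choice of rule has to come before any cut-score is set.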

Page 7:

III. A way to make manifest student performance on the standards:

The Test.

• Therefore a Test T is constructed,

• Consisting, say, of n items for each of the N standards (i.e., nN items altogether), or some approximation to that.

– (Without loss of generality, assume that each item is scored 0 or 1.)

Page 8:

IV. A way to decide which performances are acceptable:

The setting of cut-scores or “Standard Setting”

• Students are scored on the Test, and their total scores range from 0 to nN.

• (Can calculate a mean, standard deviation, percentiles, etc.) 

• Thus, the standard “Standard Setting” problem:

• How to decide what score represents “enough” of subject Y.
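
The step from item scores to a pass/fail decision can be sketched as follows. This is an illustration only: the response data, the test size (N = 2 standards with n = 3 items each), and the cut-score of 4 are all made up.

```python
import statistics

# Hypothetical 0/1 item scores: 5 students, nN = 3 * 2 = 6 items.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 0, 1, 0, 1],
    [1, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [1, 1, 1, 0, 1, 1],
]

# Each total score lies between 0 and nN.
totals = [sum(row) for row in responses]

# Descriptive summaries of the score distribution.
mean_score = statistics.mean(totals)
sd_score = statistics.stdev(totals)

# A cut-score (hypothetical value) turns the score scale into a decision.
cut_score = 4
passed = [t >= cut_score for t in totals]
```

Nothing in the score distribution itself says where `cut_score` should sit; that is the standard-setting problem.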

Page 9:

In summary,

I. A way to define the outcome objectives: The “Standards”

II. A way to decide what is (qualitatively) “enough” of the standards

III. A way to make manifest student performance on the standards: The Test

IV. A way to decide which performances are acceptable: The setting of cut-scores or “Standard-Setting”

Page 10:

However,

• most “methods” of Standard Setting start at the end, at part IV.

• And proceed to develop and test technical solutions to just this problem,

 

• In my view, THIS is the real Standard Setting problem

Page 12:

Outline

• The logic of standard-setting

• Applied to two traditional standard-setting methods

– The Angoff Method

• An alternative: Unidimensional constructs

• An alternative: Multidimensional constructs

• Summary, Conclusions, etc.

Page 13:

The Angoff Method

• Concept: “... ‘borderline’ test-taker...one whose knowledge and skills are on the borderline between the upper group and the lower group.”

• “...the judge considers each question as a whole and makes a judgment of the probability that a borderline test-taker would answer the question correctly.”

• “...the passing score is computed from the expected scores for the individual items...”

– Livingston, S.A., and Zieky, M.J. (1982). Passing Scores: A Manual for Setting Standards of Performance on Educational and Occupational Tests. Princeton: Educational Testing Service.
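
The computation described in the quotations can be sketched as below. This is a minimal illustration, assuming the common variant in which each judge's item probabilities are summed to an expected score for the borderline test-taker and those expected scores are then averaged over judges; the ratings and the name `angoff_cut_score` are hypothetical.

```python
from typing import List

# ratings[j][i]: judge j's estimate of the probability that a borderline
# test-taker answers item i correctly (hypothetical values).
ratings = [
    [0.6, 0.8, 0.5, 0.9],
    [0.5, 0.7, 0.4, 0.8],
    [0.7, 0.9, 0.6, 1.0],
]

def angoff_cut_score(ratings: List[List[float]]) -> float:
    """Average, over judges, of each judge's expected total score."""
    # For 0/1 items, the expected score on an item equals its probability,
    # so a judge's expected total is the sum of that judge's probabilities.
    judge_expected = [sum(judge) for judge in ratings]
    return sum(judge_expected) / len(judge_expected)

cut = angoff_cut_score(ratings)  # on the raw-score scale, here 0..4
```

The averaging step is a design choice, not a necessity: a median or a trimmed mean over judges would give a different cut-score from the same judgments.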

Page 14:

Reminder about the logic above

I. A way to define the outcome objectives: The “Standards”

II. A way to decide what is (qualitatively) “enough” of the standards

III. A way to make manifest student performance on the standards: The Test

IV. A way to decide which performances are acceptable: The setting of cut-scores or “Standard Setting”

Page 15:

Critique of Angoff Method based on the logic above

I. A way to define the outcome objectives: The “Standards”

assumes that the standards have been developed

II. A way to decide what is (qualitatively) “enough” of the standards

assumes that this has been developed, and that the judges know it

III. A way to make manifest student performance on the standards: The Test

assumes that the test has been developed with this in mind

IV. A way to decide which performances are acceptable: The setting of cut-scores or “Standard Setting”

assumes that this has been developed, and that the judges know it

assumes that “taking the average” of the probabilities (expectations) is the best way to summarize the judgments.

Page 16:

Outline

• The logic of standard-setting

• Applied to two traditional standard-setting methods

– The Angoff Method

– The “Matrix” method

• An alternative: Unidimensional constructs

• An alternative: Multidimensional constructs

• Summary, Conclusions, etc.

Page 17:

Practical Context

• Mixed modes of assessment

– e.g., multiple choice (lots) + performance items (few)

• Need to map into criterion levels

– e.g., performance levels

• Need to maintain comparability of criterion levels across administrations:

– Standard-setting

– Standard-propagating

Page 18:

“Matrix” Method

• Based on judgment of teachers and other professionals involved in testing process

• Matrix of multiple choice by open-ended total scores mapped to performance levels

• Final matrix decided by committee consensus
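
Once a committee has agreed on such a matrix, applying it is a table lookup. The sketch below assumes the two total scores are first grouped into bands; the band edges, the matrix entries, and the five-level scale are hypothetical and only show the shape of the data structure.

```python
from bisect import bisect_right

# Hypothetical band edges (lower bounds of the 2nd, 3rd, ... bands).
mc_band_edges = [10, 20, 30]  # MC totals: 0-9, 10-19, 20-29, 30+
oe_band_edges = [3, 6]        # open-ended totals: 0-2, 3-5, 6+

# level_matrix[mc_band][oe_band] -> performance level (1 = lowest, 5 = highest).
# In the method described above, these entries are set by committee consensus.
level_matrix = [
    [1, 1, 2],
    [1, 2, 3],
    [2, 3, 4],
    [3, 4, 5],
]

def performance_level(mc_total: int, oe_total: int) -> int:
    """Look up the performance level for a pair of section totals."""
    mc_band = bisect_right(mc_band_edges, mc_total)
    oe_band = bisect_right(oe_band_edges, oe_total)
    return level_matrix[mc_band][oe_band]
```

For example, `performance_level(25, 4)` falls in the third multiple-choice band and the second open-ended band.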

Page 19:

Example of Performance Levels (High-school Algebra)

Page 20:

Example of Performance Levels (detail)

Page 21:

An example matrix

Page 22:

An example matrix

Page 23:

An example matrix (figure; regions labeled 1 to 5)

Page 24:

An example matrix

Page 25:

Critique of the “Matrix” Method based on the logic above

I. A way to define the outcome objectives: The “Standards”

assumes that the standards have been developed and related to the “performance levels,” but, in fact, this is not usually the case

II. A way to decide what is (qualitatively) “enough” of the standards

assumes that this has been developed, and that the judges know it

could be true if there was a relationship between the standards and the performance level

III. A way to make manifest student performance on the standards: The Test

assumes that the test has been developed with this in mind

could be true if there was a relationship between the standards and the performance level, and this had been used to develop the items

IV. A way to decide which performances are acceptable: The setting of cut-scores or “Standard Setting”

assumes that the judges can create it

Page 26:

Outline

• The logic of standard-setting

• Applied to two traditional standard-setting methods

• An alternative: Unidimensional constructs

– The BEAR Assessment System

• An alternative: Multidimensional constructs

• Summary, Conclusions, etc.

Page 27:

How the BEAR Assessment System helps here...

I. A way to define the outcome objectives: The “Standards”

II. A way to decide what is (qualitatively) “enough” of the standards

III. A way to make manifest student performance on the standards: The Test

IV. A way to decide which performances are acceptable: The setting of cut-scores or “Standard Setting”

Page 28:

The BEAR Assessment System: BAS

Page 29:

An Example: Assessing Data Modeling and Statistical Reasoning (ADM) project

PIs: Rich Lehrer & Leona Schauble (Vanderbilt U) & Mark Wilson (UC Berkeley)

-Developed an assessment system for a “Statistical Modeling” curriculum for middle school

-Multi-year, multidisciplinary collaborative of teachers, learning science and assessment experts

-Designed “a developmental perspective on learning” - learning progression with 7 relational construct maps

-Used “reformed curriculum” – conjecture-based whole-class discussions within instruction

-Embedded “new ideas about assessment” into everyday instruction

Page 30:

Conceptualization of measurement variables:

• CoS4 - Investigate and anticipate qualities of a sampling distribution.

• CoS3 - Consider statistics as measures of qualities of a sample distribution.

• CoS2 - Calculate statistics.

• CoS1 - Describe qualities of distribution informally.

A Sample Construct Map for: Conceptions of Statistics

Page 32:

detail view: CoS3

CoS3F Choose/Evaluate statistic by considering qualities of one or more samples.

CoS3E Predict the effect on a statistic of a change in the process generating the sample.

CoS3D Predict how a statistic is affected by changes in its components or otherwise demonstrate knowledge of relations among components.

CoS3C Generalize the use of a statistic beyond its original context of application or invention.

CoS3B Invent a sharable (replicable) measurement process to quantify a quality of the sample.

CoS3A Invent an idiosyncratic measurement process to quantify a quality of the sample based on tacit knowledge that others may not share.

Page 33:

Items Design: Open Assessment Prompt

Students received their final grades in Science today. In addition to giving each student their grade, the teacher also told the class about the overall class average.

Student   Final grades
Robyn     10
Jake      9
Calvin    6
Sasha     7
Mike      8
Lori      8

When the teacher finished grading Mina’s work and added her final grade into the overall class average, the overall class average stayed the same. What could Mina’s final grade have been? (Show your work).

48/6 = 8
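
The arithmetic behind the prompt can be written out as follows. The symbol x for Mina's grade is introduced here for illustration; the slide itself shows only the class average.

```latex
\bar{x}_{6} = \frac{10 + 9 + 6 + 7 + 8 + 8}{6} = \frac{48}{6} = 8,
\qquad
\frac{48 + x}{7} = 8 \;\Longrightarrow\; x = 56 - 48 = 8 .
```

Adding a value leaves a mean unchanged only if that value equals the mean, so a response that states this relation, rather than only recomputing the average, shows knowledge of how the statistic depends on its components.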

Page 34:

Where the “Standards” are...

CoS3F Choose/Evaluate statistic by considering qualities of one or more samples.

CoS3E Predict the effect on a statistic of a change in the process generating the sample.

CoS3D Predict how a statistic is affected by changes in its components or otherwise demonstrate knowledge of relations among components.

CoS3C Generalize the use of a statistic beyond its original context of application or invention.

CoS3B Invent a sharable (replicable) measurement process to quantify a quality of the sample.

CoS3A Invent an idiosyncratic measurement process to quantify a quality of the sample based on tacit knowledge that others may not share.

Page 35:

Measurement Model: Wright Map

Initial results for CoS ...

Page 36:
Page 37:

Outline

• The logic of standard-setting

• Applied to two traditional standard-setting methods

• An alternative: Unidimensional constructs

– The BEAR Assessment System

– Banding & Construct-Mapping

• An alternative: Multidimensional constructs

• Summary, Conclusions, etc.

Page 38:

How Banding & Construct-Mapping help here...

I. A way to define the outcome objectives: The “Standards”

II. A way to decide what is (qualitatively) “enough” of the standards

III. A way to make manifest student performance on the standards: The Test

IV. A way to decide which performances are acceptable: The setting of cut-scores or “Standard Setting”: Item-side

Page 39:
Page 40:
Page 41:

Outline

• The logic of standard-setting

• A few traditional standard-setting methods

• An alternative: Unidimensional constructs

– The BEAR Assessment System

– Banding & Construct-Mapping

• An alternative: Multidimensional constructs

• Summary, Conclusions, etc.

Page 42:

How the BEAR Assessment System helps here...

I. A way to define the outcome objectives: The “Standards”

II. A way to decide what is (qualitatively) “enough” of the standards

III. A way to make manifest student performance on the standards: The Test

IV. A way to decide which performances are acceptable: The setting of cut-scores or “Standard Setting”: Student-side

Page 43:

Construct-Mapping

Aim: to give judges information that will help them balance different aspects of the test

– Could be sub-components of content, or item-types

– E.g. (from Algebra example), MC items and open-ended items

Technique: Wright map relating score levels to item locations to indicate what the response vector tells us about what a student knows and can do

Page 44:

The Wright Map

• The item types are scaled together to estimate the best-fitting composite, according to pre-determined item-weights (substantive decision)

• The calibration is then used to create a map of all item/level locations

• The judging committee, through a consensus-building process, chooses cut points between performance levels on the map
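
Once the committee has placed cut points on the map, assigning a student to a performance level is a matter of locating the student's estimated proficiency among them. A minimal sketch; the logit values, the five-level scale, and the rule that a student exactly at a cut point goes to the higher level are assumptions for illustration.

```python
from bisect import bisect_right
from typing import List

# Hypothetical committee cut points on the logit scale, in increasing order.
# They separate levels 1|2, 2|3, 3|4 and 4|5.
cut_points: List[float] = [-1.0, 0.0, 1.2, 2.0]

def level_for(theta: float) -> int:
    """Performance level (1..5) for a proficiency estimate `theta`.

    A student located exactly on a cut point is placed in the higher level.
    """
    return bisect_right(cut_points, theta) + 1
```

Because the cut points live on the same scale as the item locations, each level can be described by the items (and item score levels) that fall within its band.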

Page 45:

Software tool

• Dynamic display of item map

• Displays, for any chosen proficiency level

– probability of passing all multiple-choice items

– probability of attaining every level on written response items

– expected total score on multiple-choice section; expected score on each written response item

• Allows choice of weights for item types
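
The quantities listed above can be computed from item parameters once a measurement model is fixed. The talk does not say which model the tool uses; the sketch below assumes the dichotomous Rasch model for multiple-choice items and the partial credit model for written-response items, reads "all" items as "each" item, and uses made-up item parameters and function names.

```python
import math
from typing import List

def p_correct(theta: float, delta: float) -> float:
    """Rasch model: probability of a correct answer to an item of difficulty delta."""
    return 1.0 / (1.0 + math.exp(-(theta - delta)))

def pcm_probs(theta: float, steps: List[float]) -> List[float]:
    """Partial credit model: probabilities of scores 0..len(steps)."""
    # The exponent for score k is the sum of (theta - step_j) for j = 1..k.
    cumulative = [0.0]
    for step in steps:
        cumulative.append(cumulative[-1] + (theta - step))
    weights = [math.exp(c) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]

def expected_mc_total(theta: float, difficulties: List[float]) -> float:
    """Expected total score on the multiple-choice section."""
    return sum(p_correct(theta, d) for d in difficulties)

def expected_wr_score(theta: float, steps: List[float]) -> float:
    """Expected score on one written-response item."""
    return sum(k * p for k, p in enumerate(pcm_probs(theta, steps)))

def p_at_least(theta: float, steps: List[float]) -> List[float]:
    """Probability of reaching at least score k, for k = 1..len(steps)."""
    probs = pcm_probs(theta, steps)
    return [sum(probs[k:]) for k in range(1, len(probs))]

# Hypothetical item parameters (logits) and a chosen proficiency level.
mc_difficulties = [-1.0, 0.0, 1.0]
wr_steps = [-0.5, 0.5]
theta = 0.0
```

Evaluating these functions across a range of `theta` values gives judges a concrete picture of what a student at a candidate cut point is expected to do on each part of the test.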

Page 46:
Page 47:

Final Standards Map

Page 48:
Page 49:
Page 50:

Example of resulting "mapping matrix"

Page 51:

Outline

• The logic of standard-setting

• A few traditional standard-setting methods

• An alternative: Unidimensional constructs

• An alternative: Multidimensional constructs

• Summary, Conclusions, etc.

Page 52:
Page 53:

Learning progressions

• Learning progressions are descriptions of the successively more sophisticated ways of thinking about an important domain of knowledge and practice that can follow one another as children learn about and investigate a topic over a broad span of time. They are crucially dependent on instructional practices if they are to occur. (CCII, 2009)

– Aka learning trajectories, progressions of developmental competence, and profile strands

• More than one path leads to competence

• Need to engage in curriculum debate about which learning progressions are most important

– Try and choose them so that we end up with fewer standards per grade level

Page 54:

Image of a Learning Progression: Curriculum version

Page 55:

One Possible Relationship: the levels of the learning progression are levels of several construct maps

Page 56:

Another possible relationship: the levels are staggered

Page 57:

Making a summative construct map

Page 58:

Making a composite scale for the summative construct

A combined reflective/formative model.

Page 59:

Then the standard-setting scale becomes ...

a derived measure based on the sampling design of the levels across the constructs

Reliability can be controlled by lighter/heavier sampling of items

A “construct map” can be developed in a post-hoc way (similar to the PISA “defined variable”)

Can be combined with other derived measures from other age or grade-appropriate Learning Progressions.
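
The point about controlling reliability through lighter or heavier item sampling can be illustrated with the classical Spearman-Brown formula. This is an assumption brought in for illustration, not a method named in the talk; it holds only if the added or removed items are parallel to the existing ones, and the numbers are hypothetical.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when test length is multiplied by `length_factor`.

    Classical test theory result, assuming the added (or removed) items are
    parallel to the existing ones.
    """
    return (length_factor * reliability) / (1.0 + (length_factor - 1.0) * reliability)

# Hypothetical example: a construct sampled with reliability 0.60.
doubled = spearman_brown(0.60, 2.0)  # heavier sampling: twice as many items
halved = spearman_brown(0.60, 0.5)   # lighter sampling: half as many items
```

Under these assumptions, doubling the items sampled for a construct raises a reliability of 0.60 to 0.75, while halving them lowers it to roughly 0.43.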

Page 60:

Outline

• The logic of standard-setting

• A few traditional standard-setting methods

• An alternative: Unidimensional constructs

• An alternative: Multidimensional constructs

• Summary, Conclusions, etc.

Page 61:

Summary, Conclusions, etc.

• Standard-setting must be seen as more than a mere “technical exercise”

• It involves much prior work, both substantive and technical, including

– (a) How to develop standards that are “ready” for standard-setting

– (b) How to develop items that support that

– (c) How to decide which student performances on the test are “enough”

• and requires an overarching framework of all of that to be coherent

Page 62:

Presentation has offered ...

• Sample of how traditional standard setting methods fall short from this perspective

• A suggestion for one approach that does attempt to address this issue

• It’s a complex problem, and hence one should not expect an easy solution

• Simpler for single-dimension constructs, more complex for higher-dimension constructs.

Page 63:

Further Issues

• Items

– Consistency over years

– Large “committee effect” of specifics of “rich” items

• Factors that affect committee judgments

– difficulty of items in a particular year

– committee leadership

– committee membership

• Effects on teaching, policy ...

Page 64:

Teaching/Learning

• Policy-makers, administrators, etc. need the results of standard-setting for large-scale tests.

• In the main, teachers do not!!!

• They need good formative assessments, and the positive effects of good formative assessment are well documented (e.g., Black & Wiliam meta-analysis)

• Thus, a major requirement of standardized tests and standard-setting methods is that they not damage classroom instruction.

Page 65:

Teaching/Learning

• The approach described above has the virtue that it bases good large-scale test construction (i.e., for standard-setting) on good formative assessment.

Page 66:

Thank you.

[email protected]

• http://bearcenter.berkeley.edu

Page 67:

Footnote on I.

I. A way to define the outcome objectives: The “Standards”

assumes that the standards have been developed and related to the “performance levels,” but, in fact, this is not the case

If there are no standards, only the performance levels, then this critique may be empty. (E.g., language testing)

Page 68:

A further issue ...

-Sometimes the best choice for the relative weight between the item modes is not clear

-This affects the slope of “diagonal sections” in the previous slide

-Can adapt software to allow the committee to also decide which weight to choose

-We have not found that committees are good at this.
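
Why the weight changes the slope can be seen in a simplified picture. The sketch assumes a linear composite of the two raw section totals, which is a simplification: in the approach described in the talk the weights enter through the scaling. The weights, the cut value, and the function names are hypothetical.

```python
def boundary_slope(w_mc: float, w_oe: float) -> float:
    """Slope of a level boundary in the (MC total, open-ended total) plane.

    With composite = w_mc * MC + w_oe * OE, the boundary composite == cut is
    the line OE = (cut - w_mc * MC) / w_oe, whose slope is -w_mc / w_oe.
    """
    return -w_mc / w_oe

def oe_needed(mc_total: float, cut: float, w_mc: float, w_oe: float) -> float:
    """Open-ended total needed to reach `cut`, given a multiple-choice total."""
    return (cut - w_mc * mc_total) / w_oe

# Hypothetical weights: each open-ended point counts twice as much as an MC point.
slope = boundary_slope(1.0, 2.0)
needed = oe_needed(20, 30.0, 1.0, 2.0)
```

Raising the open-ended weight flattens the boundary, so fewer open-ended points are needed to offset a given shortfall in multiple-choice points.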

