
Creating quality multiple choice exams: Planning, creating, & improving

Cheryl Poth
[email protected]
Katya Chudnovsky

Centre for Research in Applied Measurement & Evaluation
Department of Educational Psychology


To what extent might these scenarios be familiar to you (as a student or instructor)?

What issues do these scenarios represent?

28 August 2012 Creating quality multiple choice exams 2


Among the issues are:

Unfamiliar vocabulary

Specific cultural references

Linking to previous experiences

Reading over-emphasized

Others?


What our students are saying guides this workshop

“Your exams cover what we talked about in class.”
◦ Focus on assessing what you taught
What assessment considerations must be embedded into the planning of an exam?

“Your exams make me think harder than I have before.”
◦ Focus on increasing quality of items
What principles guide the creation/selection of quality multiple choice items?

“I feel your exams are fair.”
◦ Focus on enhancing use of item analysis
How can the exam be improved for the future?


What guides our exam planning?

PLANNING

Content

Balance

Big ideas

Item type


What guides our exam planning?

A. What are we assessing?

◦ What are the big ideas related to knowledge and skills that we have taught?

◦ These represent the learner outcomes: what students would be expected to know/be able to do after the course is completed


What guides our exam planning?

B. What can be assessed by a multiple-choice (MC) exam?

What are some examples from the big idea activity that could be assessed using multiple choice?

◦ What might be other item types that can be used?

Ranking, matching, completion, short answer, essay, performance assessment

What are some examples from the big idea activity that would be better assessed using one of the other types?


What guides our exam planning?

A. Does the exam balance the content it is intended to assess?

◦ Do you have a table of test specifications?

What is it?

A visual representation of the items in terms of both the content to be learned and the level of cognition expected of the students.

When to construct?

Ideally, prior to the beginning of instruction, and certainly before you finalize the exam


Blueprint for an EDPY 303 Examination

Selected response = 52 items; constructed response = 3 items

Taxonomy levels tested (columns): Selected Response (Remembering/Understanding; Applying and Above) and Constructed Response (Applying and Above)

Content areas tested (rows), with the number of items at each taxonomy level for each content area:

Topics                                                Item counts
Formative and Summative Assessments                   5  2
Curriculum, instruction, and assessment alignment     4  2
Taxonomy Levels                                       1  8
Fair Assessment                                       1  1
Reliability and Validity                              2  5
Selected Response                                     5  8
Developing Pencil & Paper Tests                       2  2
Methods of Scoring & Interpreting                     2
Assessment Audience                                   1  1
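A blueprint like this can also be kept as data and compared against the items drafted so far. The following is a minimal Python sketch, not part of the original workshop. It borrows three rows from the blueprint and assumes that the first count in each row is Remembering/Understanding and the second is Applying and Above. The function name `blueprint_gaps` and the drafted-item list are hypothetical.

```python
from collections import Counter

RU = "Remembering/Understanding"
AA = "Applying and Above"

# Planned item counts per (content area, taxonomy level).
# Assumption: first count in each blueprint row is RU, second is AA.
blueprint = {
    ("Taxonomy Levels", RU): 1,
    ("Taxonomy Levels", AA): 8,
    ("Fair Assessment", RU): 1,
    ("Fair Assessment", AA): 1,
    ("Reliability and Validity", RU): 2,
    ("Reliability and Validity", AA): 5,
}


def blueprint_gaps(planned, drafted_items):
    """Return {cell: planned - drafted} for every cell whose counts differ.

    A positive value means items are still missing for that cell;
    a negative value means more items were drafted than planned.
    """
    drafted = Counter(drafted_items)
    cells = set(planned) | set(drafted)
    return {
        cell: planned.get(cell, 0) - drafted.get(cell, 0)
        for cell in cells
        if planned.get(cell, 0) != drafted.get(cell, 0)
    }


# Hypothetical tags for the items written so far, one tuple per item.
drafted_items = (
    [("Taxonomy Levels", RU)] * 1
    + [("Taxonomy Levels", AA)] * 6
    + [("Fair Assessment", RU)] * 1
    + [("Fair Assessment", AA)] * 1
    + [("Reliability and Validity", RU)] * 2
    + [("Reliability and Validity", AA)] * 5
)

gaps = blueprint_gaps(blueprint, drafted_items)
print(gaps)
```

With these hypothetical tags, the only gap reported is two missing Applying and Above items for Taxonomy Levels.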


What guides our creation/selection of quality items?

Variety of cognitive

levels

High quality items

Creation/

Selection

Resources


What guides our creation/selection

of quality items?

A. Does the exam assess the depth of skills

intended to assess?

◦ Is there variety in the item level of cognition?

Introducing Bloom’s Taxonomy!

What level of Bloom’s taxonomy does each of these instructional objectives require?

Students will:

1. Recall the basic purposes for commercial advertising

2. Describe the basic techniques advertisers use to sell products to consumers

3. Observe a series of television commercials and identify in each one the selling technique(s) employed

4. Differentiate the observed television commercials on the basis of their effectiveness in promoting a product/service

5. Script and perform a commercial designed to sell a product of their choice

6. Judge the effectiveness of the commercials created by their peers using a class-generated set of criteria

A. Creating

B. Evaluating

C. Analyzing

D. Applying

E. Understanding

F. Remembering


Test-wiseness debrief

1. The purpose of the cluss in furmpaling is to remove

A. cluss-prags

B. tremalis

C. cloughs

D. plumots

2. Trassig is true when

A. lusp trasses the vom

B. the viskal flans, if the viskal is donwil or zortil

C. the belgo frulls

D. dissles lisk easily

3. The sigla frequently overfesks the trelsum because

A. all siglas are mellious

B. siglas are always votial

C. the trelsum is usually tarious

D. no trelsa are feskable


Test your test-wiseness!

4. The fribbled breg will minter best with an

A. Derst

B. Morst

C. Sorter

D. Ignu

5. Among the reasons for tristal doss are

A. The sabs foped and the foths tinzed

B. The kredges roted with orots

C. Few rakobs were accepted in sluth

D. Most of the polats were thonced

6. The mintering function of the ignu is most effectively carried out with

A. Raxma tol

B. The groshing stantol

C. The fribbled breg

D. A frally sush

What guides our creation/selection

of quality items?

B. Do the items meet the criteria for being

fair?

◦ Are the items considered high quality?

◦ Why is high quality important?


What are the Parts of a MC Item?


Calculus was independently developed by

Newton and

A. Barrow

B. Kepler

C. Leibniz

D. Pascal

These are the “distracters”,

(alternatives that are incorrect).

This is the “keyed response” or correct choice.

These are the “alternatives”, “choices”, or “options”.

This is the stem.


Guidelines for a High-Quality MC Stem

Focused wording: the item can be answered to some

extent without looking at the alternatives

Question or statement form

Key words highlighted

Try to Avoid:

Double-barreled stem (more than one idea)

Verbal and grammar cues to the answer

Content bias (e.g. references to pop-culture, gender bias)


Guidelines for High-Quality MC Distractors

Use common student errors

Use language appropriate to the students

Alternatives should all be homogeneous (length and complexity)

Try to Avoid:

The 3:1 split: one of the alternatives stands out “like a sore thumb”

A multiple-choice item that is actually a “true-false” item because nearly all students can eliminate two distracters

Unjustified use of “all/none of the above”


If you want to increase the cognitive level of your MC item, consider using a Source-Based MC Item

[Figure: line graph "Level of Achievement Over Time". x-axis: Time (Oct, Dec, Feb, Apr, June); y-axis: Level of Achievement (0 to 90); one line each for Pam, Roger, Gwen, and Bob.]

This is the novel, yet familiar introductory (source) material.

Which of the following students deserve an A as their final grade?

A. Bob
B. Bob, Gwen and Roger
C. Bob, Gwen, Roger and Pam

This is one of a series of selected-response items that relates to the introductory material.


Guidelines for High-Quality Source-based MC Introductory Material

Relates to the content covered, but is new to the students

Is brief and easy to read (appropriate level)

Stimulates thought about an issue or topic

Comes from a credible source and can be used (copyright!)

Instructors can create their own introductory materials

Follows the guidelines for MC item stems

Try to Avoid:

Irrelevant source material

Testing knowledge directly cited in the source material

Testing trivia or memorized facts

Extraneous cues


Improving Multiple-Choice Items


A table of test specifications

A. provides a more balanced sampling of content

B. specifies the method of scoring to be used on a test

C. indicates how a test will be used to improve learning

D. arranges the instructional objectives in order of their importance

Rewrite:

The main advantage of using a table of test specifications when preparing an

achievement test is that it

A. improves the sampling of content

B. increases the objectivity of the test

C. reduces the amount of time required

D. makes the construction of test items easier

Guideline: present a single clearly formulated problem in the stem and emphasize the key words


How about this one?


The paucity of plausible, but incorrect, statements that can be related to a central

idea poses a problem when constructing which one of the following types of test

items?

A. essay items

B. true-false items

C. completion items

D. multiple-choice items

Rewrite:

A lack of plausible yet incorrect alternatives causes the greatest difficulty when

constructing

A. essay items

B. true-false items

C. completion items

D. multiple-choice items

Guideline: write the stem of the item in simple, clear language


What guides our creation/selection of quality items?

C. What resources are available as a starting

point?

◦ Do we have test bank items or old exams to

adapt?

◦ What might be some of the adaptations you

make?

Increase the level of cognition

Change item type

Reword to focus it on the “big idea”


What guides our use of item analysis?

Exam reliability

Item Difficulty

Item Discrimination


Balancing Item Difficulties

on an Exam


Which is the Best Eye Chart?

[Figure: two eye charts, each with rows of letters that grow longer from top to bottom; the first chart opens with a single E, the second with the row E F P.]


[Figure: an eye chart whose letters are scattered across the chart.]

An Exam Should Have a Balance of Item Difficulties


What guides our use of item analysis?

A. Is the exam reliable?

◦ If machine-scored, check KR-20

◦ If hand-scored, look for patterns

◦ What does this tell you?

◦ What is considered high reliability vs. low reliability?
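For readers who want to see what a scoring report's KR-20 summarizes, here is a minimal Python sketch of the Kuder-Richardson Formula 20, KR-20 = (k / (k - 1)) * (1 - Σ p·q / variance of total scores), where k is the number of items, p is the proportion answering an item correctly, and q = 1 - p. The response matrix is hypothetical. Using the population variance of the total scores is an assumption, since scoring software may use the sample variance instead.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson 20 for rows of 0/1 item scores (one row per student)."""
    n_students = len(responses)
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("KR-20 needs at least two items")
    # Proportion of students answering each item correctly.
    p = [sum(row[i] for row in responses) / n_students for i in range(n_items)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    # Assumption: population variance of the students' total scores.
    total_var = pvariance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("all students have the same total score")
    return (n_items / (n_items - 1)) * (1 - sum_pq / total_var)


# Hypothetical responses: 5 students x 4 items (1 = correct, 0 = incorrect).
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(scores), 3))
```

Values closer to 1 indicate more consistent scores. What counts as high or low depends on the standard you adopt.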


The Anatomy of Question Analysis

ITEM 42: DIF= .366, RPB= .430, CRPB= .366 (95% CON= .255, .467)
RBIS= .551, CRBIS= .469, IRI= .207
GROUP N NR NF O 1 2 3 4*
TOTAL 257 0 0 0 .163 .280 .191 .366
HIGH 61 0 .049 .148 .131 .672
MID 129 0 .194 .240 .209 .357
LOW 67 0 .209 .478 .209 .104
TEST SCORE MEANS 28.952 27.264 29.735 35.138
DISCRIMINATING POWER -.160 -.330 -.078 .568
STANDARD ERROR OF D.P. .011 .018 .013 .019

Question 42

This question meets all major standards.

DIF: The correct answer was selected by 36.6% of all students.

CRPB: This statistic indicates the discriminating power of the question. The higher the number, the better the question discriminates between high- and low-achieving students. The minimum acceptable value in our branch standard is .200.

The question was answered by 257 students, divided into high, mid, and low groups based on the scores they achieved on the entire multiple-choice test.

These statistics indicate the proportion of students among the entire group who chose each answer (1 = Choice “A” = 16.3%; 2 = Choice “B” = 28.0%; 3 = Choice “C” = 19.1%; 4 = Choice “D” = 36.6%).

These figures indicate the percentage of students within each of the three sub-groups who chose each alternative.

Test Score Means: These statistics indicate the average score for the group of students that selected each alternative. For example, the average score for students who selected choice A was 28.952 (in this case) out of a total of 50. The average score of the group selecting the keyed response should always be higher than the average score for groups selecting each of the distracters.

* indicates keyed response
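In the Item 42 printout, each DISCRIMINATING POWER value appears to equal the HIGH-group proportion minus the LOW-group proportion for that alternative (for the keyed response, .672 - .104 = .568). The Python sketch below works under that reading, which is inferred from the numbers rather than stated on the slide. The function names are hypothetical.

```python
# Proportions choosing each alternative, copied from the ITEM 42 printout.
high = {"1": 0.049, "2": 0.148, "3": 0.131, "4": 0.672}
low = {"1": 0.209, "2": 0.478, "3": 0.209, "4": 0.104}
keyed = "4"  # the alternative marked with * in the printout


def discriminating_power(high_props, low_props):
    """High-group proportion minus low-group proportion, per alternative."""
    return {alt: round(high_props[alt] - low_props[alt], 3) for alt in high_props}


def alternatives_to_review(dp, keyed_alt):
    """Flag a keyed response that fails to favour the high group, and any
    distracter that attracts the high group at least as much as the low group."""
    flagged = []
    for alt, value in dp.items():
        if alt == keyed_alt and value <= 0:
            flagged.append(alt)
        elif alt != keyed_alt and value >= 0:
            flagged.append(alt)
    return flagged


dp = discriminating_power(high, low)
print(dp)
print(alternatives_to_review(dp, keyed))
```

For Item 42 the computed values match the printed row (-.160, -.330, -.078, .568) and no alternative is flagged.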


What guides our use of item analysis?

B. How difficult is the item?

◦ If machine-scored, check DIF

◦ What does this tell you?

◦ What is considered difficult, moderate and easy?


What guides our use of item analysis?

C. Is the item positively discriminating?

◦ If machine-scored, check RBIS & CRPB

◦ What does this tell you?

◦ What is considered appropriate?

What should you do if it’s negative?


14. Elections are often held in non-democratic countries primarily as a means of

A. reinforcing the perceived legitimacy of the régime in power
B. providing an opportunity for citizens to effect political change
C. meeting the legal requirements imposed by legislated constitutions
D. providing the elite with an insight into popular attitudes and beliefs

ITEM 14: DIF= .628, RPB= .448, CRPB= .391 (95% CON= .286, .486)
RBIS= .572, CRBIS= .499, IRI= .216
GROUP N NR NF O 1* 2 3 4
TOTAL 277 1 0 1 .628 .108 .130 .130
HIGH 74 0 .905 .014 .000 .081
MID 124 1 .629 .105 .161 .097
LOW 79 0 .367 .203 .203 .228
TEST SCORE MEANS 34.201 25.700 25.639 27.556
DISCRIMINATING POWER .538 -.189 -.203 -.147
STANDARD ERROR OF D.P. .021 .009 .008 .010

Question 14

This item meets all major standards.

Moderate Difficulty

Very Effective Distracters

This question also discriminates very well. Note that although 20.3% of the lowest-achieving group selected alternative C (3), none of the highest-achieving group chose that same alternative.


What to look for in your item analysis

Difficulty             Discrimination      Analysis
Greater than .90       Any value           Easy item: desirable to have some of these on the assessment
Between .50 and .90    Greater than .20    Moderate difficulty and highly discriminating: typical of quality items
Between .50 and .90    Less than .20       Moderate difficulty, non-discriminating *probably needs adjustment
Less than .50          Greater than .20    Tough question, highly discriminating *fair to have some of these
Less than .50          Less than .20       Tough question, does NOT discriminate *toss this one out.
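The table can be applied mechanically to a list of item statistics. The Python sketch below is one way to do that. The function name `classify_item` is hypothetical, and the treatment of values that fall exactly on a boundary (.50, .90, .20) is an assumption, since the table does not say where they belong.

```python
def classify_item(difficulty, discrimination):
    """Map an item's difficulty (proportion correct) and discrimination
    index to the matching row of the item-analysis table."""
    if difficulty > 0.90:
        return "easy item"
    # Assumption: exactly .50 and exactly .90 count as moderate difficulty.
    band = "moderate difficulty" if difficulty >= 0.50 else "tough question"
    # Assumption: exactly .20 counts as discriminating.
    if discrimination >= 0.20:
        return band + ", highly discriminating"
    return band + ", non-discriminating"


# DIF and CRPB values copied from the Item 14 and Item 42 printouts.
print(classify_item(0.628, 0.391))  # Item 14
print(classify_item(0.366, 0.366))  # Item 42
```

Item 14 lands in the "moderate difficulty, highly discriminating" row, which agrees with the "Moderate Difficulty" label on its slide.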


Item Analysis Activity A

What can you read from this analysis?

What to do if you don’t machine score?

Estimate Item Properties by Hand

Item Difficulty

ρ = (# of correct answers) / (# of people taking the test)

Item Discrimination

1. Rank-order examinees and select an equal number of them from the highest- and lowest-scoring groups
2. Calculate how many in each group got the item right
3. Use the following formula:

D = (# correct (up) - # correct (low)) / (# of students in either group)
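The hand calculation above can be sketched in a few lines of Python. The class data are hypothetical. The size of the upper and lower groups is left to the caller because the slide only asks for an equal number from each end. Ties in total score are kept in their input order, which is a simplification.

```python
def item_difficulty(item_correct):
    """Proportion of examinees who answered the item correctly."""
    return sum(item_correct) / len(item_correct)


def item_discrimination(records, group_size):
    """D = (# correct in upper group - # correct in lower group) / group size.

    records: list of (total_test_score, item_correct) pairs, item_correct is 1 or 0.
    """
    if group_size < 1 or 2 * group_size > len(records):
        raise ValueError("group_size must allow two non-overlapping groups")
    # Step 1: rank-order examinees by total score, highest first.
    ranked = sorted(records, key=lambda r: r[0], reverse=True)
    upper, lower = ranked[:group_size], ranked[-group_size:]
    # Step 2: count how many in each group got the item right.
    correct_up = sum(correct for _, correct in upper)
    correct_low = sum(correct for _, correct in lower)
    # Step 3: apply the formula.
    return (correct_up - correct_low) / group_size


# Hypothetical class of 10: (total score, 1 if this item was answered correctly).
records = [(48, 1), (45, 1), (44, 1), (40, 0), (38, 1),
           (35, 1), (30, 0), (28, 0), (25, 1), (20, 0)]

difficulty = item_difficulty([correct for _, correct in records])
d_index = item_discrimination(records, group_size=3)
print(difficulty, round(d_index, 3))
```

Here 6 of 10 students answered correctly, so the difficulty is .60. The top three all got the item right and one of the bottom three did, so D = (3 - 1) / 3.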


Why is any of this important?

Does the exam (or items) tell us whether students have learned what we intended for them to learn?

Reliability of Assessment
• Consistency of scores

Validity of Interpretation
• Accuracy
• Extent to which an assessment method measures what we intend it to measure

http://www.socialresearchmethods.net/kb/relandval.php


References & Time for Questions

Bloom’s taxonomy:

http://www.odu.edu/educ/roverbau/Bloom/blooms_taxonomy.htm

Fairness:

Principles for Fair Assessment Practices for Education in Canada. (1993). Edmonton, Alberta: Joint Advisory Committee. Retrieved June 12, 2009 from http://www.education.ualberta.ca/educ/psych/crame/files/eng_prin.pdf

Item Construction:

Gronlund, N. E. (2005). Assessment of Student Achievement (7th ed.). Third custom edition for the University of Alberta.

Designing exams:

Felder, R. M. (2002). Designing Tests to Maximize Learning. J. Prof. Issues in Engr. Education and Practice, 128(1), 1-3. http://www.ncsu.edu/felder-public/Papers/TestingTips.htm
