
Creating quality multiple choice exams:

Planning, developing, & improving

Cheryl Poth, PhD [email protected]

Btissam El Hassar, MPP Centre for Research in Applied Measurement & Evaluation

Department of Educational Psychology

Overview of the exam design process

a. Planning
b. Developing (administering, scoring)
c. Improving

January 2015 Creating quality multiple choice exams 2

Among the issues for exam design:

- Assessment-method level
  - Length and time allocated
  - Types of question formats
- Item level
  - Unfamiliar vocabulary
  - Specific cultural references
  - Linking to previous experiences
  - Reading over-emphasized
- Others?

January 2015 Creating quality multiple choice exams 3

What guides our exam planning?

A. Big ideas
B. Item type
C. Exam emphasis

January 2015 Creating quality multiple choice exams 4


What guides our exam planning?

A. What are we assessing?
- What are the big ideas related to the knowledge and skills we have taught?
- These represent the learner outcomes: what students would be expected to know and be able to do after the course is completed.

January 2015 Creating quality multiple choice exams 5

What guides our exam planning?

B. What can be assessed by a multiple-choice (MC) exam?
- What are some examples from the big-idea activity that could be assessed using multiple choice?
- What other item types could be used? Ranking, matching, completion, short answer, long answer, performance assessment.
- What are some examples from the big-idea activity that would be better assessed using one of the other types?
- What can guide our choice of item focus?

January 2015 Creating quality multiple choice exams 6

Bloom's Taxonomy

- Dr. Benjamin Bloom, creator of a Taxonomy of Educational Objectives
- Three domains:
  - Cognitive (mental skills)*
  - Affective (attitudes and emotions)
  - Psychomotor (physical skills)

Creating quality multiple choice exams 7 January 2015
Creating quality multiple choice exams 8 January 2015


How well does each assessment method measure a cognitive level?

[Grid comparing assessment methods (multiple choice, short/long answer, performance assessment) against the cognitive levels of Bloom's taxonomy (Remembering, Understanding, Applying, Analyzing, Evaluating, Creating).]

Creating quality multiple choice exams 9 January 2015

What guides our exam planning?

C. What does the exam emphasize?
- Do you have a table of test specifications?
  - What is it? A visual representation of the items in terms of both the content to be learned and the level of cognition expected of the students.
  - When to construct it? Ideally, prior to the beginning of instruction, but certainly before you finalize the exam.

January 2015 Creating quality multiple choice exams 10

EDPY 303 Midterm Exam Blueprint – Fall 2014 (Selected Response = 60 items)

Topics | Remembering/Understanding | Applying and Above | Total
Assessment Audiences | 1 | 1 | 2
Curriculum, Instruction, and Assessment Alignment | 3 | 7 | 10
Bloom's Taxonomy | 3 | 8 | 11
Fair Assessment | 3 | 4 | 7
Reliability and Validity | 3 | 3 | 6
Selected-Response Items | 5 | 10 | 15
Completion Items | 0 | 2 | 2
Gathering Evidence of Learning | 4 | 3 | 7
Total | 22 | 38 | 60

Rows are the content areas tested, columns are the taxonomy levels tested, and each cell gives the number of items at that taxonomy level for that content area.

Creating quality multiple choice exams 11 January 2015
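As an illustration (not part of the original slides), here is a minimal sketch of how a blueprint like the one above could be represented and checked in code. The topic names and counts copy three rows of the EDPY 303 table; the data structure and the intended total are assumptions made for the example.

# Hypothetical sketch: a blueprint as {topic: {cognitive level: item count}},
# plus a check that the planned counts add up to the intended number of items.
blueprint = {
    "Bloom's Taxonomy":        {"Remembering/Understanding": 3, "Applying and Above": 8},
    "Fair Assessment":         {"Remembering/Understanding": 3, "Applying and Above": 4},
    "Selected-Response Items": {"Remembering/Understanding": 5, "Applying and Above": 10},
}
intended_items = 33  # assumed target for these three topics

by_level: dict[str, int] = {}
for topic, levels in blueprint.items():
    for level, count in levels.items():
        by_level[level] = by_level.get(level, 0) + count

print(by_level)                                  # items planned per cognitive level
print(sum(by_level.values()) == intended_items)  # does the plan match the target length?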

Course Content × Level of Thinking (Bloom's Taxonomy)

Topic # | Items (spanning Remembering, Understanding, Applying, Analyzing) | Total %
1 | 1-5, 6, 7, 10, 13, 14 | 20
2 | 8, 9, 11, 12, 15-28, 31, 32 | 40
3 | 29, 30, 33-38 | 16
4 | 39, 41-46 | 14
6 | 40, 47-50 | 10
Total | | 100%

Totals by level of thinking: Remembering 20%, Understanding 40%, Applying 30%, Analyzing 10%

True/False, Matching, Completion, Multiple-choice?
Restricted or Extended Response?

Creating quality multiple choice exams 12 January 2015


Take home message for planning

- What and how you teach needs to be reflected in the assessments and linked with the intended big ideas.
- Bloom's taxonomy is helpful for thinking about what you want students to know and be able to do.
- An exam blueprint (table of test specifications) is a useful tool for making the emphasis on content and level of cognition explicit.
- Questions:
  - How do you plan for your exams? Would Bloom's taxonomy help improve your exam planning?
  - Would an exam blueprint help improve your exam planning and help your students have clear expectations?

January 2015 Creating quality multiple choice exams 13

What guides our development/selection of high-quality items?

A. Variety of cognitive levels
B. Criteria for technically sound items
C. What resources are available

January 2015 Creating quality multiple choice exams 14

What guides our creation/selection of quality items?

A. Does the exam assess the depth of skills it is intended to assess?
- Is there variety in the items' level of cognition?

January 2015 Creating quality multiple choice exams 15

What level of Bloom's taxonomy does each of these instructional objectives require?

Students will:

1.  Recall the basic purposes for commercial advertising

2.  Describe the basic techniques advertisers use to sell products to consumers

3.  Observe a series of television commercials and identify in each one the selling technique(s) employed

4.  Differentiate the observed television commercials on the basis of their effectiveness in promoting a product/service

5.  Script and perform a commercial designed to sell a product of their choice

6.  Judge the effectiveness of the commercials created by their peers using a class-generated set of criteria

A.  Creating

B.  Evaluating

C.  Analyzing

D.  Applying

E.  Understanding

F.  Remembering

January 2015 Creating quality multiple choice exams 16


What guides our development/selection of technically sound items?

B. Do the items meet the criteria for being fair?
- Are the items considered equitable? Why is this important? What might items that are not look like?
- Are the items considered technically sound? Why is this important? What might items that are not look like?

January 2015 Creating quality multiple choice exams 17

An item is equitable when it is…

✓ free from racial, ethnic and sexual bias
✓ free of irrelevant material
✓ stated in appropriate and clear language
✓ free from pop-culture references that would not be familiar to all students

January 2015 Creating quality multiple choice exams 18

An item is technically sound when…

✓ the item is free of verbal clues to the answer
✓ the stem is focused on a single, meaningful problem
✓ key words in the stem are emphasized as needed
✓ distracters are all plausible
✓ alternatives are homogeneous and parallel in structure
✓ alternatives are in some order that is logical and easily understood
✓ 'all of the above' and 'none of the above' are used legitimately
✓ the item has correct spelling and grammar

January 2015 Creating quality multiple choice exams 19

What are the Parts of a MC Item?

Calculus was independently developed by Newton and
A. Barrow
B. Kepler
C. Leibniz
D. Pascal

- The stem: the question or incomplete statement ("Calculus was independently developed by Newton and").
- The "alternatives", "choices", or "options": the full set of answers A-D.
- The "distracters": the alternatives that are incorrect (Barrow, Kepler, Pascal).
- The "keyed response": the correct choice (Leibniz).

January 2015 Creating quality multiple choice exams 20
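To make the terminology concrete, here is a small illustrative sketch (not from the slides) of one way to represent those parts in code; the class name and fields are hypothetical.

from dataclasses import dataclass

@dataclass
class MCItem:
    """Hypothetical representation of the parts of a multiple-choice item."""
    stem: str                 # the question or incomplete statement
    alternatives: list[str]   # all choices/options presented to the student
    key: int                  # index of the keyed (correct) response; the others are distracters

item = MCItem(
    stem="Calculus was independently developed by Newton and",
    alternatives=["Barrow", "Kepler", "Leibniz", "Pascal"],
    key=2,  # "Leibniz" is the keyed response
)
distracters = [a for i, a in enumerate(item.alternatives) if i != item.key]
print(distracters)  # ['Barrow', 'Kepler', 'Pascal']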


Guidelines for a High-Quality MC Stem

✓ Focused wording: the item can be answered to some extent without looking at the alternatives
✓ Question or statement form
✓ Key words highlighted

Try to avoid:
- A double-barreled stem (more than one idea)
- Verbal and grammar cues to the answer
- Content bias (e.g. references to pop culture, gender bias)

January 2015 Creating quality multiple choice exams 21

Guidelines for High-Quality MC Distracters

✓ Use common student errors
✓ Use language appropriate to the students
✓ Alternatives should all be homogeneous (length and complexity)
✓ Alternatives should be arranged in a logical order (e.g. pyramid or reverse-pyramid, alphabetical order, numerical order, etc.)

Try to avoid:
- The 3:1 split: one of the alternatives stands out "like a sore thumb"
- A multiple-choice item that is actually a "true-false" item because nearly all students can eliminate two distracters
- Unjustified use of "all/none of the above"

January 2015 Creating quality multiple choice exams 22

If you want to increase the cognitive level of your MC item, consider using a source-based MC item.

[Source material: a line graph, "Level of Achievement Over Time", plotting level of achievement (0-90) against time (Oct-June) for four students: Pam, Roger, Gwen and Bob. This is the novel, yet familiar, introductory (source) material.]

Which of the following students deserve an A as their final grade?
A. Bob
B. Bob, Gwen and Roger
C. Bob, Gwen, Roger and Pam

This is one of a series of selected-response items that relates to the introductory material.

Creating quality multiple choice exams 23 January 2015

Example: Improving Multiple-Choice Items

A table of test specifications
A. provides a more balanced sampling of content
B. specifies the method of scoring to be used on a test
C. indicates how a test will be used to improve learning
D. arranges the instructional objectives in order of their importance

January 2015 Creating quality multiple choice exams 24


Example: How about this one?

The paucity of plausible, but incorrect, statements that can be related to a central idea poses a problem when constructing which one of the following types of test items?
A. essay items
B. true-false items
C. completion items
D. multiple-choice items

January 2015 Creating quality multiple choice exams 25

What guides our development/selection of technically sound items?

C. What resources are available as a starting point?
- Do we have test bank items or old exams to adapt?
- What might be some of the adaptations you make?
  ✓ Increase the level of cognition
  ✓ Change the item type
  ✓ Reword to focus the item on the "big idea"
- How will the exam be administered?

January 2015 Creating quality multiple choice exams 26

Learning Assessment Centre (LAC)

- Provides the possibility to administer secure digital/computerised exams.
- Question format types supported: MC, fill-in-the-blank, short answer, essay.
- Benefits of digital exams:
  - Multiple choice and short answer questions can be marked instantly, without using Scantron sheets.
  - Students can use the accessibility features on the computer to accommodate vision difficulties and preferences.
  - SSDS students could write the exam with the rest of the students.
  - Instructors can access all of the exams in one place, from flexible locations.
- Location: 3-106 Education North
- Website: http://digital.ualberta.ca/learning-assessment-centre-lac

January 2015 Creating quality multiple choice exams 27

Take home message for developing MC questions

- Attend to the cognitive levels of items.
- Make sure your items meet the criteria for being technically sound and equitable.
- Consider differing ways of administering!
- Questions:
  - Do you consider the cognitive levels of exam questions in your design?
  - Do you use other criteria for creating MC questions? How could the aforementioned checklist help you with your question design?

January 2015 Creating quality multiple choice exams 28


What guides our use of item analysis?

A. Exam reliability

B. Item difficulty

C. Item discrimination

January 2015 Creating quality multiple choice exams 29

Exam Reliability: Spot the Problem in the Following Alternative-Response Item

____ Lengthening a test will increase its reliability.

Creating quality multiple choice exams 30 January 2015

What guides our use of item analysis?

A. Is the exam reliable?
- If machine-scored, check the KR-20.
- If hand-scored, look for patterns.
- What does this tell you?
- What is considered high reliability vs. low reliability?

January 2015 Creating quality multiple choice exams 31
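The slide names KR-20 but does not define it. For reference, here is a minimal illustrative sketch (an assumption, not part of the slides) of the Kuder-Richardson 20 formula computed from a students-by-items matrix of 0/1 scores; the data are hypothetical.

import numpy as np

def kr20(scores: np.ndarray) -> float:
    """KR-20 reliability for dichotomously scored (0/1) items.
    scores: 2-D array, rows = students, columns = items."""
    k = scores.shape[1]                          # number of items
    p = scores.mean(axis=0)                      # proportion correct per item
    item_var_sum = np.sum(p * (1.0 - p))         # classic sum of p*q item variances
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of students' total scores
    return (k / (k - 1)) * (1.0 - item_var_sum / total_var)

# Hypothetical data: 5 students, 4 items
scores = np.array([
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
])
print(round(kr20(scores), 3))

Values closer to 1 indicate a more internally consistent exam; note that conventions for the total-score variance (sample vs. population) vary slightly between software packages.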

What guides our use of item analysis?

B. How difficult is the item?
- If machine-scored, check the DIF value.
- What does this tell you?
- DIF ranges from 0 to 1:
  - Easy: more than 0.9
  - Moderate: 0.5 to 0.9
  - Difficult: less than 0.5

ρ = (# of correct answers) / (# of people taking the test)

January 2015 Creating quality multiple choice exams 32
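As a small illustrative sketch (not from the slides), this is how the difficulty index and the bands above could be computed from 0/1 item scores; the data are hypothetical.

import numpy as np

def item_difficulty(item_scores: np.ndarray) -> float:
    """Difficulty index: proportion of test-takers who answered the item correctly."""
    return float(item_scores.mean())

def difficulty_band(p: float) -> str:
    """Bands from the slide: easy > 0.9, moderate 0.5-0.9, difficult < 0.5."""
    if p > 0.9:
        return "easy"
    if p >= 0.5:
        return "moderate"
    return "difficult"

# Hypothetical item answered correctly by 6 of 8 students
item_scores = np.array([1, 1, 0, 1, 1, 0, 1, 1])
p = item_difficulty(item_scores)
print(p, difficulty_band(p))  # 0.75 moderate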


The Anatomy of Question Analysis

ITEM 42: DIF= .366, RPB= .430, CRPB= .366 (95% CON= .255, .467)
         RBIS= .551, CRBIS= .469, IRI= .207

GROUP   N    NR   NF   O    1      2      3      4*
TOTAL   257  0    0    0    .163   .280   .191   .366
HIGH    61   0              .049   .148   .131   .672
MID     129  0              .194   .240   .209   .357
LOW     67   0              .209   .478   .209   .104

TEST SCORE MEANS            28.952 27.264 29.735 35.138
DISCRIMINATING POWER        -.160  -.330  -.078  .568
STANDARD ERROR OF D.P.      .011   .018   .013   .019

Question 42: This question meets all major standards.

DIF: The correct answer was selected by 36.6% of all students.

CRPB: This statistic indicates the discriminating power of the question. The higher the number, the greater the question discrimination between high- and low-achieving students. The minimum acceptable in our branch standard is .200.

The question was answered by 257 students, divided into high, mid, and low groups based on the scores they achieved for the entire multiple-choice test.

These statistics indicate the proportion of students in the entire group who chose each answer: 1 = choice A = 16.3%, 2 = choice B = 28.0%, 3 = choice C = 19.1%, 4 = choice D = 36.6%.

These figures indicate the percentage of students who chose each alternative among the three sub-groups of students.

Test Score Means: These statistics indicate the average score for the group of students that selected each alternative. For example, the average score for students who selected choice A was 28.952 (in this case) out of a total of 50. The average score of the group selecting the keyed response should always be higher than the average score for groups selecting each of the distracters.

* indicates keyed response

January 2015 Creating quality multiple choice exams 33
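The printout reports RPB and CRPB without defining them. As an illustrative aside (an assumption, not part of the slides), both can be computed as ordinary Pearson correlations between 0/1 item scores and total scores, with the corrected version removing the item from its own total; the data below are hypothetical.

import numpy as np

def point_biserial(item: np.ndarray, total: np.ndarray) -> float:
    """RPB: correlation between 0/1 scores on one item and students' total scores."""
    return float(np.corrcoef(item, total)[0, 1])

def corrected_point_biserial(item: np.ndarray, total: np.ndarray) -> float:
    """CRPB: the same correlation, but with the item removed from the total
    so it does not inflate its own discrimination."""
    return float(np.corrcoef(item, total - item)[0, 1])

# Hypothetical data: one item and total scores on a 50-item exam for 8 students
item = np.array([1, 0, 1, 1, 0, 1, 0, 1])
total = np.array([41, 22, 37, 45, 28, 39, 25, 44])
print(round(point_biserial(item, total), 3))
print(round(corrected_point_biserial(item, total), 3))

Higher values mean stronger separation between high- and low-scoring students; the slide's branch standard treats .200 as the minimum acceptable.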

14. Elections are often held in non-democratic countries primarily as a means of

A. reinforcing the perceived legitimacy of the régime in power
B. providing an opportunity for citizens to effect political change
C. meeting the legal requirements imposed by legislated constitutions
D. providing the elite with an insight into popular attitudes and beliefs

ITEM 14: DIF= .628, RPB= .448, CRPB= .391 (95% CON= .286, .486)
         RBIS= .572, CRBIS= .499, IRI= .216

GROUP   N    NR   NF   O    1*     2      3      4
TOTAL   277  1    0    1    .628   .108   .130   .130
HIGH    74   0              .905   .014   .000   .081
MID     124  1              .629   .105   .161   .097
LOW     79   0              .367   .203   .203   .228

TEST SCORE MEANS            34.201 25.700 25.639 27.556
DISCRIMINATING POWER        .538   -.189  -.203  -.147
STANDARD ERROR OF D.P.      .021   .009   .008   .010

Question 14: This item meets all major standards.

This question also discriminates very well. Note that although 20.3% of the lowest-achieving group selected alternative C (3), none of the highest-achieving group chose that same alternative.

Moderate Difficulty

Very Effective Distracters

January 2015 Creating quality multiple choice exams 34

What to look for in your item analysis?

Difficulty (DIF)    | Discrimination (CRPB) | Analysis
Greater than .90    | Any value             | Easy item – desirable to have some of these on the assessment
Between .50 and .90 | Greater than .20      | Moderate difficulty and highly discriminating – typical of quality items
Between .50 and .90 | Less than .20         | Moderate difficulty, non-discriminating – probably needs adjustment
Less than .50       | Greater than .20      | Tough question, highly discriminating – fair to have some of these
Less than .50       | Less than .20         | Tough question, does NOT discriminate – toss this one out

January 2015 Creating quality multiple choice exams 35
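As an illustrative sketch (not from the slides), the decision table above can be applied directly in code; the function name is hypothetical, and the two example calls reuse the DIF and CRPB values reported for items 42 and 14 earlier.

def flag_item(dif: float, crpb: float) -> str:
    """Classify one item using the slide's difficulty/discrimination decision table."""
    if dif > 0.90:
        return "easy item - desirable to have some of these"
    if dif >= 0.50:
        if crpb > 0.20:
            return "moderate difficulty, highly discriminating - typical of a quality item"
        return "moderate difficulty, non-discriminating - probably needs adjustment"
    if crpb > 0.20:
        return "tough question, highly discriminating - fair to have some of these"
    return "tough question, does not discriminate - consider tossing it out"

print(flag_item(dif=0.366, crpb=0.366))  # item 42: tough but highly discriminating
print(flag_item(dif=0.628, crpb=0.391))  # item 14: moderate difficulty, highly discriminating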

Item Analysis Activity

What can you read from this analysis?

January 2015 Creating quality multiple choice exams 36


Reliability of Assessment
- Consistency of scores

Validity of Interpretation
- Accuracy: the extent to which an assessment method measures what we intend it to measure

http://www.socialresearchmethods.net/kb/relandval.php

Why is any of this important?
✓ Does the exam (or its items) tell us whether students have learned what we intended for them to learn? And can we consistently answer that question?

January 2015 Creating quality multiple choice exams 37

Take home message for improving

- Item analysis is an untapped source of information.
- It is important to consider the measurement aspects of your items to ensure the reliability of the exam and the validity of your interpretations.
- Questions:
  - Do you have item analyses from previous exams that you can use to improve current tests?
  - How reliable are your exams? What steps could you take to improve the reliability of the results and the validity of the results' interpretation?

January 2015 Creating quality multiple choice exams 38

References & Time for Questions

- Bloom's taxonomy: http://www.odu.edu/educ/roverbau/Bloom/blooms_taxonomy.htm
- Fairness: Principles for Fair Assessment Practices for Education in Canada (1993). Edmonton, Alberta: Joint Advisory Committee. Retrieved June 12, 2009 from http://www.education.ualberta.ca/educ/psych/crame/files/eng_prin.pdf
- Item construction: Gronlund, N.E. (2005). Assessment of Student Achievement (7th ed.). Third custom edition for the University of Alberta.
- Designing exams: Felder, R.M. (2002). "Designing Tests to Maximize Learning," J. Prof. Issues in Engr. Education and Practice, 128(1), 1-3. http://www.ncsu.edu/felder-public/Papers/TestingTips.htm

January 2015 Creating quality multiple choice exams 39

