
Technical Report on the Standard Setting Exercise for the Medical Council of Canada Qualifying Examination Part I

Psychometrics and Assessment Services July 2015

Medical Council of Canada


TABLE OF CONTENTS

INTRODUCTION

Pre-Session Activities

    SELECTING A STANDARD SETTING METHOD

    SELECTING PARTICIPANTS AND ASSIGNING INTO PANELS

    SELECTING TEST QUESTIONS FOR THE STANDARD SETTING SESSION

    PRE-SESSION MATERIALS

Activities During the Two-Day Session

    ORIENTATION

    DEFINING THE BORDERLINE CANDIDATE

    THE PRACTICE TEST

    THE PRACTICE BOOKMARK METHOD

    TWO ROUNDS OF BOOKMARKING

Recommendation from the Panelists

Evaluation of the Standard Setting Judgments

Providing Feedback through an Online Survey

Concluding Remarks

REFERENCES

Table 1: Canadian and International Medical Graduate Pass/Fail Rates for the Years 2012-2014

Table 2: Standard Setting Results for Panels 1 and 2 for Rounds 1 and 2

Figure 1: Failure Rates for First-Time Takers (Panel 1)

Figure 2: Failure Rates for First-Time Takers (Panel 2)

Figure 3: Failure Rates for First-Time Takers (Combined Panels)

Figure 4: Failure Rates for all First-Time Takers (Round 2)

Figure 5: Failure Rates for all First-Time Takers and Hofstee Boundaries

APPENDIX A: Demographic Information Sheet

APPENDIX B: Demographic Summary of the Two Panels

APPENDIX C: Standard Setting Agenda

APPENDIX D: Defining Borderline Performance and the Minimally Competent Candidate

APPENDIX E: Form to Document a Bookmark for Each Round

APPENDIX F: Form to Document Hofstee Judgments for Each Round

APPENDIX G: Part I Standard Setting Fall 2014 – Post-Session Survey Summary


INTRODUCTION

In the context of licensing and certification, standard setting has become an essential component of assessment programs. Standard setting is the process by which an acceptable level of performance is defined (Kane, 1994, 1998). For the medical profession, standard setting establishes a qualitative statement of the minimum level of performance that should be attained to practice medicine safely and effectively. An integral part of the standard setting process is the establishment of a cut score, on the assessment of interest, that is congruent with this definition of minimum performance. At the Medical Council of Canada (MCC), standard setting is an essential part of every examination program, including the Medical Council of Canada Qualifying Examination (MCCQE) Part I.

The MCCQE Part I is a computer-delivered examination that assesses the basic medical knowledge and skills expected to be mastered by the end of medical school. It is composed of a three-and-a-half-hour multiple-choice question (MCQ) component and a four-hour clinical decision-making (CDM) component. The MCQ component consists of seven sections of 28 questions, in which testlets of four questions for each of the six disciplines (Internal Medicine; Obstetrics and Gynecology; Pediatrics; Population Health, Ethical, Legal, and Organizational Aspects of Medicine; Psychiatry; and Surgery) are presented to candidates. The CDM component is composed of 36 clinical cases, each including one to four questions. CDM questions can be either selected-response or constructed-response (CR) items.

The purpose of the standard setting session for the MCCQE Part I that took place October 23-24, 2014, was to arrive at a recommended cut score for subsequent review and approval by the Central Examination Committee (CEC). The most important aspect of standard setting is the validity of the process and its activities. In the sections that follow, we describe in detail the pre-session activities, as well as the activities that took place during the standard setting session for the MCCQE Part I.

Pre-Session Activities

SELECTING A STANDARD SETTING METHOD

Standard setting methodologies abound, but not all are well suited to the types of items used in the MCCQE Part I. Several methodologies were considered, and the Bookmark method was chosen because of its simplicity and the ease with which both MCQs and CDM items can be integrated into the cut score (Cizek & Bunch, 2007). The Bookmark method is an item mapping procedure in which items are ordered from easiest to most difficult based on operational data, and panelists are asked to place a bookmark at the point at which they believe a minimally proficient candidate would no longer correctly answer subsequent items in the ordered exam form. In effect, this bookmark placement corresponds to the cut score for each panelist. A detailed description of the participants' task is provided in a later section of this report.

SELECTING PARTICIPANTS AND ASSIGNING INTO PANELS

The panelists selected for a standard setting exercise should represent a microcosm of all MCCQE Part I examination stakeholders, so it is critical to select participants who are representative with respect to a number of key variables, including region of Canada, ethnicity, medical specialty, and years of experience. Furthermore, to assess the reproducibility of the cut score across two groups of physicians, we decided to split the panelists into two matched subgroups. This split allowed us to collect critical validity evidence in support of the recommended cut score.

The process of selecting participants started with an invitation forwarded to physicians from across Canada, targeting family physicians as well as a broad range of other specialists. A total of 22 physicians were retained based on several key criteria (see Appendix A for the demographic information survey completed by all potential participants). As previously mentioned, we attempted to select panelists for both subgroups who reflected the various regions of the country (Western, Central, and Eastern Canada), medical specialties (family medicine, internal medicine, surgery, obstetrics and gynecology, pediatrics, and psychiatry), ethnicities (Asian, Black, Caucasian, First Nations, and Hispanic), sex, and years of experience supervising residents. A summary of the demographics of the two panels is presented in Appendix B. Some minor imbalance ensued when five participants withdrew a few days before the session; two of them decided not to participate on account of the tragic events that occurred in Ottawa at the National War Memorial and the Centre Block of Parliament the day before the session.

SELECTING TEST QUESTIONS FOR THE STANDARD SETTING SESSION

All questions used for the standard setting session were taken from the most recent MCCQE Part I administration, namely the spring 2014 session. Dichotomously scored MCQs were calibrated using the Rasch model (Rasch, 1960/1980) and, in turn, were used as anchors to calibrate the CDM questions (the Rasch model for dichotomously scored CDMs and the partial credit model (Masters, 1982) for polytomously scored CDMs). With the Bookmark method, the basic question that panelists must answer is the following: "Is it likely that the borderline candidate will be able to answer this question correctly?" A typical probability level used with the Bookmark method is the 67% response probability, or a 2/3 chance of answering correctly. Therefore, response probabilities were calculated using the 2/3 probability criterion for each dichotomously scored MCQ and CDM question, and for each step value of the polytomously scored CDMs.
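
To illustrate how these response probabilities relate to the Rasch calibrations, consider the sketch below. It is illustrative only: the item difficulties shown are hypothetical values, not operational MCCQE Part I parameters, and polytomous CDM step values would be handled analogously under the partial credit model.

```python
import math

def rp67_location(difficulty, rp=2.0 / 3.0):
    """Ability (theta) at which a Rasch item is answered correctly with
    probability rp.  Solving rp = 1 / (1 + exp(-(theta - b))) for theta
    gives theta = b + ln(rp / (1 - rp)); for rp = 2/3 this is b + ln(2)."""
    return difficulty + math.log(rp / (1.0 - rp))

# Hypothetical Rasch difficulties for three items (not actual exam data).
item_difficulties = {"MCQ_001": -1.20, "CDM_017": 0.05, "MCQ_114": 0.80}

# Items would then be ordered from easiest to hardest by RP67 location
# to build the ordered item booklet used in the Bookmark exercise.
for item_id, b in sorted(item_difficulties.items(), key=lambda kv: rp67_location(kv[1])):
    print(f"{item_id}: difficulty = {b:+.2f}, RP67 location = {rp67_location(b):+.2f}")
```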


PRE-SESSION MATERIALS

To help panelists prepare for the standard setting session, we asked them to read an article (De Champlain, 2004) and a book chapter (De Champlain, 2014) on the topic of standard setting, both of which were sent out prior to the October 2014 exercise. Additionally, the agenda for the two-day session was mailed to participants a few weeks before the session (see Appendix C).

Activities During the Two-Day Session

ORIENTATION

The success of any standard setting session relies heavily on the thorough training of the participating panelists. Training helps to ensure that panelists share the same objective and the same basic premises and understanding of the standard setting process. To this end, we spent half of the first day of the exercise training the panelists on a number of issues, including the structure and content of the MCCQE Part I. Examples of questions for both components of the examination were shown, along with the type of scoring rubrics that would be seen in the exercises included in the session. This was followed by a tutorial on standard setting, including issues to consider, methods, and sources of evidence to support the reliability and validity of any cut score. Particular attention was paid to the method selected to arrive at a recommended cut score for the MCCQE Part I exam, namely the Bookmark method. In addition, a second, ancillary standard setting method, the Hofstee method, was introduced and used as a complement to the item-centered Bookmark approach. The Hofstee method is described in the literature as a compromise method (Hofstee, 1983) in that it integrates both norm-referenced (relative) and criterion-referenced (absolute) considerations into a "gut estimate" that is used to further validate the cut score obtained from the Bookmark exercise.

DEFINING THE BORDERLINE CANDIDATE

Most standard setting methodologies, including the Bookmark method, assume that a cut score is set for the minimally proficient, or borderline, candidate. This hypothetical candidate is critical in setting the cut score, i.e., a point on the continuum of professional competence that separates candidates deemed competent from those deemed not competent. The Bookmark method requires that panelists clearly define what constitutes a minimally proficient (or borderline) candidate with respect to what they may know and not know in the domains targeted by the MCCQE Part I exam.

To assist panelists in this task, a basic definition was developed by the Vice-Chair of the CEC and offered to the panelists as a starting point. After much discussion, the participants agreed on some modifications and enhancements, listing several attributes that they felt were reflective of borderline candidate behaviours and attitudes. The definition that was agreed upon by all panelists is shown in Appendix D.

THE PRACTICE TEST

To familiarize panelists with the types of questions that Part I candidates must answer during an examination, a practice test was administered to them prior to collecting their judgments. It contained a representative sample of 50 multiple-choice questions and 26 clinical decision-making questions selected from the spring 2014 MCCQE Part I examination. Panelists were given 90 minutes to complete the practice test, after which they were instructed to self-score their test using an item map that provided the correct answer for each question. The purpose of the practice test was also to give participants a sense of the level of difficulty of the MCCQE Part I. Participants were not asked to share their resulting scores with other panelists; however, the exercise did provide the basis for a discussion of the perceived difficulty of the questions and the appropriateness of the content in relation to the purpose of the Part I examination and its target population (i.e., candidates entering supervised training or residency).

THE PRACTICE BOOKMARK METHOD

A practice bookmark exercise was planned to train the panelists in this procedure before they

engaged in the actual full-scale activity. The same questions used in the practice test were used

for this exercise as well. However, the questions were now ordered by difficulty level, from

"easiest" to "most difficult", based on actual spring 2014 MCCQE Part I candidate performances. The goal of this practice round was to allow panelists to identify a point on the scale that they believed reflected minimal competency in the domains measured by the MCCQE Part I examination.

Each participant was presented with a booklet containing examination questions (one per page) ordered by difficulty from easiest to most difficult. Each participant was asked to place their bookmark at the point at which they felt a minimally proficient (or borderline) candidate would correctly answer all items up to that point and incorrectly answer all items beyond that point. The basic question that panelists must answer in the Bookmark procedure is the following: "Is it likely that a minimally proficient candidate will be able to correctly answer this test question?" Of course, the likelihood must be defined more specifically. In the Bookmark method, it is defined as having a 2/3 chance of answering correctly (or, for polytomous CR items, a 2/3 chance of reaching a given score category or higher). The expression "RP67" is often used to capture a .67 response probability; it is simply another way of expressing the 2/3 chance of answering correctly.


Panelists were instructed to read the questions starting with the first question in their booklet and to proceed sequentially, one item at a time, until they arrived at the point where they felt that the minimally proficient candidate would no longer have a 2/3 chance of correctly answering the item. Panelists were not provided with the correct answers for this initial practice round. Following this initial bookmark placement, panelists were provided with an item map that contained information on each question in the booklet, including the correct answer and the associated RP67 value. Panelists were then invited to begin the two actual rounds of the Bookmark standard setting exercise.

TWO ROUNDS OF BOOKMARKING

Round 1 (Preliminary round). Following the practice bookmark round, panelists were reminded of some key points about the Bookmark method and were assigned to their respective panels. Each panelist was then provided with a booklet containing 236 items (one form's worth of items) ordered by difficulty level (based on RP67 values) from easiest to most difficult, and was instructed to independently place a bookmark at the point at which they felt a minimally proficient (or borderline) candidate would correctly answer all items up to that point and incorrectly answer all items beyond that point. Forms were distributed for documenting each panelist's bookmark (see Appendix E). Panelists were given 3.5 hours to complete their round 1 bookmark placement. Note that the round 1 judgments were based solely on the item text provided; no performance data were given.

Following round 1, panelists were asked to answer the following four Hofstee method questions: (1) What is the minimum acceptable cut score (Cmin), even if all candidates attained that score? (2) What is the maximum acceptable cut score (Cmax), even if no candidate attained that score? (3) What is the minimum tolerable failure rate (Fmin)? and (4) What is the maximum tolerable failure rate (Fmax)? This information is used to gauge the appropriateness of the Bookmark cut score against the panelists' holistic views. Forms were distributed (see Appendix F) to allow panelists to record their Hofstee judgments. The forms were collected and provided to statistical analysts, who entered the data into an application that allowed us to view each panel's bookmarks overlaid with the Hofstee boundaries. Figures 1 and 2 illustrate the Bookmark and Hofstee data for round 1 for Panels 1 and 2, respectively, and Figure 3 combines the data for both panels. Panel 1 panelists are represented as blue letters on each graph; Panel 1 had 9 panelists: A, C, D, E, G, H, I, J, and K. Panel 2 panelists are represented as red letters; Panel 2 had 8 panelists: A, B, E, F, G, H, I, and J. The placement of the letters has significance only on the x-axis, namely the cut scores on the theta scale. Some letters were stacked simply to distinguish panelists whose cut scores were identical rather than superimposing them; the vertical placement of the letters carries no meaning in terms of failure rates for individual panelists.

Panelists from both panels were then gathered in one room to receive impact data, which consisted of failure rates for their respective cut scores and for the combined cut score values. Pass and fail rates for Canadian and international medical graduates for the years 2012-2014 were presented to all panelists (see Table 1). In addition, a cumulative distribution of examination results was prepared from all first-time candidates who completed the spring 2014 MCCQE Part I. For each score, the cumulative percentage of failures was computed, and a look-up table was created to obtain a failure rate for any cut score provided by a panelist.

To translate bookmark placements into cut scores on the item response theory (IRT) ability (theta) scale, an additional look-up table was created that listed: (1) the item identification number for each item used in the bookmarking exercise; (2) the corresponding booklet page number; (3) the Rasch item difficulty measure; and (4) the RP67 value, that is, the IRT ability value needed to have a 2/3 chance of correctly answering the item in the sample MCCQE Part I exam form used in the exercise. Once all bookmark placement page numbers were collected, they were entered and a corresponding cut score was identified from the look-up table for each panelist, for each panel, and overall.

To obtain a panel-level cut score, the median of the panel's distribution of cut scores was calculated. The median was chosen instead of the mean because it mitigates the influence of extreme values when they occur. This median corresponded to the preliminary (round 1) cut score.
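
The sketch below illustrates this translation with made-up numbers (the page-to-theta table, cumulative failure-rate table, and bookmark pages are hypothetical, not the operational look-up tables): each bookmark page is mapped to the RP67 ability of the item on that page, the panel-level cut score is the median of those values, and a cumulative failure-rate table then provides the corresponding impact data.

```python
from bisect import bisect_right
from statistics import median

# Hypothetical look-up table: booklet page -> RP67 ability (theta) of the
# item printed on that page (pages run from easiest to most difficult item).
page_to_theta = {10: -1.10, 42: -0.75, 55: -0.52, 63: -0.30, 78: -0.10, 90: 0.15}

# Hypothetical cumulative failure-rate table for first-time takers:
# (theta cut score, percentage of candidates failing at that cut score).
failure_table = [(-1.5, 1.0), (-1.0, 3.0), (-0.5, 8.0), (-0.25, 13.0), (0.0, 20.0), (0.5, 38.0)]

def failure_rate(cut):
    """Step-function look-up: use the largest tabulated cut score
    that does not exceed the requested cut score."""
    thetas = [t for t, _ in failure_table]
    return failure_table[max(bisect_right(thetas, cut) - 1, 0)][1]

# Hypothetical round 1 bookmark pages for one panel of nine physicians.
bookmark_pages = [42, 55, 55, 63, 78, 55, 42, 90, 63]

panel_cuts = [page_to_theta[p] for p in bookmark_pages]
panel_cut = median(panel_cuts)          # the median mitigates extreme judgments
print(f"Panel cut score (theta): {panel_cut:+.2f}")
print(f"Projected failure rate:  {failure_rate(panel_cut):.1f}%")
```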

In Figure 1, we can observe that failure rates increase as cut scores increase and that Panel 1's bookmark cut score falls between the lower and upper boundaries identified by the Hofstee method (the Hofstee cut score is established by drawing a line down to the cut score at the point where the line joining Fmax/Cmin and Fmin/Cmax traverses the cumulative failure-rate curve). This is a desirable outcome because it indicates that the cut score identified by Panel 1 (-0.39 on the theta scale) falls within what the panel expected in terms of maximum and minimum failure rates and maximum and minimum cut scores.
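
To make the Hofstee construction concrete, the sketch below finds the point where the straight line joining (Cmin, Fmax) and (Cmax, Fmin) crosses a cumulative failure-rate curve; that crossing point is the Hofstee compromise cut score. The curve and boundary values shown are hypothetical placeholders, not the panel data plotted in Figures 1-3.

```python
import math

def hofstee_cut(curve, c_min, c_max, f_min, f_max, tol=1e-4):
    """Hofstee compromise cut score: the point where the descending line
    through (c_min, f_max) and (c_max, f_min) crosses the ascending
    cumulative failure-rate curve, found here by bisection."""
    def line(c):
        return f_max + (f_min - f_max) * (c - c_min) / (c_max - c_min)
    lo, hi = c_min, c_max
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        # The curve sits below the line to the left of the crossing point.
        if curve(mid) < line(mid):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical smooth cumulative failure-rate curve on the theta scale
# (percentage of first-time takers who would fail at a given cut score).
curve = lambda c: 100.0 / (1.0 + math.exp(-2.5 * (c - 0.3)))

# Hypothetical Hofstee boundaries summarizing one panel's four answers.
cut = hofstee_cut(curve, c_min=-1.0, c_max=0.0, f_min=5.0, f_max=25.0)
print(f"Hofstee compromise cut score: {cut:+.2f} (failure rate {curve(cut):.1f}%)")
```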

Figure 2 presents the round 1 results for Panel 2. The results indicate that this panel's bookmark cut score (-0.78 on the theta scale) was not congruent with the Hofstee boundaries it established as acceptable; two panelists (B and E) appear to be largely responsible for this outcome. Figure 3 illustrates the combined data for both panels, which yield a combined cut score of -0.58 that falls within the Hofstee upper and lower boundaries. Panelists were given an opportunity to discuss these results after the preliminary round. Much of the discussion concerned the impact on medical graduates who would potentially fail at the cut score produced by round 1 bookmarking. Given the impact data, some panelists felt that they had been too lenient in what they expected the borderline candidate to master, while others felt they had been too harsh.

Round 2 (Final round). Panelists were then directed to their respective subgroups to engage in the second and final round of bookmarking. The results from this second round constitute the recommended cut score, which was subsequently brought forward to the CEC for consideration and adoption. Panelists were given two hours to complete this final standard setting round. As in the preliminary round, forms were gathered from panelists indicating their second bookmark placement as well as their responses to the four Hofstee questions (post round 2). Graphical representations of the round 2 bookmarking results are presented in Figures 4 and 5. Figure 4 presents the round 2 individual and panel bookmark cut scores and corresponding failure rates; Figure 5 presents the same data with an additional overlay of the round 2 Hofstee boundaries. The combined (i.e., both panels taken together) cut score of -0.22 on the IRT ability (theta) scale would fail 14% of all first-time candidates based on the spring 2014 examination results, and 5.1% of first-time Canadian medical graduates from that administration.

Recommendation from the Panelists

The above-mentioned figures were presented to all panelists together, and they were given an opportunity to discuss the impact of using the resulting cut score. Several panelists expressed their satisfaction with the method used to arrive at the final cut score and felt comfortable with the results of the exercises. Consistent with the mandate set at the beginning of the meeting, they recommended that the cut score of -0.22 on the IRT ability scale be brought forward to the CEC for approval at its spring 2015 meeting.

Evaluation of the Standard Setting Judgments

Details of each panel's recommended cut scores following round 2 (the final round) are presented in Table 2. The table summarizes the two panels' cut scores and their associated descriptive statistics, namely the means, medians and standard deviations. The standard error of judgment (SEJ) is also presented. This statistic captures the amount of variability associated with each panel's cut score and provides a rough indication of the extent to which the same or a similar cut score would be obtained if we were to gather other physicians with the same demographics as those chosen for this session, who had gone through the same type of training and used the same examination items. By building a confidence interval around each panel's cut score using the SEJ, we can evaluate the extent to which the two panels arrived at comparable cut scores. Panel 1's interval extends from -0.37 to -0.18 and Panel 2's interval extends from -0.38 to 0.03. Because these intervals overlap substantially, we conclude that the two panels' cut scores were very comparable.
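
As a hedged illustration of this check: the SEJ is commonly computed as the standard deviation of the panelists' cut scores divided by the square root of the number of panelists, and the intervals quoted above are consistent with approximately the panel median plus or minus two SEJs. The sketch below reproduces that calculation from the round 2 values in Table 2; the exact formula used is not stated in this report, so this is offered only as a plausible reconstruction.

```python
from math import sqrt
from statistics import mean, median, stdev

def panel_summary(cuts, label):
    """Descriptive statistics and an approximate interval (median +/- 2 SEJ)
    for one panel's round 2 cut scores."""
    sej = stdev(cuts) / sqrt(len(cuts))      # standard error of judgment
    lo, hi = median(cuts) - 2 * sej, median(cuts) + 2 * sej
    print(f"{label}: mean = {mean(cuts):+.2f}, median = {median(cuts):+.2f}, "
          f"SD = {stdev(cuts):.2f}, SEJ = {sej:.3f}, interval = ({lo:+.2f}, {hi:+.2f})")
    return lo, hi

# Round 2 cut scores transcribed from Table 2.
panel_1 = [-0.26, -0.31, -0.46, -0.19, -0.26, -0.59, -0.20, -0.27, -0.43]
panel_2 = [-0.04, -0.02, -0.44, -0.17, -0.37, -0.18, -0.53, 0.38]

lo1, hi1 = panel_summary(panel_1, "Panel 1")
lo2, hi2 = panel_summary(panel_2, "Panel 2")
# Overlapping intervals suggest the two panels converged on comparable standards.
print("Intervals overlap:", lo1 <= hi2 and lo2 <= hi1)
```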

Providing Feedback through an Online Survey

At the conclusion of the meeting, panelists were given an opportunity to provide feedback on the activities in which they had participated. An online survey tool was developed for this specific purpose, and panelists were informed that their feedback would be treated anonymously. All but one panelist completed the survey before leaving the meeting; the remaining panelist completed it one day later.

Results of the survey are presented in Appendix G. All 17 participants rated the information presented in the overview of the MCCQE Part I as good (18%), very good (18%), or excellent (65%). They rated the overview of standard setting as good (6%), very good (29%), or excellent (65%). Central to the exercises during this standard setting session was the notion of the minimally competent (i.e., borderline) candidate, and participants were asked to assess the clarity of the definition of that target population that they had developed. All 17 participants found the definition clear (76%) or very clear (24%).

A significant amount of time was devoted to training panelists for the task, which staff felt was extremely important to ensure a common understanding of what was expected of them before they engaged in the actual bookmarking exercise. Ninety-four percent of panelists thought that the practice exercise was appropriate, 6% thought that it was somewhat appropriate, and none thought it was not appropriate. All participants rated the training provided for the Bookmark method as good (12%), very good (18%), or excellent (71%).

Among the facto" that influenced participants the most when they engaged in the Bookmark

method were their perception of the level of difficulty of the items (94%), the description of the

minimally competent candidate (88%), the item statistics provided in round 2 (76%), and the

knowledge and skills measured by the items (76%). Among the factors that had the least

influence on their bookmarking exercise were the quality of the item distractors (12%) and the

number of answer choices per item (18%).

Participants were asked about their level of understanding of how to apply the Bookmark and Hofstee methods during round 1. For the Bookmark method, 16 of the 17 participants said that they either understood (29%) or understood very well (65%) the process, while one participant reported understanding it "somewhat". For the Hofstee method, 1 participant (6%) said that they understood it somewhat, 5 participants (29%) said that they understood it, and 11 participants (65%) said that they understood it very well; none of the participants reported not understanding the method at all.

Participants were also asked about their level of confidence regarding the consequential/feedback data and the final discussion. Two participants (12%) felt somewhat confident, 6 participants (35%) felt confident, and 9 participants (53%) felt very confident; none of the participants felt not at all confident.

One of the most important outcomes of a standard setting exercise is a standard that participants would recommend with a high level of confidence. As part of the survey, participants were asked about their level of confidence in the final recommended passing score. One participant felt somewhat confident (6%), while the large majority reported being confident (18%) or very confident (76%) about the recommended cut score.

Finally, participants were surveyed on potential improvements to consider for future standard setting exercises. Suggestions included providing impact data, as well as each panelist's bookmark placement, after the practice bookmark exercise; one participant specifically suggested providing failure rates for each panelist's practice bookmark. A few participants felt that no improvements were needed.

Concluding Remarks

The main goal of this report was to outline the activities that constituted the standard setting exercise for the MCCQE Part I. In summary, two panels were gathered for the purpose of establishing and recommending a cut score by participating in a two-day session during which they were trained in the Bookmark and Hofstee standard setting methods. A significant amount of time was spent defining the target population and training panelists on various critical aspects of the exercise. The two panels established highly comparable cut scores, as demonstrated by the overlap of their respective confidence intervals based on the standard error of judgment, and a large majority of participants expressed a high level of confidence in the recommended cut score. Several staff from Psychometrics and Assessment Services and the Evaluation Bureau contributed to making this a successful session. Finally, a comprehensive description of all the activities, the resulting cut score, and impact data for both the spring 2014 and spring 2015 cohorts were presented to the CEC on June 8, 2015 for discussion and consideration. The CEC unanimously accepted the recommended cut score of -0.22 (427 on the three-digit MCCQE Part I reporting scale) at that meeting.
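
For context on this final conversion, a theta cut score is typically mapped to a reporting scale through a linear transformation. The sketch below is purely illustrative: the slope and intercept are hypothetical placeholders chosen only so that a theta of -0.22 lands at 427; they are not the actual MCCQE Part I scaling constants, which this report does not provide.

```python
def theta_to_reported(theta, slope, intercept):
    """Linear conversion from the IRT ability (theta) scale to a
    three-digit reporting scale: reported = slope * theta + intercept."""
    return round(slope * theta + intercept)

# Hypothetical scaling constants chosen only so that a theta of -0.22
# lands at 427; they are NOT the constants used by the MCC.
SLOPE, INTERCEPT = 100.0, 449.0

print(theta_to_reported(-0.22, SLOPE, INTERCEPT))   # prints 427
```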


REFERENCES

Cizek, G. J., & Bunch, M. B. (2007). Standard setting: A guide to establishing and evaluating performance standards on tests (pp. 55-189). Thousand Oaks, CA: Sage.

De Champlain, A. F. (2014). Standard setting methods in medical education. In T. Swanwick (Ed.), Understanding medical education: Evidence, theory and practice (pp. 305-316). Chichester, West Sussex: John Wiley & Sons.

De Champlain, A. F. (2004). Ensuring that the competent are truly competent: An overview of common methods and procedures used to set standards on high-stakes examinations. Journal of Veterinary Medical Education, 31, 61-65.

Hofstee, W. K. B. (1983). The case for compromise in educational selection and grading. In S. B. Anderson & J. S. Helmick (Eds.), On educational testing (pp. 109-127). San Francisco: Jossey-Bass.

Kane, M. (1994). Validating the performance standards associated with passing scores. Review of Educational Research, 64(3), 425-461.

Kane, M. (1998). Choosing between examinee-centered and test-centered standard-setting methods. Educational Assessment, 5(3), 129-145.

Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47, 149-174.

Rasch, G. (1960/1980). Probabilistic models for some intelligence and attainment tests. Copenhagen: Danish Institute for Educational Research. (Expanded edition, 1980, with foreword and afterword by B. D. Wright. Chicago: The University of Chicago Press.)

Wright, B. D., & Stone, M. H. (1979). Best test design: Rasch measurement. Chicago: MESA Press.


Table 1: Canadian and International Medical Graduate Pass/Fail Rates for the Years 2012-2014

                                                                   2012     2013     2014
Canadian Medical Graduates, First-Time Takers
    FAIL                                                           1.5%     1.3%     2.3%
    PASS                                                          98.5%    98.7%    97.7%
Canadian and International Medical Graduates, First-Time Takers
    FAIL                                                           9.0%     8.6%    10.6%
    PASS                                                          91.0%    91.4%    89.4%

Table 2: Standard Setting Results for Panels 1 and 2 for Rounds 1 and 2

Summary of Cut Scores by Panel for Rounds 1 and 2

                                          Round 1              Round 2
                                      Panel 1   Panel 2    Panel 1   Panel 2
Panelist 1                             -0.07     -0.98      -0.26     -0.04
Panelist 2                             -0.89     -1.74      -0.31     -0.02
Panelist 3                             -0.46     -1.73      -0.46     -0.44
Panelist 4                             -0.37      0.74      -0.19     -0.17
Panelist 5                             -0.07     -1.08      -0.26     -0.37
Panelist 6                             -0.95     -0.27      -0.59     -0.18
Panelist 7                             -0.99     -0.58      -0.20     -0.53
Panelist 8                              0.29      0.53      -0.27      0.38
Panelist 9                             -0.39        --      -0.43        --
Mean                                   -0.43     -0.64      -0.33     -0.17
Median                                 -0.39     -0.78      -0.27     -0.17
Standard Deviation                      0.44      0.93       0.13      0.29
Standard Error of Judgment (SEJ)        0.16      0.35       0.05      0.10


Figure 1: Failure Rates for First-Time Takers (Panel 1)

Figure 2: Failure Rates for First-Time Takers (Panel 2)


Figure 3: Failure Rates for First-Time Takers (Combined Panels)

Figure 4: Failure Rates for all First-Time Takers (Round 2)


Figure 5: Failure Rates for all First-Time Takers and Hofstee Boundaries


APPENDIX A: Demographic Information Sheet

The information requested below is being collected to help the MCC assemble a pan-Canadian representative panel to recommend a passing score on the MCCQE Part I Examination. This information will only be used to select the panel members so that we can represent the diversity of physicians across the country; it will not be linked in any way to the data collected for setting the passing score. As a reminder, the meeting will take place on October 23-24, 2014, so we are asking panelists to be available on both days.

Please provide your name and contact information, and check a box next to each of the questions. The form can be returned by mail or electronically by 30 April 2014.

Name: __________________________________________________________________

Email: _____________________________________ Phone number: ________________

Mailing address: __________________________________________________________

________________________________________________________________________

________________________________________________________________________

1. Number of years in practice post residency:

1-5 years ☐

6-10 years ☐

11-20 years ☐

21-30 years ☐

More than 30 years ☐

2. Number of years’ experience supervising residents:

1-5 years ☐

6-10 years ☐

11-20 years ☐

21-30 years ☐

More than 30 years ☐


3. Do you have experience supervising Canadian Medical Graduates?

Yes ☐

No ☐

4. Have you ever been a member of a Medical Council test committee?

Yes ☐

No ☐

5. Country of medical training (post graduate training):

Canada ☐

Other ☐

6. Region of the country in which you live:

Alberta ☐

British Columbia ☐

Manitoba ☐

Maritimes ☐

Ontario ☐

Quebec ☐

Saskatchewan ☐

Territories ☐

7. First Language:

English ☐

French ☐

Other (________________________) ☐

8. Gender:

Male ☐

Female ☐


9. Ethnicity:

Asian ☐

Black ☐

Caucasian ☐

First Nations ☐

Hispanic ☐

10. Medical Specialty:

Pediatrics ☐

Internal Medicine ☐

Psychiatry ☐

Obstetrics and Gynecology ☐

Surgery ☐

Family Medicine ☐

Other ☐

11. Type of community in which you work:

Urban ☐

Rural ☐

12. Type of care setting:

Hospital-based ☐

Community-based ☐


APPENDIX B: Demographic Summary of the Two Panels

Variable of Interest                  Group                     Panel A   Panel B   Total
Gender                                Female                        56%       50%     53%
                                      Male                          44%       50%     47%
Geographic Region                     West                          22%       38%     29%
                                      Central                       56%       38%     47%
                                      East                          22%       25%     24%
Medical Specialty                     Internal Medicine             33%       38%     35%
                                      Surgery                       22%       13%     18%
                                      Obstetrics/Gynecology         11%       13%     12%
                                      Pediatrics                    22%       13%     18%
                                      Psychiatry                     0%       13%      6%
                                      Family Medicine               11%       13%     12%
Number of Years Supervising           1-5 years                     11%       38%     24%
Residents                             6-10 years                    44%       13%     29%
                                      11-20 years                   11%       25%     18%
                                      21-30 years                   33%       25%     29%
Country of Medical Training           Canada                        89%       88%     88%
                                      Other                         11%       12%     12%


Appendix C: Standard Setting Agenda

STANDARD SETTING FOR QUALIFYING EXAMINATION PART I

Location: MCC office, 2283 St. Laurent Blvd., Ottawa, ON

University Boardroom (3rd Floor)

OCTOBER 23-24, 2014 | 8:00 a.m. – 4:00 p.m.

AGENDA – DAY 1: Thursday, Oct. 23, 2014

CONTINENTAL BREAKFAST 08:00

1. Breakfast and Registration 08:00

1.1 Complete confidentiality and biographical information forms

1.2 Let panellists know to what table/room they belong

2. Welcome and Introduction by MCC

2.1 Introduction of Panellists

2.2 Overview of Agenda

2.3 Overview of Part I Examination

2.4 Overview of Standard Setting

2.5 Overview of Bookmark Method

3. Review Practice Test and Self-Score 09:30

3.1 Break as needed

3.2 Take Practice Test: 50 MCQs + 25 CDM questions

3.3 Self-score using Practice Test Item Map

3.4 Discuss knowledge and skills on test

LUNCH 11:45

4. Develop Target Student Description and Reach Consensus 12:30

4.1 Clear definition of minimally competent candidate starting residency

5. Training of Bookmark Method & Practice 13:15

5.1 Practice bookmark method 50 MCQs and 39 CDMs

P.M. BREAK 14:45

6. Practice Ordered Item Booklet (OIB) 15:00

6.1 Provide item map for Practice OIB

6.2 Discussion of ordered difficulty and placement of bookmark

6.3 Survey post-bookmark training

7. Additional Discussion/Clarification 16:30

END OF DAY 1 17:00


AGENDA – DAY 2: Friday, Oct. 24, 2014

CONTINENTAL BREAKFAST 08:00

8. Independently Mark Round 1 Bookmark Judgement/Hofstee by Panel 08:30

9. End of Round 1 11:30

9.1 Data entry

LUNCH / Data Entry 11:30

10. Round 1 – Data Feedback Whole Group 12:15

10.1 Provide Panel- and room-level data and impact data

10.2 Round 1 panel discussions with large group

11. Independently Make Round 2 Bookmark Judgements/Hofstee 13:00

P.M. BREAK 15:15

12. End of Round 2 15:15

12.1 Data entry

13. Round 2 Data Feedback 15:45

13.1 Provide panel- and room-level data and impact data

13.2 Presentation of Bookmark Recommendation

14. Complete Final Evaluation and Collection of Materials 16:15

END OF DAY 2 16:30


APPENDIX D: Defining Borderline Performance and the Minimally Competent Candidate

The “minimally competent” candidate entering residency is a candidate who possesses the

minimum level of knowledge, skills and attitudes required to safely practice medicine under

supervision. A “minimally competent” candidate’s performance is acceptable, despite gaps in their

knowledge and clinical decision-making skills.

The minimally competent candidate will:

• Have the right attributes

• Be able to reflect on the limits of their own knowledge

• Be able to recognize that a patient is sick, but may not necessarily know why

• May not have the ability to adequately recognize life-threatening situations

• Be able to gather information but not necessarily be able to integrate it

• Be reliable in identifying red flags (and a sense of urgency) for patient safety

• Ask for help

• Improve over time

• Recognize his/her own weaknesses

• Have a willingness to learn and reflect on feedback

• Incorporate professionalism

• Be clinically, logically and culturally competent

• Build a rapport with the patient

• Synthesize information


APPENDIX E: Form to Document a Bookmark for Each Round

Panel: ______________________________________________________________________

Panelist: _____________________________________________________________________

Standard Setting for the Qualifying Examination Part I

The Bookmark Method

Please indicate the page number of the item on which you placed your bookmark. It is the item for

which, in your judgment, a minimally proficient candidate’s chance of answering correctly falls

below a 2/3 probability.

Please initial after each round:

Round        Bookmark Page        Initials

  1          _____________        ________

  2          _____________        ________


APPENDIX F: Form to Document Hofstee Judgments for Each Round

Panel: ______________________________________________________________________

Panelist: _____________________________________________________________________

Standard Setting for the Qualifying Examination Part I

The Hofstee Method

Please answer the following 4 questions, once after each round:

1. What is the highest percent correct cut score that would be acceptable, even if no candidate attains that score? This value represents your estimate of the maximum level of knowledge that should be required of candidates.

Round 1: ______ Round 2: ______

2. What is the lowest percent correct cut score that would be acceptable, even if every candidate attains that score? This value represents your judgment of the minimum acceptable percentage of knowledge that should be tolerated.

Round 1: ______ Round 2: ______

3. What is the maximum acceptable failure rate? This value represents your judgment of the

highest percentage of failing candidates that could be tolerated.

Round 1: ______ Round 2: ______

4. What is the minimum acceptable failure rate? This value represents your judgment of the

lowest percentage of failing candidates that could be tolerated.

Round 1: ______ Round 2: ______


APPENDIX G: Part I Standard Setting Fall 2014 – Post-Session Survey Summary

1. Which panel did you participate in? (Select ONE)

Response Chart Percentage Count

Panel 1 (University room) 53% 9

Panel 2 (Barr/Bérard room) 47% 8

Total responses 17

2. What was your impression of the clarity of the information regarding the overview of

the MCCQE Part I exam that was provided on the morning of Day 1? (Select ONE)

Response Chart Percentage Count

Excellent 65% 11

Very good 18% 3

Good 18% 3

Fair 0% 0

Poor 0% 0

Total responses 17

3. What was your impression of the clarity of the information regarding the overview of

standard setting that was provided on the morning of Day 1? (Select ONE)

Response Chart Percentage Count

Excellent 65% 11

Very good 29% 5

Good 6% 1

Fair 0% 0

Poor 0% 0

Total responses 17

4. What was your impression of the clarity of the information regarding the overview of

the Bookmark Method that was provided on the morning of Day 1? (Select ONE)

Response Chart Percentage Count

Excellent 53% 9

Very good 41% 7

Good 6% 1

Fair 0% 0

Poor 0% 0

Total Responses 17


5. How clear were you about the description of the “Minimally Competent” (or sometimes

called “Borderline”) candidate on the MCCQE Part I exam as you began the task of

setting a passing score following the training on the afternoon of Day 1? (Select ONE)

Response Chart Percentage Count

Very clear 24% 4

Clear 76% 13

Somewhat clear 0% 0

Not clear 0% 0

Total Responses 17

6. How clear were you about the description of the “Minimally Competent” (or sometimes

called “Borderline”) candidate on the MCCQE Part I exam as you began the task of

setting a passing score following the training on the afternoon of Day 1? (Select ONE)

Response Chart Percentage Count

Yes, very helpful 47% 8

Yes, helpful 47% 8

Yes, somewhat helpful 6% 1

Not helpful at all 0% 0

Total Responses 17

7. How would you judge the length of time spent (about 45 minutes on the agenda) on the

afternoon of Day 1 introducing, discussing and editing the definition of the “Minimally

Competent” or “Borderline” candidate? (Select ONE)

Response Chart Percentage Count

About right 82% 14

Too little time 6% 1

Too much time 12% 2

Total Responses 17

8. What is your impression of the practice session for applying the Bookmark Method to a

set of MCQs and CDM questions on the afternoon of Day 1? (Select ONE)

Response Chart Percentage Count

Appropriate 94% 16

Somewhat appropriate 6% 1

Not appropriate 0% 0

Total Responses 17


9. What is your overall evaluation of the training that was provided for setting a passing

score using the Bookmark Method? (Select ONE)

Response Chart Percentage Count

Excellent 71% 12

Very good 18% 3

Good 12% 2

Fair 0% 0

Poor 0% 0

Total Responses 17

10. What factors influenced your placement of your Bookmark on day 2? (Select ALL

choices that apply)

Response                                                              Percentage   Count

The description of the minimally competent or borderline candidate       88%        15

My perception of the difficulty of the test items                        94%        16

The test item statistics                                                 76%        13

Other panelists during the discussion                                    53%         9

My experience in the field                                               41%         7

Knowledge and skills measured by the test items                          76%        13

The quality of the distractors to the test items                         12%         2

The number of answer choices to the test items                           18%         3

Other (please specify)                                                    0%         0

Total Responses                                                                      17

11. How did you feel about participating in the group discussions regarding the ordered

item booklet? (Select ONE)

Response Chart Percentage Count

Very comfortable 82% 14

Somewhat comfortable 18% 3

Unsure 0% 0

Somewhat uncomfortable 0% 0

Very uncomfortable 0% 0

Total Responses 17


12. How would you rate your understanding of how to apply the Bookmark Method during

the marking round 1 on Day 2? (Select ONE)

Response Chart Percentage Count

I understood very well 65% 11

I understood 29% 5

I understood somewhat 6% 1

I did not understand at all 0% 0

Total Responses 17

13. How comfortable were you in applying the Bookmark Method during marking round 1

on Day 2? (Select ONE)

Response Chart Percentage Count

Very comfortable 53% 9

Somewhat comfortable 35% 6

Unsure 12% 2

Somewhat uncomfortable 0% 0

Very uncomfortable 0% 0

Total Responses 17

14. How would you rate your understanding of how to apply the Hofstee Method during marking round 1 on Day 2? (Select ONE)

Response Chart Percentage Count

I understood very well 53% 9

I understood 35% 6

I understood somewhat 12% 2

I did not understand at all 0% 0

Total Responses 17

15. How comfortable were you in applying the Hofstee Method during marking round 1 on Day 2?

(Select ONE)

Response Chart Percentage Count

Very comfortable 47% 8

Somewhat comfortable 41% 7

Unsure 6% 1

Somewhat uncomfortable 6% 1

Very uncomfortable 0% 0

Total Responses 17


16. What level of confidence do you have that the consequential/feedback data and final

discussion this afternoon helped the panel arrive at a defensible passing score?

(Select ONE)

Response Chart Percentage Count

Very confident 53% 9

Confident 35% 6

Somewhat confident 12% 2

Not at all confident 0% 0

Total Responses 17

17. What level of confidence do you have in the final recommended passing score?

(Select ONE)

Response Chart Percentage Count

Very confident 76% 13

Confident 18% 3

Somewhat confident 6% 1

Not at all confident 0% 0

Total Responses 17

18. How could the method used for setting a passing score on the MCCQE Part I exam have

been improved?

# Response

1. The process as executed was excellent.

2. no

3. I think it took a little while to grasp the concept of minimally competent & hence the book mark but became very clear after the initial exercise

4. I think that people are pushed to change their scores after the first session on day 2. The bias was to increase the passing score on the second round because of the large disparity in panels.

5. This is my first time doing this exercise, so I do not have previous experience for comparison. Having said that, I don't feel there was nothing to improve.

6. it would have been valuable after the practice bookmark to provide the data including the impact information and graphical spread, as we had done after round 1 on day 2.

7. I think the discussions were excellent!

8. no improvement needed - there was lots of time for discussion which I think was important

9. Not sure; I thought the process went well as it is.

10. Develop the list of competencies from the onset of the exercise.


11. the teaching, preparation, and handling of questions were all excellent. there was some confusion among participants as to whether they should discuss with others or not, especially during round I. Given the discussion that ensued after the impact statistics were shown, I wonder about including that on the practice day so desensitize people to this aspect.

12. As suggested at the time, letting us know immediately what failure rate would result from with our individual bookmarks would be helpful.

