Post on 27-Jun-2020
Clerkship Assessments: Standardization, Subjectivity, and “Science”
Matt Fitz, MD
November 7, 2019
• I have no conflicts of interest
Today’s Objectives
1. Differentiate the role that NBME exams may have in predicting USMLE exams with restructuring of the preclinical & clinical years
2. Identify limitations of NBME subject exam assessments
3. Discuss additional and different options for assessment in the clinical setting
Where does this fit?
• Assessment
– Multiple-Choice Exams (& the NBME)
– Clinical Performance Evaluation / EPAs
– OSCE
– Professionalism
– Curriculum
– Simulation / Clinical Skills
NBME exams… not this again
• Potential applications to your curricular change: Timing
– Impact of shorter preclinical time
– Impact of Step 1
– How many NBME exams are necessary?
30 – 50% “uncovered” material?
Our study
1. We wanted to look at how different IM Clerkship characteristics would affect the NBME IM Subject Exam
2. And what about the impact of earlier clinical starts?
3. We also wanted to look at how different IM Clerkship characteristics and NBME usage would affect Step 2
4. Controlled for Step 1
The Schools
• 60 schools (IRB exemption / approval)
• >20,000 examinees from 2011-2014
• Examinee data was cross-referenced by NBME exam date and school reported testing times
• IM Clerkships were highly variable
• 8 week (40%) v 12 week (40%) clerkships served as the framework for analysis
What were these characteristics?
• Length
• Combined
• Ambulatory
• P/F
• Honors
• Study Day(s)
• Didactic Hours
• # of NBME subject exams
• Preclinical curriculum
– Hybrid
– Traditional
– Organ-based
Two additional notes…
• Longitudinal students (3 schools)
• Non-traditional academic year start date schools*
NBME exam performance: July start dates
Note: N = 14,667 examinees from K = 37 Medical Schools. These results also control for examinees’ Step 1 Score
Table columns: Construct; Comparison; Mean Difference; 95% CI (Lower, Upper); Adj. p
NBME exam performance: May start dates
Note: N = 4,211 examinees from K = 10 Medical Schools. These results also control for examinees’ Step 1 Score
Table columns: Construct; Comparison; Mean Difference; 95% CI (Lower, Upper); Adj. p
Impact on USMLE Step 2 July academic start
Note: N = 13,486 non-longitudinal examinees from K = 36 Medical Schools. These results also control for examinees’ Step 1 Score.
Impact on USMLE Step 2 May academic start
Correlation Coefficients and RSQ
Clerkship        Step 1        RSQ    Step 2        RSQ    Exam Assessment
Neurology        0.62 - 0.72   ~.45   0.59 - 0.74   ~.5    Department
Medicine         0.64 - 0.72          0.65 - 0.69          Department
Psychiatry       0.57 - 0.64          0.59 - 0.68          Department
Surgery          0.59 - 0.68   ~.45   0.64 - 0.7    ~.5    NBME
Ob/Gyn           0.56 - 0.62          0.6 - 0.65           NBME
Pediatrics       0.57 - 0.62   ~.4    0.61 - 0.71   ~.45   Aquifer
Family Medicine  0.51 - 0.61          0.54 - 0.59          Aquifer
Collective R² ~.75
At Brown (2011–2014), the collective R² was .8
At USUHS (2009-2010), the collective R² was .77
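For a single linear predictor, the share of variance explained is the square of the correlation coefficient. As a rough reading aid, the sketch below squares a few of the Step 1 correlation ranges from the table above. The function name is illustrative, and the slides do not say how the RSQ figures were derived, so an exact match with the reported values should not be expected.

```python
def r_squared_range(r_low, r_high):
    """Square both ends of a non-negative correlation range (rounded to 2 places)."""
    return round(r_low ** 2, 2), round(r_high ** 2, 2)


# Step 1 correlation ranges copied from the table above
step1_ranges = {
    "Neurology": (0.62, 0.72),
    "Surgery": (0.59, 0.68),
    "Pediatrics": (0.57, 0.62),
}

for clerkship, (r_low, r_high) in step1_ranges.items():
    # Prints the clerkship name and the implied range of variance explained
    print(clerkship, r_squared_range(r_low, r_high))
```

Squaring is only this simple for one predictor at a time; the collective R² combines several clerkship exams and cannot be obtained by squaring or summing the individual values.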
Discoveries
1. Clerkship length has a significant impact on NBME and Step 2 scores for schools with non-traditional clerkship dates, but not for the traditional schools
2. Other clerkship variables have very little significant impact
3. Longitudinal students score similarly, if not better, when compared to their peers
4. Increasing the number of NBME exams does not have a significant impact on Step 2 scores
A Few Ongoing Questions?
• What is the real impact of Step 1?
• What does this mean for preclinical and clinical integration? … if anything
• Why do the non-traditional schools demonstrate differences?
– And what may be the impact for us
• Will there be impact on Step 1 with NBME & clerkship variability as schools become more “non-traditional?”
Questions / Discussion?
                                                             95% Confidence Interval
Measure                                 Valid N  Odds Ratio  Lower    Upper    p
First Clinical Performance (CP1)        797      1.088       1.038    1.141    .001
Weighted CP1                            796      1.275       1.073    1.515    .01
Second Clinical Performance (CP2)       797      1.064       1.015    1.115    .01
Weighted CP2                            797      1.238       1.039    1.476    .02
Outpatient Clinical Performance (OCP)   797      1.089       1.038    1.143    .001
Weighted OCP                            797      0.968       0.603    1.554    .55
NBME score                              546      1.010       0.975    1.047    .58
Weighted NBME Score                     797      1.100       0.889    1.360    .38
OSCE score                              795      1.027       0.974    1.082    .33
Weighted OSCE score                     692      1.392       0.619    3.128    .42
MCQ Exam Score                          146      0.994       0.931    1.060    .85
Exam Free Text Score                    143      1.027       0.943    1.119    .54
Exam Score                              651      1.024       0.980    1.069    .29
Weighted Exam Score                     797      1.113       0.828    1.496    .48
Overall Clerkship Score                 797      1.088       1.021    1.159    .01
Final Grade (overall significance)      778                                    .052
High Pass versus Pass                            2.451       1.165    5.156    .02
Honors versus Pass                               2.263       0.909    5.635    .08
Honors versus High Pass                          0.923       0.409    2.086    .85
Table 1A: Odds of chief resident selection
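One way to read the table: an odds ratio is distinguishable from "no association" at the 5% level when its 95% confidence interval excludes 1. The sketch below applies that check and also back-calculates an approximate two-sided p-value from the interval. It assumes Wald-type intervals that are symmetric on the log-odds scale; the slides do not state how the intervals or p-values were computed, so the function names and the method are illustrative only.

```python
import math


def ci_excludes_one(lower, upper):
    """True when a confidence interval for an odds ratio lies entirely on one side of 1."""
    return lower > 1.0 or upper < 1.0


def approx_p_from_ci(odds_ratio, lower, upper, z_crit=1.96):
    """Approximate two-sided p-value, ASSUMING a Wald interval on the log-odds scale."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2))


# (odds ratio, lower, upper) copied from two rows of Table 1A
rows = {
    "First Clinical Performance (CP1)": (1.088, 1.038, 1.141),
    "NBME score": (1.010, 0.975, 1.047),
}

for name, (odds_ratio, lower, upper) in rows.items():
    print(name, ci_excludes_one(lower, upper), approx_p_from_ci(odds_ratio, lower, upper))
```

Under that assumption, the clinical performance row comes out well below .05 and the NBME row well above it, which is the contrast the table draws.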
External Validity – High Performers
Internal & External Validity – Low Performers
                                             95% Confidence Interval
Medicine vs Surgery   Valid N   Odds Ratio   Lower     Upper     p
2012-2013             144       6.771        2.822     16.248    <.001
2013-2014             136       2.716        1.313     5.617     .01
2014-2015             155       5.073        2.062     12.481    <.001
2015-2016             154       3.380        1.495     7.645     .004
All Years             586       4.117        2.758     6.145     <.001
Table 2: Odds of recording lower-level CPE responses, medicine versus surgery
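For readers less familiar with odds ratios, the sketch below shows the simplest case: an odds ratio and Wald 95% interval computed from a 2x2 table of counts. The counts are hypothetical and are not taken from the study, and the study's own estimates may well come from a different model (the slides do not say), so this only illustrates what the statistic means.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z_crit=1.96):
    """Odds ratio and Wald CI from a 2x2 table.

    a, b = group 1 with / without the outcome
    c, d = group 2 with / without the outcome
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z_crit * se_log)
    upper = math.exp(math.log(odds_ratio) + z_crit * se_log)
    return odds_ratio, lower, upper


# HYPOTHETICAL counts: lower-level marks recorded vs. not, in two clerkships
odds_ratio, lower, upper = odds_ratio_with_ci(20, 80, 10, 90)
print(odds_ratio, lower, upper)
```

With so few low-level marks overall, intervals computed this way are wide, which is consistent with the spread of the yearly intervals in Table 2.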
Overall numbers are quite low; for Medicine, the low performers identified accounted for 2.5% of the total evaluation marks.
Surrogate Markers – Word Count
            Medicine                        Surgery
            Median  IQR Lower  IQR Upper    Median  IQR Lower  IQR Upper
2012-2013   54      39         73           16      3          30
2013-2014   61      45         77           18      3          36
2014-2015   70      52         87           25      10         44
2015-2016   69      50         82           25      13         41
All Years   64      46         80           22      7          38
Summary statistics for word counts
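A median and interquartile range like those above can be computed from raw comment word counts as sketched below. The word counts are hypothetical, the function name is illustrative, and the quartile convention (Python's default "exclusive" method) is an assumption; the slides do not say which convention was used.

```python
import statistics


def median_and_iqr(word_counts):
    """Return (median, lower quartile, upper quartile) for a list of word counts."""
    q1, _, q3 = statistics.quantiles(word_counts, n=4)
    return statistics.median(word_counts), q1, q3


# HYPOTHETICAL word counts for seven narrative evaluations
hypothetical_counts = [3, 10, 16, 22, 30, 38, 54]
print(median_and_iqr(hypothetical_counts))
```

Medians and quartiles are a reasonable choice here because comment lengths tend to be skewed: a few very long evaluations would pull a mean upward.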
Surrogate Markers – Unique Boxes
                                             95% Confidence Interval
Medicine vs Surgery   Valid N   Odds Ratio   Lower     Upper     p
2012-2013             144       19.963       12.451    32.006    <.001
2013-2014             136       1.067        0.753     1.511     .71
2014-2015             155       0.932        0.687     1.264     .65
2015-2016             154       1.623        1.186     2.221     .003
All Years             586       1.917        1.633     2.249     <.001
Table 5A: Odds of more unique boxes checked, medicine v surgery
Surrogate Markers – Unobserved Behaviors:
Generally, students in the Medicine clerkship were more likely to have unobserved behaviors recorded than students in the Surgery clerkship
Pro: Faculty and staff are encouraged NOT to blindly assess every competency if they did not have enough observations of a particular competency (Procedures)
Con: Or does this actually reflect that the teams were not observing behaviors, which would raise the question of the overall validity of all the other competencies evaluated?