Evaluating Outcomes
Marshall (Mark) Smith, MD, PhD
Director, Simulation and Innovation
Banner Health
Today's lecture is adapted from a presentation given at the last Laerdal SUN meeting by Geoffrey T. Miller; some of today's slides are used or modified from that presentation.
Geoffrey T. Miller
Associate Director, Research and Curriculum Development
Division of Prehospital and Emergency Healthcare
Gordon Center for Research in Medical Education
University of Miami Miller School of Medicine
Acknowledgement and Appreciation
The wisdom of a group is greater than that of a few experts…
Session Aims
• Discuss the importance and use of outcomes evaluation and challenges to traditional assessments
• Discuss some learning models that facilitate developing assessments
• Discuss the importance of validity, reliability, and feasibility as they relate to assessment
• Discuss types of assessments and their application in healthcare education
Evaluation
• Systematic determination of merit, worth, and
significance of something or someone using
criteria against a set of standards.
– Wikipedia, 2009
Assessment
• Educational assessment is the process of documenting, usually in measurable terms, knowledge, skills, attitudes, and beliefs. Assessment can focus on the individual learner, …
– Wikipedia, 2009
Assessment vs Evaluation
• Assessment is about the progress and
achievements of the individual learners
• Evaluation is about the learning program as a
whole
– Tovey, 1997
So why measure anything?
Why not just teach?
Measurement
• What is measured, improves
• You Can't Manage What You Don't Measure
• You tend to improve what you measure
Measurement
• Promotes learning
• Allows evaluation of individuals and learning
programs
• Basis of outcomes- or competency-based education
• Documentation of competencies
Measurement in the Future
• Credentialing
• Privileging
• Licensure
• Board certification
• High-stakes assessments for practitioners
All involve assessment of competence
What are the challenges today
of traditional methods of
measurement/assessment
for healthcare providers?
Challenges in traditional assessments
• Using actual (sick) patients for evaluation of skills
– Cannot predict nor schedule clinical training events
– Compromise of quality of patient care, safety
– Privacy concerns
– Patient modesty
– Cultural issues
– Prolongation of care (longer procedures, etc.)
Challenges in traditional assessments
• Challenges with other models
– Cadaveric tissue models
– Animal models / labs
Challenges in traditional assessments
• Feasibility issues for large-scale examinations
• Standardization and perceived fairness issues in high-stakes settings
• Standardized patients (SPs) improve reliability, but
validity issues exist: cannot mimic many physical
findings
Challenges in traditional assessments
• Wide range of clinical problems, including
rare and critical events
• Availability
• Financial cost
• Adequate resources
• Reliability, validity, feasibility
Kirkpatrick's Four Levels of Evaluation
• Reaction
• Learning
• Performance
• Results
Kirkpatrick's Four Levels of Evaluation
1. Reaction
• Measures only one thing: the learners' perception
• Not indicative of skills or performance
• Success at this level is critical to the success of the program
• Relevance to the learner is important
Kirkpatrick's Four Levels of Evaluation
2. Learning
• This is where the learner changes
• Requires pre- and post-testing (see the sketch below)
• Evaluation at this step is through learner assessment
• First level to measure change in the learner!
Kirkpatrick's Four Levels of Evaluation
3. Performance (Behavior)
• Action that is performed
• Consequence of behavior is performance
• Traditionally involves measurement in the workplace
• Transfer of learning from the classroom to the work environment
Kirkpatrick's Four Levels of Evaluation
4. Results
• Clinical and quality outcomes
• Difficult to measure in healthcare
• Perhaps better in team training
• Often the ROI that management wants
Kirkpatrick's Four Levels of Evaluation
• Reaction
• Learning
• Performance
• Results
1. Increasing complexity
2. Increasing difficulty to measure; more time-consuming
3. Increasing value!
Kirkpatrick's Four Levels of Evaluation
• Reaction
• Learning
• Performance
• Results
How do we start to develop outcome measurements?
Learning Outcomes
Projected Outcomes
Course Objectives
Goals
Expected Outcomes
Development of Curricula
• Analysis – clearly define and clarify desired outcomes*
• Design
• Development
• Implementation
• Evaluation
ADDIE
Defining Assessments
• Outcomes are general, objectives are specific
and support outcomes
• If objectives are clearly defined and written,
questions and assessments nearly write
themselves
Defining Outcomes
• Learners are more likely to achieve competency and
mastery of skills if the outcomes are well defined and
appropriate for the level of skill training
• Define clear benchmarks for learners to achieve
• Clear goals with tangible, measurable objectives
• Start with the end goal and the assessment metrics in mind; the content will then begin to develop itself
Role of Assessment in Curricula Design
Course → teaching and learning; assessment and evaluation
Refine → learner and course outcomes; modify curricula/assessments
Course → teaching and learning; assess learners
Use of assessments in healthcare simulation
Information → Demonstration → Practice → Measurement → Diagnosis → Feedback → Remediation
Rosen MA, et al. Measuring team performance in simulation-based training: adopting best practices for healthcare. Simulation in Healthcare 2008;3:33–41.
Preparing assessments
• What should be assessed?
– Any part of the curriculum considered essential and/or having significant designated teaching time
– Should be consistent with learning outcomes that
are established as the competencies for learners
– Consider weighted assessments
Clinical competence and performance
• Competent performance requires acquisition of basic
knowledge, skills & attitudes
• Competence =
– Application of specific KSAs
• Performance =
– Translation of competence into action
Three Types of Learning (Learning Domains)
Bloom's Taxonomy
• Cognitive: mental skills (Knowledge)
• Psychomotor: manual or physical skills (Skills)
• Affective: growth in feelings or emotional areas
(Attitude)
Three Types of Learning
Bloom's Taxonomy
• Cognitive = Knowledge (K)
• Psychomotor = Skills (S)
• Affective = Attitude (A)
Bloom’s Taxonomy – Knowledge
The Anti-Blooms…
Bloom’s Taxonomy – Skills
• Bloom’s committee did not propose a model for the psychomotor domain, but others have since.
Bloom’s Taxonomy – Attitude
Five Major Categories
• Receiving phenomena
• Responding to phenomena
• Valuing
• Organization
• Internalizing values
Knowledge Competencies
Knowledge (cognitive):
• Recall (factual)
• Comprehension
• Application
• Analysis
• Synthesis
• Evaluation
Skill Competencies
Skills:
• Communication
• Physical exam
• Procedures
• Informatics
• Self-learning
• Time management
• Problem solving
Attitude Competencies
Attitudes:
• Behavior
• Teamwork
• Professionalism
• Key personal qualities
• Motivation
Continuous process
Knowledge • Skills • Attitudes
Possible Outcome Competencies (GME-Based)
• Patient care
• Medical knowledge
• Practice-based learning and improvement
• Interpersonal and communication skills
• Professionalism
• Systems-based practice
So we know what we
want to measure, but
how do we do that?
Miller’s Pyramid of Competence
Norcini JJ. BMJ 2003;326:753–755. © 2003 BMJ Publishing Group Ltd.
Miller’s Pyramid
• The top two cells of the pyramid, in the domain of action (performance), reflect clinical reality
• The professionalism and motivation required to apply these continuously in the real setting must be observed during actual patient care.
Miller’s Pyramid
• The top two levels are the most difficult to measure
• Quality of assessment in the clinical setting lags far behind
• In-Training Evaluation Reports and Likert scales predominate
• These have little value as formative (feedback) instruments that might contribute to the learner’s education
Miller’s Pyramid of Competence for learning and assessment
Miller GE. The assessment of clinical skills/competence/performance. Academic Medicine 1990;65(9):S63–S67.
• Does (behavior)
• Shows (behavior)
• Knows How (cognition)
• Knows (cognition)
Teaching and Learning – “Knows”
Learning opportunities:
• Reading / independent study
• Lecture
• Computer-based
• Colleagues / peers
Assessment of “Knows”: factual tests
Assessment Tools for “Knows”
• Multiple Choice Questions (MCQs)
• Short Answer
• True / False
• Matching (extended)
• Constructed Response Questions
Teaching and Learning – “Knows How”
Learning opportunities:
• Problem-based exercises
• Tabletop exercises
• Direct observation
• Mentors
Assessment of “Knows How”: clinical context-based tests
The Tools of “Knows How”
• Multiple-choice questions
• Essay
• Short answer
• Oral interview
Teaching and Learning – “Shows”
Learning opportunities:
• Skill-based exercises
• Repetitive practice
• Small group
• Role playing
Assessment of “Shows”: performance assessment
The Tools of “Shows”
• Objective Structured Clinical Examination (OSCE)
• Standardized patient-based examination
Variables in Clinical Assessment
• Examiner
• Patient
• Student
CONTROL VARIABLES as much as possible…
Teaching and Learning – “Does”
Learning opportunity:
• Experience
Assessment of “Does”: performance assessment
The Tools of “Does”
• Undercover / stealth / incognito standardized patient-based examination
• Videos of performance
• Learner portfolio
• Patient satisfaction surveys
Influences on clinical performance
Cambridge Model for delineating performance and competence: what a clinician actually does (“Does” = performance) builds on competence, filtered through system-related and individual-related influences.
Rethans JJ, et al. The relationship between competence and performance: implications for assessing practice performance. Medical Education 2002;36:901–909.
Reliability and Validity
Assessments - Reliability
• Does the test consistently measure what it is
supposed to be measuring?
– Types of reliability:
• Inter-rater (consistency over raters)
• Test-retest (consistency over time)
• Internal consistency (over different items/forms)
Inter-rater Reliability
• Multiple judges code independently using the same criteria
• Reliability = raters code the same observations into the same classification (a computational sketch follows below)
• Examples:
– Medical record reviews
– Clinical skills
– Oral examinations
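As a concrete sketch of "raters code the same observations into the same classification", the snippet below computes raw agreement between two raters and Cohen's kappa (agreement corrected for chance). The kappa statistic and the ratings are illustrative assumptions, not part of the lecture.

```python
# A minimal sketch of inter-rater reliability for two raters scoring the
# same ten checklist items. Cohen's kappa corrects raw agreement for the
# agreement expected by chance. Ratings are hypothetical.
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same category.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)

rater_a = ["P", "P", "F", "P", "P", "F", "P", "P", "F", "P"]
rater_b = ["P", "P", "F", "P", "F", "F", "P", "P", "P", "P"]
print(f"kappa = {cohens_kappa(rater_a, rater_b):.2f}")  # -> 0.47
```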
Factors Influencing Reliability
• Test length
– Longer tests give more reliable scores (see the sketch below)
• Group homogeneity
– The more heterogeneous the group, the higher the reliability
• Objectivity of scoring
– The more objective the scoring, the higher the reliability
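The test-length point can be quantified with the Spearman-Brown prophecy formula, which predicts reliability when a test is lengthened k-fold. The formula is standard psychometrics but is an addition here, and the numbers are illustrative.

```python
# Spearman-Brown prophecy formula (an addition, not from the slides):
# predicted reliability of a test lengthened by a factor k.

def spearman_brown(reliability, k):
    return k * reliability / (1 + (k - 1) * reliability)

# Doubling a test whose current reliability is 0.70:
print(f"predicted reliability = {spearman_brown(0.70, 2):.2f}")  # -> 0.82
```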
Assessments - Validity
• Are we measuring what we are supposed to be measuring?
• Use the appropriate instrument for the knowledge, skill, or attitude you are testing
• The major types of validity should be considered:
– Face
– Content
– Construct
Validity is accuracy: the quality of the archer’s hits.
• Both archers are equally reliable
• Archer 1 hits the bull’s-eye every time (valid)
• Archer 2 hits the outer ring in the same spot every time (not valid)
Reliability and Validity
• Reliable and valid
• Reliable, not valid
• Not reliable, not valid
Improving reliability and validity
• Base assessment on outcomes/objectives: event triggers → observable behavior → behavioral rating → assessment
• Define:
– Low / medium / high performance
– Use of a rubric or rating metric
– Use (video) training examples of performance
– Employ a quality assurance/improvement system
Assessment types
Choose the appropriate
assessment method:
– Formative
– Summative
– Self
– Peer
Assessment
• Formative assessment
– Lower stakes
– One of several, over the time of a course or program
– May be evaluative, diagnostic, or prescriptive
– Often results in remediation or progression to the next level
• Summative assessment
– Higher stakes
– Generally the final assessment of a course or program
– Primary purpose is performance measurement
– Often results in a “Go / No-Go” outcome
Assessments - self
• Encourages responsibility for the learning process and fosters skill in judging whether work is of an acceptable standard; this improves performance.
• Most forms of assessment can be adapted to a self-assessment format (MCQs, OSCEs, and short answers)
• Students must be aware of standards required for competent performance.
Assessments - peer
• Enables learners to hone their ability to work with others and their professional insight
• Enables faculty to obtain a view of students they do not see
• An important part of peer assessment is for students to justify the marks they award to others
• Justification can also be used as a component when faculty evaluates attitudes and professionalism.
Assessments – Setting Standards
• Should be set to determine competence
• Enables certification to be documented, accountable, and defensible
• Appropriately set standards for an assessment will pass those students who are truly competent
• Standards should not be too low, passing those who are incompetent (false positives), nor too high, failing those who are competent (false negatives).
Assessments – Setting Standards
• Those responsible for setting standards must also have a direct role in teaching students at the level being examined and assist in providing examination material
Assessments – Setting Standards
• Standards should be set around a core curriculum that includes the knowledge, skills, and attitudes required of all students
• When setting a standard, the following should be considered (a scoring sketch follows below):
– What is assessed must reflect the core curriculum
– Students should be expected to reach a high standard in the core components of the curriculum (for instance, an 80–90% pass mark for the important core and 60–80% for the less important aspects)
– Students should be required to demonstrate mastery of the core in one phase of the curriculum before moving on to the next part
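A hypothetical sketch of the standard just described: separate pass marks for core and less important components, with mastery of the core required. The thresholds (chosen from the 80–90% and 60–80% bands mentioned above) and the scores are illustrative assumptions.

```python
# Hypothetical standard-setting check: core and non-core components each
# carry their own pass mark; mastery of the core is required regardless
# of performance elsewhere. Thresholds are illustrative.
CORE_PASS = 0.85     # within the 80-90% band for important core content
NONCORE_PASS = 0.70  # within the 60-80% band for less important aspects

def meets_standard(core_score, noncore_score):
    return core_score >= CORE_PASS and noncore_score >= NONCORE_PASS

print(meets_standard(0.90, 0.75))  # True
print(meets_standard(0.75, 0.95))  # False: core not mastered
```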
Assessments - Feasibility
• Is the administration and taking of the assessment instrument feasible in terms of time and resources?
• The following questions should be considered:
– How long will it take to construct the instrument?
– How much time will be involved in the scoring process?
– Will it be relatively easy to interpret the scores and produce the results?
– Is it practical in terms of organization?
– Can quality feedback result from the instrument?
– Will the instrument indicate to the students the important elements within the course?
– Will the assessment have a beneficial effect in terms of student motivation, good study habits, and positive career aspirations?
Practicality
• Number of students to be assessed
• Time available for the assessment
• Number of staff available
• Resources/equipment available
• Special accommodations
Assessment Instruments
Assessments - Instruments
• Be aware of the types of assessment
instruments available as well as the
advantages and disadvantages of each
• Use more than one assessment instrument
and more than one assessor if possible
when looking at skills and attitudes
Choosing appropriate assessment methods
• When choosing the assessment instrument,
the following should be answered:
– Is it valid?
– Is it reliable?
– Is it feasible?
Assessments – Knowledge Instruments
• Objective tests (short answer,
true/false, matching, multiple choice)
• Objective Structured Clinical Examinations (OSCEs)
• Constructed response questions
• Rating scales (used on clerkships)
Assessments – Skill Instruments
• Objective tests (simulation-based)
• OSCEs
• Constructed response questions
• Critical reading papers (interpreting literature)
• Checklists
• Rating scales
• Portfolios (self-evaluation, time management)
Weighted Checklists
• A list of items to measure
• A weight for each item
• A summary score (see the sketch below)
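A minimal sketch of the weighted checklist idea: each item carries a weight, and the summary score is the weighted fraction of items the rater marked as performed. The items and weights are hypothetical.

```python
# Weighted checklist scoring sketch. Items and weights are hypothetical;
# the summary score is the weighted fraction of items performed.
ITEMS = {
    "opens airway": 3,
    "checks breathing": 3,
    "verbalizes findings": 1,
}

def checklist_score(observed):
    """observed maps item name -> True/False as marked by the rater."""
    total = sum(ITEMS.values())
    earned = sum(w for item, w in ITEMS.items() if observed.get(item))
    return earned / total

marks = {"opens airway": True, "checks breathing": True}
print(f"score = {checklist_score(marks):.2f}")  # -> 0.86 (6 of 7 points)
```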
Assessments – Attitude Instruments
• Portfolios
• Essays / modified essay questions
• OSCEs
• Checklists
• Rating scales
• Patient management problems
• Short/long case assessments
Assessment Metrics
• Procedural or checklist assessment
• Global rating assessment
Assessment Metrics – Procedural or Checklist Assessment
Example (BCLS), in three variants:
• Simple checklist:
– Open airway (Y/N)
– Check breathing (Y/N)
• Checklist with timing criteria:
– Open airway (< 5 sec of LOC) (Y/N)
– Check breathing (< 5 sec of airway) (Y/N)
• Checklist with rating score, each item marked Y (+1), N (-1), or A (0, *Assist):
– Open airway
– Check breathing
Assessment Metrics – Global Rating Assessment
Example (Code Blue), in three variants (see the sketch below):
• Pass/Fail:
– CPR and ACLS (P/F)
• Point scale:
– CPR: 1 (low) to 5 (high) points
– ACLS: 1 (low) to 5 (high) points
• High/Medium/Low rating, scored +1 / 0 / -1 points:
– CPR
– ACLS
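To tie the point-scale variant together, a small sketch of a global rating: each rater scores each dimension 1 (low) to 5 (high), and the summary is the mean across raters and dimensions. The scheme and numbers are illustrative, not prescribed by the slides.

```python
# Global rating sketch for a Code Blue scenario: each rater scores each
# dimension 1 (low) to 5 (high); the summary is the mean of rater means.
from statistics import mean

def global_rating(ratings):
    """ratings: one dict per rater, mapping dimension -> 1..5 score."""
    return mean(mean(r.values()) for r in ratings)

raters = [
    {"CPR": 4, "ACLS": 3},
    {"CPR": 5, "ACLS": 4},
]
print(f"global rating = {global_rating(raters):.2f}")  # -> 4.00
```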
Review
• Assessment drives learning
• Clearly define the desired outcome; ensure that it can be measured
• Consider the effectiveness of the measurement
• Feedback to individual candidates
• Feedback to training programs
Questions and discussion
Good Luck
Competency
Competency is noted when a learner is observed performing a task or function that has been established as a standard by the profession. The achievement of professional competency requires the articulation of learning objectives as observable, measurable outcomes for a specific level of learner performance. Such specific detailing of performance expectations defines educational competencies. They are verified on the basis of evidence documenting learner achievement, and must be clearly communicated to learners, faculty, and institutional leaders prior to assessment.
____________
Identified by members of the work group on competency-based women’s health education at the APGO Interdisciplinary Women’s Health Education Conference, September 1996, Chantilly, VA.