
Richard Fuller

Leeds Institute of Medical Education

AIMING FOR A NEW HORIZON?

‘Assessment personalization in an era of massification…’

Massification (n)

Globalisation and internationalisation of education

Educational development applied to mass audience

Application of new ‘technologies’ and uniformity of practice / student experience (e.g. the OSCE)

The growth of organised, standardised testing

IMPACT

• Growth of more organised models of testing – e.g. schedules of assessment

• Constructive alignment and blueprinting

• Holistic assessment – knowledge, clinical skills and (eventually) professionalism

• A focus on quality….

1. JUST FOCUS ON THE LATEST IN FASHION TOOL

2. PUT PROFESSIONALISM IN A DIFFERENT TALK/WORKSHOP….

Unprofessionalism/lack of professionalism

• Critical incidents?

• Markers/traits & Consequences (Papadakis)?

• Lapses (Ginsburg)?

• Identity (Cruess)?

• Faculty (Steinert)?

Or, Meritorious behaviours – care, compassion, team working?

3. MAKE IT EASY TO MARK

• What is easy to assess (and mark) does not equate with importance and value

• SBAs too often ‘knowledge’ only - bias is away from challenging, realistic tasks

• Can we (re)conceptualise the challenges and benefits of computer based assessment in delivering better tests?

4. MAKING ASSESSMENT DECISIONS IS EASY…

Assessors Learners

Institutions Remediation

What am I testing?

‘Double Duty’ - Teacher - Professional / Assessor

Patients / Safety

Identity?

Resilience?

Capacity for improvement

Readiness to join profession?

Motivation

Culture

Reputation

Standards

Tools

Systems

Effective?

Alternatives / Exits?

Boud, Studies in Continuing Education 2000

Yepes-Rios et al, BEME Guide 42, Medical Teacher 2016

5. EMBRACE THAT RELIABILITY ‘TYRANNY’

6. THEN CONTINUE THAT GRADE / TARGET DRIVEN CULTURE….

• ‘Grey and fuzzy’ (Yorke)

• ‘Transactional Currency’ (Sadler)

• ‘Value-laden, inherently unstable process reactive to complex and changing interactions within an assessment’ (Hodges)

• Conclusion:

• Within any individual ‘test’, difficulty at the pass/fail boundary remains (Sadler)

• ‘Let’s stop the pretence of consistent marking’ (Bloxham)

Yorke 2011

Sadler 2010

Bloxham et al 2016

GLOBAL?

REALITY?

30 end of block MCQ tests

5 projects

8 skills portfolios

12 end of block OSCEs

15 professionalism surveys

20 end of placement evaluations

Endless WBA and compulsory reflective exercises

Limited (efficacy) feedback….

FEEDBACK…….

• ‘Grade Justification’ (Sadler)

• ‘Hopefully useful information’ (Boud)

• Need to overcome cultural ‘history’

– Feedback to pass the test

– Feedback to get a better grade

IMPACT ON LEARNERS AND TEACHERS?

• Grades are arbitrary and can be subject to inflation

• (Bad) Assessment does not drive meaningful learning

• Over-tested students are:

– Disjointed - ‘learn & forget’

– Disengaged - shallow, ‘strategic’ learners that focus only on ‘what they think can be tested’

– Dissatisfied - ‘that’s not why I came to medical school’

CHANGING ASSESSMENT’S HORIZONS?

Using assessment data ‘intelligently’ (Programmatic Assessment)

- e.g. McMaster Master WBA programme for postgraduate training

Milestones and ‘Growth’

- e.g. US Milestones programme, GMC General Professional Capabilities

‘Sustainable Assessment revisited’ (Boud & Soler 2016)

- Assessment to focus on learning beyond the timescale of the (given) course..

CAN WE RECONCEPTUALISE ASSESSMENT IN MEDICAL EDUCATION?

• Can we rebalance our focus between high stakes decision making and psychometric analysis on the one hand, and the consequences for learning and instruction on the other?

• Move to a personalised model of assessment that is more….

• Compassionate?

CASE 1: EASING TRANSITION SHOCK

• Entry to Medical School – grappling with being a new starter (and existing ‘habits’) yet ‘hit’ with the summative test

– Transitions and Critically Intense Learning Period

• Cumulative/continuous testing across much of Year 1 and Year 2

– Multi-stage class tests with debrief and feedback + alternate testing for those who struggle

– Weekly cases and continuous assessment opportunities

– Non Graded Pass

• Outcomes

– Happier students and staff

– Better test success in high stakes end of year exam

– Developing trust and ‘learning to learn’ effectively

Baker 2017

Kilminster et al Med Educ 2011

Velji. BSc Med Ed Leeds 2015

CASE 2: RITUALS AND RELAXING

• Should we sequester / quarantine students for large scale tests to reduce impact of cheating?

– Large scale psychometric evidence says ‘no’

– Continued student anxiety….

• Student research project looking at what students do in the run up to and time in between/after OSCEs

– Draws on sports and performance art

– ‘rituals and rites’ identified that help students manage high stress situations proximal to performance – many solo/in solitude

– Justification for not sequestering, but scope for further inter-disciplinary work on keeping assessment ‘compassionate’

McCourt et al Med Educ 2012

Lever. BSc Med Ed Leeds 2014





• Customised?

CASE STUDY 3: IMPROVE LARGE SCALE TEST DESIGN & DELIVERY

• Challenges in assessment

– Considerations of reliability >> Sampling >> item quality

– Large, expensive assessments

– Challenges of setting standards and accurate decision making

• Hypothesis

– Not enough focus on blueprinting and high quality items

– Overtesting of those we know will pass

– Under-assessment of those we are most concerned about

ADAPTIVE TESTING: THE SEQUENTIAL TEST (SQT)

• Shorter tests with ‘an adaptive stopping rule’

– Stronger (obviously competent) candidates receive a shorter ‘screening test’

– Candidates of concern receive longer or multiple sequence tests

• Designed primarily around effective blueprinting and effective sampling/testing of those of most concern

– Fairer to all stakeholders

– Better reliability (driven through more testing of the critical pass-fail/borderline group)

– Improved diagnostic accuracy over ‘traditional tests’

– More effective and cost efficient use of resources

Wainer & Feinberg, Significance 2015

Pell, Fuller, Homer (multiple)
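
The ‘adaptive stopping rule’ above can be illustrated with a minimal sketch. It is not the published SQT procedure: the function name `sequential_decision`, the margin of two standard errors of measurement, and the simple averaging of the two sequences are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    outcome: str        # "pass" or "fail"
    sequences_sat: int  # 1 = screening test only, 2 = brought back for sequence 2


def sequential_decision(seq1_score, pass_mark, sem, seq2_score=None, margin_sems=2.0):
    """Two-stage sequential test with a simple adaptive stopping rule.

    Hypothetical rule for illustration only: scores are percentages, `sem` is
    the standard error of measurement of sequence 1, and `margin_sems` is an
    assumed safety margin above the pass mark.
    """
    # Stop early only when the candidate is clearly above the pass mark.
    if seq1_score >= pass_mark + margin_sems * sem:
        return Decision("pass", 1)
    # Everyone else sits sequence 2; the decision then uses both sequences.
    if seq2_score is None:
        raise ValueError("candidate needs sequence 2 before a decision is made")
    combined = (seq1_score + seq2_score) / 2
    return Decision("pass" if combined >= pass_mark else "fail", 2)


# A clearly competent candidate stops after the screening test;
# a borderline candidate is tested further before any decision.
strong = sequential_decision(75, pass_mark=60, sem=3)
borderline = sequential_decision(63, pass_mark=60, sem=3, seq2_score=61)
```

The design point is that extra testing time goes to the borderline group, which is where a pass/fail decision is least certain.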

IMPACT OF REMEDIATION OF FAILING STUDENT

Change in percentile ranks following Y5 resit year (n=19)

• Median OSCE rankings improve by 30 percentile points

• Knowledge test: 20 percentile points

• Longitudinal ‘proportion of change’ has reduced in Year 5 over cohorts

• Engagement with remediation and ITA ‘better’

• Better acceptance of failure with S1/S2 vs single test and resit
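
A ‘change in percentile rank’ of the kind reported above could be computed along the following lines. This is a sketch under stated assumptions: the definition of percentile rank (share of the cohort scoring strictly lower) and the function names are illustrative, and the scores are made up, not the Leeds data.

```python
from statistics import median


def percentile_rank(score, cohort_scores):
    """Percentage of the cohort scoring strictly below `score` (one common definition)."""
    below = sum(1 for s in cohort_scores if s < score)
    return 100.0 * below / len(cohort_scores)


def median_rank_change(before, after, cohort_before, cohort_after):
    """Median change in percentile rank for the same students, listed in the same order."""
    changes = [
        percentile_rank(a, cohort_after) - percentile_rank(b, cohort_before)
        for b, a in zip(before, after)
    ]
    return median(changes)


# Made-up scores: two resitting students measured against a five-student cohort.
cohort = [40, 50, 60, 70, 80]
change = median_rank_change(before=[40, 50], after=[60, 70],
                            cohort_before=cohort, cohort_after=cohort)
```

Ranking against the cohort rather than comparing raw scores keeps the measure comparable when test difficulty differs between years.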

IMPACT ON THE ‘JUST PASSING / BORDERLINE STUDENT’

How do those who get brought back for S2 and progress (Y4 → Y5) do, compared with those who only do S1?

• Emerging results – OSCE

• Those who got brought back for S2 OSCE (only) improved their relative ranking a little over Y4 to Y5 (n=38; total cohort = 442).

• Those brought back for both OSCE and knowledge make a smaller improvement (n=20).

• Evidence from WBA monitoring that Seq 2 seems to ‘switch on’ better engagement (epiphany moments).

• Sustained result across 3 cohorts

[Figure: Median percentile change - OSCE, by group. S2 OSCE: 3.3; S2 Both (OSCE): 1.3; S1 OSCE: -1.6]





• Consequential?

CASE STUDY 4: (WORKPLACE) ASSESSMENT AS DIAGNOSIS?

• Blended learning curriculum

• Learner held assessment and content on smartphones and tablets across programme

• Assessment for Learning

• Feedback, reflection & student evaluation to learner and assessor

• Big data set and longitudinal impact: ~3000 Year 4 and 5 students (2011+), >50 000 individual workplace assessments

LEARNER ENGAGEMENT AND PROGRESSION

• 2011

– Emergent correlation between early, sustained WBA engagement and OSCE success (r = 0.327)

– Late onset, bare minimum approach (r = -0.25)

• 2015 - 7

– Sustained and growing correlation between engagement and success (r = 0.59; p<0.001). Strong link with SQT outcomes

• ‘Early Alert’ with differential ITA and support for those not engaging

– Worrying correlation between persistent disengagement and SQT outcomes (r = -0.6; p<0.001)

• Introduction of customised nudges for ‘at risk’ students in 2016-17

• Early results show good outcomes – nudge responders not in S2, better than ‘predicted’ performance

• Correlation now -0.32. What to do with the resisters?
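
The correlations quoted above are presumably Pearson coefficients between an engagement measure and an outcome score; the slides do not say how either was measured. A minimal sketch of that calculation, with hypothetical variable names and made-up numbers, might look like this:

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: number of WBAs completed early in the year and OSCE score (%).
wba_completed = [2, 5, 8, 12, 15]
osce_score = [55, 58, 66, 64, 72]
r = pearson_r(wba_completed, osce_score)
```

A coefficient like this describes association only; it does not show that engagement causes better OSCE results.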

IMPACT ON FEEDBACK

• Assessor comments

– ‘Efficient Rectal Exam’

• Student response

– ‘Thanks’

• It was very useful to have discussed patient management and discussion prior to seeing this patient.

• I was then able to use the skills taught to present and create a management plan.

• I need to practice patient management discussion but today's clinic has given me the foundation to do so

• You handled this well and were confident enough to take a patient centred and problem list approach including seeking his views about management planning.

• Your presentation and summary were much more fluid and confident and you made good use of practical tips to help manage info from the patient

• Good evidence of reasoning and discrimination and great feedback from the patient who felt confident to seek reassurance from you about his meds

• Next steps - routine use of these approaches in all your notes and build confidence in note taking whilst consulting rather than at the end.

Trends in who students are asking to assess…

Growth in feedback quantity, quality and student activity….

Strong correlation with self regulation and student success…

Conceptual framework of self-determination, learner identity and autonomous motivation

A RECONCEPTUALISED FUTURE? PERSONALISED ‘3C’ ASSESSMENT

• COMPASSIONATE

– Learning initiated assessment - A partnership between teachers and learners

– Well designed tests and programmes of assessment that are sensitive to learner and teacher ‘load’ and the needs of patients and wider society

• CUSTOMISED

– Wider scale adaptive test models that invest in supporting learners of different ability

– Intelligent use of existing structured assessment – and research driven innovation

• CONSEQUENTIAL

– Sensitive use of ‘big data’ (numbers and words) to generate individualised, meaningful feedback, actions and growth

– Focus on personalization of achievement & impact for all stakeholders

THANKS FOR YOUR ATTENTION

References, copies of slides and comments:

Professor Richard Fuller


@LeedsARG
