Computational Methods for Finding Patterns of Human and System ‘Failure’ in Mishap Reports. Chris Johnson, University of Glasgow, Scotland. http://www.dcs.gla.ac.uk/~johnson. UCD: 12th December 2003
Page 1: Computational Methods for Finding Patterns of Human and System ‘Failure’ in Mishap Reports Chris Johnson University of Glasgow, Scotland. johnson.

Computational Methods for Finding Patterns ofHuman and System ‘Failure’ in Mishap Reports

Chris Johnson

University of Glasgow, Scotland.http://www.dcs.gla.ac.uk/~johnson

UCD: 12th December 2003

Page 2:

A: Detection and Notification

B: Data gathering

C: Reconstruction

D: Analysis

E: Recommendations and Monitoring

F: Reporting and Exchange

Johnson, Le Galo and Blaize, European Incident Reporting Requirements in Air Traffic Management, EUROCONTROL, 2000.

Page 3:

[Chart: staff survey of incident risk perception; each question rated on a 0-10 scale from ‘bad’ to ‘good’:]

• Could the incident have been anticipated by risk managers?
• Could the incident have been anticipated by participants?
• How severe was the incident?
• How much is such an incident feared by staff?
• How confident are you in avoiding such incidents?
• How risky was the incident?
• How easy is it to control the outcome of such incidents?
• How visible was the incident?
• How much effort is necessary to avoid future incidents?
Page 4:

NASA safety managers complain that the Web Program Compliance Assurance and Status System is too cumbersome.

Personnel use the Lessons Learned Information System only on an ad hoc basis.

Hazard reports are rarely communicated effectively, nor are the databases used by engineers and managers capable of translating operational experiences into effective risk management practices. (CAIB, p.189)

“Centers and contractors used Problem Reporting and Corrective Action database differently, preventing comparisons across the database.”

Page 5:
Page 6:
Page 7:

• Probabilistic information retrieval:
 – avoids problem of codification;
 – but issues of precision and recall.

• Conversational case-based reasoning:
 – extended form of US Navy’s NACODAE system;
 – flexible precision & recall.

• Word sense disambiguation etc.
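The precision and recall trade-off can be made concrete with a toy retrieval loop. A minimal sketch, assuming invented report texts, relevance labels, and a made-up query (none of these are from the talk): reports are ranked by a crude term-overlap score, and precision/recall are measured at a cutoff.

```python
# Minimal sketch: rank mishap reports by term overlap with a query,
# then measure precision and recall at a cutoff k.
# The reports and "relevant" labels are invented for illustration.

def score(query, doc):
    """Crude relevance score: fraction of query terms present in the doc."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q)

reports = {
    "r1": "pilot entered wrong waypoint into the fmc",
    "r2": "maintenance crew replaced the fan to fix overheating",
    "r3": "first officer programmed the fmc with the wrong route",
    "r4": "weather poor on final approach",
}
relevant = {"r1", "r3"}          # hypothetical ground truth
query = "fmc programming error wrong route"

ranked = sorted(reports, key=lambda r: score(query, reports[r]), reverse=True)
k = 2
retrieved = set(ranked[:k])
precision = len(retrieved & relevant) / k          # of what we returned, how much was relevant
recall = len(retrieved & relevant) / len(relevant)  # of what was relevant, how much we returned
```

A probabilistic IR system replaces the crude `score` with a proper relevance model; the precision/recall measurement at the end is the standard one.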

Page 8:

FAA GAIN lacks computational support.

Someone must address this opportunity…

Meta-Level Concerns for Aerospace

Page 9:

Linda, JavaSpaces and Middleware for Incident Reporting

<A320, 12/12/2003, “ATC came through…”>

<B777, 1/12/2003, “On final approach…”>

< “Weather poor but …”>

<B737, “Maintenance failure on …”>

UK

US

Australia

<A320, “No clearance…”>

Concurrency and distribution
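The Linda/JavaSpaces model behind this slide can be sketched in a few lines: reporting sites write incident tuples into a shared space, and readers take the first tuple matching a template, with `None` standing in for a Linda formal (wildcard). This is an illustrative in-process sketch in Python, not the JavaSpaces API.

```python
# Toy tuple space in the Linda/JavaSpaces style: producers `write`
# incident tuples, consumers `take` the first tuple matching a template.
# None in a template field acts as a wildcard, like a formal in Linda.

class TupleSpace:
    def __init__(self):
        self.tuples = []

    def write(self, tup):
        self.tuples.append(tup)

    def take(self, template):
        """Remove and return the first tuple matching the template, or None."""
        for tup in self.tuples:
            if len(tup) == len(template) and all(
                t is None or t == v for t, v in zip(template, tup)
            ):
                self.tuples.remove(tup)
                return tup
        return None

space = TupleSpace()
space.write(("A320", "12/12/2003", "ATC came through..."))
space.write(("B777", "1/12/2003", "On final approach..."))

hit = space.take(("A320", None, None))   # any A320 report
```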

Page 10:

<A320, 12/12/2003, “ATC came through…”>

<B777, 1/12/2003, “On final approach…”>

< “Weather poor but …”>

<B737, “Maintenance failure on …”>

UK

US

Australia

<A320, ?, ?>

<A320, “No clearance…”>

Overloading of matching operators

<?, ?, match(CRM)>

Linda, JavaSpaces and Middleware for Incident Reporting
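Overloading the matching operator, as in the `<?, ?, match(CRM)>` template above, means a template field can be a predicate rather than a literal value. The sketch below is built on that assumption; the `match_crm` keyword test is an invented stand-in for a real classifier.

```python
# Sketch of an overloaded matching operator: template fields may be
# plain values, None (wildcard), or predicates that inspect the field.
# The CRM keyword list is illustrative, not a real classifier.

def matches(template, tup):
    if len(template) != len(tup):
        return False
    for t, v in zip(template, tup):
        if t is None:
            continue                 # wildcard: matches anything
        if callable(t):
            if not t(v):             # predicate field, e.g. match(CRM)
                return False
        elif t != v:                 # literal field: exact equality
            return False
    return True

def match_crm(narrative):
    """Crude stand-in for a CRM relevance test on the free-text field."""
    keywords = ("crew", "first officer", "communication", "workload")
    return any(k in narrative.lower() for k in keywords)

reports = [
    ("A320", "12/12/2003", "ATC came through..."),
    ("B777", "1/12/2003", "On final approach the First Officer..."),
]
crm_hits = [r for r in reports if matches((None, None, match_crm), r)]
```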

Page 11:

<A320, 12/12/2003, “ATC came through…”>

<B777, 1/12/2003, “On final approach…”>

< “Weather poor but …”>

<B737, “Maintenance failure on …”>

UK

US

Australia

<A320, ?, ?>

<A320, “No clearance…”>

Leases and persistence

<?, ?, match(CRM)>

Linda, JavaSpaces and Middleware for Incident Reporting
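Leases and persistence can be sketched the same way: each entry carries an expiry time, and lapsed entries are purged instead of living in the space forever. The lease durations below are invented for illustration.

```python
# Sketch of JavaSpaces-style leases: each entry is stored with an
# expiry time, and reads purge entries whose lease has lapsed.
import time

class LeasedSpace:
    def __init__(self):
        self.entries = []   # list of (expiry_time, tuple)

    def write(self, tup, lease_seconds):
        self.entries.append((time.time() + lease_seconds, tup))

    def read_all(self, now=None):
        """Drop lapsed entries, return the tuples still under lease."""
        now = time.time() if now is None else now
        self.entries = [(exp, t) for exp, t in self.entries if exp > now]
        return [t for _, t in self.entries]

space = LeasedSpace()
space.write(("A320", "No clearance..."), lease_seconds=3600)           # live
space.write(("B737", "Maintenance failure on ..."), lease_seconds=-1)  # already lapsed
live = space.read_all()
```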

Page 12:

So does the software say something new and useful?

Page 13:

Look, I’m not blaming you, I’m just suing you…

• Medical errors lead to:
 – 45,000-100,000 deaths (US);
 – compare road traffic accidents = 43,000, AIDS = 16,000.

• Additional care costs $15 billion:
 – 45% have some mishap;
 – 17% have a prolonged hospital stay.

Case Study 1: FDA Telemedicine

Page 14:

Courtesy: Univ. of Virginia, Office of Telemedicine

•SE Virginia medical centres:

1 nurse monitors system; 49 remote patients; 5 ICUs at 3 centres.

• Staff account for 50-80% of ICU budget.

Courtesy: NASA Telemedicine Instrumentation Pack project

Page 15:
Page 16:

[Entity diagram: structure of the FDA MAUDE / MDR database. Sections of the MDR report:]

A: MDR Report Identifier
B: Event Information
E: Professional Information
F: Distributor Information
G: Manufacturer Information
H: Device Information

Master Event Data File, Section A (MDR Report Identifier): MDR Report Key, MDR Event Key, Report Number, Source Code, Number of devices, Date received, Number of patients, Format Identifier, Event type.

Master Event Data File, Section G (Manufacturer Information): MDR Report Key, Manufacturer’s Name, Manufacturer’s Address, Source Type, Date Manufacturer received report.

Master Event Data File, Section H (Device Information): MDR Report Key, Made when?, Single use device?, Remedial Action, Use code, Correction number.

Device Data File: MDR Report Key, Device Event Key, Device Seq. Number, Device available for examination?, Brand Name, Generic Name, Age? …

Patient Data File: MDR Report Key, Patient Seq. Number, Date report received, Sequence and treatment, Patient Outcome.

Text Data File: MDR Report Key, Text key, Text type, Patient Seq. number, Report date, Text.
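The data files above all share the MDR Report Key, so an event, its device, and its narrative text can be recovered with a join. A sketch using SQLite; the rows, key value, and abbreviated column names are invented for illustration.

```python
# Sketch: the MAUDE files join on MDR Report Key, so an event's device
# and free text can be recovered from one key. All rows are invented.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE master_event (mdr_report_key TEXT, date_received TEXT);
CREATE TABLE device      (mdr_report_key TEXT, brand_name TEXT);
CREATE TABLE text_data   (mdr_report_key TEXT, text_key TEXT, text TEXT);
""")
db.execute("INSERT INTO master_event VALUES ('1379795', '2003-01-15')")
db.execute("INSERT INTO device VALUES ('1379795', 'EASI monitor')")
db.execute("INSERT INTO text_data VALUES ('1379795', 't1', 'TECH NOTED...')")

row = db.execute("""
    SELECT m.date_received, d.brand_name, t.text
    FROM master_event m
    JOIN device d    ON d.mdr_report_key = m.mdr_report_key
    JOIN text_data t ON t.mdr_report_key = m.mdr_report_key
""").fetchone()
```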

Page 17:

Findings from MAUDE: Safety Culture and Telemedical Mishaps

• Introduction of telemedicine implies:
 – fewer clinical staff, more technical staff;
 – technical staff don’t understand devices/procedures?

• Increasing reliance on vendor’s guidance:
 – vendors in turn rely on manufacturers;
 – communication often breaks down or is too slow.

• No common ‘safety culture’:
 – many incidents stem from poor communication;
 – strong parallels with NASA (CAIB Chapter 7).

Page 18:

Cluster 1: Configuration

• EASI™ software provides 12-lead ECG data from 5 leads on the patient.

TECH NOTED EASI 12-LEAD DISPLAY ON CENTRAL STATION FROM TRANSMITTER THAT WASNT EASI CAPABLE.

CUSTOMER REPLACED TRANSMITTER, RELOADED CENTRAL STATION SOFTWARE, CONFIRMED ALL SIGNALS WERE CORRECTLY TRANSMITTED AND LABELED.

CUSTOMER DID NOT UNDERSTAND DIFFERENCE BETWEEN STANDARD ECG AND EASI.

CUSTOMER WAS RETRAINED TO FURTHER THEIR UNDERSTANDING OF DIFFERENCE. (MDR TEXT KEY: 1379795)

• Fewer electrodes reduce work for nurses and improve patient comfort.

Page 19:

• Social implications: clinicians and support rely on suppliers’ explanations.

• Symptomatic of system safety problems:– manufacturers gain insights that should be caught earlier in development.

• Retraining is proposed, with no idea of the systemic causes of human ‘error’?

DURING INVESTIGATION, ENGINEERS CONFIGURED A SYSTEM IN SAME SETUP AS CUSTOMER. FOUND MAINFRAME RECEIVERS CAN RECEIVE INCORRECT BIT TO MISIDENTIFY TRANSMITTER AS EASI CAPABLE…

• Report doesn’t state how to prevent mis-configuration.

Cluster 1: Configuration

Page 20:

Cluster 2: Sub-contractors

• End-user frustrated by device unreliability and manufacturers’ response:

SEVERAL UNITS RETURNED FOR REPAIR HAD FAN UPGRADES TO ALLEVIATE TEMP PROBLEMS. HOWEVER, THEY FAILED IN USE AGAIN AND WERE RETURNED FOR REPAIR…

AGAIN SALESMAN STATED ITS NOT A THERMAL PROBLEM ITS A PROBLEM WITH X’s Circuit Board.

X ENGINEER STATED Device HAS ALWAYS BEEN HOT INSIDE, RUNNING AT 68⁰C AND THEIR product ONLY RATED AT 70⁰C….

ANOTHER TRANSPONDER STARTED TO BURN…SENT FOR REPAIR. SHORTLY AFTER MONITOR BEGAN RESETTING FOR NO REASON… (MDR TEXT KEY: 1370547)

• Manufacturers felt reports were not safety-related:
 – “reports relate to end-user frustration regarding product reliability (not safety)”.

Page 21:

• Telemedicine applications developed by groups of suppliers:
 – flexibility and cost savings during development, manufacture, marketing;
 – problems if incidents stem from sub-components not manufactured by suppliers;
 – incident reports must be propagated back along the supply chain.

• Manufacturer states problems stem from subcontractor’s circuit board:
 – more problems after faulty board replaced, customer returns unit again;
 – connectors to PCB not properly seated but still passes acceptance test?
 – connector not seated completely during initial repair and gradually loosens over time?

Cluster 2: Subcontractors

Page 22:

• “Fly-fix-fly” approach undermines attempts to improve patient safety.

• Confused dialogue between clinician, vendor, manufacturer…
 – End-user may see technical issues as a form of excuse (e.g. PCB connectors)…

• Device repairs not only rectify problems, they introduce new ones:– compounds end-user uncertainty and distrust of device reliability;– communication fails and shared safety culture erodes over time.

Cluster 2: Subcontractors

Page 23:

Cluster 3: Modification Induced Bugs

IN SOFTWARE RELEASE VF2, IF PATIENT IN "AUTOADMIT" MODE, PARAMETER DATA AUTOMATICALLY COLLECTED AND STORED IN THE SYSTEMS DATABASE,

IF THE PATIENT LATER REMOVED (BUT NOT DISCHARGED) FROM ORIGINAL BED/NETWORK LOCATION, DATA COLLECTION TEMPORARILY DEACTIVATED (EG DURING MOVE FOR TREATMENT).

PROBLEM OCCURS WHEN NEW PATIENT ADMITTED TO SAME BED/NETWORK LOCATION BUT ORIGINAL PATIENT NOT DISCHARGED WHILE CONNECTED TO THAT LOCATION.

NEW PATIENT ADMISSION STORES DATA IN DATABASE CORRECTLY. HOWEVER, IN PARALLEL, INCORRECTLY APPENDS NEW PATIENT DATA ON TOP OF OLD PATIENT'S RECORD…

(MDR TEXT KEY: 1340560)

Page 24:

Safety Culture and Telemedical Mishaps

• Software identifies 40-50% more US telemedical mishaps in 6 months.

• Analysis of reports suggests no ‘quick fixes’ but:– Regulators need to focus on dialogue between manufacturers and users;– Consider detailed training requirements for telemedicine before approval;– Especially look at end-user maintenance and configuration issues;– Introduce training in safety and risk management for support staff?

• Joint US/UK AHRQ presentation in Washington.– Things are only going to get worse…

Page 25:

Da Vinci, 1st robotic aid approved by the FDA: New York Presbyterian Hospital uses it on atrial septal defects.

Page 26:

Case Study 2: Inter-Industry Comparisons

Page 27:

Cluster 1: Programming Errors

• Pilot didn’t check the First Officer’s programming of the FMC.

• “ATC informed us we were off course ... it took minutes to figure out what happened. ATC vectored us back onto departure and gave us a climb clearance. ATC also pointed out traffic, but we never saw it. We aren’t sure if our error caused a conflict.

• First Officer programmed FMC. I checked the Route Page to see if it matched our clearance. It showed correct departure and transition. I did not check Legs Pages to see if all fixes were there. I will next time!

• We made an error programming the FMC, then became complacent… I should have done a more complete check of the First Officer's programming”

Page 28:

• “Computer flight plan was route ABC.

• ATC clearance was via route D-E-F.

• Original flight plan should have been destroyed, so as not to accidentally revert to old route.

• First Officer very experienced and I had complete trust that he was capable of loading correct waypoints, but both he and I failed to use a visible method of marking the computer flight plan.

• 99% of time, cleared route is same as computer flight plan, but not always, as I found out the hard way. ATC caught my error”.

Cluster 1: Programming Errors

Page 29:

• Container ship grounds, same route every week.

• 4 deck officers, good visibility, 2 radars and GPS.

• Charts had courses in black ink, couldn’t be erased.

• At 0243 altered course to 237°, position plotted.

• 45 minutes later, ship grounds at full speed.

• Watch officer set auto steering to wrong course.

• 237 next to reciprocal 157 for return voyage.

Cluster 1: Programming Errors

Page 30:

• During the descent, we were doing some HF radio checks, and forgot to arm the altitude select mode on the flight director. As a result, we descended through our altitude....

• We promptly returned to FL280. As a crew, we are very diligent and disciplined about altitude assignments.

• But in this case, because our attention was diverted from the task at hand, we flew through our assigned altitude. It was that classic trap: both crew members distracted by something and nobody flying the airplane.

Cluster 2: Warnings as Safety Nets

Page 31:

• 3 crew on fishing vessel; 2 cook, pump bilges, maintain watch.

• Skipper asleep on the deck of the wheelhouse.

• Vessel’s planned track 0.35 miles from a rig.

• Automated radar alarm system set to 0.3 miles.

• VHF off; skipper said too much distracting traffic.

• Rig asks stand-by safety vessel for help; it goes alongside the boat.

• Nobody on bridge or deck even after sounding horns.

• ‘Abandon platform stations’ as precautionary measure.

• Skipper protests on being wakened, “under control”.

• Radar warning system is a safety net or final safeguard.

Cluster 2: Warnings as Safety Nets

Page 32:

Conclusions

• Must make better use of lessons learned systems.

• Use Tuple Space and IR to search for key issues:– distributed and persistent architectures for retrieval;– avoids need for standardised formats;– can be used within and between industries.

• Caveats: – does it tell us anything new?– how valid are inter-industry comparisons?– how do we get from clusters to recommendations?

Page 33:

Questions?

