Date post: 09-Mar-2018
Upload: vuongkhanh
© Life Cycle Engineering 2008, 2014

Root Cause Analysis Effective problem solving within a Quality Management System

Workshop Overview

Understanding the RCA method

Managing the RCA program

Implementing the process

Managing the RCA Tools

Begin work within your Quality Management System

Effective Use of Root Cause Analysis

Requires discipline and consistency
– Each step in the investigation process must be followed
– All findings must be fully planned and documented using reporting process
– Evaluation must be free of bias or prejudice
– Execution, Execution, Execution

Cultural Change

•  RCA implementation has to transcend departmental boundaries and permeate throughout the organization
•  Everyone has a role and plays a part
•  As a part of continuous improvement, everyone has to get involved
•  Culture of blame or culture of improvement?

Need for a Process

This is a little story about four people named Everybody, Somebody, Anybody, and Nobody.

There was an important job to be done and Everybody was sure that Somebody would do it. Anybody could have done it, but Nobody did it. Somebody got angry about that because it was Everybody's job.

Everybody thought that Anybody could do it, but Nobody realized that Everybody wouldn't do it. It ended up that Everybody blamed Somebody when Nobody did what Anybody could have done.

- Anonymous

Benefits of RCA

•  Saves time – tackle root cause(s), not multiple symptoms
•  Fact/data driven change
•  Drives out repetitive failures
•  Means to communicate facts
•  Provides the economic solution

THE PROCESS

The RCA Process

[Process diagram, steps numbered 1 to 6. Labels in order: NOTIFICATION; CLARIFICATION/CLASSIFICATION; ROOT CAUSE ANALYSIS; CORRECTIVE ACTION; EVALUATION; VERIFICATION; DOCUMENTATION]

Notification

Sources of Potential RCA Investigations
– Data analysis is the preferred method
   Reliability Engineers identify potential problems before they manifest as actual problems
– Workforce reports a problem or pending problem
   If there is no current process, problems can be reported through informal methods, e.g. phone calls, e-mail, personal contact

Triggers

“Many RCA program initiatives fail because the organization attempts to perform RCA on everything. It is important to establish guidelines for what will trigger the RCA effort.”

Root Cause Analysis Screening Criteria: A Model

INCIDENT

Screening questions (each is a Yes/No decision; a "Yes" leads to "Assemble RCA Team", a "No" leads to the next question):
•  OSHA recordable injury?
•  Reportable release or outside complaint?
•  Operating Rates below Target OAU?
•  Loss of critical equipment or system?
•  One time Cost (maint., quality, etc.) > $15K?
•  Repeat failure of >3X per year?
•  Supply chain deviation?

If every answer is "No": No RCA required

Otherwise:
•  Assemble RCA Team
•  Perform the RCA
•  Build Cause & Effect Chart and Write Report
•  Distribute Report, Update Action Plan
•  Monitor Results, Measure Performance
•  Results Acceptable? (Yes/No decision)
•  Enter into CMMS database, Share findings with entire plant
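The screening model above is essentially a chain of yes/no checks, so it can be sketched in a few lines of Python. This is a minimal illustration, not the workshop's own tool: the `Incident` record and its field names are hypothetical, and the sketch assumes that a single "Yes" is enough to trigger an RCA. The $15K and 3X-per-year thresholds are the ones shown in the model.

```python
from dataclasses import dataclass

# Thresholds as read from the screening model; a plant would set its own.
COST_TRIGGER_USD = 15_000   # one-time cost above this triggers an RCA
REPEAT_TRIGGER = 3          # more than this many failures per year triggers an RCA


@dataclass
class Incident:
    """Hypothetical incident record; field names are illustrative only."""
    osha_recordable_injury: bool = False
    reportable_release_or_complaint: bool = False
    operating_rate_below_target: bool = False
    loss_of_critical_equipment: bool = False
    one_time_cost_usd: float = 0.0
    failures_per_year: int = 0
    supply_chain_deviation: bool = False


def triggered_criteria(incident: Incident) -> list[str]:
    """Return the screening questions that this incident answers with 'Yes'."""
    checks = {
        "OSHA recordable injury": incident.osha_recordable_injury,
        "Reportable release or outside complaint": incident.reportable_release_or_complaint,
        "Operating rates below target": incident.operating_rate_below_target,
        "Loss of critical equipment or system": incident.loss_of_critical_equipment,
        "One-time cost > $15K": incident.one_time_cost_usd > COST_TRIGGER_USD,
        "Repeat failure > 3X per year": incident.failures_per_year > REPEAT_TRIGGER,
        "Supply chain deviation": incident.supply_chain_deviation,
    }
    return [question for question, yes in checks.items() if yes]


def rca_required(incident: Incident) -> bool:
    """Assumed rule: any single 'Yes' is enough to assemble an RCA team."""
    return bool(triggered_criteria(incident))


# Hypothetical incident: a costly one-off repair with two failures this year.
pump_trip = Incident(one_time_cost_usd=22_000, failures_per_year=2)
print(rca_required(pump_trip), triggered_criteria(pump_trip))
```

Returning the list of triggered questions, rather than only a yes/no answer, keeps a record of why the RCA was started, which fits the emphasis on documentation.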

Incident Clarification

Allows investigator to determine
– If root-cause analysis is needed
– Best method of performing RCA
– Specific approach or type of analysis that should be used

Classification

•  Equipment damage or failure
•  Operating performance
   – Product Quality
   – Capacity Restrictions
•  Economic performance
•  Safety
•  Regulatory compliance

RCA

Choose your tool:
•  5 Whys
•  Advanced Analysis
   – Design/Application Review
   – Cause and Effect
   – Sequence of Events
   – Fault Tree Analysis
   – Change Analysis
   – FMEA
   – Events and Causal Factors

Corrective Actions

•  Most events have more than one corrective action
•  Not all are financially justifiable
•  Each of these actions must be evaluated to determine:
   – Their effectiveness
   – Total cost associated with action

Evaluation Steps

•  Develop a list of all potential corrective actions
•  Evaluate the technical merit of each action
   – Will it completely correct the problem and prevent recurrence?
•  Estimate the total cost of the action

Cost-benefit Analysis

•  A full cost-benefit analysis is the final step before making a recommendation

•  The cost analysis must define all costs that can be directly attributed to the problem being investigated and that will be incurred as part of the corrective action

Cost Analysis

There are two major cost classifications that should be included in the analysis:

– Abnormal or incremental costs caused directly or indirectly by the existing problem

– Cost required to correct the problem and to prevent a recurrence

Incurred Costs

Most problems that warrant an RCA have a measurable financial impact. Costs must be clearly defined and include all charges:
– Maintenance labor and material
– Incremental production labor and material
– Lost production capacity
– Business lost because of late delivery

Cost Of Correction

All costs required to implement corrective action(s):
– Maintenance labor and material
– Lost production caused by downtime
– Engineering and procurement costs
– Training, procedure development, policy changes, etc.

Cost Analysis

•  Must include all incremental and new costs caused by the problem or required to correct it

•  Care must be taken to ensure all direct and indirect costs are included

•  Indirect costs, such as training, are often overlooked, but can be substantial

Benefit Analysis

•  Quantify all benefits that will be gained by implementing the corrective action(s)

•  Benefits must be defined in realistic, financial terms

•  Benefits should be broken into two types:
   – Actual costs
   – Cost avoidance

Actual Costs

•  Reduction in maintenance labor and material
•  Reduction in delays, downtime, poor quality
•  Reduction in overtime premiums for production and maintenance

Cost Avoidance

Problems generate incremental costs that can be avoided. These costs include:
– Losses due to poor quality
– Overtime premiums
– Expedited vendor deliveries
– Capacity losses due to poor equipment condition, improper operation, inadequate maintenance

Cost Avoidance

These costs also include:
– Fines and penalties caused by spills, releases, or non-conformance to regulatory requirements
– Medical expenses caused by poor working conditions or accidents

Cost-benefit Comparison

•  The final step in the cost-benefit analysis is a comparison of costs vs. benefits

•  The actual differential required to justify varies, but many companies expect a one-year payback on investment

Cost-benefit Analysis

•  Must clearly show that benefits will offset all incurred costs and generate a measurable improvement in one or more cost categories
•  As a rule, a three-year history of costs and a three-year projection of benefits should be used for the comparison
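The comparison of costs and benefits can be sketched as simple arithmetic. The function names and all dollar figures below are hypothetical, and the sketch assumes the corrective action removes the entire annual cost of the problem; a real analysis would separate actual cost reductions from cost avoidance.

```python
def average_annual_cost(history: list[float]) -> float:
    """Average yearly cost of the problem over the history supplied."""
    return sum(history) / len(history)


def payback_years(cost_of_correction: float, annual_benefit: float) -> float:
    """Years needed for the annual benefit to repay the cost of correction."""
    if annual_benefit <= 0:
        return float("inf")  # the action never pays for itself
    return cost_of_correction / annual_benefit


def net_benefit(projected_benefits: list[float], cost_of_correction: float) -> float:
    """Projected benefits minus the cost of correction."""
    return sum(projected_benefits) - cost_of_correction


# Hypothetical figures, for illustration only.
incurred_history = [48_000.0, 52_000.0, 50_000.0]        # three-year history of costs
cost_of_correction = 40_000.0                             # labor, material, downtime, training
annual_benefit = average_annual_cost(incurred_history)    # assumes the fix removes the whole cost
projection = [annual_benefit] * 3                         # three-year projection of benefits

print("Payback (years):", payback_years(cost_of_correction, annual_benefit))
print("Net benefit over projection:", net_benefit(projection, cost_of_correction))
```

A payback under one year would meet the expectation many companies set; the net benefit shows whether the projected benefits offset the cost of correction.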

Verify Corrective Actions

•  The next step in RCA is verification that the corrective actions resolved the problem

•  Questions to be asked:
   – Were action items completed?
   – Will the initial problem recur?
   – Did the action create another problem that may affect reliability or costs?
   – Did we get the expected return?

Proper Documentation

•  It is not complete until it is fully documented
•  Must follow Engineering Change Management (ECM) or Management of Change (MOC) Process to make all changes required

Report and Recommendations

•  A final report that concisely defines the problem, its impact, root causes and recommended corrective actions is the next step in RCA

•  The report must be well planned and properly prepared

DMAIC

DMAIC Process

•  DEFINE
   – Identify problem
   – Write problem statement
   – Gather evidence
•  MEASURE
   – Build preliminary business case
   – Establish measures of effectiveness
   – Set objectives
•  ANALYZE
   – Perform RCA
   – Identify and prioritize solutions
   – Build action plan
•  IMPROVE
   – Evaluate if objectives were met
   – Standardize and implement solutions in other areas
•  CONTROL
   – If objectives were not fully met, use RCA to re-evaluate
   – Sustain improvement

Define

A fundamental characteristic of an effective Reliability Engineer is working “smart”
– Everything you do should be carefully evaluated and fully planned
In problem solving:
– Identify the problem
– Write a problem statement
– Gather evidence to quantify and verify a problem

Measure

•  Build a preliminary Business Case
•  Prioritize execution
•  Establish measures of effectiveness
•  Set improvement objectives

Analyze

Select appropriate problem solving methodology
– There are hundreds of “tools”; selecting the best one is key to success
Perform the RCA
– Follow, without exception, the RCA process

Analyze

Identify and prioritize solutions
– Always more than one solution
Build action plan for correction
– Include quantifiable measurement criteria

Improve

Evaluate if objectives were met
– Evaluate results of corrective action(s)
– Verify full, universal implementation
Standardize and implement solutions in other areas
Verify that no other problems were created

Control

•  If objectives were not fully met, use RCA to re-evaluate
•  Sustain improvement
•  If corrective actions do not fully resolve the issue, or create other issues
   – Make adjustments or corrections to ensure permanent solution

PDCA Process

PLAN
•  Identify Problem
•  Write Problem Statement
•  Gather Evidence
•  Build Preliminary Business Case
•  Establish Measures of Effectiveness
•  Set Improvement Objectives

DO
•  Perform Root Cause Analysis
   – Fishbone Diagram
   – 4 Ms
   – Brainstorming
   – 5 Whys
   – Multi-Voting
•  Identify and Prioritize Solutions
   – Brainstorming
   – Affinity Analysis
   – Multi-Voting
•  Build Action Plan
   – Why?
   – What?
   – When?

CHECK
•  Implement Action Plan
   – Create action register
   – Get approvals
   – Review it
   – Do it
   – Measure it
•  Evaluate Outcomes
   – Analyze data
•  Were Objectives Met? (Yes / No)

ACT
•  Standardize and Implement Solutions
   – Communicate
   – Develop new procedures
   – Develop standard work instructions
   – Conduct training
   – Monitor and measure performance

FAILURE MODEL

Failure Model

Failure is viewed as a three-part event:
1.  Cause – the reason for the failure (cause is always in terms of root cause(s))
2.  Failure Mode – the means by which the failure manifests itself
3.  Effect – the impact of the failure

Failure Model

•  Failure Mode
•  Root Cause
•  Effect on user
Note: Any of the three may be linked with multiple pairings of the others.

Failure Model

The level at which any root cause should be identified is the level at which it is possible to identify an appropriate failure management policy

[Diagram: Failure, with five levels of causes beneath it]
•  Level 1: Cause Mode
•  Level 2: Hypothesis
•  Level 3: Physical Roots
•  Level 4: Human Roots
•  Level 5: Latent Roots

Physical Roots

•  This is contained in the physical evidence gathered after failure
•  Become proficient in identifying the different types of Physical Failure Mechanisms
•  Have “on hand” useful reference texts
•  NOT usually the Root Cause

Physical Roots

[Figure: physical failure examples under an applied force F. Labels: Compression, Tension, Torsion, Ductile, Brittle]

Component Failure Analysis Learning from the Physical Roots

•  Detailed examination of failed components can provide clues to causes from the physical roots

•  Components must be preserved for engineering analysis

•  Metallurgical labs can provide analytical assistance

•  Suppliers often provide charts with pictures of common failure examples

Human Roots

Inappropriate human intervention that led to/allowed failure occurrence
– Commission
– Omission

Latent (System) Roots

•  Why the human action was allowed or wasn’t detected/prevented
•  Most effective root to develop corrective action for (if feasible) to prevent recurrence

Common Root-causes

•  Materials
•  Machine/Equipment
•  Environment
•  Management
•  Procedures
•  Management systems

INCIDENT REPORTING

Collecting Physical Evidence

First priority in equipment failure or accident investigations

–  If possible, system should be isolated and preserved until investigation can be completed

–  If not, scene of failure or accident must be fully documented before machine or system is removed or repaired

Preserving Evidence

•  Digital photographs, sketches, and all recorded data should be collected and preserved

•  Interviews with personnel directly involved with incident should be conducted as quickly as possible (pg. 33)
   – Memory fades with time

FIVE WHYS

5 Whys

•  Widely accepted, simple method for conducting RCA
•  Easy for everyone to learn
•  Can be used as an everyday problem solving tool
•  Not intended for use on extremely complex problems

5 Whys

Simple logic process to determine probable root cause of a problem
Simply asking “why” a condition exists, and repeating the process
– Each time you ask “why”, the answer gets closer to the source of the problem
– Keep asking “why” until you have confidence that a root-cause has been isolated
“Why” is a Reliability Engineer’s favorite word, “Logic” and “Common Sense” his or her best tools

When Should We Apply 5 Whys?

•  Always. Never accept the first reason for an incident
•  Try the simple way (5 Whys) first
•  If not sufficient, move to one of the more advanced tools

Who Should Apply 5 Whys?

•  Everyone
•  Everyone is encouraged to have a probing, questioning attitude to drive improvement

Shortcomings of 5 Whys

•  One solution from exercise
•  Economics of selection
•  Feasibility of solution
•  Ease of solution

INTRODUCTION TO ADVANCED ANALYSIS

Analysis Techniques

•  There are over 101 known analysis techniques that can be used to determine Root Cause

•  Our goal: pick the technique which yields the desired result with the least amount of effort

Analysis Techniques

•  Multi-tier approach
•  Technique choice may depend on:
   – Problem complexity
   – Hours of downtime
   – Cost of failure
   – Failure type, e.g. EHS

Advanced Analysis Techniques Covered

•  Design/Application Review
•  Cause and Effect
•  Sequence of Events
•  Fault Tree Analysis
•  Change Analysis
•  FMEA
•  Event and Causal Factor Analysis

DESIGN / APPLICATION REVIEW

Advanced Analysis Technique # 1

Design Review

If a full root-cause analysis is justified, the first step is a comprehensive design review

– Applies to all problems, not just asset-related ones
– Non-asset problems: evaluate processes and practices
Determines the specific installation, maintenance, and operating requirements and limitations of the investigated machine or system

Design Review

Level of effort determined by complexity of problem
Many common problems can be resolved with simplified design review
– Even complex systems are comprised of simple, well-known components
More complex problems will require extensive, in-depth review
– Some problems could take 5-30 days to do a comprehensive design evaluation

Design Review Data Sources

•  Nameplate Data
•  Procurement Specifications
•  Vendor Specifications
•  O & M Manuals

Nameplate Data

Simple problems may be resolved using nameplate data

– Data defines minimal performance criteria, such as flow, pressure, amp load, etc

– Combined with data in troubleshooting guides in REFERENCE TEXTS

Procurement Specifications

Specifications used to procure machine or system should clearly define its operating envelope
– Range of incoming product
– Range of output product
– Operating efficiency
– Other parameters that can be used to evaluate the system

Vendor Specifications

Specifications provided with procured system
– Should coincide with procurement specs
– If not, deviations may be key to problem resolution
Careful comparison of procurement and vendor specifications is essential

Operating and Maintenance Manuals

Operations and Maintenance (O&M) manuals are provided with most machines and systems
– Excellent source of recommended operating and maintenance practices
– Most include comprehensive troubleshooting guide that includes all known failure modes

Objectives of Design Review

Determine:
–  Design limitations,
–  Acceptable operating envelope,
–  Probable failure modes, and
–  Specific indices that quantify actual operating condition
Provide factual basis for application, maintenance, and operating practices evaluation and ultimate root-cause determination

Review Results

Incoming product specifications
– Acceptable range of variations in incoming product (e.g. density, volume)
Output product specifications
– Acceptable range of variations in output product
Work to be performed
– Determined by difference in incoming and output

Acceptable Operating Envelope

•  Machines and production systems are designed to perform a specific task or range of tasks

•  Operating envelope bounds the full range of operating tasks that the system is designed to perform

Acceptable Operating Envelope

•  Many chronic problems are the direct result of machines or systems that are operating outside of their acceptable operating envelope

•  Establishing the boundaries will permit direct comparison during the application review

Acceptable Operating Envelope

[Figure: Hydraulic Curve for Centrifugal Pump. X-axis: FLOW in gallons per minute (GPM), 200 to 1000. Y-axis: Total Dynamic Head (Feet), 50 to 200. Efficiency contours at 65%, 70%, 75%, and 80% around the Best Efficiency Point (BEP)]
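Once the design review has set the boundaries of the operating envelope, comparing observed operation against them is mechanical. The sketch below is only an illustration: the `Limit` type, the parameter names, and every numeric limit are hypothetical and are not read from the pump curve.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limit:
    """Acceptable range for one operating parameter."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


def out_of_envelope(envelope: dict[str, Limit], observed: dict[str, float]) -> dict[str, float]:
    """Return the observed parameters that fall outside their acceptable range."""
    return {
        name: value
        for name, value in observed.items()
        if name in envelope and not envelope[name].contains(value)
    }


# Hypothetical envelope for a centrifugal pump; real limits come from the design review.
pump_envelope = {
    "flow_gpm": Limit(400.0, 800.0),
    "total_dynamic_head_ft": Limit(100.0, 175.0),
}
# Hypothetical field readings.
observed = {"flow_gpm": 250.0, "total_dynamic_head_ft": 160.0}

print(out_of_envelope(pump_envelope, observed))
```

Any parameter returned by `out_of_envelope` is a candidate explanation for a chronic problem and a starting point for the application review.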

Is Problem Solvable?

Some problems can be resolved after a complete design review
–  Obvious design defects or inherent deficiencies are found
–  One or more of the defects may be the source of the problem or deficiency
If the answer is “yes”, the next step is to develop a test to confirm the cause-effect relationship
If “no”, continue with other RCA tools

CAUSE AND EFFECT

Advanced Analysis Technique # 2

Cause-and-Effect Analysis

•  Graphical approach to failure analysis (Ishikawa Diagram)
•  Also called Fishbone or 4M Analysis because of graphic pattern and classifications

Cause-and-Effect Analysis

•  Plots relationship between various factors that contribute to specific event

•  Factors are grouped in sub-classifications to facilitate analysis

4M Cause-Effect Diagram

[Fishbone diagram: four branches leading to the Effect]
•  Man
   – Human error
   – No training
   – No supervision
•  Methods
   – No procedures
   – Poor supervision
   – No enforcement
•  Machine
   – Misapplication
   – Poor maintenance
   – Age
•  Materials
   – Wrong materials
   – Misapplication
   – Vendor error

Example of Cause and Effect Diagram

Effect: Blockage in Downspout
Branches: Man, Methods, Management, Procedures
Causes shown on the diagram:
•  Over-cooking
•  Scorching
•  Improper startup
•  Failure to follow CIP
•  Ineffective Startup procedures
•  Ineffective CIP procedures
•  Failure to enforce CIP
•  Flow rate too slow
•  Incorrect recipe
•  Extended use of cooker
•  Auto startup control logic
•  Steam temp. too high
•  Failure to enforce switch-over procedure
•  Operator inconsistency
•  Failure to follow switch-over procedure
•  Ineffective operator training
•  Retention (cooking) time

Uses Of Cause-Effect

Process deviations
–  Problems associated with capacity restrictions, product quality, abnormal costs
Regulatory compliance
–  OSHA violations
–  Environmental releases
Safety issues
Most production problems require complete understanding of all probable variables that could contribute to a problem

Limitations

Cause-and-Effect Analysis has serious limitations:

•  Does not provide a clear sequence-of-events that leads to failure

•  Does not isolate specific cause or combination of forcing functions that result in problem

•  It displays all of the possible causes

Cause & Effect Analysis – Fishbone Diagram

Step 1 Identify the problem during one of your team’s brainstorming sessions. Draw a box around the problem. This is called the “effect”.

Step 2 Draw a long process arrow leading into the box. This arrow represents the direction of influence.

[Diagram: a process arrow leading into a box labeled “Bad Tasting Coffee”, the problem or “effect”]

Step 3 Decide the major categories of causes. Groups often start by using Machines, Materials, Methods, and Man. For some problems, different categories work better.

[Diagram: branches MACHINE, METHOD, MATERIALS, and MAN leading to BAD TASTING COFFEE]

Step 4 Decide the possible causes related to each main category. For example, possible causes related to man are experience, ability and individual preference.

[Diagram: the same four branches with possible causes added: drip, perk, manual, automatic; filter, size of machine; sugar, cream; temperature; electric, gas, open fire; experience, ability, individual preference; grind; brand]

Step 5 Eliminate the trivial, non-important causes.

Step 6 Discuss the causes that remain and decide which are important. Circle them.
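The six steps above map onto a small data structure: an effect, a set of category branches, the causes under each branch, and the subset the team circles. The sketch below is one possible representation, not part of the workshop material. Only the MAN branch assignments come from the text of Step 4; the other branch assignments and the team decisions are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Fishbone:
    effect: str
    causes: dict[str, list[str]] = field(default_factory=dict)
    circled: set[tuple[str, str]] = field(default_factory=set)

    def add(self, category: str, *causes: str) -> None:
        """Steps 3 and 4: add a category branch and its possible causes."""
        self.causes.setdefault(category, []).extend(causes)

    def eliminate(self, category: str, cause: str) -> None:
        """Step 5: drop a cause the team judges trivial."""
        self.causes[category].remove(cause)
        self.circled.discard((category, cause))

    def circle(self, category: str, cause: str) -> None:
        """Step 6: mark a remaining cause as important."""
        if cause not in self.causes.get(category, []):
            raise ValueError(f"{cause!r} is not listed under {category!r}")
        self.circled.add((category, cause))

    def outline(self) -> str:
        """Render the diagram as an indented text outline."""
        lines = [f"Effect: {self.effect}"]
        for category, causes in self.causes.items():
            lines.append(category)
            for cause in causes:
                mark = "(*) " if (category, cause) in self.circled else "    "
                lines.append(f"  {mark}{cause}")
        return "\n".join(lines)


coffee = Fishbone("Bad tasting coffee")
coffee.add("MAN", "experience", "ability", "individual preference")  # stated in Step 4
# Branch assignments below are assumed; the flattened diagram does not preserve them.
coffee.add("MACHINE", "drip", "perk", "filter", "size of machine")
coffee.add("MATERIALS", "grind", "brand", "sugar", "cream")
coffee.add("METHOD", "temperature", "manual", "automatic")

coffee.eliminate("MATERIALS", "sugar")   # hypothetical team decision (Step 5)
coffee.circle("MATERIALS", "grind")      # hypothetical team decision (Step 6)
print(coffee.outline())
```

Keeping the circled causes separate from the full list preserves what was considered and rejected, which helps when the analysis is reviewed later.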

SEQUENCE OF EVENTS

Advanced Analysis Technique # 3

Sequence-of-events Analysis

•  One of the most effective tools for root-cause analysis

•  Graphically displays sequence of events leading to failure, event, or incident

•  Provides means to display both factual and assumed factors that may have contributed to an event

Sequence-of-events Symbols

Events:
•  Events are displayed as rectangular boxes, which are connected by flow direction arrows that provide the proper sequence for events
•  Each box should contain only one event and the date and time that it occurred
•  Use precise, factual, non-judgmental words and quantify when possible

Events

An event box can be used for an actual variable or action
–  “Operator A opens valve B”
–  “Flake transfer begins”
–  “Operator C changes pressure setting to 100 psig”
–  “Operator B diverted flow to silo #3”
Specific time of event must also be noted

Sequence-of-events Symbols

Qualifiers
•  Each event should be clarified by using oval data blocks that provide qualifying data pertinent to that event
•  Each oval should contain only one qualifier
•  Each qualifier oval should be connected to a specific event using a direction arrow

[Figure: an EVENT box (08/05/97 13:52) with two QUALIFIER ovals attached]

Qualifiers

Concise descriptions that clarify an event
•  “CA operator A notifies preparation operator A”
•  “Preparation operator A confirms start of transfer”
•  “Level gauge indicates 1/2 full”
•  “Last gauge calibration 03/03/97”

Sequence-of-events Symbols

Forcing functions
•  Factors that could have contributed to the event should be displayed as a hexagon-shaped data box
•  Each hexagon should contain one concisely defined forcing function
•  Forcing functions should be connected to a specific event

[Figure: an EVENT box (08/05/97 13:52) with a FORCING FUNCTION hexagon attached]

Forcing Functions

Variables or actions that could contribute
•  “Pressure fluctuations in conveyor system”
•  “Valve failed to open”
•  “Blockage in conveyor”

Sequence-of-events Symbols

Assumptions
•  Unconfirmed conditions or contributing factors can be included in the flow diagram by using annotations
•  This method permits the inclusion of multiple assumptions or unanswered questions that may help clarify an event
•  Assumptions should be confirmed or deleted as soon as possible

[Figure: an EVENT box (08/05/97 13:52) with an annotation: “Assumptions, unanswered questions or other data that may be pertinent to event.”]

Assumptions

Any unproven event, qualifier, or forcing function that may have contributed
•  “Solenoid operator believed defective”
•  “Level gauge is unreliable”
•  “# 3 silo believed to be empty”

Sequence-of-events Symbols

Incident
•  The incident box contains a brief statement of the reason for the investigation
•  The incident box should be inserted at the proper point in the event sequence and connected to the event boxes using direction arrows
•  There should be only one incident data box included in each investigation

[Figure: an INCIDENT box (08/05/97 14:01) placed in sequence between two EVENT boxes (each labeled 08/05/97 13:52)]

Incident

There should be only one incident in each sequence-of-events diagram

The final event or failure that triggered investigation
–  “Fluidizer ‘A’ trips off-line”
–  “Catastrophic fan failure”
–  “Bearing failed”

[Example sequence-of-events diagram]

Events, in time order:
•  08/03/97 06:55 a.m. CA Operator A notifies Preparation Operator A
•  08/03/97 07:25 Prep. Operator A selects #3 Silo
•  08/03/97 07:29 Prep. Operator A opens valve to #3 Silo
•  08/03/97 07:30 Flake transfer begins
•  08/03/97 07:30 to 08:00 Flake transfer continues
•  08/03/97 08:01 Prep. Op. A checks #3 Silo Level Indicator
•  08/03/97 08:10 Fluidizer A trips
•  08/03/97 08:20 CA Operator A resets breaker
•  08/03/97 08:21 CA Operator A restarts transfer
•  08/03/97 08:23 Fluidizer A trips
•  08/03/97 08:30 CA Operator A stops transfer
•  08/03/97 10:00 Crew A inspects pneumatic conveyor
•  08/03/97 12:00 I.C. Tech. A inspects level gage on #3 Silo

Qualifiers, forcing functions, and assumptions attached to the events:
•  Prep. Operator A confirms start of transfer
•  #3 Silo assumed to be empty
•  Level gage indicates 1/2 full
•  Last gage calibration 03/03/97
•  Level gage has history of problems
•  Level control is questionable
•  High amp load present
•  Section A-935 completely blocked with flake
•  #3 Silo overflowing with flake
•  Transmitter lens coated with flake
•  Silo #3 completely full and flake compacted
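The symbol rules for a sequence-of-events diagram (one event per box with its date and time, attached qualifiers, forcing functions and assumptions, and exactly one incident) translate naturally into a small data model. The following is a minimal sketch using a few entries from the fluidizer example. The class and function names are hypothetical, dates are read as month/day/year, and both the choice of the 08:10 trip as the incident and the attachment of each annotation to a particular event are assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Event:
    description: str
    when: datetime
    qualifiers: list[str] = field(default_factory=list)
    forcing_functions: list[str] = field(default_factory=list)
    assumptions: list[str] = field(default_factory=list)
    is_incident: bool = False


def build_sequence(events: list[Event]) -> list[Event]:
    """Order events by time and enforce the one-incident rule."""
    incidents = [e for e in events if e.is_incident]
    if len(incidents) != 1:
        raise ValueError(f"expected exactly one incident, found {len(incidents)}")
    return sorted(events, key=lambda e: e.when)


def open_assumptions(events: list[Event]) -> list[tuple[str, str]]:
    """List (event, assumption) pairs still waiting to be confirmed or deleted."""
    return [(e.description, a) for e in events for a in e.assumptions]


def at(hhmm: str) -> datetime:
    # Dates in the example are read as month/day/year (assumption).
    return datetime.strptime(f"08/03/97 {hhmm}", "%m/%d/%y %H:%M")


events = [
    # Treating the 08:10 trip as the incident is an assumption.
    Event("Fluidizer A trips", at("08:10"), is_incident=True),
    # Attaching each annotation to a specific event is also an assumption.
    Event("Prep. Operator A selects #3 Silo", at("07:25"),
          assumptions=["#3 Silo assumed to be empty"]),
    Event("Flake transfer begins", at("07:30")),
    Event("Prep. Operator A opens valve to #3 Silo", at("07:29")),
    Event("Prep. Op. A checks #3 Silo Level Indicator", at("08:01"),
          qualifiers=["Level gage indicates 1/2 full"]),
]

for event in build_sequence(events):
    tag = "INCIDENT" if event.is_incident else "EVENT"
    print(event.when.strftime("%H:%M"), tag, event.description)
print(open_assumptions(events))
```

Listing the open assumptions separately supports the rule that every assumption is confirmed or eliminated before the investigation concludes.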

Sequence-of-events

•  Computer-based program is beneficial
   – Microsoft VISIO or equal
•  Should be a dynamic process
   – Initial diagram made when event first reported
   – Refined throughout the investigation process
   – All assumptions should be confirmed or eliminated before conclusion

FAULT TREE ANALYSIS

Advanced Analysis Technique # 4

Fault Tree Analysis

•  Method of analyzing system reliability and safety
•  Provides objective basis for analysis
•  Limits analysis to specific incident, failure, or event
•  A deductive rather than inductive approach

Fault Tree Analysis Flow Diagram

Define top event

Establish boundaries

Understand system

Construct fault tree

Analyze tree

Corrective action

Benefits of Fault Tree Analysis

•  Helps analyst understand system failures deductively and points out system failure points

•  Provides insight into system behavior (operating dynamics)

•  Graphical model and logical presentation of event or combinations of events causing failure or top event

•  Depicts relationship of system components or behavior that contributed to failure

Fault Tree Logic Diagram

•  Use “and” “or” logic to define relationship of potential failure modes

•  “and” gate means both events must occur before failure will occur

•  “or” gate means either one of the events may result in failure

Uses Of Fault Tree

Equipment or component failures
– Resolution of specific, clearly defined failures
Design and application reviews
– Deductive logic beneficial in understanding relationship of system behavior

Fault-tree Logic Diagram

[Fault tree diagram; three gates are labeled OR]

Motor Overheats (top event)
   Primary Motor Failure (Overheated)
   Excessive Current to Motor
      Excessive Current in Circuit
         Primary Wiring Failure (Shorted)
         Primary Power Supply Failure (Surge)
      Fuse Fails to Open
         Primary Fuse Failure (Closed)
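The “and”/“or” gate logic can be evaluated mechanically once the tree is written down. The sketch below encodes the motor example as nested nodes and checks whether the top event occurs for a given set of failed basic events. The `Node` type is hypothetical. The extracted diagram text shows only OR labels, so the AND gate joining “Excessive Current in Circuit” and “Fuse Fails to Open” is an assumption, as is the placement of the OR gates.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str
    gate: Optional[str] = None                 # "AND", "OR", or None for a basic event
    inputs: list["Node"] = field(default_factory=list)


def occurs(node: Node, failed: set[str]) -> bool:
    """True if this node's event occurs given the set of failed basic events."""
    if node.gate is None:
        return node.name in failed
    results = [occurs(child, failed) for child in node.inputs]
    if node.gate == "AND":
        return all(results)      # every input event must occur
    if node.gate == "OR":
        return any(results)      # any one input event is enough
    raise ValueError(f"unknown gate {node.gate!r}")


wiring = Node("Primary Wiring Failure (Shorted)")
surge = Node("Primary Power Supply Failure (Surge)")
fuse = Node("Primary Fuse Failure (Closed)")
motor = Node("Primary Motor Failure (Overheated)")

excess_circuit = Node("Excessive Current in Circuit", "OR", [wiring, surge])
fuse_fails = Node("Fuse Fails to Open", "OR", [fuse])
# The AND gate here is an assumption; the extracted diagram text shows only OR labels.
excess_motor = Node("Excessive Current to Motor", "AND", [excess_circuit, fuse_fails])
top = Node("Motor Overheats", "OR", [motor, excess_motor])

# A surge alone is stopped by a working fuse; a surge plus a failed fuse is not.
print(occurs(top, {surge.name}))
print(occurs(top, {surge.name, fuse.name}))
```

Evaluating different combinations of basic events in this way shows which single failures, and which combinations, are sufficient to cause the top event.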

Equipment: Mill
Problem: Damage to Ring Gear & Pinion.

[Logic tree. Rows, top to bottom: Problem Definition, Cause Modes, Supporting Hypotheses, Physical Roots, Human Roots, Latent Roots]

Boxes shown on the tree, in the order extracted:
•  Instrumentation Failure.
•  Misalignment of Gear Set.
•  Lack of Lubrication
•  Contamination
•  Operating Conditions
•  Inadequate alarms on system to indicate failure
•  Spray system not working properly.
•  Lack of Air
•  Lack of lubricant
•  Air Solenoid Valve not reconnected.
•  Original System Design was inadequate.
•  No alarms indicated system failure.
•  Undesirable Spray Pattern
•  Non compliance with Standard Procedure of obtaining pattern.
•  No System in place to ensure follow-up. (shown twice)

CHANGE ANALYSIS

Advanced Analysis Technique # 5


Change Analysis

Purpose: Examine potential effects of modification

Application: All systems


Six Steps in Change Analysis

1.  Incident occurrence with undesirable consequence
2.  Comparable activity without undesirable consequence
3.  Compare
4.  Set down differences
5.  Analyze differences for effect on undesirable consequence
6.  Integrate information relevant to the causes of the undesirable consequence
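The compare and set-down-differences steps can be sketched as a simple record comparison. This is a minimal illustration, not the DOE procedure itself: the function name `set_down_differences`, the factor keys, and both activity records are hypothetical.

```python
def set_down_differences(incident: dict, comparable: dict) -> dict:
    """Compare two activity records and return {factor: (comparable_value, incident_value)}
    for every factor whose value differs between them."""
    differences = {}
    for factor in sorted(set(incident) | set(comparable)):
        before = comparable.get(factor)
        after = incident.get(factor)
        if before != after:
            differences[factor] = (before, after)
    return differences


# Hypothetical records (assumptions for illustration only).
incident = {          # activity WITH the undesirable consequence
    "what": "pump seal replaced",
    "when": "night shift",
    "where": "unit 2",
    "who": "contract crew",
}
comparable = {        # comparable activity WITHOUT the undesirable consequence
    "what": "pump seal replaced",
    "when": "day shift",
    "where": "unit 2",
    "who": "in-house crew",
}

differences = set_down_differences(incident, comparable)
for factor, (before, after) in differences.items():
    print(f"{factor}: {before} -> {after}")
```

The output is only the list of differences. Analyzing each difference for its effect on the consequence and integrating the findings remain judgment steps for the investigator.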


Change Analysis Work Sheet

• Source: DOE Root Cause Analysis Guidance Document, 1992

Columns: Change Factor, Difference/Change, Effect, Questions to Answer

Change factors:
•  What (conditions, occurrence, activity, equipment)
•  When (occurred, identified, plant status, schedule)
•  Where (physical location, environmental conditions)
•  How (work practice, omission, extraneous action, out-of-sequence procedure)
•  Who (personnel involved, training, qualification, supervision)


FMEA

Advanced Analysis Technique # 6


FMEA

•  Developed by US Military and standardized by automotive industry

•  Top-down method

•  Based on industrial and in-plant historical data

•  Generally limited to major sub-systems

–  Can include components, but failure modes, probability of failure, etc. are based on experience, not probability tables


Example Of FMEA Analysis

Failure Modes and Effects Analysis
Subsystem: 36-1A Pump

•  Function: Provide 1000 gpm of additive to process
•  Functional Failure: No flow
•  Component: Motor
•  Failure Mode: No rotation/torque
•  Effect of Failure: Shuts down process
•  Severity: 10
•  Cause of Failure: Bearing seize due to lubrication issue
•  Probability: 7
•  Current Control: Lube motor bearings
•  Detection: 3
•  RPN: 210
•  Improvements: Include on vibration and IR route
•  New RPN: (no value given)
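The RPN in the example is the product of the severity, probability, and detection ratings (10 × 7 × 3 = 210). A minimal sketch of that arithmetic follows. The function name `rpn` and the 1-to-10 rating range it checks are assumptions, since the slide does not state the scale.

```python
def rpn(severity: int, probability: int, detection: int) -> int:
    """Risk Priority Number = severity x probability x detection.

    The 1-10 range check is an assumed convention, not stated on the slide.
    """
    for name, value in (("severity", severity),
                        ("probability", probability),
                        ("detection", detection)):
        if not 1 <= value <= 10:
            raise ValueError(f"{name} rating must be between 1 and 10, got {value}")
    return severity * probability * detection


# Ratings from the 36-1A Pump motor example above.
print(rpn(severity=10, probability=7, detection=3))  # 210
```

After an improvement is in place, the same function can be re-run with the revised ratings to fill in the New RPN column.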


EVENT AND CAUSAL FACTOR ANALYSIS

Advanced Analysis Technique # 7


ECF Charting

•  Experience has shown that accidents are rarely simple and almost never result from a single cause.

•  They are usually multifactorial and develop from clearly defined sequences of events which involve performance errors, changes, oversights, and omissions.


ECF Charting

•  Assists the verification of causal chains and event sequences

•  Provides a structure for integrating investigation findings

•  Assists communication both during and on completion of the investigation.


ECF Charting

[Figure: ECF chart. Primary Event 1, Primary Event 2, and Primary Event 3 run in sequence to the Accident Event. Secondary Event 1, Secondary Event 2, and Conditions A, B, C, and D are attached to the sequence.]


Causal Factor Relationships

[Figure: a sequence of EVENT blocks, one marked (Potential), with Condition blocks attached. Individual conditions are marked (Direct Cause), (Contributing Cause), or (Root Cause).]

Legend:
•  Events: The sequence of real-time happenings or actions.
•  Conditions: Any as-found or existing state that influences the outcome of a particular task, process or operation.
•  Conditions that may exist but are not identified


NOTE: Events should be arranged chronologically from left to right.

EVENT → EVENT → EVENT
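A minimal sketch of how the chronological ordering could be enforced when ECF data is kept electronically. The `Event` class, the `chart` function, and the sample events and timestamps are all hypothetical, for illustration only.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class Event:
    when: datetime
    description: str
    conditions: List[str] = field(default_factory=list)  # conditions attached to this event


def chart(events: List[Event]) -> List[str]:
    """Return text lines with events in chronological order, each followed by its conditions."""
    lines = []
    for event in sorted(events, key=lambda e: e.when):
        lines.append(f"[{event.when:%H:%M}] EVENT: {event.description}")
        for condition in event.conditions:
            lines.append(f"    (condition) {condition}")
    return lines


# Hypothetical findings, entered in the order they were discovered (not the order they happened).
events = [
    Event(datetime(2014, 1, 6, 14, 30), "Gear damage found at inspection"),
    Event(datetime(2014, 1, 6, 9, 15), "Lubrication spray stops",
          ["No alarm on spray system"]),
    Event(datetime(2014, 1, 6, 8, 0), "Maintenance work completed",
          ["Air line left disconnected"]),
]

for line in chart(events):
    print(line)
```

Sorting by timestamp keeps the sequence correct even when findings are added out of order during the investigation.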


Process Steps

•  Organize the accident data
•  Guide the investigation
•  Validate and confirm the true accident sequence
•  Identify and validate factual findings, probable causes, and contributing factors
•  Simplify organization of the investigation report
•  Illustrate the accident sequence in the investigation report


Limitations of ECF

Although ECF is an effective tool for understanding the sequence of contributing events that lead to an accident, it has two primary limitations:

•  Will not necessarily yield root causes. Event charting is effective for identifying causal factors, which are not always root causes.

•  Overkill for simple problems. Using event charting can overwork simple problems.


Final Documentation of Investigation

The final report can entail:
1.  Incident summary
2.  Initiating event
3.  Incident description
4.  Immediate corrective actions
5.  Root causes
6.  Long-term corrective actions
7.  Lessons learned
8.  External reports filed
9.  References and attachments
10. Investigator or investigating team description
11. Review and approval team description
12. Distribution list


The A3 Process

The A3 template is a PowerPoint file. It can be printed directly as hard copy or inserted into a presentation without any manual changes.

11” x 17” (A3) Format

[Figure: A3 template panels: Background, Current Condition, Root Cause Analysis, Target Condition, Action Plan, Metrics]


The Logic of A3 Thinking

[Figure: A3 layout showing the logic of each panel]

•  Business Case: Our Need
•  Current Condition: Situation Now Is..
•  Target Condition: Situation Will Be..
•  Improvement Activities and Schedule: We believe that if…., then….
•  Metrics: Current, Future, Actual


Business Case

•  Business Case = Problem Statement
•  Clearly state problem(s) that we are trying to solve. Could include one of the following: EHS, People, Profitability, Customer, Manufacturing, Asset reliability
•  Keep problem list to a “critical few”
•  Include cause and effect diagrams reflecting impact on bottom line (lagging indicators); use graphs with goals


A3 Example, pg. 89

Business Case
•  Trouble calls in mid-speed modules result in loss of capacity
•  Reduction in number and duration will increase capacity by 30%

Current Condition
•  ABS-related trouble calls represent 50% of total calls
•  Majority of trouble calls are pneumatic or electrical problems
•  No pneumatic or electrical preventive maintenance

Target Condition
•  Reduce pneumatic/electrical trouble calls (TC) by 50%
•  Increase capacity by 30% or 2 billion sticks annually

Action Plan
•  Develop PMs for pneumatic/electrical components
•  Pilot new PMs to determine effectiveness

Metrics
•  Number of pneumatic/electrical TC per module
•  Increase in sticks per module

[Figure: placeholder pie charts (10% slices) and a placeholder schedule listing Task 1 to Task 5, each 1 day on 9/4/2006]


XYZ Company A3 2005

Business Requirements:
•  TRR @ 1.5 or Less
•  OEE @ 85% or Greater
•  Internal Scrap < 5%
•  GFPLH @ 5.25 (Avg. for 2006)
•  Continue with No Lost Work Days
•  Launch All New Programs On-or-Ahead of Plan
•  Reduce Plant Controllable Costs by 15%
•  Stop Revenue Per Wheel Loss

Current Condition, Global Manufacturing System:
•  EHS “System” Improvements – Ergonomics, Fatality, JSA’s, Safety Observations, Etc.
•  Receipt-to-Pay Implemented – Best Practice
•  Order-to-Cash Being Implemented – October 2005
•  Maintenance PdM System Being Implemented – Sharing with other plants
•  Just started with Best Practices Sharing re. Robots and Forging with XXX and XXX
•  Most Equipment not common with any other location.
•  Maintenance EAM system implementation postponed to 2006 – XXX system being used.
•  Develop a better system with suppliers/vendors.

Business Case

[Charts, 2000 to 2006: TRR, Revenue, Operational Availability, Internal Scrap, GFPLH]

Target Condition, Global Manufacturing System:
•  Continuous EHS Improvements: Ergonomics, Fatality Prev., BBT, Org. Tolerance, % At Risk, Etc.
•  Rec-to-Pay and Order-to-Cash fully operational and Effective
•  Maintenance PdM system functioning effectively with planned points checked daily - Tie in Bizware
•  Routine Best Practices Sharing with “Learnings” Implemented
•  Get all internal pieces of equipment refurbished with “locally available” spare parts
•  Utilize MP@ Maintenance System to trend and take necessary actions.
•  Implement an effective tracking system with outside suppliers and vendors

Target Condition, Equipment Reliability:
•  Get OEE routinely above 70%
•  Orient/Train/Audit/Etc. operator engagement involving equipment wellness
•  Conduct Effective Lean event at least once a month with Continuous Improvement Follow-Up
•  Routine Spare Parts availability
•  PM’s performed 100% to schedule with follow-up on effectiveness and appropriate adjustments
•  Maintenance/Operator training conducted as necessary

Target Condition, Process Technology:
•  Implement state-of-the-art equipment and technology – ex. Pre-Machine Restructure
•  Improve Equipment/Tooling/Processes/Etc. to effect a 10% reduction in controllable costs
•  Continue to increase capacity through cycle time and changeover time improvements
•  Routinely share technology with other locations. Obtain state-of-the-art technology from suppliers
•  All Programs Launched effectively: On-site Engr. Launch Teams, Concur. Engrg., Communication
•  Establish effective in-plant feedback loop to reduce scrap & rework and increase throughput

Current Condition, Equipment Reliability:
•  OA running at 58%.
•  Poor operator engagement in equipment wellness but starting to improve
•  Lean events kicked off in April. Two additional events held. Lean event planned each month.
•  Spare Parts availability improving. Many foreign parts “reverse engineered” for local purchase
•  PM’s performed @ 60% timeliness but improving. PM effectiveness starting to be evaluated.
•  Extensive OEM-specific training provided to Maintenance personnel

Current Condition, Process Technology:
•  One new Chiron Drill implemented – Only maintains parity with the competition
•  Cost reductions realized as related to: Equipment/Processes/Tooling/Methods/Supplies/Etc.
•  Capacity Increased through Cycle Time and Changeover Time improvements
•  Some Technology Sharing with other locations. ATC had no helpful activities identified.
•  Program Launch effectiveness needs major improvements

Action Plan

[Table: columns AREA, METRICS, IMPACT, WHO, HOW; rows for the areas Global Manufacturing System, Equipment Reliability, and Process Technology. Entries are listed by column below.]

METRICS:
•  WAVE I & II Imple.
•  Training Hours/Person
•  Spare Parts Avail. %
•  TPM Events Held
•  Reduce Costs 20%
•  Impr. Cycle Time – 15%
•  Impr. CO Time – 15%
•  100% Ontime Prog. Lch.

IMPACT:
•  Financials, Throughput, Scrap, Productivity
•  Operational Availability, Throughput, Cost, Culture
•  Financials, Productivity, Customer Satisfaction, New Business

WHO:
•  Operations Manager, Engineering Manager, Plant Manager (annotated “Top 4” and “Driver”)

HOW:
•  EBS Systems Impl., Communications, Supplier Sys./Nego., EHS/ABS Focus
•  Training, Cultural Orientation, Improved Systems, ABS/TPM
•  New/Rehab. Equip., Process/Tooling Impr., Communications, Production Trials

[Charts: ROC (2000 to 2006), Cost Control/Reductions, Operational Availability, First Pass Yield, Manuf. Efficiency]


Current Condition

•  Show flow (process, material, and information)
•  Highlight business case in current condition flow
•  Use bulleted text and/or graphs to further explain flow
•  Show lead-time and flow-time (if it is an issue)
•  Continually update (monthly)


XYZ PLANT

[Figure: plant layout. Labeled areas: Billet Log Table, Heater, Saw, Forging, Spinner, Pre-Machine, Washer (lines 1 to 4, shown twice), Machining (1st Turn, 2nd Turn, Drill), Heat Treat, Shot Blast, Die Shop, Die Racks, Die Penetrant, Aging, Hub Float Sorting, Pin Stamp, De Burr, Final Insp, Polishing, Coating, Painting, Outside Sources, Admin Offices, Phase I, Phase II]

                        2000          2001          2002          2003          2004          2005          2006
Volume               311,343       491,056       480,126       534,967       617,851       722,189       842,870
Revenue/Wheel         $62.78        $88.95       $103.73       $103.04        $99.38        $97.09        $98.49
Cost/Wheel            $68.88        $90.83        $80.05        $82.21        $89.15        $99.77        $94.36
Employees                264           219           190           185           185           185           184
GM                      3.50          3.18          4.55          4.30          4.34          5.04          6.13
NIPT            $(2,646,000)    $(566,000)    $8,380,000    $9,208,878    $3,542,141  $(1,222,052)    $1,421,300
Capital 12/31    $36,328,000   $30,639,000   $31,389,000   $24,895,092   $24,787,091   $27,224,106   $25,852,544
ROC                  (11.8)%        (1.8)%         35.8%         33.7%         12.8%        (4.1)%          4.7%


Target Condition

•  Show flow (process, material, and information)
•  Define Target Condition for year-end
•  Verify that Target Condition supports business case


Action Plan

•  Do not include routine actions:
–  “Keep SWPs updated on a regular basis”
–  “Continue preventive maintenance”, etc.
•  Action items that are rescheduled should be highlighted
–  Use red text, etc.
•  Verify that action items relate to Business Case
•  Action plans must bridge the gap between current and target condition
–  Plan to achieve changes shown in target


Top Level A3 – Action Plans

Columns: Action Items; Q1, Q2, Q3, Q4; SPA; Jan, Feb, Mar

•  Dev and implement Creep Plans. Quarters: X X. SPA: BU Pres. Months: X X X
•  Dev and implement Working capital management plans. Quarters: X X. SPA: BU Pres. Months: X X X
•  Restart Idled Capacity. Quarters: X. SPA: BU Pres
•  Increase Value-added product. Quarters: X X. SPA: Park
•  D&I plan to reduce Planning cycle lead time. Quarters: X X. SPA: Park. Months: X
•  D&I maintenance strategy to support Loc OA goals. Quarters: X. SPA: BU Pres. Months: X X
•  D&I plan to close KPI gaps and improve process stability. Quarters: X. SPA: BU Pres. Months: X X
•  D&I plans to address safety performance gaps. Quarters: X X. SPA: Rawls/BU Pres. Months: X X X
•  D&I People plan. Quarters: X X X. SPA: Williams/BU Pres. Months: X X X
•  D&I environmental compliance plans. Quarters: X X X. SPA: BU Pres
•  Manage Capital Expenditures to 70% of Depreciation. Quarters: X X X X. SPA: Adorno. Months: X X X
•  Attain Stage 2 of Hypothetical Plant Implementation. Quarters: X X X. SPA: BU Pres


Action Plan – December 2005

Columns: ACTION; SPA; Q2, Q3, Q4

Areas: Labor Productivity, Stem Repair, Pot Days, Amps, Metal Purity, Volume

•  Arbitrate absentee grievances. SPA: B. Fry. Quarters: X
•  Reduce meetings held on overtime. SPA: B. Fry. Quarters: X
•  Dept. managers evaluate/justify team leader overtime. SPA: B. Fry. Quarters: X
•  Create database to track causes of stem damage. SPA: S. Vogt. Quarters: X
•  Reduce cost of stem damage by 25% under 2001. SPA: B. Rickards. Quarters: X
•  Achieve 236 pots operating on A line/238 B line. SPA: J. Whipp/S. Vogt. Quarters: X
•  Complete transformer maintenance repairs. SPA: B. Allen. Quarters: X
•  Finalize and implement creep plan. SPA: B. Rickards. Quarters: X X X
•  Enhance liquid level control system. SPA: S. Vogt. Quarters: X
•  Improve cast house reliability. SPA: Allen/Hillock. Quarters: X
•  Pull system partially implemented between cast house & potline. SPA: Rickards/Hillock. Quarters: X
•  Develop and implement Fluoride PMS for preventative measures. SPA: J. Whipp/R. Blain. Quarters: X
•  Develop ABS strategy to improve delivery performance. SPA: J. Kuchta. Quarters: X
•  –15% Inventory Reduction. SPA: TAG. Quarters: X


Metrics

Should be indicators (preferably leading):
–  Number of safety observations could be a leading indicator for improved safety results
–  Number of employees trained on the job could be a leading indicator for reduced scrap rate

Show the starting point and the targeted goal, and update the current results each month.


Metrics – December 2005

Metric                                      Start    Goal     Current
% OT                                        12%      6%       11%
% Absenteeism                               5%       2.5%     5%
Shipping to Delivery Performance Ratio      94%      100%     94%
Transaction to Delivery Performance Ratio   94%      100%     94%
% Operational Availability                  76%      85%      76%
% Current Efficiency                        92%      95%      92.5%
% Fe in Hot Metal                           > .15    > .12    0.15
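One way to read a start/goal/current metric is as the fraction of the gap closed so far. The sketch below is an illustration rather than part of the A3 template: the function name `gap_closed` is hypothetical, and it assumes the three rows of values in the December 2005 metrics are Start, Goal, and Current in that order.

```python
def gap_closed(start: float, goal: float, current: float) -> float:
    """Fraction of the start-to-goal gap closed so far (0.0 = no progress, 1.0 = goal met).

    Works whether the goal is above or below the starting point.
    """
    if goal == start:
        raise ValueError("goal equals start; there is no gap to close")
    return (current - start) / (goal - start)


# (start, goal, current) in percent, taken from the December 2005 metrics above.
metrics = {
    "% OT": (12.0, 6.0, 11.0),
    "% Absenteeism": (5.0, 2.5, 5.0),
    "% Operational Availability": (76.0, 85.0, 76.0),
    "% Current Efficiency": (92.0, 95.0, 92.5),
}

progress = {name: gap_closed(*values) for name, values in metrics.items()}
for name, fraction in progress.items():
    print(f"{name}: {fraction:.0%} of gap closed")
```

Expressing every metric on the same 0-to-1 scale makes it easier to see which items have not moved since the starting point.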


Summary

•  Root-cause analysis can be used for most problem-solving applications

•  The methods may vary, but the basic concepts are the same

•  All applications of RCA must be based on factual data

•  Perceptions, opinions, and assumptions must be proven or discounted


Summary

The RCA Process:
•  Classify the problem, incident, or event
•  Determine if a full RCA is required
•  Gather data to clarify the problem
•  Select the best tool for analysis
•  Perform a design review
•  Evaluate the application
•  Develop potential root causes
•  Test hypothesis
•  Develop potential corrective actions
•  Prepare cost-benefit analysis
•  Select best corrective action
•  Write final report with recommendations
•  Verify corrective action(s)
