
Root Cause Analysis Motivation, Process, Tools, and Perspectives

Summary

Root Cause Analysis (RCA) is a structured investigative process that aims to identify the true cause of a problem, and the actions necessary to eliminate or mitigate that problem. The trigger to start an RCA can be a major accident or incident, or an overall improvement program in the areas of safety, quality, or production/maintenance. The article starts with an example of a major railway accident whose root causes needed to be investigated. A discussion of the RCA process is next, followed by an investigation of available RCA tools, and the role of RCA in improvement programs. The article ends with references for further reading on this subject.

GS02003 Gerard Schram 15 pages May 2002 (Revised September 2004) SKF Reliability Systems @ptitudeXchange 5271 Viewridge Court San Diego, CA 92123 United States Tel. +1 858 496 3554 fax +1 858 496 3555 email: [email protected] Internet: www.aptitudexchange.com

Use of this document is governed by the terms and conditions contained in @ptitudeXchange.


Introduction
Importance of RCA
    Example: Railway Accident
RCA Process
RCA Tools/Methods
    Problem Identification/Understanding
    Possible Cause Generation and Consensus Reaching
    Problem and Cause Data Collection
    Possible Cause Analysis
    Cause-Effect Analysis
    Tool Selection
The Wider Perspective of RCA
    Role in HAZOP
    Role in TQM / Six Sigma
    Role in TPM
    Role in Asset Management
    Role in (S) RCM
    A Survey among Maintenance Professionals
The Consequences Of RCA
Commercial Methods/Software
    PROACT
    Taproot
Conclusion
References

© 2004 SKF Reliability Systems All Rights Reserved


Introduction

The greatest tragedy underlying errors and resultant failures is that many of them are avoidable. Yet, one of the most effective concepts for improving reliability in engineering is often neglected. That concept is learning and continuous improvement from (historical) case analysis. Well-studied examples are failures in civil engineering structures, such as the collapse of various suspension bridges (Tacoma Narrows bridge in oscillating mode due to wind, 1940).

Aeronautical and aerospace failures are also the subject of much attention, especially in the mass media. Nuclear and chemical engineering incidents can have major impacts too. Mechanical engineering failures generally result in somewhat less life-threatening situations, but can cause massive recall campaigns and product liability suits. It is obvious then, that recognizing and understanding failure (or a near failure) plays a key role in error-free design and operation. This understanding is necessary to eliminate the same causes and effects in the future.

Apart from physical failures, safety incidents, quality defects, customer complaints, etc., can be the reason for a thorough investigation into their causes. In general, we can state that a problem is a deviation from what is defined as normal, with negative impact. A problem is not always recognized (it can be perceived as normal). However, with an open-minded team and/or internal or external benchmarking, problems can be identified. Problem solving consists of identifying causes, and finding ways to eliminate them and prevent them from recurring. In other words, identifying the cause(s) is often half the answer.

A problem is often the result of multiple causes at different levels. The root cause is the "evil at the bottom" that sets in motion the cause-and-effect chain and creates the problem.

NASA defines so-called "direct" or "proximate" causes as:

The event(s) that occurred, including any condition(s) that existed immediately before the undesired outcome, directly resulted in its occurrence and, if eliminated or modified, would have prevented the undesired outcome.

Regarding an "undesired outcome", NASA provides examples such as: failure, anomaly, schedule delay, broken equipment, product defect, problem, close call, mishap, etc. Then, as the definition of root cause, NASA states:

One of multiple factors (events, conditions or organizational factors) that contributed to or created the proximate cause and subsequent undesired outcome and, if eliminated, or modified would have prevented the undesired outcome. Typically multiple root causes contribute to an undesired outcome.

NASA defines Root Cause Analysis (RCA) as:

A structured evaluation method that identifies the root causes for an undesired outcome and the actions adequate to prevent recurrence.

The American Society for Quality (ASQ) defines Root Cause Analysis (RCA) as:

RCA is a structured investigation that aims to identify the true cause of a problem, and the actions necessary to eliminate it.

In fact, RCA is a collective term used to describe a wide range of approaches, tools, and techniques used to uncover and model the causes of problems. RCA is a method that helps professionals determine what happened, how it happened, and why it happened. It allows learning from past problems, failures, and accidents. RCA can be applied to any organizational, production, or administrative problem.


There exist slightly different terms, including Failure Analysis (FA) and Root Cause Failure Analysis (RCFA). Failure Analysis refers to the observation, categorization, and possibly documentation of a failure. As such it does not necessarily intend to find the root causes that resulted in that failure (how it failed). Root Cause Failure Analysis includes the investigation towards root causes, but is somewhat limited to the term "failure." The term “failure” is biased to physical failures, while root cause analysis is applicable to many more situations, such as safety incidents, quality problems, etc.

Finally, Failure Mode Effect Analysis (FMEA) is a more hypothetical analysis to determine how a component or process could fail (failure modes), including the associated risks and consequences. FMEA can be considered a proactive way to avoid problems that have not occurred before. RCA, on the other hand, is generally initiated when an unplanned problem occurs. It then focuses on preventing recurrence in the future. The effect of the preventive action on risks and consequences is generally not taken into account.

Importance of RCA

Why perform an RCA? If the gains from eliminating the problem and its consequences are larger than the effort put into an RCA, the answer seems obvious. However, although eliminating the risk of recurrence of similar situations looks admirable, RCA could be perceived as the "program of the month." In addition, a maintenance person may be recognized for resolving emergencies when they occur, while RCA aims to eliminate root causes and thereby reduce the maintenance person's responsibilities.

Therefore, it is extremely important to align everyone in the same direction, both at management level and among production and maintenance personnel. Creating the right, open environment for learning from failures is essential [Latino, 2001].

Example: Railway Accident

A real example shows how small root causes can lead to serious damage. This example originates from SKF Belgium. A goods train traveled from Antwerp harbor to a factory in France. After 30 km the train passed a station where the temperature of the axle boxes is measured to detect possible hot boxes. Everything was normal. 35 km further the train derailed. 8 wagons were destroyed, and damage was done to the rails and overhead electrical cabling. The goods traffic was stopped for several hours.

The accident happened in Belgium, the goods were French owned, and the railway wagons were property of the German State Railways. The wagon in question was overhauled just before the accident. (By international agreement, the Belgian Railways paid damages: > US $1,000,000.)

Figure 1. Relevant Locations within Belgium (map labels: starting point, hot box control, derailment; scale bar: 50 km).

The remains of the failed axle box, equipped with two spherical roller bearings SKF 229750 J/C3R505 (Y 25 bogie – 20-ton axle load design) are shown in Figure 2. We are looking for the root cause, as we want to eliminate this problem forever!


Figure 2. Remains of the Axle Box Bearings.

The wagons were equipped with “Y25” bogies, with axle boxes with double spring suspension. Maximum authorized axle load is 20 tons. The axle boxes incorporated spherical roller bearings SKF 229750 J/C3R505.

Figure 3. The Wagons.

Figure 4. The Axle Box as Part of the Bogie.

Figure 5. Technical Drawing of the Axle Box with Two Spherical Roller Bearings and the Spacer Ring.

In the analysis of root causes, one can clearly see that this was more than a hot runner. To some extent, the inside bearing was completely deformed from red-hot running. In fact, there are clues to indicate what happened:

• There is a gap between the (inside bearing) outer ring and the labyrinth seal. The inside bearing moved towards the outside

• In principle, this should not be possible. For a 20 ton/axle arrangement, the distance ring on the axle between bearings is 35 mm wide, and regulates the precise bearing location

• The width of the distance ring - called the spacer ring - was 14 mm

• In fact, there are TWO different executions of this axle box: a 20 ton/axle payload axle box with a 35 mm spacer between bearings, and a 22.5 ton/axle payload axle box, similar but slightly narrower, with a 14 mm spacer between bearings

• Somehow, the maintenance personnel installed the wrong spacer ring

• The bearing assembly was allowed to slide to the outside, which resulted in heavier axle load, more axle bending, material fatigue, and final collapse. The bearing was running at more than red hot, and was completely deformed

• The train derailed just for a spacer!

This example shows the necessity of finding the root causes of a problem with the goal of preventing them from recurring. Human mistakes or erroneous procedures can be the root cause, but we should acknowledge the errors and learn from the mistakes.

RCA Process

The following steps are generally found in an RCA procedure:

• Problem Identification: The problem should be recognized and assigned a name. If a problem is perceived as normal, it never improves. In the case of engineering constructions, the problem can be identified by symptom analysis and equipment inspections. In general, internal or external benchmarking can also identify problems (or opportunities)

• Problem Understanding: It is necessary to understand the nature, or essential failure modes, of the problem

• Root Cause Identification: Find the correct root cause(s). This includes brainstorming and investigating possible root causes, and cause-effect relationships

• Root Cause Elimination: Eliminate the root cause(s) to prevent the problem from recurring

• Symptom Monitoring: Monitor symptoms to show the presence or elimination of the problem. Regularly take performance checks

Generally, a team performs the RCA process. As stated before, it is essential to create the right environment for an open, trustful approach. The following roles are distinguished within a manufacturing plant (2001):

• Executives: Put a stamp of approval on RCA, including expectations and time lines. They should be fully educated in RCA

• RCA Champions: Administer, support, and ensure the RCA effort from a management standpoint. They should be a mentor to the drivers and analysts, and should have the authority to protect persons in case of politically sensitive facts. They set performance expectations

• RCA Drivers: Team leaders who organize all details. The team meets, analyzes, hypothesizes, verifies, and draws factual conclusions. They develop recommendations to eliminate root causes

A structured RCA effort is intended to be a proactive task, so it should reside under the control of a reliability department. In the absence of such a department, RCA should be controlled by operations or engineering. The RCA effort should not be placed under the control of a reactive maintenance department, as their role is to respond to day-to-day activities in the field.

RCA Tools/Methods

The American Society for Quality distinguishes tools and methods by their specific purposes (2000):

• Problem identification/understanding

• Possible cause generation and consensus reaching

• Problem and cause data collection

• Possible cause analysis

• Cause-and-effect analysis

We briefly mention the various techniques. Please refer to detailed publications, such as the original work of Ishikawa of the Asian Productivity Organization.

Problem Identification/Understanding

Problem identification and understanding includes tools to identify and gain a solid understanding of the problem.

Flowcharts: Many problems are connected to business or work processes. A process flowchart is an appropriate first step to illustrate where problems occur, and to provide an understanding of processes that contain or influence problems.

Critical Incident: A method to explore the most critical issues in a situation. A collection of people from different departments or functional areas is asked about most critical incidents. The answers are collected, sorted, and analyzed based on frequency. The most critical ones are the starting point for RCA.

Spider Chart: The spider chart gives a graphical impression of how the performance of (business) processes compares with other organizations or departments (benchmarking). It compares and determines which problems are most critical from an external viewpoint.

Performance Matrix: Used to illustrate the performance and importance of problems and causes. Only problems and causes with high importance and a high performance impact are selected.

Possible Cause Generation and Consensus Reaching

The following section covers idea-generating tools to determine possible problem causes and tools to reach an agreement in case of disputes or different views.

Brainstorming: Generic process of generating a list of problem areas, consequences, causes, and ways to eliminate them. It can be structured or unstructured.

Brain Writing: Similar to brainstorming, brain writing uses written cards or a gallery of white boards or flip charts. It is preferred when the problem is complex, when dominant participants might otherwise steer the discussion, or when anonymity is desired.

Nominal Group Technique: A kind of brainstorming in which all participants have the same vote when selecting solutions / causes. Ideas are first generated, and then participants rank them individually. By totaling the points, a consensus is reached.

Paired Comparisons: Instead of comparing ideas all at once, they are compared pair-wise to reach a consensus.
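The two consensus techniques above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scoring rule (n points for first place, down to 1 for last), the majority-vote preference function, and the example causes are hypothetical and are not prescribed by the article.

```python
from collections import Counter
from itertools import combinations

def nominal_group_totals(rankings):
    """Total the points per idea; each participant ranks the ideas, best first.

    Assumed scoring rule: with n ideas, first place earns n points,
    second place n - 1, and so on down to 1 point.
    """
    totals = Counter()
    for ranking in rankings:
        n = len(ranking)
        for position, idea in enumerate(ranking):
            totals[idea] += n - position
    return totals.most_common()

def paired_comparison_wins(ideas, prefer):
    """Count how often each idea wins when ideas are compared pair-wise."""
    wins = Counter({idea: 0 for idea in ideas})
    for a, b in combinations(ideas, 2):
        wins[prefer(a, b)] += 1
    return wins.most_common()

# Hypothetical individual rankings from three participants.
rankings = [
    ["misalignment", "unbalance", "bearings"],
    ["misalignment", "bearings", "unbalance"],
    ["unbalance", "misalignment", "bearings"],
]

def majority_prefers(a, b):
    """Assumed preference rule: the idea ranked higher by most participants."""
    votes_for_a = sum(r.index(a) < r.index(b) for r in rankings)
    return a if 2 * votes_for_a > len(rankings) else b

ngt_result = nominal_group_totals(rankings)
paired_result = paired_comparison_wins(
    ["unbalance", "misalignment", "bearings"], majority_prefers
)
print(ngt_result)
print(paired_result)
```

In this made-up data set both techniques put the same cause on top, which is the kind of agreement a team would look for before moving on to data collection.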

Problem and Cause Data Collection

Here we include tools and techniques to collect reliable root cause analysis data.

Sampling: Sampling draws conclusions about a larger group based on a smaller sample. A minimum understanding of statistics is required to perform reliable sampling.


Surveys: Used to collect data about attitudes, feelings, or opinions, such as customer satisfaction, needs, and/or expectations.

Possible Cause Analysis

Possible cause analysis covers techniques for analyzing the impact of different causes.

Check Sheets: A check sheet is a table used to systematically register data.

Histogram: A bar chart used to visualize the distribution and variation of a data set. The diagram helps to identify patterns or anomalies. The frequency of occurrence is depicted vertically, while the classes are ordered along the horizontal axis.

Figure 6. Histogram Example (vertical axis: frequency of occurrence, 0 to 25; horizontal axis: shutdown, in classes <1hr, 1-4hr, 5-8hr, 8-24hr).

Table 1. Example of a Simple Check Sheet.

Cause of Machine Trouble    Tally marks (Jan, Feb)    Totals per cause
unbalance                   II, I                     3
misalignment                I, III                    4
bearings                    II                        2
….
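A check sheet and a histogram are both simple counting exercises, as the following Python sketch suggests. The observation log is hypothetical: it is chosen so that the totals per cause match Table 1, and the month assigned to the two "bearings" entries is an assumption, since the table layout does not show it.

```python
from collections import Counter

# Hypothetical log of (month, cause) observations matching the totals in
# Table 1. The month of the two "bearings" entries is an assumption.
observations = [
    ("Jan", "unbalance"), ("Jan", "unbalance"), ("Feb", "unbalance"),
    ("Jan", "misalignment"), ("Feb", "misalignment"),
    ("Feb", "misalignment"), ("Feb", "misalignment"),
    ("Jan", "bearings"), ("Jan", "bearings"),
]

def check_sheet(records):
    """Return (counts per month/cause cell, totals per cause)."""
    cells = Counter(records)
    totals = Counter(cause for _, cause in records)
    return cells, totals

def text_histogram(totals):
    """Render one bar of '#' characters per class, largest class first."""
    width = max(len(label) for label in totals)
    return "\n".join(
        f"{label.ljust(width)} | {'#' * count} ({count})"
        for label, count in totals.most_common()
    )

cells, totals = check_sheet(observations)
print(text_histogram(totals))
```

The same totals can then feed a Pareto chart or any other possible cause analysis.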

A Computer Maintenance Management System (CMMS) is another good source of data, provided data entry is done properly. For example, statistics may be derived on breakdowns and possible causes. Again, a representative set of data should be present.

Pareto Chart: The Pareto principle states that most effects, often 80 percent, are the result of a small number of causes, often 20 percent. The main purpose of the chart is to show the causes sorted by the degree of seriousness, expressed as the frequency of occurrence, cost, performance, etc. It shows which causes need further attention. Figure 7 is a simple example, in which two causes cover 80% of the problem.
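The Pareto ranking can be computed directly from cause counts. The sketch below is a minimal illustration, assuming hypothetical frequencies for five causes and an 80 percent threshold; the function names are made up for this example.

```python
def pareto_table(counts):
    """Return (cause, count, cumulative %) rows, most frequent cause first."""
    total = sum(counts.values())
    rows = []
    cumulative = 0
    for cause, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        cumulative += count
        rows.append((cause, count, 100.0 * cumulative / total))
    return rows

def vital_few(counts, threshold=80.0):
    """Return the smallest leading set of causes reaching the threshold (%)."""
    selected = []
    for cause, _, cumulative_pct in pareto_table(counts):
        selected.append(cause)
        if cumulative_pct >= threshold:
            break
    return selected

# Hypothetical frequencies of occurrence per cause.
counts = {"cause 1": 50, "cause 2": 30, "cause 3": 10, "cause 4": 6, "cause 5": 4}

for cause, count, cumulative_pct in pareto_table(counts):
    print(f"{cause}: {count:3d}  cumulative {cumulative_pct:5.1f}%")
print("Needs further attention:", vital_few(counts))
```

With these made-up numbers, two causes cover 80% of the occurrences. The same functions work if cost or lost production is used as the measure of seriousness instead of frequency.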

Like the CMMS, other documentation on health/safety/environmental (HSE) accidents and incidents can be a valuable data source. Possibly, extra fields can be added to these systems to better trigger and track problems.

Relevant data may also be found in general databases with reliability data (often referred to as RAM data). A few example databases:

• OREDA for Offshore Reliability Data, with turbines, compressors, etc. http://www.oreda.com

• Process Equipment Reliability Database (PERD) of the American Institute of Chemical Engineers http://www.AIChe.org


Relations Diagram: A tool to identify logical relationships between different ideas or issues in a complex or confusing situation. The factors under investigation are distributed in an empty chart area, and arrows illustrate the relationships between them.

Figure 7. Pareto Chart Example (bars: frequency per cause, cause 1 to cause 5; line: cumulative %, from 0% to 100%).

Affinity Diagram: A chart approach that helps identify seemingly unrelated ideas, causes, or other concepts so they might collectively be further explored. A way to handle and brainstorm about causes in a qualitative way rather than quantitative.

Cause-Effect Analysis

The last stage is the cause-effect analysis. A few tools are mentioned here.

Cause-Effect Chart: This is a well-known technique used to relate possible causes to a problem. It is also called the Ishikawa diagram or fishbone diagram.

Scatter Charts: Illustrate relationships between two causes or other variables in a problem situation. This is achieved by plotting at least 30 samples of data pairs in one figure. Logarithmic axes may also be used where appropriate. The data may be generated by experiments in which variables are changed and the effects are plotted.
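A scatter chart is usually judged by eye, but the strength of a linear relationship can also be summarized with the Pearson correlation coefficient. The following sketch assumes made-up data for a machine setting and a measured paper thickness; the coefficient is a common companion to a scatter chart, not something the article itself prescribes.

```python
import math

MIN_SAMPLES = 30  # rule of thumb from the text: at least 30 data pairs

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical experiment: 30 knob settings and the resulting thickness,
# a linear trend plus a small deterministic disturbance.
knob_settings = list(range(1, 31))
thickness = [0.1 * k + 0.02 * ((k * 7) % 5 - 2) for k in knob_settings]

enough_samples = len(knob_settings) >= MIN_SAMPLES
r = pearson_r(knob_settings, thickness)
print(f"samples: {len(knob_settings)}, enough: {enough_samples}, r = {r:.3f}")
```

A coefficient close to +1 or -1 points to a strong linear relationship, while a value near 0 suggests none. A strong correlation alone does not prove that one variable causes the other.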

After completing the cause-effect diagram, examples / facts can also be entered. These illustrate the relationships, and provide an idea about their strength.

Figure 8. Scatter Chart Example (axes: paper thickness and "knob A").

The cause-effect diagram shows that multiple causes can result in the same problem. The diagram can be used as a discussion aid to determine which causes are considered the primary (root) causes of the actual problem. If enough data is available, a probabilistic approach could yield the most likely root causes.

Figure 9. Cause-Effect Diagram (Fishbone).


Advanced Tools: There are various other ways to model cause-effect relations based on (statistical) correlations or regression techniques. However, they fall outside the scope of this introductory article on RCA.

Fault-Trees: Another visual way to represent cause-effect relationships. The fault tree starts with faults / problems. Causes, possibly at several layers, are then depicted with arrows indicating the relationships.

Matrix Diagram: A visual technique for arranging possible causes by their contribution to the problem. Problem characteristics are ordered vertically, and possible causes horizontally. The contributions of the cause to problem characteristics are depicted in the matrix. By accumulating individual contributions, you get an idea of which causes are most significant. It is also sometimes referred to as a cause-effect matrix.
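The accumulation step of the matrix diagram can be sketched as follows. The problem characteristics, causes, and the 1/3/9 contribution scores are hypothetical values chosen for illustration only.

```python
def rank_causes(characteristics, causes, contributions):
    """Sum each cause's contributions over all problem characteristics.

    contributions[i][j] is the assumed contribution score of causes[j]
    to characteristics[i]. Returns (cause, total) pairs, largest first.
    """
    if len(contributions) != len(characteristics):
        raise ValueError("one row per problem characteristic is required")
    if any(len(row) != len(causes) for row in contributions):
        raise ValueError("one column per possible cause is required")
    totals = {
        cause: sum(row[j] for row in contributions)
        for j, cause in enumerate(causes)
    }
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical cause-effect matrix (rows: characteristics, columns: causes).
characteristics = ["vibration", "noise", "heat"]
causes = ["unbalance", "misalignment", "lubrication"]
contributions = [
    [9, 3, 1],  # vibration
    [3, 3, 1],  # noise
    [1, 3, 9],  # heat
]

ranking = rank_causes(characteristics, causes, contributions)
print(ranking)
```

The totals only rank the causes relative to each other. The scores themselves remain team judgments and should be backed by data where possible.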

Other advanced techniques stem from artificial intelligence, such as artificial neural networks, fuzzy models, logical decision trees, and other network representations. The cause-effect networks are used to reason forward or backward. The network, together with reasoning capacity, forms a so-called expert system, or knowledge-based system.

These tools can be tuned by both "data" and "heuristics." For example, the Bayesian network is used to model cause-effect relations, where the strength of the relationship is modeled as probabilities. SKF applies the Bayesian network to support bearing failure or damage investigations.
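To give a feel for the probabilistic reasoning behind such a network, the sketch below applies Bayes' theorem to the simplest possible case: one cause variable with mutually exclusive states and one observed symptom. All probabilities are invented for illustration, and this is not the network SKF uses; a real Bayesian network has many interconnected nodes.

```python
def posterior(priors, likelihoods):
    """Posterior probability of each cause given one observed symptom.

    priors[c]      : assumed probability of cause c before any observation
    likelihoods[c] : assumed probability of the symptom if cause c is present
    The causes are treated as mutually exclusive and exhaustive.
    """
    joint = {cause: priors[cause] * likelihoods[cause] for cause in priors}
    evidence = sum(joint.values())
    return {cause: value / evidence for cause, value in joint.items()}

# Hypothetical numbers for an overheating bearing.
priors = {"lubrication": 0.5, "misalignment": 0.3, "contamination": 0.2}
likelihoods = {"lubrication": 0.8, "misalignment": 0.4, "contamination": 0.1}

result = posterior(priors, likelihoods)
for cause, probability in sorted(result.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{cause}: {probability:.3f}")
```

Reasoning from the observed effect back to the most likely cause in this way corresponds to the "backward" reasoning mentioned above.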

Five Whys: The main purpose is to keep asking "why" when a cause is identified. Each cause is examined to determine whether it is a symptom, a lower-level cause, or a root cause. The chains of causes can be drawn in a simple chart. The rule of thumb is that the method often requires five rounds of the question "why."
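A chain of "why" questions can be recorded as a simple mapping from each effect to its cause. The sketch below uses the railway accident from earlier in this article. The wording of each step is a paraphrase, and the dictionary layout is just one possible way to store the chain.

```python
def why_chain(problem, answers):
    """Follow the answers to repeated "why?" questions down to the last cause.

    answers maps each effect to the cause given when asking "why?".
    The chain stops at the first cause that has no recorded answer.
    """
    chain = [problem]
    seen = {problem}
    while chain[-1] in answers:
        cause = answers[chain[-1]]
        if cause in seen:
            raise ValueError(f"circular reasoning at: {cause}")
        chain.append(cause)
        seen.add(cause)
    return chain

# Paraphrased from the railway example in this article.
answers = {
    "train derailed": "axle collapsed after bending and material fatigue",
    "axle collapsed after bending and material fatigue":
        "bearing assembly slid to the outside",
    "bearing assembly slid to the outside":
        "spacer ring was 14 mm wide instead of 35 mm",
    "spacer ring was 14 mm wide instead of 35 mm":
        "maintenance personnel installed the wrong spacer ring",
}

chain = why_chain("train derailed", answers)
for depth, step in enumerate(chain):
    print("  " * depth + step)
```

Here four rounds of "why" lead to the wrong spacer ring. A further round (why was the wrong ring installed?) would move the investigation to procedures and training.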

Figure 10. A Bayesian Network Used to Model Relations Between Causes and Effects. The Arrows Denote relationships, While Numbers and Red Bars Denote Probability of Occurrence.


Tool Selection

These tools and methods are aids to reach the goal, rather than the solution itself. In the general RCA process, the tools support the problem understanding and root cause identification steps. The American Society for Quality further outlines the particular strengths and weaknesses of the tools (2000). In general, the selection is very situation dependent.

Doggett (2004) concludes, after investigating three RCA tools (Cause and Effect Diagrams, Interrelationship Diagrams, and Current Reality Trees), that none of the tools was perceived as significantly better in terms of finding root causes. On the other hand, the complexity of the tools varies, and so do the training requirements.

The Wider Perspective of RCA

Root cause analysis can be used after a major incident or accident like the railway problem outlined earlier. However, RCA can also be part of a bigger improvement program, such as safety, quality, or maintenance improvement programs. RCA identifies problems (opportunities to improve) and finds root causes.

Role in HAZOP

A Hazard and Operability (HAZOP) study is a methodical review of a defined operation system to identify potential hazards and operability problems. It identifies and defines process and design deficiencies, the potential for, and consequences of, human and organizational error, accidents from neighboring plants or activities, natural occurrences and catastrophes, and the possibilities of equipment component failures. As such, many RCA tools and methods can play a role in a HAZOP study.

Role in TQM / Six Sigma

Total Quality Management (TQM) and Six Sigma stand for a stream of programs aimed at tackling major causes of quality defects. We can state that RCA originates from quality improvement philosophies, and many RCA tools / methods are present in TQM and Six Sigma. Some RCA tools can be embedded in a plant's quality procedures, as one main goal is achieving a continuous process of quality improvement. For example, critical incident investigations, performance spider charts, etc., can be done on a regular basis.

Role in TPM

Total Productive Maintenance (TPM) stands for an improvement program that covers both production and maintenance functions. It is founded on the concept of ownership and complete integration of the production and maintenance functions.

The prime driver for TPM is the concept of Overall Equipment Effectiveness (OEE). The philosophy hinges on making equipment effectiveness the concern of everyone in the organization. OEE requires strict attention to the measurement and quantification of losses. When identifying big losses and their root causes, RCA tools play a useful role. As such, RCA tools can be part of a TPM program.

Role in Asset Management

Asset Management (AM) tries to attain the lowest life cycle cost with maximum availability, performance efficiency, and quality (maximum OEE). In other words, AM is the systematic planning and control of a physical asset throughout its life. One outcome of AM is defining which specific maintenance practices need to be undertaken, while considering the optimum means of implementing them. This is where RCA tools can again play a useful role.
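OEE combines the three factors named above: availability, performance efficiency, and quality. The article does not give a formula, so the sketch below assumes the commonly used definition, in which OEE is the product of the three factors. The shift data and parameter names are hypothetical.

```python
def oee(availability, performance, quality):
    """OEE as the product of three factors, each a fraction between 0 and 1."""
    for name, value in (("availability", availability),
                        ("performance", performance),
                        ("quality", quality)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return availability * performance * quality

def oee_from_counts(planned_time, downtime, ideal_cycle_time, total_count, good_count):
    """Derive the three factors from shift data (times in the same unit)."""
    run_time = planned_time - downtime
    availability = run_time / planned_time
    performance = (ideal_cycle_time * total_count) / run_time
    quality = good_count / total_count
    return oee(availability, performance, quality)

# Hypothetical shift: 480 min planned, 48 min downtime, ideal cycle time of
# 1 min per unit, 400 units produced, 380 of them good.
shift_oee = oee_from_counts(480, 48, 1.0, 400, 380)
print(f"OEE = {shift_oee:.1%}")
```

Each factor below 1 represents a loss category (downtime, speed loss, quality loss), and these losses are where RCA tools can be pointed to look for root causes.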


Role in (S) RCM A survey of the use of RCA techniques by maintenance professionals was conducted on the Plant Maintenance Resource Center in 2000. See the results at: http://www.plant-maintenance.com/articles/rca-survey-01.shtml

Reliability Centered Maintenance (RCM) and SRCM are structured processes to proactively identify equipment modifications and/or safety devices required to avoid any significant consequence as a result of equipment failure. Consequences can be operational loss, safety, health, or environmental. By RCM study, all of the potential modes of failure are uncovered and a maintenance strategy is devised to mitigate the consequences of the failure based on the criticality of the failure mode. In RCM, these failure modes are identified as the root cause(s) of the failure.

The key findings are:

59% of respondents indicated that they use some form of RCA process

Of those who indicated that they used some form of RCA, 79% indicated that they used formal, structured processes

Those using formal processes considered that the overall effectiveness of their approach was significantly better than did those people using informal processes.

This is where the main difference lies. The purpose of RCA is to uncover the underlying reasons (root causes) why an event (not just equipment related events, but any type of event) is occurring so that the necessary steps can be taken to eliminate the event in its entirety. This is accomplished by analyzing the modes (the point at which RCM stops).

RCA uses, for example, a logic tree that stresses verification at every level. The advantage is that the root causes uncovered are facts derived from the verification process. RCM is driven by deriving a maintenance strategy, while RCA is driven by maintenance prevention.

Within RCM, FMEA stands as the central vehicle; however, the RCA tools and methods can be of additional help when an FMEA calls for deeper investigation of the failure modes. Secondly, RCA is to be used when periodically updating the maintenance strategy derived from RCM, so that continuous improvement of the maintenance strategy is achieved.

A Survey among Maintenance Professionals

Supervisory and technical staff are more likely to be involved in RCA than shop floor personnel.

The greatest benefits appear to be in the area of improved equipment availability and reliability.

60% of respondents indicated that they used external consultants to assist with their RCA implementation.

55% of respondents indicated that they used software to assist with their RCA process.

The survey shows that RCA is quite widespread among maintenance functions, and that the structured process of RCA is key to making RCA effective.

The Consequences Of RCA

To prevent the problem from recurring, the root cause(s) should be eliminated. The results of the root cause investigation and the necessary actions are considered the outcome of RCA. It is essential to know cause-effect relationships to prevent problems from recurring.

The assessment of these actions is generally not addressed within the RCA context. This is typically the second part of an FMEA process, whereby possible actions are assessed on their effect, in terms of reduced risk or consequence. It is worthwhile to consider this approach when assessing alternative actions. @ptitudeXchange provides articles on FMEA for further reading.

In order to arrive at a continuous improvement situation, RCA needs to be embedded into the normal work processes. As an example, within the SKF concept of Proactive Reliability Maintenance™ (PRM), an improvement loop is defined (Figure 11). Starting with an operational review, a predictive maintenance program is set up. Where critical anomalies are detected, RCA is applied, providing corrective actions to prevent the anomalies from occurring again. The process is monitored through a number of key performance indicators.

Figure 11: Proactive Reliability Maintenance™.

These types of work processes generally require adjustments to the standard job plans. For example, anomalies detected during predictive maintenance should trigger RCA procedures. RCA results have to be documented extensively (see e.g. Reed, 2003) and recorded appropriately in the CMMS to keep a good machinery history. Corrective work (e.g., cleaning, repair) or adjustments in maintenance strategy (e.g., preventive vs. predictive) need to be planned and scheduled.
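The hand-over described above (anomaly detected, RCA started, results recorded, corrective work planned) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and function names (`Anomaly`, `RcaRecord`, `MachineHistory`, `handle_anomaly`, `example_investigation`) and the sample data are hypothetical, and `MachineHistory` is a simple in-memory stand-in, not the interface of any real CMMS.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Anomaly:
    """An anomaly reported by predictive maintenance (hypothetical structure)."""
    asset: str
    description: str
    critical: bool


@dataclass
class RcaRecord:
    """The documented outcome of one RCA: root causes and necessary actions."""
    asset: str
    problem: str
    root_causes: List[str] = field(default_factory=list)
    actions: List[str] = field(default_factory=list)


@dataclass
class MachineHistory:
    """In-memory stand-in for the machinery history kept in a CMMS (assumption)."""
    records: List[RcaRecord] = field(default_factory=list)
    work_orders: List[str] = field(default_factory=list)

    def log(self, record: RcaRecord) -> None:
        # Keep the RCA result and turn every action into work to be planned.
        self.records.append(record)
        for action in record.actions:
            self.work_orders.append(f"{record.asset}: {action}")


Investigation = Callable[[Anomaly], Tuple[List[str], List[str]]]


def handle_anomaly(anomaly: Anomaly, history: MachineHistory,
                   investigate: Investigation) -> Optional[RcaRecord]:
    """Start an RCA for critical anomalies only; return the record, or None."""
    if not anomaly.critical:
        return None
    root_causes, actions = investigate(anomaly)
    record = RcaRecord(anomaly.asset, anomaly.description,
                       list(root_causes), list(actions))
    history.log(record)
    return record


def example_investigation(anomaly: Anomaly) -> Tuple[List[str], List[str]]:
    """Placeholder for the actual RCA; the findings below are made up."""
    return (["Relubrication interval missing from job plan"],
            ["Repair bearing", "Add relubrication task to job plan"])
```

Calling `handle_anomaly(Anomaly("Pump 1", "high vibration", True), MachineHistory(), example_investigation)` returns a record with one root cause and queues two work orders, while a non-critical anomaly is skipped. Restricting RCA to critical anomalies mirrors the improvement loop described earlier; a real implementation would apply its own criticality criteria.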

In case of large changes, a change management project may follow RCA. For example, when changing organizational structure or major responsibilities, a structured management of change is needed (Schram & Yolton, 2004).

Commercial Methods/Software

Just two of the many tools are mentioned here. Most commercial tools let the user build or search cause-failure trees and then visualize them. It should again be emphasized that RCA is more a process than a tool: the tool supports the structuring of the process.
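To make the idea of building, searching, and visualizing a cause-failure tree concrete, the following Python sketch shows a minimal tree in which every node carries a verification status. It does not reproduce any commercial tool: the names (`CauseNode`, `walk`, `search`, `verified_root_causes`, `render`, `build_example`) and the bearing example are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class CauseNode:
    """One hypothesis in the tree.

    verified: None = not yet checked, True = confirmed by evidence,
    False = ruled out.
    """
    label: str
    verified: Optional[bool] = None
    children: List["CauseNode"] = field(default_factory=list)

    def add(self, label: str, verified: Optional[bool] = None) -> "CauseNode":
        child = CauseNode(label, verified)
        self.children.append(child)
        return child


def walk(node: CauseNode, depth: int = 0) -> Iterator[Tuple[int, CauseNode]]:
    """Yield (depth, node) pairs in depth-first order."""
    yield depth, node
    for child in node.children:
        yield from walk(child, depth + 1)


def search(root: CauseNode, term: str) -> List[CauseNode]:
    """Return all nodes whose label contains the term (case-insensitive)."""
    term = term.lower()
    return [node for _, node in walk(root) if term in node.label.lower()]


def verified_root_causes(node: CauseNode) -> List[str]:
    """Return leaf causes reached only through verified nodes."""
    if node.verified is not True:
        return []
    if not node.children:
        return [node.label]
    causes: List[str] = []
    for child in node.children:
        causes.extend(verified_root_causes(child))
    return causes


def render(root: CauseNode) -> str:
    """Draw the tree as indented text with a verification mark per node."""
    marks = {True: "[x]", False: "[ ]", None: "[?]"}
    return "\n".join(f"{'  ' * depth}{marks[node.verified]} {node.label}"
                     for depth, node in walk(root))


def build_example() -> CauseNode:
    """A made-up tree for a failed pump bearing."""
    root = CauseNode("Pump bearing failed", True)
    lubrication = root.add("Inadequate lubrication", True)
    lubrication.add("Relubrication interval not in job plan", True)
    lubrication.add("Wrong grease type", False)
    root.add("Misalignment", False)
    return root
```

With the example tree, `verified_root_causes(build_example())` returns only the cause whose entire path has been confirmed, and `print(render(build_example()))` gives an indented overview of confirmed, rejected, and open hypotheses. Storing the verification status on every node is one simple way to reflect the emphasis on verification at each level; the tree itself remains a support for the process rather than a substitute for it.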


PROACT

Reliability Center Inc. offers a method called PROACT, accompanied by a software tool. PROACT stands for:

• PReserving event data

• Ordering the analysis team

• Analyzing event data

• Communicating findings and recommendations

• Tracking for bottom line results

The method is clear, and a great deal of attention is paid to human and organizational errors. Many other software tools focus only on (modeling) the mechanical issues. More information can be found at: http://www.reliability.com/

TapRoot

System Improvements Inc. offers a software suite called TapRoot.

The suite of tools includes Root Cause Tree software, which provides the investigator with a fairly comprehensive list of causes that should be considered for any incident. Each causal factor that contributed to the incident should be analyzed one at a time. A dictionary provides explanations and definitions of each part of the root cause tree. This allows for consistent, non-overlapping root causes that create trending in a database. It also includes a checklist that ensures consideration of the most frequently occurring human performance contributors to an incident, which helps narrow down the seven basic cause categories. It also helps keep the investigator's mind open and focused.

A second software tool, Equifactor, incorporates Heinz Bloch's equipment troubleshooting techniques. These techniques include:

• Equipment Troubleshooting Tables

• Component Troubleshooting Tables

• FRETT Analysis

• Equipment 7 Cause Categories

More information can be found at: http://www.taproot.com/

Summary: PROACT is a process with an empty, supportive tool, while TapRoot is a step-by-step search in a database with tables and trees.

Conclusion

Root Cause Analysis (RCA) is a structured investigation that aims to identify the true cause of a problem, the cause-effect relationships, and the actions necessary to eliminate it. The trigger to start an RCA can be a major accident or incident, or an overall improvement program in the areas of safety, quality, or production / maintenance. The RCA process consists of problem identification / understanding. The outcomes of RCA are recommendations for change and monitoring to keep the problem from recurring. Several tools and methods exist that can support the RCA process.

Acknowledgements

The author would like to thank Wayne Reed for his contributions to this paper.

References

Petroski, H. Design Paradigms - Case Histories of Error and Judgment in Engineering. Cambridge University Press, United Kingdom: 1994.

Ishikawa, K., Guide to Quality Control. Asian Productivity Organization: 1982.


Magnusson, K., Kroslid, D., Bergman, B. Six Sigma - The Pragmatic Approach. Studentlitteratur, Lund: 2000.

Burr, J.T., "The Tools of Quality, Part I: Going with the Flow", Quality Progress, June 1990.

Sarazan, S., "The Tools of Quality, Part II: Cause-and-effect Diagrams", Quality Progress, July 1990.

Shaldin, P.D., "The Tools of Quality, Part III: Control Charts", Quality Progress, August 1990.

Juran Institute, "The Tools of Quality, Part IV: Histograms", Quality Progress, September 1990.

Juran Institute, "The Tools of Quality, Part V: Check Sheets", Quality Progress, October 1990.

Burr, J.T., "The Tools of Quality, Part VI: Pareto Charts", Quality Progress, November 1990.

Burr, J.T., "The Tools of Quality, Part VII: Scatter Diagrams", Quality Progress, December 1990.

Anderson, B., Fagerhaus, T. Root Cause Analysis - Tools and Techniques. American Society for Quality (ASQ), Quality Press, Milwaukee, Wisconsin: 2000.

NASA, Root Cause Analysis Overview, July 2003. Office of Safety & Mission Assurance, Chief Engineers Office.

Doggett, A.M., A Statistical Comparison of Three Root Cause Analysis Tools. Journal of Industrial Technology. Vol. 20(2), 2004.

Latino, R.J., "Creating the environment for RCA to succeed", Maintenance Technology Magazine, April 2001.

Latino, K.C., "Fighting Failure", Maintenance Technology Magazine, December 2001.

Latino, R.J., Latino, K.C. "Root Cause Analysis - Improving Performance for Bottom Line Results". CRC Press: 1999. (Second edition 2002.)

Clements-Jewery, K. "Structure the machine failure information", In: Asset Maintenance Management, Wilson, A. (Ed.). Conference Communication, Monks Hill, UK: 1999.

Goodacre, J. "Identifying current needs using root cause analysis." Maintenance & Asset Management, Vol. 16(6), 18-23, 2001.

Bloch, H.P., Geitner, F.K. Machinery Failure Analysis & Troubleshooting (3rd Edition). Gulf Professional Publishing: 1997.

Paradies, M., Unger, L. Taproot: The System for Root Cause Analysis, Problem Investigation & Proactive Improvement. System Improvements Inc.: 2000.

Schram, G., van der Vorst, B., Decision-Support System for Bearing Failure Mode Analysis. EVOL03_no1_p25. http://www.aptitudeXchange.com

System Improvements Inc., Equipment Troubleshooting. SI04003. http://www.aptitudeXchange.com

Barratt, M., Proactive Maintenance. MB02028, http://www.aptitudeXchange.com

Reed, W., RCFA Report Template. GS03008, http://www.aptitudeXchange.com

Schram, G., Yolton, J., Change Management. GS04015, http://www.aptitudeXchange.com

