
Failure Prevention and recovery

Date post: 24-Feb-2016
Upload: barb
Page 1: Failure Prevention and recovery

Failure Prevention and recovery

Chapter 19

Page 3: Failure Prevention and recovery

Summary

• What is failure?
• Why do failures happen?
• How do we measure failure?
• How are failures detected and analyzed?
• How can operations improve their reliability?
• How should operations recover from failures?

Page 4: Failure Prevention and recovery

Failure

Page 8: Failure Prevention and recovery

What is failure?

At its simplest, ‘failure’ is when something does not work as it should. If the shop assistant who sells you an item of clothing ‘fails’ to tell you that it should be dry cleaned, that is technically a failure. Yet in operations management we usually use the term failure to denote a more dramatic event: something that stops doing what it should do. So a piece of material fails, or a process fails.

Page 9: Failure Prevention and recovery

Why do operations fail?

Operations fail for a variety of reasons:

A. Design failures
B. Facilities failures
C. People failures
D. Supplier failures
E. Customer failures
F. Environmental disruption

Page 10: Failure Prevention and recovery

A. Design failures

A design may look fine on paper, but its limitations become clear only under real operating conditions.

Design failures arise in two situations:

1. A characteristic of demand was miscalculated or overlooked, so the process cannot adjust to actual demand. For example, a process designed to manufacture 3 televisions per hour cannot meet a demand for 7 televisions per hour.

2. Unexpected circumstances – for example, the product size demanded turns out to differ from the size in the design.

Page 12: Failure Prevention and recovery

Why systems fail

Figure: failures inside the operation (design failures, facilities failures, staff failures), together with supply failures and customer failures.

Page 13: Failure Prevention and recovery

B. Facilities failure

Failures of machines, equipment, buildings, and fittings.

C. People failure

People failures come in two types, errors and violations:

• Errors – mistakes in judgment (for example, running a motorbike on reserve petrol).
• Violations – acts that are contrary to the operating procedure (for example, a driver who skips engine oil changes, causing major damage to the engine).

Page 14: Failure Prevention and recovery

D. Supplier failure

Failure in the delivery or quality of goods and services supplied to the operation (for example, the band booked by a hotel fails to turn up).

E. Customer failure

Misuse by customers of the products and services the operation has produced.

F. Environmental disruption-related failure

All causes of failure that lie outside the operation, for example hurricanes, floods, lightning, extreme temperatures, fire, crime, theft, and terrorism.

Page 15: Failure Prevention and recovery

Measuring failure

• Most failures can be traced back to some kind of human failure. For example:

1. A machine failure may be caused by someone's poor design or maintenance.
2. A delivery failure may be caused by someone's error in managing the supply schedule.
3. A customer's mistake may happen because no one instructed the customer.

Page 16: Failure Prevention and recovery

So failures can be controlled to an extent, and an organization can learn from them. For this reason failures are often described as opportunities.

There are three main ways of measuring failure:

1. Failure rate – how often a failure occurs.
2. Reliability – the chance of a failure occurring.
3. Availability – the amount of available useful operating time.

Page 17: Failure Prevention and recovery

Failure rate (FR) measuring

The failure rate is the number of failures occurring over a period of time. For example, the failure of an airport security system can be measured by the number of security breaches in a given period.

FR = (number of failures / total number of products tested) × 100
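The failure rate formula can be sketched in a few lines of Python. This is only an illustration: the function name `failure_rate_percent` and the batch figures are hypothetical and do not come from the slides.

```python
def failure_rate_percent(failures: int, tested: int) -> float:
    """Failure rate as a percentage of the products tested."""
    if tested <= 0:
        raise ValueError("tested must be a positive number of products")
    if not 0 <= failures <= tested:
        raise ValueError("failures must lie between 0 and tested")
    # Multiply before dividing, following FR = (failures / tested) x 100.
    return failures * 100 / tested


# Hypothetical batch: 4 failures among 200 products tested.
print(failure_rate_percent(4, 200))  # 2.0
```

The input checks simply guard against an empty test batch or more failures than products tested.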

Page 18: Failure Prevention and recovery

Failure over time – the ‘bath-tub’ curve

• Failure is a function of time. At different stages of its life, the probability that something will fail is different.

The curve that describes failure probability over time is called the ‘bath-tub’ curve. According to this curve, the probability of failure is high at the beginning and at the end of the life cycle.

Page 19: Failure Prevention and recovery

There are three distinct stages:

• The ‘infant-mortality’ or ‘early-life’ stage, where early failures are caused by defective parts or improper use.

• The ‘normal-life’ stage, when the failure rate is usually low and reasonably constant.

• The ‘wear-out’ stage, when the failure rate increases as the part approaches the end of its working life.

Page 20: Failure Prevention and recovery

How failure is measured

Figure: the bath-tub curve. Failure rate plotted against time, showing the ‘infant-mortality’ stage, the normal-life stage, and the wear-out stage.

Page 21: Failure Prevention and recovery

Reliability measuring

Reliability measures the ability of a system, product or service to perform as expected over time.

Rs = R1 × R2 × R3 × … × Rn

where Rs is the reliability of the system and R1 … Rn are the reliabilities of its n components.

This formula assumes that a failure in any single component causes the whole system to fail.

So the more components a system has, the lower its reliability will be.
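As a minimal sketch of the system reliability formula, the Python below multiplies the component reliabilities together. The function name `system_reliability` and the component values are illustrative assumptions, not figures from the slides.

```python
from math import prod
from typing import Sequence


def system_reliability(component_reliabilities: Sequence[float]) -> float:
    """Rs = R1 x R2 x ... x Rn for components that must all work."""
    if not component_reliabilities:
        raise ValueError("at least one component is required")
    for r in component_reliabilities:
        if not 0.0 <= r <= 1.0:
            raise ValueError("each reliability must lie between 0 and 1")
    return prod(component_reliabilities)


# Hypothetical system: every component is 99% reliable.
for n in (1, 5, 20):
    print(n, round(system_reliability([0.99] * n), 3))
```

Running the loop shows the effect described in the text: with every component equally reliable, the system reliability falls as the number of components grows.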

Page 22: Failure Prevention and recovery

MTBF (mean time between failures)

MTBF = operating hours / number of failures

Availability

Availability is the degree to which the operation is ready to work. An operation is not available if it has either failed or is being repaired following a failure.
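The two measures on this slide can be tied together in a short sketch. The `mtbf` function follows the MTBF formula given here. The availability formula, MTBF / (MTBF + MTTR), and the mean time to repair (MTTR) are not stated in the slides; they are assumed here as a commonly used definition. All figures are hypothetical.

```python
def mtbf(operating_hours: float, failures: int) -> float:
    """Mean time between failures = operating hours / number of failures."""
    if failures <= 0:
        raise ValueError("MTBF is undefined without at least one failure")
    return operating_hours / failures


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Assumed definition: share of time the operation is ready to work."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical machine: 1000 operating hours, 4 failures, 10 hours per repair.
hours_between_failures = mtbf(1000, 4)
print(hours_between_failures)  # 250.0
print(round(availability(hours_between_failures, 10), 4))
```

Under this assumed definition, a shorter repair time raises availability even when the MTBF is unchanged.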

Page 23: Failure Prevention and recovery

Failure prevention and recovery

There are three sets of activities which relate to failure:

1. First – understand what failures are occurring in the operation and why they are occurring.
2. Second – find ways to reduce the chances of failure and to minimize the consequences of failures.
3. Third – make plans and procedures that help the operation recover from failures when they occur.

Page 24: Failure Prevention and recovery

The three tasks of failure prevention and recovery

• Failure detection and analysis – finding out what is going wrong and why
• Improving system reliability – stopping things going wrong
• Recovery – coping when things do go wrong

Page 25: Failure Prevention and recovery

Mechanisms to detect failure

There are six techniques for detecting failure:

• In-process checks – employees check that the service is acceptable during the process itself (for example, in restaurants).
• Machine diagnostic checks – a machine is tested by putting it through many activities (for example, computer servicing).
• Point-of-departure interviews – the staff formally or informally check that the service has been satisfactory.
• Phone surveys – used to solicit opinions about products or services.
• Focus groups – groups of customers are brought together to discover problems with, or attitudes towards, products or services.

Page 26: Failure Prevention and recovery

Complaint cards, feedback cards and questionnaires

Many organizations use these to collect views about their products or services.

Failure analysis

Failure analysis aims to understand why a failure has occurred.

1. Accident investigation – specifically trained staff analyze the causes of an accident (for example, an airplane or road accident).
2. Failure traceability – making sure the operation can trace a failure back to its origin (finding proof or evidence).
3. Complaint analysis – analyzing the complaints received.

Page 27: Failure Prevention and recovery

Critical incident technique (CIT), or critical incident analysis

Finding out from customers which factors satisfied them and which did not.

Page 28: Failure Prevention and recovery

How failure is detected and analyzed

Failure detection mechanisms include:

– in-process checks
– machine-diagnostic checks
– point-of-departure interviews

Failure analysis procedures include:

– accident investigation
– failure mode-and-effect analysis
– fault-tree analysis

Page 29: Failure Prevention and recovery

Failure mode and effect analysis

Failure mode and effect analysis (FMEA) identifies the features of a product, service or process that are important in determining the effect of failures. It is a way of identifying failures before they happen by providing a checklist procedure.

The checklist is built around three questions:

• What is the likelihood that a failure will occur?
• What would the consequence of the failure be?
• How likely is the failure to be detected before it affects customers?

Page 30: Failure Prevention and recovery

Based on the above questions, a risk priority number (RPN) is calculated for each potential cause of failure, so that the most important causes can be identified.

There are seven steps involved in this (see page 629).

Page 31: Failure Prevention and recovery

Failure modes effects analysis

Figure: a normal operation, a failure, its effect on the customer, and the severity of the consequence. The probability of failure, the degree of severity, and the likelihood of detection together give the risk priority number.
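A risk priority number is conventionally the product of the three ratings named in the figure. The sketch below assumes each rating is an integer from 1 to 10, which is a common convention rather than something stated in the slides. The class name `FailureCause`, the function `rank_by_rpn`, and the example ratings are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class FailureCause:
    name: str
    occurrence: int  # probability of failure: 1 (remote) to 10 (near certain)
    severity: int    # degree of severity: 1 (minor) to 10 (catastrophic)
    detection: int   # 1 (almost sure to be detected) to 10 (undetectable)

    def __post_init__(self) -> None:
        for rating in (self.occurrence, self.severity, self.detection):
            if not 1 <= rating <= 10:
                raise ValueError("ratings must be integers from 1 to 10")

    @property
    def rpn(self) -> int:
        """Risk priority number: product of the three ratings."""
        return self.occurrence * self.severity * self.detection


def rank_by_rpn(causes: Iterable[FailureCause]) -> List[FailureCause]:
    """Return causes ordered from highest to lowest risk priority."""
    return sorted(causes, key=lambda cause: cause.rpn, reverse=True)


# Hypothetical ratings for three possible causes of failure.
causes = [
    FailureCause("Oven malfunction", occurrence=2, severity=8, detection=3),
    FailureCause("Timing error by chef", occurrence=5, severity=4, detection=6),
    FailureCause("Cold plate used", occurrence=3, severity=3, detection=2),
]
for cause in rank_by_rpn(causes):
    print(cause.rpn, cause.name)
```

Sorting by RPN puts the causes that most deserve corrective action at the top of the list.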

Page 32: Failure Prevention and recovery

Fault-tree analysis

It is a logical procedure that starts from a failure or potential failure and works backwards to identify all its possible causes and origins.

Page 33: Failure Prevention and recovery

Fault-tree analysis for below-temperature food being served to customers

Figure: fault tree. Top event: food served to customer is below temperature. Branch ‘Food is cold’: oven malfunction, timing error by chef, ingredients not defrosted. Branch ‘Plate is cold’: cold plate used, plate taken too early from warmer, plate warmer malfunction. Key: AND node, OR node.
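The fault tree for cold food can be sketched as a small data structure. The transcript does not preserve which nodes in the figure are AND nodes and which are OR nodes, so the sketch assumes every gate is an OR node, and its grouping of events under the two branches is likewise a plausible reading rather than a recovered fact. The names `Gate`, `occurs`, and `basic_causes` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple, Union


@dataclass(frozen=True)
class Gate:
    name: str
    kind: str  # "AND" or "OR"
    inputs: Tuple[Union[str, "Gate"], ...]


def occurs(node: Union[str, Gate], active: Set[str]) -> bool:
    """True if the event at this node occurs, given the active basic events."""
    if isinstance(node, str):
        return node in active
    results = [occurs(child, active) for child in node.inputs]
    if node.kind == "AND":
        return all(results)
    if node.kind == "OR":
        return any(results)
    raise ValueError(f"unknown gate kind: {node.kind}")


def basic_causes(node: Union[str, Gate]) -> List[str]:
    """Work backwards from a failure to every basic cause beneath it."""
    if isinstance(node, str):
        return [node]
    causes: List[str] = []
    for child in node.inputs:
        causes.extend(basic_causes(child))
    return causes


# Gate kinds below are assumptions; the figure's node types were not recoverable.
plate_is_cold = Gate("Plate is cold", "OR", (
    "Cold plate used",
    "Plate taken too early from warmer",
    "Plate warmer malfunction",
))
food_is_cold = Gate("Food is cold", "OR", (
    "Oven malfunction",
    "Timing error by chef",
    "Ingredients not defrosted",
))
top_event = Gate("Food served to customer is below temperature", "OR",
                 (food_is_cold, plate_is_cold))

print(basic_causes(top_event))
print(occurs(top_event, {"Oven malfunction"}))
```

Changing a gate's `kind` to "AND" would require all of its inputs to occur together before the event above it is triggered.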

Page 34: Failure Prevention and recovery

Improving process reliability

At this stage the responsibility of operations managers is to prevent failures. This can be done in four ways:

1. Design out fail points
2. Build in redundancy
3. Fail-safeing
4. Maintenance

Page 35: Failure Prevention and recovery

a. Design out fail points

This is done through proper product/service design, quality planning and control, and process control.

b. Redundancy

Building redundancy into an operation means having a back-up system in case of failure (for example, duplicated systems in an airplane, a person's two kidneys, the two red rear lights on a car).
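The benefit of a back-up can be put in numbers with a standard result that is not given in the slides: if the back-up is used only when the primary fails, and the two fail independently, the combined reliability is Ra + Rb × (1 − Ra). The sketch below assumes independence and perfect switching to the back-up. The function name and figures are hypothetical.

```python
def reliability_with_backup(r_primary: float, r_backup: float) -> float:
    """Reliability of a component plus an independent back-up.

    The back-up matters only in the share of cases (1 - r_primary)
    where the primary component has failed.
    """
    for r in (r_primary, r_backup):
        if not 0.0 <= r <= 1.0:
            raise ValueError("reliabilities must lie between 0 and 1")
    return r_primary + r_backup * (1.0 - r_primary)


# Hypothetical example: a 90% reliable component with an identical back-up.
print(round(reliability_with_backup(0.9, 0.9), 4))
```

The result never falls below the reliability of the primary alone, which is why redundancy is worth considering where failures are critical.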

Page 36: Failure Prevention and recovery

c. Fail-safeing

• The concept comes from Japanese methods of operations improvement. It is known in Japan as poka-yoke, which roughly means preventing inadvertent errors. Poka-yokes are devices used to guard against failures.

Page 37: Failure Prevention and recovery

Poka-yoke (fail-safeing)

A 3.5-inch diskette cannot be inserted unless it is oriented correctly; upside-down, it can be inserted only part of the way. This feature, along with the fact that the diskette is not square, prevents incorrect orientation. It is a control method.

Warning lights and chimes alert the driver to potential problems. These devices employ a control method and a warning method.

Page 38: Failure Prevention and recovery

Poka-yoke (fail-safeing)

Filing cabinets can fall over if too many drawers are pulled out. In some filing cabinets, opening one drawer locks all the rest, reducing the chance of the cabinet tipping. It is a control method.

The window in an envelope is not only a labour-saving device. It also prevents the contents intended for one person from being inserted in an envelope addressed to another. It is a control method.
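The filing cabinet interlock is a control method that translates directly into software: the unsafe action is made impossible rather than merely discouraged. The sketch below is a hypothetical model, not a description of any real product. The class `FilingCabinet` and its methods are invented for illustration.

```python
from typing import Optional


class FilingCabinet:
    """Cabinet whose interlock allows only one drawer open at a time."""

    def __init__(self, drawers: int) -> None:
        if drawers < 1:
            raise ValueError("a cabinet needs at least one drawer")
        self._drawers = drawers
        self._open: Optional[int] = None  # index of the open drawer, if any

    @property
    def open_index(self) -> Optional[int]:
        return self._open

    def open_drawer(self, index: int) -> None:
        if not 0 <= index < self._drawers:
            raise IndexError("no such drawer")
        if self._open is not None and self._open != index:
            # Control method: the error is prevented, not just reported later.
            raise RuntimeError("drawer is locked while another drawer is open")
        self._open = index

    def close_drawer(self, index: int) -> None:
        if self._open == index:
            self._open = None


cabinet = FilingCabinet(drawers=3)
cabinet.open_drawer(0)
try:
    cabinet.open_drawer(1)
except RuntimeError as error:
    print("blocked:", error)
```

A warning method, by contrast, would let the second drawer open and only signal the risk, as the warning lights and chimes in a car do.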

Page 39: Failure Prevention and recovery

Examples of poka-yoke techniques: see page 633.

Maintenance

Maintenance is how organizations try to avoid failure by taking care of their physical facilities.

Benefits of maintenance:

1. It enhances safety
2. It enhances reliability
3. It enhances quality
4. Lower operating costs
5. Longer life
6. Higher end value (facilities can be sold second-hand)

Page 40: Failure Prevention and recovery

Three basic approaches to maintenance

• Run to breakdown (RTB) – operate until something fails, and only then carry out maintenance.

• Preventive maintenance – eliminate or reduce the chances of failure by servicing the facilities.

• Condition-based maintenance – perform maintenance only when the facilities require it. It is applicable to expensive facilities.

Page 41: Failure Prevention and recovery

Mix of maintenance approaches

Page 42: Failure Prevention and recovery

A mixture of maintenance approaches is often used, in a motor car, for example.

Figure: a motor car, with different parts labeled ‘use condition-based maintenance’, ‘use run-to-breakdown maintenance’, and ‘use preventive maintenance’.

Page 43: Failure Prevention and recovery

Total productive maintenance

Total productive maintenance (TPM) means productive maintenance carried out by all employees through small-group activities. Here, productive maintenance means maintenance management.

Five goals of TPM: see page 538, paragraph 2.

Page 44: Failure Prevention and recovery

Reliability-centered maintenance

• It is another approach to maintenance, in which different types of maintenance are used for different parts of a process.

Page 45: Failure Prevention and recovery

One part in one process can have several different failure modes, each of which requires a different approach.

Figure: cutters in a shredding process, with failures plotted against time. Cutter ‘wear out’ failure pattern. Solution: preventive maintenance before the end of useful life.

Page 46: Failure Prevention and recovery

One part in one process can have several different failure modes, each of which requires a different approach.

Figure: cutters in a shredding process, with failures plotted against time. Cutter ‘shake loose’ failure pattern. Solution: ensure correct fitting through training.

Page 47: Failure Prevention and recovery

Recovery

The activities designed to cope with failures once they occur are known as recovery.

Failure planning

The procedures which allow the operation to recover from failure are called failure planning.

Page 48: Failure Prevention and recovery

The stages in failure planning

• Discover – what has happened, what the consequences are
• Act – inform, contain, follow up
• Learn – find the root cause, engineer it out
• Plan – analyze the failure, plan the recovery

Page 49: Failure Prevention and recovery

Business continuity procedures

These procedures help the operation avoid or recover from failures and keep the business going (see page 643).

