Dependability Theory and Methods
Part 1: Introduction and definitions

Andrea Bobbio
Dipartimento di Informatica
Università del Piemonte Orientale, “A. Avogadro”
15100 Alessandria (Italy)
bobbio@unipmn.it - http://www.mfn.unipmn.it/~bobbio

Bertinoro, March 10-14, 2003
Dependability: Definition

Dependability is the property of a system to be dependable in time, i.e. such that reliance can justifiably be placed on the service it delivers.

Dependability extends the interest in the system from the design and construction phase to the operational phase (the whole life cycle).

What dependability theory and practice want to avoid

Dependability: Taxonomy

dependability

measures: reliability, availability, maintainability, safety, security

means: fault forecasting, fault tolerance, fault removal, fault prevention

threats: faults, errors, failures

Quantitative analysis

Quantitative analysis aims at numerically evaluating measures that characterize the dependability of an item. It matters for:

Risk assessment and safety

Design specifications

Technical assistance and maintenance

Life cycle cost

Market competition

Risk assessment and safety

The risk associated with an activity is proportional to the probability of occurrence of the activity and to the magnitude of the consequences:

R = P · M

(R: risk, P: probability of occurrence, M: magnitude of the consequences)

A safety-critical system is a system whose incorrect behavior may give rise to a risk, causing undesirable consequences to the item, to the operators, to the population, or to the environment.

Design specifications

Technological items must be dependable.

Sometimes, dependability requirements (both qualitative and quantitative) are part of the design specifications:

Mean time between failures

Total down time

Technical assistance and maintenance

The planning of all activities related to technical assistance and maintenance is linked to the system dependability (expected number of failures over time):

planning spare parts and maintenance crews;

cost of the technical assistance (warranty period);

preventive vs reactive maintenance.

Market competition

The choice of the consumers is strongly influenced by the perceived dependability.

advertisement messages stress the dependability;

the image of a product or of a brand may depend on the dependability.

Purpose of evaluation

Understanding a system

– Observation

– Operational environment

– Reasoning

Predicting the behavior of a system

– Need a model

– A model is a convenient abstraction

– Accuracy based on degree of extrapolation

Methods of evaluation

Measurement-Based

Most believable, most expensive

Not always possible or cost effective during system design

Model-Based

Less believable, Less expensive

Analytic vs Discrete-Event Simulation

Combinatorial vs State-Space Methods

Measurement-Based

Most believable, most expensive;

Data are obtained by observing the behavior of physical objects:

field observations;

measurements on prototypes;

measurements on components (accelerated tests).

Models

Analytic: closed-form answers or numerical solution

Simulation

All models are wrong; some models are useful

Methods of evaluation

Measurements + Models data bank

The probabilistic approach

The mechanisms that lead to the failure of a technological object are very complex and depend on many physical, chemical, technical, human, environmental … factors.

The time to failure cannot be expressed by a deterministic law.

We are forced to assume the time to failure as a random variable.

The quantitative dependability analysis is based on a probabilistic approach.

Reliability

Reliability is a measurable attribute of dependability, defined as follows:

The reliability R(t) of an item at time t is the probability that the item performs the required function in the interval (0, t), given the stress and environmental conditions in which it operates.

Basic Definitions: cdf

Let X be the random variable representing the time to failure of an item.

The cumulative distribution function (cdf) F(t) of the r.v. X is given by:

F(t) = Pr { X ≤ t }

F(t) represents the probability that the item has already failed at time t (unreliability).

Basic Definitions: cdf

Equivalent terminology for F(t):

CDF (cumulative distribution function)

Probability distribution function

Distribution function

Basic Definitions: cdf

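[Figure: plot of F(t) versus t, rising from 0 towards 1, with the values F(a) and F(b) marked at two instants a < b.]

F(0) = 0

lim F(t) = 1 as t → ∞

F(t) is non-decreasing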

Basic Definitions: Reliability

Let X be the random variable representing the time to failure of an item.

The survivor function (sf) R(t) of the r.v. X is given by:

R (t) = Pr { X > t } = 1 - F(t)

R(t) represents the probability that the item is correctly working at time t and gives the reliability function .

Basic Definitions

Equivalent terminology for R(t) = 1 - F(t):

Reliability

Complementary distribution function

Survivor function

Basic Definitions: Reliability

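[Figure: plot of R(t) versus t, falling from 1 towards 0, with the value R(a) marked; instants a and b on the time axis.]

R(0) = 1

lim R(t) = 0 as t → ∞

R(t) is non-increasing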

Basic Definitions: density

Let X be the random variable representing the time to failure of an item and let F(t) be a differentiable cdf.

The density function f(t) is defined as:

f(t) = dF(t) / dt

f(t) dt = Pr { t ≤ X < t + dt }

Basic Definitions: Density

[Figure: plot of f(t) versus t; the area under the curve between a and b is highlighted.]

$$\int_a^b f(x)\,dx = \Pr\{a < X \le b\} = F(b) - F(a)$$

Basic Definitions: Density

[Figure: plot of f(t) versus t.]

$$\mathrm{MTTF} = E[X] = \int_0^\infty t\,f(t)\,dt = \int_0^\infty R(t)\,dt$$
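The equality between the two integral expressions of the MTTF can be checked numerically. The sketch below is illustrative only: it assumes an exponential time to failure with a hypothetical rate of 0.5 failures per unit of time, and it truncates the infinite integration range at an arbitrary upper bound. The names `RATE`, `UPPER` and `trapezoid` are not from the slides.

```python
import math

# Assumed example (not from the slides): exponential time to failure
# with a hypothetical rate of 0.5 failures per unit of time.
RATE = 0.5


def density(t):
    """Density f(t) of the assumed example distribution."""
    return RATE * math.exp(-RATE * t)


def reliability(t):
    """Reliability R(t) = 1 - F(t) of the assumed example distribution."""
    return math.exp(-RATE * t)


def trapezoid(func, a, b, n=200_000):
    """Composite trapezoidal rule for the integral of func over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h


UPPER = 80.0  # arbitrary truncation point; the tail beyond it is negligible here

mttf_from_density = trapezoid(lambda t: t * density(t), 0.0, UPPER)
mttf_from_reliability = trapezoid(reliability, 0.0, UPPER)

print(mttf_from_density, mttf_from_reliability)  # both close to 1 / RATE
```

Integrating R(t) is often the more convenient route in practice, because it needs only the reliability function and not its derivative.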

Basic Definitions

Equivalent terminology: pdf

probability density function

density function

density

$$f(t) = \frac{dF(t)}{dt}, \qquad F(t) = \int_{-\infty}^{t} f(x)\,dx = \int_0^t f(x)\,dx$$

The last equality holds for a non-negative random variable.

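Quiz 1: The higher the MTTF is, the higher the item reliability is.

1. Correct

2. Wrong

The correct answer is 2 (Wrong): the MTTF is only the mean of the time to failure, and it does not by itself determine the value of R(t) at a given time t.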
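A small counterexample may make the quiz answer concrete. The two items below are hypothetical and are not taken from the slides: item A has an exponential time to failure with MTTF = 10, item B fails uniformly between 4 and 6 time units, so its MTTF is 5. For a mission time of 1, item B is the more reliable one even though its MTTF is lower.

```python
import math


def reliability_exponential(t, mttf):
    """R(t) for an exponential time to failure with the given MTTF."""
    return math.exp(-t / mttf)


def reliability_uniform(t, low, high):
    """R(t) for a time to failure uniformly distributed on [low, high]."""
    if t <= low:
        return 1.0
    if t >= high:
        return 0.0
    return (high - t) / (high - low)


MISSION_TIME = 1.0  # hypothetical mission time

# Item A: exponential, MTTF = 10 (hypothetical).
r_a = reliability_exponential(MISSION_TIME, mttf=10.0)

# Item B: uniform on [4, 6], MTTF = 5 (hypothetical).
r_b = reliability_uniform(MISSION_TIME, low=4.0, high=6.0)

print(f"R_A({MISSION_TIME}) = {r_a:.4f}, R_B({MISSION_TIME}) = {r_b:.4f}")
```

The comparison reverses for long missions (for example t = 7, where item B has certainly failed), which is why reliability must always be stated together with the time interval it refers to.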

Hazard (failure) rate

h(t) Δt = conditional probability that the system will fail in (t, t + Δt), given that it has survived until time t

f(t) Δt = unconditional probability that the system will fail in (t, t + Δt)

$$h(t) = \frac{f(t)}{R(t)} = \frac{f(t)}{1 - F(t)}$$
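The difference between the conditional and the unconditional probability can be seen numerically. The following sketch assumes, purely for illustration, the cdf F(t) = 1 - exp(-t²); this distribution and the values of `t` and `dt` are hypothetical choices, not part of the slides.

```python
import math


def cdf(t):
    """Assumed example cdf F(t) = 1 - exp(-t^2), for t >= 0."""
    return 1.0 - math.exp(-t * t)


def density(t):
    """Density f(t) = dF/dt of the assumed example."""
    return 2.0 * t * math.exp(-t * t)


def hazard(t):
    """Failure rate h(t) = f(t) / (1 - F(t))."""
    return density(t) / (1.0 - cdf(t))


t, dt = 1.0, 1e-4  # hypothetical age and a short interval

# Probability of failing in (t, t + dt), without any knowledge of survival.
unconditional = cdf(t + dt) - cdf(t)

# Same event, given that the item has survived until t.
conditional = unconditional / (1.0 - cdf(t))

print(unconditional, density(t) * dt)  # approximately equal
print(conditional, hazard(t) * dt)     # approximately equal
```

The conditional probability is larger than the unconditional one because it is divided by R(t) < 1: the items that already failed before t are excluded from the population.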

The Failure Rate of a Distribution

h(t) Δt is the conditional probability that the unit will fail in the interval (t, t + Δt), given that it is functioning at time t.

f(t) Δt is the unconditional probability that the unit will fail in the interval (t, t + Δt).

Difference between the two sentences:

– probability that someone will die between 90 and 91, given that he lives to 90

– probability that someone will die between 90 and 91

Bathtub curve

[Figure: failure rate h(t) versus t, with three phases.]

DFR: decreasing failure rate (infant mortality – burn in)

CFR: constant failure rate (useful life)

IFR: increasing failure rate (wear-out phase)

Infant mortality (dfr)

Also called infant mortality phase or reliability growth phase. The failure rate decreases with time.

Caused by undetected hardware/software defects;

Can cause significant prediction errors if steady-state failure rates are used;

Weibull model can be used.

Useful life (cfr)

The failure rate remains constant in time (age independent) .

Failure rate much lower than in early-life period.

Failures caused by random effects (such as environmental shocks).

Wear-out phase (ifr)

The failure rate increases with age.

It is characteristic of irreversible aging phenomena (deterioration, wear-out, fatigue, corrosion, etc.).

Applicable for mechanical and other systems.

(Properly qualified electronic parts do not exhibit wear-out failures during their intended service life.)

Weibull Failure Model can be used

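Exponential Distribution

Failure rate is age-independent (constant).

Cumulative distribution function: $F(t) = 1 - e^{-\lambda t}, \quad t \ge 0$

Reliability: $R(t) = e^{-\lambda t}, \quad t \ge 0$

Density function: $f(t) = \lambda e^{-\lambda t}, \quad t \ge 0$

Failure rate (CFR): $h(t) = \frac{f(t)}{R(t)} = \lambda$

Mean time to failure: $\mathrm{MTTF} = \frac{1}{\lambda}$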
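As a minimal sketch, the exponential quantities can be written as plain functions, assuming the usual parametrization with a constant failure rate (called `lam` here). The numerical value of the rate is hypothetical.

```python
import math


def exp_cdf(t, lam):
    """F(t) = 1 - exp(-lam * t), for t >= 0."""
    return 1.0 - math.exp(-lam * t)


def exp_reliability(t, lam):
    """R(t) = exp(-lam * t), for t >= 0."""
    return math.exp(-lam * t)


def exp_density(t, lam):
    """f(t) = lam * exp(-lam * t), for t >= 0."""
    return lam * math.exp(-lam * t)


def exp_hazard(t, lam):
    """h(t) = f(t) / R(t); constant and equal to lam."""
    return exp_density(t, lam) / exp_reliability(t, lam)


def exp_mttf(lam):
    """MTTF = 1 / lam."""
    return 1.0 / lam


lam = 1e-3  # hypothetical failure rate, in failures per hour

for t in (0.0, 100.0, 1000.0, 5000.0):
    # The hazard does not depend on the age t.
    print(t, exp_reliability(t, lam), exp_hazard(t, lam))

# Reliability evaluated at t = MTTF equals exp(-1), about 0.368.
print(exp_reliability(exp_mttf(lam), lam))
```

The last line shows that an exponentially distributed item has only about a 37% chance of surviving up to its own MTTF.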

[Figure: the cumulative distribution function of an exponentially distributed random variable with parameter λ = 1, F(t) = 1 - e^{-t}, plotted for t from 0 to 5.00.]

[Figure: the reliability function of an exponentially distributed random variable with parameter λ = 1, R(t) = e^{-t}, plotted for t from 0 to 5.00.]

Exponential Density Function (pdf)

[Figure: plot of f(t); MTTF = 1/λ.]

Memoryless Property of the Exponential Distribution

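Assume X > t, i.e. we have observed that the component has not failed until time t.

Let Y = X - t be the remaining (residual) lifetime. Its distribution is:

$$\begin{aligned}
G_t(y) &= P(Y \le y \mid X > t) \\
&= P(X \le y + t \mid X > t) \\
&= \frac{P(t < X \le y + t)}{P(X > t)} \\
&= 1 - e^{-\lambda y}
\end{aligned}$$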

Memoryless Property of the Exponential Distribution (cont.)

Thus G_t(y) is independent of t and is identical to the original exponential distribution of X.

The distribution of the remaining life does not depend on how long the component has been operating.

An observed failure is the result of some suddenly appearing failure, not of gradual deterioration.
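The memoryless property can be verified directly from the reliability function: the probability of surviving a further time y, given survival up to age t, is R(t + y) / R(t). The sketch below assumes an exponential time to failure; the rate, the residual time and the ages are hypothetical values.

```python
import math


def reliability(t, lam):
    """R(t) = exp(-lam * t) for an exponential time to failure."""
    return math.exp(-lam * t)


def residual_reliability(y, age, lam):
    """P(X > age + y | X > age) = R(age + y) / R(age)."""
    return reliability(age + y, lam) / reliability(age, lam)


lam = 0.01  # hypothetical failure rate, failures per hour
y = 5.0     # hypothetical residual time of interest, hours

for age in (0.0, 10.0, 100.0, 1000.0):
    # The result is the same for every age and equals R(y).
    print(age, residual_reliability(y, age, lam), reliability(y, lam))
```

For a distribution with an increasing failure rate the same ratio would decrease with the age, which is the formal meaning of wear-out.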

Quiz 3: If two components (say, A and B) have independent, identical exponentially distributed times to failure, by the “memoryless” property, which of the following is true?

1. They will always fail at the same time

2. They have the same probability of failing at time ‘t’ during operation

3. When these two components are operating simultaneously, the component which has been operational for a shorter duration of time will survive longer

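Weibull Distribution

Distribution function: $F(t) = 1 - e^{-\lambda t^{\alpha}}, \quad t \ge 0$

Density function: $f(t) = \lambda \alpha t^{\alpha - 1} e^{-\lambda t^{\alpha}}, \quad t \ge 0$

Reliability: $R(t) = e^{-\lambda t^{\alpha}}, \quad t \ge 0$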

Weibull Distribution

α: shape parameter;

λ: scale parameter.

Failure rate: $h(t) = \frac{f(t)}{R(t)} = \lambda \alpha t^{\alpha - 1}$

0 < α < 1: Dfr

α = 1: Cfr

α > 1: Ifr
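The three regimes of the Weibull failure rate can be illustrated with a short sketch. It assumes the parametrization h(t) = λ α t^(α - 1), with α the shape parameter and λ the scale parameter; the symbol names, the sample times and the helper `trend` are assumptions made here for illustration.

```python
def weibull_hazard(t, lam, alpha):
    """Failure rate h(t) = lam * alpha * t**(alpha - 1), for t > 0."""
    if t <= 0.0:
        raise ValueError("t must be positive")
    return lam * alpha * t ** (alpha - 1.0)


def trend(alpha, lam=1.0, times=(0.5, 1.0, 2.0, 4.0)):
    """Classify the failure rate as 'DFR', 'CFR' or 'IFR' by sampling h(t).

    The Weibull failure rate is monotone, so sampling a few increasing
    time points is enough to tell the three cases apart.
    """
    values = [weibull_hazard(t, lam, alpha) for t in times]
    if all(later < earlier for earlier, later in zip(values, values[1:])):
        return "DFR"
    if all(later > earlier for earlier, later in zip(values, values[1:])):
        return "IFR"
    return "CFR"


for shape in (0.5, 1.0, 2.0):  # hypothetical shape values
    print(shape, trend(shape))
```

With a shape parameter equal to 1 the failure rate reduces to the constant λ, i.e. the exponential case.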

[Figure: failure rate of the Weibull distribution for various values of the shape parameter α.]

[Figure: Weibull distribution for various values of the shape parameter α; two panels, cdf and density.]

Failure Rate Models

We use a truncated Weibull model.

Infant mortality phase modeled by DFR Weibull and the steady-state phase by the exponential.

[Figure 2.34: Weibull failure-rate model. Failure-rate multiplier (0 to 7) versus operating time in hours (0 to 17,520).]

Failure Rate Models (cont.)

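This model has the form:

$$\lambda_W(t) = \begin{cases} C_1\, t^{-\alpha}, & 1 \le t < 8{,}760 \\ \lambda_{SS}, & t \ge 8{,}760 \end{cases}$$

where:

$C_1 = \lambda_W(1)$, $\lambda_{SS}$ = steady-state failure rate

$\alpha$ is the Weibull shape parameter

Failure rate multiplier = $\lambda_W(t) / \lambda_{SS}$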
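A minimal sketch of the truncated model follows. It assumes that the failure rate has the form C_1 · t^(-α) before 8,760 operating hours and equals the steady-state rate afterwards; the parameter values are hypothetical, and the choice of C_1 that makes the curve continuous at 8,760 hours is an additional assumption made here, not something stated in the slides.

```python
HOURS_PER_YEAR = 8760.0


def truncated_weibull_rate(t, c1, alpha, lam_ss):
    """Failure rate of the assumed truncated model at operating time t (hours)."""
    if t < 1.0:
        raise ValueError("the assumed model is only defined for t >= 1 hour")
    if t < HOURS_PER_YEAR:
        return c1 * t ** (-alpha)  # decreasing (infant mortality) phase
    return lam_ss                  # constant (steady-state) phase


def failure_rate_multiplier(t, c1, alpha, lam_ss):
    """Ratio between the failure rate at time t and the steady-state rate."""
    return truncated_weibull_rate(t, c1, alpha, lam_ss) / lam_ss


# Hypothetical parameter values.
lam_ss = 1e-5   # steady-state failure rate, failures per hour
alpha = 0.2     # exponent of the decreasing phase

# Assumption: pick C_1 so that the two phases join at 8,760 hours.
c1 = lam_ss * HOURS_PER_YEAR ** alpha

for hours in (1.0, 2190.0, 4380.0, 8760.0, 17520.0):
    print(hours, failure_rate_multiplier(hours, c1, alpha, lam_ss))
```

Under these assumptions the multiplier starts at 8760^α for t = 1 hour and decays to 1 at the end of the first year of operation.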

Failure Rate Models (cont.)

There are several ways to incorporate time dependent failure rates in availability models

The easiest way is to approximate a continuous function by a piecewise constant step function

[Figure: Discrete Failure-Rate Model. Failure-rate multiplier (0 to 7) versus operating time in hours (0 to 17,520); a step function with levels λ_1, λ_2 and λ_SS, the last one starting at 8,760 hours.]

Failure Rate Models (cont.)

Here the discrete failure-rate model is defined by:

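$$\lambda_W(t) = \begin{cases} \lambda_1, & 0 \le t < 4{,}380 \\ \lambda_2, & 4{,}380 \le t < 8{,}760 \\ \lambda_{SS}, & t \ge 8{,}760 \end{cases}$$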
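A step failure rate of this kind is easy to code. The sketch below assumes breakpoints at 4,380 and 8,760 hours and three hypothetical rate values; it also evaluates the reliability through the standard relation R(t) = exp(-∫ h(x) dx), which is not written on the slide but follows from the definition of the failure rate.

```python
import bisect
import math

# Hypothetical values, chosen only for illustration (failures per hour).
LAMBDA_1, LAMBDA_2, LAMBDA_SS = 3e-4, 2e-4, 1e-4
BREAKPOINTS = [4380.0, 8760.0]           # hours
RATES = [LAMBDA_1, LAMBDA_2, LAMBDA_SS]  # one rate per interval


def discrete_failure_rate(t):
    """Piecewise-constant failure rate at operating time t >= 0 (hours)."""
    return RATES[bisect.bisect_right(BREAKPOINTS, t)]


def cumulative_hazard(t):
    """Integral of the step failure rate over [0, t]."""
    total, start = 0.0, 0.0
    for end, rate in zip(BREAKPOINTS + [math.inf], RATES):
        if t <= start:
            break
        total += rate * (min(t, end) - start)
        start = end
    return total


def reliability(t):
    """R(t) = exp(-cumulative hazard up to t)."""
    return math.exp(-cumulative_hazard(t))


for hours in (0.0, 4380.0, 5000.0, 8760.0, 10000.0):
    print(hours, discrete_failure_rate(hours), reliability(hours))
```

Within each interval the model behaves like an exponential distribution, which is what makes the step approximation convenient inside Markov availability models.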

A lifetime experiment

N i.i.d. components are put in a life test experiment.

[Figure: components 1, 2, 3, 4, …, N all start operating at t = 0 and fail at the times X_1, X_2, X_3, X_4, …, X_N.]

A lifetime experiment

[Figure: the components 1, 2, 3, 4, …, N with their times to failure X_1, X_2, X_3, X_4, …, X_N.]
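One natural use of such an experiment is to estimate the reliability at time t as the fraction of the N components still working at t. This estimator is not stated on the slides; the sketch below simulates the experiment under the assumption of exponentially distributed times to failure, with a hypothetical rate, sample size and random seed.

```python
import math
import random


def empirical_reliability(failure_times, t):
    """Fraction of tested components still working at time t."""
    survivors = sum(1 for x in failure_times if x > t)
    return survivors / len(failure_times)


# Simulated life test (all values hypothetical).
random.seed(2003)   # fixed seed so that the run is reproducible
N = 10_000          # number of components on test
RATE = 0.01         # assumed failure rate, failures per hour

failure_times = [random.expovariate(RATE) for _ in range(N)]

t = 100.0
estimate = empirical_reliability(failure_times, t)
exact = math.exp(-RATE * t)  # true R(t) of the simulated distribution

print(f"estimated R({t}) = {estimate:.4f}, exact = {exact:.4f}")
```

With real data the list `failure_times` would hold the observed values X_1, …, X_N instead of simulated ones.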

Repairable systems

Availability

Repairable systems

X_1, X_2, …, X_n: successive UP times

Y_1, Y_2, …, Y_n: successive DOWN times

[Figure: system state (UP/DOWN) versus time t, alternating between the UP periods X_1, X_2, X_3, … and the DOWN periods Y_1, Y_2, ….]

Repairable systems

The usual hypothesis in modeling repairable systems is that:

The successive UP times X_1, X_2, …, X_n are i.i.d. random variables, i.e. samples from a common cdf F(t).

The successive DOWN times Y_1, Y_2, …, Y_n are i.i.d. random variables, i.e. samples from a common cdf G(t).

Repairable systems

The dynamic behaviour of a repairable system is characterized by:

the r.v. X of the successive up times

the r.v. Y of the successive down times

[Figure: system state (UP/DOWN) versus time t, alternating between the UP periods X_1, X_2, X_3, … and the DOWN periods Y_1, Y_2, ….]

Maintainability

Let Y be the r.v. of the successive down times:

G(t) = Pr { Y ≤ t } (maintainability)

$g(t) = \frac{dG(t)}{dt}$ (density)

$h_g(t) = \frac{g(t)}{1 - G(t)}$ (repair rate)

$\mathrm{MTTR} = \int_0^\infty t\,g(t)\,dt$ (Mean Time To Repair)

Availability

The availability A(t) of an item at time t is the probability that the item is correctly working at time t.

The measure to characterize a repairable system is the availability (unavailability):

Availability

The measure to characterize a repairable system is the availability (unavailability):

A(t) = Pr { time t, system = UP }

U(t) = Pr { time t, system = DOWN }

A(t) + U(t) = 1

Definition of Availability

An important difference between reliability and availability is:

reliability refers to failure-free operation during an interval (0, t);

availability refers to failure-free operation at a given instant of time t (the time when a device or system is accessed to provide a required function), independently of the number of failure/repair cycles.

Definition of Availability

System Failure and Restoration Process

[Figure: indicator function I(t) versus t. I(t) = 1 while the system is operating and providing a required function (working); I(t) = 0 while it is failed and being restored (failed).]

Availability evaluation

In the special case when times to failure and times to restoration are both exponentially distributed, the alternating process can be viewed as a two-state homogeneous Continuous Time Markov Chain.

Time-independent failure rate λ

Time-independent repair rate μ

2-State Markov Availability Model

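[State diagram: two states, UP (1) and DN (0). The transition from UP to DN occurs at rate λ = 1/MTTF; the transition from DN to UP occurs at rate μ = 1/MTTR.]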

Transient Availability analysis:

for each state, we apply a flow balance equation:

– Rate of buildup = rate of flow IN - rate of flow OUT

2-State Markov Availability Model

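[State diagram: two states, UP (1) and DN (0). The transition from UP to DN occurs at rate λ = 1/MTTF; the transition from DN to UP occurs at rate μ = 1/MTTR.]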

2-State Markov Availability Model

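[Figure: A(t) versus t, starting at 1 and decreasing towards the steady-state value A_ss.]

With λ = 1/MTTF and μ = 1/MTTR:

$$A_{SS} = \frac{\mu}{\lambda + \mu} = \frac{\mathrm{MTTF}}{\mathrm{MTTF} + \mathrm{MTTR}}$$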

2-State Markov Model

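1) Pointwise availability A(t), with failure rate λ and repair rate μ:

$$A(t) = \frac{\mu}{\lambda + \mu} + \frac{\lambda}{\lambda + \mu}\, e^{-(\lambda + \mu) t}$$

2) Steady state availability: limiting value as t → ∞

$$A_{ss} = \frac{\mu}{\lambda + \mu}$$

3) If there is no restoration (μ = 0) the availability becomes the reliability: A(t) = R(t) = $e^{-\lambda t}$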
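The closed-form availability of the two-state model can be cross-checked against a direct integration of the flow balance equation dP_UP/dt = μ·P_DN - λ·P_UP, starting from P_UP(0) = 1. The sketch below is only an illustration: the rates are hypothetical, and the forward Euler scheme is a deliberately simple choice of integrator, not the method used in the slides.

```python
import math


def transient_availability(t, lam, mu):
    """Closed form A(t) = mu/(lam+mu) + lam/(lam+mu) * exp(-(lam+mu) t)."""
    total = lam + mu
    return mu / total + (lam / total) * math.exp(-total * t)


def steady_state_availability(lam, mu):
    """Limiting value of A(t) as t grows: mu / (lam + mu)."""
    return mu / (lam + mu)


def euler_availability(t_end, lam, mu, steps=100_000):
    """Integrate dP_up/dt = mu * P_down - lam * P_up with forward Euler."""
    dt = t_end / steps
    p_up = 1.0  # the system is assumed to start in the UP state
    for _ in range(steps):
        p_down = 1.0 - p_up
        p_up += dt * (mu * p_down - lam * p_up)
    return p_up


lam = 0.001  # hypothetical failure rate, per hour (MTTF = 1000 h)
mu = 0.1     # hypothetical repair rate, per hour (MTTR = 10 h)

print(transient_availability(50.0, lam, mu))
print(euler_availability(50.0, lam, mu))
print(steady_state_availability(lam, mu))

# With no repair (mu = 0) the availability reduces to the reliability.
print(transient_availability(50.0, lam, 0.0), math.exp(-lam * 50.0))
```

Because the transient term decays at the rate λ + μ, the availability settles at its steady-state value after a few multiples of 1/(λ + μ), which is close to the MTTR when repairs are much faster than failures.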

Steady-state Availability

In many system models, the limit

$$A_{ss} = \lim_{t \to \infty} A(t)$$

exists and is called the steady-state availability.

The steady-state availability represents the probability of finding a system operational after many fail-and-restore cycles.

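Steady-state Availability

[Figure: system state versus t, alternating between 1 (UP) and 0 (DOWN).]

Expected UP time E[U(t)] = MUT = MTTF

Expected DOWN time E[D(t)] = MDT = MTTR

$$A_{SS} = \frac{\mathrm{MUT}}{\mathrm{MUT} + \mathrm{MDT}} = \frac{\mathrm{MTTF}}{\mathrm{MTTF} + \mathrm{MTTR}}$$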
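As a small worked example, the steady-state availability can be computed from the two mean times. The numerical values below are hypothetical.

```python
def steady_state_availability(mttf, mttr):
    """A_ss = MTTF / (MTTF + MTTR); both times in the same unit."""
    if mttf <= 0.0 or mttr < 0.0:
        raise ValueError("MTTF must be positive and MTTR non-negative")
    return mttf / (mttf + mttr)


# Hypothetical item: fails on average every 1000 h, restored in 10 h.
a_ss = steady_state_availability(mttf=1000.0, mttr=10.0)
u_ss = 1.0 - a_ss  # steady-state unavailability

print(f"A_ss = {a_ss:.6f}, U_ss = {u_ss:.6f}")
```

Only the ratio between MTTF and MTTR matters: an item with MTTF = 100 h and MTTR = 1 h has the same steady-state availability, although it fails ten times as often.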

Availability: Example (I)

Let a system have a steady state availability

Ass = 0.95

This means that, given a mission time T, it is expected that the system works correctly for a total time of: 0.95*T.

Or, alternatively, it is expected that the system is out of service for a total time:

Uss * T = (1- Ass) * T

Availability: Example (II)

Let a system have a rated productivity of W $/year.

The loss due to the system being out of service can be estimated as:

Uss * W = (1 - Ass) * W

The availability (unavailability) is an index to estimate the real productivity, given the rated productivity.

Alternatively, if the goal is to have a net productivity of W $/year, the plant must be designed such that its rated productivity W’ satisfies:

Ass * W’ = W
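The two productivity calculations can be sketched in a few lines. The figures are hypothetical; the second function assumes that the net productivity equals the steady-state availability times the rated productivity.

```python
def expected_loss(rated_productivity, a_ss):
    """Yearly loss due to downtime: U_ss * W = (1 - A_ss) * W."""
    return (1.0 - a_ss) * rated_productivity


def required_rated_productivity(net_productivity, a_ss):
    """Rated productivity W' needed so that A_ss * W' equals the net target."""
    if not 0.0 < a_ss <= 1.0:
        raise ValueError("A_ss must be in (0, 1]")
    return net_productivity / a_ss


A_SS = 0.95        # steady-state availability, as in Example (I)
W = 1_000_000.0    # hypothetical productivity, $/year

loss = expected_loss(W, A_SS)
w_prime = required_rated_productivity(W, A_SS)

print(f"expected loss: {loss:.2f} $/year")
print(f"rated productivity needed for a net {W:.0f} $/year: {w_prime:.2f} $/year")
```

The required rated productivity exceeds the net target by the factor 1/A_ss, so the oversizing grows quickly as the availability drops.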

Availability

We can show that:

$$A_{SS} = \frac{\mathrm{MTTF}}{\mathrm{MTTF} + \mathrm{MTTR}}$$

This result is valid without making any assumptions on the form of the distributions of times to failure and times to repair.

Also:

downtime (in minutes per year) = (1 - A_ss) * 8760 * 60

Availability, A          0.99     0.999    0.9999   0.99999   0.999999
Unavailability, U        0.01     0.001    0.0001   0.00001   0.000001
Downtime in min./year    5,256    525.6    52.56    5.256     0.5256
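The downtime figures in the table follow directly from the formula, assuming a year of 8,760 hours. A minimal sketch that reproduces them:

```python
HOURS_PER_YEAR = 8760
MINUTES_PER_HOUR = 60


def downtime_minutes_per_year(a_ss):
    """Expected downtime in minutes per year: (1 - A_ss) * 8760 * 60."""
    return (1.0 - a_ss) * HOURS_PER_YEAR * MINUTES_PER_HOUR


for availability in (0.99, 0.999, 0.9999, 0.99999, 0.999999):
    unavailability = 1.0 - availability
    minutes = downtime_minutes_per_year(availability)
    print(f"A = {availability:<9} U = {unavailability:.6f}  downtime = {minutes:.4f} min/year")
```

Each additional "nine" of availability divides the yearly downtime by ten.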

Motivation – High Availability

Maintainability

MDT (Mean Down Time, or MTTR: mean time to restoration):

$$\mathrm{MDT} = E[Y] = \int_0^\infty t\,g(t)\,dt$$

The total down time (Y) consists of:

• Failure detection time
• Alarm notification time
• Dispatch and travel time of the repair person(s)
• Repair or replacement time
• Reboot time

Maintainability

The total down time (Y) consists of:

• Logistic time
  Administrative times
  Dispatch and travel time of the repair person(s)
  Waiting time for spares, tools …

• Effective restoration time
  Access and diagnosis time
  Repair or replacement time
  Test and reboot time

Maintenance Costs

The total cost of a maintenance action consists of:

Cost of spares and replaced parts

Cost of person/hours for repair

Down-time cost (loss of productivity)

The down-time cost (due to a loss of productivity) can be the most relevant cost factor.

Maintenance Policy

A maintenance policy is the sequence of actions that minimizes the total cost related to a down time:

Reactive maintenance: the maintenance action is triggered by a failure.

Proactive maintenance: preventive maintenance policy.

