
On Model Validation Techniques

Alex Karagrigoriou, University of Cyprus

"Quality - Theory and Practice",

ORT Braude College of Engineering, Karmiel, May 2012


OUTLINE

• Introduction

• Graphical Methods

• Likelihood Method

• Kolmogorov Test

• Chi-Squared Tests

• Tests based on Measures


After fitting a distribution model to a data set when performing life data analysis, we are often interested in diagnosing the model's fit or comparing the fit of different distributions.

In addition to the engineering knowledge that should always govern the choice of a distribution model, there are many statistical tools that can help in deciding whether or not a distribution model is a good choice from a statistical point of view.

These tools can also be used to compare the fit of different distributions.


Reliability Terms

• Mean Time To Failure (MTTF) for non-repairable systems

• Mean Time Between Failures (MTBF) for repairable systems

• Reliability Probability (survival) R(t)

• Failure Probability (cumulative distribution function) F(t) = 1 - R(t)

• Failure Probability Density f(t)

• Failure Rate (hazard rate) λ(t)

• Mean Residual Life (MRL)
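
The quantities listed above are tied together by a few standard identities. The block below is a summary sketch, assuming a continuous, non-negative lifetime T with density f and a finite mean; the symbol m(t) for the mean residual life is an assumed notation that does not appear in the slides.

```latex
\begin{align*}
R(t) &= P(T > t) = 1 - F(t), & f(t) &= \frac{dF(t)}{dt}, \\
\lambda(t) &= \frac{f(t)}{R(t)}, & \mathrm{MTTF} &= E(T) = \int_0^\infty R(t)\,dt, \\
m(t) &= E(T - t \mid T > t) = \frac{1}{R(t)} \int_t^\infty R(u)\,du.
\end{align*}
```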


Time Distributions (Models) of the Failure Density

• Exponential Distribution

Very commonly used, even in cases to which it does not apply (simple);

Applications: Electronics, mechanical components etc.

$$f(t) = \lambda e^{-\lambda t}$$

• Normal Distribution

Very straightforward and widely used;

Applications: Electronics, mechanical components etc.

• Lognormal Distribution

Very powerful and can be applied to describe various failure processes;

Applications: Electronics, material, structure etc.

$$f(t) = \frac{1}{\sigma t \sqrt{2\pi}} \exp\left(-\frac{(\ln t - \mu)^2}{2\sigma^2}\right)$$

• Weibull Distribution

Very powerful and can be applied to describe various failure processes;

Applications: Electronics, mechanical components, material, structure etc.

$$f(t) = \frac{\beta}{\eta} \left(\frac{t}{\eta}\right)^{\beta - 1} e^{-(t/\eta)^{\beta}}$$


Probability Plots – Graphical Validation

Probability plotting (e.g. Q-Q plot) is a graphical method that allows a visual assessment of the model fit.

Once the model parameters have been estimated, the probability plot can be created.

The next figure compares the probability plots of the two candidate models (Weibull and Exponential) fitted to the data set.
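
As an illustration of how such a plot is built, the sketch below computes Weibull probability-plot coordinates and fits a straight line through them. It relies on the linearization ln(-ln(1 - F(t))) = β ln t - β ln η of the Weibull distribution. The median-rank formula (Benard's approximation), the least-squares fit, the function names, and the sample failure times are all assumptions made for this example; they are not taken from the slides.

```python
import math

def weibull_plot_points(times):
    """Return (x, y) plotting positions for a Weibull probability plot."""
    ordered = sorted(times)
    n = len(ordered)
    points = []
    for i, t in enumerate(ordered, start=1):
        p = (i - 0.3) / (n + 0.4)  # Benard's median-rank approximation (assumed convention)
        points.append((math.log(t), math.log(-math.log(1.0 - p))))
    return points

def fit_line(points):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical failure times (NOT the data set behind the slides).
times = [42.0, 61.0, 78.0, 90.0, 103.0, 117.0, 134.0, 160.0]
beta_hat, intercept = fit_line(weibull_plot_points(times))
eta_hat = math.exp(-intercept / beta_hat)  # intercept = -beta * ln(eta)
print(f"beta ~ {beta_hat:.2f}, eta ~ {eta_hat:.1f}")
```

If the points fall close to the fitted line, the Weibull model is visually plausible; strong curvature suggests trying another distribution.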


Problems typical with reliability & survival data

• Censoring: when the observation period ends, not all units have failed (some are survivors).

• Lack of failures: if there is too much censoring, then even though a large number of units may be under observation, the information in the data is limited because too few failures are observed.

Both problems create practical difficulty when planning reliability assessment tests and analyzing failure data.


Type I Censoring – Right Censoring

n items are observed during a fixed time period [0, T]. The number of failures r is random. n-r items (also random) will be in operation (censored) at the end of the time period.

Also called "right censoring" since the failure times to the right (i.e., larger than T) are missing.


Type II Censoring

We run the test until we observe exactly r failures. The time period T is random. n-r units are in operation (nonrandom).

In Type II censoring we know in advance how many failure times we will have, which helps when planning adequate tests.

However, an open-ended random test time is generally impractical from a management point of view and this type of testing is rarely seen.
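
The two censoring schemes can be sketched in a few lines. The helper names, the `(time, failed)` record layout, and the simulated exponential lifetimes below are assumptions made for illustration only.

```python
import random

def type1_censor(lifetimes, T):
    """Type I: observe over [0, T]; units still running at T are censored at T."""
    return [(min(t, T), t <= T) for t in lifetimes]

def type2_censor(lifetimes, r):
    """Type II: stop at the r-th failure; the other n - r units are censored there."""
    ordered = sorted(lifetimes)
    stop = ordered[r - 1]  # random stopping time
    return [(t, True) for t in ordered[:r]] + [(stop, False)] * (len(ordered) - r)

# Hypothetical lifetimes drawn from an exponential model with an assumed rate.
rng = random.Random(0)
lifetimes = [rng.expovariate(0.01) for _ in range(10)]

fixed_time = type1_censor(lifetimes, T=100.0)   # number of failures is random
fixed_count = type2_censor(lifetimes, r=4)      # test duration is random
print(sum(failed for _, failed in fixed_time), "failures under Type I")
print(sum(failed for _, failed in fixed_count), "failures under Type II")
```

Each record is a pair `(time, failed)`, where `failed` is `False` for a censored unit.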


Readout or Interval Censored Data

Sometimes exact times of failure are not known; only an interval of time in which the failure occurred is recorded.


Likelihood Value

Use the MLE (Maximum Likelihood Estimation) method to estimate the parameters. Then, the likelihood value can be used to assess the fit:

The distribution with the largest L value is the best fit.
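
A minimal sketch of the likelihood comparison follows, assuming right-censored data stored as `(time, failed)` pairs: a failure contributes ln f(t) to the log-likelihood and a censored unit contributes ln R(t). The function names and the record layout are illustrative assumptions; only the closed-form exponential estimate is included, and the Weibull parameters would have to come from a numerical optimizer that is not shown here.

```python
import math

def exp_mle(data):
    """Closed-form MLE of the exponential rate: failures / total time on test."""
    failures = sum(1 for _, failed in data if failed)
    return failures / sum(t for t, _ in data)

def exp_loglik(data, lam):
    """Exponential log-likelihood: ln f(t) = ln(lam) - lam*t, ln R(t) = -lam*t."""
    return sum((math.log(lam) if failed else 0.0) - lam * t for t, failed in data)

def weibull_loglik(data, beta, eta):
    """Weibull log-likelihood with shape beta and scale eta."""
    total = 0.0
    for t, failed in data:
        if failed:  # density contribution
            total += math.log(beta / eta) + (beta - 1) * math.log(t / eta)
        total -= (t / eta) ** beta  # survival part, shared by failures and censored units
    return total

# Hypothetical records (NOT the slides' data set): (time, failed)
data = [(30.0, True), (55.0, True), (80.0, True), (100.0, False), (100.0, False)]
lam_hat = exp_mle(data)
print("exponential log-likelihood:", exp_loglik(data, lam_hat))
```

With β = 1 and η = 1/λ the Weibull log-likelihood coincides with the exponential one, which gives a quick sanity check on both functions.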


Table: Comparing the fit of two distributions to the data set using the log-likelihood value.

Distribution Model      Weibull                 Exponential
Parameters              β = 3.03, η = 100.99    λ = 0.0111
Log-Likelihood Value    -48.42                  -55.04

The log-likelihood value for the Weibull distribution is greater than that for the exponential distribution (i.e. the Weibull distribution is statistically a better fit).


Modified Kolmogorov-Smirnov (KS) Test

The standard KS test is used for continuous distributions with known parameters. The Modified KS test is used when the parameters are unknown and need to be estimated.

For N failure times $t_1, \ldots, t_N$, we define $S_N(t)$ to be the empirical distribution function. The Modified KS test uses the maximum of the absolute difference between $S_N(t)$ and the fitted cumulative distribution function, Q(t):

$$D_{max} = \max_{t} \left| S_N(t) - Q(t) \right|$$
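
The statistic itself is easy to compute once a fitted CDF is available. The sketch below evaluates the largest gap between the empirical distribution function and a supplied CDF on complete (uncensored) data. The function name and the example data are assumptions, and the probability calculation for the Modified test (which must account for estimated parameters) is deliberately left out.

```python
import math

def ks_statistic(times, cdf):
    """Largest absolute gap between the empirical CDF and a fitted CDF."""
    ordered = sorted(times)
    n = len(ordered)
    d_max = 0.0
    for i, t in enumerate(ordered, start=1):
        q = cdf(t)
        # The empirical CDF jumps from (i - 1)/n to i/n at t; check both sides.
        d_max = max(d_max, i / n - q, q - (i - 1) / n)
    return d_max

def weibull_cdf(beta, eta):
    """Return the Weibull CDF with shape beta and scale eta as a function of t."""
    return lambda t: 1.0 - math.exp(-((t / eta) ** beta))

# Hypothetical failure times; the parameters are the slides' fitted Weibull values.
times = [42.0, 61.0, 78.0, 90.0, 103.0, 117.0, 134.0, 160.0]
print("D_max =", ks_statistic(times, weibull_cdf(3.03, 100.99)))
```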


The distribution of the Modified KS test statistic under the null hypothesis (i.e. the data set is drawn from the fitted distribution) can be calculated.

The test returns the probability that $D_{CRIT} < D_{max}$. A high probability value, close to 1, indicates that there is a significant difference between the theoretical distribution and the data set.

Table: Comparing two distributions using the Modified KS test

Distribution Model      Weibull                 Exponential
Parameters              β = 3.03, η = 100.99    λ = 0.0111
P(D_CRIT < D_max)       14.84%                  89.58%

The value for the Weibull distribution is smaller; thus the Weibull distribution is statistically a better fit.


Chi-Squared Test

The chi-squared test relies on the idea of grouping the data into a suitable number of intervals. Grouping involves a loss of information, and there is often considerable arbitrariness in how the intervals are chosen. The optimal number k of intervals for a sample of size N may be estimated from Sturges' Rule:

$$k = 1 + \log_2 N$$

Let $N_i$ be the number of data points in the i-th interval and $n_i$ the expected number according to the fitted distribution. The chi-squared statistic is

$$\chi^2 = \sum_{i=1}^{k} \frac{(N_i - n_i)^2}{n_i}$$
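
The grouping and the statistic can be sketched as follows. Rounding Sturges' Rule up to a whole number, the half-open interval convention, the function names, and the example data are assumptions made for this illustration.

```python
import math

def sturges_bins(n):
    """Sturges' Rule k = 1 + log2(N), rounded up to a whole number of intervals."""
    return 1 + math.ceil(math.log2(n))

def observed_counts(data, edges):
    """Count data points in each half-open interval [edges[j], edges[j + 1])."""
    counts = [0] * (len(edges) - 1)
    for x in data:
        for j in range(len(counts)):
            if edges[j] <= x < edges[j + 1]:
                counts[j] += 1
                break
    return counts

def expected_counts(n, edges, cdf):
    """Expected number of points per interval under the fitted CDF."""
    return [n * (cdf(b) - cdf(a)) for a, b in zip(edges, edges[1:])]

def chi_squared(observed, expected):
    """Sum over intervals of (N_i - n_i)^2 / n_i."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data checked against the slides' fitted exponential rate.
data = [12.0, 35.0, 48.0, 60.0, 77.0, 95.0, 130.0, 210.0]
edges = [0.0, 50.0, 100.0, math.inf]
exp_cdf = lambda t: 1.0 - math.exp(-0.0111 * t)
stat = chi_squared(observed_counts(data, edges), expected_counts(len(data), edges, exp_cdf))
print("chi-squared statistic:", stat)
```

The intervals here were picked by hand to keep the example short; in practice the number of intervals could come from `sturges_bins(len(data))`.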


A high probability value, close to 1, indicates that there is a significant difference between the theoretical distribution and the data set.

Table: Comparing two distributions using the chi-squared test

Distribution Model      Weibull                 Exponential
Parameters              β = 3.03, η = 100.99    λ = 0.0111
P(χ²_CRIT < χ²)         26.50%                  66.76%

The value for the Weibull distribution is smaller (i.e. the Weibull distribution is statistically a better fit).


Empirical model fitting – Distribution Free (Kaplan-Meier) approach

• No underlying model (Weibull, lognormal etc.) is assumed

• K-M estimation is an empirical (non-parametric) procedure

• Exact times of failure are required
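
A minimal sketch of the Kaplan-Meier product-limit estimate is given below, assuming right-censored records stored as `(time, failed)` pairs with exact failure times. The function name, record layout, and example data are illustrative assumptions.

```python
def kaplan_meier(data):
    """Return [(t, S(t))] at each distinct failure time t.

    data: list of (time, failed) pairs; failed is False for censored units.
    """
    failure_times = sorted({t for t, failed in data if failed})
    survival = 1.0
    curve = []
    for t in failure_times:
        at_risk = sum(1 for u, _ in data if u >= t)          # still under observation just before t
        deaths = sum(1 for u, failed in data if failed and u == t)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical records: three failures and one unit censored at time 2.
records = [(1, True), (2, False), (3, True), (4, True)]
for t, s in kaplan_meier(records):
    print(t, s)
```

The censored unit never counts as a failure, but it does reduce the number at risk for all later failure times.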


Methods based on Measures

Kullback-Leibler:

Matusita:

Kagan:

Csiszar:

where φ is a convex function in [0, ∞) satisfying certain conditions

Hellinger:

Cressie and Read:

Observe that Csiszar's measure reduces to the Kullback-Leibler divergence if $\varphi(u) = u \log u$. If $\varphi(u) = \frac{1}{2}(1 - u)^2$ or $\varphi(u) = (1 - \sqrt{u})^2$, Csiszar's measure yields Kagan's and Matusita's divergence, respectively.


The BHHJ Power Divergence [Basu et al. (1998)]

The BHHJ family reduces to the Kullback-Leibler divergence for α↓0 and to the square of the L2 distance for α = 1.


Discrete cases: Distance between 2 binomial/multinomial

$$P: (p_1\ \text{success},\ p_2\ \text{failure}), \qquad Q: (q_1\ \text{success},\ q_2\ \text{failure})$$

$$P: (p_1, p_2, \ldots, p_m), \qquad Q: (q_1, q_2, \ldots, q_m)$$

$$\text{Csiszar} = \sum_{j=1}^{m} q_j\, \varphi(p_j / q_j), \qquad KL = \sum_{j=1}^{m} p_j \ln(p_j / q_j)$$

$$\text{BHHJ}: \quad d_a = \sum_{j=1}^{m} \left[ q_j^{1+a} - \left(1 + \frac{1}{a}\right) p_j q_j^{a} + \frac{1}{a}\, p_j^{1+a} \right] \qquad (1.5.5)$$
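
The discrete measures can be checked numerically. The sketch below assumes the forms Csiszar = Σ q_j φ(p_j/q_j), KL = Σ p_j ln(p_j/q_j) and d_a = Σ [q_j^(1+a) - (1 + 1/a) p_j q_j^a + p_j^(1+a)/a], with all probabilities strictly positive. The function names are illustrative, and the two trinomial probability vectors are the ones that appear in the simulation study at the end of the talk.

```python
import math

def csiszar(p, q, phi):
    """Csiszar's measure: sum of q_j * phi(p_j / q_j)."""
    return sum(qj * phi(pj / qj) for pj, qj in zip(p, q))

def kl(p, q):
    """Kullback-Leibler divergence: sum of p_j * ln(p_j / q_j)."""
    return sum(pj * math.log(pj / qj) for pj, qj in zip(p, q))

def bhhj(p, q, a):
    """BHHJ power divergence d_a for index a > 0."""
    return sum(qj ** (1 + a) - (1 + 1 / a) * pj * qj ** a + pj ** (1 + a) / a
               for pj, qj in zip(p, q))

def phi_kl(u):
    return u * math.log(u)

def phi_kagan(u):
    return 0.5 * (1 - u) ** 2

def phi_matusita(u):
    return (1 - math.sqrt(u)) ** 2

p = [0.2, 0.6, 0.2]
q = [0.2, 0.7, 0.1]
print("KL       :", kl(p, q), "via Csiszar:", csiszar(p, q, phi_kl))
print("Kagan    :", csiszar(p, q, phi_kagan))
print("Matusita :", csiszar(p, q, phi_matusita))
print("BHHJ a=1 :", bhhj(p, q, 1.0), "squared L2:", sum((x - y) ** 2 for x, y in zip(p, q)))
```

Evaluating `bhhj` at a very small index, for example `a = 1e-4`, gives a value close to `kl(p, q)`, in line with the limiting behaviour of the BHHJ family.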


The AIC Model Selection Criterion

For the construction of AIC, Akaike used the K-L measure

$$I^{KL} = \int g \ln g - \int g \ln \hat{f} = E_g(\ln g) - E_g(\ln \hat{f})$$

Akaike proposed evaluating the 2nd term (the expected LogLik) through minus twice the mean expected LogLik

$$-2 E_g\left(E_g(\ln \hat{f})\right) = -2 \int \cdots \int E_g(\ln \hat{f}) \prod_{i=1}^{n} g(x_i)\, dx_1 \cdots dx_n$$

Finally, he provided an asymptotically unbiased estimator of the expected LogLik term:

$$AIC = -\frac{2}{n} \sum_{i=1}^{n} \ln(\hat{f}(x_i)) + \frac{2p}{n}$$


$$AIC(p) = -2 \ln(\text{Likelihood}) + 2p$$

where p is the number of unknown parameters involved in the model/distribution.

In our case:

Weibull model: AIC = 2 × 48.42 + 4 = 100.84

Exponential: AIC = 2 × 55.04 + 2 = 112.08

The Weibull fit is better, since its AIC value is smaller.


Other Model Selection Methods

$$BIC(p) = -2 \ln(\text{Likelihood}) + p \ln(n)$$

$$HQ(p) = -2 \ln(\text{Likelihood}) + c\, p \ln(\ln(n)), \qquad c > 2$$

where p is the number of unknown parameters involved in the model/distribution and n is the sample size.
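
The three criteria differ only in their penalty term, which the sketch below makes explicit. It assumes the forms AIC = -2 ln L + 2p, BIC = -2 ln L + p ln n and HQ = -2 ln L + c p ln(ln n). The default `c = 2.01` and the sample size `n = 10` are hypothetical choices, because the slides report neither a value of c nor the size of the data set.

```python
import math

def aic(loglik, p):
    """Akaike criterion: -2 ln L + 2p."""
    return -2.0 * loglik + 2 * p

def bic(loglik, p, n):
    """Bayesian (Schwarz) criterion: -2 ln L + p ln n."""
    return -2.0 * loglik + p * math.log(n)

def hq(loglik, p, n, c=2.01):
    """Hannan-Quinn criterion: -2 ln L + c p ln(ln n); c = 2.01 is an assumed default."""
    return -2.0 * loglik + c * p * math.log(math.log(n))

# Log-likelihood values and parameter counts quoted in the slides.
models = {"Weibull": (-48.42, 2), "Exponential": (-55.04, 1)}
n = 10  # hypothetical sample size, used for BIC and HQ only

for name, (loglik, p) in models.items():
    print(f"{name:12s} AIC={aic(loglik, p):.2f} "
          f"BIC={bic(loglik, p, n):.2f} HQ={hq(loglik, p, n):.2f}")
```

For every criterion the model with the smallest value is preferred, so the extra Weibull parameter must buy enough likelihood to offset its larger penalty.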


The DIC Model Selection

The DIC criterion is derived based on the BHHJ measure.

$$\hat{Q} = \int \hat{f}^{1+a}(z)\, dz - \left(1 + \frac{1}{a}\right) \frac{1}{n} \sum_{i=1}^{n} \hat{f}^{a}(x_i), \qquad 0 < a \le 1$$


Modified Divergence Information Criterion (MDIC)

$$MDIC(p) = n \cdot \widehat{MQ} + (2\pi)^{-a/2} (1 + a)^{2 + p/2}\, p$$

where

$$\widehat{MQ} = -\left(1 + \frac{1}{a}\right) \frac{1}{n} \sum_{i=1}^{n} \hat{f}^{a}(x_i).$$


Tests based on Measures


Goodness of Fit Tests


Compare BHHJ test with the goodness of fit tests based on the Kullback measure (KL), the Kagan measure (Pearson chi-square test), the Matusita measure (Mat), and the Cressie and Read measure (CR).

Three different values of the index α are used: α = 0.01, 0.05 & 0.10.

Both the power and the type I error are investigated.

Simulated results: a trinomial distribution with n = 150 is used, and 10000 simulated samples have been generated.
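
A reduced version of such a study can be sketched with the standard library. The code below estimates the rejection rate of the Pearson chi-square (Kagan-type) and Kullback-type statistics for trinomial counts. It is only an illustration under stated assumptions: the BHHJ, Matusita and Cressie-Read tests are not implemented, the critical value uses the chi-square approximation with 2 degrees of freedom (for which the upper quantile is exactly -2 ln(level)), and the function names, seed, and reduced number of simulations are hypothetical choices.

```python
import math
import random

def sample_counts(rng, n, probs):
    """Draw one multinomial sample of size n and return the cell counts."""
    counts = [0] * len(probs)
    for j in rng.choices(range(len(probs)), weights=probs, k=n):
        counts[j] += 1
    return counts

def pearson_stat(counts, probs0):
    """Pearson chi-square statistic against the null probabilities."""
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs0))

def kullback_stat(counts, probs0):
    """Likelihood-ratio statistic 2 * sum N_j ln(N_j / (n p_j)); empty cells add 0."""
    n = sum(counts)
    return 2.0 * sum(c * math.log(c / (n * p)) for c, p in zip(counts, probs0) if c > 0)

def rejection_rate(stat, true_probs, probs0, n=150, sims=2000, level=0.05, seed=1):
    """Share of simulated samples in which the null hypothesis is rejected."""
    rng = random.Random(seed)
    crit = -2.0 * math.log(level)  # chi-square upper quantile; valid for 2 d.f. only
    hits = sum(stat(sample_counts(rng, n, true_probs), probs0) > crit
               for _ in range(sims))
    return hits / sims

h0 = [0.2, 0.6, 0.2]  # null trinomial used in the study
h1 = [0.2, 0.7, 0.1]  # alternative used in the study
for name, stat in [("Pearson", pearson_stat), ("Kullback", kullback_stat)]:
    print(name, "size:", rejection_rate(stat, h0, h0),
          "power:", rejection_rate(stat, h1, h0))
```

Sampling from the null probabilities estimates the type I error ("size"), while sampling from the alternative estimates the power.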


[Figure: POWER vs 'SIZE' of the TEST (Goodness of Fit Tests). Horizontal axis: % of rejections when H0: M(150, 0.2, 0.6, 0.2) holds. Vertical axis: % of rejections when H1: M(150, 0.2, 0.7, 0.1) holds. Curves: BHHJ, KULLBACK, KAGAN, MATUSITA.]

