
Statistical concepts of validation of microsimulation models

Date post: 01-Feb-2016
Page 1: Statistical concepts of validation of microsimulation models

1

Statistical concepts of validation of microsimulation models

Philippe Finès
Health Analysis Division

Statistics Canada

24 November 2009

Page 2: Statistical concepts of validation of microsimulation models

2

Introduction – renew
Validation is a process
  that includes steps in
    evaluation (ex: cost-effectiveness evaluation)
    assessment of uncertainty
  of which the outcome is a verdict qualified by a criterion
    “Valid” or “Not valid”
    Green, Yellow, Red lights
    Score from 0 to 10
    based on statistical tests
    inversely related to uncertainty?
  and of which the purpose is to make recommendations

Page 3: Statistical concepts of validation of microsimulation models

3

Related concepts
« Adjustment » = steps done beforehand to make sure that simulation fits the data
  Alignment
  Calibration
Evaluation = steps done to determine whether the program is adequate
Assessment = steps done to analyse whether or not simulation fits reality
  Assessment of uncertainty
  Assessment of goodness of fit
  Assessment of quality of reproduction of reality

Page 4: Statistical concepts of validation of microsimulation models

4

Train of thought #1
1a: data observed in reality vs data generated in simulation
1b: relationship in reality vs relationship in simulation

Page 5: Statistical concepts of validation of microsimulation models

5

Train of thought #1

Consider the relation (X,Y)
Reality: Phenomenon being observed
  Y = f(X) + e
  X has a « natural » variation s(X); and the « natural » variation of Y, s(Y), depends on s(X) and e
  If f, s(X), e are known, then for any value of X, Y and s(Y) are known

Page 6: Statistical concepts of validation of microsimulation models

6

Train of thought #1
Simulation: Phenomenon being created
  X’ = X + d(X)
  Y’ = Y + d(Y)
  Y’ = Y + d(Y) = f(X) + d(Y) + e
  Y’ = f(X’ - d(X)) + d(Y) + e = f(X’) + e’
  If f, s(X’) and e are known, then for any value of X, Y’ and s(Y’) are known
But we examine the following relation in the simulation: Y’ = g(X’) + e’
How well does the relation (X’,Y’) obtained in the simulation reproduce the relation (X,Y) observed in reality?

Page 7: Statistical concepts of validation of microsimulation models

7

Train of thought #1
One way to address this question is: under which circumstances is the relation (X’,Y’) close to the relation (X,Y)?
  Sufficient condition: d(X)=0, d(Y)=0.
    This is achieved when simulated data reproduce reality « well » (in the sense that simulated data cannot be distinguished from reality; this can be verified by a test)
  Necessary condition: d(X) and d(Y) are « small » compared to e and to s(X).
    In other words, the uncertainty of the simulation is « small » compared to the uncertainty of the model and the uncertainty of the data.

Page 8: Statistical concepts of validation of microsimulation models

8

Validity = indistinguishability
A simulation model is « valid » if it presents indistinguishability
  In the inputs X (i.e. if d(X) is not significantly different from 0)
  In the outcomes Y (i.e. if d(Y) is not significantly different from 0)
  In the model (i.e. if e is not significantly different from 0)
This can be tested: include origin (simulation vs observed) as a dummy variable; test this variable
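The dummy-variable test can be sketched as follows. This is only an illustration under assumptions that are not in the slides: a linear relation between X and Y, an ordinary least-squares fit, and the hypothetical function name `origin_dummy_test`. Observed and simulated records are pooled, an indicator `origin` (0 = observed, 1 = simulated) is added to the regression, and its t-statistic is examined.

```python
import numpy as np

def origin_dummy_test(x_obs, y_obs, x_sim, y_sim):
    """Fit y = b0 + b1*x + b2*origin on pooled data; return (b2, t-statistic of b2)."""
    x = np.concatenate([x_obs, x_sim])
    y = np.concatenate([y_obs, y_sim])
    # origin = 0 for observed records, 1 for simulated records
    origin = np.concatenate([np.zeros(len(x_obs)), np.ones(len(x_sim))])
    design = np.column_stack([np.ones_like(x), x, origin])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    dof = len(y) - design.shape[1]
    sigma2 = resid @ resid / dof                      # residual variance
    cov = sigma2 * np.linalg.inv(design.T @ design)   # covariance of the estimates
    t_stat = beta[2] / np.sqrt(cov[2, 2])
    return beta[2], t_stat

# Toy data (assumption): the simulation shifts the outcome by d(Y) = 1
rng = np.random.default_rng(0)
x_obs = rng.normal(0, 1, 500)
y_obs = 2.0 * x_obs + rng.normal(0, 1, 500)
x_sim = rng.normal(0, 1, 500)
y_sim = 2.0 * x_sim + 1.0 + rng.normal(0, 1, 500)

coef, t = origin_dummy_test(x_obs, y_obs, x_sim, y_sim)
print(f"origin coefficient = {coef:.3f}, t = {t:.2f}")
```

A large |t| for the origin coefficient indicates that simulated and observed records can be distinguished; an interaction term between `origin` and x could be added in the same way to test the slope.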

Page 9: Statistical concepts of validation of microsimulation models

9

Statistical analysis to test null hypothesis – CI criterion

A. Rationale
Using the approach of Haefner and Mankin et al (in Marks, 2007), we can examine the relationship between S (specific, historical output = reality), M (output of the model) and Q (the intersection of S and M)

Page 10: Statistical concepts of validation of microsimulation models

10

Statistical analysis to test null hypothesis

No intersection between S and M: model is useless
Non null intersection between S and M: model is useful, but incomplete and inaccurate
Model is accurate but incomplete
Model is complete but inaccurate
Model is complete and accurate (IDEAL CASE)

Page 11: Statistical concepts of validation of microsimulation models

11

Statistical analysis to test null hypothesis – CI criterion

B. Estimation of the components of total uncertainty of results

1. Computation of the confidence intervals:
   i. [Z-hat] (the C.I. for Z-hat)
   ii. [Zobs] (the C.I. for Zobs)
2. Comparison of confidence intervals:
   i. [Z-hat] [Zobs]: score = 100% (Best case)
   ii. ( [Z-hat] ∩ [Zobs] ≠ ∅ and Z-hat ∈ [Zobs] and length([Z-hat]) ≤ length([Zobs]) ) or ( [Zobs] [Z-hat] ): score = 80%
   iii. [Z-hat] ∩ [Zobs] ≠ ∅ and Z-hat ∈ [Zobs] and length([Z-hat]) > length([Zobs]): score = 60%
   iv. [Z-hat] ∩ [Zobs] ≠ ∅ and Z-hat ∉ [Zobs]: score = 40%
   v. [Z-hat] ∩ [Zobs] = ∅: score = 0%

Page 12: Statistical concepts of validation of microsimulation models

12

Statistical analysis to test null hypothesis – CI criterion

3. Possible cases (Assessment of a verdict):

| | [Z-hat] ∩ [Zobs] ≠ ∅ and Z-hat ∈ [Zobs] | [Z-hat] ∩ [Zobs] ≠ ∅ and Z-hat ∉ [Zobs] | [Z-hat] ∩ [Zobs] = ∅ |
| length([Z-hat]) ≤ length([Zobs]) | Score=100% if best case; Score=80% | Score=40% | Score=0% |
| length([Z-hat]) > length([Zobs]) | Score=60% (NB) | Score=40% | Score=0% |

NB: Score=60% is the only case in which we need to know the proportion of the 4 components of [Z-hat]
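A minimal sketch of the CI criterion follows. The relation symbols were lost in this transcript, so the conditions used here are assumptions inferred from the scores: 0% when the two intervals do not overlap, 40% when they overlap but Z-hat falls outside [Zobs], 60% when Z-hat is inside [Zobs] but [Z-hat] is the longer interval, and 80% otherwise. The 100% "best case" is left out because its exact condition cannot be recovered. The function name `ci_score` is hypothetical.

```python
def ci_score(z_hat, ci_hat, ci_obs):
    """Score (in %) comparing the simulated C.I. [Z-hat] with the observed C.I. [Zobs].

    Assumed reading of the slides; the 100% best case is not implemented.
    """
    lo_h, hi_h = ci_hat
    lo_o, hi_o = ci_obs
    overlap = lo_h <= hi_o and lo_o <= hi_h
    if not overlap:
        return 0                      # intervals are disjoint
    if not (lo_o <= z_hat <= hi_o):
        return 40                     # overlap, but Z-hat lies outside [Zobs]
    if (hi_h - lo_h) > (hi_o - lo_o):
        return 60                     # Z-hat inside [Zobs], but [Z-hat] is longer
    return 80                         # Z-hat inside [Zobs] and [Z-hat] not longer

print(ci_score(5.0, (4.0, 6.0), (3.0, 8.0)))   # shorter simulated interval
print(ci_score(5.0, (2.0, 9.0), (4.0, 7.0)))   # longer simulated interval
print(ci_score(3.0, (2.0, 4.5), (4.0, 7.0)))   # overlap, Z-hat outside [Zobs]
print(ci_score(1.0, (0.0, 2.0), (4.0, 7.0)))   # no overlap
```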

Page 13: Statistical concepts of validation of microsimulation models

13

Train of thought #2
Uncertainties are larger in simulation, but by how much?

Page 14: Statistical concepts of validation of microsimulation models

14

Train of thought #2
In reality: X, Y
  Uncertainty on X = s(X)
  Uncertainty on the relationship (X,Y) = e
  Uncertainty on Y = s(Y), which depends on s(X) and e
  Elasticity of X with respect to Y = (delta(Y)/s(Y))/(delta(X)/s(X)), where delta(X) is defined as a function of s(X) (e.g. +/- 1.96*s(X)/sqrt(n))
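The elasticity formula above can be illustrated with a short sketch. Two choices here are assumptions, not stated in the slides: delta(Y) is taken as the change in f when X moves from its mean by delta(X) = 1.96*s(X)/sqrt(n), and the relation f is supplied by the user. The function name `elasticity` is hypothetical.

```python
import math
import random
import statistics

def elasticity(f, xs, ys):
    """(delta(Y)/s(Y)) / (delta(X)/s(X)) with delta(X) = 1.96*s(X)/sqrt(n)."""
    n = len(xs)
    s_x = statistics.stdev(xs)
    s_y = statistics.stdev(ys)
    x_bar = statistics.fmean(xs)
    delta_x = 1.96 * s_x / math.sqrt(n)
    # Assumption: delta(Y) is the response of f to a shift of delta(X) at the mean of X
    delta_y = f(x_bar + delta_x) - f(x_bar)
    return (delta_y / s_y) / (delta_x / s_x)

rng = random.Random(1)
xs = [rng.gauss(50, 10) for _ in range(200)]
ys_exact = [3 * x for x in xs]                      # e = 0: Y fully determined by X
ys_noisy = [3 * x + rng.gauss(0, 20) for x in xs]   # e > 0: extra variation in Y

e_exact = elasticity(lambda x: 3 * x, xs, ys_exact)
e_noisy = elasticity(lambda x: 3 * x, xs, ys_noisy)
print(e_exact, e_noisy)
```

With no model error the standardized elasticity of a linear relation is 1; adding model error e inflates s(Y) and pulls the elasticity below 1.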

Page 15: Statistical concepts of validation of microsimulation models

15

Train of thought #2
In the simulation: X’=X+d(X), Y’=Y+d(Y)
  Uncertainty on X = s(X)
  Uncertainty on the representation of X = d(X)
  Uncertainty on X’ = s(X’), which depends on s(X) and d(X)
  Uncertainty on the relationship (X’,Y’) = e’, which depends on d(X), d(Y), e
  Uncertainty on Y = s(Y), which depends on s(X) and e
  Uncertainty on the representation of Y = d(Y)
  Uncertainty on Y’ = s(Y’), which depends on s(Y), d(Y) and e’
  Elasticity of X’ with respect to Y’ = (delta(Y’)/s(Y’))/(delta(X’)/s(X’)), where delta(X’) is defined as a function of s(X’) (e.g. +/- 1.96*s(X’)/sqrt(n))
PRCC (Partial rank correlation coefficient) answers questions such as how the output is affected if we increase (or decrease) a specific parameter (cf. Marino et al, 2008).
Simeone Marino, Ian B. Hogue, Christian J. Ray, Denise E. Kirschner. A methodology for performing global uncertainty and sensitivity analysis in systems biology. Journal of Theoretical Biology 254 (2008) 178–196.
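As an illustration of the PRCC idea, here is a minimal sketch based on the standard definition: rank-transform the parameters and the output, remove the linear effect of the other ranked parameters, and correlate the residuals. It is not taken from the slides or from Marino et al.'s code; the toy model, the sample size and the function names `ranks` and `prcc` are assumptions.

```python
import numpy as np

def ranks(v):
    """Rank transform (no tie handling; adequate for continuous samples)."""
    return np.argsort(np.argsort(v)).astype(float)

def prcc(params, output, j):
    """Partial rank correlation between parameter column j and the output."""
    r_params = np.column_stack([ranks(params[:, k]) for k in range(params.shape[1])])
    r_out = ranks(output)
    others = np.delete(r_params, j, axis=1)
    design = np.column_stack([np.ones(len(r_out)), others])
    # Residuals after removing the linear effect of the other ranked parameters
    coef_x, *_ = np.linalg.lstsq(design, r_params[:, j], rcond=None)
    coef_y, *_ = np.linalg.lstsq(design, r_out, rcond=None)
    res_x = r_params[:, j] - design @ coef_x
    res_y = r_out - design @ coef_y
    return float(np.corrcoef(res_x, res_y)[0, 1])

# Toy model (assumption): output rises with p0, falls with p1, ignores p2
rng = np.random.default_rng(0)
params = rng.uniform(0, 1, size=(500, 3))
output = 5 * params[:, 0] - 2 * params[:, 1] + rng.normal(0, 0.1, 500)

prcc_values = [prcc(params, output, j) for j in range(3)]
print(prcc_values)
```

A PRCC near +1 or -1 signals a strong monotone effect of that parameter on the output once the others are accounted for; a value near 0 signals little effect.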

Page 16: Statistical concepts of validation of microsimulation models

16

Other notes
One could write variability of a result as the sum of uncertainty components (cf. Marino et al, 2008: PRCC and eFAST)
  PRCC (Partial rank correlation coefficient) answers questions such as how the output is affected if we increase (or decrease) a specific parameter
  eFAST (extended Fourier amplitude sensitivity test) indicates which parameter uncertainty has the greatest impact on output variability
(Our position:) A model would be considered valid if it does not add much variability compared to the one already present in the data
  i. [(Xobs,Yobs)] [-hat] [Z-hat]; [Z-hat] to be compared to [Zobs]
  ii. [Z-hat] (the C.I. for Z-hat) includes all the uncertainties implied in computation and simulation of Z-hat
    [(Xobs,Yobs)]: variance estimates
    [-hat]: sensitivity analysis of parameters
    Uncertainty due to model
    Uncertainty due to simulation (=Monte-Carlo errors)
  iii. [Zobs] (the C.I. for Zobs) includes all the uncertainties implied in measure of Zobs
    [Zobs]: variance estimates

Page 17: Statistical concepts of validation of microsimulation models

17

Statistical analysis to test null hypothesis – Variance criterion
We consider that the model is valid if the following conditions are satisfied (the first 3 being essentially trivial):
i. In the training sample, E(ŷ) Y
ii. In the test sample, E(ŷ) Y
iii. Var (ŷ) in the training sample <= Var (y) in the test sample
iv. In the test sample, Var (ŷ) <= Var (y), that is, the predicted values of y are less dispersed than the original y values, even when using the coefficients obtained in the training sample.
We will repeat n times the technique of dividing the original sample into a training sample and a test sample (e.g. n=500). If condition (iv) is satisfied at least 475 times (that is, 95% of the time), we conclude that the model to predict y is valid.
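The repeated-split check of condition (iv) can be sketched as below. The slides do not specify the prediction model or the split proportion, so a simple linear regression and a 50/50 split are assumed here, and the names `fit_line` and `variance_criterion` are hypothetical.

```python
import random
import statistics

def fit_line(xs, ys):
    """Least-squares intercept and slope for y = a + b*x."""
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def variance_criterion(xs, ys, n_splits=500, threshold=0.95, seed=0):
    """Share of splits where Var(y-hat) <= Var(y) in the test sample, and the verdict."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    passes = 0
    for _ in range(n_splits):
        rng.shuffle(idx)
        half = len(idx) // 2                      # assumed 50/50 split
        train, test = idx[:half], idx[half:]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        y_hat = [a + b * xs[i] for i in test]     # coefficients from the training sample
        y_test = [ys[i] for i in test]
        if statistics.variance(y_hat) <= statistics.variance(y_test):
            passes += 1
    share = passes / n_splits
    return share, share >= threshold

# Toy data (assumption): linear relation with noise
data_rng = random.Random(0)
xs = [data_rng.gauss(0, 1) for _ in range(400)]
ys = [2 * x + data_rng.gauss(0, 2) for x in xs]

share, valid = variance_criterion(xs, ys)
print(share, valid)
```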

Page 18: Statistical concepts of validation of microsimulation models

18

Statistical analysis to test null hypothesis – MQE criterion

Mean Quadratic Error (MQE): MQE = Variance + (Bias)²
  where Variance is computed among the ŷi’s and Bias is the average gap between the ŷi’s and the yi’s
We want our model to be useful in the sense that
  Bias² is much smaller than Variance, and
  MQE of the model is much smaller than Variance of the original yi’s (without any model).
MQE will be computed within the training sample and the test sample.
  Although we expect that MQE will be larger in the test sample than in the training sample, we will determine whether it is not unreasonably larger (say, not larger than 1.25 times the MQE of the training sample).
We will repeat the splitting of the original sample into a test and a training sample for B iterations (say, B=500). We will verify whether the 3 conditions described above are satisfied in a large share of the iterations (say, 95%).
  If it is not the case, then the model has to be questioned, as its lack of stability will make the results in POHEM difficult to interpret.
  If it is the case, then we will say that the model is valid, and proceed to the validation of the parameters.
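A small worked example of the MQE decomposition, following the definitions on this slide (Variance among the ŷi’s, Bias as the average gap between ŷi and yi). The use of the population variance, the toy numbers and the function names `mqe` and `is_stable` are assumptions.

```python
import statistics

def mqe(y_hat, y):
    """Return (MQE, Variance, Bias) with MQE = Variance + Bias**2."""
    variance = statistics.pvariance(y_hat)                        # spread of the predictions
    bias = statistics.fmean([p - o for p, o in zip(y_hat, y)])    # average gap
    return variance + bias ** 2, variance, bias

def is_stable(mqe_train, mqe_test, factor=1.25):
    """Test-sample MQE should not exceed 1.25 times the training-sample MQE."""
    return mqe_test <= factor * mqe_train

y_hat = [1.0, 2.0, 3.0, 4.0]
y_obs = [1.5, 2.5, 3.5, 4.5]
total, variance, bias = mqe(y_hat, y_obs)
print(total, variance, bias)   # Variance = 1.25, Bias = -0.5, MQE = 1.5
print(is_stable(1.5, 1.8))     # 1.8 <= 1.875
```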

Page 19: Statistical concepts of validation of microsimulation models

19

Train of thought #3 – Indistinguishability assessed in pivotal statistics

Page 20: Statistical concepts of validation of microsimulation models

20

Train of thought #3

To assess indistinguishability
  One has to concentrate on pivotal statistics, i.e. the results whose purpose is to validate the model. They represent “synthetic” results that are more easily interpretable and comparable between sources.
    Ex: Z= number or proportion of deaths in a given year; Z= number or proportion of obese in a given year.
  Some pivotal outcomes should include results after a certain amount of time.
    Ex: Z= number or proportion of deaths 20 years later; Z= number or proportion of obese 20 years later.

Page 21: Statistical concepts of validation of microsimulation models

21

Train of thought #3

For each node (ex: “Diabetes”),
1. we determine a pivotal statistic (ex: Prevalence in 10 years)
2. we determine the set of parameters that have an impact on the node
3. we examine the variability of the parameters
4. we build a series of scenarios that reproduce the variability of the parameters
5. we examine the impact of the scenarios on the range of the pivotal statistic

Page 22: Statistical concepts of validation of microsimulation models

22

Train of thought #4 – conciliation of pivotal statistics and elasticity

Page 23: Statistical concepts of validation of microsimulation models

23

Train of thought #4

We also introduced the concept of « elasticity » of a statistic X as = the variation of Y / variation of X.
More precisely, elast(X,Y) = (delta Y/s(Y))/(delta X/s(X)), where delta X/s(X) is the « natural » variation of X.

Page 24: Statistical concepts of validation of microsimulation models

24

Train of thought #4
We want to introduce a theory that combines the concept of elasticity and of pivotal statistics. The problem is that it combines the Xs and the Ys. For simplicity, let us call X an input and Y an outcome. One tentative theory could be:
  A pivotal statistic Y is an outcome such that for many inputs Xi, elast(Xi,Y) is « large » (i.e. the confidence interval widens significantly)
  Ex: in POHEM-OA, LE and HALE are both related to incidence of OA, obesity, RR of obesity; but HALE varies relatively more than LE when incidence of OA, proportion of obese and RR of obesity vary within their « natural » variation

Page 25: Statistical concepts of validation of microsimulation models

25

Train of thought #4
We have to limit ourselves to the pivotal statistics defined as the ones for which the elasticity is the largest: that means that we examine only the most sensitive outputs.
In other words, either we define pivotal statistics a priori (from the logical pathway) or we define them as the ones with precisely the largest elasticity.

Page 26: Statistical concepts of validation of microsimulation models

26

Thank you!

Questions?

[email protected]

Page 27: Statistical concepts of validation of microsimulation models

27

Our position

In this presentation, we will only look at the uncertainty of the parameters, i.e. we perform sensitivity analysis.

Page 28: Statistical concepts of validation of microsimulation models

28

Train of thought #2
For each node,
1. we determine a pivotal statistic
   1. that is easy to interpret
   2. that can be compiled for real data
2. we determine the set of parameters that have an impact on the node
   1. on either the main arc that leads to this node or all of them
3. we examine the variability of the parameters:
   1. From a K-fold technique or from a bootstrap technique, if they are available, we will get the distributions of the parameters
   2. If K-fold and bootstrap techniques are not available, we will use the standard error of the parameters
   3. If this is not possible either, we will use only mean +/- 0.5*mean
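The fallback order in step 3 can be sketched as a single helper. Only the order (resampling distribution, then standard error, then mean +/- 0.5*mean) comes from the slide; the 2.5th and 97.5th percentiles for the resampled case, the +/- 1.96 standard errors for the second case, and the function name `parameter_range` are assumptions.

```python
import statistics

def parameter_range(mean, replicates=None, std_error=None):
    """Plausible (low, high) range for one parameter, following the fallback order."""
    if replicates is not None and len(replicates) >= 2:
        # K-fold or bootstrap replicates available: use their empirical distribution
        cuts = statistics.quantiles(replicates, n=40)   # cut points every 2.5%
        return cuts[0], cuts[-1]
    if std_error is not None:
        # Assumption: a range of +/- 1.96 standard errors around the estimate
        return mean - 1.96 * std_error, mean + 1.96 * std_error
    # Last resort stated in the slide: mean +/- 0.5*mean
    half_width = 0.5 * abs(mean)
    return mean - half_width, mean + half_width

boot_range = parameter_range(50.0, replicates=list(range(1, 101)))
se_range = parameter_range(2.0, std_error=0.1)
fallback_range = parameter_range(2.0)
print(boot_range, se_range, fallback_range)
```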

Page 29: Statistical concepts of validation of microsimulation models

29

Train of thought #2
4. we build a series of scenarios that reproduce the variability of the parameters
   1. using a “multi-way probabilistic” approach, we will implement the method suggested by Cronin et al.
      1. The simulation will be run many times, with the values of the parameters randomly chosen from plausible combined distributions. The results will be presented as a distribution of model predictions.
      2. This technique is a special case of Bayesian techniques where values of parameters are generated from an observed distribution. The challenge in this case will be to make sure that the plausible distributions mentioned are multidimensional: the values of the parameters will have to take into account the total and partial correlations between risk factors.
   2. There will therefore be three methods:
      1. No parameter uncertainty
      2. Parametric bootstrap analysis
      3. Cronin et al’s method
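The multi-way probabilistic approach can be illustrated as follows. This is not Cronin et al.'s implementation: the joint distribution is assumed to be multivariate normal, the parameter values and their correlation are invented for illustration, and `toy_model` is a hypothetical stand-in for one run of the microsimulation.

```python
import numpy as np

def toy_model(incidence, relative_risk, prevalence_obese=0.25):
    """Hypothetical stand-in for one simulation run: population-average incidence."""
    return incidence * (1 + prevalence_obese * (relative_risk - 1))

def probabilistic_runs(means, cov, n_runs=1000, seed=0):
    """Draw correlated parameter sets and return the distribution of predictions."""
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal(means, cov, size=n_runs)   # joint, not one-at-a-time
    return np.array([toy_model(inc, rr) for inc, rr in draws])

# Assumed parameter estimates, standard errors and correlation
means = [0.01, 2.0]
sd = np.array([0.001, 0.2])
corr = 0.5
cov = np.array([[sd[0] ** 2, corr * sd[0] * sd[1]],
                [corr * sd[0] * sd[1], sd[1] ** 2]])

preds = probabilistic_runs(means, cov)
low, high = np.percentile(preds, [2.5, 97.5])
print(f"central 95% of predictions: [{low:.4f}, {high:.4f}]")
```

Drawing from the joint distribution, rather than varying each parameter separately, is what lets the scenarios respect the correlations between risk factors.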

Page 30: Statistical concepts of validation of microsimulation models

30

Train of thought #2
5. we examine the impact of the scenarios on the range of the pivotal statistics
   1. how much do the pivotal statistics vary?
      1. Coefficient of sensitivity (CS) = elasticity = range of outcome / average of outcome
      2. Is CS > or < 1?
      3. We will conclude that if the results are “highly” sensitive to a group of parameters, it means that more emphasis has to be put on the accuracy of these parameters.
      4. On the contrary, if they are relatively stable when the parameters vary within their permissible range, it means that the result is robust.
   2. how much do the pivotal statistics stay close to the real data?
      1. face validity
      2. statistical analysis to test null hypothesis (H0) that “simulated results do not differ from real data” (see next pages)
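The coefficient of sensitivity defined above (range of outcome / average of outcome) reduces to a one-line computation. The scenario values below are invented for illustration and the function name is hypothetical.

```python
import statistics

def coefficient_of_sensitivity(outcomes):
    """CS = range of the pivotal statistic across scenarios / its average."""
    return (max(outcomes) - min(outcomes)) / statistics.fmean(outcomes)

# Pivotal statistic (e.g. a prevalence in %) under three hypothetical scenarios
scenario_outcomes = [9.0, 10.0, 11.0]
cs = coefficient_of_sensitivity(scenario_outcomes)
print(cs, "highly sensitive" if cs > 1 else "relatively stable")
```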

Page 31: Statistical concepts of validation of microsimulation models

31

Assessment of uncertainty
Uncertainty
  According to Briggs, Sculpher, Buxton (for the context of HTA):
    Uncertainty relating to variability in sample data
    Uncertainty relating to the generalisability of results
    Uncertainty relating to extrapolation
    Uncertainty relating to analytical methods
  Also (Wolf:)
    uncertainty of random numbers (Cronin et al: stochastic variation)
    parameter uncertainty
    imputation error found in the starting-population data base
    analyst’s ignorance about the true value of “unmeasured heterogeneity”
  Also (Cronin et al:)
    choice of the specified model structure

Page 32: Statistical concepts of validation of microsimulation models

32

Assessment of uncertainty (Briggs, Sculpher, Buxton)
“The increased use of the clinical trial […], encourages greater use of formal statistical methods to handle some types of uncertainty systematically.
Sensitivity analysis is still (1) needed, however, […] in a number of contexts:
  to deal with uncertainty in relation to data inputs for which no clear sample exists;
  to attempt to increase the generalisability of the study;
  to handle uncertainty associated with attempts to extrapolate away from the primary data source, in order to make the results more comprehensive [also in Weinstein, 2006]
  to explore the implications of selecting a particular analytical method from amongst alternatives when no widely accepted approach exists” (p. 101)
Therefore, Briggs, Sculpher, Buxton contrast statistical methods with sensitivity analysis. However, the two are not incompatible.
(1): Emphasis ours

Page 33: Statistical concepts of validation of microsimulation models

33

Assessment of uncertainty (Briggs, Sculpher, Buxton)
Sensitivity analyses
  Simple sensitivity analysis
    One-way
    Multi-way
  Threshold analysis
    Identify the critical value of parameter(s) above or below which the conclusion of a study will change
  Analysis of extremes
    High cost: combination of all pessimistic assumptions about costs
    Low cost: combination of all optimistic assumptions about costs
  Probabilistic sensitivity analysis
    Taking into account the distribution of values

Page 34: Statistical concepts of validation of microsimulation models

34

Assessment of uncertainty (Briggs, Sculpher, Buxton)
Uncertainty
  Uncertainty relating to variability in sample data
  Uncertainty relating to the generalisability of results
  Uncertainty relating to extrapolation
  Uncertainty relating to analytical methods
Sensitivity analyses
  Simple sensitivity analysis
  Threshold analysis
  Analysis of extremes
  Probabilistic sensitivity analysis
[Table: sensitivity analysis method by type of uncertainty. Columns: Data variability, Generalisability, Extrapolation, Analytical method. Rows: Simple sensitivity analysis, Threshold analysis, Analysis of extremes, Probabilistic sensitivity analysis. Entries use the labels (generally useful), (unlikely to be useful) and (potentially useful in certain contexts); the individual cells are not recoverable from this transcript.]

Page 35: Statistical concepts of validation of microsimulation models

35

Assessment of uncertainty – bootstrap approach (Cronin et al)

Sensitivity analysis on the parameters

1. No parameter uncertainty
2. Parametric bootstrap analysis
3. Parameter sampling design (Latin square design)

