
Meta-analysis of diagnostic accuracy studies

Mariska Leeflang (with thanks to Yemisi Takwoingi, Jon Deeks and Hans Reitsma)

Dr Mariska Leeflang, Dept. of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Center, University of Amsterdam, Room J1B-210, PO Box 227700, 1100 DE Amsterdam, m.m.leeflang@amc.uva.nl

Diagnostic Test Accuracy Reviews

1. Framing the question

2. Identification and selection of studies

3. Quality assessment

4. Data extraction

5. Data analysis

6. Interpretation of the results


Ultimate goal of meta-analysis

Robust conclusions with respect to the research question(s)


Meta-Analysis

1. Calculation of an overall summary (average) of high precision, coherent with all observed data

2. Typically a "weighted average" is used, where more informative (larger) studies have more say

3. Assess the degree to which the study results deviate from the overall summary

4. Investigate possible explanations for the deviations

The (meta-)analytic process

1. What analyses did you plan?
   a. Primary objective
   b. Subgroups, sensitivity analyses, etc.

2. What are the data at hand?
   a. Forest plots
   b. Raw ROC plots
   c. Variation in predefined covariates?

3. Is meta-analysis appropriate?
   a. Sufficient clinical/methodological homogeneity
   b. Enough studies per review question


4. Meta-analysis

Summary of which values?

                    Disease (ref. test)
                    Present    Absent
Index test +          TP         FP
Index test −          FN         TN

Sensitivity
Specificity
Positive Predictive Value
Negative Predictive Value
Positive Likelihood Ratio
Negative Likelihood Ratio
Diagnostic Odds Ratio
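For reference, these measures follow directly from the 2×2 table above (standard definitions, written out here for convenience):

\[
Se = \frac{TP}{TP+FN}, \quad Sp = \frac{TN}{TN+FP}, \quad
PPV = \frac{TP}{TP+FP}, \quad NPV = \frac{TN}{TN+FN},
\]
\[
LR^{+} = \frac{Se}{1-Sp}, \quad LR^{-} = \frac{1-Se}{Sp}, \quad
DOR = \frac{TP \times TN}{FP \times FN} = \frac{LR^{+}}{LR^{-}}.
\]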

ROC curves

Pooling sensitivity and specificity? [figure-only slides]

Pooling likelihood ratios? [figure-only slides]

Pooling odds ratios? [figure-only slide]

Let’s focus on sensitivity and specificity

Predictive values depend directly on prevalence.

Pooling likelihood ratios may lead to misleading or impossible results.

Pooling odds ratios may be okay, but odds ratios are difficult to interpret.

From the pooled sensitivity and specificity it is still possible to calculate LRs and predictive values.
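As a sketch of that last point: the likelihood ratios follow from the pooled sensitivity and specificity as above, and the predictive values additionally need an (assumed) prevalence, written here as prev:

\[
PPV = \frac{Se \cdot prev}{Se \cdot prev + (1-Sp)(1-prev)}, \qquad
NPV = \frac{Sp\,(1-prev)}{Sp\,(1-prev) + (1-Se)\,prev}.
\]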

Descriptive Analysis

Forest plots
  point estimate with 95% CI
  paired: sensitivity and specificity side by side

Descriptive Analysis

Forest plots
  point estimate with 95% CI
  paired: sensitivity and specificity side by side

ROC plot
  pairs of sensitivity & specificity in ROC space
  bubble plot to show differences in precision

Plot in ROC Space

[ROC plot: true positive rate vs. false positive rate]

Different Approaches

Pooling separate estimates: not recommended

Summary ROC model: traditional approach, relatively simple

More complex models: bivariate random-effects approach; hierarchical summary ROC approach

Threshold effects

Decreasing the threshold increases sensitivity but decreases specificity; increasing the threshold increases specificity but decreases sensitivity.

[Figures: fetal fibronectin for predicting spontaneous birth — sensitivity vs. specificity at two different thresholds]

Implicit and explicit threshold effects

Explicit threshold: different thresholds are used for test positivity

Implicit threshold: there is no threshold, or only one threshold, but in some cases tests are regarded as positive earlier than in other cases.

Explicit threshold: (ROC) curve

The ROC curve represents the relationship between the true positive rate (TPR) and the false positive rate (FPR) of the test at various thresholds used to distinguish disease cases from non-cases.

Deeks JJ. BMJ 2001;323:157-162.

Implicit threshold

ELISA for invasive aspergillosis; cut-off value 1.5 ODI.

Diagnostic odds ratios

Ratio of the odds of positivity in the diseased to the odds of positivity in the non-diseased.

\[
DOR = \frac{TP \times TN}{FP \times FN}
    = \frac{\text{sensitivity}/(1-\text{sensitivity})}{(1-\text{specificity})/\text{specificity}}
    = \frac{LR^{+}}{LR^{-}}
\]

Diagnostic odds ratios

Cervical cancer (biopsy)
                Present   Absent   Total
HPV test +         65        93      158
HPV test −          7       161      168
Total              72       254      326

\[
DOR = \frac{65 \times 161}{93 \times 7} \approx 16
\]

Diagnostic odds ratios

DOR for combinations of sensitivity (columns) and specificity (rows):

Spec \ Sens   50%   60%   70%   80%   90%   95%   99%
50%             1     2     2     4     9    19    99
60%             2     2     4     6    14    29   149
70%             2     4     5     9    21    44   231
80%             4     6     9    16    36    76   396
90%             9    14    21    36    81   171   891
95%            19    29    44    76   171   361  1881
99%            99   149   231   396   891  1881  9801

Symmetrical ROC curves and diagnostic odds ratios

As the DOR increases, the ROC curve moves closer to its ideal position near the upper-left corner. The ROC curve is asymmetric when test accuracy varies with threshold.

[Figure: symmetric ROC curves for DOR = 3, 6, 15 and 90, with the uninformative-test diagonal and the line of symmetry]

Statistical modelling of ROC curves

Statisticians like straight lines, with axes that are independent variables.

First calculate the logits of TPR and FPR, and then graph the difference against their sum.

Translating ROC space to D versus S

[Figure: study results plotted in ROC space (true positive rate vs. false positive rate) and re-plotted as D = log odds ratio against S]

Moses-Littenberg SROC method

What do the axes mean?
  The difference in logits (D) is the log of the DOR.
  The sum of the logits (S) is a marker of diagnostic threshold.

[Figure: D = log odds ratio plotted against S]
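Written out in formula form (a restatement of the definitions above):

\[
D = \operatorname{logit}(TPR) - \operatorname{logit}(FPR) = \ln(DOR), \qquad
S = \operatorname{logit}(TPR) + \operatorname{logit}(FPR),
\]
\[
\text{where } \operatorname{logit}(p) = \ln\!\frac{p}{1-p}.
\]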

Moses-Littenberg SROC method

Regression models can be used to fit a straight line modelling the relationship between test accuracy and test threshold:

D = a + bS

The outcome variable D is the difference in the logits; the explanatory variable S is the sum of the logits. Ordinary or weighted regression can be used, weighted by sample size or by the inverse variance of the log of the DOR.

Linear Regression

[Figure: straight line fitted to the D versus S scatter of study results]

Producing summary ROC curves

Transform back to the ROC dimensions, where 'a' is the intercept and 'b' is the slope. When the ROC curve is symmetrical, b = 0 and the equation is simpler.
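As a worked sketch of that back-transformation (this follows directly from D = a + bS and the logit definitions above; it is not quoted from the slides):

\[
\operatorname{logit}(TPR) = \frac{a}{1-b} + \frac{1+b}{1-b}\operatorname{logit}(FPR)
\;\;\Rightarrow\;\;
TPR = \left[1 + e^{-a/(1-b)}\left(\frac{FPR}{1-FPR}\right)^{-\frac{1+b}{1-b}}\right]^{-1},
\]
\[
\text{and for a symmetric curve } (b = 0):\quad
TPR = \left[1 + e^{-a}\,\frac{1-FPR}{FPR}\right]^{-1}.
\]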

Linear Regression & Back Transformation

[Figure: fitted D versus S line (left) back-transformed to a summary ROC curve in ROC space (right)]

Different situations

What is the relationship between the underlying distributions, the ROC curve, and the D versus S line?

Let's have a look at different situations.

ROC curve and logit difference and sum plot: small difference, same spread

[Figure: distributions of the test measurement in non-diseased and diseased groups, the resulting ROC curve, and the corresponding plot of logit TPR − logit FPR against logit TPR + logit FPR]

ROC curve and logit difference and sum plot: moderate difference, same spread

[Figure: same three panels for distributions with a moderate difference and the same spread]

ROC curve and logit difference and sum plot: large difference, same spread

[Figure: same three panels for distributions with a large difference and the same spread]

ROC curve and logit difference and sum plot: moderate difference, unequal spread

[Figure: same three panels for distributions with a moderate difference and unequal spread; the ROC curve is asymmetric, with HIGH DOR and LOW DOR regions]

SROC regression: another example

[Figure: ROC plot and D versus S plot with unweighted and weighted regression lines]

The transformation linearizes the relationship between accuracy and threshold so that linear regression can be used.

PSV example cont.

[Figure: unweighted and weighted D versus S regression lines and their inverse transformation to summary ROC curves]

The SROC curve is produced by using the estimates of a and b to compute the expected sensitivity (tpr) across a range of values for 1-specificity (fpr)

Problems with the Moses-Littenberg SROC method

Poor estimation: tends to underestimate test accuracy, due to zero-cell corrections and bias in the weights.

Validity of significance tests: sampling variability in individual studies is not properly taken into account, so p-values and confidence intervals are erroneous.

Operating points: knowing the average sensitivity/specificity is important but cannot be obtained; only the sensitivity for a given specificity can be estimated.

Advanced models – HSROC and bivariate methods

Hierarchical / multi-level: allows for both within- and between-study variability, and for within-study correlations between the diseased and non-diseased groups.

Logistic: correctly models sampling uncertainty in the true positive proportion and the false positive proportion; no zero-cell adjustments needed.

Random effects: allows for heterogeneity between studies.

Regression models: can be used to investigate sources of heterogeneity.

Parameterizations

HSROC                      Bivariate
Mean lnDOR                 Mean logit sensitivity
Variance lnDOR             Variance logit sensitivity
Mean threshold             Mean logit specificity
Variance threshold         Variance logit specificity
Shape of ROC curve         Correlation between sensitivity and specificity

Other than the parameterization, the models are mathematically equivalent; see Harbord R, Deeks J, et al. A unification of models for meta-analysis of diagnostic accuracy studies. Biostatistics 2006;1:1-21.
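For concreteness, here is a sketch of the bivariate parameterization listed above (the standard form of the model; the notation is mine, not from the slides): the study-specific logit sensitivities and specificities are drawn from a bivariate normal distribution, with binomial within-study variability.

\[
\begin{pmatrix}\operatorname{logit}(Se_i)\\ \operatorname{logit}(Sp_i)\end{pmatrix}
\sim N\!\left(
\begin{pmatrix}\mu_{Se}\\ \mu_{Sp}\end{pmatrix},
\begin{pmatrix}\sigma^2_{Se} & \sigma_{Se,Sp}\\ \sigma_{Se,Sp} & \sigma^2_{Sp}\end{pmatrix}
\right),
\qquad
TP_i \sim \mathrm{Bin}(n_{D,i},\,Se_i),\;\; TN_i \sim \mathrm{Bin}(n_{\bar D,i},\,Sp_i).
\]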

Hierarchical SROC model

[Figure: summary ROC curve annotated with the accuracy, threshold, and shape parameters]

Bivariate model

[Figure: summary point in ROC space annotated with mean sensitivity, mean specificity, and their correlation]

Outputs from the models

HSROC
  Estimates the underlying SROC curve and the average operating point on the curve (mean DOR and mean threshold)
  Possible to estimate mean sensitivity, specificity and mean likelihood ratios, with standard errors obtained using the delta method
  Confidence and prediction ellipses estimable

Bivariate
  Estimates the average operating point (mean sensitivity and specificity), with confidence and prediction ellipses
  Possible to estimate mean likelihood ratios, with standard errors obtained using the delta method
  Underlying SROC curve estimable

Fitting the models

HSROC
  Hierarchical model with non-linear regression, random effects and binomial error
  Original code in WinBUGS
  Easy to fit in PROC NLMIXED in SAS

Bivariate
  Hierarchical model with linear regression, random effects and binomial error
  Easy to fit in PROC NLMIXED in SAS; can be fitted in PROC MIXED
  Also in GLLAMM in Stata, and in MLwiN

Syntax Proc NLMIXED - HSROC

proc nlmixed data=diag;
  parms alpha=4 theta=0 beta=0 s2ua=1 s2ut=1;
  logitp = (theta + ut + (alpha + ua)*dis) * exp(-(beta)*dis);
  p = exp(logitp)/(1+exp(logitp));
  model pos ~ binomial(n,p);
  random ua ut ~ normal([0,0],[s2ua,0,s2ut]) subject=study;
run;

(beta is the shape parameter; dis is the disease indicator)

Hierarchical SROC model

[Figure: summary ROC curve annotated with the accuracy, threshold, and shape parameters, as before]

Syntax Proc NLMIXED - Bivariate

proc nlmixed data=diag;
  parms msens=1 mspec=2 s2usens=0.2 s2uspec=0.6 cov=0;
  logitp = (msens + usens)*dis + (mspec + uspec)*nondis;
  p = exp(logitp)/(1+exp(logitp));
  model pos ~ binomial(n,p);
  random usens uspec ~ normal([0,0],[s2usens,cov,s2uspec]) subject=study;
run;
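The syntax above assumes a long-format dataset with two records per primary study. A minimal sketch of what that dataset might look like (my assumption about the layout, illustrated with the HPV example from the earlier slide plus a made-up second study): the diseased record has dis=1, nondis=0, pos = true positives, n = number diseased; the non-diseased record has dis=0, nondis=1, pos = true negatives, n = number non-diseased.

data diag;
  input study dis nondis pos n;
  datalines;
1 1 0 65 72
1 0 1 161 254
2 1 0 40 50
2 0 1 90 100
;
run;
* First record per study: diseased group (pos = TP, n = number diseased). ;
* Second record per study: non-diseased group (pos = TN, n = number non-diseased). ;
* Study 2 is hypothetical filler, included only to show the structure. ;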

Bivariate model

[Figure: summary point in ROC space annotated with mean sensitivity, mean specificity, and their correlation, as before]

METADAS

SAS macro developed to automate HSROC/bivariate analyses using PROC NLMIXED.

Can be used together with Review Manager 5 (Cochrane review software) to:
  plot summary curve(s)
  display summary point(s)
  display 95% confidence and/or prediction regions for summary point(s)

Part 2: dealing with heterogeneity

The meta-analyst's dream!

[ROC plot: sensitivity vs. 1-specificity]

Realistic situation: vast heterogeneity


Echocardiography in Coronary Heart Disease [ROC plot of study results]

GLAL in Gram Negative Sepsis [ROC plot of study results]

F/T PSA in the Detection of Prostate Cancer [ROC plot of study results]

Dip-stick Testing for Urinary Tract Infection [ROC plot of study results]

Sources of Variation

I. Chance variation
II. Differences in threshold
III. Bias
IV. Clinical subgroups
V. Unexplained variation

Sources of Variation: Chance

Chance variability: [simulated ROC plots at sample size = 40 and sample size = 100]

Sources of Variation: Threshold

Threshold: perfect negative correlation, no chance variability [ROC plot]

Sources of Variation: Threshold

Threshold: perfect negative correlation + chance variability (sample size = 60) [ROC plot]

Sources of Variation: Bias & Subgroup

Bias & subgroup: sensitivity & specificity higher (sample size = 60, no threshold effect) [ROC plot]

Sources of Variation

I. Chance variation
II. Differences in threshold
III. Bias
IV. Subgroups
V. Unexplained variation

Comparison

Feature                  Older model*   Advanced models**
Chance variability            +/-               +
Threshold differences          +                +
Subgroups                      +                +
Unexplained variation         +/-               +

* Moses-Littenberg model
** Hierarchical and bivariate models

Exploring heterogeneity

Summarise data per subgroup
Subgroup analyses
Meta-regression analysis

Covariates:
  study characteristics (patients, index tests, reference standard, setting, disease stage, etc.)
  methodological quality items (QUADAS items)

Subgroup analysis and meta-regression

Advanced models can easily incorporate study-level covariates.

Different questions can be addressed:
  differences in summary points of sensitivity or specificity
  differences in overall accuracy
  differences in threshold
  differences in shape of the SROC curve

Limitations of meta-regression

Validity of covariate information: poor reporting of design features.

Population characteristics: information missing or only crudely available.

Lack of power: small number of contrasting studies.

Subgroup analyses

Subgroup 1: both sensitivity & specificity higher [ROC plot]

Prospective vs. Retrospective studies

[ROC plot of study results by method of data collection: prospective vs. retrospective]

This may look easy, but…

The following slides give the results of a study we did to incorporate the effects of quality into a meta-analysis.

Leeflang et al. Impact of adjustment for quality on results of meta-analyses of diagnostic accuracy. Clin Chem. 2007;53:164-72.

Effects of high/low quality?

1. Change in DOR
2. Change in consistency of DOR
3. Change in heterogeneity

Hypotheses

Deficiencies in study quality have been associated with inflated estimates and with heterogeneity.

Accounting for quality differences will therefore lead to:
  … less optimistic summary estimates
  … more homogeneous results

Challenge 3: Incorporation Strategies

1. Ignoring (sometimes graphs are shown): pooling all studies, disregarding quality

2. Subgroup analysis
   also: quality as a criterion for inclusion
   also: stratification into more than one subgroup
   also: sensitivity analysis

3. Regression analysis: stepwise multivariable regression analysis, and multivariable regression analysis with a fixed set of covariates

4. Weighted pooling: 'not done'

5. Sequential analysis: from highest quality to lowest quality (cumulative meta-analysis)

Methods

Quality assessment in 487 studies included in 30 systematic reviews.

QUADAS checklist used (Whiting et al. BMC Med Res Methodol, 2003).

Two definitions of high quality:
  1. Evidence-based definition
  2. Common practice definition

Three methods for incorporation of quality:
  1. Exclusion of low-quality studies
  2. Multivariable regression analysis with all items involved
  3. Stepwise multivariable regression analysis (p>0.2)

Comparison of DORs, 95% CIs of DORs, and changes in a hypothetical decision.

Evidence-based definition

Common practice definition

Results

Non-reporting of items was common, especially for blinding of the index or reference test, the time interval between index test and reference test, and the inclusion of patients.

Evidence-based definition: 72 high-quality studies (15%); 12 reviews contained no high-quality studies.

Common-practice definition: 70 high-quality studies (14%); 9 reviews contained no high-quality studies.

Fulfilling all 8 criteria: only 10 out of 487 studies were of high quality, and only 1 meta-analysis out of 31 contained more than 3 high-quality studies…

The Strategies

Ignoring quality: pooling all studies.

■ Analyzing subgroups: only pooling high-quality studies; high quality defined as fulfilling a certain subset of criteria.

▲ Stepwise multivariable regression analysis: QUADAS items with a p-value <0.2 in univariate analysis are entered into a multivariable regression model.

Multivariable regression analysis with a fixed set of covariates: a standard set of three QUADAS items was used as covariates in each meta-analysis.

[Figure: DOR plotted by meta-analysis ID]

Conclusions?

We found no evidence for our hypothesis that adjusting for quality leads to less optimistic and more homogeneous results.

Explanations:
  poor reporting
  small sample size (30 SRs, small studies)
  opposite effects of quality items
  DOR instead of sensitivity and specificity
  relation between quality and estimates not straightforward

Still, poor quality will affect trustworthiness. Therefore, report the quality of individual studies and the overall quality.

Exercise

What do the results of a meta-analysis mean…?

I have some output from SAS and Stata and would like to invite you to have a look at it.

Bivariate or HSROC? What do the parameters mean?

Part 3: Test Comparisons

Differences between tests

Diagnosis of lymph node metastasis in women with cervical cancer.

2 imaging modalities: lymphangiography (LAG, n=17) and CT (n=17).

Published meta-analysis: JAMA 1997;278:1096-1101.

Modelled by adding a covariate for test into the model statement, with parameter estimates for differences in:
  sensitivity and specificity for the bivariate model
  log DOR, threshold and shape for the HSROC model
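As a sketch of what "adding a covariate for test into the model statement" could look like for the bivariate model (my illustration extending the earlier NLMIXED syntax, not the authors' actual code; testcov, se_diff and sp_diff are made-up names):

proc nlmixed data=diag;
  parms msens=1 mspec=2 se_diff=0 sp_diff=0
        s2usens=0.2 s2uspec=0.6 cov=0;
  * testcov = 1 for CT studies, 0 for LAG studies (assumed coding);
  logitp = (msens + usens + se_diff*testcov)*dis
           + (mspec + uspec + sp_diff*testcov)*nondis;
  p = exp(logitp)/(1+exp(logitp));
  model pos ~ binomial(n,p);
  random usens uspec ~ normal([0,0],[s2usens,cov,s2uspec]) subject=study;
run;

Here se_diff and sp_diff estimate the difference in logit sensitivity and logit specificity between the two tests.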

ROC plot of individual study results (L = lymphangiography, C = CT)

[ROC plot: sensitivity vs. 1-specificity for the individual LAG and CT studies]

Summary ROC estimates

[ROC plot: summary ROC curves for CT and LAG overlaid on the individual study results]

Average operating points and confidence ellipses

[ROC plot: average operating points for LAG and CT with confidence ellipses, overlaid on the individual study results]

Difference between average operating points

Imaging modality     Sensitivity (95% CI)     Specificity (95% CI)
LAG                  0.67 (0.57 to 0.76)      0.80 (0.73 to 0.85)
CT                   0.49 (0.37 to 0.61)      0.92 (0.88 to 0.95)
P-value LAG vs. CT   0.023                    0.0002

Summary points or SROC curves?

Clinical interpretation: need to estimate performance at a threshold, using sensitivity, specificity and/or likelihood ratios.

Single threshold or mixed thresholds?
  A summary curve describes how test performance varies across thresholds; studies do not need to report a common threshold to contribute.
  A summary point must relate to a particular threshold; only studies reporting a common threshold can be combined.

Summary points or SROC curves?

Comparing tests and subgroups: we often wish to use as much data as possible.
  If this means mixing thresholds, SROC curves are needed.
  If there is still a common threshold, either method is appropriate.
  It is possible to assess the impact of threshold as a covariate.
  SROC curves allow identification of crossing curves.

A Cochrane review may include both an analysis of the SROC curves and estimation of average threshold-specific operating points.

Comparative analyses

Indirect comparisons
  Different tests used in different studies
  Potentially confounded by other differences between the studies

Direct comparisons
  Patients receive both tests or are randomized to the tests
  Differences in accuracy are more attributable to the tests
  Few studies may be available, and they may not be representative

Example of a pilot Cochrane Review: Down's syndrome screening review

                                          Studies   Participants
1st trimester – NT alone                     10        79,412
1st trimester – NT and serology              22       222,171
2nd trimester – triple test (serology)       19        72,797

Indirect comparison

NT alone
  Sensitivity: 72% (63%-79%)
  Specificity: 94% (91%-96%)
  DOR: 39 (26-60)

NT with serology
  Sensitivity: 86% (82%-90%)
  Specificity: 95% (93%-96%)
  DOR: 110 (84-143)
  RDOR: 2.8 (1.7-4.6), p <0.0001

Triple test
  Sensitivity: 82% (76%-86%)
  Specificity: 83% (77%-87%)
  DOR: 21 (15-30)
  RDOR: 0.5 (0.3-0.9), p = 0.03
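As a rough arithmetic check (mine, not from the slides), the reported relative DOR for NT with serology versus NT alone is consistent with the ratio of the summary DORs:

\[
RDOR \approx \frac{DOR_{\text{NT + serology}}}{DOR_{\text{NT alone}}} = \frac{110}{39} \approx 2.8 .
\]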

DIRECT COMPARISONS

NT alone
  Sensitivity: 71% (59%-82%)
  Specificity: 95% (91%-98%)
  DOR: 41 (16-67)

NT with serology
  Sensitivity: 85% (77%-93%)
  Specificity: 96% (93%-98%)
  DOR: 123 (40-206)

Triple test
  No paired studies available

Indirect versus Direct comparisons

                    Indirect                          Direct
NT alone            Sens 72% (63%-79%)                Sens 71% (59%-82%)
                    Spec 94% (91%-96%)                Spec 95% (91%-98%)
                    DOR 39 (26-60)                    DOR 41 (16-67)

NT with serology    Sens 86% (82%-90%)                Sens 85% (77%-93%)
                    Spec 95% (93%-96%)                Spec 96% (93%-98%)
                    DOR 110 (84-143)                  DOR 123 (40-206)
                    RDOR 2.8 (1.7-4.6), p <0.0001

Part 4: Some other issues

Another approach…

Hypothesis testing is not common in diagnostic test accuracy research or in diagnostic meta-analyses.

But you could test whether the studies you found, or the summary estimate, fall within a certain target region.

Target regions

[Two ROC plots (true positive rate vs. false positive rate, as percentages) with a shaded target region]

Publication bias

In systematic reviews of intervention studies, publication bias is an important form of bias.

To investigate publication bias in reviews, funnel plots are used.

In diagnostic reviews, funnel plots are seriously misleading, and alternatives have poor power.

Publication bias – background

Many studies are done without ethical review or study registration, so prospective registration is not available.

Diagnostic test accuracy studies do not test hypotheses, so there is no 'significance' involved.

We have no clue whether publication bias exists for diagnostic accuracy studies, or how the mechanisms behind it may work.

Summary

Part 1: meta-analysis introduction
Part 2: heterogeneity
Part 3: test comparisons
Part 4: some other issues