
Journal of Clinical Epidemiology 67 (2014) 325–334

Incorporating uncertainty regarding applicability of evidence from meta-analyses into clinical decision making

Levente Kriston*, Ramona Meister

Department of Medical Psychology, University Medical Center Hamburg-Eppendorf, Martinstr. 52, 20246 Hamburg, Germany

Accepted 13 September 2013; Published online 12 December 2013

Abstract

Objectives: Judging applicability (relevance) of meta-analytical findings to particular clinical decision-making situations remains challenging. We aimed to describe an evidence synthesis method that accounts for possible uncertainty regarding applicability of the evidence.

Study Design and Setting: We conceptualized uncertainty regarding applicability of the meta-analytical estimates to a decision-making situation as the result of uncertainty regarding applicability of the findings of the trials that were included in the meta-analysis. This trial-level applicability uncertainty can be directly assessed by the decision maker and allows for the definition of trial inclusion probabilities, which can be used to perform a probabilistic meta-analysis with unequal probability resampling of trials (adaptive meta-analysis). A case study with several fictitious decision-making scenarios was performed to demonstrate the method in practice.

Results: We present options to elicit trial inclusion probabilities and perform the calculations. The result of an adaptive meta-analysis is a frequency distribution of the estimated parameters from traditional meta-analysis that provides individually tailored information according to the specific needs and uncertainty of the decision maker.

Conclusion: The proposed method offers a direct and formalized combination of research evidence with individual clinical expertise and may aid clinicians in specific decision-making situations. © 2014 Elsevier Inc. All rights reserved.

Keywords: Evidence-based medicine; Meta-analysis; Heterogeneity; Uncertainty; External validity; Decision making; Statistical data interpretation

Conflict of interest: The authors report no potential conflicts of interest.

Funding: No external funding was received for the study.

* Corresponding author. Tel.: +49-(0)40-7410-56849; fax: +49-(0)40-7410-54965.

E-mail address: [email protected] (L. Kriston).

0895-4356/$ - see front matter © 2014 Elsevier Inc. All rights reserved.
http://dx.doi.org/10.1016/j.jclinepi.2013.09.010

"Identifying when it is appropriate to generalize from the abstract to the actual patient remains the central problem of any form of scientific clinical practice" [1, p. 289].

1. Introduction

Evidence-based clinical decisions at the bedside should integrate individual clinical expertise with findings from clinically relevant research [2–4]. This integration requires an evaluation (critical appraisal) of the external evidence by the clinician [3,5]. Currently, systematic reviews and meta-analyses of randomized controlled trials are considered the highest level of evidence to inform treatment-related decisions, and several resources are available that support the clinician in the appraisal of such studies [5–10].


While evaluating a meta-analysis, a clinician is likely to be confronted with the assessment of both internal and external validity of the findings. Internal validity can be described as methodological rigor or low risk of systematic internal bias in the estimation of the target parameter of interest, usually the treatment effect. In fact, most evaluation guidelines have a strong focus on internal validity, whereas external validity has remained somewhat neglected [11–14].

Publications considering external validity used a multitude of terms such as generalizability, robustness, applicability, transferability, or relevance. It is rarely noted that these terms, if used synonymously, mix two major perspectives [12]. The first perspective describes external validity of findings as the extent to which they show generalizability to (or robustness across) other circumstances (eg, populations, outcomes). With regard to this aspect, strategies have been developed for the identification, reporting, and synthesis of information to support decision makers [13]. However, for a clinician making a bedside decision, external validity does not need to be assessed globally but rather with reference to the particular situation in which a decision is to be made. This second aspect describes applicability


What is new?

What this adds to what was known?

• Judging applicability of evidence to specific target settings is essential in evidence-based medicine.

• Particularly, the application of findings from broad-spectrum meta-analyses remains challenging.

• Adaptive meta-analysis enables the formal incorporation of the clinician's judgment of applicability (relevance) of meta-analytical evidence to a particular setting, and the degree of accompanying uncertainty, into clinical decision making.

• Uncertainty regarding applicability of evidence may, but need not, increase decision uncertainty.

• Modeling uncertainty as an information-theoretical measure (eg, entropy) seems a promising way of formalization.

• The proposed method can be generalized to deal with any trial pool beyond meta-analysis.

(transferability or relevance) of the evidence to a specific decision-making setting. The distinction between the two perspectives of external validity becomes clear if one considers that even if a research finding is widely generalizable to several situations, it can still have remarkably little applicability in a particular setting and vice versa [15].

In the present study, we investigated the second perspective of external validity, that is, applicability. We considered research findings as potential information for clinical decision making and focused on decisions concerning the choice among available treatments for a particular patient in a given clinical context. A clinical decision-making context can be characterized by a variety of attributes. Traditionally, a limited number of central attributes have been considered crucial for clinical decision making, frequently referred to as the population, intervention, comparator, and outcome (PICO) scheme [13,16]. In this scheme, populations are described primarily by diagnostic categories and other clinical and/or demographic characteristics, interventions and comparators by agents or elements that are believed to be the key determinants of the causal effect, and outcomes by (changes in) the patient's state or condition during or after the received intervention. We believe that the skill to decide which attributes are relevant to characterize a decision-making situation and which depth of differentiation is needed within the single attributes is part of the clinical expertise as defined in evidence-based medicine [4]. However, the perception of most clinical decision-making situations may vary among clinicians. Hence, we suppose that for most decision-making situations, a globally valid characterization, or definition, does not exist.

When a clinician attempts to apply results of a meta-analysis at the bedside, users' guides suggest seeking answers to questions such as "Can the results be applied to my patient care?", "Is my patient so different from those in the study that results cannot be applied?", and "Is the treatment feasible in my setting?" [7,10]. Judgment on applicability can be somewhat difficult (1) if the central attributes of the meta-analysis and the target decision-making situation differ completely or in parts (eg, they refer to different age groups) or (2) if the meta-analysis has a broad scope covering several decision-making situations and not only the one of interest (eg, it refers to many different age groups without differentiating between them). Concerning the first difficulty, "mechanistic" knowledge may be used to justify extrapolation (or particularization) of study results by judging to what extent the causal (eg, pathophysiological) mechanisms of the effects in the study population are shared with the patient of interest [17,18]. Here, we focus mainly on the second difficulty of applying findings from broad-spectrum reviews, which are not uncommon [19] and include trials that are likely to be considered clinically heterogeneous [20]. Assessment of clinical heterogeneity can be seen as judging about differences. From the perspective of the meta-analyst, Lipsey and Wilson [21, p. 3] noted that "the definition of what study findings are conceptually comparable for purposes of meta-analysis is often fixed only in the eye of the beholder. Findings that appear categorically different to one analyst may seem similar to another." Not only researchers (as authors of meta-analyses) but also clinicians (as end users of them) may vary according to their view on clinically relevant similarities and differences across decision-making situations [20]. As stated, we consider this variation to be meaningful and part of the clinical expertise necessary for high-quality health care.

Currently, judgment on the applicability of a meta-analysis to a bedside situation is likely to lead to a dichotomous decision (evidence applicable vs. not applicable) by the clinician. However, especially in the case of broad-spectrum reviews, this decision can be accompanied by considerable uncertainty. This decision uncertainty will largely depend on the way the reviewers dealt with the clinical heterogeneity present among the trials included in the review, the clinical heterogeneity as subjectively perceived by the decision-making clinician, and the concordance between the two. In some cases, the decision of whether a meta-analysis is applicable will not be unequivocal, and forcing the clinician to choose between completely accepting and ignoring the findings may either lead to residual uncertainty or result in substantial information loss, respectively.

Although this field is largely underresearched, some approaches have been developed to address applicability of empirical findings (along with their uncertainty) to support decision making [22–24]. These methods consider imperfect applicability of findings to a decision-making situation to be quantifiable and propose to handle it as a form of external bias. They then try to adjust for this bias using rather complex parametric statistical methods that are informed by prior expert input. Although useful in several situations (eg, health policy decision making), they are unlikely to be routinely used by practicing clinicians because of their complexity and requirement of expert input.

In summary, current practice of applying meta-analytical findings at the bedside is likely to be improvable because evidence that could in part inform decision making is frequently completely ignored. Easy-to-use methods that account for uncertainty during judgment of applicability of empirical findings and allow for the use of partial information are currently lacking. In the present study, we attempted to contribute to filling this gap and describe a practical and simple method to incorporate uncertainty regarding subjectively perceived applicability of meta-analytical findings into bedside decision making by clinicians.

2. Method: adaptive meta-analysis

The basic assumption of the presented method is that uncertainty regarding applicability of a meta-analytical summary estimate is grounded in uncertainty regarding applicability of the results of the single trials that were included in the meta-analysis. Thus, uncertainty on the meta-analysis level is derived from uncertainty on the trial level. Recently, a similar approach was presented for rating systematic reviews on the pragmatic-explanatory continuum [25], however, with applicability defined as a feature of the rated review per se and not as the congruence between the scope of the review and a specific decision-making situation. For the method described here, we assumed that a meta-analysis whose results should be applied for the bedside decision has been identified and that the results of the single trials included in the meta-analysis are available.

We propose that trial-level applicability uncertainty can be mathematically modeled as the probability that a trial's results are applicable to the decision-making situation of interest. In a meta-analysis of i = 1, 2, …, k trials, this probability can be denoted by P_i. A value of zero indicates that the trial result is not applicable at all, and a value of one indicates definite applicability. A value of 0.5 suggests maximum uncertainty, that is, no information is available on the applicability of the results of the trial. A trial's probability of applicability can be interpreted as the probability of including the trial's results in the meta-analytical summary estimate. Thus, we read P_i as the inclusion probability of trial i for the meta-analysis.

If the trial inclusion probabilities are available, we can perform an adaptive meta-analysis, which is defined as a probabilistic meta-analysis with unequal probability resampling of trials based on P_i, completing the following steps:

1. Create a random variable X_i for each trial using the Bernoulli distribution,

X_i \sim B(1, P_i),

where P_i is the trial inclusion probability.

2. Include the trial in the meta-analysis if X_i = 1 and exclude the trial if X_i = 0.

3. Perform the meta-analysis and save the output.

4. Repeat these steps r times to obtain a frequency distribution of the meta-analytical estimates (eg, summary effect size).

For ease of computation, we propose using random-effects meta-analysis with a closed formula for the between-trial variance parameter [26], although usage of other estimators (eg, maximum likelihood) is also possible. The number of replications, r, should be at least 1,000 but should be increased with an increasing number of trials to be able to account for most possible trial combinations. Monitored meta-analytical estimates can be saved in vectors of length r and analyzed using descriptive statistical measures (eg, quantiles). Visualization of the distributions with histograms and/or box-and-whisker plots is recommended.
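For illustration, the following minimal Python sketch implements the four steps with the DerSimonian-Laird closed formula [26] for the between-trial variance; the function names and all implementation details are ours, not prescribed by the method.

import numpy as np

def dersimonian_laird(y, v):
    # Random-effects summary via the DerSimonian-Laird closed formula [26].
    # y: trial effect estimates (eg, standardized mean differences)
    # v: within-trial variances (squared standard errors)
    # Returns (summary effect, its standard error, tau^2, I^2 in %).
    w = 1.0 / v                                  # fixed-effect weights
    mu_fe = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - mu_fe) ** 2)             # Cochran's Q
    df = len(y) - 1
    if df > 0:
        c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
        tau2 = max(0.0, (q - df) / c)            # between-trial variance
    else:
        tau2 = 0.0
    w_re = 1.0 / (v + tau2)                      # random-effects weights
    mu = np.sum(w_re * y) / np.sum(w_re)
    se = np.sqrt(1.0 / np.sum(w_re))
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0
    return mu, se, tau2, i2

def adaptive_meta_analysis(y, v, p_incl, r=2000, seed=1):
    # Steps 1-4: probabilistic meta-analysis with unequal probability
    # resampling of trials based on the inclusion probabilities P_i.
    rng = np.random.default_rng(seed)
    y, v, p_incl = (np.asarray(a, dtype=float) for a in (y, v, p_incl))
    effects, i2s = [], []
    for _ in range(r):
        include = rng.random(y.size) < p_incl    # step 1: draw X_i ~ B(1, P_i)
        if not include.any():                    # no trial included: replication
            continue                             # yields no estimate ("invalid")
        mu, se, tau2, i2 = dersimonian_laird(y[include], v[include])  # steps 2-3
        effects.append(mu)
        i2s.append(i2)
    return np.array(effects), np.array(i2s)      # step 4: frequency distributions

The returned vectors can then be summarized descriptively, eg, with np.quantile(effects, [0.025, 0.5, 0.975]), and plotted as histograms or box-and-whisker plots; further parameters (eg, confidence or prediction interval bounds) can be monitored by also returning them from dersimonian_laird.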

Any standard meta-analytical parameter can be monitored over replications, including the summary effect, variability estimates, heterogeneity statistics such as the I2 [27], lower and upper bounds of confidence or prediction intervals [28], and so on. We suggest that the obtained frequency distributions of the monitored parameters can be interpreted probabilistically, that is, calculation of probabilities of the parameters exceeding (or not) certain thresholds is straightforward.
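Continuing the sketch above (with y, v, and p_incl defined as before), such threshold probabilities are simple proportions over the saved replications; the thresholds of 0.2 and 50% below are only examples.

effects, i2s = adaptive_meta_analysis(y, v, p_incl, r=2000)
p_effect = float(np.mean(effects > 0.2))  # probability that the summary effect exceeds 0.2
p_i2 = float(np.mean(i2s > 50.0))         # probability that I^2 exceeds 50%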

In the following, some special cases are discussed to enhance the understanding of the presented method:

A. The trial inclusion probability P_i is zero or one for all k trials. These cases describe the absence of applicability uncertainty and result in completely including or excluding each trial, respectively. Therefore, applicability uncertainty does not cause any variation in the output across replications. Special cases of this scenario are listed in the following.

A1. The trial inclusion probability P_i is zero for all k trials. In this case, the number of included trials is always zero. It can be concluded that the meta-analysis does not contain any decision-relevant information for the clinical question of interest.

A2. The trial inclusion probability P_i is one for all k trials. In this case, the same meta-analysis including all trials will be calculated r times, resulting in no variability of the estimates due to uncertain applicability. This case corresponds to accepting the summary estimates of the original meta-analysis as applicable.

A3. The trial inclusion probability P_i is zero or one (but varying) for all k trials. In these cases, a subset of trials is used to perform the same meta-analysis r times. No variability due to uncertain applicability will be present, but the results may differ from those of the original meta-analysis.

B. The trial inclusion probability P_i is different from zero or one for at least one trial. These cases describe the presence of applicability uncertainty and result in variation in the included trial set across replications. Applicability uncertainty may or may not cause variation in the output across replications, depending on the monitored parameter and the trials' results. One special case of the scenario is listed here.

B1. The trial inclusion probability P_i is 0.5 for all k trials. This case corresponds to maximum uncertainty regarding applicability of the trials' findings. The median of the summary effect estimate across replications is likely to be similar to the summary effect in the original meta-analysis; however, its standard error will probably be larger (because of reduced power), and substantial variability due to uncertain applicability may be present, particularly if the between-trial standard deviation in the complete trial set is large.

3. Input required from the clinician: elicitation of trial inclusion probabilities

To receive their individual applicability-adjusted meta-analytical output, clinicians are requested to judge the applicability uncertainty of each trial on the basis of relevant information such as population, intervention, and comparator characteristics (as far as available). For this, a linear visual analog scale ranging from "definitely inapplicable" through "I don't know" to "definitely applicable" is used (Fig. 1). A judgment of definite applicability corresponds to an inclusion probability of one. An inclusion probability of zero is elicited in case of a judgment of definite inapplicability. A mark at maximally uncertain applicability ("I don't know") leads to an inclusion probability of 0.5 for the trial. Intermediate judgments lead to inclusion probability values between these anchors. Fig. 2 provides an example of a work sheet that the clinician would have to complete for the case study described in Section 5. Although the time required for filling in the sheet depends largely on the number of trials to judge, the amount of available information on the trials, and the clinician's conscientiousness, the clinician is unlikely to need more than a few minutes to make the 10 judgments listed in Fig. 2.

Fig. 1. Scales for the elicitation of trial inclusion probabilities (A) for the trial level and (B) for higher order trial attribute categories (filled dots indicate example ratings).
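The mapping from a scale mark to P_i is thus linear; a minimal sketch, assuming marks are recorded in millimeters on a 100-mm line (the scale length is our assumption, not prescribed by the method):

def inclusion_probability(mark_mm, scale_mm=100.0):
    # Linear mapping: 0 mm ("definitely inapplicable") -> 0,
    # midpoint ("I don't know") -> 0.5, full length ("definitely applicable") -> 1.
    return min(max(mark_mm / scale_mm, 0.0), 1.0)  # clamp to [0, 1]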

When the number of trials is large, the clinician's workload and therewith the time needed for the applicability assessments will increase. In this case, it may be more convenient to ask the clinician to assess applicability of higher order trial attribute categories, such as diagnosis of included patients or form of comparator treatment, rather than applicability of the single trials. Subsequently, the inclusion probability of each trial can be calculated on the basis of the attribute-level assessments. If the higher order attributes are considered independent and the categories within the attributes are mutually exclusive, the trial inclusion probability of each trial can be defined as the product of the corresponding higher order attribute category probabilities:

P_i = \prod_{j=1}^{c} P_{ij},

where P_i is the trial inclusion probability and P_{ij} is the inclusion probability for the j = 1, 2, …, c higher order attribute categories that the trial fits into. Higher order attribute category inclusion probabilities can be elicited similarly to trial applicability judgments, but the questions need to be adapted to the attributes and their categories (eg, category "depression" of the higher order attribute "diagnosis"; Fig. 1).
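As a small worked instance of this product rule, trial 1 in scenario B3 of the case study (Table 1) combines a diagnosis judgment of 0.70 with a comparator judgment of 0.70:

from math import prod

# Attribute-level judgments for trial 1 in scenario B3 (Table 1):
# "diagnosis" and "comparator" are the two higher order attributes considered.
attribute_probs = {"diagnosis": 0.70, "comparator": 0.70}

# With independent attributes and mutually exclusive categories,
# the trial inclusion probability is the product over the attributes.
p_trial = prod(attribute_probs.values())  # 0.70 * 0.70 = 0.49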

4. Quantification of applicability uncertainty: entropy

Although uncertainty may be described by the trial inclusion probability, we propose that another measure, Shannon entropy, may be even more suitable for this purpose [29]. Shannon entropy has some attractive properties such as symmetry (unchanged if the outcomes are reordered), additivity (enables modeling uncertainty in hierarchical processes), and the fact that it reaches its maximum when all outcomes are equiprobable. The entropy of the Bernoulli process underlying the applicability of each trial can be calculated as

H_i = -[P_i \log_2 P_i + (1 - P_i) \log_2(1 - P_i)],

where P_i is the trial inclusion probability (see Section 3) and 0 \log_2 0 is defined to equal zero.

An informative summary of the entropy for the whole set of trials is the average of the trial entropies:

H = -\frac{1}{k} \sum_{i=1}^{k} [P_i \log_2 P_i + (1 - P_i) \log_2(1 - P_i)],

where k is the number of trials. H can take values ranging from zero (minimum entropy, ie, no applicability uncertainty) to one (maximum applicability uncertainty). However, this kind of averaging is only suitable for comparing situations within a particular decision context, that is, with a fixed number of applicability assessments. For between-context comparisons, the sum of the entropies should be used.
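A minimal Python sketch of both quantities; applied to the scenario B2 inclusion probabilities of the case study (Table 1), it reproduces the reported average entropy of 0.770:

import math

def trial_entropy(p):
    # Shannon entropy (in bits) of the Bernoulli inclusion process of one trial;
    # 0 log2 0 is taken as zero by convention.
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

def mean_entropy(probs):
    # Average applicability uncertainty H over a set of trials (ranges 0 to 1).
    return sum(trial_entropy(p) for p in probs) / len(probs)

b2 = [0.70, 0.10, 0.50, 0.90, 0.50, 0.50, 1.00, 0.50, 0.50, 0.70]
print(round(mean_entropy(b2), 3))  # 0.77, matching the entropy reported for scenario B2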

5. Case study

We tested the introduced method in a case study by creating various fictitious decision-making scenarios in which the results of a recent meta-analysis on the effectiveness of long-term psychodynamic psychotherapy (LTPP) should be applied [30]. This meta-analysis included 10 trials that tested the effectiveness of LTPP compared with different treatments in patients with various mental disorders. Trial results are reported as standardized mean differences with regard to an overall effectiveness outcome and may be interpreted according to Cohen's standards (0.2 small, 0.5 medium, and 0.8 large) [31]. Fig. 2 shows the work sheet listing the assessments that the deciding clinician has to make. The provided ratings are then directly converted into trial inclusion probabilities as described in Section 3.

Trial inclusion probabilities for the different decision-making scenarios are provided in Table 1. The standard scenarios A1 (considering all trials as definitely inapplicable; entropy H = 0), A2 (considering all trials as definitely applicable; H = 0), and B1 (maximally uncertain applicability of each trial; H = 1), as described in Section 2, were tested. In addition to these extreme scenarios, we created some illustrative clinical cases. A3 describes the imaginary judgments of a senior physician of a hospital ward thinking about implementing LTPP for patients with borderline personality disorder. The physician comes to a dichotomous decision without any uncertainty (H = 0), so that each trial is either included in or excluded from the adaptive meta-analysis. For scenario B2, imagine a decision maker of a large mental health institution in which patients suffering from various disorders are treated. The decision maker is confronted with the decision whether LTPP or cognitive behavioral therapy should be followed as a general therapeutic approach in the institution. In this scenario, variability due to uncertain applicability of trials is present, and most replications in the adaptive meta-analysis will contain a different set of trials (H = 0.770). In the last scenario (B3), the decision maker does not judge single trials but judges higher level attribute categories instead. Two attributes, diagnosis of the study population and comparator treatment, are considered. For this purpose, picture a general practitioner wondering whether to recommend LTPP or another well-defined treatment to a patient with borderline personality disorder. The applicability uncertainty (entropy) in this scenario is H = 0.821.

Adaptive meta-analysis results for these scenarios are presented in Table 2. As expected, the only informative result in scenario A1 (all trials definitely inapplicable) is that the meta-analysis cannot provide any decision-supporting information. Scenario A2 (all trials definitely applicable) does not contain any applicability uncertainty and yields a summary effect estimate of 0.55 with a 95% confidence interval lower bound of 0.30, a 95% prediction interval lower bound of −0.20, and an I2 of 64%. Scenario A3 (including a subset of studies with definite applicability) provides somewhat different results with a summary effect estimate of 0.81, a 95% confidence interval lower bound of 0.35, a 95% prediction interval lower bound of −0.80, and an I2 of 74%. A comparison of the scenarios A2 and A3 shows that adaptive meta-analysis can be informative even in the absence of applicability uncertainty (or the presence of applicability certainty) because it illustrates that the results may vary depending on which trials are judged applicable. Scenario B1 (applicability is maximally uncertain for all trials) provides a summary effect estimate and an I2 that are comparable with the findings in scenario A2, however, with somewhat larger estimates for the variance components (as indicated by the lower values for the lower bounds of the confidence and prediction intervals, respectively). Of special note is the substantial variation in the parameter estimates across replications, which reflects the underlying uncertainty with regard to applicability of the trials. In this case, the applicability uncertainty resulted in an increased uncertainty of the estimated parameters and is likely to undermine any reasonable decision. The replications of the adaptive meta-analysis in scenario B2 (varying trial inclusion probabilities) included on average five to six trials and produced estimates that are also comparable (although somewhat smaller) with those in scenarios A2 and B1. Variability of the estimates across replications was considerably lower than in scenario B1.

Fig. 2. Work sheet for the case study requiring the physician to judge the applicability of 10 trials using visual elicitation scales (filled dots indicate example ratings). For each trial, the sheet asks "How applicable do you judge the present trial's findings with view to the upcoming decision?" and provides the available trial information (diagnosis of trial population/comparator treatment): (1) eating disorder/cognitive therapy; (2) borderline personality disorder/psychiatric treatment; (3) borderline personality disorder/structured clinical management; (4) borderline personality disorder/dialectical behavioral therapy; (5) anorexia nervosa/mixed treatments; (6) borderline personality disorder/treatment as usual; (7) depressive disorder/cognitive-behavioral therapy; (8) depressive and anxiety disorder/mixed treatments; (9) borderline personality disorder/treatment as usual; (10) cluster C personality disorder/cognitive therapy.

The adaptive meta-analysis in scenario B3 (varying trial inclusion probabilities) produced estimates that were largely comparable with those in scenario B1, both with regard to magnitude and variability across replications. This shows that increased applicability certainty does not inevitably lead to a substantial increase in decision certainty.

As a graphical example, if we are primarily interested in the summary effect estimate along with its statistical significance in scenario B3, a stacked histogram like the one displayed in Fig. 3 may be informative. It shows the frequency distribution of the summary effect estimates over all replications in scenario B3 and clearly illustrates that the summary effect estimate varies strongly because of the applicability uncertainty present. Nevertheless, the principal direction of all estimates is comparable, as the mean effect size is most likely to fall between 0.35 and 0.75 and is very likely to be statistically significantly different from zero. Thus, even though considerable uncertainty with regard to applicability was present (H = 0.821), decision uncertainty is likely to remain within tolerable limits.

6. Discussion

In the present study, we described an approach that enables incorporating applicability of meta-analytical findings into clinical decision making. By doing so, we allowed for the clinician's judgment of applicability to be accompanied by any level of uncertainty and presented a statistical method, adaptive meta-analysis, which makes use of this information.

Table 1. Trial inclusion probabilities for the investigated scenarios

Trial no.  Diagnosis/comparator                                            Scenario A1  Scenario A2  Scenario A3  Scenario B1  Scenario B2  Scenario B3
1          Eating disorder/cognitive therapy                               0            1.00         0            0.50         0.70         0.70 × 0.70 = 0.49
2          Borderline personality disorder/psychiatric treatment           0            1.00         1.00         0.50         0.10         1.00 × 0.50 = 0.50
3          Borderline personality disorder/structured clinical management  0            1.00         1.00         0.50         0.50         1.00 × 0.50 = 0.50
4          Borderline personality disorder/dialectical behavioral therapy  0            1.00         1.00         0.50         0.90         1.00 × 1.00 = 1.00
5          Anorexia nervosa/mixed treatments                               0            1.00         0            0.50         0.50         0.70 × 0.50 = 0.35
6          Borderline personality disorder/treatment as usual              0            1.00         1.00         0.50         0.50         1.00 × 0.50 = 0.50
7          Depressive disorder/cognitive behavioral therapy                0            1.00         0            0.50         1.00         0.80 × 0.80 = 0.64
8          Depressive and anxiety disorder/mixed treatments                0            1.00         0            0.50         0.50         0.80 × 0.50 = 0.40
9          Borderline personality disorder/treatment as usual              0            1.00         1.00         0.50         0.50         1.00 × 0.50 = 0.50
10         Cluster C personality disorder/cognitive therapy                0            1.00         0            0.50         0.70         0.10 × 0.70 = 0.07
Entropy                                                                    0            0            0            1.000        0.770        0.821

We outlined options to elicit and quantify uncertainty and to display its effects in the adaptive meta-analytical output. A case study with several decision-making scenarios was performed to demonstrate the method in practice.

We attempted to develop an easily implementable method. In its basic form, it can be used to enhance interpretation of any meta-analysis. The only evidence input needed is the effect estimate of each trial, its standard error, and a description of the attributes that are likely to be relevant for the assessment of applicability for a specific decision-making situation. Whoever wishes to interpret the findings of the meta-analysis is requested to judge the applicability of trials (either at the trial level or at the level of higher order attribute categories), and an adaptive meta-analysis can be performed. Any parameter (and parameter combination) of interest can be summarized numerically or graphically. Such a Meta-Analysis Interpretation Tool (MANITOO) can be added to any meta-analysis and is able to provide individually tailored information according to the specific needs and uncertainty of the decision maker. Although presented here for the interpretation of a single meta-analysis, the tool can deal with any trial pool, thus opening the door to customized summaries of an arbitrary number of relevant trials. As long as trial characteristics and results are documented in a standardized way, for example, as in systematic reviews of the Cochrane Collaboration, automatically reading out and using these data comes within striking distance. Especially in combination with recent developments such as the call for automation of systematic reviews [32] and the establishment of standardized trial pools [33,34], implementation of the proposed tool seems achievable.

The only input a decision maker needs to provide is the applicability assessments. The clinical expertise needed for these judgments cannot be universally defined. In general, the more expertise is present, the less uncertainty in applicability judgments, and consequently in the estimated parameters, is to be expected. Clinical expertise helps to decide whether a trial's findings are applicable to the upcoming decision. If no clear decision can be made (because of missing expertise or other reasons), the answer "I don't know" is always acceptable. Such a judgment expresses uncertainty of the clinician and might result in substantial variation of the parameter estimates across replications. However, it might still lead to an informative result (see Section 5). Thus, clinical expertise is less a prerequisite than a component of the presented approach.

One of the questions arising is whether it is necessary to consider applicability of evidence at all. In our opinion, there is currently a serious imbalance between aspects of internal validity and applicability that should be counteracted [11–14]. In the seminal investigation by Cabana et al. on why physicians do not follow clinical guidelines, barriers such as "interpretation of evidence," "not applicable to practice population," "oversimplified cookbook," "reduces autonomy," and "decreases flexibility" were reported frequently. As Haynes and Haines [35, p. 275] noted 15 years ago, "local and individual circumstances of clinical practice often affect the delivery of care, and […] tailoring of guidelines to local circumstances is a process that is only just beginning to occur." Fortunately, a continuously increasing interest in the interpretation and uptake of the results of systematic reviews can be observed [13,20,25,36–39]. The present work continues the tradition of systems for conducting [40], reporting [41], and interpreting [42] meta-analyses that account for external validity (referring to it as applicability, relevance, and indirectness, respectively). MANITOO enriches existing methodological approaches by offering the possibility of formally adapting empirical findings to local circumstances.

Even if no uncertainty with regard to applicability of single trials is present, the present method offers an individualized meta-analysis including the set of studies that were judged applicable.

Table 2. Adaptive meta-analysis results for the investigated scenarios with 2,000 replications

Parameter                    Scenario A1  Scenario A2   Scenario A3   Scenario B1    Scenario B2   Scenario B3
No. of included trials
  No. of valid replications  2,000        2,000         2,000         2,000          2,000         2,000
  Median (IQR)               0 (0)        10 (0)        5 (0)         5 (2)          6 (2)         5 (2)
  Mean ± SD                  0 ± 0        10 ± 0        5 ± 0         5.04 ± 1.59    5.89 ± 1.35   4.96 ± 1.39
  2.5%, 97.5% quantiles      0, 0         10, 10        5, 5          2, 8           3, 8          2, 8
  Minimum, maximum           0, 0         10, 10        5, 5          0, 10          2, 10         1, 9
Summary effect estimate
  No. of valid replications  0            2,000         2,000         1,997          2,000         2,000
  Median (IQR)               NA           0.55 (0)      0.81 (0)      0.54 (0.22)    0.43 (0.09)   0.57 (0.21)
  Mean ± SD                  NA           0.55 ± 0      0.81 ± 0      0.56 ± 0.16    0.44 ± 0.08   0.57 ± 0.14
  2.5%, 97.5% quantiles      NA           0.55, 0.55    0.81, 0.81    0.29, 0.91     0.31, 0.65    0.32, 0.82
  Minimum, maximum           NA           0.55, 0.55    0.81, 0.81    0.01, 1.24     0.28, 0.82    0.14, 0.94
Lower bound of 95% CI
  No. of valid replications  0            2,000         2,000         1,997          2,000         2,000
  Median (IQR)               NA           0.30 (0)      0.35 (0)      0.21 (0.13)    0.22 (0.08)   0.23 (0.13)
  Mean ± SD                  NA           0.30 ± 0      0.35 ± 0      0.19 ± 0.14    0.21 ± 0.07   0.21 ± 0.12
  2.5%, 97.5% quantiles      NA           0.30, 0.30    0.35, 0.35    −0.15, 0.42    0.05, 0.34    −0.12, 0.37
  Minimum, maximum           NA           0.30, 0.30    0.35, 0.35    −0.62, 0.57    −0.19, 0.50   −0.62, 0.40
Lower bound of 95% PI
  No. of valid replications  0            2,000         2,000         1,901          1,992         1,938
  Median (IQR)               NA           −0.20 (0)     −0.80 (0)     −0.51 (0.82)   −0.07 (0.47)  −0.53 (0.73)
  Mean ± SD                  NA           −0.20 ± 0     −0.80 ± 0     −1.04 ± 1.83   −0.19 ± 0.47  −0.98 ± 1.69
  2.5%, 97.5% quantiles      NA           −0.20, −0.20  −0.80, −0.80  −7.97, 0.15    −1.37, 0.20   −7.70, 0.18
  Minimum, maximum           NA           −0.20, −0.20  −0.80, −0.80  −11.94, 0.32   −5.14, 0.32   −10.83, 0.23
I2 (in %)
  No. of valid replications  0            2,000         2,000         1,980          2,000         1,992
  Median (IQR)               NA           64 (0)        74 (0)        63 (44)        30 (41)       66 (44)
  Mean ± SD                  NA           64 ± 0        74 ± 0        51 ± 29        28 ± 23       51 ± 29
  2.5%, 97.5% quantiles      NA           64, 64        74, 74        0, 86          0, 73         0, 86
  Minimum, maximum           NA           64, 64        74, 74        0, 93          0, 83         0, 93

Abbreviations: IQR, interquartile range; NA, not applicable; SD, standard deviation; CI, confidence interval; PI, prediction interval.

The "worst" case is that all trials are included and the adaptive meta-analysis provides the same results as the original meta-analysis. If applicability uncertainty is present, it can be comprehensibly summarized and fed back using the entropy measure, and its effects on the meta-analytical estimates can be observed. If the output varies substantially, the clinician may attempt to reduce applicability uncertainty by acquiring more information about either the evidence or the decision-making situation. The "success" of reducing uncertainty can be evaluated by calculating the entropy reduction.

Fig. 3. Frequency distribution of the summary effect estimate obtained over all replications in scenario B3 of the case study.


Moreover, if applicability uncertainty does not result in considerable variability in the output, it may even increase the confidence of the clinician by showing that applicability uncertainty is irrelevant for the outcome of interest. The proposed method can also be used as a research tool to investigate associations between applicability uncertainty (entropy) and outcome variability quantitatively, particularly depending on the characteristics of the analyzed trial set, the decision-making situation, and the deciding clinician.

Here, we focused on external validity (applicability) of the evidence. Although it would be easily possible to incorporate judgment on internal validity in the described approach, leaving internal validity in the hands of researchers is not an unattractive option [43]. This would mean a clear division of work according to competences, with researchers dealing mostly with internal validity and clinicians judging the applicability of the evidence. Although outlined here for specific decisions regarding the treatment of individual patients, the presented tool could easily be modified and used to achieve a consensual effect estimate that is applicable to a whole population of interest. For this purpose, a group of experts (eg, developers of clinical practice guidelines) either needs to reach a consensus on the applicability of the involved studies, or some kind of averaging of the judgments may be performed. The probabilistic nature of the applicability model offers a straightforward way of dealing with disagreement among experts. For example, using a simple average in a group of experts with one half definitely favoring and the other half definitely rejecting a particular study, the average judgment would be 0.5, that is, maximum uncertainty.

The outlined framework clearly needs further refinement. For example, optimal ways of eliciting trial inclusion probabilities should be explored, focusing on the wording of the question that is stated and the visual properties of the scale that is used, preferably by including clinical practitioners. In particular, more work is needed on the elicitation of trial inclusion probabilities through attributes of the decision-making situation, which is unavoidable if the number of trials is large and judging the applicability of each of them becomes tedious. We presented a simple approach for independent attributes with mutually exclusive categories; however, several characteristics can be measured continuously. To enrich the described tool, combination with advanced statistical procedures, eg, meta-regression, may be possible. Methodological investigations are needed, including analytical work and simulation studies, to define the necessary number of replications for the adaptive meta-analysis. Although the use of resampling methods in meta-analysis is not new [44], more experience with them is needed, especially within the "frequentist" statistical framework. Combination with or transfer into Bayesian meta-analytical methods [18,45–47] is possible. In addition, as adaptive meta-analysis adds a further source of variability (uncertainty regarding applicability) to the existing ones in a standard meta-analysis (within- and between-trial variability), presentation of the output in a way that is helpful for practicing clinicians remains challenging. However, it may also be an opportunity for further customization.

In the present study, we took a perspective that may be of interest beyond the developed application. This perspective uses variability of clinical decision-making situations as a valuable resource instead of considering it a barrier to evidence implementation, incorporates clinical expertise into the evidence interpretation statistically, and handles uncertainty as an information-theoretical concept that can be accounted for and used constructively. Additionally, it treats empirical evidence not only as cumulating but also as a per se dynamic (probabilistic) rather than static (deterministic) construct, customizes information to local needs and circumstances, and calls for publicly available data banks as global evidence resources. It relies strongly on technical solutions that enhance management and summation of empirical findings and thus support (but do not substitute for) interpretation and uptake of evidence. This "evidence-based medicine 2.0" perspective evolved rather than was explicitly intended in the presented study and may mark a change in attitude to health care: the shift from universally valid evidence to locally applicable information.

References

[1] Smith GD, Egger M. Incommunicable knowledge? Interpreting and applying the results of clinical trials and meta-analyses. J Clin Epidemiol 1998;51:289–95.
[2] Sackett DL, Rosenberg WM, Gray JA, Haynes RB, Richardson WS. Evidence based medicine: what it is and what it isn't. BMJ 1996;312:71–2.
[3] Rosenberg W, Donald A. Evidence based medicine: an approach to clinical problem-solving. BMJ 1995;310:1122–6.
[4] Haynes RB, Devereaux PJ, Guyatt GH. Clinical expertise in the era of evidence-based medicine and patient choice. Evid Based Med 2002;7:36–8.
[5] Sackett DL. Applying overviews and meta-analyses at the bedside. J Clin Epidemiol 1995;48:61–6.
[6] Dans AL, Dans LF, Guyatt GH, Richardson S. Users' guides to the medical literature: XIV. How to decide on the applicability of clinical trial results to your patient. JAMA 1998;279:545–9.
[7] Glasziou P, Guyatt GH, Dans AL, Dans LF, Straus S, Sackett DL. Applying the results of trials and systematic reviews to individual patients. ACP J Club 1998;129:A15–6.
[8] Greenhalgh T. Papers that summarise other papers (systematic reviews and meta-analyses). BMJ 1997;315:672–5.
[9] Leucht S, Kissling W, Davis JM. How to read and understand and use systematic reviews and meta-analyses. Acta Psychiatr Scand 2009;119:443–50.
[10] Oxman AD, Cook DJ, Guyatt GH. Users' guides to the medical literature. VI. How to use an overview. JAMA 1994;272:1367–71.
[11] Rothwell PM. External validity of randomised controlled trials: "to whom do the results of this trial apply?". Lancet 2005;365:82–93.
[12] Green LW, Glasgow RE. Evaluating the relevance, generalization, and applicability of research: issues in external validation and translation methodology. Eval Health Prof 2006;29:126–53.
[13] Atkins D, Chang SM, Gartlehner G, Buckley DI, Whitlock EP, Berliner E, et al. Assessing applicability when comparing medical interventions: AHRQ and the Effective Health Care Program. J Clin Epidemiol 2011;64:1198–207.
[14] Tonelli MR. Compellingness: assessing the practical relevance of clinical research results. J Eval Clin Pract 2012;18:962–7.
[15] Kravitz RL, Duan N, Braslow J. Evidence-based medicine, heterogeneity of treatment effects, and the trouble with averages. Milbank Q 2004;82:661–87.
[16] Richardson WS, Wilson MC, Nishikawa J, Hayward RS. The well-built clinical question: a key to evidence-based decisions. ACP J Club 1995;123:A12–3.
[17] Howick J, Glasziou P, Aronson JK. Can understanding mechanisms solve the problem of extrapolating from study to target populations (the problem of "external validity")? J R Soc Med 2013;106:81–6.
[18] Goodman SN, Gerson J. Mechanistic evidence in evidence-based medicine: a conceptual framework. Research white paper. AHRQ Publication No. 13-EHC042-EF. Rockville, MD: Agency for Healthcare Research and Quality; 2013.
[19] Gøtzsche PC. Why we need a broad perspective on meta-analysis. BMJ 2000;321:585–6.
[20] Kriston L. Dealing with clinical heterogeneity in meta-analysis. Assumptions, methods, interpretation. Int J Methods Psychiatr Res 2013;22:1–15.
[21] Lipsey MW, Wilson D. Practical meta-analysis. 1st ed. Thousand Oaks, CA: Sage Publications, Inc; 2001.
[22] Spiegelhalter DJ, Best NG. Bayesian approaches to multiple sources of evidence and uncertainty in complex cost-effectiveness modelling. Stat Med 2003;22:3687–709.
[23] Wolpert RL, Mengersen KL. Adjusted likelihoods for synthesizing empirical evidence from studies that differ in quality and design: effects of environmental tobacco smoke. Stat Sci 2004;19:450–71.
[24] Turner RM, Spiegelhalter DJ, Smith GCS, Thompson SG. Bias modelling in evidence synthesis. J R Stat Soc Ser A Stat Soc 2009;172:21–47.
[25] Koppenaal T, Linmans J, Knottnerus JA, Spigt M. Pragmatic vs. explanatory: an adaptation of the PRECIS tool helps to judge the applicability of systematic reviews for daily practice. J Clin Epidemiol 2011;64:1095–101.
[26] DerSimonian R, Laird N. Meta-analysis in clinical trials. Control Clin Trials 1986;7:177–88.
[27] Higgins JPT, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ 2003;327:557–60.
[28] Riley RD, Higgins JPT, Deeks JJ. Interpretation of random effects meta-analyses. BMJ 2011;342:d549.
[29] Shannon C. A mathematical theory of communication. Bell System Tech J 1948;27:379–423, 623–656.
[30] Leichsenring F, Rabung S. Long-term psychodynamic psychotherapy in complex mental disorders: update of a meta-analysis. Br J Psychiatry 2011;199:15–22.
[31] Cohen J. Statistical power analysis for the behavioral sciences. 2nd ed. Hillsdale, NJ: Lawrence Erlbaum Associates; 1988.
[32] Tsafnat G, Dunn A, Glasziou P, Coiera E. The automation of systematic reviews. BMJ 2013;346:f139.
[33] Sim I, Detmer DE. Beyond trial registration: a global trial bank for clinical trial reporting. PLoS Med 2005;2:e365.
[34] Cepeda MS, Lobanov V, Berlin JA. From ClinicalTrials.gov trial registry to an analysis-ready database of clinical trial results. Clin Trials 2013;10:347–8.
[35] Haynes B, Haines A. Barriers and bridges to evidence based clinical practice. BMJ 1998;317:273–6.
[36] Shrier I, Boivin J-F, Platt RW, Steele RJ, Brophy JM, Carnevale F, et al. The interpretation of systematic reviews with meta-analyses: an objective or subjective process? BMC Med Inform Decis Mak 2008;8:19.
[37] Wallace J, Byrne C, Clarke M. Making evidence more wanted: a systematic review of facilitators to enhance the uptake of evidence from systematic reviews and meta-analyses. Int J Evid Based Healthc 2012;10:338–46.
[38] Saltman D, Jackson D, Newton PJ, Davidson PM. In pursuit of certainty: can the systematic review process deliver? BMC Med Inform Decis Mak 2013;13:25.
[39] Burford B, Lewin S, Welch V, Rehfuess E, Waters E. Assessing the applicability of findings in systematic reviews of complex interventions can enhance the utility of reviews for decision making. J Clin Epidemiol 2013;66:1251–61.
[40] Whiting PF, Rutjes AWS, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, et al. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med 2011;155:529–36.
[41] Moher D, Liberati A, Tetzlaff J, Altman DG, the PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Ann Intern Med 2009;151:264–9.
[42] Guyatt GH, Oxman AD, Kunz R, Woodcock J, Brozek J, Helfand M, et al. GRADE guidelines: 8. Rating the quality of evidence–indirectness. J Clin Epidemiol 2011;64:1303–10.
[43] Gøtzsche PC, Ioannidis JPA. Content area experts as authors: helpful or harmful for systematic reviews and meta-analyses? BMJ 2012;345:e7031.
[44] Adams DC, Gurevitch J, Rosenberg MS. Resampling tests for meta-analysis of ecological data. Ecology 1997;78:1277–83.
[45] Ashby D, Smith AF. Evidence-based medicine as Bayesian decision-making. Stat Med 2000;19:3291–305.
[46] Sutton AJ, Abrams KR. Bayesian methods in meta-analysis and evidence synthesis. Stat Methods Med Res 2001;10:277–303.
[47] Ades AE, Lu G, Higgins JPT. The interpretation of random-effects meta-analysis in decision models. Med Decis Making 2005;25:646–54.

