Quantitative Synthesis I
Prepared for:
The Agency for Healthcare Research and Quality (AHRQ)
Training Modules for Systematic Reviews Methods Guide
www.ahrq.gov
Systematic Review Process Overview
To list the basic principles of combining data
To recognize common metrics for meta-analysis
To describe the role of weights to combine results across studies
To distinguish between clinical and methodological diversity and statistical heterogeneity
To define fixed effect model and random effects model
Learning Objectives
Quantitative overview/synthesis
Pooling
• Less precise
• Suggests that data from multiple sources are simply lumped together
Combining
• Preferred by some
• Suggests applying statistical procedures to data
Synonyms for Meta-Analysis
Improve the power to detect a small difference if the individual studies are small
Improve the precision of the effect measure
Compare the efficacy of alternative interventions and assess consistency of effects across study and patient characteristics
Gain insights into statistical heterogeneity
Help to understand controversy arising from conflicting studies or generate new hypotheses to explain these conflicts
Force rigorous assessment of the data
Reasons To Conduct Meta-Analyses
Commonly Encountered Comparative Effect Measures
Type of Data     Corresponding Effect Measure
Continuous       Mean difference (e.g., mmol, mmHg); standardized mean difference (effect size); correlation
Dichotomous      Odds ratio, risk ratio, risk difference
Time to event    Hazard ratio
For each analysis, one study should contribute only one treatment effect.
The effect estimate may be for a single outcome or a composite.
The outcome being combined should be the same — or similar, based on clinical plausibility — across studies.
Know the research question. The question drives study selection, data synthesis, and interpretation of the results.
Principles of Combining Data for Basic Meta-Analyses
Biological and clinical plausibility
Scale of effect measure
Studies with small numbers of events do not give reliable estimates
Things To Know About the Data Before Combining Them
True Associations May Disappear When Data Are Combined Inappropriately
An Association May Be Seen When There Is None
Changes in the Same Scale May Have Different Meanings
Both A–B and C–D involve a change of one absolute unit
A–B change (1 to 2) represents a 100% relative change
C–D change (7 to 8) represents only a 14% relative change
[Figure: plot of the effect of interest against the variable of interest, with points A, B, C, and D marked]
Effect of the Choice of Metric on Meta-analysis
         Treatment                 Control
Study    Events   Total   Rate     Events   Total   Rate     Relative Risk   Risk Difference
A        100      1000    10%      200      1000    20%      0.5             10%
B        1        1000    0.1%     2        1000    0.2%     0.5             0.1%
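The contrast between the two metrics can be checked directly from the counts. The following is a minimal sketch; the function name `rr_and_rd` is illustrative (not from the source), and the risk difference is returned as an absolute value to match the way the table reports it.

```python
def rr_and_rd(events_t, total_t, events_c, total_c):
    """Return (relative risk, absolute risk difference) for one study."""
    rate_t = events_t / total_t  # treatment-group event rate
    rate_c = events_c / total_c  # control-group event rate
    return rate_t / rate_c, abs(rate_t - rate_c)


# Study A and Study B from the table above
rr_a, rd_a = rr_and_rd(100, 1000, 200, 1000)
rr_b, rd_b = rr_and_rd(1, 1000, 2, 1000)

print(f"Study A: RR = {rr_a:.2f}, RD = {rd_a:.1%}")
print(f"Study B: RR = {rr_b:.2f}, RD = {rd_b:.1%}")
```

Both studies share the same relative risk, while the risk difference differs by a factor of 100, which is why the choice of metric matters before studies are combined.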
Effect of Small Changes on the Estimate
Baseline case      Effect of decrease of 1 event   Effect of increase of 1 event   Relative change of estimate
2/10 (20%)         1/10 (10%)                      3/10 (30%)                      ±50%
20/100 (20%)       19/100 (19%)                    21/100 (21%)                    ±5%
200/1,000 (20%)    199/1,000 (19.9%)               201/1,000 (20.1%)               ±0.5%
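The last column of the table can be reproduced with a short calculation. This is a sketch only; `relative_change` is a hypothetical helper name introduced here for illustration.

```python
def relative_change(events, total, delta):
    """Relative change in the event rate when the event count shifts by `delta`."""
    baseline = events / total
    shifted = (events + delta) / total
    return (shifted - baseline) / baseline


for events, total in [(2, 10), (20, 100), (200, 1000)]:
    down = relative_change(events, total, -1)
    up = relative_change(events, total, +1)
    print(f"{events}/{total}: one fewer event {down:+.1%}, one more event {up:+.1%}")
```

The relative change equals one divided by the baseline number of events, so the estimate becomes stable only when the event count is reasonably large.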
Outcomes that have two states (e.g., dead or alive, success or failure)
The most common type of outcome reported in clinical trials
2x2 tables commonly used to report binary outcomes
Binary Outcomes
A Sample 2x2 Table
ISIS-2 Collaborative Group. Lancet 1988;2:349-60.
                 Vascular deaths   Survival   Total
Streptokinase    791               7,801      8,592
Placebo          1,029             7,566      8,595
Binary outcomes data to be extracted from studies
Treatment Effect Metrics That Can Be Calculated From a 2x2 Table
             Events   No Events
Treatment    a        b
Control      c        d
Group Rates
\[ TR = \frac{a}{a + b}, \qquad CR = \frac{c}{c + d} \]
Treatment Effects Metrics
Risk Difference: \( RD = TR - CR \)
Odds Ratio: \( OR = \frac{TR / (1 - TR)}{CR / (1 - CR)} = \frac{a / b}{c / d} \)
Risk Ratio: \( RR = \frac{TR}{CR} \)
Value ranges from -1 to +1
Magnitude of effect is directly interpretable
Has the same meaning for the complementary outcome (e.g., 5% more people dying is 5% fewer living)
Across studies in many settings, tends to be more heterogeneous than relative measures
Inverse is the number needed to treat (NNT) and may be clinically useful
If heterogeneity is present, a single NNT derived from the overall risk difference could be misleading
Some Characteristics and Uses of the Risk Difference
Value ranges from 1/∞ (that is, 0) to +∞
Has desirable statistical properties; better normality approximation in log scale than risk ratio
Symmetrical meaning for complementary outcome (the odds ratio of dying is equal to the inverse of the odds ratio of living)
Ratio of two odds is not intuitive to interpret
Often used to approximate risk ratio (but gives inflated values at high event rates)
Some Characteristics and Uses of the Odds Ratio
Value ranges from 0 to +∞
Like its derivative, relative risk reduction, is easy to understand and is preferred by clinicians
Example: a risk ratio of 0.75 is a 25% relative reduction of the risk
Requires a baseline rate for proper interpretation
Example: an identical risk ratio for a study with a low event rate and another study with a higher event rate may have very different clinical and public health implications
Asymmetric meaning for the complementary outcome
Example: the risk ratio of dying is not the same as the inverse of the risk ratio of living
Some Characteristics and Uses of the Risk Ratio
When the Complementary Outcome of the Risk Ratio Is Asymmetric
             Dead   Alive   Total
Treatment    20     80      100
Control      40     60      100
Odds Ratio (Dead) = (20 x 60) / (40 x 80) = 3/8 = 0.375
Odds Ratio (Alive) = (80 x 40) / (20 x 60) = 8/3 = 2.67
Risk Ratio (Dead) = (20/100) / (40/100) = 1/2 = 0.5
Risk Ratio (Alive) = (80/100) / (60/100) = 4/3 = 1.33
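The symmetry of the odds ratio and the asymmetry of the risk ratio can be verified numerically from the same table. A minimal sketch follows; the helper names are illustrative, and the arguments follow the a, b, c, d cell layout (events and non-events in the treatment row, then the control row).

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table: (a / b) / (c / d)."""
    return (a / b) / (c / d)


def risk_ratio(a, b, c, d):
    """Risk ratio for a 2x2 table: [a / (a + b)] / [c / (c + d)]."""
    return (a / (a + b)) / (c / (c + d))


# Outcome "dead": treatment 20 dead / 80 alive, control 40 dead / 60 alive.
or_dead = odds_ratio(20, 80, 40, 60)
rr_dead = risk_ratio(20, 80, 40, 60)

# Outcome "alive": the same table with the two columns swapped.
or_alive = odds_ratio(80, 20, 60, 40)
rr_alive = risk_ratio(80, 20, 60, 40)

print(f"OR dead = {or_dead:.3f}, OR alive = {or_alive:.2f}, product = {or_dead * or_alive:.2f}")
print(f"RR dead = {rr_dead:.2f}, RR alive = {rr_alive:.2f}, product = {rr_dead * rr_alive:.2f}")
```

The two odds ratios multiply to 1, so each is the inverse of the other; the two risk ratios do not.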
Calculation of Treatment Effects in the Second International Study of Infarct Survival (ISIS-2)
Treatment-Group Event Rate = 791 / 8592 = 0.0921
Control-Group Event Rate = 1029 / 8595 = 0.1197
Risk Ratio = 0.0921 / 0.1197 = 0.77
Odds Ratio = (791 x 7566) / (1029 x 7801) = 0.75
Risk Difference = 0.0921 - 0.1197 = -0.028
                 Vascular deaths   Survival   Total
Streptokinase    791               7,801      8,592
Placebo          1,029             7,566      8,595
ISIS-2 Collaborative Group. Lancet 1988;2:349-60.
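The ISIS-2 calculation above can be written as a small reusable structure. This is a sketch, not a reference implementation; the class name `TwoByTwo` and its property names are assumptions introduced here, while the formulas are the ones given for the 2x2 table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    a: int  # treatment group, events
    b: int  # treatment group, no events
    c: int  # control group, events
    d: int  # control group, no events

    @property
    def treatment_rate(self) -> float:
        return self.a / (self.a + self.b)

    @property
    def control_rate(self) -> float:
        return self.c / (self.c + self.d)

    @property
    def risk_difference(self) -> float:
        return self.treatment_rate - self.control_rate

    @property
    def risk_ratio(self) -> float:
        return self.treatment_rate / self.control_rate

    @property
    def odds_ratio(self) -> float:
        return (self.a / self.b) / (self.c / self.d)


# Streptokinase vs. placebo, vascular deaths (counts from the table above)
isis2 = TwoByTwo(a=791, b=7801, c=1029, d=7566)

print(f"Risk ratio      = {isis2.risk_ratio:.2f}")
print(f"Odds ratio      = {isis2.odds_ratio:.2f}")
print(f"Risk difference = {isis2.risk_difference:.3f}")
```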
Treatment Effects Estimates in Different Metrics: Second International Study of Infarct Survival (ISIS-2)
Streptokinase vs. Placebo: Vascular Death
                          Estimate   95% Confidence Interval
Risk ratio                0.77       0.70 to 0.84
Odds ratio                0.75       0.68 to 0.82
Risk difference           -0.028     -0.037 to -0.019
Number needed to treat    36         27 to 54
ISIS-2 Collaborative Group. Lancet 1988;2:349-60.
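The source does not state how these confidence intervals were obtained. The sketch below assumes the usual large-sample (Wald) formulas, with intervals for the ratios built on the log scale and z = 1.96; all variable names are illustrative. Under that assumption the results come out close to the table, although the last reported digit can differ slightly because of rounding.

```python
import math

Z = 1.96  # assumed normal quantile for a two-sided 95% interval

a, b = 791, 7801    # streptokinase: vascular deaths, survivors
c, d = 1029, 7566   # placebo: vascular deaths, survivors
n1, n2 = a + b, c + d
p1, p2 = a / n1, c / n2


def log_scale_ci(estimate, se_log):
    """Confidence interval for a ratio, computed on the log scale."""
    centre = math.log(estimate)
    return math.exp(centre - Z * se_log), math.exp(centre + Z * se_log)


risk_ratio = p1 / p2
rr_ci = log_scale_ci(risk_ratio, math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n2))

odds_ratio = (a * d) / (b * c)
or_ci = log_scale_ci(odds_ratio, math.sqrt(1 / a + 1 / b + 1 / c + 1 / d))

risk_diff = p1 - p2
se_rd = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
rd_ci = (risk_diff - Z * se_rd, risk_diff + Z * se_rd)

# NNT is the inverse of the absolute risk difference; its limits are the
# inverses of the risk-difference limits (valid here because both are negative).
nnt = 1 / abs(risk_diff)
nnt_ci = (1 / abs(rd_ci[0]), 1 / abs(rd_ci[1]))

print(f"RR  {risk_ratio:.2f} ({rr_ci[0]:.2f} to {rr_ci[1]:.2f})")
print(f"OR  {odds_ratio:.2f} ({or_ci[0]:.2f} to {or_ci[1]:.2f})")
print(f"RD  {risk_diff:.3f} ({rd_ci[0]:.3f} to {rd_ci[1]:.3f})")
print(f"NNT {nnt:.0f} ({nnt_ci[0]:.0f} to {nnt_ci[1]:.0f})")
```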
Example: Meta-Analysis Data Set
Beta-Blockers after Myocardial Infarction - Secondary Prevention
                          Experiment      Control         Odds    95% CI
 N  Study         Year    Obs    Tot      Obs    Tot      Ratio   Low    High
==  ============  ====    ====   ====     ====   ====     =====   ====   ====
 1  Reynolds      1972       3     38        3     39     1.03    0.19   5.45
 2  Wilhelmsson   1974       7    114       14    116     0.48    0.18   1.23
 3  Ahlmark       1974       5     69       11     93     0.58    0.19   1.76
 4  Multctr. Int  1977     102   1533      127   1520     0.78    0.60   1.03
 5  Baber         1980      28    355       27    365     1.07    0.62   1.86
 6  Rehnqvist     1980       4     59        6     52     0.56    0.15   2.10
 7  Norweg.Multr  1981      98    945      152    939     0.60    0.46   0.79
 8  Taylor        1982      60    632       48    471     0.92    0.62   1.38
 9  BHAT          1982     138   1916      188   1921     0.72    0.57   0.90
10  Julian        1982      64    873       52    583     0.81    0.55   1.18
11  Hansteen      1982      25    278       37    282     0.65    0.38   1.12
12  Manger Cats   1983       9    291       16    293     0.55    0.24   1.27
13  Rehnqvist     1983      25    154       31    147     0.73    0.40   1.30
14  ASPS          1983      45    263       47    266     0.96    0.61   1.51
15  EIS           1984      57    858       45    883     1.33    0.89   1.98
16  LITRG         1987      86   1195       93   1200     0.92    0.68   1.25
17  Herlitz       1988     169    698      179    697     0.92    0.73   1.18
A 1986 study by Charig et al. compared the treatment of renal calculi by open surgery and percutaneous nephrolithotomy.
The authors reported that success was achieved in 78% of patients after open surgery and in 83% after percutaneous nephrolithotomy.
When the size of the stones was taken into account, the apparent higher success rate of percutaneous nephrolithotomy was reversed.
Simpson’s Paradox (I)
Charig CR, et al. BMJ 1986;292:879-82.
Simpson’s Paradox (II)
Stones < 2 cm
        Success   Failure
Open    81        6
PN      234       36
Open (93%) > PN (87%)
Stones ≥ 2 cm
        Success   Failure
Open    192       71
PN      55        25
Open (73%) > PN (69%)
Pooling Tables 1 and 2
        Success   Failure
Open    273       77
PN      289       61
Open (78%) < PN (83%)
Charig CR, et al. BMJ 1986;292:879-82. PN = percutaneous nephrolithotomy
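The reversal can be reproduced from the counts in the three tables. The sketch below uses illustrative names (`small`, `large`, `success_rate`); only the counts come from the source.

```python
# (successes, failures) per treatment arm, by stone size
small = {"Open": (81, 6), "PN": (234, 36)}    # stones < 2 cm
large = {"Open": (192, 71), "PN": (55, 25)}   # stones >= 2 cm


def success_rate(successes, failures):
    return successes / (successes + failures)


# Naive pooling: add the two tables cell by cell, ignoring stone size.
pooled = {
    arm: (small[arm][0] + large[arm][0], small[arm][1] + large[arm][1])
    for arm in small
}

for label, table in [("< 2 cm", small), (">= 2 cm", large), ("pooled", pooled)]:
    rates = ", ".join(f"{arm} {success_rate(*cells):.0%}" for arm, cells in table.items())
    print(f"{label:8s} {rates}")
```

Open surgery has the higher success rate within each stratum, yet the lower rate after pooling, because the two procedures were applied to very different mixes of stone sizes.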
Study    N      Mean difference (mm Hg)   95% Confidence Interval
A        554    -6.2                      -6.9 to -5.5
B        304    -7.7                      -10.2 to -5.2
C        39     -0.1                      -6.5 to 6.3
Combining Effect Estimates
What is the average (overall) treatment-control difference in blood pressure?
Simple Average
((-6.2) + (-7.7) + (-0.1)) / 3 = -4.7 mm Hg
Weighted Average
((554 x -6.2) + (304 x -7.7) + (39 x -0.1)) / (554 + 304 + 39) = -6.4 mm Hg
\[ \bar{X} = \frac{\sum_{i=1}^{k} w_i x_i}{\sum_{i=1}^{k} w_i} \]
General Formula: Weighted Average Effect Size (d+)
\[ d_+ = \frac{\sum_{i=1}^{k} w_i d_i}{\sum_{i=1}^{k} w_i} \]
Where:
\(d_i\) = effect size of the ith study
\(w_i\) = weight of the ith study
\(k\) = number of studies
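The general formula can be applied to the three blood pressure studies tabulated earlier. This is a minimal sketch; `weighted_mean` is an illustrative name, and using the sample size N as the weight simply mirrors the worked example rather than the inverse variance weighting used in practice.

```python
def weighted_mean(effects, weights):
    """Weighted average effect size: sum(w_i * d_i) / sum(w_i)."""
    return sum(w * d for w, d in zip(weights, effects)) / sum(weights)


effects = [-6.2, -7.7, -0.1]   # mean differences in mm Hg for studies A, B, C
sizes = [554, 304, 39]         # sample sizes, used here as the weights

simple = weighted_mean(effects, [1, 1, 1])   # equal weights = simple average
by_size = weighted_mean(effects, sizes)

print(f"Simple average:   {simple:.1f} mm Hg")
print(f"Weighted average: {by_size:.1f} mm Hg")
```

The small study C pulls the simple average toward zero, while it has little influence on the weighted average.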
The weight is generally the inverse of the variance of the treatment effect (which captures both study size and precision)
Different formulas are used for the odds ratio, risk ratio, and risk difference
The formulas are readily available in books and software
Calculation of Weights
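As an illustration of inverse variance weights, the sketch below derives each study's variance from its reported 95% confidence interval in the blood pressure example. It assumes the intervals are symmetric and normal-based, so that the standard error is the interval width divided by 2 x 1.96; that assumption and the helper name `variance_from_ci` are not from the source.

```python
Z_95 = 1.96  # assumed normal quantile behind the reported 95% intervals

# (study, mean difference in mm Hg, CI lower limit, CI upper limit)
STUDIES = [
    ("A", -6.2, -6.9, -5.5),
    ("B", -7.7, -10.2, -5.2),
    ("C", -0.1, -6.5, 6.3),
]


def variance_from_ci(lower, upper, z=Z_95):
    """Back out the variance of an estimate from a symmetric confidence interval."""
    standard_error = (upper - lower) / (2 * z)
    return standard_error ** 2


effects = [diff for _, diff, _, _ in STUDIES]
weights = [1 / variance_from_ci(lower, upper) for _, _, lower, upper in STUDIES]
pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)

for (name, *_), w in zip(STUDIES, weights):
    print(f"Study {name}: weight = {w:.2f} ({w / sum(weights):.1%} of total)")
print(f"Inverse variance weighted mean difference: {pooled:.1f} mm Hg")
```

Study A dominates because its interval is much narrower than the others, which reflects precision and not just sample size.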
Is it reasonable?
Are the characteristics and effects of studies sufficiently similar to estimate an average effect?
Types of heterogeneity:
• Clinical diversity
• Methodological diversity
• Statistical heterogeneity
Heterogeneity (Diversity)
Are the studies of similar treatments, populations, settings, design, et cetera, such that an average effect would be clinically meaningful?
Clinical Diversity
25 randomized controlled trials compared endoscopic hemostasis with standard therapy for bleeding peptic ulcer.
5 different types of treatment were used: monopolar electrode, bipolar electrode, argon laser, neodymium-YAG laser, and sclerosant injection.
4 different conditions were treated: active bleeding, a nonspurting blood vessel, no blood vessels seen, and undesignated.
3 different outcomes were assessed: emergency surgery, overall mortality, and recurrent bleeding.
Example: A Meta-analysis With a Large Degree of Clinical Diversity
Sacks HS, et al. JAMA 1990;264:494-9.
Are the studies of similar design and conduct such that an average effect would be clinically meaningful?
Methodological Diversity
Is the observed variability of effects greater than that expected by chance alone?
Two statistical measures are commonly used to assess statistical heterogeneity:
• Cochran’s Q statistic
• I² index
Statistical Heterogeneity
Cochran’s Q Statistic: Chi-Square (χ²) Test for Homogeneity
\[ Q = \sum_{i=1}^{k} w_i \,(d_i - d_+)^2 \sim \chi^2_{df = k - 1} \]
\(d_i\) = effect measure; \(d_+\) = weighted average
The Q statistic measures between-study variation.
The I² Index and Its Interpretation
Describes the percentage of total variation in study estimates that is due to heterogeneity rather than to chance
Value ranges from 0 to 100 percent
A value of 25 percent is considered to be low heterogeneity, 50 percent to be moderate, and 75 percent to be large
Is independent of the number of studies in the meta-analysis; it could be compared directly between meta-analyses
\[ H^2 = \max\left\{1, \frac{Q}{k - 1}\right\}, \qquad I^2 = \frac{H^2 - 1}{H^2} \]
Higgins JP, et al. BMJ 2003;327:557-60.
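Both heterogeneity measures can be computed from study effects and their variances. The sketch below applies them to the three blood pressure studies from the earlier example, with variances backed out of the reported 95% intervals under the assumption that they are symmetric and normal-based; the function name `q_and_i2` is illustrative.

```python
def q_and_i2(effects, variances):
    """Return Cochran's Q and the I-squared index (as a fraction between 0 and 1)."""
    weights = [1 / v for v in variances]
    d_plus = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    q = sum(w * (d - d_plus) ** 2 for w, d in zip(weights, effects))
    k = len(effects)
    h2 = max(1.0, q / (k - 1))   # H^2 is truncated at 1, so I^2 is never negative
    i2 = (h2 - 1) / h2
    return q, i2


# Mean differences (mm Hg) and 95% CI limits for studies A, B, C
effects = [-6.2, -7.7, -0.1]
intervals = [(-6.9, -5.5), (-10.2, -5.2), (-6.5, 6.3)]
variances = [((upper - lower) / (2 * 1.96)) ** 2 for lower, upper in intervals]

q, i2 = q_and_i2(effects, variances)
print(f"Q = {q:.2f} on {len(effects) - 1} degrees of freedom, I^2 = {i2:.0%}")
```

With only three studies the Q test has little power, which is one reason the two measures are usually reported together.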
Example: A Fixed Effect Model
Suppose that we have a container with a very large number of black and white balls.
The ratio of white to black balls is predetermined and fixed.
We wish to estimate this ratio.
Now, imagine that the container represents a clinical condition and the balls represent outcomes.
Random Sampling From a Container With a Fixed Number of White and Black Balls (Equal Sample Size)
Random Sampling From a Container With a Fixed Number of Black and White Balls (Different Sample Size)
Different Containers With Different Proportions of Black and White Balls (Random Effects Model)
Random Sampling From Containers To Get an Overall Estimate of the Proportion of Black and White Balls
Fixed effect model: assumes a common treatment effect.
For the inverse variance weighted method, the precision of the estimate determines the importance of the study.
The Peto and Mantel-Haenszel methods are non-inverse variance weighted fixed effect models.
Random effects model: in contrast to the fixed effect model, accounts for between-study variation in addition to within-study variation.
The most popular random effects model in use is the DerSimonian and Laird inverse variance weighted method, which calculates the sum of the within-study variation and the among-study variation.
The random effects model can also be implemented with Bayesian methods.
Statistical Models of Combining 2x2 Tables
Example Meta-analysis Where Fixed and the Random Effects Models Yield Identical Results
Example Meta-analysis Where Results from Fixed and Random Effects Models Will Differ
Gross PA, et al. Ann Intern Med 1995;123:518-27. Reprinted with permission from the American College of Physicians.
Weights of the Fixed Effect and Random Effects Models
Fixed Effect Weight: \( w_i = \frac{1}{v_i} \)
Random Effects Weight: \( w_i^* = \frac{1}{v_i + v^*} \)
where: \(v_i\) = within-study variance
\(v^*\) = between-study variance
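The two weighting schemes can be compared on the beta-blocker data set shown earlier. The sketch below pools log odds ratios with inverse variance weights and estimates the between-study variance with the DerSimonian and Laird moment estimator; the function names are illustrative, and the log odds ratio variance 1/a + 1/b + 1/c + 1/d is the standard large-sample formula rather than something stated in the source.

```python
import math

# (deaths treated, total treated, deaths control, total control), trials 1-17
TRIALS = [
    (3, 38, 3, 39), (7, 114, 14, 116), (5, 69, 11, 93),
    (102, 1533, 127, 1520), (28, 355, 27, 365), (4, 59, 6, 52),
    (98, 945, 152, 939), (60, 632, 48, 471), (138, 1916, 188, 1921),
    (64, 873, 52, 583), (25, 278, 37, 282), (9, 291, 16, 293),
    (25, 154, 31, 147), (45, 263, 47, 266), (57, 858, 45, 883),
    (86, 1195, 93, 1200), (169, 698, 179, 697),
]


def log_odds_ratio(events_t, total_t, events_c, total_c):
    """Return (log odds ratio, within-study variance v_i) for one trial."""
    a, b = events_t, total_t - events_t
    c, d = events_c, total_c - events_c
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def pool(effects, variances, between=0.0):
    """Inverse variance pooling with weights 1 / (v_i + v*); v* = 0 is fixed effect."""
    weights = [1 / (v + between) for v in variances]
    estimate = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return estimate, math.sqrt(1 / sum(weights))


def dersimonian_laird(effects, variances):
    """Moment estimate of the between-study variance v*, truncated at zero."""
    w = [1 / v for v in variances]
    fixed, _ = pool(effects, variances)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    scale = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    return max(0.0, (q - (len(effects) - 1)) / scale)


effects, variances = zip(*(log_odds_ratio(*trial) for trial in TRIALS))
v_star = dersimonian_laird(effects, variances)

for label, between in [("Fixed effect", 0.0), ("Random effects", v_star)]:
    est, se = pool(effects, variances, between)
    low, high = math.exp(est - 1.96 * se), math.exp(est + 1.96 * se)
    print(f"{label:15s} OR = {math.exp(est):.2f} (95% CI {low:.2f} to {high:.2f})")
```

Because v* is added to every study's variance, the random effects weights are more nearly equal across studies and the pooled standard error can never be smaller than under the fixed effect model.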
Commonly Used Statistical Methods for Combining 2x2 Tables
                        Odds Ratio                     Risk Ratio                     Risk Difference
Fixed Effect Model      • Mantel-Haenszel              • Mantel-Haenszel              • Inverse variance weighted
                        • Peto                         • Inverse variance weighted
                        • Exact
                        • Inverse variance weighted
Random Effects Model    • DerSimonian and Laird        • DerSimonian and Laird        • DerSimonian and Laird
[Figure: options for handling heterogeneous treatment effects]
HETEROGENEOUS TREATMENT EFFECTS
IGNORE: FIXED EFFECT MODEL
ESTIMATE (insensitive): DO NOT COMBINE WHEN HETEROGENEITY IS PRESENT
INCORPORATE: RANDOM EFFECTS MODEL
EXPLAIN: SUBGROUP ANALYSES; META-REGRESSION (control rate, covariates)
Dealing With Heterogeneity
Lau J, et al. Ann Intern Med 1997;127:820-6. Reprinted with permission from the American College of Physicians.
Most meta-analyses of clinical trials combine treatment effects (risk ratio, odds ratio, risk difference) across studies to produce a common estimate, by using either a fixed effect or random effects model.
In practice, the results from using these two models are similar when there is little or no heterogeneity.
When heterogeneity is present, the random effects model generally produces a more conservative result (smaller Z-score) with a similar estimate but also a wider confidence interval; however, there are rare exceptions of extreme heterogeneity where the random effects model may yield counterintuitive results.
Summary: Statistical Models of Combining 2x2 Tables
Many assumptions are made in meta-analyses, so care is needed in the conduct and interpretation.
Most meta-analyses are retrospective exercises, suffering from all the problems of being an observational design.
Researchers cannot make up missing information or fix poorly collected, analyzed, or reported data.
Caveats
Basic meta-analyses can be easily carried out with readily available statistical software.
Relative measures are more likely to be homogeneous across studies and are generally preferred.
The random effects model is the appropriate statistical model in most instances.
The decision to conduct a meta-analysis should be based on: a well-formulated question, appreciation of the heterogeneity of the data, and understanding of how the results will be used.
Key Messages
Charig CR, Webb DR, Payne SR, et al. Comparison of treatment of renal calculi by operative surgery, percutaneous nephrolithotomy, and extracorporeal shock wave lithotripsy. BMJ 1986;292:879-82.
Gross PA, Hermogenes AW, Sacks HS, et al. The efficacy of influenza vaccine in elderly persons: a meta-analysis and review of the literature. Ann Intern Med 1995;123:518-27.
Higgins JPT, Thompson SG, Deeks JJ, et al. Measuring inconsistency in meta-analyses. BMJ 2003;327:557-60.
Lau J, Ioannidis JPA, Schmid CH. Quantitative synthesis in systematic review. Ann Intern Med 1997;127:820-6.
References (I)
ISIS-2 (Second International Study of Infarct Survival) Collaborative Group. Randomized trial of intravenous streptokinase, oral aspirin, both, or neither among 17,817 cases of suspected acute myocardial infarction: ISIS-2. Lancet 1988;2:349-60.
Sacks HS, Chalmers TC, Blum AL, et al. Endoscopic hemostasis: an effective therapy for bleeding peptic ulcers. JAMA 1990;264:494-9.
References (II)
This presentation was prepared by Joseph Lau, M.D., and Thomas Trikalinos, M.D., Ph.D., members of the Tufts Medical Center Evidence-based Practice Center.
The information in this module is based on Chapter 9 in Version 1.0 of the Methods Guide for Comparative Effectiveness Reviews (available at: http://www.effectivehealthcare.ahrq.gov/repFiles/2007_10DraftMethodsGuide.pdf).
Authors