Systematic Reviews: Methods and Procedures
George A. Wells
Editor, Cochrane Musculoskeletal Review Group
Department of Epidemiology and Community Medicine
University of Ottawa
Ottawa, Ontario, Canada
• Meta-analysis is a statistical analysis of a collection of studies
• Meta-analysis methods focus on contrasting and comparing results from different studies in anticipation of identifying consistent patterns and sources of disagreements among these results
• Primary objective:
– Synthetic goal (estimation of summary effect)
vs
– Analytic goal (estimation of differences)
Meta-analysis:
• Systematic Review:
– the application of scientific strategies that limit bias to the systematic assembly, critical appraisal and synthesis of all relevant studies on a specific topic
• Meta-Analysis:
– a systematic review that employs statistical methods to combine and summarize the results of several studies
Features of narrative reviews and systematic reviews
                 NARRATIVE                             SYSTEMATIC
QUESTION         Broad                                 Focused
SOURCES/SEARCH   Usually unspecified; possibly biased  Comprehensive; explicit
SELECTION        Unspecified; biased?                  Criterion-based; uniformly applied
APPRAISAL        Variable                              Rigorous
SYNTHESIS        Usually qualitative                   Quantitative
INFERENCE        Sometimes evidence-based              Usually evidence-based
Steps of a Cochrane Systematic Review
• Clearly formulated question
• Comprehensive data search
• Unbiased selection and extraction process
• Critical appraisal of data
• Synthesis of data
• Perform sensitivity and subgroup analyses if appropriate and possible
• Prepare a structured report
• What is the study objective?
– to validate results in a large population
– to guide new studies
• Pose the question in both biologic and health care terms, specifying with operational definitions:
– population
– intervention
– outcomes (both beneficial and harmful)
Inclusion Criteria
• Study design
• Population
• Interventions
• Outcomes
• Need a well formulated and co-ordinated effort
• Seek guidance from a librarian
• Specify language constraints
• Requirements for comprehensiveness of search depend on the field and question to be addressed
• Possible sources include: computerized bibliographic databases, review articles, abstracts, conference proceedings, dissertations, books, experts, granting agencies, trial registries, industry, journal handsearching
• Procedure: usually begin with searches of bibliographic reports (citation indexes, abstract databases); publications are retrieved and the references therein searched for more references
• Published reports are subject to publication bias (i.e. the tendency to publish statistically significant results)
• As a step towards eliminating publication bias, need information from unpublished research: databases of unpublished reports, clinical research registries, clinical trial registries, unpublished theses, conference indexes
• 2 independent reviewers select studies
• Selection of studies addressing the question posed, based on a priori specification of the population, intervention, outcomes and study design
• Level of agreement: kappa
• Differences resolved by consensus
• Specify reasons for rejecting studies
Study Selection
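The agreement statistic named on this slide can be illustrated with a minimal sketch of Cohen's kappa for two reviewers making include/exclude decisions. The function name and the counts below are hypothetical and are not taken from any review.

```python
def cohen_kappa(both_include, only_a, only_b, both_exclude):
    """Cohen's kappa for two reviewers' include/exclude decisions.

    Arguments are the four cells of the 2x2 agreement table (counts).
    Undefined (division by zero) if chance agreement equals 1.
    """
    n = both_include + only_a + only_b + both_exclude
    observed = (both_include + both_exclude) / n      # observed agreement
    a_include = (both_include + only_a) / n           # reviewer A inclusion rate
    b_include = (both_include + only_b) / n           # reviewer B inclusion rate
    # Agreement expected by chance if the reviewers decided independently
    expected = a_include * b_include + (1 - a_include) * (1 - b_include)
    return (observed - expected) / (1 - expected)


# Hypothetical screening of 100 citations
kappa = cohen_kappa(both_include=20, only_a=5, only_b=10, both_exclude=65)
print(round(kappa, 3))
```

Kappa rescales observed agreement by the agreement expected by chance, so a value of 0 means chance-level agreement and 1 means perfect agreement.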
• 2 independent reviewers extract data using predetermined forms
– Patient characteristics
– Study design and methods
– Study results
– Methodologic quality
• Level of agreement: kappa
• Differences resolved by consensus
Data Extraction
• Be explicit, unbiased and reproducible
• Include all relevant measures of benefit and harm of the intervention
• Contact investigators of the studies for clarification of published methods etc.
• Extract individual patient data when published data do not answer questions about: intention to treat analyses, time-to-event analyses, subgroups, dose-response relationships
Data Extraction ….
• Size of study
• Characteristics of study patients
• Details of specific interventions used
• Details of outcomes assessed
Description of Studies
• Can use as:
– threshold for inclusion
– possible explanation for heterogeneity
• Base quality assessments on extent to which bias is minimized
• Make quality assessment scoring systems transparent and parsimonious
• Evaluate reproducibility of quality assessment
• Report quality scoring system used
Methodologic Quality Assessment
Study            Random   Blinding   Dropouts
Adami 1995       +        +          +
Black 1996       ++       +          +
Bone 1997        +        +          --
Chestnut 1995    +        +          +
Hosking 1998     +        --         +
Liberman 1995    +        +          +
McClung 1998     +        +          +
++ indicates that randomization was appropriate (e.g. random numbers were computer generated)
Quality Assessment: Example
Outcome
• Discrete (event): Odds Ratio (OR), Relative Risk (RR), Risk Difference (RD)
• Continuous (measured): Mean Difference (MD), Standardized Mean Difference (SMD)
• For either type, the basic data are combined into an overall estimate (fixed effects or random effects)
Effect measures: discrete data
P1 = event rate in experimental group
P2 = event rate in control group
• RD = Risk difference = P2 - P1
• RR = Relative risk = P1 / P2
• RRR = Relative risk reduction = (P2 - P1) / P2
• OR = Odds ratio = [P1 / (1 - P1)] / [P2 / (1 - P2)]
• NNT = No. needed to treat = 1 / (P2 - P1)
Example
Experimental event rate = 0.3
Control event rate = 0.4
RD = 0.4 - 0.3 = 0.1
RR = 0.3 / 0.4 = 0.75
RRR = (0.4 - 0.3) / 0.4 = 0.25
OR = (0.3/0.7)/(0.4/0.6) = 0.64
NNT = 1 / (0.4 - 0.3) = 10
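The arithmetic in this example can be checked with a short sketch. The function name and dictionary keys are illustrative only; the risk difference follows this slide's convention of control minus experimental.

```python
def effect_measures(p_exp, p_ctl):
    """Effect measures for event rates in the experimental (P1) and control (P2) groups."""
    return {
        "RD": p_ctl - p_exp,                                    # risk difference (P2 - P1)
        "RR": p_exp / p_ctl,                                    # relative risk
        "RRR": (p_ctl - p_exp) / p_ctl,                         # relative risk reduction
        "OR": (p_exp / (1 - p_exp)) / (p_ctl / (1 - p_ctl)),    # odds ratio
        "NNT": 1 / (p_ctl - p_exp),                             # number needed to treat
    }


measures = effect_measures(p_exp=0.3, p_ctl=0.4)
for name, value in measures.items():
    print(f"{name} = {value:.2f}")
```

Note that NNT is undefined when the two event rates are equal, so a real analysis would need to guard against a zero risk difference.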
Discrete - Odds Ratio (OR)
               Event   No event   Total
Experimental   a       b          n_e
Control        c       d          n_c

Basic data: $P_e = a/n_e$, $P_c = c/n_c$

Odds: (number of patients experiencing event) / (number of patients not experiencing event)
Odds ratio: (odds in experimental group) / (odds in control group)

$$OR = \frac{P_e/(1-P_e)}{P_c/(1-P_c)} = \frac{ad}{bc}$$
Discrete - Odds Ratio Example
               Event   No event   Total
Experimental   13      33         46
Control        7       31         38

Basic data: $P_e = 13/46$, $P_c = 7/38$

$$OR = \frac{13 \times 31}{33 \times 7} = 1.745$$
Discrete - Relative Risk (RR)
               Event   No event   Total
Experimental   a       b          n_e
Control        c       d          n_c

Basic data: $P_e = a/n_e$, $P_c = c/n_c$

Risk: (number of patients experiencing event) / (number of patients)
Risk ratio: (risk in experimental group) / (risk in control group)

$$RR = \frac{P_e}{P_c} = \frac{a(c+d)}{(a+b)c}$$
Discrete - Relative Risk - Example
               Event   No event   Total
Experimental   13      33         46
Control        7       31         38

Basic data: $P_e = 13/46$, $P_c = 7/38$

$$RR = \frac{P_e}{P_c} = \frac{13/46}{7/38} = 1.534$$
Discrete - Risk Difference (RD)
               Event   No event   Total
Experimental   a       b          n_e
Control        c       d          n_c

Basic data: $P_e = a/n_e$, $P_c = c/n_c$

Risk: (number of patients experiencing event) / (number of patients)
Risk difference: (risk in experimental group) - (risk in control group)

$$RD = P_e - P_c = \frac{a}{a+b} - \frac{c}{c+d}$$
Discrete - Risk Difference - Example
               Event   No event   Total
Experimental   13      33         46
Control        7       31         38

Basic data: $P_e = 13/46$, $P_c = 7/38$
RD = Pe- Pc = 13/46 - 7/38 = 0.098
Discrete - Odds Ratio
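               Event   No event   Total
Experimental   a       b          n_e
Control        c       d          n_c

Basic data: $\hat{p}_e = a/n_e$, $\hat{p}_c = c/n_c$

Estimator:
$$o = \frac{\hat{p}_e/(1-\hat{p}_e)}{\hat{p}_c/(1-\hat{p}_c)}, \qquad L_o = \ln(o)$$

Standard error:
$$s_{L_o} = \left[\frac{1}{n_e \hat{p}_e (1-\hat{p}_e)} + \frac{1}{n_c \hat{p}_c (1-\hat{p}_c)}\right]^{1/2}$$

100(1-α)% CI:
$$L_o \pm Z_{\alpha/2}\, s_{L_o} \quad \text{for } \ln(O), \qquad \exp\left(L_o \pm Z_{\alpha/2}\, s_{L_o}\right) \quad \text{for } O$$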
Discrete - Relative Risk
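               Event   No event   Total
Experimental   a       b          n_e
Control        c       d          n_c

Basic data: $\hat{p}_e = a/n_e$, $\hat{p}_c = c/n_c$

Estimator:
$$r = \hat{p}_e / \hat{p}_c, \qquad L_r = \ln(r)$$

Standard error:
$$s_{L_r} = \left[\frac{1-\hat{p}_e}{n_e \hat{p}_e} + \frac{1-\hat{p}_c}{n_c \hat{p}_c}\right]^{1/2}$$

100(1-α)% CI:
$$L_r \pm Z_{\alpha/2}\, s_{L_r} \quad \text{for } \ln(R), \qquad \exp\left(L_r \pm Z_{\alpha/2}\, s_{L_r}\right) \quad \text{for } R$$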
Discrete - Risk Difference
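               Event   No event   Total
Experimental   a       b          n_e
Control        c       d          n_c

Basic data: $\hat{p}_e = a/n_e$, $\hat{p}_c = c/n_c$

Estimator:
$$d = \hat{p}_e - \hat{p}_c$$

Standard error:
$$s_d = \left[\frac{\hat{p}_e(1-\hat{p}_e)}{n_e} + \frac{\hat{p}_c(1-\hat{p}_c)}{n_c}\right]^{1/2}$$

100(1-α)% CI for D:
$$d \pm Z_{\alpha/2}\, s_d$$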
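As a rough illustration of the estimators and standard errors on the three preceding slides, the sketch below applies them to the 13/46 versus 7/38 example used earlier. The function name, the returned dictionary layout and the hard-coded normal quantile are assumptions made for this example.

```python
import math

Z_975 = 1.959964  # standard normal quantile for a two-sided 95% interval


def two_by_two_cis(a, n_e, c, n_c, z=Z_975):
    """Point estimates and CIs for OR, RR and RD from a 2x2 table.

    a, c are event counts; n_e, n_c are group sizes.
    Each result is a tuple (estimate, lower, upper).
    """
    p_e, p_c = a / n_e, c / n_c

    # Odds ratio: work on the log scale, then exponentiate
    log_or = math.log((p_e / (1 - p_e)) / (p_c / (1 - p_c)))
    se_log_or = math.sqrt(1 / (n_e * p_e * (1 - p_e)) + 1 / (n_c * p_c * (1 - p_c)))

    # Relative risk: also on the log scale
    log_rr = math.log(p_e / p_c)
    se_log_rr = math.sqrt((1 - p_e) / (n_e * p_e) + (1 - p_c) / (n_c * p_c))

    # Risk difference: on the natural scale
    rd = p_e - p_c
    se_rd = math.sqrt(p_e * (1 - p_e) / n_e + p_c * (1 - p_c) / n_c)

    return {
        "OR": tuple(math.exp(log_or + k * z * se_log_or) for k in (0, -1, 1)),
        "RR": tuple(math.exp(log_rr + k * z * se_log_rr) for k in (0, -1, 1)),
        "RD": tuple(rd + k * z * se_rd for k in (0, -1, 1)),
    }


cis = two_by_two_cis(a=13, n_e=46, c=7, n_c=38)
for name, (est, lower, upper) in cis.items():
    print(f"{name}: {est:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

The intervals for OR and RR are asymmetric around the point estimate because they are built on the log scale and back-transformed. The formulas break down when any cell count is zero.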
When to use OR / RR / RD

Association    OR        RR        RD
Range          (0, ∞)    (0, ∞)    (-1, 1)
'Decreased'    < 1       < 1       < 0
None           1         1         0
'Increased'    > 1       > 1       > 0
OR vs RR
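Odds Ratio ≈ Relative Risk if the event occurs infrequently (i.e. a and c small relative to b and d):

$$RR = \frac{a(c+d)}{(a+b)c} \approx \frac{ad}{bc} = OR$$

Odds Ratio is further from 1 than the Relative Risk if the event occurs frequently (OR > RR when RR > 1; OR < RR when RR < 1)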
RD vs RR
When interpretation in terms of absolute difference is better than in relative terms (e.g. interest in absolute reduction in adverse events)
PROPERTIES OF RISK DIFFERENCE (RD), RELATIVE RISK (RR) AND ODDS RATIO (OR)

                                                                     RD    RR    OR
Simple measure?                                                      Yes   Yes   No
Symmetric (measure unaffected by labelling of study groups)?         Yes   No    Yes
Predicted event rates restricted to [0,1] if measure is constant?    No    No    Yes
Unbiased estimate available?                                         Yes   No    No
Efficient estimation in small samples?                               No    No    Yes
Motivating biological model available?                               Yes   Yes   Yes
Continuous Data - Mean Difference (MD)
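               Number   Mean   Standard deviation
Experimental   n_e      x̄_e    s_e
Control        n_c      x̄_c    s_c

Mean difference (MD):
$$\bar{x}_e - \bar{x}_c$$

Standard error:
$$se(\bar{x}_e - \bar{x}_c) = \sqrt{\frac{s_e^2}{n_e} + \frac{s_c^2}{n_c}}$$

100(1-α)% CI:
$$(\bar{x}_e - \bar{x}_c) \pm Z_{\alpha/2}\, se(\bar{x}_e - \bar{x}_c)$$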
Continuous Data - Standardized Mean Difference (SMD)
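               Number   Mean   Standard deviation
Experimental   n_e      x̄_e    s_e
Control        n_c      x̄_c    s_c

SMD:
$$d = f\,\frac{\bar{x}_e - \bar{x}_c}{s}$$

where:
$$s = \sqrt{\frac{(n_e-1)s_e^2 + (n_c-1)s_c^2}{n_e+n_c-2}}, \qquad f = \frac{4(n_e+n_c-2)-4}{4(n_e+n_c-2)-1}$$

Standard error:
$$se(d) = \left[\frac{n_e+n_c}{n_e n_c} + \frac{d^2}{2(n_e+n_c)}\right]^{1/2}$$

100(1-α)% CI:
$$d \pm Z_{\alpha/2}\, se(d)$$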
Mean Difference
• When studies have comparable outcome measures (i.e. same scale, probably same length of follow-up)
• A meta-analysis using MDs is known as a weighted mean difference (WMD)
Standardized Mean Difference
• When studies use different outcome measurements which address the same clinical outcome (e.g. different scales)
• Converts scale to a common scale: number of standard deviations
When to use MD / SMD
Example: Combining different scales for Swollen Joint Count
             Experimental           Control
Study        Mean    SD     N       Mean    SD     N       MD      SMD
Andersen     6.9     5.2    12      19.4    12.2   12      -12.5   -1.287
Furst        18.0    11.0   17      27.0    15.0   16      -9.0    -0.671
Pinheiro     --      --     --      --      --     --      --      --
Weinblatt    20.0    7.75   15      23.0    8.0    16      -3.0    -0.371
Williams     17.0    12.6   56      48 (N), mean 25.0, SD 13.4      -8.0    -0.612
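The MD and SMD columns of this table can be reproduced with a short sketch. It assumes the pooled standard deviation and the small-sample correction factor f = [4(n_e + n_c - 2) - 4] / [4(n_e + n_c - 2) - 1]; the standard error of d uses the large-sample approximation shown in the code, which is an assumption of this example. Function and variable names are illustrative.

```python
import math


def mean_difference(n_e, mean_e, sd_e, n_c, mean_c, sd_c):
    """Mean difference (experimental minus control) and its standard error."""
    md = mean_e - mean_c
    se = math.sqrt(sd_e**2 / n_e + sd_c**2 / n_c)
    return md, se


def standardized_mean_difference(n_e, mean_e, sd_e, n_c, mean_c, sd_c):
    """Bias-corrected standardized mean difference and an approximate standard error."""
    df = n_e + n_c - 2
    pooled_sd = math.sqrt(((n_e - 1) * sd_e**2 + (n_c - 1) * sd_c**2) / df)
    f = (4 * df - 4) / (4 * df - 1)        # small-sample correction factor
    d = f * (mean_e - mean_c) / pooled_sd
    # Assumed large-sample approximation for se(d)
    se = math.sqrt((n_e + n_c) / (n_e * n_c) + d**2 / (2 * (n_e + n_c)))
    return d, se


# (n, mean, SD) for the experimental group, then for the control group
studies = {
    "Andersen":  (12, 6.9, 5.2, 12, 19.4, 12.2),
    "Furst":     (17, 18.0, 11.0, 16, 27.0, 15.0),
    "Weinblatt": (15, 20.0, 7.75, 16, 23.0, 8.0),
    "Williams":  (56, 17.0, 12.6, 48, 25.0, 13.4),
}

md = {name: mean_difference(*data)[0] for name, data in studies.items()}
smd = {name: standardized_mean_difference(*data)[0] for name, data in studies.items()}
for name in studies:
    print(f"{name:10s} MD = {md[name]:6.1f}   SMD = {smd[name]:7.3f}")
```

Pinheiro is omitted because the table reports no data for that study.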
• “True” inter-study variation may exist (fixed/random-effects model)
• Sampling error may vary among studies (sample size)
• Characteristics may differ among studies (population, intervention)
Sources of Variation over Studies
• Parameter of interest: θ (quantifies average treatment effect)
• Number of independent studies: k
• Summary Statistic: Yi (i=1,2,…,k)
• Large sample size: asymptotic normal distribution
Fixed-effects model vs Random-effects model
Modelling Variation
Fixed-Effects Model
• Outcome $Y_i$ from study i is a sample from a distribution with mean $\theta$ (i.e. common mean across studies)
• $Y_i$ are independently distributed as $N(\theta, s_i^2)$ (i = 1, 2, …, k), where $s_i^2 = Var(Y_i)$ and assume $E(Y_i) = \theta$
[Figure: Fixed-Effects Model]
Random-Effects Model
• Outcome $Y_i$ from study i is a sample from a distribution with mean $\theta_i$ (i.e. study-specific means)
• $Y_i$ are independently distributed as $N(\theta_i, s_i^2)$ (i = 1, 2, …, k), where $s_i^2 = Var(Y_i)$ and assume $E(Y_i) = \theta_i$
• $\theta_i$ is a realization from a distribution of 'effects' with mean $\theta$
• $\theta_i$ are independently distributed as $N(\theta, \tau^2)$ (i = 1, 2, …, k), where
– $\tau^2 = Var(\theta_i)$ is the inter-study variation
– $\theta$ is the average treatment effect
[Figure: Random-Effects Model]
Random-Effects Model …..
Estimating Average Study Effect
• after averaging study-specific effects, distribution of $Y_i$ is $N(\theta, s_i^2 + \tau^2)$
• although $\theta$ is the parameter of interest, $\tau^2$ must be considered and estimated
Estimating Study-Specific Effects
• distribution of $\theta_i$ conditional on observed data, $\theta$ and $\tau^2$ is $N\left(F_i\theta + (1-F_i)Y_i,\; s_i^2(1-F_i)\right)$
• where $F_i = s_i^2/(s_i^2 + \tau^2)$ is the shrinkage factor for the ith study
Modelling Variation
• Studies are stratified and then combined to account for differences in sample size and study characteristics
• A weighted average of estimates from each study is calculated
• Question of whether a common or study-specific parameter is to be estimated remains. Procedure:
– perform test of homogeneity
– if no significant difference, use fixed-effects model
– otherwise identify study characteristics that stratify studies into subsets with homogeneous effects, or use random-effects model
Fixed Effects Model
• Require from each study: effect estimate; and standard error of effect estimate
• Combine these using a weighted average:
pooled estimate = sum of (estimate × weight) / sum of weights
where weight = 1 / variance of estimate
• Assumes a common underlying effect behind every trial
Fixed-Effects Model: General Scheme
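Study   Measure   Std Error   Weight
1       Y_1       s_1         W_1
2       Y_2       s_2         W_2
.       .         .           .
.       .         .           .
k       Y_k       s_k         W_k
(no association: Y_i = 0)

Weight:
$$W_i = \frac{1}{s_i^2}$$

Overall Measure:
$$\hat{\theta}_{mle} = \frac{\sum_i W_i Y_i}{\sum_i W_i}, \qquad se(\hat{\theta}_{mle}) = \left(\frac{1}{\sum_i W_i}\right)^{1/2}$$

100(1-α)% CI:
$$\hat{\theta}_{mle} \pm Z_{\alpha/2}\, se(\hat{\theta}_{mle})$$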
Chi-Square Tests:
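$$\chi^2_{total} = \sum_{i=1}^{k} W_i Y_i^2 \qquad (df = k)$$

$$\chi^2_{assoc} = \frac{\left(\sum_i W_i Y_i\right)^2}{\sum_i W_i} \qquad (df = 1)$$

$$\chi^2_{homog} = \sum_{i=1}^{k} W_i (Y_i - \hat{\theta}_{mle})^2 = \chi^2_{total} - \chi^2_{assoc} \qquad (df = k-1) \quad \text{(Cochran's Q test)}$$

Equivalently, $\sqrt{\chi^2_{assoc}} = |\hat{\theta}_{mle}| / se(\hat{\theta}_{mle})$ can be referred to the N(0,1) distribution.
• If $\chi^2_{assoc}$ is 'large': association
• If $\chi^2_{homog}$ is 'large': heterogeneity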
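The fixed-effects pooled estimate and the chi-square partition on these slides can be sketched as follows. The three study estimates and standard errors are hypothetical, and the function name and returned keys are chosen for this example only.

```python
import math


def fixed_effects(estimates, std_errors, z=1.959964):
    """Inverse-variance fixed-effects pooling with the chi-square partition."""
    weights = [1 / s**2 for s in std_errors]           # W_i = 1 / s_i^2
    sum_w = sum(weights)
    sum_wy = sum(w * y for w, y in zip(weights, estimates))

    pooled = sum_wy / sum_w                            # weighted average
    se = math.sqrt(1 / sum_w)

    chi2_total = sum(w * y**2 for w, y in zip(weights, estimates))   # df = k
    chi2_assoc = sum_wy**2 / sum_w                                   # df = 1
    chi2_homog = chi2_total - chi2_assoc                             # df = k - 1 (Cochran's Q)

    return {
        "pooled": pooled,
        "se": se,
        "ci": (pooled - z * se, pooled + z * se),
        "chi2_total": chi2_total,
        "chi2_assoc": chi2_assoc,
        "chi2_homog": chi2_homog,
    }


# Hypothetical effect estimates (e.g. log odds ratios) and standard errors
estimates = [0.10, 0.30, 0.50]
std_errors = [0.10, 0.20, 0.10]

fe = fixed_effects(estimates, std_errors)
print(f"pooled = {fe['pooled']:.3f}, se = {fe['se']:.3f}, Q = {fe['chi2_homog']:.2f}")
```

For ratio measures such as OR and RR, the inputs would normally be the log estimates and their standard errors, with the pooled result exponentiated afterwards.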
Features in Graphic Display
• For each trial
– estimate (square)
– 95% confidence interval (CI) (line)
– size (square) indicates weight allocated
• Solid vertical line of 'no effect'
– if CI crosses line then effect not significant (p>0.05)
• Horizontal axis
– arithmetic: RD, MD, SMD
– logarithmic: OR, RR
• Diamond represents combined estimate and 95% CI
• Dashed line plotted vertically through combined estimate
Odds Ratio
Three methods for combining
(1) Mantel-Haenszel method
(2) Peto’s method
(3) Maximum likelihood method
[Figures: example forest plots for Peto Odds Ratio, Mantel-Haenszel Odds Ratio, Relative Risk, Risk Difference, Weighted Mean Difference and Standardized Mean Difference]
Heterogeneity
• Define meaning of heterogeneity for each review
• Define a priori the important degree of heterogeneity (in large data sets trivial heterogeneity may be statistically significant)
• If heterogeneity exists examine potential sources (differences in study quality, participants, intervention specifics or outcome measurement/definition)
• If heterogeneity exists across studies, consider using random effects model
• If heterogeneity can be explained using a priori hypotheses, consider presenting results by these subgroups
• If heterogeneity cannot be explained, proceed with caution with further statistical aggregation and subgroup analysis
Heterogeneity: How to Identify it
• Common sense
are the patients, interventions and outcomes in each of the included studies sufficiently similar
• Exploratory analysis of study-specific estimates
• Statistical tests
Heterogeneity: How to deal with it
Lau et al. 1997
• Subgroup analyses
subsets of trials
subsets of patients
SUBGROUPS SHOULD BE PRE-SPECIFIED TO AVOID BIAS
• Meta-regression
– relate size of effect to characteristics of the trials
Heterogeneity: Exploring it
Exploring Heterogeneity: subgroup analysis
Exploring Heterogeneity: subgroup analysis
Random Effects Model
• Assume true effect estimates really vary across studies
• Two sources of variation:
– within studies (between patients)
– between studies (heterogeneity)
• What the software does:
– revises weights to take into account both components of variation: weight = 1 / (variance + heterogeneity)
• When heterogeneity exists we get:
– a different pooled estimate (but not necessarily) with a different interpretation
– a wider confidence interval
– a larger p-value
Random Effects Model
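If the inter-study variance $\tau^2$ is known, then the MLE of $\theta$ is

$$\hat{\theta}_{mle}(\tau^2) = \frac{\sum_i W_i(\tau^2)\, Y_i}{\sum_i W_i(\tau^2)} \qquad \text{where } W_i(\tau^2) = \frac{1}{s_i^2 + \tau^2}$$

If $\tau^2$ is unknown, three common methods of inference can be used:
• Restricted Maximum Likelihood (REML)
• Bayesian
• Method of Moments (MOM)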
Method of Moments (Random effects model)
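$$\hat{\tau}_w^2 = \max\left\{0,\; \frac{\chi^2_{homog} - (k-1)}{\sum_i W_i - \sum_i W_i^2 / \sum_i W_i}\right\}$$

Study   Measure   Weight (FE)   Weight (RE)
1       Y_1       W_1           W_1* = (W_1^-1 + τ̂_w²)^-1
2       Y_2       W_2           W_2* = (W_2^-1 + τ̂_w²)^-1
.       .         .             .
.       .         .             .
k       Y_k       W_k           W_k* = (W_k^-1 + τ̂_w²)^-1

Overall Measure:
$$\hat{\theta}^* = \frac{\sum_i W_i^* Y_i}{\sum_i W_i^*}, \qquad se(\hat{\theta}^*) = \left(\frac{1}{\sum_i W_i^*}\right)^{1/2}$$

100(1-α)% CI:
$$\hat{\theta}^* \pm Z_{\alpha/2}\, se(\hat{\theta}^*)$$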
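A minimal sketch of the method-of-moments random-effects calculation follows, assuming the moment estimator of the between-study variance given by max{0, [Q - (k - 1)] / [ΣW - ΣW²/ΣW]}, with Q the homogeneity chi-square. The data are hypothetical, and the function name and returned keys are illustrative.

```python
import math


def random_effects_mom(estimates, std_errors, z=1.959964):
    """Method-of-moments random-effects pooling."""
    k = len(estimates)
    w = [1 / s**2 for s in std_errors]                      # fixed-effects weights
    sum_w = sum(w)
    pooled_fe = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w

    # Homogeneity statistic (Cochran's Q)
    q = sum(wi * (yi - pooled_fe)**2 for wi, yi in zip(w, estimates))

    # Moment estimate of the between-study variance, truncated at zero
    denom = sum_w - sum(wi**2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / denom)

    # Random-effects weights add tau2 to each within-study variance
    w_star = [1 / (s**2 + tau2) for s in std_errors]
    sum_w_star = sum(w_star)
    pooled_re = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum_w_star
    se_re = math.sqrt(1 / sum_w_star)

    return {
        "q": q,
        "tau2": tau2,
        "pooled": pooled_re,
        "se": se_re,
        "se_fixed": math.sqrt(1 / sum_w),
        "ci": (pooled_re - z * se_re, pooled_re + z * se_re),
        "share_fixed": [wi / sum_w for wi in w],            # relative weights, FE
        "share_random": [wi / sum_w_star for wi in w_star], # relative weights, RE
    }


# Hypothetical effect estimates and standard errors
re_fit = random_effects_mom([0.10, 0.30, 0.50], [0.10, 0.20, 0.10])
print(f"tau2 = {re_fit['tau2']:.3f}, pooled = {re_fit['pooled']:.3f}, se = {re_fit['se']:.3f}")
```

When the estimated between-study variance is zero the random-effects weights equal the fixed-effects weights, so the two analyses coincide.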
Effect of model choice on study weights
Larger studies receive proportionally less weight in RE model than in FE model
[Figure: Fixed vs Random Effects: Discrete Data (panels: Fixed Effects, Random Effects)]
[Figure: Fixed vs Random Effects: Continuous Data (panels: Fixed Effects, Random Effects)]
[Figure: Omission of Outlier - Chestnut Study]
Analysis
• Include all relevant and clinically useful measures of treatment effect
• Perform a narrative, qualitative summary when data are too sparse, of too low quality or too heterogeneous to proceed with a meta-analysis
• Specify if fixed or random effects model is used
• Describe proportion of patients used in final analysis
• Use confidence intervals
• Include a power analysis
• Consider cumulative meta-analysis (by order of publication date, baseline risk, study quality) to assess the contribution of successive studies
Subgroup Analyses
• Pre-specify hypothesis-testing subgroup analyses and keep few in number
• Label all a posteriori subgroup analyses
• When subgroup differences are detected, interpret in light of whether they are:• established a priori
• few in number
• supported by plausible causal mechanisms
• important (qualitative vs quantitative)
• consistent across studies
• statistically significant (adjusted for multiple testing)
Sensitivity Analyses
• Test robustness of results relative to key features of the studies and key assumptions and decisions
• Include tests of bias due to retrospective nature of systematic reviews (e.g. with/without studies of lower methodologic quality)
• Consider fragility of results by determining effect of small shifts in number of events between groups
• Consider cumulative meta-analysis to explore relationship between effect size and study quality, control event rates and other relevant features
• Test a reasonable range of values for missing data from studies with uncertain results
Funnel Plot
• Scatterplot of effect estimates against sample size
• Used to detect publication bias
• If no bias, expect symmetric, inverted funnel
• If bias, expect asymmetric or skewed shape
[Figure: two schematic funnel plots; the asymmetric one is a suggestion of missing small studies]
[Figure: Funnel Plot Example 1: Prophylaxis of NSAID induced Gastric Ulcers (intervention: H2-Blockers); horizontal axis: Effect Size (RR); vertical axis: Sample Size]
[Figure: Funnel Plot Example 2: Alendronate for Postmenopausal Osteoporosis; horizontal axis: Weighted Mean Difference (WMD of % change in lumbar bone mineral density); vertical axis: Sample Size]
Presentation of Results
• Include a structured abstract
• Include a table of the key elements of each study
• Include summary data from which the measures are computed
• Employ informative graphic displays representing confidence intervals, group event rates, sample sizes etc.
Interpretation of Results
• Interpret results in context of current health care
• State methodologic limitations of studies and review
• Consider size of effect in studies and review, their consistency and presence of dose-response relationship
• Consider interpreting results in context of temporal cumulative meta-analysis
• Interpret results in light of other available evidence
• Make recommendations clear and practical
• Propose future research agenda (clinical and methodological requirements)
Generic Inferential Framework
Generic inferential framework
(1) Conceptually, think of a 'generic' effect size statistic T
(2) corresponding effect size parameter θ
(3) associated standard error SE(T), square root of variance
(4) for some effect sizes, some suitable transformation may be needed to make inference based on normal distribution theory
Generic inferential framework ...
(A) Fixed-Effects Model (FEM):
– Assume a common effect size
– Obtain average effect size as a weighted mean (unbiased)
• Optimal weight is reciprocal of variance (inverse variance weighted method)
Generic inferential framework ...
• Variances inversely proportional to within-study sample sizes
– what is the effect of larger studies in calculating weights?
– may also weight by 'quality' index, q, scaled from 0 to 1
Generic inferential framework ...
• Average effect size has conditional variance (a function of conditional variances of each effect size, quality index, …)
– e.g. V = 1/total weight
• Multiply the resulting standard error by appropriate critical value (1.96, 2.58, 1.645)
• Construct confidence interval and/or test statistic
Generic inferential framework ...
• Test the homogeneity assumption using a weighted effect size sums of squares of deviations, Q
• If Q exceeds the critical value of chi-square at k-1 d.f. (k = number of studies), then observed between-study variance significantly greater than what would be expected under the null hypothesis
Generic inferential framework ...
• When within-study sample sizes are very large, homogeneity may be rejected by the Q test even when individual effect size estimates do not differ much
• One can take different courses of action when homogeneity is rejected (see next page)
Generic inferential framework ...
• Methodologic choices in dealing with ‘heterogeneous’ data
Generic inferential framework ...
(B) Random-Effects Model (REM):
– Total variability of an observed study effect size reflects within and between variance (extra variance component)
– If between-studies variance is zero, equations of REM reduce to those of FEM
– Presence of a variance component which is significantly different from zero may be indicative of REM
Generic inferential framework ...
• Once significance of variance component is established (e.g. Q test for homogeneity of effect size),
– its magnitude should be estimated
– variance components can be estimated in many ways!
• the most commonly used method is the so-called DerSimonian-Laird method, which is based on a method-of-moments approach
– Compute random effects weighted mean as an estimate of the average of the random effects in the population
– construct confidence interval and conduct hypothesis tests as before (new variance and thus new weights!!!)
Correlation Coefficient
Example: Correlation coefficient
• A measure of association more popular in cross-sectional observational studies than in RCTs is Pearson's correlation coefficient, r, given by

$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}}$$

• X and Y must be continuous (e.g. blood pressure and weight)
• r lies between -1 and 1
• not available in RevMan / MetaView at this time
Correlation coefficient (cont’d)
• Following the generic framework discussed earlier:
– the effect size statistic is r
– the corresponding effect size parameter is the underlying population correlation coefficient, ρ
– in this case, a suitable transformation is needed to achieve approximate normality of effect size
– inference is conducted on the scale of the transformed variable and final results are back-transformed to the original scale
Correlation coefficient (cont'd)
Assuming X and Y have a bivariate normal distribution, the Fisher's Z transformed variable

$$Z = \frac{1}{2}\log\frac{1+r}{1-r}$$

has, for large samples, an approximate normal distribution with mean

$$\frac{1}{2}\log\frac{1+\rho}{1-\rho}$$

and variance

$$Var(Z) = \frac{1}{n-3}$$

Hence, the weighting factor associated with Z is W = 1/Var(Z) = n - 3.
Correlation coefficient (cont’d)
• meta-analysis is carried out on Z-transformed measures and final results are transformed back to the scale of correlation using

$$r = \frac{e^{2Z} - 1}{e^{2Z} + 1}$$
Numerical Example
• Source: Fleiss J., Statistical Methods in Medical Research 1993; 2: 121 -- 145.
• correlation coefficients reported by 7 independent studies in education are included in the meta-analysis
• Comparison: association between a characteristic of the teacher and the mean measure of his or her student’s achievement
Study   n     r        Z*       W**   WZ       WZ²
1       15    -0.073   -0.073   12    -0.876   0.064
2       16    0.308    0.318    13    4.134    1.315
3       15    0.481    0.524    12    6.288    3.295
4       16    0.428    0.457    13    5.941    2.715
5       15    0.180    0.182    12    2.184    0.397
6       17    0.290    0.299    14    4.186    1.252
7       15    0.400    0.424    12    5.088    2.157
Sum                             88    26.945   11.195
*Z = Fisher's Z-transformation of r
**W = n - 3

$$Q = \sum_i W_i (Z_i - \bar{Z})^2 = \sum_i W_i Z_i^2 - \frac{\left(\sum_i W_i Z_i\right)^2}{\sum_i W_i} = 11.195 - \frac{(26.945)^2}{88} = 2.94$$
Example: Fleiss (1993)
Q = 2.94 on 6 df is not statistically significant.
Results and discussions
• No evidence for heterogeneous association across studies
• Fixed effect analysis may be undertaken
• Questions:
– Would a random effect analysis as shown earlier produce a different numerical value for the combined correlation coefficient?
– How would the weights be modified to carry out a REM?
Results and discussions (cont’d)
• the weighted mean of Z is

$$\bar{Z} = \frac{\sum_i W_i Z_i}{\sum_i W_i} = \frac{26.945}{88} = 0.306$$

• the approximate standard error of the combined mean is

$$SE(\bar{Z}) = \sqrt{\frac{1}{\sum_i W_i}} = \sqrt{\frac{1}{88}} = 0.107$$
Results and discussions (cont’d)
• Test of significance is carried out using

$$z = \frac{\bar{Z}}{SE(\bar{Z})} = \frac{0.306}{0.107} = 2.86$$

– this value exceeds the critical value 1.96 (corresponding to 5% level of significance), so we conclude that average value of Z (hence the average correlation) is statistically significant
Results and discussions (cont’d)
• 95% confidence interval for the population mean of Z is

$$\bar{Z} \pm 1.96\, SE(\bar{Z}): \quad 0.096 \text{ to } 0.516$$

• Transforming back to the original scale, a 95% CI for the parameter of interest, ρ, is

$$0.096 \le \rho \le 0.474$$

– again confirming a significant association
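The whole worked example can be retraced with a short sketch using the sample sizes and correlations from the table. Because the code keeps full precision where the slides round Z to three decimals, results agree with the slides only to about two decimal places. Variable names are illustrative.

```python
import math

# Sample sizes and correlation coefficients for the 7 studies in the table
n = [15, 16, 15, 16, 15, 17, 15]
r = [-0.073, 0.308, 0.481, 0.428, 0.180, 0.290, 0.400]

# Fisher's Z transformation: Z = 0.5 * log((1 + r) / (1 - r)) = atanh(r)
z_values = [math.atanh(ri) for ri in r]
weights = [ni - 3 for ni in n]                  # W = 1 / Var(Z) = n - 3
sum_w = sum(weights)

# Fixed-effects weighted mean of Z and its standard error
z_bar = sum(w * z for w, z in zip(weights, z_values)) / sum_w
se_z_bar = 1 / math.sqrt(sum_w)

# Homogeneity statistic on k - 1 = 6 degrees of freedom
q = sum(w * (z - z_bar) ** 2 for w, z in zip(weights, z_values))

# Test of significance and 95% CI on the Z scale
z_stat = z_bar / se_z_bar
lower_z, upper_z = z_bar - 1.96 * se_z_bar, z_bar + 1.96 * se_z_bar

# Back-transform to the correlation scale: r = (e^{2Z} - 1) / (e^{2Z} + 1) = tanh(Z)
r_pooled = math.tanh(z_bar)
r_lower, r_upper = math.tanh(lower_z), math.tanh(upper_z)

print(f"Q = {q:.2f}, mean Z = {z_bar:.3f}, SE = {se_z_bar:.3f}, z = {z_stat:.2f}")
print(f"pooled r = {r_pooled:.3f}, 95% CI {r_lower:.3f} to {r_upper:.3f}")
```

Using `math.atanh` and `math.tanh` is equivalent to the log and exponential forms of the transformation and its inverse.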
Critical Appraisal of a
Systematic Review
(A) The Message
• Does the review set out to answer a precise question about patient care?
– Should be different from an uncritical encyclopedic presentation
(B) The Validity
• Have studies been sought thoroughly:
– Medline and other relevant bibliographic databases
– Cochrane controlled clinical trials register
– Foreign language literature
– "Grey literature" (unpublished or un-indexed reports: theses, conference proceedings, internal reports, non-indexed journals, pharmaceutical industry files)
– Reference chaining from any articles found
– Personal approaches to experts in the field to find unpublished reports
– Hand searches of the relevant specialized journals
Validity (cont’d)
• Have inclusion and exclusion criteria for studies been stated explicitly, taking account of the patients in the studies, the interventions used, the outcomes recorded and the methodology?
Validity (cont’d)
• Have the authors considered the homogeneity of the studies: the idea that the studies are sufficiently similar in their design, interventions and subjects to merit combination?
– this is done either by eyeballing graphs like the forest plot or by application of chi-square tests (Q test)
(C) The Utility
• The various studies may have used patients of different ages or social classes, but if the treatment effects are consistent across the studies, then generalisation to other groups or populations is more justified.
Utility (cont’d)
• Be wary of sub-group analyses where the authors attempt to draw new conclusions by comparing the outcomes for patients in one study with the patients in another study
– Be wary of "data-dredging" exercises, testing multiple hypotheses against the data, especially if the hypotheses were constructed after the study had begun data collection.
Utility (cont’d)
• One may also want to ask: Were all clinically important outcomes considered?
Are the benefits worth the harms and costs?