22 | September 2016 Medical Writing | Volume 25 Number 3
A medical writer's guide to meta-analysis

Adam Jacobs, Premier Research, Wokingham, UK
Correspondence to:
Adam Jacobs
Senior Principal Statistician
Premier Research
Fishponds Road, Wokingham, RG41
[email protected]
Abstract
Meta-analysis is a statistical technique for summarising the results of multiple studies in a quantitative manner. It should not be confused with a systematic review, though in practice the two are often found together. The main pitfalls with meta-analyses are being sure that the studies being combined are similar enough that it makes sense to combine them, and being sure that all relevant studies have been included.
Meta-analysis is a statistical technique for combining the results of more than one study. It should be immediately obvious how useful this is: it is very rare that a single study gives us a definitive answer in medicine. To get a good idea of whether an intervention works to treat or prevent disease, or whether a particular environmental factor is associated with an increased risk of disease, for example, it is frequently necessary to take account of many studies to get a better overall picture.
By combining studies in this way, not only can we reduce the risk of being fooled by a study with unusual results arising from a statistical fluke or bad study design, we can also get more precise estimates of the magnitude of effects. It is entirely possible, for example, that several individual studies have looked at a particular intervention but been underpowered to detect its effects, and each of them alone failed to find a significant effect; yet if you combine all the studies in a meta-analysis, you could find that a statistically significant overall effect can be confirmed.
Meta-analysis should not be confused with systematic review, although the two often go together. A systematic review is an attempt to find and review the entirety of the literature on a particular topic using a thorough literature search, often looking for unpublished as well as published studies. This guards against any cherry-picking (at least in theory) and ensures that decisions are made on the totality of evidence.
Often, a systematic review will include a meta-analysis. Once all the relevant studies have been identified, their results can be combined using a meta-analysis to give a numerical summary. However, it is possible to do a systematic review without a meta-analysis: typically, results will be presented in narrative form with no attempt made to produce a precise numerical summary of the results. This might be done, for example, if all the studies identified had such different methods, interventions, or study populations that trying to combine them into a single estimate does not make sense.
Equally, it is possible to do a meta-analysis without a systematic review. Sometimes studies may be chosen in a non-systematic way and yet still combined in a meta-analysis. Obviously, when interpreting the results of such an analysis it is important to ask questions about what other studies might exist and why they were not included, but there may sometimes be legitimate reasons for meta-analysis of data that have not been chosen through the methods of systematic review.

I say that "in theory" a systematic review guards against cherry-picking, but in practice a systematic review is not an absolute guarantee. An important process in a systematic review is setting the inclusion criteria for the studies that will be included. There are no hard and fast rules about what inclusion criteria should be, and some judgement is always required. For example, do you require a minimum sample size for each study, and if so, what size? Will you include just trials against placebo or also trials against active comparators? Will you only include randomised trials or will you also include observational research? Should there be a minimum study duration? Will you include studies on all patients with cancer, all patients with advanced cancer, or only those with confirmed metastatic disease? The possibilities are endless, and there are no right answers: the best choice will depend very much on individual circumstances.
And here is the problem. If you know the literature in a particular area well – as many systematic reviewers do – you will know what the important studies are. You will therefore know, when you decide on your inclusion criteria, that a particular choice of inclusion criteria will exclude specific studies that you already know about. If you have an agenda, then you can still cherry-pick your data subtly by choosing inclusion criteria to exclude the studies that you don't like. So even when a systematic review has been conducted thoroughly and scrupulously in accordance with its inclusion criteria, there is still no guarantee that all relevant trials have been included. It's always worth reading the inclusion criteria carefully and making your own mind up about how reasonable they are.
One of the most important decisions for the meta-analyst is when it makes sense to combine data and when it doesn't. By combining a wide range of studies you can get apparently more statistically precise estimates, as you have more data. However, that statistical precision may be illusory. If you are investigating the efficacy of a particular treatment in different study populations, for example, an overall estimate may conceal the fact that the treatment works really well in some patients and is harmful in others. So when looking at a meta-analysis it is always worth looking at the detail of the individual studies and asking if they are investigating the same thing. If they are not, then an overall estimate may be meaningless.
Happily, this question of how comparable different studies are can be investigated statistically. A good meta-analyst will look for a measure of heterogeneity among the studies. It is expected that not all studies will give exactly the same result, simply because of normal random variation, but do the studies vary more than would be expected by chance? That's a simple question to ask, though not so simple to answer. Although it is possible to perform a simple statistical test and calculate a P value, where a significant P value shows significant heterogeneity, the results of such a test are not straightforward to interpret, as there is a high risk of both false positive and false negative conclusions.
Higgins et al.1 have proposed an alternative approach to quantifying heterogeneity, by calculating a measure known as the I² statistic, where 0% means that the studies are all identical and higher values (with a maximum of 100%) show increasing heterogeneity.
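As a sketch of how these quantities are computed: Cochran's Q compares each study with the pooled estimate, and I² then expresses the excess variability as a percentage. The study estimates below are hypothetical log relative risks with standard errors, invented purely for illustration.

```python
def q_and_i_squared(estimates, std_errors):
    """Return Cochran's Q and the I-squared statistic (%) for a set of studies."""
    weights = [1 / se**2 for se in std_errors]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    # Q: weighted squared deviations of each study from the pooled estimate
    q = sum(w * (y - pooled)**2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    # I-squared: proportion of variability beyond chance, floored at 0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2

# Four hypothetical studies (log relative risk, standard error)
q, i2 = q_and_i_squared([0.10, -0.05, 0.40, 0.02], [0.10, 0.12, 0.11, 0.15])
print(f"Q = {q:.2f}, I2 = {i2:.0f}%")
```

Here the studies vary more than chance alone would suggest, so I² lands well above zero.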
If you observe substantial heterogeneity, then it is reasonable to question the relevance of an overall estimate.
If you are looking at meta-analysis results you will come across things called "fixed effects estimates" and "random effects estimates". These are alternative statistical approaches for combining multiple studies, and are based on different assumptions.
The fixed effects method makes the assumption that there is no important heterogeneity, and that all studies are essentially measuring the same thing. In other words, it assumes that any differences in estimates of treatment effects from one study to the next are due purely to random statistical variability. If you observe that heterogeneity is indeed low, then the fixed effects estimate gives you a good summary of the results.
The random effects method assumes that heterogeneity is present: the differences among studies are due partly to random statistical variability, but also to differences in the "true" treatment effect that each study is measuring, as it is not assumed that all studies are measuring the same thing. In practical terms, the main difference between the two methods is that random effects estimation gives more weight to small studies whose results differ from the average effect.
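The contrast between the two approaches can be sketched in a few lines. The fixed effects estimate below uses standard inverse-variance weighting; for random effects I have used one common method (DerSimonian–Laird), though others exist. The study estimates are hypothetical log relative risks, invented for illustration.

```python
def fixed_effects(estimates, std_errors):
    """Inverse-variance (fixed effects) pooled estimate."""
    w = [1 / se**2 for se in std_errors]
    return sum(wi * y for wi, y in zip(w, estimates)) / sum(w)

def random_effects(estimates, std_errors):
    """DerSimonian-Laird (random effects) pooled estimate."""
    w = [1 / se**2 for se in std_errors]
    fixed = fixed_effects(estimates, std_errors)
    # Estimate tau^2, the between-study variance, from Cochran's Q
    q = sum(wi * (y - fixed)**2 for wi, y in zip(w, estimates))
    df = len(estimates) - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    # Adding tau^2 to each study's variance deflates big studies' weights,
    # so small studies count for relatively more
    w_star = [1 / (se**2 + tau2) for se in std_errors]
    return sum(wi * y for wi, y in zip(w_star, estimates)) / sum(w_star)

ests, ses = [0.10, -0.05, 0.40, 0.02], [0.10, 0.12, 0.11, 0.15]
fe, re = fixed_effects(ests, ses), random_effects(ests, ses)
```

With heterogeneous inputs like these, the two pooled estimates differ, which is exactly the signal that the choice of model matters.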
Interpreting the results of random effects meta-analyses is, as mentioned above, difficult. Although a random effects analysis gives you an estimate of the average effect, that treatment effect may depend on specific characteristics of the studies. If you want to apply the results to a real life situation, there is no guarantee that you will be applying it in an average situation. Your situation may match some studies far better than others.
For example, some studies may have used different doses. You may find that the high dose studies give greater treatment effects than the low dose studies. The relevant estimate is therefore not an average, but the treatment effect for the dose level that you are interested in. That's a fairly obvious example, but there can be many other more subtle factors that can affect treatment effects, such as the inclusion criteria for the study, treatment duration, concomitant medications, healthcare setting, etc.
One way to deal with the problem of heterogeneity is to determine the major cause of heterogeneity and to present separate estimates for different groups. For example, Annane et al.2 did a systematic review and meta-analysis to investigate the effects of corticosteroids on overall mortality at 28 days in patients with severe sepsis and septic shock. Their overall meta-analysis did not find a significant effect on mortality (relative risk 0.92, 95% confidence interval 0.75 to 1.14, P = 0.46), but it did find significant heterogeneity (I² = 58%, P = 0.003). When they divided the studies into those that had used long courses of low dose corticosteroids and those that had used short courses of high dose corticosteroids, they found that there was indeed a significant reduction in mortality in the studies that had used long courses of low doses (relative risk 0.80, 95% confidence interval 0.67 to 0.95, P = 0.01), but not in the studies with short courses of high doses. Ignoring the heterogeneity would have meant missing the important difference between the different dosing regimens.
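The mechanics of presenting separate estimates by subgroup can be sketched in a few lines. To be clear, the subgroup labels, log relative risks, and standard errors below are invented for illustration, not the actual trials from Annane et al.'s review.

```python
import math

# Hypothetical studies: (subgroup label, log relative risk, standard error)
studies = [
    ("low_long",  -0.25, 0.12),
    ("low_long",  -0.18, 0.15),
    ("high_short", 0.05, 0.10),
    ("high_short", 0.12, 0.20),
]

def pooled_rr(rows):
    """Inverse-variance pooled relative risk with a 95% CI: (RR, lower, upper)."""
    w = [1 / se**2 for _, _, se in rows]
    log_rr = sum(wi * y for wi, (_, y, _) in zip(w, rows)) / sum(w)
    se_pooled = 1 / math.sqrt(sum(w))
    return tuple(math.exp(log_rr + z * se_pooled) for z in (0, -1.96, 1.96))

# Pool each subgroup separately rather than forcing one overall estimate
by_group = {}
for g, y, se in studies:
    by_group.setdefault(g, []).append((g, y, se))
results = {g: pooled_rr(rows) for g, rows in by_group.items()}
```

In this invented example the "low_long" subgroup shows a clear benefit while "high_short" does not, which an overall pooled estimate would have blurred together.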
That said, use of corticosteroids in sepsis is complex and controversial, and Annane et al.'s analysis is unlikely to be the last word. Although a meta-analysis can give more reliable results than a single study, even a meta-analysis is often not sufficient to settle a medical question once and for all. There is probably considerably more heterogeneity that needs to be unpicked in this case, including genetic features of the patient and the nature of the infecting organism.3
One very common way in which the results of meta-analyses are presented is with a graph known as a forest plot. The example in Figure 1 is typical.
This shows the results of a meta-analysis on the effects on coronary heart disease (CHD) of increasing polyunsaturated fat in place of saturated fat.4 There is a lot of information in that one graph. We can see details of each study, including the name of the study, the number of patients, and the number of CHD events. We also see how extensive the dietary changes were in each study, as figures for % polyunsaturated fatty acid consumption in the control and intervention groups. We then see the results presented both graphically and in text. The central blob of each line shows the estimated relative risk from each study, and the extent of the horizontal line shows the 95% confidence interval. The size of the central blob shows how much weight the study provides (mainly a function of the number of patients in each study): the bigger the blob, the more that study contributes to the overall analysis. We then get the same information in text form to the right of the graph.
At the bottom, we see the overall estimate. Again, we see the relative risk and its confidence interval, presented both graphically and in text form. That's the important number to take away from a meta-analysis, though as stated previously, it may be hard to interpret in the presence of significant heterogeneity among studies. The forest plot gives us another means of assessing heterogeneity by simply eyeballing the spread of the estimates from the individual studies.
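The ingredients of each forest plot row (point estimate, 95% confidence interval, and weight) are simple to compute. The studies below are hypothetical, not those in Figure 1; effects are worked on the log scale and exponentiated back to relative risks, which is the usual convention for ratio measures.

```python
import math

# Hypothetical studies: (name, log relative risk, standard error of log RR)
studies = [
    ("Study A", -0.22, 0.10),
    ("Study B", -0.05, 0.20),
    ("Study C", -0.35, 0.15),
]

rows = []
for name, log_rr, se in studies:
    rr = math.exp(log_rr)
    # 95% CI: point estimate +/- 1.96 standard errors, back-transformed
    lo, hi = math.exp(log_rr - 1.96 * se), math.exp(log_rr + 1.96 * se)
    weight = 1 / se**2  # inverse-variance weight: the bigger, the bigger the blob
    rows.append((name, rr, lo, hi, weight))

for name, rr, lo, hi, w in rows:
    print(f"{name}: RR {rr:.2f} (95% CI {lo:.2f} to {hi:.2f}), weight {w:.0f}")
```

Printed as aligned rows, this is essentially the text column of a forest plot; the graphical column is just these same numbers drawn as blobs and horizontal lines.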
Figure 1: Forest plot of the effects of replacing saturated fat with polyunsaturated fat on coronary heart disease. Source: Mozaffarian et al.4 Abbreviations: CHD: coronary heart disease; MI: myocardial infarction; PUFA: polyunsaturated fatty acid; RR: relative risk

Lastly, no discussion of meta-analyses would be complete without a few words about publication bias. Meta-analyses will never give a true summary of all the research that has been done if some studies are excluded. We know that not all studies are published. The claim by the All Trials campaign that only 50% of studies are published is of course nonsense and the real figure is probably much higher,5 but nonetheless, the proportion of trials that are published is certainly less than 100%, and we know that studies reporting negative results are less likely to be published than positive studies.6 If a meta-analysis includes only positive trials and ignores negative ones, then it will give an over-optimistic estimate of the true treatment effect.

Figure 2: Hypothetical symmetric funnel plot. Source: Wikipedia
A careful meta-analyst will therefore try to tell whether there is any evidence that publication bias has occurred. One way to do this is with a funnel plot, in which the treatment effect of individual studies is plotted on the x axis against the size of the study on the y axis. If all studies are published, the results should look roughly like an inverted funnel, with a greater spread of studies towards the bottom of the plot, where small sample sizes mean that considerable variation in results is likely, and a smaller spread towards the top, where large sample sizes would keep results close to the "true" result (Figure 2).
If there is publication bias, it is likely that small negative studies will be unpublished, whereas small positive studies will be published. Large studies are more likely to be published whatever they show, as once you've gone to all the trouble of doing a large study you are more likely to be motivated to write it up. This can give rise to asymmetry in the funnel plot. Figure 3 shows one example of an asymmetric funnel plot.

I created this funnel plot from data provided in a Cochrane review of the effect of pharmaceutical industry sponsorship on publications.7 The reviewers claimed that trials sponsored by pharmaceutical companies were more likely to be favourable to the sponsor's product than independent studies. Certainly the results of their meta-analysis showed that very strongly, but how much can we trust that result with such strong evidence of publication bias?
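This distorting effect of publication bias is easy to demonstrate by simulation. The sketch below invents studies around a true effect of zero and then "publishes" only the small studies that happen to be positive; every number here is simulated, not taken from the Cochrane review.

```python
import random

random.seed(42)  # fixed seed so the simulation is reproducible
published = []
for _ in range(500):
    n = random.choice([20, 20, 20, 2000])  # mostly small studies, some large
    se = 1 / n**0.5                        # standard error shrinks with study size
    effect = random.gauss(0.0, se)         # the true effect is zero
    small, positive = n == 20, effect > 0
    if not small or positive:              # small negative studies go unpublished
        published.append((effect, se))

# Compare the published small and large studies: the small ones are biased upwards
small_pub = [e for e, se in published if se > 0.1]
large_pub = [e for e, se in published if se <= 0.1]
bias_small = sum(small_pub) / len(small_pub)  # pushed well above zero
bias_large = sum(large_pub) / len(large_pub)  # stays close to zero
```

Plotting `published` as effect against 1/standard error would give exactly the lopsided funnel described above: the bottom-left corner, where small negative studies should sit, is empty.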
Meta-analysis is undoubtedly a useful technique that can provide important insights when summarising the medical literature. However, it is not a magic bullet, and must be interpreted with the same caution you would apply to any other results. Obviously, if a meta-analysis is based on poor quality studies, the result will also be questionable. But in addition, it is important to be aware of whether the studies are sufficiently similar that a meta-analysis makes sense, and crucially, whether all relevant studies have been included.
References
1. Higgins JPT, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ. 2003;327:557–60.
2. Annane D, Bellissant E, Bollaert E, Briegel J, Keh D, Kupfer Y. Corticosteroids for severe sepsis and septic shock: a systematic review and meta-analysis. BMJ. 2004;329(7464):480.
3. Morgan, Paul (@drpaulmorgan). "@statsguyuk Very! The host-response mechanism is a mix of genetics (cytokine response) and the individual bug. Lots of interesting research!" [Tweet; 10 June 2016, 8:12 AM]. Available from: https://twitter.com/drpaulmorgan/status/741166150147538944.
4. Mozaffarian D, Micha R, Wallace S. Effects on Coronary Heart Disease of Increasing Polyunsaturated Fat in Place of Saturated Fat: A Systematic Review and Meta-Analysis of Randomized Controlled Trials. PLoS Med. 2010;7(3):e1000252.
5. Jacobs A. Zombie statistics on half of all clinical trials unpublished. [Blog post; 29 August 2015]. Available from: http://www.statsguy.co.uk/zombie-statistics-on-half-of-all-clinical-trials-unpublished.
6. Song F, Parekh S, Hooper L, Loke YK, Ryder J, Sutton AJ, et al. Dissemination and publication of research findings: an updated review of related biases. Health Technol Assess. 2010;14(8).
7. Jacobs A. Cochrane review on industry sponsorship. [Blog post; 20 December 2012]. Available from: http://dianthus.co.uk/cochrane-review-on-industry-sponsorship.
Author information
Adam Jacobs was previously a medical writer, and was president of EMWA from 2004 to 2005. He now works as a medical statistician at Premier Research. He still teaches regular workshops for EMWA on statistical topics. You can find him on Twitter at @statsguyuk.
Figure 3: Funnel plot of studies investigating the link between industry sponsorship and results favourable to the sponsor's product (x axis: risk ratio; y axis: 1/standard error of RR). Abbreviations: RR: relative risk
Figure 3: Funnel plot of studies investigating link between industry sponsorship andresults favourable to the sponsor’s product. Abbreviations: RR: relative risk