ORBITA and coronary stents: A case study in the analysis and reporting of clinical trials

Andrew Gelman, John B. Carlin and Brahmajee K Nallamothu

25 Mar 2019

Department of Statistics and Political Science, Columbia University, New York City, NY, United States (Andrew Gelman, professor); Clinical Epidemiology & Biostatistics, Murdoch Children’s Research Institute, Melbourne School of Population and Global Health and Department of Paediatrics, University of Melbourne, Melbourne, Australia (John Carlin, professor); Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, MI, United States (Brahmajee K Nallamothu, professor)

Correspondence to: Brahmajee K Nallamothu [email protected]

Acknowledgements: We thank Doug Helmreich for bringing this example to our attention, Shira Mitchell for helpful comments, and the Office of Naval Research, Defense Advanced Research Project Agency, and the National Institutes of Health for partial support of this work.

Competing interests: Dr. Gelman and Dr. Carlin report no competing interests. Dr. Nallamothu is an interventional cardiologist and Editor-in-Chief of a journal of the American Heart Association but otherwise has no competing interests.

Word Count: 3078
Page 1: ORBITA and coronary stents: A case study in the analysis ...gelman/research/published/Stents...2019/03/25  · ORBITA and coronary stents: A case study in the analysis and reporting



1. Introduction

Al-Lamee et al. (2017) report results from a randomized controlled trial of percutaneous coronary intervention using coronary stents for stable angina. The study, called ORBITA (Objective Randomised Blinded Investigation With Optimal Medical Therapy of Angioplasty in Stable Angina), included approximately 200 patients and was notable for being a blinded experiment in which half the patients received stents and half received a placebo procedure in which a sham operation was performed. In follow-up, patients were asked to guess their treatment, and of those who were willing to guess only 56% guessed correctly, indicating that the blinding was largely successful.

The summary finding from the study was that stenting did not “increase exercise time by more than the effect of a placebo procedure” with the mean difference in this primary outcome between treatment and control groups reported as 16.6 seconds with a standard error of 9.8 (95% confidence interval, −8.9 to +42.0 s) and a p-value of 0.20. In the New York Times, Kolata (2017) reported the finding as “unbelievable,” remarking that it “stunned leading cardiologists by countering decades of clinical experience.” Indeed, one of us (BKN) was quoted as being humbled by the finding, as many cardiologists had expected a positive result. On the other hand, Kolata noted, “there have long been questions about [stents’] effectiveness.” At the very least, the willingness of doctors and patients to participate in a controlled trial with a placebo procedure suggests some degree of existing skepticism and clinical equipoise.

ORBITA was a landmark trial due to its innovative use of blinding for a surgical procedure. However, substantial questions remain regarding the role of stenting in stable angina. It is a well-known statistical fallacy to take a result that is not statistically significant and report it as zero, as was essentially done here based on the p-value of 0.20 for the primary outcome. Had this comparison happened to produce a p-value of 0.04, would the headline have been, “Confirmed: Heart Stents Indeed Ease Chest Pain”? A lot of certainty seems to be hanging on a small bit of data.

The purpose of this paper is to take a closer look at the lack of statistical significance in ORBITA and the larger questions this trial raises about statistical analyses, statistically based decision making, and the reporting of clinical trials. This review of ORBITA is particularly timely in the context of the widely publicized statement released by the American Statistical Association that cautioned against the use of sharp thresholds for the interpretation of p-values (Wasserstein and Lazar, 2016) and more recent extensions of this advice by ourselves and others (Amrhein, Greenland, and McShane, 2019, McShane et al., 2019). We end by offering potential recommendations to improve reporting.

2. Statistical analysis of the ORBITA trial

Adjusting for baseline differences. In ORBITA, exercise time in a standardized treadmill test (the primary outcome in the preregistered design) increased on average by 28.4 s in the treatment group compared to an increase of only 11.8 s in the control group. As noted above, this difference was not statistically significant at a significance threshold of 0.05. Following conventional rules of scientific reporting, the true effect was treated as zero, an instance of the regrettably common statistical fallacy of presenting non-statistically-significant results as confirmation of the null hypothesis of no difference.

However, the estimate using gain in exercise time does not make full use of the data that were available on differences between the comparison groups at baseline (Vickers and Altman, 2001, Harrell, 2017a). As can be seen in the Table, the treatment and placebo groups differ in their pre-treatment levels of exercise time, with mean values of 528.0 and 490.0 s, respectively. This sort of difference is no surprise (randomization assures balance only in expectation), but it is important to adjust for this discrepancy in estimating the treatment effect. In the published paper, the adjustment was performed by simple subtraction of the pre-treatment values:

Gain-score estimated effect: $(y_{\text{post}} - y_{\text{pre}})^{T} - (y_{\text{post}} - y_{\text{pre}})^{C}$, (1)

But this over-corrects for differences in pre-test scores, because of the familiar phenomenon of “regression to the mean”: just from natural variation, we would expect patients with lower scores at baseline to improve, relative to the average, and patients with higher scores to regress downward. The optimal linear estimate of the treatment effect is actually:

Adjusted estimate: $(y_{\text{post}} - \beta y_{\text{pre}})^{T} - (y_{\text{post}} - \beta y_{\text{pre}})^{C}$, (2)

where β is the coefficient of $y_{\text{pre}}$ in a least-squares regression of $y_{\text{post}}$ on $y_{\text{pre}}$, also controlling for the treatment indicator. The estimate in (1) is a special case of the regression estimate (2) corresponding to β = 1. Given that the pre-test and post-test measurements are positively correlated and have nearly identical variances (as can be seen in the Table), we can anticipate that the optimal β will be less than 1, which will reduce the correction for difference in pre-test and thus increase the estimated treatment effect while also decreasing the standard error. As a result, an adjusted analysis of these data would be expected to produce a lower p-value.
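One way to see why a coefficient below 1 raises the estimate in this trial is to rewrite the adjusted estimate in terms of the gain-score estimate. The identity below is a sketch that reads the quantities in (1) and (2) as group means and uses only the definitions given above.

```latex
\begin{aligned}
(y_{\text{post}} - \beta y_{\text{pre}})^{T} - (y_{\text{post}} - \beta y_{\text{pre}})^{C}
&= \underbrace{(y_{\text{post}} - y_{\text{pre}})^{T} - (y_{\text{post}} - y_{\text{pre}})^{C}}_{\text{gain-score estimate (1)}}
 \;+\; (1 - \beta)\,\bigl(y_{\text{pre}}^{T} - y_{\text{pre}}^{C}\bigr)
\end{aligned}
```

Because the treatment group started 528.0 − 490.0 = 38.0 s higher at baseline, any β below 1 adds a positive amount, (1 − β) × 38.0 s, to the gain-score estimate.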


The adjusted regression analysis can be done using the information available in the Table, as explained in detail in Box 1. The p-value from this adjusted analysis is 0.09: as anticipated, lower than the p = 0.20 from the unadjusted analysis.

Alternative reporting. Despite moving closer to the conventional 0.05 threshold, the p-value of 0.09 remains above the traditional level of significance at which one is taught to reject the null hypothesis. A potential blockbuster reversal with an adjusted analysis (“Statistical Sleuths Turn Reported Null Effect into a Statistically Significant Effect”) does not quite materialize.

Yet within different conventions for scientific reporting, this experiment could have been presented as positive evidence in favor of stents. In some settings, a p-value of 0.09 is considered to be statistically significant; for example, in a recent social science experiment published in the Proceedings of the National Academy of Sciences, Sands (2017) presented a causal effect based on a p-value of less than 0.10, and this was enough for publication in a top journal and in the popular press, with, for example, that work mentioned uncritically in the media outlet Vox without any concern regarding significance levels (Resnick, 2017). By contrast, Vox reported that ORBITA showed stents to be “dubious treatments,” a prime example of the “epidemic of unnecessary medical treatments” (Belluz, 2017). Had Al-Lamee et al. performed the adjusted analysis with their data and published in PNAS rather than the Lancet, could they have confidently reported a causal effect of stents on exercise time?

Our point here is not at all to suggest that Al-Lamee et al. engaged in reverse “p-hacking” (Simmons, Nelson, and Simonsohn, 2011), choosing an analysis that produced a newsworthy null result. In fact, the authors should be congratulated for pre-registering their study, publishing their protocol prior to performing their analyses, and reporting a pre-specified primary analysis. Rather, we wish to emphasize the flexibility inherent both in data analysis and reporting, even in the case of a clean randomized experiment. We are pointing out the potential fragility of the stents-didn’t-work story in this case. Existing data could easily have been presented as a success for stents compared to placebo by authors who were aiming for that narrative and performing reasonable analyses.

Fragility of the findings. How sensitive were the results to slight changes in the data? To better understand this critical point, one can perform a simple bootstrap analysis, computing the results that would have been obtained from reanalyzing the data 1000 times, each time resampling patients from the existing experiment with replacement (Efron, 1979). As raw data were not available to us, we approximated using the normal distribution based on the observed z-score of 1.7. The result was that, in 40% of the simulations, stents outperformed placebo at a traditional level of statistical significance. This is not to say that stents really are better than placebo: the data also appear consistent with a null effect. The take-home point of this experiment is that the results could easily have gone “the other way,” when reporting is forced into a binary classification of statistical significance, for many different reasons.
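The normal approximation described above can be sketched in a few lines of Python. The sketch assumes that a replicated z-score is drawn from a normal distribution centered at the observed 1.7 with unit standard deviation, and that "outperformed at a traditional level of significance" means a replicated z above 1.96. The function names, the seed, and this reading of the procedure are illustrative assumptions, not the authors' code.

```python
import math
import random

Z_OBS = 1.7    # observed z-score reported in the text
Z_CRIT = 1.96  # conventional two-sided 5% cutoff (assumed threshold)


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def analytic_share_significant(z_obs: float = Z_OBS, z_crit: float = Z_CRIT) -> float:
    """P(z* > z_crit) when the replicated z* ~ Normal(z_obs, 1)."""
    return 1.0 - normal_cdf(z_crit - z_obs)


def simulated_share_significant(n_sims: int = 1000, z_obs: float = Z_OBS,
                                z_crit: float = Z_CRIT, seed: int = 0) -> float:
    """Monte Carlo version: share of simulated replications with z* > z_crit."""
    rng = random.Random(seed)
    hits = sum(rng.gauss(z_obs, 1.0) > z_crit for _ in range(n_sims))
    return hits / n_sims


print(f"analytic share:  {analytic_share_significant():.3f}")
print(f"simulated share: {simulated_share_significant():.3f}")
```

Under these assumptions the analytic share comes out close to the 40% quoted in the text; the simulated share varies with the seed and the number of draws.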

3. Design of the trial and clinical significance

In a justification for their study design and sample size, Al-Lamee et al. (2017) wrote: “Evidence from placebo-controlled randomised controlled trials shows that single antianginal therapies provide improvements in exercise time of 48–55 s ... Given the previous evidence, ORBITA was conservatively designed to be able to detect an effect size of 30 s.” The estimated effect of 21 s with standard error 12 s is consistent with the “conservative” effect size estimate of 30 s given in the published article. So although the experimental results are consistent with a null effect, they are even more consistent with a small positive effect.

One might ask, however, about the clinical significance of such a treatment effect, which we can discuss without reference to p-values or statistical significance. For simplicity, suppose we take the point estimate from the data at face value. How should we think about an increase in average exercise time of 21 s? One way to conceptualize this is in terms of percentiles. The data show a pre-randomization distribution (averaging the treatment and control groups) with a mean of 509 s and a standard deviation of 188 s. Assuming a normal approximation, an increase in exercise time of 21 s from 509 to 530 would take a patient from the 50th percentile to the 54th percentile of the distribution. Looked at that way, it would be hard to get excited about this effect size, even if it were a real population shift.
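The percentile calculation can be checked directly. The short Python sketch below uses the mean and standard deviation quoted in the text and the same normal approximation; the function and constant names are illustrative.

```python
import math

BASELINE_MEAN = 509.0  # seconds, pre-randomization mean from the text
BASELINE_SD = 188.0    # seconds, pre-randomization sd from the text


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def percentile_after_shift(shift: float, sd: float = BASELINE_SD) -> float:
    """Percentile (0-100) reached by a patient who starts at the baseline
    mean and improves by `shift` seconds, under a normal approximation."""
    return 100.0 * normal_cdf(shift / sd)


print(f"{percentile_after_shift(21.0):.1f}")  # a 21 s gain from 509 s to 530 s
```

A shift of 21 s is about 0.11 standard deviations, which is why the move is only from the 50th to roughly the 54th percentile.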

Beyond exercise time, there were other signals from ORBITA that seemed to suggest consistent improvements in the physiological parameter of ischemia through endpoints such as fractional flow reserve, instantaneous wave-free ratio, and stress echo. Actually, findings from the stress echo highlight a potentially important avenue into an alternative presentation of these results. There is no question that some physiological changes are being made by stents, with very large and highly statistically significant (p < 0.001) effects seen on echo measures. As is often the case, the null hypothesis that these physical changes should make absolutely zero difference to any downstream clinical outcomes seems farfetched. Thus, the sensible question to ask is “How large are the clinical differences observed?”, not “How surprising is the observed mean difference under a [spurious] null hypothesis?” The simple textbook way to tackle this question is to report confidence intervals (CIs) around the mean differences and not to focus on whether the intervals happen to include zero. The fact that the standard 95% CI for the primary outcome comfortably includes the target effect size of 30 s suggests this value should be no more “rejected” than the null value. Furthermore, without the longitudinal data to observe the outcomes that matter most to patients (health and length of life), much remains uncertain.

The larger question has to be about balancing the long-term benefits of stents with risks of the operation. It does not seem reasonable for a person to risk life and health by submitting to a surgical procedure just for a potential benefit of 21 seconds of exercise time on a standardized treadmill test, or even a hypothesized larger benefit of 50 seconds, which would still only represent a 10% improvement for an average patient in this study. Yet maybe a 5-10% increase is consequential in this case as it could improve quality of life for a patient outside of this artificial setting. Perhaps this small gain in exercise time is associated with the need for fewer medications, fewer functional limitations or greater mobility. If so, however, one might postulate this gain would have been apparent in assessments of angina burden, and it was not.

Part of the bigger concern here is that these patients were already doing pretty well on medications; that is, they already had a low symptom frequency before stenting. For example, angina frequency as measured by the Seattle Angina Questionnaire was 63.2 after optimizing medications and before stenting in the treatment group. This roughly translates as “monthly” angina (John Spertus, personal communication). How does a study with a follow-up of just 6 weeks expect to improve an outcome that happens this infrequently? In fact, one of the great debates surrounding ORBITA is that those who discount the trial suggest it enrolled patients who typically do not receive stents in routine practice. Those who believe ORBITA is a game-changer argue that these less symptomatic patients actually make up a large proportion of those receiving stents, and that is why we have such a large problem with their overuse.

Finally, are stents really being given to patients with stable angina just to improve fitness or to reduce symptoms? Or is there a continued expectation that stents have long-term benefits for patients, despite earlier data from studies like the Clinical Outcomes Utilizing Revascularization and Aggressive Drug Evaluation (COURAGE) study (Boden, 2007)? This would seem to be the key question, in which case the short-term effects, or lack thereof, found in the ORBITA study are largely irrelevant. Other larger trials, such as International Study of Comparative Health Effectiveness With Medical and Invasive Approaches (ISCHEMIA, see: https://clinicaltrials.gov/ct2/show/NCT01471522) are considering this more fundamental question but will not have a placebo procedure.

4. Recommendations for statistical reporting of trials

The search for better medical care is an incremental process, with incomplete evidence accumulating over time. There is unfortunately a fundamental incompatibility between that core idea and the common practice, both in medical journals and the news media, of up-or-down reporting of individual studies based on statistical significance. We offer some recommendations summarized in Box 2 that we believe will be helpful to authors and editors moving forward.

At this point it’s not clear how best to incorporate this recent experiment into routine practice despite its novel and provocative study design, so the forced reporting of the primary outcome as “positive” or “negative” is unhelpful. A reanalysis of the summary data from Al-Lamee et al. (2017) reveals a stronger estimated effect that is closer to the conventional boundary of statistical significance, indicating that the study could rather easily have generated and reported evidence in favor of, rather than against, the effectiveness of stents for patients with stable angina. And from our brief flurry of excitement over the possibility that a simple reanalysis could change the significance level, we are again reminded of the sensitivity of headline conclusions to decisions in statistical analysis. In any case, though, the observed increases in exercise time, even if statistically significant, do not appear at first glance to be of much clinical importance, compared to the much more relevant long-term health outcomes that remain uncertain.

In the design, evaluation, and reporting of experimental studies, there is a norm of focusing on the statistical significance of a primary outcome y (in this case, change in average exercise time on a standardized treadmill test). In general, the conclusions that follow will be fragile because p-values are extremely noisy unless the underlying effect is huge. An experiment may be designed to have 80% power, but this doesn’t eliminate the fragility, as illustrated by our bootstrap re-analysis. Power calculations are often conditional on an overestimated effect size (Schulz and Grimes, 2005, Gelman, 2018) and do not address the important question of variation in treatment effects. Examination of the Lancet paper and its reception in the news media suggests that it exhibits a classic case of “significantitis” or “dichotomania” (Greenland, 2017), with frequent repetition of phrases such as “there was no significant difference.” In line with current thinking (Amrhein, Greenland, and McShane, 2019), we suggest that the phrase used by these authors, “We deemed a p value less than 0.05 to be significant,” should be strongly discouraged, rather than actively demanded as is currently the case by many journal editors. To their credit, the ORBITA authors themselves have recognized these critical issues (see https://twitter.com/ProfDFrancis/status/952008644018753536).

In the case of stents, an important disconnect appears between the findings emphasized in the recent study, however presented, and the larger context of treatments for heart disease. From a statistical perspective, this appears to reflect a problem with the framing of clinical trials as attempts to discover whether a treatment has a statistically significant effect, commonly misinterpreted to be equivalent to a real (non-zero) population mean difference. Power calculations are used in an attempt to assure stable estimates and a good chance of the experiment being “successful”, although within these constraints there can be a push toward convenience rather than relevance of outcome measures, which is perhaps an inevitable compromise. ORBITA shows us the confusion that arises when a treatment is reported as a success or failure in statistical terms that are divorced from clinical context.

ORBITA was never meant to be definitive in a broad sense: it was designed to find a statistically significant physiological effect of stenting on mean exercise time, without clarity on the clinical relevance of anticipated effects on this outcome measure. Indeed, a likely reason why the study was limited in its size and design of these surrogate outcomes was because this is all that could have passed an ethical board given the novelty of the placebo procedure in this setting. Further background on these topics from Darrel Francis, the senior author on the study, appears at Harrell (2017b). Beyond immediate news reports, one positive impact of ORBITA is that bigger trials of stenting with placebo procedures are now much more likely with a more definitive set of measured outcomes that are meaningful for patients.

We don’t see any easy answers here (long-term outcomes would require a long-term study, after all, and clinical decisions need to be made right away, every day), but perhaps we can use our examination of this particular study and its reporting to suggest practical directions for improvement in heart treatment studies and in the design and reporting of clinical trials more generally.

References

Al-Lamee, R., Thompson, D., Dehbi, H. M., Sen, S., Tang, K., Davies, J., Keeble, T., Mielewczik, M., Kaprielian, R., Malik, I. S., Nijjer, S. S., Petraco, R., Cook, C., Ahmad, Y., Howard, J., Baker, C., Sharp, A., Gerber, R., Talwar, S., Assomull, R., Mayet, J., Wensel, R., Collier, D., Shun-Shin, M., Thom, S. A., Davies, J. E., and Francis, D. P. (2017). Percutaneous coronary intervention in stable angina (ORBITA): a double-blind, randomised controlled trial. Lancet. http://dx.doi.org/10.1016/S0140-6736(17)32714-9


Amrhein, V., Greenland, S., and McShane, B. (2019). Scientists rise up against statistical significance. Nature 567, 305–307.

American College of Cardiology (2017). ORBITA: First placebo-controlled randomized trial of PCI in CAD patients. ACC News, 2 Nov. http://www.acc.org/latest-in-cardiology/articles/2017/10/27/13/34/thurs-1150am-orbita-tct-2017

Belluz, J. (2017). Thousands of heart patients get stents that may do more harm than good. Vox.com, 6 Nov. https://www.vox.com/science-and-health/2017/11/3/16599072/stent-chest-pain-treatment-angina-not-effective

Boden, W. E., O'Rourke, R. A., Teo, K. K., Hartigan, P. M., Maron, D. J., Kostuk, W. J., Knudtson, M., Dada, M., Casperson, P., Harris, C. L., Chaitman, B. R., Shaw, L., Gosselin, G., Nawaz, S., Title, L. M., Gau, G., Blaustein, A. S., Booth, D. C., Bates, E. R., Spertus, J. A., Berman, D. S., Mancini, G. B., and Weintraub, W. S.; COURAGE Trial Research Group. (2007). Optimal medical therapy with or without PCI for stable coronary disease. New England Journal of Medicine 356, 1503–16. Epub 2007 Mar 26.

Efron, B. (1979). Bootstrap methods: Another look at the jackknife. Annals of Statistics 7, 1–26.

Gelman, A. (2004). Treatment effects in before-after data. In Applied Bayesian Modeling and Causal Inference from Incomplete-data Perspectives, ed. A. Gelman and X. L. Meng, chapter 18. New York: Wiley.

Gelman, A. (2018). The failure of null hypothesis significance testing when studying incremental changes, and what to do about it. Personality and Social Psychology Bulletin 44, 16–23.

Gelman, A., and Carlin, J. B. (2014). Beyond power calculations: Assessing Type S (sign) and Type M (magnitude) errors. Perspectives on Psychological Science 9, 641–651.

Greenland, S. (2017). The need for cognitive science in methodology. American Journal of Epidemiology 186, 639–645.

Harrell, F. (2017a). Statistical errors in the medical literature. Statistical Thinking blog, 8 Apr. http://www.fharrell.com/2017/04/statistical-errors-in-medical-literature.html

Harrell, F. (2017b). Statistical criticism is easy; I need to remember that real people are involved. Statistical Thinking blog, 5 Nov. http://www.fharrell.com/2017/11/statistical-criticism-is-easy-i-need-to.html

Kolata, G. (2017). ‘Unbelievable’: Heart stents fail to ease chest pain. New York Times, 2 Nov. https://www.nytimes.com/2017/11/02/health/heart-disease-stents.html

McShane, B. B., Gal, D., Gelman, A., Robert, C., and Tackett, J. L. (2019). Abandon statistical significance. American Statistician 73(S1), 235–245.

Resnick, B. (2017). White fear of demographic change is a powerful psychological force. Vox.com, 28 Jan. https://www.vox.com/science-and-health/2017/1/26/14340542/white-fear-trump-psychology-minority-majority

Sands, M. L. (2017). Exposure to inequality affects support for redistribution. Proceedings of the National Academy of Sciences 114, 663–668.

Schulz, K. F., and Grimes, D. A. (2005). Sample size calculations in randomised trials: Mandatory and mystical. Lancet 365, 1348–1353.

Simmons, J., Nelson, L., and Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allow presenting anything as significant. Psychological Science 22, 1359–1366.

Vickers, A. J., and Altman, D. G. (2001). Analysing controlled trials with baseline and follow up measurements. British Medical Journal 323, 1123–1124.

Wasserstein, R. L., and Lazar, N. A. (2016). The ASA's statement on p-values: Context, process, and purpose. American Statistician 70, 129–133.


Table. Summary data comparing stents to placebo, from Table 3 of Al-Lamee et al. (2017).


Box 1. Using the reported data summaries to obtain the analysis controlling for the pre-treatment measure

For each of the treatment and control groups, we are given the standard deviation of the pre-test measurements, the standard deviation of the post-test measurements, and the standard deviation of their difference, which can be obtained by taking the width of the confidence interval for the difference, dividing by 4 to get the standard error of the difference, and then multiplying by $\sqrt{n}$ to get back to the standard deviation.

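Then we use the rule, $\mathrm{sd}(y_{\text{post}} - y_{\text{pre}}) = \sqrt{\mathrm{sd}(y_{\text{pre}})^2 + \mathrm{sd}(y_{\text{post}})^2 - 2\rho\,\mathrm{sd}(y_{\text{pre}})\,\mathrm{sd}(y_{\text{post}})}$, and solve for ρ, the correlation between before and after measurements within each group. The result in this case is ρ = 0.88 within each group. We then convert the correlation to a regression coefficient of $y_{\text{post}}$ on $y_{\text{pre}}$ using the well-known formula, $\beta = \rho\,\mathrm{sd}(y_{\text{post}})/\mathrm{sd}(y_{\text{pre}})$, which yields β = 0.88 for the treated and β = 0.86 for the control group. If these two coefficients were much different from each other, we might want to consider an interaction model (Gelman, 2004), but here they are close enough that we simply take their average.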
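The two steps just described (recovering the standard deviation of the differences from a confidence interval, then solving for ρ and β) can be sketched in Python. The Table's values are not reproduced in this text, so the numbers in the usage lines are hypothetical toy inputs chosen only to exercise the formulas; the function names are likewise illustrative.

```python
import math


def sd_from_ci(ci_low: float, ci_high: float, n: int) -> float:
    """Approximate sd of individual differences from a 95% CI for their mean:
    width / 4 gives the standard error, times sqrt(n) gives the sd."""
    standard_error = (ci_high - ci_low) / 4.0
    return standard_error * math.sqrt(n)


def correlation_from_sds(sd_pre: float, sd_post: float, sd_diff: float) -> float:
    """Solve sd_diff^2 = sd_pre^2 + sd_post^2 - 2*rho*sd_pre*sd_post for rho."""
    return (sd_pre ** 2 + sd_post ** 2 - sd_diff ** 2) / (2.0 * sd_pre * sd_post)


def regression_slope(rho: float, sd_pre: float, sd_post: float) -> float:
    """Least-squares slope of post on pre: beta = rho * sd_post / sd_pre."""
    return rho * sd_post / sd_pre


# Hypothetical toy inputs (NOT the ORBITA Table values).
toy_sd_pre, toy_sd_post = 10.0, 10.0
toy_sd_diff = math.sqrt(24.0)
toy_rho = correlation_from_sds(toy_sd_pre, toy_sd_post, toy_sd_diff)
toy_beta = regression_slope(toy_rho, toy_sd_pre, toy_sd_post)
print(f"rho = {toy_rho:.2f}, beta = {toy_beta:.2f}")
```

When the pre-test and post-test standard deviations are nearly equal, as the text notes for this trial, β is close to ρ.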

We use the average, β = 0.87, in (2) and get an estimate for the adjusted mean difference of 21.3 (indeed, quite a bit higher than the reported difference in gain scores of 16.6) with a standard error of 12.5 (very slightly lower than 12.7, the standard error of the difference in gain scores) and 95% CI −3.2 to 45.8 s. The estimate is not quite two standard errors away from zero: the z-score is 1.7, and the p-value is 0.09.
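As a rough check on these figures, the sketch below recomputes the adjusted difference from the rounded summaries quoted in the text (gains of 28.4 and 11.8 s, baseline means of 528.0 and 490.0 s, β = 0.87), and then the z-score, two-sided p-value, and 95% interval from the reported estimate and standard error. It assumes a normal reference distribution and a 1.96 multiplier; the function names are illustrative. With rounded inputs the adjusted difference comes out near 21.5 rather than exactly 21.3, which presumably reflects rounding in the published summaries.

```python
import math

GAIN_T, GAIN_C = 28.4, 11.8   # mean gains in seconds (treatment, control)
PRE_T, PRE_C = 528.0, 490.0   # baseline means in seconds
BETA = 0.87                   # averaged regression coefficient from Box 1


def adjusted_difference(gain_t, gain_c, pre_t, pre_c, beta):
    """Estimate (2), written as the gain-score estimate plus (1 - beta) times
    the baseline difference between the groups."""
    return (gain_t - gain_c) + (1.0 - beta) * (pre_t - pre_c)


def two_sided_p(estimate, se):
    """Two-sided p-value under a normal approximation."""
    z = estimate / se
    return math.erfc(abs(z) / math.sqrt(2.0))


def ci95(estimate, se):
    """Normal-approximation 95% interval."""
    return estimate - 1.96 * se, estimate + 1.96 * se


print(f"adjusted difference: {adjusted_difference(GAIN_T, GAIN_C, PRE_T, PRE_C, BETA):.2f}")
print(f"z = {21.3 / 12.5:.2f}, p = {two_sided_p(21.3, 12.5):.2f}")
low, high = ci95(21.3, 12.5)
print(f"95% CI: {low:.1f} to {high:.1f}")
```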


Box 2. Recommendations for Analyses and Reporting

Analyses
1. Baseline adjustment for differences: should be prespecified for the primary analysis where strong confounders such as a baseline measure of the outcome are available.
2. Be aware of fragility of inferences. Fragility can be demonstrated using the sampling or posterior distribution as estimated using mathematical modeling, bootstrap simulation, or Bayesian analysis.

Reporting
1. Avoid use of sharp thresholds for p-values and thus eliminate the term “statistical significance” from the reporting of results.
2. Consider the full range (upper and lower ends) of interval estimates for important outcomes and their potential inclusion of clinically important differences.
3. Consider the potential for individual variability in responses (heterogeneity of treatment effects) and not just mean differences.

