
Practical Issues in Statistical Interpretation of Tevatron Data

6/5/12 T. Junk SLAC Stats Progress

Thomas R. Junk, Fermilab

Progress on Statistical Issues in Searches Conference, SLAC National Accelerator Laboratory

June 5, 2012

• Multiple Parameters of Interest
• Handling Nuisance Parameters
• Overbinning, Smoothing, and Distributions that Ought Not to be Smoothed
• Model Validation with Multivariate Analyses
• ABCD Methods
• Look-Elsewhere

Proton-antiproton collisions for Run I and Run II

Run II: √s = 1.96 TeV (p p̄)

[Figure labels: Main Injector and Recycler; Antiproton source; Booster]

Tevatron ring radius = 1 km. Commissioned in 1983.

Main Injector commissioned for Run II.

Recycler used as another antiproton accumulator.

Run II ended Sep. 30, 2011. 10 fb⁻¹ of analyzable data/experiment. 500+ papers/experiment and counting!

Two (or more) Parameters of Interest

From the 2011 PDG Statistics Review:

http://pdg.lbl.gov/2011/reviews/rpp2011-rev-statistics.pdf

For quoting Gaussian uncertainties on single parameters. Ellipse is a contour of 2ΔlnL = 1.

For displaying joint estimation of several parameters.

1D or 2D Presentation

[Figure: Parameter 1 vs. Parameter 2. The 2ΔlnL = 1 contour gives 68% intervals for each parameter separately; the 2ΔlnL = 2.3 contour gives 68% joint coverage.]

I prefer, when showing a 2D plot, to show the contours which cover in 2D. The 2ΔlnL = 1 contour only covers for the 1D parameters, one at a time.
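The two thresholds can be checked with a few lines of arithmetic. The sketch below assumes the Gaussian (large-sample) limit, in which 2ΔlnL follows a chi-squared distribution with one degree of freedom per parameter of interest; the function names are illustrative only.

```python
import math

def coverage_1d(two_delta_lnl):
    """Chi-squared CDF for 1 degree of freedom: erf(sqrt(x/2))."""
    return math.erf(math.sqrt(two_delta_lnl / 2.0))

def coverage_2d(two_delta_lnl):
    """Chi-squared CDF for 2 degrees of freedom: 1 - exp(-x/2)."""
    return 1.0 - math.exp(-two_delta_lnl / 2.0)

def threshold_2d(coverage):
    """Value of 2*Delta(lnL) whose contour has the requested joint 2D coverage."""
    return -2.0 * math.log(1.0 - coverage)

if __name__ == "__main__":
    print(f"2DlnL = 1.0: 1D coverage {coverage_1d(1.0):.3f}, "
          f"joint 2D coverage {coverage_2d(1.0):.3f}")
    print(f"2DlnL = 2.3: joint 2D coverage {coverage_2d(2.3):.3f}")
    print(f"68.27% joint coverage needs 2DlnL = {threshold_2d(0.6827):.2f}")
```

Under these assumptions the 2ΔlnL = 1 ellipse contains the true pair of values only about 39% of the time, which is why it should not be read as a 68% region in 2D.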

A Variety of Ways to Show 2D Fit Results

PRD 85, 071106; PRD 83, 032009

http://www-cdf.fnal.gov/physics/new/top/2011/WhelDil/index.html

68% and 95% contours

Hypothesis Testing with Two Parameters of Interest?

See, for example, D0's evidence for t-channel single top production: Phys. Lett. B 705 (2011) 313-319

“... using the log-likelihood approach ... we compute for the first time the significance of the tqb cross section independently of any assumption on the production rate of tb.”

In this specific case the correlation is small, and this claim isn't so bad. But to calculate a p-value p(λ ≥ λobs | signal = 0), one needs a sample space of pseudoexperiments, and thus an assumption of the s-channel rate.

Hypothesis Testing with Two Parameters of Interest?

Another issue: a variation on the look-elsewhere effect (LEE) theme.

s-channel and t-channel single top events differ in their kinematic distributions.

For observation and measurement of the total single top production rate σs+t, we'd train our MVA's on the Standard Model mixture of both.

For separate measurement of σs and σt, we have choices of training strategies. Single-tag events: t-channel. Double-tag events: s-channel.

Suppose we want to do a hypothesis test on just the t-channel? Re-optimize all MVA's with t-channel as the signal and s-channel as background. Re-do this for s-channel.

Ideally we'd pick the most sensitive MVA (highest expected significance) for the test we want to do, but there is a temptation to pick the one with the highest measured significance (we tell analyzers to pick the most sensitive).

If you have an excess of data events that could be s- or t-channel (can't tell), this strategy may end up giving an observation of both (or neither), when we're really only sure there's at least one process present.

Several Analyses on the Same Data

• Different groups are interested in the same search/measurement using the same data.
• May have slightly different selection requirements (jet energies, lepton types, missing Et, etc.).
• Usually have different choices of MVA, or even training strategies for the same MVA.
• Always will give different results!

• What to do?
  • Pick one and publish it. Criterion: best sensitivity (median expected limit, median expected p-value, median expected measurement uncertainty). How to pick it if the result is 2D? Need a 1D figure of merit.
  • Can check consistency with pseudoexperiments: a p-value using Δ(measurement) as a test statistic. What's the chance of running two analyses on the same data and getting a result as discrepant as what we got?
  • Combine MVA's into a super-MVA
    • Keeps everyone happy and involved
    • Usually helps sensitivity
    • Requires coordination and alignment of each event in data and MC
    • Easiest when overlap in data samples is 100%. Otherwise have to break sample up into shared and non-shared subsets and analyze them separately

• What not to do: Pick the one with the “best” observed result. (LEE!)

An Example of Running Three Analyses on the Same Events in Monte Carlo Repetitions

Different questions can be asked: What's the distribution of the maximum difference between the measurements of any two teams? What's the quadrature sum of the pairwise differences? Condition on the sum? (Probably not.)
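A toy version of the consistency check with Δ(measurement) as the test statistic might look like the following. It is only a sketch: a real check would rerun both analyses on the same simulated events, whereas here each analysis result is modeled as a fluctuation shared by both analyses plus an analysis-specific Gaussian fluctuation. The widths `sigma_shared` and `sigma_own` are hypothetical inputs, not numbers from the talk.

```python
import random

def delta_pvalue(delta_obs, sigma_shared, sigma_own, n_pseudo=20000, seed=1):
    """Fraction of pseudoexperiments in which two analyses of the same data
    disagree by at least |delta_obs|."""
    rng = random.Random(seed)
    n_as_discrepant = 0
    for _ in range(n_pseudo):
        # Fluctuation common to both analyses (same events, same luminosity, ...).
        shared = rng.gauss(0.0, sigma_shared)
        # Each analysis adds its own fluctuation (different selection, MVA, ...).
        m1 = shared + rng.gauss(0.0, sigma_own)
        m2 = shared + rng.gauss(0.0, sigma_own)
        if abs(m1 - m2) >= abs(delta_obs):
            n_as_discrepant += 1
    return n_as_discrepant / n_pseudo

if __name__ == "__main__":
    print(delta_pvalue(delta_obs=1.0, sigma_shared=1.0, sigma_own=0.5))
```

In this toy the shared fluctuation cancels in the difference, so the two results are expected to agree much better than their total uncertainties would suggest; comparing them as if they were independent would make a real discrepancy look harmless.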

Systematic Uncertainty Handling

For a very thorough review, see Luc Demortier's note: “P Values: What They Are and How to Use Them”

http://www-cdf.fnal.gov/~luc/statistics/cdf8662.pdf

Plausible options:

1) Prior-predictive method
2) Supremum method
3) Confidence-level method
4) Plug-in p-values
5) Define all nuisance parameters to be parameters of interest
6) Define only the important nuisance parameters to be parameters of interest

The prior-predictive method is a mixture of Bayesian and frequentist reasoning.

The supremum method is very conservative and, I argue, not fully non-Bayesian. It also produces mixed results: one can have an outcome which is an excess over background when setting limits and a deficit when computing a p-value.

Treat Nuisance Parameters As Parameters of Interest!

• One person's nuisance parameter is another's parameter of interest.

• Really only good if you have one dominant source of systematic uncertainty, and you want to show your joint measurement of the nuisance parameter and the parameter of interest.

Example: top quark mass (parameter of interest) vs. CDF's jet energy scale in all-hadronic tt̄ events. Doesn't follow my suggestion! But one parameter (JES) is not a parameter of “interest.” If it were, we'd use ΔlnL = 1.15 instead of 0.5.

Difficult to apply to cases with many nuisance parameters.

“Strong” Sideband Constraints

Another Strong Sideband Constraint Example

“Weak” Sideband Constraints

CDF's Ωb observation paper:

Phys. Rev. D 80 (2009) 072003

A Mixture of Theory and Data is Needed for a More Complicated Situation

H→ZZ→Four Leptons. Main backgrounds:

• pp→ZZ: (MC)*theory
• pp→Z+jets, ZW+jet(s), ...: data
• pp→tt̄

Low-mass 4L side: off-shell Z's, “radiative tail”, and other backgrounds.

Dependent on theoretical predictions of the shape of the dominant ZZ background.

With more data, replace “bad” systematics with “good” ones (theory replaced with data). But in the early stages, the “bad” systematic uncertainties are smaller!

Another Weak Sideband Constraint Example that Looks Like a Strong Sideband Constraint

Phys. Rev. D 85 (2012) 032005, e-Print: arXiv:1106.4782 [hep-ex]

Breaking the Flavor Degeneracy with a Tag Variable

No Sideband Constraints?

Example: Counting experiment, only have a priori predictions of expected signal and background.

All test statistics are equivalent to the event count; they serve to order outcomes as more signal-like and less signal-like. More events == more signal-like.

Classical example: Ray Davis's solar neutrino deficit observation. Comparing data (neutrino interactions on a chlorine detector at the Homestake mine) with a model (John Bahcall's Standard Solar Model). Calibrations of the detection system were exquisite. But it lacked a standard candle.

How to incorporate systematic uncertainties? Fewer options left.

Another example: Before you run the experiment, you have to estimate the sensitivity. No sideband constraints yet (except from other experiments).

Prior predictive method then is equivalent to the profile method using the control samples to estimate nuisance parameters. And it's more general in cases that the signal contamination of the sidebands is important.
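For the counting experiment with only an a priori background prediction, one way the prior-predictive option could be sketched is to average the Poisson tail probability over a prior for the background rate. The choice of a Gaussian prior truncated at zero, and the names `b_nominal` and `b_sigma`, are assumptions made for illustration; the talk does not prescribe a prior.

```python
import math
import random

def poisson_tail(n_obs, mean):
    """P(N >= n_obs) for a Poisson distribution with the given mean."""
    if n_obs <= 0:
        return 1.0
    term = math.exp(-mean)  # P(N = 0)
    cdf = term
    for k in range(1, n_obs):
        term *= mean / k    # P(N = k) from P(N = k - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)

def prior_predictive_pvalue(n_obs, b_nominal, b_sigma, n_samples=20000, seed=1):
    """Background-only p-value averaged over a truncated-Gaussian prior on b."""
    rng = random.Random(seed)
    total = 0.0
    accepted = 0
    while accepted < n_samples:
        b = rng.gauss(b_nominal, b_sigma)
        if b < 0.0:
            continue  # truncate the (assumed) prior at zero
        total += poisson_tail(n_obs, b)
        accepted += 1
    return total / n_samples

if __name__ == "__main__":
    print("no systematic  :", poisson_tail(10, 3.0))
    print("with b = 3 +- 1:", prior_predictive_pvalue(10, 3.0, 1.0))
```

With these inputs the background uncertainty raises the p-value relative to the fixed-background calculation, because the tail probability grows quickly for upward shifts of the background.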

Several “on's”, Several “off's”

More than one “off” sample

Conflicting estimates of background: what to do?

Very typical example: Pythia vs. Herwig (here the “off” samples are Monte Carlo).
“Take the difference as a systematic”
“Take the average and half the difference as a systematic”
Try to learn something about which one is more reliable.

But you can invert more than one cut and have “conflicting” off samples in the data too. Really the extrapolation factors or the sample composition estimates are what's wrong, not the actual data.

Lesson learned: Try to do a better job with the predictions! Statistical methods won't save us.

See Glen Cowan's talk yesterday.

MC Statistics and “Broken” Bins

• Limit calculators/discovery tools cannot tell if the background expectation is really zero or just a downward MC fluctuation.
• Real background estimations are sums of predictions with very different weights in each MC event (or data event).
• Rebinning or just collecting the last few bins together often helps.
• Problem compounded by requiring shape uncertainties to be evaluated! Alternate shape MC samples are often even more thinly populated than the nominal samples. Validation of adequate preparation of results is necessary? (But what are the criteria?)

NDOF = ?

Overbinning is like overtraining a NN. s, b, and d can all be in different bins. A histogram can be partially overbinned, like this one here:

Some Very Early Plots from ATLAS Suffer from Limited Sample Sizes in Control Samples and Monte Carlo

Nearly all experiments are guilty of this, especially in the early days!

The left plot has adequate binning in the “uninteresting” region. Falls apart on the right-hand side, where the signal is expected. Suggestions: more MC, wider bins, transformation of the variable (e.g., take the logarithm). Not sure what to do with the right-hand plot except get more modeling events.

Data points' error bars are not sqrt(n). What are they? I don't know. How about the uncertainty on the prediction?

This Histogram is Probably Okay

The binning is a little odd, though. You can get this kind of distribution from a decision tree or a likelihood MVA (forest of delta functions).

Watch out though! Smoothing and some kinds of interpolations (e.g., horizontal morphing à la Alex Read) are inappropriate for this distribution.

Sometimes distributions like these have natural causes: lepton φ distributions for detectors with many cracks, for example.

[Figure: distribution labeled “MVAmark”]

A More Common Example: muon coverage at high angles.

No smoothing/extrapolation allowed here!

Optimizing Histogram Binning

Two competing effects:

1) Separation of events into classes with different s/b improves the sensitivity of a search or a measurement. Adding events in categories with low s/b to events in categories with higher s/b dilutes information and reduces sensitivity.

Pushes towards more bins.

2) Insufficient Monte Carlo can cause some bins to be empty, or nearly so. This only has to be true for one high-weight contribution.

Need reliable predictions of signals and backgrounds in each bin.

Pushes towards fewer bins.

Note: It doesn't matter that there are bins with zero data events; there's always a Poisson probability for observing zero.

The problem is inadequate prediction. Zero background expectation and nonzero signal expectation is a discovery!
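The second effect can be guarded against mechanically by merging neighboring bins until every bin has a usable background prediction. The sketch below is one simple way to do that; the threshold `min_background` is a hypothetical tuning choice, and a more careful criterion would look at the number (and weights) of MC events behind each prediction rather than at the predicted yield alone.

```python
def merge_sparse_bins(signal, background, min_background):
    """Merge adjacent bins, left to right, until each merged bin predicts
    at least `min_background` background events."""
    merged_s, merged_b = [], []
    acc_s = acc_b = 0.0
    for s, b in zip(signal, background):
        acc_s += s
        acc_b += b
        if acc_b >= min_background:
            merged_s.append(acc_s)
            merged_b.append(acc_b)
            acc_s = acc_b = 0.0
    if acc_s > 0.0 or acc_b > 0.0:
        if merged_b:
            # Fold the under-populated tail into the last acceptable bin.
            merged_s[-1] += acc_s
            merged_b[-1] += acc_b
        else:
            merged_s.append(acc_s)
            merged_b.append(acc_b)
    return merged_s, merged_b

# Illustrative yields: the last two bins have signal but no predicted background.
example_s = [0.1, 0.2, 0.5, 1.0, 1.5, 0.8]
example_b = [50.0, 20.0, 5.0, 0.4, 0.0, 0.0]
new_s, new_b = merge_sparse_bins(example_s, example_b, min_background=1.0)
```

In this example the three thinly populated high-score bins are collected into the neighboring bin, which costs some sensitivity but removes the bins that would otherwise claim a discovery on the strength of an empty prediction.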

Overbinning = Overlearning

A common pitfall: choosing selection criteria after seeing the data. “Drawing small boxes around individual data events.”

The same thing can happen with Monte Carlo predictions.

Limiting case: each event in signal and background MC gets its own bin. Fake perfect separation of signal and background!

Statistical tools shouldn't give a different answer if bins are shuffled/sorted.

Try sorting by s/b. And collect bins with similar s/b together. Can get arbitrarily good performance from an analysis just by overbinning it.

Note: Empty data bins are okay; just empty prediction is a problem. It is our job however to properly assign s/b to data events that we did get (and all possible ones).
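The limiting case is easy to reproduce in a toy. In the sketch below, signal and background MC events are drawn from the same uniform distribution, so there is no real separation power at all; the only thing that changes is the number of bins. The sample size `n_mc` and the function name are illustrative assumptions.

```python
import random

def fake_separation(n_bins, n_mc=200, seed=1):
    """Fraction of signal MC events that land in bins containing no background
    MC event, when signal and background share the SAME uniform distribution."""
    rng = random.Random(seed)
    bkg_bins = {min(int(rng.random() * n_bins), n_bins - 1) for _ in range(n_mc)}
    sig_bins = [min(int(rng.random() * n_bins), n_bins - 1) for _ in range(n_mc)]
    n_background_free = sum(1 for i in sig_bins if i not in bkg_bins)
    return n_background_free / n_mc

if __name__ == "__main__":
    for n_bins in (1, 10, 100, 1000, 100000):
        print(n_bins, fake_separation(n_bins))
```

As the bin count grows, nearly all of the signal prediction ends up in bins with zero predicted background, which a limit calculator or discovery tool would read as perfect separation.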

Model Validation

• Not normally a statistics issue, but something HEP experimentalists spend most of their time worrying about.

• Systematic uncertainties on predictions are usually constrained by data predictions.

• Often discrepancies between data and prediction are the basis for estimating systematic uncertainty.

Checking Input Distributions to an MVA

• Relax selection requirements: show modeling in an inclusive sample (example: no b-tag required for the check, but require it in the signal sample)
• Check the distributions in sidebands (require zero b-tags)
• Check the distribution in the signal sample for all selected events
• Check the distribution after a high-score cut on the MVA

Example: Qlepton * η(untagged jet) in CDF's single top analysis. Good separation power for t-channel signal.

Highest-|η| jet as a well-chosen proxy.

Phys. Rev. D 82:112005 (2010)

Checking MVA Output Distributions

• Calculate the same MVA function for events in sideband (control) regions.
• For variables that are not defined outside of the signal regions, put in proxies. (Sometimes just a zero for the input variable works well if the quantity really isn't defined at all. Pick a typical value, not one way off on the edge of its distribution.)
• Be sure to use the same MVA function as for analyzing the signal data.

Example: CDF NN single-top analysis, NN validated using events with zero b-tags

[Figure label: signal region]

Phys. Rev. D 82:112005 (2010)

A Comparison in a Control Sample that is Less than Perfect

CDF's single top Likelihood Function discriminant checked in untagged events

Phys. Rev. D 82:112005 (2010)

Strategy: Assess a shape systematic covering the difference between data and MC; extrapolate the uncertainty from the control sample to the signal sample.

If the comparison is okay within statistical precision, do not assess an additional uncertainty (even/especially if the precision is weak). Barlow, hep-ex/0207026 (2002).

Another Validation Possibility: Train Discriminants to Separate Each Background

Phys. Rev. D 82:112005 (2010)

Same input variables as signal LF. LF has the property that the sum of these plus the signal LF is 1.0 for each event. Gives confidence. If the check fails, it's a starting point for an investigation, and not a way to estimate an uncertainty.

Model Validation with MVA's

• Even though input distributions can look well modeled, the MVA output could still be mismodeled. Possible cause: correlations between two or more variables could be mismodeled.
• Checks in subsets of events can also be incomplete. A sum of distributions whose shapes are well reproduced by the theory can still be mismodeled if the relative normalizations of the components are mismodeled.

• Can check the correlations between variables pairwise between data and prediction.
• Difficult to do if some of the prediction is a one-dimensional extrapolation from control regions (e.g., ABCD methods).

• My favorite: Check the MVA output distribution in bins of the input variables! We care more about the MVA output modeling than the input variable modeling anyway.
• Make sure to use the same normalization scheme as for the entire distribution; do not rescale to each bin's contents.

Ideally, we'd try to find a control sample depleted in signal that has exactly the same kind of background as the signal region (usually this is unavailable).

The Sum of Uncorrelated 2D Distributions may be Correlated

[Figure: scatter plot in x and y]

Knowledge of one variable helps identify which sample the event came from and thus helps predict the other variable's value even if the individual samples have no covariance.
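A quick numerical illustration of this point, as a sketch: two samples are generated with x and y independent within each sample, but centered at different places (the centers 0 and 4 and the unit widths are arbitrary choices). The correlation coefficient is then computed for each sample and for their sum.

```python
import random

def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def make_sample(rng, center, n):
    """x and y drawn independently, both centered at `center`."""
    xs = [rng.gauss(center, 1.0) for _ in range(n)]
    ys = [rng.gauss(center, 1.0) for _ in range(n)]
    return xs, ys

rng = random.Random(1)
xa, ya = make_sample(rng, 0.0, 5000)
xb, yb = make_sample(rng, 4.0, 5000)

r_a = pearson(xa, ya)              # close to zero
r_b = pearson(xb, yb)              # close to zero
r_sum = pearson(xa + xb, ya + yb)  # strongly positive

if __name__ == "__main__":
    print(f"sample A: {r_a:+.3f}  sample B: {r_b:+.3f}  sum: {r_sum:+.3f}")
```

For this equal-weight mixture the expected correlation of the sum is 4/(1+4) = 0.8, even though neither component has any. A mismodeled relative normalization of the two components therefore shows up as a mismodeled correlation.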

“ABCD” Methods

CDF's W Cross Section Measurement

Isolation fraction = (energy in a cone of radius 0.4 around the lepton candidate, not including the lepton candidate) / (energy of the lepton candidate)

Missing Transverse Energy (MET)

Want QCD contribution to the “D” region where signal is selected.

Assumes:
MET and ISO are uncorrelated sample by sample.
Signal contributions to A, B, and C are small and subtractable.

ABCD methods are really just on-off methods where τ is measured using data samples.
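The arithmetic of the method can be sketched as below. The region labeling is an assumption, since the figure defining A, B, C, and D did not survive in this transcript: here A and C differ in one variable, B and D differ in the same way, so that τ = A/C and the background in D is B/τ = B·C/A. The uncertainty uses the Gaussian approximation, which is only reasonable when all three sideband counts are large.

```python
import math

def abcd_background(n_a, n_b, n_c, sig_a=0.0, sig_b=0.0, sig_c=0.0):
    """Estimate the background in region D as B * C / A.

    n_a, n_b, n_c: observed sideband counts.
    sig_a, sig_b, sig_c: expected signal contamination to subtract (model dependent).
    Returns (estimate, approximate uncertainty).
    """
    a, b, c = n_a - sig_a, n_b - sig_b, n_c - sig_c
    if a <= 0.0 or b <= 0.0 or c <= 0.0:
        raise ValueError("signal-subtracted sideband counts must be positive")
    estimate = b * c / a
    # Poisson variance of each raw count, propagated in the Gaussian approximation.
    rel_var = n_a / a ** 2 + n_b / b ** 2 + n_c / c ** 2
    return estimate, estimate * math.sqrt(rel_var)

if __name__ == "__main__":
    print(abcd_background(1000, 200, 50))
```

Because the subtracted signal depends on the signal rate being measured, a complete treatment would recompute the estimate after each update of the fitted signal rate until it stops changing.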

“ABCD” Methods

Advantages
• Purely data based, good if you don't trust the simulation
• Model assumptions are injected by hand and not in a complicated Monte Carlo program (mostly)
• Model assumptions are intuitive

Disadvantages
• The assumption that MET and ISO are uncorrelated may be false. E.g., semileptonic B decays produce unisolated leptons and MET from the neutrinos.
• Even a two-component background can be correlated when the contributions aren't by themselves.
• Another way of saying that extrapolations are to be checked/assigned sufficient uncertainty.
• Works best when there are many events in regions A, B, and C. Otherwise all the problems of low stats in the “Off” sample in the On/Off problem reappear here. Large numbers of events → Gaussian approximation to uncertainty in background in D.
• Requires subtraction of signal from data in regions A, B, and C, which introduces model dependence.
• Worse, the signal subtraction from the sidebands depends on the signal rate being measured/tested. A small effect if s/b in the sidebands is small. You can iterate the measurement and it will converge quickly.

Examples of ABCD Methods

• Sideband calibration of background under a peak. (“What if the background peaks also where the signal peaks?”)
• The on-off problem with τ = A/C. Very frequently samples A and C are in MC simulations, where we can be sure not to contaminate the background estimations with signal. Example: using the MC to estimate acceptance for a cut for background, to be scaled with a data control sample. But we pay the price of unknown MC mismodeling.

Uncorrelated variable assumption == assumption that τ is the same in the data and the MC. (Check modeling of shape of distribution in the MC.)

Equivalent of previous problem: Even if the background shapes are well modeled by the MC, if there are multiple background processes which contribute, they can have different fractional contributions, distorting the total shapes.

• Fitting an MVA shape to the data. Low-score MC = A, high-score MC = C; low-score data = B, high-score data = D.

An Approximate LEE Correction for Peak Hunting

See E. Gross and O. Vitells, Eur. Phys. J. C 70 (2010) 525-530.

Approximate formula applies to bump hunts on a smooth background.

Requires a few fully simulated pseudoexperiments with complete p-value calculations over the region of interest. Count up-crossings of a threshold. Extrapolates to higher thresholds assuming large-sample behavior. Specifically, that the LR test statistic has a chi-squared distribution.
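The bookkeeping behind this recipe can be sketched as follows, under stated assumptions: the test statistic q(m) is scanned over the mass range in a handful of background-only pseudoexperiments, up-crossings of a low threshold c0 are counted, and the bound is written in the form local tail + ⟨N(c0)⟩ exp(−(c − c0)/2) for a chi-squared distribution with one degree of freedom. The local tail used here is the two-sided chi-squared one; a fit with the signal rate bounded at zero changes that term, so the paper should be consulted before using this for a real result.

```python
import math

def chi2_1dof_tail(c):
    """P(chi-squared with 1 degree of freedom > c)."""
    return math.erfc(math.sqrt(c / 2.0))

def mean_upcrossings(scans, c0):
    """Average number of times a scan q(m) crosses the level c0 from below.

    scans: list of pseudoexperiments, each a list of q values ordered in mass.
    """
    total = 0
    for q in scans:
        total += sum(1 for lo, hi in zip(q, q[1:]) if lo < c0 <= hi)
    return total / len(scans)

def global_pvalue_bound(c, c0, n_up_c0):
    """Approximate bound on the global p-value for an observed maximum q = c."""
    local = chi2_1dof_tail(c)
    extrapolated = n_up_c0 * math.exp(-(c - c0) / 2.0)
    return min(1.0, local + extrapolated)

if __name__ == "__main__":
    toy_scans = [[0.0, 1.0, 0.0, 2.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0]]
    n_up = mean_upcrossings(toy_scans, c0=0.5)
    print(n_up, global_pvalue_bound(c=9.0, c0=0.5, n_up_c0=n_up))
```

The appeal of the method is that up-crossings of a low threshold are common, so a few pseudoexperiments suffice, while the exponential factor carries the estimate out to thresholds that would be far too expensive to simulate directly.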

An interesting feature, specific to bump hunts but maybe more general:

As the expected significance goes up, so does the LEE correction.

This makes lots of sense: LEE depends on the number of separate models that can be tested. As we collect more data, we can measure the position of the peak more precisely.

So we can tell more peaks apart from each other, even with the same reconstruction resolution.

But: Combine a poor-resolution, low-s/b search with a high-resolution, high-s/b but very tiny s and very tiny b search, and you may not get the right answer.

CDF's 2011 H→γγ Search

+2 other channels with smaller excesses

Insufficient sensitivity to a SM Higgs boson. Rate ruled out by other searches (gg→H→WW for example). So we know the bump is a stat fluctuation.

CDF+D0 Higgs Search Channels Combined

arXiv:1203.3774

>300 nuisance parameters

Tevatron Higgs Search LEE

Local p-value vs. Higgs boson mass

Bands show expectation assuming a signal is present (at each mH separately).

Cross section times branching ratio fits vs. mH

A complication: Most search channels train their MVA's separately at each mH to optimize sensitivity.

Signal, Background, and Data in the ZH→llbb Search

Reconstructed mjj distribution

You can tell mH is on the high side of the range only by what's missing and not what's there!

[Figure: 95% CL limit/SM vs. Higgs mass (GeV/c²), 100-150 GeV/c², showing ±1σ and ±2σ expected limits and the expectation with an injected signal at MH = 115 GeV/c². CDF II Preliminary (10 fb⁻¹).]

CDF's WH channel expectation (×3 luminosity to simulate the presence of other channels: llbb, METbb).

With a 115 GeV signal injected.

Stirring it All Together: D0's LLR Test

Assuming observed and expected +3 sigma excess, and median outcome. Resolution from -2ΔLLR = Δχ² = 1.

Resolution at 115 GeV: ±5 GeV
Resolution at 135 GeV: ~±10 GeV

An Interesting Bias Bill Murray Showed at The Next Stretch of the Higgs Magnificent Mile Conference

https://twindico.hep.anl.gov/indico/conferenceOtherViews.py?view=standard&confId=856

Seek a bump on a smooth background. Example: LHC (or Tevatron) H→γγ search.

Allow mH to float and pick the mH that maximizes the fitted cross section.

The fitted cross section will be biased upwards and the position resolution of “lucky” outcomes will be worse than unlucky ones even if a signal is truly present.

Why? A true bump can coalesce with a fluctuation either to the left or to the right of the bump (two chances to fluctuate upwards).

Effect can be substantial! Calibrate with simulated experimental outcomes (FC).

LEE for Limits?

[Figure: LEP CLs vs. mH (GeV/c²), 100-120 GeV/c², log scale from 10⁻⁶ to 1. Curves: observed, and expected for background. Marked masses: 114.4 and 115.3.]

No, but there is the opposite effect. We take the most conservative mass exclusion. If the CLs curve crosses several times we quote the smallest (LEP).

Hard to say what the median expected limit is: the place where the median CLs crosses the line is higher than the median lowest limit.

LHC and Tevatron experiments quote multiple disjoint mH limits.

No LEE. Justification: each test at each mH is an independent search with its own error rate, assuming a particle is truly there at each mass one at a time.

LEE for Limits?

[Figure: 95% CL limit/SM vs. mH (GeV/c²), 100-200 GeV/c². Tevatron Run II Preliminary, L ≤ 10.0 fb⁻¹. Expected and observed limits with ±1 s.d. and ±2 s.d. expected bands; SM=1 line; shaded LEP, Tevatron, ATLAS, CMS, and combined exclusion regions. February 27, 2012.]

Multiple experiments searching for the same particle.

Multiple chances to falsely exclude a particle that's actually there.

Very easy to take the union of excluded regions, but this does not have 95% coverage.

The best thing to do is to combine for a single interpretation.

But the limits are of secondary importance here.

Where is “Elsewhere?”

A collider collaboration is typically very large; >1000 Ph.D. students. ATLAS+CMS is another factor of two. (Four LEP collaborations, two Tevatron collaborations.)

Many ongoing analyses for new physics. The chance of seeing a bump somewhere is large. What is the LEE?

Do we have to correct our previously published p-values for a larger LEE when we add new analyses to our portfolio?

How about the physicist who goes to the library and hand-picks all the largest excesses? What is LEE then?

“Consensus” at the Banff 2010 Statistics Workshop: LEE should correct only for those models that are tested within a single published analysis. Usually one paper covers one analysis, but review papers summarizing many analyses do not have to put in additional correction factors.

For the Winter 2012 Higgs search analyses, we had several LEE's computed, depending on the mass range defined to be elsewhere.

Caveat lector.

Where is “Elsewhere?”

LEE is often hard enough to evaluate. Right way to do it: compute the p-value of p-values. Simulate the experiment assuming zero signal many times, and for each simulated outcome find the model with the smallest p-value. Multidimensional models are harder, and LEE is worse.
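In toy form the “p-value of p-values” looks like the sketch below. It assumes, purely for illustration, that the tested models are statistically independent and that each local test statistic is a unit Gaussian under the null hypothesis; in a real search the models are correlated and each pseudoexperiment has to go through the full analysis chain. The parameter `n_models` is a hypothetical stand-in for the number of independently testable models.

```python
import math
import random

def local_pvalue(z):
    """One-sided Gaussian tail probability for a significance z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def global_pvalue(p_local_obs, n_models, n_pseudo=20000, seed=1):
    """Fraction of null pseudoexperiments whose smallest local p-value is
    at least as small as the one observed."""
    rng = random.Random(seed)
    n_as_extreme = 0
    for _ in range(n_pseudo):
        # One background-only pseudoexperiment: test every model, keep the best.
        p_min = min(local_pvalue(rng.gauss(0.0, 1.0)) for _ in range(n_models))
        if p_min <= p_local_obs:
            n_as_extreme += 1
    return n_as_extreme / n_pseudo

if __name__ == "__main__":
    p_loc = local_pvalue(2.0)
    print("local:", p_loc, "global:", global_pvalue(p_loc, n_models=20))
```

For independent tests the result can be cross-checked against 1 − (1 − p_local)^N; the simulation approach is what remains available when the tests are correlated and no such formula applies.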

Kane, Wang, Nelson, Wang, Phys. Rev. D71, 035006 (2005)

ALEPH, DELPHI, L3, OPAL, and the LHWG, Phys. Lett. B565 (2003) 61-75

ALEPH, DELPHI, L3, OPAL, and the LHWG, Eur. Phys. J. C47 (2006) 547-587

Two excesses seen; proposed models explain both with two Higgs bosons. Combined local significance is greater, but LEE now is much larger (and unevaluated). Published plot grays out region beyond experimental sensitivity.


Choosing a Region of Interest

• I do not have a foolproof prescription for this, just some thoughts.

• Analyses are designed to optimize sensitivity, but LEE dilutes sensitivity. There is a penalty for looking for many independently testable models. Can we optimize this?

• But you should always do a search anyway! If you expect to be able to test a model, you should.

• Testing previously excluded models? We do this anyway, just in case some new physics shows up in a way that evaded the previous test.

• There is no such thing as a model-independent search. Merely building the LHC or the Tevatron means we had something in mind. And the SM (or just our implementation of it) is wrong, but possibly not in a way that is both interesting and testable.


Look ElseWHEN

Running averages converge on the correct answer, but the deviations in units of the expected uncertainty have a random walk in the logarithm of the number of trials.

$$d_n = \frac{\sum_{k=1}^{n} r_k / n}{1/\sqrt{n}}$$

The r_k are IID numbers drawn from a unit Gaussian.

[Figure: d_n versus trial number n, with n on a logarithmic axis from 1 to 10^9 and d_n between -2 and 2.]

It’s possible to cherry-pick a dataset with a maximum deviation. “Sampling to a foregone conclusion”

Stopping Rule: In HEP, we (almost always!) take data until our money is gone. We produce results for the major conferences along the way. Some will coincidentally stop when the fluctuations are biggest. We take the most recent/largest data sample result and ignore (or should!) results performed on smaller data sets. p-values still distributed uniformly from 0 to 1. A recipe for generating “effects that go away”
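The effect of cherry-picking among interim results can be illustrated with a small simulation of d_n. This is a minimal sketch: the checkpoint list stands in for "major conferences", and the run count, threshold of 2, and seed are arbitrary choices made for the illustration, not values from the talk.

```python
import math
import random


def deviation_path(n_max, checkpoints, rng):
    """Return d_n = sum(r_k) / sqrt(n) at each checkpoint n for one run."""
    wanted = set(checkpoints)
    total = 0.0
    path = {}
    for n in range(1, n_max + 1):
        total += rng.gauss(0.0, 1.0)  # r_k: IID unit Gaussian
        if n in wanted:
            path[n] = total / math.sqrt(n)
    return path


def excess_rates(n_runs=2000, threshold=2.0, seed=12345):
    """Rate of |d_n| > threshold at the final checkpoint only, and at any checkpoint."""
    checkpoints = [10, 20, 50, 100, 200, 500, 1000]
    rng = random.Random(seed)
    n_final = 0
    n_any = 0
    for _ in range(n_runs):
        path = deviation_path(checkpoints[-1], checkpoints, rng)
        if abs(path[checkpoints[-1]]) > threshold:
            n_final += 1
        if max(abs(d) for d in path.values()) > threshold:
            n_any += 1
    return n_final / n_runs, n_any / n_runs


frac_final, frac_any = excess_rates()
print(frac_final, frac_any)
```

Quoting only the final, largest dataset keeps the rate near the nominal two-sided value for a 2 sigma threshold, while picking the most extreme interim result inflates it.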


Look ElseWHEN

C. Paus, Implications Workshop, Mar. 27, 2012

https://indico.cern.ch/conferenceOtherViews.py?view=standard&confId=162621


Parameter Estimation: Marginalize or Profile?

[Figure: two panels. One shows the predicted and observed yield versus the nuisance parameter (in units of σ, from -3 to 3) for Predicted = $10^{+6}_{+3}$ and Observed = 15. The other shows the likelihood versus the nuisance parameter (in units of σ).]

If Pred = $10^{-6}_{-3}$ and obs = 15, then the likelihood would have one maximum, but it would have a corner. MINUIT may quote inappropriate uncertainties as the second derivative isn’t well defined.

The corner can be smoothed out. See R. Barlow, http://arxiv.org/abs/physics/0406120, http://arxiv.org/abs/physics/0401042, http://arxiv.org/abs/physics/0306138

But I know of no way to get rid of the double-peak (nor should there be a way; it can be a real effect. See the LEP2 TGC measurements).
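The double peak and the corner can both be reproduced with a toy likelihood. The sketch below makes several assumptions that the slide does not spell out: the yield responds piecewise-linearly to the nuisance parameter, the parameter has a unit-Gaussian constraint, and the observation is Poisson distributed. The function names are hypothetical.

```python
import math


def predicted_yield(theta, nominal, shift_up, shift_down):
    """Piecewise-linear response: shift_up at theta=+1, shift_down at theta=-1."""
    if theta >= 0.0:
        return nominal + shift_up * theta
    return nominal + shift_down * (-theta)


def log_likelihood(theta, n_obs, nominal, shift_up, shift_down):
    """Poisson term for the yield plus a unit-Gaussian constraint on theta."""
    mu = predicted_yield(theta, nominal, shift_up, shift_down)
    if mu <= 0.0:
        return float("-inf")
    return n_obs * math.log(mu) - mu - 0.5 * theta * theta


def local_maxima(n_obs, nominal, shift_up, shift_down, n_steps=600):
    """Scan theta from -3 to 3 and return the grid points that are local maxima."""
    thetas = [-3.0 + 6.0 * i / n_steps for i in range(n_steps + 1)]
    values = [log_likelihood(t, n_obs, nominal, shift_up, shift_down) for t in thetas]
    return [thetas[i] for i in range(1, n_steps)
            if values[i] > values[i - 1] and values[i] > values[i + 1]]


double_peak = local_maxima(15, 10.0, 6.0, 3.0)     # both shifts raise the prediction
single_peak = local_maxima(15, 10.0, -6.0, -3.0)   # both shifts lower the prediction
print(double_peak, single_peak)
```

When both shifts raise the prediction toward the observed 15, the scan finds one maximum on each side of zero; when both lower it, the only maximum sits at the kink at zero, where the second derivative is undefined.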


Analysis Optimization in Isolation or in Combination?

Typical situation:

A measurement has a statistical and a systematic uncertainty, where the statistical uncertainty includes “good” systematics that are constrained by the data, and the “bad” ones never get better constrained no matter how much data are collected.

We sometimes have a choice of how to analyze marked Poisson data.

1) aggressive reconstruction making assumptions about particle distributions: more statistical power per event at the cost of introducing systematic uncertainty

2) more model-independent analysis with fewer assumptions: less statistical power per event but better control over systematics.

Combination with other measurements (from other data runs or other collaborations) is like collecting more data. Method 1 hits the systematic limit and loses weight in the combination even though it may be the most powerful method by itself.

More general: With little data, we are more dependent on our assumptions; with more data we can relax the assumptions and constrain our models.

Recommendation: For combinations, optimize for the large luminosity case.
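The crossover between the two methods can be made concrete with invented numbers. In this sketch the statistical uncertainty is assumed to scale as 1/sqrt(luminosity) while the "bad" systematic stays fixed; the values in `METHOD_1` and `METHOD_2` are hypothetical and chosen only to show the effect.

```python
import math


def total_uncertainty(stat_at_unit_lumi, syst_floor, lumi):
    """Statistical part shrinks as 1/sqrt(lumi); the "bad" systematic part does not."""
    stat = stat_at_unit_lumi / math.sqrt(lumi)
    return math.hypot(stat, syst_floor)


# Hypothetical (stat at unit luminosity, irreducible syst) pairs.
METHOD_1 = (1.0, 0.5)  # aggressive: more power per event, larger systematic
METHOD_2 = (1.5, 0.1)  # model-independent: less power per event, smaller systematic

for lumi in (1, 10, 100):
    u1 = total_uncertainty(*METHOD_1, lumi)
    u2 = total_uncertainty(*METHOD_2, lumi)
    print(lumi, round(u1, 3), round(u2, 3))
```

With these inputs Method 1 is better at unit luminosity and Method 2 is better at large luminosity, which is the regime a combination effectively probes.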


Correlations among Uncertainties: When is it Conservative, when not?

• Within a channel (contributions that add together): including correlations usually weakens the sensitivity (always: sensitivity is expected)

• Between channels: accounting for correlations is not conservative. One channel’s observed data becomes an “off” sample for another’s. Have to trust all the τ factors, and even offsets from central predictions in order to put in these correlations.

• Overestimating the impacts of systematic uncertainty on a prediction is not conservative if a correlation is taken into account. Can result in underestimated systematic error on a combined result.

Example (systematic uncertainty 1 is 100% correlated, syst uncertainty 2 is 100% correlated)

Measurement 1: m1 = 5 ± 1 (syst1) ± 1 (syst2)
Measurement 2: m2 = 5 ± 1 (syst1) ± 2 (syst2)

Combine with BLUE: mbest = 2 m1 - m2
mbest = 5 ± 1 (syst1) ± 0 (syst2)

Here accounting for correlation and an overestimated systematic uncertainty results in an aggressive result.
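The BLUE numbers in the example can be reproduced directly from the covariance matrix. The sketch below is a minimal two-measurement implementation written for this illustration (the function name and interface are hypothetical); it treats every source as 100% correlated between the two measurements, as in the example.

```python
def blue_two_measurements(m1, m2, sources):
    """BLUE for two measurements.

    sources: list of (shift on m1, shift on m2), each source taken as
    100% correlated between the two measurements.
    """
    c11 = sum(a * a for a, _ in sources)
    c22 = sum(b * b for _, b in sources)
    c12 = sum(a * b for a, b in sources)
    det = c11 * c22 - c12 * c12
    if det == 0.0:
        raise ValueError("singular covariance matrix")
    # Weights are proportional to C^-1 applied to (1, 1), normalized to sum to 1.
    u1 = (c22 - c12) / det
    u2 = (c11 - c12) / det
    w1, w2 = u1 / (u1 + u2), u2 / (u1 + u2)
    best = w1 * m1 + w2 * m2
    per_source = [abs(w1 * a + w2 * b) for a, b in sources]
    return (w1, w2), best, per_source


weights, best, per_source = blue_two_measurements(5.0, 5.0, [(1.0, 1.0), (1.0, 2.0)])
print(weights, best, per_source)
```

The negative weight on the second measurement is what cancels syst2 exactly, so the quoted zero depends entirely on the 1:2 ratio of the syst2 estimates being right.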


Extra Slides


Where is “Elsewhere?”

• Most searches for new physics have a “region of interest”

• Definition is a choice of the analyzer/collaboration

• Often bounded below by previous searches, bounded above by kinematic reach of the accelerator/detector

• Limits the amount of work involved in preparing an analysis. Sometimes a 2D search involves lots of training of MVA’s and checking sidebands and validation of inputs and outputs

Example: A search for pair-produced stop quarks which decay to c + Neutralino

If Mstop > mW + mb + mneutralino then another analysis takes over.


6/5/12 T. Junk Statistics ETH Zurich 30 Jan-3 Feb

An Example: Double-Tag Methods

[Figure: Dijet events at LEP1/SLD, with axes labeled x and y and the primary vertex marked. Z decays to u ubar, d dbar, s sbar, b bbar, leptons, neutrinos. Caption: A double-vertex-B-tagged event with a semileptonic decay.]

B-tagging efficiencies (efficiency of finding the displaced vertex) are about 40%. We do not trust MC modeling of the b-tag efficiency. Would like to measure the B-tag efficiency and the Br(Z → b bbar) branching fraction together in the same data. Count events with 0, 1, and 2 vertex tags. Enough information to solve for the Br and the efficiency.

x = b-tag of jet 1, y = b-tag of jet 2. Assume uncorrelated probabilities for tagging the jets. But the flavor of the jets is correlated! It is this flavor correlation that allows us to extract Br and Tag eff.
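The counting argument can be written out under simplifying assumptions that go beyond the slide: only b jets are ever tagged (no mistags of lighter flavors), both jets in a b bbar event are tagged independently with the same efficiency, and the event counts are invented for the illustration. With those assumptions the double-tag fraction is R ε² and the single-tag fraction is 2 R ε (1 - ε), where R is the b bbar fraction of the selected dijet events.

```python
def solve_double_tag(n0, n1, n2):
    """Solve for the per-jet tag efficiency eps and the b-bbar event fraction R.

    Assumes only b jets are tagged and the two jets are tagged independently:
        n2 / n = R * eps**2
        n1 / n = R * 2 * eps * (1 - eps)
    """
    n = n0 + n1 + n2
    # The ratio n1 / n2 = 2 * (1 - eps) / eps does not depend on R.
    eps = 2.0 * n2 / (n1 + 2.0 * n2)
    frac_bb = n2 / (n * eps * eps)
    return eps, frac_bb


# Hypothetical counts of events with 0, 1, and 2 vertex tags.
eps, frac_bb = solve_double_tag(87200, 9600, 3200)
print(eps, frac_bb)
```

The efficiency comes out of the data alone, which is the point of the method; a real measurement would also have to handle mistags and any residual tagging correlation between the two jets.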


On-Off Measurements: Averaged or Combined

• One global on-off measurement vs. breaking the data into subsamples

• Assume the “off” data are collected along with the “on” data (control sample on the other side of a cut for example)

• Global on-off measurement allows each data subsample’s off measurements to help measure each other data subsample’s on sample’s backgrounds. Assumption which may be false: you are allowed to do this. If the detector or accelerator changed partway through the run, then you may need to break the samples up.

• Breaking them apart allows only each subsample’s off measurements to calibrate the background in the corresponding on samples.

• Same for ABCD methods: averaging subsamples


Single Top Production Mechanisms

[Figure: diagrams labeled “s-Channel”, “t-Channel”, and “NLO Contributions to t-Channel Production”.]


Leveraging our Rate Measurements to Measure the Higgs Boson Mass

Assuming SM cross sections and branching fractions, measured rates are strong functions of mH. Example at mH = 115 GeV, assuming +3 sigma excess, and a median outcome in both the bb, ττ channels and the WW channels:

Tau channels can contribute here, even with less precise mrec than the bb channels


Interesting Behavior of CLs

CLs may not be a monotonic function of -2lnQ

Tails in the -2lnQ distribution shared in the s+b and b-only hypothesis (fit failures)

Not really a pathology of the method, but rather a reflection that the test statistic isn’t always doing its job of separating s+b-like outcomes from b-like outcomes in some fraction of the cases.

CLs = 1 for -2lnQ < -15 or -2lnQ > +15

Distributions are sums of two Gaussians each. The wide Gaussian is centered on zero.

Practical reason this could happen: every thousandth experimental outcome, the fit program “fails” and gives a random answer.
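The non-monotonic behavior follows from the two-Gaussian description and can be reproduced with a toy. The parameters below (narrow Gaussians at -4 and +4 with width 2, a failure fraction of one in a thousand, a wide Gaussian of width 30) are assumptions chosen for illustration and are not the values behind the slide, so the point where CLs returns to 1 differs from the ±15 quoted above.

```python
import math


def gauss_sf(x, mean, sigma):
    """P(X >= x) for a Gaussian."""
    return 0.5 * math.erfc((x - mean) / (sigma * math.sqrt(2.0)))


def mixture_sf(x, mean, sigma, fail_frac, fail_sigma):
    """Narrow Gaussian plus a rare wide 'fit failure' Gaussian centered on zero."""
    return ((1.0 - fail_frac) * gauss_sf(x, mean, sigma)
            + fail_frac * gauss_sf(x, 0.0, fail_sigma))


def cls(q_obs, mu=4.0, sigma=2.0, fail_frac=1e-3, fail_sigma=30.0):
    """CLs = CLs+b / CLb, with larger -2lnQ being more background-like."""
    cl_sb = mixture_sf(q_obs, -mu, sigma, fail_frac, fail_sigma)
    cl_b = mixture_sf(q_obs, +mu, sigma, fail_frac, fail_sigma)
    return cl_sb / cl_b


for q in (-20.0, 0.0, 4.0, 20.0):
    print(q, cls(q))
```

In the far tails both hypotheses are dominated by the shared wide component, so the ratio returns to 1 even for very background-like outcomes.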


A Bump that Got Away

Dijet mass sum in e+e-→jjjj ALEPH Collaboration, Z. Phys. C71, 179 (1996)

“the width of the bins is designed to correspond to twice the expected resolution ... and their origin is deliberately chosen to maximize the number of events found in any two consecutive bins”


A Sample with Zero Covariance is Not Necessarily Independent

[Figure: points on the perimeter of a circle in the x-y plane.]

Example: perimeter of a circle. Knowledge of x provides knowledge of y up to a 2-fold ambiguity. But the covariance of the sample vanishes!

Something to watch out for with Principal Components Analysis: it does not remove dependence between variables, only covariance.
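The circle example is easy to verify numerically. This sketch uses evenly spaced points on a unit circle (an assumption made so the result is deterministic) and shows that the covariance of x and y vanishes while x² and y² are perfectly anticorrelated.

```python
import math


def covariance(us, vs):
    """Sample covariance with a 1/N normalization."""
    mean_u = sum(us) / len(us)
    mean_v = sum(vs) / len(vs)
    return sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs)) / len(us)


n_points = 360
angles = [2.0 * math.pi * k / n_points for k in range(n_points)]
xs = [math.cos(a) for a in angles]
ys = [math.sin(a) for a in angles]

cov_xy = covariance(xs, ys)  # vanishes up to rounding

# The dependence shows up in a nonlinear combination: x^2 + y^2 = 1.
x2 = [x * x for x in xs]
y2 = [y * y for y in ys]
corr_sq = covariance(x2, y2) / math.sqrt(covariance(x2, x2) * covariance(y2, y2))
print(cov_xy, corr_sq)
```

A linear decorrelation such as PCA would leave this sample unchanged up to a rotation, with the constraint between x and y fully intact.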


Cases to be Careful about Applying the LEE Approximation

Not all searches are bump hunts on a smooth background: Multivariate Analyses are usually trained up at each mass separately, and there is not a single distribution we can look elsewhere in.

Statistical effects only. If there’s a systematic effect in the background modeling, a “signal” may grow in significance with additional data in a way that’s not described here.

The mismodeling may be concentrated in a small portion of the histogram (this is not a LEE effect but a more difficult question).

Background parameterization may grow in sophistication as data are collected.

Not all LR test statistic distributions are modeled well by chi squared distributions.

Combine a large-data-sample bump hunt with a high s/b, low-background (say, b=1e-5) search and the distribution of the LR is a convolution of chi squared and Poisson.

