
CLs - Reporting Search Results

Advanced Statistical Techniques in Particle Physics, Durham

Alex Read, University of Oslo

21 March, 2002


Outline

Comments

Introduction

Search procedure

Higgs search at LEP

Neutrino mixing (toy)

Conclusions

Fred: Are you Bayesian and believe, or are you frequentist and cover?

Alex: I believe I am frequentist but frequently think and act as Bayesian.


Comments

Don't make "final" hard cuts; include your favorite multivariate discriminant in the likelihood of the experiment.

I told you two years ago not to be afraid of candidates!

No luminosity-dependent optimization of event selection.

NP/Bayes decision gives optimal separation of specified hypotheses for a given information content.

S/(S+B) maps onto S/B, so it is an optimal event discriminator, but it compresses the interesting region into 1-ε, so beware of digitizing/numerical tolerances.

I hate the CLs name. It is only a small detail in the framework I describe.

P(Δ(-2lnL)<1)=68% is a terrible approximation for searches for rare decays (motivation for CLs/MFLR in early days).
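
The last comment can be checked numerically. The sketch below is an illustration, not part of the talk: it assumes a single Poisson counting experiment with known true mean `mu` and no background, and computes exactly how often the interval defined by Δ(-2lnL) < 1 contains the true value. The function names are hypothetical.

```python
import math

def delta_m2lnl(n, mu):
    """-2 ln[L(mu)/L(mu_hat)] for an observed Poisson count n (mu_hat = n)."""
    if n == 0:
        return 2.0 * mu
    return 2.0 * (mu - n - n * math.log(mu / n))

def poisson_pmf(n, mu):
    """Poisson probability of n for mean mu > 0, computed in log space."""
    return math.exp(-mu + n * math.log(mu) - math.lgamma(n + 1))

def coverage(mu, n_max=500):
    """Exact probability that the Delta(-2lnL) < 1 interval contains mu."""
    return sum(poisson_pmf(n, mu)
               for n in range(n_max)
               if delta_m2lnl(n, mu) < 1.0)

for mu in (1.0, 3.0, 100.0):
    print(f"mu = {mu:6.1f}   coverage = {coverage(mu):.3f}")
```

For `mu = 1` only the outcomes n = 1 and n = 2 give an interval that contains the true value, so the coverage is about 55% rather than 68%.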


Terms

I call the likelihood ratio Q=L(H1)/L(H0)

s and b denote both the signal and the background themselves and their expected rates

S and B are normalized pdf's

s and S depend on model parameters and S also depends on measured observables


Frontier Discovery

Scientists ultimately put confidence in a hypothesis or a theory if it has been able to withstand empirical or observational attempts to falsify it (Judd, Smith, Kidder, Research Methods in Social Relations (!)).

The theory or model must be falsifiable - the experiment must be sensitive to the parameters of the theory or model: Confidence intervals for insensitive experiments are uninformative.

It is our duty to be skeptical i.e. to try to falsify or exclude.

Good example of falsifiable theory: Z,W in electroweak unification.

We don't confirm theories, we fail to break them.


Problems

We want to conclude something about the signal but our data tends to be mostly background.

D'Agostini proposed that in the absence of a significant signal one should quote a sensitivity bound (where the LR falls below an agreed standard value) - a very pessimistic least common denominator.

Clifford, Cousins, D'Agostini,... - most of us think P(theory|data) even for frequentist conf. intervals based on P(data|theory).

Clifford: My suggestion for a possible way forward is to investigate and focus on frequentist confidence intervals which are approximate Bayesian credible intervals, or equivalently look for Bayesian credible intervals which have approximately the required coverage probability.

Even our beloved SM is an approximation!


Problems

Systematic uncertainties? Most of us are comfortable with encoding this in a pdf for the uncertain background, efficiency, etc. Trivial to include even hugely complicated systematic uncertainties in widening of pdf's (generalize H+C).

Discrete events: Unavoidable imperfect coverage in any method.

Unavoidable conservatism/overcoverage to avoid undercoverage (false exclusion rate lower than specified).

Debates about what the ensemble is.

Ex: 77 events in Tevatron top mass.

Higgs mass uncertainty at LEP wouldn't have been easier.


Is flip-flopping a problem?

What IS the experimental result?

x0 and where it lies in relation to the pdf bands of x|μ

All single and double-sided confidence intervals or limits of all confidences (90%, 95%, 99%, ...) coexist.

In the past one often made a subjective decision which was the most relevant to report.

In FC this freedom is gone.

The reader deserves more input.

No flipping until at least 3 sigma, rather 4-5!


Goals

Exclude, Observe, Discover, Measure

in a common framework if possible (LR)

"CLs" stops before measurements

Provide possibility for analysis optimization

Accelerator, detector, DAQ, analysis

Don't exclude or discover things which the experiment is obviously not sensitive to.

Keep it as simple as possible.


Likelihood ratio for search

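$$
Q = \frac{\prod_{i=1}^{N_{chan}} \frac{e^{-(s_i+b_i)}\,(s_i+b_i)^{n_i^{cand}}}{n_i^{cand}!} \prod_{j=1}^{n_i^{cand}} \frac{s_i S_i(x_{ij}) + b_i B_i(x_{ij})}{s_i+b_i}}
         {\prod_{i=1}^{N_{chan}} \frac{e^{-b_i}\,b_i^{n_i^{cand}}}{n_i^{cand}!} \prod_{j=1}^{n_i^{cand}} \frac{b_i B_i(x_{ij})}{b_i}}
$$

$$
-2\ln Q = 2 s_{tot} - 2 \sum_{i=1}^{N_{chan}} \sum_{j=1}^{n_i^{cand}} \ln\left(1 + \frac{s_i S_i(x_{ij})}{b_i B_i(x_{ij})}\right)
$$

The Poisson factors are the counting part; the factors containing S_i and B_i are the pdf (event discriminant) part.

For Higgs at LEP, x is 2d: the reconstructed mass and an (almost independent) "Higgs-tag". The signal s and S depend on m_H.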
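
As a minimal sketch of how the test statistic above could be evaluated, the function below sums 2 s_tot and the per-candidate log terms over channels. The dictionary layout, the function name and the example pdfs (a Gaussian mass peak over a flat background) are assumptions made for illustration; they are not taken from the LEP analyses.

```python
import math

def minus_two_ln_q(channels):
    """-2 ln Q = 2*s_tot - 2 * sum_i sum_j ln(1 + s_i*S_i(x_ij) / (b_i*B_i(x_ij))).

    Each channel is a dict with expected rates 's' and 'b', normalized
    pdfs 'S' and 'B' (callables of x), and the list of candidate 'events'.
    """
    total = 0.0
    for ch in channels:
        s, b = ch["s"], ch["b"]
        total += 2.0 * s                       # counting part
        for x in ch["events"]:                 # event-discriminant part
            total -= 2.0 * math.log(1.0 + s * ch["S"](x) / (b * ch["B"](x)))
    return total

def gaussian_pdf(mean, sigma):
    """Hypothetical signal shape: Gaussian peak in reconstructed mass."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return lambda x: norm * math.exp(-0.5 * ((x - mean) / sigma) ** 2)

def flat_pdf(lo, hi):
    """Hypothetical background shape: flat between lo and hi."""
    return lambda x: 1.0 / (hi - lo)

channel = {
    "s": 2.0, "b": 5.0,
    "S": gaussian_pdf(115.0, 3.0),
    "B": flat_pdf(80.0, 120.0),
    "events": [92.3, 104.8, 114.6],
}
print(minus_two_ln_q([channel]))
```

A candidate near the assumed signal peak pulls -2lnQ down (more signal-like), while a candidate far from the peak contributes almost nothing, which is why no hard cut on the discriminant is needed.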


Confidences

These are pdf's of -2lnQ for the 2 hypotheses in a test

Confidences are integrals of the pdf's from the right

CLb: conf. in background

1-CLb: test significance

CLsb: conf in sig.+bg.

CLs=CLsb/CLb: approximate conf. in sig.

These ensembles are no more invalid than those in FC!

Use CLsb for model-dependent goodness-of-fit!

[Figure: pdf's of -2lnQ for the two hypotheses, with the areas CLsb and 1-CLb indicated]
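
The definitions on this slide can be sketched for the simplest case. The code below assumes a single-channel counting experiment without event discriminant, where -2lnQ decreases monotonically with the number of candidates, so that integrating the pdf of -2lnQ from the right of the observed value is the same as summing Poisson probabilities for n <= n_obs. The function names are hypothetical.

```python
import math

def poisson_cdf(n, mu):
    """P(N <= n) for a Poisson mean mu."""
    term = math.exp(-mu)
    total = term
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return total

def confidences(n_obs, s, b):
    """Return (CLsb, CLb, CLs) for a counting experiment.

    CLsb = P(n <= n_obs | s+b), CLb = P(n <= n_obs | b), CLs = CLsb / CLb.
    """
    cl_sb = poisson_cdf(n_obs, s + b)
    cl_b = poisson_cdf(n_obs, b)
    return cl_sb, cl_b, cl_sb / cl_b

cl_sb, cl_b, cl_s = confidences(n_obs=0, s=3.0, b=1.0)
print(f"CLsb = {cl_sb:.4f}  CLb = {cl_b:.4f}  1-CLb = {1 - cl_b:.4f}  CLs = {cl_s:.4f}")
```

In this example CLs = exp(-3), just below 0.05, so a signal of 3 expected events would be excluded at 95% CL when no candidates are observed.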


Confidences

Never used!!!!


Sensitivity

Poor sensitivity, results highly ambiguous, CLs important, all methods more or less odd

Eventually peaks at -2lnQ=2s, due to n=0

Good sensitivity, strong exclusion or discovery probable

Moderate sensitivity, nearing expected exclusion limit

Legend: Backgr., Sig.+backgr.


Why not Bayes?

Want to report on the compatibility of the results of this experiment with the background or new physics before combining with prior.

Empirically saw coverage problems worse than CLs.

Can do a frequentist-motivated "calibration" but...

Why not use the optimal Neyman-Pearson test (LR)? Non-factorizable test-statistic eliminates possibility of CPU-saving techniques in pdf determination.

Empirical study showed worse performance, as expected from N-P, for anything other than simple event counting in the absence of prior-tuning.


Optimization

We ask for funding to exclude/discover exciting new physics. We build a great ACC+DET+DAQ and expect b~0 and L

With FC we could be tempted to increase the background to increase odds that we exclude the signal at 95% CL.

With CLs we won't improve until we add a channel that can see additional signal.

Optimization criticised as "however short intervals those LEP guys think they can get away with".

No. We want to exclude as powerfully as possible for a given false exclusion rate (which is allowed by sensitivity).
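
The effect of added background can be illustrated for the simplest outcome, zero observed candidates in a counting experiment. This is a sketch under stated assumptions: it compares CLs with a purely CLsb-based exclusion, used here only as a stand-in for a method whose limit tightens with background; it does not reproduce the FC construction. The function names are hypothetical.

```python
import math

def cl_sb_zero_candidates(s, b):
    """CLsb = P(n = 0 | s + b) when no candidates are observed."""
    return math.exp(-(s + b))

def cl_s_zero_candidates(s, b):
    """CLs = CLsb / CLb = exp(-(s+b)) / exp(-b) when no candidates are observed."""
    return math.exp(-(s + b)) / math.exp(-b)

s = 1.0  # a weak signal the experiment is barely sensitive to
for b in (0.0, 1.0, 3.0):
    print(f"b = {b:3.1f}   CLsb = {cl_sb_zero_candidates(s, b):.4f}"
          f"   CLs = {cl_s_zero_candidates(s, b):.4f}")
```

With no candidates, CLsb falls below 5% once enough background is added, even though the signal expectation is unchanged, while CLs stays at exp(-s) for any b. Only an increase in expected signal lowers it.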


Optimization

Exclusion potential vs. false exclusion, discovery potential vs. false discovery

With dice we can achieve

P_95 = P_FE = 5% (for 95% CL)

P_5σ = P_FD = 5.7x10^-7

If we can't do better than this then we are beyond D'Agostini's "wall of insensitivity"
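
The false discovery rate quoted for 5 sigma is the two-sided Gaussian tail probability, which can be checked with the standard library. The function name is hypothetical.

```python
import math

def two_sided_tail(n_sigma):
    """Probability that a unit Gaussian deviates by more than n_sigma in either direction."""
    return math.erfc(n_sigma / math.sqrt(2.0))

print(f"5 sigma, two-sided: {two_sided_tail(5.0):.2e}")
```

A random ("dice") decision that claims a discovery with this probability has a discovery potential equal to its false discovery rate, which is the baseline any real analysis has to beat.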


Higgs search at LEP

Can we exclude/falsify the indirect evidence?

Do we reach the sensitivity bound?

Or if we don't reach the sensitivity bound, is there evidence for signal?


Higgs search at LEP

Top plot corresponds to x vs.

Interesting structure but near sensitivity bound

Can't exclude dip region

Must test for significance

Interval >114.1 GeV is unexcluded

This is not the CI given that Higgs exists!!!!

Do not conclude that P(m_H<114.1)=5%!!!!


Higgs search at LEP

Direct evidence for Higgs is pitiful.

Is it necessary/useful to determine a CI for m_H?

Bayes possible

FC is the natural step after CLs but difficult

Took crude shortcut, assumed -2lnQ ~ χ², estimated interval with:

$$
-2\ln Q_{sb} \le \left(-2\ln Q_{sb}\right)_{\min} + 1
$$


Neutrino oscillations

We learned a lot in early days of Higgs WG by studying specific toy experiments with different methods.

Let's see what CLs, etc. says about the ν-osc. toy experiment in the FC article.

5 energy bins 10 to 60 GeV, L=600-1000 m, 100 events bg per bin, 100 events per 1% oscillation. s(test)=236 events.

Signal rate

$$
P(\nu_\mu \to \nu_e) = \sin^2(2\theta)\,\sin^2\left(\frac{1.27\,\Delta m^2\,L}{E}\right)
$$
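
The signal-rate formula can be sketched in code as follows. The unit convention (Δm² in eV², L in km, E in GeV) is the usual one for the factor 1.27 and is assumed here, since the slide does not state it. The conversion of probability into events uses the "100 events per 1% oscillation" figure from the slide; evaluating it at a single L and E instead of averaging over the bin and the baseline range is a simplification, and the function names are hypothetical.

```python
import math

def oscillation_probability(sin2_2theta, dm2_ev2, l_km, e_gev):
    """Two-flavour appearance probability sin^2(2 theta) * sin^2(1.27 dm2 L / E).

    Assumed units: dm2 in eV^2, L in km, E in GeV.
    """
    return sin2_2theta * math.sin(1.27 * dm2_ev2 * l_km / e_gev) ** 2

def expected_signal_events(probability, events_per_percent=100.0):
    """Signal events in one bin, given 100 events per 1% oscillation probability."""
    return events_per_percent * probability / 0.01

# Example point evaluated at the middle of the baseline range and of one
# energy bin (a simplification: no averaging over L or E is done here).
p = oscillation_probability(sin2_2theta=0.006, dm2_ev2=40.0, l_km=0.8, e_gev=35.0)
print(p, expected_signal_events(p))
```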


ν-osc: Likelihood ratios


ν-osc: Performance


ν-osc: Test signal

[Figure labels: median expected limit (90%), 90% CL contour, 10% CL contour, excluded region]


ν-osc: Test background

[Figure labels: 90% CL contour, median expected limit, excluded region]


ν-osc: Significances (1-CLb)

HUGE signal pollutes entire plane - nowhere is compatible with background.

Hole is due to comp. limitations


Summary

The likelihood ratio is a common element in the analysis of search results (essentially required).

The differences between FC and this analysis start out small but hypothesis testing and interval setting are different.

LR(FC) = L(s+b)/L(ŝ+b)

LR(search) = L(s+b)/L(b)

A search must include an attempt to falsify the new physics hypothesis.

The value of any confidence or credibility interval in the absence of a significant signal is questionable at best.

I recommend not reporting a CI until it says something about a signal.


Summary

CLs is an approximately frequentist construction with useful properties for hypothesis testing in searches for new physics

Allows robust optimization

Behaves ~intuitively (~P(theory|data)) with addition of new, weak channel, increased background, increased uncertainties

Although it is ~P(data|theory) it behaves more like P(theory|data) than FC

Does not produce confidence intervals in the signal parameters.

Has ~correct false exclusion rate in region of full sensitivity and gives no exclusion below sensitivity bound.

The same rich framework provides information relevant to the subjective decision to claim a discovery including a test of goodness-of-fit.


Summary

This exclusion->discovery framework has confronted several difficult real-life searches at LEP, among them the search for the SM Higgs boson.

There is one important participant missing from the conference!
