CLs - Reporting
Search Results
Advanced Statistical Techniques in Particle Physics, Durham
Alex Read, University of Oslo
21 March, 2002
Outline
Comments
Introduction
Search procedure
Higgs search at LEP
Neutrino mixing (toy)
Conclusions
Fred: Are you Bayesian and believe, or are you frequentist and cover?
Alex: I believe I am frequentist, but frequently think and act as a Bayesian.
Comments
Don't make "final" hard cuts; include your favorite multivariate discriminant in the likelihood of the experiment.
I told you two years ago not to be afraid of candidates!
No luminosity-dependent optimization of event selection.
NP/Bayes decision gives optimal separation of specified hypotheses for a given information content.
S/(S+B) maps monotonically onto S/B, so it is an optimal event discriminator, but it saturates the interesting region into 1-ε, so beware of digitizing/numerical tolerances.
I hate the CLs name. It is only a small detail in the framework I describe.
P(Δ(-2lnL)<1)=68% is a terrible approximation for searches for rare decays (motivation for CLs/MFLR in early days).
Terms
I call the likelihood ratio Q = L(H1)/L(H0)
s and b are both symbols for signal and background as well as expected rates for s and b
S and B are normalized pdf's
s and S depend on model parameters and S also depends on measured observables
Frontier Discovery
Scientists ultimately put confidence in a hypothesis or a theory if it has been able to withstand empirical or observational attempts to falsify it (Judd, Smith, Kidder, Research Methods in Social Relations (!)).
The theory or model must be falsifiable - the experiment must be sensitive to the parameters of the theory or model: Confidence intervals for insensitive experiments are uninformative.
It is our duty to be skeptical i.e. to try to falsify or exclude.
Good example of falsifiable theory: Z,W in electroweak unification.
We don't confirm theories, we fail to break them.
Problems
We want to conclude something about the signal but our data tends to be mostly background.
D'Agostini proposed that in the absence of a significant signal one should quote a sensitivity bound (where the LR falls below an agreed standard value) - a very pessimistic least common denominator.
Clifford, Cousins, D'Agostini,... - most of us think P(theory|data) even for frequentist conf. intervals based on P(data|theory).
Clifford: My suggestion for a possible way forward is to investigate and focus on frequentist confidence intervals which are approximate Bayesian credible intervals, or equivalently look for Bayesian credible intervals which have approximately the required coverage probability.
Even our beloved SM is an approximation!
Problems
Systematic uncertainties? Most of us are comfortable with encoding this in a pdf for the uncertain background, efficiency, etc. Trivial to include even hugely complicated systematic uncertainties in widening of pdf's (generalize H+C).
Discrete events: Unavoidable imperfect coverage in any method.
Unavoidable conservatism/overcoverage to avoid undercoverage (false exclusion rate lower than specified).
Debates about what the ensemble is.
Ex: 77 events in Tevatron top mass.
Higgs mass uncertainty at LEP wouldn't have been easier.
Is flip-flopping a problem?
What IS the experimental result?
x0 and where it lies in relation to the pdf bands of x|μ.
All single and double-sided confidence intervals or limits of all confidences 90%, 95%, 99%,... coexist.
In the past one often made a subjective decision which was the most relevant to report.
In FC this freedom is gone.
The reader deserves more input.
No flipping until at least 3 sigma, rather 4-5!
Goals
Exclude, Observe, Discover, Measure
in a common framework if possible (LR)
"CLs" stops before measurements
Provide possibility for analysis optimization
Accelerator, detector, DAQ, analysis
Don't exclude or discover things which the experiment is obviously not sensitive to.
Keep it as simple as possible.
Likelihood ratio for search
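$$
Q = \frac{\prod_{i=1}^{N_{chan}} \frac{e^{-(s_i+b_i)}\,(s_i+b_i)^{n_i^{cand}}}{n_i^{cand}!} \prod_{j=1}^{n_i^{cand}} \frac{s_i S_i(x_{ij}) + b_i B_i(x_{ij})}{s_i+b_i}}{\prod_{i=1}^{N_{chan}} \frac{e^{-b_i}\, b_i^{n_i^{cand}}}{n_i^{cand}!} \prod_{j=1}^{n_i^{cand}} \frac{b_i B_i(x_{ij})}{b_i}}
$$
$$
-2\ln Q = 2 s_{tot} - 2 \sum_{i=1}^{N_{chan}} \sum_{j=1}^{n_i^{cand}} \ln\left(1 + \frac{s_i S_i(x_{ij})}{b_i B_i(x_{ij})}\right)
$$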
Counting and pdf (event discriminant)
For Higgs at LEP, x is 2d: reconstructed mass and an (almost independent) "Higgs-tag"; the signal s and S depend on m_H.
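As an illustration of the test statistic on this slide, here is a minimal Python sketch of -2lnQ = 2 s_tot - 2 Σ ln(1 + s_i S_i(x_ij) / (b_i B_i(x_ij))), summed over channels i and candidates j. The function name `minus_two_ln_q` and the dictionary layout for a channel are assumptions made for this example, not part of the talk.

```python
import math

def minus_two_ln_q(channels):
    """Return -2 ln Q for a list of search channels.

    Each channel is a dict (hypothetical layout) with:
      "s", "b": expected signal and background rates (b > 0),
      "S", "B": normalized pdfs of the discriminant, as callables,
      "x":      list of observed candidate discriminant values.
    """
    s_tot = sum(ch["s"] for ch in channels)
    log_sum = 0.0
    for ch in channels:
        for x in ch["x"]:
            # Each candidate is weighted by its local s/b ratio.
            ratio = ch["s"] * ch["S"](x) / (ch["b"] * ch["B"](x))
            log_sum += math.log1p(ratio)
    return 2.0 * s_tot - 2.0 * log_sum

# Pure counting: S = B = 1 everywhere, so every candidate carries weight ln(1 + s/b).
flat = lambda x: 1.0
counting = [{"s": 2.0, "b": 1.0, "S": flat, "B": flat, "x": [0.1, 0.5, 0.9]}]
empty = [{"s": 2.0, "b": 1.0, "S": flat, "B": flat, "x": []}]
print(minus_two_ln_q(counting), minus_two_ln_q(empty))
```

With no candidates the statistic reduces to 2 s_tot, its most background-like value.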
Confidences
These are pdf's of -2lnQ for the 2 hypotheses in a test
Confidences are integrals of the pdf's from the right
CLb: conf. in background
1-CLb: test significance
CLsb: conf in sig.+bg.
CLs=CLsb/CLb: approximate conf. in sig.
These ensembles are no more invalid than those in FC!
Use CLsb for model-dependent goodness-of-fit!
[Figure labels: CLsb, 1-CLb]
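The confidences defined on this slide can be sketched for the simplest case, a single-channel counting experiment. There -2lnQ decreases monotonically with the number of candidates n, so integrating the pdf of -2lnQ from the observed value to the right is the same as summing Poisson probabilities for n ≤ n_obs. The function names below are assumptions for this example; a real search with event discriminants would need toy Monte Carlo or similar to get the pdfs.

```python
import math

def poisson_cdf(n, mu):
    """P(N <= n) for N ~ Poisson(mu), by direct summation."""
    term = math.exp(-mu)
    total = term
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return total

def confidences(n_obs, s, b):
    """Return (CLsb, CLb, CLs) for a counting experiment with b > 0."""
    cl_sb = poisson_cdf(n_obs, s + b)  # confidence in signal + background
    cl_b = poisson_cdf(n_obs, b)       # confidence in background
    return cl_sb, cl_b, cl_sb / cl_b   # CLs = CLsb / CLb

# Zero candidates: CLs = exp(-s), independent of the background level.
cl_sb, cl_b, cl_s = confidences(0, s=3.0, b=1.0)
print(cl_sb, cl_b, cl_s)
```

The n_obs = 0 case shows why raising the background does not tighten a CLs exclusion: CLsb shrinks with b, but the ratio CLsb/CLb does not.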
Confidences
Never used!!!!
Sensitivity
Poor sensitivity, results highly ambiguous, CLs important, all methods more or less odd
Eventually peaks at -2lnQ=2s, due to n=0
Good sensitivity, strong exclusion or discovery probable
Moderate sensitivity, nearing expected exclusion limit
[Figure legend: Backgr., Sig.+backgr.]
Why not Bayes?
Want to report on the compatibility of the results of this experiment with the background or new physics before combining with prior.
Empirically saw coverage problems worse than CLs.
Can do a frequentist-motivated "calibration" but...
Why not use the optimal Neyman-Pearson test (LR)? Non-factorizable test-statistic eliminates possibility of CPU-saving techniques in pdf determination.
Empirical study showed worse performance, as expected from N-P, for anything other than simple event counting in the absence of prior-tuning.
Optimization
We ask for funding to exclude/discover exciting new physics. We build a great ACC+DET+DAQ and expect b~0 and L
With FC we could be tempted to increase the background to increase odds that we exclude the signal at 95% CL.
With CLs we won't improve until we add a channel that can see additional signal.
Optimization criticised as "however short intervals those LEP guys think they can get away with".
No. We want to exclude as powerfully as possible for a given false exclusion rate (which is allowed by sensitivity).
Optimization
Exclusion potential vs. false exclusion, discovery potential vs. false discovery
With dice we can achieve
P_95 = P_FE = 5% (for 95% CL)
P_5σ = P_FD = 5.7x10^-7
If we can't do better than this then we are beyond D'Agostini's "wall of insensitivity"
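The false discovery rate quoted on this slide, 5.7x10^-7, agrees with the two-sided Gaussian tail probability at 5 sigma. Reading it as two-sided is an assumption made here; the short sketch below checks the arithmetic with Python's standard library. The function name is chosen for this example.

```python
import math

def two_sided_tail(n_sigma):
    """Probability that a unit Gaussian deviates by more than n_sigma in either direction."""
    return math.erfc(n_sigma / math.sqrt(2.0))

p_5sigma = two_sided_tail(5.0)    # about 5.7e-7
p_95 = two_sided_tail(1.96)       # about 0.05, i.e. the 95% CL level
print(p_5sigma, p_95)
```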
Higgs search at LEP
Can we exclude/falsify the indirect evidence?
Do we reach the sensitivity bound?
Or if we don't reach the sensitivity bound, is there evidence for signal?
Higgs search at LEP
Top plot corresponds to x vs.
Interesting structure but near sensitivity bound
Can't exclude dip region
Must test for significance
Interval >114.1 GeV is unexcluded
This is not the CI given that Higgs exists!!!!
Do not conclude that P(m_H<114.1)=5%!!!!
Higgs search at LEP
Direct evidence for Higgs is pitiful.
Is it necessary/useful to determine a CI for m_H?
Bayes possible
FC is the natural step after CLs but difficult
Took crude shortcut, assumed -2lnQ ~ χ², estimated interval with:
-2 ln Q_sb ≤ (-2 ln Q_sb)_min + 1
Neutrino oscillations
We learned a lot in early days of Higgs WG by studying specific toy experiments with different methods.
Let's see what CLs, etc. say about the ν-osc. toy experiment in the FC article.
5 energy bins 10 to 60 GeV, L=600-1000 m, 100 events bg per bin, 100 events per 1% oscillation. s(test)=236 events.
Signal rate:
$$
P(\nu_\mu \to \nu_e) = \sin^2(2\theta)\,\sin^2\!\left(\frac{1.27\,\Delta m^2 L}{E}\right)
$$
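A minimal sketch of how the expected signal per energy bin could be computed for this toy experiment follows. It uses the standard two-flavour formula P = sin²(2θ) sin²(1.27 Δm² L/E) with Δm² in eV², L in km and E in GeV. Several choices are assumptions for illustration only: five equal-width bins between 10 and 60 GeV, uniform averaging over energy within a bin and over L between 0.6 and 1.0 km, a simple midpoint grid, and all function and parameter names.

```python
import math

def osc_probability(sin2_2theta, dm2, l_km, e_gev):
    """Two-flavour oscillation probability (dm2 in eV^2, L in km, E in GeV)."""
    return sin2_2theta * math.sin(1.27 * dm2 * l_km / e_gev) ** 2

def expected_signal_per_bin(sin2_2theta, dm2, n_bins=5, e_min=10.0, e_max=60.0,
                            l_min=0.6, l_max=1.0, events_per_percent=100.0,
                            n_grid=50):
    """Expected signal events in each energy bin (assumed equal-width bins)."""
    width = (e_max - e_min) / n_bins
    rates = []
    for i in range(n_bins):
        e_lo = e_min + i * width
        total = 0.0
        # Midpoint-rule average of the probability over E and L (assumed uniform).
        for a in range(n_grid):
            e = e_lo + (a + 0.5) * width / n_grid
            for c in range(n_grid):
                l = l_min + (c + 0.5) * (l_max - l_min) / n_grid
                total += osc_probability(sin2_2theta, dm2, l, e)
        p_avg = total / n_grid ** 2
        # 100 events per 1% oscillation probability means 1e4 * P events.
        rates.append(events_per_percent * 100.0 * p_avg)
    return rates

rates = expected_signal_per_bin(sin2_2theta=0.01, dm2=40.0)
print(rates)
```

These per-bin rates, together with the 100 background events per bin, are the s_i and b_i that would enter the likelihood ratio for each test point in the (sin²2θ, Δm²) plane.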
ν-osc: Likelihood ratios
ν-osc: Performance
ν-osc: Test signal
Median expected limit (90%)
90% CL contour
10% CL contour
Excluded region
ν-osc: Test background
90% CL contour
Median expected limit
Excluded region
ν-osc: Significances (1-CLb)
HUGE signal pollutes entire plane - nowhere is compatible with background.
Hole is due to comp. limitations
Summary
The likelihood ratio is a common element in the analysis of search results (essentially required).
The differences between FC and this analysis start out small but hypothesis testing and interval setting are different.
LR(FC) = L(s+b) / L(ŝ+b)
LR(search) = L(s+b) / L(b)
A search must include an attempt to falsify the new physics hypothesis.
The value of any confidence or credibility interval in the absence of a significant signal is questionable at best.
I recommend not reporting a CI until it says something about a signal.
Summary
CLs is an approximately frequentist construction with useful properties for hypothesis testing in searches for new physics
Allows robust optimization
Behaves ~intuitively (~P(theory|data)) with addition of new, weak channel, increased background, increased uncertainties
Although it is ~P(data|theory) it behaves more like P(theory|data) than FC
Does not produce confidence intervals in the signal parameters.
Has ~correct false exclusion rate in region of full sensitivity and gives no exclusion below sensitivity bound.
The same rich framework provides information relevant to the subjective decision to claim a discovery including a test of goodness-of-fit.
Summary
This exclusion->discovery framework has confronted several difficult real-life searches at LEP, among them the search for the SM Higgs boson.
There is one important participant missing from the conference!