Basic Counting Statistics
W. Udo Schröder, 2009
Stochastic Nuclear Observables

Two sources of stochastic observables x in nuclear science:
1) Nuclear phenomena are governed by quantal wave functions and inherent statistics.
2) Detection of processes occurs with imperfect efficiency (ε < 1) and finite resolution, distributing sharp events x₀ over a range in x.

Stochastic observables x have a range of values with frequencies determined by a probability distribution P(x), characterized by the set of moments of P:

$$\langle x^n \rangle = \int x^n P(x)\,dx; \quad n = 0, 1, 2, \ldots$$

Normalization: ⟨x⁰⟩ = 1. First moment (expectation value) of P:

$$E(x) = \langle x \rangle = \int x\,P(x)\,dx$$

Second central moment = "variance" of P(x):

$$\sigma_x^2 = \langle (x - \langle x \rangle)^2 \rangle = \langle x^2 \rangle - \langle x \rangle^2$$
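A quick numerical check of these definitions (a Python sketch; the population mean 5.0 and σ = 1.0 are arbitrary demo values, not data from the slides):

```python
import random

def moments(xs):
    """First moment <x> and second central moment <x**2> - <x>**2 of a sample."""
    n = len(xs)
    m1 = sum(xs) / n
    m2 = sum(x * x for x in xs) / n
    return m1, m2 - m1 * m1

random.seed(1)
# Draw from a Gaussian population with mean 5.0 and sigma 1.0 (demo values).
sample = [random.gauss(5.0, 1.0) for _ in range(100_000)]
mean, var = moments(sample)
```

With a large sample, the estimated mean and variance approach the population values 5.0 and 1.0.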
Uncertainty and Statistics

Nuclear systems are described by quantal wave functions ψᵢ(x, …; t), where (x, …; t) are the degrees of freedom of the system and time. Probability density (e.g., for x; integrate over the other degrees of freedom):

$$\frac{dP_i(x,t)}{dx} = |\psi_i(x,t)|^2, \quad i = 1, 2, \ldots$$

Normalization:

$$P_i(t) = \int dx\,\frac{dP_i(x,t)}{dx} = \int dx\,|\psi_i(x,t)|^2 = 1$$

Transition between states 1 → 2, driven by a transition operator M with density of final states ρ(E):

$$\lambda_{1\to 2} \propto |\langle \psi_2 | M | \psi_1 \rangle|^2\,\rho(E)$$

The initial state disappears exponentially:

$$\frac{dP_1(x,t)}{dx} = |\psi_1(x)|^2\,e^{-\lambda t} \quad (\text{state } 1 \text{ disappears})$$

Partial probability rates λ₁→₂ for disappearance (decay of 1 → 2) can vary over many orders of magnitude → no certainty → statistics.
The Normal Distribution

(Figure: Gaussian curves G(x) for two different widths, and the density P(x) centered at ⟨x⟩ with variance σₓ².)

Normal (Gaussian) probability, either a continuous function or a discrete distribution (over bins):

$$P(x) = \frac{1}{\sqrt{2\pi\sigma_x^2}}\,\exp\left\{-\frac{(x - \langle x \rangle)^2}{2\sigma_x^2}\right\}$$

Full width at half maximum:

$$\mathrm{FWHM} = 2\sqrt{2\ln 2}\,\sigma \approx 2.355\,\sigma$$

Normalized probability:

$$\int P(x)\,dx = 1$$
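The FWHM relation can be verified directly from the density (a sketch; the width σ = 2 is arbitrary):

```python
import math

def gauss_pdf(x, mean, sigma):
    """Normal probability density P(x)."""
    return math.exp(-(x - mean) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

sigma = 2.0
# FWHM = 2*sqrt(2*ln 2)*sigma ~= 2.355*sigma
fwhm = 2 * math.sqrt(2 * math.log(2)) * sigma
# At x = mean +- FWHM/2 the density is exactly half its peak value.
half = gauss_pdf(fwhm / 2, 0.0, sigma) / gauss_pdf(0.0, 0.0, sigma)
```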
Experimental Mean Counts and Variance

Average count n in a sample of N measurements:

$$\langle n \rangle = \frac{1}{N}\sum_{i=1}^{N} n_i \approx \langle n \rangle_{population} \quad (\text{unknown})$$

Variance of n in the individual samples ("error"):

$$\sigma^2 = \frac{1}{N-1}\sum_{i=1}^{N} (n_i - \langle n \rangle)^2 \approx \sigma_{population}^2$$

Variance (error) of the sample average:

$$\sigma_{\langle n \rangle}^2 = \frac{1}{N(N-1)}\sum_{i=1}^{N} (n_i - \langle n \rangle)^2$$

Expectation values and their uncertainties are measured by ensemble sampling; the sample (ensemble) approximates the population.

Example: from a ²³⁶U (0.25 mg) source, count the α particles emitted during N = 10 time intervals (samples of 1 min each). ⟨n⟩ = ?

Standard deviation of the mean: σ_⟨n⟩ = σ_pop/√N ≈ 59. Result: ⟨n⟩ = (35496 ± 59) counts/min, slightly different from sample to sample.
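The three estimators above translate directly into code. The count list here is hypothetical, constructed only so that it averages to 35496 like the slide's result; it is not the actual measured data:

```python
import math

def sample_stats(counts):
    """Sample average, sample variance (1/(N-1) convention),
    and the standard error of the average."""
    N = len(counts)
    mean = sum(counts) / N
    var = sum((n - mean) ** 2 for n in counts) / (N - 1)
    err = math.sqrt(var / N)  # sigma_<n> = sigma / sqrt(N)
    return mean, var, err

# Hypothetical counts for N = 10 one-minute intervals:
counts = [35400, 35550, 35480, 35620, 35390, 35510, 35460, 35580, 35440, 35530]
mean, var, err = sample_stats(counts)
```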
Sample Statistics

(Figures: true population distribution P(x), and histograms of normally distributed events for three independent samples, each marked with x₀ and x₀ ± σ.)

Assume a true population distribution for the variable x,

$$P(x) = \frac{1}{\sqrt{2\pi\sigma_x^2}}\,\exp\left\{-\frac{(x - \langle x \rangle_{pop})^2}{2\sigma_x^2}\right\}$$

with true ("population") mean ⟨x⟩_pop = 5.0 and σₓ = 1.0.

Three independent sample measurements (equivalent statistics) give:

⟨x⟩ = 5.11, s = 1.11;  ⟨x⟩ = 4.96, s = 0.94;  ⟨x⟩ = 4.96, s = 1.23

Mean (arithmetic sample average): ⟨x⟩ = (5.11 + 4.96 + 4.96)/3 = 5.01
Variance of the sample averages: s² = [(5.11 − 5.01)² + 2·(4.96 − 5.01)²]/2 = 0.0075
Variance of the mean: σ_⟨x⟩² = 0.0075/3 = 0.0025 → σ_⟨x⟩ = 0.05

Result: ⟨x⟩_pop ≈ 5.01 ± 0.05
Example of Gaussian Population

(Figures: Monte Carlo events drawn from the Gaussian population, shown as event sequences and as histograms in 10 x-bins, for sample sizes n = 10 and n = 50; one sample gives x_aver = 5.0601, x_var = 0.8726.)

Sample size makes a difference (→ weighted average): the larger the sample, the narrower the distribution of x values, and the more it approaches the true Gaussian (normal) distribution.
Central-Limit Theorem

With increasing size n of the samples, the distribution of sample means approaches a Gaussian normal distribution, regardless of the form of the original (population) distribution.

The means (averages) of the different samples in the previous example cluster together closely. This is a general property:

The average of a distribution does not contain information on the shape of the distribution. The average of any truly random sample of a population is already somewhat close to the true population average.

Many or large samples narrow the choices: a smaller Gaussian width → the standard error of the mean decreases with increasing sample size.
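The theorem can be illustrated with a deliberately non-Gaussian (uniform) population: the spread of the sample means shrinks like 1/√n (a sketch; the sample sizes 5 and 50 are arbitrary):

```python
import random
import statistics

random.seed(42)

def sample_mean(n):
    """Mean of n draws from a non-Gaussian (uniform) population."""
    return sum(random.random() for _ in range(n)) / n

# Spread of the distribution of sample means for two sample sizes:
spread_small = statistics.stdev(sample_mean(5) for _ in range(2000))
spread_large = statistics.stdev(sample_mean(50) for _ in range(2000))
```

The ratio of the spreads is close to √(50/5) ≈ 3.16, as expected for the standard error of the mean.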
Binomial Distribution

Integer random variable m = number of events, out of N total, of a given type: e.g., decay of m (from a sample of N) radioactive nuclei, or detection of m (out of N) photons arriving at a detector. p = probability for a (one) success (decay of one nucleus, detection of one photon).

Choose an arbitrary sample of m trials out of N trials:
p^m = probability that a specified set of m trials all succeed;
(1 − p)^(N−m) = probability for N − m failures (survivals, escaping detection).

How many ways can m events be chosen out of N? The binomial coefficient:

$$\binom{N}{m} = \frac{N!}{m!\,(N-m)!} = \frac{N(N-1)\cdots(N-m+1)}{m!}$$

Total probability (success rate) for exactly m successes out of a total of N trials, in any order:

$$P_{binomial}(m) = \binom{N}{m}\,p^m\,(1-p)^{N-m}$$
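A minimal implementation of the binomial probability, checking the normalization and the mean N·p for the slide's example values N = 30, p = 0.1:

```python
from math import comb

def p_binomial(m, N, p):
    """P(m) = C(N, m) * p**m * (1-p)**(N-m)."""
    return comb(N, m) * p ** m * (1 - p) ** (N - m)

N, p = 30, 0.1
# Probabilities over m = 0..N sum to 1, and the mean is N*p = 3:
total = sum(p_binomial(m, N, p) for m in range(N + 1))
mean = sum(m * p_binomial(m, N, p) for m in range(N + 1))
```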
Moments and Limits

Probability for m "successes" out of N trials, with individual probability p:

$$P_{binomial}(m) = \binom{N}{m}\,p^m\,(1-p)^{N-m}$$

Normalization:

$$\sum_{m=0}^{N} P_{binomial}(m) = \sum_{m=0}^{N} \binom{N}{m}\,p^m\,(1-p)^{N-m} = [p + (1-p)]^N = 1$$

Mean and variance:

$$\langle m \rangle = N p, \qquad \sigma_m^2 = N p\,(1-p)$$

$$\sigma_m = \sqrt{N p\,(1-p)} \;\xrightarrow{\,p \ll 1\,}\; \sqrt{N p} = \sqrt{\langle m \rangle}$$

(Figure: binomial distributions for N = 30 with p = 0.1 (→ Poisson-like) and p = 0.3 (→ Gaussian-like).)

For large N, the binomial distribution approaches a Gaussian:

$$\lim_{N\to\infty} P_{binomial}(m) = \frac{1}{\sqrt{2\pi\sigma_m^2}}\,\exp\left\{-\frac{(m - \langle m \rangle)^2}{2\sigma_m^2}\right\}$$
Poisson Probability Distribution

(Figure: Poisson distributions for ⟨m⟩ = 0.5, 3, 5, and 10.)

Probability for observing m events when the average is ⟨m⟩ = μ:

$$P_{Poisson}(m, \mu) = \frac{\mu^m}{m!}\,e^{-\mu}, \qquad m = 0, 1, 2, \ldots$$

Here μ = ⟨m⟩ = N·p and σ² = μ; μ is the mean, the average number of successes in N trials. Observe N counts (events) → the uncertainty is σ = √N. Unlike the binomial distribution, the Poisson distribution does not depend explicitly on p or N! For large N·p, the Poisson distribution approaches a Gaussian (normal) distribution.

It results from the binomial distribution in the limit of small p and large N (with N·p > 0 finite):

$$\lim_{\substack{p \to 0 \\ N \to \infty}} P_{binomial}(N, m) = P_{Poisson}(m, \mu)$$
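Both the equal mean/variance property and the binomial limit can be checked numerically (a sketch; N = 10 000 and μ = 5 are arbitrary demo values):

```python
from math import comb, exp, factorial

def p_binomial(m, N, p):
    return comb(N, m) * p ** m * (1 - p) ** (N - m)

def p_poisson(m, mu):
    """P(m, mu) = mu**m * exp(-mu) / m!"""
    return mu ** m * exp(-mu) / factorial(m)

mu = 5.0
# Poisson mean and variance both equal mu:
mean = sum(m * p_poisson(m, mu) for m in range(100))
var = sum((m - mean) ** 2 * p_poisson(m, mu) for m in range(100))

# Binomial -> Poisson for small p and large N at fixed mu = N*p:
diff = abs(p_binomial(5, 10_000, mu / 10_000) - p_poisson(5, mu))
```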
Moments of Transition Probabilities

Example: a 0.25 mg sample of ²³⁶U (half-life t₁/₂ = 2.34·10⁷ a):

$$n_0 = 0.25\,\mathrm{mg} \cdot \frac{6.022\times 10^{23}}{236\,\mathrm{g}} = 6.38\times 10^{17}\ \text{nuclei}$$

Probability for decay (decay rate λ per nucleus), corresponding to this half-life:

$$\lambda = \frac{\ln 2}{t_{1/2}} = 5.6362\times 10^{-14}\ \mathrm{min}^{-1}$$

Expected decay rate of the sample:

$$\langle n \rangle = n_0\,\lambda = 6.38\times 10^{17} \cdot 5.6362\times 10^{-14}\ \mathrm{min}^{-1} = 3.5946\times 10^{4}\ \mathrm{min}^{-1}$$

Small probability for the process, but many trials (n₀ = 6.38·10¹⁷), with 0 < n₀·λ < ∞ → the statistical process follows a Poisson distribution; n = "random". Different statistical distributions: binomial, Poisson, Gaussian.
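The numbers in this example can be reproduced with a few lines (constants as on the slide; minor rounding differences are expected):

```python
import math

N_A = 6.022e23                               # Avogadro's number, mol^-1
mass_g = 0.25e-3                             # 0.25 mg of 236U
half_life_min = 2.34e7 * 365.25 * 24 * 60    # 2.34e7 years in minutes

n0 = mass_g * N_A / 236.0                    # number of nuclei
lam = math.log(2) / half_life_min            # decay rate per nucleus, min^-1
rate = n0 * lam                              # expected counts per minute
```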
Radioactive Decay as Poisson Process

Slow radioactive decay of a large sample: sample size N ≫ 1, decay probability p ≪ 1, with 0 < N·p < ∞.

¹³⁷Cs is an unstable isotope with t₁/₂ = 27 a → p = ln 2/27 = 0.026 a⁻¹ = 8.2·10⁻¹⁰ s⁻¹ → 0. Sample of N = 10¹⁵ nuclei (= trials for decay).

How many will decay (= activity ⟨n⟩)? ⟨n⟩ = N·p = 8.2·10⁵ s⁻¹.

Count rate estimate: ⟨dN/dt⟩ = (8.2·10⁵ ± 905) s⁻¹, where the estimated uncertainty is √⟨n⟩ ≈ 905.

Probability for m actual decays:

$$P_{Poisson}(\mu, m) = \frac{\mu^m\,e^{-\mu}}{m!}, \qquad \mu = 8.2\times 10^{5}$$
Functions of Stochastic Variables

Random independent variable sets {N₁}, {N₂}, …, {Nₙ} with corresponding variances σ₁², σ₂², …, σₙ². A function f(N₁, N₂, …, Nₙ) is defined for any tuple {N₁, N₂, …, Nₙ}. Expectation value (mean):

$$\langle f \rangle = f(\langle N_1 \rangle, \ldots, \langle N_n \rangle)$$

Gauss' law of error propagation:

$$\sigma_f^2 = \left(\frac{\partial f}{\partial N_1}\right)^2 \sigma_1^2 + \left(\frac{\partial f}{\partial N_2}\right)^2 \sigma_2^2 + \cdots + \left(\frac{\partial f}{\partial N_n}\right)^2 \sigma_n^2$$

with each partial derivative evaluated while holding the other variables fixed. Further terms appear if the Nᵢ are not independent (→ correlations); otherwise, the individual component variances (Δf)² add.
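A generic numerical sketch of this propagation law, using finite-difference partial derivatives (the test function f = N₁ − N₂ and its σ values are arbitrary):

```python
import math

def propagate(f, values, sigmas, eps=1e-6):
    """Gauss error propagation: sigma_f**2 = sum_i (df/dN_i)**2 * sigma_i**2,
    with the partial derivatives estimated by central differences."""
    var = 0.0
    for i, s in enumerate(sigmas):
        up = list(values); up[i] += eps
        dn = list(values); dn[i] -= eps
        dfdx = (f(*up) - f(*dn)) / (2 * eps)
        var += (dfdx * s) ** 2
    return math.sqrt(var)

# For f = N1 - N2 the variances simply add: sigma_f = sqrt(s1**2 + s2**2).
sigma_f = propagate(lambda n1, n2: n1 - n2, [100.0, 40.0], [10.0, 40.0 ** 0.5])
```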
Example: Spectral Analysis

(Figure: spectrum with peak area A on top of a background B, with background counts B₁ and B₂ sampled on either side of the peak.)

Analyze a peak in the channel range c₁ – c₂, with the background determined at the beginning of the background regions left and right of the peak; n = c₂ − c₁ + 1 channels.

Total area: N₁₂ = A + B. With N(c₁) = B₁ and N(c₂) = B₂, the linear background is ⟨B⟩ = n(B₁ + B₂)/2. Peak area A:

$$A = N_{12} - \frac{n}{2}(B_1 + B_2)$$

$$\sigma_A^2 = N_{12} + \frac{n^2}{4}(B_1 + B_2)$$

Adding or subtracting two Poisson-distributed numbers N₁ and N₂ (N = N₁ ± N₂): the variances always add:

$$\sigma^2 = \sigma_1^2 + \sigma_2^2 = N_1 + N_2$$
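The area and uncertainty formulas translate directly (the counts below are hypothetical, not from a real spectrum):

```python
import math

def peak_area(N12, B1, B2, n):
    """Peak area A = N12 - n*(B1 + B2)/2 on a linear background, and its
    uncertainty from the Poisson statistics of total and background counts."""
    A = N12 - n * (B1 + B2) / 2
    sigma_A = math.sqrt(N12 + n ** 2 * (B1 + B2) / 4)
    return A, sigma_A

# Hypothetical counts: total area 5000 over n = 10 channels, with edge
# channels holding B1 = 100 and B2 = 120 background counts.
A, sigma_A = peak_area(5000, 100, 120, 10)
```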
Confidence Level

Assume a normally distributed observable x:

$$P(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\,\exp\left\{-\frac{(x - \langle x \rangle_{pop})^2}{2\sigma^2}\right\}$$

A sample distribution with observed average ⟨x⟩ and standard error σ approximates the population. Confidence level CL (central confidence interval):

$$P(|x - \langle x \rangle_{pop}| \le \Delta) = \int_{\langle x \rangle - \Delta}^{\langle x \rangle + \Delta} \frac{1}{\sqrt{2\pi\sigma^2}}\,\exp\left\{-\frac{(x - \langle x \rangle)^2}{2\sigma^2}\right\} dx = CL$$

$$CL(\Delta = 1\sigma) = 68.3\%, \qquad CL(\Delta = 2\sigma) = 95.4\%, \qquad CL(\Delta = 3\sigma) = 99.7\%$$

With confidence level CL (probability in %), the true value ⟨x⟩_pop differs by less than Δ = nσ from the measured average. Trustworthy experimental results quote ±3σ error bars!

(Figure: measured probability distribution with the 1σ, 2σ, and 3σ intervals marked.)
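For a Gaussian, these central confidence levels follow from the error function, CL(Δ = nσ) = erf(n/√2):

```python
from math import erf, sqrt

def central_cl(n_sigma):
    """Probability that a normally distributed x falls within
    <x> +- n_sigma * sigma (the central confidence interval)."""
    return erf(n_sigma / sqrt(2))

cl1 = central_cl(1)   # ~0.683
cl2 = central_cl(2)   # ~0.954
cl3 = central_cl(3)   # ~0.997
```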
Setting Confidence Limits

Example: search for a rare decay with decay rate λ; observe no counts within time t. Decay probability law: dP/dt = −dN/(N dt) = λ·exp(−λt); P is symmetric in λ and t. Normalizing over λ (given no counts in t):

$$P(\lambda)\,d\lambda = t\,e^{-\lambda t}\,d\lambda, \qquad \int_0^\infty t\,e^{-\lambda t}\,d\lambda = 1$$

Probability that the true rate lies below λ₀:

$$P(\lambda \le \lambda_0) = \int_0^{\lambda_0} t\,e^{-\lambda t}\,d\lambda = 1 - e^{-\lambda_0 t} = CL$$

Upper limit:

$$\lambda_0 = -\frac{1}{t}\,\ln[1 - P(\lambda \le \lambda_0)] = -\frac{1}{t}\,\ln[1 - CL] \ge 0$$

Higher confidence levels CL (0 ≤ CL ≤ 1) → larger upper limits for a given inspected time t. Reduce the limit by measuring for a longer period.
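A sketch of the upper-limit formula (the measuring time t = 1000 is an arbitrary placeholder):

```python
from math import exp, log

def upper_limit(t, cl):
    """Upper limit lambda_0 = -ln(1 - CL)/t on a decay rate when zero
    counts are observed during a measuring time t."""
    return -log(1 - cl) / t

t = 1000.0                       # arbitrary measuring time (e.g., seconds)
lam90 = upper_limit(t, 0.90)     # 90% CL upper limit
lam99 = upper_limit(t, 0.99)     # higher CL -> larger upper limit
# Consistency: a rate of exactly lam90 yields zero counts with probability 1 - CL.
surv = exp(-lam90 * t)
```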
Maximum Likelihood

Measurement of correlations between observables y and x: {xᵢ, yᵢ | i = 1…N}. Hypothesis: y(x) = f(c₁,…,c_m; x). Parameters defining f: {c₁,…,c_m}; n_dof = N − m degrees of freedom for a "fit" of the data with f.

For every data point (with Gaussian uncertainty σᵢ):

$$P(c_1,\ldots,c_m; x_i) = \frac{1}{\sqrt{2\pi\sigma_i^2}}\,\exp\left\{-\frac{[y_i - f(c_1,\ldots,c_m; x_i)]^2}{2\sigma_i^2}\right\}$$

Maximize the simultaneous probability:

$$P(c_1,\ldots,c_m) = \prod_{i=1}^{N} P(c_1,\ldots,c_m; x_i) = \prod_{j=1}^{N} \frac{1}{\sqrt{2\pi\sigma_j^2}} \cdot \exp\left\{-\sum_{i=1}^{N} \frac{[y_i - f(c_1,\ldots,c_m; x_i)]^2}{2\sigma_i^2}\right\}$$

This is equivalent to minimizing chi-squared by varying {c₁,…,c_m}, i.e. ∂χ²/∂cᵢ = 0:

$$\chi^2(c_1,\ldots,c_m) := \sum_{i=1}^{N} \frac{[y_i - f(c_1,\ldots,c_m; x_i)]^2}{\sigma_i^2}$$

When is χ² as good as can be?
Minimizing χ²

Example: linear fit f(a, b; x) = a + b·x to a data set {xᵢ, yᵢ, σᵢ}. Minimize:

$$\chi^2(a, b) = \sum_{i=1}^{N} \frac{(y_i - a - b x_i)^2}{\sigma_i^2}$$

Setting the derivatives to zero,

$$\frac{\partial \chi^2}{\partial a} = -2\sum_{i=1}^{N} \frac{y_i - a - b x_i}{\sigma_i^2} = 0, \qquad \frac{\partial \chi^2}{\partial b} = -2\sum_{i=1}^{N} \frac{x_i\,(y_i - a - b x_i)}{\sigma_i^2} = 0$$

gives the normal equations:

$$a \sum_{i=1}^{N} \frac{1}{\sigma_i^2} + b \sum_{i=1}^{N} \frac{x_i}{\sigma_i^2} = \sum_{i=1}^{N} \frac{y_i}{\sigma_i^2}$$

$$a \sum_{i=1}^{N} \frac{x_i}{\sigma_i^2} + b \sum_{i=1}^{N} \frac{x_i^2}{\sigma_i^2} = \sum_{i=1}^{N} \frac{x_i y_i}{\sigma_i^2}$$

This is equivalent to solving a system of linear equations:

$$\begin{pmatrix} d_{11} & d_{12} \\ d_{21} & d_{22} \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}, \qquad D = \begin{vmatrix} d_{11} & d_{12} \\ d_{21} & d_{22} \end{vmatrix}$$

$$a = \frac{1}{D}\begin{vmatrix} c_1 & d_{12} \\ c_2 & d_{22} \end{vmatrix}, \qquad b = \frac{1}{D}\begin{vmatrix} d_{11} & c_1 \\ d_{21} & c_2 \end{vmatrix}$$

with d₁₁ = Σ 1/σᵢ², d₁₂ = d₂₁ = Σ xᵢ/σᵢ², d₂₂ = Σ xᵢ²/σᵢ², c₁ = Σ yᵢ/σᵢ², c₂ = Σ xᵢyᵢ/σᵢ².
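The normal equations and Cramer's rule translate directly into a weighted linear fit (a sketch; the data set is a hypothetical exact line, which any consistent fitter must recover):

```python
def linear_fit(xs, ys, sigmas):
    """Weighted linear least-squares fit y = a + b*x, solving the
    2x2 normal equations with Cramer's rule."""
    w = [1.0 / s ** 2 for s in sigmas]
    d11 = sum(w)
    d12 = sum(wi * x for wi, x in zip(w, xs))
    d22 = sum(wi * x * x for wi, x in zip(w, xs))
    c1 = sum(wi * y for wi, y in zip(w, ys))
    c2 = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
    D = d11 * d22 - d12 * d12
    a = (c1 * d22 - c2 * d12) / D
    b = (c2 * d11 - c1 * d12) / D
    return a, b

# Exact line y = 1 + 2x should be recovered regardless of the weights:
a, b = linear_fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0], [0.1, 0.2, 0.1, 0.3])
```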
Distribution of Chi-Squareds

Distribution of the possible χ² for data sets distributed normally about a theoretical expectation (function), with n_dof degrees of freedom:

$$P(\chi^2, n_{dof})\,d\chi^2 = \frac{(\chi^2)^{n_{dof}/2 - 1}\,e^{-\chi^2/2}}{2^{n_{dof}/2}\,\Gamma(n_{dof}/2)}\,d\chi^2$$

with mean ⟨χ²⟩ = n_dof. Here Γ(n/2) = (n/2 − 1)! for even n, and the factorial can be approximated by Stirling's formula:

$$n! \approx 2.507\,n^{n + 1/2}\,e^{-n}\,(1 + 0.0833/n)$$

Reduced χ²:

$$\chi_r^2 = \chi^2 / n_{dof}, \qquad n_{dof} = N - m$$

For χ_r² ≈ 1 the confidence is of order 50%.

The confidence follows from the tail probability:

$$1 - CL = P(\chi^2 \ge \chi_0^2,\, n_{dof}) = \int_{\chi_0^2}^{\infty} P(x, n_{dof})\,dx$$

(Figure: χ² distributions P(u) for n_dof = 1, …, 5, with u := χ²; ⟨χ²⟩ = n_dof = 5 marked.)

P should be ≈ 0.5 for a reasonable fit.
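The χ² density and its tail probability can be evaluated with the standard library alone; the midpoint-rule integration used here is an assumption for illustration, not a method from the slides:

```python
from math import exp, gamma

def chi2_pdf(u, n_dof):
    """P(u, n) = u**(n/2 - 1) * exp(-u/2) / (2**(n/2) * Gamma(n/2)), u = chi**2."""
    return u ** (n_dof / 2 - 1) * exp(-u / 2) / (2 ** (n_dof / 2) * gamma(n_dof / 2))

def chi2_tail(u0, n_dof, steps=200_000, u_max=200.0):
    """P(chi**2 >= u0) by simple midpoint-rule numerical integration."""
    h = (u_max - u0) / steps
    return sum(chi2_pdf(u0 + (k + 0.5) * h, n_dof) for k in range(steps)) * h

# Total probability is 1; for chi**2 equal to n_dof the tail probability
# is near 0.5, i.e. a "reasonable fit":
total = chi2_tail(0.0, 5)
p5 = chi2_tail(5.0, 5)
```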
CL for χ²-Distributions

(Figure: the shaded tail area of the χ² distribution corresponds to 1 − CL.)
Correlations in Data Sets

Correlations within a data set; example: yᵢ is small whenever xᵢ is small.

(Figures: scatter plots of an uncorrelated and a correlated joint distribution P(x, y).)

Uncorrelated variables factorize:

$$P_{unc}(x, y) = P(x) \cdot P(y) = \frac{1}{2\pi\sigma_x\sigma_y}\,\exp\left\{-\frac{(x - \langle x \rangle)^2}{2\sigma_x^2} - \frac{(y - \langle y \rangle)^2}{2\sigma_y^2}\right\}$$

Correlated bivariate normal distribution:

$$P_{corr}(x, y) = \frac{1}{2\pi\sqrt{\sigma_x^2\sigma_y^2 - \sigma_{xy}^2}}\,\exp\left\{-\frac{\sigma_y^2 (x - \langle x \rangle)^2 - 2\sigma_{xy}(x - \langle x \rangle)(y - \langle y \rangle) + \sigma_x^2 (y - \langle y \rangle)^2}{2\,(\sigma_x^2\sigma_y^2 - \sigma_{xy}^2)}\right\}$$

Covariance:

$$\sigma_{xy} = \int (x - \langle x \rangle)(y - \langle y \rangle)\,P(x, y)\,dx\,dy$$

The correlation ellipse is tilted by an angle α with

$$\cot 2\alpha = \frac{\sigma_x^2 - \sigma_y^2}{2\,\sigma_{xy}}$$

Correlation coefficient:

$$r_{xy} = \frac{\sigma_{xy}}{\sigma_x\,\sigma_y}, \qquad -1 \le r_{xy} \le 1$$
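Covariance and the correlation coefficient, estimated from Monte Carlo samples (the construction y = x + 0.5·noise is an arbitrary way to induce a correlation):

```python
import random

def covariance(xs, ys):
    """sigma_xy = <(x - <x>)(y - <y>)>, estimated from a sample."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

def correlation(xs, ys):
    """r_xy = sigma_xy / (sigma_x * sigma_y), bounded by -1 <= r <= 1."""
    return covariance(xs, ys) / (covariance(xs, xs) * covariance(ys, ys)) ** 0.5

random.seed(0)
xs = [random.gauss(0, 1) for _ in range(50_000)]
noise = [random.gauss(0, 1) for _ in range(50_000)]
ys_corr = [x + 0.5 * e for x, e in zip(xs, noise)]    # correlated with x
ys_unc = [random.gauss(0, 1) for _ in range(50_000)]  # independent of x

r_corr = correlation(xs, ys_corr)   # expected ~ 1/sqrt(1.25) ~ 0.894
r_unc = correlation(xs, ys_unc)     # expected ~ 0
```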
Correlations in Data Sets

For fit parameters cᵢ, the correlation coefficient is built from the covariance over the n samples:

$$r_{ij} = \frac{\sigma_{ij}}{\sigma_i\,\sigma_j} = \frac{\langle (c_i - \langle c_i \rangle)(c_j - \langle c_j \rangle) \rangle}{\sigma_i\,\sigma_j}, \qquad -1 \le r_{ij} \le 1$$

$$\sigma_{ij} = \frac{1}{n}\sum_{n} (c_i - \langle c_i \rangle)(c_j - \langle c_j \rangle) \quad (\text{covariance})$$

The uncertainties of the deduced most likely parameters cᵢ (e.g., a, b for a linear fit) depend on the depth/shallowness and shape of the χ² surface.

(Figures: an uncorrelated χ² surface, with contours aligned with the cᵢ and cⱼ axes, vs. a correlated χ² surface with tilted contours.)

For uncorrelated parameters the joint probability factorizes; for correlated parameters it does not:

$$P_{unc}(\{c\}) = \prod_i P(c_i), \qquad P_{corr}(\{c\}) \ne \prod_i P(c_i)$$
Multivariate Correlations

(Figure: χ² surface in the (cᵢ, cⱼ) plane, with an initial guess and a search path toward the minimum.)

On a smoothed χ² surface, different search strategies can be used:
- steepest gradient; Newton's method, with or without damping of oscillations;
- biased Monte Carlo: Metropolis MC algorithms;
- simulated annealing (an MC method derived from metallurgy);
- various software packages: LINFIT, MINUIT, ….