Date post: | 21-Jan-2017 |
Upload: | peter-coles |
Statistics in Astronomy
Peter Coles
STFC Summer School, Cardiff 27th August 2015
Lecture 1: Probability
“The Essence of Cosmology is Statistics”
George McVittie
1 May 2023
Precision Cosmology
“…as we know, there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns -- the ones we don't know we don't know.”
SAY “PRECISION COSMOLOGY” ONE MORE TIME…
Direct versus Inverse Reasoning
Theory (Ω, H0…)
Observations
Fine Tuning
• In the standard model of cosmology the free parameters are fixed by observations
• But are these values surprising?
• Even microscopic physics seems to have “unnecessary” features that allow complexity to arise
• Are these coincidences? Are they significant?
• These are matters of probability…
What is a Probability?
• It’s a number between 0 (impossible) and 1 (certain)
• Probabilities can be manipulated using simple rules (“sum” for “OR” and “product” for “AND”).
• But what do they mean?
• Standard interpretation is frequentist (proportions in an ensemble)
Bayesian Probability
• Probability is a measure of the “strength of belief” that it is reasonable to hold.
• It is the unique way to generalize deductive logic (Boolean Algebra)
• Represents insufficiency of knowledge to make a statement with certainty
• All probabilities are conditional on stated assumptions or known facts, e.g. P(A|B)
• Often called “subjective”, but at least the subjectivity is on the table!
Balls
• Two urns A and B.
• A has 999 white balls and 1 black one; B has 1 white ball and 999 black ones.
• P(white | urn A) = 0.999, etc.
• Now shuffle the two urns, and pull out a ball from one of them. Suppose it is white. What is the probability it came from urn A?
• P(Urn A | white) requires “inverse” reasoning: Bayes’ Theorem
Urn A: 999 white, 1 black
Urn B: 999 black, 1 white
P(white ball | urn is A) = 0.999, etc.
Bayes’ Theorem: Inverse reasoning
• Rev. Thomas Bayes (1702-1761)
• Never published any mathematical papers during his lifetime
• The general form of Bayes’ theorem was actually given later (by Laplace).
Bayes’ Theorem
• In the toy example, X is “the urn is A” and Y is “the ball is white”.
• Everything is calculable, and (taking the two urns to be equally likely a priori) the required posterior probability is 0.999
$$P(X|Y,I) = \frac{P(X|I)\,P(Y|X,I)}{P(Y|I)}$$
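The urn calculation can be checked numerically. This is a minimal sketch, assuming each urn is equally likely to be picked, so that P(urn A | I) = 0.5. The function name `posterior` and its arguments are illustrative, not part of the lecture.

```python
def posterior(prior_x, like_y_given_x, like_y_given_not_x):
    """P(X|Y,I) from Bayes' theorem when X and not-X are the only hypotheses."""
    prior_not_x = 1.0 - prior_x
    # P(Y|I): total probability of the data, summed over both hypotheses
    evidence = prior_x * like_y_given_x + prior_not_x * like_y_given_not_x
    return prior_x * like_y_given_x / evidence

# X = "the urn is A", Y = "the ball is white"
# Assumption: both urns are equally likely a priori.
p_a_given_white = posterior(0.5, 999 / 1000, 1 / 1000)
print(p_a_given_white)  # approximately 0.999
```

With equal priors the posterior equals the likelihood, which is why the answer matches P(white | urn A). Changing `prior_x` shows how strongly the result depends on the prior.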
Probable Theories
$$P(H|D,I) = \frac{P(H|I)\,P(D|H,I)}{P(D|I)}$$
• Bayes’ Theorem allows us to assign probabilities to hypotheses (H) based on (assumed) knowledge (I), which can be updated when data (D) become available
• P(D|H,I) – likelihood
• P(H|I) – prior probability
• P(H|D,I) – posterior probability
• The best theory is the most probable!
Why does this help?
• Rigorous Form of Ockham’s Razor: the hypothesis with fewest free parameters becomes most probable.
• Can be applied to one-off events (e.g. Big Bang)
• It’s mathematically consistent!
• It can even make sense of the Anthropic Principle…
Null Hypotheses
• The frequentist approach to statistical hypothesis testing involves the idea of a null hypothesis H0, which is the model you are prepared to accept unless there is evidence to the contrary.
• Under the null hypothesis one then constructs the sampling distribution of some statistic Q, called f(Q).
• If the measured value of Q is unlikely on the basis of H0 then the null hypothesis is rejected.
Type I and Type II Errors
• There are two ways of making an error in this kind of test.
• Type I is to reject the null when it is actually true. The probability of this happening is called the significance level (or “size”) of the test, usually called α. It is usually chosen to be 5% or 1%, and the null is rejected when the p-value of the measured statistic falls below α.
• The other possibility is to fail to reject the null when it is wrong. If the probability of this happening is β then (1-β) is called the power.
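A small sketch of how the size, the Type II error rate and the power fit together. The assumptions are illustrative: the statistic Q is Gaussian with unit variance, the test is one-sided (reject when Q is large), and the alternative is a hypothetical model in which Q is shifted by 3 sigma.

```python
from statistics import NormalDist

def critical_value(alpha, null=NormalDist(0.0, 1.0)):
    """Threshold q_crit with P(Q > q_crit | H0) = alpha (one-sided test)."""
    return null.inv_cdf(1.0 - alpha)

def power(alpha, alternative, null=NormalDist(0.0, 1.0)):
    """1 - beta: probability of rejecting H0 when the alternative is true."""
    q_crit = critical_value(alpha, null)
    beta = alternative.cdf(q_crit)  # Type II error: Q falls below the threshold
    return 1.0 - beta

def p_value(q_measured, null=NormalDist(0.0, 1.0)):
    """P(Q >= q_measured | H0) for the one-sided test."""
    return 1.0 - null.cdf(q_measured)

alpha = 0.05
alt = NormalDist(3.0, 1.0)      # hypothetical alternative, for illustration only
print(critical_value(alpha))    # threshold on Q
print(power(alpha, alt))        # 1 - beta
print(p_value(2.0))             # compare with alpha to decide
```

Lowering α pushes the threshold up and so reduces the power: the two error rates trade off against each other for a fixed data set.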
Bayesian Hypothesis Testing
Two advantages of this approach are that it doesn’t put one hypothesis in a special position (the null), and it doesn’t separate estimation and testing.
Suppose Dr A has a theory that makes a direct prediction while Professor B has one that has a free parameter, say λ.
Suppose the likelihoods for a given set of data are P(D|A) and P(D|B,λ).
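Occam’s Razor
$$\frac{\Pr(A|D)}{\Pr(B|D)} = \frac{\Pr(A|D)}{\int d\lambda\,\Pr(B,\lambda|D)} = \frac{\Pr(A)\,\Pr(D|A)}{\int d\lambda\,\Pr(B,\lambda)\,\Pr(D|B,\lambda)} = \frac{\Pr(A)}{\Pr(B)}\;\frac{\Pr(D|A)}{\int d\lambda\,\Pr(\lambda|B)\,\Pr(D|B,\lambda)}$$
Occam factor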
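The penalty on the free parameter can be seen in a toy calculation. This sketch assumes a single measurement x with Gaussian error sigma, a fixed prediction `lam_a` for theory A, and a flat prior for λ over [`lam_min`, `lam_max`] in theory B. All of these choices, and the function names, are illustrative assumptions.

```python
import math

def gaussian(x, mean, sigma):
    """Gaussian probability density at x."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def likelihood_ratio(x, sigma, lam_a, lam_min, lam_max, n_steps=20000):
    """Pr(D|A) divided by the prior-averaged likelihood of theory B."""
    like_a = gaussian(x, lam_a, sigma)
    d_lam = (lam_max - lam_min) / n_steps
    prior_b = 1.0 / (lam_max - lam_min)  # flat prior Pr(lambda|B): an assumption
    # Midpoint-rule approximation to the integral over lambda
    marginal_b = sum(
        prior_b * gaussian(x, lam_min + (i + 0.5) * d_lam, sigma) * d_lam
        for i in range(n_steps)
    )
    return like_a / marginal_b

# Same datum, same theory A, two different prior ranges for B's parameter
wide = likelihood_ratio(0.5, 1.0, 0.0, -50.0, 50.0)
narrow = likelihood_ratio(0.5, 1.0, 0.0, -5.0, 5.0)
print(wide, narrow)
```

Theory B can fit the datum perfectly by tuning λ, yet the ratio favours A, and more strongly so the wider B's prior range. That is the sense in which the razor is built in.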
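Bayesian estimation
This involves finding the posterior distribution of the parameters given the data and any prior information.
$$p(a_1 \ldots a_m | x_1 \ldots x_n, I) = K\, p(a_1 \ldots a_m | I)\, p(x_1 \ldots x_n | a_1 \ldots a_m, I)$$
$$K^{-1} = \int p(a_1 \ldots a_m | I)\, p(x_1 \ldots x_n | a_1 \ldots a_m, I)\, da_1 \ldots da_m$$
The normalizing integral $K^{-1}$ is the evidence!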
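For a single parameter the posterior can be tabulated directly on a grid. This is a sketch under stated assumptions: one parameter a (the mean of Gaussian measurements with known sigma), a flat prior over the grid, and made-up data values.

```python
import math

def grid_posterior(data, sigma, a_grid):
    """Posterior density p(a | x_1..x_n, I) on a uniform grid, flat prior assumed."""
    da = a_grid[1] - a_grid[0]
    prior = 1.0 / (a_grid[-1] - a_grid[0])  # flat prior density (assumption)
    unnorm = []
    for a in a_grid:
        # Log-likelihood of independent Gaussian measurements with mean a;
        # the constant (2 pi sigma^2)^(-n/2) is dropped because it cancels below.
        log_like = sum(-0.5 * ((x - a) / sigma) ** 2 for x in data)
        unnorm.append(prior * math.exp(log_like))
    k_inv = sum(unnorm) * da  # normalizing integral, up to the dropped constant
    return [u / k_inv for u in unnorm]

data = [4.8, 5.1, 5.3, 4.9, 5.4]           # hypothetical measurements
a_grid = [0.01 * i for i in range(1001)]   # a from 0 to 10
post = grid_posterior(data, 0.5, a_grid)
a_best = a_grid[post.index(max(post))]
print(a_best)  # close to the sample mean of the data
```

With a flat prior the posterior peak coincides with the maximum-likelihood value; a non-flat prior would simply replace the constant `prior` by a function of a.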
Is there anything wrong with Frequentism?
• The laws for manipulating probabilities are no different
• What is different is the interpretation.
• OK to imagine an ensemble, but there is no need to assert that it is real! (mind projection fallacy)
• The idea of a prior is worrying for many, but is the only way to make this reasoning consistent
Prior and Prejudice
• Priors are essential.
• You usually know more than you think..
• Flat priors usually don’t make much sense.
• Maximum entropy, etc, give useful insights within a well-defined theory: “objective Bayesian”
• “Theory” priors are hard to assign, especially when there isn’t a theory…
Why is the Universe (nearly) flat?
• Assume the Universe is one of the Friedman family
• Q: What should we expect, given only this assumption?
• Ω=1 is a fixed point (so is Ω=0)..
• The Universe is walking a tightrope..
$$\dot{a}^2 = \frac{8\pi G\rho}{3} a^2 - kc^2$$
The Friedman Models
The simplest relativistic cosmological models are remarkably similar (although the more general ones have additional options…)
$$\ddot{a} = -\frac{4\pi G\rho}{3} a$$
Solutions of these are complicated, except when k=0 (flat Universe). This special case is called the Einstein de Sitter universe.
Notice that $\rho \propto 1/a^3$ for non-relativistic particles (“dust”).
Curvature
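Cosmology by Numbers
$$\dot{a}^2 = \frac{8\pi G\rho}{3} a^2 - kc^2$$
$$H = \frac{\dot{a}}{a} \quad\Rightarrow\quad H^2 = \frac{8\pi G\rho}{3} - \frac{kc^2}{a^2}$$
$$k = 0 \quad\Rightarrow\quad H^2 = \frac{8\pi G\rho}{3} \quad\Rightarrow\quad \rho = \rho_c = \frac{3H^2}{8\pi G}$$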
The “Critical Density”
This applies at any time, but we usually take the “present” time. In general, at $t = t_0$, with $H = H_0$ and $\rho = \rho_0$,
$$\Omega_0 = \frac{8\pi G\rho_0}{3H_0^2}$$
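To put a number on the critical density, here is a short sketch in SI units. The value H0 = 70 km/s/Mpc is an assumed round figure for illustration, not a value taken from the lecture, and the function names are hypothetical.

```python
import math

G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
MPC_IN_M = 3.0857e22   # one megaparsec in metres

def critical_density(h0_km_s_mpc):
    """rho_c = 3 H0^2 / (8 pi G) in kg m^-3, for H0 given in km/s/Mpc."""
    h0_si = h0_km_s_mpc * 1.0e3 / MPC_IN_M  # convert H0 to s^-1
    return 3.0 * h0_si ** 2 / (8.0 * math.pi * G)

def omega(rho, h0_km_s_mpc):
    """Density parameter Omega_0 = rho_0 / rho_c."""
    return rho / critical_density(h0_km_s_mpc)

rho_c = critical_density(70.0)  # assumed H0, for illustration only
print(rho_c)                    # of order 1e-26 kg per cubic metre
```

The result corresponds to only a few hydrogen atoms per cubic metre, which gives a feel for how dilute a critical-density universe is.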
The Cosmic Tightrope
• We know the Universe doesn’t have either a very large or a very small Ω, or we wouldn’t be around.
• We exist and this fact is an observation about the Universe
• The most probable value of Ω is therefore very close to unity
• Still leaves the mystery of what trained the Universe to walk the tightrope (inflation?)
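The tightrope can be made quantitative in the simplest case. Dividing the Friedman equation by 8πGρ/3 gives 1/Ω − 1 = −3kc²/(8πGρa²), and with ρ ∝ 1/a³ this quantity grows in proportion to a. The sketch below assumes a dust-only model with no cosmological constant and takes a = 1 today; the function name is illustrative.

```python
def omega_of_a(omega_0, a):
    """Density parameter at scale factor a (a = 1 today), dust-only Friedman model."""
    # (1/Omega - 1) is proportional to a when rho scales as a^-3,
    # so it is fixed everywhere by its present-day value.
    return 1.0 / (1.0 + (1.0 / omega_0 - 1.0) * a)

# Assumed present-day value Omega_0 = 0.3, purely for illustration
early = omega_of_a(0.3, 1.0e-3)   # when the Universe was 1000 times smaller
late = omega_of_a(0.3, 1.0e3)     # when it is 1000 times larger
print(early, late)
```

Any departure from Ω = 1 grows with the expansion, so a value of order unity today requires Ω to have been extremely close to 1 at early times, while Ω = 1 exactly stays put.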
[Diagram labels: Theories, Observations; Frequentist, Bayesian]
“CONCORDANCE”
Cosmology is an exercise in data compression
Cosmology is a massive exercise in data compression...
…but it is worth looking at the information that has been thrown away to check that it makes sense!
How Weird is the Universe?
• The (zero-th order) starting point is FLRW.
• The concordance cosmology is a “first-order” perturbation to this
• In it (and other “first-order” models), the initial fluctuations were a statistically homogeneous and isotropic Gaussian Random Field (GRF)
• These are the “maximum entropy” initial conditions having “random phases” motivated by inflation.
• Anything else would be weird….
$$P(A|M) \neq P(M|A)$$
Beware the Prosecutor’s Fallacy!
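A numerical illustration of the difference between P(A|M) and P(M|A), where A is an observed anomaly and M is the model. All three input probabilities are invented for the example, and the calculation assumes there is only one alternative to M.

```python
def p_model_given_anomaly(p_a_given_m, p_a_given_alt, prior_m):
    """P(M|A) for a model M and a single alternative, via Bayes' theorem."""
    evidence = p_a_given_m * prior_m + p_a_given_alt * (1.0 - prior_m)
    return p_a_given_m * prior_m / evidence

# Hypothetical numbers: the anomaly has a 1% chance under M, 2% under the
# alternative, and M starts with a prior probability of 0.95.
p_m = p_model_given_anomaly(0.01, 0.02, 0.95)
print(p_m)  # roughly 0.9, far from the 0.01 of P(A|M)
```

A small P(A|M) says little on its own; what matters is how much better the alternative predicts the anomaly, and how plausible that alternative was beforehand.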
Is there an Elephant in the Room?
Types of CMB Anomalies
• Type I – obvious problems with data (e.g. foregrounds)
• Type II – anisotropies (North-South, Axis of Evil..)
• Type III – localized features, e.g. “The Cold Spot”
• Type IV – Something else (even/odd multipoles, magnetic fields, ?)
“If tortured sufficiently, data will confess to almost anything”
Fred Menger
Weirdness in Phases
$$\frac{\Delta T(\theta,\phi)}{T} = \sum_{l}\sum_{m} a_{l,m} Y_{lm}(\theta,\phi)$$
$$a_{l,m} = |a_{l,m}| \exp\left[i\varphi_{l,m}\right]$$
For a homogeneous and isotropic Gaussian random field (on the sphere) the phases are independent and uniformly distributed. Non-random phases therefore indicate weirdness..
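A toy check of the random-phase property. This sketch draws independent complex Gaussian coefficients (ignoring the conjugate relation between +m and −m modes that a real sky imposes) and measures how uniform their phases are with the mean resultant length |⟨exp(iφ)⟩|. The statistic and function names are illustrative choices, not taken from the lecture.

```python
import cmath
import math
import random

def random_phases(n_modes, seed=0):
    """Phases of n_modes complex Gaussian coefficients a = |a| exp(i phi)."""
    rng = random.Random(seed)
    phases = []
    for _ in range(n_modes):
        # Independent Gaussian real and imaginary parts with equal variance
        a = complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
        phases.append(cmath.phase(a))  # phase in the range [-pi, pi]
    return phases

def mean_resultant_length(phases):
    """|<exp(i phi)>|: near 0 for uniform phases, near 1 for aligned phases."""
    n = len(phases)
    c = sum(math.cos(p) for p in phases) / n
    s = sum(math.sin(p) for p in phases) / n
    return math.hypot(c, s)

r_gaussian = mean_resultant_length(random_phases(10000))
r_aligned = mean_resultant_length([0.3] * 100)  # deliberately non-random phases
print(r_gaussian, r_aligned)
```

This statistic only detects one kind of non-randomness (a preferred phase). Correlations between the phases of different modes would need a different test.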
Final Points