Date post: | 10-May-2015 |
Category: |
Education |
Upload: | christian-robert |
Bayesian Model Comparison in Cosmology with Population Monte Carlo
Monthly Notices Royal Astronomical Soc. 405 (4), 2381 - 2390, 2010
Christian P. Robert
Universite Paris Dauphine & CREST
http://www.ceremade.dauphine.fr/~xian
Joint works with D., Benabed K., Cappe O., Cardoso J.F., Fort G., Kilbinger M.,
[Marin J.-M., Mira A.,] Prunet S., Wraith D.
Outline
1 Cosmology background
2 Importance sampling
3 Application to cosmological data
4 Evidence approximation
5 Cosmology models
6 lexicon
Cosmology background
Cosmology
A large part of the data to answer some of the major questions in cosmology comes from studying the Cosmic Microwave Background (CMB) radiation (fossil heat released circa 380,000 years after the Big Bang).
The CMB is remarkably uniform. Only very sensitive instruments such as WMAP (NASA, 2001) can detect fluctuations in the CMB temperature, e.g. minute temperature variations: one part of the sky has a temperature of 2.7251 Kelvin (degrees above absolute zero), while another part of the sky has a temperature of 2.7249 Kelvin.
[Figure: CMB data, two panels]
[Marin & CPR, Bayesian Core, 2007]
Planck
Temperature variations are related to fluctuations in the density of matter in the early universe and thus carry information about the initial conditions for the formation of cosmic structures such as galaxies, clusters, and voids.
Planck: joint mission between the European Space Agency (ESA) and NASA, launched in May 2009. The Planck mission plans to provide datasets of nearly $5 \times 10^{10}$ observations to settle many open questions with CMB temperature data. Rather than scalar-valued observations, Planck will provide tensor-valued data and is thus likely to also open up this area of statistical research.
Some questions in cosmology
Will the universe expand forever, or will it collapse?
Is the universe dominated by exotic dark matter and what is its concentration?
What is the shape of the universe?
Is the expansion of the universe accelerating rather than decelerating?
Is the “flat ΛCDM paradigm” appropriate or is the curvature different from zero?
[Adams, The Guide [a.k.a. H2G2], 1979]
Statistical problems in cosmology
Potentially high dimensional parameter space [Not considered here]
Immensely slow computation of likelihoods, e.g. WMAP, CMB, because of numerically costly spectral transforms [Data is a Fortran program]
Nonlinear dependence and degeneracies between parameters introduced by physical constraints or theoretical assumptions
[Figure: two panels showing parameter degeneracies, w0 against Ωm and α against −M]
Importance sampling
Importance sampling solutions
1 Cosmology background
2 Importance sampling
    Adaptive importance sampling
    Adaptive multiple importance sampling
3 Application to cosmological data
4 Evidence approximation
5 Cosmology models
6 lexicon
Importance sampling 101
Importance sampling is based on the fundamental identity
$$\pi(f) = \int f(x)\,\pi(x)\,dx = \int f(x)\,\frac{\pi(x)}{q(x)}\,q(x)\,dx$$
If $x_1, \ldots, x_N$ are drawn independently from $q$,
$$\hat\pi(f) = \sum_{n=1}^{N} f(x_n)\,w_n; \qquad w_n = \frac{\pi(x_n)/q(x_n)}{\sum_{m=1}^{N} \pi(x_m)/q(x_m)},$$
provides a converging approximation to $\pi(f)$ (independent of the normalisation of $\pi$).
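The self-normalised estimator can be sketched in a few lines of Python. This is only an illustration under assumed choices: the target (an unnormalised N(1, 1) density), the proposal (N(0, 2²)) and all function names are hypothetical and do not come from the slides.

```python
import math
import random

def self_normalised_is(log_target, sample_q, log_q, f, n, rng):
    """Estimate pi(f) with weights w_n proportional to pi(x_n)/q(x_n)."""
    xs = [sample_q(rng) for _ in range(n)]
    log_w = [log_target(x) - log_q(x) for x in xs]
    shift = max(log_w)                       # stabilise the exponentials
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    w_bar = [wi / total for wi in w]         # normalised weights, sum to 1
    estimate = sum(wi * f(x) for wi, x in zip(w_bar, xs))
    return estimate, xs, w_bar

# Assumed toy target: unnormalised N(1, 1); the constant is never needed.
def log_target(x):
    return -0.5 * (x - 1.0) ** 2

# Assumed proposal: N(0, 2^2), wider than the target.
def log_q(x):
    return -0.5 * (x / 2.0) ** 2 - math.log(2.0 * math.sqrt(2.0 * math.pi))

def sample_q(rng):
    return rng.gauss(0.0, 2.0)

rng = random.Random(1)
estimate, xs, w_bar = self_normalised_is(
    log_target, sample_q, log_q, lambda x: x, 20000, rng)
print("estimated mean of the target:", estimate)
```

Working with log-weights and subtracting the maximum before exponentiating avoids overflow; the shift cancels in the normalisation.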
Adaptive importance sampling
Initialising importance sampling
PMC/AIS offers a solution to the difficulty of picking q through adaptivity:
Given a target $\pi$, PMC produces a sequence $q^t$ of importance functions ($t = 1, \ldots, T$) aimed at approximating $\pi$.
First sample produced by a regular importance sampling scheme, $x_1^1, \ldots, x_N^1 \sim q^1$, associated with importance weights
$$w_n^1 = \frac{\pi(x_n^1)}{q^1(x_n^1)}$$
and their normalised counterparts $\bar w_n^1$, providing a first approximation to a sample from $\pi$.
Moments of $\pi$ can then be approximated to construct an updated importance function $q^2$, etc.
Adaptive importance sampling
Optimality criterion?
The quality of approximation can be measured in terms of the Kullback divergence from the target,
$$D(\pi\|q^t) = \int \log\left(\frac{\pi(x)}{q^t(x)}\right)\pi(x)\,dx,$$
and the density $q^t$ can be adjusted incrementally to minimize this divergence.
PMC – Some papers
Cappe et al (2004) - J. Comput. Graph. Stat.
Outline of Population Monte Carlo but missed main point
Celeux et al (2005) - Comput. Stat. & Data Analysis
Rao-Blackwellisation for importance sampling and missing data problems
Douc et al (2007) - ESAIM Prob. & Stat. and Annals of Statistics
Convergence issues proving adaptation is positive where q is a mixture density of random-walk proposals (mixture weights varied)
Cappe et al (2007) - Stat. & Computing
Adaptation of q (mixture density of independent proposals), where weights and parameters vary
Wraith et al (2009) - Physical Review D
Application of Cappe et al (2007) to cosmology and comparison with MCMC
Beaumont et al (2009) - Biometrika
Application of Cappe et al (2007) to ABC settings
Kilbinger et al (2010) - Month. N. Royal Astro. Soc.
Application of Cappe et al (2007) to model choice in cosmology
Adaptive importance sampling (2)
Use of mixture densities
$$q^t(x) = q(x; \alpha^t, \theta^t) = \sum_{d=1}^{D} \alpha_d^t\, \varphi(x; \theta_d^t)$$
[West, 1993]
where
$\alpha^t = (\alpha_1^t, \ldots, \alpha_D^t)$ is a vector of adaptable weights for the D mixture components
$\theta^t = (\theta_1^t, \ldots, \theta_D^t)$ is a vector of parameters which specify the components
$\varphi$ is a parameterised density (usually taken to be multivariate Gaussian or Student-t, the latter preferred)
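As a rough illustration of such a mixture importance function, the sketch below evaluates and samples a one-dimensional mixture. It assumes Gaussian components for brevity (the slides prefer Student-t), and the component values and function names are made up for the example.

```python
import math
import random

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, alphas, thetas):
    """q(x; alpha, theta) = sum_d alpha_d * phi(x; theta_d), theta_d = (mean, sd)."""
    return sum(a * normal_pdf(x, m, s) for a, (m, s) in zip(alphas, thetas))

def mixture_sample(alphas, thetas, rng):
    """Pick component d with probability alpha_d, then draw from phi(.; theta_d)."""
    d = rng.choices(range(len(alphas)), weights=alphas, k=1)[0]
    mean, sd = thetas[d]
    return rng.gauss(mean, sd)

# Hypothetical two-component mixture (D = 2).
alphas = [0.3, 0.7]
thetas = [(-2.0, 1.0), (3.0, 0.5)]

rng = random.Random(0)
draws = [mixture_sample(alphas, thetas, rng) for _ in range(5000)]
print("density at x = 3:", mixture_pdf(3.0, alphas, thetas))
```

Both operations are needed in practice: sampling produces the particles, and the density evaluation enters the denominator of the importance weights.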
Cappe et al (2007) optimal scheme
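Update $q^t$ using an integrated EM approach minimising the KL divergence at each iteration
$$D(\pi\|q^t) = \int \log\left(\frac{\pi(x)}{\sum_{d=1}^{D} \alpha_d^t\, \varphi(x; \theta_d^t)}\right)\pi(x)\,dx,$$
equivalent to maximising
$$\ell(\alpha, \theta) = \int \log\left(\sum_{d=1}^{D} \alpha_d\, \varphi(x; \theta_d)\right)\pi(x)\,dx$$
in $\alpha, \theta$.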
PMC updates
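Maximization of $L^t(\alpha, \theta)$ leads to closed form solutions in exponential families (and for the t distributions).
For instance for $N_p(\mu_d, \Sigma_d)$:
$$\alpha_d^{t+1} = \int \rho_d(x; \alpha^t, \mu^t, \Sigma^t)\,\pi(x)\,dx,$$
$$\mu_d^{t+1} = \frac{\int x\,\rho_d(x; \alpha^t, \mu^t, \Sigma^t)\,\pi(x)\,dx}{\alpha_d^{t+1}},$$
$$\Sigma_d^{t+1} = \frac{\int (x - \mu_d^{t+1})(x - \mu_d^{t+1})^T \rho_d(x; \alpha^t, \mu^t, \Sigma^t)\,\pi(x)\,dx}{\alpha_d^{t+1}}.$$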
Empirical updates
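And empirical versions,
$$\alpha_d^{t+1} = \sum_{n=1}^{N} \bar w_n^t\, \rho_d(x_n^t; \alpha^t, \mu^t, \Sigma^t)$$
$$\mu_d^{t+1} = \frac{\sum_{n=1}^{N} \bar w_n^t\, x_n^t\, \rho_d(x_n^t; \alpha^t, \mu^t, \Sigma^t)}{\alpha_d^{t+1}}$$
$$\Sigma_d^{t+1} = \frac{\sum_{n=1}^{N} \bar w_n^t\, (x_n^t - \mu_d^{t+1})(x_n^t - \mu_d^{t+1})^T \rho_d(x_n^t; \alpha^t, \mu^t, \Sigma^t)}{\alpha_d^{t+1}}$$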
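One empirical update step might look as follows for a one-dimensional Gaussian mixture. This is a sketch under stated assumptions: ρ_d is taken to be the usual posterior membership probability α_d φ(x; θ_d) / Σ_l α_l φ(x; θ_l), which the slides do not spell out, the weights are assumed to be normalised, and the four particles are invented for the example.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def pmc_update(xs, w_bar, alphas, means, variances):
    """One empirical update of (alpha, mu, Sigma) for a 1-D Gaussian mixture."""
    n_comp = len(alphas)
    # rho[n][d]: assumed membership probability of x_n in component d
    rho = []
    for x in xs:
        dens = [alphas[d] * normal_pdf(x, means[d], variances[d]) for d in range(n_comp)]
        total = sum(dens)
        rho.append([v / total for v in dens])
    new_alphas, new_means, new_vars = [], [], []
    for d in range(n_comp):
        a = sum(w * r[d] for w, r in zip(w_bar, rho))
        m = sum(w * x * r[d] for w, x, r in zip(w_bar, xs, rho)) / a
        v = sum(w * (x - m) ** 2 * r[d] for w, x, r in zip(w_bar, xs, rho)) / a
        new_alphas.append(a)
        new_means.append(m)
        new_vars.append(v)
    return new_alphas, new_means, new_vars

# Hypothetical weighted sample with two clusters, and a poorly placed mixture.
xs = [-2.1, -1.9, 2.9, 3.1]
w_bar = [0.25, 0.25, 0.25, 0.25]
new_alphas, new_means, new_vars = pmc_update(
    xs, w_bar, alphas=[0.5, 0.5], means=[-1.0, 1.0], variances=[1.0, 1.0])
print(new_alphas, new_means, new_vars)
```

Because the memberships sum to one over components and the weights sum to one over particles, the updated mixture weights automatically sum to one.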
Banana benchmark
Twisted $N_p(0, \Sigma)$ target with $\Sigma = \mathrm{diag}(\sigma_1^2, 1, \ldots, 1)$, changing the second co-ordinate $x_2$ to $x_2 + b(x_1^2 - \sigma_1^2)$
[Figure: banana-shaped target in the (x1, x2) plane]
$p = 10$, $\sigma_1^2 = 100$, $b = 0.03$
[Haario et al. 1999]
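The benchmark target is simple to write down. The sketch below gives its unnormalised log-density and a direct sampler obtained by drawing from the untwisted normal and undoing the twist; the function names are illustrative, and the dimension p = 3 in the usage lines is chosen only to keep the example small.

```python
import math
import random

def banana_log_density(x, b=0.03, sigma1_sq=100.0):
    """Unnormalised log-density: N_p(0, Sigma) evaluated at the twisted point."""
    twisted = x[1] + b * (x[0] ** 2 - sigma1_sq)
    return (-0.5 * x[0] ** 2 / sigma1_sq
            - 0.5 * twisted ** 2
            - 0.5 * sum(xi ** 2 for xi in x[2:]))

def banana_sample(p, rng, b=0.03, sigma1_sq=100.0):
    """Exact draw: sample y ~ N_p(0, Sigma), then undo the twist on y_2."""
    y = [rng.gauss(0.0, math.sqrt(sigma1_sq))]
    y += [rng.gauss(0.0, 1.0) for _ in range(p - 1)]
    y[1] = y[1] - b * (y[0] ** 2 - sigma1_sq)
    return y

rng = random.Random(2)
draws = [banana_sample(3, rng) for _ in range(20000)]
mean_x2 = sum(d[1] for d in draws) / len(draws)
var_x2 = sum((d[1] - mean_x2) ** 2 for d in draws) / len(draws)
print("sample mean and variance of x2:", mean_x2, var_x2)
```

The twist has unit Jacobian, so no correction term is needed in the log-density. Having an exact sampler available is what makes this target convenient as a benchmark.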
Simulation
[Figure: six panels of simulated samples, on the same axes as the banana benchmark]
Monitoring by perplexity
Stop iterations when further adaptations do not improve $D(\pi\|q^t)$.
The transform $\exp[-D(\pi\|q^t)]$ may be estimated by the normalised perplexity $p = \exp(H_N^t)/N$, where
$$H_N^t = -\sum_{n=1}^{N} \bar w_n^t \log \bar w_n^t$$
is the Shannon entropy of the normalised weights.
Thus, minimization of the Kullback divergence can be approximately connected with the maximization of the (normalised) perplexity (values closer to 1 indicating good agreement between q and π).
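The normalised perplexity is cheap to compute from the weights alone. A minimal sketch, with a hypothetical function name and with zero weights skipped under the usual convention 0 log 0 = 0:

```python
import math

def normalised_perplexity(weights):
    """exp(Shannon entropy of the normalised weights) / N, a value in (0, 1]."""
    total = sum(weights)
    w_bar = [w / total for w in weights]
    entropy = -sum(w * math.log(w) for w in w_bar if w > 0.0)
    return math.exp(entropy) / len(weights)

uniform = normalised_perplexity([1.0] * 8)                # all particles contribute equally
degenerate = normalised_perplexity([1.0, 0.0, 0.0, 0.0])  # one particle carries everything
print(uniform, degenerate)
```

Equal weights give the maximal value 1, while a sample dominated by a single particle gives 1/N.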
Monitoring by ESS
A second criterion is the effective sample size (ESS)
$$\mathrm{ESS}_N^t = \left(\sum_{n=1}^{N} (\bar w_n^t)^2\right)^{-1}$$
which can be interpreted as the number of equivalent iid sample points.
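The ESS is equally direct to compute. The sketch below accepts unnormalised weights and normalises them internally; the function name is an illustrative choice.

```python
def effective_sample_size(weights):
    """ESS = 1 / sum of squared normalised weights, between 1 and N."""
    total = sum(weights)
    return 1.0 / sum((w / total) ** 2 for w in weights)

ess_uniform = effective_sample_size([2.0] * 10)   # equal weights
ess_skewed = effective_sample_size([3.0, 1.0])    # normalised weights 0.75 and 0.25
print(ess_uniform, ess_skewed)
```

Dividing the result by N gives the normalised ESS, which lies on the same 0 to 1 scale as the normalised perplexity.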
Simulation
[Figure: NPERPL (top panel) and NESS (bottom panel) against iteration, 1 to 10]
Normalised perplexity (top panel) and normalised effective sample size (ESS/N) (bottom panel) estimates for the first 10 iterations of PMC
Comparison to MCMC
Adaptive MCMC: Proposal is a multivariate Gaussian with Σ updated/based on previous values in the chain. Scale and update times chosen for optimal results.
[Figure: four panels, f_a (top) and f_b (bottom), PMC (left) and MCMC (right)]
Evolution of π(f_a) (top panels) and π(f_b) (bottom panels) from 10k points to 100k points for both PMC (left panels) and MCMC (right panels).
Simulation
[Figure: proportion of points inside (two panels) for d10, d2 and d1, PMC versus MCMC; panels f_c, f_e, f_h, f_d, f_g and f_i, PMC versus MCMC]
Results showing the distributions of the PMC and the MCMC estimates. All estimates are based on 500 simulation runs.
Adaptive multiple importance sampling
Full recycling:
At iteration t, design a new proposal $q_t$ based on all previous samples
$$x_1^1, \ldots, x_N^1, \ldots, x_1^{t-1}, \ldots, x_N^{t-1}$$
At each stage, the whole past can be used: if un-normalised weights $\omega_{i,t}$ are preserved along iterations, then all $x_i^t$'s can be pooled together
Caveat
When using several importance functions at once, $q_0, \ldots, q_T$, with samples $x_1^0, \ldots, x_{N_0}^0, \ldots, x_1^T, \ldots, x_{N_T}^T$ and importance weights $\omega_i^t = \pi(x_i^t)/q_t(x_i^t)$, merging through the empirical distribution
$$\sum_{t,i} \omega_i^t\, \delta_{x_i^t}(x) \Big/ \sum_{t,i} \omega_i^t \approx \pi(x)$$
fails to cull poor proposals: very large weights do remain large in the cumulated sample and poorly performing samples overwhelmingly dominate other samples in the final outcome.
Raw mixing of importance samples may be harmful, compared with a single sample, even when most proposals are efficient.
Deterministic mixtures
Owen and Zhou (2000) propose a stabilising recycling of the weights via deterministic mixtures by modifying the importance density $q_t(x_i^t)$ under which $x_i^t$ was truly simulated to a mixture of all the densities that have been used so far
$$\frac{1}{\sum_{j=0}^{T} N_j} \sum_{l=0}^{T} N_l\, q_l(x_i^t),$$
resulting in the deterministic mixture weight
$$\omega_i^t = \pi(x_i^t) \Big/ \frac{1}{\sum_{j=0}^{T} N_j} \sum_{l=0}^{T} N_l\, q_l(x_i^t).$$
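The stabilising effect can be seen on a toy case. The sketch below compares raw weights with deterministic mixture weights for two assumed Gaussian proposals, one well matched and one far too narrow; the target, the proposals and the four sample points are all invented for illustration.

```python
import math

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def target(x):
    return math.exp(-0.5 * x * x)   # assumed unnormalised N(0, 1) target

def raw_weights(samples, proposals):
    """pi(x) / q_t(x), using only the proposal that generated each point."""
    return [[target(x) / normal_pdf(x, m, s) for x in batch]
            for batch, (m, s) in zip(samples, proposals)]

def deterministic_mixture_weights(samples, proposals):
    """pi(x) divided by the pooled mixture (1 / sum_j N_j) sum_l N_l q_l(x)."""
    sizes = [len(batch) for batch in samples]
    n_total = sum(sizes)

    def mixture(x):
        return sum(n * normal_pdf(x, m, s)
                   for n, (m, s) in zip(sizes, proposals)) / n_total

    return [[target(x) / mixture(x) for x in batch] for batch in samples]

proposals = [(0.0, 1.0), (0.0, 0.3)]   # q_0 well matched, q_1 too narrow
samples = [[0.0, 1.0], [0.1, 1.5]]     # samples[t] stands for draws from q_t
raw = raw_weights(samples, proposals)
mixed = deterministic_mixture_weights(samples, proposals)
print("raw weight of the tail point:  ", raw[1][1])
print("mixture weight of the same point:", mixed[1][1])
```

The point at 1.5 lies far in the tail of the narrow proposal, so its raw weight is enormous, whereas in the mixture the well-matched proposal bounds the weight.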
Unbiasedness
Potential to exploit the most efficient proposals in the sequence $Q_0, \ldots, Q_T$ without rejecting any simulated value nor sample.
Poorly performing importance functions are simply eliminated through the erosion of their weights
$$\pi(x_i^t) \Big/ \frac{1}{\sum_{j=0}^{T} N_j} \sum_{l=0}^{T} N_l\, q_l(x_i^t)$$
as T increases.
Paradoxical feature of competing acceptable importance weights for the same simulated value well-understood in the cases of Rao-Blackwellisation and of Population Monte Carlo. More intricate here in that only unbiasedness remains [fake mixture]
AMIS
AMIS (or Adaptive Multiple Importance Sampling) uses importance sampling functions ($q_t$) that are constructed sequentially and adaptively, using past t − 1 weighted samples.
i weights of all present and past variables $x_i^l$ ($1 \le l \le t$, $1 \le i \le N_l$) are modified, based on the current proposals
ii the entire collection of importance samples is used to build the next importance function.
[Parallel with IMIS: Raftery & Bao, 2010]
The AMIS algorithm
Adaptive Multiple Importance Sampling
At iteration t = 1, . . . , T
1) Independently generate $N_t$ particles $x_i^t \sim q(x|\theta^{t-1})$
2) For $1 \le i \le N_t$, compute the mixture at $x_i^t$
$$\delta_i^t = N_0\, q_0(x_i^t) + \sum_{l=1}^{t} N_l\, q(x_i^t; \theta^{l-1})$$
and derive the weight of $x_i^t$,
$$\omega_i^t = \pi(x_i^t) \Big/ \Big[\delta_i^t \Big/ \sum_{j=0}^{t} N_j\Big].$$
3) For $0 \le l \le t-1$ and $1 \le i \le N_l$, actualise past weights as
$$\delta_i^l = \delta_i^l + N_t\, q(x_i^l; \theta^{t-1}) \quad\text{and}\quad \omega_i^l = \pi(x_i^l) \Big/ \Big[\delta_i^l \Big/ \sum_{j=0}^{t} N_j\Big].$$
4) Compute the parameter estimate $\theta^t$ based on
$$(x_1^0, \omega_1^0, \ldots, x_{N_0}^0, \omega_{N_0}^0, \ldots, x_1^t, \omega_1^t, \ldots, x_{N_t}^t, \omega_{N_t}^t)$$
[Cornuet, Marin, Mira & CPR, 2009, arXiv:0907.1254]
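The steps above can be sketched on a toy problem. The version below is a simplification under several assumptions that are not in the slides: one dimension, Gaussian proposals, equal sample sizes at every stage, a moment-matching update for step 4, and an invented target (unnormalised N(3, 0.5²)). Each particle's mixture sum is weighted by the sample size of every proposal used so far, in line with the deterministic mixture weights.

```python
import math
import random

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def amis(target, mean0, sd0, n_per_iter, n_iter, rng):
    """Toy 1-D AMIS with Gaussian proposals and moment-matching updates."""
    params = [(mean0, sd0)]   # proposals used so far; the last one is current
    xs, delta = [], []        # pooled particles and their mixture sums
    omega = []
    for _ in range(n_iter + 1):
        mean, sd = params[-1]
        # Step 3: add the current proposal's density to every past particle.
        delta = [d + n_per_iter * normal_pdf(x, mean, sd)
                 for x, d in zip(xs, delta)]
        # Steps 1-2: draw new particles, evaluate all proposals used so far.
        new = [rng.gauss(mean, sd) for _ in range(n_per_iter)]
        xs += new
        delta += [sum(n_per_iter * normal_pdf(x, m, s) for m, s in params)
                  for x in new]
        n_total = len(xs)
        omega = [target(x) / (d / n_total) for x, d in zip(xs, delta)]
        # Step 4 (assumed form): weighted mean and variance of the pooled sample.
        total = sum(omega)
        new_mean = sum(w * x for w, x in zip(omega, xs)) / total
        new_var = sum(w * (x - new_mean) ** 2 for w, x in zip(omega, xs)) / total
        params.append((new_mean, math.sqrt(max(new_var, 1e-12))))
    return xs, omega, params

def target(x):
    return math.exp(-0.5 * ((x - 3.0) / 0.5) ** 2)   # assumed toy target

xs, omega, params = amis(target, 0.0, 2.0, 500, 5, random.Random(3))
final_mean, final_sd = params[-1]
evidence = sum(omega) / len(omega)   # should be near 0.5 * sqrt(2 * pi)
print(final_mean, final_sd, evidence)
```

Keeping the running sums `delta` means that each past particle needs only one new density evaluation per iteration, rather than a full recomputation of the mixture.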
Studentised AMIS
When the proposal distribution $q_t$ is a Student's t proposal,
$$T_3(\mu, \Sigma)$$
mean μ and covariance Σ parameters can be updated by estimating first two moments of the target distribution Π
$$\mu^t = \frac{\sum_{l=0}^{t} \sum_{i=1}^{N_l} \omega_i^l\, x_i^l}{\sum_{l=0}^{t} \sum_{i=1}^{N_l} \omega_i^l} \quad\text{and}\quad \Sigma^t = \frac{\sum_{l=0}^{t} \sum_{i=1}^{N_l} \omega_i^l\, (x_i^l - \mu^t)(x_i^l - \mu^t)^T}{\sum_{l=0}^{t} \sum_{i=1}^{N_l} \omega_i^l},$$
i.e. using optimal update of Cappe et al. (2007)
Obvious extension to mixtures [and again optimal update of Cappe et al. (2007)]
Simulations
Same banana benchmark

Target function                      p    AMIS      Cappe'07
E(x1) = 0                            5    0.06558   0.06879
                                     10   0.06388   0.11051
                                     20   0.09167   0.17912
E(x2) = 0                            5    0.10215   0.11583
                                     10   0.21421   0.22557
                                     20   0.25316   0.29087
$\sum_{i=3}^{5} E(x_i) = 0$          5    0.00478   0.00927
$\sum_{i=3}^{10} E(x_i) = 0$         10   0.00902   0.02099
$\sum_{i=3}^{20} E(x_i) = 0$         20   0.01666   0.04208
var(x1) = 100                        5    2.60672   3.92650
                                     10   7.06686   7.48877
                                     20   8.20020   9.71725
var(x2) = 19                         5    2.10682   2.96132
                                     10   3.76660   5.08474
                                     20   4.85407   5.98031
$\sum_{i=3}^{5} var(x_i) = 3$        5    0.00645   0.01196
$\sum_{i=3}^{10} var(x_i) = 8$       10   0.01370   0.02636
$\sum_{i=3}^{20} var(x_i) = 18$      20   0.04609   0.06424

Root mean square errors calculated over 10 replications for different target functions and dimensions p.
Simulation (cont’d)
10 replicate ESSs for AMIS (left) and PMC (right) for p = 5, 10, 20.
Simulation (cont’d)
10 replicate absolute errors associated to the estimations of $E(x_1)$ (left column), $E(x_2)$ (center column) and $\sum_{i=3}^{p} E(x_i)$ (right column) using AMIS (left in each block) and PMC (right) for p = 5, 10, 20.
Simulation (cont’d)
10 replicate absolute errors associated to the estimations of $\mathrm{var}(x_1)$ (left column), $\mathrm{var}(x_2)$ (center column) and $\sum_{i=3}^{p} \mathrm{var}(x_i)$ (right column) using AMIS (left in each block) and PMC (right) for p = 5, 10, 20.
Application to cosmological data
Cosmological data
Posterior distribution of cosmological parameters for recent observational data of CMB anisotropies (differences in temperature from directions) [WMAP], SNIa, and cosmic shear.
Combination of three likelihoods, some of which are available as public (Fortran) code, and of a uniform prior on a hypercube.
Cosmology parameters
Parameters for the cosmology likelihood (C=CMB, S=SNIa, L=lensing)

Symbol    Description                     Minimum   Maximum   Experiment
Ωb        Baryon density                  0.01      0.1       C L
Ωm        Total matter density            0.01      1.2       C S L
w         Dark-energy eq. of state        -3.0      0.5       C S L
ns        Primordial spectral index       0.7       1.4       C L
∆²R       Normalization (large scales)                        C
σ8        Normalization (small scales)                        C L
h         Hubble constant                                     C L
τ         Optical depth                                       C
M         Absolute SNIa magnitude                             S
α         Colour response                                     S
β         Stretch response                                    S
a, b, c   galaxy z-distribution fit                           L

For WMAP5, σ8 is a deduced quantity that depends on the other parameters
Adaptation of importance function
Estimates
Parameter     PMC                               MCMC
Ωb            $0.0432^{+0.0027}_{-0.0024}$      $0.0432^{+0.0026}_{-0.0023}$
Ωm            $0.254^{+0.018}_{-0.017}$         $0.253^{+0.018}_{-0.016}$
τ             $0.088^{+0.018}_{-0.016}$         $0.088^{+0.019}_{-0.015}$
w             $-1.011 \pm 0.060$                $-1.010^{+0.059}_{-0.060}$
ns            $0.963^{+0.015}_{-0.014}$         $0.963^{+0.015}_{-0.014}$
10^9 ∆²R      $2.413^{+0.098}_{-0.093}$         $2.414^{+0.098}_{-0.092}$
h             $0.720^{+0.022}_{-0.021}$         $0.720^{+0.023}_{-0.021}$
a             $0.648^{+0.040}_{-0.041}$         $0.649^{+0.043}_{-0.042}$
b             $9.3^{+1.4}_{-0.9}$               $9.3^{+1.7}_{-0.9}$
c             $0.639^{+0.084}_{-0.070}$         $0.639^{+0.082}_{-0.070}$
−M            $19.331 \pm 0.030$                $19.332^{+0.029}_{-0.031}$
α             $1.61^{+0.15}_{-0.14}$            $1.62^{+0.16}_{-0.14}$
−β            $-1.82^{+0.17}_{-0.16}$           $-1.82 \pm 0.16$
σ8            $0.795^{+0.028}_{-0.030}$         $0.795^{+0.030}_{-0.027}$

Means and 68% credible intervals using lensing, SNIa and CMB
Advantage of AIS and PMC?
Parallelisation of the posterior calculations
- For the cosmological examples, we used up to 100 CPUs on a computer cluster to explore the cosmology posteriors using AIS/PMC, reducing the computational time from several days for MCMC to a few hours using PMC.
Low variance of Monte Carlo estimates
- For PMC and q closely matched to π, significant reductions in the variance of the Monte Carlo estimates are possible compared to estimates using MCMC. This also translates into a computational saving, with further savings possible by combining samples across iterations.
Simple diagnostics of ‘convergence’ (perplexity)
- For PMC, the perplexity provides a relatively simple measure of sampling adequacy to the target density of interest.
Evidence approximation
Evidence/Marginal likelihood/Integrated Likelihood ...
Central quantity of interest in (Bayesian) model choice
$$E = \int \pi(x)\,dx = \int \frac{\pi(x)}{q(x)}\,q(x)\,dx,$$
expressed as an expectation under any density q with large enough support.
Importance sampling provides a sample $x_1, \ldots, x_N \sim q$ and approximation of the above integral,
$$E \approx \frac{1}{N} \sum_{n=1}^{N} w_n$$
where the $w_n = \pi(x_n)/q(x_n)$ are the (unnormalised) importance weights.
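A log-scale version of this estimator is sketched below. The target (unnormalised standard normal, whose evidence is √(2π)), the Gaussian proposal and the function names are assumptions made for the example; the average of the unnormalised weights is computed via log-sum-exp for numerical stability.

```python
import math
import random

def log_evidence(log_target, sample_q, log_q, n, rng):
    """ln E, with E estimated by the average of the weights pi(x_n)/q(x_n)."""
    log_w = []
    for _ in range(n):
        x = sample_q(rng)
        log_w.append(log_target(x) - log_q(x))
    shift = max(log_w)   # log-sum-exp shift to avoid overflow
    return shift + math.log(sum(math.exp(lw - shift) for lw in log_w) / n)

def log_target(x):
    return -0.5 * x * x   # assumed toy target, E = sqrt(2 * pi)

SD_Q = 1.5                # assumed proposal N(0, 1.5^2), wider than the target

def log_q(x):
    return -0.5 * (x / SD_Q) ** 2 - math.log(SD_Q * math.sqrt(2.0 * math.pi))

def sample_q(rng):
    return rng.gauss(0.0, SD_Q)

ln_e = log_evidence(log_target, sample_q, log_q, 20000, random.Random(4))
print("estimated ln E:", ln_e, " exact:", 0.5 * math.log(2.0 * math.pi))
```

Unlike the self-normalised estimator of π(f), the evidence estimate requires the proposal density to be properly normalised, since any missing constant in q would rescale E.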
Back to the banana ...
Centred d-multivariate normal, $x \sim N_d(0, \Sigma)$ with covariance $\Sigma = \mathrm{diag}(\sigma_1^2, 1, \ldots, 1)$, which is slightly twisted in the first two dimensions by changing $x_2$ to be $x_2 + \beta(x_1^2 - \sigma_1^2)$, where $\sigma_1^2 = 100$ and β controls the degree of curvature.
We integrate over the unnormalised target density
$$E = \int \pi(\beta)\, f(x|\beta, \Sigma)\,d\beta$$
or
$$E = \int \pi(x|\beta, \Sigma)\,dx.$$
Simulation results (1)
[Figure: target in the (x1, x2) plane (two panels); posterior mean of β after the 10th iteration; log-evidence after the 10th iteration]
β unknown
Simulation results (2)
[Figure: perplexity, NESS and log-evidence against iteration (1 to 10); log-evidence for the final sample after the 10th iteration]
β = 0.015 known
Cosmology models
Back to cosmology questions
Standard cosmology successful in explaining recent observations, such as CMB, SNIa, galaxy clustering, cosmic shear, galaxy cluster counts, and Lyα forest clustering.
Flat ΛCDM model with only six free parameters (Ωm, Ωb, h, ns, τ, σ8)
Extensions to ΛCDM may be based on independent evidence (massive neutrinos from oscillation experiments), predicted by compelling hypotheses (primordial gravitational waves from inflation) or reflect ignorance about fundamental physics (dynamical dark energy).
Testing for dark energy, curvature, and inflationary models
Extended models
Focus on the dark energy equation-of-state parameter, modeled as
w = −1 ΛCDM
w = w0 wCDM
w = w0 + w1(1 − a) w(z)CDM
In addition, curvature parameter ΩK for each of the above is either ΩK = 0 (‘flat’) or ΩK ≠ 0 (‘curved’).
Choice of models represents simplest models beyond a “cosmological constant” model able to explain the observed, recent accelerated expansion of the Universe.
Cosmology priors
Prior ranges for dark energy and curvature models. In case of w(a) models, the prior on w1 depends on w0

Parameter   Description                  Min.        Max.
Ωm          Total matter density         0.15        0.45
Ωb          Baryon density               0.01        0.08
h           Hubble parameter             0.5         0.9
ΩK          Curvature                    −1          1
w0          Constant dark-energy par.    −1          −1/3
w1          Linear dark-energy par.      −1 − w0     (−1/3 − w0)/(1 − a_acc)
Cosmology priors (2)
Component to the matter-density tensor with w(a) < −1/3 for values of the scale factor a > a_acc = 2/3. To limit the state equation from below, we impose the condition w(a) > −1 for all a, thereby excluding phantom energy.
Natural limit on the curvature is that of an empty Universe, i.e. upper boundary on the curvature ΩK = 1. A lower boundary corresponds to an upper limit on the total matter-energy density: ΩK > −1, excluding high-density Universe(s) which are ruled out by the age of the oldest observed objects.
Alternative prior on ΩK could be derived from the paradigm of inflation, but most scenarios imply the curvature to be on the order of $10^{-60}$. The likelihood over such a prior on ΩK is essentially flat for any current and future experiments, hence cannot be assessed.
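The link between the prior range on w1 and the two physical conditions can be checked numerically. The sketch below encodes the linear parametrisation w(a) = w0 + w1(1 − a) and the w0-dependent bounds on w1 from the prior table, with a_acc = 2/3; the function names and the test value w0 = −0.8 are illustrative choices.

```python
A_ACC = 2.0 / 3.0   # scale factor a_acc used in the prior on w1

def w_of_a(a, w0=-1.0, w1=0.0):
    """w(a) = w0 + w1 (1 - a); w1 = 0 gives wCDM, w0 = -1 and w1 = 0 gives LCDM."""
    return w0 + w1 * (1.0 - a)

def w1_prior_range(w0, a_acc=A_ACC):
    """Bounds on w1 given w0: (-1 - w0, (-1/3 - w0) / (1 - a_acc))."""
    return (-1.0 - w0, (-1.0 / 3.0 - w0) / (1.0 - a_acc))

w0 = -0.8
w1_min, w1_max = w1_prior_range(w0)
# At the lower bound, w reaches -1 as a -> 0 (edge of the phantom regime).
print("w(0) at w1_min:", w_of_a(0.0, w0, w1_min))
# At the upper bound, w reaches -1/3 exactly at a = a_acc.
print("w(a_acc) at w1_max:", w_of_a(A_ACC, w0, w1_max))
```

Since w(a) is linear in a, checking the two end points is enough: any w1 inside the range keeps w(a) above −1 for all a in (0, 1] and below −1/3 for a > a_acc.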
PMC setup
q0 is a Gaussian mixture model with D components randomly shifted away from the MLE and covariance equal to the information matrix.
For the dark-energy and curvature models, number of iterations T equal to 10, unless perplexity indicated the contrary. Average number of points sampled under an individual mixture component, N/D, controlled for stable updating of the components (N = 7500 and D = 10).
For the primordial models T = 5, N = 10000 and D between 7 and 10, depending on the dimensionality.
Parameters controlling the initial mixture means and covariances chosen as f_shift = 0.02, and f_var between 1 and 1.5. Final iteration run with a five-times larger sample.
Results
In most cases evidence in favour of the standard model, especially when more datasets/experiments are combined.
Largest evidence is ln B12 = 1.8, for the w(z)CDM model and CMB alone. Case where a large part of the prior range is still allowed by the data, and a region of comparable size is excluded. Hence weak evidence that both w0 and w1 are required, but excluded when adding SNIa and BAO datasets.
Results on the curvature are compatible with current findings: non-flat Universe(s) strongly disfavoured for the three dark-energy cases.
Evidence
[Figure: ln B12 against number of parameters n_par (4 to 6), reference model flat ΛCDM, with evidence bands labelled inconclusive, weak, moderate and strong; models Λ curved, w0 flat, w0 curved, w(z) flat and w(z) curved, each for CMB, CMB+SN and CMB+SN+BAO]
Posterior outcome
Posterior on dark-energy parameters w0 and w1 as 68% and 95% credible regions for WMAP (solid blue lines), WMAP+SNIa (dashed green) and WMAP+SNIa+BAO (dotted red curves). Allowed prior range as red straight lines.
[Figure: axes w0 (horizontal) and w1 (vertical)]
PMC stability
[Figure: ln E against iteration, flat wCDM (iterations 1 to 10) and curved wCDM (iterations 1 to 19)]
Distribution of 25 PMC samplings of two dark-energy models, flat wCDM (left panel) and curved wCDM (right panel). Log-evidence
PMC stability
[Figure: perplexity against iteration, flat wCDM (iterations 1 to 10) and curved wCDM (iterations 1 to 19)]
Distribution of 25 PMC samplings of two dark-energy models, flat wCDM (left panel) and curved wCDM (right panel). Perplexity
lexicon
BAO, baryon acoustic oscillations
CMB, cosmic microwave background radiation
COBE, cosmic background explorer
ΛCDM, lambda-cold dark matter
Lyα, Lyman-alpha
SNIa, type Ia supernovae
WMAP, Wilkinson microwave anisotropy probe