Bayesian model choice in cosmology

Bayesian Model Comparison in Cosmology with Population Monte Carlo

Monthly Notices Royal Astronomical Soc. 405 (4), 2381 - 2390, 2010

Christian P. Robert

Université Paris Dauphine & CREST

Joint works with D., Benabed K., Cappe O., Cardoso J.F., Fort G., Kilbinger M.,

[Marin J.-M., Mira A.,] Prunet S., Wraith D.

http://www.ceremade.dauphine.fr/~xian


Outline

1 Cosmology background

2 Importance sampling

3 Application to cosmological data

4 Evidence approximation

5 Cosmology models

6 lexicon


Cosmology background

Cosmology

A large part of the data used to answer some of the major questions in cosmology comes from studying the Cosmic Microwave Background (CMB) radiation (fossil heat released circa 380,000 years after the Big Bang).

The CMB is hugely uniform. Only very sensitive instruments such as WMAP (NASA, 2001) can detect fluctuations in the CMB temperature, e.g. minute temperature variations: one part of the sky has a temperature of 2.7251 Kelvin (degrees above absolute zero), while another part of the sky has a temperature of 2.7249 Kelvin.



[Figure: CMB dataset]

[Marin & CPR, Bayesian Core, 2007]



Planck

Temperature variations are related to fluctuations in the density of matter in the early universe and thus carry information about the initial conditions for the formation of cosmic structures such as galaxies, clusters, and voids.

Planck: joint mission between the European Space Agency (ESA) and NASA, launched in May 2009. The Planck mission plans to provide datasets of nearly 5 × 10^10 observations to settle many open questions with CMB temperature data. Rather than scalar-valued observations, Planck will provide tensor-valued data and thus is likely to also open up this area of statistical research.






Some questions in cosmology

Will the universe expand forever, or will it collapse?

Is the universe dominated by exotic dark matter and what is its concentration?

What is the shape of the universe?

Is the expansion of the universe accelerating rather than decelerating?

Is the “flat ΛCDM paradigm” appropriate or is the curvature different from zero?

[Adams, The Guide [a.k.a. H2G2], 1979]



Statistical problems in cosmology

Potentially high dimensional parameter space [Not considered here]

Immensely slow computation of likelihoods, e.g. WMAP, CMB, because of numerically costly spectral transforms [Data is a Fortran program]

Nonlinear dependence and degeneracies between parameters introduced by physical constraints or theoretical assumptions

[Figure: two panels illustrating parameter degeneracies, with axes (Ωm, w0) and (−M, α)]


Importance sampling

Importance sampling solutions

1 Cosmology background

2 Importance sampling
   Adaptive importance sampling
   Adaptive multiple importance sampling

3 Application to cosmological data

4 Evidence approximation

5 Cosmology models

6 lexicon



Importance sampling 101

Importance sampling is based on the fundamental identity

$$\pi(f) = \int f(x)\,\pi(x)\,dx = \int f(x)\,\frac{\pi(x)}{q(x)}\,q(x)\,dx$$

If $x_1, \ldots, x_N$ are drawn independently from $q$,

$$\hat{\pi}(f) = \sum_{n=1}^{N} f(x_n)\,w_n, \qquad w_n = \frac{\pi(x_n)/q(x_n)}{\sum_{m=1}^{N} \pi(x_m)/q(x_m)},$$

provides a converging approximation to $\pi(f)$ (independent of the normalisation of $\pi$).
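As an illustration of the self-normalised estimator above, here is a minimal sketch in Python with NumPy. The toy target (an unnormalised N(1, 1)), the N(0, 2²) proposal and the function name `snis` are assumptions made for this example only; they do not come from the slides.

```python
import numpy as np

def snis(f, log_target, sample_q, log_q, n, rng):
    """Self-normalised importance sampling estimate of pi(f)."""
    x = sample_q(n, rng)
    log_w = log_target(x) - log_q(x)   # log pi(x_n) - log q(x_n); pi may be unnormalised
    log_w -= log_w.max()               # stabilise before exponentiating
    w = np.exp(log_w)
    w /= w.sum()                       # normalised weights w_n
    return np.sum(w * f(x))

rng = np.random.default_rng(0)
# Hypothetical toy problem: unnormalised N(1, 1) target, N(0, 2^2) proposal.
log_target = lambda x: -0.5 * (x - 1.0) ** 2
sample_q = lambda n, rng: rng.normal(0.0, 2.0, size=n)
log_q = lambda x: -0.5 * (x / 2.0) ** 2 - np.log(2.0 * np.sqrt(2.0 * np.pi))

est = snis(lambda x: x, log_target, sample_q, log_q, 100_000, rng)
print(est)  # estimate of the target mean, which is 1 for this toy target
```

Working with log-weights and subtracting the maximum avoids overflow; since the weights are renormalised, the constant shift does not change the estimate.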



Adaptive importance sampling

Initialising importance sampling

PMC/AIS offers a solution to the difficulty of picking $q$ through adaptivity: given a target $\pi$, PMC produces a sequence $q_t$ of importance functions ($t = 1, \ldots, T$) aimed at approximating $\pi$.

The first sample is produced by a regular importance sampling scheme, $x_1^1, \ldots, x_N^1 \sim q_1$, associated with importance weights

$$w_n^1 = \frac{\pi(x_n^1)}{q_1(x_n^1)}$$

and their normalised counterparts $\bar{w}_n^1$, providing a first approximation to a sample from $\pi$.

Moments of $\pi$ can then be approximated to construct an updated importance function $q_2$, etc.





Optimality criterion?

The quality of approximation can be measured in terms of the Kullback divergence from the target,

$$D(\pi\|q_t) = \int \log\left(\frac{\pi(x)}{q_t(x)}\right)\pi(x)\,dx,$$

and the density $q_t$ can be adjusted incrementally to minimize this divergence.




PMC – Some papers

Cappe et al (2004) - J. Comput. Graph. Stat.

Outline of Population Monte Carlo but missed main point

Celeux et al (2005) - Comput. Stat. & Data Analysis

Rao-Blackwellisation for importance sampling and missing data problems

Douc et al (2007) - ESAIM Prob. & Stat. and Annals of Statistics

Convergence issues proving adaptation is positive where q is a mixture density of random-walk proposals (mixture weights varied)

Cappe et al (2007) - Stat. & Computing

Adaptation of q (mixture density of independent proposals), where weights and parameters vary

Wraith et al (2009) - Physical Review D

Application of Cappe et al (2007) to cosmology and comparison with MCMC

Beaumont et al (2009) - Biometrika

Application of Cappe et al (2007) to ABC settings

Kilbinger et al (2010) - Month. N. Royal Astro. Soc.

Application of Cappe et al (2007) to model choice in cosmology




Adaptive importance sampling (2)

Use of mixture densities

$$q_t(x) = q(x;\alpha^t,\theta^t) = \sum_{d=1}^{D} \alpha_d^t\,\varphi(x;\theta_d^t)$$

[West, 1993]

where

$\alpha^t = (\alpha_1^t, \ldots, \alpha_D^t)$ is a vector of adaptable weights for the $D$ mixture components

$\theta^t = (\theta_1^t, \ldots, \theta_D^t)$ is a vector of parameters which specify the components

$\varphi$ is a parameterised density (usually taken to be multivariate Gaussian or Student-t, the latter preferred)




Cappe et al (2007) optimal scheme

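Update $q_t$ using an integrated EM approach minimising the KL divergence at each iteration

$$D(\pi\|q_t) = \int \log\left(\frac{\pi(x)}{\sum_{d=1}^{D}\alpha_d^t\,\varphi(x;\theta_d^t)}\right)\pi(x)\,dx,$$

equivalent to maximising

$$\ell(\alpha,\theta) = \int \log\left(\sum_{d=1}^{D}\alpha_d\,\varphi(x;\theta_d)\right)\pi(x)\,dx$$

in $\alpha, \theta$.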
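The equivalence can be seen by splitting the logarithm. The short derivation below is a sketch that assumes $\pi$ is normalised; the constant $C_\pi$ is notation introduced here, not in the slides.

```latex
\begin{aligned}
D(\pi\|q) &= \int \log \pi(x)\,\pi(x)\,dx
            - \int \log\Big(\sum_{d=1}^{D}\alpha_d\,\varphi(x;\theta_d)\Big)\pi(x)\,dx \\
          &= C_\pi - \ell(\alpha,\theta),
\qquad C_\pi = \int \log \pi(x)\,\pi(x)\,dx .
\end{aligned}
```

Since $C_\pi$ does not depend on $(\alpha, \theta)$, minimising the divergence over the mixture parameters is the same as maximising $\ell(\alpha,\theta)$.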




PMC updates

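Maximization of $L_t(\alpha,\theta)$ leads to closed form solutions in exponential families (and for the t distributions).

For instance for $\mathcal{N}_p(\mu_d,\Sigma_d)$:

$$\alpha_d^{t+1} = \int \rho_d(x;\alpha^t,\mu^t,\Sigma^t)\,\pi(x)\,dx,$$

$$\mu_d^{t+1} = \frac{\int x\,\rho_d(x;\alpha^t,\mu^t,\Sigma^t)\,\pi(x)\,dx}{\alpha_d^{t+1}},$$

$$\Sigma_d^{t+1} = \frac{\int (x-\mu_d^{t+1})(x-\mu_d^{t+1})^{T}\rho_d(x;\alpha^t,\mu^t,\Sigma^t)\,\pi(x)\,dx}{\alpha_d^{t+1}},$$

where $\rho_d(x;\alpha,\mu,\Sigma) = \alpha_d\,\varphi(x;\mu_d,\Sigma_d) \big/ \sum_{l=1}^{D}\alpha_l\,\varphi(x;\mu_l,\Sigma_l)$ is the posterior probability that $x$ belongs to component $d$.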




Empirical updates

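And empirical versions, based on the normalised weights $\bar{w}_n^t$,

$$\alpha_d^{t+1} = \sum_{n=1}^{N} \bar{w}_n^t\,\rho_d(x_n^t;\alpha^t,\mu^t,\Sigma^t)$$

$$\mu_d^{t+1} = \frac{\sum_{n=1}^{N} \bar{w}_n^t\,x_n^t\,\rho_d(x_n^t;\alpha^t,\mu^t,\Sigma^t)}{\alpha_d^{t+1}}$$

$$\Sigma_d^{t+1} = \frac{\sum_{n=1}^{N} \bar{w}_n^t\,(x_n^t-\mu_d^{t+1})(x_n^t-\mu_d^{t+1})^{T}\rho_d(x_n^t;\alpha^t,\mu^t,\Sigma^t)}{\alpha_d^{t+1}}$$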
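The empirical updates can be sketched in a few lines of NumPy for a Gaussian mixture proposal. This is an illustrative sketch, not the authors' implementation: the function names `gauss_pdf` and `pmc_update` and the array layout are assumptions, and the Student-t case preferred in the slides is not covered.

```python
import numpy as np

def gauss_pdf(x, mu, cov):
    """Density of N_p(mu, cov) evaluated at the rows of x (shape (N, p))."""
    p = len(mu)
    c = x - mu
    quad = np.sum(c @ np.linalg.inv(cov) * c, axis=1)
    _, logdet = np.linalg.slogdet(cov)
    return np.exp(-0.5 * (quad + logdet + p * np.log(2.0 * np.pi)))

def pmc_update(x, w_bar, alpha, mu, sigma):
    """One empirical update of a Gaussian mixture proposal.

    x: (N, p) sample, w_bar: (N,) normalised importance weights,
    alpha: (D,) weights, mu: (D, p) means, sigma: (D, p, p) covariances.
    """
    D = len(alpha)
    # comp[n, d] = alpha_d * phi(x_n; mu_d, Sigma_d)
    comp = np.column_stack([alpha[d] * gauss_pdf(x, mu[d], sigma[d]) for d in range(D)])
    rho = comp / comp.sum(axis=1, keepdims=True)   # rho_d(x_n), rows sum to one
    wr = w_bar[:, None] * rho                      # w_bar_n * rho_d(x_n)
    new_alpha = wr.sum(axis=0)
    new_mu = (wr.T @ x) / new_alpha[:, None]
    new_sigma = np.empty_like(sigma)
    for d in range(D):
        c = x - new_mu[d]
        new_sigma[d] = (wr[:, d, None] * c).T @ c / new_alpha[d]
    return new_alpha, new_mu, new_sigma
```

Because each row of `rho` sums to one and the weights are normalised, the updated mixture weights automatically sum to one.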




Banana benchmark

Twisted $\mathcal{N}_p(0,\Sigma)$ target with $\Sigma = \mathrm{diag}(\sigma_1^2, 1, \ldots, 1)$, changing the second co-ordinate $x_2$ to $x_2 + b(x_1^2 - \sigma_1^2)$

[Figure: banana-shaped target in the (x1, x2) plane]

$p = 10$, $\sigma_1^2 = 100$, $b = 0.03$

[Haario et al. 1999]
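For readers who want to experiment with this benchmark, the following sketch writes down the twisted log-density and an exact sampler. It assumes the usual reading of the construction, namely that the density is the Gaussian density evaluated at the twisted point (the twist has unit Jacobian); the function names are hypothetical.

```python
import numpy as np

def banana_logpdf(x, b=0.03, sigma1_sq=100.0):
    """Unnormalised log-density of the twisted Gaussian at the rows of x."""
    x = np.atleast_2d(x)
    y2 = x[:, 1] + b * (x[:, 0] ** 2 - sigma1_sq)   # twisted second co-ordinate
    return -0.5 * (x[:, 0] ** 2 / sigma1_sq + y2 ** 2 + np.sum(x[:, 2:] ** 2, axis=1))

def banana_sample(n, p, rng, b=0.03, sigma1_sq=100.0):
    """Exact draws: sample the Gaussian, then undo the twist."""
    y = rng.normal(size=(n, p))
    y[:, 0] *= np.sqrt(sigma1_sq)
    y[:, 1] -= b * (y[:, 0] ** 2 - sigma1_sq)
    return y

rng = np.random.default_rng(0)
draws = banana_sample(200_000, 10, rng)
print(draws[:, 1].mean(), draws[:, 1].var())
```

With $b = 0.03$ and $\sigma_1^2 = 100$, the second co-ordinate has mean 0 and variance $1 + 2b^2\sigma_1^4 = 19$, which gives a quick sanity check for any sampler run on this target.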


Simulation

[Figure: six panels of simulated points in the (x1, x2) plane for the banana benchmark, with axes ranging from −40 to 40]




Monitoring by perplexity

Stop iterations when further adaptations do not improve $D(\pi\|q_t)$.

The transform $\exp[-D(\pi\|q_t)]$ may be estimated by the normalised perplexity $p = \exp(H_N^t)/N$, where

$$H_N^t = -\sum_{n=1}^{N} \bar{w}_n^t \log \bar{w}_n^t$$

is the Shannon entropy of the normalised weights.

Thus, minimization of the Kullback divergence can be approximately connected with the maximization of the (normalised) perplexity (values closer to 1 indicating good agreement between q and π).
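The normalised perplexity is cheap to compute from the log-weights. The helper below is a minimal sketch; its name and the log-weight input convention are assumptions of this example.

```python
import numpy as np

def normalised_perplexity(log_w):
    """exp(Shannon entropy of the normalised weights) / N, a value in (0, 1]."""
    log_w = np.asarray(log_w, dtype=float)
    w = np.exp(log_w - log_w.max())     # shift for numerical stability
    w /= w.sum()                        # normalised weights
    nz = w > 0                          # treat 0 * log(0) as 0
    entropy = -np.sum(w[nz] * np.log(w[nz]))
    return np.exp(entropy) / len(w)
```

Equal weights give a value of 1, while a single dominating weight gives a value close to 1/N.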




Monitoring by ESS

A second criterion is the effective sample size (ESS)

$$\mathrm{ESS}_N^t = \left(\sum_{n=1}^{N} \left(\bar{w}_n^t\right)^2\right)^{-1}$$

which can be interpreted as the number of equivalent iid sample points.
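The ESS follows the same pattern as the perplexity; again a small sketch with a hypothetical helper name, taking unnormalised log-weights as input.

```python
import numpy as np

def effective_sample_size(log_w):
    """Inverse of the sum of squared normalised weights, between 1 and N."""
    log_w = np.asarray(log_w, dtype=float)
    w = np.exp(log_w - log_w.max())     # shift for numerical stability
    w /= w.sum()                        # normalised weights
    return 1.0 / np.sum(w ** 2)
```

Dividing the result by N gives the normalised ESS, which can be tracked across iterations alongside the perplexity.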


Simulation

[Figure: two panels against iteration (1 to 10), labelled NPERPL and NESS.] Normalised perplexity (top panel) and normalised effective sample size (ESS/N) (bottom panel) estimates for the first 10 iterations of PMC


Comparison to MCMC

Adaptive MCMC: the proposal is a multivariate Gaussian with Σ updated/based on previous values in the chain. Scale and update times chosen for optimal results.

[Figure: four panels, PMC (left) and MCMC (right), for fa (top) and fb (bottom).] Evolution of π(fa) (top panels) and π(fb) (bottom panels) from 10k points to 100k points for both PMC (left panels) and MCMC (right panels).


Simulation

[Figure: boxplots titled "Proportion of points inside" for d10, d2 and d1 under PMC and MCMC, and PMC/MCMC boxplots for the functions fc, fe, fh, fd, fg, fi.] Results showing the distributions of the PMC and the MCMC estimates. All estimates are based on 500 simulation runs.



Adaptive multiple importance sampling


Full recycling:

At iteration t, design a new proposal $q_t$ based on all previous samples

$$x_1^1, \ldots, x_N^1, \ldots, x_1^{t-1}, \ldots, x_N^{t-1}$$

At each stage, the whole past can be used: if un-normalised weights $\omega_{i,t}$ are preserved along iterations, then all $x_i^t$'s can be pooled together.




Caveat

When using several importance functions at once, $q_0, \ldots, q_T$, with samples $x_1^0, \ldots, x_{N_0}^0, \ldots, x_1^T, \ldots, x_{N_T}^T$ and importance weights $\omega_i^t = \pi(x_i^t)/q_t(x_i^t)$, merging through the empirical distribution

$$\sum_{t,i} \omega_i^t\,\delta_{x_i^t}(x) \Big/ \sum_{t,i} \omega_i^t \approx \pi(x)$$

fails to cull poor proposals: very large weights do remain large in the cumulated sample and poorly performing samples overwhelmingly dominate other samples in the final outcome.

Raw mixing of importance samples may be harmful, compared with a single sample, even when most proposals are efficient.




Deterministic mixtures

Owen and Zhou (2000) propose a stabilising recycling of the weights via deterministic mixtures by modifying the importance density $q_t(x_i^t)$ under which $x_i^t$ was truly simulated to a mixture of all the densities that have been used so far

$$\frac{1}{\sum_{j=0}^{T} N_j} \sum_{l=0}^{T} N_l\,q_l(x_i^t),$$

resulting in the deterministic mixture weight

$$\omega_i^t = \pi(x_i^t) \Big/ \frac{1}{\sum_{j=0}^{T} N_j} \sum_{l=0}^{T} N_l\,q_l(x_i^t).$$
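A minimal sketch of the deterministic mixture weights, using two one-dimensional Gaussian proposals, one of which is deliberately poor. The target, the proposals and the function names are assumptions made for illustration only.

```python
import numpy as np

def norm_pdf(x, mean, sd):
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def deterministic_mixture_weights(samples, target_pdf, proposal_pdfs):
    """samples[t] holds the N_t draws from proposal t; returns one weight array per sample."""
    sizes = np.array([len(s) for s in samples], dtype=float)
    weights = []
    for x in samples:
        # mixture of all proposals used so far, weighted by their sample sizes
        mixture = sum(n * q(x) for n, q in zip(sizes, proposal_pdfs)) / sizes.sum()
        weights.append(target_pdf(x) / mixture)
    return weights

rng = np.random.default_rng(0)
proposals = [lambda x: norm_pdf(x, 0.0, 1.0), lambda x: norm_pdf(x, 5.0, 1.0)]
samples = [rng.normal(0.0, 1.0, 5000), rng.normal(5.0, 1.0, 5000)]
target = lambda x: norm_pdf(x, 0.0, 1.0)
dm_weights = np.concatenate(deterministic_mixture_weights(samples, target, proposals))
print(dm_weights.max(), dm_weights.mean())
```

In this toy setting the mixture denominator is at least half the target density, so no weight can exceed 2, whereas the raw weights π/q for draws from the poor proposal are unbounded.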




Unbiasedness

Potential to exploit the most efficient proposals in the sequence $Q_0, \ldots, Q_T$ without rejecting any simulated value nor sample. Poorly performing importance functions are simply eliminated through the erosion of their weights

$$\pi(x_i^t) \Big/ \frac{1}{\sum_{j=0}^{T} N_j} \sum_{l=0}^{T} N_l\,q_l(x_i^t)$$

as $T$ increases.

Paradoxical feature of competing acceptable importance weights for the same simulated value, well understood in the cases of Rao-Blackwellisation and of Population Monte Carlo. More intricate here in that only unbiasedness remains [fake mixture]




AMIS

AMIS (or Adaptive Multiple Importance Sampling) uses importance sampling functions ($q_t$) that are constructed sequentially and adaptively, using past $t-1$ weighted samples.

i) weights of all present and past variables $x_i^l$ ($1 \le l \le t$, $1 \le i \le N_l$) are modified, based on the current proposals

ii) the entire collection of importance samples is used to build the next importance function.

[Parallel with IMIS: Raftery & Bao, 2010]




The AMIS algorithm

Adaptive Multiple Importance Sampling

At iteration $t = 1, \ldots, T$

1) Independently generate $N_t$ particles $x_i^t \sim q(x|\theta^{t-1})$

2) For $1 \le i \le N_t$, compute the mixture at $x_i^t$

$$\delta_i^t = N_0\,q_0(x_i^t) + \sum_{l=1}^{t} N_l\,q(x_i^t;\theta^{l-1})$$

and derive the weight of $x_i^t$,

$$\omega_i^t = \pi(x_i^t) \Big/ \Big[\delta_i^t \Big/ \sum_{j=0}^{t} N_j\Big].$$

3) For $0 \le l \le t-1$ and $1 \le i \le N_l$, actualise past weights as

$$\delta_i^l = \delta_i^l + N_t\,q(x_i^l;\theta^{t-1}) \quad\text{and}\quad \omega_i^l = \pi(x_i^l) \Big/ \Big[\delta_i^l \Big/ \sum_{j=0}^{t} N_j\Big].$$

4) Compute the parameter estimate $\theta^t$ based on

$$(x_1^0, \omega_1^0, \ldots, x_{N_0}^0, \omega_{N_0}^0, \ldots, x_1^t, \omega_1^t, \ldots, x_{N_t}^t, \omega_{N_t}^t)$$

[Cornuet, Marin, Mira & CPR, 2009, arXiv:0907.1254]
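The four steps can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' code: the proposal is a single Gaussian rather than a Student-t or a mixture, every iteration uses the same sample size, and the names `gauss_pdf` and `amis` are hypothetical.

```python
import numpy as np

def gauss_pdf(x, mu, cov):
    """Density of N_p(mu, cov) at the rows of x (also works when x has zero rows)."""
    p = len(mu)
    c = x - mu
    quad = np.sum(c @ np.linalg.inv(cov) * c, axis=1)
    _, logdet = np.linalg.slogdet(cov)
    return np.exp(-0.5 * (quad + logdet + p * np.log(2.0 * np.pi)))

def amis(log_target, mu0, cov0, n_per_iter, n_iter, rng):
    params = [(np.asarray(mu0, float), np.asarray(cov0, float))]  # theta^0, theta^1, ...
    xs = np.empty((0, len(mu0)))   # all particles generated so far
    delta = np.empty(0)            # running mixture sums, one per stored particle
    for _ in range(n_iter):
        mu, cov = params[-1]
        # 1) generate new particles from the current proposal
        new = rng.multivariate_normal(mu, cov, size=n_per_iter)
        # 2) mixture of every proposal used so far, evaluated at the new particles
        delta_new = sum(n_per_iter * gauss_pdf(new, m, c) for m, c in params)
        # 3) past particles receive the contribution of the current proposal
        delta = delta + n_per_iter * gauss_pdf(xs, mu, cov)
        xs = np.vstack([xs, new])
        delta = np.concatenate([delta, delta_new])
        w = np.exp(log_target(xs)) / (delta / len(xs))
        # 4) re-estimate the proposal parameters from all weighted particles
        wn = w / w.sum()
        mu_new = wn @ xs
        c = xs - mu_new
        params.append((mu_new, (wn[:, None] * c).T @ c))
    return xs, w, params[-1]

rng = np.random.default_rng(0)
m = np.array([3.0, -2.0])
log_target = lambda x: -0.5 * np.sum((x - m) ** 2, axis=1)   # unnormalised N(m, I)
xs, w, (mu_hat, cov_hat) = amis(log_target, np.zeros(2), 4.0 * np.eye(2), 1000, 10, rng)
print(mu_hat, w.mean())
```

The running sums `delta` are what make the recycling cheap: each past particle only needs one extra density evaluation per iteration, for the newest proposal.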




Studentised AMIS

When the proposal distribution $q_t$ is a Student's t proposal,

$$\mathcal{T}_3(\mu,\Sigma)$$

mean $\mu$ and covariance $\Sigma$ parameters can be updated by estimating first two moments of the target distribution $\Pi$

$$\mu^t = \frac{\sum_{l=0}^{t}\sum_{i=1}^{N_l} \omega_i^l\,x_i^l}{\sum_{l=0}^{t}\sum_{i=1}^{N_l} \omega_i^l} \quad\text{and}\quad \Sigma^t = \frac{\sum_{l=0}^{t}\sum_{i=1}^{N_l} \omega_i^l\,(x_i^l-\mu^t)(x_i^l-\mu^t)^{T}}{\sum_{l=0}^{t}\sum_{i=1}^{N_l} \omega_i^l},$$

i.e. using the optimal update of Cappe et al. (2007)

Obvious extension to mixtures [and again optimal update of Cappe et al. (2007)]
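A possible NumPy sketch of the ingredients of a Student-t proposal with 3 degrees of freedom: a sampler, a log-density, and the weighted moment update. The function names are hypothetical, and plugging the estimated Σ directly in as the scale matrix of the t distribution is an assumption of this sketch.

```python
import math
import numpy as np

def t_sample(n, mu, sigma, rng, nu=3):
    """Draws from a multivariate Student-t with location mu and scale matrix sigma."""
    z = rng.standard_normal((n, len(mu))) @ np.linalg.cholesky(sigma).T
    g = rng.chisquare(nu, size=n) / nu
    return mu + z / np.sqrt(g)[:, None]

def t_logpdf(x, mu, sigma, nu=3):
    """Log-density of the multivariate Student-t at the rows of x."""
    p = len(mu)
    c = x - mu
    quad = np.sum(c @ np.linalg.inv(sigma) * c, axis=1)
    _, logdet = np.linalg.slogdet(sigma)
    const = (math.lgamma((nu + p) / 2) - math.lgamma(nu / 2)
             - 0.5 * p * math.log(nu * math.pi) - 0.5 * logdet)
    return const - 0.5 * (nu + p) * np.log1p(quad / nu)

def weighted_moments(x, w):
    """Weighted mean and covariance of the rows of x; w may be unnormalised."""
    wn = w / w.sum()
    mu = wn @ x
    c = x - mu
    return mu, (wn[:, None] * c).T @ c
```

The heavy tails of the t distribution help keep the importance weights bounded when the target has lighter tails than the proposal.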




Simulations

Same banana benchmark

Target function               p    AMIS      Cappe'07
E(x1) = 0                     5    0.06558   0.06879
                             10    0.06388   0.11051
                             20    0.09167   0.17912
E(x2) = 0                     5    0.10215   0.11583
                             10    0.21421   0.22557
                             20    0.25316   0.29087
Σ_{i=3}^{5} E(xi) = 0         5    0.00478   0.00927
Σ_{i=3}^{10} E(xi) = 0       10    0.00902   0.02099
Σ_{i=3}^{20} E(xi) = 0       20    0.01666   0.04208
var(x1) = 100                 5    2.60672   3.92650
                             10    7.06686   7.48877
                             20    8.20020   9.71725
var(x2) = 19                  5    2.10682   2.96132
                             10    3.76660   5.08474
                             20    4.85407   5.98031
Σ_{i=3}^{5} var(xi) = 3       5    0.00645   0.01196
Σ_{i=3}^{10} var(xi) = 8     10    0.01370   0.02636
Σ_{i=3}^{20} var(xi) = 18    20    0.04609   0.06424

Root mean square errors calculated over 10 replications for different target functions and dimensions p.


Simulation (cont'd)

[Figure] 10 replicate ESSs for AMIS (left) and PMC (right) for p = 5, 10, 20.

[Figure] 10 replicate absolute errors associated to the estimations of E(x1) (left column), E(x2) (center column) and Σ_{i=3}^{p} E(xi) (right column) using AMIS (left in each block) and PMC (right) for p = 5, 10, 20.

[Figure] 10 replicate absolute errors associated to the estimations of var(x1) (left column), var(x2) (center column) and Σ_{i=3}^{p} var(xi) (right column) using AMIS (left in each block) and PMC (right) for p = 5, 10, 20.


Application to cosmological data

Cosmological data

Posterior distribution of cosmological parameters for recent observational data of CMB anisotropies (differences in temperature from directions) [WMAP], SNIa, and cosmic shear.

Combination of three likelihoods, some of which are available as public (Fortran) code, and of a uniform prior on a hypercube.



Cosmology parameters

Parameters for the cosmology likelihood (C=CMB, S=SNIa, L=lensing)

Symbol   Description                     Minimum   Maximum   Experiment
Ωb       Baryon density                  0.01      0.1       C L
Ωm       Total matter density            0.01      1.2       C S L
w        Dark-energy eq. of state        -3.0      0.5       C S L
ns       Primordial spectral index       0.7       1.4       C L
∆²R      Normalization (large scales)                        C
σ8       Normalization (small scales)                        C L
h        Hubble constant                                     C L
τ        Optical depth                                       C
M        Absolute SNIa magnitude                             S
α        Colour response                                     S
β        Stretch response                                    S
a        galaxy z-distribution fit                           L
b        galaxy z-distribution fit                           L
c        galaxy z-distribution fit                           L

For WMAP5, σ8 is a deduced quantity that depends on the other parameters



Adaptation of importance function



Estimates

Parameter   PMC                          MCMC
Ωb          0.0432 (+0.0027, −0.0024)    0.0432 (+0.0026, −0.0023)
Ωm          0.254 (+0.018, −0.017)       0.253 (+0.018, −0.016)
τ           0.088 (+0.018, −0.016)       0.088 (+0.019, −0.015)
w           −1.011 ± 0.060               −1.010 (+0.059, −0.060)
ns          0.963 (+0.015, −0.014)       0.963 (+0.015, −0.014)
10^9 ∆²R    2.413 (+0.098, −0.093)       2.414 (+0.098, −0.092)
h           0.720 (+0.022, −0.021)       0.720 (+0.023, −0.021)
a           0.648 (+0.040, −0.041)       0.649 (+0.043, −0.042)
b           9.3 (+1.4, −0.9)             9.3 (+1.7, −0.9)
c           0.639 (+0.084, −0.070)       0.639 (+0.082, −0.070)
−M          19.331 ± 0.030               19.332 (+0.029, −0.031)
α           1.61 (+0.15, −0.14)          1.62 (+0.16, −0.14)
−β          −1.82 (+0.17, −0.16)         −1.82 ± 0.16
σ8          0.795 (+0.028, −0.030)       0.795 (+0.030, −0.027)

Means and 68% credible intervals using lensing, SNIa and CMB



Advantage of AIS and PMC?

Parallelisation of the posterior calculations
- For the cosmological examples, we used up to 100 CPUs on a computer cluster to explore the cosmology posteriors using AIS/PMC, reducing the computational time from several days for MCMC to a few hours using PMC.

Low variance of Monte Carlo estimates
- For PMC and q closely matched to π, significant reductions in the variance of the Monte Carlo estimates are possible compared to estimates using MCMC. This also translates into a computational saving, with further savings possible by combining samples across iterations.

Simple diagnostics of 'convergence' (perplexity)
- For PMC, the perplexity provides a relatively simple measure of sampling adequacy to the target density of interest.


Evidence approximation

Evidence/Marginal likelihood/Integrated Likelihood ...

Central quantity of interest in (Bayesian) model choice

$$E = \int \pi(x)\,dx = \int \frac{\pi(x)}{q(x)}\,q(x)\,dx,$$

expressed as an expectation under any density $q$ with large enough support.

Importance sampling provides a sample $x_1, \ldots, x_N \sim q$ and an approximation of the above integral,

$$E \approx \frac{1}{N}\sum_{n=1}^{N} w_n$$

where the $w_n = \pi(x_n)/q(x_n)$ are the (unnormalised) importance weights.
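In practice the evidence is best computed on the log scale. The sketch below averages the unnormalised weights in a numerically stable way; the toy target exp(−x²/2), whose evidence is √(2π), and the N(0, 2²) proposal are assumptions chosen so that the answer is known.

```python
import numpy as np

def log_evidence(log_w):
    """log of (1/N) * sum_n w_n, computed from unnormalised log-weights."""
    log_w = np.asarray(log_w, dtype=float)
    m = log_w.max()
    return m + np.log(np.mean(np.exp(log_w - m)))

rng = np.random.default_rng(0)
x = rng.normal(0.0, 2.0, size=100_000)                        # draws from q = N(0, 2^2)
log_pi = -0.5 * x ** 2                                        # unnormalised target
log_q = -0.5 * (x / 2.0) ** 2 - np.log(2.0 * np.sqrt(2.0 * np.pi))
ln_e = log_evidence(log_pi - log_q)
print(ln_e, 0.5 * np.log(2.0 * np.pi))                        # estimate and exact value
```

The same weights used for posterior estimates therefore deliver the evidence at no extra likelihood cost.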






Back to the banana ...

Centred d-multivariate normal, $x \sim \mathcal{N}_d(0,\Sigma)$ with covariance $\Sigma = \mathrm{diag}(\sigma_1^2, 1, \ldots, 1)$, which is slightly twisted in the first two dimensions by changing $x_2$ to be $x_2 + \beta(x_1^2 - \sigma_1^2)$, where $\sigma_1^2 = 100$ and $\beta$ controls the degree of curvature.

We integrate over the unnormalised target density

$$E = \int \pi(\beta)\,f(x|\beta,\Sigma)\,d\beta$$

or

$$E = \int \pi(x|\beta,\Sigma)\,dx.$$



Simulation results (1)

[Figure: banana-shaped target in the (x1, x2) plane; posterior mean of β and log-evidence after the 10th iteration]

β unknown

Simulation results (2)

[Figure: perplexity, normalised ESS (NESS) and log-evidence against iteration (1 to 10); log-evidence for the final sample]

β = 0.015 known


Cosmology models

Back to cosmology questions

Standard cosmology successful in explaining recent observations, such as CMB, SNIa, galaxy clustering, cosmic shear, galaxy cluster counts, and Lyα forest clustering.

Flat ΛCDM model with only six free parameters (Ωm, Ωb, h, ns, τ, σ8)

Extensions to ΛCDM may be based on independent evidence (massive neutrinos from oscillation experiments), predicted by compelling hypotheses (primordial gravitational waves from inflation) or reflect ignorance about fundamental physics (dynamical dark energy).

Testing for dark energy, curvature, and inflationary models



Extended models

Focus on the dark energy equation-of-state parameter, modeled as

w = −1 (ΛCDM)

w = w0 (wCDM)

w = w0 + w1(1 − a) (w(z)CDM)

In addition, the curvature parameter ΩK for each of the above is either ΩK = 0 ('flat') or ΩK ≠ 0 ('curved').

This choice of models represents the simplest models beyond a “cosmological constant” model able to explain the observed, recent accelerated expansion of the Universe.



Cosmology priors

Prior ranges for dark energy and curvature models. In case of w(a) models, the prior on w1 depends on w0

Parameter   Description                  Min.       Max.
Ωm          Total matter density         0.15       0.45
Ωb          Baryon density               0.01       0.08
h           Hubble parameter             0.5        0.9
ΩK          Curvature                    −1         1
w0          Constant dark-energy par.    −1         −1/3
w1          Linear dark-energy par.      −1 − w0    (−1/3 − w0)/(1 − aacc)



Cosmology priors (2)

Dark energy taken as a component of the matter-density tensor with w(a) < −1/3 for values of the scale factor a > aacc = 2/3. To limit the equation of state from below, we impose the condition w(a) > −1 for all a, thereby excluding phantom energy.

Natural limit on the curvature is that of an empty Universe, i.e. upper boundary on the curvature ΩK = 1. A lower boundary corresponds to an upper limit on the total matter-energy density: ΩK > −1, excluding high-density Universe(s), which are ruled out by the age of the oldest observed objects.

Alternative prior on ΩK could be derived from the paradigm of inflation, but most scenarios imply the curvature to be very small, on the order of 10^−60. The likelihood over such a prior on ΩK is essentially flat for any current and future experiments, hence cannot be assessed.
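The two conditions on w(a) translate into the w0-dependent bounds on w1 listed in the prior table: requiring w(a) > −1 at a = 0 gives the lower bound, and requiring w(a) < −1/3 at a = aacc gives the upper bound. A small sketch, with hypothetical function names, makes the dependence explicit.

```python
A_ACC = 2.0 / 3.0   # scale factor beyond which w(a) < -1/3 is required

def w_of_a(a, w0, w1):
    """Linear dark-energy equation of state w(a) = w0 + w1 (1 - a)."""
    return w0 + w1 * (1.0 - a)

def w1_bounds(w0, a_acc=A_ACC):
    """Prior range of w1 given w0: (-1 - w0, (-1/3 - w0) / (1 - a_acc))."""
    return -1.0 - w0, (-1.0 / 3.0 - w0) / (1.0 - a_acc)

def in_prior(w0, w1, a_acc=A_ACC):
    """True when (w0, w1) lies inside the prior region."""
    if not (-1.0 < w0 < -1.0 / 3.0):
        return False
    lo, hi = w1_bounds(w0, a_acc)
    return lo < w1 < hi
```

Because w(a) is linear in a, checking the two end points a = 0 and a = aacc (together with the range of w0 at a = 1) is enough to guarantee both conditions over the whole interval.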



PMC setup

q0 is a Gaussian mixture model with D components randomly shifted away from the MLE and covariance equal to the information matrix.

For the dark-energy and curvature models, number of iterations T equal to 10, unless perplexity indicated the contrary. Average number of points sampled under an individual mixture-component, N/D, controlled for stable updating component (N = 7500 and D = 10).

For the primordial models T = 5, N = 10000 and D between 7 and 10, depending on the dimensionality.

Parameters controlling the initial mixture means and covariances chosen as fshift = 0.02, and fvar between 1 and 1.5. Final iteration run with a five-times larger sample



Results

In most cases evidence in favour of the standard model, especially when more datasets/experiments are combined.

Largest evidence is ln B12 = 1.8, for the w(z)CDM model and CMB alone. Case where a large part of the prior range is still allowed by the data, and a region of comparable size is excluded. Hence weak evidence that both w0 and w1 are required, but excluded when adding SNIa and BAO datasets.

Results on the curvature are compatible with current findings: non-flat Universe(s) strongly disfavoured for the three dark-energy cases.
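Given log-evidences for two models, the log Bayes factor is simply their difference. The sketch below also attaches a qualitative label; the thresholds (1, 2.5 and 5 on |ln B12|) are an assumption of this example, following a commonly used Jeffreys-type scale, and are not stated in the slides.

```python
def log_bayes_factor(ln_e1, ln_e2):
    """ln B12 = ln E1 - ln E2; positive values favour model 1."""
    return ln_e1 - ln_e2

def evidence_label(ln_b12):
    """Qualitative strength of evidence (thresholds are assumed, see lead-in)."""
    strength = abs(ln_b12)
    if strength < 1.0:
        return "inconclusive"
    if strength < 2.5:
        return "weak"
    if strength < 5.0:
        return "moderate"
    return "strong"
```

Under these assumed thresholds, the value ln B12 = 1.8 quoted above falls in the weak category.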


Evidence

[Figure: Evidence (reference model: flat ΛCDM). ln B12 against the number of parameters npar (4 to 6), with bands labelled inconclusive, weak, moderate and strong, for the Λ curved, w0 flat, w0 curved, w(z) flat and w(z) curved models under the CMB, CMB+SN and CMB+SN+BAO datasets.]

Posterior outcome

[Figure: w1 against w0.] Posterior on dark-energy parameters w0 and w1 as 68% and 95% credible regions for WMAP (solid blue lines), WMAP+SNIa (dashed green) and WMAP+SNIa+BAO (dotted red curves). Allowed prior range as red straight lines.


PMC stability

[Figure: ln E against iteration, for flat wCDM (iterations 1 to 10) and wCDM with curvature (iterations 1 to 19).] Distribution of 25 PMC samplings of two dark-energy models, flat wCDM (left panel) and curved wCDM (right panel). Log-evidence


PMC stability

[Figure: perplexity against iteration, for flat wCDM (iterations 1 to 10) and wCDM with curvature (iterations 1 to 19).] Distribution of 25 PMC samplings of two dark-energy models, flat wCDM (left panel) and curved wCDM (right panel). Perplexity


lexicon

lexicon

BAO, baryon acoustic oscillations

CMB, cosmic microwave background radiation

COBE, cosmic background explorer

ΛCDM, lambda-cold dark matter

Lyα, Lyman-alpha

SNIa, type Ia supernovae

WMAP, Wilkinson microwave anisotropy probe

Date post:	10-May-2015
Upload:	christian-robert