durham 102208
TRANSCRIPT
-
8/14/2019 Durham 102208
1/35
SIMULATIONS AND COSMOLOGICAL INFERENCE
Michael D. Schneider
Durham
In collaboration with Lloyd Knox (UC Davis), Salman Habib, Katrin Heitmann, David Higdon (Los Alamos National Laboratory), Charles Nakhleh (Sandia National Laboratories)
October 22, 2008
-
Overview
Question: How do we estimate cosmological parameters when theoretical models are known only via forward simulation?
Answer: Use a statistical model to interpolate the outputs of selected simulation runs.
1. Simulation design
2. Emulator
Simultaneously learn the error distribution for the data.
Applicable to CMB, galaxy, and weak lensing surveys (or really anywhere that uses simulations for parameter inference).
arXiv:0806.1487
-
Technical motivation: simulations are costly!
Most astrophysical systems can only be modeled with numerical simulations
Even when the physics is easily understood, accurate noise modeling can require large simulations (e.g. the CMB)
Constraining dark energy via BAO and cosmic shear provides formidable computational challenges in predicting both the model and the error distributions
-
Parameter estimation requires many simulations
Use Monte Carlo algorithms to integrate the joint probability distribution of the data and model:
P(model | data) = P(model, data) / P(data)
Requires many calculations of the model at different parameter settings (~10,000 evaluations for ~5 parameters)
This is computationally prohibitive for many applications
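A minimal sketch of where the evaluation count comes from: a random-walk Metropolis sampler calls the model once per proposed step. The quadratic `log_posterior` is a hypothetical stand-in for an expensive simulation-based posterior; the step size, seed, and chain length are illustrative assumptions, not values from the talk.

```python
import math
import random


def log_posterior(theta):
    """Hypothetical stand-in for an expensive model: standard normal log-density."""
    return -0.5 * sum(t * t for t in theta)


def metropolis(log_post, theta0, n_steps, step=0.5, seed=0):
    """Random-walk Metropolis; returns the chain and the number of model calls."""
    rng = random.Random(seed)
    theta = list(theta0)
    current = log_post(theta)
    n_calls = 1
    chain = []
    for _ in range(n_steps):
        proposal = [t + rng.gauss(0.0, step) for t in theta]
        candidate = log_post(proposal)
        n_calls += 1
        # Accept with probability min(1, exp(candidate - current));
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < candidate - current:
            theta, current = proposal, candidate
        chain.append(list(theta))
    return chain, n_calls


# Five parameters and 10,000 steps, matching the order of magnitude quoted above.
chain, n_calls = metropolis(log_posterior, [0.0] * 5, n_steps=10_000)
```

Every step costs one model evaluation whether or not the proposal is accepted, which is why replacing the model with a fast interpolator matters.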
-
Likelihood model
Multivariate Gaussian model for the likelihood:
$-2\log P(x \mid \theta) = (x - \bar{x}(\theta))^T C^{-1}(\theta)\,(x - \bar{x}(\theta)) + \log\det C(\theta)$
$\theta$: model parameters
For galaxy surveys or CMB, data = power spectrum
Model dependence of covariance usually neglected
Framework identical for N-point correlations
Gaussian distribution can be extended using mixture models
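As a sketch of the Gaussian likelihood with a parameter-dependent covariance, the function below evaluates the quadratic form plus the log-determinant term through a Cholesky factor. The name `neg2_log_like` and the two-band example values are assumptions made for illustration.

```python
import numpy as np


def neg2_log_like(x, mean, cov):
    """Return (x - mean)^T C^{-1} (x - mean) + log det C."""
    resid = np.asarray(x, dtype=float) - np.asarray(mean, dtype=float)
    chol = np.linalg.cholesky(np.asarray(cov, dtype=float))
    # Solve L u = resid, so that u^T u = resid^T C^{-1} resid.
    u = np.linalg.solve(chol, resid)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return float(u @ u + log_det)


# Made-up two-band example: det C = 1, so only the quadratic form contributes.
x = np.array([1.0, 2.0])
mean = np.array([0.0, 0.0])
cov = np.array([[2.0, 0.0], [0.0, 0.5]])
value = neg2_log_like(x, mean, cov)
```

When the covariance depends on the parameters, the log-determinant term changes from point to point and cannot be dropped as a constant.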
-
EXAMPLE:
NONLINEAR MATTER POWER SPECTRUM
-
Non-Gaussian errors in the cosmic shear power spectrum
Fisher matrix constraints from Halo Model calculation of power spectrum covariance (Cooray & Hu (2000))
Non-Gaussian effects can dominate at scales < 10 arcmin. (even when apparently shape noise dominated) (Semboloni et al. (2006))
Full sky weak lensing survey (limiting mag in R~25)
-
Clusters + weak lensing
Takada & Bridle (2007)
Consider cross-covariance between cluster number counts and cosmic shear power spectrum
-
Power spectrum covariance from N-body simulations
32 realizations of N-body cube 450 Mpc/h on a side
Chop into 64 sub-cubes
Window has large impact on covariance
Not explained by simple convolution with the power spectrum
[Figure: three panels plotted against k [h/Mpc]. Normalized variance (Gaussian, 450 Mpc/h periodic box, 112.5 Mpc/h windowed box); mean power spectra (450 Mpc/h periodic box, 112.5 Mpc/h windowed box); correlation coefficients (450 Mpc/h periodic box, 112.5 Mpc/h windowed box)]
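A minimal sketch of how such summaries could be estimated from a set of realizations, assuming band powers are stored one realization per row. Normalizing the variance by the squared mean power is an assumption about the convention, and the mock input is random numbers, not N-body output.

```python
import numpy as np


def covariance_summary(power):
    """power: array of shape (n_realizations, n_bins) of band-power estimates."""
    power = np.asarray(power, dtype=float)
    mean = power.mean(axis=0)
    cov = np.cov(power, rowvar=False)  # unbiased sample covariance between bins
    var = np.diag(cov)
    norm_var = var / mean**2           # variance normalized by squared mean power
    corr = cov / np.sqrt(np.outer(var, var))
    return mean, norm_var, corr


rng = np.random.default_rng(0)
# Hypothetical mock input: 32 realizations of 10 independent band powers.
mock = 100.0 + rng.normal(0.0, 5.0, size=(32, 10))
mean, norm_var, corr = covariance_summary(mock)
```

With only 32 realizations the off-diagonal correlation estimates are noisy, which is part of the motivation for pooling information across design points.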
-
Parameter dependence of the power spectrum covariance
[Figure: normalized variance of the power spectrum and correlation coefficients (halo model), plotted against k [h/Mpc]. Curves: Gaussian; HM, PT, and sim., each for $\sigma_8 = 0.6$ and $\sigma_8 = 1$]
-
Parameterization of the power spectrum error distribution
Multivariate Normal distribution:
$P(k) \sim N(\mu(\theta), \Sigma(\theta))$
Consider shell-averaged estimates of power spectrum bands
Central limit theorem guarantees a Gaussian distribution for band powers except for a few k-bins on the largest scales of the survey
Correlations in power spectrum captured in this model
-
SIMULATION DESIGN
-
Choosing which simulations to run
Orthogonal Array Latin Hypercube
Specify hypercube parameter bounds (rescaled to unit interval)
Latin square: one point per row and column
Orthogonal array: each quadrant has a sample
Optimize with distance criterion
[Figure: Simulation design (OALH); axes: parameter 1, parameter 2, each on the unit interval]
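A simplified sketch of the design idea, under stated assumptions: it builds plain Latin hypercube samples on the unit hypercube and keeps the candidate with the largest minimum pairwise distance (a maximin criterion). It does not enforce the orthogonal-array property, and the function names, number of tries, and seed are illustrative.

```python
import itertools
import math
import random


def latin_hypercube(n_points, n_dims, rng):
    """One point per stratum along every axis, on the unit hypercube."""
    columns = []
    for _ in range(n_dims):
        strata = list(range(n_points))
        rng.shuffle(strata)
        columns.append([(s + rng.random()) / n_points for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n_points)]


def min_distance(points):
    """Smallest Euclidean distance between any two design points."""
    return min(math.dist(p, q) for p, q in itertools.combinations(points, 2))


def maximin_design(n_points, n_dims, n_tries=200, seed=0):
    """Keep the candidate design with the largest minimum pairwise distance."""
    rng = random.Random(seed)
    candidates = (latin_hypercube(n_points, n_dims, rng) for _ in range(n_tries))
    return max(candidates, key=min_distance)


design = maximin_design(n_points=8, n_dims=2)
```

Projected onto either axis, the eight points fall in eight distinct bins, which is the "one point per row and column" property; the distance criterion then spreads them out in the full space.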
-
GAUSSIAN PROCESS
MODELS FOR INTERPOLATION
-
How to do interpolation in high dimensions
We need to interpolate multivariate simulation output as a function of a large number (~10) of parameters
Power spectrum mean and covariance components modeled as Gaussian processes (GPs) (following Habib et al. 2007)
Interpolation error propagated within Bayesian framework
GP determined by correlation parameters for the interpolated surface
GPs scale well for interpolation in high dimensions
-
Gaussian process models for spatial phenomena
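[Figure: an example realization z(s) plotted against s]
An example of $z(s)$ of a Gaussian process model on $s_1, \ldots, s_n$:
$z = \begin{pmatrix} z(s_1) \\ \vdots \\ z(s_n) \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}, \Sigma \right)$, with $\Sigma_{ij} = \exp\{-\|s_i - s_j\|^2\}$,
where $\|s_i - s_j\|$ denotes the distance between locations $s_i$ and $s_j$.
$z$ has density $\pi(z) = (2\pi)^{-n/2} |\Sigma|^{-1/2} \exp\{-\tfrac{1}{2} z^T \Sigma^{-1} z\}$.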
Higdon, Williams, Gattiker (LANL)
-
Realizations from $\pi(z) = (2\pi)^{-n/2} |\Sigma|^{-1/2} \exp\{-\tfrac{1}{2} z^T \Sigma^{-1} z\}$
[Figure: three realizations of z(s) plotted against s]
Model for z(s) can be extended to continuous s
Higdon, Williams, Gattiker (LANL)
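A minimal sketch of drawing realizations from a zero-mean Gaussian process on a grid of locations, using a squared-exponential covariance with unit length scale. The function names, the small diagonal jitter, and the eight unit-spaced locations are assumptions for illustration.

```python
import numpy as np


def sq_exp_cov(s, scale=1.0):
    """Sigma_ij = exp(-(|s_i - s_j| / scale)^2) for 1-d locations s."""
    s = np.asarray(s, dtype=float)
    diff = s[:, None] - s[None, :]
    return np.exp(-(diff / scale) ** 2)


def draw_realizations(cov, n_draws, seed=0, jitter=1e-8):
    """Draw z ~ N(0, cov) via a Cholesky factor; jitter is a numerical safeguard."""
    rng = np.random.default_rng(seed)
    chol = np.linalg.cholesky(cov + jitter * np.eye(cov.shape[0]))
    return (chol @ rng.standard_normal((cov.shape[0], n_draws))).T


s = np.arange(8.0)          # eight locations with unit spacing
cov = sq_exp_cov(s)
z = draw_realizations(cov, n_draws=3)   # one realization per row
```

Neighbouring locations have correlation exp(-1), about 0.3679, and locations three apart about 0.0001, so each draw is smooth on the scale of the spacing.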
-
Conditioning on some observations of z(s)
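[Figure: z(s) plotted against s, showing the observed values]
We observe $z(s_2)$ and $z(s_5)$. What do we now know about $\{z(s_1), z(s_3), z(s_4), z(s_6), z(s_7), z(s_8)\}$?
$\begin{pmatrix} z(s_2) \\ z(s_5) \\ z(s_1) \\ z(s_3) \\ z(s_4) \\ z(s_6) \\ z(s_7) \\ z(s_8) \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & .0001 & .3679 & \cdots & 0 \\ .0001 & 1 & 0 & \cdots & .0001 \\ .3679 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & .0001 & 0 & \cdots & 1 \end{pmatrix} \right)$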
Higdon, Williams, Gattiker (LANL)
-
Conditioning on some observations of z(s)
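$\begin{pmatrix} z_1 \\ z_2 \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} \right), \quad z_2 \mid z_1 \sim N\left(\Sigma_{21}\Sigma_{11}^{-1} z_1,\ \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\right)$
[Figure: two panels of z(s) plotted against s: conditional mean; conditional realizations]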
Higdon, Williams, Gattiker (LANL)
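The conditioning step can be sketched directly from the partitioned-Gaussian formula: the observed block plays the role of z_1 and the unobserved block z_2. The function name `condition` and the observed values are made up for illustration; the covariance is the squared-exponential form with unit spacing.

```python
import numpy as np


def condition(cov, obs_idx, z_obs):
    """Mean and covariance of the unobserved z given z[obs_idx] = z_obs."""
    n = cov.shape[0]
    obs = np.asarray(obs_idx)
    rest = np.setdiff1d(np.arange(n), obs)
    s11 = cov[np.ix_(obs, obs)]
    s21 = cov[np.ix_(rest, obs)]
    s22 = cov[np.ix_(rest, rest)]
    # Sigma_21 Sigma_11^{-1}, computed with a solve rather than an explicit inverse.
    gain = np.linalg.solve(s11, s21.T).T
    mean = gain @ np.asarray(z_obs, dtype=float)
    cond_cov = s22 - gain @ s21.T
    return rest, mean, cond_cov


s = np.arange(8.0)
cov = np.exp(-(s[:, None] - s[None, :]) ** 2)
# Observe z(s_2) and z(s_5), i.e. zero-based indices 1 and 4; the values are made up.
rest, mean, cond_cov = condition(cov, obs_idx=[1, 4], z_obs=[1.0, -0.5])
```

The conditional variance shrinks most for locations adjacent to an observation and stays near the prior value of 1 far from any observation, which is how interpolation uncertainty is carried along.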
-
A 2-d example, conditioning on the edge
$\Sigma_{ij} = \exp\{-(\|s_i - s_j\|/5)^2\}$
[Figure: four surface plots of Z over X and Y: a realization; mean conditional on Y=1 points; two realizations conditional on Y=1 points]
Higdon, Williams, Gattiker (LANL)
-
Limitations of Gaussian Processes
[Figure: two panels with axis labels A, alpha, and mode amp.; one panel of z(s) plotted against s]
-
EMULATOR
-
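Power spectrum emulator
Multivariate power spectrum output decomposed into an incomplete orthogonal basis (achieves dimension reduction):
$\mu(k, \theta) = \Phi_\mu(k)\, w(\theta) + \epsilon_\mu, \quad \epsilon_\mu \sim N(0, \lambda_\epsilon^{-1})$
Model basis weights as independent Gaussian processes:
$w(\theta) \sim GP(0, \Sigma_w(\theta; \lambda_w, \rho_w))$
Do MCMC to calibrate GP parameters given the design runs:
$P(w_{\rm design} \mid \lambda_\epsilon, \lambda_w, \rho_w) \propto \left|\lambda_\epsilon^{-1} I + \Sigma_w\right|^{-1/2} \exp\left\{-\tfrac{1}{2}\, w_{\rm design}^T \left(\lambda_\epsilon^{-1} I + \Sigma_w\right)^{-1} w_{\rm design}\right\}$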
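A minimal sketch of the dimension-reduction step, assuming the basis is taken from a singular value decomposition of the centered design outputs (a principal component basis). The design grid and power-law outputs are hypothetical placeholders for simulation runs, and the unscaled weights are one of several possible conventions.

```python
import numpy as np


def pc_basis(outputs, n_keep):
    """outputs: (n_design, n_k) simulation outputs, one row per design point."""
    outputs = np.asarray(outputs, dtype=float)
    mean = outputs.mean(axis=0)
    centered = outputs - mean
    # Rows of vt are orthonormal basis vectors over k, ordered by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_keep].T              # shape (n_k, n_keep)
    weights = centered @ basis         # shape (n_design, n_keep)
    return mean, basis, weights


# Hypothetical design outputs: log P(k) = log A + alpha * log k on a 3 x 3 grid.
log_k = np.linspace(-3.0, 1.0, 20)
params = [(a, al) for a in (1.0, 2.0, 3.0) for al in (-1.5, -1.0, -0.5)]
outputs = np.array([np.log(a) + al * log_k for a, al in params])
mean, basis, weights = pc_basis(outputs, n_keep=2)
recon = mean + weights @ basis.T
```

Here two basis vectors reproduce the 20-bin outputs exactly because the toy outputs span only two directions; each column of `weights` is then the quantity a Gaussian process would interpolate across the parameter space.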
-
Covariance matrix parameterization
Generalized Cholesky decomposition (Pourahmadi et al. 2007):
$\Sigma_y^{-1}(\theta) = T^T(\theta)\, D^{-1}(\theta)\, T(\theta)$
Components of T are unconstrained:
$\phi_{ij} \equiv -T_{ij}, \quad 2 \le i \le n_y, \quad j = 1, \ldots, i-1$
Impose prior structure on covariance with a ($\theta$-independent) conjugate Gaussian prior on $\phi$ (allows shrinking to constant T):
$\phi \sim N(\bar{\phi}, C_\phi)$
Prior mean can be set from sample covariance of design runs
Model $\phi$ as a GP, just like the mean and variance:
$\phi_i(\theta) \sim GP(\bar{\phi}_i, \Sigma_\phi(\theta; \lambda_{\phi,i}, \rho_{\phi,i})), \quad i = 1, \ldots, n_y(n_y - 1)/2$
Estimate covariance at each design point simultaneously: fewer realizations needed
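A minimal sketch of the decomposition, assuming T is unit lower triangular and D is diagonal so that the inverse covariance equals T^T D^{-1} T. The example matrix is made up, and taking the unconstrained parameters to be the negated below-diagonal entries of T together with log(D) is an assumed convention.

```python
import numpy as np


def modified_cholesky(cov):
    """Return unit lower-triangular T and the diagonal d of D with T cov T^T = D."""
    chol = np.linalg.cholesky(np.asarray(cov, dtype=float))
    scale = np.diag(chol)
    unit_lower = chol / scale            # divide each column by its diagonal entry
    t = np.linalg.inv(unit_lower)
    return t, scale**2


def rebuild_precision(t, d):
    """Inverse covariance from the factors: T^T D^{-1} T."""
    return t.T @ np.diag(1.0 / d) @ t


cov = np.array([[2.0, 0.6, 0.2],
                [0.6, 1.5, 0.4],
                [0.2, 0.4, 1.0]])
t, d = modified_cholesky(cov)
# Assumed convention: unconstrained parameters are -T_ij below the diagonal and log(d).
phi = -t[np.tril_indices(3, k=-1)]
log_d = np.log(d)
precision = rebuild_precision(t, d)
```

Any real values of `phi` and `log_d` map back to a valid positive-definite covariance, which is what makes these components convenient to interpolate without constraints.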
-
Simplified emulator
Simulation outputs reduced to mean and covariance estimates at each design point
Approximation: neglect error in sample mean and covariance
Model variance as a GP just like the mean
Sampling model for the data:
The joint likelihood for parameter estimation breaks into:
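Design-point estimates: $\hat{\mu}, \hat{D}$
Sampling model: $y \mid w(\theta), v(\theta) \sim N\left(\Phi_\mu w(\theta),\ \Sigma_y(\Phi_D v(\theta))\right)$
Joint likelihood: $L(y, \hat{\mu}, \hat{D} \mid \theta_0, \lambda_\epsilon, \lambda, \rho) = \int d^{p_D}v\; L(w_y, \hat{w} \mid v, \theta_0, \lambda_\epsilon, \lambda_w, \rho_w)\, \pi(v, \hat{v} \mid \theta_0, \lambda_v, \rho_v)$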
-
Validation: toy power-law model
$P(k) = A k^{\alpha}$
Covariance is diagonal:
$\mathrm{var}(P(k)) \propto P^2(k)$
Assume the same number of modes are used to estimate P(k) in each band
This gives more noticeable differences in posteriors for later validation tests
[Figure: log(P(k)) plotted against log(k). Black: N-body; red: model; blue: mock data]
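A minimal sketch of generating mock data for a power-law toy model with diagonal covariance and variance proportional to the squared power. The proportionality constant 2 / n_modes (the Gaussian-field value), the parameter values, and the k range are assumptions chosen for illustration.

```python
import math
import random


def mock_band_powers(amplitude, slope, k_values, n_modes, seed=0):
    """Draw band powers around P(k) = A k^slope with var(P) = 2 P^2 / n_modes."""
    rng = random.Random(seed)
    model = [amplitude * k**slope for k in k_values]
    # Same number of modes in every band, so sigma / P is constant across k.
    sigma = [p * math.sqrt(2.0 / n_modes) for p in model]
    data = [rng.gauss(p, s) for p, s in zip(model, sigma)]
    return model, sigma, data


# Twenty logarithmically spaced bands; all numbers here are made up.
k_values = [10.0 ** (-3.0 + 4.0 * i / 19.0) for i in range(20)]
model, sigma, data = mock_band_powers(
    amplitude=100.0, slope=-1.0, k_values=k_values, n_modes=50
)
```

Because the fractional error is the same in every band, the variance changes strongly with amplitude and slope, which makes any mismodeling of the covariance visible in the posteriors.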
-
Emulator correlations
[Figure: marginal posterior samples given design runs, for PC1 to PC5; panels: amplitude and slope, each on the interval 0.0 to 1.0]
-
Parameter posteriors
Marginal distributions for the 2 cosmological parameters
[Figure: posterior density plotted against scaled model parameters (amplitude, slope); panels: 7 pt. design, 30 pt. design, and 30 pt. design: sample cov.]
-
Variance parameters
Marginal posterior distributions of PC weights for the power spectrum variance
[Figure: posterior density of the PC weights of the variance; panels: PC weight 1, PC weight 2]
-
Summary
Our method uses limited numbers of simulations to calibrate a model for the power spectrum sample variance distribution.
Obtaining precise estimates of the power spectrum covariance is a challenge; the full formulation may make this feasible
Our framework can be readily applied to general parameter inference problems using simulations
Plan to release an R package implementing these methods
Next: demonstrate covariance matrix emulator using N-body simulations of the matter power spectrum
-
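Gaussian process model formulation for the mean power spectrum
Principal component weights of the mean are modeled as independent Gaussian processes:
$\mu(k, \theta) = \sum_{i=1}^{p_\mu} \phi_{\mu,i}(k)\, w_i(\theta) + \epsilon_\mu$
$w_i(\theta) \sim GP(0, \Sigma_w(\theta; \lambda_w, \rho_w))$
Design outputs also have a Gaussian sampling model (from the error term $\epsilon_\mu$):
$\mu \mid w, \lambda_\epsilon \sim N(\Phi_\mu w, \lambda_\epsilon^{-1} I), \quad \lambda_\epsilon \sim \Gamma(a, b)$
After marginalization over GP realizations: a complicated Normal distribution with a modified Gamma prior
Emulator outputs at new design points can be drawn from:
$(\hat{w}, w(\theta^*)) \sim N(0, \Sigma_{\hat{w}, w(\theta^*)}(\lambda_w, \rho_w))$, with $(\lambda_w, \rho_w)$ taken as draws from the posterior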