monte carlo markov chains: a brief introduction and...
TRANSCRIPT
Monte Carlo Markov Chains: A Brief Introduction and Implementation
Jennifer HelsbyAstro 321
What are MCMC: Markov Chain Monte Carlo Methods?
• Set of algorithms that generate posterior distributions by sampling likelihood function in a representative way in parameter space
• Why does anybody do this?
• Scales linearly with the number of parameters considered; thus, much more efficient that other methods
Anatomy of a Markov Chain: Diagram
!M
!!
�1. Begin at a location in
parameter space,
2. Compute likelihood at this point,
3. Jump function proposes new location,
4. Compute likelihood at this point,
5. An algorithm determines whether to move there (if jump rejected, repeat steps 3-4 until accepted)
!"p = (!M,i,!!,i, wi)
!"p
L!"p
!"n = (!M,j ,!!,j , wj)
L!"n
Anatomy of a Markov Chain: Diagram
!M
!!
�1. Begin at a location in
parameter space,
2. Compute likelihood at this point,
3. Jump function proposes new location,
4. Compute likelihood at this point,
5. An algorithm determines whether to move there (if jump rejected, repeat steps 3-4 until accepted)
!"p = (!M,i,!!,i, wi)
!"p
L!"p
!"n = (!M,j ,!!,j , wj)
L!"n
!"n
Burning in
• Beginning the Markov chain
at any point in the parameter
space should result in
convergence
• First ~5-10% steps thrown
away: “burn in” process
• Reduces dependence of
reconstructed distribution on
the initial position
-0.2 0.0 0.2 0.4 0.6 0.8Omega Matter
68
70
72
74
76
78
80
82
Hubb
le R
ate
!M
H0[k
m/s
/Mpc
]
Parameter space for 2 chains:
Convergence
The Decider: Metropolis-Hastings Algorithm
• Metropolis-Hastings is the algorithm that determines whether to reject or accept the proposed jump
• Calculate ratio of likelihoods (becomes more complicated if proposal density is not symmetric):
• Acceptance criteria:
Current location:Proposed jump:
!"p = (!M,i,!!,i, wi)!"n = (!M,j ,!!,j , wj)
! =L!"nL!"p
Always accept! ! 1! < 1 Accept with probability !
Art of the MCMC
• Variable parameters:
• selection of jump function
• length of chain (typical: ~1000-10000)
• number of Markov chains (typical: ~5)
• length of burn-in (typical: ~5-10%)
• simulated annealing
Jump Function
Current location:Proposed jump:
!"p = (!M,i,!!,i, wi)!"n = (!M,j ,!!,j , wj)• Simple jump function that takes a Gaussian with mean zero and variance
and adds it to the current location:
• Adjust the variance of the Gaussian term based on the acceptance rate
• High acceptance rate: Proposed jumps are very close to current location, increase the variance
• Low acceptance rate: Many rejections means an inefficient chain (wasted computation time), decrease the variance
• Ideal acceptance rate: ~23%
!2
!"n = !"p + Gaussian(!2)
Simulated Annealing
• Algorithm used during the burn-in process
• Gives the chain a “Temperature” which slowly cools during burn in to a final temperature which then remains constant through the rest of the Markov chain
• High temperature: allows Metropolis-Hastings to accept “bad” jumps in order to travel across a large region of parameter space
• Effectively flattens the likelihood, allows more rapid movement across the parameter space during burn-in
Simulated Annealing
0 100 200 300 400Step
-0.2
-0.1
0.0
0.1
0.2
0.3
0.4Omega_Matter
!M
Step
The Dataset: Supernovae
• Supernovae dataset consists of 288 SNe combined from several surveys (SDSS-II first year supernovae + ESSENCE + SNLS + HST + LOWZ) (Kessler et al. 2009)
• Computed between luminosity distance calculated for the parameter values at that point in space and the luminosity distance from the measured distance modulus:
!2
!2 = !2 lnL
Visualizing the Parameter Space
0.2 0.4 0.6 0.8 1.0Omega Matter
70
72
74
76
78
80
Hubb
le R
ate
-1.4 -1.2 -1.0 -0.8 -0.6w_lambda
70
72
74
76
78
80
Hubb
le R
ate
w!!M
H0H0
[km
/s/M
pc]
[km
/s/M
pc]
Reconstructed DistributionsOmega_Matter
0.290 0.295 0.300 0.305 0.310 0.3150
20
40
60
80
Omega_Lambda
0.685 0.690 0.695 0.700 0.705 0.7100
20
40
60
80w_lambda
-1.08 -1.07 -1.06 -1.05 -1.04 -1.030
50
100
150
200
H_0
70.35 70.40 70.45 70.50 70.55 70.600
20
40
60
80
100
w!
!M
!!
H0
Summary
• Markov Chain Monte Carlo is a powerful method for determing parameters and their posterior distributions, especially for a parameter space with many parameters
• Selection of jump function critical in improving the efficiency of the chain, i.e. reducing computation time
• A simple MCMC is able to recover the posterior distribution of several cosmological parameters by sampling from the likelihood function computed based on supernovae data