pycbc inference: a python-based parameter estimation toolkit for … · 2018-07-30 · pycbc...

17
PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence signals C. M. Biwer Los Alamos National Laboratory, Los Alamos, NM 87545, USA Department of Physics, Syracuse University, Syracuse, NY 13244, USA Collin D. Capano Albert-Einstein-Institut, Max-Planck-Institut f¨ ur Gravitationsphysik, D-30167 Hannover, Germany Soumi De Department of Physics, Syracuse University, Syracuse, NY 13244, USA Miriam Cabero Albert-Einstein-Institut, Max-Planck-Institut f¨ ur Gravitationsphysik, D-30167 Hannover, Germany Duncan A. Brown Department of Physics, Syracuse University, Syracuse, NY 13244, USA Alexander H. Nitz Albert-Einstein-Institut, Max-Planck-Institut f¨ ur Gravitationsphysik, D-30167 Hannover, Germany V. Raymond Albert-Einstein-Institut, Max-Planck-Institut f¨ ur Gravitationsphysik, D-14476 Potsdam, Germany School of Physics and Astronomy, Cardiff University, Cardiff, CF243AA, Wales, UK Abstract. We introduce new modules in the open-source PyCBC gravitational- wave astronomy toolkit that implement Bayesian inference for compact-object binary mergers. We review the Bayesian inference methods implemented and describe the structure of the modules. We demonstrate that the PyCBC Inference modules produce unbiased estimates of the parameters of a simulated population of binary black hole mergers. We show that the posterior parameter distributions obtained used our new code agree well with the published estimates for binary black holes in the first LIGO-Virgo observing run. arXiv:1807.10312v1 [astro-ph.IM] 26 Jul 2018

Upload: others

Post on 04-Jun-2020

10 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: PyCBC Inference: A Python-based parameter estimation toolkit for … · 2018-07-30 · PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence

PyCBC Inference: A Python-based parameterestimation toolkit for compact binary coalescencesignals

C. M. Biwer

Los Alamos National Laboratory, Los Alamos, NM 87545, USADepartment of Physics, Syracuse University, Syracuse, NY 13244, USA

Collin D. Capano

Albert-Einstein-Institut, Max-Planck-Institut fur Gravitationsphysik, D-30167Hannover, Germany

Soumi De

Department of Physics, Syracuse University, Syracuse, NY 13244, USA

Miriam Cabero

Albert-Einstein-Institut, Max-Planck-Institut fur Gravitationsphysik, D-30167Hannover, Germany

Duncan A. Brown

Department of Physics, Syracuse University, Syracuse, NY 13244, USA

Alexander H. Nitz

Albert-Einstein-Institut, Max-Planck-Institut fur Gravitationsphysik, D-30167Hannover, Germany

V. Raymond

Albert-Einstein-Institut, Max-Planck-Institut fur Gravitationsphysik, D-14476Potsdam, GermanySchool of Physics and Astronomy, Cardiff University, Cardiff, CF243AA, Wales,UK

Abstract. We introduce new modules in the open-source PyCBC gravitational-wave astronomy toolkit that implement Bayesian inference for compact-objectbinary mergers. We review the Bayesian inference methods implemented anddescribe the structure of the modules. We demonstrate that the PyCBC Inferencemodules produce unbiased estimates of the parameters of a simulated populationof binary black hole mergers. We show that the posterior parameter distributionsobtained used our new code agree well with the published estimates for binaryblack holes in the first LIGO-Virgo observing run.

arX

iv:1

807.

1031

2v1

[as

tro-

ph.I

M]

26

Jul 2

018

Page 2: PyCBC Inference: A Python-based parameter estimation toolkit for … · 2018-07-30 · PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence

PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence signals 2

1. Introduction

The observations of six binary black hole mergers [1, 2,3, 4], and the binary neutron star merger GW170817 [5]by Advanced LIGO [6] and Virgo [7] have establishedthe field of gravitational-wave astronomy. Understand-ing the origin, evolution, and physics of gravitational-wave sources requires accurately measuring the prop-erties of detected events. In practice, this is per-formed using Bayesian inference [8, 9]. Bayesian in-ference allows us to determine the signal model thatis best supported by observations and to obtain pos-terior probability densities for a model’s parameters,hence inferring the properties of the source. In thispaper, we present PyCBC Inference; a set of Pythonmodules that implement Bayesian inference in the Py-CBC open-source toolkit for gravitational-wave astron-omy [10]. PyCBC Inference has been used to performBayesian inference for several astrophysical problems,including: testing the black hole area increase law [11];combining multi-messenger obervations of GW170817to constrain the viewing angle of the binary [12]; anddetermining that the gravitational-wave observationsof GW170817 favor a model where both compact ob-jects have the same equation of state, and measur-ing the tidal deformabilities and radii of the neutronstars [13].

We provide a comprehensive description of themethods and code implemented in PyCBC Inference.We then demonstrate that PyCBC Inference canproduce unbiased estimates of the parameters of asimulated population of binary black holes. Weshow that PyCBC Inference can recover posteriorprobability distributions that are in good agreementwith the published measurements of the binary blackholes detected in the first LIGO-Virgo observingrun [1]. This paper is organized as follows: Sec. 2gives an overview of the Bayesian inference methodsused in gravitational-wave astronomy for compact-object binary mergers. We provide an overview ofthe waveform models used; the likelihood functionfor a known signal in stationary, Gaussian noise;the sampling methods used to estimate the posteriorprobability densities and the evidence; the selection ofindependent samples; and the estimation of parametervalues from posterior probabilities. Sec. 3 describesthe design of the PyCBC Inference software and howthe methods described in Sec. 2 are implementedin the code. Sec. 4 uses a simulated population

of binary black holes and the black-hole mergersdetected in the first LIGO-Virgo observing run todemonstrate the use of PyCBC Inference. We providethe posterior probability densities for the eventsGW150914, GW151226, and LVT151012, and thecommand lines and configurations to reproduce theseresults as supplemental materials [14]. Finally, wesummarize the status of the code and possible futuredevelopments in Sec. 5.

2. Bayesian Inference for Binary Mergers

In gravitational-wave astronomy, Bayesian methodsare used to infer the properties of detected astrophys-ical sources [15, 16, 17, 18]. Given the observed data~d(t)—here this is data from a gravitational-wave de-tector network in which a search has identified a sig-nal [19, 20, 21]—Bayes’ theorem [8, 9] states that for ahypothesis H,

p(~ϑ|~d(t), H) =p(~d(t)|~ϑ,H)p(~ϑ|H)

p(~d(t)|H). (1)

In our case, hypothesis H is the model of thegravitational-wave signal and ~ϑ are the parameters ofthis model. Together, these describe the properties ofthe astrophysical source of the gravitational waves. InEq. (1), the prior probability density p(~ϑ|H) describesour knowledge about the parameters before consideringthe observed data ~d(t), and the likelihood p(~d(t)|~ϑ,H)

is the probability of obtaining the observation ~d(t)

given the waveform model H with parameters ~ϑ.Often we are only interested in a subset of the

parameters ~ϑ. To obtain a probability distributionon one or a few parameters, we marginalize theposterior probability by integrating p(~d(t)|~ϑ,H)p(~ϑ|H)over the unwanted parameters. Marginalizing over allparameters yields the evidence, p(~d(t)|H), which isthe denominator in Eq. (1). The evidence serves as anormalization constant of the posterior probability forthe given model H. If we have two competing modelsHA and HB , the evidence can be used to determinewhich model is favored by the data via the Bayesfactor [22, 23, 24],

B =p(~d(t)|HA)

p(~d(t)|HB). (2)

If B is greater than 1 then model HA is favored overHB , with the magnitude of B indicating the degree ofbelief.

Page 3: PyCBC Inference: A Python-based parameter estimation toolkit for … · 2018-07-30 · PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence

PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence signals 3

PyCBC Inference can compute Bayes factors andproduce marginalized posterior probability densitiesgiven the data from a network of gravitational-waveobservatories with N detectors ~d(t) = {di(t); 1 < i <N}, and a model H that describes the astrophysicalsource. In the remainder of this section, we review themethods used to compute these quantities.

2.1. Waveform Models

The gravitational waves radiated in a binary merger’ssource frame are described by the component massesm1,2, the three-dimensional spin vectors ~s1,2 of thecompact objects [25], and the binary’s eccentricitye [26]. A parameter φ describes the phase of the binaryat a fiducial reference time, although this is not usuallyof physical interest. For binaries containing neutronstars, additional parameters Λ1,2 describe the star’stidal deformabilities [27, 28], which depend on theequation of state of the neutron stars. The waveformobserved by the Earth-based detector network dependson seven additional parameters: the signal’s time ofarrival tc, the binary’s luminosity distance dL, andfour Euler angles that describe the transformation fromthe binary’s frame to the detector network frame [29].These angles are typically written as the binary’s rightascension α, declination δ, a polarization angle Ψ, andthe inclination angle ι (the angle between the binary’sangular momentum axis and the line of sight).

Binary mergers present a challenging problemfor Bayesian inference, as the dimensionality of thesignal parameter space is large. This is furthercomplicated by correlations between the signal’sparameters. For example, at leading order thegravitational waveform depends on the chirp massM [30]. The mass ratio enters the waveformat higher orders and is more difficult to measure.This results in an amplitude-dependent degeneracybetween the component masses [18]. Similarly,the binary’s mass ratio can be degenerate with itsspin [31], although this degeneracy can be brokenif the binary is precessing. Much of the effort ofparameter estimation in gravitational-wave astronomyhas focused on developing computationally feasibleways to explore this signal space, and on extractingphysically interesting parameters (or combinations ofparameters) from the large, degenerate parameterspace (see e.g. Ref. [32] and references therein).However, in many problems of interest, we are notconcerned with the full parameter space describedabove. For example, field binaries are expected tohave negligible eccentricity when they are observed byLIGO and Virgo [30], and so eccentricity is neglectedin the waveform models. Simplifying assumptionscan be made about the compact object’s spins(e.g. the spins are aligned with the binary’s orbital

angular momentum), reducing the dimensionality ofthe waveform parameters space.

Given a set of parameters ~ϑ, one can obtaina model of the gravitational-wave signal from abinary merger using a variety of different methods,including: post-Newtonian theory (see e.g. Ref. [33]and references therein), analytic models calibratedagainst numerical simulations [34, 35, 36, 37, 38, 39,40], perturbation theory [41, 42], and full numericalsolution of the Einstein equations (see e.g. Ref. [43] andreferences therein). Obtaining posterior probabilitiesand evidences can require calculating O

(109)

templatewaveforms, which restricts us to models that arecomputationally efficient to calculate. The cost offull numerical simulations makes them prohibitivelyexpensive at present. Even some analytic models aretoo costly to be used, and surrogate models have beendeveloped that capture the features of these waveformsat reduced computational cost [44, 45].

The specific choice of the waveform model Hfor an analysis depends on the physics that we wishto explore, computational cost limitations, and thelevel of accuracy desired in the model. A varietyof waveform models are available for use in PyCBCInference, either directly implemented in PyCBC orvia calls to the LIGO Algorithm Library (LAL) [46].We refer to the PyCBC and LAL documentation, andreferences therein, for detailed descriptions of thesemodels. In this paper, we demonstrate the use ofPyCBC Inference using the IMRPhenomPv2 [47, 48]waveform model for binary black hole mergers. Thismodel captures the inspiral-merger-ringdown physicsof spinning, precessing binaries and parameterizingspin effects using a spin magnitude aj , an azimuthalangle θaj , and a polar angle θpj for each of the twocompact objects. Examples of using PyCBC Inferencewith different waveform models include the analysisof Ref. [13] that used the TaylorF2 post-Newtonianwaveform model with tidal corrections, and Ref. [11]that used a ringdown-only waveform that models thequasi-normal modes of the remnant black hole.

2.2. Likelihood Function

The data observed by the gravitational-wave detectornetwork enters Bayes’ theorem through the likelihoodp(~d(t)|~ϑ,H) in Eq. (1). Currently, PyCBC Inferenceassumes that the each detector produces stationary,Gaussian noise ni(t) that is uncorrelated between thedetectors in the network. The observed data is thendi(t) = ni(t) + si(t), where si(t) is the gravitationalwaveform observed in the i-th detector. For detectorsthat are not identical and co-located (as in the caseof the LIGO-Virgo network), each detector observersa slightly different waveform due to their differentantennae patterns, however the signal in the i-th

Page 4: PyCBC Inference: A Python-based parameter estimation toolkit for … · 2018-07-30 · PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence

PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence signals 4

detector can be calculated given the subset of theparameters ~ϑ that describes the location of the binary.

Under these assumptions, the appropriate form ofp(~d(t)|~ϑ,H) is the well-known likelihood for a signal ofknown morphology in Gaussian noise (see e.g. Ref. [49]for its derivation), which is given by

p(~d(t)|~ϑ,H) = exp

[−1

2

N∑i=1

〈ni(f)|ni(f)〉

]

= exp

[−1

2

N∑i=1

⟨di(f)− si(f, ~ϑ)|di(f)− si(f, ~ϑ)

⟩],(3)

where N is the number of detectors in the network.The inner product 〈a|b〉 is⟨ai(f)|bi(f)

⟩= 4<

∫ ∞0

ai(f)bi(f)

S(i)n (f)

df , (4)

where S(i)n (f) is the power spectral density of the of

the i-th detector’s noise. Here, di(f) and ni(f) arethe frequency-domain representations of the data andnoise, obtained by a Fourier transformation of di(t)

and ni(t), respectively. The model waveform si(f, ~ϑ)may be computed directly in the frequency domain, orin the time domain and then Fourier transformed tothe frequency domain. There are several operations(e.g. Fourier transforms, noise power spectral densityestimation, and inner products) that are commonbetween the calculation of Eq. (3) and the computationof the matched filter signal-to-noise ratio (SNR) inPyCBC [19, 20, 21]. PyCBC Inference uses theseexisting functions, where appropriate.

In general, gravitational-wave signals consist of asuperposition of harmonic modes. However, in manycases it is sufficient to model only the most dominantmode, since the sub-dominant harmonics are too weakto be measured. In this case, the signal observed inall detectors has the same simple dependence on thefiducial phase φ,

si(f, ~ϑ, φ) = s0i (f,

~ϑ, 0)eiφ. (5)

The posterior probability p(~ϑ|~d(t), H) can be analyti-cally marginalized over φ for such models [49]. Assum-ing a uniform prior on φ ∈ [0, 2π), the marginalizedposterior is

log p(~ϑ|~d(t), H) ∝ log p(~ϑ|H) + I0

(∣∣∣∣∣∑i

O(s0i , di)

∣∣∣∣∣)

− 1

2

∑i

[⟨s0i , s

0i

⟩−⟨di, di

⟩], (6)

where

s0i ≡ si(f, ~ϑ, φ = 0),

O(s0i , di) ≡ 4

∫ ∞0

s∗i (f ;ϑ, 0)di(f)

S(i)n (f)

df,

and I0 is the modified Bessel function of the first kind.We have found that analytically marginalizing

over φ in this manner reduces the computationalcost of the analysis by a factor of 2 – 3. TheIMRPhenomPv2 model that we use here is a simplifiedmodel of precession that allows for this analyticmarginalization [47, 48]. Since fiducial phase isgenerally a nuisance parameter, we use this form ofthe likelihood function in Secs. 4.1 and 4.2.

2.3. Sampling Methods

Stochastic sampling techniques, and in particularMarkov-chain Monte Carlo (MCMC) methods [50, 51,52, 53], have been used to numerically sample theposterior probability density function of astrophysicalparameters for binary-merger signals [18, 54, 55, 56].Ensemble MCMC algorithms use multiple Markovchains to sample the parameter space. A simplechoice to initialize the k-th Markov chain in theensemble is to draw a set of parameters ~ϑ

(k)1 from

the prior probability density function. The Markovchains move around the parameter space accordingto the following set of rules. At iteration l, the

k-th Markov chain has the set of parameters ~ϑ(k)l .

The sampling algorithm chooses a new proposed

set of parameters ~ϑ(k)l′ with probability Q(~ϑ

(k)l , ~ϑ

(k)l′ ).

When a new set of parameters is proposed, thesampler computes an acceptance probability γ whichdetermines if the Markov chain should move to theproposed parameter set ~ϑ

(k)l′ such that ~ϑ

(k)l+1 = ~ϑ

(k)l′ . If

~ϑ(i)l′ is rejected, then ~ϑ

(k)l+1 = ~ϑ

(k)l . After a sufficient

number of iterations, the ensemble converges to adistribution that is proportional to a sampling ofthe posterior probability density function. The trueastrophysical parameters ~ϑ can then be estimated fromhistograms of the position of the Markov chains inthe parameter space. Different ensemble samplingalgorithms make particular choices for the proposal

probability Q(~ϑ(k)l , ~ϑ

(k)l′ ) and acceptance probability γ.

The open-source community has several well-developed software packages that implement algo-rithms for sampling the posterior probability densityfunction. PyCBC Inference leverages these develop-ments, and we have designed a flexible framework thatallows the user to choose from multiple ensemble sam-pling algorithms. Currently, PyCBC Inference sup-ports the open-source ensemble sampler emcee [57, 58],its parallel-tempered version emcee pt [59, 60], and thekombine [61, 62] sampler. All of the three are ensembleMCMC samplers. The sampling algorithm advancesthe positions of the walkers based on their previouspositions and provides PyCBC Inference the positionsof the walkers along the Markov chain.

The emcee pt sampler is a parallel-tempered

Page 5: PyCBC Inference: A Python-based parameter estimation toolkit for … · 2018-07-30 · PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence

PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence signals 5

sampler which advances multiple ensembles basedon the tempering or the “temperatures” used toexplore the posterior probability density function. Theposterior probability density function for a particulartemperature T is modified such that

pT (~ϑ|~d(t), H) =p(~d(t)|~ϑ,H)

1T p(~ϑ|H)

p(~d(t)|H). (7)

The emcee pt sampler uses several temperatures inparallel, and the position of Markov chains are swappedbetween temperatures using an acceptance criteriadescribed in Ref. [60]. Mixing of Markov chains fromthe different temperatures makes parallel-temperedsamplers suitable for sampling posterior probabilitydensity functions with widely separated modes in theparameter space [60]. The emcee sampler performs thesampling using one temperature where T = 1.

The kombine sampler on the other hand usesclustered kernel-density estimates to construct itsproposal distribution, and proposals are accepted usingthe Metropolis–Hastings condition [63]. The kombine

sampler has been included in PyCBC Inference dueto its efficient sampling which significantly lowersthe computational cost of an analysis relative to theemcee pt sampler. However, in Sec. 4.1, we foundthat the nominal configuration of the kombine samplerproduced biased estimates of parameters for binaryblack holes.

2.4. Selection of Independent Samples

The output returned by the sampling algorithmsdiscussed in Sec. 2.3 are Markov chains. Successivestates of these chains are not independent, as Markovprocesses depend on the previous state [64]. Theautocorrelation length τK of a Markov chain isa measure of the number of iterations requiredto produce independent samples of the posteriorprobability density function [65]. The autocorrelation

length of the k-th Markov chain X(k)l = {~ϑ(k)

g ; 1 < g <l} of length l obtained from the sampling algorithm isdefined as

τK = 1 + 2

K∑i=1

Ri, (8)

where K is the first iteration along the Markov chainthe condition mτK ≤ K is true, m being a parameterwhich in PyCBC Inference is set to 5 [65]. Theautocorrelation function Ri is defined as

Ri =1

lσ2

l−i∑t=1

(Xt − µ) (Xt+i − µ) , (9)

where Xt are the samples of X(k)l between the 0-th

and the t-th iteration, Xt+i are the samples of X(k)l

between the 0-th and the (t+ 1)-th iterations. Here, µand σ2 are the mean and variance of Xt, respectively.

The initial positions of the Markov chainsinfluence their subsequent positions. The length ofthe Markov chains before they are considered to havelost any memory of the initial positions is calledthe “burn-in” period. It is a common practice inMCMC analyses to discard samples from the burn-inperiod to prevent any bias introduced by the initialpositions of the Markov chains on the estimates of theparameters from the MCMC. PyCBC Inference hasseveral methods to determine when the Markov chainsare past the burn-in period. Here, we describe twomethods, max posterior and n acl, which we havefound to work well with the kombine and emcee pt

samplers used in Sections 4.1 and 4.2.The max posterior algorithm is an implementa-

tion of the burn-in test used for the MCMC samplerin Ref. [32]. In this method, the k-th Markov chainis considered to be past the burn-in period at the firstiteration l for which

logL(k)l ≥ max

k,llogL − Np

2, (10)

where L is the prior-weighted likelihood

L = p(~d(t)|~ϑ,H)p(~ϑ|H), (11)

and Np is the number of dimensions in the parameterspace. The maximization maxk,l logL is carried outover all Markov chains and iterations. The ensembleis considered to be past the burn-in period at the firstiteration where all chains pass this test. We have foundthis test works well with the kombine sampler if thenetwork signal-to-noise ratio of the signal is & 5.

While the max posterior test works well with thekombine sampler, we have found that it underestimatesthe burn-in period when used with the emcee pt

sampler. Instead we use the n acl test with theemcee pt sampler. This test posits that the sampleris past the burn-in period if the length of the chainsexceed 10 times the autocorrelation length. Theautocorrelation length is calculated using samples fromthe second half of the Markov chains. If the test issatisfied, the sampler is considered to be past the burn-in period at the midway point of the Markov chains.

Correlations between the neighboring samplesafter the burn-in period are removed by “thinning”or drawing samples from the Markov chains withan interval of the autocorrelation length [64]. Thisis done so that the samples used to estimate theposterior probability density function are independent.Therefore, the number of independent samples ofthe posterior probability density function is equal tothe number of Markov chains used in the ensembletimes the number of iterations after the burn-inperiod divided by the autocorrelation length. PyCBC

Page 6: PyCBC Inference: A Python-based parameter estimation toolkit for … · 2018-07-30 · PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence

PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence signals 6

Inference will run until it has obtained the desirednumber of independent samples after the burn-inperiod.

2.5. Credible Intervals

After discarding samples from the burn-in period andthinning the remaining samples of the Markov chains,the product is the set of independent samples asdescribed in Sec. 2.4. Typically we summarize themeasurement of a given parameter using a credibleinterval. The x% credible interval is an interval wherethe true parameter value lies with a probability ofx%. PyCBC Inference provides the capability tocalculate credible intervals based on percentile values.In the percentile method, the x% credible interval of aparameter value is written as A+C

−B where A is typicallythe 50-th percentile (median) of the marginalizedhistograms. The values A − B and A + C representthe lower and upper boundaries of the x-th percentilerespectively.

An alternative method of calculating a credibleinterval estimate is the Highest Posterior Density(HPD) method. An x% HPD interval is the shortestinterval that contains x% of the probability. Thepercentile method explained above imposes a non-zero lower boundary to the interval being measured.This can be perceived as a limitation in cases wherethe weight of histogram at the ∼ 0-th percentileis not significantly different from the weight at thelower boundary of the credible interval. Intervalsconstructed using the HPD method may be preferredin such cases. Previous studies have noted that HPDintervals may be useful when the posterior distributionis not symmetric [66]. PyCBC Inference uses HPDto construct confidence contours for two-dimensionalmarginal distributions, but HPD is not used in thecontruction of one-dimensional credible intervals for asingle parameter. This functionality will be includedin a future release of PyCBC Inference.

3. The PyCBC Inference Toolkit

In this section we describe the implementation ofPyCBC Inference within the broader PyCBC toolkit.PyCBC provides both modules for developing codeand executables for performing specific tasks withthese modules. The code is available on the publicGitHub repository at https://github.com/gwastro/pycbc, with executables located in the directorybin/inference and the modules in the directorypycbc/inference. PyCBC Inference provides anexecutable called pycbc inference that is the mainengine for performing Bayesian inference with PyCBC.A call graph of pycbc inference is shown in Figure 1.In this section, we review the structure of the

main engine and the Python objects used to buildpycbc inference.

3.1. pycbc inference executable

The methods presented in Sec. 2.1, 2.2, 2.3, 2.4, and 2.5are used to build the executable pycbc inference.For faster performances, pycbc inference can be runon high-throughput computing frameworks such asHTCondor [67, 68] and the processes for running thesampler can be parallelized over multiple computenodes using MPI [69, 70, 71]. The execution ofthe likelihood computation and the PSD estimationare done using either single-threaded or parallel FFTengines, such as FFTW [72] or the Intel MathKernel Library (MKL). For maximum flexibility inheterogeneous computing environments, the processingscheme to be used is specified at runtime as a commandline option to pycbc inference.

The input to pycbc inference is a configurationfile which contains up to seven types of sections. Thevariable args section specifies the parameters thatare to be varied in the MCMC. There is a prior sec-tion for each parameter in the variable args sectionwhich contains arguments to initialize the prior prob-ability density function for that parameter. There isa static args section specifying any parameter forwaveform generation along with its assigned value thatshould be fixed in the ensemble MCMC. Optionally,the configuration file may also include a constraint

section(s) containing any conditions that constrain theprior probability density functions of the parameters.For efficient convergence of a Markov chain, it maybe desirable to sample the prior probability densityfunction in a different coordinate system than the pa-rameters defined in the variable args section or theparameters inputted to the waveform generation func-tions. Therefore, the configuration file may contain asampling parameters and sampling transform sec-tion(s) that specifies the transformations between pa-rameters in the variable args sections and the pa-rameters evaluated in the prior probability densityfunction. Finally, the waveform generation functionsrecognize only a specific set of input parameters.The waveform transforms section(s) may be providedwhich maps parameters in the variable args sectionto parameters understood by the waveform generationfunctions. More details on application of constraintsand execution of coordinate transformations are pro-vided in Sec. 3.3 and 3.4 respectively.

The location of the configuration file, gravitational-wave detector data files, data conditioning settings,and settings for the ensemble MCMC are supplied onthe command line interface to pycbc inference. Theresults from running pycbc inference are stored in a

Page 7: PyCBC Inference: A Python-based parameter estimation toolkit for … · 2018-07-30 · PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence

PyC

BC

Inferen

ce:A

Pyth

on-ba

sedparameter

estimatio

ntoo

lkitforcompact

binary

coalescen

cesign

als

7

output_file.hdf

.priorPrior evaluator

.generatorUnderlying routines

._callfunclog posterior of variables

niterations

[sampling_params]

[params]

[rparams]

hp,hc

{h}

{h},logp,{data}

[params]

{params_i} logp_i

logp_i

logp

log_post

sampling_log_post

chain

results

[params] log_post

pycbc_inference Call Graph

: Executable: Class: Function: File: Module

niterations: Number of samples to get

[sampling_params]:List of sampling-frame parameter values to test

[params]: List of parameter values to test

[rparams]:Radiation-frame parameter values

{params_i}:Dictionary of parameters for i-th prior

logp_i:Log pdf of the i-th prior at {params_i}

logp: Log pdf of the prior at [params]

hp,hc:Plus and cross polarization of the model waveform at [rparams]

{h}:Dictionary of the model waveform at [params] in each detector

{data}:Dictionary of the data in each detector

log_post:Log of the posterior probability at [params]

sampling_log_post:Log of the posterior probability in the sampling-frame

chain:Accepted parameters, in whatever form returned by the sampling engine.

results:FieldArray of accepted [params] over niterationsand their log_post values

Key

external modules

LALSimulation or ringdown.pypycbc/waveform/[waveform.py | ringdown.py]

.distributionsPrior on each variable

.logpdflog pdf of a variable

[params] hp,hc

[params] hp,hc

.generatorWavefrom generation

function

pycb

c/w

avef

orm

/gen

erat

or.p

ypy

cbc/

dist

ribut

ions

pycbc/inference/likelihood.py

pycbc/transforms

pycbc/inference/samplers.py

bin/

infe

renc

e

pycbc/io/inference_hdf.pyniterations results

pycbc_inference

samplerSampler class

._samplerSampling engine

(kombine, emcee_pt)

likelihoodLikelihood evaluator

transformssampling parameters «waveform parameters.generator

Detector-frame waveform generation

.rframe_generatorRadiation-frame

wavefrom generation

Figure (1) The executable pycbc inference samples the posterior probability density function. For an iteration in an ensemble MCMC algorithm, theSampler object uses the LikelihoodEvaluator object to compute the natural logarithm of the posterior probability, and returns it to pycbc inference.The LikelihoodEvaluator object uses the Generator object to generate the waveform and Distribution objects to evaluate the prior probabilitydensity function. Samples are periodically written to the output file.

Page 8: PyCBC Inference: A Python-based parameter estimation toolkit for … · 2018-07-30 · PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence

PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence signals 8

HDF [73] file whose location is provided on thecommand line to pycbc inference as well. Themain results of interest are stored under the HDFgroups [‘samples’] and [‘likelihood stats’].The [‘samples’] group contains the history of theMarkov chains as separate datasets for each of thevariable parameters. The [‘likelihood stats’]

group contains a dataset of the natural logarithm ofthe Jacobian which is needed to transform from thevariable parameters to sampling parameters, a datasetcontaining natural logarithm of the likelihood ratiolog p(~d(t)|~ϑ,H)/p(~d(t)|~n) and a dataset containing thenatural logarithm of the prior probabilities. Thenatural logarithm of the noise likelihood log p(~d(t)|~n)is stored as an attribute in the output file, and thelikelihood is the summation of this quantity with thenatural logarithm of the likelihood ratio. Each ofthe datasets under the [‘samples’] group and the[‘likelihood stats’] group has shape nwalkers ×niterations if the sampling algorithm used in theanalysis did not include parallel tempering, and hasshape ntemps × nwalkers × niterations for parallel-tempered samplers. Here, nwalkers is the numberof Markov chains, niterations is the number ofiterations, and ntemps is the number of temperatures.

pycbc inference has checkpointing implementedwhich allows users to resume an analysis from the lastset of Markov chains positions written to the outputfile. It is computationally expensive to obtain thedesired number of independent samples using ensembleMCMC methods, and the pycbc inference processesmay terminate early due to problems on distributed-computing networks. Therefore, the samples from theMarkov chains should be written at regular intervalsso pycbc inference can resume the ensemble MCMCfrom the position of the Markov chains near thestate the process was terminated. The frequencypycbc inference writes the samples from Markovchains and the state of the random number generatorto the output file and a backup file is specified by theuser on the command line. A backup file is writtenby pycbc inference because the output file frompycbc inference may be corrupted. For example, ifthe process is aborted while writing to the output file,then the output file may be corrupted. In that case,samples and the state of the random number generatorare loaded from the backup file, and the backup fileis copied to the output file. This ensures that thepycbc inference process can always be resumed.

For analyses that use the emcee pt sampler,the likelihood can be used to compute the naturallogarithm of the evidence using the emcee pt sampler’sthermodynamic integration log evidence function[59]. Then, the evidences from two analyses can beused to compute the Bayes factor B for the comparison

of two waveform models.We provide example configuration files and run

scripts for the analysis of the binary black hole mergersdetected in the Advanced LIGO’s first observing run inRef. [14]. These examples can be used with the open-source datasets provided by the LIGO Open ScienceCenter [74]. The results of these analyses are presentedin Sec. 4.2.

3.2. Sampler objects

The PyCBC Inference modules provide a set ofSampler objects which execute the Bayesian samplingmethods. These objects provide classes and functionsfor using open-source samplers such as emcee [58],emcee pt [59] or kombine [62]. This acts as aninterface between PyCBC Inference and the externalsampler package. The executable pycbc inference

initializes, executes, and saves the output from theSampler objects. A particular Sampler object ischosen on the command line of pycbc inference withthe --sampler option. The Sampler object providesthe external sampler package the positions of thewalkers in the parameter space, the natural logarithmof the posterior probabilities at the current iteration,the current “state” determined from a random numbergenerator, and the number of iterations that thesampler is requested to run starting from the currentiteration. After running for the given number ofiterations, the sampler returns the updated positionsof the Markov chains, the natural logarithm of theposterior probabilities, and the new state.

3.3. Transform objects

The Transform objects in PyCBC Inference are used toperform transformations between different coordinatesystems. Currently, the Transform objects are usedin two cases: sampling transforms and waveformtransforms.

Sampling transforms are used for transformingparameters that are varied in the ensemble MCMCto a different coordinate system before evaluating theprior probaility density function. Since there existsdegeneracies between several parameters in a waveformmodel it is useful to parameterize the waveform usinga preferred set of parameters which could minimize thecorrelations. This leads to more efficient sampling, andtherefore, it leads to faster convergence of the Markovchains. One example of a sampling transformationis the transformation between the component massesm1 and m2 to chirp mass M and mass ratio q.The convention adopted for q in PyCBC Inference isq = m1/m2, where m1 and m2 are the componentmasses with m1 > m2. The chirp mass M is themost accurately measured parameter in a waveform

Page 9: PyCBC Inference: A Python-based parameter estimation toolkit for … · 2018-07-30 · PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence

PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence signals 9

model because it is in the leading order term of thepost-Newtonian expression of the waveform model.In contrast, the degeneracies of the mass ratio withspin introduces uncertainties in measurements of thecomponent masses. Therefore, sampling in (M and q)proves to be more efficient than m1 and m2 [55, 32,75]. In the GW150914, LVT151012, and GW151226configuration files in [14], we demonstrate how to allowthe Sampler object to provide priors in the (m1,m2)coordinates, and specify sampling transformations tothe (M, q) coordinates.

Waveform transforms are used to transform anyvariable parameters in the ensemble MCMC that maynot be understood by the waveform model functions.In PyCBC, the waveform model functions accept thefollowing parameters: component masses m1 and m2,dL, ι, tc, φ, and any additional spin parameters inCartesian coordinates. The convention adopted inPyCBC for ι denotes ι = 0 as a “face-on” binary (lineof sight parallel to binary angular momentum), ι = π

2as an “edge-on” binary (line of sight perpendicular tobinary angular momentum), and ι = π as a “face-off” binary (line of sight anti-parallel to binary angularmomentum).

3.4. LikelihoodEvaluator object

The LikelihoodEvaluator object computes thenatural logarithm of the prior-weighted likelihoodgiven by the numerator of Eq. 1. Since the evidence isconstant for a given waveform model, then the prior-weighted likelihood is proportional to the posteriorprobability density function, and it can be usedin sampling algorithms to compute the acceptanceprobability γ instead of the full posterior probabilitydensity function. The prior-weighted likelihood iscomputed for each new set of parameters as theSampler objects advance the Markov chains throughthe parameter space.

3.5. Distribution objects

The LikelihoodEvaluator object must compute theprior probability density function p(~ϑ|H). There existsseveral Distribution objects that provide functionsfor evaluating the prior probability density functionto use for each parameter, and for drawing randomsamples from these distributions. Currently, PyCBCInference provides the following Distributions:

(i) Arbitrary : Reads a set of samples stored in aHDF format file and uses Gaussian kernel-densityestimation [76] to construct the distribution.

(ii) CosAngle : A cosine distribution.

(iii) SinAngle : A sine distribution.

(iv) Gaussian : A multivariate Gaussian distribution.

(v) Uniform : A multidimensional uniform distribu-tion.

(vi) UniformAngle : A uniform distribution between0 and 2π.

(vii) UniformLog : A multidimensional distributionthat is uniform in its logarithm.

(viii) UniformPowerLaw : A multidimensional distribu-tion that is uniform in a power law.

(ix) UniformSky : A two-dimensional isotropic distri-bution.

(x) UniformSolidAngle : A two-dimensional distri-bution that is uniform in solid angle.

Multiple Distribution objects are needed to definethe prior probability density function for all param-eters. The JointDistribution object combines theindividual prior probability density functions, provid-ing a single interface for the LikelihoodEvaluator

to evaluate the prior probability density functionfor all parameters. As the sampling algorithmadvances the positions of the Markov chains, theJointDistribution computes the product of theprior probability density functions for the proposednew set of points in the parameter space. TheJointDistribution can apply additional constraintson the prior probability density functions of parame-ters and it renormalizes the prior probability densityfunction accordingly. If multiple constraints are pro-vided, then the union of all constraints are applied. Wedemonstrate how to apply a cut on M and q obtainedfrom the m1 and m2 prior probability density functionsin the GW151226 configuration file in Ref. [14].

3.6. Generator objects

As part of the likelihood calculation described inSec. 2.2, a waveform si(f, ~ϑ) is generated from

a waveform model H and set of parameters ~ϑ.PyCBC Inference provides Generator objects thatallow waveforms si(f, ~ϑ) to be generated for waveformmodels described in Sec. 2.1 using PyCBC’s interfaceto LAL [46]. There are also Generator objectsprovided for generating ringdown waveforms as used inRef. [11]. Given the waveform model provided in theconfiguration file, pycbc inference will automaticallyselect the associated Generator object.

4. Validation of the Toolkit

PyCBC Inference includes tools for visualizing the re-sults of parameter estimation, and several analyticfunctions that can be used to test the generation ofknown posterior probabilities. Two common ways tovisualize results are a scatter plot matrix of the inde-pendent samples of the Markov chains, and a marginal-ized one-dimensional histograms showing the bounds of

Page 10: PyCBC Inference: A Python-based parameter estimation toolkit for … · 2018-07-30 · PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence

PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence signals 10

each parameter’s credible interval. Analytic likelihoodfunctions available to validate the code include: themultivariate normal, Rosenbrock, eggbox, and volcanofunctions. An example showing the visualization of re-sults from the multi-variate Gaussian test is shown inFig. 2. This figure was generated using the executablepycbc inference plot posterior which make exten-sive use of tools from the open-source packages Mat-plotlib [77] and SciPy [78].

We can also validate the performance of PyCBCInference by: (i) determining if the inferred parametersof a population of simulated signals agrees withknown the parameters that population, and (ii)comparing PyCBC Inference’s parameter credibleintervals astrophysical signals to the published LIGO-Virgo results that used a different inference code. Inthis section, we first check that the credible intervalsmatch the probability of finding the simulated signalparameters in that interval, that is, that x% of signalsshould have parameter values in the x% credibleinterval. We then compare the recovered parameters ofthe binary black hole mergers GW150914, GW151226,and LVT151012 to those published in Ref. [1].

4.1. Simulated Signals

To test the performance of PyCBC Inference, wegenerate 100 realizations of stationary Gaussian noisecolored by power-spectral densities representative ofthe sensitivity of Advanced LIGO detectors at the timeof the detection of GW150914 [74]. To each realizationof noise we add a simulated signal whose parameterswere drawn from the same prior probability densityfunction used in the analysis of GW150914 [79], withan additional cut placed on distance to avoid havingtoo many injections with low matched-filter SNR. Theresulting injections have matched-filter SNRs between5 and 160, with the majority between ∼ 10 and∼ 40. We then perform a parameter estimationanalysis on each signal to obtain credible intervals onall parameters.

We perform this test using both the emcee pt andkombine samplers. For the emcee pt sampler we use200 walkers and 20 temperatures. We run the sampleruntil we obtain at least 2000 independent samples afterthe burn-in period as determined using the n acl burn-in test. For the kombine sampler, we use 5000 walkersand the max posterior burn-in test. As a result, weneed only to run the kombine sampler until the burn-in test is satisfied, at which point we immediately have5000 independent samples of the posterior probabilitydensity function.

Both simulated signals and the waveforms inthe likelihood computation are generated usingIMRPhenomPv2 [47, 48]. This waveform model has15 parameters. To reduce computational cost, we

analytically marginalize over the fiducial phase φ byusing Eq. (6) for the posterior probability, therebyreducing the number of sampled parameters to 14. Foreach parameter, we count the number of times thesimulated parameter falls within the measured credibleinterval.

Figure 3 summarizes the result of this test usingthe emcee pt and kombine samplers. For each ofthe parameters we plot the fraction of signals whosetrue parameter value fall within a credible interval asa function of credible interval (this is referred to asa percentile-percentile plot). We expect the formerto equal the latter for all parameters, though somefluctuation is expected due to noise. We see that allparameters follow a 1-to-1 relation, though the resultsfrom the kombine sampler have greater variance thenthe emcee pt sampler.

To quantify the deviations seen in Fig. 3, weperform a Kolmogorov–Smirnov (KS) test on eachparameter to see whether the percentile-percentilecurves match the expected 1-to-1 relation. If thesamplers and code are performing as expected, thenthese p-values should in turn follow a uniformdistribution. We therefore perform another KS teston the collection of p-values, obtaining a two-tailed p-value of 0.50 for emcee pt and 0.03 for kombine. Inother words, if emcee pt provides an unbiased estimateof the parameters, then there is a 50% chance that wewould obtain a collection of percentile-percentile curvesmore extreme than seen in Fig. 3. For the kombine

sampler, the probability of obtaining a more extremecollection of curves than that seen in Fig. 3 is only 3%.

Based on these results, we conclude that PyCBCInference does indeed provide unbiased estimates ofbinary black hole parameters when used with emcee pt

with the above settings. The kombine sampler does notappear to provide unbiased parameter estimates whenused to sample the full parameter space of precessingbinary black holes with the settings we have used.

4.2. Astrophysical Events

In this section, we present PyCBC Inference measure-ments of properties of the binary black hole sourcesof the two gravitational-wave signals GW150914 andGW151226, and the third gravitational-wave signalLVT151012 consistent with the properties of a bi-nary black hole source from Advanced LIGO’s firstobserving run [80, 1]. We perform the parameter es-timation analysis on the Advanced LIGO data avail-able for these events at the LIGO Open Science Cen-ter [74]. We use the emcee pt sampler for theseanalyses. For computing the likelihood, we analyzethe gravitational-wave dataset ~d(t) from the Han-

ford and Livingston detectors. ~d(t) in our analy-ses are taken from GPS time intervals 1126259452 to

Page 11: PyCBC Inference: A Python-based parameter estimation toolkit for … · 2018-07-30 · PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence

PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence signals 11

Figure (2) The samples of the posterior probability density function for a four-dimensional normal distribution.Typically, these results are shown as a scatter-plot matrix of independent samples. Here, the points in thescatter-plot matrix are colored by the natural logarithm of the prior-weighted likelihood logL(~ϑ). At the topof each column is the marginalized one-dimensional histogram for a particular model parameter. In this case,each parameter pi is the mean of a Gaussian in the range (0, 1). The median and 90% credible interval aresuperimposed on the marginalized histograms. Left : Results obtained from the emcee pt sampler. Right : Resultsobtained from the kombine sampler.

1126259468 for GW150914, 1135136340 to 1135136356for GW151226, and 1128678874 to 1128678906 forLVT151012. Detection of gravitational waves from thesearch pipeline [10, 20, 81, 21, 82] gives initial esti-mates of the mass, and hence estimates of the lengthof the signal. From results of the search, LVT151012was a longer signal with more cycles than the othertwo events, and LVT151012 had characteristics whichwere in agreement with a lower mass source thanGW150914 and GW151226. Therefore, more data isrequired for the analysis of LVT151012. The PSDused in the likelihood is constructed using the medianPSD estimation method described in Ref. [19] with8 s Hann-windowed segments ( overlapped by 4 s )taken from GPS times 1126258940 to 1126259980 forGW150914, 1135136238 to 1135137278 for GW151226,and 1128678362 to 1128679418 for LVT151012. ThePSD estimate is truncated to 4 s in the time-domainusing the method described in Ref. [19]. The datasetis sampled at 2048 Hz, and the likelihood is evaluatedbetween a low frequency cutoff of 20 Hz and 1024 Hz.

The waveforms si(f, ~ϑ) used in the likelihood aregenerated using the IMRPhenomPv2 [47, 48] model im-plemented in the LIGO Algorithm Library (LAL) [46].The parameters inferred for these three events are~ϑ = {α, δ, ψ,m1,m2, dL, ι, tc, a1, a2, θ

a1 , θ

a2 , θ

p1 , θ

p2}, and

we analytically marginalize over the fiducial phase φ.These parameters form the complete set of parame-ters to construct a waveform from a binary black hole

merger, and are the same parameters that were inferredfrom the parameter estimation analyses in Ref. [1].Since faster convergence of m1 and m2 can be obtainedwith mass parameterizations of the waveform in Mand q we perform the coordinate transformation from(m1,m2) to (M, q) before evaluating the priors.

We assume uniform prior distributions for thebinary component masses m1,2 ∈ [10, 80] M� forGW150914, m1,2 ∈ [5, 80] M� for LVT151012, andm1,2 corresponding to chirp mass M ∈ [9.5, 10.5] M�and mass ratio q ∈ [1, 18] for GW151226. We useuniform priors on the spin magnitudes a1,2 ∈ [0.0,0.99]. We use a uniform solid angle prior, where θa1,2 isa uniform distribution θa1,2 ∈ [0, 2π) and θp1,2 is a sine-angle distribution. For the luminosity distance, we usea uniform in volume prior with dL ∈ [10, 1000] Mpc forGW150914, dL ∈ [10, 1500] Mpc for GW151226, anddL ∈ [10, 2500] Mpc for LVT151012. We use uniformpriors for the arrival time tc ∈ [ts − 0.2 s, ts + 0.2 s]where ts is the trigger time for the particular eventobtained from the gravitational-wave search [1, 82].For the sky location parameters, we use a uniformdistribution prior for α ∈ [0, 2π) and a cosine-angledistribution prior for δ. The priors described aboveare the same as those used in Ref. [1] .

The parameter estimation analysis producesdistributions that are a sampling of the posteriorprobability density function for the variable parametersfrom the ensemble MCMC. We map these distributions

Page 12: PyCBC Inference: A Python-based parameter estimation toolkit for … · 2018-07-30 · PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence

PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence signals 12

Figure (3) Fraction of simulated signals with param-eter values within a credible interval as a function ofcredible interval. Plotted are all 14 parameters var-ied in the MCMC analyses. The diagonal line indi-cates the ideal 1-to-1 relation that is expected if thesamplers provide unbiased estimates of the parameters.We perform a Kolmogorov-Smirnov (KS) test on eachparameter to obtain two-tailed p-value indicating theconsistency between the curves and the diagonal line.Top: Results using the emcee pt sampler. Bottom:Results using the kombine sampler.

obtained directly from the analysis to obtain estimatesof other parameters of interest such as the chirp massM, mass ratio q, effective spin χeff , and the precession

spin χp [47] parameters. We use dL to relate thedetector-frame masses obtained from the MCMC tothe source-frame masses using the standard Λ-CDMcosmology [83, 15].

Recorded in Table 1, is a summary of themedian and 90% credible interval values calculatedfor GW150914, GW151226, and LVT151012 analyses.Results for msrc

1 −msrc2 , q− χeff , and dL − ι are shown

in Figs. 4, 5, and 6 for GW150914, GW151226, andLVT151012 respectively. The two-dimensional plots inthese figures show the 50% and 90% credible regions,and the one-dimensional marginal distributions showthe median and 90% credible intervals. Overlaid arethe one-dimensional marginal distributions, median,and 90% credible intervals, as well as the 50% and90% credible regions using the samples obtained fromthe LIGO Open Science Center [74] for the analysesof the three events reported in Ref. [1] using theIMRPhenomPv2 model for comparison. The resultsshow that GW150914 has the highest mass componentsamong the three events. GW150914 has more supportfor equal mass ratios whereas the GW151226 andLVT151012 posteriors support more asymmetric massratios. Overall, there is preference for smaller spins,with GW151226 having the highest spins among thethree events. While the inclination and luminositydistances are not very well constrained, with generallya support for both “face-on” (ι = 0, line of sightparallel to binary angular momentum) and “face-off”(ι = π, line of sight anti-parallel to binary angularmomentum) systems for all the three events, andGW150914 seem to have more preference for face-off systems. We also computed χp for each of thethree events and found no significant measurements ofprecession. Overall, our results are in agreement withthose presented in Ref. [1] within the statistical errorsof measurement.

5. Conclusions

In this paper we have described PyCBC Inference,a Python-based toolkit with a simplified interfacefor parameter estimation studies of compact-objectbinary mergers. We have used this toolkit to estimatethe parameters of the gravitational-wave eventsGW150914, GW151226, and LVT151012; our resultsare consistent with previously published values. Inthese analyses, we do not marginalize over calibrationuncertainty of the measured strain in our results, whichwas included in prior work, for example Refs. [1, 79,84]. We will implement this in PyCBC Inference inthe future. We have made the samples of the posteriorprobability density function from the PyCBC Inferenceanalysis of all three events available in Ref. [14] alongwith the instructions and configuration files needed

Page 13: PyCBC Inference: A Python-based parameter estimation toolkit for … · 2018-07-30 · PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence

PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence signals 13

Parameter GW150914 GW151226 LVT151012

Mdet 31.0+1.6−1.5 M� 9.7+0.06

−0.06 M� 18.1+1.0−0.7 M�

mdet1 38.8+5.4

−3.3 M� 15.0+8.4−3.4 M� 27.0+16.5

−5.6 M�mdet

2 32.9+3.2−4.9 M� 8.4+2.3

−2.6 M� 16.3+4.2−5.8 M�

Msrc 28.2+1.6−1.4 M� 8.9+0.3

−0.25 M� 15.0+1.3−1.0 M�

msrc1 35.3+5.0

−3.1 M� 13.7+7.7−3.2 M� 22.4+14.1

−4.8 M�msrc

2 29.9+3.0−4.4 M� 7.7+2.1

−2.4 M� 13.5+3.7−4.7 M�

q 1.17+0.38−0.16 1.78+2.21

−0.71 1.65+2.45−0.6

χeff −0.033+0.11−0.12 0.2+0.18

−0.07 0.0023+0.24−0.16

a1 0.29+0.57−0.26 0.53+0.37

−0.45 0.28+0.51−0.26

a2 0.33+0.56−0.30 0.51+0.43

−0.46 0.40+0.51−0.36

dL 497+126−202 Mpc 454+164

−187 Mpc 1071+458−473 Mpc

Table (1) Results from PyCBC Inference analysis of GW150914, GW151226, and LVT151012. Quoted arethe median and 90% credible interval values for the parameters of interest. Interpretations of these results aresummarized in Sec. 4.2.

to replicate these results. The source code anddocumentation for PyCBC Inference is available aspart of the PyCBC software package at http://pycbc.org.

PyCBC Inference has already been used toproduce several astrophysical results: (i) a test ofthe black hole area increase law [11], (ii) measuringthe viewing angle of GW170817 with electromagneticand gravitational-wave signals [12], and (iii) measuringthe tidal deformabilities and radii of neutron starsfrom the observation of GW170817 [13]. Theresults presented in this paper and in the studiesabove demonstrate the capability of PyCBC Inferenceto perform gravitational-wave parameter estimationanalyses. Future developments under considerationare implementation of models to marginalize overcalibration errors, generic algorithms to perform modelselection, HPD to compute credible intervals, andmethods for faster computation of the likelihood.

6. Acknowledgements

The authors would like to thank Will Farr and Ben Farrfor valuable insights into the intricacies of ensembleMCMCs. We also thank Ian Harry, ChristopherBerry, and Daniel Wysocki for helpful comments onthe manuscript. This work was supported by NSFawards PHY-1404395 (DAB, CMB), PHY-1707954(DAB, SD), and PHY-1607169 (SD). Computationswere supported by Syracuse University and NSFaward OAC-1541396. We also acknowledge the MaxPlanck Gesellschaft for support and the Atlas clustercomputing team at AEI Hannover. DAB thanksthe Ecole de Physique des Houches for hospitalityduring the completion of this manuscript. The authorsthank the LIGO Scientific Collaboration for accessto the data and acknowledge the support of theUnited States National Science Foundation (NSF)

for the construction and operation of the LIGOLaboratory and Advanced LIGO as well as the Scienceand Technology Facilities Council (STFC) of theUnited Kingdom, and the Max-Planck-Society (MPS)for support of the construction of Advanced LIGO.Additional support for Advanced LIGO was providedby the Australian Research Council. This researchhas made use of data obtained from the LIGO OpenScience Center https://losc.ligo.org.

References

[1] Abbott B P et al. (Virgo, LIGO Scientific) 2016 Phys. Rev.X6 041015 (Preprint 1606.04856)

[2] Abbott B P et al. (VIRGO, LIGO Scientific) 2017 Phys.Rev. Lett. 118 221101 (Preprint 1706.01812)

[3] Abbott B P et al. (Virgo, LIGO Scientific) 2017 Astrophys.J. 851 L35 (Preprint 1711.05578)

[4] Abbott B P et al. (Virgo, LIGO Scientific) 2017 Phys. Rev.Lett. 119 141101 (Preprint 1709.09660)

[5] Abbott B et al. (Virgo, LIGO Scientific) 2017 Phys. Rev.Lett. 119 161101 (Preprint 1710.05832)

[6] Aasi J et al. (LIGO Scientific) 2015 Class. Quant. Grav. 32074001 (Preprint 1411.4547)

[7] Acernese F et al. (VIRGO) 2015 Class. Quant. Grav. 32024001 (Preprint 1408.3978)

[8] Bayes M and Price M 1763 Philosophical Transactions(1683-1775) URL http://dx.doi.org/10.1098/rstl.

1763.0053

[9] Jaynes E T 2003 Probability Theory: The Logic ofScience (CUP) ISBN 9780521592710 URL http:

//www.cambridge.org/au/academic/subjects/physics/

theoretical-physics-and-mathematical-physics/

probability-theory-logic-science?format=HB&isbn=

9780521592710

[10] Nitz A et al. 2018 Pycbc v1.9.4 https://github.

com/gwastro/pycbc URL https://doi.org/10.5281/

zenodo.1208115

[11] Cabero M, Capano C D, Fischer-Birnholtz O, Krishnan B,Nielsen A B, Nitz A H and Biwer C M 2018 Phys. Rev.D 97(12) 124069 URL https://link.aps.org/doi/10.

1103/PhysRevD.97.124069

[12] Finstad D, De S, Brown D A, Berger E and Biwer C M2018 Astrophys. J. Lett. 860 L2 (Preprint 1804.04179)

Page 14: PyCBC Inference: A Python-based parameter estimation toolkit for … · 2018-07-30 · PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence

PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence signals 14

(a) (b)

(c)Figure (4) Posterior probability densities for the main parameters of interest from the PyCBC Inference analysisof GW150914. The parameters plotted are (a): msrc

1 −msrc2 , (b): q−χeff (c): dL−ι. The bottom-left panel in each

of (a), (b) and (c) show two-dimensional probability densities with 50% and 90% credible contour regions fromthe PyCBC Inference posteriors. The top-left and the bottom-right panels in each figure show one-dimensionalposterior distributions for the individual parameters with solid lines at the 5%, 50% and 95% percentiles. Forcomparison, we also show 50% and 90% credible regions, one-dimensional posterior probabilities with dashed linesat 5%, 50% and 95% percentiles using the posterior samples obtained from the LIGO Open Science Center [74]for the GW150914 analysis reported in Ref. [1] using the IMRPhenomPv2 model. The measurements show thatmasses for GW150914 are much better constrained as compared to the other parameters presented. Thoughthere is support for the system being both “face-on” and “face-off”, there seems to be slightly more preferencefor a ”face-off” system. The posteriors suggest a preference for lower spins. Our measurements are in agreementwith the results presented in [1] within the statistical errors of measurement of the parameters.

[13] De S, Finstad D, Lattimer J M, Brown D A, Berger E andBiwer C M 2018 (Preprint 1804.08583)

[14] https://github.com/gwastro/pycbc-inference-paper

[15] Finn L S and Chernoff D F 1993 Phys. Rev. D47 2198–2219(Preprint gr-qc/9301003)

[16] Cutler C and Flanagan E E 1994 Phys. Rev. D49 2658–2697 (Preprint gr-qc/9402014)

[17] Nicholson D and Vecchio A 1998 Phys. Rev. D57 4588–4599(Preprint gr-qc/9705064)

[18] Christensen N and Meyer R 2001 Phys. Rev. D64 022001(Preprint gr-qc/0102018)

[19] Allen B, Anderson W G, Brady P R, Brown D A andCreighton J D E 2012 Phys. Rev. D85 122006

[20] Usman S A et al. 2016 Class. Quant. Grav. 33 215004(Preprint 1508.02357)

[21] Nitz A H, Dent T, Dal Canton T, Fairhurst S and BrownD A 2017 Astrophys. J. 849 118 (Preprint 1705.01513)

[22] Kass R E and Raftery A E 1995 Journal ofthe American Statistical Association 90 773–795 (Preprint https://www.tandfonline.com/

doi/pdf/10.1080/01621459.1995.10476572) URLhttps://www.tandfonline.com/doi/abs/10.1080/

Page 15: PyCBC Inference: A Python-based parameter estimation toolkit for … · 2018-07-30 · PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence

PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence signals 15

(a) (b)

(c)Figure (5) Posterior probability densities for the main parameters of interest from the PyCBC Inference analysisof GW151226. The parameters plotted are (a): msrc

1 −msrc2 , (b): q−χeff (c): dL−ι. The bottom-left panel in each

of (a), (b) and (c) show two-dimensional probability densities with 50% and 90% credible contour regions fromthe PyCBC Inference posteriors. The top-left and the bottom-right panels in each figure show one-dimensionalposterior distributions for the individual parameters with solid lines at the 5%, 50% and 95% percentiles. Forcomparison, we also show 50% and 90% credible regions, one-dimensional posterior probabilities with dashed linesat 5%, 50% and 95% percentiles using the posterior samples obtained from the LIGO Open Science Center [74]for the GW151226 analysis reported in Ref. [1] using the IMRPhenomPv2 model. The measurements show thatGW151226 is the lowest mass and fastest spinning binary among the three O1 events presented in this work.The posteriors support asymmetric mass ratios. Inclination ι and distance dL are not well constrained, and thereis support for the system being both “face-on” and “face-off”. Our measurements are in agreement with theresults presented in [1] within the statistical errors of measurement of the parameters.

01621459.1995.10476572

[23] 1998 Statistical Science 13 163–185 ISSN 08834237 URLhttp://www.jstor.org/stable/2676756

[24] Skilling J 2006 Bayesian Anal. 1 833–859 URL https:

//doi.org/10.1214/06-BA127

[25] Thorne K S 1987 THREE HUNDRED YEARS OFGRAVITATION ed Hawking S W and Israel W

[26] Peters P C 1964 Phys. Rev. 136 B1224–B1232[27] Flanagan E E and Hinderer T 2008 Phys. Rev. D77 021502[28] Hinderer T 2008 Astrophys. J. 677 1216–1220

[29] Wahlquist H 1987 Gen. Rel. Grav. 19 1101–1113[30] Peters P C and Mathews J 1963 Phys. Rev. 131(1) 435–440

URL https://link.aps.org/doi/10.1103/PhysRev.

131.435

[31] Hannam M, Brown D A, Fairhurst S, Fryer C L and HarryI W 2013 Astrophys. J. 766 L14 (Preprint 1301.5616)

[32] Veitch J et al. 2015 Phys. Rev. D91 042003 (Preprint1409.7215)

[33] Blanchet L 2006 Living Reviews in Relativity 9 4 ISSN1433-8351 URL https://doi.org/10.12942/lrr-2006-4

Page 16: PyCBC Inference: A Python-based parameter estimation toolkit for … · 2018-07-30 · PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence

PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence signals 16

(a) (b)

(c)Figure (6) Posterior probability densities for the main parameters of interest from the PyCBC Inference analysisof LVT151012. The parameters plotted are (a): msrc

1 −msrc2 , (b): q−χeff (c): dL−ι. The bottom-left panel in each

of (a), (b) and (c) show two-dimensional probability densities with 50% and 90% credible contour regions fromthe PyCBC Inference posteriors. The top-left and the bottom-right panels in each figure show one-dimensionalposterior distributions for the individual parameters with solid lines at the 5%, 50% and 95% percentiles. Forcomparison, we also show 50% and 90% credible regions, one-dimensional posterior probabilities with dashed linesat 5%, 50% and 95% percentiles using the posterior samples obtained from the LIGO Open Science Center [74]for the LVT151012 analysis reported in Ref. [1] using the IMRPhenomPv2 model. The measurements again showthat the spins, inclination, and distance are not very well constrained and there is support for the system beingboth “face-on” and “face-off”. Our measurements are in agreement with the results presented in [1] within thestatistical errors of measurement of the parameters.

[34] Buonanno A and Damour T 1999 Phys. Rev. D59 084006(Preprint gr-qc/9811091)

[35] Buonanno A and Damour T 2000 Phys. Rev. D62 064015(Preprint gr-qc/0001013)

[36] Damour T, Jaranowski P and Schaefer G 2000 Phys. Rev.D62 084011 (Preprint gr-qc/0005034)

[37] Damour T 2001 Phys. Rev. D64 124013 (Preprint gr-qc/

0103018)[38] Ajith P et al. 2007 Class. Quant. Grav. 24 S689–S700

(Preprint 0704.3764)[39] Ajith P et al. 2011 Phys. Rev. Lett. 106 241101 (Preprint

0909.2867)[40] Santamaria L et al. 2010 Phys. Rev. D82 064016 (Preprint

1005.3306)[41] Teukolsky S A 1972 Phys. Rev. Lett. 29 1114–1118[42] Berti E, Cardoso V and Starinets A O 2009 Class. Quant.

Grav. 26 163001 (Preprint 0905.2975)[43] Cardoso V, Gualtieri L, Herdeiro C and Sperhake U 2015

Living Rev. Relativity 18 1 (Preprint 1409.0014)[44] Prrer M 2016 Phys. Rev. D93 064041 (Preprint 1512.

02248)[45] Lackey B D, Bernuzzi S, Galley C R, Meidam J and Van

Page 17: PyCBC Inference: A Python-based parameter estimation toolkit for … · 2018-07-30 · PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence

PyCBC Inference: A Python-based parameter estimation toolkit for compact binary coalescence signals 17

Den Broeck C 2017 Phys. Rev. D95 104036 (Preprint1610.04742)

[46] Mercer R A et al. 2017 LIGO Algorithm Libraryhttps://git.ligo.org/lscsoft/lalsuite

[47] Schmidt P, Ohme F and Hannam M 2015 Phys. Rev. D91024043 (Preprint 1408.1810)

[48] Hannam M, Schmidt P, Boh A, Haegel L, Husa S, OhmeF, Pratten G and Prrer M 2014 Phys. Rev. Lett. 113151101 (Preprint 1308.3271)

[49] Wainstein L A and Zubakov V D 1962 Extraction of signalsfrom noise (Englewood Cliffs, NJ: Prentice-Hall)

[50] Metropolis N, Rosenbluth A W, Rosenbluth M N, TellerA H and Teller E 1953 J. Chem. Phys. 21 1087–1092

[51] Geman S and Geman D 1984 IEEE Transactions onPattern Analysis and Machine Intelligence PAMI-6721–741 ISSN 0162-8828

[52] Gilks W R, Richardson S and Spiegelhalter D J MarkovChain Monte Carlo in Practice, Chapman and Hall,London, 1996

[53] Gelman A, Robert C, Chopin N and Rousseau J 1995Bayesian data analysis

[54] Christensen N, Libson A and Meyer R 2004 Class. Quant.Grav. 21 317–330

[55] Rover C, Meyer R and Christensen N 2006 Class. Quant.Grav. 23 4895–4906 (Preprint gr-qc/0602067)

[56] Rover C, Meyer R and Christensen N 2007 Phys. Rev. D75062004 (Preprint gr-qc/0609131)

[57] Foreman-Mackey D, Hogg D W, Lang D and Goodman J2013 Publ. Astron. Soc. Pac. 125 306 (Preprint 1202.

3665)[58] Foreman-Mackey D et al. 2018 dfm/emcee: emcee v3.0rc1

URL https://doi.org/10.5281/zenodo.1297477

[59] Vousden W et al. 2015 URL https://github.com/

willvousden/ptemcee

[60] Vousden W D, Farr W M and Mandel I 2016 Monthly No-tices of the Royal Astronomical Society 455 1919–1937(Preprint /oup/backfile/content_public/journal/

mnras/455/2/10.1093_mnras_stv2422/2/stv2422.pdf)URL http://dx.doi.org/10.1093/mnras/stv2422

[61] Farr B and Farr W M 2015 In prep[62] Farr B et al. 2018 URL https://github.com/bfarr/

kombine

[63] Hastings W K 1970 Biometrika 57 97–109[64] Christensen N, Dupuis R J, Woan G and Meyer R 2004

Phys. Rev. D70 022001 (Preprint gr-qc/0402038)[65] Madras N and Sokal A D 1988 J. Statist. Phys. 50 109–186[66] Chen M H, Shao Q M and Ibrahim J G 2000 Computing

Bayesian Credible and HPD Intervals (New York,NY: Springer New York) ISBN 978-1-4612-1276-8 URLhttps://doi.org/10.1007/978-1-4612-1276-8_7

[67] Tannenbaum T, Wright D, Miller K and Livny M 2001Condor – a distributed job scheduler Beowulf ClusterComputing with Linux ed Sterling T (MIT Press)

[68] Thain D, Tannenbaum T and Livny M 2005 Concurrency- Practice and Experience 17 323–356

[69] Dalcin L D, Paz R R, Kler P A and Cosimo A 2011 Advancesin Water Resources 34 1124–1139

[70] Dalcın, Lisandro, Paz R, Storti M and D’Elia J 2008Journal of parallel and distributed computing 68 655–662

[71] Dalcın L, Paz R and Storti M 2005 J. Parallel Distrib.Comput. 65 1108–1115 ISSN 0743-7315 URL http://

dx.doi.org/10.1016/j.jpdc.2005.03.010

[72] Frigo M and Johnson S G 2005 Proceedings of the IEEE93 216–231 special issue on “Program Generation,Optimization, and Platform Adaptation”

[73] Collette A et al. 2018 h5py/h5py 2.8.0 URL https://doi.

org/10.5281/zenodo.1246321

[74] Vallisneri M, Kanner J, Williams R, Weinstein A andStephens B 2015 J. Phys. Conf. Ser. 610 012021

(Preprint 1410.4839)[75] Farr B et al. 2016 Astrophys. J. 825 116 (Preprint 1508.

05336)[76] https://docs.scipy.org/doc/scipy/reference/

generated/scipy.stats.gaussian_kde.html

[77] Hunter J D 2007 Computing In Science & Engineering 990–95

[78] Jones E, Oliphant T, Peterson P et al. 2001– SciPy:Open source scientific tools for Python [Online; accessed¡today¿] URL http://www.scipy.org/

[79] Abbott B P et al. (Virgo, LIGO Scientific) 2016 Phys. Rev.Lett. 116 241102 (Preprint 1602.03840)

[80] Abbott B P et al. (Virgo, LIGO Scientific) 2016 Phys. Rev.Lett. 116 061102 (Preprint 1602.03837)

[81] Dal Canton T et al. 2014 Phys. Rev. D90 082004 (Preprint1405.6731)

[82] Abbott B P et al. (Virgo, LIGO Scientific) 2016 Phys. Rev.D93 122003 (Preprint 1602.03839)

[83] Schutz B F 1986 Nature 323 310–311[84] Abbott T D et al. (Virgo, LIGO Scientific) 2016 Phys. Rev.

X6 041014 (Preprint 1606.01210)