arXiv:0901.3269v2 [astro-ph.CO] 29 Jul 2009

Astronomy & Astrophysics manuscript no. 11697 © ESO 2018
May 30, 2018

The non-Gaussianity of the cosmic shear likelihood or

How odd is the Chandra Deep Field South?

J. Hartlap1, T. Schrabback2,1, P. Simon3, and P. Schneider1

1 Argelander-Institut für Astronomie, Universität Bonn, Auf dem Hügel 71, D-53121 Bonn, Germany
2 Leiden Observatory, Universiteit Leiden, Niels Bohrweg 2, NL-2333 CA Leiden, The Netherlands
3 The Scottish Universities Physics Alliance (SUPA), Institute for Astronomy, School of Physics, University of Edinburgh, Royal Observatory, Blackford Hill, Edinburgh EH9 3HJ, UK

Received 22 January 2009 / Accepted 19 June 2009

ABSTRACT

Aims. We study the validity of the approximation of a Gaussian cosmic shear likelihood. We estimate the true likelihood for a fiducial cosmological model from a large set of ray-tracing simulations and investigate the impact of non-Gaussianity on cosmological parameter estimation. We investigate how odd the recently reported very low value of $\sigma_8$ derived from cosmic shear in the Chandra Deep Field South (CDFS) really is, taking into account the non-Gaussianity of the likelihood as well as possible biases coming from the way the CDFS was selected.

Methods. A brute force approach to estimating the likelihood from simulations must fail because of the high dimensionality of the problem. We therefore use independent component analysis to transform the cosmic shear correlation functions to a new basis, in which the likelihood approximately factorises into a product of one-dimensional distributions.

Results. We find that the cosmic shear likelihood is significantly non-Gaussian. This leads to both a shift of the maximum of the posterior distribution and a significantly smaller credible region compared to the Gaussian case. We re-analyse the CDFS cosmic shear data using the non-Gaussian likelihood in combination with conservative galaxy selection criteria that minimise calibration uncertainties. Assuming that the CDFS is a random pointing, we find $\sigma_8 = 0.68^{+0.09}_{-0.16}$ for fixed $\Omega_{\rm m} = 0.25$. In a WMAP5-like cosmology, a value equal to or lower than this would be expected in ≈ 5% of cases. Taking biases into account arising from the way the CDFS was selected, which we model as being dependent on the number of haloes in the CDFS, we obtain $\sigma_8 = 0.71^{+0.10}_{-0.15}$. Combining the CDFS data with the parameter constraints from WMAP5 yields $\Omega_{\rm m} = 0.26^{+0.03}_{-0.02}$ and $\sigma_8 = 0.79^{+0.04}_{-0.03}$ for a flat universe.

1. Introduction

Weak gravitational lensing by the large-scale structure in the Universe, or cosmic shear, is becoming an increasingly important tool for constraining cosmological parameters. It is largely complementary to other cosmological probes like the cosmic microwave background or the clustering of galaxies, and particularly sensitive to the matter density $\Omega_{\rm m}$ and the normalisation of the matter power spectrum $\sigma_8$. Important constraints have already been obtained by Benjamin et al. (2007), who compiled a set of five weak lensing surveys, and from the CFHT Legacy Survey (Hoekstra et al. 2006; Semboloni et al. 2006; Fu et al. 2008). In subsequent years, a new generation of surveys like KIDS or Pan-STARRS (Kaiser & Pan-STARRS Collaboration 2005) will allow cosmic shear to be measured with statistical uncertainties that are much smaller than the systematic errors both on the observational and the theoretical sides. Strong efforts are now being made to find sources of systematics in the process of shape measurement and shear estimation (e.g. Massey et al. 2007a). In addition, new methods of shape measurement are being explored, such as the shapelet formalism (Refregier & Bacon 2003; Kuijken 2006) or the methods proposed in Bernstein & Jarvis (2002) and Miller et al. (2007).

It is equally important to have accurate theoretical model predictions that can be fit to the expected high-quality measurements. Currently, these models are all based on fitting formulae for the three-dimensional matter power spectrum derived from N-body simulations as given by Peacock & Dodds (1996) and more recently by Smith et al. (2003). However, these are only accurate at best to the percent level on the scales relevant to this and similar works when compared to ray-tracing simulations based on state-of-the-art N-body simulations (Hilbert et al. 2009), such as the Millennium Run (Springel et al. 2005). Therefore, there is a strong need for a large ray-tracing effort to obtain accurate semi-numerical predictions for a range of cosmological parameters.

While a tremendous effort is currently being directed to the solution of these problems, the actual process of parameter estimation has so far received relatively little attention. Obviously, the statistical data analysis has to achieve the same accuracy as the data acquisition if the aforementioned efforts are not to be wasted.

The standard procedure for converting measurements of second-order cosmic shear statistics into constraints on cosmological parameters is to write down a likelihood function and to determine the location of its maximum for obtaining estimates of the cosmological parameters of interest. To make this feasible, several approximations are commonly made. Although the shear field is non-Gaussian due to nonlinear structure growth, no analytical description of the likelihood is available, and it is therefore most often approximated by a multivariate Gaussian distribution. The covariance matrix for the Gaussian likelihood then remains to be determined, which is an intricate issue by itself.

In most previous studies, the dependence of the covariance matrix on cosmological parameters has been ignored when writing down the likelihood function. Instead, it was kept fixed to some fiducial cosmological model. The dependence of the covariance matrix on the cosmological parameters has been investigated in Eifler et al. (2008) for the case of Gaussian shear fields. The authors find that this has a significant effect on the constraints on cosmological parameters (reducing the size of the credible regions) and will be particularly important for future large-area surveys.

There are several approaches to determine the covariance for the fiducial set of parameters: Hoekstra et al. (2006) use the covariance matrix derived for a Gaussian shear field. Although this is rather easy to compute (Joachimi et al. 2008), the errors are strongly underestimated, particularly on small scales. Another option is to estimate the covariance from the data itself (e.g. Massey et al. 2007b). This will become sensible and feasible mostly for the upcoming large surveys, which can be safely split into smaller subfields without severely underestimating cosmic variance. A third possibility, which currently seems to be the most accurate, is to measure the covariance matrix from a large sample of ray-tracing simulations. Semboloni et al. (2007) have provided a fitting formula which allows one to transform covariances computed for Gaussian shear fields into covariances including non-Gaussianity. Another promising way, which would also easily allow one to take into account the dependence on cosmological parameters, is the semi-analytical computation using the halo model (Scoccimarro et al. 1999; Cooray & Hu 2001; Takada & Jain 2009).

However, all these works are based on the assumption that the likelihood is well approximated by a Gaussian. In this paper, we study the impact of this assumption on the shape of the posterior probability distribution of the matter density parameter $\Omega_{\rm m}$ and the power spectrum normalisation $\sigma_8$. Furthermore, we compute Fisher matrix constraints for the four-dimensional parameter space spanned by $\Omega_{\rm m}$, $\sigma_8$, $h_{100}$ and $\Omega_\Lambda$. We propose a method to numerically compute the likelihood function from a large set of ray-tracing simulations based on the technique of independent component analysis (ICA, e.g. Jutten & Hérault 1991; Comon et al. 1991). ICA is a technique for the separation of independent source signals underlying a set of observed random variables, a statistical method related to factor analysis and principal component analysis (PCA). An approach similar to ours, called projection pursuit density estimation, which we use to verify our results, was proposed by Friedman et al. (1984).

In their cosmic shear analysis of the combined HST GEMS and GOODS data of the Chandra Deep Field South, Schrabback et al. (2007) (hereafter S07) have found a very low value of $\sigma_8(\Omega_{\rm m} = 0.3) = 0.52^{+0.11}_{-0.15}$. In the second part of this paper, we present a re-analysis of the cosmic shear data of S07. Using our estimate of the non-Gaussian likelihood, we investigate whether cosmic variance alone is responsible for producing the low $\sigma_8$-estimate or whether the criteria applied by Giacconi et al. (2001) to select a field suitable for deep X-ray observations have a share in this.

The outline of our paper is as follows: in Sec. 2, we describe our sample of ray-tracing simulations which we use for the likelihood estimation. In Sec. 3, we briefly review the lensing quantities relevant for this paper and Bayesian parameter estimation. We introduce our method of estimating the “true” likelihood and illustrate the impact of non-Gaussianity on parameter estimation using the example of a CDFS-like survey. In Sec. 4, we present the improved cosmic shear analysis of the CDFS and investigate possible reasons for the low power spectrum normalisation found in S07.

2. Ray-Tracing simulations

We have performed a set of 10 N-body simulations using the publicly available code GADGET-2 (Springel 2005), all of which are realisations of the same WMAP-5-like cosmology ($\Omega_{\rm m} = 0.25$, $\Omega_\Lambda = 0.75$, $\Omega_{\rm b} = 0.04$, $n_{\rm s} = 1.0$, $\sigma_8 = 0.78$, $h_{100} = 0.73$). The simulation boxes are $L_{\rm box} = 150\,h_{100}^{-1}\,{\rm Mpc}$ on a side, populated by $N_{\rm p} = 256^3$ dark matter particles with masses of $m_{\rm p} = 1.2\times10^{10}\,h_{100}^{-1}\,M_\odot$. We have started the simulations at $z = 50$ and obtained snapshots from $z = 0$ to $z = 4.5$ in intervals of $\Delta z$ corresponding to the box size, so that a suitable snapshot is available for each lens plane.

In the following, we only give a brief description of our ray-tracing algorithm and refer the reader to, for example, Jain et al. (2000) or Hilbert et al. (2009) for a more detailed introduction.

The ray-tracing is performed by dividing the dark matter distribution into redshift slices and projecting each slice onto a lens plane. Starting at the observer, light rays are shot through this array of lens planes. We assume that deflections only take place at the planes themselves, and that the rays propagate on straight lines in the space between two planes. In our case, each redshift slice corresponds to one output box of the N-body simulation and was projected as a whole onto a lens plane, preserving the periodic boundary conditions of the simulation box. To avoid repetition of structure along the line of sight, the planes were randomly shifted and rotated. The light rays are shot from the observer through the set of lens planes, forming a regular grid on the first plane. We then use FFT methods to compute the lensing potential on each lens plane, from which we obtain the deflection angle and its partial derivatives on a grid. The ray position and the Jacobian of the lens mapping for each ray are obtained by recursion: given the ray position on the current lens plane, its propagation direction (known from the position on the previous plane), and the deflection angle on the current plane interpolated onto the ray, we immediately obtain the ray position on the next plane. Differentiation of this recursion formula with respect to the image plane coordinates yields a similar relation for the Jacobian of the lens mapping, which takes into account the previously computed tidal deflection field (for a detailed description of the formalism used, see Hilbert et al. 2009). The recursion is performed until we reach the redshift cut-off at $z = 4.5$.

We obtain the final Jacobian for a given source redshift distribution by performing a weighted average over the Jacobians for the light paths to each lens plane. Since our aim is to create mock catalogues comparable to those of the CDFS field, we use the redshift distribution found for our revised galaxy catalogues (Smail et al. 1995, see Sec. 4.1):

$$ p(z_{\rm s}) = A \left(\frac{z_{\rm s}}{z_0}\right)^{\alpha} \exp\left[-\left(\frac{z_{\rm s}}{z_0}\right)^{\beta}\right] , $$

where $z_0 = 1.55$, $\alpha = 0.59$, $\beta = 1.35$ and $A$ is a normalisation constant. This corresponds to a mean source redshift of $\bar{z}_{\rm s} = 1.54$. We then create the mock source catalogue by randomly sampling the resulting shear maps with $N_{\rm s} = n_{\rm s}\Omega^2$ galaxies, where $n_{\rm s} = 68\,{\rm arcmin}^{-2}$ is the number density of sources and $\Omega = 0.5^\circ$ is the side length of the simulated field. In total, we have produced 9600 quasi-independent realisations of the CDFS field, based on different random shifts and rotations of the lens planes and the various N-body simulations.
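As an illustration of how such a mock catalogue could be populated, the following minimal sketch draws source redshifts from the distribution above by inverting a tabulated CDF and computes the number of galaxies per field. It is not the authors' code: the function names, the grid resolution, the truncation of the distribution at the ray-tracing cut-off, and the reading of the side length as 30 arcmin are all assumptions made for this example.

```python
import numpy as np

# Parameters quoted in the text for p(z_s) = A (z_s/z0)^alpha exp[-(z_s/z0)^beta]
Z0, ALPHA, BETA = 1.55, 0.59, 1.35
Z_MAX = 4.5          # assumption: truncate at the ray-tracing redshift cut-off
N_DENSITY = 68.0     # sources per arcmin^2
SIDE_ARCMIN = 30.0   # assumption: side length of 0.5 degrees = 30 arcmin
N_GALAXIES = int(round(N_DENSITY * SIDE_ARCMIN**2))  # N_s = n_s * Omega^2


def unnormalised_pz(z):
    """Redshift distribution without the normalisation constant A."""
    x = z / Z0
    return x**ALPHA * np.exp(-(x**BETA))


def _grid(n_grid=4096):
    z = np.linspace(0.0, Z_MAX, n_grid)
    return z, unnormalised_pz(z)


def mean_redshift():
    """Mean of the truncated distribution via the trapezoidal rule."""
    z, pdf = _grid()
    dz = np.diff(z)
    norm = np.sum(0.5 * (pdf[1:] + pdf[:-1]) * dz)
    first = np.sum(0.5 * ((z * pdf)[1:] + (z * pdf)[:-1]) * dz)
    return float(first / norm)


def sample_redshifts(n, rng):
    """Inverse-CDF sampling on a tabulated grid."""
    z, pdf = _grid()
    cdf = np.concatenate([[0.0], np.cumsum(0.5 * (pdf[1:] + pdf[:-1]) * np.diff(z))])
    cdf /= cdf[-1]
    return np.interp(rng.uniform(size=n), cdf, z)
```

With the quoted number density and a 30 arcmin side length, `N_GALAXIES` evaluates to 61200 sources per realisation.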

3. The non-Gaussianity of the cosmic shear likelihood

3.1. Cosmic shear

Perhaps the most common way to extract the lensing information from the measured shapes of distant galaxies is to estimate the two-point correlation functions of the distortion field. One defines two shear correlation functions (for more details, see e.g. Schneider 2006)

$$ \xi_\pm(\theta) = \langle \epsilon_{\rm t}(\boldsymbol{\vartheta})\,\epsilon_{\rm t}(\boldsymbol{\theta} + \boldsymbol{\vartheta}) \rangle \pm \langle \epsilon_\times(\boldsymbol{\vartheta})\,\epsilon_\times(\boldsymbol{\theta} + \boldsymbol{\vartheta}) \rangle , \tag{1} $$

where $\epsilon_{\rm t,\times}$ are the tangential and cross components of the measured ellipticity relative to the line connecting the two galaxies, and $\theta$ is the angular separation. An unbiased estimator for the shear correlation functions for a random distribution of galaxies is given in Schneider et al. (2002):

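$$ \hat{\xi}_\pm(\theta) = \frac{1}{N_{\rm p}(\theta)} \sum_{i,j} \left( \epsilon_{i{\rm t}}\,\epsilon_{j{\rm t}} \pm \epsilon_{i\times}\,\epsilon_{j\times} \right) \Delta_\theta(|\boldsymbol{\vartheta}_i - \boldsymbol{\vartheta}_j|) . \tag{2} $$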

Here, $i$ and $j$ label galaxies at angular positions $\boldsymbol{\vartheta}_i$ and $\boldsymbol{\vartheta}_j$, respectively. The function $\Delta_\theta(\phi)$ is 1 if $\phi$ falls into the angular separation bin centred on $\theta$, and is zero otherwise. Finally, $N_{\rm p}$ is the number of pairs of galaxies in the bin under consideration.
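The pair estimator of Eq. (2) can be sketched as a brute-force sum over all galaxy pairs. The example below is only an illustration under stated assumptions: ellipticities are stored as complex numbers $\epsilon_1 + {\rm i}\epsilon_2$, positions are flat-sky Cartesian coordinates, and the function name and interface are hypothetical. It uses the identities $\epsilon_{i{\rm t}}\epsilon_{j{\rm t}} + \epsilon_{i\times}\epsilon_{j\times} = {\rm Re}(\epsilon_i \epsilon_j^*)$ and $\epsilon_{i{\rm t}}\epsilon_{j{\rm t}} - \epsilon_{i\times}\epsilon_{j\times} = {\rm Re}(\epsilon_i \epsilon_j {\rm e}^{-4{\rm i}\varphi})$, where $\varphi$ is the polar angle of the line connecting the pair. Because it stores all $n(n-1)/2$ pairs at once, it is only practical for small catalogues.

```python
import numpy as np


def shear_correlation(pos, eps, bin_edges):
    """Brute-force estimate of xi_+ and xi_- in angular separation bins.

    pos       : (n, 2) array of flat-sky positions
    eps       : (n,) complex array of ellipticities, e1 + 1j * e2
    bin_edges : increasing array of bin boundaries (same units as pos)
    """
    i, j = np.triu_indices(len(pos), k=1)        # each pair counted once
    d = pos[j] - pos[i]
    sep = np.hypot(d[:, 0], d[:, 1])
    phi = np.arctan2(d[:, 1], d[:, 0])           # angle of the connecting line

    plus = np.real(eps[i] * np.conj(eps[j]))               # tt + xx
    minus = np.real(eps[i] * eps[j] * np.exp(-4j * phi))   # tt - xx

    n_bins = len(bin_edges) - 1
    which = np.digitize(sep, bin_edges) - 1
    ok = (which >= 0) & (which < n_bins)         # drop pairs outside the bins

    n_pairs = np.bincount(which[ok], minlength=n_bins)
    sum_p = np.bincount(which[ok], weights=plus[ok], minlength=n_bins)
    sum_m = np.bincount(which[ok], weights=minus[ok], minlength=n_bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        return sum_p / n_pairs, sum_m / n_pairs, n_pairs
```

Empty bins yield NaN rather than raising an error, so a caller can mask them explicitly.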

3.2. Parameter estimation

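Let us assume that we have measured the shear correlation functions $\xi_\pm(\theta_i)$ on $p/2$ angular separation bins $\theta_i$ and now wish to infer some parameters $\boldsymbol{\pi}$ of our model $\boldsymbol{m}(\boldsymbol{\pi})$ for $\xi_\pm(\theta_i)$. For what follows, we define the joint data vector $\boldsymbol{\xi} = (\boldsymbol{\xi}_+, \boldsymbol{\xi}_-)^{\rm t}$, which in total is supposed to have $p$ entries.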

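Adopting a Bayesian point of view, our aim is to compute the posterior likelihood, i.e. the probability distribution of a parameter vector $\boldsymbol{\pi}$ given the information provided by the data $\boldsymbol{\xi}$:

$$ p(\boldsymbol{\pi}|\boldsymbol{\xi}) = \frac{p(\boldsymbol{\pi})}{p(\boldsymbol{\xi})}\, p(\boldsymbol{\xi}|\boldsymbol{\pi}) . \tag{3} $$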

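Here, $p(\boldsymbol{\pi})$ is the prior distribution of the parameters, which incorporates our knowledge about $\boldsymbol{\pi}$ prior to looking at the data; this knowledge can originate from previous measurements or theoretical arguments. The evidence $p(\boldsymbol{\xi})$ in this context simply serves as a normalisation factor. Hitherto, it has been assumed that the likelihood $p(\boldsymbol{\xi}|\boldsymbol{\pi})$ is a Gaussian distribution:

$$ p(\boldsymbol{\xi}|\boldsymbol{\pi}) = \frac{1}{(2\pi)^{p/2} \left[\det \mathsf{C}(\boldsymbol{\pi})\right]^{1/2}} \exp\left\{ -\frac{1}{2} \left[\boldsymbol{\xi} - \boldsymbol{m}(\boldsymbol{\pi})\right]^{\rm t} \mathsf{C}^{-1}(\boldsymbol{\pi}) \left[\boldsymbol{\xi} - \boldsymbol{m}(\boldsymbol{\pi})\right] \right\} , \tag{4} $$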

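where $\mathsf{C}(\boldsymbol{\pi})$ is the covariance matrix of $\boldsymbol{\xi}$ as predicted by the underlying model. Usually, however, the dependence of the covariance matrix upon cosmological parameters is not taken into account. Rather, the covariance that is computed for a fixed fiducial set of parameters $\boldsymbol{\pi}_0$ is used in Eq. (4). Under this approximation, the likelihood is a function of the difference $\boldsymbol{\Delta}(\boldsymbol{\pi}) = \boldsymbol{\xi} - \boldsymbol{m}(\boldsymbol{\pi})$ only:

$$ p(\boldsymbol{\xi}|\boldsymbol{\pi}) = L_{\boldsymbol{\pi}_0}\left[\boldsymbol{\Delta}(\boldsymbol{\pi})\right] . \tag{5} $$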
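For reference, the Gaussian approximation of Eq. (4) with a fixed fiducial covariance can be evaluated as in the following sketch. The function name and argument layout are assumptions for illustration; a Cholesky factorisation is used instead of an explicit matrix inverse purely for numerical stability.

```python
import numpy as np


def gaussian_log_likelihood(xi, model, cov):
    """Logarithm of Eq. (4) for data xi, model m(pi) and a fixed covariance."""
    delta = np.asarray(xi, dtype=float) - np.asarray(model, dtype=float)
    chol = np.linalg.cholesky(cov)               # cov = L L^t
    y = np.linalg.solve(chol, delta)             # y = L^{-1} delta, chi^2 = y.y
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    p = delta.size
    return -0.5 * (y @ y + log_det + p * np.log(2.0 * np.pi))
```

Because the covariance is held at its fiducial value, the determinant term is a constant and only the quadratic form in $\boldsymbol{\Delta}(\boldsymbol{\pi})$ varies with the parameters, in line with Eq. (5).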

3.3. Estimating the likelihood

The choice of the functional form of the likelihood as given by Eq. (4) is only approximate. Since the underlying shear field in the correlation function measurement becomes non-Gaussian, in particular on small scales, due to nonlinear structure formation, there is no good reason to expect the distribution of the shear correlation function to be Gaussian. Our aim therefore is to use a very large sample of ray-tracing simulations to estimate the likelihood and explore the effects of the deviations from a Gaussian shape on cosmological parameter constraints.

In this work, we have to retain the approximation that the functional form of the likelihood does not depend on cosmology in order to keep computation time manageable. Our ray-tracing simulations were all done for identical cosmological parameters, which define our fiducial parameter vector $\boldsymbol{\pi}_0$. Thus, as in Eq. (5), the likelihood depends on cosmology only through the difference $\boldsymbol{\Delta}(\boldsymbol{\pi}) = \boldsymbol{\xi} - \boldsymbol{m}(\boldsymbol{\pi})$.

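Since $L_{\boldsymbol{\pi}_0}$ is the probability of obtaining the data $\boldsymbol{\xi}$ given the parameters $\boldsymbol{\pi}_0$, we in principle have to estimate the $p$-dimensional distribution of $\boldsymbol{\xi}$ from our sample of $N$ ray-tracing simulations. However, due to the high dimensionality of the problem, a brute force approach to estimate the full joint distribution is hopeless. The problem would simplify considerably if we could find a transformation

$$ \boldsymbol{s} = f\left[\boldsymbol{\Delta}(\boldsymbol{\pi})\right] , \tag{6} $$

such that

$$ p_{\boldsymbol{s}}(\boldsymbol{s}|\boldsymbol{\pi}_0) = \prod_{i=1}^{n_{\rm IC}} p_{s_i}(s_i|\boldsymbol{\pi}_0) . \tag{7} $$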

Here, $f$ is in general a mapping from $\mathbb{R}^p$ to $\mathbb{R}^{n_{\rm IC}}$ ($n_{\rm IC} \le p$) and $\boldsymbol{s} \in \mathbb{R}^{n_{\rm IC}}$ is our new data vector. This would reduce the problem to estimating $n_{\rm IC}$ one-dimensional probability distributions instead of a single $p$-dimensional one. Eq. (7) is equivalent to the statement that we are looking for a new set of basis vectors of $\mathbb{R}^{n_{\rm IC}}$ in which the components $s_i$ of the shear correlation function are statistically independent. It is virtually impossible to find the (in general nonlinear) mapping $f$. However, it is possible to make progress if we make the ansatz that $f$ is linear:

$$ \boldsymbol{s} = \mathsf{A}\,\boldsymbol{\Delta}(\boldsymbol{\pi}) , \tag{8} $$

where $\mathsf{A} \in \mathbb{R}^{n_{\rm IC} \times p}$ is the transformation or “un-mixing” matrix.

Our likelihood estimation procedure is as follows: the first step is to remove first-order correlations from the data vector by performing a PCA (e.g. Press et al. 1992). This yields a basis in which the components of $\boldsymbol{\xi}$ are uncorrelated. If we knew that the distribution of $\boldsymbol{\xi}$ were Gaussian, this would be sufficient, because in this case uncorrelatedness is equivalent to statistical independence. However, for a general distribution, uncorrelatedness is only a necessary condition for independence. Since we suspect that the likelihood is non-Gaussian, a second change of basis, determined by the ICA technique (described in detail in the next section), is carried out, which then results in the desired independence. We then use a kernel density method (see e.g. Hastie et al. 2001; Venables & Ripley 2002, and references therein) to estimate and tabulate the one-dimensional distributions $p_{s_i}(s_i|\boldsymbol{\pi}_0)$ in this new basis. The density estimate is constructed by smoothing the empirical distribution function of the observations of $s_i$,

$$ p^{\rm emp}_{s_i}(x) = \frac{1}{N} \sum_{j=1}^{N} \delta_{\rm D}\left(x - s_i^{(j)}\right) , \tag{9} $$

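where $s_i^{(j)}$ is the $j$-th of $N$ observations of $s_i$ and $\delta_{\rm D}$ is the Dirac delta-function, with a smooth kernel $K$. The estimate $\hat{p}_{s_i}$ of the desired density $p_{s_i}$ then is given by

$$ \hat{p}_{s_i}(x) = \frac{1}{N b} \sum_{j=1}^{N} K\left(\frac{x - s_i^{(j)}}{b}\right) , \tag{10} $$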

where $s_i^{(j)}$ is again the $j$-th of $N$ observations of $s_i$ and $b$ is the bandwidth. For the kernel $K$ we use a Gaussian distribution. It has been shown that the shape of the kernel $K$ is of secondary importance for the quality of the density estimate; much more important is the choice of the bandwidth $b$. If $b$ is too small, $\hat{p}_{s_i}$ is essentially unbiased, but tends to have a high variance because the noise is not properly smoothed out. On the other hand, choosing a bandwidth that is too large results in a smooth estimate with low variance, but a higher bias, because real small scale features of the probability density are smeared out. Our choice of the bandwidth is based on the “rule of thumb” (e.g. Silverman 1986; Scott 1992; Davison 2003): $b = 0.9\,\min(\sigma, R/1.34)\,N^{-1/5}$. Here, $\sigma$ is the sample standard deviation and $R$ is the inter-quartile range of the sample.
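The kernel density estimate of Eq. (10) with the rule-of-thumb bandwidth can be sketched in a few lines. This is an illustrative NumPy version with hypothetical function names, not the R routines used for the paper; the interpolation convention used for the quartiles may differ slightly between packages.

```python
import numpy as np


def rule_of_thumb_bandwidth(samples):
    """b = 0.9 * min(sigma, R / 1.34) * N^(-1/5)."""
    s = np.asarray(samples, dtype=float)
    sigma = s.std(ddof=1)                        # sample standard deviation
    q75, q25 = np.percentile(s, [75, 25])        # R = inter-quartile range
    return 0.9 * min(sigma, (q75 - q25) / 1.34) * s.size ** (-0.2)


def gaussian_kde_1d(samples, x, bandwidth=None):
    """Evaluate Eq. (10) with a Gaussian kernel K at the points x."""
    s = np.asarray(samples, dtype=float)
    b = rule_of_thumb_bandwidth(s) if bandwidth is None else bandwidth
    u = (np.asarray(x, dtype=float)[:, None] - s[None, :]) / b
    kernel = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    return kernel.sum(axis=1) / (s.size * b)
```

In practice the estimate would be tabulated once per independent component on a fixed grid and interpolated afterwards, since evaluating the sum over all $N$ observations for every likelihood call is wasteful.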

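Constraints on cosmological parameters can now be derived as follows: we transform our set of model vectors and the measured correlation function to the new ICA basis:

$$ \tilde{\boldsymbol{m}}(\boldsymbol{\pi}) = \mathsf{A}\,\boldsymbol{m}(\boldsymbol{\pi}) , \tag{11} $$

$$ \tilde{\boldsymbol{\xi}} = \mathsf{A}\,\boldsymbol{\xi} , \tag{12} $$

so that $\boldsymbol{s} = \tilde{\boldsymbol{\xi}} - \tilde{\boldsymbol{m}}(\boldsymbol{\pi})$. The ICA posterior distribution is then given by

$$ p(\boldsymbol{\pi}|\boldsymbol{\xi}) \propto p(\boldsymbol{\pi}) \prod_{i=1}^{n_{\rm IC}} p_{s_i}\left(\tilde{\xi}_i - \tilde{m}_i(\boldsymbol{\pi})\,\middle|\,\boldsymbol{\pi}_0\right) . \tag{13} $$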
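Given an un-mixing matrix and tabulated one-dimensional densities, Eq. (13) reduces to a sum of interpolated log-densities. The sketch below assumes that each $\log p_{s_i}$ has already been tabulated on a grid; the function name, the argument layout and the use of linear interpolation are choices made for this example only.

```python
import numpy as np


def ica_log_posterior(xi, model, unmixing, grids, log_pdfs, log_prior=0.0):
    """Log of Eq. (13), up to an additive constant.

    xi, model : data vector and model prediction m(pi), both of length p
    unmixing  : un-mixing matrix A of shape (n_IC, p)
    grids     : list of n_IC grids on which the densities were tabulated
    log_pdfs  : list of n_IC arrays with log p_{s_i} on those grids
    """
    delta = np.asarray(xi, dtype=float) - np.asarray(model, dtype=float)
    s = unmixing @ delta                         # s = A [xi - m(pi)]
    total = float(log_prior)
    for s_i, grid, log_pdf in zip(s, grids, log_pdfs):
        # np.interp clamps to the edge values outside the tabulated range,
        # so the grids should comfortably cover the region of interest.
        total += np.interp(s_i, grid, log_pdf)
    return total
```

Working with logarithms avoids underflow when many small one-dimensional probabilities are multiplied.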

3.4. Independent Component Analysis

We now briefly outline the ICA method (Hyvärinen et al. 2001; Hyvärinen & Oja 2000), which we use to find the new basis in $\mathbb{R}^{n_{\rm IC}}$ in which the components of $\boldsymbol{\Delta}$ are (approximately) statistically independent. ICA is best introduced by assuming that the data at hand were generated by the following linear model:

$$ \boldsymbol{\Delta} = \mathsf{M}\,\boldsymbol{s} , \tag{14} $$

where $\boldsymbol{s}$ is a vector of statistically independent source signals with non-Gaussian probability distributions and $\mathsf{M}$ is the mixing matrix. For simplicity, we will from now on only consider the case $n_{\rm IC} = p$, in which case the mixing matrix $\mathsf{M}$ is simply the inverse of the un-mixing matrix $\mathsf{A}$ in Eq. (8). The goal of ICA is to estimate both $\mathsf{M}$ and $\boldsymbol{s}$ from the data.

An intuitive, though slightly hand-waving way to understand how ICA works is to note that a set of linear combinations $Y_i$ of independent, non-Gaussian random variables $X_j$ will usually have distributions that are more Gaussian than the original distributions of the $X_j$ (Central Limit Theorem). Reversing this argument, this suggests that the $X_j$ could be recovered from a sample of the $Y_i$ by looking for linear combinations of the $Y_i$ that have the least Gaussian distributions. These linear combinations will also be close to statistically independent. A more rigorous justification of the method can be found in Hyvärinen et al. (2001).

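The ICA algorithm consists of two parts, the first of which is a preprocessing step: after subtracting the mean $\bar{\boldsymbol{\Delta}} = \langle\boldsymbol{\Delta}\rangle$ from $\boldsymbol{\Delta}$, the data is whitened, i.e. a linear transformation $\tilde{\boldsymbol{\Delta}} = \mathsf{L}(\boldsymbol{\Delta} - \bar{\boldsymbol{\Delta}})$ is introduced such that $\langle\tilde{\boldsymbol{\Delta}}\tilde{\boldsymbol{\Delta}}^{\rm t}\rangle = \mathsf{E}$, where $\mathsf{E}$ is the unit matrix. This can be achieved by the eigen-decomposition of the covariance matrix $\mathsf{C} = \mathsf{U}\mathsf{D}\mathsf{U}^{\rm t}$ of $\boldsymbol{\Delta}$, where $\mathsf{D} = {\rm diag}(d_1, \ldots, d_p)$, by choosing $\mathsf{L} = \mathsf{D}^{-1/2}\mathsf{U}^{\rm t}$. Note that $\mathsf{U}$ is orthonormal and that $d_i \ge 0$ for all $i$. As will be discussed below, each source signal $s_i$ can only be determined up to a multiplicative constant using ICA. We choose these factors such that $\langle\boldsymbol{s}\boldsymbol{s}^{\rm t}\rangle = \mathsf{E}$. The effect of the whitening is that the new mixing matrix $\tilde{\mathsf{M}} = \mathsf{L}\mathsf{M}$ between $\tilde{\boldsymbol{\Delta}}$ and $\boldsymbol{s}$ is orthogonal. This can be seen as follows: $\mathsf{E} = \langle\tilde{\boldsymbol{\Delta}}\tilde{\boldsymbol{\Delta}}^{\rm t}\rangle = \tilde{\mathsf{M}}\langle\boldsymbol{s}\boldsymbol{s}^{\rm t}\rangle\tilde{\mathsf{M}}^{\rm t}$. Since we have chosen $\langle\boldsymbol{s}\boldsymbol{s}^{\rm t}\rangle = \mathsf{E}$, the claim follows.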
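The whitening step can be written compactly with an eigen-decomposition. The following sketch assumes that the $N$ realisations of $\boldsymbol{\Delta}$ are stored as the rows of an array and that the covariance matrix has full rank (all $d_i > 0$); the function name is hypothetical.

```python
import numpy as np


def whiten(delta):
    """Whiten an (N, p) sample; returns the whitened data, L and the mean."""
    mean = delta.mean(axis=0)
    cov = np.cov(delta, rowvar=False)            # C = U D U^t
    d, u = np.linalg.eigh(cov)
    L = (u / np.sqrt(d)).T                       # L = D^{-1/2} U^t
    white = (delta - mean) @ L.T                 # rows are L (Delta - mean)
    return white, L, mean
```

After this transformation the sample covariance of the rows of `white` is the unit matrix, so any further change of basis that preserves uncorrelatedness must be a rotation.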

After the preprocessing, the components of $\tilde{\boldsymbol{\Delta}}$ are uncorrelated. This would be equivalent to statistical independence if their distributions were Gaussian. However, as this is not the case here, a further step is needed. It consists of finding a new set of orthogonal vectors $\boldsymbol{w}_i$ (the row vectors of $\mathsf{M}$) such that the distributions $p_{z_i}(z_i)$ of

$$ z_i = \tilde{\boldsymbol{\Delta}} \cdot \boldsymbol{w}_i \tag{15} $$

maximise a suitable measure of non-Gaussianity. A common method to achieve this is to minimise the entropy (or approximations thereof) of the $z_i$, which is defined by

$$ H(z_i) = -\int {\rm d}y\; p_{z_i}(y) \log p_{z_i}(y) . \tag{16} $$

Since it can be shown that the Gaussian distribution has the largest entropy of all distributions of equal variance, this can be rewritten as maximising the so-called negentropy of the $z_i$, defined by

$$ J(z_i) = H\left(z_i^{\rm Gauss}\right) - H(z_i) . \tag{17} $$

Here, $z_i^{\rm Gauss}$ is a Gaussian random variable with the same variance as $z_i$ and $J(z_i) \ge 0$. Starting from randomly chosen initial directions $\boldsymbol{w}_i$, the algorithm tries to maximise $J(z_i)$ iteratively (in practice, it is sufficient to use a simple approximation to the negentropy). For more details, the reader is again referred to Hyvärinen et al. (2001).
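One widely used simple approximation to the negentropy, described in the ICA literature cited above, compares the expectation of a non-quadratic contrast function $G$ for the standardised variable with its value for a unit Gaussian, $J \propto \{E[G(z)] - E[G(\nu)]\}^2$. The sketch below uses $G(u) = \log\cosh u$ and estimates the Gaussian reference by Monte Carlo. Whether this particular contrast matches the one used for the results of this paper is an assumption, and the function name is hypothetical.

```python
import numpy as np


def negentropy_logcosh(y, n_ref=200_000, seed=0):
    """Approximate negentropy (up to a constant factor) with G = log cosh."""
    y = np.asarray(y, dtype=float)
    y = (y - y.mean()) / y.std()                 # zero mean, unit variance
    nu = np.random.default_rng(seed).standard_normal(n_ref)  # Gaussian reference

    def contrast(u):
        return np.log(np.cosh(u))

    return float((contrast(y).mean() - contrast(nu).mean()) ** 2)
```

For a Gaussian sample the result scatters around zero at the level of the sampling noise, whereas clearly non-Gaussian samples (for example Laplace-distributed ones) give values well above that noise floor.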

ICA suffers from several ambiguities, none of which, however, is crucial for this work. First of all, the amplitudes of the source signals cannot be determined, since any prefactor $\lambda$ to the signal $s_i$ can be cancelled by multiplication of the corresponding column of the mixing matrix by $1/\lambda$. Secondly, the order of the independent components is not determined, since any permutation of the $s_i$ can be accommodated by corresponding changes to $\mathsf{M}$. Thirdly, ICA does not yield a unique answer if at least some of the $s_i$ are Gaussian: the subset of Gaussian signals is only determined up to an orthogonal transformation. This is not an issue in our context, since the Gaussian signals will be uncorrelated thanks to the preprocessing steps, and uncorrelatedness implies statistical independence for Gaussian random variables.

Several interpretations of ICA and algorithms exist and are described in detail in Hyvärinen et al. (2001). In this work, we use an implementation of the fastICA algorithm (Hyvärinen & Oja 1997) for the R language (R Development Core Team 2007)1.
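To make the iteration concrete, the following is a minimal NumPy sketch of a symmetric fixed-point ICA update on whitened data, using the $\tanh$ nonlinearity (the derivative of $\log\cosh$) followed by an orthonormalisation step. It is not the R fastICA package used for this work, and the stopping rule, default parameters and function names are assumptions made for the example.

```python
import numpy as np


def sym_decorrelate(W):
    """Orthonormalise the rows: W <- (W W^t)^(-1/2) W."""
    d, u = np.linalg.eigh(W @ W.T)
    return (u / np.sqrt(d)) @ u.T @ W


def fast_ica(white, n_iter=200, tol=1e-6, seed=0):
    """Estimate orthogonal directions w_i (rows of W) from whitened data.

    white : (N, p) array of whitened observations
    Returns W such that the estimated sources are white @ W.T.
    """
    n, p = white.shape
    rng = np.random.default_rng(seed)
    W = sym_decorrelate(rng.standard_normal((p, p)))   # random initial directions
    for _ in range(n_iter):
        y = white @ W.T                          # projections z_i = w_i . Delta
        g = np.tanh(y)
        g_prime = 1.0 - g**2
        # Fixed-point update: w <- E[x g(w.x)] - E[g'(w.x)] w, for every row
        W_new = (g.T @ white) / n - g_prime.mean(axis=0)[:, None] * W
        W_new = sym_decorrelate(W_new)
        # Converged when every direction is unchanged up to its sign
        change = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0))
        W = W_new
        if change < tol:
            break
    return W
```

The sign and ordering of the recovered components are arbitrary, which mirrors the ambiguities of ICA discussed above.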

3.5. Tests

In this section, we present the results of a number of tests we have performed to ensure that our results are not affected by convergence issues or statistical biases of any kind.

The fastICA algorithm requires a set of randomly chosen directions $\boldsymbol{w}_i$ as initial conditions. It then iteratively computes corrections to these vectors in order to increase the negentropy of the projections of the data vectors onto these directions (Eq. 15), followed by an orthonormalisation step. It is not clear a priori whether the algorithm will settle in the same negentropy maxima for different sets of initial vectors. This concern is backed by the fact that at least some of the $p_{s_i}(s_i|\boldsymbol{\pi}_0)$ might be very close to Gaussian, which might hamper convergence even further. We have therefore tested whether we obtain the same set of basis vectors from a large number of different initial conditions. We find that this is indeed the case for those $\boldsymbol{w}_i$ for which the distribution $p_{z_i}(z_i = \tilde{\boldsymbol{\Delta}} \cdot \boldsymbol{w}_i)$ departs significantly from a Gaussian. As expected, the directions leading to a rather Gaussian $p_{z_i}$ are different for different starting values, reflecting the inability of ICA to distinguish between Gaussian source signals. However, the posterior distributions derived using our algorithm do not differ notably when using different initial conditions. This is even true if the fastICA algorithm does not formally converge (i.e. when the differences of some of the basis vectors between two iterations are not small): after a few hundred iterations, the non-Gaussian directions are determined and do not change anymore. The reason for not reaching convergence is that the algorithm still tries to find negentropy maxima in the subspace of Gaussian directions.

Fig. 1. Area of the 68% (dashed lines) and 99% (solid lines) credible regions in the $\Omega_{\rm m}$-$\sigma_8$-plane as a function of the sample size $N$, for the Gaussian likelihood (red, upper curves) and the likelihood computed using the ICA algorithm (black, lower curves). Blue lines are the predicted areas based on Eq. (18). (Horizontal axis: $N$; vertical axis: area of credible region.)

1 http://www.r-project.org/

As has been noted in Hartlap et al. (2007), statistical biases can become significant already for the Gaussian approximation of the likelihood (Eq. 4): care has to be taken if the covariance matrix of the correlation function (given on $p$ bins) is estimated from a finite set of $N$ simulations or observations. Inverting the estimated covariance yields a biased estimate of the inverse:

$$ \left\langle \hat{\mathsf{C}}^{-1} \right\rangle = \frac{N - 1}{N - p - 2}\, \Sigma^{-1} \quad \text{for } p < N - 1 , \tag{18} $$

where $\hat{\mathsf{C}}$ is the estimated and $\Sigma$ the true covariance matrix. This bias leads to an underestimation of the size of credible regions by a factor of $(N - p - 2)/(N - 1) \approx 1 - p/N$. We suspect that a similar bias occurs in our likelihood estimation procedure. In Fig. 1, we therefore plot the area of the 68% and 99% credible regions of the posterior distribution for $\Omega_{\rm m}$ and $\sigma_8$ (keeping all other cosmological parameters fixed to their fiducial values) as functions of the number $N$ of observations of the correlation functions used to estimate the ICA transformation (black curves). To exclude noise effects from the analysis, we use the theoretical prediction of the correlation function for the fiducial cosmological parameters as data vector. We set $p = 30$ throughout. For comparison, we also show the areas computed using the Gaussian likelihood (red curves). In the latter case, the bias predicted by Eq. (18) is clearly visible as a decrease of the area when $N$ becomes small. The ICA method suffers from a similar bias, although the behaviour at small $N$ seems to be slightly different. More important, though, is the fact that this bias is unimportant for reasonably large sample sizes ($N \gtrsim 2000$). Since we always use the full sample ($N = 9600$) in the following, this bias is completely negligible.
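For the Gaussian case, the bias of Eq. (18) can be removed by rescaling the inverse of the sample covariance. The sketch below shows this rescaling; the function names are hypothetical, and the check on the sample size simply enforces that the correction factor is positive.

```python
import numpy as np


def hartlap_factor(n, p):
    """Factor (N - p - 2) / (N - 1) that removes the bias of Eq. (18)."""
    if n - p - 2 <= 0:
        raise ValueError("need N > p + 2 for a positive correction factor")
    return (n - p - 2) / (n - 1)


def debiased_inverse_covariance(samples):
    """Bias-corrected inverse covariance from an (N, p) array of realisations."""
    n, p = samples.shape
    cov = np.cov(samples, rowvar=False)          # unbiased sample covariance
    return hartlap_factor(n, p) * np.linalg.inv(cov)
```

For the numbers used here ($N = 9600$, $p = 30$) the factor is $9568/9599$, i.e. within about 0.3% of unity, consistent with the statement that the bias is negligible for the full sample.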

Our method to estimate the likelihood crucially depends on the assumption that a linear transformation makes the components of the shear correlation vectors statistically independent. A necessary condition for mutual statistical independence of all $s_i$ is pairwise independence. The components $i$ and $j$ are called pairwise statistically independent if $p(s_i, s_j) = p_{s_i}(s_i)\,p_{s_j}(s_j)$. We therefore compare the joint pairwise distributions $p(s_i, s_j)$ to the product distributions $p_{s_i}(s_i)\,p_{s_j}(s_j)$, where we estimate $p(s_i, s_j)$ using a two-dimensional extension (using a bi-variate Gaussian kernel) of the kernel density method given by Eq. (10). We give two examples in Fig. 2, where we compare the joint and product distributions of the two most non-Gaussian components and two nearly Gaussian components. As expected, a simple PCA is not enough to achieve pairwise statistical independence in the non-Gaussian case. Only after performing the ICA is pairwise independence achieved.

A more rigorous test for mutual statistical independence for the multivariate, continuous case was proposed by Chiu et al. (2003). It is based on the observation that if $x$ is a continuous random variable and $P(x)$ is its cumulative distribution function (CDF), then $z = P(x)$ is uniformly distributed in $[0, 1]$. If we are given a set of statistically independent random variables $s_i$, this means that the joint distribution of $z_i = P_i(s_i)$, where again $P_i$ is the CDF of $s_i$, is uniform in the multidimensional unit cube. On the other hand, if the assumption of statistical independence of the $s_i$ is violated, the joint density $p_{\boldsymbol{z}}$ of the $z_i$ is given by

$$ p_{\boldsymbol{z}}(\boldsymbol{z}) = p_{\boldsymbol{z}}\left[P_1(s_1), \ldots, P_n(s_n)\right] = p_{\boldsymbol{s}}(s_1, \ldots, s_n) \left| \frac{\partial \boldsymbol{z}}{\partial \boldsymbol{s}} \right|^{-1} = \frac{p_{\boldsymbol{s}}(s_1, \ldots, s_n)}{\prod_{i=1}^{n} p_i(s_i)} . \tag{19} $$

Here, $p_i(s_i)$ is the distribution function of $s_i$ only and $p_{\boldsymbol{s}}$ is the joint distribution function of $s_1, \ldots, s_n$. This means that the joint distribution of the $z_i$ is not uniform if the $s_i$ are statistically dependent. Therefore, we can test if the $s_i$ we obtain from the ICA procedure are indeed independent by computing their empirical cumulative distribution functions, carrying out the above transformation and finally testing for multivariate uniformity. Such a test was described in Liang et al. (2001), to which we refer the reader for more details.
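The transformation to the unit cube is easy to sketch with rank-based empirical CDFs. The example below pairs it with a deliberately crude uniformity check, a $\chi^2$ statistic of cell counts on a regular grid. This statistic is only a stand-in chosen for illustration and is not the test of Liang et al. (2001); both function names are hypothetical, and the gridded check is only practical in a few dimensions.

```python
import numpy as np


def probability_integral_transform(s):
    """Map each column of an (N, n) sample to z_i = P_i(s_i) via its ranks."""
    n_obs = s.shape[0]
    ranks = np.argsort(np.argsort(s, axis=0), axis=0) + 1   # ranks 1..N per column
    return ranks / (n_obs + 1.0)                 # strictly inside (0, 1)


def uniformity_chi2(z, n_cells=5):
    """Chi^2 of cell counts in the unit cube against a uniform expectation."""
    counts, _ = np.histogramdd(z, bins=n_cells, range=[(0.0, 1.0)] * z.shape[1])
    expected = z.shape[0] / counts.size
    return float(np.sum((counts - expected) ** 2) / expected)
```

Independent components give a statistic of the order of the number of cells, while dependent components concentrate points in some cells and empty others, which inflates the statistic substantially.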

Fig. 2. Comparison of the joint distributions $p(s_i, s_j)$ (black dashed contours) and the product $p_{s_i}(s_i)\,p_{s_j}(s_j)$ (solid red contours) for the two most non-Gaussian components ($i = 1$, $j = 2$; panel title "Components 1 − 2") and two rather Gaussian ones ($i = 9$, $j = 10$; panel title "Components 9 − 10"), after performing ICA (left panels) and PCA (right panels). The components have been ranked and labelled according to their non-Gaussianity; the $i$-th PCA component is in general not the same as the $i$-th ICA component. In the right panel of each plot, the distributions with respect to the PCA basis vectors are shown and in the left panel, the distributions in the ICA basis are displayed. Statistical independence is indicated by $p(s_i, s_j) = p_{s_i}(s_i)\,p_{s_j}(s_j)$.

Applying the test to thesi that we have obtained from ourICA procedure, we have to reject statistical independence at99% confidence. This means that the ICA does not remove alldependencies between the components of the shear correlationfunction. This result, however, does not give an indicationofhow these residual dependencies affect our likelihood estimateand the conclusions regarding constraints on cosmologicalpa-rameters. We therefore compare the constraints derived fromthe ICA likelihood with the constraints from the likelihood


Ωm

σ 8

0.1 0.2 0.3 0.4 0.5 0.6 0.7

0.4

0.6

0.8

1.0

1.2

1.4

ICA likelihoodx

o

Ωm

σ 8

0.1 0.2 0.3 0.4 0.5 0.6 0.7

0.4

0.6

0.8

1.0

1.2

1.4

Gaussian likelihood

o

Fig. 4. Comparison of the posterior likeli-hoods for (Ωm, σ8), computed using the ICAlikelihood (left panel) and the Gaussian ap-proximation (right panel). Shown are thecontours of the 68%, 95% and 99% credibleregions. The maximum of the ICA posterioris denoted by ‘x’; the maximum of the pos-terior based on the Gaussian likelihood co-incides with the fiducial parameter set and ismarked by the symbol ‘o’.

[Figure 3: horizontal axis $\Omega_{\rm m}$ (0.1 to 0.6), vertical axis $\sigma_8$ (0.4 to 1.2).]

Fig. 3. Comparison of the posterior likelihoods for ($\Omega_{\rm m}$, $\sigma_8$), computed using the ICA likelihood (black contours) and the PPDE likelihood (red contours). Shown are the contours of the 68%, 95% and 99% credible regions.

estimated using an alternative method, called projection pursuit density estimation (PPDE; Friedman et al. 1984), which we describe in detail in App. A. This method is free from any assumptions regarding statistical independence and therefore provides an ideal cross-check for the ICA method. For the comparison, we have computed the shear correlation functions with $p = 10$, and we also use $n_{\rm IC} = 10$ independent components. The resulting contours in the $\Omega_{\rm m}$-$\sigma_8$-plane are shown in Fig. 3. Both posterior likelihoods are very similar, although the credible regions of the PPDE posterior have a slightly smaller area than the contours of the ICA posterior (which actually supports the findings presented in the next section). Given the good agreement of the two methods, we will henceforth only make use of the ICA procedure, which is considerably faster and numerically less contrived than PPDE.

3.6. Results on the posterior

The most interesting question is how much the posterior distribution computed from the non-Gaussian ICA likelihood will differ from the Gaussian approximation. We have investigated this for the case of the CDFS and the parameter set ($\Omega_{\rm m}$, $\sigma_8$). Here and henceforth, we use 15 angular bins for $\xi_+$ and $\xi_-$ in the range from 12′′ to 30′, i.e. $p = 30$. For the data vector, we do not use the correlation functions from our simulations, but take the theoretical prediction for our fiducial parameter set instead. This allows us to study the shape of the posterior likelihood independent of noise in the data and biases due to the fact that the theoretical model does not quite match the mean correlation function from the simulations. In Fig. 4, we show the contours of the posterior computed in this way from the likelihood estimated using our ICA method (left panel) and from the Gaussian likelihood (right panel). We have assumed $\sigma_{|\epsilon|} = 0.45$ for the dispersion of the intrinsic galaxy ellipticities. The shape of the ICA posterior is different from that of the Gaussian approximation in three respects: it is steeper (leading to smaller credible regions), the maximum is shifted towards higher $\sigma_8$ and lower $\Omega_{\rm m}$, and the contours are slightly tilted. The first two differences can be traced back to the shape of the distributions of the individual ICA components: most of the distribution functions $p_{s_i}$ are generally slightly steeper than a Gaussian and most of the non-Gaussian components are in addition strongly skewed, thus shifting the peak of the posterior. Generally, these differences are more pronounced in the direction of the $\Omega_{\rm m}$-$\sigma_8$-degeneracy and towards lower values of both parameters, where the posterior is shallower.

Of more practical relevance is how the parameter constraints change when the ICA likelihood is used for the analysis of large weak lensing surveys. Here, we consider surveys consisting of $N_{\rm f}$ CDFS-like fields. Bayesian theory states that if $N_{\rm f}$ is large enough, the posterior probability distribution of the parameters becomes Gaussian, centred on the true parameter values, with covariance matrix $(N_{\rm f} F)^{-1}$ (e.g. Gelman et al. 2004). Here, $F$ is the Fisher matrix (Kendall et al. 1987), which is defined by

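$$F_{\alpha\beta} = \left\langle \frac{\partial \log L}{\partial \pi_\alpha}\,\frac{\partial \log L}{\partial \pi_\beta} \right\rangle , \tag{20}$$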


[Figure 5: triangle plot for the parameters $\Omega_{\rm m}$, $\sigma_8$, $h_{100}$ and $\Omega_\Lambda$; the individual panels carry annotations with values of r between 0.58 and 0.93.]

Fig. 5. Fisher matrix constraints for a hypothetical 1500-deg$^2$ survey, consisting of 6000 CDFS-like fields. The plots on the diagonal show the 1D marginals, the off-diagonal plots the 2D marginals derived from the full 4D posterior. The red dashed (black solid) lines/contours have been computed using the Fisher matrix of the Gaussian likelihood (the ICA likelihood).

where $\langle\cdot\rangle$ denotes the expectation value with respect to the likelihood function. If the likelihood is Gaussian and if the covariance matrix $C$ does not depend on cosmology, one can show that

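$$F_{\alpha\beta} = \sum_{i,j} C^{-1}_{ij}\,\frac{\partial m_i(\pi)}{\partial \pi_\alpha}\,\frac{\partial m_j(\pi)}{\partial \pi_\beta} . \tag{21}$$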
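As a minimal sketch of how the Gaussian-likelihood Fisher matrix of Eq. (21) can be evaluated numerically, consider the following. The function name, the array layout and the toy numbers are assumptions made for illustration; the model derivatives would in practice come from finite differences of the predicted correlation functions.

```python
def gaussian_fisher(dm_dpi, cinv):
    """F_ab = sum_ij Cinv[i][j] * dm_i/dpi_a * dm_j/dpi_b.

    dm_dpi[i][a] is the derivative of model component i w.r.t. parameter a,
    cinv is the inverse covariance matrix of the data vector.
    """
    p = len(dm_dpi)
    n_par = len(dm_dpi[0])
    fisher = [[0.0] * n_par for _ in range(n_par)]
    for a in range(n_par):
        for b in range(n_par):
            fisher[a][b] = sum(
                cinv[i][j] * dm_dpi[i][a] * dm_dpi[j][b]
                for i in range(p) for j in range(p)
            )
    return fisher

# Toy example: 3 data points, 2 parameters, covariance diag(0.5, 1, 2).
dm_dpi_toy = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
cinv_toy = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.5]]
F_gauss = gaussian_fisher(dm_dpi_toy, cinv_toy)
print(F_gauss)
```

The result is symmetric in the two parameter indices, as expected for a Fisher matrix.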

Eq. (20) provides us with a way to estimate the Fisher matrix for the non-Gaussian likelihood. For each ray-tracing realisation of the CDFS, we compute the logarithm of the posterior distribution $\log p(\pi|\xi)$ and its derivatives with respect to the cosmological parameters at the fiducial parameter values. Since we use uniform priors for all cosmological parameters, the derivatives of the log-posterior are identical to those of the log-likelihood. We can then compute the Fisher matrix by averaging over all realisations:

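$$F_{\alpha\beta} = \frac{1}{N} \sum_{k=1}^{N} \frac{\partial \log p(\pi|\xi)}{\partial \pi_\alpha}\,\frac{\partial \log p(\pi|\xi)}{\partial \pi_\beta} . \tag{22}$$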
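The averaging in Eq. (22) amounts to a Monte Carlo estimate of the expectation value in Eq. (20). A hedged toy illustration is given below; it replaces the ray-tracing realisations by draws from a one-dimensional Gaussian with parameters (mu, sigma), for which the Fisher matrix is known analytically. All names are hypothetical.

```python
import random

def fisher_from_scores(scores):
    """Average the outer product of per-realisation log-likelihood gradients."""
    n = len(scores)
    n_par = len(scores[0])
    return [[sum(s[a] * s[b] for s in scores) / n for b in range(n_par)]
            for a in range(n_par)]

def score(x, mu, sigma):
    """Gradient of log N(x | mu, sigma^2) with respect to (mu, sigma): a toy likelihood."""
    d = x - mu
    return [d / sigma**2, d * d / sigma**3 - 1.0 / sigma]

rng = random.Random(1)
mu0, sigma0 = 0.0, 2.0  # fiducial values of the toy model
scores = [score(rng.gauss(mu0, sigma0), mu0, sigma0) for _ in range(20000)]
F_mc = fisher_from_scores(scores)
print(F_mc)  # analytic expectation: [[1/sigma0^2, 0], [0, 2/sigma0^2]]
```

The estimate converges to the analytic result only as the inverse square root of the number of realisations, which is one reason why a large simulation sample is needed.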

In App. B, we show that the expression for the Fisher matrix of the ICA likelihood can be evaluated further to be

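$$F_{\alpha\beta} = \sum_i \frac{\partial m_i}{\partial \pi_\alpha}\,\frac{\partial m_i}{\partial \pi_\beta} \int {\rm d}s_i\; p_{s_i}(s_i) \left( \frac{\partial \log p_{s_i}(s_i)}{\partial s_i} \right)^2 . \tag{23}$$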

This equation allows a simpler, alternative computation of $F$ from the estimated $p_{s_i}(s_i)$, as discussed in App. B.
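To illustrate how Eq. (23) might be evaluated for tabulated one-dimensional densities, the following sketch integrates $p\,(\partial \log p/\partial s)^2$ with the trapezoidal rule and combines the result with the model derivatives. The grid, the function names and the toy derivatives are assumptions; `dm_dpi[i][a]` simply stands for $\partial m_i/\partial\pi_\alpha$ as it appears in Eq. (23).

```python
import math

def location_information(s_grid, p_vals):
    """Trapezoidal estimate of  int ds p(s) (d log p / ds)^2  for a tabulated density."""
    logp = [math.log(p) for p in p_vals]
    n = len(s_grid)
    integrand = []
    for k in range(n):
        lo, hi = max(k - 1, 0), min(k + 1, n - 1)  # one-sided at the grid edges
        dlogp = (logp[hi] - logp[lo]) / (s_grid[hi] - s_grid[lo])
        integrand.append(p_vals[k] * dlogp**2)
    return sum(0.5 * (integrand[k] + integrand[k + 1]) * (s_grid[k + 1] - s_grid[k])
               for k in range(n - 1))

def ica_fisher(dm_dpi, informations):
    """F_ab = sum_i dm_i/dpi_a * dm_i/dpi_b * I_i, with I_i the integral above."""
    n_par = len(dm_dpi[0])
    return [[sum(info * row[a] * row[b] for row, info in zip(dm_dpi, informations))
             for b in range(n_par)] for a in range(n_par)]

# Toy check: for a Gaussian of width sigma the integral equals 1 / sigma^2.
sigma = 2.0
s_grid = [-16.0 + 0.01 * k for k in range(3201)]
p_vals = [math.exp(-0.5 * (s / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
          for s in s_grid]
info = location_information(s_grid, p_vals)
F_ica = ica_fisher([[1.0, 0.0], [1.0, 2.0]], [info, info])
print(info, F_ica)
```

Distributions that are steeper than a Gaussian of the same width give a larger integral, which is how the tighter constraints of the ICA likelihood enter the Fisher matrix.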

We have used Eqns. (21) and (23) to compute the Fisher matrices for a 1500-deg$^2$ survey ($N_{\rm f} = 6000$). We fit for four cosmological parameters ($\Omega_{\rm m}$, $\sigma_8$, $h_{100}$, $\Omega_\Lambda$), keeping all other parameters fixed to their true values. To visualise the posterior, we compute two-dimensional marginalised posterior distributions for each parameter pair as well as the one-dimensional marginals for each parameter. The results are shown in Fig. 5.
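For a Gaussian posterior with covariance $(N_{\rm f} F)^{-1}$, marginal errors and credible-region areas follow directly from the inverse Fisher matrix. The sketch below shows this for two parameters only, using an explicit 2x2 inverse; the single-field Fisher matrix is a made-up placeholder and the function names are hypothetical.

```python
import math

def forecast_covariance_2d(fisher, n_fields):
    """Invert the 2x2 matrix n_fields * F to get the forecast parameter covariance."""
    a, b, c = fisher[0][0], fisher[0][1], fisher[1][1]
    det = a * c - b * b
    scale = 1.0 / (det * n_fields)
    return [[c * scale, -b * scale], [-b * scale, a * scale]]

def marginal_errors(cov):
    """1-sigma errors after marginalising over the other parameter."""
    return [math.sqrt(cov[0][0]), math.sqrt(cov[1][1])]

def credible_ellipse_area(cov, level=0.68):
    """Area of the ellipse enclosing `level` of a 2D Gaussian posterior."""
    det = cov[0][0] * cov[1][1] - cov[0][1] ** 2
    delta_chi2 = -2.0 * math.log(1.0 - level)
    return math.pi * delta_chi2 * math.sqrt(det)

# Hypothetical single-field Fisher matrix; N_f = 6000 as in the text.
F_toy = [[3.0, 1.0], [1.0, 3.0]]
cov_survey = forecast_covariance_2d(F_toy, 6000)
cov_single = forecast_covariance_2d(F_toy, 1)
print(marginal_errors(cov_survey), credible_ellipse_area(cov_survey))
```

In this limit the marginal errors scale as $N_{\rm f}^{-1/2}$ and the area of a two-dimensional credible region as $N_{\rm f}^{-1}$.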

A general feature of the ICA likelihood, which has already been apparent in the 2D-analysis (Fig. 4), is that the credible intervals are significantly smaller than the ones derived from the Gaussian likelihood. For the two-dimensional marginal distributions, the areas of the 68% credible regions derived from the ICA likelihood are smaller by ≈ 30−40%. The one-dimensional constraints are tighter by ≈ 10−25%. In addition, we find that the ICA Fisher ellipses in some cases are slightly tilted with respect to those computed using the Gaussian likelihood. This is particularly apparent for parameter combinations involving the Hubble parameter. Note that the shift of the maximum observed in the two-dimensional case for a single CDFS-like field is absent here because it was assumed for the Fisher analysis that the posterior is centred on the true parameter values.

4. How odd is the Chandra Deep Field South?

4.1. The CDFS cosmic shear data

The second part of this work is based on the cosmological weak lensing analysis of the combined HST GEMS and GOODS data of the CDFS (Rix et al. 2004; Giavalisco et al. 2004), which was presented in S07. The mosaic comprises 78 ACS/WFC tiles imaged in F606W, covering a total area of ∼ 28′ × 28′. We refer the reader to the original publication for details on the data and weak lensing analysis, which applies the KSB+ formalism (Kaiser et al. 1995; Luppino & Kaiser 1997; Hoekstra et al. 1998).

In S07, the cosmic shear analysis was performed using two different signal-to-noise and magnitude cuts. The first one selects galaxies with S/N > 4 and has no magnitude cut, and the second one applies a more conservative selection with S/N > 5 and $m_{606} < 27.0$, where S/N is the shear measurement signal-to-noise ratio as defined in Erben et al. (2001). The drizzling process in the data reduction introduces correlated noise in adjacent pixels. While these correlations are ignored in the computation of S/N, an approximate correction factor (see S07) is taken into account for S/N$_{\rm true}$, making the above cuts S/N$_{\rm true} \gtrsim 1.9$ and S/N$_{\rm true} \gtrsim 2.4$ respectively.

The two selection criteria yielded moderately different $\sigma_8$-estimates of $0.52^{+0.11}_{-0.15}$ and $0.59^{+0.11}_{-0.14}$ for $\Omega_{\rm m} = 0.3$ (median of the posterior), not assuming a flat Universe. The errors include the statistical and redshift uncertainties. This translates to $\sigma_8 = 0.57^{+0.12}_{-0.16}$ and $0.65^{+0.12}_{-0.15}$ for our fiducial cosmology with $\Omega_{\rm m} = 0.25$. The difference of the two estimates was considered as a measure for the robustness and hence systematic accuracy of our shear measurement pipeline. While the analysis of the “Shear TEsting Programme 2” (STEP2) image simulations (Massey et al. 2007a) indicated no significant average shear calibration bias for our method, a detected dependence on galaxy magnitude and size could effectively bias a cosmic shear analysis through the redshift dependence of the shear signal (see also Semboloni et al. 2008). In order to better understand the difference between the two estimates found in S07, and to exclude any remaining calibration uncertainty in the current analysis, we further investigate the shear recovery accuracy as a function of the signal-to-noise ratio using the STEP2 simulations in Appendix C. Here we conclude that our KSB+ implementation under-estimates gravitational shear for very noisy galaxies with S/N$_{\rm true} \lesssim 2.5$, which likely explains the lower signal found in S07 when all galaxies with S/N > 4 (S/N$_{\rm true} \gtrsim 1.9$) were considered. For the more conservative selection criteria we find no significant mean shear calibration bias and a variation as a function of magnitude and size of $\lesssim \pm 5$%. Therefore we base our current analysis on the more robust galaxy sample with S/N > 5 (S/N$_{\rm true} \gtrsim 2.4$) and $m_{606} < 27.0$, which yields a galaxy number density of 68 arcmin$^{-2}$. Based on the simulations, any remaining calibration uncertainty should be negligible compared to the statistical uncertainty.

Note that Heymans et al. (2005) found a higher estimate of $\sigma_8 (\Omega_{\rm m}/0.3)^{0.65} = 0.68 \pm 0.13$ from GEMS, where they extrapolated the redshift distribution from the relatively shallow COMBO-17 photometric redshifts (Wolf et al. 2004). Using deeper data from the GOODS-MUSIC sample (Grazian et al. 2006), S07 were able to show that the COMBO-17 extrapolation significantly underestimates the mean redshift for GEMS, leading to the difference in the results for $\sigma_8$.

In Fig. 6, we show the posterior distribution for $\sigma_8$ based on this sample of galaxies. For the fit, all other cosmological parameters were held fixed at the fiducial values chosen for our ray-tracing simulations. This avoids complications in the discussion of cosmic variance and field selection biases due to the effect of parameter degeneracies. We choose a flat prior for $\sigma_8$, with a lower boundary of $\sigma_{8,\min} = 0.35$ to cut off the tail of the posterior distribution towards small values of the power spectrum normalisation, which is caused by the fact that the difference (and therefore the likelihood) between the data and the model vectors changes only very little when $\sigma_8$ (and therefore the shear correlation function) is very small. We have performed the fit for the ICA likelihood as well as for the Gaussian

[Figure 6: posterior density (vertical axis, 0.0 to 3.0) against $\sigma_8$ (horizontal axis, 0.4 to 1.0).]

Fig. 6. Posterior distributions for $\sigma_8$ as computed from the CDFS data. The black solid line corresponds to the ICA likelihood, the red dashed line is from the Gaussian likelihood whose covariance matrix was estimated from the ray-tracing simulations. The blue dotted line was computed from the Gaussian likelihood with an analytically computed covariance matrix, assuming that the shear field is Gaussian. The similarity of the posterior densities derived from the ICA likelihood and using the Gaussian covariance matrix is purely coincidental, occurring only for this particular data vector.

Table 1. Estimates of $\sigma_8$ from the CDFS

          ICA likel.               Gaussian likel.          Gaussian likel.
                                   (ray-tracing cov.)       (Gaussian cov.)
MAP       $0.68^{+0.09}_{-0.16}$   $0.59^{+0.10}_{-0.19}$   $0.68^{+0.10}_{-0.14}$
Median    $0.62^{+0.11}_{-0.11}$   $0.57^{+0.15}_{-0.15}$   $0.64^{+0.10}_{-0.14}$

approximation to the likelihood. For the latter, the covariance matrix was in one case estimated from the full sample of our ray-tracing simulations, and in the other case computed analytically assuming that the shear field is a Gaussian random field (Joachimi et al. 2008). The striking similarity of the posterior densities derived from the ICA likelihood and using the Gaussian covariance matrix for this particular data vector is merely a coincidence and is in general not seen for our set of simulated correlation functions.

For estimates of $\sigma_8$, we use the maximum of the posterior (henceforth we write ICA-MAP for the maximum of the non-Gaussian likelihood, and Gauss-MAP if the Gaussian approximation is used), although we also quote the median (ICA median) for comparison with S07. In the first case, our credible intervals are highest posterior density intervals, whereas for the median we choose to report the interval for which the probability of $\sigma_8$ being below the lower interval boundary is as high as being above the upper boundary. The results are summarised in Tab. 1.
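The difference between the two interval definitions can be made concrete with a small sketch that operates on a one-dimensional posterior tabulated on a uniform grid. The skewed toy posterior below is a hypothetical log-normal, not the CDFS posterior, and the function names are illustrative assumptions.

```python
import math

def normalise(x, post):
    """Return the posterior normalised to unit area, together with the grid spacing."""
    dx = x[1] - x[0]
    norm = sum(post) * dx
    return [p / norm for p in post], dx

def hpd_interval(x, post, level=0.68):
    """Highest posterior density interval (a unimodal posterior is assumed)."""
    dens, dx = normalise(x, post)
    order = sorted(range(len(x)), key=lambda k: dens[k], reverse=True)
    mass, chosen = 0.0, []
    for k in order:  # add grid points from the highest density downwards
        chosen.append(k)
        mass += dens[k] * dx
        if mass >= level:
            break
    return x[min(chosen)], x[max(chosen)]

def equal_tailed_interval(x, post, level=0.68):
    """Interval with equal probability below the lower and above the upper bound."""
    dens, dx = normalise(x, post)
    tail = 0.5 * (1.0 - level)
    cum, lo, hi, found_lo = 0.0, x[0], x[-1], False
    for xk, dk in zip(x, dens):
        cum += dk * dx
        if not found_lo and cum >= tail:
            lo, found_lo = xk, True
        if cum >= 1.0 - tail:
            hi = xk
            break
    return lo, hi

# Toy skewed posterior on the prior range [0.35, 1.8]: log-normal with median 0.68.
x_grid = [0.35 + 0.001 * k for k in range(1451)]
post = [math.exp(-0.5 * ((math.log(v) - math.log(0.68)) / 0.2) ** 2) / v for v in x_grid]
hpd = hpd_interval(x_grid, post)
eti = equal_tailed_interval(x_grid, post)
print(hpd, eti)
```

For a posterior skewed towards high values, the highest posterior density interval is shorter and shifted towards the mode relative to the equal-tailed interval.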

4.2. Cosmic Variance

The original estimates for $\sigma_8$ given in S07 and those found in the previous section for the Gaussian likelihood are rather


[Figure 7: $p(\sigma_8)$ (vertical axis, 0 to 5) against $\sigma_8$ (horizontal axis, 0.4 to 1.2); histograms labelled "Gauss MAP" and "ICA MAP"; vertical lines labelled "true", "CDFS ICA" and "CDFS GAUSS".]

Fig. 7. Sampling distributions of the MAP estimators of $\sigma_8$, derived from 9600 realisations of the CDFS. All other parameters were held fixed at their fiducial values for the fit. The histogram with red dashed lines has been obtained from the Gaussian likelihood, the one with solid lines from the ICA likelihood. Also shown are the best fitting Gaussian distributions. We indicate the fiducial value of $\sigma_8$ and our estimates from the CDFS with vertical lines.

low compared to the value reported by WMAP5 (Dunkley et al. 2009). This problem appears less severe when the full non-Gaussian likelihood is used, but the $\sigma_8$-estimate is still rather low. It is therefore interesting to know whether this can be fully attributed to cosmic variance or whether the way in which the CDFS was originally selected biases our estimates low.

To begin, we determine the probability of finding a low $\sigma_8$ in a CDFS-like field if the pointing is completely random. We estimate the sampling distribution of the $\sigma_8$-MAP estimators for Gaussian and ICA likelihoods from the full sample of our ray-tracing simulations. We compute the posterior likelihood for $\sigma_8$ using a uniform prior in the range $\sigma_8 \in [0.35; 1.8]$ and determine the MAP estimator $\hat\sigma_8$. As in the previous sections, we do this using both the Gaussian and the ICA likelihoods. To separate possible biases of the estimators from biases that might arise because the model prediction based on Smith et al. (2003) does not quite fit our simulations, we correct the simulated correlation functions for this: if $\xi^{(i)}$ is the correlation function measured in the $i$-th realisation, then

$$\xi^{(i)}_{\rm rc} = \xi^{(i)} - \langle\xi\rangle + m(\pi_0) , \tag{24}$$

is the “re-centred” shear correlation, where $\langle\xi\rangle$ is the mean of all realisations and $m(\pi_0)$ is our fiducial model.
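The re-centring of Eq. (24) is a simple shift of every realisation, which can be sketched as follows. The function name and the two-bin toy vectors are assumptions chosen for illustration only.

```python
def recentre(xi_realisations, model):
    """Apply xi_rc^(i) = xi^(i) - <xi> + m(pi_0) to every realisation."""
    n = len(xi_realisations)
    mean = [sum(col) / n for col in zip(*xi_realisations)]  # <xi>, bin by bin
    return [[x - mu + m for x, mu, m in zip(xi, mean, model)]
            for xi in xi_realisations]

# Toy example with two realisations and two angular bins.
xi_toy = [[1.0, 2.0], [3.0, 6.0]]
m0_toy = [1.5, 3.0]  # stands in for the fiducial model m(pi_0)
xi_rc = recentre(xi_toy, m0_toy)
print(xi_rc)
```

By construction, the mean of the re-centred realisations equals the fiducial model, while the scatter around the mean is unchanged.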

The resulting sampling distributions of $\hat\sigma_8$ are shown in Figs. 7 (original $\xi$) and 8 (re-centred $\xi$). All the distributions are well fit by a Gaussian. With the original correlation functions, we obtain estimates $\hat\sigma_8$ which are too high on average. This reflects the fact that the power spectrum fitting formula by Smith et al. (2003) underpredicts the small scale power in the simulations (see also Hilbert et al. 2009). If we correct for this, we see that the maximum of the ICA likelihood is a nearly unbiased estimator of $\sigma_8$ in the one-dimensional case considered here, and in addition has a lower variance than the maximum of the Gaussian likelihood.

[Figure 8: same layout as Fig. 7, panel labelled "re-centered"; $p(\sigma_8)$ against $\sigma_8$ (0.4 to 1.2); histograms labelled "Gauss MAP" and "ICA MAP".]

Fig. 8. Same as Fig. 7, but using re-centred correlation functions.

Table 2. Prob($\sigma_8 < \sigma_8^{\rm CDFS}$) for the CDFS

                 Gauss (MAP)   Gauss (median)   ICA (MAP)   ICA (median)
re-centred CF    6.8%          8.6%             12.9%       9.0%
original CF      1.8%          3.0%             5.4%        3.4%

We estimate the probability of obtaining a power spectrum normalisation as low as the one measured in the CDFS or lower, Prob($\sigma_8 < \sigma_8^{\rm CDFS}$), by the ratio of the number of realisations which fulfil this condition to the total number of simulations. These estimates agree very well with those computed from the best fitting Gaussian distribution. The results for the MAP and median estimators are summarised in Tab. 2. As expected from the above considerations, we find higher probabilities for the re-centred correlation functions. In this case, the ICA-MAP estimator yields 13% for the probability of obtaining an equally low or lower $\sigma_8$ than the CDFS. This reduces to ≈ 5% when the uncorrected correlation functions are used, because the misfit of our theoretical correlation functions to the simulations biases the $\sigma_8$-estimates high. If we assume that our simulations are a reasonable representation of the real Universe, we can expect the same bias when we perform fits to real data. Therefore, Prob($\sigma_8 < \sigma_8^{\rm CDFS}$) ≈ 0.05 as derived from the uncorrected correlation functions is most likely closest to reality. The probabilities computed from the Gauss-MAP estimates are generally smaller than the ICA-MAP values because of the lower value of $\sigma_8^{\rm CDFS}$ found using these estimators, even though the sampling distributions of the Gauss estimators are broader.
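The two ways of estimating Prob($\sigma_8 < \sigma_8^{\rm CDFS}$), counting realisations and integrating the best fitting Gaussian, can be sketched as below. The estimates are synthetic draws from an assumed Gaussian sampling distribution, so the printed numbers are placeholders and not the values of Tab. 2.

```python
import math
import random
import statistics

def empirical_prob_below(estimates, threshold):
    """Fraction of realisations whose estimate falls below the threshold."""
    return sum(1 for e in estimates if e < threshold) / len(estimates)

def gaussian_prob_below(estimates, threshold):
    """Same probability from a Gaussian with the sample mean and standard deviation."""
    mu = statistics.mean(estimates)
    sd = statistics.stdev(estimates)
    return 0.5 * (1.0 + math.erf((threshold - mu) / (sd * math.sqrt(2.0))))

# Synthetic stand-in for 9600 MAP estimates; mean and width are arbitrary choices.
rng = random.Random(2)
sigma8_hat = [rng.gauss(0.8, 0.1) for _ in range(9600)]
p_emp = empirical_prob_below(sigma8_hat, 0.68)
p_fit = gaussian_prob_below(sigma8_hat, 0.68)
print(p_emp, p_fit)
```

The two numbers agree to within the counting noise as long as the sampling distribution is close to Gaussian, which is the situation described above.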

4.3. Influence of the CDFS selection criteria

We now investigate if and by how much the way in which the CDFS was selected can bias our estimates of the power spectrum normalisation low. Several local criteria had to be fulfilled by the future CDFS, such as a low galactic HI density, the absence of bright stars and observability from certain observatory sites. Since these conditions do not reach beyond our galaxy,


[Figure 9: $\sigma_8/\sigma_{8,\,{\rm no\ cut}}$ (vertical axis, 0.85 to 1.00) against $F_{\rm lim}$(0.1−2.4 keV) in erg/sec/cm$^2$ (horizontal axis, $10^{-14}$ to $10^{-10}$); curves labelled "ICA MAP", "ICA median", "Gauss MAP" and "Gauss median".]

Fig. 9. The average values of the ICA-MAP (solid black line) and Gauss-MAP (solid red line) estimators computed from CDFS realisations that do not contain clusters with an X-ray flux larger than $F_{\rm lim}$. For comparison, we also plot the averages of the corresponding median estimators (dashed lines).

we do not expect them to affect the lensing signal by the cosmological large-scale structure.

Furthermore, the field was chosen such that no extended X-ray sources from the ROSAT All-Sky Survey (RASS), in particular galaxy clusters, are in the field of view. This is potentially important, since it is known from halo-model calculations that the cosmic shear power spectrum on intermediate and small scales is dominated by group- and cluster-sized haloes. Therefore, the exclusion of X-ray clusters might bias the selection of a suitable line of sight towards under-dense fields. On the other hand, the RASS is quite shallow and thus only contains very luminous or nearby clusters, which have a limited impact on the lensing signal due to their low number or low lensing efficiency. We quantify the importance of this criterion using the halo catalogues of our $N$-body simulations. To each halo, we assign an X-ray luminosity in the energy range from 0.1 to 2.4 keV using the mass-luminosity relation given in Reiprich & Böhringer (2002) and convert this into X-ray flux using the halo redshift. We then compute the average of the $\sigma_8$ estimates from all fields which do not contain a cluster brighter than a certain flux limit. It is difficult to define an exact overall flux limit to describe the CDFS selection, because the RASS is rather heterogeneous. However, it is apparent from Fig. 9 that even a very conservative limit of $10^{-13}$ ergs/sec/cm$^2$ will change the average $\sigma_8$ estimate by at most 3−5%. This bias is therefore most likely not large enough to explain our CDFS result alone.
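A simplified sketch of such a flux cut is given below. It takes the X-ray luminosities as given (the mass-luminosity relation is not reproduced here), converts them to fluxes with the luminosity distance of a flat ΛCDM model, neglects any K-correction, and averages the estimates of the surviving fields. The cosmological parameters, function names and the three toy fields are illustrative assumptions.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s
MPC_CM = 3.0857e24   # one megaparsec in cm

def luminosity_distance_cm(z, omega_m=0.25, h=0.7, steps=1000):
    """Luminosity distance in a flat LCDM model (midpoint-rule integration)."""
    dz = z / steps
    integral = 0.0
    for k in range(steps):
        zm = (k + 0.5) * dz
        integral += dz / math.sqrt(omega_m * (1.0 + zm) ** 3 + 1.0 - omega_m)
    d_comoving_mpc = C_KM_S / (100.0 * h) * integral
    return (1.0 + z) * d_comoving_mpc * MPC_CM

def xray_flux(luminosity, z):
    """Flux in erg/s/cm^2 from a luminosity in erg/s, without K-correction."""
    return luminosity / (4.0 * math.pi * luminosity_distance_cm(z) ** 2)

def mean_sigma8_after_cut(fields, flux_limit):
    """Average estimate over fields containing no halo brighter than flux_limit."""
    kept = [s8 for s8, halos in fields
            if all(xray_flux(lum, z) <= flux_limit for lum, z in halos)]
    return sum(kept) / len(kept) if kept else float("nan")

# Toy fields: (sigma_8 estimate, list of (luminosity [erg/s], redshift) per halo).
fields_toy = [(0.80, [(1e44, 0.1)]), (0.70, [(1e42, 0.5)]), (0.75, [])]
print(mean_sigma8_after_cut(fields_toy, 1e-13), mean_sigma8_after_cut(fields_toy, 1e-10))
```

In this toy setup only the luminous nearby cluster exceeds a limit of $10^{-13}$ erg/s/cm$^2$, so the cut removes a single field.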

Finally, the CDFS candidate should not contain any “relevant NED source”. This is very hard to translate into a quantitative criterion, in particular because our simulations contain only dark matter. We model the effect of imposing this requirement by demanding that there be fewer than $n_{\rm halo}$ group- or cluster-sized haloes ($M > 10^{13}\,M_\odot/h$) in the redshift range from $z = 0$ to $z = 0.5$ in a CDFS candidate. The impact of this criterion on the estimated value of $\hat\sigma_8$ using the ICA- and Gauss-MAP estimators is shown in Fig. 10. As expected, the

[Figure 10: $\sigma_8$ (vertical axis, 0.4 to 1.2) against the number of haloes with $z < 0.5$ and $M > 10^{13}\,M_\odot/h$ (bins: all, <4, <8, <12, <16, <20).]

Fig. 10. Dependence of the ICA-MAP estimator for $\sigma_8$ on the number of group- and cluster-sized haloes $n_{\rm halo}$ between $z = 0$ and $z = 0.5$. For each $n_{\rm halo}$-bin, we summarise the distribution of the corresponding subsample of simulated CDFS fields by giving a box plot: the thick horizontal line in each box denotes the median, the upper and lower box boundaries give the upper and lower quartiles of the distribution of the sample values. The error bars (“whiskers”) extend to the 10% and 90% quantiles, respectively. To visualise the tails of the distributions, the most extreme values are given as points. The width of each box is proportional to the square root of the sample size. For comparison, we also show for each subsample the median of the Gauss-MAP estimators as red crosses. The solid black horizontal line indicates the true value of $\sigma_8$, the black dashed line the ICA-MAP estimate for the CDFS and the red dotted line the Gauss-MAP estimate. The average number of haloes with $M > 10^{13}\,M_\odot$ and $z \le 0.5$ in a CDFS-like field is $\bar n_{\rm halo} = 18.5$.

median $\sigma_8$ is a monotonically increasing function of $n_{\rm halo}$. For fields with fewer than ≈ 12 massive haloes, the probability of obtaining a power spectrum normalisation as low as in the CDFS rises above ≈ 20%. Given that the average number of massive haloes in the specified redshift range is 18.5, it does not seem too unreasonable that fields with fewer than ≈ 12 such haloes could be obtained by selecting “empty” regions in the NED. This is also in qualitative agreement with Phleps et al. (2007), who find that the CDFS is underdense by a factor of ≈ 2 in the redshift range from $z \approx 0.2$ to $z \approx 0.4$.

We estimate the impact of this selection criterion on the estimates of cosmological parameters by treating the number of haloes in the CDFS as a nuisance parameter in the process of parameter estimation. Similar to what we did to obtain Fig. 10, we bin the realisations of the CDFS according to the number of group-sized haloes in the realisations. For each bin, we obtain the mean shear correlation function and its ratio to the mean shear correlation function of all realisations, $r_\pm(\theta, n_{\rm halo}) = \xi_\pm(\theta, n_{\rm halo})/\xi_\pm(\theta)$. The functions $r_+$ and $r_-$ are shown in Fig. 11. The realisations with fewer (more) haloes than the average generally display a smaller (larger) shear correlation function. We fit the ratios in each bin with a double


[Figure 11: $r_+(\theta; n_{\rm halo})$ (upper panel, 0.8 to 1.3) and $r_-(\theta; n_{\rm halo})$ (lower panel, 0.6 to 1.3) against $\theta$ in arcmin (1 to 10, logarithmic).]

Fig. 11. Ratios $r_+$ (upper panel) and $r_-$ (lower panel) of the shear correlation functions in a particular $n_{\rm halo}$-bin to the average correlation function of all realisations. The lowest (solid) curve represents the bin with $n_{\rm halo} \in [0, 4)$, the second lowest the bin with $n_{\rm halo} \in [4, 8)$, and so on. The highest ratio corresponds to the bin with $n_{\rm halo} \ge 28$. The error bars have been estimated from the field-to-field variation.

power law of the form

$$r_\pm(\theta, n_{\rm halo}) = A_\pm(n_{\rm halo})\,\theta^{\alpha_\pm(n_{\rm halo})} + B_\pm(n_{\rm halo})\,\theta^{\beta_\pm(n_{\rm halo})} . \tag{25}$$

For values of $n_{\rm halo}$ which do not coincide with one of the bin centres, the functions $r_\pm$ are obtained by linear interpolation between the fits for the two adjacent bins. With this, we extend our model for the shear correlation function to $m'_\pm(\theta; \pi, n_{\rm halo}) = m_\pm(\theta; \pi)\, r_\pm(\theta, n_{\rm halo})$. In Fig. 12, we show the resulting posterior distributions for $\sigma_8(\Omega_{\rm m} = 0.25)$ and $n_{\rm halo}$, keeping all other cosmological parameters fixed and using a uniform prior for $n_{\rm halo}$. The two-dimensional distribution shows a weak correlation between the two parameters: as expected, a low (high) value of $n_{\rm halo}$ requires a slightly higher (lower) value of $\sigma_8$. The marginalised posterior for $\sigma_8$ is very similar to the one shown in Fig. 7, where the field selection is not taken into account. However, including $n_{\rm halo}$ increases the MAP estimate of $\sigma_8$ by 5% to $\sigma_8 = 0.71^{+0.10}_{-0.15}$ for the ICA likelihood and by 10% to $\sigma_8 = 0.65^{+0.13}_{-0.20}$ for the Gaussian likelihood and the ray-tracing covariance matrix. The marginalised posterior distribution of $n_{\rm halo}$ shows a weak peak at $\hat n_{\rm halo} \approx 13$ (compared to the average of $\bar n_{\rm halo} = 18.5$ for all ray-tracing realisations) in the ICA case and even lower values if the Gaussian likelihood is used. Overall, however, the posterior is very shallow.
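The double power law of Eq. (25), the linear interpolation between adjacent bins and the extended model $m'_\pm = m_\pm\, r_\pm$ can be sketched as follows. The bin centres and fit parameters are made-up placeholders, the function names are hypothetical, and clamping outside the range of bin centres is an assumption of this sketch rather than something stated in the text.

```python
def double_power_law(theta, a, alpha, b, beta):
    """r(theta) = a * theta**alpha + b * theta**beta, the form of Eq. (25)."""
    return a * theta ** alpha + b * theta ** beta

def ratio(theta, n_halo, bin_centres, bin_params):
    """Interpolate linearly in n_halo between the fitted curves of adjacent bins."""
    if n_halo <= bin_centres[0]:    # assumption: clamp below the first centre
        return double_power_law(theta, *bin_params[0])
    if n_halo >= bin_centres[-1]:   # assumption: clamp above the last centre
        return double_power_law(theta, *bin_params[-1])
    for k in range(len(bin_centres) - 1):
        lo, hi = bin_centres[k], bin_centres[k + 1]
        if lo <= n_halo <= hi:
            w = (n_halo - lo) / (hi - lo)
            return ((1.0 - w) * double_power_law(theta, *bin_params[k])
                    + w * double_power_law(theta, *bin_params[k + 1]))

def extended_model(m_value, theta, n_halo, bin_centres, bin_params):
    """m'(theta; pi, n_halo) = m(theta; pi) * r(theta, n_halo)."""
    return m_value * ratio(theta, n_halo, bin_centres, bin_params)

# Placeholder fit parameters (a, alpha, b, beta); not the fitted values of the paper.
centres = [2.0, 6.0, 10.0]
params = [(0.8, 0.0, 0.0, 0.0), (0.9, 0.0, 0.0, 0.0), (1.0, 0.0, 0.05, 0.5)]
print(ratio(4.0, 8.0, centres, params))
```

Treating $n_{\rm halo}$ as a continuous nuisance parameter then only requires evaluating `extended_model` inside the likelihood for each trial value.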

Having corrected for the field selection, we can now recompute the probabilities given in Tab. 2 for drawing the CDFS at random. We find for the ICA-MAP estimate Prob($\sigma_8 < 0.71$) = 9.4% for the original shear correlation functions and Prob($\sigma_8 < 0.71$) = 18.5% for the re-centred ones. For the Gaussian likelihood, we find Prob($\sigma_8 < 0.65$) = 6.0% (original) and Prob($\sigma_8 < 0.65$) = 14.9% (re-centred), respectively.

[Figure 12: upper panel, contours labelled 30%, 68%, 95% and 99% in the plane of $\sigma_8$ (0.3 to 0.9) and $n_{\rm halo}$ (5 to 25); lower panels, posterior density against $\sigma_8$ (left, 0.4 to 1.0) and against $n_{\rm halo}$ (right, 5 to 25).]

Fig. 12. Upper panel: Posterior density for $\sigma_8(\Omega_{\rm m} = 0.25)$ and $n_{\rm halo}$ computed using the ICA likelihood, keeping all other cosmological parameters constant. Lower panels: Marginalised posterior densities of $\sigma_8(\Omega_{\rm m} = 0.25)$ (left panel) and $n_{\rm halo}$ (right panel). Solid black curves show the results from using the ICA likelihood, dashed red lines from the Gaussian likelihood and the ray-tracing covariance.

With this (approximate) treatment of the systematic effects caused by the field selection, we can now put the CDFS in context with the results from the WMAP five-year data. For this, we fit the shear correlation function for $\Omega_{\rm m}$ and $\sigma_8$, marginalising over $h_{100}$ (with a Gaussian prior centred on $h_{100} = 0.7$ and $\sigma_{h_{100}} = 0.07$, as suggested by the Hubble Key Project; Freedman et al. 2001) and $n_{\rm halo}$ with a uniform prior. We use the WMAP Markov chain for a flat ΛCDM model (lcdm+sz+lens; Dunkley et al. 2009; Komatsu et al. 2009), where again we marginalise over all parameters except $\Omega_{\rm m}$ and $\sigma_8$. The resulting posterior distributions for the CDFS only (blue dashed contours), WMAP only (red contours) and the combination of both measurements (thick black contours) are shown in Fig. 13. Clearly, the joint posterior is dominated by the WMAP data; however, the constraints from the CDFS allow us to exclude parameter combinations where both $\Omega_{\rm m}$ and $\sigma_8$ are large. We find the MAP estimates $\Omega_{\rm m} = 0.26^{+0.03}_{-0.02}$ and $\sigma_8 = 0.79^{+0.04}_{-0.03}$ when marginalising over the other parameter.


[Figure 13: horizontal axis $\Omega_{\rm m}$ (0.15 to 0.45), vertical axis $\sigma_8$ (0.4 to 0.9).]

Fig. 13. Posterior density for $\Omega_{\rm m}$ and $\sigma_8$, where we have marginalised over the Hubble constant $h_{100}$ and the number of haloes in the field $n_{\rm halo}$. The dashed blue contours show the 68%, 95% and 99% credible regions resulting from the cosmic shear analysis of the CDFS (using the ICA likelihood), the red contours show the posterior from the WMAP 5-year data (using the flat ΛCDM model). The combined posterior is shown with thick black contours.

Finally, note that the two criteria discussed in this section are not strictly independent. However, it is highly improbable that a single field will contain more than one massive halo above the X-ray flux limit. Therefore, selecting fields without an X-ray-bright cluster prior to performing the steps that lead to Fig. 10 would change the halo numbers that go into the analysis by at most one and would not significantly influence the foregoing discussion.

5. Summary and discussion

In this paper, we have investigated the validity of the approximation of a Gaussian likelihood for the cosmic shear correlation function, which is routinely made in weak lensing studies. We have described a method to estimate the likelihood from a large set of ray-tracing simulations. The algorithm tries to find a new set of (non-orthogonal) basis vectors with respect to which the components of the shear correlation functions become approximately statistically independent. This then allows us to estimate the high-dimensional likelihood as a product of one-dimensional probability distributions. A drawback of this method is that quite a large sample of realistically simulated correlation functions is required to get good results for the tails of the likelihood. However, this should become less problematic in the near future when increasingly large ray-tracing simulations will become available.

We have investigated how the constraints on matter and vacuum energy density, Hubble parameter and power spectrum normalisation depend on the shape of the likelihood for a survey composed of 0.5 deg × 0.5 deg fields and a redshift distribution similar to the CDFS. We find that if the non-Gaussianity of the likelihood is taken into account, the posterior likelihood becomes more sharply peaked and skewed. When fitting only for $\Omega_{\rm m}$ and $\sigma_8$, the maximum of the posterior is shifted towards lower $\Omega_{\rm m}$ and higher $\sigma_8$, and the area of the 68% highest posterior density credible region decreases by about 40% compared to the case of a Gaussian likelihood. For the four-dimensional parameter space, we have conducted a Fisher matrix analysis to obtain lower limits on the errors achievable with a 1500 deg$^2$ survey. As in the two-dimensional case, we find the most important effect to be that the error bars decrease by 10−40% compared to the Gaussian likelihood. Less severe is the slight tilt of the Fisher ellipses when marginalising over two of the four parameters, particularly when $h_{100}$ is involved.

In the second part of this work, we have presented a re-analysis of the CDFS-HST data. Using the non-Gaussian likelihood, we find $\sigma_8 = 0.68^{+0.09}_{-0.16}$ for $\Omega_{\rm m} = 0.25$ (keeping all other parameters fixed to their fiducial values), compared to $\sigma_8 = 0.59^{+0.10}_{-0.19}$ obtained from the Gaussian likelihood with a covariance matrix estimated from the ray-tracing simulations. We have then tried to quantify how (un-)likely it is to randomly select a field with the characteristics of the Chandra Deep Field South with a power spectrum normalisation this low. We have used 9600 ray-tracing realisations of the CDFS to estimate the sampling distribution of the ICA-MAP estimator for $\sigma_8$. For our fiducial, WMAP5-like cosmology, we find that Prob($\sigma_8 \le 0.68$) ≈ 5%, assuming that the location of the CDFS on the sky was chosen randomly. The fact that the CDFS was selected not to contain an extended X-ray source in the ROSAT All-Sky Survey can lead to a bias of the estimated $\sigma_8$ by at most 5%. This is because the clusters excluded by this criterion are rare and mostly at low redshifts, and therefore not very lensing-efficient. The second relevant selection criterion is that the CDFS should not contain any relevant NED source. We model this by selecting only those fields which contain a specific number $n_{\rm halo}$ of group- and cluster-sized haloes. We find that for those realisations for which the number of such haloes is below the average, the estimates of $\sigma_8$ can be biased low by about 5-10%. We include this effect in our likelihood analysis by extending our model shear correlation function by a correction factor depending on $n_{\rm halo}$ and treating $n_{\rm halo}$ as a nuisance parameter. This increases the estimate of $\sigma_8$ by 5% to $\sigma_8 = 0.71^{+0.10}_{-0.15}$ for the ICA likelihood and by 10% to $\sigma_8 = 0.65^{+0.13}_{-0.20}$ for the Gaussian likelihood. This procedure also yields tentative evidence that the number of massive haloes in the CDFS is only ≈ 70% of the average, in qualitative agreement with the findings of Phleps et al. (2007).

Finally, we combine the CDFS cosmic shear results with the constraints on cosmological parameters from the WMAP experiment. We fit for $\Omega_{\rm m}$ and $\sigma_8$, where we marginalise over the Hubble constant and take into account the field selection bias by marginalising also over $n_{\rm halo}$. While the posterior is clearly dominated by the WMAP data, the CDFS still allows us to exclude parts of the parameter space with high values of both $\Omega_{\rm m}$ and $\sigma_8$. Assuming a flat Universe, the MAP estimates for these two parameters are $\Omega_{\rm m} = 0.26^{+0.03}_{-0.02}$ and $\sigma_8 = 0.79^{+0.04}_{-0.03}$.
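Marginalising over a nuisance parameter such as the Hubble constant or $n_{\rm halo}$ can be sketched on a parameter grid as follows. This is only a schematic example under the assumption of a log-posterior tabulated on a regular grid with uniform spacing; the function name `marginalise` is hypothetical and does not refer to the code used for the analysis.

```python
import numpy as np

def marginalise(log_post, axis):
    """Sum a gridded posterior over one axis and normalise the result to unit sum.

    `log_post` is the log-posterior on a regular grid; uniform spacing along
    the marginalised axis is assumed, so the sum stands in for the integral.
    """
    log_post = np.asarray(log_post, dtype=float)
    post = np.exp(log_post - log_post.max())  # subtract the maximum for numerical stability
    marginal = post.sum(axis=axis)
    return marginal / marginal.sum()

# Toy 2x2 grid: rows index the parameter of interest, columns the nuisance parameter.
toy_log_post = np.log(np.array([[1.0, 3.0], [2.0, 2.0]]))
marginal = marginalise(toy_log_post, axis=1)
```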

Acknowledgements. We would like to thank Tim Eifler for useful discussions and the computation of the Gaussian covariance matrix, and Sherry Suyu for careful reading of the manuscript. We thank Richard Massey and William High for providing the STEP2 image simulations. JH acknowledges support by the Deutsche Forschungsgemeinschaft within the Priority Programme 1177 under the project SCHN 342/6 and by the Bonn-Cologne Graduate School of Physics and Astronomy. TS acknowledges financial support from the Netherlands Organization for Scientific Research (NWO) and the German Federal Ministry of Education and Research (BMBF) through the TR33 “The Dark Universe”. We acknowledge the use of the Legacy Archive for Microwave Background Data Analysis (LAMBDA). Support for LAMBDA is provided by the NASA Office of Space Science.

Appendix A: Projection Pursuit Density Estimation

In order to have an independent check of the ICA-based likelihood estimation algorithm, we employ the method of projection pursuit density estimation (PPDE; Friedman et al. 1984). Like our ICA method, PPDE aims to estimate the joint probability density $p(\boldsymbol{x})$ of a random vector $\boldsymbol{x}$, given a set of observations of $\boldsymbol{x}$. As starting point, an initial model $p_0(\boldsymbol{x})$ for the multidimensional probability distribution $p(\boldsymbol{x})$ has to be provided, for which a reasonable choice is e.g. a multivariate Gaussian with a covariance matrix estimated from the data. The method then identifies the direction $\boldsymbol{\theta}_1$ along which the marginalised model distribution differs most from the marginalised density of the data points and corrects for the discrepancy along the direction $\boldsymbol{\theta}_1$ by multiplying $p_0$ with a correction factor. This yields a refined density estimate $p_1(\boldsymbol{x})$, which can be further improved by iteratively applying the outlined procedure.

More formally, the PPDE density estimate is of the form

$$p_M(\boldsymbol{x}) = p_0(\boldsymbol{x}) \prod_{m=1}^{M} f_m(\boldsymbol{\theta}_m \cdot \boldsymbol{x}) , \tag{A.1}$$

where $p_M$ is the estimate after $M$ iterations of the procedure and $p_0$ is the initial model. The univariate functions $f_m$ are multiplicative corrections to the initial model along the directions $\boldsymbol{\theta}_m$. The density estimate can be obtained iteratively using the relation $p_M(\boldsymbol{x}) = p_{M-1}(\boldsymbol{x})\, f_M(\boldsymbol{\theta}_M \cdot \boldsymbol{x})$. At the $M$-th step of the iteration, a direction $\boldsymbol{\theta}_M$ and a function $f_M$ are chosen to minimise the Kullback-Leibler divergence (Kullback & Leibler 1951) between the actual data density $p(\boldsymbol{x})$ and the density estimate $p_M(\boldsymbol{x})$,

$$D_{\rm KL}[p, p_M] = \int \mathrm{d}\boldsymbol{x}\; p(\boldsymbol{x}) \log \frac{p(\boldsymbol{x})}{p_M(\boldsymbol{x})} , \tag{A.2}$$

as a goodness-of-fit measure. The Kullback-Leibler divergence provides a “distance measure” between two probability distribution functions $p$ and $q$, since it is non-negative and zero only if $p \equiv q$, although it is not symmetric. Only the cross term

$$W(\boldsymbol{\theta}_M, f_M) = -\int \mathrm{d}\boldsymbol{x}\; p(\boldsymbol{x}) \log p_M(\boldsymbol{x}) \tag{A.3}$$

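of the K-L divergence is relevant for the minimisation; all other terms do not depend on $\boldsymbol{\theta}_M$ and $f_M$. By using Eq. (A.1), one sees that the minimum of $W$ is attained at the same location as the minimum of

$$w(\boldsymbol{\theta}_M, f_M) = -\int \mathrm{d}\boldsymbol{x}\; p(\boldsymbol{x}) \log f_M(\boldsymbol{\theta}_M \cdot \boldsymbol{x}) , \tag{A.4}$$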

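which is the expectation value of $\log f_M$ with respect to $p(\boldsymbol{x})$. The data density $p(\boldsymbol{x})$ is unknown; however, the data comprise a set of $N$ samples from this distribution. The expectation value of $\log f_M$ can therefore be estimated by

$$\hat{w}(\boldsymbol{\theta}_M, f_M) = -\frac{1}{N} \sum_{i=1}^{N} \log f_M(\boldsymbol{\theta}_M \cdot \boldsymbol{x}_i) . \tag{A.5}$$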

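For fixed $\boldsymbol{\theta}_M$, the minimum of Eq. (A.4) is attained for

$$f_M(\boldsymbol{\theta}_M \cdot \boldsymbol{x}) = \frac{p^{\boldsymbol{\theta}_M}(\boldsymbol{\theta}_M \cdot \boldsymbol{x})}{p^{\boldsymbol{\theta}_M}_{M-1}(\boldsymbol{\theta}_M \cdot \boldsymbol{x})} , \tag{A.6}$$

where $p^{\boldsymbol{\theta}_M}$ and $p^{\boldsymbol{\theta}_M}_{M-1}$ are the marginal densities of the data and of the model density from the $(M-1)$-st iteration along the direction $\boldsymbol{\theta}_M$, respectively. With this, the iterative process that leads to estimates of $\boldsymbol{\theta}_M$ and $f_M$ schematically consists of: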

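– choosing a direction $\boldsymbol{\theta}_M$,

– computing the marginal densities $p^{\boldsymbol{\theta}_M}$ and $p^{\boldsymbol{\theta}_M}_{M-1}$,

– computing $f_M(\boldsymbol{\theta}_M \cdot \boldsymbol{x})$ according to Eq. (A.6),

– computing $\hat{w}(\boldsymbol{\theta}_M, f_M)$,

– choosing a new $\boldsymbol{\theta}_M$ that decreases $\hat{w}$,

– continuing from step 2 until a convergence criterion is fulfilled.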
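The steps listed above can be sketched in a few lines of Python. This is a simplified illustration and not the implementation used for this work: the marginal densities are estimated with SciPy's Gaussian kernel density estimator, and the search for the direction is replaced by a plain random search over trial directions. The function names `correction_factor`, `w_hat` and `best_direction`, the number of trials and the floor on the denominator are all assumptions of this sketch.

```python
import numpy as np
from scipy.stats import gaussian_kde

def correction_factor(data, model_sample, theta):
    """Return the unit direction and f(t) = p_data(t) / p_model(t), cf. Eq. (A.6)."""
    theta = np.asarray(theta, dtype=float)
    theta = theta / np.linalg.norm(theta)
    p_data = gaussian_kde(data @ theta)           # marginal density of the data
    p_model = gaussian_kde(model_sample @ theta)  # marginal density of the current model

    def f(t):
        # Floor the denominator to avoid division by zero far in the tails.
        return p_data(t) / np.maximum(p_model(t), 1e-300)

    return theta, f

def w_hat(data, theta, f):
    """Sample estimate of the cross term, cf. Eq. (A.5)."""
    return -np.mean(np.log(f(data @ theta)))

def best_direction(data, model_sample, n_trials=50, seed=0):
    """Random search over directions; keeps the one with the smallest w_hat."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_trials):
        trial = rng.normal(size=data.shape[1])
        theta, f = correction_factor(data, model_sample, trial)
        w = w_hat(data, theta, f)
        if best is None or w < best[0]:
            best = (w, theta, f)
    return best  # (w_hat value, unit direction, correction function)
```

Here `data` and `model_sample` are arrays of shape (number of samples, dimension). A gradient-free optimiser would normally replace the random search; the latter is chosen here only to keep the sketch short.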

To efficiently compute the marginals $p^{\boldsymbol{\theta}_M}$ and $p^{\boldsymbol{\theta}_M}_{M-1}$, Monte Carlo samples of these densities are used. Note that the data already comprise a sample of $p(\boldsymbol{x})$; a sample of $p_{M-1}$ can be obtained efficiently by an iterative method: since $p_{M-1}$ is similar to $p_{M-2}$, a subset of the sample from $p_{M-1}$ can be obtained by rejection sampling from the sample from the $(M-2)$-nd step. The remaining data vectors are then drawn by rejection sampling from $p_0$. For more technical details of the estimation procedure, we refer the reader to Friedman et al. (1984).
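The first part of this sampling scheme, thinning an existing sample by rejection, may be sketched as follows. The sketch assumes that the previous sample and a vectorised correction function `f` are available; the function name `rejection_resample` is hypothetical, and using the largest sampled value of `f` as the rejection bound is a simplification. Replenishing the sample by rejection sampling from $p_0$ is not shown.

```python
import numpy as np

def rejection_resample(prev_sample, theta, f, rng):
    """Keep each point of a sample from the previous model with probability f / max(f).

    The accepted points follow a density proportional to the previous model
    density times f(theta . x). Using the largest sampled value of f as the
    bound is an assumption; a strict bound would be the supremum of f.
    """
    prev_sample = np.asarray(prev_sample, dtype=float)
    theta = np.asarray(theta, dtype=float)
    weights = np.asarray(f(prev_sample @ theta), dtype=float)
    accept = rng.uniform(size=weights.size) * weights.max() < weights
    return prev_sample[accept]
```

Since the correction factor of a single iteration is typically close to unity where most of the sample lies, a large fraction of the previous sample can be expected to survive, which is what makes the iterative scheme efficient.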

Note that the PPDE technique, although its methodology is very similar to that of our ICA-based procedure, differs in the important point that it does not rely on the assumption that a linear transformation of the data leads to statistical independence of the components of the transformed data vectors. It therefore constitutes a good test of the validity of this approximation.

Appendix B: Fisher matrix of the ICA likelihood

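In this appendix, we give the derivation of Eq. (23). In the general case, the Fisher matrix is given by (e.g. Kendall et al. 1987)

$$F_{\alpha\beta} = \left\langle \frac{\partial \log L}{\partial \pi_\alpha} \frac{\partial \log L}{\partial \pi_\beta} \right\rangle . \tag{B.1}$$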


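In our case, the likelihood depends on cosmological parameters only through the difference between data and model vector, i.e. $\boldsymbol{s} = \boldsymbol{\xi} - \boldsymbol{m}$ (see Eq. 8). This allows us to write (with summation over the repeated index $i$ implied)

\begin{align}
\frac{\partial \log L(\boldsymbol{s}(\boldsymbol{\pi}))}{\partial \pi_\alpha} &= \frac{\partial \log L(\boldsymbol{s})}{\partial s_i} \frac{\partial s_i}{\partial \pi_\alpha} \tag{B.2} \\
&= \frac{\mathrm{d} \log p_{s_i}(s_i)}{\mathrm{d} s_i} \frac{\partial s_i}{\partial \pi_\alpha} , \tag{B.3}
\end{align}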

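where in the last step we have made use of the fact that the likelihood factorises in the ICA basis. The expression for the Fisher matrix can then be written as

$$F_{\alpha\beta} = \sum_{i,j} \left\langle \frac{\mathrm{d} \log p_{s_i}(s_i)}{\mathrm{d} s_i} \frac{\mathrm{d} \log p_{s_j}(s_j)}{\mathrm{d} s_j} \right\rangle \frac{\partial m_i}{\partial \pi_\alpha} \frac{\partial m_j}{\partial \pi_\beta} . \tag{B.4}$$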

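Next, we compute the expectation value on the right hand side and obtain

\begin{align}
F_{\alpha\beta} &= \sum_{i,j} \frac{\partial m_i}{\partial \pi_\alpha} \frac{\partial m_j}{\partial \pi_\beta} \int \mathrm{d}s_i\, \frac{\mathrm{d} p_{s_i}(s_i)}{\mathrm{d} s_i} \int \mathrm{d}s_j\, \frac{\mathrm{d} p_{s_j}(s_j)}{\mathrm{d} s_j} \tag{B.5} \\
&\quad + \sum_{i} \frac{\partial m_i}{\partial \pi_\alpha} \frac{\partial m_i}{\partial \pi_\beta} \int \mathrm{d}s_i\; p_{s_i}(s_i) \left( \frac{\mathrm{d} \log p_{s_i}(s_i)}{\mathrm{d} s_i} \right)^2 . \tag{B.6}
\end{align}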

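The integrals in the first term of the right hand side vanish since the $p_{s_i}$ drop to zero for very large and small values of $s_i$. This leaves us with

$$F_{\alpha\beta} = \sum_{i} \frac{\partial m_i}{\partial \pi_\alpha} \frac{\partial m_i}{\partial \pi_\beta} \int \mathrm{d}s_i\; p_{s_i}(s_i) \left( \frac{\mathrm{d} \log p_{s_i}(s_i)}{\mathrm{d} s_i} \right)^2 . \tag{B.7}$$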
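Eq. (B.7) can be evaluated numerically once the one-dimensional densities are tabulated. The following sketch assumes that each $p_{s_i}$ is given on a grid and that the derivatives of the model vector are already expressed in the ICA basis; it uses NumPy's second-order `np.gradient` and a trapezoidal sum for brevity, not the four-point operator adopted in this work. The function name `ica_fisher` and the argument layout are hypothetical.

```python
import numpy as np

def ica_fisher(dm_dpi, s_grids, p_grids):
    """Fisher matrix of a factorised likelihood, cf. Eq. (B.7).

    dm_dpi  : array of shape (n_components, n_parameters) with the derivatives
              of the model components with respect to the parameters.
    s_grids : list of 1-D grids, one per component.
    p_grids : list of 1-D densities tabulated on the corresponding grids.
    """
    dm_dpi = np.asarray(dm_dpi, dtype=float)
    info = np.empty(dm_dpi.shape[0])
    for i, (s, p) in enumerate(zip(s_grids, p_grids)):
        s = np.asarray(s, dtype=float)
        p = np.asarray(p, dtype=float)
        dlogp = np.gradient(np.log(p), s)                  # d log p / d s
        integrand = p * dlogp**2
        # Trapezoidal rule written out explicitly.
        info[i] = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(s))
    return dm_dpi.T @ (info[:, None] * dm_dpi)

# Check with a single Gaussian component of width sigma = 2,
# for which the integral in Eq. (B.7) equals 1 / sigma**2.
s = np.linspace(-16.0, 16.0, 4001)
p = np.exp(-0.5 * (s / 2.0) ** 2) / (2.0 * np.sqrt(2.0 * np.pi))
F = ica_fisher([[1.0, 2.0]], [s], [p])
```

For a Gaussian $p_{s_i}$ the integral reduces to the inverse variance, so that Eq. (B.7) recovers the familiar Gaussian Fisher matrix with a diagonal covariance.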

The derivatives in Eq. (B.7) can be strongly affected by noise in the estimated $p_{s_i}(s_i)$, in particular in the tails of the distributions. For their numerical computation, we therefore choose the following four-point finite difference operator (Abramowitz & Stegun 1964):

$$\frac{\mathrm{d}p}{\mathrm{d}s} = \frac{p(s-2h) - 8p(s-h) + 8p(s+h) - p(s+2h)}{12h} + \mathcal{O}(h^4) , \tag{B.8}$$

which we find to be more stable against this problem than its more commonly used two-point counterpart. Because of this potential difficulty, we cross-check our results with the alternative method provided by Eq. (22). This method is significantly slower, but numerically simpler. This is because the derivatives of the log-likelihood in Eq. (22) are on average computed close to the maximum-likelihood point, where the likelihood estimate is well sampled. Reassuringly, we find excellent agreement between the two methods. Finally, we have investigated the influence of the choice of the kernel function $K$ in Eq. (10), which might affect the computation of the numerical derivatives. Our results prove to be stable against variation of $K$, provided that we choose a differentiable kernel function.
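A minimal sketch of the four-point operator of Eq. (B.8), next to the two-point central difference, is given below. The function names are hypothetical, and the smooth test function is only meant to show the difference in truncation error; the behaviour for noisy density estimates, which motivates the choice made here, is not modelled.

```python
import numpy as np

def four_point_derivative(p, s, h):
    """Four-point central difference of p at s with step h, cf. Eq. (B.8)."""
    return (p(s - 2 * h) - 8 * p(s - h) + 8 * p(s + h) - p(s + 2 * h)) / (12 * h)

def two_point_derivative(p, s, h):
    """Standard two-point central difference, for comparison."""
    return (p(s + h) - p(s - h)) / (2 * h)

# Compare both operators on sin(s), whose exact derivative is cos(s).
s0, h = 0.3, 0.1
err_four = abs(four_point_derivative(np.sin, s0, h) - np.cos(s0))
err_two = abs(two_point_derivative(np.sin, s0, h) - np.cos(s0))
```

For smooth functions, the four-point operator is exact for polynomials up to fourth order, whereas the two-point operator is exact only up to second order.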

Appendix C: Further conclusions for our KSB+ pipeline from the STEP simulations

In this appendix we assume that the reader is familiar with basic KSB notation. For a short introduction and a summary of differences between various implementations see Heymans et al. (2006).

Within the Shear TEsting Programme² (STEP), simulated images containing sheared galaxies are analysed in blind tests, in order to test the shear measurement accuracy of weak lensing pipelines. In these analyses the shear recovery accuracy has been quantified in terms of a multiplicative calibration bias $m$ and additive PSF residuals $c$. From the analysis of the first set of simulations (STEP1, Heymans et al. 2006), which mimic simplified ground-based observations, we find that our KSB+ implementation significantly under-estimates gravitational shear on average if no calibration correction is applied. After the elimination of selection and weighting-related effects this shear calibration bias amounts to a relative under-estimation of $m = -9\%$. According to our testing the largest contribution to this bias originates from the inversion of the $P^g$-tensor, which describes the response of galaxy ellipticity to gravitational shear. While a full-tensor inversion reduces this bias, it strongly increases the measurement noise (see also Erben et al. 2001) and dependence on galaxy selection criteria. We therefore decided to stick to the commonly applied approximation of $(P^g)^{-1} = 2/\mathrm{Tr}[P^g]$, which we measure from individual galaxies, and correct the shear estimate using a multiplicative calibration factor of $c_{\rm cal} = 1.10$ in the S07 analysis. This average calibration correction was found to be stable to the $\sim 2\%$ level between different STEP1 simulation subsets. However, note that the bias depends on the details of the KSB implementation, which might explain some of the scatter between the results for different KSB codes in STEP1. In particular, we identified a strong dependence on the choice of the Gaussian filter scale $r_g$, which is used in the computation of the KSB brightness moments. For example, changing from our default $r_g = 1.0\, r_f$, where $r_f$ is the flux radius as measured by SExtractor (Bertin & Arnouts 1996), to $r_g = 0.7\, r_f$ worsens the bias to $m = -17\%$.
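The relation between the multiplicative bias and the calibration factor can be illustrated with a short sketch. It assumes the STEP parametrisation in which the measured shear is $(1 + m)$ times the true shear plus an additive term, so that a correction factor of $1/(1 + m)$ removes the multiplicative bias on average. The function name `calibration_factor` is hypothetical, and the sketch does not claim to reproduce how $c_{\rm cal}$ was actually determined.

```python
def calibration_factor(m):
    """Multiplicative correction that cancels a calibration bias m,
    assuming measured shear = (1 + m) * true shear."""
    return 1.0 / (1.0 + m)

# A bias of m = -9% corresponds to a correction factor close to 1.10.
c_cal = calibration_factor(-0.09)
```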

The average calibration correction likewise proved to be robust for the second set of image simulations (STEP2, Massey et al. 2007a), which also mimics ground-based data but takes into account more complex PSFs and galaxy morphologies by applying the shapelets formalism (Massey et al. 2004). Yet, the STEP2 analysis revealed a significant magnitude dependence of the shear recovery accuracy for our implementation, with a strong deterioration at faint magnitudes. In this analysis we applied the same signal-to-noise cut S/N > 4.0 as in STEP1 (KSB S/N as defined in Erben et al. 2001), where we however ignored the strong noise correlations present in the STEP2 data, which were added to mimic the influence of drizzle.

In the case of uncorrelated noise the dispersion of the sum over the pixel values of $N$ pixels scales as

$$\sigma_N = \sqrt{N}\, \sigma_1 , \tag{C.1}$$

where $\sigma_1$ is the dispersion computed from single pixel values. Drizzling, or convolution in the case of the STEP2 simulations, reduces $\sigma_1$ but introduces correlations between neighbouring pixels. The signal-to-noise of an object is usually defined as the ratio of the summed object flux convolved with some window or weight function, divided by an rms estimate for the noise in an equal area convolved with the same weight function. If the noise estimate is computed from $\sigma_1$ and scaled according to Eq. (C.1), the correlations are neglected and the noise estimate is too small.

Fig. C.1. Estimate of the effective influence of the noise correlations in the STEP2 simulations: Plotted is the ratio of the pixel value dispersion $\sigma_N^{\rm measure}$ measured from large areas of $N = M^2$ pixels to the estimate from the normal single pixel dispersion $\sqrt{N}\, \sigma_1^{\rm measure}$ as a function of $M$, determined from an object-free STEP2 image. In the absence of noise correlations $r = 1$ for all $M$. The value $r \simeq 2.8$ for $M \to \infty$ gives the factor by which the signal-to-noise is over-estimated when measured from the single pixel dispersion $\sigma_1^{\rm measure}$ ignoring the correlations.

² http://www.physics.ubc.ca/~heymans/step.html

In order to estimate the effective influence of the noise correlations in STEP2, we use a pure noise image which was provided together with the simulated images. We compute the rms of the pixel sum $\sigma_N^{\rm measure}$ in independent square sub-regions of the image with side length $M = \sqrt{N}$ and determine the ratio

$$r = \frac{\sigma_N^{\rm measure}}{\sqrt{N}\, \sigma_1^{\rm measure}} , \tag{C.2}$$

which in the absence of correlated noise would be equal to 1 for all $N$. In the presence of noise correlations it will, for large $N$, converge to the factor by which $\sigma_1^{\rm measure}$ under-estimates the uncorrelated $\sigma_1$. This can be understood because drizzling or convolution typically re-distributes pixel flux within a relatively small area. As soon as this kernel is much smaller than the area spanned by $M^2$ pixels, the correlations become unimportant for the area pixel sum. The measured $r(M)$ is plotted in Fig. C.1. Extrapolating to $M \to \infty$ we estimate that ordinary noise measures based on the single pixel dispersion, which ignore the noise correlation, will over-estimate the signal-to-noise of objects by a factor $r \simeq 2.8$. Hence, our original selection criterion S/N > 4.0 corresponds to a very low true cut ${\rm S/N}_{\rm true} \gtrsim 1.4$, including much noisier objects than in STEP1.
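The measurement of $r(M)$ described above can be sketched as follows. The example uses synthetic noise rather than the STEP2 noise image: white Gaussian noise is smoothed with a 3×3 box kernel to introduce correlations. The function name `noise_ratio`, the image size and the kernel are assumptions of this sketch and are not meant to reproduce the STEP2 noise properties.

```python
import numpy as np

def noise_ratio(image, M):
    """Ratio r of Eq. (C.2) for square sub-regions of side length M (N = M*M pixels)."""
    ny, nx = image.shape
    ny, nx = (ny // M) * M, (nx // M) * M           # trim to a multiple of M
    blocks = image[:ny, :nx].reshape(ny // M, M, nx // M, M)
    sums = blocks.sum(axis=(1, 3))                  # pixel sum in each sub-region
    return sums.std() / (M * image.std())           # sqrt(N) = M

rng = np.random.default_rng(0)
white = rng.normal(size=(512, 512))                 # uncorrelated noise
# Correlated noise: average over a 3x3 neighbourhood (periodic boundaries).
smooth = sum(np.roll(white, (dy, dx), axis=(0, 1))
             for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9.0

r_white = noise_ratio(white, 8)     # expected to be close to 1
r_smooth = noise_ratio(smooth, 16)  # expected to exceed 1 because of the correlations
```

For a normalised smoothing kernel with weights $w_k$, the ratio approaches $1/\sqrt{\sum_k w_k^2}$ for sub-regions much larger than the kernel, which equals 3 for the 3×3 box used in this example. A signal-to-noise computed from the single pixel dispersion would then have to be divided by this factor.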

We plot the dependence of our STEP2 shear estimate on the (uncorrected) S/N in Fig. C.2. For ${\rm S/N} \lesssim 7$, corresponding to ${\rm S/N}_{\rm true} \lesssim 2.5$, a significant deterioration of the shear signal occurs, with a mean calibration bias $\langle m \rangle \sim -10\%$ and a large scatter between the different PSF models. We conclude that this approximately marks the limit down to which our KSB+ implementation can reliably measure shear. If we apply a modified cut S/N > 7.0 to the STEP2 galaxies, the resulting magnitude and size dependence of the shear calibration bias is $\lesssim \pm 5\%$ (top panels in Fig. C.3). The remaining galaxies are best corrected with a slightly reduced calibration factor $c_{\rm cal} = 1.08$, which we apply for the plots shown in this appendix and the updated shear analysis presented in this paper. The difference between the calibration corrections derived from STEP1 and STEP2 agrees with the estimated $\sim 2\%$ accuracy. Note that the error increases for the highly elliptical PSFs D and E ($e^* \simeq 12\%$) in STEP2, for which in addition significant PSF anisotropy residuals occur (bottom panels in Fig. C.3). This should however not affect our analysis given that typical ACS PSF ellipticities rarely exceed $e^* \simeq 7\%$, see e.g. S07.

Fig. C.2. Calibration bias $m$ as a function of the uncorrected KSB signal-to-noise S/N for the TS analysis of the STEP2 simulations. Thin solid (dashed) lines show $\gamma_1$ ($\gamma_2$) estimates for individual PSFs, where we show individual error-bars only for one PSF for clarity. The bold solid line and error-bars show the mean and standard deviation of the individual PSF estimates and shear components. Note the deterioration of the shear estimate for the STEP2 galaxies with ${\rm S/N} \lesssim 7$ (${\rm S/N}_{\rm true} \lesssim 2.5$). For this plot an adapted calibration correction of 1.08 was applied.

Fig. C.3. Calibration bias $m$ and PSF residuals $c$ as a function of input galaxy magnitude and size for our refined analysis of the STEP2 data. Thin solid (dashed) lines show $\gamma_1$ ($\gamma_2$) estimates for individual PSFs, where we include individual error-bars only for one PSF for clarity. Bold solid lines and error-bars show the mean and standard deviation of the individual PSF estimates and shear components. In this plot only galaxies with S/N > 7 (${\rm S/N}_{\rm true} > 2.5$) are taken into account, which strongly reduces the deterioration found in Massey et al. (2007a) for faint magnitudes. For this plot an adapted calibration correction of 1.08 was applied.

References

Abramowitz, M. & Stegun, I. A. 1964, Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, ninth edn. (New York: Dover)

Benjamin, J., Heymans, C., Semboloni, E., et al. 2007, MNRAS, 381, 702

Bernstein, G. M. & Jarvis, M. 2002, AJ, 123, 583

Bertin, E. & Arnouts, S. 1996, A&AS, 117, 393

Chiu, K.-C., Liu, Z.-Y., & Xu, L. 2003, in Proc. 4th International Symposium on Independent Component Analysis and Blind Signal Separation (ICA2003), Nara, Japan, 751–756

Comon, P., Jutten, C., & Hérault, J. 1991, Signal Processing, 24, 11

Cooray, A. & Hu, W. 2001, ApJ, 554, 56

Davison, A. C. 2003, Statistical Models, Cambridge Series in Statistical and Probabilistic Mathematics (Cambridge University Press)

Dunkley, J., Komatsu, E., Nolta, M. R., et al. 2009, ApJS, 180, 306

Eifler, T., Schneider, P., & Hartlap, J. 2008, astro-ph/0810.4254

Erben, T., van Waerbeke, L., Bertin, E., Mellier, Y., & Schneider, P. 2001, A&A, 366, 717

Freedman, W. L., Madore, B. F., Gibson, B. K., et al. 2001, ApJ, 553, 47

Friedman, J., Stuetzle, W., & Schroeder, A. 1984, Journal of the American Statistical Association, 79, 599

Fu, L., Semboloni, E., Hoekstra, H., et al. 2008, A&A, 479, 9

Gelman, A., Carlin, J. B., Stern, H., & Rubin, D. B. 2004, Bayesian Data Analysis (Chapman & Hall/CRC)

Giacconi, R., Rosati, P., Tozzi, P., et al. 2001, ApJ, 551, 624

Giavalisco, M., Ferguson, H. C., Koekemoer, A. M., et al. 2004, ApJ, 600, L93

Grazian, A., Fontana, A., de Santis, C., et al. 2006, A&A, 449, 951

Hartlap, J., Simon, P., & Schneider, P. 2007, A&A, 464, 399

Hastie, T., Tibshirani, R., & Friedman, J. 2001, The Elements of Statistical Learning (Springer)

Heymans, C., Brown, M. L., Barden, M., et al. 2005, MNRAS, 361, 160

Heymans, C., Van Waerbeke, L., Bacon, D., et al. 2006, MNRAS, 368, 1323

Hilbert, S., Hartlap, J., White, S. D. M., & Schneider, P. 2009, A&A, 499, 31

Hoekstra, H., Franx, M., Kuijken, K., & Squires, G. 1998, ApJ, 504, 636

Hoekstra, H., Mellier, Y., van Waerbeke, L., et al. 2006, ApJ, 647, 116

Hyvärinen, A., Karhunen, J., & Oja, E. 2001, Independent Component Analysis (Wiley Interscience)

Hyvärinen, A. & Oja, E. 1997, Neural Computation, 9(7), 1438

Hyvärinen, A. & Oja, E. 2000, Neural Networks, 13(4-5), 411

Jain, B., Seljak, U., & White, S. 2000, ApJ, 530, 547

Joachimi, B., Schneider, P., & Eifler, T. 2008, A&A, 477, 43

Jutten, C. & Hérault, J. 1991, Signal Processing, 24, 1

Kaiser, N. & Pan-STARRS Collaboration. 2005, in Bulletin of the American Astronomical Society, Vol. 37, 465

Kaiser, N., Squires, G., & Broadhurst, T. 1995, ApJ, 449, 460

Kendall, M. G., Stuart, A., & Ord, J. K., eds. 1987, Kendall’s advanced theory of statistics (New York, NY, USA: Oxford University Press, Inc.)

Komatsu, E., Dunkley, J., Nolta, M. R., et al. 2009, ApJS, 180, 330

Kuijken, K. 2006, A&A, 456, 827

Kullback, S. & Leibler, R. A. 1951, Annals of Mathematical Statistics, 22, 79

Liang, J.-J., Fang, K.-T., Hickernell, F. J., & Li, R. 2001, Math. Comput., 70, 337

Luppino, G. A. & Kaiser, N. 1997, ApJ, 475, 20

Massey, R., Heymans, C., Berge, J., et al. 2007a, MNRAS, 376, 13

Massey, R., Refregier, A., Conselice, C. J., David, J., & Bacon, J. 2004, MNRAS, 348, 214

Massey, R., Rhodes, J., Leauthaud, A., et al. 2007b, ApJS, 172, 239

Miller, L., Kitching, T. D., Heymans, C., Heavens, A. F., & van Waerbeke, L. 2007, MNRAS, 382, 315

Peacock, J. A. & Dodds, S. J. 1996, MNRAS, 280, 19

Phleps, S., Wolf, C., Peacock, J. A., Meisenheimer, K., & van Kampen, E. 2007, in Astronomical Society of the Pacific Conference Series, Vol. 379, Cosmic Frontiers, ed. N. Metcalfe & T. Shanks, 327

Press, W. et al. 1992, Numerical Recipes in C (Cambridge University Press)

R Development Core Team. 2007, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria

Refregier, A. & Bacon, D. 2003, MNRAS, 338, 48

Reiprich, T. H. & Böhringer, H. 2002, ApJ, 567, 716

Rix, H.-W., Barden, M., Beckwith, S. V. W., et al. 2004, ApJS, 152, 163

Schneider, P. 2006, in Saas-Fee Advanced Course 33: Gravitational Lensing: Strong, Weak and Micro, ed. G. Meylan, P. Jetzer, P. North, P. Schneider, C. S. Kochanek, & J. Wambsganss, 269–451

Schneider, P., van Waerbeke, L., Kilbinger, M., & Mellier, Y. 2002, A&A, 396, 1

Schrabback, T., Erben, T., Simon, P., et al. 2007, A&A, 468, 823

Scoccimarro, R., Zaldarriaga, M., & Hui, L. 1999, ApJ, 527, 1


Scott, D. W. 1992, Multivariate Density Estimation: Theory, Practice, and Visualization (New York: John Wiley & Sons)

Semboloni, E., Mellier, Y., van Waerbeke, L., et al. 2006, A&A, 452, 51

Semboloni, E., Tereno, I., van Waerbeke, L., & Heymans, C. 2008, astro-ph/0812.1881

Semboloni, E., van Waerbeke, L., Heymans, C., et al. 2007, MNRAS, 375, 6

Silverman, B. W. 1986, Density Estimation (London: Chapman and Hall)

Smail, I., Hogg, D. W., Yan, L., & Cohen, J. G. 1995, ApJ, 449, 105

Smith, R. E., Peacock, J. A., Jenkins, A., et al. 2003, MNRAS, 341, 1311

Springel, V. 2005, MNRAS, 364, 1105

Springel, V., White, S. D. M., Jenkins, A., et al. 2005, Nature, 435, 629

Takada, M. & Jain, B. 2009, MNRAS, 395, 2065

Venables, W. & Ripley, B. 2002, Modern Applied Statistics with S (Springer)

Wolf, C., Meisenheimer, K., Kleinheinrich, M., et al. 2004, A&A, 421, 913
