
MNRAS 000, 1–11 (2021) Preprint 15 February 2021 Compiled using MNRAS LATEX style file v3.0

Application of Genetic Algorithm to Estimate the Large Angular Scale Features of Cosmic Microwave Background

Parth Nayak1 and Rajib Saha1

1Department of Physics, Indian Institute of Science Education and Research (IISER) Bhopal, 462066, India

Accepted XXX. Received YYY; in original form ZZZ

ABSTRACT

Genetic Algorithm (GA) – motivated by the natural evolutionary process – is a robust method to estimate the optimal solutions of problems involving one or more objective functions. In this article, for the first time, we apply GA to reconstruct the cleaned CMB temperature anisotropy map over large angular scales of the sky using (internal) linear combination (ILC) of observations from the final-year WMAP and Planck satellite missions. To avoid getting trapped in a local minimum, we implement the GA with generous diversity in the populations. This is achieved by introducing a small but significant amount of mutation of genes during crossover and selecting pairs with diverse fitness coefficients. We find that the new GA-ILC method produces a cleaned CMB map which agrees very well with the CMB map obtained using the exact and analytical expression of weights in the ILC method. By performing extensive Monte Carlo simulations of the CMB reconstruction using the GA-ILC algorithm we find that residual foregrounds in the cleaned map are minimal and mostly tend to occupy localized regions along the central galactic plane. The CMB angular power spectrum shows no indication of any bias in the entire multipole range 2 ≤ ℓ ≤ 32 studied in this work. The error in the CMB angular power spectrum is minimal as well and given entirely by the cosmic variance induced error. Our results agree well with those obtained by various other reconstruction methods by different research groups. This problem-independent robust GA-ILC method provides a flexible way towards the complex and challenging task of CMB component reconstruction in cosmology.

Key words: cosmic microwave background – genetic algorithm – methods: data analysis.

1 INTRODUCTION

The cosmic microwave background (CMB) angular power spectrum is one of the most crucial probes of the cosmological parameters and the dynamical history of the universe. The thermal radiation is observed by various satellite missions such as WMAP (Bennett et al. 2013a) and Planck (Planck Collaboration et al. 2020a). This primordial signal, however, is largely contaminated by emissions from within the Milky Way as well as from outside it. These sources of contamination are collectively called the “foreground” sources since we are concerned with the cosmic microwave background. For instance, the hot interstellar dust inside the galaxy emits strongly at frequencies ≳ 70 GHz. This is referred to as the dust emission and is one kind of foreground. Other kinds of foreground are synchrotron and free-free emissions from galactic and extragalactic sources (Ichiki 2014; Bennett et al. 2003a; Bouchet & Gispert 1999; Hinshaw et al. 2003). Hence, an accurate estimation of the angular power spectrum of pure CMB from the foreground-contaminated observations is crucial for precision cosmology.

Since the onset of scientific observations of the CMB, extensive research has been done to come up with ways of effective foreground removal and accurate CMB retrieval. As a consequence, several methods exist in the literature for getting rid of the foreground. However, most of them make use of the underlying spectral and spatial models of the foreground (Hinshaw et al. 2003; Bennett et al. 2003b; Hinshaw et al. 2007). For example, a Gibbs sampling approach (Geman & Geman 1984) is proposed and implemented to separate CMB and foreground components by Eriksen et al. (2004a, 2007); Eriksen et al. (2008a,b); and a maximum likelihood approach to reconstruct CMB and foreground components using prior noise covariance information and foreground models is implemented by Eriksen et al. (2006); Gold et al. (2011). However, they may have uncertainties arising due to incomplete information of the foreground frequency and spatial dependence (Dodelson 1997; Tegmark 1998). A completely model-independent method is the internal linear combination (ILC) of multifrequency observations (Tegmark & Efstathiou 1996). Sudevan & Saha (2018a) explore the CMB posterior using a Gibbs sampling approach within a global ILC framework. Sudevan & Saha (2018c) study the method in light of the theoretical CMB covariance information. All these works have been carried out considering the Gaussian nature of the primordial anisotropy signal; any non-Gaussianities are small fluctuations (Allen et al. 1987; Falk et al. 1992; Gangui et al. 1994; Acquaviva et al. 2003; Maldacena 2003). Saha (2011) considers the highly non-Gaussian nature of the foreground emissions and proposes an ILC method to get rid of the non-Gaussian contamination based on kurtosis minimization. Other notable studies of the ILC method include, but are not limited to, Tegmark et al. (2003); Bennett et al. (2003b); Eriksen et al. (2004b); Saha et al. (2006, 2008a). Some of the previous works also suggest that, depending on the context of the underlying problem, an exact analytical ILC solution may not be possible to find.

© 2021 The Authors

arXiv:2102.06569v1 [astro-ph.CO] 12 Feb 2021

We propose an expression-independent, numerical ILC technique for a robust and effective CMB foreground reconstruction. In this work, we apply the novel numerical method over the global ILC proposed by Sudevan & Saha (2018c), making use of the Planck mission’s theoretical CMB angular power spectrum (Planck Collaboration et al. 2020c) in our computation of the reduced variance of the output map. We apply the genetic algorithm (GA) for estimating optimal ILC weights to produce a clean map. An evolutionary algorithm such as this involves several inherent parameters that need to be studied extensively for such an implementation. For our study of the method over the large angular scales, we make use of the low-resolution WMAP and Planck observations as input at HEALPix¹ $N_{\rm side} = 16$. We smooth the input maps by a Gaussian beam at 9° FWHM. Presumably, this also reduces the pixel-uncorrelated noise level (which is dominant on smaller angular scales).

We perform detailed Monte Carlo simulations to ensure the statistical sanity of our method. We find that this novel GA-ILC method produces a clean map with a minimal residual foreground. The residual contamination tends to occupy small localized regions close to the galactic plane. The angular power spectrum of the clean map contains no apparent bias in the multipole range of 2 ≤ ℓ ≤ 32. The reconstruction errors in the angular power spectrum are also minimal and conform to the cosmic variance-induced errors.

The organization of this paper is as follows. We describe the basic formalism of the method in section 2. We elaborate on the exact implementation of this method in section 3. We present our trial GA implementation for validation in section 4. We discuss the GA-ILC implemented with Monte Carlo simulations in section 5. We present the results obtained by using the techniques of this work on WMAP and Planck final-year data and discuss our findings in section 6. We conclude our paper by outlining some key aspects of this work and its future prospects in section 7.

2 FORMALISM

Suppose there are N channels of different frequencies, each producing a temperature anisotropy map (hereinafter simply, “map”) upon observation. A map observed from the ith frequency channel is denoted by the vector $\Delta\mathbf{T}_i$ in the HEALPix pixelated format. The size of this vector is $p \equiv N_{\rm pix} = 12 N_{\rm side}^2$.

¹ Hierarchical Equal Area isoLatitude Pixelation of sky, e.g., see Gorski et al. (2005)

Since a map is defined over a sphere, it can be expanded in the so-called “harmonic space” as

$$\Delta T_i(n) = \sum_{\ell m} a^{i}_{\ell m} Y_{\ell m}(n), \tag{1}$$

where n denotes the pixel index (corresponding to some direction vector $\hat{n}$), $Y_{\ell m}$ are the standard spherical harmonics, and $a^{i}_{\ell m}$ are called the harmonic coefficients of the ith map. The multipole index ℓ is of special importance because it indicates the underlying angular scale ($\theta_\ell \sim 2\pi/\ell$). The cross-correlation of the harmonic coefficients of maps i and j is also called the cross power spectrum, defined by

$$C^{ij}_{\ell} = \sum_{m=-\ell}^{\ell} \frac{a^{i*}_{\ell m}\, a^{j}_{\ell m}}{2\ell + 1}. \tag{2}$$

The auto-correlation of the harmonic coefficients of the ith map is called its angular power spectrum, $C^{i}_{\ell}$. The CMB angular power spectrum is the single most important quantity we would like to accurately estimate by finding an optimal clean CMB map from multifrequency observations. All observed maps consist primarily of CMB and foreground. The various foreground emissions are not thermalized; hence, they do not have a blackbody spectrum, unlike the CMB, which follows the thermal blackbody distribution with great accuracy (Mather et al. 1994). Hence, across different frequency bands, the thermodynamic temperature of the foreground patches varies whereas the CMB temperature is constant. ILC exploits this fact as a benefit of multifrequency observations.

Apart from foreground, detector noise is also one of the components that contaminate the observed CMB signal. Although a small amount of noise is present on all angular scales, it is dominant only on the smaller angular scales of the data, which are irrelevant for this work. Furthermore, noise from any two frequency channels is uncorrelated (Hinshaw et al. 2003, 2007; Jarosik et al. 2003, 2007), which works further to the advantage of ILC.

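An observed CMB map in the ith frequency channel can thus be expressed as

$$\Delta\mathbf{T}_i = \Delta\mathbf{T}_{\rm CMB} + \Delta\mathbf{T}_{{\rm fg},i} + \Delta\mathbf{T}_{{\rm noise},i}. \tag{3}$$

Notice that the CMB part, $\Delta\mathbf{T}_{\rm CMB}$, does not contain an i index since the CMB temperature is independent of frequency.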

2.1 Internal Linear Combination

Tegmark & Efstathiou (1996) proposed a model-independent method to estimate the clean CMB signal from a multifrequency observation. ILC attempts to find the optimal estimate of CMB from multifrequency foreground-contaminated maps without using any external templates or models. In this technique, we start by writing a linear combination of N multifrequency input maps to find a clean output map. This is simply

$$\mathbf{P} \equiv \Delta\mathbf{T}_{\rm clean}(\{w_i\}) = \sum_{i=1}^{N} w_i\, \Delta\mathbf{T}_i, \tag{4}$$

where $\{w_i\}$ are the coefficients, more commonly called the weights, given to all the input maps before summing them.


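Using Eqn. (3), this yields

$$\mathbf{P} = \left(\sum_{i=1}^{N} w_i\right) \Delta\mathbf{T}_{\rm CMB} + \sum_{i=1}^{N} w_i\, \Delta\mathbf{T}_{{\rm trash},i}, \tag{5}$$

where $\Delta\mathbf{T}_{{\rm trash},i} = \Delta\mathbf{T}_{{\rm fg},i} + \Delta\mathbf{T}_{{\rm noise},i}$. In order to preserve the norm of the clean CMB map, a constraint is imposed in the form

$$\sum_{i=1}^{N} w_i = 1 \quad \text{or} \quad \mathbf{e}^{T}\mathbf{w} = 1, \tag{6}$$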

where $\mathbf{e} = (1, 1, \ldots, 1)^{T}$ is the column vector of size N with all entries 1. In the usual ILC method, the best estimate of CMB is obtained by minimizing the variance of the clean map, $\mathbf{P}^{T}\mathbf{P}$, w.r.t. the choice of the weight vector, $\mathbf{w}$. This is a problem of multivariable constrained optimization and the analytical solution can be found by using the method of Lagrange multipliers (e.g., see Tegmark & Efstathiou 1996; Saha et al. 2008b) as

$$\mathbf{w}_{\rm usual} = \frac{\mathbf{e}^{T}\mathbf{C}^{-1}}{\mathbf{e}^{T}\mathbf{C}^{-1}\mathbf{e}}, \tag{7}$$

or, following summation notation,

$$w_{{\rm usual},i} = \frac{\sum_{j=1}^{N} C^{-1}_{ij}}{\sum_{i',j'=1}^{N} C^{-1}_{i'j'}}, \tag{8}$$

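where $\mathbf{C}$ is the N × N square symmetric covariance matrix of the N input maps. In case the covariance matrix is singular, it is possible to generalize the inverse as the Moore-Penrose pseudoinverse (Moore 1920; Penrose 1956), $\mathbf{C}^{-1} \mapsto \mathbf{C}^{\dagger}$. If $\mathbf{C}$ is not singular, $\mathbf{C}^{\dagger} = \mathbf{C}^{-1}$. Rewriting Eqn. (7) in terms of the pseudoinverse,

$$\mathbf{w}_{\rm usual} = \frac{\mathbf{e}^{T}\mathbf{C}^{\dagger}}{\mathbf{e}^{T}\mathbf{C}^{\dagger}\mathbf{e}}. \tag{9}$$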
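As a numerical illustration of Eqn. (9), the following minimal sketch computes the usual ILC weights with NumPy. It assumes the N input maps are stacked as the rows of an array; the function name `ilc_weights` and the synthetic data are hypothetical and serve only to show the algebra.

```python
import numpy as np

def ilc_weights(maps):
    """Usual ILC weights, w = C^+ e / (e^T C^+ e), for maps of shape (N, n_pix)."""
    cov = np.cov(maps)                  # N x N covariance matrix of the input maps
    cov_pinv = np.linalg.pinv(cov)      # Moore-Penrose pseudoinverse
    e = np.ones(maps.shape[0])
    # C is symmetric, so C^+ e carries the same entries as the row vector e^T C^+.
    return cov_pinv @ e / (e @ cov_pinv @ e)

# Synthetic example (assumed data): a common signal plus channel-dependent contamination.
rng = np.random.default_rng(0)
n_pix = 12 * 16 ** 2                                # N_side = 16
cmb = rng.normal(size=n_pix)
scales = np.array([3.0, 1.5, 0.5, 1.0])             # hypothetical contamination levels
maps = cmb + scales[:, None] * rng.normal(size=(4, n_pix))

w = ilc_weights(maps)
clean = w @ maps                                    # linear combination of Eqn. (4)
```

By construction the weights sum to one, and the variance of `clean` cannot exceed that of any single input map, since giving unit weight to one channel is itself an allowed combination.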

Sudevan & Saha (2018c) (SS, for short) proposed a new, global ILC method making use of the prior knowledge of the theoretical covariance of CMB. The method employs this additional information to assert that the covariance structure of the final clean map is consistent with the expectation of pure CMB. Instead of minimizing the variance of the clean map, this global ILC seeks to minimize the reduced variance, called $\sigma^2$, calculated as

$$\sigma^2 = \mathbf{P}^{T}\mathbf{C}_{\rm th}^{\dagger}\mathbf{P}, \tag{10}$$

where $\mathbf{C}_{\rm th}$ is the theoretical CMB pixel-pixel covariance matrix of size p × p and † represents the pseudoinverse. From the definition of $\mathbf{P}$ in Eqn. (4), we can write

$$\sigma^2 = \mathbf{w}^{T}\mathbf{A}\mathbf{w}, \tag{11}$$

where $\mathbf{A}$ is an N × N square matrix with elements

$$A_{ij} = (\Delta\mathbf{T}_i)^{T}\, \mathbf{C}_{\rm th}^{\dagger}\, \Delta\mathbf{T}_j. \tag{12}$$

Similarly to the usual ILC, using the Lagrange multipliers with the same constraint, the optimal weights under this global ILC can be found as

$$\mathbf{w}_{\rm SS} = \frac{\mathbf{e}^{T}\mathbf{A}^{\dagger}}{\mathbf{e}^{T}\mathbf{A}^{\dagger}\mathbf{e}}. \tag{13}$$

Figure 1. The flowchart of a typical GA. The sorting is introduced in order to ensure better readability and presentability of the solutions in a population. [Flowchart boxes: Start; Randomly initiate a population; Fitness Calculation; Sort; Parent Selection; Crossover and Mutation; Terminate? (No/Yes); Results.]

As found by Sudevan & Saha (2018b), the matrix $\mathbf{A}$ can be calculated as

$$A_{ij} = \sum_{\ell=2}^{\ell_{\max}} (2\ell + 1)\, \frac{C^{ij}_{\ell}}{C'_{\ell,{\rm th}}}, \tag{14}$$

where $C'_{\ell,{\rm th}}$ is the theoretical power spectrum of CMB after accounting for the beam and pixel effects, i.e.

$$C'_{\ell,{\rm th}} = C_{\ell,{\rm th}}\, B_{\ell}^{2}\, P_{\ell}^{2}, \tag{15}$$

where $B_\ell$ are the Legendre transforms of the beam function and $P_\ell$ are the HEALPix pixel window functions. $\ell_{\max}$ can be chosen according to the pixel angular resolution of the maps. In this work, we made use of the low-resolution maps at $N_{\rm side} = 16$, for which ℓ = 32 is a sufficiently large multipole. The sum is taken starting from ℓ = 2 because the monopole (ℓ = 0) and dipole (ℓ = 1) are uninteresting for cosmological studies, and hence, they are removed from the input maps prior to all the analyses.
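To make Eqns. (13)-(15) concrete, here is a minimal sketch that builds $\mathbf{A}$ from precomputed cross power spectra and returns the global ILC weights. The array layout, the function names and the random toy inputs are assumptions made for illustration; in practice the cross spectra would come from a spherical-harmonic analysis of the input maps.

```python
import numpy as np

def reduced_variance_matrix(cl_cross, cl_th, beam, pixwin, lmax=32):
    """A_ij = sum_{l=2}^{lmax} (2l + 1) C_l^{ij} / C'_{l,th}   (Eqns. 14 and 15).

    cl_cross            : array (N, N, lmax + 1), cross power spectra of the maps
    cl_th, beam, pixwin : arrays of length >= lmax + 1, indexed by multipole l
    """
    ells = np.arange(2, lmax + 1)                                    # skip monopole and dipole
    cl_th_eff = cl_th[ells] * beam[ells] ** 2 * pixwin[ells] ** 2    # Eqn. (15)
    return np.sum((2 * ells + 1) * cl_cross[:, :, ells] / cl_th_eff, axis=-1)

def ss_weights(a_matrix):
    """Global ILC weights of Eqn. (13), using the pseudoinverse of A."""
    a_pinv = np.linalg.pinv(a_matrix)
    e = np.ones(a_matrix.shape[0])
    return a_pinv @ e / (e @ a_pinv @ e)

# Toy inputs (assumed): random symmetric cross spectra and a falling theory spectrum.
rng = np.random.default_rng(1)
n_maps, lmax = 4, 32
g = rng.normal(size=(n_maps, n_maps, lmax + 1))
cl_cross = np.einsum("ikl,jkl->ijl", g, g)          # symmetric in (i, j) for every l
cl_th = 1.0 / (np.arange(lmax + 1) + 1.0) ** 2
beam = np.ones(lmax + 1)                            # placeholder beam transform
pixwin = np.ones(lmax + 1)                          # placeholder pixel window

A = reduced_variance_matrix(cl_cross, cl_th, beam, pixwin, lmax)
w_ss = ss_weights(A)
```

Working in harmonic space this way avoids forming the p × p pixel covariance matrix of Eqn. (12) explicitly.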

2.2 Genetic Algorithm

Genetic algorithm (GA) was first proposed by John H. Holland (Holland 1962) in the mid 1960s and was further extended by David Goldberg in the late 1980s (Goldberg 1989). GA aims to find global optimal solution(s) to objective functions of a multimodal nature, i.e., functions that have multiple local optima in the domain of search. As the conventional optima-finding algorithms tend to fail in such cases, GA opens up a new window with its novel concept. The underlying principle of GA is Darwin’s theory of “natural selection”. Biological species evolve by this natural process. In a population, each individual has a different set of genetic characteristics. Thus, the whole population has a diverse gene pool. This community is subject to certain natural circumstances known as the environment. The process of natural selection asserts that the genes which lead to better traits for survival are more likely to be passed on to the offspring by reproduction over the generations than those which do not. Occasional mutations may bring in new beneficial genes that were not present earlier. Hence, eventually the whole population will best adapt to its environment after several generations.

GA applies this strategy to optimization problems. The goal at hand is to numerically find the global optimum of an objective function in the given domain of interest. GA works on the principle of convergence with variation. Diversity in a population is a necessary ingredient for the GA to succeed. Over the generations, individual solutions with higher and higher fitness dominate the population. Fig. 1 shows the flowchart of a typical GA. In the following we briefly discuss the most common steps of GA.

(i) Population initialization: GA derives from the basic ideology of population genetics. Hence, GA is based on a “population” – a set of possible solutions to the problem. According to the domain of search for the optimal solution, a population can be randomly created. The size of a population is a heuristic parameter.

(ii) Fitness calculation: A fitness function is chosen such that the fitter individuals will be closer to the optimum (in the functional space) than the less fit ones. The choice of a fitness function depends on the underlying objective and it is not unique. However, for a proper implementation of the algorithm, the fitness function should be positive and analytic, at least in the domain of interest.

(iii) Parent selection: This is one of the most crucial steps of the entire algorithm. As natural selection dictates, the fitter individuals in the population are more likely to pass on their genes by mating than the less fit ones. To simulate this scenario, we must assign some weights to the individuals according to their fitness. One of the most common methods is the fitness-proportionate assignment of weights (probabilities). In this step the pairs are selected for mating and producing the new generation. There is a set of methods for this. We will discuss some of them in section 3.

(iv) Crossover and Mutation: From the parents selected in the previous step, we now make crossovers to generate new individuals, collectively called “offspring”. The new individuals are occasionally mutated with a small amplitude. In our case, mutation refers to a discrepancy in the passage of genes from one generation to the next. Mutation is a crucial tool to bring in variety in the genetic material. We discuss the importance of mutation in section 3. The offspring thus produced will constitute the next generation of our population.

(v) Termination: Steps (ii)-(iv) are repeated in a loop for each new generation until the stopping criteria are reached, i.e., until the GA has “converged”. In general, a GA implementation is said to have converged if the fitness of the best individual(s) over the generations does not improve any further. The programmer can set a stop-point generation, hitting which will end the generation loop. This stop-point can be found by hit-and-trial. Finally, the fittest individual in the last generation is deemed the solution of the GA run.
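Steps (i)-(v) can be sketched as a toy single-variable GA. This is only an illustration under simplifying assumptions that are not taken from the paper: a linear bit-to-real mapping, plain fitness-proportionate sampling, one child per parent pair, and retention of the best individual in each generation. All function and parameter names are hypothetical.

```python
import numpy as np

def decode(bits, lower, upper):
    """Map a 0/1 array linearly onto a real number in [lower, upper] (assumed encoding)."""
    integer = bits @ (2 ** np.arange(bits.size - 1, -1, -1))
    return lower + (upper - lower) * integer / (2 ** bits.size - 1)

def run_ga(fitness, lower, upper, n_bits=16, pop_size=40, n_gen=60, p_mut=0.01, seed=0):
    rng = np.random.default_rng(seed)
    # (i) Population initialization: random bit strings.
    pop = rng.integers(0, 2, size=(pop_size, n_bits))
    for _ in range(n_gen):                               # (v) fixed stop-point generation
        # (ii) Fitness calculation for every individual.
        fit = np.array([fitness(decode(ind, lower, upper)) for ind in pop])
        # (iii) Parent selection with fitness-proportionate probabilities.
        prob = fit / fit.sum()
        parents = rng.choice(pop_size, size=(pop_size, 2), p=prob)
        # (iv) Uniform crossover followed by bit-flip mutation.
        swap = rng.random((pop_size, n_bits)) < 0.5
        children = np.where(swap, pop[parents[:, 0]], pop[parents[:, 1]])
        flip = rng.random(children.shape) < p_mut
        children = np.where(flip, 1 - children, children)
        # Keep the best individual found so far (an assumption of this sketch).
        children[0] = pop[np.argmax(fit)]
        pop = children
    fit = np.array([fitness(decode(ind, lower, upper)) for ind in pop])
    best = pop[np.argmax(fit)]
    return decode(best, lower, upper), fit.max()

# Toy objective with a single maximum f(0) = 1.
x_best, f_best = run_ga(lambda x: 1.0 / (1.0 + x ** 2), lower=-3.0, upper=3.0)
```

The fitness here is positive everywhere, as step (ii) requires, so the selection probabilities are always well defined.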

Fig. 2 uses a toy-model population to visualize the different terms and steps in our GA implementation. (We discuss it further in due course.) In section 3, we describe the methodology adopted for this implementation of GA-ILC with the details concerning the exact styles and the values of certain parameters.

3 METHODOLOGY

Python is one of the most commonly used programming languages for scientific computation. It provides many open-source packages for various utilities. One of those packages is the Python port of HEALPix, HEALPy. It comes with many utilities we need for handling and analyzing the data. There are pixel utilities, spherical-harmonic transform utilities, visualization-related functions and many miscellaneous tools for an effective manipulation and reduction of a HEALPix data set. We have implemented the algorithm in Python 3 for numerical computations.

In this work, GA is implemented to find the optimal weights to linearly combine the input maps such that the reduced variance of the clean map is minimized. This is a multivariable constrained optimization problem. In the present work, the problem is converted to an unconstrained optimization problem by a little manipulation of the method. The objective function to be optimized (minimized) is the (reduced) variance function (Eqn. (11))

$$\sigma^2(\mathbf{w}) = \mathbf{w}^{T}\mathbf{A}\mathbf{w}. \tag{16}$$

The constraint is $\sum_{i=1}^{N} w_i = 1$. Equivalently, for the current work, one of the N weights is found as $w_{i'} = 1 - \sum_{i \neq i'} w_i$, so that only N − 1 weights remain free. (It is the last weight of the spectrally-ordered system for our work.) Thus, the goal is to estimate $\mathbf{w}_{\rm GA}$ that minimizes $\sigma^2(\mathbf{w})$ using GA.
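The elimination of the constraint can be sketched as follows: the GA searches over the N − 1 free weights, and the last weight is filled in so that the weights sum to one before Eqn. (16) is evaluated. The function names and the 2 × 2 toy matrix are hypothetical.

```python
import numpy as np

def full_weights(free_w):
    """Append the dependent last weight so that all N weights sum to one."""
    free_w = np.asarray(free_w, dtype=float)
    return np.append(free_w, 1.0 - free_w.sum())

def reduced_variance(free_w, a_matrix):
    """sigma^2(w) = w^T A w (Eqn. 16) as a function of the N - 1 free weights."""
    w = full_weights(free_w)
    return w @ a_matrix @ w

# Toy example (assumed matrix): for a diagonal A the optimum is w proportional to 1 / A_ii.
A_toy = np.array([[2.0, 0.0],
                  [0.0, 1.0]])
sigma2_opt = reduced_variance([1.0 / 3.0], A_toy)   # w = (1/3, 2/3)
sigma2_mid = reduced_variance([0.5], A_toy)         # w = (1/2, 1/2)
```

Any point in the (N − 1)-dimensional search space now corresponds to a valid weight vector, so no candidate has to be rejected for violating the constraint.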

3.1 Implementation of GA-ILC

In genetics, an individual can be represented by its genotypic as well as phenotypic characteristics. In our GA, the phenotype of an individual is a decimal number in the case of a single-variable problem and a vector (array) of n decimal numbers in the case of a multivariable problem. We chose the binary-string representation as our genotype. The required decimal accuracy is a fixed parameter initially chosen by the programmer. Depending upon the accuracy, each individual fraction can be converted to an integer by multiplying it with $10^{\text{decimal accuracy}}$ and rounding off. Then the integer is converted into its binary representation. In the multivariable cases, each element of a vector is converted into a binary string and all of those strings are concatenated in the original order into a single large string. We call this string a “chromosome”. The binary part of an individual element of this vector chromosome is called a “segment”, each position of a chromosome is called a “gene”, and the bit-value of each gene is called the “allele” of that gene. Fig. 2 (A) depicts a schematic of this terminology. It shows an example one-variable population in the genotype representation. Performing the exact inverse operations on a chromosome returns the phenotype (decimal vector) of that individual.
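The genotype-phenotype conversion described above might look like the sketch below. One detail is an assumption rather than something stated in the text: each variable is shifted by its lower bound before scaling, so that negative values map to non-negative integers and every segment has a fixed width. The helper names are hypothetical.

```python
import numpy as np

def segment_length(lower, upper, dec):
    """Number of bits needed to store round((upper - lower) * 10**dec)."""
    return int(round((upper - lower) * 10 ** dec)).bit_length()

def encode(values, lower, upper, dec):
    """Phenotype (decimal vector) -> genotype (one concatenated bit string)."""
    chromosome = ""
    for x, lo, hi in zip(values, lower, upper):
        n_bits = segment_length(lo, hi, dec)
        integer = int(round((x - lo) * 10 ** dec))       # shifted so the integer is >= 0
        chromosome += format(integer, f"0{n_bits}b")     # fixed-width binary segment
    return chromosome

def decode(chromosome, lower, upper, dec):
    """Genotype -> phenotype: the exact inverse of `encode`."""
    values, start = [], 0
    for lo, hi in zip(lower, upper):
        n_bits = segment_length(lo, hi, dec)
        integer = int(chromosome[start:start + n_bits], 2)
        values.append(lo + integer / 10 ** dec)
        start += n_bits
    return np.array(values)

# Two variables with three-decimal accuracy (assumed bounds).
lower, upper, dec = [-1.0, -1.5], [1.0, 0.5], 3
chrom = encode([0.123, -0.456], lower, upper, dec)
back = decode(chrom, lower, upper, dec)
```

With a search range of width 2 and three decimals, each segment needs 11 bits, so this two-variable chromosome has 22 genes.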

Figure 2. A schematic representation of a toy-model of GA. (A) shows a population with 4 individuals and indicates the terminology within a GA framework. (B) shows a Roulette wheel with two fixed-points diametrically opposite pointing to I-3 and I-4. (C) shows a typical uniform crossover event on I-3 and I-4, producing the new individuals I-3′ and I-4′. These are the offspring which then get mutated into I-3′′ and I-4′′ as shown in (D). All the numbers including binary strings and probabilities are only meant for visualization of this toy-model.

Now that the terminology is clear, we need to initialize a population. The size of the population, Popsize, is fixed by the programmer by hit-and-trial. A domain of search must first be decided. For our GA-ILC, we chose $\{w_{{\rm SS},i} \pm 1\}$ as the domain, hereinafter called “priors”. The initial population is then generated uniformly at random. The population at any stage of calculation is a 2D array with one dimension containing the genes of a single chromosome and the other containing as many as Popsize chromosomes. The allele of each gene of each chromosome is determined by generating a (pseudo-)random number between 0 and 1 and checking whether it is less or greater than 1/2. This is, in some sense, a coin-toss determination, and it is frequently used in this work. From the binary population, the decimal population is also found. At this point a check is made on whether all the variables in each individual fall within the priors imposed. The number of variables in our GA-ILC, $N_{\rm var}$, is one less than the number of input maps since one of the weights is dependent on the rest of them.
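A minimal sketch of this coin-toss initialization is given below. Two choices are assumptions, since the text does not specify them: a random number below 1/2 sets the allele to 1, and an individual that decodes to a value outside the priors is simply redrawn. The centre of the priors is a placeholder, not actual SS weights.

```python
import numpy as np

def decode_bits(bits, lower, dec, n_bits):
    """Decode a flat 0/1 array into len(lower) decimals, offset from `lower`."""
    segments = bits.reshape(len(lower), n_bits)
    powers = 2 ** np.arange(n_bits - 1, -1, -1)
    return lower + (segments @ powers) / 10 ** dec

def init_population(pop_size, lower, upper, dec, rng):
    """Coin-toss initialization of a (pop_size, n_genes) binary population."""
    n_bits = int(round((upper - lower).max() * 10 ** dec)).bit_length()
    n_genes = len(lower) * n_bits
    population = np.empty((pop_size, n_genes), dtype=int)
    for k in range(pop_size):
        while True:
            bits = (rng.random(n_genes) < 0.5).astype(int)   # allele 1 if r < 1/2 (assumed)
            values = decode_bits(bits, lower, dec, n_bits)
            if np.all(values <= upper):      # the lower bound holds by construction
                break                        # otherwise redraw (assumed handling)
        population[k] = bits
    return population, n_bits

rng = np.random.default_rng(42)
centre = np.array([0.2, -0.1, 0.5])          # placeholder centre of the priors
lower, upper = centre - 1.0, centre + 1.0
population, n_bits = init_population(20, lower, upper, dec=3, rng=rng)
decoded = np.array([decode_bits(ind, lower, 3, n_bits) for ind in population])
```

Each row of `population` is one chromosome and each column one gene, matching the 2D layout described in the text.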

A suitable fitness function for the GA-ILC is chosen as

$$f(\mathbf{w}) = \frac{1}{\sigma^2(\mathbf{w})}. \tag{17}$$

Since $\sigma^2$ is strictly positive, this fitness function satisfies the criteria of being a positive analytic function (in principle, the global minimum of $\sigma^2$ is the reduced variance of a pure CMB map).

As discussed in subsection 2.2, there are various methods for selecting parent pairs from a given fitness-associated population. Some of them are the fitness-proportionate selection methods which assign weights proportional to the fitness of the individuals, whereas some make use of the fitness values indirectly in doing so. One of the most common parent selection methods (which we use in this work) is called “Stochastic Universal Sampling” or “SUS” in short. It is a Roulette-wheel-based method in which, depending upon the number of individuals to be picked in one turn of the wheel, there is a number of fixed-points. In the case of picking pairs, there are two fixed-points, equally separated around the wheel’s periphery. A turn of the wheel is simulated by generating a pseudorandom number.

Figure 3. The variance function in Eqn. (16) plotted against two weights $w_1$ and $w_3$ while keeping the rest of the weights fixed to the SS weights. This is performed on the simulated low-resolution maps of seed 100. A very broad valley can be seen close to the supposed global minimum.

In a fitness-proportionate (“FP” for short) selection of parents, the area assigned to each individual in a population on the wheel is directly proportional to their fitness. Thus, depending upon the number of pairs to be picked (a.k.a. the size of the “mating-pool”), the wheel is turned that many times and each turn returns two individuals as a pair. This scenario of fitter individuals having a better chance of reproduction is dubbed the “selection pressure”. However, as the generations progress and the population approaches the desired optimum, the fitness values may get less and less diverse due to a decreasing gradient of the objective function. In such cases, the selection pressure towards the fitter individuals wanes since all the individuals will be assigned an almost equal area on the wheel. To make sure this does not happen, another selection method such as “Rank selection” can be employed. In this technique, the individuals in the population are sorted according to their fitness values and they are assigned ranks. Then the weights are assigned according to their ranks, irrespective of their fitness values. Nonetheless, the individual(s) with the highest fitness value will have the largest weight. The selection is then done by turning the Roulette wheel. This method brings the selection pressure back into the picture as the weight-gradient is preserved. When the objective function in the domain of search inherently has a small gradient, rank selection should be preferred from the start of the generation loop. If the function is less steep only closer to the optimum point, then after a certain number of generations the selection type can be switched from FP to rank selection. This switching-point generation is another GA parameter called switch. Fig. 2 (B) shows a Roulette wheel in our toy-model population with some area given to all four individuals under either of the selection settings. Notice that the two fixed-points for picking pairs are diametrically opposite to each other. This way, an individual with a small area on the wheel may also get picked occasionally, maintaining the genetic diversity in GA.
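The two weighting schemes and the two-pointer wheel can be sketched as below. The linear rank-to-weight mapping is an assumption (the text only requires that the weights follow the ranks), and the function names are hypothetical.

```python
import numpy as np

def fp_probabilities(fitness):
    """Fitness-proportionate weights: area on the wheel proportional to fitness."""
    fitness = np.asarray(fitness, dtype=float)
    return fitness / fitness.sum()

def rank_probabilities(fitness):
    """Rank-based weights: rank 1 for the least fit, rank n for the fittest (linear, assumed)."""
    ranks = np.argsort(np.argsort(fitness)) + 1
    return ranks / ranks.sum()

def sus_pair(probabilities, rng):
    """One turn of the wheel with two diametrically opposite fixed-points."""
    edges = np.cumsum(probabilities)                    # right edge of each individual's arc
    start = rng.random()                                # the simulated turn of the wheel
    pointers = np.array([start, (start + 0.5) % 1.0])
    idx = np.searchsorted(edges, pointers, side="right")
    return np.minimum(idx, len(edges) - 1)              # guard against round-off at the last edge

# Nearly equal fitness values, as expected close to a broad optimum.
fitness = [1.000, 1.001, 1.002, 1.003]
p_fp = fp_probabilities(fitness)        # almost uniform: selection pressure has waned
p_rank = rank_probabilities(fitness)    # [0.1, 0.2, 0.3, 0.4]: weight gradient restored

rng = np.random.default_rng(7)
pair = sus_pair(p_rank, rng)
```

Because the two pointers are half a turn apart, no individual occupying less than half of the wheel can be picked twice in the same turn.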

The variance function in Eqn. (16) is plotted in Fig. 3 for a pair of weights while keeping the rest of the weights fixed to the SS weights. It is done on one of the realizations in the Monte Carlo simulations we performed (we discuss it further in section 5). As can be seen in that figure, the variance function has a very broad valley close to the global minimum. We verified that the feature is present irrespective of the pair of weights chosen. This required us to switch from FP to rank selection after a switch-point generation. It is yet another parameter that must be fixed by hit-and-trial, like the rest of the GA parameters, and it varies from one GA implementation to another.

The crossover again comes in a variety of types. Some of the most popular types are one-point, two-point, k-point and uniform crossovers. We chose the uniform crossover as the universal type for our entire work. An instance of this type of crossover is schematically shown in Fig. 2 (C) for our toy-model population. Under uniform crossover, the alleles of each gene are exchanged between the two parent chromosomes of a pair using the coin-toss determination technique. Thus, each pair of parents gives rise to a pair of children. These children are mutated with a small amplitude, $P_{\rm mut}$. A mutation is introduced as follows. For each gene of each child chromosome, a pseudorandom number r in the range [0, 1) is generated. If $r < P_{\rm mut}$, the allele of that gene is flipped (0→1 and 1→0). The parameter $P_{\rm mut}$ is best chosen by the programmer by hit-and-trial. Fig. 2 (D) indicates one of the mutation incidents in the toy-model GA. After these steps, we have a set of offspring chromosomes at hand. Depending upon the fitness of the children and the size of the population, a new generation is recruited by keeping the Popsize-many fittest children in the population and discarding the rest of them. The newly deemed population will then undergo the same steps of parent selection and mating. A proper termination criterion in terms of the number of generations, $N_{\rm gen}$, is found by observing the convergence of GA-ILC, subject to the available computational resources.
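A minimal sketch of these three operations on NumPy bit arrays follows; the function names are hypothetical, and the parents are the toy chromosomes I-3 and I-4 of Fig. 2.

```python
import numpy as np

def uniform_crossover(parent_a, parent_b, rng):
    """Exchange alleles gene by gene according to a coin toss."""
    swap = rng.random(parent_a.size) < 0.5
    child_a = np.where(swap, parent_b, parent_a)
    child_b = np.where(swap, parent_a, parent_b)
    return child_a, child_b

def mutate(child, p_mut, rng):
    """Flip the allele of every gene whose random number r satisfies r < P_mut."""
    flip = rng.random(child.size) < p_mut
    return np.where(flip, 1 - child, child)

def next_generation(children, fitness, pop_size):
    """Keep the pop_size fittest children and discard the rest."""
    keep = np.argsort(fitness)[::-1][:pop_size]
    return children[keep]

rng = np.random.default_rng(3)
i3 = np.array([1, 1, 0, 1, 1])
i4 = np.array([0, 0, 1, 1, 0])
c3, c4 = uniform_crossover(i3, i4, rng)
m3 = mutate(c3, p_mut=0.05, rng=rng)
```

At every gene the two children together carry exactly the two parental alleles, so crossover alone never creates an allele that is absent from both parents; only mutation can do that.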

3.2 Remarks on Genetic Diversity

It is particularly important to ensure a genetically diverse population at each step of GA. The initial population is randomly initialized to have the maximum diversity possible. While applying selection pressure towards the fitter individuals, the less fit individuals should not be neglected entirely. A balanced reproduction is only possible if the entire population takes part in mating. We need to make sure that the least fit individuals also get to reproduce at least a couple of times. A way to ensure this (in addition to using the SUS method) is to make $N_{\rm pairs} \sim \lceil 1/\min(\{p_i\}) \rceil$, where $\{p_i\}$ is the set of probabilities (weights) assigned to the individuals and $\lceil\cdot\rceil$ is the ceiling function. If this is less than half of Popsize, $N_{\rm pairs}$ can simply be Popsize. Due to the limited resources and increasing requirement of computing power, a bottleneck must be applied to retain the fitter individuals after reproduction and preserve the size of the population at each generation. This results in only a slightly reduced genetic diversity, which is harmless considering the variety of children produced by mutation. Mutation, thus, is a very crucial ingredient in our GA. Indeed, a mild mutation prevents GA from premature convergence. A GA-ILC implementation without any mutation loses the genetic diversity almost completely as it moves towards the global optimum, which causes it to converge to some false local optimum that is not actually present. It is a partly computational – partly fundamental defect (in that a non-diverse, self-breeding population eventually ceases to survive under natural selection). In population genetics, an occasional mutation may, by pure chance, introduce a better trait for survival, which is then preferred through the generations. We studied this very interesting ingredient in more detail in our work. We discuss this further in sections 5 and 6.

Figure 4. The two-variable case of the trial function in Eqn. (18) within an appropriate domain of interest. Observe the highly multimodal nature with multiple local maxima located closely together.

4 VALIDATION OF THE GA IMPLEMENTATION

Implementation of GA is a complicated task. In particular, the programmer needs to find the optimal values of various GA parameters mostly by hit-and-trial, as mentioned frequently throughout subsection 3.1. The most common GA parameters are the population size Popsize, the total number of generations $N_{\rm gen}$, the decimal accuracy required Dec, the probability of mutation $P_{\rm mut}$, etc. Hence, it is wise to start with a very simple implementation and then move towards more complicated problems. We have employed this strategy for this work. We initially implemented GA for single-variable test problems. Then we moved on to multivariable test problems and we obtained satisfactory results in both of them. The test function we chose is

$$f(\mathbf{x}) = \frac{1}{1 + \sum_{i=1}^{N_{\rm var}} \left[ x_i^2 + 10\left(1 - \cos(2\pi x_i)\right) \right]}, \tag{18}$$

with the same function as the fitness function. This functionis well-defined for any number of variables. The two-variablecase of this function is shown in fig. 4 for visualization. It isobserved from the figure that this trial function is a highlymultimodal function i.e. it has multiple local maxima witha single global maximum at x = 0 with f(0) = 1, withinthe domain of interest as plotted therein. Also, it is a steepfunction with large gradient. Hence, for such implementa-tions, no mutation was necessary (in other words, P mut = 0)and after ∼60% of total generations, the selection type wasswitched from FP to rank selection to compensate for havingno mutation. The best of those results for different Nvar are



Table 1. Results of the trial implementation for different numbers of variables. Notice that, as the number of variables grows, a larger population is needed to get the best results.

N_var   Pop_size   N_gen   Best f(x*)
1       200        60      1.0
2       200        60      1.0
5       400        200     1.0
10      600        250     1.0
11      600        250     1.0
12      600        250     1.0

Table 2. Variation of the final function value (in the trial implementation) w.r.t. Pop_size for a fixed number of generations N_gen = 250 and N_var = 11. Notice the very bad function value for small Pop_size.

Pop_size   Best f(x*)
400        0.5011925167207855
450        1.0000000000000000
500        0.9999999999980158
550        0.9999999999682601
600        1.0000000000000000
650        1.0000000000000000

summarized in table 1. Table 2 shows the results obtained for the fixed number of variables in the trial function, which is the same as that in our GA-ILC implementation, with different population sizes. As expected, the best function value improves overall as we increase the size of the population. The results of the trials as discussed here are sufficient to conclude that our GA implementation is robust and accurate in the multivariable scenario of our GA-ILC method, and that it is safe and desirable to incorporate ILC with it in the manner described in section 3.

5 MONTE CARLO SIMULATIONS

As a statistical test for our GA-ILC method, we performed Monte Carlo simulations using WMAP and Planck simulated maps. From the satellite missions' observed angular power spectra, using the HEALPix IDL² routine isynfast, anisotropy maps were randomly generated by passing an argument called seed, which seeds the pseudorandom number generator. The sample size of our simulations is 200, with seed varying from 1 to 200 in steps of 1. A total of 12 low-resolution maps at Nside = 16 were generated for each seed as input of GA-ILC. Of the 12 input maps, the five WMAP frequency bands are K (23 GHz), Ka (33 GHz), Q (41 GHz), V (61 GHz), and W (94 GHz), and the seven Planck frequency bands are 30, 44, 70, 100, 143, 217 and 353 GHz. These simulated maps are smoothed with a 9° beam. Corresponding to each seed, a pure CMB map was also generated using the same routine from the Planck theoretical angular power spectrum. These CMB maps were used as references to compare with the clean maps that GA-ILC produced. Some of the GA parameters were

² IDL is the Interactive Data Language for Unix and Linux based systems; see, e.g., https://caligari.dartmouth.edu/doc/idl/

[Figure 5: two sky maps. Mean Difference Map, scale −7.14851 to 5.13155 µK; Standard Deviation Difference Map, scale 0.808651 to 15.7783 µK.]

Figure 5. The statistical difference maps from the Monte Carlo simulations. Only a very small residual contamination seems to be present, and only close to the plane of the Milky Way. The values are in µK.

kept fixed for the simulations, namely, Pop_size = 400, N_gen = 400, decimal accuracy Dec = 7, and a switch between the FP and rank selection types after generation 20 (the switch generation), all of them optimally chosen. To explore the effect of mutation on the results, GA-ILC was run for 128 different values of P_mut, starting from 0.05% in steps of 0.05%, for each seed value. The best results (over all P_mut) for every seed were compiled, and from them we computed the statistical measures. In particular, we found the best map over all P_mut for every seed and subtracted the pure CMB simulated map for that seed from it. We call the result a simulated difference map. 200 such simulated difference maps were produced, and the mean and standard deviation difference maps were calculated from them. Fig. 5 shows these statistical maps. The mean difference map contains a very small amount of residual contamination, limited to small regions around the galactic equator, as can also be inferred from the standard deviation difference map.
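The bookkeeping described above can be sketched as follows. This is a toy illustration with made-up numbers: the function names are hypothetical, "best" is assumed to mean the run with the smallest final variance, and the sample standard deviation is assumed for the spread over seeds. The real analysis uses 200 seeds and maps with 12 × 16² = 3072 pixels at Nside = 16.

```python
import statistics

def best_over_pmut(results):
    """Pick the cleaned map with the smallest variance among all P_mut runs.

    `results` maps a mutation probability (in %) to a (variance, cleaned_map) pair.
    """
    best_pmut = min(results, key=lambda p: results[p][0])
    return best_pmut, results[best_pmut][1]

def difference_statistics(clean_maps, cmb_maps):
    """Pixel-wise mean and standard deviation of (clean - pure CMB) over seeds."""
    diffs = [[c - t for c, t in zip(clean, truth)]
             for clean, truth in zip(clean_maps, cmb_maps)]
    per_pixel = list(zip(*diffs))            # transpose: one tuple per pixel
    mean_map = [statistics.fmean(p) for p in per_pixel]
    std_map = [statistics.stdev(p) for p in per_pixel]
    return mean_map, std_map

# Mutation grid of the simulations: 128 values from 0.05% in steps of 0.05%.
pmut_grid = [round(0.05 * k, 2) for k in range(1, 129)]

# Toy selection of the best run for one seed (numbers are invented).
runs = {0.05: (1012.0, [1.0, 2.0]), 1.15: (1008.6, [1.1, 1.9]), 6.40: (1030.0, [0.7, 2.4])}
best_pmut, best_map = best_over_pmut(runs)

# Toy stand-in for the simulations: 3 seeds, 4 pixels.
cmb_maps = [[10.0, -5.0, 3.0, 0.0], [1.0, 2.0, -3.0, 4.0], [-7.0, 0.5, 2.5, 1.0]]
clean_maps = [[11.0, -5.0, 3.5, 0.0], [1.0, 2.0, -2.5, 4.0], [-9.0, 0.5, 3.0, 1.0]]
mean_map, std_map = difference_statistics(clean_maps, cmb_maps)
print(best_pmut, mean_map, std_map)
```

A pixel with a non-zero mean but zero spread (the third pixel in the toy data) would indicate a systematic residual, while a large spread with zero mean indicates seed-dependent reconstruction noise.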

The angular power spectrum corresponding to the simulated mean clean map, along with the standard deviation errors, is overplotted with the Planck theoretical CMB power spectrum in the top panel of fig. 6. In the middle panel of the figure, the mean difference power spectrum is plotted with the errors in the estimation of the mean (σ_µ,ℓ). We observe that the estimated power spectrum is within 3σ_µ of the theoretical CMB power spectrum. This indicates that there is no significant bias in the entire multipole range 2 ≤ ℓ ≤ 32 under



[Figure 6: three panels plotted against multipole ℓ (5 to 30), in units of µK². Top: ℓ(ℓ+1)C_ℓ/2π for Planck Theory and GA-ILC Simulations. Middle: ℓ(ℓ+1)C_ℓ/2π for GA-ILC Simulations − Planck Theory. Bottom: Simulation error and Cosmic variance.]

Figure 6. Top: The angular power spectrum of the GA clean simulated mean difference map overplotted with the Planck theoretical power spectrum. Note that the beam and pixel effects are removed. Middle: The difference of the power spectra of GA simulations and Planck theory with errors in mean-estimation. It can be seen that the deviation is within the 3σ_µ level, which indicates that there is no bias. Bottom: The standard deviation errors in the angular power spectrum of the final mean difference map. Notice that these errors conform with the minimal errors induced by the cosmic variance, $\Delta C_{\ell,{\rm cv}} = \sqrt{\frac{2}{2\ell+1}}\, C_\ell$.

study. In the bottom panel of the same figure, the standard deviation errors (σ_ℓ) are overplotted with the cosmic-variance induced errors. The observed errors agree well with the expected minimal errors of purely cosmic origin.
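The two error measures used above can be made concrete with a short sketch. It assumes that the error on the mean is the usual standard error, σ_µ,ℓ = σ_ℓ/√N for N independent simulations, and it uses invented numbers (C_2 = 1000 µK²); the function names are hypothetical.

```python
import math

def cosmic_variance_error(ell, c_ell):
    """Delta C_ell = sqrt(2 / (2 ell + 1)) * C_ell."""
    return math.sqrt(2.0 / (2.0 * ell + 1.0)) * c_ell

def error_on_mean(sigma_ell, n_sims):
    """Standard error of the mean spectrum from n_sims independent simulations."""
    return sigma_ell / math.sqrt(n_sims)

def is_unbiased(mean_c_ell, theory_c_ell, sigma_ell, n_sims, n_sigma=3.0):
    """True if |mean - theory| <= n_sigma * sigma_mu at this multipole."""
    return abs(mean_c_ell - theory_c_ell) <= n_sigma * error_on_mean(sigma_ell, n_sims)

# Toy numbers (not taken from the paper): quadrupole with 200 simulations.
ell, c_theory, n_sims = 2, 1000.0, 200
sigma_cv = cosmic_variance_error(ell, c_theory)   # sqrt(2/5) * 1000
sigma_mu = error_on_mean(sigma_cv, n_sims)

print(sigma_cv, sigma_mu)
print(is_unbiased(1050.0, c_theory, sigma_cv, n_sims))  # deviation of 50
print(is_unbiased(1200.0, c_theory, sigma_cv, n_sims))  # deviation of 200
```

Because σ_µ shrinks as 1/√N, the 3σ_µ test with 200 simulations is a much tighter check on bias than the per-realization cosmic-variance error alone.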

6 APPLICATION ON WMAP AND PLANCK DATA

We applied our GA-ILC on 12 multifrequency low-resolution input maps observed by WMAP and Planck to find an optimal clean map as output (we refer to this as the implementation on “data” as shorthand; the 12 input frequencies are those mentioned in section 5). GA parameters like Pop_size, N_gen, Dec, and the switch generation were kept at the same fixed values as in the Monte Carlo simulations. Similarly to the simulations, we applied GA-ILC with 144 values of P_mut, from 0% to 7.15% in steps of 0.05%, to analyze the final reduced variance in each case. The best case of all these 144 was deemed to represent the best version of our GA-ILC. The plot of final reduced variance against the mutation probability is shown in fig. 7. As we can see there, a very low (or negligible) mutation does not help in our pursuit of optimization, and a large amount of mutation is also not useful. A moderately small value of 1.15% is the best mutation value in this case. As this figure illustrates, there is no exact analytical, problem-based expression for the optimal amount of mutation. We plotted the variance values against the cumulative individuals through generations (the population is sorted in ascending order of fitness). Plots of this type are called

[Figure 7: two panels showing variance (µK²) against % mutation (0 to 7). Legend: GA-ILC variance, Minimum variance; the bottom panel also marks 1.15% Mutation.]

Figure 7. The minimal variance obtained by GA-ILC on data against the coefficient of mutation. The minimum of these minimal values, which occurs at a low-to-moderate mutation of 1.15%, is also marked distinctly. Bottom: a zoomed-in version for a closer look at the variation.

[Figure 8: three panels showing variance (µK²) against cumulative individuals (0 to 160000). Legend: Mutated by 1.15%, Unmutated (top panel); Mutated by 1.15% (middle and bottom panels).]

Figure 8. Top: The overplot of the traces of the mutated and unmutated cases. The mutated case is the best among all the 144 mutation values, with P_mut = 1.15%, implemented on data. Middle and Bottom: closer looks at the convergence in the mutated case.

“trace-plots” here. These plots help us understand the convergence of GA and the effects of various parameters on it. Fig. 8 shows an example of trace-plots. There we can see that the population without any mutation converges prematurely due to the loss of genetic diversity, as there exists a broad valley close to the global optimum. On the other hand, the optimally mutated case is able to find better solutions since it is able to explore the variable space. It is also evident that, long after the unmutated case has converged, the mutated case continues to slowly find better and better solutions over the generations.
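A trace of this kind can be assembled as in the sketch below. It assumes that lower variance means higher fitness, so that sorting a generation in ascending order of fitness is sorting it in descending order of variance; the function names and the toy numbers are hypothetical. With Pop_size = 400 and N_gen = 400 the real trace contains 400 × 400 = 160000 individuals.

```python
def build_trace(variances_by_generation):
    """Concatenate generations, each sorted from least fit to most fit.

    Assumes lower variance means higher fitness, so ascending fitness is
    descending variance within a generation.
    """
    trace = []
    for generation in variances_by_generation:
        trace.extend(sorted(generation, reverse=True))
    return trace

def best_so_far(trace):
    """Running minimum of the trace: the best variance found up to each individual."""
    best, out = float("inf"), []
    for value in trace:
        best = min(best, value)
        out.append(best)
    return out

# Toy run: 3 generations of 4 individuals (values in muK^2 are invented).
generations = [[1400.0, 1250.0, 1310.0, 1500.0],
               [1100.0, 1180.0, 1250.0, 1090.0],
               [1015.0, 1090.0, 1012.0, 1030.0]]
trace = build_trace(generations)
running = best_so_far(trace)
print(trace)
print(running)
```

A trace that stops decreasing early signals premature convergence, while a trace that keeps producing slightly lower values late in the run corresponds to the behaviour of the mutated case described above.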

Fig. 9 shows the clean CMB map produced by GA-ILC on data. The bottom panel there shows the difference map of GA-ILC and SS-ILC (Sudevan & Saha 2018c). It confirms



[Figure 9: two sky maps. Top: GA-ILC Clean Map, scale −137.055 to 103.237 µK. Bottom: Difference map, GA-ILC − SS-ILC, scale −13.821 to 16.801 µK.]

Figure 9. Top: The clean map produced by GA-ILC on WMAP and Planck data with 1.15% mutation. Bottom: Difference map of GA-ILC and SS-ILC. All the values are in µK. Notice that the residual contamination is very small and exists only close to the galactic plane. In other parts of the sky, a very good agreement can be seen.

[Figure 10: ℓ(ℓ+1)C_ℓ/2π (µK²) against multipole ℓ (5 to 30). Legend: SS-ILC, GA-ILC.]

Figure 10. The clean angular power spectrum produced by GA-ILC implemented on data with 1.15% mutation. The reconstruction errors in GA-ILC here are found using 200 Monte Carlo simulations. The two power spectra agree quite well with each other.

that only minimal residual contamination is present in the GA-ILC clean map and that it tends to occupy small areas close to the galactic plane. The clean angular power spectrum calculated from the GA-ILC clean map (data implementation) is overplotted with the SS-ILC clean power spectrum in fig. 10. We observe that the two match very well with each other. This also indicates that the purely numerical GA-ILC technique gives results as good as those of SS-ILC, which uses an analytical expression for the weights.
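The agreement between a numerical and an analytical minimization of the variance can be illustrated with a two-channel toy problem. For two maps combined with weights w and 1 − w, the variance w²C₁₁ + 2w(1 − w)C₁₂ + (1 − w)²C₂₂ is minimized at w = (C₂₂ − C₁₂)/(C₁₁ + C₂₂ − 2C₁₂). In the sketch below a brute-force grid scan stands in for the GA; it is not the GA itself, and the data, names and grid are invented for illustration.

```python
def covariance(a, b):
    """Population covariance of two equal-length pixel lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n

def combined_variance(w, map1, map2):
    """Variance of the combination w * map1 + (1 - w) * map2 (weights sum to one)."""
    combo = [w * x + (1.0 - w) * y for x, y in zip(map1, map2)]
    return covariance(combo, combo)

def analytic_weight(map1, map2):
    """Closed-form minimiser of combined_variance for two channels."""
    c11, c22 = covariance(map1, map1), covariance(map2, map2)
    c12 = covariance(map1, map2)
    return (c22 - c12) / (c11 + c22 - 2.0 * c12)

# Toy sky: a common CMB signal plus a foreground scaling differently per channel.
cmb = [30.0, -12.0, 5.0, -40.0, 22.0, 8.0, -17.0, 4.0]
foreground = [2.0, 9.0, 1.0, 0.5, 14.0, 3.0, 0.2, 6.0]
map1 = [c + 1.0 * f for c, f in zip(cmb, foreground)]
map2 = [c + 3.0 * f for c, f in zip(cmb, foreground)]

w_exact = analytic_weight(map1, map2)

# Purely numerical stand-in for the GA: brute-force scan of w on a fine grid.
step = 0.001
grid = [-1.0 + step * k for k in range(4001)]      # w from -1 to 3
w_numeric = min(grid, key=lambda w: combined_variance(w, map1, map2))

print(w_exact, w_numeric)
```

Because the weights sum to one, the common CMB signal passes through the combination unchanged, and only the foreground contribution is suppressed by the choice of w.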

We compare our GA-ILC results with those obtained by the component reconstruction methods of other science groups, namely, COMMANDER, NILC, SMICA (Planck Collaboration et al. 2020b), and WMAP-ILC (Bennett et al. 2013b). Fig. 11 shows the difference maps between each of those clean maps and the GA-ILC clean map (data implementation). It is evident that our method produces a clean map that agrees well with those of these completely different methods. We also observe that the residual contamination near the Milky Way plane differs from method to method. This indicates that those other clean maps also contain some (minimal) amount of residual foreground, as ours does. Finally, in fig. 12 we show the angular power spectra of GA-ILC and of COMMANDER, NILC, SMICA, and WMAP-ILC for comparison. As expected, the spectra agree well with each other, with only small differences among the methods. Again, the other methods' results differ from each other just as they differ from ours.

7 CONCLUSION

In this paper, for the first time in the literature, we develop and implement the genetic algorithm, motivated by the rules of biological selection, to reconstruct the CMB component over large angular scales by removing foregrounds using a linear combination of multifrequency observations from the WMAP and Planck satellite missions. We validate our methodology by performing 200 Monte Carlo simulations of realistic observations from the final-year WMAP and Planck missions. The results of the simulations show that the CMB map and the angular power spectrum can be accurately recovered using our method. The outcome of our method is in close agreement with the results obtained using weights from the exact analytical method, which demonstrates the usefulness of the new method. We compare the cleaned CMB maps and recovered angular power spectrum obtained by the new method with those obtained by the WMAP and Planck science groups. These results agree well with each other, which shows that the CMB results obtained by the satellite missions are robust with respect to the data analysis algorithms applied. Rich in tunable parameters, GA-ILC is a very flexible but also volatile method to work with. Nonetheless, since GA is expected to produce robust results with respect to fluctuations in the optimizing system, our method opens up a new avenue for cases where the weight factors for the linear combination in a modified ILC method can no longer be obtained in a closed analytical form, e.g., when one is interested in an objective function that deviates from a quadratic form. The larger domain of application of GA-ILC is a very promising feature and will be explored in future communications by the authors.



[Figure 11: four difference sky maps, each on a scale of −30 to 30 µK: COMMANDER − GA-ILC, NILC − GA-ILC, SMICA − GA-ILC, WMAP-ILC − GA-ILC.]

Figure 11. Difference maps between several other foreground reconstruction techniques and GA-ILC implemented on data. Notice that, except for the small residual foreground close to the galactic equator (which itself varies from one method to the other), our results agree very well with those of other methods. The values are in µK.

ACKNOWLEDGEMENTS

We thank the HPC cluster facility “Kanad” of IISER Bhopal for providing access and resources for the crucial part of this research.

8 DATA AVAILABILITY

The data pertaining to this article will be shared on reasonable request to the corresponding author.

REFERENCES

Acquaviva V., Bartolo N., Matarrese S., Riotto A., 2003, Nuclear Physics B, 667, 119

Allen T., Grinstein B., Wise M. B., 1987, Physics Letters B, 197, 66

Bennett C. L., et al., 2003a, ApJS, 148, 1

Bennett C. L., et al., 2003b, ApJS, 148, 97

Bennett C. L., et al., 2013a, ApJS, 208, 20

Bennett C. L., et al., 2013b, ApJS, 208, 20

Bouchet F. R., Gispert R., 1999, New Astron., 4, 443

Dodelson S., 1997, The Astrophysical Journal, 482, 577

Eriksen H. K., et al., 2004a, The Astrophysical Journal Supplement Series, 155, 227–241

Eriksen H. K., Banday A. J., Gorski K. M., Lilje P. B., 2004b, The Astrophysical Journal, 612, 633

Eriksen H. K., et al., 2006, ApJ, 641, 665

Eriksen H. K., Huey G., Saha R., et al., 2007, The Astrophysical Journal, 656, 641

Eriksen H. K., Dickinson C., Jewell J. B., Banday A. J., Gorski K. M., Lawrence C. R., 2008a, ApJ, 672, L87

Eriksen H. K., Jewell J. B., Dickinson C., Banday A. J., Gorski K. M., Lawrence C. R., 2008b, ApJ, 676, 10

Falk T., Rangarajan R., Srednicki M., 1992, Phys. Rev. D, 46, 4232

Gangui A., Lucchin F., Matarrese S., Mollerach S., 1994, ApJ, 430, 447

Geman S., Geman D., 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-6, 721

Gold B., et al., 2011, ApJS, 192, 15

Goldberg D. E., 1989, Genetic algorithms in search, optimization, and machine learning. Addison-Wesley Publishing Company, Inc.

Gorski K. M., Hivon E., Banday A. J., Wandelt B. D., Hansen F. K., Reinecke M., Bartelmann M., 2005, ApJ, 622, 759

Hinshaw G., et al., 2003, ApJS, 148, 135

Hinshaw G., et al., 2007, ApJS, 170, 288

Holland J. H., 1962, J. ACM, 9, 297–314

Ichiki K., 2014, Progress of Theoretical and Experimental Physics, 2014

Jarosik N., Bennett C. L., et al., 2003, ApJS, 145, 413

Jarosik N., et al., 2007, ApJS, 170, 263

Maldacena J., 2003, Journal of High Energy Physics, 2003, 013

Mather J. C., et al., 1994, ApJ, 420, 439

Moore E. H., 1920, Proceedings of the American Mathematical Society

Penrose R., 1956, Proceedings of the Cambridge Philosophical Society, 52, 17



[Figure 12: four panels of ℓ(ℓ+1)C_ℓ/2π (µK²) against multipole ℓ (5 to 30). Top panels: COMMANDER, SMICA and GA-ILC; NILC, WMAP-ILC and GA-ILC. Bottom panels: differences between the COMMANDER, SMICA, NILC and WMAP-ILC spectra and the GA-ILC spectrum.]

Figure 12. Comparison of the angular power spectrum of the GA-ILC clean map with those of several different foreground reconstruction techniques. Bottom panel: differences between those various clean angular power spectra and the GA-ILC clean power spectrum (data implementation). A good agreement between GA-ILC and other methods is seen here as well.

Planck Collaboration et al., 2020a, A&A, 641, A1

Planck Collaboration et al., 2020b, A&A, 641, A4

Planck Collaboration et al., 2020c, A&A, 641, A6

Saha R., 2011, The Astrophysical Journal, 739, L56

Saha R., Jain P., Souradeep T., 2006, The Astrophysical Journal, 645, L89–L92

Saha R., Prunet S., Jain P., Souradeep T., 2008a, Phys. Rev. D, 78, 023003

Saha R., Prunet S., Jain P., Souradeep T., 2008b, Phys. Rev. D, 78, 023003

Sudevan V., Saha R., 2018a, An Application of Global ILC Algorithm over Large Angular Scales to Estimate CMB Posterior Using Gibbs Sampling (arXiv:1810.08872)

Sudevan V., Saha R., 2018b, arXiv e-prints, p. arXiv:1810.08872

Sudevan V., Saha R., 2018c, The Astrophysical Journal, 867, 74

Tegmark M., 1998, ApJ, 502, 1

Tegmark M., Efstathiou G., 1996, Monthly Notices of the Royal Astronomical Society, 281, 1297–1314

Tegmark M., de Oliveira-Costa A., Hamilton A. J. S., 2003, Physical Review D, 68

This paper has been typeset from a TEX/LATEX file prepared by the author.
