
982 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 18, NO. 5, MAY 2009

Bayesian Separation of Images Modeled With MRFs Using MCMC

Koray Kayabol, Student Member, IEEE, Ercan E. Kuruoğlu, Senior Member, IEEE, and Bülent Sankur, Senior Member, IEEE

Abstract—We investigate the source separation problem of random fields within a Bayesian framework. The Bayesian formulation enables the incorporation of prior image models in the estimation of sources. Due to the intractability of the analytical solution, we resort to numerical methods for the joint maximization of the a posteriori distribution of the unknown variables and parameters. We construct the prior densities of pixels using Markov random fields based on a statistical model of the gradient image, and we use a fully Bayesian method with modified-Gibbs sampling. We contrast our work to approximate Bayesian solutions such as Iterated Conditional Modes (ICM) and to non-Bayesian solutions of ICA variety. The performance of the method is tested on synthetic mixtures of texture images and astrophysical images under various noise scenarios. The proposed method is shown to outperform significantly both its approximate Bayesian and non-Bayesian competitors.

Index Terms—Astrophysical images, Bayesian source separation, Gibbs sampling with embedded Metropolis, Markov chain Monte Carlo (MCMC), MRFs.

I. INTRODUCTION

INVERSE problems represent some of the most challenging issues in image processing in the absence of ground truth information. Several cases of inverse problems can be given. In image de-noising and restoration, the source image is estimated from its noisy observation [1]–[3]. In super-resolution image reconstruction and image fusion, the source image is estimated from multiple observations [4], [5]. In image separation, multiple source images are extracted from their multiple mixed observations. This is closely related to the Blind Source Separation (BSS) problem, where the task is to reconstruct independent sources from observations that consist of their mixtures. Bayesian estimation techniques provide an effective and elegant solution to this set of problems.

Manuscript received November 21, 2008; revised November 13, 2008. First published March 24, 2009; current version published April 10, 2009. K. Kayabol was supported by CNR and TUBITAK joint project (No. 104E101) at ISTI-CNR, Pisa, Italy, and Bogazici BAP 08A202 Project. This work was supported in part by ASI contract Planck LFI Activity of Phase E2. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Stanley J. Reeves.

K. Kayabol was with the Electrical and Electronics Engineering Department, Istanbul University, Istanbul, Turkey. He is now with the ISTI, CNR, 56124, Pisa, Italy (e-mail: [email protected]).

E. E. Kuruoğlu is with the ISTI, CNR, 56124, Pisa, Italy (e-mail: [email protected]).

B. Sankur is with the Electrical and Electronics Engineering Department, Bogazici University, 34342, Bebek, Istanbul, Turkey (e-mail: [email protected]).

Digital Object Identifier 10.1109/TIP.2009.2012905

In this paper, we address the image source separation problem. Some current applications of image source separation include astrophysical images [6], document processing [7], and analysis of fMRI data [8].

The classical blind source separation techniques are based on the assumption of source independence and aim to recover unknown sources in the absence of information about the mixing operator. Generally, in the observation model, the mixing operator is assumed to be linear, while there are also works in the literature that consider nonlinear models [9]. Our work is limited to linear schemes since, for the applications considered here, linear mixing is a good first order approximation. A common methodology for source separation is Independent Component Analysis (ICA) [10]. In this study, we use the term "ICA-based methods" to represent the classical source separation methods and consider Bayesian source separation to be a more general framework which includes the ICA-based methods as a special case.

Various approaches exist among ICA-based methods for obtaining independent sources. A popular approach is to minimize the mutual information between the sources to be separated by adjusting the coefficients of the mixing matrix [11]. In another approach, the marginal likelihood function of the mixing matrix is maximized [12]. The third approach is based on the maximization of the non-Gaussianity of the sources [13]. The fourth approach uses the time correlation or nonwhiteness of the sources to make separation possible [15], [16]. The algorithms in this category drop the non-Gaussianity assumption on all sources and compensate for the missing information using the time correlation. The SOBI (Second Order Blind Identification) algorithm [15] exploits the time coherence of source signals using second order stationary statistics and takes the observation noise into consideration. Except for SOBI, these methods are designed for 1-D signals. It is possible to utilize these techniques for 2-D image separation by converting the images to 1-D signals by ordering the pixels lexicographically. However, the optimality of this lexicographic 1-D treatment is not guaranteed, because the neighboring pixel interactions are not properly exploited. In this study, we have used all the first order neighbors of pixels in order to make use of the 2-D local information by adopting a Markov random field (MRF) model.

The emerging category of source separation techniques is based on the Bayesian formalism. In contrast to the classical ICA-based approach, the Bayesian solution to the BSS enables one to introduce a priori knowledge about all of the parameters and variables in the problem. For example, the observation noise is often assumed to be independent and identically distributed (iid) zero-mean Gaussian with a diagonal covariance matrix. In the context of image separation, the prior information appears as the spatial dependency model of source pixels. In ill-posed problems, prior information always leads to more realistic and plausible estimates: this is the case for the Markov Random Field (MRF) constraint imposed upon pixels, as they are no longer allowed to have arbitrary 2-D structures. Since pixels are no longer assumed to be iid, we are effectively reducing the solution space of the source separation problem. We also remark that the Bayesian framework is capable of full optimization of all parameters.

In [12], Belouchrani and Cardoso considered maximization of the likelihood via the Expectation-Maximization (EM) algorithm for separation of discrete sources. Mohammad-Djafari [18] derived the posterior in terms of the likelihood function and assigned prior probability density functions (pdf) to the sources and the mixing matrix. Independently, in [17], Knuth showed that the blind source separation problem can be formulated in a fully Bayesian framework. In [19], Rowe proposed a computational scheme for the solution of this problem formulated in the Bayesian framework utilizing ICM and Gibbs sampling procedures.

The Bayesian BSS algorithms can be considered under two main approaches. The first approach assumes that the sources are hidden (latent) variables [12] and proceeds in two tiers: 1) learning the mixing matrix and the observation noise variance; 2) estimation of the sources based on these learned parameters. The methods in this approach require integration over the sources; therefore, analytically integrable models are preferred. Various applications of this strategy to different source models are as follows. The parameters of a mixture of Gaussians have been obtained by the EM method, as in Moulines et al. [20], Snoussi et al. [21] and Attias [22], and by variational EM by Miskin [23]. Miskin [23] and Kuruoğlu et al. [24] have applied their approaches to various image separation problems. For complex models, integration is not tractable.

In the second approach, one uses the joint posterior density of the complete variable set to obtain the joint estimate of all the variables. In this approach, the Bayesian BSS problem is solved by maximizing the joint posterior density of the sources, the mixing matrix and the noise variances. The joint maximization of the posterior is generally not tractable; hence, several numerical methods have been suggested in the literature. One method for solving the joint posterior modal estimation problem is the ICM method, which maximizes the conditional pdfs sequentially for each variable [19]. If the mode of the conditional density cannot be found analytically, any deterministic optimization method can be used [18]. Under non-Gaussian hypotheses, the ICM method does not guarantee a unique global solution. An alternative approach is Gibbs sampling [19], which is a Monte Carlo method and is performed by drawing random samples from the conditional posterior densities of each variable. After the burn-in period of the simulation, the random samples will appear to be drawn from the true joint pdf, and one can obtain both MAP and Mean Square Error (MSE) estimates of the variables.

Another avenue of research in BSS addresses the source modeling problem. Cardoso et al. [25] have used a stationary Gaussian source model and exploited the spectral diversity as the separability criterion. This approach is based on the nonstationarity in the Fourier domain, which is the third criterion of separability according to Cardoso [16]. The wavelet domain extension of this method is proposed by Moudden et al. [26]. Wavelet domain source separation has also been studied in [27] and [28]. Another frequently used iid model is the mixture of Gaussian densities [20]–[24]. Hosseini et al. [29] proposed a method to separate sources based on a Markov chain model, so that the time correlation information of the sources could be used. Although some of these methods have been applied to the 2-D image separation problem [23], [24], under lexicographic ordering, they all, except for SOBI and the 2-D wavelet-based ones [26], are designed for 1-D signals. This 1-D treatment of image data precludes the full exploitation of the rich 2-D structure of images.

There are also Bayesian source separation studies in the literature designed specifically for 2-D data. Tonazzini et al. [30] proposed blind separation of sources using MRF models. In [30], they used a hybrid alternating maximization method in a simulated annealing scheme. The sources are modeled with convex and nonconvex energy functions and estimated using a deterministic optimization technique, and then the mixing matrix is updated using the Metropolis method. Snoussi et al. [31] proposed the Gaussian MRF model for source images and implemented a fast MCMC algorithm for joint separation and segmentation of mixed images. Kuruoğlu et al. [32] gave an MRF formulation and used it as an edge-preserving regularizer to model the pixel-by-pixel interactions. In their work, Maximum Likelihood (ML) estimation is used for the parameters of the mixing matrix, while estimates of the sources are optimized by cycling through the variables. In another work [7], a mean field approximation is applied to the MRF and the EM algorithm is used to estimate the sources and the mixing matrix parameters. The noise variance and the hyper-parameters of the Gibbs distribution are assumed to be constant throughout their algorithm.

MRF models generally make use of the Gibbs formulation, i.e., the probability density function is expressed in an "energy potential" form. When the Gibbs distribution is such that it preserves the edges, it gives rise to nonconvex energy potentials, and the resulting nonconvexity makes deterministic optimization difficult, so that convergence is not always guaranteed. The methods used to find MAP estimates can be separated into two groups: deterministic and stochastic. Performance of the deterministic methods depends strongly on the convexity of the cost function. Steepest descent and Newton-Raphson are the most common deterministic methods. The stochastic methods do not suffer as much from the lack of convexity, since Gibbs sampling and the simulated annealing used in Monte Carlo (MC) simulations can avoid being trapped in local minima.

These algorithms have been tested on astrophysical images, especially because the separation of these images is of much current interest in the PLANCK project [43]. Maino et al. [33] applied the fast ICA algorithm to the astrophysical component separation problem under the assumption of no noise. The Spectral Matching ICA (SMICA) algorithm is advanced in [25] assuming stationary Gaussian astrophysical components. Even under the Gaussian assumption the mixed components are separable in the Fourier domain. A fully Bayesian version of the mixture of noisy Gaussian sources [25] is proposed in [34]. In this version, Gibbs sampling is used for drawing samples of the mixing matrix and of the noise and source covariance matrices in the frequency domain. The final estimates of the parameters are found by posterior expectations, which are approximated by the empirical means of the Markov chains. The components are estimated via a Wiener filter as in [25]. The SOBI approach was applied to correlated astrophysical component separation in [35]. Another 1-D separation method that utilizes fully Bayesian separation is given in [36], where the authors work in the spatial domain, avoiding the problems raised by ignoring nonstationarity when working in the frequency domain. The first attempt to exploit MRF-modeled astrophysical components was in [32], where an ICM-based Bayesian estimation method was used to find the joint estimate of the components and the mixing matrix.

In this paper, we introduce a new method for the numerical solution of the Bayesian image source separation problem. Our work differs from previous studies in that we provide a full Bayesian solution to the image separation problem utilizing Gibbs sampling for the estimation of the MRF image model. Two novel aspects of this formulation need explanation.

i) We model the source images with edge (gradient) image priors based on the Cauchy distribution. This spatial dependence model of pixels in the MRF setting preserves the sharpness of edges while at the same time providing separability in the sense of spatial correlation. We compare this method with several other methods, namely, the ICA methods based on the assumption of non-Gaussianity, nonwhiteness or spectral diversity, and a Bayesian alternative with an iid source model, the latter to show the loss incurred when spatial correlation in images is not considered.

ii) We solve the MRF model numerically using a modified Gibbs sampling in which we embed Metropolis sampling, as will be detailed in the next sections. The scheme proceeds by drawing random samples from the conditional densities of the unknown variables, in contrast to previous solutions of this problem based on semi-analytical techniques and/or on cycling through parameter estimates as in the expectation-maximization method. The Bayesian sampling enables us to explore the solution space more thoroughly.

The rest of the paper is laid out as follows. In Section II, we define formally the source separation problem in the Bayesian context, and outline deterministic (ICM) and stochastic (Gibbs) methods of solution. The observation model is described in Section III. In Section IV, the image source model, based on directional pixel differences, is given. Sampling techniques from the posterior distribution of sources are also discussed in this section. A number of simulation cases including the astrophysics component separation problem are given in Section V, and, finally, conclusions are drawn in Section VI.

II. PROBLEM DEFINITION IN THE BAYESIAN FRAMEWORK

We assume that the observed images, y_k, k = 1, ..., K, are linear combinations of L source images. Let the kth observed image be denoted as y_k(n), where n = 1, ..., N represents the lexicographically ordered pixel indices. It is assumed that every image in this set is formed by the superposition of independent sources, s_l(n), l = 1, ..., L. The image separation problem consists of finding the L independent sources from the K different observations. Therefore, the observation model can be written as

$$\begin{bmatrix} y_1(n) \\ \vdots \\ y_K(n) \end{bmatrix} = \mathbf{A} \begin{bmatrix} s_1(n) \\ \vdots \\ s_L(n) \end{bmatrix} + \begin{bmatrix} v_1(n) \\ \vdots \\ v_K(n) \end{bmatrix} \qquad (1)$$

where A is a K × L mixing matrix with elements a_{k,l}. The noise v(n) = [v_1(n), ..., v_K(n)]^T is a zero mean noise vector with covariance matrix Σ_v = diag(σ_1², ..., σ_K²) of size K × K. Noise terms are iid in every pixel of every image.

Let s_l and y_k denote N × 1 vector representations of the source and observation images, respectively. In this case, the observation model in (1) can be written as

$$\begin{bmatrix} \mathbf{y}_1^T \\ \vdots \\ \mathbf{y}_K^T \end{bmatrix} = \mathbf{A} \begin{bmatrix} \mathbf{s}_1^T \\ \vdots \\ \mathbf{s}_L^T \end{bmatrix} + \begin{bmatrix} \mathbf{v}_1^T \\ \vdots \\ \mathbf{v}_K^T \end{bmatrix} \qquad (2)$$

$$\mathbf{Y} = \mathbf{A}\mathbf{S} + \mathbf{V} \qquad (3)$$

where the noise V is a zero mean noise matrix with covariance matrix Σ = Σ_v ⊗ I_N of size KN × KN, where I_N is the identity matrix of size N × N. Since the sources are assumed to be independent, the joint probability density is factorized as

$$p(\mathbf{S}) = \prod_{l=1}^{L} p(\mathbf{s}_l). \qquad (4)$$
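As a concrete illustration of the linear observation model in (1)–(3), the short sketch below generates synthetic noisy mixtures from a stack of vectorized source images. The sizes and variable names (K, L, N, A, sigma) are illustrative choices made for this sketch only; they are not taken from the paper or from any released code.

```python
import numpy as np

rng = np.random.default_rng(0)

L, K, N = 3, 3, 64 * 64               # sources, observations, pixels (lexicographic)
S = rng.random((L, N))                # each row is a vectorized source image
A = rng.uniform(0.0, 1.0, (K, L))     # non-negative mixing matrix
sigma = np.full(K, 0.05)              # per-channel noise standard deviations

# Y = A S + V with noise iid in every pixel of every channel, as in (3)
V = sigma[:, None] * rng.standard_normal((K, N))
Y = A @ S + V
print(Y.shape)                        # (K, N): one mixed image per row
```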

It was stated that the pixels s_l(n), n = 1, ..., N, of the source images are not independent. This structural image information can be elegantly incorporated into the source separation solution in the guise of an MRF and its distribution in the Gibbs form.

A. Iterated Conditional Modes

In the BSS problem as formulated in the Bayesian framework, the unknowns S, A and Σ_v can all be estimated by maximizing the posterior density. The MAP estimate in this case becomes

$$(\hat{\mathbf{S}}, \hat{\mathbf{A}}, \hat{\boldsymbol{\Sigma}}_v) = \arg\max_{\mathbf{S}, \mathbf{A}, \boldsymbol{\Sigma}_v} p(\mathbf{S}, \mathbf{A}, \boldsymbol{\Sigma}_v \mid \mathbf{Y}). \qquad (5)$$

Let us define a new variable θ = {S, A, Σ_v} which contains all the unknowns. If a variable is excluded from θ, for example S, we denote the new unknown variable set θ_{-S}. Using the Bayes rule, the conditional densities for the sources, the mixing matrix and the noise variances can be written as

$$p(\mathbf{S} \mid \mathbf{Y}, \boldsymbol{\theta}_{-\mathbf{S}}) \propto p(\mathbf{Y} \mid \boldsymbol{\theta})\, p(\mathbf{S}) \qquad (6)$$

$$p(\mathbf{A} \mid \mathbf{Y}, \boldsymbol{\theta}_{-\mathbf{A}}) \propto p(\mathbf{Y} \mid \boldsymbol{\theta})\, p(\mathbf{A}) \qquad (7)$$

$$p(\boldsymbol{\Sigma}_v \mid \mathbf{Y}, \boldsymbol{\theta}_{-\boldsymbol{\Sigma}_v}) \propto p(\mathbf{Y} \mid \boldsymbol{\theta})\, p(\boldsymbol{\Sigma}_v). \qquad (8)$$

The MAP estimate given in (5) is separated into consecutive maximization steps for each variable. The variables S, A and Σ_v are independent of each other a priori. Using the marginal densities given in (6)–(8), the maximization steps are arranged as (superscripts denote iteration steps)

$$\mathbf{S}^{(t+1)} = \arg\max_{\mathbf{S}} p\big(\mathbf{S} \mid \mathbf{Y}, \mathbf{A}^{(t)}, \boldsymbol{\Sigma}_v^{(t)}\big) \qquad (9)$$

$$\mathbf{A}^{(t+1)} = \arg\max_{\mathbf{A}} p\big(\mathbf{A} \mid \mathbf{Y}, \mathbf{S}^{(t+1)}, \boldsymbol{\Sigma}_v^{(t)}\big) \qquad (10)$$

$$\boldsymbol{\Sigma}_v^{(t+1)} = \arg\max_{\boldsymbol{\Sigma}_v} p\big(\boldsymbol{\Sigma}_v \mid \mathbf{Y}, \mathbf{S}^{(t+1)}, \mathbf{A}^{(t+1)}\big). \qquad (11)$$

The procedure given in (9)–(11) represents the ICM update steps [19], [37]. If the direct solution of this maximization is not possible, iterative optimization methods can be used, such as steepest descent or Newton–Raphson. Difficulties may arise in MAP estimation due to non-Gaussian prior densities since they disturb the convexity of the MAP objective. In this case, one can resort to simulated annealing or Markov Chain Monte Carlo-based (MCMC) numerical Bayesian methods such as Gibbs sampling. These are relaxation type methods; hence, they usually take a long time until convergence.
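A minimal skeleton of the ICM cycle in (9)–(11) is sketched below, assuming the three conditional maximizers are supplied as callables (hypothetical names, not from the paper); the analytical or Newton–Raphson solution of each step is left to those callables.

```python
def icm(Y, S0, A0, Sigma0, argmax_S, argmax_A, argmax_Sigma, n_iter=50):
    """Iterated Conditional Modes: cycle through the conditional maximizations
    (9)-(11), each one performed with the other variables held fixed."""
    S, A, Sigma = S0, A0, Sigma0
    for _ in range(n_iter):
        S = argmax_S(Y, A, Sigma)          # step (9)
        A = argmax_A(Y, S, Sigma)          # step (10)
        Sigma = argmax_Sigma(Y, S, A)      # step (11)
    return S, A, Sigma
```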

B. Gibbs Sampling

Since our priors are non-Gaussian (see Section IV), we resort to MCMC methods to find the MAP estimate of the sources given in (9). However, direct sampling from the joint distribution, p(S, A, Σ_v | Y), is not possible. We, therefore, use the Gibbs sampler to break down the multivariate sampling problem into a set of univariate ones [38], [39]. In other words, sampling of each of the unknowns is conducted separately from its corresponding conditional density. When this iterative procedure converges, the samples are effectively being drawn from what would be the joint density. Recall that in our source separation problem, not only the source images but also the mixing matrix and the noise variances are unknowns. So the limit density of the source separation problem becomes the joint posterior density of all the unknowns given the observations.

Our random sampling scheme for the source images is hybrid wherever we cannot use direct sampling methods; in other words, Metropolis steps are embedded within the Gibbs sampling. Recall that the image estimation part of the source separation problem is cast as an MRF model whose corresponding probability distribution is a Gibbs distribution. The normalization term of the Gibbs distribution, namely the partition function, is intractable. Since the posterior of a pixel is formed by the product of a Gaussian likelihood and a Gibbs prior, the posterior is also intractable and direct sampling is not possible. Therefore, we resort to Metropolis steps.

Consider two neighboring pixels in the scanning direction, say the nth (current one) and the (n+1)th (next one) of source s_l. First we calculate the posterior density of the nth pixel, p(s_l(n) | Y, θ_{-s_l(n)}), and then draw a sample from it with the Metropolis method to update its estimate. With this estimate one obtains the updated ŝ_l; we continue with the same procedure on pixel n+1 with posterior p(s_l(n+1) | Y, θ_{-s_l(n+1)}). This procedure continues until all the pixels have been visited.

TABLE I — Metropolis algorithm for a source image. q: proposal density for the nth pixel of source l; u: uniform positive random number in the unit interval; s*: generated sample to be tried; α: acceptance ratio of the generated sample.

The conditional posterior probability of a pixel can be written as

$$p\big(s_l(n) \mid \mathbf{Y}, s_l(\partial n), \boldsymbol{\theta}_{-s_l(n)}\big) \propto p\big(\mathbf{Y} \mid \mathbf{S}, \mathbf{A}, \boldsymbol{\Sigma}_v\big)\, p\big(s_l(n) \mid s_l(\partial n)\big) \qquad (12)$$

where ∂n represents the first order neighbors of pixel n in the MRF model.

The Metropolis algorithm is used to draw samples from the MRF-modeled images. In the Metropolis algorithm, one starts with an initial or current value and generates a new sample, s*, by using a symmetric proposal density. If the proposal density equals the transition probability, the generated sample is accepted as the updated value with probability 1. However, in real applications, this density is often not known. Hence, a proposal density, q(s* | s_l^{(t)}(n)), is used in its place. The transition is modeled as a random walk process s* = s_l^{(t)}(n) + u, where u is a uniform random number between −Δ and Δ. So the proposal density becomes conditionally uniform such that q(s* | s_l^{(t)}(n)) = U(s_l^{(t)}(n) − Δ, s_l^{(t)}(n) + Δ). The acceptance probability is given as min{1, α}, where α is the acceptance ratio defined such that

$$\alpha = \frac{p\big(s^* \mid \mathbf{Y}, s_l(\partial n), \boldsymbol{\theta}_{-s_l(n)}\big)}{p\big(s_l^{(t)}(n) \mid \mathbf{Y}, s_l(\partial n), \boldsymbol{\theta}_{-s_l(n)}\big)}. \qquad (13)$$

One cycle of the Metropolis algorithm, which is embedded in the Gibbs sampling, for any one source image, is given in Table I. More explicit versions of the pixel posterior and the acceptance ratio can be found in Appendix A.
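The raster-scan Metropolis step of Table I can be sketched as follows. The log-posterior callable and the step size delta are placeholders for the quantities detailed in Appendix A, so this is a sketch under stated assumptions rather than the authors' implementation.

```python
import numpy as np

def metropolis_sweep(s, log_post, delta, rng):
    """One Metropolis cycle over all pixels of one source image (Table I).
    s        : 1-D array of current pixel values (lexicographic order)
    log_post : log_post(n, value, s) -> log of the conditional posterior (12)
               of pixel n evaluated at `value`, given the rest of the image
    delta    : half-width of the symmetric uniform random-walk proposal
    """
    for n in range(s.size):
        s_try = s[n] + rng.uniform(-delta, delta)        # random-walk proposal
        log_alpha = log_post(n, s_try, s) - log_post(n, s[n], s)
        if rng.random() < np.exp(min(0.0, log_alpha)):   # accept with prob min(1, alpha)
            s[n] = s_try
    return s
```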

After sampling all the images with the Metropolis method, the next step is drawing samples of the mixing matrix and of the noise variance. When all of the unknowns are sampled, one iteration of the Gibbs sampling algorithm is completed.

TABLE II — One cycle of Gibbs sampling with embedded Metropolis.

Gibbs sampling modified by embedded Metropolis steps, as used in this study, is given in Table II. Notice that it involves sampling of all the variables from their conditional densities. The Gibbs sampling algorithm must start with initial values, as discussed in Section V-B. In this study, we have assigned the observations themselves as initial values. We start by sampling the first source image with the Metropolis method. After all of its pixels are visited, the algorithm proceeds to the next source image, and so on. When the MCMC simulation of all source images is completed, samples are generated for the mixing matrix. The random sampling of the mixing matrix can be done directly because its conditional density is Gaussian, hence simple to sample. This is due to the fact that the posterior of A consists of the product of a uniform prior and a Gaussian ML term. The posterior density of an element a_{kl} of the mixing matrix is expressed as p(a_{kl} | Y, θ_{-a_{kl}}). The conditional densities are given in Table II, and are also explained in detail in the next two sections. Notice that our scheme is not the classical Gibbs sampling algorithm because we embed Metropolis steps in it for sampling from the source images. Consequently we denote it as modified Gibbs sampling [41] or Gibbs sampling with embedded Metropolis. The steps in Table II are iterated until convergence.
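One cycle of the modified Gibbs sampler of Table II can be outlined as below. The callables sweep_source, sample_A and sample_sigma2 stand in for the conditional draws described in Sections III and IV; the structure is a reading of Table II, not a reproduction of the authors' code.

```python
def gibbs_cycle(Y, S, A, sigma2, sweep_source, sample_A, sample_sigma2, rng):
    """One iteration of Gibbs sampling with embedded Metropolis (Table II).
    sweep_source  : Metropolis sweep over all pixels of one source (Table I)
    sample_A      : draw the mixing matrix from its Gaussian conditional (Sec. III)
    sample_sigma2 : draw channel noise variances from inverse-gamma conditionals
    """
    for l in range(S.shape[0]):                 # visit each source image in turn
        S[l] = sweep_source(l, S, A, sigma2, rng)
    A = sample_A(Y, S, sigma2, rng)             # direct draw from a Gaussian conditional
    sigma2 = sample_sigma2(Y, S, A, rng)        # direct draw from inverse-gamma conditionals
    return S, A, sigma2
```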

It is beneficial to reduce the unknown space, since with a smaller number of unknowns the Gibbs sampling algorithm converges faster and with less bias. For this purpose, we sample only one source image at a time and fix the other ones. Once one source, the mixing matrix and the noise variance have been sampled until convergence, the sampling process restarts for the source next in line. The estimated source, mixing matrix and noise variance obtained from the previous sampling process are used as initial values for the second source. The estimation process is completed when all the sources have been visited. Notice that for the lth source, only s_l, A and Σ_v are sampled. The details of these modified steps are given in Appendix B. After the burn-in period, the last N_2 samples of the N_1 samples are used to obtain the MAP estimate. The MAP estimate of source l becomes [41]

$$\hat{\mathbf{s}}_l = \arg\max_{\mathbf{s}_l^{(t)},\; t = N_1 - N_2 + 1, \ldots, N_1} p\big(\mathbf{s}_l^{(t)} \mid \mathbf{Y}, \boldsymbol{\theta}_{-\mathbf{s}_l}\big) \qquad (14)$$

where t denotes the cycle number in the Gibbs sampling, as described in Table II. Once the source s_l has converged, we take ŝ_l as its final estimate and keep it fixed in all future sampling iterations. Then we proceed with updating A and Σ_v. In fact, the A and Σ_v updated with the lth source become the initial values for the iterations following the convergence of source l. Thus, after all the sources have been estimated, the final estimates of A and Σ_v are found. For the MAP estimate in (14), one can collect all samples and find their mode. Alternatively, one can use all samples to obtain an MSE estimate. We have observed, however, that the last samples in the sampling iterations for s_l, A and Σ_v do not differ much from their MAP estimates, that is, the empirical mode of their sample distributions.

In the case that the noise variance is known, we do not need to draw samples for it. For example, in the astrophysical image separation problem, which will be presented in Section V-B, the noise variances are known, so for this case we can skip the variance sampling step.

III. OBSERVATION MODEL

Since the observation noise is assumed to be independent and identically distributed zero-mean Gaussian at each pixel, the likelihood is expressed as

$$p(\mathbf{Y} \mid \mathbf{S}, \mathbf{A}, \boldsymbol{\Sigma}_v) = \prod_{n=1}^{N} \mathcal{N}\big(\mathbf{y}(n); \mathbf{A}\mathbf{s}(n), \boldsymbol{\Sigma}_v\big) \qquad (15)$$

$$= \prod_{n=1}^{N} (2\pi)^{-K/2} |\boldsymbol{\Sigma}_v|^{-1/2} \exp\Big\{ -\tfrac{1}{2} \big[\mathbf{y}(n) - \mathbf{A}\mathbf{s}(n)\big]^T \boldsymbol{\Sigma}_v^{-1} \big[\mathbf{y}(n) - \mathbf{A}\mathbf{s}(n)\big] \Big\}. \qquad (16)$$

The prior distributions of the elements of A are chosen as non-negative uniform distributions due to the lack of any information to the contrary. Therefore, we assume that the prior of a_{kl} is uniform between 0 and a fixed upper bound, and we reject any negative samples. The conditional density of a_{kl} is expressed as p(a_{kl} | Y, θ_{-a_{kl}}). From (15), it can be seen that the conditional density of a_{kl} becomes Gaussian, N(a_{kl}; μ_{kl}, σ²_{kl}), where

$$\mu_{kl} = \frac{\sum_{n=1}^{N} s_l(n)\,\big[y_k(n) - \sum_{j \neq l} a_{kj} s_j(n)\big]}{\sum_{n=1}^{N} s_l^2(n)}, \qquad \sigma^2_{kl} = \frac{\sigma_k^2}{\sum_{n=1}^{N} s_l^2(n)}. \qquad (17)$$

In principle, one can use the mean of this marginal density as an estimate of a_{kl} in the tth step, since the MAP and MSE estimates of a Gaussian random variable coincide with its mean. However, we do not prefer this estimation, in order to take advantage of MC methods. In other words, samples are generated from the whole range of the variables, so that we minimize the chance of getting stuck in a local extremum. Since the elements of the mixing matrix are independent, we can draw their samples directly using the conditional density, p(a_{kl} | Y, θ_{-a_{kl}}).
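The direct draw of one mixing-matrix element from its Gaussian conditional, with negative proposals rejected as the non-negative uniform prior requires, can be sketched as below. The closed-form mean and variance follow the reconstruction of (17) given above, under the notation of this section; they are not copied from the authors' code.

```python
import numpy as np

def sample_a_kl(k, l, Y, S, A, sigma2, rng, max_tries=1000):
    """Draw a_{kl} from its Gaussian conditional, rejecting negative values
    to honor the non-negative uniform prior."""
    s_l = S[l]                                    # N-vector of source l
    # channel-k residual with the contribution of source l removed
    r = Y[k] - A[k] @ S + A[k, l] * s_l
    denom = np.dot(s_l, s_l)
    mean = np.dot(s_l, r) / denom                 # conditional mean
    std = np.sqrt(sigma2[k] / denom)              # conditional standard deviation
    for _ in range(max_tries):
        a = rng.normal(mean, std)
        if a >= 0.0:                              # enforce the non-negative prior
            return a
    return max(mean, 0.0)                         # fallback if rejection keeps failing
```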

If the noise is assumed stationary and identical at every pixel of an observation channel, we can define one variance σ_k² for each channel, k = 1, ..., K. According to the likelihood given in (15) and choosing a uniform prior, the posterior distribution of σ_k² is in the form of an inverse-gamma density, p(σ_k² | Y, θ_{-σ_k²}) = IG(σ_k²; ν_k, η_k), where

$$\nu_k = \frac{N}{2} - 1, \qquad \eta_k = \frac{1}{2} \sum_{n=1}^{N} \Big[ y_k(n) - \sum_{l=1}^{L} a_{kl} s_l(n) \Big]^2. \qquad (18)$$
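A short sketch of the per-channel noise-variance draw follows; the shape and scale used here correspond to the flat-prior reconstruction of (18) given above, which is an assumption of this sketch.

```python
import numpy as np

def sample_noise_variances(Y, S, A, rng):
    """Draw each channel noise variance from its inverse-gamma conditional,
    using the flat-prior shape and scale assumed in the sketch of (18)."""
    K, N = Y.shape
    R = Y - A @ S                                   # per-channel residual images
    sigma2 = np.empty(K)
    for k in range(K):
        shape = 0.5 * N - 1.0                       # inverse-gamma shape (flat prior)
        scale = 0.5 * np.sum(R[k] ** 2)             # inverse-gamma scale
        sigma2[k] = scale / rng.gamma(shape)        # IG draw via reciprocal of a Gamma
    return sigma2
```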

IV. SOURCE MODELS

For the image sources, we have adopted a non-Gaussian MRF model. In this novel approach, we use the iid Cauchy source distribution as a model for neighboring pixel differences in the MRF model. In addition, we have also used the iid Cauchy source distribution as a model of pixel intensities in the iid model, to expose the superiority of the non-Gaussian MRF over the non-Gaussian iid model.

A. Non-Gaussian iid Model

The non-Gaussianity assumption is one of the requirements for separability in the case of iid data. For the non-Gaussian iid model, we have used the iid Cauchy density. The iid Cauchy density for a source image can be written as

$$p(\mathbf{s}_l) = \prod_{n=1}^{N} \frac{1}{\pi} \frac{\gamma_l}{\gamma_l^2 + \big(s_l(n) - \mu_l\big)^2} \qquad (19)$$

where μ_l is the sample mean of the source image, defined as

$$\mu_l = \frac{1}{N} \sum_{n=1}^{N} s_l(n). \qquad (20)$$

The γ_l is the scale parameter of the iid Cauchy density.
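In log form, the iid Cauchy prior of (19)–(20) reduces to the short function below (notation as above; gamma stands for the scale parameter γ_l).

```python
import numpy as np

def log_iid_cauchy(s, gamma):
    """Log of the iid Cauchy prior (19) for a vectorized source image `s`,
    with location set to the sample mean (20) and scale `gamma`."""
    mu = s.mean()
    return np.sum(np.log(gamma / np.pi) - np.log(gamma ** 2 + (s - mu) ** 2))
```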

B. MRF Model

We assume furthermore that the sources s_l are modeled with MRFs and that their densities are chosen as Gibbs distributions with nonconvex energy potential functions. In this statistical image model, we form the cliques using the first order neighboring pairs. Let n and m represent the coordinates of two adjacent pixels, and let ∂n represent the set of first order neighbors of n. The entire clique set can be defined as C = {(n, m) : m ∈ ∂n, n = 1, ..., N}. The energy function is expressed as the summation of all clique potentials

$$U(\mathbf{s}_l) = \beta \sum_{(n,m) \in C} \varphi\big(s_l(n) - s_l(m)\big) \qquad (21)$$

where φ(·) is the potential function. In inverse image problems, one is tempted to choose the potential function as convex to avoid ill-posed situations. However, potential functions that avoid over-smoothing and tend to preserve edges lead us to nonconvex choices. More details can be found in [1]–[3]. The β in (21) is the parameter of the random field.

The probability density of s_l is expressed in a Gibbs formulation as

$$p(\mathbf{s}_l) = \frac{1}{Z} \exp\big\{ -U(\mathbf{s}_l) \big\} \qquad (22)$$

where Z is the partition function that ensures that the total probability equals 1. The density given in (22) can be written in vector form using the cliques in the eight compass directions as

$$p(\mathbf{s}_l) = \frac{1}{Z} \exp\Big\{ -\sum_{d=1}^{8} \beta_d \sum_{n=1}^{N} \varphi\big([\Delta_d \mathbf{s}_l](n)\big) \Big\} \qquad (23)$$

where the clique differences in each direction are defined as Δ_d s_l = s_l − P_d s_l. Here, P_d is the one pixel shift operator in direction d.

In (23), we have reverted to the edge image domain, Δ_d s_l, from the original image domain, s_l. One reason is that the description of the probability distribution function of an edge image is considerably simpler as compared to that in the original intensity domain. Furthermore, there exist well-established pdf models for edge images characterized by heavy tails. In this work we opt to describe edge pixels in terms of the Cauchy density. Since the Cauchy distribution has heavy tails, it is a good statistical model for edge images. There are obviously other types of heavy-tail distributions and clique potential functions. However, the focus of this study is not the comparison of these functions. The clique potential of the Gibbs density under the Cauchy assumption then becomes

$$\varphi\big([\Delta_d \mathbf{s}_l](n)\big) = \log\Big( 1 + \big([\Delta_d \mathbf{s}_l](n) / \delta\big)^2 \Big) \qquad (24)$$

where δ is the scale parameter of the Cauchy distribution, which can also be called the threshold parameter of the regularization function. A proper choice of the potential function, φ, such as the one in (24), enables the Gibbs distribution to act as an edge preserving prior.
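A compact sketch of the Gibbs energy (21)–(24) for a 2-D image follows, using the Cauchy-type potential φ(x) = log(1 + (x/δ)²) and, for simplicity, a single weight beta shared by all directions (the formulation above allows one weight per direction); it is an illustration, not the authors' code.

```python
import numpy as np

def mrf_energy(img, beta, delta):
    """Gibbs energy (21)-(24) of a 2-D image with Cauchy clique potentials.
    The eight compass-direction cliques are covered once each via the four
    unique pixel-difference offsets."""
    phi = lambda d: np.log1p((d / delta) ** 2)
    U  = np.sum(phi(img[:, 1:] - img[:, :-1]))      # horizontal cliques
    U += np.sum(phi(img[1:, :] - img[:-1, :]))      # vertical cliques
    U += np.sum(phi(img[1:, 1:] - img[:-1, :-1]))   # diagonal cliques
    U += np.sum(phi(img[1:, :-1] - img[:-1, 1:]))   # anti-diagonal cliques
    return beta * U
```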

V. SIMULATION RESULTS

In this section we illustrate the performance of the proposed Bayesian algorithm, Gibbs sampling with the MRF model (GS-MRF), compared to ICA-based and other Bayesian source separation algorithms. We compare the solutions of four ICA-based techniques, namely Fixed Point ICA, SOBI 1-D, SOBI 2-D and SMICA, and two Bayesian approaches, namely ICM (as in Section II-A) and Gibbs sampling (as in Section II-B) with the iid Cauchy model, GS-iid (as in Section IV-A). We first experiment on a textbook case, i.e., a mixture of texture images, and then on a more realistic problem, that of astrophysical images.

The Peak Signal-to-Interference Ratio (PSIR) is used as a numerical performance indicator for source separation

$$\mathrm{PSIR}_l = 10 \log_{10} \frac{N \max_n s_l^2(n)}{\sum_{n=1}^{N} \big(\hat{s}_l(n) - s_l(n)\big)^2}. \qquad (25)$$

For multiple components, the overall PSIR is obtained by geometric averaging of the component PSIRs, which corresponds simply to arithmetic averaging of their decibel values. Before calculation of the PSIR figures, the estimated sources are ordered by correlation matching and their intensities are re-scaled. Recall that the sources can be recovered subject to two types of ambiguity: their scale and their ordering may differ from those of the actual sources. Note, however, that the re-scaling is done solely for the PSIR calculation, but otherwise it is not needed for the execution of the algorithm. In Table III, we give PSIR values at different noise levels for the methods mentioned above.

Another performance measure focuses on the estimated mixing matrix and is calculated from the matrix obtained by multiplying the (pseudo-)inverse of the estimated mixing matrix with the true one. If there were no permutations in the estimate, this matrix would be an identity matrix. We use the following measure:

(26)

as an indicator of the distance of this matrix from the identity matrix.

TABLE III — Comparison of the average PSIR values obtained by FPICA, SOBI 1-D, SOBI 2-D, SMICA, ICM in [32], Gibbs sampling (GS) with the MRF model (GS-MRF), and Gibbs sampling with the iid Cauchy model (GS-iid).

Fig. 1. Simulation experiments with noiseless texture images. First column: original images; second column: images mixed with A; third column: sources estimated via Gibbs sampling.
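The scoring pipeline described above — correlation matching, re-scaling, and the PSIR — can be sketched as follows. The closed form used for PSIR is one reading of (25) (peak power of the true source over the residual energy), and the greedy matching is an illustrative simplification, so neither should be taken as the exact procedure of the paper.

```python
import numpy as np

def psir_db(s_true, s_est):
    """PSIR of one source after ordering and re-scaling: peak power of the
    true source over the residual (interference plus noise) energy."""
    err = s_est - s_true
    return 10.0 * np.log10(s_true.size * np.max(s_true ** 2) / np.sum(err ** 2))

def match_and_rescale(S_true, S_est):
    """Order the estimated sources by correlation matching with the true ones
    and re-scale each matched estimate by least squares (for scoring only)."""
    L = S_true.shape[0]
    C = np.abs(np.corrcoef(S_true, S_est)[:L, L:])   # cross-correlation block
    order = np.argmax(C, axis=1)                     # greedy correlation matching
    out = np.empty_like(S_true)
    for l in range(L):
        e = S_est[order[l]]
        out[l] = e * (np.dot(e, S_true[l]) / np.dot(e, e))   # LS re-scaling
    return out
```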

A. Synthetic Mixture of Texture Images

We first conducted image separation experiments on textures that ranged from geometrical patterns to random noise. Sample images of the original textures and their mixtures are shown in the first and second columns of Fig. 1, respectively. The mixing matrix, in this case, was chosen as

(27)

The noise variance and the β parameter determine the contributions of the likelihood and of the image prior to the estimation process, respectively. There are only two user defined parameters, namely β, the parameter of the Gibbs distribution, and δ, the scale parameter of the Cauchy density. These two parameters were assumed homogeneous over the images. We have chosen the Gibbs distribution parameter β by assigning equal weights to the 8 clique potentials, and the scale parameter δ of the Cauchy density was chosen after a few experimental runs. Notice that the optimal β depends upon the noise level, as it indicates the contribution of the prior model vis-à-vis the observations. If the noise level is high, the variance of the noise decreases the contribution of the observations to the posterior. In this case we must increase the contribution of the prior by increasing the value of β. The β parameter also controls the smoothness of the images. For example, an unnecessarily large value of β would lead to excessive smoothing. We took δ as fixed for all experiments, but β must be increased in proportion to the noise level. We tested the performance of the proposed algorithm for several peak signal-to-noise ratios (PSNR), ranging upward from 15 dB. Equal amounts of iid Gaussian noise of appropriate variance were added to the observed mixtures to result in a target PSNR. The mixing matrix is initialized with 1's on the diagonal and 0.5's on the off-diagonals. Finally, the initial values of the sources are assigned by using the given observations. The numbers of sources and observations are equal (K = L = 3), so the assignment is one-to-one. The estimated sources in the noise-free case are shown in the last column of Fig. 1.

The performance figures of PSIR and of the mixing-matrix distance measure in (26) are listed in Tables III and IV, respectively. The PSIR values listed in Table III are average PSIRs of the three separated sources. We compare our algorithm against two other algorithms available in the ICALAB Toolbox [40]. Among these we opted for the following popular ones: Fixed Point (or Fast) ICA and SOBI. In the 1-D SOBI algorithm, we build the covariance matrices with 70 lags, which for images correspond to 70 neighboring pixels. We have also adapted the SOBI algorithm to 2-D by calculating covariance matrices using 8 spatial shifts in a 3 × 3 window (a sketch of this construction is given below). We also tested 24 shifts in a 5 × 5 window and 48 shifts in a 7 × 7 window; however, 8 spatial shifts gave much better results. The 2-D adapted SOBI algorithm performed much better, but still could not exceed the Gibbs sampling (GS-MRF) results. Although the comparison is not fair considering that the MRF model uses only 8 neighboring pixels instead of 70, the results show that the MRF model with first order neighborhood truly outperforms the SOBI algorithms. The nearest competitor to our method was SMICA. SMICA was realized with 4 ring-shaped frequency bands, and the number of bands was chosen to yield the best results.
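The 2-D adaptation of SOBI mentioned above builds one lagged covariance matrix per spatial shift. A possible construction is sketched below; it uses circular shifts as a boundary approximation and is an illustration only, not the ICALAB implementation.

```python
import numpy as np

def spatial_shift_covariances(X_imgs):
    """One K x K lagged covariance matrix per non-zero shift in a 3 x 3 window,
    computed between the observation stack and its spatially shifted copy.
    X_imgs: array of shape (K, H, W), one observed image per channel.
    Note: np.roll applies circular shifts, a boundary approximation."""
    K, H, W = X_imgs.shape
    X = X_imgs - X_imgs.mean(axis=(1, 2), keepdims=True)   # remove channel means
    flat = X.reshape(K, -1)
    covs = []
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            shifted = np.roll(np.roll(X, dy, axis=1), dx, axis=2).reshape(K, -1)
            covs.append(flat @ shifted.T / flat.shape[1])   # K x K covariance at this lag
    return covs
```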

We have also compared the proposed algorithm with two versions of the joint MAP estimation: the ICM method as in [32], as well as Gibbs sampling in the absence of any neighborhood model. While the Huber energy potential was used in [32], we have preferred the Cauchy potential model, as in (24). We have used the Newton-Raphson technique for ICM. An important remark is that the ICM algorithm did not converge when the PSNR is under 30 dB on this data set. The Bayesian Gibbs sampling with the neighborhood model surpasses the ICM method and the iid Gibbs sampling, as well as the ICA and SOBI algorithms. The algorithm that gets closest to the proposed method in the noiseless case is SMICA. Similarly, the proposed algorithm surpasses its competitors significantly in the mixing-matrix distance measure, as shown in Table IV. In interpreting the results, recall that the best value of this measure is zero, and that it generally decreases as the PSNR increases. However, the behavior of the measure is not always monotonic with PSNR.


TABLE IV — Mixing-matrix distance measure (26) for FPICA, SOBI 1-D, SOBI 2-D, SMICA, ICM in [32], Gibbs sampling with the MRF model (GS-MRF), and Gibbs sampling with the iid Cauchy model (GS-iid).

Fig. 2. Average improvement in PSIR of the algorithms as a function of PSNR.

In Fig. 2, we have plotted the average PSIR improvement as a measure of processing gain vis-à-vis the initial PSIR, that is, that of the raw observations. A 1-dB PSIR improvement in this three-source example corresponds to a 3-dB improvement in total. In the case of noisy observations, the Bayesian source separation based on Gibbs sampling proves to be superior to the ICM-based Bayesian solution and to the ICA methods.

Fig. 3 displays the source separation results for the 35-dB PSNR case. The first source is well separated by each algorithm. The second source cannot be separated by FPICA. The third source is not well separated by the ICA algorithms, as there remain vestiges of interference and it is noisier. The proposed algorithm can separate the first and third textures well and makes the second source more visible.

The PSIR values in Table V give more insight into the behaviors of the algorithms. For example, SMICA attains the highest PSIR for Texture 2 while at the same time it is the worst for Texture 1, hence it is quite unstable from source to source. In contrast, the proposed Bayesian Gibbs sampling exhibits more stable behavior; furthermore, it outperforms all the other methods on Textures 1 and 3.

In Table VI, the computation times of the GS-MRF, ICM, FPICA and SMICA algorithms are given under 40 dB PSNR. It is seen from the execution time figures that the computational complexity of the GS-MRF is significantly higher compared to the others. However, this burden can be alleviated by cascading two methods. More explicitly, in the method denoted as FPICA + GS-MRF, we first run the FPICA algorithm so that its output becomes the input (initial values) to the Gibbs sampling method. The cascade treatment brings in two advantages. First, the source separation performance improves. The scores in Table V reveal that FPICA + GS-MRF achieves a total improvement over FPICA of 41.38 dB (sum of component improvements), the average being 13.79 dB. Second, the Gibbs method is accelerated almost by an order of magnitude, since FPICA + GS-MRF converges in 11600 steps (43.82 min) while GS-MRF alone requires 73000 iterations (275.81 min). Such a cascade stratagem to reduce convergence time was also applied to inverse image problems solved by complex numerical methods by Tian and Ma [42].

Fig. 3. Simulation results of texture images under 35-dB PSNR. First column, originals; second column, FPICA; third column, Gibbs sampling; fourth column, SMICA. The numbers below the images represent the corresponding PSIR values.

TABLE V — Individual PSIR (dB) values of estimated texture images in the noiseless case.

To show the sensitivity of the method to its parameters β and δ, we performed an experiment where they were varied around the β and δ values of their optimal setting. We chose the noiseless case of Texture 3 and its parameters for this experiment. The PSIR value for this case was shown in Table III. In Table VII, the PSIR results and the number of iterations for Texture 3 are shown for the perturbed values of β and δ. The method appears less sensitive to the δ parameter, so we took it fixed through all noise experiments. However, we tuned the β parameter according to the noise levels.

TABLE VI — Computation times of the algorithms for 40 dB PSNR.

TABLE VII — PSIR results and number of iterations for Texture 3 under perturbed values of β and δ, to show the sensitivity of the method to these parameters.

B. Separation of Components in Astrophysical Images

Separation of astrophysical images is one of the most important test cases and application areas of blind source separation. The astrophysical observations obtained at different microwave frequencies arrive mixed with each other, and they are also contaminated by sensor noise. The known galactic and extragalactic components are: the cosmic microwave background, galactic dust, synchrotron radiation, free-free emission and the thermal Sunyaev–Zeldovich (SZ) effect. The most important component in microwave astronomy is the Cosmic Microwave Background (CMB), which is considered to be the proof of the hot Big-Bang theory [43]. The future PLANCK satellite mission is expected to expose the sky map at a higher resolution than the past missions. The importance of the CMB, and especially of its anisotropy pattern, lies in the fact that it gives us information on the beginning, evolution and future of our universe. Therefore, the accuracy of the CMB estimates is critical for the calculation of the cosmic parameters.

In measurements at low frequencies, the CMB is dominant, but the antenna noise is also higher. At high frequencies, the antenna noise is lower, but in this case components other than the CMB are expressed more strongly. In the microwave bands, galactic dust and synchrotron radiation overlap spectrally, though they are relatively weak (about one order of magnitude below the CMB). The additive white Gaussian observation noise can in some cases be stronger than certain individual signal levels.

The astrophysics images used in the simulations are shown in Fig. 4. These are the CMB, synchrotron and dust images for the PLANCK mission of the European Space Agency (ESA) [43]. In the first row of Fig. 5, the histograms of the three components are shown. The histogram of the CMB is similar to a Gaussian, but the others do not resemble any known distribution. In the second row of Fig. 5, the histograms of the gradient images of the components, obtained along a single direction, are shown. It can be seen that these histograms can be well modeled by Cauchy distributions, which shows that our assumption of edge images possessing a heavy-tailed distribution is valid also for astrophysics images.

Fig. 4. (a) Original images of the astrophysics components. (b) Estimated astrophysics components via the Gibbs sampling approach to Bayesian estimation.

Fig. 5. First row: histograms of the astrophysical components in Fig. 4. Second row: histograms of the edge (gradient) images of the corresponding components. Solid (red) lines represent the approximate Cauchy densities.

Fig. 6. DFT magnitudes of the astrophysical components in Fig. 4.
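The empirical check behind Fig. 5 — that single-direction gradient images are well modeled by Cauchy densities — can be reproduced with a few lines. SciPy's generic maximum-likelihood fit is used here as a stand-in for whichever fitting procedure produced the curves in the figure.

```python
import numpy as np
from scipy.stats import cauchy

def fit_gradient_cauchy(img):
    """Fit a Cauchy density to the single-direction gradient (edge) image,
    as an empirical check of the heavy-tailed prior assumption (cf. Fig. 5)."""
    grad = np.diff(img, axis=1).ravel()     # horizontal pixel differences
    loc, scale = cauchy.fit(grad)           # maximum-likelihood location and scale
    return loc, scale
```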

The Discrete Fourier Transforms (DFT) of the astrophysical components can be seen in Fig. 6. The CMB has a distinct Fourier spectrum compared to the others, but the spectra of the synchrotron and dust components are similar to each other. This makes separability in the spectral domain very hard, especially in the case of a high noise level. This is the reason why the SMICA method fails to separate synchrotron and dust from noisy mixtures. This empirical fact will be verified by the experimental results in the following paragraphs of this section.

The observation images are generated by using a 9 × 3 mixing matrix simulating nine images at the frequencies 30, 44, 70, 100, 143, 217, 353, 545, and 857 GHz. In other words, we have K = 9 observations and L = 3 sources. The mixing matrix is as follows:

(28)

TABLE VIII — PSIR (dB) values of the estimated astrophysics components and mixing matrix in the noiseless case.

The separated components are displayed in Fig. 4 and their performance figures are presented in Table VIII. In the noiseless case, the mixture at 857 GHz looks identical to the dust component (cf. the two bottom right images in Fig. 4). In fact, the PSIR of the dust component is 93.29 dB, so that it can be practically considered as separated; hence, we did not iterate over it. GS-MRF achieved 55 dB separation for the CMB component, which means that it is estimated correctly for all practical purposes. However, the synchrotron component is not separated as accurately, since it was weakly observed. For comparison, we used the FPICA, SOBI 1-D, SOBI 2-D, SMICA, and GS-iid algorithms. SMICA had 8 ring-shaped frequency bands. Astrophysicists assume that the CMB has a stationary Gaussian distribution. Therefore, we considered three source models for the Bayesian technique to expose the advantage of the MRF model. These source models were: 1) all sources have MRF models, hence only the GS-MRF method is used; 2) the CMB is iid Gaussian and the others are iid Cauchy, hence only GS-iid is used; 3) the CMB is a Gaussian MRF and the others are Cauchy MRFs, hence a Gaussian/Cauchy GS-MRF variant is used. The GS-MRF and the Gaussian/Cauchy variant are slightly better than the GS-iid. Following the GS methods, SMICA performs well on the CMB and SOBI 2-D does well with synchrotron and dust.

For the noise experiments, we add stationary Gaussian noise to each channel with different variances to emulate the PSNR figures of the antennas at the different bands. These PSNR values correspond to the channel noise variance figures. Although the noise power would be spatially varying in reality, in this experiment we have assumed it to be stationary over the observed sky. We have assigned to it strengths higher than the expected RMS values stated by the PLANCK team. Typical observations and the corresponding PSNR values are shown in Fig. 7. The PSNR values vary from 6 to 25 dB, the lowest PSNRs being encountered in the low frequency bands. The components estimated from the noisy observations via SOBI 2-D, SMICA and GS-MRF are shown in Fig. 8, with performance figures given in Table IX. The FPICA algorithm gives the worst results; in fact, its outcomes are poorer than the observed images. SOBI 2-D gives much better results than both SMICA and SOBI 1-D. In Fig. 8, the CMB and dust components are clearly visible, but the synchrotron component is not evident for the SOBI 2-D and SMICA methods. These methods fail to separate synchrotron from the noisy mixtures. GS-iid performs nearly as well as SOBI 2-D, but GS-MRF gives by far the best overall results. Finally, the Gibbs varieties have good performance under the mixing-matrix distance metric as well. We observe that only the synchrotron component has not been estimated accurately. The reconstructed synchrotron image appears blurred due to the smoothing caused by the assumption of a model noise variance higher than the real value.

Fig. 7. Synthetic mixtures of astrophysics components contaminated by noise. PSNR values are stated on top of each image.

Fig. 8. Estimated astrophysics components via the Gibbs sampling method from noisy observations.

For GS-MRF, the numbers of iterations in the noiseless and noisy cases are 50000 and 35500, respectively. The number of iterations in the noisy case is lower than in the noiseless case, because the improvement in PSIR in the noisy case is smaller.


TABLE IX — PSIR (dB) values of the estimated astrophysics components and mixing matrix in the noisy case.

VI. CONCLUSION

In this study, we have formulated the image separation problem within a Bayesian framework. We have used the Cauchy distribution as a spatial prior to model pixel differences. Furthermore, we have used the Gibbs sampling method, which, in principle, should explore the solution space more comprehensively than, for example, the ICM method. We have exploited the advantage of Bayesian methods, which, instead of deterministic values of model parameters, allow flexible prior distributions to incorporate prior information. Finally, the flexibility of the Bayesian approach enables the extension of the proposed method to cases of nonlinear or convolutional mixtures.

A detailed comparison of seven source separation methods, three in the Bayesian category and four in the ICA category, has been performed. Experimental results on astrophysical image separation, which is presently the most relevant source image separation problem, have been presented. The superiority of the proposed Bayesian technique has been demonstrated in the component separation of noisy astrophysical images. FPICA did not do well under noise. The SOBI and SMICA methods failed because the spectral shapes of the synchrotron and dust components are quite similar, and the noise eclipsed the already meager spectral diversity. Bayesian separation that models pixel interactions with MRFs has given better results than separation based on the stationary non-Gaussian assumption. The GS-MRF method is also shown to be more robust in convergence than the ICM method.

There are three important paths along which to develop this work further. First, one can work to reduce the computational complexity of the proposed algorithm, which is currently not appropriate for fast image handling. Presently, the observations form the initial values; that is, we initialize the algorithm with the noisy linear combinations of the sources. Obviously, this is not the best way to initialize. We have seen that a two-tier architecture, where FPICA runs first to generate the initial values for GS-MRF, not only improves the results significantly but also speeds up the GS part by an order of magnitude. Another stratagem to speed up the Gibbs sampling could be parallel sampling schemes.
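A minimal sketch of the two-tier initialization mentioned above, assuming hypothetical fpica() and gs_mrf() routines (neither name refers to an existing library): the fast ICA pass supplies rough source and mixing-matrix estimates that replace the raw noisy mixtures as the sampler's starting point.

```python
# Hypothetical two-tier pipeline: a fast ICA pass seeds the MCMC sampler.
def separate_two_tier(observations, fpica, gs_mrf, n_gibbs_iters=5000):
    # Tier 1: fast but noise-sensitive initialization.
    sources0, mixing0 = fpica(observations)
    # Tier 2: Bayesian refinement started from the ICA estimates instead of
    # the noisy mixtures, which shortens the burn-in considerably.
    return gs_mrf(observations, init_sources=sources0,
                  init_mixing=mixing0, n_iters=n_gibbs_iters)
```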

Second, we have to address the problem of spatially varying characteristics of astrophysical images. For example, the mixing coefficients of the synchrotron and dust components are spatially varying. The observation noise is also not stationary, though recent developments in antenna design mitigate this effect. Over small patches, such as we are currently using, the mixing coefficients can be assumed constant, but over the whole sky map they are spatially varying. Hence, the user-defined parameters must be appropriately estimated for each patch. This may improve both the estimation performance and the speed of the algorithm. Third, for the nonstationary mixing case, the mixing matrix itself can be modeled as an MRF.

APPENDIX A
DETAILS OF METROPOLIS

The posterior of a pixel of a source is composed of (15), (16), (21), (22), and (24). It is explicitly written as

(29)

where the mean is represented by

(30)

Then, for that pixel, the acceptance ratio in (13) becomes

(31)

where the candidate intensity value for the pixel is drawn from the proposal density.
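Because the explicit forms of (29)–(31) are not reproduced above, the following is only a generic sketch of the pixel-wise Metropolis step they describe: a candidate intensity is proposed from a symmetric random walk around the current value and accepted with probability min(1, posterior ratio). The function name log_pixel_posterior and the proposal scale are placeholders rather than the paper's notation.

```python
import numpy as np

def metropolis_pixel_update(value, log_pixel_posterior, proposal_std=1.0, rng=None):
    """One Metropolis update of a single pixel.

    log_pixel_posterior(v) must return the log of the pixel's full conditional
    (likelihood term plus MRF prior term), up to an additive constant. With a
    symmetric Gaussian random walk, the acceptance probability reduces to
    min(1, posterior ratio).
    """
    rng = np.random.default_rng() if rng is None else rng
    candidate = value + rng.normal(0.0, proposal_std)
    log_alpha = log_pixel_posterior(candidate) - log_pixel_posterior(value)
    if np.log(rng.random()) < log_alpha:
        return candidate  # accept the proposed intensity
    return value          # reject; keep the current intensity
```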

APPENDIX B
DETAILS OF GIBBS SAMPLING

One cycle of the Gibbs sampling scheme draws, in turn, the sources, the mixing matrix, and the noise variance, each from its full conditional given the current values of the other unknowns and the observations:

(32)

(33)
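Schematically, one such cycle can be sketched as below; the three conditional samplers are placeholders standing in for (32)–(33) and for the Metropolis-within-Gibbs source update of Appendix A, since the exact expressions are not reproduced here.

```python
def gibbs_cycle(sources, mixing, noise_var, observations,
                sample_sources, sample_mixing, sample_noise_var):
    """One cycle of the Gibbs sampler: each block is drawn from its full
    conditional given the most recent values of the other blocks.

    The three sample_* callables are placeholders; the source update is
    itself a Metropolis-within-Gibbs sweep over pixels (Appendix A)."""
    sources = sample_sources(observations, mixing, noise_var, sources)
    mixing = sample_mixing(observations, sources, noise_var)
    noise_var = sample_noise_var(observations, sources, mixing)
    return sources, mixing, noise_var
```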

ACKNOWLEDGMENT

The authors would like to thank the PLANCK mission team for preparing the expected astrophysical component maps, and D. Herranz, Instituto de Física de Cantabria, Spain, for providing a suitable version of these maps and for his valuable explanations about astrophysical maps.



Koray Kayabol (S'03) was born in Sakarya, Turkey, in 1977. He received the B.Sc., M.Sc., and Ph.D. degrees in electrical and electronics engineering from Istanbul University, Istanbul, Turkey, in 1997, 2002, and 2008, respectively.

Between 2001 and 2008, he was with the Electrical and Electronics Engineering Department, Istanbul University, as a research assistant. He is currently a postdoctoral fellow with ISTI-CNR, Pisa, Italy. His research interests include statistical image and video processing.


Ercan E. Kuruoglu (M'98–SM'06) was born in 1969 in Ankara, Turkey. He received the B.Sc. and M.Sc. degrees in electrical and electronics engineering from Bilkent University, Turkey, in 1991 and 1993, and the M.Phil. and Ph.D. degrees in information engineering from the University of Cambridge, Cambridge, U.K., in 1995 and 1998, respectively.

Upon completion of his studies, he joined the Xerox Research Center Europe, Cambridge, as a permanent member of the Collaborative Multimedia Systems Group. He was an ERCIM Fellow in 2000 at INRIA-Sophia Antipolis, France. In January 2002, he joined ISTI-CNR, Pisa, Italy. He was a visiting professor at Georgia Tech-Shanghai in Autumn 2007. He is currently a Senior Researcher and Associate Professor at ISTI-CNR. His research interests are in the areas of statistical signal and image processing and information and coding theory, with applications in astrophysics, bioinformatics, telecommunications, and intelligent user interfaces.

Dr. Kuruoglu was an Associate Editor for the IEEE TRANSACTIONS ON SIGNAL PROCESSING from 2002 to 2006. He is currently an Associate Editor for the IEEE TRANSACTIONS ON IMAGE PROCESSING and is on the editorial board of Digital Signal Processing: A Review Journal. He acted as the technical chair for EUSIPCO 2006. He is a member of the IEEE Technical Committee on Signal Processing Theory and Methods.

Bülent Sankur (SM'90) received the B.S. degree in electrical engineering from Robert College, Istanbul, Turkey, and completed his graduate studies at Rensselaer Polytechnic Institute, Troy, NY.

He is with the Department of Electrical and Electronics Engineering, Bogaziçi (Bosporus) University, Turkey. He has held visiting positions at the University of Ottawa, the Technical University of Delft, and the École Nationale Supérieure des Télécommunications, Paris. He has also served as a consultant in several private and government institutions. His research interests are in the areas of digital signal processing, image and video compression, biometry, cognition, and multimedia systems. He has established a Signal and Image Processing laboratory and has published 150 journal and conference articles in these areas.

Prof. Sankur serves on the editorial boards of three journals on signal processing. He was the Chairman of ICT'96, the International Conference on Telecommunications, and of EUSIPCO'05, the European Conference on Signal Processing, as well as Technical Chairman of ICASSP'00.
