
International Journal of Pure and Applied Mathematics, Volume 73, No. 4, 2011, 379-403

EM FILTER FOR TIME-VARYING SPECT RECONSTRUCTION

Joe Qranfal¹§, Charles Byrne²

¹Department of Mathematics, Simon Fraser University, British Columbia, CANADA

²Department of Mathematical Sciences, University of Massachusetts at Lowell, Lowell, USA

Abstract: The new filtering algorithm EM (expectation maximization) filter is introduced and is validated numerically by applying it to solve the ill-posed inverse problem of reconstructing a time-varying medical image. A linear state-space stochastic approach based on a Markov process is utilized to model the problem, while no precise a priori information about the underlying dynamics of the physical process is required. The method is tested for the case of image reconstruction from noisy data in dynamic single photon emission computed tomography (SPECT), where the signal strength changes substantially over the time required for the noisy data acquisition. Numerical results corroborate the effectiveness of the reconstruction method.

AMS Subject Classification: 93E11, 93E10, 34K29, 49N45, 60G35, 62G05, 62M05, 68U10, 94A08, 90C25

Key Words: estimation, stochastic filtering, Kalman filter, optimal filtering, state estimator, convex optimization, medical image, dynamic SPECT, cross-entropy, nonnegative reconstruction, hidden Markov model, expectation maximization, maximum likelihood, temporal regularization

Received: May 31, 2011. © 2011 Academic Publications, Ltd.

§ Corresponding author


1. Introduction

We introduce a new filtering algorithm to find a nonnegative estimate $\hat{x}_k$, $k = 1, \ldots, S$, of the nonnegative unknown $x_k$ of the problem given by the two linear state-space equations

$$x_k = A_k x_{k-1} + \mu_k$$

$$z_k = H_k x_k + \nu_k$$

Here $\mu_k$ is the error vector, $E(\mu_k) = 0$, and $E(\mu_k \mu_k^\top) = Q_k$ is the covariance of the error in modeling the transition from $x_{k-1}$ to $x_k$; $E(\nu_k) = 0$ and $E(\nu_k \nu_k^\top) = R_k$ are the mean and covariance, respectively, of the noise vector $\nu_k$. Entries of the vector $z_k$ and of both matrices $A_k$ and $H_k$ are nonnegative. We also know that we deal with white error and noise, so that

$$Q_k = \sigma_k^2 I, \qquad R_k = \mathrm{diag}(z_k)$$

where $I$ is the identity matrix of order $N$ and $\mathrm{diag}(a)$ denotes the square matrix that has the $a_i$ on its main diagonal and 0 elsewhere. Our new algorithm, which we refer to as the EM filter, is then numerically tested on a reconstruction problem arising in medical imaging, namely dynamic/time-varying SPECT.
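As a quick orientation, the following sketch (not from the paper; the sizes, the random-walk transition, and the noise levels are illustrative assumptions) simulates one step of this state-space model:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 16, 8                       # illustrative sizes: N dixels, M bins
A = np.eye(N)                      # random-walk transition, A_k = I
H = rng.random((M, N))
H /= H.sum(axis=0)                 # normalize so each column of H sums to one

x_prev = 100.0 * rng.random(N)     # previous activity x_{k-1}
sigma_k = 5.0                      # std of the white modeling error (illustrative)

# State evolution: x_k = A_k x_{k-1} + mu_k, with Q_k = sigma_k^2 I
x_k = np.maximum(A @ x_prev + rng.normal(0.0, sigma_k, N), 0.0)

# Observation: z_k = H_k x_k + nu_k; Poisson counts make R_k ~ diag(z_k) a natural choice
z_k = rng.poisson(H @ x_k).astype(float)
```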

Recent advances in high speed computing and image processing have contributed significantly to the progress of treatment planning, including radiation therapy, hyperthermia, surgical procedures, and cryosurgery. Emission tomography, as in nuclear medicine, is a medical imaging modality which uses detection of electromagnetic radiation for diagnostic purposes. In the case of SPECT or PET (positron emission tomography), a judiciously designed chemical radiopharmaceutical tagged with a radioactive isotope is administered to the patient, usually by intravenous injection. It is chosen to amass in a targeted organ or region of the body, the heart or the brain for instance. The radioactive isotope emits photons which are detected by an external device, the gamma camera, at several angular positions around the patient's body. Data from these 2D angular views/projections are reconstructed into a 3D image. This reconstructed radionuclide distribution from measured data is a useful tool to clinically interpret and diagnose unhealthy tissue. Nuclear medicine is concerned with investigations of the dynamics of the human body's physiological processes and biochemical function, hence the interest in dynamic SPECT.

The image reconstruction of dynamic SPECT is an ill-posed inverse problem. This ill-posedness is further amplified by physical degradation of the acquired data caused by camera blurring, photon scattering, or attenuation. Most of the reconstruction algorithms presented so far have mainly focused on the case where the activity of the object is time-invariant within the time taken to acquire a full set of independent measurement data. Noniterative approaches, such as the FBP (filtered back-projection) approach, and iterative ones, such as the EM algorithm, provide acceptable results in the static case when the activity is time independent. However, they break down when the activity is dynamic.

In the static case, the EM algorithm has been extensively applied to medical image reconstruction since its introduction by Dempster et al [1] in 1977. The first application in emission tomography began with Shepp and Vardi [2] in 1982. Its efficiency is well established; it converges to an optimal solution. A variant, called OSEM (ordered subsets EM), converges faster and is due to Hudson and Larkin [3] in 1994. The EM algorithm and its variants are easy to implement and perform very well in reconstructing a static/time-invariant emission tomography image when a nonnegative solution is needed. An extension to a convex constraint set was studied by Bauschke et al [4] for the dynamic/time-varying SPECT problem. The authors describe the expectation step (E step) and the minimization step (M step) as a KL projection onto a convex set, and they emphasized that there is no explicit formula for the M step. The M step, indeed, requires solving a nonlinear optimization problem. This does not provide an explicit formula as in the static case. As we shall see, by using a temporal regularization through a cross-entropy minimization, we are able to derive an explicit EM-like formula for the dynamic SPECT problem.

The remainder of the paper is organized as follows. First, we describe in Section 2 the stochastic modeling of the state evolution and projection in space. The state evolution and space models are the basis of the HMM (hidden Markov model) to which we apply our method. Then Section 3 reviews the EM and Kalman filter algorithms. Section 4 introduces the EM filter, which is derived through a weighted Kullback-Leibler (KL) distance. The application of the EM filter to time-varying SPECT is covered in Section 5. Section 6 details the simulations of dynamic SPECT as numerical experiments; these corroborate the effectiveness of our algorithm in terms of convergence and CPU time. Section 7 concludes the paper by summing up the findings.


2. Problem Formulation

2.1. Stochastic Modeling of Dynamic SPECT

We consider a physiological process where the distribution of the radioactive tracer in an organ or a specific region is time dependent. This region is divided into small parts called dynamic voxels in 3D, or doxels, and dynamic pixels in 2D, or dixels. A SPECT camera, which could have one, two, or three heads, is used to register the number of photons emitted by the patient. Let $t_k$, $k = 1, \ldots, S$, be an index of a sequence of acquisition times, $N$ the total number of voxels, and $M$ the total number of bins; we denote by $x_k \in \mathbb{R}^N$ and $z_k \in \mathbb{R}^M$ the spatial distribution of the activity and the measured data during the $k\Delta t$ time. The observations $z_1, z_2, \ldots, z_S$ are random vectors. The $i$th entry in the vector $z_k$ measures the number of photons registered at the $i$th bin during the time $t_k$. The $j$th entry in the vector $x_k$ measures the number of photons emitted from the $j$th dixel/voxel during the time $t_k$. Furthermore, each observation $z_k$ depends on $x_k$ only. The nonnegative activity sequence $x_1, x_2, \ldots, x_S$ satisfies the Markov property with an unknown time-varying transition matrix $A_k \in \mathbb{R}^{N \times N}$. That is,

$$x_k = A_k x_{k-1} + \mu_k \qquad (1)$$

where $\mu_k$ is the error random vector in modeling the transition from $x_{k-1}$ to $x_k$, with $E(\mu_k)$ zero and covariance matrix $E(\mu_k \mu_k^\top) = Q_k$. The random variable $\mu_k$ does not have to be a Gaussian process. In many applications the unknown transition matrix is approximated by a random walk or a discrete diffusion-transport operator.

Let $(h_k)_{ij}$ be the conditional probability that an emission from voxel $j$ during the acquisition time $k\Delta t$ will be detected in bin $i$. We call the time-varying matrix defined by $H_k = [(h_k)_{ij}]$ the projection or observation matrix. It is assumed to be known from the geometry of the detector array and may include attenuation correction. We shall assume throughout that the matrix $H_k \in \mathbb{R}^{M \times N}$ has been constructed so that $H_k$, as well as any submatrix $C$ obtained from $H_k$ by deleting columns, has full rank. In particular, if $C$ is $M \times L$ and $L \ge M$, then $C$ has rank $M$. This is not an unrealistic assumption. The columns of $H_k$ are vectors in the nonnegative orthant of $M$-dimensional space. When attenuation, detector response, and scattering are omitted from the design of the projection matrix $H_k$, it can sometimes happen that $H_k$ or some submatrices $C$ fail to be full rank. However, the slightest perturbation of the entries of such an $H_k$ will almost surely produce a new $H_k$ having the desired full-rank properties. The observation and activity vectors are related by the following:

$$z_k = H_k x_k + \nu_k \qquad (2)$$

where $E(\nu_k) = 0$ and $E(\nu_k \nu_k^\top) = R_k$ are the mean and covariance, respectively, of the noise vector $\nu_k$. The random variable $\nu_k$ does not have to be a Gaussian process or Poisson distributed. We only need to know its mean and covariance matrix. The observation noise is not additive to the measurements in a strict physical sense. However, feasible solutions can also be obtained using this approximate noise model. Multiplicative noise is generally more difficult to remove than additive noise [5], because the intensity of the noise varies with the signal intensity, thus violating the linearity of the observation model. The linear model we choose therefore requires the noise to be additive. Otherwise we would not have an unbiased estimator, as in the Kalman filter (KF) case for instance.

The operation that concerns itself with the extraction of information about a quantity of interest at time $k$ by using noisy data measured up to and including $k$ is called filtering. In our setting, the determination of the activity $x_k$ from the noisy data $z_k$ is a filtering problem. Stochastic filtering is an inverse problem: one needs to find the optimal $x_k$, given the data $z_k$, the evolution matrix $A_k$, and the observation system matrix $H_k$ at each time step $k$. Equations (1) and (2) are the state-space form of a particular case of a more general filtering problem [6, 7]. The actual model is a linear dynamic system for which the analytic filtering solution is given by the KF [8]. The KF can be viewed as a temporal regularization technique for solving dynamic inverse problems.

We presume that the covariance matrices $Q_k$ and $R_k$ are diagonal in the case of time-varying SPECT; we then deal with white error and noise, respectively. We also presume that we have a system with nonnegative entries. Diagonal covariance matrices, white noise and error, and nonnegative system matrices are three more assumptions that we shall adopt throughout this paper. In case one or more of these three is violated, refer to [9, 10] for how a general setting can be brought to the equivalent one presented here.


3. Filtering and Nonnegative Reconstruction

3.1. The Expectation Maximization Algorithm

As mentioned in the Introduction, the EM algorithm, when the activity is assumed to be static, has been extensively applied to medical image reconstruction since its introduction by Dempster et al [1] in 1977, with the first application in emission tomography due to Shepp and Vardi [2] in 1982 and the faster OSEM (ordered subsets EM) variant due to Hudson and Larkin [3] in 1994. The EM algorithm and its variants are easy to implement and perform very well in reconstructing a static/time-invariant emission tomography image when a nonnegative solution is needed. An extension to a convex constraint set was studied by Bauschke et al [4] for the dynamic/time-varying SPECT problem; the authors describe the expectation step (E step) and the minimization step (M step) as a KL projection onto a convex set and emphasized that there is no explicit formula for the M step, which requires solving a nonlinear optimization problem. As we shall see, by using a temporal regularization through a cross-entropy minimization, we are able to derive an explicit EM-like formula for the dynamic SPECT problem.

The static emission tomography problem amounts to finding $x \in \mathbb{R}^N$, a solution of the linear equation

$$z = Hx + \nu \qquad (3)$$

where $z \in \mathbb{R}^M$ and $H \in \mathbb{R}^{M \times N}$ are the observation data vector and the observation system matrix, respectively. The vector $\nu \in \mathbb{R}^M$ represents the additive noise in recording the data $z$. We assume the entries of the vector $z$ are nonnegative and the entries of the matrix $H$ are nonnegative, such that each one of its columns sums up to one; that is, the system is normalized. We denote by $\mathrm{support}(x)$ the set of indexes $j$ of the vector $x$ for which $x_j > 0$. Central to our discussion is the notion of cross-entropy or Kullback-Leibler [11] distance between vectors with nonnegative entries. Recall that the KL distance between nonnegative numbers $\alpha$ and $\beta$ is

$$KL(\alpha, \beta) = \alpha \log\frac{\alpha}{\beta} + \beta - \alpha$$

We also define $KL(\alpha, 0) = +\infty$, $KL(0, \beta) = \beta$, and $KL(0, 0) = 0$. Extending to nonnegative vectors $a = (a_1, \cdots, a_N)^\top$ and $b = (b_1, \cdots, b_N)^\top$, we have

$$KL(a, b) = \sum_{j=1}^{N} KL(a_j, b_j) = \sum_{j=1}^{N}\left(a_j \log\frac{a_j}{b_j} + b_j - a_j\right)$$

We have $KL(a, b) = \infty$ unless $\mathrm{support}(a)$ is contained in $\mathrm{support}(b)$. Note that the KL distance is not symmetric, that is, $KL(a, b) \ne KL(b, a)$.
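For reference, a small sketch (a hypothetical helper, not from the paper) that computes this KL distance between nonnegative vectors, including the conventions for zero entries:

```python
import numpy as np

def kl_distance(a, b):
    """KL(a, b) = sum_j (a_j * log(a_j / b_j) + b_j - a_j) for nonnegative vectors,
    with KL(0, beta) = beta and KL(alpha, 0) = +inf for alpha > 0."""
    total = 0.0
    for aj, bj in zip(np.asarray(a, float), np.asarray(b, float)):
        if aj == 0.0:
            total += bj                  # KL(0, beta) = beta
        elif bj == 0.0:
            return np.inf                # support(a) not contained in support(b)
        else:
            total += aj * np.log(aj / bj) + bj - aj
    return total
```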

The EM algorithm is an iterative procedure for computing a nonnegative solution of the linear system (3) when we discard the additive noise $\nu$. Using the cross-entropy/Kullback-Leibler (KL) distance, Titterington [12] noted in 1987 that maximizing the likelihood function is equivalent to minimizing the $KL(z, Hx)$ distance. Later on, in 1993, Byrne [13] showed that by minimizing $KL(Hx, z)$ we obtain the simultaneous version SMART of MART (multiplicative algebraic reconstruction technique) considered earlier by Gordon et al. [14] and others. Both algorithms are meant for the static state. Starting with an arbitrary $x^0 > 0$, for $\ell = 0, 1, \ldots$, an iteration of the EM algorithm is given by

$$x^{\ell+1}_j = x^\ell_j \sum_{i=1}^{M} H_{ij}\,\frac{z_i}{(Hx^\ell)_i} \qquad (4)$$
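A minimal sketch of one such EM/ML update (4), assuming a normalized nonnegative system matrix H (columns summing to one) and strictly positive iterates; the function name is ours:

```python
import numpy as np

def em_update(x, H, z):
    """One EM/ML iteration (4): x_j <- x_j * sum_i H_ij * z_i / (H x)_i."""
    return x * (H.T @ (z / (H @ x)))
```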

When using EM or least squares techniques for dynamic emission tomography, the challenge is how to take the activity's dynamics into account. Some authors assume that the time activity curves (TACs), or the activity's temporal behavior in different regions of interest (ROI), are known [15]. In this particular case, a constrained least squares method can be effective. However, such extra information is not usually available. An alternative approach is to assume nothing about the dynamics of the activity and just use a temporal regularization that penalizes high variations of the activity's changes over time. This has been done in [16] in 2008, where the authors present a Kalman-based regularization technique. Recently, Qranfal and Byrne [9] developed a temporal regularization method based on cross-entropy minimization and analyzed a recursive reconstruction method called the SMART filter. We introduce here, by the same token, a new method that we refer to as the EM filter. But first, let us review the KF.

3.2. Kalman Filter

Consider the problem of finding an estimator $\hat{x}$ as a linear function of the noisy data vector $z$ that satisfies (3), that is, $z = Hx + \nu$. The best linear unbiased estimator (BLUE) for $x$ from $z$ is the vector $\hat{x}$ which minimizes the cost function

$$J(x) = \|z - Hx\|^2_{R^{-1}}$$

where $\|x\|^2_B$ denotes $x^\top B x$, the weighted Euclidean norm. If $H$ has full rank, then $\hat{x} = V^\top z$, where $V = R^{-1}H(H^\top R^{-1}H)^{-1}$. Now suppose that, in addition to the data vector $z$, we have $y = x + \mu$. The vector $y$ is a prior estimate of $x$, where $\mu$ is the mean-zero error of this estimate and $Q$ is the known covariance matrix of $\mu$. We want to estimate $x$ as a linear function of both $z$ and $y$. Applying the BLUE to the augmented system of equations, that is, minimizing the cost function

$$F(x) = \|z - Hx\|^2_{R^{-1}} + \|y - x\|^2_{Q^{-1}}$$

we find the solution to be

$$\hat{x} = y + W(z - Hy)$$

where

$$W = QH^\top(R + HQH^\top)^{-1}$$

We see that to obtain the estimate $\hat{x}$ of $x$, we first check to see how well $y$, the prior estimate of $x$, performs as a potential solution of the system $z = Hx$ and correct the estimate $y$, using the error $z - Hy$, to get the new estimate $\hat{x}$. If $z = Hy$, then $\hat{x} = y$. The KF involves the repeated application of this extension of the BLUE.
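In code, this extension of the BLUE can be sketched as follows (an illustration, not from the paper; the explicit inverse is kept only for clarity):

```python
import numpy as np

def blue_with_prior(y, z, H, R, Q):
    """BLUE combining the prior estimate y with the data z = Hx + nu:
    W = Q H^T (R + H Q H^T)^{-1},  x_hat = y + W (z - H y)."""
    W = Q @ H.T @ np.linalg.inv(R + H @ Q @ H.T)
    return y + W @ (z - H @ y)
```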

The KF is a recursive algorithm to estimate the state vector $x_k$ during the time $k\Delta t$ as a linear combination of the vectors $z_k$ and $y_k$. In this case $\hat{x}_k$ is the unique minimizer of the functional

$$F(x_k) = \|z_k - H_k x_k\|^2_{R_k^{-1}} + \|y_k - x_k\|^2_{Q_k^{-1}} \qquad (5)$$

The KF update is the following: given an unbiased estimate $\hat{x}_{k-1}$ of the state vector $x_{k-1}$, our prior estimate of $x_k$ based solely on the activity dynamics is

$$y_k = A_k \hat{x}_{k-1} \qquad (6)$$

The estimate $\hat{x}_k$ will have the form, refer for instance to [17],

$$\hat{x}_k = y_k + K_k(z_k - H_k y_k) \qquad (7)$$

where

$$P_k = A_k P_{k-1} A_k^\top + Q_k \qquad (8)$$

$$K_k = P_k H_k^\top (H_k P_k H_k^\top + R_k)^{-1} \qquad (9)$$
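Written out, one KF recursion (6)-(9) looks like the following sketch (names are illustrative; the standard posterior covariance update, which the text does not display, is included for completeness):

```python
import numpy as np

def kalman_step(x_prev, P_prev, z, A, H, Q, R):
    """One Kalman filter recursion following (6)-(9)."""
    y = A @ x_prev                                   # prediction (6)
    P = A @ P_prev @ A.T + Q                         # predicted covariance (8)
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)     # Kalman gain (9)
    x = y + K @ (z - H @ y)                          # measurement update (7)
    P = (np.eye(len(x)) - K @ H) @ P                 # standard posterior covariance update
    return x, P
```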

$P_k$ and $P_{k-1}$ in (8) are the covariances of the estimated activity $x$ at times $k$ and $k-1$, respectively. Since the Kalman estimate is an unconstrained minimizer of $F(x_k)$, we will most likely end up with some negative entries in the vector solution $\hat{x}_k$. This is not a desirable solution in medical imaging. This has been remedied, for instance in [16, 17], where projection onto the set of nonnegative vectors was used to cope with this situation. Another drawback of the KF is the matrix-matrix multiplications involved in (8) and (9) and the matrix inversion involved in (9). Attempts have been made to rectify these two shortcomings; see for instance [6, 7] for more details. Furthermore, the KF needs to calculate, update, and store covariance matrices. Our goal is manifold. We intend to find a substitute algorithm for the KF that

1. filters out errors from modeling the dynamical system,

2. filters out the noise from the data,

3. ensures temporal regularization,

4. is an optimal recursive estimate,

5. does not require the storage of past measurement data,

6. guarantees nonnegativity of the solution,

7. does not use matrix-matrix multiplications,

8. does not necessitate any matrix inversion, and

9. does not need to calculate, update, or store any covariance matrix.

We aim then to keep the same first five properties of the KF while improving on it by requesting four more. Each recursive step in the new approach is an iterative reconstruction that involves only matrix-vector multiplication. This should handle the problem of a huge number of variables, as is the case in medical imaging, and guarantee nonnegative solutions. We mentioned earlier that the SMART filter [9] satisfies these properties. The main goal of this paper is to provide yet another alternative, based on an extension of the EM algorithm, to solve our filtering problem for the dynamic state.


4. EM Filtering

4.1. Towards a Nonnegative Filter

The problem we are trying to solve can be summarized as follows. From one, two, or three tomographic projections, an entire image needs to be estimated for each time frame. This is usually, if not always, a highly underdetermined problem in SPECT, and some constraints need to be used in order to get a reasonable estimate. In this paper, we impose a few constraints, namely, filtering out errors from modeling the dynamical system and the noise from the data, imposing temporal regularization, and enforcing the nonnegativity of the solution.

A nonnegative approach, applicable to nonnegative vectors and matrices, might need to minimize a distance that applies only to nonnegative quantities; the Kullback-Leibler distance satisfies this constraint. Suppose now that, in addition to the data vector $z$ as in the static situation, we have $y = x + \mu$. We find a new estimate $\hat{x}$ by minimizing the following cost function:

$$F(x) = KL(z, Hx) + KL(y, x)$$

where it is clear that if the prior estimate $y$ of $x$ satisfies $z = Hy$, then the new estimate is $y$. So the nonnegative filter would use a repeated application of the solution to the minimization problem. However, the covariances do not seem to play a role now, since this is not a least-squares or Gaussian theory. Nevertheless, the two covariance matrices $R$ and $Q$ play crucial roles in the KF to filter out the noise from the data and the errors from our modeling of the dynamic system. We would like then to keep this filtering property by utilizing these two matrices. We therefore introduce a weighted KL distance that will handle the noise and error filtering part; this is covered in the next subsection.

4.2. Weighted Kullback-Leibler Distance

$R$ and $Q$, being covariance matrices, are symmetric positive definite, and so are their inverses $R^{-1}$ and $Q^{-1}$. Therefore, the Cholesky decomposition implies that there exist upper triangular matrices $D_1$ and $D_2$ with strictly positive diagonal entries such that

$$R^{-1} = D_1^\top D_1 \qquad (10)$$

$$Q^{-1} = D_2^\top D_2 \qquad (11)$$

With these decompositions, we have $\|x\|^2_{R^{-1}} = \|D_1 x\|^2$ and $\|x\|^2_{Q^{-1}} = \|D_2 x\|^2$. By the same token, we define the weighted KL distance with respect to, for instance, $R^{-1}$ as follows:

$$KL_{R^{-1}}(a, b) = KL(D_1 a, D_1 b) \qquad (12)$$

A nonnegative approach, applicable to nonnegative vectors and a nonnegative matrix $H$, might be to minimize the following weighted cross-entropy/KL sum:

$$KL(D_1 z, D_1 Hx) + KL(D_2 y, D_2 x) \qquad (13)$$

It goes without saying that we should first make sure that the four vectors $D_1 z$, $D_1 Hx$, $D_2 y$, and $D_2 x$ have nonnegative entries. For our application, the matrix $H$ is any of the observation matrices $H_k$ and the vector $z$ is any of the observation vectors $z_k$, $k = 1, \cdots, S$. Recall that we assume that we have diagonal covariance matrices, white noise and error, nonnegative $z$, and a normalized system matrix $H$ with nonnegative entries. On the one hand, the matrix $H$ and the vectors $z$, $y$, and $x$ have nonnegative values. On the other hand, the matrices $D_1$ in (10) and $D_2$ in (11) could have off-diagonal negative entries, to the extent that it is not ensured that the four vectors are all nonnegative coordinate-wise. Nonetheless, when we deal with white noise and error, $D_1$ and $D_2$ will be diagonal matrices with nonnegative values. This is exactly the case in most applications, including dynamic SPECT, where these four vectors $D_1 z$, $D_1 Hx$, $D_2 y$, and $D_2 x$ are nonnegative. Otherwise, we should first convert a general system to a nonnegative one and then pre-whiten it; refer to [9]. Diagonal covariance matrices, white noise and error, nonnegative detected photons $z_k$, and normalized system matrices $H_k$ ($k = 1, \cdots, S$) with nonnegative entries are assumptions that we adopt throughout this paper. We also assume that the matrix $H_k \in \mathbb{R}^{M \times N}$ has been constructed so that $H_k$, as well as any submatrix $C$ obtained from $H_k$ by deleting columns, has full rank. In particular, if $C$ is $M \times L$ and $L \ge M$, then $C$ has rank $M$.
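For diagonal covariances (the white-noise case assumed here), the weighted KL distance (12) reduces to an ordinary KL distance between rescaled vectors; a small sketch, assuming strictly positive entries:

```python
import numpy as np

def weighted_kl(a, b, cov_diag):
    """KL_{R^{-1}}(a, b) = KL(D1 a, D1 b) with R = diag(cov_diag) and
    D1 = diag(1 / sqrt(cov_diag)); valid when the covariance is diagonal."""
    d1 = 1.0 / np.sqrt(np.asarray(cov_diag, float))
    a, b = d1 * np.asarray(a, float), d1 * np.asarray(b, float)
    return float(np.sum(a * np.log(a / b) + b - a))
```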

4.3. Cross-Entropy Minimization

Byrne [13] considers the following regularization problem:

$$\min_{x \ge 0} F(x) = \alpha\, KL(d, Px) + (1-\alpha)\, KL(y, x) \qquad (14)$$

His alternating projections algorithm goes something like this. Let $x^0$ be a starting nonnegative point. Then, having got the $\ell$th iterate $x^\ell$, we obtain

$$r^{\ell+1}_{ij} = \frac{P_{ij}\, x^\ell_j\, d_i}{(Px^\ell)_i} \qquad (15)$$

$$x^{\ell+1}_j = \alpha \sum_{i=1}^{M} r^{\ell+1}_{ij} + (1-\alpha)\, y_j \qquad (16)$$

In this article, we also offer an alternative update formula by gathering both steps (15) and (16) into a compact convex combination form as follows:

$$x^{\ell+1}_j = \alpha\, x^\ell_j \sum_{i=1}^{M} \frac{P_{ij}\, d_i}{(Px^\ell)_i} + (1-\alpha)\, y_j \qquad (17)$$

For $\alpha = 1$, we obtain the well-known EM/ML (expectation maximization/maximum likelihood) iteration (4). He states the following convergence lemma and proves it using orthogonality conditions of a Pythagorean type.

Lemma 1. The sequence $\{x^\ell\}$ converges to a limit $x^\infty$ for all $M$ and $N$, for all starting $x^0 > 0$, for all $y > 0$, and for all $0 \le \alpha \le 1$. For $0 \le \alpha < 1$, $x^\infty$ is the unique minimizer of $F(x)$. For $\alpha = 1$, $x^\infty$ is the unique nonnegative minimizer of $KL(d, Px)$ if there is no nonnegative solution of $d = Px$. If there are nonnegative solutions, then $d = Px^\infty$ and $x^\sharp = x^\infty$ is the only solution for which the inequalities $KL(x, x^\sharp) \le KL(x, x^\ell)$ hold for all $x \ge 0$ with $d = Px$ and all $\ell$; it follows that $\mathrm{support}(x)$ is contained in $\mathrm{support}(x^\infty)$ for all such $x$.

Note that the term $KL(y, x)$ forces temporal smoothness, with $\alpha$ as a temporal smoothing parameter. When recursively minimizing this function over $x_k$, there will be a trade-off between fidelity to the data and smoothness of the TAC. The positive parameter $\alpha$ controls the relative importance of the two criteria. A crucial step in regularization is the selection of the regularization parameter $\alpha$.
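A sketch of the combined update (17), under the same assumptions as before (nonnegative P with columns summing to one, d nonnegative, y and x strictly positive); the parameter alpha plays the role of the smoothing parameter discussed above:

```python
import numpy as np

def regularized_em_update(x, P, d, y, alpha):
    """One step of (17): x_j <- alpha * x_j * sum_i P_ij d_i / (P x)_i + (1 - alpha) * y_j."""
    return alpha * x * (P.T @ (d / (P @ x))) + (1.0 - alpha) * y
```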

5. Application to Dynamic SPECT

Our error and noise covariance matrices in the dynamic SPECT problem are usually modeled as diagonal matrices with nonnegative entries; refer to Section 2.1.

$$Q_k = \sigma_k^2 I \qquad (18)$$

$$R_k = \mathrm{diag}(z_k) \qquad (19)$$

where $\mathrm{diag}(a)$ is the square matrix that has the $a_i$ on its main diagonal and 0 elsewhere. Our measurements $z_k$ are modeled as Poisson random variables, thus the choice of $\mathrm{diag}(z_k)$ as their covariance matrices. Let

$$D_1 = \mathrm{diag}\!\left(\frac{1}{\sqrt{z_k}}\right) \qquad (20)$$

$$D_2 = \frac{1}{\sigma_k} I \qquad (21)$$

where the vector $\sqrt{z_k}$ is the one having $\sqrt{(z_k)_i}$ in its entry $i$. In this case, we are sure that $D_1 z \ge 0$, $D_1 Hx \ge 0$, $D_2 y \ge 0$, and $D_2 x \ge 0$. For ease of notation, we drop the subscript $k$ for a while. The weighted KL cost function is then

$$KL\!\left(\sqrt{z},\, \frac{1}{\sqrt{z}}.Hx\right) + KL\!\left(\frac{1}{\sigma}y,\, \frac{1}{\sigma}x\right) \qquad (22)$$

or

$$\frac{\sigma-1}{\sigma}\, KL\!\left(\frac{\sigma}{\sigma-1}\sqrt{z},\, \frac{\sigma}{(\sigma-1)\sqrt{z}}.Hx\right) + \frac{1}{\sigma}\, KL(y, x) \qquad (23)$$

where $a.b$ and $a/b$ designate the component-wise multiplication and division, respectively, of the vectors $a$ and $b$. We can obtain the same functional (23) by using a pre-whitening [9].

Recall that we aim to find a nonnegative estimate $\hat{x}_k$, $k = 1, \ldots, S$, of the nonnegative unknown $x_k$ of the problem given by the two linear state-space equations (1) and (2),

$$x_k = A_k x_{k-1} + \mu_k$$

$$z_k = H_k x_k + \nu_k$$

where $\mu_k$ is the error vector, $E(\mu_k) = 0$, and $E(\mu_k \mu_k^\top) = Q_k$ is the covariance of the error in modeling the transition from $x_{k-1}$ to $x_k$; $E(\nu_k) = 0$ and $E(\nu_k \nu_k^\top) = R_k$ are the mean and covariance, respectively, of the noise vector $\nu_k$. Entries of the vector $z_k$ and of both matrices $A_k$ and $H_k$ are nonnegative. We also know from equations (18) and (19) that

$$Q_k = \sigma_k^2 I, \qquad R_k = \mathrm{diag}(z_k)$$

Minimizing the functional (23) at each recursion step $k$ is the same as solving the minimization problem (14) for the functional $F(x)$. It suffices to use the change of variables given in step 2 of the EM filter algorithm that follows, and to recall that the predicted activity state $y_k$ is $A_k \hat{x}_{k-1}$. We apply the iterative method based on alternating projections and given by the formulas (15) and (16) at each time step $k$. The clustering point $x^\infty$ will be the estimated state $\hat{x}_k$ we are solving for using equations (1) and (2). The matrix $H_k$ is assumed to be a normalized system matrix ($\sum_{i=1}^{M} (h_k)_{ij} = 1$, $\forall j = 1, \cdots, N$) with nonnegative entries. Thus we obtain the following EM filter algorithm.

Algorithm 2. EM Filter Algorithm

1. Start with $\hat{x}_0 > 0$. For $k = 1, \cdots, S$ execute the following steps.

2. Assuming the recursive step has been done up to time $k-1$, do the change of variables

$$\alpha = \frac{\sigma_k - 1}{\sigma_k}, \qquad P = \frac{1}{\alpha} R_k^{-1/2} H_k = \frac{1}{\alpha}\,\mathrm{diag}\!\left(\frac{1}{\sqrt{z_k}}\right) H_k, \qquad d = \frac{1}{\alpha} R_k^{-1/2} z_k = \frac{1}{\alpha}\sqrt{z_k}$$

3. To get $\hat{x}_k$, start with $s^0 = \hat{x}_{k-1}$.

4. Make $y_k = A_k \hat{x}_{k-1}$.

5. Do $r^{\ell+1}_{ij} = \dfrac{P_{ij}\, s^\ell_j\, d_i}{(P s^\ell)_i}$.

6. Compute $s^{\ell+1}_j = \alpha \sum_{i=1}^{M} r^{\ell+1}_{ij} + (1-\alpha)(y_k)_j$, $\ell = 0, 1, \ldots$

7. The update formula for the next estimate is $\hat{x}_k = s^\infty$, where $s^\infty$ is the cluster point of the sequence $(s^\ell)_{\ell \in \mathbb{N}}$.

Observe that we could combine the two steps 5 and 6 into one step, as we did before in (17):

$$s^{\ell+1}_j = \alpha\, s^\ell_j \sum_{i=1}^{M} \frac{P_{ij}\, d_i}{(Ps^\ell)_i} + (1-\alpha)(y_k)_j, \qquad \ell = 0, 1, \ldots \qquad (24)$$

or in an even more simplified form when we combine steps 2, 5, and 6:

$$s^{\ell+1}_j = s^\ell_j \sum_{i=1}^{M} \frac{(h_k)_{ij}}{(R_k^{-1/2} H_k s^\ell)_i} + (1-\alpha)(y_k)_j, \qquad \ell = 0, 1, \ldots \qquad (25)$$


Note how the next iterate $s^{\ell+1}_j$ in (24) is formed as a convex combination of the predicted $(y_k)_j$, associated with the same coefficient $1-\alpha$ as in problem (14) and relying only on the evolution model, and of the calculated EM iterate, in a similar form as in (4), associated with the same coefficient $\alpha$ as in problem (14) and relying only on the observation model. The matrix $R_k$ is diagonal, thus the EM filter does not involve any matrix-matrix multiplication or any matrix inversion; it involves only matrix-vector multiplication. The temporal regularization parameter $\alpha$ is well defined and takes values between 0 and 1 when $\sigma_k$ varies between 1 and $\infty$. If $\alpha = 0$ at each time step $k$, that is, $\sigma_k = 1$ and $\hat{x}_k = A_k \hat{x}_{k-1}$, then we are discarding the observations completely, to the extent that we rely only on our evolution model. This defeats the purpose of the experiment. When $\alpha = 1$ at each time step $k$, the predicted $y_k$ in step 4 is not needed in step 6. Thus we retrieve the EM/ML iteration as in (4), which is only valid for the static case as mentioned before. Indeed, choosing $\alpha = 1$ means that $\frac{\sigma_k - 1}{\sigma_k} = 1$, or simply $\sigma_k = \infty$. That is, the covariance matrix in the transition equation (1) is very large, which implies we have no confidence at all in our evolution model. In other words, we discard the evolutionary state of the variable. Only the observations $z_k$ are meaningful in finding $x_k = x$, which is then a stationary state, as it should be.
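Putting the pieces together, one recursive step of the EM filter can be sketched as below (our own illustration of Algorithm 2 via the compact update (24); the fixed inner iteration count stands in for detecting the cluster point, and all names are ours):

```python
import numpy as np

def em_filter_step(x_prev, z, H, A, sigma, n_inner=50):
    """One EM filter recursion: predict y_k = A_k x_{k-1}, then iterate (24).
    Assumes z > 0, x_prev > 0, sigma > 1, and columns of H summing to one."""
    alpha = (sigma - 1.0) / sigma                 # temporal smoothing parameter
    y = A @ x_prev                                # prediction from the evolution model
    P = (H / np.sqrt(z)[:, None]) / alpha         # P = (1/alpha) R_k^{-1/2} H_k
    d = np.sqrt(z) / alpha                        # d = (1/alpha) R_k^{-1/2} z_k
    s = x_prev.copy()                             # s^0 = x_{k-1}
    for _ in range(n_inner):
        s = alpha * s * (P.T @ (d / (P @ s))) + (1.0 - alpha) * y   # update (24)
    return s                                      # estimate of x_k
```

Running this step for k = 1, ..., S with the corresponding z_k, H_k, and A_k reproduces the recursion; as sigma grows (alpha close to 1) the prediction term fades and the static EM/ML iteration is recovered, in line with the discussion above.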

We have the following convergence result; it is a direct consequence of Lemma 1.

Theorem 3. For all $k = 1, \cdots, S$, the sequence $\{s^\ell\}$ converges to a limit $\hat{x}_k = s^\infty$ for all $M$ and $N$, for all starting $\hat{x}_0 > 0$, and for all $1 \le \sigma_k \le \infty$. For $1 \le \sigma_k < \infty$, $\hat{x}_k$ is the unique minimizer of the functional (23). For $\sigma_k = \infty$, $\hat{x} = \hat{x}_k$ is the unique static nonnegative minimizer of $KL(z, Hx)$ if there is no nonnegative solution of $z = Hx$. If there are nonnegative solutions, then $z = H\hat{x}$ and $x^\sharp = \hat{x}$ is the only solution for which the inequalities $KL(x, x^\sharp) \le KL(x, s^\ell)$ hold for all $x \ge 0$ with $z = Hx$ and all $\ell$; it follows that $\mathrm{support}(x)$ is contained in $\mathrm{support}(\hat{x})$ for all such $x$.

Remark 4. In considering this particular mixture of KL distances in (23), we unified approaches commonly taken in the underdetermined and overdetermined cases, and we developed an iterative solution method within a single framework of alternating projections, as well as establishing a convergence result.


Figure 1: Simulated annulus with its different ROI and their TACs. Upper left: simulated activity at time 3; upper right: simulated activity at time 15; lower left: TACs of the 6 different ROI (Background, Immersed, Right, Lower, Upper, Left).

6. Numerical Experiment

6.1. Simulation

Our phantom is composed of six regions of interest (ROI) or segments. A segment represents a spatial region with similar temporal behavior. In the present study we do not assume the segments to be known exactly. Different approaches to determine the segmentation are described in the literature. Each ROI has a different time activity curve (TAC), see Figure 1. The example investigated in this work is based on the teboroxime dynamics in the body during the first hour post injection. The choice of the TACs is motivated by the behavior of liver, healthy myocardium, muscles, stenotic myocardium, and lungs. Only one slice is modeled; that is, we simulate a 2D object. The star-like shape placed on the left ensures that the phantom is not entirely symmetrical. We simulate 120 projections over 360°, one projection for every 3°, with attenuation and a 2D Gaussian detector response.


Figure 2: Noiseless sinogram, or 2D projections without noise. The y-axis has the bin number and the x-axis has the 40 time instances of the 3 heads. Time instances 1 to 40 are for head 1, 41 to 80 for head 2, and 81 to 120 for head 3. The color intensity of a pixel is the number of photons detected by a certain bin at a certain time.

There are three camera heads consisting of 64 square bins, each measuring 0.625 cm on each side. The distance from the annulus to the camera head rotation axis is 30 cm. We simulate S = 40 time instances for three heads; that is, we have 3 × 40 = 120 projections for a camera rotating clockwise (CW) in a circular orbit. Head 1 starts at −60°, head 2 at 60°, and head 3 at 180°. A low energy high resolution (LEHR) collimator is used with a full width at half maximum (FWHM). We determine the blurred parallel strip/beam geometry system matrices for all projections with resolution recovery and attenuation correction [18].

We have 64 projection/measurement values for each head, which amounts to a total of M = 192 observations at each time frame. We run tests where we compare reconstructions without added noise in the data, see Figure 2, and with noise included in the data/observations, see Figure 3. That is, instead of working with the observation z as is, as in the case of noiseless data, we took rather a Poisson random observation with mean z in the case of noisy data. We noticed that there are very slight differences in the TACs and in the reconstructed images with and without noisy data. We present here only the results with the noise added to the observations. The size of the image we aim to reconstruct is N = 4096 = 64 × 64 dixels.


Figure 3: Noisy sinogram. The y-axis has the bin number and the x-axis has the 40 time instances of the 3 heads. Time instances 1 to 40 are for head 1, 41 to 80 for head 2, and 81 to 120 for head 3. The color intensity of a pixel is the number of photons detected by a certain bin at a certain time. Note: we use this noisy data for the experiment.

We have six kinds of TACs that are very representative of clinical applications. The annulus has four arcs that we name "Left", "Upper", "Right", and "Lower" according to their location. The activity is decreasing in the Left arc, increasing-decreasing in the Upper arc, constant in the Right arc, and increasing in the Lower arc; see Figure 1. The star-like shape has zero activity within it and is called the "Star" region; we refer to it as "Background" too. The annulus is immersed within a region, called "Immersed", that has a constant activity. We have six ROI in total. The EM filter algorithm should work in both underdetermined and overdetermined settings; refer to Remark 4. We aim then to test Algorithm 2 in the underdetermined and overdetermined cases. The underdetermined case happens when we reconstruct dixel by dixel; we possess M = 192 noisy data for N = 4096 unknowns, or a ratio of about 1:21 data to unknowns. It is an ill-posed problem. The overdetermined case consists of reconstructing the six ROI, when we assume full knowledge of their locations, so that we have M = 192 noisy data for N = 6 unknowns, or a ratio of 32:1 data to unknowns. We should of course get better reconstructed images in the latter case than in the former one; this, indeed, will be confirmed shortly.

We provide a quantitative analysis of the reconstructed images in order to compare the simulated activity with the reconstructed one. We define the relative deviation error $\tau$ of the reconstructed activity $v^*$ from the truth $x$; refer to (26) through (28). Hence we compare the simulated count $x_{i,k}$ with the corresponding reconstructed one $v^*_{i,k}$ at each time frame $k$ for every location $i$. We sum over a ROI containing $J$ dixels, normalized by the total simulated/true counts, in order to diminish the effect of statistical fluctuations. We have a $\tau_{ROI,k}$ for every sector. These indicators allow us to see how the method performs under different dynamic behaviors. We could compare, for instance, sectors with a fast washout with those with a slow one [17]. We calculate a similar $\tau_k$ over the total number of doxels (dynamic voxels) $N$, and then we average them over the total number $S$ of time acquisitions, so that we have $\tau_{avg}$. This is an objective comparison of the quality of reconstruction for different sets of parameters such as iteration stopping criteria, noise levels, etc. The closer $\tau_{avg}$ is to zero, the better the reconstructed images should be.

$$\tau^2_{ROI,k} = \frac{\sum_{j=1}^{J}(v^*_{j,k} - x_{j,k})^2}{\sum_{j=1}^{J} x^2_{j,k}} \qquad (26)$$

$$\tau^2_k = \frac{\sum_{j=1}^{N}(v^*_{j,k} - x_{j,k})^2}{\sum_{j=1}^{N} x^2_{j,k}} \qquad (27)$$

$$\tau_{avg} = \frac{1}{S}\sum_{k=1}^{S} \tau_k \qquad (28)$$
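These indicators are straightforward to compute; a sketch with hypothetical helper names, where rows of the arrays index time frames:

```python
import numpy as np

def tau(v_star, x_true):
    """Relative deviation error (26)/(27) over a set of dixels at one time frame."""
    v_star, x_true = np.asarray(v_star, float), np.asarray(x_true, float)
    return np.sqrt(np.sum((v_star - x_true) ** 2) / np.sum(x_true ** 2))

def tau_avg(V_star, X_true):
    """Average deviation error (28): mean of tau_k over the S time frames."""
    return np.mean([tau(v, x) for v, x in zip(V_star, X_true)])
```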

6.2. Results

We assume that the system dynamics (1) are unknown to us; therefore we use a random walk. In practical terms, we set $A_k = I$ for all $k = 1, \cdots, S$ in the state transition linear model. We proceeded as follows. We are not interested in the background, and we assume that we know the locations of these zero activities; this is a common practice [19, 20, 21]. We have run experiments without this assumption and the results are very comparable to when we run them with this assumption. One interesting way to deal with this assumption is as follows. Set to zero the values of the corresponding positions of the matrix $H_k$. The updating equations (8) and (9) ensure that the updated activities will remain equal to zero; thus the KF reconstructs the star/background region(s) perfectly [16, 17]. By having these values as zero while eliminating these columns and setting to zero the corresponding entries of the estimated activity [9, 10], we automatically guarantee that those entries stay at the value of zero when using our present EM filter algorithm.
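A sketch of this bookkeeping (illustrative; zero_idx would hold the indices of the dixels known to have zero activity): zero the corresponding columns of H_k and the corresponding entries of the current estimate, and with the random-walk transition A_k = I the multiplicative update then keeps those entries at zero.

```python
import numpy as np

def mask_known_background(H, x, zero_idx):
    """Zero out the columns of H and the entries of x for dixels known to be inactive."""
    H = H.copy()
    x = x.copy()
    H[:, zero_idx] = 0.0
    x[zero_idx] = 0.0
    return H, x
```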


We experiment with different initial guesses $x_0$ such as $(1, \cdots, 1)^\top$ and $(10^{-6}, \cdots, 10^{-6})^\top$. We also start the algorithm with the static image given by OSEM; we call this initial guess the OSEM activity. The average deviation error $\tau_{avg}$, combined with visual inspection, shows that there is no pronounced advantage in favor of any of them. We do not have much confidence in our transition model (1), so we choose the covariance matrices to be rather high, $10^3 \le \sigma_k \le 10^4$. In the underdetermined case, we do not assume the ROI to be known exactly. However, we make use of these segments only to interpret the results. As a consequence, there are some differences in intensity between pixels within the same region. To assess the effectiveness of the method and of the convergence result of Theorem 3, we show the TACs averaged over the pixels within the same ROI; this also holds for the overdetermined case. Following are the results of both reconstruction cases, underdetermined and overdetermined, for the EM filter and the SMART filter [9], and their comparison with results obtained using the projected Kalman algorithm [17]. The projected Kalman is the classical KF followed by a projection onto the nonnegative orthant, using a proximal approach, to ensure the feasibility of the activity. Recall that the KF, see equations (8) and (9), necessitates the inversion of huge matrices, which is computationally time consuming and memory hungry.

6.2.1. Underdetermined Case

We are solving the ill-posed inverse problem of reconstructing the dynamic images of the annulus: 192 noisy observations for 4096 unknowns. We use both algorithms, the EM filter (Algorithm 2) and the SMART filter algorithm [9], on a P4 3.00 GHz desktop. It takes less than 2 min to run the EM filter and about 8 min, 4 times slower, to run the SMART filter. In contrast to the projected Kalman algorithm [17], which takes more than 2.5 hr, we witness improvements ranging from about 18 to 75 times faster. The SMART filter takes longer than the EM filter because of the many evaluations of the log and exp functions in its steps 5 and 6. It was mentioned in [9] that combining two steps of the SMART filter algorithm into one, to avoid the evaluation of the log and exp functions and evaluate a power function instead, might speed up the SMART filter algorithm. The $\tau_{avg}$ of both methods is about 0.52, which is the same as with the projected Kalman. Images and TACs look fine with both, although the image with the EM filter looks slightly better and the TACs are somewhat smoother, see Figure 4. Images and TACs of both methods are of the same quality as with the projected Kalman [17].


Figure 4: Reconstructed images at time 21 and TACs of 4096 dixels using the EM filter and the SMART filter.

6.2.2. Overdetermined Case

In medical imaging, we are sometimes not interested in the individual intensities of each and every pixel/voxel but rather in some ROI intensities. We are then more concerned with a segmented reconstruction [21]. A CT scan, for instance, might give us an idea about the ROI. In this case we are solving the inverse problem of reconstructing the dynamic images of the annulus: 192 noisy observations for 6 unknown ROI. We use both algorithms, the EM filter (Algorithm 2) and the SMART filter algorithm [9], on a P4 3.00 GHz desktop. It takes about 1.5 sec to run the EM filter and about 15 sec to run the SMART filter. In contrast to the projected Kalman algorithm [17], which takes about 1.7 sec, we do not witness any improvement in using the EM filter. Again, the SMART filter takes longer than the EM filter, 10 times slower, but both times are in the seconds, because of the many evaluations of the log and exp functions in its steps 5 and 6. This shortcoming could be remedied, see Section 6.2.1 above. The $\tau_{avg}$ of both methods is about 0.03, which is half of the one with the projected Kalman.


Figure 5: Reconstructed images at time 21 and TACs of 6 ROI using the EM filter and the SMART filter.

Hence we improve on convergence. As expected with this overdetermined case, we get much better images and TACs; compare Figure 4 to Figures 5 and 6. Images and TACs of both methods, EM filter and SMART filter, are of the same quality as with the projected Kalman [17].

7. Conclusion

We presented here a novel algorithm that we refer to as the EM filter. It applies to nonnegative normalized systems when a nonnegative solution is desired. Our algorithm guarantees this as well as temporal smoothness. We ran experiments comparing the EM filter with the SMART filter and the projected Kalman in both cases, underdetermined and overdetermined. The quality of images and TACs is about the same in the three reconstructions, although the EM filter performs slightly better than the SMART filter.


Figure 6: Reconstructed images at various time instances (times 4, 7, 17, 22, 29, and 35): simulated images in the top row, SMART filter reconstructions in the middle row, and EM filter reconstructions in the bottom row.

The EM filter is 4 to 10 times faster than the SMART filter; still, both are 18 to 75 times faster than the projected Kalman in the underdetermined case, minutes instead of hours. The EM filter performs about the same as the projected Kalman time-wise; nevertheless, it improves on convergence in the overdetermined case. The EM filter algorithm filters out errors from modeling the dynamical system and the noise from the data. It ensures temporal regularization and outputs an optimal recursive estimate. It also does not use any matrix-matrix multiplication and does not necessitate any matrix inversion. These last two properties make it very suitable for large scale systems such as the ones in medical imaging. The EM filter could be used in any discipline which has used, for instance, the KF, or in any discipline interested in time-varying variables, such as financial risk assessment/evaluation, forecasting, or control, especially when there is concern with nonnegative outputs. To confirm Theorem 3, we applied the algorithm to time-varying SPECT, a medical imaging modality in nuclear medicine. Our results substantiate the efficiency of this novel EM filter. Like two faces of the same coin, the EM filter and the SMART filter are two algorithms deduced from the same weighted KL distance.


Acknowledgments

We thank Dr. Germain Tanoh of Quantimal for his contributions to the present paper.

References

[1] A.P. Dempster, N.M. Laird, D.B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society, Series B (Methodological), 39 (1977), 1-38.

[2] L.A. Shepp, Y. Vardi, Maximum likelihood reconstruction for emission tomography, IEEE Trans. Med. Imag., 1 (1982), 113-122.

[3] H.M. Hudson, R.S. Larkin, Accelerated image reconstruction using ordered subsets of projection data, IEEE Transactions on Medical Imaging, 13 (1994), 601-609.

[4] H.H. Bauschke, D. Noll, A. Celler, J.M. Borwein, An EM-algorithm for dynamic SPECT tomography, IEEE Trans. Med. Imag., 18 (1999), 252-261.

[5] M.A. Schulze, An edge-enhancing nonlinear filter for reducing multiplicative noise, In: Proc. SPIE Vol. 3026 (1997), 46-56.

[6] B.D.O. Anderson, J.B. Moore, Optimal Filtering, Prentice-Hall, Englewood Cliffs, NJ (1979).

[7] D. Simon, Optimal State Estimation: Kalman, H Infinity, and Nonlinear Approaches, Wiley-Interscience (2006).

[8] R.E. Kalman, A new approach to linear filtering and prediction problems, Trans. of the ASME, Journal of Basic Engineering, 82 (1960).

[9] J. Qranfal, C. Byrne, SMART filter and its application to dynamic SPECT reconstruction, Int. J. of Pure and Applied Math., to appear (2011).

[10] C.L. Byrne, Signal Processing, A Mathematical Approach, A K Peters, Wellesley, MA (2005).

[11] S. Kullback, R. Leibler, On information and sufficiency, Ann. Math. Statist., 22 (1951), 79-86.

[12] D. Titterington, On the iterative image space reconstruction algorithm for ECT, IEEE Trans. Med. Imag., 1 (1987), 52-56.

[13] C.L. Byrne, Iterative image reconstruction algorithms based on cross-entropy minimization, IEEE Trans. on Image Processing, 2 (1993), 96-103.

[14] R. Gordon, R. Bender, G.T. Herman, Algebraic reconstruction techniques (ART) for three-dimensional electron microscopy and X-ray photography, J. Theoret. Biol., 29 (1970), 471-481.

[15] T. Farncombe, A. Celler, C. Bever, D. Noll, J. Maeght, R. Harrop, The incorporation of organ uptake into dynamic SPECT (dSPECT) image reconstruction, IEEE Transactions on Nuclear Science, 48 (2001), 3-9.

[16] J. Qranfal, G. Tanoh, Regularized Kalman filtering for dynamic SPECT, In: J. Phys.: Conf. Ser., 124 (2008).

[17] J. Qranfal, Optimal Recursive Estimation Techniques for Dynamic Medical Image Reconstruction, Simon Fraser University (2009).

[18] G. Tanoh, Algorithmes du point interieur pour l'optimisation en tomographie dynamique et en mecanique du contact, Universite Paul Sabatier (2004).

[19] T. Farncombe, Functional Dynamic SPECT Imaging Using a Single Slow Camera Rotation, University of British Columbia (2000).

[20] M. Kervinen, M. Vauhkonen, J.P. Kaipio, P.A. Karjalainen, Time-varying reconstruction in single photon emission computed tomography, Int. J. of Imaging Syst. Tech., 14 (2004), 186-197.

[21] B.W. Reutter, G.T. Gullberg, R.H. Huesman, Direct least squares estimation of spatiotemporal distributions from dynamic SPECT projections using spatial segmentation and temporal B-splines, IEEE Trans. Med. Imaging, 19 (2000), 434-450.
