multidimensional rational covariance extension with

16
1 Multidimensional Rational Covariance Extension with Applications to Image Compression Axel Ringh, Student Member, IEEE, Johan Karlsson, Member, IEEE, and Anders Lindquist, Life Fellow, IEEE Abstract—The rational covariance extension problem (RCEP) is an important problem in systems and control occurring in such diverse fields as control, estimation, system identification, and signal and image processing, leading to many fundamental theoretical questions. In fact, this inverse problem is a key component in many signal processing techniques and plays a fundamental role in prediction, analysis, and modeling of signals. It is well-known that the RCEP can be reformulated as a (truncated) trigonometric moment problem subject to a rationality condition. In this paper we consider the more general multidimensional trigonometric moment problem with a similar rationality constraint. This generalization creates many interesting new questions and also provides new insights into the original one-dimensional problem. A key concept in this approach is the complete smooth parametrization of all solutions, allowing solutions to be tuned to satisfy additional design specifications without violating the complexity constraints. As an illustration of the potential of this approach we apply our results to image compression. This is just a first step in this direction, and we expect that more elaborate tuning strategies will enhance this compression procedure in the future. I. I NTRODUCTION In this paper we consider the (truncated) multidimensional trigonometric moment problem with a certain complexity constraint. Many spectral estimation problems, such as radar, sonar, and medical imaging, are essentially problems of this type [66], and here we shall pay special attention to problems in image compression [21]. More precisely, given a set of complex numbers c k , k Λ, where k := (k 1 ,...,k d ) is a vector-valued index belonging to a specified index set Λ Z d , find a nonnegative bounded measure such that c k = Z T d e i(k,θ) (θ) for all k Λ, (1) where T := (-π,π], θ := (θ 1 ,...,θ d ) T d , and (k, θ) := d j=1 k j θ j is the scalar product in R d . Moreover, let e iθ := (e 1 ,...,e d ). By the Lebesgue decomposition [62, p. 121], the measure can be decomposed in a unique fashion as (θ) = Φ(e iθ )dm(θ)+ d ˆ μ(θ) (2a) This work was supported by the Swedish Research Council (VR), the Swedish Foundation of Strategic Research (SSF), and the Center for Industrial and Applied Mathematics (CIAM). A. Ringh and J. Karlsson are with the Division of Optimization and Systems Theory, Department of Mathe- matics, KTH Royal Institute of Technology, 100 44 Stockholm, Sweden. [email protected], [email protected]. Anders Lindquist is with the Departments of Automation and Mathematics, Shanghai Jiao Tong University, 200240 Shanghai, China, and with the Division of Optimization and Systems Theory, Department of Mathematics, KTH Royal Institute of Technology, 100 44 Stockholm, Sweden. [email protected]. of an absolutely continuous part Φdm, with a spectral density Φ and a Lebesgue measure dm(θ) := (1/2π) d d Y j=1 j , and a singular part d ˆ μ containing, e.g., the spectral lines. This is an inverse problem, which in general has infinitely many solutions if one exists, and the problem of interest to us in this paper is how to smoothly parametrize the family of all solutions that satisfy the rational complexity constraint Φ(e iθ )= P (e iθ ) Q(e iθ ) , where P,Q ¯ P + \{0}, (2b) where P + is the convex cone of positive trigonometric poly- nomials P (e iθ )= X kΛ p k e -i(k,θ) that are positive for all θ T d , and ¯ P + is its closure; P + will be called the positive cone. Moreover, we use the notation P + := ¯ P + \P + for its boundary; i.e., the subset of P ¯ P + that are zero in at least one point. In this paper we develop a theory based on convex optimization for this problem. Since (2b) is determined by a limited number of parameters, this approach enables compression of data. Moreover, the smoothness of the parameterization will facilitate tuning to specifications. In this paper we shall apply this low complex- ity approach to achieve image compression. A preliminary example, to which we shall return in Section VII, is shown in Fig. 1. The original image, the Shepp-Logan phantom, often used in medical imaging [64], is shown in Fig. 1a, and two different compressions are shown in Fig. 1b and 1c. Essentially, some tuning has been applied in Fig. 1b, while there is none in Fig. 1c. These simulations have been done with a straight-forward application of the algorithms without any special enhancements, and more work is required to refine these procedures. For d = 1 and Λ = {0, 1,...,n} the trigonometric moment problem with complexity constrains stated above is well understood, and it has a solution with d ˆ μ =0 if and only if the Toeplitz matrix T (c)= c 0 c -1 ... c -n c 1 c 0 c -n+1 . . . . . . . . . c n c n-1 ... c 0 (3) is positive definite [46]. Such a sequence, c 0 ,...,c n , will therefore be called a positive sequence in this paper. arXiv:1507.01430v1 [math.OC] 6 Jul 2015

Upload: others

Post on 30-Oct-2021

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Multidimensional Rational Covariance Extension with

1

Multidimensional Rational Covariance Extensionwith Applications to Image Compression

Axel Ringh, Student Member, IEEE, Johan Karlsson, Member, IEEE, and Anders Lindquist, Life Fellow, IEEE

Abstract—The rational covariance extension problem (RCEP)is an important problem in systems and control occurring insuch diverse fields as control, estimation, system identification,and signal and image processing, leading to many fundamentaltheoretical questions. In fact, this inverse problem is a keycomponent in many signal processing techniques and playsa fundamental role in prediction, analysis, and modeling ofsignals. It is well-known that the RCEP can be reformulatedas a (truncated) trigonometric moment problem subject toa rationality condition. In this paper we consider the moregeneral multidimensional trigonometric moment problem with asimilar rationality constraint. This generalization creates manyinteresting new questions and also provides new insights into theoriginal one-dimensional problem. A key concept in this approachis the complete smooth parametrization of all solutions, allowingsolutions to be tuned to satisfy additional design specificationswithout violating the complexity constraints. As an illustrationof the potential of this approach we apply our results to imagecompression. This is just a first step in this direction, and weexpect that more elaborate tuning strategies will enhance thiscompression procedure in the future.

I. INTRODUCTION

In this paper we consider the (truncated) multidimensionaltrigonometric moment problem with a certain complexityconstraint. Many spectral estimation problems, such as radar,sonar, and medical imaging, are essentially problems of thistype [66], and here we shall pay special attention to problemsin image compression [21].

More precisely, given a set of complex numbers ck, k ∈ Λ,where k := (k1, . . . , kd) is a vector-valued index belongingto a specified index set Λ ⊂ Zd, find a nonnegative boundedmeasure dµ such that

ck =

∫Td

ei(k,θ)dµ(θ) for all k ∈ Λ, (1)

where T := (−π, π], θ := (θ1, . . . , θd) ∈ Td, and (k,θ) :=∑dj=1 kjθj is the scalar product in Rd. Moreover, let eiθ :=

(eiθ1 , . . . , eiθd). By the Lebesgue decomposition [62, p. 121],the measure dµ can be decomposed in a unique fashion as

dµ(θ) = Φ(eiθ)dm(θ) + dµ(θ) (2a)

This work was supported by the Swedish Research Council (VR), theSwedish Foundation of Strategic Research (SSF), and the Center for Industrialand Applied Mathematics (CIAM). A. Ringh and J. Karlsson are withthe Division of Optimization and Systems Theory, Department of Mathe-matics, KTH Royal Institute of Technology, 100 44 Stockholm, [email protected], [email protected]. Anders Lindquistis with the Departments of Automation and Mathematics, Shanghai Jiao TongUniversity, 200240 Shanghai, China, and with the Division of Optimizationand Systems Theory, Department of Mathematics, KTH Royal Institute ofTechnology, 100 44 Stockholm, Sweden. [email protected].

of an absolutely continuous part Φdm, with a spectral densityΦ and a Lebesgue measure

dm(θ) := (1/2π)dd∏j=1

dθj ,

and a singular part dµ containing, e.g., the spectral lines.This is an inverse problem, which in general has infinitely

many solutions if one exists, and the problem of interest to usin this paper is how to smoothly parametrize the family of allsolutions that satisfy the rational complexity constraint

Φ(eiθ) =P (eiθ)

Q(eiθ), where P,Q ∈ P+\{0}, (2b)

where P+ is the convex cone of positive trigonometric poly-nomials

P (eiθ) =∑k∈Λ

pke−i(k,θ)

that are positive for all θ ∈ Td, and P+ is its closure; P+

will be called the positive cone. Moreover, we use the notation∂P+ := P+\P+ for its boundary; i.e., the subset of P ∈ P+

that are zero in at least one point. In this paper we develop atheory based on convex optimization for this problem.

Since (2b) is determined by a limited number of parameters,this approach enables compression of data. Moreover, thesmoothness of the parameterization will facilitate tuning tospecifications. In this paper we shall apply this low complex-ity approach to achieve image compression. A preliminaryexample, to which we shall return in Section VII, is shownin Fig. 1. The original image, the Shepp-Logan phantom,often used in medical imaging [64], is shown in Fig. 1a,and two different compressions are shown in Fig. 1b and 1c.Essentially, some tuning has been applied in Fig. 1b, whilethere is none in Fig. 1c. These simulations have been donewith a straight-forward application of the algorithms withoutany special enhancements, and more work is required to refinethese procedures.

For d = 1 and Λ = {0, 1, . . . , n} the trigonometricmoment problem with complexity constrains stated above iswell understood, and it has a solution with dµ = 0 if and onlyif the Toeplitz matrix

T (c) =

c0 c−1 . . . c−nc1 c0 c−n+1

.... . .

...cn cn−1 . . . c0

(3)

is positive definite [46]. Such a sequence, c0, . . . , cn, willtherefore be called a positive sequence in this paper.

arX

iv:1

507.

0143

0v1

[m

ath.

OC

] 6

Jul

201

5

Page 2: Multidimensional Rational Covariance Extension with

2

(a) The Shepp-Logan phantom.

(b) Compression with tuning. (c) Compression without tuning.

Fig. 1. Compression of the Shepp-Logan phantom, with a compression rateof about 97%.

In his pioneering work on spectral estimation, J.P. Burgobserved that among all spectral densities Φ satisfying themoment constraints

ck =1

∫TeikθΦ(eiθ)dθ, k = 0, 1, . . . , n (4a)

the one with maximal entropy∫T

log Φ(eiθ)dθ

2π(4b)

is of the form Φ(eiθ) = 1/Q(eiθ), where Q(eiθ) is a positivetrigonometric polynomal [3], [4]. Later, in 1981, R.E. Kalmanposed the rational covariance extension problem (RCEP) [34]:given a finite covariance sequence c0, . . . , cn, determine allinfinite extensions cn+1, cn+2, . . . such that

Φ(eiθ) =

∞∑k=−∞

cke−ikθ

is a positive rational function of degree bounded by 2n.This problem, which is important in systems theory [46], isprecisely a (one-dimensional) trigonometric moment problemwith the complexity constraint (2b). The designation ‘co-variance’ emanates from the fact that c0, c1, c2, . . . , can beinterpreted as the covariance lags E{y(t+ k)y(t)∗} = ck of astationary stochastic process y with spectral density Φ.

In 1983, T.T. Georgiou [27] (also see [28]) proved thatto each positive covariance sequence and positive numeratorpolynomial P , there exists a rational covariance extension ofthe sought form (2b). He also conjectured that this extensionis unique and hence gives a complete parameterization of allrational extensions of degree bounded by 2n. This conjecturewas first proven in [15], where it was also shown that the

complete parameterization is smooth, allowing for tuning. Theproofs in [15], [27], [28] were nonconstructive, using topo-logical methods. Later a constructive proof was given in [10],[11], leading to an approach based on convex optimization.Here Φ is obtained as the maximizer of a generalized entropyfunctional ∫

TP (eiθ) log Φ(eiθ)

2π(5)

subject to the moment conditions (4a), and the problem isreduced to solving a dual convex optimization problem.

Since then, this approach have been extensively studied [5]–[7], [11], [23], [24], [29], [45], [52], [54], [59], [61], [70], andthe approach has also been generalized to a quite completetheory for scalar moment problems [8], [9], [12], [13], [32].Moreover a number of multivariate counterparts, i.e., when Φis matrix-valued, have also been solved [1], [2], [26], [31],[44], [55], [56], [69].

A considerable amount of research has also been done inthe area of multidimensional spectral estimation; for example,Woods [68], Ekstrom and Woods [22], Dickinson [19], andLev-Ari et al. [42] to mention a few. Of special interest is alsoresults by Lang and McClellan [38]–[41], [49], [50], as theyconsider a similar entropy functional. In many of these areasit seems natural to consider rational models. Nevertheless,the multidimensional version of the RCEP has only beenconsidered at a few instances, for the two-dimensional case in[30], [31] and in the more general setting of moment problemswith arbitrary basis functions in our recent paper [37].

The purpose of this paper is to extend the theory of rationalcovariance extension from one-dimension to the general d-dimensional case and to develop methods for multidimensionalspectral estimation. In Section II we summarize the maintheoretical results of the paper. This includes the main theoremcharacterizing the optimal solutions to the weighted entropyfunctional, which is then proved in Section III. In Section IVwe prove that under certain assumptions the problem is well-posed in the sense of Hadamard and provide comments and ex-amples related to these assumptions. In Section V we considersimultaneous matching of covariance lags and logarithmicmoments, and Section VI is devoted to a discrete version ofthe problem, where the measure dµ consists of discrete pointmasses placed equidistantly in a discrete grid in Td. This is ageneralization to the multidimensional case of recent results in[45] and is motivated by computational considerations. In fact,these discrete solutions provide approximations to solutions tomoment problems with absolutely continuous measures andallow for fast arithmetics based on the fast Fourier transform(FFT) (c.f., [59]). Finally, in Section VII we apply this theoryto image compression. This is just one possible application,others being in system identification and signal processing.

Similar methods have previously been used for compres-sion of textures [17], [55], where, instead of a scalar two-dimensional moment problem, a one-dimensional vector prob-lem is considered. In these papers the image is modeledby a periodic stochastic vector process rather than a two-dimensional random field, leading to a discrete vector momentproblem akin to the one presented in [45]. This is connectedto the circulant moment problem considered in Section II-B

Page 3: Multidimensional Rational Covariance Extension with

3

and to modeling of reciprocal systems [16], [43]. Comparingthese procedures will be the topic of a future study.

II. MAIN RESULTS

Given the moments {ck}k∈Λ, the problem under consider-ation is to find a positive measure (2) of bounded variationsatisfying the moment constraint (1).

Let us pause to pin down the structure of the index set Λ.In view of (1), we have c−k = ck, where ¯ denotes the com-plex conjugation. Revisiting the one-dimensional result [12]–[14] for moment problems with arbitrary basis functions, weobserve that the theory holds also for sequences with “gaps”,e.g., for a sequence c0, . . . , ck−1, ck+1, . . . , cn. As seen in [37]this observation equally applies to the multidimensional case.Therefore, we shall consider covariance sequences {ck}k∈Λ,where Λ ⊂ Zd is any finite index set such that 0 ∈ Λ and−Λ = Λ. We will denote the cardinality of Λ by |Λ|. Further,let nj = max{kj |k ∈ Λ} denote the maximum range of Λin dimension j.

Next, given the inner product

〈c, p〉 =∑k∈Λ

ckpk,

we define the open convex cone

C+ :={c | 〈c, p〉 > 0, for all P ∈ P+ \ {0}

},

the closure of which, C+, is the dual cone of P+. Theboundary of the dual cone will be denoted ∂C+.

We now extend the domain of the generalized entropyfunctional in (5) to multidimensional nonnegative measuresof the type (2) and consider functionals

IP (dµ) =

∫Td

P (eiθ) log Φ(eiθ) dm(θ), (6)

where Φ is the absolutely continuous part of dµ.1 This func-tional is concave, but not strictly concave since the singularpart of the measure does not influence the value. This leadsto the optimization problem to maximize (6) subject to themoment constraints (1). Since the constraints are linear, thisis a convex problem. However, as it is an infinite-dimensionaloptimization problem, it is more convenient to work with thedual problem, which has a finite number of variables butan infinite number of constraints. In fact, the dual problemamounts to minimizing

JP (Q) = 〈c, q〉 −∫Td

P (eiθ) logQ(eiθ)dm (7)

over all Q ∈ P+. Note that (7) takes an infinite value forQ ≡ 0.

Theorem 1: For every c ∈ C+ and P ∈ P+ \ {0} thefunctional (7) is strictly convex and has a unique minimizerQ ∈ P+ \ {0}. Moreover, there exists a unique c ∈ ∂C+ and

1Note that the absolutely continuous part is uniquely defined by theLebesgue decomposition, and hence the function IP (dµ) is uniquely defined.Moreover, this definition of IP (dµ) can be motivated by the fact thatlimn→∞

∫Td log(Φ(eiθ) + fn(θ))dm(θ) =

∫Td log(Φ(eiθ))dm(θ) for

any log-integrable Φ and nonnegative good kernel fn(θ) (see, e.g., [65, p.48]). See also the discussion in Section III-B.

a nonnegative singular measure dµ with support supp(dµ) ⊆{θ ∈ Td | Q(eiθ) = 0} such that

ck =

∫Td

ei(k,θ)

(P

Qdm+ dµ

)for all k ∈ Λ

andck =

∫Td

ei(k,θ)dµ, for all k ∈ Λ.

For any such dµ,

dµ(θ) = (P (eiθ)/Q(eiθ))dm(θ) + dµ(θ)

is an optimal solution to the problem to maximize (6) subjectto the moment constraints (1). Moreover, dµ can always bechosen so that it has support in at most |Λ| − 1 points.

Corollary 2: Let c ∈ C+. Then, for any

dµ =P

Qdm, P,Q ∈ P+\{0}

satisfying the moment condition (1), Q is the unique minimizerover P+ of the dual functional (7).

This corollary implies that, for any c ∈ C+, any measuredµ with only absolutely continuous rational part matching ccan be obtained by solving (7) for a suitable P . However,although c ∈ C+, not all P result in an absolutely continuoussolution dµ = (P/Q)dm that satisfies (1). Nevertheless, theexceptional case when this happens is of particular interest.To insure this we need to impose the following assumption,which always holds for d ≤ 2; see Section IV.

Assumption 3: The cone P+ has the propery∫Td

1

Qdm(θ) =∞ for all Q ∈ ∂P+.

Corollary 4: Suppose that Assumption 3 holds. Then, forany c ∈ C+ and P ∈ P+ there exists a Q ∈ P+ suchthat dµ = (P/Q)dm satisfies (1). Moreover this Q is theunique solution to the strictly convex optimization problem tominimize the dual functional (7) over all Q ∈ P+.

Although they do not consider rational solutions explicitly,nor parameterizations of them, Corollary 4 can be deducedfrom the early work of Lang and McClellan [40].

Note that Corollary 4 is only valid for P ∈ P+, whileTheorem 1 holds for all P ∈ P+ \ {0}. The reason for thisis that, by Assumption 3 and the condition P ∈ P+, thedirectional derivative

δJP (Q; δQ) = 〈c, δq〉 −∫Td

P

QδQdm

of JP tends to −∞ on the boundary, so a minimum is notattained there (c.f., Section III and IV). Therefore it must beattained in P+, which implies that dµ = 0. In fact, if theoptimal solution is an interior point, it is a stationary pointfor which δJP (Q; δQ) = 0 for all δQ, and hence the momentmatching condition (1) holds with dµ = 0. On the other hand,if P ∈ ∂P+, we have

∫Td(P/Q)dm <∞ for some Q ∈ ∂P+,

take for example Q = P . More generally, the integral may notdiverge if Q only have zeros in a subset of the zeros of P .Hence, in this case, there is no guarantee that the optimalsolution is an interior point. We shall return specifically tothis issue in Sections III and IV.

Page 4: Multidimensional Rational Covariance Extension with

4

A. Covariance and cepstral matching

From Theorem 1 and Corollary 4 it follows that Q iscompletely determined by the pair (c, P ). For d = 1 thechoice P ≡ 1 leads to Burg’s formulation (4), which has beentermed the maximum-entropy (ME) solution. On the other handbetter dynamical range of the spectrum can be obtained bytaking advantage of the extra degrees of freedom in P . Severalmethods for selecting P have been suggested in the one-dimensional setting. Examples are methods based on inverseproblems as in [25], [35], [36], a linear-programming approachas in [5], [6], and simultaneous matching of covariancesand cepstral coefficients as in [51] and independently in [5],[6], [23], [45]. Here, in the multivariate setting, we considerthe selection of P based on the simultaneous matching oflogarithmic moments.

We define the (real) cepstrum of a multidimensional spec-trum as the (real) logarithm of its absolutely continuouspart. The cepstral coefficients are the corresponding Fouriercoefficients

γk =

∫Td

ei(k,θ) log Φ(eiθ)dm(θ), for k ∈ Λ \ {0} (8)

where k = 0 has been excluded. For spectra that only have anabsolutely continuous part this agrees with earlier definitionsin the literature (see, e.g., [53, pp. 500-507] or [18, Chapter6]).

Given a set of cepstral coefficients we now also enforcecepstral matching of the sought family of spectra. This meansthat we look for Φ = P/Q that also satisfy (8). Note thatthe index k = 0 is not included in (8). In fact, for technicalreasons, we shall set γ0 = 1. Also to avoid trivial cancelationsof constants in P/Q, we need to introduce the set

P+,◦ := {P ∈ P+ | p0 = 1}.

Theorem 5: Let γk, for k ∈ Λ \ {0}, be any sequence ofcomplex numbers such that γ−k = γk, and let γ = {γk}k∈Λ

where γ0 = 1. Then, for c ∈ C+, the convex optimizationproblem (D) to minimize

J(P,Q) = 〈c, q〉 − 〈γ, p〉+

∫Td

P log

(P

Q

)dm (9)

subject to (P,Q) ∈ P+,◦×P+ has an optimal solution (P , Q).If such a solution belongs to P+,◦ × P+, then Φ = P /Qsatisfies the logarithmic moment condition (8) and dµ = Φdmthe moment condition (1). Moreover, Φ is also an optimalsolution to the problem (P) to maximize

I(Φ) =

∫Td

log Φ dm

subject (1) and (8) for dµ = Φdm. Finally, if Assumption 3holds, then, P ∈ P+,◦ implies that Q ∈ P+.

For reasons to become clear in Section V, the optimizationproblems (P) and (D) will be referred to as the primal anddual problem, respectively. A drawback with Theorem 5 isthat even when Assumption 3 holds, a solution to the dualproblem can be guaranteed to have a rational spectrum thatsatisfies (1) and (8) only if P ∈ P+,◦. In fact, as we saw in thediscussion after Corollary 4, for a solution with P ∈ ∂P+,◦

we might have Q ∈ ∂P+ and hence covariance mismatch. Aremedy is to use the Enqvist regularization, introduced in theone-dimensional setting in [23]. This makes the optimizationproblem strictly convex and forces the solution P into the setP+,◦. In this way we obtain strict covariance matching andapproximative cepstral matching. This statement will be madeprecise in Theorem 22 in Section V-A.

B. The circulant covariance extension problemIn the recent paper [45], Lindquist and Picci studied, for the

case d = 1, the situation when the underlying stochastic pro-cess y(t) is periodic. For the N -periodic case, the covariancesequence must satisfy the extra condition cN−k = ck; i.e., theN × N Toeplitz matrix of one period is Hermitan circulant.In this case, the spectral measure must be discrete with point

masses at ζ` = ei`2πN , ` = 0, 1, . . . , N−1, on the discrete unit

circle, and instead of the moment condition (1) we have

ck =1

N

N−1∑`=0

Φ(ζ`)ζk` , (10)

which is the inverse discrete Fourier transform of the sequence(Φ(ζ`)).

This was generalized to the multidimensional case in [60],where a circulant version of Theorem 1 and Corollary 4 wasderived. For N := (N1, . . . , Nd), set

ζ` := (ei`1

2πN1 , . . . , e

i`d2πNd )

and

ZdN := {` = (`1, . . . , `d) | 0 ≤ `j ≤ Nj − 1, j = 1, . . . , d},

and let P+(N) be the positive cone of all

P (eiθ) =∑k∈Λ

pke−i(k,θ)

such that P (ζ`) > 0 for all ` ∈ ZdN . Moreover, define theinterior C+(N) of the dual cone as the set of all c ∈ Λsuch that 〈c, p〉 > 0 for all P ∈ P+(N) \ {0}. ClearlyP+(N) ⊃ P+, and hence C+(N) ⊂ C+. Then Theorem 2and Corollary 3 in [60] can be combined in the followingtheorem.

Theorem 6: Suppose that 2nj < Nj , for j = 1, . . . , d, andlet c ∈ C+(N) and P ∈ P+(N) \ {0}. Then, there exist aQ ∈ P+(N) \ {0} such that Q is a solution to the convexproblem to minimize2

JNP (Q) = 〈c, q〉 − 1∏dj=1Nj

∑`∈Zd

N

P (ζ`) logQ(ζ`)

over all Q ∈ P+(N). Moreover, there exists a nonnegativefunction µ with support supp(µ) = {ζ` | Q(ζ`) = 0, ` ∈ ZdN}such that

ck =1∏d

j=1Nj

∑`∈Zd

N

ζk`

(P (ζ`)

Q(ζ`)+ µ(ζ`)

), (11)

2Note that in the multidimensional case, limits such as P log(Q) and P/Qmay not be well defined. In the problems and derivations we therefore definethe expressions P log(Q) and P/Q to be zero whenever P = 0. This is notneeded in the continuous case as the set where P is zero is of measure zero.

Page 5: Multidimensional Rational Covariance Extension with

5

and the number of mass points for µ can be chosen so that atmost |Λ|−1 points µ(ζ`) are nonzero. Finally, if P ∈ P+(N)then Q ∈ P+(N), which is then also unique, and hence Φ =P/Q satisfies (11) with µ ≡ 0.

In [45] it was also shown that in the one-dimensional case,as N → ∞ the solution of the discrete problem, call itQN , converges to the solution to the corresponding continuousproblem, call it Q. This is of interest since it gives a naturalway to compute an approximate solution to the continuousproblem using the fast computations of the discrete Fouriertransform. We will show that the same holds also in higherdimensions, and the result can be formulated as follows.

Theorem 7: Suppose that P ∈ P+ \ {0} and c ∈ C+, andlet Q be an optimal solution of Theorem 1 and QN one ofTheorem 6. Then

limmin{N}→∞

QN = Q.

III. THE MULTIDIMENSIONAL RATIONAL COVARIANCEEXTENSION PROBLEM

Most of this section will be devoted to proving Theorem 1.Some of the details that are of more technical character aredeferred to the appendix. In the end of the section we remarkon the possible interpretation of P and give an example thatshows the non-uniqueness of the measure dµ.

A. Proof of Theorem 1

1) Deriving the dual problem: For a given P ∈ P+ \ {0}and c ∈ C+, consider the primal problem to maximize (6) sub-ject to the moment constraints (1) over the set of nonnegativebounded measures, i.e., over dµ = Φdm + dµ, where Φ is anonnegative L1(Td)-function and dµ is a nonnegative singularmeasure. The Lagrangian of the problem becomes

LP (Φ, dµ, Q) =

∫Td

P log Φdm

+∑k∈Λ

qk

(ck −

∫Td

ei(k,θ)(Φdm+ dµ)

)where qk, k ∈ Λ, are the Lagrange multipliers. By moving thesum into the second integral and identifying

∑k∈Λ qke

i(k,θ)

with the trigonometric polynomial Q, this can be simplified to

LP (Φ, dµ, Q) =

∫Td

P log Φ dm+〈c, q〉−∫Td

QΦdm−∫Td

Qdµ.

The dual function supdµ≥0 LP (Φ, dµ, Q) is finite only if Q ∈P+ \{0}. To see this, let Q 6∈ P+, i.e., suppose there is θ0 ∈Td for which Q(θ0) < 0. Then, by letting µ(θ0)→∞ in thesingular part dµ, we get that LP (Φ, dµ, Q)→∞. Moreover,if Q ≡ 0 then since P is continuous and P 6≡ 0 there is asmall neighbourhood where P > 0. Letting Φ → ∞ in thisneighbourhood we again have that LP (Φ, dµ, Q)→∞. Hencewe can restrict the multipliers to P+ \ {0}.

Note that any pair (Φ, dµ) maximizing LP (Φ, dµ, Q) mustsatisfy

∫Td Qdµ = 0, or equivalently, the support of dµ is

contained in {θ ∈ Td |Q(eiθ) = 0}. Otherwise letting dµ = 0would result in a larger value of the Lagrangian.

Now consider the directional derivative with respect to Φ,

δLP (Φ, dµ, Q; δΦ) = limε→0

LP (Φ+εδΦ,dµ,Q)−LP (Φ,dµ,Q)

ε

=

∫Td

(P1

Φ−Q)δΦdm,

for any direction δΦ such that Φ+εδΦ is a nonnegative L1(Td)function for sufficiently small ε > 0. For a stationary point thismust be zero for all feasible directions δΦ, requiring that

P1

Φ−Q = 0 ⇐⇒ Φ =

P

Qa.e.,

which inserted into the dual function yields

supdµ≥0

LP (Φ, dµ, Q) = JP (Q) +

∫Td

P (logP − 1)dm, (12)

where

JP (Q) = 〈c, q〉 −∫Td

P logQdm (13)

can replace (12) in the dual problem since the last term in (12)does not depend on Q.

2) Lower semicontinuity of the dual functional: For anyQ ∈ P+, JP (Q) is clearly continuous. However, for Q ∈∂P+, logQ will approach −∞ in the points where Q(eiθ) =0, and hence we need to consider the behavior of the integralterm in (13). Since P is a fixed nonnegative trigonometricpolynomial, it suffices to consider the integral∫

Td

logQdm.

However, this integral is known as the (logarithmic) Mahlermeasure of the Laurent polynomial Q [48], and it is finitefor all Q ∈ P+ \ {0} [63, Lemma 2, p. 223]. This leads tothe following lemma, the proof of which is deferred to theAppendix.

Lemma 8: For any P ∈ P+\{0} and c ∈ C+, the functionalJP : P+ \ {0} → R is lower semicontinuous.

3) The uniqueness of a solution: From the first directionalderivative

δJP (Q; δQ) = 〈c, δq〉 −∫Td

P

QδQdm

of the dual functional (13), we readily derive the second

δ2JP (Q; δQ) =

∫Td

P

Q2(δQ)2dm,

which is clearly nonnegative for all variations δQ. Therefore,since, in addition, the constraint set P+ is convex, the dualproblem is a convex optimization problem. To see that JP isactually strictly convex, note that since P is positive almosteverywhere, so is P/Q2. Therefore, for δ2JP (Q; δQ) to bezero we must have δQ = 0 almost everywhere, which impliesthat it is zero everywhere since it is continuous. This impliesthat if there exists a solution, this solution is unique.

Page 6: Multidimensional Rational Covariance Extension with

6

4) The existence of a solution: If we can show that JP hascompact sublevel sets, then JP must have a minimum since itis lower semicontinuous (Lemma 8).

Lemma 9: The sublevel sets J−1P (−∞, r] are compact for

all r ∈ R.For the proof of Lemma 9 we need the following lemma

modifying Proposition 2.1 in [13] to the present setting.Lemma 10: For a fixed c ∈ C+, there exists an ε > 0 such

that for every (P,Q) ∈ (P+ \ {0})× (P+ \ {0})

JP (Q) ≥ ε‖Q‖∞ −∫Td

Pdm log ‖Q‖∞. (14)

Proof: Since 〈c, q〉 is a continuous function, it achieves aminimum on the compact set {Q ∈ P+ \ {0} | ‖q‖∞ = 1},where ‖q‖∞ := maxk∈Λ |qk|. The minimum value κc mustbe positive since c ∈ C+ and hence 〈c, q〉 > 0 for any q ∈P+ \ {0}. For any Q ∈ P+ \ {0} we thus have

〈c, q〉 = 〈c, q

‖q‖∞〉‖q‖∞ ≥ κc‖q‖∞. (15)

By Lemma 28 (see Appendix), ‖Q‖∞ ≤ |Λ|‖q‖∞ , and henceby choosing ε ≤ κc/|Λ| we get

〈c, q〉 ≥ κc‖q‖∞ ≥κc|Λ|‖Q‖∞ ≥ ε‖Q‖∞. (16)

To obtain a bound on the second term in (13), we observe that∫Td

P logQdm =

∫Td

P log

[Q

‖Q‖∞

]dm+

∫Td

Pdm log ‖Q‖∞

≤∫Td

Pdm log ‖Q‖∞,

since Q/‖Q‖∞ ≤ 1. Hence (14) follows.Proof of Lemma 9: For any r ∈ R, large enough for the

sublevel set {Q ∈ P+ \ {0} | r ≥ JP (Q)} to be nonempty,

r ≥ JP (Q) ≥ ε‖Q‖∞ −∫Td

Pdm log ‖Q‖∞

for some ε > 0 (Lemma 10). Comparing linear and loga-rithmic growth we see that the sublevel set is bounded bothfrom above and from below. Moreover, since JP is lowersemicontinuous (Lemma 8), the sublevel sets are also closed[62, p. 37]. Therefore they are compact.

5) Existence of a singular measure: It remains to show thatthere exists a measure dµ prescribed by the theorem and thatdµ = P/Qdm+dµ is in fact an optimal solution to the primalproblem to maximize (6) subject to the moment constraints (1).To this end, we invoke the KKT-conditions [47, p. 249] for thedual optimization problem, which require that the functional

LP (Q, dµ) = 〈c, q〉 −∫Td

P log(Q)dm−∫Td

Qdµ.

is stationary at Q for some nonnegative measure3 dµ and thatthe complementary slackness condition

∫Td Qdµ = 0 holds

implying that supp(dµ) ⊆ {θ ∈ Td | Q(eiθ) = 0}.Applying the Wirtinger derivatives [57, pp. 66-69]

∂z=

1

2

(∂

∂x− i ∂

∂y

),

∂z=

1

2

(∂

∂x+ i

∂y

), (17)

3Note that by Reitz’s representation theorem (for periodic functions), thedual of C(Td) is the space of bounded measures on Td [47, p. 133].

where z = x+ i y is a complex variable, we obtain

∂LP (Q, dµ)

∂qk= ck −

∫Td

ei(k,θ)

(P

Qdm+ dµ

),

from which we see that a stationary point must satisfy themoment condition (1). This shows that there exists a singularmeasure dµ with the properties prescribed in part 1) of theproof, such that dµ = P/Qdm+ dµ matches the covariances,and we may therefore take dµ = dµ. Next, we define

ck :=

∫Td

ei(k,θ)dµ = ck −∫Td

ei(k,θ)P

Qdm (18)

for k ∈ Λ, from which we see that c is unique, although themeasure dµ might not be. For a Q ∈ P+ \ {0},

〈c, q〉 =∑k∈Λ

ckqk =∑k∈Λ

(∫Td

ei(k,θ)dµ

)qk =

∫Td

Qdµ,

which shows that 〈c, q〉 ≥ 0 for all Q ∈ P+ \ {0}, and thusc ∈ C+. However for Q we have 〈c, q〉 =

∫Td Qdµ = 0

by complementary slackness, which shows that c ∈ ∂C+.Morerover, it is shown in [39] that there exists a discreterepresentation with support in |Λ| − 1 points for all c ∈ ∂C+.

To show that the solution is optimal also for the primalproblem we observe that, for all dµ = Φdm+ dµ,

IP (Φ) ≤ LP (Φ, dµ, Q) ≤ JP (Q) +

∫Td

P (logP − 1)dm.

Since equality holds for the feasible point dµ = (P/Q)dm+dµ, optimality follows. This completes the proof of Theorem 1.

B. Comments and an example

As was already observed in the one-dimensional case in[12], [13], P need not be confined to the cone P+ \ {0}but could be a general nonnegative integrable function withzero locus of measure zero. This observation was implementedin [32] to interpret the functional (5) as a Kullback-Leiblerpseudo-distance between P and Φ, allowing P to be inter-preted as a Kullback-Leibler prior. In fact, maximizing (5) isequivalent to minimizing the Kullback-Leibler divergence

SKL(P‖Φ) :=

∫TP log

(P

Φ

)dm.

For functions with the same total mass this is nonnegative andequal to zero only when the functions are equal.

In our present more general setting, P could be any ab-solutely integrable, nonnegative function for which the set{θ ∈ Td | P (eiθ) = 0} has measure zero. In this contextit is also possible to interpret the functional (6) as a Kullback-Leibler distance, not only between the two functions P andΦ, but between the two measures dp := Pdm and dµ. Sincedp is absolutely continuous with respect to dµ we obtain [58](see, in particular, equation 3.1)∫

Td

P log

(P

Φ

)dm =

∫Td

log

(dp

)dp

where (dp/dµ) = P/Φ is the Radon-Nikodym derivative.Except in the one-dimensional case, the singular part of the

measure is in general not unique. To illustrate this fact, we

Page 7: Multidimensional Rational Covariance Extension with

7

consider the following example in two dimensions, similar toExample 5.4 in [37], where Q has zeros along a line.

Example 11: Given Λ = {(0, 0), (−1, 0), (1, 0), (0,−1),(0, 1), (−1,−1), (1, 1), (−1, 1), (1,−1)}, consider

P (eiθ1 , eiθ2) = (1− cos θ1),

Q(eiθ1 , eiθ2) = (1− cos θ1)(2− cos θ2).

Let c be the covariances of the spectrum Φ = P/Q, i.e., c0,0 =1/√

3, c1,0 = 0, c0,1 = −1 + 2/√

3, c1,1 = 0 and c−1,1 = 0,the remaining covariances being uniquely determined by theconjugate symmetry c−k = ck. Moreover, let c be given by

ck =

∫T2

ei(k,θ)δ(θ1)dθ1dθ2

so that c0,0 = 1, c1,0 = 1, c0,1 = 0, c1,1 = 0 and c−1,1 = 0.Clearly P, Q ∈ P+, and thus c ∈ C+ since

〈c, r〉 =∑k∈Λ

ckrk =

∫T2

R(eiθ)P (eiθ)

Q(eiθ)dm(θ) > 0

for any R ∈ P+ \ {0}. In the same way,

〈c, q〉 =

∫T2

Q(eiθ)δ(θ1)dm(θ)

=

∫ π

−π(1− cos θ1)δ(θ1)dθ1

∫ π

−π(2− cos θ2)

dθ2

2π= 0,

and thus c ∈ ∂C+. Hence, (Q, c) is the unique pair prescribedby Theorem 1 for the covariance sequence c + c and thenumerator polynomial P . However, since Q is zero for θ1 = 0,any measure dµ with support constrained to the line θ1 = 0and mass 1 such that

∫T2 cos θ2dµ = 0 is a solution.

IV. WELL-POSEDNESS AND ASSUMPTION 3

The intuition behind Corollary 4 is that the optimal solutionQ should always be in the interior P+ whenever P ∈ P+ andAssumption 3 holds, as then solutions are repelled from theboundary. Then, since the measure dµ can only have mass inthe zeros of Q, we must have dµ = 0.

Corollary 4 follows directly from the following lemma.Lemma 12: Let P ∈ P+, and suppose that Assumption 3

holds. Then the optimal solution Q to the problem to minimize(7) over all Q ∈ P+ belongs to P+.

Proof: Let Q ∈ ∂P+ be arbitrary. Then, for any ρ > 0,Q(eiθ) + ρ > 0 for all θ ∈ Td. Hence the functional JP isalso differentiable in Q, and the directional derivative in thedirection 1 is

δJP (Q+ ρ; 1) = 〈c, 1〉 −∫Td

P

Q+ ρdm.

Now note that P/(Q+ ρ) is nonnegative in all points, that itis pointwise monotone increasing for decreasing values of ρ,and that it converges pointwise in extended real-valued sense4

to P/Q. Hence by Lebesgue’s monotone convergence theorem[62, p. 21] we have, as ρ→ 0,∫

Td

P

Q+ ρdm −→

∫Td

P

Qdm,

4That is, the limit may be ∞.

which, since P ∈ P+, is infinite by Assumption 3. Therefore 1is a descent direction from the point Q, and hence the optimalsolution is not obtained there. Since Q ∈ ∂P+ is arbitrary,this means that the optimal solution is not attained on theboundary, i.e., we have Q ∈ P+.

Consequently, by Corollary 4, the unique solution Q ofthe dual problem belongs to the interior P+ for every pair(c, P ) ∈ C+ × P+ if Assumption 3 holds. It turns out thatthe problem is in fact well-posed in the sense of Hadamard,i.e., the solution depends smoothly on c and P , which isan important property when it comes to tuning of solutionsto design specifications. This follows from the followinggeneralizations to the multidimensional case of Theorems 1.3and 1.4 in [13]. The proofs are given in the Appendix.

Theorem 13: Let fp : P+ → C+ be the map from Q to c,given component-wise by

ck =

∫Td

ei(k,θ)P

Qdm

for a fixed P ∈ P+. Under Assumption 3, fp is a diffeomor-phism.

Theorem 14: Let fp be as in Theorem 13, and let c ∈ C+ befixed. Then, given Assumption 3, the function gc : P+ → P+

mapping P to Q = (fp)−1(c) is a diffeomorphism onto itsimage Q+.

As noted in [13], Assumption 3 always holds in the one-dimensional case (d = 1), since the trigonometric functionsare Lipschitz continuous. Using results by Georgiou in [30, p.819] it can shown that this assumption is also valid for d = 2.However, Lang and McClellan [40] note that Assumption 3does not hold in general for dimensions d ≥ 3. To see this theyconsider the polynomial Q(eiθ) =

∑d`=1(1 − cos θ`) ∈ ∂P+

and show that for d ≥ 3∫Td

1

Qdx <∞.

The following simple one-dimension example illustratessome complication that can occur when P ∈ ∂P+.

Example 15: Consider a one-dimensional problem of degreeone, i.e., with Λ = {−1, 0, 1}. Fix c = (1, c1), where c1 is any(negative) number on the interval (−1, 0). Clearly the Toeplitzmatrix (3) is positive definite, and hence c ∈ C+. We fix

P (eiθ) = 2 + eiθ + e−iθ,

which belongs to ∂P+ since P (π) = 0. We want to find aQ ∈ P+ of degree at most one so that Φ = P/Q matches thecovariance sequence c, i.e,

ck =

∫Teikθ

P

Qdm, k = 0, 1. (19)

Any such Q must have the form

Q(eiθ) = λ(1− ρeiθ)(1− ρe−iθ)

for some λ > 0 and |ρ| < 1. Now, clearly

Φ(eiθ) = λ−1 2 + eiθ + e−iθ

1− |ρ|21− |ρ|2

(1− ρeiθ)(1− ρe−iθ),

Page 8: Multidimensional Rational Covariance Extension with

8

where the second factor takes the form

1

1− ρeiθ+

1

1− ρe−iθ− 1

= · · ·+ ρ2e−2iθ + ρe−iθ + 1 + ρeiθ + ρ2e2iθ + · · · ,

from which it follows that

c0 = λ−1 2 + ρ+ ρ

1− |ρ|2, c1 = λ−1 1 + 2ρ+ ρ2

1− |ρ|2.

Since c0 = 1,

c1 =1 + 2ρ+ ρ2

2 + ρ+ ρ=

(1 + ρ)2

2 + ρ+ ρ,

which has positive, real denominator. Then, since c1 < 0,1 +ρ is purely imaginary, which is impossible since 1 +ρ hasa positive real part. Hence, there is no Q ∈ P+ of degree atmost one satisfying (19). However, for a certain Q ∈ ∂P+,namely

Q(eiθ) =1

1 + c1(2 + eiθ + e−iθ),

we obtain dµ = (P/Q)dm− c1δ(θ − π)dθ, i.e.,

dµ = (1 + c1)dm− c1δ(θ − π)dθ,

which matches c with −1 < c1 < 0. Now Φ = 1 + c1 and thesingular measure dµ = δ(θ−π)dθ has all its mass at the zeroof Q, as required by Theorem 1.

In this context it is interesting to note that the covarianceextension problem is usually formulated as a partial realizationproblem where one wants to determine an extension of thepartial covariance sequence c so that

Φ+(z) =1

2c0 +

∞∑k=1

ckz−k

is positive real, i.e., Φ+ maps the unit disc to the right half ofthe complex plane; see, e.g., [46]. Then Φ+(eiθ) + Φ+(eiθ)∗

is the corresponding spectral density Φ(eiθ). In our examplesuch a solution is provided by

Φ+(z) =1

2

(1 + c1 − c1

1− z1 + z

)=

1

2+ c1z − c1z2 + · · · ,

which yields precisely Φ = 1 + c1. However the singularmeasure never appears in this framework.

V. LOGARITHMIC MOMENTS AND CEPSTRAL MATCHING

Given c ∈ C+, Corollary 4 and Theorem 14 together providea complete smooth parameterization in terms of P ∈ P+ of allΦ = P/Q such that dµ = Φdm is the solution of the momentequations (1). Therefore adjusting P can be used to tune thesolution to satisfy additional design specifications. How todetermine P is, however, a separate problem. Theorem 5, tobe proved next, extends results from the one-dimensional caseto simultaneously estimate P using the cepstral coefficientsand logarithmic moment matching.

Proof of Theorem 5: The proof follows along the samelines as that of Theorem 1. By relaxing the primal problem(P) we get the Lagrangian

L(Φ, P,Q) =

∫Td

log Φ dm

+∑k∈Λ

qk

(ck −

∫Td

ei(k,θ)Φ dm

)(20)

+∑

k∈Λ\{0}

pk

(∫Td

ei(k,θ) log Φ dm− γk),

where qk and pk are Lagrangian multipliers. Setting p0 =γ0 = 1 and rearranging terms, this can be written as

L(Φ, P,Q) = 〈c, q〉 −∫Td

QΦ dm

− 〈γ, p〉+ 1 +

∫Td

P log Φ dm,

(21)

where the first term in (20) has been incorporated in the lastterm of (21). As before, supΦ≥0 L(Φ, P,Q) is only finite if werestrict Q to P+, and similarly we need to restrict P to P+,◦.Taking the directional derivative of (21) for any direction δΦsuch that Φ + εδΦ is a nonnegative L1(Td) function for allε ∈ (0, a) for a sufficiently small a > 0, we obtain

δL(Φ, P,Q; δΦ) =

∫Td

(P1

Φ−Q)δΦdm.

For the directional derivative to vanish for all feasible direc-tions δΦ we need

Φ =P

Qa.e.,

which inserted into (21) yields

supΦL(Φ, P,Q) = J(P,Q) + 1−

∫Td

Pdm, (22)

where J(P,Q) is the functional (9) of Theorem 5. A closerlook at the last term in (22) shows that∫

Td

Pdm =

∫Td

∑k∈Λ

pkei(k,θ)dm

=∑k∈Λ

pk

d∏j=1

∫ π

−πeikjθj

dθj2π

= 1,

since all integrals vanish except those for k1 = . . . = kd = 0.Consequently, J is precisely the dual functional (22).

Using the Wirtinger derivative from (17) to form the gradi-ent of J, we obtain

∂J(P,Q)

∂qk= ck −

∫Td

ei(k,θ)P

Qdm, k ∈ Λ,

(23a)∂J(P,Q)

∂pk=

∫Td

ei(k,θ) log

(P

Q

)dm− γk, k ∈ Λ \ {0}.

(23b)In deriving (23b) we used the fact that∫

Td

ei(k,θ)dm =

d∏j=1

∫ π

−πeikjθj

dθj2π

= 0, k 6= 0. (24)

Page 9: Multidimensional Rational Covariance Extension with

9

Therefore, if P ∈ P+,◦ and Q ∈ P+, and hence the optimalsolution is a stationary point of J, then the spectrum Φ = P /Qfulfills both covariance matching (1) and cepstral matching (8).

The following three lemmas ensure the existence of asolution and shows that the problem is in fact convex. Thearguments are similar to those in the proof of Theorem 1, andare given in the Appendix.

Lemma 16: Given c ∈ C+ and any sequence γ = {γk}k∈Λ,where γ0 = 1 and γ−k = γk, (P,Q) 7→ J(P,Q) is a lowersemicontinuous functional on P+,◦ × (P+ \ {0}).

Lemma 17: The sublevel sets J−1(−∞, r] are compact.Lemma 18: The dual problem in Theorem 5 is convex on

P(n1,...,nd)+,◦ × P

(n1,...,nd)+ .

It remains to show that Q ∈ P+ whenever P ∈ P+,◦and that in this case Φ = P /Q is also optimal for theprimal problem of Theorem 5. However, the first statementfollows directly from Lemma 12, and the second followsfrom considering the Lagrangian (20) and observing that wehave a feasible point and that the primal problem takes thesame values as the Lagrangian in this point (c.f., the proof ofTheorem 1). This concludes the proof of Theorem 5.

From this proof we see that the stationarity of J(P,Q) in Qensures covariance matching and the stationarity in P providescepstral matching. Therefore we can only guarantee matchingfor a solution in the interior P+,◦ × P+. This subtle factwas overlooked in [6], [23], where it is claimed that we alsohave covariance matching for P ∈ ∂P+,◦. However, evenwhen Assumption 3 holds, we cannot guarantee that there isa solution Q belonging to the interior P+ if P ∈ ∂P+,◦. Thefollowing example illustrates this.

Example 19: Consider the one-dimensional problem withc0 = 2, c−1 = c1 = 1 and γ1 = −1. Set

P (eiθ) = 1− (eiθ − e−iθ)/2 = 1− cos θ,

and Q = P . Clearly P and Q belong to the boundary, sinceP (0) = Q(0) = 0. Moreover Φ = P/Q = 1, so there isneither covariance matching nor cepstral matching. A simplecalculation shows that

∂J∂q0

=∂J∂q1

=∂J∂p1

= 1.

However, for any feasible direction (δq0, δq1, δp1) in (P,Q)we have Re{δp1} ≥ 0 and Re{δq0 + 2δq1} ≥ 0, and hencethere is no feasible descent direction from this point. Thereforewe have a local minimum, which, by convexity, is also aglobal minimum. Consequently, we have an optimal solutionon the boundary where we have neither covariance nor cepstralmatching.

Remark 20: From Theorem 1 we know that it is possible toachieve covariance matching in this example by adding a non-negative singular measure dµ, representing spectral lines. Infact, a similar statement can be proved for cepstral matching,namely that that there exists a nonpositive measure dµ suchthat supp(dµ) ⊆ {θ ∈ Td | P (θ) = 0} and

γk =

∫Td

ei(k,θ)(

log(P /Q)dm(θ)− dµ(θ))

for all k ∈ Λ\{0}. However, while the physical interpretationof dµ in Theorem 1 is clear, in this case it is not obvious whatdµ represents in terms of the spectrum.

Also note that the optimization problem is convex but ingeneral not strictly convex, and hence the solution might notbe unique. This is illustrated in the following example [46, p.504].

Example 21: Again consider a one-dimensional problem,this time with c0 = 1, c−1 = c1 = 0 and γ1 = 0. Choosing

P (eiθ) = Q(eiθ) = 1− ρ cos θ, |ρ| ≤ 1,

we obtain Φ = 1, which matches the given covariances andcepstral coefficients. Therefore all P and Q of this form arestationary points of J and are thus optimal for the dual problemin Theorem 5.

In one dimension it has been shown that we have strictconvexity, and thus a unique solution, if and only if there isan optimal solution for which P and Q are co-prime [6].

A. Regularizing the problem

The motivation for simultaneous covariance and cepstralmatching is to obtain a rational spectrum Φ = P/Q thatmatches the covariances without having to provide a prior P .However, even if Assumption 3 holds, the dual problem inTheorem 5 cannot be guaranteed to produce such a spectrumthat satisfies the covariance constraints (1). To remedy this weconsider the regularization proposed by Enqvist [23], whichhas the objective function

Jλ(P,Q) = J(P,Q)− λ∫Td

logP dm,

where λ ∈ (0,∞) is the regularization parameter.The partial derivative with respect to qk is identical to (23a),

whereas the partial derivative with respect to pk becomes

∂Jλ(P,Q)

∂pk=

∫Td

ei(k,θ)

(log

(P

Q

)− λ

P

)dm− γk.

By Assumption 3, this gradient will be infinite for P ∈ ∂P+,and hence the optimal solution is not on the boundary.Moreover, with this regularization, the optimization problembecomes strictly convex and we thus have a unique solution.

Theorem 22: Suppose that Assumption 3 holds, and let γk,where k ∈ Λ \ {0}, be any sequence of complex numberssuch that γ−k = γk. Set γ = {γk}k∈Λ where γ0 = 1, and letc ∈ C+. Then for any λ > 0 there exists a unique solution(P , Q) to the strictly convex optimization problem to minimize

Jλ(P,Q) = 〈c, q〉 − 〈γ, p〉

+

∫Td

P log

(P

Q

)dm− λ

∫Td

logP dm

subject to P ∈ P+,◦ and Q ∈ P+. Moreover, Φ = P /Qfulfills the covariance matching (1) and approximately fulfillsthe cepstral matching (8) via

γk + εk =

∫Td

ei(k,θ)Φ dm,

whereεk = λ

∫Td

ei(k,θ)P−1dm.

Page 10: Multidimensional Rational Covariance Extension with

10

Proof: Together with the discussion in this section, all ofthe results follow from Theorem 5 except the strict convexity.To prove this, we note that the second directional derivativeof Jλ is given by

δ2Jλ(P,Q; δP, δQ)

=

∫Td

P

(δP

1

P− δQ 1

Q

)2

dm+

∫Td

δP 2 λ

P 2dm

(c.f., the proof of Lemma 18 in the Appendix). Since bothintegrands are nonnegative, they both need to be zero almosteverywhere in order for the derivative to vanish. However,since P > 0, this implies that δP ≡ 0 by continuity. Thenthe first integrand becomes δQ2P/Q2 and in the same waywe must thus have δQ ≡ 0. Hence δ2Jλ(P,Q; δP, δQ) > 0,implying uniqueness.

VI. THE CIRCULANT PROBLEM

Theorem 6 in Section II-B can be viewed as a periodicversion of Theorem 1 and Corollary 4, as can be seenby following the lines of [45], where the one-dimensionalproblem was first introduced. To this end, we introduce thediscrete measure dνN , i.e.,

dνN (θ) =∑`∈Zd

N

δ(θ1 − φ`1 , . . . , θd − φ`d)

d∏j=1

dθjNj

(25)

where φ` := 2kπ/N` and δ is the multidimensional Dirac-delta function. Then the moment matching condition (11) takesthe form

ck =1∏d

j=1Nj

∑`∈Zd

N

ζk`Φ(ζ`) =

∫Td

ei(k,θ)Φ(eiθ)dνN ,

which is similar to (1), but where dνN and dm have differentmass distributions (discrete versus continuous). In fact, themain difference in the statements of Theorem 6 and Theorem 1together with Corollary 4 is that different measures and conesare used. In the same way, versions of Theorems 5 and 22also hold in the circulant case; see [60] for details.

In connection to this it is also interesting to observe thatAssumption 3 holds for any measure dν with discrete massdistribution (see also [40]). However, if P ∈ ∂P+(N) we maystill obtain solutions without covariance matching, because forany Q that is zero only in a subset of points where P is zero wewill have

∫Td(P/Q)dν < ∞ and hence the optimal solution

may occur on the boundary.Remark 23: Although the measure (25) has mass in points

placed in the roots of unity on the d-dimensional torus, onecould also consider other mass distributions. One possibilityis to place the mass points in the odd points of the rootsof unity, i.e., in the points {ei(2kj−1)π/N`}N`

kj=1, a situationwhich has been studied in the one-dimensional case and whichcorrespond to spectra of skew-periodic processes [61]. Thesame holds true in the multidimensional setting. Also notethat all dimensions does not need to have mass distributions ofthe same type. For example, the approach in this paper workseven if the process is periodic in some of the dimensions,while non-periodic in others.

A. Convergence of discrete to continuous

In [45] Lindquist and Picci proved for the one-dimensionalcase that, when the number of mass points in the discretemeasure dνN in (25) goes to infinity, the solution convergesto the solution of the problem using the continuous measuredm. The same is true in higher dimensions, and the formalresult is given in Theorem 7 in Section II-B. In this subsectionwe will prove this statement. Note that we use the notation

JP (Q) = 〈c, q〉 −∫Td

P logQdm (26a)

JNP (Q) = 〈c, q〉 −∫Td

P logQdνN (26b)

to explicitly distinguish the objective functions using thecontinuous and the discrete measure. Moreover let Q be theminimizer of (26a), subject to Q ∈ P+, and QN be aminimizer of (26b), subject to Q ∈ P+(N). Before provingthe theorem, we make some clarifying observations.

Remark 24: We have already noted that the singular measuredµ is not unique. However, the corresponding “rest covari-ance,” c, which dµ matches, is unique (c.f., equation (18)). Inconnection to this it is interesting to note that although this isthe case, and although QN → Q, in general cN 6→ c. To seethis, note that for a P which is positive in all points exceptfor some irrational frequency5 where P = 0, we will haveP ∈ P+(N) for all N , since this point will never belong to thegrid. Thus we will have QN ∈ P+(N) and therefore cN = 0.However P ∈ ∂P+, and therefore we can have Q ∈ ∂P+

and hence c 6= 0. One can construct such example basedon Example 15 by shifting the spectral line to an irrationalfrequency point.

Remark 25: In connection to the previous remark, we notethat in two dimensions we have Q ∈ P+ whenever P 6∈ ∂P+,since Assumption 3 is valid for d = 2. Hence there will beno singular measure. Moreover, since QN → Q as min(N)goes to infinity, for large enough value of min(N) we musthave QN > 0, i.e., QN ∈ P+. Therefore (P/QN )dνN tendsto (P/Q)dm in weak∗.

The first thing we need to show is that QN is in factwell-defined. That this is not evident from the statement ofthe theorem becomes apparent when noting the followingrelationship among the cones of trigonometric polynomials:

P+(N) ⊃ P+(2N) ⊃ . . . ⊃ P+.

For the dual cones we therefore have [47, pp. 157-158]

C+(N) ⊂ C+(2N) ⊂ . . . ⊂ C+,

and thus it is not guaranteed that minimizing (26b) over Q ∈P+(N) has a solution for c ∈ C+. However note that whenNl →∞ the corresponding set {eikl2π/Nl}kl∈ZNl

will becomedense on the unit circle. Therefore P+ =

⋂N∈Zd

+P+(N).

Using this we have the following lemma, proved in theAppendix, which is a generalization to the multivariable caseof Proposition 6 in [45].

Lemma 26: For any c ∈ C+ there exist an N0 such thatc ∈ C+(N) for all N ≥ N0.

5An irrational frequency is an angle λπ for which λ is an irrational number.

Page 11: Multidimensional Rational Covariance Extension with

11

This shows that for each c ∈ C+, the problem of minimizing(26b) over Q ∈ P+(N) does in fact have a solution for largeenough values of N . Interestingly, the lemma is equivalent tolimmin(N)→∞ C+(N) = C+.

Proof of Theorem 7: Let Q and QN be as in the statementof the theorem. Choose a c ∈ C+ and a P ∈ P+ \ {0} andfix N0 in accordance with Lemma 26. Throughout the rest ofthis proof we only consider N ≥ N0, which means that anoptimal solution QN exists. Moreover, in the proof we needthe following result, which is proved in the Appendix.

Lemma 27: The sequence (QN ) is bounded in L∞.Since (QN ) is bounded, there is a convergent subsequence,

call it (QN ) for convenience, converging in the L∞ normto some function Q∞. Since (QN ) is a set of continuousfunctions, this means that the convergence is in fact uniformand hence Q∞ is a continuous function. Now since i) theconvergence is uniform, ii) Q∞ is continuous, and iii) the gridpoints become dense on Td as N goes to infinity, we obtaintQ∞(eiθ) ≥ 0 for all θ, and hence Q∞ belongs to P+ \ {0}.

It remains to show that Q∞ = Q. This will be done byproving that ‖Q∞ − Q‖∞ ≤ ε for all ε > 0. To do this, fixa Q ∈ P+ and consider Q + ηQ, which belongs to P+ forall η > 0. By simply adding and subtracting ηQ, the triangleinequality gives

‖Q∞ − Q‖∞ ≤ η‖Q‖∞ + ‖(Q∞ + ηQ)− Q‖∞. (27)

We want to bound the second term. To this end, note that

JP (Q+ ηQ)− JP (Q) = 〈c, ηq〉 −∫Td

P log

(1 +

ηQ

Q

)dm,

and, since the integral is nonnegative, we obtain

JP (Q+ ηQ) ≤ JP (Q) + η〈c, q〉. (28)

The same holds for JNP , i.e.,

JNP (QN + ηQ) ≤ JNP (QN ) + η〈c, q〉.By optimality we also have JNP (QN ) ≤ JNP (Q + ηQ) < ∞for all η > 0, and hence

JNP (QN + ηQ) ≤ JNP (Q+ ηQ) + η〈c, q〉. (29)

Now, since QN + ηQ→ Q∞ + ηQ ∈ P+, we know that, forlarge enough values of N , we have QN+ηQ ∈ P+. Therefore,the left hand side of (29) is guaranteed to be well-defined forall values of N larger than this value. We can thus take thelimit on both sides of (29) to obtain

JP (Q∞ + ηQ) ≤ JP (Q+ ηQ) + η〈c, q〉,which together with (28) yields

JP (Q∞ + ηQ) ≤ JP (Q) + 2η〈c, q〉. (30)

Now consider the sets

Dδ = {Q ∈ P+ | JP (Q) ≤ JP (Q) + δ}.Since the Hessian at the optimal solution is positive definitewe have

⋂δ>0Dδ = {Q}. Therefore, it follows from (30) that

η > 0 can be chosen so that ‖(Q∞ + ηQ) − Q‖∞ < ε forany ε > 0. Consequently, by selecting η sufficiently small, wemay bound (27) by an arbitrary small positive number. HenceQ∞ = Q.

Fig. 2. A simplistic test image used to test the compression. Each black orwhite square is 128× 128 pixels

VII. APPLICATION TO IMAGE COMPRESSION

Next we apply the two-dimensional circulant RCEP to com-pression of black-and-white images. Compression is achievedby approximating the image with a rational spectrum, therebyusing fewer parameters. We compare the ME spectrum tothe solution resulting from regularized covariance and cepstralmatching. By choosing n1 � N1, n2 � N2, where N1 andN2 are the dimensions of the image, we obtain a significantreduction in number of parameters describing the image.

A seemingly straight-forward way is to compute the co-variances and cepstral coefficients directly from the image,and then use these to compute the spectrum. However, if thediscrete spectrum is zero in one of the grid points, the cep-strum is not well-defined. Hence simultaneous covariance andcepstral matching cannot be applied. Therefore we transformthe image, denoted by Ψ, using Φ = eΨ. Since Ψ is real, Φ isguaranteed to be real and positive for all discrete frequencies,and Ψ is obtained as Ψ = log(Φ). We then compute (1) and(8) and obtain the approximant Φ from Theorem 22.

Note that a ME solution of the same maximum degree as asolution with a full-degree P have about half the number ofparameters. To compensate for this, we let the degree of theME solution be a factor

√2 higher (rounded up), in order to

get a fair comparison.

A. Compression of simplistic images

To better understand the different methods we first performcompression on a simple image of only black and whitesquares. The original image is shown in Fig. 2 and variousresults are shown in Fig. 3. A first observation is related toFig. 3a, which shows that, if too few coefficients are used,the compression cannot represent the harmonics present in theimage, regardless of the use of a nontrivial P .

A visual assesment of the result gives that 3e clearlyoutperforms 3a, and that 3f is still slightly better than 3b.However 3c and 3d is better than 3g and 3h, respectively. Inorder to more objectively asses the quality of the two differentcompressions methods, we also compute the MSSIM valueof the compressed images. This is a measure for evaluatingquality and degradation of images that takes values in theinterval [0, 1], and where 1 means exact agreement [67]. Aplot of the MSSIM value for compressions of different degree

Page 12: Multidimensional Rational Covariance Extension with

12

(a) Cepstral matching, n = 4. (b) Cepstral matching, n = 10. (c) Cepstral matching, n = 11. (d) Cepstral matching, n = 30.

(e) ME solution, n = 6. (f) ME solution, n = 15. (g) ME solution, n = 16. (h) ME solution, n = 43.

Fig. 3. A series of compressions of a simplistic the simple image shown in Fig. 2. The top row shows compression with regularized covariance and cepstralmatching, where λ = 10−2, and the bottom row shows compression with maximum-entropy solution. In all compressions n1 = n2, and the two compressionsin each column are described by approximately the same number of parameters, namely nme ≈

√2nceps.

Fig. 4. MSSIM values of different compression levels, plotted against nfor the compression with cepstral matching. Hence the corresponding MEcompression has d

√2ne coefficients.

is shown in Fig. 4. However note that this measure does notagree completely with the visual impression of all images.Most notably, the measure gives a higher value to the greyimage in Fig. 3a than the image with structure in Fig. 3e.

B. Compression of real images

We now apply the methods to some more realistic images.In the first example, shown in Fig. 5a, the original image isthe Shepp-Logan phantom often used in medical imaging [64],

of size 256 × 256 pixels. In Fig. 5b a compression usingcovariance and cepstral mathing is shown, where n1 + 1 =n2 + 1 = 30. Hence this image is described by 2 · 302 = 1800parameters, compared to the original 2562 = 65536 parame-ters, which correspond to a reduction in parameters of about97%. We have also computed an ME compression, with degreen1 + 1 = n2 + 1 = 45 ≈

√2 · 30 which is shown in Fig. 5c.

The second example is a compression of the classical Lennaimage, often used in the image processing literature. Theoriginal image is 512 × 512 pixels, and shown in Fig. 5d.In the formulation with regularized cepstral matching weset n1 + 1 = n2 + 1 = 60, which also corresponds toa compression rate of about 97%, and the result is shownin Fig. 5e. The ME compression was thus computed withn1 + 1 = n2 + 1 = 85 ≈

√2 · 60 and is shown in Fig.

5f.We have also computed the MSSIM values for these com-

pressions, and the results are shown in Table I. These valuesseem to agree with the visual impression, and it is interestingto note that the compression with cepstral matching is betterfor compressing the Shepp-Logan phantom. However for com-pressing the Lenna image neither of the methods outperformthe other. The ME compression has more ringing artifacts,however it is less blurred than the cepstral compression. Webelieve that this is related to the fact that if you have relativelyfew sharp transitions in pixel values, which is the case in Fig.2 and 5a, placing both poles and zero close to each other canachieve this transition efficiently and thus give better qualityon the compressed image. However when this is not the case,

Page 13: Multidimensional Rational Covariance Extension with

13

(a) Original. (b) Cepstral matching, n = 30 and λ = 10−2. (c) ME solution, n = 45.

(d) Original. (e) Cepstral matching, n = 60 and λ = 10−2. (f) ME solution, n = 85.

Fig. 5. Compression of the Shepp-Logan phantom and the Lenna image, with a compression rate of about 97%.

TABLE IMSSIM-VALUES OF DIFFERENT COMPRESSION TECHNIQUES, ON THE

TWO TEST IMAGES.

Shepp-Logan Lenna

Compression MSSIM-value Compression MSSIM-value

Cepstral 0.8690 Cepstral 0.7451ME 0.7044 ME 0.7489

as with the Lenna image, the trade-off between having spectralzeros or matching higher frequencies is more complex.

VIII. CONCLUSION AND FUTURE WORK

In this paper we have extended the theory of rational covari-ance extension from one to higher dimensions and shown howit can be used for image compression. However, here muchwork remains to make it competitive with existing methods.Other possible application areas are system identification andstochastic realization theory. The power spectrum of a signalrepresents the energy distribution across frequencies of thesignal. For a multidimensional, discrete-time, zero-mean, andhomogeneous6 stochastic process {y(t)}, defined for t ∈ Zd,

6Homogeneity implies that covariances ck := E{y(t + k)y(t)} areinvariant with “time” t ∈ Zd. From this it is also easy to see that c−k = ck.

the power spectrum is defined as the nonnegative measure dµon Td whose Fourier coefficients are the covariances

ck =

∫Td

ei(k,θ)dµ.

In one dimension the singular part of the measure representsspectral lines, and if the absolutely continuous part is alsorational, Φ = P/Q, one can use spectral factorization todetermine the filter coefficients for an autoregressive-moving-average (ARMA) model which, when feed with white noiseinput, reproduces a stochastic signal with the same powerdistribution as Φ. However spectral factorization is not ingeneral possible in higher dimensions (see, e.g., [20]). It wouldtherefore be interesting to study which spectra represent arealizable systems. The work of Geronimo and Woerdeman[33] may be of use here.

APPENDIX

Some of the proofs use general properties of multidimen-sional trigonometric polynomials, summarized in this lemma.

Lemma 28: For all P ∈ P+ we havei) |pk1...,kd | ≤ p0...,0

ii) ‖P‖∞ ≤ |Λ|‖p‖∞.

Page 14: Multidimensional Rational Covariance Extension with

14

Proof: i) follows from the fact that

|pk| =∣∣∣∣∫

Td

ei(k,θ)Pdm

∣∣∣∣ ≤ ∫Td

|ei(k,θ)| |P |dm = p0.

Next we note that P has |Λ| coefficients, and hence

‖P‖∞ ≤ supθ∈Td

∑k∈Λ

|pk||ei(k,θ)| =∑k∈Λ

|pk| ≤ |Λ|‖p‖∞,

which proves ii).Proof of Lemma 8: To show lower semicontinuity of

JP (Q) = 〈c, q〉+

∫Td

−P logQdm

we note that 〈c, q〉 is continuous and hence only the integralneeds to be considered.

Fix any Q ∈ P+ \ {0}. From [63, p. 223] we know thatit is log-integrable. Moreover, let (Qn) be a sequence oftrigonometric polynomials in P+\{0} that converges to Q. Weknow that Q is bounded, and, since the convergence Qn → Qis uniform, we must have M := supn{maxθ[Qn]} <∞, andthus 0 ≤ Q/M ≤ 1 and 0 ≤ Qn/M ≤ 1 for all n. Moreover,

limn→∞

− log

(QnM

)= − log

(Q

M

)in extended real-valued sense. Since − log(Qn/M) ≥ 0, byFatou’s Lemma [62, p. 23], we have∫

Td

− log

(Q

M

)dm ≤ lim inf

n→∞

∫Td

− log

(QnM

)dm.

Since (Qn) is an arbitrary sequence, this shows that thefunctional is lower semicontinuous in Q. Moreover, since Qis also arbitrary it follows that JP is lower semicontinuous onP+ \ {0}.

To prove Theorem 13, we need the following lemma.Lemma 29: fp is a bijective map.

Proof: By Corollary 4, fp is injective, since there is aunique minimizer of (7) over all Q ∈ P+. Hence there is (atmost) one q that corresponds to a certain c, proving injectivity.Surjectivity can also be seen from Corollary 4. We fix a P ∈P+ and simply note that there exist a unique solution for allc ∈ C+, which is given by q = (fp)−1(c).

Proof of Theorem 13: In the proof of Theorem 1 we sawthat ∂2JP (Q; δQ) > 0 for all nontrivial variations δQ. Hence

∂fpk∂q`

=

∫Td

ei(k−`,θ) P

Q2dm =

∂2JP (Q)

∂q`∂qk(31)

is positive definite. Next, we define the map ϕp : C+×P+ →{(rk)k∈Λ ∈ C|Λ| | r−k = rk,k ∈ Λ} ∼= R|Λ| as

ϕpk(c, q) = ck −∫Td

ei(k,θ)P

Qdm.

By Corollary 4, γ(c, q) = 0 has a unique solution for eachc ∈ C+. Since ∂ϕp/∂q = ∂fp/∂q is invertible, the ImplicitFunction Theorem implies that q = (fp)−1(c) is locally a C1

function and hence a local diffeomorphism. However, fp is abijection (Lemma 29) and therefore a (global) diffeomorphism.

By Theorem 13, the function gc is a well-defined map. Theproof of Theorem 14 now follows along the same lines.

Lemma 30: gc is a bijective map.Proof: The fact that gc is surjective on the image Q+

follows directly from definition. A straight-forward general-ization of Lemma 2.4 in [13] shows that gc is injective.

Proof of Theorem 14: Let the map ϕc : P+ × P+ →{(rk)k∈Λ ∈ C|Λ| | r−k = rk,k ∈ Λ} ∼= R|Λ| be given by

ϕck(p, q) = ck −∫Td

ei(k,θ)P

Qdm.

The Jacobian with respect to q is precisely the same as (31).Hence q = gc(p) is C1 by the Implicit Function Theorem.Since (31) gives a positive definite Jacobian matrix,

∂ϕck∂p`

= −∫Td

ei(k−`,θ) 1

Qdm

defines a negative-definite invertible Jacobian. Therefore p =(gc)−1(q) is C1, so gc is a local diffeomorphism. However,by Lemma 30 it is a bijection and hence a (global) diffeomor-phism.

Proof of Lemma 16: For any Q ∈ P+ \ {0}, logQ isintegrable [63, p. 223]. Since P ∈ P+,◦, P is not the zero-polynomial, hence, since x log x → 0 as x → 0, P logP isintegrable and in fact continuous for all P ∈ P+,◦. Hence∫

Td

P logP dm−∫Td

P logQdm =

∫Td

P log

(P

Q

)dm,

and therefore we can rewrite the functional J(P,Q) as

J(P,Q) = 〈c, q〉 − 〈γ, p〉+

∫Td

P logP dm−∫Td

P logQdm.

All terms in this expression are continuous, except possibly thelast integral. However, following along the same lines as in theproof of Lemma 8, we can apply Fatou’s Lemma showing thatJ(P,Q) is lower semicontinuous.

Proof of Lemma 17: To show that J−1(−∞, r] havecompact sublevel sets, we proceed as in [46, p. 503] by firstsplitting the objective function into two parts

J1(P,Q) = 〈c, q〉 −∫Td

P logQdm

J2(P ) = −〈γ, p〉+

∫Td

P logP dm

The sublevel set consists of the (P,Q) ∈ P+,◦ × P+ suchthat

r ≥ J1(P,Q) + J2(P ),

and from Lemma 10 we have

J1(P,Q) ≥ ε‖Q‖∞ + log ‖Q‖∞,

since∫Td Pdm = 1 by (24). Next we show that J2(P ) is

bounded from below. We first note that since P ∈ P+,◦ wehave p0 = 1, and thus P is bounded away from the zeropolynomial. Now, since x log(x) achieves a minimum > −∞on any compact set [0, a], P logP must achieve a minimum> −∞ on Td. Calling this minimum κP , we have∫

Td

P logP dm ≥∫Td

κP dm = κP

Page 15: Multidimensional Rational Covariance Extension with

15

To bound the term −〈γ, p〉 from below we note that

〈γ, p〉 =∑k∈Λ

γkpk ≤

∣∣∣∣∣∑k∈Λ

γkpk

∣∣∣∣∣ ≤∑k∈Λ

|γk| |pk|

≤∑k∈Λ

‖γ‖∞|pk| ≤ ‖γ‖∞|Λ|‖p‖∞

and thus −〈γ, p〉 ≥ −|Λ|‖γ‖∞‖p‖∞ = −|Λ|‖γ‖∞, since‖p∞‖ = p0 = 1 by Lemma 28. Hence there exist someρ > −∞ such that J2(P ) ≥ ρ. From this we have

r − ρ ≥ J1(P,Q) ≥ ε‖Q‖∞ + log ‖Q‖∞,

so comparing linear and logarithmic growth we see that theset is bounded both from above and below. As before, since itis the sublevel set of a lower semicontinuous function it willbe closed, and hence it is compact.

Proof of Lemma 18: Consider the directional derivativeof J in a point (P,Q) ∈ P+,◦× ∈ P+ in any direction(δP, δQ) such that P + εδP ∈ P+,◦, and Q + εδQ ∈ P+

for all ε ∈ (0, a) for some a > 0. A quite straight-forwardcalculation yields

δJ(P,Q; δP, δQ) = 〈c, δq〉 − 〈γ, δp〉

+

∫Td

[δP log

(P

Q

)− δQP

Q

]dm.

where we have used the fact, obtained from (24), that∫Td

δPdm = δp0 = 0,

since p0 = 1 is constant. Likewise, the second directionalderivative becomes

δ2J(P,Q; δP, δQ) =

∫Td

P

(δP

1

P− δQ 1

Q

)2

dm,

which is clearly nonnegative for all feasible directions andhence positive semi-definite. Thus the problem is convex.

Proof of Lemma 26: First note that C+(N) ⊂ C+. Toprove the lemma, it is sufficient to prove that any c ∈ C+

belongs to C+(N) if min(N) is large enough.Let c ∈ C+. From (15) there exists κc > 0 such that

〈c, p〉 ≥ κc‖p‖∞, for all p ∈ P+. (32)

We want to show that 〈c, p〉 > 0 for any p ∈ P+(N) \ {0}.Without loss of generality we may take ‖p‖∞ = 1. Then|∂P (eiθ)/∂θj | ≤

∑k∈Λ |kj |, and, since P (eiθ) ≥ 0 in θ ∈

TN , it follows that P (eiθ) ≥ −π∆/min(N) where ∆ =∑k∈Λ ‖k‖1. Therefore P+π∆/min(N) ∈ P+, and by using

(32) we get

〈c, p〉+ c0π∆

min(N)≥ κc

(‖p‖∞ −

π∆

min(N)

).

By selecting min(N) > π∆(1+ c0/κc), we obtain 〈c, p〉 > 0.Since p ∈ P+(N) \ {0} is arbitrary, it therefore follows thatc ∈ C+(N).

Proof of Lemma 27: For a fixed Q ∈ P+ we have

limmin{N}→∞

JNP (Q) = JP (Q),

since the sums in (26b) are Riemann sums converging to (26a).Consequently we can define

L := supN

JNP (Q) <∞.

Moreover, for all values of N we have by optimality that∞ > JNP (Q) ≥ JNP (QN ), and also ∞ > JP (Q) ≥ JP (Q).Using this and Lemma 10 we obtain

L ≥ JNP (Q) ≥ JNP (QN ) ≥ εN‖QN‖∞ − ‖P‖1‖ log(QN )‖∞

for all values of N . In accordance with (16), we can chooseεN := κNc /|Λ|, where κNc is the minimum value of 〈c, qN 〉 onthe compact set {Q ∈ P+(N) | ‖q‖∞ = 1}. If we can showκc := infN κ

Nc > 0, we can choose ε := κc/|Λ| ≤ εN for all

N , so that

L ≥ ε‖QN‖∞ − ‖P‖1‖ log(QN )‖∞.

Then comparing linear and logarithmic growth this wouldimply that the sequence (QN ) must be bounded.

To show that κc > 0 first note that for every finite value ofmin(N) we have κNc > 0. Now assume infN κ

Nc = 0. Then

there must exist a sequence (q?N ) such that 〈c, q?N 〉 → 0 asmin(N) → ∞, where q?N ∈ P+(N) and ‖q?N‖∞ = 1. Now,since every q?N is a vector in C|Λ|, the constraint ‖q‖∞ = 1defines a compact set. Hence there is a subsequence, alsoindexed with N , so that q? := limmin(N)→∞ q?N is well-defined and ‖q?‖∞ = 1. Then 〈c, q?〉 = 0. However, sincec ∈ C+ and q? ∈ P+, this implies that q? = 0, whichcontradicts ‖q?‖∞ = 1. Hence κc > 0, as claimed.

REFERENCES

[1] E. Avventi. Spectral Moment Problems : Generalizations, Implementa-tion and Tuning. PhD thesis, 2011. Optimization and systems theory,Department of mathematics, Royal Institue of Technology.

[2] A. Blomqvist, A. Lindquist, and R. Nagamune. Matrix-valuedNevanlinna-Pick interpolation with complexity constraint: an optimiza-tion approach. IEEE Trans. Automat. Contr., 48(12):2172–2190, 2003.

[3] J.P. Burg. Maximum entropy spectral analysis. In Proceedings of the37th Meeting Society of Exploration Geophysicists, 1967.

[4] J.P. Burg. Maximum Entropy Spectral Analysis. PhD thesis, 1975.Department of Geophysics, Stanford University.

[5] C.I. Byrnes, P. Enqvist, and A. Lindquist. Cepstral coefficients,covariance lags, and pole-zero models for finite data strings. IEEETransactions on Signal Processing, 49(4):677–693, 2001.

[6] C.I. Byrnes, P. Enqvist, and A. Lindquist. Identifiability and well-posedness of shaping-filter parameterizations: A global analysis ap-proach. SIAM Journal on Control and Optimization, 41(1):23–59, 2002.

[7] C.I. Byrnes, T.T. Georgiou, and A. Lindquist. A new approach tospectral estimation: a tunable high-resolution spectral estimator. SignalProcessing, IEEE Transactions on, 48(11):3189–3205, 2000.

[8] C.I. Byrnes, T.T. Georgiou, and A. Lindquist. A generalized entropycriterion for Nevanlinna-Pick interpolation with degree constraint. IEEETrans. Automat. Contr., 46(6):822–839, 2001.

[9] C.I. Byrnes, T.T. Georgiou, A. Lindquist, and A. Megretski. Generalizedinterpolation in h-infinity with a complexity constraint. Transactions ofthe American Mathematical Society, 358(3):965–987, 2006.

[10] C.I. Byrnes, S.V. Gusev, and A. Lindquist. A convex optimizationapproach to the rational covariance extension problem. SIAM Journalon Control and Optimization, 37(1):211–229, 1998.

[11] C.I. Byrnes, S.V. Gusev, and A. Lindquist. From finite covariancewindows to modeling filters: A convex optimization approach. SIAMReview, 43(4):pp. 645–675, 2001.

[12] C.I. Byrnes and A. Lindquist. A convex optimization approach togeneralized moment problems. In K. Hashimoto, Y. Oishi, and Y. Ya-mamoto, editors, Control and Modeling of Complex Systems, Trends inMathematics, pages 3–21. Birkhuser, Boston, 2003.

Page 16: Multidimensional Rational Covariance Extension with

16

[13] C.I. Byrnes and A. Lindquist. The generalized moment problemwith complexity constraint. Integral Equations and Operator Theory,56(2):163–180, 2006.

[14] C.I. Byrnes and A. Lindquist. The moment problem for rationalmeasures: convexity in the spirit of Krein. In Modern Analysis andApplication: Mark Krein Centenary Conference, Vol. I: Operator Theoryand Related Topics, pages 157–169. Birkhauser, 2009.

[15] C.I. Byrnes, A. Lindquist, S.V. Gusev, and A.S. Matveev. A completeparameterization of all positive rational extensions of a covariancesequence. IEEE Trans. on Automatic Control, 40(11):1841–1857, 1995.

[16] F. P. Carli, A. Ferrante, M. Pavon, and G. Picci. A maximum entropysolution of the covariance extension problem for reciprocal processes.IEEE Trans. Automat. Contr., 56(9):1999–2012, 2011.

[17] A. Chiuso, A. Ferrante, and G. Picci. Reciprocal realization andmodeling of textured images. In 44th IEEE Conference on Decisionand Control (CDC), and European Control Conference (ECC), pages6059–6064, Dec 2005.

[18] J.R. Deller, J.G. Proakis, and J.H.L. Hansen. Discrete-time processingof speech signals. IEEE Press, Piscataway, N.Y., 2000.

[19] B. Dickinson. Two-dimensional markov spectrum estimates need notexist. IEEE Trans. on Information Theory, 26(1):120–121, Jan 1980.

[20] B. Dumitrescu. Positive Trigonometric Polynomials and Signal Process-ing Applications. Springer, Berlin, 2007.

[21] M. Ekstrom. Digital image processing techniques. Acad. Press, 1984.[22] M. Ekstrom and J. Woods. Two-dimensional spectral factorization with

applications in recursive digital filtering. IEEE Trans. Acoust. SpeechSignal Proc., 24(2):115–128, Apr 1976.

[23] P. Enqvist. A convex optimization approach to ARMA(n,m) modeldesign from covariance and cepstral data. SIAM Journal on Controland Optimization, 43(3):1011–1036, 2004.

[24] P. Enqvist and E. Avventi. Approximative covariance interpolation with aquadratic penalty. In Decision and Control, 2007 46th IEEE Conferenceon, pages 4275–4280, 2007.

[25] G. Fanizza. Modeling and Model Reduction by Analytic Interpolationand Optimization. PhD thesis, 2008. Optimization and systems theory,Department of mathematics, KTH Royal Institue of Technology.

[26] A. Ferrante, M. Pavon, and F. Ramponi. Hellinger versus Kullback-Leibler multivariable spectrum approximation. IEEE Trans. Automat.Contr., 53(4):954–967, 2008.

[27] T.T. Georgiou. Partial Realization of Covariance Sequences. PhD thesis,1983. Center for mathematical systems theory, Univeristy of Florida.

[28] T.T. Georgiou. Realization of power spectra from partial covariancesequences. IEEE Trans. Acoust. Speech Signal Proc., 35(4):438–449,1987.

[29] T.T. Georgiou. The interpolation problem with a degree constraint. IEEETrans. Automat. Contr., 44(3):631–635, 1999.

[30] T.T. Georgiou. Solution of the general moment problem via a one-parameter imbedding. IEEE Trans. Automat. Contr., 50(6):811–826,2005.

[31] T.T. Georgiou. Relative entropy and the multivariable multidimensionalmoment problem. IEEE Trans. Inf. Theory, 52(3):1052–1066, 2006.

[32] T.T. Georgiou and A. Lindquist. Kullback-Leibler approximation ofspectral density functions. IEEE Trans. Inf. Theory, 49(11):2910–2917,2003.

[33] J. S. Geronimo and H. J. Woerdeman. Positive extensions, Fejr-Rieszfactorization and autoregressive filters in two variables. Annals ofMathematics, 2:160, 2004.

[34] R. Kalman. Realization of covariance sequences. In Toeplitz memorialconference, 1981. Tel Aviv, Israel.

[35] J. Karlsson, T.T. Georgiou, and A. Lindquist. The inverse problem ofanalytic interpolation with degree constraint and weight selection forcontrol synthesis. IEEE Trans. Automat. Contr., 55(2):405–418, 2010.

[36] J. Karlsson and A. Lindquist. Stability-preserving rational approximationsubject to interpolation constraints. IEEE Trans. Automat. Contr.,53(7):1724–1730, Aug 2008.

[37] J. Karlsson, A. Lindquist, and A. Ringh. The Multidimensional MomentProblem with Complexity Constraint. ArXiv e-prints, April 2015.

[38] S.W. Lang and J.H. McClellan. Spectral estimation for sensor arrays. InProceedings of the First ASSP Workshop on Spectral Estimation, pages3.2.1–3.2.7, 1981.

[39] S.W. Lang and J.H. McClellan. The extension of Pisarenko’s methodto multiple dimensions. In Acoustics, Speech, and Signal Processing,IEEE International Conference on ICASSP ’82., volume 7, pages 125–128, May 1982.

[40] S.W. Lang and J.H. McClellan. Multidimensional MEM spectralestimation. IEEE Trans. Acoust. Speech Signal Proc., 30(6):880–887,1982.

[41] S.W. Lang and J.H. McClellan. Spectral estimation for sensor arrays.IEEE Trans. Acoust. Speech Signal Proc., 31(2):349–358, 1983.

[42] H. Lev-Ari, S. Parker, and T. Kailath. Multidimensional maximum-entropy covariance extension. IEEE Trans. Inf. Theory, 35(3):497–508,May 1989.

[43] B.C. Levy, R. Frezza, and A.J. Krener. Modeling and estimationof discrete-time gaussian reciprocal processes. IEEE Trans. Automat.Contr., 35(9):1013–1023, 1990.

[44] A. Lindquist, C. Masiero, and G. Picci. On the multivariate circulantrational covariance extension problem. In IEEE 52nd Annual Conferenceon Decision and Control (CDC), pages 7155–7161, 2013.

[45] A. Lindquist and G. Picci. The circulant rational covariance exten-sion problem: The complete solution. IEEE Trans. Automat. Contr.,58(11):2848–2861, 2013.

[46] A. Lindquist and G. Picci. Linear stochastic systems: A geometricapproach to modeling, estimation and identification. Springer-VerlagBerlin Heidelberg, 2015.

[47] D.G. Luenberger. Optimization by Vector Space Methods. John Wiley& Sons, Inc., New York, 1969.

[48] K. Mahler. On some inequalities for polynomials in several variables.Journal of the London Mathematical Society, 1(1):341–344, 1962.

[49] J.H. McClellan and S.W. Lang. Mulit-dimensional MEM spectralestimation. In Proc. of the Inst. of Acoust. ”Spectral Analysis and its Usein Underwater Acoustics”: Underwater Acoustics Group Conf., London,29-30 Apr., pages 10.1–10.8, 1982.

[50] J.H. McClellan and S.W. Lang. Duality for multidimensional MEMspectral analysis. Communications, Radar and Signal Processing, IEEProceedings F, 130(3):230–235, April 1983.

[51] B.R. Musicus and A.M. Kabel. Maximum entropy pole-zero esti-mation. Technical Report 510, Research Laboratory of Electronics,Massachusetts Institute of Technology, August 1985.

[52] H.I. Nurdin. New results on the rational covariance extension problemwith degree constraint. Systems & Cont. Letters, 55(7):530–537, 2006.

[53] A.V. Oppenheim and R.W. Schafer. Digital signal processing. Prentice-Hall, New Jerseys, 1975.

[54] M. Pavon and A. Ferrante. On the geometry of maximum entropyproblems. SIAM Review, 55(3):1415–439, 2013.

[55] G. Picci and F.P. Carli. Modelling and simulation of images by reciprocalprocesses. In Tenth international conference on Computer Modelling andSimulation, UKSIM, pages 513–518, 2008.

[56] F. Ramponi, A. Ferrante, and M. Pavon. A globally convergent matricialalgorithm for multivariate spectral estimation. IEEE Trans. Automat.Contr., 54(10):2376–2388, Oct 2009.

[57] R. Remmert. Theory of complex functions. Graduate texts in mathemat-ics. Springer-Verlag, New York, 1991. Translation of: FunktionentheorieI. 2nd ed.

[58] A. Renyi. On measures of entropy and information. In Fourth Berkeleysymposium on mathematical statistics and probability, volume 1, pages547–561, 1961.

[59] A. Ringh and J. Karlsson. A fast solver for the circulant rationalcovariance extension problem. In European Control Conference, 2015.

[60] A. Ringh, J. Karlsson, and A. Lindquist. The multidimensional circulantrational covariance extension problem: solution and application in imagecompression. Submitted to 54th IEEE Conference on Decision andControl (CDC), 2015.

[61] A. Ringh and A. Lindquist. Spectral estimation of periodic and skewperiodic random signals and approximation of spectral densities. In 33rdChinese Control Conference (CCC), pages 5322–5327, 2014.

[62] W. Rudin. Real and complex analysis. McGraw-Hill, New York, 1987.[63] A. Schinzel. Polynomials with special regard to reducibility. Cambridge

University Press, 2000.[64] L.A Shepp and B.F. Logan. The Fourier reconstruction of a head section.

IEEE Transactions on Nuclear Science, 21(3):21–43, June 1974.[65] E.M. Stein and R. Shakarchi. Fourier analysis: an introduction.

Princeton university press, Princeton, N.J., 2003.[66] P. Stoica and R. Moses. Introduction to Spectral Analysis. Prentice-Hall,

Upper Saddle River, N.J., 1997.[67] Z. Wang, A.C. Bovik, H.R. Sheikh, and E.P. Simoncelli. Image

quality assessment: from error visibility to structural similarity. IEEETransactions on Image Processing, 13(4):600–612, April 2004.

[68] J Woods. Two-dimensional Markov spectral estimation. IEEE Trans.Inf. Theory, 22(5):552–559, 1976.

[69] M. Zorzi. A new family of high-resolution multivariate spectralestimators. IEEE Trans. Automat. Contr., 59(4):892–904, 2014.

[70] M. Zorzi. Rational approximations of spectral densities based onthe alpha divergence. Mathematics of Control, Signals, and Systems,26(2):259–278, 2014.