International Journal of Innovative Computing, Information and Control
ICIC International © 2005, ISSN 1349-4198, Volume x, Number 0x, x 2005, pp. 0-0
QUANTIZED PRINCIPAL COMPONENT ANALYSIS WITH APPLICATIONS TO
LOW-BANDWIDTH IMAGE COMPRESSION AND COMMUNICATION
D. Wooden
School of Electrical and Computer Engineering
Georgia Institute of Technology
Atlanta, GA 30332, USA
M. Egerstedt
School of Electrical and Computer Engineering
Georgia Institute of Technology
Atlanta, GA 30332, USA
B.K. Ghosh
Department of Electrical and Systems Engineering
Washington University in St. Louis
St. Louis, MO 63130, USA
Abstract. In this paper we show how Principal Component Analysis can be mapped
to a quantized domain in an optimal manner. In particular, given a low-bandwidth
communication channel over which a given set of data is to be transmitted, we show how
to best compress the data. Applications to image compression are described and examples
are provided that support the practical soundness of the proposed method.
Keywords: principal component analysis, quantization, image compression
1. Introduction. Principal Component Analysis (PCA) is an algebraic tool for com-
pressing large sets of statistical data in a structured manner. However, the reduction
results in real-valued descriptions of the data. In this paper, we take the compression
one step further by insisting on the use of only a finite number of bits for its repre-
sentation. This is necessary in a number of applications where the data is transmitted
over low-bandwidth communication channels. In particular, the inspiration for this work
came from the need for multiple mobile robots to share visual information about their
environment.
Assuming that the data x1, . . . , xN take on values in a d-dimensional space, one can
identify the d principal directions, coinciding with the eigenvectors of the covariance
matrix, given by
C = (1/N) Σ_{i=1}^N (x_i − m)(x_i − m)^T,   (1)
where m is the mean of the data.
If our goal is to compress the data set to a set of dimension n < d, we would pick
the n dominant directions, i.e. the directions of maximum variation of the data. This
results in an optimal (in the sense of least squared error) reduction of the dimension from
d to n. For example, if n = 0 then only the mean is used, while n = 1 corresponds to a
1-dimensional representation of the data. The fact that the reduction can be done in a
systematic and optimal manner has led to the widespread use of PCA in a number of
areas, ranging from process control [7], to weather prediction models [3], to image com-
pression [2]. In this paper, we focus on and draw inspiration from the image processing
problem in particular, even though the results are of a general nature.
2. Principal Component Analysis. Suppose we have a stochastic process with samples x_k ∈ R^d, k = 1, . . . , N, where N is the number of samples taken. Let

1. m = (1/N) Σ_{k=1}^N x_k be the mean of the data;
2. e_i ∈ R^d be the ith principal direction of the system, where i ∈ {1, . . . , d};
3. a_i ∈ R be the ith principal component, i.e.,

    a_i = e_i^T (x − m),

associated with the sample point x ∈ R^d. We can then reconstruct x perfectly from its principal components and the system's principal directions as

    x = m + Σ_{i=1}^d a_i e_i.   (2)
If we wish to reduce the system complexity from a d-dimensional data set to n dimensions,
only the n principal directions (corresponding to the n largest eigenvalues of the covariance
matrix) should be chosen.
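As a concrete illustration (our sketch, not part of the original development; the names pca, m, E are ours), the following minimal numpy snippet computes the mean, the principal directions, and the reconstruction of Eqs. (1)-(2):

    import numpy as np

    def pca(X):
        """Minimal sketch of Eqs. (1)-(2). X is N x d, one sample per row.

        Returns the mean m, the principal directions as the columns of E
        (sorted by decreasing eigenvalue), and the eigenvalues.
        """
        m = X.mean(axis=0)
        C = (X - m).T @ (X - m) / X.shape[0]   # covariance, Eq. (1)
        w, E = np.linalg.eigh(C)               # C is symmetric
        order = np.argsort(w)[::-1]            # dominant directions first
        return m, E[:, order], w[order]

    # Eq. (2): with a = E.T @ (x - m), we recover x = m + E @ a; keeping
    # only the first n columns of E gives the optimal rank-n approximation.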
The main contribution in this paper is not the problem of reducing the dimension of
the data set, but rather the problem of communicating the data. Given a number n ≤ d of
transmittable real numbers, the optimal choice for the reconstruction of x from these num-
bers is simply given by the n largest (in magnitude) principal components. But because
the ai’s are all real-valued, we are required to quantize them prior to communication, and
we wish then to transmit only the most significant quanta. In this paper we derive a
mapping of the PCA algorithm to a quantized counterpart.
3. Quantized Components. Let r ∈ N be the resolution of our system. For example,
if r = 10, then we are communicating decimal integers. If r = 16, then we are communi-
cating nibbles (i.e., half-bytes). Now, let K ∈ Z be the largest integer exponent of r such that

    max_{i∈{1,...,d}} |a_i| / r^K ∈ [1, r − 1].   (3)
With this definition of r and K, we are equipped to define the quantities by which we will decompose the principal components. We name the first quantity the quantized component,

    z_i = arg min_{ζ∈Z} |a_i − ζ|,   (4)

where Z = { r^K(−r + 1), r^K(−r + 2), . . . , r^K(r − 1) }. As a result, we have that

    0 ≤ |z_i| ≤ (r − 1) r^K.   (5)

In other words, z_i is obtained from the integer in the range [−r + 1, r − 1] which, when scaled by r^K, minimizes the distance to the principal component a_i.
The second quantity, called the remainder component, is simply defined as

    y_i = a_i − z_i,   (6)

and therefore,

    0 ≤ |y_i| < (1/2) r^K.   (7)

The remainder component is equivalent to the round-off error between z_i and a_i.
With these definitions, we define the quantized version of the original principal components to be a_i^Q := z_i, i = 1, . . . , d. And, in a manner similar to the reconstruction of x from its principal components, we can reconstruct a quantized version of x from its quantized principal components:

    x^Q = m + Σ_{i=1}^d a_i^Q e_i.   (8)
We presume that sufficient resources may be allocated during a start-up period in which
the transmitter and receiver can agree upon the real-valued mean and principal directions.
Thereafter, regular transmission of quantized components commences.
Now, the question remains, if we may only transmit one quantized component, which
one should we pick?
      a_i     r    K     z_i     y_i
    345.6    10    2     300    45.6
    345.6    24    1     336     9.6
   −984.1    10    3   −1000    15.9
   −924.1    10    2    −900   −24.1

Table 1. Example set of principal, quantized, and remainder components.
Problem 1. Identify the quantized component which minimizes the error between x^Q and x. In other words, solve

    arg min_{k∈{1,...,d}} ‖ ( m + Σ_{i=1}^d δ_{ik} a_i^Q e_i ) − x ‖²   (9)

where

    δ_{ik} = { 1, if i = k
             { 0, otherwise.
For the sake of clarity, we present Table 1 as an example of principal components and
their corresponding value of K, as well as their quantized and remainder components.
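To make these quantities concrete, here is a minimal numpy sketch (our code, not the authors') of Eqs. (3), (4) and (6). One subtlety: when the leading mantissa of max|a_i| rounds up past r − 1, as in the a_i = −984.1 row of Table 1, no integer K satisfies Eq. (3) exactly; the sketch then increments K by one, which reproduces the table.

    import numpy as np

    def quantize_components(a, r):
        """Sketch of Eqs. (3), (4), (6); returns (K, z, y) with a = z + y."""
        a = np.asarray(a, dtype=float)
        amax = np.max(np.abs(a))
        # Eq. (3): largest integer K with max|a_i| / r^K in [1, r-1].
        K = int(np.floor(np.log(amax) / np.log(r)))
        if round(amax / r**K) > r - 1:  # mantissa rounds past r-1 ...
            K += 1                      # ... e.g. a = -984.1, r = 10 gives K = 3
        scale = float(r) ** K
        # Eq. (4): nearest element of Z = {r^K(-r+1), ..., r^K(r-1)}.
        z = np.clip(np.round(a / scale), -(r - 1), r - 1) * scale
        y = a - z                       # Eq. (6): remainder components
        return K, z, y

    # Reproduces Table 1, e.g.:
    # quantize_components([345.6], 10)  -> (2, [300.], [45.6])
    # quantize_components([-984.1], 10) -> (3, [-1000.], [15.9])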
4. Main Result. Let S = {1, . . . , d} and S_z = { s ∈ S : |z_s| ≥ |z_m|, ∀m ∈ S }.

Theorem 4.1. If |S_z| = 1, i.e., ∃n ∈ S such that |z_n| > |z_m| ∀m ∈ S, m ≠ n, then n is the solution to Problem 1, i.e., z_n is the optimal component to transmit. (Here |S_z| denotes the cardinality of S_z.)
Proof. Define the cost function

    J(a^Q) = ‖ x^Q − x ‖²
           = ‖ (m + Σ_{i=1}^d a_i^Q e_i) − (m + Σ_{i=1}^d a_i e_i) ‖²
           = ‖ Σ_{i=1}^d e_i (a_i^Q − a_i) ‖²
           = Σ_{i=1}^d [ (a_i^Q)² − 2 a_i^Q a_i + a_i² ]        (the e_i are orthonormal)
           = Σ_{i=1}^d [ z_i² − 2 z_i (z_i + y_i) + a_i² ]
           = − Σ_{i=1}^d ( z_i² + 2 z_i y_i ) + Σ_{i=1}^d a_i².   (10)
Now, define a similar cost function

    J_k(a^Q) = ‖ m + Σ_{i=1}^d δ_{ik} a_i^Q e_i − x ‖².

Hence,

    J_k(a^Q) = Σ_{i=1}^d [ δ_{ik} ( (a_i^Q)² − 2 a_i^Q a_i ) + a_i² ]
             = (a_k^Q)² − 2 a_k^Q a_k + Σ_{i=1}^d a_i²
             = − ( z_k² + 2 z_k y_k ) + Σ_{i=1}^d a_i².   (11)
Taking n, m ∈ S, we may extend Eq. (11) to write

    J_n − J_m = − (z_n² + 2 z_n y_n) + (z_m² + 2 z_m y_m)   (12)
              = − z_n² − 2 |z_n||y_n| sgn(z_n) sgn(y_n) + z_m² + 2 |z_m||y_m| sgn(z_m) sgn(y_m),
where sgn(z_i) indicates the sign (+ or −) of z_i. Since sgn(z_i) sgn(y_i) ∈ {−1, +1}, we may write

    J_n − J_m ≤ − z_n² + 2 |z_n||y_n| + z_m² + 2 |z_m||y_m|.   (13)

Now, assume that |z_n| > |z_m|, which gives us

    |z_n| = |z_m| + α r^K,   (14)

where α ∈ Z⁺. Hence, we wish to show that J_n − J_m < 0. Substituting Eq. (14) into Eq. (13),

    J_n − J_m ≤ − (|z_m| + α r^K)² + 2 |y_n| (|z_m| + α r^K) + z_m² + 2 |z_m||y_m|.

Using Eq. (7), we conclude that

    J_n − J_m < − (|z_m| + α r^K)² + (|z_m| + α r^K) r^K + z_m² + |z_m| r^K
              = 2 |z_m| r^K − 2 |z_m| α r^K − r^{2K} (α² − α)
              = (1 − α) (2 |z_m| r^K + α r^{2K}).

And finally, recalling that α ≥ 1,

    J_n − J_m < 0,   (15)

and the theorem follows. ∎
Now, define S_= = { s ∈ S_z : sgn(z_s) = sgn(y_s) } and S_≠ = S_z \ S_=. Moreover, define S_y^+ = { s ∈ S_= : |y_s| ≥ |y_l|, ∀l ∈ S_= }. In other words, S_y^+ refers to the principal component(s) with the largest quantized and remainder components which also have equal signs.
Theorem 4.2. If |S_z| > 1 and |S_y^+| > 0, i.e., ∃n ∈ S_y^+ such that |y_n| ≥ |y_m|, |z_n| = |z_m| ∀m ∈ S_z and sgn(z_n) = sgn(y_n), then n is the solution to Problem 1, i.e., z_n is the optimal component to transmit.

Proof. By assumption, |S_z| > 1, and it is a direct consequence of the proof of Theorem 4.1 that we should choose between the elements of S_z for a quantized component to transmit. Let n ∈ S_y^+ and m ∈ S_z. Recalling Eq. (12),

    J_n − J_m = − z_n² − 2 z_n y_n + z_m² + 2 z_m y_m
              = − 2 z_n y_n + 2 z_m y_m
              = − 2 |z_n| ( |y_n| − |y_m| sgn(z_m) sgn(y_m) ).

It is true that either m ∈ S_≠ or m ∈ S_=. When m ∈ S_≠, sgn(z_m) sgn(y_m) = −1 and therefore

    J_n − J_m = − 2 |z_n| (|y_n| + |y_m|) < 0.   (16)

On the other hand, when m ∈ S_=, sgn(z_m) sgn(y_m) = +1 and

    J_n − J_m = − 2 |z_n| (|y_n| − |y_m|).   (17)

Note again that n ∈ S_y^+ (i.e., |y_n| ≥ |y_m|). Hence, Eq. (17) becomes

    J_n − J_m ≤ 0.   (18)

Furthermore, J_n − J_m = 0 only when |y_n| = |y_m| and |z_n| = |z_m|. In other words, |a_n| = |a_m|, and clearly the two costs are equal. ∎
Finally, define S_y^− = { s ∈ S_≠ : |y_s| ≤ |y_l|, ∀l ∈ S_≠ }.

Theorem 4.3. If |S_z| > 1 and |S_y^+| = 0, i.e., sgn(z_s) ≠ sgn(y_s) ∀s ∈ S_z, and ∃n ∈ S_y^− such that |z_n| = |z_m| and |y_n| ≤ |y_m| ∀m ∈ S_z, then n is the solution to Problem 1, i.e., z_n is the optimal component to transmit.
Proof. As was the case in Theorem 4.2, |S_z| > 1, but S_= and S_y^+ are empty (indicating that the signs of the quantized and remainder components differ for all those in S_z). We prove then that the optimal quantized component to transmit is the one with the largest |z_i| and the smallest |y_i|. Let n ∈ S_y^− and m ∈ S_z. Recalling again Eq. (12),

    J_n − J_m = − z_n² − 2 z_n y_n + z_m² + 2 z_m y_m
              = − 2 z_n y_n + 2 z_m y_m
              = 2 |z_n| (|y_n| − |y_m|)
              ≤ 0,

and the theorem follows. ∎
Theorem 4.1 tells us that we should transmit the quantized component largest in magnitude. If this is not unique, Theorem 4.2 tells us to send the z_i which also has the largest remainder component that "points" in the same direction as its quantized component (i.e., sgn(z_i) = sgn(y_i)). According to Theorem 4.3, if no such remainder component exists (i.e., all y_i "point" opposite of their quantized counterparts), then we send the z_i with the smallest remainder component.

When a unique largest (in magnitude) quantized component exists, it is a direct result of Eqs. (6) and (7) that it corresponds to the largest (in magnitude) principal component. In other words, Theorem 4.1 tells us to send the quantized component of the largest principal component. As a consequence of Eq. (6), when the remainder component has the same sign as the quantized component, the corresponding principal component is larger (again, in magnitude) than the quantized component. Moreover, given |z_i|, and given that z_i and y_i have matching signs, |a_i| is maximized by the largest possible value of |y_i|. Theorem 4.2 thus says: among the z_i largest in magnitude, transmit the one whose remainder component matches the sign of its quantized component and has the largest |y_i|. In other words, Theorem 4.2 tells us to transmit the z_i corresponding to the largest |a_i|.

Similarly, given |z_i| and mismatching signs of z_i and y_i, |a_i| is maximized by the smallest possible value of |y_i|. Theorem 4.3 thus says: among the z_i largest in magnitude, transmit the one whose remainder component mismatches the sign of its quantized component and has the smallest |y_i|. Again, this is the z_i corresponding to the largest |a_i|.

To summarize: Theorems 4.1, 4.2, and 4.3 tell us, given a set of principal, quantized, and remainder components, which quantized component should be transmitted. This optimal quantized component is always the one corresponding to the principal component largest in magnitude.
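Implemented directly, the three theorems amount to the following short selection routine (our sketch and naming; per the discussion above, it returns arg max_i |a_i| = arg max_i |z_i + y_i|):

    import numpy as np

    def optimal_index(z, y):
        """Select the component to transmit per Theorems 4.1-4.3."""
        z, y = np.asarray(z), np.asarray(y)
        # Theorem 4.1: restrict to the components with the largest |z_i|.
        S_z = np.flatnonzero(np.abs(z) == np.max(np.abs(z)))
        # Theorem 4.2: among ties, prefer sgn(z) = sgn(y), largest |y_i|.
        same = S_z[np.sign(z[S_z]) == np.sign(y[S_z])]
        if same.size > 0:
            return same[np.argmax(np.abs(y[same]))]
        # Theorem 4.3: otherwise take the smallest |y_i|.
        return S_z[np.argmin(np.abs(y[S_z]))]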
5. Iterative Algorithm. Naturally, in a practical application, we will want to send a succession of quantized components rather than just one, and so an iterative algorithm is necessary. Fortunately, the nature of our approach lends itself easily to this, and the algorithm is presented below:
Given a sample set of training data:
    Compute the sample mean and principal directions.
Given a new sample x and the system resolution r:
    Compute x's principal components a_i, i = 1, . . . , d.
    Set b_i = a_i.
    Compute K via Eq. (3).
    Compute the quantized components z_i via Eq. (4).
    Compute the remainder components y_i = b_i − z_i.
    For the number of desired transmissions:
        Determine the optimal quantized component z_opt via Theorems 4.1-4.3.
        Transmit z_opt.
        Set b_opt = y_opt.
        Recompute z_opt from b_opt via Eq. (4).
        Recompute K via Eq. (3).
    End
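A compact sender-side sketch of this loop, reusing quantize_components and optimal_index from the earlier sketches (since K in Eq. (3) depends on the maximum over all components, the sketch simply recomputes every z_i and y_i after each transmission, which is equivalent to the per-component recomputation listed above):

    import numpy as np

    def qpca_transmit(x, m, E, r, n_iters):
        """Yield (index, z_opt) pairs for n_iters transmissions.

        x : new sample; m, E : mean and principal directions (columns
        of E) agreed upon during the start-up period; r : resolution.
        """
        b = E.T @ (x - m)                        # principal components a_i
        K, z, y = quantize_components(b, r)      # Eqs. (3), (4), (6)
        for _ in range(n_iters):
            k = optimal_index(z, y)              # Theorems 4.1-4.3
            yield k, z[k]                        # transmit z_opt
            b[k] = y[k]                          # b_opt <- y_opt
            K, z, y = quantize_components(b, r)  # recompute K, z, y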
The simplicity of this algorithm is appealing. Moreover, at each iteration, there is little
to compute, as little changes from one loop to the next. Indeed the real burden of this
approach comes at the first step in the computation of the principal directions. In the
next section, we will discuss some of the complexity issues associated with this problem
as well as some methods for reducing it through a particular image compression example.
6. Image Compression Examples. We apply the proposed quantization scheme to
a problem in which images are to be compressed and transmitted over low-bandwidth
channels. Two separate data sets are used. The first set comprises 12 grayscale images, 352×352 pixels each, of very similar scenes (Figure 4). The second comprises 118 grayscale images, 64×96 pixels each, of very different scenes (Figure 5).
The images themselves are represented in two different ways. In the first method, as is common practice in image processing [4, 5], the images are broken into distinct 8×8 blocks. Principal components and directions are computed over these 64-pixel pieces, and
the quantization algorithm is applied to each block. At each additional iteration, one
more quantized component is transmitted per block of the image. In the second method,
principal directions and components are computed over the entire image, as a whole.
Under this method, only a single quantized component is communicated per iteration.
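For the first method, the 8×8 tiling might be done as in the following numpy sketch (to_blocks is our name; it assumes the image dimensions are multiples of 8). PCA is then run on the resulting 64-dimensional samples:

    import numpy as np

    def to_blocks(img, b=8):
        """Tile an H x W grayscale image into flattened b x b blocks,
        one row (64 pixels for b = 8) per block; b must divide H and W."""
        H, W = img.shape
        return (img.reshape(H // b, b, W // b, b)
                   .swapaxes(1, 2)
                   .reshape(-1, b * b))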
There are computational advantages and disadvantages associated with either method. In the first case, a greater number of transmissions is required for the same drop in error, but computing the principal directions is easy, and the memory required to hold them is small. In the second case, the mean-squared error drops off very quickly, but computing and maintaining the principal directions can be practically intractable.
In other words, the computational burden associated with computing larger and larger
principal directions can be justified by the lower number of zi’s needed to reconstruct the
image.
Figure 1 shows successive iterations of the algorithm on selected images from our two
data sets. Each pairing in the figure ( (a) with (b), (c) with (d), etc. ) shows two
progressions of an image, where the first progression is based on block representations
of the image, and the second is based on full-image representations. The value of r was
set to 16, meaning that at each iteration, 4 additional bits of information per block were
transmitted.
Figure 1. Progression of quantized images, as {Version (Image Set, Image Number): iterations shown}: (a) Block (1,4): 1, 3, 9, 27, 40, 100; (b) Whole (1,4): 1, 3, 8, 14, 20, 25; (c) Block (2,23): 1, 5, 10, 15, 30, 50, 80; (d) Whole (2,23): 1, 10, 24, 50, 90, 140, 200; (e) Block (2,80): 1, 5, 10, 15, 30, 50, 80; (f) Whole (2,80): 1, 10, 24, 50, 90, 140, 200; (g) Block (2,120): 1, 5, 10, 15, 30, 50, 80; (h) Whole (2,120): 1, 10, 24, 50, 90, 140, 200.
Note that the PCA algorithm and our Quantized PCA algorithm rely on the sender and receiver sharing knowledge of the data mean and the principal directions. Though these values may indeed not be stationary, it is possible to update them in an online fashion. This can be accomplished, for example, by using the Generalized Hebbian Algorithm [1, 6]. Hence, even in a changing environment, as for
mobile robots, a realistic model of the surroundings can be maintained. Note moreover
that whether the image sets are broken into blocks or not, the mean-squared error of any
image drops off at approximately a log-linear rate with respect to the number of quantized
components transmitted. In the following section, we formalize this observation and show
how to predict the linear slope.
7. Error Regression. With a small amount of a priori information, we show how it is
possible to predict the rate at which the log of the mean-squared error falls off. This is
useful because, given this rate, we may determine ahead of time how many transmissions
are needed. At each iteration of our algorithm, we transmit one quantized component.
As a result, the error between the true principal components a ∈ R^d and the receiver's reconstruction â is reduced along one of its d dimensions.
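On the receiver side, â_{i,j} is simply the running sum of the quanta received for index i, since each transmitted z further quantizes the previous remainder. A sketch (our names, pairing with the qpca_transmit sketch of Section 5):

    import numpy as np

    def qpca_receive(stream, m, E):
        """Fold received (index, z) pairs into a_hat and yield the
        reconstruction x_hat = m + E a_hat after each iteration."""
        a_hat = np.zeros(E.shape[1])
        for k, z_k in stream:
            a_hat[k] += z_k          # successive quanta of component k
            yield m + E @ a_hat      # x_hat after the jth iteration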
Per iteration, we can compute the error of our quantization and reconstruction as

    e_j = x − x̂_j = Σ_{i=1}^d (a_i − â_{i,j}) e_i,   (19)

where x = Σ_{i=1}^d a_i e_i + m, x̂_j = Σ_{i=1}^d â_{i,j} e_i + m, and â_{i,j} is the receiver's ith reconstructed principal component after the jth iteration. The mean-squared error at the jth iteration then is

    mse_j = (1/d) ‖ Σ_{i=1}^d (a_i − â_{i,j}) e_i ‖²   (20)
          = (1/d) Σ_{i=1}^d (a_i − â_{i,j})².   (21)
By our previous definitions, we note that a_i − â_{i,j} = y_{i,j}, i.e., the ith remainder component after the jth iteration, and

    mse_j = (1/d) Σ_{i=1}^d y_{i,j}².   (22)
More useful to us is the logarithm of the mean-squared error, which we will denote lmse_j. Taking the first difference of lmse_j gives

    Δlmse_j = lmse_j − lmse_{j−1}   (23)
            = ( log_r(1/d) + log_r Σ_{i=1}^d y_{i,j}² ) − ( log_r(1/d) + log_r Σ_{i=1}^d y_{i,j−1}² )
            = log_r( Σ_{i=1}^d y_{i,j}² / Σ_{i=1}^d y_{i,j−1}² ).   (24)
Now, define the total set of indexes L = {1, . . . , d}, and let L_0 be given by L_0 = { i ∈ L : a_i = 0 }. Clearly, these components never have any effect on the error, as y_{i,j} = 0 for all iterations j and ∀i ∈ L_0.
It is intuitively clear, and a direct result of Theorem 4.1, that the value of K never increases over iterations. (For example, see Figure 2.)

Figure 2. Example of the value of K over iterations (Image Set 2, Image 80, r = 16).

In fact, K is constant over blocks
of iterations. After an initial start-up period (e.g. after 41 iterations in Figure 2), the
length of these blocks will be the same on average. For image sets that are very similar,
these block lengths can be very short. For dissimilar image sets, they will be longer,
bounded above by d.
Over these blocks of iterations, yi,j will change value at most once. It is thus possible
to compute the total change in error over an entire block, rather than the change at each
iteration. The question then is, how can we characterize the lengths of these blocks, and
how much should we expect the lmse to change? If we let K_j denote the value of K at the jth iteration, and define M_κ as the set of iterations for which K_j equals some constant κ, we can define

    L_1^κ = { i ∈ L\L_0 : y_{i,j} = y_{i,k}, ∀j, k ∈ M_κ }   (25)

and

    L_2^κ = { i ∈ L\L_0 : y_{i,j} ≠ y_{i,k}, ∀j, k ∈ M_κ, j ≠ k }.   (26)
Then L_0 ∪ L_1^κ ∪ L_2^κ = L, and the three sets are pairwise disjoint. Naturally, L_0 does not need to be indexed by κ, as it is invariant over iterations.

We now make the assumption that, ∀j and ∀i ∉ L_0, the z_{i,j}'s are independently and identically distributed (with respect to both i and j), and that |z_{i,j}|/r^{K_j} is uniformly distributed over {0, . . . , r − 1}. With these definitions, we use Eq. (23) to construct an average change in log-mean-squared error. We compare the difference in MSE between n, the last iteration of M_κ, and m, the last iteration of M_{κ+1}. This is the average Δlmse over the M_κ interval:
    Δlmse_ave = log_r( ( Σ_{i∈L_0} y_{i,n}² + Σ_{i∈L_1^κ} y_{i,n}² + Σ_{i∈L_2^κ} y_{i,n}² ) / ( Σ_{i∈L_0} y_{i,m}² + Σ_{i∈L_1^{κ+1}} y_{i,m}² + Σ_{i∈L_2^{κ+1}} y_{i,m}² ) ).   (27)

Recalling that y_i = 0 ∀i ∈ L_0, this reduces to

    Δlmse_ave = log_r( ( Σ_{i∈L_1^κ} y_{i,n}² + Σ_{i∈L_2^κ} y_{i,n}² ) / ( Σ_{i∈L_1^{κ+1}} y_{i,m}² + Σ_{i∈L_2^{κ+1}} y_{i,m}² ) ).   (28)
We moreover have that |y_{i,j}| = γ_{i,j} r^{K_j}, where γ_{i,j} ∈ [0, 1/2). By definition, y_{i,j} is constant over M_κ, and hence y_{i,m} = y_{i,n} ∀i ∈ L_1^{κ+1}. In other words,

    y_{i,m} = { γ_{i,n} r^κ,      if i ∈ L_1^{κ+1}
              { γ_{i,m} r^{κ+1},  if i ∈ L_2^{κ+1}.

Similarly,

    y_{i,n} = { γ_{i,p} r^{κ−1},  if i ∈ L_1^κ
              { γ_{i,n} r^κ,      if i ∈ L_2^κ,

where p is the last iteration of M_{κ−1}. Hence,

    Δlmse_ave = log_r( ( Σ_{i∈L_1^κ} γ_{i,p}² r^{2(κ−1)} + Σ_{i∈L_2^κ} γ_{i,n}² r^{2κ} ) / ( Σ_{i∈L_1^{κ+1}} γ_{i,n}² r^{2κ} + Σ_{i∈L_2^{κ+1}} γ_{i,m}² r^{2(κ+1)} ) ).   (29)

The γ_{i,j}'s cancel out, which gives

    Δlmse_ave = log_r( ( Σ_{i∈L_1^κ} r^{2κ} r^{−2} + Σ_{i∈L_2^κ} r^{2κ} ) / ( Σ_{i∈L_1^{κ+1}} r^{2κ} + Σ_{i∈L_2^{κ+1}} r^{2κ} r² ) ).   (30)
To continue, we first draw some conclusions about the relative sizes of L_1^κ and L_2^κ. Obviously, there are r elements in the set {0, . . . , r − 1}. For all j ∈ M_κ and for all i ∉ L_0, the probability that z_{i,j} = 0 is 1/r. The existence of z_{i,j}'s that are zero over the interval M_κ implies that the corresponding remainder components at the end of the last interval block (i.e., y_{i,j} where j ∈ M_{κ+1}) are equal to the remainder components at the end of the current interval block (i.e., y_{i,j} where j ∈ M_κ). In other words, these remainder components do not change for a certain K_j, hence belong to L_1^κ, and Δlmse over this interval is not increased or decreased by QPCA as a result of these quantized components.

Another set of indexes also belongs to L_1^κ. These correspond to the quantized components which under the interval M_κ would have a z_i value of 1·r^κ, but actually will not be transmitted until the subsequent M_{κ−1} interval block. For example, consider the following simple three-dimensional system:
Suppose r = 10 and b = [2543, 1251, 931]. At this iteration, we calculate that K = 3. Thus, by Eq. (4), z = [3000, 1000, 1000]. We transmit, therefore, z_1 = 3000 (which in turn makes y_1 = −457). Now, according to Theorem 4.2, we transmit z_2 = 1000 (⇒ y_2 = 251). But we do not transmit z_3 = 1000 yet. Instead, K is re-evaluated to K = 2, and we transition from M_3 to M_2. Now, z = [−500, 300, 900] and we transmit z_3 = 900.
From this example, we see that some of the z_{i,j}'s in an interval M_κ that are equal to r^κ should not be transmitted until we enter M_{κ−1}. Those z_i's which are transmitted are those which in the previous interval M_{κ+1} had a corresponding remainder component |y_{i,j}| ≥ r^κ. Given the assumption that y_{i,j} and z_{i,j} are uniformly distributed, this constitutes half of the z_{i,j}'s in M_κ with |z_{i,j}| = 1·r^κ. Those that are not transmitted in the M_κ interval had a remainder component |y_{i,j}| < r^κ. Consequently, half of the expected fraction 1/r of z_{i,j}'s in M_κ that are equal to r^κ are not transmitted in the M_κ interval.
In other words, the set of indexes L_1^κ is a fraction of the total number of available indexes from L\L_0, namely 1/r + 0.5/r = 1.5/r, and L_2 makes up the remaining fraction, 1 − 1.5/r. By definition, |L_1^κ| + |L_2^κ| is invariant over κ, and so we define the length ratio of L_i as

    ρ_i = |L_i| / ( |L_1| + |L_2| ).   (31)

Specifically,

    ρ_1 = 1.5/r,   (32)
    ρ_2 = 1 − 1.5/r.   (33)
We plug this back into Eq. (30):

    Δlmse_ave = log_r( (ρ_1 r^{−2} + ρ_2) / (ρ_1 + ρ_2 r²) ) = log_r( r^{−2} ) = −2.   (34)
In other words, Δlmse_ave drops by 2 over each block of iterations M_κ, whose length is |M_κ| = |L_2^κ|.
For our examples, we used images with d = 3136 and d = 6144 for image sets 1 and 2, respectively. The numbers of sample images used to compute the mean and principal directions were 12 and 118, respectively. Consequently, the spaces spanned by our two image sets were 11- and 117-dimensional, which is equal to |L_1| + |L_2|. For example, with r = 10, the average length of L_2 for the first image set was

    11 ρ_2 = 11 (1 − 1.5/r) = 9.35,

and for image set 2 it was

    117 ρ_2 = 117 (1 − 1.5/r) = 99.45.
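As a quick arithmetic check of Eqs. (33)-(34) (a sketch; average_block_length is our name, not the paper's):

    def average_block_length(span_dim, r):
        """Average |L_2| from Eq. (33): the expected number of iterations
        over which log_r(MSE) drops by 2, per Eq. (34)."""
        return span_dim * (1 - 1.5 / r)

    print(average_block_length(11, 10))   # 9.35  (image set 1)
    print(average_block_length(117, 10))  # 99.45 (image set 2)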
We computed the slope of the error from our experimental results for image sets 1 and 2, with r equal to 10, 24, and 128, as shown in Figure 3. Note that the experimental results coincide with the computed results to a very high degree; in fact, the predicted slope of the error has a percentage error of less than 0.02.
Several important facts should be noted. First, the length of L_2^κ deviates from the average in the early "start-up" region of the algorithm, and hence so does the slope; in the start-up region the error drops off much faster, as the lengths of the M_κ blocks are much shorter there. Second, in general, image samples will not span such a small subspace of the original d-dimensional space; consequently, the average length of L_2 grows, and the error drops off at a lower rate. Third, as can be seen in Figure 3, the lmse does not drop off at a constant rate, but has "humps". These "humps" recur with a period equal to |L_2^κ|.
8. Conclusions. We have proposed a method for transmitting quantized versions of the
principal components associated with a given data set in an optimal manner. This method
was shown to be applicable to the problem of image compression and transmission over
low-bandwidth communication channels. We also showed how the progression of the error
per iteration of our algorithm can be accurately predicted, based on the parameters of the algorithm.

Figure 3. Measured and predicted error (log_r(MSE) versus iteration, with the predicted slope overlaid) for QPCA on (a) Image Set 1 (IS1), r = 10; (b) IS1, r = 24; (c) IS1, r = 128; (d) IS2, r = 10; (e) IS2, r = 24; (f) IS2, r = 128.
REFERENCES

[1] L. Chen and S. Chang, An adaptive learning algorithm for principal component analysis, IEEE Transactions on Neural Networks, vol. 6, no. 5, pp. 1255-1263, 1995.
[2] X. Du, B. K. Ghosh and P. Ulinski, Decoding the position of a visual stimulus from the cortical waves of turtles, Proceedings of the 2003 American Control Conference, vol. 1, pp. 477-482, 2003.
[3] R. Duda, P. Hart and D. Stork, Pattern Classification, John Wiley and Sons, New York, 2001.
[4] M. Kunt, Block coding of graphics: A tutorial review, Proceedings of the IEEE, vol. 68, no. 7, pp. 770-786, 1980.
[5] M. Marcellin, M. Gormish, A. Bilgin and M. Boliek, An overview of JPEG-2000, Proceedings of the IEEE Data Compression Conference, pp. 523-541, 2000.
[6] T. D. Sanger, Optimal unsupervised learning in a single-layer linear feed-forward neural network, Neural Networks, vol. 2, no. 6, pp. 459-473, 1989.
[7] C. Undey and A. Cinar, Statistical monitoring of multistage, multiphase batch processes, IEEE Control Systems, vol. 22, no. 5, pp. 40-52, 2002.
Figure 4. Image Set 1.
Figure 5. Image Set 2.