
Image reconstruction based on back propagation learning in Compressed Sensing theory

Gaoang Wang
Project for ECE 539
Fall 2013


Abstract

Over the past few years, a new framework known as compressive sampling has been developed for simultaneous sampling and compression. It can significantly reduce the number of measurements required for a given signal compared with traditional compression methods. One way to increase the compression ratio during sampling is to apply different compression ratios to different parts of the image: the background of an image can be sampled at a higher compression ratio, which raises the total compression ratio. However, we know nothing about the original image before sampling starts, so we must use the sampled data itself to judge which parts of the image belong to the background, and then apply a second round of sampling to those parts. Before sampling an image, a large set of training images is used to compute the weights of the classifier. During the second sampling, these weights decide which parts belong to the background. In the first sampling, I apply the block-based Compressed Sensing algorithm and use the sampled data as feature vectors. In the reconstruction step, I use the OMP method with a DCT matrix to reconstruct the image.


Content

Introduction
Compressed Sensing
  2.1 Background
  2.2 Algorithm of SPL-BCS
MLP and Back-Propagation
  3.1 Introduction of Back-Propagation
  3.2 Training method in image reconstruction
    3.2.1 Feature vectors in training
    3.2.2 BP learning program
    3.2.3 Sampling testing data
    3.2.4 Image reconstruction algorithm
Results
  4.1 The confusion rate of BP learning


Introduction

Over the past few years, a new framework known as compressive sampling has been developed for simultaneous sampling and compression. It can significantly reduce the number of measurements required for a given signal compared with traditional compression methods. Compressed sensing (CS), built upon the groundbreaking work by Candès et al. [1] and Donoho [2], aims at exactly reconstructing the original signal while sampling at a sub-Nyquist rate. Unlike traditional theories, CS greatly reduces the signal sampling rate, signal processing time, data storage and transmission costs, leading signal processing into a new revolutionary era. Due to its great practical potential, CS has been intensively studied and applied both in academia and in industry in the past few years [3, 4]. The field of CS is related to other topics in signal processing and computational mathematics, such as underdetermined linear systems, group testing, heavy hitters, sparse coding, multiplexing, sparse sampling, and finite rate of innovation. Imaging techniques with a strong affinity to CS include coded aperture and computational photography.

There are many image reconstruction algorithms based on CS, such as block-based CS sampling (BCS) [5]. It is an efficient method that can solve the artifact problem along block edges. Since we know little about the original image before sampling, few algorithms can take the characteristics of the image into consideration during reconstruction. As we all know, most parts of natural images are smooth, so we could use a high compression ratio during sampling. However, some parts of images have complicated texture, and these parts can hardly be reconstructed well at a high compression ratio. Thus we have to reduce the total compression ratio even if most parts of the image are very smooth. Fortunately, we can apply a learning method to training data: after the first sampling we can tell which parts of the image are smoother than the others, and then sample those parts a second time at a higher compression ratio. The total compression ratio therefore increases.


Compressed Sensing

2.1 Background

Consider a real-valued, N-length, one-dimensional, discrete-time signal x, which can be viewed as an N × 1 column vector in R^N with elements x[n], n = 1, 2, . . . , N. Suppose that we are allowed to take M (M << N) linear, non-adaptive measurements of x through the following linear transformation [1, 2]:

y = Φx,  (1.1)

where y represents an M × 1 sampled vector and Φ is an M × N measurement matrix.

Since M << N, the reconstruction of x from y is generally ill-posed. However, CS theory is based on the fact that x has a sparse representation in a known transform domain Ψ. In other words, the transform-domain signal f = Ψx can be well approximated using only d < M << N non-zero entries. It was proved in [1, 2] that when Φ and Ψ are incoherent, x can be well recovered from the M measurements. In the study of CS, the most important issues include: (a) the design of the measurement matrix Φ; (b) the selection of the transform Ψ; (c) the reconstruction algorithm.
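As a concrete illustration of (1.1), the following minimal Python sketch (my own addition, not part of the original project code) draws a Gaussian random measurement matrix and samples a sparse signal; the dimensions N = 64 and M = 16 match the 8 × 8 blocks and 4:1 ratio used later in this report.

```python
import numpy as np

rng = np.random.default_rng(0)

N, M, d = 64, 16, 4                              # signal length, measurements, sparsity
Phi = rng.standard_normal((M, N)) / np.sqrt(M)   # Gaussian measurement matrix

# Build a d-sparse test signal x (sparse directly in the identity basis here).
x = np.zeros(N)
support = rng.choice(N, size=d, replace=False)
x[support] = rng.standard_normal(d)

y = Phi @ x                                      # the M x 1 sampled vector of (1.1)
print(y.shape)                                   # (16,)
```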

A random matrix is usually selected as the measurement matrix, since incoherence can then be achieved with high probability. As for the transform, there are DCT, wavelet, grouplet, bandelet and curvelet [6], the dual-tree discrete wavelet transform (DDWT) [7], the contourlet [8], and so on. Among reconstruction methods, orthogonal matching pursuit (OMP) and basis pursuit (BP) are the classical ones. For 2D images, another well-known reconstruction algorithm is the minimization of total variation (TV) [9]. Other algorithms include iterative soft-thresholding and projection onto convex sets [10].
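Since this report reconstructs with OMP and a DCT matrix, here is a compact, self-contained OMP sketch (again my own illustration, not the project's code). It recovers the transform coefficients f from y = Φx with x = Ψ^T f, i.e. it runs OMP on the combined matrix Φ Ψ^T.

```python
import numpy as np
from scipy.fft import dct

def omp(A, y, k):
    """Orthogonal matching pursuit: find a k-sparse f with A @ f ~= y."""
    residual, support = y.astype(float).copy(), []
    for _ in range(k):
        j = int(np.argmax(np.abs(A.T @ residual)))   # most correlated column
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef          # re-fit on current support
    f = np.zeros(A.shape[1])
    f[support] = coef
    return f

rng = np.random.default_rng(1)
N, M, d = 64, 16, 4
Psi = dct(np.eye(N), axis=0, norm='ortho')    # orthonormal DCT matrix (f = Psi @ x)
Phi = rng.standard_normal((M, N)) / np.sqrt(M)

f_true = np.zeros(N)
f_true[rng.choice(N, d, replace=False)] = rng.standard_normal(d)
x = Psi.T @ f_true                            # x is sparse in the DCT domain
y = Phi @ x

f_hat = omp(Phi @ Psi.T, y, k=d)              # recover the DCT coefficients
x_hat = Psi.T @ f_hat
print(np.linalg.norm(x - x_hat))              # typically near zero
```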

2.2 Algorithm of SPL-BCS

In BCS, an image is divided into B × B blocks and sampled using an appropriately sized measurement matrix. That is, suppose that xj is a vector representing, in raster-scan fashion, block j of the input image x. The corresponding yj is then yj = ΦB xj, where ΦB is an MB × B² orthonormal measurement matrix with MB = (M/N)B².

Using BCS rather than random sampling applied to the entire image x has several merits [11]. First, the measurement operator ΦB is conveniently stored and employed because of its compact size. Second, the encoder does not need to wait until the entire image is measured, but may send each block after its linear projection. Last, an initial approximation x(0) with minimum mean squared error can be feasibly calculated due to the small size of ΦB [11].
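A minimal sketch of this block-based sampling step (my own Python illustration with a plain Gaussian ΦB; in [11] the matrix is orthonormalized, which could be added with a QR step): each 8 × 8 block is raster-scanned into a 64-vector and multiplied by the same ΦB.

```python
import numpy as np

def bcs_sample(image, Phi_B, B=8):
    """Block-based CS sampling: y_j = Phi_B @ x_j for each B x B block j."""
    H, W = image.shape
    measurements = []
    for r in range(0, H, B):
        for c in range(0, W, B):
            x_j = image[r:r+B, c:c+B].reshape(-1)   # raster-scan the block
            measurements.append(Phi_B @ x_j)
    return np.array(measurements)                    # one row per block

rng = np.random.default_rng(2)
B, M_B = 8, 16                                       # 4:1 compression ratio
Phi_B = rng.standard_normal((M_B, B * B)) / np.sqrt(M_B)
image = rng.random((256, 256))
Y = bcs_sample(image, Phi_B)
print(Y.shape)                                       # (1024, 16): 1024 blocks
```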

In [11], Wiener filtering was incorporated into the basic PL framework in order to remove blocking artifacts. In essence, this operation imposes smoothness in addition to the sparsity inherent to PL. Specifically, in [11], a Wiener-filtering step was interleaved with the PL projection, so the approximation to the image at iteration i + 1, x(i+1), is produced from x(i) as in the pseudocode below. Here, Wiener(∙) is pixelwise adaptive Wiener filtering using a 3 × 3 neighborhood, while Threshold(∙, λ) is a thresholding process in the transform domain Ψ. In our use of SPL, we initialize with x(0) = Φ^T y and terminate when |D(i+1) − D(i)| < 10⁻⁴, where

  D(i) = (1/√N) ||xp(i) − xp(i−1)||2,

with xp(i) denoting the block-projection output of iteration i:

function x(i+1) = SPL(x(i), y, ΦB, λ, Ψ)
  xw = Wiener(x(i))
  for each block j:
    xp_j = xw_j + ΦB^T (y_j − ΦB xw_j)
  f = Ψ xp
  f̄ = Threshold(f, λ)
  xt = Ψ⁻¹ f̄
  for each block j:
    x(i+1)_j = xt_j + ΦB^T (y_j − ΦB xt_j)
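To make the iteration concrete, here is a hedged Python sketch of one SPL pass (my own reading of the pseudocode above, not Fowler's reference code). It assumes scipy's pixelwise adaptive Wiener filter for the 3 × 3 smoothing step, a full-image 2-D DCT as the transform Ψ, and hard thresholding at level λ.

```python
import numpy as np
from scipy.signal import wiener
from scipy.fft import dctn, idctn

def block_origins(shape, B):
    """Top-left corners of the B x B blocks, in raster order."""
    return [(r, c) for r in range(0, shape[0], B) for c in range(0, shape[1], B)]

def spl_iteration(x, Y, Phi_B, lam, B=8):
    """One SPL pass: Wiener smoothing, Landweber projection, transform-domain
    thresholding, inverse transform, and a second projection."""
    def project(img):
        out = img.copy()
        for k, (r, c) in enumerate(block_origins(img.shape, B)):
            x_j = img[r:r+B, c:c+B].reshape(-1)
            x_j = x_j + Phi_B.T @ (Y[k] - Phi_B @ x_j)   # x_j + ΦB^T(y_j − ΦB x_j)
            out[r:r+B, c:c+B] = x_j.reshape(B, B)
        return out

    x = wiener(x, mysize=(3, 3))          # pixelwise adaptive 3x3 Wiener filter
    x = project(x)
    f = dctn(x, norm='ortho')             # forward 2-D DCT (the transform Ψ)
    f[np.abs(f) < lam] = 0.0              # hard thresholding with level lam
    x = idctn(f, norm='ortho')            # inverse transform
    return project(x)
```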


MLP and Back-Propagation

3.1 Introduction of Back-Propagation

Multilayer perceptrons have been applied successfully to solve some difficult and

diverse problems by training them in a supervised manner with a highly popular

algorithm known as the error back-propagation algorithm. This algorithm is based on

the error-correction learning rule. As such, it may be viewed as a generalization of an equally popular adaptive filtering algorithm: the ubiquitous least-mean-square (LMS) algorithm for the special case of a single linear neuron model.

Basically, the error back-propagation process consists of two passes through the

different layers of the network: a forward pass and a backward pass. In the forward

pass, an activity pattern (input vector) is applied to the sensory nodes of the network,

and its effect propagates through the network, layer by layer. Finally, a set of outputs

is produced as the actual response of the network. During the forward pass the

synaptic weights of the network are all fixed. During the backward pass, on the other

hand, the synaptic weights are all adjusted in accordance with the error-correction rule.

Specifically, the actual response of the network is subtracted from a desired (target)

response to produce an error signal. Then this error signal is propagated backwards

through the network, against the direction of synaptic connections-hence the name

“error back-propagation”. The synaptic weights are adjusted so as to make the actual

response of the network move closer to the desired response.
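The following minimal numpy sketch (my illustration; the project used its own BP program) shows the two passes for a 3-layer MLP with one hidden layer: a forward pass with fixed weights, then a backward pass that propagates the error signal and applies the error-correction update with learning rate η and momentum μ, the parameters used in Section 3.2.2.

```python
import numpy as np

rng = np.random.default_rng(3)
n_in, n_hid, n_out = 64, 16, 1
W1 = rng.standard_normal((n_hid, n_in)) * 0.1   # input -> hidden weights
W2 = rng.standard_normal((n_out, n_hid)) * 0.1  # hidden -> output weights
V1 = np.zeros_like(W1); V2 = np.zeros_like(W2)  # momentum terms
eta, mu = 0.01, 0.8                             # η and μ from Section 3.2.2

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def train_step(x, d):
    """One forward + backward pass for a single pattern x with target d."""
    global W1, W2, V1, V2
    # forward pass: weights stay fixed while the activity propagates
    h = sigmoid(W1 @ x)
    o = sigmoid(W2 @ h)
    # backward pass: the error signal flows against the synaptic connections
    e = d - o                                   # desired minus actual response
    delta_o = e * o * (1 - o)
    delta_h = (W2.T @ delta_o) * h * (1 - h)
    V2 = mu * V2 + eta * np.outer(delta_o, h)   # momentum + learning-rate update
    V1 = mu * V1 + eta * np.outer(delta_h, x)
    W2 += V2; W1 += V1
    return float(e @ e)                         # squared error for monitoring
```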

3.2 Training method in image reconstruction

I use a large number of sampled blocks from images as training data. The outputs determine whether these blocks can bear a higher sampling ratio. After training, I carry the final weights into the sampling stage. Given an original image, I take measurements twice. In the first measurement, since we know nothing about the characteristics of the image, we use a generally lower compression ratio for all the image blocks. Then the weights from training come into use: they decide whether parts of the image can take a higher compression ratio. If so, those parts undergo a second sampling. With this method, the total compression ratio increases.

3.2.1 Feature vectors in training

In real-time processing, we know nothing about the original image before sampling; we can only act after the first sampling step. This means we can only take the sampled data as training data for the BP learning method. We know that the sampled data is ΦB∙v, where ΦB is a Gaussian random matrix and v is a vectorized block of the original image. Since ΦB is a random matrix, if we took ΦB∙v directly as the feature vectors, BP learning could hardly work. Therefore, I take two measures as pre-processing:

(1) Fix the measurement matrix. In the whole process, I use identical random matrices in each sampling step; when sampling a new image, I do not generate a new random matrix.

(2) To reduce the randomness of the measurement matrix, I multiply by the pseudo-inverse of the measurement matrix, i.e. I take ΦB⁺∙ΦB∙v as the feature vectors.

In this project, the training data comes from 10 images (256 × 256). To save computing time, all 10 images are gray-level. The block size is 8 × 8, so each image yields 256²/8² = 1024 training samples, and there are 10240 training samples in total. ΦB has size 16 × 64, i.e. a compression ratio of 4. The training images are shown below.


Each image is 256 × 256.
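A small sketch of this pre-processing (my illustration; the helper names are my own): the fixed ΦB is reused for every block, and the feature vector is pinv(ΦB) @ ΦB @ v, which projects v onto the row space of ΦB.

```python
import numpy as np

rng = np.random.default_rng(4)
B = 8
Phi_B = rng.standard_normal((16, B * B)) / 4.0   # fixed once, reused everywhere
Phi_pinv = np.linalg.pinv(Phi_B)                 # 64 x 16 pseudo-inverse

def feature_vector(block):
    """64-dim feature: pinv(Phi_B) @ Phi_B @ v, with v the raster-scanned block."""
    v = block.reshape(-1)
    return Phi_pinv @ (Phi_B @ v)

def features_from_image(image):
    H, W = image.shape
    return np.array([feature_vector(image[r:r+B, c:c+B])
                     for r in range(0, H, B) for c in range(0, W, B)])

X = features_from_image(rng.random((256, 256)))
print(X.shape)                                   # (1024, 64) per 256 x 256 image
```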

3.2.2 BP learning program

As mentioned above, each feature vector is 64 × 1. To decrease the computing complexity, each time I randomly select 1024 of the 10240 vectors as training data, and I run the training program hundreds of times. I use 3-layer and 4-layer configurations separately. In addition, I set Epoch = 1000, μ = 0.8 and η = 0.01. For the first run, I use random values as the initial weights; for subsequent runs, I use the weights generated by the previous run as the initial weights. The confusion rate is plotted below.

Confusion-rate plots: 3-layer configuration and 4-layer configuration.


From the plots we can see that the variance of the rates for the 4-layer configuration is much higher than for the 3-layer configuration.
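If one wanted to reproduce this loop with an off-the-shelf tool, a rough equivalent (my assumption; the original program was custom and the hidden-layer size and run count below are placeholders) is scikit-learn's MLPClassifier with SGD, momentum 0.8, learning rate 0.01, 1000 epochs, and warm starts so each run begins from the previous run's weights.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(5)
X_all = rng.random((10240, 64))                 # stand-in for the feature vectors
y_all = rng.integers(0, 2, 10240)               # 1 = smooth block, 0 = not

# 3-layer MLP (input, one hidden layer, output); warm_start keeps the weights
# between calls, mimicking "use the weights from last time as initial weights".
net = MLPClassifier(hidden_layer_sizes=(16,), solver='sgd',
                    learning_rate_init=0.01, momentum=0.8,
                    max_iter=1000, warm_start=True)

for run in range(10):                           # "hundreds of times" in the report
    idx = rng.choice(len(X_all), size=1024, replace=False)
    net.fit(X_all[idx], y_all[idx])
    acc = net.score(X_all[idx], y_all[idx])
    print(f"run {run}: training classification rate = {acc:.3f}")
```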

3.2.3 Sampling testing data

After the BP learning algorithm, we have the weights of the MLP. In this step we use these weights on the testing data. I use 12 images (512 × 512) as testing examples. All 12 images are shown below.


Each image is 512 × 512. From upper left to lower right, we denote the images as 1 to 12.

In the first step, we divide each image into blocks of size 8 × 8 and sample each block with a compression ratio of 4. In this step, we do not need to consider the characteristics of the images. Once the sampled images are obtained, we apply the trained weights to them, and we then know which parts of the images need a second step of compressed sensing.

In the second step, all the qualifying blocks are resampled with a measurement matrix of size 4 × 64, i.e. a compression ratio of 16. Thus the total compression ratio increases.
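Putting the pieces together, the two-step sampling could look like the sketch below (my own composition using the helpers sketched earlier; the 0/1 classifier interface and the choice to keep only the short second-pass code for smooth blocks are my assumptions about the report's procedure).

```python
import numpy as np

def two_step_sample(image, Phi_B1, Phi_B2, classifier, B=8):
    """First pass at ratio 4 for every block; blocks judged smooth by the
    classifier are resampled at ratio 16 and keep only the short code."""
    first, second, origins = [], {}, []
    H, W = image.shape
    for r in range(0, H, B):
        for c in range(0, W, B):
            origins.append((r, c))
            first.append(Phi_B1 @ image[r:r+B, c:c+B].reshape(-1))
    feats = np.linalg.pinv(Phi_B1) @ np.array(first).T     # pre-processed features
    smooth = classifier.predict(feats.T)                   # 1 = smooth block
    for k, (r, c) in enumerate(origins):
        if smooth[k] == 1:                                 # second, coarser pass
            second[k] = Phi_B2 @ image[r:r+B, c:c+B].reshape(-1)
    return np.array(first), second, smooth
```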


3.2.4 Image reconstruction algorithm

In the reconstruction, we apply the SPL-BCS algorithm of James E. Fowler. In block-based CS it is easy to get artifacts along block edges in the final reconstructed image, and this algorithm uses a Wiener filter to remove such blocking artifacts. The pseudocode is the same SPL procedure given in Section 2.2. Since there are two kinds of ΦB in the algorithm (one of size 16 × 64 and another of size 4 × 64), we have to split the algorithm into two parts, apply the appropriate measurement matrix to each part of the image, and then combine the parts together.

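A hedged sketch of that split reconstruction (my composition; spl_reconstruct stands for an iterative SPL solver such as the one sketched in Section 2.2, and the block bookkeeping mirrors the two-step sampling sketch above).

```python
import numpy as np

def reconstruct_two_part(first, second, smooth, Phi_B1, Phi_B2,
                         shape, spl_reconstruct, B=8):
    """Run SPL separately for the two measurement matrices, then merge blocks."""
    out = np.zeros(shape)
    origins = [(r, c) for r in range(0, shape[0], B)
                      for c in range(0, shape[1], B)]
    # unsmooth blocks: reconstruct from the 16 x 64 measurements
    idx1 = [k for k in range(len(origins)) if smooth[k] == 0]
    blocks1 = spl_reconstruct(np.array([first[k] for k in idx1]), Phi_B1)
    # smooth blocks: reconstruct from the 4 x 64 measurements
    idx2 = [k for k in range(len(origins)) if smooth[k] == 1]
    blocks2 = spl_reconstruct(np.array([second[k] for k in idx2]), Phi_B2)
    for k, blk in zip(idx1, blocks1):
        r, c = origins[k]; out[r:r+B, c:c+B] = blk.reshape(B, B)
    for k, blk in zip(idx2, blocks2):
        r, c = origins[k]; out[r:r+B, c:c+B] = blk.reshape(B, B)
    return out
```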


Results

4.1 The confusion rate of BP learning

Applying the obtained weights to the testing examples gives the confusion rates below:

Image   3-layer con. rate   4-layer con. rate
1       0.8862              0.4082
2       0.8884              0.2363
3       0.6877              0.4980
4       0.8022              0.4341
5       0.8464              0.3984
6       0.8865              0.2979
7       0.8545              0.3831
8       0.7207              0.6763
9       0.7849              0.4666
10      0.7747              0.3599
11      0.7332              0.5967
12      0.7380              0.5818

We see that the confusion rates in the table are not very high, especially for the 4-layer configuration. However, these rates are only a reference, since the label values I gave to the testing images are subjective: if I judged a part of an image to be smooth enough, I gave it the value 1, and otherwise 0. This is where the target labels come from.

The most interesting results are the total compression rates and the PSNR of the reconstructed images, shown below:

Image   3-layer BP              4-layer BP
        PSNR (dB)   Com. rate   PSNR (dB)   Com. rate
1       27.3083     14.3184     28.9963     6.2969
2       29.7727     14.4795     31.5096     4.3457
3       27.4235     13.7207     29.9105     4.9053
4       26.5394     13.9990     28.2385     5.6025
5       28.3924     14.7900     30.0847     6.5488
6       29.0541     13.9902     30.1757     4.6943
7       28.4538     14.2832     30.0008     5.5557
8       21.1964     10.7676     21.4562     4.8203
9       23.5239     12.7744     24.2369     4.2490
10      27.1238     14.2217     28.4484     4.2813
11      24.3417     13.1992     25.2604     6.4990
12      22.2190     12.4609     22.7841     5.4414

Image   SPL-BCS
        PSNR (dB)   Com. rate
1       26.2417     8
2       29.4186     8
3       27.1824     8
4       25.1452     8
5       27.8494     8
6       27.5375     8
7       26.8514     8
8       19.8811     8
9       23.2498     8
10      26.5758     8
11      22.7026     8
12      20.1181     8
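For reference, PSNR here is presumably the standard definition for 8-bit images, 10·log10(255²/MSE); a one-line check (my addition, assuming a peak value of 255):

```python
import numpy as np

def psnr(reference, reconstruction, peak=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    mse = np.mean((reference.astype(float) - reconstruction.astype(float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```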

We can see from the tables that the algorithm using 3-layer BP learning is efficient. On one hand, it increases the total compression ratio; on the other hand, the reconstructed images reach a higher PSNR than with plain SPL-BCS. With the 4-layer configuration, however, since the confusion rates are very low (as shown above), most smooth parts of the images were judged unsmooth, so few blocks took a second sampling, which leads to the low compression ratio. And since the compression ratio is low, the PSNR is much higher than for the 3-layer configuration. Furthermore, we can see the difference between using the BP method and not using it in the reconstructed images.


The left-hand side shows reconstructed images using 3-layer BP learning, and the right-hand side shows images without BP learning.

From the reconstructed images, we find that the edges with BP learning are much clearer than the edges without BP learning.

Because of the time limit, I have not compared many different MLP configurations. In future research, I would investigate which MLP configuration is better for image reconstruction.


References

[1] E. Candès, J. Romberg, and T. Tao, "Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information," IEEE Trans. Inform. Theory, vol. 52, no. 2, pp. 489–509, Feb. 2006.

[2] D. Donoho, "Compressed sensing," IEEE Trans. Inform. Theory, vol. 52, no. 4, pp. 1289–1306, Apr. 2006.

[3] Y. Tsaig and D. L. Donoho, "Extensions of compressed sensing," Signal Processing, vol. 86, pp. 533–548, July 2006.

[4] D. L. Donoho, Y. Tsaig, I. Drori, and J.-L. Starck, "Sparse solution of underdetermined linear equations by stagewise orthogonal matching pursuit," Mar. 2006.

[5] Y. C. Pati, R. Rezaiifar, and P. S. Krishnaprasad, "Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition," in Conf. Rec. 27th Asilomar Conf. Signals, Syst. Comput., vol. 1, pp. 40–44, 1993.

[6] E. Le Pennec and S. Mallat, "Bandelet image approximation and compression," Multiscale Modeling & Simulation, vol. 4, no. 4, pp. 992–1039, 2005.

[7] N. G. Kingsbury, "Complex wavelets for shift invariant analysis and filtering of signals," Applied and Computational Harmonic Analysis, vol. 10, pp. 234–253, May 2001.

[8] M. N. Do and M. Vetterli, "The contourlet transform: An efficient directional multiresolution image representation," IEEE Transactions on Image Processing, vol. 14, no. 12, pp. 2091–2106, December 2005.

[9] E. Candès, J. Romberg, and T. Tao, "Stable signal recovery from incomplete and inaccurate measurements," Communications on Pure and Applied Mathematics, vol. 59, no. 8, pp. 1207–1223, August 2006.

[10] E. Candès and J. Romberg, "Practical signal recovery from random projections," 2005. [Online]. Available: http://www.dsp.ece.rice.edu/CS

[11] L. Gan, "Block compressed sensing of natural images," in Proceedings of the International Conference on Digital Signal Processing, Cardiff, UK, pp. 403–406, July 2007.