Image Restoration Using DNN
Hila Levi & Eran Amar
Images were taken from: http://people.tuebingen.mpg.de/burger/neural_denoising/
Agenda
Domain Expertise vs. End-to-End Optimization
Image Denoising and Inpainting:
Task definition and previous work
Presenting the NNs, experiments and results
Conclusions
Image Super Resolution (SR)
For the rest of the talk, "Neural Network" will be written as NN.
Domain Expertise vs. End-to-End Optimization
How can Neural Networks be utilized for algorithmic challenges?
One possible approach is to combine the network with existing well-engineered algorithms (either "physically" or through better initialization).
On the other hand, there is a "pure" learning approach which treats the NN as a "black box". That is, one builds a network with some (possibly customized) architecture and lets it optimize its parameters jointly in an end-to-end manner.
In this talk we will discuss these two approaches for the task of image restoration.
Image Denoising
Introduction - Image Denoising
● Goal - mapping a noisy image to a noise-free image.
● Motivation - additive noise and image degradation are common byproducts of many acquisition channels and compression methods.
● The most common and most easily simulated noise is additive white Gaussian (AWG) noise.
● Gaussianizable noise types: Poisson noise & Rice-distributed noise (we will see examples later).
There is an abundance of more complicated noise types:
● Salt-and-pepper noise
● Stripe noise
● JPEG quantization artifacts
Previous Work
Numerous and diverse (non-NN) approaches:
Selectively smoothing parts of the noisy image.
Careful shrinkage of wavelet coefficients.
Dictionary-based: approximating noisy patches with a sparse combination of elements from a pre-learned dictionary (trained on a noise-free database), for instance KSVD.
"Non-local statistics" of images: different patches in the same image are often similar in appearance. For example, BM3D.
KSVD
● Relies on the assumption that natural images admit a sparse decomposition over a redundant dictionary.
● In general, KSVD is an iterative procedure used to learn the dictionary. In this talk, we refer to the denoising algorithm based on KSVD simply as "KSVD" (more details about dictionary-based methods in the SR part of this talk).
● Achieved great results in image denoising and inpainting; a rough sketch of the patch-based idea follows.
Based on: M. Aharon, M. Elad, and A. Bruckstein. K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Transactions on Signal Processing, 54(11):4311–4322, 2006.
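To make the dictionary-based idea concrete, here is a minimal sketch of patch denoising in that spirit. It uses scikit-learn's dictionary learning and OMP encoding rather than the actual K-SVD algorithm, and the atom count, batch size and sparsity level are illustrative assumptions, not values from the paper.

    from sklearn.decomposition import MiniBatchDictionaryLearning, sparse_encode

    def denoise_patches(noisy_patches, clean_patches):
        # Learn a redundant dictionary of 256 atoms from clean patches.
        dico = MiniBatchDictionaryLearning(n_components=256, batch_size=64)
        dico.fit(clean_patches)
        D = dico.components_                     # shape: (256, patch_dim)
        # Approximate each noisy patch by a sparse combination of atoms (OMP).
        codes = sparse_encode(noisy_patches, D, algorithm="omp", n_nonzero_coefs=5)
        return codes @ D                         # denoised patch estimates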
BM3D
● BM3D = Block-Matching and 3D filtering, first suggested in 2007.
● Given a 2D square block, it finds all similar 2D blocks, "groups" them together into a 3D array, and then performs collaborative filtering (a method designed by the authors) of the group to obtain a noise-free 2D estimate.
● Estimates of overlapping pixels are averaged.
● Gives state-of-the-art results.
Based on: K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian. Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Transactions on Image Processing, 16(8):2080–2095, 2007.
How to Evaluate a Denoising Technique?
● In this part, we will focus on PSNR.
● Peak Signal-to-Noise Ratio, expressed on a logarithmic decibel scale.
● Higher is better.
If I is the original noise-free m×n image, and K is the noisy approximation:
MSE = (1/(m·n)) Σ_{i,j} (I(i,j) − K(i,j))^2
PSNR = 10 · log_10(MAX_I^2 / MSE)
where MAX_I is the maximum possible pixel value (e.g. 255 for 8-bit images).
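A minimal NumPy sketch of the PSNR computation above, assuming 8-bit images (MAX_I = 255):

    import numpy as np

    def psnr(clean, noisy, max_val=255.0):
        # Mean squared error between the clean image and its approximation.
        mse = np.mean((clean.astype(np.float64) - noisy.astype(np.float64)) ** 2)
        return 10.0 * np.log10(max_val ** 2 / mse)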
Pure NN Approach: MLP
● Based on the work of: Harold C. Burger, Christian J. Schuler, and Stefan Harmeling. Image denoising: Can plain Neural Networks compete with BM3D? (June 2012, the dawn of neural networks.)
● The idea is to train a Multi-Layer Perceptron (MLP), which is simply a feed-forward network, to map noisy grayscale image patches onto cleaner patches.
Mathematical Formulation
Formally, the MLP is a nonlinear function that maps a vector-valued input to a vector-valued output. For a 3-layer network, it can be written as:
f(x) = b_3 + W_3 tanh(b_2 + W_2 tanh(b_1 + W_1 x))
where tanh(·) operates coordinate-wise.
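The same function as NumPy code, a sketch in which the weight and bias shapes are whatever the chosen layer sizes dictate:

    import numpy as np

    def mlp_forward(x, W1, b1, W2, b2, W3, b3):
        h1 = np.tanh(b1 + W1 @ x)   # first hidden layer
        h2 = np.tanh(b2 + W2 @ h1)  # second hidden layer
        return b3 + W3 @ h2         # output layer: the denoised patch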
Training Techniques
Loss function: the mean squared error between the network's output for a noisy patch and the corresponding clean patch.
Stochastic Gradient Descent with the backpropagation algorithm.
Common NN tricks:
Data normalization (to have zero mean)
Weight initialization from a normal distribution
Learning-rate division (in each layer, the learning rate was divided by the number of incoming connections to that layer); see the sketch below.
Implemented over GPUs to allow large-scale experiments.
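A hedged sketch of the initialization and per-layer learning-rate tricks: weights are drawn from a normal distribution and each layer's base rate is divided by its fan-in. The layer sizes follow the L-17-4x2047 example later in the talk; the base rate and the exact initialization scale are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    sizes = [289, 2047, 2047, 2047, 2047, 289]   # 17x17 patches in and out, 4 hidden layers
    base_lr = 0.1                                # illustrative base learning rate

    layers = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        layers.append({
            "W": rng.normal(0.0, 1.0 / np.sqrt(fan_in), (fan_out, fan_in)),
            "b": np.zeros(fan_out),
            "lr": base_lr / fan_in,              # divide the rate by the fan-in
        })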
Training and Testing Datasets
Training: pairs of noisy and clean patches.
Given a clean image, the noisy image was generated by applying AWG noise (std = 25); a sketch of the pair generation follows.
Two main sources of clean images:
Berkeley Segmentation dataset (small dataset, ~200 images)
Its union with the LabelMe dataset (large dataset, ~150,000 images)
Testing:
Standard test set: 11 common images ("Lena", "Barbara", etc.)
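A minimal sketch of the training-pair generation: add white Gaussian noise (sigma = 25) to a clean grayscale image and cut aligned patch pairs. The patch size matches the example architecture below; the stride is an illustrative assumption.

    import numpy as np

    def make_pairs(clean_img, sigma=25.0, patch=17, stride=3, rng=None):
        rng = rng or np.random.default_rng()
        noisy_img = clean_img + rng.normal(0.0, sigma, clean_img.shape)
        pairs = []
        h, w = clean_img.shape
        for i in range(0, h - patch + 1, stride):
            for j in range(0, w - patch + 1, stride):
                pairs.append((noisy_img[i:i+patch, j:j+patch].ravel(),
                              clean_img[i:i+patch, j:j+patch].ravel()))
        return pairs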
Architecture Variations
The specific variation of the network is defined by a string with the following structure: ?-??-?x???
First, a letter S or L, indicating the size of the training set (small or large).
Then, a number denoting the patch size.
Finally, the number of hidden layers followed by the size of the layers (all of them are of the same size).
For example: L-17-4x2047 (large training set, 17×17 patches, 4 hidden layers of 2047 units each); a tiny parser of this scheme is sketched below.
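A tiny, hypothetical helper that just makes the naming convention concrete:

    def parse_arch(name):                      # e.g. "L-17-4x2047"
        train_set, patch, layers = name.split("-")
        n_hidden, width = layers.split("x")
        return {"train_set": "large" if train_set == "L" else "small",
                "patch_size": int(patch),
                "n_hidden_layers": int(n_hidden),
                "hidden_width": int(width)}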
Improvement During Training
The PSNR on the "Barbara" and "Lena" images was tested after every 2 million training examples.
Competing with State of the Art
(Tables comparing PSNR against competing methods were shown here.)
Competing with State of the Art (cont.)
(Further comparison results were shown here.)
Noise Levels & "Agnostic" Testing
● The MLP was trained for a fixed level of noise (std = 25).
● Testing was done on different levels of noise.
● The other algorithms have to be supplied with the noise level of the given image.
Mixed Noise Levels for Training
● To overcome that, the MLP was trained on several noise levels ("std" from 0 to 105 in steps of 5).
● The amount of noise was given during training as an additional input parameter; a sketch follows.
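A hedged sketch of noise-conditioned training: the noise std is appended to the input patch vector, so one network can serve several noise levels. How exactly the paper fed the level to the network is not shown in the transcript; the concatenation below is an assumption.

    import numpy as np

    def make_conditioned_example(clean_patch, rng, sigmas=range(0, 106, 5)):
        sigma = float(rng.choice(list(sigmas)))
        noisy = clean_patch + rng.normal(0.0, sigma, clean_patch.shape)
        x = np.concatenate([noisy.ravel(), [sigma]])   # extra input: the noise level
        y = clean_patch.ravel()
        return x, y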
Handling Different Types of Noise
Rice-distributed noise and Poisson noise can be handled by transforming the input image so that the noise becomes AWG-like, and then applying the MLP to the result.
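For Poisson noise, a classic Gaussianizing (variance-stabilizing) choice is the Anscombe transform, after which the noise is approximately Gaussian with unit variance. A sketch under that assumption (the talk does not specify which transform or scaling was used):

    import numpy as np

    def anscombe(x):
        return 2.0 * np.sqrt(x + 3.0 / 8.0)

    def inverse_anscombe(y):
        return (y / 2.0) ** 2 - 3.0 / 8.0   # simple algebraic inverse

    # Usage: denoised = inverse_anscombe(mlp_denoise(anscombe(noisy)))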
Handling Different Types of Noise (2)
In most cases it is more difficult, or impossible, to find Gaussianizing transforms.
MLPs allow us to effectively learn a denoising algorithm for a given noise type, provided that the noise can be simulated (no need to redesign the network).
Handling Different Types of Noise (3)
● Stripe noise:
○ Contains a structure.
○ There is no canonical denoising algorithm, so BM3D was used for comparison.
● Salt-and-pepper noise:
○ Noisy values are not correlated with the original image values.
○ Median filtering served as the baseline for comparison (sketched below).
● JPEG quantization artifacts:
○ Due to the image compression (blocky image and loss of edge clarity).
○ Not random, but completely determined by the input.
○ Compared against the common method for handling JPEG artifacts (re-application of JPEG).
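The salt-and-pepper baseline mentioned above is just a plain median filter; here is a one-liner using SciPy (the 3×3 window is an illustrative choice):

    from scipy.ndimage import median_filter

    def median_baseline(noisy_img, size=3):
        # Replace each pixel by the median of its size x size neighborhood.
        return median_filter(noisy_img, size=size)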
The Power of Pure Learning
Achieved state-of-the-art results.
Key ingredients for success:
The capacity of the network should be large enough (in terms of layers and units).
Large patch size.
Huge training set (tens of millions of examples).
However, the best MLP performs well only with respect to a single level of noise. The attempt to overcome that improved the generalization to different noise levels, but still achieved less than the original version on the fixed noise level.
Image Inpainting
Blind / Non-Blind Image Inpainting
● Goal - recovering missing pixel values or removing sophisticated patterns from the image.
● Known vs. unknown locations of the corrupted pixels.
● Some image denoising algorithms can be applied (with minor modifications) to non-blind image inpainting and achieve state-of-the-art results.
● Blind inpainting is a much harder problem. Previous methods imposed strong assumptions on the inputs.
Exploiting Domain Expertise
● Previous works based on sparse coding techniques perform well in practice, despite being linear.
● It was suggested that non-linear "deep" models might achieve superior performance.
● Multi-layer NNs are such deep models.
● Junyuan Xie, Linli Xu and Enhong Chen suggested, in Image Denoising and Inpainting with Deep Neural Networks (2012), to combine "sparse" with "deep". We now present their work.
DA, SDA and SSDA
● A Denoising Autoencoder (DA) is a 2-layer NN that tries to reconstruct the original input given a noisy estimation of it.
● Used in several other machine learning fields.
● Concatenating multiple DAs gives a Stacked Denoising Autoencoder (SDA).
● The authors proposed a sparsity-induced Stacked Denoising Autoencoder (SSDA).
Single DA - Mathematical Formulation
Noise/clean relation: y_i is a corrupted (noisy) observation of the clean input x_i.
Layer formulation: h(y) = σ(W y + b), x̂(y) = σ(W' h(y) + b')
Activation function: the sigmoid σ(a) = 1 / (1 + e^(−a)), applied coordinate-wise.
Loss function (reconstruction error over the N training pairs):
L = (1/N) Σ_i ||x_i − x̂(y_i)||^2
Learning objective: minimize L over the parameters θ = {W, W', b, b'}.
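A minimal NumPy sketch of a single DA forward pass and its reconstruction loss (shapes are illustrative; whether the paper ties W' to W is not shown here, so the two matrices are kept independent):

    import numpy as np

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    def da_loss(Y, X, W, b, Wp, bp):
        # Y: noisy inputs, X: clean targets, both of shape (n_examples, dim).
        H = sigmoid(Y @ W.T + b)         # hidden representation h(y)
        X_hat = sigmoid(H @ Wp.T + bp)   # reconstruction x_hat(y)
        return np.mean(np.sum((X - X_hat) ** 2, axis=1))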
SSDA Formulation - how to make it "sparse"?
● Each DA is pre-trained w.r.t. a sparsity-inducing loss function: the reconstruction loss above, plus a KL-divergence term pushing the average hidden activation ρ̂ toward a small target value ρ, plus weight decay:
L_sparse = (1/N) Σ_i ||x_i − x̂(y_i)||^2 + β · KL(ρ̂ || ρ) + (λ/2) (||W||^2 + ||W'||^2)
Parameters used:
● Layer-wise pre-training gives a better initialization before the traditional backpropagation training.
● Iterative manner: after training the first layer, the hidden-layer activations of both the noisy input and the clean input become the training data of the second layer, and so on (see the sketch below). Namely, the pair (h(y_i), h(x_i)) replaces (y_i, x_i).
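A hedged sketch of the layer-wise pre-training loop for K stacked DAs. It reuses sigmoid from the earlier sketch, and train_da is an assumed helper that minimizes the sparse DA loss and returns the trained parameters:

    def pretrain_ssda(Y, X, train_da, K=2):
        params = []
        for _ in range(K):
            W, b, Wp, bp = train_da(Y, X)   # minimize the sparse DA loss
            params.append((W, b, Wp, bp))
            Y = sigmoid(Y @ W.T + b)        # noisy activations feed the next layer
            X = sigmoid(X @ W.T + b)        # clean activations are the next targets
        return params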
“Fine Tuning”
Reminder:
● Assume K stacked DAs that were already pre-trained (in our case K = 2).
● The backpropagation training phase is done w.r.t. the reconstruction loss of the entire stacked network.
● Optimization algorithm: L-BFGS (a quasi-Newton method) with backpropagation; a sketch follows.
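A minimal sketch of the fine-tuning step with SciPy's L-BFGS. Here ssda_loss and its gradient ssda_grad over the flattened stacked parameters are assumed helpers (e.g. computing the whole-network reconstruction error via backpropagation):

    from scipy.optimize import minimize

    def fine_tune(theta0, Y, X, ssda_loss, ssda_grad):
        res = minimize(ssda_loss, theta0, args=(Y, X), jac=ssda_grad,
                       method="L-BFGS-B", options={"maxiter": 400})
        return res.x   # the fine-tuned (flattened) parameter vector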
SSDA for the Task of Denoising
● Datasets:
○ Clean training images were taken from a public web database.
○ Noisy counterparts were generated with AWG noise.
● Training:
○ Pairs of noisy and clean patches (that may overlap).
● Testing:
○ Standard 11 test images ("Lena", "boat", etc.).
● Variations:
○ Different patch sizes were tried (not specified).
SSDA for Denoising - Results
● Compared against: KSVD and BLS-GSM (Bayes Least Squares - Gaussian Scale Mixture).
● Numerical results (PSNR): the differences between all three methods were statistically insignificant.
● SSDA gave better visual results (clearer boundaries and texture details).
SSDA for Denoising - Results (2)
Observations for SSDA:
● Higher noise levels require larger patch sizes.
● Not very sensitive to the weights of the regularization terms.
(Figure: visual comparison of the clean and noisy images with the BLS-GSM, KSVD and SSDA results.)
SSDA for the Task of Inpainting
● Considered for the task of blind text removal.
● Same datasets for testing and training.
● "Corrupted" images were generated by adding text of various sizes (a sketch of such corruption follows).
● There is no blind algorithm for comparison, thus the non-blind KSVD was used.
● Results:
○ Small fonts were eliminated completely; larger fonts were dimmed.
○ A blind method which competes with a non-blind one!
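A hedged sketch of generating text-corrupted grayscale images with Pillow; the text placement, line spacing and default font are illustrative assumptions, not the paper's protocol:

    from PIL import ImageDraw

    def corrupt_with_text(img, lines, xy=(10, 10), step=20):
        out = img.copy()
        draw = ImageDraw.Draw(out)
        x, y = xy
        for line in lines:
            draw.text((x, y), line, fill=0)   # dark text on a grayscale image
            y += step
        return out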
SSDA for Inpainting - Visual Results
(Figure: noisy input, SSDA result and KSVD result.)
Summary: Combining "deep" with "sparse"
● Inspired by domain expertise to better understand the network.
○ Much more room to improve: dropout, convolutional layers, etc.
● SSDA for the denoising task:
○ Performance comparable to traditional sparse coding algorithms.
○ Limited to the specific type of noise it was trained on, and to a fixed noise level.
● SSDA for the blind inpainting task:
○ Achieved great results, even compared to non-blind methods.
Next part after the break
Thank You