Image Restoration Using DNN
Hila Levi & Eran Amar
Images were taken from: http://people.tuebingen.mpg.de/burger/neural_denoising/
Agenda
Domain Expertise vs. End-to-End Optimization
Image Denoising and Inpainting:
Task definition and previous work
Presenting the NNs, experiments and results
Conclusions
Image Super Resolution (SR)
For the rest of the talk, "Neural Network" will be written as NN.
Domain Expertise vs. End-to-End Optimization
How can Neural Networks be utilized for algorithmic challenges?
One possible approach is to combine the network with existing well-engineered algorithms (either "physically" or through better initialization).
On the other hand, there is a "pure" learning approach which treats the NN as a "black box". That is, one builds a network with some (possibly customized) architecture and lets it optimize its parameters jointly in an end-to-end manner.
In this talk we will discuss these two approaches for the task of image restoration.
Image Denoising
Introduction - Image Denoising
● Goal - mapping a noisy image to a noise-free image.
● Motivation - additive noise and image degradation are common byproducts of many acquisition channels and compression methods.
● The most common and most easily simulated noise is additive white Gaussian (AWG) noise.
● Gaussianizable noise types: Poisson noise & Rice-distributed noise (we will see examples later).
There is an abundance of more complicated noise types:
● Salt-and-pepper noise
● Stripe noise
● JPEG quantization artifacts
Previous Work
Numerous and diverse (non-NN) approaches:
Selectively smoothing parts of the noisy image.
Careful shrinkage of wavelet coefficients.
Dictionary-based: approximating noisy patches with a sparse combination of elements from a pre-learned dictionary (trained on a noise-free database), for instance KSVD.
"Non-local statistics" of images: different patches in the same image are often similar in appearance. For example, BM3D.
KSVD
● Relies on the assumption that natural images admit a sparse decomposition over a redundant dictionary.
● In general, KSVD is an iterative procedure used to learn the dictionary. In this talk, we refer to the denoising algorithm based on KSVD simply as "KSVD" (more details about dictionary-based methods in the SR part of this talk).
● Achieved great results in image denoising and inpainting; a rough sketch of the patch-based idea follows.
Based on: M. Aharon, M. Elad, and A. Bruckstein. K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Transactions on Signal Processing, 54(11):4311–4322, 2006.
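To make the dictionary-based idea concrete, here is a minimal sketch of patch denoising in that spirit. It uses scikit-learn's dictionary learning and OMP encoding rather than the actual K-SVD algorithm, and the atom count, batch size and sparsity level are illustrative assumptions, not values from the paper.

    from sklearn.decomposition import MiniBatchDictionaryLearning, sparse_encode

    def denoise_patches(noisy_patches, clean_patches):
        # Learn a redundant dictionary of 256 atoms from clean patches.
        dico = MiniBatchDictionaryLearning(n_components=256, batch_size=64)
        dico.fit(clean_patches)
        D = dico.components_                     # shape: (256, patch_dim)
        # Approximate each noisy patch by a sparse combination of atoms (OMP).
        codes = sparse_encode(noisy_patches, D, algorithm="omp", n_nonzero_coefs=5)
        return codes @ D                         # denoised patch estimates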
BM3D
● BM3D = Block-Matching and 3D filtering, first suggested in 2007.
● Given a 2D square block, it finds all similar 2D blocks, "groups" them together into a 3D array, and then performs collaborative filtering (a method designed by the authors) of the group to obtain a noise-free 2D estimate.
● Estimates of overlapping pixels are averaged.
● Gives state-of-the-art results.
Based on: K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian. Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Transactions on Image Processing, 16(8):2080–2095, 2007.
How to Evaluate a Denoising Technique?
● In this part, we will focus on PSNR.
● Peak Signal-to-Noise Ratio, expressed on a logarithmic decibel scale.
● Higher is better.
If I is the original noise-free m×n image, and K is the noisy approximation:
MSE = (1/(m·n)) Σ_{i,j} (I(i,j) − K(i,j))^2
PSNR = 10 · log_10(MAX_I^2 / MSE)
where MAX_I is the maximum possible pixel value (e.g. 255 for 8-bit images).
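A minimal NumPy sketch of the PSNR computation above, assuming 8-bit images (MAX_I = 255):

    import numpy as np

    def psnr(clean, noisy, max_val=255.0):
        # Mean squared error between the clean image and its approximation.
        mse = np.mean((clean.astype(np.float64) - noisy.astype(np.float64)) ** 2)
        return 10.0 * np.log10(max_val ** 2 / mse)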
Pure NN Approach: MLP
● Based on the work of: Harold C. Burger, Christian J. Schuler, and Stefan Harmeling. Image denoising: Can plain Neural Networks compete with BM3D? (June 2012, the dawn of neural networks.)
● The idea is to train a Multi-Layer Perceptron (MLP), which is simply a feed-forward network, to map noisy grayscale image patches onto cleaner patches.
Mathematical Formulation
Formally, the MLP is a nonlinear function that maps a vector-valued input to a vector-valued output. For a 3-layer network, it can be written as:
f(x) = b_3 + W_3 tanh(b_2 + W_2 tanh(b_1 + W_1 x))
where tanh(·) operates coordinate-wise.
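The same function as NumPy code, a sketch in which the weight and bias shapes are whatever the chosen layer sizes dictate:

    import numpy as np

    def mlp_forward(x, W1, b1, W2, b2, W3, b3):
        h1 = np.tanh(b1 + W1 @ x)   # first hidden layer
        h2 = np.tanh(b2 + W2 @ h1)  # second hidden layer
        return b3 + W3 @ h2         # output layer: the denoised patch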
Training Techniques
Loss function: the mean squared error between the network's output for a noisy patch and the corresponding clean patch.
Stochastic Gradient Descent with the backpropagation algorithm.
Common NN tricks:
Data normalization (to have zero mean)
Weight initialization from a normal distribution
Learning-rate division (in each layer, the learning rate was divided by the number of incoming connections to that layer); see the sketch below.
Implemented over GPUs to allow large-scale experiments.
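A hedged sketch of the initialization and per-layer learning-rate tricks: weights are drawn from a normal distribution and each layer's base rate is divided by its fan-in. The layer sizes follow the L-17-4x2047 example later in the talk; the base rate and the exact initialization scale are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    sizes = [289, 2047, 2047, 2047, 2047, 289]   # 17x17 patches in and out, 4 hidden layers
    base_lr = 0.1                                # illustrative base learning rate

    layers = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        layers.append({
            "W": rng.normal(0.0, 1.0 / np.sqrt(fan_in), (fan_out, fan_in)),
            "b": np.zeros(fan_out),
            "lr": base_lr / fan_in,              # divide the rate by the fan-in
        })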
Training and Testing Datasets
Training: pairs of noisy and clean patches.
Given a clean image, the noisy image was generated by applying AWG noise (std = 25); a sketch of the pair generation follows.
Two main sources of clean images:
Berkeley Segmentation dataset (small dataset, ~200 images)
Its union with the LabelMe dataset (large dataset, ~150,000 images)
Testing:
Standard test set: 11 common images ("Lena", "Barbara", etc.)
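A minimal sketch of the training-pair generation: add white Gaussian noise (sigma = 25) to a clean grayscale image and cut aligned patch pairs. The patch size matches the example architecture below; the stride is an illustrative assumption.

    import numpy as np

    def make_pairs(clean_img, sigma=25.0, patch=17, stride=3, rng=None):
        rng = rng or np.random.default_rng()
        noisy_img = clean_img + rng.normal(0.0, sigma, clean_img.shape)
        pairs = []
        h, w = clean_img.shape
        for i in range(0, h - patch + 1, stride):
            for j in range(0, w - patch + 1, stride):
                pairs.append((noisy_img[i:i+patch, j:j+patch].ravel(),
                              clean_img[i:i+patch, j:j+patch].ravel()))
        return pairs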
Architecture Variations
The specific variation of the network is defined by a string with the following structure: ?-??-?x???
First, a letter S or L, indicating the size of the training set (small or large).
Then, a number denoting the patch size.
Finally, the number of hidden layers followed by the size of the layers (all of them are of the same size).
For example: L-17-4x2047 (large training set, 17×17 patches, 4 hidden layers of 2047 units each); a tiny parser of this scheme is sketched below.
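A tiny, hypothetical helper that just makes the naming convention concrete:

    def parse_arch(name):                      # e.g. "L-17-4x2047"
        train_set, patch, layers = name.split("-")
        n_hidden, width = layers.split("x")
        return {"train_set": "large" if train_set == "L" else "small",
                "patch_size": int(patch),
                "n_hidden_layers": int(n_hidden),
                "hidden_width": int(width)}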
Improvement During Training
The PSNR on the "Barbara" and "Lena" images was tested after every 2 million training examples.
Competing with State of the Art
(Tables comparing PSNR against competing methods were shown here.)
Competing with State of the Art (cont.)
(Further comparison results were shown here.)
Noise Levels & "Agnostic" Testing
● The MLP was trained for a fixed level of noise (std = 25).
● Testing was done on different levels of noise.
● The other algorithms have to be supplied with the noise level of the given image.
Mixed Noise Levels for Training
● To overcome that, the MLP was trained on several noise levels ("std" from 0 to 105 in steps of 5).
● The amount of noise was given during training as an additional input parameter; a sketch follows.
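A hedged sketch of noise-conditioned training: the noise std is appended to the input patch vector, so one network can serve several noise levels. How exactly the paper fed the level to the network is not shown in the transcript; the concatenation below is an assumption.

    import numpy as np

    def make_conditioned_example(clean_patch, rng, sigmas=range(0, 106, 5)):
        sigma = float(rng.choice(list(sigmas)))
        noisy = clean_patch + rng.normal(0.0, sigma, clean_patch.shape)
        x = np.concatenate([noisy.ravel(), [sigma]])   # extra input: the noise level
        y = clean_patch.ravel()
        return x, y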
Handling Different Types of Noise
Rice-distributed noise and Poisson noise can be handled by transforming the input image so that the noise becomes AWG-like, and then applying the MLP to the result.
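For Poisson noise, a classic Gaussianizing (variance-stabilizing) choice is the Anscombe transform, after which the noise is approximately Gaussian with unit variance. A sketch under that assumption (the talk does not specify which transform or scaling was used):

    import numpy as np

    def anscombe(x):
        return 2.0 * np.sqrt(x + 3.0 / 8.0)

    def inverse_anscombe(y):
        return (y / 2.0) ** 2 - 3.0 / 8.0   # simple algebraic inverse

    # Usage: denoised = inverse_anscombe(mlp_denoise(anscombe(noisy)))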
Handling Different Types of Noise (2)
In most cases it is more difficult, or impossible, to find Gaussianizing transforms.
MLPs allow us to effectively learn a denoising algorithm for a given noise type, provided that the noise can be simulated (no need to redesign the network).
Handling Different Types of Noise (3)
● Stripe noise:
○ Contains a structure.
○ There is no canonical denoising algorithm, so BM3D was used for comparison.
● Salt-and-pepper noise:
○ Noisy values are not correlated with the original image values.
○ Median filtering served as the baseline for comparison (sketched below).
● JPEG quantization artifacts:
○ Due to the image compression (blocky image and loss of edge clarity).
○ Not random, but completely determined by the input.
○ Compared against the common method for handling JPEG artifacts (re-application of JPEG).
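The salt-and-pepper baseline mentioned above is just a plain median filter; here is a one-liner using SciPy (the 3×3 window is an illustrative choice):

    from scipy.ndimage import median_filter

    def median_baseline(noisy_img, size=3):
        # Replace each pixel by the median of its size x size neighborhood.
        return median_filter(noisy_img, size=size)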
The Power of Pure Learning
Achieved state-of-the-art results.
Key ingredients for success:
The capacity of the network should be large enough (in terms of layers and units).
Large patch size.
Huge training set (tens of millions of examples).
However, the best MLP performs well only with respect to a single level of noise. The attempt to overcome that improved the generalization to different noise levels, but still achieved less than the original version on the fixed noise level.
Image Inpainting
Blind / Non-Blind Image Inpainting
● Goal - recovering missing pixel values or removing sophisticated patterns from the image.
● Known vs. unknown locations of the corrupted pixels.
● Some image denoising algorithms can be applied (with minor modifications) to non-blind image inpainting and achieve state-of-the-art results.
● Blind inpainting is a much harder problem. Previous methods imposed strong assumptions on the inputs.
Exploiting Domain Expertise
● Previous works based on sparse coding techniques perform well in practice, despite being linear.
● It was suggested that non-linear "deep" models might achieve superior performance.
● Multi-layer NNs are such deep models.
● Junyuan Xie, Linli Xu and Enhong Chen suggested, in Image Denoising and Inpainting with Deep Neural Networks (2012), to combine "sparse" with "deep". We now present their work.
DA, SDA and SSDA
● A Denoising Autoencoder (DA) is a 2-layer NN that tries to reconstruct the original input given a noisy estimation of it.
● Used in several other machine learning fields.
● Concatenating multiple DAs gives a Stacked Denoising Autoencoder (SDA).
● The authors proposed a sparsity-induced Stacked Denoising Autoencoder (SSDA).
Single DA - Mathematical Formulation
Noise/clean relation: y_i is a corrupted (noisy) observation of the clean input x_i.
Layer formulation: h(y) = σ(W y + b), x̂(y) = σ(W' h(y) + b')
Activation function: the sigmoid σ(a) = 1 / (1 + e^(−a)), applied coordinate-wise.
Loss function (reconstruction error over the N training pairs):
L = (1/N) Σ_i ||x_i − x̂(y_i)||^2
Learning objective: minimize L over the parameters θ = {W, W', b, b'}.
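A minimal NumPy sketch of a single DA forward pass and its reconstruction loss (shapes are illustrative; whether the paper ties W' to W is not shown here, so the two matrices are kept independent):

    import numpy as np

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    def da_loss(Y, X, W, b, Wp, bp):
        # Y: noisy inputs, X: clean targets, both of shape (n_examples, dim).
        H = sigmoid(Y @ W.T + b)         # hidden representation h(y)
        X_hat = sigmoid(H @ Wp.T + bp)   # reconstruction x_hat(y)
        return np.mean(np.sum((X - X_hat) ** 2, axis=1))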
SSDA Formulation - how to make it "sparse"?
● Each DA is pre-trained w.r.t. a sparsity-inducing loss function: the reconstruction loss above, plus a KL-divergence term pushing the average hidden activation ρ̂ toward a small target value ρ, plus weight decay:
L_sparse = (1/N) Σ_i ||x_i − x̂(y_i)||^2 + β · KL(ρ̂ || ρ) + (λ/2) (||W||^2 + ||W'||^2)
Parameters used:
● Layer-wise pre-training gives a better initialization before the traditional backpropagation training.
● Iterative manner: after training the first layer, the hidden-layer activations of both the noisy input and the clean input become the training data of the second layer, and so on (see the sketch below). Namely, the pair (h(y_i), h(x_i)) replaces (y_i, x_i).
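A hedged sketch of the layer-wise pre-training loop for K stacked DAs. It reuses sigmoid from the earlier sketch, and train_da is an assumed helper that minimizes the sparse DA loss and returns the trained parameters:

    def pretrain_ssda(Y, X, train_da, K=2):
        params = []
        for _ in range(K):
            W, b, Wp, bp = train_da(Y, X)   # minimize the sparse DA loss
            params.append((W, b, Wp, bp))
            Y = sigmoid(Y @ W.T + b)        # noisy activations feed the next layer
            X = sigmoid(X @ W.T + b)        # clean activations are the next targets
        return params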
“Fine Tuning”
Reminder:
● Assume K stacked DAs that were already pre-trained (in our case K = 2).
● The backpropagation training phase is done w.r.t. the reconstruction loss of the entire stacked network.
● Optimization algorithm: L-BFGS (a quasi-Newton method) with backpropagation; a sketch follows.
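A minimal sketch of the fine-tuning step with SciPy's L-BFGS. Here ssda_loss and its gradient ssda_grad over the flattened stacked parameters are assumed helpers (e.g. computing the whole-network reconstruction error via backpropagation):

    from scipy.optimize import minimize

    def fine_tune(theta0, Y, X, ssda_loss, ssda_grad):
        res = minimize(ssda_loss, theta0, args=(Y, X), jac=ssda_grad,
                       method="L-BFGS-B", options={"maxiter": 400})
        return res.x   # the fine-tuned (flattened) parameter vector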
SSDA for the Task of Denoising
● Datasets:
○ Clean training images were taken from a public web database.
○ Noisy counterparts were generated with AWG noise.
● Training:
○ Pairs of noisy and clean patches (that may overlap).
● Testing:
○ Standard 11 test images ("Lena", "boat", etc.).
● Variations:
○ Different patch sizes were tried (not specified).
SSDA for Denoising - Results
● Compared against: KSVD and BLS-GSM (Bayes Least Squares - Gaussian Scale Mixture).
● Numerical results (PSNR): the differences between all three methods were statistically insignificant.
● SSDA gave better visual results (clearer boundaries and texture details).
SSDA for Denoising - Results (2)
Observations for SSDA:
● Higher noise levels require larger patch sizes.
● Not very sensitive to the weights of the regularization terms.
(Figure: visual comparison of the clean and noisy images with the BLS-GSM, KSVD and SSDA results.)
SSDA for the Task of Inpainting
● Considered for the task of blind text removal.
● Same datasets for testing and training.
● "Corrupted" images were generated by adding text of various sizes (a sketch of such corruption follows).
● There is no blind algorithm for comparison, thus the non-blind KSVD was used.
● Results:
○ Small fonts were eliminated completely; larger fonts were dimmed.
○ A blind method which competes with a non-blind one!
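A hedged sketch of generating text-corrupted grayscale images with Pillow; the text placement, line spacing and default font are illustrative assumptions, not the paper's protocol:

    from PIL import ImageDraw

    def corrupt_with_text(img, lines, xy=(10, 10), step=20):
        out = img.copy()
        draw = ImageDraw.Draw(out)
        x, y = xy
        for line in lines:
            draw.text((x, y), line, fill=0)   # dark text on a grayscale image
            y += step
        return out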
SSDA for Inpainting - Visual Results
(Figure: noisy input, SSDA result and KSVD result.)
Summary: Combining "deep" with "sparse"
● Inspired by domain expertise to better understand the network.
○ Much more room to improve: dropout, convolutional layers, etc.
● SSDA for the denoising task:
○ Performance comparable to traditional sparse coding algorithms.
○ Limited to the specific type of noise it was trained on, and to a fixed noise level.
● SSDA for the blind inpainting task:
○ Achieved great results, even compared to non-blind methods.
Next part after the break
Thank You