single image super resolution - university of southern...

Single Image Super Resolution

Wenchao Zheng

5/6/2016

MCL

University of Southern California

Outline

Introduction

• Definition of Super Resolution

• Multi-image SR

• Single-Image SR

Current Work

• Matched filter

Literature

• Example based

• SRCNN

• VDSR

• Perceptual SR

What is Super Resolution?

Definition: Generating a high-resolution (HR) image with more detail from one or more low-resolution (LR) images.

More examples

Difference from Sharpening

Multi-image vs. Single-image

Multi-image

Source: [Park et al. SPM 2003]

http://www-sipl.technion.ac.il/new/Teaching/Projects/Winter2007/SR_Overview.pdf

Single-image

Source: [Freeman et al. CG&A 2002]

http://people.csail.mit.edu/billf/publications/Example-Based_Super_Resolution.pdf

Multi-image SR

• Several images of the same scene are available.

• Each image carries different information about the same scene.

• The HR image is reconstructed from the multiple LR images.
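
A toy sketch of why several LR images help. It assumes an idealized setting that the slides do not specify: four LR frames of the same scene with known shifts of one HR pixel, no blur, and no noise. Under those assumptions the frames interleave exactly into the 2x HR grid. The function names are hypothetical, and practical multi-image SR must also estimate the shifts and undo blur.

```python
import numpy as np

def make_shifted_lr_frames(hr):
    """Simulate four LR frames by 2x decimation at the four pixel offsets (idealized assumption)."""
    return {(dy, dx): hr[dy::2, dx::2] for dy in (0, 1) for dx in (0, 1)}

def interleave_frames(frames):
    """Place each LR frame back on the HR grid at its known offset."""
    h, w = frames[(0, 0)].shape
    hr = np.zeros((2 * h, 2 * w), dtype=frames[(0, 0)].dtype)
    for (dy, dx), lr in frames.items():
        hr[dy::2, dx::2] = lr
    return hr

scene = np.arange(64, dtype=np.float64).reshape(8, 8)  # stand-in HR scene
frames = make_shifted_lr_frames(scene)                 # four 4x4 LR frames
restored = interleave_frames(frames)                   # 8x8 reconstruction
```

Each frame alone holds a quarter of the samples; only together do they cover the HR grid.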

Single Image SR

Learning to map from low-res to high-res patches

• Nearest neighbor [Freeman et al. CG&A 02]
http://people.csail.mit.edu/billf/publications/Example-Based_Super_Resolution.pdf

• Neighborhood embedding [Chang et al. CVPR 04]
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.330.2972&rep=rep1&type=pdf

• Sparse representation [Yang et al. TIP 10]
http://www.ifp.illinois.edu/~jyang29/ScSR.htm

• Kernel ridge regression [Kim and Kwon PAMI 10]
https://people.mpi-inf.mpg.de/~kkim/supres/supres.htm

• Locally-linear regression [Yang and Yang ICCV 13] [Timofte et al. ACCV 14]
http://www.cv-foundation.org/openaccess/content_iccv_2013/papers/Yang_Fast_Direct_Super-Resolution_2013_ICCV_paper.pdf
http://www.vision.ee.ethz.ch/~timofter/publications/Timofte-ACCV-2014.pdf

• Convolutional neural network [Dong et al. ECCV 14]
http://mmlab.ie.cuhk.edu.hk/projects/SRCNN.html

• Random forest [Schulter et al. CVPR 15]
http://lrs.icg.tugraz.at/pubs/schulter_cvpr_2015.pdf

Figure panels: Low Resolution, Super Resolution, Original

Example Based SR

Correspondences between low- and high-resolution image patches are learned from a database of paired low- and high-resolution images.
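
A minimal sketch of the lookup idea, assuming the database is simply two aligned arrays of LR and HR patches (`lr_database` and `hr_database` are hypothetical names). It shows only the nearest-neighbor step; published example-based methods add preprocessing and consistency constraints between neighboring patches that are not modeled here.

```python
import numpy as np

def nearest_neighbor_hr_patch(lr_patch, lr_database, hr_database):
    """Return the HR patch paired with the database LR patch closest to lr_patch."""
    # Flatten patches so that each database row is one patch vector.
    query = lr_patch.reshape(1, -1)
    candidates = lr_database.reshape(len(lr_database), -1)
    # Squared Euclidean distance from the query to every stored LR patch.
    distances = np.sum((candidates - query) ** 2, axis=1)
    return hr_database[np.argmin(distances)]

rng = np.random.default_rng(0)
lr_db = rng.random((50, 3, 3))   # 50 stored LR patches (synthetic)
hr_db = rng.random((50, 6, 6))   # their paired HR patches (synthetic)
query = lr_db[7] + 1e-3          # a slightly perturbed copy of entry 7
predicted = nearest_neighbor_hr_patch(query, lr_db, hr_db)
```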

• Super-Resolution from Transformed Self-Exemplars

Huang, J.B., Singh, A. and Ahuja, N., 2015, June. Single image super-resolution from transformed self-exemplars. In Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on (pp. 5197-5206). IEEE.

Figure labels: LR input image, matching error, LR patch, HR patch, translation, perspective, ground truth LR/HR patch

SRCNN

Dong, C., Loy, C.C., He, K. and Tang, X., 2014. Learning a deep convolutional network for image super-resolution. In Computer Vision–ECCV 2014 (pp. 184-199). Springer International Publishing.

A review on: Image Super-Resolution using deep CNN

• First super-resolution work to use a CNN

• Achieved a significant improvement

Training: Network overview

backpropagation

Training: filter size and numbers

• Patch extraction & representation: 64 filters of size C×f1×f1 (1×9×9), i.e. 64·1·9·9 = 5184 weights

• Non-linear mapping: 32 filters of size n1×f2×f2 (64×1×1), i.e. 32·64·1·1 = 2048 weights

• Reconstruction: 1 filter of size n2×f3×f3 (32×5×5), i.e. 1·32·5·5 = 800 weights
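
The weight counts above follow from filters × input channels × kernel height × kernel width. A short check, with biases left out as on the slide (the helper name is hypothetical):

```python
def conv_weight_count(num_filters, in_channels, kernel_size):
    """Number of weights in a conv layer with square kernels (biases excluded)."""
    return num_filters * in_channels * kernel_size * kernel_size

# (filters, input channels, kernel size) for the three SRCNN layers on the slide
layers = [(64, 1, 9), (32, 64, 1), (1, 32, 5)]
counts = [conv_weight_count(*layer) for layer in layers]
total = sum(counts)
print(counts, total)
```

The whole network has only 8032 weights, which is small by CNN standards.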

Training: preprocessing

• Training dataset
- 91 images
- only the Y channel is used

• Input patch extraction
- extract patches from bicubic-interpolated LR images
- input patches are of size 33×33
- pixel values are normalized to the range 0~1

• Target patch extraction
- extract patches from the original HR images
- target patches are of size 21×21
- pixel values are normalized to the range 0~1
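
The 33×33 input and 21×21 target sizes are consistent with unpadded convolutions: each k×k layer trims k−1 pixels, so 33 − 8 − 0 − 4 = 21 for the 9-1-5 filters. The sketch below checks this and cuts aligned patch pairs. The stride of 14 and the use of a copy of the HR image as a stand-in for the bicubic-upscaled LR image are assumptions for illustration, and the function names are hypothetical.

```python
import numpy as np

def valid_output_size(input_size, kernel_sizes):
    """Spatial size after a stack of unpadded, stride-1 convolutions."""
    size = input_size
    for k in kernel_sizes:
        size -= k - 1
    return size

def extract_pairs(lr_up, hr, in_size=33, out_size=21, stride=14):
    """Cut aligned input/target patches; each target is the centre crop of its input window."""
    margin = (in_size - out_size) // 2
    inputs, targets = [], []
    for y in range(0, hr.shape[0] - in_size + 1, stride):
        for x in range(0, hr.shape[1] - in_size + 1, stride):
            inputs.append(lr_up[y:y + in_size, x:x + in_size])
            targets.append(hr[y + margin:y + margin + out_size,
                              x + margin:x + margin + out_size])
    return np.stack(inputs), np.stack(targets)

out_size = valid_output_size(33, [9, 1, 5])
rng = np.random.default_rng(0)
# Synthetic 8-bit Y channel, normalized to the range 0~1.
hr = rng.integers(0, 256, size=(61, 75)).astype(np.float32) / 255.0
lr_up = hr.copy()  # stand-in for the bicubic-interpolated LR image
inputs, targets = extract_pairs(lr_up, hr)
```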

Training: make LR match HR

• Loss: Euclidean distance between the network output and the HR target patch
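
A minimal sketch of such a loss, assuming it is averaged per pixel (mean squared error, which is the squared Euclidean distance divided by the number of pixels). The function name is hypothetical.

```python
import numpy as np

def mse_loss(predicted, target):
    """Mean squared error between predicted and target patches."""
    return float(np.mean((predicted - target) ** 2))

predicted = np.zeros((2, 21, 21))      # two dummy 21x21 network outputs
target = np.full((2, 21, 21), 0.5)     # two dummy HR target patches
loss = mse_loss(predicted, target)
```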

Training: filter visualization

• Filters g and h resemble Laplacian/Gaussian filters

• Filters a–e resemble edge detectors at different orientations

• Filter f resembles a texture extractor

Training: feature maps

Training: number of layers

Training: number of filters

Test: forward propagation

• The full bicubic-interpolated image is used as input, not individual patches.

Performance comparison

Table 1: performance on BSD200

SRCNN’s Strength

Figure panels: SRCNN, Bicubic

• PSNR: Bicubic 32.39 dB, SRCNN 34.71 dB. Improvement: 2.32 dB

• PSNR: Bicubic 24.04 dB, SRCNN 27.95 dB. Improvement: 3.91 dB
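
For reference, PSNR is 10·log10(peak² / MSE). A minimal sketch, assuming pixel values normalized to the range 0~1 so that the peak is 1.0 (the function name is hypothetical):

```python
import numpy as np

def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))

reference = np.zeros((4, 4))
estimate = np.full((4, 4), 0.1)      # constant error of 0.1, so MSE = 0.01
value = psnr(reference, estimate)    # 10 * log10(1 / 0.01)
```

Because the scale is logarithmic, a gain of about 3 dB corresponds to roughly halving the mean squared error.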

SRCNN’s Weakness

Figure panels: SRCNN, Bicubic

Modify SRCNN’s training data

SRCNN trained on hair dataset

SRCNN reconstruction, first example:

• Author-given training dataset: PSNR 30.60 dB, SSIM 0.9572

• Specific hair training dataset: PSNR 30.90 dB, SSIM 0.9599

Second example. Panels: Original, Author-given training dataset, Specific hair training dataset

• Reported PSNR values: 32.38 dB, 32.45 dB

• Reported SSIM values: 0.8531, 0.8511

Third example. Panels: Original, Author-given training dataset, Specific hair training dataset

• Reported PSNR values: 30.21 dB, 30.19 dB

• Reported SSIM values: 0.8240, 0.8256


VDSR

A review on: Accurate Image Super-Resolution Using Very Deep Convolutional Networks

• Second super-resolution work to use a CNN, with a deeper network

• Fewer training iterations

• Less training time

• Achieved a large improvement

Kim, Jiwon, Jung Kwon Lee, and Kyoung Mu Lee. "Accurate Image Super-Resolution Using Very Deep Convolutional Networks." arXiv preprint arXiv:1511.04587 (2015).

Training: network overview

• Residual learning

• 20 layers

• Large learning rate (0.1)

Some questions

• Why the deeper the better?

• Why is learning the error map better?

• Why a large learning rate, and how is convergence achieved?

• How to choose the filter size?

Why the deeper the better?

• SRCNN has only three layers
– Its authors stopped the training procedure before the networks converged
– A learning rate of 10⁻⁵ is too small for a network to converge within a week on a common GPU

• CNNs exploit spatially local correlation.

• Stacking many layers leads to filters that become increasingly global.
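
One way to quantify "increasingly global" is the receptive field: for stride-1 convolutions, each k×k layer widens it by k−1 pixels. The sketch below compares the three SRCNN layers (9, 1, 5) with a 20-layer stack. The 3×3 filter size for the deep network is an assumption, since the slides do not state it, and the function name is hypothetical.

```python
def receptive_field(kernel_sizes):
    """Receptive field width of stacked stride-1 convolutions."""
    field = 1
    for k in kernel_sizes:
        field += k - 1
    return field

srcnn_field = receptive_field([9, 1, 5])   # three layers, as on the SRCNN slides
deep_field = receptive_field([3] * 20)     # assumed: 20 layers of 3x3 filters
print(srcnn_field, deep_field)
```

Under these assumptions the deep stack sees a 41×41 neighborhood per output pixel, versus 13×13 for SRCNN.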

Why is learning the error map better?

• Advantages
– Converges faster
– Superior results
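
A minimal sketch of the residual (error map) formulation, assuming the network input is the interpolated LR image and the training target is the difference between the HR image and that input. The arrays are synthetic and the function names are hypothetical.

```python
import numpy as np

def residual_target(hr, lr_upscaled):
    """Error map the network is trained to predict."""
    return hr - lr_upscaled

def reconstruct(lr_upscaled, predicted_residual):
    """Final output: the interpolated input plus the predicted error map."""
    return lr_upscaled + predicted_residual

rng = np.random.default_rng(0)
hr = rng.random((8, 8))
# Stand-in for an interpolated LR image: close to the HR image, missing small details.
lr_upscaled = hr + 0.01 * rng.standard_normal((8, 8))
residual = residual_target(hr, lr_upscaled)
output = reconstruct(lr_upscaled, residual)
```

The error map has a much smaller magnitude than the image itself, so the network does not have to carry the input content through all of its layers.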

Why a large learning rate?

• A large learning rate can speed up training.

• SRCNN originally used a learning rate of 0.0001 and a total of 15,000,000 iterations.

• How can convergence be assured?

Large learning rate issue

• Gradient explosion

• Stochastic gradient descent update (momentum $\mu$, learning rate $\alpha$):

$V_{t+1} = \mu V_t - \alpha \nabla L(W_t)$

$W_{t+1} = W_t + V_{t+1}$
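
The update rule can be sketched on a toy problem. The quadratic loss L(w) = 0.5·w², whose gradient is w, and the specific learning rates are assumptions chosen only to show stable versus exploding behavior. The function name is hypothetical.

```python
def sgd_momentum_step(w, v, grad, lr, momentum):
    """One update: V_{t+1} = mu * V_t - alpha * grad, then W_{t+1} = W_t + V_{t+1}."""
    v_next = momentum * v - lr * grad
    w_next = w + v_next
    return w_next, v_next

# Moderate learning rate: w decays towards the minimum at 0.
w, v = 1.0, 0.0
for _ in range(200):
    w, v = sgd_momentum_step(w, v, grad=w, lr=0.1, momentum=0.9)

# Too-large learning rate (no momentum): each step maps w to -2w, so |w| doubles.
w_big, v_big = 1.0, 0.0
for _ in range(20):
    w_big, v_big = sgd_momentum_step(w_big, v_big, grad=w_big, lr=3.0, momentum=0.0)
```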

Gradient Clipping

• Clip the weight-update values to a predefined small range $[-\theta, \theta]$.

• The gradient is multiplied by the learning rate to update the weights:

$V_{t+1} = \mu V_t - \alpha \nabla L(W_t)$

$W_{t+1} = W_t + V_{t+1}$

• The gradients themselves are therefore clipped to $[-\theta/\gamma, \theta/\gamma]$, where $\gamma$ is the current learning rate ($\alpha$ in the update rule).
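
A minimal sketch of this learning-rate-dependent clipping. The values of theta and the learning rate are hypothetical, as is the function name. It shows that after clipping, the step lr × gradient never exceeds theta in magnitude, however large the raw gradient is.

```python
import numpy as np

def clip_gradient(grad, theta, lr):
    """Clip gradients to [-theta/lr, theta/lr] so that |lr * grad| <= theta."""
    bound = theta / lr
    return np.clip(grad, -bound, bound)

grad = np.array([-500.0, -0.5, 0.2, 1200.0])  # two exploding entries, two normal ones
theta, lr = 1.0, 0.1                          # hypothetical values
clipped = clip_gradient(grad, theta, lr)      # bound is theta / lr = 10
step = lr * clipped                           # weight-update magnitudes
```

As the learning rate decays, the bound theta/lr grows, so clipping interferes less late in training.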

Performance of VDSR

• The network trains much faster than SRCNN (4 days → 3 hours)

• Outperforms both bicubic interpolation and SRCNN

• Performs extremely well at sharp-edge refinement

• Roughly +0.5 dB PSNR over SRCNN on average

Figure panels: Bicubic, SRCNN, VDSR

Set5        Bicubic (dB)   SRCNN (dB)   VDSR (dB)
Bird        32.47          34.91        35.70
Head        32.92          33.55        33.68
Woman       28.60          30.92        31.47
Baby        33.95          35.01        35.18
Butterfly   24.05          27.58        28.74
Average     30.39          32.39        32.95

Table 1: Comparison of PSNR using different methods
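
The averages and the gain over SRCNN can be recomputed from the per-image rows. This is a small arithmetic check on the table values, not a new measurement. The variable names are hypothetical.

```python
# Per-image PSNR in dB: (Bicubic, SRCNN, VDSR), copied from the table
psnr_set5 = {
    "Bird": (32.47, 34.91, 35.70),
    "Head": (32.92, 33.55, 33.68),
    "Woman": (28.60, 30.92, 31.47),
    "Baby": (33.95, 35.01, 35.18),
    "Butterfly": (24.05, 27.58, 28.74),
}

# Column-wise means over the five images.
averages = [sum(column) / len(column) for column in zip(*psnr_set5.values())]
gain_over_srcnn = averages[2] - averages[1]
gain_over_bicubic = averages[2] - averages[0]
```

Averages recomputed from the rounded per-image entries can differ from the listed Average row in the last digit.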

Perceptual SR

Johnson, Justin, Alexandre Alahi, and Li Fei-Fei. "Perceptual Losses for Real-Time Style Transfer and Super-Resolution." arXiv preprint arXiv:1603.08155 (2016).

A review on: Perceptual Losses for Real-Time Style Transfer and Super-Resolution

• A high PSNR does not guarantee perceptually pleasing SR results

• Teach the network to redraw!

Network overview

Modified loss function

• Feature Reconstruction Loss

• Style Reconstruction Loss

• Final Loss function
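
A sketch of the two loss terms, assuming they are computed on (C, H, W) feature maps taken from a pretrained network. Random arrays stand in for those features here. The weighted sum in `total_loss` and its default weights are hypothetical placeholders, since the slide's final formula did not survive extraction.

```python
import numpy as np

def feature_reconstruction_loss(feat_out, feat_target):
    """Squared difference between two (C, H, W) feature maps, normalized by C*H*W."""
    return float(np.sum((feat_out - feat_target) ** 2) / feat_out.size)

def gram_matrix(feat):
    """C x C matrix of channel correlations, normalized by C*H*W."""
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)
    return flat @ flat.T / (c * h * w)

def style_reconstruction_loss(feat_out, feat_target):
    """Squared Frobenius norm of the difference between the Gram matrices."""
    diff = gram_matrix(feat_out) - gram_matrix(feat_target)
    return float(np.sum(diff ** 2))

def total_loss(feat_out, feat_target, feature_weight=1.0, style_weight=0.0):
    """Weighted sum of the two terms (the weights are hypothetical)."""
    return (feature_weight * feature_reconstruction_loss(feat_out, feat_target)
            + style_weight * style_reconstruction_loss(feat_out, feat_target))

rng = np.random.default_rng(0)
features = rng.random((4, 5, 5))   # stand-in for target features
shifted = features + 0.1           # stand-in for output features
feat_loss = feature_reconstruction_loss(shifted, features)
gram = gram_matrix(features)
style_same = style_reconstruction_loss(features, features)
```

Both terms compare images in feature space rather than pixel space, which is why an output can score well perceptually without maximizing PSNR.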

Experiment results

Matched Filter Based SR

Figure labels: input patch, match filters, conv results, ReLU, SVR, final prediction (dimension labels: 21, 100)

Matched filters

Figure panels: Bicubic patch, prediction, Original patch

• Cluster patches based on prediction and patch similarity
