single image super resolution - university of southern...
TRANSCRIPT
Outline
1
• Definition of Super Resolution
• Multi-image SR
• Single-Image SR
Introduction
• Match filter
Current Work
• Example based
• SRCNN
• VDSR
• Perceptual SR
Literature
Outline
• Definition of Super Resolution
• Multi-image SR
• Single-Image SR
Introduction
• Match filter
Current Work
• Example based
• SRCNN
• VDSR
• Perceptual SR
Literature
What is Super Resolution?
Definition: Generating high resolution (HR) image with more details using one or more low resolution (LR) images.
3
Multi-image
Source: [Park et al. SPM 2003]
Single-image
Source: [Freeman et al. CG&A 2002]
Multi-image vs. Single-image
87
• Several images of the same scenery.• Each image has different information of the same scenery.
Multi-image SR
8
Single Image SR
Learning to map from low-res to high-res patches
• Nearest neighbor [Freeman et al. CG&A 02]
• Neighborhood embedding [Chang et a. CVPR 04]
• Sparse representation [Yang et al. TIP 10]
• Kernel ridge regression [Kim and Kwon PAMI 10]
• Locally-linear regression [Yang and Yang ICCV 13] [Timofte et al.
ACCV 14]
• Convolutional neural network [Dong et al. ECCV 14]
• Random forest [Schulter et al. CVPR 15]
10
Outline
12
• Definition of Super Resolution
• Multi-image SR
• Single-Image SR
Introduction
• Match filter
Current Work
• Example based
• SRCNN
• VDSR
• Perceptual SR
Literature
Example Based SR
Correspondences between low and high resolution image patches are learned from a database of low and high resolution image.
13
Example Based SR
• Super-Resolution from Transformed Self-Exemplars
Huang, J.B., Singh, A. and Ahuja, N., 2015, June. Single image super-resolution from transformed
self-exemplars. In Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on (pp. 5197-5206). IEEE.
15
LR input image Matching error
LR patch HR patch
Translation
Perspective
Ground truth LR/HR patch
Example Based SR
1716
17
SRCNN
Dong, C., Loy, C.C., He, K. and Tang, X., 2014. Learning a deep convolutional network for image super-resolution. In Computer Vision–ECCV 2014 (pp. 184-199). Springer International Publishing.
A review on: Image Super-Resolution using deep CNN
• First super resolution topic using CNN• Achieved significant improvement
• Patch extraction & representation- 64 filters of size Cxf1xf1 (1x9x9) 64*1*9*9=5184
• Non-linear mapping- 32 filters of size n1xf2xf2 (64x1x1) 32*64*1*1=2048
• Reconstruction-1 filters of size n2xf3xf3 (32x5x5) 1*32*5*5=800
19
Training: filter size and numbers
• Training dataset- 91 images- only use Y channel
• Input patch extraction - extract patches from bicubic interpolated LR images
- input patches are of size 33x33
- normalize pixel value to 0~1
• Target patch extraction- extract patches from original HR images
- target patches are of size 21x21- normalize pixel value to 0~1
20
Training: preprocessing
g and h are like Laplacian/Gaussian filtersa – e are like edge detectors at different directionsf is like a texture extractor
22
Training: filter visualization
SRCNN Reconstruction
PSNR 30.60 dB SSIM 0.9572
SRCNN Reconstruction
PSNR 30.90 dB SSIM 0.9599
Author-given training dataset Specific Hair training dataset
35
SRCNN trained on hair dataset
orginal
Specific Hair training datasetSRCNN Reconstruction
Original
Author-given training dataset
32.38 dB
SRCNN Reconstruction
32.45 dB0.85310.8511
36
SRCNN trained on hair dataset
orginal
SRCNN Reconstruction
Original
SRCNN Reconstruction
Author-given training dataset
Specific Hair training dataset
0.824030.21dB30.19 dB
0.8256
37
SRCNN trained on hair dataset
VDSR
A review on: Accurate Image Super-Resolution Using Very Deep Convolutional Networks
• Second super resolution topic using CNN with deeper network
• Less training iteration• Less training time• Achieved huge improvement
Kim, Jiwon, Jung Kwon Lee, and Kyoung Mu Lee. "Accurate Image Super-Resolution
Using Very Deep Convolutional Networks." arXiv preprint arXiv:1511.04587 (2015).
38
• Why the deeper the better?
• Why learning error map is better?
• Why large learning rate? And how to achieve convergence?
• How to choose filter size?
40
Some questions
• SRCNN only has three layers– They stopped the training procedure before networks converged
– learning rate 10−5 is too small for a network to converge within a week on a commonGPU.
• CNN exploit spatially-local correlation.
• Stacking many layers leads to filters that become increasingly global.
41
Why the deeper the better?
• large learning rate can boost training
• Initially in SRCNN, the learning rate is 0.0001, and the total iteration is 15,000,000.
• How to assure convergence?
43
Why large learning rate?
• Gradient Explosion
• stochastic gradient descent equation:𝑉𝑡+1 = 𝜇𝑉𝑡 − 𝛼𝛻𝐿 𝑊𝑡
𝑊𝑡+1 = 𝑊𝑡 + 𝑉𝑡+1
44
12
34
Large learning rate issue
• Clip weight updates value to a predefined small range −𝜃, 𝜃 .
• Since gradient is multiplied by learning rate to update weights.
𝑉𝑡+1 = 𝜇𝑉𝑡 − 𝛼𝛻𝐿 𝑊𝑡
𝑊𝑡+1 = 𝑊𝑡 + 𝑉𝑡+1
• Gradients are actually clipped to
[−𝜃
𝛾,𝜃
𝛾]
• where 𝛾 is the current learning rate.
45
Gradient Clipping
• Network trained much faster than SRCNN (4 days ->3 hours)
• Superior performance than bicubic and SRCNN
Bicubic SRCNN VDSR
46
Performance of VDSR
• +0.5 dB
48
(Cont’d)
Set5 Bicubic (dB) SRCNN (dB) VDSR (dB)
Bird 32.47 34.91 35.70
Head 32.92 33.55 33.68
Woman 28.60 30.92 31.47
Baby 33.95 35.01 35.18
Butterfly 24.05 27.58 28.74
Average 30.39 32.39 32.95
Table 1: Comparison of PSNR using different methods
Perceptual SR
Johnson, Justin, Alexandre Alahi, and Li Fei-Fei. "Perceptual Losses for Real-Time
Style Transfer and Super-Resolution." arXiv preprint arXiv:1603.08155 (2016).
A review on: Perceptual Losses for Real-Time Style Transfer and Super-Resolution
• PSNR doesn’t guarantee perceptually comfortable SR• Teach the network to redraw!
49
Outline
56
• Definition of Super Resolution
• Multi-image SR
• Single-Image SR
Introduction
• Match filter
Current Work
• Example based
• SRCNN
• VDSR
• Perceptual SR
Literature
57
Matched Filter Based SR
10021
21
Input patch
100
ReLu
conv results
21
21
21
21
21
⁞
⁞
⁞
⁞
SVR
Match filters
Final prediction