

Path-Restore: Learning Network Path Selection for Image Restoration

Ke Yu¹  Xintao Wang¹  Chao Dong²  Xiaoou Tang¹  Chen Change Loy³

¹ CUHK - SenseTime Joint Lab, The Chinese University of Hong Kong
² Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences
³ Nanyang Technological University, Singapore

{yk017, wx016, xtang}@ie.cuhk.edu.hk [email protected] [email protected]

Abstract

Very deep Convolutional Neural Networks (CNNs) have greatly improved the performance on various image restoration tasks. However, this comes at the price of an increasing computational burden, which limits their practical use. We believe that some corrupted image regions are inherently easier to restore than others, since the distortion and content vary within an image. To this end, we propose Path-Restore, a multi-path CNN with a pathfinder that can dynamically select an appropriate route for each image region. We train the pathfinder using reinforcement learning with a difficulty-regulated reward, which is related to the performance, the complexity and "the difficulty of restoring a region". We conduct experiments on denoising and mixed restoration tasks. The results show that our method achieves comparable or superior performance to existing approaches at a lower computational cost. In particular, our method is effective for real-world denoising, where the noise distribution varies across different regions of a single image. We surpass the state-of-the-art CBDNet [13] by 0.94 dB and run 29% faster on the realistic Darmstadt Noise Dataset [27]. Models and codes will be released.¹

1. Introduction

Image restoration aims at estimating a clean image from its distorted observation. Very deep Convolutional Neural Networks (CNNs) have achieved great success on various restoration tasks. For example, in the NTIRE 2018 Challenge [32], the top-ranking super-resolution methods are built on very deep models such as EDSR [23] and DenseNet [15], achieving impressive performance even when the input images are corrupted with noise and blur. However, as the network becomes deeper, the increasing computational cost makes it less practical for real-world applications.

¹ https://yuke93.github.io/Path-Restore/

[Figure 1: left, a plot titled "Performance-Complexity Trade-Off" showing MSE loss (×10^-3) against network depth (3, 5, 12, 20) for three regions: smooth region with mild noise, smooth region with severe noise, and texture region with severe noise. Right, zoomed-in visual results for the input and for the 5-, 12- and 20-layer CNNs.]

Figure 1. The relation between performance and network depth for different image regions. The noise level is gradually increasing from the left to the right side. The MSE loss is shown on the left while visual results are presented on the right.

Do we really need a very deep CNN to process all the regions of a distorted image? In reality, a large variation of image content and distortion may exist in a single image, and some regions are inherently easier to process than others. An example is shown in Figure 1, where we use several denoising CNNs with different depths to restore a noisy image. A smooth region with mild noise (red box) is well restored by a 5-layer CNN, and further increasing the network depth hardly brings any improvement. In the green box, a similar smooth region with severe noise requires a 12-layer CNN, and a texture region in the blue box still has some artifacts after being processed by a 20-layer CNN. This observation motivates the possibility of saving computations by selecting an optimal network path for each region based on its content and distortion.

In this study, we propose Path-Restore, a novel framework that can dynamically select a path for each image region with specific content and distortion. In particular, a pathfinder is trained together with a multi-path CNN to find the optimal dispatch policy for each image region. Since path selection is non-differentiable, the pathfinder is trained in a reinforcement learning (RL) framework driven by a reward that encourages both high performance and few computations. There are mainly two challenges. First, the multi-path CNN should be able to handle a large variety of distortions with limited computations. Second, the pathfinder requires an informative reward to learn a good policy for different regions with diverse contents and distortions.

arXiv:1904.10343v1 [cs.CV] 23 Apr 2019

We make two contributions to address these challenges:

(1) We devise a dynamic block as the basic unit of our multi-path CNN. A dynamic block contains a shared path followed by several dynamic paths with different complexity. The proposed framework allows flexible design of the number of paths and the path complexity according to the specific task at hand.

(2) We devise a difficulty-regulated reward, which, unlike conventional reward functions, values the performance gain of hard examples more than that of simple ones. Specifically, the difficulty-regulated reward scales the performance gain of different regions by an adaptive coefficient that represents the difficulty of restoring the region. The difficulty is made proportional to a loss function (e.g., MSE loss), since a large loss suggests that the region is difficult to restore, such as the region in the blue box of Figure 1. We experimentally find that this reward helps the pathfinder learn a reasonable dispatch policy.

The proposed method differs significantly from several works that explore ways to process an input image dynamically. Yu et al. [39] propose RL-Restore, which dynamically selects a sequence of small CNNs to restore a distorted image. Wu et al. [37] and Wang et al. [33] propose to dynamically skip some blocks in ResNet [14] for each image, achieving fast inference while maintaining good performance for the image classification task. These existing dynamic networks treat all image regions equally, and thus cannot satisfy our demand to select different paths for different regions. We provide more discussion in the related work section.

We summarize the merits of Path-Restore as follows: 1) Thanks to the dynamic path-selection mechanism, our method achieves comparable or superior performance to existing approaches with faster speed on various restoration tasks. 2) Our method is particularly effective in real-world denoising, where the noise distribution is spatially variant within an image. We achieve state-of-the-art performance on the realistic Darmstadt Noise Dataset (DND) [27] benchmark.² 3) Our method is capable of balancing the trade-off between performance and complexity by simply adjusting a parameter in the proposed reward function during training. 4) After RL-Restore [39], this is another successful attempt at combining reinforcement learning and deep learning to solve image restoration problems.

² https://noise.visinf.tu-darmstadt.de/benchmark/#results_srgb

2. Related Work

CNN for Image Restoration. Image restoration has been extensively studied for decades. CNN-based methods have achieved dramatic improvement in several restoration tasks including denoising [22, 16, 6, 13], deblurring [38, 30, 26], deblocking [8, 35, 11, 12] and super-resolution [9, 18, 19, 31, 21, 34].

Although most of the previous papers focus on addressing a specific degradation, several pioneering works aim at dealing with a large variety of distortions simultaneously. For example, Zhang et al. [40] propose DnCNN, which employs a single CNN to address different levels of Gaussian noise. Guo et al. [13] develop CBDNet, which estimates a noise map as a condition to handle diverse real noise. However, processing all the image regions using a single path makes these methods less efficient: smooth regions with mild distortions do not need such a deep network to restore.

Yu et al. [39] propose RL-Restore to address mixed distortions. RL-Restore is the seminal work that applies deep reinforcement learning to image restoration problems. Their method adaptively selects a sequence of CNNs to process each distorted image, achieving comparable performance with a single deep network while using fewer computational resources. Unlike RL-Restore, which requires the manual design of a dozen CNNs, we use only one network to achieve dynamic processing. Our method can thus be more easily migrated and generalized to various tasks. We also empirically find that our method runs faster and produces more spatially consistent results than RL-Restore when processing a large image. Moreover, our framework enjoys the flexibility to balance the trade-off between performance and complexity by merely adjusting the reward function, which is not possible in Yu et al. [39].

Dynamic Networks. Dynamic networks have been investigated to achieve a better trade-off between speed and performance in different tasks. Bengio et al. [5] use conditional computation to selectively activate one part of the network at a time. Recently, several approaches [10, 37, 33] have been proposed to save computational cost in ResNet [14] by dynamically skipping some residual blocks. While the aforementioned methods mainly address one single task, Rosenbaum et al. [28] develop a routing network to facilitate multi-task learning, and their method seeks an appropriate network path for each specific task. However, the path selection is irrelevant to the image content in each task.

Existing dynamic networks successfully explore a routing policy that depends on either the whole image content or the task to address. However, we aim at selecting dynamic paths to process different regions of a distorted image. In this case, both the content (e.g., smooth regions or rich textures) and the restoration task (e.g., distortion type and severity) have a weighty influence on the performance, and they should both be considered in the path selection. To address this specific challenge, we propose a dynamic block to offer diverse paths and adopt a difficulty-regulated reward to effectively train the pathfinder. The contributions are new in the literature.

[Figure 2: diagram of the framework. The input passes through a convolutional layer, a chain of dynamic blocks and a final convolutional layer to produce the output, which is compared with the ground truth (GT) through an L2 loss that also feeds the reward. Each dynamic block contains shared convolutional layers followed by several optional paths, one of which is chosen by the pathfinder. The pathfinder consists of convolutional layers, a fully-connected layer, an LSTM and a fully-connected layer. Arrows mark the forward, path-selection and backward flows.]

Figure 2. Framework Overview. Path-Restore is composed of a multi-path CNN and a pathfinder. The multi-path CNN contains N dynamic blocks, each of which has M optional paths. The number of paths is made proportional to the number of distortion types we aim to address. The pathfinder is able to dynamically select paths for different image regions according to their contents and distortions.

3. Methodology

Our task is to recover a clear image y from its distorted observation x. There might exist multiple types of distortions (e.g., blur and noise), and each distortion may have different degrees of severity. We wish our method to scale to complex restoration tasks both effectively and efficiently. Therefore, the challenges are mainly two-fold. First, the network should be capable of addressing a large variety of distortions efficiently. Second, the pathfinder requires an informative reward to learn an effective dispatch policy based on the content and distortion of each image region.

We propose Path-Restore to address the above challenges. We first provide an overview of the architecture of Path-Restore in Sec. 3.1. In Sec. 3.2, we then introduce the structure of the pathfinder and the reward used to evaluate its policy. Finally, we detail the training algorithm in Sec. 3.3.

3.1. Architecture of Path-Restore

We aim to design a framework that can offer different options of complexity. To this end, as shown in Figure 2, Path-Restore is composed of a convolutional layer at the start- and end-point, and N dynamic blocks in the middle. In the i-th dynamic block $f^i_{DB}$, there is a shared path $f^i_S$ that every image region should pass through. In parallel with the shared path, a pathfinder $f_{PF}$ generates a probability distribution over the plausible paths for selection. Following the shared path, there are M dynamic paths denoted by $f^i_1, f^i_2, \ldots, f^i_M$. Each dynamic path contains a residual block, except that the first path is a bypass connection. According to the output of the pathfinder, the dynamic path with the highest probability is activated, where the path index is denoted by $a_i$. Therefore, the i-th dynamic block can be formulated as:

$$x_{i+1} = f^i_{a_i}\big(f^i_S(x_i)\big), \tag{1}$$

where $x_i$ and $x_{i+1}$ denote the input and output of the i-th dynamic block, respectively. Note that in two different dynamic blocks, the parameters of each corresponding path are different, while the parameters of the pathfinder are shared.

As can be seen from Eq. (1), the image features go through the shared path and one dynamic path. The shared path is designed to explore the similarity among different tasks and images. The dynamic paths offer different options of complexity. Intuitively, simple samples should be led to the bypass path to save computations, while hard examples might be guided to another path according to their content and distortion. The number of dynamic paths can be designed based on the complexity of distortion types in a specific task. We show various options in our experiments.
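As a minimal illustration of Eq. (1), the sketch below routes a feature vector through a shared path and then through the most probable dynamic path. It is not the authors' implementation: features are plain lists of floats, and `bypass`, `make_residual_path` and `dynamic_block` are hypothetical names in which a simple scaling stands in for the convolutional layers of a residual block.

```python
from typing import Callable, List, Sequence

Features = List[float]                      # stand-in for a feature map
Path = Callable[[Features], Features]


def bypass(features: Features) -> Features:
    """First dynamic path: a bypass connection with no extra computation."""
    return features


def make_residual_path(scale: float) -> Path:
    """Hypothetical residual block x + g(x), with g(x) = scale * x as a toy stand-in."""
    def path(features: Features) -> Features:
        return [x + scale * x for x in features]
    return path


def dynamic_block(features: Features, shared: Path,
                  paths: Sequence[Path], probs: Sequence[float]) -> Features:
    """Eq. (1): apply the shared path, then the dynamic path with the highest probability."""
    if len(paths) != len(probs):
        raise ValueError("one probability per dynamic path is required")
    chosen = max(range(len(probs)), key=lambda m: probs[m])   # 0-based path index
    return paths[chosen](shared(features))


# Toy usage: the pathfinder output is given directly as a probability vector.
shared_path = lambda f: [2.0 * x for x in f]
paths = [bypass, make_residual_path(0.5)]
easy_out = dynamic_block([1.0, 2.0], shared_path, paths, probs=[0.9, 0.1])
hard_out = dynamic_block([1.0, 2.0], shared_path, paths, probs=[0.2, 0.8])
```

In this toy setting an "easy" region only pays for the shared path, while a "hard" region additionally pays for the residual path, which is the source of the computational saving described above.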

3.2. Pathfinder

The pathfinder is the key component to achieve path selection. As shown in Figure 2, the pathfinder contains C convolutional layers, followed by two fully-connected layers with a Long Short-Term Memory (LSTM) module in the middle. The number of convolutional layers depends on the specific task, which will be specified in Sec. 4. The LSTM module is used to capture the correlations of path selection in different dynamic blocks. The pathfinder accounts for less than 3% of the overall computations.

Since path selection is non-differentiable, we formulate the sequential path selection as a Markov Decision Process (MDP) and adopt reinforcement learning to train the pathfinder. We will first clarify the state and action of this MDP, and then illustrate the proposed difficulty-regulated reward to train the pathfinder.

State and Action. In the i-th dynamic block, the state $s_i$ consists of the input features $x_i$ and the hidden state of the LSTM $h_i$, denoted by $s_i = \{x_i, h_i\}$. Given the state $s_i$, the pathfinder $f_{PF}$ generates a distribution over path selections, which can be formulated as $f_{PF}(s_i) = \pi(a \mid s_i)$. In the training phase, the action (path index) is sampled from this distribution, denoted by $a_i \sim \pi(a \mid s_i)$. In the testing phase, the action is determined by the highest probability, i.e., $a_i = \arg\max_a \pi(a \mid s_i)$.

Difficulty-Regulated Reward. In an RL framework, the pathfinder is trained to maximize a cumulative reward, and thus a proper design of the reward function is critical. Existing dynamic models for classification [37, 33] usually use a trade-off between accuracy and computations as the reward. However, in our task, a more effective reward is required to learn a reasonable dispatch policy for different image regions that have diverse contents and distortions. Therefore, we propose a difficulty-regulated reward that not only considers performance and complexity, but also depends on "the difficulty of restoring an image region". In particular, the reward at the i-th dynamic block is formulated as:

$$
r_i =
\begin{cases}
-p\,\big(1 - \mathbf{1}_{\{1\}}(a_i)\big), & 1 \le i < N, \\
-p\,\big(1 - \mathbf{1}_{\{1\}}(a_i)\big) + d\,(-\Delta L_2), & i = N,
\end{cases}
\tag{2}
$$

where $p$ is the reward penalty for choosing a complex path in one dynamic block. The $\mathbf{1}_{\{1\}}(\cdot)$ represents an indicator function and $a_i$ is the selected path index in the i-th dynamic block. When the first path (bypass connection) is selected, i.e., $a_i = 1$, the reward is not penalized. The performance and difficulty are only considered in the N-th dynamic block. In particular, $-\Delta L_2$ represents the performance gain in terms of L2 loss. The difficulty is denoted by $d$ in the following formula:

$$
d =
\begin{cases}
L_d / L_0, & 0 \le L_d < L_0, \\
1, & L_d \ge L_0,
\end{cases}
\tag{3}
$$

where $L_d$ is the loss function of the output image and $L_0$ is a threshold. As $L_d$ approaches zero, the difficulty $d$ decreases, indicating that the input region is easy to restore. In this case, the performance gain is regulated by $d < 1$ and the reward also becomes smaller, so that the proposed difficulty-regulated reward penalizes the pathfinder for wasting too many computations on easy regions. There are multiple choices for $L_d$, such as L2 loss and VGG loss [17], and different loss functions may lead to different path-selection policies.
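The reward in Eqs. (2) and (3) can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the released code: the function names are hypothetical, path indices are 1-based as in the text, and `delta_l2` is assumed to be the change in L2 loss (negative when the output improves), so that `-delta_l2` is the performance gain.

```python
from typing import List, Sequence


def difficulty(loss_d: float, threshold: float) -> float:
    """Eq. (3): d = L_d / L_0 if 0 <= L_d < L_0, and d = 1 otherwise."""
    if loss_d < 0 or threshold <= 0:
        raise ValueError("loss must be non-negative and threshold positive")
    return loss_d / threshold if loss_d < threshold else 1.0


def path_rewards(actions: Sequence[int], penalty: float, delta_l2: float,
                 loss_d: float, threshold: float) -> List[float]:
    """Eq. (2): rewards r_1..r_N for one region.

    actions  -- selected path indices a_1..a_N, where 1 is the bypass connection
    penalty  -- reward penalty p for choosing a non-bypass path
    delta_l2 -- assumed change in L2 loss (negative means improvement)
    """
    rewards = [0.0 if a == 1 else -penalty for a in actions]
    # Performance and difficulty only enter the reward of the last block.
    rewards[-1] += difficulty(loss_d, threshold) * (-delta_l2)
    return rewards


# Toy usage with the denoising settings of Table 1 (p = 8e-6, L0 = 5e-4).
example = path_rewards([1, 2, 1], penalty=8e-6, delta_l2=-2e-4,
                       loss_d=2.5e-4, threshold=5e-4)
```

With these numbers the region counts as "half difficult" (d = 0.5), so only half of its performance gain is credited, while the single non-bypass choice costs one penalty p.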

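Algorithm 1 REINFORCE (1 update)

  Get a batch of K image pairs, $\{x^{(1)}, y^{(1)}\}, \cdots, \{x^{(K)}, y^{(K)}\}$
  With policy $\pi$, derive $(s^{(k)}_1, a^{(k)}_1, r^{(k)}_1, \ldots, s^{(k)}_N, a^{(k)}_N, r^{(k)}_N)$
  Compute gradients using REINFORCE:

  $$\Delta\theta = \frac{1}{K} \sum_{k=1}^{K} \sum_{i=1}^{N} \nabla_\theta \log \pi\big(a^{(k)}_i \mid s^{(k)}_i; \theta\big) \Big( \sum_{j=i}^{N} r^{(k)}_j - b^{(k)} \Big)$$

  Update parameters $\theta \leftarrow \theta + \beta \Delta\theta$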
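To make the update in Algorithm 1 concrete, the following sketch applies it to a deliberately simplified policy. As an assumption made only for illustration, the policy is a softmax over per-block logits `theta[i][m]` that ignores the state; the actual pathfinder is a CNN with an LSTM. The names `softmax` and `reinforce_update` are hypothetical, and actions are 0-based here.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_update(theta: List[List[float]],
                     episodes: Sequence[Tuple[Sequence[int], Sequence[float]]],
                     baselines: Sequence[float],
                     lr: float) -> List[List[float]]:
    """One REINFORCE update for a toy, state-independent softmax policy.

    theta[i][m]  -- logit of path m at dynamic block i
    episodes[k]  -- (actions, rewards) of sample k, one entry per block
    baselines[k] -- baseline reward b^(k) of sample k
    lr           -- learning rate (beta in Algorithm 1)
    """
    grad = [[0.0] * len(row) for row in theta]
    for (actions, rewards), baseline in zip(episodes, baselines):
        for i, row in enumerate(theta):
            weight = sum(rewards[i:]) - baseline       # sum_{j>=i} r_j - b
            probs = softmax(row)
            for m, prob in enumerate(probs):
                # Gradient of log pi(a_i) w.r.t. logit m for a softmax policy.
                dlogp = (1.0 if m == actions[i] else 0.0) - prob
                grad[i][m] += weight * dlogp / len(episodes)
    # Gradient ascent step: theta <- theta + lr * grad.
    return [[t + lr * g for t, g in zip(t_row, g_row)]
            for t_row, g_row in zip(theta, grad)]
```

A positive return above the baseline raises the logit of the chosen path and lowers the others; a return equal to the baseline leaves the policy unchanged, which is how the baseline reduces gradient variance.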

3.3. Training Algorithm

The training process is composed of two stages. In the first stage, we train the multi-path CNN with a random policy of path selection as a good initialization. In the second stage, we then train the pathfinder and the multi-path CNN simultaneously, so that the two components are better associated and optimized.

The First Stage. We first train the multi-path CNN with randomly selected routes. The loss function consists of two parts. First, a final L2 loss constrains the output images to have a reasonable quality. Second, an intermediate loss enforces the states observed by the pathfinder at each dynamic block to be consistent. Specifically, the loss function is formulated as:

$$L_1 = \|\hat{y} - y\|_2^2 + \alpha \sum_{i=1}^{N} \|y - f_{last}(x_i)\|_2^2, \tag{4}$$

where $\hat{y}$ and $y$ denote the output image and the ground truth, respectively. The $N$ represents the number of dynamic blocks. The last convolutional layer of Path-Restore is denoted by $f_{last}$, and $\alpha$ is the weight of the intermediate loss. If no intermediate loss is adopted, the features observed at different dynamic blocks may vary a lot from each other, which makes it more difficult for the pathfinder to learn a good dispatch policy.

The Second Stage. We then train the pathfinder together with the multi-path CNN. In this stage, the multi-path CNN is trained using the loss function in Eq. (4) with $\alpha = 0$, i.e., the intermediate loss does not take effect. Meanwhile, we train the pathfinder using the REINFORCE algorithm [36], as shown in Algorithm 1. The parameters of the pathfinder and the learning rate are denoted by $\theta$ and $\beta$, respectively. We adopt an individual baseline reward $b^{(k)}$ to reduce the variance of gradients and stabilize training. In particular, $b^{(k)}$ is the reward of the k-th image when the bypass connection is selected in all dynamic blocks.
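The first-stage loss of Eq. (4) can be written out as a short sketch. The code is illustrative only: images are flattened into lists of floats, `intermediate_outputs[i]` is assumed to already hold the image obtained by applying the last convolutional layer to the features at block i, and both function names are hypothetical.

```python
from typing import Sequence


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared L2 distance between two flattened images."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    return sum((u - v) ** 2 for u, v in zip(a, b))


def first_stage_loss(output: Sequence[float], target: Sequence[float],
                     intermediate_outputs: Sequence[Sequence[float]],
                     alpha: float) -> float:
    """Eq. (4): final L2 loss plus alpha times the summed intermediate losses."""
    final_term = squared_l2(output, target)
    intermediate_term = sum(squared_l2(target, z) for z in intermediate_outputs)
    return final_term + alpha * intermediate_term


# Toy usage: alpha = 0.1 in the first stage, alpha = 0 in the second stage.
stage_one = first_stage_loss([1.0, 2.0], [1.0, 1.0],
                             [[0.0, 1.0], [1.0, 3.0]], alpha=0.1)
stage_two = first_stage_loss([1.0, 2.0], [1.0, 1.0],
                             [[0.0, 1.0], [1.0, 3.0]], alpha=0.0)
```

Setting `alpha` to zero recovers the plain L2 loss used in the second stage, so the same function covers both training stages.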

4. Experiments

In this section, we first clarify the implementation details. In Sec. 4.1, we present the results of blind Gaussian denoising. In Sec. 4.2, we then show the capability of Path-Restore to address a mix of complex distortions. In Sec. 4.3, we further demonstrate that our method performs well on real-world denoising. Finally, in Sec. 4.4, we offer an ablation study to validate the effectiveness of the network architecture and the difficulty-regulated reward.

Table 1. Settings of architecture and reward for different tasks.

Task        Architecture     Reward
            N    M    C      Threshold (L0)   Penalty (p)
Denoising   6    2    2      5×10^-4          8×10^-6
Mixed       5    4    4      0.01             4×10^-5

Network Architecture. As shown in Table 1, we select different architectures for different restoration tasks. Specifically, we employ N = 6 dynamic blocks and M = 2 paths for all the denoising tasks, and the pathfinder has C = 2 convolutional layers. For the task of addressing mixed distortions, we adopt N = 5 dynamic blocks, M = 4 paths, and C = 4 convolutional layers in the pathfinder. These parameters are chosen empirically on the validation set for good performance and also for fair comparison with baseline methods.

Training Details. As shown in Table 1, the threshold of difficulty $L_0$ and the reward penalty $p$ are set based on the specific task. Generally, a larger threshold and reward penalty are required when the distortions are more severe. For all tasks, the difficulty $d$ is measured by L2 loss. In the first training stage, the coefficient of the intermediate loss $\alpha$ is set to 0.1. Both training stages take 800k iterations. Training images are cropped into 63×63 patches and the batch size is 32. The learning rate is initially $2 \times 10^{-4}$ and is halved after every quarter of the training process. We use Adam [20] as the optimizer and implement our method in TensorFlow [1].

Evaluation Details. During testing, each image is split into 63×63 regions with a stride of 53. After being processed by the multi-path CNN, all the regions are merged into a large image with overlapping pixels averaged. In each task, we report the number of floating point operations (FLOPs) that are required to process a 63×63 region on average. We also record the CPU³ runtime to process a 512×512 image.
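The region-wise evaluation described above (63×63 regions, stride 53, overlapping pixels averaged) can be sketched as follows. The text does not say how regions at the image border are placed, so aligning the last tile with the border is an assumption of this sketch; `restore` is a hypothetical callable standing in for the multi-path CNN, and images are nested lists of floats.

```python
from typing import Callable, List

Image = List[List[float]]


def tile_starts(length: int, size: int = 63, stride: int = 53) -> List[int]:
    """Start offsets of tiles covering [0, length); last tile aligned to the border (assumption)."""
    if length <= size:
        return [0]
    starts = list(range(0, length - size + 1, stride))
    if starts[-1] + size < length:
        starts.append(length - size)
    return starts


def restore_by_regions(image: Image, restore: Callable[[Image], Image],
                       size: int = 63, stride: int = 53) -> Image:
    """Restore each region independently and average the overlapping pixels."""
    height, width = len(image), len(image[0])
    acc = [[0.0] * width for _ in range(height)]   # running sum of outputs
    cnt = [[0] * width for _ in range(height)]     # number of tiles per pixel
    for top in tile_starts(height, size, stride):
        for left in tile_starts(width, size, stride):
            patch = [row[left:left + size] for row in image[top:top + size]]
            for dy, out_row in enumerate(restore(patch)):
                for dx, value in enumerate(out_row):
                    acc[top + dy][left + dx] += value
                    cnt[top + dy][left + dx] += 1
    return [[a / c for a, c in zip(a_row, c_row)]
            for a_row, c_row in zip(acc, cnt)]
```

With a stride of 53, neighbouring regions overlap by 10 pixels, and averaging the overlap softens seams between regions that were routed through different paths.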

4.1. Evaluation on Blind Gaussian Denoising

Training and Testing Details. We first conduct experiments on a blind denoising task where the distortion is Gaussian noise with a wide range of standard deviations [0, 50]. The training dataset includes the first 750 images of the DIV2K [3] training set (denoted by DIV2K-T750) and 400 images used in [7]. For fair comparison, we use the officially released codes to train DnCNN [40] with the same training dataset as ours. Following [40], we test our model on CBSD68⁴, and we further evaluate on the remaining 50 images in the DIV2K training set (denoted by DIV2K-T50). In addition to uniform Gaussian noise, we also conduct experiments on spatially variant Gaussian noise using the settings in [41]. Specifically, "linear" denotes that the Gaussian noise level gradually increases from 0 to 50 from the left to the right side of an image. Another spatially variant noise, denoted by "peaks", is obtained by translating and scaling Gaussian distributions as in [41], and the noise level also ranges from 0 to 50.

³ Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz
⁴ Color images of BSD68 dataset [29].

Quantitative and Qualitative Results. We report the PSNR performance and average FLOPs in Table 2. For both uniform and spatially variant denoising on different datasets, Path-Restore achieves about a 26% speed-up over DnCNN [40] with comparable performance. The acceleration on DIV2K-T50 is slightly larger than that on CBSD68, because high-resolution images in DIV2K-T50 tend to have more smooth regions that can be restored with short network paths. When processing a 512×512 image, the CPU runtime of DnCNN is 3.16s, while the runtime of Path-Restore is 2.37s and 2.99s given σ = 10 and σ = 50, respectively. In practice, Path-Restore is about 20% faster than DnCNN on average.
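The "linear" spatially variant setting from the training and testing details above can be illustrated with a short sketch. It only reflects the description in the text (noise level rising from 0 to 50 from left to right) and is not the exact protocol of [41]; the function names are hypothetical, and pixel values are assumed to lie in [0, 255] with no clipping applied.

```python
import random
from typing import List


def linear_sigma_map(width: int, sigma_min: float = 0.0,
                     sigma_max: float = 50.0) -> List[float]:
    """Noise standard deviation per column, increasing linearly from left to right."""
    if width < 1:
        raise ValueError("width must be positive")
    if width == 1:
        return [sigma_min]
    step = sigma_max - sigma_min
    return [sigma_min + step * x / (width - 1) for x in range(width)]


def add_linear_noise(image: List[List[float]],
                     rng: random.Random) -> List[List[float]]:
    """Add zero-mean Gaussian noise whose level depends on the column index."""
    sigmas = linear_sigma_map(len(image[0]))
    return [[pixel + rng.gauss(0.0, sigma) for pixel, sigma in zip(row, sigmas)]
            for row in image]


# Toy usage on a flat gray image; the seed is fixed only for repeatability.
clean = [[128.0] * 5 for _ in range(3)]
noisy = add_linear_noise(clean, random.Random(0))
```

The leftmost column receives noise with zero standard deviation and therefore stays clean, which matches the behaviour in Figure 5, where short paths are chosen on the left and longer paths toward the right.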

Several qualitative results of spatially variant (type "peaks") Gaussian denoising are shown in Figure 3. Although most of the restored images are visually comparable, Path-Restore achieves better results on some regions with fine-grained textures, since Path-Restore can select more complex paths to process hard examples with detailed textures.

Path Selection. We first visualize the path selection for uniform Gaussian denoising in Figure 4. The input noisy images are shown in the first row, and the policies under $p = 8 \times 10^{-6}$ and $p = 5 \times 10^{-6}$ are presented in the second and third rows, respectively. The green color represents a short and simple path, while the red color stands for a long and complex path.

It is observed that a larger reward penalty $p$ leads to a shorter path, as the pathfinder learns to avoid choosing overly complex paths that have a very high reward penalty. Although the noise level is uniform within each image, the pathfinder learns a dispatch policy that depends on the image content. For example, the pathfinder focuses on processing the subject of an image (e.g., the airplane and penguin in the left two images) while assigning the smooth background to simple paths.

The dispatch policy for spatially variant (type "linear") Gaussian denoising is shown in Figure 5. As the noise level gradually increases from left to right, the chosen paths become more and more complex. This demonstrates that the pathfinder learns a dispatch policy based on distortion level. Moreover, on each blue dashed line, the noise level remains the same but the path selection is quite diverse, again demonstrating that the pathfinder can select short paths for smooth regions while choosing long paths for texture regions.

Table 2. PSNR and average FLOPs of blind Gaussian denoising on CBSD68 and DIV2K-T50 datasets. The unit of FLOPs is Giga (×10^9). Path-Restore is consistently 25%+ faster than DnCNN with comparable performance on different noise settings.

Dataset          Noise               Setting   DnCNN [40]   Path-Restore
CBSD68 [29]      uniform             σ=10      36.07        36.04
CBSD68 [29]      uniform             σ=50      27.96        27.96
CBSD68 [29]      uniform             FLOPs     5.31G        4.22G
CBSD68 [29]      spatially variant   linear    31.17        31.18
CBSD68 [29]      spatially variant   peaks     31.15        31.15
CBSD68 [29]      spatially variant   FLOPs     5.31G        4.22G
DIV2K-T50 [2]    uniform             σ=10      37.32        37.26
DIV2K-T50 [2]    uniform             σ=50      29.64        29.64
DIV2K-T50 [2]    uniform             FLOPs     5.31G        4.20G
DIV2K-T50 [2]    spatially variant   linear    32.82        32.83
DIV2K-T50 [2]    spatially variant   peaks     32.64        32.64
DIV2K-T50 [2]    spatially variant   FLOPs     5.31G        4.17G

[Figure 3: visual comparison with one row of Path-Restore results and one row of DnCNN results.]

Figure 3. Qualitative results of spatially variant (type "peaks") Gaussian denoising. While most visual results are comparable, Path-Restore recovers texture regions better than DnCNN since these regions are processed with long paths.

[Figure 4: rows show the input images and the path-selection maps under the two reward penalties, with a color scale from short to long paths.]

Figure 4. The policy of path selection for uniform Gaussian denoising with σ = 30. The pathfinder learns to select long paths for the objects with detailed textures while processing the smooth background with short paths.

[Figure 5: panels show input, path selection, input, path selection and the noise distribution (scale 0 to 50), with a color scale from short to long paths.]

Figure 5. The policy of path selection for spatially variant (type "linear") Gaussian denoising. The pathfinder learns a dispatch policy based on both the content and the distortion, i.e., short paths for smooth and clean regions and long paths for textured and noisy regions.

4.2. Evaluation on Mixed Distortions

Training and Testing Details. We further evaluate ourmethod on a complex restoration task as that in RL-Restore [39], where an image is corrupted by different lev-els of Gaussian blur, Gaussian noise and JPEG compres-sion simultaneously. Following [39], we use DIV2K-T750for training and DIV2K-T50 for testing. We report resultsnot only on 63×63 sub-images as in [39] but also on thewhole images with 2K resolution. To conduct a thorough

comparison, we test RL-Restore in two ways for large im-ages: 1) each 63×63 region is processed using a specifictoolchain (denoted by RL-Restore-re), and 2) each 63×63region votes for a toolchain, and the whole image is pro-cessed with a unified toolchain that wins the most votes (de-noted by RL-Restore-im).Quantitative Results. The quantitative results are shownin Table 3. The results on 63×63 sub-images show thatPath-Restore achieves a slightly better performance com-pared with RL-Restore. While testing on large images with2K resolution, our method shows more apparent gain (0.2

Page 7: 1 2Shenzhen Institutes of Advanced Technology, Chinese ... · Path-Restore: Learning Network Path Selection for Image Restoration Ke Yu 1Xintao Wang Chao Dong2 Xiaoou Tang1 Chen Change

Figure 6 labels: RL-Restore-re, RL-Restore-im, Path-Restore. PSNR values printed on the panels, in extraction order: 26.18, 25.98, 26.55; 27.31, 26.78, 27.59; 24.04, 24.32, 24.51; 30.08, 30.04, 30.35 dB.

Figure 6. Qualitative results of addressing mixed distortions. RL-Restore [39] tends to generate artifacts across different regions (see the white arrow), while Path-Restore is able to handle diverse distortions and produce more spatially consistent results (zoom in for best view).

Table 3. Results of addressing mixed distortions on DIV2K-T50 [3] compared with two variants of RL-Restore [39].

Image size       63×63 image       2K image
Metric           PSNR / SSIM       PSNR / SSIM       FLOPs (G)
RL-Restore-re    N/A               25.61 / 0.8264    1.34
RL-Restore-im    26.45 / 0.5587    25.55 / 0.8251    0.948
Path-Restore     26.48 / 0.5667    25.81 / 0.8327    1.38

dB higher) than RL-Restore-re and RL-Restore-im with just a tiny increase in computational complexity (FLOPs). The CPU runtimes of RL-Restore-re, RL-Restore-im and Path-Restore are 1.34 s, 1.09 s and 0.954 s, respectively. These results show that Path-Restore runs faster than RL-Restore in practice. The reason is that RL-Restore needs extra time to switch among different models, while this overhead is not needed in the unified Path-Restore framework.

Qualitative Results. As shown in Figure 6, RL-Restore-re tends to produce inconsistent boundaries and appearance between adjacent 63×63 regions, as each region may be processed by entirely different models. RL-Restore-im fails to address severe noise or blur in some regions because only a single toolchain can be selected for the whole image. On the contrary, thanks to the dynamic blocks in a unified network, Path-Restore can not only remove the distortions in different regions but also yield spatially consistent results.
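The two ways of applying RL-Restore to large images can be illustrated with a minimal Python sketch. It only models the dispatch logic: the toolchain names and the per-region decisions are hypothetical placeholders, and the helper names (`tile_origins`, `plan_region_wise`, `plan_image_wise`) are assumptions introduced here, not part of the released code.

```python
from collections import Counter

TILE = 63  # region size used in the text


def tile_origins(height, width, tile=TILE):
    """Return the top-left corner of every tile x tile region, row by row."""
    return [(y, x) for y in range(0, height, tile) for x in range(0, width, tile)]


def plan_region_wise(choices):
    """RL-Restore-re style: each region keeps the toolchain chosen for it."""
    return dict(choices)


def plan_image_wise(choices):
    """RL-Restore-im style: the toolchain with the most votes is used everywhere."""
    winner, _votes = Counter(choices.values()).most_common(1)[0]
    return {origin: winner for origin in choices}


# Hypothetical per-region decisions for a 126 x 126 image (four regions).
origins = tile_origins(126, 126)
choices = dict(zip(origins, ["denoise", "deblur", "denoise", "dejpeg"]))

print(plan_region_wise(choices))  # neighbouring regions may use different toolchains
print(plan_image_wise(choices))   # every region is mapped to the majority toolchain
```

In the region-wise plan, adjacent regions can receive different toolchains, which is one plausible source of the boundary inconsistencies discussed above. In the image-wise plan, minority regions lose their preferred toolchain.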

4.3. Evaluation on Real-World Denoising

Training and Testing Details. We further conduct experiments on real-world denoising. The training data are consistent with CBDNet [13] for a fair comparison. In particular, we use the BSD500 [25] and Waterloo [24] datasets to synthesize noisy images, and use the same noise model as CBDNet. A real dataset, RENOIR [4], is also adopted for training, as in CBDNet. We evaluate our method on the Darmstadt Noise Dataset (DND) [27]. The DND contains 50 realistic high-resolution paired noisy and noise-free images, captured by four different consumer cameras with differing

Table 4. Results of real-world denoising on the Darmstadt Noise Dataset [27]. Unpublished works are denoted by “*”.

Method              PSNR / SSIM       FLOPs (G)    Time (s)
CBDNet [13]         38.06 / 0.9421    6.94         3.95
NLH*                38.81 / 0.9520    N/A          N/A
DRSR v1.4*          39.09 / 0.9509    N/A          N/A
MLDN*               39.23 / 0.9516    N/A          N/A
RU sRGB*            39.51 / 0.9528    N/A          N/A
Path-Restore        39.00 / 0.9542    5.60         3.06
Path-Restore-Ext    39.72 / 0.9591    22.6         12.6

sensor sizes.

Quantitative Results. The quantitative results are presented in Table 4, where CBDNet [13] is the method that achieves the highest performance among all the published works. Using the same training datasets as CBDNet, Path-Restore improves PSNR by 0.94 dB (39.00 vs. 38.06 dB) with less computational cost and CPU runtime. Path-Restore-Ext denotes an extended deeper model trained with more data, which further improves PSNR by about 0.7 dB and currently achieves state-of-the-art performance on the DND benchmark. More details are presented in the supplementary material.

Qualitative Results. As shown in Figure 7, CBDNet fails to remove the severe noise in dim regions and tends to generate artifacts along edges. This may be caused by incorrect noise estimation for real-world images. On the contrary, Path-Restore can successfully clean the noise in different regions and recover sharp edges without artifacts, indicating that our path selection handles real noise more robustly than the noise estimation in CBDNet.
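As a worked check of the comparison with CBDNet, the short Python sketch below recomputes the relative gains from the Table 4 entries. The helper name `compare` and the dictionary layout are assumptions made for illustration; only the numbers come from the table.

```python
# Table 4 entries (PSNR in dB, FLOPs in G, CPU time in seconds).
cbdnet = {"psnr": 38.06, "flops_g": 6.94, "time_s": 3.95}
path_restore = {"psnr": 39.00, "flops_g": 5.60, "time_s": 3.06}
path_restore_ext = {"psnr": 39.72, "flops_g": 22.6, "time_s": 12.6}


def compare(baseline, candidate):
    """Summarise how a candidate method differs from a baseline."""
    return {
        "psnr_gain_db": candidate["psnr"] - baseline["psnr"],
        "flops_saving": 1.0 - candidate["flops_g"] / baseline["flops_g"],
        "time_saving": 1.0 - candidate["time_s"] / baseline["time_s"],
        "speedup": baseline["time_s"] / candidate["time_s"],
    }


summary = compare(cbdnet, path_restore)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")

ext_gain = path_restore_ext["psnr"] - path_restore["psnr"]
print(f"Path-Restore-Ext over Path-Restore: {ext_gain:.2f} dB")
```

Under these numbers the PSNR gain is 0.94 dB, FLOPs drop by roughly 19%, and the runtime ratio 3.95 / 3.06 is about 1.29, which is consistent with the 29% speedup quoted in the abstract when it is read as a throughput ratio.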

4.4. Ablation Study

Architecture of Dynamic Block. We investigate the number of paths in each dynamic block. For the task of addressing mixed distortions, the number of paths is originally set to 4, since there are 3 paths for various distortions and 1 bypass connection. As shown in Table 5, we alternatively use


Figure 7 labels: Input, CBDNet, Path-Restore (two examples).

Figure 7. Qualitative results on the Darmstadt Noise Dataset [27]. Path-Restore recovers clean results with sharp edges.

Table 5. Ablation study on the number of paths in a dynamic block.

Dataset: DIV2K-T50 [2]

             63×63 sub-image         2K image
Metric       PSNR     FLOPs (G)      PSNR     FLOPs (G)
4 paths      26.48    1.00           25.81    1.38
2 paths      26.46    1.09           25.80    1.48

2 paths to address the same task. It is observed that the FLOPs increase by nearly 10% when achieving performance comparable to the original setting.

Difficulty-Regulated Reward. A performance-complexity trade-off can be achieved by adjusting the reward penalty p during training. The original setting is p = 8 × 10⁻⁶ for the denoising task, and we gradually reduce p to 3 × 10⁻⁶. As shown in Figure 8, the blue curve shows that PSNR and FLOPs both increase as the reward penalty decreases. This is reasonable since a smaller reward penalty encourages the pathfinder to select a longer path for each region. We also adjust the depth of DnCNN to achieve a trade-off between PSNR and FLOPs, which is shown as the green curve in Figure 8. We observe that Path-Restore is consistently better than DnCNN by 0.1 dB at the same FLOPs. When achieving the same performance, our method is nearly 30% faster than DnCNN.

We then study the impact of “difficulty” in the proposed reward function. In particular, we set difficulty d ≡ 1 in Eq. (3) to form a non-regulated reward. As shown in Figure 8, the orange curve is consistently below the blue curve, indicating that the proposed difficulty-regulated reward helps Path-Restore achieve a better performance-complexity trade-off. We further present the path-selection policy of different rewards in Figure 9. Compared with the non-regulated reward, our difficulty-regulated reward saves more computations when processing easy regions (e.g., the region in the red box), while using a longer path to process hard regions (e.g., the region in the blue box).
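The interaction between the penalty p and the difficulty d can be illustrated with a small Python sketch. Eq. (3) is not reproduced in this section, so the sketch assumes a simplified reward: a penalty p for every non-bypass block on the path, plus the reduction in loss scaled by d, where d lies in [0, 1] and saturates at 1 for hard regions. The function names, the threshold argument, and all numeric values other than the two settings of p are hypothetical.

```python
def difficulty(final_loss, loss_threshold):
    """Assumed difficulty: grows with the remaining loss and saturates at 1."""
    return min(final_loss / loss_threshold, 1.0)


def path_reward(num_active_blocks, loss_reduction, p, d=1.0):
    """Assumed total reward: -p per non-bypass block plus d-scaled loss reduction."""
    return -p * num_active_blocks + d * loss_reduction


def preferred(paths, p, d):
    """Name of the candidate path with the highest assumed reward."""
    return max(paths, key=lambda name: path_reward(*paths[name], p=p, d=d))


# Hypothetical candidates: (number of non-bypass blocks, reduction in L2 loss).
paths = {"short": (1, 2e-5), "long": (4, 5e-5)}

# Non-regulated reward (d fixed to 1): the long path wins.
print(preferred(paths, p=8e-6, d=1.0))

# Easy region (small d): the penalty dominates and the short path wins.
print(preferred(paths, p=8e-6, d=difficulty(3e-4, 1e-3)))

# Lowering the penalty from 8e-6 to 3e-6 flips a mid-difficulty region to the long path.
print(preferred(paths, p=8e-6, d=0.5), preferred(paths, p=3e-6, d=0.5))
```

Treating d as a fixed property of the region is a simplification made here for readability. Even so, the sketch reproduces the two qualitative trends described in the text: a smaller p favors longer paths, and a smaller d favors shorter ones.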

Figure 8 plot: title “Performance-Complexity Trade-Off”; x-axis FLOPs (×1e9, ticks from 4.5 to 6.0); y-axis PSNR (ticks from 32.75 to 32.95); curves: Difficulty-regulated, Non-regulated, DnCNN.

Figure 8. Performance-complexity trade-off vs. non-regulated reward and DnCNN for spatially variant (type “linear”) Gaussian denoising on DIV2K-T50 [2].

Figure 9 labels: Input, Difficulty-regulated, Non-regulated.

Figure 9. Path selection of regulated and non-regulated reward.

5. Conclusion

We have devised a new framework for image restoration with low computational cost. Combining reinforcement learning and deep learning, we have proposed Path-Restore, which enables path selection for each image region. Specifically, Path-Restore is composed of a multi-path CNN and a pathfinder. The multi-path CNN offers several paths to address different types of distortions, and the pathfinder is able to find the optimal path to efficiently restore each region. To learn a reasonable dispatch policy, we propose a difficulty-regulated reward that encourages the pathfinder to select short paths for easy regions. Path-Restore achieves comparable or superior performance to existing methods


with far fewer computations on various restoration tasks. Our method is especially effective for real-world denoising, where the noise distribution is diverse across different image regions.

References

[1] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, et al. Tensorflow: A system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pages 265–283, 2016.
[2] E. Agustsson and R. Timofte. Ntire 2017 challenge on single image super-resolution: Dataset and study. In CVPRW, 2017.
[3] E. Agustsson and R. Timofte. Ntire 2017 challenge on single image super-resolution: Dataset and study. In CVPRW, volume 3, page 2, 2017.
[4] J. Anaya and A. Barbu. Renoir–a dataset for real low-light image noise reduction. Journal of Visual Communication and Image Representation, 51:144–154, 2018.
[5] Y. Bengio, N. Leonard, and A. Courville. Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432, 2013.
[6] J. Chen, J. Chen, H. Chao, and M. Yang. Image blind denoising with generative adversarial network based noise modeling. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3155–3164, 2018.
[7] Y. Chen and T. Pock. Trainable nonlinear reaction diffusion: A flexible framework for fast and effective image restoration. IEEE transactions on pattern analysis and machine intelligence, 39(6):1256–1272, 2017.
[8] C. Dong, Y. Deng, C. Change Loy, and X. Tang. Compression artifacts reduction by a deep convolutional network. In ICCV, 2015.
[9] C. Dong, C. C. Loy, K. He, and X. Tang. Image super-resolution using deep convolutional networks. TPAMI, 38(2):295–307, 2016.
[10] M. Figurnov, M. D. Collins, Y. Zhu, L. Zhang, J. Huang, D. P. Vetrov, and R. Salakhutdinov. Spatially adaptive computation time for residual networks. In CVPR, volume 2, page 7, 2017.
[11] J. Guo and H. Chao. Building dual-domain representations for compression artifacts reduction. In ECCV, 2016.
[12] J. Guo and H. Chao. One-to-many network for visually pleasing compression artifacts reduction. arXiv preprint arXiv:1611.04994, 2016.
[13] S. Guo, Z. Yan, K. Zhang, W. Zuo, and L. Zhang. Toward convolutional blind denoising of real photographs. In CVPR, 2019.
[14] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.
[15] G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger. Densely connected convolutional networks. In CVPR, volume 1, page 3, 2017.
[16] V. Jain and S. Seung. Natural image denoising with convolutional networks. In NIPS, 2009.
[17] J. Johnson, A. Alahi, and L. Fei-Fei. Perceptual losses for real-time style transfer and super-resolution. In ECCV, 2016.
[18] J. Kim, J. Kwon Lee, and K. Mu Lee. Accurate image super-resolution using very deep convolutional networks. In CVPR, 2016.
[19] J. Kim, J. K. Lee, and K. M. Lee. Deeply-recursive convolutional network for image super-resolution. In CVPR, 2016.
[20] D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. In ICLR, 2015.
[21] C. Ledig, L. Theis, F. Huszar, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, Z. Wang, et al. Photo-realistic single image super-resolution using a generative adversarial network. In CVPR, 2017.
[22] S. Lefkimmiatis. Non-local color image denoising with convolutional neural networks. In CVPR, 2017.
[23] B. Lim, S. Son, H. Kim, S. Nah, and K. M. Lee. Enhanced deep residual networks for single image super-resolution. In CVPRW, volume 1, page 4, 2017.
[24] K. Ma, Z. Duanmu, Q. Wu, Z. Wang, H. Yong, H. Li, and L. Zhang. Waterloo exploration database: New challenges for image quality assessment models. IEEE Transactions on Image Processing, 26(2):1004–1016, 2017.
[25] D. Martin, C. Fowlkes, D. Tal, and J. Malik. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In ICCV, page 416. IEEE, 2001.
[26] S. Nah, T. H. Kim, and K. M. Lee. Deep multi-scale convolutional neural network for dynamic scene deblurring. In CVPR, 2017.
[27] T. Plotz and S. Roth. Benchmarking denoising algorithms with real photographs. In CVPR, pages 1586–1595, 2017.
[28] C. Rosenbaum, T. Klinger, and M. Riemer. Routing networks: Adaptive selection of non-linear functions for multi-task learning. In ICLR, 2017.
[29] S. Roth and M. J. Black. Fields of experts. International Journal of Computer Vision, 82(2):205, 2009.
[30] J. Sun, W. Cao, Z. Xu, and J. Ponce. Learning a convolutional neural network for non-uniform motion blur removal. In CVPR, 2015.
[31] Y. Tai, J. Yang, and X. Liu. Image super-resolution via deep recursive residual network. In CVPR, 2017.
[32] R. Timofte, S. Gu, J. Wu, L. V. Gool, L. Zhang, et al. Ntire 2018 challenge on single image super-resolution: Methods and results. In CVPRW, 2018.
[33] X. Wang, F. Yu, Z.-Y. Dou, and J. E. Gonzalez. Skipnet: Learning dynamic routing in convolutional networks. In ECCV, 2018.
[34] X. Wang, K. Yu, S. Wu, J. Gu, Y. Liu, C. Dong, C. C. Loy, Y. Qiao, and X. Tang. Esrgan: Enhanced super-resolution generative adversarial networks. arXiv preprint arXiv:1809.00219, 2018.
[35] Z. Wang, D. Liu, S. Chang, Q. Ling, Y. Yang, and T. S. Huang. D3: Deep dual-domain based fast restoration of JPEG-compressed images. In CVPR, 2016.
[36] R. J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine learning, 8(3-4):229–256, 1992.
[37] Z. Wu, T. Nagarajan, A. Kumar, S. Rennie, L. S. Davis, K. Grauman, and R. Feris. Blockdrop: Dynamic inference paths in residual networks. In CVPR, pages 8817–8826, 2018.
[38] L. Xu, J. S. J. Ren, C. Liu, and J. Jia. Deep convolutional neural network for image deconvolution. NIPS, 2014.
[39] K. Yu, C. Dong, L. Lin, and C. C. Loy. Crafting a toolchain for image restoration by deep reinforcement learning. In CVPR, pages 2443–2452, 2018.
[40] K. Zhang, W. Zuo, Y. Chen, D. Meng, and L. Zhang. Beyond a gaussian denoiser: Residual learning of deep cnn for image denoising. TIP, 2017.
[41] K. Zhang, W. Zuo, and L. Zhang. Ffdnet: Toward a fast and flexible solution for cnn based image denoising. IEEE Transactions on Image Processing, 2018.