

Page 1: Context Encoders - Stanford University, web.stanford.edu/class/cs331b/2016/presentations/paper13.pdf

Context Encoders: Feature Learning by Inpainting

By Pathak et al. (2016)

Photo: Live on the Edge Photography

Page 2:

Unsupervised Semantic Feature Learning

Intro Related Work Main Contributions Results Conclusion

Figure: methods arranged along two axes, "More supervised" and "More semantic": ImageNet, Image Captioning, Learning to Generate Chairs, Image reconstruction, Semantic Inpainting, GAN, Image denoising, Context Prediction, Odometry Prediction

Page 3:

Semantic Inpainting

Inputs: ( , )

Task: Learn a function f( ) =

Photo: Live on the Edge Photography
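
The images on this slide did not survive extraction, so the following is only a minimal sketch of the inpainting setup it describes: an image, a binary mask marking the missing region, and a function that must predict the hidden pixels from the remaining context. The helper names `center_mask` and `drop_region`, the 8x8 image size, and the centered square hole are all illustrative assumptions, not details from the slides.

```python
import numpy as np

def center_mask(height, width, hole):
    """Binary mask: 1 inside a centered hole x hole square, 0 elsewhere."""
    mask = np.zeros((height, width), dtype=np.float32)
    top = (height - hole) // 2
    left = (width - hole) // 2
    mask[top:top + hole, left:left + hole] = 1.0
    return mask

def drop_region(image, mask):
    """Zero out the masked pixels; image is (H, W, C), mask is (H, W)."""
    return image * (1.0 - mask)[..., None]

rng = np.random.default_rng(0)
image = rng.random((8, 8, 3), dtype=np.float32)   # stand-in for a photo
mask = center_mask(8, 8, hole=4)
context = drop_region(image, mask)    # what the network would see
target = image * mask[..., None]      # what it would have to predict
```

The learned function takes `context` (and possibly `mask`) and is trained so that its output matches `target` inside the hole.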

Page 4:

Semantic Inpainting

+ For large regions, requires semantics
+ Unsupervised
- Ill-posed (not well-defined)

Photo: Zhang et al (ECCV 2016)

Page 5:

Hypothesis Selection in Semantic Inpainting

How to choose between possibilities?

L2: Choose them all (the prediction averages the plausible completions, which blurs it)

Adversarial: Pick the most believable

Photo: Pathak et al. (2016)
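
A small numerical sketch can make the "L2: Choose them all" point concrete. It assumes a single missing pixel with two equally plausible completions (the values 0.0 and 1.0 are hypothetical) and searches for the prediction that minimizes the expected squared error.

```python
import numpy as np

# Two equally plausible completions of the same missing pixel (hypothetical values).
hypotheses = np.array([0.0, 1.0])

def expected_l2(prediction):
    """Mean squared error of one prediction against all hypotheses."""
    return np.mean((prediction - hypotheses) ** 2)

# Grid search over candidate predictions between the two hypotheses.
candidates = np.linspace(0.0, 1.0, 101)
losses = np.array([expected_l2(c) for c in candidates])
best = candidates[np.argmin(losses)]
```

The minimizer sits midway between the two hypotheses rather than at either of them, which is the averaging behavior the slide contrasts with an adversarial loss that rewards committing to one believable completion.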

Page 6:

Related Work

Page 7:

Unsupervised Semantic Feature Learning

Figure: methods arranged along two axes, "More supervised" and "More semantic": ImageNet, Image reconstruction, Semantic Inpainting, GAN, Image Captioning, Image denoising, Learning to Generate Chairs, Context Prediction, Odometry Prediction, Visual Memex

Page 8:

Visual Memex

Creates a graph of previously seen objects and compares the query image to the graph

Malisiewicz et al. (2009)

Page 9:

Unsupervised Semantic Feature Learning

Figure: methods arranged along two axes, "More supervised" and "More semantic": ImageNet, Image reconstruction, Semantic Inpainting, GAN, Image Captioning, Image denoising, Learning to Generate Chairs, Context Prediction, Odometry Prediction

Page 10:

Dosovitskiy et al. (2015)

Page 11:

Unsupervised Semantic Feature Learning

Figure: methods arranged along two axes, "More supervised" and "More semantic": ImageNet, Image Captioning, Learning to Generate Chairs, Image reconstruction, Semantic Inpainting, GAN, Image denoising, Context Prediction, Odometry Prediction

Page 12:

Autoencoders

Shinya Yuki (2016)

Page 13:

Unsupervised Semantic Feature Learning

Figure: methods arranged along two axes, "More supervised" and "More semantic": ImageNet, Image Captioning, Learning to Generate Chairs, Image reconstruction, Semantic Inpainting, GAN, Image denoising, Context Prediction, Odometry Prediction

Page 14:

Context Prediction

Doersch et al. (2016)

Page 15:

Unsupervised Semantic Feature Learning

Figure: methods arranged along two axes, "More supervised" and "More semantic": ImageNet, Image reconstruction, Semantic Inpainting, GAN, Image Captioning, Image denoising, Learning to Generate Chairs, Context Prediction, Odometry Prediction

Page 16:

Learning to See by Moving

Agrawal et al. (2015)

Page 17:

Main Contributions

Page 18:

Context Aware L2

10x scaled loss in context region,

Inputs: ( , )
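
The slide only states that the L2 loss is scaled 10x in the context region. One way to read this is as a per-pixel weight map applied to the squared error. The sketch below assumes the 10x weight applies to a border band of the predicted square where it meets the surrounding context; the band width, the sizes, and the names `weighted_l2` and `overlap_weights` are illustrative assumptions, not details from the slides.

```python
import numpy as np

def weighted_l2(prediction, target, weights):
    """Mean of the per-pixel squared error scaled by a weight map."""
    return float(np.mean(weights * (prediction - target) ** 2))

def overlap_weights(size, border, scale=10.0):
    """Weight map for a predicted square: `scale` on a border band, 1 inside."""
    weights = np.full((size, size), scale, dtype=np.float64)
    weights[border:size - border, border:size - border] = 1.0
    return weights

weights = overlap_weights(size=8, border=2)   # assumed band width of 2 pixels
target = np.zeros((8, 8))
prediction = np.ones((8, 8))                  # every pixel is off by 1
loss = weighted_l2(prediction, target, weights)
```

With a uniform error of 1 per pixel, the border band contributes ten times as much per pixel as the interior, so mistakes at the seam with the context are penalized more heavily.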

Page 19:

Random Patches
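
The slide gives only the title, so the following is a loose sketch of what a random-patch mask could look like: several small, possibly overlapping squares are dropped at random positions while total coverage stays under a budget. The patch size, the one-quarter coverage cap, the number of attempts, and the name `random_patch_mask` are assumptions made for illustration.

```python
import numpy as np

def random_patch_mask(size, patch, max_fraction=0.25, attempts=50, seed=0):
    """Boolean mask of random square patches covering at most
    `max_fraction` of a size x size image. Patches may overlap."""
    rng = np.random.default_rng(seed)
    mask = np.zeros((size, size), dtype=bool)
    budget = max_fraction * size * size
    for _ in range(attempts):
        top = rng.integers(0, size - patch + 1)
        left = rng.integers(0, size - patch + 1)
        trial = mask.copy()
        trial[top:top + patch, left:left + patch] = True
        # Keep the patch only if coverage stays within the budget.
        if trial.sum() <= budget:
            mask = trial
    return mask

mask = random_patch_mask(size=32, patch=4)
```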

Page 20:

AlexNet Architecture

Page 21:

Channel-Wise Fully Connected

Followed by 1x1 convolution to propagate across channels

100M → <0.4M parameters
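
The parameter saving can be checked with a short sketch. A channel-wise fully connected layer gives each of the m feature maps its own dense layer over its n x n spatial positions, so no weights connect different channels. The sizes m = 256 and n = 6 below are assumptions chosen to resemble an AlexNet-style bottleneck and are not stated on the slide; `channelwise_fc` and `param_counts` are illustrative names.

```python
import numpy as np

def channelwise_fc(features, weights):
    """Apply a separate dense layer to each channel's flattened spatial map.

    features: (m, n, n) feature maps
    weights:  (m, n*n, n*n), one weight matrix per channel
    """
    m, n, _ = features.shape
    flat = features.reshape(m, n * n)
    out = np.einsum("cij,cj->ci", weights, flat)   # no mixing across channels
    return out.reshape(m, n, n)

def param_counts(m, n):
    """(fully connected, channel-wise) weight counts, ignoring biases."""
    return (m * n * n) ** 2, m * (n * n) ** 2

m, n = 256, 6                      # assumed sizes, not stated on the slide
full, channelwise = param_counts(m, n)
```

Under these assumed sizes the dense layer needs roughly 85 million weights and the channel-wise layer about 0.33 million, the same order of magnitude as the slide's figures. Because channels never interact inside this layer, the slide's 1x1 convolution afterwards is what lets information move across channels.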

Page 22:

Context Encoder Architecture

Page 23:

Context Encoder Architecture Continued

GAN Objective:

Adversarial Loss Term:

Context Encoder Objective:
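
The formulas for the three labels above did not survive extraction. As a hedged reconstruction in the notation of Pathak et al. (2016), and one that should be checked against the paper, let G be a generator, D the discriminator, F the context encoder, \hat{M} the binary mask of the dropped region, and \odot the element-wise product. The weights \lambda_{rec} and \lambda_{adv} balance the two loss terms.

```latex
% GAN objective
\min_G \max_D \;
  \mathbb{E}_{x \in \mathcal{X}}\big[\log D(x)\big]
  + \mathbb{E}_{z \in \mathcal{Z}}\big[\log\big(1 - D(G(z))\big)\big]

% Adversarial loss term, with the context encoder F in the generator's role
\mathcal{L}_{adv} = \max_D \;
  \mathbb{E}_{x \in \mathcal{X}}\Big[\log D(x)
  + \log\Big(1 - D\big(F((1 - \hat{M}) \odot x)\big)\Big)\Big]

% Reconstruction (masked L2) term
\mathcal{L}_{rec}(x) =
  \big\lVert \hat{M} \odot \big(x - F((1 - \hat{M}) \odot x)\big) \big\rVert_2^2

% Context encoder objective
\mathcal{L} = \lambda_{rec}\,\mathcal{L}_{rec} + \lambda_{adv}\,\mathcal{L}_{adv}
```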

Page 24:

Results

Page 25:

Feature Transfer Evaluation Methodology

● Feature transfer capability evaluated on three tasks:
  a. Classification pretraining
  b. Detection pretraining
  c. Semantic segmentation pretraining

● Compared against:
  a. Random weight initialization
  b. Autoencoder initialization
  c. Learning to see by moving (Agrawal et al.)
  d. Context prediction (Doersch et al.)
  e. Unsupervised learning with videos (Wang et al.)

Page 26:

Further Details

● Classification
  ○ Pascal VOC 2007 Dataset
  ○ ~10000 images for training
  ○ Output generated by voting from 10 random crops of the input image

● Detection
  ○ Pascal VOC 2007 Detection Challenge Dataset
  ○ Fast R-CNN method (Girshick, 2015) used to generate detection hypotheses from features

● Segmentation
  ○ Pascal VOC 2012 Dataset
  ○ Fully convolutional network (FCN) (Shelhamer et al., 2015) used to generate segmentation hypotheses from features
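
The classification protocol of voting over 10 random crops can be sketched as follows. The slide does not say how votes are combined, so averaging per-class scores is an assumption here (a majority vote over per-crop labels would be another reading). `toy_scores` is a hypothetical stand-in for a trained classifier, and the image and crop sizes are illustrative.

```python
import numpy as np

def random_crops(image, crop, count, rng):
    """Return `count` random crop x crop windows from an (H, W, C) image."""
    height, width = image.shape[:2]
    crops = []
    for _ in range(count):
        top = rng.integers(0, height - crop + 1)
        left = rng.integers(0, width - crop + 1)
        crops.append(image[top:top + crop, left:left + crop])
    return np.stack(crops)

def vote(score_fn, image, crop, count=10, seed=0):
    """Average per-class scores over random crops; return (label, mean scores)."""
    rng = np.random.default_rng(seed)
    scores = np.stack([score_fn(c) for c in random_crops(image, crop, count, rng)])
    mean_scores = scores.mean(axis=0)
    return int(np.argmax(mean_scores)), mean_scores

def toy_scores(crop):
    """Hypothetical classifier: two class scores derived from mean brightness."""
    brightness = float(crop.mean())
    return np.array([1.0 - brightness, brightness])

image = np.full((16, 16, 3), 0.9)   # uniformly bright stand-in image
label, mean_scores = vote(toy_scores, image, crop=8)
```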

Page 27:

Pretraining Results

Doersch et al. 65.3% 51.1% Modified

Page 28:

Inpainting Results

Page 29:

Encoded Features Nearest Neighbors

Page 30:

Recapitulation

Page 31:

Paper Contributions

● Idea of using semantic inpainting as a supervisory signal for unsupervised feature learning
● Idea of using adversarial loss as a modular loss function that can be combined with other losses
● Qualitatively nice inpainting results

Page 32:

Negatives of Paper

● Seemed to be two “separate tasks”
  a. Unsupervised feature learning
  b. Semantic inpainting
● No feature transfer results for context encoder
● No results for how adversarial loss affects pre-trainability of context encoder features
● Worked on par with other pre-training methods

Figure labels: Semantic Inpainting, Feature Learning