
Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images

Authors: Anh Nguyen et al.

Speaker: Charlie Liu

Date: Oct 22nd

Nguyen A, Yosinski J, Clune J. Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images. In Computer Vision and Pattern Recognition (CVPR ’15), IEEE, 2015.

ECS 289G 001 Paper Presentation, Prof. Lee

https://www.youtube.com/watch?v=M2IebCN9Ht4

Main Idea

• It is easy to produce images that are
  • completely unrecognizable to humans,
  • but that DNNs classify as recognizable objects with 99.99% confidence

Methods

• DNN models
  • AlexNet + ILSVRC 2012 ImageNet
  • LeNet + MNIST dataset

• Image-generating algorithms
  • Evolutionary algorithms (EAs)
  • MAP-Elites – more computationally efficient than EAs

• Encoding methods for EAs
  • Direct encoding
  • Indirect encoding (CPPN) – regular images
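The MAP-Elites loop can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: `toy_confidence` is a stand-in for the DNN's softmax confidence, and the mutation scheme and hyperparameters here are assumptions.

```python
import random

def map_elites(fitness, n_classes, genome_len, generations, seed=0):
    """Minimal MAP-Elites loop with one archive cell per target class.

    `fitness(genome, cls)` stands in for the DNN's confidence that the
    genome (a flat image vector) belongs to class `cls`; in the paper it
    would be a softmax output of LeNet or AlexNet.
    """
    rng = random.Random(seed)
    archive = {}  # class index -> (genome, fitness)
    for _ in range(generations):
        # parent: a random existing elite, else a fresh random genome
        if archive and rng.random() < 0.9:
            parent, _ = rng.choice(list(archive.values()))
        else:
            parent = [rng.random() for _ in range(genome_len)]
        # mutate each gene independently with small probability
        child = [g + rng.gauss(0, 0.1) if rng.random() < 0.1 else g
                 for g in parent]
        # the child replaces the champion of every class it beats
        for c in range(n_classes):
            f = fitness(child, c)
            if c not in archive or f > archive[c][1]:
                archive[c] = (child, f)
    return archive

# toy fitness standing in for a classifier: class c "wants" gene c large
toy_confidence = lambda genome, c: genome[c]
elites = map_elites(toy_confidence, n_classes=3, genome_len=5,
                    generations=500)
```

Because every child is scored against every class, one run fills the whole archive – one evolved "fooling" image per class – which is what makes MAP-Elites cheaper than running a separate EA per class.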

Indirect encoding

• More likely to produce regular images
• Evolves complex, regular images
  • Resemble natural and man-made objects
  • Some images can be recognized by DNNs
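The regularity comes from the genome describing each pixel as a composition of smooth functions of its coordinates (a CPPN). A toy sketch, assuming a fixed three-weight composition rather than an evolved network:

```python
import math

def cppn_image(weights, size=8):
    """Tiny CPPN-style indirect encoding (a sketch, not the paper's exact
    networks): each pixel value is a composition of smooth functions of
    its (x, y) coordinates, so the genome (here just three weights)
    naturally yields regular, repetitive patterns instead of per-pixel
    noise."""
    w1, w2, w3 = weights
    img = []
    for yi in range(size):
        row = []
        for xi in range(size):
            x = 2 * xi / (size - 1) - 1   # coordinates in [-1, 1]
            y = 2 * yi / (size - 1) - 1
            r = math.sqrt(x * x + y * y)  # distance from centre
            # compose sine / gaussian nodes, as a CPPN composes activations
            v = math.sin(w1 * x) * math.cos(w2 * y) + math.exp(-w3 * r * r)
            row.append(v)
        img.append(row)
    return img

img = cppn_image((3.0, 3.0, 2.0))
```

Mutating the three weights changes the whole pattern coherently – contrast this with direct encoding, where each pixel mutates independently and regularity never emerges.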

Results

• Evolving irregular images to match MNIST
  • Digits 0-9, 99.99% confidence after 200 generations


Results

• Evolving regular images to match MNIST
  • Digits 0-9, 99.99% confidence after 200 generations


Results

• Evolving irregular images to match ImageNet
  • Less successful: median confidence 21.59% after 20k generations
  • However, 45 classes are classified with ≥ 99% confidence


Results

• Evolving regular images to match ImageNet
  • 88.11% median confidence after 5k generations


Generated Images


Generated Images - Properties


• Generated images often contain some features of the target class
  • Evolution needs only to produce features that are unique to a class

Generated Images - Properties


• Evolution produced very similar, high-confidence images for all classes

• Many of the images are related to each other phylogenetically

Generated Images - Properties


• Removing copies of the repeated element leads to a performance drop

Generated Images - Properties

• The low-performing band of classes (157-286) consists of dogs and cats

• Datasets with more classes can help ameliorate fooling


Results

• Images that fool one DNN generalize to others

• Training networks to recognize fooling images
  • The median confidence decreased from 88.1% to 11.7%

• Producing fooling images via gradient ascent
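The gradient-ascent route can be sketched as follows: hold the network weights fixed and ascend the target class's softmax score with respect to the input pixels. The "network" below is a toy linear layer plus softmax chosen for illustration, not one of the paper's DNNs:

```python
import math
import random

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

def ascend(W, target, steps=200, lr=0.5, seed=0):
    """Optimize the *pixels* of an image to maximize the softmax score
    of class `target`, keeping the model weights W fixed -- the
    gradient-ascent route to fooling images, on a toy linear model."""
    rng = random.Random(seed)
    n_pix = len(W[0])
    x = [rng.gauss(0, 0.1) for _ in range(n_pix)]  # start from noise
    conf = 0.0
    for _ in range(steps):
        logits = [sum(w * xi for w, xi in zip(row, x)) for row in W]
        p = softmax(logits)
        conf = p[target]
        # d p_target / d x_j = p_t * (W[t][j] - sum_c p_c * W[c][j])
        for j in range(n_pix):
            avg = sum(p[c] * W[c][j] for c in range(len(W)))
            x[j] += lr * p[target] * (W[target][j] - avg)
    return x, conf

# toy 3-class linear model over 4 "pixels"
W = [[ 1.0,  0.0, -1.0,  0.5],
     [ 0.0,  1.0,  0.5, -1.0],
     [-1.0, -0.5,  1.0,  1.0]]
x, conf = ascend(W, target=0)
```

Starting from random noise, a few hundred steps drive the target confidence arbitrarily close to 1 – the optimized image satisfies the model, not a human.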

Discussion

• Why are high-confidence, unrecognizable images generated?
  • Discriminative models create decision boundaries that partition the input space into classification regions
  • Generative models may be more difficult to fool, because fooling images have low marginal probability
    • but current generative models do not scale well

• Some generated images are recognizable as members of their target class once the class label is known
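A one-dimensional logistic regression makes the discriminative-model point concrete: confidence grows monotonically with distance from the decision boundary, even in regions containing no training data at all. (A toy stand-in for illustration, not one of the paper's models.)

```python
import math

def logistic_confidence(w, b, x):
    """Positive-class confidence of a 1-D logistic-regression
    'discriminative model': sigmoid of the signed distance-like score
    w*x + b from the decision boundary."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

# boundary at x = 0; imagine training data living near x = +/-1
near = logistic_confidence(2.0, 0.0, 1.0)    # close to the data
far = logistic_confidence(2.0, 0.0, 100.0)   # far beyond any training data
```

`far` saturates at essentially 1.0: nothing in a purely discriminative objective penalizes extreme confidence in vast, unpopulated regions of input space, which is exactly where the evolved fooling images live.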

Conclusion

• It is easy to produce images that are
  • completely unrecognizable to humans,
  • but that DNNs classify as recognizable objects with 99.99% confidence

• Direct and indirect encodings generate fooling images
  • A third way to generate them is gradient ascent

• Training networks to recognize fooling images

• The cause of this property is not fully understood
  • Discriminative models
  • Generative models

Questions? Thanks!

ECS 289G 001 Paper Presentation, Oct 22nd

Nguyen A, Yosinski J, Clune J. Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images. In Computer Vision and Pattern Recognition (CVPR ’15), IEEE, 2015.
