Novelty Detection for Deep Classifiers
alinlab.kaist.ac.kr/resource/lec13_novelty_detection.pdf


Algorithmic Intelligence Lab
EE807: Recent Advances in Deep Learning, Lecture 13
Slides made by Kimin Lee (KAIST EE)

Table of Contents

1. Introduction
   • What is novelty detection?
   • Overview
2. Utilizing the Posterior Distribution
   • Baseline method
   • Post-processing method
3. Utilizing the Hidden Features
   • Local intrinsic dimensionality
   • Mahalanobis distance-based score


What is Novelty Detection?

• Deep neural networks (DNNs) generalize well when test samples come from a distribution similar to the training data (i.e., in-distribution).
• In the real world, however, there are many unknown and unseen samples for which the classifier cannot give a correct answer.

[Figure: a DNN trained on animal images (cat vs. dog) assigns a softmax confidence of 0.99 to an in-distribution test sample, but it can be just as confident on unseen samples, i.e., out-of-distribution inputs (not an animal), and on adversarial samples [Goodfellow et al., 2015].]

• Novelty detection: given a pre-trained (deep) classifier, detect whether a test sample is from the in-distribution (i.e., the training distribution of the classifier) or not (e.g., out-of-distribution or adversarial samples).

[Figure: a decision boundary with an abnormal sample lying far from the training data.]

• It can be useful for many machine learning problems:
  • Calibration [Guo et al., 2017]
  • Ensemble learning [Lee et al., 2017]
  • Incremental learning [Rebuffi et al., 2017]
• It is also indispensable when deploying DNNs in real-world systems [Amodei et al., 2016], e.g., autonomous driving and secure authentication systems.

• How to solve this problem? Threshold-based detector [Hendrycks et al., 2017; Liang et al., 2018]
  • Feed the test sample into the deep classifier and compute a confidence score.
  • If score > 𝜖: in-distribution; else: out-of-distribution.
  • The key question is how to get a good confidence score; a minimal sketch of the rule follows.
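As a minimal sketch of this rule in Python (the names `score_fn` and `eps` are placeholders, not from the slides), every method in this lecture just plugs a different confidence score into the same template:

```python
def detect(score_fn, x, eps):
    """Threshold-based detector: declare a test sample in-distribution
    iff its confidence score exceeds the threshold eps."""
    return "in-distribution" if score_fn(x) > eps else "out-of-distribution"
```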

• Two families of confidence scores will be covered:
  • Utilizing the posterior distribution
    • 1. Maximum value or entropy of the posterior [Hendrycks et al., 2017]
    • 2. Input and output processing [Liang et al., 2018]
    • 3. Bayesian inference [Li et al., 2017] and ensembles of classifiers [Balaji et al., 2017]
  • Utilizing hidden features from DNNs
    • 1. Local intrinsic dimensionality [Ma et al., 2018]
    • 2. Mahalanobis distance [Lee et al., 2018b]

[Figure: a deep classifier's softmax output on a test sample, e.g., 0.12 for "Persian cat" and 0.18 for "tiger cat".]


Utilizing the Posterior Distribution

• Recall that classification is about finding an unknown posterior distribution, i.e., P(Y|X).
• The posterior is modeled by a softmax classifier on top of a DNN:

  P(y = c | x) = exp(w_c^T f(x) + b_c) / Σ_{c'} exp(w_{c'}^T f(x) + b_{c'}),

  where f(x) denotes the hidden features of x extracted by the DNN.

[Figure: the DNN maps the input space to the output space of softmax probabilities.]

• Natural choices for the confidence score (sketched below):
  • 1. Maximum value of the posterior distribution, max_c P(y = c | x)
  • 2. Entropy of the posterior distribution (a peaked, low-entropy posterior = high confidence)
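A small sketch of both scores, assuming the classifier's logits are already available as a NumPy array (names are illustrative):

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)  # stabilize the exponentials
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def max_prob_score(logits):
    """Score 1: maximum value of the posterior distribution."""
    return softmax(logits).max(axis=-1)

def neg_entropy_score(logits):
    """Score 2: negative entropy of the posterior (peaked = confident)."""
    p = softmax(logits)
    return (p * np.log(p + 1e-12)).sum(axis=-1)
```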

• Baseline detector [Hendrycks et al., 2017]
  • Confidence score = maximum value of the predictive distribution; if score > 𝜖: in-distribution, else: out-of-distribution.
• Evaluation: detecting out-of-distribution samples
  • Assume a classifier trained on the MNIST dataset, and detect out-of-distribution inputs for this classifier.

[Figure: example predictive distributions and data for in-distribution vs. out-of-distribution inputs.]

• Evaluation metrics (TP = true positive, FN = false negative, TN = true negative, FP = false positive)
  • AUROC: area under the ROC curve, which plots the true positive rate (TPR) against the false positive rate (FPR) as the threshold varies.
  • AUPR: area under the precision-recall (PR) curve, which plots precision against recall.
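Both metrics are threshold-free and can be computed with scikit-learn; a sketch on synthetic scores (the score values here are made up for illustration):

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

rng = np.random.default_rng(0)
in_scores = rng.normal(1.0, 0.5, size=1000)   # in-distribution: positive class
out_scores = rng.normal(0.0, 0.5, size=1000)  # out-of-distribution: negative class

scores = np.concatenate([in_scores, out_scores])
labels = np.concatenate([np.ones(1000), np.zeros(1000)])

print("AUROC:", roc_auc_score(labels, scores))
print("AUPR :", average_precision_score(labels, scores))
```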

• Evaluation on image classification (computer vision): the baseline method is better than a random detector.
• Evaluation on text categorization (NLP), where the out-of-distribution sets are held-out classes:
  • 5 Newsgroups (out) vs. 15 Newsgroups (in)
  • 2 Reuters classes (out) vs. Reuters 6 (in)
  • 12 Reuters classes (out) vs. Reuters 40 (in)

• ODIN detector [Liang et al., 2018]
  • Calibrates the posterior distribution using post-processing, based on two techniques.
  • Technique 1: temperature scaling, which relaxes overconfidence by smoothing the posterior distribution:

    S_c(x; T) = exp(f_c(x)/T) / Σ_{c'} exp(f_{c'}(x)/T),

    where T > 0 is the temperature scaling parameter and f_c(x) is the logit for class c.
  • Technique 2: input preprocessing, which perturbs the test input in the direction of increasing confidence:

    x̃ = x − 𝜖 · sign(−∇_x log S_ŷ(x; T)),

    where 𝜖 is the magnitude of the noise and ŷ is the predicted class.
  • Using the two techniques, the confidence score is defined as the maximum temperature-scaled posterior on the preprocessed input, max_c S_c(x̃; T); a sketch is given below.
  • How to select the hyper-parameters: a validation set of 1,000 images from the in-distribution (positive) and 1,000 images from the out-of-distribution (negative).
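A PyTorch sketch of the ODIN score under these definitions; the default T and 𝜖 below are illustrative only, since in the paper they are tuned on the validation set described above:

```python
import torch
import torch.nn.functional as F

def odin_score(model, x, T=1000.0, eps=0.0014):
    """ODIN confidence score: temperature scaling + input preprocessing.
    T and eps are hyper-parameters tuned on a small validation set."""
    x = x.clone().requires_grad_(True)
    log_p = F.log_softmax(model(x) / T, dim=1).max(dim=1).values
    log_p.sum().backward()                    # d log S_yhat(x; T) / dx, per sample
    x_tilde = x - eps * torch.sign(-x.grad)   # perturb toward higher confidence
    with torch.no_grad():
        probs = F.softmax(model(x_tilde) / T, dim=1)
    return probs.max(dim=1).values            # score; compare against threshold
```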

• Experimental results
  [Table: out-of-distribution detection performance of ODIN vs. the baseline detector.]


Utilizing the Hidden Features

• Motivation
  • Hidden features from DNNs contain meaningful features of the training data: given lots of data, early layers capture edges, intermediate layers capture object parts, and later layers capture whole objects.
  • Such features can be useful for detecting abnormal samples!

• Local Intrinsic Dimensionality (LID) [Ma et al., 2018]
  • Expansion dimension: the rate of growth in the number of data points encountered as the distance from a reference sample increases. For two balls of volumes V_1, V_2 and radii r_1, r_2:

    V_2 / V_1 = (r_2 / r_1)^m  ⟹  m = ln(V_2 / V_1) / ln(r_2 / r_1)   (1)

  • LID = the expansion dimension in the statistical setting. With F(r) the distribution of distances from the reference sample,

    LID = lim_{r→0} r · F'(r) / F(r),

    where F is analogous to the volume V in equation (1).
  • Estimation of LID [Amsaleg et al., 2015], via maximum likelihood over the k nearest neighbors:

    LID_hat(x) = − ( (1/k) Σ_{i=1}^{k} ln( r_i(x) / r_k(x) ) )^{−1},

    where r_i(x) is the distance between the sample x and its i-th nearest neighbor.
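A NumPy sketch of this estimator (argument names are illustrative); in [Ma et al., 2018] it is computed per layer, with the neighbors drawn from a minibatch of hidden features:

```python
import numpy as np

def lid_mle(x, batch, k=20):
    """Maximum-likelihood LID estimate of a sample x against a batch of
    reference points, following [Amsaleg et al., 2015]."""
    dists = np.sort(np.linalg.norm(batch - x, axis=1))
    dists = dists[dists > 0][:k]   # drop x itself if it is in the batch
    r_k = dists[-1]                # distance to the k-th nearest neighbor
    return -1.0 / np.mean(np.log(dists / r_k))
```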

• Motivation for using LID
  • Abnormal samples tend to be more scattered than normal samples, i.e., they occupy regions of higher local intrinsic dimensionality.
  • This implies that LID can be useful for detecting abnormal samples!
• Evaluation: detecting adversarial samples [Szegedy et al., 2013]
  • Adversarial samples are misclassified examples that are only slightly different from the original, correctly classified examples. (*This topic will be covered in the next lecture.)

• Empirical justification
  • Adversarial samples (generated by the Opt attack [Carlini et al., 2017]) can be distinguished from normal samples using LID.
  • LID values from low-level layers are also useful for detection.
• Main results on detecting adversarial attacks
  • Tested baselines: Bayesian uncertainty (BU) and a density estimator (DE) [Feinman et al., 2017].
  • LID outperforms all baseline methods.

• Mahalanobis distance-based confidence score [Lee et al., 2018b]
  • Given a pre-trained softmax classifier with DNNs, induce a generative classifier on the hidden feature space (the penultimate layer).
  • Motivation: a connection between the softmax classifier and a generative classifier (LDA). Under LDA, i.e., class-conditional Gaussians with a shared covariance, P(f(x) | y = c) = N(f(x) | μ_c, Σ), the posterior P(y = c | f(x)) has exactly the softmax form.
  • The parameters of the generative classifier are the class-wise sample means and the shared sample covariance, computed over the training data:

    μ_c = (1/N_c) Σ_{i: y_i = c} f(x_i),   Σ = (1/N) Σ_c Σ_{i: y_i = c} (f(x_i) − μ_c)(f(x_i) − μ_c)^T.

• Using the generative classifier, a new confidence score is defined:

  M(x) = max_c −(f(x) − μ_c)^T Σ^{−1} (f(x) − μ_c)

  • This measures the log of the probability density of the test sample under the closest class-conditional Gaussian (up to constants); a sketch follows after this list.
  • Intuition: in-distribution samples lie near one of the class means in feature space, while abnormal samples do not.
• Boosting the performance
  • Input pre-processing, motivated by ODIN [Liang et al., 2018]: add a small perturbation to the test sample that increases the confidence score.
  • Feature ensemble: fit Gaussians using features from intermediate layers as well, not only the penultimate one.
    • Intuition: low-level features can also be useful for detecting abnormal samples.
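A NumPy sketch of the score for a single layer (feature extraction and the pre-processing/ensemble steps are omitted; array names are illustrative):

```python
import numpy as np

def fit_gaussian_params(feats, labels, num_classes):
    """Class-wise sample means and the inverse of the shared sample
    covariance, estimated from training features (feats: [N, D])."""
    means = np.stack([feats[labels == c].mean(axis=0) for c in range(num_classes)])
    centered = feats - means[labels]
    precision = np.linalg.inv(centered.T @ centered / len(feats))
    return means, precision

def mahalanobis_score(f_x, means, precision):
    """Confidence score: negative Mahalanobis distance between the test
    feature f_x and the closest class-conditional Gaussian."""
    diffs = means - f_x                                 # [C, D]
    d2 = np.einsum('cd,de,ce->c', diffs, precision, diffs)
    return -d2.min()
```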

• Main algorithm
  • Compute the Mahalanobis distance-based score at every layer and combine the confidence scores from multiple layers using a weighted ensemble (see the sketch after this list).
  • The ensemble weights are selected by utilizing the validation set.
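One way to realize the weighted ensemble is to fit a logistic regression on hypothetical per-layer scores from the validation set, as sketched below (all array names and sizes here are made up for illustration):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
val_scores = rng.normal(size=(2000, 5))      # per-layer scores on validation data
val_labels = rng.integers(0, 2, size=2000)   # 1 = in-distribution, 0 = abnormal

lr = LogisticRegression().fit(val_scores, val_labels)
weights = lr.coef_[0]                        # learned per-layer ensemble weights

test_scores = rng.normal(size=(10, 5))       # per-layer scores on test samples
final_score = test_scores @ weights          # combined confidence score
```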

• Experimental results on detecting out-of-distribution samples
  • Contribution by each technique, comparing:
    • Baseline [Hendrycks et al., 2017]: maximum value of the posterior distribution.
    • ODIN [Liang et al., 2018]: maximum value of the posterior distribution after post-processing.
    • Ours: the proposed Mahalanobis distance-based score.
  • Main results: in all cases, the proposed score outperforms ODIN and the baseline method, both when the validation set consists of 1K samples from each in-/out-of-distribution pair and when it consists of 1K in-distribution samples plus corresponding FGSM samples, i.e., with no information about the out-of-distribution.
• Experimental results on detecting adversarial attacks
  • For all tested cases, the method outperforms LID and the KD (kernel density) estimator.
  • The method still works well for unseen attacks: only the FGSM samples, denoted by "seen", are used for validation.

Summary

• In this lecture, we covered various methods for detecting abnormal samples such as out-of-distribution and adversarial samples:
  • Posterior distribution-based methods
  • Hidden feature-based methods
• There are also training methods for obtaining better-calibrated scores:
  • Ensembles of classifiers [Balaji et al., 2017]
  • Bayesian deep models [Li et al., 2017]
  • Calibration loss with GANs [Lee et al., 2018a]
• Such methods can be useful for many machine learning applications:
  • Active learning [Gal et al., 2017]
  • Incremental learning [Rebuffi et al., 2017]
  • Ensemble learning [Lee et al., 2017]
  • Network calibration [Guo et al., 2017]

References

[Amodei et al., 2016] Concrete problems in AI safety. arXiv preprint arXiv:1606.06565, 2016. https://arxiv.org/abs/1606.06565
[Amsaleg et al., 2015] Estimating local intrinsic dimensionality. In SIGKDD, 2015. http://mistis.inrialpes.fr/~girard/Fichiers/p29-amsaleg.pdf
[Balaji et al., 2017] Simple and scalable predictive uncertainty estimation using deep ensembles. In NIPS, 2017. https://arxiv.org/pdf/1612.01474.pdf
[Carlini et al., 2017] Towards evaluating the robustness of neural networks. In IEEE S&P, 2017. https://arxiv.org/abs/1608.04644
[Feinman et al., 2017] Detecting adversarial samples from artifacts. arXiv preprint arXiv:1703.00410, 2017. https://arxiv.org/abs/1703.00410
[Gal et al., 2017] Deep Bayesian active learning with image data. In ICML, 2017. https://arxiv.org/abs/1703.02910
[Goodfellow et al., 2015] Explaining and harnessing adversarial examples. In ICLR, 2015. https://arxiv.org/pdf/1412.6572.pdf
[Guo et al., 2017] On calibration of modern neural networks. In ICML, 2017. https://arxiv.org/abs/1706.04599
[Hendrycks et al., 2017] A baseline for detecting misclassified and out-of-distribution examples in neural networks. In ICLR, 2017. https://arxiv.org/abs/1610.02136
[Huang et al., 2017] Densely connected convolutional networks. In CVPR, 2017. https://arxiv.org/abs/1608.06993
[Lee et al., 2017] Confident multiple choice learning. In ICML, 2017. https://arxiv.org/abs/1706.03475
[Lee et al., 2018a] Training confidence-calibrated classifiers for detecting out-of-distribution samples. In ICLR, 2018. https://arxiv.org/abs/1711.09325
[Lee et al., 2018b] A simple unified framework for detecting out-of-distribution samples and adversarial attacks. In NIPS, 2018. https://arxiv.org/abs/1807.03888
[Li et al., 2017] Dropout inference in Bayesian neural networks with alpha-divergences. In ICML, 2017. https://arxiv.org/abs/1703.02914
[Liang et al., 2018] Principled detection of out-of-distribution examples in neural networks. In ICLR, 2018. https://arxiv.org/abs/1706.02690
[Ma et al., 2018] Characterizing adversarial subspaces using local intrinsic dimensionality. In ICLR, 2018. https://openreview.net/pdf?id=B1gJ1L2aW
[Rebuffi et al., 2017] iCaRL: Incremental classifier and representation learning. In CVPR, 2017. https://arxiv.org/pdf/1611.07725.pdf
[Szegedy et al., 2013] Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199, 2013. https://arxiv.org/abs/1312.6199
[Zagoruyko et al., 2016] Wide residual networks. In BMVC, 2016. https://arxiv.org/pdf/1605.07146.pdf
