Novelty Detection for Deep Classifiers
alinlab.kaist.ac.kr/resource/lec13_novelty_detection.pdf

TRANSCRIPT
Algorithmic Intelligence Lab
EE807: Recent Advances in Deep Learning, Lecture 13
Slides made by Kimin Lee (KAIST EE)

Novelty Detection for Deep Classifiers
Table of Contents

1. Introduction
   • What is novelty detection?
   • Overview
2. Utilizing the Posterior Distribution
   • Baseline method
   • Post-processing method
3. Utilizing the Hidden Features
   • Local intrinsic dimensionality
   • Mahalanobis distance-based score
What is Novelty Detection?

• Deep neural networks (DNNs) generalize well when the test samples come from a distribution similar to the training distribution (i.e., the in-distribution).

(Figure: training data = animals; a test image fed to the DNN yields a softmax output of 0.99 for "cat" over "dog".)
• However, in the real world there are many unknown and unseen samples for which the classifier cannot give a right answer:
  • Unseen samples, i.e., out-of-distribution samples (not an animal)
  • Unknown samples, e.g., adversarial samples [Goodfellow et al., 2015]
• Novelty detection
  • Given a pre-trained (deep) classifier,
  • detect whether a test sample is from the in-distribution (i.e., the training distribution of the classifier) or not (e.g., out-of-distribution or adversarial samples).

(Figure: decision boundary with an abnormal sample lying far from the training data.)
• Novelty detection can be useful for many machine learning problems:
  • Calibration [Guo et al., 2017]
  • Ensemble learning [Lee et al., 2017]
  • Incremental learning [Rebuffi et al., 2017]
• It is also indispensable when deploying DNNs in real-world systems [Amodei et al., 2016], e.g., autonomous driving and secure authentication systems.
• How can this problem be solved? Threshold-based detector [Hendrycks et al., 2017; Liang et al., 2018]
  • The test sample is passed through the deep classifier to obtain a confidence score.
  • If score > ε: in-distribution; else: out-of-distribution. A minimal sketch is given below.
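The following is a minimal sketch of such a detector, assuming a PyTorch classifier that returns logits; `score_fn` and the threshold `eps` stand in for the concrete scores and thresholds discussed in the rest of the lecture.

```python
import torch

def detect(classifier, x, score_fn, eps):
    """Return a boolean mask: True = judged in-distribution."""
    with torch.no_grad():
        logits = classifier(x)    # (batch, num_classes)
        score = score_fn(logits)  # (batch,) confidence scores
    return score > eps
```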
• The key question: how do we get the confidence score?
• Approach 1: utilizing the posterior distribution
  • 1. Maximum value or entropy of the posterior [Hendrycks et al., 2017]
  • 2. Input and output processing [Liang et al., 2018]
  • 3. Bayesian inference [Li et al., 2017] and ensembles of classifiers [Balaji et al., 2017]

(Figure: for an ambiguous test sample, the softmax spreads its mass over classes, e.g., 0.12 for "Persian cat" and 0.18 for "tiger cat".)
• Approach 2: utilizing the hidden features from DNNs
  • 1. Local intrinsic dimensionality [Ma et al., 2018]
  • 2. Mahalanobis distance [Lee et al., 2018b]
Utilizing the Posterior Distribution

• Recall that classification amounts to finding an unknown posterior distribution, i.e., P(Y|X).
• The posterior is modeled by a softmax classifier on top of a DNN:

  P(y = c | x) = exp(w_c^T f(x) + b_c) / Σ_{c'} exp(w_{c'}^T f(x) + b_{c'}),

  where f(x) denotes the hidden features from the DNN and w_c, b_c are the softmax-layer parameters.

(Figure: mapping from input space to output space.)
• Natural choices for a confidence score:
  • 1. The maximum value of the posterior distribution
  • 2. The entropy of the posterior distribution (low entropy means high confidence)
• Baseline detector [Hendrycks et al., 2017]
  • Confidence score = maximum value of the predictive distribution, i.e., max_c P(y = c | x).
  • If score > ε: in-distribution; else: out-of-distribution. A sketch of this score follows.
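A minimal sketch of this maximum softmax probability score, under the same PyTorch assumptions as the detector sketch above:

```python
import torch.nn.functional as F

def msp_score(logits):
    # Maximum value of the predictive (softmax) distribution per sample.
    return F.softmax(logits, dim=1).max(dim=1).values
```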
• Evaluation: detecting out-of-distribution
  • Assume a classifier trained on the MNIST dataset.
  • Detect out-of-distribution samples for this classifier.

(Figure: for in-distribution data the predictive distribution is peaked on a single class, while for out-of-distribution data it is spread out.)
• Evaluation metrics (TP = true positive / FN = false negative / TN = true negative / FP = false positive)
  • AUROC: area under the ROC curve, which plots the true positive rate (TPR) against the false positive rate (FPR) as the threshold varies.
  • AUPR: area under the precision-recall (PR) curve, which plots precision against recall.
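As a sketch, both metrics can be computed with scikit-learn, treating in-distribution as the positive class; `scores_in` and `scores_out` are assumed to be score arrays for held-out in- and out-of-distribution samples:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

def detection_metrics(scores_in, scores_out):
    labels = np.concatenate([np.ones_like(scores_in), np.zeros_like(scores_out)])
    scores = np.concatenate([scores_in, scores_out])
    return roc_auc_score(labels, scores), average_precision_score(labels, scores)
```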
• Evaluation on image classification (computer vision)

(Table: AUROC/AUPR of the baseline detector on vision benchmarks.)

• The baseline method is better than a random detector.
• Evaluation on text categorization (NLP)
  • Out-of-distribution setups:
    • 5 held-out newsgroups vs. a classifier trained on 15 Newsgroups
    • 2 held-out Reuters classes vs. a classifier trained on 6 Reuters classes
    • 12 held-out Reuters classes vs. a classifier trained on 40 Reuters classes
• ODIN detector [Liang et al., 2018]
  • Calibrates the posterior distribution using post-processing, via two techniques.
  • Technique 1: temperature scaling,

    S_i(x; T) = exp(f_i(x) / T) / Σ_j exp(f_j(x) / T),

    where T > 0 is the temperature scaling parameter and f_i(x) is the logit for class i.
  • This relaxes overconfidence by smoothing the posterior distribution; a sketch follows.
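A one-line sketch of temperature scaling in PyTorch; the default T is illustrative, since ODIN tunes T on validation data:

```python
import torch.nn.functional as F

def temperature_scaled_softmax(logits, T=1000.0):
    # Larger T smooths the posterior and relaxes overconfidence.
    return F.softmax(logits / T, dim=1)
```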
  • Technique 2: input preprocessing,

    x̃ = x − ε · sign(−∇_x log S_ŷ(x; T)),

    where ε is the magnitude of the noise and ŷ is the predicted label. The perturbation pushes the input toward higher predicted-class confidence, and in-distribution samples tend to gain more confidence from it than out-of-distribution samples. A sketch follows.
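A sketch of the preprocessing step; the default ε is illustrative, as ODIN also tunes it on validation data:

```python
import torch
import torch.nn.functional as F

def odin_preprocess(classifier, x, T=1000.0, eps=0.0014):
    # Perturb the input toward higher predicted-class confidence.
    x = x.clone().requires_grad_(True)
    log_probs = F.log_softmax(classifier(x) / T, dim=1)
    loss = -log_probs.max(dim=1).values.sum()  # -log S_yhat, summed over batch
    loss.backward()
    return (x - eps * x.grad.sign()).detach()
```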
  • Combining the two techniques, the authors define the confidence score as the maximum temperature-scaled softmax value on the preprocessed input:

    score(x) = max_i S_i(x̃; T).
  • How to select the hyper-parameters (T and ε): a validation set consisting of
    • 1,000 images from the in-distribution (positive) and
    • 1,000 images from the out-of-distribution (negative).
• Experimental results

(Table: experimental results of ODIN.)
Utilizing the Hidden Features

• Motivation
  • Hidden features from DNNs contain meaningful features of the training data.
  • They can be useful for detecting abnormal samples!

(Figure: feature hierarchy learned from lots of data: edges, then parts, then objects.)
• Local Intrinsic Dimensionality (LID) [Ma et al., 2018]
  • Expansion dimension: the rate of growth in the number of data points encountered as the distance from the reference sample increases. For balls of volume V at radii r_1 < r_2,

    V(r_2) / V(r_1) = (r_2 / r_1)^m, which gives m = ln(V(r_2)/V(r_1)) / ln(r_2/r_1).    (1)
  • LID is the expansion dimension in the statistical setting: with F(r) the cumulative distribution function of the distance from the reference sample,

    LID_F(r) = lim_{ε→0} ln(F((1+ε)r) / F(r)) / ln(1+ε),    LID_F = lim_{r→0} LID_F(r),

    where F is analogous to the volume V in equation (1).
  • Estimation of LID [Amsaleg et al., 2015]: the maximum likelihood estimate from the k nearest neighbors,

    LID(x) ≈ −( (1/k) Σ_{i=1}^{k} ln( r_i(x) / r_k(x) ) )^{−1},

    where r_i(x) is the distance between the sample x and its i-th nearest neighbor (in particular, r_k(x) is the distance to the k-th nearest neighbor). A sketch of this estimator follows.
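A minimal NumPy sketch of this estimator; `x` is assumed to be a single feature vector and `reference` a batch of reference features:

```python
import numpy as np

def lid_mle(x, reference, k=20):
    # MLE of local intrinsic dimensionality from the k nearest neighbors.
    dists = np.linalg.norm(reference - x, axis=1)
    r = np.sort(dists)[:k]                    # distances to k nearest neighbors
    r = r[r > 0]                              # guard against zero distances
    return -1.0 / np.mean(np.log(r / r[-1]))  # -( (1/k) * sum ln(r_i/r_k) )^-1
```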
• Motivation of LID
  • Abnormal samples might be more scattered (occupy higher-dimensional local neighborhoods) than normal samples.
  • This implies that LID can be useful for detecting abnormal samples!
• Evaluation: detecting adversarial samples [Szegedy et al., 2013]
  • Adversarial samples are misclassified examples that are only slightly different from the original examples.
  • (*This topic will be covered in the next lecture.)
• Empirical justification
  • Adversarial samples (generated by the optimization-based attack of [Carlini et al., 2017]) can be distinguished using LID.
  • LID values from low-level layers are also useful for detection.
• Main results on detecting adversarial attacks
  • Tested baselines: Bayesian uncertainty (BU) and density estimator (DE) [Feinman et al., 2017].
  • LID outperforms all baseline methods.
• Mahalanobis distance-based confidence score [Lee et al., 2018b]
  • Given a pre-trained softmax classifier with DNNs,

    P(y = c | x) = exp(w_c^T f(x) + b_c) / Σ_{c'} exp(w_{c'}^T f(x) + b_{c'}),

    where f(x) is the penultimate-layer feature,
  • induce a generative classifier on the hidden feature space: a class-conditional Gaussian with a shared covariance,

    P(f(x) | y = c) = N(f(x) | μ_c, Σ).

  • Motivation: the connection between the softmax classifier and the generative classifier (LDA). Under the Gaussian model above, the posterior P(y | f(x)) has exactly the softmax form.
  • The parameters of the generative classifier are the sample class means and the shared sample covariance, computed from the training data:

    μ_c = (1/N_c) Σ_{i: y_i = c} f(x_i),    Σ = (1/N) Σ_c Σ_{i: y_i = c} (f(x_i) − μ_c)(f(x_i) − μ_c)^T.

  A sketch of this fitting step follows.
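A NumPy sketch of fitting the class means and tied covariance on precomputed penultimate features; `features` (N, d) and `labels` (N,) are assumed arrays:

```python
import numpy as np

def fit_gaussian_classifier(features, labels, num_classes):
    # Sample class means and shared ("tied") covariance over all classes.
    means = np.stack([features[labels == c].mean(axis=0)
                      for c in range(num_classes)])
    centered = features - means[labels]          # subtract each class mean
    cov = centered.T @ centered / len(features)  # (d, d) shared covariance
    return means, cov
```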
• Using the generative classifier, a new confidence score is defined by measuring the log probability density of the test sample, i.e., the (negative squared) Mahalanobis distance to the closest class mean:

  M(x) = max_c − (f(x) − μ_c)^T Σ^{−1} (f(x) − μ_c).

  A sketch follows.
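A sketch of the score, continuing the NumPy setting above; the small regularizer on the covariance is an assumption added for numerical stability:

```python
import numpy as np

def mahalanobis_score(feature, means, cov, reg=1e-6):
    # Negative squared Mahalanobis distance to the closest class mean.
    prec = np.linalg.inv(cov + reg * np.eye(cov.shape[0]))
    diffs = means - feature                          # (num_classes, d)
    d2 = np.einsum('cd,de,ce->c', diffs, prec, diffs)
    return -d2.min()
```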
• Intuition

(Figure: in-distribution samples lie close to some class mean in feature space, while abnormal samples lie far from every class mean.)
• Boosting the performance, technique 1: input pre-processing
  • Motivated by ODIN [Liang et al., 2018]: perturb the test sample in the direction that increases the confidence score,

    x̃ = x + ε · sign(∇_x M(x)),

  which makes in-distribution samples more separable from abnormal ones. A sketch follows.
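A PyTorch sketch of this pre-processing step; `feature_fn` (a hook returning penultimate features) and the default ε are assumptions:

```python
import torch

def mahalanobis_preprocess(feature_fn, x, means, prec, eps=0.002):
    # Perturb the input toward a higher Mahalanobis confidence score.
    x = x.clone().requires_grad_(True)
    diffs = means - feature_fn(x)                       # (num_classes, d)
    d2 = torch.einsum('cd,de,ce->c', diffs, prec, diffs)
    score = -d2.min()                                   # confidence M(x)
    score.backward()
    return (x + eps * x.grad.sign()).detach()
```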
• Boosting the performance, technique 2: feature ensemble
  • Fit a class-conditional Gaussian to the features of every intermediate layer, not only the penultimate layer, and compute a Mahalanobis score per layer.
  • Intuition: low-level features can also be useful for detecting abnormal samples.
• Main algorithm
  • The confidence scores from multiple layers are combined using a weighted ensemble; see the sketch below.
  • The ensemble weights are selected by utilizing the validation set.
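A sketch of selecting the ensemble weights; fitting a logistic regression on per-layer scores from the validation set is one concrete way to do this (an assumption about the exact procedure):

```python
from sklearn.linear_model import LogisticRegression

def fit_ensemble_weights(layer_scores_val, labels_val):
    # layer_scores_val: (num_samples, num_layers) per-layer Mahalanobis scores;
    # labels_val: 1 = in-distribution, 0 = out-of-distribution.
    lr = LogisticRegression().fit(layer_scores_val, labels_val)
    return lr  # lr.decision_function(layer_scores) is the combined score
```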
• Experimental results on detecting out-of-distribution: contribution by each technique

(Table legend:
  Baseline [Hendrycks et al., 2017]: maximum value of the posterior distribution.
  ODIN [Liang et al., 2018]: maximum value of the posterior distribution after post-processing.
  Ours [Lee et al., 2018b]: the proposed Mahalanobis distance-based score.)
• Main results
  • For all cases, the proposed score outperforms ODIN and the baseline method.
  • Validation setting 1: 1K samples from each in- and out-of-distribution pair.
  • Validation setting 2: 1K in-distribution samples and corresponding FGSM data, i.e., no information about the out-of-distribution is used.
• Experimental results on detecting adversarial attacks
  • For all tested cases, the method outperforms the LID and kernel density (KD) estimators.
  • The method still works well on unseen attacks: only the FGSM samples (denoted "seen") are used for validation.
Summary

• In this lecture, we covered various methods for detecting abnormal samples such as out-of-distribution and adversarial samples:
  • Posterior distribution-based methods
  • Hidden feature-based methods
• There are also training methods for obtaining better-calibrated scores:
  • Ensembles of classifiers [Balaji et al., 2017]
  • Bayesian deep models [Li et al., 2017]
  • Calibration loss with GANs [Lee et al., 2018a]
• Such methods can be useful for many machine learning applications:
  • Active learning [Gal et al., 2017]
  • Incremental learning [Rebuffi et al., 2017]
  • Ensemble learning [Lee et al., 2017]
  • Network calibration [Guo et al., 2017]
References

[Hendrycks et al., 2017] A baseline for detecting misclassified and out-of-distribution examples in neural networks. In ICLR, 2017. https://arxiv.org/abs/1610.02136
[Ma et al., 2018] Characterizing Adversarial Subspaces Using Local Intrinsic Dimensionality. In ICLR, 2018. https://openreview.net/pdf?id=B1gJ1L2aW
[Feinman et al., 2017] Detecting adversarial samples from artifacts. arXiv preprint arXiv:1703.00410, 2017. https://arxiv.org/abs/1703.00410
[Lee et al., 2018a] Training Confidence-calibrated Classifiers for Detecting Out-of-Distribution Samples. In ICLR, 2018. https://arxiv.org/abs/1711.09325
[Lee et al., 2018b] A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks. In NIPS, 2018. https://arxiv.org/abs/1807.03888
[Liang et al., 2018] Principled Detection of Out-of-Distribution Examples in Neural Networks. In ICLR, 2018. https://arxiv.org/abs/1706.02690
[Goodfellow et al., 2015] Explaining and harnessing adversarial examples. In ICLR, 2015. https://arxiv.org/pdf/1412.6572.pdf
[Amodei et al., 2016] Concrete problems in AI safety. arXiv preprint arXiv:1606.06565, 2016. https://arxiv.org/abs/1606.06565
[Guo et al., 2017] On Calibration of Modern Neural Networks. In ICML, 2017. https://arxiv.org/abs/1706.04599
[Lee et al., 2017] Confident Multiple Choice Learning. In ICML, 2017. https://arxiv.org/abs/1706.03475
[Balaji et al., 2017] Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles. In NIPS, 2017. https://arxiv.org/pdf/1612.01474.pdf
[Rebuffi et al., 2017] iCaRL: Incremental Classifier and Representation Learning. In CVPR, 2017. https://arxiv.org/pdf/1611.07725.pdf
[Huang et al., 2017] Densely connected convolutional networks. In CVPR, 2017. https://arxiv.org/abs/1608.06993
[Zagoruyko et al., 2016] Wide residual networks. In BMVC, 2016. https://arxiv.org/pdf/1605.07146.pdf
[Amsaleg et al., 2015] Estimating local intrinsic dimensionality. In SIGKDD, 2015. http://mistis.inrialpes.fr/~girard/Fichiers/p29-amsaleg.pdf
[Szegedy et al., 2013] Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199, 2013. https://arxiv.org/abs/1312.6199
[Li et al., 2017] Dropout Inference in Bayesian Neural Networks with Alpha-divergences. In ICML, 2017. https://arxiv.org/abs/1703.02914
[Gal et al., 2017] Deep Bayesian Active Learning with Image Data. In ICML, 2017. https://arxiv.org/abs/1703.02910
[Carlini et al., 2017] Towards evaluating the robustness of neural networks. In IEEE S&P, 2017. https://arxiv.org/abs/1608.04644