change detection in multivariate data: likelihood and detectability loss
TRANSCRIPT
![Page 1: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/1.jpg)
Change Detection in Multivariate Data: Likelihood and Detectability Loss
Giacomo Boracchi
July 8th, 2016
TJ Watson, IBM NY
![Page 2: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/2.jpg)
Examples of CD Problems: Anomaly Detection
![Page 4: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/4.jpg)
Other Examples of CD Problems
ECG monitoring: detect arrhythmias / device mispositioning
Environmental monitoring: detect changes in signals monitoring a rockface
Stream mining: fraud detection
Stream mining: online classification systems (spam classification, fraud detection)
![Page 7: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/7.jpg)
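The Change-Detection Problem
Often, these problems boil down to:
a) Monitor a stream $\{x(t),\ t = 1, \dots\}$, $x(t) \in \mathbb{R}^d$, of realizations of a random variable, and detect the change-point $\tau$,
$$x(t) \sim \begin{cases} \phi_0 & t < \tau \\ \phi_1 & t \geq \tau \end{cases},$$
where $\{x(t),\ t < \tau\}$ are i.i.d. and $\phi_0 \neq \phi_1$; $\phi_1$ is unknown and $\phi_0$ can possibly be estimated from training data.
[Figure: a stream $x(t)$ over time $t$, following $\phi_0$ before the change-point $\tau$ and $\phi_1$ after it]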
(Process) Change Detection Problem
![Page 9: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/9.jpg)
The Change-Detection Problem
Often, these problems boil down to:
b) Determining whether a set of data $\{x(t),\ t = t_0, \dots, t_1\}$ is generated from $\phi_0$ and detecting possible outliers
We refer to
• $\phi_0$: pre-change distribution / normal (can be estimated)
• $\phi_1$: post-change distribution / anomalous (unknown)
[Figure: a stream $x(t)$ over time $t$ that follows $\phi_0$ except for an anomalous segment generated by $\phi_1$]
Anomaly Detection Problem
![Page 11: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/11.jpg)
THE ADDRESSED PROBLEM
![Page 12: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/12.jpg)
Our Goal
Study how the data dimension $d$ influences the change detectability, i.e., how difficult it is to solve these two problems
[Figure: a stream $x(t)$ over time $t$, following $\phi_0$ before the change-point $\tau$ and $\phi_1$ after it]
![Page 15: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/15.jpg)
Our Approach
To study the impact of the data dimension $d$ alone in change-detection problems we need to:
1. Consider a change-detection approach
2. Define a measure of change detectability that correlates well with traditional performance measures
3. Define a measure of change magnitude that refers only to differences between $\phi_0$ and $\phi_1$
Our goal (reformulated):
Studying how the change detectability varies in change-detection problems that have
• different data dimensions $d$
• constant change magnitude
![Page 17: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/17.jpg)
Our Result
We show there is a detectability loss problem, i.e., that change detectability steadily decreases when $d$ increases.
Detectability loss is shown by:
• Analytical derivations: when $\phi_0$ and $\phi_1$ are Gaussians
• Empirical analysis: measuring the power of hypothesis tests in change-detection problems on real data
![Page 18: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/18.jpg)
Presentation Outline
Preliminaries:
• Assumptions
• The change-detection approach
• The change magnitude
• The measure of change detectability
The detectability loss
Detectability loss and anomaly detection in images
![Page 20: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/20.jpg)
Our Assumptions
To detect the change $\phi_0 \to \phi_1$ we assume that
• $\phi_0$ is unknown, but can be estimated from a training set $TR = \{x(t),\ t < t_0,\ x \sim \phi_0\}$
• $\phi_1$ is unknown, and no training data are provided
We refer to
• $\phi_0$ as the stationary / normal / pre-change distribution
• $\hat{\phi}_0$ as the estimate of $\phi_0$ from a training set
• $\phi_1$ as the nonstationary / anomalous / post-change distribution
![Page 22: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/22.jpg)
How? Monitoring the Log-likelihood
A typical approach is to monitor the log-likelihood:
1. During training, estimate $\hat{\phi}_0$ from $TR$
2. During testing, compute
$$\mathcal{L}(x(t)) = \log(\hat{\phi}_0(x(t)))$$
3. Monitor $\{\mathcal{L}(x(t)),\ t = 1, \dots\}$
This is quite a popular approach in sequential monitoring and in anomaly detection
L. I. Kuncheva, "Change detection in streaming multivariate data using likelihood detectors," IEEE Transactions on Knowledge and Data Engineering, vol. 25, no. 5, 2013.
X. Song, M. Wu, C. Jermaine, and S. Ranka, "Statistical change detection for multidimensional data," in Proceedings of International Conference on Knowledge Discovery and Data Mining (KDD), 2007.
J. H. Sullivan and W. H. Woodall, "Change-point detection of mean vector or covariance matrix shifts using multivariate individual observations," IIE Transactions, vol. 32, no. 6, 2000.
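The three steps above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the density estimate is a single Gaussian with diagonal covariance (the slides later fit Gaussian mixtures), the data are synthetic, and the helper names `fit_diag_gaussian` and `log_likelihood` are hypothetical.

```python
import math
import random
import statistics

def fit_diag_gaussian(train):
    """Step 1: estimate per-component mean and variance from training vectors."""
    columns = list(zip(*train))
    means = [statistics.fmean(col) for col in columns]
    variances = [statistics.pvariance(col) for col in columns]
    return means, variances

def log_likelihood(x, means, variances):
    """Step 2: log of the fitted (diagonal Gaussian) density evaluated at x."""
    return sum(
        -0.5 * math.log(2 * math.pi * var) - (xi - mu) ** 2 / (2 * var)
        for xi, mu, var in zip(x, means, variances)
    )

rng = random.Random(0)
d = 4  # data dimension (assumed for the example)
train = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(2000)]
means, variances = fit_diag_gaussian(train)

# Synthetic stream: 300 samples from phi_0, then 300 with a shifted mean.
stream = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(300)]
stream += [[rng.gauss(1.5, 1) for _ in range(d)] for _ in range(300)]

# Step 3: the monitored sequence is the log-likelihood of each sample.
ll = [log_likelihood(x, means, variances) for x in stream]
mean_before = statistics.fmean(ll[:300])
mean_after = statistics.fmean(ll[300:])
```

In this toy stream the average log-likelihood drops after the change, which is the quantity a detector would monitor.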
![Page 25: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/25.jpg)
The Change Magnitude
We measure the magnitude of a change $\phi_0 \to \phi_1$ by the symmetric Kullback-Leibler divergence
$$\mathrm{sKL}(\phi_0, \phi_1) = \mathrm{KL}(\phi_0, \phi_1) + \mathrm{KL}(\phi_1, \phi_0) = \int \log\frac{\phi_0(x)}{\phi_1(x)}\,\phi_0(x)\,dx + \int \log\frac{\phi_1(x)}{\phi_0(x)}\,\phi_1(x)\,dx$$
In practice, large values of $\mathrm{sKL}(\phi_0, \phi_1)$ correspond to changes $\phi_0 \to \phi_1$ that are very apparent, since $\mathrm{sKL}(\phi_0, \phi_1)$ is related to the power of hypothesis tests designed to detect either $\phi_0 \to \phi_1$ or $\phi_1 \to \phi_0$ (Stein's lemma)
T. Dasu, K. Shankar, S. Venkatasubramanian, K. Yi, "An information-theoretic approach to detecting changes in multi-dimensional data streams," in Proc. Symp. on the Interface of Statistics, Computing Science, and Applications, 2006
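For Gaussians the symmetric KL divergence has a closed form, so a change of a prescribed magnitude can be constructed directly. The sketch below assumes diagonal covariances to stay within the Python standard library; the function names are hypothetical and not taken from the slides.

```python
import math

def kl_diag_gauss(mu0, var0, mu1, var1):
    """KL(N(mu0, diag(var0)) || N(mu1, diag(var1))) in closed form."""
    return 0.5 * sum(
        v0 / v1 + (m1 - m0) ** 2 / v1 - 1.0 + math.log(v1 / v0)
        for m0, v0, m1, v1 in zip(mu0, var0, mu1, var1)
    )

def skl_diag_gauss(mu0, var0, mu1, var1):
    """Symmetric KL divergence: KL(phi_0, phi_1) + KL(phi_1, phi_0)."""
    return kl_diag_gauss(mu0, var0, mu1, var1) + kl_diag_gauss(mu1, var1, mu0, var0)

d = 8
# A mean shift v with ||v||^2 = 1 spread evenly over the d components.
shift = [1.0 / math.sqrt(d)] * d
skl = skl_diag_gauss([0.0] * d, [1.0] * d, shift, [1.0] * d)
```

With unit covariances and a pure mean shift $v$, the symmetric divergence reduces to $\|v\|^2$, so rescaling the shift by $1/\sqrt{d}$ keeps the change magnitude constant as the dimension grows.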
![Page 26: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/26.jpg)
Our Goal / Presentation Outline
Preliminaries:
• The change-detection approach
• The change magnitude
• The measure of change detectability
The detectability loss
Concluding remarks
![Page 27: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/27.jpg)
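The Change Detectability
The Signal to Noise Ratio of the change
$$\mathrm{SNR}(\phi_0 \to \phi_1) = \frac{\left( \mathrm{E}_{x \sim \phi_0}[\mathcal{L}(x)] - \mathrm{E}_{x \sim \phi_1}[\mathcal{L}(x)] \right)^2}{\mathrm{var}_{x \sim \phi_0}[\mathcal{L}(x)] + \mathrm{var}_{x \sim \phi_1}[\mathcal{L}(x)]}$$
The $\mathrm{SNR}(\phi_0 \to \phi_1)$
• Measures the extent to which $\phi_0 \to \phi_1$ is detectable by monitoring $\mathrm{E}[\mathcal{L}(x)]$
• If we replace $\mathrm{E}[\cdot]$ and $\mathrm{var}[\cdot]$ by the sample estimators we get the t-test statistic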
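A sample version of this ratio is straightforward to compute from two sets of log-likelihood values. The sketch below is illustrative: the function name `snr` and the two short lists of values are made up for the example.

```python
import statistics

def snr(ll0, ll1):
    """Sample SNR: squared gap between the mean log-likelihoods,
    divided by the sum of the two sample variances."""
    gap = statistics.fmean(ll0) - statistics.fmean(ll1)
    return gap ** 2 / (statistics.variance(ll0) + statistics.variance(ll1))

# Hypothetical log-likelihood values before and after a change.
ll_pre = [-5.1, -4.8, -5.3, -5.0, -4.9]
ll_post = [-6.2, -5.9, -6.4, -6.0, -6.1]
value = snr(ll_pre, ll_post)
```

The ratio grows when the two means move apart and shrinks when either window becomes noisier, which is why it behaves like a two-sample t statistic.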
![Page 28: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/28.jpg)
DETECTABILITY LOSS
![Page 29: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/29.jpg)
The Detectability Loss
Theorem
Let $\phi_0 = \mathcal{N}(\mu_0, \Sigma_0)$ and let $\phi_1(x) = \phi_0(Qx + v)$, where $Q \in \mathbb{R}^{d \times d}$ is orthogonal and $v \in \mathbb{R}^d$; then
$$\mathrm{SNR}(\phi_0 \to \phi_1) < \frac{C}{d}$$
where $C$ is a constant that depends only on $\mathrm{sKL}(\phi_0, \phi_1)$
C. Alippi, G. Boracchi, D. Carrera, M. Roveri, "Change Detection in Multivariate Datastreams: Likelihood and Detectability Loss," IJCAI 2016, New York, USA, July 9-13
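The theorem can be illustrated numerically in a special case. The sketch below assumes $\phi_0 = \mathcal{N}(0, I_d)$, $Q = I$ and a mean shift with $\|v\|^2 = \mathrm{sKL} = 1$, and estimates the SNR by Monte Carlo for a few dimensions; it is a toy check, not the experimental protocol of the paper, and `empirical_snr` is a hypothetical helper.

```python
import math
import random
import statistics

def empirical_snr(d, skl=1.0, n=4000, seed=0):
    """Monte Carlo SNR of L(x) = log phi_0(x) for phi_0 = N(0, I_d) and a
    pure mean shift v with ||v||^2 = skl, so that sKL(phi_0, phi_1) = skl."""
    rng = random.Random(seed)
    shift = math.sqrt(skl / d)  # same shift on every component
    const = -0.5 * d * math.log(2 * math.pi)

    def loglik(x):
        return const - 0.5 * sum(xi * xi for xi in x)

    ll0 = [loglik([rng.gauss(0, 1) for _ in range(d)]) for _ in range(n)]
    ll1 = [loglik([rng.gauss(shift, 1) for _ in range(d)]) for _ in range(n)]
    gap = statistics.fmean(ll0) - statistics.fmean(ll1)
    return gap ** 2 / (statistics.variance(ll0) + statistics.variance(ll1))

# Same change magnitude, increasing dimension.
results = {d: empirical_snr(d) for d in (1, 4, 16, 64)}
```

In this special case a direct calculation gives $\mathrm{SNR} = \mathrm{sKL}^2 / (4(d + \mathrm{sKL}))$, so the estimates should shrink roughly like $1/d$ while the change magnitude stays fixed.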
![Page 30: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/30.jpg)
The Detectability Loss: Remarks
Remarks:
• Changes of a given magnitude, $\mathrm{sKL}(\phi_0, \phi_1)$, become more difficult to detect when $d$ increases
• DL does not depend on how $\phi_0$ changes
• DL does not depend on the specific detection rule
• DL does not depend on errors in estimating $\phi_0$
![Page 31: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/31.jpg)
The Detectability Loss: The Change Model
The change model $\phi_1(x) = \phi_0(Qx + v)$ includes:
• Changes in the location of $\phi_0$ (i.e., $+v$)
• Changes in the correlation of $x$ (i.e., $Qx$)
It does not include changes in the scale of $\phi_0$, which can however be detected by monitoring $\|x\|$
[Figure: $\phi_0$ and $\phi_1$]
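To make the change model concrete, here is a minimal two-dimensional sketch of how one might draw samples from $\phi_1$. It assumes $\phi_0$ is a Gaussian with independent components and that $Q$ is a plane rotation; `sample_phi1` and its default parameters are hypothetical.

```python
import math
import random

def sample_phi1(rng, theta, v, mu0=(0.0, 0.0), std0=(1.0, 0.5)):
    """Draw x ~ phi_1 where phi_1(x) = phi_0(Qx + v) and Q rotates by theta.
    Since |det Q| = 1, y = Qx + v follows phi_0, hence x = Q^T (y - v)."""
    y = [rng.gauss(mu0[0], std0[0]), rng.gauss(mu0[1], std0[1])]
    z = [y[0] - v[0], y[1] - v[1]]
    c, s = math.cos(theta), math.sin(theta)
    # Q = [[c, -s], [s, c]], so Q^T = [[c, s], [-s, c]].
    return [c * z[0] + s * z[1], -s * z[0] + c * z[1]]

rng = random.Random(0)
# Pure location change: Q = I, v = (1, 0).
shifted = [sample_phi1(rng, 0.0, (1.0, 0.0)) for _ in range(5000)]
# Pure rotation by 90 degrees: the two components swap their spread.
rotated = [sample_phi1(rng, math.pi / 2, (0.0, 0.0)) for _ in range(5000)]
```

A rotation leaves the overall scale untouched but redistributes variance among components, which is the sense in which $Qx$ changes the correlation structure.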
![Page 35: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/35.jpg)
The Detectability Loss: The Gaussian Assumption
Assuming $\phi_0 = \mathcal{N}(\mu_0, \Sigma_0)$ looks like a severe limitation.
• Other distributions are not easy to handle analytically
• We can prove that DL also occurs in random variables having independent components
• The result can be empirically extended to the approximations of $\mathcal{L}(\cdot)$ typically used for Gaussian mixtures
![Page 37: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/37.jpg)
The Detectability Loss: Empirical Analysis
The data
• Two datasets from the UCI database (Particle, Wine)
• Synthetically generate streams of different dimension $d$
• Estimate $\hat{\phi}_0$ by GM from a stationary training set
• In each stream we introduce $\phi_0 \to \phi_1$ such that $\phi_1(x) = \phi_0(Qx + v)$ and $\mathrm{sKL}(\phi_0, \phi_1) = 1$
• Test data: two windows $W_0$ and $W_1$ (500 samples each) selected before and after the change.
[Figure: a stream $x(t)$ over time $t$ with the windows $W_0$ and $W_1$ taken before and after the change]
![Page 38: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/38.jpg)
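The Detectability Loss: Empirical Analysis
The change-detectability measure:
• Compute the log-likelihood $\mathcal{L}(x)$ under $\hat{\phi}_0$ on $W_0$ and $W_1$, obtaining two sets of values, denoted here $L_0$ and $L_1$
• Compute a test statistic $\mathcal{T}(L_0, L_1)$ to compare the two
• Detect a change by a hypothesis test
$$\mathcal{T}(L_0, L_1) \lessgtr h$$
where $h$ controls the amount of false positives
• Use the power of this test to assess change detectability
[Figure: the log-likelihood $\mathcal{L}(x(t))$ over time $t$, before and after the change]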
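The procedure above can be mimicked on synthetic data. The sketch below makes several simplifying assumptions that are not in the slides: $\phi_0 = \mathcal{N}(0, I_d)$ is known rather than estimated by a Gaussian mixture, only a Welch t statistic is used (no Lepage test), the threshold $h$ comes from a large-sample normal approximation, and the number of trials is kept small.

```python
import math
import random
import statistics

def welch_t(a, b):
    """Two-sample t statistic with unequal variances."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.fmean(a) - statistics.fmean(b)) / se

def loglik_window(rng, d, shift, size):
    """Log-likelihood under phi_0 = N(0, I_d), up to an additive constant that
    cancels in the test, for a window whose components all have mean `shift`."""
    return [-0.5 * sum(rng.gauss(shift, 1) ** 2 for _ in range(d)) for _ in range(size)]

def estimate_power(d, skl=1.0, size=500, trials=50, alpha=0.05, seed=0):
    """Fraction of trials in which the change is detected at level alpha."""
    rng = random.Random(seed)
    h = statistics.NormalDist().inv_cdf(1 - alpha / 2)  # large-sample threshold
    shift = math.sqrt(skl / d)  # keeps sKL(phi_0, phi_1) = skl for every d
    detections = 0
    for _ in range(trials):
        before = loglik_window(rng, d, 0.0, size)
        after = loglik_window(rng, d, shift, size)
        if abs(welch_t(before, after)) > h:
            detections += 1
    return detections / trials

power = {d: estimate_power(d) for d in (2, 32)}
```

Keeping the change magnitude fixed while raising the dimension is the key design choice: any drop in estimated power is then attributable to $d$ alone.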
![Page 39: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/39.jpg)
DL: the Power of HTs on Gaussian Streams
Gaussians
Remarks:
• $\phi_1$ is defined analytically
• The t-test detects changes in expectation
• The Lepage test detects changes in the location and scale
Results
• The HT power decays with $d$: DL does not only concern the upper bound of SNR.
• DL is not due to estimation errors, but these make things worse.
• The power of the Lepage HT also decreases, which indicates that the change is also more difficult to detect when monitoring the variance
[Figure legend: Lepage on $\log(\phi_0(\cdot))$, Lepage on $\log(\hat{\phi}_0(\cdot))$, t-test on $\log(\phi_0(\cdot))$, t-test on $\log(\hat{\phi}_0(\cdot))$]
![Page 40: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/40.jpg)
Results: the Power of the Hypothesis Tests
[Figure panels: Particle, Wine]
• DL: the power of hypothesis tests also decays with $d$, not just the upper bound of SNR.
• DL also occurs in non-Gaussian data
• The Lepage statistic also decreases, which indicates that the change is also more difficult to detect when monitoring the variance
• Experiments on synthetic datasets confirm that DL is not due to estimation errors of $\hat{\phi}_0$
![Page 42: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/42.jpg)
DETECTABILITY LOSS AND
ANOMALY DETECTION IN IMAGES
![Page 43: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/43.jpg)
The Considered Problem
![Page 44: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/44.jpg)
Patch-based processing of nanofibers
Analyze each patch of an image $s$
$$\mathbf{s}_c = \{ s(c + u),\ u \in \mathcal{U} \}$$
and determine whether it is normal or anomalous
Patches $\mathbf{s}_c \in \mathbb{R}^p$ are too high-dimensional ($p \gg 0$) for modeling the distribution $\phi_0$ generating normal patches
We need to extract suitable features to reduce the dimensionality of our anomaly-detection problem.
![Page 45: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/45.jpg)
Feature Extraction
Expert-driven features: On each patch, compute
• the average,
• the variance,
• the total variation.
These are expected to distinguish normal and anomalous patches
Data-driven features: our approach consists of
1. Learning a model $\mathcal{D}$ that describes normal patches
2. Assessing the conformance of each patch $\mathbf{s}_c$ to $\mathcal{D}$
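The three expert-driven features are simple to compute on a patch. The sketch below stores a patch as a list of rows and uses the anisotropic discretization of total variation (sum of absolute differences between neighbouring pixels); the slides do not say which discretization was used, so that choice and the name `expert_features` are assumptions.

```python
import statistics

def expert_features(patch):
    """Return (average, variance, total variation) of a patch given as a
    list of rows of gray levels."""
    pixels = [p for row in patch for p in row]
    avg = statistics.fmean(pixels)
    var = statistics.pvariance(pixels)
    # Anisotropic total variation: absolute differences between
    # horizontally adjacent pixels, then vertically adjacent pixels.
    tv = sum(abs(row[j + 1] - row[j]) for row in patch for j in range(len(row) - 1))
    tv += sum(abs(patch[i + 1][j] - patch[i][j])
              for i in range(len(patch) - 1) for j in range(len(patch[0])))
    return avg, var, tv

flat = [[0.5] * 4 for _ in range(4)]             # uniform patch
edge = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]  # vertical edge
flat_features = expert_features(flat)
edge_features = expert_features(edge)
```

The two toy patches share the same average but differ in variance and total variation, which is the kind of separation these features are meant to provide.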
![Page 46: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/46.jpg)
$\mathcal{D}$: Dictionary of patches
Sparse representations have proven to be a very useful method for constructing signal models
The underlying assumption is that
$$\mathbf{s} \approx D\alpha, \quad \text{i.e.,} \quad \|\mathbf{s} - D\alpha\|_2 \approx 0$$
and $\alpha \in \mathbb{R}^n$ where:
• $D \in \mathbb{R}^{p \times n}$ is the dictionary, whose columns are called atoms
• the coefficient vector $\alpha$ is sparse
  - $\|\alpha\|_0 = L \ll n$ or
  - $\|\alpha\|_1$ is small
The dictionary is learned from a training set of normal patches.
We learn a union of low-dimensional subspaces where normal patches live
![Page 47: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/47.jpg)
The dictionary of normal patches
Example of training patches Few learned atoms (BPDN-based learning)
![Page 48: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/48.jpg)
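Data-Driven Features
To assess the conformance of $\mathbf{s}_c$ with $\mathcal{D}$ we perform the sparse coding:
$$\hat{\alpha} = \arg\min_{\alpha \in \mathbb{R}^n} \|D\alpha - \mathbf{s}\|_2^2 + \lambda \|\alpha\|_1, \quad \lambda > 0$$
which is the BPDN problem (solved using ADMM).
We then measure $\|D\hat{\alpha} - \mathbf{s}\|_2^2$ and $\|\hat{\alpha}\|_1$
Data-driven features are
$$\mathbf{x} = \begin{bmatrix} \|D\hat{\alpha} - \mathbf{s}\|_2^2 \\ \|\hat{\alpha}\|_1 \end{bmatrix}$$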
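As an illustration of how the two data-driven features could be obtained, the sketch below solves the same BPDN objective with ISTA (iterative soft-thresholding) instead of the ADMM solver mentioned in the slides, because ISTA fits in a few lines of plain Python. The function names and the tiny identity dictionary are hypothetical.

```python
def soft(z, t):
    """Soft-thresholding operator."""
    return (abs(z) - t) * (1 if z > 0 else -1) if abs(z) > t else 0.0

def sparse_code(D, s, lam, n_iter=500):
    """Approximately minimize ||D a - s||_2^2 + lam * ||a||_1 by ISTA.
    D is a list of p rows with n entries each (columns are atoms)."""
    p, n = len(D), len(D[0])
    # The gradient's Lipschitz constant is 2 * ||D||_2^2; the squared
    # Frobenius norm bounds it from above and gives a safe step size.
    lip = 2.0 * sum(v * v for row in D for v in row)
    a = [0.0] * n
    for _ in range(n_iter):
        resid = [sum(D[i][j] * a[j] for j in range(n)) - s[i] for i in range(p)]
        grad = [2.0 * sum(D[i][j] * resid[i] for i in range(p)) for j in range(n)]
        a = [soft(a[j] - grad[j] / lip, lam / lip) for j in range(n)]
    return a

def data_driven_features(D, s, lam):
    """Return [reconstruction error, l1 norm of the coefficients]."""
    a = sparse_code(D, s, lam)
    resid = [sum(D[i][j] * a[j] for j in range(len(a))) - s[i] for i in range(len(s))]
    return [sum(r * r for r in resid), sum(abs(v) for v in a)]

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
patch = [3.0, -0.2, 1.0]
alpha = sparse_code(identity, patch, lam=1.0)
features = data_driven_features(identity, patch, lam=1.0)
```

With an identity dictionary the minimizer is known in closed form (soft-thresholding of the patch at $\lambda/2$), which makes this toy case convenient for checking a solver.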
![Page 49: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/49.jpg)
Detecting Anomalies
Normal patches are expected to yield features $\mathbf{x}$ that are i.i.d. and follow an (unknown) distribution $\phi_0$; anomalous patches do not, as they follow $\phi_1 \neq \phi_0$
We are back to the original problem
"Determining whether a set of data $\{\mathbf{x}_c,\ c = 1, \dots\}$ is generated from $\phi_0$ and detecting possible outliers"
[Figure: a stream $x(t)$ over time $t$ that follows $\phi_0$ except for an anomalous segment generated by $\phi_1$]
Anomaly Detection Problem
![Page 50: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/50.jpg)
![Page 51: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/51.jpg)
![Page 52: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/52.jpg)
![Page 53: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/53.jpg)
![Page 54: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/54.jpg)
The ROC curves
Tests on 40 images with
anomalies manually
annotated by an expert
The proposed anomaly
detection algorithm
outperforms expert-driven
features and other
methods based on sparse
representations
![Page 55: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/55.jpg)
Detectability Loss on these nanofibers
Selecting good features is obviously important.
Why not stack data-driven and expert-driven features?
Consider $d = 3, 4, 5$ dimensional features
• We selectively add the three expert-driven features to the two data-driven ones
• We always fit a GM model to a large-enough number of training data
![Page 56: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/56.jpg)
Detectability Loss on these nanofibers
Anomaly detection performance progressively decays when $d$ increases
![Page 57: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/57.jpg)
Detectability Loss and Irrelevant Features
Irrelevant features, namely features that:
• are not directly affected by the change
• do not provide any additional information for change-detection purposes (i.e., leave $\mathrm{sKL}(\phi_0, \phi_1)$ constant)
Adding irrelevant features yields detectability loss.
Other issues might cause the performance decay
• A biased density function for $\hat{\phi}_0$
• Scarcity of training samples when $d$ increases
However, we are inclined to conclude that
• These expert-driven features do not add enough relevant information on top of the data-driven ones (for anomaly-detection purposes).
![Page 58: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/58.jpg)
Obviously, this is not always the case
We developed data-driven features based on convolutional sparse models
$$s \approx \sum_{i=1}^{n} d_i * \alpha_i, \quad \text{s.t. } \alpha_i \text{ is sparse}$$
where a signal $s$ is entirely encoded as the sum of $n$ convolutions between a filter $d_i$ and a coefficient map $\alpha_i$
Pros:
• Translation-invariant representation
• Few small filters are typically required
• Filters exhibit very specific image structures
• Easy to use filters having different sizes
Collaboration with Los Alamos National Laboratory, NM, USA
![Page 59: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/59.jpg)
Example of Learned Filters
Training Image Learned Filters
![Page 60: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/60.jpg)
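Convolutional Sparsity for Anomaly Detection
If we consider the convolutional sparse coding
$$\{\hat{\alpha}_i\} = \arg\min_{\{\alpha_i\}} \Big\| \sum_{i=1}^{n} d_i * \alpha_i - s \Big\|_2^2 + \lambda \sum_{i=1}^{n} \|\alpha_i\|_1$$
we can build the feature vector as:
$$x_c = \begin{bmatrix} \Big\| \Pi_c \Big( \sum_{i=1}^{n} d_i * \hat{\alpha}_i - s \Big) \Big\|_2^2 \\ \sum_{i=1}^{n} \| \Pi_c \hat{\alpha}_i \|_1 \end{bmatrix}$$
where $\Pi_c$ denotes the restriction to the support of the patch centered at $c$
…but unfortunately, detection performance is rather poor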
![Page 61: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/61.jpg)
Sparsity is too loose a criterion for detection
The two (normal and anomalous) patches exhibit the same sparsity and reconstruction error
[Figure panels: coefficient maps of a normal patch, coefficient maps of an anomalous patch]
![Page 62: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/62.jpg)
Convolutional Sparsity for Anomaly Detection
Add the group sparsity of the maps on the patch support (the restriction $\Pi_c$) as an additional feature
$$x_c = \begin{bmatrix} \Big\| \Pi_c \Big( \sum_{i=1}^{n} d_i * \hat{\alpha}_i - s \Big) \Big\|_2^2 \\ \sum_{i=1}^{n} \| \Pi_c \hat{\alpha}_i \|_1 \\ \sum_{i=1}^{n} \| \Pi_c \hat{\alpha}_i \|_2 \end{bmatrix}$$
D. Carrera, G. Boracchi, A. Foi and B. Wohlberg, "Detecting Anomalous Structures by Convolutional Sparse Models," IEEE IJCNN 2015
![Page 63: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/63.jpg)
Anomaly-Detection Performance
On 25 different textures and 600 test images (pairs of textures to mimic normal/anomalous regions)
Best performance is achieved by the 3-dimensional feature indicators
They achieve performance similar to that of the steerable pyramid, which is specifically designed for texture classification
![Page 64: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/64.jpg)
CONCLUDING REMARKS
![Page 65: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/65.jpg)
Comments on Detectability Loss
Detectability loss:
• occurs independently of the specific statistical tool used to monitor the log-likelihood
• does not depend on how the change affects $\phi_0$, e.g., the number of affected components.
Empirical analysis confirms DL on real-world datastreams.
• It is important to keep the change magnitude constant when changing $d$ (or the dataset)
Irrelevant components in $x$ are harmful! Consider this in feature-based anomaly-detection methods.
Ongoing work: extending this study to other change-detection approaches and to other families of distributions.
Further details: http://arxiv.org/pdf/1510.04850v2
![Page 66: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/66.jpg)
Thanks, Questions?
C. Alippi, G. Boracchi, D. Carrera, M. Roveri, "Change Detection in Multivariate
Datastreams: Likelihood and Detectability Loss" IJCAI 2016, New York, USA, July 9 - 13
D. Carrera, G. Boracchi, A. Foi and B. Wohlberg "Detecting Anomalous Structures by
Convolutional Sparse Models" IJCNN 2015 Killarney, Ireland, July 12
D. Carrera, F. Manganini, G. Boracchi, E. Lanzarone "Defect Detection in Nanostructures",
IEEE Transactions on Industrial Informatics -- Submitted, 11 pages.
![Page 67: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/67.jpg)
BACKUP SLIDES
![Page 68: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/68.jpg)
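Sketch of the proof
Theorem
Let $\phi_0 = \mathcal{N}(\mu_0, \Sigma_0)$ and let $\phi_1(x) = \phi_0(Qx + v)$, where $Q \in \mathbb{R}^{d \times d}$ is orthogonal and $v \in \mathbb{R}^d$; then
$$\mathrm{SNR}(\phi_0 \to \phi_1) < \frac{C}{d}$$
where $C$ is a constant that depends only on $\mathrm{sKL}(\phi_0, \phi_1)$
Sketch of the proof: recall
$$\mathrm{SNR}(\phi_0 \to \phi_1) = \frac{\left( \mathrm{E}_{x \sim \phi_0}[\mathcal{L}(x)] - \mathrm{E}_{x \sim \phi_1}[\mathcal{L}(x)] \right)^2}{\mathrm{var}_{x \sim \phi_0}[\mathcal{L}(x)] + \mathrm{var}_{x \sim \phi_1}[\mathcal{L}(x)]}$$
We compute an upper bound of the numerator and a lower bound of the denominator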
![Page 69: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/69.jpg)
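Sketch of the proof
We now show that
$$\mathrm{sKL}(\phi_0, \phi_1) \geq \mathrm{E}_{x \sim \phi_0}[\mathcal{L}(x)] - \mathrm{E}_{x \sim \phi_1}[\mathcal{L}(x)] \qquad (*)$$
From $\mathcal{L}(x) = \log(\phi_0(x))$ and the definition of sKL it follows
$$\mathrm{sKL}(\phi_0, \phi_1) = \mathrm{E}_{x \sim \phi_0}[\log \phi_0(x)] - \mathrm{E}_{x \sim \phi_0}[\log \phi_1(x)] + \mathrm{E}_{x \sim \phi_1}[\log \phi_1(x)] - \mathrm{E}_{x \sim \phi_1}[\log \phi_0(x)]$$
Thus
$$(*) \iff \mathrm{E}_{x \sim \phi_1}[\log \phi_1(x)] - \mathrm{E}_{x \sim \phi_0}[\log \phi_1(x)] \geq 0$$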
![Page 70: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/70.jpg)
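Sketch of the proof
$$\mathrm{E}_{x \sim \phi_1}[\log \phi_1(x)] - \mathrm{E}_{x \sim \phi_0}[\log \phi_1(x)] = \int \log(\phi_1(x))\,\phi_1(x)\,dx - \int \log(\phi_1(x))\,\phi_0(x)\,dx$$
We denote
$$y = Q'(x - v), \quad x = Qy + v$$
$$dx = |\det(Q')|\,dy = dy$$
$$\phi_0(x) = \phi_1(Q'(x - v)) = \phi_1(y)$$
$$\phi_1(x) = \phi_1(Qy + v) =: \phi_2(y)$$
then
$$\mathrm{E}_{x \sim \phi_1}[\log \phi_1(x)] - \mathrm{E}_{x \sim \phi_0}[\log \phi_1(x)] = \int \log(\phi_1(x))\,\phi_1(x)\,dx - \int \log(\phi_2(y))\,\phi_1(y)\,dy = \mathrm{KL}(\phi_1, \phi_2) \geq 0$$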
![Page 71: Change Detection in Multivariate Data: Likelihood and Detectability Loss](https://reader031.vdocument.in/reader031/viewer/2022030222/58843b2e1a28ab39538b74d1/html5/thumbnails/71.jpg)
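Sketch of the proof
Thus
$$\mathrm{sKL}(\phi_0, \phi_1) \geq \mathrm{E}_{x \sim \phi_0}[\mathcal{L}(x)] - \mathrm{E}_{x \sim \phi_1}[\mathcal{L}(x)]$$
Moreover
$$\mathrm{var}_{x \sim \phi_0}[\mathcal{L}(x)] = \mathrm{var}_{x \sim \phi_0}\left[ -\tfrac{1}{2}\chi^2 \right] = \frac{d}{2}$$
It follows
$$\mathrm{SNR}(\phi_0 \to \phi_1) = \frac{\left( \mathrm{E}_{x \sim \phi_0}[\mathcal{L}(x)] - \mathrm{E}_{x \sim \phi_1}[\mathcal{L}(x)] \right)^2}{\mathrm{var}_{x \sim \phi_0}[\mathcal{L}(x)] + \mathrm{var}_{x \sim \phi_1}[\mathcal{L}(x)]} \leq \frac{\mathrm{sKL}(\phi_0, \phi_1)^2}{d/2}$$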