# multi-scale streaming anomalies detection for time series · streaming pca i dimensionality...

TRANSCRIPT

Multi-scale streaming anomalies detection for time seriesB Ravi Kiran

Université Lille 3, CRIStAL, Lille, France

Streaming anomaly detection

I x (t ) are observations over time t where new data arrives over time t .I Problem : Window x (t ), [t : t − p + 1] which “deviates” from normal behavior.I Motivation : How to become invariant to the window-size/scale of

pseudo-periodic structure in x (t ) ?

Streaming multi-scale Lag-matrix construction

Xpt = [xt,xt−1, . . . ,xt−p+1]

T ∈ Rp

Streaming PCA

I Dimensionality reduction for time series lag embeddingI Recursive update for principal subspaceI Linear Principal Component Analysis criterion : Given Xp ∈ RT×p, wp is

defined as 1-D projection capturing most of the energy of samples :wp = arg min‖w‖=1

∑Tt=1 ‖X

pt − (wwT )Xp

t ‖2

Multiscale streaming PCA

Algorithm 1 Streaming PCAInitialization: wj ← 0, σ 2

j ← ϵ with ϵ �1for t = 1, . . . ,T do

for j = 1, . . . , J doZ jt ← HT

2jX jt

y jt ← wTj Z

jt

σ 2j ← σ 2

j + (y jt )2

ejt ← Z jt − y

jtwj

wj ← wj + σ−2j y jte

jt

Z̃ jt ← wT

j Zjtwj

αjt ← ‖Z̃

jt − Z

jt ‖

2

end forend forreturn ααα ∈ RT×J

Given x (t ) ∈ R for t = 1 : TI We evaluate the lag-matrixX ∈ RT×p where p = 2j.I For each Xt ∈ Rp we perform a

change of basis Zt := ΦTXtI H refers to a unitary tx. that can. localize a deviation from the

local mean and variations.. Preserve the variance.I Haar transform Φ = H :

H2N =1√2

[HN ⊗ [1,1]IN ⊗ [1,−1]

]

Hierarchical (Approximation) PCA

2j−1 2j−1 2j−1 2j−1

Future

wTj−1· wT

j−1·wTj−1· wT

j−1·

wTj · wT

j ·

wTj+1·

For scale pj+1 = 2pj, a reduced lagmatrixZ j+1

t is obtained by projectionof each component of size pj on theprincipal direction obtained at thisscale, i.e. Z j+1

t = [wTj Z

jt ,w

Tj Z

jt−2j

]T

with Z 1t = X 1

t . The principal direc-tion is updated a�er this step.

Global flow

Input xt X 2j=1

t

X 2j=2

t

X 2j=Jt

Update w1

Update w2

Update wJ

α1t = ‖X

1t − X̃

1t ‖

α2t = ‖X

2t − X̃

2t ‖

α Jt = ‖X

Jt − X̃

Jt ‖

Norm‖ααα t‖

2

2nd iteration‖α̃αα t − ααα t‖

2

Least corr. scalearg minj

∑i(ααα

Tααα )ji

... ... ... ...

AUC performances across di�erent aggregation methods

Performance Evaluation : Area under the receiver operatorscharacteristics curve (AUC) (FPR vs TPR), 0 (worst) & 1 (perfect).

E�ect of 2nd iteration of Streaming PCA

Least correlated scale Vs 2nd iteration of Streaming PCA :I Least correlated scale and 2nd iteration of streaming PCA decorrelates the

reconstruction error across scales.I The least correlated scale performs be�er when there is a single scale

structure across time series. Second iteration performs be�er when there aremultiple scales.

Future work

I Understand bounds on reconstruction error ααα (t ) for Streaming PCA.I Be�er base-line by comparing with streaming covariance estimation.I Probabilistic anomaly score using gaussian neg-log score.I Recursively calculable multi-scale time series representation HTXt to model

long range dependencies.

References

I Papadimitriou et al. Optimal multi-scale pa�erns in time series streams, 2006 ACM SIGMOD

I Bin Yang, Projection approximation subspace tracking. Trans. Sigal Processing, Jan. 1995

I Evaluating real-time anomaly detection algorithms, Numenta benchmark, ICMLA 2015

https://beedotkiran.github.io/anomaly.html [email protected]