
Time Series Outlier Detection

Tingyi Zhu

July 28, 2016

Tingyi Zhu Time Series Outlier Detection July 28, 2016 1 / 42

Outline

Time Series Basics

Outliers Detection in Single Time Series

Outlier Series Detection from Multiple Time Series

Demos


Time Series Basics


First-order Autoregression

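A model denoted AR(1), in which the value of X at time t is a linear function of the value of X at time t − 1:

$$X_t = \phi X_{t-1} + \varepsilon_t \qquad (1)$$

Assumptions:

$\varepsilon_t \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)$ is the stochastic term.

$\varepsilon_t$ is independent of the past values $X_{t-1}, X_{t-2}, \ldots$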


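General Autoregressive Model

AR(p):

$$X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + \cdots + \phi_p X_{t-p} + \varepsilon_t = \sum_{i=1}^{p} \phi_i X_{t-i} + \varepsilon_t = \sum_{i=1}^{p} \phi_i B^i X_t + \varepsilon_t$$

where we use the backshift operator B ($B X_t = X_{t-1}$, $B^k X_t = X_{t-k}$).

Alternative notation: $\phi(B) X_t = \varepsilon_t$

$\phi(B)$ is a polynomial in B,

$$\phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p = 1 - \sum_{i=1}^{p} \phi_i B^i$$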


Moving Average

Another approach for modeling univariate time series

$X_t$ depends linearly on the current and previous stochastic terms

MA(1):

$$X_t = \varepsilon_t + \theta_1 \varepsilon_{t-1}$$

MA(q):

$$X_t = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}$$


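$\theta_1, \ldots, \theta_q$: parameters of the MA model

$\varepsilon_t, \ldots, \varepsilon_{t-q}$: stochastic terms

Using the backshift operator B, the model simplifies to

$$X_t = (1 + \theta_1 B + \cdots + \theta_q B^q)\varepsilon_t = \Big(1 + \sum_{i=1}^{q} \theta_i B^i\Big)\varepsilon_t = \theta(B)\varepsilon_t$$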


ARMA Model

A model consisting of both an autoregressive (AR) part and a moving average (MA) part:

$$X_t = \sum_{i=1}^{p} \phi_i X_{t-i} + \varepsilon_t + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i} \qquad (2)$$

referred to as the ARMA(p,q) model.

p: the order of the autoregressive part

q: the order of the moving average part

More concisely, using the backshift operator B, (2) becomes:

$$\phi(B) X_t = \theta(B)\varepsilon_t$$
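A minimal sketch in Python (NumPy assumed) of simulating from equation (2) by direct recursion. The function name `simulate_arma`, the burn-in length and the example coefficients are illustrative assumptions, not part of the talk.

```python
import numpy as np

def simulate_arma(phi, theta, n, sigma=1.0, burn_in=200, seed=0):
    """Simulate X_t = sum_i phi_i X_{t-i} + eps_t + sum_i theta_i eps_{t-i}."""
    rng = np.random.default_rng(seed)
    p, q = len(phi), len(theta)
    total = n + burn_in
    eps = rng.normal(0.0, sigma, size=total)
    x = np.zeros(total)
    for t in range(total):
        # Terms that would reach before the start of the series are dropped.
        ar = sum(phi[i] * x[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma = sum(theta[i] * eps[t - 1 - i] for i in range(q) if t - 1 - i >= 0)
        x[t] = ar + eps[t] + ma
    return x[burn_in:]  # discard the burn-in so start-up effects fade

# ARMA(1,1) with phi_1 = 0.5 and theta_1 = 0.4 (arbitrary example values)
series = simulate_arma(phi=[0.5], theta=[0.4], n=500)
```

Setting `theta=[]` gives a pure AR(p) series and `phi=[]` a pure MA(q) series.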


Stationarity of Time Series

In short, a time series is stationary if its statistical properties are all constant over time.

To mention some properties:

Mean: $E[X_t] = E[X_s]$ for any $t, s \in \mathbb{Z}$,

Variance: $\mathrm{Var}[X_t] = \mathrm{Var}[X_s]$ for any $t, s \in \mathbb{Z}$,

Joint distribution:

$\mathrm{Cov}(X_t, X_{t+1}) = \mathrm{Cov}(X_s, X_{s+1})$ for any $t, s \in \mathbb{Z}$.


Requirements for a Stationary Time Series

AR(1) $X_t = \phi X_{t-1} + \varepsilon_t$: $|\phi| < 1$

AR(p) $\phi(B) X_t = \varepsilon_t$:

All the roots of $\phi(z) = 0$ are outside the unit circle.

MA models are always stationary

ARMA(p,q) $\phi(B) X_t = \theta(B)\varepsilon_t$:

All the roots of $\phi(z) = 0$ are outside the unit circle.
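The root condition can be checked numerically. The sketch below assumes NumPy and uses a hypothetical helper `ar_is_stationary` that takes the coefficients $\phi_1, \ldots, \phi_p$ and tests whether every root of $\phi(z) = 1 - \phi_1 z - \cdots - \phi_p z^p$ has modulus greater than 1.

```python
import numpy as np

def ar_is_stationary(phi):
    """True if all roots of phi(z) = 1 - phi_1 z - ... - phi_p z^p lie outside the unit circle."""
    # np.roots expects coefficients ordered from the highest power down to the constant.
    coeffs = [-c for c in reversed(phi)] + [1.0]
    roots = np.roots(coeffs)
    return bool(np.all(np.abs(roots) > 1.0))

print(ar_is_stationary([0.5]))       # AR(1) with |phi| < 1
print(ar_is_stationary([1.0]))       # random walk: root on the unit circle
print(ar_is_stationary([0.5, 0.3]))  # an AR(2) example
```

For AR(1) the single root is $z = 1/\phi$, so the check reduces to $|\phi| < 1$.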


Non-stationary time series

Trend effect

Seasonal effect

Figure: Monthly totals of international airline passengers, 1949 to 1960 (horizontal axis: Time; vertical axis: AirPassengers).


Time Series Decomposition

Consider a more general time series formulation that includes both trend and seasonal effects:

$$X_t = T_t + S_t + E_t \qquad (3)$$

- $X_t$ is the data point at time t

- $T_t$ is the trend component at time t

- $S_t$ is the seasonal component at time t

- $E_t$ is the remainder component at time t (containing AR and MA terms)


Series with Trend, examples:

Assuming no seasonal effect, i.e. St = 0

Linear trend:

$$X_t = 2t + 0.5X_{t-1} + \varepsilon_t$$

Quadratic trend:

$$X_t = 2t + t^2 + 0.5X_{t-1} + \varepsilon_t$$

Goal: remove the trend, to transform the series to be stationary

Solution: lag-1 differencing


Differencing and Trend

Define the lag-1 difference operator,

$$\nabla X_t = X_t - X_{t-1} = (1 - B)X_t,$$

where B is the backshift operator.

If $X_t = \beta_0 + \beta_1 t + E_t$, then

$$\nabla X_t = \beta_1 + \nabla E_t.$$

If $X_t = \sum_{i=0}^{k} \beta_i t^i + E_t$, then

$$\nabla^k X_t = (1 - B)^k X_t = k!\,\beta_k + \nabla^k E_t.$$

We call $\nabla^k$ the kth-order lag-1 difference operator.
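A small numerical illustration of these identities, assuming NumPy. The coefficients and noise level are arbitrary example values; `np.diff(x, n=k)` applies the lag-1 difference k times.

```python
import numpy as np

t = np.arange(50, dtype=float)
rng = np.random.default_rng(0)
noise = rng.normal(0.0, 1.0, size=t.size)   # plays the role of E_t

linear = 3.0 + 2.0 * t + noise                   # beta_0 = 3, beta_1 = 2
quadratic = 1.0 + 2.0 * t + 0.5 * t**2 + noise   # beta_2 = 0.5

d1 = np.diff(linear, n=1)      # first difference: beta_1 plus differenced noise
d2 = np.diff(quadratic, n=2)   # second difference: 2! * beta_2 plus differenced noise

print(d1.mean())  # fluctuates around beta_1 = 2
print(d2.mean())  # fluctuates around 2! * beta_2 = 1
```

The trend is removed, but the remainder is now the differenced noise $\nabla^k E_t$ rather than $E_t$ itself.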


Lag-1 Differencing

Figure: two panels over January to July 2016. Panel titles: "S&P 500 Quote Year-To-Date" and "S&P 500 YTD Lag-1 Differencing".


Series with Seasonal Effect, example:

For quarterly data with possible seasonal (quarterly) effects, we can define indicator functions $S_j$. For j = 1, 2, 3, 4,

$$S_j = \begin{cases} 1 & \text{if the observation is in quarter } j \text{ of a year}, \\ 0 & \text{otherwise.} \end{cases}$$

A model with seasonal effects could be written as

$$X_t = \alpha_1 S_1 + \alpha_2 S_2 + \alpha_3 S_3 + \alpha_4 S_4 + \varepsilon_t$$

Goal: remove the seasonal effects

Solution: lag-s differencing, where s is the number of seasons


Differencing and Seasonal Effects

Define the lag-s difference operator,

$$\nabla_s X_t = X_t - X_{t-s} = (1 - B^s)X_t,$$

where B is the backshift operator.

If $X_t = T_t + S_t + E_t$, and $S_t$ has period s (i.e. $S_t = S_{t-s}$ for all t), then

$$\nabla_s X_t = (1 - B^s)X_t = T_t - T_{t-s} + \nabla_s E_t.$$
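As a sketch of this cancellation (NumPy assumed), the example below builds a noise-free quarterly series with a linear trend and applies a hypothetical helper `lag_diff`. The quarterly effects and the trend slope are made-up values.

```python
import numpy as np

def lag_diff(x, s):
    """Lag-s difference x_t - x_{t-s}, defined for t >= s."""
    x = np.asarray(x, dtype=float)
    return x[s:] - x[:-s]

s = 4                                       # quarterly data
alpha = np.array([10.0, -5.0, 3.0, 8.0])    # hypothetical quarterly effects
seasonal = np.tile(alpha, 6)                # S_t repeats with period s
trend = 0.5 * np.arange(seasonal.size)      # T_t = 0.5 t
x = trend + seasonal

dx = lag_diff(x, s)   # the seasonal part cancels; T_t - T_{t-s} = 0.5 * s remains
print(dx)
```

With a linear trend the lag-s difference is constant, so a further lag-1 difference is only needed for higher-order trends.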


Non-seasonal ARIMA

St = 0

ARIMA stands for Auto-Regressive Integrated Moving Average, i.e. ARMA integrated with differencing.

A nonseasonal ARIMA model is classified as ARIMA(p,d,q), where

- p is the order of AR terms,

- d is the number of nonseasonal differences needed for stationarity,

- q is the order of MA terms.


Non-seasonal ARIMA, Cont.

Recall ARMA(p,q):

$$\phi(B) X_t = \theta(B)\varepsilon_t,$$

- $\phi(B)$ and $\theta(B)$ are polynomials in B of orders p and q.

- Stationarity requirement: all roots of $\phi(z) = 0$ lie outside the unit circle.

ARIMA(p,d,q):

$$\phi(B)(1 - B)^d X_t = \theta(B)\varepsilon_t,$$

- $X_t$ is not stationary. Why?

- $Z_t = (1 - B)^d X_t$ is ARMA(p,q) and is stationary.
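One way to answer the question, using the root condition stated earlier: the autoregressive operator acting on $X_t$ is $\phi(B)(1 - B)^d$, and its polynomial has d roots exactly on the unit circle, so the stationarity requirement fails.

$$\phi(z)(1 - z)^d = 0 \quad \text{has the root } z = 1 \text{ with multiplicity } d.$$

The simplest case, ARIMA(0,1,0), is the random walk. Writing $\sigma^2$ for the variance of $\varepsilon_t$ and taking $X_0$ as fixed,

$$X_t = X_{t-1} + \varepsilon_t \;\Rightarrow\; X_t = X_0 + \sum_{i=1}^{t} \varepsilon_i, \qquad \mathrm{Var}[X_t] = t\sigma^2,$$

so the variance grows with t instead of staying constant.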


Seasonal ARIMA

A seasonal ARIMA model is classified as

$$\text{ARIMA}(p, d, q) \times (P, D, Q)_m$$

- p is the order of AR terms,

- d is the number of nonseasonal differences,

- q is the order of MA terms,

- P is the order of seasonal AR terms,

- D is the number of seasonal differences,

- Q is the order of seasonal MA terms,

- m is the number of seasons.


Example: $\text{ARIMA}(1, 1, 1) \times (1, 1, 1)_4$
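Under the usual multiplicative convention, and with the sign conventions used for $\phi(B)$ and $\theta(B)$ earlier, this quarterly model would be written as below. The capital symbols $\Phi_1$ and $\Theta_1$ for the seasonal AR and MA coefficients are notation introduced here for illustration, not taken from the slides.

$$(1 - \phi_1 B)(1 - \Phi_1 B^4)(1 - B)(1 - B^4) X_t = (1 + \theta_1 B)(1 + \Theta_1 B^4)\varepsilon_t$$

Reading left to right: nonseasonal AR(1), seasonal AR(1), one lag-1 difference, one lag-4 difference, then nonseasonal MA(1) and seasonal MA(1) on the right-hand side.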


General ARIMA

The ARIMA model can be generalized as follows:

$$\phi(B)\alpha(B) X_t = \theta(B)\varepsilon_t,$$

- $\phi(B)$: autoregressive polynomial, all roots outside the unit circle

- $\alpha(B)$: differencing filter that renders the data stationary, all roots on the unit circle

- $\theta(B)$: moving average polynomial, all roots outside the unit circle (to ensure $\theta(B)$ is invertible).

Alternatively,

$$X_t = \frac{\theta(B)}{\phi(B)\alpha(B)}\varepsilon_t.$$


Outliers Detection in Single Time Series


Automatic Detection Procedure

Described in Chung Chen, Lon-Mu Liu. Joint Estimation of Model Parameters and Outlier Effects in Time Series, JASA, 1993

Based on the framework of ARIMA models

R package tsoutliers written by YAHOO in 2014


Types of Outliers

General representation: $L(B) I_t(t_j)$

- $L(B)$: a polynomial in the lag operator B

- $I_t(t_j) = 1$ if there is an outlier at time $t = t_j$, and 0 otherwise.

Types of outliers:

- Additive Outliers (AO): $L(B) = 1$;

- Level Shift (LS): $L(B) = \frac{1}{1 - B}$;

- Temporary Change (TC): $L(B) = \frac{1}{1 - \delta B}$;

- Seasonal Level Shift (SLS): $L(B) = \frac{1}{1 - B^s}$;

- Innovational Outliers (IO): $L(B) = \frac{\theta(B)}{\phi(B)\alpha(B)}$.
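To see what these operators do to a unit pulse, the sketch below (NumPy assumed) computes $L(B) I_t(t_j)$ for AO, LS, TC and SLS by solving the corresponding recursion. The helper name `outlier_effect` and the default values of `delta` and `s` are illustrative assumptions; IO is omitted because it depends on the fitted ARIMA polynomials.

```python
import numpy as np

def outlier_effect(kind, n, t_j, delta=0.7, s=4):
    """Return L(B) I_t(t_j) for t = 0..n-1, for kind in {AO, LS, TC, SLS}."""
    pulse = np.zeros(n)
    pulse[t_j] = 1.0
    if kind == "AO":            # L(B) = 1: a single spike
        return pulse
    effect = np.zeros(n)
    for t in range(n):
        if kind == "LS":        # (1 - B) y_t = pulse_t: permanent step
            prev = effect[t - 1] if t >= 1 else 0.0
        elif kind == "TC":      # (1 - delta B) y_t = pulse_t: geometric decay
            prev = delta * effect[t - 1] if t >= 1 else 0.0
        elif kind == "SLS":     # (1 - B^s) y_t = pulse_t: repeats every s steps
            prev = effect[t - s] if t >= s else 0.0
        else:
            raise ValueError(f"unknown outlier type: {kind}")
        effect[t] = prev + pulse[t]
    return effect

for kind in ["AO", "LS", "TC", "SLS"]:
    print(kind, outlier_effect(kind, n=10, t_j=2))
```

AO affects one observation, LS shifts every later observation by the same amount, TC decays at rate `delta`, and SLS recurs once per season.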



Formulation

ARIMA model:

$$X_t = \frac{\theta(B)}{\phi(B)\alpha(B)}\varepsilon_t.$$

Model with outliers at times $t_1, t_2, \ldots, t_m$:

$$X_t^* = \sum_{j=1}^{m} \omega_j L_j(B) I_t(t_j) + \frac{\theta(B)}{\phi(B)\alpha(B)}\varepsilon_t.$$

- $L_j(B)$ depends on the pattern of the jth outlier

- $I_t(t_j) = 1$ if there is an outlier at time $t = t_j$, and 0 otherwise.

- $\omega_j$ denotes the magnitude of the jth outlier effect


Effect of One Outlier

Assuming the time series parameters are known, we examine the effect of one outlier:

$$X_t^* = \omega L(B) I_t(t_1) + \frac{\theta(B)}{\phi(B)\alpha(B)}\varepsilon_t$$

Define the polynomial $\pi(B)$ as:

$$\pi(B) = \frac{\phi(B)\alpha(B)}{\theta(B)} = 1 - \pi_1 B - \pi_2 B^2 - \cdots,$$

Contaminated by the outlier, the estimated residual $e_t$ becomes

$$e_t = \pi(B) X_t^*$$

(Without the outlier, $e_t = \pi(B) X_t$.)


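For the four types of outliers,

IO: $e_t = \omega I_t(t_1) + \varepsilon_t$,

AO: $e_t = \omega \pi(B) I_t(t_1) + \varepsilon_t$,

LS: $e_t = \omega \frac{\pi(B)}{1 - B} I_t(t_1) + \varepsilon_t$,

TC: $e_t = \omega \frac{\pi(B)}{1 - \delta B} I_t(t_1) + \varepsilon_t$.

Alternatively,

$$e_t = \omega x_{i,t} + \varepsilon_t, \quad t = t_1, t_1 + 1, \ldots \text{ and } i = 1, 2, 3, 4$$

(i = 1, 2, 3, 4 for IO, AO, LS, TC respectively), where

$x_{i,t} = 0$ for all i and $t < t_1$,

$x_{i,t_1} = 1$ for all i,

and for $k \ge 1$:

$x_{1,t_1+k} = 0$, $x_{2,t_1+k} = -\pi_k$,

$x_{3,t_1+k} = 1 - \sum_{j=1}^{k} \pi_j$, $x_{4,t_1+k} = \delta^k - \sum_{j=1}^{k-1} \delta^{k-j}\pi_j - \pi_k$.

A simple linear regression!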
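The regressors can be computed directly from the $\pi$ weights. The sketch below (NumPy assumed) follows the closed-form expressions term by term; the function name `outlier_regressors`, the row order IO, AO, LS, TC and the default `delta` are assumptions made for illustration.

```python
import numpy as np

def outlier_regressors(pi, delta=0.7):
    """Rows: IO, AO, LS, TC. Column k holds x_{i, t1+k} for k = 0..len(pi)."""
    pi = np.asarray(pi, dtype=float)
    K = pi.size
    x = np.zeros((4, K + 1))
    x[:, 0] = 1.0                                   # x_{i,t1} = 1 for every type
    for k in range(1, K + 1):
        x[0, k] = 0.0                               # IO
        x[1, k] = -pi[k - 1]                        # AO: -pi_k
        x[2, k] = 1.0 - pi[:k].sum()                # LS: 1 - sum_{j<=k} pi_j
        x[3, k] = (delta ** k
                   - sum(delta ** (k - j) * pi[j - 1] for j in range(1, k))
                   - pi[k - 1])                     # TC
    return x

# AR(1) with phi = 0.5 has pi(B) = 1 - 0.5 B, i.e. pi_1 = 0.5 and pi_k = 0 for k > 1.
x = outlier_regressors([0.5, 0.0, 0.0])
print(x)
```

Equivalently, the LS row is the running sum of the coefficients of $\pi(B)$, and the TC row is their convolution with $1, \delta, \delta^2, \ldots$, which gives a quick cross-check.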


Estimate of ω

The least squares estimate for the effect of a single outlier at $t = t_1$ can be expressed as
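A reconstruction, assuming the no-intercept regression $e_t = \omega x_{i,t} + \varepsilon_t$ over $t = t_1, \ldots, n$ from the previous slide, is the ordinary least squares estimator for outlier type i:

$$\hat{\omega}_i(t_1) = \frac{\sum_{t=t_1}^{n} e_t\, x_{i,t}}{\sum_{t=t_1}^{n} x_{i,t}^2}$$

For an innovational outlier, where $x_{1,t_1} = 1$ and $x_{1,t} = 0$ afterwards, this reduces to $\hat{\omega}_1(t_1) = e_{t_1}$.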


Test Statistics τ

From regression analysis, we have

$$\frac{\hat{\omega} - \omega}{\hat{\sigma}_a}\Big(\sum_{t=t_1}^{n} x_{i,t}^2\Big)^{1/2} \sim N(0, 1),$$

where $\hat{\sigma}_a$ is the estimate of the residual standard deviation.

We want to test whether $\omega = 0$; the following statistics are then approximately N(0, 1):
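Setting $\omega = 0$ in the expression above suggests statistics of the form $\tau_i(t_1) = \hat{\omega}_i(t_1)\,(\sum_{t=t_1}^{n} x_{i,t}^2)^{1/2} / \hat{\sigma}_a$, with $\hat{\omega}_i(t_1) = \sum_t e_t x_{i,t} / \sum_t x_{i,t}^2$ the least squares estimate. The Python sketch below (NumPy assumed) computes them for one candidate time and then scans all times on a simulated AR(1) series with known $\phi$. The function names, the simulated data, the outlier size and the MAD-based scale estimate are all assumptions for illustration.

```python
import numpy as np

def regressors(pi, n_after, delta=0.7):
    """Rows IO, AO, LS, TC; column k holds x_{i,t1+k} for k = 0..n_after-1."""
    c = np.zeros(n_after)
    c[0] = 1.0
    m = min(len(pi), n_after - 1)
    c[1:m + 1] = -np.asarray(pi, dtype=float)[:m]   # coefficients of pi(B)
    io = np.zeros(n_after)
    io[0] = 1.0
    ls = np.cumsum(c)                                            # pi(B) / (1 - B)
    tc = np.convolve(c, delta ** np.arange(n_after))[:n_after]   # pi(B) / (1 - delta B)
    return np.vstack([io, c, ls, tc])

def tau_statistics(e, pi, t1, sigma_a, delta=0.7):
    """Least squares estimates of omega and their tau statistics at time t1."""
    e = np.asarray(e, dtype=float)
    x = regressors(pi, e.size - t1, delta)
    sxx = (x ** 2).sum(axis=1)
    omega_hat = (x * e[t1:]).sum(axis=1) / sxx
    tau = omega_hat * np.sqrt(sxx) / sigma_a
    return omega_hat, tau

# Simulated AR(1) series with one additive outlier at t = 120 (illustrative values).
rng = np.random.default_rng(1)
n, phi = 200, 0.5
eps = rng.normal(size=n)
series = np.zeros(n)
for t in range(1, n):
    series[t] = phi * series[t - 1] + eps[t]
series[120] += 8.0

e = series.copy()
e[1:] -= phi * series[:-1]               # e_t = pi(B) X*_t with pi(B) = 1 - phi B
sigma_a = 1.483 * np.median(np.abs(e - np.median(e)))   # robust scale (an assumption)

labels = ["IO", "AO", "LS", "TC"]
best_tau, best_t1, best_type = 0.0, None, None
for t1 in range(1, n):
    _, tau = tau_statistics(e, [phi], t1, sigma_a)
    i = int(np.argmax(np.abs(tau)))
    if abs(tau[i]) > abs(best_tau):
        best_tau, best_t1, best_type = tau[i], t1, labels[i]
print(best_t1, best_type, round(float(best_tau), 2))
```

A candidate would be flagged when the largest $|\tau|$ exceeds a chosen critical value; the threshold itself is a tuning choice not fixed by this sketch.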


Procedure in the Presence of Multiple Outliers

In the presence of multiple outliers, recall the model

$$X_t^* = \sum_{j=1}^{m} \omega_j L_j(B) I_t(t_j) + \frac{\theta(B)}{\phi(B)\alpha(B)}\varepsilon_t.$$

The estimated residual becomes

$$e_t = \sum_{j=1}^{m} \omega_j \pi(B) L_j(B) I_t(t_j) + \varepsilon_t$$
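With the outlier times and types fixed, this residual equation is a multiple linear regression of $e_t$ on the columns $\pi(B) L_j(B) I_t(t_j)$, so all $\omega_j$ can be estimated jointly by least squares. The Python sketch below (NumPy assumed) illustrates that idea only; the helper names, the `(type, time)` input format and the example values are assumptions, and re-estimating the ARIMA parameters is left out.

```python
import numpy as np

def design_column(kind, t_j, n, pi, delta=0.7):
    """Column pi(B) L_j(B) I_t(t_j) for t = 0..n-1, kind in {IO, AO, LS, TC}."""
    m = n - t_j
    c = np.zeros(m)
    c[0] = 1.0
    k = min(len(pi), m - 1)
    c[1:k + 1] = -np.asarray(pi, dtype=float)[:k]    # coefficients of pi(B)
    if kind == "IO":
        col = np.zeros(m)
        col[0] = 1.0
    elif kind == "AO":
        col = c
    elif kind == "LS":
        col = np.cumsum(c)
    elif kind == "TC":
        col = np.convolve(c, delta ** np.arange(m))[:m]
    else:
        raise ValueError(f"unknown outlier type: {kind}")
    full = np.zeros(n)
    full[t_j:] = col
    return full

def joint_outlier_effects(e, outliers, pi, delta=0.7):
    """Least squares estimates of omega_j for a list of (type, time) pairs."""
    e = np.asarray(e, dtype=float)
    X = np.column_stack([design_column(kind, t_j, e.size, pi, delta)
                         for kind, t_j in outliers])
    omega_hat, *_ = np.linalg.lstsq(X, e, rcond=None)
    return omega_hat

# Residuals containing an AO of size 3 at t = 10 and an LS of size -2 at t = 25.
rng = np.random.default_rng(0)
n, pi = 60, [0.5]
e = (3.0 * design_column("AO", 10, n, pi)
     - 2.0 * design_column("LS", 25, n, pi)
     + rng.normal(0.0, 0.1, size=n))
print(joint_outlier_effects(e, [("AO", 10), ("LS", 25)], pi))
```

Estimating the effects together matters when outliers are close in time, because their regressors overlap and one-at-a-time estimates would then be biased.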


Stage 1: Initial Parameter Estimation and Outlier Detection

Fit the series with an ARIMA model (forecast package in R) to obtain initial estimates of the model parameters ($\phi(B)$, $\theta(B)$, $\alpha(B)$).

Detect outliers one at a time, sequentially.

Stage 2: Joint Estimation of Outlier Effects and Model Parameters


Outlier Series Detection from Multiple Time Series


Detect Anomalous Series

Goal: efficiently find the least similar time series in a large set

Motivation: Internet companies monitoring their servers (CPU, memory) to find unusual behavior


Detect Anomalous Series

Described in Rob J Hyndman et al. Large-Scale Unusual Time Series Detection, ICDM, 2015

Approach: Extract features from time series, PCA

R package anomalous

Tested on real data from Yahoo email servers: 80% accuracy, compared to 40% for previous methods


Step 1: Extract Features from Time Series

15 features are selected, each capturing global information about a time series


Step 2: PCA to reduce dimension

- dim = 15 initially, with correlation between features

- The first 2 PCs are sufficient, capturing most of the variance

Step 3: Apply a multi-dimensional outlier detection algorithm to find outlier series

- Density based

- α-hull
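The three steps can be sketched end to end in Python (NumPy assumed). This is not the method of the `anomalous` package: the five features below are a hypothetical stand-in for the paper's 15, and a k-nearest-neighbour distance is swapped in as a simple density-style score in place of the density-based and α-hull algorithms.

```python
import numpy as np

def features(x):
    """A few global summary features of one series (hypothetical subset)."""
    x = np.asarray(x, dtype=float)
    t = np.arange(x.size)
    slope = np.polyfit(t, x, 1)[0]                       # linear trend slope
    xc = x - x.mean()
    acf1 = (xc[1:] * xc[:-1]).sum() / (xc ** 2).sum()    # lag-1 autocorrelation
    return np.array([x.mean(), x.var(), acf1, slope, np.abs(np.diff(x)).max()])

def first_two_pcs(F):
    """Project standardized feature rows onto the first two principal components."""
    Z = (F - F.mean(axis=0)) / F.std(axis=0)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:2].T

def knn_outlier_score(P, k=5):
    """Distance to the k-th nearest neighbour; large values mean low local density."""
    d = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=2)
    d.sort(axis=1)            # column 0 is the zero distance of each point to itself
    return d[:, k]

# 50 white-noise series plus one series with a trend (simulated, illustrative).
rng = np.random.default_rng(0)
normal = [rng.normal(size=200) for _ in range(50)]
odd = rng.normal(size=200) + 0.05 * np.arange(200)
F = np.array([features(s) for s in normal + [odd]])

scores = knn_outlier_score(first_two_pcs(F), k=5)
most_unusual = int(np.argmax(scores))
print(most_unusual)
```

Working in feature space rather than on the raw series keeps the cost per series small, which is what makes the approach practical for large collections.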


Demo
