Time Series Forecasting Modeling CMG12

“And” or “Or”? CMG 12, December 4, 2012. Alex Gilgur, Josep Ferrandiz, Matthew Beason


Page 1: Time Series Forecasting Modeling CMG12

“And” or “Or”?

CMG 12, December 4, 2012

Alex Gilgur, Josep Ferrandiz, Matthew Beason

Page 2: Time Series Forecasting Modeling CMG12

OVERVIEW

• Definitions: Loose but True

• Business Case

• So what’s the problem?

• How much traffic do you have to support?

• Regression

• Can you support the traffic at time T?

• Forecasting

• What If…

• Solution

• Real-World Use Case

• A Digression about Regression

• Conclusions

• Acknowledgments

• Q&A

[Figure residue: charts labeled HIGH, LOW, and R; axis tick values omitted]

Business Demands

1000s hosts * 100s apps * 10s metrics

12 months * 4.5 weeks * 7 days of the week * 24 hours

Page 3: Time Series Forecasting Modeling CMG12

DEFINITIONS: LOOSE BUT TRUE

Regression: a black box, y = f(x), with input x and output y

[Figure: Q vs. BM; x-axis: Business Metric; y-axis: Traffic; series: Q, Mdl, Lo90%, Hi90%]

Page 4: Time Series Forecasting Modeling CMG12

DEFINITIONS: LOOSE BUT TRUE

Regression: a black box, y = f(x), with input x and output y

Time Series: a black box, y = f(t), with input t and output y

[Figure: Max Daily Concurrency, 9/24/2011 to 12/3/2011 (y-axis 0 to 300), annotated with Level, Trend, Seasonality, and Events]

Statistics = The art of torturing data until they talk to you

Page 5: Time Series Forecasting Modeling CMG12

DEFINITIONS: LOOSE BUT TRUE

TSA and Regression allow us to reconstruct the y given the x and/or the t, and the parameters

Page 6: Time Series Forecasting Modeling CMG12

DEFINITIONS: CONTINUED

Forecasting = The art of meaningful reflection on the past

Forecasting: Predicting the future based on the past

[Figure: <pool ABCD>: peak-hour busy threads of <app1234>, 9/24/2011 to 2/21/2012 (y-axis 0 to 1200)]

Tools: R, SAS, ForecastPro...

FOR (Level, Trend, Seasonality, Events):

• Compute a Weighted Moving Average (WMA)

• Extend it 1 point

• Add that point to the WMA
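The loop on this slide (compute a weighted moving average, extend it one point, feed that point back in) might be sketched in Python roughly as follows. The function name `wma_forecast`, the weights, and the sample values are illustrative assumptions, not taken from the slides, and the sketch treats a single component only rather than level, trend, seasonality, and events separately.

```python
def wma_forecast(series, weights, horizon):
    """Recursive one-step-ahead forecast with a weighted moving average.

    `weights` apply to the most recent len(weights) points, oldest first,
    and are normalized here so that they sum to 1.
    """
    if len(series) < len(weights):
        raise ValueError("need at least len(weights) observations")
    total = float(sum(weights))
    norm = [w / total for w in weights]
    history = list(series)
    forecasts = []
    for _ in range(horizon):
        window = history[-len(norm):]
        next_point = sum(w * v for w, v in zip(norm, window))
        forecasts.append(next_point)
        history.append(next_point)  # add the new point to the series
    return forecasts


busy_threads = [100, 110, 120, 130]  # made-up peak-hour values
print(wma_forecast(busy_threads, [1, 2, 3], horizon=2))
```

A plain WMA of a steadily rising series forecasts below the latest observation, which is one reason to smooth the trend as a separate component.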

Page 7: Time Series Forecasting Modeling CMG12

BUSINESS CASE

• You have a web site

• You know your business metric behavior

• can forecast it

• can simulate it

• You need to size the servers while minimizing the cost

• CPU

• Memory

• Worker threads

• Storage

• Network

So what’s the problem?


Page 8: Time Series Forecasting Modeling CMG12

“It’s complicated”

A black box: y = f(x)

The same black box: y = f(t)

y = f(x, t) + ε(t)

[Figure: BM, Q, X, and R as time series; panels q, x, and r plotted against BM]

Q(BM, t) = X(BM, t) * R(BM, t)

X = throughput (TPS)

R = response time

Q = concurrency (traffic)

BM = business metric

How much traffic do you need to support?

Page 9: Time Series Forecasting Modeling CMG12

HOW MUCH TRAFFIC DO YOU HAVE TO SUPPORT?

A better question is…

Q(BM, t) = X(BM, t) * R(BM, t)

t = time

BM = f(t)

X = f(BM)

R = f(X, BM)

Q = R * X = f(R, X)

Q(BM, t) = X(BM, t) * R[X(BM, t), BM, t]

Tools: 1: Enter Regression

The complexity of the relationships is enormous
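To make the chain of dependencies concrete, here is a minimal Python sketch that composes BM = f(t), X = f(BM), and R = f(X, BM) into Q = X * R, which has the form of Little's Law. Every functional form and coefficient below is a made-up placeholder; in practice each would come from a fitted regression or forecast.

```python
def business_metric(t):
    """BM = f(t): hypothetical linear growth of the business metric."""
    return 1000.0 + 50.0 * t


def throughput(bm):
    """X = f(BM): hypothetical transactions per second."""
    return 0.02 * bm


def response_time(x, bm):
    """R = f(X, BM): hypothetical response time in seconds."""
    return 0.1 + 0.002 * x + 0.00001 * bm


def concurrency(t):
    """Q(BM, t) = X(BM, t) * R[X(BM, t), BM, t]."""
    bm = business_metric(t)
    x = throughput(bm)
    r = response_time(x, bm)
    return x * r


for t in (0, 12):
    print(t, round(concurrency(t), 3))
```

Even with three toy functions, Q is already nonlinear in t, which hints at why the slide calls the real relationships enormous in complexity.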

Page 10: Time Series Forecasting Modeling CMG12

TOOLS: 2: A WORD FOR FORECASTING

• If we cannot regress it, we forecast it.

• Not an Excel-style regression to time

• Not a point forecast:

• need the prediction interval

Holt-Winters and ARIMA are standard tools; new methods are being developed.

Can you support the traffic that you will have at time T?

A simple example

[Figure: forecast panels labeled Holt-Winters and ARIMA]
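The point about needing a prediction interval rather than a point forecast can be illustrated with a small Python sketch. It uses simple exponential smoothing as a stand-in for Holt-Winters or ARIMA (so there is no trend or seasonality), assumes roughly normal one-step errors, and widens the interval with the horizon using the standard SES variance formula. The function name and the value of `alpha` are assumptions for illustration.

```python
import math


def ses_forecast_interval(series, alpha, horizon, z=1.645):
    """Simple exponential smoothing with an approximate prediction interval.

    Returns (point, low, high) for the forecast `horizon` steps ahead.
    z = 1.645 gives a two-sided 90% interval under a normality assumption.
    """
    level = series[0]
    errors = []
    for y in series[1:]:
        errors.append(y - level)  # one-step-ahead forecast error
        level = alpha * y + (1 - alpha) * level
    sigma2 = sum(e * e for e in errors) / len(errors)
    # SES forecast variance grows with the horizon h as
    # sigma^2 * (1 + (h - 1) * alpha^2).
    spread = z * math.sqrt(sigma2 * (1 + (horizon - 1) * alpha ** 2))
    return level, level - spread, level + spread


history = [10, 12, 11, 13, 12, 14]  # made-up observations
print(ses_forecast_interval(history, alpha=0.5, horizon=1))
print(ses_forecast_interval(history, alpha=0.5, horizon=5))
```

For sizing decisions it is the upper bound at time T, not the point forecast, that would be compared with capacity.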

Page 11: Time Series Forecasting Modeling CMG12

MORE SERIOUS CASES:

http://robjhyndman.com/papers/complex-seasonality/

http://forecastingprinciples.com/

There are cases where regression would not have worked

Page 12: Time Series Forecasting Modeling CMG12

MORE SERIOUS CASES:

Exponentially Weighted Moving Average (HW)

Auto Regressive Integrated Moving Average

FOR (Level, Trend, Seasonality, Events):

• Extend it 1 point

• Add that point to the Time Series
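As a rough illustration of the exponentially weighted approach, the following Python sketch implements additive Holt-Winters smoothing of level, trend, and seasonality and then extends the series point by point. The initialization from the first two seasons, the parameter values, and the synthetic data are assumptions made for this example; events are not modeled, and production work would normally rely on an established forecasting package.

```python
def holt_winters_additive(series, period, alpha, beta, gamma, horizon):
    """Additive Holt-Winters: smooth level, trend, and seasonality, then extend.

    Needs at least two full seasons; initial states come from the first two.
    """
    m = period
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons of data")
    mean1 = sum(series[:m]) / m
    mean2 = sum(series[m:2 * m]) / m
    trend = (mean2 - mean1) / m
    # Level just before the first observation (mean1 sits mid-season).
    level = mean1 - trend * ((m - 1) / 2 + 1)
    season = [series[i] - (level + trend * (i + 1)) for i in range(m)]
    for t, y in enumerate(series):
        s = season[t % m]
        prev_level = level
        level = alpha * (y - s) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        season[t % m] = gamma * (y - level) + (1 - gamma) * s
    n = len(series)
    return [level + h * trend + season[(n + h - 1) % m]
            for h in range(1, horizon + 1)]


pattern = [3, -1, -4, 2]  # made-up seasonal offsets
data = [10 + 2 * t + pattern[t % 4] for t in range(12)]
print(holt_winters_additive(data, 4, 0.3, 0.1, 0.2, horizon=4))
```

On this noise-free series the forecasts simply continue the line plus the seasonal pattern; with real data the three smoothing parameters would have to be tuned.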

Page 13: Time Series Forecasting Modeling CMG12

IF BM = f(t) …

We need to outsmart the model

Q[BM(t), t] = X[BM(t), t] * R{X[BM(t), t], BM, t}

1. Forecast the BM; get the value at time T

2. Build a regression of performance metrics to BM

i. How good is the regression?

ii. How do we measure the goodness of the regression?

Can you support the traffic that you will have at time T?

Q = f(BM) + ε

The ε is the residuals; if the fit is good, ε is small => R² is high

What if the R² is OK, but…

• we used a linear model on quadratic data?

• we missed a pattern in the data?

What if the ε is time-dependent? Q(t) = f[BM(t)] + ε(t)
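The "R² is OK, but…" concern can be reproduced with a small Python sketch: a straight line fitted to purely quadratic data still earns an R² above 0.9, while the residuals are strongly autocorrelated, which signals a missed pattern. The data and the helper names are invented for illustration; lag-1 autocorrelation is just one simple diagnostic among several.

```python
def ols_line(x, y):
    """Ordinary least squares fit of y = a0 + a1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a1 = sxy / sxx
    return my - a1 * mx, a1


def r_squared(y, fitted):
    """Share of the variance of y explained by the fitted values."""
    my = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def lag1_autocorr(e):
    """Lag-1 autocorrelation; close to 0 for patternless residuals."""
    me = sum(e) / len(e)
    num = sum((e[i] - me) * (e[i + 1] - me) for i in range(len(e) - 1))
    den = sum((ei - me) ** 2 for ei in e)
    return num / den


bm = list(range(20))
q = [b ** 2 for b in bm]  # truly quadratic data
a0, a1 = ols_line(bm, q)
fitted = [a0 + a1 * b for b in bm]
residuals = [qi - fi for qi, fi in zip(q, fitted)]
print(round(r_squared(q, fitted), 3), round(lag1_autocorr(residuals), 3))
```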

Page 14: Time Series Forecasting Modeling CMG12

AN ILLUSTRATION (SOTTO VOCE)

Tried to fit a quadratic model: R² = 0.995

Obviously missed a trend: the data are cubic

R² is not good enough

Page 15: Time Series Forecasting Modeling CMG12

AN ILLUSTRATION (SOTTO VOCE)


Here the missed trend may not matter, but it’s only an illustration

Page 16: Time Series Forecasting Modeling CMG12

SOLUTION: FORECAST THE RESIDUALS!

Forecast IV; build regression; forecast residuals; add it all together

[Flowchart]

Start: DATA = DV(t), IV(t)

Decision: DV == f(IV)?

• NO: Generate TSA FORECASTS for IV and DV; project IV and DV to t = T independently

• YES: Generate DV(IV) REGRESSION; generate TSA FORECAST for Residuals and for IV; project to t = T; combine DV[IV(t = T)] + Residuals(t = T)

Done

DV(t) = f[IV(t), t] |t = T* + ε(t) |t = T*
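The YES branch of the flowchart (regress DV on IV, forecast IV, forecast the residuals, add everything together) could look roughly like this in Python. The specific choices here are assumptions for the sketch, not the authors' implementation: a straight-line regression, Holt's linear-trend smoothing for IV, a seasonal-naive forecast for the residuals, and synthetic data with a known period.

```python
def ols_line(x, y):
    """Ordinary least squares fit of y = a0 + a1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    a1 = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
          / sum((xi - mx) ** 2 for xi in x))
    return my - a1 * mx, a1


def holt_forecast(series, horizon, alpha=0.5, beta=0.5):
    """Holt's linear-trend smoothing, extended `horizon` steps ahead."""
    level, trend = series[0], series[1] - series[0]
    for y in series[1:]:
        prev = level
        level = alpha * y + (1 - alpha) * (prev + trend)
        trend = beta * (level - prev) + (1 - beta) * trend
    return level + horizon * trend


def seasonal_naive(series, horizon, period):
    """Repeat the value seen a whole number of periods before the target."""
    return series[len(series) - period + (horizon - 1) % period]


def combined_forecast(iv, dv, horizon, period):
    """DV[IV(t = T)] + Residuals(t = T) for T = last index + horizon."""
    a0, a1 = ols_line(iv, dv)                          # DV(IV) regression
    resid = [d - (a0 + a1 * i) for i, d in zip(iv, dv)]
    iv_T = holt_forecast(iv, horizon)                  # TSA forecast of IV
    resid_T = seasonal_naive(resid, horizon, period)   # TSA forecast of residuals
    return a0 + a1 * iv_T + resid_T


# Synthetic example: DV depends on IV plus a periodic term the regression misses.
cycle = [1, -2, 1, 0]
iv = [100 + 5 * t for t in range(8)]
dv = [2 + 0.5 * iv[t] + cycle[t % 4] for t in range(8)]
print(combined_forecast(iv, dv, horizon=6, period=4))
```

In this toy case the regression alone would miss the periodic term at t = T, while the residual forecast restores it.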

Page 17: Time Series Forecasting Modeling CMG12

TRADITIONAL SOLUTION:

[Figure: panels labeled BM, Response Time, Throughput, and Traffic]

A real-life example: Size the worker threads for an application for the next year

Page 18: Time Series Forecasting Modeling CMG12

REGRESSION IS OPTIMIZATION

DV = f(IV, A) : A = arg min(ε); ε = DV|predict - DV

Averages: OLS: simple algebra; CI from StDev

Linear: a0 + a1 * IV

Polynomial: a0 + a1 * IV + a2 * IV^2 + …

Exponential: a0 * exp(a1 * IV)

Logarithmic: a0 * log(a1 * IV)

Power: a0 * IV ^ a1

[Figure: Q vs. BM; x-axis: Business Metric; y-axis: Traffic; series: Q, Mdl, Lo90%, Hi90%]
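As one worked instance of "regression is optimization", the power form a0 * IV ^ a1 can be fitted with plain OLS after taking logarithms of both variables. The Python sketch below assumes strictly positive IV and DV, and it minimizes squared error in log space rather than in the original units, which is a modeling choice of this example and not something the slides specify.

```python
import math


def fit_power(iv, dv):
    """Fit DV = a0 * IV ** a1 by OLS on log(DV) versus log(IV)."""
    lx = [math.log(v) for v in iv]
    ly = [math.log(v) for v in dv]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    a1 = (sum((x - mx) * (y - my) for x, y in zip(lx, ly))
          / sum((x - mx) ** 2 for x in lx))
    a0 = math.exp(my - a1 * mx)
    return a0, a1


iv = [1, 2, 4, 8, 16]              # made-up independent variable
dv = [3 * v ** 1.5 for v in iv]    # exact power law, no noise
print(fit_power(iv, dv))
```

The same log trick turns the exponential form into a straight-line fit of log(DV) against IV.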

Page 19: Time Series Forecasting Modeling CMG12

REGRESSION IS OPTIMIZATION

[Figure: Concurrency vs. Business Metric; series: Q, Linear (Q); fitted line y = 0.241x + 24.215, R² = 0.03376]

Averages: OLS: simple algebra; CI from StDev

95%ile?

Page 20: Time Series Forecasting Modeling CMG12

REGRESSION IS OPTIMIZATION


library(quantreg)                 # quantile regression package
Mdl <- rq(DV ~ IV, tau = 0.95)    # fit the 95th-percentile line
DV_bar <- predict(Mdl)            # fitted values at the observed IV
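The R call on this slide relies on the quantreg package. To show what a 95th-percentile regression actually minimizes, here is a small Python sketch of the pinball (quantile) loss with an exhaustive search over lines through pairs of data points. This brute-force search is an illustration for small samples only and is not the algorithm quantreg uses; the function names and data are made up.

```python
def pinball_loss(a0, a1, x, y, tau):
    """Quantile (pinball) loss of the line a0 + a1 * x at quantile tau."""
    total = 0.0
    for xi, yi in zip(x, y):
        e = yi - (a0 + a1 * xi)
        total += tau * e if e >= 0 else (tau - 1) * e
    return total


def quantile_line(x, y, tau):
    """Best line through two data points, found by exhaustive search.

    With one predictor, some line through two observations minimizes the
    pinball loss, so trying every pair with distinct x is enough
    (O(n^3), small n only).
    """
    best = None
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            if x[i] == x[j]:
                continue
            a1 = (y[j] - y[i]) / (x[j] - x[i])
            a0 = y[i] - a1 * x[i]
            loss = pinball_loss(a0, a1, x, y, tau)
            if best is None or loss < best[0]:
                best = (loss, a0, a1)
    return best[1], best[2]


iv = list(range(20))
dv = [2 * v + (v * 7) % 5 for v in iv]   # made-up noisy data
a0, a1 = quantile_line(iv, dv, tau=0.95)
above = sum(1 for v, d in zip(iv, dv) if d - (a0 + a1 * v) > 1e-9)
print(a0, a1, above)
```

Unlike OLS, which tracks the average, the fitted line leaves only about 5% of the observations above it, which is the behavior wanted when sizing for peaks.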

Page 21: Time Series Forecasting Modeling CMG12

EXAMPLE (CONTINUED)


Forecast BM

Build Regression Q ~ BM

Forecast Residuals

Q(t) = f[BM (t)] + ε (t)

Size the worker threads for an application for the next year

Page 22: Time Series Forecasting Modeling CMG12

FINISHING TOUCHES

[Figure: panels a) and b)]

Red = regular regression

Blue = our method

Green and black = data

Grey = predictive interval bounds

Page 23: Time Series Forecasting Modeling CMG12

CONCLUSIONS

• Downsides:

• It is an extra step in building the projection, increasing the runtime of computing the models.

• If the regression model is good, then the residuals are unforecastable.

• Advantages:

• It is a very robust method:

• No worries about the data not being suitable for the regression:

• missed trend and periodicity in the residuals will be picked up by the TSA forecasts.

• It is a versatile method:

• Regression and TSA forecasting combined:

• give us more control in tuning regression and TSA models than regression by itself and TSA forecasting by itself.

• TSA forecast of residuals can only be inappropriate if the regression is good.

• Then the weight (significance) of the residuals is negligible compared with the actual data.

• There are forecasting methods even for unforecastable data.

• Forecast replacement for nonlinear time series data:

• Linear is too conservative

• Exponential is too optimistic

• Quadratic regression to time

• Forecast residuals

There is no reason not to use it

Page 24: Time Series Forecasting Modeling CMG12

ACKNOWLEDGMENTS

• Co-authors and reviewers:

• Dr. Josep Ferrandiz

• Matthew Beason

• A big thank-you goes to

• Dr. Igor Trubin, who inspired this paper at CMG’11

• Mike Perka, who has been my guide on this journey into the world of IT data