Big Data at Home Depot
KSU – Big Data Survey Course
Steve Einbender
Advanced Analytics Architect
Time Series Concept
Statistical Forecasting and/or Statistical Estimation are the primary goals of Time Series modeling.
Time Series modeling is essential to accurately model and account for Time Effects. If you don't account for them, you will, in general, confound your experiment and, specifically, the estimated effects of your covariates/predictors.
Conceptual Model:
$Y_t = f(T_t, S_t, E_t, \ldots)$, where $T$ is Trend, $S$ is Seasonal, $E$ is Error, and $t$ is the Time period; lagged terms are offset from $t$ by the Lag period.
In general, assess the behavior of the Dependent Variable (e.g., Net_Sales) and apply an appropriate Time Series Model (e.g., ARIMA) to the Trend and Seasonality of the DV to account for the Time Effects. Then the fun begins…
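The Trend / Seasonal / Error idea can be sketched in a few lines of Python. This is only an illustration, not the method used in the talk: it assumes the components add together (the slide does not say whether they add or multiply), it assumes a straight-line trend, and the `net_sales` values and the `decompose_additive` helper are made up for the example.

```python
import math
import random


def decompose_additive(series, period):
    """Split a series into trend, seasonal, and error parts (additive form)."""
    n = len(series)
    # Trend: least-squares straight line through the series.
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    slope = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series)) / sum(
        (t - t_mean) ** 2 for t in range(n)
    )
    intercept = y_mean - slope * t_mean
    trend = [intercept + slope * t for t in range(n)]
    # Seasonal: average detrended value at each position in the cycle.
    detrended = [y - tr for y, tr in zip(series, trend)]
    seasonal_means = []
    for pos in range(period):
        values = detrended[pos::period]
        seasonal_means.append(sum(values) / len(values))
    seasonal = [seasonal_means[t % period] for t in range(n)]
    # Error: whatever trend and seasonality leave unexplained.
    error = [y - tr - s for y, tr, s in zip(series, trend, seasonal)]
    return trend, seasonal, error


# Hypothetical monthly data: upward trend, yearly cycle, random noise.
random.seed(0)
net_sales = [
    100 + 2 * t + 10 * math.sin(2 * math.pi * t / 12) + random.gauss(0, 1)
    for t in range(48)
]
trend, seasonal, error = decompose_additive(net_sales, period=12)
```

By construction the three parts sum back to the original series, so whatever remains in `error` is what a covariate or intervention would still have to explain.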
Operationalizing the Concept
p - AutoRegressive (Auto Correlation)
d - Integrated (Stationarity / Trend)
q - Moving Average (Shocks / Error)
P - Seasonal Auto Correlation
D - Seasonal Trend
Q - Seasonal Error
Seasonal effects: If there are spikes in the data every four periods for quarterly data, or every 12 periods for monthly data, there is a seasonal effect.
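The "spikes every 12 periods" rule can be checked numerically with the sample autocorrelation. The sketch below is illustrative only: the `monthly` series is invented (a flat level with a spike in the last month of each year), and `autocorr` is a small helper written for this example rather than a library call.

```python
def autocorr(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((y - mean) ** 2 for y in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom


# Hypothetical monthly data: five years, one spike in the last month of each year.
monthly = [100.0 + (50.0 if t % 12 == 11 else 0.0) for t in range(60)]

# Autocorrelation at lags 1..24; a seasonal effect shows up at the seasonal lag.
acf = {lag: autocorr(monthly, lag) for lag in range(1, 25)}
best_lag = max(acf, key=acf.get)
```

For this series the largest autocorrelation falls at lag 12, with a smaller echo at lag 24, which is the pattern that points to the seasonal P, D, Q terms.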
Time Series Parameter Specifications
ARIMA modeling involves three stages:
(1) Identification of the initial p, d, and q parameters
Autoregressive component (p). Usually 0, 1, or 2
Integrated component (d). Usually 0, 1, or 2
Moving average component (q). Usually 0, 1, or 2
(2) Estimation of the p (auto-regressive) and q (moving average) components to see if they contribute significantly to the model or if one or the other should be dropped; and
(3) Diagnosis of the residuals to see if they are random and normally distributed, indicating a good model.
An ARIMA (0,1,1) model means no autoregressive component, differencing one time to remove linear trends, and a lag 1 moving average component.
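To make the ARIMA (0,1,1) reading concrete, here is a rough sketch. It assumes the sign convention y_t = y_(t-1) + e_t + theta * e_(t-1) (some texts write the moving average term with a minus sign), and `simulate_arima_011`, `difference`, and `autocorr` are helpers invented for this example.

```python
import random


def difference(series):
    """First difference: the d = 1 step of an ARIMA model."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


def autocorr(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((y - mean) ** 2 for y in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom


def simulate_arima_011(n, theta, seed=0):
    """Simulate y_t = y_(t-1) + e_t + theta * e_(t-1) with standard normal shocks."""
    rng = random.Random(seed)
    shocks = [rng.gauss(0, 1) for _ in range(n + 1)]
    level = 0.0
    series = []
    for t in range(1, n + 1):
        level += shocks[t] + theta * shocks[t - 1]
        series.append(level)
    return series


# Differencing once turns a linear trend into a constant (the slope).
linear = [5.0 + 3.0 * t for t in range(10)]
flat = difference(linear)

# After differencing, an ARIMA(0,1,1) series behaves like an MA(1) process:
# correlated at lag 1, roughly uncorrelated at lag 2 and beyond.
series = simulate_arima_011(2000, theta=0.6)
diffed = difference(series)
r1 = autocorr(diffed, 1)
r2 = autocorr(diffed, 2)
```

For an MA(1) process the theoretical lag 1 autocorrelation is theta / (1 + theta^2), about 0.44 for theta = 0.6, so `r1` should land near that value while `r2` stays close to zero.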
Time Series Forecasting System (TSFS) Demo
Data Range identification
View Series graphically
What Functions and Tests do we use to derive the most accurate Time Series model possible?
Autocorrelation Function
Partial Autocorrelation Function
Patterns in the ACF/PACF functions can be used to suggest different models to use.
White Noise Test
Dickey-Fuller Unit Root / Stationarity Test
After a candidate set of models is identified, the models are estimated and their fit assessed.
The best fitting model is used to generate a forecast.
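A white noise test can be sketched outside the demo tool. The example below assumes the Ljung-Box Q statistic is the test in question (the slide does not name the statistic) and compares it with the 5% chi-square critical value for 10 degrees of freedom, 18.307. The two simulated series and the helper names are made up for the illustration.

```python
import random


def autocorr(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((y - mean) ** 2 for y in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom


def ljung_box_q(series, max_lag):
    """Ljung-Box Q: n (n + 2) * sum over k of r_k^2 / (n - k)."""
    n = len(series)
    return n * (n + 2) * sum(
        autocorr(series, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


CHI2_95_DF10 = 18.307  # 5% critical value, chi-square with 10 degrees of freedom

rng = random.Random(1)
# Hypothetical series 1: pure noise, no structure left to model.
noise = [rng.gauss(0, 1) for _ in range(500)]
# Hypothetical series 2: AR(1) with coefficient 0.7, strongly autocorrelated.
ar1 = [0.0]
for _ in range(499):
    ar1.append(0.7 * ar1[-1] + rng.gauss(0, 1))

q_noise = ljung_box_q(noise, 10)
q_ar1 = ljung_box_q(ar1, 10)
ar1_rejects_white_noise = q_ar1 > CHI2_95_DF10
```

A Q value above the critical value says the series still carries autocorrelation. When the test is run on the residuals of a fitted ARMA(p, q) model, the degrees of freedom are usually reduced by p + q.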
ARIMA with Dynamic Regression
Another use of Time Series is for the introduction of Covariates/Predictors.
An extension of ordinary Regression
One or more of the Independent Variables (i.e., predictors) are correlated with the Dependent Variable at non-concurrent time lags.
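One simple way to look for a non-concurrent lag is to correlate the dependent variable with the predictor shifted back by 0, 1, 2, ... periods. The sketch below is a simplified illustration: the `promo` and `net_sales` series are invented so that the predictor leads the DV by three periods, and the helper names are hypothetical. In practice the series are often prewhitened first, because autocorrelation in either series can distort the cross-correlations.

```python
import random


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def lagged_corr(predictor, dv, lag):
    """Correlation between predictor[t - lag] and dv[t]."""
    return pearson(predictor[: len(predictor) - lag], dv[lag:])


# Hypothetical data: the DV responds to the predictor three periods later.
rng = random.Random(2)
promo = [rng.gauss(0, 1) for _ in range(300)]
net_sales = [
    (2.0 * promo[t - 3] if t >= 3 else 0.0) + rng.gauss(0, 0.5) for t in range(300)
]

corr_by_lag = {lag: lagged_corr(promo, net_sales, lag) for lag in range(9)}
best_lag = max(corr_by_lag, key=corr_by_lag.get)
```

Here `best_lag` recovers the three-period delay built into the data; a dynamic regression would then include the predictor at that lag alongside the ARIMA terms.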
Intervention Analysis
Two basic activities:
Identify the Functional Form of the Intervention
Assess the Statistical Significance of the Intervention
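The two activities can be illustrated with the simplest functional form, a permanent step that switches on at a known period. This is a sketch under strong assumptions: the errors are treated as independent (a real intervention analysis would estimate the step inside an ARIMA error model), and the data, the start period, and the `step_effect` helper are all invented for the example.

```python
import random


def step_effect(series, start):
    """Estimate a step intervention beginning at index `start`.

    Equivalent to regressing the series on a 0/1 step dummy:
    returns the estimated level shift and its t statistic.
    """
    before, after = series[:start], series[start:]
    n0, n1 = len(before), len(after)
    mean0, mean1 = sum(before) / n0, sum(after) / n1
    # Residual variance pooled across the two regimes.
    ss = sum((y - mean0) ** 2 for y in before) + sum((y - mean1) ** 2 for y in after)
    resid_var = ss / (n0 + n1 - 2)
    std_err = (resid_var * (1 / n0 + 1 / n1)) ** 0.5
    effect = mean1 - mean0
    return effect, effect / std_err


# Hypothetical data: level 50 with noise, shifted up by 5 from period 60 onward.
rng = random.Random(3)
series = [50.0 + (5.0 if t >= 60 else 0.0) + rng.gauss(0, 1) for t in range(100)]

effect, t_stat = step_effect(series, start=60)
```

A t statistic well beyond about 2 in absolute value suggests the shift is unlikely to be noise. Other functional forms, such as a one-period pulse or a gradual ramp, would replace the step dummy with a different indicator series.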
Let’s look at how we build a Time Series ADS….