Seasonal Time Series: Zishan Chen, Lijuan Kang, Qichao Sun, Mengying Wang, Zihao Wang


TRANSCRIPT

  • Slide 1
  • Seasonal Time Series Zishan Chen, Lijuan Kang, Qichao Sun, Mengying Wang, Zihao Wang
  • Slide 2
  • Seasonality Zihao Wang
  • Slide 3
  • What is Seasonality? In statistics, many time series exhibit cyclic variation known as seasonality, seasonal variation, periodic variation, or periodic fluctuations. This variation can be either regular or semi-regular; in short, it is as if the time series repeats itself after a regular period of time. Seasonal variation is the component of a time series defined as the repetitive and predictable movement around the trend line within one year or less. It is detected by measuring the quantity of interest over small time intervals, such as days, weeks, months, or quarters.
  • Slide 4
  • What is Seasonality? Seasonality applies to many industries. For example, retail sales tend to peak in the Christmas season and then decline after the holidays, so a time series of retail sales will typically show increasing sales from September through December and declining sales in January and February. Seasonality is quite common in economic time series and less common in engineering and scientific data.
  • Slide 5
  • What is Seasonality? Organizations affected by seasonal variation need to identify and measure this seasonality to help with planning for temporary increases or decreases in labor requirements, inventory, training, periodic maintenance, and so forth. Beyond these considerations, the organizations need to know whether the variation they have experienced has been more or less than expected, given the usual seasonal variation.
  • Slide 6
  • Detecting Seasonality Time Plot
  • Slide 7
  • Detecting Seasonality Seasonal Subseries Plot monthplot() in R
  • Slide 8
  • Detecting Seasonality Box Plot boxplot() in R
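    A minimal R sketch of these three displays, using the built-in monthly AirPassengers series as a stand-in for the data shown on the slides:

      # Time plot, seasonal subseries plot, and box plot by month
      data(AirPassengers)
      plot(AirPassengers)                              # time plot
      monthplot(AirPassengers)                         # seasonal subseries plot
      boxplot(AirPassengers ~ cycle(AirPassengers))    # box plot, one box per month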
  • Slide 9
  • Detecting Seasonality The time plot is a recommended first step for analyzing any time series. Although seasonality can sometimes be indicated with this plot, seasonality is shown more clearly by the seasonal subseries plot or the box plot. Furthermore, for large data sets, the box plot is usually easier to read than the seasonal subseries plot.
  • Slide 10
  • Detecting Seasonality Both the seasonal subseries plot and the box plot assume that the seasonal periods are known. In most cases, the analyst will in fact know this; for example, for monthly data the period is 12, since there are 12 months in a year. However, if the period is not known, the autocorrelation plot can help. If there is significant seasonality, the autocorrelation plot should show spikes at lags equal to the period. For example, for monthly data with a seasonal effect, we would expect to see significant peaks at lags 12, 24, 36, and so on (although the intensity may decrease the further out we go).
  • Slide 11
  • Detecting Seasonality ACF
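    A sketch of the corresponding check in R (differencing first removes the trend so the seasonal spikes stand out):

      # For monthly data the ACF lag axis is in years, so the seasonal spikes at
      # lags 12, 24, 36 appear at 1, 2, 3 on the plot
      acf(diff(AirPassengers), lag.max = 48)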
  • Slide 12
  • Detecting Seasonality Periodogram
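    A periodogram can be computed in base R with spectrum(), as in this sketch; peaks at the seasonal frequency and its harmonics indicate seasonality:

      # Raw periodogram of the differenced series; for monthly data the frequency
      # axis is in cycles per year, so seasonal peaks appear at 1, 2, 3, ...
      spectrum(diff(AirPassengers), log = "no")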
  • Slide 13
  • Seasonal Unit Roots The main advantage of seasonal unit root tests is in situations where you need to work with data that cannot be seasonally adjusted, or as a pretest before seasonal adjustment. If a series has seasonal unit roots, then the standard ADF test statistic does not have the same distribution as for non-seasonal series.
  • Slide 14
  • The Dickey-Hasza-Fuller Test The first test for a seasonal unit root was developed by Dickey, Hasza and Fuller (DHF) in 1984. It extends the well-known Dickey-Fuller procedure to seasonal time series. Assuming that the process is SAR(1), the DHF test regression and the null and alternative hypotheses are as given below.
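    In the usual statement of the DHF test for an SAR(1) process (a sketch of the standard formulation, not necessarily the slide's own notation), one estimates by OLS

        \Delta_s y_t \;=\; y_t - y_{t-s} \;=\; \delta\, y_{t-s} + \varepsilon_t ,

    and tests

        H_0\colon \delta = 0 \ \ \text{(a seasonal random walk, i.e. } s \text{ unit roots)}
        \qquad\text{vs.}\qquad H_1\colon \delta < 0 .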
  • Slide 15
  • The Dickey-Hasza-Fuller Test After the OLS estimation, the test statistic is the t-ratio of the estimated coefficient, i.e. the estimate divided by its standard error. Again, the asymptotic distribution of this test statistic is non-standard. The critical values were obtained by Monte Carlo simulation for different sample sizes and seasonal periods.
  • Slide 16
  • The Dickey-Hasza-Fuller Test The problem with the DHF test is that, under the null hypothesis, one has exactly s unit roots, while under the alternative one has no unit root at all. This is very restrictive; since one may wish to test for specific seasonal or non-seasonal unit roots, other tests have been developed.
  • Slide 17
  • HEGY Test The HEGY test was proposed by Hylleberg, Engle, Granger and Yoo. It has the advantage of testing for a seasonal unit root at each frequency separately, so it is widely applied and is the most customary test.
  • Slide 18
  • HEGY Test The HEGY test for seasonal integration is conducted by estimating the regression below (special case for quarterly data), where Q_jt is a seasonal dummy and the W_it are the transformed series defined alongside it.
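    A common quarterly formulation of the HEGY auxiliary regression (a sketch of the standard form, not necessarily the slide's exact notation) is

        \Delta_4 y_t = \sum_{j=1}^{4} \delta_j Q_{jt}
          + \pi_1 W_{1,t-1} + \pi_2 W_{2,t-1} + \pi_3 W_{3,t-2} + \pi_4 W_{3,t-1}
          + \sum_i \phi_i\, \Delta_4 y_{t-i} + \varepsilon_t ,

    with

        W_{1t} = (1 + B + B^2 + B^3)\, y_t , \quad
        W_{2t} = -(1 - B + B^2 - B^3)\, y_t , \quad
        W_{3t} = -(1 - B^2)\, y_t ,

    where B is the backshift operator (B y_t = y_{t-1}).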
  • Slide 19
  • HEGY Test After OLS estimation, tests are conducted for π1 = 0, for π2 = 0, and a joint test of the hypothesis π3 = π4 = 0. The HEGY test is a joint test for long-run (zero-frequency) unit roots and seasonal unit roots. If none of the πi are equal to zero, then the series is stationary at both the seasonal and non-seasonal frequencies.
  • Slide 20
  • HEGY Test Interpretation of the different πi is as follows: 1. If π1 < 0, then there is no long-run (non-seasonal) unit root; π1 is the coefficient on W_1t = S(B)Y_t, which has had all of the seasonal roots removed. 2. If π2 < 0, then there is no semi-annual unit root. 3. If π3 and π4 < 0, then there is no unit root at the annual cycle.
  • Slide 21
  • HEGY Test Just as with the ADF test, it is important to ensure that the residuals from estimating the HEGY equation are white noise. The power of unit root tests is low; that is, it is not easy to distinguish between genuine unit roots and near unit roots, so erroneously imposing a unit root tends to be better than failing to impose one when it should be imposed.
  • Slide 22
  • OCSB Test Osborn, Chui, Smith and Birchenhall (1988) modified the DHF test by replacing the dependent variable Δ_s y_t with Δ_1 Δ_s y_t, i.e. by also taking a first difference of the seasonally differenced series. The overall F statistic is used to test for the presence of all seasonal unit roots, and the accompanying t statistic tests the null hypothesis that no intermediate (non-seasonal) lag is involved in the data generation.
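    For reference, a common statement of the OCSB auxiliary regression for seasonal period s (a sketch of the published form; additional lags of the dependent variable are often included) is

        \Delta_1 \Delta_s y_t = \beta_1\, \Delta_s y_{t-1} + \beta_2\, \Delta_1 y_{t-s}
          + \sum_i \phi_i\, \Delta_1 \Delta_s y_{t-i} + \varepsilon_t .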
  • Slide 23
  • CH Test Canova and Hansen (1995) proposed this test. Unlike the seasonal unit root tests shown above, its null hypothesis is that the process is stationary, while the alternative is the presence of unit root(s) at a specific seasonal frequency or at selected frequencies.
  • Slide 24
  • CH Test Canova and Hansen use the assumption that both the process under investigation and the explanatory variables in the null regression do not contain any non-stationary behavior at the zero frequency.
  • Slide 25
  • Number of differences required for a stationary series (ndiffs) ndiffs uses a unit root test to determine the number of differences required for a time series x to be made stationary. If test="kpss", the KPSS test is used with the null hypothesis that x has a stationary root against a unit-root alternative, and the function returns the smallest number of differences required to pass the test at level alpha. The test can also be changed from "kpss" to "adf" or "pp" (Phillips-Perron); in both of those cases, the null hypothesis is that x has a unit root against a stationary-root alternative. ndiffs(x, alpha=0.05, test=c("kpss","adf", "pp"))
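    A quick sketch of the call in R (ndiffs is in the forecast package; the built-in AirPassengers series stands in for x):

      library(forecast)
      ndiffs(AirPassengers, alpha = 0.05, test = "kpss")   # KPSS (the default)
      ndiffs(AirPassengers, test = "adf")                  # ADF
      ndiffs(AirPassengers, test = "pp")                   # Phillips-Perron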
  • Slide 26
  • Number of differences required for a stationary series (nsdiffs) nsdiffs uses seasonal unit root tests to determine the number of seasonal differences required for time series x to be made stationary. If test="ch", the Canova-Hansen test is used (with the null hypothesis of deterministic seasonality), and if test="ocsb", the Osborn-Chui-Smith-Birchenhall test is used (with the null hypothesis that a seasonal unit root exists). nsdiffs(x, m=frequency(x), test=c("ocsb","ch"))
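    And the seasonal counterpart, again as a sketch (the log transform is only there to stabilise the variance of AirPassengers):

      library(forecast)
      nsdiffs(log(AirPassengers), test = "ocsb")   # null: a seasonal unit root exists
      nsdiffs(log(AirPassengers), test = "ch")     # null: deterministic seasonality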
  • Slide 27
  • Number of differences required for a stationary series (nsdiffs) After one seasonal difference, the seasonal time series becomes stationary
  • Slide 28
  • Regression model with seasonal variables Zishan Chen
  • Slide 29
  • Dummy Variable Usually, a predictor takes numerical values. When a predictor is a categorical variable taking only two values (e.g., "yes" and "no"), the situation can still be handled within the framework of regression models by creating a "dummy variable" that takes the value 1 for "yes" and 0 for "no". A dummy variable is also known as an "indicator variable". If there are more than two categories, the variable can be coded using several dummy variables (one fewer than the total number of categories if an intercept term is included in the model), as in the sketch below.
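    A small R sketch of automatic dummy coding from a factor (the variable name promo is made up for illustration):

      # A two-level categorical predictor becomes a single 0/1 dummy column
      promo <- factor(c("yes", "no", "yes", "no"))
      model.matrix(~ promo)   # intercept plus one dummy column ("promoyes")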
  • Slide 30
  • Seasonal Dummy Variable Deterministic seasonality S_t can be written as a function of seasonal dummy variables. Let s be the seasonal frequency (s = 4 for quarterly data, s = 12 for monthly data), and let D_1t, D_2t, D_3t, ..., D_st be the seasonal dummies: D_1t = 1 if period t falls in the first season and D_1t = 0 otherwise; D_2t = 1 if period t falls in the second season and D_2t = 0 otherwise; and so on. At any time period t, exactly one of the seasonal dummies D_1t, D_2t, D_3t, ..., D_st equals 1 and all the others equal 0.
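    A sketch of building such dummies in R, where cycle() returns the season index of each observation:

      # s = 12 monthly dummies; drop one of them (or the intercept) in a regression
      month <- factor(cycle(AirPassengers))
      D <- model.matrix(~ month - 1)   # one 0/1 column per month
      head(D)
      # forecast::seasonaldummy(AirPassengers) returns s - 1 dummies directly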
  • Slide 31
  • Example of monthly seasonality
  • Slide 32
  • Example of weekly seasonality
  • Slide 33
  • Other types of seasonality Daily data; high-frequency data; holiday effects (flower sales are big on Valentine's Day, Mother's Day and Easter, yet these days can move around); trading-day/business-day variation (one can divide by the number of trading days, or include it as a regressor).
  • Slide 34
  • Regression Model
  • Slide 35
  • Interpreting Coefficients
  • Slide 36
  • Seasonal dummy variables and linear time trend
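    The model behind this slide title, written out in a generic form (a sketch, not the slide's exact notation):

        y_t = \beta_0 + \beta_1 t + \gamma_2 D_{2t} + \cdots + \gamma_s D_{st} + \varepsilon_t ,

    with one dummy dropped because an intercept is included. In R, forecast::tslm(y ~ trend + season) fits a model of this form.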
  • Slide 37
  • Seasonal + AR
  • Slide 38
  • Transformation
  • Slide 39
  • Seasonal Representation
  • Slide 40
  • Redundant But lagged seasonal dummies are redundant with the original seasonal dummies The set of lagged dummy variables are collinear with the current dummy variables Given that you know this month is February, there is no information in knowing that last month was January. The lagged dummies should be omitted
  • Slide 41
  • AR(p) Case
  • Slide 42
  • Trend + Seasonal + AR(p)
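    A generic way to write the combined model (again a sketch of the usual formulation rather than the slide's own equation):

        y_t = \beta_0 + \beta_1 t + \sum_{j=2}^{s} \gamma_j D_{jt} + u_t ,
        \qquad u_t = \phi_1 u_{t-1} + \cdots + \phi_p u_{t-p} + \varepsilon_t ,

    where \varepsilon_t is white noise.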
  • Slide 43
  • Atmospheric CO2 Monthly Mean Concentrations
  • Slide 44
  • Fit the data with dummy variables and plot the model fit; one way to do this in R is sketched below.
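    A sketch of such a fit on R's built-in monthly co2 series (the slide's own code is not shown, so variable names here are illustrative):

      # Linear trend plus monthly dummy variables for the co2 series
      TIME  <- as.numeric(time(co2))   # decimal year
      month <- factor(cycle(co2))      # season index 1..12
      co2.dummy.lm <- lm(co2 ~ TIME + month)
      summary(co2.dummy.lm)
      ts.plot(co2, ts(fitted(co2.dummy.lm), start = start(co2), frequency = 12), lty = 1:2)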
  • Example 1. To build the regression, first construct matrices of sine and cosine terms (harmonics) at frequencies 1 to 6 (= s/2); the R code for this step and the next is sketched below.
  • 2. Regression using all potential variables: fit co2.lm1, inspect the approximate t-ratios coef(co2.lm1)/sqrt(diag(vcov(co2.lm1))), and look at summary(co2.lm1).
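    A sketch of a common way to carry out these two steps (the quadratic time term, the standardisation of TIME, and the object names SIN, COS and TIME are assumptions):

      # Step 1: sine and cosine terms at harmonic frequencies 1..6 (= s/2)
      SIN <- COS <- matrix(nrow = length(co2), ncol = 6)
      for (i in 1:6) {
        COS[, i] <- cos(2 * pi * i * time(co2))
        SIN[, i] <- sin(2 * pi * i * time(co2))
      }
      TIME <- (time(co2) - mean(time(co2))) / sd(time(co2))   # standardised time

      # Step 2: regression on time and all potential harmonics
      co2.lm1 <- lm(co2 ~ TIME + I(TIME^2) +
                      COS[, 1] + SIN[, 1] + COS[, 2] + SIN[, 2] +
                      COS[, 3] + SIN[, 3] + COS[, 4] + SIN[, 4] +
                      COS[, 5] + SIN[, 5] + COS[, 6] + SIN[, 6])
      coef(co2.lm1) / sqrt(diag(vcov(co2.lm1)))   # approximate t-ratios
      summary(co2.lm1)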
  • Slide 56
  • 3. Select significant variables. Rule 1: an approximate t-ratio of magnitude 2 is a common choice; this t-ratio is obtained by dividing the estimated coefficient by its standard error. Rule 2: check the p-value at a significance level of 0.05 (two-tailed).
  • Slide 57
  • 4. Regression with selected variables: refit the model as co2.lm2, keeping only the significant terms, then inspect > coef(co2.lm2) and > summary(co2.lm2)
  • Slide 58
  • 5. Check the regression model 5.1 Time plot of original data and fitted values
  • Slide 59
  • 5.2 Check the residuals (residual plot, ACF, and PACF). ADF test to check the stationarity of the residuals: > adf.test(resid(co2.lm2)) Augmented Dickey-Fuller Test data: resid(co2.lm2) Dickey-Fuller = -2.8827, Lag order = 8, p-value = 0.2047 alternative hypothesis: stationary
  • Slide 60
  • 5.3 Modeling the error term
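    A sketch of how this could be done in R (the object name coerror matches the next slide; fixing the order at 3 with aic = FALSE is an assumption):

      # Fit an AR(3) model to the residuals of the selected regression
      coerror <- ar(resid(co2.lm2), order.max = 3, aic = FALSE)
      coerror   # inspect the fitted AR coefficients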
  • Slide 61
  • 5.4 Check the AR(3) model (coerror) by residual plot and ACF: > plot(coerror$resid, ylab="Residuals", type="p"); abline(h=0) > acf(coerror$resid[-(1:3)])
  • Slide 62
  • 5.4 Check the AR(3) model by residual plot and ACF
  • Slide 63
  • 6. The final model The fitted harmonic regression function, where e(t) is white noise and time enters standardised using the mean and standard deviation of t.
  • Slide 64
  • 7. Prediction: construct new values of TIME and of the sine/cosine harmonics for the forecast period and predict from the fitted model, as sketched below.
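    A sketch of one way to produce the forecasts, assuming co2.lm2 was fitted with the TIME, SIN and COS regressors above and that we predict the 24 months following the built-in co2 series (which ends in 1997):

      # Regressors for 24 future months
      new.time <- seq(1998, by = 1/12, length.out = 24)
      TIME.new <- (new.time - mean(time(co2))) / sd(time(co2))
      SIN.new <- COS.new <- matrix(nrow = 24, ncol = 6)
      for (i in 1:6) {
        COS.new[, i] <- cos(2 * pi * i * new.time)
        SIN.new[, i] <- sin(2 * pi * i * new.time)
      }
      # Predict the regression part; AR(3) error forecasts could be added via
      # predict(coerror, n.ahead = 24)$pred
      co2.pred <- predict(co2.lm2,
                          newdata = list(TIME = TIME.new, SIN = SIN.new, COS = COS.new))
      ts.plot(co2, ts(co2.pred, start = 1998, frequency = 12), lty = 1:2)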