Forecasting using R
11. Dynamic regression
OTexts.com/fpp/9/1/
Rob J Hyndman
Outline

1 Regression with ARIMA errors
2 Example: Japanese cars
3 Using Fourier terms for seasonality
4 Example: Sales of petroleum & coal products
Regression with ARIMA errors

Regression models
yt = b0 + b1 x1,t + · · · + bk xk,t + nt

yt is modeled as a function of k explanatory variables x1,t, . . . , xk,t.
Usually we assume that nt is white noise. Now we want to allow nt to be autocorrelated.

Example: nt = ARIMA(1,1,1)
yt = b0 + b1 x1,t + · · · + bk xk,t + nt
where (1 − φ1 B)(1 − B) nt = (1 − θ1 B) et
and et is white noise.
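The structure above is easy to simulate. The sketch below (Python rather than the deck's R, with arbitrary parameter values) generates a regression whose error term follows an AR(1) process, the simplest case of this model, and checks that the errors are autocorrelated while the underlying innovations are not.

```python
# Illustrative simulation of a regression with AR(1) errors.
# All parameter values (b0, b1, phi, T) are arbitrary, not from the slides.
import random

random.seed(1)
T, b0, b1, phi = 500, 2.0, 0.5, 0.7

x = [random.gauss(0, 1) for _ in range(T)]      # explanatory variable
e = [random.gauss(0, 1) for _ in range(T)]      # white-noise innovations
n = [0.0] * T                                   # autocorrelated errors
for t in range(1, T):
    n[t] = phi * n[t - 1] + e[t]                # n_t = phi * n_{t-1} + e_t
y = [b0 + b1 * x[t] + n[t] for t in range(T)]   # regression with AR(1) errors

def lag1_corr(z):
    """Sample lag-1 autocorrelation."""
    m = sum(z) / len(z)
    num = sum((z[t] - m) * (z[t - 1] - m) for t in range(1, len(z)))
    den = sum((v - m) ** 2 for v in z)
    return num / den

# The errors n_t show strong lag-1 autocorrelation (near phi),
# while the innovations e_t show essentially none.
assert lag1_corr(n) > 0.4
assert abs(lag1_corr(e)) < 0.2
```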
Residuals and errors

Example: nt = ARIMA(1,1,1)
yt = b0 + b1 x1,t + · · · + bk xk,t + nt
where (1 − φ1 B)(1 − B) nt = (1 − θ1 B) et

Be careful to distinguish nt from et: nt are the “errors” and et are the “residuals”. In ordinary regression, nt is assumed to be white noise, so nt = et.

After differencing all variables:
y′t = b1 x′1,t + · · · + bk x′k,t + n′t
Now a regression with ARMA(1,1) errors.
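The errors-vs-residuals distinction can be made concrete: given the ARMA parameters, the white-noise residuals et can be recovered exactly from the autocorrelated errors nt by inverting the ARMA filter. A small sketch (illustrative values, not from the slides):

```python
# Simulate an ARMA(1,1) error series n_t from known white noise e_t,
# then invert the filter to recover e_t. phi and theta are illustrative.
import random

random.seed(2)
T, phi, theta = 300, 0.6, 0.3

# Set e[0] = 0 so start-up effects vanish and recovery is exact.
e = [0.0] + [random.gauss(0, 1) for _ in range(T - 1)]
n = [0.0] * T
for t in range(1, T):
    # (1 - phi B) n_t = (1 - theta B) e_t
    n[t] = phi * n[t - 1] + e[t] - theta * e[t - 1]

# Invert: e_t = n_t - phi n_{t-1} + theta e_{t-1}
e_rec = [0.0] * T
for t in range(1, T):
    e_rec[t] = n[t] - phi * n[t - 1] + theta * e_rec[t - 1]

# The recovered innovations match the originals to machine precision:
# n_t are the "errors", e_t the "residuals" of the fitted ARMA filter.
assert max(abs(e[t] - e_rec[t]) for t in range(T)) < 1e-9
```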
Regression with ARIMA errors

Any regression with an ARIMA error can be rewritten as a regression with an ARMA error by differencing all variables with the same differencing operator as in the ARIMA model.

Original data
yt = b0 + b1 x1,t + · · · + bk xk,t + nt
where φ(B)(1 − B)^d nt = θ(B) et

After differencing all variables
y′t = b1 x′1,t + · · · + bk x′k,t + n′t
where φ(B) n′t = θ(B) et
and y′t = (1 − B)^d yt, etc.
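This equivalence is a simple algebraic identity: differencing both sides of yt = b0 + b1 x1,t + nt kills the intercept and leaves y′t = b1 x′1,t + n′t term by term. A numeric check (one regressor, arbitrary values):

```python
# Check that first-differencing a regression removes the intercept and
# preserves the slope: y'_t = b1 x'_t + n'_t exactly. Values are arbitrary.
import random

random.seed(3)
b0, b1 = 4.0, 1.5
x = [random.gauss(0, 1) for _ in range(50)]
n = [random.gauss(0, 1) for _ in range(50)]
y = [b0 + b1 * xt + nt for xt, nt in zip(x, n)]

def diff(z):
    """First differences: z'_t = z_t - z_{t-1}."""
    return [z[t] - z[t - 1] for t in range(1, len(z))]

yd, xd, nd = diff(y), diff(x), diff(n)
# The differenced regression holds exactly, with no intercept term.
assert all(abs(yd[t] - (b1 * xd[t] + nd[t])) < 1e-12 for t in range(len(yd)))
```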
Modeling procedure

Problems with OLS and autocorrelated errors
1 OLS is no longer the best way to compute the coefficients, as it does not account for the time relationships in the data.
2 The standard errors of the coefficients are incorrect, and most likely too small. This invalidates tests and prediction intervals.

The second problem is the more serious, because it can lead to misleading results. If the standard errors obtained using OLS are too small, some explanatory variables may appear to be significant when, in fact, they are not. This is known as “spurious regression”.
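The second problem can be demonstrated by simulation. The sketch below (illustrative parameters, not from the slides) repeatedly regresses pure AR(1) noise on a time trend: the true slope is zero, yet the naive OLS standard error comes out much smaller than the actual sampling spread of the slope, which is exactly what makes coefficients look spuriously significant.

```python
# Monte Carlo: OLS standard errors understate slope variability
# when errors are positively autocorrelated. Sizes are illustrative.
import math
import random

random.seed(4)
T, phi, reps = 100, 0.8, 500
t_idx = list(range(T))
tbar = sum(t_idx) / T
sxx = sum((t - tbar) ** 2 for t in t_idx)

slopes, naive_ses = [], []
for _ in range(reps):
    n = [0.0] * T
    for t in range(1, T):
        n[t] = phi * n[t - 1] + random.gauss(0, 1)   # AR(1) errors
    y = n                                            # true slope is 0
    ybar = sum(y) / T
    b = sum((t - tbar) * (y[t] - ybar) for t in t_idx) / sxx   # OLS slope
    resid = [y[t] - ybar - b * (t - tbar) for t in t_idx]
    s2 = sum(r * r for r in resid) / (T - 2)
    slopes.append(b)
    naive_ses.append(math.sqrt(s2 / sxx))            # textbook OLS SE

mean_b = sum(slopes) / reps
true_sd = math.sqrt(sum((b - mean_b) ** 2 for b in slopes) / (reps - 1))
avg_naive_se = sum(naive_ses) / reps
# The naive SE is far below the true spread of the slope estimates.
assert avg_naive_se < true_sd
```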
Modeling procedure

Estimation only works when all predictor variables are deterministic or stationary and the errors are stationary.
So difference stochastic variables as required until all variables appear stationary, then fit the model with ARMA errors.
auto.arima() will handle order selection and differencing (but it only checks that the errors are stationary).
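As a toy illustration of “difference until stationary” (not from the slides): a simulated random walk wanders far from any fixed mean, while its first difference is just the white noise that generated it.

```python
# A random walk is non-stationary; its first difference is stationary.
import random

random.seed(5)
rw = [0.0]
for _ in range(2000):
    rw.append(rw[-1] + random.gauss(0, 1))   # random walk: y_t = y_{t-1} + e_t

def var(z):
    m = sum(z) / len(z)
    return sum((v - m) ** 2 for v in z) / len(z)

d = [rw[t] - rw[t - 1] for t in range(1, len(rw))]   # first difference

# The level series drifts, so its sample variance is large; the
# differenced series has variance near 1, like the underlying noise.
assert var(rw) > 10 * var(d)
assert 0.8 < var(d) < 1.2
```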
Example: Japanese cars

[Figure: Japanese motor vehicle production (millions), 1965–1990]
Proposed model: linear trend plus correlated errors,
yt = a + bt + nt
where nt is an ARMA process.
[Figure: Errors from the linear trend model: time plot, ACF and PACF]
Example: Japanese cars

1 We fit a linear trend model:
yt = a + b xt + nt
where xt = t − 1963.
2 auto.arima() chooses an AR(1) model for nt.
3 The full model is
yt = a + b xt + nt where nt = φ1 nt−1 + et
and et is a white noise series.
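In R this model is fitted by maximum likelihood via auto.arima(). As a sketch of the underlying idea only (a classical alternative, not what auto.arima actually does), a Cochrane–Orcutt-style iteration estimates the same linear-trend-plus-AR(1) model: fit by OLS, estimate φ from the residuals, quasi-difference, and refit. All parameter values below are illustrative.

```python
# Cochrane-Orcutt-style fit of y_t = a + b t + n_t, n_t = phi n_{t-1} + e_t.
# Illustrative data; true values a=2.0, b=0.1, phi=0.7.
import random

random.seed(6)
T, a_true, b_true, phi_true = 200, 2.0, 0.1, 0.7
n = [0.0] * T
for t in range(1, T):
    n[t] = phi_true * n[t - 1] + random.gauss(0, 0.5)
y = [a_true + b_true * t + n[t] for t in range(T)]
x = list(range(T))

def ols(xv, yv):
    """Simple-regression OLS: returns (intercept, slope)."""
    m_x, m_y = sum(xv) / len(xv), sum(yv) / len(yv)
    b = sum((xi - m_x) * (yi - m_y) for xi, yi in zip(xv, yv)) / \
        sum((xi - m_x) ** 2 for xi in xv)
    return m_y - b * m_x, b

a_hat, b_hat = ols(x, y)
for _ in range(10):                             # iterate to convergence
    resid = [yi - a_hat - b_hat * xi for xi, yi in zip(x, y)]
    phi_hat = sum(resid[t] * resid[t - 1] for t in range(1, T)) / \
              sum(r * r for r in resid)         # lag-1 autocorr of residuals
    ys = [y[t] - phi_hat * y[t - 1] for t in range(1, T)]  # quasi-differences
    xs = [x[t] - phi_hat * x[t - 1] for t in range(1, T)]
    a_star, b_hat = ols(xs, ys)
    a_hat = a_star / (1 - phi_hat)              # undo quasi-differencing of intercept

# Estimates land near the true slope and AR coefficient.
assert abs(b_hat - 0.1) < 0.02
assert abs(phi_hat - 0.7) < 0.2
```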
Example: Japanese cars

Parameter        Estimate   Std. Error
AR(1) φ1         0.736      0.152
Intercept a      1662.      504.1
Slope b          463.6      32.31

We need to check that the residuals (et) look like white noise.
[Figure: Residuals from the linear trend + AR(1) model: time plot, ACF and PACF]
Example: Japanese cars

[Figure: Forecasts from the linear trend + AR(1) errors model (millions), to 2000]
Example: Japanese cars

> fit2 <- auto.arima(x)
ARIMA(0,1,0) with drift

Coefficients:
         drift
        0.4530
s.e.    0.0836

sigma^2 estimated as 0.1749: log likelihood = -13.68
AIC = 31.36   AICc = 31.9   BIC = 33.8
Example: Japanese cars

[Figure: Forecasts from ARIMA(0,1,0) with drift (millions), to 2000]
[Figure: Forecasts from the linear trend + AR(1) errors model, shown for comparison]
Fourier series

Any periodic function of period m can be approximated using

∑_{k=1}^{K} [ a_k sin(2πkt/m) + b_k cos(2πkt/m) ]

The approximation can be made exact by taking K large enough.
Each k represents a harmonic.
fourier() generates a matrix of Fourier terms.

Example
fit <- auto.arima(y, seasonal=FALSE,
                  xreg=fourier(y, K=3))
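The slides use the forecast package's fourier() in R; the sketch below builds the same kind of harmonic regressors by hand in Python and verifies the exactness claim: with m = 12 and K = m/2 harmonics, an arbitrary period-12 pattern is reproduced exactly.

```python
# Fourier approximation of a period-m pattern with K harmonics.
# The pattern values are arbitrary; m = 12 mimics monthly seasonality.
import math

m, K = 12, 6
pattern = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]    # arbitrary seasonal pattern
mean = sum(pattern) / m

approx = [mean] * m
for k in range(1, K + 1):
    # Fourier coefficients via the usual orthogonality sums
    a = (2 / m) * sum(pattern[t] * math.sin(2 * math.pi * k * t / m) for t in range(m))
    b = (2 / m) * sum(pattern[t] * math.cos(2 * math.pi * k * t / m) for t in range(m))
    if k == m // 2:        # Nyquist harmonic: sin column is zero, cos gets weight 1/2
        b /= 2
    for t in range(m):
        approx[t] += a * math.sin(2 * math.pi * k * t / m) \
                   + b * math.cos(2 * math.pi * k * t / m)

# With K = m/2 the reconstruction is exact, as the slide states.
assert all(abs(pattern[t] - approx[t]) < 1e-9 for t in range(m))
```

With a smaller K the sum is only an approximation, which is the usual trade-off when Fourier terms are used as seasonal regressors.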
Example: Sales of petroleum & coal products

U.S. monthly sales: Jan 1971 – Dec 1991.
[Figure: Four U.S. monthly series, 1971–1991: sales of petroleum and coal products; sales of chemicals and allied products; bituminous coal production; sales of motor vehicles and parts]
Example: Sales of petroleum & coal products

[Figure: The same four series on a log scale]
Example: Sales of petroleum & coal products

The series are clearly non-stationary, so we difference.
Example: Sales of petroleum & coal products
y′t = b1 x′1,t + b2 x′2,t + b3 x′3,t + n′t
where
yt = log petroleum sales
x1,t = log chemical sales
x2,t = log coal production
x3,t = log motor vehicle and parts sales.

auto.arima() selects an ARIMA(0,0,1)(1,0,2)12 model for n′t.
The full model is
yt = b1 x1,t + b2 x2,t + b3 x3,t + nt
where (1 − Φ1 B^12)(1 − B) nt = (1 + θ1 B)(1 + Θ1 B^12 + Θ2 B^24) et.
Example: Sales of petroleum & coal products

Parameter            Estimate   Std. Error
MA(1) θ1              0.331      0.058
Seasonal AR(1) Φ1     0.916      0.053
Seasonal MA(1) Θ1    −0.661      0.086
Seasonal MA(2) Θ2    −0.116      0.068
Log Chemicals b1      0.296      0.065
Log Coal b2          −0.029      0.013
Log Vehicles b3      −0.014      0.023

AIC = −913.8. Consider dropping the motor vehicles and parts variable: AIC = −915.6.
Forecasting

To forecast a regression model with ARIMA errors, we need to forecast the regression part of the model and the ARIMA part of the model, and combine the results.
When future predictors are unknown, they need to be forecast first.
Forecasts of macroeconomic variables may be obtained from national statistical offices.
Separate forecasting models may be needed for other explanatory variables.
Some explanatory variables are known into the future (e.g., time, dummies).
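The combination step can be sketched in a few lines. Assuming a fitted model yt = a + b t + nt with AR(1) errors (all the "fitted" values below are hypothetical, purely for illustration), the regression part is extrapolated directly while the AR(1) error forecast decays geometrically toward zero:

```python
# Combine the regression forecast with the ARIMA-error forecast.
# a_hat, b_hat, phi_hat, T, n_T are hypothetical fitted quantities.
a_hat, b_hat, phi_hat = 2.0, 0.1, 0.7   # pretend parameter estimates
T = 100                                  # last observed time index
n_T = 1.5                                # last estimated error n_T

forecasts = []
for h in range(1, 7):
    reg_part = a_hat + b_hat * (T + h)   # regression part: time is known ahead
    error_part = (phi_hat ** h) * n_T    # AR(1) error forecast: phi^h * n_T
    forecasts.append(reg_part + error_part)

# The error contribution shrinks geometrically, so the combined forecast
# converges to the plain regression line as the horizon grows.
```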