Lectures on modelling non stationary time series

by Roberto Golinelli

1. Univariate preliminary analysis
2. The stationarity issue in AR models: the unit root tests
3. Unit roots and spurious regressions
4. The dynamic specification (ARDL)
5. Long run relationships and cointegrated variables
6. Modelling systems
7. Guidelines for the preparation of applied econometrics projects
8. Reading list and acknowledgements

CIDE’s PhD Lectures, Bertinoro (FO), June 2005

Department of Economics
Strada Maggiore, 45
40125 Bologna (Italy)
[email protected]
www.dse.unibo.it/golinelli

1. UNIVARIATE PRELIMINARY ANALYSIS

From a statistical point of view, a time series is a sequence of random variables ordered in time; we introduce the concept of STOCHASTIC PROCESS (SP): $\{X_t\}$, $t = 1, 2, ..., T$

The probability structure of a stochastic process is determined by its joint distribution.

Example of a SP: the white noise model

[1] $x_t = c + \varepsilon_t$, $\varepsilon_t \sim$ n.i.d. $(0, \sigma^2)$

$x_t$ is normally and independently distributed over time, with constant variance and constant mean c.

Q: IS IT AN APPROPRIATE MODEL FOR MACROECONOMIC TIME SERIES?

Eviews/phil/series u (unemployment rate)/descript. stats

[Figure: histogram of U]

Series: U
Sample 1960 1999
Observations 40

Mean          6.945875
Median        5.616000
Maximum       12.25100
Minimum       2.835000
Std. Dev.     3.130564
Skewness      0.428753
Kurtosis      1.673676

Jarque-Bera   4.157416
Probability   0.125092

A corresponding artificial series can be generated with the same sample mean and standard deviation as the historical u:

genr uaswn = 6.94 + 3.13 * nrnd
genr meanline = 6.94
plot u uaswn meanline

[Figure: line plot of U, UASWN and MEANLINE; time axis 60 to 95]

The white noise (WN) model for the unemployment rate in Italy would state that: “u randomly fluctuates around a constant mean (6.94) with constant variance (3.13²)”. But the white noise model does not fit the actual data for u because it lacks the most common characteristic of time series: PERSISTENCE. In fact, the actual u stays below and above the “natural rate” of about 7% far more persistently than the simple WN process.
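The lack of persistence of a white noise series can be checked numerically. The following is a minimal Python sketch, not part of the EViews session: it draws an artificial series with the mean and standard deviation used above (6.94 and 3.13) and computes its first-order sample autocorrelation. The function names, the sample length and the seed are illustrative assumptions.

```python
import random

def white_noise(n, mean, sd, seed=0):
    """Draw n independent normal observations (analogue of genr ... nrnd)."""
    rng = random.Random(seed)
    return [mean + sd * rng.gauss(0.0, 1.0) for _ in range(n)]

def autocorr(x, k=1):
    """Sample autocorrelation of order k."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[t] - m) * (x[t - k] - m) for t in range(k, n))
    den = sum((v - m) ** 2 for v in x)
    return num / den

# A long artificial sample (assumption: 2000 draws) makes the point clearly.
uaswn = white_noise(2000, 6.94, 3.13)
rho1 = autocorr(uaswn, 1)
print(round(rho1, 3))  # near zero: a WN series has no memory
```

A persistent series such as the actual u would instead show a first-order autocorrelation close to one.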

Q: IS THIS RESULT PECULIAR TO UNEMPLOYMENT?

Eviews/lqr/series lqr (logs of capacity utilisation ratio).
plot lqr lqraswn meanline

From the plot below it is evident that capacity utilisation follows a completely different path from the unemployment rate: lqr is markedly less persistent than u.
However, the capacity utilisation ratio is still more persistent than the corresponding artificial series (generated as a white noise realisation).

[Figure: line plot of LQR, LQRASWN and MEANLINE; time axis 55 to 95]

A: WE HAVE TO FIND OTHER REFERENCE MODELS. MORE REALISTIC STATISTICAL MODELS ARE COMBINATIONS OF DIFFERENT ε; THEY ARE CALLED ARMA MODELS.

Example of another SP: the AR(1) model

[2] $x_t = c + \alpha x_{t-1} + \varepsilon_t$, where $\varepsilon_t \sim$ n.i.d. $(0, \sigma^2)$ is a WN

The variable $x_t$ is not independently distributed over time because it depends on $x_{t-1}$.
In a model for u, we can estimate the c and α parameters of equation [2] by OLS:

ls u c u(-1)

Method: Least Squares
Sample(adjusted): 1961 1999
Included observations: 39 after adjusting endpoints

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           0.123144      0.215444     0.571584      0.5711
U(-1)       1.011932      0.028916     34.99588      0.0000

[Figure: Residual, Actual and Fitted series from the AR(1) regression for u; time axis 65 to 95]

From the previous regression output we note that:
- the estimate of the α parameter is very close to one;
- the AR(1) model fits unemployment quite well.

Since residuals are estimates of $\varepsilon_t$ (a white noise process), we have to check the classical assumptions by using the diagnostic (misspecification) tests:

Under the null:           AR(1) residuals
no autocorrelation        rejected
no heteroscedasticity     not rejected
normality                 not rejected

→ We can react to autocorrelation by increasing the order of the AR process to 2: the AR(2) model is written as

[3] $x_t = c + \alpha_1 x_{t-1} + \alpha_2 x_{t-2} + \varepsilon_t$, $\varepsilon_t \sim$ WN

where there is one more parameter, and the dynamics are extended to the second lag.

ls u c u(-1) u(-2)

(results not reported).

- the residual tests are all fine (white noise errors);
- the AR(2) model fits equally well;
- the sum of the two α estimates is close to one.

Now, also try with the lqr variable:
ls lqr c lqr(-1)

Dependent Variable: LQR
Sample(adjusted): 1952 1997
Included observations: 46 after adjusting endpoints

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           -0.022713     0.007179     -3.163683     0.0028
LQR(-1)     0.604811      0.119754     5.050447      0.0000

R-squared            0.366970   Mean dependent var   -0.057308
Adjusted R-squared   0.352583   S.D. dependent var   0.018119
S.E. of regression   0.014579   Durbin-Watson stat   1.696

In this case, a first order model is enough to avoid residual problems, and the alpha estimate is about 0.6 (< 1).

Note that the dynamics of the capacity utilisation rate (less persistent than the unemployment rate) are more difficult to fit with the AR model (R² = 0.367, against 0.976).

Preliminary findings:
a) data persistence is explained by AR models;
b) the sum of the AR parameter estimates is often close to one;
c) the more persistent the path, the easier it is to fit the data by AR models and the closer to one is the sum of the alpha estimates.

In addition, note that not all economic series are untrended; in the case of trended variables we must introduce deterministic components in our statistical models in order to (potentially) account for this further feature.
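Finding c) can be illustrated on artificial data. The sketch below is a plain Python illustration, not the EViews `ls` command: it simulates two AR(1) series (α = 0.6 and α = 0.95, values chosen here as assumptions) and estimates c and α by OLS, together with the R² of the regression. All function names and seeds are hypothetical.

```python
import random

def simulate_ar1(n, c, alpha, sigma, seed=0):
    """Simulate x_t = c + alpha*x_{t-1} + e_t, starting at the long run mean
    (requires |alpha| < 1)."""
    rng = random.Random(seed)
    x = [c / (1.0 - alpha)]
    for _ in range(n - 1):
        x.append(c + alpha * x[-1] + rng.gauss(0.0, sigma))
    return x

def ols_ar1(x):
    """OLS of x_t on a constant and x_{t-1}: returns (c_hat, alpha_hat, R2)."""
    y, z = x[1:], x[:-1]
    n = len(y)
    my, mz = sum(y) / n, sum(z) / n
    szz = sum((v - mz) ** 2 for v in z)
    szy = sum((z[i] - mz) * (y[i] - my) for i in range(n))
    alpha_hat = szy / szz
    c_hat = my - alpha_hat * mz
    ssr = sum((y[i] - c_hat - alpha_hat * z[i]) ** 2 for i in range(n))
    sst = sum((v - my) ** 2 for v in y)
    return c_hat, alpha_hat, 1.0 - ssr / sst

low = ols_ar1(simulate_ar1(5000, 0.4, 0.6, 1.0, seed=1))     # mildly persistent
high = ols_ar1(simulate_ar1(5000, 0.05, 0.95, 1.0, seed=2))  # highly persistent
print("alpha_hat, R2:", low[1], low[2])
print("alpha_hat, R2:", high[1], high[2])
```

In a stationary AR(1) the population R² of this regression equals α², which is why the more persistent series is fitted better.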

From an economic point of view, the “nature” of the previous u and lqr variables excludes the presence of a deterministic trend, since both are measured by ratios.

On the other hand, there are many variables whose levels can grow continuously over time (output, real wages, prices, etc.).
For example, if we define the log of the real wage as:
genr lwp = log(w/p)
and plot lwp, we can note that its path over time is trended, explained by some causal effects (e.g. labour productivity growth). The same applies to ly (log of real output), equally trended, or to the consumer price level p.

The previous statistical models can easily be extended to this feature by including a deterministic trend (t); e.g. equation [1] becomes:

[1'] $x_t = c + \beta t + \varepsilon_t$

and we can fit this WN plus deterministic trend model to the actual lwp data:
ls lwp c @trend

[Figure: Residual, Actual and Fitted series from the WN plus trend regression for lwp; time axis 60 to 95]

The WN plus trend model does not fit the data, and the regression residuals are very persistent (strong positive autocorrelation).
→ we have to introduce wage dynamics with the AR(1) plus trend model (an extension of equation [2]):

[2′] xt = c + β t + α xt-1 + εt

ls lwp c @trend lwp(-1)

Dependent Variable: LWP
Method: Least Squares
Sample(adjusted): 1961 1999
Included observations: 39 after adjusting endpoints

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           0.538676      0.260299     2.069449      0.0457
@TREND      -0.001136     0.000675     -1.683461     0.1009
LWP(-1)     0.961725      0.021464     44.80662      0.0000

The inclusion is very important indeed: thanks to the dynamics the residuals are now fine (results not reported); the relevance of the time trend vanishes, while the autoregressive parameter estimate is close to one (as in many cases of AR model estimates).

Some first tentative conclusions confirm the previous preliminary findings:

a) despite the inclusion of a deterministic trend, lwp persistence needs AR dynamics → in general, many economic series can be represented by AR models of different orders, with or without deterministic trends;

b) the (sum of the) AR parameter estimates is very often close to one.

Q: WHAT DOES POINT B) IMPLY IN TERMS OF THE STATISTICAL PROPERTIES OF AR MODELS?

The next step is the study of the statistical properties of AR models with (or without) unit roots: a unit root is found in the SP of an AR model when the sum of the alpha parameters is equal to one (necessary condition).

2. THE STATIONARITY ISSUE IN AR MODELS: THE UNIT ROOT TESTS

Consider the AR(1) model in equation [2], and for the moment let’s ignore the deterministic components:

[2″] $x_t = \alpha x_{t-1} + \varepsilon_t$, where $\varepsilon_t \sim$ n.i.d. $(0, \sigma^2)$ is a WN

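By introducing the lag operator $L$:

$$L x_t = x_{t-1}, \qquad L^2 x_t = L(L x_t) = L x_{t-1} = x_{t-2}, \qquad L^0 x_t = x_t$$

we can rewrite equation [2″]:

$$x_t = \alpha L x_t + \varepsilon_t \;\rightarrow\; (1 - \alpha L)\, x_t = \varepsilon_t \;\rightarrow\; x_t = \varepsilon_t / (1 - \alpha L)$$

and if $|\alpha| < 1$ we have that:

$$\frac{1}{1 - \alpha L} = 1 + \alpha L + \alpha^2 L^2 + \alpha^3 L^3 + \dots = \sum_{i=0}^{\infty} \alpha^i L^i$$

(a geometric series converges if the absolute ratio of successive terms is less than 1).

[2*] $$x_t = (1 + \alpha L + \alpha^2 L^2 + \alpha^3 L^3 + \dots)\, \varepsilon_t = \varepsilon_t + \alpha \varepsilon_{t-1} + \alpha^2 \varepsilon_{t-2} + \alpha^3 \varepsilon_{t-3} + \dots$$

In equation [2*] the AR(1) process is written in the corresponding MA(∞) form (Wold representation).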

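$E(x_t) = 0$ (this result depends on the absence of deterministic components);

$$\mathrm{Var}(x_t) = E[x_t - E(x_t)]^2 = E[\varepsilon_t + \alpha \varepsilon_{t-1} + \alpha^2 \varepsilon_{t-2} + \alpha^3 \varepsilon_{t-3} + \dots]^2 = E[\varepsilon_t^2 + \alpha^2 \varepsilon_{t-1}^2 + \alpha^4 \varepsilon_{t-2}^2 + \alpha^6 \varepsilon_{t-3}^2 + \dots]$$

$$= \sigma^2 [1 + \alpha^2 + \alpha^4 + \alpha^6 + \dots] = \sigma^2 / (1 - \alpha^2) \equiv \mathrm{Var}(x_{t-k})$$

$$\mathrm{Cov}(x_t, x_{t-k}) = E\{[x_t - E(x_t)]\,[x_{t-k} - E(x_{t-k})]\}$$

$$= E\{[\varepsilon_t + \alpha \varepsilon_{t-1} + \dots + \alpha^k \varepsilon_{t-k} + \alpha^{k+1} \varepsilon_{t-k-1} + \alpha^{k+2} \varepsilon_{t-k-2} + \dots]\,[\varepsilon_{t-k} + \alpha \varepsilon_{t-k-1} + \alpha^2 \varepsilon_{t-k-2} + \dots]\}$$

$$= \alpha^k \sigma^2 / (1 - \alpha^2) = \alpha^k\, \mathrm{Var}(x_t) \equiv \alpha^k\, \mathrm{Var}(x_{t-k})$$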

Autocorrelation coefficient of order k:

$$\rho_k = \mathrm{Cov}(x_t, x_{t-k}) / \mathrm{Var}(x_t) = \mathrm{Cov}(x_t, x_{t-k}) / \mathrm{Var}(x_{t-k}) = \alpha^k$$

If |α| < 1, the AR(1) model is STATIONARY since its moments do not depend on t.

The autocorrelation coefficient $\rho_k$ decreases (in absolute value) as k increases: the memory of the process fades with k.

Example: if α = 0.6 (as in the case of the AR(1) model for the capacity utilisation ratio in logs) then:

$$x_t = \varepsilon_t + 0.6\, \varepsilon_{t-1} + 0.36\, \varepsilon_{t-2} + 0.216\, \varepsilon_{t-3} + 0.13\, \varepsilon_{t-4} + 0.08\, \varepsilon_{t-5} + 0.047\, \varepsilon_{t-6} + \dots$$

After six periods, the shock is no longer economically significant. An easy way to appreciate the path of the shock is to draw the impulse-response function of a series.
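The MA(∞) weights and the variance formula derived above can be reproduced with a few lines of Python. This is only a numerical check under the assumptions α = 0.6 and σ² = 1; the function names are illustrative, and the infinite sum is truncated at 200 terms.

```python
def ma_weights(alpha, horizon):
    """Weights alpha**j on eps_{t-j} in the MA(infinity) form of an AR(1)."""
    return [alpha ** j for j in range(horizon + 1)]

def ar1_variance(alpha, sigma2, terms=200):
    """Truncated sum sigma2 * (1 + alpha^2 + alpha^4 + ...)."""
    return sigma2 * sum(alpha ** (2 * j) for j in range(terms))

weights = ma_weights(0.6, 6)
print([round(w, 3) for w in weights])

# The truncated sum should match the closed form sigma2 / (1 - alpha^2).
print(ar1_variance(0.6, 1.0), 1.0 / (1.0 - 0.6 ** 2))
```

Since $\rho_k = \alpha^k$, the same list of weights is also the autocorrelation function of the process.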

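IMPULSE-RESPONSES IN THE STATIONARY AR(1) MODEL

horizon | timing | impulse | shocked series $x'_i$ | response $x'_i - x_i$
0 | $t$ | $s$ | $x'_t = x_t + s$ | $s$
1 | $t+1$ | 0 | $x'_{t+1} = \alpha x'_t + \varepsilon_{t+1} = \alpha x_t + \alpha s + \varepsilon_{t+1} = x_{t+1} + \alpha s$ | $\alpha s$
2 | $t+2$ | 0 | $x'_{t+2} = \alpha x'_{t+1} + \varepsilon_{t+2} = \alpha x_{t+1} + \alpha \alpha s + \varepsilon_{t+2} = x_{t+2} + \alpha^2 s$ | $\alpha^2 s$
3 | $t+3$ | 0 | ... | $\alpha^3 s$
... | ... | ... | ... | ...
h | $t+h$ | 0 | ... | $\alpha^h s$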

The responses decrease because |α| < 1.
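The impulse-response table can be replicated by brute force: simulate the same AR(1) path twice, once with and once without an extra impulse s at horizon 0, and take the difference. The Python sketch below does this under illustrative assumptions (α = 0.6, s = 1, arbitrary seed); it is not the EViews VAR impulse routine.

```python
import random

def ar1_path(alpha, x0, shocks):
    """Iterate x_t = alpha * x_{t-1} + eps_t over the given list of shocks."""
    path, x = [], x0
    for eps in shocks:
        x = alpha * x + eps
        path.append(x)
    return path

def impulse_responses(alpha, s, horizon, seed=0):
    """Responses x'_{t+h} - x_{t+h} to a one-off impulse s at horizon 0."""
    rng = random.Random(seed)
    eps = [rng.gauss(0.0, 1.0) for _ in range(horizon + 1)]
    shocked = [eps[0] + s] + eps[1:]   # same innovations, plus s at time t
    base = ar1_path(alpha, 0.0, eps)
    alt = ar1_path(alpha, 0.0, shocked)
    return [a - b for a, b in zip(alt, base)]

resp = impulse_responses(0.6, 1.0, 5)
print([round(r, 4) for r in resp])  # follows alpha**h * s
```

The random innovations cancel out in the difference, so the responses depend only on α and s.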

Example: the AR(1) model for the capacity utilisation ratio.

Eviews/lqr/quick/estimate VAR/lqr 1 1/impulse h=10 (multiple graphs)

[Figure: Response of LQR to One S.D. LQR Innovation, horizons 1 to 10]

What is depicted is the path of a transitory shock: given that the lqr variable is explained by a stationary AR(1) model, |0.6| < 1, the response to the impulse vanishes over time.

A transitory shock can be interpreted as a demand shock: an increase in demand (positive shock) causes a short run increase in output, but leaves unaffected the long run potential output of the economy (given by the supply side).

Contrast the stationary case (|α| < 1) with the unit root case (α = 1 in equation [2″]):

xt = xt-1 + εt

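Repeated backward substitution allows us to write:

$$x_t = x_0 + \varepsilon_t + \varepsilon_{t-1} + \varepsilon_{t-2} + \varepsilon_{t-3} + \dots + \varepsilon_2 + \varepsilon_1$$

where $x_0$ is assumed to be a fixed initial value for the process. In a process with unit roots, second moments depend on time (non stationarity):

$$E(x_t) = x_0$$

$$\mathrm{Var}(x_t) = E[x_t - x_0]^2 = E\Big[\sum_{i=1}^{t} \varepsilon_i\Big]^2 = t\, \sigma^2$$

$$\mathrm{Cov}(x_t, x_{t-k}) = E\{[x_t - x_0]\,[x_{t-k} - x_0]\}$$

$$= E\{[\varepsilon_t + \varepsilon_{t-1} + \dots + \varepsilon_{t-k} + \varepsilon_{t-k-1} + \dots + \varepsilon_2 + \varepsilon_1]\,[\varepsilon_{t-k} + \varepsilon_{t-k-1} + \dots + \varepsilon_2 + \varepsilon_1]\} = (t-k)\, \sigma^2 \equiv \mathrm{Var}(x_{t-k})$$

$$\rho_k = \frac{\mathrm{Cov}(x_t, x_{t-k})}{[\mathrm{Var}(x_t)\, \mathrm{Var}(x_{t-k})]^{0.5}} = \frac{(t-k)\, \sigma^2}{\sigma^2\, [t\,(t-k)]^{0.5}} = \sqrt{\frac{t-k}{t}}$$

$$\lim_{t \to \infty} \mathrm{Var}(x_t) = \infty; \qquad \lim_{t \to \infty} \mathrm{Cov}(x_t, x_{t-k}) = \infty; \qquad \lim_{t \to \infty} \rho_k = 1$$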
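A small Python sketch makes the time dependence of these moments concrete. It simply evaluates the two formulas above for a few values of t; the function names and the chosen values of t and k are illustrative assumptions.

```python
import math

def rw_variance(t, sigma2=1.0):
    """Var(x_t) = t * sigma^2 for a random walk with fixed x_0."""
    return t * sigma2

def rw_autocorr(t, k):
    """rho_k = sqrt((t - k) / t), valid for 0 <= k < t."""
    return math.sqrt((t - k) / t)

for t in (10, 100, 10000):
    # the variance grows without bound, rho_k approaches one
    print(t, rw_variance(t), round(rw_autocorr(t, 5), 4))
```

Unlike the stationary case, where $\rho_k = \alpha^k$ does not depend on t, here both moments change with the date of the observation.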

Eviews/phil/quick/estimate VAR/u 1 2/impulse h=20 (multiple graphs): the unemployment rate in Italy.

[Figure: Response of U to One S.D. U Innovation, horizons 1 to 20]

→ While in the stationary case a shock (innovation) ε has an effect on x that diminishes over time (transitory shock), in the unit root case ε has a sustained (permanent) effect.

In the case of the unemployment rate in Italy, a 0.5% shock is very persistent: after 20 years its effect is still 0.5% (permanent shock). The unit root model has an infinite memory.

DETERMINISTIC COMPONENTS

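The previous outcomes do not substantially change when deterministic components are added to the AR model:

$$y_t = d_t + x_t$$

where $d_t$ collects the deterministic variables, and $x_t$ is a zero mean AR(1) process.
By using the Wold representation of the AR(1) model:

$$y_t = d_t + \varepsilon_t / (1 - \alpha L) \;\rightarrow\; (1 - \alpha L)\, y_t = (1 - \alpha L)\, d_t + \varepsilon_t \;\rightarrow\; y_t = \alpha y_{t-1} + d_t - \alpha d_{t-1} + \varepsilon_t$$

case (a): $d_t = \mu_0$ (only the constant term)

$$y_t = (1 - \alpha)\, \mu_0 + \alpha y_{t-1} + \varepsilon_t$$

by defining $(1 - \alpha)\, \mu_0 = c$, we have equation [2].

case (b): $d_t = \mu_0 + \mu_1 t$ (linear trend)

$$y_t = \alpha y_{t-1} + \mu_0 + \mu_1 t - \alpha\, [\mu_0 + \mu_1 (t-1)] + \varepsilon_t$$

$$= \alpha y_{t-1} + \mu_0 + \mu_1 t - \alpha \mu_0 - \alpha \mu_1 t + \alpha \mu_1 + \varepsilon_t$$

$$= (1 - \alpha)\, \mu_0 + \alpha \mu_1 + \mu_1 (1 - \alpha)\, t + \alpha y_{t-1} + \varepsilon_t$$

by defining $(1 - \alpha)\, \mu_0 + \alpha \mu_1 = c$ and $\mu_1 (1 - \alpha) = \beta$, we have equation [2′].

Summary of two useful models:
(a) $y_t = c + \alpha y_{t-1} + \varepsilon_t$
(b) $y_t = c + \beta t + \alpha y_{t-1} + \varepsilon_t$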
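The mapping from the parameters of the deterministic part (µ0, µ1) to the regression parameters (c, β) can be written as a one-line function. The Python sketch below is only a restatement of the two definitions above; the function name and the numerical inputs are illustrative assumptions.

```python
def ar1_with_trend_params(mu0, mu1, alpha):
    """Return (c, beta) of y_t = c + beta*t + alpha*y_{t-1} + eps_t implied by
    d_t = mu0 + mu1*t; set mu1 = 0 for the constant-only case (a)."""
    c = (1.0 - alpha) * mu0 + alpha * mu1
    beta = mu1 * (1.0 - alpha)
    return c, beta

print(ar1_with_trend_params(7.0, 0.0, 0.6))  # case (a): c = (1 - alpha) * mu0
print(ar1_with_trend_params(2.0, 0.5, 0.6))  # case (b), stationary around a trend
print(ar1_with_trend_params(2.0, 0.5, 1.0))  # alpha = 1: c = mu1 and beta = 0
```

Note what happens when α = 1: the trend coefficient β vanishes and the constant c becomes the slope µ1, so the constant of the regression acts as a drift.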

$$E(y_t) = E(d_t) + E(x_t) = d_t \;\rightarrow\; \begin{cases} \text{(a)}\;\; E(y_t) = \mu_0 \\ \text{(b)}\;\; E(y_t) = \mu_0 + \mu_1 t \end{cases}$$

$$\mathrm{Var}(y_t) = E[y_t - E(y_t)]^2 = E(x_t^2) = \mathrm{Var}(x_t)$$

$$\mathrm{Cov}(y_t, y_{t-k}) = E\{[y_t - E(y_t)]\,[y_{t-k} - E(y_{t-k})]\} = E\{x_t\, x_{t-k}\} = \mathrm{Cov}(x_t, x_{t-k})$$

The deterministic variables in $y_t$ only change the mean (which is in any case non stochastic); the second moments are the same as those of the (zero mean) $x_t$ variable.

→ the stationarity condition is still |α| < 1

if |α| < 1:
- model (a) is a stationary AR(1) model with mean µ0 ≠ 0 (MEAN REVERTING); for: stationary non drifting time series
- model (b) is a stationary AR(1) plus trend model (TREND REVERTING); for: drifting trend stationary (TS) time series

if α = 1:
- model (a): $y_t = y_{t-1} + \varepsilon_t$ is the RANDOM WALK; for: non drifting difference stationary (DS) time series
- model (b): $y_t = \mu_1 + y_{t-1} + \varepsilon_t$ is the RANDOM WALK WITH DRIFT; for: drifting difference stationary (DS) time series

When α = 1 we talk about random walks because the non stationary AR is a first order model.

We can summarise the univariate unit root concept by following Johansen (1997), who notes that the specific unit-root model:

$$x_t = x_{t-1} + \varepsilon_t \qquad \text{or:} \qquad \Delta x_t = \varepsilon_t$$

can also be written as:

$$x_t = x_0 + \varepsilon_1 + \varepsilon_2 + \dots + \varepsilon_t$$

This is the random walk: a person starts walking from a square ($x_0$) and takes steps ($\varepsilon_i$) of random size and direction; $x_t$ is his position after t steps when he starts at $x_0$.

→ By modelling a variable as a random walk, we do not try to reproduce its sample path, because we decide that these “details” are not important to explain (in fact, we model them as random)

→ only the qualitative behaviour of the path matters: it is a “float” (once it has reached a level, it stays there until it reaches a new level).

On the other hand, in the model:

$$\Delta x_t = \pi x_{t-1} + \varepsilon_t \qquad \text{or:} \qquad x_t = \alpha x_{t-1} + \varepsilon_t$$

$\pi x_{t-1}$ is the predictable part of the movement and $\varepsilon_t$ is the unpredictable (random) part.

π (= α - 1) represents the “glue” of the process; if π ≈ 0 (then α ≈ 1) neighbouring values of $x_t$ are more often close together, and we get a wave-like behaviour. If instead π ≈ -1 (then α ≈ 0), neighbouring values of $x_t$ are almost unrelated (independent). In general, when -2 < π < 0, i.e. -1 < α < 1, the path of $x_t$ exhibits mean reversion.

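Summary exercise: Practice with the stationary AR(1) model

Use model (a) above:

$$y_t = c + \alpha y_{t-1} + \varepsilon_t \qquad \text{with: } \varepsilon_t \sim \text{n.i.d. } (0, \sigma^2)$$

Under the stationarity condition |α| < 1 we have that $E(y_t) = E(y_{t-1}) = \mu_0$.

The first and second moments can be derived more simply. From $E(y_t) = c + \alpha E(y_{t-1}) + E(\varepsilon_t)$ it follows that:

$$\mu_0 = c / (1 - \alpha)$$

If we substitute this definition into model (a), and take the expected value of the square:

$$(y_t - \mu_0) = \alpha\, (y_{t-1} - \mu_0) + \varepsilon_t$$

$$E(y_t - \mu_0)^2 = \alpha^2 E(y_{t-1} - \mu_0)^2 + E(\varepsilon_t)^2$$

$$\mathrm{Var}(y_t) = \gamma_0 = \sigma^2 / (1 - \alpha^2)$$

Finally, multiply the demeaned equation by $(y_{t-k} - \mu_0)$:

$$(y_t - \mu_0)(y_{t-k} - \mu_0) = \alpha\, (y_{t-1} - \mu_0)(y_{t-k} - \mu_0) + \varepsilon_t\, (y_{t-k} - \mu_0)$$

if we define $E[(y_t - \mu_0)(y_{t-k} - \mu_0)] = \gamma_k$, then $\gamma_k = \alpha \gamma_{k-1} + 0$ and:

$$\rho_k = \alpha \rho_{k-1} = \alpha^k \rho_0 = \alpha^k \qquad (\text{note that } \rho_0 = 1)$$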

Simulation analysis can be used to verify a number of stylised facts (procedure: simular1.prg).

In order to simulate an AR(1) model, we need to set three parameters: $\mu_0$, $\alpha$, and the ratio $= \sqrt{\gamma_0} / \mu_0$, in order to obtain the three “genuine” parameters of the AR(1) model:

$$c = \mu_0 (1 - \alpha), \qquad \alpha, \qquad \sigma = \text{ratio} \cdot \mu_0\, (1 - \alpha^2)^{1/2}$$

In the procedure: µ0 = %0, α = %1, ratio = %2, sign of α = %3 (0 = positive, 1 = negative).

' %0 = mean (id number)
' %1 = alpha (e.g. -0.6 or 0.6 -----> 60)
' %2 = ratio between s.d.(y)/mean(y) (e.g. 50% ---> 50)
' %3 = sign of alpha (0=positive, 1=negative)

scalar s = %2/100*%0*(1-( (-1)^%3 *%1/100)^2)^.5

smpl 1970.1 2000.4
rndseed %0
genr e%0%1%2_%3 = s*nrnd
' set the initial value (random number from long run mean & variance of y)
' alternatively you can start from zero or deterministically from the mean %0
smpl 1970.1 1970.1
genr y%0%1%2_%3=%0 + e%0%1%2_%3/(1-( (-1)^%3 *%1/100)^2)^.5
' iterate the other values
smpl 1970.2 2000.4
genr y%0%1%2_%3 = %0*(1-((-1)^%3*%1/100)) + (-1)^%3*%1/100*y%0%1%2_%3(-1)+e%0%1%2_%3

smpl 1970.1 2000.4
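For readers without EViews, the logic of the procedure can be sketched in Python. This is an approximate analogue, not a translation of simular1.prg: here α and the ratio are passed as plain decimals (0.8 rather than 80, with the sign included in α), the default of 124 observations corresponds to the quarterly sample 1970.1 to 2000.4, and the random draws will differ from those of EViews. The function name is hypothetical.

```python
import random

def simulate_ar1(mu0, alpha, ratio, n=124, seed=0):
    """Simulate y_t = c + alpha*y_{t-1} + eps_t with mean mu0 and
    s.d.(y) = ratio * mu0; returns (c, sigma, y)."""
    c = mu0 * (1.0 - alpha)
    sd_y = ratio * mu0
    sigma = sd_y * (1.0 - alpha ** 2) ** 0.5
    rng = random.Random(seed)
    # initial value drawn from the long run distribution of y
    y = [mu0 + sd_y * rng.gauss(0.0, 1.0)]
    # iterate the other values
    for _ in range(n - 1):
        y.append(c + alpha * y[-1] + rng.gauss(0.0, sigma))
    return c, sigma, y

c, sigma, y = simulate_ar1(mu0=100.0, alpha=0.8, ratio=0.5)
print(c, sigma, len(y))
```

Changing `alpha` towards one (or making it negative) while keeping `mu0` and `ratio` fixed reproduces the kind of scenarios suggested in the exercise.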

With alternative simulations we can assess:

(a) the issue of the initial value of the dynamic processes → what happens if we start far away from the mean? (e.g. 100 90 5 0)

(b) that the persistence of the path of $y_t$ depends on the α parameter (with 0 ≤ α < 1): it grows as α → 1

(c) what happens when -1 ≤ α < 0

(e.g. 100 0/40/80/99 50 0/1)

(d) what role is played by “ratio”

Hints:
- open a new working file “your_name.wf1” (quarterly data from 1970.1 to 2000.4);
- open the program simular1.prg;
- run the various scenarios (in parentheses above);
- compare plots, correlograms, impulse-response functions;
- save all useful results.

HOW TO TEST DS VS TS MODELS

The data generating process (DGP) is DS if α = 1, while it is TS (or mean reverting) if |α| < 1. The main inferential point is to find a significance test for the α estimates.

There are two testing models: model (a) includes a constant term, model (b) includes both a constant and a trend.

Models (a) and (b) can be conveniently reparametrised in order to ease inference:

(a) ∆yt = c + π yt-1 + εt

(b) ∆yt = c + β t + π yt-1 +εt

where: π = α - 1

THE DICKEY-FULLER (DF) UNIT ROOTS TEST

H0: π = 0 → α = 1 “the $y_t$ variable is a random walk”
H1: π < 0 → α < 1 “the $y_t$ variable is a stationary AR(1)”

Under the null, $y_t$ is integrated of order one, I(1), because it is stationary after one difference.

Under the alternative it is a mean reverting AR(1) in model (a) or a trend stationary variable in model (b); in both cases $y_t$ is I(0) because it is stationary without differencing (zero differences).
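The DF statistic is simply the t-ratio of the estimate of π in the reparametrised regression. The Python sketch below computes it for model (a) on two artificial series, one with a unit root and one stationary (α = 0.5); the series, seed and function name are illustrative assumptions. The resulting t-ratio has to be compared with Dickey-Fuller critical values, not with the usual Student-t tables.

```python
import random

def df_tstat(y):
    """t-ratio of pi in  dy_t = c + pi*y_{t-1} + eps_t  (DF test, model (a))."""
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    lag = y[:-1]
    n = len(dy)
    md, ml = sum(dy) / n, sum(lag) / n
    sll = sum((v - ml) ** 2 for v in lag)
    pi_hat = sum((lag[i] - ml) * (dy[i] - md) for i in range(n)) / sll
    c_hat = md - pi_hat * ml
    ssr = sum((dy[i] - c_hat - pi_hat * lag[i]) ** 2 for i in range(n))
    se_pi = (ssr / (n - 2) / sll) ** 0.5
    return pi_hat / se_pi

rng = random.Random(3)
walk, ar = [0.0], [0.0]
for _ in range(400):
    e = rng.gauss(0.0, 1.0)
    walk.append(walk[-1] + e)    # alpha = 1: unit root
    ar.append(0.5 * ar[-1] + e)  # alpha = 0.5: stationary
print(df_tstat(walk), df_tstat(ar))
```

For the stationary series the statistic is strongly negative, far beyond any conventional critical value; for the random walk it is typically much closer to zero.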

The choice of the deterministic components [i.e. model (a) or (b)?] depends on:
- the economic nature of the variable: model (a) for ratios and rates, model (b) for levels;
- the historical pattern: model (a) for non drifting variables (remember that level variables are often drifting).

Sometimes, the residuals from models (a) or (b) are autocorrelated: if so, they indicate that first-order autoregressive dynamics is not enough.

In these cases we have to move from the DF (first-order) test to the Augmented Dickey-Fuller test: the ADF(p) test allows for more general dynamics of order p+1.

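The corresponding (a) and (b) models for ADF(p) testing are:

(a) $\Delta y_t = c + \pi y_{t-1} + \sum_{i=1}^{p} \gamma_i\, \Delta y_{t-i} + \varepsilon_t$

(b) $\Delta y_t = c + \beta t + \pi y_{t-1} + \sum_{i=1}^{p} \gamma_i\, \Delta y_{t-i} + \varepsilon_t$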

Note that the augmentation is made in order to obtain white noise residuals; it is mainly suggested for high frequency observations (e.g. monthly, quarterly). The rule of thumb states that “the augmentation of the DF test is often quite similar (or equal) to data periodicity”.

Example. If the variable $y_t$ is quarterly, it is appropriate to start with fourth or fifth order dynamics. Suppose we are using model (a), and that fourth order dynamics are appropriate to explain the path of the variable under scrutiny [AR(4) model with constant and without trend]:

yt = c + α1yt-1 + α2yt-2 + α3yt-3 + α4yt-4 + εt

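it can be conveniently rearranged as follows (± marks a term that is both added and subtracted):

$$y_t - y_{t-1} = c + \alpha_1 y_{t-1} - y_{t-1} \pm \alpha_2 y_{t-1} \pm \alpha_3 y_{t-1} \pm \alpha_4 y_{t-1} + \alpha_2 y_{t-2} \pm \alpha_3 y_{t-2} \pm \alpha_4 y_{t-2} + \alpha_3 y_{t-3} \pm \alpha_4 y_{t-3} + \alpha_4 y_{t-4} + \varepsilon_t$$

$$\Delta y_t = c + (\alpha_1 + \alpha_2 + \alpha_3 + \alpha_4 - 1)\, y_{t-1} - (\alpha_2 + \alpha_3 + \alpha_4)\, \Delta y_{t-1} - (\alpha_3 + \alpha_4)\, \Delta y_{t-2} - \alpha_4\, \Delta y_{t-3} + \varepsilon_t$$

This specification coincides with the ADF(3) testing model by defining: $\pi = (\alpha_1 + \alpha_2 + \alpha_3 + \alpha_4 - 1)$; $\gamma_1 = -(\alpha_2 + \alpha_3 + \alpha_4)$; $\gamma_2 = -(\alpha_3 + \alpha_4)$; and $\gamma_3 = -\alpha_4$.

H0: π = 0 → $\alpha_1 + \alpha_2 + \alpha_3 + \alpha_4 = 1$: $y_t$ is I(1); not a random walk, but simply DS

H1: π < 0 → $\alpha_1 + \alpha_2 + \alpha_3 + \alpha_4 < 1$: $y_t$ is I(0), a stationary AR(4) model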
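The reparametrisation from the AR coefficients to the ADF coefficients generalises to any order, and can be written compactly. The Python sketch below implements the definitions of π and γ given above; the function name and the numerical AR(4) coefficients are illustrative assumptions (they are chosen to sum to one, i.e. to satisfy H0).

```python
def ar_to_adf(alphas):
    """Map AR(p+1) coefficients [a1, ..., a_{p+1}] to (pi, [gamma_1..gamma_p]):
    pi = sum(a) - 1 and gamma_i = -(a_{i+1} + ... + a_{p+1})."""
    pi = sum(alphas) - 1.0
    gammas = [-sum(alphas[i:]) for i in range(1, len(alphas))]
    return pi, gammas

# Hypothetical AR(4) whose coefficients sum to one (unit root under H0).
pi, gammas = ar_to_adf([0.5, 0.2, 0.2, 0.1])
print(pi, gammas)
```

With an AR(1) the list of γ coefficients is empty and the function returns π = α - 1, which is the simple DF case.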

The critical values of the ADF(p) test are the same as those of the DF test.

The choice of the starting augmentation order depends on:
- data periodicity (see above);
- significance of the $\gamma_i$ estimates;
- white noise residuals.

After preliminary estimation, non significant augmentation parameters can be dropped in order to obtain more efficient estimates. For this reason, Campbell-Perron (1991) intuitively suggest a “dropping down” procedure from $p_{max}$. This procedure has since been supported (and refined) by the findings of Hall (1994) and Ng-Perron (1995, 2001).

In particular, simulations carried out e.g. in Ng-Perron (1995) show a strong association between the choice of p and the severity of size distortions (over-rejections) and/or the extent of power loss (too few rejections).

Some ADF unit-root test applications

The unemployment data for Italy

Eviews/phil/u/view/line graph/unit root test/levels/intercept

ADF Test Statistic   -0.470304   1% Critical Value*    -3.6117
                                 5% Critical Value     -2.9399
                                 10% Critical Value    -2.6080

*MacKinnon critical values for rejection of hypothesis of a unit root.

Dependent Variable: D(U)
Method: Least Squares
Sample(adjusted): 1962 1999
Included observations: 38 after adjusting endpoints

Variable    Coefficient   Std. Error   t-Statistic   Prob.
U(-1)       -0.013118     0.027892     -0.470304     0.6411
D(U(-1))    0.432667      0.157136     2.753456      0.0093
C           0.218240      0.202729     1.076510      0.2891

R-squared            0.179620    Mean dependent var      0.219526
Adjusted R-squared   0.132741    S.D. dependent var      0.537428
S.E. of regression   0.500489    Akaike info criterion   1.529196
Sum squared resid    8.767134    Schwarz criterion       1.658479
Log likelihood       -26.05472   F-statistic             3.831583
Durbin-Watson stat   1.845302    Prob(F-statistic)       0.031280

The same test is carried out for the first differences of u:

ADF Test Statistic   -4.190608   1% Critical Value*    -3.6171
                                 5% Critical Value     -2.9422
                                 10% Critical Value    -2.6092

*MacKinnon critical values for rejection of hypothesis of a unit root.

Augmented Dickey-Fuller Test Equation
Dependent Variable: D(U,2)
Method: Least Squares
Sample(adjusted): 1963 1999
Included observations: 37 after adjusting endpoints

Variable      Coefficient   Std. Error   t-Statistic   Prob.
D(U(-1))      -0.745106     0.177804     -4.190608     0.0002
D(U(-1),2)    0.230343      0.163098     1.412300      0.1669
C             0.177567      0.089767     1.978085      0.0561

R-squared            0.352465    Mean dependent var      0.010108
Adjusted R-squared   0.314374    S.D. dependent var      0.591415
S.E. of regression   0.489707    Akaike info criterion   1.487584
Sum squared resid    8.153625    Schwarz criterion       1.618199
Log likelihood       -24.52030   F-statistic             9.253393
Durbin-Watson stat   2.098816    Prob(F-statistic)       0.000619

→ The unemployment rate in Italy is generated by a statistical process with one unit root: u is I(1), and ∆u is I(0). This is apparently impossible, since u is a ratio bounded between zero and 100%; the same can be said of other ratios or rates (interest rates, the inflation rate, etc.). An explanation is given by, among others, Hall, Anderson, Granger (1992, note 5):

«The conclusion that yields to maturity are integrated processes can not be true in a very strict sense because integrated series are unbounded, while nominal yields are bounded below by zero. Nevertheless it is evident from the data that the statistical characteristics of yields are closer to those of I(1) series than I(0) series, so that for the purposes of building models of the term structure it is appropriate to treat these yield series as if they were I(1)».

Practice: does the previous result change if we take u in logs (variable lu) instead of in levels? Try both reference models (a) and (b); do the answers change with the model?

Another application: the logs of capacity utilisation in Italy.

ADF Test Statistic   -3.723136   1% Critical Value*    -3.5814
                                 5% Critical Value     -2.9271
                                 10% Critical Value    -2.6013

*MacKinnon critical values for rejection of hypothesis of a unit root.

Augmented Dickey-Fuller Test Equation
Dependent Variable: D(LQR)
Method: Least Squares
Sample(adjusted): 1953 1997
Included observations: 45 after adjusting endpoints

Variable      Coefficient   Std. Error   t-Statistic   Prob.
LQR(-1)       -0.493511     0.132552     -3.723136     0.0006
D(LQR(-1))    0.250065      0.149150     1.676600      0.1010
C             -0.028317     0.007893     -3.587552     0.0009

R-squared            0.248158    Mean dependent var      -4.54E-05
Adjusted R-squared   0.212356    S.D. dependent var      0.016277
S.E. of regression   0.014446    Akaike info criterion   -5.572444
Sum squared resid    0.008765    Schwarz criterion       -5.452000
Log likelihood       128.3800    F-statistic             6.931416
Durbin-Watson stat   1.998996    Prob(F-statistic)       0.002504

The logs of the capacity utilisation ratio are I(0), as also suggested by the profile of the impulse responses.

Example: Do the Treasury bills interest rates have a unit root?
Eviews/termine/ plot rbot3 rbot6 rbot12

[Figure: line plot of RBOT3, RBOT6 and RBOT12; time axis 80 to 98]

ADF(12) test for rbot3 (levels):

ADF Test Statistic   -0.990188   1% Critical Value*    -3.4625
                                 5% Critical Value     -2.8752
                                 10% Critical Value    -2.5740

*MacKinnon critical values for rejection of hypothesis of a unit root.

ADF(12) test for d(rbot3) (first differences):

ADF Test Statistic   -3.475083   1% Critical Value*    -3.4627
                                 5% Critical Value     -2.8753
                                 10% Critical Value    -2.5740

*MacKinnon critical values for rejection of hypothesis of a unit root.

→ The variable rbot3 is I(1); for an explanation see above and read the following passage.

«Yet, interest rates are almost certainly stationary in levels. Interest rates were about 6% in ancient Babylon; they are about 6% now. The chances of a process with a random walk component displaying this behaviour are infinitesimal. Pr(|r1991<100%| r4000BC = 6%) it is infinitesimal if r are or contain a random walk; it is near one if interest rates are an AR(1) with a coefficient of 0.99», Cochrane (1991, p. 208).

The last sentence introduces the issue of the relevance of the time span, rather than of the number of observations.

Given an economic variable, e.g. the inflation rate, different data periodicities imply different time spans (inflation data, see below):

name   periodicity   time span    # of observ.
lypc   1             1895-1997    103
lpq    4             70q1-97q4    112
lpm    12            72m1-98m7    319

Case (1) compares lypc with lpq; case (2) compares lpq with lpm.

Given the number of observations T, the wider the time span, the higher the power of unit root tests (case 1).

Given the time span, high frequency data do not materially improve the power of the tests (case 2).

The power gain from increasing the data span is bigger than the power gain from increasing the sample size while leaving the data span fixed.

It would be surprising if simple time disaggregation helped in the estimation of long run relations. Hendry (1986, OBES).

An example: Is the inflation rate I(1) or I(0)?

If we use annual data (lypc.wf1), we have:

[Figure: line plots of LPC and DLPC, annual data; time axis 1900 to 1980]

and in fact, the ADF test results are:

ADF Test Statistic   -3.264109   1% Critical Value*    -4.0485
                                 5% Critical Value     -3.4531
                                 10% Critical Value    -3.1519

*MacKinnon critical values for rejection of hypothesis of a unit root.

Augmented Dickey-Fuller Test Equation
Dependent Variable: D(LPC)
Sample(adjusted): 1894 1997
Included observations: 104 after adjusting endpoints

Variable       Coefficient   Std. Error   t-Statistic   Prob.
LPC(-1)        -0.042398     0.012989     -3.264109     0.0015
D(LPC(-1))     0.769148      0.097420     7.895145      0.0000
D(LPC(-2))     0.044517      0.124285     0.358188      0.7210
D(LPC(-3))     -0.005133     0.098845     -0.051929     0.9587
C              -0.420880     0.134976     -3.118186     0.0024
@TREND(1890)   0.004323      0.001332     3.245647      0.0016

While, in first differences, the null is rejected at the 1% level:

ADF Test Statistic   -3.725069   1% Critical Value*    -3.4946
                                 5% Critical Value     -2.8895
                                 10% Critical Value    -2.5815

*MacKinnon critical values for rejection of hypothesis of a unit root.

Result: the inflation rate is I(0), and price levels are I(1).
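The decision rule applied to these outputs is a left-tailed comparison of the ADF statistic with the critical values. The Python helper below is a small illustrative sketch (the function name is hypothetical); the statistics and critical values are those reported in the EViews outputs of this section.

```python
def adf_decision(stat, critical):
    """Return the smallest significance level (in %) at which the unit root
    null is rejected (left-tailed test), or None if it is never rejected."""
    for level in sorted(critical):
        if stat < critical[level]:
            return level
    return None

# Annual price data: levels (with trend) and first differences.
levels = adf_decision(-3.264109, {1: -4.0485, 5: -3.4531, 10: -3.1519})
diffs = adf_decision(-3.725069, {1: -3.4946, 5: -2.8895, 10: -2.5815})
# Unemployment rate in levels.
unemp = adf_decision(-0.470304, {1: -3.6117, 5: -2.9399, 10: -2.6080})
print(levels, diffs, unemp)  # 10 1 None
```

For price levels the null is rejected only at the 10% level (not at 5%), while for first differences it is rejected at 1%, in line with the I(1) price level and I(0) inflation rate reading; for unemployment in levels the null is not rejected at any conventional level.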

But a completely different picture emerges with high frequency data (quarterly, lpq.wf1, and monthly, lpm.wf1); in fact, the common outcome is that the inflation rate is I(1), and price levels are I(2). This is understandable if we look at the plots below:

[Figure: quarterly data, 4*DLP and D4LP; time axis 70 to 96]

[Figure: monthly data, D12LP and 12*DLP; time axis 72 to 98]

THE ELLIOTT-ROTHENBERG-STOCK (DF-GLS) UNIT ROOTS TEST

Fact: While the presence or absence of a unit root has important implications, many remain skeptical about the conclusions drawn from such tests → why?

Remember:
Size of the test: probability that the test rejects the null when the null is true.
Power of the test: probability that the test correctly rejects the null when the alternative is true.

- The ADF test suffers from severe size distortions (over-rejection of the unit root hypothesis) when the moving-average polynomial of the first differenced series has a large negative root.
- The ADF test has low power when the root of the autoregressive polynomial is close to but less than unity.

Elliott, Rothenberg, Stock (1996) find that local GLS detrending of the data yields substantial power gains. Ng, Perron (2001) show that size and power may be further improved when the truncation lag is appropriately selected (e.g. with their specific “MAIC” p-selection rule).

DF-GLS and ADF share the same alternative univariatemodels: (a) without or (b) with a deterministic linear trend;and test for the same hypotheses:H0: “yt has a unit root ” H1: “yt is stationary”

The DF-GLS test is accomplished in two steps.

1st step: GLS detrending.

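Case (a), no trend, uses $\alpha^* = 1 - 7/T$; case (b), linear trend, uses $\alpha^* = 1 - 13.5/T$.

Quasi-differenced variables ($t^{(\alpha)}$ is needed in case (b) only):

$y^{(\alpha)}_1 = y_1$ ;  $c^{(\alpha)}_1 = 1$ ;  $t^{(\alpha)}_1 = 1$
$y^{(\alpha)}_t = y_t - \alpha^* y_{t-1}$   for t = 2, 3, ..., T
$c^{(\alpha)}_t = 1 - \alpha^*$   for t = 2, 3, ..., T
$t^{(\alpha)}_t = t - \alpha^* (t-1)$   for t = 2, 3, ..., T

OLS estimates of the $\delta_0$ and $\delta_1$ parameters:

case (a): $y^{(\alpha)}_t = \delta_0 c^{(\alpha)}_t + e_t$
case (b): $y^{(\alpha)}_t = \delta_0 c^{(\alpha)}_t + \delta_1 t^{(\alpha)}_t + e_t$

The detrended $y_t$ is defined as $y^d_t$:

case (a): $y^d_t = y_t - \hat\delta_0$
case (b): $y^d_t = y_t - (\hat\delta_0 + \hat\delta_1 t)$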

2nd step: ADF test on the detrended series $y^d_t$.

$\Delta y^d_t = \pi y^d_{t-1} + \sum_{i=1}^{p} \gamma_i \Delta y^d_{t-i} + \varepsilon_t$

The DF-GLS test corresponds to the Student-t of the OLS estimate of π in the equation above (note that all the deterministic variables are excluded, and the equation is the same in both cases).

While the DF-GLS t-ratio follows the ADF distribution (but without constant) in case (a), the asymptotic distribution differs in case (b) (c.v. are simulated in the ERS paper).
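The two steps above can be sketched in Python with NumPy. This is only an illustrative sketch, not the Eviews routine: the function name dfgls_stat, the fixed lag order p and the simulated series are assumptions made here, and no critical values are computed (they must be taken from the ERS paper).

```python
import numpy as np

def dfgls_stat(y, trend=False, p=1):
    """DF-GLS t-ratio: GLS detrending (step 1), then ADF without deterministics (step 2)."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    a = 1.0 - (13.5 if trend else 7.0) / T        # alpha* as defined in the notes
    # Deterministic regressors in levels: constant, plus t = 1..T in case (b)
    Z = np.ones((T, 1))
    if trend:
        Z = np.column_stack([Z, np.arange(1, T + 1, dtype=float)])

    def quasi_diff(x):
        # first observation kept as is, then x_t - alpha* x_{t-1}
        out = x.copy()
        out[1:] = x[1:] - a * x[:-1]
        return out

    ya = quasi_diff(y)
    Za = np.column_stack([quasi_diff(Z[:, j]) for j in range(Z.shape[1])])
    delta, *_ = np.linalg.lstsq(Za, ya, rcond=None)   # delta_0 (and delta_1)
    yd = y - Z @ delta                                # detrended series y^d_t

    # Step 2: regress dy^d_t on y^d_{t-1} and p lagged differences, no constant
    dy = np.diff(yd)
    Y = dy[p:]
    cols = [yd[p:-1]]
    for i in range(1, p + 1):
        cols.append(dy[p - i:len(dy) - i])
    X = np.column_stack(cols)
    b, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ b
    s2 = resid @ resid / (len(Y) - X.shape[1])
    se_pi = np.sqrt(s2 * np.linalg.inv(X.T @ X)[0, 0])
    return b[0] / se_pi                               # Student-t of pi

# Hypothetical data: a random walk and a white noise series
rng = np.random.default_rng(0)
print(dfgls_stat(np.cumsum(rng.standard_normal(300))))   # unit root series
print(dfgls_stat(rng.standard_normal(300)))              # stationary series
```

The statistic for the white noise series should be strongly negative, while the random walk should give a value much closer to zero.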


Concluding remarks (Stock and Watson, ch. 12, 2002)

If the dependent variable and/or the regressors are nonstationary, then autoregressive coefficients are biased towards zero, OLS t-statistics are nonnormal under the null, and regressions can be spurious. Hence the conventional hypothesis tests, confidence intervals, and forecasts are unreliable. The precise problem created by the nonstationarity, and the solution to that problem, depend on its nature.

Main sources of nonstationarity are trends and breaks.

Trend: a persistent long run movement of a variable over time. It is of two types:

deterministic (nonrandom function of time)
stochastic (random, varies over time)

(examples: ly and dlpc in lypc.wf1)

“Many econometricians think it is more appropriate to model economic time series as having stochastic rather than deterministic trends. Economics is complicated stuff. It is hard to reconcile the predictability implied by a deterministic trend with the complications and surprises faced year after year by workers, businesses, and governments (... examples ...). For these reasons, our treatment of trends in economic time series focuses on stochastic rather than deterministic trends.” (p. 458)

Break: arises when the population regression function changes over the course of the sample. In economics it occurs because of changes in economic policy and/or in the structure of the economy, inventions, etc. Usually, it entails changes in the regression parameters and poorer-than-expected forecasting performance. Again, the nature of the break suggests the best solution (switching/evolving parameter regressions).


3. UNIT ROOTS AND SPURIOUS REGRESSIONS

Most econometric analyses are based on sample variance and covariance estimates among variables.

Non stationarity causes problems (unconditional moments are not defined): a likely result is spurious regression, and the use of standard large sample theory for valid estimation and inference in the linear model is not allowed.

Historical background (since 1920s). Spurious correlation is an observed sample correlation between two series which, though appearing statistically significant, is a reflection of a common trend rather than a reflection of any genuine underlying association.

Allen (1949, p. 156): “There is a stronger positive correlation between the birth rate and the number of storks in Sweden, since each has been declining for various reasons.”

Correlation is a statistical concept which is neutral as regards causal relations (economic concept). Non stationarity “cancels” a number of standard statistical properties and tools.

Consider the model:
yt = c + β zt + εt

where we suppose that yt and zt are independent.

Classical assumptions of regression:

I. the regressors are either deterministic or stationary random variables (uncorrelated with the error term)

II. E(εt) = 0; E(εt²) = σ²; E(εt εt-k) = 0 ∀k > 0.


If both assumptions hold, then:

H0: β = 0 Pr(|t| > 1.96) = 0.05

while, if yt and zt are I(1) variables, assumption I. clearly fails, and the effect on the t-statistic distribution is:

H0: β = 0 Pr(|t| > 1.96) ≈ 0.753

(i.e. a falsely significant regression appears far more often than the 5% significance level implies: a problem in the size of the test).

In addition, such regressions are characterised by:

A very high R², close to one:

$R^2 = 1 - \frac{\sum_t (y_t - \hat{y}_t)^2}{\sum_t (y_t - \bar{y})^2}$

(since the variables are both trended, the ratio above can be very small).

A very low Durbin-Watson (DW) statistic of 1st order autocorrelation, close to zero. DW ≈ 2 (1 - ρ), where ρ is the 1st order autocorrelation coefficient of the regression residuals. If DW → 0 then ρ → 1, which suggests that the regression residuals are probably I(1).

Example: Spurious regressions with artificial data.
Eviews/new/workfile/quarterly/1970.1-2000.4 (T=124).
open/program/montecarlo.prg/ run it various times and check: t, R2 and DW.
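For readers without Eviews, a comparable experiment can be sketched in Python with NumPy. This is not the montecarlo.prg program (whose content is not reproduced in these notes); the function names, the number of replications and the seed are assumptions. It regresses one simulated random walk on another, independent one, and collects the t-statistic, R2 and DW; the difference flag applies the first difference filter to both series.

```python
import numpy as np

def ols_t_r2_dw(y, z):
    """OLS of y on a constant and z: slope t-ratio, R-squared, Durbin-Watson."""
    X = np.column_stack([np.ones(len(z)), z])
    b = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ b
    s2 = e @ e / (len(y) - 2)
    se_slope = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    r2 = 1.0 - (e @ e) / np.sum((y - y.mean()) ** 2)
    dw = np.sum(np.diff(e) ** 2) / (e @ e)
    return b[1] / se_slope, r2, dw

def spurious_mc(T=124, reps=500, seed=0, difference=False):
    """Share of |t| > 1.96, mean R2 and mean DW over independent random walk pairs."""
    rng = np.random.default_rng(seed)
    out = np.empty((reps, 3))
    for r in range(reps):
        y = np.cumsum(rng.standard_normal(T))   # I(1), independent of z
        z = np.cumsum(rng.standard_normal(T))   # I(1)
        if difference:                          # (1-L) filter applied to both series
            y, z = np.diff(y), np.diff(z)
        out[r] = ols_t_r2_dw(y, z)
    return np.mean(np.abs(out[:, 0]) > 1.96), out[:, 1].mean(), out[:, 2].mean()

print("levels     :", spurious_mc())
print("differences:", spurious_mc(difference=True))
```

In levels the rejection frequency should be far above 5% and the average DW close to zero; in first differences the rejection frequency should fall back near the nominal level and DW should move towards 2.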


Suggested remedies in literature

Granger-Newbold (1974, JE) suggest a “rule of thumb” to detect spurious regressions: when R2 >> DW.
→ the remedy they suggest is to apply a ∆ = (1-L) filter to the I(1) series in order to make them stationary, and improve inference:
    ls D(Y) C D(Z)
In this way, t-statistics are no longer significant (as expected, since the two variables were independently simulated), R2 is close to zero, and the DW test suggests non autocorrelated residuals (close to 2).

Sims-Stock-Watson (1990, E) note that, in static regressions in levels, t-statistics test the following hypotheses:
H0: β = 0 → yt = c + εt is FALSE because yt is I(1)
H1: β ≠ 0 → is FALSE too, since yt and zt are not related
→ both null and alternative hypotheses are false; this raises further inference problems. The main problem with these spurious regressions is that nothing in the model accounts for the persistence of yt (only the residuals do, since zt is not related to yt).
→ the remedy they suggest is to add lags in order to reach white noise residuals (since the persistence is captured by the dynamic specification):
    ls Y C Z Y(-1) Z(-1)
In addition, the true model (y is a random walk) is nested in the dynamic model. Things improve, but we still lack statistical foundations for inference (with I(1) variables, t-statistics are non standard and R2 are uninformative).

Some preliminary findings:
dynamics matters very much (white noise residuals);
always remember which model underlies both the null and the alternative hypotheses.


Example of a “crazy” regression: US consumers look at (logs of) UK incomes when they purchase goods (logs of US consumption)! Static model.
Eviews/ardlusuk/ ls lcus c lyuk

Dependent Variable: LCUS
Sample(adjusted): 1959:1 1998:1
Included observations: 157 after adjusting endpoints

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           -5.612676     0.160374     -34.99740     0.0000
LYUK         1.208592     0.014419      83.81657     0.0000

R-squared            0.978413    Mean dependent var       7.824778
S.E. of regression   0.052291    Akaike info criterion   -3.051334
Durbin-Watson stat   0.140469    Prob(F-statistic)        0.000000

If we impose the first difference transformation:
    ls d(lcus) c d(lyuk)
as suggested by Granger-Newbold (results not reported), we obtain positive and negative findings:

non significant t-statistic and very low R2 (positive side);
we lose the levels and the information from economic theory (negative side);
residual autocorrelation (negative side).

Reconsider now the spurious regression for US consumption in the context of the dynamic model, by augmenting the static regression with lags up to one year (four lags):
    ls lcus c lyuk(0 to -4) lcus(-1 to -4)
as suggested by Sims-Stock-Watson (results not reported). Main findings:

the residuals are white noise;
the sum of the lagged consumption parameter estimates is close to one, and the sum of the UK income parameter estimates is close to zero.


4. THE DYNAMIC SPECIFICATION (ARDL)

The problems related to non stationarity can be partly solved in a dynamic framework because many “potentially right” models are nested in it.

Empirical analysis of level (long run) relationships has been an integral part of time series econometrics and pre-dates the literature on unit roots and cointegration, see Hendry-Pagan-Sargan (1984).

The fundamental contribution of this literature is on the specification and estimation of level relationships, rather than testing for their presence, since “co-integration” theory was missing (not yet fully established).

Q: HOW TO USE ECONOMIC THEORIES WHEN CONSTRUCTING AN EMPIRICAL MODEL?

Two extreme approaches (see Granger, 1999, ch. 1):
(1) “Theory contains the only pure truth, so has to be at the basis of the model, leaving little place for stochastics, uncertainty or exogenous shocks to the system”.
(2) “Theory is useless; better atheoretical models based just on examination of the data and using any apparent regularities and relationships found in it”.

Most applied economists take a middle ground, using theory to provide the initial specification (variables of interest) and then data exploration techniques to extend or refine the starting model, leading to a form that better represents data.

Theory → static statements about economic relations (long run relations)

Reality (data) → dynamic fashion


A: TO PRODUCE A “BRIDGE” FROM THE PRISTINE THEORY TO THE MORE PRAGMATIC DATA ANALYSIS

When economic theory proposes an equilibrium relationship between two variables this may be seen as the long-run steady-state solution of a dynamic model.

Define the simple 1st order Auto-Regressive Distributed-Lags (ARDL) model:

[4] yt = c + α1 yt-1 + β0 zt + β1 zt-1 + εt

where εt ~ n.i.d. (0, σ2).

A long run relation is something that the dynamic process would satisfy if all errors were switched off: the equations would then bring the process back to a set of values where the long run relation is satisfied (steady state).

The long-run steady-state non-stochastic solution of eq. [4] is obtained by setting: yt = yt-1 = y*; zt = zt-1 = z*; εt = 0

y* = c/(1-α1) + (β0+β1)/(1-α1) z*

and the long run (level) relationship is measured by the parameter β = (β0+β1)/(1-α1).

Model [4] can be reparametrised in a convenient way in order to better understand the mechanism of adjustment towards the long run relationship:

[4′] ∆yt = c + β0 ∆zt + (α1-1) [yt-1 - (β0+β1)/(1-α1) zt-1] + εt

by defining α = (α1-1) and, as above, β = (β0+β1)/(1-α1), we obtain the specification of the error correction mechanism (ECM) model:

[4″] ∆yt = c + β0 ∆zt + α [yt-1 - β zt-1] + εt


given the level zt-1, y*t-1 = β zt-1 measures the target level, and [yt-1 - y*t-1] is the “error (equilibrium)-correction” term.

The ECM form of the model may be seen as comprising the short-run transitory effect and the long run relationship, and describes how the long run solution is achieved via error correction feedback.
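That the ARDL form [4] and the ECM form [4″] are the same model can be checked numerically. The sketch below (Python with NumPy) simulates [4] with hypothetical parameter values chosen here for illustration, then rebuilds ∆yt from the ECM formula with the same shocks; the two should coincide up to rounding error.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical ARDL(1,1) parameters (illustrative values, not estimates)
c, a1, b0, b1 = 0.5, 0.7, 0.3, 0.15
T = 200
z = np.cumsum(rng.standard_normal(T))     # forcing variable, here a random walk
eps = rng.standard_normal(T)

# Simulate eq. [4] in levels
y = np.zeros(T)
for t in range(1, T):
    y[t] = c + a1 * y[t - 1] + b0 * z[t] + b1 * z[t - 1] + eps[t]

# ECM parameters implied by the reparametrisation
alpha = a1 - 1.0                # loading (adjustment) parameter
beta = (b0 + b1) / (1.0 - a1)   # long run parameter

# Eq. [4''] evaluated with the same data and shocks
dy_ecm = c + b0 * np.diff(z) + alpha * (y[:-1] - beta * z[:-1]) + eps[1:]
max_gap = np.max(np.abs(np.diff(y) - dy_ecm))

print("alpha =", alpha, " beta =", beta)
print("largest discrepancy between [4] and [4'']:", max_gap)
```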

In fact, if -2 < α < 0 (which corresponds to -1 < α1 < 1) equation [4″] equilibrates in the presence of a discrepancy between yt-1 and y*t-1: it guarantees that, in the long run, y will converge to its target y*.

Suppose that yt is not on its long run path, with yt-1 > y*t-1 {yt-1 < y*t-1}: in the ECM representation, -2 < α < 0 ensures that there is pressure, from the error-correction term, for ∆yt < 0 {∆yt > 0}. In other terms, if -2 < α < 0, in “disequilibrium” yt will move towards its long run path, both from above and from below, and the movement will be in proportion to the last period’s “error” given by [yt-1 - y*t-1].

α also measures the speed (and the path) of the adjustment of yt to disequilibrium.
α > 0 → the model is explosive
α = 0 → the model does not adjust
-1 < α < 0 → stable process of adjustment
α = -1 → the model adjusts in one period
-2 < α < -1 → overshooting adjustment
α = -2 → the model continuously oscillates
α < -2 → the model is explosive
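The cases listed above can be traced with a few lines of Python. The sketch assumes a constant target and no shocks, so that [4″] reduces to (yt - y*) = (1 + α)(yt-1 - y*); the function name gap_path and the chosen α values are illustrative.

```python
def gap_path(alpha, gap0=1.0, periods=6):
    """Path of the disequilibrium (y_t - y*) when shocks are off and the target is fixed."""
    path = [gap0]
    for _ in range(periods):
        path.append((1.0 + alpha) * path[-1])
    return path

for alpha in (0.1, 0.0, -0.5, -1.0, -1.5, -2.0, -2.1):
    print(f"alpha = {alpha:5.1f}:", [round(g, 3) for g in gap_path(alpha)])
```

With α = -0.5 the gap halves every period; with α = -1.5 it changes sign each period while shrinking; with α = -2 it flips sign forever without shrinking.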

Big forecast problems arise in the presence of breaks in the level relationship y*t (see Clements-Hendry, 1999).


Testing for the existence of a level-relationship

It is more convenient to use another parametrisation of [4]:

[4*] ∆yt = c + β0 ∆zt + π1 yt-1 + π2 zt-1 + εt

where: π1 = (α1-1) = α; and π2 = (β0+β1) = -α β

There are two testing possibilities:

i. H0: π1 = 0 → (t-statistic);
ii. H0: π1 = π2 = 0 → (F-statistic).

Note that the t and F distributions are not standard; asymptotic critical values are tabulated by Pesaran-Shin-Smith (2001), and they depend on:

whether the variables are I(0) or I(1);
the presence of the deterministic trend and/or the constant;
the number k of explanatory (forcing) variables.

Example: unrestricted constant and no trend (5% asymptotic c.v.)

           t                 F
k     I(0)    I(1)      I(0)    I(1)
1    -2.86   -3.22      4.94    5.73
2    -2.86   -3.53      3.79    4.85
3    -2.86   -3.78      3.23    4.35
4    -2.86   -3.99      2.86    4.01
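The way the two sets of critical values are used can be sketched as a small decision rule in Python. The critical values are those of the table above; the function name bounds_decision, the "inconclusive" label for statistics falling between the two bounds, and the test statistics in the usage lines are assumptions made for illustration.

```python
# 5% asymptotic c.v., unrestricted constant and no trend (from the table above)
# k: (t I(0), t I(1), F I(0), F I(1))
CV_5PCT = {
    1: (-2.86, -3.22, 4.94, 5.73),
    2: (-2.86, -3.53, 3.79, 4.85),
    3: (-2.86, -3.78, 3.23, 4.35),
    4: (-2.86, -3.99, 2.86, 4.01),
}

def bounds_decision(stat, k, kind):
    """Compare a t or F statistic with the I(0) and I(1) bounds for k forcing variables."""
    t0, t1, f0, f1 = CV_5PCT[k]
    if kind == "F":
        if stat > f1:
            return "reject H0"            # above the I(1) bound
        if stat < f0:
            return "do not reject H0"     # below the I(0) bound
    elif kind == "t":
        if stat < t1:
            return "reject H0"            # more negative than the I(1) bound
        if stat > t0:
            return "do not reject H0"     # less negative than the I(0) bound
    else:
        raise ValueError("kind must be 't' or 'F'")
    return "inconclusive"                 # between the two bounds

# Hypothetical statistics, for illustration only
print(bounds_decision(6.0, k=2, kind="F"))    # reject H0
print(bounds_decision(4.0, k=2, kind="F"))    # inconclusive
print(bounds_decision(-2.0, k=2, kind="t"))   # do not reject H0
```

When the statistic falls between the bounds, the conclusion depends on whether the variables are I(0) or I(1), which is why both sets of c.v. are reported.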

Main features of the ARDL approach:

Developed in the field of the Auto Regressive Distributed Lags models (dynamic specification, Sargan, LSE, etc.).

The null hypothesis of both (t and F) tests is the absence of a long run relationship.

Critical values are tabulated ad hoc: one set of c.v. is obtained by assuming that all the variables are I(0), the other set by assuming that all the variables are I(1).

In the case where zt and εt are correlated, the ARDL procedure requires estimation of an augmented version of the original model.

Hence, the important issue in the application of the ARDL procedure is the choice of the order of the distributed lag function on yt and zt.

Example: Does a stable long run relationship exist between consumption and income in the US? Eviews/ardlusuk/ardl
    genr dlcus = d(lcus)
    genr dlyus = d(lyus)
    ls dlcus c dlcus(-1 to -3) dlyus(0 to -3) lcus(-1) lyus(-1)

Both t and F tests do not reject H0 → spurious relation?

Probably it’s better to think in terms of omitted variables

→ e.g. the quarterly inflation rate pius (Deaton consumption model) is added to the previous ARDL specification:
    genr dpius = d(pius)
    ls dlcus c dlcus(-1 to -3) dlyus(0 to -3) dpius(0 to -3) lcus(-1) lyus(-1) pius(-1)

Given LM(4) autocorrelated residuals, we added dlcus lags 4 and 5: the results are in /equation/ardlfinal/

    ls dlcus c dlcus(-1 to -5) dlyus(0 to -3) dpius(0 to -3) lcus(-1) lyus(-1) pius(-1)

(model used in estimating/testing for the long run relationship).


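Both t and F tests reject H0 using either the I(0) or the I(1) c.v., since the statistics are t = -3.5 and F = 8.46, while the corresponding I(0) 5% c.v. are t = -2.86, F = 3.79, and the I(1) 5% c.v. are t = -3.53, F = 4.85 (see Pesaran, Shin, Smith, 2001, p. T.2 and T.4). Note that the t statistic, as rounded here, sits right at the I(1) bound, while the F statistic is well above it.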

The estimates of the long run parameters are:

$\hat\beta_y = \frac{0.105}{0.112} = 0.94$ ;   $\hat\beta_{pi} = -\frac{0.251}{0.112} = -2.24$.

The estimate of the loading parameter is α = -0.112: each quarter, lcus adjusts by about 11% of the gap towards the target (equilibrium) level given by: lcus* = 0.94 lyus - 2.24 pius.
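The arithmetic behind these long run estimates follows from the parametrisation [4*], where π2 = -α β and hence β = -π2/π1. A minimal Python sketch, using the coefficient values reported above (the variable names, and the signs of the level coefficients as implied by the reported ratios, are assumptions):

```python
# Level coefficients of the final ARDL equation, as implied by the ratios reported above
pi_lcus = -0.112    # coefficient on lcus(-1): the loading parameter alpha
pi_lyus = 0.105     # coefficient on lyus(-1)
pi_pius = -0.251    # coefficient on pius(-1)

# Long run parameters: beta = -pi2 / pi1
beta_y = -pi_lyus / pi_lcus
beta_pi = -pi_pius / pi_lcus

print(round(beta_y, 2), round(beta_pi, 2))    # 0.94 -2.24
print("share of the gap closed each quarter:", -pi_lcus)
```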

Often the absence of a long run relationship is the symptom of the omission of relevant variables.

In fact, the “partial” long run relationship (lcus-lyus) is very persistent (i.e. reverts slowly), while the additional information from the path of pius can further explain savings inertia, and form a cointegrated relationship together with (lcus-lyus).
    plot lcus-lyus pius

[Figure: lcus-lyus and pius, 1960-2000]

5. LONG RUN RELATIONSHIPS AND COINTEGRATED VARIABLES

Ex ante, by pretesting with the ADF test, suppose we know that yt and zt are I(1):

does a long run relationship between y and z exist?
are y and z co-integrated?

Cointegration definition:

Two integrated I(d) variables are co-integrated if there exists a linear combination of them which is integrated I(c), with c < d.

The case d = 1 and c = 0 is interesting in that cointegration implies an ECM representation (EqCM is the update of the acronym by Hendry & co.), which allows us to rewrite a dynamic model in I(1) levels as a dynamic model which involves only I(0) variables.

THE ENGLE AND GRANGER PROCEDURE:

[A] OLS estimation of the static (cointegration) regression
    yt = c + β zt + ut
Note that if the combination ut = yt - (c + β zt) is I(0), then the integrated y and z variables are also cointegrated.

[B] Unit root test on the cointegration regression residuals
    $\hat{u}_t = y_t - (\hat{c} + \hat\beta z_t)$

[C] If y and z are cointegrated, then $\hat\beta$ (the OLS estimator of β) is superconsistent.

[D] Dynamic (short run) ECM modelling of ∆yt, ∆zt and $\hat{u}_t$. Note that under the hypothesis of cointegration all the variables in the ECM model are I(0).
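Steps [A] and [B] can be sketched in Python with NumPy on simulated data. Everything here is illustrative: the data generating processes, the helper names ols and df_tstat, and the use of a plain DF regression without lags are assumptions; the statistic must be compared with the residual-based critical values (Engle-Granger, 1987), not with the standard DF tables.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients and residuals."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b, y - X @ b

def df_tstat(u):
    """DF t-ratio on residuals: du_t = rho * u_{t-1} + e_t (no constant, no lags)."""
    du, ul = np.diff(u), u[:-1]
    rho = (ul @ du) / (ul @ ul)
    e = du - rho * ul
    se = np.sqrt((e @ e) / (len(du) - 1) / (ul @ ul))
    return rho / se

rng = np.random.default_rng(42)
T = 200
z = np.cumsum(rng.standard_normal(T))              # I(1) regressor
y = 2.0 + 1.0 * z + rng.standard_normal(T)         # cointegrated with z, beta = 1
w = np.cumsum(rng.standard_normal(T))              # I(1), unrelated to z

X = np.column_stack([np.ones(T), z])
b_coint, u_coint = ols(X, y)                       # step [A] on the cointegrated pair
b_spur, u_spur = ols(X, w)                         # step [A] on the unrelated pair
t_coint, t_spur = df_tstat(u_coint), df_tstat(u_spur)   # step [B]

print("cointegrated pair: slope =", b_coint[1], " DF t =", t_coint)
print("unrelated pair   : slope =", b_spur[1], " DF t =", t_spur)
```

The residuals of the cointegrated pair should give a strongly negative statistic, while those of the unrelated pair should not.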


RATIONALE OF STEPS [A, B]: COINTEGRATION AND COMMON TRENDS

At univariate level, suppose: $y_t = \mu_{1t} + v_{1t}$

where: $\mu_{1t} = \mu_1 + \mu_{1,t-1} + \varepsilon'_{1t}$ ,   $\varepsilon'_{1t} \sim i.i.d.\,(0, \sigma^2_{11})$
       $v_{1t} = v_1 + \alpha_1 v_{1,t-1} + \varepsilon''_{1t}$ ,   $\varepsilon''_{1t} \sim i.i.d.\,(0, \sigma^2_{12})$

µ1t is a random walk (the non stationary component of yt), and v1t is an AR(1) with |α1| < 1 (the stationary component of yt). The same is supposed for zt: $z_t = \mu_{2t} + v_{2t}$

where: $\mu_{2t} = \mu_2 + \mu_{2,t-1} + \varepsilon'_{2t}$ ,   $\varepsilon'_{2t} \sim i.i.d.\,(0, \sigma^2_{21})$
       $v_{2t} = v_2 + \alpha_2 v_{2,t-1} + \varepsilon''_{2t}$ ,   $\varepsilon''_{2t} \sim i.i.d.\,(0, \sigma^2_{22})$

In the static regression yt = c + β zt + ut, we have that:

$u_t = y_t - \beta z_t - c = \mu_{1t} + v_{1t} - \beta(\mu_{2t} + v_{2t}) - c = [\mu_{1t} - \beta\mu_{2t}] + [v_{1t} - \beta v_{2t}] - c$

where [µ1t - β µ2t] are the I(1) components, and [v1t - β v2t] are the I(0) components.

The cointegration condition is [µ1t - β µ2t] = 0 (when combined, the I(1) components of the y and z variables “cancel” each other).

Under the cointegration condition, the error term is
ut = (v1t - β v2t – c) ~ I(0)

while, if the cointegration condition is not satisfied, ut ~ I(1). In other terms, ut ~ I(0) means that µ1t = β µ2t: the I(1) component of yt is the same as that of zt up to a scalar β, the parameter that “converts” µ2t into µ1t.

µ2t is the stochastic common trend of yt and zt; in fact, under the cointegration assumption, we have that:

zt = µ2t + v2t

yt = β µ2t + v1t


→ µ2t is the common source of nonstationarity; and, by substituting the definition of µ2t in the cointegrated yt we have:

yt = β (zt - v2t) + v1t = c + β zt + (v1t – β v2t – c)

→ Again, the static regression residuals are stationary (though autocorrelated) if yt and zt are cointegrated.

INTUITION BEHIND STEP [C]: SUPERCONSISTENCY

If yt and zt are cointegrated, the OLS method yields a “superconsistent” estimator of the cointegrating parameters since the effect of the common trend dominates the effect of the stationary component.

The omission of the dynamics is not very relevant if the variables are I(1) and cointegrated.

The cointegrated combination is a strong linear relationship, but with relevant biases in small samples (see Banerjee et al. (1993) results).

Example with simulated data. Eviews/new/workfile/undated/1 200/open/program/supercon.prg/: run the program, and it will display different dispersions around the regression lines for I(0) variables (yi0 against zi0) and I(1) variables (yi1 against zi1). All data were simulated with long run parameter equal to 1, with both I(0) and I(1) variables, and adjustment parameter equal to ½.

The regression output from I(1) variables
    ls yi1 c zi1
shows that 200 observations are enough to enjoy the superconsistency of the OLS estimator of cointegrated relations. The recursive estimation gives the intuition of the issue: with samples < 100 observations the amount of the bias is very relevant (this result confirms Banerjee et al. (1993) outcomes).

[Figures: scatter of YI0 against ZI0; scatter of YI1 against ZI1; recursive estimates, observations 20 to 200]

The residuals of the static regression are autocorrelated (LM test and correlogram) but stationary. Following the Engle and Granger (1987) approach, white noise residuals are not essential at this stage. Save the residuals and perform the ADF test without deterministic components (CRDW and CRDF with different c.v., see Engle-Granger, 1987).

The same regression with I(0) variables
    ls yi0 c zi0
shows a short term (and not long term) parameter estimate, and autocorrelated residuals; by adding the lagged dependent variable, we are able to find a consistent estimate of the long run parameter.


The previous results with simulated I(1) time series suggest a further question: “why does the static (cointegration) regression estimate the cointegration (long run) parameter even though the DGP is dynamic?”.

Hypotheses: (i) y and z are I(1) and cointegrated;
(ii) the “true” DGP is (see eq. [4]):
    yt = c + α1 yt-1 + β0 zt + β1 zt-1 + εt [εt ~ nid(0, σ2)]

Fact: if y and z are cointegrated, the OLS regression of y on z alone (with no lags) yields a slope $\hat\beta$ that is a (super)consistent estimator of the long run parameter β = (β0+β1)/(1-α1)

because: the OLS criterion of picking $\hat\beta$ to minimise the sum of squared residuals forces it towards [(β0+β1)/(1-α1)].

In fact, if β ≡ β0 ≠ [(β0+β1)/(1-α1)] → yt = β zt + u1t
where u1t = β1 zt-1 + α1 yt-1 + εt

while, if β ≡ [(β0+β1)/(1-α1)] → yt = β zt + u2t
where u2t = β1/(α1-1) ∆zt + α1/(α1-1) ∆yt + εt/(1-α1)

Given the hypotheses (i) and (ii), u1t ~ I(1) and u2t ~ I(0).

Since the sum of squares of an I(1) variable increases without bound as the sample goes to infinity, the OLS estimator will pick the β estimate such that the corresponding residuals are closer to an estimate of u2t instead of u1t.

Of course, this fact per se does not prevent u2t from being autocorrelated.
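The argument can be illustrated with a small simulation in Python with NumPy. This is not the supercon.prg program; the parameter values (impact effect β0 = 0.2, long run parameter 1, α1 = 0.5), the function name static_slope and the sample sizes are assumptions chosen for illustration. The DGP is the dynamic model [4] without constant, and the regression is static.

```python
import numpy as np

def static_slope(T, a1=0.5, b0=0.2, b1=0.3, seed=0):
    """Simulate the ARDL(1,1) DGP with an I(1) forcing variable and return
    the OLS slope of the static regression of y on a constant and z."""
    rng = np.random.default_rng(seed)
    z = np.cumsum(rng.standard_normal(T))           # random walk
    y = np.zeros(T)
    for t in range(1, T):
        y[t] = a1 * y[t - 1] + b0 * z[t] + b1 * z[t - 1] + rng.standard_normal()
    X = np.column_stack([np.ones(T), z])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b[1]

long_run = (0.2 + 0.3) / (1 - 0.5)                  # = 1.0, versus impact effect 0.2
for T in (50, 200, 5000):
    print(T, static_slope(T), "long run parameter:", long_run)
```

As T grows the static slope should settle near the long run parameter rather than near the impact coefficient β0, while small samples can show a visible bias.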


Example: the Engle and Granger application to the Phillips curve for Italy. Eviews/phil/

[1st step] ls dlw c lu → DW = 0.4, R2 = 0.35

Dependent Variable: DLW
Method: Least Squares
Sample(adjusted): 1961 1999
Included observations: 39 after adjusting endpoints

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C            0.247763     0.031744      7.804992     0.0000
LU          -0.075388     0.016687     -4.517788     0.0001

[Figure: Residual, Actual and Fitted, 1965-1995]

The figure above suggests very persistent residuals, and the magnitude of the ADF test on the residuals, reported below, confirms the visual inspection:

ADF Test Statistic -1.220092 (c.v. are in Engle-Granger, 1987)


Another symptom of the absence of a stable (cointegrated) long run relationship among the variables of interest is the recursive estimate plots of the long run parameters:

[Figures: Recursive C(1) Estimates ± 2 S.E.; Recursive C(2) Estimates ± 2 S.E.; 1970-1995]

What is the reason for such a result? The scatter of the wage-unemployment trade-off can be suggestive:

scat lu dlw (connect/line)

[Figure: scatter of DLW against LU]

→ The impression is that the omission of inflation from the information set prevents the other two variables from being cointegrated. In fact, the sudden rise of inflation at the beginning of the 70s pushed up nominal wage growth, without a corresponding reduction in the unemployment rate.


ls dlw c dlp lu → DW = 1.91, R2 = 0.92

Dependent Variable: DLW
Method: Least Squares
Sample(adjusted): 1961 1999
Included observations: 39 after adjusting endpoints

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C            0.152943     0.012837     11.91409      0.0000
DLP          0.832169     0.052361     15.89303      0.0000
LU          -0.059509     0.006058     -9.823363     0.0000

ADF Test Statistic -4.154235

Now there is cointegration (see the ADF test above).

How to test whether the elasticity of wage growth to inflation is one? With a restricted regression:
    ls dlw-dlp c lu
whose results say that cointegration still holds:

in fact, the residuals are autocorrelated (i.e. persistent)

[Figure: Residual, Actual and Fitted, 1965-1995]

but stationary:
Durbin-Watson statistic 1.535165
ADF Test Statistic -4.790140

and the estimated long run cointegrated relationship is:
    dlw – dlp = 0.134 – 0.0563 lu
note that the estimated parameter for the unemployment rate (in logs) is very similar to the previous one.


→ genr ecm = resids

[2nd step] short term dynamics among I(0) variables. The starting point is a dynamic model up to 4th order (3rd because the short run dynamics is in differences). Given that residuals are white noise, we test for dropping some lags: the F deletion test for all –2 and –4 lags is not rejected, with a p-value = 89.6%. The restricted model is reported below:

Dependent Variable: D(DLW)
Method: Least Squares
Sample(adjusted): 1963 1999
Included observations: 37 after adjusting endpoints

Variable      Coefficient   Std. Error   t-Statistic   Prob.
C              0.005225     0.003472      1.504584     0.1429
D(DLP)         0.835032     0.156380      5.339758     0.0000
D(DLP(-1))    -0.010759     0.224236     -0.047979     0.9621
D(DLW(-1))    -0.214205     0.171202     -1.251178     0.2205
D(LU)         -0.175406     0.043813     -4.003544     0.0004
D(LU(-1))     -0.068000     0.046539     -1.461134     0.1544
ECM(-1)       -0.848525     0.232109     -3.655714     0.0010

R-squared            0.762838    Mean dependent var      -0.002961
Adjusted R-squared   0.715405    S.D. dependent var       0.032476
S.E. of regression   0.017325    Akaike info criterion   -5.104643
Sum squared resid    0.009005    Schwarz criterion       -4.799875
Log likelihood       101.4359    F-statistic              16.08260
Durbin-Watson stat   1.915861    Prob(F-statistic)        0.000000

Residual checks do not show relevant problems. The restriction to zero of the non significant parameters is not rejected, with p-value = 29.5%. The restricted model presents very stable recursive estimates (results not reported). The retained uniequational model is:

$\Delta dlw_t = 0.003 + 0.76\,\Delta dlp_t - 0.16\,\Delta lu_t - 0.86\,ecm_{t-1} + \hat\varepsilon_t$


6. MODELLING SYSTEMS

Problems with the single equation approach

Fundamentally, there are problems when:
i. not all the right hand variables in the cointegration vector are weakly exogenous (loss of information);
ii. there is more than one cointegrating vector (when the number of variables of interest > 2).

In the previous long run Phillips curve, we were interested in analysing the long run relationship between real wage growth and (logs of) the unemployment rate; point ii. was therefore not a problem, since between two I(1) variables there can be at most one cointegration relationship.

Instead, point i. is still a potential problem, given that in modelling the ECM equation for wages we did not model the determinants of the other variables that enter on the right side of that equation (inflation and unemployment rates). This fact does not lead to a loss of information (long run estimator inefficiency) only if the cointegration relationship does not enter the other two equations for short run inflation and unemployment: this is the definition of weak exogeneity.

In addition, both the 2nd step Engle-Granger short run estimator and the ARDL model are conditional models, since simultaneous explanatory variables also appear as regressors. The potential advantages of not modelling additional variables and of reducing the number of short run explanatory variables in the equation require that the conditioning variables are weakly exogenous.


Weak exogeneity issue

Though a better test for weak exogeneity should be implemented within the system (Johansen) cointegration framework, a first check can be accomplished by the following steps with reference to the previous Phillips curve analysis.

Eviews/phil/system phillips contains 3 reduced form equations, where only I(0) variables are included:
    D(DLW) = C(11) + C(12)*D(DLW(-1)) + C(13)*D(DLP(-1)) + C(14)*D(LU(-1)) + C(15)*ECM(-1)
    D(DLP) = C(21) + C(22)*D(DLW(-1)) + C(23)*D(DLP(-1)) + C(24)*D(LU(-1)) + C(25)*ECM(-1)
    D(LU) = C(31) + C(32)*D(DLW(-1)) + C(33)*D(DLP(-1)) + C(34)*D(LU(-1)) + C(35)*ECM(-1)

System SUR estimation is required in order to test for the null:

c(25)=0, c(35)=0    F = 4.77 [9.2%]

DLP and LU levels contribute to the definition of the Phillips curve long run equilibrium (hidden inside the ECM term), but do not adjust towards that equilibrium.

→ D(DLP) and D(LU) equations do not contain information about the long run parameters, since the cointegration relationship does not enter into these equations (Eviews/phil/system: philrestr).

Note that in the first equation, the restricted SUR estimate of the loading parameter is quite similar to the one we reported above (see Engle-Granger 2nd step).

If you add D(DLP) and D(LU) to the first equation, you will reproduce the OLS results (think of the orthogonality condition in OLS). Eviews/phil/system: philcond.
→ It is valid to condition on DLP and LU and follow the uniequational approach because they are weakly exogenous.


Cointegration rank issue

When we operate in a multivariate I(1) framework (n > 2 variables of interest) we can have up to (n-1) cointegrating relationships (their number is the cointegration rank, r), and the single equation approach can lead to serious troubles if there are multiple cointegrating vectors.

Example: suppose that the “true” model has r = 2:

$\Delta R6_t = c + \alpha_1 (R6_{t-1} - R3_{t-1}) + \alpha_2 (R6_{t-1} - R12_{t-1}) + \varepsilon_t$

$= c + \alpha_1 R6_{t-1} - \alpha_1 R3_{t-1} + \alpha_2 R6_{t-1} - \alpha_2 R12_{t-1} + \varepsilon_t$

$= c + (\alpha_1+\alpha_2) \left[ R6_{t-1} - \frac{\alpha_1}{\alpha_1+\alpha_2} R3_{t-1} - \frac{\alpha_2}{\alpha_1+\alpha_2} R12_{t-1} \right] + \varepsilon_t$

with $\alpha = \alpha_1+\alpha_2$, $\beta_1 = \alpha_1/(\alpha_1+\alpha_2)$ and $\beta_2 = \alpha_2/(\alpha_1+\alpha_2)$.

In single equation dynamic modelling we estimate as long run elasticities the parameters (β1, β2), which in the “true” model are instead mixtures of the cointegrating parameters (1, -1) and (1, -1), weighted by the loading factors (α1, α2). In fact, from Eviews/termine/ we have that:

ARDL:
    ls drbot6 c drbot6(-1 to -2) drbot3(0 to -2) drbot12(0 to -2) rbot3(-1) rbot6(-1) rbot12(-1)

E-G 1st step:
    ls rbot6 c rbot3 rbot12

The ARDL cointegration approach is in: equation/ardl

The Engle-Granger 1st step cointegration is in: equation/eg1step

It is worth noting that:

cointegration still persists

long run parameters are unidentified (mixed)
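The mixing of the two cointegrating vectors in the single equation can be made concrete with a few lines of Python. The loading values below are hypothetical, chosen only for illustration, and the function name is an assumption.

```python
def single_equation_parameters(alpha1, alpha2):
    """Loading and 'long run' coefficients that a single equation picks up
    when the true model has the two spreads (R6 - R3) and (R6 - R12)."""
    alpha = alpha1 + alpha2
    beta_r3 = alpha1 / alpha     # weight on R3
    beta_r12 = alpha2 / alpha    # weight on R12
    return alpha, beta_r3, beta_r12

# Hypothetical loadings
for a1, a2 in [(-0.3, -0.1), (-0.1, -0.3), (-0.2, -0.2)]:
    alpha, b3, b12 = single_equation_parameters(a1, a2)
    print(f"alpha1={a1}, alpha2={a2} -> alpha={alpha:.2f}, beta1={b3:.2f}, beta2={b12:.2f}")
```

The two estimated coefficients always sum to one, but their individual values depend only on the relative size of the loadings: they carry no information on the separate cointegrating vectors (1, -1).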


Why is the cointegration rank < n?

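Generalisation to the case of n variables of the common trend analysis (vector approach):

[5] xt = µt + vt with µt ~ I(1) and vt ~ I(0)

$\begin{pmatrix} x_{1t} \\ x_{2t} \\ \vdots \\ x_{nt} \end{pmatrix} = \begin{pmatrix} \mu_{1t} \\ \mu_{2t} \\ \vdots \\ \mu_{nt} \end{pmatrix} + \begin{pmatrix} v_{1t} \\ v_{2t} \\ \vdots \\ v_{nt} \end{pmatrix}$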

If there is one cointegration relationship (r = 1), there exists an (n×1) vector β such that:

[6] β′µt = β1µ1t + β2µ2t + ... + βnµnt = 0

and premultiplying [5] by β′ and substituting [6] we have:

β′xt = β′µt + β′vt = β′vt

→ the linear combination β′xt is I(0) because a combination of I(0) variables is always I(0).

With one cointegration relationship, one of the n trends µit can be expressed as a linear combination of the other trends, e.g. (if we normalise on β1) from [6] we have:

µ1t = -β2/β1 µ2t - ... - βn/β1 µnt

In general, there can be multiple linear relationships among the trends, and the number of these relationships is the COINTEGRATING RANK r. In this case, there are r < n linear relations such that

β′µt = 0

where β is an (n×r) matrix and 0 is an (r×1) vector. Hence:

β′xt = β′vt

→ r relationships (combinations) among the n variables of interest are stationary.


If r = n, then an n×n full rank matrix B would exist such that B′µt = 0; then B′xt = B′vt → xt = (B′)⁻¹B′vt = vt, but this result is impossible since xt ~ I(1) and vt ~ I(0) by definition.

The non stationarity “accounting”:

number of variables in xt                     n  -
number of cointegrating relationships         r  =
number of common stochastic trends         (n-r)

There is no point in imposing n unit roots on the variables of the system (∆xt), since r unit roots “simplify” (evaporate) thanks to cointegration among the n variables in xt.

PUTTING ALL PREVIOUS THINGS TOGETHER

Single equation dynamic modelling (e.g. Pesaran et al.) and the 2-step (Engle-Granger) approach are both valid (i.e. we can use them without loss of information) ONLY IF two conditions are satisfied:

• cointegration rank r = 1;
• weak exogeneity of the explanatory (forcing) variables for the long run parameters of interest.

Otherwise, we must follow the procedure proposed by Søren Johansen within the framework of the Vector AutoRegressive (VAR) model, which achieves both:

• testing for the cointegration rank;
• imposing identifying restrictions on the reduced rank regressions.


Johansen’s approach is based on an unrestricted vectorautoregressive approach (UVAR). Let’s start with a quitegeneral VAR(2) model with n I(1) variables of interest andstandard :

[7] Xt = c0 + c1 t + A1 Xt-1 + A2 Xt-2 + εt

that is the multivariate analogous of previous AR(2) model:xt = c + β t + α1 xt-1 + α2 xt-2 + εt

that (do remember the basics of the ADF test model) can bereparametrised as:∆xt = c + β t + π xt-1 + γ1 ∆xt-1 + εt

where: π = α1+α2-1; and γ1 = -α2

In the same way, VAR model in [7] can be reparametrised:

[7′] ∆Xt = c0 + c1 t + Π Xt-1 + Γ1 ∆Xt-1 + εt

where a particularly relevant role is played by the n×nmatrix Π = A1 + A2 – I, defined as the long run multipliermatrix. In [7′], changes of each variable in X are predictedby a linear combination of past values of all variables,provided that rank(Π) ≠ 0.In the ADF case the main point was testing for unit rootsunder the null π = 0 → presence of a unit root →stationarity can be achieved only by taking the firstdifferences of the xt variable.

The first step of Johansen’s cointegration approach is to test for rank(Π). If the null H0: rank(Π) = 0 is not rejected, then the system (VAR) becomes stationary only by imposing n unit roots on the n variables in the Xt vector. On the other hand, we noted above that rank(Π) = n is impossible, since the n variables are I(1). Hence, by definition, the cointegrating rank is 0 ≤ r < n.


The matrix Π is a reduced rank matrix, and can be decomposed as Π = αβ′, where α and β are n×r matrices. The r linear combinations are such that β′Xt ~ I(0). From equation [7′] the (reduced form) vector error correction model (VEC) is obtained by substituting the Π matrix with αβ′:

[8] ∆Xt = c0 + c1 t + αβ′ Xt-1 + Γ1 ∆Xt-1 + εt
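As a small numerical illustration of the reduced rank property for n = 2 and r = 1, the sketch below builds Π = αβ′ from the loadings and the normalised cointegrating vector of the unrestricted DLWP/LU VEC reported in these notes. Leaving the constant out of β is a simplification of ours, and the helper name `outer` is hypothetical.

```python
def outer(alpha, beta):
    """Pi = alpha * beta' for an n x 1 loading vector and an n x 1 cointegrating vector."""
    return [[a * b for b in beta] for a in alpha]


alpha = [-0.828431, -0.201773]   # CointEq1 loadings in the D(DLWP) and D(LU) equations
beta = [1.0, 0.050439]           # normalised cointegrating vector (DLWP, LU)

Pi = outer(alpha, beta)

# A non-zero 2 x 2 matrix with zero determinant has rank 1
det = Pi[0][0] * Pi[1][1] - Pi[0][1] * Pi[1][0]
print(Pi, det)
```

The second column of Π is the first column scaled by 0.050439, so Π contains only one independent linear combination of the levels.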

As far as the deterministic components of the VAR are concerned, Eviews (like many other packages) allows for 5 different cases:

1) no intercepts, no trends: c0=c1=0 (unlikely to be relevant);
2) restricted intercepts, no trends: c0=Πµ0, c1=0 (non trended variables);
3) unrestricted intercepts, no trends: c0≠0, c1=0 (for unit roots models with drifts);
4) unrestricted intercepts, restricted trends: c0≠0, c1=Πµ1 (linear deterministic trends in the data);
5) unrestricted intercepts, unrestricted trends: c0≠0, c1≠0 (quadratic deterministic trends in the data).

The null hypothesis of the trace test is Hr: rank(Π) = r against the alternative hypothesis of (trend)-stationarity: Hn: rank(Π) = n (full rank). The trace statistic is also a log-likelihood ratio statistic, and the appropriate critical values for all the five cases are reported by Eviews.

To determine the number of cointegrating relations r, subject to the assumptions made about the trends in the series, we can proceed sequentially from r = 0 to r = n-1 until we fail to reject. The first row in the upper table tests the hypothesis of no cointegration, the second row tests the hypothesis of one cointegrating relation, the third row tests the hypothesis of two cointegrating relations, and so on, all against the alternative hypothesis of full rank, i.e. all series in the VAR are stationary.
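The sequential procedure can be sketched in a few lines. The function name `select_rank` is our own; the statistics and 5% critical values are those of the two EViews 3 rank tests reproduced in these notes (the DLWP/LU system and the three Treasury bill rates).

```python
def select_rank(trace_stats, critical_values):
    """Return the first r whose null rank(Pi) = r is not rejected.

    trace_stats[r] and critical_values[r] refer to the null of rank r,
    ordered r = 0, 1, ..., n-1. If every null is rejected, return n.
    """
    for r, (stat, cv) in enumerate(zip(trace_stats, critical_values)):
        if stat <= cv:
            return r
    return len(trace_stats)


# DLWP / LU system, 5% critical values
r_phil = select_rank([15.51374, 0.952944], [15.41, 3.76])

# RBOT3 / RBOT6 / RBOT12 system, 5% critical values
r_rates = select_rank([86.26466, 26.03375, 1.580939], [34.91, 19.96, 9.24])

print(r_phil, r_rates)
```

The two results agree with the "L.R. test indicates ..." lines printed by EViews for the same tables.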

After the value of r is estimated, the second step is to identify β. When r = 1, there are no problems: the (single) normalisation restriction on the parameter of what economic theory suggests is the dependent variable yields a unique estimate up to a scaling parameter. However, when r > 1, the problem of identification arises. The appropriate procedure would be to estimate the cointegrating relationships subject to a priori restrictions from economic theory.

Suppose there are r cointegrating relations and β is an n×r matrix; then we need at least r restrictions (including the normalisation restriction) on each of the r cointegrating relationships. The exact identification of all the cointegrating parameters requires r×r restrictions.
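A simple counting sketch: restrictions imposed beyond the r×r needed for exact identification are over-identifying, and they give the degrees of freedom of the LR test. The rule as coded here (each restriction on β or on the loadings α counts once) is an assumption of ours; it reproduces the degrees of freedom of the Chi-square statistics in the EViews 4 outputs reproduced in these notes. The function name is hypothetical.

```python
def lr_degrees_of_freedom(n_beta_restrictions, n_alpha_restrictions, r):
    """Over-identifying restrictions: all restrictions imposed minus the r*r
    needed for exact identification of the cointegrating vectors."""
    return n_beta_restrictions + n_alpha_restrictions - r * r


# DLWP / LU system, r = 1: B(1,1)=1 and A(2,1)=0
df_phil = lr_degrees_of_freedom(1, 1, r=1)

# Treasury bill rates, r = 2: six restrictions on beta
df_rates_long = lr_degrees_of_freedom(6, 0, r=2)
# the same plus A(3,1)=0, A(3,2)=0 (weak exogeneity of the 12-month rate)
df_rates_we = lr_degrees_of_freedom(6, 2, r=2)

# Six-variable macro system, r = 4: 23 restrictions on beta
df_macro_long = lr_degrees_of_freedom(23, 0, r=4)
# the same plus 15 zero restrictions on the loadings
df_macro_load = lr_degrees_of_freedom(23, 15, r=4)

print(df_phil, df_rates_long, df_rates_we, df_macro_long, df_macro_load)
```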

Remember: the source of these identifying restrictions is usually a priori theory. The role of theory in providing these restrictions is discussed in Pesaran (1997).

Unfortunately, Eviews 3.1 is very rough in approaching the issue; from version 4 onwards it has been considerably improved.

General practical advice

The order p of the VAR often plays a crucial role in the subsequent analyses, and particular attention must be devoted to obtaining vector white noise residuals. On the other hand, once a sufficiently long order is selected, the remaining observations must still be enough for asymptotic theory to work reasonably well → difficult balancing act.

It is also important to remember that the lag specification that EViews prompts you to enter in the VAR refers to lags of the first difference terms in the VEC. For example, 1 1 specifies a model involving a regression of the first differences on one lag of the first difference.

The VAR approach is highly data intensive, particularly if n is large: when n = 4 and p = 5, each equation of the VAR contains 20 unknown parameters (plus possible deterministic components).
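The two counting rules above (parameters per equation, and lags of the differences in the VEC implied by a VAR(p) in levels) can be written down as a minimal sketch. The function names are our own, and deterministic components are left out of the count.

```python
def var_params_per_equation(n, p):
    """Slope coefficients in each equation of a VAR(p) with n variables
    (deterministic components not counted)."""
    return n * p


def vec_difference_lags(p):
    """Lags of the first differences in the VEC implied by a VAR(p) in levels."""
    return p - 1


per_equation = var_params_per_equation(n=4, p=5)   # the case quoted in the text
whole_system = 4 * per_equation                    # all four equations together
lags_in_vec = vec_difference_lags(2)               # VAR(2) in [7] -> one lagged difference in [7']

print(per_equation, whole_system, lags_in_vec)
```

The last value corresponds to the EViews lag interval "1 1" used for the VEC of a VAR(2).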

The five cointegrating VAR cases presume that the variables are I(0) or I(1), and that the nature of the trends in the Xt variables is ascertained (by plotting; no econometric theory is involved).

Sometimes, the trace test outcome is overridden by a priori information from the long run predictions of a suitable economic model (a sensitivity analysis to the choice of r is also important).

Eviews/phil/open group: dlwp lu /view/cointegration test/“model 3”/

Series: DLWP LU
Lags interval: 1 to 1

              Likelihood   5 Percent        1 Percent        Hypothesized
Eigenvalue    Ratio        Critical Value   Critical Value   No. of CE(s)

0.325332      15.51374     15.41            20.04            None *
0.025426      0.952944     3.76             6.65             At most 1

*(**) denotes rejection of the hypothesis at 5%(1%) significance level
L.R. test indicates 1 cointegrating equation(s) at 5% significance level

Unnormalized Cointegrating Coefficients:

DLWP          LU
12.29458      0.620128
-2.007333     -0.481619

Normalized Cointegrating Coefficients: 1 Cointegrating Equation(s)

DLWP          LU           C
1.000000      0.050439     -0.122536
              (0.00732)
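The normalised coefficients are simply the unnormalised ones divided by the coefficient of the variable chosen as numeraire. A minimal sketch (the function name `normalise` is our own), using the first unnormalised vector of this system and the first unnormalised vector of the Treasury bill system reported in these notes:

```python
def normalise(vector, on=0):
    """Divide a cointegrating vector by the coefficient chosen as numeraire."""
    pivot = vector[on]
    return [coef / pivot for coef in vector]


# First unnormalised vector of the DLWP / LU system (DLWP, LU)
phil = normalise([12.29458, 0.620128])

# First unnormalised vector of the Treasury bill system (RBOT3, RBOT6, RBOT12, C)
rates = normalise([-22.66285, 26.88986, -4.529554, 0.042690])

print([round(c, 6) for c in phil])
print([round(c, 6) for c in rates])
```

Up to rounding of the printed inputs, the results match the "Normalized Cointegrating Coefficients" rows of the two EViews outputs.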

It is worth noting that Johansen’s approach provides long run estimates quite similar to those of the Engle-Granger 1st step approach since, as we previously noted, the latter meets both the requirements to avoid the main specification and estimation problems.

Eviews: procs/make a vector autoregression/1 1/VEC:

Sample(adjusted): 1963 1999
Included observations: 37 after adjusting endpoints
Standard errors & t-statistics in parentheses

Cointegrating Eq:     CointEq1

DLWP(-1)              1.000000

LU(-1)                0.050439
                      (0.00732)
                      (6.88642)

C                     -0.122536

Error Correction:     D(DLWP)      D(LU)

CointEq1              -0.828431    -0.201773
                      (0.24889)    (1.03614)
                      (-3.32854)   (-0.19474)

D(DLWP(-1))           -0.014596    -0.978637
                      (0.18953)    (0.78904)
                      (-0.07701)   (-1.24029)

D(LU(-1))             -0.072898    0.149233
                      (0.04059)    (0.16899)
                      (-1.79586)   (0.88310)

C                     0.000246     0.027836
                      (0.00360)    (0.01497)
                      (0.06849)    (1.85958)

R-squared             0.368937     0.143114
Adj. R-squared        0.311568     0.065216
S.E. equation         0.020244     0.084276
Log likelihood        93.91271     41.14123
Mean dependent        -0.002026    0.034860
S.D. dependent        0.024398     0.087166

Determinant Residual Covariance    1.71E-06
Log Likelihood                     140.6207


The next steps are: a weak exogeneity test together with both long run parameter estimation and short term dynamics modelling; then, residual tests (var_we). You must use version 4 of Eviews.

Vector Error Correction Estimates
Sample(adjusted): 1963 1999
Included obs: 37 after adj endpoints
Standard errors in ( ) & t-statistics in [ ]

Cointegration Restrictions: B(1,1)=1,A(2,1)=0
Convergence achieved after 3 iterations.
Restrictions identify all cointegrating vectors
LR test for binding restrictions (r = 1):
Chi-square(1) 0.040194
Probability 0.841102

Cointegrating Eq:     CointEq1

DLWP(-1)              1.000000

LU(-1)                0.050082
                      (0.00768)
                      [ 6.52274]

C                     -0.121878

Error Correction:     D(DLWP)      D(LU)

CointEq1              -0.851481    0.000000
                      (0.21370)    (0.00000)
                      [-3.98449]   [ NA ]

D(DLWP(-1))           -0.014914    -0.984184
                      (0.18923)    (0.78829)
                      [-0.07881]   [-1.24850]

D(LU(-1))             -0.073151    0.149749
                      (0.04059)    (0.16907)
                      [-1.80235]   [ 0.88572]

C                     0.000254     0.027807
                      (0.00359)    (0.01497)
                      [ 0.07055]   [ 1.85741]

R-squared             0.369650     0.143011
Adj. R-squared        0.312345     0.065102
Sum sq. resids        0.013508     0.234409
S.E. equation         0.020232     0.084281
F-statistic           6.450618     1.835631
Log likelihood        93.93361     41.13899
Akaike AIC            -4.861276    -2.007513
Schwarz SC            -4.687123    -1.833360
Mean dependent        -0.002026    0.034860
S.D. dependent        0.024398     0.087166


VEC Res Ser Correl LM Tests
H0: no serial cor at lag order h
Included observations: 37

Lags    LM-Stat     Prob

1       3.035956    0.5518
2       7.999531    0.0916
3       3.388368    0.4951

Probs from chi-square (4 df.)
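The reported probabilities are upper tail areas of a chi-square distribution. For 1 degree of freedom and for even degrees of freedom they have a closed form, so they can be checked with the standard library alone. The sketch below (function name `chi2_sf` is our own, and other odd degrees of freedom are deliberately not handled) recomputes the LM test probabilities and two LR test probabilities reported in these notes.

```python
import math


def chi2_sf(x, df):
    """P(chi-square(df) > x) in closed form; only df = 1 or even df is handled."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df % 2 == 0:
        half = x / 2.0
        term, total = 1.0, 1.0
        # sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    raise ValueError("only df = 1 or even df are supported in this sketch")


# Serial correlation LM tests at lags 1, 2, 3 (4 df each)
lm_probs = [chi2_sf(stat, 4) for stat in (3.035956, 7.999531, 3.388368)]

# LR tests for binding restrictions: DLWP/LU system (1 df), Treasury bill system (2 df)
p_phil = chi2_sf(0.040194, 1)
p_rates = chi2_sf(1.926163, 2)

print([round(p, 4) for p in lm_probs], round(p_phil, 6), round(p_rates, 6))
```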

VEC Residual Normality Tests
H0: residuals are multivariate normal
Included observations: 37

Component    Skewness     Chi-sq      df    Prob.

1            0.612554     2.313868    1     0.1282
2            -0.908903    5.094309    1     0.0240

Joint                     7.408177    2     0.0246

Component    Kurtosis     Chi-sq      df    Prob.

1            3.517597     0.413023    1     0.5204
2            3.805923     1.001330    1     0.3170

Joint                     1.414353    2     0.4930

Component    Ja-Bera      df    Prob.

1            2.726891     2     0.2558
2            6.095639     2     0.0475

Joint        8.822530     4     0.0657
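The three panels are linked: each Jarque-Bera statistic is the sum of the skewness and kurtosis chi-square terms of the same component. The sketch below recomputes them from the printed skewness and kurtosis with the usual formulas T·S²/6 and T·(K-3)²/24; that EViews uses exactly these formulas here is an assumption of ours, supported only by the fact that the numbers agree up to rounding. The function name is hypothetical.

```python
def normality_components(skewness, kurtosis, n_obs):
    """Chi-square(1) terms for skewness and kurtosis, and their sum
    (the Jarque-Bera statistic, chi-square with 2 df)."""
    skew_chi2 = n_obs * skewness ** 2 / 6.0
    kurt_chi2 = n_obs * (kurtosis - 3.0) ** 2 / 24.0
    return skew_chi2, kurt_chi2, skew_chi2 + kurt_chi2


comp1 = normality_components(0.612554, 3.517597, 37)
comp2 = normality_components(-0.908903, 3.805923, 37)

# The joint statistics are the sums across the two components
joint_jb = comp1[2] + comp2[2]

print([round(v, 4) for v in comp1], [round(v, 4) for v in comp2], round(joint_jb, 4))
```

The rejection of joint normality at the 5% level is driven by the skewness of the second component (the LU equation).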

VEC Residual Heteroskedasticity Tests: Includes Cross Terms
Included observations: 37

Joint test:

Chi-sq      df    Prob.
20.47036    27    0.8104

Individual components:

Dependent    R-squared    F(9,27)     Prob.     Chi-sq(9)    Prob.

res1*res1    0.092625     0.306241    0.9662    3.427126     0.9449
res2*res2    0.292728     1.241647    0.3122    10.83092     0.2875
res2*res1    0.211538     0.804877    0.6155    7.826916     0.5517
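For the individual components, both test statistics follow from the R-squared of the auxiliary regression: the Chi-sq(9) column is T·R² and the F(9,27) column is the usual F test for the joint significance of the 9 regressors. A minimal sketch (function name is our own; the number of regressors, 9, is read off the table header):

```python
def white_test_stats(r_squared, n_obs, n_regressors):
    """LM statistic n*R^2 and F statistic of an auxiliary regression
    with n_regressors slopes plus an intercept."""
    lm = n_obs * r_squared
    df_resid = n_obs - n_regressors - 1
    f_stat = (r_squared / n_regressors) / ((1.0 - r_squared) / df_resid)
    return lm, f_stat, df_resid


lm1, f1, df1 = white_test_stats(0.092625, 37, 9)   # res1*res1
lm2, f2, _ = white_test_stats(0.292728, 37, 9)     # res2*res2

print(round(lm1, 4), round(f1, 4), df1, round(lm2, 4), round(f2, 4))
```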

→ Previous single-equation results are strongly confirmed


Another example can be drawn from Italian Treasury bill interest rates.

Eviews/termine/select rbot3 rbot6 rbot12/double click/open group/view/coint./ 1 1/case 2

Sample: 1979:01 1998:09
Included observations: 223
Test assumption: No deterministic trend in the data
Series: RBOT3 RBOT6 RBOT12
Lags interval: 1 to 1

              Likelihood   5 Percent        1 Percent        Hypothesized
Eigenvalue    Ratio        Critical Value   Critical Value   No. of CE(s)

0.236692      86.26466     34.91            41.07            None **
0.103856      26.03375     19.96            24.60            At most 1 **
0.007064      1.580939     9.24             12.97            At most 2

*(**) denotes rejection of the hypothesis at 5%(1%) significance level
L.R. test indicates 2 cointegrating equation(s) at 5% significance level

Unnormalized Cointegrating Coefficients:

RBOT3        RBOT6        RBOT12       C
-22.66285    26.88986     -4.529554    0.042690
1.086596     16.60378     -17.67262    -0.017085
1.133854     -1.632210    0.407011     -0.055662

Normalized Cointegrating Coefficients: 1 Cointegrating Equation(s)

RBOT3        RBOT6        RBOT12       C
1.000000     -1.186517    0.199867     -0.001884
             (0.09561)    (0.09712)    (0.00123)

Log likelihood 2850.842

Normalized Cointegrating Coefficients: 2 Cointegrating Equation(s)

RBOT3        RBOT6        RBOT12       C
1.000000     0.000000     -0.986434    -0.002881
                          (0.02471)    (0.00326)
0.000000     1.000000     -0.999818    -0.000840
                          (0.01949)    (0.00257)

Log likelihood 2863.069

As before, in what follows we have to switch to Eviews 4 in order to deepen the multivariate cointegration analysis.

Step (a): imposing long-run overidentification restrictions

Vector Error Correction Estimates
Sample(adjusted): 1980:03 1998:09
Included observations: 223 after adjusting endpoints
Standard errors in ( ) & t-statistics in [ ]

Cointegration Restrictions: B(1,1)=1,B(1,2)=0,B(2,1)=0,B(2,2)=1,B(1,3)=-1,B(2,3)=-1
Convergence achieved after 2 iterations.
Restrictions identify all cointegrating vectors
LR test for binding restrictions (rank = 2):
Chi-square(2) 1.926163
Probability 0.381715

Cointegrating Eq:    CointEq1      CointEq2

RBOT3(-1)            1.000000      0.000000

RBOT6(-1)            0.000000      1.000000

RBOT12(-1)           -1.000000     -1.000000

C                    -0.001166     -0.000818
                     (0.00094)     (0.00075)
                     [-1.23619]    [-1.09389]

Error Correction:    D(RBOT3)      D(RBOT6)      D(RBOT12)

CointEq1             -0.528178     0.025085      -0.041694
                     (0.14886)     (0.13285)     (0.11349)
                     [-3.54818]    [ 0.18882]    [-0.36737]

CointEq2             0.369898      -0.193926     0.091514
                     (0.20690)     (0.18465)     (0.15775)
                     [ 1.78778]    [-1.05022]    [ 0.58013]

Step (b): Weak exogeneity test for 12-month TB rate:

Vector Error Correction Estimates
Sample(adjusted): 1980:03 1998:09
Included observations: 223 after adjusting endpoints
Standard errors in ( ) & t-statistics in [ ]

Cointegration Restrictions: B(1,1)=1,B(1,2)=0,B(2,1)=0,B(2,2)=1,B(1,3)=-1, B(2,3)=-1,A(3,1)=0,A(3,2)=0
Convergence achieved after 2 iterations.
LR test for binding restrictions (rank = 2):
Chi-square(4) 2.286812
Probability 0.683171

Cointegrating Eq:    CointEq1      CointEq2

RBOT3(-1)            1.000000      0.000000

RBOT6(-1)            0.000000      1.000000

RBOT12(-1)           -1.000000     -1.000000

C                    -0.001085     -0.000729
                     (0.00095)     (0.00075)
                     [-1.14796]    [-0.97187]

Error Correction:    D(RBOT3)      D(RBOT6)      D(RBOT12)

CointEq1             -0.485189     0.070152      0.000000
                     (0.09290)     (0.05318)     (0.00000)
                     [-5.22280]    [ 1.31917]    [ NA ]

CointEq2             0.275717      -0.292604     0.000000
                     (0.12920)     (0.07396)     (0.00000)
                     [ 2.13405]    [-3.95632]    [ NA ]

D(RBOT3(-1))         0.132715      0.176900      0.028579
                     (0.13405)     (0.11962)     (0.10222)
                     [ 0.99002]    [ 1.47880]    [ 0.27958]

D(RBOT6(-1))         -0.168470     -0.155026     0.015922
                     (0.20015)     (0.17860)     (0.15262)
                     [-0.84174]    [-0.86799]    [ 0.10432]

D(RBOT12(-1))        0.022412      0.109891      0.060543
                     (0.20417)     (0.18220)     (0.15569)
                     [ 0.10977]    [ 0.60315]    [ 0.38886]

Exercise: Cointegration and forward (ft) spot (st) D-Mark/US $ exchange rate (Zivot, 2000)

Economic theory (in pills ...)

The Forward Rate Unbiasedness Hypothesis (FRUH) is based on the rational expectations and risk neutrality hypotheses, and defines:

(1) Rational expectations forecast error:
    ut+1 = st+1 - Et(st+1) such that: Et(ut+1) = 0

(2) Risk neutrality of the forward rate:
    ft = Et(st+1)

Together, (1) + (2) lead to the relationships:

level:      st+1 = ft + ut+1
difference: ∆st+1 = (ft - st) + ut+1

where: ft - st is the forward premium

→ The FRUH hypothesis can be tested in both (level and difference) models.
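The two relationships follow from (1) and (2) in two lines; the alignment comments are ours.

```latex
\begin{aligned}
s_{t+1} &= E_t(s_{t+1}) + u_{t+1} && \text{from (1)} \\
        &= f_t + u_{t+1}           && \text{using (2): level form} \\
s_{t+1} - s_t &= (f_t - s_t) + u_{t+1} && \text{subtracting } s_t \text{ from both sides} \\
\Delta s_{t+1} &= (f_t - s_t) + u_{t+1} && \text{difference form}
\end{aligned}
```

In the level form the FRUH implies a unit coefficient on ft; in the difference form it implies a unit coefficient on the forward premium.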

Variables of interest (sf_data.wf1)

ldmus     DM-US$ spot exchange rate (logs)
ldmusf    DM-US$ forward (3-m) exchange rate (logs)

SF_data.wf1 Eviews quarterly database (period: 86.1-00.4)

Empirical analysis

spot, forward and premium plots
unit root tests (testing down from pmax=6)

ldmus       ADF(5) = -2.699
ldmusf      ADF(4) = -1.824
ldmus_fp    ADF(0) = -6.085**

→ Cointegrated VAR approach, from UVAR(1) [lags 1 1]; diagnostic checks on residuals are OK, then pass to rank test [lags 0 1] → r = 1

Estimate the VEqCM (with restricted long-run):

Vector Error Correction Estimates
Sample(adjusted): 1986:2 2000:4
Included observations: 59 after adjusting endpoints
Standard errors in ( ) & t-statistics in [ ]

Cointegration Restrictions: B(1,1)=1,B(1,2)=-1
Restrictions identify all cointegrating vectors
LR test for binding restrictions (rank = 1):
Chi-square(1) 3.054771
Probability 0.080500

Cointegrating Eq:    CointEq1

LDMUSF(-1)           1.000000

LDMUS(-1)            -1.000000

Error Correction:    D(LDMUSF)     D(LDMUS)

CointEq1             0.088322      0.826023
                     (0.42940)     (0.37079)
                     [ 0.20569]    [ 2.22773]

R-squared            -0.000313     0.078012
Sum sq. resids       0.229573      0.171185
S.E. equation        0.062914      0.054327
Log likelihood       79.98022      88.63774
Mean dependent       -0.002014     -0.001662
S.D. dependent       0.062904      0.056579


In addition, the theory would predict that the forward rate is weakly exogenous, and that the loading parameter in the spot rate equation is 1. In order to check these restrictions, pass to the system SYS_DMUS_FS:

D(LDMUSF) = C(1)*( LDMUSF(-1) - 1*LDMUS(-1) )
D(LDMUS) = C(2)*( LDMUSF(-1) - 1*LDMUS(-1) )

obtain the FIML estimate, and perform the Wald test of:

C(1)=0, C(2)=1

The results are reproduced below:

System: SYS_DMUS_FS
Estimation Method: Full Information Maximum Likelihood (Marquardt)
Sample: 1986:2 2000:4
Included observations: 59
Total system (balanced) observations 118
Convergence achieved after 1 iteration

        Coefficient    Std. Error    z-Statistic    Prob.

C(1)    0.088327       0.448442      0.196963       0.8439
C(2)    0.826024       0.372911      2.215073       0.0268

Log Likelihood 245.2717
Determinant residual covariance 8.40E-07

Equation: D(LDMUSF) = C(1)*( LDMUSF(-1) - 1*LDMUS(-1) )
Observations: 59
R-squared             -0.000313    Mean dependent var    -0.002014
Adjusted R-squared    -0.000313    S.D. dependent var    0.062904
S.E. of regression    0.062914     Sum squared resid     0.229573
Durbin-Watson stat    1.991218

Equation: D(LDMUS) = C(2)*( LDMUSF(-1) - 1*LDMUS(-1) )
Observations: 59
R-squared             0.078012     Mean dependent var    -0.001662
Adjusted R-squared    0.078012     S.D. dependent var    0.056579
S.E. of regression    0.054327     Sum squared resid     0.171185
Durbin-Watson stat    1.959340

Wald Test:
System: SYS_DMUS_FS

Null Hypothesis:    C(1)=0
                    C(2)=1

Chi-square 4.134558    Probability 0.126530


The final (empirical) model is:

∆ft = εft → ft is a random walk
∆st = ft-1 - st-1 + εst → st = ft-1 + εst

It is clear that our empirical model is in line with the FRUH approach, where εft is the forward shock, and εst = st - ft-1 is the realised profit/loss from speculation. Note that both errors are white noise (unpredictable) processes.
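The algebra of the final model can be checked by iterating the two equations on a short path. The shocks and starting values below are arbitrary illustrative numbers of ours (not estimated residuals), and the function name `simulate` is hypothetical.

```python
def simulate(f0, s0, forward_shocks, spot_shocks):
    """Iterate the final model: df_t = e_ft and ds_t = (f_{t-1} - s_{t-1}) + e_st."""
    f, s = [f0], [s0]
    for e_f, e_s in zip(forward_shocks, spot_shocks):
        f_prev, s_prev = f[-1], s[-1]
        f.append(f_prev + e_f)                       # forward rate: random walk
        s.append(s_prev + (f_prev - s_prev) + e_s)   # spot rate: full error correction
    return f, s


# Arbitrary illustrative shocks (log points)
e_f = [0.02, -0.01, 0.03, 0.00]
e_s = [0.01, 0.02, -0.02, 0.01]
f, s = simulate(0.50, 0.48, e_f, e_s)

# Realised profit/loss from speculation: s_t - f_{t-1}
profit = [s[t] - f[t - 1] for t in range(1, len(s))]
print(profit)
```

With a loading of exactly 1, the lagged spot rate cancels out and the realised profit/loss coincides with the spot shock in every period.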

MODELLING A SMALL MACROECONOMIC SYSTEM: FROM VAR TO SEM (Bagliano-Golinelli-Morana, 2003)

The theoretical model

Following Gerlach-Svensson (2001) we nest in the inflation equation both the Phillips curve and the price gap effects (Hallman-Porter-Small, 1991):

πt = πet + αy (qrt-1) + αm [pt-1 - p*t-1] + επt

where:

πet ≡ πt-1 (rw model of expectations)

qrt ≡ yt - y*t (output gap)

y*t = βy0 + y*t-1 + εyt (rw model of potential output)

mt-pt = βm0 + βm1 yt + βm2 (lt - st) + εmt (real money demand)

p*t ≡ mt - [βm0 + βm1 y*t + βm2 (l*t - s*t) ] (P-star)

The long run solution of the model:


π = πe

y = y*

l* = β0f + π

l* = β0s + s*

m*-p* = β0 + βm1 y*    where: β0 ≡ βm0 + βm2 β0s

The structural model long run solution predicts the following stochastic properties of the variables:

πe and y* are I(1) → π , y ~ I(1) ; qr ~ I(0)

l and π are CI(1,1); l and s are CI(1,1) → l , s ~ I(1) ; (l-s) ~ I(0) ; (l-π) ~ I(0)

m-p and y are CI(1,1) → m-p ~ I(1) ; [(m-p)- y] ~ I(1)

The outcomes of the univariate Dickey-Fuller (DF) integration tests confirm the predictions of the structural model long run solution: see Bagliano-Golinelli-Morana (2002, Table 1).

Univariate analysis

..... output omitted ....

→ The structural model predicts a stochastic behaviour of the variables that is coherent with univariate DF test outcomes.

Multivariate analysis

Johansen (1995) cointegrated VARs approach applied to the vector of the endogenous variables of the system:

xt = [(m-p)t yt lt st πt qrt ]

Note: from univariate analysis, the variables m-p, y, l, s, and π are I(1), qr is I(0)

The UVAR(3) model is chosen on the basis of both the AIC criterion and residual diagnostic test results that support the third order dynamics.

The corresponding Johansen’s rank test results are:

Eviews/BaGoMo_CUP/BGM_UVAR/view/coint.test/3

Sample: 1981:4 1997:3
Included observations: 64
Trend assumption: Linear deterministic trend
Series: MP S Y L DP YGAP
Lags interval (in first differences): 1 to 2

Unrestricted Cointegration Rank Test

Hypothesized                   Trace        5 Percent    1 Percent
No. of CE(s)     Eigenvalue    Statistic    CV           CV

None **          0.410572      106.9796     94.15        103.18
At most 1 *      0.324775      73.14907     68.52        76.07
At most 2 *      0.284830      48.01566     47.21        54.46
At most 3        0.225796      26.56064     29.68        35.65
At most 4        0.113139      10.18178     15.41        20.04
At most 5        0.038272      2.497522     3.76         6.65

*(**) denotes rejection of the hypothesis at the 5%(1%) level

→ rank = 4 (at 10% significance level) ....


... because it is in line with the theoretical model outlined above. Then, we can suppose the following four long run relationships:

• a simple money demand function;
• relation between the inflation rate and the long term interest rate (Fisher parity);
• relation between the short and the long term interest rates (term structure of interest rates);
• output gap stationarity

Are empirical realisations in line with such an interpretation?

a) start with the unrestricted VAR (UVAR), the Haavelmo distribution

b) impose the cointegration rank restrictions, and

c) test for the long run over-identifying restrictions (CVAR)

Eviews/BaGoMo_CUP_V4/BGM_CVARlong/

Vector Error Correction Estimates
Sample: 1981:4 1997:3
Included observations: 64
Standard errors in ( ) & t-statistics in [ ]

Cointegration Restrictions:
B(1,1)=1,B(1,3)=0,B(1,4)=0,B(1,5)=0,B(1,6)=0,
B(2,1)=0,B(2,2)=0,B(2,3)=1,B(2,4)=-1,B(2,5)=0,B(2,6)=0,
B(3,1)=0,B(3,2)=0,B(3,3)=0,B(3,4)=1,B(3,5)=-1,B(3,6)=0,
B(4,1)=0,B(4,2)=0,B(4,3)=0,B(4,4)=0,B(4,5)=0,B(4,6)=1
Convergence achieved after 6 iterations.
Restrictions identify all cointegrating vectors
LR test for binding restrictions (rank = 4):
Chi-square(7) 15.48266
Probability 0.030287

Cointegrating Eq:    CointEq1      CointEq2      CointEq3      CointEq4

MP(-1)               1.000000      0.000000      0.000000      0.000000

Y(-1)                -1.663368     0.000000      0.000000      0.000000
                     (0.01312)
                     [-126.746]

S(-1)                0.000000      1.000000      0.000000      0.000000

L(-1)                0.000000      -1.000000     1.000000      0.000000

DP(-1)               0.000000      0.000000      -1.000000     0.000000

YGAP(-1)             0.000000      0.000000      0.000000      1.000000

C                    12.16091      0.007432      -0.051891     0.007849

Error Correction:    D(MP)         D(Y)          D(S)          D(L)          D(DP)         D(YGAP)

CointEq1             -0.157779     0.090382      -0.087051     0.043514      0.397329      -0.100884
                     (0.05669)     (0.09040)     (0.06425)     (0.05339)     (0.14666)     (0.09198)
                     [-2.78299]    [ 0.99976]    [-1.35493]    [ 0.81500]    [ 2.70920]    [-1.09675]

CointEq2             0.083287      -0.211621     0.007442      0.136887      -0.305398     -0.086949
                     (0.07453)     (0.11885)     (0.08446)     (0.07019)     (0.19281)     (0.12093)
                     [ 1.11743]    [-1.78054]    [ 0.08811]    [ 1.95015]    [-1.58394]    [-0.71900]

CointEq3             -0.111188     -0.166579     -0.113507     -0.127045     0.577208      -0.130594
                     (0.07612)     (0.12138)     (0.08626)     (0.07169)     (0.19691)     (0.12350)
                     [-1.46067]    [-1.37234]    [-1.31583]    [-1.77221]    [ 2.93128]    [-1.05740]

CointEq4             -0.071663     0.111143      -0.001656     0.027848      0.297839      -0.181116
                     (0.04907)     (0.07825)     (0.05561)     (0.04621)     (0.12694)     (0.07962)
                     [-1.46036]    [ 1.42037]    [-0.02979]    [ 0.60260]    [ 2.34628]    [-2.27483]

(... short run components omitted ...)

d) test for the long run over-identifying restrictions (CVAR) plus a number of restrictions on loading parameters

Eviews/BaGoMo_CUP/var/BGM_CVARload/

Vector Error Correction Estimates
Sample: 1981:4 1997:3
Included observations: 64
Standard errors in ( ) & t-statistics in [ ]

Cointegration Restrictions:
(the same as above)
B(1,1)=1,B(1,3)=0,B(1,4)=0,B(1,5)=0,B(1,6)=0,
B(2,1)=0,B(2,2)=0,B(2,3)=1,B(2,4)=-1,B(2,5)=0,B(2,6)=0,
B(3,1)=0,B(3,2)=0,B(3,3)=0,B(3,4)=1,B(3,5)=-1,B(3,6)=0,
B(4,1)=0,B(4,2)=0,B(4,3)=0,B(4,4)=0,B(4,5)=0,B(4,6)=1,
(additional restrictions)
A(2,1)=0,A(3,1)=0,A(4,1)=0,A(6,1)=0,
A(1,2)=0,A(3,2)=0,A(5,2)=0,A(6,2)=0,
A(1,3)=0,A(2,3)=0,A(3,3)=0,A(6,3)=0,
A(1,4)=0,A(2,4)=0,A(4,4)=0
Convergence achieved after 8 iterations.
Restrictions identify all cointegrating vectors
LR test for binding restrictions (rank = 4):
Chi-square(22) 38.00057
Probability 0.018319

Cointegrating Eq:    CointEq1      CointEq2      CointEq3      CointEq4

MP(-1)               1.000000      0.000000      0.000000      0.000000

Y(-1)                -1.645856     0.000000      0.000000      0.000000
                     (0.02185)

S(-1)                0.000000      1.000000      0.000000      0.000000

L(-1)                0.000000      -1.000000     1.000000      0.000000

DP(-1)               0.000000      0.000000      -1.000000     0.000000

YGAP(-1)             0.000000      0.000000      0.000000      1.000000

C                    12.03693      0.007432      -0.051891     0.007849

Error Correction:    D(MP)         D(Y)          D(S)          D(L)          D(DP)         D(YGAP)

CointEq1             -0.072375     0.000000      0.000000      0.000000      0.390804      0.000000
                     (0.03119)     (0.00000)     (0.00000)     (0.00000)     (0.11450)     (0.00000)
                     [-2.32083]    [ NA ]        [ NA ]        [ NA ]        [ 3.41305]    [ NA ]

CointEq2             0.000000      -0.203311     0.000000      0.169642      0.000000      0.000000
                     (0.00000)     (0.11461)     (0.00000)     (0.06102)     (0.00000)     (0.00000)
                     [ NA ]        [-1.77394]    [ NA ]        [ 2.78021]    [ NA ]        [ NA ]

CointEq3             0.000000      0.000000      0.000000      -0.125265     0.524819      0.000000
                     (0.00000)     (0.00000)     (0.00000)     (0.05799)     (0.16397)     (0.00000)
                     [ NA ]        [ NA ]        [ NA ]        [-2.16003]    [ 3.20077]    [ NA ]

CointEq4             0.000000      0.000000      0.058235      0.000000      0.274028      -0.123909
                     (0.00000)     (0.00000)     (0.02894)     (0.00000)     (0.09958)     (0.04424)
                     [ NA ]        [ NA ]        [ 2.01196]    [ NA ]        [ 2.75175]    [-2.80081]

Short run estimates (omitted) tell us that it is possible to further model the system. Such further restrictions on short run parameters can be imposed on the model without changing (re-estimating) the long run (cointegration) estimates.

Main results from multivariate analysis:


• the elasticity of money demand to income is significantly bigger than one
  → confirms a usual result in the literature, sometimes explained by the omission of wealth

• the long-run effect of inflation on the interest rate is the one suggested by the Fisher equation

• positive deviations from the long run relationship between real money and GDP cause upward pressures on inflation and output, and the equilibrium correcting effect on real money

• as far as interest rate - inflation disequilibria are concerned, they significantly affect only the level of the interest rate

• increases in the capacity utilisation rate have positive effects on inflation (Phillips curve effect), and feed back on their own levels

→ both P-star and Phillips curve effects are significant explanations of short run inflation behaviour (as suggested by the Gerlach-Svensson paper)
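The first result in the list (income elasticity bigger than one) can be checked from the reported estimates with a simple t-ratio against the null of a unit elasticity. Treating this ratio as approximately standard normal is an assumption of ours, and the function name is hypothetical; the estimates and standard errors are those of the Y(-1) coefficient in the two restricted VEC outputs above (the standard errors are of a very different order from the printed t-ratios against zero, which test a different null).

```python
def t_against(estimate, std_error, null_value):
    """t-ratio for H0: parameter = null_value."""
    return (estimate - null_value) / std_error


# Y(-1) enters the money demand relation with coefficient -1.663368 (s.e. 0.01312),
# i.e. a long run income elasticity of 1.663368
t_long = t_against(1.663368, 0.01312, 1.0)

# Same relation after the restrictions on the loadings: 1.645856 (s.e. 0.02185)
t_load = t_against(1.645856, 0.02185, 1.0)

print(round(t_long, 2), round(t_load, 2))
```

Both ratios are far above any conventional critical value, in line with the statement that the elasticity is significantly bigger than one.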

The Structural Econometric Model (SEM) is the outcome of a further modelling phase. Here, integration “problems” are solved by the imposition of cointegrated combinations, then the procedure of modelling from general to specific is based on statistics that have standard distributions.


SOME GUIDELINES FOR THE PREPARATION OF APPLIED ECONOMETRICS PROJECTS
(from M. Hashem Pesaran Lectures)

1. Introduction. This section should identify the issues and state clearly the purpose of the analysis. It should also give a brief account of the literature together with specific references.

2. Theoretical considerations. This section should describe the economic model (or the relationship) to be analysed, and usually contains a brief review of the existing theory and related evidence, the econometric specification, and the a priori information concerning the parameters of the economic model (i.e. their signs, range of variation, or their most likely values).

3. Data sources and descriptions. This section should discuss the data used in the study and give their sources. In particular, attention should be paid to the relationship between the theoretical concepts in the economic model (see section 2) and the available data.

4. Econometric considerations. This section should describe the econometric methods, i.e. the estimation method, the inference procedure, diagnostic checks, etc.

5. Empirical results. This section should report the results, comment on their statistical and economic significance and suggest ways in which the results may be improved and extended.

6. Conclusions. This section should give a very brief account of the main findings of the research.

7. Bibliography. A complete list of the references cited.


READING LIST

Introductory readings

Kennedy P. (2003), A guide to econometrics, 5th ed., Blackwell

Cuthbertson K., Hall S.G., Taylor M.P. (1992), Applied econometric techniques, Philip Allan

Granger C.W.J. (1999), Empirical Modelling in Economics, Cambridge University Press

Stock J.H., Watson M.W. (2003), Introduction to Econometrics, Addison Wesley, part four

Classical time series analysis

Box G.E.P., Jenkins G.M., Reinsel G.C. (1994), Time series analysis: forecasting and control, 3rd edition, Prentice Hall, ch. 1-9

Enders W. (1995), Applied econometric time series, Wiley, ch. 2

Granger C.W.J., Newbold P. (1986), Forecasting economic time series, Academic Press, ch. 1-3

Maddala G.S. (1992), Introduction to econometrics, 2nd edition, Macmillan, ch. 13

Mills T.C. (1990), Time series techniques for economists, Cambridge University Press, ch. 5-8

Mills T.C. (1993), The econometric modelling of financial time series, Cambridge University Press, ch. 2

Pindyck R.S., Rubinfeld D.L. (1991), Econometric models and economic forecasts, 3rd edition, McGraw-Hill, ch. 15-19

Trends and unit roots

Campbell J.Y., Perron P. (1991), Pitfalls and opportunities: What macroeconomists should know about unit roots, in Blanchard O.J., Fischer S. (eds.), NBER Macroeconomics Annual 1991, MIT Press

Cochrane J. (1991), Comment, in Campbell and Perron, cit.

Elliott G., Rothenberg T.J., Stock J.H. (1996), Efficient tests for an autoregressive unit root, Econometrica, vol. 64(4)

Enders W. (1995), cit., Wiley, ch. 3-4

Hall A. (1994), Testing for a unit root in time series with pretest data-based model selection, Journal of Business and Economic Statistics, n. 12

Maddala G.S. (1992), cit., ch. 6.10

Mills T.C. (1990), cit., ch. 11.2

Mills T.C. (1993), cit., ch. 3.1

Ng S., Perron P. (1995), Unit root test in ARIMA models with data dependent methods for the selection of the truncation lag, Journal of the American Statistical Association, n. 90

Ng S., Perron P. (2001), Lag length selection and the construction of unit root tests with good size and power, Econometrica, vol. 69(6)

Perron P. (1997), Further evidence on breaking trend functions in macroeconomic variables, Journal of Econometrics, vol. 80

Stock J.H. (1994), Unit roots, structural breaks and trends, in Engle R.F., McFadden D.L. (eds.), Handbook of econometrics, vol. 4, ch. 47

Spurious regressions

Granger C.W.J., Newbold P. (1974), Spurious regressions in econometrics, Journal of Econometrics, n. 2

Granger C.W.J., Newbold P. (1986), cit., ch. 6.4

Phillips P.C.B. (1986), Understanding spurious regression in econometrics, Journal of Econometrics, vol. 33

Sims C., Stock J., Watson M. (1990), Inference in linear time series models with some unit roots, Econometrica, vol. 58

Dynamic specification

Hendry D.F., Pagan A.R., Sargan J.D. (1984), Dynamic specification, in Griliches Z., Intriligator M.D. (eds.), Handbook of Econometrics, vol. II, North Holland

Cointegration analysis

Bagliano F., Golinelli R., Morana C. (2003), Inflation modelling in the euro area, Cambridge Univ. Press, (Golinelli web page)

Dickey D.A., Rossana R.J. (1994), Cointegrated time series: a guide to estimation and hypothesis testing, Oxford Bulletin of Economics and Statistics, vol. 56, n. 3

Enders W. (1995), cit., Wiley, ch. 6

Gerlach S., Svensson L.E.O. (2000), Money and inflation in the Euro area: a case for monetary indicators?, NBER Working Papers

Granger C.W.J., Swanson N. (1996), Future developments in the study of cointegrated variables, Oxford Bulletin of Economics and Statistics, vol. 58

Gregory A.W., Hansen B.E. (1996), Residual-based tests for cointegration in models with regime shifts, Journal of Econometrics, vol. 70

Gregory A.W., Hansen B.E. (1996b), Tests for cointegration in models with regime and trend shifts, Oxford Bulletin of Economics and Statistics, vol. 58

Hakkio C.S., Rush M. (1991), Cointegration: how short is the long run?, Journal of International Money and Finance, vol. 10

Hall A.D., Anderson H.M., Granger C.W.J. (1992), A cointegration analysis of treasury bill yields, The Review of Economics and Statistics, vol. 74

Hallman J.J., Porter R.D., Small D.H. (1991), Is the price level tied to the M2 monetary aggregate in the long run?, American Economic Review, vol. 81, pp. 841-858

Harris R. (1995), Cointegration analysis in econometric modelling, Prentice Hall

Johansen S. (1997), Mathematical and statistical modelling of cointegration, EUI Working Paper, ECO No. 97/14

Maddala G.S. (1992), cit., ch. 14

Mills T.C. (1993), cit., ch. 6.1-6.6

Mills T.C. (1998), Recent developments in modelling nonstationary vector autoregressions, Journal of Economic Surveys, vol. 12, n. 3

Pesaran M.H. (1997), The role of economic theory in modelling the long run, Economic Journal, vol. 107

Pesaran M.H., Shin Y., Smith R.J. (2001), Bounds approaches to the analysis of level relationships, special issue of the Journal of Applied Econometrics in honour of J.D. Sargan on the theme “Studies in Empirical Macroeconometrics”, Hendry D.F., Pesaran M.H. (eds.), forthcoming

Stock J.H., Watson M.W. (2001), Vector Autoregressions, Journal of Economic Perspectives, vol. 15, n. 4, Fall, pp. 101-115

Watson M.W. (1994), Vector autoregression and cointegration, in Engle R.F., McFadden D.L. (eds.), Handbook of econometrics, vol. 4, ch. 47

Zivot E. (2000), Cointegration and forward and spot exchange rate regressions, Journal of International Money and Finance, vol. 19, pp. 785-812

Textbooks

Banerjee A., Dolado J.J., Galbraith J.W., Hendry D.F. (1993), Cointegration, error correction and the econometric analysis of nonstationary data, Oxford University Press

Clements M.P., Hendry D.F. (1999), Forecasting Non-Stationary Economic Time Series, The MIT Press

Hamilton J. (1994), Time series analysis, Princeton Univ. Press

Hendry D.F. (1994), Dynamic econometrics, Oxford Univ. Press

Johansen S. (1995), Likelihood-based inference in cointegrated vector autoregressive models, Oxford University Press

Applied textbooks

Favero C. (2000), Applied macroeconometrics, Oxford Univ. Press

Lütkepohl H., Krätzig M. (eds.) (2004), Applied Time Series Econometrics, Cambridge University Press

Patterson K. (2000), An introduction to applied econometrics: a time series approach, Macmillan Press

Rao B.B. (ed.) (1994), Cointegration for the applied economist, St. Martins Press

Vogelvang B. (2004), Econometrics: Theory and Applications with E-Views, FT Prentice Hall


Very useful: you can not miss ...

Lucchetti Jack R. (2001), Appunti di analisi delle serie storiche, downloadable from my home page.

Mosconi R. (2000), Malcolm for Rats software.

ACKNOWLEDGEMENTS

Many thanks are due to Luigi Bidoia, Maria Elena Bontempi, Michele Burattoni, Juri Marcucci, and to the students of CIDE, Prometeia and SDIC courses for their comments and suggestions. The usual caveats apply.