
I(1) and I(2) Cointegration Analysis

Theory and Applications

Heino Bohn Nielsen

Ph.D. Thesis

Institute of Economics

University of Copenhagen

October 2003

Contents

Introduction and Summary

1 Likelihood Ratio Testing for Cointegration Ranks in I(2) Models

2 Analyzing I(2) Systems by Transformed Vector Autoregressions

3 Cointegration Analysis in the Presence of Outliers

4 An I(2) Cointegration Analysis of Price and Quantity Formation in Danish Manufactured Exports

5 Inflation Adjustment in the Open Economy: An I(2) Analysis of UK Prices

6 Has US Monetary Policy Followed the Taylor Rule? A Cointegration Analysis 1988-2002

Introduction and Summary

This thesis, entitled I(1) and I(2) Cointegration Analysis: Theory and Applica-

tions, consists of six article manuscripts, which broadly reflect the research ar-

eas I have been working with during my Ph.D. studies. The manuscripts are

self-contained and can be read independently, and the connecting thread through the

manuscripts is the use of the cointegrated vector autoregressive model as the statistical

framework.

The first three manuscripts are theoretical and deal with issues related to the vector

autoregressive model with I(1) and I(2) restrictions, namely: likelihood ratio testing for

the number of I(0), I(1) and I(2) relations in a vector process in Chapter 1; how to

make simple inference on the I(2) cointegrating parameters based on a minimal I(2)-to-

I(1) transformation in Chapter 2; and how to conduct cointegration inference in the I(1)

model in the presence of outliers in Chapter 3. The last three manuscripts are empirical

applications analyzing, respectively, price and quantity formation in Danish exports in

Chapter 4; the impact of import price shocks on UK inflation in Chapter 5; and monetary

policy rules for the US in Chapter 6. The remainder of this introduction contains brief

summaries of the manuscripts.

Chapter 1, Likelihood Ratio Testing for Cointegration Ranks in I(2) Models,

is written jointly with Anders Rahbek. A central point in empirical applications of the

p-dimensional I(2) vector autoregressive model is the determination of the so-called cointegration ranks (r, s), i.e. the number of stationary relations, r, the number of I(1) trends,

s, and the number of I(2) trends, p−r−s. The paper considers the likelihood ratio test for the number of cointegrating and multi-cointegrating relations. We derive the asymptotic

distribution of the likelihood ratio test for the (multi-) cointegration ranks, and show that

the limiting distribution is identical to the limiting distribution of the much applied test

statistic from the Two-Step estimation procedure. However, in finite samples the test

statistics differ; and based on two existing empirical examples and corresponding Monte

Carlo simulations, we conclude that the likelihood ratio test is clearly preferable to the

Two-Step rank test.

Chapter 2, Analyzing I(2) Systems by Transformed Vector Autoregressions, is

a joint paper with Hans Christian Kongsted. Most empirical applications considering I(2)

behavior transform the original I(2) process to an I(1) process comprising real magnitudes.

The transformation requires a homogeneity restriction to apply to the original data, which

is equivalent to knowing the loadings to the I(2) trends up to a normalization. This


paper assumes that the necessary homogeneity restrictions are valid, and characterizes

the relationship between the parameters of the original I(2) vector autoregressive model

and the parameters in the I(1) model for the transformed process.

It is shown that the parameters of the transformed model are subject to certain re-

strictions; one set that fits directly into the structure of an I(1) reduced rank regression

analysis, and a second set of restrictions that requires a more elaborate estimation algo-

rithm and is commonly ignored in applied work. An empirical example and a simulation

study indicate that only a minor loss of efficiency is incurred by ignoring the second set of

restrictions. The paper concludes that a properly transformed model provides a practical

and effective means for inference on the parameters of the I(2) model.

Chapter 3, Cointegration Analysis in the Presence of Outliers, analyzes how

inference on the cointegration rank of an I(1) vector autoregressive model can be conducted

in the presence of innovational and additive outliers. An innovational outlier is produced

by a shock to the innovation term of a data generating process and is propagated through

the autoregressive structure. An additive outlier, on the other hand, is superimposed on

the levels of the data, and its effects are independent of the autoregressive parameters.

Models with innovational dummies can be estimated with reduced rank regression, and the

usual practice in applied cointegration analysis is to include innovational dummy variables

to account for large residuals. Models with additive dummies, on the other hand, cannot

be estimated using reduced rank regression, and the paper proposes a simple algorithm

for maximum likelihood estimation of the cointegration model with additive dummies.

The paper analyzes how outliers can be modelled with dummy variables. Based on

a Monte Carlo simulation, we conclude that additive outliers distort inference on the

cointegration rank while innovational outliers are less harmful. Furthermore, unrestricted

dummies are misspecified if an outlier is additive, and the potential distortion from mis-

specified dummies is much larger than the distortions from outliers per se. These findings

question the usual practice in applied cointegration analyses. Instead, the paper suggests

an outlier detection procedure to determine the types as well as the locations of outliers

before testing for the cointegration rank. Alternatively, the focus could be on correcting

additive outliers, while the less distorting innovational outliers could be ignored. A maxi-

mum likelihood based correction works very well, but requires a non-standard estimation

algorithm. An alternative linear interpolation in the levels of the data is almost equally

effective.

Chapter 4, An I(2) Cointegration Analysis of Price and Quantity Formation in

Danish Manufactured Exports, is an application of the I(2) and I(1) cointegration

framework to a quarterly data set for the Danish export sector 1975-1996. The Danish export price, the competing price on the world market, and the production costs are

characterized as integrated of second order, I(2), but a long-run homogeneity restriction

seems to cancel the I(2)-trend allowing a transformed process to be analyzed within the

cointegrated I(1) framework.


Based on the I(2)-to-I(1) transformed process, the paper analyzes the long-run and

short-run structure of the Danish manufacturing export sector. Two long-run relations are

found and identified as a demand-relation for Danish exports and a polynomially cointe-

grated price relation. The demand relation has a price elasticity of around 3 numerically,

and there is a permanent positive effect from the German reunification on the Danish

equilibrium market share. In the price formation a large weight to foreign prices and an

effect from the rate of inflation to the steady-state markup are found. The latter effect is

interpreted as an element of caution in the price setting in an inflationary environment.

To characterize the short-run behavior a system of simultaneous equations is developed.

Chapter 5, Inflation Adjustment in the Open Economy: An I(2) Analysis of UK

Prices, is a joint paper with Christopher Bowdler, analyzing the transmission of import

price shocks to UK domestic inflation 1969-2000. The paper estimates a cointegrated vector autoregressive model for consumer prices, unit labour costs, import prices and real

consumption growth subject to I(1) and I(2) restrictions. The paper argues that a key to

understanding the inflation adjustment to import price shocks is the accommodating role of

real unit labor costs.

In the empirical analysis the price variables are found to be integrated of second order,

I(2), but a homogeneity restriction allows for a transformation from I(2) to I(1) space.

The consumption growth is found to be stationary, and a second long-run relation links

UK inflation to real import prices and real unit labor costs. An important finding is that

real unit labor costs error correct to this relation, and high real import prices reduce the

real wage, such that the pass through to domestic inflation is moderated.

To illustrate the inflation dynamics following a foreign price shock the paper applies

the structural vector autoregressive methodology to the cointegrated system. The paper

performs an impulse response analysis to illustrate that the accommodating real wages

limit the impact of import price shocks on domestic inflation. The accommodating effect

may explain why the depreciation of sterling in 1992 left inflation unchanged. In contrast,

high real import prices in 1974 increased inflation because wage accommodation effects

were weaker at that time.

Finally, Chapter 6, Has US Monetary Policy Followed the Taylor Rule? A Cointegration

Analysis 1988-2002, considers possible monetary policy rules for the US

1988-2002 within a cointegrated vector autoregressive model. This paper is written jointly with Anders Møller Christensen.

The paper reconsiders the empirical evidence concerning monetary policy rules using

the equilibrium correction structure of a cointegrated vector autoregressive model. In

particular, the paper argues that a relation involving a policy interest rate can only be

interpreted as a monetary policy rule if the policy rate equilibrium corrects to the suggested

relation. This is a testable hypothesis in the multivariate cointegration model.

Using this idea, it is rejected that US monetary policy 1988-2002 can be described by the traditional rule suggested by John B. Taylor in 1993 in a seminal paper in Carnegie-Rochester

Conference Series on Public Policy. Instead we find a stable long-term relationship

between the Federal funds rate, the unemployment rate, and the long-term interest

rate, with deviations from the long-term relation being corrected primarily via changes

in the Federal funds rate. This is taken as an indication that the Federal Open Market

Committee sets interest rates with a view to activity and to expected inflation and other

conditions available in financial markets.


Acknowledgements

I would like to take the opportunity to thank a number of people for their encouragement

and support. First of all I would like to thank my supervisor, Hans Christian Kongsted,

for excellent guidance during my three years of Ph.D. studies. His inspiration and many

comments have strongly influenced the entire thesis.

Secondly, I wish to thank my coauthors, Christopher Bowdler, Anders Møller Chris-

tensen, Hans Christian Kongsted, and Anders Rahbek, for very inspiring cooperation; and

Katarina Juselius and Dan Knudsen for comments and suggestions.

In spring 2002 I visited Nuffield College and University of Oxford. The hospitality

of these institutions and a financing grant from the Euroclear Bank and the University

of Copenhagen are gratefully acknowledged. In particular, I would like to thank Bent

Nielsen for help in organizing the visit as well as academic and personal assistance during

the stay. Furthermore, I would like to thank David Hendry, Sophocles Mavroeidis, and

John Muellbauer for discussions of various subjects.

Finally, thanks to Freja for all her love and patience, and to Katinka Coco, born April

2003, for making the last six months so much more fun.

Chapter 1

Likelihood Ratio Testing

for Cointegration Ranks in I(2) Models


Heino Bohn Nielsen

Institute of Economics,

University of Copenhagen

heino.bohn.nielsen@econ.ku.dk

Anders Rahbek

Dept. of Applied Mathematics and

Statistics, University of Copenhagen

rahbek@math.ku.dk

Abstract

This paper considers the likelihood ratio (LR) test for the number of cointegrating

and multi-cointegrating relations in the I(2) vector autoregressive (VAR) model. We

derive the asymptotic distribution of the LR test for the (multi-) cointegration ranks,

and show that this is identical to the asymptotic distribution of the much applied

test statistic from the Two-Step procedure, see Johansen (1995), Paruolo (1996), and

Rahbek, Kongsted, and Jørgensen (1999). However, conclusions with regards to the

cointegration ranks in existing empirical applications change when applying the LR

test, and based on Monte Carlo simulations we conclude that the LR test is preferable

to the Two-Step based statistic.

Keywords: Vector autoregressive model, Cointegration, I(2), Likelihood Ratio, Monte

Carlo.

JEL Classification: C32.

1 Introduction and Summary

For many OECD countries in the post-war period, the first differences of nominal variables,

e.g. inflation rates or money growth, seem to behave as unit root processes, implying

that the levels of the nominal variables are integrated of second order, I(2). Johansen

(1992) shows that a model for I(2) variables can be parameterized as a vector autoregres-

sive (VAR) model with two reduced rank restrictions imposed, see also Johansen (1995),

Paruolo (1996), Johansen (1997), Rahbek, Kongsted, and Jørgensen (1999), Paruolo and

Rahbek (1999), Paruolo (2000) and the survey by Haldrup (1998). Several applications

have appeared in the empirical literature on e.g. money demand and open economy

price determination, see inter alia Juselius (1998), Juselius (1999), Diamandis, Georgoutsos,

and Kouretas (2000), Banerjee, Cockerell, and Russell (2001), Banerjee and Russell

(2001), Fliess and MacDonald (2001), Nielsen (2002) and Kongsted (2003).

Discussions with Søren Johansen and Hans Christian Kongsted are gratefully acknowledged. Ox code for

calculating the likelihood ratio test for the (multi-) cointegration ranks in the I(2) model can be obtained

from the authors. Maximum likelihood estimation of the I(2) model has also been implemented in the new

version of CATS in RATS and in the Matlab program me2 by Pieter Omtzigt.

A central point in empirical applications of the multivariate p-dimensional I(2) model is the determination of the so-called cointegration ranks (r, s), i.e. the number of stationary

relations, r, the number of I(1) trends, s, and the number of I(2) trends, p − r − s. In this

paper we consider the likelihood ratio (LR) test for the (multi-) cointegration ranks in the

I(2) model and derive the asymptotic distribution. Existing literature applies the non-LR

test based on the sum of two sets of canonical correlations from a Two-Step estimation

procedure, see Johansen (1995), Paruolo (1996), and Rahbek, Kongsted, and Jørgensen

(1999). The Two-Step procedure is easy to implement as a sequential application of the

reduced rank regression (RRR), known from the estimation of the cointegrated I(1) model,

but it does not make use of the rich structure of the I(2) parameterization and ignores a

set of restrictions in the first step of the procedure.

Maximum likelihood (ML) algorithms for estimation of the I(2) model have been pro-

posed by Johansen (1997) and Boswijk (2000) based on different parameterizations. In

this paper we derive the asymptotic distribution of the LR test for the cointegration ranks

using results from Johansen (1997), and show that the asymptotic distribution of the LR

test is identical to the asymptotic distribution of the Two-Step test. This implies that the

asymptotic critical values published for the Two-Step test also apply to the LR test.

Based on two existing empirical examples and corresponding Monte Carlo simulations

we examine some finite sample properties of the LR test and the Two-Step rank test. We

find that the size properties of the LR test are excellent. For ranks (r, s), where r > 0

or p − r − s > 0, the Two-Step statistic is always larger than the LR statistic, which

increases rejection frequencies. In some examples we find that the difference between the

LR and the Two-Step statistic is large. Overall we conclude that the LR test seems clearly

preferable to the Two-Step rank test.

The rest of the paper is organized as follows: Section 2 presents the I(2) model and

the notation used. Section 3 presents the LR and the Two-Step test statistics, and the

asymptotic distribution of the LR test is then derived in Section 4. Section 5 illustrates

some small sample properties of the rank tests based on empirical examples and Monte

Carlo simulations.

Throughout the paper use is made of the following notation: for any $p \times r$ matrix $\alpha$ of rank $r$, $r < p$, let $\alpha_\perp$ indicate a $p \times (p-r)$ matrix whose columns form a basis of the orthogonal complement of $\mathrm{span}(\alpha)$. Hence $\alpha_\perp = 0$ if $r = p$ and $\alpha_\perp = I_p$ if $\alpha = 0$. Define also $\bar\alpha = \alpha(\alpha'\alpha)^{-1}$ and let $P_\alpha = \bar\alpha\alpha' = \alpha\bar\alpha'$ denote the orthogonal projection matrix onto $\mathrm{span}(\alpha)$. Finally, the symbols $\xrightarrow{D}$ and $\xrightarrow{P}$ are used to indicate weak convergence and convergence in probability respectively.
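The notation above can be made concrete numerically. The following is a minimal sketch, assuming NumPy and a matrix of full column rank; the helper names `orth_complement`, `bar` and `proj` are illustrative choices and do not come from the paper. An orthonormal basis is only one of many valid choices of the orthogonal complement.

```python
import numpy as np

def orth_complement(a):
    """Orthonormal p x (p - r) basis of the orthogonal complement of span(a).

    Assumes a is p x r with full column rank r < p.
    """
    p, r = a.shape
    # In a full SVD the last p - r left singular vectors are orthogonal to span(a).
    u, _, _ = np.linalg.svd(a, full_matrices=True)
    return u[:, r:]

def bar(a):
    """Return a_bar = a (a'a)^{-1}."""
    return a @ np.linalg.inv(a.T @ a)

def proj(a):
    """Orthogonal projection onto span(a), computed as a_bar a'."""
    return bar(a) @ a.T

rng = np.random.default_rng(0)
alpha = rng.standard_normal((4, 2))   # arbitrary illustrative 4 x 2 matrix
alpha_perp = orth_complement(alpha)   # 4 x 2 basis of the complement
```

With these helpers, `alpha.T @ alpha_perp` is numerically zero and `proj(alpha) + proj(alpha_perp)` reproduces the identity matrix.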

2 Model and Representation

In this section we introduce the notation used throughout and present the I(2) model as

a submodel of the vector autoregressive model.

2.1 The I(2) Model

Consider the p-dimensional vector autoregressive model of order k:

$$X_t = \Pi_1 X_{t-1} + \dots + \Pi_k X_{t-k} + \epsilon_t, \qquad t = 1, 2, \dots, T,$$

or in a parameterization convenient for I(2) analysis:

$$\Delta^2 X_t = \Pi X_{t-1} - \Gamma \Delta X_{t-1} + \sum_{i=1}^{k-2} \Psi_i \Delta^2 X_{t-i} + \epsilon_t, \qquad (1)$$

where $\Pi = \sum_{i=1}^{k}\Pi_i - I$, $\Gamma = I + \sum_{i=2}^{k}(i-1)\Pi_i$, and $\Psi_j = \sum_{i=j+2}^{k}(i-j-1)\Pi_i$. Here $\epsilon_t$ is an iid $N(0,\Omega)$ sequence with $\Omega$ positive definite, and the initial values $X_{-k+1}, \dots, X_0$ are fixed.
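The mapping from the level coefficients to the parameters in (1) can be sketched in a few lines. This is an illustrative helper, assuming NumPy; the function name `i2_parameters` and its list-based interface are hypothetical and not part of the paper or of any existing package.

```python
import numpy as np

def i2_parameters(Pi_list):
    """Map VAR level coefficients [Pi_1, ..., Pi_k] to (Pi, Gamma, [Psi_1, ..., Psi_{k-2}]).

    Implements Pi = sum_i Pi_i - I, Gamma = I + sum_{i>=2} (i - 1) Pi_i and
    Psi_j = sum_{i>=j+2} (i - j - 1) Pi_i, as defined below equation (1).
    """
    k = len(Pi_list)
    p = Pi_list[0].shape[0]
    eye = np.eye(p)
    Pi = sum(Pi_list) - eye
    # Pi_list is 0-based, so Pi_i is stored at index i - 1.
    Gamma = eye + sum((i - 1) * Pi_list[i - 1] for i in range(2, k + 1))
    Psi = [
        sum((i - j - 1) * Pi_list[i - 1] for i in range(j + 2, k + 1))
        for j in range(1, k - 1)
    ]
    return Pi, Gamma, Psi
```

For k = 2 the list of Psi matrices is empty, and for k = 3 the single short-run matrix equals Pi_3.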

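The I(2) model, H(r, s), is defined by the two reduced rank restrictions

$$\Pi = \alpha\beta' \qquad (2)$$

$$\alpha_\perp'\Gamma\beta_\perp = \xi\eta', \qquad (3)$$

where $\alpha$ and $\beta$ are $p \times r$ matrices and $\xi$ and $\eta$ are $(p-r) \times s$ matrices. Note that (3) can alternatively be written as $P_{\alpha_\perp}\Gamma P_{\beta_\perp} = \alpha_1\beta_1'$, where $\alpha_1 = \bar\alpha_\perp\xi$ and $\beta_1 = \bar\beta_\perp\eta$ are $p \times s$ matrices, and by definition $\mathrm{span}(\alpha_1) \subset \mathrm{span}(\alpha_\perp)$ and $\mathrm{span}(\beta_1) \subset \mathrm{span}(\beta_\perp)$.

We use the notation H(r) = H(r, p − r) to denote I(1) models, where $\alpha_1$ and $\beta_1$ are $p \times (p-r)$ matrices, and H(p) denotes the unrestricted VAR.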

2.2 Representation

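Under the additional assumption that the characteristic polynomial, $A(z) = I_p - \Pi_1 z - \dots - \Pi_k z^k$, has $2(p-r)-s$ roots at the point $z = 1$ and the remaining roots outside the unit circle, the process $X_t$ is I(2). Under this hypothesis, denoted $H_0(r,s)$, $X_t$ has the representation:

$$X_t = C_2\sum_{j=1}^{t}\sum_{i=1}^{j}\epsilon_i + C_1\sum_{i=1}^{t}\epsilon_i + \gamma_1 + \gamma_2 t + \tilde X_t, \qquad (4)$$

where

$$C_2 = \beta_2(\alpha_2'\Theta\beta_2)^{-1}\alpha_2', \qquad \beta'C_1 = \bar\alpha'\Gamma C_2, \qquad \beta_1'C_1 = \bar\alpha_1'(I - \Theta C_2), \qquad (5)$$

and $\tilde X_t$ is a stationary I(0) process. Here $\Theta = \Gamma\bar\beta\bar\alpha'\Gamma + I_p - \sum_{i=1}^{k-2}\Psi_i$, $\alpha_2 = (\alpha,\alpha_1)_\perp$, and $\beta_2 = (\beta,\beta_1)_\perp$. Note that $\alpha_2'\Theta\beta_2$ has full rank $p-r-s$ under $H_0(r,s)$, because the $2(p-r)-s$ unit roots correspond exactly to the reduced ranks in (2) and (3). The coefficients $\gamma_1$ and $\gamma_2$ depend on the initial conditions and satisfy $(\beta,\beta_1)'\gamma_2 = 0$, and $\beta'\gamma_1 - \delta\beta_2'\gamma_2 = 0$, where $\delta = \bar\alpha'\Gamma\bar\beta_2$. Also note that by definition $(\alpha,\alpha_1,\alpha_2)$ and $(\beta,\beta_1,\beta_2)$ are square non-singular matrices with orthogonal blocks, and that $\beta_2$ is a function of $(\beta,\beta_1)$.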

It follows from (4) and (5) that $(\beta,\beta_1)'C_2 = 0$ so that $(\beta,\beta_1)'\Delta X_t$ is stationary, whereas the $(p-r-s)$ linear combinations $\beta_2'X_t$ are integrated of order two, I(2). Moreover it turns out that also the r linear combinations

$$\beta'X_t - \delta\beta_2'\Delta X_t \qquad (6)$$

are stationary. Since the process (6) and $(\beta,\beta_1)'\Delta X_t$ are stationary, it follows that also

$$\beta'X_t - \bar\alpha'\Gamma\Delta X_t = \beta'X_t - \bar\alpha'\Gamma(P_{\beta,\beta_1} + P_{\beta_2})\Delta X_t \qquad (7)$$

is a stationary process.

3 Estimations and Rank Test Statistics

This section presents the LR test based on the ML algorithm of Johansen (1997) and the

usually applied Two-Step rank test based on the estimator of Johansen (1995).

3.1 Likelihood Ratio Test

For the likelihood analysis, Johansen (1997) suggests an alternative parameterization

based on a multi-cointegration term of the type (7):

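$$\Delta^2X_t = \alpha[\rho'\tau'X_{t-1} + \psi'\Delta X_{t-1}] + \Omega\alpha_\perp(\alpha_\perp'\Omega\alpha_\perp)^{-1}\kappa'\tau'\Delta X_{t-1} + \sum_{i=1}^{k-2}\Psi_i\Delta^2X_{t-i} + \epsilon_t, \qquad (8)$$

where $\alpha$ ($p \times r$), $\rho$ ($(r+s) \times r$), $\tau$ ($p \times (r+s)$), $\psi$ ($p \times r$), $\kappa$ ($(r+s) \times (p-r)$), $\Psi_i$ ($p \times p$), $i = 1, \dots, k-2$, and $\Omega$ ($p \times p$) are all freely varying parameters. The parameters in the previous formulation can be derived from the new parameters from the identities $\tau = (\beta,\beta_1)$, $\beta = \tau\rho$, $\psi' = -(\alpha'\Omega^{-1}\alpha)^{-1}\alpha'\Omega^{-1}\Gamma$, and $\kappa' = -\alpha_\perp'\Gamma(\bar\beta,\bar\beta_1) = -(\alpha_\perp'\Gamma\bar\beta, \xi)$, see Johansen (1997). Here we have used the projection identity

$$\alpha(\alpha'\Omega^{-1}\alpha)^{-1}\alpha'\Omega^{-1} + \Omega\alpha_\perp(\alpha_\perp'\Omega\alpha_\perp)^{-1}\alpha_\perp' = I_p.$$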
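The projection identity is easy to confirm numerically. The snippet below is a small sketch, assuming NumPy; the function name `projection_terms` and the randomly drawn inputs are purely illustrative.

```python
import numpy as np

def projection_terms(alpha, Omega):
    """Return the two terms of the projection identity for given alpha and Omega.

    Assumes alpha is p x r with full column rank and Omega is positive definite.
    """
    p, r = alpha.shape
    # Orthonormal basis of the orthogonal complement of span(alpha) via a full SVD.
    u, _, _ = np.linalg.svd(alpha, full_matrices=True)
    alpha_perp = u[:, r:]
    Omega_inv = np.linalg.inv(Omega)
    first = alpha @ np.linalg.inv(alpha.T @ Omega_inv @ alpha) @ alpha.T @ Omega_inv
    second = (Omega @ alpha_perp
              @ np.linalg.inv(alpha_perp.T @ Omega @ alpha_perp) @ alpha_perp.T)
    return first, second

rng = np.random.default_rng(1)
p, r = 4, 2
alpha = rng.standard_normal((p, r))
A = rng.standard_normal((p, p))
Omega = A @ A.T + p * np.eye(p)       # an arbitrary positive definite matrix
first, second = projection_terms(alpha, Omega)
```

The sum `first + second` equals the identity matrix up to rounding error, whichever basis is chosen for the complement.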

The LR test for H(r, s) against the unrestricted alternative, H(p), is given by

$$S^{LR}_{r,s} = -2\log Q(H(r,s) \mid H(p)) = -T\log\left|\tilde\Omega^{-1}\hat\Omega\right|, \qquad (9)$$

where $\tilde\Omega$ and $\hat\Omega$ denote the covariance matrices estimated under H(r, s) and H(p) respectively.
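Given the two estimated covariance matrices, the statistic in (9) is a difference of log-determinants. The following sketch assumes NumPy; the function name `lr_statistic` and its argument names are hypothetical, and the estimation of the covariance matrices themselves is not shown.

```python
import numpy as np

def lr_statistic(T, omega_restricted, omega_unrestricted):
    """LR statistic T * (log|Omega_restricted| - log|Omega_unrestricted|).

    omega_restricted:   residual covariance estimated under H(r, s)
    omega_unrestricted: residual covariance estimated under H(p)
    """
    # slogdet is numerically more stable than log(det(...)).
    _, logdet_r = np.linalg.slogdet(omega_restricted)
    _, logdet_u = np.linalg.slogdet(omega_unrestricted)
    return T * (logdet_r - logdet_u)
```

Because the restricted model cannot fit better than the unrestricted one, the statistic is non-negative when both matrices are maximum likelihood estimates.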

No closed form solution for the ML estimator exists, but estimates can be obtained

through an iterative algorithm that switches between two steps: For fixed τ , the param-

eters α⊥ and α can be obtained by solving an eigenvalue problem and the remaining

parameters can be found from regression. For fixed values of these parameters, τ can

be estimated by generalized least squares. Convergence to the global maximum of the

likelihood function is not guaranteed, but the value of the likelihood function increases in

each iteration. In our implementation, we use the Two-Step estimates, presented below,

as starting values for the ML iterations.

3.2 Two-Step Rank Test

The Two-Step estimator is based on the parameterization in (1). The first step imposes

the restriction (2) to obtain

$$\Delta^2X_t = \alpha\beta'X_{t-1} - \Gamma\Delta X_{t-1} + \sum_{i=1}^{k-2}\Psi_i\Delta^2X_{t-i} + \epsilon_t. \qquad (10)$$

Ignoring the second restriction (3), the parameters $\alpha$ and $\beta$ are estimated using RRR, similar to estimation in I(1) models. The test for reduced rank of $\Pi$ is the familiar trace test of H(r) in H(p):

$$Q_r = -T\sum_{i=r+1}^{p}\log(1-\lambda_i),$$

where $1 \ge \lambda_1 \ge \dots \ge \lambda_p \ge 0$ are the eigenvalues from the corresponding RRR, see Johansen (1996, Chapter 6).

The second step is estimated conditional on r and on the estimated $\alpha$ and $\beta$ from the first step. The model is decomposed into a marginal model for $\alpha_\perp'\Delta^2X_t$ and a model for $\alpha'\Delta^2X_t$ conditional on $\alpha_\perp'\Delta^2X_t$. The restriction (3) concerns the marginal equation alone:

$$\alpha_\perp'\Delta^2X_t = -\xi\eta'\bar\beta_\perp'\Delta X_{t-1} - \alpha_\perp'\Gamma\bar\beta\beta'\Delta X_{t-1} + \sum_{i=1}^{k-2}\alpha_\perp'\Psi_i\Delta^2X_{t-i} + \alpha_\perp'\epsilon_t,$$

which can be estimated with RRR. Conditional on r, the test for reduced rank, s, of $\alpha_\perp'\Gamma\beta_\perp$ can be written as

$$Q_{r,s} = -T\sum_{i=s+1}^{p-r}\log(1-\zeta_i),$$

where $1 \ge \zeta_1 \ge \dots \ge \zeta_{p-r} \ge 0$ are the second step eigenvalues. The Two-Step test for H(r, s) against the unrestricted H(p) is given by

$$S^{2S}_{r,s} = Q_r + Q_{r,s}.$$
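Once the eigenvalues from the two reduced rank regressions are available, the Two-Step statistic is a simple sum of two trace statistics. The sketch below assumes NumPy and takes the eigenvalues as given; the function names and the numbers used in the usage line are hypothetical and are not results from the paper.

```python
import numpy as np

def trace_statistic(eigenvalues, rank, T):
    """Trace statistic -T * sum_{i > rank} log(1 - lambda_i).

    The eigenvalues are sorted in decreasing order and the `rank` largest are dropped.
    """
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    return -T * np.sum(np.log(1.0 - lam[rank:]))

def two_step_statistic(first_step_eigs, second_step_eigs, r, s, T):
    """Two-Step statistic Q_r + Q_{r,s}.

    first_step_eigs:  the p eigenvalues from the first-step RRR
    second_step_eigs: the p - r eigenvalues from the second-step RRR (conditional on r)
    """
    return trace_statistic(first_step_eigs, r, T) + trace_statistic(second_step_eigs, s, T)

# Hypothetical eigenvalues for p = 3, r = 1, s = 1 and T = 100.
example = two_step_statistic([0.5, 0.2, 0.1], [0.3, 0.05], r=1, s=1, T=100)
```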

3.3 Rank Determination

Different values of the cointegration ranks define the sequence of partially nested models

illustrated in Table 1 for the trivariate case.

In the determination of the cointegration ranks in empirical applications, economic

theory could give some guidance. When a particular model is suggested from theory,

the corresponding hypothesis, H(r, s), could be tested against H(p). If little is known

a priori, however, an estimate $(\hat r, \hat s)$ of the ranks can be obtained from the data by a sequential application of the rank test. The idea is to start testing the most restricted model, H(0, 0), and in case of rejection to proceed left-to-right and top-to-bottom in Table 1. The estimator can be written as

$$(\hat r, \hat s) = \{(r,s) \mid S_{r,s} \le c_{r,s};\ S_{r',s'} > c_{r',s'} \text{ for the indices } (r' < r,\ s' \le p - r') \text{ and } (r' = r,\ s' < s)\},$$

where $S_{r,s}$ and $c_{r,s}$ denote the test statistics and critical values respectively. The estimator $(\hat r, \hat s)$ will select the correct ranks with a limiting probability $(1-\pi)$, where $\pi$ is the size of each test in the sequence, see Johansen (1995).
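The sequential procedure can be sketched as a double loop over the rank indices. This is an illustration only: the function name `select_ranks`, the dictionary-based interface, and the convention of returning (p, 0) for the unrestricted VAR when every hypothesis is rejected are assumptions made here, not part of the paper.

```python
def select_ranks(stat, crit, p):
    """Return the first (r, s) in the testing sequence that is not rejected.

    stat, crit: dicts keyed by (r, s) holding the test statistics S_{r,s}
                and the critical values c_{r,s}.
    The sequence runs over r = 0, ..., p - 1 and, within each r, over
    s = 0, ..., p - r, i.e. from the most restricted model H(0, 0) onwards.
    """
    for r in range(p):
        for s in range(p - r + 1):
            if stat[(r, s)] <= crit[(r, s)]:
                return r, s
    # Every restricted model was rejected: fall back on the unrestricted VAR, H(p).
    return p, 0
```

The first non-rejected model automatically satisfies the condition that all earlier models in the sequence were rejected, which is what the set definition of the estimator requires.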

[Table 1 around here]

4 Asymptotic Distribution of the Likelihood Ratio Test

In this section we state the asymptotic distribution of the likelihood ratio test for the coin-

tegration ranks in the I(2) model. We introduce the following notation: for two stochastic

processes Xu and Yu on the unit interval u ∈ [0, 1], define the process Xu corrected for Yu

$$X_{u|Y} = X_u - \left(\int_0^1 X_sY_s'\,ds\right)\left(\int_0^1 Y_sY_s'\,ds\right)^{-1}Y_u.$$

4.1 No Deterministic Components

Consider the hypothesis H(r, s) against the unrestricted H(p). We state the asymptotic

distribution of the LR statistic in (9).

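Theorem 1 Under $H_0(r,s)$, then as $T \to \infty$,

$$S^{LR}_{r,s} = -2\log Q(H(r,s) \mid H(p)) \xrightarrow{D} Q^\infty_r + Q^\infty_{r,s},$$

where

$$Q^\infty_r = \mathrm{tr}\left\{\int_0^1 dW_u H_u'\left(\int_0^1 H_uH_u'\,du\right)^{-1}\int_0^1 H_u\,dW_u'\right\},$$

$$Q^\infty_{r,s} = \mathrm{tr}\left\{\int_0^1 dW_{2u}W_{2u}'\left(\int_0^1 W_{2u}W_{2u}'\,du\right)^{-1}\int_0^1 W_{2u}\,dW_{2u}'\right\}.$$

Here $W_u = (W_{1u}', W_{2u}')'$ is a $(p-r)$ dimensional standard Brownian motion on the unit interval, $u \in [0,1]$, with $W_{1u}$ of dimension s and $W_{2u}$ of dimension $p-r-s$. Furthermore,

$$H_u = \left.\begin{pmatrix} W_{1u} \\ \int_0^u W_{2v}\,dv \end{pmatrix}\right|_{W_2}.$$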

The proof is given in the appendix. Note that the asymptotic distribution of the LR test is identical to the asymptotic distribution of the Two-Step rank test derived in Johansen (1995, Theorem 7). In particular, the two components, $Q^\infty_r$ and $Q^\infty_{r,s}$, are the asymptotic distributions of the statistics from the first and second step of the Two-Step rank test, $Q_r$ and $Q_{r,s}$. This implies that the critical values for the Two-Step rank test, see Johansen (1995) for the present case, can also be applied to the LR test.
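Limit distributions of this kind are usually tabulated by simulation. As a rough illustration, the sketch below approximates only the second component, the trace functional of a standard Brownian motion of dimension p − r − s, by replacing the Brownian motion with a scaled Gaussian random walk. It assumes NumPy; the function name `simulate_q_rs`, the discretization and the default settings are choices made here and do not reproduce the published critical values.

```python
import numpy as np

def simulate_q_rs(dim, n_steps=400, n_rep=2000, seed=0):
    """Simulate tr{ int dW W' (int W W' du)^{-1} int W dW' } for a dim-variate
    standard Brownian motion W, using a random walk with n_steps increments."""
    rng = np.random.default_rng(seed)
    draws = np.empty(n_rep)
    for b in range(n_rep):
        dW = rng.standard_normal((n_steps, dim)) / np.sqrt(n_steps)
        # W at the left endpoint of each increment, as in an Ito sum.
        W = np.cumsum(dW, axis=0) - dW
        A = W.T @ dW                  # approximates int_0^1 W dW'
        B = W.T @ W / n_steps         # approximates int_0^1 W W' du
        draws[b] = np.trace(A.T @ np.linalg.solve(B, A))
    return draws

# Example: approximate 95% quantile for a one-dimensional W.
q95 = np.quantile(simulate_q_rs(1, n_steps=200, n_rep=500, seed=1), 0.95)
```

The first component would additionally require the corrected process appearing in the theorem, which is not attempted in this sketch.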

4.2 Linear Trends

In empirical applications it is often important to include deterministic linear trends. Rah-

bek, Kongsted, and Jørgensen (1999) propose a specification with deterministic linear

trends in all directions of the process, including the multi-cointegrating relations (6), at

the same time avoiding quadratic trends. An important feature of this model, denoted

H∗(r, s) in the following, is that both the LR and Two-Step tests for cointegration ranks are asymptotically similar with respect to the parameters of the deterministic terms, see

also Nielsen and Rahbek (2000).

In terms of the parameterization in (8), H∗(r, s) can be represented as:

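$$\Delta^2X_t = \alpha[\rho'\tau^{*\prime}X^*_{t-1} + \psi^{*\prime}\Delta X^*_{t-1}] + \Omega\alpha_\perp(\alpha_\perp'\Omega\alpha_\perp)^{-1}\kappa'\tau^{*\prime}\Delta X^*_{t-1} + \sum_{i=1}^{k-2}\Psi_i\Delta^2X_{t-i} + \epsilon_t, \qquad (11)$$

where $\tau^* = (\tau', \tau_1')'$ ($(p+1) \times (r+s)$), $\psi^* = (\psi', \psi_1')'$ ($(p+1) \times r$), and, finally, $X^*_{t-1} = (X_{t-1}', t)'$ is $p+1$ dimensional. Imposing the assumptions in section 2.2, the model is referred to as $H^*_0(r,s)$. Under $H^*_0(r,s)$ the process $X_t$ in (11) has a representation similar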

to (4) with γ1 and γ2 being functions of initial values in addition to τ1 and ψ1, see Rahbek,

Kongsted, and Jørgensen (1999). As a result, Xt is as noted an I(2) process with linear

trends in all linear combinations of the process, including the multi-cointegrating ones.

The result in Theorem 1 is extended to the model H∗(r, s) in Theorem 2. As the

proof mimics the proof of Theorem 1, we state the result without proof, simply noting

that the asymptotic distributions of the ML estimators are identical to the asymptotic

distributions of the Two-Step estimators in Rahbek, Kongsted, and Jørgensen (1999).

Note that H∗(r, s) in terms of the parametrization in (1), suitable for the Two-Step analysis, corresponds to restricting a constant and linear regressor, $\mu_1 + \mu_2 t$, as follows:

$$\mu_2 = \alpha b_1' \quad \text{and} \quad \alpha_\perp'\mu_1 = -\alpha_\perp'\Gamma\bar\beta b_1' - \xi\eta_1', \qquad (12)$$

where b1 is 1× r and η1 is 1× s, see Rahbek, Kongsted, and Jørgensen (1999) for further

details.

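Theorem 2 Under $H^*_0(r,s)$, then as $T \to \infty$,

$$-2\log Q(H^*(r,s) \mid H^*(p)) \xrightarrow{D} Q^{*\infty}_r + Q^{*\infty}_{r,s},$$

where

$$Q^{*\infty}_r = \mathrm{tr}\left\{\int_0^1 dW_u G_{1u}'\left(\int_0^1 G_{1u}G_{1u}'\,du\right)^{-1}\int_0^1 G_{1u}\,dW_u'\right\},$$

$$Q^{*\infty}_{r,s} = \mathrm{tr}\left\{\int_0^1 dW_{2u}G_{2u}'\left(\int_0^1 G_{2u}G_{2u}'\,du\right)^{-1}\int_0^1 G_{2u}\,dW_{2u}'\right\}.$$

Here $W_u = (W_{1u}', W_{2u}')'$ is a $(p-r)$ dimensional standard Brownian motion on the unit interval, $u \in [0,1]$, with $W_{1u}$ of dimension s and $W_{2u}$ of dimension $p-r-s$. Furthermore,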

W1uR u0 W2vdv

¯¯(1,W2)

and G2u =

Critical values are given in inter alia Rahbek, Kongsted, and Jørgensen (1999) and in

Doornik (1998).

5 Finite Sample Properties

The asymptotic equivalence of the LR test and the Two-Step rank test does not hold for

finite samples in general. The relation between the test statistics is stated in the following

proposition.

Proposition 1 Consider the hypothesis H(r, s) in H(p). In finite samples,

$$S^{LR}_{r,s} \le S^{2S}_{r,s}.$$

Equality holds if (p− r − s) r = 0.

Proof: Under the alternative, H(p), estimation is identical for the ML and the Two-

Step analysis, but under the null, H(r, s), the Two-Step procedure does not maximize the

likelihood function. To see this compare (10) and (8) to obtain

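$$\begin{aligned}
\Gamma &= -\alpha\psi' - \Omega\alpha_\perp(\alpha_\perp'\Omega\alpha_\perp)^{-1}\kappa'\tau' \\
&= \alpha(\alpha'\Omega^{-1}\alpha)^{-1}\alpha'\Omega^{-1}\Gamma + \Omega\alpha_\perp(\alpha_\perp'\Omega\alpha_\perp)^{-1}\alpha_\perp'\Gamma\bar\beta\beta' + \Omega\alpha_\perp(\alpha_\perp'\Omega\alpha_\perp)^{-1}\xi\eta'\bar\beta_\perp'.
\end{aligned}$$

The first step of the Two-Step procedure ignores the restriction imposed on the last term, and instead of the $2s(p-r) - s^2$ free parameters in $\xi\eta'$ the Two-Step procedure allows for $(p-r)^2$ parameters.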

In two special cases, the Two-Step procedure maximizes the likelihood function under

H(r, s): first, in I(1) models, where s = p − r and $\alpha_\perp'\Gamma\beta_\perp$ is non-singular; and second, if r = 0, where the second step of the Two-Step procedure is conditional on α = β = 0.

Note that the magnitude of the difference depends on the number of redundant parameters

as well as the sample correlation between the terms in (10) and the redundant terms. The

number of additional parameters is largest for models on the diagonal of Table 1, where

s = 0, while the LR and the Two-Step statistics coincide for models located in the first row

and last column of Table 1. Also note that in the model H∗(r, s), the second restriction in (12) is also ignored in the first step of the Two-Step procedure, which introduces an

additional distortion.
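The parameter counts in the proof can be tabulated directly: the difference between the $(p-r)^2$ parameters allowed in the first step and the $2s(p-r) - s^2$ free parameters in $\xi\eta'$ simplifies to $(p-r-s)^2$. The short sketch below illustrates this for the trivariate case; the function name `redundant_parameters` is hypothetical.

```python
def redundant_parameters(p, r, s):
    """Number of parameters the first step allows beyond those in xi * eta'."""
    free_in_xi_eta = 2 * s * (p - r) - s ** 2
    allowed_in_first_step = (p - r) ** 2
    return allowed_in_first_step - free_in_xi_eta   # equals (p - r - s) ** 2

# Counts for the trivariate case p = 3, keyed by (r, s).
counts = {(r, s): redundant_parameters(3, r, s)
          for r in range(3) for s in range(3 - r + 1)}
```

The count is zero in the last column (s = p − r) and, for a given r, largest at s = 0. For r = 0 the count is positive although the two statistics coincide, because in that case the second step is conditional on α = β = 0, as noted in the text.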

To illustrate the difference we consider two empirical examples, namely the analysis

of Australian inflation and the markup of prices and cost taken from Banerjee, Cockerell,

and Russell (2001); and the Danish import price determination from Kongsted (2003).

In both cases we revisit the rank determination based on the original data and use the

estimated model as a data generating process (DGP) in a small Monte Carlo simulation.

5.1 Australian Inflation and the Markup on Costs

The first empirical example is the analysis of Banerjee, Cockerell, and Russell (2001),

who consider the relation between inflation and the markup of prices on costs based on

a set of quarterly Australian data covering the effective sample 1972:3-1995:2. The three-dimensional process consists of the logs of consumer prices, import prices and unit

labor costs. Furthermore, four (assumed) stationary and weakly exogenous variables are

included. Banerjee, Cockerell, and Russell (2001) analyze a second order VAR based on

the deterministic specification of Paruolo (1996) and use the Two-Step estimator. To

simplify the analysis we exclude in this paper the conditioning variables and apply the

deterministic specification of Rahbek, Kongsted, and Jørgensen (1999).

The left hand side of Table 2 reports the Two-Step rank tests. Starting from the most

restricted model H∗(0, 0) it is possible to reject all models with r = 0 at a 5% level. In

the second row, H∗(1, 0) is easily rejected whereas the model H∗(1, 1) has a p-value of 72%. This is also the preferred model in the original study.

The LR tests are reported in the right hand side of Table 2. As noted, all test

statistics in the first row and last column are identical to the Two-Step results, and the

remaining test statistics are all lower. Using the LR test there is less evidence for

rejecting H∗(1, 0). The LR test statistic is 48, corresponding to a p-value of 6%, as compared to a clearly significant Two-Step statistic of 78. For the theoretically preferred

H∗(1, 1) the difference is smaller.
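The rank determination described above can be sketched in a few lines. The sketch assumes the usual sequential rule, scanning the rows of Table 2 from r = 0 downwards and, within each row, from s = 0 to s = p − r, and stopping at the first model that is not rejected. The p-values are copied from Table 2; the function name `select_ranks` is hypothetical and introduced only for illustration.

```python
# Asymptotic p-values copied from Table 2 (p = 3); keys are (r, s).
TWO_STEP = {(0, 0): .00, (0, 1): .00, (0, 2): .02, (0, 3): .02,
            (1, 0): .00, (1, 1): .72, (1, 2): .37,
            (2, 0): .55, (2, 1): .35}
LR = {(0, 0): .00, (0, 1): .00, (0, 2): .02, (0, 3): .02,
      (1, 0): .06, (1, 1): .79, (1, 2): .37,
      (2, 0): .74, (2, 1): .35}


def select_ranks(p_values, p, level=0.05):
    """Return the first (r, s) that is not rejected at the given level.

    Assumed ordering: rows r = 0, ..., p - 1 and, within a row,
    s = 0, ..., p - r (the last entry of a row is the I(1) model H(r)).
    """
    for r in range(p):
        for s in range(p - r + 1):
            if p_values[(r, s)] >= level:
                return (r, s)
    return (p, 0)  # all restricted models rejected: unrestricted VAR


print(select_ranks(TWO_STEP, p=3))  # Two-Step rank test at the 5% level
print(select_ranks(LR, p=3))        # LR test at the 5% level
```

With the p-values of Table 2 the rule stops at H∗(1, 1) for the Two-Step test, whereas for the LR test it already stops at H∗(1, 0) at the 5% level, because the p-value of 6% is borderline; at a 10% level both tests select H∗(1, 1).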

Simulations Based on the Estimated Model. To analyze the difference, we set up a small Monte Carlo simulation. As DGP we use the model H∗(1, 1) with parameters set to the ML estimates, and generate time series, X−101, ..., X1, ..., XT, for T ranging between 50 and 1000, by replacing εt with random draws from an iid normal with the estimated

covariance matrix. The initial values X−101 and X−100 are taken from the actual data

and the first 100 observations are discarded to eliminate the importance of this choice.
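The data generation step can be illustrated as follows. This is a minimal sketch, not the code used for the simulations: it writes a second order VAR in levels, starts it from two fixed initial observations, and discards a burn-in of 100 observations. The coefficient matrices, the Cholesky factor and the initial values below are hypothetical placeholders rather than the ML estimates, and the function name `simulate_var2` is introduced only for this example.

```python
import random


def simulate_var2(a1, a2, chol, x_init, T, burn_in=100, seed=0):
    """Simulate X_t = a1 X_{t-1} + a2 X_{t-2} + e_t with e_t ~ N(0, chol chol').

    x_init holds the two initial observations; the first burn_in generated
    observations are discarded and the last T are returned.
    """
    rng = random.Random(seed)
    p = len(x_init[0])
    path = [list(x_init[0]), list(x_init[1])]
    for _ in range(burn_in + T):
        z = [rng.gauss(0.0, 1.0) for _ in range(p)]
        # chol is lower triangular, so only columns j <= i contribute
        e = [sum(chol[i][j] * z[j] for j in range(i + 1)) for i in range(p)]
        prev, prev2 = path[-1], path[-2]
        x = [sum(a1[i][j] * prev[j] + a2[i][j] * prev2[j] for j in range(p)) + e[i]
             for i in range(p)]
        path.append(x)
    return path[-T:]


# Hypothetical bivariate example (illustrative numbers only).
A1 = [[2.0, 0.0], [0.3, 0.5]]
A2 = [[-1.0, 0.0], [0.0, 0.0]]
CHOL = [[1.0, 0.0], [0.5, 1.0]]
X_INIT = [[0.0, 0.0], [0.0, 0.0]]
sample = simulate_var2(A1, A2, CHOL, X_INIT, T=75)
print(len(sample))
```

Fixing the seed makes each replication reproducible; in a Monte Carlo loop the seed would be varied across replications.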

The rejection frequencies at a nominal 5% level are reported in Table 3 for the two

tests. The rejection frequencies of H∗(r, s) against H∗(p) are identical for the Two-Step rank test and the LR test for models with r = 0, while there are big differences when testing H∗(1, 0) and the true model H∗(1, 1). For the LR test the rejection frequencies for the true model H∗(1, 1) are very close to the nominal 5% for all sample lengths. The rejection frequencies of H∗(1, 0) against H∗(3) are relatively low in small samples.

The Two-Step test is size distorted, with rejection frequencies for the true model H∗(1, 1) around 20% in small samples. At the same time the rejection frequencies for H∗(1, 0) are higher than for the LR test. For a small sample, T = 50, the rejection frequency for H∗(1, 0) is 94% compared to a rejection frequency of 32% for the LR test.
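When reading the empirical sizes it helps to keep the Monte Carlo uncertainty in mind. The sketch below uses the binomial standard error of a rejection frequency based on 5000 replications, the number stated in the notes to Table 3; the function names and the two-standard-error band are choices made for this illustration only.

```python
import math


def mc_standard_error(rate, replications):
    """Binomial standard error of an estimated rejection frequency."""
    return math.sqrt(rate * (1.0 - rate) / replications)


def within_mc_band(observed, nominal=0.05, replications=5000, z=1.96):
    """True if the observed frequency is within z standard errors of nominal."""
    return abs(observed - nominal) <= z * mc_standard_error(nominal, replications)


print(round(mc_standard_error(0.05, 5000), 4))
print(within_mc_band(0.054))  # LR test, T = 1000
print(within_mc_band(0.202))  # Two-Step test, T = 50
```

With 5000 replications the standard error under a true size of 5% is about 0.3 percentage points, so the LR frequency of 5.4% at T = 1000 is compatible with the nominal level, while the Two-Step frequency of 20.2% at T = 50 clearly is not.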

The distributions of the test statistics are reported in Figure 1 for the case T = 75.

The graphs are organized according to the nesting structure in Table 1. The lower right graph reports the results for tests of the true model, H∗(1, 1). The distribution of the LR statistics almost coincides with the asymptotic distribution, while the distribution of the

Two-Step statistics is more dispersed and displaced to the right.

The differences in rejection frequencies translate into differences in the distributions

of the ranks estimated from the sequential test procedure, (r̂, ŝ), cf. Table 4. Note that in small samples the correct ranks (r, s) = (1, 1) are rarely chosen, primarily due to high

Type II error probabilities. For this particular DGP, the proportion of correctly selected

models is higher using the Two-Step rank test than the LR test in small samples. For

longer samples the ordering changes.

[Figure 1 around here]

5.2 Danish Import Price Determination

In this section we consider the Danish import price determination from Kongsted (2003).

The data consist of import, competing, and domestic prices, as well as an interest rate. Kongsted (2003) estimates a VAR(2) for the effective sample 1975:3–1995:4. The deterministic specification includes a linear trend in all directions and an unrestricted impulse dummy for 1992:2, and we adopt the same specification. The theoretically preferred setup according to Kongsted (2003) is H∗(2, 1) with two stationary relations and one I(2) trend.

The Two-Step rank tests, also reported in Kongsted (2003), are given in the left hand

side of Table 5. It is possible to reject all models with r = 0, while H∗(1, 2) has a p-value of 9%. The first model not rejected at a 10% level is the preferred H∗(2, 1).

The LR statistics are reported in the right hand side of the table. The tests for

r = 0 are identical, but now it is marginally harder to reject H∗(1, 2). If this model is nevertheless rejected, the next potential model is H∗(2, 0) with two I(2) trends. The LR test statistic of 45 is markedly lower than the corresponding Two-Step test statistic of 75.

Simulations Based on the Estimated Model I. Again we set up a small Monte

Carlo simulation. We use as a DGP the estimated H∗(2, 1), and exclude the dummy variable both in the DGP and in the estimation model.

The rejection frequencies are reported in Table 6 for different sample lengths. Now the

small sample distributions differ for several models: H∗(1, 0), H∗(1, 1), H∗(1, 2), H∗(2, 0) and H∗(2, 1), and there are large differences in the rejection frequencies. The Two-Step rank test is heavily over-sized with small-sample rejection frequencies of the true null H∗(2, 1) above 30%. The actual size converges very slowly to the nominal 5%, and for T = 500 the rejection frequency is still twice the nominal size. The LR test has excellent

size properties, but the rejection frequencies for more restricted models indicate a low

power.

The distributions of the test statistics are reported in Figure 2. Again the distributions

of the Two-Step rank tests are more dispersed and displaced to the right. The differences

are largest where s = 0, but there are substantial differences for several models.

Simulations Based on the Estimated Model II. Finally we take the model H∗(2, 0) as the DGP to illustrate the properties if s = 0 in the DGP, such that a large distortion

appears in the Two-Step estimation of the model with correct ranks.

The rejection frequencies are reported in Table 7. Note that the true model H∗(2, 0) is almost always rejected using the Two-Step rank test, and this is the case for all relevant

sample lengths. For T = 200, corresponding to 50 years of quarterly observations, the

actual size is 66%. The LR tests, on the other hand, have good size properties, with

rejection frequencies close to the nominal size.

A Proof of Theorem 1

In this appendix we derive the asymptotic distribution of the LR test for cointegration

ranks in the I(2) model. In order to motivate the notation and ease the presentation of

the I(2) test we start by considering the well-known likelihood ratio test for cointegration

rank in the I(1) model.

Throughout we use the notation $\tilde\theta$ and $\hat\theta$ to denote the ML estimates of a parameter

θ under the null hypothesis and under the alternative respectively.

A.1 Asymptotics for the I(1) LR Test

Consider the well-known I(1) model and the rank test in this case. The present rederiva-

tion of the LR test is not based on the usual representation in terms of eigenvalues and

canonical correlations, see e.g. Johansen (1996, Chapter 11), but is instead based on a

linear regression type formulation. It is assumed here that the reader is familiar with the

well-established literature on I(1) VAR models.

For simplicity and without loss of generality consider the simplest case of the p-dimensional I(1) VAR(1) model as given by

$$\Delta X_t = \Pi X_{t-1} + \varepsilon_t, \qquad (13)$$

and the hypothesis H(r) parameterized as $\Pi = \alpha\beta'$. We want to derive the asymptotic distribution of the likelihood ratio statistic

$$-2\log Q(H(r)|H(p)) = -T\log\left|\tilde\Omega^{-1}\hat\Omega\right|,$$

where $\tilde\Omega$ and $\hat\Omega$ are the ML estimates of the covariance matrix Ω under the hypothesis H(r)

and the alternative respectively. Denote by H0(r) the model with exactly p− r unit roots

in the characteristic polynomial, A(z), with the remaining roots outside the unit circle.
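As a small numerical illustration of the likelihood ratio statistic just defined, the sketch below evaluates T(log|Ω_null| − log|Ω_alt|) from two given covariance estimates. The determinant routine and the function names are written for this example only, and the input matrices are made-up numbers, not estimates from the paper.

```python
import math


def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n = len(a)
    d = 1.0
    for k in range(n):
        piv = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[piv][k] == 0.0:
            return 0.0
        if piv != k:
            a[k], a[piv] = a[piv], a[k]
            d = -d  # a row swap flips the sign of the determinant
        d *= a[k][k]
        for i in range(k + 1, n):
            f = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= f * a[k][j]
    return d


def lr_statistic(omega_null, omega_alt, T):
    """-2 log Q = -T log|omega_null^{-1} omega_alt|."""
    return T * (math.log(det(omega_null)) - math.log(det(omega_alt)))


# Made-up residual covariance matrices under the null and the alternative.
omega_null = [[1.2, 0.3], [0.3, 0.9]]
omega_alt = [[1.0, 0.2], [0.2, 0.8]]
print(lr_statistic(omega_null, omega_alt, T=92))
```

Because the null model is nested in the alternative, the restricted residual covariance has the larger determinant, so the statistic is non-negative when computed from actual ML estimates.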

Lemma 1 Under H0(r), as T → ∞, the LR statistic −2 log Q(H(r)|H(p)) converges in distribution to

$$\mathrm{tr}\left\{\left(\int_0^1 W_u\,dW_u'\right)'\left(\int_0^1 W_uW_u'\,du\right)^{-1}\int_0^1 W_u\,dW_u'\right\}, \qquad (14)$$

where Wu is a (p−r)-dimensional standard Brownian motion on the unit interval, u ∈ [0, 1].
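The limit in Lemma 1 has no closed form, and its quantiles are usually obtained by simulation. The following is a minimal sketch for the scalar case p − r = 1 only, where the trace expression reduces to the square of the integral of W dW divided by the integral of W² du. The Brownian motion is approximated by a scaled Gaussian random walk; the number of steps, the number of draws and the function names are arbitrary choices made for this illustration.

```python
import random


def trace_limit_draw(steps, rng):
    """One draw of (int W dW)^2 / int W^2 du for a scalar Brownian motion W."""
    w = 0.0
    num = 0.0  # accumulates the sum of W_{t-1} * dW_t
    den = 0.0  # accumulates the sum of W_{t-1}^2 * du
    scale = steps ** -0.5
    for _ in range(steps):
        dw = rng.gauss(0.0, 1.0) * scale
        num += w * dw
        den += w * w / steps
        w += dw
    return num * num / den


def simulate_quantile(q, draws=2000, steps=400, seed=1):
    """Empirical q-quantile of the simulated limit distribution."""
    rng = random.Random(seed)
    sample = sorted(trace_limit_draw(steps, rng) for _ in range(draws))
    return sample[int(q * (draws - 1))]


print(simulate_quantile(0.95))
```

For p − r > 1 the same idea applies with a vector random walk and a matrix inverse in place of the scalar division; more steps and more draws give a better approximation of the tail quantiles.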

Proof: Recall that under H(r) the parameters α and β are non-identified. Identification is obtained by normalization on the known p × r matrix c such that $\beta_c = \beta(c'\beta)^{-1}$ and $\alpha_c = \alpha\beta'c$, and hence $\Pi = \alpha\beta' = \alpha_c\beta_c'$. Denote by α0, β0 and Ω0 the true parameters corresponding to the null, H0(r).

In order to derive the asymptotic distribution of −2 log Q(H(r)|H(p)) introduce the simple auxiliary null hypothesis Haux that β is known, β = β0, as given by the equation,

$$\Delta X_t = \alpha\beta_0'X_{t-1} + \varepsilon_t. \qquad (15)$$

By definition Haux ⊆ H(r) ⊆ H(p), and therefore in particular Q(H(r)|H(p)) = Q(H(r)|Haux) × Q(Haux|H(p)), which implies that

−2 log Q(H(r)|H(p)) = −2 log Q(Haux|H(p)) − [−2 log Q(Haux|H(r))].

Turn first to −2 logQ(Haux|H(r)) and introduce the (p− r)× r dimensional parameter

B = β00⊥ (β − β0) , (16)

where β is normalized by c = β0. Note that the hypothesis Haux is given by B = 0. The

asymptotic distribution of the ML estimates of α and β under H(r) and normalized by

c = β0, α and β respectively, is given in Johansen (1996) and Johansen (1997). It follows

TB = T β00⊥³β − β0

´D→ B∞ =

¶−1 Z 1

−10 α0(α

00Ω−10 α0)

−1, (17)

with Fu = β00⊥MVu, where Vu is a Brownian motion on u ∈ [0, 1] with covariance Ω0, andM = β0⊥(α00⊥β0⊥)

−1α00⊥.Under H(r), with α and β normalized on c = β0,

∆Xt = αβ0Xt−1 + t = αβ0¡β0⊥β

00⊥ + β0β

¢Xt−1 + t = α

¡B0Z1t + Z0t

¢+ t. (18)

Here Z1t = β00⊥Xt−1 and Z0t = β00Xt−1 are I(1) and I(0) processes respectively. Definethe estimated residuals

t = ∆Xt − α³B0Z1t + Z0t

´, t = ∆Xt − αZ0t and 0t = ∆Xt − αZ0t.

Then Ω = 1T

PTt=1 t

0t, and

ˆ0t00t −

¡YT + Y 0T −XT

XT = αB0Ã1

Z1tZ01t

!Bα0 and YT =

(∆Xt − αZ0t)Z01t

!Bα0.

Using (17), the consistency of α and the continuous mapping theorem, it follows that,

TXTD→ α0B

∞0µZ 1

¶B∞α00

= α0(α00Ω−10 α0)

−1α00Ω−10

¶−1 Z 1

0uΩ−10 α0(α

00Ω−10 α0)

−1α00.

Next, by convergence to stochastic integrals,

tZ01t − (α− α0)

!TBα0

D→Z 1

¶−1 Z 1

0uΩ−10 α0(α

00Ω−10 α0)

−1α00

Therefore by joint convergence, and as 1T

PTt=1 ˆ0t

P→ Ω0, Ω P→ Ω0,

− 2 logQ(Haux|H(r)) = −T log¯Ω−1Ω

¯= Ttr

©Ω−10

¡YT + Y 0T −XT

¢ª+ oP (1)

(Ω−10 α0(α

00Ω−10 α0)

−1α00Ω−10

¶−1 Z 1

)+ oP (1).

This implies directly,

−2 logQ(Haux|H(p)) = tr

(Ω−10

¶−1 Z 1

)+ oP (1).

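Hence by the joint convergence of −2 log Q(Haux|H(p)) and −2 log Q(Haux|H(r)) under H0(r), and using the skew projection, $I_p = \alpha_0(\alpha_0'\Omega_0^{-1}\alpha_0)^{-1}\alpha_0'\Omega_0^{-1} + \Omega_0\alpha_{0\perp}(\alpha_{0\perp}'\Omega_0\alpha_{0\perp})^{-1}\alpha_{0\perp}'$,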

− 2 logQ(H(r)|H(p))= −2 logQ(Haux|H(p))− [−2 logQ(Haux|H(r))]

(α0⊥

¡α00⊥Ω0α0⊥

¢−1α00⊥

¶−1 Z 1

)+ oP (1)

(µZ 1

¶0µZ 1

¶−1 Z 1

)+ oP (1),

with $W_u = (\alpha_{0\perp}'\Omega_0\alpha_{0\perp})^{-1/2}\alpha_{0\perp}'V_u$. This completes the proof of Lemma 1.
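The skew projection used in the last step of the proof can be checked numerically. The sketch below does so for p = 2 and r = 1, where the orthogonal complement of a vector (a1, a2)′ is spanned by (−a2, a1)′. The numbers are arbitrary and the function name is hypothetical; this is a sanity check of the identity, not part of the proof.

```python
def skew_projection_sum(alpha, omega):
    """Return a (a' W^{-1} a)^{-1} a' W^{-1} + W a_perp (a_perp' W a_perp)^{-1} a_perp'
    for a 2-vector alpha (a) and a 2x2 positive definite matrix omega (W)."""
    a1, a2 = alpha
    perp = (-a2, a1)  # spans the orthogonal complement of alpha
    (w11, w12), (w21, w22) = omega
    d = w11 * w22 - w12 * w21
    inv = ((w22 / d, -w12 / d), (-w21 / d, w11 / d))
    # row vector alpha' Omega^{-1} and scalar alpha' Omega^{-1} alpha
    row = (a1 * inv[0][0] + a2 * inv[1][0], a1 * inv[0][1] + a2 * inv[1][1])
    s1 = row[0] * a1 + row[1] * a2
    # column vector Omega alpha_perp and scalar alpha_perp' Omega alpha_perp
    col = (w11 * perp[0] + w12 * perp[1], w21 * perp[0] + w22 * perp[1])
    s2 = perp[0] * col[0] + perp[1] * col[1]
    return [[alpha[i] * row[j] / s1 + col[i] * perp[j] / s2 for j in range(2)]
            for i in range(2)]


total = skew_projection_sum((1.0, 0.5), [[2.0, 1.0], [1.0, 3.0]])
print(total)  # numerically the 2 x 2 identity matrix
```

The two terms are complementary oblique projections: the first projects onto the span of α0, the second onto the span of Ω0α0⊥, so their sum reproduces any vector.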

A.2 Asymptotics for the I(2) LR Test

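The proof in the I(2) case is completely analogous to the proof in the I(1) case with the main difference being the more sophisticated parameterization. Again we consider, without loss of generality, the simplest case of the I(2) model, the VAR(2) model as given by

$$\Delta^2X_t = \Pi X_{t-1} - \Gamma\Delta X_{t-1} + \varepsilon_t.$$

The hypothesis of interest is H(r, s) against the alternative H(p). Under H(r, s) we use the parameterization in (8),

$$\Delta^2X_t = \alpha[\rho'\tau'X_{t-1} + \psi'\Delta X_{t-1}] + \Omega\alpha_\perp(\alpha_\perp'\Omega\alpha_\perp)^{-1}\kappa'\tau'\Delta X_{t-1} + \varepsilon_t. \qquad (19)$$

We want to derive the asymptotic distribution of the likelihood ratio test,

$$-2\log Q(H(r,s)|H(p)) = -T\log\left|\tilde\Omega^{-1}\hat\Omega\right|,$$

where $\tilde\Omega$ and $\hat\Omega$ denote the ML estimates of Ω under H(r, s) and H(p) respectively.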

Proof of Theorem 1: As before, with θ a parameter, the corresponding true parameter

is denoted θ0. Henceforth, the parameters β and τ under H(r, s) are normalized on

c = β0 and c = τ0 respectively such that β00β = Ir and τ 00τ = Ir+s. Furthermore, set

α⊥ = (Ip − β0(α0β0)−1α0)β0⊥ such that all other parameters are identified, see Johansen

(1997). Note in particular that ρ = τ 00β which is (r + s)× r.

Introduce the parameters defined in Johansen (1997):

B0 = β020 (ψ − ψ0) , B1 = β

010 (β − β0) , B2 = β

020 (β − β0) , C = β

020 (τ − τ0) ρ⊥,

where ρ⊥ =³I − ρ0 (ρ

0ρ0)−1 ρ

´ρ0⊥. Note that ρ = τ 00β = ρ0 + τ 00β10B1 = ρ(B1) and

define similarly ρ⊥(B1). By Johansen (1997, Lemma 1) it follows that under H0(r, s) for

the ML estimates B0, B1, B2 and C derived under H(r, s),

T B0D→ B∞0 , T B1

D→ B∞1 , T 2B2D→ B∞2 and TC

D→ C∞, (20)

B∞ =¡B∞00 , B∞01 , B∞02

¶−1 Z 1

01u (21)

C∞ =

¶−1 Z 1

0H0udV

02u. (22)

β02C2Vuβ01C1Vuβ02C2

R u0 Vsds

with Vu a Brownian motion on u ∈ [0, 1] with covariance Ω0. Furthermore,

V1u =¡α00Ω

−10 α0

¢−1α00Ω

−10 Vu (23)

V2u =³ρ00⊥κ0

¡α00⊥Ω0α0⊥

¢−1κ00ρ0⊥

´−1ρ00⊥κ0

¡α00⊥Ω0α0⊥

¢−1α00⊥Vu (24)

= −³ξ00¡α00⊥Ω0α0⊥

¢−1ξ0

´−1ξ00¡α00⊥Ω0α0⊥

¢−1α00⊥Vu.

Now under H(r, s) the model in (19) can be written as

$$\Delta^2X_t = A_0Z_{0t} + A_1Z_{1t} + A_2Z_{2t} + \varepsilon_t, \qquad (25)$$

where Z0t, Z1t and Z2t are I(0), I(1) and I(2) regressors respectively, defined by

Ãβ00Xt−1 + ψ00∆Xt−1τ 00∆Xt−1

Ãβ020∆Xt−1β010Xt−1

!Z2t = β020Xt−1.

Finally,

A0 =³α, α (ψ − ψ0)

0 τ0 +Ωα⊥¡α0⊥Ωα⊥

¢−1κ0´

A1 =³αB00 +Ωα⊥

¡α0⊥Ωα⊥

¢−1κ0£ρ⊥ (B1)C

0 + ρ (B1)B02

¤, αB01

´(27)

A2 = αB02. (28)

Introduce the auxiliary hypothesis, Haux where ψ (p× r), β (p× r) and τ (p× (r+ s)) are

fixed at their true values, ψ0, β0 and τ0, corresponding to B0, B1, B2 and C all identically

zero. Note that under Haux, the model equation in (25) reduces to

$$\Delta^2X_t = A_0Z_{0t} + \varepsilon_t,$$

and furthermore that Haux ⊆ H(r, s) ⊆ H(p) such that

−2 logQ(H(r, s)|H(p)) = −2 logQ(Haux|H(p))− [−2 logQ(Haux|H(r, s))] .

Turn first to −2 logQ(Haux|H(r, s)) and define the corresponding estimated residuals

t = ∆2Xt − A0Z0t − A1Z1t − A2Z2t, t = ∆

2Xt − A0Z0t and 0t = ∆2Xt − A0Z0t.

Then Ω = 1T

PTt=1 t

0t, and

ˆ0t00t +XT − YT − Y 0T ,

XT =³A1, A2

´Ã 1T

¡Z 01t, Z

¢0 ¡Z 01t, Z

¢!³A1, A2

´0YT =

³∆Xt − A0Z0t

´ ¡Z 01t, Z

¢!³A1, A2

With DT =blockdiag³

1√TIp−r, 1

T 3/2Ip−r−s

´it follows that

¡Z 01t, Z

¢0 ¡Z 01t, Z

D→Z 1

0udu. (29)

Next, using (20) together with the definitions of Ai in (26)-(28) and consistency of the

remaining parameters, it follows that

√TD−1T

³A1, A2

´D→ α0

¡B∞00 , B∞01 , B∞02

¢− ³Ω0α0⊥ ¡α00⊥Ω0α0⊥¢−1 ξ0C∞0, 0, 0´ . (30)

Combining (29) and (30) and using the definitions (21)-(22) it follows that

TXTD→ X∞ =

hα0¡B∞00 , B∞01 , B∞02

¢0 − ³Ω0α0⊥ ¡α00⊥Ω0α0⊥¢−1 ξ0C∞0, 0, 0´i×·Z 1

¸ hα0¡B∞00 , B∞01 , B∞02

¢0 − ³Ω0α0⊥ ¡α00⊥Ω0α0⊥¢−1 ξ0C∞0, 0, 0´i0Note that, as α00α0⊥ = 0 by definition,

tr©Ω−10 X∞ª = tr

0dV1uH

¶−1 Z 1

(¡α00⊥Ω0α0⊥

¢−1ξ0

0dV2uH

¶−1 Z 1

that is, the cross product terms vanish. Next, by convergence to stochastic integrals,

TYT = T

³∆Xt − A0Z0t

´ ¡Z 01t, Z

!TD−1T

³A1, A2

ÃTXt=1

¡Z 01t, Z

!√TD−1T

³A1, A2

´0+ oP (1)

D→Z 1

hα0¡B∞00 , B∞01 , B∞02

¢− ³Ω0α0⊥ ¡α00⊥Ω0α0⊥¢−1 ξ0C∞0, 0, 0´i0=

¶−1 Z 1

00−Z 1

¶−1 Z 1

0H0udV

¡α00⊥Ω0α0⊥

¢−1α00⊥Ω0.

Therefore by joint convergence, and as 1T

PTt=1 ˆ0t

P→ Ω0, ΩP→ Ω0, and furthermore

using the definition of V1 and V2 in (23)-(24),

− 2 logQ(Haux|H(r, s)) = −T log¯Ω−1Ω

¯= −Ttr ©Ω−10 ¡

X∞ − Y∞ − Y∞0¢ª+ oP (1)

(Ω−10 α0(α

00Ω−10 α0)

−1α00Ω−10

¶−1 Z 1

½·α0⊥

¡α00⊥Ω0α0⊥

¢−1ξ0

³ξ00¡α00⊥Ω0α0⊥

¢−1ξ0

´−1ξ00¡α00⊥Ω0α0⊥

¢−1α00⊥

¸×Z 1

¶−1 Z 1

0H0udV

)+ oP (1).

This implies directly that also

−2 logQ(Haux|H(p)) = tr

(Ω−10

¶−1 Z 1

)+ oP (1).

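Using the projection $I_p = \alpha_0(\alpha_0'\Omega_0^{-1}\alpha_0)^{-1}\alpha_0'\Omega_0^{-1} + \Omega_0\alpha_{0\perp}(\alpha_{0\perp}'\Omega_0\alpha_{0\perp})^{-1}\alpha_{0\perp}'$ and collecting terms, it follows that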

− 2 logQ(H(r, s)|H(p))= −2 logQ(Haux|H(p))− [−2 logQ(Haux|H(r, s))]

(α0⊥

¡α00⊥Ω0α0⊥

¢−1α00⊥

¶−1 Z 1

)− (31)

½·α0⊥

¡α00⊥Ω0α0⊥

¢−1ξ0

³ξ00¡α00⊥Ω0α0⊥

¢−1ξ0

´−1ξ00¡α00⊥Ω0α0⊥

¢−1α00⊥

¸× (32)Z 1

¶−1 Z 1

0H0udV

)+ oP (1). (33)

The term in (31) can be rewritten as

(α0⊥

¡α00⊥Ω0α0⊥

¢−1α00⊥

¶−1 Z 1

0H0udV

(α0⊥

¡α00⊥Ω0α0⊥

¢−1α00⊥

¶−1 Z 1

¯¯H0

H1u −R 10 H1uH

hR 10 H0uH

i−1H 00u

H2u −R 10 H2uH

hR 10 H0uH

i−1H 00u

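Note that the term appearing in (32) can be written as

$$\alpha_{0\perp}(\alpha_{0\perp}'\Omega_0\alpha_{0\perp})^{-1}\xi_0\left(\xi_0'(\alpha_{0\perp}'\Omega_0\alpha_{0\perp})^{-1}\xi_0\right)^{-1}\xi_0'(\alpha_{0\perp}'\Omega_0\alpha_{0\perp})^{-1}\alpha_{0\perp}'
= \alpha_{0\perp}(\alpha_{0\perp}'\Omega_0\alpha_{0\perp})^{-1}\alpha_{0\perp}' - \alpha_{0\perp}\xi_{0\perp}(\xi_{0\perp}'\alpha_{0\perp}'\Omega_0\alpha_{0\perp}\xi_{0\perp})^{-1}\xi_{0\perp}'\alpha_{0\perp}'.$$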

Hence,

− 2 logQ(H(r, s)|H(p))

(α0⊥

¡α00⊥Ω0α0⊥

¢−1α00⊥

¶−1 Z 1

¡α020Ω0α20

¢−1α020

¶−1 Z 1

0H0udV

and the result in Theorem 1 follows by defining the standard Brownian motion

$$W_u = (\alpha_{0\perp}'\Omega_0\alpha_{0\perp})^{-1/2}\alpha_{0\perp}'V_u.$$

References

Banerjee, A., L. Cockerell, and B. Russell (2001): “An I(2) Analysis of Inflation and the Markup,” Journal of Applied Econometrics, 16, 221–240.

Banerjee, A., and B. Russell (2001): “The Relationship Between the Markup and Inflation in the G7 Economies and Australia,” The Review of Economics and Statistics, 83(2), 377–387.

Boswijk, H. P. (2000): “Mixed Normality and Ancillarity in I(2) Systems,” Econometric Theory, 16, 878–904.

Diamandis, P., D. Georgoutsos, and G. Kouretas (2000): “The Monetary Model in the Presence of I(2) Components: Long-Run Relationships, Short-Run Dynamics and Forecasting of the Greek Drachma,” Journal of International Money and Finance, 19, 917–941.

Doornik, J. A. (1998): “Approximations to the Asymptotic Distribution of Cointegration Tests,” Journal of Economic Surveys, 12(5), 573–593.

Fiess, N., and R. MacDonald (2001): “The Instability of the Money Demand Function: an I(2) Interpretation,” Oxford Bulletin of Economics and Statistics, 63(4), 475–495.

Haldrup, N. (1998): “A Review of the Econometric Analysis of I(2) Variables,” Journal of Economic Surveys, 12, 595–650.

Johansen, S. (1992): “A Representation of Vector Autoregressive Processes Integrated of Order 2,” Econometric Theory, 8, 188–202.

Johansen, S. (1995): “A Statistical Analysis of Cointegration for I(2) Variables,” Econometric Theory, 11, 25–59.

Johansen, S. (1996): Likelihood-Based Inference in Cointegrated Autoregressive Models. Oxford University Press, Oxford, 2nd edn.

Johansen, S. (1997): “Likelihood Analysis of the I(2) Model,” Scandinavian Journal of Statistics, 24, 433–462.

Juselius, K. (1998): “A Structured VAR for Denmark under Changing Monetary Regimes,” Journal of Business and Economic Statistics, 16(4), 400–412.

Juselius, K. (1999): “Price Convergence in the Medium and Long Run: An I(2) Analysis of Six Price Indices,” in Cointegration, Causality, and Forecasting - A Festschrift in Honour of Clive W.J. Granger, ed. by R. F. Engle, and H. White. Oxford University Press, Oxford.

Kongsted, H. C. (2003): “An I(2) Cointegration Analysis of Small-Country Import Price Determination,” Econometrics Journal, 6, 53–71.

Nielsen, B., and A. Rahbek (2000): “Similarity Issues in Cointegration Analysis,” Oxford Bulletin of Economics and Statistics, 62(1), 5–22.

Nielsen, H. B. (2002): “An I(2) Cointegration Analysis of Price and Quantity Formation in Danish Manufactured Exports,” Oxford Bulletin of Economics and Statistics, 64(5), 449–472, Chapter 4 in this Thesis.

Paruolo, P. (1996): “On the Determination of Integration Indices in I(2) Systems,” Journal of Econometrics, 72, 313–356.

Paruolo, P. (2000): “Asymptotic Efficiency of the Two Stage Estimator in I(2) Systems,” Econometric Theory, 16, 524–550.

Paruolo, P., and A. Rahbek (1999): “Weak Exogeneity in I(2) VAR Systems,” Journal of Econometrics, 93, 281–308.

Rahbek, A., H. C. Kongsted, and C. Jørgensen (1999): “Trend-Stationarity in the I(2) Cointegration Model,” Journal of Econometrics, 90, 265–289.

r        Models

0        H(0, 0) ⊂ H(0, 1) ⊂ H(0, 2) ⊂ H(0, 3) = H(0)
                                                  ∩
1                  H(1, 0) ⊂ H(1, 1) ⊂ H(1, 2) = H(1)
                                                  ∩
2                            H(2, 0) ⊂ H(2, 1) = H(2)
                                                  ∩
3                                      H(3, 0) = H(3)

p−r−s       3         2         1         0

Table 1: Partial nesting structure for p=3.

r        Two-Step rank test                   LR test

0        203.53  118.91   57.62   47.28       203.53  118.91   57.62   47.28
          [.00]   [.00]   [.02]   [.02]        [.00]   [.00]   [.02]   [.02]
1                 78.08   20.37   17.78                47.56   19.21   17.78
                  [.00]   [.72]   [.37]                [.06]   [.79]   [.37]
2                         11.11    7.05                         9.10    7.05
                          [.55]   [.35]                        [.74]   [.35]

p−r−s         3       2       1       0            3       2       1       0

Table 2: Rank determination for the data in Banerjee, Cockerell, and Russell (2001). Figures in square brackets are asymptotic p-values according to the Γ-approximation of Doornik (1998).

T Models H∗(r, s)

H∗(0, 0) H∗(0, 1) H∗(0, 2) H∗(0, 3) H∗(1, 0) H∗(1, 1)

Two-Step rank test

50 100.0 96.2 47.1 52.3 94.4 20.2

75 100.0 100.0 81.6 76.7 99.6 21.8

100 100.0 100.0 97.1 92.1 100.0 18.4

200 100.0 100.0 100.0 100.0 100.0 11.5

500 100.0 100.0 100.0 100.0 100.0 8.1

1000 100.0 100.0 100.0 100.0 100.0 6.3

LR test

50 100.0 96.2 47.1 52.3 32.3 5.7

75 100.0 100.0 81.6 76.7 66.1 7.5

100 100.0 100.0 97.1 92.1 91.4 7.0

200 100.0 100.0 100.0 100.0 100.0 5.9

500 100.0 100.0 100.0 100.0 100.0 5.7

1000 100.0 100.0 100.0 100.0 100.0 5.4

Table 3: Rejection frequencies in a simulation based on Banerjee,

Cockerell, and Russell (2001). The tests are not calculated sequen-

tially. Bold indicates rejection frequencies for tests of the correct

model (empirical size). Based on 5000 replications and a nominal

5% level.

[Figure 1: six panels, one for each of H∗(0,0), H∗(0,1), H∗(0,2), H∗(0,3), H∗(1,0) and H∗(1,1). Each panel shows the distributions of the Two-Step and LR statistics, the 95% critical value and the asymptotic distribution.]

Figure 1: Distributions of the two test statistics for the case T=75. Graphs are organized according to the partial nesting structure. Based on 5000 replications.

T Cointegration Ranks

(0, 0) (0, 1) (0, 2) (0, 3) (1, 0) (1, 1) (1, 2) (2, 0) (2, 1) (3, 0)

Two-Step rank test

50 0.0 3.8 49.1 5.2 1.3 27.8 5.4 2.7 3.6 1.1

75 0.0 0.0 18.4 8.5 0.2 56.0 7.1 4.5 4.1 1.2

100 0.0 0.0 2.8 5.6 0.0 75.2 7.0 4.0 3.9 1.4

200 0.0 0.0 0.0 0.0 0.0 88.5 4.8 3.1 2.7 1.0

500 0.0 0.0 0.0 0.0 0.0 91.9 2.1 3.0 2.3 0.7

1000 0.0 0.0 0.0 0.0 0.0 93.7 1.2 2.6 1.9 0.7

LR test

50 0.0 3.8 49.1 5.2 16.7 19.8 1.1 3.2 0.6 0.4

75 0.0 0.0 18.4 8.5 15.8 49.9 1.0 5.0 0.9 0.5

100 0.0 0.0 2.8 5.6 5.6 79.0 0.8 4.6 1.0 0.6

200 0.0 0.0 0.0 0.0 0.0 94.1 0.5 4.0 0.9 0.5

500 0.0 0.0 0.0 0.0 0.0 94.3 0.5 3.7 1.0 0.5

1000 0.0 0.0 0.0 0.0 0.0 94.6 0.7 3.3 0.9 0.4

Table 4: Empirical distribution of the estimated ranks (r̂, ŝ) based on the sequential test procedure and a nominal 5% level. The DGP is the estimated model for the

Banerjee, Cockerell, and Russell (2001) data. Bold indicates the correct rank indices.

Based on 5000 replications.

r        Two-Step rank test                           LR test

0        343.53  238.76  161.88  122.79  101.00       343.53  238.76  161.88  122.79  101.00
          [.00]   [.00]   [.00]   [.00]   [.00]        [.00]   [.00]   [.00]   [.00]   [.00]
1                187.53   96.95   51.17   48.48               142.37   88.22   49.94   48.48
                  [.00]   [.00]   [.09]   [.01]                [.00]   [.00]   [.11]   [.01]
2                         75.15   27.61   24.52                        45.36   26.10   24.52
                          [.00]   [.27]   [.07]                        [.10]   [.35]   [.07]
3                                 13.02   10.42                                11.72   10.42
                                  [.37]   [.11]                                [.49]   [.11]

p−r−s         4       3       2       1       0            4       3       2       1       0

Table 5: Rank determination for the data in Kongsted (2003). Figures in square brackets are asymptotic p-values according to the Γ-approximation of Doornik (1998).

T Models H∗(r, s)

H∗(0, 0) H∗(0, 1) H∗(0, 2) H∗(0, 3) H∗(0, 4) H∗(1, 0) H∗(1, 1) H∗(1, 2) H∗(1, 3) H∗(2, 0) H∗(2, 1)

Two-Step rank test

50 100.0 100.0 98.8 92.0 83.6 100.0 92.9 53.6 43.2 94.2 33.5

75 100.0 100.0 100.0 100.0 98.1 100.0 99.8 76.9 72.4 99.5 32.9

100 100.0 100.0 100.0 100.0 99.8 100.0 100.0 89.5 91.3 100.0 28.7

200 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 16.8

500 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 10.1

1000 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 7.5

LR test

50 100.0 100.0 98.8 92.0 83.6 97.5 66.7 27.7 43.2 21.5 4.0

75 100.0 100.0 100.0 100.0 98.1 100.0 98.6 53.3 72.4 51.8 5.5

100 100.0 100.0 100.0 100.0 99.8 100.0 100.0 77.7 91.3 81.2 6.3

200 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 5.7

500 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 5.6

1000 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 5.5

Table 6: Rejection frequencies in a simulation based on Kongsted (2003). DGP is H∗(2, 1). See also notes to Table 3.

T Models H∗(r, s)

H∗(0, 0) H∗(0, 1) H∗(0, 2) H∗(0, 3) H∗(0, 4) H∗(1, 0) H∗(1, 1) H∗(1, 2) H∗(1, 3) H∗(2, 0)

Two-Step rank test

50 100.0 100.0 96.2 84.3 82.8 100.0 86.6 33.3 39.3 91.0

75 100.0 100.0 100.0 99.1 97.0 100.0 99.1 46.6 59.5 91.1

100 100.0 100.0 100.0 100.0 99.7 100.0 99.9 64.6 79.5 86.3

200 100.0 100.0 100.0 100.0 100.0 100.0 100.0 98.1 99.6 66.3

500 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 38.3

1000 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 23.0

2500 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 11.2

5000 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 8.0

LR test

50 100.0 100.0 96.2 84.3 82.8 94.3 47.7 17.7 39.3 6.6

75 100.0 100.0 100.0 99.1 97.0 100.0 90.2 31.2 59.5 7.5

100 100.0 100.0 100.0 100.0 99.7 100.0 99.2 55.7 79.5 7.5

200 100.0 100.0 100.0 100.0 100.0 100.0 100.0 97.6 99.6 5.9

500 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 5.9

1000 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 6.1

2500 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 5.6

5000 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 5.2

Table 7: Rejection frequencies in a simulation based on Kongsted (2003). DGP is H∗(2, 0). See also notes to Table 3.

Figure 2: Distributions of the two test statistics for the case T=75. See also notes to Figure 1.

Chapter 2

Analyzing I(2) Systems by

Transformed Vector Autoregressions


Hans Christian Kongsted

Centre for Applied Microeconometrics

hans.christian.kongsted@econ.ku.dk

Heino Bohn Nielsen

Abstract

We characterize the restrictions imposed by the minimal I(2)-to-I(1) transformation

that underlies much applied work, e.g. on money demand relationships or open-

economy pricing relationships. The relationship between the parameters of the original

I(2) vector autoregression, including the coefficients of polynomially cointegrating

relationships, and the transformed I(1) model is characterized. We discuss estimation

of the transformed model subject to restrictions as well as the more commonly used

approach of unrestricted reduced rank regression. Only a minor loss of efficiency is

incurred by ignoring the restrictions in the empirical example and a simulation study.

A properly transformed vector autoregression thus provides a practical and effective

means for inference on the parameters of the I(2) model.

Keywords: Cointegration, stochastic trend, price homogeneity, nominal, real, Monte

Carlo experiment.

JEL Classification: C32, C51, C52, F41.

1 Introduction

This paper is motivated by a rich empirical literature applying cointegration analysis in examining the levels and the growth rates of macroeconomic variables and their relationships. Main examples are relationships that involve the growth rates of nominal variables, e.g. the rate of inflation, wage growth, or the money growth rate, and real or relative magnitudes of such variables, e.g. real wages, real money, or the markup, see Coenen and Vega (2001), Crowder, Hoffman, and Rasche (1999), or Doornik, Hendry, and Nielsen (1998) for examples. Other studies consider so-called stock-flow relationships, e.g. between income, consumption, and wealth as in Hendry and von Ungern-Sternberg (1981), or between sales, production, and inventories at the industry level as in Granger and Lee (1989).

We thank Allan Wurtz and seminar participants at the ESEM 2002 meeting in Venice and the European Central Bank for comments. Kongsted gratefully acknowledges financial support from the Danish Social Sciences Research Council under the project “Macroeconomic transmission mechanisms in Europe: Empirical applications and econometric methods”. The activities of the Centre for Applied Microeconometrics (CAM) are financed by a grant from the Danish National Research Foundation.

The time series of the rate of inflation, at least over the post-World War II period, is often treated as being integrated of order one, denoted I(1),¹ in the literature on relationships between nominal variables. This carries an immediate implication that price levels

are I(2) and has prompted a flurry of statistical research into models of I(2) variables,

see Haldrup (1998) for a recent survey. Recent research has also established that the

study of stock-flow relationships ought to be conducted within an I(2) framework, see

Engsted and Johansen (1999). Still, with few exceptions the full-blown I(2) analysis tends

to be avoided by the applied literature. Instead most studies rely on transformations that

partly difference the data vector. The widespread use of transformations in dealing with

I(2) variables seems related to the fact that inference in I(2) models is difficult in the

sense that few hypotheses allow the usual asymptotic χ2 inference. In particular, tests of

hypotheses on the so-called polynomially cointegrating relationships that relate levels and

growth rates of the process have non-standard asymptotic distributions.

The present paper offers a general and formal characterization of the partly differencing

approach. Specifically, we derive the properties of a transformed vector process obtained

by partly differencing an I(2) process. The transformation eliminates the I(2) trends while

retaining possible cointegrating relationships among the variables. A case is examined in

which the original process is generated by a vector autoregressive (VAR) model. The

transformation examined is minimal in terms of the amount of a priori information on

the parameters required to achieve a valid reduction from I(2) to I(1). Throughout,

the validity of a priori parameter restrictions will be taken as given. Clearly, in empirical

applications their validity should be tested.² A properly transformed process will satisfy a

VAR and inference on the full set of cointegrating parameters can be achieved by standard

I(1) methods.

The parameters of the transformed VAR are subject to certain restrictions. One set of restrictions is shown to fit directly into the standard I(1) reduced rank regression analysis, see Johansen (1996, chapter 6). Another set requires different and more elaborate estimation techniques and is commonly ignored in applied work. Moreover, we find that standard methods for inference on the cointegration rank employ an alternative hypothesis which is irrelevant under the assumptions maintained in transforming the data. The likely consequences of ignoring these considerations in terms of the resulting efficiency loss are explored in an empirical example and a small-scale simulation experiment. We find only a minor efficiency loss and conclude that the transformed model provides a practical and efficient means for inference on the parameters of the original I(2) model.

¹ A process is integrated of order d, denoted I(d), if it becomes stationary only after first-differencing d times, see Johansen (1996, chapter 3) for the formal definition.

² Kongsted (2002) examines the properties of a transformed process under invalid a priori parameter restrictions. Tests of the validity of the transformation are derived by Kongsted (2003) based on the two-step I(2) algorithm of Johansen (1995a) and by Johansen (2002) based on the maximum likelihood estimator of vector autoregressions with I(2) restrictions.

The paper is outlined as follows: Section 2 defines the I(2) model assumed to generate

the original data and the class of transformations analyzed. The structure of cointegrating

relationships and common stochastic trends in the transformed model is also derived in this

section and the parameters of the I(2) model are recovered. Section 3 collects the results

on parameter restrictions implied by the transformation and outlines some estimation

algorithms that impose those restrictions. Section 4 provides an empirical illustration

based on Banerjee, Cockerell, and Russell (2001) and uses it to set up a small-scale

simulation experiment.

Some notation and definitions are used throughout: For a p × r matrix α of rank r, let α⊥ denote a basis of the p × (p − r) orthogonal complement and define $\bar\alpha = \alpha(\alpha'\alpha)^{-1}$. For a p × r matrix β and a (p − r) × s matrix η, s < p − r, define β1 = β⊥η and β2 = β⊥η⊥. The matrices β, β1, and β2 are thus mutually orthogonal. Also note the relationship $I = V(B'V)^{-1}B' + b(v'b)^{-1}v'$ where V = v⊥ is p × (r + s), B = b⊥, and |v′b| ≠ 0, see Hansen and Johansen (1998, page 19).
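The projection relationship quoted from Hansen and Johansen (1998) can be verified numerically in the smallest non-trivial case. The sketch below takes p = 2 with b and v as 2-vectors, so that the orthogonal complements are again single vectors; the numbers and the function name are arbitrary choices for illustration.

```python
def complement_identity(b, v):
    """Return V (B'V)^{-1} B' + b (v'b)^{-1} v' for 2-vectors b and v,
    with B = b_perp and V = v_perp."""
    B = (-b[1], b[0])  # b_perp
    V = (-v[1], v[0])  # v_perp
    bv = B[0] * V[0] + B[1] * V[1]  # scalar B'V
    vb = v[0] * b[0] + v[1] * b[1]  # scalar v'b, assumed nonzero
    return [[V[i] * B[j] / bv + b[i] * v[j] / vb for j in range(2)]
            for i in range(2)]


result = complement_identity((1.0, 2.0), (3.0, -1.0))
print(result)  # numerically the 2 x 2 identity matrix
```

The condition |v′b| ≠ 0 is what guarantees that both scalar inverses exist; in this two-dimensional case B′V equals v′b.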

2 A Minimal Transformation from I(2) to I(1)

This section derives the process obtained by a minimal transformation from I(2) to I(1),

relating its cointegration properties and common stochastic trend structure to the orig-

inal I(2) process. Because the precise I(2) conditions play a major role in deriving the

implied restrictions, the section will briefly set up the VAR of the original data based

on Johansen (1992a) and Rahbek, Kongsted, and Jørgensen (1999). Then, a minimal

I(2)-to-I(1) transformation is defined and the transformed VAR is derived.

2.1 The Original I(2) Process

The starting point for the analysis is a p-dimensional I(2) vector time series, Xt. The

original process satisfies the kth order vector autoregressive model written in a parame-

terization suitable for I(2) processes,

$$\Delta^2X_t = \Pi X_{t-1} - \Gamma\Delta X_{t-1} + \sum_{i=1}^{k-2}\Psi_i\Delta^2X_{t-i} + \mu_0 + \mu_1t + \varepsilon_t, \qquad (1)$$

for t = 1, . . . , T. For the statistical analysis the εt are assumed to be independently and identically distributed N(0, Ω) and the initial observations, X−k+1, . . . , X0, are taken to be fixed.

Assuming that the roots of the characteristic polynomial of (1) are either at one or outside the unit circle and maintaining a further rank condition, see Johansen (1992a), the parameters of (1) should satisfy the following reduced rank conditions for $X_t$ to be an I(2) process,

$$\Pi = \alpha\beta' \quad\text{and}\quad \alpha_\perp'\Gamma\beta_\perp = \xi\eta'. \qquad (2)$$

Here, $\alpha$ and $\beta$ are $p \times r$ matrices of full rank, whereas $\xi$ and $\eta$ are $(p-r) \times s$ and also of full rank.

The cointegration structure of $X_t$ and the structure of its common stochastic trends are determined by (2) as derived in Johansen (1996, section 4.3). There are $p-r-s$ common I(2) trends embodied in $X_t$, represented as $\alpha_2'\sum\sum\varepsilon_i$ with $\alpha_2 = \alpha_\perp\xi_\perp$. They are loaded into the $X_t$ process by a matrix which is proportional to $\beta_2 = \beta_\perp\eta_\perp$. The common I(2) trends are eliminated by the full set of $r+s$ cointegrating vectors $(\beta, \beta_1)$ where $\beta_1 = \bar\beta_\perp\eta$. Both sets of linear combinations, $\beta'X_t$ and $\beta_1'X_t$, are I(1) in general and include the first-differenced I(2) component, $\alpha_2'\sum\varepsilon_i$. The $s$ linear combinations, $\beta_1'X_t$, in addition include the genuine I(1) trend of the system, $\alpha_1'\sum\varepsilon_i$, where $\alpha_1 = \alpha_\perp\xi$, and therefore do not cointegrate any further. The $r$ linear combinations, $\beta'X_t$, on the contrary, cointegrate to stationarity with $\Delta X_t$, producing a second layer of cointegration reflected in the $r$ polynomially cointegrating relations,

$$S_t = \beta'X_t - \delta\beta_2'\Delta X_t, \qquad (3)$$

which are I(0). This relationship includes the polynomially cointegrating parameter, $\delta = \bar\alpha'\Gamma\bar\beta_2$, of dimensions $r \times (p-q)$ with $q = r+s$. Note that if $r > p-q$, there are fewer I(2) trends than polynomially cointegrating relationships and $r-(p-q)$ directly cointegrating relationships can be defined among the $X_t$ variables as $\delta_\perp'\beta'X_t$.

In terms of the deterministics in (1) a specification will be considered that allows the

process Xt to be linearly trending in all directions whereas, by assumption, no quadratic

trends can be present. This is the model of Rahbek et al. (1999), which requires the

parameters of the constant and linear drift terms in (1) to be restricted as

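$$\alpha_\perp'\mu_0 = -\xi\eta_0' - \alpha_\perp'\Gamma\bar\beta\beta_0' \qquad (4)$$

$$\mu_1 = \alpha\beta_0', \qquad (5)$$

where $\eta_0'$ and $\beta_0'$ are vectors of dimensions $s$ and $r$. Transforming the process $X_t$ will in most cases also impose restrictions on the deterministic part.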

2.2 The Transformation

The transformed process is defined and analyzed under a specific set of assumptions on

the parameters of the original process, Xt. Those assumptions reflect a situation in which

there are strong a priori expectations of some number of common I(2) trends being shared

in certain known proportions by a (sub)set of variables in $X_t$.³

³ The transformation is denoted a nominal-to-real transformation in Kongsted (2002) since it typically involves going from a system of nominal variables to a real system.

A common example would be that one nominal trend is reflected by several I(2) vari-

ables in equal proportions, e.g. by the price level and the money stock, or by several price

measures, as in the empirical application below. The transformed process then includes

variables that reduce to I(1) either by linear transformation, e.g. the real money stock and

relative prices (along with any real variables in Xt), or by first-differencing as for instance

the rate of inflation.

In general terms, the transformation starts from a known matrix $b$ of dimension $p \times (p-q)$. The transformed vector process, $\tilde X_t$, is defined by

$$\tilde X_t = \begin{pmatrix} B'X_t \\ v'\Delta X_t \end{pmatrix} \equiv \begin{pmatrix} Z_t \\ U_t \end{pmatrix}, \qquad (6)$$

where $B = b_\perp$ is $p \times q$. The $p \times (p-q)$ matrix $v$ that defines the first-differenced term should satisfy $|v'b| \neq 0$. Throughout $b$ is assumed to satisfy orthogonality in terms of the full set of cointegrating vectors,

$$b'(\beta, \beta_1) = 0, \qquad (7)$$

or, equivalently, span$(\beta, \beta_1)$ = span$(B)$.
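The transformation itself is a one-line data operation. The sketch below assumes NumPy and a data array with one row per period; the function name `transform_i2_to_i1` and the toy numbers are hypothetical. The matrices `B` and `v` are those of the empirical illustration in Section 4.1, which turn $(p_t, u_t, m_t)'$ into $(p_t - u_t, p_t - m_t, \Delta p_t)'$.

```python
import numpy as np

def transform_i2_to_i1(X, B, v):
    """Stack Z_t = B'X_t and U_t = v' dX_t row by row; the first observation is lost to differencing."""
    dX = np.diff(X, axis=0)      # (T-1) x p array of first differences
    Z = X[1:] @ B                # (T-1) x q linear combinations in levels
    U = dX @ v                   # (T-1) x (p-q) linear combinations of differences
    return np.hstack([Z, U])

# Homogeneity b = (1, 1, 1)' with the B and v used in the empirical illustration.
b = np.ones((3, 1))
B = np.array([[1.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
v = np.array([[1.0], [0.0], [0.0]])

# Toy data (illustrative numbers only): columns are p_t, u_t, m_t.
p_ser = np.array([1.0, 1.5, 2.5, 4.0])
u_ser = np.array([0.5, 0.9, 1.8, 3.2])
m_ser = np.array([0.2, 0.8, 1.7, 3.1])
X = np.column_stack([p_ser, u_ser, m_ser])
X_tilde = transform_i2_to_i1(X, B, v)
```

The two requirements on the pair $(B, v)$, namely $B'b = 0$ and $|v'b| \neq 0$, can be verified directly with `B.T @ b` and `np.linalg.det(v.T @ b)`.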

2.3 The Structure of the Transformed Process

Under the condition (7) the process $\tilde X_t$ will be I(1) with cointegrating rank $r$. This is shown by Kongsted (2002) who also examines the general case of a potentially invalid transformation that does not achieve the reduction from I(2) to I(1). The present paper examines the case that (7) is indeed satisfied. Next, we characterize the parameter restrictions implied by the fact that $\tilde X_t$ derives from the I(2) process $X_t$ by this particular transformation. Moreover, it will be shown explicitly how to recover the parameters of the original I(2) model.

The matrix of loadings of the I(2) common trends, $\beta_2$, is known (up to a normalization) under the condition (7) and $b$ is a valid basis for $\beta_2$. Similarly, the full set of cointegrating vectors can be given the representation

$$(\beta, \beta_1) = B(\varphi, (B'B)^{-1}\varphi_\perp) \qquad (8)$$

for some $(r+s) \times r$ matrix $\varphi$ of full rank. The condition imposed on the matrix $v$ ensures that the linear combinations of the first-differenced process that are needed in order to recover the polynomially cointegrating relationships (3) are in fact included in $U_t$.

The transformation requires b and thus the number of common I(2) trends in the

system, p − q, to be known. The number of polynomially cointegrating relationships,

r, and the number of genuine I(1) common trends amongst the variables, s, are only

restricted by their sum, q. This reflects a view that often less a priori information is

available on r and s. Note that the transformation leaves unrestricted the relationship

between B and each of the sets of cointegrating vectors, β and β1. In that sense, (6) is

the minimal transformation that achieves the reduction from I(2) to I(1).

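The parameters of the transformed process $\tilde X_t$ will now be derived. First, the conditions $\Pi = \alpha\beta'$ and $\mu_1 = \alpha\beta_0'$ are imposed in (1). Secondly, the equations are premultiplied by the non-singular matrix $M = (B, v)'$ to obtain

$$\begin{pmatrix} B'\Delta^2X_t \\ v'\Delta^2X_t \end{pmatrix} = \begin{pmatrix} B'\alpha\beta'X_{t-1} \\ v'\alpha\beta'X_{t-1} \end{pmatrix} - \begin{pmatrix} B'\Gamma\Delta X_{t-1} \\ v'\Gamma\Delta X_{t-1} \end{pmatrix} + \sum_{i=1}^{k-2}\begin{pmatrix} B'\Psi_i\Delta^2X_{t-i} \\ v'\Psi_i\Delta^2X_{t-i} \end{pmatrix} + \begin{pmatrix} B'\alpha\beta_0't \\ v'\alpha\beta_0't \end{pmatrix} + \begin{pmatrix} B'(\mu_0+\varepsilon_t) \\ v'(\mu_0+\varepsilon_t) \end{pmatrix}. \qquad (9)$$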

The final step applies the identity $I = V(B'V)^{-1}B' + b(v'b)^{-1}v' = AB' + av'$ with $A = V(B'V)^{-1}$ and $a = b(v'b)^{-1}$. This allows us to write $\Delta X_t = A\Delta Z_t + aU_t$. Substituting for $\Delta X_{t-1}$ and $\Delta^2X_{t-i}$, $i = 0, 1, \ldots, k-2$, in (9) and collecting terms, a set of equations for $\tilde X_t$ is obtained,

$$\Delta\tilde X_t = \tilde\Pi\tilde X_{t-1} + \sum_{i=1}^{k-1}\tilde\Gamma_i\Delta\tilde X_{t-i} + \tilde\mu_1 t + \tilde\mu_0 + \tilde\varepsilon_t, \qquad (10)$$

where $\tilde\varepsilon_t = M\varepsilon_t$, $\tilde\mu_0 = M\mu_0$, and $\tilde\mu_1 = M\alpha\beta_0'$. This is a VAR(k) for the transformed variables in error correction format, the standard representation for cointegration analysis of I(1) processes, see Johansen (1996). Comparing (9) and (10) the parameters of the transformed VAR are given by:

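$$\tilde\Pi = \begin{pmatrix} B'\alpha\varphi' & -B'\Gamma a \\ v'\alpha\varphi' & -v'\Gamma a \end{pmatrix},$$

$$\tilde\Gamma_1 = \begin{pmatrix} I + B'(\Psi_1 - \Gamma)A & B'\Psi_1 a \\ v'(\Psi_1 - \Gamma)A & v'\Psi_1 a \end{pmatrix},$$

$$\tilde\Gamma_i = \begin{pmatrix} B'(\Psi_i - \Psi_{i-1})A & B'\Psi_i a \\ v'(\Psi_i - \Psi_{i-1})A & v'\Psi_i a \end{pmatrix}, \quad i = 2, \ldots, k-2,$$

$$\tilde\Gamma_{k-1} = \begin{pmatrix} -B'\Psi_{k-2}A & 0 \\ -v'\Psi_{k-2}A & 0 \end{pmatrix}.$$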
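The mapping from the I(2) parameters to the transformed VAR can be checked numerically. The sketch below is a simplified illustration, not the authors' code: it assumes NumPy, takes $k = 2$ (so there are no $\Psi_i$ terms and the single lag matrix is $(I - B'\Gamma A, 0)$ stacked over $(-v'\Gamma A, 0)$, obtained by the same substitution steps), drops the deterministic terms, and draws illustrative parameter values at random. It simulates $X_t$ from (1) and verifies that the transformed data satisfy (10) exactly.

```python
import numpy as np

rng = np.random.default_rng(1)
p, q, r = 3, 2, 1
b = np.ones((p, 1))
v = np.array([[1.0], [0.0], [0.0]])
B = np.array([[1.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])   # B'b = 0
V = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])     # V'v = 0
A = V @ np.linalg.inv(B.T @ V)
a = b @ np.linalg.inv(v.T @ b)
M = np.hstack([B, v]).T

# Illustrative parameters; beta lies in span(B) as required by (7).
phi = rng.normal(size=(q, r))
alpha = 0.1 * rng.normal(size=(p, r))
beta = B @ phi
Gamma = 0.5 * np.eye(p) + 0.1 * rng.normal(size=(p, p))

# Transformed parameters for k = 2.
Pi_t = np.hstack([M @ alpha @ phi.T, -M @ Gamma @ a])
Gamma1_t = np.hstack([-M @ Gamma @ A, np.zeros((p, p - q))])
Gamma1_t[:q, :q] += np.eye(q)

# Simulate the original process: d2X_t = alpha beta' X_{t-1} - Gamma dX_{t-1} + eps_t.
T = 30
eps = 0.01 * rng.normal(size=(T, p))
X = np.zeros((T, p))
X[:2] = rng.normal(size=(2, p))
for t in range(2, T):
    d2 = alpha @ beta.T @ X[t - 1] - Gamma @ (X[t - 1] - X[t - 2]) + eps[t]
    X[t] = 2.0 * X[t - 1] - X[t - 2] + d2

# Transformed data: row j of Xt corresponds to time j + 1.
Xt = np.hstack([X[1:] @ B, np.diff(X, axis=0) @ v])
dXt = np.diff(Xt, axis=0)                  # row j corresponds to time j + 2
lhs = dXt[1:]                              # times 3, ..., T-1
rhs = Xt[1:-1] @ Pi_t.T + dXt[:-1] @ Gamma1_t.T + eps[3:] @ M.T
rel_gap = np.abs(lhs - rhs).max() / max(1.0, np.abs(lhs).max())
print(rel_gap)
```

The zero block in the last $p - q$ columns of the lag matrix is the restriction on $\Delta U_{t-k+1}$ discussed in Section 3.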

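The structure of the cointegrating relationships in the transformed model can be analyzed by applying a result from Johansen (1992a), that for the I(2) process it holds

$$\Gamma = \Gamma\bar\beta\beta' + (\alpha\bar\alpha'\Gamma\bar\beta_1 + \alpha_1)\beta_1' + \alpha\bar\alpha'\Gamma\bar\beta_2\beta_2'. \qquad (11)$$

Post-multiplying by $a = b(v'b)^{-1}$ will eliminate the first two terms since $b$ is a valid basis for span$(\beta_2)$, that is, $\beta_2 = b\omega'$ for some $\omega$ such that $|\omega| \neq 0$. Thus, $\Gamma a = \alpha\tilde\delta$ with $\tilde\delta = \delta\omega b'b(v'b)^{-1}$ where $\delta$ reflects the rule adopted for normalizing $\beta_2$ in the I(2) model. Substituting for $\Gamma a$ in the above expression for $\tilde\Pi$ it is seen to be the product of $p \times r$ matrices, $\tilde\Pi = \tilde\alpha\tilde\beta'$, where

$$\tilde\beta = \begin{pmatrix} \varphi \\ -\tilde\delta' \end{pmatrix} \quad\text{and}\quad \tilde\alpha = M\alpha = \begin{pmatrix} B'\alpha \\ v'\alpha \end{pmatrix}. \qquad (12)$$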

The full set of I(2) cointegrating parameters is seen to be recovered from the cointegrating parameters of the transformed process, $\tilde\beta$. Specifically, $\beta = B\varphi$ and $\beta_1 = \bar B\varphi_\perp$. A particularly useful result concerns the parameter $\delta$ of the polynomially cointegrating relationship (3) which can be recovered as $\delta = \tilde\delta v'b(b'b)^{-1}\omega^{-1}$. It enters the transformed model as a standard I(1) cointegrating parameter for which there is a well-developed theory of inference, see Johansen (1996).
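It may help to see how the cointegrating relation of the transformed model relates to the polynomially cointegrating relation (3). The following short derivation is a sketch that uses only the definitions above ($\beta = B\varphi$, $\beta_2 = b\omega'$, $\tilde\delta = \delta\omega b'b(v'b)^{-1}$, and $I = AB' + av'$); it is not quoted from the cited references.

```latex
\begin{align*}
\tilde\beta'\tilde X_t
  &= \varphi' B' X_t - \tilde\delta\, v'\Delta X_t
   = \beta' X_t - \delta\omega\, b'b\,(v'b)^{-1} v'\Delta X_t, \\
S_t
  &= \beta' X_t - \delta\beta_2'\Delta X_t
   = \beta' X_t - \delta\omega\, b'\Delta X_t, \\
\tilde\beta'\tilde X_t - S_t
  &= \delta\omega\, b'\bigl(I - b\,(v'b)^{-1} v'\bigr)\Delta X_t
   = \delta\omega\, b' A\, \Delta Z_t .
\end{align*}
```

Since $Z_t$ is I(1), the difference $\delta\omega b'A\Delta Z_t$ is stationary, so $\tilde\beta'\tilde X_t$ and $S_t$ differ only by an I(0) term. The two coincide when $b'A = 0$, for instance when $v = b$.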

It can also be noted that span$(\tilde\delta)$ equals span$(\delta)$ which means that if $r > (p-q)$ we can define $\tilde\delta_\perp = \delta_\perp$ and produce $r-(p-q)$ linear combinations of $\tilde\beta$ with a zero coefficient for $U_t$. These reflect the directly cointegrating relationships that may exist in the I(2) model. In the general case, restrictions on elements of $\delta$ can be imposed as restrictions on $\tilde\beta$.⁴ Note that a zero restriction on any linear combination of the last $p-q$ rows of $\tilde\beta$ implies that a certain linear combination of the differenced common I(2) trend does not enter the polynomially cointegrating relationship of the I(2) model.

The matrix of equilibrium-correction loadings of the levels term in (1), $\alpha$, is recovered as

$$\alpha = M^{-1}\tilde\alpha,$$

where $M^{-1} = (V(B'V)^{-1}, b(v'b)^{-1})$. Because $\tilde X_t$ is I(1) and satisfies the VAR (10) we can apply a result on weak exogeneity from Johansen (1992b): If $\tilde c'\tilde\alpha = 0$ for some $p \times (p-m)$ matrix $\tilde c$ with $m \geq r$, then the process $\tilde c'\tilde X_t$ is weakly exogenous for $\tilde\beta$. The equivalent condition in terms of $\alpha$ is $c'\alpha = 0$ with $c = M'\tilde c$. This shows that if $\beta_2$ is known then imposing a condition on $\alpha$ (or, equivalently, $\alpha_\perp$) is sufficient for efficient inference on the cointegrating parameters. This simplifies matters considerably as compared to the unrestricted I(2) case for which the general result of Paruolo and Rahbek (1999) holds that $c'X_t$ is weakly exogenous for the parameters $(\beta, \beta_1, \delta)$ if the condition $c'(\alpha, \alpha_1, \Gamma\beta) = 0$ is satisfied. The difference between the weak exogeneity condition in the transformed system and in the unrestricted I(2) model stems from the fact that they are formulated with respect to different sets of cointegration parameters: In the unrestricted I(2) model the cointegration parameters are $(\beta, \beta_1, \delta)$, whereas in the present case, $\beta_2$ is known up to a normalization and the cointegration parameters that remain are just $(\varphi, \tilde\delta)$, or, equivalently, $(\beta, \delta)$.

A final set of results regards the common trends structure of the transformed process. The $p-r$ common I(1) trends of the transformed process are given by $\tilde\alpha_\perp'\sum\tilde\varepsilon_i$ with $\tilde\alpha_\perp = (M^{-1})'\alpha_\perp$ and therefore

$$\tilde\alpha_\perp'\sum\tilde\varepsilon_i = \alpha_\perp'M^{-1}\sum M\varepsilon_i = \alpha_\perp'\sum\varepsilon_i. \qquad (13)$$

As span$(\alpha_\perp)$ = span$(\alpha_1, \alpha_2)$ this shows that the common trends of the original I(2) process are fully recovered by the transformed I(1) process. Both sets of common trends now enter as I(1) stochastic trends. The matrix of loadings of the common trends is proportional to the orthogonal complement, $\tilde\beta_\perp$, which can be conveniently represented as

$$\tilde\beta_\perp = \begin{pmatrix} \varphi_\perp & \bar\varphi\delta \\ 0 & v'b(b'b)^{-1}\omega^{-1} \end{pmatrix}. \qquad (14)$$

⁴ This includes identifying restrictions which can be imposed on $\delta$ with the usual caveat that the structure is in fact identifying according to the conditions laid out by Johansen (1995b). In particular, one should not be able to impose a full row of zeros in $\delta'$. This is testable by standard I(1) tools as a row restriction on $\tilde\beta$.
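The representation (14) can be verified numerically from the definitions in (12). The sketch below assumes NumPy and uses randomly drawn, purely illustrative values for $b$, $v$, $\varphi$, $\omega$ and $\delta$; it checks that $\tilde\beta'\tilde\beta_\perp = 0$, that $\tilde\beta_\perp$ has rank $p - r$, and that $\delta$ is recovered from $\tilde\delta$.

```python
import numpy as np

rng = np.random.default_rng(2)
p, q, r = 5, 3, 2
s = q - r
b = rng.normal(size=(p, p - q))
v = rng.normal(size=(p, p - q))              # |v'b| != 0 holds almost surely
phi = rng.normal(size=(q, r))
omega = rng.normal(size=(p - q, p - q))      # non-singular almost surely
delta = rng.normal(size=(r, p - q))

u, _, _ = np.linalg.svd(phi, full_matrices=True)
phi_perp = u[:, r:]                                  # q x s basis with phi' phi_perp = 0
phi_bar = phi @ np.linalg.inv(phi.T @ phi)           # phi (phi'phi)^{-1}

delta_t = delta @ omega @ b.T @ b @ np.linalg.inv(v.T @ b)    # tilde-delta, r x (p-q)
beta_t = np.vstack([phi, -delta_t.T])                         # tilde-beta in (12), p x r
lower = v.T @ b @ np.linalg.inv(b.T @ b) @ np.linalg.inv(omega)
beta_t_perp = np.block([[phi_perp, phi_bar @ delta],
                        [np.zeros((p - q, s)), lower]])       # equation (14)

orth_gap = np.abs(beta_t.T @ beta_t_perp).max()
delta_recovered = delta_t @ lower                             # should equal delta
```

The upper-right block requires $\bar\varphi = \varphi(\varphi'\varphi)^{-1}$ rather than $\varphi$ itself, since orthogonality to $\tilde\beta$ needs $\varphi'\bar\varphi\delta = \delta$.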

Two main implications for the common trends structure emerge directly from (14). First, the fact that $|v'b| \neq 0$ implies that $U_t = v'\Delta X_t$ in itself is non-cointegrating, that is, any linear combination of the components of the $(p-q)$-dimensional process $U_t$ would remain I(1). Essentially, the full set of common I(2) trends of the original process carry over in first differences via $U_t$.

Secondly, the representation allows $\alpha_1$ and $\alpha_2$ to be separately identified from the transformed process. Specifically, the representation in (14) separates the last $p-q$ columns related to the differenced I(2) trend, $\alpha_2'\sum\varepsilon_i$, from the first $s$ columns related to the genuine I(1) trend, $\alpha_1'\sum\varepsilon_i$. To see this, note that $\alpha_2'\sum\varepsilon_i$ is the only common trend left in $U_t$ due to first-differencing. Thus, $\alpha_1'\sum\varepsilon_i$ produces zero loadings in $U_t$ which can be used as part of the identification scheme of a structural common trends model along the lines of King, Stock, Plosser and Watson (1991), see also Warne (1993). Economic theory may suggest alternative identifying assumptions but the different roles assigned to $\alpha_1'\varepsilon_t$ and $\alpha_2'\varepsilon_t$ due to their very different effects in the I(2) model would seem suggestive for their economic interpretation as well.

3 Restrictions and Estimation

The parameters of the transformed VAR in (10) are subject to certain restrictions that

derive from the structure of the original process, Xt, and the transformation itself. The

restrictions can be categorized in two groups in practical terms.

A first group of restrictions relates to the reduced rank of the coefficient matrix of the levels term, $\tilde\Pi$. This leads to the parameterization $\tilde\Pi = \tilde\alpha\tilde\beta'$. Moreover, the coefficient of the linear trend can be seen to be restricted accordingly, that is,

$$\tilde\mu_1 = \tilde\alpha\beta_0'. \qquad (15)$$

The reduced rank of $\tilde\Pi$ and (15) are straightforward to implement in the reduced rank regression algorithm for I(1) models with a restricted linear term, see Johansen (1996, chapter 6).

A second group of parameter restrictions on the transformed VAR does not fall naturally into the reduced rank regression framework. Evidently, there are zero restrictions on the coefficients of $\Delta U_{t-k+1}$, restricting the last $p-q$ columns of the last lag coefficient, $\tilde\Gamma_{k-1}$. Moreover, if the I(2) process has restricted deterministic terms, conditions such as (4) carry over to the transformed model. Specifically, as (4) serves to exclude the possibility of quadratic trends in $X_t$, the first-differenced component in $\tilde X_t$, $U_t = v'\Delta X_t$, can have no linear trend.

The latter group of restrictions are commonly ignored in applied work. The aim of the

empirical example and the simulation experiment below is to assess the importance of the

resulting efficiency loss. Before turning to this part we outline the estimation algorithms

adopted.

3.1 Reduced Rank Regression

For completeness we first outline the standard case of reduced rank regression. This is based on a VAR(k) of the transformed variables, $\tilde X_t$, with a restriction on the linear trend term similar to (5). Maximum likelihood estimation (MLE) of this model amounts to solving the eigenvalue problem

$$\left|\lambda S_{11} - S_{10}S_{00}^{-1}S_{01}\right| = 0,$$

where $S_{ij} = T^{-1}\sum_{t=1}^{T}R_{it}R_{jt}'$ are sample moment matrices, and $R_{0t}$ and $R_{1t}$ are least squares residuals of regressing $\Delta\tilde X_t$ and $(\tilde X_{t-1}', t)'$ respectively on $W_t = (\Delta\tilde X_{t-1}', \Delta\tilde X_{t-2}', \ldots, \Delta\tilde X_{t-k+1}', 1)'$, see Johansen (1996, chapter 6). This yields $p+1$ ordered eigenvalues $1 > \hat\lambda_1 > \hat\lambda_2 > \ldots > \hat\lambda_p > \hat\lambda_{p+1} = 0$. The MLE of $(\tilde\beta', \beta_0')'$ is given by the eigenvectors corresponding to the $r$ largest eigenvalues. Furthermore the likelihood ratio test for a reduced rank of $r$ compared to the full rank alternative can be written as a function of the eigenvalues as the trace test statistic

$$Q_r = -2\log Q\left(\mathrm{rank}(\tilde\Pi) \leq r \mid \mathrm{rank}(\tilde\Pi) \leq p\right) = -T\sum_{i=r+1}^{p}\log\left(1 - \hat\lambda_i\right). \qquad (16)$$

Note that by not imposing the restriction on $\Delta U_{t-k+1}$ one more initial observation is necessary for the unrestricted model.
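The eigenvalue problem can be solved with a few lines of linear algebra. The sketch below assumes NumPy; the function names `partial_out` and `rrr_eigen` and the synthetic data are hypothetical, and the symmetric reformulation through a Cholesky factor of $S_{11}$ is one common numerical route, not necessarily the one used for the computations in this paper.

```python
import numpy as np

def partial_out(Y, W):
    """Least squares residuals from regressing the columns of Y on W."""
    coef = np.linalg.lstsq(W, Y, rcond=None)[0]
    return Y - W @ coef

def rrr_eigen(R0, R1):
    """Solve |lambda S11 - S10 S00^{-1} S01| = 0 for residual matrices R0 (T x p0) and R1 (T x p1)."""
    T = R0.shape[0]
    S00 = R0.T @ R0 / T
    S01 = R0.T @ R1 / T
    S11 = R1.T @ R1 / T
    L = np.linalg.cholesky(S11)                     # S11 = L L'
    Linv = np.linalg.inv(L)
    K = Linv @ S01.T @ np.linalg.solve(S00, S01) @ Linv.T   # symmetric, same eigenvalues
    lam, W = np.linalg.eigh(K)
    order = np.argsort(lam)[::-1]                   # largest eigenvalue first
    vecs = Linv.T @ W[:, order]                     # normalised so that vecs' S11 vecs = I
    return lam[order], vecs

# Synthetic illustration: p = 3 equations, p + 1 regressors in the levels block.
rng = np.random.default_rng(3)
T, p, r = 200, 3, 1
const = np.ones((T, 1))
Y1 = rng.normal(size=(T, p + 1))
Y0 = 0.5 * Y1[:, :p] + rng.normal(size=(T, p))
R0, R1 = partial_out(Y0, const), partial_out(Y1, const)
lam, vecs = rrr_eigen(R0, R1)
beta_hat = vecs[:, :r]                              # eigenvectors of the r largest eigenvalues
```

Because $S_{10}S_{00}^{-1}S_{01}$ has rank at most $p$ while $S_{11}$ is $(p+1)$-dimensional, the smallest eigenvalue is zero up to rounding, in line with $\hat\lambda_{p+1} = 0$.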

3.2 Restriction on the Lagged First-Differences

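The redundancy of $\Delta U_{t-k+1}$ is implied by the hypothesis

$$H_0: \tilde\Gamma_{k-1} = \left(\gamma_{p\times q}, 0_{p\times(p-q)}\right),$$

where $\gamma$ contains the free parameters. To impose the restriction we modify the reduced rank regression described above. In particular we modify the vector of unrestricted variables to obtain

$$W_t^* = \left(\Delta\tilde X_{t-1}', \Delta\tilde X_{t-2}', \ldots, \Delta\tilde X_{t-k+1}'\begin{pmatrix} I_q \\ 0_{(p-q)\times q} \end{pmatrix}, 1\right)',$$

so that only the first $q$ elements of the last lag, $\Delta Z_{t-k+1}$, are retained. Note that each equation of the VAR still contains the same set of variables and the effect can still be partialled out using least squares.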
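Constructing the restricted regressor vector amounts to dropping the last $p - q$ columns of the final lag. The following is a minimal sketch, assuming NumPy and a $T \times p$ array of transformed data whose first $q$ columns hold $Z_t$; the function name `lagged_regressors` and its arguments are hypothetical.

```python
import numpy as np

def lagged_regressors(X_tilde, k, q, restrict_last_lag=False):
    """Rows of W_t (or W*_t when restrict_last_lag=True) for 0-based rows t = k, ..., T-1."""
    dX = np.diff(X_tilde, axis=0)            # dX[j] = X_tilde[j+1] - X_tilde[j]
    T = X_tilde.shape[0]
    rows = np.arange(k, T)
    blocks = []
    for i in range(1, k):                    # lags 1, ..., k-1 of the first difference
        lag = dX[rows - i - 1]               # first difference dated t - i
        if restrict_last_lag and i == k - 1:
            lag = lag[:, :q]                 # keep dZ_{t-k+1}, drop dU_{t-k+1}
        blocks.append(lag)
    blocks.append(np.ones((len(rows), 1)))   # constant term
    return np.hstack(blocks)

rng = np.random.default_rng(4)
X_tilde = rng.normal(size=(12, 3))           # illustrative data, p = 3 and q = 2
W = lagged_regressors(X_tilde, k=3, q=2)
W_star = lagged_regressors(X_tilde, k=3, q=2, restrict_last_lag=True)
```

Since the same `W_star` enters every equation, the residuals $R_{0t}$ and $R_{1t}$ are still obtained by equation-by-equation least squares.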

3.3 Restriction on the Trend Term

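In order to impose that a linear trend is absent in $U_t$ we first rewrite (10) as

$$\Delta\tilde Y_t = \tilde\alpha\tilde\beta'\tilde Y_{t-1} + \sum_{i=1}^{k-1}\tilde\Gamma_i\Delta\tilde Y_{t-i} + \tilde\alpha\psi + \tilde\varepsilon_t \qquad (17)$$

$$\tilde X_t = \tilde Y_t + \theta t. \qquad (18)$$

Non-zero means in all directions are allowed for by the constant term restricted to the cointegrating relations in (17) and the linear trend is added in the factor representation (18). The restriction is that the last $p-q$ elements in $\theta$ are zero, i.e.

$$H_1: \theta = N\vartheta = \begin{pmatrix} I_q \\ 0_{(p-q)\times q} \end{pmatrix}\vartheta = \begin{pmatrix} \vartheta \\ 0_{(p-q)\times 1} \end{pmatrix},$$

where $\vartheta$ contains the free trend parameters.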

MLE of the model under $H_1$ can be performed by applying the switching algorithm of Nielsen (2003). The idea is that conditional on an estimate, $\hat\theta$, of the parameters of the linear trend, $\theta$, the parameters of (17) can be estimated using a usual reduced rank regression of the corrected data $\tilde Y_t = \tilde X_t - \hat\theta t$. With these estimates we can construct the estimated characteristic polynomial, $\hat A(L)$, and the estimated residual, $\hat e_t = \hat A(L)\tilde X_t - \hat{\tilde\alpha}\hat\psi$, which under the model can be written as

$$\hat e_t = \hat A(L)N\vartheta t + \tilde\varepsilon_t = \hat H_t\vartheta + \tilde\varepsilon_t, \qquad (19)$$

where $\hat H_t \equiv \hat A(L)Nt$. The likelihood function conditional on $\hat{\tilde\alpha}\hat\psi$ and the parameters in $\hat A(L)$ is maximized over $\vartheta$ by a GLS estimation in (19), i.e.

$$\hat\vartheta = \left(\sum_{t=1}^{T}\hat H_t'\hat\Omega^{-1}\hat H_t\right)^{-1}\left(\sum_{t=1}^{T}\hat H_t'\hat\Omega^{-1}\hat e_t\right), \qquad (20)$$

see Tsay, Peña, and Pankratz (2000) and Saikkonen and Lütkepohl (2000) for a similar GLS step used in a two step estimator. Here we follow Nielsen (2003) and iterate between the two conditional steps until convergence.
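The GLS step (20) is a weighted least squares problem once $\hat H_t$, $\hat e_t$ and $\hat\Omega$ are available. The sketch below assumes NumPy and covers only this step, not the full switching algorithm; the function name `gls_trend_step`, the array layout and the synthetic inputs are hypothetical.

```python
import numpy as np

def gls_trend_step(H, e, Omega):
    """GLS estimate in (20). H is T x p x m (stacked H_t), e is T x p, Omega is p x p."""
    Oinv = np.linalg.inv(Omega)
    lhs = np.einsum('tpi,pq,tqj->ij', H, Oinv, H)   # sum_t H_t' Omega^{-1} H_t
    rhs = np.einsum('tpi,pq,tq->i', H, Oinv, e)     # sum_t H_t' Omega^{-1} e_t
    return np.linalg.solve(lhs, rhs)

# Synthetic check: without noise the step recovers the trend parameters exactly.
rng = np.random.default_rng(5)
T, p, m = 50, 3, 2
H = rng.normal(size=(T, p, m))
vartheta_true = np.array([0.3, -0.1])
e = np.einsum('tpi,i->tp', H, vartheta_true)
Omega = np.array([[1.0, 0.2, 0.0], [0.2, 1.0, 0.1], [0.0, 0.1, 1.0]])
vartheta_hat = gls_trend_step(H, e, Omega)
```

In the iterative scheme this function would be called once per iteration, alternating with a reduced rank regression on the trend-corrected data.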

4 Empirical Application and Simulations

This section provides an empirical assessment of likely efficiency losses associated with the usual practice of applying standard I(1) methods to the unrestricted VAR of $\tilde X_t$, ignoring the fact that it was derived by transforming an I(2) process. First, the empirical analysis of Banerjee, Cockerell, and Russell (2001), BCR in the following, is reexamined in view of the above results. Secondly, a small scale Monte Carlo experiment is set up to provide simulation evidence on the issue. It uses the estimated BCR model as a realistic data generating process (DGP).⁵

⁵ The computations have been conducted in Ox 3.0, see Doornik (2001).

4.1 Empirical Illustration

BCR used Australian data to analyze the relation between inflation and the markup of prices on costs. The analysis included a set of three core variables: The log of consumer prices, $p_t$, the log of unit labor costs, $u_t$, and the log of import prices, $m_t$.⁶ The data has 94 effective quarterly observations for the period 1972:1 to 1995:2. BCR analyzed a VAR(2) with the deterministic specification of Paruolo (1996), applying the two-step estimator of Johansen (1995a).

For the purpose of illustration we make a few modifications to the specification. In particular, four conditioning variables are excluded so that $X_t = (p_t, u_t, m_t)'$, and the deterministic specification of Rahbek et al. (1999) is applied. Moreover, we apply the maximum likelihood estimator of the I(2) model, see Johansen (1997), to ensure that differences between the I(2) model and the transformed I(1) model do not reflect inefficiencies in the estimation of the I(2) model.

BCR impose the rank indices $r = 1$ and $s = 1$, and linear homogeneity between the variables, i.e. $b = (1, 1, 1)'$. The chosen rank indices are also consistent with the simplified specification.⁷ The six eigenvalues of the characteristic polynomial of the restricted model have moduli given by

1.000; 1.000; 1.000; .540; .277; .060.

Two of the unit roots are associated with the common I(2) trend and one with the genuine I(1) trend of the system. The largest unrestricted eigenvalue is far from the unit circle, reflecting a fast dynamic adjustment.

The estimates of the polynomially cointegrating parameters are reported in Table 1 in terms of $\beta'$ and $-\delta$. The estimates in row (iii) are obtained by applying the MLE to the simplified model. The original estimates from BCR are reported in rows (i) and (ii). The results are similar although the import share of the present analysis is larger. The differences reflect the different deterministic specifications, the exclusion of conditioning variables, and the use of the MLE rather than the two-step estimator.

A nominal-to-real transformation is performed by BCR using the matrices

$$B = \begin{pmatrix} 1 & 1 \\ -1 & 0 \\ 0 & -1 \end{pmatrix} \quad\text{and}\quad v = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \qquad (21)$$

which produce a transformed data set given by $\tilde X_t = (p_t - u_t, p_t - m_t, \Delta p_t)'$. It includes the markup on unit labor costs, the inverse of the real import prices and the rate of change in the consumer prices. The transformation satisfies the requirement that $|v'b| \neq 0$.

⁶ In addition, four variables assumed to be stationary and weakly exogenous were included.

⁷ The test of homogeneity can be conducted by standard methods in the I(2) model, see Kongsted (2003) and Johansen (2002). Using the ML estimator, homogeneity is accepted with a test statistic of 1.43 corresponding to a $p$-value of .49 according to a $\chi^2(2)$ distribution.

Applying this transformation to the simplified version of the BCR model and estimating the unrestricted I(1) model yields the estimates presented in row (iv). The log-likelihood of the unrestricted model is only marginally higher than the likelihood of the homogeneous I(2) model. The formal test statistic for the test of 4 restrictions (three restricted lag coefficients and one restricted trend coefficient) is around two, which is far from significant in a $\chi^2(4)$ distribution.
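The arithmetic behind this test can be reproduced from the log-likelihood values in rows (iii) and (iv) of Table 1. The sketch below uses only the Python standard library; the helper `chi2_sf_even` is a hypothetical name for the closed-form upper tail probability of a $\chi^2$ distribution with an even number of degrees of freedom.

```python
import math

def chi2_sf_even(x, df):
    """Upper tail probability of a chi-square variable with even df:
    exp(-x/2) * sum_{j < df/2} (x/2)^j / j!"""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form used here requires a positive even df")
    h = x / 2.0
    return math.exp(-h) * sum(h ** j / math.factorial(j) for j in range(df // 2))

loglik_unrestricted = 1195.8085   # row (iv) of Table 1
loglik_i2 = 1194.7875             # row (iii) of Table 1
lr = 2.0 * (loglik_unrestricted - loglik_i2)
p_value = chi2_sf_even(lr, 4)
print(round(lr, 3), round(p_value, 3))
```

The statistic is twice the log-likelihood difference reported in the last column of Table 1, and the resulting tail probability is far above conventional significance levels.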

Row (v) reports the results of imposing zero restrictions on the last lag on $v'\Delta X_t$, i.e.

$$H_0: \tilde\Gamma_1 = \left(\gamma_{3\times 2}, 0_{3\times 1}\right),$$

while row (vi) reports the results obtained by imposing the restriction that the transformed variables $v'\Delta X_t$ have no linear trend,

$$H_1: \theta = \left(\vartheta_{1\times 2}', 0\right)'.$$

Finally, row (vii) reports the results of imposing both restrictions on the model. The

fully restricted I(1) model for the transformed data is simply a reparametrization of the

homogeneous I(2) model and the results in rows (iii) and (vii) are seen to be identical.

The partly restricted models show that in the present case the importance of the trend

restriction, H1, is negligible whereas the lag restriction, H0, is somewhat more important.

This could simply be a result of the fact that H0 imposes three restrictions on the model

whereas H1 imposes only one. In total, the results indicate that inference on β and δ can

be effectively performed in an unrestricted I(1) analysis of the transformed data set.

Subject to the limitation that $|v'b| \neq 0$, the choice of $v$ only matters for the interpretation of the model. This is illustrated in the lower part of Table 1, which reports estimation results for an alternative choice of $v$. The average inflation rate, $v = \frac{1}{3}(1, 1, 1)'$, now represents the first difference of the I(2) trend. If the full set of restrictions is imposed as in row (xi), the results are indeed identical to rows (iii) and (vii) and a different choice of $v$ amounts to a reparametrization. However, in unrestricted or partly restricted models some differences in the results may appear. The effects of including a redundant lag of $v'\Delta X_t$ depend on the sample correlation between that particular variable and the other terms in the model, and therefore also on the specific choice of $v$. Similarly, the effects of the redundant linear trend allowed for in $v'\Delta X_t$ depend on the particular sample values.⁸

⁸ A grid search over possible vectors $v = (1, v_2, v_3)'$, where $v_2$ and $v_3$ take values between −100 and 100, resulted in log-likelihood values between 1195.13 and 1196.24 for the unrestricted reduced rank regression, the extremes being obtained for the vectors $(1, 0.2, -0.4)'$ and $(1, -1, -0.2)'$ respectively.

Table 1: Estimation on the Banerjee et al. (2001) data.

                                        β'                −δ       Log-         Difference to
                                p_t     u_t     m_t     v'ΔX_t   likelihood   I(2) model
Banerjee et al. (2001)
  (i)    Homogeneous I(2)      1.000   −.868   −.132    7.026
  (ii)   I(1)                  1.000   −.901   −.099    7.201

  (iii)  Homogeneous I(2)      1.000   −.795   −.205    7.887    1194.7875    ...
v = (1, 0, 0)'
  (iv)   Unrestricted          1.000   −.770   −.230    8.252    1195.8085    1.0210
  (v)    Lag restriction       1.000   −.788   −.212    8.180    1194.9731     .1856
  (vi)   Trend restriction     1.000   −.782   −.218    7.742    1195.6406     .8532
  (vii)  Both restrictions     1.000   −.795   −.205    7.887    1194.7875     .0000
v = (1/3)(1, 1, 1)'
  (viii) Unrestricted          1.000   −.768   −.232    8.706    1195.9540    1.1665
  (ix)   Lag restriction       1.000   −.788   −.212    8.180    1194.9731     .1856
  (x)    Trend restriction     1.000   −.775   −.225    8.319    1195.7618     .9743
  (xi)   Both restrictions     1.000   −.795   −.205    7.887    1194.7875     .0000

[Figure 1 (plot not reproduced): rejection frequency against number of observations; series: Trace, Trace tilde.]

Figure 1: Rejection frequencies for tests of the true null of $r \leq 1$ and the false null of $r = 0$ using the conventional trace test statistic, $Q_r$, and the modified statistic, $\tilde Q_r$. Critical values for both tests are taken from Cavaliere, Fanelli, and Paruolo (2001, Table 6).

4.2 Simulations

The loss of efficiency from ignoring the derived nature of the transformed process will now be characterized along two dimensions: Cointegration rank determination and estimation of the polynomially cointegrating parameters. A small scale Monte Carlo simulation is set up for this. It employs the homogeneous I(2) model reported in row (iii) of Table 1 as its DGP,⁹ which is defined by

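$$\Pi = \begin{pmatrix} -0.0637 & 0.0506 & 0.0131 \\ 0.0771 & -0.0613 & -0.0158 \\ -0.0659 & 0.0524 & 0.0135 \end{pmatrix}, \quad \Gamma = \begin{pmatrix} 0.5753 & -0.0888 & 0.0156 \\ -1.6646 & 1.0236 & 0.0326 \\ -0.1901 & -0.2564 & 0.9666 \end{pmatrix},$$

$$\Omega = 10^{-3}\begin{pmatrix} 0.0381 & 0.0483 & 0.0628 \\ 0.0483 & 0.3392 & 0.1133 \\ 0.0628 & 0.1133 & 0.9678 \end{pmatrix}, \quad \mu_0 = \begin{pmatrix} -0.0804 \\ 0.0980 \\ -0.0838 \end{pmatrix}, \quad \mu_1 = \begin{pmatrix} -0.0274 \\ 0.0332 \\ -0.0284 \end{pmatrix}.$$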

Samples of $T + k + 100$ observations, $t = -101, -100, \ldots, 0, 1, 2, \ldots, T$, are generated by initializing the stochastic part of the process at zero and replacing $\varepsilon_t$ with random independent $N(0, \hat\Omega)$ drawings. Sample sizes of $T = 50, 75, 100, 200, 400$ effective observations are considered, 100 presample observations are discarded,¹⁰ and estimation is conditional on $k = 2$ initial observations. 10,000 Monte Carlo replications are used for each case.
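One replication of the data generating step might look as follows. This is a rough sketch, assuming NumPy: the parameter values are the rounded DGP values reported above, while the treatment of the initial values (all set to zero), the dating of the trend term and the function name `simulate_i2` are simplifying assumptions and need not match the original Ox implementation.

```python
import numpy as np

Pi = np.array([[-0.0637, 0.0506, 0.0131],
               [0.0771, -0.0613, -0.0158],
               [-0.0659, 0.0524, 0.0135]])
Gamma = np.array([[0.5753, -0.0888, 0.0156],
                  [-1.6646, 1.0236, 0.0326],
                  [-0.1901, -0.2564, 0.9666]])
Omega = 1e-3 * np.array([[0.0381, 0.0483, 0.0628],
                         [0.0483, 0.3392, 0.1133],
                         [0.0628, 0.1133, 0.9678]])
mu0 = np.array([-0.0804, 0.0980, -0.0838])
mu1 = np.array([-0.0274, 0.0332, -0.0284])

def simulate_i2(T, rng, burn=100, k=2):
    """Simulate the VAR(2) in the form (1); returns k initial values plus T effective observations."""
    n = burn + k + T
    eps = rng.multivariate_normal(np.zeros(3), Omega, size=n)
    X = np.zeros((n, 3))                       # assumption: process started at zero
    for j in range(2, n):
        t = j - (burn + k) + 1                 # t = 1 at the first effective observation
        d2X = Pi @ X[j - 1] - Gamma @ (X[j - 1] - X[j - 2]) + mu0 + mu1 * t + eps[j]
        X[j] = 2.0 * X[j - 1] - X[j - 2] + d2X
    return X[burn:]                            # discard the presample observations

rng = np.random.default_rng(6)
X = simulate_i2(T=100, rng=rng)

# Nominal-to-real transformation with B and v as in (21).
B = np.array([[1.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
v = np.array([[1.0], [0.0], [0.0]])
X_tilde = np.hstack([X[1:] @ B, np.diff(X, axis=0) @ v])
```

Because the reported parameters are rounded, the reduced rank conditions in (2) hold only approximately in this sketch.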

The determination of cointegration rank based on the transformed process is the starting point in most applications. Kongsted (2002) showed that subject to (7), the rank index $r$ will indeed be correctly identified as rank$(\tilde\Pi)$. However, simply applying the trace test (16) would employ rank$(\tilde\Pi) \leq p$ as the alternative hypothesis when, under the set of assumptions maintained in transforming the process, in fact it holds that rank$(\tilde\Pi) \leq q$. A potential efficiency loss is evident already from the fact that applying the standard trace test could well result in an estimated cointegration rank in the infeasible interval $q < r \leq p$.

One could also take rank$(\tilde\Pi) \leq q$ as the alternative hypothesis and consider a modified trace statistic,

$$\tilde Q_r = -2\log Q\left(\mathrm{rank}(\tilde\Pi) \leq r \mid \mathrm{rank}(\tilde\Pi) \leq q\right) = -T\sum_{i=r+1}^{q}\log\left(1 - \hat\lambda_i\right). \qquad (22)$$

This situation of cointegration rank determination under a priori rank constraints is analyzed in Cavaliere, Fanelli and Paruolo (2001), who also report simulated critical values for the modified statistic.
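The two statistics differ only in the upper limit of the sum over eigenvalues. A minimal sketch, assuming NumPy; the function name `trace_statistic` and the eigenvalues in the example are hypothetical and not taken from the application.

```python
import numpy as np

def trace_statistic(eigenvalues, T, r, max_rank):
    """-T * sum_{i=r+1}^{max_rank} log(1 - lambda_i), with eigenvalues sorted in descending order."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    return -T * np.log(1.0 - lam[r:max_rank]).sum()

# Hypothetical eigenvalues for p = 3 and q = 2 with T = 100 observations.
lam_hat = [0.30, 0.08, 0.02]
T, p, q = 100, 3, 2
Q1 = trace_statistic(lam_hat, T, r=1, max_rank=p)         # standard statistic (16)
Q1_tilde = trace_statistic(lam_hat, T, r=1, max_rank=q)   # modified statistic (22)
```

The modified statistic omits the terms for $i = q+1, \ldots, p$, so it can never exceed the standard one for the same $r$; the two are compared against different critical values.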

The BCR-based DGP that underlies the simulation experiment has $p = 3$ and $q = 2$, and the modified trace statistic $\tilde Q_r$ employs rank$(\tilde\Pi) \leq 2$ as the alternative hypothesis. Figure 1 compares the empirical rejection frequencies of two tests of the (true) null that rank$(\tilde\Pi) \leq 1$, $Q_1$ and $\tilde Q_1$, at different sample sizes. The rejection frequencies can hardly be distinguished. The rejection frequencies of $Q_0$ and $\tilde Q_0$ for the (false) null that rank$(\tilde\Pi) = 0$ are also depicted in Figure 1. For small and moderate sample sizes there is an efficiency gain since the power of $\tilde Q_0$ is marginally higher. Still, the differences are minor and, overall, the loss of efficiency from using the standard trace test seems very limited. We will therefore make use of the standard test in the following.

⁹ This is chosen as a simple yet realistic DGP. Due to the fast dynamic adjustment in the model, it is well-behaved and shows reasonable size and power properties even in fairly small samples. Still, the emphasis is on the comparison between unrestricted and restricted estimators rather than their performance in absolute terms.

¹⁰ Simulation results for different choices of starting values show that this eliminates any importance of the starting values in the stationary directions of the process. The results also remain unaffected by changing the starting values in non-stationary directions due to a similarity property of the estimation method adopted here in terms of the deterministic terms of the DGP.

The second issue concerns efficiency in estimating the polynomially cointegrating parameters from the transformed model by unrestricted reduced rank regression. In each Monte Carlo replication we applied the estimators outlined in Section 3 to the simulated data. To evaluate the properties of the estimators the average angle between the estimated $\tilde\beta$ and the true vector of the DGP, $\tilde\beta = (0.795, 0.205, 7.887)'$, is reported together with the rejection frequency of the LR test for the (true) restrictions on the last lag and the trend coefficient. Furthermore, we report the actual size and power at a nominal 5 per cent level of the standard trace test for rank determination based on the different estimators.
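The angle criterion can be computed as follows. This is a minimal sketch, assuming NumPy and assuming that the angle is the usual Euclidean angle between the two vectors, taken as sign-invariant because a cointegrating vector is only determined up to scale; the paper does not spell out the exact definition, and the function name `angle_degrees` is hypothetical.

```python
import numpy as np

def angle_degrees(x, y):
    """Angle in degrees between the lines spanned by x and y (invariant to scale and sign)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    c = abs(x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return float(np.degrees(np.arccos(min(c, 1.0))))   # guard against rounding above 1

beta_true = np.array([0.795, 0.205, 7.887])     # true vector of the DGP
beta_est = np.array([0.770, 0.230, 8.252])      # e.g. the row (iv) estimates of Table 1
angle = angle_degrees(beta_est, beta_true)
```

Averaging this quantity over the Monte Carlo replications gives a scale-free summary of estimation precision.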

Table 2 reports the estimation results for a transformation given by $B$ and $v$ as defined in (21). In the unrestricted reduced rank regression, reported in Panel A, the average angle between the true and the estimated $\tilde\beta$ is 6 degrees for $T = 50$ and converging to zero relatively fast. The results for the rank determination reflect fast dynamic adjustment in the DGP. With $T = 100$ observations the power of the trace test for the hypothesis $r = 0$ is 95 per cent and the size is close to 5 per cent for all sample lengths.¹¹

Imposing the lag restriction, $H_0$, improves the average precision of the estimates in small samples, cf. the results reported in Panel B of Table 2. For 50 observations the restriction improves the average angle from 6 to 4.7 degrees. For 100 observations the difference is down to .1 degrees and for $T = 200$ the results are almost identical. In the restricted model the distribution of the trace test is apparently moved a little to the right, implying a higher power and higher size of the test. The LR test for the lag restriction, $H_0$, which is $\chi^2(3)$ distributed, is somewhat oversized in small samples with a rejection frequency of 10 to 15 per cent against the nominal size of 5 per cent.

Imposing the restriction, $H_1$, that the variable $v'\Delta X_t$ has no linear trend, yields little or no improvement. The results reported in Panel C are almost identical to the results of the unrestricted reduced rank regression. Due to a faster rate of convergence of the trend coefficient it matters very little if the restriction is imposed or not. Similarly, the results in Panel D of imposing the full set of restrictions are almost identical to the results obtained under the lag restriction alone, reflecting the minor importance of the trend restriction.

¹¹ The critical values for the trace tests are based on the Gamma approximation of Doornik (1998). A small negative approximation error in terms of the 95 per cent quantile seems to be implied by this approximation.

Table 2: Simulation results

                                                  Rejection frequency (per cent)
         Average angle    Difference in        LR test of         Trace test   Trace test
   T     to true β̃        angle to Panel A     VAR restriction    r = 0        r ≤ 1

A. Unrestricted RRR estimation
   50       6.194            ...                  ...               50.7         5.2
   75       2.801            ...                  ...               78.7         5.9
  100       1.743            ...                  ...               95.3         6.1
  200       0.744            ...                  ...              100.0         5.8
  400       0.342            ...                  ...              100.0         6.0

B. Lag restriction imposed
   50       4.701           −1.493                14.5              75.8         7.0
   75       2.379           −0.423                10.2              97.3         6.6
  100       1.617           −0.125                 8.6              99.9         6.2
  200       0.725           −0.020                 7.0             100.0         5.8
  400       0.338           −0.004                 6.2             100.0         6.1

C. Trend restriction imposed
   50       6.269            0.075                 4.5              47.3         4.8
   75       2.838            0.037                 4.7              76.2         5.5
  100       1.739           −0.003                 5.1              93.9         5.7
  200       0.741           −0.003                 4.9             100.0         5.3
  400       0.343           −0.000                 5.0             100.0         5.7

D. Both restrictions imposed (Homogeneous I(2) model)
   50       4.835           −1.359                12.6              72.6         6.3
   75       2.408           −0.393                 9.4              96.5         6.1
  100       1.619           −0.123                 8.1              99.9         5.9
  200       0.722           −0.022                 6.5             100.0         5.4
  400       0.338           −0.004                 6.0             100.0         5.7

Note: Angle measured in degrees. 10,000 replications. Critical values for the trace tests are taken from Doornik (1998).

5 Conclusions

This paper has derived the restrictions that apply to a transformed vector autoregression obtained by a minimal I(2)-to-I(1) transformation. The relationship between the parameters of the I(2) vector autoregression and the transformed model is characterized, including the coefficients of polynomially cointegrating relationships.

In applied work it is common to use unrestricted reduced rank regression and to

apply standard tools for inference on the cointegrating rank in the transformed model.

We find that there is only a small gain from excluding the alternative hypotheses which

are irrelevant under the assumptions maintained in transforming the data. Moreover,

unrestricted reduced rank regression is shown to yield only a minor loss of efficiency

compared to imposing the restrictions in the simulation experiment. Most efficiency is

gained by imposing the absence of redundant lags in the differenced I(2) process, which

is a fairly simple restriction in terms of the restricted estimation procedure. Imposing the

restriction on the trend coefficients, which requires a more involved iterative estimation

algorithm, leads to little or no efficiency gain. It appears fairly safe to ignore this restriction

in applied work.

In conclusion, a properly transformed vector autoregression provides a practical and

effective means for inference on the parameters of the I(2) model.

References

Banerjee, A., Cockerell, L., and Russell, B. (2001): “An I(2) Analysis of Inflation and the Markup”, Journal of Applied Econometrics, 16, 221–240.

Cavaliere, G., Fanelli, L., and Paruolo, P. (2001): “Determining the Number of Cointegrating Relations Under Rank Constraints”, University of Insubria WP 2001/17.

Coenen, G. and Vega, J.-L. (2001): “The Demand for M3 in the Euro Area”, Journal of Applied Econometrics, 16, 727–748.

Crowder, W., Hoffman, D., and Rasche, R. (1999): “Identification, Long-Run Relations, and Fundamental Innovations in a Simple Cointegrated System”, Review of Economics and Statistics, 81(1), 109–121.

Doornik, J.A. (1998): “Approximations to the Asymptotic Distributions of Cointegration Tests”, Journal of Economic Surveys, 12(5), 573–593.

––– (2001): Object-Oriented Matrix Programming Using Ox, 4th ed., London: Timberlake Consultants Press and Oxford: www.nuff.ox.ac.uk/Users/Doornik.

–––, Hendry, D.F., and Nielsen, B. (1998): “Inference in Cointegrating Models: UK M1 Revisited”, Journal of Economic Surveys, 12, 533–572.

Engsted, T. and Johansen, S. (1999): “Granger’s Representation Theorem and Multicointegration”, in Engle, R. and White, H. (eds.): Cointegration, Causality, and Forecasting: A Festschrift in Honour of Clive W.J. Granger, Oxford: Oxford University Press.

Granger, C. and Lee, T. (1989): “Investigation of Production, Sales and Inventory Relationships Using Multicointegration and Non-symmetric Error Correction Models”, Journal of Applied Econometrics, 4, 145–159.

Haldrup, N. (1998): “An Econometric Analysis of I(2) Variables”, Journal of Economic Surveys, 12, 595–650.

Hansen, P.R. and Johansen, S. (1998): Workbook on Cointegration, Oxford: Oxford University Press.

Hendry, D.F. and von Ungern-Sternberg, T. (1981): “Liquidity and Inflation Effects on Consumers’ Behaviour”, in Deaton, A. (ed.): Essays in the Theory and Measurement of Consumers’ Behaviour, Cambridge: Cambridge University Press.

Johansen, S. (1992a): “A Representation of Vector Autoregressive Processes Integrated of Order 2”, Econometric Theory, 8, 188–202.

––– (1992b): “Cointegration in Partial Systems and the Efficiency of Single Equation Analysis”, Journal of Econometrics, 52, 389–402.

––– (1995a): “A Statistical Analysis of Cointegration for I(2) Variables”, Econometric Theory, 11, 25–59.

––– (1995b): “Identifying Restrictions of Linear Equations: With Applications to Simultaneous Equations and Cointegration”, Journal of Econometrics, 11, 111–132.

––– (1996): Likelihood-Based Inference in Cointegrated Vector Autoregressive Models, 2nd edition, Oxford: Oxford University Press.

––– (1997): “Likelihood Analysis of the I(2) Model”, Scandinavian Journal of Statistics, 24, 433–462.

––– (2002): “The Statistical Analysis of Hypotheses on the Cointegrating Relations in the I(2) Model”, Preprint No. 13, Department of Theoretical Statistics, University of Copenhagen.

King, R., Plosser, C., Stock, J., and Watson, M. (1988): “Stochastic Trends and Economic Fluctuations”, American Economic Review, 81(4), 819–840.

Kongsted, H.C. (2002): “Testing the Nominal-to-Real Transformation”, Discussion Paper 02-06, Institute of Economics, University of Copenhagen.

––– (2003): “An I(2) Cointegration Analysis of Small-Country Import Price Determination”, Econometrics Journal, 6, 53–71.

Nielsen, H.B. (2003): “Cointegration Analysis in the Presence of Outliers”, Discussion Paper 03-05, Institute of Economics, University of Copenhagen, Chapter 3 in this Thesis.

Paruolo, P. (1996): “On the Determination of Integration Indices in I(2) Systems”, Journal of Econometrics, 72, 313–356.

––– and Rahbek, A. (1999): “Weak Exogeneity in I(2) VAR Systems”, Journal of Econometrics, 93, 281–308.

Rahbek, A., Kongsted, H.C., and Jørgensen, C. (1999): “Trend Stationarity in the I(2) Cointegration Model”, Journal of Econometrics, 90, 265–289.

Saikkonen, P. and Lutkepohl, H. (2000): “Trend Adjustment Prior to Testing for the Cointegration Rank of a VAR Process”, Journal of Time Series Analysis, 21(4), 435–456.

Tsay, R.S., Pena, D., and Pankratz, A.E. (2000): “Outliers in Multivariate Time Series”, Biometrika, 87(4), 789–804.

Warne, A. (1993): “A Common Trends Model: Identification, Estimation and Inference”, Seminar paper 555, Institute of International Studies, Stockholm University.

Chapter 3

Cointegration Analysis in the Presence of Outliers

Heino Bohn Nielsen

Abstract

The effects of innovational outliers and additive outliers in cointegrated vector autoregressive models are examined, and it is analyzed how outliers can be modelled with dummy variables. Using a Monte Carlo simulation it is illustrated that additive outliers are more distortionary than innovational outliers, and that misspecified dummies may distort inference on the cointegration rank in finite samples. This calls into question the common practice in applied cointegration analyses of including unrestricted dummy variables to account for large residuals. Instead it is suggested to focus on additive outliers and correct the data before the cointegration analysis, or alternatively to test the adequacy of a particular specification of dummies prior to determining the cointegration rank. The points are illustrated on a UK money demand data set.

Keywords: Cointegrated VAR, Innovational outlier, Additive outlier, Dummy variables, Monte Carlo.

JEL Classification: C32.

1 Introduction

Economic time series are frequently affected by special events, for instance policy interventions, strikes or gross measurement errors. Such events often show up as large residuals, or outliers, in econometric models, and that raises two issues for an applied econometrician: first, the inferential consequences of outliers if they are not detected, and second, how the irregularities can be modelled with dummy variables. In this paper we address these issues for the case of the cointegrated vector autoregressive (VAR) model, which has been widely applied in many fields of empirical research.

The effects of non-modelled outliers in autoregressive models depend on their precise nature and a distinction is often made between innovational outliers (IOs) and additive outliers (AOs), see inter alia Fox (1972), Tsay (1986) or Muirhead (1986). An IO is produced by a shock to the innovation term of a data generating process (DGP), and is propagated through the autoregressive structure. An AO, on the other hand, is superimposed on the levels of the data, i.e. independently of the autoregressive parameters.

I am grateful to Hans Christian Kongsted, Bent Nielsen, participants at the Conference on Bridging Economics and Econometrics, Florence, as well as participants in seminars at the Institute of Economics, Copenhagen, and Nuffield College, Oxford, for numerous comments and suggestions.
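The distinction can be illustrated with a minimal sketch in a univariate AR(1), which is a simplifying assumption and not the multivariate model of the paper. The function name `simulate_ar1` and the parameter values are hypothetical; the innovations are set to zero so that only the outlier effect is visible.

```python
def simulate_ar1(rho, shocks, additive=None):
    """Simulate X_t = rho * X_{t-1} + shock_t from X_0 = 0 and return Y_t = X_t + additive_t."""
    additive = additive or [0.0] * len(shocks)
    x_prev, path = 0.0, []
    for shock, add in zip(shocks, additive):
        x_prev = rho * x_prev + shock   # autoregressive propagation of the shock
        path.append(x_prev + add)       # additive term bypasses the dynamics
    return path

# Hypothetical illustration values: sample length, outlier date, size, AR coefficient.
T, T0, size, rho = 8, 3, 5.0, 0.5
impulse = [size if t == T0 else 0.0 for t in range(T)]
zeros = [0.0] * T

io_path = simulate_ar1(rho, shocks=impulse)                  # outlier enters the innovation
ao_path = simulate_ar1(rho, shocks=zeros, additive=impulse)  # outlier added to the level

print(io_path)  # decays geometrically after T0
print(ao_path)  # affects the observation at T0 only
```

In this sketch the IO dies out at the rate of the autoregressive coefficient, whereas the AO leaves all other observations untouched.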

In the case of a fixed number of outlying observations asymptotic inference in the cointegration model is unchanged, in the sense that the asymptotic distributions are unaffected; but the distortionary effects could be important in finite samples. Using simulations, Doornik, Hendry, and Nielsen (1998) find that ignored IOs have only minor consequences for small sample inference on the cointegration rank of a VAR process, while Franses and Haldrup (1994), Shin, Sarkar, and Lee (1996) and Vogelsang (1999) find that AOs may bias inference towards the finding of stationarity or cointegration.

The second issue, how outliers in a cointegrated VAR model can be modelled with

dummy variables, is less resolved. An IO is straightforward to model with an unrestricted

dummy variable, whereas an AO is more difficult to handle in a cointegrated VAR model

because it does not fit into the reduced rank regression structure. In applications of

the cointegrated VAR model the usual practice is to identify outlying observations from

the estimated residuals and to include unrestricted (innovational) dummies to whiten

residuals, see inter alia Hendry and Juselius (2001). But to the best of our knowledge

there is little justification for this practice; and it is not obvious that outliers of a general

form, and AOs in particular, can be successfully modelled with unrestricted dummies.

Moreover, the focus on IOs in practical applications is not in line with the result that IOs

are relatively harmless while the distortion from AOs may be more severe.

In this paper we use a Monte Carlo simulation to analyze how IOs and AOs can be approximated in small samples. We find that the usual innovational model is misspecified if an outlier is additive, and AOs are very difficult to approximate with unrestricted dummies. As a result, the usual practice may bias the estimated parameters if AOs are present and may distort inference on the cointegration rank.

Instead we suggest two alternative approaches. One approach is to test for the type of

outliers prior to determining the cointegration rank. For this we propose an algorithm for

maximum likelihood estimation of the cointegrated VAR model with additive dummies,

and construct a simple outlier detection procedure. The other approach utilizes the fact

that IOs are not particularly harmful and focuses on correcting for AOs. AOs are related

to the time series and not to the autoregressive model, and the correction for AOs could be

done directly in the data. One possibility is to use the maximum likelihood estimator of

the model with additive dummies to interpolate observations with AOs, and this method

gives good results. We also consider a very simple alternative, namely to correct for AOs

using linear interpolation in the data. This is easy to implement and is almost as effective

as the maximum likelihood approach, and, moreover, it is much preferable to unrestricted

dummies for the outlying observations.
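As a rough sketch of the linear interpolation idea, each flagged observation can be replaced by the average of its two neighbours. The helper name `interpolate_outliers` is hypothetical, and the sketch assumes isolated outliers in the interior of the sample; consecutive outliers or outliers at the sample ends would need separate treatment.

```python
def interpolate_outliers(series, outlier_positions):
    """Replace isolated interior outliers by the mean of the two neighbouring observations."""
    corrected = list(series)
    for t in outlier_positions:
        if 0 < t < len(series) - 1:
            # Use the original neighbours, which are assumed not to be outliers themselves.
            corrected[t] = 0.5 * (series[t - 1] + series[t + 1])
    return corrected

# Hypothetical data with an additive outlier at position 2.
observed = [1.0, 2.0, 10.0, 4.0, 5.0]
cleaned = interpolate_outliers(observed, outlier_positions=[2])
print(cleaned)
```

For a vector process the same correction would be applied to the affected variable only.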

The rest of the paper expands on these results. First, Section 2 introduces the cointegrated VAR model and two ways to include dummy variables. Section 3 briefly presents the estimation algorithms and some hypotheses of interest. Section 4 then gives an empirical illustration of the role of outliers and dummy variables based on UK money demand data, and the effects of outliers and dummy variables are then analyzed in a Monte Carlo simulation in Section 5. Finally, Section 6 concludes and gives recommendations for applied cointegration analysis in the presence of outliers.

2 Models for Deterministic Components

We consider the $p$-dimensional cointegrated VAR model

$$\Delta Y_t = \alpha\beta' Y_{t-1} + \sum_{i=1}^{k-1} \Gamma_i \Delta Y_{t-i} + \mu_0 + \alpha\beta_0' t + \mu_t + \epsilon_t, \qquad t = 1, 2, \ldots, T, \tag{1}$$

where $\epsilon_t$ are i.i.d. Gaussian innovations, $N(0, \Omega)$. The parameters $\alpha$ and $\beta$ are of dimension $p \times r$ such that the rank of $\Pi = \alpha\beta'$ is $r \le p$, and the remaining autoregressive parameters, $\Gamma_1, \ldots, \Gamma_{k-1}$, are each of dimension $p \times p$.

We consider a deterministic specification given by an unrestricted constant, $\mu_0$, and a restricted linear drift term, $\mu_1 t = \alpha\beta_0' t$. This model allows for a linear trend both in the stationary and non-stationary directions of the data and is often favored in empirical applications, see Nielsen and Rahbek (2000) for a comparison with other specifications. Finally, (1) includes an additional deterministic function, $\mu_t$, which will contain indicator variables. To characterize impulses and balanced impulses respectively, we define for a particular observation $T_0$ the indicators

$$D_t(T_0) = 1\{t = T_0\} \quad \text{and} \quad d_t(T_0) = \Delta D_t(T_0),$$

where $1\{\cdot\}$ is the indicator function equal to one if the expression in curly brackets is true. For ease of notation we will sometimes drop the reference to $T_0$ and denote the variables $D_t$ and $d_t$. Throughout the paper we consider (1) with $\mu_t = 0$ as the baseline specification and we denote this model $H^*(r)$. In the subsections below we present two distinct ways to augment the baseline model with indicator variables, and we characterize the resulting specifications both as DGPs and estimation models. When (1) is a DGP we denote by outliers the effects on the data of the indicator variables in $\mu_t \neq 0$. When (1) is an estimation model we use dummy variables to denote the indicators included in $\mu_t$ to approximate irregularities in the data.

If the $(p-r) \times (p-r)$ matrix $\alpha_\perp' \Gamma \beta_\perp$ is non-singular$^1$ and the roots of the characteristic polynomial

$$A(z) = (1-z) I - \alpha\beta' z - \sum_{i=1}^{k-1} \Gamma_i (1-z) z^i, \tag{2}$$

are located either outside the unit circle or at $z = 1$, the process $Y_t$ is I(1) with representation

$$Y_t = C \sum_{i=1}^{t} (\epsilon_i + \mu_i) + C_1(L)(\epsilon_t + \mu_t) + \kappa_1 t + \kappa_0, \tag{3}$$

where $C = \beta_\perp (\alpha_\perp' \Gamma \beta_\perp)^{-1} \alpha_\perp'$ has reduced rank $p-r$, $C_1(L)$ is an infinite but convergent matrix polynomial, $\kappa_1$ is a function of the parameters, and $\kappa_0$ involves the parameters and the initial values, see Johansen (1996, Theorem 4.2).

$^1$ For a $p \times r$ matrix $\alpha$ we denote by $\alpha_\perp$ the $p \times (p-r)$ dimensional orthogonal complement such that $\alpha'\alpha_\perp = 0$ and $\mathrm{span}(\alpha : \alpha_\perp) = \mathbb{R}^p$. Further we define $\Gamma = I - \sum_{i=1}^{k-1} \Gamma_i$.

2.1 The Innovational Model, $H^*_I(r)$

The usual way to augment (1) with an $n$-dimensional vector of dummy variables, $D_t$ say, is to let $\mu_t = \phi D_t$, where $\phi$ is a $p \times n$ matrix of unrestricted coefficients, see Johansen (1996, Chapter 5) and Hendry and Juselius (2001). It follows from (3) that the total effect in $Y_t$ is given by $C \sum_{i=1}^{t} \phi D_i + C_1(L) \phi D_t$, and unless $\alpha_\perp' \phi = 0$ the levels contain the cumulated effect of $D_t$. Since $\beta' C = 0$, the cointegrating relations, $\beta' Y_t$, annihilate both the common stochastic trends and the cumulated effect of $D_t$, and the deterministic specifications are different in the stationary and non-stationary directions.

The interpretation of an innovational impulse outlier at time $T_0$, i.e. the indicator variable $D_t(T_0)$ included in a DGP, is a large shock that is subject to the same dynamic adjustment as the usual innovations. Columns one and two of Figure 1 give an example of the effects in the stationary and non-stationary directions of a balanced impulse, $d_t$, and a non-balanced impulse, $D_t$.
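The difference between the two impulses in the non-stationary directions comes from the cumulation in (3). The following sketch, with hypothetical helper names and a sample of six observations chosen purely for illustration, shows that cumulating the non-balanced impulse gives a permanent level shift, while cumulating the balanced impulse gives back a one-period blip.

```python
def impulse(T, T0):
    """Non-balanced impulse D_t(T0) = 1{t = T0} for t = 1, ..., T."""
    return [1 if t == T0 else 0 for t in range(1, T + 1)]

def difference(x):
    """First difference, taking the pre-sample value to be zero."""
    return [x[0]] + [x[t] - x[t - 1] for t in range(1, len(x))]

def cumulate(x):
    """Running sum, mimicking the cumulation of the deterministic terms in the levels."""
    total, out = 0, []
    for value in x:
        total += value
        out.append(total)
    return out

D = impulse(T=6, T0=3)   # non-balanced impulse
d = difference(D)        # balanced impulse, +1 followed by -1
step = cumulate(D)       # permanent level shift from T0 onwards
blip = cumulate(d)       # cumulated balanced impulse is a single blip
print(D, d, step, blip)
```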

In an estimation model an innovational impulse dummy renders the corresponding

residual equal to zero, eliminating the contribution to the likelihood function.

2.2 The Additive Model, $H^*_A(r)$

An alternative additive formulation can be written as the unobserved components model

$$Y_t = X_t + \theta D_t \tag{4}$$

$$A(L) X_t - \mu_0 - \alpha\beta_0' t = \epsilon_t, \tag{5}$$

where $X_t$ are unobserved variables that obey a cointegrated VAR model and $\theta$ is a $p \times n$ matrix of unrestricted coefficients. In this case the outliers are related to the time series, independent of the autoregressive model. The interpretation of an impulse AO could be an isolated measurement error, e.g. a typing mistake. It follows from (4) that $\theta D_t$ enters the representation (3) additively and in general, the effects in the stationary and non-stationary directions are similar, cf. column three of Figure 1.

To characterize the properties of the estimation model, first note that according to (4) and (5) the dummy variables are included with the full lag structure of the endogenous variables, $\mu_t = A(L) \theta D_t$. It is useful to rewrite the additive specification as a vector error correction model for the observed variables:

$$\Delta Y_t = \alpha \left( \beta' : \beta_0' : \beta_1' \right) \begin{pmatrix} Y_{t-1} \\ t \\ D_{t-1} \end{pmatrix} + \sum_{i=1}^{k-1} \Gamma_i \Delta Y_{t-i} + \sum_{i=0}^{k-1} \theta_i \Delta D_{t-i} + \mu_0 + \epsilon_t, \tag{6}$$

subject to the $k$ sets of restrictions

$$\beta_1 = -\theta_0' \beta \tag{7}$$

$$\theta_i = -\Gamma_i \theta_0, \qquad i = 1, \ldots, k-1. \tag{8}$$
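The restrictions can be traced back to (4) and (5) in a few lines. The following worked derivation is a sketch that uses only the definition of $A(L)$ in (2); the identification $\theta_0 = \theta$ is read off from the term in $\Delta D_t$.

```latex
\begin{align*}
A(L) Y_t &= A(L) X_t + A(L)\theta D_t
          = \mu_0 + \alpha\beta_0' t + \epsilon_t + A(L)\theta D_t, \\
A(L) Y_t &= \Delta Y_t - \alpha\beta' Y_{t-1} - \sum_{i=1}^{k-1}\Gamma_i \Delta Y_{t-i}, \\
A(L)\theta D_t &= \theta \Delta D_t - \alpha\beta'\theta D_{t-1}
          - \sum_{i=1}^{k-1}\Gamma_i \theta \Delta D_{t-i}.
\end{align*}
% Matching terms with (6):
%   coefficient on \Delta D_t      : \theta_0 = \theta
%   coefficient on D_{t-1}         : \alpha\beta_1' = -\alpha\beta'\theta_0, i.e. \beta_1 = -\theta_0'\beta
%   coefficient on \Delta D_{t-i}  : \theta_i = -\Gamma_i\theta_0, \quad i = 1, \ldots, k-1
```

Solving the first two lines for $\Delta Y_t$ and inserting the third gives (6) with the coefficients tied together as in (7) and (8).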

If the restrictions in (7) and (8) are not imposed, the dynamics of the dummies are approximated by the $k$ free parameters $(\theta_0, \ldots, \theta_{k-1})$. This is closely related to the model proposed by Johansen, Mosconi, and Nielsen (2000) for broken levels and broken linear trends. While the approximation is often appropriate in the case of the transition to a structural break, it is very costly in terms of degrees of freedom for the case of isolated outliers and there is a potential efficiency gain by imposing the restrictions.

It is worth noting that an additive impulse dummy, $D_t(T_0)$, eliminates the contribution from the observation, $Y_{T_0}$, to the likelihood function rather than the contribution from the residual, $\epsilon_{T_0}$. The interpretation of an additive dummy is therefore in a sense equivalent to the interpretation of a dummy variable in a static model; and it is closely related to the interpolation of missing values, see Gomez, Maravall, and Pena (1999).

3 Estimation and Some Hypotheses of Interest

Maximum Likelihood (ML) estimation of the basic model, $\mu_t = 0$, is reduced rank regression (RRR), equivalent to solving the eigenvalue problem

$$\left| \lambda S_{11} - S_{10} S_{00}^{-1} S_{01} \right| = 0,$$

where $S_{ij} = T^{-1} \sum_{t=1}^{T} R_{it} R_{jt}'$ are sample moment matrices, and $R_{0t}$ and $R_{1t}$ are least squares residuals of regressing $\Delta Y_t$ and $Y^*_{t-1} = (Y_{t-1}' : t)'$ respectively on the unrestricted variables $U_t = (\Delta Y_{t-1}' : \ldots : \Delta Y_{t-k+1}' : 1)'$, see Johansen (1996, Chapter 6). That yields $p+1$ eigenvalues, $1 > \hat\lambda_1 > \hat\lambda_2 > \ldots > \hat\lambda_p > \hat\lambda_{p+1} = 0$, and the estimate of the cointegrating relations, $\hat\beta^* = (\hat\beta' : \hat\beta_0')'$, is given by the eigenvectors corresponding to the $r$ largest eigenvalues.

Different values of $r$ define a sequence of nested models, $H^*(0) \subset \ldots \subset H^*(r) \subset \ldots \subset H^*(p)$, and a Likelihood Ratio (LR) test for a cointegration rank smaller than or equal to $r$ against a rank smaller than or equal to $p$ is given by the so-called trace test statistic

$$Q_r = -2 \log Q\left(H^*(r) \mid H^*(p)\right) = -T \sum_{i=r+1}^{p} \log\left(1 - \hat\lambda_i\right).$$

The asymptotic distribution of $Q_r$ depends on the deterministic specification and involves Brownian motions. In the determination of the cointegration rank the idea is to reject a model, $H^*(r)$, only if all the more restricted models, $H^*(0), \ldots, H^*(r-1)$, are also rejected, starting from the most restricted model $H^*(0)$, see Johansen (1996, Chapter 12).
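Given the estimated eigenvalues, the trace statistics are simple to compute. The sketch below uses a hypothetical helper `trace_statistics` and made-up eigenvalues purely for illustration; critical values are not included and would have to come from the appropriate asymptotic tables or approximations.

```python
import math

def trace_statistics(eigenvalues, T):
    """Return [Q_0, Q_1, ..., Q_{p-1}] with Q_r = -T * sum_{i=r+1}^{p} log(1 - lambda_i)."""
    lam = sorted(eigenvalues, reverse=True)  # largest eigenvalue first
    return [-T * sum(math.log(1.0 - l) for l in lam[r:]) for r in range(len(lam))]

# Hypothetical eigenvalues for a p = 2 system with T = 100 observations.
q = trace_statistics([0.5, 0.2], T=100)
print(q)
```

The rank is then chosen by testing $Q_0, Q_1, \ldots$ in turn and stopping at the first non-rejection.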

The presence of innovational dummies does not complicate estimation. The dummies, $D_t$, can be included in $U_t$ and concentrated out prior to the RRR. The asymptotic distribution of the trace test statistic is not changed by a fixed number of outliers or a fixed number of innovational dummies, see Doornik, Hendry, and Nielsen (1998).

It is also straightforward to modify the RRR procedure to estimate the additive model (6) without the restrictions (7) and (8). The first differences, $\Delta D_t, \ldots, \Delta D_{t-k+1}$, can be included in $U_t$, and $Y^*_{t-1}$ can be augmented with the lagged levels, $D_{t-1}$. The effects of the outliers per se are still asymptotically negligible and will not affect the asymptotic distributions. But the augmented $Y^*_{t-1}$ increases the dimension of the eigenvalue problem and the reduced rank hypotheses also restrict the coefficients to $D_{t-1}$. The additional restrictions will change the asymptotic distribution of the trace test by adding an independent $\chi^2(p-r)$ term to the distribution of $Q_r$ for each included dummy variable, see Johansen, Mosconi, and Nielsen (2000) and Doornik, Hendry, and Nielsen (1998).

3.1 The Additive Model

The log-likelihood function for (6) under the restrictions in (7) and (8) is (apart from a constant) given by

$$\log L(\alpha, \beta, \Gamma_1, \ldots, \Gamma_{k-1}, \beta_0, \mu_0, \Omega, \theta) = -\frac{T}{2} \log|\Omega| - \frac{1}{2} \sum_{t=1}^{T} \left[ \left( A(L)(Y_t - \theta D_t) - \alpha\beta_0' t - \mu_0 \right)' \Omega^{-1} \left( A(L)(Y_t - \theta D_t) - \alpha\beta_0' t - \mu_0 \right) \right],$$

and no closed form solution for the estimator exists. The ML estimates can be obtained by a standard numerical procedure, but convergence can be very slow due to the dimensionality. As an alternative we suggest a simple algorithm that switches between two conditional ML estimations for which closed form solutions exist.

Consider iteration $j$. First, conditional on the estimate $\hat\theta_{j-1}$ of $\theta$ from the previous iteration, the ML estimates $(\hat\alpha, \hat\beta, \hat\Gamma_1, \ldots, \hat\Gamma_{k-1}, \hat\beta_0, \hat\mu_0, \hat\Omega)$ and hence $\hat A(L)$ can be found from RRR of the cointegrated VAR model for the corrected data, $X_t = Y_t - \hat\theta_{j-1} D_t$. Secondly, conditional on the remaining parameters, the ML estimate $\hat\theta_j$ of $\theta$ can be found from the estimated residuals, $\hat e_t = \hat A(L) Y_t - \hat\alpha\hat\beta_0' t - \hat\mu_0$, which under the model (4) and (5) are given by

$$\hat e_t = \hat A(L) \theta D_t + \epsilon_t = \hat H_t \mathrm{vec}(\theta) + \epsilon_t, \tag{9}$$

where $\mathrm{vec}(\theta)$ stacks the columns of $\theta$, and $\hat H_t = D_t' \otimes \hat A(L) = (\hat A(L) D_{1t} : \hat A(L) D_{2t} : \ldots : \hat A(L) D_{nt})$. The conditional likelihood function corresponding to (9) is maximized by the GLS-type estimator

$$\mathrm{vec}(\hat\theta_j) = \left( \sum_{t=1}^{T} \hat H_t' \hat\Omega^{-1} \hat H_t \right)^{-1} \left( \sum_{t=1}^{T} \hat H_t' \hat\Omega^{-1} \hat e_t \right), \tag{10}$$

see also Tsay, Pena, and Pankratz (2000) and Saikkonen and Lutkepohl (2000). ML estimates are obtained by iterating between the two steps until convergence. The covariance matrix of $\mathrm{vec}(\hat\theta)$ can be estimated by $\left( \sum_{t=1}^{T} \hat H_t' \hat\Omega^{-1} \hat H_t \right)^{-1}$ and Wald-type tests for hypotheses on $\theta$ can easily be constructed, see also Tsay, Pena, and Pankratz (2000). It is also straightforward to impose linear restrictions on $\mathrm{vec}(\theta)$ in the GLS step (10).

For dummy variables involving few observations the switching algorithm normally converges very fast and a starting value of $\hat\theta_0 = 0$ can be used. Alternatively, an initial estimate of $\theta$ can be obtained by ignoring the non-linear restrictions (7) and (8).
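The switching idea can be sketched in a deliberately simplified setting: a univariate AR(1) without deterministic terms and with one additive impulse dummy. This is an assumption made for illustration only; the RRR step of the paper is replaced by OLS for the single autoregressive coefficient, and the GLS step (10) collapses to a scalar regression. All names and parameter values are hypothetical.

```python
import random

def switching_ar1(y, T0, iterations=20):
    """Alternate OLS for rho and the scalar analogue of (10) for theta.

    Returns (rho, theta, ssr_path), where ssr_path holds the residual sum of
    squares after each full iteration.
    """
    T = len(y)
    d = [1.0 if t == T0 else 0.0 for t in range(T)]
    rho, theta, ssr_path = 0.0, 0.0, []   # starting value theta = 0
    for _ in range(iterations):
        # Step 1: estimate rho from the corrected data x_t = y_t - theta * d_t.
        x = [y[t] - theta * d[t] for t in range(T)]
        rho = (sum(x[t] * x[t - 1] for t in range(1, T))
               / sum(x[t - 1] ** 2 for t in range(1, T)))
        # Step 2: regress e_t = y_t - rho * y_{t-1} on h_t = d_t - rho * d_{t-1}.
        e = [y[t] - rho * y[t - 1] for t in range(1, T)]
        h = [d[t] - rho * d[t - 1] for t in range(1, T)]
        theta = sum(hi * ei for hi, ei in zip(h, e)) / sum(hi * hi for hi in h)
        ssr_path.append(sum((ei - theta * hi) ** 2 for hi, ei in zip(h, e)))
    return rho, theta, ssr_path

# Simulated data with an additive outlier of size 5 at T0 (illustrative values).
random.seed(1)
T, T0, rho_true, theta_true = 120, 60, 0.5, 5.0
x, y = 0.0, []
for t in range(T):
    x = rho_true * x + random.gauss(0.0, 1.0)
    y.append(x + (theta_true if t == T0 else 0.0))

rho_hat, theta_hat, ssr_path = switching_ar1(y, T0)
print(rho_hat, theta_hat, ssr_path[0], ssr_path[-1])
```

Because each step minimizes the same sum of squared residuals over one block of parameters, the criterion cannot increase between iterations, which is what makes the switching scheme well behaved in this simple case.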

To briefly illustrate the properties of the estimator in (10), consider a known VAR model with characteristic polynomial written as $A(L) = I - \Pi_1 L - \ldots - \Pi_k L^k$, where $\Pi_1, \ldots, \Pi_k$ are autoregressive parameters in the level specification, and consider the case of $n = 1$ intervention dummy, $D_t = D_t(T_0)$. In this case $H_t = A(L) D_t = D_t - \Pi_1 D_{t-1} - \ldots - \Pi_k D_{t-k}$, and

$$\hat\theta = \left( \Omega^{-1} + \sum_{i=1}^{k} \Pi_i' \Omega^{-1} \Pi_i \right)^{-1} \left( \Omega^{-1} \hat e_{T_0} - \sum_{i=1}^{k} \Pi_i' \Omega^{-1} \hat e_{T_0+i} \right), \tag{11}$$

involving $k+1$ residuals. The algorithm can also be used to estimate the model with an innovational dummy by substituting a unit matrix for $A(L)$. In this case $H_t = 1$ for $t = T_0$ and $H_t = 0$ otherwise, and all information on $\theta$ is contained in a single residual

$$\tilde\theta = \left( D_{T_0}' \Omega^{-1} D_{T_0} \right)^{-1} \left( D_{T_0}' \Omega^{-1} \hat e_{T_0} \right) = \hat e_{T_0}. \tag{12}$$

Since the estimators in (11) and (12) are based on a fixed number of residuals even as $T \to \infty$, the estimates are not consistent, see Davidson (2001, p. 147) and Doornik, Hendry, and Nielsen (1998). Furthermore, the distribution of the test for $\theta = 0$ does not follow from a Central Limit Theorem, but has to be derived directly from the properties of the residuals under the null. Consider as an example the Wald test for $\tilde\theta = 0$:

$$W = \tilde\theta' \left( V[\tilde\theta] \right)^{-1} \tilde\theta = \hat e_{T_0}' \Omega^{-1} \hat e_{T_0},$$

which given the assumption of Gaussian innovations is distributed as a $\chi^2(p)$ variable under the null of no outliers. For the additive model the notation is more complicated, but a $\chi^2(p)$ distribution again follows from the assumption of Gaussian innovations under the null.

3.2 Outlier Detection

As an alternative to the usual practice of including innovational dummy variables to account for large residuals, it seems natural to estimate both the innovational model, $H^*_I(r)$, and the additive model, $H^*_A(r)$, for a given dummy variable, $D_t = D_t(T_0)$ say, and base the specification on a test criterion. Both outlier models nest the basic model, i.e.

$$H^*(r) \subset H^*_I(r) \quad \text{and} \quad H^*(r) \subset H^*_A(r), \tag{13}$$

while $H^*_I(r)$ and $H^*_A(r)$ are not nested.$^2$ Each of the hypotheses in (13) can be tested using e.g. a Wald type test or the LR test statistic of the form

$$\tau_i = -2 \log Q\left(H^*(r) \mid H^*_i(r)\right) = -T \log\left| \hat\Omega^{-1} \hat\Omega_i \right|, \qquad i = I, A, \tag{14}$$

where $\hat\Omega$ and $\hat\Omega_i$ denote the estimated covariance matrices under $H^*(r)$ and $H^*_i(r)$ respectively.

$^2$ For the last observation, $D_t = D_t(T)$, the two models are equivalent.
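The statistic in (14) only requires the two estimated covariance matrices, since $-T \log|\hat\Omega^{-1}\hat\Omega_i| = T(\log|\hat\Omega| - \log|\hat\Omega_i|)$. A minimal sketch follows, with hypothetical function names and made-up $2 \times 2$ matrices; the log-determinant routine assumes positive definite input so that no pivoting is needed.

```python
import math

def log_det(matrix):
    """Log-determinant of a positive definite matrix by Gaussian elimination (no pivoting)."""
    a = [row[:] for row in matrix]
    n, total = len(a), 0.0
    for i in range(n):
        total += math.log(a[i][i])          # pivots are positive for a PD matrix
        for j in range(i + 1, n):
            factor = a[j][i] / a[i][i]
            for k in range(i, n):
                a[j][k] -= factor * a[i][k]
    return total

def lr_outlier_statistic(omega_base, omega_outlier, T):
    """tau = -T * log|omega_base^{-1} omega_outlier| = T * (log|omega_base| - log|omega_outlier|)."""
    return T * (log_det(omega_base) - log_det(omega_outlier))

# Hypothetical covariance estimates under the basic model and under an outlier model.
omega_base = [[2.0, 0.5], [0.5, 1.0]]
omega_outlier = [[1.0, 0.0], [0.0, 1.0]]
tau = lr_outlier_statistic(omega_base, omega_outlier, T=100)
print(tau)
```

Computing the statistic for both $i = I$ and $i = A$ with the same dummy gives the pair of values on which the model choice is based.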

When the basic model, $H^*(r)$, can be rejected against both outlier models a choice has to be made between the non-nested candidates. One possible test strategy is to find a model that nests both outlier models, and test against the nesting alternative. In many situations, however, the nesting model is not a natural candidate model, and an alternative is to apply a model selection approach. Since $H^*_I(r)$ and $H^*_A(r)$ have the same number of parameters, the use of any conventional information criterion reduces to choosing the model with the highest likelihood. A similar result is obtained using the Likelihood Dominance Criterion of Pollak and Wales (1991).

In the construction of the test (14) the location of a potential outlier is considered known. In practical applications, however, this is typically not the case and an important issue is how to identify the location of outlying observations. Usually in applied cointegration analyses, outliers are identified from a priori information and the size of the residuals. As an example, Hendry and Juselius (2001) identify outliers as observations with absolute standardized residuals larger than 3.3.
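A residual-based rule of this kind can be sketched as follows. The helper name and the data are hypothetical, and the standardization (sample mean and population standard deviation of one residual series) is an assumption; the cited authors' exact standardization may differ.

```python
import statistics

def flag_large_residuals(residuals, threshold=3.3):
    """Return positions whose absolute standardized residual exceeds the threshold."""
    mean = statistics.mean(residuals)
    sd = statistics.pstdev(residuals)
    return [t for t, e in enumerate(residuals) if abs(e - mean) / sd > threshold]

# Hypothetical residual series: twenty small residuals and one large one at the end.
residuals = [0.5, -0.5] * 10 + [6.0]
flagged = flag_large_residuals(residuals)
print(flagged)
```

Note that a large residual inflates the estimated standard deviation itself, so a fixed threshold of this type can mask outliers in short samples.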

To determine the location as well as the type of outliers, an automatic outlier detection procedure can be used, see also Tsay (1986), Chen and Liu (1993) and Tsay, Pena, and Pankratz (2000). This corresponds to performing the test (14) for all observations, $\tau_i(t)$ for $t = 1, 2, \ldots, T$, producing two series of test statistics, $\{\tau_i(t)\}_{t=1}^{T}$, $i = I, A$. It is natural to focus on the largest test statistic, i.e.

$$\tau_i^{\max} = \max_{1 \le t \le T} \tau_i(t). \tag{15}$$

Under the i.i.d. assumption on the innovations it follows from (12) that the test statistics for an IO, $\tau_I(t)$, are approximately independent over $t = 1, 2, \ldots, T$, as they involve only one residual. The tests for AOs involve sequences of $k+1$ residuals, which implies a dependence over $t$. The dependence is related to the autoregressive parameters and complicates the distribution of the maximum test statistic. A simple approximation to the critical values of the maximum test statistic can be based on the Bonferroni inequality. In particular the multi-comparison significance level $\delta_{\max} = 1 - (1-\delta)^{1/T}$ can be used in each $\chi^2$-based test to ensure an overall Type I error probability below $\delta$. Alternatively the critical values could be obtained by simulations, see Tsay, Pena, and Pankratz (2000) or Perron and Rodríguez (2003) for examples. Table 1 reports finite sample critical values for the maximum tests, $\tau_I^{\max}$ and $\tau_A^{\max}$, when the 4-dimensional DGP from Section 5 is used. Note that the quantiles of the tests for IOs and AOs are quite close and in the application below we use the same critical values for the two tests, namely the average of the 95% quantiles. Also note the non-monotonicity of the quantiles as a function of $T$. This reflects a small-sample distortion of the test statistics, and will not be picked up by the Bonferroni approximation.
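The per-test level and the implied $\chi^2$ critical value can be computed directly. The sketch below relies on the closed-form survival function of a $\chi^2$ variable with an even number of degrees of freedom, which covers the $p = 4$ case used here; the function names are hypothetical and the bisection bounds are assumptions chosen to bracket any practically relevant quantile.

```python
import math

def per_test_level(delta, T):
    """Multi-comparison level delta_max = 1 - (1 - delta)^(1/T) for T individual tests."""
    return 1.0 - (1.0 - delta) ** (1.0 / T)

def chi2_sf_even(x, dof):
    """P(chi2(dof) > x) for even dof: exp(-x/2) * sum_{j < dof/2} (x/2)^j / j!."""
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(dof // 2))

def chi2_critical_even(alpha, dof, lo=0.0, hi=200.0):
    """Critical value c with P(chi2(dof) > c) = alpha, found by bisection."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_sf_even(mid, dof) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

level = per_test_level(delta=0.05, T=100)          # level used in each individual test
individual_cv = chi2_critical_even(0.05, dof=4)    # ordinary 95% quantile of chi2(4)
maximum_cv = chi2_critical_even(level, dof=4)      # approximate critical value for the maximum
print(level, individual_cv, maximum_cv)
```

The approximation ignores both the dependence between the AO statistics and the finite-sample distortions mentioned above, so simulated critical values remain the more reliable option.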

If multiple outliers are present, we follow inter alia Tsay (1986) and include a dummy for the most significant outlier, and repeat the procedure conditional on previously identified outliers, see also Perron and Rodríguez (2003) for a discussion of the iterative procedure.

In (14), we still need to select the cointegration rank, $r$, for which the outlier detection should be conducted. One approach is to use the full rank model $H^*(p)$, which facilitates a sequential procedure: first the outlying observations can be identified in the stationary VAR model, and then the cointegration rank can be determined conditional on the identified outliers. Alternatively a pretest for the cointegration rank can be applied, and the search for outliers can be made conditional on that estimated rank, $\hat r_0$ say. Finally, the cointegration rank can be determined conditional on the detected outliers. Intuitively, there might be a gain in the outlier detection from using a rank, $r_0$, close to the true rank, $r$, and in the simulations in Section 5 we try both approaches.

4 Empirical Illustration

To illustrate the importance of outliers and dummy variables in cointegration analyses we consider a set of quarterly UK money demand data $Y_t = (m_t : y_t : \Delta p_t : r_t)'$, $t = 1963{:}1, \ldots, 1989{:}2$, where $m_t$ denotes the log of real money M1, $y_t$ denotes the log of real final expenditures, $\Delta p_t$ denotes the change in the log of the deflator of $y_t$, and $r_t$ denotes the difference between the three month interest rate and a measure of the own interest rate of M1.$^3$

The data have previously been analyzed in inter alia Hendry and Doornik (1994) and Doornik, Hendry, and Nielsen (1998) using a VAR model with $k = 2$ lags and we set up a similar model. Hendry and Mizon (1993) analyze the data for a shorter sample and include two innovational dummy variables, $D_t = (\mathit{Doil}_t : \mathit{Dout}_t)'$, where

$$\mathit{Doil}_t = 1\{t = 1973{:}3\} + 1\{t = 1973{:}4\} + 1\{t = 1979{:}3\}$$

$$\mathit{Dout}_t = 1\{t = 1972{:}4\} + 1\{t = 1973{:}1\} + 1\{t = 1979{:}2\},$$

and give a detailed account for their interpretation of the dummies.$^4$ Hendry and Doornik (1994) and Doornik, Hendry, and Nielsen (1998) include the same dummies for the extended sample.
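The two composite dummies are easy to construct once the sample is indexed by (year, quarter) pairs. The sketch below is purely illustrative; the helper `quarterly_range` and the variable names are hypothetical, and only the dates come from the definitions above.

```python
def quarterly_range(start, end):
    """List (year, quarter) pairs from start to end, both inclusive."""
    year, quarter = start
    out = []
    while (year, quarter) <= end:
        out.append((year, quarter))
        year, quarter = (year, quarter + 1) if quarter < 4 else (year + 1, 1)
    return out

sample = quarterly_range((1963, 1), (1989, 2))

# Dates taken from the definitions of Doil_t and Dout_t in the text.
doil_dates = {(1973, 3), (1973, 4), (1979, 3)}
dout_dates = {(1972, 4), (1973, 1), (1979, 2)}

doil = [1 if date in doil_dates else 0 for date in sample]
dout = [1 if date in dout_dates else 0 for date in sample]
print(len(sample), sum(doil), sum(dout))
```

Replacing each composite dummy by three separate impulse indicators gives the specification with six unrestricted dummies.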

The trace statistics for determining the cointegration rank are reported in Table 2. Row (A) is based on the specification with no dummies. The hypothesis of $r = 0$ can be easily rejected whereas the test for $r \le 1$ is borderline with a $p$-value of 8%. Row (B) reports the results based on a model including the original innovational dummies, $\mathit{Doil}_t$ and $\mathit{Dout}_t$. This specification clearly points towards $r = 1$. The dummy variables, $\mathit{Doil}_t$ and $\mathit{Dout}_t$, identify 6 outliers in the data. The restriction that these can be modelled with two composite dummies is not important for the rank determination, as shown by the results based on a model with 6 unrestricted dummies in row (C).

$^3$ All calculations have been conducted in Ox 3.0, see Doornik (2001).

$^4$ $\mathit{Dout}_t$ accounts for expansionary economic policy measures attributed to the Heath-Barber boom and the first effect of the Thatcher government. $\mathit{Doil}_t$ accounts for the effects of the two oil crises.

Although the second cointegrating relation appears to be insignificant in their preferred

model, Doornik, Hendry, and Nielsen (1998) retain it and continue the analysis with r = 2.

To illustrate the role of the dummies, Figure 2 depicts the actual data together with the

estimated deterministic components. The latter reflect the total effects of the dummies,

Doilt and Doutt, and the linear drift term, and are generated from the initial values by

setting all innovations equal to zero in the estimated model. Furthermore the cointegrating

linear combinations are given together with the estimated deterministic components. The

unrestricted impulse dummies produce marked shifts in the levels of the data and the short-

run transitions are prolonged. The dummies have no long-run effects in the cointegrating

relations, but the short-run effects account for a considerable proportion of the variation

around the linear trend.

4.1 Outlier Detection.

As an alternative to the original model we apply the outlier detection procedure outlined above, and Table 3 reports the results calculated in the full rank case. In the first iteration an outlier is found for 1973:2. Hendry and Mizon (1993) identify 1973 as a year for potential problems due to the Heath government's attempts to “go for growth” and the first oil crisis, but the observation is not modelled by the original dummies. Based on the likelihood values it is difficult to distinguish a model with an innovational dummy and an additive dummy, but the innovational model is marginally preferred. In the second iteration, which is conditional on the dummy for 1973:2, an AO in 1974:2 is detected, apparently in the inflation equation. This observation is also not modelled by the original dummies. In the third iteration an outlier in 1973:1 is detected. This is mainly an outlier in total expenditures and is also identified in $\mathit{Dout}_t$ and interpreted as an expansionist policy measure. The tests suggest modelling the outlier as additive but the likelihood function is not very informative on the outlier type. Number four is an IO in 1979:2 located in total expenditures, and is also included in $\mathit{Dout}_t$.

There are still 24 observations for which the test statistic is larger than the critical value for individual tests, $\chi^2_{0.95}(4)$, but none of these are significant according to the simulated critical value for the maximum statistic, $\tau^{\max}_i$.
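The gap between the two critical values can be made concrete. The following is a minimal sketch, assuming only that the individual tests are compared with the 95% quantile of a $\chi^2(4)$ distribution as stated above; the function names and the bisection solver are illustrative choices and not part of the procedure in the text.

```python
import math

def chi2_cdf_4df(x):
    """CDF of a chi-square variable with 4 degrees of freedom (closed form)."""
    if x <= 0.0:
        return 0.0
    return 1.0 - math.exp(-x / 2.0) * (1.0 + x / 2.0)

def chi2_quantile_4df(prob, lo=0.0, hi=100.0, tol=1e-10):
    """Invert the CDF by bisection (illustrative solver, not from the text)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if chi2_cdf_4df(mid) < prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

individual_cv = chi2_quantile_4df(0.95)  # critical value for a test at a single date
maximum_cv = 22.73                       # simulated critical value reported in Table 3
print(round(individual_cv, 2), maximum_cv)
```

The individual critical value is far below the one for the maximum statistic, which is why many dates can exceed the former without any of them being significant according to the latter.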

The likelihood ratio tests for rank determination based on the preferred specification,

with 2 additive dummies and 2 innovational dummies, are presented in row (D) of Table

2 and suggest a cointegration rank of r = 2. This is in line with the priors of Doornik,

Hendry, and Nielsen (1998) based on economic theory.

This empirical example illustrates that inference on the cointegration rank is sensitive

to the specification of dummy variables in finite samples, and suggests that care should

be taken in the design of the empirical model. Furthermore, it illustrates how a formal

outlier detection, combined with a priori knowledge on the timing of special events in the

data, can be useful in pointing out critical observations.
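The sensitivity of the selected rank can be read directly off the p-values in Table 2. The sketch below assumes the usual sequential procedure at a 5% level (test $r = 0$, then $r \le 1$, and so on, stopping at the first non-rejection); the function name `select_rank` is hypothetical and the p-values are copied from Table 2.

```python
def select_rank(p_values, level=0.05):
    """Return the first r for which the hypothesis is not rejected, testing r = 0, 1, ... in turn."""
    for r, p in enumerate(p_values):
        if p >= level:
            return r
    return len(p_values)  # every hypothesis rejected: full rank

# p-values for r = 0, r <= 1, r <= 2, r <= 3, copied from Table 2
table2 = {
    "A": [0.00, 0.08, 0.80, 0.67],  # no dummies
    "B": [0.00, 0.55, 0.60, 0.45],  # original dummies
    "C": [0.00, 0.37, 0.65, 0.59],  # 6 unrestricted dummies
    "D": [0.00, 0.01, 0.65, 0.48],  # preferred specification
}
selected = {row: select_rank(p) for row, p in table2.items()}
print(selected)
```

Under this reading only the preferred specification in row (D) selects $r = 2$; the other three dummy specifications stop at $r = 1$.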

5 Monte Carlo Simulation

In this Section we set up a Monte Carlo simulation to analyze how outliers can be modelled

with dummy variables. As the baseline DGP in the simulations we use an estimated

version of the UK money demand model from Section 4, where the dummy variables

are excluded. This is a 4-dimensional VAR(2) with a cointegration rank of $r = 2$.⁵

We augment the basic DGP with $n \in \{1, 2, 4, 6\}$ impulse indicator variables, i.e. $D_t =$

$(D_t(T_1) : D_t(T_2) : \ldots : D_t(T_n))'$, where $T_i = i \cdot T \cdot (n+1)^{-1}$, to produce IOs or AOs located

equidistantly in the time series. Each outlier is imposed randomly on one of the four

variables, and has a magnitude of 5 residual standard deviations. We generate samples

for t = −101, ..., 0, 1, ..., T starting from the initial values of the actual data, and discard

100 observations to minimize the importance of that choice. For each case the same

random Gaussian innovations are used. The Monte Carlo results are based on 10000

replications, and the standard deviation of an estimator of the rejection probability 0.05

is approximately 0.002.
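The stated simulation uncertainty follows from the binomial variance of a rejection frequency. A minimal check, assuming independent replications (the function name is illustrative):

```python
import math

def mc_standard_error(p, replications):
    """Standard deviation of an estimated rejection probability p based on independent replications."""
    return math.sqrt(p * (1.0 - p) / replications)

se = mc_standard_error(0.05, 10000)
print(round(se, 4))
```

Differences between rejection frequencies of a few tenths of a percentage point are therefore within simulation noise.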

Since the set-up involves a fixed number of outliers asymptotic inference is unaffected,

and we focus on the small sample properties of different estimation strategies⁶. First we

consider in subsection 5.1 the case of known outliers to analyze how outlying observations

can be approximated with dummy variables. In 5.2 we introduce the additional cost of

searching for outliers, and evaluate different strategies feasible in empirical applications.

⁵ The simulation has also been carried out for other DGPs. While the absolute size and power vary, the conclusions on outliers and dummy variables remain unchanged.

⁶ An alternative set-up, where outliers are assumed to occur with a fixed probability and the (expected) number of outliers increases with the sample length, is used in Franses and Haldrup (1994). In this case the asymptotic distributions will typically involve nuisance parameters, see also Shin, Sarkar, and Lee (1996).

5.1 Modelling Known Outlying Observations

To illustrate the ability of different specifications to approximate outlying observations we first assume that the locations of outliers are known. We estimate the model (1) with $\mu_t = \mu_{1t} + \mu_{2t} + \ldots + \mu_{nt}$, where $\mu_{it}$ is given by different configurations of dummy variables. Based on each configuration we calculate the likelihood ratio tests for rank determination and compare with the appropriate critical values calculated from a Γ-approximation based on the mean and variance of the asymptotic distribution estimated in a response surface regression by Doornik (1998). Furthermore, we analyze the precision of the estimated long-run parameters by imposing the correct rank, $r = 2$, and comparing the estimated cointegration space, $\hat{\beta}$, with the space spanned by the columns of $\beta$ in the DGP. The comparison is based on the metric of Larsson and Villani (2001) and can be interpreted as a simple measure of the efficiency of different estimators⁷. The distance is bounded between $\Lambda = 0$ if $\hat{\beta} \in \mathrm{span}(\beta)$ and $\Lambda = \sqrt{r}$ if $\hat{\beta} \in \mathrm{span}(\beta_\perp)$.

The simulation results for sample lengths $T = 75$ and $T = 200$ are reported in Table 4 and Table 5 respectively. Panels (A) and (B) of the tables report rejection frequencies of the tests for rank determination. The test $Q_0 = -2\log Q(r = 0 \mid r \le 4)$ of no cointegration is (almost) always rejected and is not reported. For $T = 75$ panel (A) reports the rejection frequencies of the test statistic $Q_1 = -2\log Q(r \le 1 \mid r \le 4)$ based on a nominal size of 5%. This illustrates the empirical power of the test for a too restricted model, $r \le 1$, when $r = 2$ in the DGP. For $T = 200$ the power is always 100% and is not reported. Panel (B) reports the rejection frequencies of $Q_2 = -2\log Q(r \le 2 \mid r \le 4)$, which is the empirical size of the tests of the (true) cointegration rank $r = 2$. Finally, panel (C) reports the average distance given the correct rank, $r = 2$. In the text we will concentrate on the results for the small sample, $T = 75$, and make reference to the larger sample, $T = 200$, only in case of differences in results.
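The distance reported in panel (C) can be sketched as follows. This is an illustration under stated assumptions rather than the code used for the simulations: both spaces are assumed to have the same dimension $r$, the bases are orthonormalized by Gram-Schmidt, and the identity $\mathrm{trace}(\gamma_2'\gamma_2) = r - \|a'b\|_F^2$ (which follows from $b = aa'b + a_\perp a_\perp'b$ with $b'b = I_r$) is used so that $a_\perp$ need not be constructed. All function names are hypothetical.

```python
import math

def orthonormal_basis(columns):
    """Gram-Schmidt on a list of column vectors (assumed linearly independent)."""
    basis = []
    for v in columns:
        w = list(v)
        for q in basis:
            proj = sum(qi * wi for qi, wi in zip(q, w))
            w = [wi - proj * qi for qi, wi in zip(q, w)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        basis.append([wi / norm for wi in w])
    return basis

def subspace_distance(A_cols, B_cols):
    """Distance between span(A) and span(B): 0 if equal, sqrt(r) if orthogonal."""
    a = orthonormal_basis(A_cols)
    b = orthonormal_basis(B_cols)
    r = len(b)
    # squared Frobenius norm of a'b: sum of squared inner products between the bases
    overlap = sum(sum(x * y for x, y in zip(ai, bj)) ** 2 for ai in a for bj in b)
    # trace(gamma2' gamma2) = r - ||a'b||_F^2; guard against tiny negative rounding errors
    return math.sqrt(max(r - overlap, 0.0))

beta = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
same_space = [[1.0, 1.0, 0.0, 0.0], [1.0, -1.0, 0.0, 0.0]]
orthogonal = [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
d_same = subspace_distance(beta, same_space)
d_orth = subspace_distance(beta, orthogonal)
```

The two example calls reproduce the bounds given above: a different basis for the same space gives a distance of zero, and a basis of the orthogonal complement gives $\sqrt{r} = \sqrt{2}$.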

Model M0: No Dummies. The first column of the tables reports results for the basic model, M0: $\mu_{it} = 0$, which ignores potential outliers. The first row in each panel illustrates the benchmark case where the DGP contains no outliers. In the small sample, $T = 75$, the trace test is somewhat oversized with a rejection frequency of approximately 11%. The empirical power of the test for $r \le 1$ against $r \le 4$ is reasonably high at around 80%. For $T = 200$ the actual size is down to 7%. The size and power properties are specific to the chosen DGP and constitute the benchmark for alternative specifications.

⁷ To compare the subspaces span(A) and span(B), we find orthonormal bases for the subspaces, $a$ and $b$ say, and decompose $b = a\gamma_1 + a_\perp\gamma_2$, where $\gamma_1 = a'b$ and $\gamma_2 = a_\perp'b$. The distance measure is defined as the matrix norm of the coefficient $\gamma_2$, i.e. $\Lambda = \mathrm{trace}(\gamma_2'\gamma_2)^{1/2}$, see Larsson and Villani (2001).

The following rows illustrate the effects of non-modelled outliers in the data. It appears that IOs marginally decrease the size and power. Panel (C) illustrates that the average precision of the estimated parameters increases with the number of IOs. This reflects that IOs provide events of large variations, and because IOs follow the same autoregressive adjustment as the small shocks, IOs can be helpful in revealing the autoregressive structure. The overall conclusion is that IOs do not seriously distort inference in the cointegration model, see Doornik, Hendry, and Nielsen (1998) for a similar result.

The effects of ignored AOs are more severe. The size distortion increases markedly

with the number of outliers, confirming the findings in Franses and Haldrup (1994). Fur-

thermore, the average precision in panel (C) illustrates that AOs are pure noise in the

autoregressive model and they can potentially distort statistical inference.
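The different roles of the two outlier types can be illustrated in the simplest possible setting. The sketch below uses a univariate AR(1) as a stand-in for the VAR, which is a simplifying assumption made here for illustration; all regular innovations are set to zero so that the output is the effect of the outlier relative to the no-outlier baseline. The function name and arguments are hypothetical.

```python
def outlier_effect(rho, kind, t0, size, n):
    """Effect of one outlier at date t0 in x_t = rho * x_{t-1} + e_t, relative to the baseline.

    kind = "IO": the outlier enters through the innovation and is propagated by the dynamics.
    kind = "AO": the outlier is added to the observed series at date t0 only.
    """
    latent = [0.0] * n
    observed = [0.0] * n
    for t in range(n):
        prev = latent[t - 1] if t > 0 else 0.0
        shock = size if (kind == "IO" and t == t0) else 0.0
        latent[t] = rho * prev + shock
        observed[t] = latent[t] + (size if (kind == "AO" and t == t0) else 0.0)
    return observed

io_stationary = outlier_effect(0.5, "IO", 2, 5.0, 5)  # decays at the autoregressive rate
io_unit_root = outlier_effect(1.0, "IO", 2, 5.0, 5)   # permanent level shift
ao_unit_root = outlier_effect(1.0, "AO", 2, 5.0, 5)   # isolated spike
```

An IO follows the same adjustment as any other shock, so it is informative about the autoregressive parameter, whereas the AO leaves a single spike that the autoregressive dynamics cannot reproduce.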

An important question is now if the outliers can be modelled with dummy variables;

and below we consider a number of different specifications.

Model M1: Innovational Balanced Impulse Dummies. In the second column outliers are modelled by balanced impulse dummies, M1: $\mu_{it} = \theta_{i0}d_{it}$. These dummies are too ‘small’ in the sense that they can at most create impulses in the non-stationary directions, i.e. column 1 of Figure 1, whereas the DGP replicates columns 2 and 3 respectively. The specification is not unusual, however, see e.g. Hendry and Juselius (2001).

In the case of impulse IOs the test based on M1 is clearly under-sized, and the power is rapidly decreasing. In the case of AOs the size is only marginally distorted but the power is much lower than for the baseline specification. In both cases the combination of power and size implies that the proportion of cases where the correct rank, $r = 2$, is selected using the normal test sequence is markedly lower for M1 than for the basic model. A similar picture appears for the average precision of the estimates of $\beta$. Both for IOs and AOs the results are clearly inferior to ignoring the outliers.

Model M2: Innovational Impulse Dummies. In the third column we consider unrestricted impulse dummies, M2: $\mu_{it} = \theta_{i0}D_{it}$, which is probably the most common strategy to control for outliers.

In the case of IOs the estimation model M2 is identical to the DGP and the results are excellent. The size distortion is marginally decreased compared to the basic model, M0, and the power of the test for $r \le 1$ increases. This picture is confirmed by the results on the average distance between $\beta$ and $\hat{\beta}$.

In the case of AOs the estimation model with an unrestricted impulse dummy is

not well suited, and to remove the isolated outlying observations the estimation model

introduces a shift in the levels of the non-stationary directions. The tests are highly over-

sized, with a rejection frequency of 60% for 6 AOs, and the average precision is extremely

low. With T = 200 the size is still around 30%. These results clearly suggest that it is not

recommendable in practice just to insert innovational impulse dummies for observations

with large residuals.

Model M3: Approximate Additive Model. One possibility for modelling additive outliers is the specification (6), where the non-linear restrictions (7) and (8) are not imposed; M3: $\mu_{it} = \alpha\theta_{i0}D_{it-1} + \theta_{i1}d_{it} + \theta_{i2}d_{it-1}$.⁸ This model can be estimated by standard RRR.

When the DGP contains IOs results are poor and for an increasing number of outliers,

the Type I error probability approaches 100%. This reflects an additional contribution

to the trace test statistic from imposing the reduced rank restrictions also on the dummy

terms when they are unrestricted in the DGP, see also Doornik, Hendry, and Nielsen

(1998).

When AOs are present, M3 is the correct model apart from the restrictions (7) and

(8) not imposed, but the model is costly in terms of degrees of freedom. In longer samples

or with a limited number of outliers the size and power are reasonable and there is a small

gain compared to the basic model M0. But in small samples and with several outliers the

loss in degrees of freedom is prohibitive⁹.

Model M4: Nesting Model. In column five we introduce a general model where the dummies are included with the full lag structure of the autoregressive model and unrestricted coefficients; M4: $\mu_{it} = \theta_{i0}D_{it} + \theta_{i1}D_{it-1} + \theta_{i2}D_{it-2}$. A similar model is applied in inter alia Vogelsang (1999) for the univariate Dickey-Fuller unit root test. This model nests both the IO model and the AO model and can be estimated using RRR, but is costly in terms of degrees of freedom. In the case of IOs the estimates are more precise than for the basic specification, M0, but the size distortion is markedly higher. For the case of AOs the results are comparable to ignoring the outliers¹⁰.

A simple refinement of M4 is to estimate the general model and then to delete insignificant dummies to save degrees of freedom. This is applied in model M4b in column six. More specifically, a dummy is deleted if all four t-ratios for the hypotheses that the dummy coefficients equal zero are lower than the conventional 1.96. For the case of IOs M4b is closer to the true specification, M2, in terms of the average distance. For the case of AOs, the additional zero restrictions do not bring the model closer to the correct additive specification and the results do not improve.
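The deletion rule in M4b amounts to a simple check per dummy. The sketch below assumes that the comparison is made on absolute t-ratios and that each dummy has one coefficient in each of the four equations; the function name and the numerical t-ratios are hypothetical.

```python
def keep_dummy(t_ratios, critical=1.96):
    """Keep a dummy unless every t-ratio is below the critical value in absolute terms."""
    return any(abs(t) >= critical for t in t_ratios)

# hypothetical t-ratios for one dummy in the four equations of the system
deleted_case = keep_dummy([0.4, -1.2, 0.9, 1.5])   # all insignificant: dummy is deleted
retained_case = keep_dummy([0.4, -2.6, 0.9, 1.5])  # significant in one equation: dummy is kept
```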

Model M5: Additive Model. In column seven we consider the exact additive specification, M5: $\mu_{it} = A(L)\theta_{i0}D_{it}$, estimated with the switching algorithm suggested in Section 3. In the presence of IOs results are comparable to ignoring the outlier. This illustrates that IOs, inducing level shifts in the non-stationary directions, cannot be approximated by additive dummies.

⁸ For this model the critical values for the rank test are based on the mean and variance for the basic model augmented with the moments of $n$ independent $\chi^2(p - r)$ distributions, cf. Section 3.

⁹ Note that the statistics are independent of the magnitude of the AOs. The properties do not reflect the magnitude of the outliers, but the presence of the dummies in the estimation model.

¹⁰ Again the test statistics do not depend on the magnitude of the AOs.

In the case of AOs the estimation model is identical to the DGP. Because the outlying observations are completely removed from the likelihood function, the size and the power are by and large similar to the benchmark case with no outliers, and they are independent of the magnitude of the outliers. There is no potential gain from additive outliers, but using the correct model it is possible to conduct inference conditional on additive outlying observations.

One way to interpret the additive model is to note that it replaces the actual observation $Y_{T_0}$ with an ML based interpolated value. The main disadvantage for the applied econometrician is that the estimation procedure is more complicated, and below we consider two alternative interpolations that can easily be implemented in applied cointegration analyses. Compared to the ML interpolation these alternatives differ in one important aspect. The additive model maximizes the likelihood function for each model, $H^*_A(0), \ldots, H^*_A(r), \ldots, H^*_A(4)$, which implies that separate corrections are estimated for each model. The alternative interpolations in M6 and M7 correct the data once and for all based on the estimates in the unrestricted model $H^*(4)$, and perform a standard cointegration analysis on the corrected data.

Model M6 : Univariate Correction for AOs. Since AOs are related to the indi-

vidual time series and not to the system of equations, it is natural to consider univariate

corrections, and M6 corrects the time series univariately using the program TRAMO, see

Gomez and Maravall (1997). TRAMO automatically identifies an ARIMA model for the

univariate time series and corrects for the AOs at the known locations. This is also sug-

gested in Franses and Haldrup (1994). To be comparable with the previous specifications,

the corrections are made in all four variables.

The results are reported in column eight, and are very close to the results of the additive

specification in M5. This suggests that it is important to interpolate observations with

AOs but that the precise method might not be too critical.

Model M7: Linear Interpolation in the Data. As a very simple alternative we consider in M7 linear interpolation and replace an outlying observation, $Y_{T_0}$, with $\hat{Y}_{T_0} = \frac{1}{2}Y_{T_0-1} + \frac{1}{2}Y_{T_0+1}$. To be comparable we again interpolate all four variables. Shin, Sarkar, and Lee (1996) use a similar idea and replace $Y_{T_0}$ with $Y_{T_0-1}$ for the Dickey-Fuller test.

The results are given in column nine and are again very close to the results of the additive model in M5. In particular, the simple interpolation is much preferable to including combinations of unrestricted dummies in the case of AOs.
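The linear interpolation in M7 is simple enough to state in a few lines. The following is a minimal sketch, assuming the data are stored as one list per variable and that the outlier date has a neighbour on both sides; the function names are illustrative.

```python
def interpolate_outlier(series, t0):
    """Return a copy of the series where observation t0 is the average of its two neighbours."""
    if not 0 < t0 < len(series) - 1:
        raise ValueError("t0 must have a neighbour on both sides")
    corrected = list(series)
    corrected[t0] = 0.5 * series[t0 - 1] + 0.5 * series[t0 + 1]
    return corrected

def interpolate_system(data, t0):
    """Apply the same correction to every variable, as done for comparability in M7."""
    return [interpolate_outlier(series, t0) for series in data]

example = interpolate_outlier([1.0, 2.0, 9.0, 4.0, 5.0], 2)
```

The corrected data are then passed unchanged to a standard cointegration analysis, so no modification of the estimation procedure is needed.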

Three main conclusions emerge from the results: Firstly, AOs can seriously distort infer-

ence on the cointegration rank and the long-run parameters, while IOs are less harmful.

For a practitioner, this suggests that the main focus should be on identifying and cor-

recting AOs. Secondly, it is not possible to approximate AOs with combinations of unre-

stricted dummies. And thirdly, the potential distortion from using an incorrect dummy

specification is much larger than the distortion from ignored outliers.

These results have important implications for the recommended practice in empirical

applications, where the locations and types of outliers are unknown. It is clear that the

conventional use of unrestricted dummies to account for large residuals is not justified.

Instead, two alternative strategies seem to prevail:

(i) The type of a particular outlier should be determined, e.g. using formal testing or

an outlier detection procedure.

(ii) The focus could be solely on correcting AOs as they potentially distort inference,

while IOs could be left in the data. A similar idea is applied in Shin, Sarkar, and

Lee (1996) and Vogelsang (1999) for univariate Dickey-Fuller tests.

Below we evaluate different implementations of these two strategies.

5.2 Feasible Strategies

In this subsection we extend the analysis to strategies for modelling outliers, which are

feasible in empirical applications. This introduces the additional cost of detecting the

locations and types of outliers.

Model M8: ‘Usual Practice’. First, we consider for comparison a stylized version of the usual practice in applied cointegration analyses; i.e. to insert unrestricted impulse dummies for observations with large residuals. To be precise, we define an outlier as an observation with an absolute value of the standardized residual larger than 3.39.¹¹ For an outlying observation, $T_i$, we insert an innovational dummy, $D_t(T_i)$. The results, reported under M8 in column ten, are not far from the results of M2. The additional cost of searching for the location of outliers is relatively small, but the results are very poor in the case of AOs.
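A threshold of this size arises naturally when 75 residuals are screened jointly. The sketch below is one plausible reading and not necessarily the exact definition used in the text: it assumes independent two-sided tests on standard normal residuals and chooses the per-observation level so that the overall level is 5%. Under this assumption the threshold is roughly 3.4, close to the 3.39 used above. The function names are hypothetical.

```python
from statistics import NormalDist

def max_residual_threshold(n_obs, nominal=0.05):
    """Two-sided normal critical value for the largest of n_obs independent residuals."""
    # per-observation level giving an overall level `nominal` across n_obs tests
    delta = 1.0 - (1.0 - nominal) ** (1.0 / n_obs)
    return NormalDist().inv_cdf(1.0 - delta / 2.0)

def flag_outliers(std_residuals, threshold):
    """Dates at which the absolute standardized residual exceeds the threshold."""
    return [t for t, e in enumerate(std_residuals) if abs(e) > threshold]

threshold = max_residual_threshold(75)
flagged = flag_outliers([0.5, -4.1, 3.0, 3.6], threshold)
```

Under M8 each flagged date would receive an unrestricted impulse dummy, irrespective of whether the underlying outlier is innovational or additive.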

Model M9: Multivariate Outlier Detection. Under M9, in column 11, we apply the outlier detection outlined in Section 3.2. First the calculations are based on the full rank model, $H^*(p)$.¹² In the case of IOs the size is relatively constant around 10% and the power is high. And the average distances in panel (C) are not far from the results of the correct specification M2. In case of AOs the average distances are close to the results of the correct specification M5; but the size is increasing with the number of outliers. This reflects that in small samples the outlier type is sometimes mistaken by the model selection procedure, and innovational dummies included for AOs inflate the size of the test, cf. the results of model M2. In the case of 4 AOs, for example, the outlier detection procedure identifies on average 3.60 AOs and 0.42 IOs; and even this small fraction of mis-labelled outliers is enough to inflate the empirical size of the test.

¹¹ The critical value corresponds to a multi-comparison significance level $\delta_{\max}$ for a nominal 5% and $T = 75$ observations in a standard normal distribution.

¹² To minimize the computational burden we only calculate the test for an AO if the largest numerical value of the standardized residual exceeds 2.

Difficulties in distinguishing IOs and AOs in small samples could be related to the use of the stationary model, $H^*(p)$, for the detection procedure, and the properties may be improved if the outlier detection is based on a model, $H^*(r_0)$, where $r_0$ is closer to the true (but unknown) cointegration rank, $r$. One feasible strategy is to determine the cointegration rank in a model with no dummies, $\hat{r}_0$ say, in an initial stage, and then to perform the outlier detection based on $H^*(\hat{r}_0)$. Conditional on the detected outliers the final cointegration rank is determined. The results of this procedure are reported under M9b. For IOs the size is now very close to the correct specification. For AOs the size distortion is also smaller, although still visible.

A third approach is to rely on suggestion (ii) above and identify only AOs. The procedure is identical to M9, except that only $\tau^{\max}_A$ is calculated in each iteration of the outlier detection. The obtained results are reported under M9c. In the presence of IOs the results are close to ignoring the outliers. In the presence of AOs the results are very close to the results of the correct specification M5.

Note that all the results could be marginally improved by imposing individual zero restrictions on insignificant coefficients to the additive dummies.

Model M10 : Univariate Correction of AOs. The main drawback of M9, M9b

and M9c is that they require a non-standard estimation of the cointegration model. As

an alternative we reconsider the univariate correction using TRAMO. We apply the auto-

matic ARIMA model identification and automatic identification and correction of additive

outliers to each series and perform a standard cointegration analysis on the so-called linearized series¹³. Note that this procedure does not treat the series symmetrically, and only significant outliers are corrected in each series.

The results, reported under M10, are inferior to the system correction for AOs reported under M9b and M9c. The size distortion is marginally lower, but the size-power trade-off is less favorable and the average precision is lower. And unlike the previous models there is a non-negligible cost of searching for outliers when there are no outliers in the DGP.

Model M11 : Linear Interpolation in the Data. Finally we reconsider the linear

interpolation. Like model M8 we identify outliers as observations with large standardized

residuals and use linear interpolations for these observations directly in the levels of the

data. Again we treat the series asymmetrically and correct only series with large residuals.

The results, reported under M11, are very close to the results of the maximum like-

lihood correction of AOs, M9c. For the case of IOs the results are marginally superior,

while they are marginally inferior in the case of AOs.

¹³ The current version of TRAMO does not allow for correction of only AOs, and we therefore choose

also to correct for transitory changes, using the option AIO=1 in TRAMO. This deteriorates the results

in the simulations.

6 Concluding Remarks

In this paper we used a Monte Carlo simulation to analyze the effects of IOs and AOs on

inference in the cointegrated VAR model. Several conclusions emerge. Firstly, inference

on the cointegration rank of a VAR process is sensitive to AOs, and non-modelled AOs

increase the Type I error probability. IOs, on the other hand, are less distorting. And by

introducing events of large variation in the data, IOs can even be helpful in revealing the

autoregressive parameters. This suggests that in order to make cointegration inference

robust to outliers, the main focus should be on correcting for AOs.

Secondly, additive deterministic components do not fit well into the autoregressive

structure and it is very difficult to approximate AOs with combinations of unrestricted

dummies. Moreover, the potential distortion induced by an incorrect dummy specification

can be much more severe than the distortion from the outliers per se.

Thirdly, the usual practice of including unrestricted dummies to whiten residuals receives no support, unless AOs can be ruled out a priori. Instead two alternative procedures are suggested:

The outlier type can be determined based on a model selection approach. Thereby the information contained in IOs can be used and inference can be made robust to AOs. The best results are obtained for the case where the cointegration rank is initially determined in a model with no dummies, and then the outliers are detected based on the model $H^*(\hat{r}_0)$. Conditional on the dummies the final cointegration rank testing is performed.

Alternatively, the IOs, which are not particularly problematic for cointegration inference, can be ignored, while the distortionary AOs can be corrected. Preferably, the correction of AOs should be done using maximum likelihood, but a simple linear interpolation directly in the levels of the data prior to the cointegration analysis gave almost identical results.

References

Chen, C., and L.-M. Liu (1993): “Joint Estimation of Model Parameters and Outlier Effects in Time Series,” Journal of the American Statistical Association, 88(421), 284–297.

Davidson, J. (2001): Econometric Theory. Blackwell, Oxford.

Doornik, J. A. (2001): Object-Oriented Matrix Programming Using Ox. Timberlake Consultants Press, London, 4th edn.

Doornik, J. A., D. F. Hendry, and B. Nielsen (1998): “Inference in Cointegrating Models: UK M1 Revisited,” Journal of Economic Surveys, 12(5), 533–572.

Fox, A. (1972): “Outliers in Time Series,” Journal of the Royal Statistical Society, Series B, 34(3), 350–363.

Franses, P. H., and N. Haldrup (1994): “The Effect of Additive Outliers on Tests for Unit Roots and Cointegration,” Journal of Business and Economic Statistics, 12(4), 471–478.

Gomez, V., and A. Maravall (1997): “Program TRAMO and SEATS - Instructions for the User,” Mimeo, Banco de Espana.

Gomez, V., A. Maravall, and D. Pena (1999): “Missing Observations in ARIMA Models: Skipping Approach Versus Additive Outlier Approach,” Journal of Econometrics, 88, 341–363.

Hendry, D. F., and J. A. Doornik (1994): “Modelling Linear Dynamic Econometric Models,” Scottish Journal of Political Economy, 41(1), 1–33.

Hendry, D. F., and K. Juselius (2001): “Explaining Cointegration Analysis: Part II,” Energy Journal, 22(1), 75–120.

Hendry, D. F., and G. Mizon (1993): “Evaluating Dynamic Econometric Models by Encompassing the VAR,” in Models, Methods and Applications of Econometrics, ed. by P. Phillips, pp. 272–300. Basil Blackwell, Oxford.

Johansen, S. (1996): Likelihood-Based Inference in Cointegrated Autoregressive Models. Oxford University Press, Oxford, 2nd edn.

Johansen, S., R. Mosconi, and B. Nielsen (2000): “Cointegration in the Presence of Structural Breaks in the Deterministic Trend,” Econometrics Journal, 1(3), 216–249.

Larsson, R., and M. Villani (2001): “A Distance Measure Between Cointegration Spaces,” Economics Letters, 70, 21–27.

Muirhead, C. (1986): “Distinguishing Outlier Types in Time Series,” Journal of the Royal Statistical Society, Series B, 48(1), 39–47.

Nielsen, B., and A. Rahbek (2000): “Similarity Issues in Cointegration Analysis,” Oxford Bulletin of Economics and Statistics, 62(1), 5–22.

Perron, P., and G. Rodríguez (2003): “Searching for Additive Outliers in Nonstationary Time Series,” Journal of Time Series Analysis, 24(2), 193–220.

Pollak, R. A., and T. J. Wales (1991): “The Likelihood Dominance Criterion: A New Approach to Model Selection,” Journal of Econometrics, 47, 227–242.

Saikkonen, P., and H. Lutkepohl (2000): “Trend Adjustment Prior to Testing for the Cointegration Rank of a VAR Process,” Journal of Time Series Analysis, 21(4), 435–456.

Shin, D. W., S. Sarkar, and J. H. Lee (1996): “Unit Root Tests for Time Series with Outliers,” Statistics & Probability Letters, 30, 189–197.

Tsay, R. S. (1986): “Time Series Model Specification in the Presence of Outliers,” Journal of the American Statistical Association, 81(393), 132–141.

Tsay, R. S., D. Pena, and A. E. Pankratz (2000): “Outliers in Multivariate Time Series,” Biometrika, 87(4), 789–804.

Vogelsang, T. J. (1999): “Two Simple Procedures for Testing for a Unit Root when there are Additive Outliers,” Journal of Time Series Analysis, 20(2), 237–252.

[Figure 1 panels. Columns: $d_t$, innovational; $D_t$, innovational; $D_t$, additive. Rows: stationary directions, $\beta'Y_t$; non-stationary directions, $\beta_\perp'Y_t$.]

Figure 1: Examples of the effects (relative to the baseline of no outliers) of different types of outliers in the stationary and non-stationary directions. Based on a 2-dimensional VAR(1) with $r = 1$.

[Figure 2 panels: Real money; Final expenditures; Inflation; Interest rate; First cointegrating relation; Second cointegrating relation. Horizontal axes are labelled 64 to 88 (years).]

Figure 2: Original data (–) and the estimated deterministic components of the original model of Doornik, Hendry, and Nielsen (1998) (- -). Calculated for $r = 2$.

        Innovational model, τmax_I                              Additive model, τmax_A
        Mean   Std.   Quantiles                                 Mean   Std.   Quantiles
  T                   50%    90%    95%    97.5%  99%                         50%    90%    95%    97.5%  99%
  50    17.46  3.97   16.85  22.54  24.83  26.98  29.74         16.83  4.68   16.17  22.83  25.38  27.90  31.49
  75    16.81  3.46   16.29  21.31  23.25  25.17  27.81         15.94  3.92   15.33  21.05  23.02  25.12  27.93
  100   16.76  3.30   16.22  21.09  22.84  24.90  27.24         15.89  3.65   15.36  20.52  22.62  24.47  27.20
  150   16.98  3.13   16.46  21.06  22.73  24.39  26.89         16.07  3.36   15.58  20.43  22.31  24.07  26.24
  200   17.27  3.03   16.79  21.26  22.91  24.55  26.75         16.46  3.31   15.99  20.74  22.61  24.35  26.99
  400   18.31  2.90   17.85  22.09  23.66  25.25  27.05         17.63  3.09   17.20  21.61  23.33  25.05  27.37

Table 1: Simulated moments and quantiles of the maximum test statistics $\tau^{\max}_I$ and $\tau^{\max}_A$ for the 4-dimensional data generating process used in Section 5. Based on 10000 replications.

                                                            Hypotheses
Dummy specification                                         r = 0    r ≤ 1    r ≤ 2    r ≤ 3
(A) No dummies.                                             119.38   40.89    12.12    4.48
                                                            [.00]    [.08]    [.80]    [.67]
(B) Original dummies, D_t = (Doil_t : Dout_t)'.             108.53   29.25    14.77    6.20
                                                            [.00]    [.55]    [.60]    [.45]
(C) The 6 unrestricted dummies from (B).                    106.19   32.48    14.13    5.07
                                                            [.00]    [.37]    [.65]    [.59]
(D) Preferred specification.                                132.49   47.52    14.18    5.93
    2 additive and 2 innovational dummies, see Table 3.     [.00]    [.01]    [.65]    [.48]

Table 2: Trace test statistics, $Q_r$, based on different dummy specifications. p-values based on the Γ-approximation of Doornik (1998) in square brackets.

            Detected outliers   Test statistics     Critical   Standardized residuals
Iteration   Date      Type      τmax_I    τmax_A    value      Δm_t    Δy_t    Δ²p_t   Δr_t
1           1973:2    IO        29.785    28.924    22.73       4.07    0.39   -3.95   -1.90
2           1974:2    AO        24.911    29.874    22.73      -0.04    1.22    3.28   -1.30
3           1973:1    AO        26.246    27.209    22.73      -0.51    3.98   -1.36    0.86
4           1979:2    IO        24.799     8.514    22.73       0.33    4.40    1.03   -0.10

Table 3: Outlier detection in the UK money demand data. The critical value for the maximum test statistics, $\tau^{\max}_i$, is the average of the 95% quantiles for T=100 simulated in Table 1.
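The critical value and the outlier types in Table 3 can be traced back to the reported numbers. The sketch below assumes that in each iteration the outlier type with the larger maximum statistic is selected, provided it exceeds the critical value; this selection rule is a simplified reading of the procedure, and the function name is hypothetical.

```python
# 95% quantiles for T = 100 from Table 1 (innovational and additive maximum statistics)
critical = round((22.84 + 22.62) / 2.0, 2)

# (date, tau_max_I, tau_max_A) for the four iterations reported in Table 3
iterations = [
    ("1973:2", 29.785, 28.924),
    ("1974:2", 24.911, 29.874),
    ("1973:1", 26.246, 27.209),
    ("1979:2", 24.799, 8.514),
]

def classify(tau_i, tau_a, critical_value):
    """Pick the type with the larger statistic if it exceeds the critical value, else None."""
    kind, stat = ("IO", tau_i) if tau_i >= tau_a else ("AO", tau_a)
    return kind if stat > critical_value else None

detected = [(date, classify(ti, ta, critical)) for date, ti, ta in iterations]
print(critical, detected)
```

The first and third iterations show how close the two statistics can be, in line with the remark that the likelihood is not very informative on the outlier type at these dates.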

Outliers Specification of Dummies in Estimation Model

in DGP None Known Location Outlier Detection

M0 M1 M2 M3 M4 M4b M5 M6 M7 M8 M9 M9b M9c M10 M11

(A) Power, $Q_1 = -2\log Q(r \le 1 \mid r \le 4)$

0 80.5 ... ... ... ... ... ... ... ... 80.9 81.0 80.4 80.6 77.9 80.6

1 IO 79.8 71.0 82.2 87.7 79.9 82.2 80.1 79.7 79.8 83.0 83.0 80.3 80.5 75.9 80.2

2 IO 78.7 59.4 84.2 92.7 80.7 84.9 79.6 79.1 79.1 85.2 85.1 80.3 80.1 73.2 79.6

4 IO 77.5 40.2 88.6 97.7 82.0 88.4 79.1 78.2 78.5 88.6 88.8 81.7 79.0 69.6 79.0

6 IO 75.7 26.5 92.4 99.4 84.2 92.2 78.0 77.3 78.1 92.1 91.2 82.1 77.5 65.8 77.8

1 AO 80.3 76.0 88.2 79.5 81.5 82.8 80.3 80.2 80.2 87.6 81.4 80.2 80.5 77.2 80.3

2 AO 80.8 71.9 82.7 79.6 82.0 84.5 79.8 79.6 79.7 91.9 82.0 79.9 80.1 76.8 80.3

4 AO 83.3 63.5 97.3 82.5 83.5 87.1 79.5 79.0 79.5 96.3 84.1 80.9 80.3 74.9 80.9

6 AO 85.8 56.8 99.0 87.5 85.3 89.7 79.0 78.3 79.3 97.9 86.4 82.1 80.9 74.4 80.7

(B) Size, $Q_2 = -2\log Q(r \le 2 \mid r \le 4)$

0 10.9 ... ... ... ... ... ... ... ... 11.3 11.3 11.0 11.0 10.6 10.9

1 IO 9.6 7.0 9.6 25.0 10.7 11.0 9.7 9.7 9.8 10.8 10.6 9.8 9.8 9.3 9.7

2 IO 8.8 4.2 9.1 38.7 10.9 11.5 8.9 8.9 8.9 10.5 10.6 9.0 9.0 8.5 8.8

4 IO 7.9 1.8 7.5 62.3 11.7 12.1 7.9 7.8 7.9 9.3 9.5 7.5 8.5 7.6 8.0

6 IO 7.2 .7 6.9 79.5 12.5 13.9 7.3 7.2 7.2 9.0 9.8 7.0 7.8 7.1 7.4

1 AO 12.0 10.3 19.7 12.1 12.5 13.3 10.7 10.7 10.7 19.7 11.8 11.0 10.8 10.4 11.1

2 AO 13.3 9.7 28.9 13.8 14.8 16.7 10.7 10.7 10.8 28.4 12.9 11.3 10.9 10.5 11.3

4 AO 17.1 8.5 46.5 19.0 17.9 21.5 10.4 10.4 10.4 43.2 15.7 12.3 11.2 10.3 12.2

6 AO 20.3 8.5 60.2 28.5 21.4 25.9 10.8 10.9 10.8 54.6 19.3 14.4 12.4 10.6 13.3

(C) Average distance, given r = 2

0 .2001 ... ... ... ... ... ... ... ... .2008 .2011 .2009 .2005 .2114 .2004

1 IO .2014 .2102 .1876 .2091 .1961 .1925 .1964 .1990 .1988 .1912 .1912 .1920 .1988 .2163 .1976

2 IO .1997 .2148 .1736 .2146 .1892 .1829 .1926 .1970 .1965 .1794 .1795 .1811 .1970 .2187 .1955

4 IO .1900 .2155 .1571 .2246 .1812 .1717 .1820 .1867 .1858 .1647 .1661 .1682 .1876 .2223 .1844

6 IO .1825 .2139 .1448 .2379 .1778 .1650 .1755 .1796 .1776 .1536 .1585 .1607 .1805 .2225 .1770

1 AO .2118 .2139 .2258 .2099 .2080 .2103 .2012 .2015 .2013 .2254 .2036 .2034 .2018 .2150 .2034

2 AO .2216 .2274 .2466 .2181 .2162 .2204 .2020 .2025 .2018 .2464 .2074 .2060 .2030 .2185 .2068

4 AO .2346 .2501 .2753 .2365 .2322 .2380 .2046 .2050 .2040 .2746 .2154 .2130 .2070 .2260 .2126

6 AO .2475 .2729 .2971 .2595 .2521 .2578 .2083 .2092 .2070 .2972 .2276 .2223 .2138 .2322 .2214

Table 4: Simulation results, T=75. Results for the size and power are the rejection frequencies at a nominal 5% level of the

trace test for the cointegration rank. Note that the tests are not performed sequentially. Bold indicates that the estimation

model and the DGP coincide. Results are based on 10000 replications.

Outliers Specification of Dummies in Estimation Model

in DGP None Known Location Outlier Detection

M0 M1 M2 M3 M4 M4b M5 M6 M7 M8 M9 M9b M9c M10 M11

(B) Size, Q2 = −2 log Q(r ≤ 2 | r ≤ 4)

0 7.0 ... ... ... ... ... ... ... ... 7.6 7.1 7.0 7.0 7.0 7.0

1 IO 7.1 6.2 6.9 23.9 7.3 7.2 6.9 7.0 7.0 7.6 7.2 7.0 7.0 6.9 7.0

2 IO 6.5 5.0 6.6 40.2 7.2 7.3 6.3 6.4 6.4 7.1 6.7 6.5 6.4 6.5 6.3

4 IO 6.4 3.6 5.8 65.9 7.1 6.9 5.9 6.0 6.0 6.5 6.2 5.8 6.3 6.3 6.2

6 IO 6.5 2.8 6.0 81.0 7.5 7.7 6.1 6.2 6.3 6.9 6.1 6.2 6.2 6.2 6.2

1 AO 7.6 7.3 10.6 7.6 7.6 7.7 7.0 7.0 7.0 10.2 7.2 7.1 7.0 7.1 6.0

2 AO 8.3 7.6 14.4 7.8 8.1 8.7 7.0 7.1 7.1 12.7 7.3 7.1 7.0 7.1 7.1

4 AO 9.8 8.0 22.8 9.2 9.1 10.1 6.9 6.9 7.0 18.7 7.4 7.1 6.9 7.1 7.3

6 AO 11.2 8.4 31.7 11.0 10.4 12.1 6.9 7.0 7.1 25.1 7.6 7.0 6.9 7.0 7.5

(C) Average distance, given r = 2

0 .0571 ... ... ... ... ... ... ... ... .0573 .0572 .0572 .0571 .0584 .0571

1 IO .0573 .0575 .0559 .0566 .0563 .0561 .0567 .0570 .0570 .0562 .0562 .0561 .0570 .0594 .0568

2 IO .0574 .0578 .0550 .0565 .0559 .0554 .0566 .0570 .0570 .0554 .0553 .0553 .0570 .0602 .0568

4 IO .0565 .0573 .0527 .0554 .0542 .0534 .0552 .0560 .0559 .0533 .0532 .0532 .0558 .0610 .0555

6 IO .0551 .0563 .0510 .0555 .0533 .0520 .0536 .0545 .0544 .0516 .0519 .0516 .0541 .0614 .0540

1 AO .0580 .0580 .0586 .0576 .0576 .0577 .0572 .0572 .0572 .0586 .0573 .0573 .0572 .0589 .0575

2 AO .0586 .0586 .0599 .0582 .0581 .0583 .0572 .0572 .0572 .0601 .0573 .0573 .0572 .0594 .0578

4 AO .0598 .0600 .0623 .0591 .0591 .0596 .0573 .0574 .0574 .0620 .0576 .0574 .0573 .0601 .0584

6 AO .0607 .0611 .0650 .0601 .0600 .0608 .0574 .0576 .0575 .0645 .0578 .0574 .0575 .0610 .0590

Table 5: Simulation results, T=200. See notes for Table 4.

Chapter 4

An I(2) Cointegration Analysis

of Price and Quantity Formation

in Danish Manufactured Exports


Heino Bohn Nielsen

Abstract

The long-run and short-run structure of the Danish manufacturing export sector is

analyzed within a cointegrated vector autoregressive model. The price variables are

characterized as integrated of second order, I(2), but long-run homogeneity seems

to cancel the I(2)-trend allowing a transformed data set to be analyzed within the

cointegrated I(1)-framework. Two long-run relations are found and identified as a

demand-relation for Danish exports and a polynomially cointegrated price relation.

In the price formation a large weight to foreign prices and an effect from the rate of

inflation to the steady-state markup are found. The latter effect is interpreted as an

element of caution in the price setting in an inflationary environment. To characterize

the short-run behavior a structural representation is developed.

Keywords: Cointegration, I(2), Export pricing, Market-shares, Small open economy.

JEL Classification: C32, F14, F41.

1 Introduction

Since the large fluctuations of the exchange rates in the eighties there has been a considerable amount of research on the determination of foreign-trade prices and the pass-through

from exchange rates to prices, see inter alia Dornbusch (1987), Hooper and Mann (1989)

or Goldberg and Knetter (1997). Most of these studies have been carried out for large

countries, mainly for the US, Germany and Japan, while research on the pricing behavior

of exporters in small open economies is more limited. That may seem a little surprising

since the characterization of a small open economy precisely involves statements on the

determination of foreign trade prices.

This manuscript was published in Oxford Bulletin of Economics and Statistics, 64 (5), 449—472, (2002).

I thank Hans Christian Kongsted, Dan Knudsen, an anonymous referee and participants at the 56th

European Meeting of the Econometric Society, Lausanne, for many invaluable comments and suggestions.


Focus on exchange rates and prices is surely still relevant in many European countries

given the large movements in both nominal and real exchange rates associated with the

“currency crises” in the early nineties and following the introduction of the euro on January 1, 1999. Besides revealing information on the price-setting process per se, this line

of research contributes to the understanding of the developments in trade balances and

the transmission of inflation and demand shocks across countries. Moreover, exchange

rates and export prices are main determinants of competitiveness on export markets and

the development of export volumes plays a pronounced role for the performance of the

economy. That is particularly the case for a small open economy.

This article examines the long-run and short-run structure in the price and quantity

formation of Danish manufactured exports. We thus focus the attention on a small open

economy, which since the beginning of the eighties has participated in a fixed exchange rate

system. The Danish export sector has previously been analyzed by inter alia Kongsted

(1998) who applies the I(1) cointegration procedure of Johansen (1996) to a data set

covering the period 1971−1991 and consisting of exports and the export market in volume terms, the Danish export price and the competing price in nominal terms, a measure of

production costs (including an imported element) as well as the exchange rate. He finds

evidence of two long-run relations characterizing the market for Danish manufactured

exports: a relation for the foreign demand for Danish exports with a long-run price elasticity of numerically 2.4, and a homogeneous relation where the Danish export price is determined as a simple markup on domestic costs and the price of imports, with weights corresponding to a large pass-through from exchange rates to export prices in foreign currency of 2/3.

Compared to earlier analyses of the price and quantity formation of manufactured

exports, the present article questions the assumption that the first differences of nominal price and cost variables are stationary and generalizes the statistical framework by applying a multivariate cointegration model that allows variables to be integrated up to second

order, I(2), see Johansen (1992, 1995) and Rahbek, Kongsted and Jørgensen (1999). The

fact that the nominal variables of the empirical analysis are driven by a second order

stochastic trend facilitates a distinction between different degrees of persistence. To be

specific, the markup is potentially non-stationary, but can cointegrate with the rate of

inflation in a dynamic steady-state relation. This direct effect from inflation to the equi-

librium markup of exporting firms allows a reinterpretation of the price-setting process

in terms of uncertainty related to the inflationary process, see Banerjee, Cockerell and

Russell (2001) for a similar interpretation of the Australian inflation and markup on cost.

In the present article we use data covering the period 1975-1996. It is illustrated

that the extended sample compared to Kongsted (1998) makes the effects of the German

reunification topical and inspection of the data clearly suggests a level shift in the Danish

market share. To account for this we include a step dummy and restrict it to allow

for level shifts in all directions of the model but exclude changes in the slopes of the

linear trends. The necessary restrictions are similar to the restrictions imposed on the

constant and linear trend in Rahbek et al. (1999) and the resulting model is an example

of a generalization to the I(2) case of the I(1) model with structural breaks proposed in

Johansen, Mosconi and Nielsen (2000).

The article is organized as follows. The theoretical framework and the data are briefly

presented in section 2 and section 3 outlines the statistical model. The empirical analyses

of the long-run and short-run structure are presented in section 4 and 5 respectively and

section 6 concludes.

2 Theoretical Considerations and the Data

The theoretical framework adopted here for modeling demand for exports is the traditional

Armington (1969) approach. This setup assumes that competing products are imperfect

substitutes and specifies an inverse relationship between the export market share and the

export price relative to competitors' prices,¹

$$x_t - x_{ft} = -\omega \cdot (p_t - p_{ft} - e_t), \qquad \omega > 1. \tag{1}$$

Here Xt and Pt are the volume and domestic currency price of Danish manufactured

exports respectively. Xft is the size of the export market in volume terms defined as a

weighted average of imports from Denmark’s trading partners. Pft is the competing price

in foreign currency calculated as the deflator of Xft and Et is the effective exchange rate

denominated as Danish kroner per foreign currency unit. The demand relation (1) can

be derived from a utility function with constant elasticity of substitution. Because total

Danish export is small relative to the size of the export market the constant parameter

−ω can be interpreted both as the elasticity of substitution and as the price elasticity of foreign demand.

To characterize the behavior of price-setting firms, we consider a market of monop-

olistic competition. We assume a constant-returns-to-scale technology where the unit

production cost, Ct, includes an imported component, i.e. ct = κ (pmt + et) + (1 − κ) wt, where κ = 0.37 is the import content of Danish manufactured exports, Pmt is the price of imports in foreign currency, and Wt is the average unit labor cost in the manufacturing industry. Given the demand relation (1) and competitors' prices, the optimal export price is set as a constant markup, $P_t = \frac{\omega}{\omega - 1} C_t$, see e.g. Dixit and Stiglitz (1977) or Dornbusch

(1987). In order to approximate a broader range of market structures, we follow Hung,

Kim and Ohno (1993) and assume that the markup depends on competitors' prices and

we also include the rate of inflation. Using a log-linear approximation, the markup can

be written as

$$p_t - c_t = \gamma - \theta \cdot (p_t - p_{ft} - e_t) - \lambda \cdot \Delta p_t, \qquad \theta \ge 0. \tag{2}$$

The pass-through from exchange rates to export prices in foreign currency is given by the partial derivative $-\frac{\partial}{\partial e_t}(p_t - e_t) = \frac{1-\kappa}{1+\theta}$. The special case θ = κ = 0 implies a constant markup, pt − ct = γ, and full exchange-rate pass-through, while the limiting case θ → ∞ corresponds to the law of one price. The latter is the simple textbook assumption for a small open economy where fluctuations in the exchange rate are fully offset by the markup. Note that a positive import content, κ > 0, implies less than full pass-through even when θ = 0.

¹ Lowercase denotes log-transformed variables.
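The pass-through expression can be checked numerically. The following is only an illustrative sketch: the function names and the default values for the non-exchange-rate variables are assumptions made for the example and are not part of the original analysis. It solves equation (2) for pt, given the cost definition ct = κ (pmt + et) + (1 − κ) wt, and compares a finite-difference derivative with the closed form (1 − κ)/(1 + θ).

```python
def pass_through(kappa: float, theta: float) -> float:
    """Closed-form pass-through to export prices in foreign currency."""
    if theta < 0 or not 0.0 <= kappa <= 1.0:
        raise ValueError("expected theta >= 0 and 0 <= kappa <= 1")
    return (1.0 - kappa) / (1.0 + theta)


def export_price(e, kappa, theta, gamma=0.0, pf=0.0, pm=0.0, w=0.0, lam=0.0, dp=0.0):
    """Solve equation (2) for p, with c = kappa * (pm + e) + (1 - kappa) * w.

    All arguments are logs; the defaults are arbitrary illustrative values.
    """
    c = kappa * (pm + e) + (1.0 - kappa) * w
    return (gamma + theta * (pf + e) + c - lam * dp) / (1.0 + theta)


def numerical_pass_through(kappa, theta, step=1.0):
    """Minus the change in (p - e) per unit change in e (exact, since p is linear in e)."""
    change = (export_price(step, kappa, theta) - step) - export_price(0.0, kappa, theta)
    return -change / step


# With the import content kappa = 0.37 from the text and theta = 0,
# pass-through is 1 - 0.37 = 0.63, i.e. less than full.
example = pass_through(0.37, 0.0)
```

The case θ = κ = 0 returns 1 (full pass-through), and increasing θ drives the value towards zero, in line with the law-of-one-price limit described above.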

Several theoretical motivations for the inflation term in (2) can be given. Haldrup

(1998) presents the general case with quadratic adjustment costs. Banerjee et al. (2001)

give a more specific motivation for the inflation term in a markup price equation. They

assume that firms have imperfect information and face a comparatively large loss if prices

are set too high, e.g. due to a kinked demand curve. That will cause firms to act cautiously and choose relatively low markups. Taking inflation as a measure of uncertainty, high inflation will be accompanied by a relatively low markup, implying a price formation like

(2) with λ > 0.

This theoretical setup suggests an information set given by the 6-dimensional vector Zt = (xt, xft, pt, pft, et, ct)′. In the empirical analysis the sample t = 1975:1, ..., 1996:4

is considered, and all data are quarterly, seasonally adjusted and log-transformed with

average 1980 = 0, see the Appendix for further details. The data and certain linear com-

binations are shown in figure 1 (A)-(D). Figure (B) illustrates that during the period of

1979 to 1981 the Danish krone was devalued several times, but after the commitment to

the fixed exchange rate regime in 1982, the krone has generally appreciated. Figure (D)

shows that up to the beginning of the nineties the Danish market share, xt − xft, and

price competitiveness, −(pt − pft − et), are clearly positively correlated in line with the

prediction of equation (1). After 1990, however, Danish manufacturers have gained considerable market shares despite a deterioration of competitiveness. A country breakdown indicates that this export gain can be attributed mainly to the German market where the

Danish market share has grown approximately 40 per cent from 1990 to 1993 in spite of

a largely unchanged competitiveness, cf. figure (E), which illustrates the Danish market

share in Germany and the Danish export price relative to the German import price in

common currency. We follow Nielsen (1999) and interpret the export gain as a result of

the German reunification in 1990 where Danish exporters apparently have benefited from

the geographical proximity to the expanding German market, see Nielsen (1999) for a

broader discussion.

Outside the German market, data do not seem to be in conflict with the theoretical

model, cf. the positive correlation between the Danish market share and competitiveness

on the remaining 20 markets illustrated in figure (F). According to figure (E), the re-

unification appears to have a long lasting effect on Danish exports and in the empirical

analysis the effect of the German reunification is approximated by a permanent exogenous

shift in the form of the step-dummy, D903t, equal to one from 1990:3 onwards. Furthermore,

a balanced intervention dummy, D8012t, is included in the empirical analysis to account

for two particularly large exchange-rate parity adjustments in September and November

Figure 1: Data and certain linear combinations. Logs, 1980=0. Panels: (A) Prices and Production Costs in Danish Currency; (B) Effective Exchange Rate; (C) Export Volume and the Export Market; (D) Relative Prices and the Market Share; (E) Relative Prices and Market Share in Germany; (F) Relative Prices and Market Share on 20 Markets. Horizontal axes: 1975 to 1996.

2.1 Time Series Interpretation

In the empirical analysis, (1) and (2) are candidates for long-run cointegrating relations

and deviations from the relations should in that case be stationary.

In empirical applications it is usually assumed that the included variables are at most

first-order non-stationary, I(1), see Kongsted (1998) for an example. In this case the

inflation rate, ∆pt, is stationary and the parameter λ is not identified, since any linear combination of stationary variables is itself stationary. That merely reflects that

inflation has no direct long-run effect on the equilibrium markup.

Recent analyses indicate, however, that nominal variables are often better described as

second-order non-stationary, I(2), see Juselius (1996 and 1999) for examples. One possible

scenario is the existence of one I(2) trend affecting the nominal variables, pt, pft and ct,

with identical coefficients, i.e. with loadings proportional to b ≡ (0, 0, 1, 1, 0, 1)′. This corresponds to long-run homogeneity and implies that the I(2) trend cancels in directions orthogonal to b such that both relative prices, pt − pft − et, and the markup, pt − ct, cointegrate from I(2) to I(1), CI(2,1) in the notation of Engle and Granger (1987)². As

a consequence, prices can diverge permanently but relative prices are less persistent than

levels as indicated in figure 1. If the volumes, xt and xft, and the exchange rate, et, are

I(1) variables, then the I(1) relative prices could possibly cointegrate CI(1,1) with the

market share to form a stationary demand relation like (1). Furthermore, the I(1) inflation rate, ∆pt, could cointegrate CI(1,1) with the relative prices and the markup through the

polynomially cointegrating relationship (2) as suggested by Banerjee et al. (2001). This

setup potentially identifies the parameter λ and implies a negative correlation between

inflation and the markup in the steady state.
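The homogeneity argument amounts to a simple orthogonality check: a linear combination v′Zt is free of the I(2) trend exactly when v is orthogonal to the loading vector b. A minimal sketch follows, with the variable ordering taken from Zt = (xt, xft, pt, pft, et, ct)′; the helper name `loads_on_i2_trend` is an assumption made for the example.

```python
# Ordering of the variables in Z_t, as given in the text.
NAMES = ("x", "xf", "p", "pf", "e", "c")

# Loadings of the single I(2) trend under long-run homogeneity.
B = (0, 0, 1, 1, 0, 1)

# Coefficient vectors of some linear combinations of Z_t.
RELATIVE_PRICE = (0, 0, 1, -1, -1, 0)   # p - pf - e
MARKUP = (0, 0, 1, 0, 0, -1)            # p - c
MARKET_SHARE = (1, -1, 0, 0, 0, 0)      # x - xf
PRICE_LEVEL = (0, 0, 1, 0, 0, 0)        # p alone


def loads_on_i2_trend(v, b=B) -> bool:
    """True if the combination v'Z_t still contains the I(2) trend with loadings b."""
    return sum(vi * bi for vi, bi in zip(v, b)) != 0


for label, v in [("relative price", RELATIVE_PRICE), ("markup", MARKUP),
                 ("market share", MARKET_SHARE), ("price level", PRICE_LEVEL)]:
    status = "contains the I(2) trend" if loads_on_i2_trend(v) else "I(2) trend cancels"
    print(f"{label}: {status}")
```

Only the price level in isolation retains the I(2) trend; relative prices and the markup are orthogonal to b, which is the sense in which they cointegrate from I(2) to I(1).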

An alternative scenario could contain two I(2) trends, e.g. a domestic and a foreign I(2)

trend. In this case, it is less straightforward how stationary and economically interpretable

long-run relations could be revealed from the data.

3 The Econometric Framework

The empirical analysis follows the general-to-specific approach of Hendry and Mizon (1993)

based on a vector autoregressive (VAR) model with I(1) and I(2) restrictions cf. Johansen

(1996 and 1992). The basic statistical model for the p-dimensional data vector Zt, t =

1, 2, ..., T , is a VAR(k) model that can be parametrized to facilitate the I(2) analysis as

$$\Delta^2 Z_t = \Pi Z_{t-1} - \Gamma \Delta Z_{t-1} + \sum_{i=1}^{k-2} \Psi_i \Delta^2 Z_{t-i} + \mu_0 + \mu_1 t + \Phi D_t + \epsilon_t. \tag{3}$$

Here $\epsilon_t$ is assumed to be independently and identically distributed N(0, Ω) and the initial values, Z−k+1, ..., Z0, are taken to be fixed. The k matrices of autoregressive coefficients (Π, Γ, Ψ1, Ψ2, ..., Ψk−2) are each of dimension p × p, and µ0 and µ1t are vectors of constants and linear drift terms respectively. The vector Dt contains the dummy variables and Φ contains the corresponding coefficients.

² In some cases I(2) variables can cointegrate CI(2,2) to stationarity, see further below.

The cointegrated I(2) model, Hr,s, is a sub-model of (3), which satisfies the following

reduced rank restrictions

$$\Pi = \alpha\beta' \quad \text{and} \quad \alpha_\perp' \Gamma \beta_\perp = \xi\eta', \tag{4}$$

where α, β are p × r matrices of rank r < p and ξ, η are (p − r) × s matrices of rank s < p − r. We let α⊥ and β⊥ denote the orthogonal complements to α and β respectively, and for later use we define the matrices $\beta_1 = \beta_\perp(\beta_\perp'\beta_\perp)^{-1}\eta$ and $\beta_2 = \beta_\perp\eta_\perp$.
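As an illustration of these definitions, the sketch below builds β⊥, β1 and β2 from given β and η using NumPy. It is not the estimation procedure of the paper: the matrices are random placeholders, the function names are assumptions, and the orthogonal complement is taken from a singular value decomposition, which is one convenient choice among many.

```python
import numpy as np


def orth_complement(a: np.ndarray) -> np.ndarray:
    """Orthonormal basis for the orthogonal complement of span(a).

    a is m x n with full column rank n < m; the result is m x (m - n).
    """
    n = a.shape[1]
    u, _, _ = np.linalg.svd(a, full_matrices=True)
    return u[:, n:]


def i2_directions(beta: np.ndarray, eta: np.ndarray):
    """Return (beta1, beta2) as defined after the reduced rank restrictions (4)."""
    beta_perp = orth_complement(beta)                 # p x (p - r)
    eta_perp = orth_complement(eta)                   # (p - r) x (p - r - s)
    gram = beta_perp.T @ beta_perp
    beta1 = beta_perp @ np.linalg.solve(gram, eta)    # p x s
    beta2 = beta_perp @ eta_perp                      # p x (p - r - s)
    return beta1, beta2


# Placeholder matrices with the dimensions of the maintained model: p = 6, r = 2, s = 3.
rng = np.random.default_rng(1)
beta = rng.standard_normal((6, 2))
eta = rng.standard_normal((4, 3))
beta1, beta2 = i2_directions(beta, eta)
```

By construction β′β1 = 0, β′β2 = 0 and β1′β2 = η′η⊥ = 0, so the three blocks split the p-dimensional space into r + s + (p − r − s) mutually orthogonal directions.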

The inclusion of deterministic variables in the I(2) model is far from straightforward.

Generally they will accumulate in the non-stationary directions of the model, and an

unrestricted constant will produce a quadratic trend in the I(2)-directions of the data. In

the empirical analysis we follow Rahbek et al. (1999) and restrict the constant and the

linear drift term in order to allow linear trends in all components of the model including

the cointegration relations while higher order trends are excluded. The step-dummy,

D903t, together with the lagged differences of the dummy are restricted in a similar way.

That allows level shifts in all directions of the model without changing the slopes of the

linear trends. This model is related to the I(1) model with structural breaks proposed in

Johansen et al. (2000). In the case of reduced rank of Π, the model is written as

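$$\Delta^2 Z_t = \alpha\beta^{*\prime} Z^*_{t-1} - \Gamma \Delta Z_{t-1} + \sum_{i=1}^{k-2} \Psi_i \Delta^2 Z_{t-i} + \phi \Delta D903_{t-1} + \sum_{i=0}^{k-2} \phi_i \Delta^2 D903_{t-i} + \mu_0 + \theta D8012_t + \epsilon_t, \tag{5}$$

where $Z^*_{t-1} = (Z_{t-1}', D903_{t-1}, t)'$ and $\beta^{*\prime} = (\beta', \gamma_0', \beta_0')$.³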

The three mutually orthogonal matrices (β, β1, β2) divide the p-dimensional space into directions with different properties in terms of cointegration. The (p − r − s)-dimensional matrix β2 defines the loadings to the common I(2) trends and the linear combinations β2′Zt are I(2). The r + s linear combinations (β, β1)′Zt are integrated of less than second order and thereby cointegrate. They can be further divided into s directions, β1′Zt, which remain I(1), and r directions, β′Zt, which polynomially cointegrate to stationarity around a linear trend with a level shift

$$S_t = \beta' Z_t + \gamma_0' D903_t + \beta_0' t - \delta\beta_2' \Delta Z_t, \tag{6}$$

for some coefficient δ. If r − (p − r − s) > 0 the directions δ⊥′St will be linear combinations of the levels alone, allowing for direct CI(2,2) cointegration from I(2) to trend stationarity.

The statistical analysis is performed using the two-step algorithm of Johansen (1995) but adapted to the current deterministic setup. The first step is equivalent to a standard I(1) analysis and estimates (α, β∗) in a reduced rank regression with the drift term and the reunification dummy restricted to the cointegration space. This defines the first step Trace statistic, Qr, for a rank of Π equal to r against the unrestricted rank p. Similarly, the second step uses reduced rank regression to estimate the parameters (ξ, η∗), where η∗ is η augmented with coefficients to the constant and the difference of the restricted dummy. The second step is performed conditional on the estimates from the first step and produces a Trace test statistic, Qr,s, for a rank of α⊥′Γβ⊥ equal to s against the unrestricted rank p − r. A simultaneous test for the model Hr,s against the unrestricted alternative can be performed with the test statistic Sr,s = Qr + Qr,s, similar to Rahbek et al. (1999) in the case of a slightly different deterministic specification⁴.

³ Furthermore, restrictions are imposed on µ0 and φ, see Rahbek et al. (1999).
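The reduced rank regression underlying both steps can be sketched generically. The code below is a stripped-down illustration and not the author's RATS procedure: it omits all deterministic terms and short-run dynamics, the function name and the simulated data are assumptions, and it computes only the eigenvalues and the trace statistics −T Σ log(1 − λi) for a pair of residual matrices.

```python
import numpy as np


def trace_statistics(r0: np.ndarray, r1: np.ndarray):
    """Eigenvalues and trace statistics from a reduced rank regression of r0 on r1.

    r0 and r1 are T x p residual matrices (assumed inputs): the dependent
    differences and the lagged levels after partialling out other regressors.
    """
    T, p = r0.shape
    s00 = r0.T @ r0 / T
    s01 = r0.T @ r1 / T
    s11 = r1.T @ r1 / T
    # Eigenvalues of S11^{-1} S10 S00^{-1} S01 are squared canonical correlations.
    m = np.linalg.solve(s11, s01.T @ np.linalg.solve(s00, s01))
    lam = np.sort(np.linalg.eigvals(m).real)[::-1]
    lam = np.clip(lam, 0.0, 1.0 - 1e-12)
    # Trace statistic for rank r against full rank: -T * sum_{i > r} log(1 - lambda_i).
    trace = np.array([-T * np.log1p(-lam[r:]).sum() for r in range(p)])
    return lam, trace


# Hypothetical bivariate data: one common random walk and one stationary relation.
rng = np.random.default_rng(0)
walk = np.cumsum(rng.standard_normal(200))
z = np.column_stack([walk + rng.standard_normal(200),
                     walk + rng.standard_normal(200)])
eigenvalues, trace = trace_statistics(np.diff(z, axis=0), z[:-1])
```

In the I(2) setting the same computation would be applied twice, first to residuals for Δ²Zt and the augmented levels, then to the corresponding quantities projected onto α⊥ and β⊥; the critical values are non-standard in both cases and are not produced by this sketch.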

4 The Long-Run Structure

The first part of the empirical analysis is to determine the lag length, k, of the VAR model (5)⁵. The Akaike information criterion is minimized for a lag length of k = 3. Likelihood ratio tests for successive removal of lags also point towards a lag length of k = 3 and this value is maintained in the following. The model is estimated for the effective sample of 1976:3−1996:4 and according to table 1 the model seems to have acceptable statistical properties⁶.

4.1 I(2) Analysis

Next we want to determine the rank indices of the I(2) model, r and s, by using the Trace

test statistics, Sr,s, described in section 3. In determining the cointegration rank in I(1)

systems Johansen (1996) proposes two different strategies, and they can be generalized to

the I(2) case. If economic theory suggests a particular model, this preferred model can

be tested against the unrestricted model, H6, to assess if the restrictions imposed by this

model are compatible with data. If the prior knowledge is weak, the rank indices can be determined in a solely data-based procedure, following the so-called Pantula principle.

Here all models are tested against the general H6 and a model is rejected only if all sub-

models are also rejected starting from the most restricted case H0,0. See Johansen (1996)

for the case of the I(1) model and Rahbek et al. (1999) for an application to the I(2)

model. Monte Carlo simulations in Nielsen (2001) illustrate, however, that it is generally

⁴ The chosen deterministic specification is balanced in the sense that the different directions have the same deterministic characteristics. This ensures that the rank tests, Sr,s, are asymptotically similar, see Rahbek et al. (1999) and Nielsen and Rahbek (2000).

⁵ The empirical analysis is performed using the program RATS 4.20, the procedure CATS in RATS (Hansen and Juselius, 1995) and a procedure for I(2) estimation with restricted deterministic variables written by the author, as well as the program PcFiml 9.10 (Doornik and Hendry, 1997).

⁶ The only deviation from the assumptions significant at a conventional five per cent level is a rejected test for no autocorrelation in the equation for the competing prices. This is associated with lag 5 and is found to be difficult to remedy by increasing the lag length. Furthermore, we mainly focus on modelling the domestic variables in the subsequent analysis.

Table 1: Test for misspecification of the unrestricted VAR(3)

Equation    AR(1-5), F(5, 52)    ARCH(4), F(4, 49)    Normality, χ²(2)
∆xt         2.178 [0.07]         0.954 [0.44]         1.785 [0.41]
∆xft        2.349 [0.06]         0.329 [0.86]         3.048 [0.22]
∆pt         2.022 [0.09]         1.301 [0.28]         0.273 [0.87]
∆pft        3.383 [0.01]         0.198 [0.94]         5.567 [0.06]
∆et         1.529 [0.20]         0.734 [0.57]         0.726 [0.70]
∆ct         0.568 [0.72]         0.216 [0.93]         1.442 [0.49]

Multivariate tests:
Normality, χ²(12)       12.586 [0.40]
AR(1-5), F(180, 138)    1.283 [0.06]

Note: AR is a test for autocorrelation up to 5th order. ARCH tests up to 4th order. Figures in square brackets are significance levels.

Table 2: Test for the rank-indices of the I(2) model

r        Sr,s                                                  Qr
0        427.53   357.30   303.50   252.68   210.69   179.59   170.90
         [0.00]   [0.00]   [0.00]   [0.00]   [0.00]   [0.00]   [0.00]
1                 305.40   236.79   188.85   148.89   115.65   111.92
                  [0.00]   [0.00]   [0.00]   [0.01]   [0.08]   [0.01]
2                          222.98   162.66   121.33   84.44    76.85
                           [0.00]   [0.00]   [0.01]   [0.11]   [0.04]
3                                   132.64   86.22    49.72    44.09
                                    [0.00]   [0.02]   [0.43]   [0.21]
4                                            67.89    27.39    20.35
                                             [0.00]   [0.60]   [0.53]
5                                                     21.37    9.00
                                                      [0.12]   [0.41]
p−r−s    6        5        4        3        2        1        0

Note: Figures in square brackets are p−values according to a Γ−function approximation of the simulated asymptotic distribution.

difficult to determine the correct rank indices with the data-based strategy, due to, among

other things, low power to distinguish I(1) directions from near-persistent I(0) directions.
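The data-based search can be written down compactly. The sketch below applies it to the p-values reported in table 2, keyed by (r, s) with the Qr column corresponding to s = p − r. The traversal order (row by row from H0,0, moving towards less restricted models within each row) is an assumption about how the Pantula principle is applied here, and the function name is illustrative.

```python
P = 6  # dimension of the system

# p-values from table 2, keyed by (r, s); within each row s runs from 0 to p - r.
P_VALUES = {
    (0, 0): 0.00, (0, 1): 0.00, (0, 2): 0.00, (0, 3): 0.00,
    (0, 4): 0.00, (0, 5): 0.00, (0, 6): 0.00,
    (1, 0): 0.00, (1, 1): 0.00, (1, 2): 0.00, (1, 3): 0.01,
    (1, 4): 0.08, (1, 5): 0.01,
    (2, 0): 0.00, (2, 1): 0.00, (2, 2): 0.01, (2, 3): 0.11, (2, 4): 0.04,
    (3, 0): 0.00, (3, 1): 0.02, (3, 2): 0.43, (3, 3): 0.21,
    (4, 0): 0.00, (4, 1): 0.60, (4, 2): 0.53,
    (5, 0): 0.12, (5, 1): 0.41,
}


def pantula_selection(p_values, p=P, level=0.05):
    """Return the first (r, s) not rejected when moving from the most restricted model."""
    for r in range(p):
        for s in range(p - r + 1):
            if p_values[(r, s)] > level:
                return r, s
    return p, 0  # every sub-model rejected: the unrestricted model remains


selected = pantula_selection(P_VALUES)
print(selected)  # (1, 4)
```

At the 5 per cent level the search stops at H1,4, whose p-value of 8 per cent is only marginally above the threshold; the next candidate with one I(2) trend, H2,3, has a p-value of 11 per cent.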

The distribution of Sr,s is non-standard and depends on the deterministic specifica-

tion of the model. In the present case we have to take the presence of the restricted

step-dummy, D903t, into account and the distributions of the test statistics are therefore simulated⁷. The simulated distributions are asymptotic and little is known about the small sample distributions of the tests. The present system is fairly large compared to the

number of observations and the expected uncertainty in the rank determination is fur-

ther enlarged by the included balanced impulse dummy, D8012t, which is asymptotically

negligible but can affect the properties of the test procedure in small samples.
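The approximation described in footnote 7 has two ingredients: a response surface regression that extrapolates simulated moments to T → ∞, and a Γ-distribution matched to the resulting mean and variance. A minimal sketch of both follows, assuming NumPy and SciPy are available. The simulated moments are invented placeholders (the paper's simulated values are not reproduced here), and the regressors are rescaled by 100 purely for numerical conditioning, which leaves the intercept unchanged.

```python
import numpy as np
from scipy.stats import gamma

# The 13 sample lengths listed in footnote 7.
SAMPLE_LENGTHS = [80, 90, 100, 120, 150, 200, 300, 400, 500, 600, 800, 1000, 1200]


def asymptotic_moment(sample_lengths, simulated_moments) -> float:
    """Fit log q(T) on (1, 1/T, 1/T^2, 1/T^3) by OLS and return exp(intercept)."""
    u = 100.0 / np.asarray(sample_lengths, dtype=float)
    design = np.column_stack([np.ones_like(u), u, u**2, u**3])
    theta, *_ = np.linalg.lstsq(design, np.log(simulated_moments), rcond=None)
    return float(np.exp(theta[0]))


def gamma_p_value(statistic: float, mean: float, variance: float) -> float:
    """Upper tail probability of a Gamma distribution with the given mean and variance."""
    shape = mean**2 / variance
    scale = variance / mean
    return float(gamma.sf(statistic, a=shape, scale=scale))


# Invented moments for illustration only.
T = np.array(SAMPLE_LENGTHS, dtype=float)
fake_means = 90.0 * np.exp(4.0 / T)
fake_variances = 160.0 * np.exp(10.0 / T)

mean_inf = asymptotic_moment(SAMPLE_LENGTHS, fake_means)
var_inf = asymptotic_moment(SAMPLE_LENGTHS, fake_variances)
p_value = gamma_p_value(100.0, mean_inf, var_inf)
```

Matching only two moments is what makes the approach cheap: one pair (mean, variance) per test statistic is enough to produce approximate p-values for any observed value.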

The test statistics, Sr,s, are reported in table 2 together with the corresponding

p−values according to the Γ−function approximations of the asymptotic distributions. The theoretical framework presented in section 2 suggests a model with r = 2 stationary relations: the demand relation (1) and the pricing relation (2). If λ ≠ 0, the pricing relation constitutes a polynomially cointegrating relation, whereas the demand relation is

directly cointegrating. This theoretical scenario is facilitated by the model H2,3. The

restrictions imposed by this preferred model cannot be rejected against the data, and the

asymptotic p−value for the test of H2,3 against H6 is 11 per cent, cf. table 2.

If the purely data-based rank determination procedure is applied, the overall picture

is that the decision on the number of I(2) trends is clear-cut, p − r − s = 1, whereas the number of stationary relations is more uncertain. The models with r = 1, 2 and 3 have supporting p−values of 8, 11 and 43 per cent respectively, indicating that the stationarity of the second and third long-run relations are borderline cases. To visualize

the borderline nature of the second stationary relation, the two polynomially cointegrating

relations, $S_t = \beta^{*\prime} Z^*_t - \delta\beta_2' \Delta Z_t$, in the model H2,3 are illustrated in figure 2. From a visual

inspection the first relation appears clearly stationary, whereas the second looks a little

more persistent.

⁷ The asymptotic distribution of each test statistic is approximated by a Γ−function based on the simulated asymptotic mean and variance, see Doornik (1998). The mean, $q_m(T_i)$, and variance, $q_v(T_i)$, of the test statistics are first simulated for the 13 different sample lengths $T_i$ = 80, 90, 100, 120, 150, 200, 300, 400, 500, 600, 800, 1000, 1200. Then the asymptotic moments of a given test statistic are estimated as the exponential of the intercept, $q^j_\infty = \exp\theta^j_\infty$, in the simple response surface regression $\log(q_j(T_i)) = \theta^j_\infty + \theta^j_1 T_i^{-1} + \theta^j_2 T_i^{-2} + \theta^j_3 T_i^{-3}$, for j = m, v. For all sample lengths and test statistics the simulations are performed with 10000 replications.

One way to get more information on the rank determination in I(1) models is to calculate the Trace test statistics for recursive samples, t = 1, ..., t0, where t0 runs over T0, ..., T for some T0 large enough to estimate the smallest sample with some precision. This is also possible in the I(2) case and in figure 3 the recursively calculated Trace test statistics for the two borderline models H1,4 and H2,3 are reported. In the present case the calculations are complicated by the German reunification and the included restricted dummy, D903t, which is only present in the last part of the sample. For the shortest of the recursive samples the more conventional critical values of Doornik (1998) are appropriate, while for the longer samples the distributions depend on the presence of the dummy. In fact the distributions depend on the proportion of the sample influenced by the dummy, but in figure 3 we compare in all cases with the distributions simulated for the entire sample. The results from the recursive estimation are the following: Before the break the model H1,4 is clearly rejected, whereas H2,3 is accepted at a 5 per cent level for all sub-samples except one, indicating r = 2 stationary relations. When including observations after the German reunification (and comparing with the approximate asymptotic distributions) the model H2,3 is still clearly accepted for all sub-samples whereas the test statistic for the model H1,4 varies above and below the 95 per cent quantile, ending for the entire sample with the p−value of 8 per cent. The formal acceptance of model H1,4 for the total sample thus seems fragile and depends on observations after the reunification. It could therefore be an artifact from the simple approximation of the effects of the German reunification with a dummy variable⁸.

To conclude this section, it is difficult to make the choice of the rank indices in a small

sample on purely statistical grounds, which is not an unusual situation. The theoretically

preferred model H2,3 is acceptable against the data, and this is robust also for shorter

sub-samples according to the recursive estimation. In the following we give some weight

to the a priori theoretical information and select this preferred model, which allows for

both a demand and a pricing relation in the data.

More information on the dynamic properties is given by the eigenvalues of the com-

panion matrix, which are the inverses of the roots of the characteristic polynomial of (5).

For the model to be stationary, all eigenvalues are required to be located strictly inside

the complex unit circle. Each I(1) trend in the process will entail one unit root in the

polynomial whereas each I(2) trend will entail two. Given the rank indices (r, s) we there-

fore expect to find s + 2 · (p − r − s) unit roots in the process. The 10 largest of the 18

eigenvalues in the unrestricted model, H6, have moduli given by

(0.955, 0.887, 0.887, 0.828, 0.828, 0.730, 0.730, 0.617, 0.617, 0.587) .

At least five of the eigenvalues are close to one. The maintained model H2,3 restricts five

roots to unity, and the ten largest eigenvalues in this restricted model have moduli given by

(1, 1, 1, 1, 1, 0.739, 0.670, 0.670, 0.623, 0.623) .

The largest unrestricted eigenvalue of 0.739 is fairly large, again indicating the borderline nature of the second stationary relation. Note that this eigenvalue is practically unchanged from the unrestricted model, indicating that there are no ignored I(2) or higher-order stochastic trends in the data.⁹

⁸ An alternative and simple way to avoid the problems of the German reunification in the rank determination is to exclude Germany from the data and model the Danish export to the remaining 20 markets, i.e. the data in figure 1 (F). For this data set the models with p − r − s = 1 I(2) trend and r = 1, 2 and 3 stationary relations have p-values of 4, 14 and 28 per cent respectively, supporting the validity of the theoretically preferred model H2,3 on the majority of markets.

[Figure 2, panels (A) First stationary relation and (B) Second stationary relation; horizontal axis: 1976 to 1996.]

Figure 2: The polynomially cointegrating relations, $S_t = \beta^{*\prime} Z^*_t - \delta \beta_2' \Delta Z_t$, in the maintained model H2,3. Corrected for expectation.

[Figure 3, Recursive Trace-tests for model H2,3 and H1,4; series: Model H1,4 and Model H2,3, with 95 per cent critical values; horizontal axis: 1988 to 1996.]

Figure 3: Trace tests, Sr,s(t0), for the recursive sub-samples t = 1976:3, ..., t0, calculated for the borderline models, H1,4 and H2,3, together with the corresponding asymptotic 95 per cent critical values. Sample end-point, t0, on the horizontal axis. Note that for sample end-points 1990:3 and 1990:4 the restricted dummy is not included due to coincidence with the unrestricted second differences of the dummy. For these samples the structural break is modelled as an outlier only and the usual critical values apply. For the end-point 1991:1 the model is not easily estimated. Note furthermore that the critical values after the German reunification are only approximations since they are simulated for the entire sample and not for each sub-sample in the recursive estimation.
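The unit-root count implied by a pair of rank indices, s + 2 · (p − r − s), can be checked with a few lines of Python. This is an illustrative sketch only: the function name `unit_root_count` is our own, and the lag length of 3 is an inference from the 18 eigenvalues reported for the 6-variable system rather than something stated in this passage.

```python
def unit_root_count(p: int, r: int, s: int) -> int:
    """Number of unit roots implied by the I(2) rank indices (r, s).

    p is the dimension of the process, r the number of stationary
    relations and s the number of I(1) trends, so that p - r - s is
    the number of I(2) trends.
    """
    i2_trends = p - r - s
    if min(r, s, i2_trends) < 0:
        raise ValueError("rank indices must satisfy r >= 0, s >= 0, r + s <= p")
    # Each I(1) trend contributes one unit root, each I(2) trend two.
    return s + 2 * i2_trends


p, lags = 6, 3  # lags = 3 is an assumption: 18 eigenvalues / 6 variables
companion_dim = p * lags

maintained = unit_root_count(p, r=2, s=3)  # model H2,3: one I(2) trend
pure_i1 = unit_root_count(p, r=1, s=5)     # model H1,5: no I(2) trend
borderline = unit_root_count(p, r=1, s=4)  # model H1,4: one I(2) trend

print(companion_dim, maintained, pure_i1, borderline)
```

Both H2,3 and H1,5 restrict five roots to unity, which is why the comparison in footnote 9 turns on the size of the largest remaining unrestricted root rather than on the number of unit roots.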

4.1.1 Feedback in the Empirical Model

At the outset all 6 variables in Zt were considered endogenous. From a theoretical perspective, however, the feedback to foreign variables, xft and pft, is expected to be weak.

Production costs, ct, are also exogenous in the theoretical setup but that is probably not

the case in an open economy like Denmark where exports account for a considerable pro-

portion of total demand. Finally, the status of the exchange rate, et, is ambiguous for

a country participating in a fixed exchange rate mechanism, see Kongsted (1998) for a

discussion.

In the I(1) model, the concept of weak exogeneity for the cointegration parameters is

equivalent to no level-feedback and corresponds to zero rows in α. In the I(2) model, zero

rows in α still imply lack of feedback from the levels of Zt but that is not sufficient for

weak exogeneity, cf. Paruolo and Rahbek (1999). For the present study, weak exogeneity

is not interesting per se but it is still interesting to test the strength of the feedback from

the levels of the variables. Results from these tests are reported in table 3. The null

hypothesis of no level-feedback cannot be rejected for the export market, xft, and the

exchange rate, et, indicating that the feedback to these variables is weak. A joint test for

the hypothesis of no level-feedback to xft and et, i.e. $f_2'\alpha = f_5'\alpha = 0$, yields a test statistic of 4.73. That corresponds to a p-value of 0.32 according to a $\chi^2(4)$ distribution. The

hypothesis of no level-feedback is clearly rejected for the domestic variables, xt, pt and ct,

and is also rejected for competitors' prices, pft. The latter is a little surprising but could

reflect that although Denmark is a small economy it could be a non-negligible player on

some particular markets.
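The p-values quoted for the level-feedback tests can be reproduced from the test statistics. The sketch below uses only the Python standard library and the closed-form upper tail of the chi-square distribution for integer degrees of freedom; the helper name `chi2_sf` and the dictionary keys are our own, and the statistics are copied from table 3 and the joint test above.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable, integer df."""
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    h = x / 2.0
    if df % 2 == 0:
        # Even df = 2k: exp(-h) * sum_{i=0}^{k-1} h^i / i!
        k = df // 2
        return math.exp(-h) * sum(h**i / math.factorial(i) for i in range(k))
    # Odd df = 2k + 1:
    # erfc(sqrt(h)) + exp(-h) * sum_{i=1}^{k} h^(i - 1/2) / Gamma(i + 1/2)
    k = (df - 1) // 2
    tail = sum(h ** (i - 0.5) / math.gamma(i + 0.5) for i in range(1, k + 1))
    return math.erfc(math.sqrt(h)) + math.exp(-h) * tail


# Test statistics for no level-feedback, each with 2 degrees of freedom.
table3 = {"x": 10.68, "xf": 3.18, "p": 15.55, "pf": 8.31, "e": 1.44, "c": 12.30}
p_values = {name: chi2_sf(stat, 2) for name, stat in table3.items()}

# Joint test of no level-feedback to xf and e, 4 degrees of freedom.
joint = chi2_sf(4.73, 4)

for name, value in p_values.items():
    print(f"{name:>2}: {value:.3f}")
print(f"joint: {joint:.2f}")
```

The same helper applies to the other likelihood ratio tests reported in this section, since they are all referred to chi-square distributions with integer degrees of freedom.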

4.1.2 A Nominal-To-Real Transformation of the System to I(1)

The I(1) model is still more developed than the I(2) model and in the following a trans-

formation of the data will be applied that allows inference on the key parameters in the

I(2) model to be conducted within the simpler I(1) framework, see Kongsted and Nielsen

(2002). The validity of the transformation will be tested using the sequential test proposed

in Kongsted (1999 and 2003).

The theoretical candidates to long-run relations (1) and (2) are homogeneous in the

price and cost variables. Homogeneity implies that the system in the long run can be

written solely in the relative prices, i.e. the markup, pt − ct, and the relative price in Danish currency, pt − pft − et, together with the unrestricted exchange rate, et. In addition to this, ∆pt is included in the transformed data set in order to allow deviations from homogeneity

⁹ If we alternatively assume that all five unit roots are related to I(1) trends, corresponding to the I(1)

model H1,5, a large root of 0.949 is introduced. This clearly indicates the presence of an I(2) trend in the

Table 3: Test of no level-feedback

Hypothesis    ν    xt        xft       pt        pft       et        ct
f_i′α = 0     2    10.68     3.18      15.55     8.31      1.44      12.30
                   [0.005]   [0.204]   [0.000]   [0.016]   [0.488]   [0.002]

Note: Numbers in brackets are p-values according to χ²(ν). f_i are appropriately defined unit vectors.

Table 4: Sequential test of long-run homogeneity

Hypothesis    ν    α unrestricted    f_2′α = f_5′α = 0
b′β = 0       2    0.91 [0.635]      3.65 [0.161]
b′β1 = 0      3    12.37 [0.006]     11.85 [0.008]

Note: Numbers in brackets are p-values according to χ²(ν).

Table 5: Identification of the long-run relations

      xt − xft   pt − pft − et   pt − ct   ∆pt       et   D903t     t         ν   LR      p
H1    1          3.101           0         0         0    −0.176    −0.003    3   6.43    0.09
                 (0.392)                                  (0.075)   (0.001)
      0          0.931           1         5.643     0    −0.028    0.000
                 (0.147)                   (0.745)        (0.028)   (0.001)

H2    1          3.141           0         0         0    −0.110    −0.004    5   7.23    0.20
                 (0.335)                                  (0.042)   (0.001)
      0          0.914           1         5.541     0    0         0
                 (0.113)                   (0.690)

H3    1          1.520           0         0         0    −0.167    0.000     6   38.83   0.00
                 (0.164)                                  (0.031)   (0.001)
      0          0.345           1         0         0    0         0
                 (0.047)

Note: Likelihood Ratio (LR) test statistics are distributed χ²(ν) and p is significance level. Figures in parentheses are asymptotic standard errors.

Table 6: Test for misspecification of the cointegrated VAR(3)

Equation              AR(1-5), F(5, 56)    ARCH(4), F(4, 53)    Normality, χ²(2)
∆(xt − xft)           2.421 [0.05]         0.234 [0.92]         4.193 [0.12]
∆(pt − pft − et)      0.735 [0.60]         0.418 [0.79]         0.319 [0.85]
∆(pt − ct)            1.340 [0.26]         0.322 [0.86]         0.788 [0.67]
∆2pt                  2.351 [0.05]         1.422 [0.24]         0.517 [0.77]

Multivariate tests:   AR(1-5), F(80, 152)  1.214 [0.15]
                      Normality, χ²(8)     9.823 [0.28]

Note: AR is a test for autocorrelation up to 5th order. ARCH tests up to 4th order. Figures in square brackets are significance levels.

in the short run and to give the possibility of the polynomially cointegrating relation.

The proposed transformation thus results in a set of variables where the polynomially

cointegrating relations in the I(2) system, St, correspond to ordinary I(1) cointegration

relations.

The transformation requires that the hypothesis of homogeneity is accepted on all

cointegrating relations (β, β1) i.e. that the I(2) trend affects the nominal variables with

loadings proportional to b. Informally, this can be judged from the estimated loading to

the I(2) trend

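$$
\begin{array}{rcccccccl}
 & & x_t & x^f_t & p_t & p^f_t & e_t & c_t & \\
\hat{\beta}_2' & = ( & 0.38 & -0.47 & 1 & 0.80 & 0.85 & 0.88 & ) \\
b' & = ( & 0 & 0 & 1 & 1 & 0 & 1 & )
\end{array}
$$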

The estimated coefficients to the price variables, pt, pft and ct, are not far from being equal

but loadings to the remaining variables are surprisingly large. Especially, the loading to

the exchange rate is of the same magnitude as the price variables.

Each part of the hypothesis $b'(\beta, \beta_1) = 0$ can be tested separately using the sequential test of Kongsted (2003). A significance level of 0.05/2 should be applied in each step in order to obtain an overall level between 0.05/2 and 0.05. Results are reported in table 4. The first step of the hypothesis, $b'\beta = 0$, implies two restrictions on span(β) and yields a Likelihood Ratio test statistic of 0.91. That corresponds to a significance level of 64 per cent in a $\chi^2(2)$ distribution. The second step, $b'\beta_1 = 0$, imposes three restrictions on the second step of the I(2) estimation procedure and is performed conditional on the restricted estimates of the first step. The hypothesis yields a test statistic of 12.37, which is highly significant in a $\chi^2(3)$ distribution. A test for the homogeneity restriction conditional on $f_2'\alpha = f_5'\alpha = 0$ is also reported in table 4. The restriction $b'\beta = 0$ is still accepted albeit with a smaller p-value. The test statistic of the second step, $b'\beta_1 = 0$, is lower but the p-value is still below the conventional 0.05/2.¹⁰

The formal rejection of overall homogeneity is thus related to the β1-relations, which are rarely considered in empirical analyses. The hypothesis of homogeneity is, however,

a necessary condition for a model to be accepted as a reasonable representation of price-

setting firms. The rejection could reflect the large loading to the exchange rate in the

estimated β2. The interpretation is that the I(2) trend is embedded in the exchange rate

as well as in the nominal variables. This is somewhat surprising, as the exchange rate

seems less persistent than the nominal variables, cf. figure 1. If I(2)-ness is embedded

in the exchange rate this could reflect large interventions in the time series rather than

the effect from a stochastic I(2) trend. In any case, linear combinations of the nominal variables seem to cancel the I(2) trend, as the coefficients in β2 are of the same magnitude,

¹⁰ Instead of relying on the two step estimator, which estimates β and β1 in different steps, the model could alternatively be estimated using the novel Maximum Likelihood algorithm of Johansen (1997), in which τ = (β, β1) is estimated jointly. Using this estimator the hypothesis $b'\tau = 0$ is in fact accepted with a test statistic of 10.27 corresponding to a p-value of 7 per cent in a $\chi^2(5)$ distribution.

and the analysis of relative prices could be acceptable.

If we impose the hypothesis of no level-feedback to the export market and the exchange

rate and we accept the restriction of homogeneity then the transformed I(1) model can be

written as

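$$
\Delta \tilde{Y}_t = \tilde{\alpha} \tilde{\beta}^{*\prime}
\begin{pmatrix} \tilde{Y}_{t-1} \\ x^f_{t-1} \\ e_{t-1} \\ D903_{t-1} \\ t \end{pmatrix}
+ \sum_{i=1}^{2} \tilde{\Gamma}_i \Delta \tilde{Y}_{t-i}
+ \sum_{i=0}^{2} \tilde{\Phi}_i
\begin{pmatrix} \Delta D903_{t-i} \\ \Delta x^f_{t-i} \\ \Delta e_{t-i} \end{pmatrix}
+ \tilde{\mu} + \tilde{\theta} D8012_t + \epsilon_t, \qquad (7)
$$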

where $\tilde{Y}_t' = (x, p - p^f - e, p - c, \Delta p)_t$ are the transformed endogenous variables. This will be the maintained model.¹¹

4.2 Identification of the Long-Run Structure

In the light of the rejected transformation it is paramount to verify that the I(2) trend is

not present in the transformed data. A look at the roots of the characteristic polynomial

of (7) seems to indicate that the I(2) trend is in fact cancelled in the transformed variables.

The largest eigenvalues have moduli given by

(1, 1, 0.785, 0.727, 0.727, 0.646, 0.646, 0.552) ,

where the largest unrestricted eigenvalue is not alarmingly close to unity, and could reflect

the borderline nature of the second stationary relation rather than extra unit roots corre-

sponding to an I(2) trend. Further, a graphical inspection of the transformed variables

gave no indications of I(2)-ness and it is chosen therefore to accept the transformed system

and identify the long-run structure within this framework.

The theoretical demand equation (1) is formulated as a model for the market share,

xt − xft, and an elasticity of exports with respect to foreign demand different from unity

would be hard to interpret. A formal test for a unit elasticity in the long run imposes

two restrictions on span($\tilde{\beta}^*$), and yields a test statistic of 6.27. The hypothesis is thus borderline accepted with a significance value of just above 4 per cent and the restriction

is imposed in the following.

The two cointegration relations will now be identified inspired from the theoretical

candidates (1) and (2). As a starting point the trend and the reunification dummy will

be left unrestricted. The structure (1) and (2) implies three restrictions on the first

relation and two restrictions on the second and is generically identified (Johansen, 1996).

Three restrictions are over-identifying and yield a test statistic of 6.43. The test statistic is

χ2(3) distributed and corresponds to a p-value of 9 per cent. Table 5 reports the estimated

parameters under H1.

¹¹ In principle one observation is lost by the transformation but in the present case unused observations in the beginning of the sample can be introduced and the model (7) can be estimated for an unchanged effective sample.

In the structure H1 the trend and the reunification dummy are insignificant in the polynomially cointegrating price relation. Restricting these terms to zero yields the results reported under H2. The additional restrictions change the Likelihood Ratio test statistic from 6.43 to 7.23 against two additional degrees of freedom. The structure is accepted

with a total p-value of 20 per cent according to a χ2(5) distribution.

The structure, H2, is clearly empirically identified. The long-run demand relation is given by

xt − xft = −3.141 · (pt − pft − et) + 0.110 ·D903t + 0.004 · t. (8)

The estimated price elasticity of 3.14 in numerical terms is fairly high in an international comparison and is somewhat higher than the 2.36 found in the I(1) cointegration analysis of

Kongsted (1998). The estimated effect of the German reunification is an increase in the

average Danish market share by approximately 11 per cent, which is in accordance with

the impression from figure 1 (D). The trend is clearly significant, which could be a result of

the applied market share restriction or some structural changes in the countries involved.

The dynamic steady-state price relation can be written as

pt = 0.478 · (pft + et) + 0.522 · ct − 2.895 ·∆pt. (9)

The pass-through from exchange rates to export prices in foreign currency, calculated

as the partial derivative of the long run solution to (9), is given by $-\frac{\partial}{\partial e_t}(p_t - e_t) = \frac{1-\kappa}{1+\theta} = 0.33$, implying that two-thirds of a competitiveness gain, induced by fluctuations in

exchange rates, will be offset by an increase in the markup. This indicates a significant

pricing-to-market effect of Danish exporters even in the long run and is in contrast with

Kongsted (1998), who finds a simple markup on costs in Danish export pricing. The major

difference between Kongsted (1998) and the present analysis is the starting point in the

I(2) model and the implied polynomially cointegrating element. The importance of this

can be evaluated by restricting the coefficient of ∆pt to zero. This structure is reported

under H3. The restriction results in a significantly lower weight to foreign prices and a lower price elasticity in the demand relation, more similar to the results in Kongsted

(1998). The restriction can, however, be safely rejected, which underlines the importance

of the polynomially cointegrating element. The large coefficient to the inflation term can

be interpreted as an element of caution in the price setting cf. Banerjee et al. (2001).

The equation also implies an error correction in two steps, and the large coefficient to the

inflation rate indicates a very slow adjustment, which reflects the smoothness of the time

series.
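The weights in the steady-state price relation (9) follow from normalizing the second cointegrating relation under H2 in table 5 on pt. The sketch below makes the arithmetic explicit. The variable names are our own, and reading κ as the 37 per cent weight of import prices in the unit cost index (see the data appendix) is an assumption on our part, adopted because it reproduces the reported pass-through of 0.33.

```python
# Second cointegrating relation under H2 (table 5), written as
#   theta * (p - pf - e) + (p - c) + delta * dp = 0
theta = 0.914  # coefficient on the relative price, p - pf - e
delta = 5.541  # coefficient on inflation, dp

# Solving for p gives p = w_f * (pf + e) + w_c * c - w_dp * dp.
w_f = theta / (1 + theta)
w_c = 1 / (1 + theta)
w_dp = delta / (1 + theta)

# Assumption: kappa is the share of import prices in unit costs, so that
# costs move by kappa per cent when the exchange rate moves by one per cent.
kappa = 0.37
long_run_pass_through = (1 - kappa) / (1 + theta)

print(round(w_f, 3), round(w_c, 3), round(w_dp, 3))
print(round(long_run_pass_through, 2))
```

Under these assumptions the weights on foreign prices and costs sum to one, which is the homogeneity property imposed by the nominal-to-real transformation.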

An inspection of the recursively estimated eigenvalues and identified long-run param-

eters (not reported) indicates that the model is fairly constant. Especially, the included

dummy seems to fully account for the German reunification and no instability is intro-

duced in the model. Statistically, the model seems to be well behaved, cf. table 6 that

reports a battery of misspecification tests.

5 The Short-Run Structure

To characterize the dynamic adjustment mechanisms in the Danish export sector we de-

velop a structural representation of the model. A simultaneous equation model is obtained

by premultiplying the reduced form with a p×p matrix A0 of full rank. The equations have to be identified by imposing a priori one normalization and p − 1 = 3 linearly independent restrictions on each equation.

Economic theory is not very precise in the description of the short-term interaction

between the transformed variables in $\tilde{Y}_t$. We therefore rely on a combination of economic theory and empirical evidence from the reduced form in the identification of the system.

In particular, we apply the following identification scheme

Equation            A0                           Further restrictions (excluded terms)
∆(x − xf)t          1     −a12    −a13    0      ∆(p − c)t−1 and ∆et
∆(p − pf − e)t      0     1       0       0      –
∆(p − c)t           0     0       1       0      –
∆2pt                0     −a42    0       1      (x − x )t−1

which fulfills the rank conditions for generic identification. The restrictions further imply strong empirical identification of the system and make the significant parameters interpretable from economic theory.
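As a quick check of the counting rule, the identification scheme can be written down as data and the number of restrictions per equation compared with p − 1 = 3. This is a sketch of the order condition only, which is necessary but not sufficient; it does not verify the rank condition referred to above. The labels, including `ecm_x_lag1` for the lagged deviation from the long-run market share relation, are hypothetical names of our own.

```python
# Zero restrictions on off-diagonal elements of A0, plus regressors
# excluded from each structural equation.
scheme = {
    "d(x - xf)": {
        "a0_zeros": ["a14"],
        "excluded": ["d(p - c)_lag1", "d(e)"],
    },
    "d(p - pf - e)": {
        "a0_zeros": ["a21", "a23", "a24"],
        "excluded": [],
    },
    "d(p - c)": {
        "a0_zeros": ["a31", "a32", "a34"],
        "excluded": [],
    },
    "d2(p)": {
        "a0_zeros": ["a41", "a43"],
        "excluded": ["ecm_x_lag1"],  # hypothetical label
    },
}


def restriction_count(spec: dict) -> int:
    """Total number of exclusion restrictions imposed on one equation."""
    return len(spec["a0_zeros"]) + len(spec["excluded"])


p = len(scheme)
counts = {eq: restriction_count(spec) for eq, spec in scheme.items()}

# Order condition: at least p - 1 restrictions besides the normalization.
order_condition_ok = all(n >= p - 1 for n in counts.values())

print(counts)
print(order_condition_ok)
```

Each equation carries exactly three restrictions, so the scheme is exactly identified at this stage; the 50 overidentifying restrictions are imposed afterwards.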

In the market share equation we allow for a contemporaneous effect from the relative

prices, ∆(p − pf − e)t. This is a standard demand effect of a high export price and implies a12 < 0. Besides this effect from the relative export prices in common currency, we expect

no contemporaneous contributions from the exchange rate and exclude ∆et. Further, we

expect a supply effect from a high markup, but the coefficient to the lagged markup is

negative in the market share equation. As the lagged markup is also significantly negative

in the markup equation, we can obtain a positive contemporaneous supply effect in the

market share equation via a13 > 0 and by excluding ∆(p − c)t−1. This identification of the contemporaneous effect with the lagged term stresses that the supply effect is a very short

term phenomenon.

In the reduced form equation for the price acceleration, ∆2pt, a significant positive

coefficient is estimated to the lagged deviation from the long-run market share relation,

(x − x )t−1. This has the counterintuitive effect that an increase in the demand for exports will cause firms to lower export prices, and the coefficient is therefore restricted to zero. This restriction could identify a contemporaneous effect from competitors' prices with correct sign, a42 < 0, as (x − x )t−1 is also clearly significant in the reduced form equation for relative prices. The equations for ∆(p − pf − e)t and ∆(p − c)t are left in reduced form.

The outlined structure can be estimated with Full Information Maximum Likelihood

and the parameters assuring identification are all significant. The system is, however,

highly overparametrized and we impose 50 overidentifying restrictions to get a parsimo-

nious system. The estimated structural equations for the market share and the price

Table 7: FIML-estimation of the structural representation

Equation for ∆(x − xf)t (fitted)

  Regressor             Coefficient    t-value
  ∆(x − xf)t−1          −0.285         (−3.608)
  ∆xft−1                −0.285         (−3.608)
  ∆(p − c)t             +1.834         (2.304)
  ∆(p − c)t−2           −1.272         (−4.916)
  ∆(p − pf − e)t        −0.619         (−2.918)
  ∆2pt−1                −1.869         (−4.503)
  (x − x )t−1           −0.090         (−2.338)
  (p − p )t−1           +0.286         (3.117)
  ∆D903t−1              +0.096         (2.687)
  ∆D8012t               −0.066         (−2.000)
  Constant              −0.037         (−2.716)

  σ = 0.0322, FAR(5, 56) = 3.79, FARCH(4, 53) = 0.26, χ²N(2) = 3.41

Equation for ∆2pt (fitted)

  Regressor             Coefficient    t-value
  ∆(x − xf)t−2          −0.154         (−3.214)
  ∆(p − pf − e)t        −0.760         (−2.204)
  ∆et                   −0.480         (−1.669)
  ∆(p − pf − e)t−1      −0.124         (−2.868)
  ∆(p − pf − e)t−2      −0.094         (−2.218)
  (p − p )t−1           −0.140         (−11.155)
  D8012t                +0.047         (3.521)
  Constant              +0.021         (8.454)

  σ = 0.0141, FAR(5, 56) = 6.35, FARCH(4, 53) = 0.79, χ²N(2) = 1.89

Note: Figures in parentheses are t-values. σ are standard deviations, FAR(5, 56) are tests for autocorrelation, FARCH(4, 53) are tests for ARCH and χ²N(2) are tests for normality.

acceleration are reported in table 7, while the reduced form equations for the competitive-

ness and the markup are not reported. A Likelihood Ratio test for the 50 overidentifying

restrictions yields a test statistic of 44.00, corresponding to a significance level of 71 per

cent. The structural model seems to be well-specified and in particular, no structural

breaks seem to occur at the time of the German reunification according to graphical

inspections and formal Chow tests (not reported).

5.1 Economic Identification

In the equation for the market share the contemporaneous change in the export market,

∆xft, is excluded. This is an attractive theoretical feature as it implies that a boost on some import market ceteris paribus is equally distributed across all suppliers. The coefficients to ∆(x − xf)t−1 and ∆xft−1 are restricted to be equal such that only ∆xt−1 takes part in the dynamic adjustment. This restriction is accepted with a marginal significance

value of 93.24 per cent.

The short-run price elasticity is estimated to −0.62 compared to a long-run price elasticity of −3.14. A numerical elasticity below unity implies that the income effect of a change in relative prices dominates in the short run. A large proportion of the explanatory

power of the equation is accounted for by the two error-correction terms. The loading to

the deviation from the price equation, (p − p )t−1, is estimated to 0.29 and indicates a significant supply effect of a relatively high export price. In the short run, a supply effect is estimated with an elasticity of 1.83 but the effect is reduced after two quarters by a negative elasticity of −1.27. It cannot be statistically rejected that these coefficients are identical in absolute value, such that a high markup has a positive supply effect for only two quarters. This restriction is, however, not imposed.

In the equation for price formation a negative coefficient of −0.76 is estimated to the contemporaneous price competitiveness, ∆(p − pf − e)t. A change in competing prices of one per cent will thus lead to a change in the Danish export price of $\frac{\partial p_t}{\partial p^f_t} = \frac{0.76}{1+0.76} = 0.43$ per cent in the first quarter. The effect from the exchange rate is equal to $\frac{\partial p_t}{\partial e_t} = \frac{0.76-0.48}{1+0.76} = 0.16$. That leads to a short-run pass-through from exchange rates to the export price in foreign currency of $-\frac{\partial (p_t - e_t)}{\partial e_t} = 0.84$, to be compared to a long-run pass-through of 0.33. Further, a significant negative coefficient is estimated to the two-quarter lagged change in the market share, ∆(x − xf)t−2. This effect is a little hard to interpret but could act as a proxy for effects outside the VAR model. Finally, it is a little surprising

that no effect is found from production costs to export prices in the short-run. One reason

for this could be that in the short run it is the marginal rather than average production

costs that are of importance. This could indicate that a capacity effect is suppressed in

the export price formation, see also Nielsen (1999).
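The short-run effects quoted above follow from solving the price acceleration equation for ∆pt, using ∆2pt = ∆pt − ∆pt−1. A minimal sketch of that arithmetic, with variable names of our own and the two coefficients taken from table 7:

```python
# Contemporaneous coefficients in the equation for d2p_t (table 7).
c_rel = -0.76  # on d(p - pf - e)_t
c_e = -0.48    # on d(e)_t

# dp_t - dp_{t-1} = c_rel * (dp_t - dpf_t - de_t) + c_e * de_t + ...
# Collecting dp_t on the left-hand side:
#   (1 - c_rel) * dp_t = -c_rel * dpf_t + (c_e - c_rel) * de_t + ...
effect_foreign_price = -c_rel / (1 - c_rel)
effect_exchange_rate = (c_e - c_rel) / (1 - c_rel)

# Pass-through to the export price in foreign currency, p - e.
short_run_pass_through = 1 - effect_exchange_rate

print(round(effect_foreign_price, 2))
print(round(effect_exchange_rate, 2))
print(round(short_run_pass_through, 2))
```

The exchange rate enters both through price competitiveness and directly, which is why its first-quarter effect on the export price is smaller than that of competing prices.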

In this article, a structural econometric model for the price and quantity formation in

Danish manufactured exports was identified. The results on the hypothesis of long-run

homogeneity of nominal variables were somewhat mixed. The restriction was formally

rejected on the β1−cointegration relations from I(2) to I(1) but the subsequent analysis

of the transformed system gave no indications of any remaining I(2) trend. In the I(1)

transformed I(2) model, a long-run relation for the foreign demand for exports and a

dynamic steady-state price relation were identified.

The long run price elasticity of foreign demand was estimated to 3.14 and the cor-

responding short run elasticity was estimated to 0.62, implying that the income effect

dominates in the short run. The effect from the German reunification on the market

share was estimated to approximately 11 per cent.

The steady-state price relation was characterized by a large degree of pricing-to-market

and the weight to foreign prices implied an exchange rate pass-through of 0.33 in the

long run. In the short run a somewhat higher pass-through of 0.84 was estimated. The

polynomially cointegrating element from the I(2) analysis is central to the results and

opens for a reinterpretation of inflation as a source of uncertainty for price-setting firms.

Appendix: Data

The main sources of data are the International Trade and Competitiveness Indicator

database from OECD and Statistics Denmark. All data are quarterly and seasonally

adjusted from the source and indexed 1980 = 1.

• The Danish manufactured export volume, Xt, is defined as the SITC categories 5−9 from Statistics Denmark.

• The Danish export price, Pt, is the unit-value corresponding to Xt.

• The unit production cost of Danish manufactured export, Ct, is derived as a geo-

metrical average of the average unit-labor-cost (ULC) and the import unit value in

Danish currency (PMVX). Both variables are taken from the econometric model

MONA of Danmarks Nationalbank (Christensen and Knudsen, 1992). Weights

are taken from the input-output table of the Danish economy as 63 and 37 per cent

respectively.

• The market for Danish manufactured exports, Xft, is calculated as an arithmetic average of the import volume on 21 OECD markets with weights given from the country breakdown of Danish manufactured exports in 1989; $X^f_t = \sum_{j=1}^{21} \alpha_j M_{jt}$, where $M_{jt}$ and $\alpha_j$ are the import volume and the weight of country j = 1, 2, ..., 21 respectively. The export market in foreign-currency value terms is calculated as $P^f_t X^f_t = \sum_{j=1}^{21} \alpha_j P_{jt} M_{jt}$, where $P_{jt} M_{jt}$ is the import value from country j. Likewise, the market value in Danish currency is calculated as $E_t P^f_t X^f_t = \sum_{j=1}^{21} \alpha_j E_{DK.jt} P_{jt} M_{jt}$, where $E_{DK.jt}$ is the bilateral exchange rate denominated as Danish kroner per currency unit. The competing price in foreign currency is then defined as the implicit deflator of $X^f$, i.e. $P^f_t = P^f_t X^f_t / X^f_t$, and the implicit effective exchange rate of the Danish krone is calculated as $E_t = E_t P^f_t X^f_t / P^f_t X^f_t$.

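• The dummies take the form

$$
D903_t = \begin{cases} 1 & \text{for } t = 1990{:}3 - 1996{:}4 \\ 0 & \text{elsewhere} \end{cases}
\quad \text{and} \quad
D8012_t = \begin{cases} 1 & \text{for } 1980{:}1 \\ -1 & \text{for } 1980{:}2 \\ 0 & \text{elsewhere} \end{cases}
$$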
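The construction of the cost index and the export market aggregates described in this appendix can be sketched as follows. The function names and the three-market toy numbers are purely hypothetical and serve only to illustrate the formulas; the actual data cover 21 OECD markets.

```python
def cost_index(ulc: float, pmvx: float, w_ulc: float = 0.63, w_pm: float = 0.37) -> float:
    """Geometric average of unit labour cost and the import unit value."""
    return ulc**w_ulc * pmvx**w_pm


def market_aggregates(alpha, volume, price, fx):
    """Export market volume, implicit price deflator and effective exchange rate.

    alpha:  country weights
    volume: import volumes M_j
    price:  import prices P_j in foreign currency
    fx:     bilateral exchange rates, kroner per unit of foreign currency
    """
    xf = sum(a * m for a, m in zip(alpha, volume))
    value_fc = sum(a * p * m for a, p, m in zip(alpha, price, volume))
    value_dkk = sum(a * e * p * m for a, e, p, m in zip(alpha, fx, price, volume))
    pf = value_fc / xf         # implicit deflator of the export market
    e = value_dkk / value_fc   # implicit effective exchange rate
    return xf, pf, e


# Hypothetical example with three markets.
alpha = [0.5, 0.3, 0.2]
volume = [1.0, 1.2, 0.8]
price = [1.0, 1.1, 0.9]
fx = [1.0, 2.0, 4.0]

xf, pf, e = market_aggregates(alpha, volume, price, fx)
print(xf, pf, e)
print(cost_index(1.0, 1.0))
```

By construction the three aggregates multiply back to the market value in Danish currency, which is the identity used to define the implicit effective exchange rate.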

References

Armington, P.S. (1969): “A Theory of Demand for Products Distinguished by Place of Production”, International Monetary Fund Staff Papers, 16(1), 159–176.

Banerjee, A., L. Cockerell and B. Russell (2001): “An I(2) Analysis of Inflation and the Markup”, Journal of Applied Econometrics, 16, 221–240.

Christensen, A.M. and D. Knudsen (1992): “MONA: a Quarterly Model for the Danish Economy”, Economic Modelling, 1992 January, 10–74.

Dixit, A.K. and J.E. Stiglitz (1977): “Monopolistic Competition and Optimum Product Diversity”, The American Economic Review, 67(4), 297–308.

Doornik, J.A. (1998): “Approximations to the Asymptotic Distributions of Cointegration tests”, Journal of Economic Surveys, 12, 533–572.

––– and D.F. Hendry (1997): Modeling Dynamic Systems Using Pc-Fiml 9.0 for Windows, International Thomson Publishing.

Dornbusch, R. (1987): “Exchange Rates and Prices”, The American Economic Review, 77(1), 93–106.

Engle, R. and C.W.J. Granger (1987): “Co-Integration and Error Correction: Representation, Estimation and Testing”, Econometrica, 55, 251–276.

Goldberg, P.K. and M.M. Knetter (1997): “Goods Prices and Exchange Rates: What Have We Learned?”, Journal of Economic Literature, 35, 1243–1272.

Haldrup, N. (1998): “A Review of the Econometric Analysis of I(2) Variables”, Journal of Economic Surveys, 12, 595–650.

Hansen, H. and K. Juselius (1995): Manual to Cointegration Analysis of Time Series; Cats in Rats, Estima.

Hendry, D.F. and G.E. Mizon (1993): “Evaluating Dynamic Econometric Models by Encompassing the VAR”, In Phillips, P. C. B. (ed.): Models, Methods, and Applications of Econometrics, Blackwell.

Hooper, P. and C.L. Mann (1989): “Exchange Rate Pass-through in the 1980s: The Case of U.S. Imports of Manufactures”, Brookings Papers on Economic Activity, 1989(1), 297–337.

Hung, W., Y. Kim and K. Ohno (1993): “Pricing Exports: a Cross-Country Study”, Journal of International Money and Finance, 12, 3–28.

Johansen, S. (1992): “A Representation of Vector Autoregressive Processes Integrated of Order 2”, Econometric Theory, 8, 188–202.

––– (1995): “A Statistical Analysis of Cointegration for I(2) Variables”, Econometric Theory, 11, 25–59.

––– (1996): Likelihood-Based Inference in Cointegrated Vector Autoregressive Models, 2nd edition, Oxford University Press.

–––, R. Mosconi and B. Nielsen (2000): “Cointegration in the Presence of Structural Breaks in the Deterministic Trend”, Econometrics Journal, 3, 216–249.

Juselius, K. (1996): “A Structured VAR under Changing Monetary Policy”, Journal of Business and Economic Statistics, 16, 400–412.

––– (1999): “Price Convergence in the Medium and Long Run: An I(2) Analysis of Six Price Indices”, In Engle, R. F. and H. White: Cointegration, Causality, and Forecasting. A Festschrift in Honour of Clive W. J. Granger, Oxford University Press.

Kongsted, H.C. (1998): “Modelling Price and Quantity Relations for Danish Manufacturing Exports”, Journal of Business and Economic Statistics, 16(1), 81–91.

––– (1999): “Testing the Nominal-To-Real Transformation”, Working Paper, Institute of Economics, University of Copenhagen.

––– (2003): “An I(2) Cointegration Analysis of Small-Country Import Price Determination”, Econometrics Journal, 6, 53–71.

––– and H.B. Nielsen (2002): “Analyzing I(2) Systems by Transformed Vector Autoregressions”, Discussion Paper 02-20, Institute of Economics, University of Copenhagen, Chapter 2 in this Thesis.

Nielsen, B. and A. Rahbek (2000): “Similarity Issues in Cointegration Analysis”, Oxford Bulletin of Economics and Statistics, 62(1), 5–22.

Nielsen, H.B. (1999): “Market Shares of Manufactured Exports and Competitiveness”, Monetary Review, 2nd Quarter, Danmarks Nationalbank.

Paruolo, P. and A. Rahbek (1999): “Weak Exogeneity in I(2) VAR Systems”, Journal of Econometrics, 93, 281–308.

Rahbek, A., H.C. Kongsted and C. Jørgensen (1999): “Trend-Stationarity in the I(2) Cointegration Model”, Journal of Econometrics, 90, 265–289.

Chapter 5

Inflation Adjustment in the Open Economy: An I(2) Analysis of UK Prices

Heino Bohn Nielsen

Christopher Bowdler

Nuffield College

University of Oxford

christopher.bowdler@nuffield.oxford.ac.uk

Abstract

This paper analyses the transmission of import price shocks to UK inflation. A coin-

tegrated vector autoregressive model for consumer prices, unit labour costs, import

prices and real consumption growth is estimated subject to I(1) and I(2) restrictions.

We find that an increase in real import prices reduces the real wage, such that the pass

through to domestic inflation is moderated. This may explain why the depreciation of

sterling in 1992 left inflation unchanged. In contrast, high real import prices in 1974

increased inflation because wage accommodation effects were weaker at that time.

Keywords: Cointegration, I(2), Impulse response, Inflation, Import price shock.

JEL Classification: C32, C51, C53, E31, F0.

1 Introduction

Empirical models of inflation in the United Kingdom typically assign a central role to

import prices, see e.g. Bank of England (1999). In recent years, however, the link between

import prices and domestic inflation appears to have been somewhat weaker. For instance,

the 10% increase in the ratio of import prices to consumer prices that followed the UK’s

exit from the European Exchange Rate Mechanism (ERM) in 1992, led to no subsequent

increase in consumer price inflation. Such an episode stands in stark contrast to the

British experience of the 1970s, where large increases in import prices induced bouts of

high inflation.

The authors would like to thank Dan Knudsen, Hans Christian Kongsted, John Muellbauer, Kamakshya Trivedi and participants at the 58th European Meeting of the Econometric Society, Stockholm, for many helpful comments. The paper was initiated while the first author was visiting Nuffield College, Oxford. Their hospitality and a financing grant from the Euroclear Bank and University of Copenhagen are gratefully acknowledged.

In this paper we demonstrate that the key to understanding the impact of external shocks on the rate of inflation lies in a joint analysis of consumer prices, import prices and unit labour costs. We analyse these variables, in addition to a cyclical indicator,

consumption growth, in a vector autoregressive (VAR) model. We allow the three nominal

variables to be integrated of second order, I(2), see inter alia Johansen (1992) or the

survey by Haldrup (1998), and then, unlike in most applications of the I(2) framework,

we estimate the model using Maximum Likelihood (ML). In the data we find one I(2)

trend affecting the nominal variables proportionately. Imposing a homogeneity restriction

then reduces the system to I(1) space, and the identification of the long-run structure may

be carried out on a transformed data set including real unit labour costs (equivalently,

productivity adjusted real wages), real import prices, consumer price inflation and the

growth rate of real consumption. The results for the model in I(1) space indicate that one

long-run relation links the inflation rate to real unit labour costs and real import prices,

and a second relation links consumption growth to its constant steady-state value.

We then use an impulse response analysis on the cointegrated VAR model to interpret

the effects of two types of permanent shock impacting the system, a domestic shock and a

foreign shock. A key finding is that increases in real import prices induced by the foreign

shock are associated with a downward adjustment of productivity adjusted real wages,

such that the total effect of the shock on inflation is small. This real wage accommodation

effect has theoretical foundations in the competing claims models proposed by Layard,

Nickell and Jackman (1991). This effect provides one explanation as to why the post-

ERM depreciation of sterling had a relatively benign impact on UK inflation. Next, we

show that a version of our results computed for the post-1974 subsample suggests that

real wage accommodation was weaker at the time of the first oil price shock in 1974, a

finding that we attribute to the effects of Phase III of the Heath administration’s income

policies. This explains the larger response of inflation to real import prices in that period.

The remainder of the paper expands on these points and has the following structure.

Section 2 illustrates how a markup model of the price level can be interpreted in light

of the time series properties of different measures of prices. Section 3 presents quarterly

data on the key variables spanning more than three decades. Section 4 provides details

of the econometric tools that we employ in our analysis and Section 5 presents results

from both the I(2) and I(1) analyses. Section 6 presents an impulse response analysis

and discusses its implications for understanding the mechanisms behind open economy

inflation adjustment. Finally, Section 7 concludes.

2 Theoretical Framework and Time Series Interpretation

Empirical analyses of inflation fluctuations are often based upon markup models of the

price level, see, for example, de Brouwer and Ericsson (1998). In such models the price

level, Pt, is a markup over total unit costs, which we take to be a combination of unit labour

costs, Ut, and import prices, Mt. Assuming that the price level is a linearly homogeneous

function of input costs a partial adjustment model for the price level can be written as

follows:

∆pt = ω0∆pt−1 − ω1 · [pt−1 − γ · ut−1 − (1− γ) ·mt−1 − µ] . (1)

Here lower case letters denote logarithms of variables, µ measures the markup factor,

the elasticities of Pt with respect to Ut and Mt are γ and (1− γ) respectively and the

conditions ω0 > 0, ω1 > 0 imply partial adjustment of the price level towards its steady-state

value.¹ The presence of the lagged inflation rate in (1) indicates that we consider an

eclectic theory of price adjustment, encompassing such factors as adaptive expectations

amongst agents, or aggregation over heterogeneous sectors of the economy.
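The partial adjustment mechanism in (1) can be illustrated with a small simulation. The sketch below iterates (1) with unit labour costs and import prices held fixed; the function name and all parameter values are hypothetical and chosen only so that the dynamics are stable, they are not estimates from the paper.

```python
def simulate_partial_adjustment(omega0, omega1, gamma, mu, u, m, p0, periods):
    """Iterate equation (1) with log unit labour costs u and log import prices m held fixed."""
    target = gamma * u + (1.0 - gamma) * m + mu   # steady-state log price level
    p_prev, dp_prev = p0, 0.0                     # start at rest, away from the target
    path = [p_prev]
    for _ in range(periods):
        # Equation (1): lagged inflation plus correction of the lagged price gap.
        dp = omega0 * dp_prev - omega1 * (p_prev - target)
        p_prev, dp_prev = p_prev + dp, dp
        path.append(p_prev)
    return target, path


# Hypothetical parameter values (assumptions for illustration only).
target, path = simulate_partial_adjustment(
    omega0=0.5, omega1=0.2, gamma=0.8, mu=0.1, u=1.0, m=2.0, p0=0.0, periods=200
)
gap = abs(path[-1] - target)   # remaining distance to the steady state
```

With these illustrative values the price level overshoots slightly and then settles at its steady-state value, which is the sense in which (1) describes partial adjustment.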

The dynamic relationship in (1) constitutes a simple theoretical starting point for the

empirical analysis. The exact interpretation of the dynamics depends on the order of

integration of the different series. If the price variables (pt : ut : mt)′ are integrated of

first order, I(1), as is often assumed in econometric analyses, then they could potentially

cointegrate to stationarity through the I(1)-to-I(0) cointegrating relation

pt−1 − γ · ut−1 − (1− γ) ·mt−1, (2)

see inter alia de Brouwer and Ericsson (1998). In this case (1) describes a simple error

correction mechanism with an additional stationary dynamic term. If we find the price

variables to be I(1) in the empirical analysis, (2) is a likely candidate for a long-run

relation.

To facilitate an analysis of I(2) variables, subtract ∆pt−1 from both sides of (1) to

obtain

∆2pt = (ω0 − 1) ·∆pt−1 − ω1 · [pt−1 − γ · ut−1 − (1− γ) ·mt−1 − µ] . (3)

If (pt : ut : mt)′ is an I(2) process, the relation (3) allows for an alternative interpretation

based on a dynamic steady state relation. The price levels may still cointegrate, in this case

from I(2) to I(1) in general, such that the linearly homogeneous combination in (2) is an

I(1) process. Using the property of first order homogeneity of the variables in (2) in order

to write the markup as a function of real unit labour costs², (u − p)t, and real import

prices, (m− p)t, a potential second layer of cointegration is the following polynomially

cointegrating relation

∆pt−1 − θ · [γ · (ut−1 − pt−1) + (1− γ) · (mt−1 − pt−1)] ∼ I(0), (4)

which links the I(1) inflation rate to the I(1) combination of real unit labour costs and

real import prices. The relation (4) can be thought of as a kind of error correction, where

¹ The partial adjustment could also include further lags in ∆pt and the other cost terms, and we allow for this in the empirical analysis. At this stage we focus on a parsimonious form, however, to illustrate the interpretation of the markup approach.

² The real unit labour costs referred to are equivalent to the productivity adjusted real wage facing consumers. However, due to the fact that firms are both producers and retailers in this analysis, and also the fact that we do not model such things as the tax wedge, real unit labour costs are also equivalent to the productivity adjusted real wage facing producers.

the inflation rate, ∆pt, corrects deviations from the I(2)-to-I(1) relation in (2). Deviations

from (4) could now be stationary and the stationary second order difference, ∆2pt, could

error correct to this relation as in (3).
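The link between (3) and (4) can be made explicit by a short rearrangement. The following is a sketch under the assumption that ω0 ≠ 1; the expression for θ in terms of ω0 and ω1 follows from the algebra and is not stated explicitly in the text.

```latex
\begin{aligned}
\Delta^2 p_t
  &= (\omega_0 - 1)\,\Delta p_{t-1}
     - \omega_1\left[p_{t-1} - \gamma u_{t-1} - (1-\gamma)\, m_{t-1} - \mu\right] \\
  &= (\omega_0 - 1)\,\Delta p_{t-1}
     + \omega_1\left[\gamma\,(u_{t-1} - p_{t-1}) + (1-\gamma)\,(m_{t-1} - p_{t-1})\right]
     + \omega_1 \mu \\
  &= (\omega_0 - 1)\left\{\Delta p_{t-1}
     - \theta\left[\gamma\,(u_{t-1} - p_{t-1}) + (1-\gamma)\,(m_{t-1} - p_{t-1})\right]\right\}
     + \omega_1 \mu,
  \qquad \theta = \frac{\omega_1}{1 - \omega_0}.
\end{aligned}
```

Under this reading the term in braces is the polynomially cointegrating relation (4), and for ω0 < 1 and ω1 > 0 the implied θ is positive, so that inflation is increasing in real unit labour costs and real import prices.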

In the empirical analysis we allow the markup to fluctuate over the short-term through

business cycle effects. We use the log-linear approximation of a time-varying markup:

µt = θ0 + θ1zt, (5)

where θ0 denotes the autonomous component of the steady-state markup and zt measures

the cyclical position of the economy. In this paper we choose the growth rate of real

consumer expenditure, ∆ct, as the cyclical indicator. Inserting (5) in the dynamic relation

(1) modifies the candidates for the stationary long-run relation, (2) or (4), by including

consumption growth in each of them. If consumption growth is stationary, however,

then the coefficients to ∆ct in the relationship are not identified by the requirement of

stationarity alone, as linear combinations of stationary variables are themselves stationary.

However, we can still model the effects of the stationary variable within the system through

adopting a unit vector as a second long-run relation, see Section 5 for further discussion.

Banerjee, Cockerell and Russell (2001) and Banerjee and Russell (2001) also consider

a time varying markup, where µt explicitly depends on the inflation rate. They argue

that firms have imperfect information on market prices and face a comparatively large

loss if prices are set too high, e.g. due to the presence of a kinked demand curve. That

will cause firms to act cautiously and the markup will depend negatively on the level of

uncertainty. Taking inflation as a measure of uncertainty, high levels of inflation will be

accompanied by a relatively low markup. This suggests an alternative interpretation of

the polynomially cointegrating relation in (4).

In the presentation above the focus was solely on the time series interpretations of

the dynamics of the inflation rate. It may be the case that wage-setting practices lead

unit labour costs to respond to import and consumer price indices, see Layard, Nickell

and Jackman (1991), making unit labour costs endogenous to the parameters in the long-run relation. To take account of this, the empirical analysis allows for causation in all directions in (4) through estimating a system of equations for the vector process Xt = (pt : ut : mt : ∆ct)′.

3 The Data

In the empirical analysis we study Xt for the effective sample t = 1969:1 − 2000:4. We use the natural log of the implicit deflator for household consumption in measuring pt, the

log of average unit labour costs for ut, the log of the implicit deflator for imports of goods

and services for mt, and the log of real household consumption for ct, see the Appendix

for further details.

The data and some relevant linear combinations are presented in Figure 1. Graph

(A) depicts the log-levels of nominal prices. Over the sample period the total increases in

consumer prices, pt, and unit labour costs, ut, have been quite similar, while the increase

in import prices, mt, has been somewhat smaller. These differences are reflected in Graph

(C), in which real import prices, mt− pt, are more obviously negatively trended than real

unit labour costs, ut− pt. A further interesting feature is that for much of the past thirty

years real unit labour costs and real import prices have been negatively correlated. For

example, real unit labour costs declined following an increase in real import prices after

sterling exited the ERM in 1992. Such co-movements suggest that real wages accommo-

date the impact of shocks to real import prices. In contrast, real import prices and real

unit labour costs appear to correlate positively following the first oil price shock of 1974,

suggesting that real wage accommodation effects did not operate at that time.

The final variable in our empirical analysis is the growth rate of real consumption,

∆ct. This measure of cyclical conditions is more appropriate than alternatives based on

GDP, for the inflation rate that we study is based on the deflator for total consumer

expenditure, and is therefore more likely to be a function of fluctuations in the consumer

sector than in the aggregate economy.³

A priori and after inspection of Figure 1, the most likely scenario in terms of cointegration

is that in which there is one I(2) trend affecting (pt : ut : mt)′ proportionately. In

this case the polynomially cointegrating relation (4) is a likely candidate for a stationary

relation. From graph (D) consumption growth looks stationary suggesting a unit vector

as a second stationary relation.

4 The Cointegrated VAR Model

The starting point for analysing the p−dimensional vector Xt, t = 1, ..., T , is a VAR model

of order k, which can be parametrised as:

$$\Delta^2 X_t = \Pi X_{t-1} - \Gamma \Delta X_{t-1} + \sum_{i=1}^{k-2} \Psi_i \Delta^2 X_{t-i} + \mu_0 + \mu_1 t + \phi D_t + \epsilon_t. \qquad (6)$$

The innovations are assumed to be identically and independently Gaussian, $\epsilon_t \sim N(0, \Omega)$, and the initial values, $X_{-k+1}, ..., X_0$, are taken to be fixed. The k matrices of autoregressive parameters, $\Pi, \Gamma, \Psi_1, ..., \Psi_{k-2}$, are each of dimension p × p and the deterministic specification is given by a constant, $\mu_0$, a linear drift term, $\mu_1 t$, and a set of dummy variables, $D_t$, with coefficients, $\phi$.

The I(2) model, denoted Hr,s, is a submodel of (6) defined by the reduced rank re-

strictions

$$\Pi = \alpha\beta' \quad \text{and} \quad \alpha_\perp' \Gamma \beta_\perp = \xi\eta', \qquad (7)$$

³ The late 1990s provide a good example of how the choice of cyclical indicator can be important: GDP

growth during that period suggested that the economy was expanding at its trend rate, but that masked

strong demand pressures in the consumer sector that were offset at the aggregate level by a manufacturing

recession, as exporters struggled to cope with the effects of a high sterling exchange rate.

Figure 1: Data and certain linear combinations. Panels: (A) Price levels (logs); (B) First differences of consumption deflator; (C) Relative prices, u−p and m−p; (D) Growth of consumption. The horizontal axes run from 1970 to 2000.

where $\alpha$ and $\beta$ are matrices of dimension $p \times r$, $\xi$ and $\eta$ are matrices of dimension $(p-r) \times s$, and $\alpha_\perp$ and $\beta_\perp$ are the orthogonal complements to $\alpha$ and $\beta$ respectively. Under the additional assumption that the characteristic polynomial corresponding to (6), $A(z)$, has $2(p-r)-s$ roots at the point $z = 1$ and the remaining roots outside the unit circle, $X_t$ is an I(2) process, see Johansen (1992) for the full representation. To characterise the cointegration properties we define the matrices $\beta_1 = \bar{\beta}_\perp \eta$, $\beta_2 = \beta_\perp \eta_\perp$ and $\delta = \bar{\alpha}' \Gamma \bar{\beta}_2$, where for a matrix $\beta$ we define $\bar{\beta} = \beta (\beta'\beta)^{-1}$. Using this notation the $p-r-s$ linear combinations $\beta_2' X_t$ are I(2) and non-cointegrating. The $r+s$ combinations $(\beta : \beta_1)' X_t$ cointegrate from I(2) to I(1). They can be further divided into $s$ combinations, $\beta_1' X_t$, that remain I(1), and $r$ combinations that cointegrate to stationarity with the first differences through the polynomially cointegrating relations:

$$S_t = \beta' X_t - \delta \beta_2' \Delta X_t.$$

If $r > p-r-s$ then $\delta_\perp' S_t$ will be combinations of the levels alone, allowing for direct I(2)-to-I(0) cointegration.

Previous applications of the I(2) model have employed the two-step estimator of

Johansen (1995), see inter alia Juselius (1998), Diamandis, Georgoutsos and Kouretas

(2000), Banerjee, Cockerell and Russell (2001), Banerjee and Russell (2001) and Nielsen

(2002) for examples. The two-step estimator is a sequential application of the reduced

rank regression known from the analysis of I(1) VAR models, and is not the ML estimator

for the I(2) model. In this paper we rely on the ML estimation algorithm of Johansen

(1997), based on the parametrisation

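$$\Delta^2 X_t = \alpha\left(\rho'\tau' X_{t-1} + \psi' \Delta X_{t-1}\right) + \Omega\alpha_\perp\left(\alpha_\perp'\Omega\alpha_\perp\right)^{-1}\kappa'\tau' \Delta X_{t-1} + \sum_{i=1}^{k-2} \Psi_i \Delta^2 X_{t-i} + \mu_0 + \mu_1 t + \phi D_t + \epsilon_t, \qquad (8)$$

where the parameters $(\alpha, \rho, \tau, \psi, \kappa, \Psi_1, ..., \Psi_{k-2}, \Omega)$ are all freely varying. The parameters of the previous notation can be derived from the new parameters using $\beta = \tau\rho$, $\xi = -\kappa'\rho_\perp$, $\eta = \beta_\perp'\tau\rho_\perp$, and $\Gamma = -\Omega\alpha_\perp\left(\alpha_\perp'\Omega\alpha_\perp\right)^{-1}\kappa'\tau' - \alpha\psi'$.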

Due to unit roots, unrestricted deterministic components in (6) or (8) will accumulate

in the levels of the variables, e.g. an unrestricted constant will produce a quadratic trend

in the I(2) directions, β2′Xt. In the empirical analysis we want to allow for linear trends

in all directions, including the stationary polynomially cointegrating relations, St, as an

alternative to stochastic trends in describing non-stationarities in the data. The linear

trends may be interpreted as a control for the different methods of quality adjustment

applied in the construction of consumer and import price deflators. We exclude a priori

quadratic deterministic trends. This approach can be implemented using the specification

proposed in Rahbek, Kongsted and Jørgensen (1999), which entails the restrictions:

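$$\mu_1 = \alpha\rho'\tau_0' \quad \text{and} \quad \mu_0 = \alpha\psi_0' + \Omega\alpha_\perp\left(\alpha_\perp'\Omega\alpha_\perp\right)^{-1}\kappa'\tau_0',$$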

where τ0 and ψ0 are matrices of dimension 1 × (r + s) and 1 × r respectively. This

specification is asymptotically similar, such that the actual coefficients on the linear trends

in different directions do not appear as nuisance parameters in the asymptotic distributions

of estimators of other coefficients, see Rahbek et al. (1999) and Nielsen and Rahbek (2000).

5 Empirical Analysis

As a basis for the empirical analysis we first set up a congruent statistical model for the data Xt = (pt : ut : mt : ∆ct)′, t = 1969:1 − 2000:4.⁴ The period is characterised by a number of extreme episodes induced by policy interventions or shocks to variables outside the current information set. We do not want our results to be driven by a few extreme shocks and therefore condition on six intervention dummies of the form (0, ..., 0, 1, −1, 0, ..., 0),⁵ namely

Dt = (D73q1 : D74q1 : D75q1 : D75q3 : D79q2 : D80q1)′t.

⁴ The analysis was carried out using a set of procedures programmed in Ox, see Doornik (2001).

⁵ The location of the dummies was determined from their effects on the likelihood function. The results of the analysis are not sensitive to particular dummies, and by and large identical results are obtained for a model with no dummies.

The dummies sum to zero and do not produce changes in the slopes of the linear trends

in the I(2) directions. D73q1 accounts for the fiscal expansion undertaken by the Heath

administration and the effects of decimalisation of the currency. D74q1 controls for the

fluctuations following the first oil price shock, while D75q1 and D75q3 control for the

uneven nature of earnings growth due to the Wilson-Callaghan ‘social contract’ applied

to labour market bargaining. Finally, D79q2 and D80q1 control for, respectively, the

second oil price shock and the effects of increases in Value Added Tax under the first

Thatcher administration.
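The statement that the dummies sum to zero and leave the slopes of the linear trends unchanged can be checked mechanically. The sketch below builds one dummy of the form (0, ..., 0, 1, −1, 0, ..., 0) on the 128 quarters from 1969:1 to 2000:4 and cumulates it twice, ignoring the autoregressive dynamics of the model. The helper names are hypothetical, and placing the +1 in the quarter that names the dummy is an assumption, since the exact dating convention is not stated.

```python
from itertools import accumulate


def quarter_index(year, quarter, start_year=1969, start_quarter=1):
    """Position of (year, quarter) in a quarterly sample that starts at 1969:1."""
    return 4 * (year - start_year) + (quarter - start_quarter)


def intervention_dummy(year, quarter, n_obs=128):
    """Dummy (0, ..., 0, 1, -1, 0, ..., 0); the +1 is placed at the named quarter (assumption)."""
    dummy = [0] * n_obs
    start = quarter_index(year, quarter)
    dummy[start], dummy[start + 1] = 1, -1
    return dummy


d74q1 = intervention_dummy(1974, 1)
once = list(accumulate(d74q1))    # cumulated once: a one-period blip in the differences
twice = list(accumulate(once))    # cumulated twice: a permanent step in the levels
```

Because the dummy sums to zero, the first cumulation returns to zero after one quarter and the second cumulation settles at a constant level, so the dummy shifts the level of the series but not the slope of its trend.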

To determine the appropriate lag length, k, for the unrestricted VAR model, Table 1

reports LR tests for successive lag deletions and various information criteria, see Lütkepohl

(1991). The results clearly point towards two lags, and we set k = 2 in the analysis that

follows.
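The choice of k = 2 can be read off Table 1 directly. As a minimal sketch, the snippet below stores the reported information criteria and LR p-values in plain dictionaries (the variable names are ours) and picks the lag length each criterion prefers, together with the smallest lag length for which no lag deletion test rejects at the 5% level.

```python
# Information criteria as reported in Table 1, indexed by lag length k.
criteria = {
    "SW":  {5: -33.8910, 4: -34.3735, 3: -34.8064, 2: -35.2225, 1: -34.7829},
    "HQ":  {5: -35.3726, 4: -35.6435, 3: -35.8646, 2: -36.0691, 1: -35.4179},
    "AIC": {5: -36.3866, 4: -36.5126, 3: -36.5889, 2: -36.6485, 1: -35.8524},
}

# p-values of the LR tests k|m from Table 1 (reduction from m lags down to k lags).
lr_p_values = {
    4: [0.72],
    3: [0.56, 0.33],
    2: [0.37, 0.20, 0.20],
    1: [0.00, 0.00, 0.00, 0.00],
}

# Each criterion selects the lag length with the smallest value.
selected = {name: min(values, key=values.get) for name, values in criteria.items()}

# Smallest lag length that is not rejected against any longer lag length at 5%.
smallest_accepted = min(k for k, ps in lr_p_values.items() if all(p > 0.05 for p in ps))
```

All three criteria and the sequence of LR tests point to the same lag length in this table.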

The results of a battery of mis-specification tests for the unrestricted VAR(2) are

reported in Table 2. These are generally satisfactory, although the multivariate test for

no residual autocorrelation of higher order is formally rejected at a 5% level. There is

no evidence of residual autocorrelation in the individual equations and steps that might

remedy the problem, for example including additional lags in the model, do not affect the

main conclusions of the analysis. We therefore take the VAR(2) model as a framework

within which to analyse the long-run properties of Xt.

5.1 I(2) Analysis

The rank indices (r, s) defining the cointegrating properties of the I(2) model can be

determined via repeated applications of Likelihood Ratio (LR) tests, see Nielsen and

Rahbek (2003). Table 3 reports test statistics for each of the restricted models, Hr,s,

against the unrestricted model, H4,0, as well as the corresponding asymptotic p−values. The columns of Table 3 contain models with the same number of I(2) trends but different numbers of I(1) trends; the last column sets p − r − s = 0 and thus contains the I(1) models. The idea is to first test the most restricted model, H0,0, then H0,1, and so on, row-wise, rejecting a model only if each of the more restricted models has also been rejected. All

models with no stationary relations, r = 0, are safely rejected, and the same is the case

for models with r = 1. The model H2,1 generates a test statistic against H4,0 of 23.92 and

a p−value of .49. This model features r = 2 stationary relations, s = 1 I(1) trend, and

p − r − s = 1 I(2) trend. One of the stationary relations is directly cointegrating from

I(2) to I(0) and one is a polynomially cointegrating relation involving first differences.

This is potentially consistent with the relationships set out in Section 2, in which the

polynomially cointegrating relation is given by (4), and the second cointegrating relation

is the unit vector (0 : 0 : 0 : 1)′, which implies stationarity of consumption growth.
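The row-wise testing sequence can be sketched in a few lines. The snippet below stores the asymptotic p-values of Table 3 keyed by (r, s) and returns the first model that is not rejected at the 5% level; the function name and the dictionary layout are illustrative choices, not part of the original procedures.

```python
P = 4  # dimension of X_t

# Asymptotic p-values from Table 3, keyed by (r, s); within a row, s runs from 0 upwards.
p_values = {
    (0, 0): 0.00, (0, 1): 0.00, (0, 2): 0.00, (0, 3): 0.00, (0, 4): 0.00,
    (1, 0): 0.00, (1, 1): 0.00, (1, 2): 0.00, (1, 3): 0.00,
    (2, 0): 0.01, (2, 1): 0.49, (2, 2): 0.22,
    (3, 0): 0.56, (3, 1): 0.33,
}


def select_rank_indices(p_values, level=0.05):
    """Return the first model H(r, s) not rejected when testing row-wise from H(0, 0)."""
    for r, s in sorted(p_values):          # (0, 0), (0, 1), ..., (1, 0), (1, 1), ...
        if p_values[(r, s)] >= level:
            return r, s
    return None  # every restricted model is rejected: keep the unrestricted VAR


r, s = select_rank_indices(p_values)
i2_trends = P - r - s   # number of I(2) trends implied by the selected model
```

Applied to Table 3 the sequence stops at r = 2 and s = 1, which leaves one I(2) trend.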

Table 1: Lag length determination

| k | SW | HQ | AIC | k\|5 | k\|4 | k\|3 | k\|2 |
|---|---|---|---|---|---|---|---|
| 5 | −33.8910 | −35.3726 | −36.3866 | | | | |
| 4 | −34.3735 | −35.6435 | −36.5126 | .72 | | | |
| 3 | −34.8064 | −35.8646 | −36.5889 | .56 | .33 | | |
| 2 | −35.2225 | −36.0691 | −36.6485 | .37 | .20 | .20 | |
| 1 | −34.7829 | −35.4179 | −35.8524 | .00 | .00 | .00 | .00 |

Note: The columns SW, HQ and AIC (information criteria) are the values of the Schwarz, Hannan-Quinn, and Akaike information criteria respectively. The columns k|m (Likelihood Ratio tests) are the F−transforms of the tests for the last m − k lags being insignificant. The reported figures are the corresponding p−values.

Table 2: Tests for mis-specification of the unrestricted VAR(2) model

| | AR(1) | AR(1-8) | ARCH(8) | Normality |
|---|---|---|---|---|
| pt | .00 [.99] | .87 [.54] | .55 [.82] | 3.22 [.20] |
| ut | .67 [.41] | 1.69 [.11] | 1.77 [.09] | 1.38 [.50] |
| mt | 1.04 [.31] | 1.62 [.13] | .97 [.46] | 3.19 [.20] |
| ∆ct | .00 [.99] | .41 [.91] | .52 [.84] | 1.62 [.45] |
| Multivariate tests | 1.02 [.44] | 1.37 [.02] | | 10.91 [.21] |

Note: Figures in square brackets are p−values. AR(1) are the F−tests for first order autocorrelation and are distributed as F(1,111) and F(16,321) for the single equation and vector tests respectively. AR(1-8) are tests for up to eighth order autocorrelation and are distributed as F(8,104) and F(128,308) respectively. ARCH(8) tests for ARCH effects up to the eighth order and is distributed as F(8,96). The last column reports the Jarque-Bera asymptotic tests for normality, which are distributed as χ2(2) and χ2(8) respectively.

Table 3: Test for the rank indices of the I(2) model

| r | p − r − s = 4 | p − r − s = 3 | p − r − s = 2 | p − r − s = 1 | p − r − s = 0 |
|---|---|---|---|---|---|
| 0 | 519.30 [.00] | 238.46 [.00] | 165.79 [.00] | 126.44 [.00] | 117.42 [.00] |
| 1 | | 157.58 [.00] | 102.30 [.00] | 70.35 [.00] | 64.56 [.00] |
| 2 | | | 55.43 [.01] | 23.92 [.49] | 20.12 [.22] |
| 3 | | | | 11.00 [.56] | 7.24 [.33] |

Note: Likelihood Ratio tests for the rank indices (r, s), see Nielsen and Rahbek (2003). The appropriate asymptotic distributions are given in e.g. Rahbek et al. (1999). The p−values in square brackets are derived from approximate Γ−distributions, see Doornik (1998).

A Nominal-to-Real Transformation of the System to I(1). If the nominal variables of the system are found to be first-order homogeneous, it follows that relative magnitudes are invariant to the values taken by nominal aggregates, ruling out such phenomena as ‘money illusion’. Thus, a test for first-order homogeneity also constitutes a check of consistency with Neo-Classical economic theory. Furthermore, homogeneity permits a transformation of the I(2) model to one expressed in I(1) space, see Kongsted (2003) and Kongsted and Nielsen (2002).

Homogeneity of Xt = (pt : ut : mt : ∆ct)′ implies that the loadings applied to the I(2) trend in the nominal variables are proportional, i.e. span(β2) = span(b), where b = (1 : 1 : 1 : 0)′. The estimate of the loadings matrix is given by $\hat{\beta}_2$ = (1.000 : 0.976 : 1.304 : 0.003)′, which is not too far from the theoretical vector. Since β2 is orthogonal to τ = (β : β1), homogeneity can be tested as the restriction b′τ = 0, see Johansen (2002). For the present data set we obtain a LR test statistic of 7.54, which is not significant at a 5% level according to a χ2(3) distribution.

Given homogeneity we can reparametrise the I(2) model to the well-known vector error correction form

$$\Delta Y_t = \tilde{\alpha} \tilde{\beta}^{*\prime} \begin{pmatrix} Y_{t-1} \\ t \end{pmatrix} + \tilde{\Gamma}_1 \Delta Y_{t-1} + \tilde{\mu}_0 + \tilde{\phi} D_t + \epsilon_t, \qquad (9)$$

for the transformed I(1) data $Y_t = (X_t' B : \Delta X_t' v)'$, where $B = b_\perp$ and $|b'v| \neq 0$. This simplifies hypothesis testing relating to the long-run structure considerably, see Kongsted and Nielsen (2002). In the present paper we choose Yt = (ut − pt : mt − pt : ∆pt : ∆ct)′, to obtain a measure of real wages, ut − pt, real import prices, mt − pt, and consumer price inflation, ∆pt, as well as consumption growth, ∆ct. The cointegration rank, r, determined in the I(2) analysis carries over to (9), and the polynomially cointegrating relations from the I(2) model, St, are embedded in the new system as I(1) cointegrating relations.
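The nominal-to-real transformation amounts to a fixed linear map of the data. As a minimal sketch, the code below uses one possible choice of B = b⊥ and v that reproduces the vector Yt chosen above; the particular columns of B, the vector v and the two data rows are illustrative assumptions, since any basis of the orthogonal complement of b would serve.

```python
def dot(x, y):
    """Inner product of two equally long sequences."""
    return sum(xi * yi for xi, yi in zip(x, y))


b = (1, 1, 1, 0)                                   # homogeneity vector for (p, u, m, dc)
B = [(-1, 1, 0, 0), (-1, 0, 1, 0), (0, 0, 0, 1)]   # columns of one possible b_perp
v = (1, 0, 0, 0)                                   # picks out the change in p; b'v = 1


def nominal_to_real(X):
    """Map rows X_t = (p, u, m, dc) to Y_t = (u - p, m - p, dp, dc); the first row is lost."""
    Y = []
    for prev, curr in zip(X, X[1:]):
        dX = tuple(c - q for c, q in zip(curr, prev))
        real_u, real_m, dc = (dot(col, curr) for col in B)   # X_t' B
        dp = dot(v, dX)                                      # (Delta X_t)' v
        Y.append((real_u, real_m, dp, dc))
    return Y


# Two made-up observations of (p, u, m, dc), for illustration only.
X = [(1.0, 1.5, 0.9, 0.01), (1.1, 1.7, 0.8, 0.02)]
Y = nominal_to_real(X)
```

Each column of B is orthogonal to b, so the common nominal I(2) trend cancels from the first three elements of Yt, while v recovers the inflation rate.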

5.2 Identifying the Long-Run Structure within the I(1) Model

In the model (9) the space spanned by the columns in $\tilde{\beta}$ is identified but individual coefficients are not, and in order to exactly identify the coefficients we have to impose one normalisation and one restriction on each relation. Table 4 reports, under the heading H0, the coefficients of two exactly identified long-run relationships, together with t−values based on asymptotic standard errors. The first relation is normalised on consumption growth and a zero restriction is placed on the inflation term, consistent with one of the long-run relations being directly cointegrating. The second relation resembles (4). The loading coefficients of this exactly identified structure, $\tilde{\alpha}$, clearly suggest that the real import price is weakly exogenous for the long-run parameters of the model.

Next, under H1, we test the hypothesis that the first cointegrating relation is a unit vector. The hypothesis implies three over-identifying restrictions on span($\tilde{\beta}^*$) and produces a LR test statistic of 9.26, which corresponds to a p−value of .03 when using the asymptotic χ2(3) distribution. However, LR tests pertaining to cointegrating coefficients

are frequently found to over-reject a true null hypothesis, see inter alia Li and Maddala

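Table 4: Identification of the long-run structure

Structure H0

| | $\tilde{\beta}^*$ (relation 1) | $\tilde{\beta}^*$ (relation 2) | $\tilde{\alpha}$ (relation 1) | $\tilde{\alpha}$ (relation 2) |
|---|---|---|---|---|
| ut − pt | .123 (3.99) | −.231 (−7.91) | .060 (.44) | .674 (4.87) |
| mt − pt | .007 (.78) | −.056 (−6.84) | .086 (.29) | .128 (.42) |
| ∆pt | 0 (...) | 1 (...) | .182 (1.98) | −.476 (−5.07) |
| ∆ct | 1.000 (...) | 0 (...) | −.757 (−6.74) | −.098 (−.85) |
| t | .000 (2.13) | −.000 (−4.10) | | |

Structure H1

| | $\tilde{\beta}^*$ (relation 1) | $\tilde{\beta}^*$ (relation 2) | $\tilde{\alpha}$ (relation 1) | $\tilde{\alpha}$ (relation 2) |
|---|---|---|---|---|
| ut − pt | 0 (...) | −.213 (−7.62) | .198 (1.71) | .706 (5.28) |
| mt − pt | 0 (...) | −.055 (−7.00) | .008 (.03) | .105 (.36) |
| ∆pt | 0 (...) | 1 (...) | .166 (2.13) | −.509 (−5.64) |
| ∆ct | 1 (...) | 0 (...) | −.529 (−5.21) | .035 (.30) |
| t | 0 (...) | −.000 (−3.94) | | |

Structure H2

| | $\tilde{\beta}^*$ (relation 1) | $\tilde{\beta}^*$ (relation 2) | $\tilde{\alpha}$ (relation 1) | $\tilde{\alpha}$ (relation 2) |
|---|---|---|---|---|
| ut − pt | 0 (...) | −.208 (−7.50) | .205 (1.81) | .710 (5.50) |
| mt − pt | 0 (...) | −.053 (−6.89) | 0 (...) | 0 (...) |
| ∆pt | 0 (...) | 1 (...) | .161 (2.08) | −.503 (−6.04) |
| ∆ct | 1 (...) | 0 (...) | −.539 (−5.63) | 0 (...) |
| t | 0 (...) | −.000 (−3.75) | | |

Test summary

| | H0 | H1 | H2 |
|---|---|---|---|
| Log-likelihood value | 2395.97955 | 2391.34757 | 2391.25303 |
| Test statistic | ... | 9.26 | 9.45 |
| Asymp. p−value | ... | .026 | .150 |
| Asymp. distribution | ... | χ2(3) | χ2(6) |
| Bootstrap p−value (a) | ... | .085 | .303 |

Note: t−values based on asymptotic standard errors in parentheses. (a) Bootstrap p−values are constructed by parametric resampling as proposed in Gredenhoff and Jacobson (2001). The estimated model under the null hypothesis is used as a data generating process to construct 10000 pseudo samples with innovations drawn from $N(0, \hat{\Omega})$, where $\hat{\Omega}$ is the estimated covariance matrix. On each sample the hypothesis of interest is then tested and the distribution of test statistics is used as an estimate of the small sample distribution.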

Figure 2: Impulse responses and 90% confidence intervals based on 10000 bootstrap replications. Horizontal axis is time in quarters (0 to 12). The panels are arranged in two columns, Foreign shock and Domestic shock, with one row per variable; the row labels that survive are ut − pt and mt − pt.

(1997), Jacobson, Vredin and Warne (1998) and Gredenhoff and Jacobson (2001). We therefore also estimate the finite sample distribution using the Bootstrap principle as proposed in Gredenhoff and Jacobson (2001). The Bootstrap p−value of the test for the reduction of H0 to H1 is .09 indicating borderline acceptance. This is the result that one would expect from a visual inspection of Figure 1 (D).
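The asymptotic test statistics and p-values in Table 4 follow from the reported log-likelihood values. The sketch below recomputes them using only the Python standard library; `chi2_sf` is a helper written for this illustration (a closed-form series for integer degrees of freedom), not a function from the software used in the paper.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variate with integer df (closed-form series)."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite power series in x/2.
        terms = (half ** j / math.factorial(j) for j in range(df // 2))
        return math.exp(-half) * sum(terms)
    # Odd df: start from the df = 1 tail and add the half-integer correction terms.
    tail = math.erfc(math.sqrt(half))
    terms = (half ** (j - 0.5) / math.gamma(j + 0.5) for j in range(1, (df - 1) // 2 + 1))
    return tail + math.exp(-half) * sum(terms)


# Log-likelihood values and numbers of over-identifying restrictions from Table 4.
loglik = {"H0": 2395.97955, "H1": 2391.34757, "H2": 2391.25303}
restrictions = {"H1": 3, "H2": 6}

results = {}
for name, df in restrictions.items():
    lr = 2.0 * (loglik["H0"] - loglik[name])      # LR statistic against H0
    results[name] = (round(lr, 2), chi2_sf(lr, df))
```

The resulting statistics agree with the 9.26 and 9.45 reported in Table 4, and the tail probabilities agree with the asymptotic p-values of .026 and .150 up to rounding. The Bootstrap p-values cannot be reproduced this way, since they require resampling from the estimated model.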

We next impose the restrictions that real import prices are weakly exogenous, and

that consumption growth does not react to disequilibrium in the pricing relation, leading

to the structure H2. The test statistic for H2 against H0 is 9.45 and follows a χ2(6) distribution under the null. The marginal restrictions embodied in H2 compared to H1 are thus easily accepted, and the total structure is accepted with an asymptotic p−value of .15 and a Bootstrap p−value of .30. Under H2 the long-run inflation relation can be written as

∆pt = −.261 · (pt − .797 · ut − .203 · mt − .001 · trend), (10)

indicating an import share of just above 20%, which is a plausible estimate for the sample

period that we have employed.
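The coefficients in (10) are a renormalisation of the second cointegrating vector under H2 in Table 4. The short sketch below carries out the arithmetic; the variable names are ours, and the trend term is left out because its coefficient is only reported as −.000 in the table.

```python
# Second cointegrating vector under H2 (Table 4), ordered (u - p, m - p, dp).
beta_u, beta_m, beta_dp = -0.208, -0.053, 1.0

# beta'Y = 0 gives dp = 0.208 (u - p) + 0.053 (m - p),
# which equals -theta * (p - gamma * u - (1 - gamma) * m).
theta = -(beta_u + beta_m) / beta_dp     # adjustment coefficient in (10)
gamma = -beta_u / (beta_dp * theta)      # weight on unit labour costs
import_share = 1.0 - gamma               # weight on import prices
```

Rounded to three decimals this gives the .261, .797 and .203 that appear in (10).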

One of the fundamental results illustrated by the structure in H2 is that productivity adjusted real wages error correct with respect to disequilibrium from (10). Further, as

the error correction parameter defining real wage responses is of opposite sign to that in

the inflation equation, real wages will tend to accommodate higher real import prices and

thereby limit their impact on inflation. This effect will be examined in more detail in

Section 6.

Lagged consumption growth exerts a positive effect on both the rate of change of

inflation and the rate of change of the real wage, suggesting that erosion of spare capacity

in the consumer goods sector tends to accelerate price and wage adjustment. However, the

asymptotic standard deviations of the estimated parameters are relatively large, reflecting

difficulties in identifying the exact channels through which capacity shortages raise prices.

An additional restriction that removes the capacity effect operating via real wages can be

imposed on H2. This raises the direct capacity effect on inflation from .17 to a statistically

significant .23. Still, we take H2 as the preferred model in the analysis that follows and therefore continue to allow for indirect capacity effects operating through real wages.

6 A Structural VAR Interpretation

In order to cast further light on the macroeconomic effects of increases in real import prices

we construct a moving average representation of the preferred model, H2, and perform an impulse response analysis. We employ the methodology of King, Plosser, Stock and

Watson (1991), Melander, Vredin and Warne (1992) and Warne (1993).

The solution to the cointegrated VAR model in (9) is given by the so-called Granger representation

$$Y_t = C \sum_{i=1}^{t} \epsilon_i + C^*(L)\, \epsilon_t + f(t), \qquad (11)$$

where $f(t)$ is a function of the deterministic terms and $C^*(L) = \sum_{i=0}^{\infty} C_i^* L^i$ is a convergent matrix polynomial in the lag operator, $L$. Finally,

$$C = \tilde{\beta}_\perp \left( \tilde{\alpha}_\perp' \left( I - \sum_{i=1}^{k-1} \tilde{\Gamma}_i \right) \tilde{\beta}_\perp \right)^{-1} \tilde{\alpha}_\perp'$$

is the long-run impact matrix, which has reduced rank, p − r, indicating that only p − r linear combinations of the p innovations have permanent effects.
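The structure of the long-run impact matrix can be illustrated numerically. The sketch below plugs the H2 estimates of the cointegrating vectors and loadings from Table 4 (trend omitted) into the formula for C. The short-run matrix is not reported in this section, so it is set to zero purely as an assumption; the resulting numbers therefore do not reproduce the estimated impact matrix, but the defining properties β̃′C = 0 and Cα̃ = 0 hold regardless of that choice. All helper functions are written for this illustration.

```python
def transpose(A):
    return [list(row) for row in zip(*A)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


# Rows ordered (u - p, m - p, dp, dc); H2 values from Table 4, trend omitted.
beta = [[0.0, -0.208], [0.0, -0.053], [0.0, 1.0], [1.0, 0.0]]
alpha = [[0.205, 0.710], [0.0, 0.0], [0.161, -0.503], [-0.539, 0.0]]

# Orthogonal complements, constructed by hand from the zero patterns above.
beta_perp = [[1.0, 0.0], [0.0, 1.0], [0.208, 0.053], [0.0, 0.0]]
ratio = 0.710 / 0.503
alpha_perp = [[0.0, 1.0], [1.0, 0.0], [0.0, ratio],
              [0.0, (0.205 + 0.161 * ratio) / 0.539]]

# Assumption for illustration: Gamma_1 = 0, so that I - sum(Gamma_i) is the identity.
identity = [[float(i == j) for j in range(4)] for i in range(4)]
middle = inverse_2x2(matmul(matmul(transpose(alpha_perp), identity), beta_perp))
C = matmul(matmul(beta_perp, middle), transpose(alpha_perp))   # 4 x 4, rank 2
```

Because the real import price equation has zero loadings under H2, one column of the orthogonal complement of α̃ is a unit vector, which is the sense in which cumulated shocks to real import prices act as one of the two driving trends. The last row of C is zero, reflecting the stationarity of consumption growth.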

The innovations, $\epsilon_t$, are in general correlated, which makes it difficult to interpret the innovations as structural shocks. In order to obtain a structural interpretation we require a representation of the form

$$Y_t = \Upsilon \sum_{i=1}^{t} \varphi_i + R^*(L) \begin{pmatrix} \varphi_t \\ \psi_t \end{pmatrix} + f(t). \qquad (12)$$

In (12) the innovations $v_t = (\varphi_t' : \psi_t')'$ are decomposed into p − r innovations with permanent effects, $\varphi_t$, and r innovations with only transitory effects, $\psi_t$, and we impose orthogonality, $E(v_t v_t') = I_4$. The long-run impact matrix, $\Upsilon$, is of dimension p × (p − r) and in order to exactly identify the driving trends separately, we have to impose a priori $\frac{1}{2}(p-r)(p-r-1)$ restrictions, see Warne (1993). The exactly identified structure (12) can be obtained from a rotation of (11), i.e. $v_t = F\epsilon_t$, $(\Upsilon : 0_{p \times r}) = CF^{-1}$ and $R^*(L) = C^*(L)F^{-1}$ for some p × p rotation matrix $F$.

In the present application we wish to identify the two driving trends as a foreign price

trend and a domestic price trend. We assume as the necessary identifying restriction that

the domestic trend does not exert a long-run impact on real import prices, implying no

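pricing-to-market in the long run. We obtain the following long-run impact matrix, $\hat{\Upsilon}$, of shocks to the driving trends:

$$\begin{pmatrix} u_t - p_t \\ m_t - p_t \\ \Delta p_t \\ \Delta c_t \end{pmatrix} = \begin{pmatrix} \underset{(-.267:-.003)}{-.1287} & \underset{(.703:1.244)}{1.0000} \\ \underset{(.740:1.273)}{1.0000} & \underset{(...)}{.0000} \\ \underset{(-.003:.054)}{.0267} & \underset{(.136:.270)}{.2081} \\ \underset{(...)}{.0000} & \underset{(...)}{.0000} \end{pmatrix} \begin{pmatrix} \text{Foreign driving trend} \\ \text{Domestic driving trend} \end{pmatrix} + \ldots$$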

Here the shocks are scaled to produce a unit impact on mt − pt and ut − pt, respectively,

and the numbers in parentheses are 90% bootstrap confidence intervals. The dynamic

effects of one standard deviation shocks to the I(1) trends are given in Figure 2.

The domestic shock exerts a positive and significant long-run impact on both inflation

and real wages, suggesting that it may be interpreted as the effect of a shift in labour

supply, e.g. the increased costs to producers arising from trade union attempts to secure

additional employee compensation. The UK was particularly affected by such shocks

during the 1970s.

As we would expect, a shock to the foreign trend exerts a significant effect on real

import prices, but it also induces a statistically significant reduction in real unit labour

costs. This reflects our earlier finding that real wages error correct with respect to disequi-

librium in the pricing relation, and provides one explanation for the lack of any significant

upturn in UK inflation following the post-ERM devaluation of sterling. Specifically, the

increase in real import prices in 1992 did not fuel large increases in the rate of inflation

because the resulting supply pressures were offset by reductions in real labour costs that

were implemented in response to the import price shock. Such ‘automatic stabilisation’

of the inflation rate is evident in the third graph in the first column of Figure 2, which

shows that while inflation increases in response to a shock to the foreign driving trend,

the effect is marginally insignificant. Real wage accommodation effects of this sort have

a theoretical basis in wage bargaining models, the central idea being that labour unions

reduce real wage claims following adverse shocks to the terms of trade in order to restrict

the extent of job losses, see Layard, Nickell and Jackman (1991).

Analysing the Robustness of the Long-Run Responses. We now assess the ro-

bustness of the impulse responses to using a shorter sample period for the estimation. An

inspection of Figure 1 suggests that real wage accommodation was weaker during the time

of the first oil price shock. Obviously it is not possible to address this question through

estimating a model for the period up to 1974, as the sample would then comprise just 24

observations. Instead, we analyse the sub-sample t = 1975:1–2000:4 and seek to draw inferences concerning the structure of the model in the early 1970s by comparing

our new results with those obtained for the full sample.

The long-run responses to the permanent shocks post 1974 are given by

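$$
\hat{\Upsilon}_{\text{Post 1974}} =
\begin{pmatrix}
\underset{(-.478:-.082)}{-.2788} & \underset{(.664:1.287)}{1.0000} \\
\underset{(.747:1.280)}{1.0000} & \underset{(...)}{.0000} \\
\underset{(-.027:.047)}{.0099} & \underset{(.123:.274)}{.2044} \\
\underset{(...)}{.000} & \underset{(...)}{.0000}
\end{pmatrix}
$$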

The key result is that real wage accommodation has been much stronger post 1974, so

strong, in fact, that the response of inflation to a shock to the foreign trend is clearly

insignificant. This suggests that real wage accommodation was not a feature of price

adjustment in the UK at the time of the first oil price shock of 1974, but has been an

important factor in the determination of inflation since then.

We attribute this finding to the effects of a system of wage indexation that was in

place in 1974 as part of Phase III of the incomes policy of the Heath administration. This

regime guaranteed that wage increases would exactly compensate for price inflation above

an annual rate of 7%, see Greenaway and Shaw (1988). Consequently, when consumer

prices accelerated following a quadrupling of oil prices in 1974, unit labour costs were

dragged along, preventing real wage accommodation.

This paper has analysed how import price shocks are transmitted to consumer prices. A

crucial finding was that real unit labour costs error correct with respect to disequilibrium

in the long-run relation between inflation and the relative price measures, suggesting

that increases in real import prices may be accommodated through reductions in real

wages. As a result of this, shocks to the trend driving real import prices induce an

increase in inflation that is actually marginally insignificant. Such a finding is consistent

with the British inflation experience following the depreciation of sterling in 1992. On

the other hand, results estimated for the post-1974 subsample indicated that real wage accommodation did not operate during the first half of the 1970s, implying that the full

burden of adjustment following the first oil price shock was borne by the inflation rate.

The role of real wage accommodation in the transmission of import price shocks to in-

flation is also relevant to forecasting inflation. Batini, Jackson and Nickell (2000) estimate

a conditional model for UK inflation which confirms the roles of the labour share and the

import share in explaining inflation. They note that at the end of the 1990s inflation was

being held down by low real import prices, while the labour share was high by historical

standards, injecting inflation into the system. They conclude that if real import prices

revert to the historical average that will lead to an upturn in inflation, other things equal.

Our results demonstrate that the ‘other things equal’ assumption is unlikely to hold. Ac-

cording to our results, it is precisely because real import prices have hit such a low level

that the labour share has been allowed to rise; any increase in real import prices is likely

to reverse this trend, as firms squeeze the labour share in order to pay overseas suppliers.

This suggests that the future impact of higher real import prices on inflation will be very

modest.

Appendix: Data

The data source is the United Kingdom Office for National Statistics (ONS).

Our measure of consumer prices is the implicit deflator for household final consump-

tion, obtained as the ratio (RPQM/NPSP) using the ONS series codes. Import prices are

measured as the implicit deflator for total imports of goods and services, which is defined

as (IKBI/IKBL) using the ONS codes. The unit labour cost series is an average measure

for the whole economy and is calculated as the ratio of wages and salaries to real gross

domestic product, which is (ROYJ/AMBI) using the ONS codes. Real consumption ex-

penditure is defined as the volume index, NPSP, used in the construction of the consumer

expenditure deflator.

All of the series obtained from the ONS are seasonally adjusted, and the price series

that we construct are indexed such that 1995 = 100.

References

Banerjee, A., L. Cockerell and B. Russell (2001): “An I(2) Analysis of Inflation and the Markup”, Journal of Applied Econometrics, 16, 221–240.

Banerjee, A. and B. Russell (2001): “The Relationship Between the Markup and Inflation in the G7 Economies and Australia”, The Review of Economics and Statistics, 82(2), 377–387.

Bank of England (1999): Economic Models at the Bank of England, Bank of England Publications, London.

Batini, N., B. Jackson and S.J. Nickell (2000): “Inflation Dynamics and the Labour Share in the UK”, Discussion Paper No. 2, External Monetary Policy Committee Unit, Bank of England.

de Brouwer, G. and N. Ericsson (1998): “Modelling Inflation in Australia”, Journal of Business and Economic Statistics, 16, 433–449.

Diamandis, P.F., D.A. Georgoutsos and G.P. Kouretas (2000): “The Monetary Model in the Presence of I(2) Components: Long-Run Relationships, Short-Run Dynamics and Forecasting of the Greek Drachma”, Journal of International Money and Finance, 19, 917–941.

Doornik, J.A. (1998): “Approximations to the Asymptotic Distribution of Cointegration Tests”, Journal of Economic Surveys, 12(5), 573–593.

––– (2001): Object-Oriented Matrix Programming Using Ox, Timberlake Consultants Press, London, 4th edition.

Gredenhoff, M. and T. Jacobson (2001): “Bootstrap Testing Linear Restrictions on Cointegrating Vectors”, Journal of Business and Economic Statistics, 19(1), 63–72.

Greenaway, D. and G.K. Shaw (1988): Macroeconomics: Theory and Policy in the UK, Basil Blackwell Ltd, Oxford, 2nd edition.

Haldrup, N. (1998): “A Review of the Econometric Analysis of I(2) Variables”, Journal of Economic Surveys, 12, 595–650.

Jacobson, T., A. Vredin and A. Warne (1998): “Are Real Wages and Unemployment Related?”, Economica, 65(267), 69–96.

Johansen, S. (1992): “A Representation of Vector Autoregressive Processes Integrated of Order 2”, Econometric Theory, 8, 188–202.

––– (1995): “A Statistical Analysis of Cointegration for I(2) Variables”, Econometric Theory, 11, 25–59.

––– (1996): Likelihood-Based Inference in Cointegrated Vector Autoregressive Models, Oxford University Press, Oxford, 2nd edition.

––– (2002): “Testing Hypothesis in the I(2) Model”, Preprint No. 13, Department of Theoretical Statistics, University of Copenhagen.

Juselius, K. (1998): “A Structured VAR under Changing Monetary Policy”, Journal of Business and Economic Statistics, 16(4), 400–412.

King, R.G., C.I. Plosser, J.H. Stock and M.W. Watson (1991): “Stochastic Trends and Economic Fluctuations”, American Economic Review, 81(4), 819–840.

Kongsted, H.C. (2003): “An I(2) Cointegration Analysis of Small-Country Import Price Determination”, Econometrics Journal, 6, 53–71.

––– and H.B. Nielsen (2002): “Analyzing I(2) Systems by Transformed Vector Autoregressions”, Discussion Paper 02-20, Institute of Economics, University of Copenhagen, Chapter 2 in this Thesis.

Layard, P.R.G., S.J. Nickell and R. Jackman (1991): Unemployment, Macroeconomic Performance and the Labour Market, Oxford University Press, Oxford.

Li, H. and G.S. Maddala (1997): “Bootstrapping Cointegrating Regressions”, Journal of Econometrics, 80, 297–318.

Lütkepohl, H. (1991): Introduction to Multiple Time Series Analysis, Springer-Verlag, Berlin.

Mellander, E., A. Vredin and A. Warne (1992): “Stochastic Trends and Economic Fluctuations in a Small Open Economy”, Journal of Applied Econometrics, 7, 369–394.

Nielsen, B. and A. Rahbek (2000): “Similarity Issues in Cointegration Analysis”, Oxford Bulletin of Economics and Statistics, 62(1), 5–22.

Nielsen, H.B. (2002): “An I(2) Cointegration Analysis of Price and Quantity Formation in Danish Manufactured Exports”, Oxford Bulletin of Economics and Statistics, 64(5), 449–472, Chapter 4 in this Thesis.

––– and A. Rahbek (2003): “The Likelihood Ratio Test for the Cointegration Ranks in the I(2) Model”, Working Paper, Institute of Economics, University of Copenhagen, Chapter 1 in this Thesis.

Rahbek, A., H.C. Kongsted and C. Jørgensen (1999): “Trend-Stationarity in the I(2) Cointegration Model”, Journal of Econometrics, 90, 265–289.

Warne, A. (1993): “A Common Trends Model: Identification, Estimation and Inference”, Seminar Paper No. 555, IIES, Stockholm University.

Chapter 6

Has US Monetary Policy Followed the Taylor Rule?

A Cointegration Analysis 1988–2002

Anders Møller Christensen

Economics Department,

Danmarks Nationalbank

amc@nationalbanken.dk

Heino Bohn Nielsen

Abstract

Based on the equilibrium correction structure of a cointegrated vector autoregressive

model it is rejected that US monetary policy 1988-2002 can be described by a tradi-

tional Taylor (1993) rule. Instead we find a stable long-term relationship between the

Federal funds rate, the unemployment rate, and the long-term interest rate, with deviations

from the long-term relation being corrected primarily via changes in the Federal

funds rate. This is taken as an indication that the FOMC sets interest rates with a

view to activity and to expected inflation and other conditions available in financial

markets.

Keywords: Taylor rule; Bond rate; Cointegration; Equilibrium Correction.

JEL Classification: C32; E52.

1 Introduction

In an influential article Taylor (1993) suggests that US monetary policy 1987–1992 can be summarized by a simple policy rule, in which the Federal funds rate reflects deviations

of inflation and activity from their policy targets. That initiated two large strands of liter-

ature. One line of research deals with the representation of actual central bank behavior,

and tries to elaborate on the so-called Taylor rule, see inter alia Evans (1998), Judd and

Rudebusch (1998), Orphanides (2001), and Ball and Tchaidze (2002). Another line of

research deals with issues regarding the optimal monetary policy given the central bank's

objectives and tries to encompass the Taylor rule in a framework of optimizing agents.

The present paper is of the empirical kind and considers the monetary policy in the

United States since 1988. The approach taken in this paper differs from most other

research on Taylor rules in at least two respects.

The authors would like to thank Christopher Bowdler, Eilev Jansen, Henrik Jensen, Søren Johansen,

Katarina Juselius, Dan Knudsen and Hans Christian Kongsted for comments. Remaining errors and

shortcomings are the sole responsibilities of the authors.

First, the econometric technique of multivariate cointegration analysis is applied, al-

lowing for a simultaneous investigation of long-term relationships and the short-term dy-

namics. We argue that a long-term relationship involving what is considered a monetary

policy rate can only be interpreted as a monetary policy rule if deviations from the equilib-

rium rate are corrected via changes in the policy instrument. This is a testable hypothesis

on the equilibrium correction structure of the multivariate dynamic model.

The second difference concerns the choice of variables entering the analysis. Inspired

by the role of the yield-curve in recent monetary analysis and in the literature on leading

indicators, the long-term bond rate is included in the analysis in parallel with the inflation

rate and the unemployment rate. The extended information set makes it possible to

analyze the role of financial market information in monetary policy.

The results clearly suggest that the Federal funds rate does not equilibrium correct to

a traditional Taylor (1993) rule, which we therefore reject as a representation of the be-

havior of the monetary policy managed by the Federal Open Market Committee (FOMC).

Instead, it seems like the short-term rate is set as if the information on the economy avail-

able in the capital market, here represented by the bond rate, has played an important

role in addition to developments in unemployment. One interpretation of this finding is

that monetary policy is affected by inflation expectations, as embodied in the bond yield,

rather than realized inflation. We do not consider this a real-time policy rule, but it is

a better representation than the Taylor rule of the kind of factors that have entered the

decision making process.

The rest of the paper is organized as follows. Sections 2 and 3 briefly discuss the basic

concepts regarding Taylor rules and measurement. Section 4 presents the econometric

tools involved in cointegration analysis and some important testable hypotheses implied

by the Taylor rule. Section 5 presents the empirical evidence on simple monetary policy

rules for the US since 1988, while Section 6 concludes.

2 Taylor Rules

In a seminal paper, Taylor (1993) suggested that the FOMC has managed the Federal

funds rate according to the simple linear formula

$$f_t = \pi_t + \lambda_1 \tilde{u}_t + \lambda_2 \tilde{\pi}_t + \kappa_0,$$

where $f_t$ denotes the Federal funds rate, $\pi_t$ and $\tilde{\pi}_t$ denote inflation and the deviation of inflation from a specified target respectively, $\tilde{u}_t$ denotes the deviation of economic activity from a natural level, and the constant $\kappa_0$ is interpretable as the target real interest rate

in equilibrium. If the inflation gap is measured as the deviation from a constant target, $\tilde{\pi}_t = \pi_t - \pi^*$, as is usually the case, $\pi^*$ is not empirically identifiable and the relation collapses to

$$f_t = \lambda_1 \tilde{u}_t + (1 + \lambda_2)\pi_t + \kappa_1, \tag{1}$$

where $\kappa_1 = \kappa_0 - \lambda_2\pi^*$. The original rule in Taylor (1993) was based on the current-quarter

output gap and the year-on-year change in the GDP deflator, and the conjectured

coefficients $\lambda_1 = \lambda_2 = 0.5$ and $\kappa_1 = 1$ were used to interpret US monetary policy 1987–1992.
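As a hedged numerical illustration of how the rule collapses to (1), the following Python sketch evaluates both forms side by side. The values κ0 = 2 and π* = 2 are assumptions, chosen only so that κ1 = κ0 − λ2π* = 1 as quoted above; the function names are hypothetical.

```python
def taylor_rate(pi, gap, lam1=0.5, lam2=0.5, kappa0=2.0, pi_star=2.0):
    """Uncollapsed form: f = pi + lam1*gap + lam2*(pi - pi_star) + kappa0.

    kappa0 and pi_star are assumed values (see the lead-in), not estimates.
    """
    return pi + lam1 * gap + lam2 * (pi - pi_star) + kappa0


def taylor_rate_collapsed(pi, gap, lam1=0.5, lam2=0.5, kappa1=1.0):
    """Collapsed form (1): f = lam1*gap + (1 + lam2)*pi + kappa1."""
    return lam1 * gap + (1.0 + lam2) * pi + kappa1


# Example: inflation of 3 per cent and an activity gap of 1 per cent.
rate_full = taylor_rate(pi=3.0, gap=1.0)
rate_collapsed = taylor_rate_collapsed(pi=3.0, gap=1.0)
```

Both calls give the same implied funds rate, which is the sense in which π* cannot be identified separately from the constant.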

Inter alia Orphanides (2001) emphasizes the importance of using real-time and not

final, revised data, and Evans (1998) and Ball and Tchaidze (2002) consider policy rules

based on the deviation of unemployment from an estimated natural rate and consumer

price inflation, possibly after excluding some volatile components. That allows for an

analysis based on monthly rather than quarterly data, and is also the approach taken in

this paper.

In the basic formulation, the relation (1) is contemporaneous and the Federal funds rate

could at any time be approximated by the right hand side. Some empirical applications,

therefore, use (1) directly as a regression equation, see e.g. Evans (1998) and Ball and

Tchaidze (2002). Alternatively the right hand side of (1) can be considered a notional

target, $f_t^*$, and a model for partial adjustment of $f_t$ to $f_t^*$ can be considered, e.g.

$$f_t = \rho f_{t-1} + (1 - \rho)\left(\lambda_1 \tilde{u}_t + (1 + \lambda_2)\pi_t + \kappa_1\right), \tag{2}$$

see Judd and Rudebusch (1998) and Orphanides (2001) for examples and English, Nelson,

and Sack (2003) for different interpretations. For $\rho = 0$, (2) collapses to the usual Taylor

rule, while some degree of interest rate smoothing prevails if $\rho > 0$.
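The role of ρ in (2) can be seen in a minimal sketch that holds the notional target fixed and iterates the partial adjustment. The starting rate, the target and ρ = 0.8 are hypothetical numbers chosen for illustration only.

```python
def partial_adjustment_path(f0, target, rho, n_steps):
    """Iterate f_t = rho * f_{t-1} + (1 - rho) * target for a constant target."""
    path = [f0]
    for _ in range(n_steps):
        path.append(rho * path[-1] + (1.0 - rho) * target)
    return path


# Hypothetical numbers: rate starts at 3, notional target is 5.
no_smoothing = partial_adjustment_path(f0=3.0, target=5.0, rho=0.0, n_steps=1)
smoothing = partial_adjustment_path(f0=3.0, target=5.0, rho=0.8, n_steps=10)
```

With ρ = 0 the rate jumps to the target in one period, whereas with ρ > 0 the remaining gap shrinks by the factor ρ each period.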

In empirical analyses, inflation is often found to be best approximated by a unit root

process. In that case, simple inference on the parameters in (1) and (2) is only valid

if the variables cointegrate, a hypothesis which is rarely tested in this literature. In

the present paper we take a different route and consider the relation (1) as a candidate

for an equilibrium relation and estimate the parameters within a multivariate dynamic

framework. This approach has several advantages. First, it is possible to test if the

Federal funds rate is related to the explanatory variables in a relation like (1) such that the

deviations, $f_t - f_t^*$, are stationary. Second, the multivariate approach allows us to test for the endogeneity of the included variables with respect to the parameters in (1). For the

relation to be interpreted as a policy rule, deviations $f_t - f_t^*$ should be corrected by $f_t$,

with the interpretation that the FOMC seeks to eliminate misalignment from the target

rate. If there are no dynamic forces in the model making $f_t$ correct to $f_t^*$, then there is

no natural interpretation of $f_t^*$ as a target value for $f_t$. Furthermore, we can test if the variables of interest, unemployment and inflation, react to the stance of monetary policy

and test if the variables are actually controllable by the monetary policy instrument as

defined in Johansen and Juselius (2001), see further in Section 4.

Bernanke and Blinder (1992) and Goodfriend (1998) suggest using the spread between

the Federal funds rate, ft, and a long-term bond rate, bt, as an indicator of the stance of

monetary policy. The bond rate naturally incorporates information on inflation expecta-

tions, and at the same time it is insensitive to short-run variations in monetary conditions.

The bond rate could also contain other relevant information. A sudden increase in the

bond rate could reflect a declining credibility of monetary policy and the FOMC could

react by a preemptive increase in the Federal funds rate, see also Carey (2001). Mehra

(2001) and Carey (2001) include the bond rate as an additional variable in a Taylor rule

like (1). Since bt will react with a one-to-one impact from inflation expectations, and

inflation is already present, we insert in (1) only the ‘new’ information as measured by

the real bond rate, $b_t - \pi_t$, and correct for the average ‘tilt’ of the yield curve, $\tau$, to obtain

$$f_t = \lambda_1 \tilde{u}_t + (1 + \lambda_2 - \lambda_3)\pi_t + \lambda_3 b_t + \kappa_2,$$

where $\kappa_2 = \kappa_0 - \lambda_2\pi^* - \lambda_3\tau$. If there is a one-to-one impact from the bond rate to the

Federal funds rate, $\lambda_3 = 1$, we obtain a simple Taylor-type rule for the interest rate spread:

$$f_t = b_t + \lambda_1 \tilde{u}_t + \lambda_2 \pi_t + \kappa_2. \tag{3}$$
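The intermediate step behind these two expressions can be written out as follows. This is one reading that is consistent with the stated constant κ2 = κ0 − λ2π* − λ3τ; the first line, in which the real bond rate net of the tilt enters with weight λ3, is an assumption about how the term is inserted.

```latex
\begin{align*}
f_t &= \pi_t + \lambda_1 \tilde{u}_t + \lambda_2 (\pi_t - \pi^*)
       + \lambda_3 (b_t - \pi_t - \tau) + \kappa_0 \\
    &= \lambda_1 \tilde{u}_t + (1 + \lambda_2 - \lambda_3)\,\pi_t + \lambda_3 b_t
       + \underbrace{(\kappa_0 - \lambda_2 \pi^* - \lambda_3 \tau)}_{\kappa_2}, \\
\lambda_3 = 1 \;\Rightarrow\;
f_t - b_t &= \lambda_1 \tilde{u}_t + \lambda_2 \pi_t + \kappa_2 .
\end{align*}
```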

The interest rate spread is often considered to be a predictor of future inflation or ac-

tivity, cf. Mishkin (1990), Estrella and Hardouvelis (1991), and the survey by Estrella and

Mishkin (1996). Taking the information on expectations of future activity and inflation in

the long-term rates into consideration when setting short-term rates is therefore straight-

forward in a certain sense. However, in theory the relationship is normally turned the

other way round, making long-term rates a function of expected future short-term rates

as in the expectation theory of the term structure, even if Schiller (1990) acknowledges

that a lot of evidence speaks against the empirical validity of the theory. Christensen

(2002) suggests that a normalized interest rate spread is a straightforward method to

reveal real-time information on the real interest rate gap of recent monetary theory, cf.

Woodford (2003) and Svensson (2003).

Several applications have emphasized the forward looking nature of monetary policy,

see Clarida, Galí, and Gertler (1998), Clarida, Galí, and Gertler (2000), Orphanides

(2001), and Svensson (2003). In the present study this feature is mainly implicit, in the

sense that the applied vector autoregressive model is consistent with the concept of for-

ward looking expectations using data based projection functions. However, when data on

long-term interest rates are included in the data set, information on expected future in-

flation and expected alternative real yields are included more directly without restrictions

on the way expectations are formed.

3 Data Measurements

To analyze monetary policy reaction functions like (1) and (3) we consider a monthly

data set, $Y_t = (f_t : b_t : u_t : \pi_t)'$, comprising the effective Federal funds rate, $f_t$, a constant

maturity 10 year Treasury bill rate, $b_t$, the unemployment rate corrected for a linear

trend, $u_t$, and core inflation measured as 100 times the year-on-year change in the log

transformed consumer price index excluding food and energy, $\pi_t$.¹ The considered sample

covers the period since Alan Greenspan began as the chairman of the Federal Reserve

Board. The effective sample is 1988:1–2002:12, and we condition the analysis on the last months of 1987.

¹ All data series are taken from the EcoWin data base. The unemployment rate is calculated from the total number of unemployed and the total civilian labor force, both seasonally adjusted.

Graph (A) in Figure 1 depicts the Federal funds rate and the Treasury bill rate and

graph (B) depicts the spread, ft − bt. The interest rates have some similarities, but have

been far from parallel. On average ft has been lower than bt, but on three occasions the

Federal funds rate has exceeded the bond yield.

Graph (C) depicts the unemployment rate. Several authors have suggested a fall in

the natural rate of unemployment in the period under consideration, see inter alia Ball

and Tchaidze (2002). To allow for a decline in the natural rate in a transparent way and

to avoid a deterministic trend in the empirical analysis of the monetary policy rules we

take a very simple approach and correct a priori for a linear trend in unemployment using

least squares. We make no assumptions on the level of the natural rate and include in the

empirical analysis the variable

$$u_t = u_t^* - 0.00996 \cdot (t - \bar{t}),$$

where $u_t^*$ is the observed unemployment rate and $(t - \bar{t})$ is the demeaned linear trend

such that $u_t$ and $u_t^*$ have the same means over the period. The estimated linear trend is also depicted in graph (C) and assumes that the natural rate has fallen approximately 2%

in the sample period. We are fully aware that the linear correction creates problems if

extrapolations are made. However, in this way we avoid making more subjective manip-

ulations of data. We are confident that this specific choice is not material for the results

reported below. The sample period covers a slack in the early 1990s and a subsequent

long upturn ending sharply in 2000. A comparison of (C) and (A) indicates a negative

correlation between ut and ft, and there is also a clear correlation between ut and the

interest rate spread, ft − bt, in (B).
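The a priori trend correction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the series below is synthetic (it is not the EcoWin data), the function name is hypothetical, and the slope is re-estimated by least squares rather than fixed at the coefficient reported in the text.

```python
import numpy as np


def detrend_keep_mean(u_obs):
    """Remove an OLS-fitted linear trend while leaving the sample mean unchanged."""
    u_obs = np.asarray(u_obs, dtype=float)
    t = np.arange(u_obs.size, dtype=float)
    t_dm = t - t.mean()                      # demeaned trend, (t - t_bar)
    slope = (t_dm @ u_obs) / (t_dm @ t_dm)   # OLS slope of u_obs on the trend
    return u_obs - slope * t_dm, slope


# Synthetic monthly series with a falling trend (hypothetical numbers).
rng = np.random.default_rng(0)
months = np.arange(180)
u_star = 6.0 - 0.01 * months + 0.3 * rng.standard_normal(180)
u_adj, slope_hat = detrend_keep_mean(u_star)
```

Because the trend is demeaned before it is subtracted, the corrected series has the same sample mean as the observed one, and a second fit on the corrected series returns a slope of zero up to rounding.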

Finally, graph (D) depicts core inflation. Inflation has been steadily decreasing over

the period, with two bouts of rising inflation, one in the early 1990s and one in the early 2000s.

Comparing developments in core inflation with the Federal funds rate and the interest

rate spread indicates a weaker correlation.

4 Econometric Tools

To analyze the interaction between the interest rates, unemployment, and inflation, we

consider the $p$-dimensional vector autoregressive (VAR) model:

$$H(r): \quad \Delta Y_t = \alpha\left(\beta' Y_{t-1} + \mu'\right) + \sum_{i=1}^{k-1} \Gamma_i \Delta Y_{t-i} + \epsilon_t, \qquad t = 1, 2, \ldots, T, \tag{4}$$

where $\alpha$ and $\beta$ are of dimension $p \times r$, the innovations $\epsilon_t$ are assumed to be independently

Gaussian distributed, $N(0, \Omega)$, and the initial values, $Y_{-k+1}, \ldots, Y_0$, are considered fixed.

Figure 1: The data 1987-2002. Panels: (A) Interest rates (Federal funds rate and Treasury bill rate); (B) Interest rate spread; (C) Unemployment rate and linear trend; (D) Core inflation y-o-y.

Based on theoretical considerations we do not allow for deterministic linear trends in the

variables and include only a constant restricted to the cointegrating relations.

Under the additional assumptions that the characteristic polynomial of (4) has roots at

one or outside the unit circle and $\alpha_\perp' \Gamma \beta_\perp$ is non-singular, where $\alpha_\perp$ and $\beta_\perp$ are orthogonal complements to $\alpha$ and $\beta$ respectively and $\Gamma = I - \Gamma_1 - \ldots - \Gamma_{k-1}$, $Y_t$ is an I(1) process with representation

$$Y_t = C \sum_{i=1}^{t} \epsilon_i + C(L)\epsilon_t + \tau_0, \tag{5}$$

where $C = \beta_\perp (\alpha_\perp' \Gamma \beta_\perp)^{-1} \alpha_\perp'$ is the $p \times p$ dimensional long-run impact matrix of rank

$p - r$, $C(L)$ is a convergent polynomial in the lag operator $L$, and $\tau_0$ are coefficients

depending on $\mu$ and the initial values, see Johansen (1996, Theorem 4.2). The interpretation

of a coefficient $C_{ij}$ in $C$ is the long-run effect on variable $i$ from an innovation to $\epsilon_{jt}$.

The representation (5) shows that $Y_t$ is integrated of first order, I(1), while the $r$ linear

combinations, $\beta' Y_t$, are stationary.

Maximum Likelihood (ML) estimation of $H(r)$ is given by reduced rank regression,

see Johansen (1996, chapter 6). To determine the number of long-run relations, r, the

nested models, H(0) ⊂ ... ⊂ H(r) ⊂ ... ⊂ H(p), can be compared using likelihood ratio

(LR) tests, the so-called trace tests, with asymptotic distributions given in Johansen

(1996, Theorem 6.3). Conditional on r, it is possible to test restrictions on the long-run

coefficients, $\beta^* = (\beta' : \mu')'$, and on the short-run adjustment coefficients, $\alpha$. In this paper

we consider hypotheses involving linear restrictions on the columns in $\alpha$ and $\beta^*$, i.e.

$$H: \quad \alpha = (A_1\phi_1 : \ldots : A_r\phi_r) \quad \text{and} \quad \beta^* = (H_1\varphi_1 : \ldots : H_r\varphi_r),$$

where $\phi_i$ and $\varphi_i$ contain the free parameters in column $i$ of $\alpha$ and $\beta^*$ respectively. Under $H$ the model can be estimated using e.g. the switching algorithm of Boswijk (1995); and,

given identification, the LR test statistic for $H$ is asymptotically distributed as a $\chi^2$ under

the null.
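To make the χ² reference distribution concrete, the sketch below converts an LR statistic into an asymptotic p-value using closed-form upper-tail probabilities for 1, 2 and 4 degrees of freedom, which are the cases appearing in Table 3. The function name is hypothetical, and the restriction to these three cases is a simplification of this sketch, not of the method.

```python
import math


def chi2_upper_tail(x, df):
    """P(X > x) for X ~ chi-square(df); only df in {1, 2, 4} are handled here."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    if df == 4:
        return math.exp(-x / 2.0) * (1.0 + x / 2.0)
    raise ValueError("this sketch only covers df = 1, 2 or 4")


# LR statistics and degrees of freedom as reported for H1 to H4 in Table 3.
lr_tests = {"H1": (5.796, 1), "H2": (0.489, 1), "H3": (0.896, 2), "H4": (2.862, 4)}
p_values = {name: chi2_upper_tail(stat, df) for name, (stat, df) in lr_tests.items()}
```

Rounded to three decimals, the resulting p-values agree with those printed in Table 3.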

Johansen and Juselius (2001) analyze the implementation of monetary policy control

rules in a cointegrated vector autoregressive model. They consider a target variable $d'Y_t$ and a given instrument $a'Y_t$, where $a$ and $d$ are $p$-dimensional vectors (often unit vectors). The definition of controllability of $d'Y_t$ with $a'Y_t$ in this context is that $d'Y_t$ can be made stationary around a target value $d^*$ by intervening in $a'Y_t$ at all points in time. The necessary control rule and the properties of the controlled process are derived in Johansen

and Juselius (2001, Theorem 7). To analyze if such a control rule has been in action, a

necessary condition is that $d'Y_t$ is stationary. The condition for controllability per se is that $d'Ca \neq 0$, such that interventions to the instruments give a non-zero long-run impact on the target.
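The long-run impact matrix C and the controllability condition d'Ca ≠ 0 can be illustrated numerically. The sketch below is a minimal example under loudly stated assumptions: α and β are hypothetical round numbers (not the estimates of this paper), Γ is set to the identity (no short-run dynamics), and the helper names are invented for the illustration.

```python
import numpy as np


def orth_complement(a):
    """Orthonormal basis of the orthogonal complement of the columns of a (p x r, full column rank)."""
    u, _, _ = np.linalg.svd(a, full_matrices=True)
    return u[:, a.shape[1]:]


def long_run_impact(alpha, beta, gamma):
    """C = beta_perp (alpha_perp' Gamma beta_perp)^{-1} alpha_perp'."""
    a_perp = orth_complement(alpha)
    b_perp = orth_complement(beta)
    middle = a_perp.T @ gamma @ b_perp
    return b_perp @ np.linalg.solve(middle, a_perp.T)


# Hypothetical inputs for p = 4, r = 1, ordered as (f, b, u, pi).
alpha = np.array([[-0.10], [0.0], [0.0], [-0.05]])
beta = np.array([[1.0], [-1.0], [2.0], [0.0]])
gamma = np.eye(4)  # assumption: Gamma = I

C = long_run_impact(alpha, beta, gamma)

d = np.array([0.0, 0.0, 0.0, 1.0])  # target: inflation (4th variable)
a = np.array([1.0, 0.0, 0.0, 0.0])  # instrument: Federal funds rate (1st variable)
controllability = d @ C @ a          # the element C_41
```

For these illustrative inputs C has rank p − r = 3, satisfies β'C = 0 and Cα = 0, and the element C41 equals −0.5, so the hypothetical target would be controllable with the instrument.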

Implied Hypotheses for Monetary Policy Rules. For the present data, Yt, and for

r = 1, which is the main case considered in the empirical analysis, the first part of (4) can

be written as

$$\begin{pmatrix} \Delta f_t \\ \Delta b_t \\ \Delta u_t \\ \Delta \pi_t \end{pmatrix} = \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \alpha_4 \end{pmatrix} \left(f_{t-1} + \beta_2 b_{t-1} + \beta_3 u_{t-1} + \beta_4 \pi_{t-1} + \mu\right) + \ldots,$$

where the long-run relation is normalized on the Federal funds rate, $f_{t-1}$. For the empirical model to be interpretable as a characterization of monetary policy we propose two

requirements:

1. That the coefficients $(1 : \beta_2 : \beta_3 : \beta_4 : \mu)'$ are interpretable as a policy rule. From theory we expect $\beta_2 \le 0$, $\beta_3 \ge 0$, and $\beta_4 \le 0$. If $\beta_2 = 0$ the relation collapses to the conventional Taylor rule (1). If $\beta_2 = -1$ the relation is a simple rule for the interest rate spread (3).

2. That $\alpha_1 < 0$ such that deviations of the Federal funds rate from the equilibrium value are corrected by monetary policy actions.

Besides tests of these requirements, it is also possible to test the effect of misalignments of

the policy rate from the equilibrium rate on unemployment and inflation. This corresponds

to inference on α3 and α4 respectively. In particular, we expect high interest rates to put

downward pressure on inflation, $\alpha_4 \le 0$, and upward pressure on unemployment, $\alpha_3 \ge 0$. This involves a Phillips-curve trade-off between the two goals in the optimal policy setting.

Controllability of the inflation rate, $\pi_t$, with the Federal funds rate, $f_t$, can be tested

as the hypothesis that $C_{41} \neq 0$. A priori we expect $C_{41} < 0$.

5 Empirical Analysis of US Monetary Policy

In this section we look at the empirical evidence on monetary policy in the US based on the

monthly data set, $Y_t = (f_t : b_t : u_t : \pi_t)'$, for the effective sample t = 1988:1, ..., 2002:12.²

The first step in the analysis is to determine the lag length k of the VAR. Information

criteria and successive testing for removal of lags point towards k = 3 or k = 4. Since

there is some residual autocorrelation in the model with three lags, we base the analysis

of the long-run structure on a VAR with k = 4 lags. Results broadly similar to the

ones presented below are obtained for k = 3.

Table 1 reports a battery of misspecification tests. The only deviation from the differ-

ent nulls of a well specified model is a marginal autoregressive conditional heteroscedas-

ticity (ARCH) in the equation for core inflation. This is not unusual for monthly data

and is not easily remedied within the VAR framework. Work of Rahbek, Hansen, and

Dennis (2002) indicates that moderate ARCH effects do not disturb the analysis of the

cointegrated VAR, and we choose to ignore this potential problem in the following. It is

interesting to note that there are no extreme outliers in the data and the null of Gaussian

residuals is accepted. This is also the case for the Federal funds rate which is managed by

the FOMC.

Long-Run Structure. Next we want to determine the cointegration rank. It is known

from simulation studies that it is no easy task to select the cointegration rank in empirical

applications, and the finite sample distribution of the trace test for a cointegration rank of

rank(Π) ≤ r against the unrestricted alternative, H (p), is typically displaced to the right

relative to the asymptotic distribution. To take account of the resulting size distortion,

Johansen (2002) proposes a Bartlett correction for the trace test. This is applied to the

current data in Table 2. The model H(0) with no cointegration is rejected, with a p-value based on the Bartlett corrected test of 2%. The test for model H(1) with r = 1 long-run

relation is a borderline case, with a p-value of 8% according to the Bartlett corrected

test. This model is also in line with the theoretical setup, and we choose this for the main

analysis.

Based on the model with r = 1 we want to analyze the information in the data on the

structure of the long-run relation. The unrestricted estimates of α and β∗ are reported in Table 3 under H0, with t-values based on the asymptotic standard deviations in parentheses. The relation is normalized on the Federal funds rate and the t-values indicate a

² The empirical analysis was carried out using a set of procedures programmed in Ox, see Doornik (2001), and PcGive, see Doornik and Hendry (1997).

Table 1: Tests for misspecification of the unrestricted VAR(4)

                        AR(1)       AR(1-7)      ARCH(7)     Normality
∆ft                   .63 [.43]   1.04 [.41]   1.96 [.06]   4.39 [.11]
∆bt                   .91 [.34]   1.27 [.27]    .78 [.60]   4.31 [.12]
∆ut                   .05 [.82]   1.05 [.40]    .51 [.82]    .91 [.64]
∆πt                   .02 [.88]   1.75 [.10]   2.43 [.02]    .45 [.80]
Multivariate tests:   .82 [.66]   1.09 [.26]      ...       9.96 [.27]

Note: Figures in square brackets are p-values. AR(1) are the F-tests for first order autocorrelation and are distributed as F(1,162) and F(16,477) for the single equation and vector tests respectively. AR(1-7) are tests for up to seventh order autocorrelation and are distributed as F(7,156) and F(112,526) respectively. ARCH(7) tests for ARCH effects up to the seventh order and is distributed as F(7,149). The last column reports results of the Doornik and Hansen (1994) test for normality, distributed as χ2(2) and χ2(8) respectively.

Table 2: Trace tests for the cointegration rank

H(r)                   r = 0    r ≤ 1    r ≤ 2    r ≤ 3
Eigenvalues             .144     .109     .069     .017
LR statistic           64.59    36.62    15.89     3.01
Asymptotic p-value     [.00]    [.03]    [.18]    [.59]
Bartlett factor         1.11     1.11     1.69     1.35
Corrected p-value      [.02]    [.08]    [.70]    [.73]

Note: Likelihood Ratio tests for H(r) against H(p). Case with a restricted constant. Figures in square brackets are asymptotic p-values based on the approximate critical values derived from Γ-distributions by Doornik (1998).

Table 3: Identification of the long-run structure

Adjustment coefficients α
        H0                H1                 H2                H3                 H4
ft      −0.062 (−3.88)    −0.000 (−0.01)     −0.063 (−3.56)    −0.063 (−3.56)     −0.080 (−4.46)
bt      0.004 (0.15)      −0.027 (−1.95)     −0.011 (−0.39)    −0.012 (−0.45)     0 (...)
ut      −0.015 (−1.04)    −0.019 (−2.38)     −0.025 (−1.60)    −0.022 (−1.42)     0 (...)
πt      −0.042 (−2.94)    −0.029 (−3.62)     −0.047 (−3.04)    −0.049 (−3.10)     −0.040 (−2.37)

Long-run coefficients β∗
        H0                H1                 H2                H3                 H4
ft      1 (...)           1 (...)            1 (...)           1 (...)            1 (...)
bt      −1.284 (−5.26)    0 (...)            −1 (...)          −1 (...)           −1 (...)
ut      1.657 (9.63)      3.062 (9.85)       1.783 (12.48)     1.782 (12.51)      1.637 (11.74)
πt      0.290 (0.87)      −1.418 (−5.66)     −0.077 (−0.67)    0 (...)            0 (...)
1       −6.914 (−4.82)    −17.677 (−9.43)    −8.349 (−9.70)    −8.584 (−10.78)    −7.797 (−10.01)

Tests of the restrictions
                H0     H1       H2       H3       H4
LR statistic    ...    5.796    0.489    0.896    2.862
p-value         ...    0.016    0.484    0.639    0.581
Distribution    ...    χ2(1)    χ2(1)    χ2(2)    χ2(4)

Note: t-values based on asymptotic standard errors in parentheses.

significant coefficient to the Treasury bill rate. A magnitude in the proximity of one is also found in Mehra (2001) for a longer sample period. The coefficient to unemployment is 1.7, indicating that high unemployment is associated with a low Federal funds rate. The coefficient is clearly significant, with a t-value of 9.6. The coefficient for core inflation is also positive, which is the opposite of what is expected for a monetary policy rule, but it is not significantly different from zero. The adjustment coefficients in α clearly suggest an interpretation of the relation as a monetary reaction function. In particular, the adjustment coefficient in the Federal funds equation is negative and clearly significant, with a t-value of −3.9. There is also a significantly negative impact in the inflation equation, indicating that a high interest rate relative to the target lowers inflation. The adjustment in the equation for the Treasury bill rate is close to zero and the adjustment in the unemployment equation is negative but not significantly different from zero.

Based on the unrestricted coefficients, the long-run relation thus looks like a monetary policy rule. A conventional Taylor rule would imply a zero coefficient for the Treasury bill rate, β2 = 0. Imposing this restriction gives the results reported under H1. First, it can be noted that the restriction is formally rejected at the 5% level, indicating that the simple Taylor rule does not seem to be an adequate description of the present sample. In the restricted relation, the coefficients to both ut and πt are significant with the expected signs, and the magnitude of the coefficient to inflation is close to the 1.5 suggested by the original Taylor rule. A static least squares regression of ft on ut, πt and a constant term, often seen in the empirical literature on Taylor rules, yields

\[
f_t = \underset{(-32.56)}{-1.773} \cdot u_t + \underset{(25.10)}{1.206} \cdot \pi_t + \underset{(33.78)}{11.500}, \tag{6}
\]

where t-values are in parentheses and R2 is 0.90. The estimated equation (6) is close to the quarterly results in Ball and Tchaidze (2002) and is not too far from the long-run relation in H1, although the coefficient to unemployment is somewhat smaller. At first glance the results look like a monetary policy reaction function, but the adjustment coefficients to the long-run relation in H1 do not give much support for this interpretation, since there is no feedback to the Federal funds rate. Deviations from the relation are corrected by ut and πt but not by the Federal funds rate, ft. This highlights the dangers of estimating structural Taylor rules from static regressions. Firstly, with likely unit roots in the variables, inference on (4) is difficult, and secondly, the dynamic adjustment to a possible equilibrium is not modelled, making the interpretation of the nature of the relation hazardous.3

The above results suggest an important role for the bond rate. The theoretical relation (3) gives a simple interpretation as a Taylor-type rule for the interest rate spread if the coefficient to bt is restricted to minus one. Under H2 we have reported the results after imposing β2 = −1. The restriction produces an LR test statistic of 0.49, corresponding to a p-value of 0.48 in a χ2(1) distribution. In this relation the coefficient to inflation has the expected sign, but it is clearly insignificant with a t-value of 0.67. Imposing the additional restriction, β4 = 0, only marginally changes the likelihood and produces the results reported under H3.

3We can add that weak exogeneity of the Federal funds rate is not a result of including the bond rate in the model. The same result appears in an analysis of (ft : ut : πt)′.

The coefficients under H3 suggest that the feedbacks to bt and ut are very weak, and imposing the two additional restrictions that bt and ut are weakly exogenous for the long-run coefficients, α2 = α3 = 0, produces the preferred results reported under H4.
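The p-values attached to the LR statistics in Table 3 can be checked against the χ2 distribution with a few lines of code. The sketch below is only illustrative: the function name `chi2_sf` is our own, and it evaluates the upper tail probability for integer degrees of freedom using the standard recurrence for the regularized incomplete gamma function, rather than whatever routine was used in the original Ox and PcGive computations.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable with integer df."""
    if x < 0 or df < 1:
        raise ValueError("x must be non-negative and df a positive integer")
    z = x / 2.0
    # Starting values: df = 1 uses the complementary error function,
    # df = 2 has the closed form exp(-x / 2).
    if df % 2 == 1:
        prob, a = math.erfc(math.sqrt(z)), 0.5
    else:
        prob, a = math.exp(-z), 1.0
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1);
    # each step raises the degrees of freedom by two.
    while 2 * a < df:
        if z > 0:
            prob += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return prob


# LR statistics and degrees of freedom as reported in Table 3.
restrictions = {"H1": (5.796, 1), "H2": (0.489, 1), "H3": (0.896, 2), "H4": (2.862, 4)}
for name, (lr, df) in restrictions.items():
    print(f"{name}: LR = {lr:.3f}, df = {df}, p-value = {chi2_sf(lr, df):.3f}")
```

Only H1, the conventional Taylor rule without the bond rate, gives a p-value below 5%.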

A Characterization of US Monetary Policy 1988-2002. In the preferred model,

H4, the long-run relation can be written as

ft − bt = −1.637 · ut + 7.797, (7)

which is a Taylor-type target for the interest rate spread with a significant impact from unemployment and a zero impact from inflation beyond that contained in the expected inflation via bt. The constant contains the average value of unemployment and the average shape of the yield curve. Deviations from this relation are corrected primarily by the Federal funds rate, eliminating 8% of a misalignment each month. There is also a negative coefficient in the equation for ∆πt, indicating that a high funds rate suppresses inflation. The effect is not terribly strong, with a t-value of −2.4. The preferred structure is accepted as a reduction of the unrestricted specification with an LR statistic of 2.86, corresponding to a p-value of 0.58 in a χ2(4).

The result that actual inflation is not directly present in the empirical rule could reflect that the period under consideration has been characterized by only moderate inflationary pressure and therefore limited information on the impact of actual inflation in the policy rule. As mentioned before, expected inflation is present with a coefficient of 1 via the long-term interest rate.
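To make the arithmetic behind relation (7) concrete, the following sketch computes the long-run target for the Federal funds rate, the deviation from it, and the monthly correction implied by the adjustment coefficient of −0.080. The function names and the input values in the usage lines are hypothetical, and the half-life calculation ignores the short-run dynamics of the system, so it should be read as a back-of-the-envelope figure only.

```python
import math

ALPHA_F = -0.080   # adjustment coefficient of the Federal funds rate under H4 (Table 3)
BETA_U = 1.637     # long-run coefficient to unemployment under H4
CONSTANT = 7.797   # constant in relation (7)


def long_run_target(bond_rate, unemployment):
    """Target funds rate from (7): f* = b - 1.637 * u + 7.797."""
    return bond_rate - BETA_U * unemployment + CONSTANT


def deviation(funds_rate, bond_rate, unemployment):
    """Misalignment f - f* relative to the long-run target."""
    return funds_rate - long_run_target(bond_rate, unemployment)


def expected_correction(funds_rate, bond_rate, unemployment):
    """Monthly change in the funds rate due to equilibrium correction alone."""
    return ALPHA_F * deviation(funds_rate, bond_rate, unemployment)


def half_life(alpha):
    """Months until half of a misalignment is removed (short-run terms ignored)."""
    return math.log(0.5) / math.log(1.0 + alpha)


# Hypothetical inputs, in percent: funds rate 6.5, bond rate 6.0, unemployment 5.0.
print(long_run_target(6.0, 5.0))
print(expected_correction(6.5, 6.0, 5.0))
print(half_life(ALPHA_F))
```

With 8% of a misalignment removed each month, the implied half-life is a little over eight months under these simplifying assumptions.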

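For the Federal funds rate to be a valid instrument for controlling inflation, it is required that Ĉ41 is significantly negative, where Ĉ is the estimated counterpart to C. The estimated long-run impact matrix is given by

\[
\hat{C} =
\begin{pmatrix}
\underset{(0.19)}{0.167} & \underset{(3.70)}{1.820} & \underset{(-3.05)}{-2.748} & \underset{(-0.67)}{-0.336} \\
\underset{(-0.06)}{-0.025} & \underset{(5.10)}{1.143} & \underset{(-0.34)}{-0.141} & \underset{(0.22)}{0.051} \\
\underset{(-0.32)}{-0.118} & \underset{(-2.00)}{-0.414} & \underset{(4.20)}{1.593} & \underset{(1.13)}{0.236} \\
\underset{(-1.86)}{-0.376} & \underset{(2.63)}{0.299} & \underset{(-1.90)}{-0.397} & \underset{(6.54)}{0.755}
\end{pmatrix}
\]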

with standard normal distributed asymptotic t-values in parentheses. The relevant coefficient is −0.376, indicating that a one percentage point innovation to the Federal funds rate lowers the long-run core inflation rate by slightly less than 0.4 percentage points on average. The parameter is not particularly well-determined, however, with a t-value of −1.86. Again this could reflect that the variation in inflation in the sample period and the amount of information on the monetary transmission channel are limited. For the sample 1985:8-1999:2 and a data set comprising real money, (interpolated) real GDP, monthly inflation as well as 5 interest rates, Johansen and Juselius (2001) find an unexpected positive value for the coefficient Ĉ41.

Graph (A) in Figure 2 depicts the Federal funds rate and the long-run target, while graph (B) depicts deviations of the Federal funds rate from the target together with the deterministic component comprising the constant and the effects of the initial values 1987:9-1987:12. Graph (A) clearly demonstrates that the long-run relation has in general been leading the Federal funds rate when major changes in the latter have occurred, corresponding to visual evidence of the endogeneity of the Federal funds rate in the system. The deviations show that interest rates in the initial period and during 1988 were lower than suggested by the relationship. A common interpretation relates this to concern for financial stability after the crash in the stock market in late 1987.4 The same type of concern might explain the relatively low interest rates in 1999 after the financial crisis in Russia and the problems related to LTCM. Such effects are clearly outside the information set of the current simple model. In 1995/1996 rates were higher than suggested by (7). A likely interpretation is that this was due to a real-time belief that the natural rate of unemployment had not fallen significantly from the level of the early 1990's, as documented by e.g. Orphanides and Williams (2002). They report that the real-time assessment of the natural rate of unemployment in 1995 was 6.0%, while the most recent estimate from CBO in 2002 was 5.3%. It is interesting to note that by the end of the sample, 2002:12, where the Federal funds rate is at a historic low, the policy rate is still above its equilibrium value according to our estimates.
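The estimated long-run impact matrix discussed above can be checked for internal consistency with the preferred structure H4. In the cointegrated VAR the matrix C satisfies β′C = 0 and Cα = 0, so the reported entries should reproduce these identities up to rounding. The sketch below assumes that rows and columns of Ĉ are ordered as (ft, bt, ut, πt), which is consistent with the text's reading of Ĉ41 = −0.376 as the effect of a funds rate innovation on inflation; the variable names are our own.

```python
# Long-run impact matrix as reported in the text; assumed ordering (f, b, u, pi).
C_hat = [
    [0.167, 1.820, -2.748, -0.336],
    [-0.025, 1.143, -0.141, 0.051],
    [-0.118, -0.414, 1.593, 0.236],
    [-0.376, 0.299, -0.397, 0.755],
]

# Preferred structure H4 from Table 3 (the constant is not part of this check).
beta = [1.0, -1.0, 1.637, 0.0]
alpha = [-0.080, 0.0, 0.0, -0.040]

# beta' C: one entry per column of C_hat, each should be close to zero.
beta_C = [sum(beta[i] * C_hat[i][j] for i in range(4)) for j in range(4)]

# C alpha: one entry per row of C_hat, each should be close to zero.
C_alpha = [sum(C_hat[i][j] * alpha[j] for j in range(4)) for i in range(4)]

# Long-run effect of a unit innovation to the funds rate: first column of C_hat.
funds_rate_impact = [row[0] for row in C_hat]

print("beta' C :", [round(v, 4) for v in beta_C])
print("C alpha :", [round(v, 4) for v in C_alpha])
print("impact of funds rate innovation:", funds_rate_impact)
```

Both products should be zero up to the three-decimal rounding of the reported estimates.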

An important issue for the interpretation of the results in terms of a policy reaction function is the structural stability of the estimates. According to the Lucas-critique view on empirical analyses, only deep parameters, such as characterizations of preferences and technical relationships, can be expected to be stable, whereas reaction functions and reduced forms are prone to instabilities following shocks to the system. Reversing this line of argument, stable coefficient estimates can be taken as indicative evidence against the Lucas critique for the present sample. To evaluate the stability of the relation we depict in graph (C) the recursively estimated parameters to unemployment in the long-run relation, see Hansen and Johansen (1999). The estimates look remarkably stable, and the narrowing of the 95% confidence bands indicates increasing information on the parameters. Finally, graph (D) depicts the recursively calculated test statistic for the over-identifying restrictions. The identifying structure is clearly acceptable in all sub-samples.

Short-Run Structure. The VAR used to characterize the long-run properties is heavily over-parametrized, with many insignificant parameters. To illustrate the short-run adjustment we apply a general-to-specific modelling strategy and find a more parsimonious representation, see Hendry and Mizon (1993). Using a conventional 5% critical level

4We have tried to recalculate the analysis starting the effective sample in 1989:1 to remove the effects of the initial misalignment, but the estimation results are only marginally affected.

[Figure 2, four panels plotted against time, 1990-2000: (A) Federal funds rate and long-run target; (B) Long-run relation: deviation from long-run target and deterministic component; (C) Recursive coefficient to unemployment; (D) Recursive test statistic with 5% critical value.]

Figure 2: Deviations from the cointegrating relation and recursive results. The recursive estimation is done for sub-samples t = 1988:1, ..., T0, where the endpoint takes the values T0 = 1991:7, ..., 2002:12. In each sub-sample the short-run parameters are fixed at their full-sample estimates, see Hansen and Johansen (1999). By and large similar results are obtained if the short-run parameters are reestimated in each sub-sample, although a larger initial sample is necessary. (C) depicts the recursive coefficient to unemployment in the long-run relation under H4 together with the 95% confidence band. (D) depicts the test statistics for the 4 over-identifying restrictions in H4 and the 5% critical value for individual tests, see Kongsted (1998).

and retaining the adjustment coefficient α4 to inflation yield the following specification

(∆ft, ∆bt, ∆ut, ∆πt)′ = [−0.085 (−5.15), 0 (...), 0 (...), −0.024 (−1.86)]′ (ft−1 − f∗t−1)

+ [0.188 (2.88), 0.167 (3.47), −0.257 (−3.21), 0 (...), 0.322 (4.57), 0 (...), −0.159 (−3.48), 0 (...)] (∆ft−1, ∆bt−1, ∆ut−1)′

+ [0.149 (2.40), 0 (...), 0.100 (2.17), −0.158 (−2.19)] (∆ft−2, ∆πt−2)′

+ [0.177 (3.08), 0 (...), 0.145 (2.02), 0 (...)] (∆ft−3, ∆ut−3)′

which produces an LR test statistic of 39.89 compared to the unrestricted vector equilibrium correction model, corresponding to a p-value of 0.48 in a χ2(40).

The equation for the Federal funds rate, which is the main focus here, indicates a simple behavior. Besides the autoregressive terms in ∆ft−1, ∆ft−2, and ∆ft−3, which describe the interest rate smoothing, there are additional short-run terms only for the one-period lagged bond rate and unemployment. The coefficient to ∆bt−1 is 0.17, well below the long-run impact of one. The coefficient to the lagged change in the unemployment rate, ∆ut−1, is −0.26. In the parsimonious system the adjustment coefficient α4 in the inflation equation is smaller and less significant than in the unrestricted model, giving less support for the short-run controllability of inflation.
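The behavior of the Federal funds rate equation described above can be illustrated with a small simulation. The sketch below uses the reported coefficients for the equilibrium correction term (−0.085), the three autoregressive terms (0.188, 0.149, 0.177), ∆bt−1 (0.167) and ∆ut−1 (−0.257). Assigning 0.188, 0.149 and 0.177 to ∆ft−1, ∆ft−2 and ∆ft−3 in the funds rate equation is our reading of the specification, and holding the target f∗ fixed while the funds rate adjusts is a purely illustrative assumption; the function names are hypothetical.

```python
def funds_rate_change(deviation_lag, d_f_lags, d_b_lag1=0.0, d_u_lag1=0.0):
    """One-step change in the funds rate implied by the reported coefficients.

    deviation_lag: f(t-1) - f*(t-1), the lagged deviation from the target.
    d_f_lags: lagged changes in the funds rate, ordered (t-1, t-2, t-3).
    """
    return (
        -0.085 * deviation_lag
        + 0.188 * d_f_lags[0]
        + 0.149 * d_f_lags[1]
        + 0.177 * d_f_lags[2]
        + 0.167 * d_b_lag1
        - 0.257 * d_u_lag1
    )


def simulate_deviation(initial_deviation, months):
    """Path of f - f* when the target is held fixed (illustrative assumption)."""
    path = [initial_deviation]
    d_f_lags = [0.0, 0.0, 0.0]
    for _ in range(months):
        change = funds_rate_change(path[-1], d_f_lags)
        path.append(path[-1] + change)
        d_f_lags = [change] + d_f_lags[:2]
    return path


# A funds rate one percentage point above target, followed for twelve months.
path = simulate_deviation(1.0, 12)
print([round(x, 3) for x in path])
```

Because the lagged changes enter with positive coefficients, the smoothing terms reinforce the correction once it is under way, so the misalignment is removed faster than the 8.5% per month suggested by the adjustment coefficient alone.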

Interpretation of a Second Long-Run Relation. In the rank determination in Table 2, a second long-run relation was borderline significant. To illustrate that the main conclusions are robust to the choice of cointegration rank, r, we briefly discuss the possible interpretation of a second long-run relation.

Allowing for a second long-run relation and imposing restrictions on α and β∗ yields the structure

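\[
\begin{pmatrix} \Delta f_t \\ \Delta b_t \\ \Delta u_t \\ \Delta \pi_t \end{pmatrix}
=
\begin{pmatrix}
\underset{(-4.81)}{-0.086} & \underset{(-3.88)}{-0.115} \\
\underset{(...)}{0} & \underset{(...)}{0} \\
\underset{(...)}{0} & \underset{(...)}{0} \\
\underset{(-1.74)}{-0.029} & \underset{(-3.28)}{-0.089}
\end{pmatrix}
\begin{pmatrix}
f_{t-1} - b_{t-1} + \underset{(2.67)}{0.839} \\
u_{t-1} \underset{(-27.67)}{{}- 5.276}
\end{pmatrix}
+ \ldots,
\]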

which is accepted with a test statistic of 6.79 in a χ2(8) distribution. Note that for r = 2, the cointegration space separates into a stationary interest rate spread, ft − bt, and a stationary unemployment rate, ut, while the inflation rate can still be excluded from the long-run relationships. The policy rule is therefore no longer explicitly in the system, but the main conclusions from the analysis of r = 1 are preserved:

The Federal funds rate, ft, significantly equilibrium corrects to both long-run relations, and based on the weights of the two relations in the dynamic equation for ∆ft, the implicit policy rule is still of the form (7), although with a slightly smaller coefficient to unemployment, numerically 0.115/0.086 = 1.34. The bond rate, bt, and unemployment, ut, are still weakly exogenous for the long-run parameters, while inflation, πt, equilibrium corrects to both relations, although the coefficient to the interest rate spread is marginally insignificant, indicating the somewhat weak link from the Federal funds rate to inflation.
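The implicit policy rule for r = 2 follows from setting the combined equilibrium correction in the equation for ∆ft to zero and solving for the spread. The sketch below carries out this arithmetic with the reported weights (−0.086 on the spread relation, −0.115 on the unemployment relation) and constants (0.839 and −5.276); the function name is hypothetical and the calculation ignores estimation uncertainty.

```python
# Weights of the two long-run relations in the equation for the funds rate (r = 2).
ALPHA_SPREAD = -0.086   # weight on f - b + 0.839
ALPHA_UNEMP = -0.115    # weight on u - 5.276

SPREAD_CONST = 0.839    # constant in the spread relation
UNEMP_CONST = -5.276    # constant in the unemployment relation


def implied_rule(alpha_spread, alpha_unemp, spread_const, unemp_const):
    """Return (slope, intercept) of the implied target f - b = -slope * u + intercept.

    Solves alpha_spread * (f - b + spread_const) + alpha_unemp * (u + unemp_const) = 0
    for the spread f - b.
    """
    slope = alpha_unemp / alpha_spread
    intercept = -spread_const - slope * unemp_const
    return slope, intercept


slope, intercept = implied_rule(ALPHA_SPREAD, ALPHA_UNEMP, SPREAD_CONST, UNEMP_CONST)
print(f"f - b = -{slope:.3f} * u + {intercept:.3f}")
```

The implied coefficient to unemployment of about 1.34 can be compared with the 1.637 in relation (7).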

We have found that the interest-rate setting of the FOMC in the period 1988-2002 has been somewhat different from that implied by other research. We can reject that monetary policy has been set according to a traditional Taylor rule. Instead a long-term relationship between the Federal funds rate, the unemployment rate and the long-term interest rate is found. Deviations from this relationship are mainly corrected via changes in the Federal funds rate. This implies that in this small system the Federal funds rate can be considered the endogenous variable, indicating that the relationship has the character of a monetary policy rule. Rates are set as if the FOMC reacts to unemployment and long-term interest rates. From a decision-making point of view a likely interpretation is a reaction to activity expressed by the unemployment rate and the information derived from financial markets expressed by the long-term interest rate.

We are fully aware that the decision-making process in real time has been far more complicated than a literal reading of our results suggests. Forecasts of inflation and activity using different models have been important, as has a careful digestion of recent statistics. As a simple way to summarize factors entering the interest rate setting using ex post data, the results nevertheless provide a better description than the traditional Taylor rule.

The analysis is carried out using a cointegrated vector autoregressive model, allowing for a simultaneous modelling of the long-term relationships and short-run dynamics. When testing against a more traditional Taylor-rule specification, our model suggests that inflation enters the relationship via its impact on expected inflation through the long-term interest rate. We are unable to find a significant role for inflation beyond that. A specification without information from the financial market is statistically rejected, and the dynamic adjustment indicates that although such a relation may look like a simple Taylor (1993) rule, it cannot be interpreted as a policy reaction function because the Federal funds rate in that case is weakly exogenous with respect to the long-run parameters.

References

Ball, L., and R. R. Tchaidze (2002): “The FED and the New Economy,” American Economic Review, 92(2), 108-114.

Bernanke, B. S., and A. S. Blinder (1992): “The Federal Funds Rate and the Channels of Monetary Transmission,” American Economic Review, 82(4), 901-921.

Boswijk, H. P. (1995): “Identifiability of Cointegrated Systems,” Working Paper, Tinbergen Institute and Department of Actuarial Science and Econometrics.

Carey, K. (2001): “Testing for Stabilizing Monetary Policy Rules: How Robust to Alternative Specifications?,” Topics in Macroeconomics, 1(1), 1-16.

Christensen, A. M. (2002): “The Real Interest Rate Gap: Measurement and Application,” Danmarks Nationalbank, Working Paper, WP 6/2002.

Clarida, R., J. Galí, and M. Gertler (1998): “Monetary Policy Rules in Practice: Some International Evidence,” European Economic Review, 42, 1033-1067.

Clarida, R., J. Galí, and M. Gertler (2000): “Monetary Policy Rules and Macroeconomic Stability: Evidence and Some Theory,” The Quarterly Journal of Economics, 115(1), 147-180.

Doornik, J. A. (2001): Object-Oriented Matrix Programming Using Ox. Timberlake Consultants Press, London, 4th edn.

Doornik, J. A., and H. Hansen (1994): “An Omnibus Test for Univariate and Multivariate Normality,” Working Paper, Nuffield College, Oxford.

Doornik, J. A., and D. F. Hendry (1997): Modeling Dynamic Systems Using PcFiml 9.0 for Windows. International Thomson Publishing, London.

English, W. B., W. R. Nelson, and B. P. Sack (2003): “Interpreting the Significance of the Lagged Interest Rate in Estimated Monetary Policy Rules,” Contributions to Macroeconomics, 3(1), Article 5.

Estrella, A., and G. A. Hardouvelis (1991): “The Term Structure as a Predictor of Real Economic Activity,” The Journal of Finance, 46(2), 555-575.

Estrella, A., and F. S. Mishkin (1996): “The Yield Curve as a Predictor of U.S. Recessions,” Current Issues in Economics and Finance, 2(7), 1-6.

Evans, C. L. (1998): “Real-time Taylor Rules and the Federal Funds Futures Market,” Federal Reserve Bank of Chicago, Economic Perspectives, 1998(3), 44-55.

Goodfriend, M. (1998): “Using the Term Structure of Interest Rates for Monetary Policy,” Federal Reserve Bank of Richmond, Economic Quarterly, 84(3), 13-30.

Hansen, H., and S. Johansen (1999): “Some Tests for Parameter Constancy in Cointegrated VAR-Models,” The Econometrics Journal, 2, 306-333.

Hendry, D. F., and G. Mizon (1993): “Evaluating Dynamic Econometric Models by Encompassing the VAR,” in Models, Methods and Applications of Econometrics, ed. by P. Phillips, pp. 272-300. Basil Blackwell, Oxford.

Johansen, S. (1996): Likelihood-Based Inference in Cointegrated Autoregressive Models. Oxford University Press, Oxford, 2nd edn.

Johansen, S. (2002): “A Small Sample Correction for the Test of Cointegration Rank in the Vector Autoregressive Model,” Econometrica, 70(5), 1929-1961.

Johansen, S., and K. Juselius (2001): “Controlling Inflation in a Cointegrated Vector Autoregressive Model with an Application to US Data,” EUI Working Paper ECO No. 2001/2.

Judd, J. P., and G. D. Rudebusch (1998): “Taylor’s Rule and the FED: 1970-1997,” Federal Reserve Bank of San Francisco, Economic Review, 3, 3-16.

Kongsted, H. C. (1998): “Modelling Price and Quantity Relations for Danish Manufacturing Exports,” Journal of Business and Economic Statistics, 16(1), 81-91.

Mehra, Y. P. (2001): “The Bond Rate and Estimated Monetary Policy Rules,” Journal of Economics and Statistics, 53, 345-358.

Mishkin, F. S. (1990): “What Does The Term Structure Tell Us About Future Inflation?,” Journal of Monetary Economics, 25(1), 77-95.

Orphanides, A. (2001): “Monetary Policy Rules Based on Real-Time Data,” American Economic Review, 91(4), 964-985.

Orphanides, A., and J. C. Williams (2002): “Robust Monetary Policy Rules with Unknown Natural Rates,” Brookings Papers on Economic Activity, 2, 63-145.

Rahbek, A., E. Hansen, and J. Dennis (2002): “ARCH Innovations and their Impact on Cointegration Rank Testing,” Working Paper no. 22, Centre for Analytical Finance, University of Copenhagen.

Schiller, R. S. (1990): “The Term Structure of Interest Rates,” in Handbook of Monetary Economics, Vol. I, ed. by B. E. Friedman, and F. H. Hahn, chapter 13, pp. 626-722. North-Holland, Amsterdam.

Svensson, L. E. O. (2003): “What Is Wrong with Taylor Rules? Using Judgment in Monetary Policy through Targeting Rules,” Journal of Economic Literature, 41(2), 426-477.

Taylor, J. B. (1993): “Discretion Versus Policy Rules in Practice,” Carnegie-Rochester Conference Series on Public Policy, 39, 195-214.

Woodford, M. (2003): “A Neo-Wicksellian Framework for the Analysis of Monetary Policy,” in Interest and Prices: Foundations of a Theory of Monetary Policy, chapter 4. Princeton University Press, Princeton.
