Motivation and Goals · Candidate transformations · Horse race setup · Results · Conclusion
Macroeconomic Data Transformation Matters
Philippe Goulet-Coulombe (UPenn), Maxime Leroux (UQAM), Dalibor Stevanovic (UQAM) and Stéphane Surprenant (UQAM)
October 2020
Content
1 Motivation and Goals
2 Candidate transformations
3 Horse race setup
4 Results
5 Conclusion
Accuracy gains from machine learning are well documented. Some recent examples: Kim and Swanson (2018), Goulet-Coulombe et al. (2019) and Medeiros et al. (2019), among others;
Typical time series data transformations used to remove low-frequency movements may be suboptimal for machine learning methods;
Deep neural networks may automate data transformations, but macroeconomic samples are short and noisy, making manual feature engineering advisable (Kuhn and Johnson, 2019);
Moreover, careful feature engineering can encode prior knowledge and help improve forecasting accuracy.
Goals:
Propose rotations of the original data which help encode "time series friendly" priors into machine learning models;
And, of course, compare the performance of several data transformations and combinations thereof;
Note that we focus on transformations of the predictors. Throughout, targets are defined as
y_t^{(h)} = \frac{1}{h} \left( \ln(Y_t) - \ln(Y_{t-h}) \right)

where h is the forecasting horizon.
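As a sketch, the target transformation above can be computed in plain NumPy (the function name and array interface are illustrative, not from the paper):

```python
import numpy as np

def make_target(y_level, h):
    """Average log-difference target y_t^(h) = (1/h) * (ln(Y_t) - ln(Y_{t-h})).

    y_level: array of the raw (positive) series Y_t; h: forecast horizon.
    The first h entries are undefined and returned as NaN.
    """
    logy = np.log(np.asarray(y_level, dtype=float))
    out = np.full(logy.shape, np.nan)
    out[h:] = (logy[h:] - logy[:-h]) / h
    return out
```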
Model: y_{t+h} = g(f_Z(H_t)) + \varepsilon_{t+h}

Objective: \min_{g \in \mathcal{G}} \left\{ \sum_{t=1}^{T} \left( y_{t+h} - g(f_Z(H_t)) \right)^2 + \text{pen}(g, \tau) \right\}

H_t is the data, f_Z is the feature engineering step, g is the model and pen is a penalty function with hyperparameter vector \tau. Hence Z_t := f_Z(H_t) is the feature matrix.
Forecast error decomposition:

y_{t+h} - \hat{y}_{t+h} = \underbrace{g^*(f_Z^*(H_t)) - g(f_Z(H_t))}_{\text{approximation error}} + \underbrace{g(Z_t) - \hat{g}(Z_t)}_{\text{estimation error}} + e_{t+h}

We focus on how the choice of f_Z (transformations or combinations thereof) impacts forecast accuracy.
We consider "older" or more common candidates:
X: First differences of the data in levels or logarithms.
F: PCA estimates of linear latent factors of X, as in Stock and Watson (2002a,b) and Bai and Ng (2008).
H: (Log-)levels of the series.
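A minimal sketch of the X and F transformations in NumPy (the function name and the standardize-before-PCA choice are ours; the paper's exact factor-selection procedure is not reproduced):

```python
import numpy as np

def make_X_and_F(H, n_factors):
    """X: log first differences of the raw panel H (T x K, positive series).
    F: principal-component estimates of the latent factors of X,
    in the spirit of Stock and Watson (2002a,b)."""
    X = np.diff(np.log(H), axis=0)
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize before PCA
    U, s, _ = np.linalg.svd(Xs, full_matrices=False)
    F = U[:, :n_factors] * s[:n_factors]       # leading principal components
    return X, F
```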
We consider "newer" or less common candidates:
MARX (moving average rotation of X): we use order p = 1, ..., P_M moving averages of each variable in X. This is motivated by Shiller (1973). Consider the model

y_t = \sum_{p=1}^{P} X_{t-p} \beta_p + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma_\varepsilon^2)
\beta_p = \beta_{p-1} + u_p, \quad u_p \sim N(0, \sigma_u^2 I_K).

This can be estimated by a ridge regression using inputs Z := XC, where C = I_K \otimes c and c is a lower-triangular matrix of ones.
This transformation implicitly shrinks \beta_p towards \beta_{p-1}.
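As a sketch, the rotation Z = XC amounts to taking partial sums across lags, which a cumulative sum implements directly. The normalization by p, which turns partial sums into order-p moving averages, is our reading of the slide rather than a detail it states:

```python
import numpy as np

def marx_features(X_lags, normalize=True):
    """MARX rotation for one variable.

    X_lags: T x P array whose columns are lags 1..P of the variable.
    Multiplying by the lower-triangular matrix of ones c makes column p
    the partial sum of lags 1..p; dividing by p gives the order-p
    moving average."""
    Z = np.cumsum(X_lags, axis=1)              # equivalent to X_lags @ c'
    if normalize:
        Z = Z / np.arange(1, X_lags.shape[1] + 1)
    return Z
```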
MAF (moving average factors): let \mathbb{X}_{t,k} be the lag matrix of the k-th variable, defined as

\mathbb{X}_{t,k} = [X_{t,k}, LX_{t,k}, \ldots, L^{P_{MAF}} X_{t,k}]
\mathbb{X}_{t,k} = M_t \Gamma_k' + \varepsilon_{k,t}.

We estimate M_t by PCA and use the same number of factors for all variables.
This is related to Singular Spectrum Analysis, except that SSA would use the whole common component instead of focusing on the latent factors.
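A minimal per-variable sketch, assuming PCA by SVD on the centered lag matrix (the function name is ours):

```python
import numpy as np

def maf_features(x, p, r):
    """Moving average factors for one series x (length T):
    build the lag matrix [x_t, Lx_t, ..., L^p x_t], then keep the first
    r principal components as the estimate of M_t."""
    T = len(x)
    lagmat = np.column_stack([x[p - j : T - j] for j in range(p + 1)])
    lagmat = lagmat - lagmat.mean(axis=0)      # center before PCA
    U, s, _ = np.linalg.svd(lagmat, full_matrices=False)
    return U[:, :r] * s[:r]                    # factor estimates, (T-p) x r
```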
Table – Summary: Feature Matrices and Models

Transformation   Feature matrix
F       Z_t^{(F)} := [F_t, LF_t, ..., L^{p_F} F_t]
X       Z_t^{(X)} := [X_t, LX_t, ..., L^{p_X} X_t]
MARX    Z_t^{(MARX)} := [MARX_{1t}^{(1)}, ..., MARX_{1t}^{(p_MARX)}, ..., MARX_{Kt}^{(1)}, ..., MARX_{Kt}^{(p_MARX)}]
MAF     Z_t^{(MAF)} := [MAF_{1t}^{(1)}, ..., MAF_{1t}^{(r_1)}, ..., MAF_{Kt}^{(1)}, ..., MAF_{Kt}^{(r_K)}]
Level   Z_t^{(Level)} := [H_t, LH_t, ..., L^{p_H} H_t]

Model                 Functional space
Autoregression (AR)   Linear
Factor Model (FM)     Linear
Adaptive Lasso (AL)   Linear
Elastic Net (EN)      Linear
Linear Boosting (LB)  Linear
Random Forest (RF)    Nonlinear
Boosted Trees (BT)    Nonlinear
Several combinations of many of those transformations are also considered.
The Horse Race
Direct forecasts
Data: FRED-MD
Targets: industrial production, nonfarm employment, unemployment rate, real personal income, real personal consumption, retail and food services sales, housing starts, M2 money stock, consumer and producer price indices
Horizons: h ∈ {1, 3, 6, 9, 12, 24}
POOS period: 1980M1–2017M12
Estimation window: expanding from 1960M1
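The pseudo-out-of-sample exercise can be sketched as an expanding-window loop for direct h-step forecasts. The `fit`/`predict` callables stand in for any of the models above; the names and interface are illustrative, not the paper's code:

```python
import numpy as np

def expanding_window_forecasts(Z, y, h, first_oos, fit, predict):
    """Direct h-step forecasts with an expanding estimation window.

    At each origin t >= first_oos, the model is re-estimated on all pairs
    (Z_s, y_{s+h}) with s + h <= t, then used to forecast y_{t+h} from Z_t.
    """
    preds = []
    for t in range(first_oos, len(y) - h):
        model = fit(Z[: t - h + 1], y[h : t + 1])  # features up to t-h, targets up to t
        preds.append(predict(model, Z[t]))
    return np.array(preds)
```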
Table – Best Specifications in Terms of MSE

      INDPRO  EMP  UNRATE  INCOME  CONS  RETAIL  HOUST
H=1   RF      RF   LB      BT      FM    FM      FM
H=3   RF      RF   RF      RF      FM    AL      BT
H=6   AL      RF   LB      RF      RF    AL      BT
H=9   LB      LB   EN      RF      RF    AL      RF
H=12  RF      LB   RF      RF      RF    RF      RF
H=24  RF      LB   RF      RF      RF    EN      RF

      M2  CPI  PPI
H=1   BT  EN   EN
H=3   BT  RF   EN
H=6   BT  RF   RF
H=9   BT  RF   RF
H=12  BT  EN   RF
H=24  RF  AL   RF

Note: Bullet colors (not recoverable in this text rendering) represent the data transformations included in the best model specifications: F, MARX, X, L, MAF.
The marginal contribution of a transformation to model performance can be evaluated using a panel regression:
R^2_{t,h,v,m} = \alpha_f + \psi_{t,v,h} + v_{t,h,v,m}

where

R^2_{t,h,v,m} := 1 - \frac{\varepsilon^2_{t,h,v,m}}{\frac{1}{T} \sum_{t=1}^{T} \left( y^{(h)}_{v,t} - \bar{y}^{(h)}_v \right)^2}

\psi_{t,v,h} are time, variable and horizon fixed effects;
\varepsilon_{t,h,v,m} is the forecast error at time t, horizon h, variable v and model m;
\alpha_f is one of \alpha_{MARX}, \alpha_{MAF}, \alpha_F, associated with the corresponding transformation. The null hypothesis is \alpha_f = 0.
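A sketch of the pseudo-R² that serves as the regression's dependent variable (the function name is ours; the denominator is the target's variance over the evaluation window):

```python
import numpy as np

def pseudo_r2(errors, y_target):
    """Per-period pseudo-R2: 1 - e_t^2 / ((1/T) * sum_t (y_t - ybar)^2).

    errors: forecast errors e_t for one (h, v, m) cell;
    y_target: realized target series used to form the variance benchmark.
    """
    denom = np.mean((y_target - y_target.mean()) ** 2)
    return 1.0 - np.asarray(errors) ** 2 / denom
```

A perfect forecast yields R² = 1 every period; errors larger than the target's standard deviation push it negative, so the panel regression picks up when a transformation systematically raises this measure.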
Main Findings
MARX is especially potent when used in combination with nonlinear models to forecast measures of real activity. This is especially true around recessions (see the cumulative squared errors for real activity variables).
Factors help a lot with Random Forest and Boosted Trees, especially at longer horizons, in line with results in Goulet-Coulombe et al. (2019).
Random Forests with factors gained a lot of ground for year-ahead forecasting (see the cumulative squared errors for CPI).
MAF regressions focus on models which include X. Results are more muted, but MAF seems to help Random Forests and Linear Boosting at horizons of 6 and 9 months.
Conclusion
At shorter horizons, combining non-standard and standard data transformations helps reduce RMSE.
MARX is especially potent when used in combination with nonlinear models to forecast measures of real activity, especially around recessions.
Factors remain one of the most effective feature engineering tools available for macroeconomic forecasting, even for inflation.
[Figure – MAF Average Treatment Effects ((h,v) Subsets). Estimated effects by model (Adaptive Lasso, Elastic Net, Linear Boosting, Random Forest, Boosted Trees) and horizon (H=1, 3, 6, 9, 12, 24) for INDPRO, EMP, UNRATE, INCOME, CONS, RETAIL, HOUST, M2, CPI and PPI; effect axis from −0.4 to 0.4.]
[Figure – F Average Treatment Effects ((h,v) Subsets). Estimated effects by model (Adaptive Lasso, Elastic Net, Linear Boosting, Random Forest, Boosted Trees) and horizon (H=1, 3, 6, 9, 12, 24) for INDPRO, EMP, UNRATE, INCOME, CONS, RETAIL, HOUST, M2, CPI and PPI; effect axis from −0.50 to 0.75.]
[Figure – MARX Average Treatment Effects ((h,v) Subsets). Estimated effects by model (Adaptive Lasso, Elastic Net, Linear Boosting, Random Forest, Boosted Trees) and horizon (H=1, 3, 6, 9, 12, 24) for INDPRO, EMP, UNRATE, INCOME, CONS, RETAIL, HOUST, M2, CPI and PPI; effect axis from −0.4 to 0.4.]
[Figure – Cumulative Squared Errors (Random Forest). Two panels, Employment (3 months) and Industrial Production (3 months), over 1980–2010s, comparing the F, F-X-MARX and X transformations.]
[Figure – Cumulative Squared Errors (Random Forest). One panel, CPI Inflation (12 months), over 1980–2010s, comparing the F, F-X-MARX and X transformations.]
[Figure – Fluctuation test statistics for Random Forest and Boosted Trees on INDPRO, EMP and UNRATE, 1994–2014.]
Notes: Giacomini-Rossi fluctuation tests for 3 months at 10%. Improvements lie above the upper line. Transformations: F, F-X, F-MARX, F-X-MARX, F-X-MARX-Level, F-X-Level, F-MAF, F-X-MAF.
[Figure – Fluctuation test statistics for Random Forest and Boosted Trees on M2, CPI and PPI, 1994–2014.]
Notes: Giacomini-Rossi fluctuation tests for 12 months at 10%. Improvements lie above the upper line. Transformations: F, F-X, F-MARX, F-X-MARX, F-X-MARX-Level, F-X-Level, F-MAF, F-X-MAF.
[Figure – Fluctuation test statistics for Random Forest and Boosted Trees on INCOME, CONS, RETAIL and HOUST, 1994–2014.]
Notes: Giacomini-Rossi fluctuation tests for 12 months at 10%. Improvements lie above the upper line. Transformations: F, F-X, F-MARX, F-X-MARX, F-X-MARX-Level, F-X-Level, F-MAF, F-X-MAF.