Development of a Macro Editing Approach Work Session on Statistical Data Editing, Topic v: Editing based on results 21-23 April 2008 WP 30


Page 1:

Development of a Macro Editing Approach

Work Session on Statistical Data Editing, Topic v: Editing based on results, 21-23 April 2008, WP 30

Page 2:

Overview

• Introduction to the series of surveys that measure U.S. petroleum product supplied

• Limitations of micro editing and the need for an edit approach at the aggregate level

• Approach considered for macro editing and the three types of models developed, using one product as an example

• In-sample forecast results and out-of-sample forecast performance results

• Summary and conclusions

Page 3:

The PSRS and Micro Edit Limitations

• The surveys, respondents and data collected
– WPSRS: Weekly, six cut-off sample surveys
– MPSRS: Monthly, nine population census surveys
– PSA: Annual, revised monthly estimates, population census

• Limitations
– Variability of responses
– Lagged population coverage

• Corrective Measures
– Micro editing
– Imputation

Page 4:

The Approach

• Purpose of Study
– Develop point and interval forecasts at national and regional levels
– One-month-ahead forecast

• Approach
– Econometric time-series models
– Three models: Base, ARMA, and Supplemental Models
– Micro editing enhanced by providing capabilities to identify outliers at the aggregate level
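As a rough illustration of how a point and interval forecast can serve as an aggregate-level edit, the sketch below builds a band around a one-month-ahead forecast and flags a reported aggregate that falls outside it. The function name, the numbers in the usage lines, and the default width of 2 standard errors (borrowed from the bands shown in the model-fit charts) are assumptions for illustration, not the presenters' implementation.

```python
from dataclasses import dataclass


@dataclass
class MacroEditResult:
    lower: float   # lower bound of the expected range
    upper: float   # upper bound of the expected range
    flagged: bool  # True when the reported aggregate is outside the range


def macro_edit_check(reported, point_forecast, forecast_se, width=2.0):
    """Compare a reported aggregate with forecast +/- width * standard error.

    `width` is an assumed tuning parameter; 2.0 mirrors the chart bands.
    """
    if forecast_se < 0:
        raise ValueError("forecast_se must be non-negative")
    lower = point_forecast - width * forecast_se
    upper = point_forecast + width * forecast_se
    return MacroEditResult(lower, upper, not (lower <= reported <= upper))


# Hypothetical values in thousand barrels per day.
inside = macro_edit_check(reported=3900.0, point_forecast=3800.0, forecast_se=100.0)
outside = macro_edit_check(reported=4100.0, point_forecast=3800.0, forecast_se=100.0)
print(inside.flagged, outside.flagged)
```

A flag here only says the aggregate is unusual relative to the model; it does not say whether the cause is respondent error or a real change in demand.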

Page 5:

Model Development

• Model at product level
– Distillate (Low Sulfur, High Sulfur, Total)
– Gasoline

• Model at two geographic levels
– National
– Regional (PADD)

Page 6:

Model Forms

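• Base Model: trends and seasonal factors, expressed as:

$$\text{Demand}_t = \beta_0 + \beta_1\,\text{Trend} + \alpha_j\,\text{Shift}_j + \sum_{k=2}^{12} \gamma_k D_k + \text{AR or MA terms}$$

• ARMA Model: Box-Jenkins approach utilizing AR and MA terms to capture the variation and seasonal pattern, expressed as:

$$\text{Demand}_t = \beta_0 + \beta_1\,\text{Trend} + \alpha_j\,\text{Shift}_j + \sum_{m=1}^{n} \rho_m\,\text{ARMA}_m$$

• Supplemental Model: Base Model with exogenous variables, expressed as:

$$\text{Demand}_t = \beta_0 + \beta_1\,\text{Trend} + \alpha_j\,\text{Shift}_j + \sum_{k=2}^{12} \gamma_k D_k + \sum_{i} \theta_i\,\text{Exog}_i + \text{AR or MA terms}$$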
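To make the Base Model form concrete, the following is a minimal sketch that fits an intercept, a linear trend, an optional level-shift dummy, and eleven monthly dummies by ordinary least squares with NumPy, then produces a one-month-ahead point forecast. It deliberately omits the AR or MA error terms, assumes the series starts in January, and uses hypothetical function names; it is not the presenters' estimation code.

```python
import numpy as np


def base_design_matrix(n_months, shift_start=None):
    """Columns: intercept, trend, optional shift dummy, 11 month dummies."""
    t = np.arange(n_months)
    cols = [np.ones(n_months), t.astype(float)]
    if shift_start is not None:
        # Level shift: 0 before shift_start, 1 from that month onward.
        cols.append((t >= shift_start).astype(float))
    month = t % 12  # assumption: observation 0 is a January
    for k in range(1, 12):  # January is the omitted reference month
        cols.append((month == k).astype(float))
    return np.column_stack(cols)


def fit_base_model(demand, shift_start=None):
    """Estimate coefficients by ordinary least squares (no AR/MA terms)."""
    y = np.asarray(demand, dtype=float)
    X = base_design_matrix(len(y), shift_start)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def forecast_next_month(coef, n_months, shift_start=None):
    """Point forecast for the month immediately after the sample."""
    X = base_design_matrix(n_months + 1, shift_start)
    return float(X[-1] @ coef)


# Synthetic two-year series: level 3000, trend 5 per month, July bump of 100.
demand = [3000 + 5 * t + (100 if t % 12 == 6 else 0) for t in range(24)]
coef = fit_base_model(demand)
print(forecast_next_month(coef, len(demand)))
```

A Supplemental-style variant would append exogenous columns to the same design matrix; modelling the AR or MA terms would call for a time-series library rather than plain least squares.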

Page 7:

US Distillate Demand: 1996-2006

[Figure: US Distillate: Total. X-axis: 1996 to 2006; y-axis: Thousand Barrels per Day.]

Page 8:

US Distillate Demand: 1996-2006

[Figure: US Distillate: High Sulfur. X-axis: 1996 to 2006; y-axis: Thousand Barrels per Day.]

Page 9:

US Distillate Demand: 1996-2006

[Figure: US Distillate: Low Sulfur. X-axis: 1996 to 2006; y-axis: Thousand Barrels per Day.]

Page 10:

In-Sample One-Month-Out Forecast Evaluation Statistics

Total Distillate Models

            Base      ARMA      Suppl.
RMSE        100.55    126.48    89.36
MAE          83.03     97.32    73.96
MAPE (%)      2.22      2.59     1.98

HSD Models

            Base      ARMA      Suppl.
RMSE         83.48    109.39    71.22
MAE          63.67     84.96    54.99
MAPE (%)      5.76      7.59     5.13

LSD Models

            Base      ARMA      Suppl.
RMSE         74.74     94.84    74.36
MAE          60.22     76.42    59.85
MAPE (%)      2.28      2.93     2.26

Note: There is no evidence of bias in any of the models
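For readers who want to reproduce statistics of this kind, the sketch below computes RMSE, MAE, and MAPE from paired actual and forecast values using their standard definitions. It also returns the mean error as one simple bias indicator; the source does not say how bias was assessed, so that choice and the function name are assumptions.

```python
import math


def forecast_errors(actual, forecast):
    """Return RMSE, MAE, MAPE (percent) and mean error for paired series."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    return {
        # Root mean squared error: penalises large misses more heavily.
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        # Mean absolute error: average size of a miss, in the data's units.
        "mae": sum(abs(e) for e in errors) / n,
        # Mean absolute percentage error; requires non-zero actual values.
        "mape": 100.0 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n,
        # Mean error: values far from zero suggest systematic over/under-forecasting.
        "mean_error": sum(errors) / n,
    }


# Hypothetical monthly values in thousand barrels per day.
stats = forecast_errors([3700.0, 3900.0, 4100.0], [3650.0, 3980.0, 4060.0])
print(stats)
```

RMSE and MAE carry the units of the series, while MAPE is scale-free, which is why it is the easier statistic to compare across Total, HSD, and LSD.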

Page 11:

U.S. Distillate Demand: Best Model Summary Statistics

                      Total    HSD      LSD
Adjusted R2           0.909    0.898    0.949
S.E. of Regression    96.58    76.98    79.55

Note: Estimation period Jan 1996 through Dec 2006
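The two summary statistics above have standard textbook definitions, sketched below for a regression with an intercept. The number of regressors in the presenters' best models is not given in the slides, so `n_regressors` is a hypothetical input and the data in the usage lines are made up.

```python
import math


def regression_summary(actual, fitted, n_regressors):
    """Return (adjusted R-squared, standard error of regression).

    n_regressors counts explanatory variables excluding the intercept.
    """
    n = len(actual)
    dof = n - n_regressors - 1  # residual degrees of freedom
    if len(fitted) != n or dof <= 0:
        raise ValueError("need equal-length series and positive degrees of freedom")
    mean_actual = sum(actual) / n
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r_squared = 1.0 - ss_res / ss_tot
    # Adjusted R-squared penalises additional regressors.
    adj_r_squared = 1.0 - (1.0 - r_squared) * (n - 1) / dof
    # Standard error of regression: residual spread in the data's units.
    se_regression = math.sqrt(ss_res / dof)
    return adj_r_squared, se_regression


adj_r2, se = regression_summary(
    [1.0, 2.0, 3.0, 4.0, 5.0], [1.1, 1.9, 3.0, 4.1, 4.9], n_regressors=1
)
print(adj_r2, se)
```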

Page 12:

In-Sample Model Fit: Best Model 2000-2006 (± 2 forecast standard errors)

[Figure: US: Total Distillate. Series: Total Distillate: US Model. X-axis: Jan-00 to Jan-06; y-axis: Thousand Barrels / Day.]

Page 13:

In-Sample Model Fit: Best Model 2000-2006 (± 2 forecast standard errors)

[Figure: US: HSD Demand. Series: Total HSD: US Model. X-axis: Jan-00 to Jan-06; y-axis: Thousand Barrels / Day.]

Page 14:

In-Sample Model Fit: Best Model 2000-2006 (± 2 forecast standard errors)

[Figure: US: LSD Demand. Series: Total LSD: US Model. X-axis: Jan-00 to Jan-06; y-axis: Thousand Barrels / Day.]

Page 15:

Out-of-Sample Forecast Results: Best Model 2006-2007

[Figure: US: Total Distillate. Series: Total Distillate: US Model. Periods marked: In-Sample, Out-of-Sample. X-axis: Jan-06 to Jul-07; y-axis: Thousand Barrels / Day.]

Page 16:

Out-of-Sample Forecast Results: Best Model 2006-2007, HSD

[Figure: US: Total HSD. Series: Total HSD: US Model. Periods marked: In-Sample, Out-of-Sample. X-axis: Jan-06 to Jul-07; y-axis: Thousand Barrels / Day.]

Page 17:

Out-of-Sample Results: Best Model 2006-2007, LSD

[Figure: US: Total LSD. Series: Total LSD: US Model. Periods marked: In-Sample, Out-of-Sample. X-axis: Jan-06 to Jul-07; y-axis: Thousand Barrels / Day.]

Page 18:

Regional Models

• Regions: Petroleum Administration for Defense Districts (PADDs)

• Identify exogenous variables to explain regional patterns of distillate demand
– Residential heating in the Northeast (PADD 1): Heating Degree-Days
– Agriculture in the Midwest (PADD 2): Precipitation

HDD DEV: Population-Weighted Heating Degree-Days: Deviation from Normal
PRECIP DEV: Area-Weighted Precipitation: Deviation from Long-Term Normal
EMP TRANS: Employment in Transportation Industries
IPI MFG: Index of Industrial Production for Durable Goods
FREIGHT INDX: Transportation Services Index for Freight
PRICE RATIO: Average monthly spot price ratio: No. 2 Fuel Oil / Natural Gas

Exogenous Variables Used in Supplemental Distillate Models

PADD 1 PADD 2 PADD 3 PADD 5 NATIONAL

HSD LSD TOT HSD LSD TOT HSD LSD TOT HSD LSD TOT HSD LSD TOT

HDD DEV X X X X X

PRECIP DEV X X X X

EMP TRANS X X

IPI MFG X

FREIGHT INDX X X X X

PRICE RATIO X
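The HDD DEV variable is described above as population-weighted heating degree-days expressed as a deviation from normal. The slides do not say how it was constructed, so the sketch below is only one plausible reading: daily heating degree-days against the conventional U.S. base of 65°F, a population-weighted average across areas, and a simple difference from a supplied normal. All names, weights, and numbers are hypothetical.

```python
def heating_degree_days(daily_mean_temps_f, base_f=65.0):
    """Sum of daily shortfalls below the base temperature (assumed 65 F)."""
    return sum(max(0.0, base_f - temp) for temp in daily_mean_temps_f)


def population_weighted_hdd(hdd_by_area, population_by_area):
    """Average HDD across areas, weighting each area by its population."""
    total_population = sum(population_by_area[area] for area in hdd_by_area)
    if total_population <= 0:
        raise ValueError("total population must be positive")
    weighted_sum = sum(
        hdd_by_area[area] * population_by_area[area] for area in hdd_by_area
    )
    return weighted_sum / total_population


def hdd_deviation(observed_hdd, normal_hdd):
    """Positive values indicate a colder-than-normal period."""
    return observed_hdd - normal_hdd


# Hypothetical monthly HDD totals and populations (millions) for two areas.
observed = population_weighted_hdd({"A": 100.0, "B": 200.0}, {"A": 1.0, "B": 3.0})
print(hdd_deviation(observed, normal_hdd=150.0))
```

PRECIP DEV could presumably be built the same way with area weights in place of population weights, though the slides give no detail on that either.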

Page 19:

Regional Model Details: In-Sample Model Fit, PADD 1 HSD

[Figure: PADD 1: HSD. Series: HSD: PADD 1 Model. X-axis: Jan-00 to Jan-06; y-axis: Thousand Barrels / Day.]

Page 20:

Regional Model Details: In-Sample Model Fit, PADD 1, LSD

[Figure: PADD 1: LSD. Series: LSD: PADD 1 Model. X-axis: Jan-00 to Jan-06; y-axis: Thousand Barrels / Day.]

Page 21:

Regional Model Details: In-Sample Model Fit, PADD 2, HSD

[Figure: PADD 2: HSD. Series: HSD: PADD 2 Model. X-axis: Jan-00 to Jan-06; y-axis: Thousand Barrels / Day.]

Page 22:

Regional Model Details: In-Sample Model Fit, PADD 2, LSD

[Figure: PADD 2: LSD. Series: LSD: PADD 2 Model. X-axis: Jan-00 to Jan-06; y-axis: Thousand Barrels / Day.]

Page 23:

Regional Model Details: Out-of-Sample Forecast Results, PADD 1, HSD

[Figure: PADD 1: HSD. Series: HSD: PADD 1 Model. Periods marked: In-Sample, Out-of-Sample. X-axis: Jan-06 to Jul-07; y-axis: Thousand Barrels / Day.]

Page 24:

Regional Model Details: Out-of-Sample Forecast Results, PADD 1, LSD

[Figure: PADD 1: LSD. Series: LSD: PADD 1 Model. Periods marked: In-Sample, Out-of-Sample. X-axis: Jan-06 to Jul-07; y-axis: Thousand Barrels / Day.]

Page 25:

Regional Model Details: Out-of-Sample Forecast Results, PADD 2, HSD

[Figure: PADD 2: HSD. Series: HSD: PADD 2 Model. Periods marked: In-Sample, Out-of-Sample. X-axis: Jan-06 to Jul-07; y-axis: Thousand Barrels / Day.]

Page 26:

Regional Model Details: Out-of-Sample Forecast Results, PADD 2, LSD

[Figure: PADD 2: LSD. Series: LSD: PADD 2 Model. Periods marked: In-Sample, Out-of-Sample. X-axis: Jan-06 to Jul-07; y-axis: Thousand Barrels / Day.]

Page 27:

Benefits & Limitations

• How does this improve EIA’s current activities?
– Establishes a range of expected results at the aggregate level that will alert a reviewer when to investigate possible anomalies in the respondent data
– Can identify the region which provides the largest contribution to a deviation, guiding further editing and imputation activities prior to data release
– Reduces risk of revisions to released data

• Limitations of Modeling
– Reasons for deviations are not always readily apparent: respondent error, structural shifts in consumption, or failure of the model to respond to external influences
– Regional-level models provide guidance, but not necessarily answers
– Ranges may be too large
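One way to picture the regional step is sketched below: given reported and forecast values per region, it computes each region's deviation and returns the region with the largest absolute deviation as the first place to look. The slides do not define how "largest contribution" is measured, so the absolute-difference rule, the function name, and the figures are assumptions.

```python
def largest_regional_deviation(reported_by_region, forecast_by_region):
    """Return (region, deviations) where region has the largest |reported - forecast|."""
    if not reported_by_region:
        raise ValueError("no regions supplied")
    if set(reported_by_region) != set(forecast_by_region):
        raise ValueError("reported and forecast regions must match")
    deviations = {
        region: reported_by_region[region] - forecast_by_region[region]
        for region in reported_by_region
    }
    # Rank by absolute size so large shortfalls count as much as large excesses.
    worst = max(deviations, key=lambda region: abs(deviations[region]))
    return worst, deviations


# Hypothetical regional values in thousand barrels per day.
reported = {"PADD 1": 1250.0, "PADD 2": 1180.0, "PADD 3": 640.0, "PADD 5": 520.0}
forecast = {"PADD 1": 1100.0, "PADD 2": 1200.0, "PADD 3": 650.0, "PADD 5": 515.0}
region, deviations = largest_regional_deviation(reported, forecast)
print(region, deviations[region])
```

Scaling each deviation by that region's forecast standard error would be a reasonable alternative ranking, since regions differ in size and volatility.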

Page 28:

Future Plans

• Model improvements
– Dynamic adjustments to known issues like shifts
– Better exogenous variables

• Automation of gathering and formatting model inputs
– Weather Data
– Economic Data
– Forecast generation

• Expand to other key petroleum products
– Gasoline and gasoline subcomponents (currently underway)
– Residual fuel oil