MEFM: An R package for long-term probabilistic forecasting of electricity demand

Rob J Hyndman
Joint work with Shu Fan

MEFM: long-term probabilistic demand forecasting

Upload: rob-hyndman

Post on 07-Aug-2015



South Australian demand data

[Figure: SA state-wide demand (GW), summer 2015. Vertical axis from 1.0 to 3.0 GW; horizontal axis October to March.]

Temperature data (Sth Aust)

[Figure: South Australian temperature data.]

Predictors

- calendar effects
- prevailing and recent weather conditions
- climate changes
- economic and demographic changes
- changing technology

Modelling framework

- Semi-parametric additive models with correlated errors.
- Each half-hour period modelled separately for each season.


Monash Electricity Forecasting Model

$y_t^* = y_t / \bar{y}_i$

$y_t$ denotes per capita demand at time $t$ (measured in half-hourly intervals);

$\bar{y}_i$ is the average demand for quarter $i$, where $t$ is in quarter $i$;

$y_t^*$ is the standardized demand for time $t$.

$\log(y_t) = \log(\bar{y}_i) + \log(y_t^*)$

$\log(\bar{y}_i) = f(\text{GSP, price, HDD, CDD}) + \varepsilon_i$

$\log(y_t^*) = f(\text{calendar effects, temperatures}) + e_t$
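The decomposition can be checked numerically. The sketch below uses made-up half-hourly data; the names `demand`, `quarter`, `ybar` and `ystar` are illustrative assumptions and are not part of the MEFM package. It shows that standardizing by the quarterly average makes the log identity hold by construction.

```r
# Illustrative data only: two "quarters" of 90 days, 48 half-hours per day.
set.seed(1)
n_per_quarter <- 90 * 48
quarter <- rep(1:2, each = n_per_quarter)
demand  <- exp(rnorm(2 * n_per_quarter, mean = 0.3, sd = 0.2))  # y_t > 0

# Average demand for the quarter containing each time t.
ybar <- ave(demand, quarter, FUN = mean)

# Standardized demand: y*_t = y_t / ybar_i.
ystar <- demand / ybar

# The additive identity on the log scale holds exactly.
stopifnot(isTRUE(all.equal(log(demand), log(ybar) + log(ystar))))

# Within each quarter the standardized demand averages to 1.
tapply(ystar, quarter, mean)
```

Splitting the series this way lets slow-moving drivers (economy, price, degree days) and fast-moving drivers (calendar, temperature) be modelled separately.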


Annual sub-model


$\log(\bar{y}_i) = \log(\bar{y}_{i-1}) + \sum_j c_j (z_{j,i} - z_{j,i-1}) + \varepsilon_i$

- First differences modelled to avoid non-stationary variables.
- Predictors: per-capita GSP, price, summer CDD, winter HDD.
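A minimal sketch of a first-difference regression of this form, using made-up data. The column names (`gsp`, `price`, `cdd`), the coefficient values and the forecast inputs are all assumptions for illustration; this is not the package's own estimation code.

```r
# Made-up annual data; column names are assumptions for this sketch only.
set.seed(2)
n <- 25
z <- data.frame(
  gsp   = 20 + cumsum(rnorm(n, 0.5, 0.2)),   # per-capita GSP (trending)
  price = 15 + cumsum(rnorm(n, 0.3, 0.3)),   # price (trending)
  cdd   = rnorm(n, 500, 60)                  # summer cooling degree days
)

# First differences of the drivers: z_{j,i} - z_{j,i-1}.
dz <- as.data.frame(lapply(z, diff))

# Simulate differenced log demand from known coefficients c_j plus noise.
true_c <- c(gsp = 0.02, price = -0.01, cdd = 0.0004)
dz$dlogy <- drop(as.matrix(dz) %*% true_c) + rnorm(n - 1, sd = 0.005)

# Regress differenced log demand on differenced drivers (no intercept,
# matching the equation above).
fit <- lm(dlogy ~ 0 + gsp + price + cdd, data = dz)
coef(fit)

# One-step forecast: add the predicted change to the last observed log level.
logy_last <- log(1.4)                                  # hypothetical last level
dz_next   <- data.frame(gsp = 0.5, price = 0.3, cdd = 0)
logy_next <- logy_last + predict(fit, newdata = dz_next)
exp(logy_next)
```

Because only changes enter the regression, trending levels of GSP and price do not produce a spurious fit.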


Half-hourly sub-model


Calendar effects

- “Time of summer” effect (a regression spline)
- Day of week factor (7 levels)
- Public holiday factor (4 levels)
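A sketch of how these three calendar terms might enter a regression for a single half-hour period. The data are made up, the holiday level labels are hypothetical (the slides only say there are four levels), and a plain `lm` fit stands in for whatever estimation the package performs.

```r
library(splines)  # ns(): natural cubic spline basis

set.seed(3)
n_days <- 151                                   # one made-up summer season
dow <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
hol_levels <- c("Normal", "Day before", "Holiday", "Day after")  # hypothetical

holiday <- rep("Normal", n_days)
holiday[c(29, 89)] <- "Day before"
holiday[c(30, 90)] <- "Holiday"
holiday[c(31, 91)] <- "Day after"

cal <- data.frame(
  timeofyear = seq_len(n_days) / n_days,        # position within the season
  day        = factor(rep_len(dow, n_days), levels = dow),
  holiday    = factor(holiday, levels = hol_levels)
)

# Made-up log standardized demand for one half-hour period.
cal$logd <- 0.1 * sin(2 * pi * cal$timeofyear) -
  0.08 * (cal$day %in% c("Sat", "Sun")) -
  0.10 * (cal$holiday == "Holiday") +
  rnorm(n_days, sd = 0.03)

# Spline in time of season plus the two calendar factors.
fit <- lm(logd ~ ns(timeofyear, df = 9) + day + holiday, data = cal)

# Columns: 1 intercept + 9 spline + 6 day-of-week + 3 holiday contrasts.
ncol(model.matrix(fit))
```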


Temperature effects

- Average temperature across two sites, plus lags for previous 3 hours and previous 3 days.
- Temperature difference between two sites, plus lags for previous 3 hours and previous 3 days.
- Maximum average temperature in past 24 hours.
- Minimum average temperature in past 24 hours.
- Average temperature in past seven days.

Each function estimated using boosted regression splines.
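To make the temperature predictors concrete, here is a base-R sketch that builds a few of them from made-up half-hourly readings at two sites. The helper functions and column names are hypothetical, and the choice to exclude the current half-hour from the rolling windows is an assumption; in the package this job is done by `maketemps`, whose implementation is not reproduced here.

```r
set.seed(4)
n   <- 48 * 14                                   # two weeks of half-hours
tod <- rep(0:47, length.out = n)                 # half-hour of day
temp1 <- 22 + 6 * sin(2 * pi * (tod - 18) / 48) + rnorm(n)   # site 1
temp2 <- temp1 + rnorm(n, mean = 1, sd = 0.5)                # site 2

temp_avg  <- (temp1 + temp2) / 2                 # average across the two sites
temp_diff <- temp2 - temp1                       # difference between sites

# Shift a series back by k half-hours, padding the start with NA.
lag_vec <- function(x, k) c(rep(NA, k), head(x, -k))

# Summary of the previous `width` half-hours (current one excluded).
roll_prev <- function(x, width, FUN) {
  out <- rep(NA_real_, length(x))
  for (t in (width + 1):length(x)) out[t] <- FUN(x[(t - width):(t - 1)])
  out
}

lags <- data.frame(
  temp_avg      = temp_avg,
  temp_diff     = temp_diff,
  avg_lag_30min = lag_vec(temp_avg, 1),
  avg_lag_3h    = lag_vec(temp_avg, 6),
  avg_lag_1d    = lag_vec(temp_avg, 48),
  avg_lag_3d    = lag_vec(temp_avg, 144),
  max_24h       = roll_prev(temp_avg, 48, max),
  min_24h       = roll_prev(temp_avg, 48, min),
  mean_7d       = roll_prev(temp_avg, 48 * 7, mean)
)
tail(lags, 3)
```

The leading rows contain `NA` wherever the required history is not yet available, so they would be dropped before fitting.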


Ensemble forecasting


Multiple alternative futures created:

- Calendar effects known;
- Future temperatures simulated (taking account of climate change);
- Assumed values for GSP, population and price;
- Residuals simulated (preserving autocorrelations).
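The slides do not say how the residuals are resampled. One common way to keep serial correlation when simulating is a block bootstrap, sketched below on made-up AR(1) residuals. The function `block_bootstrap` and the block length of five days are assumptions for illustration, not the package's method.

```r
set.seed(5)
# Made-up autocorrelated residuals: 60 days x 48 half-hours, AR(1).
n_days <- 60
res <- as.numeric(arima.sim(list(ar = 0.9), n = n_days * 48, sd = 0.02))
res_by_day <- matrix(res, nrow = 48)             # one column per day

# Resample whole blocks of consecutive days so that the correlation
# inside each block is carried over unchanged.
block_bootstrap <- function(x_by_day, n_out_days, block_days = 5) {
  n_days   <- ncol(x_by_day)
  n_blocks <- ceiling(n_out_days / block_days)
  starts   <- sample.int(n_days - block_days + 1, n_blocks, replace = TRUE)
  idx      <- unlist(lapply(starts, function(s) s:(s + block_days - 1)))
  as.vector(x_by_day[, idx[seq_len(n_out_days)]])
}

sim_res <- block_bootstrap(res_by_day, n_out_days = 30)

# Lag-1 autocorrelation of the original and the resampled residuals.
acf(res, plot = FALSE)$acf[2]
acf(sim_res, plot = FALSE)$acf[2]
```

Repeating the resampling many times, together with simulated temperatures and assumed economic inputs, gives the ensemble of alternative futures.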


MEFM package for R

Available on GitHub:

    install.packages("devtools")
    library(devtools)
    install_github("robjhyndman/MEFM-package")

Package contents:

    seasondays        The number of days in a season
    sa.econ           Historical demographic & economic data for South Australia
    sa                Historical data for model estimation
    maketemps         Create lagged temperature variables
    demand_model      Estimate the electricity demand models
    simulate_ddemand  Temperature and demand simulation
    simulate_demand   Simulate the electricity demand for the next season


Usage

    library(MEFM)

    # Number of days in each "season"
    seasondays

    # Historical economic data
    sa.econ

    # Historical temperature and calendar data
    head(sa)
    tail(sa)
    dim(sa)

    # create lagged temperature variables
    salags <- maketemps(sa, 2, 48)
    dim(salags)
    head(salags)

    # ns() comes from the splines package; attach it in case MEFM has not
    library(splines)

    # formula for annual model
    formula.a <- as.formula(anndemand ~ gsp + ddays + resiprice)

    # formulas for half-hourly model
    # These can be different for each half-hour
    formula.hh <- list()
    for(i in 1:48) {
      formula.hh[[i]] <- as.formula(log(ddemand) ~ ns(temp, df=2) +
        day + holiday +
        ns(timeofyear, df=9) + ns(avetemp, df=3) +
        ns(dtemp, df=3) + ns(lastmin, df=3) +
        ns(prevtemp1, df=2) + ns(prevtemp2, df=2) +
        ns(prevtemp3, df=2) + ns(prevtemp4, df=2) +
        ns(day1temp, df=2) + ns(day2temp, df=2) +
        ns(day3temp, df=2) + ns(prevdtemp1, df=3) +
        ns(prevdtemp2, df=3) + ns(prevdtemp3, df=3) +
        ns(day1dtemp, df=3))
    }

    # Fit all models
    sa.model <- demand_model(salags, sa.econ, formula.hh, formula.a)

    # Summary of annual model
    summary(sa.model$a)

    # Summary of half-hourly model at 4pm
    summary(sa.model$hh[[33]])

    # Simulate future normalized half-hourly data
    simdemand <- simulate_ddemand(sa.model, sa, simyears=50)

    # economic forecasts, to be given by user
    afcast <- data.frame(pop=1694, gsp=22573, resiprice=34.65, ddays=642)

    # Simulate half-hourly data
    demand <- simulate_demand(simdemand, afcast)

    plot(ts(demand$demand[,sample(1:100, 4)], frequency=48, start=0),
         xlab="Days", main="Simulated demand futures")

[Figure: Simulated demand futures. Four randomly sampled simulated series (here series 52, 49, 88 and 53), each plotted against Days (0 to 150).]

    plot(demand$annmax, main="Simulated seasonal maximums",
         ylab="GW")

[Figure: Simulated seasonal maximums. Vertical axis GW (1.5 to 3.0); horizontal axis Index (0 to 100).]

    boxplot(demand$annmax, main="Simulated seasonal maximums",
            xlab="GW", horizontal=TRUE)
    rug(demand$annmax)

[Figure: Simulated seasonal maximums. Horizontal boxplot with rug; axis GW (1.5 to 3.0).]

    plot(density(demand$annmax, bw="SJ"), xlab="Demand (GW)",
         main="Density of seasonal maximum demand")
    rug(demand$annmax)

[Figure: Density of seasonal maximum demand. Horizontal axis Demand (GW), 1.5 to 3.5; vertical axis Density, 0.0 to 1.2.]
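Besides plotting them, the simulated seasonal maximums can be summarized numerically. The sketch below uses made-up values as a stand-in for `demand$annmax` so that it runs without the package; the "probability of exceedance" (PoE) labels and the 2.5 GW threshold are assumptions for illustration and do not appear in the slides.

```r
# Stand-in for demand$annmax: made-up simulated seasonal maxima in GW.
set.seed(6)
annmax <- 2.2 + 0.3 * rnorm(1000)    # hypothetical values, not MEFM output

# Demand levels exceeded in 10%, 50% and 90% of the simulated futures.
poe <- quantile(annmax, probs = c(0.9, 0.5, 0.1), names = FALSE)
names(poe) <- c("PoE10", "PoE50", "PoE90")
poe

# Estimated probability that the seasonal maximum exceeds a given level.
mean(annmax > 2.5)
```

With real output, replacing `annmax` by `demand$annmax` would give the same summaries for the simulated South Australian futures.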
