Load Forecast Uncertainty Estimation using Monte Carlo Simulations

APRIL 26TH, 2018

Fernando Peña-Silva

Load Forecasting Analyst

Facilities & Transmission

[Figure: map of NSPI MAJOR FACILITIES 2018, showing generating plants, substations, and transmission lines across Nova Scotia, including the interconnection with NB Power and the link to Newfoundland. Legend: 69 kV Transmission Line; Hydro Generating Plants; Thermal Generating Plants; Combustion Turbine Generating Plant; Wind Turbine Generating (Transmission), with sites labelled from 13.8 MW to 102 MW; Tidal Power Generating Plant; Biomass Power Generating Plant; Major Transmission Substation. Line routing is not to scale.]

▪ Nova Scotia Power serves about 500K customers (about 1 million people)

▪ A 10-year load forecast is filed annually with the Utility and Review Board (UARB).

▪ Prior to 2017, the forecast provided discrete high/low scenarios based on economics, electrification, weather and large customers

▪ In 2016 the UARB’s consultant recommended that the sensitivity analysis should be broadened to encompass a wider range of assumptions

▪ Rather than trying to come up with more discrete scenarios, we decided to use a probabilistic approach (p10/p90) -> Monte Carlo Simulations

3

Context

[Figure: pie chart, 2017 Annual Production Volumes. Categories: Coal; Natural gas; Oil and petcoke; Purchased power - other; Wind and hydro; Purchased Power IPP; Purchased Power COMFIT; Biomass - renewables. Slice labels as extracted: 44%, 13%, 11%, 4%, 10%, 11%, 5%, 2%.]

4

Generation Mix

5

Trends of the 3 major customer groups

6

Current regression model (reporting)

Load Forecast SAE Model (Residential)

XCool
▪ End Use: AC Saturation, AC Efficiency, Thermal Efficiency, Home Size
▪ Economic: Income, Retail Sales, Household Size, Price
▪ Cooling Degree Days

XHeat
▪ End Use: Heating Saturation, Resistance/Heat Pump, Heating Efficiency, Thermal Efficiency, Home Size
▪ Economic: Income, Retail Sales, Household Size, Price
▪ Heating Degree Days

XOther
▪ End Use: Saturation Levels, Water Heat, Appliances, Lighting Densities, Plug Loads, Appliance Efficiency
▪ Economic: Income, Retail Sales, Household Size, Price
▪ Billing Days

$$\mathrm{AvgUse}_m = a + b_c \, \mathrm{XCool}_m + b_h \, \mathrm{XHeat}_m + b_o \, \mathrm{XOther}_m + c \, \mathrm{Reported\ DSM} + e$$

3

8

Deterministic Results

Net System Peak by year

Yearly Net System Requirements

9

Monte Carlo Simulations

▪ The base forecast from the SAE model is deterministic: it predicts a single number (sales, peak demand) at a given time

▪ The Monte Carlo method adds volatility to the forecast by applying historical variations, or best guesses where data is lacking, to the inputs (predictors)

▪ Regression coefficients obtained in the previous step remain fixed; “we claim we understand the nature of current sales”

▪ With help of software tools (e.g., Oracle’s Crystal Ball, R, SAS, Matlab, etc.) we generate random sets of numbers for each predictor. We then simulate thousands of input numbers based on “realistic” probability distributions

▪ After each trial, we calculate the output given the trial inputs (and known fixed coefficients), collect all results and obtain the probability distribution of the output each year

▪ For this work we only considered variation in economics and weather
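The procedure in the bullets above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the model used in this work: the coefficient values, the two predictors (`income`, `hdd`), and their means and standard deviations are all made up for the example.

```python
import random
import statistics

# Hypothetical fixed regression coefficients (illustrative values only,
# not taken from the SAE model described in these slides).
COEF = {"intercept": 200.0, "income": 3.0, "hdd": 0.5}

def simulate_sales(n_trials=10_000, seed=42):
    """Draw random predictor values and push each trial through the fixed regression."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_trials):
        income = rng.gauss(100.0, 5.0)   # assumed economic index: mean 100, SD 5
        hdd = rng.gauss(600.0, 60.0)     # assumed heating degree days: mean 600, SD 60
        outputs.append(COEF["intercept"] + COEF["income"] * income + COEF["hdd"] * hdd)
    return outputs

def percentile(values, p):
    """Simple percentile by position in the sorted trials (p in [0, 100])."""
    ordered = sorted(values)
    idx = min(len(ordered) - 1, max(0, round(p / 100 * (len(ordered) - 1))))
    return ordered[idx]

sales = simulate_sales()
p10, p50, p90 = (percentile(sales, p) for p in (10, 50, 90))
print(f"p10={p10:.1f}  p50={p50:.1f}  p90={p90:.1f}")
```

The coefficients stay fixed across trials; only the inputs vary, so the p10/p90 spread reflects input uncertainty alone.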

10

Monte Carlo Method

▪ If historical data is available, and normally distributed, extracting mean and standard deviations should be simple

▪ Software tools can then use the parameters of the best fit (e.g., mean and SD) to generate pseudorandom numbers as many times as one wants

▪ In normal distributions the mean locates the data in the spectrum of possibilities, while the standard deviation measures variability
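As a rough sketch of this step, the snippet below extracts the mean and standard deviation from a short history and then draws pseudorandom values from the fitted normal. The `history` values are invented for illustration, and the data is simply assumed to be normally distributed.

```python
import random
import statistics

# Hypothetical historical January HDD totals (made-up numbers for illustration).
history = [742.0, 801.0, 768.0, 695.0, 830.0, 779.0, 756.0, 812.0, 723.0, 790.0]

mu = statistics.mean(history)       # locates the distribution
sigma = statistics.stdev(history)   # sample standard deviation: measures variability

fitted = statistics.NormalDist(mu, sigma)
trials = fitted.samples(10_000, seed=1)   # pseudorandom draws from the fitted normal

print(f"fitted mean={mu:.1f}, SD={sigma:.1f}")
print(f"simulated mean={statistics.mean(trials):.1f}, SD={statistics.stdev(trials):.1f}")
```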

11

Simulating Probability Distributions

▪ Normal distributions (bell curves) usually arise in naturally occurring events

▪ Others may be the product of filters, rules, biases, and lack of information

▪ Right panel: The function MAX (of monthly, bell-shaped HDDs) creates a bias that skews the normal distribution. This may show up when studying the system peak in a given year

▪ Bottom: When little or no data is available, Uniform and Triangular distributions are suggested; they are easy to create and understand, albeit not very realistic.
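The skew introduced by MAX can be checked numerically. The sketch below assumes three independent winter months with identical normal HDDs (a simplification chosen for the example) and compares the skewness of a single month with that of the maximum. It also draws from a triangular distribution whose low, high, and mode values are guesses of the kind one would use when data is scarce.

```python
import random
import statistics

rng = random.Random(7)

def skewness(xs):
    """Sample skewness: mean cubed deviation divided by SD cubed."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum((x - m) ** 3 for x in xs) / (len(xs) * s ** 3)

# Assumed: three winter months with identical, independent normal HDDs.
single_month = [rng.gauss(700.0, 50.0) for _ in range(20_000)]
winter_max = [max(rng.gauss(700.0, 50.0) for _ in range(3)) for _ in range(20_000)]

# Triangular stand-in for a predictor with little data: low, high, mode are guesses.
guess = [rng.triangular(0.5, 3.0, 1.5) for _ in range(20_000)]

print(f"skew single month: {skewness(single_month):+.2f}")
print(f"skew max of three: {skewness(winter_max):+.2f}")
print(f"triangular mean:   {statistics.fmean(guess):.2f}")
```

The maximum is shifted upward and right-skewed even though every input is symmetric.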

12

Other types of distributions

How our model works in practice

$$\mathrm{AvgUse}_m = a + b_c \, \mathrm{XCool}_m + b_h \, \mathrm{XHeat}_m + b_o \, \mathrm{XOther}_m + \dots$$

3

[Figure: trial inputs (Trial Price, Trial Income, Trial March HDD) and the resulting Output]

• We reproduce the base SAE model in Excel

• Regression coefficients are imported

• Economic indicators, HDDs and CDDs are simulated

• For now, End-Use forecasts are not modelled

• Reported DSM is used to correct outputs

▪ After all predictor distributions (per time interval) are set, we can run our regression model, which takes as inputs each set of trials

▪ We gather distributions of the output (sales) per year

▪ The standard deviation, or spread, of the outcome is a measure of uncertainty!
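A per-year version of the simulation might look like the sketch below. Everything numeric here is an assumption made for illustration: the toy sales equation, a starting economic index of 100 that compounds with random yearly growth, and annual HDDs redrawn independently each year.

```python
import random
import statistics

def yearly_bands(years=10, n_trials=5_000, seed=11):
    """Return {year: (p10, mean, p90, sd)} for a toy sales model."""
    rng = random.Random(seed)
    per_year = {y: [] for y in range(1, years + 1)}
    for _ in range(n_trials):
        income = 100.0                            # assumed starting economic index
        for y in range(1, years + 1):
            income *= 1 + rng.gauss(0.01, 0.02)   # assumed growth: mean 1%, SD 2% per year
            hdd = rng.gauss(4000.0, 250.0)        # assumed annual HDD, redrawn every year
            # Fixed, hypothetical coefficients applied to this trial's inputs.
            per_year[y].append(2000.0 + 40.0 * income + 1.0 * hdd)
    bands = {}
    for y, sales in per_year.items():
        deciles = statistics.quantiles(sales, n=10)   # 9 cut points: [0] is p10, [8] is p90
        bands[y] = (deciles[0], statistics.fmean(sales), deciles[8], statistics.stdev(sales))
    return bands

for year, (p10, mean, p90, sd) in yearly_bands().items():
    print(f"year {year:2d}: p10={p10:8.0f} mean={mean:8.0f} p90={p90:8.0f} sd={sd:6.0f}")
```

Because the economic index compounds while weather is redrawn each year, the standard deviation in this toy setup widens with the forecast horizon.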

14

Results 1: Long term sales forecast

▪ We use a version of the SAE model for the monthly system peak model. It takes weather-normalized energy loads and multiplies them by the temperature at peak for that month

▪ The model is very accurate (R² > 0.9) and also highly sensitive to the peak temperature, whose distribution is slightly skewed

15

Results 2: Long term peak forecast

▪ As a byproduct, one can compute the rank correlation (Spearman’s) between each predictor and the output, and rank the predictors in order of importance

▪ This gives a relative (per-predictor) estimate of the contribution to variance

▪ d = difference between the ranks of each output-predictor pair; N = number of pairs (trials)

16

Sensitivity analysis 1: Energy Sales

$$\rho_S = 1 - \frac{6 \sum d^2}{N(N^2 - 1)}$$

$$\mathrm{C2Var}_i = \frac{\rho_{S_i}^2}{\sum_j \rho_{S_j}^2}$$
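The two formulas can be applied to simulated trials as in the following sketch. The three predictors and the toy output equation are hypothetical; the ranking helper assumes continuous draws and therefore does not handle ties.

```python
import random

def ranks(values):
    """Rank values 1..N (continuous draws, so ties are not handled)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    r = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        r[idx] = rank
    return r

def spearman(x, y):
    """rho_S = 1 - 6 * sum(d^2) / (N * (N^2 - 1)), d = rank difference per pair."""
    n = len(x)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

rng = random.Random(3)
n = 5_000
# Hypothetical trial inputs and a toy output that leans mostly on HDD.
hdd = [rng.gauss(0, 1) for _ in range(n)]
income = [rng.gauss(0, 1) for _ in range(n)]
price = [rng.gauss(0, 1) for _ in range(n)]
output = [3.0 * h + 1.0 * i - 0.5 * p for h, i, p in zip(hdd, income, price)]

rho = {name: spearman(col, output)
       for name, col in (("hdd", hdd), ("income", income), ("price", price))}
total = sum(r * r for r in rho.values())
contribution = {name: r * r / total for name, r in rho.items()}

for name in sorted(contribution, key=contribution.get, reverse=True):
    print(f"{name:7s} rho_S={rho[name]:+.2f}  contribution={contribution[name]:.1%}")
```

The contributions are normalized to sum to one, so they rank predictors relative to each other rather than measuring explained variance directly.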

▪ The System Peak forecast is heavily dependent on the temperature of the peak day, and very loosely dependent on cumulative effects (economics, price)

17

Sensitivity analysis 2: Peak Demand

18

Correlations

▪ After a Monte Carlo model has been completed, it’s time to add correlations between predictors

▪ In principle, all variables exhibit some degree of correlation; it is up to the analyst to select those that will be included

▪ Including correlations will have an impact on the forecast uncertainty, particularly in the mid-to-long term

▪ We can proceed in a similar way as with the historical variation study for the predictors. This time we examine the correlation of historical measures among pairs of predictors
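One common way to impose a target correlation on a pair of normally distributed predictors is to mix two independent standard normals (the two-variable case of a Cholesky factorization). The sketch below assumes that approach; dedicated tools may handle correlations differently. The predictor pair, its means and standard deviations, and the target correlation of 0.6 are illustrative.

```python
import math
import random

def pearson(x, y):
    """Sample correlation: summed cross deviations over the root of the summed squares."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlated_normals(n, mean_x, sd_x, mean_y, sd_y, rho, seed=0):
    """Draw n (x, y) pairs from two normals with target correlation rho."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        # Mix two independent standard normals so that corr(z1, zy) = rho.
        zy = rho * z1 + math.sqrt(1 - rho * rho) * z2
        xs.append(mean_x + sd_x * z1)
        ys.append(mean_y + sd_y * zy)
    return xs, ys

# Hypothetical pair: income growth (%) and retail-sales growth (%), target rho = 0.6.
income, retail = correlated_normals(10_000, 2.0, 1.0, 1.5, 2.0, rho=0.6)
print(f"simulated correlation after 10K trials: {pearson(income, retail):.3f}")
```

Comparing the simulated correlation with the target is a quick check that the trials reproduce the historical relationship.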

19

Adding correlations

20

Reminder: Correlations

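$$\mathrm{corr}(X, Y) = \frac{E[(X - \bar{X})(Y - \bar{Y})]}{\sigma_X \sigma_Y} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$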

https://www.google.ca/url?sa=i&rct=j&q=&esrc=s&source=images&cd=&ved=2ahUKEwigm5Xx1a3aAhXpQd8KHSzzCQQQjRx6BAgAEAU&url=https://ar.wikipedia.org/wiki/%D9%85%D9%84%D9%81:Correlation_examples.png&psig=AOvVaw1B2GwGlrpMlw2pBGS6xwHO&ust=1523379253521528

21

Simulated Correlations (after 10K trials)

▪ Calculating correlations will produce “numbers”; the analyst has to use their knowledge to determine whether a certain correlation should be included (should I include the temperature-price pair?...it depends!)

▪ Different software may handle correlations differently, in some cases significantly affecting the forecast computation time

▪ We test the significance of the calculated correlation using the t-test (for 20 years of yearly data, |t*| > 2.1 rejects the null hypothesis of zero correlation)

22

What correlations should we add?

$$t^* = \rho_S \sqrt{\frac{N - 2}{1 - \rho_S^2}}$$
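The test can be worked through numerically, as in this sketch. The critical value 2.101 is the standard two-sided 5% value of Student's t with N - 2 = 18 degrees of freedom, matching the 2.1 threshold quoted for 20 years of data. The sample correlations are illustrative, not results from this study.

```python
import math

T_CRIT_DF18 = 2.101  # two-sided 5% critical value of Student's t, 18 degrees of freedom

def t_statistic(rho, n):
    """t* = rho * sqrt((N - 2) / (1 - rho^2))."""
    return rho * math.sqrt((n - 2) / (1 - rho * rho))

def is_significant(rho, n=20, t_crit=T_CRIT_DF18):
    """True when |t*| exceeds the critical value, i.e. zero correlation is rejected."""
    return abs(t_statistic(rho, n)) > t_crit

def minimum_significant_rho(n=20, t_crit=T_CRIT_DF18):
    """Invert the t* formula: smallest |rho| that clears the threshold."""
    return t_crit / math.sqrt(t_crit ** 2 + n - 2)

for rho in (0.30, 0.45, -0.70):   # illustrative correlations, not results from the study
    print(f"rho={rho:+.2f}  t*={t_statistic(rho, 20):+.2f}  significant={is_significant(rho)}")
print(f"smallest significant |rho| for N=20: {minimum_significant_rho():.3f}")
```

With 20 yearly observations, a correlation needs a magnitude of roughly 0.44 or more to pass; weaker correlations are statistically indistinguishable from zero.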

23

Correlations make a difference

24

Test case

▪ The purpose of the confidence bands is to assess probable outcomes

▪ At the moment the Load Forecast team produces year-to-year, long-term forecast reports that are submitted both to NSPI and UARB

▪ Yet we still have to find ways to assess historical DSM activities and End-Uses, and integrate them into the Monte Carlo variables

▪ However, our current forecast model runs monthly -> we can assess short-term accuracy by comparing against actuals (so far, actuals are within 5% of the Monte Carlo mean)

25

Monthly Sales Forecast & 2017 Actuals

▪ We use Monte Carlo to assess probable outcomes produced by our regression models

▪ The predicted uncertainty is a result of the combinations of (known) uncertainties of many inputs

▪ Our challenges: Realistic DSM and End-Use probability distributions

▪ After assumptions are understood (e.g. sensitivity) and predictions trusted we can explore (probabilistic) actions that affect inputs that in turn affect outcomes

▪ We look forward to hearing from those in the audience who may be using Monte Carlo simulations in similar or new ways!

26

Conclusions and future work

27

Thank you!

Questions?
