Short-term Electricity Load Forecasting of Buildings in Microgrids

Hamed Chitsaz a,∗, Hamid Shaker a, Hamidreza Zareipour a, David Wood a, Nima Amjady b

a Schulich School of Engineering, University of Calgary, Alberta, Canada
b Electrical Engineering Department, Semnan University, Semnan, Iran

Abstract

Electricity load forecasting plays a key role in the operation of power systems. Since the penetration of distributed and renewable generation is growing in many countries, Short-Term Load Forecast (STLF) of micro-grids is also becoming an important task. A precise STLF of the micro-grid can enhance the management of its renewable and conventional resources and improve the economics of energy trade with electricity markets. As a consequence of the highly non-smooth and volatile behavior of the load time series in a micro-grid, its STLF is an even more complex process than that of a power system. For this purpose, a new prediction method is proposed in this paper, in which a Self-Recurrent Wavelet Neural Network (SRWNN) is applied as the forecast engine. Moreover, the Levenberg-Marquardt (LM) learning algorithm is implemented and adapted to train the SRWNN. In order to demonstrate the efficiency of the proposed method, it is examined on real-world hourly data of an educational building within a micro-grid. Comparisons with other load prediction methods are provided.

∗Corresponding Author: Hamed Chitsaz; Email: [email protected]; Phone: +1 (587) 896-8388; Fax: +1 (403) 282-6855

Preprint submitted to Elsevier April 20, 2016

Keywords: Micro-grids, buildings, electricity load forecasting, self-recurrent wavelet neural network.

1. Introduction

Micro-grids are integrated energy systems composed of distributed energy resources and multiple electrical loads operating as an autonomous grid, which can be either in parallel to or islanded from the existing power grid. A micro-grid can be considered as a small-scale version of the traditional power grid whose small scale results in far fewer line losses and lower demand on transmission infrastructure. These advantages are motivating an increased demand for micro-grids in a variety of application areas such as campus environments, military operations, community/utility systems, and commercial and industrial markets [1].

Considering the fast and worldwide development of micro-grids, their optimal operation requires advanced tools and techniques. In particular, Short-Term Load Forecast (STLF) is an indispensable task for the operation of a micro-grid. In conventional power systems, STLF is an important tool for reliable and economic operation, as many operating decisions, such as dispatch scheduling of generating capacity, demand side management, security assessment and maintenance scheduling of generators, are based on load forecasts [2, 3, 4, 5, 6, 7]. Load forecasts also have significant roles in energy transactions, market shares and profits in competitive electricity markets [7, 8]. Different prediction strategies have already been presented for the STLF of traditional power systems over the years. These methodologies are generally divided into two main groups: classical statistical techniques and computational intelligence techniques. Reviews of some of these strategies can be found in [2, 4, 5, 6, 7, 8].

In a similar way, STLF is a key factor in the operation of micro-grids, for example in energy management for optimal utilization of available resources in order to minimize the operation cost or any environmental impact of a micro-grid [9]. Moreover, STLF for a micro-grid can be used for profitable trade of electric energy within the grid. In other words, it is important for the operator of a micro-grid to determine the amount of power exchanged with a wholesale energy market so as to maximize the total benefit [10]. It has also been discussed that the forecasted loads as well as the forecasted generation of renewable resources are the main inputs for optimal energy management [11, 12] and generation scheduling [13] in micro-grids.

However, modeling and forecasting of micro-grids’ loads can be more complex tasks than those usually applied for conventional power systems, as the load time series of micro-grids is more volatile than the load of power systems, as demonstrated later in the present paper. Since a micro-grid is considerably smaller than a traditional power system, the load of a micro-grid includes more fluctuations. In other words, the inertia in small-scale systems is low and therefore, the smoothness of load time series in such systems degrades. Using a criterion to measure the volatility of a time series, it will be shown in this paper that the volatility of the load time series for a micro-grid is considerably higher than that for a conventional power system. As a result, there is a need to adapt a suitable STLF model to the volatile behavior of micro-grid load time series. Despite the importance of STLF for micro-grids, only a few works have been presented in this area. The authors in [14] present an on-line learning model based on Multiple Classifier Systems (MCSs) for short-term load forecasting of micro-grids, and the model was tested on real data of a micro-grid. A bi-level prediction strategy is proposed in [15] for STLF in micro-grids. This strategy is composed of a forecaster including a neural network and an evolutionary algorithm in the lower level and an enhanced differential evolution algorithm in the upper level for optimizing the performance of the forecaster. The model proposed in [15] is designed with the aggregated micro-grid load in mind. However, the present paper focuses on forecasting the individual loads within a micro-grid, with potentially significantly higher volatility compared to the aggregated micro-grid load. Forecasting individual micro-grid load components is important for operation scheduling and determining load serving priorities at the feeder level [16].

Some research works have also been presented regarding electricity load prediction for residential areas and buildings [17, 18, 19]. The proper consumption of electricity in buildings leads to lower operational costs. If the facility manager could predict the electricity demand of the building, actions could consequently be taken to reduce the amount of energy used and therefore reduce the operational cost of the building [19]. A few works have been published very recently in the area of energy prediction of buildings. For instance, long-term energy consumption of a residential area in South West China has been studied in [20]. In this reference, an Artificial Neural Network (ANN) model is compared with some other prediction models, including Grey model, regression model, polynomial model and polynomial regression model, to forecast the total energy consumption of the residential area, and it is shown that the ANN model outperforms the other models. Having access to detailed data of a six-story multi-family residential building located on the Columbia University campus in New York City, the authors in [21] were able to conduct a comparative spatial analysis to forecast the energy consumption of units, floors and the whole building for different temporal intervals (e.g., 10-min, hourly and daily). The results indicate that the most effective models are built with hourly consumption at the floor level, provided that high-resolution and granular data is available via advanced smart metering devices. In [22], a Case-Based Reasoning (CBR) model, categorized as a machine-learning artificial intelligence technique, is proposed to forecast energy demand in an office building located in Varennes, Quebec, Canada. Three forecasting horizons of 3-hour, 6-hour and 24-hour ahead have been simulated with hourly prediction resolution, and the results demonstrate that the prediction capability of the model is improved when the horizon is reduced to 3-hour ahead. The authors in [23] have proposed a new methodology for electrical consumption forecasting based on end-use decomposition and similar days. The total consumption forecast is also obtained from end-use consumptions and the data of selected days. In [24], a building-level neural network-based ensemble model is presented for day-ahead electricity load forecasting, and it is shown that the presented model outperforms SARIMA (Seasonal Autoregressive Integrated Moving Average) by up to 50%. However, the comparisons are made only with the SARIMA model, a linear statistical model that may not be capable of capturing the high nonlinearity of the building-level electricity load.

To summarize the main points, micro-grids can bring considerable benefits to power systems, such as supplying loads in remote areas, reducing total system expansion planning cost, reducing carbon emission through coordinated utilization of Renewable Energy Sources (RESs), providing cheaper electricity through proper energy management of available resources and energy trade with the main grid, and improving system reliability and resiliency by providing dispatchable power for use during peak power conditions or emergency situations. Moreover, it was discussed that a short-term load forecasting tool is of high importance in optimal energy management and secure operation of micro-grids. Accordingly, some research works have been conducted to develop load forecasting models with higher accuracy. However, as discussed above, only a few works have focused on day-ahead load prediction of buildings in micro-grids and consequently, improvement of forecast accuracy is still needed in this area. In the present paper, a forecast method is proposed for the STLF of micro-grids with a focus on electricity load prediction for individual buildings. The main contribution of this paper is applying a Self-Recurrent Wavelet Neural Network (SRWNN) forecasting engine for electricity load prediction of micro-grids. Moreover, the Levenberg-Marquardt (LM) learning algorithm is implemented to train the SRWNN. The proposed method improves the forecast accuracy for highly volatile and non-smooth time series of micro-grid electricity load. The higher the forecast accuracy of electricity load, the more efficient the energy management that can be achieved in a micro-grid.

The remaining parts of the paper are organized as follows. Section 2 provides a data analysis on different electricity load time series to draw a distinction between the load of a micro-grid and that of a power system. The proposed forecasting method, which consists of the SRWNN as the forecasting engine and LM as the training algorithm, is presented in Section 3. The proposed load forecasting method is tested on real-world test cases and the results are compared with the results of some other prediction approaches in Section 4. Finally, Section 5 concludes the paper.

2. Data analysis

A data analysis is presented in this section so as to compare the characteristics of a micro-grid load time series and electricity load in power systems. The British Columbia Institute of Technology (BCIT) in Vancouver, the Province of British Columbia (BC), Canada, is considered as the micro-grid test case studied in this paper. BCIT’s Burnaby campus is Canada’s first Smart Power Micro-grid, comprised of power plants (including renewable resources of wind and photovoltaic modules), campus loads, command and control (including substation automation, micro-grid control center and distributed energy management), and a communication network [25]. The load data used in this work is from one building within the BCIT micro-grid, with a peak value of 694 kW, from March 2012 to March 2013. Hereafter, we refer to this load as BCIT. To draw a comparison between the characteristics of a micro-grid load and a power system load, the load time series of two power systems, i.e., British Columbia, where the BCIT micro-grid is located, and California, are analyzed.

Table 1: Comparison of electricity load time series in terms of volatility.

Volatility index         BCIT    British Columbia’s system load    California’s system load
Daily volatility (%)     8.34    2.66                              3.18
Weekly volatility (%)    7.09    2.28                              3.15

Electricity load follows daily and weekly periodicities. Accordingly, we consider two measures for volatility analysis, i.e., daily volatility and weekly volatility. These measures are based on the standard deviation of logarithmic returns over a time window. In general, daily volatility quantifies the overall change in hourly electricity load from one day to another, and weekly volatility measures the load changes in subsequent weeks. For more details regarding the aforementioned volatility indices, see [26].
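As a rough illustration of a log-return volatility measure, the sketch below computes the standard deviation of ln(L_t / L_{t-lag}) on an hourly series. The choice of lag = 24 for a "daily" index and lag = 168 for a "weekly" index, the function name, and the synthetic data are all assumptions made for illustration; the exact windowing of the indices is defined in [26] and may differ.

```python
import math
import statistics


def log_return_volatility(load, lag):
    """Standard deviation (in %) of log returns ln(L_t / L_{t-lag}).

    Assumption: lag=24 stands for a 'daily' index and lag=168 for a
    'weekly' index on hourly data; the definition in [26] may differ.
    """
    returns = [math.log(load[t] / load[t - lag]) for t in range(lag, len(load))]
    return 100.0 * statistics.pstdev(returns)


# Synthetic hourly series (hypothetical, not the BCIT data): a perfectly
# periodic daily profile, and the same profile scaled up on every other day.
profile = [300 + 200 * math.sin(2 * math.pi * h / 24) ** 2 for h in range(24)]
smooth = [profile[t % 24] for t in range(24 * 28)]
volatile = [profile[t % 24] * (1.0 + 0.2 * ((t // 24) % 2)) for t in range(24 * 28)]

daily_smooth = log_return_volatility(smooth, 24)      # identical days
daily_volatile = log_return_volatility(volatile, 24)  # days differ by 20%
```

A series that repeats exactly from day to day yields zero daily volatility, while day-to-day changes in level raise the index, which is the behavior the comparison in Table 1 relies on.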

One year of hourly load data has been considered for British Columbia’s and California’s power systems for the same period, i.e., from March 2012 to March 2013. Observe from Table 1 that both daily and weekly volatility indices for the micro-grid are considerably higher than those for the power systems, which demonstrates the low smoothness of the micro-grid load time series. For instance, the daily volatility of the micro-grid is 8.34%, while it is 2.66% and 3.18% for British Columbia’s and California’s power systems, respectively. This means that the electricity load of the micro-grid fluctuates more severely from one day to another compared with that of the power systems. Likewise, weekly fluctuations are more severe in the micro-grid than in a power system. As a result, the daily and weekly periodicities of electricity load in a micro-grid are noticeably weak, and consequently, the predictability of such load time series decreases.

Figure 1: One-year hourly load data of BC’s power system and the building in BCIT. (a) BC’s power system electricity load; (b) Building electricity load in BCIT.

Fig. 1 illustrates one-year hourly load data of British Columbia’s power system and that of the building in BCIT. Note that the data is normalized to the maximum value. As seen, the aggregated electricity load in a large area (e.g., the province of British Columbia) is noticeably different from the aggregated load in a building. For instance, British Columbia’s load follows a common seasonal pattern, as the load decreases in the spring in April and starts to increase in the fall in October. The building’s load follows a fairly similar seasonal pattern. From the beginning of the academic year in September, electricity load starts increasing, and it starts decreasing in February. Moreover, fluctuations of load are clearly more severe for the building than for the power system. These variations in the load time series of the micro-grid graphically demonstrate its volatility, previously shown in Table 1 by the volatility indices.

Figure 2: Distribution of 1-hour and 2-hour ramps. (a) 1-hour ramp distribution; (b) 2-hour ramp distribution. Horizontal axis: % of the peak load; vertical axis: number of occurrences; series: BC power system and building in BCIT.

To have a better understanding of such severe load fluctuations for a building, hourly changes of load, i.e., the differences between observations at subsequent hours, so-called 1-hour ramps, can be taken into consideration. Fig. 2 (a) shows the distribution (with equally spaced bins of 1% of the peak load) of 1-hour ramps for both the building in BCIT and BC’s power system loads. Note that negative values show downward ramps. As seen, hourly upward and downward ramps with an amplitude of more than 5% of the peak load have occurred more frequently in the building in BCIT. The most severe ramp in BC’s power system load is a ramp up with an amplitude of almost 9% of the peak load, while it is a ramp up of more than 20% of the peak load for the building. Similarly, Fig. 2 (b) illustrates the distribution for 2-hour ramps, i.e., load variations over two-hour durations. As a longer time is considered for ramps, larger ramps are detected. Clearly, sharp upward and downward ramps have happened more frequently in the BCIT building load than in the BC system load.

Table 2: Ramp events in electricity load time series

                                 1-hour                                2-hour
Interval (% of the peak load)    Building in BCIT    BC power system   Building in BCIT    BC power system
Ramp Up (RU)
  5% ≤ RU < 10%                  556                 504               818                 971
  10% ≤ RU < 15%                 84                  0                 362                 352
  15% ≤ RU < 20%                 12                  0                 121                 55
  RU > 20%                       4                   0                 21                  0
Ramp Down (RD)
  5% ≤ RD < 10%                  520                 259               1003                1272
  10% ≤ RD < 15%                 48                  0                 283                 141
  15% ≤ RD < 20%                 9                   0                 57                  0
  RD > 20%                       0                   0                 11                  0

To provide more detailed statistics of ramps, Table 2 shows the number of upward and downward ramps for 1-hour and 2-hour durations. For instance, there have been 100 upward 1-hour ramps of more than 10% of the peak load in the BCIT building load, while no 1-hour ramp up with an amplitude of more than 10% of the peak load has occurred in the BC system load. With regard to 2-hour ramps up, 55 ramps of more than 15% of the peak load have occurred in the BC system load, compared with 142 in the BCIT load. This table also demonstrates that downward ramps are fewer than upward ramps when large ramps are concerned.
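The ramp statistics discussed above can be reproduced in spirit with a short counting routine. The following is a minimal sketch, assuming a ramp is the load difference over the chosen number of hours divided by the series peak; the function name, the band labels, the treatment of band boundaries, and the sample loads are illustrative assumptions rather than the procedure used for Table 2.

```python
def ramp_counts(load, hours, edges=(5.0, 10.0, 15.0, 20.0)):
    """Count upward and downward ramps by amplitude band (% of peak load).

    A ramp is load[t] - load[t - hours] as a percentage of the series
    maximum. Bands are [5,10), [10,15), [15,20) and >= 20; Table 2 labels
    the last band "> 20%", so the boundary handling here is an assumption.
    """
    peak = max(load)
    labels = ["5-10", "10-15", "15-20", ">=20"]
    up = dict.fromkeys(labels, 0)
    down = dict.fromkeys(labels, 0)
    for t in range(hours, len(load)):
        ramp = 100.0 * (load[t] - load[t - hours]) / peak
        size = abs(ramp)
        if size < edges[0]:
            continue  # ramps below 5% of the peak are not tabulated
        band = sum(size >= e for e in edges) - 1  # index of the band
        target = up if ramp > 0 else down
        target[labels[band]] += 1
    return up, down


# Hypothetical hourly loads in kW (not the BCIT data).
load = [400, 440, 510, 505, 370, 390, 600, 590]
up1, down1 = ramp_counts(load, hours=1)
up2, down2 = ramp_counts(load, hours=2)
```

Running the same routine with `hours=1` and `hours=2` on two series gives the four columns of a table laid out like Table 2.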

Based on the above descriptions, prediction of the electricity load time series of a building seems to be more difficult than that of a power system, since high volatility lowers the predictability. Consequently, it is required to adapt a forecasting model so as to cope with the challenging characteristics of such time series. In the next section, a forecasting model is proposed to capture the dynamic and volatile behaviour of micro-grid time series.

3. The forecasting model

The discussion in Section 2 showed that dealing with a micro-grid load time series is a more challenging task than dealing with that of a power system, and therefore, traditional STLF will not result in satisfactory accuracy in micro-grid load prediction. Accordingly, the SRWNN forecasting engine is first presented in this section, and the training algorithm is then implemented to set the free parameters of the SRWNN.

3.1. Self-Recurrent Wavelet Neural Network

Wavelet theory has been applied to forecast processes through two different approaches. The first one uses the wavelet transform as a preprocessor to decompose the load time series into its low and high frequency components. Each component is separately processed by a forecast engine [27]. The other approach constructs a wavelet neural network (WNN), in which a wavelet function is used as the activation function of the hidden neurons of a Feed-Forward Neural Network (FFNN). The WNN was first introduced in [28] for approximating nonlinear functions. Due to the local properties of wavelets and the concept of adapting the wavelet shape according to the training data set instead of adapting the parameters of a fixed-shape basis function, WNNs have better generalization properties than classical FFNNs, and therefore, they are more appropriate for the modelling of time series [29].

The SRWNN is a modified WNN that combines the dynamics of Recurrent Neural Networks (RNNs) [30] with the fast convergence of WNNs, and it has successfully been applied to estimating and controlling nonlinear systems [31]. Since the SRWNN has a self-recurrent mother wavelet layer, it can store the past information of wavelets and model complex nonlinear systems well [32]. Having self-feedback loops and direct input terms, the SRWNN has improved capabilities compared to the WNN, such as its dynamic response and information storing ability. Therefore, the SRWNN has been applied as a forecast engine in this paper to cope with the volatile and non-smooth behavior of the load time series in a micro-grid. Moreover, the SRWNN does not suffer from limitations such as dependency on appropriate tuning of parameters and a complex optimization process, which are likely to be found in models such as Support Vector Machines (SVMs) [33].

The architecture of the SRWNN is shown in Fig. 3, which is a feed-forward network with four layers. As seen, $X = [x_1, ..., x_M]$ is the input vector of the forecast engine and $y$ is the target variable. The inputs $x_1, ..., x_M$ of the forecast engine can be taken from the past values of the target variable and the past and forecast values of related exogenous variables. For instance, past values of electricity load along with the past and forecast values of temperature can be considered for electricity load prediction, provided that their data is available.

Figure 3: Architecture of the SRWNN.

A feature selection technique can be used to refine these candidate features and select the most effective inputs for the forecast process. In this research work, we use the feature selection method of [34]. This method is based on the information-theoretic criterion of mutual information and selects the most informative inputs for the forecast process by filtering out the irrelevant and redundant candidate features through two stages. In the first stage, which is called the irrelevancy filter, the mutual information between each candidate input, i.e., $x_i(t)$, and the target variable is calculated. A higher value of mutual information for $x_i(t)$ means that this feature shares more information content with the target variable. The candidate inputs with a computed mutual information value greater than a relevancy threshold, denoted by TH1, are considered as the relevant features of the forecast process and are retained for the next stage. The other candidate inputs, with a mutual information value lower than TH1, are considered as irrelevant features and are filtered out. In the second stage, which is called the redundancy filter, redundant features among the candidate inputs selected by the irrelevancy filter are found and filtered out. Two selected candidates, e.g., $x_k(t)$ and $x_l(t)$, with a high value of mutual information share more common information, i.e., they have a high level of redundancy. Thus, the redundancy of each selected feature $x_k(t)$ with the other candidate inputs is calculated. Then, if the measured redundancy becomes greater than a redundancy threshold, denoted by TH2, $x_k(t)$ is considered as a redundant candidate input. Hence, between this candidate and its rival, which has the maximum redundancy with $x_k(t)$, the one with lower relevancy is filtered out [34]. The candidate features retained after filtering are considered as the inputs of the load forecasting engine. Moreover, the values of the thresholds TH1 and TH2 are fine-tuned by cross-validation. Since this method is not the focus of this paper, it is not further discussed here. The interested reader can refer to [34] for details of this feature selection method.
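The two-stage filtering described above can be sketched as follows. This is not the implementation of [34]: the histogram-based mutual information estimator, the number of bins, the function names, the threshold values and the toy candidates are all assumptions chosen to keep the example small, and the thresholds are passed in directly instead of being tuned by cross-validation.

```python
import math
from collections import Counter


def mutual_information(x, y, bins=4):
    """Histogram estimate of I(X;Y) in bits using equal-width bins."""
    def discretize(v):
        lo, hi = min(v), max(v)
        width = (hi - lo) / bins or 1.0  # constant series: one bin
        return [min(int((a - lo) / width), bins - 1) for a in v]

    dx, dy = discretize(x), discretize(y)
    n = len(x)
    pxy, px, py = Counter(zip(dx, dy)), Counter(dx), Counter(dy)
    return sum((c / n) * math.log2(c * n / (px[i] * py[j]))
               for (i, j), c in pxy.items())


def two_stage_filter(candidates, target, th1, th2):
    """Irrelevancy filter (MI with target > th1), then redundancy filter.

    candidates maps a name to a list of values; th1 and th2 play the role
    of TH1 and TH2.
    """
    relevance = {k: mutual_information(v, target) for k, v in candidates.items()}
    kept = [k for k in candidates if relevance[k] > th1]
    changed = True
    while changed:
        changed = False
        for k in kept:
            rivals = [r for r in kept if r != k]
            if not rivals:
                break
            redundancy = {r: mutual_information(candidates[k], candidates[r])
                          for r in rivals}
            rival = max(redundancy, key=redundancy.get)
            if redundancy[rival] > th2:
                # Of the redundant pair, drop the one less relevant to the target.
                kept.remove(k if relevance[k] < relevance[rival] else rival)
                changed = True
                break
    return kept


# Toy data (hypothetical): one informative input, a coarser copy of it,
# and an input that is independent of the target.
target = [0, 1, 2, 3] * 4
candidates = {
    "informative": list(target),
    "coarse_copy": [v // 2 for v in target],
    "unrelated": [0] * 4 + [1] * 4 + [2] * 4 + [3] * 4,
}
selected = two_stage_filter(candidates, target, th1=0.5, th2=0.8)
```

In this toy case the independent input is removed by the first stage, and the coarser copy is removed by the second stage because it is redundant with the more relevant input.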

Therefore, the target variable is the electricity load of the next time interval, which the forecasting engine predicts using the past values of electricity load and calendar effects. Moreover, multi-period forecast, e.g., load prediction for the next 24 hours, is reached via recursion, i.e., by feeding input variables with the forecaster’s outputs. For instance, the forecasted load for the first hour is used as $y(t-1)$ for load prediction of the second hour, provided that $y(t-1)$ is among the candidate inputs selected by the feature selection technique.
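The recursion described above can be illustrated with a small sketch. The one-step model here is a stand-in (an average of the selected lags) and not the SRWNN; the function names, the lag set and the history values are hypothetical.

```python
def recursive_forecast(one_step_model, history, lags, horizon):
    """Multi-period forecast by feeding predictions back as lagged inputs.

    one_step_model: callable that maps a list of lagged values to the next
    value (a stand-in for the trained forecast engine).
    lags: positive lag offsets used as inputs, e.g. [1, 24].
    """
    series = list(history)
    forecasts = []
    for _ in range(horizon):
        inputs = [series[-lag] for lag in lags]
        y_hat = one_step_model(inputs)
        forecasts.append(y_hat)
        series.append(y_hat)  # the forecast becomes y(t-1) for the next step
    return forecasts


def toy_model(inputs):
    """Hypothetical stand-in model: mean of the selected lagged values."""
    return sum(inputs) / len(inputs)


history = [float(v) for v in range(1, 49)]  # 48 past hourly values
day_ahead = recursive_forecast(toy_model, history, lags=[1, 24], horizon=24)
```

Because each forecast is appended to the series before the next step, forecast errors can accumulate over the 24 steps, which is a known property of recursive multi-step schemes.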

The input layer of the forecast engine transmits the $M$ input variables, which are selected by the feature selection technique, to the next layer without any changes. The second layer, which is called the wavelet layer, consists of $N \times M$ neurons, each of which has a self-feedback loop. In this paper, the Morlet wavelet function has been considered as the activation function of neurons in the mother-wavelet layer, which is defined as follows:

$$\psi(x) = e^{-0.5x^2}\cos(5x) \tag{1}$$

In SRWNN, a wavelet of each node is derived from its mother wavelet as below:

$$\psi_{i,j}(r_{i,j}) = \psi\left(\frac{u_{i,j} - b_i}{a_i}\right), \qquad r_{i,j} = \frac{u_{i,j} - b_i}{a_i} \tag{2}$$

where $\psi_{i,j}$ is the scaled and shifted version of the Morlet mother wavelet with $a_i$ and $b_i$ as the scale and shift parameters, respectively. In addition, the inputs of the wavelets in (2) are as follows:

$$u_{i,j} = x_j + \psi_{i,j} z^{-1} \cdot \theta_{i,j} \tag{3}$$

where $z^{-1}$ is the time delay; thus, the input of this layer contains the memory term $\psi_{i,j} z^{-1}$, which can store the past information of the network, and $\theta_{i,j}$ denotes the weight of the self-feedback loop, which represents the rate of information storage. This feature is the main difference between an SRWNN and a WNN. In fact, the SRWNN is the same as the WNN when all $\theta_{i,j}$ are equal to zero. Note that the initial values of $\theta_{i,j}$ are usually set to zero, which means there are no feedback units initially.

$M$-dimensional wavelet functions are constructed by the tensor product of one-dimensional Morlet wavelets in the third layer as follows:

$$\Psi_i = \prod_{j=1}^{M} \psi_{i,j}, \qquad i = 1, 2, ..., N \tag{4}$$

The output of the SRWNN, denoted by y, is finally computed as below:

$$y = \sum_{i=1}^{N} w_i \cdot \Psi_i + \sum_{j=1}^{M} v_j \cdot x_j + g \tag{5}$$

where $w_i$ is the weight between the $i$th neuron of the product layer and the output node, $v_j$ is the direct input weight between the $j$th input and the output node, and $g$ is the bias of the output node. Therefore, the output of the SRWNN is obtained by a combination of multi-dimensional wavelet functions, i.e., $\Psi_i$, as well as a combination of inputs, i.e., $x_j$. In other words, the proposed model can not only benefit from the capabilities of wavelet functions, such as their ability to capture cyclical behaviors, but can also capture trends of the signal. In addition, the SRWNN can benefit from its dynamic response by storing the past information of wavelets in self-feedback loops (equation 3) to capture complex nonlinearities. Based on the aforementioned formulation, the vector of the free parameters of the SRWNN is denoted by $P$ as follows:

$$P = [v_j, w_i, a_i, b_i, \theta_{i,j}, g], \qquad i = 1, ..., N, \quad j = 1, ..., M \tag{6}$$

Therefore, the SRWNN has $N_P = M + 3N + M \times N + 1$ free parameters ($M$ direct input weights $v_j$; $N$ each of $w_i$, $a_i$ and $b_i$; $M \times N$ feedback weights $\theta_{i,j}$; and the bias $g$), which are determined by the training method. It should also be noted that the SRWNN model presented in this paper differs from the SRWNN proposed in [32] in two respects. First, there is an additional external bias (i.e., $g$) at the output layer of the SRWNN presented in this work. A bias can increase or lower the net input of the activation function, depending on whether it is positive or negative, respectively [35]. Consequently, biases can enhance the input/output mapping function by adding another feature to neural networks. Second, Morlet wavelet functions have been used as the activation functions in the wavelet layer of the SRWNN in this paper, whereas the previous version in [32] uses the second derivative of the Gaussian function, i.e., the Mexican hat wavelet function. Although the Mexican hat wavelet function has successfully been used in WNN models for forecasting applications due to its advantages over Daubechies wavelets [29], it has been shown that Morlet wavelets outperform Mexican hat wavelets for prediction applications [36, 37]. Therefore, we applied Morlet mother wavelets as the activation functions in the SRWNN in this paper.
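To make the layer-by-layer computation of Eqs. (1) to (5) concrete, the following is a minimal forward-pass sketch in plain Python. The class name, method names and initial parameter values are assumptions for illustration only (apart from the zero initial feedback weights, which the text describes as the usual choice); training is not included.

```python
import math


def morlet(x):
    """Morlet mother wavelet of Eq. (1)."""
    return math.exp(-0.5 * x * x) * math.cos(5.0 * x)


class SRWNN:
    """Forward pass of Eqs. (1)-(5) only; no training logic."""

    def __init__(self, n_inputs, n_wavelons):
        self.M, self.N = n_inputs, n_wavelons
        # Illustrative initial values (assumptions, not the paper's settings).
        self.a = [1.0] * self.N                                   # scales a_i
        self.b = [0.0] * self.N                                   # shifts b_i
        self.theta = [[0.0] * self.M for _ in range(self.N)]     # feedback weights
        self.w = [0.1] * self.N                                   # product-layer weights
        self.v = [0.0] * self.M                                   # direct input weights
        self.g = 0.0                                              # output bias
        self.prev_psi = [[0.0] * self.M for _ in range(self.N)]  # memory term

    def n_parameters(self):
        """Number of free parameters, M + 3N + M*N + 1."""
        return self.M + 3 * self.N + self.M * self.N + 1

    def step(self, x):
        """Compute the output for one input vector and update the memory."""
        y = self.g + sum(vj * xj for vj, xj in zip(self.v, x))
        for i in range(self.N):
            product = 1.0
            for j in range(self.M):
                u = x[j] + self.prev_psi[i][j] * self.theta[i][j]  # Eq. (3)
                psi = morlet((u - self.b[i]) / self.a[i])          # Eq. (2)
                self.prev_psi[i][j] = psi   # stored for the next call (z^-1)
                product *= psi                                     # Eq. (4)
            y += self.w[i] * product                               # Eq. (5)
        return y


net = SRWNN(n_inputs=3, n_wavelons=2)
y_first = net.step([0.0, 0.0, 0.0])
```

With all feedback weights at zero the network behaves as a plain WNN, so repeated calls with the same input return the same output; once any feedback weight is non-zero, the output depends on the previous wavelet activations as well.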

3.2. The training algorithm

In this subsection, a training algorithm is implemented to set the free parameters of the SRWNN denoted by $P$ in (6). Since the mother wavelet function used in the SRWNN, i.e., the Morlet wavelet function, is differentiable with respect to all free parameters, the Levenberg-Marquardt (LM) learning algorithm can be used for this purpose. This learning algorithm was applied to train neural networks by Hagan and Menhaj in [38]. Due to the advantages of the LM algorithm, such as accurate training and fast convergence, it has been recommended in many research works, and therefore, it is implemented for training the SRWNN in this paper. The LM algorithm is briefly described in the Appendix and its implementation on the SRWNN is then presented.
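For orientation, a generic Levenberg-Marquardt loop is sketched below; it is not the adapted version described in the Appendix. It uses the standard update dp = -(J^T J + mu I)^{-1} J^T e with a damping factor that shrinks after an accepted step and grows after a rejected one. The forward-difference Jacobian, the factor of 10, the function names and the straight-line toy problem are assumptions; an SRWNN implementation would use the analytic derivatives of the network with respect to its free parameters.

```python
def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def lm_fit(residuals, p, mu=1e-2, iters=50, h=1e-6):
    """Generic LM loop: dp = -(J^T J + mu I)^-1 J^T e."""
    def sse(q):
        return sum(e * e for e in residuals(q))

    n = len(p)
    for _ in range(iters):
        e = residuals(p)
        # Forward-difference Jacobian, jac[k][j] = d e_k / d p_j.
        jac = [[0.0] * n for _ in e]
        for j in range(n):
            q = p[:]
            q[j] += h
            for k, ek in enumerate(residuals(q)):
                jac[k][j] = (ek - e[k]) / h
        m = len(e)
        A = [[sum(jac[k][r] * jac[k][c] for k in range(m)) + (mu if r == c else 0.0)
              for c in range(n)] for r in range(n)]
        grad = [sum(jac[k][r] * e[k] for k in range(m)) for r in range(n)]
        dp = solve(A, [-gr for gr in grad])
        trial = [pi + di for pi, di in zip(p, dp)]
        if sse(trial) < sse(p):
            p, mu = trial, mu / 10.0  # accept: behave more like Gauss-Newton
        else:
            mu *= 10.0                # reject: behave more like gradient descent
    return p


# Toy problem (hypothetical): recover y = 1 + 2x from noise-free samples.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x for x in xs]


def line_residuals(p):
    return [p[0] + p[1] * x - y for x, y in zip(xs, ys)]


fitted = lm_fit(line_residuals, [0.0, 0.0])
```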

Moreover, the termination criterion used for the training of the SRWNN is based on the early-stopping technique. Accordingly, the whole available data is divided into training and validation samples. The SRWNN is trained using the training samples and the error for the validation samples is monitored in each iteration. When the validation error keeps rising for some number of iterations, usually five, the training phase is stopped, and the values of the free parameters from the iteration with the least validation error are stored as the final solution of the training algorithm.
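The early-stopping rule above can be expressed as a small wrapper around any iterative training routine. This is a sketch under assumptions: `update` and `validation_error` are hypothetical callables standing in for one training iteration and the validation-set error, and the scalar toy problem only demonstrates the control flow.

```python
def train_with_early_stopping(update, validation_error, params,
                              patience=5, max_iter=1000):
    """Stop after `patience` consecutive iterations without improvement.

    Returns the parameters with the lowest validation error seen so far,
    not the parameters of the last iteration.
    """
    best_params, best_err = params, validation_error(params)
    bad_iters = 0
    for _ in range(max_iter):
        params = update(params)
        err = validation_error(params)
        if err < best_err:
            best_params, best_err, bad_iters = params, err, 0
        else:
            bad_iters += 1
            if bad_iters >= patience:
                break
    return best_params, best_err


# Toy example (hypothetical): the "parameter" is a number that increases by
# one per iteration, and the validation error is smallest at 3.
best, err = train_with_early_stopping(
    update=lambda p: p + 1,
    validation_error=lambda p: (p - 3) ** 2,
    params=0,
)
```

In the toy run the error falls until the parameter reaches 3 and then rises for five iterations, at which point training stops and the parameters from the best iteration are returned.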

4. Numerical results

In this paper, we mainly focus on 24-hour ahead load prediction with hourly forecast steps. Day-ahead load forecasting can bring significant operational advantages for energy management of micro-grids. For instance, the BCIT micro-grid consists of different types of generating units (e.g., thermal, wind and PV units), and day-ahead load predictions are used for energy management purposes. In other words, optimal utilization of available resources is achieved using load forecasting in order to minimize the operation cost of the BCIT campus micro-grid. Moreover, as this micro-grid can operate in both stand-alone and grid-connected modes, an accurate load forecast can be used for profitable trade of electric energy within the British Columbia power system.

The same load time series of the BCIT building and the two power systems are used for the numerical experiments of this section. Based on the data analyses presented in section 2, the electricity load depends not only on the load profile of the previous day, i.e., daily periodicity, but also on the load pattern of the previous week, i.e., weekly periodicity. To capture such patterns, 192 candidate inputs have been considered as lagged hourly load data, i.e., $\{L_{t-192}, \ldots, L_{t-1}\}$, where $L_t$ indicates the electricity load at time $t$. The feature selection technique selects the most informative lagged load values from these candidate inputs. Calendar information is also highly important for a load forecasting model so as to capture weekly and seasonal patterns. For instance, either considering the day of the week or differentiating weekends from weekdays is a common approach in the literature [5, 9, 39]. Thus, weekends and holidays are considered in this work using a binary variable that distinguishes weekends and holidays from weekdays. The month of the year is also used in some cases [39]; however, it is not considered in this paper since the seasonality factor is already captured, as the model is re-trained every day. Furthermore, temperature data as an exogenous variable has been used to improve load forecasting, since temperature time series usually have high relevance to electricity consumption time series [5, 7, 8, 40]. Accordingly, based on publicly available data, seven daily values of temperature for the previous week, i.e., $T_{d-7}, \ldots, T_{d-1}$, and the daily forecast value of the temperature for the prediction day, i.e., $T_d$, were first considered for the model, where $T_d$ represents the average daily temperature for day $d$. However, numerical experiments for the BCIT test case revealed that low-resolution temperature data, i.e., daily data, cannot improve the accuracy of the hourly load forecast. Therefore, we tested historical hourly temperature data (recorded in Vancouver) and also used the same time series as temperature forecasts, i.e., perfect forecasts, in order to observe whether hourly temperature data can enhance the forecast results for the BCIT test case. For this purpose, lagged hourly temperature data, i.e., $\{T_{t-192}, \ldots, T_{t-1}\}$, are considered as 192 candidate inputs that feed the feature selection stage along with the 192 candidate inputs for load data. The feature selection technique then selects the most informative candidates among all load and temperature candidates and transfers them to the model. Only a few temperature inputs are among the selected inputs, which indicates the low correlation between the temperature and load time series of BCIT. The low correlation results from the fact that the electric load of this building is mainly lighting load. Considering the mild temperatures in Vancouver, the heating load is not as significant. The numerical results also supported this low correlation, as hourly temperature data, even with perfect forecasts, could not improve the forecast accuracy of the model. Therefore, temperature inputs are not considered for the numerical results in this paper.
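As an illustration of the candidate inputs described above, the following Python sketch builds the 192 lagged hourly load values and appends the binary calendar flag. It is only a minimal sketch under stated assumptions: the function name `build_candidates`, the synthetic load series, and the weekday rule used to fill the flag are hypothetical, and neither the feature selection stage nor the handling of the 24-hour-ahead horizon is shown.

```python
import numpy as np

N_LAGS = 192  # number of lagged hourly load candidates, as stated in the text


def build_candidates(load, is_workday, n_lags=N_LAGS):
    """Return (X, y) where each row of X holds L_{t-192}, ..., L_{t-1}
    followed by the calendar flag of hour t, and y holds the target L_t.

    load       : 1-D array of hourly load values
    is_workday : 1-D array of 0/1 flags (0 = weekend or holiday, 1 = weekday)
    """
    load = np.asarray(load, dtype=float)
    is_workday = np.asarray(is_workday, dtype=float)
    rows, targets = [], []
    for t in range(n_lags, len(load)):
        lags = load[t - n_lags:t]                  # oldest lag first
        rows.append(np.append(lags, is_workday[t]))
        targets.append(load[t])
    return np.array(rows), np.array(targets)


# Synthetic example (not BCIT data): 20 days of hourly load with a daily cycle.
hours = np.arange(20 * 24)
load = 400.0 + 100.0 * np.sin(2 * np.pi * hours / 24)
is_workday = ((hours // 24) % 7 < 5).astype(int)   # hypothetical calendar

X, y = build_candidates(load, is_workday)
print(X.shape, y.shape)
```

Each row of `X` is one set of candidates for a single target hour; a feature selection method would then keep only a subset of the first 192 columns.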

To show the effectiveness of different forecasting engines, SRWNN is compared with two other efficient neural network-based forecasting models, i.e., WNN and Multi-Layer Perceptron (MLP). It is noted that statistical models (e.g., the Autoregressive Integrated Moving Average (ARIMA) model) are not considered in this paper since such techniques are basically linear methods and have limited capability to capture nonlinearities in the load series [41, 42]. Therefore, we chose two efficient Computational Intelligence (CI) based models as benchmarks in our comparative results, i.e., MLP as an efficient Feed Forward Neural Network (FFNN) and WNN as an effective model combining the nonlinear mapping merits of FFNNs and wavelet functions.

Hence, 10 test months of hourly load data from the building in BCIT, from May 2012 to February 2013, are considered for 24-hour ahead load prediction. It is noted that the first two months of the historical data are used for training the forecast engine, and so results for those two months cannot be presented here. Two error criteria are used in this paper to evaluate forecast errors: (i) normalized Root Mean Square Error (nRMSE) and (ii) normalized Mean Absolute Error (nMAE), defined as follows:

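$$\mathrm{nRMSE} = \sqrt{\frac{1}{N}\sum_{t=1}^{N}\left(\frac{L_{ACT}(t) - L_{FOR}(t)}{L_{Peak}}\right)^{2}} \times 100 \tag{7}$$

$$\mathrm{nMAE} = \frac{1}{N}\sum_{t=1}^{N}\left|\frac{L_{ACT}(t) - L_{FOR}(t)}{L_{Peak}}\right| \times 100 \tag{8}$$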

where $L_{ACT}(t)$ and $L_{FOR}(t)$ indicate the actual and forecast values of electricity load for hour $t$. Moreover, $N$ indicates the number of hours in each month, and $L_{Peak}$ is the peak value of the electricity load over the year, which is 694 kW for this test case. Observe from Table 3 that SRWNN outperforms the other forecasting models in all test months in terms of both nRMSE and nMAE. For instance, the average nRMSE and average nMAE of SRWNN are (5.67-4.98)/5.67=12.1% and (4.26-3.74)/4.26=12.2% lower than those of WNN, and (8.78-4.98)/8.78=43.2% and (6.54-3.74)/6.54=42.8% lower than those of MLP, respectively. This table demonstrates that for a highly volatile time series, i.e., micro-grid electricity load, an SRWNN forecasting model can cope more efficiently with the variations and non-smooth behavior of the time series.

Table 3: Forecasting errors, in %, of SRWNN, WNN and MLP for 10 test months.

| Month | MLP nRMSE | MLP nMAE | WNN nRMSE | WNN nMAE | SRWNN nRMSE | SRWNN nMAE |
|---|---|---|---|---|---|---|
| May | 8.44 | 6.22 | 5.96 | 4.05 | 5.23 | 3.80 |
| Jun. | 9.92 | 7.55 | 5.44 | 4.27 | 4.86 | 3.80 |
| Jul. | 10.41 | 7.92 | 7.04 | 5.26 | 5.43 | 4.01 |
| Aug. | 10.40 | 8.14 | 6.57 | 4.95 | 6.46 | 4.80 |
| Sep. | 11.88 | 8.40 | 7.83 | 6.01 | 6.28 | 4.82 |
| Oct. | 10.45 | 7.68 | 4.81 | 3.83 | 4.24 | 3.28 |
| Nov. | 6.34 | 4.89 | 4.62 | 3.56 | 4.30 | 3.21 |
| Dec. | 6.21 | 4.58 | 4.54 | 3.35 | 4.22 | 3.05 |
| Jan. | 6.93 | 4.74 | 4.86 | 3.40 | 4.25 | 3.11 |
| Feb. | 6.85 | 5.29 | 5.06 | 3.94 | 4.58 | 3.50 |
| Average | 8.78% | 6.54% | 5.67% | 4.26% | 4.98% | 3.74% |
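The two error criteria (7) and (8) can be sketched in Python as follows. The function names `nrmse` and `nmae` and the four-hour sample values are illustrative assumptions; only the peak value of 694 kW comes from the text.

```python
import numpy as np


def nrmse(actual, forecast, peak):
    """Normalized root mean square error in percent, following (7)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return 100.0 * np.sqrt(np.mean(((actual - forecast) / peak) ** 2))


def nmae(actual, forecast, peak):
    """Normalized mean absolute error in percent, following (8)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return 100.0 * np.mean(np.abs((actual - forecast) / peak))


L_PEAK = 694.0  # kW, yearly peak load of the BCIT test case quoted in the text

# Made-up hourly values in kW, for illustration only.
actual = np.array([500.0, 600.0, 650.0, 550.0])
forecast = np.array([520.0, 580.0, 690.0, 550.0])

print(f"nRMSE = {nrmse(actual, forecast, L_PEAK):.2f}%")
print(f"nMAE  = {nmae(actual, forecast, L_PEAK):.2f}%")
```

Because both criteria are normalized by the same peak value, monthly errors remain comparable even when the load level changes between months.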

Moreover, Fig. 4 illustrates the carpet charts of monthly mean absolute errors for different hours of the day for SRWNN and WNN on the BCIT test case. This figure clearly shows that large errors for both models usually occur between 12:00 and 16:00, when the load peaks. However, the colormap shows lower errors during the peak hours for SRWNN in comparison with WNN. More importantly, the superiority of SRWNN over WNN is revealed during the upward ramps in the morning. As analyzed in section 2, sharp upward ramps occur more often than downward ramps for the BCIT test case, and consequently, any improvement in forecasting ramp-up events can considerably enhance the forecast accuracy of this load time series. Fig. 5 shows the average of the mean absolute errors over all 10 months. According to these two curves, SRWNN shows lower average errors during the morning ramp, which usually occurs from 7:00 to 12:00. In addition, there is an improvement in ramp-down forecasting from 16:00 to 18:00.

Figure 4: Mean absolute error (kW) of different hours of the day in different months. (a) SRWNN (b) WNN

Figure 5: 10-month mean absolute error for different hours of the day (horizontal axis: Hour, 1 to 24; vertical axis: Mean Absolute Error (kW); curves: SRWNN and WNN)
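An hour-of-day error profile like the one plotted in Fig. 5 can be obtained by averaging absolute errors over all days for each hour. The sketch below is a hypothetical illustration on synthetic data; it assumes that both series start at the first hour of a day and cover whole days, and the name `hourly_mae_profile` is not from the paper.

```python
import numpy as np


def hourly_mae_profile(actual, forecast):
    """Mean absolute error for each of the 24 hours of the day.

    Assumes the series start at the first hour of a day and contain a
    whole number of days, so that they reshape to (days, 24).
    """
    err = np.abs(np.asarray(actual, dtype=float) - np.asarray(forecast, dtype=float))
    return err.reshape(-1, 24).mean(axis=0)


# Synthetic 3-day example: the forecast is off by 10 kW at one hour of each day.
actual = np.full(72, 500.0)
forecast = actual.copy()
forecast[12::24] += 10.0

profile = hourly_mae_profile(actual, forecast)
print(profile)
```

Computing one such profile per month and stacking the results row by row gives the kind of month-by-hour grid that a carpet chart displays.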

Curves of the generated forecasts and real data for a good forecasting day, i.e., November 15, and a bad forecasting day, i.e., September 7, are shown in Fig. 6. Fig. 6(a) shows that there are sharp changes and variations on September 7. Sharp spikes could result from high temperatures on specific days, which increase the electricity consumption of the buildings for air conditioning. As a consequence of such severe ramps, the forecasting model has difficulty capturing these sudden, large variations in electricity load. The major error is the magnitude error that occurs during the peak load. On the contrary, there are smoother variations on November 15, shown in Fig. 6(b), so the forecasting model could closely capture the upward ramp. As a result, the challenge of high volatility and sharp ramps in micro-grid time series is evidently distinct from that of power system loads, and makes such time series less predictable.

In the next experiment, forecasting errors for different days of the week over the same 10-month period are considered separately to observe the users’ behavior. It is noted that the electricity consumption of the building is mainly from lighting, as mentioned earlier in this section. Here, users’ behavior is represented by including the calendar effect among the inputs of the model. A binary variable differentiating weekends and holidays from weekdays is used, i.e., zero represents weekends and holidays, while one represents weekdays. Fig. 7 shows the forecasting errors with and without the calendar effect. First, observe that the average nRMSE considering the calendar effect, i.e., 4.78%, is lower than that when the calendar effect is not included, i.e., 5.32%. Moreover, according to the figure, the highest error occurs on Mondays, the first working day at the university. Calendar inputs can efficiently capture such user behavior. For instance, the forecasting error in terms of nRMSE corresponding to Monday decreases considerably, from 7.32% to 5.83%, when the calendar effect is taken into account. In addition, the standard deviation of the error across the days of the week decreases from 0.97% to 0.57% with the calendar effect. In other words, the model performs more robustly in predicting different days of the week. According to Fig. 7, the difference between the maximum and minimum errors with the calendar effect, i.e., 1.85%, and without it, i.e., 2.83%, also shows the better performance of the model that includes the calendar effect. As a result, users’ behavior can be captured efficiently by considering calendar effects in order to improve the forecast accuracy.

Figure 6: Samples for bad (a) and good (b) forecasting days. (a) Bad forecasting day (b) Good forecasting day

Figure 7: Forecasting errors for different days of the week
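The per-weekday comparison discussed above (error per day of the week, its standard deviation, and the gap between the largest and smallest error) can be sketched as follows. This is a hypothetical illustration on synthetic data: the function name, the weekday labelling (0 = Monday) and the choice of the population standard deviation are assumptions, since the text does not specify them.

```python
import numpy as np


def nrmse_by_weekday(actual, forecast, weekday, peak):
    """Return {weekday label: nRMSE in %} over all hours with that label."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    weekday = np.asarray(weekday)
    result = {}
    for d in range(7):
        mask = weekday == d
        if mask.any():
            rel = (actual[mask] - forecast[mask]) / peak
            result[d] = 100.0 * np.sqrt(np.mean(rel ** 2))
    return result


# Synthetic two-week example: 20 kW miss on Mondays (label 0), 10 kW otherwise.
hours = np.arange(14 * 24)
weekday = (hours // 24) % 7
actual = np.full(hours.size, 500.0)
forecast = actual - np.where(weekday == 0, 20.0, 10.0)

per_day = nrmse_by_weekday(actual, forecast, weekday, peak=694.0)
values = np.array([per_day[d] for d in range(7)])
spread = values.max() - values.min()   # maximum minus minimum error across days
std = values.std()                     # population standard deviation (assumed)
print(per_day, spread, std)
```

Running the same summary on forecasts produced with and without the calendar input would give the two sets of figures that the paragraph compares.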

In the last experiment, the proposed forecasting model is applied to predict two power system time series. The main goal of this numerical experiment is to show how the forecast accuracy of SRWNN improves, compared with WNN, as the volatility of the time series increases. Hence, from a power system with low volatility to one with higher volatility, the forecast accuracy improvement of SRWNN increases. To this end, the same test cases for British Columbia’s and California’s power systems are considered. Table 4 shows the forecast errors obtained (based on the average 10-month error) for both the SRWNN and WNN models. Firstly, this table shows noticeably lower forecast errors of both models for the power systems’ load data compared with those for the micro-grid in Table 3. For instance, the nRMSE of SRWNN is 4.98% for the micro-grid compared with 2.29% for British Columbia’s power system. Moreover, as observed from Table 1, the volatility of British Columbia’s power system time series is the lowest in terms of both daily and weekly volatility indices. Consequently, higher predictability is expected for British Columbia’s power system compared to the micro-grid and California’s power system. Table 4 confirms that the forecasting errors for British Columbia are lower than those for California, e.g., 2.46% compared to 3.67% in terms of nRMSE for WNN.

Table 4: Forecasting errors of SRWNN and WNN for two power systems.

| Power system | WNN nRMSE | WNN nMAE | SRWNN nRMSE | SRWNN nMAE | Improvement in nRMSE (%) | Improvement in nMAE (%) |
|---|---|---|---|---|---|---|
| British Columbia | 2.46 | 1.81 | 2.29 | 1.69 | 6.9 | 6.6 |
| California | 3.67 | 2.57 | 3.38 | 2.37 | 7.9 | 7.8 |

Secondly, Table 4 shows how effective the SRWNN becomes as the volatility of a time series increases. As seen from the last two columns of Table 4, the forecast accuracy improvements obtained from SRWNN in terms of nRMSE and nMAE are respectively (2.46-2.29)/2.46=6.9% and (1.81-1.69)/1.81=6.6% for BC’s power system. Similarly, there are 7.9% and 7.8% forecast accuracy improvements in terms of nRMSE and nMAE for California’s power system, respectively. Therefore, since the volatility of California’s power system is higher than that of British Columbia’s, SRWNN obtains a higher improvement of forecast accuracy compared with WNN for California’s power system. In other words, California’s load contains higher daily and weekly volatilities, and consequently, the SRWNN can capture these variations and present more accurate forecast results compared with WNN. To give a better sense of these percentage errors, the forecast accuracy improvement in terms of mean absolute error is around 93 MW, which is almost twice the capacity of the Kumeyaay wind farm, i.e., 50 MW, located in San Diego, California [43]. As a result, as the volatility of the time series increases, the performance of SRWNN improves in comparison with WNN. As mentioned earlier in this section, load forecast accuracy can be improved using weather forecast data as exogenous inputs to the forecasting model. For instance, load forecasting models utilized by the California ISO (CAISO) include weather forecasts, such as temperature, dew point, wind speed and cloud cover, for the next 9 days for 24 weather stations [44]. It is noted that including such exogenous inputs in the model depends on the availability of public data.
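The improvement percentages quoted from Table 4 follow from a simple relative-difference calculation, which the short Python sketch below reproduces from the tabulated errors. The helper name `relative_improvement` and the dictionary layout are illustrative choices; the numbers are those of Table 4.

```python
def relative_improvement(baseline, proposed):
    """Percentage by which the proposed error is lower than the baseline error."""
    return 100.0 * (baseline - proposed) / baseline


# (nRMSE, nMAE) in % for each model, copied from Table 4.
table4 = {
    "British Columbia": {"WNN": (2.46, 1.81), "SRWNN": (2.29, 1.69)},
    "California": {"WNN": (3.67, 2.57), "SRWNN": (3.38, 2.37)},
}

improvements = {}
for system, errors in table4.items():
    improvements[system] = tuple(
        round(relative_improvement(wnn, srwnn), 1)
        for wnn, srwnn in zip(errors["WNN"], errors["SRWNN"])
    )

print(improvements)
```

Since the inputs are already rounded to two decimals, recomputed percentages for other tables may differ in the last digit from values derived from unrounded errors.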

The computation time of the SRWNN model for the training phase is less than 35 seconds for one-day prediction for the test cases of this paper, measured on a Mac with a 2.7 GHz Intel Core i5 processor and 12 GB of RAM. Although this computation time is longer than that of WNN, i.e., less than 11 seconds, it is completely acceptable within a 24-hour decision making framework, and shows the fast forecasting performance of the proposed method.

5. Conclusions

STLF is an important tool for reliable and economic operation of power systems, as many operating decisions are based on load forecasts, e.g., dispatch scheduling of generating units, security assessment and demand side management. Likewise, precise STLF for a micro-grid can enhance the management of its renewable and conventional resources and improve the economics of energy trade with electricity markets. Considering the volatile and non-smooth characteristics of the load time series of micro-grids compared with power systems’ electricity load, a new forecasting method is proposed in this paper to deal with such challenges. The proposed method uses an SRWNN as the forecasting engine, in which feedback loops have been added to a WNN so as to better capture the nonlinear complexities of volatile time series. The LM learning algorithm is implemented to train the SRWNN, i.e., to adjust the free parameters of the SRWNN. The high volatility of a micro-grid load was shown by defining a volatility criterion and comparing it with the volatility of two power systems’ load data. The effectiveness of the proposed forecasting method was demonstrated on real-world load data of a micro-grid and two power systems. The results show that the proposed SRWNN model leads to more accurate forecasts when a volatile time series prediction is of interest.

Appendix: Formulation of the training algorithm

The task of the forecasting engine is to learn the mapping function between a specified set of input/output pairs $\{(X_1, t_1), (X_2, t_2), \ldots, (X_Q, t_Q)\}$, known as training samples. $Q$ indicates the number of training samples. $X_q$ and $t_q$ are the $q$th input vector and the corresponding target output of the forecasting model, respectively. Mean squared error (MSE) is usually considered to be the performance index for the network. The MSE is calculated by

$$\mathrm{MSE} = \frac{1}{Q}\sum_{q=1}^{Q} e_q^{2}, \quad (e_q = t_q - y_q) \tag{A.1}$$

where $y_q$ is the output of the forecasting engine when $X_q$ is fed as its input, and $e_q$ is the forecast error of the $q$th sample.

The LM algorithm is an approximation of Newton’s method, in which the

solution is updated as follows:

$$P_{k+1} = P_k - (J^{\top}J + \mu I)^{-1} J^{\top} e \tag{A.2}$$

where $P$ is the vector of the free parameters according to (6), $k$ represents the iteration number, and $I$ is the identity matrix. $J$ is the Jacobian matrix composed of the first derivatives of the network errors with respect to all its free parameters, $J^{\top}J$ is the approximation of the Hessian matrix, and $e$ is the vector of network errors. Considering (A.1) as the performance function that should be minimized, the gradient of (A.1) can be shown to be proportional to $J^{\top}e$.

The main modification of the LM algorithm with respect to Newton’s method is the parameter $\mu$, such that the algorithm becomes Newton’s method (with the approximate Hessian $J^{\top}J$) if $\mu$ is zero in (A.2). When $\mu$ is large, the LM algorithm tends to gradient descent with a small step size, i.e., $1/\mu$, while for small $\mu$ the LM algorithm tends to Newton’s method. Since Newton’s method is faster and more accurate than gradient descent near a minimum, the aim is to shift toward Newton’s method as quickly as possible. Thus, $\mu$ is divided by a factor $\beta$ ($\beta > 1$) after each successful step, i.e., a reduction in the MSE given in (A.1). On the contrary, $\mu$ is multiplied by the factor $\beta$ when a tentative step increases the MSE. Therefore, the MSE is always reduced at each iteration of the algorithm [38]. The initial value for $\mu$ is usually set to 0.01 and $\beta$ is usually set to 10. For further details regarding the LM training algorithm, the interested reader can refer to [38]. The implementation of the LM learning algorithm on the SRWNN is presented in the following.
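The two limiting cases of the update (A.2) can be checked numerically. The sketch below uses a random Jacobian and error vector as stand-ins (they are not SRWNN quantities) and compares the LM step with a gradient-descent step of size $1/\mu$ for large $\mu$, and with the least-squares (Gauss-Newton) step for $\mu = 0$, which is the Newton-type limit referred to in the text. The function name `lm_step` is an illustrative choice.

```python
import numpy as np


def lm_step(J, e, mu):
    """Parameter increment of (A.2): -(J^T J + mu I)^{-1} J^T e."""
    n_params = J.shape[1]
    return -np.linalg.solve(J.T @ J + mu * np.eye(n_params), J.T @ e)


# Random stand-ins for a Jacobian (20 samples, 3 parameters) and an error vector.
rng = np.random.default_rng(0)
J = rng.normal(size=(20, 3))
e = rng.normal(size=20)

# Large mu: the step approaches gradient descent with step size 1/mu.
mu_large = 1e8
step_large = lm_step(J, e, mu_large)
gd_step = -(J.T @ e) / mu_large

# mu = 0: the step solves the linearized least-squares problem J dp = -e.
step_gn = lm_step(J, e, 0.0)
lstsq_step = np.linalg.lstsq(J, -e, rcond=None)[0]

print(np.linalg.norm(step_large - gd_step), np.linalg.norm(step_gn - lstsq_step))
```

Intermediate values of $\mu$ blend the two directions, which is why adapting $\mu$ by the factor $\beta$ moves the algorithm between cautious and aggressive steps.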

Since computation of the Jacobian matrix is the most important part of the LM algorithm, it is required to determine the first derivative of the network errors with respect to each free parameter of (6) in the SRWNN, i.e., $v_j$, $w_i$, $a_i$, $b_i$, $\theta_{i,j}$, and $g$. The elements of the Jacobian matrix are calculated by the following equations.

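$$\frac{\partial e}{\partial v_j} = \frac{\partial (t - y)}{\partial v_j} = -x_j, \quad j = 1, 2, \ldots, M \tag{A.3}$$

$$\frac{\partial e}{\partial w_i} = \frac{\partial (t - y)}{\partial w_i} = -\Psi_i, \quad i = 1, 2, \ldots, N \tag{A.4}$$

$$\frac{\partial e}{\partial a_i} = \frac{\partial (t - y)}{\partial a_i} = -w_i \cdot \frac{\partial \Psi_i}{\partial a_i}, \quad i = 1, 2, \ldots, N \tag{A.5}$$

$$\frac{\partial \Psi_i}{\partial a_i} = \sum_{j=1}^{M} \left[ \frac{d\psi_{i,j}}{da_i} \cdot \prod_{l=1,\, l \neq j}^{M} \psi(r_{i,l}) \right], \tag{A.6}$$

$$\frac{d\psi_{i,j}}{da_i} = \frac{-r_{i,j}}{a_i} \cdot \psi'(r_{i,j}), \tag{A.7}$$

where $\psi'(\cdot)$ is the derivative of the Morlet mother wavelet function.

$$\frac{\partial e}{\partial b_i} = \frac{\partial (t - y)}{\partial b_i} = -w_i \cdot \frac{\partial \Psi_i}{\partial b_i}, \quad i = 1, 2, \ldots, N \tag{A.8}$$

$$\frac{\partial \Psi_i}{\partial b_i} = \sum_{j=1}^{M} \left[ \frac{d\psi_{i,j}}{db_i} \cdot \prod_{l=1,\, l \neq j}^{M} \psi(r_{i,l}) \right], \tag{A.9}$$

$$\frac{d\psi_{i,j}}{db_i} = \frac{-1}{a_i} \cdot \psi'(r_{i,j}), \tag{A.10}$$

$$\frac{\partial e}{\partial \theta_{i,j}} = -w_i \cdot \frac{\partial \Psi_i}{\partial \theta_{i,j}}, \quad i = 1, \ldots, N, \; j = 1, \ldots, M \tag{A.11}$$

$$\frac{\partial \Psi_i}{\partial \theta_{i,j}} = \frac{\psi_{i,j} z^{-1}}{a_i} \cdot \psi'(r_{i,j}) \cdot \prod_{l=1,\, l \neq j}^{M} \psi(r_{i,l}) \tag{A.12}$$

$$\frac{\partial e}{\partial g} = \frac{\partial (t - y)}{\partial g} = -1 \tag{A.13}$$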
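The product-rule derivatives (A.6)-(A.7) and (A.9)-(A.10) can be checked against finite differences for a single wavelon. The sketch below rests on several assumptions that are not taken from this section: the Morlet wavelet is written in the common real-valued form $\cos(5r)\,e^{-r^2/2}$ (the paper's own definition may differ), the argument is $r_j = (x_j - b)/a$ with the self-feedback term omitted, and all function names and sample values are hypothetical.

```python
import numpy as np


def psi(r):
    """Assumed Morlet mother wavelet: cos(5 r) * exp(-r^2 / 2)."""
    return np.cos(5.0 * r) * np.exp(-0.5 * r ** 2)


def dpsi(r):
    """Derivative of the assumed Morlet wavelet with respect to r."""
    return (-5.0 * np.sin(5.0 * r) - r * np.cos(5.0 * r)) * np.exp(-0.5 * r ** 2)


def wavelon(x, a, b):
    """Psi = prod_j psi(r_j), with r_j = (x_j - b) / a (feedback omitted)."""
    return np.prod(psi((x - b) / a))


def dwavelon(x, a, b, wrt):
    """Sum over j of (dr_j/dp) * psi'(r_j) * prod_{l != j} psi(r_l).

    wrt = "a" uses dr_j/da = -r_j / a as in (A.7);
    wrt = "b" uses dr_j/db = -1 / a as in (A.10).
    """
    r = (x - b) / a
    values = psi(r)
    total = 0.0
    for j in range(len(r)):
        others = np.prod(np.delete(values, j))   # product over l != j
        dr = -r[j] / a if wrt == "a" else -1.0 / a
        total += dr * dpsi(r[j]) * others
    return total


# Arbitrary sample point for the check.
x = np.array([0.2, -0.1, 0.4])
a, b, h = 1.5, 0.1, 1e-6

fd_a = (wavelon(x, a + h, b) - wavelon(x, a - h, b)) / (2 * h)
fd_b = (wavelon(x, a, b + h) - wavelon(x, a, b - h)) / (2 * h)
print(dwavelon(x, a, b, "a") - fd_a, dwavelon(x, a, b, "b") - fd_b)
```

A check of this kind is a cheap way to catch sign or index mistakes in a hand-derived Jacobian before it is used in (A.2).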

Therefore, the Jacobian matrix of size $Q \times N_P$ can be computed using (A.3) to (A.13), and all free parameters of the SRWNN are updated using (A.2). The procedure of the LM learning algorithm for training the SRWNN is summarized as follows:

1. Set the iteration number to 1, i.e., $k = 1$. Randomly initialize the free parameters $v_j$, $w_i$, $a_i$, $b_i$, $\theta_{i,j}$ and $g$ of the forecasting engine within their allowable ranges to form $P_1$ for the first iteration.

2. Present all input vectors $X_q$ and compute the corresponding SRWNN outputs $y_q$ using (5). Moreover, compute the corresponding errors $e_q$ and the performance index MSE using (A.1).

3. Compute the Jacobian matrix.

4. Update the free parameters of the forecasting engine using (A.2) to obtain $P_{k+1}$.

5. Compute the performance index MSE using $P_{k+1}$. If the new MSE is smaller than the one computed in step 2, reduce the parameter $\mu$ by the factor $\beta$, and save $P_{k+1}$. Otherwise, increase the parameter $\mu$ by multiplying it by $\beta$ and go back to step 3.

6. Increment $k$, i.e., $k = k+1$. The training algorithm is terminated when the termination criterion is satisfied. Otherwise, go back to step 3. It is noted that the termination criterion can be the maximum number of iterations. However, the early stopping technique, discussed in section 3.2, is used as the termination criterion of the training algorithm in this paper, as it can monitor the prediction ability of the SRWNN forecast engine for unseen samples and terminate the training process at the point with the least validation error.
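The six steps above can be sketched as a generic Python loop. This is not the authors' implementation: the model is a toy two-parameter exponential rather than the SRWNN, the stopping rule is a maximum number of iterations instead of early stopping, the cap on $\mu$ is an added safeguard, and the names `lm_train`, `residual` and `jacobian` are hypothetical. On a rejected step the Jacobian is reused, since the parameters have not changed.

```python
import numpy as np


def lm_train(residual_fn, jacobian_fn, p0, mu=0.01, beta=10.0, max_iter=50):
    """Levenberg-Marquardt loop following steps 1-6 with a max-iteration stop.

    residual_fn(p) -> error vector e with e_q = t_q - y_q
    jacobian_fn(p) -> Jacobian of e with respect to p, shape (Q, N_P)
    """
    p = np.asarray(p0, dtype=float)                     # step 1 (given start)
    e = residual_fn(p)
    mse = np.mean(e ** 2)                               # step 2
    for _ in range(max_iter):                           # step 6
        J = jacobian_fn(p)                              # step 3
        while True:
            A = J.T @ J + mu * np.eye(p.size)
            p_new = p - np.linalg.solve(A, J.T @ e)     # step 4, update (A.2)
            e_new = residual_fn(p_new)
            mse_new = np.mean(e_new ** 2)               # step 5
            if mse_new < mse:
                p, e, mse = p_new, e_new, mse_new       # accept, shrink mu
                mu /= beta
                break
            mu *= beta                                  # reject, grow mu
            if mu > 1e10:                               # safeguard (assumption)
                return p, mse
    return p, mse


# Toy problem (not the SRWNN): fit y = exp(p0 * x) + p1 to noise-free targets.
x = np.linspace(0.0, 2.0, 30)
t = np.exp(0.5 * x) + 1.0


def residual(p):
    return t - (np.exp(p[0] * x) + p[1])


def jacobian(p):
    # Derivatives of e = t - y with respect to p0 and p1.
    return np.column_stack([-x * np.exp(p[0] * x), -np.ones_like(x)])


p_fit, mse_fit = lm_train(residual, jacobian, p0=[0.0, 0.0])
print(p_fit, mse_fit)
```

To mimic the early stopping used in the paper, one could evaluate the error on a held-out validation set after each accepted step and keep the parameters with the lowest validation error.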

Acknowledgements

Partial support for this work came from the Natural Sciences and Engineering Research Council of Canada (NSERC) and the ENMAX Corporation under the Industrial Research Chairs program. Moreover, the authors would like to thank Dr. Hassan Farhangi and Dr. Ali Palizan of the British Columbia Institute of Technology (BCIT) for providing data and invaluable insight.

References

[1] Navigant Research, 2013. URL: http://www.navigantresearch.com/research/microgrids.

[2] J. Taylor, P. McSharry, Short-term load forecasting methods: An evaluation based on European data, IEEE Transactions on Power Systems 22 (2007) 2213–2219.

[3] T. Hong, M. Gui, M. Baran, H. Willis, Modeling and forecasting hourly electric load by multiple linear regression with interactions, Power and Energy Society General Meeting, 2010 IEEE (2010) 1–8.

[4] E. Paparoditis, T. Sapatinas, Short-term load forecasting: The similar shape functional time-series predictor, IEEE Transactions on Power Systems 28 (2013) 3818–3825.

[5] H. Hippert, C. Pedreira, R. Souza, Neural networks for short-term load forecasting: a review and evaluation, IEEE Transactions on Power Systems 16 (2001) 44–55.

[6] Y. Wang, Q. Xia, C. Kang, Secondary forecasting based on deviation analysis for short-term load forecasting, IEEE Transactions on Power Systems 26 (2011) 500–507.

[7] E. Ceperic, V. Ceperic, A. Baric, A strategy for short-term load forecasting by support vector regression machines, IEEE Transactions on Power Systems 28 (2013) 4356–4364.

[8] Y. Goude, R. Nedellec, N. Kong, Local short and middle term electricity load forecasting with semi-parametric additive models, IEEE Transactions on Smart Grid 5 (2014) 440–446.

[9] A. Chaouachi, R. M. Kamel, R. Andoulsi, K. Nagasaka, Multiobjective intelligent energy management for a microgrid, IEEE Transactions on Industrial Electronics 60 (2013) 1688–1699.

[10] E. Mashhour, S. Moghaddas-Tafreshi, Integration of distributed energy resources into low voltage grid: A market-based multiperiod optimization model, Electric Power Systems Research 80 (2010) 473–480.

[11] E. R. Sanseverino, M. L. D. Silvestre, M. G. Ippolito, A. D. Paola, G. L. Re, An execution, monitoring and replanning approach for optimal energy management in microgrids, Energy 36 (2011) 3429–3436.

[12] A. Mohamed, V. Salehi, O. Mohammed, Real-time energy management algorithm for mitigation of pulse loads in hybrid microgrids, IEEE Transactions on Smart Grid 3 (2012) 1911–1922.

[13] M. Eghbal, T. K. Saha, N. Mahmoudi-Kohan, Utilizing demand response programs in day ahead generation scheduling for micro-grids with renewable sources, 2011 IEEE PES Innovative Smart Grid Technologies Asia (ISGT) (2011) 1–6.

[14] P. Chan, W.-C. Chen, W. Ng, D. Yeung, Multiple classifier system for short term load forecast of microgrid, Proceedings of the 2011 International Conference on Machine Learning and Cybernetics (10-13 July, 2011) 1268–1273.

[15] N. Amjady, F. Keynia, H. Zareipour, Short-term load forecast of microgrids by a new bilevel prediction strategy, IEEE Transactions on Smart Grid 1 (2010) 286–294.

[16] M. Shahidehpour, M. Khodayar, Cutting campus energy costs with hierarchical control, IEEE Electrification Magazine 1 (2013) 40–56.

[17] A. Ahmad, M. Hassan, M. Abdullah, H. Rahman, F. Hussin, H. Abdullah, R. Saidur, A review on applications of ANN and SVM for building electrical energy consumption forecasting, Renewable and Sustainable Energy Reviews 33 (2014) 102–109.

[18] G. Escriva-Escriva, C. Alvarez-Bel, C. Roldan-Blay, M. Alcazar-Ortega, New artificial neural network prediction method for electrical consumption forecasting based on building end-uses, Energy and Buildings 43 (2011) 3112–3119.

[19] A. H. Neto, F. A. S. Fiorelli, Comparison between detailed model simulation and artificial neural network for forecasting building energy consumption, Energy and Buildings 40 (2008) 2169–2176.

[20] S. Farzana, M. Liu, A. Baldwin, M. U. Hossain, Multi-model prediction and simulation of residential building energy in urban areas of Chongqing, South West China, Energy and Buildings 81 (2014) 161–169.

[21] R. K. Jain, K. M. Smith, P. J. Culligan, J. E. Taylor, Forecasting energy consumption of multi-family residential buildings using support vector regression: Investigating the impact of temporal and spatial monitoring granularity on performance accuracy, Applied Energy 123 (2014) 168–178.

[22] D. Monfet, M. Corsi, D. Choiniere, E. Arkhipova, Development of an energy prediction tool for commercial buildings using case-based reasoning, Energy and Buildings 81 (2014) 152–160.

[23] G. Escriva-Escriva, C. Roldan-Blay, C. Alvarez-Bel, Electrical consumption forecast using actual data of building end-use decomposition, Energy and Buildings 82 (2014) 73–81.

[24] J. G. Jetcheva, M. Majidpour, W. P. Chen, Neural network model ensembles for building-level electricity load forecasts, Energy and Buildings 84 (2014) 214–223.

[25] British Columbia Institute of Technology, 2014. URL: http://www.bcit.ca/microgrid/.

[26] H. Zareipour, K. Bhattacharya, C. A. Canizares, Electricity market price volatility: The case of Ontario, Energy Policy 35 (2007) 4739–4748.

[27] N. Amjady, F. Keynia, Short-term load forecasting of power systems by combination of wavelet transform and neuro-evolutionary algorithm, Energy 34 (2009) 46–57.

[28] Q. Zhang, A. Benveniste, Wavelet networks, IEEE Transactions on Neural Networks 3 (1992) 889–898.

[29] N. M. Pindoriya, S. N. Singh, S. K. Singh, An adaptive wavelet neural network-based energy price forecasting in electricity markets, IEEE Transactions on Power Systems 23 (2008) 1423–1432.

[30] J. Vermaak, E. Botha, Recurrent neural networks for short-term load forecasting, IEEE Transactions on Power Systems 13 (1998) 126–132.

[31] S. J. Yoo, J. B. Park, Y. H. Choi, Adaptive dynamic surface control of flexible-joint robots using self-recurrent wavelet neural networks, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 36 (2006) 1342–1355.

[32] S. J. Yoo, J. B. Park, Y. H. Choi, Indirect adaptive control of nonlinear dynamic systems using self recurrent wavelet neural networks via adaptive learning rates, Information Sciences 177 (2007) 3074–3098.

[33] A. Tascikaraoglu, M. Uzunoglu, A review of combined approaches for prediction of short-term wind speed and power, Renewable and Sustainable Energy Reviews 34 (2014) 243–254.

[34] N. Amjady, F. Keynia, H. Zareipour, Wind power prediction by a new forecast engine composed of modified hybrid neural network and enhanced particle swarm optimization, IEEE Transactions on Sustainable Energy 2 (2011) 265–276.

[35] S. S. Haykin, Neural Networks: A Comprehensive Foundation, Prentice Hall, 1999.

[36] H. Chitsaz, N. Amjady, H. Zareipour, Wind power forecast using wavelet neural network trained by improved clonal selection algorithm, Energy Conversion and Management 89 (2015) 588–598.

[37] L. Wu, M. Shahidehpour, A hybrid model for day-ahead price forecasting, IEEE Transactions on Power Systems 25 (2010) 1519–1530.

[38] M. T. Hagan, M. B. Menhaj, Training feedforward networks with the Marquardt algorithm, IEEE Transactions on Neural Networks 5 (1994) 989–993.

[39] L. Hernandez, C. Baladron, J. Aguiar, B. Carro, A. Sanchez-Esguevillas, J. Lloret, Short-term load forecasting for microgrids based on artificial neural networks, Energies 6 (2013) 1385–1408.

[40] A. Pandey, D. Singh, S. Sinha, Intelligent hybrid wavelet models for short-term load forecasting, IEEE Transactions on Power Systems 25 (2010) 1266–1273.

[41] B.-L. Zhang, Z.-Y. Dong, An adaptive neural-wavelet model for short term load forecasting, Electric Power Systems Research 59 (2001) 121–129.

[42] N. Amjady, A. Daraeepour, Mixed price and load forecasting of electricity markets by a new iterative prediction method, Electric Power Systems Research 79 (2009) 1329–1336.

[43] Kumeyaay wind farm, 2014. URL: http://www.thewindpower.net/windfarm_en_2792_kumeyaay.php.

[44] California Independent System Operator, 2014. URL: http://www.caiso.com/1c57/1c578a8751b30.pdf.
