Short-term Electricity Load Forecasting of Buildings in Microgrids
Hamed Chitsaz a,∗, Hamid Shaker a, Hamidreza Zareipour a, David Wood a, Nima Amjady b
a Schulich School of Engineering, University of Calgary, Alberta, Canada
b Electrical Engineering Department, Semnan University, Semnan, Iran
Abstract
Electricity load forecasting plays a key role in the operation of power systems. Since the penetration of distributed and renewable generation is growing in many countries, Short-Term Load Forecast (STLF) of micro-grids is also becoming an important task. A precise STLF of the micro-grid can enhance the management of its renewable and conventional resources and improve the economics of energy trade with electricity markets. As a consequence of the highly non-smooth and volatile behavior of the load time series in a micro-grid, its STLF is an even more complex process than that of a power system. For this purpose, a new prediction method is proposed in this paper, in which a Self-Recurrent Wavelet Neural Network (SRWNN) is applied as the forecast engine. Moreover, the Levenberg-Marquardt (LM) learning algorithm is implemented and adapted to train the SRWNN. In order to demonstrate the efficiency of the proposed method, it is examined on real-world hourly data of an educational building within a micro-grid. Comparisons with other load prediction methods are provided.
∗Corresponding Author: Hamed Chitsaz; Email: [email protected]; Phone: +1 (587) 896-8388; Fax: +1 (403) 282-6855
Preprint submitted to Elsevier April 20, 2016
Keywords: Micro-grids, buildings, electricity load forecasting, self-recurrent
wavelet neural network.
1. Introduction
Micro-grids are integrated energy systems composed of distributed energy resources and multiple electrical loads operating as an autonomous grid, which can operate either in parallel to or islanded from the existing power grid. A micro-grid can be considered a small-scale version of the traditional power grid, whose small scale results in far fewer line losses and lower demand on transmission infrastructure. These advantages are motivating an increased demand for micro-grids in a variety of application areas such as campus environments, military operations, community/utility systems, and commercial and industrial markets [1].
Considering the fast and worldwide development of micro-grids, their optimal operation requires advanced tools and techniques. In particular, Short-Term Load Forecast (STLF) is an indispensable task for the operation of a micro-grid. In conventional power systems, STLF is an important tool for reliable and economic operation, as many operating decisions, such as dispatch scheduling of generating capacity, demand side management, security assessment and maintenance scheduling of generators, are based on load forecasts [2, 3, 4, 5, 6, 7]. Load forecasts also have significant roles in energy transactions, market shares and profits in competitive electricity markets [7, 8]. Different prediction strategies have been presented for the STLF of traditional power systems over the years. These methodologies are generally divided into two main groups: classical statistical techniques and computational intelligence techniques. Reviews of some of these strategies can be found in [2, 4, 5, 6, 7, 8].
In a similar way, STLF is a key factor in the operation of micro-grids, for example in energy management for optimal utilization of available resources in order to minimize the operation cost or any environmental impact of a micro-grid [9]. Moreover, STLF for a micro-grid can be used for profitable trade of electric energy within the grid. In other words, it is important for the operator of a micro-grid to determine the amount of power exchanged with a wholesale energy market so as to maximize the total benefit [10]. It has also been discussed that the forecasted loads as well as the forecasted generation of renewable resources are the main inputs for optimal energy management [11, 12] and generation scheduling [13] in micro-grids.
However, modeling and forecasting micro-grid loads can be more complex than the corresponding tasks for conventional power systems, as the load time series of a micro-grid is more volatile than that of a power system, as demonstrated later in the present paper. Since a micro-grid is considerably smaller than a traditional power system, its load includes more fluctuations. In other words, the inertia in small-scale systems is low and, therefore, the smoothness of the load time series in such systems degrades. Using a criterion to measure the volatility of a time series, it will be shown in this paper that the volatility of the load time series for a micro-grid is considerably higher than that for a conventional power system. As a result, there is a need to adapt a suitable STLF model to the volatile behavior of micro-grid load time series. Despite the importance of STLF for micro-grids, only a few works have been presented in this area. The authors in [14] present an on-line learning model based on Multiple Classifier Systems (MCSs) for short-term load forecasting of micro-grids, and the model was tested on real data of a micro-grid. A bi-level prediction strategy is proposed in [15] for STLF in micro-grids. This strategy is composed of a forecaster, including a neural network and an evolutionary algorithm, in the lower level and an enhanced differential evolution algorithm in the upper level for optimizing the performance of the forecaster. The models proposed in [15] are designed with the aggregated micro-grid load in mind. However, the present paper focuses on forecasting the individual loads within a micro-grid, which potentially have significantly higher volatility than the aggregated micro-grid load. Forecasting individual micro-grid load components is important for operation scheduling and determining load serving priorities at the feeder level [16].
Some research works have also been presented regarding electricity load prediction for residential areas and buildings [17, 18, 19]. The proper consumption of electricity in buildings leads to lower operational costs. If the facility manager could predict the electricity demand of the building, actions could consequently be taken to reduce the amount of energy used and, therefore, reduce the operational cost of the building [19]. A few works have been published very recently in the area of energy prediction of buildings. For instance, long-term energy consumption of a residential area in South West China has been studied in [20]. In this reference, an Artificial Neural Network (ANN) model is compared with some other prediction models, including Grey model, regression model, polynomial model and polynomial regression model, to forecast the total energy consumption of the residential area, and it is shown that the ANN model outperforms the other models. Having access to detailed data of a six-story multi-family residential building located on the Columbia University campus in New York City, the authors in [21] were able to conduct a comparative spatial analysis to forecast the energy consumption of units, floors and the whole building for different temporal intervals (e.g., 10-min, hourly and daily). The results indicate that the most effective models are built with hourly consumption at the floor level, provided that high-resolution and granular data is available via advanced smart metering devices. In [22], a Case-Based Reasoning (CBR) model, categorized as a machine-learning artificial intelligence technique, is proposed to forecast energy demand in an office building located in Varennes, Quebec, Canada. Three forecasting horizons of 3-hour, 6-hour and 24-hour ahead have been simulated with hourly prediction resolution, and the results demonstrate that the prediction capability of the model improves when the horizon is reduced to 3-hour ahead. The authors in [23] have proposed a new methodology for electrical consumption forecasting based on end-use decomposition and similar days. The total consumption forecast is also obtained from end-use consumptions and the data of selected days. In [24], a building-level neural network-based ensemble model is presented for day-ahead electricity load forecasting, and it is shown that the presented model outperforms SARIMA (Seasonal Autoregressive Integrated Moving Average) by up to 50%. However, the comparisons are made only with the SARIMA model, a linear statistical model that may not be capable of capturing the high nonlinearity of the building-level electricity load.
To summarize the main points, micro-grids can bring considerable benefits to power systems, such as supplying loads in remote areas, reducing total system expansion planning cost, reducing carbon emission through coordinated utilization of Renewable Energy Sources (RESs), providing cheaper electricity through proper energy management of available resources and energy trade with the main grid, and improving system reliability and resiliency by providing dispatchable power for use during peak power conditions or emergency situations. Moreover, it was discussed that a short-term load forecasting tool is of high importance in optimal energy management and secure operation of micro-grids. Accordingly, some research works have been conducted to develop load forecasting models with higher accuracy. However, as discussed above, only a few works have focused on day-ahead load prediction of buildings in micro-grids and, consequently, improvement of forecast accuracy is still needed in this area. In the present paper, a forecast method is proposed for the STLF of micro-grids with a focus on electricity load prediction for individual buildings. The main contribution of this paper is applying a Self-Recurrent Wavelet Neural Network (SRWNN) forecasting engine for electricity load prediction of micro-grids. Moreover, the Levenberg-Marquardt (LM) learning algorithm is implemented to train the SRWNN. The proposed method improves the forecast accuracy for the highly volatile and non-smooth time series of micro-grid electricity load. The higher the forecast accuracy of electricity load, the more efficient the energy management that can be achieved in a micro-grid.
The remaining parts of the paper are organized as follows. Section 2 provides a data analysis on different electricity load time series to draw a distinction between the load of a micro-grid and a power system. The proposed forecasting method, which consists of the SRWNN as the forecasting engine and LM as the training algorithm, is presented in Section 3. The proposed load forecasting method is tested on real-world test cases and the results are compared with the results of some other prediction approaches in Section 4. Finally, Section 5 concludes the paper.
2. Data analysis
A data analysis is presented in this section to compare the characteristics of a micro-grid load time series with those of the electricity load in power systems. The British Columbia Institute of Technology (BCIT) in Vancouver, in the Province of British Columbia (BC), Canada, is considered as the micro-grid test case studied in this paper. BCIT’s Burnaby campus is Canada’s first Smart Power Micro-grid, comprising power plants (including renewable resources of wind and photovoltaic modules), campus loads, command and control (including substation automation, micro-grid control center and distributed energy management), and a communication network [25]. The load data used in this work is from one building within the BCIT micro-grid, with a peak value of 694 kW, from March 2012 to March 2013. Hereafter, we refer to this load as BCIT. To draw a comparison between the characteristics of a micro-grid load and a power system load, the load time series of two power systems, i.e., British Columbia, where the BCIT micro-grid is located, and California, are analyzed.
Table 1: Comparison of electricity load time series in terms of volatility.
Volatility index      | BCIT | British Columbia’s System Load | California’s System Load
Daily volatility (%)  | 8.34 | 2.66                           | 3.18
Weekly volatility (%) | 7.09 | 2.28                           | 3.15
Electricity load follows daily and weekly periodicities. Accordingly, we consider two measures for volatility analysis, i.e., daily volatility and weekly volatility. These measures are based on the standard deviation of logarithmic returns over a time window. In general, daily volatility quantifies the overall change in hourly electricity load from one day to another, and weekly volatility measures the load changes in subsequent weeks. For more details regarding the aforementioned volatility indices, see [26].
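As a rough illustration of how such indices might be computed, the sketch below takes the logarithmic return between each hourly load and the load 24 hours (daily) or 168 hours (weekly) earlier and reports the sample standard deviation in percent. The exact window and return definition are those of [26] and are not reproduced here; the lag-based definition and all function names are assumptions made only for this example.

```python
import math
import statistics


def log_return_volatility(load, lag):
    """Sample standard deviation, in percent, of ln(L[t] / L[t - lag]).

    Assumption: the return is taken between observations `lag` hours apart;
    the precise definition used in the paper is given in its reference [26].
    """
    if any(value <= 0 for value in load):
        raise ValueError("load values must be strictly positive")
    returns = [math.log(load[t] / load[t - lag]) for t in range(lag, len(load))]
    return 100.0 * statistics.stdev(returns)


def daily_volatility(load):
    # Compare each hour with the same hour of the previous day.
    return log_return_volatility(load, 24)


def weekly_volatility(load):
    # Compare each hour with the same hour of the previous week.
    return log_return_volatility(load, 168)
```

Under this definition, a load that repeats exactly from day to day has a daily volatility of zero, and any day-to-day change in level raises the index.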
One year of hourly load data has been considered for British Columbia’s and California’s power systems for the same period, i.e., from March 2012 to March 2013. Observe from Table 1 that both daily and weekly volatility indices for the micro-grid are considerably higher than those for the power systems, which demonstrates the low smoothness of the micro-grid load time series. For instance, the daily volatility of the micro-grid is 8.34%, while it is 2.66% and 3.18% for British Columbia’s and California’s power systems, respectively. This means that the electricity load of the micro-grid fluctuates more severely from one day to another compared with that of power systems. Likewise, weekly fluctuations are more severe in the micro-grid than in a power system. As a result, the daily and weekly periodicities of electricity load in a micro-grid are noticeably weak and, consequently, the predictability of such load time series decreases.
(a) BC’s power system electricity load (b) Building electricity load in BCIT
Figure 1: One-year hourly load data of BC’s power system and the building in BCIT
Fig. 1 illustrates one-year hourly load data of British Columbia’s power system and that of the building in BCIT. Note that the data is normalized to the maximum value. As seen, the aggregated electricity load in a large area (e.g., the province of British Columbia) is noticeably different from the aggregated load in a building. For instance, British Columbia’s load follows a common seasonal pattern, as the load decreases in the spring in April and starts to increase in the fall in October. The building’s load follows a fairly similar seasonal pattern. From the beginning of the academic year in September, electricity load starts increasing, and it starts decreasing in February. Moreover, it is evident that fluctuations of load are more severe for a building than for a power system. These variations in the load time series of the micro-grid graphically demonstrate its volatility, previously shown in Table 1 by volatility indices.
(a) 1-hour ramp distribution (b) 2-hour ramp distribution [x-axis: % of the peak load; y-axis: number of occurrences; series: BC power system, Building in BCIT]
Figure 2: Distribution of 1-hour and 2-hour ramps
To better understand such severe load fluctuations for a building, hourly changes of load, i.e., the differences between observations at subsequent hours, the so-called 1-hour ramps, can be taken into consideration. Fig. 2 (a) shows the distribution (with equally spaced bins of 1% of the peak load) of 1-hour ramps for both the building in BCIT and BC’s power system loads. Note that negative values show downward ramps. As seen, hourly upward and downward ramps with an amplitude of more than 5% of the peak load have occurred more frequently in the building in BCIT. The most severe ramp in BC’s power system load is a ramp up with an amplitude of almost 9% of the peak load, while it is a ramp up of more than 20% of the peak load for the building. Similarly, Fig. 2 (b) illustrates the distribution of 2-hour ramps, i.e., load variations over two-hour durations. As a longer time is considered for ramps, larger ramps are detected. Clearly, sharp upward and downward ramps have happened more frequently in the BCIT building load than in the BC system load.
Table 2: Ramp events in electricity load time series
Interval (% of the peak load) | 1-hour, Building in BCIT | 1-hour, BC power system | 2-hour, Building in BCIT | 2-hour, BC power system
Ramp Up (RU)
5% ≤ RU < 10%                 | 556                      | 504                     | 818                      | 971
10% ≤ RU < 15%                | 84                       | 0                       | 362                      | 352
15% ≤ RU < 20%                | 12                       | 0                       | 121                      | 55
RU > 20%                      | 4                        | 0                       | 21                       | 0
Ramp Down (RD)
5% ≤ RD < 10%                 | 520                      | 259                     | 1003                     | 1272
10% ≤ RD < 15%                | 48                       | 0                       | 283                      | 141
15% ≤ RD < 20%                | 9                        | 0                       | 57                       | 0
RD > 20%                      | 0                        | 0                       | 11                       | 0
To provide more detailed statistics of ramps, Table 2 shows the number of upward and downward ramps for 1-hour and 2-hour durations. For instance, there have been 100 upward 1-hour ramps of more than 10% of the peak load in the BCIT building load, while no 1-hour ramp up of more than 10% of the peak load has occurred in the BC system load. With regard to 2-hour ramps, 55 ramp ups of more than 15% of the peak load occurred in BC, compared with 142 for the BCIT load. This table also demonstrates that downward ramps are fewer than upward ramps when large ramps are concerned.
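The ramp statistics described above can be sketched as follows. This is a minimal illustration, assuming that a k-hour ramp is the difference between loads k hours apart, expressed as a percentage of the series peak, and that every overlapping k-hour window is counted; the function names and the treatment of the last bin as "20% and above" are choices made for this example, not details taken from the paper.

```python
def ramp_percentages(load, hours):
    """Ramps over `hours` steps, as a percentage of the peak load."""
    peak = max(load)
    return [100.0 * (load[t] - load[t - hours]) / peak
            for t in range(hours, len(load))]


def count_ramps(load, hours, edges=(5.0, 10.0, 15.0, 20.0)):
    """Count upward and downward ramps per amplitude bin.

    Bins are [5, 10), [10, 15), [15, 20) and [20, inf) percent of the peak.
    Returns two lists (up, down), one count per bin.
    """
    edges = tuple(edges)
    bins = list(zip(edges, edges[1:] + (float("inf"),)))
    up = [0] * len(bins)
    down = [0] * len(bins)
    for ramp in ramp_percentages(load, hours):
        counts = up if ramp > 0 else down
        for index, (low, high) in enumerate(bins):
            if low <= abs(ramp) < high:
                counts[index] += 1
    return up, down
```

Calling `count_ramps(load, 1)` and `count_ramps(load, 2)` on an hourly series gives counts laid out like the 1-hour and 2-hour columns of Table 2.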
Based on the above descriptions, prediction of the electricity load time series of a building seems to be more difficult than that of a power system, since high volatility lowers predictability. Consequently, a forecasting model must be adapted to cope with the challenging characteristics of such time series. In the next section, a forecasting model is proposed to capture the dynamic and volatile behaviour of micro-grid time series.
3. The forecasting model
The discussion in Section 2 showed that dealing with micro-grid load time series is a more challenging task than dealing with those of a power system and, therefore, traditional STLF will not result in satisfactory accuracy in micro-grid load prediction. Accordingly, the SRWNN forecasting engine is first presented in this section, and the training algorithm is then implemented to set the free parameters of the SRWNN.
3.1. Self-Recurrent Wavelet Neural Network
Wavelet theory has been applied through two different approaches for forecast processes. The first one is using the wavelet transform as a preprocessor to decompose the load time series into its low and high frequency components. Each component is separately processed by a forecast engine [27]. The other approach is constructing the wavelet neural network (WNN), in which a wavelet function is used as the activation function of the hidden neurons of a Feed-Forward Neural Network (FFNN). The WNN was first introduced in [28] for approximating nonlinear functions. Due to the local properties of wavelets and the concept of adapting the wavelet shape according to the training data set instead of adapting the parameters of a fixed-shape basis function, WNNs have better generalization properties than classical FFNNs and, therefore, are more appropriate for the modelling of time series [29].
The SRWNN is a modified model of the WNN that combines the dynamics of Recurrent Neural Networks (RNNs) [30] with the fast convergence of WNNs, and has successfully been applied to estimating and controlling nonlinear systems [31]. Since the SRWNN has a self-recurrent mother wavelet layer, it can store the past information of wavelets and capture the behavior of complex nonlinear systems well [32]. Having self-feedback loops and input direct terms, the SRWNN has improved capabilities compared to the WNN, such as its dynamic response and information storing ability. Therefore, the SRWNN has been applied as a forecast engine in this paper to cope with the volatile and non-smooth behavior of the load time series in a micro-grid. Moreover, the SRWNN does not have limitations, such as dependency on appropriate tuning of parameters and a complex optimization process, which are likely to be found in models such as Support Vector Machines (SVMs) [33].
The architecture of the SRWNN is shown in Fig. 3, which is a feed forward network with four layers. As seen, $X = [x_1, ..., x_M]$ is the input vector of the forecast engine and $y$ is the target variable. The inputs $x_1, ..., x_M$ of the forecast engine can be from the past values of the target variable and the past and forecast values of the related exogenous variables. For instance, past values of electricity load along with the past and forecast values of temperature can be considered for electricity load prediction, provided that their data is available.
Figure 3: Architecture of the SRWNN.
A feature selection technique can be used to refine these candidate features and select the most effective inputs for the forecast process. In this research work, we use the feature selection method of [34]. This method is based on the information-theoretic criterion of mutual information and selects the most informative inputs for the forecast process by filtering out the irrelevant and redundant candidate features through two stages. In the first stage, which is called the irrelevancy filter, the mutual information between each candidate input, i.e., $x_i(t)$, and the target variable is calculated. A higher value of mutual information for $x_i(t)$ means more common information content between this feature and the target variable. The candidate inputs with a computed mutual information value greater than a relevancy threshold, denoted by TH1, are considered as the relevant features of the forecast process and are retained for the next stage. The other candidate inputs, with mutual information values lower than TH1, are considered irrelevant features and are filtered out. In the second stage, which is called the redundancy filter, redundant features among the candidate inputs selected by the irrelevancy filter are found and filtered out. Two selected candidates, e.g., $x_k(t)$ and $x_l(t)$, with a high value of mutual information have more common information, i.e., a high level of redundancy. Thus, the redundancy of each selected feature $x_k(t)$ with the other candidate inputs is calculated. Then, if the measured redundancy is greater than a redundancy threshold, denoted by TH2, $x_k(t)$ is considered a redundant candidate input. Hence, between this candidate and its rival, which has the maximum redundancy with $x_k(t)$, the one with lower relevancy is filtered out [34]. The candidate features that pass both filters are considered as the inputs of the load forecasting engine. Moreover, the values of the thresholds TH1 and TH2 are fine-tuned by a cross-validation technique. Since this method is not the focus of this paper, it is not further discussed here. The interested reader can refer to [34] for details of this feature selection method.
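The two-stage filter described above can be sketched as follows. This is only an illustration of the idea, not the method of [34]: the histogram-based mutual information estimate, the use of the maximum pairwise mutual information as the redundancy measure, and all names (`mutual_information`, `two_stage_filter`, `th1`, `th2`) are assumptions made for this example.

```python
import math
from collections import Counter


def discretize(values, bins=8):
    """Map values to equal-width bin indices (a simple, assumed estimator)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant series
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=8):
    """Histogram estimate of the mutual information (in nats) of x and y."""
    xd, yd = discretize(x, bins), discretize(y, bins)
    n = len(xd)
    joint, px, py = Counter(zip(xd, yd)), Counter(xd), Counter(yd)
    return sum((c / n) * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in joint.items())


def two_stage_filter(candidates, target, th1, th2, bins=8):
    """candidates: dict mapping a feature name to its list of values."""
    # Stage 1 (irrelevancy filter): keep features with relevance above th1.
    relevance = {name: mutual_information(col, target, bins)
                 for name, col in candidates.items()}
    selected = [name for name in candidates if relevance[name] > th1]
    # Stage 2 (redundancy filter): while some feature is too redundant with
    # its closest rival, drop whichever of the two is less relevant.
    changed = True
    while changed:
        changed = False
        for k in list(selected):
            rivals = [(mutual_information(candidates[k], candidates[l], bins), l)
                      for l in selected if l != k]
            if not rivals:
                continue
            redundancy, rival = max(rivals)
            if redundancy > th2:
                loser = k if relevance[k] < relevance[rival] else rival
                selected.remove(loser)
                changed = True
                break
    return selected
```

In this sketch the thresholds are plain arguments; in the paper they are tuned by cross validation.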
Therefore, the target variable is the electricity load of the next time interval, which the forecasting engine predicts using the past values of electricity load and calendar effects. Moreover, multi-period forecasts, e.g., load prediction for the next 24 hours, are obtained via recursion, i.e., by feeding the forecaster’s outputs back as input variables. For instance, the forecasted load for the first hour is used as $y(t-1)$ for load prediction of the second hour, provided that $y(t-1)$ is among the inputs selected by the feature selection technique.
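The recursion described above can be sketched as follows, assuming a one-step forecaster that receives the selected lagged loads as a list. The callable `predict_one_step` and the `lags` tuple are hypothetical placeholders for the trained forecast engine and the output of the feature selection stage; calendar inputs are omitted for brevity.

```python
def recursive_forecast(history, predict_one_step, lags, horizon=24):
    """Multi-step forecast obtained by feeding predictions back as inputs.

    history: past hourly loads, most recent value last.
    predict_one_step: callable mapping a list of lagged loads to one forecast.
    lags: selected lags in hours, e.g. (1, 24, 168).
    """
    series = list(history)
    forecasts = []
    for _ in range(horizon):
        inputs = [series[-lag] for lag in lags]  # may include earlier forecasts
        y_hat = predict_one_step(inputs)
        forecasts.append(y_hat)
        series.append(y_hat)  # the forecast becomes y(t-1) for the next step
    return forecasts
```

Because later steps consume earlier forecasts, errors can accumulate along the horizon, which is one reason one-step accuracy matters for day-ahead prediction.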
The input layer of the forecast engine transmits the $M$ input variables, which are selected by the feature selection technique, to the next layer without any changes. The second layer, which is called the wavelet layer, consists of $N \times M$ neurons, each of which has a self-feedback loop. In this paper, the Morlet wavelet function is used as the activation function of the neurons in the mother-wavelet layer, which is defined as follows:
$$\psi(x) = e^{-0.5 x^2} \cos(5x) \tag{1}$$
In the SRWNN, the wavelet of each node is derived from its mother wavelet as below:
$$\psi_{i,j}(r_{i,j}) = \psi\left(\frac{u_{i,j} - b_i}{a_i}\right), \quad r_{i,j} = \frac{u_{i,j} - b_i}{a_i} \tag{2}$$
where $\psi_{i,j}$ is the scaled and shifted version of the Morlet mother wavelet with $a_i$ and $b_i$ as the scale and shift parameters, respectively. In addition, the inputs of the wavelets in (2) are as follows:
$$u_{i,j} = x_j + \psi_{i,j} z^{-1} \cdot \theta_{i,j} \tag{3}$$
where $z^{-1}$ is the time delay; thus, the input of this layer contains the memory term $\psi_{i,j} z^{-1}$, which can store the past information of the network, and $\theta_{i,j}$ denotes the weight of the self-feedback loop, which represents the rate of information storage. This feature is the main difference between an SRWNN and a WNN. In fact, the SRWNN is the same as the WNN when all $\theta_{i,j}$ are equal to zero. However, it is noted that the initial values of $\theta_{i,j}$ are usually set to zero, which means there are no feedback units initially.
$M$-dimensional wavelet functions are constructed by the tensor product of one-dimensional Morlet wavelets in the third layer as follows:
$$\Psi_i = \prod_{j=1}^{M} \psi_{i,j}, \quad i = 1, 2, ..., N \tag{4}$$
The output of the SRWNN, denoted by $y$, is finally computed as below:
$$y = \sum_{i=1}^{N} w_i \cdot \Psi_i + \sum_{j=1}^{M} v_j \cdot x_j + g \tag{5}$$
where $w_i$ is the weight between the $i$th neuron of the product layer and the output node, $v_j$ is the direct input weight between the $j$th input and the output node, and $g$ is the bias of the output node. Therefore, the output of the SRWNN is obtained by a combination of multi-dimensional wavelet functions, i.e., $\Psi_i$, as well as a combination of inputs, i.e., $x_j$. In other words, the proposed model not only can benefit from the capabilities of wavelet functions, such as their ability to capture cyclical behaviors, but also can capture trends of the signal. In addition, the SRWNN can benefit from its dynamic response by storing the past information of wavelets in self-feedback loops (equation 3) to capture complex nonlinearities. Based on the aforementioned formulation, the vector of the free parameters of the SRWNN is denoted by $P$ as follows:
$$P = [v_j, w_i, a_i, b_i, \theta_{i,j}, g], \quad i = 1, ..., N, \; j = 1, ..., M \tag{6}$$
Therefore, the SRWNN has $N_P = M + 3N + M \times N + 1$ free parameters, which are determined by the training method. It should also be noted that the SRWNN model presented in this paper differs from the SRWNN proposed in [32] in two respects. First, there is an additional external bias (i.e., $g$) at the output layer of the SRWNN presented in this work. A bias can increase or lower the net input of the activation function, depending on whether it is positive or negative, respectively [35]. Consequently, biases can enhance the input/output mapping function by adding another feature to neural networks. Second, Morlet wavelet functions are used as the activation functions in the wavelet layer of the SRWNN in this paper, whereas the second derivative of the Gaussian function, i.e., the Mexican hat wavelet function, is used in the previous version in reference [32]. Although the Mexican hat wavelet function has successfully been used in WNN models for forecasting applications due to its advantages over Daubechies wavelets [29], it has been shown that Morlet wavelets outperform Mexican hat wavelets for prediction applications [36, 37]. Therefore, we applied Morlet mother wavelets as the activation functions of the SRWNN in this paper.
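To make the formulation in (1) to (6) concrete, the forward pass can be sketched as below. This is a minimal illustration under stated assumptions rather than the authors' implementation: the class name, attribute names and the initial parameter values (unit scales, zero shifts and weights) are placeholders chosen for the example; only the zero initial self-feedback weights follow the text.

```python
import math


def morlet(x):
    """Morlet mother wavelet of equation (1)."""
    return math.exp(-0.5 * x * x) * math.cos(5.0 * x)


class SRWNN:
    def __init__(self, n_inputs, n_wavelons):
        self.M, self.N = n_inputs, n_wavelons
        self.a = [1.0] * self.N  # scale parameters a_i (placeholder values)
        self.b = [0.0] * self.N  # shift parameters b_i (placeholder values)
        self.w = [0.0] * self.N  # product-layer weights w_i
        self.v = [0.0] * self.M  # direct input weights v_j
        self.g = 0.0             # output bias g
        # Self-feedback weights theta_{i,j}, initially zero (no feedback).
        self.theta = [[0.0] * self.M for _ in range(self.N)]
        # Memory term: wavelet outputs of the previous time step.
        self.prev_psi = [[0.0] * self.M for _ in range(self.N)]

    def n_parameters(self):
        # N_P = M + 3N + M*N + 1, matching the parameter vector in (6).
        return self.M + 3 * self.N + self.M * self.N + 1

    def forward(self, x):
        # Direct input terms and bias of equation (5).
        y = self.g + sum(vj * xj for vj, xj in zip(self.v, x))
        for i in range(self.N):
            product = 1.0
            for j in range(self.M):
                # Equation (3): input plus delayed wavelet output times theta.
                u = x[j] + self.prev_psi[i][j] * self.theta[i][j]
                # Equation (2): scaled and shifted Morlet wavelet.
                psi = morlet((u - self.b[i]) / self.a[i])
                self.prev_psi[i][j] = psi
                product *= psi  # Equation (4): tensor product over inputs.
            y += self.w[i] * product
        return y
```

Calling `forward` once per time step keeps the memory term up to date; with all `theta` equal to zero the sketch reduces to a plain WNN with direct input terms, as noted in the text.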
3.2. The training algorithm
In this subsection, a training algorithm is implemented to set the free parameters of the SRWNN, denoted by $P$ in (6). Since the mother wavelet function used in the SRWNN, i.e., the Morlet wavelet function, is differentiable with respect to all free parameters, the Levenberg-Marquardt (LM) learning algorithm can be used in this regard. This learning algorithm was applied to the training of neural networks by Hagan and Menhaj in [38]. Due to the advantages of the LM algorithm, such as accurate training and fast convergence, it has been recommended in many research works and, therefore, it is implemented for training the SRWNN in this paper. The LM algorithm is briefly described in the Appendix, and its implementation on the SRWNN is then presented.
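For orientation only, the generic LM step as commonly stated in the literature can be written as below. This is the textbook update, not the adapted version implemented for the SRWNN; the symbols $J$, $e$, $\mu$ and $I$ are introduced here for illustration and may differ from the notation of the Appendix.

```latex
\Delta P = -\left( J^{\top} J + \mu I \right)^{-1} J^{\top} e
```

Here $e$ is the vector of training errors, $J = \partial e / \partial P$ is its Jacobian with respect to the free parameters in (6), $I$ is the identity matrix, and $\mu > 0$ is a damping factor: a large $\mu$ gives a short gradient-descent-like step, while a small $\mu$ approaches the Gauss-Newton step.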
Moreover, the termination criterion used for the training of the SRWNN is based on the early-stopping technique. Accordingly, the whole available data is divided into training and validation samples. The SRWNN is trained using the training samples, and the error for the validation samples is monitored in each iteration. When the validation error rises for some number of iterations, usually five, the training phase is stopped and the values of the free parameters from the iteration with the least validation error are stored as the final solution of the training algorithm.
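The early-stopping rule described above can be sketched as follows. The sketch assumes a common variant in which training stops after `patience` consecutive iterations without a new best validation error; the callables `train_step` and `validation_error` are hypothetical stand-ins for one LM iteration and for the validation error computation.

```python
def train_with_early_stopping(train_step, validation_error,
                              patience=5, max_iterations=1000):
    """Return (best_params, best_error) found before validation error rises.

    train_step(iteration): performs one training iteration, returns parameters.
    validation_error(params): error of those parameters on validation samples.
    """
    best_error = float("inf")
    best_params = None
    bad_iterations = 0
    for iteration in range(max_iterations):
        params = train_step(iteration)
        error = validation_error(params)
        if error < best_error:
            # New best: remember these parameters and reset the counter.
            best_error, best_params, bad_iterations = error, params, 0
        else:
            bad_iterations += 1
            if bad_iterations >= patience:
                break  # validation error has not improved for `patience` steps
    return best_params, best_error
```

The returned parameters are those of the iteration with the least validation error, not those of the last iteration.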
4. Numerical results
In this paper, we mainly focus on 24-hour ahead load prediction with hourly forecast steps. Day-ahead load forecasting can bring significant operational advantages for energy management of micro-grids. For instance, the BCIT micro-grid consists of different types of generating units (e.g., thermal, wind and PV units), and day-ahead load predictions are used for energy management purposes. In other words, optimal utilization of available resources is achieved using load forecasting in order to minimize the operation cost of the BCIT campus micro-grid. Moreover, as this micro-grid can operate in both stand-alone and grid-connected modes, accurate load forecasts can be used for profitable trade of electric energy within the British Columbia power system.
The same load time series data of the building in BCIT and two power systems
are used for numerical experiments of this section. Based on the data analyses
presented in section 2, electricity load not only depends on the load profile of the
previous day, i.e., daily periodicity, but also the load pattern of the previous week,350
i.e., weekly periodicity. To capture such patterns, 192 candidate inputs has been
considered as lagged hourly load data, i.e., {Lt−192, ..., Lt−1} where Lt indicates
the electricity load at time t. The feature selection technique selects the most in-
formative lagged load values from these candidate inputs. Calendar information
is also highly important for a load forecasting model so as to capture weekly and355
seasonal patterns. For instance, either considering the day of the week or differ-
entiating weekends and weekdays is a common way presented in the literature
[5, 9, 39]. Thus, weekends and holidays are considered in this work using a bi-
nary variable for detecting weekends and holidays from weekdays. The month
of the year is also used in some cases [39]; however, it is not considered in this360
paper since the seasonality factor is already captured, as the model is re-trained
every day. Furthermore, temperature data as an exogenous variable has been used
to improve load forecasting prediction since temperature time series usually has
high relevancy to electricity consumption time series [5, 7, 8, 40]. Accordingly,
based on publicly available data, seven daily values of temperature for the previous
week, i.e., $T_{d-7}, ..., T_{d-1}$, and the daily forecast value of the temperature for
the prediction day, i.e., $T_d$, were first considered for the model, where $T_d$
represents the average daily temperature for day $d$. However, numerical experiments
for the BCIT test case revealed that low-resolution temperature data, i.e., daily data,
cannot improve the accuracy of the hourly load forecast. Therefore, we tested
historical hourly temperature data for Vancouver and also used the same time
series as temperature forecasts, i.e., perfect forecasts, in order to observe whether
hourly temperature data can enhance the forecast results for the BCIT test case. For
this purpose, lagged hourly temperature data, i.e., $\{T_{t-192}, ..., T_{t-1}\}$, are
considered as 192 candidate inputs that feed the feature selection stage along with
the 192 candidate inputs for load data. The feature selection technique then selects
the most informative candidates among all load and temperature candidates and
transfers them to the model. Only a few temperature inputs appear among the
selected inputs, which indicates the low correlation between the temperature time
series and the load time series of BCIT. The low correlation results from the fact that
the electric load of this building is mainly lighting load. Considering the mild
temperatures in Vancouver, the heating load is not as significant. The numerical
results also supported this low correlation, as hourly temperature data, even with
perfect forecasts, could not improve the forecast accuracy of the model. Therefore,
temperature inputs are not considered for the numerical results in this paper.
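As an illustration of the candidate-input construction described above, the following is a minimal sketch, not the authors' implementation. It assembles the 192 lagged load values and the binary calendar variable for each target hour. The function name `build_candidates`, the `holidays` argument, and the synthetic load series are assumptions made for this example; the feature selection stage is not included.

```python
from datetime import datetime, timedelta

import numpy as np

N_LAGS = 192  # candidate lags L_{t-192}, ..., L_{t-1}


def build_candidates(load, timestamps, holidays=frozenset(), n_lags=N_LAGS):
    """Return (X, y): per target hour, the lagged loads plus a weekday flag."""
    load = np.asarray(load, dtype=float)
    rows, targets = [], []
    for t in range(n_lags, len(load)):
        lags = load[t - n_lags:t]  # L_{t-192}, ..., L_{t-1}
        day = timestamps[t]
        # 0 marks weekends and holidays, 1 marks weekdays
        is_weekday = 0.0 if (day.weekday() >= 5 or day.date() in holidays) else 1.0
        rows.append(np.append(lags, is_weekday))
        targets.append(load[t])
    return np.array(rows), np.array(targets)


# Synthetic placeholder data (not BCIT data): 12 days of hourly values
start = datetime(2012, 5, 1)
timestamps = [start + timedelta(hours=h) for h in range(12 * 24)]
load = np.arange(len(timestamps), dtype=float)

X, y = build_candidates(load, timestamps)
print(X.shape, y.shape)  # (96, 193) (96,)
```

Each row of `X` holds the 192 lagged loads followed by the calendar flag; a feature selection step would then keep only the most informative columns.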
To show the effectiveness of different forecasting engines, SRWNN is compared
with two other efficient neural network-based forecasting models, i.e., WNN
and Multi-Layer Perceptron (MLP). It is noted that statistical models (e.g., the
Autoregressive Integrated Moving Average (ARIMA) model) are not considered in
this paper since such techniques are basically linear methods and have limited
capability to capture nonlinearities in the load series [41, 42]. Therefore, we chose
two efficient Computational Intelligence (CI) based models as benchmarks in our
comparative results: MLP, an efficient Feed Forward Neural Network (FFNN), and
WNN, an effective model combining the nonlinear mapping merits of FFNNs and
wavelet functions.
Hence, 10 test months of hourly load data from the BCIT building, from
May 2012 to February 2013, are considered for 24-hour-ahead load prediction.
It is noted that the first two months of the historical data are used for training
the forecast engine, so results for those two months cannot be presented
here. Two error criteria are used in this paper to evaluate forecast errors: (i)
normalized Root Mean Square Error (nRMSE) and (ii) normalized Mean Absolute
Error (nMAE), defined as follows:
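\[
\mathrm{nRMSE} = \sqrt{\frac{1}{N}\sum_{t=1}^{N}\left(\frac{L_{ACT}(t) - L_{FOR}(t)}{L_{Peak}}\right)^{2}} \times 100 \tag{7}
\]
\[
\mathrm{nMAE} = \frac{1}{N}\sum_{t=1}^{N}\left|\frac{L_{ACT}(t) - L_{FOR}(t)}{L_{Peak}}\right| \times 100 \tag{8}
\]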
where $L_{ACT}(t)$ and $L_{FOR}(t)$ indicate the actual and forecast values of electricity
load for hour $t$. Moreover, $N$ indicates the number of hours in each month, and $L_{Peak}$
is the peak value of the electricity load over the year, which is 694 kW for this test
case. Observe from Table 3 that SRWNN outperforms the other forecasting models
in all test months in terms of both nRMSE and nMAE. For instance, the
average nRMSE and average nMAE of SRWNN are (5.67-4.98)/5.67≈12.1% and
(4.26-3.74)/4.26≈12.2% lower than those of WNN, and (8.78-4.98)/8.78≈43.2%
and (6.54-3.74)/6.54≈42.8% lower than those of MLP, respectively. This table
demonstrates that for a highly volatile time series, i.e., micro-grid electricity load,
an SRWNN forecasting model can cope more efficiently with the variations and
non-smooth behavior of the time series.

Table 3: Forecasting errors, in %, of SRWNN, WNN and MLP for 10 test months.

              MLP             WNN            SRWNN
Month     nRMSE  nMAE     nRMSE  nMAE     nRMSE  nMAE
May        8.44  6.22      5.96  4.05      5.23  3.80
Jun.       9.92  7.55      5.44  4.27      4.86  3.80
Jul.      10.41  7.92      7.04  5.26      5.43  4.01
Aug.      10.40  8.14      6.57  4.95      6.46  4.80
Sep.      11.88  8.40      7.83  6.01      6.28  4.82
Oct.      10.45  7.68      4.81  3.83      4.24  3.28
Nov.       6.34  4.89      4.62  3.56      4.30  3.21
Dec.       6.21  4.58      4.54  3.35      4.22  3.05
Jan.       6.93  4.74      4.86  3.40      4.25  3.11
Feb.       6.85  5.29      5.06  3.94      4.58  3.50
Average    8.78  6.54      5.67  4.26      4.98  3.74
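The error criteria (7) and (8) and the relative improvements quoted above can be reproduced with a short sketch such as the one below. The function names are chosen for this example and are not from the paper. The monthly nRMSE values are copied from Table 3, and the improvement is computed from the unrounded monthly means.

```python
import numpy as np

L_PEAK_KW = 694.0  # yearly peak load reported for the BCIT test case


def nrmse(actual, forecast, peak=L_PEAK_KW):
    """Normalized root mean square error in percent, as in (7)."""
    err = (np.asarray(actual, float) - np.asarray(forecast, float)) / peak
    return float(np.sqrt(np.mean(err ** 2)) * 100.0)


def nmae(actual, forecast, peak=L_PEAK_KW):
    """Normalized mean absolute error in percent, as in (8)."""
    err = (np.asarray(actual, float) - np.asarray(forecast, float)) / peak
    return float(np.mean(np.abs(err)) * 100.0)


def relative_improvement(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline`."""
    return (baseline - proposed) / baseline * 100.0


# Monthly nRMSE values (May to February) copied from Table 3
mlp_nrmse = [8.44, 9.92, 10.41, 10.40, 11.88, 10.45, 6.34, 6.21, 6.93, 6.85]
wnn_nrmse = [5.96, 5.44, 7.04, 6.57, 7.83, 4.81, 4.62, 4.54, 4.86, 5.06]
srwnn_nrmse = [5.23, 4.86, 5.43, 6.46, 6.28, 4.24, 4.30, 4.22, 4.25, 4.58]

vs_wnn = relative_improvement(np.mean(wnn_nrmse), np.mean(srwnn_nrmse))
vs_mlp = relative_improvement(np.mean(mlp_nrmse), np.mean(srwnn_nrmse))
print(round(vs_wnn, 1), round(vs_mlp, 1))  # 12.1 43.2
```

Normalizing by the yearly peak rather than by the hourly load keeps the criteria well defined during low-load hours.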
Moreover, Fig. 4 illustrates the carpet charts of monthly mean absolute errors
for different hours of the day for SRWNN and WNN on the BCIT test case. This
figure clearly shows that large errors for both models usually occur between 12:00
and 16:00, when the load peaks. However, the colormap shows lower
errors during the peak hours for SRWNN in comparison with WNN. More
importantly, the superiority of SRWNN over WNN is revealed during the upward
ramps in the morning. As analyzed in Section 2, sharp upward ramps occur more often
than downward ramps for the BCIT test case, and consequently, any improvement in
forecasting ramp-up events can considerably enhance the forecast accuracy of this
load time series. Fig. 5 shows the average of the mean absolute errors over all
10 months. According to these two curves, SRWNN shows lower average errors
during the morning ramp, which usually occurs from 7:00 to 12:00. In
addition, there is an improvement in ramp-down forecasting from 16:00 to
18:00.

Figure 4: Mean absolute error (kW) of different hours of the day in different months. (a) SRWNN, (b) WNN.

Figure 5: 10-month mean absolute error for different hours of the day. (Axes: Hour, 1 to 24; Mean Absolute Error (kW). Curves: SRWNN and WNN.)
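The hour-of-day error profile discussed above can be computed along the following lines. This is a minimal sketch assuming that the actual and forecast series cover whole days of hourly values; the helper name `hourly_mae` and the numbers in the usage example are made up for illustration.

```python
import numpy as np


def hourly_mae(actual, forecast):
    """Mean absolute error (kW) for each hour of the day over whole days."""
    abs_err = np.abs(np.asarray(actual, float) - np.asarray(forecast, float))
    # One row per day, one column per hour; average over the days
    return abs_err.reshape(-1, 24).mean(axis=0)


# Illustrative values only: two days with an error at hour index 8 on each day
actual = np.zeros(48)
forecast = np.zeros(48)
forecast[8] = 10.0        # day 1, hour index 8
forecast[24 + 8] = 30.0   # day 2, hour index 8

profile = hourly_mae(actual, forecast)
print(profile[8])  # 20.0
```

Applying the same averaging per month instead of over the whole period would give the rows of a carpet chart like Fig. 4.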
Curves of generated forecasts and real data for a good forecasting day, i.e.,
November 15, and a bad forecasting day, i.e., September 7, are shown in
Fig. 6. Fig. 6(a) shows that there are sharp changes and variations on September 7.
Sharp spikes could result from high temperatures during specific days,
which increase the electricity consumption of the buildings for air conditioning.
As a consequence of such severe ramps, the forecasting model has difficulty
capturing these sudden, large variations in electricity load. The major error is the
magnitude error that occurs during the peak load. In contrast, the variations on
November 15, shown in Fig. 6(b), are smoother, so the forecasting model
captures the upward ramp closely. As a result, the challenge of high volatility
and sharp ramps in micro-grid time series is evidently distinct from power
system loads, and makes such time series less predictable.
Figure 6: Samples for bad (a) and good (b) forecasting days. (a) Bad forecasting day, (b) Good forecasting day.

In the next experiment, forecasting errors for different days of the week over the
same 10-month period are considered separately to observe the users’ behavior.
As mentioned earlier in this section, the electricity consumption of the building
comes mainly from lighting. Here, users’ behavior is represented by
including the calendar effect among the inputs of the model. A binary variable is
used to differentiate weekends and holidays from weekdays, i.e., zero represents
weekends and holidays, while one represents weekdays. Fig. 7 shows the
forecasting errors with and without the calendar effect. First, observe that the
average nRMSE with the calendar effect, i.e., 4.78%, is lower than that
without it, i.e., 5.32%. Moreover, according to the
figure, the highest error occurs on Mondays, the first working day of the week at the
university. Calendar inputs can efficiently capture such user behavior.
For instance, the nRMSE for Mondays decreases considerably,
from 7.32% to 5.83%, when the calendar effect is taken
into account. In addition, the standard deviation of the error across the
days of the week decreases from 0.97% to 0.57% with the calendar effect.
In other words, the model performs more robustly across different
days of the week. According to Fig. 7, the difference between the maximum and
the minimum errors, i.e., 1.85% with the calendar effect and 2.83% without it,
also shows the better performance of the model including the calendar effect.
As a result, users’ behavior can be captured efficiently by considering calendar
effects, which improves the forecast accuracy.
Figure 7: Forecasting errors for different days of the week
In the last experiment, the proposed forecasting model is applied to predict two
power system time series. The main goal of this numerical experiment is to show
how the forecast accuracy of SRWNN improves, compared with WNN, as the
volatility of the time series increases. Hence, from a power system with low volatility to
one with higher volatility, forecast accuracy improvements increase for SRWNN.
To this end, the same test cases for British Columbia’s and California’s power
systems are considered. Table 4 shows the obtained forecast error results (based on
the average of 10-month error) for both SRWNN and WNN models. Firstly, this
table demonstrates noticeably lower forecast errors of both models for prediction
of power systems’ load data compared with those for a micro-grid shown in
Table 3. For instance, the nRMSE of SRWNN is 4.98% for the micro-grid compared
with 2.29% for British Columbia’s power system. Besides, as observed
from Table 1, the volatility of British Columbia’s power system time series is
the lowest in terms of both daily and weekly volatility indices. Consequently,
British Columbia’s power system is expected to have higher predictability
than the micro-grid and California’s power system. Table 4
supports that the forecasting errors for British Columbia are lower than those for
California, e.g., 2.46% compared to 3.67% in terms of nRMSE for WNN.

Table 4: Forecasting errors of SRWNN and WNN for two power systems.

                        WNN            SRWNN       Improvements (%)
Power system        nRMSE  nMAE    nRMSE  nMAE     nRMSE  nMAE
British Columbia     2.46  1.81     2.29  1.69       6.9   6.6
California           3.67  2.57     3.38  2.37       7.9   7.8
Secondly, Table 4 shows how effective the SRWNN becomes as the volatility
of a time series increases. As seen from the last two columns of Table 4, the forecast
accuracy improvements obtained from SRWNN in terms of nRMSE and nMAE are
(2.46-2.29)/2.46=6.9% and (1.81-1.69)/1.81=6.6%, respectively, for BC’s power
system. Similarly, there are 7.9% and 7.8% forecast accuracy improvements in
terms of nRMSE and nMAE, respectively, for California’s power system. Therefore,
since the volatility of California’s power system is higher than that of British
Columbia’s, SRWNN obtains a higher improvement in forecast accuracy over
WNN for California’s power system. In other words, California’s load
contains higher daily and weekly volatilities, and the SRWNN can
capture these variations and present more accurate forecast results than
WNN. To give a better sense of these percentage errors, the forecast accuracy
improvement in terms of mean absolute error is around 93 MW, which is almost
twice the capacity of the Kumeyaay wind farm, i.e., 50 MW, located in San
Diego, California [43]. As a result, as the volatility of the time series
increases, the performance of SRWNN improves in comparison with WNN. As
mentioned earlier in this section, load forecast accuracy can be improved using
weather forecast data as exogenous inputs to the forecasting model. For instance,
load forecasting models utilized by the California ISO (CAISO) include weather
forecasts, such as temperature, dew point, wind speed and cloud cover, for the next
9 days for 24 weather stations [44]. It is noted that including such exogenous inputs
in the model depends on the availability of public data.
The computation time of the SRWNN model for the training phase is less
than 35 seconds for one-day prediction for the test cases of this paper, measured
on a Mac with an Intel Core i5 2.7 GHz processor and 12 GB of RAM.
Although this computation time is larger than that of WNN, i.e., less than 11
seconds, it is completely acceptable within a 24-hour decision-making framework,
and shows the fast forecasting performance of the proposed method.
5. Conclusions
STLF is an important tool for reliable and economic operation of power systems,
as many operating decisions are based on load forecasts, e.g., dispatch scheduling
of generating units, security assessment and demand side management. Likewise,
precise STLF for a micro-grid can enhance the management of its renewable
and conventional resources and improve the economics of energy trade with
electricity markets. Considering the volatile and non-smooth characteristics of load
time series of micro-grids compared with power systems’ electricity load, a new
forecasting method is proposed in this paper to deal with such challenges. The
proposed method uses an SRWNN as the forecasting engine, in which
feedback loops have been added to a WNN so as to better capture nonlinear
complexities of volatile time series. The LM learning algorithm is implemented to train
the SRWNN, i.e., to adjust the free parameters of the SRWNN. The high volatility of
a micro-grid load was shown by defining a volatility criterion and comparing it with
the volatility of two power systems’ load data. The effectiveness of the proposed
forecasting method was demonstrated on real-world load data of a micro-grid and
power systems. The results show that the proposed SRWNN model leads to more
accurate forecasts when the prediction of a volatile time series is of interest.
Appendix: Formulation of the training algorithm
The task of the forecasting engines is to learn the mapping function between
a specified set of input/output pairs $\{(X_1, t_1), (X_2, t_2), ..., (X_Q, t_Q)\}$, known as
training samples, where $Q$ indicates the number of training samples. $X_q$ and $t_q$ are the
$q$th input vector and the corresponding target output of the forecasting model,
respectively. Mean squared error (MSE) is usually considered to be the performance
index for the network. The MSE is calculated by
\[
\mathrm{MSE} = \frac{1}{Q}\sum_{q=1}^{Q} e_q^{2}, \quad (e_q = t_q - y_q) \tag{A.1}
\]
where $y_q$ is the output of the forecasting engine when $X_q$ is fed as its input,
and $e_q$ is the forecast error of the $q$th sample.
The LM algorithm is an approximation of Newton’s method, in which the
solution is updated as follows:
\[
P_{k+1} = P_k - (J^{\top}J + \mu I)^{-1} J^{\top} e \tag{A.2}
\]
where $P$ is the vector of the free parameters according to (6), $k$ represents the
iteration number, and $I$ is the identity matrix. $J$ is the Jacobian matrix composed
of the first derivatives of the network errors with respect to all its free parameters,
and $J^{\top}J$ approximates the Hessian matrix. Considering (A.1) as the performance function
that should be minimized, the gradient of (A.1) is proportional to $J^{\top}e$.
The main modification of the LM algorithm with respect to Newton’s method
is the parameter µ: when µ is zero in (A.2), the algorithm becomes Newton’s
method with the approximate Hessian $J^{\top}J$. When µ is large, the LM algorithm
tends to gradient descent with a small step size, i.e., 1/µ, while for small µ the LM
algorithm tends to Newton’s method. Since Newton’s method is faster and more
accurate than gradient descent, the aim is to shift toward Newton’s method as
quickly as possible. Thus, µ is divided by a factor β (β > 1) after each successful
step, i.e., a step that reduces the MSE given in (A.1). Conversely, µ is multiplied by
the factor β when a tentative step increases the MSE. Therefore, the MSE is always
reduced at each iteration of the algorithm [38]. The initial value of µ is usually set to
0.01 and β is usually set to 10. For further details regarding the LM training algorithm,
the interested reader can refer to [38]. The implementation of the LM learning
algorithm on the SRWNN is presented in the following.
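The two limits of the update (A.2) described above can be checked numerically with a small sketch. The Jacobian `J` and error vector `e` below are arbitrary illustrative values, and the helper names `lm_step` and `adapt_mu` are assumptions made for this example rather than part of the paper.

```python
import numpy as np


def lm_step(J, e, mu):
    """Parameter change -(J^T J + mu I)^{-1} J^T e used in update (A.2)."""
    n_params = J.shape[1]
    A = J.T @ J + mu * np.eye(n_params)
    return -np.linalg.solve(A, J.T @ e)


def adapt_mu(mu, mse_new, mse_old, beta=10.0):
    """Divide mu by beta after a successful step, multiply it otherwise."""
    return mu / beta if mse_new < mse_old else mu * beta


# Arbitrary illustrative Jacobian (3 samples, 2 parameters) and error vector
J = np.array([[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]])
e = np.array([0.5, -1.0, 2.0])

# mu = 0: Newton-type step with the approximate Hessian J^T J
newton_like = -np.linalg.solve(J.T @ J, J.T @ e)
print(np.allclose(lm_step(J, e, 0.0), newton_like))

# Large mu: the step approaches a gradient step of size 1/mu along -J^T e
big_mu = 1e8
print(np.allclose(lm_step(J, e, big_mu) * big_mu, -(J.T @ e), rtol=1e-5))
```

Solving the linear system with `np.linalg.solve` avoids forming the matrix inverse explicitly, which is the usual numerical choice for this kind of update.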
Since computation of the Jacobian matrix is the most important part of the LM
algorithm, the first derivative of the network error with respect to each free
parameter of (6) in the SRWNN, i.e., $v_j$, $w_i$, $a_i$, $b_i$, $\theta_{i,j}$, and $g$,
must be determined. The elements of the Jacobian matrix are calculated by the
following equations.
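\[
\frac{\partial e}{\partial v_j} = \frac{\partial (t - y)}{\partial v_j} = -x_j, \quad j = 1, 2, ..., M \tag{A.3}
\]
\[
\frac{\partial e}{\partial w_i} = \frac{\partial (t - y)}{\partial w_i} = -\Psi_i, \quad i = 1, 2, ..., N \tag{A.4}
\]
\[
\frac{\partial e}{\partial a_i} = \frac{\partial (t - y)}{\partial a_i} = -w_i \cdot \frac{\partial \Psi_i}{\partial a_i}, \quad i = 1, 2, ..., N \tag{A.5}
\]
\[
\frac{\partial \Psi_i}{\partial a_i} = \sum_{j=1}^{M}\left[\frac{d\psi_{i,j}}{da_i} \cdot \prod_{l=1,\, l \neq j}^{M} \psi(r_{i,l})\right], \tag{A.6}
\]
\[
\frac{d\psi_{i,j}}{da_i} = \frac{-r_{i,j}}{a_i} \cdot \psi'(r_{i,j}), \tag{A.7}
\]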
where ψ′(.) is the derivative of the Morlet mother wavelet function.
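\[
\frac{\partial e}{\partial b_i} = \frac{\partial (t - y)}{\partial b_i} = -w_i \cdot \frac{\partial \Psi_i}{\partial b_i}, \quad i = 1, 2, ..., N \tag{A.8}
\]
\[
\frac{\partial \Psi_i}{\partial b_i} = \sum_{j=1}^{M}\left[\frac{d\psi_{i,j}}{db_i} \cdot \prod_{l=1,\, l \neq j}^{M} \psi(r_{i,l})\right], \tag{A.9}
\]
\[
\frac{d\psi_{i,j}}{db_i} = \frac{-1}{a_i} \cdot \psi'(r_{i,j}), \tag{A.10}
\]
\[
\frac{\partial e}{\partial \theta_{i,j}} = -w_i \cdot \frac{\partial \Psi_i}{\partial \theta_{i,j}}, \quad i = 1, ..., N, \; j = 1, ..., M \tag{A.11}
\]
\[
\frac{\partial \Psi_i}{\partial \theta_{i,j}} = \frac{\psi_{i,j} z^{-1}}{a_i} \cdot \psi'(r_{i,j}) \cdot \prod_{l=1,\, l \neq j}^{M} \psi(r_{i,l}) \tag{A.12}
\]
\[
\frac{\partial e}{\partial g} = \frac{\partial (t - y)}{\partial g} = -1 \tag{A.13}
\]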
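The derivatives (A.3) to (A.10) can be sanity-checked against finite differences. The sketch below does this under simplifying assumptions that are not taken from the paper: the feedback terms are dropped (so $r_{i,j} = (x_j - b_i)/a_i$ and (A.11) and (A.12) are not covered), the network output is assumed to be $y = \sum_i w_i \Psi_i + \sum_j v_j x_j + g$ with $\Psi_i = \prod_j \psi(r_{i,j})$, and the Morlet wavelet is assumed to be $\psi(r) = \cos(5r)\,e^{-r^2/2}$. All function names are hypothetical.

```python
import numpy as np


def psi(r):
    # Assumed Morlet mother wavelet
    return np.cos(5.0 * r) * np.exp(-0.5 * r ** 2)


def dpsi(r):
    # Derivative of the assumed Morlet wavelet
    return (-5.0 * np.sin(5.0 * r) - r * np.cos(5.0 * r)) * np.exp(-0.5 * r ** 2)


def forward(x, v, w, a, b, g):
    r = (x[None, :] - b[:, None]) / a[:, None]  # shape (N, M), no feedback term
    Psi = np.prod(psi(r), axis=1)               # shape (N,)
    return w @ Psi + v @ x + g, r, Psi


def error_gradients(x, v, w, a, b, g):
    """Derivatives of e = t - y following (A.3) to (A.10) and (A.13)."""
    _, r, Psi = forward(x, v, w, a, b, g)
    p = psi(r)
    N, M = r.shape
    # others[i, j] = product over l != j of psi(r[i, l])
    others = np.array([[np.prod(np.delete(p[i], j)) for j in range(M)]
                       for i in range(N)])
    dPsi_da = np.sum((-r / a[:, None]) * dpsi(r) * others, axis=1)
    dPsi_db = np.sum((-1.0 / a[:, None]) * dpsi(r) * others, axis=1)
    return {"v": -x, "w": -Psi, "a": -w * dPsi_da, "b": -w * dPsi_db, "g": -1.0}


rng = np.random.default_rng(0)
M, N = 3, 4
x = rng.normal(size=M)
v, w = rng.normal(size=M), rng.normal(size=N)
a, b = rng.uniform(1.0, 2.0, size=N), rng.normal(size=N)
g, target = 0.3, 1.0


def numeric_grad(name, index, h=1e-6):
    """Central finite difference of e = target - y for one parameter."""
    params = {"v": v.copy(), "w": w.copy(), "a": a.copy(), "b": b.copy()}
    params[name][index] += h
    y_plus = forward(x, params["v"], params["w"], params["a"], params["b"], g)[0]
    params[name][index] -= 2.0 * h
    y_minus = forward(x, params["v"], params["w"], params["a"], params["b"], g)[0]
    return ((target - y_plus) - (target - y_minus)) / (2.0 * h)


analytic = error_gradients(x, v, w, a, b, g)
max_gap = max(abs(analytic[name][i] - numeric_grad(name, i))
              for name in "vwab" for i in range(len(analytic[name])))
print(max_gap < 1e-6)
```

A check of this kind is a common way to catch sign or index mistakes before the analytic Jacobian is used inside the training loop.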
Therefore, the Jacobian matrix of size $Q \times NP$ can be computed using
(A.3) to (A.13), and all free parameters of the SRWNN are updated using (A.2).
The procedure of the LM learning algorithm for training the SRWNN is summarized
as follows:
1. Set the iteration number to 1, i.e., $k = 1$. Randomly initialize the free
parameters $v_j$, $w_i$, $a_i$, $b_i$, $\theta_{i,j}$ and $g$ of the forecasting engine within their
allowable ranges to obtain the first parameter vector $P_1$.
2. Present all $X_q$ and compute the corresponding SRWNN outputs $y_q$ using
(5). Moreover, compute the corresponding errors $e_q$ and the performance
index MSE using (A.1).
3. Compute the Jacobian matrix.
4. Update the free parameters of the forecasting engine using (A.2) to obtain
$P_{k+1}$.
5. Compute the performance index MSE using $P_{k+1}$. If the new MSE is
smaller than the one computed in step 2, reduce the parameter µ by the
factor β, and save $P_{k+1}$. Otherwise, increase the parameter µ by multiplying
it by β and go back to step 3.
6. Increment $k$, i.e., $k = k+1$. The training algorithm is terminated when the
termination criterion is satisfied. Otherwise, go back to step 3. It is noted
that the termination criterion can be the maximum number of iterations.
However, the early stopping technique, discussed in Section 3.2, is used as
the termination criterion of the training algorithm in this paper, as it can
monitor the prediction ability of the SRWNN forecast engine on unseen
samples and terminate the training process at the point with the least
validation error.
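The steps above can be sketched on a toy problem as follows. This is an illustrative outline rather than the authors' implementation: a two-parameter exponential model stands in for the SRWNN, the starting point is fixed instead of random, and an iteration cap with an MSE tolerance replaces early stopping. The names `model`, `jacobian` and `train_lm`, and the safeguard on µ, are assumptions made for this example.

```python
import numpy as np


def model(params, x):
    # Toy stand-in for the SRWNN output: y = p0 * exp(p1 * x)
    return params[0] * np.exp(params[1] * x)


def jacobian(params, x):
    # Derivatives of e = t - y with respect to p0 and p1, one row per sample
    exp_term = np.exp(params[1] * x)
    return np.column_stack([-exp_term, -params[0] * x * exp_term])


def train_lm(x, t, params, mu=0.01, beta=10.0, max_iter=50, tol=1e-12):
    e = t - model(params, x)                     # step 2: errors and MSE
    mse = np.mean(e ** 2)
    for _ in range(max_iter):                    # step 6: iteration cap
        J = jacobian(params, x)                  # step 3
        while True:
            step = np.linalg.solve(J.T @ J + mu * np.eye(len(params)), J.T @ e)
            candidate = params - step            # step 4: update (A.2)
            e_new = t - model(candidate, x)
            mse_new = np.mean(e_new ** 2)        # step 5
            if mse_new < mse:                    # successful step: accept it
                params, e, mse, mu = candidate, e_new, mse_new, mu / beta
                break
            mu *= beta                           # failed step: raise mu, retry
            if mu > 1e12:                        # safeguard added for the sketch
                return params, mse
        if mse < tol:
            break
    return params, mse


x = np.linspace(0.0, 1.0, 20)
t = model(np.array([2.0, -1.5]), x)              # noise-free synthetic targets
fitted, final_mse = train_lm(x, t, np.array([1.0, 0.0]))
print(fitted, final_mse)
```

Because a step is only accepted when it lowers the MSE, the training error never increases from one accepted iteration to the next, which matches the behavior described for step 5.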
Acknowledgements
Partial support for this work came from the Natural Sciences and
Engineering Research Council of Canada (NSERC) and the ENMAX Corporation under the
Industrial Research Chairs program. Moreover, the authors would like to thank Dr.
Hassan Farhangi and Dr. Ali Palizan of British Columbia Institute of Technology
(BCIT) for providing data and invaluable insight.
References
[1] Navigant research, 2013. URL: http://www.navigantresearch.com/research/microgrids.
[2] J. Taylor, P. McSharry, Short-term load forecasting methods: An evaluation
based on European data, IEEE Transactions on Power Systems 22 (2007)
2213–2219.
[3] T. Hong, M. Gui, M. Baran, H. Willis, Modeling and forecasting hourly electric
load by multiple linear regression with interactions, Power and Energy
Society General Meeting, 2010 IEEE (2010) 1–8.
[4] E. Paparoditis, T. Sapatinas, Short-term load forecasting: The similar shape
functional time-series predictor, IEEE Transactions on Power Systems 28
(2013) 3818–3825.
[5] H. Hippert, C. Pedreira, R. Souza, Neural networks for short-term load
forecasting: a review and evaluation, IEEE Transactions on Power Systems
16 (2001) 44–55.
[6] Y. Wang, Q. Xia, C. Kang, Secondary forecasting based on deviation analysis
for short-term load forecasting, IEEE Transactions on Power Systems 26
(2011) 500–507.
[7] E. Ceperic, V. Ceperic, A. Baric, A strategy for short-term load forecasting
by support vector regression machines, IEEE Transactions on Power
Systems 28 (2013) 4356–4364.
[8] Y. Goude, R. Nedellec, N. Kong, Local short and middle term electricity
load forecasting with semi-parametric additive models, IEEE Transactions
on Smart Grid 5 (2014) 440–446.
[9] A. Chaouachi, R. M. Kamel, R. Andoulsi, K. Nagasaka, Multiobjective
intelligent energy management for a microgrid, IEEE Transactions on
Industrial Electronics 60 (2013) 1688–1699.
[10] E. Mashhour, S. Moghaddas-Tafreshi, Integration of distributed energy
resources into low voltage grid: A market-based multiperiod optimization
model, Electric Power Systems Research 80 (2010) 473–480.
[11] E. R. Sanseverino, M. L. D. Silvestre, M. G. Ippolito, A. D. Paola, G. L.
Re, An execution, monitoring and replanning approach for optimal energy
management in microgrids, Energy 36 (2011) 3429–3436.
[12] A. Mohamed, V. Salehi, O. Mohammed, Real-time energy management
algorithm for mitigation of pulse loads in hybrid microgrids, IEEE
Transactions on Smart Grid 3 (2012) 1911–1922.
[13] M. Eghbal, T. K. Saha, N. Mahmoudi-Kohan, Utilizing demand response
programs in day ahead generation scheduling for micro-grids with renewable
sources, 2011 IEEE PES Innovative Smart Grid Technologies Asia (ISGT)
(2011) 1–6.
[14] P. Chan, W.-C. Chen, W. Ng, D. Yeung, Multiple classifier system for
short term load forecast of microgrid, Proceedings of the 2011 International
Conference on Machine Learning and Cybernetics (10-13 July, 2011)
1268–1273.
[15] N. Amjady, F. Keynia, H. Zareipour, Short-term load forecast of microgrids
by a new bilevel prediction strategy, IEEE Transactions on Smart Grid 1
(2010) 286–294.
[16] M. Shahidehpour, M. Khodayar, Cutting campus energy costs with
hierarchical control, IEEE Electrification Magazine 1 (2013) 40–56.
[17] A. Ahmad, M. Hassan, M. Abdullah, H. Rahman, F. Hussin, H. Abdullah,
R. Saidur, A review on applications of ANN and SVM for building
electrical energy consumption forecasting, Renewable and Sustainable Energy
Reviews 33 (2014) 102–109.
[18] G. Escriva-Escriva, C. Alvarez-Bel, C. Roldan-Blay, M. Alcazar-Ortega,
New artificial neural network prediction method for electrical consumption
forecasting based on building end-uses, Energy and Buildings 43 (2011)
3112–3119.
[19] A. H. Neto, F. A. S. Fiorelli, Comparison between detailed model simulation
and artificial neural network for forecasting building energy consumption,
Energy and Buildings 40 (2008) 2169–2176.
[20] S. Farzana, M. Liu, A. Baldwin, M. U. Hossain, Multi-model prediction and
simulation of residential building energy in urban areas of Chongqing, south
west China, Energy and Buildings 81 (2014) 161–169.
[21] R. K. Jain, K. M. Smith, P. J. Culligan, J. E. Taylor, Forecasting energy
consumption of multi-family residential buildings using support vector
regression: Investigating the impact of temporal and spatial monitoring granularity
on performance accuracy, Applied Energy 123 (2014) 168–178.
[22] D. Monfet, M. Corsi, D. Choiniere, E. Arkhipova, Development of an energy
prediction tool for commercial buildings using case-based reasoning, Energy
and Buildings 81 (2014) 152–160.
[23] G. Escriva-Escriva, C. Roldan-Blay, C. Alvarez-Bel, Electrical consumption
forecast using actual data of building end-use decomposition, Energy and
Buildings 82 (2014) 73–81.
[24] J. G. Jetcheva, M. Majidpour, W. P. Chen, Neural network model ensembles
for building-level electricity load forecasts, Energy and Buildings 84 (2014)
214–223.
[25] British Columbia Institute of Technology, 2014. URL: http://www.bcit.ca/microgrid/.
[26] H. Zareipour, K. Bhattacharya, C. A. Canizares, Electricity market price
volatility: The case of Ontario, Energy Policy 35 (2007) 4739–4748.
[27] N. Amjady, F. Keynia, Short-term load forecasting of power systems by
combination of wavelet transform and neuro-evolutionary algorithm, Energy
34 (2009) 46–57.
[28] Q. Zhang, A. Benveniste, Wavelet networks, IEEE Transactions on Neural
Networks 3 (1992) 889–898.
[29] N. M. Pindoriya, S. N. Singh, S. K. Singh, An adaptive wavelet neural
network-based energy price forecasting in electricity markets, IEEE
Transactions on Power Systems 23 (2008) 1423–1432.
[30] J. Vermaak, E. Botha, Recurrent neural networks for short-term load
forecasting, IEEE Transactions on Power Systems 13 (1998) 126–132.
[31] S. J. Yoo, J. B. Park, Y. H. Choi, Adaptive dynamic surface control of
flexible-joint robots using self-recurrent wavelet neural networks, IEEE
Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 36
(2006) 1342–1355.
[32] S. J. Yoo, J. B. Park, Y. H. Choi, Indirect adaptive control of nonlinear
dynamic systems using self recurrent wavelet neural networks via adaptive
learning rates, Information Sciences 177 (2007) 3074–3098.
[33] A. Tascikaraoglu, M. Uzunoglu, A review of combined approaches for
prediction of short-term wind speed and power, Renewable and Sustainable
Energy Reviews 34 (2014) 243–254.
[34] N. Amjady, F. Keynia, H. Zareipour, Wind power prediction by a new forecast
engine composed of modified hybrid neural network and enhanced particle
swarm optimization, IEEE Transactions on Sustainable Energy 2 (2011)
265–276.
[35] S. S. Haykin, Neural Networks: A Comprehensive Foundation, Prentice
Hall, 1999.
[36] H. Chitsaz, N. Amjady, H. Zareipour, Wind power forecast using wavelet
neural network trained by improved clonal selection algorithm, Energy
Conversion and Management 89 (2015) 588–598.
[37] L. Wu, M. Shahidehpour, A hybrid model for day-ahead price forecasting,
IEEE Transactions on Power Systems 25 (2010) 1519–1530.
[38] M. T. Hagan, M. B. Menhaj, Training feedforward networks with the
Marquardt algorithm, IEEE Transactions on Neural Networks 5 (1994) 989–993.
[39] L. Hernandez, C. Baladron, J. Aguiar, B. Carro, A. Sanchez-Esguevillas,
J. Lloret, Short-term load forecasting for microgrids based on artificial neural
networks, Energies 6 (2013) 1385–1408.
[40] A. Pandey, D. Singh, S. Sinha, Intelligent hybrid wavelet models for
short-term load forecasting, IEEE Transactions on Power Systems 25 (2010)
1266–1273.
[41] B.-L. Zhang, Z.-Y. Dong, An adaptive neural-wavelet model for short term
load forecasting, Electric Power Systems Research 59 (2001) 121–129.
[42] N. Amjady, A. Daraeepour, Mixed price and load forecasting of electricity
markets by a new iterative prediction method, Electric Power Systems
Research 79 (2009) 1329–1336.
[43] Kumeyaay wind farm, 2014. URL: http://www.thewindpower.net/windfarm_en_2792_kumeyaay.php.
[44] California Independent System Operator, 2014. URL: http://www.caiso.com/1c57/1c578a8751b30.pdf.