
Architecture and performance of neural networks for efficient A/C control in buildings

    Mohamed A. Mahmoud *, Abdullatif E. Ben-Nakhi

    Department of Mechanical Engineering, College of Technological Studies,

    P.O. Box 33145, Rumaithya 25562, Kuwait

    Received 10 November 2002; accepted 12 May 2003

    Abstract

The feasibility of using neural networks (NNs) for optimizing air conditioning (AC) setback scheduling in public buildings was investigated. The main focus is on optimizing the network architecture in order to achieve the best performance.

To save energy, the temperature inside public buildings is allowed to rise after business hours by setting back the thermostat. The objective is to predict the time of the end of thermostat setback (EoS) such that the design temperature inside the building is restored in time for the start of business hours.

State of the art building simulation software, ESP-r, was used to generate a database that covered the years 1995–1999. The software was used to calculate the EoS for two office buildings using the climate records in Kuwait. The EoS data for 1995 and 1996 were used for training and testing the NNs. The robustness of the trained NNs was tested by applying them to a production data set (1997–1999), which the networks had never seen before.

For each of the six different NN architectures evaluated, parametric studies were performed to determine the network parameters that best predict the EoS. External hourly temperature readings were used as network inputs, and the thermostat end of setback (EoS) is the output. The NN predictions were improved by developing a neural control scheme (NC). This scheme is based on using the temperature readings as they become available. For each NN architecture considered, six NNs were designed and trained for this purpose. The performance of the NN analysis was evaluated using a statistical indicator (the coefficient of multiple determination) and by statistical analysis of the error patterns, including ANOVA (analysis of variance).

The results show that the NC, when used with a properly designed NN, is a powerful instrument for optimizing AC setback scheduling based only on external temperature records.

© 2003 Elsevier Ltd. All rights reserved.

Energy Conversion and Management 44 (2003) 3207–3226

www.elsevier.com/locate/enconman

* Corresponding author. Tel.: +965-563-0013; fax: +965-534-9253/481-1753.

E-mail address: [email protected] (M.A. Mahmoud).

0196-8904/$ - see front matter © 2003 Elsevier Ltd. All rights reserved.
doi:10.1016/S0196-8904(03)00105-5


Keywords: Neural networks; Energy conservation; Air conditioning; Control; General regression; Building simulation; Polynomial nets

    1. Introduction

Energy conservation in buildings is important for both economic and environmental reasons. Improving building energy efficiency simultaneously reduces conventional fuel consumption, building energy cost and the release of global warming gases to the atmosphere. This has been highlighted by the recent trend toward more effective and efficient heating, ventilation and air conditioning (HVAC) control methodologies. In countries with extremely hot weather conditions, energy conservation in the air conditioning (AC) of public and office buildings is of particular interest, since most of these buildings are used only for part of every work day. One of the most promising strategies in this respect is off-hours thermostat setback, i.e. allowing the temperature to rise inside the building when it is not in use, leading to energy savings. This is achieved by setting back the thermostat temperature after work hours, then resetting it early enough before the start of the work day such that the desired temperature in the building is restored in time for the actual start of work. The end of setback (denoted herein as EoS) depends, for a given building, on the weather conditions, which are not known a priori. This requires an advanced tool to predict the EoS based on past weather history, and artificial neural networks (NNs) offer an attractive and powerful option for this purpose.

NNs have been employed in a wide range of HVAC applications, such as design, operation and fault detection. Yeh and Wong [1] simulated the design process for sizing fluid systems in HVAC by using a NN. Teeter and Chow [2] used artificial NNs to emulate the HVAC plant dynamics in order to estimate future plant outputs and obtain plant input/output sensitivity information for online neural control adaptation. Chen and Chen [3] discussed a NN based system identification technique to determine the z-transfer function coefficients of a building envelope from experimental data. These coefficients were then used in a z-transfer function technique, which is used in calculations for HVAC design and building energy consumption.

Among the many researchers who have addressed the issue of increasing indoor thermal comfort, Egilegor et al. [4] implemented a neural-fuzzy control system in order to optimize the value of the predicted mean vote (PMV) index by tuning the zone temperature according to the humidity level. Rock and Wu [5] introduced a CO2 based, demand controlled ventilation scheme by using an artificial NN control algorithm. Ahmed et al. [6] studied the development of a controller for temperature during the cooling and heating sequence. Morel et al. [7] developed an adaptive heating controller algorithm by using artificial NNs to accommodate the non-linearities of real buildings. Jeannette et al. [8] improved the performance of an unstable hot water system in an air handling unit by applying a predictive neural network (PNN) controller. Fargus and Chapman [9] developed a hybrid PI (Proportional and Integral)/neural network controller for commercial application in HVAC control systems. Kasahara et al. [10] applied a so-called preview control, which is a linear quadratic Gaussian optimal control with feed forward compensation, to control process variables such as indoor temperature and indoor humidity. Saboksayr et al. [11] designed a NN based decentralized controller to improve the operation of a multi-zone space heating system.


Two approaches have been reported for employing predictive controls in HVAC systems, namely predicting future weather and predicting future thermal load. Alessandri et al. [12] presented an environmental temperature forecasting model based on NNs. On the other hand, most of the studies applying NN based predictive control systems to HVAC systems have been performed on thermal storage systems. Kawashima et al. [13] studied improving the performance of a partial ice storage system by employing a controller that predicts the load by NNs. Similarly, Massie et al. [14] developed adaptive and predictive NN models for a chiller and ice thermal storage tank of a central plant HVAC system. Aside from these, NNs have been employed for fault detection and diagnosis in HVAC systems [15,16].

In this paper, the feasibility of using NNs to optimize HVAC setback scheduling is demonstrated.

Attention is focused on energy conservation in the AC of public and office buildings in which off-hours thermostat setback is used to allow the temperature to rise inside the building when it is not in use, leading to energy savings. The temperature in Kuwait in summer can exceed 50 °C in the shade during the day and can exceed 37 °C at dawn. Under such extreme conditions, there is great potential for energy conservation in the AC of these public buildings. As mentioned above, one successful strategy in this respect is to allow the temperature to rise within the building when it is not in use. One of the keys to the success of this strategy is being able to predict accurately the time (EoS) when full AC power needs to be restored. This prediction of the EoS is important because, if the AC units start too late, the desired temperature within the building will not be reached in time for the start of business. On the other hand, if the AC units start too early, some potential savings in AC energy are lost. The difficulty in accurately predicting the EoS is that it depends on the weather conditions, which are not known ahead of time. This requires an advanced tool to predict the EoS based on past weather history, and artificial NNs offer an attractive and powerful option for this purpose. The architecture and parameters of the NNs are the main focus of this study, the aim being to determine the optimum design for best network performance.

    2. Building simulation software for calculation of EoS

In the office buildings considered in this study, the temperature is to be maintained at 24 °C within each building from 8 AM to 5 PM. Then, the thermostat is set back to 30 °C until at least 2 AM. It is required to determine the end of setback (EoS), i.e. the time at which the thermostat is reset to 24 °C, such that the temperature within the building is restored to 24 °C at 8 AM. Building energy simulation programs may be used to predict the EoS only when the weather conditions are known. This was done in the present study to prepare a database of past history to be used with the NNs to predict the EoS for new situations before the weather conditions are known.

One of the most powerful building simulation codes, ESP-r [17], was used to calculate the EoS for this study. This state of the art, whole building simulation software has been evolved and applied at several research centers throughout Europe since its selection by the European Commission as a reference program for building energy simulation. This software is based on integrated dynamic simulation in which the thermal performance of the building is treated as systemic, i.e. the different heat transfer mechanisms (such as the effect of wind velocity on the external heat transfer


coefficient) interact in a complex manner. Besides conduction and convection, all significant heat flow paths are considered. These include internal and external long wave and short wave radiation and radiation absorption by transparent materials. The weather inputs to the program include diffuse solar radiation on the horizontal, external dry bulb temperature, direct normal solar intensity, prevailing wind speed, wind direction and relative humidity. Weather data for Kuwait were obtained from the Kuwait Institute for Scientific Research for the five years 1995–1999. The weather data for 1995 and 1996 were used as the base.

Two office buildings are considered in this study. The first, denoted EW, is a two story structure 155 m long, 24 m wide and 7.2 m high. The long side of the building faces east. The eastern wall contains a double glazing transparent area of 328 m², while the western wall contains a double glazing transparent area of 270 m². This building is considered one of the extreme cases that can be studied, since the effect of solar radiation through the windows contributes to the cooling load inside the building after dawn, i.e. during the critical hours during which the NNs are predicting the EoS. It should be noted in this respect that dawn is as early as 3:13 AM in Kuwait City in June and is earlier than 4:00 AM for more than 130 days between April 15 and August 27. During those days, the building is exposed to radiation (both direct and indirect) for more than 4 h before the start of work at 8:00 AM. It should also be noted that the NNs used in this study use the external temperature only as input to predict the EoS. Therefore, reasonably accurate predictions of the EoS by the NNs for this building serve to prove the robustness of the NN approach. The building materials used in the numerical models are similar to the common building materials in Kuwait and consist mainly of cement blocks, sand-lime bricks, cement mortar and insulation. The thermophysical and optical properties of these materials, together with those of the double glazing glass, were used as part of the input data to ESP-r to predict the EoS.

The second building considered, denoted SN, is the same building rotated 90°, i.e. the northern facade contains a double glazing transparent area of 270 m² and the southern wall contains a double glazing transparent area of 328 m². In this case, the effect of solar radiation through the windows is expected to be minimal during the period of interest (between dawn and 8 AM). It is expected, therefore, that the NN analysis should give better predictions for this building than for the first one. The site latitude and longitude for both buildings were assumed to be those of Kuwait City (i.e. 29.3° N latitude and 47.9° E longitude).

Fig. 1 shows how the EoS, calculated using ESP-r, varies from day to day and from year to year during the period covered in this study for building EW. It is evident from the figure that a schedule for the EoS based on the date alone can lead to errors as large as 4 h during the hottest days of the year. Examination of the EoS plotted against the external temperature at 2 AM during the five year period for the same building revealed that no correlation exists between the EoS and that temperature, which rules out any simple way to predict the EoS based merely on a single temperature reading. An advanced tool that accounts for the temperature variation over several hours is needed, and artificial NNs offer a powerful tool for this purpose.

An artificial NN is usually defined as a network composed of a large number of processors (neurons) that are massively interconnected, operate in parallel and learn from experience (examples). Backpropagation networks are known for their ability to generalize well on a wide variety of problems, which is why they are used for the vast majority of working NN applications. Two other NN types are also evaluated in this study, namely general regression and polynomial


NNs. A brief summary of these network architectures is presented in Table 1, and a brief description of each network is given in the next section.

    3. Neural network architectures

The six architectures of NNs used in this study are summarized in Table 1. The standard type of the well-known backpropagation NN (SBP) is one in which every layer is connected or linked only to the immediately previous layer.

Table 1
Types of NNs used in this study

SBP: Standard backpropagation NN (Fig. 2). A three layer network in which each layer connects only to the next layer.
3Slab: Predictive three slab NN (Fig. 3c). Backpropagation network with the hidden layer divided into three slabs.
2Slab: Predictive two slab NN (Fig. 3a). Backpropagation network with the hidden layer divided into two slabs.
2SlabB: Predictive two slab NN (Fig. 3b). Backpropagation network with the hidden layer divided into two slabs and a connection to the output layer.
GRNN: General regression NN (similar to Fig. 2). A three layer network that contains one hidden neuron for each training pattern and converges to the underlying regression surface.
GMDH: Polynomial NN (similar to Fig. 2, but with multiple hidden layers). Works by building successive layers with links that are polynomial terms.

Fig. 1. Daily variation of the EoS, building EW (EoS in hours vs. day of year; curves for 1995–1999).


Three layers, namely input, hidden and output layers (shown schematically in Fig. 2), are used for this basic standard type. The principal advantages of this network are its quick learning and fast convergence to an optimal regression surface as the number of samples becomes large. Sometimes more than one hidden layer is used; however, this may result in a dramatic increase in training time.

Three more versatile types of backpropagation NNs were also evaluated in this study in addition to the SBP. These NNs, shown schematically in Fig. 3, are based on dividing the neurons in the middle layer into two or three sets (slabs); different activation functions applied to the hidden layer slabs detect different features in a pattern processed through the network. For example, a network design may use a Gaussian activation function on one hidden slab to detect features in the mid-range of the data and use a Gaussian complement function in another hidden slab to detect features from the upper and lower extremes of the data. Combining the two feature sets in the output layer may lead to a better prediction. Thus, the output layer gets different views of the data [20].

General regression neural networks (GRNN) were also evaluated in this study. The block diagram of a GRNN is essentially similar to Fig. 2. A GRNN is a feedforward network that can be used to estimate a vector Y from a measured vector X. The input units are merely distribution

Fig. 2. Block diagram of a NN (inputs, hidden layer, outputs).

Fig. 3. Block diagrams showing the architecture of the NNs used in the analysis: (a) two hidden slabs, (b) two hidden slabs with a direct connection to the output layer and (c) three hidden slabs.


units, which provide the (scaled) measured variables X to all of the neurons in the hidden layer, which contains the pattern units. Each pattern unit (neuron) is dedicated to one exemplar (pattern) or one cluster center. When a new vector X is entered into the network, it is subtracted from the stored vector representing each cluster center. The squares of the differences are summed and fed into a non-linear activation function. The activation function used herein is logistic, of the form f(x) = 1/(1 + e^(-x)), where x is the input. This function is the most popular and has been found useful for most network applications [18]. The pattern units' output is passed on to the summation units. Details of the GRNN paradigm were provided by Specht [19].

The GRNN learns by adjusting the interconnection weights between layers. The answers the network is producing are repeatedly compared with the correct answers, and each time, the connecting weights are adjusted slightly in the direction of the correct answers. Eventually, if the problem is learned, a stable set of weights evolves adaptively, which will provide good answers for all of the sample predictions. The real test of NNs occurs when the trained network is able to produce good results for new data.
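To make the GRNN mechanics concrete, the following is a minimal, illustrative sketch rather than the implementation used in this study: it uses the classic Specht-style kernel estimate with a City Block distance and a single smoothing factor, whereas the networks reported here used a logistic pattern-unit activation and a genetic adaptive smoothing factor (Sections 3 and 4). The function name grnn_predict and the toy data are assumptions for illustration only.

```python
import numpy as np

def grnn_predict(x, train_X, train_y, smoothing=1.0):
    """Illustrative GRNN estimate of the output for one input pattern x.

    Every training pattern acts as one hidden (pattern) unit: its distance
    to x is converted into a weight, and the prediction is the
    weight-averaged training output (Specht's general regression idea).
    """
    distances = np.abs(train_X - x).sum(axis=1)      # City Block (L1) distance
    weights = np.exp(-distances / smoothing)         # nearby patterns dominate
    return float(np.dot(weights, train_y) / weights.sum())

# Toy usage: 19 hourly temperature readings in, one EoS value (hours) out.
rng = np.random.default_rng(0)
train_X = rng.uniform(20.0, 45.0, size=(466, 19))    # stand-in temperature patterns
train_y = rng.uniform(2.0, 8.0, size=466)            # stand-in EoS values
print(grnn_predict(train_X[0], train_X, train_y, smoothing=2.0))
```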

A third class of NNs also evaluated in this study is the group method of data handling (GMDH) networks, also known as polynomial nets. These NNs work by building successive layers with complex links (or connections) that are the individual terms of a polynomial. These polynomial terms are created by using linear and non-linear regression. The initial layer is simply the input layer. The first layer created is made by computing regressions of the input variables and then choosing the best ones. The second layer is created by computing regressions of the values in the first layer along with the input variables. (Note that the process is essentially building polynomials of polynomials.) Again, only the best are chosen by the algorithm. These are called survivors. This process continues until the network stops getting better (according to a pre-specified selection criterion). More technical details of the GMDH design and its limitations are given by Farlow [23].
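As an illustration of the layer-building idea (not the specific GMDH implementation used in the study), the sketch below fits a quadratic polynomial to every pair of candidate inputs, scores each candidate on an independent test set and keeps the best few as "survivors" for the next layer; the helper names and the survivor count are assumptions.

```python
import itertools
import numpy as np

def quadratic_terms(a, b):
    """Polynomial terms of one candidate GMDH neuron built from inputs a and b."""
    return np.column_stack([np.ones_like(a), a, b, a * b, a ** 2, b ** 2])

def gmdh_layer(X_train, y_train, X_test, y_test, n_survivors=4):
    """Build one GMDH layer: regress y on every pair of inputs, rank the
    candidates by test-set error and keep the best ones (the survivors).
    Repeating this on the survivors' outputs builds polynomials of polynomials."""
    candidates = []
    for i, j in itertools.combinations(range(X_train.shape[1]), 2):
        coef, *_ = np.linalg.lstsq(quadratic_terms(X_train[:, i], X_train[:, j]),
                                   y_train, rcond=None)
        preds = quadratic_terms(X_test[:, i], X_test[:, j]) @ coef
        mse = float(np.mean((preds - y_test) ** 2))
        candidates.append((mse, (i, j), coef))
    candidates.sort(key=lambda c: c[0])
    return candidates[:n_survivors]

# Toy usage with stand-in data (4 inputs, 1 output).
rng = np.random.default_rng(1)
X = rng.normal(size=(120, 4))
y = X[:, 0] * X[:, 1] + 0.5 * X[:, 2] ** 2 + rng.normal(scale=0.1, size=120)
survivors = gmdh_layer(X[:80], y[:80], X[80:], y[80:])
print([(round(mse, 3), pair) for mse, pair, _ in survivors])
```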

    4. Neural network results and discussion

One of the objectives of this study was to investigate the feasibility of using NNs to estimate the EoS, since there is no simple solution available for this problem. In order to use NNs, the building simulation code ESP-r was used to generate a database for the years 1995–1999. The data used in the NN analysis covered 233 days (Julian days 89–321) of each year. The data for 1995 and 1996 were used for training and testing the NNs. The patterns in this database were divided into two sets. The first set consisted of 373 patterns and was used for training the networks. The second set consisted of the remaining 93 patterns, selected randomly, and was used for testing the trained networks. To evaluate the usefulness of the networks, the trained networks were applied to a production data set (699 patterns for the years 1997–1999) which the networks had never seen before.

The input layer of the GRNN network N2 consisted of 19 neurons, to which temperature readings (recorded every hour) between 8 AM (of the previous day) and 2 AM (of the day being considered) were fed. The hidden layer must contain a minimum of one neuron for each data pattern; the number was set to 466. The number of neurons in the output layer is 1, which corresponds to the output (the EoS). The statistical indicator used to evaluate the closeness of fit (for


all network architectures considered herein) is the coefficient of multiple determination, R², which can be defined as [20]

R² = 1 - Σ(y - y_p)² / Σ(y - y_m)²

where y is the actual value, y_p is the predicted value of y, and y_m is the mean of the y values. The coefficient of multiple determination compares the accuracy of the model to the accuracy of a trivial benchmark model in which the prediction is simply the mean of all of the samples. A perfect fit would result in an R² value of 1, and a very good fit in a value near 1. The quality of fit decreases as R² decreases. Table 2 shows that R² for this network is 0.9856, which indicates a very good fit.

It should be noted that, since the temperature (in addition to the other weather data needed by the simulation program ESP-r) is read and recorded every hour, by 3 AM more weather information becomes available that might improve the prediction of the EoS. To investigate this notion, another NN (denoted N3) was designed and trained on the temperature records between 8 AM and 3 AM, i.e. the input layer for this network contains 20 neurons. Table 2 shows that R² for this network is 0.9877, which is higher than the R² for N2. However, it has to be borne in mind that the predictions of N3 are useful only for predicted values of the EoS later than 3 AM. Similarly, four more NNs (denoted N4–N7) were designed and trained. The input range and R² for each are shown in Table 2. The table shows a steady improvement in R² from N2 to N7, i.e. with the increase in the number of inputs. This led us to implement a neural control scheme (denoted NC), as explained by the flow chart in Fig. 4. For any given day, at 2 AM, the network N2 is applied to the hourly temperature record between 8 AM (of the previous day) and 2 AM (of the day being considered) to predict an EoS value denoted N2. If N2 < 3 AM, then EoS = N2. If N2 > 3 AM, then at 3 AM the network N3 is applied to the hourly temperature record between 8 AM (of the previous day) and 3 AM (of the day being considered) to predict another EoS value denoted N3. Then EoS = N3 if 3 AM < N3 < 4 AM; if N3 > 4 AM, the procedure is repeated at 4 AM with network N4, and so on through N7 (see Fig. 4).
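The decision logic of the NC, as read from Fig. 4, can be sketched as follows. This is an illustrative transcription rather than the controller code used in the study; predict_eos stands for applying network Ni to the temperature record ending at hour i, and all names are hypothetical.

```python
def neural_control_eos(predict_eos, first_hour=2, last_hour=7):
    """Sketch of the NC scheme in Fig. 4.

    predict_eos(i) returns the EoS (hours after midnight) predicted by
    network N(i) from the temperature record ending at hour i (2 AM..7 AM).
    """
    n2 = predict_eos(first_hour)            # N2 applied at 2 AM
    if n2 <= first_hour + 1:
        return n2                           # setback ends before 3 AM
    for i in range(first_hour + 1, last_hour + 1):
        ni = predict_eos(i)                 # fresher reading, network N(i)
        if ni < i:                          # predicted EoS has already passed
            return i                        #   -> end the setback now
        if ni <= i + 1:                     # prediction falls within the next hour
            return ni                       #   -> accept it
    return predict_eos(last_hour)           # otherwise fall back on the last network

# Toy usage with a fixed table of hourly predictions (hours after midnight).
table = {2: 5.4, 3: 5.6, 4: 5.3, 5: 5.2, 6: 5.2, 7: 5.2}
print(neural_control_eos(lambda hour: table[hour]))   # accepted at 5 AM -> 5.2
```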


The NC procedure was applied to all six network architectures listed in Table 1, i.e. for each architecture, networks N2–N7 were designed, trained on the 1995–1996 data, incorporated in the NC scheme and then applied to the production data for 1997–1999. It is not possible to include all the results in this paper because of space limitations. Therefore, sample results and a summary of the most significant findings are presented in the following.

For building EW, Table 3 shows the R² for each year using both N2 and the NC for all six NN types. From the table, the NC gives better predictions than the network N2 in all cases. The table also includes a definition of the parameters of each of the NN types. For instance, the results presented in the table for the GRNN were obtained using a genetic adaptive algorithm, the so-called City Block distance metric and a linear scaling factor for the input data. This network design was selected as a result of a parametric study intended to determine quantitatively the GRNN design that best predicts the production data (i.e. for the years 1997–1999). The variables investigated included (a) different scaling functions (linear on [-1, 1], linear on [0, 1], logistic and hyperbolic tangent, tanh), (b) two possible ways to measure the distance between patterns, namely the City Block distance metric [19] and the Euclidean distance metric [22], and (c) the genetic adaptive algorithm as opposed to an iterative approach.

Table 3
R² using the NC and the network N2 for the production set (1997–1999) and the building EW for different network designs

GRNN; linear scaling; genetic adaptive; City Block distance metric
  1997: N2 0.828, NC 0.863   1998: N2 0.875, NC 0.882   1999: N2 0.899, NC 0.899

3Slab; learning rate and momentum are 0.03, 0.7 on one link and 0.1, 0.1 on all others; vanilla; rotation; linear scaling function on input; activation functions are Gauss, Gauss-complement and logistic
  1997: N2 0.740, NC 0.845   1998: N2 0.856, NC 0.907   1999: N2 0.842, NC 0.904

2Slab; learning rate and momentum are 0.1, 0.1 on all links; vanilla; rotation; linear scaling function on input; activation functions are Gauss, Gauss-complement and logistic
  1997: N2 0.794, NC 0.859   1998: N2 0.829, NC 0.856   1999: N2 0.871, NC 0.894

2Slab-B; learning rate and momentum are 0.1, 0.1 on all links; vanilla; rotation; linear scaling function on input; activation functions are Gauss, Gauss-complement and logistic
  1997: N2 0.797, NC 0.858   1998: N2 0.828, NC 0.842   1999: N2 0.871, NC 0.882

GMDH; linear scaling function on input
  1997: N2 0.789, NC 0.851   1998: N2 0.836, NC 0.853   1999: N2 0.880, NC 0.901

SBP; learning rate and momentum are 0.1, 0.1 on all links; vanilla; rotation; linear scaling function on input; activation functions are Gauss-complement and logistic
  1997: N2 0.817, NC 0.847   1998: N2 0.865, NC 0.871   1999: N2 0.882, NC 0.892


The genetic approach is much slower but is expected to generalize much better than the iterative procedure.

Similar parametric investigations were conducted on the other network types listed in Table 3 to determine the effect of each network variable on the ability of the network to predict the EoS accurately. Table 4 shows partial results of such a parametric study for the 3Slab networks. It includes the learning rate and momentum on each link of the network, the method of weight updates (Vanilla/Momentum) and the method of pattern selection (random/rotation). These and other network parameters are described briefly in Appendix A for completeness. The results presented in the table confirm that the NC gives better predictions than the network N2 in all cases. Also, it is evident from the table that the best 3Slab design among those considered is design A. The types of scaling function and activation functions for each slab in the network (as listed in the table caption) were also selected as a result of a similar parametric study. It will be recalled that each time a value for the NC is determined, six NNs (N2–N7) have to be designed and trained. Therefore, a significant amount of time and effort was spent in performing these parametric studies. Based on these studies, the best design of each of the NNs for the present application is presented in Table 3. These best designs were used in the remainder of this paper for building EW (Figs. 5–11) and building SN (Figs. 12–15).

Fig. 4. Flow diagram of the NC: at 2 AM, apply N2; if N2 ≤ 3, EoS = N2; otherwise, for i = 3, 4, ..., apply Ni at hour i: if Ni < i, EoS = i; if i ≤ Ni ≤ i + 1, EoS = Ni; otherwise increment i and repeat.


For building EW, Figs. 6–8 show a comparison of the NC predictions (3Slab NNs) and the actual values (predicted by ESP-r). The figures show close agreement between the NC predictions and the actual values. Plots of the differences (errors) at the top of each of Figs. 6–8 show a random distribution of these errors. Further analysis of the correlation between the actual (ESP-r) values and the NN predictions was conducted. Fig. 9 shows another plot of the data of Fig. 7. Ideally, the data should fall on a line of slope 1. A least squares fit through the data showed that the line of best fit has a slope b of 0.971, and the 95% confidence interval on b lies between 0.936 and 1.007. This means that there is no reason to reject the null hypothesis that b = 1, i.e. a 1:1 correlation between ESP-r and the NC.

Fig. 5. Comparison of actual (from ESP-r) and neural (NC) EoS, building EW, training and testing data: (a) for 1995 and (b) for 1996 (EoS in hours vs. day of year).


Table 4
Effect of some 3Slab network parameters on R² using the NC and the network N2 for the production set (1997–1999) and the building EW (linear scaling function; activation functions for the hidden layer: Gaussian, Gaussian complement and logistic; for the third layer: logistic)

Design A: vanilla; rotation; learning rate and momentum are 0.03 and 0.7, respectively, on the link to the logistic hidden slab, and 0.1 and 0.1 on all other links
  1997: N2 0.740, NC 0.845   1998: N2 0.856, NC 0.907   1999: N2 0.842, NC 0.904

Design B: momentum; random; learning rate and momentum are 0.03 and 0.7, respectively, on the link to the logistic hidden slab, and 0.1 and 0.1 on all other links
  1997: N2 0.788, NC 0.830   1998: N2 0.842, NC 0.850   1999: N2 0.876, NC 0.893

Design C: momentum; random; learning rate and momentum are 0.1 and 0.1, respectively, on all links
  1997: N2 0.782, NC 0.848   1998: N2 0.848, NC 0.868   1999: N2 0.871, NC 0.899

Design D: vanilla; random; learning rate and momentum are 0.03 and 0.7, respectively, on the link to the logistic hidden slab, and 0.1 and 0.1 on all other links
  1997: N2 0.751, NC 0.828   1998: N2 0.848, NC 0.855   1999: N2 0.852, NC 0.881

Fig. 6. Comparison of actual (from ESP-r) and neural (NC, 3Slab networks) EoS, 1997, building EW. The differences (errors) are randomly distributed, as shown at the top of the figure.


This statistical analysis was performed for all six networks (listed in Table 3), and the results are plotted in Fig. 10 for the years 1997–1999. The figure shows that, for all the networks considered and for the three year period, one may accept the hypothesis of a 1:1 correlation between ESP-r and each of the NNs considered. Fig. 11 presents sample histogram plots of the errors (differences between the NN predictions and the line of best fit), which show a close to normal distribution with a mean close to 0.
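The slope test described above can be reproduced, in outline, with an ordinary least squares fit and a t-based confidence interval on the slope. The sketch below is illustrative rather than the exact analysis performed in the study; the function name and sample data are ours, and SciPy is assumed to be available.

```python
import numpy as np
from scipy import stats

def slope_confidence_interval(x, y, level=0.95):
    """Least squares slope b of y on x with a two-sided confidence interval,
    used to check the hypothesis of a 1:1 relation (b = 1)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    b, a = np.polyfit(x, y, 1)                            # y ~ a + b*x
    residuals = y - (a + b * x)
    s2 = np.sum(residuals ** 2) / (n - 2)                 # residual variance
    se_b = np.sqrt(s2 / np.sum((x - x.mean()) ** 2))      # standard error of the slope
    t = stats.t.ppf(0.5 + level / 2.0, df=n - 2)
    return b, (b - t * se_b, b + t * se_b)

# Example: ESP-r EoS (x) against NC EoS (y); b = 1 lying inside the interval
# supports a 1:1 correlation.
x = np.array([3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([3.1, 3.9, 5.2, 5.8, 7.1, 7.9])
print(slope_confidence_interval(x, y))
```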

Fig. 7. Comparison of actual (from ESP-r) and neural (NC, 3Slab networks) EoS, 1998, building EW. The differences (errors) are randomly distributed, as shown at the top of the figure.

Fig. 8. Comparison of actual (from ESP-r) and neural (NC, 3Slab networks) EoS, 1999, building EW. The differences (errors) are randomly distributed, as shown at the top of the figure.


Statistical analysis of variance (ANOVA) showed that, at the 95% confidence level, there is no reason to reject the null hypothesis that the predictions of the EoS by the six network architectures listed in Table 3 are essentially similar. In light of this, there is no statistical basis to prefer any network type over the others, except as may be based on the R² values. From Table 3, it may be inferred that the 3Slab architecture is marginally better than the others.
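For completeness, a one-way ANOVA of this kind can be run as in the sketch below; the data here are synthetic stand-ins, and the call shown is a generic SciPy routine rather than the statistical package used by the authors.

```python
import numpy as np
from scipy import stats

# One-way ANOVA across the six architectures: a large p-value gives no
# reason to reject the null hypothesis that their EoS predictions are
# essentially similar.
rng = np.random.default_rng(2)
actual_eos = rng.uniform(3.0, 7.0, size=233)                 # stand-in "true" EoS values
predictions = [actual_eos + rng.normal(0.0, 0.3, size=233)   # one prediction set per architecture
               for _ in range(6)]
f_stat, p_value = stats.f_oneway(*predictions)
print(f"F = {f_stat:.3f}, p = {p_value:.3f}")   # p > 0.05 -> no significant difference
```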

Fig. 9. Comparison of EoS as predicted from ESP-r and the 3Slab (NC) networks for building EW, 1998.

Fig. 10. The 95% confidence intervals and point estimates of the slope b of the line of best fit, building EW (networks GRNN, 3Slab, 2Slab, 2SlabB, GMDH and SBP; years 1997–1999).


For the building SN, the same procedure used to generate Figs. 6–11 was used to predict the EoS. This involved (a) using ESP-r to predict the EoS for the years 1995–1999, (b) training six new networks, N2–N7, on the 1995 and 1996 temperature data using the six network designs described in Table 3 and (c) applying these trained networks to the temperature records of the production years 1997–1999. Figs. 12–14 show a sample comparison of the NC predictions and the actual values, and Table 5 shows the R² for each year using both N2 and the NC. The table shows that the NC gives better predictions than the network N2. The figures show close agreement between the NC predictions and the actual values. Plots of the errors (at the top of each of Figs. 12–14) show the random distribution of these errors. The NN predictions for this building are, as expected, better than the predictions for the EW building.

Fig. 11. Histogram distribution of errors (hours) between some network predictions (GRNN, 3Slab, SBP) and the lines of best fit, building EW, 1998.

Fig. 12. Comparison of actual (from ESP-r) and neural (NC, 3Slab networks) EoS, 1997, building SN. The differences (errors) are randomly distributed, as shown at the top of the figure.


Fig. 15 presents the 95% confidence intervals and point estimates of the slopes of the lines of best fit, which again show that, for all the networks considered and for the three year period, one may accept the hypothesis of a 1:1 correlation between ESP-r and each of the NNs considered.

Fig. 13. Comparison of actual (from ESP-r) and neural (NC, 3Slab networks) EoS, 1998, building SN. The differences (errors) are randomly distributed, as shown at the top of the figure.

Fig. 14. Comparison of actual (from ESP-r) and neural (NC, 3Slab networks) EoS, 1999, building SN. The differences (errors) are randomly distributed, as shown at the top of the figure.


Again, a statistical ANOVA showed that, at the 95% confidence level, there is no reason to reject the null hypothesis that the predictions of the EoS by the six network architectures listed in Table 5 are essentially similar. In light of this, there is no basis to prefer any network type over the others.

Fig. 15. The 95% confidence intervals and point estimates of the slope b of the line of best fit, building SN (networks GRNN, 3Slab, 2Slab, 2SlabB, GMDH and SBP; years 1997–1999).

Table 5
R² using the NC and the network N2 for the production set (1997–1999) and the building SN for different network architectures

GRNN     1997: N2 0.921, NC 0.942   1998: N2 0.919, NC 0.928   1999: N2 0.936, NC 0.934
3Slab    1997: N2 0.916, NC 0.943   1998: N2 0.923, NC 0.932   1999: N2 0.936, NC 0.937
2Slab    1997: N2 0.922, NC 0.928   1998: N2 0.922, NC 0.935   1999: N2 0.930, NC 0.937
2Slab-B  1997: N2 0.921, NC 0.942   1998: N2 0.922, NC 0.938   1999: N2 0.925, NC 0.940
GMDH     1997: N2 0.908, NC 0.934   1998: N2 0.902, NC 0.919   1999: N2 0.916, NC 0.926
SBP      1997: N2 0.918, NC 0.943   1998: N2 0.920, NC 0.933   1999: N2 0.928, NC 0.938

5. Conclusion

Six types of NNs were designed and trained to investigate the feasibility of using this technology for HVAC setback control. A state of the art whole building simulation software, the ESP-r system, was used to prepare a database of past history to be used with the NNs to predict the EoS for new situations before the weather conditions are known. Parametric studies were conducted for each NN type to select the network parameters that best predict the EoS. The NN predictions were improved by developing a NC. This scheme is based on using the temperature readings as they become available. Six NNs were designed and trained for this purpose (for each network type considered). The performance of the NN analysis was evaluated using a statistical indicator (the coefficient of multiple determination, R²) and by statistical analysis of the error patterns.


To evaluate the usefulness of the NNs, the trained networks were applied to a production data set (699 patterns for the years 1997–1999) that the networks had never seen before. The results of applying the technique to the data for these three years show good predictions when the NN is properly designed (R² values are given in Tables 3 and 5). The results also confirm that the NC gives better predictions than the network N2 in all cases.

The success of the NNs in accurately predicting the EoS is significant for two reasons. The first is that the NNs can predict the EoS before the weather conditions are known. The second is that building simulation programs require many more weather inputs than the NNs do; ESP-r, for instance, requires as inputs the diffuse solar radiation on the horizontal, external dry bulb temperature, direct normal solar intensity, prevailing wind speed, wind direction and relative humidity. The NN predictions, on the other hand, are based on the external temperature only. Temperature is perhaps the easiest and most reliable weather variable to measure. Based on this, one expects that controllers based on the NC approach will be simple to build and reliable to use.

    Appendix A. Brief discussion of some NN parameters and design aspects

As neurons pass values from one layer of the network to the next layer in backpropagation networks, the values are modified by a weight value in the link that represents the connection strength between the neurons. The weights begin as random numbers that fall within a range specified at the start of training. As each pattern passes through the network, a weight is either raised to reinforce a connection positively or lowered to inhibit the connection.

A link is the connection or set of weights between the slabs or groups of neurons in a network. Each link can have an individual learning rate and momentum.

Each time a pattern is presented to the network, the weights leading to an output node are modified slightly during learning in the direction required to produce a smaller error the next time the same pattern is presented. The amount of weight modification is the learning rate times the error.

Large learning rates often lead to oscillation of weight changes and learning never completes, or the model converges to a solution that is not optimum. One way to allow faster learning without oscillation is to make the weight change a function of the previous weight change, to provide a smoothing effect. The momentum factor determines the proportion of the last weight change that is added into the new weight change.

Two techniques for weight updates were evaluated, namely Vanilla and Momentum. The Vanilla algorithm applies the learning rate without a momentum term. In the Momentum algorithm, the weight updates not only include the change dictated by the learning rate, but also include a portion of the last weight change as well. Like momentum in physics, a high momentum term will keep the network generally going in the direction it has been going. In other words, weight fluctuations will tend to be dampened by a high momentum term. The Momentum algorithm is expected to be useful for extremely noisy data, or when a high learning rate is necessary.
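The two update rules can be written compactly as below; this is a generic sketch of the textbook rules, with hypothetical names, not code from the package used in the study.

```python
def update_weight(weight, gradient, prev_delta, learning_rate, momentum=0.0):
    """One backpropagation weight update.

    With momentum = 0 this is the 'Vanilla' rule (learning rate times the error
    gradient only); a non-zero momentum adds a fraction of the previous weight
    change, which damps oscillations.
    """
    delta = -learning_rate * gradient + momentum * prev_delta
    return weight + delta, delta            # new weight and the change just applied

# A 'Vanilla' step followed by a 'Momentum' step that reuses the previous change.
w, d = update_weight(0.5, gradient=0.20, prev_delta=0.0, learning_rate=0.1)
w, d = update_weight(w, gradient=0.15, prev_delta=d, learning_rate=0.1, momentum=0.7)
print(w, d)
```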

Two methods for selecting patterns from the training data set are Rotation and Random. The Rotation scheme selects the patterns in their order of appearance in the training set and is expected to be useful when similar training patterns are dispersed evenly in the data set. The Random scheme selects the patterns randomly and is useful for training sets that contain cyclical variations or when it is desirable to obtain answers that are independent of clustered information.


One of the pitfalls of using NNs is overtraining, i.e. the network memorizes the input patterns and does not generalize well to other data. In this study, overtraining of the GRNN was prevented by using the so-called Net-Perfect algorithm [20]. This algorithm optimizes the network by applying the current network to an independent test set during training. The algorithm finds the optimum network for the data in the test set (which means that the network is able to generalize well and give good results on new data). The algorithm optimizes the smoothing factor based upon the values in the test set. It does this by trying different smoothing factors and choosing the one that minimizes the mean squared error between the actual and predicted answers. Overtraining of the backpropagation networks and the GMDH networks used in the present study was prevented using similar algorithms.

When variables are loaded into a NN, they must be scaled from their numeric range into the numeric range that the NN deals with efficiently. There are two main numeric ranges the networks commonly operate in: zero to one, denoted [0, 1], and minus one to one, denoted [-1, 1]. One choice is the use of linear scaling functions for this purpose. Possible alternatives to these linear scaling functions include two non-linear scaling functions: logistic and tanh. The logistic function scales data to (0, 1) according to the formula f(x) = 1/(1 + exp(-(x - x_m)/s)), where x_m is the average of all of the values of that variable in the pattern file and s is the standard deviation of those values. The hyperbolic tangent function (tanh) scales data to (-1, 1) according to f(x) = tanh((x - x_m)/s). As detailed in the main body of the paper, parametric studies were conducted to select the best scaling function for the present application.
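In sketch form, the three scaling options read as follows (the function names and the stand-in data are ours):

```python
import numpy as np

def scale_linear(x, lo=0.0, hi=1.0):
    """Linear scaling of a variable onto [lo, hi]."""
    x = np.asarray(x, dtype=float)
    return lo + (hi - lo) * (x - x.min()) / (x.max() - x.min())

def scale_logistic(x):
    """Logistic scaling onto (0, 1): 1 / (1 + exp(-(x - x_m)/s))."""
    x = np.asarray(x, dtype=float)
    return 1.0 / (1.0 + np.exp(-(x - x.mean()) / x.std()))

def scale_tanh(x):
    """Hyperbolic tangent scaling onto (-1, 1): tanh((x - x_m)/s)."""
    x = np.asarray(x, dtype=float)
    return np.tanh((x - x.mean()) / x.std())

temps = np.array([28.0, 31.5, 35.0, 39.0, 42.5])   # stand-in hourly readings, deg C
print(scale_linear(temps, -1.0, 1.0))
print(scale_logistic(temps))
print(scale_tanh(temps))
```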

GRNNs work by measuring how far a given sample pattern is from the patterns in the training set in N-dimensional space, where N is the number of inputs in the problem. In this study, the method of measuring the distance between patterns was the so-called City Block distance metric, which is the sum of the absolute values of the differences in all dimensions between the pattern and the weight vector for that neuron [19].

The GRNN used in this study was genetic adaptive, i.e. it uses a genetic algorithm to find an input smoothing factor adjustment. This is used to adapt the overall smoothing factor to provide a new value for each input. Genetic algorithms use a fitness measure to determine which of the individuals in the population survive and reproduce [21]. The fitness measure for the GRNN is the mean squared error of the outputs over the entire data set, and the genetic adaptive algorithm seeks to minimize it.

    References

[1] Yeh S, Wong K. HVAC pipe/duct sizing using artificial neural networks. Int J Modell Simulat 1999;19(3):282–6.
[2] Teeter J, Chow M. Application of functional link neural network to HVAC thermal dynamic system identification. IEEE Trans Ind Electron 1998;45(1):170–6.
[3] Chen Y, Chen Z. Neural-network-based experimental technique for determining z-transfer function coefficients of a building envelope. Building Environ 2000;35(3):181–9.
[4] Egilegor B, Uribe JP, Arregi G, Pradilla E, Susperregi L. A fuzzy control adapted by a neural network to maintain a dwelling within thermal comfort. In: Proceedings of Building Simulation 97, vol. II. p. 87–94.


[5] Rock BA, Wu Ch-T. Performance of fixed, air-side economizer, and neural network demand-controlled ventilation in CAV systems. ASHRAE Trans 1998;104(2):234–45.
[6] Ahmed O, Mitchell J, Klein S. Feedforward-feedback controller using general regression neural network (GRNN) for laboratory HVAC system: Part II - temperature control - cooling. ASHRAE Trans 1998;104(2):626–34.
[7] Morel N, Bauer M, El-Khoury M, Krauss J. Neurobat, a predictive and adaptive heating control system using artificial neural networks. Int J Solar Energy 2001;21(2-3):161–202.
[8] Jeannette E, Assawamartbunlue K, Curtiss P, Kreider J. Experimental results of a predictive neural network HVAC controller. ASHRAE Trans 1998;104(2):19–27.
[9] Fargus RS, Chapman C. Commercial PI-neural controller for the control of building services plant. IEE Conference Publication no. 455, vol. 2, IEE, Stevenage, England, 1998. p. 1688–93.
[10] Kasahara M, Matsuba T, Hashimoto Y, Murasawa I, Kimbara A. Optimal preview control for HVAC system. ASHRAE Trans 1998;104(Pt 1A):502–13.
[11] Saboksayr S, Patel R, Zaheer-uddin M. Energy-efficient operation of HVAC systems using neural network based decentralized controllers. In: Proceedings of the American Control Conference, vol. 6, 1995. p. 4321–5.
[12] Alessandri A, Verona F, Parisini T, Torrini A. Neural approximation for the optimal control of heating systems. In: Proceedings of the IEEE Conference on Control Applications, vol. 3, 1994. p. 1613–8.
[13] Kawashima M, Dorgan C, Mitchell J. Optimizing system control with load prediction by neural networks for an ice-storage system. ASHRAE Trans 1996;102(1):1169–78.
[14] Massie D, Curtiss P, Kreider J. Predicting central plant HVAC equipment performance using neural networks - laboratory system test results. ASHRAE Trans 1998;104(1A):221–8.
[15] Peitsman H, Bakker V. Application of black-box models to HVAC systems for fault detection. ASHRAE Trans 1996;102(1):628–40.
[16] Li X, Vaezi-Nejad H, Visier J. Development of a fault diagnosis method for heating systems using neural networks. ASHRAE Trans 1996;102(1):607–14.
[17] Clarke JA. Energy simulation in building design. Bristol: Adam Hilger Ltd.; 1985.
[18] Hammerstrom D. Working with neural networks. IEEE Spectrum 1993:46–53.
[19] Specht DF. A general regression neural network. IEEE Trans Neural Networks 1991;2(6):568–76.
[20] Neuroshell 2 Manual. Ward Systems Group Inc., Frederick, MA, 1996.
[21] Goldberg DE. Genetic algorithms in search, optimization, and machine learning. Reading, MA: Addison-Wesley; 1989.
[22] Kreyszig E. Advanced engineering mathematics. 7th ed. New York: Wiley; 1993.
[23] Farlow SJ, editor. Self-organizing methods in modeling: GMDH type algorithms. Statistics: Textbooks and Monographs, vol. 54, 1984.
