
HYDROLOGICAL PROCESSES
Hydrol. Process. 14, 2157–2172 (2000)

    Copyright 2000 John Wiley & Sons, Ltd.

Received 2 March 1998; Accepted 10 July 1999

Comparing neural network and autoregressive moving average techniques for the provision of continuous river flow forecasts in two contrasting catchments

Robert J. Abrahart (1) and Linda See (2)

(1) School of Earth and Environmental Sciences, University of Greenwich, Medway Campus, Central Avenue, Chatham Maritime, Kent ME4 4TB, UK
(2) Centre for Computational Geography, University of Leeds, Leeds LS2 9JT, UK

Abstract: The forecasting power of neural network (NN) and autoregressive moving average (ARMA) models are compared. Modelling experiments were based on a 3-year period of continuous river flow data for two contrasting catchments: the Upper River Wye and the River Ouse. Model performance was assessed using global and storm-specific quantitative evaluation procedures. The NN and ARMA solutions provided similar results, although naive predictions yielded poorer estimates. The annual data were then grouped into a set of distinct hydrological event types using a self-organizing map and two rising event clusters were modelled using the NN technique. These alternative investigations provided encouraging results. Copyright 2000 John Wiley & Sons, Ltd.

KEY WORDS: neural network model; ARMA model; hydrological forecasting

    INTRODUCTION

Operational hydrological forecasting and water resource management require efficient tools to provide accurate estimates of future river level conditions and meet real world demand. Physical models, based on continuum mechanics, offer one possible forecasting method. Such tools, however, will often be too complex, or too demanding in terms of data and resources, for widespread practical application (Jakeman, 1993). Thus simpler approaches offered through 'conceptual modelling' or 'black-box' solutions are fast becoming attractive alternatives. One area of recent interest in this emerging field of hydrological research is the application of neural network (NN) modelling (e.g. Daniell, 1991; French et al., 1992; Kang et al., 1993; Karunanithi et al., 1994; Lorrai and Sechi, 1995; Smith and Eli, 1995; Cheng and Noguchi, 1996; Minns and Hall, 1996, 1997; Yang, 1997; Dawson and Wilby, 1998; Minns, 1998; Abrahart and See, 1999; Campolo et al., 1999). Neural network forecasting and prediction offers various benefits (Abrahart, 1999) and models can be developed to meet the four guiding principles of hydrological modelling: parsimony, modesty, accuracy and testability (Hillel, 1986). Neural solutions also can be developed with basin transfer capabilities (Minns, 1996), be derived from small data sets (Abrahart and White, in press), and could accommodate long-term changes or relationships that are not constant and evolve over time. These new technologies, however, require evaluation against conventional models and statistical tools, in order to determine their relative performance and to establish what are appropriate circumstances for their use. Hsu et al. (1995), for example, compared NN models against a statistical ARMAX (autoregressive moving average with exogenous input) model and the lumped conceptual SAC-SMA (Sacramento soil moisture accounting) model using daily data for the Leaf River basin. The NN models generated better performance statistics. This paper, in contrast, evaluates the numerical performance of feedforward NN models trained with the


backpropagation algorithm against linear statistical autoregressive moving average (ARMA) models for the purpose of operational forecasting. It therefore embraces the ideas of Maier and Dandy (1996, 1997) and Fortin et al. (1997), who suggested that a comparison of these two approaches might prove useful. Naive predictions, which use the current value as the prediction, were also included in this comparison as a bottom-line benchmark.

Modelling operations were implemented on two contrasting catchments: the Upper Wye (Central Wales) and the River Ouse (Yorkshire). This work was limited to an investigation of models developed on a restricted set of input variables, i.e. river flow data. Other variables, such as meteorological inputs or catchment characteristics, for instance the degree of urbanization, were excluded in order that the whole modelling procedure could be reproduced at a later date on alternative catchments where additional data might not be available. Models were developed on annual data for 1985, with 1984 and 1986 data being used for model validation purposes. The main method of assessment was based on an evaluation of several different global 'goodness-of-fit' statistics. Two alternative performance measures, more relevant to high flow situations and flood forecasting, were also investigated. The final part of this paper illustrates the potential for multi-model solutions based on the application of a self-organizing map (SOM) to the same data. This tool was used to perform a classification of the data series, which produced a set of distinct hydrological event types, two of which were then modelled on an individual basis, to determine if more accurate NN solutions could be achieved.

    NEURAL NETWORK BASICS

NN solutions offer an important alternative to traditional hydrological methods, for both data analysis and deterministic modelling. In conventional computing a model is expressed as a series of equations, which are then translated into 3GL code, such as C or Pascal, and run on a computer. NN solutions, in contrast, are trained to represent the implicit relationships and processes that are inherent within each data set. Individual networks are also able to represent different levels of generalization and can work with different types, and combinations, of input and output data, e.g. nominal data, fractal dimension and field measurements at different scales and resolutions. There are several different types of NN. The network that is of greatest interest at the moment is the feedforward multilayered perceptron (Figure 1). The basic structure is not complicated. It consists of a number of simple processing units (also known as processing elements, neurons or nodes), which are arranged in a number of different layers, and joined together to form a network. Data enters the network through the input units (left) and is then fed forward through successive layers to emerge from the output units (right). This is called a feedforward network because the flow of information is in one direction going from input units to output units. The outer layer, where information is presented to the

Figure 1. Basic configuration of a feedforward multilayered perceptron (Dawson and Wilby, 1998)


network, is called the input layer. The layer on the far side, where processed information is retrieved, is called the output layer. The layers between the two outer ones are called hidden layers (being hidden from direct contact with the external world). To avoid confusion the recommended method for describing a network is based on the number of hidden layers. Figure 1, for example, is a one hidden-layer network. There are weights on each of the interconnections and it is these weights that are altered during the training process, to ensure that the inputs produce an output that is close to the desired value, with an appropriate 'training rule' being used to adjust the weights in accordance with the data that are presented to the network. The most popular mechanism for training a network is 'backpropagation of error' (Rumelhart et al., 1986; Tveter, 1996). Backpropagation for a multilayered network works as follows. The weighted input to each processing unit is

$$I_i = \sum_{j=1}^{n} w_{ij} x_j$$

The output from each processing element is a sigmoid function, the most common being

$$f(I) = \frac{1}{1 + e^{-I}}$$

    with the derivative

$$f'(I) = \frac{\mathrm{d}f(I)}{\mathrm{d}I} = f(I)\,(1 - f(I))$$

    Weight updates are based on a variation of the generalized delta rule

$$\Delta w_{ij} = \beta E f(I) + \alpha \Delta w_{ij}^{\mathrm{previous}}$$

where $\beta$ is the learning rate, $E$ is the error, $f(I)$ is the output from a processing unit in the previous layer (incoming transmission), $\alpha$ is the momentum factor, and where $0.0 < \beta < 1.0$ and $0.0 < \alpha < 1.0$. Error for the output layer is desired output minus actual output

$$E_j^{\mathrm{output}} = y_j^{\mathrm{desired}} - y_j^{\mathrm{actual}}$$

whereas error for a hidden processing unit is derived from error that has been passed back from each processing unit in the next forward layer. This error is weighted using the same connection weights that modified the forward output activation value, and the total error for a hidden unit is thus the weighted sum of the error contributions from each individual unit in the next forward layer. To ensure a stable mathematical solution, the total error for each unit is then multiplied using the derivative of the output activation, for that unit, in the forward pass

$$E_i^{\mathrm{hidden}} = \frac{\mathrm{d}f(I_i^{\mathrm{hidden}})}{\mathrm{d}I} \sum_{j=1}^{n} \left( w_{ij} E_j^{\mathrm{output}} \right)$$

which is an operation that is propagated backwards across the network. Following training, input data are then passed through the trained network in its non-training mode, where the presented data are transformed within the hidden layers to provide the modelling output values.
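The update rules above can be sketched for a one-hidden-layer network in a few lines of Python. This is a minimal illustration with synthetic data and illustrative parameter values, not the implementation used in the study:

```python
import numpy as np

def sigmoid(I):
    return 1.0 / (1.0 + np.exp(-I))

rng = np.random.default_rng(0)
X = rng.random((50, 14))            # 14 inputs per pattern, as in the flow models
y = rng.random((50, 1))             # one output (normalized flow)

W1 = rng.normal(0.0, 0.1, (14, 6))  # input -> hidden weights (a 14:6:1 layout)
W2 = rng.normal(0.0, 0.1, (6, 1))   # hidden -> output weights
dW1 = np.zeros_like(W1)
dW2 = np.zeros_like(W2)
beta, alpha = 0.05, 0.5             # learning rate and momentum, both in (0, 1)

mse0 = float(np.mean((y - sigmoid(sigmoid(X @ W1) @ W2)) ** 2))
for epoch in range(800):
    h = sigmoid(X @ W1)                       # forward pass, hidden layer
    out = sigmoid(h @ W2)                     # forward pass, output layer
    e_out = (y - out) * out * (1.0 - out)     # output error times f'(I)
    e_hid = (e_out @ W2.T) * h * (1.0 - h)    # error passed back through the weights
    dW2 = beta * (h.T @ e_out) + alpha * dW2  # generalized delta rule with momentum
    dW1 = beta * (X.T @ e_hid) + alpha * dW1
    W2 += dW2
    W1 += dW1
mse1 = float(np.mean((y - sigmoid(sigmoid(X @ W1) @ W2)) ** 2))
```

Note that the error terms already fold in the sigmoid derivative $f(I)(1 - f(I))$, matching the equations above; training error falls as the weights adapt.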

    STUDY AREAS AND DATABASES

Two areas were chosen for this exercise: the Upper River Wye in Central Wales and the River Ouse in Yorkshire (Figure 2). The Upper Wye is an upland research catchment that has been used on several previous occasions for various hydrological modelling purposes (e.g. Bathurst, 1986; Quinn and Beven,


Figure 2. Location map: (A) River Ouse catchment, (B) Upper River Wye catchment

1993). The basin covers an area of some 10.55 km² and has a quick response. Hydrological data were available from the gauging station at Cefn Brwyn. The Ouse has a much larger catchment, which covers an area of 3286 km², and contains a mixed pattern of urban and rural land uses. Gauging stations are distributed throughout the catchment along each of its three main tributaries: the Nidd, Swale and Ure. Two gauging stations were chosen for this exercise: (i) Skelton, located just north of York on the River Ouse; and (ii) Kilgram, located further upstream on the River Ure. Skelton, with a downstream location, has a less flashy regime than either Kilgram or the Upper Wye and has flood types that are easier to predict.

Data were available on a 1-h time-step for the period 1984–86. The modelling data comprised: two seasonal factors (sinCLOCK and cosCLOCK); 6 h of previous flow data (FLOW(t−6) to FLOW(t−1)); 6 h of differenced flow data (DIFF(t−6) to DIFF(t−1)); and either FLOW, or DIFF, at time t (the value to be predicted). There were in total 14 input variables and one output variable. The CLOCK input variables, based on annual hour count, were intended to allow for variation in system output according to seasonal or annual influences, because an agricultural catchment might be expected to produce different responses in summer (drier) and winter (wetter). The selection of a 6 h historical record relates to other work where input saliency analysis has shown that this period contains the most influential variables for continuous NN river-level forecasting purposes (Abrahart et al., 1999). The variables were subjected to linear normalization between 0.1 (lowest value per variable per station) and 0.9 (highest value per variable per station). Three annual data sets were then created from each of the two main pattern files for each river gauging station. The results are reported in normalized flow units (nfu).
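The pattern-building and normalization steps described above can be sketched as follows; the hourly series here is synthetic, and only the variable layout mirrors the text:

```python
import numpy as np

flow = 2.0 + np.sin(np.linspace(0.0, 20.0, 8760))   # one year of hourly flow (synthetic)
hours = np.arange(8760)
sin_clock = np.sin(2.0 * np.pi * hours / 8760)      # seasonal factors derived from
cos_clock = np.cos(2.0 * np.pi * hours / 8760)      # the annual hour count
diff = np.diff(flow, prepend=flow[0])               # adjacent-point differences

def lags(series, k=6):
    # columns for t-6 ... t-1, aligned with a target at time t
    return np.column_stack([series[k - i:len(series) - i] for i in range(k, 0, -1)])

X = np.column_stack([sin_clock[6:], cos_clock[6:], lags(flow), lags(diff)])
y = flow[6:]                                        # FLOW at time t, the value to predict

def normalize(col):
    # linear scaling: lowest value per variable -> 0.1, highest -> 0.9
    lo, hi = col.min(), col.max()
    return 0.1 + 0.8 * (col - lo) / (hi - lo)

Xn = np.apply_along_axis(normalize, 0, X)
```

Each pattern therefore carries the 14 input variables named in the text (two clock terms, six lagged flows, six lagged differences) plus one target.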

    EVALUATION MEASURES

One major problem in assessing NN solutions is the use of global statistics. When these mechanisms are used to model one-step-ahead predictions, the solution will in most cases produce a high or near-perfect 'goodness-of-fit' statistic. Such measures give no real indication of what the network is getting right or wrong or where improvements could be made, i.e. for particular periods of time when the predictions were poor. NN solutions are designed to minimize global measures, and a more appropriate metric that identifies real problems, or between-network differences, is perhaps now long overdue. As there is no one definitive evaluation test, a multicriteria assessment was therefore carried out, with five different global evaluation procedures and two different storm-specific evaluation measures being analysed and reported in this paper.


    Global evaluation measures

1. Mean absolute error (MAE)

$$\mathrm{MAE} = \frac{\sum_{i=1}^{N} \left| \mathrm{obs}_i - \mathrm{exp}_i \right|}{N}$$

2. Root mean squared error (RMSE)

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} \left( \mathrm{obs}_i - \mathrm{exp}_i \right)^2}{N}}$$

3. Mean higher order error function (MS4E)

$$\mathrm{MS4E} = \frac{\sum_{i=1}^{N} \left( \mathrm{obs}_i - \mathrm{exp}_i \right)^4}{N}$$

This statistic places emphasis on peak flow prediction (Blackie and Eeles, 1985).

4. Coefficient of efficiency (Nash and Sutcliffe, 1970; Diskin and Simon, 1977)

$$\%\mathrm{COE} = \left( 1 - \frac{\sum_{i=1}^{N} \mathrm{err}_i^2}{\sum_{i=1}^{N} \left( x_i - \bar{x} \right)^2} \right) \times 100$$

5. Percentage of predictions grouped according to degree of error:

% correct predictions
% predictions 0–5% of observed
% predictions 5–10% of observed
% predictions 10–25% of observed
% greater than ±25% of observed

Event-specific evaluation measures

Global error statistics provide relevant information on overall performance but do not provide specific information about model performance at high levels of flow, which, in a flood forecasting context, is of critical importance. Two additional storm-specific evaluation procedures were therefore implemented: (i) average difference in peak prediction over all flood events calculated using MAEpp and RMSEpp; and (ii) percentage of early, on-time, or late occurrences for the prediction of individual peaks. Both measures should, in combination with the global statistics, provide a better insight into the modelling performance of individual solutions for flood forecasting purposes.
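The five global measures can be sketched in Python as follows; the `global_stats` helper and its toy arrays are illustrative, not code from the study, and observations are assumed positive so that percentage errors are defined:

```python
import numpy as np

def global_stats(obs, pred):
    # obs and pred are 1-D arrays of equal length; obs is assumed > 0 throughout
    err = obs - pred
    stats = {
        "MAE": float(np.mean(np.abs(err))),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "MS4E": float(np.mean(err ** 4)),
        # coefficient of efficiency, expressed as a percentage
        "%COE": float(100.0 * (1.0 - np.sum(err ** 2)
                               / np.sum((obs - obs.mean()) ** 2))),
        # "correct" here means exactly right (no difference at all)
        "% correct": float(100.0 * np.mean(err == 0.0)),
    }
    pct = 100.0 * np.abs(err) / obs                 # percentage error per time step
    for lo, hi in [(0, 5), (5, 10), (10, 25)]:
        stats[f"{lo}-{hi}%"] = float(100.0 * np.mean((pct > lo) & (pct <= hi)))
    stats[">25%"] = float(100.0 * np.mean(pct > 25))
    return stats

obs = np.array([1.0, 2.0, 4.0, 8.0, 6.0])
pred = np.array([1.0, 2.1, 3.8, 7.0, 6.3])
s = global_stats(obs, pred)
```

The banding in the final loop reproduces item 5 of the list above, with each prediction assigned to one degree-of-error class.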

    ARMA MODELLING AND NAIVE PREDICTION

The ARMA modelling (Box and Jenkins, 1976) was implemented using NPREDICT (Masters, 1995), from which appropriate statistical tools were created based on the following standard formulation

$$x_t = \phi_0 + \phi_1 x_{t-1} + \phi_2 x_{t-2} + \cdots + \theta_1 a_{t-1} + \theta_2 a_{t-2} + \cdots + a_t$$

where $x_t$ is the predicted value, $\phi_0$ is a constant offset, $\phi_i$ are weights associated with each previous observation, $x_{t-i}$ are previous observations, $\theta_i$ are weights associated with each previous shock, $a_{t-i}$ are previous shocks or noise terms and $a_t$ is the current shock.

From plots of the autocorrelation function the mean of each time-series at each station was determined to be non-stationary. The original data were therefore detrended using single adjacent point differencing, which fulfilled the parametric requirements of this approach, and created an alternative set of modelling


patterns comprising six DIFF (t−1 to t−6) inputs and one DIFF (t) output. This formal obligation to work with DIFF data prohibited the use of FLOW input or output variables in these modified data sets and, likewise, from this part of the analysis. The ARMA solutions were then fitted to the differenced annual data for 1985, using an iterative approach to determine the optimal number of terms and weights, and tested with the differenced annual data for 1984 and 1986. The final models were ARMA(1, 1) for Kilgram and the Upper Wye and ARMA(1, 2) for Skelton. The (p, q) notation refers to the number of autoregressive and moving average terms that were included in each model.

Naive prediction, or persistence, substitutes the last known figure as the current prediction and represents a good bottom-line benchmark against which other one-step-ahead predictions can be measured. Relevant evaluation measures were therefore calculated for each individual data set.

    NEURAL NETWORK MODELLING

The NN modelling was implemented using SNNS (SNNS Group, 1990–98). The selection of an appropriate initial architecture was problematic, because there is no single correct procedure to determine the optimum number of units or layers, although one or two 'rules of thumb' have been put forward (Sarle, 1998) and various automated growing, pruning and network breeding algorithms exist (e.g. Fahlman and Lebiere, 1990; SNNS Group, 1990–98; Braun and Ragg, 1997). The number of input and output units is fixed according to the number of variables in each training pattern. Selecting the optimal number of hidden units is, however, a different matter, which often is considered to be problem dependent. Intuition suggests that 'more is better' but this is not always the case. The number of hidden units and layers will control the power of the model to perform more complex modelling, with an associated trade-off between training time (i.e. number of epochs, in which one epoch represents one complete presentation of the training data) and model performance (i.e. validation error). The use of large hidden layers also could be counterproductive because an excessive number of free parameters will encourage overfitting of the network solution to the training data, and so reduce the generalization capabilities of the final product. The other question that needs to be addressed concerns the number of hidden layers and the balance of hidden units between these layers. Theoretical results have shown that a one-hidden-layer feedforward network is capable of approximating any measurable function to any desired degree of accuracy; and that if errors occur, these will be due to inadequate learning or too few hidden units, or because the data contain an insufficient deterministic relationship (Hornik et al., 1989). There remains some debate about this theoretical justification, however, and there might well be some advantage in considering the use of two hidden layers, in order to provide an additional degree of representational power (Openshaw and Openshaw, 1997).

Trial and error, based on systematic studies using different architectures and different training procedures, is still the preferred choice of most users (e.g. Fischer and Gopal, 1994) and is the method of selection that was adopted in this work. SNNS was therefore used to construct 12 different one-hidden-layer and 12 different two-hidden-layer feedforward networks with a range of 6 to 72 hidden units in each set. These networks all had 14 input units and one output unit. The networks were trained to predict first FLOW and then DIFF values using the 1985 pattern sets for each individual station. The stopping condition for each run was set at 800 epochs and trained networks were saved at 100 epoch intervals. Training was undertaken with 'enhanced backpropagation' using decreasing levels of learning and momentum. The three relevant annual data sets were then passed through each saved network and a set of annual error statistics computed. The final results for the different architectures were all quite similar and no clear-cut overall trend could be observed between them. This suggests that the use of additional hidden units had little or no real impact on the end result and that most simple networks of modest size are able to provide an acceptable solution. Moreover, no substantial difference could be found between the one-hidden-layer and two-hidden-layer networks, which accords with other recent work in which the benefits of using a second hidden layer were considered marginal to the rainfall-runoff modelling problem (Minns and Hall, 1996).

    As no optimal architecture emerged\ two representative one!hidden!layer networks were chosen for more


extensive training, comprising 14:6:1 and 14:12:1 configurations. Both networks were trained to predict first FLOW and then DIFF values using the 1985 pattern sets for each individual station. The stopping condition for each run was increased to 4000 epochs and in each case there was again little difference between the final error statistics. The 14:6:1 solutions, which had the simpler architecture, were therefore selected for full numerical testing and analysis.

    RESULTS

This section summarizes the main results, using an analysis of aggregated graphs, from which important relative differences between the stations and the models can be observed. Tables that contain a comprehensive set of numerical statistics can be found on the world-wide web in Abrahart and See (1998).

    Global evaluation statistics

RMSE statistics for the training and validation data, which provide a general illustration of the overall results, are depicted in Figure 3. The MAE and MS4E measures showed similar results to RMSE except for the effect of 15 or so large underpredictions in the validation data from the NN-FLOW model for the Upper Wye, which was accentuated in the MS4E results, and minimized in the MAE. Two main trends can be observed with respect to annual data:

1. training data error statistics for the Upper Wye were a little higher than Kilgram, whereas those for Skelton were much lower;

2. validation data error statistics showed a more progressive differentiation, and were greatest for the Upper Wye, then Kilgram, and then Skelton.

This pattern is thought to be a direct reflection of the hydrological characteristics at each station, with steep rising limbs and spiked peaks, on the flashier catchments, being more difficult to learn, and producing

Figure 3. RMSE calculated on annual data for the Upper Wye, Kilgram and Skelton


Figure 4. %COE calculated on annual data for the Upper Wye, Kilgram and Skelton (limited graphical range used to obtain maximum differentiation)

an inferior result in proportion to the flashiness of the catchment when these modelling solutions were transferred to their validation data periods. In terms of relative performance between the different modelling solutions, the NN and ARMA forecasts were quite similar at each station, whereas naive prediction often produced a much higher error.

%COE statistics for each station are shown in Figure 4. This measure generated between-station differences with a relative pattern that was similar to the one produced using the other statistics. As this particular evaluation measure is assessing a positive as opposed to a negative attribute, the vertical pattern is reversed. The training and validation data exhibited similar results, most of the reported efficiencies were quite high, and the actual differences quite small. Kilgram and Skelton are, however, this time more similar and the Upper Wye more distinct. Efficiencies associated with the NN-DIFF predictions were also lower than those recorded for the other three solutions and the magnitude of these differences is observed to increase in proportion to falling efficiencies associated with the poor modelling of quicker catchment responses.

Figure 5 depicts the spread of prediction errors and highlights the fact that similar levels of error were found to occur within the training and validation data sets across all models and locations. Most predictions were within 5% of the true value, with minor percentages occurring in the high error bands. Naive prediction, in all cases, had the largest percentage of correct predictions. This reflects a large number of low flow situations in which there was no change over time. The NN and ARMA solutions produced a limited number of correct predictions, with the ARMA solutions doing somewhat better on the Upper Wye. The greatest errors at each station occurred in proportion to the flashiness of the river at each location, with the highest number of large errors being produced from naive prediction. The ARMA solutions produced fewer exceptional errors for the Upper Wye, whereas NN solutions produced fewer exceptional errors for Kilgram and Skelton.

Event-specific evaluation statistics

Kilgram had the highest number of storm events for this period (1984:1985:1986), which was 74 (21:20:33). The Upper Wye had 69 (9:22:38) and Skelton had 58 (18:14:26). Error statistics associated


Figure 5. Error in annual prediction for all three stations visualized according to percentage class groups

with peak prediction are much higher than those computed on the complete hydrographic record, because this subset of the data contained a larger proportion of extreme responses, which are more difficult to estimate. RMSE for peak prediction is depicted in Figure 6: MAEpp showed similar results to RMSEpp. The overall pattern, in both cases, is also consistent with results obtained from global testing with annual data sets: the Upper Wye had the highest levels of error, then Kilgram, and then Skelton. Two other patterns also can be observed with respect to peak prediction:

1. training data error statistics now show a more progressive differentiation between the stations, although the Upper Wye errors are much higher, in comparison with those for both Kilgram and Skelton;

2. validation data error statistics now show little or no progressive differentiation: large errors were computed for the Upper Wye, with much lower errors for Kilgram and Skelton, which are both of a similar magnitude.

These findings suggest that global measures are not good indicators of peak prediction, owing to the overwhelming presence of a large number of low flow situations, which are easier to predict. Shifting the focus to peak prediction has also highlighted significant variation in the forecasting power of the modelling solutions for the two flashier catchments, and produced clear differentiation between the station on the Upper Wye (more rapid response) and at Kilgram (less rapid response). Naive prediction obtained better relative performance statistics with respect to the other solutions and in comparison with the differences produced from testing based on annual data sets. The NN and ARMA forecasts were still quite similar, although the ARMA forecasts for Kilgram generated higher error than the other three predictors, and validation data with respect to the NN-FLOW model for the Upper Wye was once again a problematic outlier. Detailed statistical analysis, based on this extracted subset, has thus once again indicated that there is no substantial difference in forecasting potential between the two main modelling techniques, and that important aspects of observed behaviour had more to do with the problem of peak flow prediction than the estimation power of individual solutions.

Figure 7 shows the percentage of peak predictions that were early, on-time or late. Naive predictions were


Figure 6. RMSE calculated on peak prediction for the Upper Wye, Kilgram and Skelton

not included because these are always late. Skelton had the largest percentage of late predictions in the training data, followed by the Upper Wye, and then Kilgram. The training and validation data sets produced similar results for the Upper Wye and Kilgram, whereas the validation data for Skelton was the better of the two, and had a larger percentage of correct predictions. However, these improved statistical results for Skelton could be due to small number effects, because there were fewer storms in 1985 (training data) than in either 1984 or 1986 (validation data).

IMPROVING THE NEURAL NETWORK SOLUTION: SOM-BASED MULTI-NETWORK MODELLING

Thus far NN solutions have been compared with ARMA forecasters and naive predictions all trained and validated on annual data sets. The NN solutions and ARMA forecasters were observed to provide a similar level of performance, which was better than naive prediction. The ARMA model building was, however, far more tedious than NN model building, requiring a substantial amount of hands-on iterative experimentation, with graphical analysis of residual error at each stage. Moreover, subjective decisions were required at various points, for instance on items such as which terms should be included within, or excluded from, the final equation. From a model building perspective the automated NN procedure is therefore simpler and much quicker to implement. However, it is also appropriate to consider alternative methods or approaches, through which improved performance could be achieved. There are at least two possibilities. The first method involves using additional inputs in the model building process and is applicable to both NN and ARMA forecasters, e.g. river level data from upstream stations, or other relevant hydrological or meteorological information. The second method involves the implementation of a multimodel approach, which has been shown to provide improved performance with respect to alternative forecasts at Skelton (See et al., 1997). The original river flow data are first grouped or clustered into distinct hydrological event types, where an event is taken to mean a short section of the hydrograph record, and each individual cluster is


Figure 7. Error in peak prediction for all three stations visualized according to timing of event

then modelled as an independent item within a set of such models. This type of modelling, however, cannot be performed with the ARMA technique, because the nature of this statistical method is such that it requires continuous time-series data as opposed to out of sequence event-related groupings.

Statistical methods could be used to perform this classification. The neural network alternative would be a self-organizing map (SOM) (Kohonen, 1995). This network algorithm is based on unsupervised classification, where the processing units compete against each other to discover important relationships that exist within the data, with no prior knowledge. The traditional architecture contains two layers of processing units, a one-dimensional input layer and a two-dimensional competitive layer. The competitive layer, or feature map, is organized into a regular grid of processing units and each unit in the input layer is connected to each unit in the competitive layer. The feature map has connections between the competitive units and each competitive unit also has one or more additional weights, or reference vectors, which will be trained to represent the fundamental pattern associated with each class group. Training consists of random weight initialization, presenting a data pattern to the network and determining which unit has the closest match, then updating both the winning unit and those around it. This process is repeated over numerous epochs until a stopping condition is reached. The training rule is:

Δw_i = β(x_i − w_i^old)

where w_i is the weight on the ith reference vector, β is the learning rate, x_i is the transmission along the ith weighted reference vector and 0.0 < β < 1.0.

The winning unit and its neighbours will adapt their reference vector to better fit the current pattern, in proportion to the strength of the learning coefficient, whereas other units are either inhibited or experience no learning whatsoever. Lateral interaction is introduced between neighbouring units within a certain distance using excitors; beyond this area, a processing unit either inhibits the response of other processing units, e.g. Mexican Hat Function (Caudill and Butler, 1992, p. 84), or does not influence them at all, e.g. Square Block Function (Openshaw, 1994, p. 64). The weight adjustment of the neighbouring units is instrumental in preserving the topological ordering of the input space. The neighbourhood for updating is
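The competitive training procedure just described (winner selection, a square-block neighbourhood update, and progressive shrinking of both the learning rate and the neighbourhood) can be sketched in plain Python. This is an illustrative reimplementation under stated assumptions, not the SOM package actually used in the study: the decay schedules and every name below are invented for the example.

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def best_match(w, x):
    """Competition step: grid coordinates of the unit closest to pattern x."""
    rows, cols = len(w), len(w[0])
    return min(((r, c) for r in range(rows) for c in range(cols)),
               key=lambda rc: dist2(w[rc[0]][rc[1]], x))

def train_som(patterns, grid=(8, 8), epochs=40, beta0=0.5, seed=0):
    """Train a feature map with the rule dw_i = beta * (x_i - w_i_old),
    applied to the winning unit and its square-block neighbourhood."""
    random.seed(seed)
    rows, cols = grid
    dim = len(patterns[0])
    # random weight initialization of every reference vector
    w = [[[random.random() for _ in range(dim)] for _ in range(cols)]
         for _ in range(rows)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs          # decays from 1 towards 0
        beta = beta0 * frac                  # shrinking learning rate
        radius = max(1, round((max(rows, cols) // 2) * frac))
        for x in patterns:
            br, bc = best_match(w, x)        # winning unit
            # cooperative update of the winner and its neighbours
            for r in range(max(0, br - radius), min(rows, br + radius + 1)):
                for c in range(max(0, bc - radius), min(cols, bc + radius + 1)):
                    for i in range(dim):
                        w[r][c][i] += beta * (x[i] - w[r][c][i])
    return w
```

After training, each unit's reference vector serves as the prototype for one cluster, and new patterns are assigned to the cluster of their best-matching unit.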


then reduced, as is the learning coefficient, in two broad stages: a short initial training phase in which a feature map is trained to reflect the coarser and more general details and a much longer, fine tuning stage, in which the local details of the organization are refined. This process continues until the network has stabilized and weight vectors associated with each unit define a multidimensional partitioning of the input data.

The partitioning was implemented using SOM software (NNRC, 1998). Feature maps of size 2×2, 4×4, 6×6 and 8×8 were examined using various sets of data. The 14 input variables from the original network modelling operation produced final clusters differentiated according to season and not on differing level behaviour. The use of adjacent-point differences in river flow levels also added little to the clustering process; hence the final input data chosen for the classification exercise were FLOWs t−1 to t−6. The best results were produced using an 8×8 SOM (64 clusters). This gave reasonable differentiation between event behaviours at high levels of flow. It also produced a large number of similar events at low levels of flow. Figure 8 illustrates some of the different types of event behaviour that were differentiated over the 6 h time period for Kilgram, using data for all 3 years. To facilitate a better presentation, the profiles have been forced through the origin, and all clusters with near-identical behaviour have been omitted from the plot. The three main types of hydrograph event can be seen in this diagram, comprising flat, rising and falling behaviours. Each of these items can be further partitioned into low, medium and high flow situations.
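The preparation of six-value SOM input vectors from a continuous flow record, and the forcing of profiles through the origin for plotting, might look like the following. A hedged sketch only: the study's exact preprocessing is not given, and both function names are invented for illustration.

```python
def event_windows(flows, width=6):
    """Slide a fixed window over a continuous flow record to form one
    input vector (e.g. FLOW at t-1 ... t-6) per time step."""
    return [flows[i:i + width] for i in range(len(flows) - width + 1)]

def through_origin(window):
    """Offset an event profile so that it starts at zero, in the spirit of
    the tidied cluster profiles plotted in Figure 8."""
    return [v - window[0] for v in window]
```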

To examine the potential benefits of this data-splitting technique a dedicated 8×8 classifier was produced for each station. To ensure a sufficient number of cases in each cluster, for subsequent training purposes, individual clusters were created from the complete 3-year record. The two most prevalent rising events at each station were then identified for modelling purposes, from the complete set of events, as shown in Figure 8. Table I lists the total number of cases in each of these rising clusters. Six trained NN solutions (two cluster types for three stations) were then developed using identical network architectures and training procedures to those reported earlier, which enabled comparison. RMSE, MAE and %COE statistics were calculated on the network outputs. Corresponding ARMA forecasts and naive predictions were then extracted for the relevant subsets and likewise assessed. These values are also listed in Abrahart and See (1998).
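The three evaluation statistics, RMSE, MAE and %COE (taking %COE to be the Nash and Sutcliffe (1970) coefficient of efficiency expressed as a percentage, per the citation in the reference list), can be computed with the standard definitions:

```python
import math

def rmse(obs, pred):
    """Root mean squared error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def pct_coe(obs, pred):
    """Coefficient of efficiency as a percentage:
    100 * (1 - squared residuals / squared deviations about the observed mean)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 100.0 * (1.0 - ss_res / ss_tot)
```

With these, the same scoring code can be applied unchanged to NN outputs, ARMA forecasts and naive predictions for each cluster subset.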

RMSE statistics are shown in Figure 9 and %COE statistics in Figure 10. The error values and recorded

Figure 8. SOM classification of different event types at Kilgram


Table I. Number of cases in each rising event cluster

Station      Rising cluster 1    Rising cluster 2
Kilgram            180                 110
Skelton             75                  89
Upper Wye           94                 107

Figure 9. RMSE calculated on first high-level rising event cluster at Kilgram

efficiencies indicated potential improvement in NN performance, over and above that produced from the ARMA equations, on both rising event clusters. Enabling the modelling procedure to concentrate on a small well-defined task, rather than the entire spectrum of global hydrograph behaviours, has therefore facilitated better approximation with respect to the rising limb of the hydrograph, although the actual statistics that were produced are in fact much poorer than their global counterparts.

    CONCLUSIONS

1. The NN solutions produced global error statistics that were similar to, and sometimes better than, a standard statistical time-series predictor using common data inputs for two contrasting catchments. The final decision on which technique is better suited to the modelling operation described in this paper must therefore rest on alternative factors.

2. The main distinction between the two techniques that have been evaluated was in the level of user input required and the speed of model building. The NN solutions were less demanding in terms of subjective testing and thus much faster to construct, which are important real world considerations.

3. Multinetwork modelling, using a separate solution for distinct individual hydrological event types,


Figure 10. %COE calculated on first high-level rising event cluster at Kilgram

provided improved performance on two rising event clusters and appears to offer considerable scope and promise for future developments in applied operational forecasting.

    ACKNOWLEDGEMENTS

SNNS (Stuttgart Neural Network Simulator) was developed in the Institute for Parallel and Distributed High Performance Systems at the University of Stuttgart. The SOM package was developed in the Laboratory of Computer and Information Science at the Helsinki University of Technology. Upper River Wye data were collected by the UK Institute of Hydrology. River Ouse data were provided by the UK Environment Agency.

    REFERENCES

Abrahart RJ. 1999. Neurohydrology: implementation options and a research agenda. Area 31(2): 141-149.
Abrahart RJ, See L. 1998. Neural network vs. ARMA modelling: constructing benchmark case studies of river flow prediction. GeoComputation '98: Proceedings Third International Conference on GeoComputation, University of Bristol, 17-19 September. http://www.geog.port.ac.uk/geocomp/geo98/05/gc05.htm
Abrahart RJ, See L. 1999. Fusing multi-model hydrological data. IJCNN '99: Proceedings International Joint Conference on Neural Networks, Washington DC, 10-16 July. CD-ROM.
Abrahart RJ, White S. In press. Modelling sediment transfer in Malawi: comparing backpropagation neural network solutions against a multiple linear regression benchmark using small data sets. Physics and Chemistry of the Earth.
Abrahart RJ, See L, Kneale PE. 1999. Applying saliency analysis to neural network rainfall-runoff modelling. GeoComputation '99: Proceedings Fourth International Conference on GeoComputation, Mary Washington College, Fredericksburg, Virginia, 25-28 July. CD-ROM.
Bathurst J. 1986. Sensitivity analysis of the Systeme Hydrologique Europeen for an upland catchment. Journal of Hydrology 87: 103-123.


Blackie JR, Eeles WO. 1985. Lumped catchment models. In Hydrological Forecasting, Anderson MG, Burt TP (eds). Wiley: Chichester; 311-345.
Box GEP, Jenkins GM. 1976. Time Series Analysis: Forecasting and Control. Holden-Day: Oakland, CA.
Braun H, Ragg T. 1997. ENZO: User Manual and Implementation Guide, Version 1.0. Institute for Logic, Complexity and Deduction Systems, University of Karlsruhe: Karlsruhe.
Campolo M, Andreussi P, Soldati A. 1999. River flood forecasting with a neural network model. Water Resources Research 35: 1191-1197.
Caudill M, Butler C. 1992. Understanding Neural Networks: Computer Explorations, Vol. 1, Basic Networks. MIT Press: Cambridge, MA.
Cheng X, Noguchi M. 1996. Rainfall-runoff modelling by neural network approach. Proceedings International Conference on Water Resources and Environment Research: Towards the 21st Century, Kyoto, Japan, 29-31 October 1996, 2: 143-150.
Daniell TM. 1991. Neural networks: applications in hydrology and water resources engineering. Proceedings, International Hydrology and Water Resources Symposium, Vol. 3. National Conference Publication 91/22, Institute of Engineering, Australia: Barton, ACT; 797-802.
Dawson CW, Wilby RE. 1998. An artificial neural network approach to rainfall-runoff modelling. Hydrological Sciences Journal 43: 47-66.
Diskin MH, Simon E. 1977. A procedure for the selection of objective functions for hydrological conceptual models. Journal of Hydrology 34: 129-149.
Fahlman SE, Lebiere C. 1990. The cascade-correlation learning architecture. In Advances in Neural Information Processing Systems, Vol. 2, Touretzky D (ed.). Morgan Kaufmann: San Mateo, CA; 524-532.
Fischer MM, Gopal S. 1994. Artificial neural networks: a new approach to modelling interregional telecommunication flows. Journal of Regional Science 34: 503-527.
Fortin V, Ouarda TBMJ, Bobee B. 1997. Comment on 'The use of artificial neural networks for the prediction of water quality parameters' by H. R. Maier and G. C. Dandy. Water Resources Research 33: 2423-2424.
French MN, Krajewski WF, Cuykendall RR. 1992. Rainfall forecasting in space and time using a neural network. Journal of Hydrology 137: 1-31.
Hillel D. 1986. Modeling in soil physics: a critical review. In Future Developments in Soil Science Research. Soil Science Society of America: Madison, WI; 35-42.
Hornik K, Stinchcombe M, White H. 1989. Multilayer feedforward networks are universal approximators. Neural Networks 2: 359-366.
Hsu K-L, Gupta HV, Sorooshian S. 1995. Artificial neural network modeling of the rainfall-runoff process. Water Resources Research 31: 2517-2530.
Jakeman AJ. 1993. How much complexity is warranted in a rainfall-runoff model? Water Resources Research 29: 2637-2649.
Kang KW, Park CY, Kim JH. 1993. Neural network and its application to rainfall-runoff forecasting. Korean Journal of Hydrosciences 4: 1-9.
Karunanithi N, Grenney WJ, Whitley D, Bovee K. 1994. Neural networks for river flow prediction. Journal of Computing in Civil Engineering 8: 201-220.
Kohonen T. 1995. Self-Organizing Maps. Springer-Verlag: Heidelberg.
Lorrai M, Sechi GM. 1995. Neural nets for modelling rainfall-runoff transformations. Water Resources Management 9: 299-313.
Maier HR, Dandy GC. 1996. The use of artificial neural networks for the prediction of water quality parameters. Water Resources Research 32(4): 1013-1022.
Maier HR, Dandy GC. 1997. Reply. Water Resources Research 33(10): 2425-2427.
Masters T. 1995. Neural, Novel and Hybrid Algorithms for Time Series Prediction. Wiley: New York.
Minns AW. 1996. Extended rainfall-runoff modelling using artificial neural networks. In Hydroinformatics '96: Proceedings 2nd International Conference on Hydroinformatics, Zurich, Switzerland, 9-13 September 1996, Vol. 1, Muller A (ed.). A. A. Balkema: Rotterdam; 207-213.
Minns AW. 1998. Modelling of 1-D pure advection processes using artificial neural networks. In Hydroinformatics '98: Proceedings Third International Conference on Hydroinformatics, Copenhagen, Denmark, 24-26 August, Vol. 2, Babovic V, Larsen CL (eds). A. A. Balkema: Rotterdam; 805-812.
Minns AW, Hall MJ. 1996. Artificial neural networks as rainfall-runoff models. Hydrological Sciences Journal 41: 399-417.
Minns AW, Hall MJ. 1997. Living with the ultimate black box: more on artificial neural networks. Proceedings Sixth National Hydrology Symposium, University of Salford, 15-18 September; 9.45-9.49.
Nash JE, Sutcliffe JV. 1970. River flow forecasting through conceptual models. Journal of Hydrology 10: 282-290.
NNRC. 1998. Neural Network Research Centre. http://www.cis.hut.fi/nnrc/nnrc-programs.html
Openshaw S. 1994. Neuroclassification of spatial data. In Neural Nets: Applications in Geography, Hewitson BC, Crane RG (eds). Kluwer Academic Publishers: Dordrecht; 53-70.
Openshaw S, Openshaw C. 1997. Artificial Intelligence in Geography. Wiley: Chichester.
Quinn PF, Beven KJ. 1993. Spatial and temporal predictions of soil moisture dynamics, runoff, variable source areas and evapotranspiration for Plynlimon, Mid-Wales. Hydrological Processes 7: 425-448.
Rumelhart DE, Hinton GE, Williams RJ. 1986. Learning internal representations by error propagation. In Parallel Distributed Processing: Explorations in the Microstructures of Cognition, Vol. 1, Rumelhart DE, McClelland JL (eds). MIT Press: Cambridge, MA; 318-362.
Sarle WS. 1998. FAQ document for Usenet newsgroup 'comp.ai.neural-nets'. ftp://ftp.sas.com/pub/neural/FAQ.html December 1998.
See L, Corne S, Dougherty M, Openshaw S. 1997. Some initial experiments with neural network models of flood forecasting on the River Ouse. GeoComputation '97: Proceedings 2nd International Conference on GeoComputation, University of Otago, Dunedin, New Zealand, 26-29 August.


Smith J, Eli RN. 1995. Neural-network models of rainfall-runoff process. Journal of Water Resources Planning and Management 121: 499-509.
SNNS Group. 1990-98. Stuttgart Neural Network Simulator: User Manual, Version 4.1. http://www-ra.informatik.uni-tuebingen.de/SNNS/
Tveter D. 1996-8. Backpropagator's Review. http://www.mcs.com/drt/bprefs.html
Yang R. 1997. Application of neural networks and genetic algorithms to modelling flood discharges and urban water quality. Unpublished PhD Thesis, Department of Geography, University of Manchester.