
Offline Training for Improving Online Performance of a Genetic Algorithm-Based Optimization Model for Hourly Multi-Reservoir Operation

Duan Chen 1, Arturo S. Leon 2, Samuel P. Engle 3, Claudio Fuentes 4, Qiuwen Chen 5

1 Changjiang River Scientific Research Institute, Wuhan, China, 430010
2 Department of Civil and Environmental Engineering, University of Houston, Houston, TX, USA, 77204
3 Department of Economics, University of Wisconsin-Madison, Madison, WI, USA, 53706
4 Department of Statistics, Oregon State University, Corvallis, OR, USA, 97331
5 Center for Eco-Environmental Research, Nanjing Hydraulic Research Institute, Nanjing, China, 210029

Abstract: A novel framework is proposed that incorporates implicit stochastic optimization (Monte Carlo method), cluster analysis (a machine-learning algorithm), and the Karhunen-Loeve expansion (a dimension-reduction technique). The framework aims to train a Genetic Algorithm-based optimization model with synthetic and/or historical data in an offline environment in order to develop a transformed model for online (i.e., real-time) optimization. The primary output of the offline training is a stochastic representation of the decision variables, constituted by a series of orthogonal functions with undetermined random coefficients. This representation preserves the covariance structure of the simulated decisions from the offline training and thereby gains some "knowledge" of the search space. Owing to this gained "knowledge", better candidate solutions can be generated and, hence, the optimal solutions can be obtained faster. The


feasibility of the approach is demonstrated with a case study optimizing the hourly operation of a ten-reservoir system over a two-week period.

Keywords: Reservoir operation; Genetic Algorithm; Model Training; Machine Learning; Karhunen-Loeve expansion

1 Introduction

The optimization of hourly multi-reservoir operation is essential to many short-term routines, such as determining the hourly operation strategy of a reservoir in a power-scheduling problem (Gil et al., 2003). Energy marketing also requires hourly reservoir operation to account for energy prices that change every hour (Olivares and Lund, 2011). Power schemes that combine hydropower with wind generation and/or other renewable sources often consider hourly or sub-hourly time steps for an accurate representation of the variation of power resources (Wang and Liu, 2011; Deane et al., 2014). Moreover, hourly reservoir operations are increasingly being considered for environmental objectives. For instance, hourly fluctuations in water surface elevation and flow discharge are essential for the spawning activity of some fish species (Chen et al., 2015; Stratford et al., 2016). Maintaining an hourly regime of environmental flows has gained increasing attention due to its benefit to the biota of river ecosystems (Meile et al., 2011; Shiau and Wu, 2013; Horne et al., 2017). Optimizing hourly multi-reservoir operation, however, is a challenging task due to the complexity of the search space, which results from the large number of decision variables (e.g., thousands). The optimization problem may be solved in an offline environment assuming all information is known. This offline optimization is normally accompanied by a high computational cost and, therefore, may not be acceptable for online optimization, i.e., an efficient optimization performed in a real-time manner.

Genetic Algorithms (GA) and their variants have been widely applied to multi-reservoir operation during the last two decades (Oliveira, 1997; Wardlaw, 1999; Reed et al., 2013; Tsoukalas and Makropoulos, 2015; Lerma et al., 2015; Gibbs et al., 2015) owing to their robustness, effectiveness, and global optimality properties. However, most applications of the GA to reservoir operation focus on long-term planning and management with a monthly time step or on short-term optimization with a daily time step. Like other metaheuristic methods, the GA works by iteratively moving to better positions in the search space, which are sampled using some probability distribution (e.g., normal) defined around the current position. The embedded randomness is a key element for global optimality but results in slow convergence. For a GA-based model, the computational cost of optimizing hourly multi-reservoir operation may be too expensive to perform an efficient online optimization in real time. To reduce the computational cost, some decomposition techniques have been adopted. Gil et al. (2003) performed a time-hierarchical decomposition on a short-term hydrothermal generation scheduling problem, and a set of expert operators (expert knowledge of the system and a priority list) was incorporated in the GA. Zoumas et al. (2004) proposed six problem-specific genetic operators, essentially a combination of local search techniques and expertise on the problem, to enhance the online performance of the GA for a hydrothermal coordination problem. Although useful in certain contexts, these techniques tend to be very problem-dependent and difficult to apply to general problems.

Machine-learning approaches, such as Reinforcement Learning (RL) and Cluster Analysis (CA), have been increasingly used to improve the performance of optimization models, even though many machine-learning approaches are optimization problems per se. In fact, machine-learning approaches and optimization algorithms have become frequently coupled for solving complex problems (Bennett and Parrado-Hernández, 2006). Lee and Labadie (2007) used RL to improve the performance of a stochastic optimization model for operating a two-reservoir system on the Geum River (South Korea). Castelletti et al. (2010) applied a tree-based learning method to the optimal operation of reservoirs in the Lake Como water system (Italy). Though different machine-learning approaches are implemented, the basic idea is to interact with experience (e.g., historical data) or the environment (e.g., feedback) and learn from these interactions. The pilot studies of Lee and Labadie (2007) and Castelletti et al. (2010) showed promising results for using machine learning to improve the performance of optimization models. However, their research focuses on long-term planning using offline optimization. The computational cost of optimizing one reservoir is approximately 1.5 hours in a regular computing environment (Castelletti et al., 2010), which is infeasible for hourly multi-reservoir operation in real time.

The performance of online optimization can be improved through offline optimization (De Jong, 1975). Offline optimization is normally used to determine the current state and the control law for the online optimization and to reduce the online control algorithm to a lookup table (Bemporad et al., 2002; Pannocchia et al., 2007). This strategy has been applied successfully to small problems for which the number of decision variables is manageable, but it is no longer practically feasible for large problems with thousands of controls (Wang and Boyd, 2010). Chasse and Sciarretta (2011) combined an offline optimizer with an online strategy for an energy management problem. The offline optimization estimated two tuning parameters for the online optimization, thereby allowing a better online performance. However, an adaptation rule, which can be problem-specific, is needed to link the offline optimization with the online optimization in order to account for future information uncertainty. Ravey et al. (2011) used offline optimization to predict a control strategy and then adapted the online control strategy for real-time energy management. An extra GA optimizer is implemented in the online optimization to improve the performance. In these works, offline optimization is used to provide a priori knowledge for the online optimization, but some extra effort is normally required in the online optimization due to the uncertainty/variability of future information. A novel framework based on offline training is proposed herein to improve the performance of online optimization for hourly multi-reservoir operation. The framework trains the online optimization through intensive offline optimization, in which many inflow scenarios that account for future information are included. No extra procedure is required for the online optimization after the offline training process is completed, which provides a more generic solution for different applications. The framework combines implicit stochastic optimization (Monte Carlo method), cluster analysis (machine-learning algorithm), and the Karhunen-Loeve expansion (dimension-reduction technique) in an offline environment and develops a transformed model for the online optimization. The transformed model preserves the covariance structure of the obtained 'historical' optimal solutions, which can be thought of as gaining knowledge from the training process. Candidate solutions generated by the transformed model (i.e., the trained model) share similar statistical properties with the historical optimal solutions; therefore, the trained model finds optimal solutions faster given similar input data. The framework is applied to train a multi-objective optimization model for the hourly operation of a ten-reservoir system. The performance of the trained model is compared against the untrained model (zero training). The sensitivity to the number of training runs is also investigated. The major contributions of this study are: (1) developing a novel framework for training an optimization model in an offline environment; (2) showing that the trained model significantly improves the online performance of the optimization; and (3) identifying an optimal number of model training runs that achieves relatively good performance with a relatively small training budget.


2 Optimization model for hourly reservoir operation

2.1 Reservoir system

A reservoir system on the Columbia River in the United States, which comprises 10 reservoirs, is used as a test case. A sketch of the ten-reservoir system is shown in Fig. 1. The reservoir system serves multiple operational purposes, including power generation, ecological and environmental requirements, and recreation (Schwanenberg et al., 2014; Chen et al., 2016).

We consider an operational horizon of two weeks, specifically from August 25th to September 7th, due to data availability. The time step of the decisions is hourly. The decision variables are the outflows of each reservoir at each hour during the optimization horizon, resulting in 3360 variables in total.

Fig. 1. Sketch of the ten-reservoir system in the Columbia River (Reprinted from Chen et al., 2016)


2.2 Objectives

Two objectives related to power generation are explicitly considered and expressed as follows:

Objective 1: $\text{Minimize} \sum_{t=1}^{T_h} \min\Big(0,\ \sum_{i=1}^{N_r} PG_i^t - PD^t\Big)$   (1)

Objective 2: $\text{Maximize} \sum_{T_d=1}^{14} \sum_{hr=6}^{22} \max\Big(0,\ \sum_{i=1}^{N_r} PG_i^{hr} - PD^{hr}\Big)$   (2)

where PG is the hydropower generated in the system (MWh) and PD is the power demand in the region (MWh). The variable t denotes time in hours and Th is the optimization period (336 hours). The index i represents the reservoirs in the system and Nr is the total number of reservoirs. The index hr denotes the heavy load hours (HLH) of a day (typically from 06:00 to 22:00 h). The quantity Td corresponds to the optimization period in days (14 days in our case).

The first objective is to minimize the power deficit during the operational horizon. The function min(0, *) expresses that the deficit is equal to 0 if the total power generated is greater than or equal to the power demand at time t. The second objective is to generate more power during heavy load hours, when power can be sold to the electricity market at a higher price, which would increase revenue. The function max(0, *) expresses that the objective value is equal to 0 if the total power generated is smaller than or equal to the power demand during heavy load hours. In the optimization model, the two objectives are normalized using a dimensionless index between zero and one. Other purposes of reservoir operation, such as flood control or MOP (minimum operation level) requirements, are treated as constraints on either reservoir water surface elevations or storage limits, as summarized below.

2.3 Constraints

$V_i^{t+1} - V_i^t = \Big( \frac{Q_{in,i}^t + Q_{in,i}^{t+1}}{2} - \frac{Q_{out,i}^t + Q_{out,i}^{t+1}}{2} \Big) \cdot \Delta t$   (3)
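As an illustration, the two objectives can be sketched in code. This is a minimal numpy sketch, not the authors' implementation: the array shapes, the function names, and the treatment of HLH as the 16 hourly steps from 06:00 to 22:00 are assumptions for illustration.

```python
import numpy as np

def objective_deficit(PG, PD):
    """Objective 1 (Eq. 1): sum over hours of min(0, total generation - demand).

    PG: array (Th, Nr) of hourly generation per reservoir (MWh)
    PD: array (Th,) of hourly regional demand (MWh)
    Negative values accumulate whenever total generation falls short of demand.
    """
    surplus = PG.sum(axis=1) - PD
    return np.minimum(0.0, surplus).sum()

def objective_hlh_surplus(PG, PD, hlh=range(6, 22)):
    """Objective 2 (Eq. 2): surplus generation during heavy load hours,
    summed over all days in the horizon."""
    total = 0.0
    n_days = PG.shape[0] // 24
    for d in range(n_days):
        for h in hlh:                     # heavy load hours within day d
            t = d * 24 + h
            total += max(0.0, PG[t].sum() - PD[t])
    return total
```

In the optimization model both values would then be normalized to a dimensionless index between zero and one, as described above.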


$H_{r\min,i} \le H_{r,i}^t \le H_{r\max,i}$   (4)

$QS_i^t = QF_i^t$   (5)

$MOP_i^{low} \le H_{r,i}^t \le MOP_i^{up}$   (6)

$Q_{tb\_\min,i} \le Q_{tb,i}^t \le Q_{tb\_\max,i}$   (7)

$\lvert Q_{out,i}^{t+1} - Q_{out,i}^t \rvert \le Q_{out\_ramp\_allow,i}$   (8)

$H_{r,i}^t - H_{r,i}^{t+1} \le H_{ramp\_down,i}$   (9)

$TW_{r,i}^t - TW_{r,i}^{t+1} \le TW_{ramp\_down,i}$   (10)

$N_{d\_\min,i} \le N_{d,i}^t \le N_{d\_\max,i}$   (11)

where V is reservoir storage; Qin and Qout are the inflow to and outflow from the reservoirs, respectively; and ∆t is the time step. Hr is the forebay elevation, or reservoir water surface elevation; Hrmin and Hrmax are the allowed minimum and maximum forebay elevations, respectively. QS is the spill flow and QF is the fish flow requirement. MOPlow and MOPup are the lower and upper boundaries for the MOP requirement on forebay elevation, respectively. Qtb is the turbine flow; Qtb_min and Qtb_max are the allowed minimum and maximum turbine flows, respectively. Qout_ramp_allow is the allowed ramping rate for the outflow between any two consecutive time steps. Hramp_down is the allowed ramping rate when the reservoir water level is decreasing. TWramp_down is the allowed ramping rate for the tail water when the tail water level is decreasing. Nd is the power output; Nd_min is the minimum output requirement and Nd_max is the maximum output capacity.

The number of constraints is approximately 28,000, many of which are nonlinear, resulting in a complicated optimization problem in a high-dimensional and complex search space.
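The constraint structure above can be sketched as simple feasibility checks on a candidate trajectory. This is a minimal sketch, not the paper's code: the function names are hypothetical, the ramping check is simplified to a symmetric bound, and the water-balance residual follows the trapezoidal form of Eq. (3).

```python
def violates_bounds(x, lo, hi):
    """Box constraint (cf. Eqs. 4, 6, 7, 11): every value must lie in [lo, hi]."""
    return any(v < lo or v > hi for v in x)

def violates_ramping(x, ramp_max):
    """Ramping constraint (cf. Eqs. 8-10, simplified to a symmetric bound):
    the change between consecutive time steps must not exceed ramp_max."""
    return any(abs(b - a) > ramp_max for a, b in zip(x, x[1:]))

def water_balance_residual(V, Q_in, Q_out, dt=3600.0):
    """Residual of Eq. (3) at each step: storage change minus the
    trapezoidal net inflow volume; all residuals should be ~0 for a
    mass-consistent trajectory."""
    return [
        (V[t + 1] - V[t])
        - ((Q_in[t] + Q_in[t + 1]) / 2 - (Q_out[t] + Q_out[t + 1]) / 2) * dt
        for t in range(len(V) - 1)
    ]
```

In a GA setting, such checks typically feed a constraint-violation score used by the tournament comparisons described in Section 2.4.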


2.4 Optimization Algorithm

The non-dominated sorting genetic algorithm, known as NSGA-II (Deb et al., 2002), is used as the optimizer for the case study. NSGA-II is a widely used metaheuristic method for multi-objective problems and has received increasing attention for reservoir operation (Prasad and Park, 2004; Atiquzzaman et al., 2006; Yandamuri et al., 2006; Sindhya et al., 2011; Chen et al., 2016). NSGA-II is a member of the GA family and follows the primary principles of the classical GA. First, a set of candidate solutions (the population) is generated randomly (the first generation); it is essentially white noise. Using the selection operator, some candidate solutions in the population are selected. A so-called binary tournament is implemented, in which the selected candidate solutions are compared in pairs based on their performance with respect to the constraints and the objectives. The winners of the tournament reproduce children (the next generation) by means of the recombination and mutation operators. The children can be viewed as random perturbations of their parents drawn from some distribution. The evolution process continues until a stopping criterion is met. One of the most common stopping criteria is the number of generations. This criterion is problem-dependent, but generally a large number of generations can be used to ensure solution convergence.
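The binary tournament step described above can be sketched as follows. This is a minimal illustration with a single scalar fitness; NSGA-II itself compares candidates by constraint violation, non-domination rank, and crowding distance, which is not reproduced here.

```python
import random

def binary_tournament(population, fitness, rng=random):
    """One binary tournament: pick two candidates at random and keep the
    one with better (here: lower) fitness."""
    a, b = rng.sample(range(len(population)), 2)
    return population[a] if fitness[a] <= fitness[b] else population[b]

def select_parents(population, fitness, n_parents, rng=random):
    """Run repeated tournaments to fill the mating pool, from which
    recombination and mutation produce the next generation."""
    return [binary_tournament(population, fitness, rng) for _ in range(n_parents)]
```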

3 Model training

3.1 Framework of model training

The framework comprises three major steps and is illustrated in Fig. 2. In the first step, we use a time-series model to randomly generate time series of inflow and power demand, as well as initial and ending forebay elevations. The generated data is then used to feed a conventional optimization model that employs NSGA-II as the optimizer. The output of the conventional optimization model is a near-optimal operational scheme under the given deterministic input, e.g., the inflow and the power demand. The procedure can be repeated an arbitrary number of times using the Monte Carlo method. A collection of optimal operational schemes is then obtained after a number of repeated experiments. This procedure is also known as implicit stochastic optimization (ISO), which has been widely used for constructing reservoir operational rules (Wurbs, 1993; Lund, 1996b; Celeste and Billib, 2009). In Step 2, a machine-learning method called the K-Spectral Centroid (K-SC) algorithm is applied to the collection of optimal operational schemes to identify possible patterns. The classification is based on the similarity metric for time-series data defined in the K-SC algorithm. Different patterns (clusters) of the optimal operational schemes represent different ways of operating the reservoirs. Some patterns may be far from the others due to distinct scenarios of inflow and power demand, or other model conditions such as different initial or ending forebay elevations of the reservoir system. Therefore, in Step 3, each identified cluster of optimal operational schemes is used to construct a representation of the obtained decisions using the Karhunen-Loeve (KL) expansion (cf. Section 3.4). All the KL representations are then assembled into a collection and used as a pool for generating new realizations of operational schemes. After the above three steps, a new optimization model is set up by incorporating the collection of KL representations. The new model transforms the decision variables from the time domain (i.e., Q(t)) to the frequency domain, where the decision variables are the M random coefficients (ξk). The optimization model in the frequency domain is called the spectral optimization model (SOM) (Chen et al., 2016). In the SOM, the operators of NSGA-II (e.g., crossover) are applied only to the coefficients of the KL expansion (ξk) rather than to the large number of decision variables (Q(t)), significantly reducing the search space. On the other hand, the KL representation preserves the covariance structure of the simulated decisions from the offline training, which is akin to gaining "knowledge" of the search space. Better candidate solutions are expected to be generated from the representation in the transformed model due to this gained "knowledge" and, therefore, the optimal solutions may be obtained faster in the online optimization.

Fig. 2. Framework for training the optimization model of reservoir operation


3.2 Generating the Training Data

3.2.1 Inflow and power demand data

The major input data for the optimization model are the hourly inflow and power demand. Two inflows, i.e., the inflow to GCL (Grand Coulee Reservoir) and the inflow to LWG (Lower Granite Reservoir), are considered (cf. Fig. 1). Based on data availability, the historical forecasted inflows from 1949 to 2002 and the observed inflows from 2002 to 2012, at six-hour intervals, are used. A model is then built on these data for generating the training data, which is expected to share similar features and structure. For the purpose of this study, we can consider the complete data set as a library of "historical" data. Since inflows and the related reservoir operations do not typically show sudden changes within short periods of time, hourly observations can be obtained directly from the historical data by linear interpolation or any other smoothing technique. To generate realistic hourly inflow scenarios for a two-week period based on the historical data, there are a few alternatives. For instance, we can take a direct modeling approach and use standard time-series techniques, such as autoregressive integrated moving average (ARIMA) models, to model the inflows in the period of interest and use the fitted models to simulate the data. Alternatively, if we regard the historical data as a representative sample of the possible inflows, we can use a bootstrap approach to produce noisy versions of the observed data and generate the desired data. A third alternative is to take advantage of the size of the library and produce "new data" by taking weighted averages of randomly selected historical inflows. Specifically, we can create a new inflow by randomly selecting two or more historical inflows and computing their weighted average. The weights can be arbitrary. In our case, to generate each inflow, we randomly selected two historical inflows (out of over 1800 possible combinations) and computed the average with equal weights. To avoid repetition, we can add the restriction that a pair of selected inflows can be used only once. The number of inflows used to compute each average depends mainly on the number of generated inflows that are needed. For instance, in our case, considering the average of three randomly selected inflows gives a pool of over 30,000 possible results. This approach can easily be modified to produce realistic data that show extreme features by clustering the original data and assigning probabilities for sampling from the different clusters. The method is used to generate 500 sets of inflows (Fig. 3). Similarly, 500 sets of simulated power demand are generated and shown in Fig. 3. The corresponding historical data is also shown in Fig. 3 for comparison.

Fig. 3. Historical data and simulated data
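The pair-averaging scheme described above can be sketched as follows. This is a minimal numpy sketch under stated assumptions: the function name and array layout are hypothetical, and equal weights are used as in the case described, with each unordered pair of historical series used at most once.

```python
import numpy as np

def generate_inflows(historical, n_new, weights=(0.5, 0.5), seed=None):
    """Create synthetic inflow series by averaging randomly chosen pairs of
    historical series, without reusing the same pair twice.

    historical: array (n_series, n_steps)
    returns:    array (n_new, n_steps)
    """
    rng = np.random.default_rng(seed)
    n = historical.shape[0]
    # enumerate all unordered pairs; sampling without replacement
    # enforces the "each pair used only once" restriction
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    if n_new > len(pairs):
        raise ValueError("not enough distinct pairs of historical series")
    chosen = rng.choice(len(pairs), size=n_new, replace=False)
    w1, w2 = weights
    return np.stack([w1 * historical[pairs[k][0]] + w2 * historical[pairs[k][1]]
                     for k in chosen])
```

With 60+ historical series this yields the "over 1800 possible combinations" mentioned above, since C(61, 2) = 1830.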


3.2.2 Initial and ending forebay elevation

For short-term reservoir operation, the initial and ending forebay elevations (FB) are critical variables that affect the operational schemes. Normally, the initial and ending FB are expected to be within certain ranges guided by medium-term or long-term optimization models (Lund, 1996a), which are not part of this study. In the present test case, we investigated the historical initial and ending FB of each reservoir from 1971 to 2015. Note that the ending FB may be related to the initial FB. To describe this relation, we also investigated the difference between the initial and ending FB and fit the differences with a distribution model, e.g., a normal distribution. Using the historical ranges of the FB and the distribution of the FB differences, we can randomly simulate initial and ending FB within the historical ranges.
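The sampling scheme just described can be sketched as follows. This is an illustrative sketch only: the normal model for the FB difference follows the fitting approach above, while the uniform draw for the initial FB and the rejection step are assumptions not specified in the text.

```python
import random

def sample_fb(fb_min, fb_max, diff_mean, diff_std, rng=random, max_tries=1000):
    """Sample an (initial, ending) forebay-elevation pair.

    The initial FB is drawn within its historical range (uniformly, as an
    assumption); the ending FB is the initial FB plus a normally distributed
    difference, resampled until it also falls within the historical range.
    """
    for _ in range(max_tries):
        fb_init = rng.uniform(fb_min, fb_max)
        fb_end = fb_init + rng.gauss(diff_mean, diff_std)
        if fb_min <= fb_end <= fb_max:
            return fb_init, fb_end
    raise RuntimeError("could not draw a feasible ending FB")
```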

3.3 K-Spectral Centroid algorithm

The K-Spectral Centroid (K-SC) algorithm is a novel Cluster Analysis method developed by Yang and Leskovec (2011). Cluster Analysis refers to the group of techniques designed to separate a set of objects or observations into different groups, or clusters, according to their similarities or proximities, defined in a certain sense. More recently, clustering and classification of time series such as curves have become increasingly popular, with obvious implications for pattern recognition (Zhang et al., 2015).

The K-SC algorithm, in a similar way to the widely used K-means clustering algorithm (Hartigan and Wong, 1979; Dhillon and Modha, 2001), iterates a two-step procedure: an assignment step and a refinement step. In the assignment step, each time series is assigned to the closest cluster; in the refinement step, the cluster centroids are updated. By alternating the two steps, the sum of the distances between members of the same cluster is minimized. The K-SC algorithm is found to outperform the K-means method in terms of intra-cluster homogeneity and inter-cluster diversity, which results in better cluster representation of the diverse shapes of the time series (Yang and Leskovec, 2011).

The reservoir operational schemes are time-series data, each representing actions or decisions over time, and they may exhibit similar patterns even under different inflow conditions. Recognizing these patterns helps to identify typical representations of the operational routines among all the operational schemes. Similar shapes of the operational schemes imply similar decisions on reservoir operation and may be grouped into one cluster. For this reason, it is essential to have a metric that appropriately measures the shape similarity of two time series. In K-means, a simple distance metric, i.e., the Euclidean distance, is adopted. The Euclidean metric measures the overall distance between two curves and tends to focus only on the global peaks of the curves. Under the Euclidean measurement, two time series may have a large distance due to scaling (in volume) or shifting (in position) effects even if their temporal shapes are similar. In contrast, K-SC uses a distance metric D(xj, xk) that is invariant to scaling and shifting (Yang and Leskovec, 2011), described as:

$D(x_j, x_k) = \min_{\lambda, q} \frac{\lVert x_j - \lambda \cdot x_{k(q)} \rVert}{\lVert x_j \rVert}$   (12)

where || · || is the l2 norm, λ is the scaling coefficient, and q is the shifting coefficient, measured in the number of time units used to shift xk. The metric works by finding the optimal values of the alignment q and the scaling coefficient λ for matching the shapes of the two time series.
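Equation (12) can be computed directly, since for each candidate shift the optimal scaling has the least-squares closed form λ = (x · y_q) / ||y_q||². The following numpy sketch illustrates this; the circular shift and the bounded shift range are simplifying assumptions for illustration.

```python
import numpy as np

def ksc_distance(x, y, max_shift=5):
    """Scale- and shift-invariant distance of Eq. (12):
    D(x, y) = min over shift q and scale lambda of ||x - lambda * y_q|| / ||x||.
    """
    best = np.inf
    for q in range(-max_shift, max_shift + 1):
        y_q = np.roll(y, q)              # circular shift by q time units
        denom = y_q @ y_q
        lam = (x @ y_q) / denom if denom > 0 else 0.0   # optimal scaling
        d = np.linalg.norm(x - lam * y_q) / np.linalg.norm(x)
        best = min(best, d)
    return best
```

A scaled and shifted copy of a series is thus at (numerically) zero distance from the original, which is exactly the invariance the clustering of operational schemes relies on.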

3.4 Karhunen-Loeve Expansion method

The Karhunen-Loeve (KL) expansion (Kosambi, 1943; Karhunen, 1947; Williams, 2015) is a representation of a random process as a series expansion involving a complete set of deterministic functions with corresponding random coefficients. Consider a random process $Q(t)$, and let $\bar{Q}(t)$ be its mean and $C(s,t) = \mathrm{cov}(Q(s), Q(t))$ be its covariance function, where Q(s) and Q(t) are the values of the process at different time steps. Then the KL expansion of $Q(t)$ can be represented by the following function:

$Q(t) = \bar{Q}(t) + \sum_{k=1}^{\infty} \sqrt{\lambda_k}\, \psi_k(t)\, \xi_k$   (13)

where $\{\psi_k, \lambda_k\}_{k=1}^{\infty}$ are the orthogonal eigenfunctions and the corresponding eigenvalues, obtained as solutions of the equation:

$\lambda \psi(t) = \int C(s,t)\, \psi(s)\, ds$   (14)

Equation (14) is a Fredholm integral equation of the second kind. When applied to a discrete and finite process, this equation takes a much simpler form that can be easily solved. In the discrete form, the covariance $C(s,t)$ is represented as an N×N matrix, where N is the number of time steps of the random process, and the integral above becomes the sum $\lambda \psi(t) = \sum_{s=1}^{N} C(s,t)\, \psi(s)$. In Eq. (13), $\{\xi_k\}_{k=1}^{\infty}$ is a sequence of uncorrelated random variables (coefficients) with mean 0 and variance 1, defined as:

$$\xi_k = \frac{1}{\sqrt{\lambda_k}} \int \left( Q(t) - \bar{Q}(t) \right) \psi_k(t)\, dt \qquad (15)$$

For practical implementation, the KL expansion in Equation (13) is approximated by a finite number of discrete terms (e.g., M). The truncated KL expansion is then written as:

$$Q(t) \approx \bar{Q}(t) + \sum_{k=1}^{M} \sqrt{\lambda_k}\, \psi_k(t)\, \xi_k \qquad (16)$$


The number of terms M is determined by the desired accuracy of the approximation and depends strongly on the correlation of the random process: the more correlated the process, the fewer terms required (Xiu, 2010). One approach to roughly determine M is to compare the magnitudes of the eigenvalues (in descending order) with the first eigenvalue and retain the terms with the most significant eigenvalues. With the truncated KL expansion, the large number of variables in the time domain is reduced to a few coefficients in the transformed space (i.e., the frequency domain). The KL expansion has found many applications in science and engineering and is recognized as one of the most widely used methods for reducing the dimension of random processes (Narayanan et al., 1999; Phoon et al., 2002; Grigoriu et al., 2006; Gibson et al., 2014).

The common practice in reservoir operation is to find the optimal operational policies, i.e., decision variables, through a deterministic optimization model under given conditions such as an inflow forecast and/or a power demand forecast. These conditions can normally be treated as random processes, and each time series of inflow and/or power demand is essentially a realization of those random processes. Consequently, the optimal operational decisions (described by the decision variables) can be thought of as another random process associated with these given random processes; each optimal operational scheme is a realization of that random process for a given scenario of inflow and/or power demand. Therefore, the optimal operational decisions can be treated as realizations of a random process that itself can be described by a collection of previous realizations. In this study, we apply the KL expansion to construct a representation of the optimal operational decisions based on a certain number of previous realizations. According to Equations (13)-(16), the covariance structure of the previous realizations is preserved in the


representation. Using this representation, we can generate arbitrary realizations of the operational decisions that are similar to the previous realizations because of the preserved covariance structure.

3.5 Hypervolume Index

One of the most important evaluations of performance in multi-objective optimization is global optimality, commonly determined by two main aspects: convergence and diversity of the Pareto front (Deb et al., 2002). In this context, the hypervolume index (H-index) is a good metric for evaluating the performance of multi-objective optimization (Zitzler et al., 2003; Reed et al., 2013) because it combines convergence and diversity into a single index. The H-index is defined as:

$$H_{index} = \int_{(0,0)}^{(1,1)} \alpha_A(z)\, dz \qquad (17)$$

where A is an objective vector set, Z is the hypercube (0,1)^n of the normalized objectives (n = 2 in our test case), and $\alpha_A(z)$ is the attainment function, which is equal to 1 if z is weakly dominated by the solution set A and 0 otherwise. Basically, the H-index measures the volume of the objective space covered by a set of non-dominated solutions, i.e., the volume enclosed by the attainment function and the axes. Higher values of the hypervolume index indicate better solution quality in terms of convergence and diversity. In general, the true Pareto front or the best known Pareto approximation set (i.e., a reference set) is preferred for performance evaluation; however, by virtue of its properties, the hypervolume index can also be used to compare two solution sets directly (Knowles and Corne, 2002).
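For two objectives, the integral in Eq. (17) is simply the area of the staircase formed by the non-dominated points. A Python sketch, assuming both normalized objectives are minimized toward the origin with reference corner (1,1) (a maximized objective would first be negated and renormalized):

```python
import numpy as np

def hypervolume_2d(front, ref=(1.0, 1.0)):
    """Area of normalized objective space weakly dominated by `front`
    (both objectives minimized; Eq. 17 with n = 2)."""
    pts = sorted(front)                        # ascending first objective
    nd, best_f2 = [], float("inf")
    for f1, f2 in pts:                         # keep only the non-dominated staircase
        if f2 < best_f2:
            nd.append((f1, f2))
            best_f2 = f2
    hv = 0.0
    for i, (f1, f2) in enumerate(nd):          # sum rectangular slabs up to ref
        next_f1 = nd[i + 1][0] if i + 1 < len(nd) else ref[0]
        hv += (next_f1 - f1) * (ref[1] - f2)
    return hv
```

Dominated points contribute nothing, so adding them leaves the index unchanged, which is what makes the H-index a pure measure of the non-dominated set.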


4 Results

We first describe the results of the offline model training, which comprises the implicit stochastic optimization, the cluster analysis, and the development of the spectral optimization model. Then the online performance of the trained model is compared with that of the untrained model. Both the trained and untrained models use the NSGA-II with the same population size of 50, and the number of generations is set to 2,500 for each optimization experiment. Other NSGA-II parameters, such as the crossover probability (0.9) and mutation probability (1/3600), are set as in Deb et al. (2002) and other literature (Yandamuri et al., 2006; Sindhya et al., 2011). The parameters for the K-SC and K-means, as well as those for the SOM, are discussed in the corresponding results sections below.

4.1 Implicit Stochastic Optimization

The 500 scenarios of inflow (both GCL and LWG) and power demand simulated by the statistical models are divided into two data sets. The first, comprising 450 randomly chosen scenarios, is used as a training set; the remaining 50 scenarios are used as a comparison set for evaluating the performance of the trained model against the untrained one. The implicit stochastic optimization (ISO), which is Step 2 in the framework (Fig. 3), is completed using the 450 scenarios from the training set. Each ISO experiment uses one inflow scenario for the GCL reservoir, one inflow scenario for the LWG reservoir, and one power demand scenario from the training set, resulting in a total of 450 experiments. For each experiment, the initial and ending forebay elevations are randomly selected within the historical range based on the statistical model described in Section 2.3.2. For the Monte Carlo simulation, the initial FB is randomly selected (from a uniform distribution) within the historical range and a difference between the initial and ending FB is also randomly generated (from the fitted distribution). The ending


FB is deterministically determined from the selected initial FB and the generated difference.
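The boundary-condition sampling can be sketched as follows; `draw_delta` is a placeholder for the fitted distribution of Section 2.3.2, and the clipping to the historical range is our assumption about how out-of-range ending elevations would be handled:

```python
import numpy as np

def sample_forebay_pair(rng, fb_min, fb_max, draw_delta):
    """Monte Carlo sampling of boundary conditions as described above:
    initial FB ~ Uniform(historical range); the initial-to-ending
    difference comes from a fitted sampler (`draw_delta`, hypothetical
    here); the ending FB then follows deterministically."""
    fb_init = rng.uniform(fb_min, fb_max)
    delta = draw_delta(rng)
    # assumption: keep the ending elevation inside the feasible range
    fb_end = float(np.clip(fb_init + delta, fb_min, fb_max))
    return fb_init, fb_end
```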

Each experiment results in a Pareto front, which depicts the non-dominated relations between the two conflicting objectives. Each solution of the Pareto front, namely a point in the objective space, is associated with an operational solution (i.e., reservoir outflows) in the decision space. Since we use a population of 50 in the NSGA-II for each experiment, the 450 experiments result in 450 Pareto fronts, each comprising 50 non-dominated solutions, for a total of 22,500 operational solutions.

A collection of the Pareto fronts for all 450 experiments is shown in Fig. 4. Note that the differences between the Pareto fronts (curves) result from the different inflows, power demands, and initial and ending forebay elevations. It is difficult to distinguish one Pareto front from another since a large number of solutions are compacted within a small region. However, the region shows a relatively clear range in the two objectives. Most of the solutions cover a range from 0.5% to 3% on objective one, i.e., the power deficit relative to demand. A relatively compact range, from 0.2% to 1.6%, is found for objective two, i.e., the extra power generation for heavy load hours. This range illustrates the optimal region of the objectives that can be expected despite the variations in inflows, power demand, and initial and ending forebay elevations.


Fig. 4. Illustration of the collection of solutions from the 450 experiments

Fig. 4 also shows the collections of reservoir outflows and corresponding forebay elevations (time series), both associated with the collection of Pareto fronts. For brevity, only the results for the GCL and LWG reservoirs (the most upstream reservoirs) are shown. The outflows of both GCL and LWG show great variation in temporal shape, similar to the inflows. However, the outflow of GCL is more regulated than that of LWG owing to the stronger regulation capacity afforded by its larger storage. The forebay elevation changes at each time step and traces out a trajectory. The trajectory indicates the decisions to hold or release water and is often an intuitive representation of the operational strategy. It is


shown that the initial and ending forebay elevations for the two weeks vary within a certain range, similar to that of the historical data.

4.2 Cluster analysis of the simulated solutions

The goal of the cluster analysis is to recognize underlying patterns among the simulated solutions. Because of daily peaking operation (as at the GCL reservoir), reservoir outflows vary greatly within a day, which makes it difficult to recognize patterns over a relatively long term (i.e., 14 days). Therefore, the forebay elevations (i.e., the trajectories) are used for the cluster analysis. Since each forebay trajectory is associated with one solution of outflows, the clusters of trajectories and the members within each cluster can be applied directly to the decision variables, i.e., the reservoir outflows.

Owing to the configuration of the reservoir system, only the trajectories of the two upstream reservoirs, i.e., the GCL and LWG reservoirs, are used for the cluster analysis, as the operations of the other reservoirs depend on one or both of these two. Both K-means and K-SC are used for clustering the forebay trajectories. As with most CA methods, the number of clusters is a parameter of K-means and K-SC that must be specified in advance. The Silhouette index (Kaufman and Rousseeuw 2009), which measures how well each object lies within its cluster, is used for determining the number of clusters. The index lies within the range [-1, 1], and the higher the value, the better the clustering. For the case study, we measure how the (average) Silhouette index varies with the number of clusters for the forebay trajectories of the two reservoirs and select the number of clusters with the highest Silhouette index. Fig. 5 shows the Silhouette index for different numbers of clusters for the two reservoirs. From these relations, the optimal numbers of clusters for the GCL and LWG reservoirs are determined as 2 and 3, respectively. The centroids of the clusters for the two reservoirs are also


shown in Fig. 5, where the centroids of the clusters from the K-SC method represent adjusted values, while the absolute values of the trajectories are shown for the K-means method. The adjusted value in the K-SC is a result of the metric D (Eq. 12) and can be interpreted as a measure of shape similarity.
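Given a pairwise distance matrix (Euclidean for K-means, the metric D of Eq. (12) for K-SC), the average Silhouette index is straightforward to compute. A Python sketch of the selection criterion:

```python
import numpy as np

def mean_silhouette(D, labels):
    """Average Silhouette index of a labeling, given the full n x n
    pairwise distance matrix D. Values near 1 indicate tight,
    well-separated clusters; the number of clusters that maximizes
    this average is selected."""
    labels = np.asarray(labels)
    n = len(labels)
    scores = []
    for i in range(n):
        same = (labels == labels[i]) & (np.arange(n) != i)
        if not same.any():                    # singleton cluster: s = 0 by convention
            scores.append(0.0)
            continue
        a = D[i, same].mean()                 # mean intra-cluster distance
        b = min(D[i, labels == c].mean()      # mean distance to nearest other cluster
                for c in np.unique(labels) if c != labels[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))
```

Running this for a sweep of cluster counts and keeping the maximizer reproduces the selection rule described above.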


Fig. 5. Silhouette index for different numbers of clusters (left) and centroids of patterns recognized by K-SC (middle) and K-means (right) for the LWG forebay

The clusters found by K-means are far from each other as measured by Euclidean distance, while the shapes of their trajectories are similar to each other. In contrast, the clusters discovered by K-SC show great dissimilarity in shape. The results from the K-SC


are preferred since different trajectory shapes may represent different operational strategies. For instance, the K-SC clusters for the GCL reservoir show two distinct operational strategies: one releases more water in the first week (forebay elevation decreases) and holds more water in the second week (forebay elevation increases); the other does the opposite. Similarly, three operational strategies are recognized for the LWG reservoir. In contrast, the K-means clusters fail to recognize these different operational strategies. The other reservoirs in the system are affected either by the operational strategy of the GCL reservoir (i.e., the CHJ reservoir on the middle Columbia River), by the operational strategy of the LWG reservoir (i.e., the three downstream reservoirs on the Snake River), or by a combination of the two (i.e., the four reservoirs on the lower Columbia River). There are therefore at most 2 × 3 = 6 possible strategies for the entire reservoir system, or fewer if some combinations are not applicable. We first classify the collection of trajectories using the two clusters of the GCL reservoir and mark each member by 1 (cluster 1) or 2 (cluster 2). Then we apply the three clusters of the LWG reservoir and add a second number for each member: 1 (cluster 1), 2 (cluster 2), or 3 (cluster 3). Each member of the collection thus has two numbers, which serve as coordinates that divide the collection into groups. These coordinates are applied to the collection of trajectories and, correspondingly, to the collection of outflows.
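The two-coordinate labeling amounts to grouping solutions by their (GCL cluster, LWG cluster) pair, yielding at most 2 × 3 = 6 groups; a minimal sketch:

```python
def group_solutions(gcl_labels, lwg_labels):
    """Tag each simulated solution with its (GCL cluster, LWG cluster)
    pair; each distinct pair defines one group of the collection."""
    groups = {}
    for idx, pair in enumerate(zip(gcl_labels, lwg_labels)):
        groups.setdefault(pair, []).append(idx)
    return groups
```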

4.3 KL representations and spectral optimization model

Instead of one KL representation for the whole collection of outflows, six KL representations are prepared, one for each cluster of outflows. The first 50 eigenvalues of each cluster are calculated according to the procedure described in Section 3.4, and the results are shown in Fig. 6. The truncation of the eigenvalues is determined from the elbow points, which appear at n = 18. Therefore, we use the first 18 eigenvalues and the corresponding eigenfunctions


(also shown in Fig. 6) of each cluster to set up one truncated version of the KL representation, giving a total of six truncated KL representations. The truncated representations are combined into a collection and incorporated into the spectral optimization model (SOM), which is used for the online optimization. The SOM is designed to gain some "knowledge" from the simulated data through the conservation of the covariance structure. Therefore, the SOM is hereinafter called the trained model, and the conventional model, i.e., the model without the KL representation, is called the untrained model.

Fig. 6. The first 50 eigenvalues for the 6 patterns and the first 18 eigenfunctions for the GCL reservoir (Pattern 1)


4.4 Comparison of model performances

The trained model is compared with the untrained model on the 50 scenarios of the comparison set, a data set not used for model training. To make a fair comparison, each experiment uses the same settings for both the trained and untrained models, including the NSGA-II parameters (population of 50 and 2,500 generations). In each experiment, the inflow and power demand scenarios, as well as the initial and ending forebay elevations, are the same for both models.

As in the ISO procedure, the 50 experiments result in 50 different Pareto fronts, with each experiment producing different solutions from the two models. The solutions of three experiments are illustrated in Fig. 7 to compare the results of the trained and untrained models. Two important measures of Pareto solutions are convergence and diversity (Deb et al., 2002; Yandamuri et al., 2006; Sindhya et al., 2011; Chen et al., 2015). Although the two measures can be calculated in the presence of reference solutions, it is straightforward to judge the quality of the solutions by comparing them graphically. The solutions from the trained model lie in the far lower-right region, the preferred region for convergence since we aim to maximize the first objective (x-axis) and minimize the second objective (y-axis). The solutions from the trained model also cover greater ranges of the objectives, i.e., more diversified results.


Fig. 7. Comparison of the Pareto fronts and of the improvement in hypervolume between the untrained model and the trained model


A more rigorous comparison is made using the hypervolume index (H-index), which measures both the convergence and diversity of the Pareto solutions. To investigate the run-time performance of the two models, the H-index is calculated every 20 generations for each experiment. Instead of comparing absolute H-index values, which differ considerably between experiments and are therefore difficult to compare, we use the improvement of the H-index, defined as the percentage increase of the H-index with respect to its initial value measured at the first 20 generations. The experiments with the trained model (shown in blue) gain much greater improvement than those with the untrained model. On average, the improvement at the final generation (the 2,500th) gained by the trained model is 15%, almost 3 times that of the untrained model. The trained model also shows a strong improvement (5%) within the first 100 generations, nearly the same improvement as the untrained model achieves only at the 2,500th generation; in other words, the trained model reaches comparable quality roughly 25 times faster.
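The improvement measure is simple to compute from the run-time H-index trace sampled every 20 generations; a minimal sketch:

```python
import numpy as np

def hindex_improvement(h_trace):
    """Percentage improvement of the H-index relative to its value at the
    first measurement (generation 20), as used for the run-time comparison."""
    h = np.asarray(h_trace, dtype=float)
    return 100.0 * (h - h[0]) / h[0]
```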

The trained model discussed above uses the entire training set (450 scenarios). An interesting question is the appropriate amount of data for training a model. The answer is helpful for determining the minimum number of training samples when training data are not easy to access; it also saves training time, even though training is an offline process. We therefore train the model with different numbers of training samples and test the performance of each trained model on the comparison set.

The comparison set is the same as the one used to compare the untrained model and the model trained with all 450 data sets (hereinafter called the full-trained model). The number of training samples ranges from 50 to 400 with an interval of 50. The experiments are indexed from 1 to 10, each index indicating training with a different amount of data. Experiment 1 is for the


untrained model, which used 0 sets of simulated data, while Experiment 10 is for the model that used 450 sets. The models trained with different numbers of training samples (hereinafter called partially-trained models) are run on the comparison set using the same population (50) and number of generations (2,500) as the full-trained and untrained models. Fig. 8 compares the statistics of the H-index (at the final generation) over the 50 scenarios in the comparison set; the results for the untrained and full-trained models are also shown. The horizontal bars in Fig. 8, from top to bottom, represent the maximum, 75th percentile, mean, 25th percentile, and minimum values of the H-index, respectively. The mean H-index improves markedly from the untrained model to the partially-trained model with 150 training sets and then rises only slightly as the training sets increase to 300. The mean H-index remains essentially unchanged between the partially-trained models (300 to 400 training sets) and the full-trained model. The maximum H-index behaves similarly to the mean. However, the minimum, 25th percentile, and 75th percentile values do not improve steadily as the number of training samples increases.
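Assuming the five bars are plain order statistics of the 50 final H-index values, they can be computed per experiment as:

```python
import numpy as np

def hindex_summary(h_values):
    """Summary statistics plotted for each experiment in Fig. 8
    (one final-generation H-index per scenario in the comparison set)."""
    h = np.asarray(h_values, dtype=float)
    return {
        "max": h.max(),
        "p75": np.percentile(h, 75),
        "mean": h.mean(),
        "p25": np.percentile(h, 25),
        "min": h.min(),
    }
```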


Fig. 8. Comparison of the statistics of hypervolume between models trained with different numbers of training samples

5 Discussion

The trained model shows overall superior optimization performance compared with the untrained model. The reason is that the trained model recognizes patterns in the historical decisions and conserves the covariance structure of those patterns in time-series form. The covariance structure helps the model generate better candidate decisions that are more mature (or regularized) than those of the untrained model, which are usually randomly generated. Information is conserved in the covariance structure much as an athlete gains so-called


“muscle memory” through repeated exercise. Just as a trained athlete outperforms untrained amateurs, the informational advantage of the trained model leads to quicker convergence of the optimization. Consequently, a much better final solution can be expected from the trained model given the same number of generations as the untrained model. This also explains the early improvement of the H-index observed for the trained model: the improvement is much faster at the beginning and tends to slow down in the later stages (Fig. 7). The early, faster improvement is mainly due to the high-quality candidate solutions generated from the covariance structure. Those candidate solutions reflect the best historical decisions (at least the best after 2,500 generations) in similar scenarios; therefore, the H-index improves greatly as those candidate solutions are generated. The improvement then slows down as less and less information remains usable for guiding the candidate solutions in the later stages of the optimization.

The reduction and transformation of the decision variables in the SOM also contribute to the better performance of the trained model. Since far fewer decision variables are used in the trained model, the search space is greatly reduced, resulting in better convergence. The better performance of the trained model may also be associated with the interdependences between the decision variables. Search difficulty increases with the interdependences between decision variables (Goldberg 2002; Hadka and Reed, 2013; Woodruff et al., 2013). The decision variables of a multi-reservoir system are clearly correlated and dependent to some extent, which makes the optimization of the untrained model difficult to solve in the time domain. In contrast, the decision variables in the frequency domain (i.e., the coefficients) are mutually independent, and hence the optimization of the trained model is not as complex as in the time domain.


The data used for training the model are synthetic data generated from the statistical model. Using data generated from other time-series models may result in a different trained model; however, the trained model is still expected to achieve better optimization performance since the covariance structure is conserved through the training process. In other words, the accuracy of the trained model's representation depends on the accuracy of the data used for training. Historical data are a good source for training; however, the amount of historical data may not be sufficient, and a covariance structure that is not fully explored because of a small data size may yield an inaccurate representation in the trained model. Fig. 8 shows that the performance of the trained model strengthens as the number of training samples increases; however, the performance improves little beyond a certain number of training samples, i.e., about 200 in this study. This number is expected to differ when a different data set is used; nevertheless, a minimum number of training samples can be found by comparing the performances of the trained models, and this minimum can be used to obtain relatively good model performance when the training budget (time or cost) is limited.

The data used for training the model are inflows (GCL and LWG) and power demand data; the correlations between them are not considered. Accounting for these correlations would help reduce the size of the training data, since we would not need to emulate every combination of inflows and power demand. This topic will be explored in future work.

It is worth mentioning that the trained model is specific to a problem with a given structure, inflow distribution, and optimization horizon; any change to the problem requires a new training process. Clearly, this training process applies only to stationary optimization problems, in which the inflow and demand follow the same or similar statistical patterns. Abrupt changes in inflows and power demand are not accommodated, since such changes may not be


captured by the covariance structure of the decision variables. However, the concept of model training is general and can easily be implemented in other optimization problems for which efficient online optimization is preferred. Examples include resource management and allocation problems such as water resources planning and environmental governance and management. A fast and interactive tool can help stakeholders learn about complex environmental problems and make more informed judgments (Radinsky et al., 2016). Model training can also find great value in online platforms that allow participants to identify water management adaptation options in real time (Bojovic et al., 2017).

6 Conclusion

The optimization model trained by the proposed framework significantly improves model performance by gaining information from historical data and offline simulations. For optimizing the hourly operation of a multi-reservoir system, the model trained with historical (simulated) decisions shows significant improvement in the hypervolume index compared with a model without training. The trained model converged nearly 25 times faster than its untrained version to achieve a solution of similar quality, which significantly improves the performance of an online optimization model. The performance of the trained model improves with the number of training samples; however, the improvement tends to converge after a certain number of training samples is used.

The early improvement observed in the trained model is attractive for the online optimization 601

which has to be completed in short timeframe. This faster convergence can be used to greatly 602

decrease the computational time in an online optimization environment. 603

Even though the framework of model training is implemented here for reservoir operation, its generic procedure allows broad application to many optimization models in water resources and environmental system management in which future decisions share the same or a similar statistical structure with the historical decision variables.

Software and data availability

The historical inflow/generation data for the reservoir system are available through the Northwestern Division, U.S. Army Corps of Engineers, and can be obtained from http://www.nwd-wc.usace.army.mil/cgi-bin/dataquery.pl. The reservoir data were provided by the Bonneville Power Administration. The simulated inflow/generation data can be obtained by contacting Claudio Fuentes by e-mail at [email protected]. The code for the KL expansion is implemented in Matlab and can be accessed by contacting Duan Chen by e-mail at [email protected]. The K-Spectral Centroid (K-SC) clustering algorithm is also written in Matlab and is available from http://snap.stanford.edu/data/ksc.html.

Acknowledgements

This work was supported by the Bonneville Power Administration through projects TIP258 and TIP342. The first author also thanks the support from the National Natural Science Foundation of China (51425902, 51479188, and 51379020) and the Fundamental Research Funds for the Central Universities (CKSF2016009/SL).


References

Atiquzzaman, M., Liong, S. Y., & Yu, X. (2006). Alternative decision making in water distribution network with NSGA-II. Journal of Water Resources Planning and Management, 132(2), 122-126.

Bemporad, A., Morari, M., Dua, V., & Pistikopoulos, E. N. (2002). The explicit linear quadratic regulator for constrained systems. Automatica, 38(1), 3-20.

Bennett, K. P., & Parrado-Hernández, E. (2006). The interplay of optimization and machine learning research. The Journal of Machine Learning Research, 7, 1265-1281.

Bojovic, D., Giupponi, C., Klug, H., Morper-Busch, L., Cojocaru, G., & Schörghofer, R. (2017). An online platform supporting the analysis of water adaptation measures in the Alps. Journal of Environmental Planning and Management, 1-16.

Castelletti, A., Galelli, S., Restelli, M., & Soncini-Sessa, R. (2010). Tree-based reinforcement learning for optimal water reservoir operation. Water Resources Research, 46(9).

Celeste, A. B., & Billib, M. (2009). Evaluation of stochastic reservoir operation optimization models. Advances in Water Resources, 32(9), 1429-1443.

Chasse, A., & Sciarretta, A. (2011). Supervisory control of hybrid powertrains: An experimental benchmark of offline optimization and online energy management. Control Engineering Practice, 19(11), 1253-1265.

Chen, D., Leon, A. S., Gibson, N. L., & Hosseini, P. (2016). Dimension reduction of decision variables for multireservoir operation: A spectral optimization model. Water Resources Research.


Chen, D., Leon, A. S., Hosseini, P., Gibson, N. L., & Fuentes, C. (2017). Application of cluster analysis for finding operational patterns of multireservoir system during transition period. Journal of Water Resources Planning and Management, 04017028.

Chen, D., Li, R., Chen, Q., & Cai, D. (2015). Deriving optimal daily reservoir operation scheme with consideration of downstream ecological hydrograph through a time-nested approach. Water Resources Management, 29(9), 3371-3386.

De Jong, K. A. (1975). Analysis of the behavior of a class of genetic adaptive systems. Doctoral dissertation, University of Michigan.

Deane, J. P., Drayton, G., & Gallachóir, B. Ó. (2014). The impact of sub-hourly modelling in power systems with significant levels of renewable generation. Applied Energy, 113, 152-158.

Deb, K., Pratap, A., Agarwal, S., & Meyarivan, T. (2002). A fast and elitist multi-objective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6(2), 182-197.

Dhillon, I. S., & Modha, D. S. (2001). Concept decompositions for large sparse text data using clustering. Machine Learning, 42(1-2), 143-175.

Gibbs, M. S., Maier, H. R., & Dandy, G. C. (2015). Using characteristics of the optimisation problem to determine the genetic algorithm population size when the number of evaluations is limited. Environmental Modelling & Software, 69, 226-239.

Gibson, N. L., Gifford-Miears, C., Leon, A. S., & Vasylkivska, V. S. (2014). Efficient computation of unsteady flow in complex river systems with uncertain inputs. International Journal of Computer Mathematics, 91(4), 781-797.

Gil, E., Bustos, J., & Rudnick, H. (2003). Short-term hydrothermal generation scheduling model using a genetic algorithm. IEEE Transactions on Power Systems, 18(4), 1256-1264.

Goldberg, D. E. (2002). The Design of Innovation: Lessons from and for Competent Genetic Algorithms. Kluwer Academic, Norwell, MA.


Grigoriu, M. (2006). Evaluation of Karhunen-Loève, spectral, and sampling representations for stochastic processes. Journal of Engineering Mechanics, 132(2), 179-189.

Hadka, D., & Reed, P. (2013). Borg: An auto-adaptive many-objective evolutionary computing framework. Evolutionary Computation, 21(2), 231-259.

Hartigan, J. A., & Wong, M. A. (1979). Algorithm AS 136: A K-means clustering algorithm. Journal of the Royal Statistical Society, Series C (Applied Statistics), 28(1), 100-108.

Horne, A., Kaur, S., Szemis, J., Costa, A., Webb, J. A., Nathan, R., ... & Boland, N. (2017). Using optimization to develop a "designer" environmental flow regime. Environmental Modelling & Software, 88, 188-199.

Karhunen, K. (1947). On linear methods in probability and statistics. Ann. Acad. Sci. Fennicae, Ser. A. I. Math.-Phys., 37, 1-79.

Kaufman, L., & Rousseeuw, P. J. (2009). Finding Groups in Data: An Introduction to Cluster Analysis. John Wiley & Sons.

Khalil, A., McKee, M., Kemblowski, M., & Asefa, T. (2005). Sparse Bayesian learning machine for real-time management of reservoir releases. Water Resources Research, 41(11).

Knowles, J., & Corne, D. (2002). On metrics for comparing nondominated sets. In Proceedings of the 2002 Congress on Evolutionary Computation (CEC'02) (Vol. 1, pp. 711-716). IEEE.

Kosambi, D. D. (1943). Statistics in function space. J. Indian Math. Soc. (N.S.), 7, 76-88.

Labadie, J. W. (2004). Optimal operation of multireservoir systems: State-of-the-art review. Journal of Water Resources Planning and Management, 130(2), 93-111.

Lee, J. H., & Labadie, J. W. (2007). Stochastic optimization of multireservoir systems via reinforcement learning. Water Resources Research, 43(11).


Lerma, N., Paredes-Arquiola, J., Andreu, J., Solera, A., & Sechi, G. M. (2015). Assessment of evolutionary algorithms for optimal operating rules design in real water resource systems. Environmental Modelling & Software, 69, 425-436.

Lund, J. R. (1996a). Developing seasonal and long-term reservoir system operation plans using HEC-PRM (No. HEC-RD-40). Hydrologic Engineering Center, Davis, CA.

Lund, J. R., & Ferreira, I. (1996b). Operating rule optimization for Missouri River reservoir system. Journal of Water Resources Planning and Management, 122(4), 287-295.

Meile, T., Boillat, J. L., & Schleiss, A. J. (2011). Hydropeaking indicators for characterization of the Upper-Rhone River in Switzerland. Aquatic Sciences, 73(1), 171-182.

Narayanan, M. V., King, M. A., Soares, E. J., Byrne, C. L., Pretorius, P. H., & Wernick, M. N. (1999). Application of the Karhunen-Loeve transform to 4D reconstruction of cardiac gated SPECT images. IEEE Transactions on Nuclear Science, 46(4), 1001-1008.

Olivares, M. A., & Lund, J. R. (2011). Representing energy price variability in long- and medium-term hydropower optimization. Journal of Water Resources Planning and Management, 138(6), 606-613.

Oliveira, R., & Loucks, D. P. (1997). Operating rules for multireservoir systems. Water Resources Research, 33(4), 839-852.

Pannocchia, G., Rawlings, J. B., & Wright, S. J. (2007). Fast, large-scale model predictive control by partial enumeration. Automatica, 43(5), 852-860.

Phoon, K. K., Huang, S. P., & Quek, S. T. (2002). Implementation of Karhunen-Loeve expansion for simulation using a wavelet-Galerkin scheme. Probabilistic Engineering Mechanics, 17(3), 293-303.


Prasad, T. D., & Park, N. S. (2004). Multiobjective genetic algorithms for design of water distribution networks. Journal of Water Resources Planning and Management, 130(1), 73-82.

Radinsky, J., Milz, D., Zellner, M., Pudlock, K., Witek, C., Hoch, C., & Lyons, L. (2016). How planners and stakeholders learn with visualization tools: Using learning sciences methods to examine planning processes. Journal of Environmental Planning and Management, 1-28.

Ravey, A., Blunier, B., & Miraoui, A. (2011, September). Control strategies for fuel cell based hybrid electric vehicles: From offline to online. In Vehicle Power and Propulsion Conference (VPPC), 2011 IEEE (pp. 1-4). IEEE.

Reed, P. M., Hadka, D., Herman, J., Kasprzyk, J., & Kollat, J. (2013). Evolutionary multiobjective optimization in water resources: The past, present, and future. Advances in Water Resources, 51, 438-456.

Schwanenberg, D., Xu, M., Ochterbeck, T., Allen, C., & Karimanzira, D. (2014). Short-term management of hydropower assets of the Federal Columbia River power system. Journal of Applied Water Engineering and Research, 2(1), 25-32.

Shiau, J. T., & Wu, F. C. (2013). Optimizing environmental flows for multiple reaches affected by a multipurpose reservoir system in Taiwan: Restoring natural flow regimes at multiple temporal scales. Water Resources Research, 49(1), 565-584.

Sindhya, K., Deb, K., & Miettinen, K. (2011). Improving convergence of evolutionary multi-objective optimization with local search: A concurrent-hybrid algorithm. Natural Computing, 10(4), 1407-1430.

Stratford, D. S., Pollino, C. A., & Brown, A. E. (2016). Modelling population responses to flow: The development of a generic fish population model. Environmental Modelling & Software, 79, 96-119.


Tsoukalas, I., & Makropoulos, C. (2015). Multiobjective optimisation on a budget: Exploring surrogate modelling for robust multi-reservoir rules generation under hydrological uncertainty. Environmental Modelling & Software, 69, 396-413.

Wang, J., & Liu, S. (2011). Quarter-hourly operation of hydropower reservoirs with pumped storage plants. Journal of Water Resources Planning and Management, 138(1), 13-23.

Wang, Y., & Boyd, S. (2010). Fast model predictive control using online optimization. IEEE Transactions on Control Systems Technology, 18(2), 267-278.

Wardlaw, R., & Sharif, M. (1999). Evaluation of genetic algorithms for optimal reservoir system operation. Journal of Water Resources Planning and Management, 125(1), 25-33.

Williams, M. M. R. (2015). Numerical solution of the Karhunen-Loeve integral equation with examples based on various kernels derived from the Nataf procedure. Annals of Nuclear Energy, 76, 19-26.

Woodruff, M. J., Simpson, T. W., & Reed, P. M. (2013). Diagnostic analysis of metamodels' multivariate dependencies and their impacts in many-objective design optimization. In Proceedings of the ASME 2013 International Design Engineering Technical Conferences & Computers and Information in Engineering Conference (IDETC/CIE 2013), Portland, Oregon.

Wurbs, R. A. (1993). Reservoir-system simulation and optimization models. Journal of Water Resources Planning and Management, 119(4), 455-472.

Xiu, D. (2010). Numerical Methods for Stochastic Computations: A Spectral Method Approach. Princeton University Press.


Yandamuri, S. R., Srinivasan, K., & Murty Bhallamudi, S. (2006). Multiobjective optimal waste load allocation models for rivers using nondominated sorting genetic algorithm-II. Journal of Water Resources Planning and Management, 132(3), 133-143.

Yang, J., & Leskovec, J. (2011). Patterns of temporal variation in online media. In Proceedings of the Fourth ACM International Conference on Web Search and Data Mining (pp. 177-186). ACM.

Zhang, Z., Pati, D., & Srivastava, A. (2015). Bayesian clustering of shapes of curves. Journal of Statistical Planning and Inference, 166, 171-186.

Zitzler, E., Deb, K., & Thiele, L. (2000). Comparison of multiobjective evolutionary algorithms: Empirical results. Evolutionary Computation, 8(2), 173-195.

Zitzler, E., Thiele, L., Laumanns, M., Fonseca, C. M., & Da Fonseca, V. G. (2003). Performance assessment of multiobjective optimizers: An analysis and review. IEEE Transactions on Evolutionary Computation, 7(2), 117-132.

Zoumas, C. E., Bakirtzis, A. G., Theocharis, J. B., & Petridis, V. (2004). A genetic algorithm solution approach to the hydrothermal coordination problem. IEEE Transactions on Power Systems, 19(3), 1356-1364.