gpu-accelerated stochastic predictive control of drinking ... · an optimization algorithm which...

12
1 GPU-accelerated stochastic predictive control of drinking water networks Ajay K. Sampathirao, Pantelis Sopasakis, Alberto Bemporad Fellow, IEEE and Panagiotis Patrinos Abstract—Despite the proven advantages of scenario-based stochastic model predictive control for the operational control of water networks, its applicability is limited by its considerable computational footprint. In this paper we fully exploit the structure of these problems and solve them using a proximal gradient algorithm parallelizing the involved operations. The proposed methodology is applied and validated on a case study: the water network of the city of Barcelona. Index Terms—Stochastic model predictive control (SMPC), Graphics processing units (GPU), Drinking water networks, Accelerated Proximal Gradient Method. I. I NTRODUCTION A. Motivation Water utilities involve energy-intensive processes, complex in nature (dynamics) and form (topology of the network), of rather large scale and with interconnected components, subject to uncertain water demands from the consumers and are required to supply water uninterruptedly. These challenges call for operational management technologies able to provide reliable closed-loop behavior in presence of uncertainty. In 2014, the IEEE Control Systems Society identified many aspects of the management of complex water networks as emerging future research directions [1]. Stochastic model predictive control is an advanced control scheme which can address effectively the above challenges and has already been used for the management of water networks [2], [3], power networks [4] and other uncertain systems. Stochastic model predictive control is also accom- panied by a wealth of theoretical results regarding its stability and recursive feasibility properties [5], [6]. However, unless restrictive assumptions are adopted regarding the form of the disturbances, such problems are known to be computationally intractable [3], [7]. In this paper we combine an accelerated dual proximal gradient algorithm with general-purpose graph- ics processing units (GPGPUs) to deliver a computationally feasible solution for the control of water networks. B. Background The pump scheduling problem (PSP) is an optimal control problem for determining an open-loop control policy for the A. K. Sampathirao, P. Sopasakis and A. Bemporad, at the time of the original submission of the manuscript and while it was under review, were with IMT Institute for Advanced Studies Lucca, Piazza San Francesco 19, 55100 Lucca, Italy. Emails: {a.sampathirao, p.sopasakis, a.bemporad}@imtlucca.it. P. Patrinos is with STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, KU Leuven, Department of Electrical En- gineering (ESAT), Kasteelpark Arenberg 10, 3001 Leuven, Belgium. Email: [email protected]. operation of a water network. Such open-loop approaches are known since the 80’s [8], [9]. More elaborate schemes have been proposed such as [10] where a nonlinear model is used along with a demand forecasting model to produce an optimal open-loop 24-hour-ahead policy. Recently, the problem was formulated as a mixed-integer nonlinear program to account for the on/off operation of the pumps [11]. Heuristic approaches using evolutionary algorithms, genetic algorithms, and simulated annealing have also appeared in the litera- ture [12]. However, a common characteristic and shortcoming of these studies is that they assume to know the future water demand and they do not account for the various sources of uncertainty which may alter the expected smooth operation of the network. The effect of uncertainty can be attenuated by feedback from the network combined with the optimization of a per- formance index taking into account the system dynamics and constraints as in PSP. This, naturally, gives rise to model predictive control (MPC) which has been successfully used for the control of drinking water networks [13], [14]. Recently, Bakker et al. demonstrated experimentally on five full-scale water supply systems that MPC will lead to a more efficient water supply and better water quality than a conventional level controller [15]. Distributed and decentralized MPC formula- tions have been proposed for the control of large-scale water networks [16], [17] while MPC has also been shown to be able to address complex system dynamics such as the Hazen- Williams pressure-drop model [18]. Most MPC formulations either assume exact knowledge of the system dynamics and future water demands [14], [17] or endeavor to accommodate the worst-case scenario [13], [19]– [21]. The former approach is likely to lead to adverse behavior in presence of disturbances which inevitably act on the system while the latter turns out to be too conservative as we will later demonstrate in this paper. When probabilistic information about the disturbances is available it can be used to refine the MPC problem formu- lation. The uncertainty is reflected onto the cost function of the MPC problem deeming it a random variable; in stochastic MPC (SMPC) the index to minimize is typically the expec- tation of such a random cost function under the (uncertain) system dynamics and state/input constraints [22], [23]. SMPC leads to the formulation of optimization problems over spaces of random variables which are, typically, infinite- dimensional. Assuming that disturbances follow a normal probability distribution facilitates their solution [7], [24], [25]; however, such an assumption often fails to be realistic. The normality assumption has also been used for the stochastic

Upload: others

Post on 26-Jun-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: GPU-accelerated stochastic predictive control of drinking ... · an optimization algorithm which makes use of the problem structure and sparsity. We exploit the structure of the problem,

1

GPU-accelerated stochastic predictive control ofdrinking water networks

Ajay K. Sampathirao, Pantelis Sopasakis, Alberto Bemporad Fellow, IEEE and Panagiotis Patrinos

Abstract—Despite the proven advantages of scenario-basedstochastic model predictive control for the operational controlof water networks, its applicability is limited by its considerablecomputational footprint. In this paper we fully exploit thestructure of these problems and solve them using a proximalgradient algorithm parallelizing the involved operations. Theproposed methodology is applied and validated on a case study:the water network of the city of Barcelona.

Index Terms—Stochastic model predictive control (SMPC),Graphics processing units (GPU), Drinking water networks,Accelerated Proximal Gradient Method.

I. INTRODUCTION

A. MotivationWater utilities involve energy-intensive processes, complex

in nature (dynamics) and form (topology of the network),of rather large scale and with interconnected components,subject to uncertain water demands from the consumers andare required to supply water uninterruptedly. These challengescall for operational management technologies able to providereliable closed-loop behavior in presence of uncertainty. In2014, the IEEE Control Systems Society identified manyaspects of the management of complex water networks asemerging future research directions [1].

Stochastic model predictive control is an advanced controlscheme which can address effectively the above challengesand has already been used for the management of waternetworks [2], [3], power networks [4] and other uncertainsystems. Stochastic model predictive control is also accom-panied by a wealth of theoretical results regarding its stabilityand recursive feasibility properties [5], [6]. However, unlessrestrictive assumptions are adopted regarding the form of thedisturbances, such problems are known to be computationallyintractable [3], [7]. In this paper we combine an accelerateddual proximal gradient algorithm with general-purpose graph-ics processing units (GPGPUs) to deliver a computationallyfeasible solution for the control of water networks.

B. BackgroundThe pump scheduling problem (PSP) is an optimal control

problem for determining an open-loop control policy for the

A. K. Sampathirao, P. Sopasakis and A. Bemporad, at the time of theoriginal submission of the manuscript and while it was under review, werewith IMT Institute for Advanced Studies Lucca, Piazza San Francesco19, 55100 Lucca, Italy. Emails: a.sampathirao, p.sopasakis,[email protected].

P. Patrinos is with STADIUS Center for Dynamical Systems, SignalProcessing and Data Analytics, KU Leuven, Department of Electrical En-gineering (ESAT), Kasteelpark Arenberg 10, 3001 Leuven, Belgium. Email:[email protected].

operation of a water network. Such open-loop approachesare known since the 80’s [8], [9]. More elaborate schemeshave been proposed such as [10] where a nonlinear modelis used along with a demand forecasting model to producean optimal open-loop 24-hour-ahead policy. Recently, theproblem was formulated as a mixed-integer nonlinear programto account for the on/off operation of the pumps [11]. Heuristicapproaches using evolutionary algorithms, genetic algorithms,and simulated annealing have also appeared in the litera-ture [12]. However, a common characteristic and shortcomingof these studies is that they assume to know the future waterdemand and they do not account for the various sources ofuncertainty which may alter the expected smooth operation ofthe network.

The effect of uncertainty can be attenuated by feedbackfrom the network combined with the optimization of a per-formance index taking into account the system dynamics andconstraints as in PSP. This, naturally, gives rise to modelpredictive control (MPC) which has been successfully used forthe control of drinking water networks [13], [14]. Recently,Bakker et al. demonstrated experimentally on five full-scalewater supply systems that MPC will lead to a more efficientwater supply and better water quality than a conventional levelcontroller [15]. Distributed and decentralized MPC formula-tions have been proposed for the control of large-scale waternetworks [16], [17] while MPC has also been shown to beable to address complex system dynamics such as the Hazen-Williams pressure-drop model [18].

Most MPC formulations either assume exact knowledge ofthe system dynamics and future water demands [14], [17] orendeavor to accommodate the worst-case scenario [13], [19]–[21]. The former approach is likely to lead to adverse behaviorin presence of disturbances which inevitably act on the systemwhile the latter turns out to be too conservative as we will laterdemonstrate in this paper.

When probabilistic information about the disturbances isavailable it can be used to refine the MPC problem formu-lation. The uncertainty is reflected onto the cost function ofthe MPC problem deeming it a random variable; in stochasticMPC (SMPC) the index to minimize is typically the expec-tation of such a random cost function under the (uncertain)system dynamics and state/input constraints [22], [23].

SMPC leads to the formulation of optimization problemsover spaces of random variables which are, typically, infinite-dimensional. Assuming that disturbances follow a normalprobability distribution facilitates their solution [7], [24], [25];however, such an assumption often fails to be realistic. Thenormality assumption has also been used for the stochastic

Page 2: GPU-accelerated stochastic predictive control of drinking ... · an optimization algorithm which makes use of the problem structure and sparsity. We exploit the structure of the problem,

2

control of drinking water networks aiming at delivering highquality of services – in terms of demand satisfaction – whileminimizing the pumping cost under uncertainty [2].

An alternative approach, known as scenario-based stochas-tic MPC, treats the uncertain disturbances as discrete randomvariables without any restriction on the shape of their dis-tribution [26]–[28]. The associated optimization problem inthese cases becomes a discrete multi-stage stochastic optimalcontrol problem [29]. Scenario-based problems can be solvedalgorithmically, however, their size can be prohibitively largemaking them impractical for control applications of waternetworks as pointed out by Goryashko and Nemirovski [20].This is demonstrated by Grosso et al. who provide a com-parison of the two approaches [3]. Although compressionmethodologies have been proposed – such as the scenario treegeneration methodology of Heitsch and Romisch [30] – multi-stage stochastic optimal control problems may still involve upto millions of decision variables.

Graphics processing units (GPUs) have been used for theacceleration of the algorithmic solution of various problems insignal processing [31], computer vision and pattern recogni-tion [32] and machine learning [33], [34] leading to a manifoldincrease in computational performance. To the best of theauthors’ knowledge, this paper is the first work in which GPUtechnology is used for the solution of a stochastic optimalcontrol problem.

There have been proposed parallelizable interior point algo-rithms for two-stage stochastic optimal control problems suchas [35]–[38] and an ad hoc interior point solver for multi-stageproblems [39]. However, interior point algorithms involvecomplex steps and are not suitable for an implementation onGPUs which can make the most of the capabilities of thehardware.

C. Contributions

In this paper we address the above challenges by devisingan optimization algorithm which makes use of the problemstructure and sparsity. We exploit the structure of the problem,which is dictated by the structure of the scenario tree, toparallelize the involved operations. The algorithm runs ona GPU hardware leading to a significant speed-up as wedemonstrate in Section V.

We first formulate a stochastic MPC problem using a linearflow-based hydraulic model of the water network while takinginto account the uncertainty which accompanies future waterdemands. We propose an accelerated dual proximal gradientalgorithm for the solution of the optimal control problem andreport results in comparison with a CPU-based solver.

Finally, we study the performance of the closed-loop systemin terms of quality of service and process economics using theBarcelona drinking water network as a case study. We showthat the number of scenarios can be used as a tuning parameterallowing us to refine our representation of uncertainty andtrade the economic operation of the network for reliabilityand quality of service.

D. Mathematical preliminaries

Let R = R∪+∞ denote the set of extended-real numbers.The set of of nonnegative integers k1, k1 + 1, . . . , k2, k2 ≥k1 is denoted by N[k1,k2]. For x ∈ Rn we define [x]+ to be thevector in Rn whose i-th element is max0, xi. For a matrixA ∈ Rn×m we denote its transpose by A′.

The indicator function of a set C ⊆ Rn is the extended-real valued function δ(·|C) : Rn → R and it is δ(x|C) = 0for x ∈ C and δ(x|C) = +∞ otherwise. Indicator functionsare used to encode constraints in the cost function of anoptimization problem. A function f : Rn → R is calledproper if there is a x ∈ Rn so that f(x) < ∞ andf(x) > −∞ for all x ∈ Rn. A proper convex functionf : Rn → R is called lower semi-continuous or closed iffor every x ∈ Rn, f(x) = lim infz→x f(z). For a properclosed convex function f : Rn → R, we define its conjugateas f∗(y) = supxy′x − f(x). We say that f is σ-stronglyconvex if f(x)− σ

2 ‖x‖22 is a convex function. Unless otherwise

stated, ‖ · ‖ stands for the Euclidean norm.

II. MODELING OF DRINKING WATER NETWORKS

A. Flow-based control-oriented model

Dynamical models of drinking water networks have beenstudied in depth the last two decades [14], [17], [40]. Flow-based models are derived from simple mass balance equationsof the network which lead to the following pair of equations

xk+1 = Axk +Buk +Gddk, (1a)0 = Euk + Eddk, (1b)

where x ∈ Rnx is the state vector corresponding to thevolumes of water in the storage tanks, the manipulated inputsu ∈ Rnu are the flow set-points which are provided asreferences to the pumping stations and valves of the network,d ∈ Rnd is the vector of water demands and k ∈ N indexes thediscrete time domain. The above dynamical model is derivedon the basis of mass balances: equation (1a) describes thetransportation of water from the tanks to the demand nodesover the network topology, while (1b) models the flow atmixing nodes. Equation (1a) forms a linear time-invariantsystem with additive uncertainty and (1b) is an algebraicinput-disturbance coupling equation with E ∈ Rne×nu andEd ∈ Rne×nd where ne is the number of junctions in thenetwork.

The maximum capacity of the tanks, the maximum pumpingcapacity of each pumping station and the limits on the flowset-points which correspond to the valves are described by thefollowing bounds:

umin ≤uk ≤ umax, (2a)xmin ≤xk ≤ xmax. (2b)

The above formulation has been widely used in the formu-lation of model predictive control problems for drinking waternetworks (DWNs) [2], [13], [17].

Page 3: GPU-accelerated stochastic predictive control of drinking ... · an optimization algorithm which makes use of the problem structure and sparsity. We exploit the structure of the problem,

3

B. Demand prediction model

The water demand is the main source of uncertainty thataffects the dynamics of the network. Various time seriesmodels have been proposed for the forecasting of future waterdemands such as seasonal Holt-Winters, seasonal ARIMA,BATS and SVM [13], [41]. Such models can be used to predictnominal forecasts of the upcoming water demand along ahorizon of N steps ahead using measurements available up totime k, denoted by dk+j|k. Then, the actual future demandsdk+j — which are unknown to the controller at time k — canbe expressed as

dk+j(εj) = dk+j|k + εj , (3)

where εj is the demand prediction error which is a randomvariable on a probability space (Ωj ,Fj ,Pj) and for conve-nience we define the tuple εj = (ε0, ε1, . . . , εj) which is arandom variable in the product probability space. We alsodefine dk = (dk|k, . . . dk+N−1|k).

Time [hr]

4370 4380 4390 4400 4410 4420 4430

Wa

ter

de

ma

nd

[m

3/s

]

0.01

0.015

0.02

0.025

0.03

0.035

0.04

0.045

0.05

Fig. 1. Collection of possible upcoming demands at a given time instant.These results were produced using the SVM model and the data in [13].

III. STOCHASTIC MPC FOR DWNS

In this section we define the control objectives for the con-trolled operation of a DWN and we formulate the stochasticMPC problem.

A. Control objectives

We define the following three cost functions which reflectour control objectives. The economic cost quantifies the pro-duction and transportation cost

`w(uk, k) = Wα(α1 + α2,k)′uk, (4)

where the term α′1uk is the water production cost, α′2,kukis the (time-varying) pumping (electricity) cost and Wα is apositive scaling factor. The cost for the operation of valves isnegligible compared to the cost of pumping so it has not beenaccounted for here.

The smooth operation cost is defined as

`∆(∆uk) = ∆u′kWu∆uk, (5)

where ∆uk = uk − uk−1 and Wu ∈ Rnu×nu is a symmetricpositive definite weight matrix. It is introduced to penalizeabrupt switching of the actuators (pumps and valves).

The safety storage cost penalizes the drop of water levelin the tanks below a given safety level. An elevation abovethis safety level ensures that there will be enough waterin unforeseen cases of unexpectedly high demand and alsomaintains a minimum pressure for the flow of water in thenetwork. This is given by

`S(xk) = Wx dist(xk | Cs), (6)

where dist(x | C) = infy∈C ‖x − y‖2 is the distance-to-setfunction, Cs = x | x ≥ xs, and xs ∈ Rnx is the safety leveland Wx is a positive scaling factor.

These cost functions have been used in many MPC formula-tions in the literature [13], [42]. A comprehensive discussionon the choice of these cost functions can be found in [14]and [43].

The total stage cost at a time instant k is the summation ofthe above costs and is given by

`(xk, uk, uk−1, k) = `w(uk, k) + `∆(∆uk) + `S(xk). (7)

B. SMPC formulation

We formulate the following stochastic MPC problem witha prediction horizon N ∈ N, N ≥ 1 and decision variablesπ = uk+j|k, xk+j+1|kj∈N[0,N−1]

V ?(p, q, dk, k) = minπ

EV (π, p, q, k), (8a)

where E is expectation operator and

V (π, p, q, k) =

N−1∑j=0

`(xk+j|k, uk+j|k, uk+j−1|k, k+j), (8b)

subject to the constraints

xk|k = p, uk−1|k = q, (8c)xk+j+1|k = Axk+j|k +Buk+j|k +Gddk+j|k(εj),

j ∈ N[0,N−1], εj ∈ Ωj , (8d)Euk+j|k + Eddk+j|k(εj) = 0, j ∈ N[0,N−1], εj ∈ Ωj , (8e)xmin ≤ xk+j|k ≤ xmax, j ∈ N[1,N ], (8f)umin ≤ uk+j|k ≤ umax, j ∈ N[0,N−1], (8g)

where we stress out that the decision variables uk+j|kj=N−1j=0

are required to be causal control laws of the form

uk+j|k = ϕk+j|k(p, q, xk+j|k, uk+j−1|k, εj). (8h)

Solving the above problem would involve the evaluation ofmulti-dimensional integrals over an infinite-dimensional spacewhich is computationally intractable. Hereafter, however, weshall assume that all Ωj , for j ∈ N[0,N−1], are finite sets. Thisassumption will allow us to restate (8) as a finite-dimensionaloptimization problem.

C. Scenario trees

A scenario tree is the structure which naturally follows fromthe finiteness assumption of Ωj and is illustrated in Fig. 3. Ascenario tree describes a set of possible future evolutions ofthe state of the system known as scenarios. Scenario trees canbe constructed algorithmically from raw data as in [30].

Page 4: GPU-accelerated stochastic predictive control of drinking ... · an optimization algorithm which makes use of the problem structure and sparsity. We exploit the structure of the problem,

4

Fig. 2. The closed-loop system with the proposed stochastic MPC controllerrunning on a GPU device.

Fig. 3. Scenario tree describing the possible evolution of the system statealong the prediction horizon: Future control actions are decided in a non-anticipative (causal) fashion; for example u2

1 is decided as a function of ε21but not of any of εi2, i ∈ N[1,µ(3)].

The nodes of a scenario tree are partitioned in stages. The(unique) node at stage k = 0 is called root and the nodes atthe last stage are the leaf nodes of the tree. We denote thenumber of leaf nodes by ns. The number of nodes at stage kis denoted by µ(k) and the total number of nodes of the tree isdenoted by µ. A path connecting the root node with a leaf nodeis called a scenario. Non-leaf nodes define a set of children;at a stage j ∈ N[0,N−1] for i ∈ N[1,µ(j)] the set of childrenof the i-th node is denoted by child(j, i) ⊆ N[1,µ(j+1)]. Atstage j ∈ N[1,N ] the i-th node i ∈ N[1,µ(j)] is reachable froma single node at stage k − 1 known as its ancestor which isdenoted by anc(j, i) ∈ N[1,µ(j−1)].

The probability of visiting a node i at stage j starting fromthe root is denoted by pij . For all for j ∈ NN we have that∑µ(j)i=1 p

ij = 1 and for all i ∈ N[1,µ(k)] it is

∑l∈child(j,i) p

lj+1 =

pij .

We define the maximum branching factor at stage j, bj ,to be the maximum number of children of the nodes at thisstage. The maximum branching factor serves as a measure ofthe complexity of the tree at a given stage.

D. Reformulation as a finite-dimensional problem

We shall now exploit the above tree structure to reformu-late the optimal control problem (8) as a finite-dimensionalproblem. The water demand, given by (3), is now modeled as

dik+j|k = dk+j|k + εij , (9)

for all j ∈ N[0,N−1] and i ∈ N[1,µ(j+1)]. The input-disturbancecoupling (8e) is then readily rewritten as

Euik+j|k + Eddik+j|k = 0, (10)

for j ∈ N[0,N−1] and i ∈ N[1,µ(j+1)].The system dynamics is defined across the nodes of the tree

by

xlk+j+1|k = Axik+j|k +Bulk+j|k +Gddlk+j|k, (11)

for j ∈ N[0,N−1], i ∈ N[1,µ(j)] and l ∈ child(j, i), or,alternatively,

xik+j+1|k = Axanc(j+1,i)k+j|k +Buik+j|k +Gdd

ik+j|k, (12)

for j ∈ N[0,N−1] and i ∈ N[1,µ(j+1)].Now the expectation of the objective function (8b) can be

derived as a summation across the tree nodes

EV (π, p, q, k) =

N−1∑j=0

µ(j)∑i=1

pij`(xik+j|k, u

ik+j|k, u

anc(j,i)k+j−1|k, k + j), (13)

where x1k|k = p and uk−1|k = q.

In order to guarantee the recursive feasibility of the controlproblem, the state constraints (8f) are converted into softconstraints, that is, they are replaced by a penalty of the form

`d(x) = γd dist(x, C1), (14)

where γd is a positive penalty factor and C1 = x | xmin ≤x ≤ xmax. Using this penalty, we construct the soft stateconstraint penalty

Vs(π, p) =

N∑j=0

µ(j)∑i=1

`d(xik+j|k). (15)

The modified, soft-constrained, SMPC problem can be nowwritten as

V ?(p, q, d, k) = minπ

EV (π, p, q, k) + Vs(π, p), (16a)

subject to

x1k|k = p, uk−1|k = q, (16b)

umin ≤ uik+j|k ≤ umax, j ∈ N[0,N−1], i ∈ N[1,µ(j)], (16c)

and system equations (10) and (12).

IV. SOLUTION OF THE STOCHASTIC OPTIMAL CONTROLPROBLEM

In this section we extend the GPU-based proximal gradientmethod proposed in [44] to solve the SMPC problem (16). Forease of notation we will focus on the solution of the SMPCproblem at k = 0 and denote xj|0 = xj , uj|0 = uj , dj|0 = dj .

Page 5: GPU-accelerated stochastic predictive control of drinking ... · an optimization algorithm which makes use of the problem structure and sparsity. We exploit the structure of the problem,

5

A. Proximal gradient algorithm

For a closed, proper extended-real valued function g : Rn →R, we define its proximal operator with parameter γ > 0,proxγg : Rn → Rn as [45]

proxγg(v) = argminx∈Rn

g(x) + 1

2γ ‖x− v‖22

. (17)

The proximal operator of many functions is available in closedform [45], [46]. When g is given in a separable sum form, thatis

g(x) =

κ∑i=1

gi(xi), (18a)

then, for all i ∈ N[1,κ],

(proxγg(v))i = proxγgi(vi). (18b)

This is known as the separable sum property of the proximaloperator.

Let z ∈ Rnz be a vector encompassing all states xij forj ∈ N[0,N ] and i ∈ N[1,µ(j)] and inputs uij for j ∈ N[0,N−1],i ∈ N[1,µ(j+1)]; this is the decision variable of problem (16).

Let f : Rnz → R be defined as

f(z) =

N−1∑j=0

µ(j)∑i=1

pij(`w(uij) + `∆(∆uij))

+ δ(uij |Φ1(dij))

+ δ(xij+1, uij , x

anc(j+1,i)j |Φ2(dij)), (19)

where ∆uij = uij − uanc(j,i)j−1 and Φ1(d) is the affine subspace

of Rnu induced by (10), that is

Φ1(d) = u : Eu+ Edd = 0, (20)

and Φ2(d) is the affine subspace of R2nx+nu defined by thesystem dynamics (12)

Φ2(d) = (xk+1, xk, u) : xk+1 = Axk +Bu+Gdd. (21)

We define the auxiliary variable ζ which serves as a copyof the state variable xij — that is we require ζij = xij . Thereason for the introduction of this copy will be clarified inSection IV-C.

We introduce the variable t = (ζ, x, u) ∈ Rnt and definean extended real valued function g : Rnt → R as

g(t) =

N−1∑j=0

µ(j)∑i=1

`S(xij+1) + `d(ζij+1) + δ(uij |U), (22)

where U = u ∈ Rnu : umin ≤ u ≤ umax.Now the finite-dimensional optimization problem (16) can

be written as:

V ? = minz,t

f(z) + g(t), (23a)

s.t. Hz = t, (23b)

where

H =

Inx0

Inx 00 Inu

. (24)

The Fenchel dual of (23) is written as [47, Corol. 31.2.1]:

D? = minyf∗(−H ′y) + g∗(y), (25)

where y is the dual variable. The dual variable y can bepartitioned as y = (ζij+1, x

ij+1, u

ij), where ζij+1, xij+1 and uij

are the dual variables corresponding to ζij+1, xij+1 and uij re-spectively. We also define the auxiliary variable ξij := ζij + xij .

According to [48, Thm. 11.42], since function f(z)+g(Hz)is proper, convex and piecewise linear-quadratic, then theprimal problem (23) is feasible whenever the dual prob-lem (25) is feasible and, furthermore, strong duality holds,i.e., V ? = D?. Moreover, the optimal solution of (23) isgiven by z? = ∇f∗(−H ′y?) where y? is any solution of (25).Applying [48, Prop. 12.60] to f∗ and since f is lower semi-continuous, proper and σ-strongly convex — as shown atthe end of Appendix A — its conjugate f∗ has Lipschitz-continuous gradient with a constant 1/σ.

An accelerated version of proximal-gradient method whichwas first proposed by Nesterov in [49] is applied to the dualproblem. This leads to the following algorithm

wν = yν + θν(θ−1ν−1 − 1)(yν − yν−1), (26a)

zν = argminz〈z,H ′wν〉+ f(z), (26b)

tν = proxλ−1g(λ−1wν +Hzν), (26c)

yν+1 = wν + λ(Hzv − tv), (26d)

θν+1 = 12

(√θ4ν + 4θ2

ν − θ2ν

), (26e)

starting from a dual-feasible vector y0 = y−1 = 0 and θ0 =θ−1 = 1. In (26), λ > 0 is a constant step length which willbe discussed in Section IV-D.

It is worth noting that Algorithm 26 may be interpreted asan accelerated version of [50].

In the first step (26a) we compute an extrapolation of thedual vector. In the second step (26b) we calculate the dualgradient, that is zν = ∇f∗(−H ′wν), at the extrapolateddual vector using the conjugate subgradient theorem [47,Thm. 23.5]. The third step comprises of (26c), (26d) where weupdate the dual vector y and in the final step of the algorithmwe compute the scalar θν which is used in the extrapolationstep.

This algorithm has a convergence rate of O(1/ν2) for thedual iterates as well as for the ergodic primal iterate definedthrough the recursion zν = (1 − θν)z(ν−1) + θνz

ν , i.e., aweighted average of the primal iterates [51].

The splitting given in (23), with this particular choice off and g, is not unique; for example Shapiro et al. haveproposed a dual decomposition scheme where the scenariotree is decomposed into a set of independent scenarios whilethe tree structure is imposed through a linear equation ofthe form (23b) [29]. However, such a splitting may increaseconsiderably the total number of variables without facilitatingthe implementation or offering some advantage in terms ofconvergence speed.

B. Computation of primal iterateThe most critical step in the algorithm is the computation of

zν which accounts for most of the computation time required

Page 6: GPU-accelerated stochastic predictive control of drinking ... · an optimization algorithm which makes use of the problem structure and sparsity. We exploit the structure of the problem,

6

by each iteration. This step boils down to the solution ofan unconstrained optimization problem by means of dynamicprogramming where certain matrices (which are independentof wν) can be computed once before we run the algorithmto facilitate the online computations. These are (i) the vectorsβij , u

ij , e

ij which are associated with the update of the time-

varying cost (see Appendix A) and (ii) the matrices Λ,Φ,Ψ, B(see Appendix B). The latter are referred to as the factor stepof the algorithm and matrices Λ,Φ,Ψ and B are independentof the complexity of the scenario tree.

The computation of zν at each iteration of the algorithmrequires the computation of the aforementioned matrices andis computed using Algorithm 1 to which we refer as thesolve step. Computations involved in the solve step are merelymatrix-vector multiplications. As the algorithm traverses thenodes of the scenario tree stage-wise backwards (from stageN − 1 to stage 0), computations across the nodes at a givenstage can be performed in parallel. Hardware such as GPUswhich enable us to parallelizable such operations lead to agreat speed-up as we demonstrate in Section V.

Algorithm 1 Solve stepRequire: Output of the factor step (See Appendices A and B),

i.e., Λ,Φ,Ψ, B, uij , βij , e

ij , p, q and wν = (ζij+1, x

ij+1, u

ij).

qiN ← 0, and riN ← 0,∀i ∈ N[1,ns],for j = N − 1, . . . , 0 do

for i = 1, . . . , µ(k) do in parallelσlj ← rlj+1 + βlj ,∀l ∈ child(j, i)

vlj ← 12plj

(Φlj(ξ

lj+1 + qlj+1) + Ψl

j ulj + Λljσ

lj

),

∀l ∈ child(j, i)rij ←

∑l∈child(j,i) σ

lj + B′(ξlj+1 + qlj+1) + Lulj

qij ← A′∑l∈child(j,i) ξ

lj+1 + qlj+1

end forend forx1

0 ← p, u−1 ← q,for j = 0, . . . , N − 1 do

for i = 1, . . . , µ(k) do in parallelvij ← v

anc(j,i)j−1 + vij

uij ← Lvij + uijxij+1 ← Ax

anc(j,i)j + Bvij + eij

end forend forreturn xijNj=1, uij

N−1j=0

The dynamic programming approach which gives rise toAlgorithm 1 is equivalent to taking the KKT optimalityconditions of (26b) and formulating the co-state dynamicalequations. Algorithm 1 is also reminiscent of the Riccati-typerecursion in [51].

C. Computation of dual iterate

Function g given in (22) is given in the form of a separablesum

g(t) = g(ζ, x, u) = g1(ζ) + g2(x) + g3(u), (27)

where

g1(ζ) =

N−1∑j=0

µ(j)∑i=1

`S(ζij+1), (28a)

g2(x) =

N−1∑j=0

µ(j)∑i=1

`d(xij+1), (28b)

g3(u) =

N−1∑j=0

µ(j)∑i=1

δ(uij | U). (28c)

Functions g1(·) and g2(·) are in turn separable sums ofdistance functions from a set and g3(·) is an indicator func-tion. Their proximal mappings can be easily computed as inAppendix C and essentially are element-wise operations onthe vector t that can be fully parallelized.

D. Preconditioning and choice of λFirst-order methods are known to be sensitive to scaling and

preconditioning can remarkably improve their convergencerate. Various preconditioning method such as [52], [53] havebeen proposed in the literature. Additionally, a parallelizablepreconditioning method tailored to stochastic programs for usewith interior point solvers has been proposed in [54]. Here, weemploy a simple diagonal preconditioning which consists incomputing a diagonal matrix HD with positive diagonal entrieswhich approximates the dual Hessian HD and use H−1/2

D toscale the dual vector [55, 2.3.1]. Since the uncertainty does notaffect the dual Hessian, we take this preconditioning matrixfor a single branch of the scenario tree and use it to scale alldual variables.

In a similar way, we compute the parameter λ. We chooseλ = 1/LHD

where LHDis the Lipschitz constant of the dual

gradient which is computed as ‖H‖2/σ as in [55]. It againsuffices to perform the computation for a single branch of thescenario tree.

E. TerminationThe termination conditions for the above algorithm are

based on the ones provided in [51]. However, rather thanchecking these conditions at every iteration, we performalways a fixed number of iterations which is dictated bythe sampling time. We may then check the quality of thesolution a posteriori in terms of the duality gap and the term‖Hzν − tν‖∞.

V. CASE STUDY: THE BARCELONA DWNWe now apply the proposed control methodology to the

drinking water network of the city of Barcelona using thedata provided in [2], [13]. The topology of the network ispresented in Figure 4. The system model consists of 63 statescorresponding to the level of water in each tank, 114 controlinputs which are pumping actions and valve positions, 88demand nodes and 17 junctions. The prediction horizon isN = 24 with sampling time of 1 hour. The future demands arepredicted using the SVM time series model developed in [13].

In our subsequent analysis the initial state of the system,x0 = p, is selected uniformly randomly between xs and xmax.

Page 7: GPU-accelerated stochastic predictive control of drinking ... · an optimization algorithm which makes use of the problem structure and sparsity. We exploit the structure of the problem,

7

Fig. 4. Structure of the DWN of Barcelona.

A. Performance of GPU-accelerated algorithm

The accelerated proximal gradient (APG) algorithm wasimplemented in CUDA-C v6.0 and the matrix-vector computa-tions were performed using cuBLAS. We compared the GPU-based implementation with the interior-point solver of Gurobi.Active-set algorithms exhibited very poor performance and wedid not include the respective results.

All computations on CPU were performed on a 4×2.60GHzIntel i5 machine with 8GB of RAM running 64-bit Ubuntuv14.04 and GPU-based computations were carried out on anNVIDIA Tesla C2075.

The relative tolerance of Gurobi was set to 2 · 10−2 insteadof the default tolerance of 10−8. The dependence of thecomputation time on the size of the scenario tree is reported inthe Figure 5 where it can be noticed that APG running on GPUcompared to Gurobi is 10 to 25 times faster. Furthermore, thespeed-up increases with the number of scenarios as we maytell by looking at the inset of Figure 5. The runtimes shownare averages over 100 random initial points x0.

The optimization problems we are solving here are ofnoticeably large size. Indicatively, the scenario tree with 493scenarios counts approximately 2.52 million dual decisionvariables (1.86 million primal variables) and while Gurobirequires 860s to solve it, our CUDA implementation solvesit in 58.8s; this corresponds to a speed-up of 14.6.

In all of our simulations we obtained a sequence of controlactions across the tree nodes U?apg = uij which was,element-wise, within ±0.029m3/s (1.9%) of the solution pro-duced by Gurobi with relative tolerance 10−8. Moreover, weshould note that the control action u?0 computed by APG with500 iterations was consistently within ±0.0025m3/s (0.08%)of the Gurobi solution. Given that only u?0 is applied to thesystem while all other control actions uij for j ∈ N[1,N−1] andi ∈ N[1,µ(j)] are discarded, 500 iterations are well sufficientfor convergence.

B. Closed-loop performance

In this section we analyse the performance of SMPC withdifferent scenario-trees. This analysis is carried for a periodof 7 days (Hs = 168) from 1st to 8th July 2007. Here, wecompare the operational cost and the quality of service ofvarious scenario-tree structures.

0 100 200 300 400 500

# scenarios

0

200

400

600

800

run

tim

e (

s)

Gurobi APG

0 5000

100

200

300

slo

pe

Fig. 5. Runtime of the CUDA implementation against the number of scenariosconsidered in the optimization problem. Comparison with the runtimes ofGurobi.

The weighting matrices in the operational cost are chosenas Wα = 2 · 104, Wu = 105 · I and Wx = 107, respectivelyand γd = 5 · 107. The demand is predicted using SVM modelpresented in [13]. The steps involved in SMPC using GPUbased APG in closed-loop is summarized in Algorithm 2.

Algorithm 2 Closed-loop of DWN with SMPC with proximal-operatorRequire: Scenario tree, current state measurement x0 and

previous control u−1.Compute Λ, Φ, Ψ and B as in Appendix BPrecondition the original optimization problem and computeλ as in Section IV-D.loop

Step 1. Predict the future water demands dk using currentand past demand data.Step 2. Compute uij , β

ij , e

ij as in Appendix A.

Step 3. Solve the optimization problem using APG onGPU using iteration (26) and Algorithm 1.Step 4. Apply u1

0 to the system, update u−1 = u10

end loop

For the performance assessment of the proposed controlmethodology we used various controllers summarized in Ta-ble I. The corresponding computation times are presented inFigure 5.

To assess the performance of closed-loop operation ofthe SMPC-controlled network we used the key performanceindicators (KPIs) reported in [2], [3], [56]. For a simulation

Page 8: GPU-accelerated stochastic predictive control of drinking ... · an optimization algorithm which makes use of the problem structure and sparsity. We exploit the structure of the problem,

8

TABLE IVARIOUS CONTROLLERS USED TO ASSESS THE CLOSED-LOOP

PERFORMANCE OF THE PROPOSED METHODOLOGY. THE NUMBERS IN THEBRACKETS DENOTE THE FIRST MAXIMUM BRANCHING FACTORS, bj , OF

THE SCENARIO TREE WHILE ALL SUBSEQUENT BRANCHING FACTORS AREEQUAL TO 1.

Controller bk scenarios primal variables dual variablesCE-MPC 1 1 4248 5760SMPC1 [3, 2] 6 24072 32540SMPC2 [6, 5] 30 118059 160080SMPC3 [6, 5, 5] 114 430287 583440SMPC4 [8, 5, 5] 146 551355 747600SMPC5 [10, 8, 5] 242 915621 1241520SMPC6 [12, 8, 5] 303 1145544 1553280SMPC7 [12, 8, 8] 404 1520961 2062320SMPC8 [12, 10, 8] 493 1856022 2516640

time length Hs the performance indicators are computed by

KPIE =1

Hs

Hs∑k=1

(α1 + α2,k)′|uk|, (29a)

KPI∆U =1

Hs

Hs∑k=1

‖∆uk‖2, (29b)

KPIS =

Hs∑k=1

‖[xs − xk]+‖1, (29c)

KPIR =‖xs‖1

1Hs

∑Hs

k=1 ‖xk‖1· 100%. (29d)

KPIE is the average economic cost, KPI∆U measures theaverage smoothness of the control actions, KPIS correspondsto the total amount of water used from storage, that is theamount of water that is removed from a tank when the volumein that tank is below the safe level xs, and KPIR is thepercentage of the safety volume xs contained into the averagevolume of water.

TABLE IIKPIS FOR PERFORMANCE ANALYSIS OF THE DWN WITH DIFFERENT

CONTROLLERS. THE LOWEST AND THE HIGHEST VALUES IN EACHCOLUMN ARE HIGHLIGHTED.

Controller KPIE KPI∆U KPIS KPIRCE-MPC 1801.4 0.2737 6507.7 64.89%SMPC1 1633.5 0.3896 1753.7 67.96%SMPC2 1549.7 0.4652 2264.0 61.81%SMPC3 1574.0 0.4135 1360.0 49.65%SMPC4 1583.2 0.4088 885.7 48.13%SMPC5 1597.3 0.4470 508.5 46.05%SMPC6 1606.3 0.4878 302.3 44.93%SMPC7 1646.8 0.5189 285.4 47.89%SMPC8 1636.2 0.5068 283.9 47.41%

1) Risk vs Economic utility: Figure 6 illustrates the trade-off between economic and safe operation: The more scenarioswe use to describe the distribution of demand predictionerror, the safer the closed-loop operation becomes as it isreflected by the decrease of KPIS . Stochastic MPC leadsto a significant decrease of economic cost compared to thecertainty-equivalence approach, however, the safer we requirethe operation to be, the higher the operating cost we shouldexpect. For example, if the designer opts for 30 scenarios,

they will have struck a low operating cost which, nevertheless,comes with a high value of KPIS , that is, operation under highrisk. In order to decrease this risk, one needs to consider ahigher number of scenarios which comes at a higher operatingcost. We may also observe that for too few scenarios, theoperation of the network will be both expensive and will incura rather high risk.

2) Quality of service: A measure of the reliability andquality-of-service of the network is KPIS which reflects thetendency of water levels to drop under the safety storage levels.As expected, the CE-MPC controller leads to the most unsafeoperation, whereas SMPC8 leads to the lowest value.

3) Network utility: Network utility is defined as the abilityto utilize the water in the tanks to meet the demands ratherthan pumping additional water and is quantified by KPIR.In Table II, we see the dependence of KPIR on the numberof scenarios of the tree. The decrease in KPIR one mayobserve is because the more scenarios are introduced, the moreaccurate the representation of uncertainty becomes and thesystem does not need to operate, on average, too far awayfrom xs. Of course, when operating close to xs, we need totake into account the value of KPIS to check whether theoperation violates the safety storage limit.

4) Smooth operation: We may notice that the introductionof more scenarios results in an increase in KPI∆U . Then, thecontroller becomes more responsive to accommodate the needfor a less risky operation, although the value of KPI∆U is notgreatly affected by the number of scenarios.

0 100 200 300 400 500

Number of scenarios

1.5

1.55

1.6

1.65

1.7

1.75

1.8

1.85

KP

I E/1

00

0

0

1

2

3

4

5

6

7

KP

I S/1

00

0

KPIE

KPIS

Fig. 6. The figure shows the trade-off between risk and economic utilityin terms of scenarios. The KPIE represent the economical utility and KPISshows the risk of violation.

In Figure 7 we show two pumping actions during 168h(1 week) of operation. We may observe the tendency of thecontroller to be thrifty with pumping when the correspondingpumping cost is high.

C. Implementation details

At every time instant k we need to load onto the GPU thestate measurement and a sequence of demand predictions (seeFigure 2), that is dk. This amounts to 8.4kB and is rapidly

Page 9: GPU-accelerated stochastic predictive control of drinking ... · an optimization algorithm which makes use of the problem structure and sparsity. We exploit the structure of the problem,

9

20 40 60 80 100 120 140 160

Co

ntr

ol a

ctio

n [

%]

0

0.2

0.4

0.6

0.8

Time [hr]

20 40 60 80 100 120 140 160

Wa

ter

Co

st

[e.u

.]

60

70

80

90

Fig. 7. Pumping actions using SMPC4 (expressed in % of umax) and thecorresponding weighted time-varying cost Wαα2,k in economic units.

uploaded on the GPU (less than 0.034ms). In case we need toupdate the scenario-tree values, that is εk, and for the case ofSMPC8 we need to upload 3.52MB which is done in 3.74ms.Therefore, the time needed to load these data on the GPU isnot a limiting factor.

The parallelization of matrix-vector multiplications re-quired in Algorithm 1 are implemented using methodcublasSgemmBatched of cuBLAS. Vector additions areperformed using cublasSaxpy and summations over the setof children of a node were done using a custom kernel.

Compared to a MATLAB implementation of APG, thepresented GPU implementation was found to be 60 timesfaster for SMPC7 and 95 times faster for SMPC8. How-ever, we believe that the comparison to a MATLAB im-plementation is not totally fair and a comparison to anoptimized C implementation is beyond the scope of thiswork. It is evident that GPU technology enables very highspeed-ups, notwithstanding. The source code of our imple-mentation is publicly available with an LGPL v3 licenceat https://github.com/ajaykumarsampath/GPU-APG-water-network.

VI. CONCLUSIONS AND FUTURE WORK

In this paper we have presented a framework for theformulation of a stochastic model predictive control problemfor the operational management of drinking water networksand we have proposed a novel approach for the efficientnumerical solution of the associated optimization problem on aGPU. The proposed algorithm achieves remarkably high speedas we fully exploit the structure of the optimization problemand we parallelize its execution to a large extent. At the sametime, it involves only matrix-vector operations and it is easyto implement.

We demonstrated the computational feasibility of the algo-rithm and the benefits for the operational management of thesystem in terms of performance (which we quantified usingcertain KPIs from the literature). Given that the proposedalgorithmic scheme can solve linear stochastic optimal controlproblems so fast, future research should focus on the use

of nonlinear dynamical models accounting for the pressuredrops across the network. We believe that the algorithmicdevelopments presented in this paper pave the way for theapplication of stochastic model predictive control methodolo-gies to drinking water networks.

APPENDIX AELIMINATION OF INPUT-DISTURBANCE COUPLING

In this section we discuss how the input-disturbance equalityconstraints can be eliminated by a proper change of inputvariables and we compute the parameters βij , u

ij , e

ij ∀i ∈

N[1,µ(j)], j ∈ N[0,N ] which are then provided as input toAlgorithm 1. These depend on the nominal demand forecastsdk+j|k and on the time-varying economic cost parametersα2,k+j for j ∈ N[0,N−1], therefore, they need to be updatedat every time instant k.

The affine space Φ1(d) introduced in (20) can be written as

Φ1(d) = v ∈ Rnv : u = Lv + u(d), (30)

where L ∈ Rnu×nv is a full rank matrix whose range spansthe nullspace of E, i.e., for every v ∈ Rnv , we have Lv is inthe kernel of E and u(d) satisfies Eu(d) + Edd = 0.

Substituting uij = Lvij + uij , ∀i ∈ N[1,µ(j)], j ∈ N[0,N ] inthe manifold defined by the system dynamics Φ2(d) as in (21)gives

Φ2(d) =(xj+1, xj , v) : xj+1 = Axj + Bv + e,

B = BL, e = Bu+Gdd, (31)

and we defineeij = Buij +Gdd

ij . (32)

Now the cost in (19) is transformed as:N−1∑j=0

µ(j)∑i=1

pij(`w(uij) + `∆(∆uij)) =

N−1∑j=0

µ(j)∑i=1

pij(`w(vij) + `∆(∆vij , u

ij)), (33)

where

R = WuL, (34a)

R = L′R, (34b)αj = Wα(α1 + α2,j+k)L, (34c)

`w(vij) = α′jvij , (34d)

∆vij = vij − vanc(j,i)j−1 , (34e)

∆uij = uij − uanc(j,i)j−1 , (34f)

`∆(∆vij ,∆uij) = ∆vijR∆vij + 2∆ui′j R∆vij . (34g)

By substituting and expanding ∆vij and ∆uij in `∆(∆vij ,∆uij)

the cost in (34g) becomes

N−1∑j=0

µ(j)∑i=1

pij(`w(uij) + `∆(∆uij)) =

N−1∑j=0

µ(j)∑i=1

pijvi′j Rv

ij − 2pijv

anc(j,i)′j−1 Rvij + βi′j v

ij , (35)

Page 10: GPU-accelerated stochastic predictive control of drinking ... · an optimization algorithm which makes use of the problem structure and sparsity. We exploit the structure of the problem,

10

where

pij = pij +∑

l∈child(j,i)

plj+1, (36a)

βij = pijαj + 2pijR(pij u

ij − u

anc(j,i)j−1 −∑

l∈child(j,i)

plj+1ulj+1

). (36b)

Now uij , eij , β

ij are calculated by (30), (32) and (36b) respec-

tively. Using our assumption that L is full-rank, we can seethat R is a positive definite and symmetric matrix, therefore,f is strongly convex.

APPENDIX BFACTOR STEP

Algorithm 1 solves the unconstrained minimization prob-lem (26b), that is

z? = argminz〈z,H ′y〉+ f(z), (37)

where z = xij+1, uij, y = ζij+1, x

ij+1, u

ij for i ∈ N[1,µ(j)]

and j ∈ N[0,N−1], f(z) is given by (19) and H is givenby (24). Substituting H the optimization problem becomes

z? = argminz

N−1∑j=0

µ(j)∑i=1

pij(`w(uij) + `∆(∆uij))

+ ξi′j+1xij+1 + ui′j u

ij + δ(uij |Φ1(dij))

+ δ(xij+1, uij , x

anc(j+1,i)j |Φ2(dij)), (38)

where ξij := xij + ζij .The input-disturbance coupling constraints imposed by

δ(uij |Φ1(dij)) in the above problem are eliminated as discussedin Appendix A. This changes the input variable from uij to vijgiven by (30) and the cost function as in (35). We, therefore,replace the decision variable z with z := xij , vij and theoptimization problem (38) reduces to

z? = argminz

N−1∑j=0

µ(j)∑i=1

pijvi′j Rv

ij − 2pijv

anc(j,i)′j−1 Rvij

+ βi′j vij + ξi′j+1x

jj+1 + ui′j Lv

ij

+ δ(xij+1, vij , x

anc(k,i)j |Φ2(dij)), (39)

where uij = Lvij + uij .The above problem is an unconstrained optimization prob-

lem with quadratic stage cost which is solved using dynamicprogramming [57]. This method transforms the complex prob-lem into a sequence of sub-problems solved at each stage.

Using dynamic programming we find that the transformedcontrol actions vi?j have to satisfy the recursive formula

vi?j = vanc(j,i)j−1 +

1

2pij

(Φ(ξij+1 + qij+1) + Ψuij

+ Λ(βij + rij+1)), (40)

where

Λ = −R−1, (41a)Φ = ΛB′, (41b)Ψ = ΛL. (41c)

Matrix R is symmetric and positive definite, therefore, we cancompute once its Cholesky factorization so that we obviate thecomputation of its inverse.

The qij+1, rij+1 in (40) correspond to the linear cost termsin the cost-to-go function at node i of stage j+ 1. At stage j,these terms are updated by substituting the vi?j as:

rsj =∑

l∈child(j−1,s)

σlj + B′(ξlj+1 + qlj+1) + Lulj , (42a)

qsj = A′∑

l∈child(j−1,s)

ξlj+1 + qlj+1, (42b)

where s = anc(j, i).Equations (40) and (42) form the solve step as in Algo-

rithm 1. Matrices Λ, Φ and Ψ are required to be computedonce.

APPENDIX CPROXIMAL OPERATORS

Function g in (27) is a separable sum of distance andindicator functions and its proximal is computed accordingto (18). The proximal operator of the indicator of a convexclosed set C, that is

χC(x) =

0, if x ∈ C+∞, otherwise

(43)

is the projection operator onto C, i.e.,

proxλχC(v) = projC(v) = argmin

y∈C‖v − y‖. (44)

When g is the distance function from a convex closed setC, that is

g(x) =µdist(x | C) = infy∈C

µ‖x− y‖

=µ‖x− projC(x)‖, (45)

then proximal operator of g given by [46]

proxλg (v) =

v + projC(v)−v

dist(v|C) , if dist(v | C) > λµ

projC(v), otherwise(46)

ACKNOWLEDGMENT

This work was financially supported by the EU FP7 researchproject EFFINET “Efficient Integrated Real-time Monitoringand Control of Drinking Water Networks,” grant agreementno. 318556.

Page 11: GPU-accelerated stochastic predictive control of drinking ... · an optimization algorithm which makes use of the problem structure and sparsity. We exploit the structure of the problem,

11

REFERENCES

[1] V. Havlena, P. Trnka, and B. Sheridan, “Management of complexwater networks,” in The Impact of Control Technology (T. Samad andA. Annaswamy, eds.), IEEE Control Systems Society, 2nd ed., 2014.available at www.ieeecss.org.

[2] J. Grosso, C. Ocampo-Martınez, V. Puig, and B. Joseph, “Chance-constrained model predictive control for drinking water networks,”Journal of Process Control, vol. 24, no. 5, pp. 504 – 516, 2014.

[3] J. Grosso, J. Maestre, C. Ocampo-Martinez, and V. Puig, “On the assess-ment of tree-based and chance-constrained predictive control approachesapplied to drinking water networks,” in 19th IFAC Conference, (Capetown, South Africa), pp. 6240–6245, Aug. 2014.

[4] C. Hans, P. Sopasakis, A. Bemporad, J. Raisch, and C. Reincke-Collon, “Scenario-based model predictive operation control of islandedmicrogrids,” in 54 IEEE Conf. Decision and Control, (Osaka, Japan),Dec 2015.

[5] P. Sopasakis, D. Herceg, P. Patrinos, and A. Bemporad, “Stochasticeconomic model predictive control for Markovian switching systems.”arxiv preprint arXiv:1610.10014, 2016.

[6] P. Patrinos, P. Sopasakis, H. Sarmiveis, and A. Bemporad, “Stochasticmodel predictive control for constrained discrete-time Markovian switch-ing systems,” Automatica, vol. 50, pp. 2504–2514, October 2014.

[7] A. Nemirovski and A. Shapiro, “Convex approximations of chanceconstrained programs,” SIAM Journal on Optimization, vol. 17, no. 4,pp. 969–996, 2006.

[8] V. Nitivattananon, E. C. Sadowski, and R. G. Quimpo, “Optimizationof water supply system operation,” J. Water Resour. Plann. Manage.,vol. 122, pp. 374–384, sep 1996.

[9] U. Zessler and U. Shamir, “Optimal operation of water distribution sys-tems,” Journal of Water Resources Planning and Management, vol. 115,no. 6, pp. 735–752, 1989.

[10] G. Yu, R. Powell, and M. Sterling, “Optimized pump scheduling in waterdistribution systems,” Journal of Optimization Theory and Applications,vol. 83, no. 3, pp. 463–488, 1994.

[11] A. Bagirov, A. Barton, H. Mala-Jetmarova, A. A. Nuaimat, S. Ahmed,N. Sultanova, and J. Yearwood, “An algorithm for minimization ofpumping costs in water distribution systems using a novel approachto pump scheduling,” Mathematical and Computer Modelling, vol. 57,no. 3–4, pp. 873 – 886, 2013.

[12] G. McCormick and R. Powell, “Derivation of near-optimal pumpschedules for water distribution by simulated annealing,” Journal of theOperational Research Society, vol. 55, pp. 728–736, July 2004.

[13] A. Sampathirao, J. Grosso, P. Sopasakis, C. Ocampo-Martinez, A. Bem-porad, and V. Puig, “Water demand forecasting for the optimal operationof large-scale drinking water networks: The Barcelona case study,” in19th IFAC World Congress, pp. 10457–10462, 2014.

[14] C. Ocampo-Martinez, V. Puig, G. Cembrano, R. Creus, and M. Minoves,“Improving water management efficiency by using optimization-basedcontrol strategies: the Barcelona case study,” Water Science and Tech-nology: Water Supply, vol. 9, no. 5, pp. 565–575, 2009.

[15] M. Bakker, J. H. G. Vreeburg, L. J. Palmen, V. Sperber, G. Bakker, andL. C. Rietveld, “Better water quality and higher energy efficiency byusing model predictive flow control at water supply systems,” Journalof Water Supply: Research and Technology - Aqua, vol. 62, no. 1, pp. 1–13, 2013.

[16] S. Leirens, C. Zamora, R. Negenborn, and B. De Schutter, “Coordinationin urban water supply networks using distributed model predictivecontrol,” in American Control Conference (ACC), 2010, (Baltimore,USA), pp. 3957–3962, June 2010.

[17] C. Ocampo-Martinez, V. Fambrini, D. Barcelli, and V. Puig, “Modelpredictive control of drinking water networks: A hierarchical and de-centralized approach,” in American Control Conference (ACC), 2010,(Baltimore, USA), pp. 3951–3956, June 2010.

[18] G. S. Sankar, S. M. Kumar, S. Narasimhan, S. Narasimhan, andS. M. Bhallamudi, “Optimal control of water distribution networks withstorage facilities,” Journal of Process Control, vol. 32, pp. 127 – 137,2015.

[19] V. Tran and M. Brdys, “Optimizing control by robustly feasible modelpredictive control and application to drinking water distribution sys-tems,” in Artificial Neural Networks – ICANN 2009 (C. Alippi, M. Poly-carpou, C. Panayiotou, and G. Ellinas, eds.), vol. 5769 of Lecture Notesin Computer Science, pp. 823–834, Springer Berlin Heidelberg, 2009.

[20] A. Goryashko and A. Nemirovski, “Robust energy cost optimizationof water distribution system with uncertain demand,” Automation andRemote Control, vol. 75, no. 10, pp. 1754–1769, 2014.

[21] J. Watkins, D. and D. McKinney, “Finding robust solutions to waterresources problems,” Journal of Water Resources Planning and Man-agement, vol. 123, no. 1, pp. 49–58, 1997.

[22] M. Cannon, B. Kouvaritakis, and X. Wu, “Probabilistic constrainedMPC for multiplicative and additive stochastic uncertainty,” AutomaticControl, IEEE Transactions on, vol. 54, pp. 1626–1632, July 2009.

[23] D. Bernardini and A. Bemporad, “Stabilizing model predictive controlof stochastic constrained linear systems,” Automatic Control, IEEETransactions on, vol. 57, pp. 1468–1480, June 2012.

[24] D. van Hessem and O. Bosgra, “A conic reformulation of modelpredictive control including bounded and stochastic disturbances understate and input constraints,” in Decision and Control, 2002, Proceedingsof the 41st IEEE Conference on, vol. 4, (Las Vegas, USA), pp. 4643–4648 vol.4, Dec 2002.

[25] D. Bertsimas and D. Brown, “Constrained stochastic LQC: A tractableapproach,” Automatic Control, IEEE Transactions on, vol. 52, pp. 1826–1841, Oct 2007.

[26] G. Calafiore and M. Campi, “The scenario approach to robust controldesign,” Automatic Control, IEEE Transactions on, vol. 51, pp. 742–753,May 2006.

[27] M. C. Campi, S. Garatti, and M. Prandini, “The scenario approach forsystems and control design,” Annual Reviews in Control, vol. 33, no. 2,pp. 149 – 157, 2009.

[28] M. Prandini, S. Garatti, and J. Lygeros, “A randomized approach tostochastic model predictive control,” in Decision and Control (CDC),2012 IEEE 51st Annual Conference on, (Maui, Hawaii, USA), pp. 7315–7320, Dec 2012.

[29] A. Shapiro, D. Dentcheva, and A. Ruszczynski, Lectures on StochasticProgramming: Modeling and Theory. Philadelphia: Society for Indus-trial and Applied Mathematics, 2009.

[30] H. Heitsch and W. Romisch, “Scenario tree modeling for multistagestochastic programs,” Mathematical Programming, vol. 118, no. 2,pp. 371–406, 2009.

[31] M. McCool, “Signal processing and general-purpose computing andGPUs [exploratory DSP],” Signal Processing Magazine, IEEE, vol. 24,pp. 109–114, May 2007.

[32] R. Benenson, M. Mathias, R. Timofte, and L. Van Gool, “Pedestriandetection at 100 frames per second,” in Computer Vision and PatternRecognition (CVPR), 2012 IEEE Conference on, pp. 2903–2910, June2012.

[33] H. Jang, A. Park, and K. Jung, “Neural network implementation usingCUDA and OpenMP,” in Digital Image Computing: Techniques andApplications (DICTA), 2008, pp. 155–161, Dec 2008.

[34] A. Guzhva, S. Dolenko, and I. Persiantsev, Artificial Neural Networks– ICANN 2009: 19th International Conference, Limassol, Cyprus,September 14-17, 2009, Proceedings, Part I, ch. Multifold Accelerationof Neural Network Computations Using GPU, pp. 373–380. Berlin,Heidelberg: Springer Berlin Heidelberg, 2009.

[35] E. Klintberg and S. Gros, “A parallelizable interior point method for two-stage robust mpc,” IEEE Transactions on Control Systems Technology,vol. PP, no. 99, pp. 1–11, 2017.

[36] M. Lubin, C. G. Petra, M. Anitescu, and V. Zavala, “Scalable stochasticoptimization of complex energy systems,” in Proceedings of 2011International Conference for High Performance Computing, Networking,Storage and Analysis, SC ’11, (New York, NY, USA), pp. 64:1–64:64,ACM, 2011.

[37] J. Kang, Y. Cao, D. P. Word, and C. Laird, “An interior-point method forefficient solution of block-structured NLP problems using an implicitschur-complement decomposition,” Computers & Chemical Engineer-ing, vol. 71, pp. 563 – 573, 2014.

[38] N. Chiang, C. G. Petra, and V. M. Zavala, “Structured nonconvexoptimization of large-scale energy systems using pips-nlp,” in PowerSystems Computation Conference (PSCC), 2014, pp. 1–7, Aug 2014.

[39] J. Hubner, M. Schmidt, and M. C. Steinbach, “A distributed interior-point kkt solver for multistage stochastic optimization,” OptimizationOnline, 2016.

[40] S. Miyaoka and M. Funabashi, “Optimal control of water distributionsystems by network flow theory,” Automatic Control, IEEE Transactionson, vol. 29, pp. 303–311, Apr 1984.

[41] Y. Wang, C. Ocampo-Martinez, V. Puig, and J. Quevedo, “Gaussian-process-based demand forecasting for predictive control of drinkingwater networks,” in Proceedings of the 9th International Conferenceon Critical Information Infrastructures Security (CRITIS), (Limassol(Cyprus)), 2014.

[42] S. Cong Cong, S. Puig, and G. Cembrano, “Combining CSP andMPC for the operational control of water networks: Application to the

Page 12: GPU-accelerated stochastic predictive control of drinking ... · an optimization algorithm which makes use of the problem structure and sparsity. We exploit the structure of the problem,

12

Richmond case study,” in 19th IFAC World Congress, (Cape Town),pp. 6246–6251, 2014.

[43] C. Ocampo-Martinez and R. Negenborn, Transport of water versustransport over water : exploring the dynamic interplay of transport andwater. Cham: Springer, 2015.

[44] A. Sampathirao, P. Sopasakis, A. Bemporad, and P. Patrinos, “Dis-tributed solution of stochastic optimal control problems on GPUs,” in54 IEEE Conf. Decision and Control, (Osaka, Japan), Dec 2015.

[45] N. Parikh and S. Boyd, “Proximal algorithms,” Found. Trends Optim.,vol. 1, pp. 127–239, jan 2014.

[46] P. Combettes and J.-C. Pesquet, “Proximal splitting methods in signalprocessing,” 2010. arXiv:0912.3522v4.

[47] R. Rockafellar, Convex analysis. Princeton university press, 1972.[48] R. Rockafellar and J. Wets, Variational analysis. Berlin: Springer-Verlag,

3rd ed., 2009.[49] Y. Nesterov, “A method of solving a convex programming problem with

convergence rate O(1/k2),” Soviet Mathematics Doklady, vol. 72, no. 2,pp. 372–376, 1983.

[50] P. Tseng, “Applications of a splitting algorithm to decomposition inconvex programming and variational inequalities,” SIAM Journal onControl and Optimization, vol. 29, pp. 119–138, jan 1991.

[51] P. Patrinos and A. Bemporad, “An accelerated dual gradient-projectionalgorithm for embedded linear model predictive control,” AutomaticControl, IEEE Transactions on, vol. 59, pp. 18–33, Jan 2014.

[52] P. Giselsson and S. Boyd, “Metric selection in fast dual for-ward–backward splitting,” Automatica, vol. 62, pp. 1 – 10, Dec 2015.

[53] A. Bradley, Algorithms for equilibration of matrices and their appli-cation to limited-memory quasi-Newton methods. PhD thesis, StanfordUniversity, 2010.

[54] Y. Cao, C. Laird, and V. Zavala, “Clustering-based preconditioningfor stochastic programs,” Computational optimization and applications,vol. 64, no. 2, pp. 379–406, 2016.

[55] D. P. Bertsekas, Nonlinear Programming, vol. 1. Belmont, MA: AthenaScientific, 1999.

[56] H. Algre, J. Baptista, E. Cabrera Jr, and F. Cubillo, Performanceindicators for water supply services. Manuals of best practice series,London: IWA Publishing, 2006.

[57] D. P. Bertsekas, Dynamic Programming and Optimal Control. AthenaScientific, 2nd ed., 2000.

Ajay Kumar Sampathirao received a BSc degreein electrical engineering at NIT Warangal, India in2010 and an M.Tech. degree in control and automa-tion from IIT Delhi, India in 2012 and defended hisPh.D. in computer science and engineering at IMTLucca, Italy in December 2016. He recently joinedthe control systems group in Technical UniversityBerlin, Germany as a post-doctoral researcher. Hisresearch includes stochastic MPC, GPU-based opti-mization methods, in particular for large-scale sys-tems such as drinking water and power networks.

Pantelis Sopasakis was born in Athens, Greece, in1985. He received a diploma (M.Eng.) in chemicalengineering in 2007 and an M.Sc. with honoursin applied mathematics in 2009 from the NationalTechnical University of Athens. In December 2012,he defended his Ph.D. thesis titled “Modelling andcontrol of biological and physiological systems”from the School of Chemical Engineering, NTUAthens. From 2013 till 2016 he was with the Dynam-ical Systems, Control and Optimization (DYSCO)research unit at IMT Lucca as a post-doctoral re-

searcher. Since October 2016 he is a postdoc with the Department of ElectricalEngineering (ESAT) at KU Leuven. His current research focuses on thedevelopment of numerical algorithms for large-scale stochastic optimal controlproblems.

Alberto Bemporad received his Master’s degreein Electrical Engineering in 1993 and his Ph.D. inControl Engineering in 1997 from the University ofFlorence, Italy. In 1996/97 he was with the Centerfor Robotics and Automation, Department of Sys-tems Science & Mathematics, Washington Univer-sity, St. Louis. In 1997-1999 he held a postdoctoralposition at the Automatic Control Laboratory, ETHZurich, Switzerland, where he collaborated as asenior researcher until 2002. In 1999-2009 he waswith the Department of Information Engineering of

the University of Siena, Italy, becoming an associate professor in 2005.In 2010-2011 he was with the Department of Mechanical and StructuralEngineering of the University of Trento, Italy. Since 2011 he is full professorat the IMT Institute for Advanced Studies Lucca, Italy, where he served asthe director of the institute in 2012-2015. He spent visiting periods at theUniversity of Michigan and Zhejiang University. In 2011 he cofounded ODYSS.r.l., a consulting and software development company specialized in advancedcontrols and embedded optimization algorithms. He has published more than300 papers in the areas of model predictive control, automotive control, hybridsystems, multiparametric optimization, computational geometry, robotics, andfinance, and co-inventor of 7 patents. He is author or coauthor of variousMATLAB toolboxes for model predictive control design, including the ModelPredictive Control Toolbox (The Mathworks, Inc.) and the Hybrid Toolbox.

He was an Associate Editor of the IEEE Transactions on Automatic Controlduring 2001-2004 and Chair of the Technical Committee on Hybrid Systemsof the IEEE Control Systems Society in 2002-2010. He received the IFACHigh-Impact Paper Award for the 2011-14 triennial. He has been an IEEEFellow since 2010.

Panagiotis (Panos) Patrinos is currently assistantprofessor at the Department of Electrical Engineer-ing (ESAT) of KU Leuven, Belgium. He received theM.Eng. in Chemical Engineering, M.Sc. in AppliedMathematics and Ph.D. in Control and Optimiza-tion from National Technical University of Athens,Greece. After his Ph.D. he held postdoctoral posi-tions at the University of Trento and IMT Schoolof Advanced Studies Lucca, Italy, where he becamean assistant professor in 2012. During fall/winter2014 he held a visiting assistant professor position

in the department of electrical engineering at Stanford University. His currentresearch interests are in the theory and algorithms of optimization andpredictive control with a focus on large-scale, distributed, stochastic andembedded optimization with a wide range of application areas including smartgrids, water networks, aerospace, and machine learning.