model learning predictive control for batch processes

Model learning predictive control for batch processes: Citation for published version (APA): Marquez Ruiz, A., Loonen, M. A. C., Saltik, B., & Ozkan, L. (2019). Model learning predictive control for batch processes: A Reactive Batch Distillation Column Case Study. Industrial and Engineering Chemistry Research, 58(30), 13737-13749. https://doi.org/10.1021/acs.iecr.8b06474 DOI: 10.1021/acs.iecr.8b06474 Document status and date: Published: 31/07/2019 Document Version: Publisher’s PDF, also known as Version of Record (includes final page, issue and volume numbers) Please check the document version of this publication: • A submitted manuscript is the version of the article upon submission and before peer-review. There can be important differences between the submitted version and the official published version of record. People interested in the research are advised to contact the author for the final version of the publication, or visit the DOI to the publisher's website. • The final author version and the galley proof are versions of the publication after peer review. • The final published version features the final layout of the paper including the volume, issue and page numbers. Link to publication General rights Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights. • Users may download and print one copy of any publication from the public portal for the purpose of private study or research. • You may not further distribute the material or use it for any profit-making activity or commercial gain • You may freely distribute the URL identifying the publication in the public portal. 
If the publication is distributed under the terms of Article 25fa of the Dutch Copyright Act, indicated by the “Taverne” license above, please follow below link for the End User Agreement: www.tue.nl/taverne Take down policy If you believe that this document breaches copyright please contact us at: [email protected] providing details and we will investigate your claim. Download date: 06. Oct. 2021



Model Learning Predictive Control for Batch Processes: A Reactive Batch Distillation Column Case Study

Alejandro Marquez-Ruiz, Marco Loonen, M. Bahadır Saltık, and Leyla Ozkan*

Department of Electrical Engineering, Eindhoven University of Technology, De Zaale, Eindhoven 5612 AJ, The Netherlands

ABSTRACT: In this paper, we present the control of batch processes using model predictive control (MPC) and iterative learning control (ILC). Existing combinations of MPC and ILC are based on learning of the inputs of the process from previous batches for a fixed linear time-invariant (LTI) model. However, batch processes are inherently time varying; therefore, LTI models are limited in capturing the relevant dynamic behavior. An attractive alternative is to use linear parameter varying (LPV) models because of their ability to capture nonlinearities in the control of batch processes. Therefore, in this work we propose a novel method combining MPC and ILC based on LPV models, and we call this method model learning predictive control (ML-MPC). Basically, the idea behind the method is to update the LPV model of the MPC iteratively, by using the repetitive behavior of the batch process. To this end, three different application-dependent options to estimate the parameters and disturbances of the model are proposed and are compared in simulation on a nonlinear batch reactor. Finally, the ML-MPC with one of the estimation methods is applied to an industrial reactive batch distillation column (RBD) in simulation.

■ INTRODUCTION
Growing quality demands of consumers, environmental constraints, limited raw materials, and increasing energy demands have pushed the process industry to continuously improve their processes and operations. Continuous processes are often developed for bulk production of a specific product and are therefore less dynamic in their capacity and product variations. Meanwhile, batch processes, which are defined by their start-up and well-defined end of the operation, are generally used for smaller quantities of specialized products, but often have the ability to change their recipes-setup between the batches to provide a wider product range.

For the control of batch processes, methods such as proportional−integral−derivative (PID), run-to-run, iterative learning, or model based control can be used as described in the work of Bonvin [1]. Model predictive control (MPC) is a control method that includes the input and output constraints explicitly in its formulation and, therefore, has become an accepted standard within the process industry for constrained multivariate control problems [2]. In a typical implementation, MPC solves an optimization problem subject to system dynamics and constraints. The solution of this optimization problem is a sequence of inputs that can steer the performance output to desired levels. Naturally, the performance of such a control strategy depends on the quality of the model that is used in predicting the process behavior.

There are two basic methods for modeling of systems: modeling based on system identification, which uses only input−output data, or modeling based on first principles. In the latter, process models with a wide operating window can be obtained. These kinds of models are generally nonlinear and hence require nonlinear optimization methods and can be computationally inefficient in a model based control strategy. An attractive modeling approach to deal with nonlinear systems is linear parameter varying (LPV) models. These models describe the nonlinearity by a set of linear equations by using a scheduling parameter. The use of LPV models in control synthesis has several challenges of which the identification of the model is the most important [3−6]. In this direction, we have shown that a well structured LPV model for reacting systems can be obtained by considering the reaction contribution as a disturbance term and using the extent transformation [7−9]. The resulting model has a parameter dependent state space form with a diagonal state matrix. An important advantage of such representation is that the model parameters have physical meaning.

Another important characteristic of batch processes is the repetitive nature of the operation. This could be an advantage for control purposes. While humans have the ability to improve after repetitive mistakes, classical feedback controllers applied in the process industry do not have this capability. If a controller can store information from the control actions calculated in the previous batches and use this information to improve the future and the current performance, then this is called a learning action.

Special Issue: Dominique Bonvin Festschrift

Received: December 30, 2018
Revised: March 29, 2019
Accepted: April 24, 2019
Published: April 24, 2019

Article

pubs.acs.org/IECR
Cite This: Ind. Eng. Chem. Res. 2019, 58, 13737−13749

© 2019 American Chemical Society. DOI: 10.1021/acs.iecr.8b06474

This is an open access article published under a Creative Commons Non-Commercial NoDerivative Works (CC-BY-NC-ND) Attribution License, which permits copying andredistribution of the article, and creation of adaptations, all for non-commercial purposes.


In the 1980s, the first learning applications were introduced in the field of robotics [10] and are now widely known in the field of systems and control as iterative learning control (ILC) methods; see the work of Wang et al. [11] for a survey. ILC has the ability to obtain asymptotic convergence over iterations of the same repetitive task while being subject to model uncertainties and iteration-invariant disturbances. The conventional applications of ILC are operated in an open loop manner, where the error of the previous iteration is used to update the trajectory of the next iteration; hence, they are unable to reject real time disturbances. Therefore, some methods have been proposed that implement ILC as a feed-forward action in the closed loop control problem; see the work of Wang et al. [11] for more details.

Certainly, the combination of MPC and ILC (IL-MPC) is a good approach in the control of batch processes. This combination has many advantages, such as a double integration mechanism (in the batch and between batches) and constraint satisfaction, among others. The general learning procedure with MPC is achieved by correcting the MPC input with past information [12]. The mathematical formulations for IL-MPC can differ in the incorporation of integrators and constraints; see, for example, refs 2 and 13. Despite all these advantages, the model used for prediction is still a limitation, and most processes in industry frequently show nonlinear behavior [14−16].

For nonlinear processes, such as batch reactors, several studies have been conducted where a type of IL-MPC is implemented with a nonlinear model. For example, refs 16 and 17 propose an iterative nonlinear model predictive control (INMPC) strategy. INMPC has been applied to two different batch reactions, and the convergence, stability, and robustness of the closed-loop system have been analyzed and proven [18]. Another nonlinear method, called nonlinear model predictive iterative learning control (NMPILC), is described in ref 19; it uses a fuzzy model that contains local linear models. In general terms, utilizing a nonlinear model in IL-MPC is the ideal case; nevertheless, as is usual in nonlinear MPC, there are many open issues when it comes to implementation, such as computational load. Therefore, to close the gap between accuracy in the prediction and implementation in real time, the combination of IL-MPC with LPV models arises as a good alternative.

In this work, we propose a method not only to improve the performance of an MPC controller by utilizing the repetitive behavior of batch processes in an iterative learning fashion but also to overcome the limitations of MPC approaches using LPV models by updating the MPC and disturbance models. Similar strategies have been proposed for classic ILC [20,21]; however, in this work, all these ideas are combined in just one method called model learning predictive control (ML-MPC). This method inherits the advantages of ILC and MPC while a model that is suitable in structure (linear) and accuracy is used for prediction in every batch. For the estimation of the parameters and disturbance, we evaluate three different methods that cover a broad range of scenarios. The performance of each method is evaluated by implementing it on a simple nonlinear batch reactor. Furthermore, one of the proposed learning methods is used for the control of an industrial reactive batch distillation (RBD) column.

This paper is organized as follows: In Preliminaries, we provide the basic principles of MPC, MPC for LPV models, ILC, and the current methods of iterative learning model predictive control (IL-MPC). Problem Statement presents the main problem we will be solving. In Model Learning Predictive Control, we describe our newly proposed method, including implementations on a simple example of a batch reactor. In Reactive Batch Distillation Processes, the method is applied to a more advanced process. Finally, in Conclusions, we summarize our findings.

■ PRELIMINARIES
MPC for LPV Systems. MPC is a well-known control technique that uses a model to predict the future states and outputs. It solves an optimization problem, in which, usually, a quadratic cost function is minimized. In this way, the controller can steer the process outputs toward the reference profiles. The main advantage of MPC is that it can explicitly consider constraints on inputs, states, and outputs in the optimization problem. The optimization problem is solved at every time sample to determine the best sequence of inputs $\{\delta u_{k+j}\}_{j=0}^{N-1}$ for a given prediction horizon $N \in \mathbb{Z}_{>0}$. The first input from this sequence is applied, and at the next time sample, the same optimization problem is solved again. This is known as the receding horizon strategy.

Despite these advantages of MPC, the use of this technology for the control of batch processes is limited if it exists at all. This is mainly due to the nonlinearities and absence of steady states. Therefore, typical linear MPC techniques cannot be implemented. For batch processes, MPC based on a first-principles nonlinear model is a suitable approach. Certainly, there are different approaches to deal with these issues, such as multiple models, successive linearization, or empirical nonlinear models such as artificial neural networks; however, these approaches are often subject to performance limitations [22] and computational issues. An attractive approach to deal with batch processes and nonlinear systems is to use LPV models in the MPC formulation. With such a representation we have the potential to capture nonlinear and time varying characteristics of systems but also to use linear control theory, which is well developed.

In the available MPC methods for LPV systems, the future behavior of the parameter (scheduling variable) is assumed unknown over the prediction horizon. One of the most widely used methods to solve the MPC problem under this kind of uncertainty is the worst-case scenario, which can be described by eq 1 [23−26].

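$$
\begin{aligned}
\min_{\delta u_{k+j}}\ \max_{\theta_{k+j}\in\Theta,\ d_{k+j}\in\mathcal{D}}\quad & \|x_{k+N}\|_P^2 + \sum_{j=0}^{N-1}\left(\|e_{k+j}\|_Q^2 + \|\delta u_{k+j}\|_R^2\right) \\
\text{subject to:}\quad & x_{k+j+1} = A(\theta_{k+j})\,x_{k+j} + B(\theta_{k+j})\,u_{k+j} + B_d\,d_{k+j} \\
& y_{k+j} = C x_{k+j} + D u_{k+j} \\
& e_{k+j} = r_{k+j} - y_{k+j} \\
& u_{k+j} = u_{k+j-1} + \delta u_{k+j} \\
& x_{k+j}\in\mathcal{X},\quad u_{k+j}\in\mathcal{U},\quad x_{k+N}\in\Omega \\
& d_{k+j}\in\mathcal{D},\quad \theta_{k+j}\in\Theta
\end{aligned}
\tag{1}
$$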

where $x_{k+j}\in\mathbb{R}^{n_x}$ denotes the system state, $y_{k+j}\in\mathbb{R}^{n_y}$ is the system output, $u_{k+j}\in\mathbb{R}^{n_u}$ is the control vector, $d_{k+j}\in\mathbb{R}^{n_d}$ are disturbances, and $\|x\|_Q^2 = x^T Q x$ with $Q \succ 0$. The system is subject to hard constraints on state $x_{k+j}\in\mathcal{X}$ and input $u_{k+j}\in\mathcal{U}$, where $\mathcal{X}\subset\mathbb{R}^{n_x}$ and $\mathcal{U}\subset\mathbb{R}^{n_u}$ are assumed convex and compact. Finally, $\Omega\subseteq\mathcal{X}$ is an invariant set of the system. In this work, the sets $\mathcal{X}$, $\mathcal{U}$, and $\Omega$ are assumed to be polyhedra, i.e., described by linear inequalities of the form

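$$
\begin{aligned}
\mathcal{U} &= \{\,u_{k+j} \mid A_u u_{k+j} \leqslant b_u\,\} \\
\mathcal{X} &= \{\,x_{k+j} \mid A_x x_{k+j} \leqslant b_x\,\} \\
\Omega &= \{\,x_{k+N} \mid A_N x_{k+N} \leqslant b_N\,\}
\end{aligned}
\tag{2}
$$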
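As a rough illustration of problem 1 and of the polyhedral sets in eq 2, the following sketch solves a one-step min-max problem for a scalar LPV model by brute force. The model x⁺ = θx + bu + d, all numerical values, the input grid, and the function names are assumptions chosen for illustration; none of them come from the paper. Only the vertices of the parameter and disturbance intervals are evaluated, which is sufficient here because the one-step prediction is affine in θ and d for a fixed state and input.

```python
def in_polyhedron(A, b, v):
    """Check A v <= b row by row (scalar v, so each row reads a_i * v <= b_i)."""
    return all(a_i * v <= b_i + 1e-12 for a_i, b_i in zip(A, b))


def worst_case_cost(x, u, r, thetas, ds, b_in=1.0, q=1.0):
    """Largest one-step tracking cost over the vertices of Theta and D."""
    return max(q * (r - (th * x + b_in * u + d)) ** 2
               for th in thetas for d in ds)


def minmax_input(x, r, A_u, b_u, thetas, ds, n_grid=401, u_lo=-2.0, u_hi=2.0):
    """Grid search for the admissible input with the smallest worst-case cost."""
    best_u, best_cost = None, float("inf")
    for i in range(n_grid):
        u = u_lo + (u_hi - u_lo) * i / (n_grid - 1)
        if not in_polyhedron(A_u, b_u, u):
            continue  # input violates A_u u <= b_u
        cost = worst_case_cost(x, u, r, thetas, ds)
        if cost < best_cost:
            best_u, best_cost = u, cost
    return best_u, best_cost


# Hypothetical data: U = {u | u <= 1, -u <= 1}, Theta = [0.8, 1.0], D = [-0.1, 0.1]
A_u, b_u = [1.0, -1.0], [1.0, 1.0]
u_star, cost_star = minmax_input(x=1.0, r=0.5, A_u=A_u, b_u=b_u,
                                 thetas=[0.8, 1.0], ds=[-0.1, 0.1])
print(u_star, cost_star)
```

Even in this scalar case every candidate input requires evaluating all vertex combinations; with a longer horizon the number of combinations grows quickly, which gives a feel for the computational burden of the worst-case formulation.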

The main issue with the formulation given by optimization problem 1 is the computational demand [27]. This has led to approaches in which the problem is approximated (see refs 23 and 28). There are also multiparametric approaches, described as explicit MPC for LPV systems [26], and tube based MPC (TMPC) [29]. In ref 26, the computational load is reduced by precomputing the optimal inputs as a piecewise affine function of the state. The precomputed solutions are stored in a look-up table, such that only this look-up table is needed online [26]. However, this method only works well if the size of the system is small [30]. On the other hand, TMPC is based on the assumption that, while the current value of the scheduling parameter is measured exactly, the future variations in the scheduling variable belong to a sequence of sets $\Theta_{k+j}$ (tube) that describe the expected deviations from the nominal trajectory [25]. With this, the controller is able to derive the worst-case cost for expected future changes only, resulting in a single linear program. The sequence $\Theta_{k+j}$ can be constructed in several ways, namely classical, anticipative, and oracle, each corresponding to a different MPC scenario [25]. An additional interesting approach called MPC for quasi-LPV systems (QMPC) has been proposed in refs 31 and 32. In this formulation, it is assumed that the scheduling parameters $\theta_{k+j}$ and disturbances $d_{k+j}$ can be calculated by means of explicit functions of the states and inputs as $d_{k+j} = f_d(x_{k+j}^i, u_{k+j}^i)$ and $\theta_{k+j} = f_\theta(x_{k+j}^i, u_{k+j}^i)$. With this assumption, the min−max optimization problem given in 1 can be avoided. The main drawback of this method is the need to find the functions $f_d(\cdot)$ and $f_\theta(\cdot)$. However, for chemical processes, such functions are most often given by constitutive equations of the physical parameters of the process, which is advantageous if a first-principles model is utilized in the controller.

In general, the MPC methods for LPV systems proposed in the literature are mathematically well-defined; nevertheless, real time implementations are difficult due to their complexity and high computational load. In spite of all these issues, TMPC and QMPC combined with iterative learning control (ILC) have inspired some of the methods proposed in this paper.

Iterative Learning Control. ILC is a control method that uses data from previously performed iterations. With this data, the controller is able to learn from past "mistakes" and correct for them. Figure 1 illustrates the ILC procedure. In this figure P is the plant, and L and Q are filters. Notice in Figure 1 how the control signal induces an error signal through the plant with disturbance and how this error signal of iteration j is used for the determination of the control signal in iteration j + 1.

In classical ILC, the goal is to generate the optimal feed-forward input. The first solution was applied by Arimoto to robotics [10]; it is also called P-type ILC [33], first-order learning algorithm [12], or Arimoto algorithm [34] and is represented by

$$
u_j^i = u_j^{i-1} + L\,e_j^{i-1}, \qquad \forall i = 1, \ldots, \infty,\quad \forall j = 1, \ldots, N
\tag{3}
$$

where $u_j^i\in\mathbb{R}^{n_u}$ is the input, $e_j^{i-1} = r_j - y_j^{i-1}$, $e_j\in\mathbb{R}^{n_y}$ is the error between the output and the reference, N is the number of samples in a batch, and L is a linear time-invariant (LTI) filter called a learning filter [12]. The indices $i\in\mathbb{Z}_{\geq 0}$ and $j\in\mathbb{Z}_{\geq 0}$ indicate the batch index and the time sample within a batch, respectively. To ensure that the ILC algorithm is learning the system dynamics and not the disturbances, the filter L is designed to be the inverse of the process sensitivity [35]. The drawback of this choice is that high-frequency terms of the input can increase continuously, causing instability across batches. To overcome this problem and ensure convergence to a steady input profile, a robustness filter Q is added to this algorithm, as can be seen in Figure 1 and eq 4. This robustness filter is usually a type of low-pass filter [35].

$$
u_j^i = Q\left[u_j^{i-1} + L\,e_j^{i-1}\right]
\tag{4}
$$
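The update with learning and robustness filters can be sketched in a few lines. The example below is only an illustration under strong simplifying assumptions: the plant is a hypothetical static gain with a batch-invariant disturbance, the learning filter L is reduced to a scalar gain instead of an inverse of the process sensitivity, and Q is a 3-point zero-phase moving average standing in for a low-pass filter. All names and numbers are made up.

```python
def smooth(signal, w=0.25):
    """Zero-phase 3-point low-pass filter standing in for the robustness filter Q."""
    n = len(signal)
    out = []
    for j in range(n):
        left = signal[max(j - 1, 0)]        # replicate the first sample at the edge
        right = signal[min(j + 1, n - 1)]   # replicate the last sample at the edge
        out.append(w * left + (1 - 2 * w) * signal[j] + w * right)
    return out


def plant(u, disturbance, gain=2.0):
    """Hypothetical static plant with a disturbance that repeats every batch."""
    return [gain * uj + wj for uj, wj in zip(u, disturbance)]


def ilc_update(u_prev, e_prev, learning_gain):
    """u^i = Q[u^{i-1} + L e^{i-1}], applied to the whole batch trajectory."""
    return smooth([uj + learning_gain * ej for uj, ej in zip(u_prev, e_prev)])


N = 50
r = [j / (N - 1) for j in range(N)]          # ramp reference
w = [0.3 for _ in range(N)]                  # repetitive disturbance
u = [0.0] * N                                # input of the initial batch
error_norms = []
for i in range(10):
    y = plant(u, w)
    e = [rj - yj for rj, yj in zip(r, y)]
    error_norms.append(max(abs(ej) for ej in e))
    u = ilc_update(u, e, learning_gain=0.4)  # gain * L = 0.8, contraction factor 0.2
print(error_norms)
```

In this toy setting the largest tracking error shrinks from batch to batch. Because the smoothing filter slightly alters the input at the batch edges, the error settles at a small nonzero value rather than exactly zero, which reflects the usual trade-off between robustness and perfect tracking.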

The ILC algorithm can now be executed as shown in Algorithm 1.

Feedback control can also be included in the closed loop operation with ILC in the so-called current-iteration iterative learning control [36,37]. Then, the control update algorithm is given by

$$
u_{k+j}^{i} = \underbrace{Q\left[u_{k+j}^{i-1} + L\,e_{k+j}^{i-1}\right]}_{\text{learning control}} + \underbrace{C\,e_{k+j}^{i}}_{\text{feedback control}}
$$

Here, the feedback control has the ability to react to batch specific disturbances. Usually, in the corresponding control scheme, as shown in Figure 1, only PD controllers are considered [36], as ILC has a natural integration component from one batch to the next.

The ILC strategy requires repetitive processes, such as robotics and batch processing; when this is the case, it can reject repetitive errors using past data. It adds a batch-to-batch integrator to the control loop, and its noncausal behavior can overcome the delay in the error suppression. Additionally, model information can be used to improve the convergence speed. However, it requires error and input filtering to prevent instabilities [36].

Figure 1. Illustration of an ILC control algorithm.

Iterative Learning Model Predictive Control (IL-MPC). Different methods have already been developed to combine the benefits of MPC with iterative learning capabilities, as seen in the work of Lee and Lee [12] and Adam and Gonzalez [13]. In general, the IL-MPC problem considers additive disturbances and mathematically can be written as

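$$
\begin{aligned}
\min_{\Delta\delta u_{k+j}^{i}}\quad & \|x_{k+N}^{i}\|_P^2 + \sum_{j=0}^{N-1}\left(\|e_{k+j}^{i}\|_Q^2 + \|\Delta\delta u_{k+j}^{i}\|_R^2\right) \\
\text{subject to:}\quad & x_{k+j+1}^{i} = A x_{k+j}^{i} + B u_{k+j}^{i} + B_d\,d_{k+j}^{i} \\
& y_{k+j}^{i} = C x_{k+j}^{i} + D u_{k+j}^{i} \\
& e_{k+j}^{i} = r_{k+j}^{i} - y_{k+j}^{i} \\
& u_{k+j}^{i} = u_{k+j-1}^{i} + \delta u_{k+j}^{i} \\
& \Delta\delta u_{k+j}^{i} = \delta u_{k+j}^{i} - \delta u_{k+j}^{i-1} \\
& x_{k+j}^{i}\in\mathcal{X},\quad u_{k+j}^{i}\in\mathcal{U} \\
& x_{k+N}^{i}\in\Omega,\quad d_{k+j}^{i}\in\mathcal{D}
\end{aligned}
\tag{5}
$$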

In this formulation Δ describes the input change rate from one batch to the next. Similar to how δ includes an integrator in the in-batch time direction, Δ introduces an extra integrator in the batch-to-batch direction. Another feature of this Δ is that the speed of changes over batches can be constrained. Now, in order to calculate the optimal control solution $\Delta\delta u_{k+j}^{i}$, problem 5 can be transformed into a quadratic programming (QP) problem.
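The double integration can be made concrete with a small sketch. Given the increments Δδu chosen by the optimizer and the input moves δu of the previous batch, the input applied in batch i is recovered by summing once across batches and once along the batch time. The function name and the numbers are illustrative assumptions, not part of the original formulation.

```python
def reconstruct_input(delta_du, du_prev_batch, u_init):
    """Recover u^i from batch-to-batch increments of the in-batch input moves.

    delta_du[j]      : Δδu_{k+j}^i, the decision variable of problem 5
    du_prev_batch[j] : δu_{k+j}^{i-1}, input moves of the previous batch
    u_init           : u_{k-1}^i, the input applied just before the horizon
    """
    u, u_last = [], u_init
    for ddu, du_prev in zip(delta_du, du_prev_batch):
        du = du_prev + ddu      # batch-direction integrator: δu^i = δu^{i-1} + Δδu^i
        u_last = u_last + du    # time-direction integrator: u_{k+j} = u_{k+j-1} + δu_{k+j}
        u.append(u_last)
    return u


# With Δδu = 0 the input moves of the previous batch are replayed unchanged.
du_prev = [0.5, 0.25, -0.25]
print(reconstruct_input([0.0, 0.0, 0.0], du_prev, u_init=1.0))
# A single nonzero Δδu at the first sample shifts the whole remaining trajectory.
print(reconstruct_input([0.25, 0.0, 0.0], du_prev, u_init=1.0))
```

Penalizing or bounding `delta_du` therefore limits how much the input profile may change from one batch to the next, which is the constraint on the speed of change mentioned above.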

■ PROBLEM STATEMENT
In the previous sections, we have described MPC and ILC strategies separately and also presented information on control approaches which combine both technologies. In IL-MPC approaches, the optimal input is calculated using the predefined model as shown in eq 5. Changes in the plant from batch to batch causing model degradation have not been considered. Besides, MPC-LPV methods are difficult for online implementation. Motivated by these observations, we present the following problem:

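$$
\begin{aligned}
\min_{\Delta\delta u_{k+j}^{i}}\quad & \|x_{k+N}^{i}\|_P^2 + \sum_{j=0}^{N-1}\left(\|e_{k+j}^{i}\|_Q^2 + \|\Delta\delta u_{k+j}^{i}\|_R^2\right) \\
\text{subject to:}\quad & x_{k+j+1}^{i} = A(\theta_{k+j}^{i})\,x_{k+j}^{i} + B(\theta_{k+j}^{i})\,u_{k+j}^{i} + B_d\,d_{k+j}^{i} \\
& y_{k+j}^{i} = C x_{k+j}^{i} + D u_{k+j}^{i} \\
& e_{k+j}^{i} = r_{k+j}^{i} - y_{k+j}^{i} \\
& u_{k+j}^{i} = u_{k+j-1}^{i} + \delta u_{k+j}^{i} \\
& \Delta\delta u_{k+j}^{i} = \delta u_{k+j}^{i} - \delta u_{k+j}^{i-1} \\
& x_{k+j}^{i}\in\mathcal{X},\quad u_{k+j}^{i}\in\mathcal{U},\quad x_{k+N}^{i}\in\Omega \\
& d_{k+j}^{i}\in\mathcal{D},\quad \theta_{k+j}^{i}\in\Theta
\end{aligned}
\tag{6}
$$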

Our principal objective, therefore, is to design an MPC controller to achieve constrained tracking of the LPV representation of the batch process for a given reference trajectory. As this is a complicated problem to solve due to the need of knowledge about $\theta_{k+j}^{i}$ and $d_{k+j}^{i}$, we propose an iterative method to derive these parameters from previous batches. We call this strategy model learning predictive control.

■ MODEL LEARNING PREDICTIVE CONTROL
In this paper, we propose a method where we update the model used for control, such that the MPC controller performance can be improved over batches. To this end, we use the available data from the previous batch to (re)estimate the model parameters, as illustrated in Figure 2.

In this illustration, we can see two time directions, the first one being the in-batch time k, and the second being the sequential batches denoted by i. After the initial batch, data is used to get estimates $\theta_{k+j}^{1}$, $d_{k+j}^{1}$ for the next batch. The MPC controller then computes the optimal inputs for the next batch using these updated parameters. This sequence is then repeated as described step by step in Algorithm 2.

This method is shown in the control block diagram in Figure 3.
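The batch-to-batch loop described above can be sketched as follows. This is not a reproduction of Algorithm 2; it is a minimal illustration under loud assumptions: a hypothetical scalar plant, an unconstrained one-step predictive law in place of the full MPC problem, and a model update that re-estimates only the disturbance trajectory from the residual of the previous batch (so parameter mismatch is lumped into the disturbance). Any of the estimation methods discussed in the paper would take the place of `learn_disturbance`.

```python
def run_batch(r, d_hat, theta_hat=0.9, b=1.0, x0=0.0):
    """One batch of a hypothetical scalar plant under a one-step predictive law.

    True plant (unknown to the controller): x[k+1] = theta_k*x[k] + b*u[k] + d_k
    Controller model:                        x[k+1] = theta_hat*x[k] + b*u[k] + d_hat[k]
    """
    x, u = [x0], []
    for k in range(len(d_hat)):
        # Unconstrained one-step law: choose u so the model prediction hits r[k+1].
        u.append((r[k + 1] - theta_hat * x[k] - d_hat[k]) / b)
        theta_k = 0.9 - 0.002 * k   # slowly drifting true parameter (assumed)
        d_k = 0.2                   # unmodelled, batch-invariant disturbance (assumed)
        x.append(theta_k * x[k] + b * u[k] + d_k)
    return x, u


def learn_disturbance(x, u, theta_hat=0.9, b=1.0):
    """Model residual of the finished batch, used as d_hat in the next batch."""
    return [x[k + 1] - theta_hat * x[k] - b * u[k] for k in range(len(u))]


N = 20
r = [1.0] * (N + 1)            # constant reference profile
d_hat = [0.0] * N              # the initial batch runs without a disturbance model
max_errors = []
for i in range(4):
    x, u = run_batch(r, d_hat)                 # in-batch direction (time k)
    max_errors.append(max(abs(rk - xk) for rk, xk in zip(r[1:], x[1:])))
    d_hat = learn_disturbance(x, u)            # batch-to-batch direction (index i)
print(max_errors)
```

The two nested loops mirror the two time directions of Figure 2: the inner loop over k runs inside `run_batch`, and the outer loop over i updates the model between batches.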

Figure 2. Illustration of the proposed control structure.

Figure 3. Model learning predictive control scheme.

Controller Formulation. As explained before, we utilize an MPC controller with an LPV model in this ML-MPC method. Additionally, we include two integrators by the use of both δ and Δ in the MPC formulation as described by eq 7.

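$$
\begin{aligned}
\min_{\Delta\delta u_{k+j}^{i}}\quad & \|x_{k+N}^{i}\|_P^2 + \sum_{j=0}^{N-1}\left(\|e_{k+j}^{i}\|_Q^2 + \|\Delta\delta u_{k+j}^{i}\|_R^2\right) \\
\text{subject to:}\quad & x_{k+j+1}^{i} = A(\theta_{k+j}^{i})\,x_{k+j}^{i} + B(\theta_{k+j}^{i})\,u_{k+j}^{i} + B_d\,d_{k+j}^{i} \\
& y_{k+j}^{i} = C x_{k+j}^{i} + D u_{k+j}^{i} \\
& e_{k+j}^{i} = r_{k+j}^{i} - y_{k+j}^{i} \\
& u_{k+j}^{i} = u_{k+j-1}^{i} + \delta u_{k+j}^{i} \\
& \Delta\delta u_{k+j}^{i} = \delta u_{k+j}^{i} - \delta u_{k+j}^{i-1} \\
& \Delta\delta u_{k+j}^{i} = u_{k+j}^{i} - u_{k+j-1}^{i} - u_{k+j}^{i-1} + u_{k+j-1}^{i-1} \\
& x_{k+j}^{i}\in\mathcal{X},\quad u_{k+j}^{i}\in\mathcal{U},\quad x_{k+N}^{i}\in\Omega
\end{aligned}
\tag{7}
$$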

In this problem, we include the estimation of the trajectory of the parameter $\theta_{k+j}^{i}$ and the time-varying disturbance $d_{k+j}^{i}$, which are determined from previous batches. In the learning method proposed in this paper, the parameters and disturbances can be estimated using multiple methods. In particular, we propose three different methods, inspired by the MPC approaches for LPV systems, that interpret the available data in different ways; these methods are presented in the next section.

Remark 1. Note that optimization problem 7 is equivalent to the traditional IL-MPC when the parameter $\theta_{k+j}$ is constant during the batch. This fact shows that IL-MPC is a particular case of the method proposed in this paper (ML-MPC); also, the performance of the controller given by 7 is at least the same as that of 5.

Parameter and Disturbance Estimation Methods. We have considered three different methods for estimating the model parameters and disturbances. The first one is inspired by the Quasi-LPV approach. The second one formulates the estimation problem as an identification problem, while the third method is inspired by adaptive control in the batch direction. All these methods are explained in the following.

Method 1. In the case of chemical processes it is common to have constitutive relations. Therefore, it makes sense to approach the estimation problem in a similar fashion as that seen in ref 31. In this method, we use the prior knowledge of the system (constitutive relations) to define a function of the parameters and disturbances. We can describe this method by defining the following set of equations for $\theta_{k+j}$ and $d_{k+j}$:

$$
\begin{aligned}
\theta_{k+j}^{i} &= f_\theta\!\left(x_{k+j}^{i-1},\ u_{k+j}^{i-1}\right) \\
d_{k+j}^{i} &= f_d\!\left(x_{k+j}^{i-1},\ u_{k+j}^{i-1}\right)
\end{aligned}
\tag{8}
$$

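To illustrate this method, let us consider a simple batch reactor given by the following set of nonlinear equations [2]:

$$
\begin{aligned}
\frac{dT}{dt} &= -\frac{UA}{M C_p}\,(T - T_j) - \frac{\Delta H\,V}{M C_p}\,k_0\,e^{-E/RT}\,C_A^2 \\
\frac{dC_A}{dt} &= -k_0\,e^{-E/RT}\,C_A^2
\end{aligned}
\tag{9}
$$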

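By defining $x_1 = T$, $x_2 = C_A$, $u = T_j$, $\theta = \dfrac{UA}{M C_p(T)}$, and $d = -\dfrac{\Delta H\,V(T)}{M C_p(T)}\,k_0\,e^{-E/RT}\,C_A^2$, the model can be written as

$$
\dot{x}_1 = -\theta\,(x_1 - u) + d
\tag{10}
$$

where θ and d can be calculated by means of the following functions

$$
\begin{aligned}
\theta &= f_\theta(x, u) = \frac{UA}{M C_p(x_1)} \\
d &= f_d(x, u) = -\frac{\Delta H\,V(x_1)}{M C_p(x_1)}\,k_0\,e^{-E/(R x_1)}\,x_2^2
\end{aligned}
$$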
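To connect eq 10 with the discrete-time prediction model used in the optimization problems, a simple worked step may help. The discretization used by the authors is not shown in this section, so the following is only an assumed forward-Euler approximation with a hypothetical sample time $T_s$:

$$
\begin{aligned}
x_{1,k+1} &= x_{1,k} + T_s\left[-\theta_k\,(x_{1,k} - u_k) + d_k\right] \\
&= \underbrace{(1 - T_s\theta_k)}_{A(\theta_k)}\,x_{1,k} + \underbrace{T_s\theta_k}_{B(\theta_k)}\,u_k + \underbrace{T_s}_{B_d}\,d_k
\end{aligned}
$$

Under this assumption the temperature dynamics take the form $x_{k+1} = A(\theta_k)x_k + B(\theta_k)u_k + B_d d_k$, with the heat-transfer term acting as the scheduling parameter and the reaction heat entering as an additive disturbance.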

Note that the parameters and disturbances can be updated in the ith batch by means of $f_\theta(\cdot)$ and $f_d(\cdot)$. More details about this example are given in the next section (Illustrative Example: Nonlinear Batch Reactor). Naturally, in some cases, it is not possible to have such functions for both the parameters and the disturbance; however, the method can still be used if we have a function for only the parameters or only the disturbance. An advantage of Method 1 is the ability to estimate a complete trajectory for time-varying parameters. However, it requires additional measurements besides the available input−output data, namely state information. Finally, the original optimization problem can be written as

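$$
\begin{aligned}
\min_{\Delta\delta u_{k+j}^{i}}\quad & \|x_{k+N}^{i}\|_P^2 + \sum_{j=0}^{N-1}\left(\|e_{k+j}^{i}\|_Q^2 + \|\Delta\delta u_{k+j}^{i}\|_R^2\right) \\
\text{subject to:}\quad & x_{k+j+1}^{i} = A(\theta_{k+j}^{i})\,x_{k+j}^{i} + B(\theta_{k+j}^{i})\,u_{k+j}^{i} + B_d\,d_{k+j}^{i} \\
& y_{k+j}^{i} = C x_{k+j}^{i} + D u_{k+j}^{i} \\
& e_{k+j}^{i} = r_{k+j}^{i} - y_{k+j}^{i} \\
& u_{k+j}^{i} = u_{k+j-1}^{i} + \delta u_{k+j}^{i} \\
& \Delta\delta u_{k+j}^{i} = \delta u_{k+j}^{i} - \delta u_{k+j}^{i-1} \\
& x_{k+j}^{i}\in\mathcal{X},\quad u_{k+j}^{i}\in\mathcal{U},\quad x_{k+N}^{i}\in\Omega \\
& \theta_{k+j}^{i} = f_\theta\!\left(x_{k+j}^{i-1},\ u_{k+j}^{i-1}\right)\in\Theta \\
& d_{k+j}^{i} = f_d\!\left(x_{k+j}^{i-1},\ u_{k+j}^{i-1}\right)\in\mathcal{D}
\end{aligned}
\tag{11}
$$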
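A minimal sketch of the Method 1 update for the batch reactor is given below. The reactor constants, the temperature dependence of the heat capacity and of the volume, and the sample data are all hypothetical placeholders (the values used in the paper are not given in this section); only the structure of the two functions and their evaluation along the previous batch follow the text.

```python
import math

# Hypothetical reactor constants, chosen only to make the sketch runnable.
UA, M, K0, E_OVER_R, DELTA_H = 50.0, 100.0, 2.0e3, 3.0e3, -40.0


def heat_capacity(T):
    """Assumed temperature-dependent C_p(T)."""
    return 4.0 + 0.002 * (T - 300.0)


def volume(T):
    """Assumed temperature-dependent V(T)."""
    return 1.0 + 1.0e-4 * (T - 300.0)


def f_theta(x, u):
    """theta = UA / (M * C_p(x1)); u is unused but kept to match the signature of eq 8."""
    return UA / (M * heat_capacity(x[0]))


def f_d(x, u):
    """d = -(dH * V(x1)) / (M * C_p(x1)) * k0 * exp(-E / (R * x1)) * x2**2."""
    T, c_a = x
    return (-DELTA_H * volume(T) / (M * heat_capacity(T))
            * K0 * math.exp(-E_OVER_R / T) * c_a ** 2)


def update_model(x_prev, u_prev):
    """Evaluate f_theta and f_d along the trajectory of the previous batch."""
    theta = [f_theta(x, u) for x, u in zip(x_prev, u_prev)]
    d = [f_d(x, u) for x, u in zip(x_prev, u_prev)]
    return theta, d


# Made-up state and input samples standing in for measurements of batch i-1.
x_prev = [(300.0, 1.0), (310.0, 0.8), (320.0, 0.6)]
u_prev = [305.0, 315.0, 320.0]
theta_i, d_i = update_model(x_prev, u_prev)
print(theta_i, d_i)
```

The returned trajectories would then be passed to the controller as fixed data for batch i, so that the prediction model stays linear in the states and inputs during the optimization.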

Method 2. In this method we use an approach similar to the data driven identification seen in Forgione et al. [38]. We use an optimization problem to estimate the parameters and disturbance from the input and output data. The cost function of this problem is user defined and depends on the a priori knowledge the user has about the process. This problem in general can be described by

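$$
\begin{aligned}
\left[\theta_{k+j}^{i},\ d_{k+j}^{i}\right] = \arg\min_{\theta_{k+j}\in\Theta,\ d_{k+j}\in\mathcal{D}}\quad & J\!\left(x_{k+j}^{i-1},\ u_{k+j}^{i-1}\right) \\
\text{subject to:}\quad & x_{k+j+1}^{i-1} = A(\theta_{k+j}^{i})\,x_{k+j}^{i-1} + B(\theta_{k+j}^{i})\,u_{k+j}^{i-1} + B_d\,d_{k+j}^{i} \\
& y_{k+j}^{i-1} = C x_{k+j}^{i-1} + D u_{k+j}^{i-1} \\
& d_{k+j}^{i}\in\mathcal{D},\quad \theta_{k+j}^{i}\in\Theta
\end{aligned}
\tag{12}
$$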

where J(·) is usually the typical quadratic cost function used in least squares problems. Note that problem 12 can be converted into a quadratic programming problem. This is a significant advantage compared with the proposal of ref 38 since both the parameter and disturbance can be estimated by solving a single optimization problem. Despite its advantages, this method is suitable for real time applications only if the parameter $\theta_{k+j}$ does not vary in time ($\theta_{k+j} = \theta$). Otherwise, problem 12 is a nonlinear optimization problem, which is usually more difficult to solve. Another problem with this method is that by using input−output data only, the optimal solution could be affected by the noise in the system, resulting in undesirable behavior by the controller. On the other hand, it can happen that the number of parameters and disturbances is larger than the size of the input−output data. This results in an ill-posed problem. These issues are analyzed in the illustrative example presented in the next section (Illustrative Example: Nonlinear Batch Reactor). Finally, with Method 2, the original optimization problem does not change its formulation.

Method 3. This third method has some similarities to the original ILC method, where the error trajectory of the batch is utilized to estimate the parameters and disturbance. Method 3 is inspired by adaptive control strategies in the presence of unknown input gain. To this end, a projection of the error as described in Hovakimyan et al. [39] is used to update the parameter and disturbance. Mathematically, the parameters and disturbances can be calculated as

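$$
\begin{aligned}
\hat{\theta}^{i}_{k+j} &= \hat{\theta}^{i-1}_{k+j} + \Gamma_{\theta}\, \mathrm{Proj}\big(\hat{\theta}^{i-1}_{k+j},\ -x^{i-1}_{k+j}\, e^{i-1}_{k+j}\big) \\
\hat{d}^{i}_{k+j} &= \hat{d}^{i-1}_{k+j} + \Gamma_{d}\, \mathrm{Proj}\big(\hat{d}^{i-1}_{k+j},\ e^{i-1}_{k+j}\big)
\end{aligned}
\tag{13}
$$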

where e is the difference between the output data and the output predicted with the input data of the previous batch. Furthermore, Proj(·, ·) is the projection operator, which by definition ensures boundedness of the parametric estimates. It is defined as39

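$$
\mathrm{Proj}(\theta, y) =
\begin{cases}
y, & \text{if } f(\theta) < 0 \\
y, & \text{if } f(\theta) \ge 0 \text{ and } \nabla f^{T} y \le 0 \\
y - \dfrac{\nabla f}{\lVert \nabla f \rVert} \left\langle \dfrac{\nabla f}{\lVert \nabla f \rVert},\, y \right\rangle f(\theta), & \text{if } f(\theta) \ge 0 \text{ and } \nabla f^{T} y > 0
\end{cases}
\tag{14}
$$

$$
f(\theta) \triangleq \frac{(\epsilon_{\theta} + 1)\, \theta^{T} \theta - \theta_{\max}^{T} \theta_{\max}}{\epsilon_{\theta}\, \theta_{\max}^{T} \theta_{\max}}
$$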
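As an illustration of how the projection operator of eq 14 acts on an update direction, the following Python sketch may help. It is not the authors' implementation: the function name `proj`, the default `eps`, and the choice to treat `theta_max` as a scalar bound on the norm of θ are assumptions made for this example.

```python
import numpy as np

def proj(theta, y, theta_max, eps=0.1):
    """Sketch of the projection operator of eq 14 (names are illustrative)."""
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    y = np.atleast_1d(np.asarray(y, dtype=float))
    bound = theta_max ** 2  # assumption: theta_max is a scalar norm bound
    # Convex function f: negative well inside the bound, 1 on ||theta|| = theta_max
    f = ((eps + 1.0) * (theta @ theta) - bound) / (eps * bound)
    grad = 2.0 * (eps + 1.0) * theta / (eps * bound)
    # First two cases of eq 14: the update direction is left unchanged
    if f < 0.0 or grad @ y <= 0.0:
        return y
    # Third case: scale down the component of y along the outward normal
    normal = grad / np.linalg.norm(grad)
    return y - normal * (normal @ y) * f

inside = proj([0.5, 0.0], [1.0, 1.0], theta_max=2.0)
on_bound_outward = proj([2.0, 0.0], [1.0, 1.0], theta_max=2.0)
on_bound_inward = proj([2.0, 0.0], [-1.0, 1.0], theta_max=2.0)
```

Well inside the bound the direction passes through unchanged. On the bound, f(θ) = 1, so the outward component is removed completely, while inward-pointing directions are again untouched.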

Additional information on the projection operator can be found in the work of Hovakimyan et al.39 Similar to Method 1, the parameter and disturbance can be estimated separately, as long as both can be derived from a good projection of the state and/or error data. Because different projections of the same data points can give information about both the parameters and the disturbance, this method is suitable for time-varying parameters.

Illustrative Example: Nonlinear Batch Reactor. We apply the estimation methods described in the previous section to a relatively simple example of a nonlinear batch reactor. This example has also been used for the evaluation of an IL-MPC method in Oh and Lee.2 In the reactor, an exothermic reaction takes place to form the required product C by means of the reaction A + B → C. To achieve the desired product quality, a specific temperature profile should be tracked. The jacket temperature Tj is used to achieve this goal. The dynamics of the reactor are described by the set of differential equations presented in eq 15.

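$$
\begin{aligned}
\frac{dT}{dt} &= -\frac{UA}{MC_{p}} (T - T_{j}) - \frac{\Delta H V}{MC_{p}}\, k_{0}\, e^{-E/RT}\, C_{A}^{2} \\
\frac{dC_{A}}{dt} &= -k_{0}\, e^{-E/RT}\, C_{A}^{2}
\end{aligned}
\tag{15}
$$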

where T is the reactor temperature and Tj is the temperature of the heating/cooling jacket. The parameters of the process dynamics are shown in Table 1.
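The right-hand side of eq 15 can be sketched in Python as follows. The function name and argument names are hypothetical, the default values are simply those listed in Table 1, and the sketch assumes that T and Tj are supplied in kelvin so that the Arrhenius term with E/R in K is meaningful.

```python
import math

def reactor_rhs(T, CA, Tj, UA_MCp=0.09, dHV_MCp=-1.64, E_R=13550.0, k0=2.53):
    """Right-hand side of eq 15 (illustrative sketch).

    T, Tj: reactor and jacket temperature in kelvin (assumption)
    CA:    concentration of A in mol/L
    Defaults are the values listed in Table 1.
    """
    rate = k0 * math.exp(-E_R / T) * CA ** 2      # second-order Arrhenius rate
    dT_dt = -UA_MCp * (T - Tj) - dHV_MCp * rate   # jacket heat transfer + reaction heat
    dCA_dt = -rate                                # consumption of A
    return dT_dt, dCA_dt

# Without reactant, only the jacket term acts on the temperature.
dT_no_reaction, dCA_no_reaction = reactor_rhs(300.0, 0.0, 310.0)
```

Because ΔHV/MCp is negative in Table 1, the reaction term raises the temperature, as expected for an exothermic reaction.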

As stated in the previous sections, the nonlinear reactor model of eq 15 can be written as

$$
\dot{x} = -\theta\, (x - u) + d \tag{16}
$$

In this example, θ is assumed to be a constant parameter and d a time-varying disturbance. Although θ is considered constant during the batch, this parameter could change over batches due to fouling, as it includes the heat transfer coefficient of the heating jacket. In comparing the three methods, we consider the same MPC settings. We set a desired temperature profile as shown in Figure 4 (ref 2).

The MPC is tuned with the control horizon N = 30 and the cost matrices Q = 106, R = 106, and the inputs are in the sets u ∈ [10, 50], δu ∈ [−1, 1], and Δu ∈ [−10, 10]. For initialization of this reactor problem we apply as the initial

parameter 30

θ = and initial disturbance vector d0 = [0 ··· 0]ᵀ. With these parameter and disturbance values, the performance of the MPC in tracking the reference trajectory is not satisfactory. This is not surprising given the plant−model mismatch, as the actual value of θ is 0.09 and the disturbance trajectory is as shown in Figure 5. In the following, we study the performance of these methods in simulation on this batch reactor.

Method 1. For this specific example we do not have a function to estimate θ as in eq 8. Therefore, we assume that we are able to estimate the parameter perfectly, such that $\hat{\theta} = \theta$. Applying this parameter learning with the ML-MPC algorithm results in the reference tracking improvement shown in Figure 6.

Table 1. Nonlinear Batch Reactor Parameters

parameter    value     units
UA/MCp       0.09      1/min
ΔHV/MCp      −1.64     K·L/mol
E/R          13 550    K
k0           2.53      L/(mol·min)
CA(0)        0.9       mol/L
T(0)         20        °C

Figure 4. Reactor temperature: reference profile.

To improve the reference tracking further, Method 1 can be implemented on the batch reactor to estimate the disturbance vector based on the measurable temperature T and concentration CA in the reactor. From the differential equation of the reactor, we know that the reaction rate term can be treated as the disturbance, as shown in eq 17.

$$
\hat{d}^{i}_{k} = f_{d}\big(T^{i-1}_{k},\, C^{i-1}_{A,k}\big) = -\frac{\Delta H V}{MC_{p}}\, k_{0}\, e^{-E/RT^{i-1}_{k}}\, \big(C^{i-1}_{A,k}\big)^{2}
\tag{17}
$$

which results in the improvement of reference tracking from t = 0 min until t = 6 min, as shown in Figure 7. In this figure, the reference tracking improvement is clearly seen at the beginning of the batch. This is expected because the reaction rate has its highest amplitude at the start of the batch.
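A minimal sketch of this Method 1 learning step, assuming that the temperature and concentration trajectories of the previous batch are available as arrays, is given below. The function name is hypothetical, the defaults are the Table 1 values, and temperatures are assumed to be in kelvin.

```python
import numpy as np

def disturbance_from_measurements(T_prev, CA_prev, dHV_MCp=-1.64, E_R=13550.0, k0=2.53):
    """Evaluate eq 17 along the measured trajectories of the previous batch.

    T_prev:  temperatures of batch i-1 in kelvin (assumption)
    CA_prev: concentrations of batch i-1 in mol/L
    Returns the disturbance trajectory used by the model in batch i.
    """
    T_prev = np.asarray(T_prev, dtype=float)
    CA_prev = np.asarray(CA_prev, dtype=float)
    return -dHV_MCp * k0 * np.exp(-E_R / T_prev) * CA_prev ** 2

# Toy check with the exponential switched off (E_R = 0) and k0 = 1:
d_hat = disturbance_from_measurements([300.0, 310.0], [1.0, 0.5], E_R=0.0, k0=1.0)
```

With the exponential switched off, the result reduces to 1.64·CA², which makes the sign convention of eq 17 easy to verify by hand.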

Method 2. In this method, we use the Euler approximation to represent the differential equation as a difference equation. The derivation is as follows:

$$
\begin{aligned}
\dot{x} \approx \underbrace{\frac{x_{k+1} - x_{k}}{\Delta t}}_{\beta_{k}} &= \theta\, \underbrace{(T_{j,k} - T_{k})}_{\alpha_{k}} + d_{k}, \quad \forall k = 0, \cdots, N \\
\underbrace{\begin{bmatrix} \beta_{1} \\ \vdots \\ \beta_{N} \end{bmatrix}}_{b} &= \underbrace{\begin{bmatrix} \alpha_{1} & I & & 0 \\ \vdots & & \ddots & \\ \alpha_{N} & 0 & & I \end{bmatrix}}_{a} \underbrace{\begin{bmatrix} \theta \\ d_{1} \\ \vdots \\ d_{N} \end{bmatrix}}_{\gamma}
\end{aligned}
\tag{18}
$$

where γ can be calculated from the following least-squares problem after substituting the input and output data into a and b:

$$
\min_{\gamma}\ \lVert a\gamma - b \rVert \tag{19}
$$

This optimization problem, however, has two issues. First, it is ill-posed, as there are more parameters than the rank of the matrix a. Second, any noise in the input or output affects the estimation of the disturbance, as shown in Figure 8.

If we evaluate the resulting estimate $\hat{d}$, we see that the noise in the estimated $\hat{d}$ is much larger than in the actual d, as shown in Figure 9.
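The structure of the regression in eqs 18 and 19, and the reason it is ill-posed, can be sketched on synthetic data. Everything below (function names, the toy disturbance, the horizon length, the sampling time) is an assumption made for illustration and does not reproduce the paper's simulation.

```python
import numpy as np

def build_regression(T, Tj, dt):
    """Stack beta_k = theta * alpha_k + d_k into b = a @ gamma (eq 18)."""
    T = np.asarray(T, dtype=float)
    Tj = np.asarray(Tj, dtype=float)
    beta = (T[1:] - T[:-1]) / dt                  # Euler derivative estimate
    alpha = Tj[:-1] - T[:-1]                      # regressor multiplying theta
    n = beta.size
    a = np.hstack([alpha[:, None], np.eye(n)])    # columns: theta, then d at each step
    return a, beta

def simulate(theta, d, Tj, T0, dt):
    """Euler simulation of the scalar model x' = theta * (Tj - x) + d."""
    T = np.empty(d.size + 1)
    T[0] = T0
    for k in range(d.size):
        T[k + 1] = T[k] + dt * (theta * (Tj[k] - T[k]) + d[k])
    return T

n, dt, theta_true = 40, 0.1, 0.09
d_true = 2.0 * np.exp(-0.5 * dt * np.arange(n))   # hypothetical decaying disturbance
Tj = np.full(n + 1, 40.0)
T = simulate(theta_true, d_true, Tj, T0=20.0, dt=dt)

a, b = build_regression(T, Tj, dt)
gamma_true = np.concatenate([[theta_true], d_true])
rank_a = np.linalg.matrix_rank(a)
gamma_ls, *_ = np.linalg.lstsq(a, b, rcond=None)  # minimum-norm least-squares solution
```

Here `a` has n rows but n + 1 unknowns, so many vectors γ fit the data exactly. The minimum-norm solution returned by `lstsq` reproduces b but generally differs from the true (θ, d), which is why additional information such as regularization is needed.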

A solution to overcome such problems is to use a regularization method. In the case of the batch reactor, we know that the reaction rate converges to zero, and therefore the ℓ1 or "lasso" regularization is a good option. The least-squares problem then becomes

$$
\min_{\gamma}\ \lVert a\gamma - b \rVert + \lambda \lVert \gamma \rVert_{1} \tag{20}
$$

where λ is a tuning parameter, here λ = 100. The solution of this problem for the first batch is shown in Figure 10. We can see how the regularization term filters the system noise out of the solution.
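One common way to solve an ℓ1-regularized least-squares problem is the iterative soft-thresholding algorithm (ISTA). The sketch below is not taken from the paper, which does not state the solver used. It also minimizes the squared-residual variant 0.5‖aγ − b‖² + λ‖γ‖₁, so a value of λ tuned for eq 20 does not carry over directly.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (element-wise shrinkage toward zero)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def lasso_ista(a, b, lam, n_iter=500):
    """Minimize 0.5 * ||a @ g - b||_2^2 + lam * ||g||_1 by proximal gradient steps."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    step = 1.0 / np.linalg.norm(a, 2) ** 2   # 1 / Lipschitz constant of the gradient
    g = np.zeros(a.shape[1])
    for _ in range(n_iter):
        grad = a.T @ (a @ g - b)             # gradient of the smooth term
        g = soft_threshold(g - step * grad, step * lam)
    return g

# With a = I the minimizer is the soft-thresholded data, which is easy to check.
gamma_toy = lasso_ista(np.eye(3), [3.0, -0.5, 1.5], lam=1.0)
```

The shrinkage step sets small entries exactly to zero, which matches the prior knowledge used in the text that the reaction rate decays to zero.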

Figure 5. Reaction rate profile.

Figure 6. Input and outputs with Method 1.

Figure 7. Input and outputs with Method 1: From t = 0 min until t = 6 min.

Figure 8. Least squares estimation with noise: T − T(0).

Figure 9. Least squares estimation with noise: disturbance d.


The result of the estimation for $\hat{d}$ is shown in Figure 11. In this figure, we see that there are still some fluctuations in the least-squares estimate of $\hat{d}$. These fluctuations can, however, be removed using a low-pass filter to obtain a good estimate of the trajectory $\hat{d}$. This is shown by the "filtered ℓ1 LS" result, compared with d.

Now we can implement the estimated $\hat{\theta}$ in the controller model for the next batch. The result for the updated parameter is shown in Figure 12. In this result, we see that the estimation of θ gives a direct improvement in the reference tracking performance. By implementing the estimated $\hat{d}$ for the next batch, we see an improvement in the reference tracking performance of the controller at the start of the batch, as shown in Figure 13.

Method 3. Method 3 is implemented using the learning function in eq 21 based on the error.

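$$
\begin{aligned}
\hat{\theta}^{i}_{k+j} &= \hat{\theta}^{i-1}_{k+j} + \Gamma_{\theta}\, \mathrm{Proj}\big(\hat{\theta}^{i-1}_{k+j},\ -x^{i-1}_{k+j}\, e^{i-1}_{k+j}\big) \\
\hat{d}^{i}_{k} &= \hat{d}^{i-1}_{k} + \lambda\, e^{i-1}_{k}
\end{aligned}
\tag{21}
$$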
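The disturbance update in eq 21 has the form of a classical ILC learning law. The toy sketch below assumes, purely for illustration, that the tracking error of a batch equals the mismatch between a fixed true disturbance and its current estimate; in the closed-loop setting of the paper the error also contains controller effects.

```python
import numpy as np

def update_disturbance(d_prev, e_prev, lam=0.25):
    """Batch-to-batch disturbance update of eq 21: d^i = d^(i-1) + lam * e^(i-1)."""
    return np.asarray(d_prev, dtype=float) + lam * np.asarray(e_prev, dtype=float)

d_true = np.array([2.0, 1.0, 0.5, 0.0])   # hypothetical true disturbance trajectory
d_hat = np.zeros_like(d_true)             # initial estimate, as in the example
error_norms = []
for batch in range(10):
    e = d_true - d_hat                    # toy assumption: error equals estimation mismatch
    error_norms.append(np.linalg.norm(e))
    d_hat = update_disturbance(d_hat, e)
```

Under this assumption the error norm contracts by a factor 1 − λ = 0.75 per batch, which shows how λ trades learning speed against sensitivity to noise in e.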

With λ = 0.25 we obtain the results presented in Figures 14 and 15.

An interesting observation from these results relates to the stability of this learning method. In the original ILC algorithm using eq 3, it is known that high-frequency input oscillations can reduce the controller performance over iterations. In this MPC implementation, the δ formulation prevents these high-frequency fluctuations through the penalty on the input change δu, and therefore helps to stabilize the learning algorithm.

It is important to highlight that some methods are better at disturbance learning and others at parameter learning; therefore, we compare both aspects separately for the specific batch reactor. We compare the reference tracking performance over batches by taking ∥e∥₂ as the performance measure.

In Figure 16, we present the performance of each method for the learning of the parameter. In this figure, 10 batches are displayed, where the parameter of the plant is changed from θ = 0.09 to 0.05 between the sixth and the seventh batch. This causes an increase in the error during the seventh batch.

From this plot we can see that Methods 1 and 2 have almost equal performance, meaning that the least-squares optimization is well-defined for this reactor problem, as Method 1 is able to learn the parameter and disturbance exactly. Method 3 shows clearly slower learning for this application. When we compare these results to the work of Oh and Lee,2 we see that their IL-MPC algorithm converges in approximately five to six batches, whereas ML-MPC converges in approximately two batches, so that learning over batches is more than twice as fast as in the work of Oh and Lee.2 Of course, this also depends on the quality of the estimation method used.

For the learning of the disturbance, we can observe that Method 1 gives a better performance than Method 2, as seen in Figure 17. This can be explained by the remaining mismatch in the disturbance estimation shown previously in Figure 11, whereas Method 1 estimates the exact reaction rate. Method 3, in the way it is implemented, includes errors caused by the controller in the disturbance trajectory. With this property, it is able to improve the controller performance over batches compared to the other two methods. The drawback of Method 3 is that the disturbance trajectory cannot be interpreted as the reaction rate.

From the results we can conclude that the proposed methods offer good controller improvement over batches and are thus able to take into account changes in the system. The results, however, also show that each estimation method has its own strengths and limitations.

To implement the ML-MPC, it remains important to have good knowledge of the process in order to select the right estimation method for each application. We have also observed that for the batch reactor Method 1 would be the best option for implementation, under the assumption that both temperature and concentration can be measured. If the concentration cannot be measured in the plant, Method 2 offers a good alternative.

In the next section, we apply the ML-MPC algorithm to a more advanced process, a reactive batch distillation column, using only one of the learning methods.

Figure 10. ℓ1 least-squares estimation: T − T(0).

Figure 11. ℓ1 least-squares estimation: disturbance d.

Figure 12. Input and outputs with Method 2.

Figure 13. Input and outputs with Method 2: From t = 0 min until t = 6 min.

Figure 14. Input and outputs with Method 3.

Figure 15. Input and outputs with Method 3: From t = 0 min until t = 6 min.

■ REACTIVE BATCH DISTILLATION PROCESSES
A reactive batch distillation (RBD) column is a unit operation generally used for chemical processes involving equilibrium reactions. With the use of a distillation column, light products are removed so that the reaction proceeds in the direction of the products. A general form of an RBD is illustrated in Figure 18. The reactor of such an RBD is enclosed by a jacket, which can heat or cool the reactor. On top of the reactor is the distillation column with multiple trays. The vapor rising from the reactor and the liquid flow (reflux) coming from the top of the column come into contact at the trays with liquid holdup.

Rigorous Dynamic Model of RBD. A general rigorous dynamic model for the ith stage of the RBD is given by the following set of equations:

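$$
\begin{aligned}
\frac{d n^{l}_{i,j}}{dt} &= L_{i-1}\, x_{i-1,j} - L_{i}\, x_{i,j} - \xi_{i,j} + n_{L_i}\, r_{i,j} \\
\frac{d n^{g}_{i,j}}{dt} &= V_{i+1}\, y_{i+1,j} - V_{i}\, y_{i,j} + \xi_{i,j} \\
\frac{d (n_{T,i}\, h_{i})}{dt} &= L_{i-1}\, h_{i-1} - L_{i}\, h_{i} - V_{i} H_{i} + V_{i+1} H_{i+1} + n_{L_i} \sum_{l=1}^{N_R} (-\Delta H_{i,l})\, r_{i,l} + Q_{\mathrm{ext},i} \\
&\forall i = 1, \ldots, N_T, \quad \forall j = 1, \ldots, N_C
\end{aligned}
\tag{22}
$$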

where $n^{l}_{i,j}, n^{g}_{i,j}, r_{i,j}, \xi_{i,j} \in \mathbb{R}$ are the numbers of moles in the liquid and gas phases, the reaction rate, and the mass transfer rate of the jth component, respectively. $V_i \in \mathbb{R}$ and $L_i \in \mathbb{R}$ are the outlet gas and liquid flows, and $V_{i+1} \in \mathbb{R}$ and $L_{i-1} \in \mathbb{R}$ are the inlet gas and liquid flows from the (i + 1)th and (i − 1)th trays. Furthermore, $h_i, H_i, -\Delta H_{i,l}, Q_{\mathrm{ext},i} \in \mathbb{R}$ are the liquid, gas, and reaction molar enthalpies and the external heat flow, respectively. In addition, $n_{T,i}$ is the total number of moles, given by $n_{T,i} = n_{L_i} + n_{G_i}$, where $n_{L_i}$ and $n_{G_i}$ are the total liquid and gas molar holdups. The liquid mole fraction $x_{i,j}$ and the vapor mole fraction $y_{i,j}$ are given by

$$
n_{L_i} = \sum_{j=1}^{N_C} n^{l}_{i,j}, \quad n_{G_i} = \sum_{j=1}^{N_C} n^{g}_{i,j}, \quad x_{i,j} = \frac{n^{l}_{i,j}}{n_{L_i}}, \quad y_{i,j} = \frac{n^{g}_{i,j}}{n_{G_i}}, \qquad \forall i = 1, \ldots, N_T, \quad \forall j = 1, \ldots, N_C
$$

Figure 16. ∥e∥₂ convergence for parameter learning using different methods.

Figure 17. ∥e∥₂ convergence for disturbance learning using different methods.

Figure 18. Illustration of a reactive batch distillation column.
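The holdup and mole-fraction definitions that accompany eq 22 translate directly into array operations. The sketch below is illustrative only: it assumes the component moles are stored as arrays with one row per stage and one column per component, and that every stage has a nonzero liquid and gas holdup. The function and variable names are hypothetical.

```python
import numpy as np

def holdups_and_fractions(n_l, n_g):
    """Total holdups and mole fractions per stage.

    n_l, n_g: arrays of shape (N_T, N_C) with liquid and gas moles per component.
    Assumes nonzero total holdups on every stage (no division-by-zero guard).
    """
    n_l = np.asarray(n_l, dtype=float)
    n_g = np.asarray(n_g, dtype=float)
    n_L = n_l.sum(axis=1)          # total liquid holdup per stage
    n_G = n_g.sum(axis=1)          # total gas holdup per stage
    x = n_l / n_L[:, None]         # liquid mole fractions x_{i,j}
    y = n_g / n_G[:, None]         # vapor mole fractions y_{i,j}
    return n_L, n_G, x, y

n_L, n_G, x, y = holdups_and_fractions(
    [[2.0, 1.0, 1.0], [3.0, 0.0, 1.0]],
    [[0.5, 0.5, 0.0], [0.1, 0.2, 0.7]],
)
```

When the vapor holdup is neglected, as in the case study later in the paper, the gas-phase fractions would have to come from the equilibrium relation instead of this ratio.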

The model given by the set of eqs 22 is subject to the conditions presented in Table 2.

LPV Representation of an RBD Column. The extent decomposition for RBD processes can be achieved by finding linear transformations $\mathcal{T}_{l_i}$ and $\mathcal{T}_{g_i}$, similar to the one computed in Amrhein et al.,40 such that system 22 can be represented in terms of new independent states for the gas and the liquid phase as follows:

Liquid phase

$$
\begin{bmatrix} x_{r,l_i} \\ x_{\mathrm{in},l_i} \\ \lambda_{l_i} \end{bmatrix}
=
\underbrace{\begin{bmatrix} \mathcal{T}^{T}_{1e,l_i} \\ \mathcal{T}^{T}_{2e,l_i} \\ \mathcal{T}^{T}_{3e,l_i} \end{bmatrix}}_{\mathcal{T}_{e,l_i}} n^{l}_{i}
\tag{23}
$$

Gas phase

$$
\begin{bmatrix} x_{\mathrm{in},g_i} \\ \lambda_{g_i} \end{bmatrix}
=
\underbrace{\begin{bmatrix} \mathcal{T}^{T}_{1e,g_i} \\ \mathcal{T}^{T}_{2e,g_i} \end{bmatrix}}_{\mathcal{T}_{e,g_i}} n^{g}_{i}
\tag{24}
$$

where $x_{r,l_i} \in \mathbb{R}^{N_R}$ is the extent of reaction, $x_{\mathrm{in},\{l,g\}_i} \in \mathbb{R}^{p+1}$ are the extents of inlet flow, and $x_{\mathrm{inv},l_i} \in \mathbb{R}^{N_C - N_R - p - 1}$ and $x_{\mathrm{inv},g_i} \in \mathbb{R}^{N_C - p - 1}$ are the extent of reaction and inlet flow invariants. Finally, we obtain the following LPV representation for the ith tray:

$$
\begin{aligned}
\begin{bmatrix} \dot{x}_{r,l_i} \\ \dot{x}_{e,g_i} \\ \dot{T}_{i} \end{bmatrix}
&=
\begin{bmatrix} A_{e,l_i}(\theta_{l_i}) & 0 & 0 \\ 0 & A_{e,g_i}(\theta_{g_i}) & 0 \\ 0 & 0 & \theta_{T_i} \end{bmatrix}
\begin{bmatrix} x_{r,l_i} \\ x_{e,g_i} \\ T_{i} \end{bmatrix}
+
\begin{bmatrix} B_{e,l_i} & 0 & 0 \\ 0 & B_{e,g_i} & 0 \\ \alpha_{L_i} & \alpha_{V_i} & \alpha_{Q_i} \end{bmatrix}
\begin{bmatrix} u_{l_i} \\ u_{g_i} \\ Q_{\mathrm{ext},i} \end{bmatrix}
+
\begin{bmatrix} B_{d,l_i} \\ B_{d,g_i} \\ B_{d,T_i} \end{bmatrix} d_{i} \\
\begin{bmatrix} n^{l}_{i} \\ T_{i} \end{bmatrix}
&=
\begin{bmatrix} C_{l_i} & 0 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} x_{e,l_i} \\ T_{i} \end{bmatrix}
\end{aligned}
\tag{25}
$$

Details about the extent transformation for the RBD can be found in the work of Marquez-Ruiz et al.9

Case Study: Synthesis of Unsaturated Polyester in an RBD. Consider an industrial reactive batch distillation process for the synthesis of unsaturated polyester from maleic anhydride and propylene glycol. This process involves four types of reactions.41 First, maleic anhydride reacts with propylene glycol and produces a maleic acid end group and a propylene glycol end group with an ester bridge. This is a fast and exothermic reaction (ΔH = −40 kJ/mol). Esterification proceeds by the reaction of different acid and alcohol end groups to form new ester bridges and water, or by reaction of a glycol hydroxyl group with an acid end group to form an ester bridge and water. The double bond in maleic anhydride can be isomerized or saturated. Saturation of the double bond causes cross-linking in the polymer, and approximately 10−20% of the double bonds are saturated in the preparation of the polyester. The three reactions, esterification, isomerization, and saturation, form a network of nine reactions. These reactions are summarized in Table 3.

The RBD consists of 6 stages (NT = 6) with only 3 internal trays. Stage equilibrium is assumed in the process, with the NRTL activity coefficient model for the liquid phase; the molar vapor holdup is negligible with respect to the molar liquid holdup, an isothermal total condenser is assumed, and the reaction is limited to the reboiler (reactor). Only four species are assumed to evaporate (p = 4). In total, the chemical process has 8 species (NC = 8) and 9 reactions, of which only 5 are independent (NR = 5); i.e., the stoichiometric matrix $N^{T} \in \mathbb{R}^{8 \times 9}$ has rank 5. The mole and temperature initial conditions in the reactor are $n^{l}_{N_T,\mathrm{RCOOH_{1D}}}(0) = 20$ kmol, $n^{l}_{N_T,\mathrm{R'OH}}(0) = 20$ kmol, and $T_{N_T}(0) = 373$ K.
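The statement that only 5 of the 9 reactions are independent can be checked numerically from the stoichiometry in Table 3. The sketch below is an illustration, not part of the original work: the species ordering and the string labels are choices made here, and the reaction list assumes the reading of Table 3 in which the reactants are on the left and the products on the right of each reaction.

```python
import numpy as np

species = ["RCOOH_1D", "RCOOH_2D", "RCOOH_S", "R'OH",
           "RCOOR'_1D", "RCOOR'_2D", "RCOOR'_S", "H2O"]

# One dict per reaction of Table 3: negative = consumed, positive = produced.
reactions = [
    {"RCOOH_1D": -1, "R'OH": -1, "RCOOR'_1D": 1, "H2O": 1},   # 1: esterification
    {"RCOOH_2D": -1, "R'OH": -1, "RCOOR'_2D": 1, "H2O": 1},   # 2: esterification
    {"RCOOH_S": -1, "R'OH": -1, "RCOOR'_S": 1, "H2O": 1},     # 3: esterification
    {"RCOOH_1D": -1, "RCOOH_2D": 1},                          # 4: isomerization
    {"RCOOR'_1D": -1, "RCOOR'_2D": 1},                        # 5: isomerization
    {"RCOOH_1D": -1, "R'OH": -0.5, "RCOOH_S": 1},             # 6: saturation
    {"RCOOH_2D": -1, "R'OH": -0.5, "RCOOH_S": 1},             # 7: saturation
    {"RCOOR'_1D": -1, "R'OH": -0.5, "RCOOR'_S": 1},           # 8: saturation
    {"RCOOR'_2D": -1, "R'OH": -0.5, "RCOOR'_S": 1},           # 9: saturation
]

# Stoichiometric matrix N with one row per reaction and one column per species.
N = np.zeros((len(reactions), len(species)))
for r, reaction in enumerate(reactions):
    for name, coefficient in reaction.items():
        N[r, species.index(name)] = coefficient

n_independent = np.linalg.matrix_rank(N)
```

The transpose `N.T` has shape 8 × 9 and the computed rank is 5, in agreement with NR = 5. For example, reaction 7 equals reaction 6 minus reaction 4, and reaction 9 equals reaction 8 minus reaction 5.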

Control Objective. In the RBD, the objective is to steer the temperature of the reactor to a reference temperature. This reference is a predefined trajectory chosen to obtain optimal product qualities from a batch in the RBD. This reference profile is shown in Figure 19. With this temperature profile, the water in the reactor vessel is eliminated, as shown by the bottom plot in Figure 19.

Table 2. Conditions of the Rigorous Dynamic Modeling of RBD Processes

stage      L_{i−1}    L_i          V_{i+1}    V_i    ξ_{i,j}    x_{i−1,j}    r_{i,j}      Q_{ext,i}
i = 1      L_D        0            0          0      0          x_{i+1,j}    0            0
i = 2      0          L_i + L_D    V_{i+1}    0      ξ_{i,j}    0            0            0
i = N_T    L_{i−1}    0            0          V_i    ξ_{i,j}    x_{i−1,j}    r_{N_T,j}    Q_{ext,N_T}

Table 3. Basic Reactions in the Polyesterification of Unsaturated Carboxylic Acids with Diols41

reaction    stoichiometry
1           RCOOH_1D + R′OH ⇌ RCOOR′_1D + H2O
2           RCOOH_2D + R′OH ⇌ RCOOR′_2D + H2O
3           RCOOH_S + R′OH ⇌ RCOOR′_S + H2O
4           RCOOH_1D ⇌ RCOOH_2D
5           RCOOR′_1D ⇌ RCOOR′_2D
6           RCOOH_1D + 0.5R′OH ⇌ RCOOH_S
7           RCOOH_2D + 0.5R′OH ⇌ RCOOH_S
8           RCOOR′_1D + 0.5R′OH ⇌ RCOOR′_S
9           RCOOR′_2D + 0.5R′OH ⇌ RCOOR′_S

We have two inputs for the control of the reactor temperature: the heat flow to the reactor and the reflux of distilled liquid at the top of the reactor. In a recent paper (ref 42), several conventional control strategies have been studied for this system. In this work, we consider a model-based control strategy. Tracking the temperature reference is difficult because, once the boiling point of the products in the reactor is reached, the temperature cannot be increased any further by the heat flow input. This behavior can be observed in the optimal profile in Figure 19, where the temperature only increases when it is below the boiling point. The reflux input of the RBD also has interesting dynamic behavior that has to be taken into account. The reflux input only affects the temperature in the reactor when some product has already been distilled. When this is the case, the reflux cools the reactor when the distillation tray above the reactor has a lower temperature than the reactor, while it heats the reactor when the tray above the reactor has a higher temperature than the reactor.

Simulation Results. In this work, we consider the control of the reference temperature by manipulating the heat flow only. We assume that the optimal input trajectory of the reflux, which can be seen in Figure 20, is used for the reflux input.

By considering the manipulation of the heat flow in the reactor jacket only, the LPV model can be reduced to

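$$
\frac{dT_{i}}{dt} = \underbrace{\theta_{T_i}}_{A(\theta)}\, \underbrace{T_{i}}_{x} + \underbrace{\alpha_{Q_i}}_{B(\alpha)}\, \underbrace{Q_{\mathrm{ext},i}}_{u} + \underbrace{\alpha_{L_i} L_{i-1} + \beta_{i}\, r_{i} + \kappa_{i} V_{i}}_{d}
\tag{26}
$$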

First, we apply the ML-MPC with a fixed parameter θ for the batch time. In this case we learn the disturbance by using Method 1, where we assume that the liquid and vapor flows and the mass in the reactor are measurable. With these measurements the disturbance trajectory can be calculated. The results of learning the parameter and disturbance trajectories for the controller are shown in Figure 21.

In these results, we see that, due to a small cooling action around 5000 s in the first batch, the temperature in the reactor drops and is not able to reach the reference anymore. At the same time, the heat flow input is high because the boiling temperature in the reactor has decreased. This boiling point decrease is caused by the reflux that returns water to the reactor. In the second batch, we learn the disturbance caused by the reflux; therefore, the controller cools the reactor slightly less around 5000 s, such that the temperature in the reactor does not decrease and the reference tracking performance is improved for the next 10 000 s.

While this method improves the performance of RBD operation, it does not consider changes in the plant, as the parameters are not re-estimated. We therefore implement another controller with the time-varying parameter. We start in the initial batch with a poorly chosen parameter that is constant for the full batch. In the second batch, the varying trajectory is estimated. The results for this application are shown in Figure 22.

While in the first batch the temperature is only increased by the exothermic reaction in the reactor, as the heat flow input is zero, in the second batch the controller is able to track the reference better. Similar to the previous controller implementation, cooling of the reactor around 5000 s causes a temperature drop in the reactor in the second batch. By learning this undesired behavior, the controller for the following batches increases the temperature around 5000 s to be able to remain at the reference temperature.

We have also tested the ML-MPC algorithm for 15 batches, with changes in the initial conditions of the plant at the 11th and 14th batches. We obtain the convergence of the Euclidean norm of the error over batches as seen in Figure 23.

Figure 19. Optimal reference profiles for the temperature and the water holdup in the reactor.

Figure 20. Optimal reference profile for the input.

Figure 21. Reference tracking after one batch of learning the disturbance trajectory using Method 1.

Figure 22. Reference tracking after one batch of learning the parameter and disturbance trajectory using estimation Method 1.

■ CONCLUSIONS
In this paper we have proposed a model learning predictive control (ML-MPC) method based on the repetitive behavior of batch processes. To this end, the LPV model used in the controller is updated using information from the previous batch. Inspired by the MPC for LPV systems in the literature, three different application-dependent options to estimate the parameters and disturbance of the model have been proposed and compared in simulation on a nonlinear batch reactor. The best controllers are able to converge to their best performance within approximately two batches, which is better than the IL-MPC method in the work of Oh and Lee.2 Finally, by applying the ML-MPC to a reactive batch distillation column, we have shown the ability to adjust the controller to complex nonlinear behaviors. The parameter and disturbance converge to their actual values in a limited number of batches. In the case of changes in the initial conditions, the controller is also able to adapt quickly.

■ AUTHOR INFORMATION
Corresponding Author
*E-mail: [email protected]
Ozkan: 0000-0001-8442-772X
Notes
The authors declare no competing financial interest.

■ REFERENCES
(1) Bonvin, D. Control and optimization of batch processes. IEEE control systems 2006, 26, 34−45.
(2) Oh, S.-K.; Lee, J. M. Iterative learning model predictive control for constrained multivariable control of batch processes. Comput. Chem. Eng. 2016, 93, 284−292.
(3) Bamieh, B.; Giarre, L. Identification of linear parameter varying models. International journal of robust and nonlinear control 2002, 12, 841−853.
(4) Bruzelius, F. Linear parameter-varying systems-an approach to gain scheduling; Chalmers University of Technology, 2004.
(5) Casella, F.; Lovera, M. LPV/LFT modelling and identification: overview, synergies and a case study. CACSD 2008, IEEE International Conference on Computer-Aided Control Systems; 2008; pp 852−857.
(6) Toth, R. Modeling and identification of linear parameter-varying systems; Springer, 2010; Vol. 403.
(7) Marquez-Ruiz, A.; Mendez-Blanco, C.; Ozkan, L. Control of homogeneous reaction systems using extent-based LPV models. IFAC-PapersOnLine 2018, 51, 548−553.
(8) Marquez-Ruiz, A.; Mendez-Blanco, C.; Porru, M.; Ozkan, L. State and Parameter Estimation Based On Extent Transformations. Comput.-Aided Chem. Eng. 2018, 44, 583−588.
(9) Marquez-Ruiz, A.; Mendez-Blanco, C. S.; Ozkan, L. Modeling of reactive batch distillation processes for control. Comput. Chem. Eng. 2019, 121, 86−98.
(10) Arimoto, S.; Kawamura, S.; Miyazaki, F. Bettering operation of robots by learning. Journal of Field Robotics 1984, 1, 123−140.
(11) Wang, Y.; Gao, F.; Doyle, F. J. Survey on iterative learning control, repetitive control, and run-to-run control. J. Process Control 2009, 19, 1589−1600.
(12) Lee, J. H.; Lee, K. S. Iterative learning control applied to batch processes: An overview. Control Engineering Practice 2007, 15, 1306−1318.
(13) Adam, E. J.; Gonzalez, A. H. Frontiers in Advanced Control Systems; InTech, 2012.
(14) Shamma, J. S.; Athans, M. Analysis of gain scheduled control for nonlinear plants. IEEE Trans. Autom. Control 1990, 35, 898−907.
(15) Qin, S. J.; Badgwell, T. A. An overview of nonlinear model predictive control applications. Nonlinear model predictive control 2000, 369−392.
(16) Cueli, J.; Bordons, C. Iterative nonlinear control of a semibatch reactor. Stability analysis. CDC-ECC'05, 44th IEEE Conference on Decision and Control and 2005 European Control Conference, 2005; pp 2071−2076.
(17) Cueli, J.; Bordons, C. Application of iterative nonlinear model predictive control to a batch pilot reactor. IFAC Proceedings Volumes 2005, 38, 63−68.
(18) Cueli, J. R.; Bordons, C. Iterative nonlinear model predictive control. Stability, robustness and applications. Control Engineering Practice 2008, 16, 1023−1034.
(19) Liu, X.; Kong, X. Nonlinear fuzzy model predictive iterative learning control for drum-type boiler−turbine system. J. Process Control 2013, 23, 1023−1040.
(20) Shen, D.; Han, J.; Wang, Y. Convergence analysis of ILC input sequence for underdetermined linear systems. Science China Information Sciences 2017, 60, 099201.
(21) Wang, Y.; Zhang, J.; Zeng, F.; Wang, N.; Chen, X.; Zhang, B.; Zhao, D.; Yang, W.; Cobelli, C. Learning can improve the blood glucose control performance for type 1 diabetes mellitus. Diabetes Technol. Ther. 2017, 19, 41−48.
(22) Nagy, Z. K.; Braatz, R. D. Robust nonlinear model predictive control of batch processes. AIChE J. 2003, 49, 1776−1786.
(23) Lu, Y.; Arkun, Y. Quasi-min-max MPC algorithms for LPV systems. Automatica 2000, 36, 527−540.
(24) Abbas, H. S.; Toth, R.; Meskin, N.; Mohammadpour, J.; Hanema, J. A robust MPC for input-output LPV models. IEEE Trans. Autom. Control 2016, 61, 4183−4188.
(25) Hanema, J.; Toth, R.; Lazar, M. Tube-based anticipative model predictive control for linear parameter-varying systems. IEEE 55th Conference on Decision and Control (CDC), 2016; pp 1458−1463.
(26) Besselmann, T.; Lofberg, J.; Morari, M. Explicit MPC for LPV systems: Stability and optimality. IEEE Trans. Autom. Control 2012, 57, 2322−2332.
(27) Lofberg, J. Approximations of closed-loop minimax MPC. Proceedings of the 42nd IEEE Conference on Decision and Control, 2003; pp 1438−1442.
(28) Casavola, A.; Famularo, D.; Franze, G. A feedback min-max MPC algorithm for LPV systems subject to bounded rates of change of parameters. IEEE Trans. Autom. Control 2002, 47, 1147−1153.
(29) Bemporad, A.; Borrelli, F.; Morari, M. Min-max control of constrained uncertain discrete-time linear systems. IEEE Trans. Autom. Control 2003, 48, 1600−1606.
(30) Hovland, S.; Gravdahl, J. T.; Willcox, K. E. Explicit model predictive control for large-scale systems via model reduction. Journal of guidance, control, and dynamics 2008, 31, 918−926.

Figure 23. Convergence of the Euclidean norm over batches, with changes in the initial conditions at batches 11 and 14.


(31) Cisneros, P. S.; Werner, H. Parameter-dependent stability conditions for quasi-LPV Model Predictive Control. American Control Conference (ACC), 2017; pp 5032−5037.
(32) Cisneros, P. S.; Voss, S.; Werner, H. Efficient nonlinear model predictive control via quasi-lpv representation. IEEE 55th Conference on Decision and Control (CDC), 2016; pp 3216−3221.
(33) Lee, J. H.; Lee, K. S.; Kim, W. C. Model-based iterative learning control with a quadratic criterion for time-varying linear systems. Automatica 2000, 36, 641−657.
(34) Moore, K. L.; Chen, Y.; Ahn, H.-S. Iterative learning control: A tutorial and big picture view. 45th IEEE Conference on Decision and Control, 2006; pp 2352−2357.
(35) Oomen, T. Lecture notes, Eindhoven University of Technology, Spring 2015.
(36) Bristow, D. A.; Tharayil, M.; Alleyne, A. G. A survey of iterative learning control. IEEE Control Systems 2006, 26, 96−114.
(37) Xu, J.-X.; Wang, X.-W.; Heng, L. T. Analysis of continuous iterative learning control systems using current cycle feedback. Proceedings of the 1995 American Control Conference, 1995; pp 4221−4225.
(38) Forgione, M.; Bombois, X.; Van den Hof, P. M. Data-driven model improvement for model-based control. Automatica 2015, 52, 118−124.
(39) Hovakimyan, N.; Cao, C.; Kharisov, E.; Xargay, E.; Gregory, I. M. L1 Adaptive Control for Safety-Critical Systems. IEEE Control Systems 2011, 31, 54−104.
(40) Amrhein, M.; Bhatt, N.; Srinivasan, B.; Bonvin, D. Extents of reaction and flow for homogeneous reaction systems with inlet and outlet streams. AIChE J. 2010, 56, 2873−2886.
(41) Lehtonen, J.; Salmi, T.; Immonen, K.; Paatero, E.; Nyholm, P. Kinetic model for the homogeneously catalyzed polyesterification of dicarboxylic acids with diols. Ind. Eng. Chem. Res. 1996, 35, 3951−3963.
(42) Marquez-Ruiz, A.; Ludlage, J.; Ozkan, L. Optimization and Low-Level Control Design for Reactive Batch Distillation Columns including the Start-up. Comput.-Aided Chem. Eng. 2018, 44, 577−582.
