

University of Genoa

Notes for the course of

MATHEMATICAL METHODS

Lecturer: Mauro Gaggero

Last update: May 2020


Mathematical Methods 2 Mauro Gaggero

The original version of these notes was written by the student Martina Pastorino during the lessons of the Academic Year 2018/2019. Many thanks to Martina for her great work.

Students are kindly invited to send questions, comments, corrections, and updates concerning these notes to the following email address: [email protected].


Table of contents

1. Introduction
2. Dynamic optimization
   2.1. Example: routing in a communication network
   2.2. Example: mixing tank
   2.3. Example: missile and tank
   2.4. Time discretization with final time fixed and known
   2.5. Time discretization with final time unknown
   2.6. The dynamic programming algorithm
        2.6.1. Basic ideas of DP through the airplane example
        2.6.2. Backward phase of DP
        2.6.3. Forward phase of DP
        2.6.4. Formalization of the airplane example
   2.7. Bellman optimality principle
   2.8. An LQ system: the example of the mixing tank
   2.9. Non-LQ problem
        2.9.1. Least squares approximation
        2.9.2. Parameter estimation with least squares
   2.10. Dynamic optimization with disturbances
        2.10.1. Solution with dynamic programming
        2.10.2. LQ case
        2.10.3. General case (non-LQ problem)
3. Nonlinear programming
   3.1. Example: localization problem
   3.2. Unconstrained nonlinear programming
        3.2.1. Optimality conditions
        3.2.2. Descent methods
        3.2.3. Gradient method
        3.2.4. Convergence rate
        3.2.5. Newton method
   3.3. Non-derivative methods
        3.3.1. Coordinate descent method
        3.3.2. Powell method
        3.3.3. Random search methods
   3.4. Constrained nonlinear programming
        3.4.1. Optimality conditions in the constrained case
        3.4.2. Penalty function method
        3.4.3. Barrier function method
4. Partial differential equations
   4.1. Classification of PDEs
        4.1.1. Elliptic equations
        4.1.2. Hyperbolic equations
        4.1.3. Parabolic equations
   4.2. Solution of hyperbolic PDEs
        4.2.1. Example: from the oscillating string to the wave equation
        4.2.2. Solution of the wave equation with the separation of variables
        4.2.3. Modal analysis
   4.3. Solution of elliptic equations
        4.3.1. Unicity of the solution of the Laplace equation
        4.3.2. Solution of the Laplace equation in a circle
   4.4. Solution of parabolic equations
        4.4.1. Heat equation on the infinite line
        4.4.2. Solution of the heat equation using the Fourier transform
   4.5. Solution of hyperbolic equations
        4.5.1. The technique of the characteristics
5. Appendix
   5.1. Useful mathematical formulas
   5.2. Basic concepts on the Fourier series
   5.3. Basic concepts on the Fourier transform
   5.4. The Dirac function
6. References

Page 5: MATHEMATICAL METHODS - unige.it€¦ · Mathematical Methods 2 Mauro Gaggero The original version of these notes was written by the student Martina Pastorino during the lessons of

_________________________________________________________________________________________

Mathematical Methods 5 Mauro Gaggero

1. Introduction

The course investigates the following three main topics:

• Optimization methods:
  - Dynamic optimization.
  - Nonlinear programming.

• Partial differential equations.

Concerning optimization methods, we will study how to solve optimization problems. Such problems, also called decision problems, consist of choosing the best option among several alternatives, according to different criteria. We will focus on static optimization and dynamic optimization:

• Static optimization: optimal decisions are taken only once (una tantum); there is no time involved, and the problem remains the same.

• Dynamic optimization: a sequence of decisions has to be taken at different time instants (time plays a crucial role).

Static optimization can be divided in turn into real linear programming, integer linear programming, nonlinear programming, and optimization over graphs. In this course, we will investigate only nonlinear programming. We will see how to formulate an optimization problem as a mathematical model, together with some algorithms to solve it efficiently.

Concerning partial differential equations (PDEs), we will study how to solve classical linear PDEs analytically. In more detail, we will focus on the following equations:

• Laplace equation.
• Heat equation.
• Wave equation.

Page 6: MATHEMATICAL METHODS - unige.it€¦ · Mathematical Methods 2 Mauro Gaggero The original version of these notes was written by the student Martina Pastorino during the lessons of

_________________________________________________________________________________________

Mathematical Methods 6 Mauro Gaggero

2. Dynamic optimization

We will solve optimal decision problems in which time is involved; in particular, we will learn how to take optimal decisions sequentially over time.

We will focus on two different cases:

1) Decisions are taken without uncertainties (noises).
2) Decisions are taken in the presence of uncertainties.

We will start with case 1) and then extend the techniques to case 2).

Consider a discrete-time dynamical system:

$$\boldsymbol{x}_{t+1} = f_t(\boldsymbol{x}_t, \boldsymbol{u}_t) \qquad (2.1)$$

where $\boldsymbol{x}_t \in \mathbb{R}^n$ is the state vector of the system, $\boldsymbol{u}_t \in \mathbb{R}^m$ is the decision or control vector, and $f_t : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n$ is the vector field of the system (it describes its evolution). The state of the system is the minimal set of quantities we need to know to describe the system itself and predict its future evolution. From now on, we will always consider column vectors of the form

$$\boldsymbol{x}_t = \begin{bmatrix} x_{t,1} \\ \vdots \\ x_{t,n} \end{bmatrix}$$

In the following, for the sake of simplicity, we will use $f$ instead of $f_t$, i.e., we will focus on time-invariant problems.

The quantity $T$ is the time horizon over which we take decisions, while $t = 0, 1, 2, \ldots, T$ are the discrete time instants at which we take the decisions.

Equation (2.1) is the state equation of the system. Given the values of $\boldsymbol{u}_t$ for $t = 0, \ldots, T-1$ and an initial condition $\boldsymbol{x}_0 = \bar{\boldsymbol{x}}$, it allows us to predict the future evolution of the system.

Page 7: MATHEMATICAL METHODS - unige.it€¦ · Mathematical Methods 2 Mauro Gaggero The original version of these notes was written by the student Martina Pastorino during the lessons of

_________________________________________________________________________________________

Mathematical Methods 7 Mauro Gaggero

In order to take optimal decisions, we first have to fix our goals. Toward this end, we consider a cost function (also called objective function or performance index) describing our goal. We focus on additive cost functions having the following form:

$$J = \sum_{t=0}^{T-1} h(\boldsymbol{x}_t, \boldsymbol{u}_t) + h_T(\boldsymbol{x}_T)$$

where $h : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$ is the transition cost (the cost we have to pay for moving from one state to another, i.e., from $\boldsymbol{x}_t$ to $\boldsymbol{x}_{t+1}$), and $h_T : \mathbb{R}^n \to \mathbb{R}$ denotes the final cost (it does not depend on the decision vector, since the last decision has already been taken).

Our goal is to compute the sequence of optimal decisions $\boldsymbol{u}_0^*, \boldsymbol{u}_1^*, \ldots, \boldsymbol{u}_{T-1}^*$ minimizing the cost $J$. The mathematical formulation is the following:

$$\min_{\boldsymbol{u}_0, \ldots, \boldsymbol{u}_{T-1}} \sum_{t=0}^{T-1} h(\boldsymbol{x}_t, \boldsymbol{u}_t) + h_T(\boldsymbol{x}_T)$$

The previous problem has to be solved subject to the following constraints:

$$\begin{cases} \boldsymbol{x}_{t+1} = f(\boldsymbol{x}_t, \boldsymbol{u}_t), & t = 0, \ldots, T-1 \\ \boldsymbol{x}_0 = \bar{\boldsymbol{x}} \end{cases}$$

The previous is the standard form of dynamic optimization problems in the absence of uncertainties. We may add more constraints related to limitations on the values that $\boldsymbol{x}_t$ and $\boldsymbol{u}_t$ can take on:

$$\boldsymbol{x}_t \in X_t \subset \mathbb{R}^n, \qquad \boldsymbol{u}_t \in U_t \subset \mathbb{R}^m$$


The decisions $\boldsymbol{u}_0^*, \boldsymbol{u}_1^*, \ldots, \boldsymbol{u}_{T-1}^*$ obtained by solving the problem determine a sequence of optimal states $\boldsymbol{x}_0, \ldots, \boldsymbol{x}_T$. In other words, we construct a trajectory in the state space $X$ (the subset of $\mathbb{R}^n$ where the system evolves). In the following, we will see how to write a generic dynamic optimization problem in the standard form, and then how to compute the optimal sequence $\boldsymbol{u}_0^*, \boldsymbol{u}_1^*, \ldots, \boldsymbol{u}_{T-1}^*$ (i.e., how to solve the problem).
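As a minimal sketch (not part of the notes; all names are illustrative), the additive cost of a given control sequence can be evaluated by simulating the state equation forward:

```python
def evaluate_cost(x0, controls, f, h, h_T):
    """Simulate x_{t+1} = f(x_t, u_t) from x0 and accumulate
    J = sum_t h(x_t, u_t) + h_T(x_T)."""
    x = x0
    J = 0.0
    for u in controls:
        J += h(x, u)      # transition cost paid at time t
        x = f(x, u)       # state equation
    return J + h_T(x)     # final cost on x_T

# Scalar example: f(x,u) = x + u, h(x,u) = u^2, h_T(x) = x^2.
J = evaluate_cost(0.0, [1.0, 2.0],
                  f=lambda x, u: x + u,
                  h=lambda x, u: u ** 2,
                  h_T=lambda x: x ** 2)
print(J)   # 1 + 4 + 9 = 14.0
```

Solving the problem means searching over the `controls` sequence for the one minimizing this value, subject to the constraints.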

2.1. Example: routing in a communication network

Let us consider a communication network with routers, connections among them, and packets to transfer.

We model the network through a graph with nodes (representing routers where the packets are stored) and links. The nodes 𝑖 = 1,… ,𝑁 (or routers) are modeled as buffers.

Figure: the $i$-th node represented by a buffer.

The number of packets stored in the buffers represents the state of the network. Thus, $\boldsymbol{x}_t$ is given by the lengths of the buffers:


$$\boldsymbol{x}_t = \begin{bmatrix} x_{1,t} \\ \vdots \\ x_{N,t} \end{bmatrix}$$

where $x_{i,t}$ is the number of packets stored in the node $i$ at time $t$.

We assume to know $\boldsymbol{x}_0 = \bar{\boldsymbol{x}}$, i.e., the number of packets in the network at the beginning of the process (initial condition at time 0).

The decision vector $\boldsymbol{u}_t = \begin{bmatrix} u_{12,t} \\ u_{13,t} \\ \vdots \end{bmatrix}$ represents how many packets to transfer from one node to another. More specifically, $u_{ij,t}$ is the number of packets transferred from node $i$ to node $j$ at time $t$.

The variables $u$ are associated with the links. If there is no link between nodes 2 and 5, for example, the quantity $u_{25,t}$ does not exist. For the sake of simplicity, we assume that $x_{i,t}$ and $u_{ij,t}$ are real numbers, and not integers, i.e.,

$$\boldsymbol{x}_t \in \mathbb{R}^N, \qquad \boldsymbol{u}_t \in \mathbb{R}^m$$

Further, let us define the sets 𝑝(𝑖) and 𝑠(𝑖) as the set of nodes preceding the node 𝑖 and the set of nodes succeeding the node 𝑖, respectively.

The network is affected by external inputs represented by the packets that are injected from the outside. Let $r_{i,t}$ be the number of new packets injected into node $i$ from outside the network at time $t$, and define

$$\boldsymbol{r}_t = \begin{bmatrix} r_{1,t} \\ \vdots \\ r_{N,t} \end{bmatrix} \in \mathbb{R}^N$$

We derive a state equation using a conservation law principle as follows:

$$x_{i,t+1} = x_{i,t} + \sum_{j \in p(i)} u_{ji,t} - \sum_{j \in s(i)} u_{ij,t} + r_{i,t}, \qquad i = 1, \ldots, N$$

where $x_{i,t}$ are the packets already in the node $i$, $\sum_{j \in p(i)} u_{ji,t}$ are the packets entering the node $i$, $\sum_{j \in s(i)} u_{ij,t}$ are the packets exiting the node $i$, and $r_{i,t}$ are the new external packets entering the node $i$.

We can write the state equation in vector form as follows:


$$\begin{cases} \boldsymbol{x}_{t+1} = A\boldsymbol{x}_t + B\boldsymbol{u}_t + \boldsymbol{r}_t, & t = 0, \ldots, T-1 \\ \boldsymbol{x}_0 = \bar{\boldsymbol{x}} \end{cases}$$

where $A \in \mathbb{R}^{N \times N}$ is the identity matrix ($A = I$) and $B \in \mathbb{R}^{N \times m}$ is a matrix with elements in $\{0, \pm 1\}$ depending on the topology of the network. We can introduce limitations, or constraints, on $\boldsymbol{x}_t$ and $\boldsymbol{u}_t$, such as, for example,

• $0 \le x_{i,t} \le X_i$ $\forall i$ (capacity of the buffers);
• $0 \le u_{ij,t} \le U_{ij}$ $\forall i, j$ (maximum capacity of the links);

and therefore $\boldsymbol{x}_t \in X$, $\boldsymbol{u}_t \in U$, where $X \subset \mathbb{R}^N$, $U \subset \mathbb{R}^m$.

Now, we introduce an objective function describing our goals. The choice is completely arbitrary; here, we choose to minimize the overall congestion of the network, i.e., we define the cost

$$J = \sum_{t=0}^{T-1} \sum_{i=1}^{N} x_{i,t} + \alpha \sum_{i=1}^{N} x_{i,T}$$

where $\alpha > 0$ is a weight coefficient that allows us to give more importance to one addend or the other. It is a design parameter.

To sum up, the overall problem to solve is the following:

$$\begin{cases} \min \displaystyle\sum_{t=0}^{T-1} \sum_{i=1}^{N} x_{i,t} + \alpha \sum_{i=1}^{N} x_{i,T} \\ \text{s.t.} \\ \boldsymbol{x}_{t+1} = A\boldsymbol{x}_t + B\boldsymbol{u}_t + \boldsymbol{r}_t, \quad t = 0, \ldots, T-1 \\ \boldsymbol{x}_0 = \bar{\boldsymbol{x}} \\ \boldsymbol{x}_t \in X, \quad t = 0, \ldots, T \\ \boldsymbol{u}_t \in U, \quad t = 0, \ldots, T-1 \end{cases}$$

Standard form, with transition cost $h(\boldsymbol{x}_t, \boldsymbol{u}_t) = \sum_{i=1}^{N} x_{i,t}$ and final cost $h_T(\boldsymbol{x}_T) = \alpha \sum_{i=1}^{N} x_{i,T}$:

$$\begin{cases} \min_{\boldsymbol{u}_0, \ldots, \boldsymbol{u}_{T-1}} \displaystyle\sum_{t=0}^{T-1} h(\boldsymbol{x}_t, \boldsymbol{u}_t) + h_T(\boldsymbol{x}_T) \\ \text{s.t.} \\ \boldsymbol{x}_{t+1} = f(\boldsymbol{x}_t, \boldsymbol{u}_t, \boldsymbol{r}_t), \quad t = 0, 1, \ldots, T-1 \\ \boldsymbol{x}_0 = \bar{\boldsymbol{x}} \\ \boldsymbol{x}_t \in X, \quad t = 0, 1, \ldots, T \\ \boldsymbol{u}_t \in U, \quad t = 0, 1, \ldots, T-1 \end{cases}$$

The vector field $f(\boldsymbol{x}_t, \boldsymbol{u}_t, \boldsymbol{r}_t) = A\boldsymbol{x}_t + B\boldsymbol{u}_t + \boldsymbol{r}_t$ is the classical form of a linear dynamic system.

In this example, time is discrete. The system is intrinsically discrete. However, there exist problems that are continuous in time, like the one investigated in the next section.

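To make the conservation law concrete, here is a small sketch (not from the notes) of how $B$ is built from the topology and how one step of $\boldsymbol{x}_{t+1} = A\boldsymbol{x}_t + B\boldsymbol{u}_t + \boldsymbol{r}_t$ with $A = I$ is computed, for an assumed 3-node line network with links $1 \to 2$ and $2 \to 3$:

```python
# Each link contributes one column of B: -1 in the row of its source node
# (packets leave) and +1 in the row of its destination node (packets arrive).
links = [(0, 1), (1, 2)]          # (source, destination), 0-indexed
N, m = 3, len(links)

B = [[0] * m for _ in range(N)]
for k, (i, j) in enumerate(links):
    B[i][k] = -1
    B[j][k] = +1

def step(x, u, r):
    """Conservation law: packets already there, plus inflows/outflows, plus injections."""
    return [x[i] + sum(B[i][k] * u[k] for k in range(m)) + r[i] for i in range(N)]

x0 = [5.0, 0.0, 0.0]
x1 = step(x0, u=[2.0, 0.0], r=[1.0, 0.0, 0.0])
print(x1)   # [4.0, 2.0, 0.0]: 2 packets moved from node 1 to node 2, 1 injected at node 1
```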


2.2. Example: mixing tank

Consider a tank where two liquids, each one characterized by a certain concentration of chlorine, are mixed.

• $c_1$, $c_2$ are the concentrations of chlorine of the two liquids.
• $u_1(t)$, $u_2(t)$ represent the flows of liquid of type 1 and 2, respectively, entering the tank.
• $h(t)$ denotes the amount (height) of liquid in the tank.
• $c(t)$ is the concentration of chlorine of the liquid in the tank.
• $q(t)$ is the outgoing flow.
• $A$ denotes the area of the base of the tank, i.e., the volume of liquid is given by $V(t) = Ah(t)$.

The decision vector $\boldsymbol{u}(t)$ defines the input flows of the liquids (modifiable through the valves), while the state vector $\boldsymbol{x}(t)$ collects the height and the concentration:

$$\boldsymbol{u}(t) = \begin{bmatrix} u_1(t) \\ u_2(t) \end{bmatrix}, \qquad \boldsymbol{x}(t) = \begin{bmatrix} h(t) \\ c(t) \end{bmatrix}$$

From physics, we know that

$$q(t) = k\sqrt{h(t)}$$

where $k$ is a suitable coefficient. From the conservation law of flows, we have:

$$\frac{dV(t)}{dt} = u_1(t) + u_2(t) - q(t) = u_1(t) + u_2(t) - k\sqrt{h(t)} = A\,\frac{dh(t)}{dt}$$

We can write a similar equation for the concentration of chlorine:

$$\frac{d[c(t)V(t)]}{dt} = c_1 u_1(t) + c_2 u_2(t) - k\,c(t)\sqrt{h(t)} = A\,\frac{d\big(c(t)h(t)\big)}{dt}$$

By using the classical rules of calculus, we obtain the expression for $\frac{dc(t)}{dt}$.
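For completeness, the omitted step can be reconstructed as follows (this derivation is not spelled out in the original notes): expanding the left-hand side with the product rule and substituting the equation for $dV(t)/dt$, the terms $k\,c(t)\sqrt{h(t)}$ cancel, giving

$$A\,h(t)\,\frac{dc(t)}{dt} = \frac{d[c(t)V(t)]}{dt} - c(t)\,\frac{dV(t)}{dt} = c_1 u_1(t) + c_2 u_2(t) - c(t)\big(u_1(t) + u_2(t)\big)$$

so that

$$\frac{dc(t)}{dt} = \frac{\big(c_1 - c(t)\big)u_1(t) + \big(c_2 - c(t)\big)u_2(t)}{A\,h(t)}$$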


Thus, we can write a state equation that is continuous in time (Cauchy problem):

$$\begin{cases} \dfrac{dh(t)}{dt} = \cdots \\[4pt] \dfrac{dc(t)}{dt} = \cdots \end{cases} \;\Rightarrow\; \begin{cases} \dfrac{d\boldsymbol{x}}{dt} = \dot{\boldsymbol{x}} = f_c\big(\boldsymbol{x}(t), \boldsymbol{u}(t)\big), & t \in [0, T] \\ \boldsymbol{x}(0) = \bar{\boldsymbol{x}} \end{cases}$$

The state equation of this system is an ordinary differential equation (ODE).

We define an objective function considering our goals (many choices are possible) that is continuous in time (the sums of the previous example of the communication network are replaced by integrals):

$$J = \int_0^T h_c\big(\boldsymbol{x}(t), \boldsymbol{u}(t)\big)\,dt + h_T\big(\boldsymbol{x}(T)\big)$$

Thus, we can formulate a continuous-time dynamic decision problem as follows:

$$\begin{cases} \min_{\boldsymbol{u}(\cdot)} \displaystyle\int_0^T h_c\big(\boldsymbol{x}(t), \boldsymbol{u}(t)\big)\,dt + h_T\big(\boldsymbol{x}(T)\big) \\ \text{s.t.} \\ \dot{\boldsymbol{x}}(t) = f_c\big(\boldsymbol{x}(t), \boldsymbol{u}(t)\big), \quad t \in [0, T] \\ \boldsymbol{x}(t) \in X, \quad t \in [0, T] \\ \boldsymbol{u}(t) \in U, \quad t \in [0, T] \\ \boldsymbol{x}(0) = \bar{\boldsymbol{x}} \end{cases}$$

The limitations on $\boldsymbol{x}(t)$ and $\boldsymbol{u}(t)$ come from physics (for example, the maximum capacity of the tank and of the tubes). This problem is a so-called functional optimization problem, as the unknown is a function and not a sequence of vectors. In the following, we will see how to discretize this problem and transform it into an equivalent discrete-time one.

2.3. Example: missile and tank

Consider an airplane that launches a missile with the objective of hitting a tank.


The goal is to hit the tank with a missile launched from the airplane in the smallest amount of time. We describe the motion of the missile using the following 5-dimensional state vector:

$$\boldsymbol{x}(t) = \begin{bmatrix} x(t) \\ y(t) \\ \dot{x}(t) \\ \dot{y}(t) \\ \vartheta(t) \end{bmatrix}$$

where $x(t)$, $y(t)$ denote the position, $\dot{x}(t)$, $\dot{y}(t)$ the speed, and $\vartheta(t)$ the angle.

The decision vector is composed of a scalar quantity $\delta(t)$, representing the deflection of the rudder of the missile:

$$u(t) = \delta(t)$$

The state equation can again be derived from physics. The equations are quite complex (we omit the details for the sake of compactness), but they can be written in a compact way as follows:

$$\dot{\boldsymbol{x}}(t) = f\big(\boldsymbol{x}(t), \boldsymbol{u}(t)\big), \qquad t \in [0, T]$$

The quantity $T$ is the impact time, which is unknown. After this instant, we are no longer interested in modelling the dynamics.

This is another example of a system that is intrinsically continuous in time, like the example of the mixing tank. Since the goal is to minimize the impact time, the cost function will be:

$$J = T$$

($T$, as said before, is the final time, so it can be considered as the time instant of the impact). The decision problem is the following:

$$\begin{cases} \min_{\boldsymbol{u}(\cdot)} T \\ \text{s.t.} \\ \dot{\boldsymbol{x}}(t) = f\big(\boldsymbol{x}(t), \boldsymbol{u}(t)\big), \quad t \in [0, T] \\ |\boldsymbol{u}(t)| \le U, \quad t \in [0, T] \\ \boldsymbol{x}(t) \in X, \quad t \in [0, T] \end{cases}$$

where $U$ is a bound given by the maximum deflection of the rudder and $X$ is another constraint related to the position of the missile (we want to stay in a certain area of space). This problem is a special case of dynamic optimization problem: it is an example of the so-called minimum time problem.



2.4. Time discretization with final time fixed and known

In this section, we investigate how to discretize a continuous-time system and write an equivalent discrete-time one, in the case of a known final time $T$.

We let $T = n\Delta t$, where $\Delta t$ is a given time interval and $n$ is a given integer. In other words, we divide the interval $[0, T]$ into $n$ sub-intervals of length $\Delta t$ (chosen a priori and fixed). We have to discretize the state equation:

$$\dot{\boldsymbol{x}}(t) = f\big(\boldsymbol{x}(t), \boldsymbol{u}(t)\big), \qquad t \in [0, T]$$

Thus, we approximate $\dot{\boldsymbol{x}}(t)$ with the incremental ratio:

$$\dot{\boldsymbol{x}}(t) \cong \frac{\boldsymbol{x}\big((i+1)\Delta t\big) - \boldsymbol{x}(i\Delta t)}{\Delta t}$$

In other words, we just consider what happens at the $i$-th and the $(i+1)$-th time steps:

$$\dot{\boldsymbol{x}}(t) \cong \frac{\boldsymbol{x}\big((i+1)\Delta t\big) - \boldsymbol{x}(i\Delta t)}{\Delta t} \cong f\big(\boldsymbol{x}(t), \boldsymbol{u}(t)\big)$$

If we define $\boldsymbol{x}_i \triangleq \boldsymbol{x}(i\Delta t)$ and $\boldsymbol{u}_i \triangleq \boldsymbol{u}(i\Delta t)$, we get

$$\frac{\boldsymbol{x}_{i+1} - \boldsymbol{x}_i}{\Delta t} = f(\boldsymbol{x}_i, \boldsymbol{u}_i), \qquad i = 0, \ldots, n-1$$

that is,

$$\boldsymbol{x}_{i+1} = \boldsymbol{x}_i + \Delta t\, f(\boldsymbol{x}_i, \boldsymbol{u}_i) \triangleq f_d(\boldsymbol{x}_i, \boldsymbol{u}_i), \qquad i = 0, \ldots, n-1$$

This is the so-called first-order discretization, also called Euler discretization; it is the simplest one. Concerning the cost, we can perform a similar discretization by replacing integrals with summations:

$$J = \int_0^T h\big(\boldsymbol{x}(t), \boldsymbol{u}(t)\big)\,dt + h_T\big(\boldsymbol{x}(T)\big) \simeq \sum_{i=0}^{n-1} h\big(\boldsymbol{x}(i\Delta t), \boldsymbol{u}(i\Delta t)\big)\Delta t + h_T\big(\boldsymbol{x}(n\Delta t)\big)$$

where we define $h_d(\boldsymbol{x}_i, \boldsymbol{u}_i) \triangleq h(\boldsymbol{x}_i, \boldsymbol{u}_i)\Delta t$ and $h_n(\boldsymbol{x}_n) \triangleq h_T\big(\boldsymbol{x}(n\Delta t)\big)$.


Thus, we obtain

$$J = \sum_{i=0}^{n-1} h_d(\boldsymbol{x}_i, \boldsymbol{u}_i) + h_n(\boldsymbol{x}_n)$$
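As a sketch (not from the notes), the Euler step $\boldsymbol{x}_{i+1} = \boldsymbol{x}_i + \Delta t\, f(\boldsymbol{x}_i, \boldsymbol{u}_i)$ can be implemented generically; the mixing-tank vector field below and the constants $k$, $A$, $c_1$, $c_2$ are assumed values for illustration only:

```python
import math

K, AREA, C1, C2 = 1.0, 2.0, 0.1, 0.5   # assumed constants (not given in the notes)

def f(x, u):
    """Mixing-tank vector field: x = (h, c), u = (u1, u2).
    dc/dt follows from the product rule applied to d[cV]/dt."""
    h, c = x
    u1, u2 = u
    dh = (u1 + u2 - K * math.sqrt(h)) / AREA
    dc = ((C1 - c) * u1 + (C2 - c) * u2) / (AREA * h)
    return (dh, dc)

def euler_step(x, u, dt):
    """One step of the first-order (Euler) discretization."""
    return tuple(xi + dt * fi for xi, fi in zip(x, f(x, u)))

x1 = euler_step((1.0, 0.3), (0.2, 0.2), 0.1)
print(x1)   # height decreases by ~0.03; concentration stays balanced at 0.3
```

Iterating `euler_step` over $i = 0, \ldots, n-1$ produces the discrete trajectory $\boldsymbol{x}_0, \ldots, \boldsymbol{x}_n$ on which the discretized cost is evaluated.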

2.5. Time-discretization with final time unknown

Now we study how to discretize a continuous-time system and write an equivalent discrete-time one in the case of unknown final time 𝑇.

The idea is still to divide $[0, T]$ into $n$ subintervals. While $n$ is fixed, $\Delta t$ is not: we consider subintervals of lengths $\Delta t_0, \Delta t_1, \Delta t_2, \ldots$ (they can differ from one another).

• $n$ is given and fixed.
• $\Delta t_i$, $i = 0, \ldots, n-1$, can vary and are unknown.

Clearly, we must have:

$$\sum_{i=0}^{n-1} \Delta t_i = T$$

We can write a dynamic equation for the variable $t_i$:

$$t_{i+1} = t_i + \Delta t_i, \qquad i = 0, \ldots, n-1$$

We discretize the state equation of the system like in the previous case, using again the Euler discretization:


$$\dot{\boldsymbol{x}}(t) \cong \frac{\boldsymbol{x}(t_{i+1}) - \boldsymbol{x}(t_i)}{t_{i+1} - t_i} \;\Rightarrow\; \boldsymbol{x}_{i+1} = f_d(\boldsymbol{x}_i, \boldsymbol{u}_i), \qquad i = 0, \ldots, n-1$$

where $\boldsymbol{x}_i = \boldsymbol{x}(t_i)$ and $\boldsymbol{u}_i = \boldsymbol{u}(t_i)$.

We use the so-called state augmentation procedure: we insert the various $t_i$ among the state variables, and define an augmented state vector:

$$\tilde{\boldsymbol{x}}_i = \begin{bmatrix} \boldsymbol{x}_i \\ t_i \end{bmatrix}, \qquad i = 0, \ldots, n-1$$

with $\dim(\tilde{\boldsymbol{x}}_i) = n + 1$, where $n$ is the dimension of $\boldsymbol{x}_i$.

We also define an augmented decision vector:

$$\tilde{\boldsymbol{u}}_i = \begin{bmatrix} \boldsymbol{u}_i \\ \Delta t_i \end{bmatrix}, \qquad i = 0, \ldots, n-1$$

with $\dim(\tilde{\boldsymbol{u}}_i) = m + 1$, where $m$ is the dimension of $\boldsymbol{u}_i$.

The state equation of the augmented system is the following:

$$\tilde{\boldsymbol{x}}_{i+1} = \tilde{f}(\tilde{\boldsymbol{x}}_i, \tilde{\boldsymbol{u}}_i), \qquad i = 0, \ldots, n-1$$

that is,

$$\begin{bmatrix} \boldsymbol{x}_{i+1} \\ t_{i+1} \end{bmatrix} = \begin{bmatrix} f_d(\boldsymbol{x}_i, \boldsymbol{u}_i) \\ t_i + \Delta t_i \end{bmatrix}, \qquad i = 0, \ldots, n-1$$

As usual, we also have to consider initial conditions:

$$\tilde{\boldsymbol{x}}_0 = \tilde{\bar{\boldsymbol{x}}} = \begin{bmatrix} \bar{\boldsymbol{x}} \\ 0 \end{bmatrix}$$

In this case, the initial condition for the variable $t_i$ is zero, even if it can be equal to $t_0$ in general.

The cost function to minimize is the following:

$$J = T$$

that is,

$$J = \sum_{i=0}^{n-1} \Delta t_i = \sum_{i=0}^{n-1} h(\tilde{\boldsymbol{x}}_i, \tilde{\boldsymbol{u}}_i)$$

i.e., the transition cost of the augmented problem is $h(\tilde{\boldsymbol{x}}_i, \tilde{\boldsymbol{u}}_i) = \Delta t_i$, and there is no final cost.
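A minimal sketch (illustrative names, not from the notes) of one step of the augmented system, in which the control also chooses the interval length $\Delta t_i$:

```python
def augmented_step(x_aug, u_aug, f):
    """One step of the augmented system: state (x, t), control (u, dt).
    x_{i+1} = x_i + dt * f(x_i, u_i)  (Euler with variable step),
    t_{i+1} = t_i + dt."""
    x, t = x_aug
    u, dt = u_aug
    x_next = tuple(xi + dt * fi for xi, fi in zip(x, f(x, u)))
    return (x_next, t + dt)

# Single integrator x' = u: one step with u = 2 over an interval dt = 0.5.
state, time = augmented_step(((0.0,), 0.0), ((2.0,), 0.5), f=lambda x, u: (u[0],))
print(state, time)   # (1.0,) 0.5 — the transition cost of this step is dt = 0.5
```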


2.6. The dynamic programming algorithm

The main methods to solve dynamic optimization problems are the following:

1) Calculus of variations.
2) Pontryagin minimum principle.
3) Dynamic programming (DP).
4) Ritz/extended Ritz approach (good in the presence of disturbances).

In this course, we will focus only on the dynamic programming algorithm.

2.6.1. Basic ideas of DP through the airplane example

We want to determine the optimal route of an airplane that minimizes the time needed to go from town $A$ to town $B$. Due to winds and heavy weather, the minimum-time route is not necessarily the straight one connecting $A$ with $B$.

We divide the line connecting $A$ and $B$ into $N$ decision stages. We further discretize each stage with a certain number $d$ of points. We have to decide the best (according to a certain criterion) sequence of points in the grid to construct the optimal trajectory.

State of the system: $x_i$ denotes the position of the airplane at stage $i$; e.g., $x_3 = 5$ means that the plane is in the 5-th discretization point at stage 3.

Decision vector: the decision about the next point, i.e., $u_i = x_{i+1}$.

State equation: $\boldsymbol{x}_{i+1} = f(\boldsymbol{x}_i, \boldsymbol{u}_i)$. In our case, we simply have:

$$x_{i+1} = u_i, \qquad i = 0, \ldots, N-1$$


In order to construct the cost function $J$, it is useful to define the following quantity:

$$T_i(x_i, x_{i+1}) = \frac{l_i(x_i, x_{i+1})}{v_i(x_i, x_{i+1})}$$

which is the time needed to go from node $x_i$ to node $x_{i+1}$, where $l_i$ is the distance between $x_i$ and $x_{i+1}$ and $v_i$ is the speed of the airplane. The distance $l_i$ is known $\forall i, x_i, x_{i+1}$. The speed $v_i$ is in general not known a priori due to weather conditions. For the moment, we assume to have precise weather forecasts, so that we can consider $v_i$ known. Our setting is purely deterministic, i.e., no stochastic variables are involved. In the case of sudden wind gusts or generic weather changes, we cannot assume $v_i$ to be known; this is the case of a stochastic problem that we will face later on.

The overall time for going from $A$ to $B$ is our cost function:

$$J = \sum_{i=0}^{N-1} T_i(x_i, x_{i+1})$$

Thus, the problem can be formulated as follows:

$$\begin{cases} x_{i+1} = u_i, & i = 0, \ldots, N-1 \\ x_0 = A \\ x_N = B \end{cases} \qquad J = \sum_{i=0}^{N-1} T_i(x_i, x_{i+1}) + 0$$

which matches the standard form

$$\begin{cases} \boldsymbol{x}_{t+1} = f(\boldsymbol{x}_t, \boldsymbol{u}_t), & t = 0, \ldots, N-1 \\ \boldsymbol{x}_0 = \bar{\boldsymbol{x}} \\ \boldsymbol{x}_t \in X_t, & t = 0, \ldots, N \end{cases} \qquad J = \sum_{t=0}^{N-1} h(\boldsymbol{x}_t, \boldsymbol{u}_t) + h_N(\boldsymbol{x}_N)$$

The goal is to determine the sequence of optimal points in the trajectory, i.e., $u_0^*, u_1^*, \ldots, u_{N-1}^*$.

A possible solution approach is the so-called brute force approach. It consists in enumerating all the possible trajectories, computing the cost of each one, and selecting the one with the lowest value. Consider for instance the case of $N = 21$ stages and $d = 10$ points per stage. The number of possible trajectories is equal to $10^{20}$. To compute the cost of a trajectory, we have to sum 21 numbers. Assume we have a personal computer performing the computation of the cost (the sum of 21 numbers) in $10^{-6}$ seconds. The overall computational time will be $10^{20} \cdot 10^{-6} = 10^{14}$ seconds. Since in one year there are about $3 \cdot 10^{7}$ seconds, we need about $0.3 \cdot 10^{7}$ years of computation. Hence, we conclude that brute force cannot be used for large instances of the problem. The brute force approach suffers from the so-called curse of dimensionality: the number of computations grows exponentially with the "dimension" of the problem. We will see that the DP approach solves the problem in a few milliseconds.

The successful idea of DP is to divide the problem into subproblems that are easier to be solved (principle of divide et impera).

The DP algorithm is made up of two phases:

• Backward phase.
• Forward phase.

_________________________________________________________________________________________

Mathematical Methods - Mauro Gaggero

2.6.2. Backward phase of DP

In order to explain the backward phase, let us consider the airplane example in a simpler case with $d = 3$ and $N = 4$:

Stage 𝑖 = 𝑁 − 1:

We start from the last stage where a decision is taken, i.e., $i = N-1$. We label the nodes of stage $N-1$ with the time needed to go from each node to $B$. In general, the labels denote the time needed to go from a certain node to $B$: we call such quantities optimal costs to go. Our goal is to label all the nodes, i.e., to determine a cost to go for every node of the grid, starting from $i = N-1$ and proceeding backwards to $i = 0$. We also associate an action with each node, pictorially represented by an arrow pointing to the next node ($B$ in this stage).

Stage 𝑖 = 𝑁 − 2:

The optimal cost to go of the first point of stage $N-2$ is: $\min(15 + 20,\ 20 + 21,\ 16 + 25) = 35$.

We have to repeat the same operation for all the nodes:

The optimal cost to go of the second point of stage $N-2$ is: $\min(20 + 20,\ 10 + 21,\ 21 + 25) = 31$.


The optimal cost to go of the third point of stage $N-2$ is: $\min(10 + 20,\ 11 + 21,\ 12 + 25) = 30$.

Now, we iterate the process for stages 𝑁 − 3 and 𝑁 − 4.

Stage 𝑖 = 𝑁 − 3:

The optimal cost to go of the first point of stage $N-3$ is: $\min(35 + 20,\ 31 + 20,\ 30 + 30) = 51$.

The optimal cost to go of the second point of stage $N-3$ is: $\min(35 + 15,\ 31 + 25,\ 30 + 30) = 50$.

The optimal cost to go of the third point of stage $N-3$ is: $\min(35 + 30,\ 31 + 20,\ 30 + 15) = 45$.

Stage 𝑖 = 𝑁 − 4:


The optimal cost to go of the first point of stage $N-4$ (the node $A$) is: $\min(21 + 51,\ 16 + 50,\ 20 + 45) = 65$.

To sum up, the backward phase consists in labelling each node with a cost to go (starting from the last stage), computed by minimizing the sum of the (known) transition cost plus the cost to go of the next stage. Roughly speaking, the cost to go allows us to forget what happens in the following decision stages. This is the key idea of DP.

2.6.3. Forward phase of DP

In the forward phase, we move forward from stage 0 to stage $N$ by following the arrows computed in the backward phase. Such arrows represent the optimal decision to be taken at each node. The cost to go of node $A$ is the optimal overall time to go from $A$ to $B$.

To analyze the required computational effort with respect to the brute-force method, we consider again the case $N = 21$ and $d = 10$. The elementary operation now consists in performing, for each node, $10$ sums of two numbers plus the computation of a minimum. Assume again that $10^{-6}$ seconds are required to perform such an operation. The number of operations is equal to $10^2 \times 20 + 10 \approx 20 \times 10^2$, where $10^2$ is the number of operations per stage, $20$ is the number of stages apart from $B$, and $10$ is the number of operations of the last stage. Thus, the overall computational time of the DP procedure is about $20 \times 10^2 \times 10^{-6}\ \mathrm{s} = 2\ \mathrm{ms}$.
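The two phases described above can be reproduced in a few lines of code. The sketch below uses the transition costs of the worked $d = 3$, $N = 4$ example; the full connectivity between consecutive stages is an assumption of the sketch.

```python
# T[i][j][k] is the known transition cost from node j of stage i to node k
# of stage i+1 (values taken from the worked example; the grid is assumed
# fully connected between consecutive stages).
T = [
    [[21, 16, 20]],                               # A -> stage 1
    [[20, 20, 30], [15, 25, 30], [30, 20, 15]],   # stage 1 -> stage 2
    [[15, 20, 16], [20, 10, 21], [10, 11, 12]],   # stage 2 -> stage 3
    [[20], [21], [25]],                           # stage 3 -> B
]

def backward_phase(T):
    """Label every node with its optimal cost to go and optimal arrow."""
    J = [[0.0]]            # cost to go of the final node B is zero
    gamma = []             # gamma[i][j]: optimal next node from node j of stage i
    for costs in reversed(T):
        J_next = J[0]
        J_stage, g_stage = [], []
        for arcs in costs:
            best = min(range(len(arcs)), key=lambda k: arcs[k] + J_next[k])
            J_stage.append(arcs[best] + J_next[best])
            g_stage.append(best)
        J.insert(0, J_stage)
        gamma.insert(0, g_stage)
    return J, gamma

def forward_phase(gamma):
    """Follow the arrows from A to B."""
    node, path = 0, [0]
    for g_stage in gamma:
        node = g_stage[node]
        path.append(node)
    return path

J, gamma = backward_phase(T)
path = forward_phase(gamma)
print(J[0][0], path)   # optimal time from A to B and the optimal trajectory
```

Running the backward phase reproduces exactly the labels computed by hand above (20, 21, 25 at stage $N-1$; 35, 31, 30 at stage $N-2$; 51, 50, 45 at stage $N-3$; 65 at $A$).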

2.6.4. Formalization of the airplane example

Let $J_i^{\circ}(x_i)$ be the optimal cost to go of stage $i$. It represents the optimal time needed to go from $x_i$ to the end of the process, i.e., to $B$. We can write:

$$J_i^{\circ}(x_i) = \min_{x_{i+1}, \dots, x_{N-1}} \sum_{k=i}^{N-1} T_k(x_k, u_k) = \min_{x_{i+1}, \dots, x_{N-1}} \sum_{k=i}^{N-1} T_k(x_k, x_{k+1})$$

To compute the costs to go, we use the backward phase, which starts from $i = N-1$ and proceeds down to $i = 0$:

$$J_{N-1}^{\circ}(x_{N-1}) = T_{N-1}(x_{N-1}, x_N)$$


$$J_{N-2}^{\circ}(x_{N-2}) = \min_{x_{N-1}} \left[ T_{N-2}(x_{N-2}, x_{N-1}) + J_{N-1}^{\circ}(x_{N-1}) \right]$$

$$J_{N-3}^{\circ}(x_{N-3}) = \min_{x_{N-2}} \left[ T_{N-3}(x_{N-3}, x_{N-2}) + J_{N-2}^{\circ}(x_{N-2}) \right]$$

$$\vdots$$

$$J_0^{\circ}(x_0) = \min_{x_1} \left[ T_0(x_0, x_1) + J_1^{\circ}(x_1) \right]$$

In general, we have the following equations, usually referred to as Bellman equations:

$$J_{N-1}^{\circ}(x_{N-1}) = T_{N-1}(x_{N-1}, x_N)$$

$$J_i^{\circ}(x_i) = \min_{x_{i+1}} \left[ T_i(x_i, x_{i+1}) + J_{i+1}^{\circ}(x_{i+1}) \right], \quad i = N-2, \dots, 0$$

As said, the forward phase consists in applying the decisions, i.e., following the arrows identified in the backward phase. Mathematically, the arrows represent a function of each node, i.e., we have:

$$\begin{cases} x_0 = A \\ x_1 = \gamma_0(x_0) \\ x_2 = \gamma_1(x_1) \\ \vdots \\ x_N = \gamma_{N-1}(x_{N-1}) \end{cases}$$

In compact form, we can write:

$$x_{i+1} = \gamma_i(x_i), \quad i = 0, \dots, N-1$$

The function $\gamma_i$ models the arrows.

The relationship $u_i = \gamma_i(x_i)$ is the decision law, and it is well suited to being used also in the case of systems with disturbances. In fact, the decision to be taken at stage $i$, i.e., $u_i$, is a function of the state $x_i$. In other words, decisions are taken in a closed-loop manner.

(Figure: feedback decisional system)

On the contrary, in the open loop case we have that decisions are not a function of the state, but only a function of time.


(Figure: open-loop decisional system)

In general, the feedback case yields decision/control actions that are more robust with respect to disturbances, and it is therefore preferable to the open-loop case. With the DP approach, we automatically obtain closed-loop decision laws, so the DP algorithm is well suited to being used also in the presence of disturbances. In the airplane example described above, we assumed no disturbances (winds perfectly known); in this case, having a feedback rule is useless and, at least in principle, we could forget the functions $\gamma_i$ and save only the sequence of actions $u_0, \dots, u_{N-1}$. However, the arrows represent "fallback" actions that could be taken to reach the end of the process (the node $B$) optimally in the presence of unexpected events.

2.7. Bellman optimality principle

In the example of the airplane, the Bellman optimality principle reads as follows:

"It is not important how we reached a certain point (node) $x_i$ of the trajectory. The important thing is that we reach the end of the process $x_N$ optimally, that is, by optimizing the part of the trajectory connecting $x_i$ to $x_N$."

Bellman equations can be derived formally starting from the definition of the cost to go:

$$J_i^{\circ}(x_i) = \min_{x_{i+1}, \dots, x_N} \sum_{k=i}^{N-1} T_k(x_k, x_{k+1})$$

Let us consider the cost to go of stage 0:

$$J_0^{\circ}(x_0) = \min_{x_1, \dots, x_N} \sum_{k=0}^{N-1} T_k(x_k, x_{k+1}) = \min_{x_1, \dots, x_N} \left[ T_0(x_0, x_1) + \sum_{k=1}^{N-1} T_k(x_k, x_{k+1}) \right] =$$

$$= \min_{x_1} \left[ T_0(x_0, x_1) + \min_{x_2, \dots, x_N} \sum_{k=1}^{N-1} T_k(x_k, x_{k+1}) \right] = \min_{x_1} \left[ T_0(x_0, x_1) + J_1^{\circ}(x_1) \right]$$

In the last expression, the term $T_0(x_0, x_1)$ depends only on $x_0$ (known) and on $x_1$, the first node of the trajectory, while the inner minimum is the cost to go at stage 1, $J_1^{\circ}(x_1)$. For stage $i = 1$, we repeat the same arguments:


$$J_1^{\circ}(x_1) = \min_{x_2, \dots, x_N} \sum_{k=1}^{N-1} T_k(x_k, x_{k+1}) = \min_{x_2, \dots, x_N} \left[ T_1(x_1, x_2) + \sum_{k=2}^{N-1} T_k(x_k, x_{k+1}) \right] =$$

$$= \min_{x_2} \left[ T_1(x_1, x_2) + \min_{x_3, \dots, x_N} \sum_{k=2}^{N-1} T_k(x_k, x_{k+1}) \right] = \min_{x_2} \left[ T_1(x_1, x_2) + J_2^{\circ}(x_2) \right]$$

Let us now consider a generic dynamic optimization problem:

$$\begin{cases} \mathbf{x}_{t+1} = f(\mathbf{x}_t, \mathbf{u}_t), & t = 0, \dots, T-1 \\ \mathbf{x}_0 = \bar{\mathbf{x}} \\ \min\limits_{\mathbf{u}_0, \dots, \mathbf{u}_{T-1}} \sum_{t=0}^{T-1} h(\mathbf{x}_t, \mathbf{u}_t) + h_T(\mathbf{x}_T) \end{cases}$$

In this case, the Bellman optimality principle is the following:

"A necessary condition for $\mathbf{x}_t$ to belong to the optimal trajectory is that the decisions $\mathbf{u}_t, \dots, \mathbf{u}_{T-1}$ are optimal, that is, they minimize the remaining part of the cost."

We define a cost to go also in this case:

$$J_t^{\circ}(\mathbf{x}_t) = \min_{\mathbf{u}_t, \dots, \mathbf{u}_{T-1}} \sum_{k=t}^{T-1} h(\mathbf{x}_k, \mathbf{u}_k) + h_T(\mathbf{x}_T)$$

Bellman equations (backward phase):

$$\begin{cases} J_T^{\circ}(\mathbf{x}_T) = h_T(\mathbf{x}_T) \\ J_t^{\circ}(\mathbf{x}_t) = \min\limits_{\mathbf{u}_t} \left( h(\mathbf{x}_t, \mathbf{u}_t) + J_{t+1}^{\circ}(\mathbf{x}_{t+1}) \right), & t = T-1, \dots, 0 \end{cases}$$

Bellman equations (forward phase):

$$\begin{cases} \mathbf{x}_0 = \bar{\mathbf{x}} \\ \mathbf{x}_1 = f(\mathbf{x}_0, \mathbf{u}_0); \quad \mathbf{u}_0 = \gamma_0(\mathbf{x}_0) \\ \mathbf{x}_2 = f(\mathbf{x}_1, \mathbf{u}_1); \quad \mathbf{u}_1 = \gamma_1(\mathbf{x}_1) \\ \vdots \\ \mathbf{x}_{t+1} = f(\mathbf{x}_t, \mathbf{u}_t); \quad \mathbf{u}_t = \gamma_t(\mathbf{x}_t) \end{cases}$$

Let us unroll the equations of the backward phase:

$$J_T^{\circ}(\mathbf{x}_T) = h_T(\mathbf{x}_T)$$

$$J_{T-1}^{\circ}(\mathbf{x}_{T-1}) = \min_{\mathbf{u}_{T-1}} \left[ h(\mathbf{x}_{T-1}, \mathbf{u}_{T-1}) + J_T^{\circ}\big(f(\mathbf{x}_{T-1}, \mathbf{u}_{T-1})\big) \right]$$

$$J_{T-2}^{\circ}(\mathbf{x}_{T-2}) = \min_{\mathbf{u}_{T-2}} \left[ h(\mathbf{x}_{T-2}, \mathbf{u}_{T-2}) + J_{T-1}^{\circ}\big(f(\mathbf{x}_{T-2}, \mathbf{u}_{T-2})\big) \right]$$

$$\vdots$$

$$J_0^{\circ}(\mathbf{x}_0) = \min_{\mathbf{u}_0} \left[ h(\mathbf{x}_0, \mathbf{u}_0) + J_1^{\circ}\big(f(\mathbf{x}_0, \mathbf{u}_0)\big) \right]$$

The unrolled equations of the forward phase are the following:

$$\mathbf{x}_0 = \bar{\mathbf{x}} \quad \text{(known)}$$

$$\mathbf{u}_0 = \gamma_0(\mathbf{x}_0) \Rightarrow \mathbf{x}_1 = f(\mathbf{x}_0, \mathbf{u}_0)$$

$$\mathbf{u}_1 = \gamma_1(\mathbf{x}_1) \Rightarrow \mathbf{x}_2 = f(\mathbf{x}_1, \mathbf{u}_1)$$

$$\vdots$$

$$\mathbf{u}_{T-1} = \gamma_{T-1}(\mathbf{x}_{T-1}) \Rightarrow \mathbf{x}_T = f(\mathbf{x}_{T-1}, \mathbf{u}_{T-1})$$

The backward phase may be computationally demanding due to the need to solve several optimization problems to compute the various costs to go. However, this phase is performed offline (before the system operation). The forward phase is performed online (during the system operation), but the required computational burden is limited, since we only have to perform algebraic operations (no minimizations are required).
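The offline/online split can be sketched for a generic finite-state, finite-control problem. The dynamics $f$, stage cost $h$, final cost $h_T$, and horizon below are toy illustrative choices (not from the notes); only the backward/forward scheme follows the text, and the result is checked against brute-force enumeration of all control sequences.

```python
from itertools import product

# Toy finite problem (illustrative choices): states 0..4, controls -1, 0, 1.
STATES = range(5)
CONTROLS = (-1, 0, 1)
T_HOR = 3

def f(x, u):               # state equation x_{t+1} = f(x_t, u_t)
    return min(max(x + u, 0), 4)

def h(x, u):               # stage cost
    return (x - 2) ** 2 + abs(u)

def h_T(x):                # final cost
    return 2 * (x - 2) ** 2

# Backward phase (offline): build the cost-to-go and decision-law tables.
J = {T_HOR: {x: h_T(x) for x in STATES}}
gamma = {}
for t in range(T_HOR - 1, -1, -1):
    J[t], gamma[t] = {}, {}
    for x in STATES:
        u_best = min(CONTROLS, key=lambda u: h(x, u) + J[t + 1][f(x, u)])
        gamma[t][x] = u_best
        J[t][x] = h(x, u_best) + J[t + 1][f(x, u_best)]

# Forward phase (online): only table lookups and algebraic operations.
def rollout(x0):
    x, cost = x0, 0
    for t in range(T_HOR):
        u = gamma[t][x]
        cost += h(x, u)
        x = f(x, u)
    return cost + h_T(x)

# Sanity check: DP matches brute-force enumeration of all control sequences.
def simulate(x0, seq):
    x, cost = x0, 0
    for u in seq:
        cost += h(x, u)
        x = f(x, u)
    return cost + h_T(x)

brute = min(simulate(0, seq) for seq in product(CONTROLS, repeat=T_HOR))
print(J[0][0] == rollout(0) == brute)
```

Note that all the minimizations happen in the backward loop; the rollout only reads the tables, as stated above.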

In general, it is not possible to solve Bellman equations analytically. Thus, we will resort to suitable approximations. However, in the linear quadratic (LQ) case (linear system with quadratic cost) it is possible to find exact solutions, as discussed in the next section.

2.8. An LQ system: the example of the mixing tank

The state equation of the mixing tank example is the following:

$$\begin{cases} \dfrac{dh(t)}{dt} = u_1(t) + u_2(t) - k\sqrt{h(t)} \\[2mm] \dfrac{dc(t)}{dt} = \dfrac{c_1 - c(t)}{h(t)}\, u_1(t) + \dfrac{c_2 - c(t)}{h(t)}\, u_2(t) \end{cases} \qquad t \in [0, T]$$

Now, we perform a linearization around the regime values $h_0$, $c_0$, $u_{10}$, $u_{20}$. We assume

$$\begin{cases} h(t) = h_0 + \Delta h(t) \\ c(t) = c_0 + \Delta c(t) \end{cases} \qquad \begin{cases} u_1(t) = u_{10} + \Delta u_1(t) \\ u_2(t) = u_{20} + \Delta u_2(t) \end{cases}$$

Then, we substitute the previous expressions into the state equation in order to linearize it. Substituting into the first state equation, we get:


$$\frac{d[h_0 + \Delta h(t)]}{dt} = u_{10} + \Delta u_1(t) + u_{20} + \Delta u_2(t) - k\sqrt{h_0 + \Delta h(t)}$$

There is an evident nonlinear dependence on $h(t)$ because of the presence of the square root. It is possible to linearize that term using a first-order Taylor expansion around $h_0$:

$$-k\sqrt{h_0 + \Delta h(t)} \cong -k\left( \sqrt{h_0} + \frac{1}{2\sqrt{h_0}}\, \Delta h(t) \right)$$

We obtain:

$$\frac{d[h_0 + \Delta h(t)]}{dt} = u_{10} + \Delta u_1(t) + u_{20} + \Delta u_2(t) - k\sqrt{h_0} - \frac{k}{2\sqrt{h_0}}\, \Delta h(t)$$

At regime there is no variation of height and concentration, thus we can write:

$$\frac{dh_0}{dt} = u_{10} + u_{20} - k\sqrt{h_0} = 0$$

Hence, we obtain the following linearized state equation for the height:

$$\frac{d\Delta h(t)}{dt} = \Delta u_1(t) + \Delta u_2(t) - \frac{k}{2\sqrt{h_0}}\, \Delta h(t)$$

As regards the concentration, we can repeat similar computations; the result is the following:

$$\frac{d\Delta c(t)}{dt} = \frac{c_1 - c_0}{h_0}\, \Delta u_1(t) + \frac{c_2 - c_0}{h_0}\, \Delta u_2(t) - \left( \frac{c_1 - c_0}{h_0^2}\, u_{10} + \frac{c_2 - c_0}{h_0^2}\, u_{20} \right) \Delta h(t) - \left( \frac{u_{10}}{h_0} + \frac{u_{20}}{h_0} \right) \Delta c(t)$$

Thus, we can write the state equation in vector form as follows:

$$\begin{bmatrix} \dfrac{d\Delta h(t)}{dt} \\[2mm] \dfrac{d\Delta c(t)}{dt} \end{bmatrix} = \underbrace{\begin{bmatrix} -\dfrac{k}{2\sqrt{h_0}} & 0 \\[2mm] -\dfrac{c_1 - c_0}{h_0^2}\,u_{10} - \dfrac{c_2 - c_0}{h_0^2}\,u_{20} & -\dfrac{u_{10}}{h_0} - \dfrac{u_{20}}{h_0} \end{bmatrix}}_{\tilde{A}} \begin{bmatrix} \Delta h(t) \\ \Delta c(t) \end{bmatrix} + \underbrace{\begin{bmatrix} 1 & 1 \\[1mm] \dfrac{c_1 - c_0}{h_0} & \dfrac{c_2 - c_0}{h_0} \end{bmatrix}}_{\tilde{B}} \begin{bmatrix} \Delta u_1(t) \\ \Delta u_2(t) \end{bmatrix}$$

If we introduce

$$\mathbf{x}(t) = \begin{bmatrix} \Delta h(t) \\ \Delta c(t) \end{bmatrix}, \qquad \mathbf{u}(t) = \begin{bmatrix} \Delta u_1(t) \\ \Delta u_2(t) \end{bmatrix}$$

we get

$$\dot{\mathbf{x}}(t) = \tilde{A}\,\mathbf{x}(t) + \tilde{B}\,\mathbf{u}(t), \qquad t \in [0, T]$$

which is a linear system in the $\Delta$ variables.


The previous system is a good approximation of the original nonlinear equations for small perturbations around the regime.
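The matrices $\tilde{A}$ and $\tilde{B}$ can be checked numerically against finite-difference Jacobians of the nonlinear right-hand side. The regime values below ($k$, $c_1$, $c_2$, $h_0$, $u_{10}$, $u_{20}$) are made-up numbers chosen to satisfy the regime conditions; they are not from the notes.

```python
import numpy as np

# Illustrative regime values (made up for the sketch); k is the outflow
# coefficient, c1 and c2 the inflow concentrations.
k, c1, c2 = 0.5, 8.0, 2.0
h0 = 4.0
u10, u20 = 0.4, 0.6          # chosen so that u10 + u20 = k * sqrt(h0)
c0 = (c1 * u10 + c2 * u20) / (u10 + u20)   # regime concentration (dc/dt = 0)

def f(x, u):
    """Nonlinear right-hand side of the mixing-tank state equation."""
    h, c = x
    u1, u2 = u
    return np.array([
        u1 + u2 - k * np.sqrt(h),
        (c1 - c) / h * u1 + (c2 - c) / h * u2,
    ])

# Analytic linearization matrices from the derivation above.
A_tilde = np.array([
    [-k / (2 * np.sqrt(h0)), 0.0],
    [-(c1 - c0) / h0**2 * u10 - (c2 - c0) / h0**2 * u20, -(u10 + u20) / h0],
])
B_tilde = np.array([
    [1.0, 1.0],
    [(c1 - c0) / h0, (c2 - c0) / h0],
])

# Central finite-difference Jacobians of f at the regime point.
eps = 1e-6
x0, u0 = np.array([h0, c0]), np.array([u10, u20])
A_num = np.column_stack([
    (f(x0 + eps * e, u0) - f(x0 - eps * e, u0)) / (2 * eps) for e in np.eye(2)
])
B_num = np.column_stack([
    (f(x0, u0 + eps * e) - f(x0, u0 - eps * e)) / (2 * eps) for e in np.eye(2)
])

print(np.allclose(A_num, A_tilde, atol=1e-5), np.allclose(B_num, B_tilde, atol=1e-5))
```

Any consistent regime point works: the check only requires $u_{10} + u_{20} = k\sqrt{h_0}$ and $c_0$ chosen so that $dc/dt = 0$.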

Now let us consider the cost function $J$. Without loss of generality, we choose to penalize large deviations from the regime:

$$J = \int_0^T \tilde{v}_{11}[h(t) - h_0]^2 + \tilde{v}_{22}[c(t) - c_0]^2 \, dt + v_{T11}[h(T) - h_0]^2 + v_{T22}[c(T) - c_0]^2$$

In general, it is convenient to use penalization terms that are quadratic (so that positive and negative deviations have the same weight). The terms $\tilde{v}_{11}$, $\tilde{v}_{22}$, $v_{T11}$, $v_{T22}$ are positive coefficients chosen via a trial-and-error procedure.

In general, we consider constraints on $h$, $c$, $u_1$, $u_2$ as follows:

$$\begin{cases} 0 \leq h(t) \leq H \\ 0 \leq c(t) \leq C \end{cases} \qquad \begin{cases} 0 \leq u_1(t) \leq U_1 \\ 0 \leq u_2(t) \leq U_2 \end{cases}$$

where $H$ is the total height of the tank (which is of course a finite number), $C$ is the maximum concentration of chlorine, and $U_1$ and $U_2$ are the maximum actions that we can exert on the valves.

A typical approach to deal with constraints is to "transfer" them into the cost function through suitable penalization functions, which make the cost increase if constraints are violated. Since the goal is to minimize the cost, hopefully the constraints will not be violated (in the case of soft constraints, we may accept small violations).

Let us consider, for instance, the constraint $0 \leq u_1(t) \leq U_1$. The ideal penalization function for $u_1(t)$ is:

$$\psi_1(u_1) = \begin{cases} 0 & \text{if } 0 \leq u_1(t) \leq U_1 \\ +\infty & \text{otherwise} \end{cases}$$

An issue arises at the limits of the domain, since the derivative is not continuous there. We therefore consider a smooth version of the function, with smoothed corner points.

We now insert all the penalization functions in the cost:


$$J = \int_0^T \tilde{v}_{11}[h(t) - h_0]^2 + \psi_h\big(h(t)\big) + \tilde{v}_{22}[c(t) - c_0]^2 + \psi_c\big(c(t)\big) + \psi_1\big(u_1(t)\big) + \psi_2\big(u_2(t)\big)\, dt +$$

$$+ v_{T11}[h(T) - h_0]^2 + v_{T22}[c(T) - c_0]^2 + \text{penalization terms at time } T \text{ if needed}$$

Using these penalization functions, we have lost the quadraticity of the cost, since the penalization terms are non-quadratic. To obtain a quadratic cost, we simply choose quadratic penalty functions as follows. We consider a parabolic function; a natural choice for the vertex would be the middle of the admissible interval, $U_1/2$, but here we place it at the regime value:

$$\psi_1\big(u_1(t)\big) = \tilde{P}_{11}[u_1(t) - u_{10}]^2$$

where $\tilde{P}_{11}$ is a weight coefficient determining the slope of the parabola and $u_1(t) - u_{10} = \Delta u_1(t)$. We choose the regime value $u_{10}$ as the vertex since we want to remain in its neighborhood: the goal of staying around the regime is obtained because the function is equal to zero only at its minimum. The narrower the parabola, the higher the values for the points outside the domain (constraints), i.e., the less likely it is to "stay" there.

Summarizing, we can write

$$J = \int_0^T \big(\tilde{v}_{11} + \tilde{v}'_{11}\big)\Delta h^2(t) + \big(\tilde{v}_{22} + \tilde{v}'_{22}\big)\Delta c^2(t) + \tilde{P}_{11}\Delta u_1^2(t) + \tilde{P}_{22}\Delta u_2^2(t)\, dt +$$

$$+ v_{T11}\Delta h^2(T) + v_{T22}\Delta c^2(T)$$

since $\psi_h\big(h(t)\big) = \tilde{v}'_{11}[h(t) - h_0]^2$, and similarly for the other penalization terms.

After the linearization, we perform a time discretization, both in the cost and in the state equation, in order to obtain a discrete-time system. The interval $[0, T]$ is divided into $N$ sub-intervals.

The discretized cost is:

$$J = \sum_{i=0}^{N-1} \begin{bmatrix} \Delta h_i & \Delta c_i \end{bmatrix} \begin{bmatrix} v_{11} & 0 \\ 0 & v_{22} \end{bmatrix} \begin{bmatrix} \Delta h_i \\ \Delta c_i \end{bmatrix} + \begin{bmatrix} \Delta u_{1i} & \Delta u_{2i} \end{bmatrix} \begin{bmatrix} P_{11} & 0 \\ 0 & P_{22} \end{bmatrix} \begin{bmatrix} \Delta u_{1i} \\ \Delta u_{2i} \end{bmatrix} +$$

$$+ \begin{bmatrix} \Delta h_N & \Delta c_N \end{bmatrix} \begin{bmatrix} v_{N11} & 0 \\ 0 & v_{N22} \end{bmatrix} \begin{bmatrix} \Delta h_N \\ \Delta c_N \end{bmatrix}$$

where $v_{11}$ and $v_{22}$ are the discrete versions of the coefficients $\tilde{v}_{11} + \tilde{v}'_{11}$ and $\tilde{v}_{22} + \tilde{v}'_{22}$, respectively.


The discretized state equation is:

$$\mathbf{x}_{i+1} = A\mathbf{x}_i + B\mathbf{u}_i, \quad i = 0, \dots, N-1$$

Hence, we can write the general form of an LQ dynamic optimization problem:

$$\begin{cases} \min\limits_{\mathbf{u}_0, \dots, \mathbf{u}_{N-1}} J = \sum_{i=0}^{N-1} \left( \mathbf{x}_i^{\top} V \mathbf{x}_i + \mathbf{u}_i^{\top} P \mathbf{u}_i \right) + \mathbf{x}_N^{\top} V_N \mathbf{x}_N \\ \text{s.t. } \mathbf{x}_{i+1} = A\mathbf{x}_i + B\mathbf{u}_i, \quad i = 0, \dots, N-1 \\ \phantom{\text{s.t. }} \mathbf{x}_0 = \bar{\mathbf{x}} \end{cases}$$

In general, we assume that $V$, $P$, $V_N$ are symmetric matrices such that:

• $V = V^{\top} \geq 0$ (positive semidefinite)
• $P = P^{\top} > 0$ (positive definite)
• $V_N = V_N^{\top} \geq 0$ (positive semidefinite)

These assumptions are usually satisfied. The LQ case is important since we can find analytical solutions to the Bellman equations

$$\begin{cases} J_N^{\circ}(\mathbf{x}_N) = h_N(\mathbf{x}_N) \\ J_i^{\circ}(\mathbf{x}_i) = \min\limits_{\mathbf{u}_i} \left[ h(\mathbf{x}_i, \mathbf{u}_i) + J_{i+1}^{\circ}\big(f(\mathbf{x}_i, \mathbf{u}_i)\big) \right], & i = N-1, \dots, 0 \end{cases}$$

In fact, using the LQ hypotheses, we get:

$$\begin{cases} J_N^{\circ}(\mathbf{x}_N) = \mathbf{x}_N^{\top} V_N \mathbf{x}_N \\ J_{N-1}^{\circ}(\mathbf{x}_{N-1}) = \min\limits_{\mathbf{u}_{N-1}} \left( \mathbf{x}_{N-1}^{\top} V \mathbf{x}_{N-1} + \mathbf{u}_{N-1}^{\top} P \mathbf{u}_{N-1} + \mathbf{x}_N^{\top} V_N \mathbf{x}_N \right) \end{cases}$$

But, using $\mathbf{x}_N = A\mathbf{x}_{N-1} + B\mathbf{u}_{N-1}$,

$$\mathbf{x}_N^{\top} V_N \mathbf{x}_N = (A\mathbf{x}_{N-1})^{\top} V_N A\mathbf{x}_{N-1} + (A\mathbf{x}_{N-1})^{\top} V_N B\mathbf{u}_{N-1} + (B\mathbf{u}_{N-1})^{\top} V_N A\mathbf{x}_{N-1} + (B\mathbf{u}_{N-1})^{\top} V_N B\mathbf{u}_{N-1} =$$

$$= \mathbf{x}_{N-1}^{\top} A^{\top} V_N A\, \mathbf{x}_{N-1} + 2\,\mathbf{x}_{N-1}^{\top} A^{\top} V_N B\, \mathbf{u}_{N-1} + \mathbf{u}_{N-1}^{\top} B^{\top} V_N B\, \mathbf{u}_{N-1}$$

where the two cross terms have been merged, since they are scalars and one is the transpose of the other.

Extracting from the minimum the terms that do not depend on $\mathbf{u}_{N-1}$, we obtain:

$$J_{N-1}^{\circ}(\mathbf{x}_{N-1}) = \mathbf{x}_{N-1}^{\top}(V + A^{\top} V_N A)\mathbf{x}_{N-1} + \min_{\mathbf{u}_{N-1}} \left( \mathbf{u}_{N-1}^{\top}(P + B^{\top} V_N B)\mathbf{u}_{N-1} + 2\big(\mathbf{x}_{N-1}^{\top} A^{\top} V_N B\big)\mathbf{u}_{N-1} \right)$$

Now, we use the following matrix property:

$$\min_{\mathbf{z}} \left( \mathbf{z}^{\top} Q \mathbf{z} + 2\mathbf{c}^{\top}\mathbf{z} \right) \Rightarrow \mathbf{z}^* = -Q^{-1}\mathbf{c}$$

In our case, $Q = P + B^{\top} V_N B$ (with $Q = Q^{\top} > 0$), $\mathbf{c}^{\top} = \mathbf{x}_{N-1}^{\top} A^{\top} V_N B$, and $\mathbf{z} = \mathbf{u}_{N-1}$.

Thus, we can write:


$$\begin{cases} J_{N-1}^{\circ}(\mathbf{x}_{N-1}) = \mathbf{x}_{N-1}^{\top}(V + A^{\top} V_N A)\mathbf{x}_{N-1} - \mathbf{x}_{N-1}^{\top} A^{\top} V_N B (P + B^{\top} V_N B)^{-1} B^{\top} V_N A\, \mathbf{x}_{N-1} \\ \mathbf{u}_{N-1}^*(\mathbf{x}_{N-1}) = -(P + B^{\top} V_N B)^{-1} B^{\top} V_N A\, \mathbf{x}_{N-1} \end{cases}$$

that is,

$$J_{N-1}^{\circ}(\mathbf{x}_{N-1}) = \mathbf{x}_{N-1}^{\top} T_{N-1} \mathbf{x}_{N-1}$$

$$\mathbf{u}_{N-1}^*(\mathbf{x}_{N-1}) = -L_{N-1}\mathbf{x}_{N-1}$$

where

$$T_{N-1} = V + A^{\top}\left[ V_N - V_N B (P + B^{\top} V_N B)^{-1} B^{\top} V_N \right] A$$

$$L_{N-1} = (P + B^{\top} V_N B)^{-1} B^{\top} V_N A$$

The cost to go is a quadratic form in $\mathbf{x}_{N-1}$, and the optimal control is a negative proportional feedback. For the stages $N-2, N-3, \dots$, we repeat the same computations as for stage $N-1$, and therefore we get:

$$\begin{cases} J_{N-2}^{\circ}(\mathbf{x}_{N-2}) = \mathbf{x}_{N-2}^{\top} T_{N-2} \mathbf{x}_{N-2} \\ \mathbf{u}_{N-2}^* = -L_{N-2}\mathbf{x}_{N-2} \end{cases}$$

In general, we have the following:

$$\mathbf{u}_i^* = -L_i \mathbf{x}_i, \quad i = N-1, \dots, 0$$

$$L_i = (P + B^{\top} T_{i+1} B)^{-1} B^{\top} T_{i+1} A$$

$$J_i^{\circ}(\mathbf{x}_i) = \mathbf{x}_i^{\top} T_i \mathbf{x}_i, \quad i = N, \dots, 0$$

where

$$\begin{cases} T_N = V_N \\ T_i = V + A^{\top}\left[ T_{i+1} - T_{i+1} B (P + B^{\top} T_{i+1} B)^{-1} B^{\top} T_{i+1} \right] A, & i = N-1, \dots, 0 \end{cases}$$

The matrices $T_i$ are computed in a recursive manner by means of the so-called Riccati matrix equation, and the resulting optimal control is a negative proportional feedback of the current state. In general, solving this equation analytically is difficult but possible (we will not see the details in this course).
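The backward Riccati recursion and the closed-loop forward phase can be sketched as follows. The matrices $A$, $B$, $V$, $P$, $V_N$ and the horizon below are illustrative choices (not the mixing-tank ones); as a consistency check, the realized closed-loop cost must equal the optimal cost to go $\mathbf{x}_0^{\top} T_0 \mathbf{x}_0$.

```python
import numpy as np

# Illustrative LQ data (made up for the sketch).
A = np.array([[1.0, 0.1], [0.0, 0.9]])
B = np.array([[0.0], [0.1]])
V = np.diag([1.0, 0.5])          # state weight, V = V^T >= 0
P = np.array([[0.1]])            # control weight, P = P^T > 0
V_N = np.diag([2.0, 1.0])        # final weight
N = 20

# Backward phase: T_N = V_N, then the Riccati recursion for T_i and L_i.
T = [None] * (N + 1)
L = [None] * N
T[N] = V_N
for i in range(N - 1, -1, -1):
    G = np.linalg.inv(P + B.T @ T[i + 1] @ B)
    L[i] = G @ B.T @ T[i + 1] @ A
    T[i] = V + A.T @ (T[i + 1] - T[i + 1] @ B @ G @ B.T @ T[i + 1]) @ A

# Forward phase: apply u_i = -L_i x_i and accumulate the realized cost.
x0 = np.array([1.0, -1.0])
x, cost = x0.copy(), 0.0
for i in range(N):
    u = -L[i] @ x
    cost += x @ V @ x + u @ P @ u
    x = A @ x + B @ u
cost += x @ V_N @ x

# The realized closed-loop cost equals the optimal cost to go x0^T T_0 x0.
print(np.isclose(cost, x0 @ T[0] @ x0))
```

The matrices $T_i$ stay symmetric along the recursion, and the feedback gains $L_i$ are computed once offline; the forward phase only performs matrix-vector products.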



2.9. Non-LQ problem

The system dynamics and cost function in the general (non-LQ) case are the following:

$$\begin{cases} \mathbf{x}_{i+1} = f(\mathbf{x}_i, \mathbf{u}_i) \\ \mathbf{x}_0 = \bar{\mathbf{x}} \\ \min\limits_{\mathbf{u}_0, \dots, \mathbf{u}_{N-1}} J = \sum_{i=0}^{N-1} h(\mathbf{x}_i, \mathbf{u}_i) + h_N(\mathbf{x}_N) \end{cases}$$

Bellman Equations:

$$\begin{cases} J_N^{\circ}(\mathbf{x}_N) = h_N(\mathbf{x}_N) \\ J_i^{\circ}(\mathbf{x}_i) = \min\limits_{\mathbf{u}_i} \left[ h(\mathbf{x}_i, \mathbf{u}_i) + J_{i+1}^{\circ}\big(f(\mathbf{x}_i, \mathbf{u}_i)\big) \right], & i = N-1, \dots, 0 \end{cases}$$

In this case, it is not possible to find analytical solutions, and therefore approximations are needed. This happens for instance in the example of the mixing tank for large deviations from the regime.

Without loss of generality, let us represent the state spaces 𝑋V at the various time stages in the mixing tank example:

Stage 𝑁:

$$J_N^{\circ}(\mathbf{x}_N) = h_N(\mathbf{x}_N)$$

This cost to go has an analytical expression, since $h_N$ is known for all $\mathbf{x}_N \in X_N$.

Stage 𝑁 − 1:

$$J_{N-1}^{\circ}(\mathbf{x}_{N-1}) = \min_{\mathbf{u}_{N-1}} \left[ h(\mathbf{x}_{N-1}, \mathbf{u}_{N-1}) + J_N^{\circ}\big(f(\mathbf{x}_{N-1}, \mathbf{u}_{N-1})\big) \right]$$

Constraints:

$$0 \leq h(t) \leq H \Rightarrow 0 \leq h_i \leq H \quad \forall i$$

$$0 \leq c(t) \leq C \Rightarrow 0 \leq c_i \leq C \quad \forall i$$


There are two issues:

1) We cannot find the analytical expression of the minimizer $\mathbf{u}_{N-1}^*$.
2) Since $\mathbf{x}_{N-1} \in X_{N-1}$, we would have to solve infinitely many minimization problems in order to know the optimal cost $J_{N-1}^{\circ}$ for all the possible values of $\mathbf{x}_{N-1}$.

To face 1), we can employ an approximate minimization algorithm (we will see the details in the part on nonlinear programming). To face 2), we discretize the state space $X_{N-1}$ (for instance with a regular grid made up of $d \times d$ points) and compute the cost to go for all the discretization points. Thus, we need to solve:

$$J_{N-1}^{\circ}\big(\mathbf{x}_{N-1}^j\big) = \min_{\mathbf{u}_{N-1}} \left[ h\big(\mathbf{x}_{N-1}^j, \mathbf{u}_{N-1}\big) + J_N^{\circ}\Big(f\big(\mathbf{x}_{N-1}^j, \mathbf{u}_{N-1}\big)\Big) \right], \quad j = 1, \dots, d^2$$

At the end, the cost-to-go functions are known only at the discretization points.

Stage 𝑁 − 2:

$$J_{N-2}^{\circ}(\mathbf{x}_{N-2}) = \min_{\mathbf{u}_{N-2}} \left[ h(\mathbf{x}_{N-2}, \mathbf{u}_{N-2}) + J_{N-1}^{\circ}\big(f(\mathbf{x}_{N-2}, \mathbf{u}_{N-2})\big) \right]$$

As before, we need to discretize $X_{N-2}$ into $j = 1, \dots, d^2$ points and compute the Bellman equations at the discretization points:

$$J_{N-2}^{\circ}\big(\mathbf{x}_{N-2}^j\big) = \min_{\mathbf{u}_{N-2}} \left[ h\big(\mathbf{x}_{N-2}^j, \mathbf{u}_{N-2}\big) + J_{N-1}^{\circ}\Big(f\big(\mathbf{x}_{N-2}^j, \mathbf{u}_{N-2}\big)\Big) \right]$$

Unfortunately, we know $J_{N-1}^{\circ}$ only at the discretization points of the set $X_{N-1}$, and it may happen that $f\big(\mathbf{x}_{N-2}^j, \mathbf{u}_{N-2}\big)$ is not a discretization point. Hence, an additional issue arises with respect to stage $N-1$: we have to compute an approximation $\tilde{J}_{N-1}^{\circ}$ of $J_{N-1}^{\circ}$. Using this approximation, the Bellman equation for stage $N-2$ becomes the following:

$$J_{N-2}^{\circ}\big(\mathbf{x}_{N-2}^j\big) = \min_{\mathbf{u}_{N-2}} \left[ h\big(\mathbf{x}_{N-2}^j, \mathbf{u}_{N-2}\big) + \tilde{J}_{N-1}^{\circ}\Big(f\big(\mathbf{x}_{N-2}^j, \mathbf{u}_{N-2}\big)\Big) \right]$$

Thus, we have to construct a sequence of approximations for the Bellman equations. Three sources of approximation exist at this stage (and also at the other ones, up to the first stage):

1) The discretization grid.
2) The approximation of $\min_{\mathbf{u}_i}(\cdot)$.
3) The construction of the approximate cost to go $\tilde{J}_i^{\circ}$.

Let us first analyze the case of the discretization grid.

The need to discretize the state spaces $X_i$, $i = N-1, \dots, 0$, is a severe issue, as it may lead to the so-called curse of dimensionality, i.e., an exponential increase of the number of computations as the dimensionality of the problem grows. For a large state dimension (for instance, 10 or more), the use of DP to solve a dynamic optimization problem is difficult, due to the increase of the number of points in the grid (and therefore of the number of nonlinear programming problems that we have to solve). There exist techniques to mitigate such issues, based on random sampling.


Unfortunately, it is impossible to fix a priori the best number of points; for instance, a trial-and-error procedure may be used. The risk of using random sampling is the creation of clusters of points in certain regions and the lack of points in others, which may lead to a coarse approximation of the cost-to-go functions in some regions of the state space. A good tradeoff may be the use of low-discrepancy sequences, i.e., deterministic sequences that allow a good uniform sampling of the sets $X_i$ with a reduced number of points and without clusters or empty regions.
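As a concrete example of a low-discrepancy sequence, the sketch below generates 2-D Halton points; the notes mention low-discrepancy sequences only in general, so the choice of the Halton construction here is ours.

```python
# A 2-D Halton sequence as a concrete example of a low-discrepancy
# sequence for sampling a two-dimensional state space.

def van_der_corput(n, base):
    """n-th term of the van der Corput sequence in the given base."""
    q, bk = 0.0, 1.0 / base
    while n > 0:
        n, r = divmod(n, base)
        q += r * bk
        bk /= base
    return q

def halton_2d(n_points):
    """First n_points of the 2-D Halton sequence (bases 2 and 3)."""
    return [(van_der_corput(i, 2), van_der_corput(i, 3))
            for i in range(1, n_points + 1)]

pts = halton_2d(100)
# All points lie in the unit square and spread out uniformly, without the
# clusters and empty regions typical of plain random sampling.
print(all(0.0 <= x < 1.0 and 0.0 <= y < 1.0 for x, y in pts))
```

The unit-square points can then be rescaled to any rectangular state space $X_i$.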

Concerning the construction of the approximate cost to go $\tilde{J}_i^{\circ}$, we have to approximate a function that is known only at certain discretization points. Two main options are available:

• Least squares approximation.
• Nonlinear approximation techniques.

2.9.1. Least squares approximation

In this section, we investigate how to approximate an unknown function from samples using least squares (LS). Suppose we have a nonlinear unknown function $f(\mathbf{x})$ that we know only at certain points:

• $\mathbf{x}_1, \dots, \mathbf{x}_N$ are $N$ measurement points.
• $f(\mathbf{x}_1), \dots, f(\mathbf{x}_N)$ are the values of the function at the measurement points.
• $y_1, \dots, y_N$ are noisy measures of $f(\mathbf{x}_1), \dots, f(\mathbf{x}_N)$.


The goal is to construct an approximating function $\hat{f}$ of $f$ starting from the noisy measurements. In general, we can write:

$$y_i = f(\mathbf{x}_i) + \eta_i, \quad i = 1, \dots, N$$

where $\eta_i$ is the measurement noise.

The easiest choice is to interpolate the measures. However, in the presence of large noise this is not optimal, as interpolation suffers from the so-called overfitting phenomenon. In general, it is preferable to use an approximating function $\hat{f}$ that does not pass through the noisy measures. The least squares method allows us to construct this kind of approximating function. The idea is to assign a fixed parametrized structure to $\hat{f}$; for instance, we may use a linear combination of powers of $x$ (polynomial expansion). In this case, we can write:

$$\hat{f}(x, \mathbf{w}) = \sum_{i=0}^{m} c_i x^i$$

where $\mathbf{w}$ is a vector collecting the coefficients of the linear combination, i.e.,

$$\mathbf{w} = \begin{pmatrix} c_0 \\ c_1 \\ \vdots \\ c_m \end{pmatrix}$$

The function $\hat{f}$ is a polynomial of degree $m$. By changing the coefficients $c_i$, we can change the shape of $\hat{f}$. The choice of $m$ is non-trivial (a trial-and-error procedure is needed). In general, we choose $m \ll N$ to avoid overfitting. Other choices of $\hat{f}$ can be made; the idea is still to write $\hat{f}$ as a linear combination of known basis functions:

$$\hat{f}(x, \mathbf{w}) = \sum_{i=0}^{m} c_i \phi_i(x)$$

Examples of $\phi_i$ are polynomials, but also sine, cosine, or exponential functions. In all cases, the goal is to choose the best value for the coefficient vector $\mathbf{w}$ (after choosing $m$ and the kind of parametrized structure), in order to construct a good approximating function $\hat{f}$ of $f$. Toward this end, we define an approximation error:

$$e(\mathbf{w}) = \sum_{k=1}^{N} \left[ f(\mathbf{x}_k) - \hat{f}(\mathbf{x}_k, \mathbf{w}) \right]^2$$

usually referred to as the mean square error (MSE). Since $f$ is unknown, the MSE is written in terms of the measurements as:

$$e(\mathbf{w}) = \sum_{k=1}^{N} \left[ y_k - \hat{f}(\mathbf{x}_k, \mathbf{w}) \right]^2$$


We choose $\mathbf{w}$ by minimizing the MSE, i.e.,

$$\mathbf{w}^* = \arg\min_{\mathbf{w}} e(\mathbf{w})$$

The previous is a generic nonlinear programming problem. To find a solution, we could use any approximation algorithm (see the part of the course related to nonlinear programming). However, in this case we are able to find an analytical solution. Without loss of generality, let us focus on polynomials with $m = 3$ and scalar $x$ and $y$. Consider $y_k \cong \hat{f}(x_k)$, i.e., neglect the noise:

$$\begin{cases} \hat{f}(x_1) = c_0 + c_1 x_1 + c_2 x_1^2 + c_3 x_1^3 \cong y_1 \\ \hat{f}(x_2) = c_0 + c_1 x_2 + c_2 x_2^2 + c_3 x_2^3 \cong y_2 \\ \vdots \\ \hat{f}(x_N) = c_0 + c_1 x_N + c_2 x_N^2 + c_3 x_N^3 \cong y_N \end{cases}$$

The previous is a system of "almost equations", with $N$ equations and $m+1$ unknowns $(c_0, \dots, c_3)^{\top}$. Since, in general, $N \gg m$, this system has no solutions (there are more equations than unknowns). Let us rewrite it in vector form:

$$\underbrace{\begin{bmatrix} 1 & x_1 & x_1^2 & x_1^3 \\ 1 & x_2 & x_2^2 & x_2^3 \\ \vdots & \vdots & \vdots & \vdots \\ 1 & x_N & x_N^2 & x_N^3 \end{bmatrix}}_{H \in \mathbb{R}^{N \times (m+1)}} \underbrace{\begin{pmatrix} c_0 \\ c_1 \\ c_2 \\ c_3 \end{pmatrix}}_{\mathbf{w} \in \mathbb{R}^{m+1}} \cong \underbrace{\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix}}_{\mathbf{y} \in \mathbb{R}^{N}}$$

Our goal is to make the left-hand side as close as possible to the right-hand side, by minimizing the MSE:

$$e(\mathbf{w}) = \| H\mathbf{w} - \mathbf{y} \|^2 = (H\mathbf{w} - \mathbf{y})^{\top}(H\mathbf{w} - \mathbf{y}) = \mathbf{w}^{\top} H^{\top} H \mathbf{w} - \mathbf{w}^{\top} H^{\top} \mathbf{y} - \mathbf{y}^{\top} H \mathbf{w} + \mathbf{y}^{\top}\mathbf{y} =$$

$$= \mathbf{w}^{\top} H^{\top} H \mathbf{w} - 2\mathbf{y}^{\top} H \mathbf{w} + \mathbf{y}^{\top}\mathbf{y} = \mathbf{w}^{\top} Q \mathbf{w} + 2\mathbf{c}^{\top}\mathbf{w} + \mathbf{y}^{\top}\mathbf{y}$$

where $Q = H^{\top} H$ and $\mathbf{c}^{\top} = -\mathbf{y}^{\top} H$. The last term is a constant in our minimization, as it does not depend on $\mathbf{w}$.


Using the matrix property that, for the problem $\min_{\mathbf{z}} (\mathbf{z}^{\top} Q \mathbf{z} + 2\mathbf{c}^{\top}\mathbf{z})$ with $Q = Q^{\top} > 0$ (symmetric, positive definite), the minimizer is $\mathbf{z}^* = -Q^{-1}\mathbf{c}$, the optimal vector of parameters of the least squares approach is given by:

$$\min_{\mathbf{w}} e(\mathbf{w}) \Rightarrow \mathbf{w}^* = (H^{\top} H)^{-1} H^{\top} \mathbf{y}$$

The previous computations can be extended to generic basis functions $\phi_i(\mathbf{x})$ instead of polynomials. We have

$$\hat{f}(\mathbf{x}, \mathbf{w}) = \sum_{i=1}^{m} c_i \phi_i(\mathbf{x}) \qquad (2.2)$$

$$e(\mathbf{w}) = \sum_{k=1}^{N} \left( y_k - \hat{f}(\mathbf{x}_k, \mathbf{w}) \right)^2$$

System of "almost equations":

$$\begin{cases} c_1\phi_1(x_1) + c_2\phi_2(x_1) + \dots + c_m\phi_m(x_1) \cong y_1 \\ c_1\phi_1(x_2) + c_2\phi_2(x_2) + \dots + c_m\phi_m(x_2) \cong y_2 \\ \vdots \\ c_1\phi_1(x_N) + c_2\phi_2(x_N) + \dots + c_m\phi_m(x_N) \cong y_N \end{cases} \Rightarrow \underbrace{\begin{bmatrix} \phi_1(x_1) & \cdots & \phi_m(x_1) \\ \vdots & \ddots & \vdots \\ \phi_1(x_N) & \cdots & \phi_m(x_N) \end{bmatrix}}_{\Phi} \underbrace{\begin{pmatrix} c_1 \\ \vdots \\ c_m \end{pmatrix}}_{\mathbf{w}} \cong \underbrace{\begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix}}_{\mathbf{y}}$$

that is,

$$\Phi\mathbf{w} \cong \mathbf{y}$$

By using the same arguments as before, we get:

$$\mathbf{w}^* = (\Phi^{\top}\Phi)^{-1}\Phi^{\top}\mathbf{y}$$

Unfortunately, in general, $\Phi^{\top}\Phi$ (or $H^{\top}H$) is an ill-conditioned matrix, and therefore computing its inverse may be difficult.

The basic least squares expansion (2.2) may suffer from the curse of dimensionality in $m$: the number $m$ of basis functions required to obtain a satisfactory approximation may grow too fast with the dimension of $\mathbf{x}$. A mitigation technique consists in inserting parameters in the basis functions, i.e.,

$$\hat{f}(\mathbf{x}, \mathbf{w}) = \sum_{i=1}^{m} c_i \phi_i(\mathbf{x}, \mathbf{k}_i) \qquad (2.3)$$

where $\phi_i(\mathbf{x}, \mathbf{k}_i)$ is a parametrized basis function and


$$\mathbf{w} = \begin{pmatrix} c_1 \\ \vdots \\ c_m \\ \mathbf{k}_1 \\ \vdots \\ \mathbf{k}_m \end{pmatrix}$$

is the vector of parameters, where $c_1, \dots, c_m$ are the outer parameters and $\mathbf{k}_1, \dots, \mathbf{k}_m$ are the inner parameters.

In general, the use of parametrized basis functions allows having an overall number of unknown parameters that is smaller than the number required in equation (2.2) for the same accuracy. In equation (2.2), $\hat{f}$ depends linearly on $\mathbf{w}$, whereas in (2.3) $\hat{f}$ depends nonlinearly on $\mathbf{w}$ (since $\phi_i$ may be nonlinear in $\mathbf{k}_i$). This makes it impossible to find an analytical solution to the optimization problem $\min_{\mathbf{w}} e(\mathbf{w})$ (i.e., no analytical expression for $\mathbf{w}^*$ exists), so we have to resort to approximations to find $\mathbf{w}^*$.

Equation (2.2) belongs to the family of linear approximators (so least squares can be used).

Equation (2.3) belongs to the family of nonlinear approximators (so least squares cannot be used). It is an example of the so-called one-hidden-layer feedforward neural networks. Such structures mimic the behavior of the neurons in the human brain. In this case, $\phi_i(\mathbf{x}, \mathbf{k}_i) = \tanh(\boldsymbol{\alpha}_i^{\top}\mathbf{x} + \beta_i)$, with $\mathbf{k}_i = (\boldsymbol{\alpha}_i, \beta_i)$, where $\tanh$ is the "activation function" of the neurons.
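Since no analytical expression for $\mathbf{w}^*$ exists in the nonlinear case, the sketch below fits a one-hidden-layer network $\hat{f}(x) = \sum_i c_i \tanh(a_i x + b_i)$ by plain gradient descent on the MSE; the target function, network size, step size, and iteration count are all illustrative choices, and gradient descent is just one possible approximate minimization algorithm.

```python
import numpy as np

# Gradient descent on the MSE for a scalar one-hidden-layer network
# f_hat(x) = sum_i c_i * tanh(a_i x + b_i).
rng = np.random.default_rng(1)
x = np.linspace(-2, 2, 80)
y = np.sin(2 * x)                 # unknown function to approximate

m = 8                             # number of neurons (basis functions)
c = 0.1 * rng.standard_normal(m)  # outer parameters
a = rng.standard_normal(m)        # inner parameters (slopes)
b = rng.standard_normal(m)        # inner parameters (offsets)

def mse():
    pred = np.tanh(np.outer(x, a) + b) @ c
    return np.sum((y - pred) ** 2)

e0 = mse()
lr = 1e-3
for _ in range(3000):
    phi = np.tanh(np.outer(x, a) + b)   # N x m matrix of tanh(a_i x_k + b_i)
    r = y - phi @ c                     # residuals
    dphi = 1 - phi ** 2                 # derivative of tanh
    c += lr * 2 * phi.T @ r             # descent steps on outer parameters
    a += lr * 2 * (dphi * c).T @ (r * x)  # ... and on the inner parameters
    b += lr * 2 * (dphi * c).T @ r

print(mse() < e0)   # the MSE decreases during training
```

Only the outer parameters enter linearly; if the inner parameters were frozen, the optimal $c$ could again be obtained in closed form by least squares.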

2.9.2. Parameter estimation with least squares

Consider an unknown parameter vector $\mathbf{p}$ and a series of its measurements:

$$\mathbf{y}_i = H_i\mathbf{p} + \boldsymbol{\eta}_i, \quad i = 1, \dots, N$$

where:

- $\mathbf{y}_i$ is the $i$-th measurement of $\mathbf{p}$;
- $\mathbf{p}$ is the unknown parameter vector;
- $\boldsymbol{\eta}_i$ is the measurement noise;
- $H_i$ is a matrix of known coefficients (the distortion caused by the instrumentation used for the measurements);
- $N$ is the number of measurements.

There are two cases:

1) We know nothing about the probability density functions of $\boldsymbol{p}$ and $\boldsymbol{\eta}_i$: the best technique to estimate $\boldsymbol{p}$ is least squares.

2) We know the probability density functions of $\boldsymbol{p}$ and $\boldsymbol{\eta}_i$ and they are Gaussian: the best technique is the Kalman filter.

In the following, we will focus only on 1).

We start by investigating the case $E(\boldsymbol{\eta}_i) = 0$, i.e., there is no bias error in our measurements:

$$\boldsymbol{y}_i = H_i \boldsymbol{p} + \boldsymbol{\eta}_i, \qquad i = 1, \dots, N$$

We write a system of "almost equations" like before; we can do this because, on average, the measurement error is zero.

$$H_1 \boldsymbol{p} \cong \boldsymbol{y}_1, \quad H_2 \boldsymbol{p} \cong \boldsymbol{y}_2, \quad \dots, \quad H_N \boldsymbol{p} \cong \boldsymbol{y}_N \;\Rightarrow\; H\boldsymbol{p} \cong \boldsymbol{y}$$

where

$$H = \begin{pmatrix} H_1 \\ H_2 \\ \vdots \\ H_N \end{pmatrix}, \qquad \boldsymbol{y} = \begin{pmatrix} \boldsymbol{y}_1 \\ \boldsymbol{y}_2 \\ \vdots \\ \boldsymbol{y}_N \end{pmatrix}$$

If $N \gg \dim(\boldsymbol{p})$, we can use the least-squares argument and write:

$$\boldsymbol{p}^* = (H^\top H)^{-1} H^\top \boldsymbol{y}$$

The previous is an analytical expression of the minimizer, and it comes from the goal of finding $\min_{\boldsymbol{p}} \|H\boldsymbol{p} - \boldsymbol{y}\|^2$.

As an example of application, consider the estimation of the distance $p$ (a scalar quantity) between a submarine and a ship, starting from the available noisy measurements coming from a sonar.

Measurements are such that $y_i = p + \eta_i$. We assume $E(\eta_i) = 0$ and define the following quantities:

• $i = 1, \dots, N$;
• $N$: number of measurements;
• $p$: true distance;
• $\eta_i$: measurement error;


• $y_i$: measured distance.

We write the system of "almost equations":

$$y_1 \cong p, \;\dots,\; y_N \cong p \quad\Rightarrow\quad \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix} p \cong \boldsymbol{y}, \qquad \boldsymbol{y} = \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix}$$

$$H p \cong \boldsymbol{y}$$

Using least squares, we can write the optimal value for $p$, i.e.,

$$p^* = (H^\top H)^{-1} H^\top \boldsymbol{y} = \left[ (1 \; \cdots \; 1) \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix} \right]^{-1} (1 \; \cdots \; 1) \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix} = N^{-1} \sum_{i=1}^{N} y_i = \frac{1}{N} \sum_{i=1}^{N} y_i$$

The best estimate with least squares is the average of all the measurements.
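As a quick numerical check, the closed-form estimator can be implemented in a few lines. The measurement values below are hypothetical and only illustrate that, with $H$ a column of ones, the least-squares formula reduces to the sample mean.

```python
# Least-squares estimate of a scalar distance p from N noisy sonar
# measurements y_i = p + eta_i (illustrative numbers, not from the notes).
# With H = (1, ..., 1)^T, p* = (H^T H)^{-1} H^T y reduces to the mean.

def least_squares_scalar(y):
    """p* = (H'H)^{-1} H'y with H a column of ones == mean of y."""
    hth = len(y)      # H^T H = N
    hty = sum(y)      # H^T y = sum of the measurements
    return hty / hth

measurements = [103.0, 98.5, 101.2, 99.8, 100.5]   # hypothetical y_i
p_star = least_squares_scalar(measurements)
print(p_star)   # → 100.6, the sample mean of the five measurements
```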

Now let us consider the case $E(\eta_i) = \hat{\eta} \neq 0$, i.e., a known bias error. In general, we have $\eta_i = \hat{\eta} + \Delta\eta_i$, where $\hat{\eta}$ is known and $\Delta\eta_i$ is unknown, but $E(\Delta\eta_i) = 0$. We can write:

$$y_i = H_i p + \eta_i = H_i p + \hat{\eta} + \Delta\eta_i, \qquad i = 1, \dots, N$$

If we group the known quantities, we obtain:

$$y_i - \hat{\eta} = H_i p + \Delta\eta_i$$

$$\tilde{y}_i = H_i p + \Delta\eta_i, \qquad i = 1, \dots, N$$

where $\tilde{y}_i = y_i - \hat{\eta}$. We have the same situation as before, thus we can use least squares to estimate $p$:

$$p^* = (H^\top H)^{-1} H^\top \tilde{y}$$

Lastly, we consider the case $E(\eta_i) = \hat{\eta} \neq 0$ with $\hat{\eta}$ unknown. As before, we can write $\eta_i = \hat{\eta} + \Delta\eta_i$ with $E(\Delta\eta_i) = 0$, but now $\hat{\eta}$ is unknown. Hence, $y_i = H_i p + \hat{\eta} + \Delta\eta_i$, $i = 1, \dots, N$, where $\hat{\eta}$ this time is unknown like $p$.

Thus, we can write:

$$\tilde{p} = \begin{pmatrix} p \\ \hat{\eta} \end{pmatrix} \;\Rightarrow\; y_i = (H_i \,|\, I) \begin{pmatrix} p \\ \hat{\eta} \end{pmatrix} + \Delta\eta_i, \qquad i = 1, \dots, N$$

i.e., $y_i = \tilde{H}_i \tilde{p} + \Delta\eta_i$ with $E(\Delta\eta_i) = 0$. Again, we can write the usual least-squares expression, since all the assumptions are satisfied, and find the optimal estimate $\tilde{p}^*$:

$$\tilde{p}^* = (\tilde{H}^\top \tilde{H})^{-1} \tilde{H}^\top y$$
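A minimal sketch of the augmented estimator, for a scalar parameter $p$ and a scalar bias: each row of $\tilde{H}$ is $(h_i, 1)$, so the normal equations are a $2\times 2$ system solvable by hand. The gains $h_i$ and the helper `augmented_ls` are illustrative assumptions, not from the notes (note that varying $h_i$ are needed, otherwise $p$ and the bias are indistinguishable).

```python
# Joint least-squares estimation of a scalar parameter p and an unknown
# constant bias b, as in the augmented formulation p~ = (p, b):
# y_i = h_i * p + b + noise, i.e. H~_i = (h_i, 1).

def augmented_ls(h, y):
    """Solve (H~'H~) p~ = H~'y for p~ = (p, b) via the 2x2 normal equations."""
    n = len(h)
    s_hh = sum(hi * hi for hi in h)                  # sum h_i^2
    s_h = sum(h)                                     # sum h_i
    s_hy = sum(hi * yi for hi, yi in zip(h, y))      # sum h_i y_i
    s_y = sum(y)                                     # sum y_i
    det = s_hh * n - s_h * s_h                       # det of H~'H~
    p = (n * s_hy - s_h * s_y) / det
    b = (s_hh * s_y - s_h * s_hy) / det
    return p, b

# Noise-free check: with p = 2 and b = 5 the estimator recovers them exactly.
h = [1.0, 2.0, 3.0, 4.0]
y = [2.0 * hi + 5.0 for hi in h]
p_star, b_star = augmented_ls(h, y)
print(p_star, b_star)   # → 2.0 5.0
```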


2.10. Dynamic optimization with disturbances

Consider the case where a noise or a disturbance acts on the dynamic system:

$$\begin{cases} \boldsymbol{x}_{i+1} = f(\boldsymbol{x}_i, \boldsymbol{u}_i, \boldsymbol{\xi}_i), & i = 0, 1, \dots, N-1 \\ \boldsymbol{x}_0 = \bar{\boldsymbol{x}} \end{cases}$$

where $\boldsymbol{x}_i$ is the state, $\boldsymbol{u}_i$ is the control or decision vector, and $\boldsymbol{\xi}_i$ is a disturbance acting on the system. This disturbance is in general unknown. We assume that the sequence of disturbances $\boldsymbol{\xi}_0, \boldsymbol{\xi}_1, \dots, \boldsymbol{\xi}_{N-1}$ is mutually independent (it is a white sequence). This means that:

$$pdf(\boldsymbol{\xi}_0, \boldsymbol{\xi}_1, \dots, \boldsymbol{\xi}_{N-1}) = pdf(\boldsymbol{\xi}_0)\, pdf(\boldsymbol{\xi}_1) \cdots pdf(\boldsymbol{\xi}_{N-1})$$

We still consider an additive cost:

$$J = \underbrace{\sum_{i=0}^{N-1} h(\boldsymbol{x}_i, \boldsymbol{u}_i, \boldsymbol{\xi}_i)}_{\text{transition cost}} + \underbrace{h_N(\boldsymbol{x}_N)}_{\text{final cost}}$$

The goal is to find the sequence of decisions $\boldsymbol{u}_0 = \gamma_0(\boldsymbol{x}_0)$, $\boldsymbol{u}_1 = \gamma_1(\boldsymbol{x}_1)$, ..., $\boldsymbol{u}_{N-1} = \gamma_{N-1}(\boldsymbol{x}_{N-1})$ minimizing the expectation of the cost, i.e.,

$$\min_{\boldsymbol{u}_0, \dots, \boldsymbol{u}_{N-1}} \; \mathop{E}_{\boldsymbol{\xi}_0, \dots, \boldsymbol{\xi}_{N-1}} (J)$$

In fact, $J$ is a random variable due to the presence of the $\boldsymbol{\xi}_i$; hence, minimizing a single realization of it is useless. For this reason, we minimize the expectation of the cost. This is just one possibility: we could make a more "conservative" choice by minimizing $J$ in the worst case, i.e.,

$$\min_{\boldsymbol{u}_0, \dots, \boldsymbol{u}_{N-1}} \left( \max_{\boldsymbol{\xi}_0, \dots, \boldsymbol{\xi}_{N-1}} J \right) \qquad \text{("worst-case design")}$$

We can choose one or the other, depending on the situation at hand. The controls $\boldsymbol{u}_i = \gamma_i(\boldsymbol{x}_i)$ obtained via DP are functions of the state. With disturbances this is particularly valuable, since a feedback loop is mandatory.



2.10.1. Solution with dynamic programming

We use again the Bellman optimality principle:

"A necessary condition for $\boldsymbol{x}_i$ to belong to the optimal trajectory is that the decisions from stage $i$ up to stage $N-1$ ($\boldsymbol{u}_i, \dots, \boldsymbol{u}_{N-1}$) are optimal, i.e., they minimize the average cost to go up to stage $N$."

The cost to go of stage $i$ is:

$$J_i^\circ(\boldsymbol{x}_i) = \min_{\boldsymbol{u}_i, \dots, \boldsymbol{u}_{N-1}} \; \mathop{E}_{\boldsymbol{\xi}_i, \dots, \boldsymbol{\xi}_{N-1}} \left[ \sum_{k=i}^{N-1} h(\boldsymbol{x}_k, \boldsymbol{u}_k, \boldsymbol{\xi}_k) + h_N(\boldsymbol{x}_N) \right]$$

We compute the cost to go with the Bellman equations in the backward phase, i.e.,

$$\begin{cases} J_N^\circ(\boldsymbol{x}_N) = h_N(\boldsymbol{x}_N) \\ J_i^\circ(\boldsymbol{x}_i) = \min_{\boldsymbol{u}_i} \mathop{E}_{\boldsymbol{\xi}_i} \left[ h(\boldsymbol{x}_i, \boldsymbol{u}_i, \boldsymbol{\xi}_i) + J_{i+1}^\circ\big(f(\boldsymbol{x}_i, \boldsymbol{u}_i, \boldsymbol{\xi}_i)\big) \right], & i = N-1, \dots, 0 \end{cases}$$

We write just $E_{\boldsymbol{\xi}_i}$ owing to the assumption of mutually independent disturbances. The outputs of the backward phase are both the cost-to-go functions $J_i^\circ(\boldsymbol{x}_i)$, $i = 0, \dots, N$, and the corresponding optimal policies $\boldsymbol{u}_i = \gamma_i(\boldsymbol{x}_i)$, $i = 0, \dots, N-1$.

In the forward phase, we simply apply the decisions obtained in the backward phase as the disturbances realize:

$$\boldsymbol{x}_1 = f(\boldsymbol{x}_0, \boldsymbol{u}_0, \boldsymbol{\xi}_0) \;\text{ with }\; \boldsymbol{u}_0 = \gamma_0(\boldsymbol{x}_0)$$
$$\boldsymbol{x}_2 = f(\boldsymbol{x}_1, \boldsymbol{u}_1, \boldsymbol{\xi}_1) \;\text{ with }\; \boldsymbol{u}_1 = \gamma_1(\boldsymbol{x}_1)$$
$$\vdots$$
$$\boldsymbol{x}_N = f(\boldsymbol{x}_{N-1}, \boldsymbol{u}_{N-1}, \boldsymbol{\xi}_{N-1}) \;\text{ with }\; \boldsymbol{u}_{N-1} = \gamma_{N-1}(\boldsymbol{x}_{N-1})$$

As in the noise-free case, there are two situations:

1) LQ hypotheses ⟹ the Bellman equations can be solved analytically.
2) General case (non-LQ) ⟹ approximations are needed to solve the Bellman equations.

2.10.2. LQ case

We consider a linear system like the following:

$$\begin{cases} \boldsymbol{x}_{i+1} = A\boldsymbol{x}_i + B\boldsymbol{u}_i + \boldsymbol{\xi}_i, & i = 0, 1, \dots, N-1 \\ \boldsymbol{x}_0 = \bar{\boldsymbol{x}} \end{cases}$$

with $\boldsymbol{x}_i \in \mathbb{R}^n$, $\boldsymbol{u}_i \in \mathbb{R}^m$, $\boldsymbol{\xi}_i \in \mathbb{R}^n$ (additive noise), $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times m}$. The goal is to minimize a quadratic cost that is averaged over the disturbances:


$$\min_{\boldsymbol{u}_0, \dots, \boldsymbol{u}_{N-1}} \; \mathop{E}_{\boldsymbol{\xi}_0, \dots, \boldsymbol{\xi}_{N-1}} \left[ \sum_{i=0}^{N-1} \left( \boldsymbol{x}_i^\top V \boldsymbol{x}_i + \boldsymbol{u}_i^\top P \boldsymbol{u}_i \right) + \boldsymbol{x}_N^\top V_N \boldsymbol{x}_N \right]$$

We have a hidden dependence of the cost on the $\boldsymbol{\xi}_i$ through the states $\boldsymbol{x}_i$. As in the noise-free case, we assume $P = P^\top > 0$ and $V = V^\top \geq 0$.

Further hypotheses are the following:

- $E(\boldsymbol{\xi}_i) = 0$ (generalizations are possible, but we will not see them);
- $\mathrm{cov}(\boldsymbol{\xi}_i) = Q$, where $Q$ is a known quantity.

The probability distribution of the disturbances is generic. Under these hypotheses, it is possible to find an analytical solution to the Bellman equations:

$$J_N^\circ(\boldsymbol{x}_N) = \boldsymbol{x}_N^\top V_N \boldsymbol{x}_N$$

$$J_{N-1}^\circ(\boldsymbol{x}_{N-1}) = \min_{\boldsymbol{u}_{N-1}} \mathop{E}_{\boldsymbol{\xi}_{N-1}} \left[ \boldsymbol{x}_{N-1}^\top V \boldsymbol{x}_{N-1} + \boldsymbol{u}_{N-1}^\top P \boldsymbol{u}_{N-1} + J_N^\circ(\boldsymbol{x}_N) \right] =$$

$$= \min_{\boldsymbol{u}_{N-1}} \mathop{E}_{\boldsymbol{\xi}_{N-1}} \big[ \boldsymbol{x}_{N-1}^\top V \boldsymbol{x}_{N-1} + \boldsymbol{u}_{N-1}^\top P \boldsymbol{u}_{N-1} + (A\boldsymbol{x}_{N-1} + B\boldsymbol{u}_{N-1} + \boldsymbol{\xi}_{N-1})^\top V_N (A\boldsymbol{x}_{N-1} + B\boldsymbol{u}_{N-1} + \boldsymbol{\xi}_{N-1}) \big]$$

After lengthy computations, very similar to the noise-free case, and using suitable matrix properties, we get:

$$J_{N-1}^\circ(\boldsymbol{x}_{N-1}) = \boldsymbol{x}_{N-1}^\top T_{N-1} \boldsymbol{x}_{N-1} + \mathrm{trace}(Q V_N)$$

The corresponding optimal action $\boldsymbol{u}_{N-1}$ is:

$$\boldsymbol{u}_{N-1} = -L_{N-1} \boldsymbol{x}_{N-1}$$

where $L_{N-1}$ and $T_{N-1}$ are suitable matrices. The contribution of the trace is due to the presence of the $\boldsymbol{\xi}_i$ and of the expectation $E_{\boldsymbol{\xi}_0, \dots, \boldsymbol{\xi}_{N-1}}$. In general, for $i = N-1, \dots, 0$ we have:

$$\begin{cases} J_i^\circ(\boldsymbol{x}_i) = \boldsymbol{x}_i^\top T_i \boldsymbol{x}_i + \displaystyle\sum_{k=i+1}^{N} \mathrm{trace}(Q T_k) \\ \boldsymbol{u}_i^*(\boldsymbol{x}_i) = -L_i \boldsymbol{x}_i \end{cases}$$

where $L_i = (P + B^\top T_{i+1} B)^{-1} B^\top T_{i+1} A$.

The matrices $T_i$ are computed recursively as follows:

$$\begin{cases} T_N = V_N \\ T_i = V + A^\top \left[ T_{i+1} - T_{i+1} B (P + B^\top T_{i+1} B)^{-1} B^\top T_{i+1} \right] A, & i = N-1, N-2, \dots, 0 \end{cases}$$

These equations are the discrete-time matrix Riccati equations. The result is the same as in the noise-free case (apart from the sum of the traces).


As in the noise-free case, we have a negative proportional state feedback for $\boldsymbol{u}_i$.
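The backward Riccati recursion above is easy to run numerically. The sketch below is the scalar case ($n = m = 1$), where matrices reduce to numbers; the values of $A$, $B$, $V$, $P$, $V_N$, $N$ are hypothetical choices for illustration.

```python
# Scalar (n = m = 1) instance of the discrete-time Riccati recursion
# T_i = V + A^2 [T_{i+1} - T_{i+1} B (P + B^2 T_{i+1})^{-1} B T_{i+1}]
# with feedback gains L_i = (P + B^2 T_{i+1})^{-1} B T_{i+1} A.

def riccati_scalar(A, B, V, P, V_N, N):
    """Backward Riccati recursion; returns T_N..T_0 and gains L_{N-1}..L_0."""
    T = [0.0] * (N + 1)
    L = [0.0] * N
    T[N] = V_N
    for i in range(N - 1, -1, -1):
        denom = P + B * B * T[i + 1]
        L[i] = B * T[i + 1] * A / denom
        T[i] = V + A * A * (T[i + 1] - T[i + 1] * B / denom * B * T[i + 1])
    return T, L

T, L = riccati_scalar(A=1.0, B=1.0, V=1.0, P=1.0, V_N=1.0, N=20)
print(round(T[0], 4), round(L[0], 4))  # T_0, L_0 approach the stationary values
```

For these particular values the recursion converges quickly to a stationary solution (here $T \to (1+\sqrt{5})/2$), which is why long-horizon LQ controllers often use a constant gain.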

2.10.3. General case (non-LQ problem)

In the general case, it is impossible to find analytical solutions to the Bellman equations. Thus, it is necessary to resort to approximations:

$$\begin{cases} \boldsymbol{x}_{i+1} = f(\boldsymbol{x}_i, \boldsymbol{u}_i, \boldsymbol{\xi}_i), & i = 0, \dots, N-1 \\ \boldsymbol{x}_0 = \bar{\boldsymbol{x}} \end{cases}$$

$$\min_{\boldsymbol{u}_0, \dots, \boldsymbol{u}_{N-1}} \; \mathop{E}_{\boldsymbol{\xi}_0, \dots, \boldsymbol{\xi}_{N-1}} \left[ \sum_{i=0}^{N-1} h(\boldsymbol{x}_i, \boldsymbol{u}_i, \boldsymbol{\xi}_i) + h_N(\boldsymbol{x}_N) \right]$$

Bellman equations:

$$\begin{cases} J_N^\circ(\boldsymbol{x}_N) = h_N(\boldsymbol{x}_N) \\ J_i^\circ(\boldsymbol{x}_i) = \min_{\boldsymbol{u}_i} \mathop{E}_{\boldsymbol{\xi}_i} \left[ h(\boldsymbol{x}_i, \boldsymbol{u}_i, \boldsymbol{\xi}_i) + J_{i+1}^\circ\big(f(\boldsymbol{x}_i, \boldsymbol{u}_i, \boldsymbol{\xi}_i)\big) \right], & i = N-1, \dots, 0 \end{cases}$$


There are the same issues that were affecting the noise-free case: (i) the need to discretize the state spaces $X_i$, $i = N-1, \dots, 0$, (ii) the need to solve numerically the optimization problems $\min_{\boldsymbol{u}_i}(\cdot)$, and (iii) the need to approximate the cost-to-go functions.

Additionally, we have the following new issues:

1) Due to the noise, it may be difficult to determine the sets $X_i$ ($i = 0, \dots, N$). It is possible to approximate them, for instance by taking "very large sets", growing with $i$ (because the noise effect increases with $i$, since it accumulates the contribution of the old terms as well), within which the state is likely to remain.

2) Due to the noise, starting from $\boldsymbol{x}_i$ and applying $\boldsymbol{u}_i$, it is possible to obtain different $\boldsymbol{x}_{i+1}$ depending on the actual realization of the noise. In other words, we can construct a sort of transition cone to account for this multitude of points. The expectation in the Bellman equations is replaced by an empirical mean over a certain number $S$ of realizations of the noise, i.e.,

$$\tilde{J}_N^\circ(\boldsymbol{x}_N) = h_N(\boldsymbol{x}_N)$$

$$\tilde{J}_i^\circ(\boldsymbol{x}_i) = \min_{\boldsymbol{u}_i} \left\{ \frac{1}{S} \sum_{s=1}^{S} \left[ h\big(\boldsymbol{x}_i, \boldsymbol{u}_i, \boldsymbol{\xi}_i^{(s)}\big) + \tilde{J}_{i+1}^\circ\Big(f\big(\boldsymbol{x}_i, \boldsymbol{u}_i, \boldsymbol{\xi}_i^{(s)}\big)\Big) \right] \right\}, \qquad i = N-1, \dots, 0$$

where $\boldsymbol{\xi}_i^{(s)}$ is the $s$-th realization of the noise ($s = 1, \dots, S$) and $\tilde{J}_i^\circ$ is the approximation of $J_i^\circ$, as in the noise-free case.
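One backward step of this sample-based Bellman equation can be sketched as follows. The scalar system $x_{i+1} = x + u + \xi$, the stage cost $h = x^2 + u^2$, the next-stage cost-to-go $J_{i+1}(x) = x^2$, and the control grid are all illustrative assumptions, not taken from the notes; the point is only to show the expectation replaced by an empirical mean over $S$ noise draws.

```python
# One backward step of the sample-based Bellman equation for a scalar
# system x_{i+1} = x + u + xi with stage cost h = x^2 + u^2 and a known
# next-stage cost-to-go J_{i+1}(x) = x^2 (all illustrative choices).
import random

def bellman_backup(x, controls, noise_samples):
    """Return (J_i(x), best u) via min over u of the sample-average cost."""
    def q(u):
        total = 0.0
        for xi in noise_samples:
            x_next = x + u + xi                 # f(x, u, xi)
            total += x * x + u * u + x_next**2  # h(x, u, xi) + J_{i+1}(x_next)
        return total / len(noise_samples)
    best_u = min(controls, key=q)
    return q(best_u), best_u

random.seed(0)
S = 2000
noise = [random.gauss(0.0, 0.1) for _ in range(S)]   # E(xi) = 0
controls = [u / 10.0 for u in range(-20, 21)]        # discretized u grid
J, u_star = bellman_backup(x=1.0, controls=controls, noise_samples=noise)
print(round(u_star, 1))   # near the analytical minimizer u = -0.5
```

For this quadratic example the exact minimizer is $u = -x/2$, so the sample-based backup can be checked against the analytical answer.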


3. Nonlinear programming

In this part, we will learn how to solve a generic nonlinear programming (NLP) problem. The term programming in this case is a synonym of optimization. We focus on static optimization, i.e., no time is involved.

In more detail, we will investigate how to solve the following problem:

$$\min_{\boldsymbol{x} \in X} f(\boldsymbol{x}) \tag{3.1}$$

where $f: \mathbb{R}^n \to \mathbb{R}$ is a given function called the objective function or cost function, $\boldsymbol{x} = (x_1, \dots, x_n)^\top \in \mathbb{R}^n$ is an $n$-dimensional vector called the decision variable (the unknown of the problem), and $X \subseteq \mathbb{R}^n$ is the feasibility region. The solution of this problem is denoted by:

$$\boldsymbol{x}^* = \operatorname*{argmin}_{\boldsymbol{x} \in X} f(\boldsymbol{x})$$

In the part on Dynamic Programming, we have already faced NLP problems when solving the Bellman equations: each minimization

$$\min_{\boldsymbol{u}_i} \left( h(\boldsymbol{x}_i, \boldsymbol{u}_i) + J_{i+1}^\circ(\boldsymbol{x}_{i+1}) \right)$$

is an NLP problem of the form (3.1).

Trivial example:

$$\min_{x \in \mathbb{R}} x^2, \qquad x^* = \operatorname*{argmin}_x x^2 = 0$$


In general, it is difficult to find exact solutions, and therefore approximations are needed.

All NLP problems can be written in the standard form

$$\min_{\boldsymbol{x} \in X} f(\boldsymbol{x})$$

In fact, in the case of maximization problems, it suffices to change the sign of the objective function, since

$$\max_{\boldsymbol{x} \in X} f(\boldsymbol{x}) = -\min_{\boldsymbol{x} \in X} \left[ -f(\boldsymbol{x}) \right]$$

$$\boldsymbol{x}^* = \operatorname*{argmin}_{\boldsymbol{x}} \left[ -f(\boldsymbol{x}) \right] = \operatorname*{argmax}_{\boldsymbol{x}} \left[ f(\boldsymbol{x}) \right]$$

Definition (global minimum). A point $\boldsymbol{x}^* \in X$ is said to be a (strict) global minimum of $f$ on $X$ if:

$$f(\boldsymbol{x}^*) \leq (<)\; f(\boldsymbol{x}) \qquad \forall \boldsymbol{x} \in X \;(\boldsymbol{x} \neq \boldsymbol{x}^* \text{ in the strict case})$$

Definition (local minimum). A point $\boldsymbol{x}^* \in X$ is said to be a (strict) local minimum of $f$ on $X$ if:

$$f(\boldsymbol{x}^*) \leq (<)\; f(\boldsymbol{x}) \qquad \forall \boldsymbol{x} \in B_r(\boldsymbol{x}^*) \cap X$$

where $B_r(\boldsymbol{x}^*)$ is a ball centered at $\boldsymbol{x}^*$ with radius $r$ (a neighbourhood of $\boldsymbol{x}^*$).


Finding global optimal solutions may be difficult; thus, we are usually satisfied with finding a local one. The algorithms that we will investigate to find approximate solutions are not able to tell whether they are approximating a local or a global solution.

Two main categories of NLP problems exist in the literature, depending on the set $X$:

$$X = \mathbb{R}^n \;\Rightarrow\; \min_{\boldsymbol{x} \in \mathbb{R}^n} f(\boldsymbol{x}) \qquad \text{(unconstrained NLP)}$$

$$X \subset \mathbb{R}^n \;\Rightarrow\; \min_{\boldsymbol{x} \in X} f(\boldsymbol{x}) \qquad \text{(constrained NLP)}$$

Usually, in the case of constrained NLP, $X = \{\boldsymbol{x} \in \mathbb{R}^n : h(\boldsymbol{x}) = 0,\; g(\boldsymbol{x}) \leq 0\}$, i.e.,

$$\begin{cases} \min_{\boldsymbol{x} \in \mathbb{R}^n} f(\boldsymbol{x}) \\ h(\boldsymbol{x}) = \boldsymbol{0} \\ g(\boldsymbol{x}) \leq \boldsymbol{0} \end{cases}$$

where

• $f: \mathbb{R}^n \to \mathbb{R}$ is the objective function;
• $h: \mathbb{R}^n \to \mathbb{R}^m$ collects $m$ equality constraints;
• $g: \mathbb{R}^n \to \mathbb{R}^p$ collects $p$ inequality constraints.

If $f$, $g$, $h$ are all linear with respect to $\boldsymbol{x}$, the problem is a linear programming problem. If one or more among $f$, $g$, $h$ are nonlinear, the problem is an NLP problem. No unconstrained case exists for linear programming, since the solution would be unbounded (equal to $\pm\infty$).


3.1. Example: localization problem

Let us consider the following problem.

An oil company buys raw materials in three cities ($A$, $B$, $C$). $B$ is located 100 km east and 200 km north of $A$, while $C$ is located 300 km east and 100 km south of $B$. Compute the optimal position of a refinery in order to minimize the total length of the tubes connecting it with the three cities. Due to environmental restrictions, it is not possible to build the refinery within a circle of radius 200 km centered at $A$, nor south of $C$.

In order to write a mathematical formulation of this problem, it is necessary to identify three main ingredients:

1) Decision variables (unknowns) 𝒙. 2) Objective function (the goal) 𝑓. 3) Constraints (limitations on the decision variables) 𝑋.

The decision variables are, in this case, the coordinates of the refinery: $x \in \mathbb{R}$, $y \in \mathbb{R}$.

Denote by $R = (x, y)$ the position of the refinery. The objective function is:

$$\min_{x,y} \left[ \sqrt{x^2 + y^2} + \sqrt{(x-100)^2 + (y-200)^2} + \sqrt{(x-400)^2 + (y-100)^2} \right] = \min_{x,y} \left( \overline{AR} + \overline{BR} + \overline{CR} \right)$$

The constraints are given by:

$$x^2 + y^2 \geq 200^2, \qquad y \geq 100$$


Thus, the problem can be formalized as follows:

$$\begin{cases} \min_{x,y} \left[ \sqrt{x^2 + y^2} + \sqrt{(x-100)^2 + (y-200)^2} + \sqrt{(x-400)^2 + (y-100)^2} \right] \\ x^2 + y^2 \geq 200^2 \\ y \geq 100 \end{cases}$$

As said, the standard form of a constrained NLP problem is the following:

$$\begin{cases} \min_{\boldsymbol{x} \in \mathbb{R}^n} f(\boldsymbol{x}) \\ h(\boldsymbol{x}) = \boldsymbol{0} \\ g(\boldsymbol{x}) \leq \boldsymbol{0} \end{cases}$$

In this example, we have $\boldsymbol{x} = (x, y)^\top$, $n = 2$. The cost is $f(\boldsymbol{x}) = \sqrt{x^2 + y^2} + \sqrt{(x-100)^2 + (y-200)^2} + \sqrt{(x-400)^2 + (y-100)^2}$. The function $h(\boldsymbol{x})$ is not defined, since there are no equality constraints. The function $g(\boldsymbol{x})$ is given by:

$$g_1(\boldsymbol{x}) = 200^2 - x^2 - y^2$$

$$g_2(\boldsymbol{x}) = 100 - y$$

In this example, both $f$ and $g$ (through $g_1$) are nonlinear.
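The model above can be explored numerically with a crude grid search over feasible points. This is only an illustration of the formulation, not a solution method from the notes; the grid resolution and search box are arbitrary assumptions.

```python
# Crude grid search for the refinery problem: evaluate the total tube
# length f(x, y) at feasible grid points (x^2 + y^2 >= 200^2, y >= 100)
# and keep the best one.
from math import hypot

def f(x, y):
    """Total tube length to A=(0,0), B=(100,200), C=(400,100)."""
    return hypot(x, y) + hypot(x - 100, y - 200) + hypot(x - 400, y - 100)

def feasible(x, y):
    """g_1 <= 0 and g_2 <= 0, i.e. outside the circle and north of C."""
    return x * x + y * y >= 200 ** 2 and y >= 100

best = min(
    ((x, y) for x in range(0, 401, 5) for y in range(100, 301, 5)
     if feasible(x, y)),
    key=lambda p: f(*p),
)
print(best, round(f(*best), 1))
```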

3.2. Unconstrained nonlinear programming

Let us start with the case of unconstrained NLP. Later on, we will focus on the constrained case.

3.2.1. Optimality conditions

Consider the generic unconstrained NLP problem:

$$\min_{\boldsymbol{x} \in \mathbb{R}^n} f(\boldsymbol{x}) \tag{3.2}$$

From now on, the focus will be on functions $f \in C^2$ (continuous with continuous derivatives up to the second order).

Necessary optimality conditions. A necessary condition for $\boldsymbol{x}^*$ to be a local optimal solution of problem (3.2) is that:

$$\begin{cases} \nabla f(\boldsymbol{x}^*) = 0 \\ Hf(\boldsymbol{x}^*) \geq 0 \end{cases}$$

where


$$\nabla f(\boldsymbol{x}) = \begin{pmatrix} \dfrac{\partial f}{\partial x_1} \\ \vdots \\ \dfrac{\partial f}{\partial x_n} \end{pmatrix}, \qquad Hf(\boldsymbol{x}) = \begin{pmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial^2 f}{\partial x_n \partial x_1} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2} \end{pmatrix}$$

The matrix $Hf(\boldsymbol{x})$ is symmetric owing to the Schwarz theorem.

Sufficient optimality conditions. A sufficient condition for $\boldsymbol{x}^*$ to be a local optimal solution of problem (3.2) is that:

$$\begin{cases} \nabla f(\boldsymbol{x}^*) = 0 \\ Hf(\boldsymbol{x}^*) > 0 \end{cases}$$

Optimality conditions can be used to solve an unconstrained NLP problem. We first have to compute the gradient of $f(\boldsymbol{x})$ and solve the system of equations $\nabla f(\boldsymbol{x}) = 0$. Then, for each solution of the system, we have to compute $Hf(\boldsymbol{x})$ and check its positive definiteness.

However, especially for large values of 𝑛, the use of optimality conditions to find a solution is not recommended since we have to solve a system of 𝑛 equations in 𝑛 unknowns, which is in general very difficult. We will investigate alternative techniques able to find approximate solutions with reduced computational effort, the so-called descent methods.

Let us consider a simple two-dimensional example: find the minima of $x_1^4 + x_2^4 - 3x_1x_2$, i.e., solve

$$\min_{\boldsymbol{x} \in \mathbb{R}^2} \left( x_1^4 + x_2^4 - 3x_1x_2 \right)$$

$$\nabla f(\boldsymbol{x}) = \begin{pmatrix} 4x_1^3 - 3x_2 \\ 4x_2^3 - 3x_1 \end{pmatrix}; \qquad Hf(\boldsymbol{x}) = \begin{pmatrix} 12x_1^2 & -3 \\ -3 & 12x_2^2 \end{pmatrix}$$

By solving the system $\nabla f(\boldsymbol{x}) = 0$, we obtain:

$$\boldsymbol{x}_A^* = \begin{pmatrix} 0 \\ 0 \end{pmatrix}; \qquad \boldsymbol{x}_B^* = \begin{pmatrix} \sqrt{3}/2 \\ \sqrt{3}/2 \end{pmatrix}; \qquad \boldsymbol{x}_C^* = \begin{pmatrix} -\sqrt{3}/2 \\ -\sqrt{3}/2 \end{pmatrix}$$

Plot of the solutions:


In the previous figure, we have plotted the level sets of $f$. Over each curve, $f$ always takes on the same value (as if cutting the surface with horizontal planes). We will always use level sets to represent 2D functions.

$Hf(\boldsymbol{x})\big|_A = \begin{pmatrix} 0 & -3 \\ -3 & 0 \end{pmatrix} \Rightarrow \lambda_{1,2} = \pm 3$: the Hessian at $A$ is indefinite, so $A$ is a saddle point (neither a minimum nor a maximum).

$Hf(\boldsymbol{x})\big|_B = Hf(\boldsymbol{x})\big|_C = \begin{pmatrix} 9 & -3 \\ -3 & 9 \end{pmatrix} \Rightarrow \lambda_1 = 6 > 0, \; \lambda_2 = 12 > 0$: $Hf(\boldsymbol{x})\big|_B$ and $Hf(\boldsymbol{x})\big|_C$ are positive definite.

Using the sufficient optimality conditions, we conclude that 𝐵, 𝐶 are local minima for 𝑓.
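The conclusions above can be checked numerically: the gradient vanishes at the three candidate points, and the $2\times 2$ Hessian eigenvalues (computed via the closed form for symmetric $2\times 2$ matrices) classify them.

```python
# Numerical check of the stationary points of f(x1, x2) = x1^4 + x2^4 - 3*x1*x2.
from math import sqrt

def grad(x1, x2):
    return (4 * x1**3 - 3 * x2, 4 * x2**3 - 3 * x1)

def hessian_eigs(x1, x2):
    """Eigenvalues of [[12*x1^2, -3], [-3, 12*x2^2]] via the 2x2 closed form."""
    a, b, d = 12 * x1**2, -3.0, 12 * x2**2
    m = (a + d) / 2
    delta = sqrt(((a - d) / 2) ** 2 + b * b)
    return m - delta, m + delta

s = sqrt(3) / 2
for point in [(0.0, 0.0), (s, s), (-s, -s)]:
    g = grad(*point)
    lo, hi = hessian_eigs(*point)
    print(tuple(round(v, 9) for v in g), round(lo, 6), round(hi, 6))
# At A the eigenvalues are -3 and 3 (saddle point); at B and C they are
# 6 and 12 (local minima), as the sufficient optimality conditions require.
```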

3.2.2. Descent methods

Descent methods are approximate, iterative methods to find local solutions to unconstrained NLP problems. The basic idea is to construct a sequence of approximations of a certain local optimal solution 𝒙∗ starting from a given initial approximation.

Let $\boldsymbol{x}_0$ be the initial approximation of $\boldsymbol{x}^*$. We construct a minimizing sequence $\boldsymbol{x}_0 \to \boldsymbol{x}_1 \to \boldsymbol{x}_2 \to \cdots = \{\boldsymbol{x}_k\}_{k=0}^{\infty}$. The algorithm is good if the sequence of approximations $\{\boldsymbol{x}_k\}_{k=0}^{\infty}$ converges to the unknown $\boldsymbol{x}^*$. In the 2D case:


If an infinite number of steps is needed to reach $\boldsymbol{x}^*$, we have the so-called asymptotic convergence. If $\boldsymbol{x}_k \equiv \boldsymbol{x}^*$ for some finite $k$, the algorithm is said to exhibit finite convergence. In the first case, stopping criteria will be introduced to avoid an infinite number of iterations.

We say that convergence is local if the limit point of the sequence $\{\boldsymbol{x}_k\}$ depends on the initial approximation $\boldsymbol{x}_0$ (different $\boldsymbol{x}_0$ will lead to the approximation of different local solutions). The convergence is global if the point of convergence does not depend on $\boldsymbol{x}_0$ (different $\boldsymbol{x}_0$ will lead to the approximation of the same local solution). For example, in 1D the following picture shows the case of local convergence to two different solutions depending on the initial approximation $\boldsymbol{x}_0$:

The general form of a descent method is the following:

$$\boldsymbol{x}_{k+1} = \boldsymbol{x}_k + \alpha_k \boldsymbol{d}_k, \qquad k = 0, 1, \dots$$

where

• $\boldsymbol{x}_k \in \mathbb{R}^n$ is the approximation of $\boldsymbol{x}^*$ at iteration $k$;
• $\alpha_k \in \mathbb{R}_+$ is the descent step at iteration $k$;
• $\boldsymbol{d}_k \in \mathbb{R}^n$ is the descent direction at iteration $k$.

Descent methods are constructed in order to guarantee that the approximation of 𝒙∗ improves iteration after iteration, i.e.,

$$f(\boldsymbol{x}_{k+1}) \leq f(\boldsymbol{x}_k) \qquad \forall k = 0, 1, \dots$$


A descent algorithm is made up by the following steps:

1) Initialization: choose 𝒙=, 𝑘 = 0. 2) Stopping criteria: check whether the approximation is satisfactory or not. If yes, the algorithm

stops, otherwise it proceeds to 3). 3) Choice of the step 𝛼c. 4) Choice of the direction 𝒅c (steps 3) and 4) can be inverted if desired). 5) Update the approximation: 𝒙c#$ = 𝒙c + 𝛼c𝒅c, 𝑘 = 𝑘 + 1, go back to 2).

Let us analyze in detail the various steps.

1) Choice of $\boldsymbol{x}_0$. In the absence of a priori information on $f$ or on $\boldsymbol{x}^*$, we choose $\boldsymbol{x}_0$ arbitrarily (for instance, randomly). Unfortunately, different approximations can be obtained starting from different $\boldsymbol{x}_0$.

We say that descent methods may be "trapped" into different local minima depending on the specific choice of $\boldsymbol{x}_0$. We can mitigate this effect through the so-called multistart approach: execute steps 1) to 5) for many different $\boldsymbol{x}_0$, i.e., consider $L$ initial points $\boldsymbol{x}_0^{(1)}, \boldsymbol{x}_0^{(2)}, \dots, \boldsymbol{x}_0^{(L)}$. Let $\boldsymbol{x}^{*(1)}, \boldsymbol{x}^{*(2)}, \dots, \boldsymbol{x}^{*(L)}$ be the corresponding approximations obtained at the end of each run of the algorithm. The best approximation of the true optimal solution $\boldsymbol{x}^*$ is the $\boldsymbol{x}^{*(\ell)}$ such that $f(\boldsymbol{x}^{*(\ell)})$ is minimum.

2) Stopping criteria. We introduce three different criteria.
a) The first one uses the optimality conditions: ideally, the algorithm stops when $\nabla f(\boldsymbol{x}_k) = \boldsymbol{0}$; in practice, we stop when $\|\nabla f(\boldsymbol{x}_k)\| \leq \varepsilon_1$ (in order to make the criterion feasible in software implementations), where $\|\cdot\|$ is the Euclidean norm and $\varepsilon_1$ is a given tolerance (small, depending on the problem at hand).


b) The algorithm stops when $\|\boldsymbol{x}_{k+1} - \boldsymbol{x}_k\| \leq \varepsilon_2$.
c) The algorithm stops when $|f(\boldsymbol{x}_{k+1}) - f(\boldsymbol{x}_k)| \leq \varepsilon_3$.

Example in 1D: Consider a very “narrow” function:

In this case, it is more convenient to use the criterion c). Now, consider a very flat function:

In this case, it is more convenient to use b). To be "conservative", we may decide to stop only when all the criteria a), b), c) are satisfied (using them in a logical "AND"). Another possibility is to stop as soon as just one of them is satisfied (logical "OR").

3) Choice of the descent step $\alpha_k$. Three main possibilities exist.
a) Choose a constant descent step $\alpha_k = \alpha$ $\forall k$ (only one choice to make; it works fine if we choose a small $\alpha$, but it may not be optimal).

With this choice, we may experience slow convergence if $\alpha$ is too small (a large number of iterations may be needed before the stop).


On the contrary, if 𝛼 is too large, we may have slow convergence due to “zig-zag” phenomenon. In some cases, a too large 𝛼 can lead to divergence.

b) Choose an $\alpha_k$ decreasing with $k$: for instance, $\alpha_k = \dfrac{c_1}{c_2 + k}$, where $c_1$ and $c_2$ are constant coefficients. When we are far from $\boldsymbol{x}^*$ we move by a large amount in order to arrive quickly in a neighborhood of $\boldsymbol{x}^*$; then $\alpha_k$ decreases in order to mitigate the "zig-zag" phenomenon.

However, how should $c_1$ and $c_2$ be chosen? A trial-and-error procedure is probably the best option.

c) Choice of an optimal step: $\alpha_k = \operatorname*{argmin}_{\alpha \geq 0} f(\boldsymbol{x}_k + \alpha \boldsymbol{d}_k)$. This is a 1D optimization problem for which there exist many different solution techniques in the literature, under the name of line-search methods. To mitigate the computational effort, there are heuristic approaches to select $\alpha_k$; the most famous is the so-called Armijo rule (we will not see the details).

4) Choice of the direction $\boldsymbol{d}_k$. There are again three different possibilities:
• First-order methods: $\boldsymbol{d}_k$ is computed using information on $\nabla f(\boldsymbol{x}_k)$ (gradient method, conjugate gradient method, etc.).
• Second-order methods: $\boldsymbol{d}_k$ is computed using information on both $\nabla f(\boldsymbol{x}_k)$ and $Hf(\boldsymbol{x}_k)$ (Newton method).
• Non-derivative (or zero-order) methods: $\boldsymbol{d}_k$ is computed without using derivatives of $f(\boldsymbol{x}_k)$ (Powell method).

In order to have a descent method, i.e., $f(\boldsymbol{x}_{k+1}) \leq f(\boldsymbol{x}_k)$, the direction $\boldsymbol{d}_k$ must satisfy certain conditions:


Consider the Taylor expansion of $f$ centered at $\boldsymbol{x}_k$:

$$f(\boldsymbol{x}_{k+1}) = f(\boldsymbol{x}_k) + \nabla f(\boldsymbol{x}_k)^\top (\boldsymbol{x}_{k+1} - \boldsymbol{x}_k) + \cdots$$

Since $\boldsymbol{x}_{k+1} - \boldsymbol{x}_k$ corresponds to $\alpha_k \boldsymbol{d}_k$, we obtain (to first order):

$$f(\boldsymbol{x}_{k+1}) - f(\boldsymbol{x}_k) = \nabla f(\boldsymbol{x}_k)^\top \alpha_k \boldsymbol{d}_k = \alpha_k \nabla f(\boldsymbol{x}_k)^\top \boldsymbol{d}_k$$

The left-hand side must be $\leq 0$ for a descent method, and $\alpha_k$ is in general positive; hence, we have a descent method if the following descent condition on the direction holds:

$$\nabla f(\boldsymbol{x}_k)^\top \boldsymbol{d}_k \leq 0, \qquad \alpha_k > 0$$

Descent directions are those forming an angle greater than or equal to 90° with the gradient $\nabla f(\boldsymbol{x}_k)$; the other ones are non-descent directions.

5) Update of $\boldsymbol{x}_k$. We simply apply the formula

$$\boldsymbol{x}_{k+1} = \boldsymbol{x}_k + \alpha_k \boldsymbol{d}_k$$
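The multistart approach from step 1) can be sketched with a 1D function having two local minima. The test function $f(x) = (x^2 - 1)^2 + 0.3x$, the starting points, and the step/iteration settings are illustrative choices, not taken from the notes.

```python
# Multistart gradient descent on f(x) = (x^2 - 1)^2 + 0.3*x: different
# starting points x0 are attracted to different local minima (near -1 and
# near +1), and the multistart approach keeps the candidate with lowest f.

def grad_descent(x0, alpha=0.01, iters=2000):
    x = x0
    for _ in range(iters):
        x -= alpha * (4 * x * (x * x - 1) + 0.3)   # x <- x - alpha * f'(x)
    return x

f = lambda x: (x * x - 1) ** 2 + 0.3 * x
starts = [-2.0, -0.5, 0.5, 2.0]                    # L = 4 initial points
candidates = [grad_descent(x0) for x0 in starts]
best = min(candidates, key=f)
print(round(best, 3))   # the minimum near x = -1 beats the one near x = +1
```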

3.2.3. Gradient method

It is the simplest (and most famous) descent method:


$$\boldsymbol{x}_{k+1} = \boldsymbol{x}_k + \alpha_k \boldsymbol{d}_k, \qquad k = 0, 1, \dots, \qquad \boldsymbol{d}_k = -\nabla f(\boldsymbol{x}_k)$$

i.e.,

$$\boldsymbol{x}_{k+1} = \boldsymbol{x}_k - \alpha_k \nabla f(\boldsymbol{x}_k), \qquad k = 0, 1, \dots$$

The direction $\boldsymbol{d}_k = -\nabla f(\boldsymbol{x}_k)$ is a descent direction. In fact, we have:

$$\nabla f(\boldsymbol{x}_k)^\top \boldsymbol{d}_k = \nabla f(\boldsymbol{x}_k)^\top \left[ -\nabla f(\boldsymbol{x}_k) \right] = -\|\nabla f(\boldsymbol{x}_k)\|^2 \leq 0 \qquad \forall \boldsymbol{x}_k, \; k = 0, 1, \dots$$

The gradient method is also called the steepest descent method, since the direction $-\nabla f(\boldsymbol{x}_k)$ is the steepest one. Under certain conditions on $\alpha_k$, we can prove that the method converges to a local optimal solution, i.e., $\boldsymbol{x}_k \to \boldsymbol{x}^*$ as $k \to \infty$:

$$\lim_{k \to \infty} \|\boldsymbol{x}_k - \boldsymbol{x}^*\| = 0$$

If we choose a constant descent step $\alpha_k = \alpha$ $\forall k$, we can prove that convergence to $\boldsymbol{x}^*$ is guaranteed provided we choose $\alpha$ such that:

$$\alpha \in \left(0, \frac{2}{h}\right), \qquad h > 0: \quad \boldsymbol{x}^\top Hf(\boldsymbol{x}) \boldsymbol{x} \leq h \|\boldsymbol{x}\|^2 \quad \forall \boldsymbol{x} \in \mathbb{R}^n$$

In general, this means we have to choose a descent step $\alpha$ sufficiently small (depending on $f$).

Let us consider a simple 1D example:

$$\min_{x \in \mathbb{R}} f(x) = \min_{x \in \mathbb{R}} x^2, \qquad \nabla f(x) = 2x$$


$$x_{k+1} = x_k - \alpha \nabla f(x_k) = x_k - 2\alpha x_k = (1 - 2\alpha)x_k, \qquad k = 0, 1, \dots$$

We have convergence of the sequence $\{x_k\}$ to $0$ if $|1 - 2\alpha| < 1$, i.e., $0 < \alpha < 1$; otherwise, we have divergence.

          α = 1/4    α = 1/2    α = 3/4    α = 2
  x_0     10         10         10         10
  x_1     5          0          -5         -30
  x_2     5/2        0          5/2        90
  x_3     5/4        0          -5/4       -270

First column: asymptotic convergence to $x^*$.

Second column: finite convergence to $x^*$, obtained in one step. Generally, convergence is asymptotic but, if we are lucky, a certain $\alpha$ can provide finite convergence.

Third column: asymptotic convergence to $x^*$, with alternating signs: the convergence differs from that of the first column, showing an oscillatory behavior.

Fourth column: divergence.
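The table can be reproduced in a few lines by iterating the update $x_{k+1} = (1 - 2\alpha)x_k$:

```python
# Gradient method x_{k+1} = x_k - alpha * 2 * x_k for f(x) = x^2,
# reproducing the four columns of the table above.

def gradient_iterates(alpha, x0=10.0, steps=3):
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] - alpha * 2 * xs[-1])   # x - alpha * f'(x)
    return xs

for alpha in (0.25, 0.5, 0.75, 2.0):
    print(alpha, gradient_iterates(alpha))
# 0.25 → [10.0, 5.0, 2.5, 1.25]        (asymptotic convergence)
# 0.5  → [10.0, 0.0, 0.0, 0.0]         (finite convergence)
# 0.75 → [10.0, -5.0, 2.5, -1.25]      (oscillatory convergence)
# 2.0  → [10.0, -30.0, 90.0, -270.0]   (divergence)
```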

3.2.4. Convergence rate

Let us consider now the convergence rate, a quantity that gives us information on how fast we have convergence to the optimal solution in terms of number of iterations needed to find a satisfactory approximation. The true optimal solution 𝒙∗ cannot be computed, since it is the unknown, however it is possible to estimate it. There are three cases:

1) Linear convergence rate:

$$\frac{\|\boldsymbol{x}_{k+1} - \boldsymbol{x}^*\|}{\|\boldsymbol{x}_k - \boldsymbol{x}^*\|} \leq \beta, \qquad \beta \in (0, 1)$$

2) Superlinear ("more than linear") convergence rate:


$$\lim_{k \to \infty} \frac{\|\boldsymbol{x}_{k+1} - \boldsymbol{x}^*\|}{\|\boldsymbol{x}_k - \boldsymbol{x}^*\|} = 0$$

3) Quadratic convergence rate:

$$\frac{\|\boldsymbol{x}_{k+1} - \boldsymbol{x}^*\|}{\|\boldsymbol{x}_k - \boldsymbol{x}^*\|^2} \leq \beta$$

It is possible to prove that the gradient method has a linear convergence rate. For particular functions $f$, the algorithm may require a large number of iterations to obtain a satisfactory approximation (convergence may be slow). However, each iteration requires little computation (only the gradient of $f$).
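The linear rate is easy to observe empirically for $f(x) = x^2$, where $x^* = 0$ and the error ratio of the gradient method with constant step is exactly $\beta = |1 - 2\alpha|$:

```python
# Empirical convergence-rate check for the gradient method on f(x) = x^2
# with constant step alpha: since x* = 0, the error at iteration k is |x_k|,
# and the ratio |x_{k+1}| / |x_k| equals the constant |1 - 2*alpha|.

def error_ratios(alpha, x0=10.0, steps=5):
    x, ratios = x0, []
    for _ in range(steps):
        x_next = x - alpha * 2 * x            # gradient step
        ratios.append(abs(x_next) / abs(x))   # error ratio at this iteration
        x = x_next
    return ratios

print(error_ratios(0.25))   # → [0.5, 0.5, 0.5, 0.5, 0.5], linear with beta = 0.5
```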

3.2.5. Newton method

The Newton algorithm is not, strictly speaking, a descent method, since it is not always true that $f(\boldsymbol{x}_{k+1}) \leq f(\boldsymbol{x}_k)$. However, it can be formalized in the same way:

$$\boldsymbol{x}_{k+1} = \boldsymbol{x}_k + \alpha_k \boldsymbol{d}_k, \qquad k = 0, 1, \dots$$

$$\alpha_k = 1 \;\forall k; \qquad \boldsymbol{d}_k = -\left[ Hf(\boldsymbol{x}_k) \right]^{-1} \nabla f(\boldsymbol{x}_k)$$

$$\boldsymbol{x}_{k+1} = \boldsymbol{x}_k - \left[ Hf(\boldsymbol{x}_k) \right]^{-1} \nabla f(\boldsymbol{x}_k) \tag{3.3}$$

Differently from the gradient method, at each iteration we have to compute $Hf(\boldsymbol{x}_k)$ and its inverse; hence, the computational complexity increases. However, convergence is quadratic: in general, a smaller number of iterations is required to obtain a satisfactory approximation compared to the gradient method. There is a trade-off between few expensive iterations (Newton) and many cheap iterations (gradient). At each iteration, the idea of the Newton method is to compute a quadratic approximation of $f$ and take the minimum of this quadratic approximation as $\boldsymbol{x}_{k+1}$ (such a minimum can be found analytically).

Let $q(\boldsymbol{x})$ be the quadratic approximation of $f$ around $\boldsymbol{x}_k$:

$$q(\boldsymbol{x}) = f(\boldsymbol{x}_k) + \nabla f(\boldsymbol{x}_k)^\top (\boldsymbol{x} - \boldsymbol{x}_k) + \frac{1}{2} (\boldsymbol{x} - \boldsymbol{x}_k)^\top Hf(\boldsymbol{x}_k) (\boldsymbol{x} - \boldsymbol{x}_k)$$

The minimum of $q(\boldsymbol{x})$ is computed by solving $\nabla q(\boldsymbol{x}) = 0$, i.e.,


$$\nabla q(\boldsymbol{x}) = \nabla f(\boldsymbol{x}_k) + Hf(\boldsymbol{x}_k)(\boldsymbol{x} - \boldsymbol{x}_k) = 0$$

$$Hf(\boldsymbol{x}_k)(\boldsymbol{x} - \boldsymbol{x}_k) = -\nabla f(\boldsymbol{x}_k)$$

$$\boldsymbol{x} = \boldsymbol{x}_k - \left[ Hf(\boldsymbol{x}_k) \right]^{-1} \nabla f(\boldsymbol{x}_k)$$

The minimum of $q(\boldsymbol{x})$ is taken as $\boldsymbol{x}_{k+1}$; hence, we obtain formula (3.3).

A suitable convergence theorem can be stated for the Newton method. Assume that the following conditions hold:

• $\exists \boldsymbol{x}^* \in \mathbb{R}^n : \nabla f(\boldsymbol{x}^*) = 0$;
• $Hf(\boldsymbol{x})$ is always non-singular (we can compute its inverse);
• $\exists L > 0 : \|Hf(\boldsymbol{x}) - Hf(\boldsymbol{y})\| \leq L \|\boldsymbol{x} - \boldsymbol{y}\|$ (Lipschitz condition for $Hf(\boldsymbol{x})$);
• $\exists M > 0 : \|Hf(\boldsymbol{x})^{-1}\| \leq M$.

If these conditions are true, and if we choose an initial approximation $\boldsymbol{x}_0$ sufficiently near $\boldsymbol{x}^*$, then we can prove that:

$$\|\boldsymbol{x}_{k+1} - \boldsymbol{x}^*\| \leq \frac{ML}{2} \|\boldsymbol{x}_k - \boldsymbol{x}^*\|^2, \qquad k = 0, 1, \dots$$

When we have convergence, it is quadratic. However, we have fast asymptotic convergence only if we start near $\boldsymbol{x}^*$ (in general, few iterations are needed). If $f$ is quadratic, we have finite convergence in one step. If we start far from $\boldsymbol{x}^*$, we may have divergence. Thus, the choice of $\boldsymbol{x}_0$ may be an issue, since $\boldsymbol{x}^*$ is unknown.

There exists also a Newton method to compute the solutions of a nonlinear equation $g(\boldsymbol{x}) = 0$. The iterative formula is the following:

$$\boldsymbol{x}_{k+1} = \boldsymbol{x}_k - \left[\nabla g(\boldsymbol{x}_k)\right]^{-1} g(\boldsymbol{x}_k)$$

where $\boldsymbol{x}_k$ is the $k$-th approximation of a zero of $g(\boldsymbol{x})$. Sometimes it is called the tangent method.

The Newton method for NLP is just the Newton method for finding the zeroes of the function $g(\boldsymbol{x}) = \nabla f(\boldsymbol{x})$. Hence, we look for the solutions of the equation $\nabla f(\boldsymbol{x}) = 0$ (the stationary points of $f(\boldsymbol{x})$). For


this reason, the Newton method may converge to a maximum point instead of a minimum: we are sure to converge to a minimum only if 𝐻𝑓(𝒙) > 0.

Let us consider the following 1D example:

$$\min_{x \in \mathbb{R}} \; \left(e^x - x\right)$$

We have:

$$\nabla f(x) = e^x - 1; \quad Hf(x) = e^x > 0 \; \forall x \in \mathbb{R}$$

$$x_{k+1} = x_k - \left[Hf(x_k)\right]^{-1} \nabla f(x_k) = x_k - e^{-x_k}\left(e^{x_k} - 1\right) = x_k - 1 + e^{-x_k}, \quad k = 0, 1, \dots$$

Let $x_0 = -1$ be randomly chosen. We have:

$$x_1 = -1 - 1 + e^{1} \cong 0.7$$

$$x_2 = 0.7 - 1 + e^{-0.7} \cong 0.2$$

$$x_3 = 0.2 - 1 + e^{-0.2} \cong 0.02$$

Thus, we have convergence to 0. We can conclude that for this example, $x^*$ is equal to zero.
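The iteration above is easy to reproduce numerically. The following minimal sketch (not part of the original notes; the function name is illustrative) implements the pure Newton iteration for this 1D example:

```python
import math

def newton_1d(grad, hess, x0, iters=20):
    """Pure Newton iteration: x_{k+1} = x_k - grad(x_k)/hess(x_k)."""
    x = x0
    for _ in range(iters):
        x = x - grad(x) / hess(x)
    return x

# f(x) = exp(x) - x:  grad f = exp(x) - 1,  Hf = exp(x) > 0 for all x
x_star = newton_1d(lambda x: math.exp(x) - 1.0, math.exp, -1.0)
```

Starting from $x_0 = -1$, the first iterate is $x_1 = e - 2 \cong 0.7$, as computed above, and the sequence quickly approaches $x^* = 0$.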

Consider another 1D example:

$$\min_{x \in \mathbb{R}} \; \left(x \arctan x - \frac{1}{2}\ln\left(1 + x^2\right)\right)$$

We have:

$$\nabla f(x) = \arctan x; \quad Hf(x) = \frac{1}{1 + x^2} > 0$$


If $x_0$ is near $x^*$ we have convergence; otherwise, we have divergence.

The main difference between the two examples is the change of concavity: the second derivative of $\nabla f(x)$ changes its sign in the second example, while it does not in the first one.
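The sensitivity to the starting point can be checked numerically. A minimal sketch (illustrative names, not from the notes), applying the Newton step $x_{k+1} = x_k - (1 + x_k^2)\arctan x_k$ obtained from the gradient and Hessian above:

```python
import math

def newton_atan(x0, iters=8):
    """Newton step for grad f(x) = atan(x), Hf(x) = 1/(1+x^2):
    x_{k+1} = x_k - (1 + x_k^2) * atan(x_k)."""
    x = x0
    for _ in range(iters):
        x = x - (1.0 + x * x) * math.atan(x)
    return x

near = newton_atan(1.0)   # starts near x* = 0: the iterates converge
far = newton_atan(2.0)    # starts too far: the iterates blow up
```

The threshold between the two behaviors lies somewhere between the two starting points: from $x_0 = 1$ the iterates shrink toward zero, from $x_0 = 2$ they grow without bound.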

There exist methods to enhance the convergence properties of the Newton method. The simplest possibility is to consider, at each iteration, $\left[Hf(\boldsymbol{x}_k) + s_k I\right]^{-1}$ instead of $\left[Hf(\boldsymbol{x}_k)\right]^{-1}$, where $I$ is the identity matrix of size $n \times n$ and $s_k > 0$ is a coefficient. The goal of this change is to guarantee the positive definiteness of the term between brackets. Clearly, the quadratic convergence properties are lost. Other techniques, usually referred to as quasi-Newton methods, try to reduce the effort required by the Newton method to compute the various iterations.

Some examples are:

• Reduced Hessian matrix: use a reduced diagonal Hessian (without considering the mixed derivatives). Once again, quadratic convergence is lost, but the computational effort is reduced. For instance, in the case $n = 2$:

$$Hf(\boldsymbol{x}_k) \cong D(\boldsymbol{x}_k) = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & 0 \\ 0 & \dfrac{\partial^2 f}{\partial x_2^2} \end{bmatrix}$$

• Avoid updating the Hessian matrix for all $k$, i.e., use update rules such as:

$$\boldsymbol{x}_0$$

$$\boldsymbol{x}_1 = \boldsymbol{x}_0 - \left[Hf(\boldsymbol{x}_0)\right]^{-1} \nabla f(\boldsymbol{x}_0)$$

$$\boldsymbol{x}_2 = \boldsymbol{x}_1 - \left[Hf(\boldsymbol{x}_0)\right]^{-1} \nabla f(\boldsymbol{x}_1)$$

$$\boldsymbol{x}_3 = \boldsymbol{x}_2 - \left[Hf(\boldsymbol{x}_0)\right]^{-1} \nabla f(\boldsymbol{x}_2)$$

$$\boldsymbol{x}_4 = \boldsymbol{x}_3 - \left[Hf(\boldsymbol{x}_3)\right]^{-1} \nabla f(\boldsymbol{x}_3)$$

i.e., the Hessian matrix is recomputed only every three iterations.

• Use the same Hessian matrix for all $k$.

We have to trade off between few computationally demanding iterations (pure Newton method) and more, less computationally demanding iterations (quasi-Newton or, in the limit, gradient method).
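The "avoid updating the Hessian at every $k$" idea above can be sketched in 1D as follows (a minimal illustration, not from the notes; names and the refresh period are illustrative):

```python
import math

def quasi_newton_1d(grad, hess, x0, iters=30, refresh=3):
    """Newton-like iteration that recomputes the (1D) Hessian only
    every `refresh` steps and reuses the frozen value in between."""
    x, h = x0, hess(x0)
    for k in range(iters):
        if k % refresh == 0:
            h = hess(x)          # refresh the second derivative
        x = x - grad(x) / h      # Newton step with the frozen Hessian
    return x

# same test problem as before: f(x) = exp(x) - x, minimizer x* = 0
x_qn = quasi_newton_1d(lambda x: math.exp(x) - 1.0, math.exp, -1.0)
```

Compared to the pure Newton iteration, each step is cheaper (no Hessian evaluation on most iterations), at the price of losing quadratic convergence; on this example the iterates still reach $x^* = 0$.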


3.3. Non-derivative methods

Such methods find approximate solutions without using derivatives of the cost function. They require very few computations, even if the convergence properties are not so good. The simplest choice is to use the gradient or Newton methods where $\nabla f(\boldsymbol{x}_k)$ or $Hf(\boldsymbol{x}_k)$ are replaced by the corresponding finite difference approximations, for example (forward first-order finite difference):

$$\frac{\partial f(\boldsymbol{x}_k)}{\partial x_i} \simeq \frac{f(\boldsymbol{x}_k + h\boldsymbol{e}_i) - f(\boldsymbol{x}_k)}{h}$$

where $h > 0$ is a small step and $\boldsymbol{e}_i$ is the $i$-th coordinate unit vector. However, there exist in the literature other non-derivative methods, not based on this simple idea.
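The forward finite-difference approximation of the gradient can be packaged as a small routine (a sketch with illustrative names, not from the notes):

```python
def fd_gradient(f, x, h=1e-6):
    """Forward first-order finite-difference approximation of grad f at x."""
    fx = f(x)
    g = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += h            # perturb the i-th coordinate by h
        g.append((f(xp) - fx) / h)
    return g

# example: f(x, y) = x^2 + 3y  ->  exact gradient (2x, 3)
g = fd_gradient(lambda v: v[0]**2 + 3*v[1], [1.0, 2.0])
```

The approximation error is $O(h)$, so the computed gradient at $(1, 2)$ is close to the exact value $(2, 3)$.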

3.3.1. Coordinate descent method

It has the form of a descent method, where $\boldsymbol{d}_k$ is chosen iteratively equal to a coordinate axis:

$$\boldsymbol{x}_{k+1} = \boldsymbol{x}_k + \alpha_k \boldsymbol{d}_k, \quad k = 0, 1, \dots$$

$$\boldsymbol{d}_0 = \boldsymbol{e}_1; \quad \boldsymbol{d}_1 = \boldsymbol{e}_2; \quad \boldsymbol{d}_2 = \boldsymbol{e}_1; \quad \boldsymbol{d}_3 = \boldsymbol{e}_2; \quad \dots$$

The step $\alpha_k$ is chosen via the so-called line search, that is:

$$\alpha_k = \arg\min_{\alpha} f(\boldsymbol{x}_k + \alpha \boldsymbol{d}_k)$$


where $\alpha$ is no longer constrained to be $> 0$. The convergence properties are worse than those of the gradient method: in general, a higher number of iterations may be needed to obtain a satisfactory approximation, but each iteration is very simple.
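The scheme above can be sketched as follows, with the line search implemented by a simple ternary search over a bounded interval (an illustrative choice, not from the notes; all names are hypothetical):

```python
def line_search(phi, lo=-10.0, hi=10.0, iters=100):
    """Ternary search for the minimizer of a unimodal 1D function phi."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if phi(m1) < phi(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2.0

def coordinate_descent(f, x0, sweeps=20):
    """Minimize f along one coordinate axis at a time,
    with the step alpha found by line search (any sign allowed)."""
    x = list(x0)
    for _ in range(sweeps):
        for i in range(len(x)):
            def phi(a, i=i):
                xt = list(x)
                xt[i] = x[i] + a
                return f(xt)
            x[i] += line_search(phi)
    return x

# example: f(x, y) = (x - 1)^2 + (y + 2)^2, minimizer (1, -2)
sol = coordinate_descent(lambda v: (v[0] - 1)**2 + (v[1] + 2)**2, [5.0, 5.0])
```

On a separable quadratic, one sweep per coordinate already lands on the minimizer (up to the line-search tolerance).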

3.3.2. Powell method

It is based on the coordinate descent method, but it periodically uses a so-called "acceleration" direction:

$$\boldsymbol{x}_{k+1} = \boldsymbol{x}_k + \alpha_k \boldsymbol{d}_k$$

$$\boldsymbol{d}_0 = \boldsymbol{e}_1; \quad \boldsymbol{d}_1 = \boldsymbol{e}_2; \quad \boldsymbol{d}_2 = \boldsymbol{x}_2 - \boldsymbol{x}_0; \quad \boldsymbol{d}_3 = \boldsymbol{e}_1; \quad \boldsymbol{d}_4 = \boldsymbol{e}_2; \quad \boldsymbol{d}_5 = \boldsymbol{x}_5 - \boldsymbol{x}_3; \quad \dots$$

The quantities $\boldsymbol{d}_0, \boldsymbol{d}_1, \boldsymbol{d}_3, \boldsymbol{d}_4$ are basic coordinate directions, while $\boldsymbol{d}_2, \boldsymbol{d}_5$ are acceleration directions. Convergence to $\boldsymbol{x}^*$ is worse than that of the gradient method, but better than that of the coordinate descent method.


3.3.3. Random search methods

Such methods have nothing in common with descent methods, since they are based on sampling the feasibility set $X$ (in principle, $X$ is equal to $\mathbb{R}^n$ since we are dealing with unconstrained NLP, but it suffices to consider a bounded, large $X$).

The set $X$ has to be chosen "large enough" to contain the portion of $f$ that is interesting for the problem at hand. The idea of these methods is to sample the set $X$. Three main possibilities exist:

• Uniform grid.
• Random sampling.
• Low-discrepancy sampling (see also the part on dynamic programming).

In the previous figure, the orange circle represents the best approximation of the minimum of $f(x)$, because it is the sample point with the lowest value.

Let $N$ be the overall number of sample points $\boldsymbol{x}^{(1)}, \boldsymbol{x}^{(2)}, \dots, \boldsymbol{x}^{(N)}$. We have to compute the values of $f$ at each point, i.e., $f^{(1)} = f\left(\boldsymbol{x}^{(1)}\right)$, $f^{(2)} = f\left(\boldsymbol{x}^{(2)}\right)$, ..., $f^{(N)} = f\left(\boldsymbol{x}^{(N)}\right)$. Then, we collect all these values in a vector $F = \left[f^{(1)}, f^{(2)}, \dots, f^{(N)}\right]$. The best approximation of the solution $\boldsymbol{x}^*$ is the point with the lowest value of $f$ (for instance, if we sort the components of $F$ in ascending order, the approximation is given by the first value of the sorted vector). Clearly, the finer the grid, the better the approximation, but the higher the computational effort. At least in principle, this method is valid also for constrained NLP, where $X$ is given instead of being "large" (arbitrarily chosen).
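The random-sampling variant of the procedure above can be sketched as follows (illustrative names, not from the notes):

```python
import random

def random_search(f, bounds, n=2000, seed=0):
    """Sample X uniformly at random and keep the point with the lowest f."""
    rng = random.Random(seed)
    best_x, best_f = None, float("inf")
    for _ in range(n):
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        fx = f(x)
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f

# example: the minimum of x^2 + y^2 over X = [-1, 1]^2 is 0 at the origin
x_best, f_best = random_search(lambda v: v[0]**2 + v[1]**2, [(-1, 1), (-1, 1)])
```

Keeping the running best avoids storing the whole vector $F$; sorting $F$, as in the text, gives the same answer.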

3.4. Constrained nonlinear programming

Let us consider a generic constrained NLP problem:


$$\begin{cases} \min_{\boldsymbol{x} \in \mathbb{R}^n} f(\boldsymbol{x}), & f: \mathbb{R}^n \to \mathbb{R} \\ h(\boldsymbol{x}) = 0, & h: \mathbb{R}^n \to \mathbb{R}^m \\ g(\boldsymbol{x}) \le 0, & g: \mathbb{R}^n \to \mathbb{R}^p \end{cases} \tag{3.4}$$

where the feasibility region is given by $X = \{\boldsymbol{x} \in \mathbb{R}^n : h(\boldsymbol{x}) = 0, \, g(\boldsymbol{x}) \le 0\}$.

In the previous figure, the optimal solution is not the center of the level sets, as the center does not belong to the feasibility region. The presence of $X$ makes the descent methods not directly applicable in the constrained case (since there is the risk of going outside the feasibility region).

If instead the unconstrained minimum belongs to $X$, the solution of the constrained NLP problem is the same as that of the unconstrained one.

3.4.1. Optimality conditions in the constrained case

We have to introduce the Lagrangian function $L: \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^p \to \mathbb{R}$:

$$L(\boldsymbol{x}, \boldsymbol{\lambda}, \boldsymbol{\mu}) = f(\boldsymbol{x}) + \sum_{i=1}^{m} \lambda_i h_i(\boldsymbol{x}) + \sum_{j=1}^{p} \mu_j g_j(\boldsymbol{x})$$

Roughly speaking, the Lagrangian function is a linear combination of the function $f$ and the constraints. The vectors $\boldsymbol{\lambda} \in \mathbb{R}^m$ and $\boldsymbol{\mu} \in \mathbb{R}^p$ are the coefficients of the combination.

Definition. Given $\boldsymbol{x} \in X$, a constraint $g_j$ is active at $\boldsymbol{x}$ if $g_j(\boldsymbol{x}) = 0$. Otherwise, if $g_j(\boldsymbol{x}) < 0$, it is non-active. By definition, equality constraints are all active.

Definition. Constraints are regular at $\boldsymbol{x} \in X$ if the gradients of the active constraints are linearly independent.


Necessary optimality conditions. Let $\boldsymbol{x}^*$ be a regular point for the constraints. Necessary conditions for $\boldsymbol{x}^*$ to be a local solution of the constrained NLP problem (3.4) are that there exist $\boldsymbol{\lambda}^* \in \mathbb{R}^m$, $\boldsymbol{\mu}^* \in \mathbb{R}^p$ such that:

$$\begin{cases} \nabla_{\boldsymbol{x}} L(\boldsymbol{x}^*, \boldsymbol{\lambda}^*, \boldsymbol{\mu}^*) = \boldsymbol{0} \\ \mu_j^* \ge 0, \quad j = 1, \dots, p \\ \mu_j^* g_j(\boldsymbol{x}^*) = 0, \quad j = 1, \dots, p \end{cases}$$

and

$$\boldsymbol{d}^\top H_{\boldsymbol{x}} L(\boldsymbol{x}^*) \boldsymbol{d} \ge 0 \quad \forall \boldsymbol{d} \in \mathbb{R}^n, \, \boldsymbol{d} \ne 0 \, : \, \begin{cases} \nabla h_i(\boldsymbol{x}^*)^\top \boldsymbol{d} = 0, \quad i = 1, \dots, m \\ \nabla g_j(\boldsymbol{x}^*)^\top \boldsymbol{d} = 0 \end{cases}$$

where $g_j$ represents the active constraints. The vectors $\boldsymbol{\lambda}^* \in \mathbb{R}^m$, $\boldsymbol{\mu}^* \in \mathbb{R}^p$ are called Lagrange multipliers. Optimality conditions in the constrained case are sometimes called Karush-Kuhn-Tucker (KKT) conditions.

The Lagrange multipliers for the equality constraints $\boldsymbol{\lambda}^*$ can have any sign, while the Lagrange multipliers for the inequality constraints $\boldsymbol{\mu}^*$ must be greater than or equal to zero. The equation $\mu_j^* g_j(\boldsymbol{x}^*) = 0$ is called the complementary slackness condition. If $g_j(\boldsymbol{x}^*) = 0$ ($g_j$ is active at $\boldsymbol{x}^*$), we may have either $\mu_j^* = 0$ or $\mu_j^* > 0$. If $g_j(\boldsymbol{x}^*) = 0$ and $\mu_j^* > 0$, then $g_j$ is called strictly active at $\boldsymbol{x}^*$.
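As a concrete illustration (an example problem chosen here, not taken from the notes), consider $\min (x_1^2 + x_2^2)$ subject to $x_1 + x_2 = 1$. The candidate $\boldsymbol{x}^* = (1/2, 1/2)$ with multiplier $\lambda^* = -1$ satisfies the first-order KKT conditions, which can be checked directly:

```python
# candidate solution and multiplier for:  min x1^2 + x2^2  s.t.  x1 + x2 = 1
# Lagrangian: L = x1^2 + x2^2 + lam * (x1 + x2 - 1)
x1, x2, lam = 0.5, 0.5, -1.0

grad_L = (2*x1 + lam, 2*x2 + lam)   # gradient of L in x (must vanish)
h = x1 + x2 - 1.0                    # equality constraint (must be zero)
```

There are no inequality constraints here, so the sign and complementary slackness conditions on $\boldsymbol{\mu}^*$ are trivially satisfied; note that $\lambda^* < 0$ is allowed, since equality multipliers can have any sign.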

Sufficient optimality conditions. Sufficient conditions for a given point $\boldsymbol{x}^* \in X$ to be a local solution of the NLP problem (3.4) are that there exist $\boldsymbol{\lambda}^* \in \mathbb{R}^m$, $\boldsymbol{\mu}^* \in \mathbb{R}^p$ such that:

$$\begin{cases} \nabla_{\boldsymbol{x}} L(\boldsymbol{x}^*, \boldsymbol{\lambda}^*, \boldsymbol{\mu}^*) = \boldsymbol{0} \\ \mu_j^* \ge 0, \quad j = 1, \dots, p \\ \mu_j^* g_j(\boldsymbol{x}^*) = 0, \quad j = 1, \dots, p \end{cases}$$

and

$$\boldsymbol{d}^\top H_{\boldsymbol{x}} L(\boldsymbol{x}^*) \boldsymbol{d} > 0 \quad \forall \boldsymbol{d} \in \mathbb{R}^n, \, \boldsymbol{d} \ne 0 \, : \, \begin{cases} \nabla h_i(\boldsymbol{x}^*)^\top \boldsymbol{d} = 0, \quad i = 1, \dots, m \\ \nabla \tilde{g}_j(\boldsymbol{x}^*)^\top \boldsymbol{d} = 0 \end{cases}$$

where $\tilde{g}_j$ represents the strictly active constraints.

As in the unconstrained case, the optimality conditions could be used to solve a constrained NLP problem. It is required to solve a system of $n + m + p$ equations in the $n + m + p$ unknowns $(\boldsymbol{x}, \boldsymbol{\lambda}, \boldsymbol{\mu})$:

$$\begin{cases} \nabla_{\boldsymbol{x}} L(\boldsymbol{x}, \boldsymbol{\lambda}, \boldsymbol{\mu}) = \boldsymbol{0} \\ h_i(\boldsymbol{x}) = 0, \quad i = 1, \dots, m \\ \mu_j g_j(\boldsymbol{x}) = 0, \quad j = 1, \dots, p \end{cases}$$

For all the solutions, we have to check whether the sufficient conditions are satisfied or not. If they are satisfied, we can conclude that $\boldsymbol{x}$ is a local solution of the NLP problem. Unfortunately, as noticed in the unconstrained case, if $n, m, p$ are large, finding a solution of the system is very complex and computationally demanding. Thus, in the following we introduce alternative approximate methods that are not based on the use of the KKT conditions:


• Penalty function method. • Barrier function method.

3.4.2. Penalty function method

The penalty function method is an approximate method to solve a constrained NLP problem of the kind:

$$\begin{cases} \min_{\boldsymbol{x} \in \mathbb{R}^n} f(\boldsymbol{x}) \\ h(\boldsymbol{x}) = 0 \\ g(\boldsymbol{x}) \le 0 \end{cases}$$

where $h(\boldsymbol{x}) = 0$ are $m$ equality constraints, while $g(\boldsymbol{x}) \le 0$ are $p$ inequality constraints. The basic idea is to transfer the constraints into the cost function to minimize. Instead of the previous problem, we solve a sequence of unconstrained problems:

$$\min_{\boldsymbol{x} \in \mathbb{R}^n} \tilde{f}^{(j)}(\boldsymbol{x}), \quad j = 0, 1, \dots$$

where $\tilde{f}^{(j)}(\boldsymbol{x}) = f(\boldsymbol{x}) + k_j P(\boldsymbol{x})$, $k_j$ is a positive constant, and $P: \mathbb{R}^n \to \mathbb{R}$ is a penalty function. A penalty function $P$ must be such that:

• $P \in C^1$;
• $P(\boldsymbol{x}) \ge 0 \; \forall \boldsymbol{x} \in \mathbb{R}^n$;
• $P(\boldsymbol{x}) = 0 \Leftrightarrow \boldsymbol{x} \in X = \{\boldsymbol{x} \in \mathbb{R}^n : h(\boldsymbol{x}) = 0, \, g(\boldsymbol{x}) \le 0\}$.

Some examples:

• In the case of equality constraints:

$$P(\boldsymbol{x}) = \sum_{i=1}^{m} \left[h_i(\boldsymbol{x})\right]^2$$

• In the case of inequality constraints:

$$P(\boldsymbol{x}) = \sum_{j=1}^{p} \left[\max\left(0, g_j(\boldsymbol{x})\right)\right]^2$$

• If we have both types of constraints:

$$P(\boldsymbol{x}) = \sum_{i=1}^{m} \left[h_i(\boldsymbol{x})\right]^2 + \sum_{j=1}^{p} \left[\max\left(0, g_j(\boldsymbol{x})\right)\right]^2$$

The coefficients $k_j$ are chosen as an increasing sequence, i.e.,

$$k_0 < k_1 < k_2 < \cdots$$

We have to solve a sequence of unconstrained problems $\min_{\boldsymbol{x} \in \mathbb{R}^n} \tilde{f}^{(j)}(\boldsymbol{x})$ for increasing values of $k_j$, using methods for unconstrained NLP (gradient, Newton, etc.). Let $\boldsymbol{x}^{*(j)}$ be the solution of the $j$-th problem. It is possible to prove that, given $\boldsymbol{x}^*$ the true solution of the original constrained NLP problem, we have:


$$\boldsymbol{x}^{*(j)} \to \boldsymbol{x}^* \text{ as } k_j \to \infty, \quad \text{i.e.,} \quad \left\|\boldsymbol{x}^{*(j)} - \boldsymbol{x}^*\right\| \xrightarrow{k_j \to \infty} 0$$

We introduce stopping criteria:

$$\left\|\boldsymbol{x}^{*(j+1)} - \boldsymbol{x}^{*(j)}\right\| \le \varepsilon_1$$

$$\left|f\left(\boldsymbol{x}^{*(j+1)}\right) - f\left(\boldsymbol{x}^{*(j)}\right)\right| \le \varepsilon_2$$

where $\varepsilon_1, \varepsilon_2$ are given thresholds. No conditions on the gradient of $f$ are used since, in general, it is not the gradient of $f$ that vanishes at the solution, but the gradient of the Lagrangian function.

Let us consider the following 1D example:

$$\begin{cases} \min_{x \in \mathbb{R}} f(x) \\ a \le x \le b \end{cases} \quad \Rightarrow \quad \begin{cases} g_1(x) = a - x \le 0 \\ g_2(x) = x - b \le 0 \end{cases}$$

We have to introduce a penalty function $P(x)$ for the inequality constraints:

$$P(x) = \left[\max(0, a - x)\right]^2 + \left[\max(0, x - b)\right]^2$$

$$\tilde{f}^{(j)}(x) = f(x) + k_j P(x)$$

In the case depicted in the figure, $x^* = a$. We have smooth junctions between the functions thanks to the properties of the penalty function $P(x)$. As $k_j$ increases, $\tilde{f}^{(j)}$ approximates the constraints (represented by the dashed line) from the outside. Unfortunately, we have no guarantee that the constraints will always be satisfied; the risk of violating them reduces as $k_j$ increases.
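The behavior can be observed on a concrete instance (chosen here for illustration, not from the notes): $f(x) = x^2$ with constraint $1 \le x \le 2$, so $x^* = 1$. Each penalized subproblem is convex, so its minimizer is found by bisection on the monotone derivative:

```python
def penalized_min(k, lo=-10.0, hi=10.0, iters=200):
    """Minimize x^2 + k*P(x), with P(x) = max(0, 1-x)^2 + max(0, x-2)^2,
    by bisection on the (monotone) derivative of the convex penalized cost."""
    def dphi(x):
        return 2*x - 2*k*max(0.0, 1.0 - x) + 2*k*max(0.0, x - 2.0)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if dphi(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# increasing penalty coefficients k_j push the minimizer toward x* = 1
approx = [penalized_min(k) for k in (1.0, 10.0, 100.0, 1000.0)]
```

As the text anticipates, every intermediate minimizer slightly violates the constraint ($x^{*(j)} < 1$): the constraint is approached from the outside, and the violation shrinks as $k_j$ grows (here $x^{*(j)} = k_j/(1 + k_j)$).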


3.4.3. Barrier function method

The barrier function method is an approximate method to solve a constrained NLP problem of the kind:

$$\begin{cases} \min_{\boldsymbol{x} \in \mathbb{R}^n} f(\boldsymbol{x}) \\ g(\boldsymbol{x}) \le 0 \end{cases} \quad \quad X = \{\boldsymbol{x} \in \mathbb{R}^n : g(\boldsymbol{x}) \le 0\}$$

This method can deal only with inequality constraints. In other words, the set $X$ must be robust (we can approach the boundaries of $X$ from the interior of $X$).

Non-robust sets are usually obtained from equality constraints, thus no constraints of the kind $h(\boldsymbol{x}) = 0$ are allowed for this method. The idea is still to transfer the constraints into the cost function to minimize. As before, we solve a sequence of problems, using methods of unconstrained NLP, instead of the original one:

$$\min_{\boldsymbol{x} \in \operatorname{int} X} \tilde{f}^{(j)}(\boldsymbol{x})$$

where $\operatorname{int} X$ is the interior of $X$, i.e., $X \setminus \partial X$. Differently from the penalty function method, the sequence of problems is constrained. However, we can use descent methods to solve them, for instance using a small descent step in order to reduce the risk of going outside $X$. We have:

$$\tilde{f}^{(j)}(\boldsymbol{x}) = f(\boldsymbol{x}) + \varepsilon_j B(\boldsymbol{x})$$

where $\varepsilon_j > 0$ is a constant and $B(\boldsymbol{x})$ is called a barrier function. A barrier function $B$ is such that:

• $B \in C^1$;
• $B(\boldsymbol{x}) \to \infty$ for $\boldsymbol{x} \to \partial X$.

Examples of barrier functions:

• Logarithmic barrier:

$$B(\boldsymbol{x}) = -\sum_{j=1}^{p} \ln\left(-g_j(\boldsymbol{x})\right)$$

• Inverse barrier:

$$B(\boldsymbol{x}) = -\sum_{j=1}^{p} \frac{1}{g_j(\boldsymbol{x})}$$


The coefficients $\varepsilon_j$ have to be chosen as a decreasing sequence:

$$\varepsilon_0 > \varepsilon_1 > \varepsilon_2 > \cdots > 0$$

Let $\boldsymbol{x}^{*(j)}$ be the solution of the problem $\min_{\boldsymbol{x} \in \operatorname{int} X} \tilde{f}^{(j)}(\boldsymbol{x})$, obtained via descent methods. It is possible to prove that:

$$\boldsymbol{x}^{*(j)} \to \boldsymbol{x}^* \text{ for } j \to \infty$$

In other words,

$$\left\|\boldsymbol{x}^{*(j)} - \boldsymbol{x}^*\right\| \xrightarrow{\varepsilon_j \to 0} 0$$

We introduce stopping criteria like in the penalty function method:

$$\left\|\boldsymbol{x}^{*(j+1)} - \boldsymbol{x}^{*(j)}\right\| \le \delta_1$$

$$\left|f\left(\boldsymbol{x}^{*(j+1)}\right) - f\left(\boldsymbol{x}^{*(j)}\right)\right| \le \delta_2$$

where $\delta_1$ and $\delta_2$ are suitable thresholds.

Let us consider again the following 1D example:

$$\begin{cases} \min_{x \in \mathbb{R}} f(x) \\ a \le x \le b \end{cases}$$

In general, $B(x) \to \infty$ as $x \to \partial X$, and $B(x) \ne 0$ if $x \in \operatorname{int} X$. We solve:

$$\min_{x \in \operatorname{int} X} \tilde{f}^{(j)}(x), \quad \tilde{f}^{(j)}(x) = f(x) + \varepsilon_j B(x)$$

As $\varepsilon_j \to 0$, we have $\tilde{f}^{(j)}(x) \cong f(x)$ for $x \in \operatorname{int} X$, and $\tilde{f}^{(j)}(x) \to \infty$ for $x \to \partial X$. We can use unconstrained methods to solve $\min_{x \in \operatorname{int} X} \tilde{f}^{(j)}(x)$ provided we start within $X$; in fact, it is likely that our descent methods will not "climb the walls" of the function $\tilde{f}^{(j)}$ as $x \to \partial X$. In the example we can go


arbitrarily near $x^* = a$ (with $a \in \partial X$) as we take $\varepsilon_j \to 0$. Compared to the penalty function method, here we will never violate the constraints, provided we choose a small descent step $\alpha$. Thus, this method is more robust than the penalty function method, since it approximates the constraints (represented by the dashed lines) from the inside.
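The same illustrative instance used for the penalty method ($f(x) = x^2$, $1 \le x \le 2$, chosen here, not from the notes) can be solved with a logarithmic barrier; each barrier subproblem is strictly convex on $\operatorname{int} X$, so bisection on its derivative finds the minimizer:

```python
def barrier_min(eps, lo=1.0, hi=2.0, iters=200):
    """Minimize x^2 + eps*B(x), with B(x) = -ln(x-1) - ln(2-x), over
    int X = (1, 2), by bisection on the derivative of the barrier cost."""
    def dphi(x):
        # d/dx of x^2 - eps*ln(x-1) - eps*ln(2-x)
        return 2*x - eps / (x - 1.0) + eps / (2.0 - x)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if dphi(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# decreasing eps_j pushes the minimizer toward x* = 1 from inside X
approx = [barrier_min(e) for e in (1.0, 0.1, 0.01, 0.001)]
```

In contrast to the penalty method, every iterate is strictly feasible ($x^{*(j)} > 1$): the constraint is approached from the inside as $\varepsilon_j$ decreases.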


4. Partial differential equations

In this part, we will learn how to solve analytically classical linear partial differential equations (PDEs).

A PDE is an equation where the unknown is a function $\varphi$ that depends on a certain number of variables $x_1, \dots, x_n$. The equation involves partial derivatives of $\varphi$ up to some order $N$. For the sake of simplicity, we will deal only with second-order PDEs like the following:

$$F\left(x_1, \dots, x_n, \varphi, \frac{\partial \varphi}{\partial x_1}, \frac{\partial \varphi}{\partial x_2}, \dots, \frac{\partial \varphi}{\partial x_n}, \frac{\partial^2 \varphi}{\partial x_1^2}, \frac{\partial^2 \varphi}{\partial x_2^2}, \dots, \frac{\partial^2 \varphi}{\partial x_n^2}, \frac{\partial^2 \varphi}{\partial x_1 \partial x_2}, \dots\right) = 0$$

In the following, we will consider only (at most):

• $N = 2$;
• $n = 4$ (at most 4 independent variables $x, y, z, t$ or $x_1, x_2, x_3, t$);
• quasi-linear PDEs (linear in the highest-order derivatives).

Example:

$$\frac{\partial^2 \varphi}{\partial x^2} + x \frac{\partial^2 \varphi}{\partial y^2} + \sin\left(\frac{\partial \varphi}{\partial x} \cdot x\varphi\right) + \left(\frac{\partial \varphi}{\partial y}\right)^3 = 8, \quad N = 2, \; n = 2 \tag{4.1}$$

A PDE is said to be linear if it is linear in all the derivatives. An example is the following:

$$4\frac{\partial^2 \varphi}{\partial x^2} + 3\frac{\partial^2 \varphi}{\partial y^2} + 8\frac{\partial \varphi}{\partial x} + x^2 \frac{\partial \varphi}{\partial y} + 6\varphi = 0$$

A generic quasi-linear PDE can be written as:


$$F = \sum_{i=1}^{3} \sum_{j=1}^{3} A_{ij}\left(x_1, x_2, x_3, \varphi, \frac{\partial \varphi}{\partial x_1}, \frac{\partial \varphi}{\partial x_2}, \frac{\partial \varphi}{\partial x_3}\right) \frac{\partial^2 \varphi}{\partial x_i \partial x_j} + B\left(x_1, x_2, x_3, \varphi, \frac{\partial \varphi}{\partial x_1}, \frac{\partial \varphi}{\partial x_2}, \frac{\partial \varphi}{\partial x_3}\right) = 0$$

with $A_{ij} = A_{ji}$.

In (4.1), we have $A_{11} = 1$, $A_{12} = A_{21} = 0$, $A_{22} = x$, and $B = \sin\left(\frac{\partial \varphi}{\partial x} \cdot x\varphi\right) + \left(\frac{\partial \varphi}{\partial y}\right)^3 - 8$.

PDEs are classified into three different families, depending on $A$:

• Elliptic, if the eigenvalues of $A$ are all nonzero and have the same sign.
• Hyperbolic, if some eigenvalues of $A$ are $> 0$ and some are $< 0$.
• Parabolic, if at least one eigenvalue is $= 0$.

The three families are completely different: different methods are available to solve them. In general, $A$ is a symmetric matrix owing to the Schwarz theorem, and hence it is diagonalizable. Let us introduce a diagonal matrix:

$$\Lambda = R A R^\top$$

where $R$ is a change-of-coordinates matrix, and

$$\Lambda = \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \ddots & 0 \\ 0 & 0 & \lambda_n \end{bmatrix}$$

where the $\lambda_i$ are the eigenvalues of $\Lambda$, which coincide with the eigenvalues of $A$. Hence, a generic quasi-linear PDE can be written as:

$$\lambda_1 \frac{\partial^2 \varphi}{\partial x_1^2} + \cdots + \lambda_n \frac{\partial^2 \varphi}{\partial x_n^2} + B\left(x_1, \dots, x_n, \varphi, \frac{\partial \varphi}{\partial x_1}, \dots, \frac{\partial \varphi}{\partial x_n}\right) = 0$$

As previously pointed out, the PDE is said to be:

• Elliptic, if $\lambda_1, \dots, \lambda_n$ all have the same sign.
• Hyperbolic, if $\lambda_1, \dots, \lambda_n$ have different signs (e.g., some negative, the others positive).
• Parabolic, if at least one among $\lambda_1, \dots, \lambda_n$ is zero.
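The classification rule can be written down directly as a small function of the eigenvalues (a sketch with illustrative names, not from the notes; the examples anticipate the equations of the next sections):

```python
def classify(eigenvalues, tol=1e-12):
    """Classify a second-order PDE from the eigenvalues of its matrix A."""
    if any(abs(l) <= tol for l in eigenvalues):
        return "parabolic"                     # at least one zero eigenvalue
    pos = sum(1 for l in eigenvalues if l > 0)
    if pos == len(eigenvalues) or pos == 0:
        return "elliptic"                      # all eigenvalues of one sign
    return "hyperbolic"                        # mixed signs

laplace = classify([1.0, 1.0, 1.0])            # Laplace equation
wave = classify([1.0, 1.0, -1.0 / 2.0**2])     # wave equation with v = 2
heat = classify([1.0, 1.0, 0.0])               # heat equation
```

Here the eigenvalues are supplied directly, since $A$ is diagonal in these examples; for a general symmetric $A$ they would be computed first.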


4.1. Classification of PDEs

4.1.1. Elliptic equations

Usually, they model static physical phenomena (no time involved). A well-known example is the Laplace equation:

$$\Delta \varphi = 0$$

where $\Delta$ is the Laplacian operator $\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$ (in Cartesian coordinates). Thus, in three coordinates the equation is:

$$\frac{\partial^2 \varphi}{\partial x^2} + \frac{\partial^2 \varphi}{\partial y^2} + \frac{\partial^2 \varphi}{\partial z^2} = 0$$

where $x, y, z$ are space variables. We have:

$$A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad B = 0 \quad \Rightarrow \quad \lambda_{1,2,3} = +1$$

4.1.2. Hyperbolic equations

They model dynamic phenomena propagating with finite speed (e.g., electromagnetic waves). A well-known example is the D'Alembert equation (or wave equation):

$$\frac{\partial^2 \varphi(x, y, t)}{\partial x^2} + \frac{\partial^2 \varphi(x, y, t)}{\partial y^2} - \frac{1}{v^2}\frac{\partial^2 \varphi(x, y, t)}{\partial t^2} = 0$$

where $x, y$ are space variables, $t$ is the time variable, and $v > 0$ is the propagation speed. We have:

$$A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -\frac{1}{v^2} \end{bmatrix}, \quad B = 0 \quad \Rightarrow \quad \lambda_{1,2} = +1 > 0; \; \lambda_3 = -\frac{1}{v^2} < 0$$

4.1.3. Parabolic equations

They model dynamic phenomena that propagate with infinite speed. A well-known example is the diffusion equation (or heat equation):

$$\frac{\partial^2 \varphi(x, y, t)}{\partial x^2} + \frac{\partial^2 \varphi(x, y, t)}{\partial y^2} - \frac{1}{D}\frac{\partial \varphi(x, y, t)}{\partial t} = 0$$

where $x, y$ are space variables, $t$ is the time variable, and $D > 0$ is the diffusion coefficient. We have:

$$A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad B = 0 \quad \Rightarrow \quad \lambda_{1,2} = +1; \; \lambda_3 = 0$$

4.2. Solution of hyperbolic PDEs

4.2.1. Example: from the oscillating string to the wave equation

Consider a string of length 𝐿 > 0 fixed at the boundaries.

D'Alembert studied how to model the oscillations of the string. He modeled the behavior of the string by dividing it into a certain number $N$ of points, connected through springs. The $N$ points have mass $m$, such that $Nm = M$, where $M$ is the mass of the overall string. The $N - 1$ springs have an elastic constant equal to $k$.

For each point $j = 1, \dots, N$, we can write Newton's law:

$$m\ddot{y}_j = F_j$$

where $F_j$ is the total force acting on the $j$-th point. Such force is made up of three terms:

[Figure: the string at rest ("fixed" string) and the string oscillating when hit.]


• Gravity, negligible if $N$ is large because $m \approx 0$.
• Elastic interaction with the point $j - 1$.
• Elastic interaction with the point $j + 1$.

The quantity $y_j$ is the vertical displacement of the point $j$, and $\ddot{y}_j$ is its second-order time derivative (acceleration). For simplicity, only vertical movements are considered. Newton's law can be written as:

$$\begin{cases} m\ddot{y}_j = -mg - k(y_j - y_{j-1}) - k(y_j - y_{j+1}), & j = 2, \dots, N-1 \\ y_1(t) = 0 \\ y_N(t) = 0 \end{cases}$$

where the second and third equations account for the fact that the string is fixed at the boundaries. This is a system of second-order ordinary differential equations (ODEs). It cannot be solved without proper initial conditions (initial conditions for the position and the speed of each point). Let us introduce them below:

$$\begin{cases} m\ddot{y}_j = -mg - k(y_j - y_{j-1}) - k(y_j - y_{j+1}), & j = 2, \dots, N-1 \\ y_1(t) = 0 \\ y_N(t) = 0 \\ y_j(0) = \alpha_j \\ \dot{y}_j(0) = \beta_j \end{cases}$$

Now, consider $N \to \infty$, $m = \frac{M}{N} \to 0$, and $a = $ distance between points $= \frac{L}{N-1} \to 0$. In other words, starting from a discrete model we are moving toward a continuous one. We have:

$$\frac{m}{a} = \frac{M}{N}\frac{N-1}{L} \approx \frac{M}{L} = \mu$$

where $\mu$ is the linear density of the string. Instead of considering discrete points $y_j(t)$, we introduce a continuous function $y(x, t)$. Starting from the discrete equation (neglecting gravity), if we divide the various terms by $a$, we get:

$$\frac{m}{a}\ddot{y}_j = -\frac{k}{a}(y_j - y_{j-1}) - \frac{k}{a}(y_j - y_{j+1}) = ka \cdot \frac{y_{j+1} - 2y_j + y_{j-1}}{a^2}, \quad j = 2, \dots, N-1$$

If $N \to \infty$, we have $a \to 0$, hence the terms $\frac{y_j - y_{j-1}}{a}$ and $\frac{y_{j+1} - y_j}{a}$ are incremental ratios of $y$, i.e., they represent spatial derivatives of $y(x, t)$ with respect to $x$. Thus, the term

$$\frac{\dfrac{y_{j+1} - y_j}{a} - \dfrac{y_j - y_{j-1}}{a}}{a} = \frac{y_{j+1} - 2y_j + y_{j-1}}{a^2}$$

is a second-order derivative of $y(x, t)$ with respect to $x$, i.e., $\frac{\partial^2 y}{\partial x^2}$.

Thus, we obtain the wave equation


$$\mu \frac{\partial^2 y(x, t)}{\partial t^2} = E \frac{\partial^2 y(x, t)}{\partial x^2}$$

where $E = ka$ is the Young modulus. If we reorder the terms, we get:

$$\frac{\partial^2 y(x, t)}{\partial x^2} - \frac{\mu}{E}\frac{\partial^2 y(x, t)}{\partial t^2} = 0 \quad \Rightarrow \quad \frac{\partial^2 y(x, t)}{\partial x^2} - \frac{1}{v^2}\frac{\partial^2 y(x, t)}{\partial t^2} = 0, \quad t \in [0, +\infty), \; x \in [0, L]$$

where $v = \sqrt{E/\mu}$ is the propagation speed.

We also have to insert proper boundary and initial conditions, as follows:

$$\begin{cases} y(0, t) = 0 & \forall t \quad \text{(boundary conditions)} \\ y(L, t) = 0 & \forall t \\ y(x, 0) = \alpha(x) & \forall x \in [0, L] \quad \text{(initial conditions)} \\ \dfrac{\partial y(x, 0)}{\partial t} = \beta(x) & \forall x \in [0, L] \end{cases}$$

4.2.2. Solution of the wave equation with the separation of variables

The separation of variables is a technique that allows us to find the solution of the D'Alembert equation (and also of other kinds of PDEs, as we will see later on). This technique can be used only for linear PDEs. The basic idea is to set a fixed structure for the unknown $y(x, t)$, also referred to as a "tentative solution":

$$y(x, t) = A(x)B(t)$$

In other words, $y(x, t)$ is written as a product of two functions, one depending only on $x$ and the other depending only on $t$.

Let us consider the D'Alembert equation:

$$\frac{\partial^2 y(x, t)}{\partial x^2} - \frac{\mu}{E}\frac{\partial^2 y(x, t)}{\partial t^2} = 0$$

and substitute the tentative solution. We have:

$$\frac{\partial^2 y}{\partial x^2} = A''(x)B(t); \quad \frac{\partial^2 y}{\partial t^2} = A(x)\ddot{B}(t)$$

where $A''(x) = \dfrac{d^2 A(x)}{dx^2}$ and $\ddot{B}(t) = \dfrac{d^2 B(t)}{dt^2}$. Hence:

$$A''(x)B(t) = \frac{\mu}{E}A(x)\ddot{B}(t)$$



Now, we divide both members by the product $A(x)B(t)$ (for sure this term is different from zero, otherwise we would have $y(x, t) \equiv 0$, which is unacceptable):

$$\frac{A''(x)}{A(x)} = \frac{\mu}{E}\frac{\ddot{B}(t)}{B(t)} = k$$

The first member depends only on $x$, and the second member depends only on $t$. Thus, they can be equal only if they are both equal to a constant $k$. Therefore, starting from a PDE it is possible to write two ODEs:

$$\begin{cases} A''(x) - kA(x) = 0 \\ \ddot{B}(t) - \dfrac{kE}{\mu}B(t) = 0 \end{cases}$$

Both equations are of the kind $y''(x) + qy(x) = 0$, whose solution is $y(x) = c_1 e^{\sqrt{-q}\,x} + c_2 e^{-\sqrt{-q}\,x}$:

• If $q = 0$, then $y(x) = \alpha x + \beta$.
• If $q < 0$, for instance $q = -p^2$, then $y(x) = c_1 e^{px} + c_2 e^{-px}$.
• If $q > 0$, for instance $q = \omega^2$, then $y(x) = c_1 e^{i\omega x} + c_2 e^{-i\omega x} = c_1(\cos(\omega x) + i\sin(\omega x)) + c_2(\cos(\omega x) - i\sin(\omega x)) = d_1 \cos \omega x + d_2 \sin \omega x$.

Thus, we can write the following expression for $A(x)$:

$$A(x) = c_1 e^{\sqrt{k}\,x} + c_2 e^{-\sqrt{k}\,x}$$

Let us consider now the boundary conditions. Since $B(t) \equiv 0$ would make the solution identically zero, the only way to satisfy $y(0, t) = y(L, t) = 0$ is to impose $A(0) = A(L) = 0$. Thus, we have:

$$\begin{cases} A''(x) - kA(x) = 0 \\ y(0, t) = A(0)B(t) = 0 \Rightarrow A(0) = 0 \\ y(L, t) = A(L)B(t) = 0 \Rightarrow A(L) = 0 \end{cases} \tag{4.2}$$

Three different cases must be investigated for (4.2), depending on the sign of $k$.

Case $k = 0$:

$$A''(x) = 0 \Rightarrow A(x) = \gamma x + \delta$$

$$A(0) = \gamma \cdot 0 + \delta = 0 \Rightarrow \delta = 0$$

$$A(L) = \gamma L + \delta = 0 \Rightarrow \gamma = 0$$

This case has to be discarded since $A(x) \equiv 0$, and therefore $y(x, t) = A(x)B(t) \equiv 0$.

Case $k > 0$:

$$k = p^2 \Rightarrow A(x) = c_1 e^{px} + c_2 e^{-px}$$

$$A(0) = c_1 + c_2 = 0 \Rightarrow c_1 = -c_2$$


$$A(L) = c_1 e^{pL} + c_2 e^{-pL} = c_1\left(e^{pL} - e^{-pL}\right) = 2c_1 \sinh(pL) = 0$$

The term $\sinh(pL)$ can be zero only if $pL = 0$, but $p$ cannot be zero since $p^2 = k > 0$, and $L$ cannot be zero since we deal with a string of a certain length. Hence, the only possibility is $c_1 = 0$. Thus, this case has to be discarded as well, since it would entail $A(x) \equiv 0$, and therefore $y(x, t) = A(x)B(t) = 0 \; \forall t$, i.e., a non-oscillating string, which is not physically meaningful.

Case $k < 0$:

$$k = -\omega^2 \Rightarrow A(x) = a \cos \omega x + b \sin \omega x$$

$$A(0) = a = 0$$

$$A(L) = b \sin \omega L = 0 \Rightarrow \begin{cases} b = 0 \Rightarrow A(x) \equiv 0 \Rightarrow y(x, t) \equiv 0 \text{ (discarded)} \\ \sin \omega L = 0 \Rightarrow \omega L = n\pi \Rightarrow \omega = \dfrac{n\pi}{L} \Rightarrow k = -\dfrac{n^2\pi^2}{L^2} \end{cases}$$

The solution for $A(x)$ is indexed by $n$, i.e., we can write:

$$A_n(x) = b_n \sin\left(\frac{n\pi}{L}x\right)$$

By repeating the same operations for $B(t)$, we obtain:

$$\ddot{B}(t) - \frac{E}{\mu}\left(-\frac{n^2\pi^2}{L^2}\right)B(t) = 0 \quad \Rightarrow \quad \ddot{B}(t) + \frac{E n^2 \pi^2}{\mu L^2}B(t) = 0$$

As before, the oscillatory case is the only feasible possibility. The solution is still indexed by $n$, i.e., we have:

$$B_n(t) = c_n \cos\left(\frac{n\pi}{L}\sqrt{\frac{E}{\mu}}\,t\right) + d_n \sin\left(\frac{n\pi}{L}\sqrt{\frac{E}{\mu}}\,t\right), \quad n = 0, 1, \dots$$

Hence, we can write:

$$y_n(x, t) = A_n(x)B_n(t) = b_n \sin\left(\frac{n\pi}{L}x\right)\left[c_n \cos\left(\frac{n\pi}{L}\sqrt{\frac{E}{\mu}}\,t\right) + d_n \sin\left(\frac{n\pi}{L}\sqrt{\frac{E}{\mu}}\,t\right)\right], \quad n = 0, 1, \dots$$

The quantity $n$ takes on the discrete values $n = 0, 1, 2, \dots$. We consider all of them by summing over $n$ (the PDE is linear, so a sum of solutions is again a solution). The solution of the wave equation is therefore:

$$y(x, t) = \sum_{n=0}^{\infty} \sin\left(\frac{n\pi}{L}x\right)\left[a_n \cos\left(\frac{n\pi}{L}\sqrt{\frac{E}{\mu}}\,t\right) + b_n \sin\left(\frac{n\pi}{L}\sqrt{\frac{E}{\mu}}\,t\right)\right]$$


where $a_n = b_n c_n$ and $b_n = b_n d_n$ (with a slight abuse of notation, the products are renamed as single coefficients).

The previous equation is also called the general solution of the wave equation. It satisfies the PDE and the boundary conditions. The initial conditions are satisfied by choosing specific values for the coefficients $a_n, b_n$:

$$\begin{cases} y(x, 0) = \alpha(x) \Rightarrow \alpha(x) = \displaystyle\sum_{n=0}^{\infty} a_n \sin\left(\frac{n\pi}{L}x\right) \\ \dot{y}(x, 0) = \beta(x) \Rightarrow \beta(x) = \displaystyle\sum_{n=0}^{\infty} \frac{n\pi}{L}\sqrt{\frac{E}{\mu}}\,b_n \sin\left(\frac{n\pi}{L}x\right) \end{cases}$$

The two expressions can be seen as the Fourier series expansions of $\alpha(x)$ and $\beta(x)$. Hence, $a_n$ and $\frac{n\pi}{L}\sqrt{\frac{E}{\mu}}\,b_n$ are the coefficients of the expansions.

In our case, the functions $\alpha(x)$ and $\beta(x)$ are defined in $[0, L]$, since this is the domain of the considered problem. Thus, it is necessary to extend $\alpha, \beta$ outside their domain, to $[-L, L]$ (exploiting the symmetry of the function $\sin$ and assuming a period equal to $2L$, the extension to $[-L, 0]$ is an odd extension, i.e., $\alpha(x) = -\alpha(-x)$). Thus, we get:

$$a_n = \frac{1}{L}\int_{-L}^{L} \alpha(x)\sin\left(\frac{n\pi x}{L}\right)dx = \frac{2}{L}\int_{0}^{L} \alpha(x)\sin\left(\frac{n\pi x}{L}\right)dx$$

$$\frac{n\pi}{L}\sqrt{\frac{E}{\mu}}\,b_n = \frac{1}{L}\int_{-L}^{L} \beta(x)\sin\left(\frac{n\pi x}{L}\right)dx \quad \Rightarrow \quad b_n = \frac{2}{n\pi}\sqrt{\frac{\mu}{E}}\int_{0}^{L} \beta(x)\sin\left(\frac{n\pi x}{L}\right)dx$$

Thus, the solution of the D'Alembert equation for the oscillating string is given by:

$$y(x, t) = \sum_{n=0}^{\infty} \sin\left(\frac{n\pi x}{L}\right)\left[\cos\left(\frac{n\pi}{L}\sqrt{\frac{E}{\mu}}\,t\right) \cdot \frac{2}{L}\int_{0}^{L}\alpha(s)\sin\left(\frac{n\pi s}{L}\right)ds + \sin\left(\frac{n\pi}{L}\sqrt{\frac{E}{\mu}}\,t\right) \cdot \frac{2}{n\pi}\sqrt{\frac{\mu}{E}}\int_{0}^{L}\beta(s)\sin\left(\frac{n\pi s}{L}\right)ds\right]$$

This solution satisfies the PDE, the boundary conditions, and also the initial conditions, and it is written in closed form. Its uniqueness is guaranteed by the Fourier series expansion, and it includes the case $y = 0$ (excluded before) for $n = 0$.
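The closed-form solution can be evaluated numerically by truncating the sum and computing the coefficient integrals by quadrature (a sketch with illustrative names and parameters, not from the notes; the trapezoidal rule and the truncation level are arbitrary choices):

```python
import math

def string_solution(alpha, beta, L, E, mu, n_terms=30, n_quad=400):
    """Truncated Fourier-series solution y(x, t) of the oscillating string,
    with coefficients a_n, b_n computed by trapezoidal quadrature."""
    v = math.sqrt(E / mu)
    xs = [L * i / n_quad for i in range(n_quad + 1)]
    def integral(f):
        w = [0.5] + [1.0] * (n_quad - 1) + [0.5]   # trapezoidal weights
        return sum(wi * f(x) for wi, x in zip(w, xs)) * (L / n_quad)
    a = [2.0 / L * integral(lambda x, n=n: alpha(x) * math.sin(n * math.pi * x / L))
         for n in range(1, n_terms + 1)]
    b = [2.0 / (n * math.pi) * math.sqrt(mu / E)
         * integral(lambda x, n=n: beta(x) * math.sin(n * math.pi * x / L))
         for n in range(1, n_terms + 1)]
    def y(x, t):
        return sum(math.sin(n * math.pi * x / L)
                   * (a[n - 1] * math.cos(n * math.pi * v * t / L)
                      + b[n - 1] * math.sin(n * math.pi * v * t / L))
                   for n in range(1, n_terms + 1))
    return y

# initial shape alpha(x) = sin(pi x / L), zero initial speed: only mode 1 survives
L_len = 1.0
y = string_solution(lambda x: math.sin(math.pi * x / L_len), lambda x: 0.0,
                    L_len, E=1.0, mu=1.0)
```

With this choice of $\alpha$ and $\beta$, the expansion reduces to $a_1 = 1$ and all other coefficients (numerically) zero, so $y(x, t) = \sin(\pi x / L)\cos(\pi v t / L)$, which matches the initial shape at $t = 0$ and vanishes at the fixed ends for all $t$.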


4.2.3. Modal Analysis

Modal analysis consists of investigating what happens to the various terms of the sum in the solution of the wave equation for different values of $n$. The term of the sum for a fixed $n$ is called a mode. Thus, the solution is a sum of modes.

Mode 0:

$$n = 0 \;\Rightarrow\; y_0(x,t) = 0 \quad \text{(static solution)}$$

Mode 1 (fundamental mode):

$$n = 1 \;\Rightarrow\; y_1(x,t) = \sin\left(\frac{\pi x}{L}\right)\left[a_1\cos\left(\frac{\pi}{L}\sqrt{\frac{E}{\mu}}\,t\right) + b_1\sin\left(\frac{\pi}{L}\sqrt{\frac{E}{\mu}}\,t\right)\right]$$

Let us consider what happens for a fixed $t$. We have $y_1(x,t) \propto \sin\left(\frac{\pi x}{L}\right)$. The frequency of oscillation is $f_1 = \frac{1}{2L}\sqrt{\frac{E}{\mu}}$, while the period is $T_1 = 2L\sqrt{\frac{\mu}{E}}$. The longer the string, the lower $f_1$, i.e., $f_1$ is inversely proportional to $L$.

Mode 2:

$$n = 2 \;\Rightarrow\; y_2(x,t) = \sin\left(\frac{2\pi x}{L}\right)\left[a_2\cos\left(\frac{2\pi}{L}\sqrt{\frac{E}{\mu}}\,t\right) + b_2\sin\left(\frac{2\pi}{L}\sqrt{\frac{E}{\mu}}\,t\right)\right]$$


For fixed $t$, we have $y_2(x,t) \propto \sin\left(\frac{2\pi x}{L}\right)$. Therefore, this mode is characterized by a complete oscillation in $[0, L]$ (two half sinusoids). The oscillation frequency is $f_2 = 2f_1 = \frac{1}{L}\sqrt{\frac{E}{\mu}}$.

Mode $n$:

$$y_n(x,t) = \sin\left(\frac{n\pi x}{L}\right)\left[a_n\cos\left(\frac{n\pi}{L}\sqrt{\frac{E}{\mu}}\,t\right) + b_n\sin\left(\frac{n\pi}{L}\sqrt{\frac{E}{\mu}}\,t\right)\right]$$

For fixed $t$, we have $y_n(x,t) \propto \sin\left(\frac{n\pi x}{L}\right)$. Therefore, there are $n$ half sinusoids in $[0, L]$. The oscillation frequency is $f_n = n f_1$.

All the oscillation modes have a frequency that is a multiple of the fundamental frequency $f_1$. By composing the oscillations of the various modes (there is an infinite number of modes), we obtain the shape of the function $y$ satisfying the wave equation.
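As a numerical illustration (added here, not part of the original notes), the following sketch sums the first modes for a plucked string; the initial shape $\alpha(x)$, the material constants, and the truncation order $N$ are illustrative assumptions.

```python
import numpy as np

# Illustrative parameters (assumed values, not from the notes)
L, E, mu = 1.0, 4.0, 1.0           # string length, modulus, linear density
v = np.sqrt(E / mu)                # propagation speed sqrt(E/mu)
N = 50                             # truncation order of the modal sum

def alpha(x):
    # Plucked-string initial shape: triangle peaked at x = L/2
    return np.where(x < L / 2, x, L - x)

# Modal coefficients a_n = (2/L) int_0^L alpha(s) sin(n pi s/L) ds
# (b_n = 0 since the initial velocity beta is taken to be zero here)
s, ds = np.linspace(0.0, L, 4001, retstep=True)
n = np.arange(1, N + 1)
a = (2.0 / L) * (np.sin(np.outer(n, s) * np.pi / L) @ alpha(s)) * ds

def y(x, t):
    # Partial sum of the modal expansion of the solution
    return np.sum(a[:, None] * np.sin(np.outer(n, x) * np.pi / L)
                  * np.cos(n[:, None] * np.pi * v * t / L), axis=0)

x = np.linspace(0.0, L, 11)
print(np.max(np.abs(y(x, 0.0) - alpha(x))))   # small: series reproduces alpha
print(v / (2 * L))                             # fundamental frequency f_1
```

The string returns to its initial shape after one fundamental period $T_1 = 2L\sqrt{\mu/E}$, since every mode frequency is a multiple of $f_1$.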

4.3. Solution of elliptic equations

Let us consider the Laplace equation:

$$\Delta\varphi = 0 \;\Rightarrow\; \frac{\partial^2\varphi}{\partial x^2} + \frac{\partial^2\varphi}{\partial y^2} + \frac{\partial^2\varphi}{\partial z^2} = 0$$

As said, elliptic equations model static phenomena, where no time is involved. Thus, only space variables are involved.

We will study the solution of the Laplace equation in a finite domain $\Omega \subset \mathbb{R}^3$.


Depending on the boundary conditions, there are three different cases:

1) $\begin{cases} \Delta\varphi = 0 & \text{in } \Omega \\ \varphi|_{\partial\Omega} = f \end{cases}$

The unknown $\varphi$ has fixed values on the boundary of the domain. This condition is called Dirichlet condition, and the problem is called Dirichlet problem.

2) $\begin{cases} \Delta\varphi = 0 & \text{in } \Omega \\ \left.\frac{\partial\varphi}{\partial\boldsymbol{n}}\right|_{\partial\Omega} = g \end{cases}$

The normal derivative of $\varphi$ is fixed on the boundary of the domain (Neumann condition). The problem is called Neumann problem.

3) $\begin{cases} \Delta\varphi = 0 & \text{in } \Omega \\ \alpha\varphi|_{\partial\Omega} + \beta\left.\frac{\partial\varphi}{\partial\boldsymbol{n}}\right|_{\partial\Omega} = h \end{cases}$

where $\alpha, \beta$ are known coefficients. This is a mixed or Robin problem with Robin boundary conditions. Mixed conditions are equivalent to having Dirichlet conditions on one part of the boundary and Neumann conditions on the other part, i.e.,

$$\partial\Omega = \partial\Omega_1 \cup \partial\Omega_2; \qquad \varphi|_{\partial\Omega_1} = f; \qquad \left.\frac{\partial\varphi}{\partial\boldsymbol{n}}\right|_{\partial\Omega_2} = g$$

The functions $\varphi$ that satisfy the Laplace equation are called harmonic. Examples of harmonic functions are the following:

• $\varphi = ax + by + cz + d$;
• $\varphi = \dfrac{1}{\sqrt{x^2 + y^2 + z^2}}$, i.e., $\varphi = \dfrac{1}{r}$ in spherical coordinates, where $r$ is the distance from the origin.


4.3.1. Uniqueness of the solution of the Laplace equation

Theorem. For the Dirichlet problem, the solution of the Laplace equation $\Delta\varphi = 0$ is unique.

$$\begin{cases} \Delta\varphi = 0 & \text{in } \Omega \\ \varphi|_{\partial\Omega} = f \end{cases}$$

Proof. Suppose that there exist two different solutions $\varphi_1$ and $\varphi_2$ ($\varphi_1 \neq \varphi_2$). We will see that this hypothesis leads to a contradiction. Since both $\varphi_1$ and $\varphi_2$ are solutions, we can write:

$$\begin{cases} \Delta\varphi_1 = 0 \\ \varphi_1|_{\partial\Omega} = f \end{cases}; \qquad \begin{cases} \Delta\varphi_2 = 0 \\ \varphi_2|_{\partial\Omega} = f \end{cases}$$

Let $\psi = \varphi_1 - \varphi_2 \neq 0$. Owing to the linearity of the PDE, it holds that:

$$\begin{cases} \Delta\psi = 0 \\ \psi|_{\partial\Omega} = 0 \end{cases}$$

By applying the first Green identity $\iiint_\Omega (\nabla h \cdot \nabla g + h\Delta g)\, d^3x = \iint_{\partial\Omega} h\, \frac{\partial g}{\partial \boldsymbol{n}}\, dS$ with $g = h = \psi$, we get

$$\iiint_\Omega (\nabla\psi \cdot \nabla\psi + \psi\Delta\psi)\, d^3x = \iint_{\partial\Omega} \psi\, \frac{\partial\psi}{\partial\boldsymbol{n}}\, dS$$

Since $\Delta\psi = 0$ in $\Omega$ and $\psi|_{\partial\Omega} = 0$, both the second term on the left-hand side and the right-hand side vanish, hence

$$\iiint_\Omega \|\nabla\psi\|^2\, d^3x = 0 \;\Rightarrow\; \|\nabla\psi\| = 0 \;\Rightarrow\; \nabla\psi = 0 \;\Rightarrow\; \psi = \text{constant}$$

However, $\psi|_{\partial\Omega} = 0$, thus $\psi$ must be zero over $\Omega$, i.e., $\psi = 0$ in $\Omega$. Thus, we have $\varphi_1 = \varphi_2$, hence the solution is unique.

In the case of the Neumann problem, it is possible to prove uniqueness up to a constant. The proof follows the same steps as before: we assume two different solutions $\varphi_1$ and $\varphi_2$, thus we can write:

$$\varphi_1 \neq \varphi_2 \;\Rightarrow\; \begin{cases} \Delta\varphi_1 = 0 \\ \left.\dfrac{\partial\varphi_1}{\partial\boldsymbol{n}}\right|_{\partial\Omega} = g \end{cases}; \qquad \begin{cases} \Delta\varphi_2 = 0 \\ \left.\dfrac{\partial\varphi_2}{\partial\boldsymbol{n}}\right|_{\partial\Omega} = g \end{cases}$$

$$\psi = \varphi_1 - \varphi_2 \;\Rightarrow\; \begin{cases} \Delta\psi = 0 \\ \left.\dfrac{\partial\psi}{\partial\boldsymbol{n}}\right|_{\partial\Omega} = 0 \end{cases}$$

Let us apply again the first Green identity as before, with $g = h = \psi$:

$$\iiint_\Omega (\nabla\psi \cdot \nabla\psi + \psi\Delta\psi)\, d^3x = \iint_{\partial\Omega} \psi\, \frac{\partial\psi}{\partial\boldsymbol{n}}\, dS$$

$$\iiint_\Omega \|\nabla\psi\|^2\, d^3x = 0 \;\Rightarrow\; \|\nabla\psi\| = 0 \;\Rightarrow\; \nabla\psi = 0 \;\Rightarrow\; \psi = \text{constant}$$



Hence, $\varphi_1 - \varphi_2 = \text{constant}$, that is, $\varphi_1 = \varphi_2 + \text{constant}$.

For the Robin (mixed) problem, it is possible to prove uniqueness as in the case of Dirichlet conditions. For the proof, we consider again two different solutions $\varphi_1$ and $\varphi_2$, i.e.,

$$\varphi_1 \neq \varphi_2 \;\Rightarrow\; \begin{cases} \Delta\varphi_1 = 0 \\ \varphi_1|_{\partial\Omega_1} = f \\ \left.\dfrac{\partial\varphi_1}{\partial\boldsymbol{n}}\right|_{\partial\Omega_2} = g \end{cases}; \qquad \begin{cases} \Delta\varphi_2 = 0 \\ \varphi_2|_{\partial\Omega_1} = f \\ \left.\dfrac{\partial\varphi_2}{\partial\boldsymbol{n}}\right|_{\partial\Omega_2} = g \end{cases}$$

$$\psi = \varphi_1 - \varphi_2 \;\Rightarrow\; \begin{cases} \Delta\psi = 0 \\ \psi|_{\partial\Omega_1} = 0 \\ \left.\dfrac{\partial\psi}{\partial\boldsymbol{n}}\right|_{\partial\Omega_2} = 0 \end{cases}$$

$$\iiint_\Omega (\nabla\psi \cdot \nabla\psi + \psi\Delta\psi)\, d^3x = \iint_{\partial\Omega} \psi\, \frac{\partial\psi}{\partial\boldsymbol{n}}\, dS$$

The boundary integral vanishes, since $\psi = 0$ on $\partial\Omega_1$ and $\partial\psi/\partial\boldsymbol{n} = 0$ on $\partial\Omega_2$. Hence,

$$\iiint_\Omega \|\nabla\psi\|^2\, d^3x = 0 \;\Rightarrow\; \|\nabla\psi\| = 0 \;\Rightarrow\; \nabla\psi = 0 \;\Rightarrow\; \psi = \text{constant}$$

but $\psi|_{\partial\Omega_1} = 0$, hence $\psi \equiv 0$, i.e., $\varphi_1 - \varphi_2 = 0$, and therefore $\varphi_1 = \varphi_2$.

4.3.2. Solution of the Laplace equation in a circle

Let us consider the Laplace equation within a circle of radius $R$:

$$\begin{cases} \Delta\varphi = 0 & \text{in } \Omega \\ \varphi|_{\partial\Omega} = f \end{cases}$$

The circle is one of the few shapes of the domain $\Omega$ that allow finding analytical solutions to the equation. For more complex domains, it is necessary to resort to numerical approximations.
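To give a flavor of the numerical route (an addition to these notes, not the lecturer's material), here is a minimal finite-difference sketch for a Dirichlet problem on a square: the Laplacian is discretized with the five-point stencil and solved by Jacobi iteration. The grid size and boundary data are illustrative assumptions.

```python
import numpy as np

# Jacobi iteration for Delta(phi) = 0 on the unit square (assumed domain)
# with Dirichlet data on the boundary.
N = 41                                   # grid points per side (assumption)
x = np.linspace(0.0, 1.0, N)
phi = np.zeros((N, N))                   # rows index y, columns index x

# Illustrative boundary condition: phi = sin(pi x) on the top edge, 0 elsewhere
phi[-1, :] = np.sin(np.pi * x)

for _ in range(5000):
    # Five-point stencil: each interior value is the average of its neighbors
    phi[1:-1, 1:-1] = 0.25 * (phi[2:, 1:-1] + phi[:-2, 1:-1]
                              + phi[1:-1, 2:] + phi[1:-1, :-2])

# Compare with the exact solution sin(pi x) sinh(pi y) / sinh(pi)
X, Y = np.meshgrid(x, x, indexing="xy")
exact = np.sin(np.pi * X) * np.sinh(np.pi * Y) / np.sinh(np.pi)
print(np.max(np.abs(phi - exact)))       # small discretization error
```

The averaging update mirrors the mean value property of harmonic functions, which is also what makes the maximum principle hold on the discrete grid.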

Due to the shape of $\Omega$, it is convenient to use polar coordinates:

$$\begin{cases} x = r\cos\vartheta \\ y = r\sin\vartheta \end{cases}$$


$$\begin{cases} \Delta\varphi(r,\vartheta) = \dfrac{\partial^2\varphi(r,\vartheta)}{\partial r^2} + \dfrac{1}{r}\dfrac{\partial\varphi(r,\vartheta)}{\partial r} + \dfrac{1}{r^2}\dfrac{\partial^2\varphi(r,\vartheta)}{\partial\vartheta^2} = 0, & 0 \le r \le R,\; 0 \le \vartheta \le 2\pi \\ \varphi(R,\vartheta) = f(\vartheta), & \vartheta \in [0, 2\pi] \end{cases}$$

We search for a solution by using the separation of variables principle (which can be used since the PDE is linear). We start with a tentative solution:

$$\varphi(r,\vartheta) = A(r)B(\vartheta)$$

Having proved uniqueness, separation of variables will give us the unique solution. Let us substitute the tentative solution into the equation:

$$\Delta\varphi(r,\vartheta) = \frac{\partial^2 [A(r)B(\vartheta)]}{\partial r^2} + \frac{1}{r}\frac{\partial [A(r)B(\vartheta)]}{\partial r} + \frac{1}{r^2}\frac{\partial^2 [A(r)B(\vartheta)]}{\partial\vartheta^2} = 0$$

$$A''(r)B(\vartheta) + \frac{1}{r}A'(r)B(\vartheta) + \frac{1}{r^2}A(r)B''(\vartheta) = 0$$

Let us divide by the product $A(r)B(\vartheta)$:

$$\frac{A''(r)}{A(r)} + \frac{1}{r}\frac{A'(r)}{A(r)} + \frac{1}{r^2}\frac{B''(\vartheta)}{B(\vartheta)} = 0$$

Multiplying by $r^2$ and separating the variables, both sides must be equal to a constant $k$:

$$r^2\frac{A''(r)}{A(r)} + r\frac{A'(r)}{A(r)} = -\frac{B''(\vartheta)}{B(\vartheta)} = k$$

We have obtained two different ODEs:

$$\begin{cases} A''(r) + \dfrac{1}{r}A'(r) - \dfrac{k}{r^2}A(r) = 0 & \text{(radial equation)} \\ B''(\vartheta) + kB(\vartheta) = 0 & \text{(angular equation)} \end{cases}$$

Consider the equation $B''(\vartheta) + kB(\vartheta) = 0$. We study three cases depending on the sign of $k$:

1) $k = 0 \Rightarrow B''(\vartheta) = 0 \Rightarrow B'(\vartheta) = \alpha \Rightarrow B(\vartheta) = \alpha\vartheta + \beta$ (linear in $\vartheta$);
2) $k > 0 \Rightarrow B(\vartheta) = a\cos(\sqrt{k}\,\vartheta) + b\sin(\sqrt{k}\,\vartheta)$;
3) $k < 0 \Rightarrow B(\vartheta) = a e^{\sqrt{-k}\,\vartheta} + b e^{-\sqrt{-k}\,\vartheta}$.

Since $\vartheta$ is an angle, we must have $B(\vartheta) = B(\vartheta + 2\pi)$. Thus, case 2) is the only valid one, since exponentials and linear functions are not periodic with period $2\pi$, whereas case 2) involves periodic functions.

By imposing periodicity, we get

$$B(\vartheta + 2\pi) = a\cos\left(\sqrt{k}(\vartheta + 2\pi)\right) + b\sin\left(\sqrt{k}(\vartheta + 2\pi)\right) = a\cos(\sqrt{k}\,\vartheta) + b\sin(\sqrt{k}\,\vartheta)$$


$$\begin{cases} \cos\left(\sqrt{k}(\vartheta + 2\pi)\right) = \cos(\sqrt{k}\,\vartheta) \\ \sin\left(\sqrt{k}(\vartheta + 2\pi)\right) = \sin(\sqrt{k}\,\vartheta) \end{cases} \;\Rightarrow\; \sqrt{k}(\vartheta + 2\pi) = \sqrt{k}\,\vartheta + 2n\pi, \quad n = 0, \pm 1, \pm 2, \ldots$$

$$2\pi\sqrt{k} = 2n\pi \;\Rightarrow\; \sqrt{k} = n \;\Rightarrow\; k = n^2$$

which is coherent with case 2), $k > 0$. The solution for $B(\vartheta)$ is indexed by $n$, i.e.,

$$B_n(\vartheta) = a_n\cos n\vartheta + b_n\sin n\vartheta, \quad n = 0, 1, 2, \ldots$$

It is useless to consider negative values of $n$, since they represent just a change of sign in $a_n$, $b_n$. The case $n = 0$ has to be taken into account, even if it was discarded before.

Now, let us consider the equation for $A(r)$ with $k = n^2$:

$$A''(r) + \frac{1}{r}A'(r) - \frac{n^2}{r^2}A(r) = 0$$

Let us introduce a tentative solution $A(r) = r^p$, with $p$ constant. We have:

$$p(p-1)r^{p-2} + p r^{p-2} - n^2 r^{p-2} = 0$$

$$r^{p-2}\left[p(p-1) + p - n^2\right] = 0$$

$$p^2 - n^2 = 0 \;\Rightarrow\; p^2 = n^2$$

There are two solutions: $p_1 = n$, $p_2 = -n$. Thus, also $A(r)$ is indexed by $n$, i.e.,

$$A_n(r) = c_n r^n + d_n r^{-n}, \quad n = 1, 2, 3, \ldots$$

The case $n = 0$ is considered separately, since $r^n$ and $r^{-n}$ are linearly dependent if $n = 0$. In this case, the radial equation reduces to:

$$A''(r) + \frac{1}{r}A'(r) = 0$$

$$A'(r) = Q(r) \;\Rightarrow\; Q'(r) + \frac{1}{r}Q(r) = 0$$

$$\frac{dQ}{dr} = -\frac{Q}{r} \;\Rightarrow\; \int\frac{dQ}{Q} = -\int\frac{dr}{r} \;\Rightarrow\; \ln Q = -\ln r + c_0$$

$$Q(r) = A'(r) \;\Rightarrow\; A(r) = \int Q(r)\, dr = \alpha_0 + \beta_0\ln r$$

If we put all the terms together, we can write the following general solution of the Laplace equation:

$$\varphi(r,\vartheta) = A(r)B(\vartheta) = \underbrace{[\alpha_0 + \beta_0\ln r]}_{A_0}\underbrace{a_0}_{B_0} + \sum_{n=1}^{\infty}\underbrace{[c_n r^n + d_n r^{-n}]}_{A_n}\underbrace{[a_n\cos n\vartheta + b_n\sin n\vartheta]}_{B_n}$$


The values of the coefficients are still to be determined. We compute them by means of the boundary conditions, with $0 \le r \le R$, $0 \le \vartheta \le 2\pi$.

Since we cannot accept divergence at $r = 0$, we must have

$$\begin{cases} \beta_0 = 0, & \text{since } \ln r \to -\infty \text{ as } r \to 0 \\ d_n = 0, & \text{since } r^{-n} \to \infty \text{ as } r \to 0 \end{cases}$$

Thus, the solution simplifies as follows:

$$\varphi(r,\vartheta) = a_0 + \sum_{n=1}^{\infty} r^n\left[a_n\cos n\vartheta + b_n\sin n\vartheta\right], \qquad \begin{cases} a_0 := a_0\alpha_0 \\ a_n := a_n c_n \\ b_n := b_n c_n \end{cases}$$

Let us apply the boundary condition:

$$\varphi(R,\vartheta) = f(\vartheta) = a_0 + \sum_{n=1}^{\infty} R^n\left[a_n\cos n\vartheta + b_n\sin n\vartheta\right]$$

The right-hand side of the equation is the Fourier series expansion of the function $f(\vartheta)$, and $a_0$, $a_n$, $b_n$ are obtained from the Fourier coefficients, i.e.,

$$a_0 = \frac{1}{2\pi}\int_0^{2\pi} f(\vartheta)\, d\vartheta$$

$$a_n = \frac{1}{\pi R^n}\int_0^{2\pi} f(\vartheta)\cos n\vartheta\, d\vartheta$$

$$b_n = \frac{1}{\pi R^n}\int_0^{2\pi} f(\vartheta)\sin n\vartheta\, d\vartheta$$

Hence, the solution of the Laplace equation satisfying the equation and the boundary condition is:

$$\varphi(r,\vartheta) = \frac{1}{2\pi}\int_0^{2\pi} f(s)\, ds + \sum_{n=1}^{\infty}\frac{r^n}{\pi R^n}\left[\left(\int_0^{2\pi} f(s)\cos ns\, ds\right)\cos n\vartheta + \left(\int_0^{2\pi} f(s)\sin ns\, ds\right)\sin n\vartheta\right]$$

This is the exact, analytical expression of the solution (no approximation has been performed). Its uniqueness is guaranteed by the theorem proved in the previous section.
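As a numerical cross-check (added here, not part of the notes), one can truncate the series and verify two known properties: the value at the center equals the mean of the boundary data, and the series reproduces $f$ on the boundary. The boundary function $f$ and the truncation order are illustrative assumptions.

```python
import numpy as np

# Truncated series solution of the Dirichlet problem in a disk of radius R
R, N = 2.0, 30                       # radius and truncation order (assumptions)
th, dth = np.linspace(0.0, 2 * np.pi, 2000, endpoint=False, retstep=True)

def f(t):
    # Illustrative boundary data (a trigonometric polynomial)
    return np.cos(t) + 0.5 * np.sin(3 * t) + 1.0

# Fourier coefficients computed by quadrature
a0 = np.sum(f(th)) * dth / (2 * np.pi)
a = [np.sum(f(th) * np.cos(n * th)) * dth / (np.pi * R**n) for n in range(1, N + 1)]
b = [np.sum(f(th) * np.sin(n * th)) * dth / (np.pi * R**n) for n in range(1, N + 1)]

def phi(r, t):
    # Truncated series phi(r, theta)
    s = a0
    for n in range(1, N + 1):
        s += r**n * (a[n - 1] * np.cos(n * t) + b[n - 1] * np.sin(n * t))
    return s

print(phi(0.0, 0.0))          # mean value property: equals a0
print(phi(R, 0.3), f(0.3))    # boundary condition recovered
```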

4.4. Solution of parabolic equations

Parabolic equations model dynamic physical phenomena (time is involved) with infinite propagation speed.


Let us consider the heat (or diffusion) equation:

$$\Delta c(x,t) - \frac{1}{D}\frac{\partial c(x,t)}{\partial t} = 0$$

where $c(x,t)$ is the unknown (which can be either the concentration of heat in a medium or the concentration of a liquid in a basin), and $D > 0$ is the diffusion coefficient.

Consider the Dirichlet problem for the heat equation:

$$\begin{cases} \Delta c(x,t) - \dfrac{1}{D}\dfrac{\partial c(x,t)}{\partial t} = 0, & x \in \Omega \\ c(x,0) = g(x) \\ c(x,t)|_{\partial\Omega} = f(x,t) \end{cases}$$

It is possible to prove that the solution of the previous problem is unique. Suppose, as in the case of elliptic equations, that there exist two different solutions $\chi(x,t) \neq \eta(x,t)$. Let $\psi = \chi - \eta$. The function $\psi$ satisfies:

$$\begin{cases} \Delta\psi(x,t) - \dfrac{1}{D}\dfrac{\partial\psi(x,t)}{\partial t} = 0, & x \in \Omega \\ \psi(x,0) = 0 \\ \psi(x,t)|_{\partial\Omega} = 0 \end{cases}$$

By using the first Green identity with $g = h = \psi$, it is possible to prove that $\psi(x,t) \equiv 0$:

$$\iiint_\Omega (\nabla\psi \cdot \nabla\psi + \psi\Delta\psi)\, d^3x = \iint_{\partial\Omega} \psi\, \frac{\partial\psi}{\partial\boldsymbol{n}}\, dS$$

The right-hand side vanishes since $\psi|_{\partial\Omega} = 0$; substituting $\Delta\psi = \frac{1}{D}\frac{\partial\psi}{\partial t}$, we get

$$\frac{1}{D}\iiint_\Omega \psi\,\frac{\partial\psi}{\partial t}\, d^3x + \iiint_\Omega \|\nabla\psi\|^2\, d^3x = 0$$

We assume that regularity properties of the involved quantities are satisfied, hence we can switch the integral and the time derivative, i.e.,

$$\frac{1}{2D}\frac{\partial}{\partial t}\iiint_\Omega \psi^2\, d^3x = -\iiint_\Omega \|\nabla\psi\|^2\, d^3x$$

Let $H(t) = \frac{1}{2D}\iiint_\Omega \psi^2\, d^3x$. We have:

$$\frac{\partial}{\partial t}H(t) = -\iiint_\Omega \|\nabla\psi\|^2\, d^3x \le 0$$



The function $H(t)$ is always nonnegative owing to its definition, and it is nonincreasing in time, since its time derivative is nonpositive. Moreover, $H(0) = 0$ since $\psi(x,0) = 0$. Thus, we must have $H(t) = 0$ for all $t$, i.e., $\iiint_\Omega \psi^2\, d^3x = 0$, which implies $\psi \equiv 0$ in $\Omega$, and hence $\chi = \eta$. Thus, we have proved the uniqueness of the solution. Uniqueness can be proved also for the Neumann problem by performing similar computations.

4.4.1. Heat equation on the infinite line

Let us consider the heat equation in an unbounded 1D domain:

$$\begin{cases} \dfrac{\partial^2 c(x,t)}{\partial x^2} - \dfrac{1}{D}\dfrac{\partial c(x,t)}{\partial t} = 0, & x \in (-\infty, \infty) \\ c(x,0) = f(x) & \text{(initial condition)} \\ c(-\infty,t) = c(\infty,t) = 0 & \text{(boundary conditions)} \end{cases}$$

Strictly speaking, boundary conditions are not defined, since the domain is unbounded. However, it is possible to impose regularity conditions as $x \to \pm\infty$.

We apply again the separation of variables to find a solution, i.e., we assume $c(x,t) = A(x)B(t)$:

$$A''(x)B(t) - \frac{1}{D}A(x)\dot{B}(t) = 0$$

Dividing by $A(x)B(t)$:

$$\frac{A''(x)}{A(x)} - \frac{1}{D}\frac{\dot{B}(t)}{B(t)} = 0 \;\Rightarrow\; \frac{A''(x)}{A(x)} = \frac{1}{D}\frac{\dot{B}(t)}{B(t)} = k$$

As a result, we obtain two ODEs:

$$\begin{cases} A''(x) - kA(x) = 0 \\ \dot{B}(t) - kDB(t) = 0 \end{cases}$$

Consider the time equation $\dot{B}(t) - kDB(t) = 0$. We have:

$$\frac{dB}{dt} = kDB \;\Rightarrow\; \int\frac{dB}{B} = \int kD\, dt \;\Rightarrow\; \ln B = kDt + \ln S$$

where $S$ is a constant. Thus, we can write:

$$\ln\frac{B}{S} = kDt \;\Rightarrow\; B(t) = S e^{kDt}$$

The coefficient $k$ must be negative, since for $k > 0$ there would be an exponential growth of the heat (divergence), which is not physical.

Now consider the space equation with $k < 0$:

$$A''(x) - kA(x) = 0; \qquad k = -\gamma^2 < 0 \;\Rightarrow\; A''(x) + \gamma^2 A(x) = 0$$

$$A(x) = a\cos\gamma x + b\sin\gamma x$$

$$c(x,t) = A(x)B(t) = \left[s(\gamma)e^{-\gamma^2 Dt}\right]\left[a(\gamma)\cos\gamma x + b(\gamma)\sin\gamma x\right]$$

The quantities $s, a, b$ must be constant with respect to $x$ and $t$, but in general they may depend on $\gamma$ (exactly as the coefficients of elliptic or hyperbolic problems depend on $n$). Thus, we can write:

$$c(x,t) = e^{-\gamma^2 Dt}\left[\tilde{a}(\gamma)\cos\gamma x + \tilde{b}(\gamma)\sin\gamma x\right], \qquad \begin{cases} \tilde{a}(\gamma) = a(\gamma)s(\gamma) \\ \tilde{b}(\gamma) = b(\gamma)s(\gamma) \end{cases}$$

To consider all the possible values of $\gamma$ ($\gamma$ is continuous, not discrete), we integrate over $\gamma$, i.e.,

$$c(x,t) = \int_{-\infty}^{\infty} e^{-\gamma^2 Dt}\left[\tilde{a}(\gamma)\cos\gamma x + \tilde{b}(\gamma)\sin\gamma x\right] d\gamma$$

This is the general solution of the 1D heat equation on the infinite line. By applying the Euler formulas for the cosine and the sine, we get

$$c(x,t) = \int_{-\infty}^{\infty} e^{-\gamma^2 Dt}\left[\tilde{a}(\gamma)\frac{e^{i\gamma x} + e^{-i\gamma x}}{2} + \tilde{b}(\gamma)\frac{e^{i\gamma x} - e^{-i\gamma x}}{2i}\right] d\gamma$$

Let us perform the change of variables $\gamma = 2\pi\lambda$:

$$c(x,t) = \int_{-\infty}^{\infty} e^{-4\pi^2\lambda^2 Dt}\, e^{-2\pi i\lambda x}\underbrace{\left[\frac{\tilde{a}(2\pi\lambda)}{2} - \frac{\tilde{b}(2\pi\lambda)}{2i} + \frac{\tilde{a}(-2\pi\lambda)}{2} + \frac{\tilde{b}(-2\pi\lambda)}{2i}\right]2\pi}_{q(\lambda)}\, d\lambda$$

$$c(x,t) = \int_{-\infty}^{\infty} e^{-4\pi^2\lambda^2 Dt}\, e^{-2\pi i\lambda x}\, q(\lambda)\, d\lambda$$

We now substitute the initial condition $c(x,0) = f(x)$ for all $x \in (-\infty,\infty)$:

$$c(x,0) = \int_{-\infty}^{\infty} e^{-2\pi i\lambda x}\, q(\lambda)\, d\lambda = f(x)$$


The solution $c(x,t)$ is a linear combination of sinusoids with coefficients $q(\lambda)$. Starting from this equation, we can introduce the Fourier transform $F$, exactly as we introduced the Fourier series in the previous examples. We have:

$$c(x,t) = F\left(e^{-4\pi^2\lambda^2 Dt}\, q(\lambda)\right)$$

By applying the initial condition, we get:

$$F\big(q(\lambda)\big) = f(x) \;\Rightarrow\; q(\lambda) = F^{-1}\big(f(x)\big) = \int_{-\infty}^{\infty} f(x)\, e^{2\pi i x\lambda}\, dx$$

where $F^{-1}$ is the inverse Fourier transform.

If we replace the expression of $q(\lambda)$ in the expression of $c(x,t)$, we can write:

$$c(x,t) = F\left(e^{-4\pi^2\lambda^2 Dt}\, q(\lambda)\right) = F\left(e^{-4\pi^2\lambda^2 Dt}\, F^{-1}\big(f(x)\big)\right) = F\left(e^{-4\pi^2\lambda^2 Dt}\right) * f(x) = \sqrt{\frac{1}{4\pi Dt}}\, e^{-\frac{x^2}{4Dt}} * f(x)$$

where $*$ denotes the convolution product.

Hence, the solution of the heat equation in $(-\infty, +\infty)$ with initial condition $f(x)$ is:

$$c(x,t) = \int_{-\infty}^{\infty}\sqrt{\frac{1}{4\pi Dt}}\, e^{-\frac{z^2}{4Dt}}\, f(x-z)\, dz$$

Let us now consider the case of a Dirac initial condition, i.e.,

$$\begin{cases} \dfrac{\partial^2 c(x,t)}{\partial x^2} - \dfrac{1}{D}\dfrac{\partial c(x,t)}{\partial t} = 0 \\ c(x,t)|_{\pm\infty} = 0 \\ c(x,0) = \delta(x) \end{cases}$$


From the physical point of view, this is equivalent to considering a slab (very long and thin) with a very large amount of heat injected at $x = 0$ at time $t = 0$. The solution to this problem is given by:

$$c(x,t) = \int_{-\infty}^{\infty}\sqrt{\frac{1}{4\pi Dt}}\, e^{-\frac{z^2}{4Dt}}\, \delta(x-z)\, dz$$

By using the property $\int_{-\infty}^{\infty} f(x)\delta(x - x_0)\, dx = f(x_0)$, we obtain

$$c(x,t) = \frac{1}{\sqrt{4\pi Dt}}\, e^{-\frac{x^2}{4Dt}}$$

This is usually referred to as the Green function or heat kernel of the heat equation.

The heat kernel is a Gaussian, and as $t$ increases it becomes wider and wider, since $\sigma_t = \sqrt{2Dt}$. The height of the Gaussian is equal to:

$$c(0,t) = \frac{1}{\sqrt{4\pi Dt}}\, e^0 = \frac{1}{\sqrt{4\pi Dt}}$$

As $t \to 0$, the height goes to $+\infty$, and this is coherent with the initial condition $c(x,0) = \delta(x)$. As $t$ increases, the height of the Gaussian reduces (for $t \to \infty$, $c(0,t) \to 0$).

If $t = 0$, we have $c(x,0) = \delta(x)$. If $t = \varepsilon > 0$, we have

$$c(x,\varepsilon) = \frac{1}{\sqrt{4\pi D\varepsilon}}\, e^{-\frac{x^2}{4D\varepsilon}}$$


We start from a situation where the heat is concentrated at $x = 0$ at $t = 0$. At $t = \varepsilon > 0$ (small), the heat has already diffused all over the slab ($c(x,\varepsilon) \neq 0$ for all $x \in (-\infty,\infty)$). This is why we say that parabolic equations model phenomena with infinite propagation speed. For $t \to \infty$, the heat is completely diffused, and $c(x,t)$ tends to a constant.

The heat kernel is important since it can be used in combination with the superposition principle of linear systems (the heat equation and the Fourier transform are both linear) to solve the heat equation with a generic initial condition.
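A short numerical sketch of the convolution solution (an addition, not from the notes): the initial profile is smoothed by the heat kernel, whose total mass stays equal to 1, while the total heat is conserved. The values of $D$, $t$, and the initial condition are illustrative assumptions.

```python
import numpy as np

D = 0.5                                    # diffusion coefficient (assumption)
x, dx = np.linspace(-20.0, 20.0, 2001, retstep=True)

def kernel(z, t):
    # Heat kernel (Green function) of the 1D heat equation
    return np.exp(-z**2 / (4 * D * t)) / np.sqrt(4 * np.pi * D * t)

f = np.where(np.abs(x) < 1.0, 1.0, 0.0)    # illustrative initial condition

def c(t):
    # c(x,t) = integral of kernel(x - z, t) * f(z) dz, discretized as a sum
    return np.array([np.sum(kernel(xi - x, t) * f) * dx for xi in x])

for t in (0.1, 1.0, 5.0):
    ct = c(t)
    print(t, np.sum(kernel(x, t)) * dx,    # kernel mass stays ~ 1
          np.sum(ct) * dx,                 # total heat ~ conserved
          ct.max())                        # peak decreases as t grows
```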

4.4.2. Solution of the heat equation using the Fourier transform

The Fourier transform can be directly used, as an alternative to the separation of variables, to find the solution of the heat equation. Let us consider again the heat equation on the infinite line:

$$\begin{cases} \dfrac{\partial^2 c(x,t)}{\partial x^2} - \dfrac{1}{D}\dfrac{\partial c(x,t)}{\partial t} = 0, & x \in (-\infty, +\infty) \\ c(\pm\infty, t) = 0 \\ c(x,0) = f(x) \end{cases}$$

We can write $c(x,t)$ as the Fourier transform of an unknown function $Q(y,t)$:

$$c(x,t) = \int_{-\infty}^{\infty} e^{-2\pi i x y}\, Q(y,t)\, dy = F\big(Q(y,t)\big)$$

By substituting the previous expression in the PDE, we get:

$$\frac{\partial^2 c}{\partial x^2} = \frac{\partial^2}{\partial x^2}F\big(Q(y,t)\big) = F\big(-4\pi^2 y^2 Q(y,t)\big)$$

$$\frac{\partial c}{\partial t} = \frac{\partial}{\partial t}F\big(Q(y,t)\big) = F\big(\dot{Q}(y,t)\big)$$


$$F\big(-4\pi^2 y^2 Q(y,t)\big) - \frac{1}{D}F\big(\dot{Q}(y,t)\big) = 0$$

Owing to the linearity of the Fourier transform, we get:

$$F\left(-4\pi^2 y^2 Q(y,t) - \frac{1}{D}\dot{Q}(y,t)\right) = 0$$

Applying the inverse Fourier transform, we obtain an ODE:

$$-4\pi^2 y^2 Q(y,t) - \frac{1}{D}\dot{Q}(y,t) = 0$$

By using the Fourier transform, starting from a PDE we have reduced the problem to an ODE with unknown $Q(y,t)$. We can solve it easily:

$$\dot{Q}(y,t) = -4\pi^2 y^2 D\, Q(y,t)$$

$$\int\frac{dQ}{Q} = \int -4\pi^2 y^2 D\, dt \;\Rightarrow\; \ln\frac{Q}{Q_0} = -4\pi^2 y^2 Dt$$

where $Q_0$ is a constant coming from the integration (it is constant with respect to $t$; in general, it may depend on $y$). Thus, we can write:

$$Q(y,t) = Q_0(y)\, e^{-4\pi^2 y^2 Dt}$$

$$c(x,t) = F\big(Q(y,t)\big) = \int_{-\infty}^{\infty} e^{-2\pi i x y}\, Q_0(y)\, e^{-4\pi^2 y^2 Dt}\, dy$$

We have found the same solution as with the separation of variables by using the Fourier transform. This is not surprising, since the solution is unique.

Thus, we can conclude that the solution of the heat equation can be found both with the Fourier transform and with the separation of variables, i.e., the two techniques are equivalent.

4.5. Solution of hyperbolic equations

As said, hyperbolic equations model phenomena propagating with finite propagation speed. For the wave/D’Alembert equation we have seen the technique of the separation of variables to find a solution for the oscillating string. Now, we investigate another solution technique, which is specific for


hyperbolic equations (it cannot be used for other types of equations), called the method of characteristics.

4.5.1. The method of characteristics

Consider the 1D wave equation on the infinite line:

$$\frac{\partial^2\varphi}{\partial x^2} - \frac{1}{v^2}\frac{\partial^2\varphi}{\partial t^2} = 0, \qquad v > 0$$

with suitable initial and boundary conditions, for $x \in (-\infty, +\infty)$.

We perform a change of variables:

$$\begin{cases} \alpha = x - vt \\ \beta = x + vt \end{cases}$$

where $\alpha, \beta$ are the new variables. They have no physical meaning, and are called characteristic coordinates. The mapping between $(x,t)$ and $(\alpha,\beta)$ is invertible, i.e.,

$$\begin{cases} x = \dfrac{\beta + \alpha}{2} \\ t = \dfrac{\beta - \alpha}{2v} \end{cases}$$

Now, let us write the equation in terms of the new variables:

$$\frac{\partial\varphi(\alpha(x,t), \beta(x,t))}{\partial x} = \frac{\partial\varphi}{\partial\alpha}\frac{\partial\alpha}{\partial x} + \frac{\partial\varphi}{\partial\beta}\frac{\partial\beta}{\partial x} = \frac{\partial\varphi}{\partial\alpha} + \frac{\partial\varphi}{\partial\beta}$$

$$\frac{\partial\varphi(\alpha(x,t), \beta(x,t))}{\partial t} = \frac{\partial\varphi}{\partial\alpha}\frac{\partial\alpha}{\partial t} + \frac{\partial\varphi}{\partial\beta}\frac{\partial\beta}{\partial t} = v\left(\frac{\partial\varphi}{\partial\beta} - \frac{\partial\varphi}{\partial\alpha}\right)$$

Thus, we have:

$$\frac{\partial}{\partial x} = \frac{\partial}{\partial\alpha} + \frac{\partial}{\partial\beta}; \qquad \frac{\partial}{\partial t} = -v\frac{\partial}{\partial\alpha} + v\frac{\partial}{\partial\beta}$$

and the wave equation becomes

$$\frac{\partial}{\partial x}\frac{\partial}{\partial x}\varphi - \frac{1}{v^2}\frac{\partial}{\partial t}\frac{\partial}{\partial t}\varphi = 0$$

Hence, the equation can be written as:

$$\left(\frac{\partial^2\varphi}{\partial\alpha^2} + \frac{\partial^2\varphi}{\partial\alpha\partial\beta} + \frac{\partial^2\varphi}{\partial\beta\partial\alpha} + \frac{\partial^2\varphi}{\partial\beta^2}\right) - \frac{1}{v^2}\left(v^2\frac{\partial^2\varphi}{\partial\alpha^2} - v^2\frac{\partial^2\varphi}{\partial\alpha\partial\beta} - v^2\frac{\partial^2\varphi}{\partial\beta\partial\alpha} + v^2\frac{\partial^2\varphi}{\partial\beta^2}\right) = 0$$

$$\frac{\partial^2\varphi}{\partial\alpha^2} + \frac{\partial^2\varphi}{\partial\alpha\partial\beta} + \frac{\partial^2\varphi}{\partial\beta\partial\alpha} + \frac{\partial^2\varphi}{\partial\beta^2} - \frac{\partial^2\varphi}{\partial\alpha^2} + \frac{\partial^2\varphi}{\partial\alpha\partial\beta} + \frac{\partial^2\varphi}{\partial\beta\partial\alpha} - \frac{\partial^2\varphi}{\partial\beta^2} = 0$$

$$2\frac{\partial^2\varphi}{\partial\alpha\partial\beta} + 2\frac{\partial^2\varphi}{\partial\beta\partial\alpha} = 0$$

Owing to the Schwarz theorem, the mixed derivatives are equal, hence we get:

$$4\frac{\partial^2\varphi}{\partial\alpha\partial\beta} = 0 \;\Rightarrow\; \frac{\partial^2\varphi}{\partial\alpha\partial\beta} = 0$$

Using the characteristic variables, we have greatly simplified the original PDE. The solution can be easily found:

$$\frac{\partial^2\varphi}{\partial\alpha\partial\beta} = 0 \;\Rightarrow\; \frac{\partial}{\partial\alpha}\left(\frac{\partial\varphi}{\partial\beta}\right) = 0$$

where $\varphi = \varphi(\alpha,\beta)$. Since $\frac{\partial}{\partial\alpha}(\cdot) = 0$, the quantity $(\cdot)$ depends only on $\beta$, i.e.,

$$\frac{\partial\varphi}{\partial\beta} = A(\beta)$$

By integrating in $\beta$, we get:

$$\varphi(\alpha,\beta) = \int A(\beta)\, d\beta + G(\alpha)$$

Since the integral depends only on $\beta$, we can write $\int A(\beta)\, d\beta = H(\beta)$. The term $G(\alpha)$ is the integration constant with respect to $\beta$, which, in general, may depend on $\alpha$. Thus, we get:

$$\varphi(\alpha,\beta) = H(\beta) + G(\alpha)$$

Using the coordinates $x, t$, we can write:

$$\varphi(x,t) = H(x + vt) + G(x - vt)$$

which is the general solution of the D'Alembert equation on the infinite line. It satisfies the PDE, but not yet the initial and boundary conditions. The function $\varphi$ is made up of two terms, $H$ and $G$, which are called left-travelling wave and right-travelling wave, respectively.

Consider $G(x - vt)$ at $t = 0$ and at $t = t^* > 0$.


The shape $G(x)$ moves to the right by a quantity equal to $vt^*$, where $v$ represents a finite propagation speed. The same holds for $H(x + vt)$, but the movement is toward the left, with speed $v$. The shapes $H$ and $G$ never change: there are no dispersive phenomena. On the contrary, in the heat equation (parabolic equation) we had changes in the shape due to dispersion (starting from a Dirac function, we obtained Gaussians).

The function $\varphi$ is the sum of two waves that may interact with each other, and many interactions may occur between $G$ and $H$.


Now, consider again the wave equation with generic initial conditions:

$$\begin{cases} \dfrac{\partial^2\varphi}{\partial x^2} - \dfrac{1}{v^2}\dfrac{\partial^2\varphi}{\partial t^2} = 0 \\ \varphi(x,0) = f(x) \\ \dot{\varphi}(x,0) = g(x) \end{cases}$$

We introduce no boundary conditions, since we have waves propagating up to $-\infty$ and $+\infty$.

The solution is given by:

$$\varphi(x,t) = H(x + vt) + G(x - vt)$$

By substituting the initial conditions, we get:

$$\varphi(x,0) = H(x) + G(x) = f(x)$$

$$\frac{\partial\varphi(x,0)}{\partial t} = g(x) = \left[H'(x + vt)v - G'(x - vt)v\right]\Big|_{t=0}$$

where $H'$ and $G'$ are total derivatives with respect to $x + vt$ and $x - vt$, respectively. Thus, we can write:

$$\begin{cases} H(x) = -G(x) + f(x) \\ H'(x)v - G'(x)v = g(x) \end{cases}$$

By differentiating the first equation and substituting it into the second one, we get

$$f'(x)v = g(x) + 2G'(x)v \;\Rightarrow\; G'(x) = \frac{1}{2}f'(x) - \frac{1}{2v}g(x)$$

$$G(x) = \frac{1}{2}f(x) - \frac{1}{2v}\int_c^x g(s)\, ds, \qquad H(x) = \frac{1}{2}f(x) + \frac{1}{2v}\int_c^x g(s)\, ds$$

where $c$ is a generic constant that takes into account the integration constant. Hence, we obtain:


$$\varphi(x,t) = H(x + vt) + G(x - vt) = \frac{1}{2}f(x + vt) + \frac{1}{2}f(x - vt) + \frac{1}{2v}\int_c^{x + vt} g(s)\, ds - \frac{1}{2v}\int_c^{x - vt} g(s)\, ds$$

and therefore,

$$\varphi(x,t) = \frac{1}{2}\left[f(x + vt) + f(x - vt)\right] + \frac{1}{2v}\int_{x - vt}^{x + vt} g(s)\, ds$$

This is the particular solution of the wave equation satisfying the initial conditions (the D'Alembert formula). It is still composed of two waves: one left-travelling and the other right-travelling.
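The formula above can be checked numerically (an added sketch, not in the original notes); the pulse $f$, the speed $v$, and the grid are illustrative assumptions, and $g = 0$ for simplicity.

```python
import numpy as np

v = 2.0                                   # propagation speed (assumption)

def f(x):
    # Illustrative initial shape: a Gaussian pulse centered at 0
    return np.exp(-x**2)

def phi(x, t):
    # D'Alembert formula with zero initial velocity (g = 0)
    return 0.5 * (f(x + v * t) + f(x - v * t))

x = np.linspace(-10.0, 10.0, 2001)
print(np.allclose(phi(x, 0.0), f(x)))     # initial condition satisfied
# At t = 2 the pulse has split into two half-height copies at x = -4 and x = 4
print(phi(np.array([-4.0, 0.0, 4.0]), 2.0))
```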

Let us now consider the wave equation with $\delta(x)$ as initial condition:

$$\begin{cases} \dfrac{\partial^2\varphi}{\partial x^2} - \dfrac{1}{v^2}\dfrac{\partial^2\varphi}{\partial t^2} = 0 \\ \varphi(x,0) = q\delta(x) \\ \dot{\varphi}(x,0) = 0 \end{cases} \;\Rightarrow\; \begin{cases} f(x) = q\delta(x) \\ g(x) = 0 \end{cases}$$

where $q > 0$ is a constant. We can exploit the solution obtained above:

where 𝑞 > 0 is a constant. We can exploit the solution obtained above:

𝜑(𝑥, 𝑡) =𝑞2[𝛿(𝑥 + 𝑣𝑡) + 𝛿(𝑥 − 𝑣𝑡)] +

12𝑣� 0𝑑𝑠1#�"

1C�"=𝑞2[𝛿(𝑥 + 𝑣𝑡) + 𝛿(𝑥 − 𝑣𝑡)]

Thus, starting from 𝛿(𝑥) at 𝑡 = 0, for 𝑡 = 𝑡 ̅ > 0 we have two Dirac functions traveling in opposite direction with speed 𝑣 and with half of the initial energy 𝑞.

This situation is completely different from the parabolic case:


5. Appendix

5.1. Useful mathematical formulas

Let us consider the gradient, divergence, and Laplacian operators:

$$\nabla\varphi = \left(\frac{\partial\varphi}{\partial x_1}, \frac{\partial\varphi}{\partial x_2}, \frac{\partial\varphi}{\partial x_3}\right)$$

$$\operatorname{div}\boldsymbol{v} = \frac{\partial v_1}{\partial x_1} + \frac{\partial v_2}{\partial x_2} + \frac{\partial v_3}{\partial x_3}$$

$$\nabla\cdot\nabla\varphi = \Delta\varphi = \frac{\partial^2\varphi}{\partial x_1^2} + \frac{\partial^2\varphi}{\partial x_2^2} + \frac{\partial^2\varphi}{\partial x_3^2}$$

According to the Gauss theorem, we have:

$$\iiint_\Omega \operatorname{div}\boldsymbol{v}\, d^3x = \iint_{\partial\Omega} \boldsymbol{v}\cdot\boldsymbol{n}\, dS = \Phi_{\partial\Omega}(\boldsymbol{v})$$

where $d^3x = dx_1\, dx_2\, dx_3$, $\Omega \subset \mathbb{R}^3$, the vector $\boldsymbol{n}$ is the unit normal vector directed outward from the surface, $\Phi_{\partial\Omega}(\boldsymbol{v})$ is the flux of $\boldsymbol{v}$ through $\partial\Omega$, and the dot $\cdot$ denotes the scalar product.

The expressions of the operators $\operatorname{div}$ (divergence) and $\Delta$ (Laplacian) depend on the coordinate system. In polar coordinates, we have:


$$\Delta\varphi(r,\vartheta) = \frac{\partial^2\varphi}{\partial r^2} + \frac{1}{r}\frac{\partial\varphi}{\partial r} + \frac{1}{r^2}\frac{\partial^2\varphi}{\partial\vartheta^2}$$

In cylindrical coordinates, we have:

$$\Delta\varphi(r,\vartheta,z) = \frac{\partial^2\varphi}{\partial r^2} + \frac{1}{r}\frac{\partial\varphi}{\partial r} + \frac{1}{r^2}\frac{\partial^2\varphi}{\partial\vartheta^2} + \frac{\partial^2\varphi}{\partial z^2}$$

From the Gauss theorem, if $\boldsymbol{v} = f\nabla g$, with $f$ and $g$ suitable functions, it is possible to derive the first Green identity:

$$\iiint_\Omega (\nabla f\cdot\nabla g + f\Delta g)\, d^3x = \iint_{\partial\Omega} f\,\frac{\partial g}{\partial\boldsymbol{n}}\, dS$$

From the Gauss theorem, if $\boldsymbol{v} = f\nabla g - g\nabla f$, it is possible to derive the second Green identity:

$$\iiint_\Omega (f\Delta g - g\Delta f)\, d^3x = \iint_{\partial\Omega}\left(f\,\frac{\partial g}{\partial\boldsymbol{n}} - g\,\frac{\partial f}{\partial\boldsymbol{n}}\right) dS$$

The polar and cylindrical coordinates used above are defined by

$$\begin{cases} x_1 = r\cos\vartheta \\ x_2 = r\sin\vartheta \end{cases} \qquad \begin{cases} r = \sqrt{x_1^2 + x_2^2} \\ \vartheta = \operatorname{atan}\dfrac{x_2}{x_1} \end{cases} \qquad\text{and}\qquad \begin{cases} x_1 = r\cos\vartheta \\ x_2 = r\sin\vartheta \\ x_3 = z \end{cases}$$


5.2. Basic concepts on the Fourier series

Given a function $f(\vartheta)$ such that $f(\vartheta) = f(\vartheta + 2\pi)$, the Fourier series expansion of $f(\vartheta)$ is given by:

$$f(\vartheta) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\left(a_n\cos n\vartheta + b_n\sin n\vartheta\right)$$

$$a_0 = \frac{1}{\pi}\int_0^{2\pi} f(\vartheta)\, d\vartheta$$

$$a_n = \frac{1}{\pi}\int_0^{2\pi} f(\vartheta)\cos n\vartheta\, d\vartheta$$

$$b_n = \frac{1}{\pi}\int_0^{2\pi} f(\vartheta)\sin n\vartheta\, d\vartheta$$

If $f$ is periodic with period $2L$, we have:

$$f(\vartheta) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\left[a_n\cos\left(\frac{n\pi\vartheta}{L}\right) + b_n\sin\left(\frac{n\pi\vartheta}{L}\right)\right]$$

$$a_0 = \frac{1}{L}\int_0^{2L} f(\vartheta)\, d\vartheta$$

$$a_n = \frac{1}{L}\int_0^{2L} f(\vartheta)\cos\left(\frac{n\pi\vartheta}{L}\right) d\vartheta$$

$$b_n = \frac{1}{L}\int_0^{2L} f(\vartheta)\sin\left(\frac{n\pi\vartheta}{L}\right) d\vartheta$$
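A quick numerical sanity check of these formulas (added here, not part of the notes): the coefficients of a square wave computed by quadrature match the known values $4/(n\pi)$ for odd $n$; the sampling grid is an illustrative choice.

```python
import numpy as np

# Fourier coefficients of a 2*pi-periodic square wave f = sign(sin(theta))
th, dth = np.linspace(0.0, 2 * np.pi, 20000, endpoint=False, retstep=True)
f = np.sign(np.sin(th))

def a_n(n):
    return np.sum(f * np.cos(n * th)) * dth / np.pi

def b_n(n):
    return np.sum(f * np.sin(n * th)) * dth / np.pi

for n in range(1, 6):
    print(n, round(a_n(n), 6), round(b_n(n), 6))
# b_n is close to 4/(n*pi) for odd n and to 0 for even n; a_n is close to 0
```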

5.3. Basic concepts on the Fourier transform

Given a function $g(x)$, we define its Fourier transform as:

$$\hat{g}(y) = F(g)(y) = \int_{-\infty}^{\infty} g(x)\, e^{-2\pi i x y}\, dx$$

Alternative definitions of the Fourier transform are the following:

$$\hat{g}(y) = \int_{-\infty}^{\infty} g(x)\, e^{-i x y}\, dx$$

$$\hat{g}(y) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} g(x)\, e^{i x y}\, dx$$

In general, the Fourier transform of 𝑔(𝑥) exists if 𝑔(𝑥) is a rapidly decreasing function. For us, this hypothesis will always be satisfied.

Properties of the Fourier transform:

• Linearity: $F\big(af(x) + bg(x)\big) = aF\big(f(x)\big) + bF\big(g(x)\big)$.

• Inverse Fourier transform:

$$F^{-1}\big(\hat{g}\big)(x) = \int_{-\infty}^{\infty} \hat{g}(y)\, e^{2\pi i x y}\, dy = g(x)$$

• Derivative of the Fourier transform:

$$\frac{d}{dy}F\big(g(x)\big) = -2\pi i\, F\big(x g(x)\big)$$

$$\frac{d^p}{dy^p}F\big(g(x)\big) = (-2\pi i)^p\, F\big(x^p g(x)\big)$$

• Given $f(x)$ and $g(x)$, we define the convolution product $f * g$ as:

$$(f * g)(x) = \int_{-\infty}^{\infty} f(z)\, g(x - z)\, dz$$

$$F\big(f(x)g(x)\big) = F\big(f(x)\big) * F\big(g(x)\big)$$

$$F\big(f(x) * g(x)\big) = F\big(f(x)\big)\, F\big(g(x)\big)$$
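The convolution theorem can be verified in its discrete form with the FFT (an added illustration; the discrete analogue uses circular convolution, and the arrays below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.standard_normal(64)
g = rng.standard_normal(64)

# Circular (periodic) convolution computed directly from the definition
n = len(f)
direct = np.array([sum(f[k] * g[(j - k) % n] for k in range(n)) for j in range(n)])

# Same convolution via the discrete convolution theorem: F(f*g) = F(f) F(g)
via_fft = np.real(np.fft.ifft(np.fft.fft(f) * np.fft.fft(g)))

print(np.max(np.abs(direct - via_fft)))   # tiny: the two coincide up to roundoff
```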

• The Fourier transform of a Gaussian function is a Gaussian function:

$$F\left(e^{-\alpha x^2}\right) = \sqrt{\frac{\pi}{\alpha}}\, e^{-\frac{\pi^2 y^2}{\alpha}}$$


The original Gaussian $e^{-\alpha x^2}$ has standard deviation $\sigma = \frac{1}{\sqrt{2\alpha}}$, while the transformed Gaussian has

$$\hat{\sigma} = \frac{1}{\sqrt{2\pi^2/\alpha}} = \sqrt{\frac{\alpha}{2}}\,\frac{1}{\pi}$$

$$\sigma\hat{\sigma} = \frac{1}{\sqrt{2\alpha}}\sqrt{\frac{\alpha}{2}}\,\frac{1}{\pi} = \frac{1}{2\pi}$$

The product $\sigma\hat{\sigma}$ is constant. As a consequence, $\hat{\sigma} = \frac{1}{2\pi\sigma}$, i.e., there is an inverse proportionality between the width of the original Gaussian and the width of the transformed one. If the original Gaussian is "large", the transformed Gaussian is "narrow", and vice versa. This phenomenon is known in the literature as the uncertainty principle.
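This Gaussian transform pair can be checked by direct numerical integration (an added sketch; the value of $\alpha$ and the grid are arbitrary):

```python
import numpy as np

alpha = 1.3                                # arbitrary positive constant
x, dx = np.linspace(-30.0, 30.0, 60001, retstep=True)
g = np.exp(-alpha * x**2)

def fourier(y):
    # F(g)(y) = integral of g(x) exp(-2 pi i x y) dx, by quadrature;
    # g is even, so the transform reduces to a cosine integral
    return np.sum(g * np.cos(2 * np.pi * x * y)) * dx

for y in (0.0, 0.2, 0.5):
    exact = np.sqrt(np.pi / alpha) * np.exp(-np.pi**2 * y**2 / alpha)
    print(y, fourier(y), exact)            # the two columns agree
```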

5.4. The Dirac function

The Dirac function (or delta function) is a "generalized" function (a distribution) such that:


$$\begin{cases} \delta(x) = \begin{cases} \infty & \text{if } x = 0 \\ 0 & \text{if } x \neq 0 \end{cases} \\[1ex] \displaystyle\int_{-\infty}^{\infty} \delta(x)\, dx = 1 \end{cases}$$

Sometimes, $\delta(x)$ is pictorially represented with an arrow placed at $x = 0$.

The Dirac function can also be defined using functionals, i.e., maps that take a function $f(x)$ as input and return a number as output. In the case of the Dirac function, given $f(x)$ as input, the output is the value of $f(x)$ at $x = 0$:

$$\Delta\big(f(x)\big) = f(0) \;\Rightarrow\; \int_{-\infty}^{\infty} \delta(x)f(x)\, dx = f(0), \qquad \int_{-\infty}^{\infty} \delta(x - x_0)f(x)\, dx = f(x_0)$$

with $\delta(x) = 0$ for all $x \neq 0$.

It is possible to compute the Fourier transform of $\delta(x)$, even though it is not a rapidly decreasing function. We have:

$$F\big(\delta(x)\big) = \int_{-\infty}^{\infty} e^{-2\pi i x y}\, \delta(x)\, dx = e^0 = 1$$

$$F(1) = \delta(y)$$

The function $\delta$ is very narrow, and its Fourier transform is very wide (as in the case of the Gaussian functions).


6. References

[1] D.P. Bertsekas, “Dynamic Programming and Optimal Control”, Athena Scientific, 2005.

[2] F.S. Hillier, G.J. Lieberman, “Introduction to Operations Research”, McGraw-Hill, 2001.

[3] R. Courant, D. Hilbert, “Methods of Mathematical Physics”, Interscience Publishers, 1973.

[4] R. Bracewell, "The Fourier Transform and its Applications", McGraw-Hill, 1999.

[5] P.V. O’Neil, “Advanced Engineering Mathematics”, Brooks Cole, 2003.