AM 205: lecture 16
- Last time: hyperbolic PDEs
- Today: parabolic and elliptic PDEs, introduction to optimization
The Wave Equation
We now briefly return to the wave equation:
$$u_{tt} - c^2 u_{xx} = 0$$
In one spatial dimension, this models, say, vibrations in a taut string
The Wave Equation
Many schemes have been proposed for the wave equation
One good option is to use central difference approximations¹ for both $u_{tt}$ and $u_{xx}$:
$$\frac{U^{n+1}_j - 2U^n_j + U^{n-1}_j}{\Delta t^2} - c^2\,\frac{U^n_{j+1} - 2U^n_j + U^n_{j-1}}{\Delta x^2} = 0$$
Key points:
- Truncation error analysis ⟹ second-order accurate
- Fourier stability analysis ⟹ stable for $0 \le c\Delta t/\Delta x \le 1$
- Two-step method in time, need a one-step method to "get started"
¹Can arrive at the same result by discretizing the equivalent first-order system
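To make the scheme concrete, here is a minimal Python sketch (not the course demo); the domain, wave speed, initial condition, and the zero-initial-velocity Taylor starter are all illustrative assumptions:

```python
import numpy as np

# Sketch of the central-difference (leapfrog) scheme for
# u_tt - c^2 u_xx = 0 on [0,1] with u = 0 at both ends.
c, nx = 1.0, 101
x = np.linspace(0.0, 1.0, nx)
dx = x[1] - x[0]
dt = 0.5*dx/c                  # CFL number c*dt/dx = 0.5 <= 1
mu2 = (c*dt/dx)**2

u_prev = np.sin(np.pi*x)       # U^0: illustrative initial condition
# One-step starter for U^1: Taylor expansion, assuming u_t(0, x) = 0
u = u_prev.copy()
u[1:-1] += 0.5*mu2*(u_prev[2:] - 2*u_prev[1:-1] + u_prev[:-2])

for n in range(200):           # leapfrog time steps
    u_next = np.empty_like(u)
    u_next[1:-1] = (2*u[1:-1] - u_prev[1:-1]
                    + mu2*(u[2:] - 2*u[1:-1] + u[:-2]))
    u_next[0] = u_next[-1] = 0.0   # Dirichlet boundary conditions
    u_prev, u = u, u_next
```

Note the starter step: a Taylor expansion supplies $U^1$ from the initial data, which plays the role of the "one-step method to get started" mentioned above.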
Parabolic PDEs
The Heat Equation
The canonical parabolic equation is the heat equation
$$u_t - \alpha u_{xx} = f(t, x),$$
where α is the thermal diffusivity
In this section, we shall omit α for convenience
Note that this is an Initial-Boundary Value Problem:
- We impose an initial condition $u(0, x) = u_0(x)$
- We impose boundary conditions on both sides of the domain
The Heat Equation
A natural idea would be to discretize $u_{xx}$ with a central difference, and employ the Euler method in time:
$$\frac{U^{n+1}_j - U^n_j}{\Delta t} - \frac{U^n_{j-1} - 2U^n_j + U^n_{j+1}}{\Delta x^2} = 0$$
Or we could use backward Euler in time:
$$\frac{U^{n+1}_j - U^n_j}{\Delta t} - \frac{U^{n+1}_{j-1} - 2U^{n+1}_j + U^{n+1}_{j+1}}{\Delta x^2} = 0$$
The Heat Equation
Or we could do something “halfway in between”:
$$\frac{U^{n+1}_j - U^n_j}{\Delta t} - \frac{1}{2}\,\frac{U^{n+1}_{j-1} - 2U^{n+1}_j + U^{n+1}_{j+1}}{\Delta x^2} - \frac{1}{2}\,\frac{U^n_{j-1} - 2U^n_j + U^n_{j+1}}{\Delta x^2} = 0$$
This is called the Crank–Nicolson method²
In fact, it is common to consider a 1-parameter "family" of methods that includes all of the above: the θ-method
$$\frac{U^{n+1}_j - U^n_j}{\Delta t} - \theta\,\frac{U^{n+1}_{j-1} - 2U^{n+1}_j + U^{n+1}_{j+1}}{\Delta x^2} - (1 - \theta)\,\frac{U^n_{j-1} - 2U^n_j + U^n_{j+1}}{\Delta x^2} = 0$$
where θ ∈ [0, 1]
²From a paper by Crank and Nicolson in 1947; note: "Nicolson" is not a typo!
The Heat Equation
With the θ-method:
- θ = 0 ⟹ Euler
- θ = 1/2 ⟹ Crank–Nicolson
- θ = 1 ⟹ backward Euler
For the θ-method, we can
1. perform Fourier stability analysis
2. calculate the truncation error
The θ-Method: Stability
Fourier stability analysis: set $U^n_j(k) = \lambda(k)^n e^{ik(j\Delta x)}$ to get

$$\lambda(k) = \frac{1 - 4(1 - \theta)\mu \sin^2\left(\frac{1}{2}k\Delta x\right)}{1 + 4\theta\mu \sin^2\left(\frac{1}{2}k\Delta x\right)}$$

where $\mu \equiv \Delta t/(\Delta x)^2$
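As a quick numerical sanity check (not in the original slides), we can tabulate $\max_k |\lambda(k)|$ for a few illustrative values of θ and µ:

```python
import numpy as np

# Amplification factor lambda(k) of the theta-method, with mu = dt/dx^2
def amplification(theta, mu, k_dx):
    s = np.sin(0.5*k_dx)**2
    return (1 - 4*(1 - theta)*mu*s) / (1 + 4*theta*mu*s)

k_dx = np.linspace(0.0, np.pi, 50)   # k*dx over representable modes
for theta, mu in [(0.0, 0.4), (0.0, 0.6), (0.5, 10.0), (1.0, 10.0)]:
    lam = amplification(theta, mu, k_dx)
    print(theta, mu, np.abs(lam).max())   # stable iff this is <= 1
```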
Here we cannot get λ(k) > 1, hence the only concern is λ(k) < −1

Let's find conditions for stability, i.e. we want λ(k) ≥ −1:
$$1 - 4(1 - \theta)\mu \sin^2\left(\tfrac{1}{2}k\Delta x\right) \ge -\left[1 + 4\theta\mu \sin^2\left(\tfrac{1}{2}k\Delta x\right)\right]$$
The θ-Method: Stability
Or equivalently:
$$4\mu(1 - 2\theta)\sin^2\left(\tfrac{1}{2}k\Delta x\right) \le 2$$
For θ ∈ [0.5, 1] this inequality is always satisfied, hence the θ-method is unconditionally stable (i.e. stable independent of µ)
In the θ ∈ [0, 0.5) case, the "most unstable" Fourier mode is when k = π/∆x, since this maximizes the factor $\sin^2\left(\frac{1}{2}k\Delta x\right)$
The θ-Method: Stability
Note that this corresponds to the highest frequency mode that can be represented on our grid, since with k = π/∆x we have

$$e^{ik(j\Delta x)} = e^{\pi i j} = (e^{\pi i})^j = (-1)^j$$
The k = π/∆x mode:

[Figure: the sawtooth mode $e^{\pi i j} = (-1)^j$ plotted against $j = 0, 1, \ldots, 10$]
The θ-Method: Stability
This “sawtooth” mode is stable (and hence all modes are stable) if
$$4\mu(1 - 2\theta) \le 2 \iff \mu \le \frac{1}{2(1 - 2\theta)},$$
Hence for θ ∈ [0, 0.5), the θ-method is conditionally stable
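For example, θ = 0 (Euler in time) gives the familiar bound µ ≤ 1/2, i.e. $\Delta t \le (\Delta x)^2/2$, while any θ ≥ 0.5 removes the restriction entirely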
The θ-Method: Stability
[Figure: stability region in the (θ, µ) plane for θ ∈ [0, 0.5], µ ∈ [0, 45]]

For θ ∈ [0, 0.5), the θ-method is stable if µ is in the "green region," i.e. it approaches unconditional stability as θ → 0.5
The θ-Method: Stability
Note that if we set θ to a value in [0, 0.5), then the stability time-step restriction is quite severe: $\Delta t \le \frac{(\Delta x)^2}{2(1 - 2\theta)}$

Contrast this to the hyperbolic case, where we had $\Delta t \le \Delta x/c$

This is an indication that the system of ODEs that arises from spatially discretizing the heat equation is stiff
The θ-Method: Accuracy
The truncation error analysis is fairly involved, hence we just give the result:
$$T^n_j \equiv \frac{u^{n+1}_j - u^n_j}{\Delta t} - \theta\,\frac{u^{n+1}_{j-1} - 2u^{n+1}_j + u^{n+1}_{j+1}}{\Delta x^2} - (1 - \theta)\,\frac{u^n_{j-1} - 2u^n_j + u^n_{j+1}}{\Delta x^2}$$
$$= [u_t - u_{xx}] + \left[\left(\tfrac{1}{2} - \theta\right)\Delta t\, u_{xxt} - \tfrac{1}{12}(\Delta x)^2 u_{xxxx}\right] + \left[\tfrac{1}{24}(\Delta t)^2 u_{ttt} - \tfrac{1}{8}(\Delta t)^2 u_{xxtt}\right]$$
$$\qquad + \left[\tfrac{1}{12}\left(\tfrac{1}{2} - \theta\right)\Delta t(\Delta x)^2 u_{xxxxt} - \tfrac{2}{6!}(\Delta x)^4 u_{xxxxxx}\right] + \cdots$$
The term $u_t - u_{xx}$ in $T^n_j$ vanishes since $u$ solves the PDE
The θ-Method: Accuracy
Key point: This is a first-order method, unless θ = 1/2, in which case we get a second-order method!

The θ-method gives us consistency (at least first order) and stability (assuming ∆t is chosen appropriately when θ ∈ [0, 1/2))

Hence, from the Lax Equivalence Theorem, the method is convergent
The Heat Equation
Note that the heat equation models a diffusive process, hence it tends to smooth out discontinuities

Python demo: Heat equation with discontinuous initial condition
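The demo itself is not reproduced in this transcript; the following is a minimal sketch along the same lines, assuming Crank–Nicolson (θ = 1/2), homogeneous Dirichlet boundaries, and a step-function initial condition (all illustrative choices):

```python
import numpy as np

# Minimal theta-method solver for u_t = u_xx on [0,1] with u = 0 at
# both ends. theta = 0.5 is Crank-Nicolson; the step-function initial
# condition is chosen to show diffusion smoothing a discontinuity.
theta, nx, dt, nsteps = 0.5, 101, 1e-3, 300
x = np.linspace(0.0, 1.0, nx)
dx = x[1] - x[0]

# Second-difference matrix on the nx-2 interior nodes (dense for brevity)
m = nx - 2
D2 = (np.diag(-2.0*np.ones(m)) + np.diag(np.ones(m-1), 1)
      + np.diag(np.ones(m-1), -1)) / dx**2
I = np.eye(m)
A = I - theta*dt*D2        # implicit part, applied to U^{n+1}
B = I + (1 - theta)*dt*D2  # explicit part, applied to U^n

U = np.where((x > 0.4) & (x < 0.6), 1.0, 0.0)[1:-1]  # discontinuous u0
for n in range(nsteps):
    U = np.linalg.solve(A, B @ U)
```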
[Figure: surface plot of the computed solution u(t, x); the initial discontinuity is rapidly smoothed out]
This is very different from hyperbolic equations, e.g. the advection equation will just transport a discontinuity in $u_0$
Elliptic PDEs
Elliptic PDEs
The canonical elliptic PDE is the Poisson equation
In one dimension, for x ∈ [a, b], this is $-u''(x) = f(x)$ with boundary conditions at x = a and x = b

We have seen this problem already: the two-point boundary value problem!

(Recall that elliptic PDEs model steady-state behavior; there is no time-derivative)
Elliptic PDEs
In order to make this into a PDE, we need to consider more than one spatial dimension
Let $\Omega \subset \mathbb{R}^2$ denote our domain; then the Poisson equation for $(x, y) \in \Omega$ is

$$u_{xx} + u_{yy} = f(x, y)$$
This is generally written more succinctly as ∆u = f
We again need to impose boundary conditions (Dirichlet, Neumann, or Robin) on ∂Ω (recall ∂Ω denotes the boundary of Ω)
Elliptic PDEs
We will consider how to use a finite difference scheme to approximate this 2D Poisson equation
First, we introduce a uniform grid to discretize Ω
[Figure: uniform grid of nodes on the unit square, 0 ≤ x, y ≤ 1]
Elliptic PDEs
Let h = ∆x = ∆y denote the grid spacing
Then,
- $x_i = ih$, $i = 0, 1, 2, \ldots, n_x - 1$
- $y_j = jh$, $j = 0, 1, 2, \ldots, n_y - 1$
- $U_{i,j} \approx u(x_i, y_j)$

Then, we need to be able to approximate $u_{xx}$ and $u_{yy}$ on this grid
Natural idea: Use central difference approximation!
Elliptic PDEs
We have
$$u_{xx}(x_i, y_j) = \frac{u(x_{i-1}, y_j) - 2u(x_i, y_j) + u(x_{i+1}, y_j)}{h^2} + O(h^2),$$
and
$$u_{yy}(x_i, y_j) = \frac{u(x_i, y_{j-1}) - 2u(x_i, y_j) + u(x_i, y_{j+1})}{h^2} + O(h^2),$$
so that
$$u_{xx}(x_i, y_j) + u_{yy}(x_i, y_j) = \frac{u(x_i, y_{j-1}) + u(x_{i-1}, y_j) - 4u(x_i, y_j) + u(x_{i+1}, y_j) + u(x_i, y_{j+1})}{h^2} + O(h^2)$$
Elliptic PDEs
Hence we define our approximation to the Laplacian as
$$\frac{U_{i,j-1} + U_{i-1,j} - 4U_{i,j} + U_{i+1,j} + U_{i,j+1}}{h^2}$$
This corresponds to a “5-point stencil”
Elliptic PDEs
As usual, we represent the numerical solution as a vector $U \in \mathbb{R}^{n_x n_y}$

We want to construct a differentiation matrix $D_2 \in \mathbb{R}^{n_x n_y \times n_x n_y}$ that approximates the Laplacian
Question: How many non-zero diagonals will D2 have?
To construct $D_2$, we need to be able to relate the entries of the vector $U$ to the "2D grid-based values" $U_{i,j}$
Elliptic PDEs
Hence we need to number the nodes from 1 to $n_x n_y$: we number nodes along the "bottom row" first, then the second-bottom row, etc.

Let $G$ denote the mapping from the 2D indexing to the 1D indexing. From this numbering we have:

$$G(i, j; n_x) = j n_x + i, \quad \text{and hence} \quad U_{G(i,j;n_x)} = U_{i,j}$$
Elliptic PDEs
Let us focus on node (i, j) in our F.D. grid; this corresponds to entry $G(i, j; n_x)$ of $U$

Due to the 5-point stencil, row $G(i, j; n_x)$ of $D_2$ will only have non-zeros in columns
$$\begin{aligned}
G(i, j-1; n_x) &= G(i, j; n_x) - n_x, && (1)\\
G(i-1, j; n_x) &= G(i, j; n_x) - 1, && (2)\\
G(i, j; n_x) &= G(i, j; n_x), && (3)\\
G(i+1, j; n_x) &= G(i, j; n_x) + 1, && (4)\\
G(i, j+1; n_x) &= G(i, j; n_x) + n_x && (5)
\end{aligned}$$
- (2), (3), (4) give the same tridiagonal structure that we're used to from differentiation matrices on 1D domains
- (1), (5) give diagonals shifted by $\pm n_x$
Elliptic PDEs
For example, the sparsity pattern of $D_2$ when $n_x = n_y = 6$:
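One compact way to assemble such a $D_2$ (a sketch, not the course code; boundary rows are not treated specially here) is via Kronecker products of 1D second-difference matrices, using scipy.sparse; the offsets printed at the end confirm the five non-zero diagonals at $0, \pm 1, \pm n_x$:

```python
import scipy.sparse as sp

# Sketch: assemble the 5-point Laplacian D2 on an nx-by-ny grid with
# spacing h, via Kronecker products of 1D second-difference matrices.
nx = ny = 6
h = 1.0 / (nx - 1)

def second_diff(n):
    # 1D central second-difference matrix
    return sp.diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n, n))

D2 = (sp.kron(sp.identity(ny), second_diff(nx))
      + sp.kron(second_diff(ny), sp.identity(nx))) / h**2

coo = D2.tocoo()
print(sorted(set(coo.col - coo.row)))  # diagonal offsets: [-6, -1, 0, 1, 6]
```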
Elliptic PDEs
Python demo: Solve the Poisson equation
$$\Delta u = -\exp\left(-(x - 0.25)^2 - (y - 0.5)^2\right),$$
for $(x, y) \in \Omega = [0, 1]^2$ with $u = 0$ on $\partial\Omega$
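A sketch of such a solve (assuming the Kronecker-product construction above, restricted to interior nodes so that the zero Dirichlet data drops out; the grid size is arbitrary):

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Sketch: 5-point-stencil solve of  Laplacian(u) = f  on the unit square
# with u = 0 on the boundary. Only interior nodes are unknowns, so the
# zero Dirichlet data drops out of the right-hand side.
n = 50                                # interior nodes per direction
h = 1.0 / (n + 1)
pts = np.linspace(h, 1.0 - h, n)      # interior grid points
X, Y = np.meshgrid(pts, pts)          # X[j,i] = x_i, Y[j,i] = y_j

T = sp.diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n, n))
D2 = (sp.kron(sp.identity(n), T) + sp.kron(T, sp.identity(n))) / h**2

f = -np.exp(-(X - 0.25)**2 - (Y - 0.5)**2)
U = spla.spsolve(D2.tocsr(), f.ravel())   # U[j*n + i] ~ u(x_i, y_j)
print(U.max())                            # peak value of the solution
```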
[Figure: contour plot of the computed solution u over the unit square; contour levels range from about 0.01 to 0.06]
Nonlinear Equations and Optimization
Motivation: Nonlinear Equations
So far we have mostly focused on linear phenomena
- Interpolation leads to a linear system $Vb = y$ (monomials) or $Ib = y$ (Lagrange polynomials)
- Linear least-squares leads to the normal equations $A^TAb = A^Ty$
- We saw examples of linear physical models (Ohm's Law, Hooke's Law, Leontief equations) ⟹ $Ax = b$
- F.D. discretization of a linear PDE leads to a linear algebraic system $AU = F$
Motivation: Nonlinear Equations

Of course, nonlinear models also arise all the time
- Nonlinear least-squares, Gauss–Newton/Levenberg–Marquardt
- Countless nonlinear physical models in nature, e.g. non-Hookean material models³
- F.D. discretization of a nonlinear PDE leads to a nonlinear algebraic system
³Important in modeling large deformations of solids
Motivation: Nonlinear Equations
Another example is computation of Gauss quadrature points/weights
We know this is possible via roots of Legendre polynomials
But we could also try to solve the nonlinear system of equations for $(x_1, w_1), (x_2, w_2), \ldots, (x_n, w_n)$
Motivation: Nonlinear Equations
e.g. for n = 2, we need to find points/weights such that all polynomials of degree ≤ 3 are integrated exactly, hence
$$w_1 + w_2 = \int_{-1}^{1} 1\, dx = 2$$
$$w_1 x_1 + w_2 x_2 = \int_{-1}^{1} x\, dx = 0$$
$$w_1 x_1^2 + w_2 x_2^2 = \int_{-1}^{1} x^2\, dx = 2/3$$
$$w_1 x_1^3 + w_2 x_2^3 = \int_{-1}^{1} x^3\, dx = 0$$
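For instance, this small system can be handed to a black-box nonlinear solver (a sketch using scipy.optimize.fsolve; the starting guess is arbitrary):

```python
from scipy.optimize import fsolve

# Unknowns z = (x1, x2, w1, w2); one equation per monomial 1, x, x^2, x^3
def F(z):
    x1, x2, w1, w2 = z
    return [w1 + w2 - 2.0,
            w1*x1 + w2*x2,
            w1*x1**2 + w2*x2**2 - 2.0/3.0,
            w1*x1**3 + w2*x2**3]

z0 = [-0.5, 0.5, 1.0, 1.0]        # arbitrary starting guess
print(fsolve(F, z0))               # expect x = -+1/sqrt(3), w = (1, 1)
```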
Motivation: Nonlinear Equations
We usually write a nonlinear system of equations as
$$F(x) = 0,$$
where $F : \mathbb{R}^n \to \mathbb{R}^m$
We implicitly absorb the "right-hand side" into F and seek a root of F

In this Unit we focus on the case m = n; m > n gives nonlinear least-squares
Motivation: Nonlinear Equations
We are very familiar with scalar (m = 1) nonlinear equations
Simplest case is a quadratic equation
$$ax^2 + bx + c = 0$$
We can write down a closed-form solution, the quadratic formula:

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$
Motivation: Nonlinear Equations
In fact, there are also closed-form solutions for arbitrary cubic and quartic polynomials, due to Ferrari and Cardano (∼1540)

An important mathematical result is that there is no general formula for solving fifth or higher order polynomial equations

Hence, even for the simplest possible case (polynomials), the only hope is to employ an iterative algorithm

An iterative method should converge in the limit n → ∞, and ideally yield an accurate approximation after few iterations
Motivation: Nonlinear Equations
There are many well-known iterative methods for nonlinear equations

Probably the simplest is the bisection method for a scalar equation f(x) = 0, where f ∈ C[a, b]

Look for a root in the interval [a, b] by bisecting based on the sign of f
Motivation: Nonlinear Equations
```python
#!/usr/bin/python
from math import sin

# Function to consider
def f(x):
    return x*x - 4*sin(x)

# Initial interval: assume f(a)<0 and f(b)>0
a = 1
b = 3

# Bisection search
while b - a > 1e-8:
    print(a, b)
    c = 0.5*(b + a)
    if f(c) < 0:
        a = c
    else:
        b = c

print("# Root at", 0.5*(a + b))
```
Motivation: Nonlinear Equations
[Figure: plot of f(x) = x² − 4 sin(x) on [1, 3], with a zoomed view near the root]

Root in the interval [1.933716, 1.933777]
Motivation: Nonlinear Equations
Bisection is a robust root-finding method in 1D, but it does not generalize easily to $\mathbb{R}^n$ for n > 1

Also, bisection is a crude method in the sense that it makes no use of the magnitude of f, only sign(f)

We will look at the mathematical basis of alternative methods which generalize to $\mathbb{R}^n$:

- Fixed-point iteration
- Newton's method
Optimization
Motivation: Optimization
Another major topic in Scientific Computing is optimization
Very important in science, engineering, industry, finance, economics, logistics,...

Many engineering challenges can be formulated as optimization problems, e.g.:

- Design a car body that maximizes downforce⁴
- Design a bridge with minimum weight
⁴A major goal in racing car design
Motivation: Optimization
Of course, in practice, it is more realistic to consider optimization problems with constraints, e.g.:

- Design a car body that maximizes downforce, subject to a constraint on drag
- Design a bridge with minimum weight, subject to a constraint on strength
Motivation: Optimization
Also, (constrained and unconstrained) optimization problems arise naturally in science
Physics:
- many physical systems will naturally occupy a minimum energy state
- if we can describe the energy of the system mathematically, then we can find the minimum energy state via optimization
Motivation: Optimization
Biology:
- recent efforts in Scientific Computing have sought to understand biological phenomena quantitatively via optimization
- computational optimization of, e.g., fish swimming or insect flight can reproduce behavior observed in nature
- this jells with the idea that evolution has been "optimizing" organisms for millions of years
Motivation: Optimization
All these problems can be formulated as: optimize (max. or min.) an objective function over a set of feasible choices, i.e.

Given an objective function $f : \mathbb{R}^n \to \mathbb{R}$ and a set $S \subset \mathbb{R}^n$, we seek $x^* \in S$ such that $f(x^*) \le f(x)$, $\forall x \in S$

(It suffices to consider only minimization, since maximization is equivalent to minimizing −f)

S is the feasible set, usually defined by a set of equations and/or inequalities, which are the constraints
If S = Rn, then the problem is unconstrained
Motivation: Optimization
The standard way to write an optimization problem is
$$\min_{x \in S} f(x) \quad \text{subject to } g(x) = 0 \text{ and } h(x) \le 0,$$
where $f : \mathbb{R}^n \to \mathbb{R}$, $g : \mathbb{R}^n \to \mathbb{R}^m$, $h : \mathbb{R}^n \to \mathbb{R}^p$
Motivation: Optimization
For example, let $x_1$ and $x_2$ denote the radius and height of a cylinder, respectively

Minimize the surface area of the cylinder subject to a constraint on its volume⁵ (we will return to this example later)
$$\min_x f(x_1, x_2) = 2\pi x_1(x_1 + x_2)$$
$$\text{subject to } g(x_1, x_2) = \pi x_1^2 x_2 - V = 0$$
⁵Heath Example 6.2
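As a preview, here is how this example might be handed to a black-box constrained optimizer (a sketch using scipy.optimize.minimize; V = 1 and the starting point are arbitrary choices, not from the slides):

```python
import numpy as np
from scipy.optimize import minimize

V = 1.0  # prescribed volume (illustrative)

# Surface area of a cylinder with radius x[0] and height x[1]
def f(x):
    return 2.0*np.pi*x[0]*(x[0] + x[1])

# Volume constraint g(x1, x2) = pi*x1^2*x2 - V = 0
cons = {"type": "eq", "fun": lambda x: np.pi*x[0]**2*x[1] - V}

res = minimize(f, x0=[1.0, 1.0], constraints=[cons])
print(res.x)  # optimal (radius, height); here height = 2 * radius
```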
Motivation: Optimization
If f, g and h are all affine, then the optimization problem is called a linear program

(Here the term "program" has nothing to do with computer programming; instead it refers to logistics/planning)

Affine means $f(x) = Ax + b$ for a matrix A, i.e. linear plus a constant⁶
Linear programming may already be familiar

Just need to check f(x) on vertices of the feasible region
⁶Recall that "affine" is not the same as "linear", i.e. $f(x + y) = Ax + Ay + b$ whereas $f(x) + f(y) = Ax + Ay + 2b$
Motivation: Optimization
If the objective function or any of the constraints are nonlinear, then we have a nonlinear optimization problem or nonlinear program

We will consider several different approaches to nonlinear optimization in this Unit

Optimization routines typically use local information about a function to iteratively approach a local minimum
Motivation: Optimization
In some cases this easily gives a global minimum
[Figure: a smooth one-dimensional function on [−1, 1] with a single minimum]
Motivation: Optimization
But in general, global optimization can be very difficult
[Figure: an oscillatory one-dimensional function on [0, 1] with many local minima]
We can get “stuck” in local minima!
Motivation: Optimization
And it can get much harder in higher spatial dimensions
[Figure: surface plot of a two-dimensional function with many local minima]
Motivation: Optimization
There are robust methods for finding local minima, and this is what we focus on in AM205

Global optimization is very important in practice, but in general there is no way to guarantee that we will find a global minimum

Global optimization basically relies on heuristics:

- try several different starting guesses ("multistart" methods)
- simulated annealing
- genetic methods⁷
⁷Simulated annealing and genetic methods are covered in AM207
Root Finding: Scalar Case
Fixed-Point Iteration
Suppose we define an iteration

$$x_{k+1} = g(x_k) \qquad (*)$$
e.g. recall Heron's Method from Assignment 0 for finding $\sqrt{a}$:

$$x_{k+1} = \frac{1}{2}\left(x_k + \frac{a}{x_k}\right)$$

This uses $g_{\text{heron}}(x) = \frac{1}{2}(x + a/x)$
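A minimal sketch of this iteration in Python (a = 2 and the starting guess are arbitrary):

```python
# Heron's method as the fixed-point iteration x_{k+1} = g_heron(x_k)
a = 2.0                       # illustrative choice

def g(x):
    return 0.5*(x + a/x)

x = 1.0                       # arbitrary starting guess
for k in range(6):
    x = g(x)
    print(k, x)               # converges rapidly to sqrt(2)
```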
Fixed-Point Iteration
Suppose α is such that g(α) = α; then we call α a fixed point of g
For example, we see that $\sqrt{a}$ is a fixed point of $g_{\text{heron}}$ since

$$g_{\text{heron}}(\sqrt{a}) = \frac{1}{2}\left(\sqrt{a} + a/\sqrt{a}\right) = \sqrt{a}$$
A fixed-point iteration terminates once a fixed point is reached, since if $g(x_k) = x_k$ then we get $x_{k+1} = x_k$
Also, if $x_{k+1} = g(x_k)$ converges as $k \to \infty$, it must converge to a fixed point: let $\alpha \equiv \lim_{k\to\infty} x_k$; then⁸

$$\alpha = \lim_{k\to\infty} x_{k+1} = \lim_{k\to\infty} g(x_k) = g\left(\lim_{k\to\infty} x_k\right) = g(\alpha)$$
⁸The third equality requires $g$ to be continuous
Fixed-Point Iteration
Hence, for example, we know that if Heron's method converges, it will converge to $\sqrt{a}$
It would be very helpful to know when we can guarantee that a fixed-point iteration will converge
Recall that $g$ satisfies a Lipschitz condition on an interval $[a, b]$ if $\exists L \in \mathbb{R}_{>0}$ such that

$$|g(x) - g(y)| \le L|x - y|, \quad \forall x, y \in [a, b]$$
g is called a contraction if L < 1
Fixed-Point Iteration
Theorem: Suppose that $g(\alpha) = \alpha$ and that $g$ is a contraction on $[\alpha - A, \alpha + A]$. Suppose also that $|x_0 - \alpha| \le A$. Then the fixed-point iteration converges to $\alpha$.
Proof:
$$|x_k - \alpha| = |g(x_{k-1}) - g(\alpha)| \le L|x_{k-1} - \alpha|,$$
which implies
$$|x_k - \alpha| \le L^k|x_0 - \alpha|,$$
and, since $L < 1$, $|x_k - \alpha| \to 0$ as $k \to \infty$. (Note that $|x_0 - \alpha| \le A$ implies that all iterates are in $[\alpha - A, \alpha + A]$.)
(This proof also shows that the error decreases by a factor of $L$ each iteration)
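We can observe this geometric error decay numerically; a sketch with the illustrative choice g(x) = cos(x) (not from the slides), whose fixed point α ≈ 0.739085 has contraction factor |g′(α)| = sin(α) ≈ 0.674 nearby:

```python
import math

# Fixed-point iteration for g(x) = cos(x); alpha is the fixed point
g = math.cos
alpha = 0.7390851332151607     # cos(alpha) = alpha (precomputed)

x = 1.0                        # arbitrary starting guess
err_prev = abs(x - alpha)
for k in range(10):
    x = g(x)
    err = abs(x - alpha)
    print(k, err, err/err_prev)   # ratio ~ sin(alpha) ~ 0.674
    err_prev = err
```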