Lecture 1
Differentiation
Agnieszka Kulacka
Birkbeck College, University of London
Differential and Integral Calculus
Lecture 1: Differentiation
Lecture 2: Unconstrained optimisation
Lecture 3: Constrained optimisation
Lecture 4: Integration
Lecture 5: Difference and differential equations
1
Motivation
Differential calculus is concerned with the way in which a value
of the function changes when the independent variable changes.
In economics, we study, for example, how a change in a firm’s
output level affects its costs and how a change in a country’s
money supply affects the rate of inflation.
2
The limit of the ratio ∆y/∆x as ∆x → 0 measures the instantaneous rate
of change of y with respect to x, and it is called the derivative:
$$\frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}$$
The derivative of the function y = f(x) at the point P = (x0, y0)
is the slope of the tangent line at that point:
$$f'(x_0) = \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x}$$
Example 1.
$f(x) = x^2$
4
Differential
If f ′(x0) is the derivative of the function y = f(x) at the point
P (x0, f(x0)), then the total differential at the point P is
dy = df(x0, dx) = f ′(x0)dx
Thus the differential is a function of both x and dx.
The differential provides us with a method of estimating the
effect on y of a change in x of the amount dx = ∆x, where
∆y is the exact change while dy is the approximate change in y.
This can be expressed as
∆y = dy + ε
where ε is the approximation error.
5
Hoy et al. Mathematics for Economics p. 136
6
Hoy et al. Mathematics for Economics p. 137
7
Hoy et al. Mathematics for Economics p. 137
8
Example 2.
Let $f(x) = -x^2$. Use the differential to estimate the changes in
y between P = (10, −100) and each of the 5 points
$Q_i$, i = 1, 2, 3, 4, 5.
$Q_i$ (15, −225) (14, −196) (13, −169) (12, −144) (11, −121)
∆x = dx
∆y
dy = f ′(x)dx
ε
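The rows of the table in Example 2 can be filled in mechanically. The following is a minimal sketch, assuming the function $f(x) = -x^2$ and base point x0 = 10 from the example; the helper name `differential_table` is illustrative and not part of the lecture.

```python
# Sketch: compare the exact change in y with the differential dy
# for f(x) = -x^2 around x0 = 10 (Example 2).

def f(x):
    return -x ** 2


def f_prime(x):
    # Derivative of -x^2 by the power rule.
    return -2 * x


def differential_table(x0, targets):
    """Return one row (dx, exact change, differential, error) per target x."""
    rows = []
    for x1 in targets:
        dx = x1 - x0
        delta_y = f(x1) - f(x0)   # exact change in y
        dy = f_prime(x0) * dx     # linear estimate from the differential
        rows.append((dx, delta_y, dy, delta_y - dy))
    return rows


rows = differential_table(10, [15, 14, 13, 12, 11])
for row in rows:
    print(row)
```

The last entry of each row is ε = ∆y − dy, so the shrinking error as dx gets smaller can be read off directly.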
9
A function f(x), which is defined on an open interval including
the point x = a, is differentiable at that point if
$$\lim_{\Delta x \to 0} \frac{f(a + \Delta x) - f(a)}{\Delta x}$$
exists and is finite. That is,
$$\lim_{\Delta x \to 0^+} \frac{f(a + \Delta x) - f(a)}{\Delta x} = \lim_{\Delta x \to 0^-} \frac{f(a + \Delta x) - f(a)}{\Delta x}$$
The value of this expression is the value of the derivative function
f ′(x) at the point x = a.
Example 3.
Find the left- and right-hand derivatives at the point of non-differentiability.
$$f(x) = \begin{cases} x, & x < 1 \\ 2 - x, & x \ge 1 \end{cases}$$
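One-sided derivatives can be approximated numerically by taking a small step on one side only. This is a rough sketch for the piecewise function of Example 3; the step size `1e-6` and the helper name `one_sided_quotient` are assumptions made for illustration.

```python
# Sketch: one-sided difference quotients for the piecewise function
# f(x) = x for x < 1 and f(x) = 2 - x for x >= 1 (Example 3).

def f(x):
    return x if x < 1 else 2 - x


def one_sided_quotient(func, a, h):
    """Difference quotient with step h; h > 0 is right-hand, h < 0 is left-hand."""
    return (func(a + h) - func(a)) / h


right = one_sided_quotient(f, 1.0, 1e-6)    # approaches the right-hand derivative
left = one_sided_quotient(f, 1.0, -1e-6)    # approaches the left-hand derivative
print(left, right)
```

The two quotients settle on different values, which is the numerical signature of a kink at x = 1.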
11
Hoy et al. Mathematics for Economics p. 142
12
Rules of differentiation
1. Derivative of a constant function:
If f(x) = c, where c ∈ R, then f ′(x) = 0.
2. Derivative of a linear function:
If f(x) = mx+ c, where m, c ∈ R, then f ′(x) = m.
3. Derivative of a power function:
If $f(x) = x^n$, where n ∈ R, then $f'(x) = nx^{n-1}$.
4. Derivative of a constant multiple of a function:
If g(x) = cf(x), where c ∈ R, then g′(x) = cf ′(x).
13
Rules of differentiation
5. Derivative of the sum of an arbitrary but finite number of
functions:
If $h(x) = \sum_{i=1}^{n} f_i(x)$, where n ∈ N, then $h'(x) = \sum_{i=1}^{n} f_i'(x)$.
6. Product Rule:
If h(x) = f(x)g(x), then h′(x) = f ′(x)g(x) + f(x)g′(x).
7. Quotient Rule:
If $h(x) = f(x)/g(x)$ with g(x) ≠ 0, then $h'(x) = \frac{f'(x)g(x) - f(x)g'(x)}{[g(x)]^2}$.
8. Chain Rule:
If y = f(u) and u = g(x) so that y = f(g(x)) = h(x), then
h′(x) = f ′(u)g′(x).
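The product, quotient and chain rules can be sanity-checked against a numerical derivative. The sketch below uses $f(x) = e^x$ and $g(x) = x^2 + 1$ purely as illustrative choices (they are not taken from the lecture), and a central difference with an assumed step of `1e-6`.

```python
import math


def numeric_derivative(func, x, h=1e-6):
    """Central-difference estimate of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)


# Illustrative functions (an assumption, not from the slides).
f, f_prime = math.exp, math.exp


def g(x):
    return x ** 2 + 1


def g_prime(x):
    return 2 * x


x0 = 0.7
# Values predicted by rules 6, 7 and 8.
product = f_prime(x0) * g(x0) + f(x0) * g_prime(x0)
quotient = (f_prime(x0) * g(x0) - f(x0) * g_prime(x0)) / g(x0) ** 2
chain = f_prime(g(x0)) * g_prime(x0)

# Numerical estimates of the same derivatives.
product_check = numeric_derivative(lambda x: f(x) * g(x), x0)
quotient_check = numeric_derivative(lambda x: f(x) / g(x), x0)
chain_check = numeric_derivative(lambda x: f(g(x)), x0)
print(product - product_check, quotient - quotient_check, chain - chain_check)
```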
14
Example 4.
If f is differentiable at x, find the expressions for the derivatives
of the following functions:
(a) $x + f(x)$ (b) $[f(x)]^2 - x$ (c) $[f(x)]^4$
(d) $x^2 f(x) + [f(x)]^3$ (e) $x f(x)$ (f) $\sqrt{f(x)}$
(g) $\frac{x^2}{f(x)}$ (h) $\frac{[f(x)]^2}{x^3}$
15
Rules of differentiation
9. Derivative of the inverse of a function:
If y = f(x) has an inverse function x = g(y), and f ′(x) ≠ 0, then
g′(y) = 1/f ′(x).
10. Derivative of an exponential function:
If $y = e^x$, then $dy/dx = e^x$.
11. Derivative of a logarithmic function:
If $y = \ln x$, then $dy/dx = 1/x$.
16
Example 5.
Find the expressions for the derivatives of the following functions
w.r.t. x:
(a) $x \ln(x^2 + 2)$ (b) ln(√x + 1) (c) $(e^x + x^2)^{10}$
Example 6.
Let $y = 100p^{-2} + p$. Find $\left.\frac{dp}{dy}\right|_{p=10}$.
17
Monotonicity
We say that the function f is strictly increasing (↑) if f(x1) <
f(x2) whenever x1 < x2. Similarly, f is strictly decreasing (↓) if
f(x1) > f(x2) whenever x1 < x2. In either case f is said to be a
monotonic function.
Test for monotonicity
If f ′(x) ≥ 0 for all x, and f ′(x) = 0 for only a finite number of values
of x, then f is strictly increasing. Similarly, if f ′(x) ≤ 0 for all
x, and f ′(x) = 0 for only a finite number of values of x, then f is
strictly decreasing.
18
Example 7.
Find the interval where the following functions are increasing:
(a) $y = (\ln x)^2 - 4$ (b) $y = \ln(e^x + e^{-x})$
(c) $y = x - \frac{3}{2}\ln(x^2 + 2)$
19
Partial Derivatives
We will extend the idea of the derivative for functions of one variable to functions defined on $\mathbb{R}^n$.
Let $y = f(x_1, x_2, \ldots, x_n)$. We can define the rate at which y
changes with respect to changes in each of the variables $x_1, x_2, \ldots, x_n$, taken separately, in the same way as for one independent variable.
The partial derivative $f_i(x_1, \ldots, x_i, \ldots, x_n)$ of a function $y = f(x_1, x_2, \ldots, x_n)$ with respect to variable $x_i$ is
$$\frac{\partial f}{\partial x_i} = \lim_{\Delta x_i \to 0} \frac{f(x_1, \ldots, x_i + \Delta x_i, \ldots, x_n) - f(x_1, \ldots, x_i, \ldots, x_n)}{\Delta x_i}$$
Hoy et al. Mathematics for Economics p. 395
21
Example 8.
Find $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$ for the following functions
(a) $z = x^2 + 3y^2$ (b) $z = xy$ (c) $z = 5x^4y^2 - 2xy^5$
(d) $z = e^{x+y}$ (e) $z = e^{xy}$ (f) $z = e^{x/y}$
(g) $z = \ln(x + y)$ (h) $z = \ln(xy)$
22
Gradient Vector
Assume that f(x1, ..., xn) is continuous and its partial derivatives
are defined for all (x1, ...xn). Then we may define the gradient
vector
$$\nabla f = \begin{bmatrix} f_1 \\ f_2 \\ \vdots \\ f_n \end{bmatrix}$$
where $f_i = \partial f / \partial x_i$.
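A gradient can be approximated by applying the limit definition of the partial derivative to one coordinate at a time. The sketch below is an illustration only: it borrows $f(x, y) = 3x^2 + 2y^5$ and the point (1, −2) from Problem Set 1, question 7(a), and the helper `gradient` with its step size is an assumption.

```python
# Sketch: numerical gradient by central differences, one coordinate at a time.

def gradient(func, point, h=1e-6):
    """Approximate the gradient of func at point (a sequence of floats)."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h      # perturb only coordinate i
        backward[i] -= h
        grad.append((func(*forward) - func(*backward)) / (2 * h))
    return grad


def f(x, y):
    # Function from Problem Set 1, question 7(a).
    return 3 * x ** 2 + 2 * y ** 5


grad = gradient(f, (1.0, -2.0))
print(grad)  # should be close to the analytic gradient (6x, 10y^4) at (1, -2)
```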
23
Example 9.
Find the gradient vector for the following functions
(a) $f(x, y) = x^7 - y^7$ (b) $f(x, y) = x^5 \ln y$
(c) $f(x, y) = (x^2 - 2y^2)^5$ (d) $f(x, y, z) = x^2 + y^3 + z^4$
(e) $f(x, y, z) = xyz$ (f) $f(x, y, z) = x^4/yz$
24
Total differential
Let
$$dx = \begin{bmatrix} dx_1 \\ \vdots \\ dx_n \end{bmatrix}$$
The first-order total differential for the function $y = f(x_1, \ldots, x_n)$
is
$$dy = f_1 dx_1 + f_2 dx_2 + \cdots + f_n dx_n = (\nabla f)^T dx$$
where $f_i = \partial f / \partial x_i$.
Example 10.
Calculate the differentials of the following functions
(a) $z = x^3 + y^3$ (b) $z = xe^{y^2}$
(c) $z = \ln(x^2 - y^2)$
26
Using the total differential to approximate the actual change ∆y
in the function value for given changes in x1, ..., xn corresponds
geometrically to using the tangent plane as an approximation of
the function in the same way as we used the tangent line for the
functions of one variable.
27
Hoy et al. Mathematics for Economics p. 417
28
Example 11.
Let $T(x, y, z) = (x^2 + y^2 + z^2)^{1/2}$. Find the total differential and
use it to estimate the changes in T if x changes from 2 to 2.01
and y changes from 3 to 2.99 and z changes from 6 to 6.02.
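The estimate in Example 11 is the dot product of the gradient with the vector of changes. A minimal sketch, assuming the starting point (2, 3, 6) and the changes stated in the example; the variable names are illustrative.

```python
import math

# Sketch: total differential of T(x, y, z) = sqrt(x^2 + y^2 + z^2) (Example 11).


def T(x, y, z):
    return math.sqrt(x ** 2 + y ** 2 + z ** 2)


def grad_T(x, y, z):
    # Each partial derivative is the coordinate divided by T.
    t = T(x, y, z)
    return (x / t, y / t, z / t)


point = (2.0, 3.0, 6.0)
step = (0.01, -0.01, 0.02)   # (dx, dy, dz) from the example

# dT = T_x dx + T_y dy + T_z dz
dT = sum(g * d for g, d in zip(grad_T(*point), step))

# Exact change, for comparison with the linear estimate.
exact = T(*(p + d for p, d in zip(point, step))) - T(*point)
print(dT, exact)
```

Here T(2, 3, 6) = 7, so the differential reduces to (2·0.01 − 3·0.01 + 6·0.02)/7 = 0.11/7.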
29
Total derivative
Using the chain rule, we can define the total derivative
$$\frac{d}{dt} f(x_1, x_2, \ldots, x_n) = (\nabla f)^T \frac{dx}{dt}$$
Example 12.
Find $\frac{dz}{dt}$ when
(a) $z = x^2 + y^3$ with $x = t^2$ and $y = 2t$
(b) $z = xe^{2y}$ with $x = \sqrt{t}$ and $y = \ln t$
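The total derivative formula can be checked against a direct numerical derivative of z as a function of t. The sketch below does this for Example 12(a); the evaluation point t = 1.5 and the step size are arbitrary assumptions.

```python
# Sketch: total derivative for z = x^2 + y^3 with x = t^2, y = 2t (Example 12(a)).

def total_derivative(t):
    x, y = t ** 2, 2 * t
    grad = (2 * x, 3 * y ** 2)    # (dz/dx, dz/dy)
    velocity = (2 * t, 2.0)       # (dx/dt, dy/dt)
    return sum(g * v for g, v in zip(grad, velocity))


def z_of_t(t):
    x, y = t ** 2, 2 * t
    return x ** 2 + y ** 3


t0 = 1.5
h = 1e-6
numeric = (z_of_t(t0 + h) - z_of_t(t0 - h)) / (2 * h)   # central difference
analytic = total_derivative(t0)
print(analytic, numeric)
```

Substituting directly gives $z = t^4 + 8t^3$, whose derivative $4t^3 + 24t^2$ agrees with the chain-rule value.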
31
Problem Set 1: Differentiation
1. Let $f(x) = x^2$. Use the differential to estimate the changes in y between P = (20, 400) and each
of the 5 points $Q_i$, i = 1, 2, 3, 4, 5.
$Q_i$ (25, 625) (24, 576) (23, 529) (22, 484) (21, 441)
∆x = dx
∆y
dy = f ′(x)dx
ε
What does this example suggest about the use of the differential as an estimate of the actual
change in the function value as x changes?
2. Find the left- and right-hand derivatives at the point of non-differentiability.
$$f(x) = \begin{cases} 3x + 2, & x \le 5 \\ x + 12, & x > 5 \end{cases}$$
3. Differentiate the following:
(a) $(x^4 - 3x^2)(5x + 1)^5$
(b) $(x^m + 8)^3(5x^2 + 2x^{-n})$
(c) $(x^2 + 1)/(2x^3 + 1)$
(d) $xe^{2x}$
(e) $x/(1 + e^x)$
(f) $(e^{3x} - 1)^4$
(g) $\ln(x^4 + 1)$
(h) $\ln(e^x + 1)$
(i) $\ln\frac{x^2}{x^4 + 1}$
(j) $x^x$
4. Find dy/dx at the point y = 0, when
$x = 0.8 + 0.7y - (y + 2)^{-1}$
5. Which of the following functions are monotonic?
(a) $f(x) = x^5 - 10x^3 + 45x$
(b) $f(x) = x - x^{-1}$
(c) $f(x) = 5 - x^6$
(d) $f(x) = 5 - x^6$, x > 0
(e) $f(x) = x + |x|$
(f) $f(x) = 2x + |x|$
6. In each of the following cases find the gradient vector of the function f(x, y).
(a) $f(x, y) = 3x + 4y^3$
(b) $f(x, y) = x^3 \ln y + 6x^2y^3 + e^{2x}y$
(c) $f(x, y) = (x^2 + 4y^2)^{-1/2}$
(d) $f(x, y) = (x + 4y)(e^{-2x} + e^{-3y})$
7. In each of the following cases find the gradient vector of the function f(x, y) and evaluate it at
the point (1, −2).
(a) $f(x, y) = 3x^2 + 2y^5$
(b) $f(x, y) = 3x^2y^3 + 2x^3y^2$
(c) $f(x, y) = (3x - 2y)/(x^2 + y^2)$
(d) $f(x, y) = x \ln(1 + y^2)$
8. Let $z = x^2y + y^5$. Find the total differential and use it to estimate the changes in z if x changes
from −3 to −3.1 and y changes from 2 to 2.1.
9. If $f(x_1, x_2) = 2x_1 + 3x_2$, $x_1 = 2t^3$, $x_2 = 3 + e^t$, find the total derivative df/dt.
10. If $z = x^3y^4 + xe^y$, $x = 1 + 6t$, $y = -2 - 3t$, find the total derivative dz/dt.
11. Let $x_1 = 8p_1^{-2}p_2^{1/2}$, $x_2 = 4p_1^{1/2}p_2^{-1/2}$, and $p_1 = 2t^{1/2}$, $p_2 = 3t^{3/2}$. Find the total derivatives $dx_1/dt$
and $dx_2/dt$.
Answers
1.
Qi (25,625) (24,576) (23,529) (22,484) (21,441)
∆x = dx 5 4 3 2 1
∆y 225 176 129 84 41
dy = f ′(x)dx 200 160 120 80 40
ε 25 16 9 4 1
This suggests that as ∆x = dx gets smaller, the value of the total differential dy becomes a better
approximation to the actual change ∆y. In this case dy underestimates the value of ∆y.
2.
$$\lim_{\Delta x \to 0^+} \frac{f(a + \Delta x) - f(a)}{\Delta x} = 1$$
$$\lim_{\Delta x \to 0^-} \frac{f(a + \Delta x) - f(a)}{\Delta x} = 3$$
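3. (a) $(45x^4 + 4x^3 - 105x^2 - 6x)(5x + 1)^4$
(b) $(x^m + 8)^2[(15m + 10)x^{m+1} + (6m - 2n)x^{m-n-1} + 80x - 16nx^{-n-1}]$
(c) $\frac{-2x^4 - 6x^2 + 2x}{(2x^3 + 1)^2}$
(d) $e^{2x}(2x + 1)$
(e) $\frac{1 + e^x - xe^x}{(1 + e^x)^2}$
(f) $12e^{3x}(e^{3x} - 1)^3$
(g) $\frac{4x^3}{x^4 + 1}$
(h) $\frac{e^x}{e^x + 1}$
(i) $\frac{2 - 2x^4}{x(x^4 + 1)}$
(j) $(\ln x + 1)x^x$
4. 20/9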
5. (a) monotonic (↑)
(b) monotonic (↑)
(c) not monotonic
(d) monotonic (↓)
(e) not monotonic (weakly monotonic)
(f) monotonic (↑)
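6. (a) $\begin{bmatrix} 3 \\ 12y^2 \end{bmatrix}$
(b) $\begin{bmatrix} 3x^2 \ln y + 12xy^3 + 2e^{2x}y \\ x^3/y + 18x^2y^2 + e^{2x} \end{bmatrix}$
(c) $-(x^2 + 4y^2)^{-3/2} \begin{bmatrix} x \\ 4y \end{bmatrix}$
(d) $\begin{bmatrix} (1 - 2x - 8y)e^{-2x} + e^{-3y} \\ 4e^{-2x} + (4 - 3x - 12y)e^{-3y} \end{bmatrix}$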
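7. (a) $\begin{bmatrix} 6x \\ 10y^4 \end{bmatrix}$, $\begin{bmatrix} 6 \\ 160 \end{bmatrix}$
(b) $\begin{bmatrix} 6xy^3 + 6x^2y^2 \\ 9x^2y^2 + 4x^3y \end{bmatrix}$, $\begin{bmatrix} -24 \\ 28 \end{bmatrix}$
(c) $(x^2 + y^2)^{-2} \begin{bmatrix} -3x^2 + 3y^2 + 4xy \\ -2x^2 + 2y^2 - 6xy \end{bmatrix}$, $\begin{bmatrix} 1/25 \\ 18/25 \end{bmatrix}$
(d) $\begin{bmatrix} \ln(1 + y^2) \\ 2xy/(1 + y^2) \end{bmatrix}$, $\begin{bmatrix} \ln 5 \\ -0.8 \end{bmatrix}$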
8. $dz = 2xy\,dx + (x^2 + 5y^4)\,dy$, ∆z ≈ 10.1
9. $12t^2 + 3e^t$
10. $6(3x^2y^4 + e^y) - 3(4x^3y^3 + xe^y)$
11. $dx_1/dt = -x_1/(4t)$, $dx_2/dt = -x_2/(8t)$
Lecture 2
Unconstrained optimisation
Motivation
Many economic models are based on the idea that an individual
decision maker makes an optimal choice from some given set of
alternatives. To formalise this idea, we interpret optimal choice
as maximising or minimising the value of some function. For
example, a firm is assumed to minimise costs of producing each
level of output and to maximise profit, etc.
2
Higher-order derivatives
Since the derivative of a function y = f(x) is also a function, df(x)/dx, we can find its derivative, which
we call the second derivative, and denote it as
$$\frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{d^2y}{dx^2}$$
or $f''(x)$.
Similarly, we can find higher-order derivatives. The nth derivative
is denoted by $d^ny/dx^n$ or $f^{(n)}(x)$.
3
Example 1.
Compute the second derivatives of
(a) $y = x^5 - 3x^4 + 2$
(b) $y = \sqrt{1 + x^2}$
(c) $y = x^5e^x$
(d) $y = e^x/x$
(e) $y = x^2 \ln x$
(f) $y = \ln x / x$
4
Convexity and Concavity
Let f(x) be a twice differentiable function.
1. f(x) is convex if, at all points on its domain f ′′(x) ≥ 0.
2. f(x) is strictly convex if, at all points on its domain f ′′(x) > 0,
except possibly at a single point.
3. f(x) is concave if, at all points on its domain f ′′(x) ≤ 0.
4. f(x) is strictly concave if, at all points on its domain f ′′(x) < 0,
except possibly at a single point.
5
Hoy et al. Mathematics for Economics p. 177
6
Example 2.
Find the intervals over which these functions are concave and
the intervals over which the functions are convex.
(a) $f(x) = x^{1/4}$, x > 0
(b) $f(y) = y^3 - 9y^2 + 60y + 10$, y ≥ 0
7
Second-order partial derivatives
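Similarly to functions of one variable, we can define second-order
partial derivatives of a function $f(x_1, \ldots, x_n)$:
$$\frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial}{\partial x_j}\left(\frac{\partial f}{\partial x_i}\right)$$
where 1 ≤ i, j ≤ n.
The partial derivative $\partial^2 f / \partial x_i \partial x_j$ is also denoted $f_{ij}$.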
8
Example 3.
Let g be defined by
$g(x, y, z) = 2x^3 - 4xy + 10y^2 + z^2 - 4x - 28y - z + 24$
for all (x, y, z)
Find all partial derivatives of the first and second orders.
9
Smooth functions
If the continuous function f(x1, ...xn) is such that its first-order
partial derivatives are defined for all (x1, .., xn) and are continu-
ous, then f is said to be of class C1. If f is of class C1 and its
partial derivatives are of class C1, then f is said to be a function
of class C2 (a smooth function).
Young’s theorem: If f is a smooth function, then
$$\frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i}$$
Hessian matrix
We defined the gradient vector of function f as a vector of the
first-order partial derivatives of f . We can define a matrix of
second-order partial derivatives of f
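$$\nabla^2 f = \begin{bmatrix} f_{11} & f_{12} & \cdots & f_{1n} \\ f_{21} & f_{22} & \cdots & f_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ f_{n1} & f_{n2} & \cdots & f_{nn} \end{bmatrix}$$
This matrix is called the Hessian matrix of f.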
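A Hessian can be approximated numerically with second-order central differences, which is a handy check on hand-computed second partials. The sketch below applies this to the function g of Example 3; the helper `hessian`, the step size and the evaluation point (1, 2, 3) are assumptions made for illustration.

```python
# Sketch: numerical Hessian by central differences for g from Example 3.

def g(x, y, z):
    return 2 * x ** 3 - 4 * x * y + 10 * y ** 2 + z ** 2 - 4 * x - 28 * y - z + 24


def hessian(func, point, h=1e-4):
    """Approximate the matrix of second-order partial derivatives at point."""
    n = len(point)
    result = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                p = list(point)
                p[i] += si * h
                p[j] += sj * h   # for i == j both shifts act on the same coordinate
                return func(*p)
            result[i][j] = (shifted(1, 1) - shifted(1, -1)
                            - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return result


H = hessian(g, (1.0, 2.0, 3.0))
for row in H:
    print([round(v, 3) for v in row])
```

Differentiating by hand gives $g_{11} = 12x$, $g_{12} = g_{21} = -4$, $g_{22} = 20$, $g_{33} = 2$ and zeros elsewhere, and the symmetry of the numerical result illustrates Young's theorem.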
11
Example 4.
Find the Hessian matrices of
(a) $f(x, y, z) = ax^2 + by^2 + cz^2$
(b) $f(x, y, z) = Ax^ay^bz^c$
12
The second-order total differential
We define the second-order total differential of y = f(x1, ...xn)
as
$$d^2y = dx^T \, \nabla^2 f \, dx$$
13
Example 5.
Find the second-order total differential of
(a) y = f(x)
(b) z = f(x, y)
14
Convexity and concavity
For any smooth function $y = f(x_1, \ldots, x_n)$, it follows that
1. f is strictly convex on $\mathbb{R}^n$ if $d^2y > 0$.
2. f is convex on $\mathbb{R}^n$ if $d^2y \ge 0$.
3. f is strictly concave on $\mathbb{R}^n$ if $d^2y < 0$.
4. f is concave on $\mathbb{R}^n$ if $d^2y \le 0$.
15
Leading principal minors
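Let $y = f(x_1, \ldots, x_n)$ be any smooth function and H its Hessian
matrix. The leading principal minors are
$$|H_1| = f_{11}, \quad |H_2| = \begin{vmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{vmatrix}, \quad \ldots, \quad |H_n| = \begin{vmatrix} f_{11} & f_{12} & \cdots & f_{1n} \\ f_{21} & f_{22} & \cdots & f_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ f_{n1} & f_{n2} & \cdots & f_{nn} \end{vmatrix}$$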
Strict Convexity and concavity
For any smooth function $y = f(x_1, \ldots, x_n)$ with the Hessian matrix
H, it follows that
1. f is strictly convex on $\mathbb{R}^n$ if all leading principal minors of H
are positive, i.e.
$|H_1| > 0, |H_2| > 0, \ldots, |H_n| > 0$.
2. f is strictly concave on $\mathbb{R}^n$ if the leading principal minors of
H alternate in sign beginning with a negative value for $|H_1|$, i.e.
$|H_1| < 0, |H_2| > 0, \ldots$, with $|H_n| = |H| > 0$ if n is even and $|H_n| = |H| < 0$ if n is odd.
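The leading principal minor test is mechanical once the Hessian entries are known. The following is a rough sketch for constant Hessians with exact (integer or fraction) entries; the function names and the "test inconclusive" label are illustrative choices, and the determinant uses a simple cofactor expansion that is only sensible for small matrices.

```python
# Sketch: strict convexity/concavity test via leading principal minors.

def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for col, entry in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * det(minor)
    return total


def leading_principal_minors(h):
    """|H_1|, ..., |H_n|: determinants of the top-left k-by-k blocks."""
    return [det([row[:k] for row in h[:k]]) for k in range(1, len(h) + 1)]


def classify_strict(h):
    minors = leading_principal_minors(h)
    if all(m > 0 for m in minors):
        return "strictly convex"
    # Signs must alternate starting with a negative |H_1|.
    if all((-1) ** k * m > 0 for k, m in enumerate(minors, start=1)):
        return "strictly concave"
    return "test inconclusive"


# Hessian of x1^2 + x2^2 and of 10 - x1^2 - x2^2 (both constant).
print(classify_strict([[2, 0], [0, 2]]))
print(classify_strict([[-2, 0], [0, -2]]))
```

When a minor is zero the strict test says nothing, and the weaker test based on all principal minors is needed.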
17
Example 6.
Are these functions strictly concave or convex?
(a) $f(x_1, x_2) = x_1^2 + x_2^2$
(b) $f(x_1, x_2) = 5 - (x_1 + x_2)^2$
18
Principal minors
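Let $y = f(x_1, \ldots, x_n)$ be any smooth function and H its Hessian matrix. The principal minors $|H^*_k|$ of order k are the determinants obtained by keeping the same k rows and columns of H, so there is in general more than one minor of order k.
For n = 3, the principal minors are
$|H^*_1|$: $f_{11}, f_{22}, f_{33}$,
$|H^*_2|$: $\begin{vmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{vmatrix}, \begin{vmatrix} f_{11} & f_{13} \\ f_{31} & f_{33} \end{vmatrix}, \begin{vmatrix} f_{22} & f_{23} \\ f_{32} & f_{33} \end{vmatrix}$,
$|H^*_3| = \begin{vmatrix} f_{11} & f_{12} & f_{13} \\ f_{21} & f_{22} & f_{23} \\ f_{31} & f_{32} & f_{33} \end{vmatrix}$.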
19
Convexity and concavity
For any smooth function $y = f(x_1, \ldots, x_n)$ with the Hessian matrix
H, it follows that
1. f is convex on $\mathbb{R}^n$ if all principal minors of H are non-negative,
i.e.
$|H^*_1| \ge 0, |H^*_2| \ge 0, \ldots, |H^*_n| \ge 0$.
2. f is concave on $\mathbb{R}^n$ if the principal minors of H alternate in
sign beginning with a non-positive value for $|H^*_1|$, i.e.
$|H^*_1| \le 0, |H^*_2| \ge 0, \ldots$, with $|H^*_n| = |H| \ge 0$ if n is even and $|H^*_n| = |H| \le 0$ if n is odd.
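Unlike the strict test, the weak test needs every principal minor of every order, not only the leading ones. A minimal sketch, assuming a constant Hessian with exact entries; the helper names and returned labels are illustrative.

```python
from itertools import combinations

# Sketch: weak convexity/concavity test using all principal minors.


def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for col, entry in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * det(minor)
    return total


def principal_minors(h, k):
    """All order-k minors that keep the same k rows and columns of h."""
    n = len(h)
    return [det([[h[i][j] for j in idx] for i in idx])
            for idx in combinations(range(n), k)]


def classify_weak(h):
    n = len(h)
    orders = range(1, n + 1)
    convex = all(m >= 0 for k in orders for m in principal_minors(h, k))
    # Order-k minors must have the sign of (-1)^k or be zero.
    concave = all((-1) ** k * m >= 0 for k in orders for m in principal_minors(h, k))
    if convex and concave:
        return "convex and concave"
    if convex:
        return "convex"
    if concave:
        return "concave"
    return "neither"


# Hessian of 5 - (x1 + x2)^2, where the strict test is inconclusive.
print(classify_weak([[-2, -2], [-2, -2]]))
```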
20
Example 7.
Are these functions concave or convex?
(a) $f(x_1, x_2) = (x_1 + x_2)^{1/2}$ defined on $\mathbb{R}^2_{++}$
(b) $f(x_1, x_2) = 3x_1 + x_2^2$
21
Unconstrained optimisation for functions on R
Given some function f , we optimize it by finding a value x at
which it takes on a maximum or a minimum value. Such values
are called extreme values of the function. If the set of x-values
from which we can choose is the entire real line, the problem
of finding an extreme value is unconstrained, while if the set of
x-values is restricted to be a proper subset of the real line, the
problem is constrained.
22
Maxima for functions on R
A global maximum of a function f is f(x∗) if
f(x∗) ≥ f(x) for all x
A local maximum of a function f is f(x∗) if
f(x∗) ≥ f(x) for x∗ − ε ≤ x ≤ x∗+ ε
for some arbitrarily small ε > 0.
23
First-order condition
Take the differential
dy = f ′(x∗)dx
If the function is at a local maximum at x∗, it must be impossible
to increase its value by small changes, dx, in either direction from
x∗. So
f ′(x∗) = 0
24
Minima for functions on R
A global minimum of a function f is f(x∗) if
f(x∗) ≤ f(x) for all x
A local minimum of a function f is f(x∗) if
f(x∗) ≤ f(x) for x∗ − ε ≤ x ≤ x∗+ ε
for some arbitrarily small ε > 0.
25
First-order condition (necessary condition)
If a function f has a turning point (a local maximum or a local
minimum) at x∗, then f ′(x∗) = 0.
If f ′(x∗) = 0, then f has a stationary point at x∗, which may be a turning point
or an inflection point.
26
Hoy et al. Mathematics for Economics p. 211
27
Second-order condition (sufficient condition)
If f ′(x∗) = 0 and f ′′(x∗) < 0 (f is strictly concave), then f has a
local maximum at x∗.
If f ′(x∗) = 0 and f ′′(x∗) > 0 (f is strictly convex), then f has a
local minimum at x∗.
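The first- and second-order conditions combine into a simple classification routine. The sketch below is illustrative only: it uses $y = x^3 - x^2 + 1$ from Problem Set 2, question 6(a), with the derivatives worked out by hand, and exact fractions so that the stationarity check is not affected by rounding.

```python
from fractions import Fraction

# Sketch: classify stationary points of y = x^3 - x^2 + 1 with the
# second-order condition.


def f_prime(x):
    return 3 * x ** 2 - 2 * x


def f_second(x):
    return 6 * x - 2


def classify(x_star):
    """Apply the sufficient condition at a candidate point x_star."""
    if f_prime(x_star) != 0:
        return "not stationary"
    if f_second(x_star) < 0:
        return "local maximum"
    if f_second(x_star) > 0:
        return "local minimum"
    return "inconclusive"   # f''(x*) = 0: the test gives no answer


candidates = [Fraction(0), Fraction(2, 3)]   # roots of 3x^2 - 2x = 0
results = {x: classify(x) for x in candidates}
print(results)
```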
28
Example 8.
Find the local extrema of the following functions
(a) $f(x) = e^{2x} - 5e^x + 4$
(b) $f(x) = x^3 \ln x$ defined on $\mathbb{R}_+$
(c) $y = x^2e^{-x}$
29
Stationary points for functions on Rn
A stationary value of a function f over the domain $\mathbb{R}^n$ occurs at a
point $(x_1^*, x_2^*, \ldots, x_n^*)$ at which the following equation holds
$$\nabla f(x_1^*, x_2^*, \ldots, x_n^*) = 0$$
where 0 is the null vector.
30
First-order condition (necessary condition)
If a function f has a local maximum or a local minimum at
$x^* \in \mathbb{R}^n$, then $\nabla f(x^*) = 0$.
If $\nabla f(x^*) = 0$, then f has a stationary point at x∗, which may be a local maximum,
a local minimum or a saddle point.
31
Hoy et al. Mathematics for Economics p. 475
32
Hoy et al. Mathematics for Economics p. 475
33
Second-order condition (sufficient condition)
Let y = f(x).
If $\nabla f(x^*) = 0$ and $d^2y(x^*) < 0$ (f is strictly concave), then f has
a local maximum at x∗.
If $\nabla f(x^*) = 0$ and $d^2y(x^*) > 0$ (f is strictly convex), then f has
a local minimum at x∗.
34
Example 9.
Find the local extrema of the following functions
(a) $f(x, y) = x^3 - x^2 - y^2 + 8$
(b) $f(x, y) = 5 - x^2 + 6x - 2y^2 + 8y$
(c) $f(x_1, x_2) = x_1^2 + 2x_1x_2^2 + 2x_2^2$
(d) $f(x_1, x_2, x_3) = 2x_1^2 + x_2^2 + 4x_3^2 - x_1 + 2x_3$
35
Problem Set 2: Unconstrained optimisation
1. Find the first four derivatives of the function $f(x) = x^4$.
2. Which of the following functions are convex? Which are concave?
(a) $f(x) = (2x + 1)^6$
(b) $f(x) = x^5 - x$
(c) $f(x) = x - x^2$
3. In each of the following cases find the Hessian matrix of the function f(x, y). (See Problem Set 1
Question 6).
(a) $f(x, y) = 3x + 4y^3$
(b) $f(x, y) = x^3 \ln y + 6x^2y^3 + e^{2x}y$
(c) $f(x, y) = (x^2 + 4y^2)^{-1/2}$
(d) $f(x, y) = (x + 4y)(e^{-2x} + e^{-3y})$
4. In each of the following cases find the Hessian matrix of the function f(x, y) and evaluate it at
the point (1, −2). (See Problem Set 1 Question 7).
(a) $f(x, y) = 3x^2 + 2y^5$
(b) $f(x, y) = 3x^2y^3 + 2x^3y^2$
(c) $f(x, y) = (3x - 2y)/(x^2 + y^2)$
(d) $f(x, y) = x \ln(1 + y^2)$
5. Which of the following functions are convex? Which are strictly convex? Which are concave?
Which are strictly concave?
(a) $f(x_1, x_2) = (x_1 + x_2)^2$
(b) $f(x_1, x_2) = 10 - x_1^2 - x_2^2$
(c) $f(x_1, x_2) = x_1^{1/2}x_2^{1/3}$ defined on $\mathbb{R}^2_{++}$
(d) $f(x_1, x_2) = x_1^{1/2}x_2^{1/2}$ defined on $\mathbb{R}^2_{++}$
6. Find the stationary points and classify them.
(a) $y = x^3 - x^2 + 1$
(b) $y = x^4 - 4x^3 + 16x - 2$
(c) $y = x + 1/x$
(d) $y = (1 - x^2)/(1 + x^2)$
(e) $y = (3 - x^2)^{1/2}$
(f) $y = x^{0.5}e^{-0.1x}$
7. Find the stationary points and classify them.
(a) $y = 0.5x_1^2 + 2x_2^2$
(b) $y = x_1 + x_2 - x_1^2 - x_2^2 + x_1x_2$
(c) $y = 10x_1 + 2x_2 - 0.5x_1^2 - 2x_2^2 + 5x_1x_2$
(d) $y = 2x_1x_2 - x_1^3 - x_2^3$
(e) $y = x_1^3 + x_2^3 - 4x_1x_2$
(f) $y = x_1x_2 + 2/x_1 + 4/x_2$ defined on $\mathbb{R}^2_{++}$
(g) $y = 2x_1^2 - 4x_2^2$
Answers
1. $f'(x) = 4x^3$, $f''(x) = 12x^2$, $f'''(x) = 24x$, $f^{(4)}(x) = 24$
2. (a) convex
(b) neither
(c) concave
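3. (a) $\begin{bmatrix} 0 & 0 \\ 0 & 24y \end{bmatrix}$
(b) $\begin{bmatrix} 6x \ln y + 12y^3 + 4e^{2x}y & 3x^2/y + 36xy^2 + 2e^{2x} \\ 3x^2/y + 36xy^2 + 2e^{2x} & -x^3/y^2 + 36x^2y \end{bmatrix}$
(c) $2(x^2 + 4y^2)^{-5/2} \begin{bmatrix} x^2 - 2y^2 & 6xy \\ 6xy & -2x^2 + 16y^2 \end{bmatrix}$
(d) $\begin{bmatrix} -4e^{-2x}(1 - x - 4y) & -8e^{-2x} - 3e^{-3y} \\ -8e^{-2x} - 3e^{-3y} & -3e^{-3y}(8 - 3x - 12y) \end{bmatrix}$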
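4. (a) $\begin{bmatrix} 6 & 0 \\ 0 & 40y^3 \end{bmatrix}$, $\begin{bmatrix} 6 & 0 \\ 0 & -320 \end{bmatrix}$
(b) $\begin{bmatrix} 6y^3 + 12xy^2 & 18xy^2 + 12x^2y \\ 18xy^2 + 12x^2y & 18x^2y + 4x^3 \end{bmatrix}$, $\begin{bmatrix} 0 & 48 \\ 48 & -32 \end{bmatrix}$
(c) $(x^2 + y^2)^{-3} \begin{bmatrix} 6x^3 - 12x^2y - 18xy^2 + 4y^3 & 4x^3 + 18x^2y - 12xy^2 - 6y^3 \\ 4x^3 + 18x^2y - 12xy^2 - 6y^3 & -6x^3 + 12x^2y + 18xy^2 - 4y^3 \end{bmatrix}$, $\begin{bmatrix} -74/125 & -32/125 \\ -32/125 & 74/125 \end{bmatrix}$
(d) $2(1 + y^2)^{-2} \begin{bmatrix} 0 & y(1 + y^2) \\ y(1 + y^2) & x(1 - y^2) \end{bmatrix}$, $\begin{bmatrix} 0 & -0.8 \\ -0.8 & -0.24 \end{bmatrix}$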
5. (a) convex
(b) strictly concave
(c) strictly concave
(d) concave
6. (a) (0,1) local maximum point, (2/3,23/27) local minimum point
(b) (-1,-13) local minimum point, (2,14) inflection point
(c) (-1,-2) local maximum point, (1,2) local minimum point
(d) (0,1) local maximum point
(e) (0,√3) local maximum point
(f) $(5, \sqrt{5}\,e^{-0.5})$ local maximum point
7. (a) at (0,0) a local minimum
(b) at (1,1) a local maximum
(c) at (-50/21,-52/21) neither local extremum nor a saddle point
(d) at (0,0) neither local extremum nor a saddle point, at (2/3,2/3) a local maximum
(e) at (0,0) neither local extremum nor a saddle point, at (4/3,4/3) a local minimum
(f) at (1,2) a local minimum
(g) at (0,0) a saddle point
4
Lecture 3
Constrained optimisation
Motivation
We will continue to be concerned with maxima and minima
of functions of several variables. We now look at the cases
where optimisation takes place subject to constraints on variables.
Problems of this sort occur frequently in cases such as
maximisation of revenue subject to budget constraints and
minimisation of the cost of a diet subject to nutritional requirements.
We will assume that all functions are smooth.
2
Explanation for two-variable functions
max f(x, y) subject to g(x, y) = b
Set a new function
F (x) = f(x, h(x))
where y = h(x) is an explicit expression of g(x, y) = b.
We find the total derivative
$$\frac{dF}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y} h'(x)$$
and for the stationary point F ′(x) = 0.
Explanation for two-variable functions
We need to establish what h′(x) is.
g(x, h(x)) = b
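We differentiate the above equation with respect to x
$$\frac{\partial g}{\partial x} + \frac{\partial g}{\partial y} h'(x) = 0$$
Thus
$$h'(x) = -\frac{\partial g}{\partial x}\left(\frac{\partial g}{\partial y}\right)^{-1}$$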
Explanation for two-variable functions
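So
$$\frac{dF}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\left(-\frac{\partial g}{\partial x}\right)\left(\frac{\partial g}{\partial y}\right)^{-1}$$
Let
$$\lambda = \frac{\partial f}{\partial y}\left(\frac{\partial g}{\partial y}\right)^{-1} \qquad (1)$$
Thus
$$\frac{dF}{dx} = \frac{\partial f}{\partial x} - \lambda\frac{\partial g}{\partial x} \qquad (2)$$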
Explanation for two-variable functions
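Therefore from equation (2)
$$\frac{\partial f}{\partial x} - \lambda\frac{\partial g}{\partial x} = 0 \qquad (3)$$
from equation (1)
$$\frac{\partial f}{\partial y} - \lambda\frac{\partial g}{\partial y} = 0 \qquad (4)$$
and the constraint
$$g(x, y) - b = 0 \qquad (5)$$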
Lagrangian function for two-variable case
Step 1. We define the Lagrangian function
$$L(x, y, \lambda) = f(x, y) - \lambda(g(x, y) - b)$$
Step 2. First-order condition. We find the stationary point(s) of L by solving
∇L = 0
So
$\partial L/\partial x = 0$, equivalent to equation (3).
$\partial L/\partial y = 0$, equivalent to equation (4).
$\partial L/\partial \lambda = 0$, equivalent to equation (5).
Example 1.
Solve the following constrained optimisation problems:
(a) max $f(x, y) = xy$ subject to $2x + y = 100$
(b) min $f(x, y) = x + 20y$ subject to $\sqrt{x} + y = 30$
(c) min $f(x, y) = -40x + x^2 - 2xy - 20y + y^2$ subject to $x + y = 15$
(d) min $f(x, y) = x^2 + y^2$ subject to $x + 2y = 4$
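For Example 1(a) the first-order conditions ∇L = 0 happen to be linear in (x, y, λ), so they can be solved exactly. The sketch below sets up $L = xy - \lambda(2x + y - 100)$ and solves the three equations by Cramer's rule; the helper names are illustrative, and problems with nonlinear conditions would need a different solver.

```python
from fractions import Fraction

# Sketch: first-order conditions for max xy subject to 2x + y = 100.
#   dL/dx:      y - 2*lam = 0
#   dL/dy:      x -   lam = 0
#   dL/dlam:    2x + y    = 100


def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for col, entry in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * det(minor)
    return total


def solve_cramer(a, b):
    """Solve a linear system a * unknowns = b exactly with Cramer's rule."""
    d = det(a)
    solution = []
    for col in range(len(a)):
        replaced = [row[:col] + [b[i]] + row[col + 1:] for i, row in enumerate(a)]
        solution.append(Fraction(det(replaced), d))
    return solution


# Columns correspond to the unknowns x, y, lam.
A = [[0, 1, -2],
     [1, 0, -1],
     [2, 1, 0]]
rhs = [0, 0, 100]

x, y, lam = solve_cramer(A, rhs)
print(x, y, lam, x * y)
```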
8
Interpretation of Lagrangian multiplier λ
The value of the Lagrangian multiplier λ at the optimum point
tells us the effect on the optimized value (the rate of its increase
or decrease) of the function f of a small relaxation of the con-
straint (a small change in b). Sometimes it is called the shadow
value of the constraint.
(Explained more by the Envelope Theorem)
9
The Envelope Theorem
Let y = L(x, b), where b is some exogenous parameter. Suppose
that x∗(b) solves the optimisation problem
optimise L(x, b)
We define a value function
V (b) = L(x∗, b)
Then
$$\frac{dV}{db} = \frac{\partial L}{\partial b}(x^*, b)$$
Interpretation of Lagrangian multiplier λ
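The Lagrangian function is
$$L(x, y, \lambda) = f(x, y) - \lambda(g(x, y) - b)$$
By the chain rule, treating x, y and λ as functions of b,
$$\frac{dL}{db} = (f_1 - \lambda g_1)\frac{dx}{db} + (f_2 - \lambda g_2)\frac{dy}{db} - (g(x, y) - b)\frac{d\lambda}{db} + \lambda$$
When evaluated at (x∗(b), y∗(b), λ∗(b)), the first-order conditions make the first three terms vanish, so we have
$$\frac{dV}{db} = \frac{\partial L}{\partial b} = \lambda^*$$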
Note that the value function is the optimal value of f which is a
function of b.
11
Example 2.
Consider the following optimisation problem
max f(x, y) = xy subject to 2x+ y = m
Find the value of the Lagrangian multiplier and confirm that
$$\frac{\partial f}{\partial m}(x^*, y^*) = \lambda^*$$
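The claim in Example 2 can be checked numerically. Solving the first-order conditions y = 2λ, x = λ and 2x + y = m by hand gives x∗ = m/4, y∗ = m/2 and λ∗ = m/4; the sketch below takes that closed form as given and compares λ∗ with a finite-difference derivative of the optimal value. The budget m = 100 and the step size are arbitrary assumptions.

```python
# Sketch: envelope-theorem check for max xy subject to 2x + y = m.

def solve(m):
    """Stationary point of L = xy - lam*(2x + y - m), from the hand-solved FOCs."""
    lam = m / 4
    return lam, 2 * lam, lam    # x*, y*, lambda*


def value(m):
    """Optimal value V(m) = f(x*(m), y*(m))."""
    x, y, _ = solve(m)
    return x * y


m0 = 100.0
h = 1e-4
dV_dm = (value(m0 + h) - value(m0 - h)) / (2 * h)   # numerical dV/dm
_, _, lam_star = solve(m0)
print(dV_dm, lam_star)
```

The two numbers agree, which is the shadow-value interpretation: a one-unit relaxation of the constraint raises the optimal value by roughly λ∗.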
12
Example 3.
Solve the following constrained optimisation problems and inter-
pret the Lagrangian multiplier:
(a) min $f(x, y) = x + y$ subject to $x^{1/2} + y = 1$
(b) max $f(x, y) = 12x\sqrt{y}$ subject to $3x + 4y = 12$
13
Bordered Hessian
We define the Hessian matrix of the Lagrangian function evalu-
ated at the optimum point (x∗, y∗) with the equivalent value of
the Lagrangian multiplier λ∗.
$$H^* = \begin{bmatrix} L_{11} & L_{12} & g_1 \\ L_{21} & L_{22} & g_2 \\ g_1 & g_2 & 0 \end{bmatrix}$$
Second-order condition
Step 3.
If (x∗, y∗, λ∗) gives a stationary value of the Lagrangian function
L(x, y, λ) = f(x, y)− λ(g(x, y)− b), then
1. it yields a maximum if the determinant of the bordered Hes-
sian |H∗| > 0.
2. it yields a minimum if the determinant of the bordered Hessian
|H∗| < 0.
15
Example 4.
Solve the following constrained optimisation problems and find
the Bordered Hessian to check whether there is indeed a maximum
or a minimum at the point you found:
(a) min $f(x, y) = x^2 + 2y^2$ subject to $x + y = 12$
(b) max $f(x, y) = x^2 + 3xy + y^2$ subject to $x + y = 100$
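The second-order check in Step 3 comes down to the sign of a 3 × 3 determinant. The sketch below builds the bordered Hessians for the two problems of Example 4 from hand-computed entries (both objectives are quadratic and both constraints linear, so the second derivatives of L are constants); the function names are illustrative.

```python
# Sketch: sign of the bordered Hessian determinant for Example 4.

def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for col, entry in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * det(minor)
    return total


def bordered_hessian(l11, l12, l22, g1, g2):
    return [[l11, l12, g1],
            [l12, l22, g2],
            [g1, g2, 0]]


def second_order_check(h_star):
    d = det(h_star)
    if d > 0:
        return "maximum"
    if d < 0:
        return "minimum"
    return "inconclusive"


# (a) f = x^2 + 2y^2, g = x + y: L11 = 2, L12 = 0, L22 = 4, g1 = g2 = 1.
case_a = bordered_hessian(2, 0, 4, 1, 1)
# (b) f = x^2 + 3xy + y^2, g = x + y: L11 = 2, L12 = 3, L22 = 2, g1 = g2 = 1.
case_b = bordered_hessian(2, 3, 2, 1, 1)

print(det(case_a), second_order_check(case_a))
print(det(case_b), second_order_check(case_b))
```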
16
Lagrangian function for many-valued functions
Optimise f(x) subject to g(x) = b
Step 1. Define the Lagrangian function
L(x, λ) = f(x)− λ(g(x)− b)
Step 2. First-order condition. We find the stationary point(s)
of L by solving
∇L = 0
17
Lagrangian function for many-valued functions
Step 3.
1. max f(x) s.t. g(x) = b if the successive principal minors of
|H∗| alternate in sign in the following way
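$$\begin{vmatrix} L_{11} & L_{12} & g_1 \\ L_{21} & L_{22} & g_2 \\ g_1 & g_2 & 0 \end{vmatrix} > 0, \quad \begin{vmatrix} L_{11} & L_{12} & L_{13} & g_1 \\ L_{21} & L_{22} & L_{23} & g_2 \\ L_{31} & L_{32} & L_{33} & g_3 \\ g_1 & g_2 & g_3 & 0 \end{vmatrix} < 0, \quad \ldots$$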
2. min f(x) s.t. g(x) = b if the successive principal minors of
|H∗| are strictly negative.
18
Problem Set 3: Constrained optimisation
For the following problems, find
(i) the point at which the function has optimum
(ii) the Lagrangian multiplier and interpret it
(iii) the Bordered Hessian to check whether there is indeed a maximum or a minimum at the point
you found in (i)
1. max $f(x, y) = xy$ subject to $3x + 4y = 12$
2. min $f(x, y) = 3x + 4y$ subject to $xy = 12$
3. max $f(x, y) = 3x + 4y$ subject to $xy = 12$
4. max $f(x, y) = 6xy + 2x^2 - 3y^2$ subject to $x + 2y = 5$
5. max $f(x, y) = 2x + y$ subject to $x^2 + y^2 = 4$
6. min $f(x, y) = 2x + y$ subject to $x^2 + y^2 = 4$
7. max $f(x, y) = 2x + 3y$ subject to $2x^2 + 5y^2 = 10$
8. min $f(x, y) = 2x + 3y$ subject to $2x^2 + 5y^2 = 10$
9. max $f(x, y) = (x + 2)(y + 1)$ subject to $x + y = 21$
10. min $f(x, y) = 2x + 4y - x^2 - 0.5y^2 - 2xy$ subject to $2x + y = 10$
11. max $f(x, y) = xy$ subject to $x^2 + y^2 = 16$ with x, y > 0
12. max $f(x, y) = xy$ subject to $x^2 + y^2 = 16$ with x, y < 0
13. min $f(x, y) = xy$ subject to $x^2 + y^2 = 16$ with x < 0, y > 0
14. min $f(x, y) = xy$ subject to $x^2 + y^2 = 16$ with x > 0, y < 0
Answers
1. (i) (2, 1.5)
(ii) λ = 0.5 so relaxing the constraint would increase the optimal value at the rate of 0.5.
(iii) H∗ =
0 1 3
1 0 4
3 4 0
|H∗| = 24 > 0 so maximum at (2, 1.5).
2. (i) (4, 3)
(ii) λ = 1 so relaxing the constraint would increase the optimal value at the rate of 1.
(iii) H∗ =
0 −1 3
−1 0 4
3 4 0
|H∗| = −24 < 0 so minimum at (4, 3).
3. (i) (−4,−3)
(ii) λ = −1 so relaxing the constraint would decrease the optimal value at the rate of 1.
(iii) H∗ =
0 1 −3
1 0 −4
−3 −4 0
|H∗| = 24 > 0 so maximum at (−4,−3).
4. (i) (45/7,−5/7)
(ii) λ = 150/7 so relaxing the constraint would increase the optimal value at the rate of 150/7.
(iii) H∗ =
4 6 1
6 −6 2
1 2 0
|H∗| = 14 > 0 so maximum at (45/7,−5/7).
5. (i) (4/√5, 2/√5)
(ii) λ = √5/4 so relaxing the constraint would increase the optimal value at the rate of √5/4.
(iii) H∗ =
−√5/2 0 8/√5
0 −√5/2 4/√5
8/√5 4/√5 0
|H∗| = 8√5 > 0 so maximum at (4/√5, 2/√5).
6. (i) (−4/√5, −2/√5)
(ii) λ = −√5/4 so relaxing the constraint would decrease the optimal value at the rate of √5/4.
(iii) H∗ =
√5/2 0 −8/√5
0 √5/2 −4/√5
−8/√5 −4/√5 0
|H∗| = −8√5 < 0 so minimum at (−4/√5, −2/√5).
7. (i) (5√(2/19), 3√(2/19))
(ii) λ = 0.1√(19/2) so relaxing the constraint would increase the optimal value at the rate of 0.1√(19/2).
(iii) H∗ =
−0.4√(19/2) 0 20√(2/19)
0 −√(19/2) 30√(2/19)
20√(2/19) 30√(2/19) 0
|H∗| = 40√38 > 0 so maximum at (5√(2/19), 3√(2/19)).
8. (i) (−5√(2/19), −3√(2/19))
(ii) λ = −0.1√(19/2) so relaxing the constraint would decrease the optimal value at the rate of 0.1√(19/2).
(iii) H∗ =
0.4√(19/2) 0 −20√(2/19)
0 √(19/2) −30√(2/19)
−20√(2/19) −30√(2/19) 0
|H∗| = −40√38 < 0 so minimum at (−5√(2/19), −3√(2/19)).
9. (i) (10, 11)
(ii) λ = 12 so relaxing the constraint would increase the optimal value at the rate of 12.
(iii) H∗ =
0 1 1
1 0 1
1 1 0
|H∗| = 2 > 0 so maximum at (10, 11).
10. (i) (3, 4)
(ii) λ = −6 so relaxing the constraint would decrease the optimal value at the rate of 6.
(iii) H∗ =
−2 −2 2
−2 −1 1
2 1 0
|H∗| = −2 < 0 so minimum at (3, 4).
11. (i) (√8, √8)
(ii) λ = 0.5 so relaxing the constraint would increase the optimal value at the rate of 0.5.
(iii) H∗ =
−1 1 2√8
1 −1 2√8
2√8 2√8 0
|H∗| = 128 > 0 so maximum at (√8, √8).
12. (i) (−√8, −√8)
(ii) λ = 0.5 so relaxing the constraint would increase the optimal value at the rate of 0.5.
(iii) H∗ =
−1 1 −2√8
1 −1 −2√8
−2√8 −2√8 0
|H∗| = 128 > 0 so maximum at (−√8, −√8).
13. (i) (−√8, √8)
(ii) λ = −0.5 so relaxing the constraint would decrease the optimal value at the rate of 0.5.
(iii) H∗ =
1 1 −2√8
1 1 2√8
−2√8 2√8 0
|H∗| = −128 < 0 so minimum at (−√8, √8).
14. (i) (√8, −√8)
(ii) λ = −0.5 so relaxing the constraint would decrease the optimal value at the rate of 0.5.
(iii) H∗ =
1 1 2√8
1 1 −2√8
2√8 −2√8 0
|H∗| = −128 < 0 so minimum at (√8, −√8).