
Lecture 1

Differentiation

Agnieszka Kulacka

Birkbeck College, University of London

Differential and Integral Calculus

Lecture 1: Differentiation

Lecture 2: Unconstrained optimisation

Lecture 3: Constrained optimisation

Lecture 4: Integration

Lecture 5: Difference and differential equations

1

Motivation

Differential calculus is concerned with the way in which a value

of the function changes when the independent variable changes.

In economics, we study, for example, how a change in a firm’s

output level affects its costs and how a change in a country’s

money supply affects the rate of inflation.

2

The limit of the ratio $\Delta y/\Delta x$ as $\Delta x \to 0$ measures the instantaneous rate of change of $y$ with respect to $x$; it is called the derivative.

$$\frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}$$

The derivative of the function y = f(x) at the point P = (x0, y0)

is the slope of the tangent line at that point.

$$f'(x_0) = \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x}$$

3

Example 1.

$f(x) = x^2$

4

Differential

If f ′(x0) is the derivative of the function y = f(x) at the point

P (x0, f(x0)), then the total differential at the point P is

dy = df(x0, dx) = f ′(x0)dx

Thus the differential is a function of both x and dx.

The differential provides us with a method of estimating the

effect of a change in x of the amount dx = ∆x on y, where

∆y is the exact change while dy is the approximate change in y.

This can be expressed as

$$\Delta y = dy + \varepsilon$$

where $\varepsilon$ is the approximation error.

5

Hoy et al. Mathematics for Economics p. 136

6


7


8

Example 2.

Let $f(x) = -x^2$. Use the differential to estimate the changes in $y$ between $P = (10, -100)$ and each of the 5 points $Q_n$, $n = 1, 2, 3, 4, 5$.

Q_n (15,-225) (14,-196) (13,-169) (12,-144) (11,-121)

∆x = dx

∆y

dy = f′(x)dx

ε
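The rows of such a table can be filled in mechanically. The following is a minimal sketch, assuming the function $f(x) = -x^2$ and the base point $x_0 = 10$ from the example above; the names `f`, `f_prime` and `rows` are illustrative only.

```python
def f(x):
    return -x ** 2

def f_prime(x):
    # derivative of -x^2 by the power rule
    return -2 * x

x0 = 10
rows = []
for x1 in (15, 14, 13, 12, 11):
    dx = x1 - x0
    delta_y = f(x1) - f(x0)   # exact change in y
    dy = f_prime(x0) * dx     # change predicted by the differential
    eps = delta_y - dy        # approximation error
    rows.append((dx, delta_y, dy, eps))

for row in rows:
    print(row)
```

Each tuple holds (∆x, ∆y, dy, ε) for one point, in the order the points are listed.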

9

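A function $f(x)$, which is defined on an open interval including the point $x = a$, is differentiable at that point if

$$\lim_{\Delta x \to 0} \frac{f(a + \Delta x) - f(a)}{\Delta x}$$

exists and is finite. That is, the right-hand and left-hand limits exist and are equal:

$$\lim_{\Delta x \to 0^+} \frac{f(a + \Delta x) - f(a)}{\Delta x} = \lim_{\Delta x \to 0^-} \frac{f(a + \Delta x) - f(a)}{\Delta x}$$

The value of this expression is the value of the derivative function $f'(x)$ at the point $x = a$.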

10

Example 3.

Find the left- and right-hand derivatives at the point of non-

differentiability.

$$f(x) = \begin{cases} x, & x < 1 \\ 2 - x, & x \ge 1 \end{cases}$$
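One-sided derivatives can be approximated with one-sided difference quotients. The sketch below does this for the piecewise function of this example at $x = 1$; the helper `one_sided_derivative` and the step size are assumptions made for illustration, not part of the lecture.

```python
def f(x):
    return x if x < 1 else 2 - x

def one_sided_derivative(func, a, h):
    """Difference quotient with step h: h > 0 approximates the right-hand
    derivative at a, h < 0 the left-hand derivative."""
    return (func(a + h) - func(a)) / h

right = one_sided_derivative(f, 1.0, 1e-6)
left = one_sided_derivative(f, 1.0, -1e-6)
print(left, right)
```

The two estimates differ, which is the numerical counterpart of the function not being differentiable at that point.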

11


12

Rules of differentiation

1. Derivative of a constant function:

If f(x) = c, where c ∈ R, then f ′(x) = 0.

2. Derivative of a linear function:

If f(x) = mx+ c, where m, c ∈ R, then f ′(x) = m.

3. Derivative of a power function:

If $f(x) = x^n$, where $n \in \mathbb{R}$, then $f'(x) = nx^{n-1}$.

4. Derivative of a constant multiple of a function:

If g(x) = cf(x), where c ∈ R, then g′(x) = cf ′(x).

13


5. Derivative of the sum of an arbitrary but finite number of

functions:

If $h(x) = \sum_{i=1}^{n} f_i(x)$, where $n \in \mathbb{N}$, then $h'(x) = \sum_{i=1}^{n} f_i'(x)$.

6. Product Rule:

If h(x) = f(x)g(x), then h′(x) = f ′(x)g(x) + f(x)g′(x).

7. Quotient Rule:

If $h(x) = f(x)/g(x)$ with $g(x) \neq 0$, then $h'(x) = \dfrac{f'(x)g(x) - f(x)g'(x)}{[g(x)]^2}$.

8. Chain Rule:

If y = f(u) and u = g(x) so that y = f(g(x)) = h(x), then

h′(x) = f ′(u)g′(x).
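The product, quotient and chain rules can be checked numerically against a finite-difference derivative. This is a minimal sketch: the choice $f(x) = x^3$, $g(x) = e^x$, the evaluation point and the helper `derivative` are all assumptions made for illustration.

```python
import math

def derivative(func, x, h=1e-6):
    """Central-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

f = lambda x: x ** 3            # f'(x) = 3x^2 by the power rule
g = lambda x: math.exp(x)       # g'(x) = e^x
f_prime = lambda x: 3 * x ** 2
g_prime = lambda x: math.exp(x)

x = 1.5
# Numerical derivatives of the combined functions
product = derivative(lambda t: f(t) * g(t), x)
quotient = derivative(lambda t: f(t) / g(t), x)
chain = derivative(lambda t: f(g(t)), x)

# Values predicted by rules 6, 7 and 8
product_rule = f_prime(x) * g(x) + f(x) * g_prime(x)
quotient_rule = (f_prime(x) * g(x) - f(x) * g_prime(x)) / g(x) ** 2
chain_rule = f_prime(g(x)) * g_prime(x)

print(product - product_rule, quotient - quotient_rule, chain - chain_rule)
```

The printed differences should be close to zero, up to the error of the finite-difference approximation.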

14

Example 4.

If f is differentiable at x, find the expressions for the derivatives

of the following functions:

(a) $x + f(x)$  (b) $[f(x)]^2 - x$  (c) $[f(x)]^4$

(d) $x^2 f(x) + [f(x)]^3$  (e) $xf(x)$  (f) $\sqrt{f(x)}$

(g) $x^2 / f(x)$  (h) $[f(x)]^2 / x^3$

15


9. Derivative of the inverse of a function:

If $y = f(x)$ has an inverse function $x = g(y)$, and $f'(x) \neq 0$, then $g'(y) = 1/f'(x)$.

10. Derivative of an exponential function:

If $y = e^x$, then $dy/dx = e^x$.

11. Derivative of a logarithmic function:

If $y = \ln x$, then $dy/dx = 1/x$.

16

Example 5.

Find the expressions for the derivatives of the following functions

w.r.t. x:

(a) $x \ln(x^2 + 2)$  (b) ln(√x + 1)  (c) $(e^x + x^2)^{10}$

Example 6.

Let $y = 100p^{-2} + p$. Find $\left.\dfrac{dp}{dy}\right|_{p=10}$.

17

Monotonicity

We say that the function $f$ is strictly increasing (↑) if $f(x_1) < f(x_2)$ whenever $x_1 < x_2$. Similarly, $f$ is strictly decreasing (↓) if $f(x_1) > f(x_2)$ whenever $x_1 < x_2$. In either case $f$ is said to be a monotonic function.

Test for monotonicity

If $f'(x) \ge 0$ for all $x$, and $f'(x) = 0$ for only a finite number of values of $x$, then $f$ is strictly increasing. Similarly, if $f'(x) \le 0$ for all $x$, and $f'(x) = 0$ for only a finite number of values of $x$, then $f$ is strictly decreasing.
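The test can be illustrated on $f(x) = x^5 - 10x^3 + 45x$ (the function of Problem Set 1, question 5(a)), whose derivative factors as $5(x^2 - 3)^2$: it is never negative and vanishes only at $x = \pm\sqrt{3}$. The sketch below checks this on a grid; the grid range and spacing are arbitrary assumptions, and a grid check is only a sanity check, not a proof.

```python
def f(x):
    return x ** 5 - 10 * x ** 3 + 45 * x

def f_prime(x):
    # 5x^4 - 30x^2 + 45 = 5(x^2 - 3)^2, which is never negative
    return 5 * x ** 4 - 30 * x ** 2 + 45

grid = [i / 100 for i in range(-500, 501)]   # x from -5 to 5 in steps of 0.01

derivative_nonnegative = all(f_prime(x) >= 0 for x in grid)
values = [f(x) for x in grid]
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(derivative_nonnegative, strictly_increasing)
```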

18

Example 7.

Find the interval where the following functions are increasing:

(a) $y = (\ln x)^2 - 4$  (b) $y = \ln(e^x + e^{-x})$

(c) $y = x - \tfrac{3}{2}\ln(x^2 + 2)$

19

Partial Derivatives

We will extend the idea of the derivative for functions of one variable to functions defined on $\mathbb{R}^n$.

Let $y = f(x_1, x_2, \ldots, x_n)$. We can define the rate at which $y$ changes with respect to changes in each of the variables $x_1, x_2, \ldots, x_n$, taken separately, in the same way as for one independent variable.

The partial derivative $f_i(x_1, \ldots, x_i, \ldots, x_n)$ of a function $y = f(x_1, x_2, \ldots, x_n)$ with respect to the variable $x_i$ is

$$\frac{\partial f}{\partial x_i} = \lim_{\Delta x_i \to 0} \frac{f(x_1, \ldots, x_i + \Delta x_i, \ldots, x_n) - f(x_1, \ldots, x_i, \ldots, x_n)}{\Delta x_i}$$

20


21

Example 8.

Find $\partial z/\partial x$ and $\partial z/\partial y$ for the following functions

(a) $z = x^2 + 3y^2$  (b) $z = xy$  (c) $z = 5x^4y^2 - 2xy^5$

(d) $z = e^{x+y}$  (e) $z = e^{xy}$  (f) $z = e^{x/y}$

(g) $z = \ln(x + y)$  (h) $z = \ln(xy)$

22

Gradient Vector

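Assume that $f(x_1, \ldots, x_n)$ is continuous and its partial derivatives are defined for all $(x_1, \ldots, x_n)$. Then we may define the gradient vector

$$\nabla f = \begin{pmatrix} f_1 \\ f_2 \\ \vdots \\ f_n \end{pmatrix}, \quad \text{where } f_i = \frac{\partial f}{\partial x_i}.$$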

23

Example 9.

Find the gradient vector for the following functions

(a) $f(x, y) = x^7 - y^7$  (b) $f(x, y) = x^5 \ln y$

(c) $f(x, y) = (x^2 - 2y^2)^5$  (d) $f(x, y, z) = x^2 + y^3 + z^4$

(e) $f(x, y, z) = xyz$  (f) $f(x, y, z) = x^4/yz$
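A gradient found by hand can be checked against finite differences taken one variable at a time, which mirrors the definition of the partial derivative. The following is a minimal sketch for $f(x, y, z) = x^2 + y^3 + z^4$; the helper `gradient`, the step size and the evaluation point are assumptions chosen for illustration.

```python
def gradient(func, point, h=1e-6):
    """Approximate the gradient of func at point by central differences,
    perturbing one coordinate at a time."""
    grad = []
    for i in range(len(point)):
        up = list(point)
        down = list(point)
        up[i] += h
        down[i] -= h
        grad.append((func(*up) - func(*down)) / (2 * h))
    return grad

def f(x, y, z):
    return x ** 2 + y ** 3 + z ** 4

point = (1.0, 2.0, -1.0)
numeric = gradient(f, point)
# Term-by-term differentiation gives (2x, 3y^2, 4z^3)
analytic = [2 * point[0], 3 * point[1] ** 2, 4 * point[2] ** 3]
print(numeric, analytic)
```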

24

Total differential

Let $dx = \begin{pmatrix} dx_1 \\ \vdots \\ dx_n \end{pmatrix}$.

The first-order total differential for the function $y = f(x_1, \ldots, x_n)$ is

$$dy = f_1\,dx_1 + f_2\,dx_2 + \cdots + f_n\,dx_n = (\nabla f)^T dx$$

where $f_i = \partial f/\partial x_i$.

25

Example 10.

Calculate the differentials of the following functions

(a) $z = x^3 + y^3$  (b) $z = xe^{y^2}$

(c) $z = \ln(x^2 - y^2)$

26

Using the total differential to approximate the actual change ∆y

in the function value for given changes in x1, ..., xn corresponds

geometrically to using the tangent plane as an approximation of

the function in the same way as we used the tangent line for the

functions of one variable.

27


28

Example 11.

Let $T(x, y, z) = (x^2 + y^2 + z^2)^{1/2}$. Find the total differential and use it to estimate the change in $T$ if $x$ changes from 2 to 2.01, $y$ changes from 3 to 2.99 and $z$ changes from 6 to 6.02.
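The estimate from the total differential can be compared with the exact change. The sketch below does this for the function $T$ of this example, using the fact that its partial derivatives are $x/T$, $y/T$ and $z/T$; the variable names are illustrative.

```python
import math

def T(x, y, z):
    return math.sqrt(x ** 2 + y ** 2 + z ** 2)

x, y, z = 2.0, 3.0, 6.0
dx, dy, dz = 0.01, -0.01, 0.02

# dT = T_x dx + T_y dy + T_z dz = (x dx + y dy + z dz) / T
t0 = T(x, y, z)
dT = (x * dx + y * dy + z * dz) / t0
actual = T(x + dx, y + dy, z + dz) - t0   # exact change, for comparison

print(dT, actual)
```

The two numbers agree to several decimal places because the changes in $x$, $y$ and $z$ are small relative to the values themselves.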

29

Total derivative

Using the chain rule, we can define the total derivative

$$\frac{d}{dt} f(x_1, x_2, \ldots, x_n) = (\nabla f)^T \frac{dx}{dt}$$

30

Example 12.

Find $dz/dt$ when

(a) $z = x^2 + y^3$ with $x = t^2$ and $y = 2t$

(b) $z = xe^{2y}$ with $x = \sqrt{t}$ and $y = \ln t$
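The total derivative formula can be checked by differentiating the composite function numerically. This is a minimal sketch for $z = x^2 + y^3$ with $x = t^2$ and $y = 2t$; the evaluation point $t = 1.5$ and the step size are arbitrary assumptions.

```python
def x(t):
    return t ** 2

def y(t):
    return 2 * t

def z(t):
    return x(t) ** 2 + y(t) ** 3

def total_derivative(t):
    # (grad z)^T dx/dt with grad z = (2x, 3y^2) and (dx/dt, dy/dt) = (2t, 2)
    return 2 * x(t) * (2 * t) + 3 * y(t) ** 2 * 2

t = 1.5
h = 1e-6
numeric = (z(t + h) - z(t - h)) / (2 * h)   # central difference on z(t)
print(total_derivative(t), numeric)
```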

31

Problem Set 1: Differentiation

1. Let $f(x) = x^2$. Use the differential to estimate the changes in $y$ between $P = (20, 400)$ and each of the 5 points $Q_n$, $n = 1, 2, 3, 4, 5$.

Q_n (25,625) (24,576) (23,529) (22,484) (21,441)

∆x = dx

∆y

dy = f′(x)dx

ε

What does this example suggest about the use of the differential as an estimate of the actual change in the function value as $x$ changes?

2. Find the left- and right-hand derivatives at the point of non-differentiability.

$$f(x) = \begin{cases} 3x + 2, & x \le 5 \\ x + 12, & x > 5 \end{cases}$$

3. Differentiate the following:

(a) $(x^4 - 3x^2)(5x + 1)^5$

(b) $(x^m + 8)^3(5x^2 + 2x^{-n})$

(c) $(x^2 + 1)/(2x^3 + 1)$

(d) $xe^{2x}$

(e) $x/(1 + e^x)$

(f) $(e^{3x} - 1)^4$

(g) $\ln(x^4 + 1)$

(h) $\ln(e^x + 1)$

(i) $\ln\dfrac{x^2}{x^4 + 1}$

(j) $x^x$

4. Find $dy/dx$ at the point $y = 0$, when

$$x = 0.8 + 0.7y - (y + 2)^{-1}$$

5. Which of the following functions are monotonic?

(a) $f(x) = x^5 - 10x^3 + 45x$

(b) $f(x) = x - x^{-1}$

(c) $f(x) = 5 - x^6$

(d) $f(x) = 5 - x^6$, $x > 0$

(e) f(x) = x+ |x|

(f) f(x) = 2x+ |x|

1

6. In each of the following cases find the gradient vector of the function $f(x, y)$.

(a) $f(x, y) = 3x + 4y^3$

(b) $f(x, y) = x^3 \ln y + 6x^2y^3 + e^{2x}y$

(c) $f(x, y) = (x^2 + 4y^2)^{-1/2}$

(d) $f(x, y) = (x + 4y)(e^{-2x} + e^{-3y})$

7. In each of the following cases find the gradient vector of the function $f(x, y)$ and evaluate it at the point $(1, -2)$.

(a) $f(x, y) = 3x^2 + 2y^5$

(b) $f(x, y) = 3x^2y^3 + 2x^3y^2$

(c) $f(x, y) = (3x - 2y)/(x^2 + y^2)$

(d) $f(x, y) = x \ln(1 + y^2)$

8. Let $z = x^2y + y^5$. Find the total differential and use it to estimate the change in $z$ if $x$ changes from $-3$ to $-3.1$ and $y$ changes from 2 to 2.1.

9. If $f(x_1, x_2) = 2x_1 + 3x_2$, $x_1 = 2t^3$, $x_2 = 3 + e^t$, find the total derivative $df/dt$.

10. If $z = x^3y^4 + xe^y$, $x = 1 + 6t$, $y = -2 - 3t$, find the total derivative $dz/dt$.

11. Let $x_1 = 8p_1^{-2}p_2^{1/2}$, $x_2 = 4p_1^{1/2}p_2^{-1/2}$, and $p_1 = 2t^{1/2}$, $p_2 = 3t^{3/2}$. Find the total derivatives $dx_1/dt$ and $dx_2/dt$.

2

Answers

1.

Qi (25,625) (24,576) (23,529) (22,484) (21,441)

∆x = dx 5 4 3 2 1

∆y 225 176 129 84 41

dy = f ′(x)dx 200 160 120 80 40

ε 25 16 9 4 1

This suggests that as $\Delta x = dx$ gets smaller, the value of the total differential $dy$ becomes a better approximation to the actual change $\Delta y$. In this case $dy$ underestimates the value of $\Delta y$.

2.

With $a = 5$:

$$\lim_{\Delta x \to 0^+} \frac{f(a + \Delta x) - f(a)}{\Delta x} = 1$$

$$\lim_{\Delta x \to 0^-} \frac{f(a + \Delta x) - f(a)}{\Delta x} = 3$$

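3. (a) $(45x^4 + 4x^3 - 105x^2 - 6x)(5x + 1)^4$

(b) $(x^m + 8)^2[(15m + 10)x^{m+1} + (6m - 2n)x^{m-n-1} + 80x - 16nx^{-n-1}]$

(c) $\dfrac{-2x^4 - 6x^2 + 2x}{(2x^3 + 1)^2}$

(d) $e^{2x}(2x + 1)$

(e) $\dfrac{1 + e^x - xe^x}{(1 + e^x)^2}$

(f) $12e^{3x}(e^{3x} - 1)^3$

(g) $\dfrac{4x^3}{x^4 + 1}$

(h) $\dfrac{e^x}{e^x + 1}$

(i) $\dfrac{2 - 2x^4}{x(x^4 + 1)}$

(j) $(\ln x + 1)x^x$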

4. 20/19

5. (a) monotonic (↑)

(b) monotonic (↑)

(c) not monotonic

(d) monotonic (↓)

(e) not monotonic (weakly monotonic)

(f) monotonic (↑)

3

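6. (a) $\begin{pmatrix} 3 \\ 12y^2 \end{pmatrix}$

(b) $\begin{pmatrix} 3x^2 \ln y + 12xy^3 + 2e^{2x}y \\ x^3/y + 18x^2y^2 + e^{2x} \end{pmatrix}$

(c) $-(x^2 + 4y^2)^{-3/2} \begin{pmatrix} x \\ 4y \end{pmatrix}$

(d) $\begin{pmatrix} (1 - 2x - 8y)e^{-2x} + e^{-3y} \\ 4e^{-2x} + (4 - 3x - 12y)e^{-3y} \end{pmatrix}$

7. (a) $\begin{pmatrix} 6x \\ 10y^4 \end{pmatrix}$, $\begin{pmatrix} 6 \\ 160 \end{pmatrix}$

(b) $\begin{pmatrix} 6xy^3 + 6x^2y^2 \\ 9x^2y^2 + 4x^3y \end{pmatrix}$, $\begin{pmatrix} -24 \\ 28 \end{pmatrix}$

(c) $(x^2 + y^2)^{-2} \begin{pmatrix} -3x^2 + 3y^2 + 4xy \\ -2x^2 + 2y^2 - 6xy \end{pmatrix}$, $\begin{pmatrix} 1/25 \\ 18/25 \end{pmatrix}$

(d) $\begin{pmatrix} \ln(1 + y^2) \\ 2xy/(1 + y^2) \end{pmatrix}$, $\begin{pmatrix} \ln 5 \\ -0.8 \end{pmatrix}$

8. $dz = 2xy\,dx + (x^2 + 5y^4)\,dy$, $\Delta z \approx 10.1$

9. $12t^2 + 3e^t$

10. $6(3x^2y^4 + e^y) - 3(4x^3y^3 + xe^y)$

11. $dx_1/dt = -x_1/(4t)$, $dx_2/dt = -x_2/(8t)$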

4

Lecture 2

Unconstrained optimisation

Agnieszka Kulacka








1

Motivation

Many economic models are based on the idea that an individual

decision maker makes an optimal choice from some given set of

alternatives. To formalise this idea, we interpret optimal choice

as maximising or minimising the value of some function. For

example, a firm is assumed to minimise costs of producing each

level of output and to maximise profit, etc.

2

Higher-order derivatives

Since the derivative of a function $y = f(x)$ is also a function, $df(x)/dx$, we can find its derivative, which we call the second derivative and denote by

$$\frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{d^2y}{dx^2} \quad \text{or} \quad f''(x)$$

Similarly, we can find higher-order derivatives. The $n$th derivative is denoted by $d^ny/dx^n$ or $f^{(n)}(x)$.

3

Example 1.

Compute the second derivatives of

(a) $y = x^5 - 3x^4 + 2$

(b) $y = \sqrt{1 + x^2}$

(c) $y = x^5e^x$

(d) $y = e^x/x$

(e) $y = x^2 \ln x$

(f) $y = \ln x / x$

4

Convexity and Concavity

Let f(x) be a twice differentiable function.

1. f(x) is convex if, at all points on its domain f ′′(x) ≥ 0.

2. f(x) is strictly convex if, at all points on its domain f ′′(x) > 0,

except possibly at a single point.

3. f(x) is concave if, at all points on its domain f ′′(x) ≤ 0.

4. f(x) is strictly concave if, at all points on its domain f ′′(x) < 0,

except possibly at a single point.

5


6

Example 2.

Find the intervals over which these functions are concave and

the intervals over which the functions are convex.

(a) $f(x) = x^{1/4}$, $x > 0$

(b) $f(y) = y^3 - 9y^2 + 60y + 10$, $y \ge 0$

7

Second-order partial derivatives

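Similarly to functions of one variable, we can define second-order partial derivatives of a function $f(x_1, \ldots, x_n)$:

$$\frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial}{\partial x_j}\left(\frac{\partial f}{\partial x_i}\right)$$

where $1 \le i, j \le n$.

The partial derivative $\partial^2 f/\partial x_i \partial x_j$ is also denoted $f_{ij}$.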

8

Example 3.

Let g be defined by

$$g(x, y, z) = 2x^3 - 4xy + 10y^2 + z^2 - 4x - 28y - z + 24$$

for all (x, y, z)

Find all partial derivatives of the first and second orders.

9

Smooth functions

If the continuous function $f(x_1, \ldots, x_n)$ is such that its first-order partial derivatives are defined for all $(x_1, \ldots, x_n)$ and are continuous, then $f$ is said to be of class $C^1$. If $f$ is of class $C^1$ and its partial derivatives are of class $C^1$, then $f$ is said to be a function of class $C^2$ (a smooth function).

Young's theorem: If $f$ is a smooth function, then

$$\frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i}$$

10

Hessian matrix

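We defined the gradient vector of a function $f$ as the vector of the first-order partial derivatives of $f$. We can define a matrix of second-order partial derivatives of $f$:

$$\nabla^2 f = \begin{pmatrix} f_{11} & f_{12} & \cdots & f_{1n} \\ f_{21} & f_{22} & \cdots & f_{2n} \\ \vdots & & & \vdots \\ f_{n1} & f_{n2} & \cdots & f_{nn} \end{pmatrix}$$

This matrix is called the Hessian matrix of $f$.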

11

Example 4.

Find the Hessian matrices of

(a) $f(x, y, z) = ax^2 + by^2 + cz^2$

(b) $f(x, y, z) = Ax^ay^bz^c$

12

The second-order total differential

We define the second-order total differential of $y = f(x_1, \ldots, x_n)$ as

$$d^2y = dx^T\,\nabla^2 f\,dx$$

13

Example 5.

Find the second-order total differential of

(a) y = f(x)

(b) z = f(x, y)

14

Convexity and concavity

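For any smooth function $y = f(x_1, \ldots, x_n)$, it follows that

1. $f$ is strictly convex on $\mathbb{R}^n$ if $d^2y > 0$ at every point for every $dx \neq 0$.

2. $f$ is convex on $\mathbb{R}^n$ if $d^2y \ge 0$ at every point for every $dx$.

3. $f$ is strictly concave on $\mathbb{R}^n$ if $d^2y < 0$ at every point for every $dx \neq 0$.

4. $f$ is concave on $\mathbb{R}^n$ if $d^2y \le 0$ at every point for every $dx$.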

15

Leading principal minors

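Let $y = f(x_1, \ldots, x_n)$ be any smooth function and $H$ its Hessian matrix. The leading principal minors are

$$|H_1| = f_{11}, \quad |H_2| = \begin{vmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{vmatrix}, \quad \ldots, \quad |H_n| = \begin{vmatrix} f_{11} & f_{12} & \cdots & f_{1n} \\ f_{21} & f_{22} & \cdots & f_{2n} \\ \vdots & & & \vdots \\ f_{n1} & f_{n2} & \cdots & f_{nn} \end{vmatrix}$$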

16

Strict Convexity and concavity

For any smooth function $y = f(x_1, \ldots, x_n)$ with the Hessian matrix $H$, it follows that

1. $f$ is strictly convex on $\mathbb{R}^n$ if all leading principal minors of $H$ are positive, i.e.

$$|H_1| > 0, \ |H_2| > 0, \ \ldots, \ |H_n| > 0.$$

2. $f$ is strictly concave on $\mathbb{R}^n$ if the leading principal minors of $H$ alternate in sign beginning with a negative value for $|H_1|$, i.e.

$$|H_1| < 0, \ |H_2| > 0, \ \ldots, \ |H_n| = |H| \ \begin{cases} > 0 & \text{if } n \text{ is even} \\ < 0 & \text{if } n \text{ is odd} \end{cases}$$
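The sign test on the leading principal minors is easy to mechanise for a constant Hessian. The following is a minimal sketch in plain Python; the helper names `det`, `leading_principal_minors` and `classify` are hypothetical, and the three matrices are the Hessians of $x_1^2 + x_2^2$, $5 - (x_1 + x_2)^2$ and $10 - x_1^2 - x_2^2$, computed by hand.

```python
def det(m):
    """Determinant by cofactor expansion along the first row
    (adequate for the small matrices used here)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def leading_principal_minors(h):
    # |H_k| is the determinant of the top-left k x k block
    return [det([row[:k] for row in h[:k]]) for k in range(1, len(h) + 1)]

def classify(h):
    minors = leading_principal_minors(h)
    if all(d > 0 for d in minors):
        return "strictly convex"
    # alternating signs: |H_1| < 0, |H_2| > 0, ...
    if all((-1) ** k * d > 0 for k, d in enumerate(minors, start=1)):
        return "strictly concave"
    return "test inconclusive"

h_a = [[2, 0], [0, 2]]        # Hessian of x1^2 + x2^2
h_b = [[-2, -2], [-2, -2]]    # Hessian of 5 - (x1 + x2)^2
h_c = [[-2, 0], [0, -2]]      # Hessian of 10 - x1^2 - x2^2

for h in (h_a, h_b, h_c):
    print(leading_principal_minors(h), classify(h))
```

For `h_b` the second minor is zero, so the strict test says nothing, and the weaker test based on all principal minors is needed.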

17

Example 6.

Are these functions strictly concave or convex?

(a) $f(x_1, x_2) = x_1^2 + x_2^2$

(b) $f(x_1, x_2) = 5 - (x_1 + x_2)^2$

18

Principal minors

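Let $y = f(x_1, \ldots, x_n)$ be any smooth function and $H$ its Hessian matrix. The principal minors $|H_k^*|$ refer to more than one minor of order $k$.

For $n = 3$, the principal minors are

$$|H_1^*|: \ f_{11}, \ f_{22}, \ f_{33},$$

$$|H_2^*|: \ \begin{vmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{vmatrix}, \ \begin{vmatrix} f_{11} & f_{13} \\ f_{31} & f_{33} \end{vmatrix}, \ \begin{vmatrix} f_{22} & f_{23} \\ f_{32} & f_{33} \end{vmatrix},$$

$$|H_3^*| = \begin{vmatrix} f_{11} & f_{12} & f_{13} \\ f_{21} & f_{22} & f_{23} \\ f_{31} & f_{32} & f_{33} \end{vmatrix}.$$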

19

Convexity and concavity

For any smooth function $y = f(x_1, \ldots, x_n)$ with the Hessian matrix $H$, it follows that

1. $f$ is convex on $\mathbb{R}^n$ if all principal minors of $H$ are non-negative, i.e.

$$|H_1^*| \ge 0, \ |H_2^*| \ge 0, \ \ldots, \ |H_n^*| \ge 0.$$

2. $f$ is concave on $\mathbb{R}^n$ if the principal minors of $H$ alternate in sign beginning with a non-positive value for $|H_1^*|$, i.e.

$$|H_1^*| \le 0, \ |H_2^*| \ge 0, \ \ldots, \ |H_n^*| = |H| \ \begin{cases} \ge 0 & \text{if } n \text{ is even} \\ \le 0 & \text{if } n \text{ is odd} \end{cases}$$

20

Example 7.

Are these functions concave or convex?

(a) $f(x_1, x_2) = (x_1 + x_2)^{1/2}$ defined on $\mathbb{R}^2_{++}$

(b) $f(x_1, x_2) = 3x_1 + x_2^2$

21

Unconstrained optimisation for functions on R

Given some function f , we optimize it by finding a value x at

which it takes on a maximum or a minimum value. Such values

are called extreme values of the function. If the set of x-values

from which we can choose is the entire real line, the problem

of finding an extreme value is unconstrained, while if the set of

x-values is restricted to be a proper subset of the real line, the

problem is constrained.

22

Maxima for functions on R

A global maximum of a function $f$ is $f(x^*)$ if

$$f(x^*) \ge f(x) \quad \text{for all } x$$

A local maximum of a function $f$ is $f(x^*)$ if

$$f(x^*) \ge f(x) \quad \text{for } x^* - \varepsilon \le x \le x^* + \varepsilon$$

for some arbitrarily small $\varepsilon > 0$.

23

First-order condition

Take the differential

dy = f ′(x∗)dx

If the function is at a local maximum at x∗, it must be impossible

to increase its value by small changes, dx, in either direction from

x∗. So

f ′(x∗) = 0

24

Minima for functions on R

A global minimum of a function $f$ is $f(x^*)$ if

$$f(x^*) \le f(x) \quad \text{for all } x$$

A local minimum of a function $f$ is $f(x^*)$ if

$$f(x^*) \le f(x) \quad \text{for } x^* - \varepsilon \le x \le x^* + \varepsilon$$

for some arbitrarily small $\varepsilon > 0$.

25

First-order condition (necessary condition)

If a function $f$ has a turning point (a local maximum or a local minimum) at $x^*$, then $f'(x^*) = 0$.

If $f'(x^*) = 0$, then $f$ has a stationary point at $x^*$, which may be a turning point or an inflection point.

26


27

Second-order condition (sufficient condition)

If f ′(x∗) = 0 and f ′′(x∗) < 0 (f is strictly concave), then f has a

local maximum at x∗.

If f ′(x∗) = 0 and f ′′(x∗) > 0 (f is strictly convex), then f has a

local minimum at x∗.
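The first- and second-order conditions combine into a simple classification routine. The sketch below applies them to $y = x^2e^{-x}$, one of the functions in this lecture's examples; the derivative formulas were worked out by hand with the product rule, and the function names are illustrative.

```python
import math

def f_prime(x):
    # d/dx of x^2 e^{-x} by the product rule
    return (2 * x - x ** 2) * math.exp(-x)

def f_second(x):
    # d/dx of (2x - x^2) e^{-x}
    return (2 - 4 * x + x ** 2) * math.exp(-x)

def classify(x_star):
    """Second-order condition at a point where f'(x*) = 0."""
    curvature = f_second(x_star)
    if curvature < 0:
        return "local maximum"
    if curvature > 0:
        return "local minimum"
    return "second-order test inconclusive"

stationary_points = [0.0, 2.0]   # roots of 2x - x^2 = 0
results = {x: classify(x) for x in stationary_points}
print(results)
```

When $f''(x^*) = 0$ the test gives no answer, and the point has to be examined by other means (for example, the sign of $f'$ on either side).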

28

Example 8.

Find the local extrema of the following functions

(a) $f(x) = e^{2x} - 5e^x + 4$

(b) $f(x) = x^3 \ln x$ defined on $\mathbb{R}_+$

(c) $y = x^2e^{-x}$

29

Stationary points for functions on Rn

A stationary value of a function $f$ over the domain $\mathbb{R}^n$ occurs at a point $(x_1^*, x_2^*, \ldots, x_n^*)$ at which the following equation holds

$$\nabla f(x_1^*, x_2^*, \ldots, x_n^*) = 0$$

where $0$ is the null vector.

30

First-order condition (necessary condition)

If a function $f$ has a local maximum or a local minimum at $x^* \in \mathbb{R}^n$, then $\nabla f(x^*) = 0$.

If $\nabla f(x^*) = 0$, then $f$ has a stationary point at $x^*$, which may be a local maximum, a local minimum or a saddle point.

31


32


33

Second-order condition (sufficient condition)

Let y = f(x).

If $\nabla f(x^*) = 0$ and $d^2y(x^*) < 0$ ($f$ is strictly concave), then $f$ has a local maximum at $x^*$.

If $\nabla f(x^*) = 0$ and $d^2y(x^*) > 0$ ($f$ is strictly convex), then $f$ has a local minimum at $x^*$.
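For a function of two variables the sign of $d^2y$ can be read off the leading principal minors of the Hessian. The following is a minimal sketch for $f(x, y) = 5 - x^2 + 6x - 2y^2 + 8y$, one of the functions in this lecture's examples; the helper `classify_2x2` and its labels are assumptions made for illustration.

```python
def gradient(x, y):
    # f(x, y) = 5 - x^2 + 6x - 2y^2 + 8y
    return (-2 * x + 6, -4 * y + 8)

hessian = [[-2, 0], [0, -4]]   # constant, because f is quadratic

def classify_2x2(h):
    """Sign test on the leading principal minors of a 2x2 Hessian."""
    h1 = h[0][0]
    h2 = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    if h2 > 0 and h1 > 0:
        return "local minimum"      # d2y > 0 for every dx != 0
    if h2 > 0 and h1 < 0:
        return "local maximum"      # d2y < 0 for every dx != 0
    if h2 < 0:
        return "not a local extremum"   # d2y takes both signs
    return "inconclusive"

stationary = (3, 2)   # solves -2x + 6 = 0 and -4y + 8 = 0
print(gradient(*stationary), classify_2x2(hessian))
```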

34

Example 9.

Find the local extrema of the following functions

(a) $f(x, y) = x^3 - x^2 - y^2 + 8$

(b) $f(x, y) = 5 - x^2 + 6x - 2y^2 + 8y$

(c) $f(x_1, x_2) = x_1^2 + 2x_1x_2^2 + 2x_2^2$

(d) $f(x_1, x_2, x_3) = 2x_1^2 + x_2^2 + 4x_3^2 - x_1 + 2x_3$

35

Problem Set 2: Unconstrained optimisation

1. Find the first four derivatives of the function $f(x) = x^4$.

2. Which of the following functions are convex? Which are concave?

(a) $f(x) = (2x + 1)^6$

(b) $f(x) = x^5 - x$

(c) $f(x) = x - x^2$

3. In each of the following cases find the Hessian matrix of the function $f(x, y)$. (See Problem Set 1 Question 6.)

(a) $f(x, y) = 3x + 4y^3$

(b) $f(x, y) = x^3 \ln y + 6x^2y^3 + e^{2x}y$

(c) $f(x, y) = (x^2 + 4y^2)^{-1/2}$

(d) $f(x, y) = (x + 4y)(e^{-2x} + e^{-3y})$

4. In each of the following cases find the Hessian matrix of the function $f(x, y)$ and evaluate it at the point $(1, -2)$. (See Problem Set 1 Question 7.)

(a) $f(x, y) = 3x^2 + 2y^5$

(b) $f(x, y) = 3x^2y^3 + 2x^3y^2$

(c) $f(x, y) = (3x - 2y)/(x^2 + y^2)$

(d) $f(x, y) = x \ln(1 + y^2)$

5. Which of the following functions are convex? Which are strictly convex? Which are concave? Which are strictly concave?

(a) $f(x_1, x_2) = (x_1 + x_2)^2$

(b) $f(x_1, x_2) = 10 - x_1^2 - x_2^2$

(c) $f(x_1, x_2) = x_1^{1/2}x_2^{1/3}$ defined on $\mathbb{R}^2_{++}$

(d) $f(x_1, x_2) = x_1^{1/2}x_2^{1/2}$ defined on $\mathbb{R}^2_{++}$

6. Find the stationary points and classify them.

(a) $y = x^3 - x^2 + 1$

(b) $y = x^4 - 4x^3 + 16x - 2$

(c) $y = x + 1/x$

(d) $y = (1 - x^2)/(1 + x^2)$

(e) $y = (3 - x^2)^{1/2}$

(f) $y = x^{0.5}e^{-0.1x}$

1

7. Find the stationary points and classify them.

(a) $y = 0.5x_1^2 + 2x_2^2$

(b) $y = x_1 + x_2 - x_1^2 - x_2^2 + x_1x_2$

(c) $y = 10x_1 + 2x_2 - 0.5x_1^2 - 2x_2^2 + 5x_1x_2$

(d) $y = 2x_1x_2 - x_1^3 - x_2^3$

(e) $y = x_1^3 + x_2^3 - 4x_1x_2$

(f) $y = x_1x_2 + 2/x_1 + 4/x_2$ defined on $\mathbb{R}^2_{++}$

(g) $y = 2x_1^2 - 4x_2^2$

2

Answers

1. $f'(x) = 4x^3$, $f''(x) = 12x^2$, $f'''(x) = 24x$, $f^{(4)}(x) = 24$

2. (a) convex

(b) neither

(c) concave

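3. (a) $\begin{pmatrix} 0 & 0 \\ 0 & 24y \end{pmatrix}$

(b) $\begin{pmatrix} 6x \ln y + 12y^3 + 4e^{2x}y & 3x^2/y + 36xy^2 + 2e^{2x} \\ 3x^2/y + 36xy^2 + 2e^{2x} & -x^3/y^2 + 36x^2y \end{pmatrix}$

(c) $2(x^2 + 4y^2)^{-5/2} \begin{pmatrix} x^2 - 2y^2 & 6xy \\ 6xy & -2x^2 + 16y^2 \end{pmatrix}$

(d) $\begin{pmatrix} -4e^{-2x}(1 - x - 4y) & -8e^{-2x} - 3e^{-3y} \\ -8e^{-2x} - 3e^{-3y} & -3e^{-3y}(8 - 3x - 12y) \end{pmatrix}$

4. (a) $\begin{pmatrix} 6 & 0 \\ 0 & 40y^3 \end{pmatrix}$, $\begin{pmatrix} 6 & 0 \\ 0 & -320 \end{pmatrix}$

(b) $\begin{pmatrix} 6y^3 + 12xy^2 & 18xy^2 + 12x^2y \\ 18xy^2 + 12x^2y & 18x^2y + 4x^3 \end{pmatrix}$, $\begin{pmatrix} 0 & 48 \\ 48 & -32 \end{pmatrix}$

(c) $(x^2 + y^2)^{-3} \begin{pmatrix} 6x^3 - 12x^2y - 18xy^2 + 4y^3 & 4x^3 + 18x^2y - 12xy^2 - 6y^3 \\ 4x^3 + 18x^2y - 12xy^2 - 6y^3 & -6x^3 + 12x^2y + 18xy^2 - 4y^3 \end{pmatrix}$, $\begin{pmatrix} -74/125 & -32/125 \\ -32/125 & 74/125 \end{pmatrix}$

(d) $2(1 + y^2)^{-2} \begin{pmatrix} 0 & y(1 + y^2) \\ y(1 + y^2) & x(1 - y^2) \end{pmatrix}$, $\begin{pmatrix} 0 & -0.8 \\ -0.8 & -0.24 \end{pmatrix}$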

5. (a) convex

(b) strictly concave

(c) strictly concave

(d) concave

6. (a) (0,1) local maximum point, (2/3,23/27) local minimum point

(b) $(-1, -13)$ local minimum point, $(2, 14)$ inflection point

(c) $(-1, -2)$ local maximum point, $(1, 2)$ local minimum point

(d) $(0, 1)$ local maximum point

(e) $(0, \sqrt{3})$ local maximum point

(f) $(5, \sqrt{5}e^{-0.5})$ local maximum point

3

7. (a) at (0,0) a local minimum

(b) at (1,1) a local maximum

(c) at (-50/21,-52/21) neither local extremum nor a saddle point

(d) at (0,0) neither local extremum nor a saddle point, at (2/3,2/3) a local maximum

(e) at (0,0) neither local extremum nor a saddle point, at (4/3,4/3) a local minimum

(f) at (1,2) a local minimum

(g) at (0,0) a saddle point

4

Lecture 3

Constrained optimisation

Agnieszka Kulacka








1

Motivation

We will continue to be concerned with maxima and minima of functions of several variables. We now look at the cases where optimisation takes place subject to constraints on the variables. Problems of this sort occur frequently, for example the maximisation of revenue subject to budget constraints, or the minimisation of the cost of a diet subject to nutritional requirements.

We will assume that all functions are smooth.

2

Explanation for two-variable functions

max f(x, y) subject to g(x, y) = b

Set a new function

F (x) = f(x, h(x))

where y = h(x) is an explicit expression of g(x, y) = b.

We find the total derivative

$$\frac{dF}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y} h'(x)$$

and for the stationary point $F'(x) = 0$.

3


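We need to establish what $h'(x)$ is.

$$g(x, h(x)) = b$$

We differentiate the above equation with respect to $x$:

$$\frac{\partial g}{\partial x} + \frac{\partial g}{\partial y} h'(x) = 0$$

Thus

$$h'(x) = -\frac{\partial g}{\partial x}\left(\frac{\partial g}{\partial y}\right)^{-1}$$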

4


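So

$$\frac{dF}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\left(-\frac{\partial g}{\partial x}\right)\left(\frac{\partial g}{\partial y}\right)^{-1}$$

Let

$$\lambda = \frac{\partial f}{\partial y}\left(\frac{\partial g}{\partial y}\right)^{-1} \qquad (1)$$

Thus

$$\frac{dF}{dx} = \frac{\partial f}{\partial x} - \lambda\frac{\partial g}{\partial x} \qquad (2)$$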

5


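Therefore, from equation (2) with $dF/dx = 0$,

$$\frac{\partial f}{\partial x} - \lambda\frac{\partial g}{\partial x} = 0 \qquad (3)$$

from equation (1)

$$\frac{\partial f}{\partial y} - \lambda\frac{\partial g}{\partial y} = 0 \qquad (4)$$

and the constraint

$$g(x, y) - b = 0 \qquad (5)$$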

6

Lagrangian function for two-variable case

Step 1. We define the Lagrangian function

$$L(x, y, \lambda) = f(x, y) - \lambda(g(x, y) - b)$$

Step 2. First-order condition. We find the stationary point(s) of $L$ by solving

$$\nabla L = 0$$

So

$\partial L/\partial x = 0$ is equivalent to equation (3).

$\partial L/\partial y = 0$ is equivalent to equation (4).

$\partial L/\partial \lambda = 0$ is equivalent to equation (5).
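When the objective is quadratic and the constraint is linear, the first-order conditions form a linear system that can be solved directly. The sketch below does this for the problem of maximising $xy$ subject to $2x + y = 100$; the solver `solve_linear` is a hypothetical helper written for this illustration and assumes the system has a unique solution.

```python
from fractions import Fraction

def solve_linear(a, b):
    """Gauss-Jordan elimination with exact fractions.
    Assumes the coefficient matrix is square and non-singular."""
    n = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # find a row with a non-zero entry in this column and move it up
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [row[-1] for row in m]

# Unknowns ordered as (x, y, lambda) for L = xy - lambda(2x + y - 100):
#   dL/dx      = y - 2*lambda = 0
#   dL/dy      = x - lambda   = 0
#   dL/dlambda = 0  <=>  2x + y = 100
a = [[0, 1, -2],
     [1, 0, -1],
     [2, 1, 0]]
b = [0, 0, 100]

x, y, lam = solve_linear(a, b)
print(x, y, lam, x * y)
```

For non-linear first-order conditions the same three equations still apply, but they generally have to be solved by substitution or numerically.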

7

Example 1.

Solve the following constrained optimisation problems:

(a) max $f(x, y) = xy$ subject to $2x + y = 100$

(b) min $f(x, y) = x + 20y$ subject to $\sqrt{x} + y = 30$

(c) min $f(x, y) = -40x + x^2 - 2xy - 20y + y^2$ subject to $x + y = 15$

(d) min $f(x, y) = x^2 + y^2$ subject to $x + 2y = 4$

8

Interpretation of Lagrangian multiplier λ

The value of the Lagrangian multiplier λ at the optimum point tells us the effect on the optimised value of the function f (the rate of its increase or decrease) of a small relaxation of the constraint (a small change in b). It is sometimes called the shadow value of the constraint.

(This is explained further by the Envelope Theorem.)

9

The Envelope Theorem

Let $y = L(x, b)$, where $b$ is some exogenous parameter. Suppose that $x^*(b)$ solves the optimisation problem

$$\text{optimise } L(x, b)$$

We define a value function

$$V(b) = L(x^*, b)$$

Then

$$\frac{dV}{db} = \frac{\partial L}{\partial b}(x^*, b)$$

10

Interpretation of Lagrangian multiplier λ

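The Lagrangian function is

$$L(x, y, \lambda) = f(x, y) - \lambda(g(x, y) - b)$$

By the chain rule, with $x$, $y$ and $\lambda$ all depending on $b$,

$$\frac{dL}{db} = (f_1 - \lambda g_1)\frac{dx}{db} + (f_2 - \lambda g_2)\frac{dy}{db} - (g(x, y) - b)\frac{d\lambda}{db} + \lambda$$

When evaluated at $(x^*(b), y^*(b), \lambda^*(b))$, the first three terms vanish by the first-order conditions (3), (4) and (5), and we have

$$\frac{dV}{db} = \frac{\partial L}{\partial b} = \lambda^*$$

Note that the value function is the optimal value of $f$, which is a function of $b$.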
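The shadow-value interpretation can be checked numerically by perturbing the constraint level. This is a minimal sketch for maximising $xy$ subject to $2x + y = m$, where the first-order conditions give $x^* = \lambda^*$, $y^* = 2\lambda^*$ and $4\lambda^* = m$; the function names, the value $m = 100$ and the step size are assumptions made for illustration.

```python
def solve(m):
    """Stationary point of L = xy - lambda(2x + y - m), solved by hand:
    y = 2*lambda, x = lambda, and the constraint gives 4*lambda = m."""
    lam = m / 4
    return lam, 2 * lam, lam          # x*, y*, lambda*

def value(m):
    x, y, _ = solve(m)
    return x * y                       # V(m) = f(x*(m), y*(m))

m = 100.0
h = 1e-4
dV_dm = (value(m + h) - value(m - h)) / (2 * h)   # numerical dV/dm
lam_star = solve(m)[2]
print(dV_dm, lam_star)
```

The two printed numbers should coincide up to rounding, in line with $dV/db = \lambda^*$.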

11

Example 2.

Consider the following optimisation problem

max $f(x, y) = xy$ subject to $2x + y = m$

Find the value of the Lagrangian multiplier and confirm that

$$\frac{\partial f}{\partial m}(x^*, y^*) = \lambda^*$$

12

Example 3.

Solve the following constrained optimisation problems and interpret the Lagrangian multiplier:

(a) min $f(x, y) = x + y$ subject to $x^{1/2} + y = 1$

(b) max $f(x, y) = 12x\sqrt{y}$ subject to $3x + 4y = 12$

13

Bordered Hessian

We define the bordered Hessian matrix of the Lagrangian function, evaluated at the optimum point $(x^*, y^*)$ with the corresponding value of the Lagrangian multiplier $\lambda^*$:

$$H^* = \begin{pmatrix} L_{11} & L_{12} & g_1 \\ L_{21} & L_{22} & g_2 \\ g_1 & g_2 & 0 \end{pmatrix}$$

14

Second-order condition

Step 3.

If $(x^*, y^*, \lambda^*)$ gives a stationary value of the Lagrangian function $L(x, y, \lambda) = f(x, y) - \lambda(g(x, y) - b)$, then

1. it yields a maximum if the determinant of the bordered Hessian $|H^*| > 0$.

2. it yields a minimum if the determinant of the bordered Hessian $|H^*| < 0$.
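The determinant test is straightforward to compute once the second derivatives are known. The sketch below builds the bordered Hessian for two problems: minimising $x^2 + 2y^2$ subject to $x + y = 12$, and maximising $xy$ subject to $3x + 4y = 12$. The entries were derived by hand and are constant because both problems are quadratic with a linear constraint; `det3` and `bordered_hessian` are hypothetical helper names.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def bordered_hessian(l11, l12, l22, g1, g2):
    # Second derivatives of L bordered by the constraint gradient (g1, g2)
    return [[l11, l12, g1],
            [l12, l22, g2],
            [g1, g2, 0]]

# min x^2 + 2y^2 subject to x + y = 12: L11 = 2, L12 = 0, L22 = 4, g1 = g2 = 1
h_min = bordered_hessian(2, 0, 4, 1, 1)
# max xy subject to 3x + 4y = 12: L11 = L22 = 0, L12 = 1, g1 = 3, g2 = 4
h_max = bordered_hessian(0, 1, 0, 3, 4)

print(det3(h_min), det3(h_max))
```

A negative determinant indicates a minimum and a positive one a maximum, matching conditions 2 and 1 above.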

15

Example 4.

Solve the following constrained optimisation problems and find the bordered Hessian to check whether there is indeed a maximum or a minimum at the point you found:

(a) min $f(x, y) = x^2 + 2y^2$ subject to $x + y = 12$

(b) max $f(x, y) = x^2 + 3xy + y^2$ subject to $x + y = 100$

16

Lagrangian function for functions of several variables

Optimise f(x) subject to g(x) = b

Step 1. Define the Lagrangian function

L(x, λ) = f(x)− λ(g(x)− b)

Step 2. First-order condition. We find the stationary point(s)

of L by solving

∇L = 0

17

Lagrangian function for functions of several variables

Step 3.

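1. max $f(x)$ s.t. $g(x) = b$ if the successive principal minors of $|H^*|$ alternate in sign in the following way

$$\begin{vmatrix} L_{11} & L_{12} & g_1 \\ L_{21} & L_{22} & g_2 \\ g_1 & g_2 & 0 \end{vmatrix} > 0, \quad \begin{vmatrix} L_{11} & L_{12} & L_{13} & g_1 \\ L_{21} & L_{22} & L_{23} & g_2 \\ L_{31} & L_{32} & L_{33} & g_3 \\ g_1 & g_2 & g_3 & 0 \end{vmatrix} < 0, \quad \ldots$$

2. min $f(x)$ s.t. $g(x) = b$ if the successive principal minors of $|H^*|$ are strictly negative.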

18

Problem Set 3: Constrained optimisation

For the following problems, find

(i) the point at which the function has an optimum

(ii) the Lagrangian multiplier, and interpret it

(iii) the bordered Hessian, to check whether there is indeed a maximum or a minimum at the point you found in (i)

1. max f(x, y) = xy subject to 3x+ 4y = 12

2. min f(x, y) = 3x+ 4y subject to xy = 12

3. max f(x, y) = 3x+ 4y subject to xy = 12

4. max $f(x, y) = 6xy + 2x^2 - 3y^2$ subject to $x + 2y = 5$

5. max $f(x, y) = 2x + y$ subject to $x^2 + y^2 = 4$

6. min $f(x, y) = 2x + y$ subject to $x^2 + y^2 = 4$

7. max $f(x, y) = 2x + 3y$ subject to $2x^2 + 5y^2 = 10$

8. min $f(x, y) = 2x + 3y$ subject to $2x^2 + 5y^2 = 10$

9. max $f(x, y) = (x + 2)(y + 1)$ subject to $x + y = 21$

10. min $f(x, y) = 2x + 4y - x^2 - 0.5y^2 - 2xy$ subject to $2x + y = 10$

11. max $f(x, y) = xy$ subject to $x^2 + y^2 = 16$ with $x, y > 0$

12. max $f(x, y) = xy$ subject to $x^2 + y^2 = 16$ with $x, y < 0$

13. min $f(x, y) = xy$ subject to $x^2 + y^2 = 16$ with $x < 0$, $y > 0$

14. min $f(x, y) = xy$ subject to $x^2 + y^2 = 16$ with $x > 0$, $y < 0$

1

Answers

1. (i) $(2, 1.5)$

(ii) $\lambda = 0.5$ so relaxing the constraint would increase the optimal value at the rate of 0.5.

(iii) $H^* = \begin{pmatrix} 0 & 1 & 3 \\ 1 & 0 & 4 \\ 3 & 4 & 0 \end{pmatrix}$, $|H^*| = 24 > 0$ so maximum at $(2, 1.5)$.

2. (i) $(4, 3)$

(ii) $\lambda = 1$ so relaxing the constraint would increase the optimal value at the rate of 1.

(iii) $H^* = \begin{pmatrix} 0 & -1 & 3 \\ -1 & 0 & 4 \\ 3 & 4 & 0 \end{pmatrix}$, $|H^*| = -24 < 0$ so minimum at $(4, 3)$.

3. (i) $(-4, -3)$

(ii) $\lambda = -1$ so relaxing the constraint would decrease the optimal value at the rate of 1.

(iii) $H^* = \begin{pmatrix} 0 & 1 & -3 \\ 1 & 0 & -4 \\ -3 & -4 & 0 \end{pmatrix}$, $|H^*| = 24 > 0$ so maximum at $(-4, -3)$.

4. (i) $(45/7, -5/7)$

(ii) $\lambda = 150/7$ so relaxing the constraint would increase the optimal value at the rate of 150/7.

(iii) $H^* = \begin{pmatrix} 4 & 6 & 1 \\ 6 & -6 & 2 \\ 1 & 2 & 0 \end{pmatrix}$, $|H^*| = 14 > 0$ so maximum at $(45/7, -5/7)$.

5. (i) $(4/\sqrt{5}, 2/\sqrt{5})$

(ii) $\lambda = \sqrt{5}/4$ so relaxing the constraint would increase the optimal value at the rate of $\sqrt{5}/4$.

(iii) $H^* = \begin{pmatrix} -\sqrt{5}/2 & 0 & 8/\sqrt{5} \\ 0 & -\sqrt{5}/2 & 4/\sqrt{5} \\ 8/\sqrt{5} & 4/\sqrt{5} & 0 \end{pmatrix}$, $|H^*| = 8\sqrt{5} > 0$ so maximum at $(4/\sqrt{5}, 2/\sqrt{5})$.

6. (i) $(-4/\sqrt{5}, -2/\sqrt{5})$

(ii) $\lambda = -\sqrt{5}/4$ so relaxing the constraint would decrease the optimal value at the rate of $\sqrt{5}/4$.

(iii) $H^* = \begin{pmatrix} \sqrt{5}/2 & 0 & -8/\sqrt{5} \\ 0 & \sqrt{5}/2 & -4/\sqrt{5} \\ -8/\sqrt{5} & -4/\sqrt{5} & 0 \end{pmatrix}$, $|H^*| = -8\sqrt{5} < 0$ so minimum at $(-4/\sqrt{5}, -2/\sqrt{5})$.

2

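7. (i) $(5\sqrt{2/19}, 3\sqrt{2/19})$

(ii) $\lambda = 0.1\sqrt{19/2}$ so relaxing the constraint would increase the optimal value at the rate of $0.1\sqrt{19/2}$.

(iii) $H^* = \begin{pmatrix} -0.4\sqrt{19/2} & 0 & 20\sqrt{2/19} \\ 0 & -\sqrt{19/2} & 30\sqrt{2/19} \\ 20\sqrt{2/19} & 30\sqrt{2/19} & 0 \end{pmatrix}$, $|H^*| = 40\sqrt{38} > 0$ so maximum at $(5\sqrt{2/19}, 3\sqrt{2/19})$.

8. (i) $(-5\sqrt{2/19}, -3\sqrt{2/19})$

(ii) $\lambda = -0.1\sqrt{19/2}$ so relaxing the constraint would decrease the optimal value at the rate of $0.1\sqrt{19/2}$.

(iii) $H^* = \begin{pmatrix} 0.4\sqrt{19/2} & 0 & -20\sqrt{2/19} \\ 0 & \sqrt{19/2} & -30\sqrt{2/19} \\ -20\sqrt{2/19} & -30\sqrt{2/19} & 0 \end{pmatrix}$, $|H^*| = -40\sqrt{38} < 0$ so minimum at $(-5\sqrt{2/19}, -3\sqrt{2/19})$.

9. (i) $(10, 11)$

(ii) $\lambda = 12$ so relaxing the constraint would increase the optimal value at the rate of 12.

(iii) $H^* = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}$, $|H^*| = 2 > 0$ so maximum at $(10, 11)$.

10. (i) $(3, 4)$

(ii) $\lambda = -6$ so relaxing the constraint would decrease the optimal value at the rate of 6.

(iii) $H^* = \begin{pmatrix} -2 & -2 & 2 \\ -2 & -1 & 1 \\ 2 & 1 & 0 \end{pmatrix}$, $|H^*| = -2 < 0$ so minimum at $(3, 4)$.

11. (i) $(\sqrt{8}, \sqrt{8})$

(ii) $\lambda = 0.5$ so relaxing the constraint would increase the optimal value at the rate of 0.5.

(iii) $H^* = \begin{pmatrix} -1 & 1 & 2\sqrt{8} \\ 1 & -1 & 2\sqrt{8} \\ 2\sqrt{8} & 2\sqrt{8} & 0 \end{pmatrix}$, $|H^*| = 128 > 0$ so maximum at $(\sqrt{8}, \sqrt{8})$.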

3

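12. (i) $(-\sqrt{8}, -\sqrt{8})$

(ii) $\lambda = 0.5$ so relaxing the constraint would increase the optimal value at the rate of 0.5.

(iii) $H^* = \begin{pmatrix} -1 & 1 & -2\sqrt{8} \\ 1 & -1 & -2\sqrt{8} \\ -2\sqrt{8} & -2\sqrt{8} & 0 \end{pmatrix}$, $|H^*| = 128 > 0$ so maximum at $(-\sqrt{8}, -\sqrt{8})$.

13. (i) $(-\sqrt{8}, \sqrt{8})$

(ii) $\lambda = -0.5$ so relaxing the constraint would decrease the optimal value at the rate of 0.5.

(iii) $H^* = \begin{pmatrix} 1 & 1 & -2\sqrt{8} \\ 1 & 1 & 2\sqrt{8} \\ -2\sqrt{8} & 2\sqrt{8} & 0 \end{pmatrix}$, $|H^*| = -128 < 0$ so minimum at $(-\sqrt{8}, \sqrt{8})$.

14. (i) $(\sqrt{8}, -\sqrt{8})$

(ii) $\lambda = -0.5$ so relaxing the constraint would decrease the optimal value at the rate of 0.5.

(iii) $H^* = \begin{pmatrix} 1 & 1 & 2\sqrt{8} \\ 1 & 1 & -2\sqrt{8} \\ 2\sqrt{8} & -2\sqrt{8} & 0 \end{pmatrix}$, $|H^*| = -128 < 0$ so minimum at $(\sqrt{8}, -\sqrt{8})$.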

4
