Basic Calculus (II) Recap
(for MSc & PhD Business, Management & Finance Students)
Lecturer: Farzad Javidanrad
First Draft: Sep. 2013
Revised: Sep. 2014
Multi-Variable Functions
• In the case of a one-variable function, of the form y = f(x), the variable x is called the "independent variable" and y the "dependent variable".
• There are many examples of the dependency of y on x (e.g., whether water boils depends on the amount of heat; or consumption expenditure depends on the level of income), but the concept of a function should be understood beyond the concept of dependency. In most cases, dependency is not the issue at all. The modern concept of a function is based on the idea of mapping.
Multi-Variable Functions
• When a painter paints a scene on a canvas, he or she uses a correspondence rule (mapping rule): every point in three-dimensional space (R³) is corresponded (mapped) to one and only one point in two-dimensional space (R²).
• Mathematically speaking, the function f: R³ → R² can represent the type of corresponding (mapping) rule that the painter is applying.
The Concept of Function as Mapping
• Transformation of an object is a mapping from R² to R².
• Mathematical operations describe a function from R² to R.

f: R² → R²
(a, b) → (b, −a)

g: R² → R
(a, b) → a + b

[Figure 1-6: Geometrical interpretation of the sum operator as a function. This is a transformation from R² to R.]
Multi-Variable Functions
• All basic mathematical operations such as summation, subtraction, division and multiplication introduce a function from two-dimensional space (R²) to the real number set (one-dimensional space, R), that is:

f: R² → R

For example, for division: (a, b) → a/b (b ≠ 0)
• One of the important families of multi-variable functions is the "real (scalar) multi-variable function", which can be shown as f: Rⁿ → R or simply y = f(x1, x2, … , xn), where y is the dependent variable and x1, x2, … , xn are the independent variables.
Two-Variable Functions
• A simple form of this function arises when we have two independent variables x, y and one dependent variable z, in the form z = f(x, y). This is called a "two-variable function" as there are two independent variables.
• E.g. a Cobb-Douglas production function:

Y = f(K, L) = A·K^α·L^β

where Y is the level of production, K and L are the levels of capital and labour employed for production, respectively, and A, α and β are constants of the function.
Adopted from http://en.citizendium.org/wiki/File:Cobb-Douglas_with_dimishing_returns_to_scale.png
Two-Variable Functions
• z = f(x, y) represents a functional relationship if for every ordered pair (x, y) in the domain of the function there is one and only one value of z in the range of the function.
o Which graph represents a function?

Ellipsoid: x²/a² + y²/b² + z²/c² = 1
Hyperboloid of Two Sheets: −x²/a² − y²/b² + z²/c² = 1
Hyperbolic Paraboloid: x²/a² − y²/b² = z/c
Elliptic Paraboloid: x²/a² + y²/b² = z/c

Adopted from http://tutorial.math.lamar.edu/Classes/CalcIII/QuadricSurfaces.aspx
Derivative of Two-Variable Functions
• Consider the function z = f(x, y); z changes if x or y or both of them change. If we control the change of y and allow just x to change, then the average change of z in terms of x is Δz/Δx. The limiting value of this ratio as Δx → 0 is what is called the "partial derivative of z with respect to x" and is shown by:

∂z/∂x , ∂f(x, y)/∂x , z′x , fx

• This cutting plane shows that the variable y is controlled (fixed) at y = 1 but x can change from −2 to +2, and the movement is on the curve of intersection between the plane and the surface of the function.
Adopted from http://msemac.redwoods.edu/~darnold/math50c/matlab/pderiv/index.xhtml
Partial Differentiation
• If x is controlled (fixed) and y is allowed to change, the partial derivative of z with respect to y can be shown by:

∂z/∂y , ∂f(x, y)/∂y , z′y , fy

• The cutting plane shows that x is controlled (fixed) at x = 0 but y can change from −3 to +3 on the curve of intersection between the plane and the surface of the function.
Adopted from http://www.uwec.edu/math/Calculus/216-Spring2007/assignments.htm
Partial Differentiation
• So, in general, the slope of the function z = f(x, y) on the curve of intersection between the surface of the function and a cutting plane parallel to the x-axis, at any point of the domain, is:

∂z/∂x = fx = lim(Δx→0) [f(x + Δx, y) − f(x, y)]/Δx = lim(h→0) [f(x + h, y) − f(x, y)]/h

This means that when calculating ∂z/∂x the variable y should be treated as a constant. The same rule applies to multi-variable functions.
Adopted from http://moodle.capilanou.ca/mod/book/view.php?id=19700&chapterid=240
[Figure: the surface z = 10 − x² − y²]
Partial Differentiation
• And the slope of the function z = f(x, y) on the curve of intersection between the surface of the function and a cutting plane parallel to the y-axis, at any point of the domain, is:

∂z/∂y = fy = lim(Δy→0) [f(x, y + Δy) − f(x, y)]/Δy = lim(h→0) [f(x, y + h) − f(x, y)]/h

This means that when calculating ∂z/∂y the variable x should be treated as a constant. The same rule applies to multi-variable functions.
Adopted from http://moodle.capilanou.ca/mod/book/view.php?id=19700&chapterid=240
[Figure: the surface z = 10 − x² − y²]
Partial Differentiation
• To find the partial derivatives (slopes of tangent lines on the surface) at a specific point P(a, b, c) we have:

∂f(x, y)/∂x |P(a,b,c) = lim(h→0) [f(a + h, b) − f(a, b)]/h
∂f(x, y)/∂y |P(a,b,c) = lim(h→0) [f(a, b + h) − f(a, b)]/h

Example:
o Find the partial derivatives of z = 10x²y³.

∂z/∂x = 20xy³ , ∂z/∂y = 30x²y²

Adopted from http://www.solitaryroad.com/c353.html
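The slide's answers can be confirmed with a short numerical sketch (not part of the original slides; the helper names are illustrative). A central difference in one variable, holding the other fixed, approximates each partial derivative:

```python
import math

def f(x, y):
    # the slide's example function: z = 10 x^2 y^3
    return 10 * x**2 * y**3

def partial_x(f, x, y, h=1e-6):
    # central-difference approximation of ∂f/∂x (y held constant)
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def partial_y(f, x, y, h=1e-6):
    # central-difference approximation of ∂f/∂y (x held constant)
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

x, y = 1.5, -2.0
print(partial_x(f, x, y), 20 * x * y**3)     # numeric vs 20xy³
print(partial_y(f, x, y), 30 * x**2 * y**2)  # numeric vs 30x²y²
```

The two printed pairs agree to several decimal places, which is exactly the "treat the other variable as a constant" rule in action.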
Rules of Partial Differentiation
• If f(x, y) and g(x, y) are two differentiable functions with respect to x and y:

z = f(x, y) ± g(x, y) → ∂z/∂x = ∂f/∂x ± ∂g/∂x = fx ± gx and ∂z/∂y = ∂f/∂y ± ∂g/∂y = fy ± gy

z = f(x, y) × g(x, y) → ∂z/∂x = fx·g + gx·f and ∂z/∂y = fy·g + gy·f

z = f(x, y)/g(x, y) → ∂z/∂x = (fx·g − gx·f)/g² and ∂z/∂y = (fy·g − gy·f)/g²
Some Examples
o Find the partial derivatives of the function z = x² − xy³ − 5y².

∂z/∂x = 2x − y³ , ∂z/∂y = −3xy² − 10y

o Find the partial derivatives of z = xy·√(x² + y²).

∂z/∂x = y·√(x² + y²) + [2x / 2√(x² + y²)]·xy = y·√(x² + y²) + x²y/√(x² + y²)
∂z/∂y = x·√(x² + y²) + [2y / 2√(x² + y²)]·xy = x·√(x² + y²) + xy²/√(x² + y²)

o Find the partial derivatives of z = 3x²y²/(x⁴ + y⁴).

∂z/∂x = [6xy²(x⁴ + y⁴) − 4x³·3x²y²]/(x⁴ + y⁴)²
∂z/∂y = [6x²y(x⁴ + y⁴) − 4y³·3x²y²]/(x⁴ + y⁴)²
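The quotient-rule answer in the last example is easy to mistype, so here is a quick numerical cross-check (an illustrative sketch, not part of the original slides):

```python
import math

def z(x, y):
    # third example: z = 3x²y² / (x⁴ + y⁴)
    return 3 * x**2 * y**2 / (x**4 + y**4)

def zx_claimed(x, y):
    # quotient-rule answer for ∂z/∂x from the slide
    return (6 * x * y**2 * (x**4 + y**4) - 4 * x**3 * 3 * x**2 * y**2) / (x**4 + y**4)**2

def zy_claimed(x, y):
    # quotient-rule answer for ∂z/∂y from the slide
    return (6 * x**2 * y * (x**4 + y**4) - 4 * y**3 * 3 * x**2 * y**2) / (x**4 + y**4)**2

def d(f, x, y, h=1e-6, wrt="x"):
    # central-difference partial derivative in the chosen variable
    if wrt == "x":
        return (f(x + h, y) - f(x - h, y)) / (2 * h)
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

x, y = 1.2, 0.7
print(d(z, x, y, wrt="x"), zx_claimed(x, y))
print(d(z, x, y, wrt="y"), zy_claimed(x, y))
```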
Chain Rule (Different Cases)
• Case 1: If z = f(u) and u = g(x, y) then z = f(g(x, y)) and

∂z/∂x = f′·∂g/∂x = (∂z/∂u)·(∂u/∂x) and ∂z/∂y = f′·∂g/∂y = (∂z/∂u)·(∂u/∂y)

Examples:
o Find the partial derivatives of z = e^(xy²).
Suppose u = xy², so z = e^u and

∂z/∂x = (∂z/∂u)·(∂u/∂x) = (e^u)·ux = e^(xy²)·y²
∂z/∂y = (∂z/∂u)·(∂u/∂y) = (e^u)·uy = e^(xy²)·2xy
Chain Rule (Different Cases)
o Find the partial derivatives of the function z = e^(x/y) + cos(xy).

∂z/∂x = (1/y)·e^(x/y) − y·sin(xy) , ∂z/∂y = (−x/y²)·e^(x/y) − x·sin(xy)
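A numerical spot-check of this chain-rule example (an illustrative sketch, not part of the original slides):

```python
import math

def z(x, y):
    # z = e^(x/y) + cos(xy)
    return math.exp(x / y) + math.cos(x * y)

def zx_claimed(x, y):
    # (1/y)·e^(x/y) − y·sin(xy)
    return math.exp(x / y) / y - y * math.sin(x * y)

def zy_claimed(x, y):
    # (−x/y²)·e^(x/y) − x·sin(xy)
    return -x / y**2 * math.exp(x / y) - x * math.sin(x * y)

h = 1e-6
x, y = 0.8, 1.3
zx_num = (z(x + h, y) - z(x - h, y)) / (2 * h)
zy_num = (z(x, y + h) - z(x, y - h)) / (2 * h)
print(zx_num, zx_claimed(x, y))
print(zy_num, zy_claimed(x, y))
```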
• Case 2: If z = f(x, y) is a differentiable function of x and y, and these two variables are differentiable functions of r, such that x = g(r) and y = h(r), then:

dz/dr = (∂z/∂x)·(dx/dr) + (∂z/∂y)·(dy/dr)

o Find dz/dr for z = x − ln y when x = √r and y = r² − 1.

dz/dr = 1·[1/(2√r)] − (1/y)·2r = 1/(2√r) − 2r/(r² − 1)

• Can you suggest another way?
The same rules apply for multi-variable functions.
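The "other way" hinted at on the slide is to write z directly as a function of r and differentiate once; doing that numerically also confirms the chain-rule answer (an illustrative sketch, not part of the original slides):

```python
import math

def z_of_r(r):
    # z = x − ln y with x = √r and y = r² − 1, composed directly in r
    return math.sqrt(r) - math.log(r**2 - 1)

def dz_dr_claimed(r):
    # the chain-rule answer: 1/(2√r) − 2r/(r² − 1)
    return 1 / (2 * math.sqrt(r)) - 2 * r / (r**2 - 1)

r, h = 2.0, 1e-6
numeric = (z_of_r(r + h) - z_of_r(r - h)) / (2 * h)
print(numeric, dz_dr_claimed(r))
```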
Chain Rule (Different Cases)
• Case 3: If z = f(x, y) is a differentiable function of x and y, and these two variables are differentiable functions of r and s, such that x = g(r, s) and y = h(r, s), and r and s are independent of each other (dr/ds = ds/dr = 0), then:

∂z/∂r = (∂z/∂x)·(∂x/∂r) + (∂z/∂y)·(∂y/∂r) and ∂z/∂s = (∂z/∂x)·(∂x/∂s) + (∂z/∂y)·(∂y/∂s)

• These derivatives are called "total derivatives of z with respect to r and s".
o Find the partial derivatives of z = ∛(x² − y) where x = r² + s² and y = r/s.
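For the exercise above, the case-3 formula can be checked numerically by composing z directly in r and s (an illustrative sketch, not part of the original slides; it assumes x² − y > 0 so the cube root is taken of a positive number):

```python
import math

def x_of(r, s):
    return r**2 + s**2

def y_of(r, s):
    return r / s

def z_of(r, s):
    # z = (x² − y)^(1/3) composed with x(r, s) and y(r, s)
    return (x_of(r, s)**2 - y_of(r, s)) ** (1 / 3)

def dz_dr_chain(r, s):
    # case-3 chain rule: z_x·(∂x/∂r) + z_y·(∂y/∂r)
    x, y = x_of(r, s), y_of(r, s)
    zx = (2 * x / 3) * (x**2 - y) ** (-2 / 3)
    zy = (-1 / 3) * (x**2 - y) ** (-2 / 3)
    return zx * (2 * r) + zy * (1 / s)

r, s, h = 1.0, 2.0, 1e-6
numeric = (z_of(r + h, s) - z_of(r - h, s)) / (2 * h)
print(numeric, dz_dr_chain(r, s))
```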
Implicit Differentiation
• The Chain Rule can be used for implicit differentiation, even for one-variable functions: F(x, y) = 0.
Using the chain rule we have:

dF/dx = (∂F/∂x)·(dx/dx) + (∂F/∂y)·(dy/dx) = 0

As dx/dx = 1, so:

dy/dx = −(∂F/∂x)/(∂F/∂y) = −Fx/Fy

• The same rule can be used for implicit two- or multi-variable functions. For example, for an implicit function F(x, y, z) = 0, we have:

∂z/∂x = −(∂F/∂x)/(∂F/∂z) = −Fx/Fz and ∂z/∂y = −(∂F/∂y)/(∂F/∂z) = −Fy/Fz
Examples of Implicit Functions
o Find the slope of the tangent line on the curve of intersection between the surface x² + y² + z² = 9 and the plane y = 2 at the point A(1, 2, 2).
As y is fixed at 2, we are looking for ∂z/∂x at point A:

2x + 0 + 2z·(∂z/∂x) = 0 → ∂z/∂x = −x/z = −1/2

Or, using implicit differentiation:

∂z/∂x = −Fx/Fz = −2x/2z = −x/z

o Find ∂z/∂y for e^(x+y+z) = x² − 2y² + z².

(0 + 1 + ∂z/∂y)·e^(x+y+z) = 0 − 4y + 2z·(∂z/∂y) → ∂z/∂y = (e^(x+y+z) + 4y)/(2z − e^(x+y+z))

Use implicit differentiation for this question.
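The first example can be verified by solving the sphere explicitly for z on the upper branch and differentiating numerically (an illustrative sketch, not part of the original slides):

```python
import math

def z_of(x, y):
    # solve x² + y² + z² = 9 for the branch z > 0 that contains A(1, 2, 2)
    return math.sqrt(9 - x**2 - y**2)

# slope ∂z/∂x at A(1, 2, 2): y is fixed at 2, only x varies
h = 1e-6
numeric = (z_of(1 + h, 2) - z_of(1 - h, 2)) / (2 * h)
claimed = -1 / 2  # the slide's answer −x/z evaluated at (1, 2, 2)
print(numeric, claimed)
```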
Higher-Order Partial Derivatives
• For the function z = f(x, y) the partial derivatives ∂z/∂x and ∂z/∂y are, in general, in turn functions of x and y. So, we can think of second partial derivatives of z, but in this case there are three different second derivatives:

zxx = fxx = ∂/∂x (∂z/∂x) = ∂²z/∂x²
zyy = fyy = ∂/∂y (∂z/∂y) = ∂²z/∂y²     (second-order direct partial derivatives)
zxy = fxy = ∂/∂y (∂z/∂x) = ∂²z/∂y∂x    (second-order cross partial derivative)
The Equality of Mixed (Cross) Partial Derivatives

zyx = fyx = ∂/∂x (∂z/∂y) = ∂²z/∂x∂y    (second-order cross partial derivative)

• If the cross (mixed) partial derivatives fxy and fyx are continuous and finite in their domain then they are equal to one another; i.e.

fxy = fyx or ∂²z/∂y∂x = ∂²z/∂x∂y

[Diagram: z = f(x, y) → ∂z/∂x = fx and ∂z/∂y = fy → fxx , fxy = fyx , fyy]
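The equality of mixed partials can be seen numerically: differentiating fx with respect to y and fy with respect to x gives the same number. The function below is an illustrative smooth choice, not one taken from the slides:

```python
import math

def f(x, y):
    # an illustrative smooth function with continuous second derivatives
    return x**3 * y**2 + math.sin(x * y)

def fx(x, y, h=1e-5):
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def fy(x, y, h=1e-5):
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

def fxy(x, y, h=1e-4):
    # differentiate f_x with respect to y
    return (fx(x, y + h) - fx(x, y - h)) / (2 * h)

def fyx(x, y, h=1e-4):
    # differentiate f_y with respect to x
    return (fy(x + h, y) - fy(x - h, y)) / (2 * h)

x, y = 0.9, 1.4
print(fxy(x, y), fyx(x, y))  # the two mixed partials agree
```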
Total Differential
• The meaning of the differential for a multi-variable scalar function is no different from that for a one-variable function. The only difference is that the source of change in the dependent variable is the change of all independent variables; that is:

z + Δz = f(x + Δx, y + Δy)
or Δz = f(x + Δx, y + Δy) − f(x, y)

But dz, which is called the "total differential", is defined as:

dz = (∂z/∂x)·dx + (∂z/∂y)·dy or dz = fx·dx + fy·dy

Adopted from Calculus: Early Transcendentals, James Stewart, p. 897
Total Differential
• For a multi-variable scalar function the same rule applies:

z = f(x1, x2, … , xn)
dz = (∂z/∂x1)·dx1 + (∂z/∂x2)·dx2 + ⋯ + (∂z/∂xn)·dxn

• In the case of the two-variable function z = f(x, y) we assumed x and y are independent, but if they depend on other variables the differential of each of them can be treated as the total differential of a dependent variable; that is:

z = f(x, y) → dz = (∂z/∂x)·dx + (∂z/∂y)·dy   (A)
x = h(r, s) → dx = (∂x/∂r)·dr + (∂x/∂s)·ds   (B)
y = k(r, s) → dy = (∂y/∂r)·dr + (∂y/∂s)·ds   (C)

Substituting (B) and (C) into (A):

dz = (∂z/∂x)·[(∂x/∂r)·dr + (∂x/∂s)·ds] + (∂z/∂y)·[(∂y/∂r)·dr + (∂y/∂s)·ds]
Total Differential
• If we are looking for the total derivatives of z with respect to r and s, which were introduced before as the chain rule (case 3), we need to suppose that r and s are independent variables, not associated with each other (ds/dr = dr/ds = 0); then:

dz = [(∂z/∂x)·(∂x/∂r) + (∂z/∂y)·(∂y/∂r)]·dr + [(∂z/∂x)·(∂x/∂s) + (∂z/∂y)·(∂y/∂s)]·ds

so that:

∂z/∂r = (∂z/∂x)·(∂x/∂r) + (∂z/∂y)·(∂y/∂r) and ∂z/∂s = (∂z/∂x)·(∂x/∂s) + (∂z/∂y)·(∂y/∂s)
Second-Order Total Differential
• The sign of the second-order total differential d²z shows the convexity or concavity of the surface with respect to the xoy plane.
• Considering the total differential dz, the second-order total differential d²z can be obtained by applying the differential rules:

d²z = d(dz) = d[(∂z/∂x)·dx + (∂z/∂y)·dy] = d(fx·dx + fy·dy) = dfx·dx + fx·d(dx) + dfy·dy + fy·d(dy)

As d(dx) = d²x = 0 and d(dy) = d²y = 0, and

dfx = fxx·dx + fxy·dy
dfy = fyx·dx + fyy·dy

therefore:

d²z = fxx·dx² + 2fxy·dx·dy + fyy·dy²

• Factorising dy² from the right-hand side, we have:

d²z = dy²·[fxx·(dx/dy)² + 2fxy·(dx/dy) + fyy]
Second-Order Differential
• dy² > 0 (why?); so the sign of d²z depends on the sign of the expression in the bracket.
• From elementary algebra we know that the quadratic form aX² + bX + c has the same sign as the parameter a when Δ = b² − 4ac < 0.
• If we let X = dx/dy and a = fxx, b = 2fxy, c = fyy, then d²z = dy²·(aX² + bX + c) has the same sign as a = fxx if

(2fxy)² − 4fxx·fyy < 0 → fxx·fyy > fxy²

So:
1. d²z > 0 if fxx > 0 and fxx·fyy > fxy².
2. d²z < 0 if fxx < 0 and fxx·fyy > fxy².

Adopted from Calculus: Early Transcendentals, James Stewart (various pages)
Optimising Two-Variable Functions
• The two-variable function z = f(x, y) has a relative maximum (relative minimum) at a point in its domain if at that point:

i. fx = 0 and fy = 0, simultaneously.   (necessary conditions for differentiable functions)
ii. fxx < 0 (fxx > 0)
iii. fxx·fyy − fxy² > 0   (ii and iii: sufficient conditions)

Note 1: If fxx·fyy − fxy² < 0, the critical point is not a maximum or a minimum but a saddle point (it looks like a maximum from one axis but a minimum from another axis), e.g. z = x² − y².
Adopted from http://commons.wikimedia.org/wiki/File:Saddle_point.png
Optimising Two-Variable Functions
• Note 2: If fxx·fyy − fxy² = 0 at the critical point, further investigation is needed to find out the nature of the point.
• Example:
o Find the local extrema of the function f(x, y) = 2x³ − 6xy + 8y³, if any.

fx = 0 and fy = 0 → 6x² − 6y = 0 and −6x + 24y² = 0 → x² = y and −x + 4y² = 0

After solving these simultaneous equations two critical points emerge:
A(0, 0, 0) and B(∛(1/4), ∛(1/16), −1/2).
Optimising Two-Variable Functions
Now, fxx = 12x, fyy = 48y and fxy = fyx = −6.
So, fxx·fyy − fxy² = 12x·48y − (−6)² = 576xy − 36.
At the point A(0, 0, 0): fxx·fyy − fxy² = −36 < 0 → A is a saddle point.
At the point B(∛(1/4), ∛(1/16), −1/2): fxx·fyy − fxy² = 144 − 36 = 108 > 0 and fxx > 0, so this point is a local minimum.
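The arithmetic of this example can be replayed numerically (an illustrative sketch, not part of the original slides), confirming that B satisfies the first-order conditions and that the second-order test gives 108 at B and −36 at A:

```python
import math

def fx(x, y):
    # ∂f/∂x of f = 2x³ − 6xy + 8y³
    return 6 * x**2 - 6 * y

def fy(x, y):
    # ∂f/∂y of the same function
    return -6 * x + 24 * y**2

def second_order_test(x, y):
    # fxx·fyy − fxy² = 12x·48y − 36
    return 12 * x * 48 * y - 36

# critical point B = (∛(1/4), ∛(1/16))
xB, yB = 0.25 ** (1 / 3), (1 / 16) ** (1 / 3)
print(fx(xB, yB), fy(xB, yB))     # both ≈ 0: first-order conditions hold
print(second_order_test(xB, yB))  # 108 > 0, and fxx = 12x > 0 → local minimum
print(second_order_test(0, 0))    # −36 < 0 → saddle point at A
```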
The Jacobian & Hessian Determinants
• From matrix algebra we know that for any square matrix A, if:

|A| = 0 ⟹ A is a singular matrix,

which means there exists linear dependence between at least two rows or two columns of the matrix. And if:

|A| ≠ 0 ⟹ A is a non-singular matrix,

which means all rows and all columns are linearly independent.
• So, to test for linear dependence between the equations in a simultaneous system, the determinant of the coefficient matrix can be used.
The Jacobian & Hessian Determinants
• To test for functional dependence (both linear and non-linear) between different functions we use the Jacobian determinant, shown by |J|.
• The Jacobian matrix is the matrix of all first-order partial derivatives of a vector function F: Rⁿ → Rᵐ, which maps a vector in n-dimensional space (real n-tuples) to a vector in m-dimensional space (real m-tuples):

y1 = F1(x1, x2, … , xn)
y2 = F2(x1, x2, … , xn)
⋮
ym = Fm(x1, x2, … , xn)

So, the Jacobian matrix of F is:

J = | ∂F1/∂x1 ⋯ ∂F1/∂xn |
    |    ⋮    ⋱    ⋮    |
    | ∂Fm/∂x1 ⋯ ∂Fm/∂xn |

Each row contains the partial derivatives of one of the functions (e.g. F1) with respect to all independent variables x1, x2, … , xn.
The Jacobian & Hessian Determinants
• If m = n, the Jacobian matrix is a square matrix and its determinant shows whether there is functional dependence or independence between the functions.

|J| = 0 ⟹ the equations are functionally dependent (there is a linear or non-linear association between the two functions).
|J| ≠ 0 ⟹ the equations are functionally independent (there is no linear or non-linear association between the two functions).

Example: Use the Jacobian determinant to test the functional dependency of the following equations:

y1 = 2x1 − 3x2
y2 = 4x1² − 12x1x2 + 9x2²
The Jacobian & Hessian Determinants
• The Jacobian determinant is:

|J| = | ∂y1/∂x1  ∂y1/∂x2 | = |     2            −3        |
      | ∂y2/∂x1  ∂y2/∂x2 |   | 8x1 − 12x2  −12x1 + 18x2 |

    = 2(−12x1 + 18x2) − (−3)(8x1 − 12x2) = 0

• So, the functions are not independent.
• We expected such a result, as we know that there is a quadratic functional relationship between y1 and y2: y1² = y2.
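Evaluating the determinant at a few sample points shows that it vanishes everywhere, and that y1² = y2 holds at those same points (an illustrative sketch, not part of the original slides):

```python
def det_J(x1, x2):
    # |J| for y1 = 2x1 − 3x2 and y2 = 4x1² − 12x1·x2 + 9x2²
    a, b = 2, -3               # ∂y1/∂x1, ∂y1/∂x2
    c = 8 * x1 - 12 * x2       # ∂y2/∂x1
    d = -12 * x1 + 18 * x2     # ∂y2/∂x2
    return a * d - b * c

for x1, x2 in [(1.0, 2.0), (-3.5, 0.4), (7.0, -7.0)]:
    y1 = 2 * x1 - 3 * x2
    y2 = 4 * x1**2 - 12 * x1 * x2 + 9 * x2**2
    # determinant ≈ 0 at every point → functional dependence, and indeed y1² = y2
    print(det_J(x1, x2), y1**2, y2)
```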
The Jacobian & Hessian Determinants
• The Hessian matrix is a square matrix composed of the second-order partial derivatives of a real (scalar) multi-variable function (f: Rⁿ → R). For a function y = f(x1, x2, … , xn), the Hessian determinant is defined as:

|H| = | ∂²f/∂x1²     ∂²f/∂x1∂x2  ⋯  ∂²f/∂x1∂xn |   | f11 f12 … f1n |
      | ∂²f/∂x2∂x1   ∂²f/∂x2²    ⋯  ∂²f/∂x2∂xn | = | f21 f22 … f2n |
      |     ⋮            ⋮       ⋱      ⋮      |   |  ⋮   ⋮  ⋱  ⋮  |
      | ∂²f/∂xn∂x1   ∂²f/∂xn∂x2  ⋯  ∂²f/∂xn²   |   | fn1 fn2 … fnn |

• In the optimisation of a two-variable function, if the first-order (necessary) conditions fx = fy = 0 are met, the second-order (sufficient) conditions are:

fxx, fyy > 0 for a minimum and fxx, fyy < 0 for a maximum
fxx·fyy − fxy² > 0
The Jacobian & Hessian Determinants
• Using the Hessian determinant, we can express the sufficient conditions simply as follows.
The optimal point is a minimum if |H1| > 0 and |H2| > 0, because:
o |H1| = fxx > 0
o |H2| = | fxx fxy ; fyx fyy | = fxx·fyy − fxy² > 0
And the optimal point is a maximum if |H1| < 0 and |H2| > 0.
• The same applies to a multi-variable function y = f(x1, x2, … , xn):
If |H1|, |H2|, |H3|, … , |Hn| > 0, the critical point is a local minimum.
If the principal minors change their signs consecutively, the critical point is a local maximum (e.g. in the case of y = f(x1, x2, x3): |H1| < 0, |H2| > 0 and |H3| < 0).
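A minimal illustration of the two-variable rule with simple functions of my own choosing (not from the slides): f(x, y) = x² + y² has Hessian minors |H1| = 2 > 0 and |H2| = 4 > 0 (a minimum), while f(x, y) = −x² − y² has |H1| = −2 < 0 and |H2| = 4 > 0 (a maximum):

```python
def leading_minors(fxx, fyy, fxy):
    # leading principal minors of the 2×2 Hessian [fxx fxy; fyx fyy]
    H1 = fxx
    H2 = fxx * fyy - fxy**2
    return H1, H2

H1_min, H2_min = leading_minors(2, 2, 0)    # Hessian of x² + y² at the origin
H1_max, H2_max = leading_minors(-2, -2, 0)  # Hessian of −x² − y² at the origin
print(H1_min, H2_min)  # both positive → minimum
print(H1_max, H2_max)  # alternating signs (−, +) → maximum
```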
Optimisation with a Constraint
• In reality, the independent variables in a function are not always fully independent of each other. They might be in a linear or even non-linear relationship with one another, which creates a constraint in the process of optimisation and changes its result.

[Figures: a linear constraint and a non-linear constraint g(x, y) = c]
Adopted from http://staff.www.ltu.se/~larserik/applmath/chap7en/part7.html
Adopted & altered from http://en.wikipedia.org/wiki/Lagrange_multiplier
Optimisation with a Constraint
• In each case, the function z = f(x, y) is the target function for optimisation, subject to a constraint g(x, y) = c (where c is a constant). So:

Max or Min: z = f(x, y)
Subject to: g(x, y) = c

• If the constraint function g(x, y) = c is linear (e.g. x − 2y = −1), one way to include the constraint in the optimisation process is to express one variable in terms of the other from the constraint function (here, x = 2y − 1) and substitute it into the target function to make it a function of one independent variable, z = F(y), then follow the usual one-variable optimisation process.
Example
• Example: Find the maximum of the function z = xy subject to the constraint x + y = 1.
From the constraint function we have y = −x + 1, and if we substitute this for y in the target function, we will have z = −x² + x.

dz/dx = 0 → −2x + 1 = 0 → x = 0.5

Putting this into the constraint equation to find y, and both into the target function to find z, the maximum point will be A(0.5, 0.5, 0.25).
How do we know the point is a maximum point?
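One way to answer the slide's closing question is the one-variable second-derivative test on the substituted function F(x) = x(1 − x); a numerical sketch (not part of the original slides):

```python
def F(x):
    # target function after substituting y = 1 − x into z = xy
    return x * (1 - x)

h = 1e-5
x_star = 0.5
F1 = (F(x_star + h) - F(x_star - h)) / (2 * h)                 # F'(0.5) ≈ 0
F2 = (F(x_star + h) - 2 * F(x_star) + F(x_star - h)) / h**2    # F''(0.5) ≈ −2
print(F1, F2)  # zero first derivative and negative second derivative → maximum
```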
The Lagrange Method
• If the constraint function is non-linear, the previous method might become very complicated. Another method, called the "Lagrange Method" or the "Method of Lagrange Multipliers", can help us find local extremum points.
• In the Lagrange method the constraint function enters the process of optimisation through a new variable λ (the Lagrange multiplier, or Lagrange coefficient) used to form the Lagrange function L:

L(x, y, λ) = f(x, y) + λ·[c − g(x, y)]

• By changing x and y a point moves on the surface of the function, but the movement is limited by the constraint g(x, y) = c.
• This means c − g(x, y) = 0 and L(x, y, λ) = f(x, y). So, the optimisation of L is equivalent to the optimisation of f.
The Lagrange Method
• To find the extremum values we need to take the derivatives of the Lagrange function with respect to its variables and solve the following simultaneous equations:

∂L/∂x = 0 → ∂f/∂x − λ·∂g/∂x = 0
∂L/∂y = 0 → ∂f/∂y − λ·∂g/∂y = 0    (A) necessary conditions for extrema
∂L/∂λ = 0 → c − g(x, y) = 0

• Solving these simultaneous equations gives us the critical values of x and y and a value for λ.
• λ shows the sensitivity of the target (objective) function to a change in the constraint function.
Sufficient Condition
• To make sure that the critical point(s) obtained from solving the simultaneous equations are extremum(s), we need sufficient evidence, which is the sign of the second-order differential of the Lagrange function, d²L, at the critical point(s).
• If L = f(x, y) + λ·[c − g(x, y)] then

dL = df + (c − g)·dλ − λ·dg

and, since c is a constant and d²λ = 0:

d²L = d²f − 2·dg·dλ − λ·d²g

Since:
d²f = fxx·dx² + 2fxy·dx·dy + fyy·dy²
dg = gx·dx + gy·dy
d²g = gxx·dx² + 2gxy·dx·dy + gyy·dy²

therefore:

d²L = (fxx − λ·gxx)·dx² + 2(fxy − λ·gxy)·dx·dy + (fyy − λ·gyy)·dy² − 2gx·dx·dλ − 2gy·dy·dλ
    = Lxx·dx² + 2Lxy·dx·dy + Lyy·dy² − 2gx·dx·dλ − 2gy·dy·dλ
• In matrix form we can use the bordered Hessian matrix to represent the above quadratic form:

d²L = (dλ  dx  dy) |  0   −gx  −gy | (dλ  dx  dy)ᵀ
                   | −gx  Lxx  Lxy |
                   | −gy  Lyx  Lyy |

• Where the bordered Hessian matrix is:

H3 = |  0   −gx  −gy |    or sometimes    H3 = | Lxx  Lxy  −gx |
     | −gx  Lxx  Lxy |                         | Lyx  Lyy  −gy |
     | −gy  Lyx  Lyy |                         | −gx  −gy   0  |
Sufficient Condition
• In the second form, the components of the vectors of first differentials of the variables need to be rearranged, i.e.:

d²L = (dx  dy  dλ) | Lxx  Lxy  −gx | (dx  dy  dλ)ᵀ
                   | Lyx  Lyy  −gy |
                   | −gx  −gy   0  |

• Note: In some books the constraint function g enters the Lagrange function with a positive sign, so the signs of the first derivatives of g in the bordered Hessian matrix are positive, but there is no difference between their determinants. (Based on the properties of determinants, if a single row or a single column of a matrix is multiplied by k, the determinant of the matrix is multiplied by k. In this case, the first row and the first column are each multiplied by −1, so the determinant is multiplied by (−1)×(−1) = 1.)
Sufficient Condition
So, we have a minimum if:
1. d²L > 0 (i.e. the bordered principal minors are all negative: |H2|, |H3| < 0)
And a maximum if:
2. d²L < 0 (i.e. the bordered principal minors change their sign one after another: |H2| < 0, |H3| > 0)
• For a multi-variable function y = f(x1, x2, … , xn) the bordered Hessian matrix is (n+1) × (n+1), but the rule is the same:
• For a minimum: |H2|, |H3|, … , |Hn+1| < 0.
• For a maximum: the signs of the bordered principal minors change consecutively.
Sufficient Condition
Example
• Find the extrema of the function f(x, y) = x − y subject to x² + y² = 100, if any.

L(x, y, λ) = x − y + λ·[100 − x² − y²]

Lx = 1 − 2λx = 0
Ly = −1 − 2λy = 0
Lλ = 100 − x² − y² = 0

From the first two equations, 1/(−1) = 2λx/2λy, so λ can be eliminated and we have x = −y. Substituting this new equation into the third equation we will have:

100 − y² − y² = 0 → y = ±5√2

So, the critical points are A(−5√2, 5√2, −10√2) and B(5√2, −5√2, 10√2), with λ = ∓√2/20.
Without any further investigation it can be said that point A is a minimum and point B is a maximum. (Why?)
Example
• Using the bordered Hessian determinant method we have:

|H3| = |  0   −2x  −2y |
       | −2x  −2λ   0  | = 8λ(x² + y²)
       | −2y   0   −2λ |

Obviously, the sign of this determinant depends on the sign of λ.
At point A(−5√2, 5√2, −10√2), λ = −√2/20, so |H3| < 0 and the point is a minimum (|H2| is also negative).
At point B(5√2, −5√2, 10√2), λ = +√2/20, so |H3| > 0 and the point is a maximum.
• If there is more than one constraint, the process of optimisation is the same but there will be more than one Lagrange multiplier.
• This case is a generalisation of the previous case and will not be discussed here.
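The critical point A and the bordered-Hessian formula can both be verified numerically (an illustrative sketch, not part of the original slides):

```python
import math

def det3(M):
    # determinant of a 3×3 matrix via cofactor expansion along the first row
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def bordered_H(x, y, lam):
    # bordered Hessian of L = x − y + λ(100 − x² − y²)
    return [[0, -2 * x, -2 * y],
            [-2 * x, -2 * lam, 0],
            [-2 * y, 0, -2 * lam]]

# point A(−5√2, 5√2) with λ = −√2/20
x, y = -5 * math.sqrt(2), 5 * math.sqrt(2)
lam = -math.sqrt(2) / 20

# first-order conditions Lx = Ly = Lλ = 0
print(1 - 2 * lam * x, -1 - 2 * lam * y, 100 - x**2 - y**2)  # all ≈ 0

# |H3| should equal 8λ(x² + y²), which is negative here → minimum
print(det3(bordered_H(x, y, lam)), 8 * lam * (x**2 + y**2))
```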
Interpretation of the Lagrange Multiplier λ
• The first-order conditions, in the form of the simultaneous equations (A) (slide 40), provide the critical (and perhaps optimal) values of the independent variables (x∗, y∗) and the corresponding value(s) of the Lagrange multiplier (λ∗).
• The Lagrange multiplier shows the sensitivity of the optimal value of the target (objective) function (f∗) to a change in the constant of the constraint function (c). It is calculated as the ratio:

λ∗ = ∂f∗(x∗, y∗)/∂c

This means that if λ∗ = 2 and c increases by one unit, the optimal value of the target function (calculated at the optimal values x∗ and y∗) increases by approximately 2 units.
Duality in Optimisation Analysis
• Consider the process of maximisation of the target (objective) function z = f(x, y), subject to the constraint g = g(x, y).
• As we know, the solution is the point of tangency of the two functions, so the process of optimisation can be done through different approaches. The primal approach is what we have discussed and done so far; the dual approach is when the constraint function g = g(x, y) becomes the new target function and z = f(x, y) the new constraint.
• The initial idea comes from the mathematical fact that if f reaches its maximum at the point x = x∗, the function −f will have a minimum at that point.
• Therefore, instead of finding the maximum of z = f(x, y) subject to the constraint g = g(x, y), we can find the minimum of g = g(x, y) subject to the constraint z = f(x, y); i.e. if we know that z cannot be bigger than z∗, what is the minimum value of g(x, y) which satisfies this constraint?
Duality in Optimisation Analysis
• Let U = U(x, y) be the utility function, subject to the budget constraint x·Px + y·Py = m.
• The Lagrange function is:

L(x, y, λ) = U(x, y) + λ·(m − x·Px − y·Py)

The first-order conditions are:

Lx = Ux − λPx = 0
Ly = Uy − λPy = 0    (B)
Lλ = m − x·Px − y·Py = 0

• The optimal values for x and y, which give the Marshallian demand (consumption) functions for x and y, and the optimal value for λ, are:

xM = xM(Px, Py, m)
yM = yM(Px, Py, m)
λM = λM(Px, Py, m)
Duality in Optimisation Analysis
• Substituting these solutions into the target function gives the maximum value of utility that can be achieved under the constraint:

U∗ = U∗(xM(Px, Py, m), yM(Px, Py, m))

We call this the indirect utility function: it is the maximum value of utility obtained at the optimal values of x and y, but it is an indirect function because its value now depends on the parameters Px, Py and m.
• Now, the dual problem is when the expenditure on x and y is minimised subject to maintaining a given level of utility U∗. So, the new Lagrange function is:

L(x, y, λ) = x·Px + y·Py + λ·[U∗ − U(x, y)]

The first-order conditions provide optimal solutions for x, y and λ.
Duality in Optimisation Analysis

Lx = Px − λUx = 0
Ly = Py − λUy = 0    (C)
Lλ = U∗ − U(x, y) = 0

The optimal solutions represent the demand functions for x and y:

xH = xH(Px, Py, U∗)
yH = yH(Px, Py, U∗)
λH = λH(Px, Py, U∗)

• The first two equations are called Hicksian demand functions.
Both simultaneous equation systems (B) and (C) give us the same results:

Ux/Px = Uy/Py or Ux/Uy = Px/Py

So, primal and dual analysis lead us to the same conclusion. The only difference is that:

λH = 1/λM
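A tiny numeric illustration of the primal/dual relationship, using an assumed utility function U(x, y) = x·y with prices Px = 1, Py = 2 and income m = 8 (this example is mine, not from the slides):

```python
Px, Py, m = 1.0, 2.0, 8.0

# Primal: max U = xy subject to x·Px + y·Py = m. For U = xy the tangency
# condition Ux/Uy = y/x = Px/Py gives the closed forms x = m/(2Px), y = m/(2Py).
x_M, y_M = m / (2 * Px), m / (2 * Py)
lam_M = y_M / Px          # λ^M = Ux/Px, with Ux = y for this utility
U_star = x_M * y_M        # indirect utility U*

# Dual: min x·Px + y·Py subject to xy = U*. The FOCs give the same tangency
# condition, so the same bundle solves it, with λ^H = Px/Ux = Px/y.
lam_H = Px / y_M

print(x_M, y_M, U_star)         # 4.0 2.0 8.0
print(lam_M, lam_H, 1 / lam_M)  # λ^H equals 1/λ^M
```

The same bundle (x, y) = (4, 2) solves both problems, and the two multipliers are reciprocals, exactly as the slide states.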