
Separation Theorems in Optimizations

Mahesh Dumaldar

Associate Professor

School of Mathematics

Devi Ahilya University, Indore

mn [email protected]

13-01-2018

Mahesh Dumaldar, Devi Ahilya University, Indore. 1/41


Overview

1 Abstract

2 Introduction

3 Separation Theorems

4 Applications

5 Generalizations



Abstract

Various forms of Farkas’ Lemma, which is a consequence of the fundamental separation theorem, are discussed along with its applications, such as the Karush-Kuhn-Tucker optimality conditions for linear programming problems and the Minkowski theorem.



Introduction

At an optimal solution of a Linear Programming Problem (LPP), the level set of the objective function through that solution is a supporting hyperplane of the convex set of feasible solutions. When the feasible set has extreme points, this supporting hyperplane contains at least one of them. Moreover, an extreme point is a basic feasible solution (BFS) of the given LPP. It is well known that the simplex method finds a BFS at each iteration.



A supporting hyperplane is a limit of hyperplanes separating a point from a closed convex set. The fundamental separation theorem provides such a separating hyperplane. This separation of a point from a closed set resembles the T3 axiom (regularity) in topology, and it parallels a consequence of the Hahn-Banach extension theorem which separates a closed linear subspace of a Banach space from a vector outside it.



Closest Point Theorem

Theorem

Let $S$ be a nonempty closed convex set in $\mathbb{R}^n$ and $y \notin S$. Then there exists a unique point $\bar{x}$ in $S$ with minimum distance from $y$. Furthermore, $\bar{x}$ is the minimizing point if and only if

$(y - \bar{x})^T (x - \bar{x}) \le 0$ for all $x \in S$.

This is similar to the following well-known theorem.

Theorem

A nonempty closed convex subset of a Hilbert space has a unique vector of smallest norm.
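The closest point theorem can be checked numerically on a simple set. The following is a minimal sketch, assuming $S$ is the unit box $[0,1]^2$ (an illustrative choice that is not in the slides), where the closest point is found by clipping each coordinate. The helper names `project_onto_box` and `dot` are hypothetical.

```python
def project_onto_box(y, lower, upper):
    """Closest point to y in the box {x : lower <= x <= upper}, by clipping."""
    return [min(max(yi, lo), hi) for yi, lo, hi in zip(y, lower, upper)]


def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


# Assumed example data: the unit box in R^2 and a point y outside it.
lower, upper = [0.0, 0.0], [1.0, 1.0]
y = [2.0, 0.5]
x_bar = project_onto_box(y, lower, upper)

# Evaluate (y - x_bar)^T (x - x_bar) at the four corners of the box.
# The expression is linear in x, so checking the corners covers the whole box.
y_minus_x_bar = [yi - xi for yi, xi in zip(y, x_bar)]
corners = [[a, b] for a in (0.0, 1.0) for b in (0.0, 1.0)]
values = [
    dot(y_minus_x_bar, [ci - xi for ci, xi in zip(corner, x_bar)])
    for corner in corners
]

print("x_bar =", x_bar)
print("inequality holds at all corners:", all(v <= 0 for v in values))
```

Checking only the corners works here because the box is the convex hull of its corners. For a general closed convex set the projection usually has no closed form.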



Fundamental Separation Theorem

Theorem

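Let $S$ be a nonempty closed convex set in $\mathbb{R}^n$ and $y \notin S$. Then there exist a nonzero vector $p$ and a scalar $\alpha$ such that

$p^T y > \alpha$ and $p^T x \le \alpha$ for each $x \in S$.

Proof.

Let $\bar{x}$ be the point of $S$ closest to $y$, and take

$p = y - \bar{x} \ne 0$ and $\alpha = (y - \bar{x})^T \bar{x} = p^T \bar{x}$.

By the closest point theorem, $p^T (x - \bar{x}) \le 0$, i.e. $p^T x \le \alpha$, for each $x \in S$, and $p^T y - \alpha = \|y - \bar{x}\|^2 > 0$.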



This is similar to the following theorem, which is an application of the Hahn-Banach extension theorem.

Theorem

If $M$ is a closed linear subspace of a normed linear space $N$ and $x_0$ is a vector not in $M$, then there exists a functional $f_0$ in $N^*$ such that $f_0(M) = 0$ and $f_0(x_0) \ne 0$.
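The construction in the proof of the fundamental separation theorem, $p = y - \bar{x}$ and $\alpha = p^T \bar{x}$, can be tried on a concrete set. This is a minimal sketch, assuming $S$ is the closed unit ball in $\mathbb{R}^2$ (an illustrative choice), for which the closest point to an outside point $y$ is $y / \|y\|$. The function name `separate_point_from_unit_ball` is hypothetical.

```python
import math


def separate_point_from_unit_ball(y):
    """Return (p, alpha) with p^T y > alpha and p^T x <= alpha on the unit ball."""
    norm_y = math.sqrt(sum(yi * yi for yi in y))
    if norm_y <= 1.0:
        raise ValueError("y must lie outside the closed unit ball")
    x_bar = [yi / norm_y for yi in y]            # closest point of the ball to y
    p = [yi - xi for yi, xi in zip(y, x_bar)]    # p = y - x_bar
    alpha = sum(pi * xi for pi, xi in zip(p, x_bar))  # alpha = p^T x_bar
    return p, alpha


y = [3.0, 4.0]  # assumed example point outside the ball
p, alpha = separate_point_from_unit_ball(y)
p_dot_y = sum(pi * yi for pi, yi in zip(p, y))

# Sample the boundary of the ball; a linear function attains its maximum
# over the ball on the boundary.
boundary = [
    [math.cos(2 * math.pi * k / 360), math.sin(2 * math.pi * k / 360)]
    for k in range(360)
]
max_on_S = max(sum(pi * xi for pi, xi in zip(p, x)) for x in boundary)

print("p =", p, "alpha =", alpha)
print("p^T y > alpha:", p_dot_y > alpha)
print("sampled max of p^T x <= alpha:", max_on_S <= alpha + 1e-9)
```

The boundary sampling is only a spot check. The inequality $p^T x \le \alpha$ on all of $S$ is what the closest point theorem guarantees.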



An outer representation of a polyhedron

Corollary (an outer representation of a polyhedron)

Let $S$ be a closed convex set in $\mathbb{R}^n$. Then $S$ is the intersection of all closed half-spaces containing $S$.

Proof.

The intersection clearly contains $S$. Suppose it strictly contains $S$, and let $y$ be a point in the intersection that is not in $S$. By the fundamental separation theorem, there exists a hyperplane that separates $S$ and $y$. Thus $S$ lies in a closed half-space generated by this hyperplane that does not contain $y$. This contradicts the fact that $y$ is in the intersection of all half-spaces containing $S$.



An outer representation of a polyhedron

Remark:

This representation is called an outer representation. When finitely many half-spaces suffice, $S$ is a polyhedron and

$S = \{x \mid Ax \le b\}$.



Farkas’ Lemma

Lemma (Farkas’ Lemma)

Let $A$ be an $m \times n$ matrix and $c$ be an $n$-component vector. Then exactly one of the following systems has a solution:

System 1: $Ax \le 0$ and $c^T x > 0$, for some $x \in \mathbb{R}^n$

System 2: $A^T y = c$ and $y \ge 0$, for some $y \in \mathbb{R}^m$



Equivalently, the implication

$Ax \le 0 \implies c^T x \le 0$ holds for all $x \in \mathbb{R}^n$

if and only if

there exists $u \in \mathbb{R}^m$, $u \ge 0$, such that $A^T u = c$.
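The two alternatives in Farkas’ Lemma can be illustrated by checking certificates on a small instance. This is a minimal sketch, assuming $A$ is the $2 \times 2$ identity matrix, so that the cone $\{A^T y \mid y \ge 0\}$ is the nonnegative quadrant. The data and the helper names `solves_system_1` and `solves_system_2` are hypothetical, and the code only verifies given candidate vectors rather than searching for them.

```python
def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in M]


def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def solves_system_1(A, c, x, tol=1e-9):
    """True if A x <= 0 and c^T x > 0."""
    c_dot_x = sum(ci * xi for ci, xi in zip(c, x))
    return all(r <= tol for r in matvec(A, x)) and c_dot_x > tol


def solves_system_2(A, c, y, tol=1e-9):
    """True if A^T y = c and y >= 0."""
    residual = [r - ci for r, ci in zip(matvec(transpose(A), y), c)]
    return all(abs(r) <= tol for r in residual) and all(yi >= -tol for yi in y)


A = [[1.0, 0.0], [0.0, 1.0]]  # assumed example: identity matrix

# c inside the cone: System 2 has the certificate y = c.
c_inside = [1.0, 1.0]
print(solves_system_2(A, c_inside, [1.0, 1.0]))

# c outside the cone: System 1 has the certificate x = (-1, 0).
c_outside = [-1.0, 1.0]
print(solves_system_1(A, c_outside, [-1.0, 0.0]))
```

For general $A$ and $c$, finding a certificate requires solving a linear program. The sketch deliberately leaves that step out.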



Proof.

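If both systems had solutions, then $c^T x = y^T A x \le 0$, contradicting $c^T x > 0$. If System 2 has no solution, then

$c \notin S = \{x \mid x = A^T y,\ y \ge 0\}$,

where $S$ is a nonempty closed convex set. By the fundamental separation theorem, there exist a nonzero vector $p$ and a scalar $\alpha$ such that

$p^T c > \alpha$ and $p^T x \le \alpha$ for all $x \in S$.

Since $0 \in S$, $\alpha \ge 0$ and so $p^T c > 0$. Further,

$\alpha \ge p^T x = p^T A^T y = y^T A p$ for all $y \ge 0$.

Since $y$ can be made arbitrarily large, $Ap \le 0$.

Thus $Ap \le 0$ and $c^T p > 0$, so $p$ solves System 1.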


We observe that such separation theorems can be expressed as theorems of alternatives, which are usually formulated in two equivalent ways:

1. Either the (primal) system of inequalities has a solution or the dual system has a solution, but not both.

2. The (primal) system has no solution if and only if the dual system has a solution.



The other theorems of alternatives are

Fredholm’s theorem

Gordan’s theorem

Motzkin’s theorem

Tucker’s theorem

Key theorem

Carver’s theorem

Dax’s theorem

etc.



Gordan’s Theorem

Lemma (Gordan’s Theorem)

Let $A$ be an $m \times n$ matrix. Then exactly one of the following systems has a solution:

System 1: $Ax < 0$, for some $x \in \mathbb{R}^n$

System 2: $A^T y = 0$ and $y \ge 0$, for some nonzero $y \in \mathbb{R}^m$



Proof.

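System 1 can be equivalently written as

$Ax + es \le 0$, for some $x \in \mathbb{R}^n$ and $s > 0$,

that is,

$\begin{pmatrix} A & e \end{pmatrix} \begin{pmatrix} x \\ s \end{pmatrix} \le 0$ and $(0, 0, \ldots, 0, 1) \begin{pmatrix} x \\ s \end{pmatrix} > 0$,

where $e = (1, 1, \ldots, 1)^T$. Now apply Farkas’ Lemma: the alternative system is $A^T y = 0$, $e^T y = 1$, $y \ge 0$, which has a solution exactly when $A^T y = 0$ has a nonzero solution $y \ge 0$.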



We can prove Farkas’ Lemma using closed convex cones.

Definition (Polar cone)

For a nonempty set $C$ in $\mathbb{R}^n$,

$C^* = \{p \mid p^T x \le 0 \text{ for all } x \in C\}$.

Hence

$C^{**} = \{y \mid y^T p \le 0 \text{ for all } p \in C^*\}$.



Theorem

Let $C$ be a nonempty closed convex cone. Then $C = C^{**}$.

Proof.

The proof uses the fundamental separation theorem.

Clearly $C \subseteq C^{**}$.

Let $y \in C^{**}$ and suppose $y \notin C$. By the fundamental separation theorem, there exist a nonzero vector $p$ and a scalar $\alpha$ such that

$p^T x \le \alpha$ for all $x \in C$ and $p^T y > \alpha$.



Then

$0 \in C \implies \alpha \ge 0 \implies p^T y > 0$.

Now, if $p \notin C^*$, then there is some $x \in C$ such that $p^T x > 0$. Since $\lambda x \in C$ for all $\lambda > 0$, $p^T (\lambda x)$ can be made arbitrarily large. This contradicts $p^T x \le \alpha$ for all $x \in C$. Therefore $p \in C^*$.

As $y \in C^{**}$, $p^T y \le 0$. This contradicts $p^T y > 0$.

Therefore $y \in C$.



Remark

For $C = \{A^T y \mid y \ge 0\}$,

$C^* = \{x \mid Ax \le 0\}$.

By the theorem,

$c \in C^{**}$ if and only if $c \in C$.



$c \in C^{**} \iff c^T x \le 0$ for all $x \in C^*$,

i.e., equivalently,

$Ax \le 0\ (\equiv x \in C^*) \implies c^T x \le 0$,

and

$c \in C \iff c = A^T y,\ y \ge 0$.



Hence $C = C^{**}$ can be equivalently stated as

System 1: $Ax \le 0$ implies $c^T x \le 0$

System 2: $A^T y = c$ and $y \ge 0$

So the implication in System 1 holds if and only if System 2 has a solution.

These two systems can be put into the equivalent form of Farkas’ Lemma, in which exactly one system has a solution:

System 1: $Ax \le 0$ and $c^T x > 0$ $(\equiv c \notin C^{**} = C)$

System 2: $A^T y = c$ and $y \ge 0$ $(\equiv c \in C)$



Applications

Minkowski theorem

[An inner representation of a polyhedron]

Karush-Kuhn-Tucker conditions



Minkowski Theorem

Theorem

Let $S$ be a nonempty polyhedral set in $\mathbb{R}^n$ of the form $\{x \mid Ax = b,\ x \ge 0\}$, where $A$ is an $m \times n$ matrix with rank $m$. Let $x_1, \ldots, x_k$ be the extreme points of $S$ and $d_1, \ldots, d_l$ be the extreme directions of $S$. Then $x \in S$ if and only if $x$ can be written as

$x = \sum_{j=1}^{k} \lambda_j x_j + \sum_{t=1}^{l} \mu_t d_t$,

$\sum_{j=1}^{k} \lambda_j = 1$, $\lambda_j \ge 0$ for $j = 1, \ldots, k$, $\mu_t \ge 0$ for $t = 1, \ldots, l$.



This statement can be viewed equivalently as

Theorem (Decomposition Theorem for Polyhedra)

A set $P$ of vectors in $\mathbb{R}^n$ is a polyhedron if and only if $P = Q + C$ for some polytope $Q$ and some polyhedral cone $C$.
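The representation in the Minkowski theorem above can be worked through on a small example. This is a minimal sketch, assuming $S = \{x \in \mathbb{R}^3 \mid x_1 + x_2 - x_3 = 1,\ x \ge 0\}$ (an illustrative set that is not in the slides), whose extreme points are $(1,0,0)$ and $(0,1,0)$ and whose extreme directions are $(1,0,1)$ and $(0,1,1)$. The weights and the function name `combine` are hypothetical.

```python
def combine(extreme_points, extreme_directions, lam, mu):
    """Return sum_j lam[j] * x_j + sum_t mu[t] * d_t as a list."""
    n = len(extreme_points[0])
    x = [0.0] * n
    for weight, point in zip(lam, extreme_points):
        x = [xi + weight * pi for xi, pi in zip(x, point)]
    for weight, direction in zip(mu, extreme_directions):
        x = [xi + weight * di for xi, di in zip(x, direction)]
    return x


# Assumed example: S = {x >= 0 : x1 + x2 - x3 = 1}.
extreme_points = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
extreme_directions = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]

lam = [0.5, 0.5]  # convex weights: nonnegative and summing to 1
mu = [1.5, 0.5]   # nonnegative weights on the directions

z = combine(extreme_points, extreme_directions, lam, mu)

# The combination must be feasible: x1 + x2 - x3 = 1 and x >= 0.
in_S = abs(z[0] + z[1] - z[2] - 1.0) < 1e-12 and all(zi >= 0 for zi in z)
print("z =", z, "in S:", in_S)
```

Here $Q$ is the segment between the two extreme points and $C$ is the cone spanned by the two directions, which matches the decomposition $P = Q + C$.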



Proof.

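Let

$\Lambda = \left\{ \sum_{j=1}^{k} \lambda_j x_j + \sum_{t=1}^{l} \mu_t d_t \;\middle|\; \sum_{j=1}^{k} \lambda_j = 1,\ \lambda_j \ge 0 \text{ for } j = 1, \ldots, k,\ \mu_t \ge 0 \text{ for } t = 1, \ldots, l \right\}$.

If there is $z \in S$ with $z \notin \Lambda$, then by the fundamental separation theorem there are a scalar $\alpha$ and a nonzero vector $p \in \mathbb{R}^n$ such that

$p^T z > \alpha$,

$p^T \left( \sum_{j=1}^{k} \lambda_j x_j + \sum_{t=1}^{l} \mu_t d_t \right) \le \alpha$ for every point of $\Lambda$.



In other words, since $z \notin \Lambda$, there do not exist $\lambda_j, \mu_t$ satisfying

$\sum_{j=1}^{k} \lambda_j x_j + \sum_{t=1}^{l} \mu_t d_t = z$,

$-\sum_{j=1}^{k} \lambda_j = -1$,

$\lambda_j \ge 0$ for $j = 1, \ldots, k$,

$\mu_t \ge 0$ for $t = 1, \ldots, l$.



Hence by Farkas’ Lemma, there exists $(\pi, \pi_0) \in \mathbb{R}^{n+1}$ such that

$\pi x_j - \pi_0 \le 0$ for $j = 1, \ldots, k$,

$\pi d_t \le 0$ for $t = 1, \ldots, l$,

$\pi z - \pi_0 > 0$.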




Theorem

A feasible solution x is optimal to a linear programming problem

if and only if the objective gradient c lies in the cone generated

by the gradients of the binding constraints at x. (see slide 11)



For a given linear programming problem

minimize $c^T x$ subject to $Ax \ge b$, $x \ge 0$

its dual is

maximize $b^T w$ subject to $A^T w \le c$, $w \ge 0$




The KKT conditions are

$Ax \ge b$, $x \ge 0$ (primal feasibility)

$A^T w + v = c$, $w \ge 0$, $v \ge 0$ (dual feasibility)

$w^T (Ax - b) = 0$, $v^T x = 0$ (complementary slackness)
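The three groups of KKT conditions can be verified directly for a candidate pair $(x, w)$, with $v = c - A^T w$. This is a minimal sketch on an assumed one-constraint problem, minimize $x_1 + x_2$ subject to $x_1 + 2x_2 \ge 2$, $x \ge 0$. The data and the function name `kkt_holds` are hypothetical.

```python
def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in M]


def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def kkt_holds(A, b, c, x, w, tol=1e-9):
    """Check primal feasibility, dual feasibility and complementary slackness."""
    Ax = matvec(A, x)
    ATw = matvec([list(col) for col in zip(*A)], w)
    v = [cj - aj for cj, aj in zip(c, ATw)]  # v = c - A^T w
    slack = [axi - bi for axi, bi in zip(Ax, b)]
    primal = all(s >= -tol for s in slack) and all(xj >= -tol for xj in x)
    dual = all(wi >= -tol for wi in w) and all(vj >= -tol for vj in v)
    complementary = abs(dot(w, slack)) <= tol and abs(dot(v, x)) <= tol
    return primal and dual and complementary


# Assumed example: minimize x1 + x2 subject to x1 + 2*x2 >= 2, x >= 0.
A = [[1.0, 2.0]]
b = [2.0]
c = [1.0, 1.0]

print(kkt_holds(A, b, c, x=[0.0, 1.0], w=[0.5]))  # candidate optimum
print(kkt_holds(A, b, c, x=[2.0, 0.0], w=[0.5]))  # feasible but not optimal
```

At the first candidate the primal and dual objective values agree, $c^T x = b^T w = 1$. The second candidate is feasible but violates $v^T x = 0$.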




Proof.

Let $Gx \ge g$ be the set of inequalities from $Ax \ge b$, $x \ge 0$ that are binding at $x$. If $x$ is an optimal solution, then there cannot be an improving feasible direction. This means there is no direction $d$ such that

$c^T d < 0$ and $Gd \ge 0$,

i.e., the above system has no solution. Hence by Farkas’ Lemma there is $u \ge 0$ such that

$G^T u = c$.



Some applications of Farkas’ Lemma in nonlinear programming:

Gordan’s theorem is used in deriving the Fritz John

necessary conditions.

Fritz John conditions are in turn used in deriving KKT

necessary conditions.



Generalizations

Lemma (Farkas’ Lemma)

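Let $W$ be a real vector space. Let $\alpha_1, \ldots, \alpha_m$ and $\gamma$ be linear forms on $W$. Then

$\alpha_1(x) \le 0 \wedge \cdots \wedge \alpha_m(x) \le 0 \implies \gamma(x) \le 0$

holds for all $x \in W$ if and only if

$\exists\, u_1, \ldots, u_m \ge 0 : \gamma = u_1 \alpha_1 + \cdots + u_m \alpha_m$.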



Generalizations

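Lemma (Farkas’ Lemma, lexicographic version)

Let $W$ be a real vector space. Let $\alpha_1, \ldots, \alpha_m : W \to \mathbb{R}$ be linear functionals on $W$. Furthermore, let $\gamma : W \to \mathbb{R}^N$ be a linear mapping. Then

$\forall x \in W : \alpha_1(x) \le 0 \wedge \cdots \wedge \alpha_m(x) \le 0 \implies \gamma(x) \preceq 0$

if and only if

$\exists\, u_1, \ldots, u_m \succeq 0$ in $\mathbb{R}^N$ : $\gamma = \alpha_1 u_1 + \cdots + \alpha_m u_m$,

where $\preceq$ and $\succeq$ denote the lexicographic ordering of $\mathbb{R}^N$.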






Thank you

