
ANDREW TULLOCH

MATHEMATICS OF OPERATIONS RESEARCH

TRINITY COLLEGE

THE UNIVERSITY OF CAMBRIDGE

Contents

1 Generalities

2 Constrained Optimization

2.1 Lagrangian Multipliers

2.2 Lagrange Dual

2.3 Supporting Hyperplanes

3 Linear Programming

3.1 Convexity and Strong Duality

3.2 Linear Programs

3.3 Linear Program Duality

3.4 Complementary Slackness

4 Simplex Method

4.1 Basic Solutions

4.2 Extreme Points and Optimal Solutions

4.3 The Simplex Tableau

4.4 The Simplex Method in Tableau Form

5 Advanced Simplex Procedures

5.1 The Two-Phase Simplex Method

5.2 Gomory's Cutting Plane Method

6 Complexity of Problems and Algorithms

6.1 Asymptotic Complexity

7 The Complexity of Linear Programming

7.1 A Lower Bound for the Simplex Method

7.2 The Idea for a New Method

8 Graphs and Flows

8.1 Introduction

8.2 Minimal Cost Flows

9 Transportation and Assignment Problems

10 Non-Cooperative Games

11 Strategic Equilibrium

12 Bibliography

1 Generalities

Course website: www.statslab.cam.ac.uk/~ff271/teachin/mor. Contact: [email protected].

2 Constrained Optimization

Minimize f(x) subject to h(x) = b, x ∈ X. Call this problem P. Here f : Rn → R is the objective function, x ∈ Rn is the vector of decision variables, h : Rn → Rm and b ∈ Rm give the functional constraint, and X ⊆ Rn is the regional constraint.

Definition 2.1. The feasible set is X(b) = {x ∈ X : h(x) = b}.

An inequality constraint of the form g(x) ≤ b can be written as g(x) + z = b, where z ∈ Rm is called a slack variable, with regional constraint z ≥ 0.

2.1 Lagrangian Multipliers

Definition 2.2. Define the Lagrangian of a problem as

L(x, λ) = f(x) − λT(h(x) − b) (2.1)

where λ ∈ Rm is a vector of Lagrange multipliers.

Theorem 2.3 (Lagrangian Sufficiency Theorem). Let x ∈ X and λ ∈ Rm be such that

L(x, λ) = inf_{x′∈X} L(x′, λ) (2.2)

and h(x) = b. Then x is optimal for P.


Proof.

min_{x′∈X(b)} f(x′) = min_{x′∈X(b)} [f(x′) − λT(h(x′) − b)] (2.3)

≥ min_{x′∈X} [f(x′) − λT(h(x′) − b)] (2.4)

= f(x) − λT(h(x) − b) (2.5)

= f(x) (2.6)

Here (2.3) holds since h(x′) = b for x′ ∈ X(b), (2.4) since X(b) ⊆ X, (2.5) by assumption (2.2), and (2.6) since h(x) = b.

2.2 Lagrange Dual

Definition 2.4. Let

φ(b) = inf_{x∈X(b)} f(x). (2.7)

Define the Lagrange dual function g : Rm → R by

g(λ) = inf_{x∈X} L(x, λ). (2.8)

Then, for all λ ∈ Rm,

inf_{x∈X(b)} f(x) = inf_{x∈X(b)} L(x, λ) ≥ inf_{x∈X} L(x, λ) = g(λ). (2.9)

That is, g(λ) is a lower bound on φ(b), the optimal value of the primal problem.

This motivates the dual problem: maximize g(λ) subject to λ ∈ Y, where Y = {λ ∈ Rm : g(λ) > −∞}.

Theorem 2.5 (Duality). From (2.9), we see that the optimal value of the primal is always at least the optimal value of the dual. This is weak duality.
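As a quick numerical illustration of weak duality, here is a sketch on a made-up convex problem (not one from the course): minimize f(x) = x1² + x2² subject to x1 + x2 = 1, where φ(b) = 1/2.

```python
import numpy as np

# Toy problem (hypothetical): min x1^2 + x2^2 s.t. x1 + x2 = 1, so phi(b) = 1/2.
def g(lam):
    # Dual function g(lam) = inf_x [x1^2 + x2^2 - lam*(x1 + x2 - 1)].
    # The unconstrained minimizer is x1 = x2 = lam/2, giving:
    return -lam**2 / 2 + lam

phi_b = 0.5
lams = np.linspace(-2.0, 3.0, 1001)
assert all(g(l) <= phi_b + 1e-12 for l in lams)  # weak duality: g(lam) <= phi(b)
print(max(g(l) for l in lams))  # 0.5, attained at lam = 1: the bound is tight here
```

For this convex problem the dual maximum equals φ(b), a preview of the strong duality discussed in Chapter 3.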

2.3 Supporting Hyperplanes

Fix b ∈ Rm and consider φ as a function of c ∈ Rm. Further, consider the hyperplane given by α : Rm → R with

α(c) = β − λT(b − c). (2.10)

Now, try to find φ(b) as follows.


(i) For each λ, find

β_λ = max{β : α(c) ≤ φ(c) for all c ∈ Rm}. (2.11)

(ii) Choose λ to maximize β_λ.

Definition 2.6. Call α : Rm → R a supporting hyperplane to φ at b if

α(c) = φ(b)− λT(b− c) (2.12)

and

φ(c) ≥ φ(b)− λT(b− c) (2.13)

for all c ∈ Rm.
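For intuition, here is a small numerical check of Definition 2.6, a sketch using the hypothetical one-dimensional value function φ(c) = c² (not one from the notes):

```python
import numpy as np

# For the convex function phi(c) = c^2, the hyperplane at b = 1 with slope
# lambda = phi'(b) = 2b satisfies both (2.12) and (2.13).
phi = lambda c: c**2
b, lam = 1.0, 2.0
alpha = lambda c: phi(b) - lam * (b - c)

cs = np.linspace(-3.0, 3.0, 601)
assert np.isclose(alpha(b), phi(b))          # (2.12): alpha touches phi at b
assert np.all(phi(cs) >= alpha(cs) - 1e-12)  # (2.13): alpha lies below phi
print("alpha supports phi at b =", b)
```

Indeed, c² − (2c − 1) = (c − 1)² ≥ 0, so the inequality (2.13) holds everywhere.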

Theorem 2.7. The following are equivalent:

(i) There exists a (non-vertical) supporting hyperplane to φ at b;

(ii) strong duality holds for P, that is, φ(b) = max_{λ∈Y} g(λ).

3 Linear Programming

3.1 Convexity and Strong Duality

Definition 3.1. Let S ⊆ Rn. S is convex if for all δ ∈ [0, 1], x, y ∈ S implies that δx + (1 − δ)y ∈ S.

f : S → R is convex if for all x, y ∈ S and δ ∈ [0, 1], δf(x) + (1 − δ)f(y) ≥ f(δx + (1 − δ)y).

Visually, the region above the graph of the function (its epigraph) is a convex set.

Definition 3.2. x ∈ S is an interior point of S if there exists ε > 0 such that {y : ‖y − x‖2 ≤ ε} ⊆ S.

x ∈ S is an extreme point of S if for all y, z ∈ S and δ ∈ (0, 1), x = δy + (1 − δ)z implies x = y = z.

Theorem 3.3 (Supporting Hyperplane). Suppose that our function φ is

convex and b lies in the interior of the set of points where φ is finite. Then

there is a (non-vertical) supporting hyperplane to φ at b.

Theorem 3.4. Let X(b) = {x ∈ X : h(x) ≤ b} and φ(b) = inf_{x∈X(b)} f(x). Then φ is convex if X, f, and h are convex.

Proof. Let b1, b2 ∈ Rm such that φ(b1), φ(b2) are defined. Let δ ∈ [0, 1]

and b = δb1 + (1− δ)b2. Consider x1 ∈ X(b1), x2 ∈ X(b2) and let

x = δx1 + (1− δ)x2.


By convexity of X, x ∈ X. By convexity of h,

h(x) = h(δx1 + (1 − δ)x2) ≤ δh(x1) + (1 − δ)h(x2) ≤ δb1 + (1 − δ)b2 = b.

Thus x ∈ X(b), and by convexity of f,

φ(b) ≤ f(x) = f(δx1 + (1 − δ)x2) ≤ δf(x1) + (1 − δ)f(x2).

Taking the infimum over x1 ∈ X(b1) and x2 ∈ X(b2) gives φ(b) ≤ δφ(b1) + (1 − δ)φ(b2).

3.2 Linear Programs

Definition 3.5. The general form of a linear program is

min{cTx : Ax ≥ b, x ≥ 0}. (3.1)

The standard form of a linear program is

min{cTx : Ax = b, x ≥ 0}. (3.2)
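As a sketch, a general-form LP like (3.1) can be solved with scipy's linprog; the data below is made up for illustration, and since linprog expects ≤ constraints, Ax ≥ b is encoded as −Ax ≤ −b:

```python
import numpy as np
from scipy.optimize import linprog

# Toy general-form LP: minimize c^T x subject to Ax >= b, x >= 0.
c = np.array([2.0, 3.0])
A = np.array([[1.0, 1.0],
              [1.0, 2.0]])
b = np.array([4.0, 6.0])

res = linprog(c, A_ub=-A, b_ub=-b, bounds=[(0, None)] * 2, method="highs")
print(res.x, res.fun)  # optimal vertex (2, 2) with objective value 10
```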

3.3 Linear Program Duality

Introduce slack variables to turn problem P into the form

min{cTx : Ax − z = b, x, z ≥ 0}. (3.3)

We have X = {(x, z) : x, z ≥ 0} ⊆ Rm+n. The Lagrangian is

L((x, z), λ) = cTx − λT(Ax − z − b) (3.4)

= (cT − λT A)x + λTz + λTb (3.5)


and has a finite minimum if and only if

λ ∈ Y = {λ : cT − λT A ≥ 0, λ ≥ 0}. (3.6)

For a fixed λ ∈ Y, the minimum of L is attained when (cT − λT A)x = 0 and λTz = 0, and thus

g(λ) = inf_{(x,z)∈X} L((x, z), λ) = λTb. (3.7)

We obtain the dual problem

max{λTb : ATλ ≤ c, λ ≥ 0} (3.8)

and it can be shown (exercise) that the dual of the dual of a linear program is the original program.
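Continuing the toy data from the sketch above, (3.8) can be checked numerically: solving the primal and the dual gives the same optimal value (strong duality for LPs is what makes the two values coincide):

```python
import numpy as np
from scipy.optimize import linprog

# Same toy data as in the previous sketch.
c = np.array([2.0, 3.0]); A = np.array([[1.0, 1.0], [1.0, 2.0]]); b = np.array([4.0, 6.0])

# Primal: min c^T x s.t. Ax >= b, x >= 0.
primal = linprog(c, A_ub=-A, b_ub=-b, bounds=[(0, None)] * 2, method="highs")
# Dual (3.8): max b^T lam s.t. A^T lam <= c, lam >= 0 (negated, as linprog minimizes).
dual = linprog(-b, A_ub=A.T, b_ub=c, bounds=[(0, None)] * 2, method="highs")

print(primal.fun, -dual.fun)  # both 10.0: the optimal values coincide
```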

3.4 Complementary Slackness

Theorem 3.6. Let x and λ be feasible for P and its dual. Then x and λ are

optimal if and only if

(cT − λT A)x = 0 (3.9)

λT(Ax− b) = 0 (3.10)

Proof. If x, λ are optimal, then

cTx = λTb (by strong duality) (3.11)

= inf_{x′∈X} (cTx′ − λT(Ax′ − b)) ≤ cTx − λT(Ax − b) ≤ cTx, (3.12)

where the first inequality takes x′ = x and the second uses λ ≥ 0 and Ax ≥ b. The two ends are equal, so the inequalities must be equalities. Thus

λTb = cTx − λT(Ax − b) = (cT − λT A)x + λTb, (3.13)

which gives (cT − λT A)x = 0, and

cTx − λT(Ax − b) = cTx, (3.14)

which gives λT(Ax − b) = 0.


Conversely, if (cT − λT A)x = 0 and λT(Ax − b) = 0, then

cTx = cTx − λT(Ax − b) = (cT − λT A)x + λTb = λTb, (3.15)

and so by weak duality, x and λ are optimal.
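The conditions (3.9) and (3.10) are easy to check numerically; a sketch on the same toy LP as above:

```python
import numpy as np
from scipy.optimize import linprog

c = np.array([2.0, 3.0]); A = np.array([[1.0, 1.0], [1.0, 2.0]]); b = np.array([4.0, 6.0])
x = linprog(c, A_ub=-A, b_ub=-b, bounds=[(0, None)] * 2, method="highs").x
lam = linprog(-b, A_ub=A.T, b_ub=c, bounds=[(0, None)] * 2, method="highs").x

print((c - A.T @ lam) @ x)  # (c^T - lam^T A) x ~ 0, as in (3.9)
print(lam @ (A @ x - b))    # lam^T (Ax - b) ~ 0, as in (3.10)
```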

4 Simplex Method

4.1 Basic Solutions

Maximize cTx subject to Ax = b, x ≥ 0, A ∈ Rm×n.

Call a solution x ∈ Rn of Ax = b basic if it has at most m non-zero entries, that is, if there exists B ⊆ {1, . . . , n} with |B| = m and xi = 0 for i ∉ B.

A basic solution x with x ≥ 0 is called a basic feasible solution

(bfs).

4.2 Extreme Points and Optimal Solutions

We make the following assumptions:

(i) The rows of A are linearly independent.

(ii) Every set of m columns of A is linearly independent.

(iii) Every basic solution is non-degenerate, that is, it has exactly m non-zero entries.

Theorem 4.1. x is a bfs of Ax = b if and only if it is an extreme point of the set X(b) = {x : Ax = b, x ≥ 0}.

Theorem 4.2. If the problem has a finite optimum (feasible and bounded),

then it has an optimal solution that is a bfs.


4.3 The Simplex Tableau

Let A ∈ Rm×n, b ∈ Rm. Let B be a basis (in the bfs sense) and x ∈ Rn such that Ax = b. Then

A_B x_B + A_N x_N = b (4.1)

where A_B ∈ Rm×m and A_N ∈ Rm×(n−m) respectively consist of the columns of A indexed by B and those not indexed by B. Moreover, if x is a basic solution, then there is a basis B such that x_N = 0 and A_B x_B = b, and if x is a basic feasible solution, there is a basis B such that x_N = 0, A_B x_B = b, and x_B ≥ 0.

For every x with Ax = b and every basis B, we have

x_B = A_B^{-1}(b − A_N x_N) (4.2)

as we assume that A_B has full rank. Thus,

f(x) = cTx = c_B^T x_B + c_N^T x_N (4.3)

= c_B^T A_B^{-1}(b − A_N x_N) + c_N^T x_N (4.4)

= c_B^T A_B^{-1} b + (c_N^T − c_B^T A_B^{-1} A_N) x_N (4.5)

Assume we can guarantee that A_B^{-1} b ≥ 0. Then x* with x*_B = A_B^{-1} b and x*_N = 0 is a bfs with

f(x*) = c_B^T A_B^{-1} b. (4.6)

Assume that we are maximizing cTx. There are two different cases:

(i) If c_N^T − c_B^T A_B^{-1} A_N ≤ 0, then f(x) ≤ c_B^T A_B^{-1} b for every feasible x, so x* is optimal.

(ii) If (c_N^T − c_B^T A_B^{-1} A_N)_i > 0 for some i, then we can increase the objective value by increasing the corresponding entry (x_N)_i.
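A minimal sketch of this computation, with toy data and a slack-variable starting basis assumed:

```python
import numpy as np

# Maximize c^T x subject to Ax = b, x >= 0, starting from the slack basis.
A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, 2.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([3.0, 2.0, 0.0, 0.0])
B = [2, 3]                                         # basic indices (the slacks)
N = [j for j in range(A.shape[1]) if j not in B]

A_B, A_N = A[:, B], A[:, N]
x_B = np.linalg.solve(A_B, b)                      # x_B = A_B^{-1} b, a bfs iff >= 0
reduced = c[N] - c[B] @ np.linalg.solve(A_B, A_N)  # c_N^T - c_B^T A_B^{-1} A_N
print(x_B, reduced)  # [4. 6.] and [3. 2.]
```

Here both reduced costs are positive, so case (ii) applies and either non-basic variable can enter the basis to improve the objective.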

4.4 The Simplex Method in Tableau Form

5 Advanced Simplex Procedures

5.1 The Two-Phase Simplex Method

Finding an initial bfs is easy if the constraints have the form Ax ≤ b with b ≥ 0, since adding slack variables gives

Ax + z = b, (x, z) = (0, b), (5.1)

a bfs of the augmented system.

5.2 Gomory’s Cutting Plane Method

This method is used in integer programming (IP). An integer program is a linear program in which, in addition, some of the variables are required to be integral.

Assume that for a given integer program we have found an optimal fractional solution x* with basis B, and let a_ij = (A_B^{-1} A_j)_i and a_i0 = (A_B^{-1} b)_i be the entries of the final tableau. If x* is not integral, then for some row i, a_i0 is not integral. For every feasible solution x,

x_i + ∑_{j∈N} ⌊a_ij⌋ x_j ≤ x_i + ∑_{j∈N} a_ij x_j = a_i0. (5.2)

If x is integral, then the left-hand side is integral as well, and the inequality must still hold if the right-hand side is rounded down. Thus,

x_i + ∑_{j∈N} ⌊a_ij⌋ x_j ≤ ⌊a_i0⌋. (5.3)

We can then add this constraint (a Gomory cut) to the problem and solve the augmented program; note that x* violates the cut, since x*_N = 0 gives a left-hand side of a_i0 > ⌊a_i0⌋. One can show that this procedure converges in a finite number of steps.
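A sketch of how a cut (5.3) is read off a tableau row; the numbers here are hypothetical:

```python
import numpy as np

a_row = np.array([0.5, 1.25, -0.75])  # a_ij for j in N, a row of A_B^{-1} A_N
a_i0 = 3.6                            # fractional entry of A_B^{-1} b

cut_coeffs = np.floor(a_row)          # floor(a_ij) = [0., 1., -1.]
cut_rhs = np.floor(a_i0)              # floor(a_i0) = 3.0
print(cut_coeffs, cut_rhs)
# The cut x_i + sum_j floor(a_ij) x_j <= 3 cuts off the current optimum,
# where x_N = 0 forces x_i = 3.6, but keeps every integral feasible point.
```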

6 Complexity of Problems and Algorithms

6.1 Asymptotic Complexity

We measure complexity as a function of input size. The input of a linear programming problem, c ∈ Rn, A ∈ Rm×n, b ∈ Rm, is represented in (n + m·n + m)·k bits if we represent each number using k bits.

For two functions f : R → R and g : R → R, write

f(n) = O(g(n)) (6.1)

if there exist c, n0 such that for all n ≥ n0,

f(n) ≤ c · g(n). (6.2)

Similarly, f(n) = Ω(g(n)) if f(n) ≥ c · g(n) for all n ≥ n0, and f(n) = Θ(g(n)) if f(n) is both O(g(n)) and Ω(g(n)).
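For example, f(n) = 3n + 5 is O(n): the definition holds with witnesses c = 4 and n0 = 5, which this toy check confirms on a range of n:

```python
# Witnesses for f(n) = 3n + 5 being O(n): c = 4, n0 = 5, since 3n + 5 <= 4n iff n >= 5.
f = lambda n: 3 * n + 5
assert all(f(n) <= 4 * n for n in range(5, 10_000))
```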

7 The Complexity of Linear Programming

7.1 A Lower Bound for the Simplex Method

Theorem 7.1. There exist an LP of size O(n2), a pivoting rule, and an initial bfs such that the simplex method requires 2^n − 1 iterations.

Proof. Consider the unit cube in Rn, given by the constraints 0 ≤ xi ≤ 1 for i = 1, . . . , n. Define a spanning path inductively as follows. In dimension 1, go from x1 = 0 to x1 = 1. In dimension k, set xk = 0 and follow the path for dimensions 1, . . . , k − 1; then set xk = 1 and follow the path for dimensions 1, . . . , k − 1 backwards.

The objective xn increases only once along this path. Instead consider the perturbed unit cube given by the constraints ε ≤ x1 ≤ 1 and εx_{i−1} ≤ xi ≤ 1 − εx_{i−1} with ε ∈ (0, 1/2).
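A sketch of the constraint matrix for this perturbed cube, assembled as {x : Ax ≥ b}; the construction follows the proof, while the encoding below is my own:

```python
import numpy as np

def perturbed_cube(n, eps=0.3):
    # Constraints eps <= x_1 <= 1 and eps*x_{i-1} <= x_i <= 1 - eps*x_{i-1},
    # written as rows of Ax >= b.
    e = np.eye(n)
    rows, rhs = [e[0]], [eps]                   # x_1 >= eps
    rows.append(-e[0]); rhs.append(-1.0)        # -x_1 >= -1
    for i in range(1, n):
        rows.append(e[i] - eps * e[i - 1]); rhs.append(0.0)    # x_i >= eps*x_{i-1}
        rows.append(-e[i] - eps * e[i - 1]); rhs.append(-1.0)  # x_i <= 1 - eps*x_{i-1}
    return np.array(rows), np.array(rhs)

A, b = perturbed_cube(3)
print(A.shape)  # (6, 3): 2n constraints on n variables, input size O(n^2)
```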

7.2 The Idea for a New Method

min{cTx : Ax = b, x ≥ 0} (7.1)

max{bTλ : ATλ ≤ c} (7.2)

By strong duality, each of these problems has a bounded optimal


solution if and only if the following set of constraints is feasible:

cTx = bTλ (7.3)

Ax = b (7.4)

x ≥ 0 (7.5)

ATλ ≤ c (7.6)

It is thus enough to decide, for a given A′ ∈ Rm×n and b′ ∈ Rm, whether {x ∈ Rn : A′x ≥ b′} ≠ ∅.

Definition 7.2. A symmetric matrix D ∈ Rn×n is called positive definite if xT Dx > 0 for every x ∈ Rn with x ≠ 0. Equivalently, all eigenvalues of the matrix are strictly positive.

Definition 7.3. A set E ⊆ Rn given by

E = E(z, D) = {x ∈ Rn : (x − z)T D^{-1} (x − z) ≤ 1} (7.7)

for a positive definite symmetric D ∈ Rn×n and z ∈ Rn is called an ellipsoid with center z.

Let P = {x ∈ Rn : Ax ≥ b} for some A ∈ Rm×n and b ∈ Rm. To decide whether P ≠ ∅, we generate a sequence of ellipsoids E_t with centers x_t. If x_t ∈ P, then P is non-empty and the method stops. If x_t ∉ P, then one of the constraints is violated, so there exists a row j of A such that a_j^T x_t < b_j. Therefore, P is contained in the half-space {x ∈ Rn : a_j^T x ≥ a_j^T x_t}, and in particular in the intersection of this half-space with E_t, which we will call a half-ellipsoid.

Theorem 7.4. Let E = E(z, D) be an ellipsoid in Rn and a ∈ Rn, a ≠ 0. Consider the half-space H = {x ∈ Rn : aTx ≥ aTz}, and let

z′ = z + (1/(n + 1)) · Da/√(aT Da) (7.8)

D′ = (n²/(n² − 1)) · (D − (2/(n + 1)) · D a aT D/(aT Da)) (7.9)

Then D′ is symmetric and positive definite, and therefore E′ = E(z′, D′) is an ellipsoid. Moreover, E ∩ H ⊆ E′ and Vol(E′) < e^{−1/(2(n+1))} Vol(E).
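A sketch of one update (7.8)-(7.9) in numpy, starting from the unit ball; the data is made up:

```python
import numpy as np

def ellipsoid_step(z, D, a):
    # One step of (7.8)-(7.9): enclose the half-ellipsoid
    # E(z, D) intersected with {x : a^T x >= a^T z} in a smaller ellipsoid.
    n = len(z)
    Da = D @ a
    z_new = z + Da / ((n + 1) * np.sqrt(a @ Da))
    D_new = (n**2 / (n**2 - 1.0)) * (D - (2.0 / (n + 1)) * np.outer(Da, Da) / (a @ Da))
    return z_new, D_new

z, D = np.zeros(2), np.eye(2)    # unit ball in R^2
a = np.array([1.0, 0.0])
z1, D1 = ellipsoid_step(z, D, a)
print(z1)                        # [1/3, 0]: the center moves into the kept half-space
print(np.linalg.eigvalsh(D1))    # all positive: D' is positive definite
# Volume ratio sqrt(det(D1)/det(D)) ~ 0.77 < e^{-1/(2(n+1))} ~ 0.85 for n = 2.
```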

8 Graphs and Flows

8.1 Introduction

Consider a directed graph (network) G = (V, E), with V the set of vertices and E ⊆ V × V the set of edges. The graph is undirected if E is symmetric.

8.2 Minimal Cost Flows

Fill in this stuff from lectures.

9 Transportation and Assignment Problems

10 Non-Cooperative Games

Theorem 10.1 (von Neumann, 1928). Let P ∈ Rm×n be a payoff matrix, and let p(x, y) = xT Py for mixed strategies x ∈ X and y ∈ Y. Then

max_{x∈X} min_{y∈Y} p(x, y) = min_{y∈Y} max_{x∈X} p(x, y). (10.1)
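The value of a matrix game can be computed by linear programming; here is a sketch for a toy payoff matrix (matching pennies), maximizing v subject to (P^T x)_j ≥ v for every pure strategy j of the column player:

```python
import numpy as np
from scipy.optimize import linprog

P = np.array([[1.0, -1.0],
              [-1.0, 1.0]])   # matching pennies, payoff to the row player
m, n = P.shape

# Variables (x, v): maximize v s.t. (P^T x)_j >= v, sum(x) = 1, x >= 0.
c = np.zeros(m + 1); c[-1] = -1.0              # linprog minimizes, so minimize -v
A_ub = np.hstack([-P.T, np.ones((n, 1))])      # v - (P^T x)_j <= 0 for each column j
A_eq = np.hstack([np.ones((1, m)), [[0.0]]])   # probabilities sum to one
res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n), A_eq=A_eq, b_eq=[1.0],
              bounds=[(0, None)] * m + [(None, None)], method="highs")
print(res.x[:m], -res.fun)  # x = (1/2, 1/2), game value 0
```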

11 Strategic Equilibrium

Definition 11.1. x ∈ X is a best response to y ∈ Y if p(x, y) = max_{x′∈X} p(x′, y). (x, y) ∈ X × Y is a Nash equilibrium if x is a best response to y and y is a best response to x.

Theorem 11.2. (x, y) ∈ X × Y is an equilibrium of the matrix game P if and only if

min_{y′∈Y} p(x, y′) = max_{x′∈X} min_{y′∈Y} p(x′, y′) (11.1)

and

max_{x′∈X} p(x′, y) = min_{y′∈Y} max_{x′∈X} p(x′, y′). (11.2)
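A quick check of these conditions for matching pennies, where x = y = (1/2, 1/2) is the equilibrium (a sketch with the same toy payoff as above):

```python
import numpy as np

P = np.array([[1.0, -1.0], [-1.0, 1.0]])
x = y = np.array([0.5, 0.5])

# Against x, every pure column gives payoff 0, so min_{y'} p(x, y') = 0,
# the game value; symmetrically, max_{x'} p(x', y) = 0.
print(min(x @ P), max(P @ y))  # 0.0 and 0.0: (11.1) and (11.2) hold
```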

Theorem 11.3. Let (x, y), (x′, y′) ∈ X×Y be equilibria of the matrix game

with payoff matrix P. Then p(x, y) = p(x′, y′) and (x, y′) and (x′, y) are

equilibria as well.

Theorem 11.4 (Nash, 1951). Every bimatrix game has an equilibrium.

12 Bibliography