



IEOR 266, Fall 2003

Graph Algorithms and Network Flows

Professor Dorit Hochbaum

Contents

1 Introduction
  1.1 Assignment Problem
  1.2 Minimum Cost Network Flow Problem
  1.3 Transportation Problem
  1.4 Example of general MCNF Problem - the chairs problem
  1.5 Assigning Individuals to Tasks
  1.6 Basic graph Definitions

2 Some Network Problems
  2.1 Classes of Network Flow Problems
  2.2 Other Network Problems

3 Totally Unimodular Matrices

4 Complexity Analysis
  4.1 Measuring Quality of an Algorithm
    4.1.1 Examples
  4.2 Growth of Functions
  4.3 Definitions for asymptotic comparisons of functions
  4.4 Properties
  4.5 Caveats
  4.6 A Sketch of the Ellipsoid Method

5 NP completeness
  5.1 Search vs. Decision
  5.2 NP
    5.2.1 Some Problems in NP
  5.3 Co-NP
    5.3.1 Some Problems in Co-NP
  5.4 NP and Co-NP
  5.5 NP-completeness and reductions
    5.5.1 Reducibility
    5.5.2 NP-Completeness


IEOR266 notes, Prof. Hochbaum, 2003 ii

6 Graph Representations and Graph Search Algorithms
  6.1 Representations of a Graph
  6.2 Searching a Graph
    6.2.1 The complexity of the search procedures

7 Shortest Paths Problem
  7.1 Introduction
  7.2 Topological Sort and directed acyclic graphs
  7.3 Properties of shortest paths
  7.4 Shortest Paths in a DAG
  7.5 Applications of Shortest/Longest Path
  7.6 Reduction from Hamiltonian Cycle to Longest Path Problem
  7.7 The mathematical programming formulation of the longest path and shortest paths problems
  7.8 Dijkstra's Algorithm for the Single Source Shortest Path Problem for Graphs with Nonnegative Edge Weights
  7.9 Bellman-Ford Algorithm for Single Source Shortest Paths
  7.10 Floyd-Warshall Algorithm for All Pairs Shortest Paths
  7.11 D.B. Johnson's Algorithm for All Pairs Shortest Paths
  7.12 Matrix Multiplication Algorithm for All Pairs Shortest Paths
  7.13 Why finding shortest paths in the presence of negative cost cycles is difficult

8 Minimum Cost Spanning Tree Problem - taught by Professor Atamturk
  8.1 Kruskal's algorithm for MST
    8.1.1 Algorithm
    8.1.2 Complexity
  8.2 Combinatorial issues

9 Maximum Flow Problem
  9.1 Introduction
  9.2 Linear Programming duality and Max flow Min cut
  9.3 Hall's Theorem
  9.4 Amazing Applications of Minimum Cut
    9.4.1 Selection Problem
    9.4.2 The maximum closure problem
    9.4.3 The open-pit mining problem
    9.4.4 Vertex Cover on Bipartite Graphs
    9.4.5 LP Formulation of Minimum Vertex Cover
    9.4.6 Solving the vertex cover problem on bipartite graphs
    9.4.7 Solving the LP relaxation of the non-bipartite vertex cover
    9.4.8 Forest Clearing
    9.4.9 Producing Memory Chips (VLSI Layout)
  9.5 Flow Decomposition
  9.6 Maximum Flow Algorithms
    9.6.1 Ford-Fulkerson Algorithm
    9.6.2 Capacity Scaling Algorithm
    9.6.3 Maximum Capacity Augmenting Path Algorithm


  9.7 Dinic's Algorithm
    9.7.1 Dinic's Algorithm for Unit Capacity Networks
    9.7.2 Dinic's Algorithm for Simple Networks

10 Necessary and Sufficient Condition for Feasible Flow in a Network
  10.1 For a network with zero lower bounds
  10.2 In the presence of positive lower bounds
  10.3 For a circulation problem
  10.4 For the transportation problem

11 Goldberg's Algorithm - the Push/Relabel Algorithm
  11.1 Overview
  11.2 The Generic Algorithm
  11.3 Variants of Push/Relabel Algorithm
  11.4 Wave Implementation of Goldberg's Algorithm (Lift-to-front)

12 A sketch of the pseudoflow algorithm for maximum flow - Hochbaum's algorithm
  12.1 Preliminaries
  12.2 The maximum flow problem and the maximum s-excess problem
    12.2.1 The s-excess problem and maximum closure problem
  12.3 Superoptimality
  12.4 The description of the pseudoflow algorithm
  12.5 Finding a merger arc and use of labels

13 Algorithms for MCNF
  13.1 Network Simplex
  13.2 (T,L,U) structure of an optimal solution
  13.3 Simplex' basic solutions and the corresponding spanning trees
  13.4 An example of simplex iteration
  13.5 Optimality Conditions
  13.6 Implied algorithms – Cycle cancelling
  13.7 Analogy to maximum flow
  13.8 The gap between the primal and dual

These notes are based on "scribe" notes taken by students attending Professor Hochbaum's course IEOR 266 in the past 8 years. The notes were improved by the editing of Anu Pathria and Eli Olinick, who served as TAs for the course in 1994 and 1995, respectively. The current version has been updated and edited by Professor Hochbaum and uses some scribe notes from Fall 1999, 2000, and 2003.

The textbook used for the course, and mentioned in the notes, is Network Flows: Theory, Algorithms and Applications by Ravindra K. Ahuja, Thomas L. Magnanti and James B. Orlin, published by Prentice-Hall, 1993. The notes also make reference to the book Combinatorial Optimization: Algorithms and Complexity by Christos H. Papadimitriou and Kenneth Steiglitz, published by Prentice Hall, 1982.


1 Introduction

We will begin the study of network flow problems with a review of the formulation of linear programming (LP) problems. Let the number of decision variables, the x_j's, be N, and the number of constraints be M. LP problems take the following generic form:

min   ∑_{j=1}^{N} c_j x_j
s.t.  ∑_{j=1}^{N} a_{i,j} x_j ≤ b_i   ∀i ∈ {1, …, M}
      x_j ≥ 0                         ∀j ∈ {1, …, N}     (1)

Integer linear programming (ILP) has the following additional constraint:

x_j integer   ∀j ∈ {1, …, N}     (2)

It may appear that ILP problems are simpler than LP problems, since the solution space in ILP is countable while the solution space in LP is uncountably infinite; one obvious approach to ILP is enumeration, i.e., systematically try out all feasible solutions and find one with the minimum cost. As we will see in the following ILP example, enumeration is not an acceptable algorithm, as its complexity is too high.

1.1 Assignment Problem

In words, the assignment problem is the following: given n tasks to be completed individually by n people, what is the minimum cost of assigning all n tasks to the n people, one task per person, where c_{i,j} is the cost of assigning task j to person i? Let the decision variables be defined as:

x_{i,j} = 1 if person i takes on task j, and 0 otherwise.     (3)

The problem can now be formulated as:

min   ∑_{j=1}^{n} ∑_{i=1}^{n} c_{i,j} x_{i,j}
s.t.  ∑_{j=1}^{n} x_{i,j} = 1   ∀i = 1, …, n
      ∑_{i=1}^{n} x_{i,j} = 1   ∀j = 1, …, n
      x_{i,j} ≥ 0, x_{i,j} integer     (4)

The first set of constraints ensures that each person is assigned exactly one task; the second set of constraints ensures that exactly one person is assigned to each task. Notice that the upper bound constraints on the x_{i,j} are unnecessary. Also notice that the set of constraints is not independent (the sum of the first set of constraints equals the sum of the second set), meaning that one of the 2n constraints can be eliminated.

While at first it may seem that the integrality condition limits the number of possible solutions and could thereby make the integer problem easier than the continuous problem, the opposite is actually true. Linear programming optimization problems have the property that there exists an optimal solution at a so-called extreme point (a basic solution); the optimal solution of an integer program, however, is not guaranteed to satisfy any such property, and the number of possible integer-valued solutions to consider becomes prohibitively large, in general.


70! = 1197857166996989179607278372168909
      8736458938142546425857555362864628
      009582789845319680000000000000000
    ≈ 2^{332}

Figure 1: 70!

Consider a simple algorithm for solving the Assignment Problem: it enumerates all possible assignments and takes the cheapest one. Notice that there are n! possible assignments. If we consider an instance of the problem in which there are 70 people and 70 tasks, that means that there are 70! (see Figure 1) different assignments to consider. Our simple algorithm, while correct, is not at all practical! The existence of an algorithm is very different from the existence of a useful algorithm.
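The scale of n! can be checked directly, and the brute-force algorithm itself takes only a few lines; a minimal sketch, in which the 3×3 cost matrix is made up purely for illustration:

```python
import math
from itertools import permutations

# 70! is astronomically large: more than 2^332 candidate assignments.
assert math.factorial(70) > 2**332

def assign_by_enumeration(cost):
    """Brute force: try all n! assignments.
    cost[i][j] = cost of person i performing task j."""
    n = len(cost)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):      # n! candidates
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_cost:
            best_perm, best_cost = perm, total
    return best_perm, best_cost

# A made-up 3x3 instance (person i is assigned task perm[i]).
cost = [[4, 2, 8],
        [4, 3, 7],
        [3, 1, 6]]
perm, value = assign_by_enumeration(cost)
print(perm, value)  # (0, 2, 1) 12
```

The loop body is trivial; the n! iteration count is what makes the method useless beyond very small n.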

While linear programming belongs to the class of problems P for which "good" algorithms exist (an algorithm is said to be good if its running time is bounded by a polynomial in the size of the input), integer programming belongs to the class of NP-hard problems, for which it is considered highly unlikely that a "good" algorithm exists. Fortunately, as we will see later in this course, the Assignment Problem belongs to a special class of integer programming problems known as the Minimum Cost Network Flow Problem, for which efficient polynomial algorithms do exist.

The reason for the tractability of the assignment problem is found in the form of the constraint matrix. The constraint matrix is totally unimodular (TUM). Observe the form of the constraint matrix A:

Each row corresponds to one of the 2n constraints and each column to one variable x_{i,j}; for instance, for n = 2 (columns ordered x_{1,1}, x_{1,2}, x_{2,1}, x_{2,2}):

A = ( 1 1 0 0 )
    ( 0 0 1 1 )
    ( 1 0 1 0 )
    ( 0 1 0 1 )     (5)

We first notice that each column contains exactly two 1's. Notice also that the rows of matrix A can be partitioned into two sets, say A1 and A2 (the person constraints and the task constraints), such that the two 1's of each column are in different sets. It turns out these two conditions are sufficient for the matrix A to be TUM. Note that simply having 0's and 1's is not sufficient for the matrix to be TUM. Consider the constraint matrix for the vertex cover problem:

Each row corresponds to an edge [u, v] and has 1's in the columns of its two endpoints; for instance, for the triangle on three vertices:

A = ( 1 1 0 )
    ( 0 1 1 )
    ( 1 0 1 )     (6)

Although this matrix also contains only 0's and 1's, it is not TUM. In fact, vertex cover is an NP-complete problem. We will revisit the vertex cover problem in the future. We now turn our attention to the formulation of the generic minimum cost network flow problem.
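A concrete way to see that 0-1 entries alone do not guarantee total unimodularity is to check determinants: TUM requires every square submatrix to have determinant 0, 1, or −1. A small sketch, using the edge-vertex incidence matrix of a triangle (a standard non-TUM example, not taken verbatim from the notes):

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    a, b, c = m[0]
    d, e, f = m[1]
    g, h, i = m[2]
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Edge-vertex incidence matrix of a triangle: one row per edge, with 1's
# in the columns of the edge's two endpoints (constraints x_u + x_v >= 1).
triangle = [[1, 1, 0],
            [0, 1, 1],
            [1, 0, 1]]
print(det3(triangle))  # 2: a square submatrix with |det| > 1, so not TUM
```

The determinant 2 certifies non-unimodularity; more generally, every odd cycle produces such a submatrix.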


1.2 Minimum Cost Network Flow Problem

We now formulate the Minimum Cost Network Flow Problem. In general,

Problem Instance: Given a network G = (N, A), with a cost c_{ij}, upper bound u_{ij}, and lower bound l_{ij} associated with each directed arc (i, j), and supply b_v at each node v.

Integer Programming Formulation: The problem is that of finding a cheapest integer-valued flow:

min   ∑_{j=1}^{n} ∑_{i=1}^{n} c_{ij} x_{ij}
s.t.  ∑_{k∈N} x_{ik} − ∑_{k∈N} x_{ki} = b_i   ∀i ∈ N
      l_{ij} ≤ x_{ij} ≤ u_{ij}, x_{ij} integer   ∀(i, j) ∈ A

The first set of constraints is called the flow balance constraints. The second set is the capacity constraints. Nodes with negative supply are sometimes referred to as demand nodes, and nodes with 0 supply are sometimes referred to as transshipment nodes. Notice that the constraints are dependent (the left-hand sides add up to 0, so the right-hand sides must add up to 0 for consistency) and, as before, one of the constraints can be removed.

The distinctive feature of the flow balance constraint matrix is that each column has precisely one 1 and one −1 in it. Can you figure out why?

For instance, for a graph on nodes {1, 2, 3} with arcs (1,2), (2,3), (1,3), one column per arc (+1 in the tail's row, −1 in the head's row):

A = (  1  0  1 )
    ( −1  1  0 )
    (  0 −1 −1 )     (7)

It turns out that all extreme points of the linear programming relaxation of the Minimum Cost Network Flow Problem are integer valued (assuming that the supplies and upper/lower bounds are integer valued). This matrix, which has one 1 and one −1 per column (or row), is totally unimodular (a concept we will elaborate upon later), and therefore MCNF problems can be solved using LP techniques. However, we will investigate more efficient techniques to solve MCNF problems, which have found applications in many areas. In fact, the assignment problem is a special case of the MCNF problem, where each person is represented by a supply node with supply 1, each job is represented by a demand node with demand 1, and there is an arc from each person i to each job j, labeled by c_{i,j}, the cost of person i performing job j.
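The one +1/one −1 column structure is easy to verify programmatically; a minimal sketch on a made-up three-node digraph:

```python
def incidence_matrix(n, arcs):
    """Node-arc incidence matrix of a digraph on nodes 0..n-1:
    one column per arc (i, j), with +1 in row i (the tail)
    and -1 in row j (the head)."""
    A = [[0] * len(arcs) for _ in range(n)]
    for col, (i, j) in enumerate(arcs):
        A[i][col] = 1
        A[j][col] = -1
    return A

A = incidence_matrix(3, [(0, 1), (1, 2), (0, 2)])
# Every column contains exactly one +1 and one -1; this is the
# structural property behind total unimodularity of flow matrices.
for col in range(3):
    assert sorted(A[row][col] for row in range(3)) == [-1, 0, 1]
print(A)  # [[1, 0, 1], [-1, 1, 0], [0, -1, -1]]
```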

1.3 Transportation Problem

A problem similar to the minimum cost network flow problem, and an important special case of it, is the transportation problem: the nodes are partitioned into suppliers and consumers only. There are no transshipment nodes. All edges go directly from suppliers to consumers and have infinite capacity and known per-unit cost. The problem is to satisfy all demands at minimum cost. The assignment problem is a special case of the transportation problem, in which all supply values and all demand values are 1.


[Figure: three plants P1, P2, P3 with supplies 180, 150, 280 and five warehouses W1, …, W5 with demands 150, 80, 160, 100, 120; each arc from a plant to a warehouse is labeled (lower bound, upper bound, cost), e.g. (0, inf, 4).]

Figure 2: Example of a Transportation Problem

The graph that appears in the transportation problem is a so-called bipartite graph, or a graph in which all vertices can be divided into two classes, so that no edge connects two vertices in the same class:

G = (V_1 ∪ V_2; A),  where (u, v) ∈ A implies u ∈ V_1, v ∈ V_2     (8)

A property of bipartite graphs is that they are 2-colorable, that is, it is possible to assign each vertex a "color" out of a 2-element set so that no two vertices of the same color are connected.
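The 2-coloring can be computed with a breadth-first search that alternates colors between layers; a minimal sketch (the function name and the edge lists are made up for illustration):

```python
from collections import deque

def two_color(n, edges):
    """Return a list of 0/1 colors if the graph is bipartite, else None."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    color = [None] * n
    for start in range(n):                    # handle disconnected graphs
        if color[start] is not None:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if color[v] is None:
                    color[v] = 1 - color[u]   # opposite color
                    queue.append(v)
                elif color[v] == color[u]:
                    return None               # odd cycle: not bipartite
    return color

print(two_color(4, [(0, 2), (0, 3), (1, 2)]))  # [0, 0, 1, 1]
print(two_color(3, [(0, 1), (1, 2), (0, 2)]))  # None (triangle: odd cycle)
```

A graph is bipartite exactly when it has no odd cycle, which is what the failure case detects.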

An example transportation problem is shown in Figure 2. Let s_i be the supplies of nodes in V_1, and d_j be the demands of nodes in V_2. The problem can be formulated as:

min   ∑_{i∈V_1, j∈V_2} c_{i,j} x_{i,j}
s.t.  ∑_{j∈V_2} x_{i,j} = s_i      ∀i ∈ V_1
      −∑_{i∈V_1} x_{i,j} = −d_j    ∀j ∈ V_2
      x_{i,j} ≥ 0, x_{i,j} integer     (9)

Note that the assignment problem is a transportation problem with s_i = d_j = 1 and |V_1| = |V_2|.
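The integrality of extreme-point solutions can be observed by solving a small instance of formulation (9) as a plain LP; this sketch assumes SciPy is available, and the 2-supplier, 3-consumer instance is made up for illustration:

```python
from scipy.optimize import linprog

# Made-up instance: supplies s = (3, 2), demands d = (2, 2, 1);
# variables x_{i,j} flattened row-major: x00, x01, x02, x10, x11, x12.
cost = [4, 2, 5, 3, 1, 2]
A_eq = [
    [1, 1, 1, 0, 0, 0],   # supply constraint of supplier 0
    [0, 0, 0, 1, 1, 1],   # supply constraint of supplier 1
    [1, 0, 0, 1, 0, 0],   # demand constraint of consumer 0
    [0, 1, 0, 0, 1, 0],   # demand constraint of consumer 1
    [0, 0, 1, 0, 0, 1],   # demand constraint of consumer 2
]
b_eq = [3, 2, 2, 2, 1]

# Dual simplex returns a basic (extreme-point) solution; by total
# unimodularity of the constraint matrix it is automatically integral.
res = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=(0, None),
              method="highs-ds")
print(res.fun)                        # optimal cost: 13.0
print([round(v, 6) for v in res.x])   # all flows are integers
```

No integrality constraint is imposed anywhere, yet the LP vertex returned is an integer flow.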

1.4 Example of general MCNF Problem - the chairs problem

We now look at an example of a general network flow problem, as described in handout #2. We begin the formulation of the problem by defining the units of various quantities in the problem: we let a cost unit be 1 cent, and a flow unit be 1 chair. To construct the associated graph for the problem, we first consider the wood sources and the plants, denoted by nodes WS1, WS2 and P1, …, P4 respectively, in Figure 3a. Because each wood source can supply wood to each plant, there is an arc from each source to each plant. To reflect the transportation cost of wood from wood source to plant, the arc will have a cost, along with a capacity lower bound (zero) and upper bound (infinity). Arc (WS2, P4) in Figure 3a is labeled in this manner.

It is not clear exactly how much wood each wood source should supply, although we know that the contract specifies that each source must supply at least 800 chairs' worth of wood (or 8 tons). We represent this specification by node splitting: we create an extra node S, with an arc connecting it to each wood source with a capacity lower bound of 800. The cost of the arc reflects the cost of wood at each wood source. Arc (S, WS2) is labeled in this manner. Alternatively, the cost of wood can


[Figure: panels a) wood source & plants and b) plants & cities, over nodes S, WS1, WS2, P1–P4, split nodes P1'–P4', cities NY, Au, SF, Ch, and demand node D; arcs carry labels of the form (lower bound, upper bound, cost), among them (800, inf, 150), (800, inf, 0), (250, 250, 400), (0, inf, 40), (0, inf, 190), (500, 1500, 0), (500, 1500, −1800), (0, 250, 400), (0, 250, −1400).]

Figure 3: Chair Example of MCNF

[Figure: panels a) circulation arc and b) drainage arc over the same complete graph; both connecting arcs between D and S are labeled (0, inf, 0), and in panel b) node S carries supply (M) while node D carries (−M).]

Figure 4: Complete Graph for Chair Example

be lumped together with the cost of transportation to each plant. This is shown as the grey labels of the arcs in the figure.

Each plant has a production maximum, a production minimum, and a production cost per unit. Again, using the node splitting technique, we create an additional node for each plant node, and label the arc that connects the original plant node and the new plant node accordingly. Arc (P4, P4') is labeled in this manner in the figure.

Each plant can distribute chairs to each city, and so there is an arc connecting each plant to each city, as shown in Figure 3b. The capacity upper bound is the production upper bound of the plant; the cost per unit is the transportation cost from the plant to the city. As with the wood sources, we don't know exactly how many chairs each city will demand, although we know there are demand upper and lower bounds. We again represent this specification using the node splitting technique; we create an extra node D and specify the demand upper bound, the lower bound, and the selling price of each unit for each city by labeling the arc that connects the city and node D.

Finally, we need to relate the supply of node S to the demand of node D. There are two ways to accomplish this. We observe that it is a closed system; therefore the number of units supplied at node S is exactly the number of units demanded at node D. So we construct an arc from node D to node S, with label (0, ∞, 0), as shown in Figure 4a. The total number of chairs produced will be the number of flow units in this arc. Notice that all the nodes are now transshipment nodes; such formulations of problems are called circulation problems.

Alternatively, we can supply node S with a large supply of M units, much larger than what the system requires. We then create a drainage arc from S to D with label (0, ∞, 0), as shown in Figure 4b, so the excess amount will flow directly from S to D. If the chair production operation is a money-losing business, then all M units will flow directly from S to D.


1.5 Assigning Individuals to Tasks

Next, we consider an example of a problem that can be formulated as a Minimum Cost Network Flow (MCNF) Problem.

See the handout titled Finding the Minimal Number of Individuals Required to Meet a Fixed Schedule of Tasks.

Assigning Individuals to Tasks: In modeling the problem of determining the fewest number of individuals needed to complete a set of tasks (for each task, a start time and end time is given), we require that exactly one unit of flow pass through each node. Suppose, in general, that between l_i and u_i units of flow must pass through node i. We can model this requirement by splitting node i into two nodes, i1 and i2, and assigning a lower bound of l_i and an upper bound of u_i to arc (i1, i2). All arcs originally entering i now enter i1, and all arcs originally leaving node i now leave i2.

This device of node-splitting comes up frequently in network modeling (e.g., the maximum cardinality set of vertex-disjoint (s, t)-paths problem that will appear on Assignment 2).

The special case in which all supplies and demands are of value 1 is referred to as the Assignment Problem, which we have previously encountered.

The handout goes through the details for a particular instance of this problem.
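The node-splitting device can be written as a small graph transformation; a minimal sketch, in which the node names and bounds are made up for illustration:

```python
def split_nodes(arcs, node_bounds):
    """Transform node throughput bounds into arc bounds.

    arcs: list of (tail, head, lower, upper).
    node_bounds: {node: (l_i, u_i)} throughput requirement on the node.
    Each bounded node i becomes i_in -> i_out with bounds (l_i, u_i);
    arcs into i are redirected to i_in, arcs out of i leave from i_out.
    """
    def tail_of(v):
        return (v, "out") if v in node_bounds else v

    def head_of(v):
        return (v, "in") if v in node_bounds else v

    new_arcs = [(tail_of(u), head_of(v), lo, up) for u, v, lo, up in arcs]
    for i, (lo, up) in node_bounds.items():
        new_arcs.append(((i, "in"), (i, "out"), lo, up))
    return new_arcs

# Require exactly one unit of flow through node "b".
arcs = [("a", "b", 0, 5), ("b", "c", 0, 5)]
result = split_nodes(arcs, {"b": (1, 1)})
print(result)
```

The resulting arc list contains the new arc (("b","in"), ("b","out"), 1, 1), so the node requirement has become an ordinary arc capacity constraint.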

1.6 Basic graph Definitions

We consider graphs that are directed, of the form G = (V, A), where A is the set of directed arcs, namely ordered pairs (i, j). In an arc (i, j), node i is called the tail of the arc and node j the head of the arc.

An undirected graph, G = (V, E), consists of a set of edges E, which is a set of unordered pairs of the form [i, j].

1. A directed/undirected graph is said to be connected (and for a directed graph, strongly connected) if, for every (ordered) pair of nodes, there is a directed/undirected path in the graph from one node to the other.

2. The degree of a vertex is the number of edges incident to the vertex. In the homework assignment you will prove that ∑_{v∈V} degree(v) = 2|E|.

3. In a directed graph, the indegree of a node is the number of incoming arcs that have that node as a head. The outdegree of a node is the number of outgoing arcs that have that node as a tail. Be sure you can prove that ∑_{v∈V} indeg(v) = |A| and ∑_{v∈V} outdeg(v) = |A|.
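Both identities are easy to confirm on an example; a small sketch with a made-up arc list:

```python
from collections import Counter

# A made-up digraph on nodes 0..3.
arcs = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 0)]
nodes = range(4)

indeg = Counter(head for _tail, head in arcs)
outdeg = Counter(tail for tail, _head in arcs)

# Each arc contributes exactly once to some node's indegree and once
# to some node's outdegree, so both sums equal |A|.
assert sum(indeg[v] for v in nodes) == len(arcs)
assert sum(outdeg[v] for v in nodes) == len(arcs)

# Undirected version of the same arc set: each edge contributes 2.
degree = Counter()
for u, v in arcs:
    degree[u] += 1
    degree[v] += 1
assert sum(degree[v] for v in nodes) == 2 * len(arcs)
```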


4. A tree can be characterized as a connected graph with no cycles. The relevant property for this problem is that a tree with n nodes has n − 1 edges.

Definition: An undirected graph G = (V, T) is a tree if the following three properties are satisfied:

Property 1: |T| = |V| − 1.
Property 2: G is connected.
Property 3: G is acyclic.

(Actually, it is possible to show, by induction, that any two of the properties imply the third.)

5. A graph is bipartite if the vertices in the graph can be partitioned into two sets in such a way that no edge joins two vertices in the same set.

6. A maximum matching in a graph is a maximum-cardinality set of graph edges such that no two edges in the set are incident with the same vertex. We will consider both bipartite matching, which is the matching problem defined on a bipartite graph, and nonbipartite matching, which, as the name suggests, refers to matching in nonbipartite graphs.

2 Some Network Problems

2.1 Classes of Network Flow Problems

Recall the formulation of the Minimum Cost Network Flow problem.

(MCNF)
min   cx
s.t.  Tx = b      (flow balance constraints)
      l ≤ x ≤ u   (capacity bounds)

Here x_{i,j} represents the flow along the arc from i to j. We now look at some specialized variants.

The Maximum Flow problem is defined on a directed network with capacities, and no costs. In addition, two nodes are specified: a source node, s, and a sink node, t. The objective is to find the maximum flow possible between the source and sink while satisfying the arc capacities. If x_{i,j} represents the flow on arc (i, j), and A is the set of arcs in the graph, the problem can be formulated as:

(MaxFlow)
max   V
s.t.  ∑_{(s,i)∈A} x_{s,i} = V    (flow out of source)
      ∑_{(j,t)∈A} x_{j,t} = V    (flow into sink)
      ∑_{(i,k)∈A} x_{i,k} − ∑_{(k,j)∈A} x_{k,j} = 0    ∀k ≠ s, t
      l_{i,j} ≤ x_{i,j} ≤ u_{i,j}    ∀(i, j) ∈ A

The Maximum Flow problem and its dual, known as the Minimum Cut problem, have a number of applications and will be studied at length later in the course.

The Shortest Path problem is defined on a directed, weighted graph, where the weights may be thought of as distances. The objective is to find a path from a source node, s, to a sink node, t, that minimizes the sum of the weights along the path. To formulate this as a network flow problem, let x_{i,j} be 1 if the arc (i, j) is in the path and 0 otherwise. Place a supply of one unit at s and a demand of one unit at t. The formulation for the problem is:


(SP)
min   ∑_{(i,j)∈A} d_{i,j} x_{i,j}
s.t.  ∑_{(i,k)∈A} x_{i,k} − ∑_{(j,i)∈A} x_{j,i} = 0    ∀i ≠ s, t
      ∑_{(s,i)∈A} x_{s,i} = 1
      ∑_{(j,t)∈A} x_{j,t} = 1
      0 ≤ x_{i,j} ≤ 1

This problem is often solved using a variety of search and labeling algorithms, depending on the structure of the graph.
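One such labeling method, Dijkstra's algorithm (covered later in these notes), can be sketched with a priority queue; the example graph is made up for illustration:

```python
import heapq

def dijkstra(adj, s):
    """Label-setting shortest paths from s (nonnegative weights).
    adj[u] = list of (v, weight) pairs. Returns {node: distance}."""
    dist = {s: 0}
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # outdated queue entry, skip
        for v, w in adj.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w           # improved label for v
                heapq.heappush(heap, (d + w, v))
    return dist

# A made-up weighted digraph.
adj = {"s": [("a", 1), ("b", 4)],
       "a": [("b", 2), ("t", 6)],
       "b": [("t", 3)]}
print(dijkstra(adj, "s"))  # {'s': 0, 'a': 1, 'b': 3, 't': 6}
```

With nonnegative weights, a node's label is permanent the first time it is popped with its current distance, which is what makes this a label-setting method.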

A related problem is the single source shortest paths problem (SSSP). The objective now is to find the shortest path from node s to every other node t ∈ V \ {s}. The formulation is very similar, but now a flow is sent from s, which has supply n − 1, to every other node, each with demand one.

(SSSP)
min   ∑_{(i,j)∈A} d_{i,j} x_{i,j}
s.t.  ∑_{(i,k)∈A} x_{i,k} − ∑_{(j,i)∈A} x_{j,i} = −1    ∀i ≠ s
      ∑_{(s,i)∈A} x_{s,i} = n − 1
      0 ≤ x_{i,j} ≤ n − 1    ∀(i, j) ∈ A

But in this formulation we are minimizing the sum, over all nodes, of the costs of the paths to all the nodes. How can we be sure that this is a correct formulation for the single source shortest paths problem?

Theorem 2.1. For the problem of finding all shortest paths from a single node s, let X be a solution and A+ = {(i, j) | X_{i,j} > 0}. Then there exists an optimal solution such that the in-degree of each node j ∈ V − {s} in G = (V, A+) is exactly 1.

Proof: First, note that the in-degree of each node in V − {s} is ≥ 1, since every node has at least one path (the shortest path to it) terminating at it. If every optimal solution has some nodes with in-degree > 1, then pick an optimal solution that minimizes ∑_{j∈V−{s}} indeg(j).

In this solution, i.e., the graph G+ = (V, A+), take a node with in-degree > 1, say node i. Backtrack along two of the paths entering i until you reach a common node. A common node necessarily exists, since both paths start from s.

If one path is longer than the other, we have a contradiction, since we assumed an optimal solution to the LP formulation. If both paths are equal in length, we can construct an alternative solution.

For a path segment, define the bottleneck link as the link in this segment that carries the smallest amount of flow, and define this amount as the bottleneck amount of the segment. Now consider the two path segments that start at the common node and end at i, and compare their bottleneck amounts. Pick the segment with the smaller bottleneck amount and move that amount of flow to the other segment: subtract it from the in-flow and out-flow of every node in this segment and add it to the in-flow and out-flow of every node in the other segment. The cost remains the same, since both segments have equal cost, but the in-degree of the node at the head of the bottleneck link is reduced by 1, since there is no longer any flow through that link.

We thus have a solution where ∑_{j∈V−{s}} indeg(j) is reduced by 1 unit. This contradiction completes the proof.

Therefore the set A+ forms a tree. In this tree the path from s to each node is unique. This must be the optimum, as otherwise the objective value could be reduced.

Corollary 2.1. The shortest path to each node is the unique path in the tree A+.


The same theorem may also be proved by Linear Programming arguments, by showing that a basic solution generated by a linear program will always be acyclic in the undirected sense.

Bipartite Matching, or Bipartite Edge-packing, involves pairwise association on a bipartite graph. In a bipartite graph G = (V, E), the set of vertices V can be partitioned into V1 and V2 such that all edges in E are between V1 and V2. That is, no two vertices in V1 have an edge between them, and likewise for V2. A matching involves choosing a collection of edges so that no vertex is adjacent to more than one of them. A vertex is said to be "matched" if it is adjacent to an edge of the matching. The formulation for the problem is

(BM)

Max ∑_{(i,j)∈E} x_{i,j}
subject to
    ∑_{j:(i,j)∈E} x_{i,j} ≤ 1,   ∀ i ∈ V
    x_{i,j} ∈ {0, 1},            ∀ (i, j) ∈ E.

Clearly the maximum possible objective value is min{|V1|, |V2|}. If |V1| = |V2| it may be possible to have a perfect matching, that is, it may be possible to match every node.

The Bipartite Matching problem can also be converted into a maximum flow problem. Add a source node s and a sink node t. Make all edges directed arcs from, say, V1 to V2, and add a directed arc of capacity 1 from s to each node in V1 and from each node in V2 to t. Place a supply of min{|V1|, |V2|} at s and an equivalent demand at t. Now maximize the flow from s to t. The existence of a flow between two vertices corresponds to a matching. Due to the capacity constraints, no vertex in V1 will have flow to more than one vertex in V2, and no vertex in V2 will have flow from more than one vertex in V1.
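On this unit-capacity network, each augmenting path found by a max-flow algorithm adds exactly one matched pair. A minimal sketch of that augmenting-path search (Kuhn's algorithm), operating directly on the bipartite adjacency lists rather than on an explicit flow network; the instance is hypothetical:

```python
def max_bipartite_matching(adj, n_left):
    """adj[u] = neighbors in V2 of node u in V1 = {0,...,n_left-1}.
    Returns the cardinality of a maximum matching."""
    match_of = {}  # right vertex -> its matched left vertex

    def try_augment(u, visited):
        # Search for an augmenting path starting at free left vertex u.
        for v in adj.get(u, []):
            if v in visited:
                continue
            visited.add(v)
            # v is free, or its current partner can be re-matched elsewhere
            if v not in match_of or try_augment(match_of[v], visited):
                match_of[v] = u
                return True
        return False

    return sum(try_augment(u, set()) for u in range(n_left))

# Hypothetical instance: V1 = {0, 1, 2}, V2 = {'a', 'b'}; optimum is 2.
adj = {0: ['a'], 1: ['a', 'b'], 2: ['b']}
print(max_bipartite_matching(adj, 3))  # 2
```

This is the combinatorial equivalent of running max flow on the network described above, with the recursion playing the role of flow augmentation.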

Alternatively, one more arc from t back to s with cost −1 can be added, and the objective can be changed to maximizing the circulation. Figure 5 depicts the circulation formulation of the bipartite matching problem.

Figure 5: Circulation formulation of Maximum Cardinality Bipartite Matching

The problem may also be formulated as an assignment problem. Figure 6 depicts this formulation. An assignment problem requires that |V1| = |V2|. In order to meet this requirement we add dummy nodes to the deficient side of the bipartition, as well as dummy arcs, so that every node, dummy or original, may have its demand of 1 met. We assign costs of 0 to all original arcs and a cost of 1 to all dummy arcs. We then seek the assignment of least cost, which minimizes the number of dummy arcs used and hence maximizes the number of original arcs used.


Figure 6: Assignment formulation of Maximum Cardinality Bipartite Matching

The General Matching, or Nonbipartite Matching, problem involves maximizing the number of matched vertices in a general graph, or equivalently, maximizing the size of a collection of edges such that no two edges share an endpoint. This problem cannot be converted to a flow problem, but is still solvable in polynomial time.

Another related problem is the weighted matching problem, in which each edge (i, j) ∈ E has a weight w_{i,j} assigned to it and the objective is to find a matching of maximum total weight. The mathematical programming formulation of this maximum weight matching problem is as follows:

(WM)

Max ∑_{(i,j)∈E} w_{i,j} x_{i,j}
subject to
    ∑_{j:(i,j)∈E} x_{i,j} ≤ 1,   ∀ i ∈ V
    x_{i,j} ∈ {0, 1},            ∀ (i, j) ∈ E.

Note that in the constraint matrix there are two ones per column, and that the constraint matrix is not totally unimodular. Note also that the largest subdeterminant has value 2. By Cramer's rule we divide integers by 2, and hence we may obtain half-integer solutions. However, by imposing the following additional constraints on all subsets S of odd cardinality, we obtain an integral solution to the linear programming relaxation:

∑_{(i,j)∈E: i,j∈S} x_{i,j} ≤ ⌊|S|/2⌋,   ∀ S ⊆ V, |S| odd   (10)

These inequalities enforce the condition that when |S| is odd, no matching can have more than ⌊|S|/2⌋ edges internal to S.

There are a number of reasons for classifying special classes of MCNF as we have done in Figure 7. The more specialized the problem, the more efficient are the algorithms for solving it. So it is important to identify the problem category when selecting a method for solving the problem of interest. In a similar vein, by considering problems with special structure, one may be able to design efficient special-purpose algorithms that take advantage of that structure.

Note that the sets actually overlap, as Bipartite Matching is a special case of both the Assignment Problem and Max-Flow. (Verify this!)


[Figure 7 is a diagram of MCNF and its special cases, with regions labeled MCNF, Transshipment (no capacities), Transportation, Assignment, Max-Flow (no costs), Shortest Paths, and Bipartite Matching.]

Figure 7: Classification of MCNF problems

2.2 Other Network Problems

We first mention the classical Königsberg bridge problem. This problem was known at the time of Euler, who solved it in 1736 by posing it as a graph problem. The graph problem is: given an undirected graph, is there a path that traverses all edges exactly once and returns to the starting point? Any such graph is called an Eulerian graph. You will have an opportunity to show in your homework assignment that a graph is Eulerian if and only if all node degrees are even. A path that traverses each edge at most once will be referred to, in honor of Euler, as an Eulerian walk.

The Chinese postman problem is closely related to the Eulerian walk problem.

Chinese Postman Problem
Optimization Version: Given a graph G = (V, E) with weighted edges, traverse all edges at least once, so that we minimize the total distance traversed.

Decision Version: Given an edge-weighted graph G = (V,E), is there a tour traversing eachedge at least once of total weight ≤M?

For edge weights = 1, the decision problem with M = |E| = m is precisely the Eulerian Tour


problem. It will provide an answer to the question: Is this graph Eulerian?

The problem comes in two more variants. One variant is the directed Chinese postman problem, which poses the same question on a directed graph. Another variant is the mixed Chinese postman problem, where some of the edges in the graph are undirected and some have directions associated with them. It is interesting to note that the first two problems – the pure directed and the pure undirected – are solvable in polynomial time, whereas the mixed problem is NP-complete.

Hamiltonian Tour Problem Given a graph G = (V, E), is there a tour visiting each node exactly once?

This is an NP-complete decision problem, and no polynomial algorithm is known to determine whether a graph is Hamiltonian.

Traveling Salesman Problem (TSP)
Optimization Version: Given a graph G = (V, E) with weights on the edges, visit each node exactly once along a tour, such that we minimize the total traversed weight.

Decision Version: Given a graph G = (V, E) with weights on the edges, and a number M, is there a tour traversing each node exactly once of total weight ≤ M?

Matching Problems A matching is defined as a subset M of the edges E such that each node is incident to at most one edge in M. Given a graph G = (V, E), find a matching of largest cardinality. Maximum cardinality matching is also referred to as the Edge Packing Problem. (We know that |M| ≤ ⌊|V|/2⌋.)

Given a bipartite graph, is there a perfect matching (a matching such that each vertex is incident to exactly one edge in the matching)? If not, find a maximum cardinality matching. This problem can be solved using Max-Flow.

Maximum Weight Matching: We can also assign weights to the edges and look for maximum total weight instead of maximum cardinality. (Depending on the weights, a maximum weight matching may not be of maximum cardinality.)

Smallest Cost Perfect Matching: Given a graph, find a perfect matching such that the sum of the weights on the edges in the matching is minimized. On bipartite graphs, finding the smallest cost perfect matching is the assignment problem.

Vertex Packing Problems (Independent Set) Given a graph G = (V, E), find a subset S of nodes such that no two nodes in the subset are neighbors, and such that S is of maximum cardinality. (We define neighbor nodes as two nodes with an edge between them.)

We can also assign weights to the vertices and consider the weighted version of the problem.

Clique A "clique" is a collection of nodes S such that each node in the set is adjacent to every other node in the set. The problem is to find a clique of maximum cardinality.

Recall that, given G, its complement Ḡ is the graph composed of the same nodes and exactly those edges not in G. Every possible edge is in either G or Ḡ. The maximum independent set problem on a graph G, then, is equivalent to the maximum clique problem on Ḡ.

Vertex Cover Problem Find the smallest collection of vertices S in an undirected graph G such that every edge in the graph is incident to some vertex in S. (Every edge in G has an endpoint in S.)


Notation: If V is the set of all vertices in G, and S ⊂ V is a subset of the vertices, then V \ S is the set of vertices in G not in S.

Lemma 2.1. If S is an independent set, then V \ S is a vertex cover.

Proof. Suppose V − S is not a vertex cover. Then there exists an edge whose two endpoints are not in V − S. Then both endpoints must be in S. But then S cannot be an independent set! (Recall the definition of an independent set: no edge has both endpoints in the set.) Therefore, we have proved the lemma by contradiction.

The proof will work the other way, too:

Result: S is an independent set if and only if V − S is a vertex cover.

This leads to the result that the sizes of a vertex cover and its complementary independent set sum to a fixed number (namely |V|).

Corollary 2.2. The complement of a feasible independent set is a feasible vertex cover, and vice versa.

A 2-Approximation Algorithm for the Vertex Cover Problem: We can use these results to construct an algorithm that finds a vertex cover at most twice the size of an optimal vertex cover.

1) Find M, a maximum cardinality matching in the graph.
2) Let S = {u | u is an endpoint of some edge in M}.

Claim 2.1. S is a feasible vertex cover.

Proof. Suppose S is not a vertex cover. Then there is an edge e which is not covered (neither of its endpoints is an endpoint of an edge in M). So if we add the edge to M (forming M ∪ {e}), we get a feasible matching of cardinality larger than |M|. But this contradicts the maximum cardinality of M.

Claim 2.2. |S| is at most twice the cardinality of the minimum vertex cover.

Proof. Let OPT denote an optimal (minimum) vertex cover, and suppose |M| > |OPT|. Every vertex in the vertex cover can cover at most one edge of the matching (if it covered two edges of the matching, the matching would not be valid, because a matching does not allow two edges adjacent to the same node), so some edge e ∈ M would have no endpoint in the vertex cover. This is a contradiction. Therefore |M| ≤ |OPT|. Because |S| = 2|M|, the claim follows.
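The same guarantee holds if M is merely a maximal matching (one that cannot be extended), since the two claims above use only maximality and the covering argument. A minimal greedy sketch of this variant, on a hypothetical edge list:

```python
def vertex_cover_2approx(edges):
    """Endpoints of a greedily built maximal matching form a vertex cover
    of size 2|M| <= 2|OPT|."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            # Edge (u, v) joins the matching; take both endpoints.
            cover.update((u, v))
    return cover

# Hypothetical instance: a path 1-2-3-4, whose optimal cover is {2, 3}.
edges = [(1, 2), (2, 3), (3, 4)]
cover = vertex_cover_2approx(edges)
assert all(u in cover or v in cover for (u, v) in edges)
print(sorted(cover))  # a cover of size at most 4 = 2 * |OPT|
```

On this example the algorithm returns a cover of size 4 while the optimum is 2, showing the factor of 2 can be tight.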

Edge Cover Problem This is analogous to the Vertex Cover problem. Given G = (V, E), find a subset EC ⊆ E such that every node in V is an endpoint of some edge in EC, and EC is of minimum cardinality.

As is often the case, this edge problem is of polynomial complexity, even though its vertex analog is NP-hard.

b-factor (b-matching) Problem This is a generalization of the Matching Problem: given the graph G = (V, E), find a subset of edges such that each node is incident to at most b edges. The Edge Packing (Matching) Problem thus corresponds to the 1-factor Problem.


Graph Colorability Problem Given an undirected graph G = (V, E), assign a color c(i) to each node i ∈ V such that ∀ (i, j) ∈ E: c(i) ≠ c(j).

Example – Map coloring: Given a geographical map, assign colors to countries such that two adjacent countries don't have the same color. A graph corresponding to the map is constructed as follows: each country corresponds to a node, and there is an edge connecting the nodes of two neighboring countries. Such a graph is always planar (see definition below).

Definition – Chromatic Number: The smallest number of colors necessary for a coloring is called the chromatic number of the graph.

Definition – Planar Graph: Given a graph G = (V, E), if there exists a drawing of the graph in the plane such that no edges cross except at vertices, then the graph is said to be planar.

Theorem 2.2 (4-color Theorem). It is always possible to color a planar graph using 4 colors.

A computer-assisted proof exists, based on a polynomial-time algorithm (Appel and Haken, 1977); an "elegant" proof is still unavailable. Observe that each set of nodes of the same color is an independent set. (It follows that the cardinality of the maximum clique of the graph provides a lower bound on the chromatic number.) A plausible graph coloring algorithm consists of finding a maximum independent set and assigning the same color to each node in it. A second independent set can then be found on the subgraph induced by the remaining nodes, and so on. The algorithm, however, is both inefficient (finding a maximum independent set is NP-hard) and not optimal. It is not optimal for the same reason (and the same type of example) that the b-matching problem is not solvable optimally by finding a sequence of maximum cardinality matchings.

Note that a 2-colorable graph is bipartite. It is interesting to note that the opposite is also true, i.e., every bipartite graph is 2-colorable.¹ We can show this by building a 2-coloring of a bipartite graph through the following breadth-first-search (see 5.1) based algorithm:

Initialization: Set i = 1, L = {i}, c(i) = 1.

General step: Repeat until L is empty:

• For each previously uncolored j ∈ N(i) (a neighbor of i), assign color c(j) = c(i) + 1 (mod 2) and add j at the end of the list L. If j is previously colored and c(j) = c(i) + 1 (mod 2), proceed; else stop and declare that the graph is not bipartite.

• Remove i from L.

• Set i = the first node in L.

Termination: The graph is bipartite.

Note that if the algorithm finds any node with a neighbor already labeled with the same color, then the graph is not bipartite; otherwise it is. While the above algorithm solves the 2-color problem in O(|E|), the d-color problem for d ≥ 3 is NP-hard even for planar graphs (see Garey and Johnson, page 87).
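The list-based procedure above can be sketched directly with a queue; the loop over starting nodes handles disconnected graphs, a detail the statement above leaves implicit:

```python
from collections import deque

def two_color(adj):
    """Return a 2-coloring {node: 0 or 1} if the graph is bipartite, else None.
    adj maps each node to its neighbor list."""
    color = {}
    for start in adj:          # restart on each connected component
        if start in color:
            continue
        color[start] = 0
        L = deque([start])
        while L:
            i = L.popleft()
            for j in adj[i]:
                if j not in color:            # previously uncolored neighbor
                    color[j] = (color[i] + 1) % 2
                    L.append(j)
                elif color[j] == color[i]:    # same color across an edge
                    return None               # graph is not bipartite
    return color

even_cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}   # bipartite
odd_cycle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}               # not bipartite
print(two_color(even_cycle) is not None, two_color(odd_cycle) is None)
```

Each node and edge is examined a constant number of times, matching the O(|E|) bound stated above.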

Minimum Spanning Tree

1Indeed, the concept of k-colorability and the property of being k-partite are equivalent.


Definition: An undirected graph G = (V, T) is a tree if the following three properties are satisfied:

Property 1: |T| = |V| − 1.
Property 2: G is connected.
Property 3: G is acyclic.

(In your homework assignment you will show that any two of these properties imply the third.)

Definition: Given an undirected graph G = (V, E) and a subset of the edges T ⊆ E such that (V, T) is a tree, then (V, T) is called a spanning tree of G.

Minimum Spanning Tree Problem: Given an undirected graph G = (V, E) and costs c : E → ℝ, find T ⊆ E such that (V, T) is a tree and ∑_{(i,j)∈T} c(i, j) is minimum. Such a graph (V, T) is a Minimum Spanning Tree (MST) of G. The problem of finding an MST for a graph is solvable by a "greedy" algorithm in O(|E| log |E|). A lower bound on the worst-case complexity is easily shown to be |E|, because every arc must be compared at least once. The open question was to verify whether there exists a linear-time algorithm to solve the problem. Recently Pettie and Ramachandran² gave an optimal algorithm, in the sense that its running time is the fastest possible. It is not known whether this running time is linear, but it is provably only a little more than linear.
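One such greedy O(|E| log |E|) method is Kruskal's algorithm: sort the edges by cost and add each edge that does not create a cycle, tracked with a union-find structure. A minimal sketch, on a hypothetical 4-node instance:

```python
def kruskal(n, edges):
    """edges: list of (cost, u, v) with nodes 0..n-1.
    Returns (total cost, list of tree edges)."""
    parent = list(range(n))

    def find(x):
        # Find the component root, with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree, total = [], 0
    for c, u, v in sorted(edges):      # the O(|E| log |E|) sort dominates
        ru, rv = find(u), find(v)
        if ru != rv:                   # edge joins two components: no cycle
            parent[ru] = rv
            tree.append((u, v))
            total += c
    return total, tree

edges = [(4, 0, 1), (1, 1, 2), (3, 0, 2), (2, 2, 3)]
total, tree = kruskal(4, edges)
print(total)  # 6 (edges of cost 1, 2, and 3)
```

The greedy exchange argument shows the selected |V| − 1 edges form an MST; the optimal Pettie-Ramachandran algorithm mentioned above improves on this only in the comparison count, not in this simple structure.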

3 Totally Unimodular Matrices

An m × n matrix is said to be unimodular if for every full-rank m × m submatrix the value of its determinant is 1, 0, or −1. A matrix A is said to be totally unimodular if every square submatrix has a determinant whose value is 0, 1, or −1.

Consider the general linear programming problem with integer coefficients in the constraints, right-hand sides, and objective function:

(LP)    Max cx
        subject to Ax ≤ b

Recall that a basic solution of this problem represents an extreme point of the polytope defined by the constraints and will not be a convex combination of any two other feasible solutions. Also, if there is an optimal solution for LP, then there is an optimal basic solution. If in addition A is totally unimodular, we have the following result.

Lemma 3.1. If the constraint matrix A in LP is totally unimodular, then there exists an integer optimal solution. In fact, every basic or extreme point solution is integral.

Proof. If A is m × n, m < n, a basic solution is defined by a nonsingular m × m square submatrix of A, B_{m×m}, called the basis, and a corresponding subset of the decision variables, x_B, called the basic variables, such that

B_{m×m} x_B = b   (11)

x_B = B_{m×m}^{−1} b   (12)

² Pettie and Ramachandran, "An optimal minimum spanning tree algorithm," JACM 49 (2002), 16-34.


A cofactor of element (i, j) of a square matrix with columns C and rows R is the determinant of the submatrix formed by the rows R \ {i} and the columns C \ {j}, that is, by deleting row i and column j. This determinant is then multiplied by an appropriate sign coefficient (1 or −1) depending on whether i + j is even or odd. The determinant is a sum of terms, each of which is a product of entries of the submatrix, and is thus an integer. By Cramer's rule we know that B^{−1} det(B) = [cof(B)]^T, where cof(B)_{i,j} is the cofactor of element (i, j) of B. Since A is totally unimodular and B is a basis, |det(B)| = 1. Therefore all of the elements of B^{−1} are integers. That means x_B = B_{m×m}^{−1} b is an integer combination of integers, and is therefore integral itself.

Although we have required all coefficients to be integers, the lemma also holds if only the right-hand side vector b is integer.

This result has ramifications for combinatorial optimization problems. Integer programming problems with totally unimodular constraint matrices can be solved simply by solving the linear programming relaxation. This can be done in polynomial time via the Ellipsoid method, or more practically, albeit not in polynomial time, by the Simplex Method. The resulting basic solution is guaranteed to be integral.

The set of totally unimodular integer problems includes network flow problems. The rows (or columns) of a network flow constraint matrix contain exactly one 1 and exactly one −1, with all the remaining entries 0. A square matrix A is said to have the network property if every column of A has at most one entry of value 1, at most one entry of value −1, and all other entries 0.

Lemma 3.2. Let A be a matrix with the network property. Then det(A) ∈ {0, 1, −1}.

Proof. We proceed by induction on n, where A is an n × n matrix.
Basis: The result holds trivially for n = 1, since det(A) = A_{1,1}.
Induction step: We consider 3 cases.
Case 1: There is a column with all 0 entries. Clearly, det(A) = 0.
Case 2: There is a column with exactly one non-zero entry. Calculate the determinant of A by cofactor expansion along that column; say the non-zero element has value a ∈ {1, −1}. Then det(A) = ±a · det(A′), where A′ is an (n − 1) × (n − 1) matrix with the network property. It follows from the induction hypothesis that det(A′) ∈ {0, −1, 1}; hence det(A) ∈ {0, −1, 1}.
Case 3: Each column has exactly two non-zero entries (one 1 and one −1). Adding up the rows of A, we get the null vector, which implies that the rows of A are linearly dependent; so det(A) = 0.
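The lemma is easy to spot-check numerically: generate random matrices with the network property (at most one +1 and at most one −1 per column) and verify the exact determinant lands in {−1, 0, 1}. The generator below is an illustrative sketch, not part of the notes:

```python
import random

def det(M):
    """Exact integer determinant by cofactor expansion along the first row."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        if M[0][j]:
            minor = [row[:j] + row[j + 1:] for row in M[1:]]
            total += (-1) ** j * M[0][j] * det(minor)
    return total

def random_network_matrix(n, seed):
    """Each column gets at most one +1 and at most one -1, all else 0."""
    rng = random.Random(seed)
    M = [[0] * n for _ in range(n)]
    for j in range(n):
        i, k = rng.sample(range(n), 2)
        if rng.random() < 0.8:
            M[i][j] = 1
        if rng.random() < 0.8:
            M[k][j] = -1
    return M

assert all(det(random_network_matrix(6, s)) in (0, 1, -1) for s in range(200))
print("det in {-1, 0, 1} on all sampled network matrices")
```

Allowing columns with only a +1 or only a −1 exercises Case 2 of the proof as well as Case 3; if every column had both entries, the determinant would always be 0.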

The above lemma can be used to establish the following results (matrices of the first type come up in network applications; matrices of the second type arise in many optimization problems on bipartite graphs):

Corollary 3.1. Let A be the constraint matrix of a network flow problem. Then A is totally unimodular.

Proof. The constraint matrix of a network flow problem has the property that each column has exactly one 1, exactly one −1, and all other entries 0. It follows that any square submatrix of A has the network property, and the result follows from Lemma 3.2.

Corollary 3.2. Let A be a matrix with 0-1 entries, such that each column of A has at most two non-zero entries. Suppose that the rows of A can be partitioned into two sets, such that each column of A has at most one 1 in either set. Then A is totally unimodular.

Proof. Partition the rows into two sets, as described, and multiply the rows of one of the sets by −1; let the resulting matrix be A′. Now A′ is the node-arc incidence matrix of a network and,


according to the previous lemma, is totally unimodular. Since every square submatrix of A′ and the corresponding submatrix of A have determinants of equal magnitude (though perhaps differing in sign), it follows that A is totally unimodular.

Note that we can switch the roles of "rows" and "columns" in the above corollaries and get similar results.

4 Complexity Analysis

4.1 Measuring Quality of an Algorithm

Algorithm: One approach is to enumerate the solutions and select the best one. Recall that for the assignment problem with 70 people and 70 tasks there are 70! ≈ 2^{332.4} solutions. The existence of an algorithm does not imply the existence of a good algorithm!

To measure the complexity of a particular algorithm, we count the number of operations that are performed as a function of the input size. The idea is to consider each elementary operation (usually defined as one of the simple arithmetic operations +, −, ×, /, <) as having unit cost, and to measure the number of operations (in terms of the size of the input) required to solve a problem. The goal is to measure the rate, ignoring constants, at which the running time grows as the size of the input grows; it is an asymptotic analysis.

Traditionally, complexity analysis has been concerned with counting the number of operationsthat must be performed in the worst case.

Definition 4.1 (Concrete Complexity of a problem). The complexity of a problem is the complexity of the algorithm that has the lowest complexity among all algorithms that solve the problem.

4.1.1 Examples

Set Membership - Sorted list: We can determine if a particular item is in a list of n elements via binary search. The number of comparisons needed to find a member in a sorted list of length n is proportional to log₂ n. (Note that one may be lucky and only need to perform 1 operation for a particular instance.)

Problem: given a real number x, we want to know whether x ∈ S.

Algorithm:

1. Select s_med, where med = ⌊(first + last)/2⌋, and compare it to x.

2. If s_med = x, stop.

3. If s_med < x then S = (s_{med+1}, . . . , s_last) else S = (s_first, . . . , s_{med−1}).

4. If first < last go to 1, else stop.

Complexity: after the k-th iteration, n/2^{k−1} elements remain. We are done searching for k such that n/2^{k−1} ≤ 2, that is,

log₂ n ≤ k.   (13)

Thus the total number of comparisons is at most log₂ n + 2.
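The procedure above, as a minimal sketch (the sample list is hypothetical):

```python
def binary_search(S, x):
    """Return True iff x is in the sorted list S, using about log2(n) comparisons."""
    first, last = 0, len(S) - 1
    while first <= last:
        med = (first + last) // 2   # s_med = S[med]
        if S[med] == x:
            return True
        if S[med] < x:
            first = med + 1         # keep (s_{med+1}, ..., s_last)
        else:
            last = med - 1          # keep (s_first, ..., s_{med-1})
    return False

S = [2, 3, 5, 7, 11, 13]
print(binary_search(S, 7), binary_search(S, 8))  # True False
```

Each iteration halves the remaining range, so the loop runs at most ⌈log₂ n⌉ + 1 times.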


Set Membership - Unsorted list: We can determine if a particular item is in a list of n items by looking at each member of the list one by one. Thus the number of comparisons needed to find a member in an unsorted list of length n is n.

Problem: given a real number x, we want to know whether x ∈ S.

Algorithm:

1. Compare x to s_i.

2. Stop if x = s_i.

3. Else, if i ← i + 1 < n, go to 1; else stop: x is not in S.

Complexity: n comparisons in the worst case. This is also the concrete complexity of this problem. Why?

Matrix Multiplication: The straightforward method for multiplying two n × n matrices takes n³ multiplications and n²(n − 1) additions. Algorithms with better complexity (though not necessarily practical, see comments later in these notes) are known. Coppersmith and Winograd came up with an algorithm of complexity Cn^{2.375477}, where C is large.
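The straightforward method is the familiar triple loop; the sketch below also counts scalar multiplications to confirm the n³ figure on a small hypothetical input:

```python
def matmul(A, B):
    """Naive n x n product; returns (C, number of scalar multiplications)."""
    n = len(A)
    mults = 0
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]   # one multiply, one add
                mults += 1
    return C, mults

A = [[1, 2], [3, 4]]
I = [[1, 0], [0, 1]]
C, mults = matmul(A, I)
print(C, mults)  # [[1, 2], [3, 4]] 8  (n^3 = 2^3 multiplications)
```

The innermost statement executes exactly n³ times, which is why the naive method is Θ(n³).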

Forest Harvesting: In this problem we have a forest divided into a number of cells arranged in an m by n grid. For each cell we have the following information: H_i - the benefit for the timber company to harvest, U_i - the benefit for the timber company not to harvest, and B_{ij} - the border effect, which is the benefit received for harvesting exactly one of cells i or j. The brute-force way to solve the problem is to look at every possible combination of harvesting and not harvesting and pick the best one. This algorithm requires O(2^{mn}) operations.

An algorithm is said to be polynomial if its running time is bounded by a polynomial in the size of the input. All but the forest harvesting algorithm mentioned above are polynomial algorithms.

An algorithm is said to be strongly polynomial if the running time is bounded by a polynomial in the size of the input and is independent of the numbers involved; for example, a max-flow algorithm whose running time depends upon the size of the arc capacities is not strongly polynomial, even though it may be polynomial (as in the scaling algorithm of Edmonds and Karp). The algorithms for the sorted and unsorted set membership have strongly polynomial running times. So does the greedy algorithm for solving the minimum spanning tree problem. This issue will be returned to later in the course.


4.2 Growth of Functions

n     2⌈log n⌉    n − 1    2^n      n!
1     0           0        2        1
2     2           1        4        2
3     2           2        8        6
4     2           3        16       24
5     2           4        32       120
10    2           9        1024     3628800
70    4           69       2^70     ≈ 2^332

We are interested in the asymptotic behavior of the running time.

4.3 Definitions for asymptotic comparisons of functions

We define, for functions f, g : Z⁺ → [0, ∞):

1. f(n) ∈ O(g(n)) if ∃ a constant c > 0 such that f(n) ≤ cg(n), for all n sufficiently large.

2. f(n) ∈ o(g(n)) if, for any constant c > 0, f(n) < cg(n), for all n sufficiently large.

3. f(n) ∈ Ω(g(n)) if ∃ a constant c > 0 such that f(n) ≥ cg(n), for all n sufficiently large.

4. f(n) ∈ ω(g(n)) if, for any constant c > 0, f(n) > cg(n), for all n sufficiently large.

5. f(n) ∈ Θ(g(n)) if f(n) ∈ O(g(n)) and f(n) ∈ Ω(g(n)).

For example, matrix multiplication is O(n³).

4.4 Properties

We mention a few properties that can be useful when analyzing the complexity of algorithms.

Proposition 4.1. f(n) ∈ Ω(g(n)) if and only if g(n) ∈ O(f(n)).

The next property is often used in conjunction with L'Hôpital's rule.

Proposition 4.2. Suppose that lim_{n→∞} f(n)/g(n) exists, and let

lim_{n→∞} f(n)/g(n) = c.

Then,


1. c <∞ implies that f(n) ∈ O(g(n)).

2. c > 0 implies that f(n) ∈ Ω(g(n)).

3. c = 0 implies that f(n) ∈ o(g(n)).

4. 0 < c <∞ implies that f(n) ∈ Θ(g(n)).

5. c =∞ implies that f(n) ∈ ω(g(n)).

4.5 Caveats

One should bear in mind a number of caveats concerning the use of complexity analysis.

1. The model presented is a poor one when dealing with very large numbers, as each operation is given unit cost, regardless of the size of the numbers involved. But multiplying two huge numbers, for instance, may require more effort than multiplying two small numbers.

2. Traditionally, complexity analysis has been a pessimistic measure, concerned with worst-case behavior. The simplex method for linear programming is known to be exponential (in the worst case), while the ellipsoid algorithm is polynomial; but for the ellipsoid method the average-case behavior and the worst-case behavior are essentially the same, whereas the average-case behavior of simplex is much better than its worst-case complexity, and in practice it is preferred to the ellipsoid method. The ellipsoid method is described in Section 4.6.

Similarly, Quicksort, which has O(n^2) worst-case complexity, is often chosen over other sorting algorithms with O(n log n) worst-case complexity. This is because Quicksort has O(n log n) average-case running time and, because the constants (that we ignore in O notation) are smaller for Quicksort than for many other sorting algorithms, it is often preferred to algorithms with "better" worst-case complexity and "equivalent" average-case complexity.

3. We are concerned with the asymptotic behavior of an algorithm. But, because we ignore constants (as mentioned in the Quicksort comments above), it may be that an algorithm with better complexity only begins to perform better for instances of inordinately large size.
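A small numerical sketch of this effect (the constant 100 below is made up purely for illustration): an algorithm doing 100·n·log2(n) operations loses to one doing n^2 operations until n is nearly a thousand.

```python
import math

def crossover(c_fast=100):
    """Smallest n at which c_fast * n * log2(n) drops below n * n.

    Illustrates how hidden constants delay the point where the
    asymptotically better algorithm actually wins.
    """
    n = 2
    while c_fast * n * math.log2(n) >= n * n:
        n += 1
    return n
```

For the constants hidden in bounds like the O(n^2.375477) matrix multiplication algorithm mentioned below, the corresponding crossover point is beyond any practical input size.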

Indeed, this is the case for the O(n^2.375477) algorithm for matrix multiplication, which is ". . . wildly impractical for any conceivable applications." 3

4. An algorithm that is polynomial is considered to be "good". So an algorithm with O(n^100) complexity is considered good even though, for reasons already alluded to, it may be completely impractical.

Still, complexity analysis is in general a very useful tool in both determining the intrinsic "hardness" of a problem and measuring the quality of a particular algorithm.

3 See Coppersmith D. and Winograd S., Matrix Multiplication via Arithmetic Progressions. Journal of Symbolic Computation, 1990 Mar, V9 N3:251-280.


4.6 A Sketch of the Ellipsoid Method

We now outline the two key features of the ellipsoid method. A linear programming problem may be formulated in one of two ways - a primal and a dual representation. The first key feature combines both formulations in a way that forces all feasible solutions to be optimal solutions.

Primal Problem:
max{ c^T x : Ax ≤ b, x ≥ 0 } (14)

Dual Problem:
min{ b^T y : y^T A ≥ c, y ≥ 0 } (15)

The optimal solution satisfies:
c^T x = b^T y (16)

We create a new problem combining these two formulations:

Ax ≤ b

x ≥ 0

y^T A ≥ c

y ≥ 0

b^T y ≤ c^T x

Note that by the weak duality theorem b^T y ≥ c^T x, so together with the last inequality above this gives the desired equality. Every feasible solution to this problem is an optimal solution. The ellipsoid algorithm works on this feasibility problem as follows:

1. If there is a feasible solution to the problem, then there is a basic feasible solution. Any basic solution is contained in a ball of some radius R, where R is a function of the logarithm of L, the largest subdeterminant of the augmented constraint matrix.

2. Check if the ball's center is in the region of feasibility. If yes, we are done. If no, find a constraint that has been violated.

3. This violated constraint specifies a hyperplane through the center of the ball, and a halfspace which is infeasible.

4. Reduce the search space to the smallest ellipsoid that contains the half ball that is the intersection of the original ball and the halfspace that contains the feasible region.

5. Repeat the algorithm at the center of the new ellipsoid.

Obviously this algorithm is not complete. How do we know when to stop if the feasible set is empty? The second key feature addresses this. The volume of the ellipsoids scales down with each iteration by a factor of about e^{-1/(2(n+1))}. Thus, each subsequent ellipsoid is smaller in volume than its predecessor. The second feature is that there exists a fine grid with mesh length roughly L^{-1} for which, if the ellipsoid's volume is smaller than the mesh length, the set of feasible solutions is empty, because a basic feasible solution lies in a grid of a certain mesh length.

The run time of the ellipsoid algorithm is a function of log L, hence polynomial. However, the ellipsoid method is not practical.
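The iteration above can be sketched in code. The following is a toy central-cut version in two variables, using the standard ellipsoid update formulas; the box constraints in the usage example are made up, and the volume-based stopping rule (the second key feature) is replaced by a plain iteration cap:

```python
def ellipsoid_feasible(constraints, center, radius, max_iter=500):
    """Central-cut ellipsoid iteration in 2 variables.

    constraints -- list of (a, b) pairs encoding a[0]*x + a[1]*y <= b
    The ellipsoid is {x : (x - c)^T P^{-1} (x - c) <= 1}; we start from
    the ball of the given radius (P = radius^2 * I) and return a
    feasible center, or None if none is found within max_iter cuts.
    """
    n = 2
    c = list(center)
    P = [[radius * radius, 0.0], [0.0, radius * radius]]
    for _ in range(max_iter):
        violated = next(((a, b) for a, b in constraints
                         if a[0] * c[0] + a[1] * c[1] > b), None)
        if violated is None:
            return c                      # step 2: the center is feasible
        a, _b = violated                  # step 3: cut along this constraint
        Pa = [P[0][0] * a[0] + P[0][1] * a[1],
              P[1][0] * a[0] + P[1][1] * a[1]]
        s = (a[0] * Pa[0] + a[1] * Pa[1]) ** 0.5
        g = [Pa[0] / s, Pa[1] / s]        # P a / sqrt(a^T P a)
        # steps 4-5: move the center and shrink to the smallest ellipsoid
        # containing the feasible half of the current one
        c = [c[0] - g[0] / (n + 1), c[1] - g[1] / (n + 1)]
        coef = n * n / (n * n - 1.0)
        for i in range(2):
            for j in range(2):
                P[i][j] = coef * (P[i][j] - 2.0 / (n + 1) * g[i] * g[j])
    return None
```

On the box {1 ≤ x ≤ 2, 1 ≤ y ≤ 2}, starting from the ball of radius 10 around the origin, the center becomes feasible after a few dozen cuts, consistent with the e^{-1/(2(n+1))} volume shrink factor discussed above.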


5 NP completeness

5.1 Search vs. Decision

Decision Problem - A problem to which there is a yes or no answer.

Example. SAT = Does there exist an assignment of variables which satisfies the boolean function φ, where φ is a conjunction of a set of clauses, and each clause is a disjunction of some of the variables and/or their negations?

Optimization Problem - A problem to which the answer is an optimal feasible solution.

An optimization problem can be solved by calling a decision problem in the form "Is there a solution with value less than or equal to k?" (for minimization) a polynomial number of times using binary search.

To illustrate, consider the Traveling Salesperson Problem.
TSP OPT = What is the minimum weight tour in G = (V,A)?
TSP DEC = Is there a tour in G = (V,A) the total weight of which is ≤ m?
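The binary-search scheme can be written generically. Here decision(k) answers "is there a solution with value ≤ k?"; the lambda used in the test is a stand-in oracle with a hidden optimum (for TSP DEC the oracle would itself be NP-complete, so this only illustrates the polynomial number of calls, not a fast algorithm):

```python
def minimize_via_decision(decision, lo, hi):
    """Find the smallest value v in [lo, hi] with decision(v) True,
    using O(log(hi - lo)) calls to the decision oracle."""
    while lo < hi:
        mid = (lo + hi) // 2
        if decision(mid):
            hi = mid          # a solution of value <= mid exists
        else:
            lo = mid + 1      # every solution costs more than mid
    return lo
```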

An important problem for which the distinction between the decision problem and giving a solution to the decision problem - the search problem - is significant is primality. It was an open question (until Aug 2002) whether or not there exist polynomial-time algorithms for testing whether or not an integer is a prime number. 4 However, for the corresponding search problem, finding all factors of an integer, no similarly efficient algorithm is known.

Evaluation Problem - A problem to which the answer is the cost of the optimal solution.

Another example: A graph is called k-connected if one has to remove at least k vertices in order to make it disconnected. A theorem says that there exists a partition of the graph into k connected components of sizes n_1, n_2, . . . , n_k, with ∑_{i=1}^{k} n_i = n (the number of vertices), such that each component contains one of the vertices v_1, v_2, . . . , v_k. The optimization problem is finding a partition such that ∑_i |n_i − n̄| is minimal, for n̄ = n/k. No polynomial-time algorithm is known for k ≥ 4. However, the corresponding recognition problem is trivial, as the answer is always Yes once the graph has been verified (in polynomial time) to be k-connected. So is the evaluation problem: the answer is always the sum of the averages rounded up with a correction term, or the sum of the averages rounded down, with a correction term for the residue.

For the rest of the discussion, we shall refer to the decision problem, unless stated otherwise.

5.2 NP

A very important class of decision problems is called NP, which stands for nondeterministic polynomial-time. It is an abstract class, not specifically tied to optimization.

Definition 5.1. A decision problem is said to be in NP if for all "Yes" instances of it there exists a polynomial-length "certificate" that can be used to verify in polynomial time that the answer is indeed Yes.

4 There are reasonably fast superpolynomial-time algorithms, and also Miller-Rabin's algorithm, a randomized algorithm which runs in polynomial time and tells either "I don't know" or "Not prime" with a certain probability. In August 2002 it was announced: "Prof. Manindra Agarwal and two of his students, Nitin Saxena and Neeraj Kayal (both BTech from CSE/IITK who have just joined as Ph.D. students), have discovered a polynomial time deterministic algorithm to test if an input number is prime or not. Lots of people over (literally!) centuries have been looking for a polynomial time test for primality, and this result is a major breakthrough, likened by some to the P-time solution to Linear Programming announced in the 70s."


[Figure omitted: a Prover supplies a "Yes" answer together with a poly-length certificate; a Verifier checks the certificate in poly-time.]

Figure 8: Recognizing Problems in NP

Imagine that you are a verifier and that you want to be able to confirm that the answer to a given decision problem is indeed "yes". Problems in NP have poly-length certificates that, without necessarily indicating how the answer was obtained, allow for this verification in polynomial time (see Figure 8).

5.2.1 Some Problems in NP

Hamiltonian Cycle = Is there a Hamiltonian circuit in the graph G = (V,A)?
Certificate - the tour.

Clique = Is there a k-clique in the graph G = (V,A)?
Certificate - the clique.

Composite = Is N composite?
Certificate - the factors. (Note - log N is the length of the input; the certificate has length ≤ 2 log N and can be checked in poly-time by multiplying.)

Prime = Is N prime?
Certificate - not so trivial. In fact, it was only in 1976 that Pratt showed that one exists.

TSP = Is there a tour in G = (V,A) the total weight of which is ≤ m?
Certificate - the tour.

Dominating Set = Is there a dominating set S ⊆ V in a graph G = (V,A) such that |S| ≤ k?
Certificate - the set. (A vertex is said to dominate itself and its neighbors; a dominating set dominates all vertices.)

LP = Is there a vector x, x ≥ 0 and Ax ≤ b, such that Cx ≤ k?
Certificate - the feasible vector x. (This vector may be arbitrarily large. However, every basic feasible solution x can be shown to be poly-length in terms of the input, and is clearly checkable in polynomial time.)

IP = Is there an integer vector x, x ≥ 0 and Ax ≤ b, such that Cx ≤ k?
Certificate - a feasible vector x. (Just like LP, it can be certified in polynomial time; it is also necessary to prove that the certificate is only polynomial in length. See AMO, pg. 795.)

k-center = Given a graph G = (V,E) weighted by w : V × V → R+, is there a subset S ⊆ V, |S| = k, such that ∀ v ∈ V \ S, ∃ s ∈ S such that w(v,s) ≤ m?
Certificate - the subset.
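As a concrete instance of Definition 5.1, a sketch of a verifier for Clique: the certificate is the claimed clique itself, and checking it takes O(|S|^2) edge lookups - polynomial, even though finding such a set may be hard. (The function and its parameter names are our illustration, not from the notes.)

```python
from itertools import combinations

def verify_clique(n, edges, k, certificate):
    """Check a 'yes' certificate for the Clique problem in poly-time.

    n            -- number of vertices, labeled 0..n-1
    edges        -- iterable of undirected edges (u, v)
    k            -- required clique size
    certificate  -- the claimed k-clique (a collection of vertices)
    """
    S = set(certificate)
    edge_set = {frozenset(e) for e in edges}
    if len(S) < k or not S <= set(range(n)):
        return False
    # every pair of vertices in S must be joined by an edge
    return all(frozenset(p) in edge_set for p in combinations(S, 2))
```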


5.3 Co-NP

Suppose the answer to the recognition problem is No. How would one certify this?

Definition 5.2. The class of problems for which the certificate exists for all "No" instances is called co-NP.

5.3.1 Some Problems in Co-NP

Prime = Is N prime?
"No" Certificate - the factors. (Note - log N is the length of the input; the certificate has length ≤ 2 log N and can be checked in poly-time by multiplying.)

LP = Is there a vector x, x ≥ 0 and Ax ≤ b, such that Cx ≤ k?
"No" Certificate - a feasible dual vector y, y ≥ 0, such that y^T A ≥ C (from the Weak Duality Theorem).

5.4 NP and Co-NP

As a verifier, it is clearly nice to have certificates to confirm both "yes" and "no" answers; that is, for the problem to be in both NP and Co-NP. From the preceding sections, we know that the problems Prime and LP are in both NP and Co-NP. However, for many problems in NP, such as TSP, there is no obvious polynomial "no" certificate.

Before we look at these two types of problems in detail, we define P, the class of polynomially solvable problems.

Definition 5.3. Let P be the class of polynomial-time (as a function of the length of the input)solvable problems.

It is easy to see that P ⊆ NP ∩ co-NP - just find an optimal solution. It is not known, however, whether P = NP ∩ co-NP; the problems that are in the intersection but for which no polynomial-time algorithm is known are usually very pathological. Primality ∈ NP ∩ co-NP; this gave rise to the conjecture that primality testing can be carried out in polynomial time, as indeed it was recently verified to be.

Consider those problems that are in NP ∩ Co-NP. When is a decision problem Q in this intersection? We consider some cases:

1. Polynomial (P): Consider a problem Q for which there exists an algorithm with polynomial running time; for example, MST ("Is there a spanning tree with cost less than or equal to k?") can be answered in polynomial time. Such problems are in NP ∩ Co-NP: the algorithm itself (along with a proof of the algorithm's correctness) is the certificate for both "yes" and "no" instances, and verification can be carried out in polynomial time by running the algorithm.

2. "Nice" Exact Neighborhoods: Consider an optimization problem that has an exact neighborhood (i.e., a local optimum is a global optimum) which can be searched in polynomial time (this is what we mean by "nice"). For example, MST has a "nice" exact neighborhood (those trees that can be generated by replacing a single edge of the current tree by a single non-tree edge). Assuming that the exact neighborhood can be searched in polynomial time, the decision problem corresponding to such problems is in both NP and Co-NP: the "yes" certificate is an optimal solution; the "no" certificate is also an optimal solution (along with a description of the neighborhood). So, even if we were not aware of a polynomial algorithm


for MST, the existence of a nice exact neighborhood is enough to tell us that MST is in both NP and Co-NP.

3. Min-Max Problems: Consider optimization problems, like LP, which satisfy so-called min-max relationships; in the case of LP (assume the LP problem is a minimization problem), the minimum of the primal is equal to the maximum of the dual. Such min-max problems, assuming that solutions to the min and max problems can be verified and evaluated in polynomial time, are in NP ∩ Co-NP: a "yes" certificate is an optimal solution to the min problem; a "no" certificate is a feasible solution to the corresponding max problem. For example, LP is in NP and Co-NP. Similarly, min cut ("Is there an s-t cut with capacity less than or equal to k?") is in NP (a certificate is such a cut) and in Co-NP (a certificate is a flow whose value exceeds k).

Notice that the fact that a problem belongs to NP ∩ Co-NP does not automatically suggest a polynomial time algorithm for the problem.

[Figure omitted: a Venn diagram in which P lies inside the intersection of NP and Co-NP, with PRIME and COMPOSITE placed in the intersection.]

Figure 9: NP, Co-NP, and P

Conjectures

1. NP ≠ Co-NP

2. NP ∩ Co-NP = P

However, being in NP ∩ Co-NP is usually considered strong evidence that there exists a polynomial algorithm. Indeed, many problems were known to be in NP ∩ Co-NP well before a polynomial algorithm was developed (e.g., LP, matching ("Is there a matching with cost greater than or equal to K?"), primality); this membership was considered strong evidence that there was a polynomial algorithm out there, giving incentive for researchers to put their efforts into finding one. Both prime and composite lie in NP ∩ Co-NP; this gave rise to the conjecture that both can be solved in polynomial time.

On the flip side, NP-Complete problems, which we turn to next, are considered to be immune to any guaranteed polynomial algorithm.

5.5 NP-completeness and reductions

A major open question in contemporary computer science is whether P = NP. It is currently believed that P ≠ NP.


SAT: Is there a satisfying assignment of variables to the boolean function in 3cnf (3-conjunctive normal form)?

PARTITION: Does there exist a subset S ⊆ {1, 2, . . . , n} of a multi-set of integers b_1, b_2, . . . , b_n such that ∑_{i∈S} b_i = ∑_{i∉S} b_i?

0-1 KNAPSACK: Does there exist an assignment to the 0-1 variables x_1, x_2, . . . , x_n such that ∑_{i=1}^{n} v_i x_i ≤ B and ∑_{i=1}^{n} u_i x_i ≥ K?

INDEPENDENT SET: Given a graph G = (V,E), is there a subset S ⊆ V, with |S| ≥ K, such that there is no edge e ∈ E that has both of its endpoints in S?

Figure 10: Definitions of some NP-Complete problems.

As long as this conjecture remains unproven, instead of proving that a certain NP problem is not in P, we have to be satisfied with a slightly weaker statement: if the problem is in P then P = NP; that is, the problem is "at least as hard" as any other NP problem. Cook has proven this for a problem called SAT; for other problems, this can be proven using reductions.

In the rest of this section, we formally define reducibility and the complexity class NP-Complete. Also, we provide some examples.

5.5.1 Reducibility

There are two definitions of reducibility, due to Karp and to Turing; they are known to describe the same class of problems. The following definition uses Turing (oracle) reducibility.

Definition 5.4. A problem P1 is said to reduce in polynomial time to problem P2 (written as "P1 ∝ P2") if there exists a polynomial-time algorithm A1 for P1 that makes calls to a subroutine solving P2, where each call to the subroutine solving P2 is counted as a single operation.

We then say that P2 is at least as hard as P1. If P2 ∈ P then P1 ∈ P, because the ring of polynomials is closed under substitution. Karp reductions, in contrast, allow for only one call to the subroutine.

Example: The k-center problem is at least as hard as the dominating set problem. The reduction proceeds as follows: Given a graph G = (V,E), define a weight function on K_{|V|} (the complete graph on |V| vertices) to be 0 if the two vertices share an edge in G, and 1 if they do not. Then, to find a dominating set in G it is sufficient to find a k-center in K_{|V|} with objective value 0 (for this weight function). The only nontrivial computation involved is building K_{|V|}, which is clearly polynomial-time.
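The reduction just described can be written out as follows. Everything here is our illustration: the brute-force search over all k-subsets is exponential and serves only to check the equivalence on tiny graphs, while the reduction itself - building the 0/1 weights - is the polynomial-time part.

```python
from itertools import combinations

def dominating_set_via_kcenter(nodes, edges, k):
    """Decide the dominating-set question on G via the k-center
    instance on the complete graph, as in the reduction above."""
    adjacent = {frozenset(e) for e in edges}

    # weight 0 iff the two vertices share an edge in G, else 1
    def w(u, v):
        return 0 if frozenset((u, v)) in adjacent else 1

    def kcenter_value(S):
        # max over vertices outside S of the distance to the closest center
        outside = [v for v in nodes if v not in S]
        return max((min(w(v, s) for s in S) for v in outside), default=0)

    # G has a dominating set of size k  iff  some k-center has value 0
    return any(kcenter_value(set(S)) == 0 for S in combinations(nodes, k))
```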

5.5.2 NP-Completeness

Definition and Implications: A decision problem Q is said to be NP-complete if:


1. Q ∈ NP.

2. All other problems in NP are polynomially reducible to Q.

It follows that if any NP-complete problem were to have a polynomial algorithm, they all would.

Conjecture: P ≠ NP.

So, we do not expect any polynomial algorithm to exist for an NP-complete problem. Therefore, if Q is NP-complete, we should not expect there to exist a nice exact neighborhood for the optimization problem associated with Q.

SAT is NP-Complete: Cook and Levin independently established that all problems in NP could be polynomially reduced to SAT.

SAT = Is there a satisfying assignment of variables to the boolean function in 3cnf (3-conjunctive normal form)? For example: (X1 ∨ X3 ∨ X7) ∧ (X15 ∨ X1 ∨ X3) ∧ · · ·

Theorem 5.1 (Cook-Levin (1970)). Let X be an instance of a problem Q, where Q ∈ NP. Then, there exists a SAT instance F(X,Q), whose size is polynomially bounded in the size of X, such that F(X,Q) is satisfiable if and only if X is a "yes" instance of Q.

Proof. In very vague terms: Consider the process of verifying the correctness of a "yes" instance, and consider a Turing machine that formalizes this verification process. Now, one can construct a SAT instance (polynomial in the size of the input) mimicking the actions of the Turing machine, such that the SAT expression is satisfiable if and only if verification is successful.

Corollary 5.1. SAT is NP-complete.

Proof. Clearly SAT is in NP. Completeness follows from the above theorem.

Also, when a decision problem is NP-Complete, we call the optimization problem NP-Hard.

Reductions: Karp 5 pointed out the practical importance of Cook's theorem, via the concept of reducibility. To show that a problem Q is NP-complete, one need only demonstrate two things:

1. Q ∈ NP.

2. Show that Q1 ∝ Q, for some NP-complete problem Q1.

Because all problems in NP are polynomially reducible to Q1, if Q1 ∝ Q, it follows that Q is at least as hard as Q1 and that all problems in NP are polynomially reducible to Q. Starting with SAT, Karp produced a series of problems that he showed to be NP-complete. 6

An example of proving NP-Completeness using reductions (see Figure 10 for the definitions of some NP-Complete problems):

5 Reference: R. M. Karp. Reducibility Among Combinatorial Problems, pages 85-103. In: R. E. Miller and J. W. Thatcher (eds.), Complexity of Computer Computations. Plenum Press, New York, 1972.

6 For a good discussion on the theory of NP-completeness, as well as an extensive summary of many known NP-complete problems, see: M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, San Francisco, 1979.


1. 0-1 Knapsack is NP-complete.

Proof. First of all, we know that it is in NP, as a list of items acts as the certificate.

Now, we show that Partition ∝ 0-1 Knapsack. Consider an instance of Partition. Construct an instance of 0-1 Knapsack with v_i = u_i = b_i for i = 1, 2, . . . , n, and B = K = (1/2) ∑_{i=1}^{n} b_i. Then, a solution exists to the Partition instance if and only if one exists to the constructed 0-1 Knapsack instance. Exercise: Try to find a reduction in the opposite direction (such a reduction is given in P&S).

2. Independent Set is NP-complete.

Proof. Clearly Independent Set ∈ NP, as a list of the vertices in the independent set suffices as a certificate.

Now, we show that SAT ∝ Independent Set. Suppose you are given an instance of SAT, with k clauses. Form a graph that has a component corresponding to each clause. For a given clause, the corresponding component is a complete graph with one vertex for each variable in the clause. Now, connect two nodes in different components if and only if they correspond to a variable and its negation. It is easy to see that the resulting graph has an independent set of size at least k if and only if the SAT instance is satisfiable. An example of the reduction is illustrated in Figure 11 for the 3-SAT expression (x1 ∨ x2 ∨ x3) ∧ (x1 ∨ x2 ∨ x4) ∧ (x1 ∨ x3 ∨ x4) ∧ (x2 ∨ x3 ∨ x4).
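The construction can be sketched as follows. Clauses are encoded as tuples of signed integers (x1 as 1, its negation as -1) - an encoding we chose for the sketch - and the brute-force independent-set check is only there to verify the equivalence on tiny formulas:

```python
from itertools import combinations

def sat_to_graph(clauses):
    """Build the reduction graph: one vertex per literal occurrence,
    a clique inside each clause, and an edge between every pair of
    complementary literals in different clauses."""
    vertices = [(c, lit) for c, clause in enumerate(clauses) for lit in clause]
    edges = set()
    for (c1, l1), (c2, l2) in combinations(vertices, 2):
        if c1 == c2 or l1 == -l2:
            edges.add(frozenset(((c1, l1), (c2, l2))))
    return vertices, edges

def has_independent_set(vertices, edges, k):
    """Brute force: is there an independent set of size >= k?"""
    return any(all(frozenset(p) not in edges for p in combinations(S, 2))
               for S in combinations(vertices, k))
```

For the satisfiable formula (x1 ∨ x2) ∧ (x̄1 ∨ x̄2) the graph has an independent set of size 2; for the unsatisfiable (x1) ∧ (x̄1) it does not.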

Figure 11: Example of the 3-SAT reduction to Independent Set

3. k-center is NP-complete.


Proof. The k-center problem is in NP from Section 5.2. The dominating set problem is known to be NP-complete; therefore, so is k-center (from Section 5.5.1).

Additional remarks about the Knapsack problem:
The problem can be solved by a longest path procedure on a graph - a DAG - where for each j ∈ {1, . . . , n} and b ∈ {0, 1, . . . , B} we have a node. This algorithm can alternatively be viewed as dynamic programming, with f_j(b) the value of max ∑_{i=1}^{j} u_i x_i such that ∑_{i=1}^{j} v_i x_i ≤ b. This is computed by the recursion:

f_j(b) = max{ f_{j-1}(b), f_{j-1}(b − v_j) + u_j }.

There are obvious boundary conditions:

f_1(b) = 0 if b ≤ v_1 − 1, and f_1(b) = u_1 if b ≥ v_1.

The complexity of this algorithm is O(nB). So if the input is represented in unary then the input length is a polynomial function of n and B, and this run time is considered polynomial. The knapsack problem, as well as any other NP-complete problem that has a poly-time algorithm for unary input, is called weakly NP-complete.
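The recursion above, as a table-filling routine (O(nB) time and space, matching the pseudopolynomial bound):

```python
def knapsack_max_value(v, u, B):
    """Compute f_n(B), where f_j(b) is the max of u_1 x_1 + ... + u_j x_j
    subject to v_1 x_1 + ... + v_j x_j <= b, via the recursion
    f_j(b) = max{ f_{j-1}(b), f_{j-1}(b - v_j) + u_j }."""
    n = len(v)
    f = [[0] * (B + 1) for _ in range(n + 1)]
    for j in range(1, n + 1):
        for b in range(B + 1):
            f[j][b] = f[j - 1][b]                      # x_j = 0
            if b >= v[j - 1]:                          # x_j = 1 is feasible
                f[j][b] = max(f[j][b], f[j - 1][b - v[j - 1]] + u[j - 1])
    return f[n][B]
```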

6 Graph Representations and Graph Search Algorithms

6.1 Representations of a Graph

Some ways in which graphs or networks can be represented are described below.

Node-Node Adjacency Matrix: The network is described by the matrix N, whose (i,j)th entry is n_ij, where

n_ij = 1 if (i,j) ∈ E, and 0 otherwise.

Figure 12 depicts the node-node adjacency matrices for directed and undirected graphs.

[Figure omitted: a) the node-node adjacency matrix of a directed graph, with rows indexed "From node", columns "To node", and a 1 in entry (i,j); b) the node-node adjacency matrix of an undirected graph, with 1s in both entries (i,j) and (j,i).]

Figure 12: Node-node adjacency matrices

Note: For undirected graphs, the node-node adjacency matrix is symmetric. The density of this matrix is 2|E|/|V|^2 for an undirected graph, and |A|/|V|^2 for a directed graph. If the graph is complete, then |A| = |V| · (|V| − 1) and the matrix N is dense.


Is this matrix a good representation for a breadth-first search? (Discussed in the next subsection.) To determine whether a node is adjacent to any other nodes, n entries need to be checked for each row. For a connected graph with n nodes and m edges, n − 1 ≤ m ≤ n(n − 1)/2, and there are O(n^2) searches.

Node-Arc Adjacency Matrix: The node-arc adjacency matrix consists of the coefficients from the flow balance constraints in the integer programming formulation of MCNF. The rows correspond to nodes in the network, and the columns correspond to arcs. As mentioned in an earlier lecture, the node-arc adjacency matrix is totally unimodular. For an arc (i,j), the matrix entry for node (row) i is +1 while that for node (row) j is -1.

The density of a node-arc adjacency matrix is 2/n.

Adjacency List: Consider a tree, which is a very sparse graph with |E| = |V| − 1. If we represent an undirected tree using the matrices described above, we use a lot of memory. A more compact representation is to write each node and its associated adjacency list. Such a representation requires 2|E| + |V| memory spaces, as opposed to |E| · |V| memory spaces for the matrix representation.

Node | Adjacent Nodes
1 | Neighbors of node 1
2 | Neighbors of node 2
... | ...
n | Neighbors of node n

There are 2 entries for every edge or arc, and a constant amount of work is required for every edge in the graph. This results in O(m + n) size of the adjacency list representation. This is the most compact representation among the three presented here. For sparse graphs it is the preferable one.
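Building the adjacency-list representation from an edge list takes O(m + n) time and space; a minimal sketch (the dict-of-lists layout is our choice):

```python
def adjacency_list(n, edges, directed=False):
    """Return {node: [neighbors]} for nodes 1..n.

    Each undirected edge contributes two entries, so the structure
    uses O(n + 2m) = O(n + m) space, as noted above.
    """
    adj = {v: [] for v in range(1, n + 1)}
    for u, v in edges:
        adj[u].append(v)
        if not directed:
            adj[v].append(u)
    return adj
```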

6.2 Searching a Graph

A common problem that requires searching a graph is that of identifying all nodes that can be reached from a certain node s.

Generic Search Algorithm: A generic search algorithm starts with a special node (node 1 in the example below), and designates it as the root of a tree. The algorithm then selects other nodes that are adjacent to nodes already in the tree and adds them to the tree. It continues in an iterative manner until all nodes are part of the tree. We will define the selection mechanism more precisely below.

Code for a generic search algorithm:

V_T ← {root}, E_T ← ∅
while V ≠ V_T:
    select w ∈ V_T
    select v ∈ V − V_T such that (w, v) ∈ E − E_T
    V_T ← V_T ∪ {v}
    E_T ← E_T ∪ {(w, v)}
end


For example, we are interested in searching the graph in Figure 13, starting at node 1. Two basic methods that are used to define the selection mechanism are Breadth-First-Search (BFS) and Depth-First-Search (DFS).

Breadth-First-Search: In breadth-first search, the graph is searched in an exhaustive manner; that is, all previously unscanned neighbors of a particular node are tagged and added to the end of the reachability list, and then the previously unscanned neighbors of the next node in the list are added, etc. Breadth-first search employs a first-in-first-out (FIFO) strategy and is usually implemented using a queue. The algorithm is as follows:

Given G = (V,E):
V_T ← {root}        /* initialize */
E_T ← ∅
while V_T ≠ V:
    select w ∈ V_T (FIFO: first in queue)
    while ∃ (w, v) ∈ E \ E_T such that v ∈ V \ V_T:
        add v to the queue
        V_T ← V_T ∪ {v}
        E_T ← E_T ∪ {(w, v)}
    end
    remove w from the queue
end
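The pseudocode above, as a runnable sketch over an adjacency-list representation; it returns the BFS tree as a parent map together with each node's level (its distance in edges from the root):

```python
from collections import deque

def bfs(adj, root):
    """Breadth-first search. adj maps each node to its neighbor list."""
    parent = {root: None}
    level = {root: 0}
    queue = deque([root])
    while queue:
        w = queue.popleft()              # FIFO: first in, first out
        for v in adj[w]:
            if v not in parent:          # previously unscanned neighbor
                parent[v] = w            # tree edge (w, v)
                level[v] = level[w] + 1
                queue.append(v)
    return parent, level
```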

The original graph, rearranged in a breadth-first search, shows the typically short and bushy shape of a BFS tree, as shown in Figure 14.

Note that the BFS tree is not unique and depends on the order in which the neighbors are visited. Not all edges from the original graph are represented in the tree. However, we know that the original graph cannot contain any edges that would bridge two nonconsecutive levels in the tree. By level we mean a set of nodes such that the number of edges that connect each node to the root of the tree is the same.

Consider any graph and form its BFS tree. Any edge belonging to the original graph but not to the BFS tree either goes from level i to level i + 1, or connects two nodes at the same level.

Figure 13: Graph to be searched


[Figure omitted: the BFS tree drawn in levels at distance 1, 2, and 3 from the root.]

Figure 14: Solid edges represent the BFS tree of the example graph. The dotted edges cannot be present in a BFS tree.

For example, consider the dotted edge in Figure 14 from node 5 to node 6. If such an edge existed in the graph, then node 6 would be a child of node 5 and hence would appear at a distance of 2 from the root.

We say that an edge forms an even cycle if it is from level i to level i + 1 in the graph, and an odd cycle if it is between two nodes at the same level.

Depth-First-Search: In a depth-first search, one searches for a previously unscanned neighbor of a given node. If such a node is found, it is added to the head of the list and is scanned out from; otherwise, the original node is removed from the list, and we scan from the node now at the head of the list. Depth-first search employs a last-in-first-out (LIFO) strategy and is usually implemented using a stack. The algorithm for DFS is as follows:

Given G = (V,E):
V_T ← {root}        /* initialize */
E_T ← ∅
while V_T ≠ V:
    select w ∈ V_T (LIFO: top of stack)
    if ∃ (w, v) ∈ E \ E_T such that v ∈ V \ V_T:
        add v to the stack
        V_T ← V_T ∪ {v}
        E_T ← E_T ∪ {(w, v)}
    else
        remove w from the stack
end
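The stack-based pseudocode written out as a sketch; as in the text, a node is popped only after all of its neighbors have been scanned:

```python
def dfs(adj, root):
    """Iterative depth-first search; returns the DFS tree edges."""
    visited = {root}
    tree_edges = []
    stack = [root]
    while stack:
        w = stack[-1]                    # LIFO: top of stack
        unscanned = next((v for v in adj[w] if v not in visited), None)
        if unscanned is not None:
            visited.add(unscanned)
            tree_edges.append((w, unscanned))
            stack.append(unscanned)      # go deeper from the new node
        else:
            stack.pop()                  # all neighbors of w scanned
    return tree_edges
```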

The same graph, rearranged in a depth-first search, demonstrates the typically long and skinny shape of a DFS tree, as shown in Figure 15.

Depending upon the problem at hand, one search method may be more appropriate than the other. For example, DFS is useful for finding the biconnected components of a graph, while BFS is used for finding the minimum number of edges needed to walk from the source to any other node in the graph.


Figure 15: DFS tree of the example graph

6.2.1 The complexity of the search procedures

In order to manage the search efficiently we use the adjacency list data structure. Each node in V has a pointer to the last scanned out-neighbor in the list (or neighbor in the undirected case). Prior to the initial search of a node's list the pointer is at the start position, say pointing to 0. If the pointer is at a position > 0 then we recognize that the node has been visited already, and thus belongs to V_T.

At each iteration there is a lookup at the adjacency list, checking if the neighboring node is in V_T, and then either advancing the pointer or adding the arc to the search tree. In either case, each arc is inspected at most once. This leads to a complexity of O(m). For graphs that are very sparse we note that each node is visited once, for added complexity of O(n). The total is O(m + n). For undirected graphs each edge can be inspected at most twice, so the complexity is up to a factor of 2 more operations, which is still O(m + n).

7 Shortest Paths Problem

7.1 Introduction

Consider a graph G = (V,E), with cost cij on arc (i, j). Shortest path problems can have different objectives:
– Finding the shortest path from s to t (single pair).
– Finding the shortest paths from a single source to all other nodes (single source).
– Finding the shortest paths from every node to every other node (all pairs).

We will begin by considering shortest paths (or longest paths) in Directed Acyclic Graphs (DAGs).

Properties of DAGs

A DAG is a Directed Acyclic graph, namely, there are no directed cycles in the graph.


Definition 7.1. A topological ordering is an assignment of the numbers 1, 2, · · · , n to the nodes such that all arcs go in the forward direction:

(i, j) ∈ E ⇒ i < j

7.2 Topological Sort and directed acyclic graphs

Given a directed graph (also called a digraph), G = (N,A), we wish to topologically sort the nodes of G; that is, we wish to assign labels 1, 2, . . . , |N| to the nodes so that (i, j) ∈ A ⇒ i < j.

A topological sort does not exist in general; a trivial example showing this is a graph with two nodes and directed edges from each node to the other. It does, however, always exist for acyclic graphs:

Lemma 7.1. In an acyclic digraph G, there exists a node with in-degree 0.

Proof: Say that in the acyclic digraph G there is no node with in-degree 0. We arbitrarily pick a node i in G, and apply a depth-first-search by visiting the predecessor nodes of each node encountered. Since every node has a predecessor, when the nodes are exhausted (which must happen since there are a finite number of nodes in the graph), we will encounter a predecessor which has already been visited. This shows that there is a cycle in G, which contradicts our assumption that G is acyclic.

Lemma 7.2. There exists a topological sort for a digraph G if and only if G has no directed cycles

Proof: [Only if part] Suppose there exists a topological sort for G. If there exists a cycle (i1, . . . , ik, i1), then, by the definition of topological sort above, i1 < i2 < . . . < ik < i1, which is impossible. Hence, if there exists a topological sort, then G has no directed cycles.

[If part] In an acyclic graph there exists a node i with in-degree 0, by the previous lemma. The algorithm for topological sort is described in Figure 16. The nodes can be topologically sorted by iteratively assigning a node with in-degree 0 the lowest remaining label, and removing that node from the graph. The correctness of this algorithm is established by induction on the label number.

i ← 1
V ← N
while V ≠ ∅:
    select v ∈ V with in-degree(v) = 0
    label(v) ← i
    i ← i + 1
    V ← V \ {v}; A ← A \ {(v, w) ∈ A}
    update the in-degrees of the neighbors of v
end while

Figure 16: Algorithm Topological Sort

Note that the topological sort resulting from this algorithm is not unique. Further note that we could alternately have taken a node with out-degree 0, assigned it the highest remaining label, removed it from the graph, and iterated until no nodes remain.


Figure 17: An example of topological ordering in a DAG. Topological orders are given in boxes.

An example of the topological order obtained for a DAG is depicted in Figure 17. The numbers in the boxes near the nodes represent a topological order.

Consider the complexity analysis of the algorithm. A very loose analysis says that each node is visited at most once, and then its list of neighbors has to be updated (a search of O(m) in the node-arc adjacency matrix, O(n) in the node-node adjacency matrix, or the number of neighbors in the adjacency list data structure). Then we need to find, among all nodes, one with in-degree equal to 0. This algorithm runs in O(mn) time, or, if we are more careful, in O(n2); the second factor of n is due to the search for a node of in-degree 0. To improve on this we maintain a “bucket”, or list, of nodes of in-degree 0. Each time a node’s in-degree is updated, we check whether it is 0, and if so add the node to the bucket. Now finding a node of in-degree 0 is done in O(1): just look up the bucket in any order. Updating the in-degrees is done in O(m) time total throughout the algorithm, as each arc is looked at once. Therefore, topological sort can be implemented to run in O(m + n) time.
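The bucket improvement can be sketched in Python (this scheme is commonly known as Kahn's algorithm). The 0-based node labeling and all names are assumptions for illustration, not part of the notes.

```python
from collections import deque

def topological_sort(n, arcs):
    """Topological sort with a bucket (queue) of in-degree-0 nodes.

    n: number of nodes, labeled 0..n-1; arcs: list of (i, j) pairs.
    Returns a list giving the nodes in topological order, or None if
    a directed cycle prevents any topological order from existing.
    """
    out_neighbors = [[] for _ in range(n)]
    indegree = [0] * n
    for i, j in arcs:
        out_neighbors[i].append(j)
        indegree[j] += 1
    bucket = deque(v for v in range(n) if indegree[v] == 0)
    order = []
    while bucket:
        v = bucket.popleft()           # any in-degree-0 node works
        order.append(v)
        for w in out_neighbors[v]:     # "remove" v: update in-degrees
            indegree[w] -= 1
            if indegree[w] == 0:
                bucket.append(w)       # w enters the bucket exactly once
    return order if len(order) == n else None
```

Each arc is touched once when its tail is removed, matching the O(m + n) bound.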

The Shortest Path Problem (henceforth SP problem) on a digraph G = (N,A) involves finding the least cost path between nodes i, j ∈ N. Each arc (p, q) ∈ A has a weight assigned to it, and the cost of a path is defined as the sum of the weights of all arcs in the path.

We address the problem of finding the SP between nodes i and j, such that i < j, assuming the graph is topologically ordered to start with. It is beneficial to sort the graph in advance, since we can then trivially determine that there is no path between i and j if i > j.

We have already seen how to model the single source shortest path problem as a Minimum Cost Network Flow problem. We consider properties of an optimal solution to this problem before turning to algorithms specially designed to solve it. (These properties can be proven by a contradiction-type argument.)

7.3 Properties of shortest paths

Proposition 7.1. If the path (s = i1, i2, . . . , ih = v) is a shortest path from node s to node v, then the subpath (s = i1, i2, . . . , ik) is a shortest path from the source to node ik, ∀ k = 2, 3, . . . , h − 1.

This property is a rephrasing of the principle of optimality, which serves to justify dynamic programming algorithms.

Proposition 7.2. Let the vector d represent the shortest path distances from the source node. Then

1. d(j) ≤ d(i) + cost(i, j), ∀ (i, j) ∈ A.


2. A directed path P from s to k is a shortest path if and only if d(j) = d(i) + cost(i, j), ∀ (i, j) ∈ P.

This property is useful for backtracking and identifying the tree of shortest paths.

Given the distance vector d, we call an arc eligible if d(j) = d(i) + cost(i, j). We may find a (shortest) path from the source s to any other node by performing a breadth-first-search of eligible arcs. The graph of eligible arcs looks like a tree, plus some additional arcs in the case that the shortest paths are not unique. But we can choose one shortest path for each node so that the resulting graph of all shortest paths from s forms a tree.
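As a small sketch of how eligible arcs yield a shortest path tree, suppose distance labels d have already been computed by some shortest path algorithm; keeping one eligible arc into each node gives the compact tree representation. All names here are hypothetical.

```python
def shortest_path_tree(arcs, cost, d, source):
    """Select one eligible arc per node to form a shortest path tree.

    arcs: list of (i, j); cost: dict (i, j) -> arc cost;
    d: dict node -> shortest distance from source.
    An arc (i, j) is eligible when d[j] == d[i] + cost[(i, j)].
    Returns pred, mapping each node j != source to its tree parent.
    """
    pred = {}
    for i, j in arcs:
        if j != source and j not in pred and d[j] == d[i] + cost[(i, j)]:
            pred[j] = i      # keep one eligible arc into j
    return pred
```

Backtracking through `pred` from any node recovers one shortest path to the source.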

Given an undirected graph with nonnegative weights, the following analogue algorithm solves the shortest paths problem:

The String Solution: Given a shortest path problem on an undirected graph with non-negative costs, imagine physically representing the vertices by small rings (labeled 1 to n), tied to other rings with string of length equal to the cost of each edge connecting the corresponding rings. That is, for each arc (i, j), connect ring i to ring j with string of length cost(i, j).

Algorithm: To find the shortest path from vertex s to all other vertices, hold ring s and let the other nodes fall under the force of gravity.

Then, the distance, d(k), that ring k falls represents the length of the shortest path from s to k. The strings that are taut correspond to eligible arcs.

7.4 Shortest Paths in a DAG

We consider the single source shortest paths problem for a directed acyclic graph G. Without loss of generality, we can assume that the source is labeled 1, since any node with a smaller label than the source is unreachable from the source and can thereby be removed.

Given a topological sorting of the nodes, with node 1 the source, the shortest path distances can be found by using the recurrence

d(j) = min{d(i) + cost(i, j) : (i, j) ∈ A, 1 ≤ i < j}.

The validity of the recurrence is easily established by induction on j. It also follows from Proposition 7.2.

What is the complexity of this dynamic programming algorithm? It takes O(m) operations to perform the topological sort, and O(m) operations to calculate the distances via the given recurrence. Alternatively, the topological sort can be done in parallel with the recurrence at an added constant time per iteration. Therefore, the distances can be calculated in O(m) time. Note that by keeping track of the i’s that minimize the right hand side of the recurrence, we can easily backtrack to reconstruct the shortest paths.
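The recurrence can be sketched in Python, assuming the nodes are already given in topological order 0, 1, . . . , n−1 with node 0 the source (an assumed labeling; replacing min by max, and ∞ by −∞, would give longest paths as noted below).

```python
import math

def dag_shortest_paths(n, arcs):
    """Single source shortest paths in a topologically ordered DAG.

    n: number of nodes, labeled 0..n-1 in topological order (0 = source).
    arcs: dict mapping (i, j) with i < j to the arc cost.
    Implements d(j) = min over arcs (i, j) of d(i) + cost(i, j).
    Returns (d, pred); pred supports backtracking of the paths.
    """
    in_arcs = [[] for _ in range(n)]         # incoming arcs per node
    for (i, j), c in arcs.items():
        in_arcs[j].append((i, c))
    d = [math.inf] * n
    pred = [None] * n
    d[0] = 0
    for j in range(1, n):                    # nodes in topological order
        for i, c in in_arcs[j]:
            if d[i] + c < d[j]:
                d[j] = d[i] + c
                pred[j] = i                  # the minimizing i, for backtracking
    return d, pred
```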

It is important how we represent the output to the problem. If we want a list of all shortest paths, then just writing this output takes O(n2). The shortest path tree, or the list of eligible arcs, is a compact O(n) representation of all shortest paths, and will thus be assumed as the form of required output.

Longest Paths in DAGs: To find the longest path from node 1 to every other node in the graph, simply replace min with max.


7.5 Applications of Shortest/Longest Path

Designer’s Relocation This is a typical project management problem. The objective of the problem is to find the earliest time by which the project can be completed. Refer to the graph in Figure 18. All paths in the network have to be traversed in order for the project to be completed, so the longest one will be the bottleneck determining the completion time of the project. It can be formulated as a longest path problem in an acyclic network (Critical Path Method – CPM). (Note that in general, the longest path problem on a graph is an NP-complete problem.) One can make all the costs (time durations) in the graph negative and then solve the shortest path problem on a DAG. The nodes can be split to represent the time taken in each node. Each node is labelled as a triplet (node name, time in node, earliest start time). The earliest start time of a node is the earliest time that work can be started on that node. Hence, the objective can also be written as the earliest start time of the finish node.

Label of a node = earliest start time of the node:

label(x) = max_{(y,x) ∈ A} {label(y) + duration(y)}

The critical path (thick lines) in the project graph below is Start–B–F–K–M–Finish. It is called critical because any delay in a task along this path delays the whole project.

Figure 18: Project management example

See the handout titled Project Formulation for Designer Gene’s Relocation.

Maximizing Rent The graph shown in Figure 19 has arcs going from the arrival day to the departure day, with the bid written on the arcs. Arcs with cost 0 are drawn between consecutive days when no such arcs exist; these are shown as dotted arcs in the graph. To find the maximum rent plan one should find the longest path in this graph. The graph has a topological order, and the problem can be solved using the shortest path algorithm for DAGs. (Note that the shortest and longest path problems are equivalent for acyclic networks, but dramatically different when there are cycles.)


Figure 19: Maximizing rent example

See the handout titled Maximizing Rent: Example.

Equipment Replacement Minimize the total cost of buying, selling, and operating the equipment over a planning horizon of T periods.

See the handout titled The Equipment Replacement Problem.

Lot Sizing Problem Meet the prescribed demand dj for each of the periods j = 1, 2, . . . , T by producing, or by carrying inventory from previous production runs. The cost of production includes a fixed setup cost for producing a positive amount at any period, and a variable per unit production cost at each period. This problem can be formulated as a shortest path problem in a DAG (AM&O page 749) if one observes that any production will be for the demand of an integer number of consecutive periods ahead. You are asked to prove this in your homework assignment.

7.6 Reduction from Hamiltonian Cycle to Longest Path Problem:

We show here that the longest path problem is at least as hard to solve as the Hamiltonian cycle problem.

We wish to find a Hamiltonian Cycle (HC) in G = (V,E). We first claim that the Hamiltonian Path (HP) problem is equivalent to the Hamiltonian cycle problem. Given a HP problem on a graph G = (V,E) of finding a HP beginning at s and ending at t, we add an edge between s and t and solve it as a HC problem. Vice versa, we split a node, say node 1, into two nodes 1′ and 1′′ with the same set of neighbors as 1. Now there is a HP between 1′ and 1′′ if and only if there is a HC in the original graph.

Suppose we can solve the longest path problem, and we want to solve the HP problem for a HP between s and t. We let the cost of all edges in the graph be 1, and note that the longest path from s to t is of length n − 1 iff there exists a Hamiltonian path from s to t. We can thus use a longest path algorithm to solve the Hamiltonian path problem. Therefore, the longest path problem is at least as hard as the Hamiltonian cycle problem. Conclusion: The longest path problem is an NP-hard problem, since the Hamiltonian cycle problem is NP-hard.

Still, in an acyclic network the longest path problem is easy and solvable in linear time O(m).


7.7 The mathematical programming formulation of the longest path and shortest paths problems

Consider the longest path problem in the context of project management. Let the decision variables be xi = earliest start time of activity i. Let the start node be denoted by s and the finish node by t. Let the precedence constraints network be G = (V,A).

(LP) Min xt
subject to xj ≥ xi + di ∀ (i, j) ∈ A
xs = 0.

Note that the constraint matrix is totally unimodular: each row has one 1 and one −1. We can recognize the critical path from the dual: a binding constraint for (i, j) that has dual value equal to 1 is on the critical path, and therefore both i and j are critical.

In a linear programming extreme point solution each xi will be either the earliest start time or the latest start time of the activity. This is because either one of these is an extreme point solution.

Consider the shortest path from s to t problem. Set xi = distance from s to i.

(SP) Max xt
subject to xj ≤ xi + cij ∀ (i, j) ∈ A
xs = 0.

7.8 Dijkstra’s Algorithm for the Single Source Shortest Path Problem for Graphs with Nonnegative Edge Weights

begin
    N+(i) := {j : (i, j) ∈ A};
    P := {1}; T := V \ {1};
    d(1) := 0 and pred(1) := 0;
    d(j) := C1j and pred(j) := 1 for all j ∈ N+(1), and d(j) := ∞ otherwise;
    while P ≠ V do
    begin
        (Node selection, also called FINDMIN)
        let i ∈ T be a node for which d(i) = min{d(j) : j ∈ T};
        P := P ∪ {i}; T := T \ {i};
        (Distance update)
        for each j ∈ N+(i) do
            if d(j) > d(i) + Cij then d(j) := d(i) + Cij and pred(j) := i;
    end
end

Figure 20: Algorithm: Dijkstra’s


Dijkstra’s algorithm is a special purpose algorithm for finding shortest paths from a single source, s, to all other nodes in a graph with non-negative edge weights. Nodes are given distance labels from node s, which are either temporary or permanent. Initially all nodes have temporary labels. At any iteration, a node with the least distance label is expanded and labeled permanent. If the expansion results in a shorter distance path to any of its successors, then the distance label of the successor is updated. This is continued until no temporary nodes are left in the graph.

Dijkstra’s algorithm locates the nodes in increasing distance from node 1. At the kth iteration, the kth closest node to node 1 is labeled as “permanent”.
P: set of nodes with permanent labels
T: set of nodes with temporary labels
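A common Python sketch of the algorithm uses `heapq` with lazy deletion of stale entries in place of the FINDMIN scan over T; this corresponds to the heap-based bounds discussed later, not to the O(n²) array version. The 0-based labeling and names are assumptions.

```python
import heapq
import math

def dijkstra(n, adj, source):
    """Dijkstra's algorithm with a binary heap and lazy deletion.

    n: number of nodes, labeled 0..n-1.
    adj: adj[i] is a list of (j, cost_ij) pairs, with cost_ij >= 0.
    Returns (d, pred): distance labels and predecessor pointers.
    """
    d = [math.inf] * n
    pred = [None] * n
    d[source] = 0
    permanent = [False] * n           # P; all other nodes form T
    heap = [(0, source)]              # temporary labels
    while heap:
        _, i = heapq.heappop(heap)    # node selection (FINDMIN)
        if permanent[i]:
            continue                  # stale entry left by lazy deletion
        permanent[i] = True
        for j, c in adj[i]:           # distance update
            if d[i] + c < d[j]:
                d[j] = d[i] + c
                pred[j] = i
                heapq.heappush(heap, (d[j], j))
    return d, pred
```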

Theorem 7.1 (The correctness of Dijkstra’s algorithm). Dijkstra’s algorithm correctly determines the shortest paths from node 1 for networks in which arc lengths are nonnegative. In addition, it labels nodes permanently in order of non-decreasing distance from node 1.

Proof At each iteration the nodes are partitioned into subsets P and T. We observe that the labels at each iteration have the following properties, and prove this by induction.
Inductive assumptions:
1. The label of each node i ∈ P is the shortest distance from node 1.
2. The label of each j ∈ T is the shortest distance from node 1 such that all internal nodes of the path are in P (i.e., it would be the shortest path if we eliminated all arcs with both endpoints in T).

Assume the assumptions hold for |P| = k, and prove them for |P| = k + 1. Suppose that node i with d(i) = min{d(j) : j ∈ T} was added to P, but its label d(i) is not the shortest path label. Then there must be some nodes in T along the shortest path from 1 to i. Let j be the first node in T on the shortest path from 1 to i. Then the length of the shortest path from 1 to i is d(j) + distance(j, i). Since d(i) is not the shortest path label, d(j) + distance(j, i) < d(i), and since distance(j, i) is nonnegative, we have d(j) < d(i). This contradicts the choice of d(i) as the minimum of {d(j) : j ∈ T} (using the second inductive assumption for d(j)).

We next show that the algorithm preserves the second inductive assumption. After a node i in T is labeled permanently, the distance labels in T \ {i} might decrease. After permanently labeling node i, the algorithm sets d(j) = d(i) + cij if d(j) > d(i) + cij, for (i, j) ∈ A. Therefore, after the distance update, by the inductive assumption the path from the source to j satisfies Proposition 7.2. Thus, the distance label of each node in T \ {i} is the length of the shortest path subject to the restriction that each internal node in the path must belong to P ∪ {i}.

Complexity Analysis
There are two major operations in Dijkstra’s algorithm:
Find minimum, which has O(n2) complexity over all iterations.
Update labels, which has O(m) complexity over all iterations.
Therefore, complexity = O(n2 + m) = O(n2).

There are several implementations that improve upon this running time. One improvement uses binary heaps: whenever a label is updated, use binary search to insert that label properly in the sorted array. This requires O(log n) per update. Therefore, the complexity of finding the minimum label in the non-permanent set is O(m log n), and the complexity of the algorithm is O(m + m log n) ≈ O(m log n). Another improvement is the radix heap, which has complexity O(m + n√(log C)), where C = max(i,j)∈A Cij. But this is not a strongly polynomial complexity.

Currently, the best strongly polynomial implementation of Dijkstra’s algorithm uses Fibonacci heaps and has complexity O(m + n log n). Since Fibonacci heaps are hard to program, this implementation is not used in practice.

Figure 21: Network for which Dijkstra’s algorithm fails

Dijkstra’s algorithm does not work correctly in the presence of negative edge weights. In this algorithm, once a node is included in the “permanent” set it is never checked again; with negative edges this is not valid, since the label of a node already in the permanent set might later be reduced along a path that uses a negative cost edge.

See Figure 21 for an example in which Dijkstra’s algorithm does not work in the presence of negative edge weights. Using the algorithm, we get d(3) = 3, but the actual value of d(3) is 2.

7.9 Bellman-Ford Algorithm for Single Source Shortest Paths

This algorithm can be used with networks that contain negative weight edges, and it can also detect a negative cycle (if one exists) at the last iteration. The algorithm consists of n pulses (a pulse = an update of the labels along all m arcs).

initialize d(1) := 0; d(i) := ∞, i = 2, . . . , n
pulse: ∀ (i, j) ∈ A, d(j) := min{d(j), d(i) + Cij}

Complexity = O(mn), since each pulse is O(m) and we have n pulses. Note that it suffices to apply the pulse operations only for arcs (i, j) where the label of i changed in the previous pulse.
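The pulse scheme can be sketched in Python. This sketch adds an early exit when no label changes, and reports a negative cost cycle when a label still decreases on the nth pulse (the criterion proved below); the 0-based labeling and names are assumptions.

```python
import math

def bellman_ford(n, arcs, source):
    """Bellman-Ford pulses with negative cost cycle detection.

    n: number of nodes, labeled 0..n-1; arcs: list of (i, j, cost).
    Returns (d, has_negative_cycle).
    """
    d = [math.inf] * n
    d[source] = 0
    changed = False
    for _ in range(n):                 # n pulses
        changed = False
        for i, j, c in arcs:           # one pulse: scan all m arcs
            if d[i] + c < d[j]:
                d[j] = d[i] + c
                changed = True
        if not changed:                # labels stabilized early
            break
    # a change on the n-th pulse certifies a negative cost cycle
    return d, changed
```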

Theorem 7.2. If there are no negative cost cycles in the network G = (N,A), then there exists a shortest path from s to any node i which uses at most n − 1 arcs.

Proof. Suppose that G contains no negative cycles. Observe that at most n − 1 arcs are required to construct a path from s to any node i. Now, consider a path, P, from s to i which traverses a cycle:

P = s → i1 → i2 → . . . → (ij → ik → . . . → ij) → iℓ → . . . → i.

Since G has no negative length cycles, the length of P is no less than the length of P′, where

P′ = s → i1 → i2 → . . . → ij → ik → . . . → iℓ → . . . → i.


Thus, we can remove all cycles from P and still retain a shortest path from s to i. Since the final path is acyclic, it must have no more than n − 1 arcs.

Theorem 7.3 (Invariant of the algorithm). After pulse k, all shortest paths from s of length k (in terms of number of arcs) or less have been identified.

Proof. Notation: dk(j) is the label d(j) at the end of pulse k.
By induction. For k = 1 the claim is obvious. Assume it is true for k, and prove it for k + 1.
Let (s, i1, i2, . . . , iℓ, j) be a shortest path to j using k + 1 or fewer arcs.
It follows that (s, i1, i2, . . . , iℓ) is a shortest path from s to iℓ using k or fewer arcs.
After pulse k + 1, dk+1(j) ≤ dk(iℓ) + ciℓ,j.
But dk(iℓ) + ciℓ,j is the length of a shortest path from s to j using k + 1 or fewer arcs, so we know that dk+1(j) = dk(iℓ) + ciℓ,j.

Thus, the d() labels are correct after pulse k + 1.
Note that the shortest paths identified in the theorem are not necessarily simple. They may use negative cost cycles.

Theorem 7.4. If there exists a negative cost cycle, then there exists a node j such that dn(j) < dn−1(j), where dk(i) is the label at iteration k.

Proof. Suppose not.
Let i1 − i2 − . . . − ik − i1 be a negative cost cycle.
We have dn(ij) = dn−1(ij) for all j, since the labels never increase.
From the optimality conditions, dn(ir+1) ≤ dn−1(ir) + Cir,ir+1 for r = 1, . . . , k − 1, and dn(i1) ≤ dn−1(ik) + Cik,i1.

These inequalities imply ∑_{j=1}^{k} dn(ij) ≤ ∑_{j=1}^{k} dn−1(ij) + ∑_{j=1}^{k} Cij−1,ij.

But since dn(ij) = dn−1(ij), the above inequality implies 0 ≤ ∑_{j=1}^{k} Cij−1,ij. This sum is the length of the cycle, so we get an inequality which says that 0 is at most a negative number, which is a contradiction.

Corollary 7.1. The Bellman-Ford algorithm identifies a negative cost cycle if one exists.

By running n pulses of the Bellman-Ford algorithm, we either detect a negative cost cycle (if one exists) or find a shortest path from s to every other node i. Therefore, finding a negative cost cycle can be done in polynomial time, whereas finding the most negative cost cycle is NP-hard, because the Hamiltonian cycle problem reduces to it.
Remark: Finding a minimum mean cost cycle is also polynomially solvable.

7.10 Floyd-Warshall Algorithm for All Pairs Shortest Paths

The Floyd-Warshall algorithm is designed to solve all pairs shortest paths problems for graphs with negative cost edges. The algorithm utilizes the concept of triangular comparison of distance labels. The way the algorithm works is illustrated through the attached handout. The main advantage of this algorithm is that it terminates as soon as a negative cost cycle is found. The algorithm maintains a matrix (dij) such that at iteration k, dij is the shortest path length from i to j using nodes 1, 2, . . . , k as intermediate nodes. After the algorithm terminates, assuming that no negative cost cycle is present, the shortest path length from node i to node j is dij.


See the handout titled Floyd-Warshall Algorithm.

At iteration j, dik = min(dik, dij + djk) (the triangle operation), and eik is updated according to the handout (eik = index of the last intermediate node inserted between node i and node k on the shortest path between the two).

At the initial stage: dij = cij if there exists an arc between nodes i and j, dij = ∞ otherwise; eij = 0.

First iteration: d24 ← min(d24, d21 + d14) = 3, e24 = 1; update the distance labels.

Second iteration: d41 ← min(d41, d42 + d21) = −2; d43 ← min(d43, d42 + d23) = −3; d44 ← min(d44, d42 + d24) = −1; e41 = e43 = e44 = 2. No other label changed.

Note that we found a negative cost cycle, since the diagonal element d44 = −1 < 0. A negative dii means there exists a negative cost cycle, since it simply says that there exists a path of negative length from node i to itself. The algorithm terminates when a negative dii is found, or when the “for” loop is finished.

Invariant property of the Floyd-Warshall algorithm:
After iteration k, dij is the shortest path distance from i to j involving a subset of the nodes {1, 2, . . . , k} as intermediate nodes.

Complexity Analysis
At each iteration we consider another node as the intermediate node → O(n) iterations.
In each iteration we compare n2 triangles, one for each of the n2 pairs → O(n2) per iteration.
Therefore the total complexity is O(n3).
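The triangle operation can be sketched in Python (the `e` matrix for path recovery is omitted for brevity; the 0-based labeling and names are assumptions for illustration).

```python
import math

def floyd_warshall(n, cost):
    """Floyd-Warshall with negative cost cycle detection.

    n: number of nodes, labeled 0..n-1; cost: dict (i, j) -> arc cost.
    Returns (d, negative_cycle); if negative_cycle is False, d[i][j]
    is the shortest path distance from i to j.
    """
    d = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        d[i][i] = 0
    for (i, j), c in cost.items():
        d[i][j] = min(d[i][j], c)
    for k in range(n):                        # intermediate node
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:   # triangle operation
                    d[i][j] = d[i][k] + d[k][j]
        if any(d[i][i] < 0 for i in range(n)):
            return d, True                    # negative d_ii: negative cycle
    return d, False
```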

7.11 D.B. Johnson’s Algorithm for All Pairs Shortest Paths

This algorithm takes advantage of the idea that if all edge costs were nonnegative, we could find all pairs shortest paths by solving O(n) single source shortest paths problems using Dijkstra’s algorithm. To do this efficiently, it converts the network to an equivalent one with no negative cost edges, so that Dijkstra’s algorithm can be used. The algorithm goes as follows:
0) Add a node s to the graph with csj = 0 for all nodes j ∈ V.
1) Find the single source shortest path distances d(i) from s to every node i using the Bellman-Ford algorithm. If a negative cost cycle is detected, the algorithm terminates at this step.
2) Set c′ij = cij + d(i) − d(j). Note that c′ij ≥ 0 by the property of shortest paths that cij + d(i) ≥ d(j) (Property 7.2).
3) Apply Dijkstra’s algorithm n times to the graph with weights c′.

Claim 7.1. Adding node s to the graph does not create any negative cost cycle.

The claim is obvious, as s cannot be on any cycle.
Let dPc(s, t) be the length of a path P from s to t with respect to the costs c.

Claim 7.2. For every pair of nodes s and t, and a path P, dPc(s, t) = dPc′(s, t) − d(s) + d(t). (That is, minimizing dc(s, t) is equivalent to minimizing dc′(s, t).)

Proof: Let the path between s and t be P = (s, i1, i2, . . . , ik, t). The distance along the path satisfies

dPc′(s, t) = [cs,i1 + d(s) − d(i1)] + [ci1,i2 + d(i1) − d(i2)] + . . . + [cik,t + d(ik) − d(t)] = dPc(s, t) + d(s) − d(t).

Note that the d(i) terms cancel out, except for the first and last terms. Thus the length of every path from s to t changes by the constant amount d(s) − d(t), and

minP dPc′(s, t) = minP dPc(s, t) + constant.

Hence minimizing dc′(s, t) is equivalent to minimizing dc(s, t).

Complexity Analysis
Step 1: Bellman-Ford → O(mn).
Step 2: Updating c′ij for the m arcs → O(m).
Step 3: Dijkstra’s algorithm n times → O(n(m + n log n)).
Total complexity: O(n(m + n log n)).
So far this is the best complexity established for the all pairs shortest paths problem.
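The three steps can be sketched end to end in Python. This is an illustrative sketch only: it reuses a plain Bellman-Ford for step 1 and the simple O(n²) FINDMIN version of Dijkstra for step 3, so it does not attain the O(n(m + n log n)) bound, and the 0-based labeling and all names are assumptions.

```python
import math

def johnson(n, arcs):
    """Johnson's all pairs shortest paths (sketch).

    n: number of nodes, labeled 0..n-1; arcs: list of (i, j, cost).
    Returns the n x n distance matrix, or None on a negative cycle.
    """
    # Steps 0-1: add node s (= n) with zero-cost arcs; run Bellman-Ford.
    ext = arcs + [(n, j, 0) for j in range(n)]
    d = [math.inf] * (n + 1)
    d[n] = 0
    changed = False
    for _ in range(n + 1):
        changed = False
        for i, j, c in ext:
            if d[i] + c < d[j]:
                d[j] = d[i] + c
                changed = True
    if changed:
        return None                           # negative cost cycle detected
    # Step 2: reweight; c'_ij = c_ij + d(i) - d(j) >= 0.
    adj = [[] for _ in range(n)]
    for i, j, c in arcs:
        adj[i].append((j, c + d[i] - d[j]))
    # Step 3: Dijkstra from every source, then undo the reweighting.
    dist = [[math.inf] * n for _ in range(n)]
    for s in range(n):
        dp = [math.inf] * n
        dp[s] = 0
        done = [False] * n
        for _ in range(n):
            i = min((v for v in range(n) if not done[v]),
                    key=lambda v: dp[v], default=None)
            if i is None or dp[i] == math.inf:
                break
            done[i] = True
            for j, c in adj[i]:
                dp[j] = min(dp[j], dp[i] + c)
        for t in range(n):
            dist[s][t] = dp[t] - d[s] + d[t]  # Claim 7.2: recover d_c
    return dist
```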

7.12 Matrix Multiplication Algorithm for All Pairs Shortest Paths

The algorithm

Consider a matrix D with dij = cij if there exists an arc from i to j, and dij = ∞ otherwise. Let dii = 0 for all i. Then we define the “dot” (min-plus) operation as

D(2)ij = (D ⊙ D)ij = min{di1 + d1j, . . . , din + dnj}.

D(2)ij represents the length of a shortest path from i to j with 2 or fewer edges.

D(3)ij = (D ⊙ D(2))ij = length of a shortest path with at most 3 edges from i to j
...
D(n)ij = (D ⊙ D(n−1))ij = length of a shortest path with at most n edges from i to j


Complexity Analysis
One matrix “multiplication” (dot operation) → O(n3).
n multiplications lead to a total complexity of O(n4).

By doubling, we can improve the complexity to O(n3 log2 n). Doubling is the evaluation of D(n) from the matrices whose superscripts are powers of two:

D(4) = D(2) ⊙ D(2)
D(8) = D(4) ⊙ D(4)
...
D(2^k) = D(2^(k−1)) ⊙ D(2^(k−1))

Stop once k is such that 2^k > n (i.e., we need only calculate up to k = ⌈log2 n⌉). Then D(n) can be found by “multiplying” the appropriate matrices (based on the binary representation of n) that have been calculated via doubling. Alternatively, we can compute D(2^k) for k = ⌈log2 n⌉; it equals D(n), since D(n) = D(n) ⊙ D (a shortest path may go through at most n nodes).
⇒ The complexity of the algorithm with doubling is O(n3 log2 n).
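The min-plus "dot" operation and repeated squaring can be sketched in Python. Since D has a zero diagonal, squaring ⌈log2 n⌉ times directly yields D(n); the sketch assumes no negative cost cycles and makes no attempt to detect them, and its names are hypothetical.

```python
import math

def min_plus(A, B):
    """One "dot" operation: (A ⊙ B)_ij = min_k (A_ik + B_kj)."""
    n = len(A)
    return [[min(A[i][k] + B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def all_pairs_by_doubling(n, cost):
    """All pairs shortest paths by repeated squaring, O(n^3 log n).

    n: number of nodes, labeled 0..n-1; cost: dict (i, j) -> arc cost.
    """
    D = [[0 if i == j else cost.get((i, j), math.inf) for j in range(n)]
         for i in range(n)]              # D(1): paths with <= 1 edge
    k = 1
    while k < n:                         # square until 2^k >= n
        D = min_plus(D, D)               # D(2k) from D(k)
        k *= 2
    return D
```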

Back-tracking: To find the paths, construct an ‘e’ matrix as for the Floyd-Warshall algorithm, where e(k)ij = p if

min(d(k−1)i1 + d(k−1)1j, . . . , d(k−1)in + d(k−1)nj) = d(k−1)ip + d(k−1)pj.

Then p is an intermediate node on a shortest path of length ≤ k from i to j.

Identifying negative cost cycles: This is left as a question to the class. Suppose there is a negative cost cycle. How is it detected if we only compute the powers-of-2 matrices? The problem is to show that if there is a negative cost cycle, then starting at some iteration ≤ n the distance labels of some nodes must go down at each iteration. What are these nodes?

Seidel’s algorithm for unit cost undirected graphs:
(1) Seidel has used the matrix multiplication approach for the all pairs shortest paths problem in an undirected graph. Seidel’s algorithm only works when the cost on all the arcs is 1. It uses actual matrix multiplication to find all pairs shortest paths. The running time of Seidel’s algorithm is O(M(n) log n), where M(n) is the time required to multiply two n × n matrices.
(2) Currently, the most efficient (in terms of asymptotic complexity) algorithm known for matrix multiplication is O(n2.376). However, a huge constant is hidden by the complexity notation, so the algorithm is not used in practice.

7.13 Why finding shortest paths in the presence of negative cost cycles is difficult

Suppose we were to apply a minimum cost flow formulation to solve the problem of finding ashortest path from node s to node t in a graph which contains negative cost cycles. We could give

Page 49: IEOR 266, Fall 2003 Graph Algorithms and Network Flowsjfpf/acm/Libros/Grafos/Graph... · IEOR266 notes, Prof. Hochbaum, 2003 2 70! = 1197857166996989179607278372168909 8736458938142546425857555362864628

IEOR266 notes, Prof. Hochbaum, 2003 46

Figure 22: Shortest Path Problem With Negative Cost Cycles (nodes 1, 2, 3, 4; arcs (1,2), (2,3), (3,4) of cost 1 and arc (4,2) of cost −10)

a capacity of 1 to the arcs in order to guarantee a bounded solution. This is not enough, however, to give us a correct solution. The problem is really in ensuring that we get a simple path as a solution.

Consider the network in Figure 22. Observe first that the shortest simple path from node 1 to node 2 is the arc (1, 2). If we formulate the problem of finding a shortest path from 1 to 2 as a MCNF problem, we get the following linear program:

  min x12 + x23 + x34 − 10x42
  s.t.  x12 = 1
        x23 − x12 − x42 = −1
        x34 − x23 = 0
        x42 − x34 = 0
        0 ≤ xij ≤ 1  ∀(i, j) ∈ A.

Recall that the interpretation of this formulation is that xij = 1 in an optimal solution means that arc (i, j) is on a shortest path from node 1 to node 2, and the path traced out by the unit of flow from node 1 to node 2 is a shortest path. In this case, however, the optimal solution to this LP is x12 = x23 = x34 = x42 = 1, which has value −7. Unfortunately, this solution does not correspond to a simple path from node 1 to node 2.

8 Minimum Cost Spanning Tree Problem - taught by Professor Atamturk

Scribed by Xiaoyan.
Given an undirected graph G = (V,E):

Definition: GF = (V, F), F ⊆ E, is a forest if GF is acyclic. A forest may contain several components; for example, Figure 23 is a forest.
Definition: GT = (V(T), T), T ⊆ E, is a tree if GT is a forest with a single component, where V(T) is the subset of V incident to T ⊆ E.
Definition: GT = (V(T), T), T ⊆ E, is a spanning tree if V(T) = V and GT is a tree. Figure 24 (solid line) is a tree.

Definition: The minimum cost spanning tree (MST) problem can be expressed as:
  min { Σ_{e∈T} Ce : (V(T), T) is a spanning tree },
where C : E → R is the cost function.

Applications of minimum cost spanning tree include:

1. Natural gas pipelines


Figure 23: Forest

Figure 24: Tree

2. Telecommunication networks

Mathematical Formulation for the MST Problem:
Notation: E(S) = {(i, j) ∈ E : i, j ∈ S}, i.e., the set of all edges for which both ends are in S.
δ(S) = {(i, j) ∈ E : i ∈ S, j ∉ S}, i.e., the cut set of the vertex set S.

Decision variables: Xe = 1 if e ∈ T, 0 if e ∉ T.

  min Σ_{e∈E} Ce Xe

Subject to:
  Xe ∈ {0, 1}, ∀ e ∈ E                  (17)
  Σ_{e∈E} Xe = n − 1                    (18)
  Σ_{e∈δ(S)} Xe ≥ 1, ∀ S ⊊ V           (19)
or                                       (20)
  Σ_{e∈E(S)} Xe ≤ |S| − 1, ∀ S ⊆ V

Observation: The MST problem is not a network flow problem, and the constraint set for the above formulation is feasible only when the graph is connected.

Theorem (Cut Optimality Condition): A spanning tree T is an MST if and only if for every e ∈ T, Ce ≤ Cf for all f in the cut formed by deleting e from T.


Proof:
(⇒) If Ce > Cf, then C(T′) < C(T) for T′ = (T \ {e}) ∪ {f}, which contradicts T being an MST. See Figure 25.

Figure 25: Cut Optimality Condition

(⇐) Let T′ be an MST, and let f ∈ T′ \ T be such that f is in the cut set of e and e is in the cut set of f. Since T′ is an MST, Cf ≤ Ce; combined with the given assumption Ce ≤ Cf, we derive Ce = Cf. Swap e and f in T′ to get another MST that differs from T by one edge fewer. Repeat until T′ = T. ∎

Prim’s Algorithm
  Initialize GT = (S, T) as ({1}, ∅).
  While GT is not a spanning tree:
    Add to T a min-cost edge from δ(S).
    Add to S the endpoints of this edge.
  End

Example: In Figure 26, using Prim’s algorithm we find that the MST is 1 → 2 → 4 → 3 → 5.

Its correctness follows from the cut optimality condition. The complexity of Prim’s algorithm is O(nm), and it can be further improved to O(m + n log n) if Fibonacci heaps are used. Here, m = |E| and n = |V|.
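A heap-based sketch of Prim's algorithm (our own Python illustration; the edge costs of Figure 26 are not fully recoverable from these notes, so the example graph below is hypothetical). With a binary heap this runs in O(m log n), between the O(nm) and O(m + n log n) bounds quoted above.

```python
import heapq

def prim(n, adj, root=0):
    """adj[v] = list of (cost, neighbor) pairs of an undirected graph.
    Returns (tree_edges, total_cost) of a minimum spanning tree."""
    in_tree = [False] * n
    in_tree[root] = True
    heap = [(c, root, v) for c, v in adj[root]]   # candidate edges of delta(S)
    heapq.heapify(heap)
    tree, total = [], 0
    while heap and len(tree) < n - 1:
        c, u, v = heapq.heappop(heap)             # min-cost edge leaving S
        if in_tree[v]:
            continue
        in_tree[v] = True                          # add the endpoint to S
        tree.append((u, v))
        total += c
        for c2, w in adj[v]:
            if not in_tree[w]:
                heapq.heappush(heap, (c2, v, w))
    return tree, total

# Hypothetical 4-node example.
adj = [[] for _ in range(4)]
for u, v, c in [(0, 1, 10), (1, 2, 20), (0, 2, 40), (2, 3, 30)]:
    adj[u].append((c, v))
    adj[v].append((c, u))
print(prim(4, adj)[1])  # 60: edges (0,1), (1,2), (2,3)
```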

Figure 26: Example of Prim’s Algorithm (a 5-node graph with edge costs 10, 15, 20, 25, 30, 35, 40)

Scribed by Jiang Yao:

Theorem 8.1 (Path Optimality Condition). A spanning tree T is an MST if and only if for each (i, k) ∉ T, C(i,k) ≥ Ce for all e in the unique (i, k)-path in T.

Proof. “⇒”: If C(i,k) < Ce for i, k, e as defined in the theorem, then C(T′) < C(T), where T′ = (T \ e) ∪ (i, k).


“⇐”: Path condition ⇒ cut condition ⇒ T is an MST.

Figure 27: Path optimality condition

8.1 Kruskal’s algorithm for MST.

8.1.1 Algorithm

Order E as e1, e2, . . . , em such that Ce1 ≤ Ce2 ≤ . . . ≤ Cem.
Initialize GT = (V, T) as (V, ∅).
For i = 1 to m:
  If the end nodes of ei are in different components of GT, then add ei to T.

Correctness of Kruskal’s algorithm follows immediately from the path optimality condition. The following is an example of Kruskal’s algorithm.

8.1.2 Complexity

1. Sorting takes O(m log m) = O(m log n), since m ≤ n^2.

2. Each iteration takes O(n); using a merge algorithm, O(n^2) in total can be reached.
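Kruskal's algorithm can be sketched with a union-find structure (our own illustration; the notes' "merge algorithm" is the same idea, and the sort dominates at O(m log m)).

```python
def kruskal(n, edges):
    """edges = list of (cost, u, v) of an undirected graph.
    Returns (tree_edges, total_cost) of a minimum spanning forest."""
    parent = list(range(n))

    def find(x):
        # Walk to the component representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree, total = [], 0
    for c, u, v in sorted(edges):          # O(m log m) sort
        ru, rv = find(u), find(v)
        if ru != rv:                       # end nodes in different components
            parent[ru] = rv                # merge the two components
            tree.append((u, v))
            total += c
    return tree, total

edges = [(10, 0, 1), (20, 1, 2), (40, 0, 2), (30, 2, 3)]
print(kruskal(4, edges)[1])  # 60
```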

8.2 Combinatorial issues

Recall the formulation:

  min Σ_{e∈E} Ce xe              (0)
  s.t.
    Σ_{e∈E} xe = n − 1           (1)
    Σ_{e∈E(S)} xe ≤ |S| − 1, ∀ S ⊂ V    (2)


Figure 28: An example of Kruskal’s algorithm (the original graph followed by iterations 1–7)


    xe ≥ 0, e ∈ E                (3)

where (3) replaces the constraint xe ∈ {0, 1}, due to the fact that the LP relaxation gives integer solutions.

Assume G = (V,E) is connected.

Theorem 8.2. (Edmonds): the linear program (0)-(3) solves the MST problem.

Proof. Rewrite (2) as
  Σ_{e∈A} xe ≤ n − κ(A), ∀ A ⊂ E    (2’)
where κ(A) is the number of components in the graph (V, A).

(2’) ⇒ (2): For S ⊆ V, let A = E(S). Then
  Σ_{e∈E(S)} xe = Σ_{e∈A} xe ≤ n − κ(A) = n − κ(E(S)) ≤ |S| − 1,
where the last inequality follows from κ(E(S)) ≥ n − |S| + 1.

(2) and (3) ⇒ (2’): Let A ⊆ E, and consider the components S1, S2, . . . , Sk of (V, A):

  Σ_{e∈A} xe ≤ Σ_{i=1}^{k} Σ_{e∈E(Si)} xe ≤ Σ_{i=1}^{k} (|Si| − 1) = n − k = n − κ(A),

where the first inequality follows from (3) and the second inequality follows from (2).

The new formulation is then:

  −max Σ_{e∈E} −Ce xe            (0)
  s.t.
    Σ_{e∈E} xe = n − 1           (1)
    Σ_{e∈A} xe ≤ n − κ(A), ∀ A ⊂ E    (2’)
    xe ≥ 0, ∀ e ∈ E              (3)

and its dual problem is:

  −min Σ_{A⊆E} (n − κ(A)) yA
  s.t.
    Σ_{A⊆E: e∈A} yA ≥ −Ce, ∀ e ∈ E    (4)
    yA ≥ 0, ∀ A ⊊ E              (5)


    yE free

where yE and yA are the dual variables of constraints (1) and (2’), respectively. Let us construct a dual solution as follows:

Let e1, e2, . . . , em be the ordered edges processed in Kruskal’s algorithm, and let Ri = {e1, e2, . . . , ei}, i = 1, 2, . . . , m. Set

  y_{Ri} = C_{e_{i+1}} − C_{e_i} ≥ 0, i = 1, 2, . . . , m − 1,
  (yE =) y_{Rm} = −C_{e_m}, and yA = 0 if A ≠ Ri for every i.

All of the constraints (4) are satisfied as equalities: for e = ei,

  Σ_{A⊆E: e∈A} yA = Σ_{j=i}^{m} y_{Rj} = Σ_{j=i}^{m−1} (C_{e_{j+1}} − C_{e_j}) − C_{e_m} = −C_{e_i} = −Ce.

Complementary slackness:

  (CS1) xe > 0 ⇒ Σ_{A⊆E: e∈A} yA = −Ce, e ∈ E
  (CS2) y_{Ri} > 0 ⇒ Σ_{e∈Ri} xe = n − κ(Ri), i = 1, 2, . . . , m

where (CS2) follows from

  n − (number of merges up to ei) = κ(Ri),

with the number of merges being Σ_{e∈Ri} xe.

Therefore, the solution provided by Kruskal’s algorithm solves the LP, as shown by the complementary slackness conditions.

9 Maximum Flow Problem

9.1 Introduction

See the handout titled Travellers Example: Seat Availability.

Definitions (assume for now that all lower bounds are 0):
– A feasible flow f is a flow (i.e., one obeying the flow balance constraints) that satisfies 0 ≤ fij ≤ uij, (i, j) ∈ A.
– A cut is a partition (S, T), S ⊆ V, s ∈ S, T = V − S, t ∈ T.
– Cut capacity: U(S, T) = Σ_{i∈S} Σ_{j∈T} uij.

For a flow f to be feasible, |f| ≤ U(S, T) for any cut (S, T), where |f| is the total amount of flow out of the source (or into the sink, or across any s-t cut). This inequality corresponds to the weak duality theorem of linear programming. The following theorem, which we will establish algorithmically, can be viewed as a special case of the strong duality theorem of linear programming.


Theorem 9.1 (Max Flow Min Cut). The value of a maximum flow is equal to the capacity of a cut with minimum capacity among all cuts.

Residual graph: The residual graph, Gf = (V, Af), with respect to a flow f, has the following arcs:

• forward arcs: (i, j) ∈ Af if (i, j) ∈ A and fij < uij. The residual capacity is u^f_{ij} = uij − fij. The residual capacity on the forward arc tells us how much we can increase the flow on the original arc (i, j) ∈ A.

• reverse arcs: (j, i) ∈ Af if (i, j) ∈ A and fij > 0. The residual capacity is u^f_{ji} = fij. The residual capacity on the reverse arc tells us how much we can decrease the flow on the original arc (i, j) ∈ A.

Intuitively, the residual graph tells us how much additional flow can be sent through the original graph with respect to the given flow. This brings us to the notion of an augmenting path. In the presence of lower bounds, ℓij ≤ fij ≤ uij, (i, j) ∈ A, the definition of a forward arc remains the same, whereas for a reverse arc, (j, i) ∈ Af if (i, j) ∈ A and fij > ℓij; the residual capacity is u^f_{ji} = fij − ℓij.

Augmenting path: An augmenting path is a path from s to t in the residual graph. The capacity of an augmenting path is the minimum residual capacity of the arcs on the path – the bottleneck capacity.

If we can find an augmenting path with capacity C in the residual graph, we can increase the flow in the original graph by adding C units of flow on the arcs of the original graph that correspond to forward arcs in the augmenting path, and subtracting C units on the arcs that correspond to reverse arcs. The capacity constraints are not violated, since C is the smallest residual capacity on the path. The flow balance constraints are not violated either: for every node on the augmenting path we either increase the flow by C on one incoming arc and one outgoing arc (when the path uses two forward arcs at the node), or increase the flow by C on one incoming arc and decrease it by C on another incoming arc (when the path enters through a forward arc and leaves through a reverse arc), or decrease the flow on one incoming and one outgoing arc (when the path uses two reverse arcs at the node).

This is summarized in the following algorithm for finding a maximum flow, also known as the Ford-Fulkerson algorithm.

The augmenting path algorithm (Ford-Fulkerson):

1. Start with a feasible flow (usually fij = 0 ∀(i, j) ∈ A ).

2. Construct the residual graph Gf with respect to the flow.

3. Search for an augmenting path by doing breadth-first search from s (we consider nodes to be adjacent if there is a positive-capacity arc between them in the residual graph) and checking whether the set of s-reachable nodes (call it S) contains t.
If S contains t, then there is an augmenting path (since we can get from s to t through a series of adjacent nodes), and we can increment the flow along the augmenting path by the smallest residual capacity among its arcs. We then update the residual graph (decreasing the residual capacities of the forward arcs on the path by the value of the added flow, and increasing the residual capacities of the reverse arcs by the same amount) and go back to the beginning of step 3.
If S does not contain t, then the flow is maximum.
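The steps above can be sketched as follows (our own Python illustration of the breadth-first variant, i.e., Edmonds-Karp; the dictionary-of-dictionaries representation is an implementation choice, not from the notes). The function also returns the set S of s-reachable nodes in the final residual graph, which is the source set of a minimum cut.

```python
from collections import deque

def max_flow(n, cap, s, t):
    """cap[u][v] = capacity of arc (u, v); nodes are 0..n-1.
    Returns (flow_value, S), where S is the source set of a min cut."""
    # Residual capacities, including zero-capacity reverse arcs.
    res = {u: dict(cap.get(u, {})) for u in range(n)}
    for u in range(n):
        for v in cap.get(u, {}):
            res[v].setdefault(u, 0)
    flow = 0
    while True:
        parent = {s: None}                 # step 3: BFS from s
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, r in res[u].items():
                if r > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:                # S does not contain t: done
            return flow, set(parent)
        # Recover the augmenting path and its bottleneck capacity.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        c = min(res[u][v] for u, v in path)
        for u, v in path:                  # update the residual graph
            res[u][v] -= c
            res[v][u] += c
        flow += c

# Small example: s = 0, t = 3.
cap = {0: {1: 3, 2: 2}, 1: {2: 1, 3: 2}, 2: {3: 3}}
print(max_flow(4, cap, 0, 3))  # (5, {0})
```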

To establish correctness we prove the following theorem, which includes the max-flow min-cut theorem.

Theorem 9.2 (Augmenting Path Theorem; a generalization of the max-flow min-cut theorem). The following conditions are equivalent:
1. f is a maximum flow.
2. There is no augmenting path for f.
3. |f| = U(S, T) for some cut (S, T).
(Reminder: fij is the flow from i to j, and |f| = Σ_{(s,i)∈A} fsi = Σ_{(i,t)∈A} fit = Σ_{u∈S,v∈T} fuv − Σ_{v∈T,u∈S} fvu for any cut (S, T).)

Proof:
[1 ⇒ 2] If there exists an augmenting path p, then we can increase the flow along p – a contradiction.
[2 ⇒ 3] Let Gf be the residual graph w.r.t. f, let S be the set of nodes reachable in Gf from s, and let T = V − S. Since s ∈ S and t ∈ T, (S, T) is a cut. For v ∈ S, w ∈ T, (v, w) ∉ Gf implies fvw = uvw and fwv = 0. Hence |f| = Σ_{v∈S,w∈T} fvw − Σ_{w∈T,v∈S} fwv = Σ_{v∈S,w∈T} uvw = U(S, T).
[3 ⇒ 1] Follows since |f| ≤ U(S, T) for any cut (S, T).

Note that the equivalence of conditions 1 and 3 gives the max-flow min-cut theorem.

Given the maximum flow (with all the flow values), a minimum cut can be found by looking at the residual network. The set S consists of s and all the nodes that s can reach in the final residual network, and the set T consists of all the other nodes. Since s cannot reach any of the nodes in T, any arc going from a node in S to a node in T in the original network must be saturated, which implies that this is a minimum cut. This cut is known as the minimal source set minimum cut. Another way to find a minimum cut is to let T consist of t and all the nodes that can reach t in the final residual network; this gives a maximal source set minimum cut, which differs from the minimal source set minimum cut when the minimum cut is not unique.

While finding a minimum cut given a maximum flow can be done in linear time, O(m), we have yet to discover an efficient way of finding a maximum flow given a list of the edges in a minimum cut, other than simply solving the maximum flow problem from scratch. Also, we have yet to discover a way of finding a minimum cut without first finding a maximum flow. Since the minimum cut problem asks for less information than the maximum flow problem, it seems as if we should be able to solve the former more efficiently than the latter. More on this in Section 9.2.

9.2 Linear Programming duality and Max flow Min cut

Given G = (N, A) and capacities uij for all (i, j) ∈ A, consider the formulation of the maximum flow problem on a general graph. Let xij be a variable


denoting the flow on arc (i, j).

  Max xts
  subject to
    Σ_i xki − Σ_j xjk = 0, k ∈ V
    0 ≤ xij ≤ uij, ∀(i, j) ∈ A.

For dual variables, we let zij be the nonnegative dual variables associated with the capacity upper-bound constraints, and λi the variables associated with the flow balance constraints.

  Min Σ_{(i,j)∈A} uij zij
  subject to
    zij − λi + λj ≥ 0, ∀(i, j) ∈ A
    λs − λt ≥ 1
    zij ≥ 0, ∀(i, j) ∈ A.

The dual problem has an infinite number of solutions: if (λ∗, z∗) is an optimal solution, then so is (λ∗ + C, z∗) for any constant C. To avoid that, we set λt = 0 (or to any other arbitrary value). Observe now that with this assignment there is an optimal solution with λs = 1 and a partition of the nodes into two sets: S = {i ∈ V | λi = 1} and S̄ = {i ∈ V | λi = 0}.

The complementary slackness conditions state that the primal and dual optimal solutions x∗, λ∗, z∗ satisfy

  x∗ij · [z∗ij − λ∗i + λ∗j] = 0
  [uij − x∗ij] · z∗ij = 0.

In an optimal solution z∗ij − λ∗i + λ∗j = 0, so the first set of complementary slackness conditions does not provide any information on the primal variables xij. As for the second set, z∗ij = 0 on all arcs other than the arcs in the cut (S, S̄). So we can conclude that the cut arcs are saturated, but we derive no further information on the flow on the other arcs.

The only method known to date for solving the minimum cut problem requires finding a maximum flow first, and then recovering the cut partition by finding the set of nodes reachable from the source in the residual graph (or reachable from the sink in the reverse residual graph). That set is the source set of the cut, and the recovery can be done in time linear in the number of arcs, O(m). On the other hand, if we are given a minimum cut, there is no efficient way of recovering the flow values on each arc other than essentially solving from scratch. The only information given by the minimum cut is the value of the maximum flow and the fact that the arcs on the cut are saturated. Beyond that, the flows have to be calculated with the same complexity as would be required without the knowledge of the minimum cut.

This asymmetry implies that it may be easier to solve the minimum cut problem than to solve the maximum flow problem. Still, no such minimum cut algorithm has ever been discovered: every so-called minimum cut algorithm known computes a maximum flow first.

9.3 Hall’s Theorem

Hall discovered this theorem about a necessary and sufficient condition for the existence of a perfect matching independently of the max flow min cut theorem. The max flow min cut theorem can, however, be used as a quick alternative proof of Hall’s theorem:


Figure 29: Graph for Hall’s Theorem (source s joined to every node of U by a capacity-1 arc, every node of V joined to sink t by a capacity-1 arc, and the edges of B directed from U to V with capacity ∞)

Theorem 9.3 (Hall). Given a bipartite graph B = (U, V; E) with |U| = |V| = n, there is a perfect matching in B iff for all X ⊆ U, |Γ(X)| ≥ |X|, where Γ(X) ⊆ V is the set of neighbors of X.

Proof: We first construct a flow graph by adding a source and a sink, arcs directed from the source to all nodes of U with capacity 1, and arcs directed from V to the sink, all with capacity 1. Direct all edges of the graph from U to V with capacity ∞ (see Figure 29).

[⇐] Assume that |Γ(X)| ≥ |X| for all X ⊆ U. We need to show that there exists a perfect matching. Consider a finite cut (S, T) in the flow graph defined above. Let X = U ∩ S; then Γ(X) ⊆ S ∩ V, or else the cut is not finite. Since |Γ(X)| ≥ |X|, the capacity of the cut satisfies

  U(S, T) ≥ |Γ(X)| + n − |X| ≥ n.

Now, for (S, T) a minimum cut,
  n ≤ U(S, T) = max flow value = size of matching ≤ n.
It follows that all the inequalities must be satisfied as equalities, and the matching size is n.

[⇒] If there is a perfect matching, then for every X ⊆ U the set of its neighbors Γ(X) contains at least all the nodes matched with X. Thus |Γ(X)| ≥ |X|.

Define: Discrepancy D = −min_{X⊆U} (|Γ(X)| − |X|) = max_{X⊆U} (|X| − |Γ(X)|).

Lemma 9.1. For a bipartite graph with discrepancy D, there is a matching of size n−D

The correctness of the lemma follows since the minimum cut is of size n − D; that is, D = n − (minimum cut capacity).
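On small instances, Hall's condition and the matching view can be checked against each other. A sketch (our own; Kuhn's augmenting-path method stands in for the max flow computation on the unit-capacity network of Figure 29):

```python
from itertools import combinations

def max_matching(n, nbrs):
    """nbrs[u] = list of V-neighbors of U-node u; returns max matching size."""
    match = [-1] * n                      # match[v] = U-node matched to v

    def augment(u, seen):
        # Try to match u, possibly re-routing previously matched U-nodes.
        for v in nbrs[u]:
            if v not in seen:
                seen.add(v)
                if match[v] == -1 or augment(match[v], seen):
                    match[v] = u
                    return True
        return False

    return sum(augment(u, set()) for u in range(n))

def hall_condition(n, nbrs):
    """Brute-force check that |Gamma(X)| >= |X| for every X subset of U."""
    return all(len(set().union(*(nbrs[u] for u in X))) >= k
               for k in range(1, n + 1)
               for X in combinations(range(n), k))

good = [[0, 1], [0], [1, 2]]     # a perfect matching exists
bad = [[0], [0], [0, 1]]         # X = {u0, u1} has Gamma(X) = {0}
print(max_matching(3, good), hall_condition(3, good))  # 3 True
print(max_matching(3, bad), hall_condition(3, bad))    # 2 False
```

Note that the second example has discrepancy D = 1 and, consistent with Lemma 9.1, a maximum matching of size n − D = 2.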


9.4 Amazing Applications of Minimum Cut

9.4.1 Selection Problem

Suppose you are going on a hiking trip and need to decide what to take with you. Many individual items are useless unless you take something else with them. For example, taking a bottle of wine without an opener does not make much sense. Also, if you plan to eat soup there, you might want to take a set of several items: canned soup, an opener, a spoon and a plate. The following is the formal definition of the problem.

Items 1, . . . , n each have an associated cost ci. Sets of items Si ⊆ {1, . . . , n}, i = 1, . . . , m, have benefits bi. Maximize the net benefit = total benefit − total cost of the items selected. A “selection” corresponds to a collection of sets Sj, j ∈ J; the selected items are ∪_{j∈J} Sj.

Formulation:

  vj = 1 if item j is included, 0 otherwise
  ui = 1 if set i is selected, 0 otherwise

The objective is to choose sets Sj so as to maximize

  Σ_i bi ui − Σ_{j∈∪Sj} cj vj.

The Selection Problem can be solved by finding a minimum s-t cut in an appropriate network. Create a bipartite graph with the sets on one side and the items on the other. Add arc (i, j) with capacity infinity if item j ∈ Si. Add a source and a sink. Add arcs from s to each set node i with capacity bi, and from each item node j to t with capacity cj.

Since the arcs (i, j) have capacity infinity, a feasible selection corresponds to a finite cut: the inclusion of a set in the source set implies that all of its elements are in the source set. Minimizing the cut capacity will be equivalent to maximizing the net benefit.

Net benefit of selection = Σ_{set i∈S} bi − Σ_{item j∈S} cj.

  Capacity of cut (S, T) = Σ_{set i∈T} bi + Σ_{item j∈S} cj
    = Σ_{set i∈V} bi − Σ_{set i∈S} bi + Σ_{item j∈S} cj
    = B − (net benefit of the corresponding selection),

where B = Σ_{i=1}^{m} bi.

Since we are seeking a minimum cut, we want to minimize B − (net benefit of selection). B is a constant, hence the above is equivalent to B + min(− net benefit). But minimizing (− net benefit) is the same as maximizing (+ net benefit). Thus, we maximize the benefit of the selection.


9.4.2 The maximum closure problem

Given a directed graph G = (V, A), a subset of the nodes D ⊂ V is called closed if all successors of nodes of D are contained in D; equivalently, the set of nodes reachable from D is D. The maximum closure problem is:

Problem Name: Maximum closure

Instance: A directed graph G = (V, A), and node weights (positive or negative) bi for all i ∈ V.

Optimization Problem: Find a closed subset of nodes V′ ⊆ V such that Σ_{i∈V′} bi is maximum.

The formulation of the maximum closure problem: let xj be a binary variable that is 1 if node j is in the closure, and 0 otherwise; bj is the weight of the node, or the net benefit derived from the corresponding block.

  Max Σ_{j∈V} bj xj
  subject to  xj − xi ≥ 0, ∀ (i, j) ∈ A
              0 ≤ xj ≤ 1 integer, j ∈ V.

Each row of the constraint matrix has one 1 and one −1, a structure indicating that the matrix is totally unimodular. More specifically, it indicates that the problem is the dual of a flow problem.

Johnson [Jo1] appears to be the first researcher who demonstrated that the max closure problem is equivalent to the selection problem (max closure on bipartite graphs), and that the selection problem is solvable by a max flow algorithm. Picard [Pic1] demonstrated that a minimum cut algorithm on a related graph solves the maximum closure problem. The related graph is constructed by adding a source node s and a sink node t: V̄ = V ∪ {s, t}. Let V+ = {j ∈ V | bj > 0} and V− = {j ∈ V | bj < 0}. The set of arcs in the related graph, Ā, is the set A appended by the arcs {(s, v) | v ∈ V+} ∪ {(v, t) | v ∈ V−}. The capacity of all arcs in A is set to ∞, and the capacity of each arc adjacent to the source or sink is |bv|: c(s, v) = bv for v ∈ V+ and c(v, t) = −bv for v ∈ V−. In this graph the source set of a minimum cut separating s from t is also a maximum closure in the graph. The proof of this fact is repeated here for completeness.

The source set is obviously closed, as the cut must be finite and thus cannot include any arcs of A. Let (S, S̄) be a finite cut in the graph Ḡ = (V̄, Ā). The capacity of the cut is

  C(S, S̄) = Σ_{j∈V−∩S} |bj| + Σ_{j∈V+∩S̄} bj
           = Σ_{j∈V−∩S} |bj| + Σ_{j∈V+} bj − Σ_{j∈V+∩S} bj
           = B − Σ_{j∈S} bj,

where B, the sum of all positive weights, is fixed. Hence minimizing the cut capacity is equivalent to maximizing the total sum of the weights of the nodes in the source set of the cut. We refer to the related graph also as the closure graph, which is any graph with a source and a sink in which all finite-capacity arcs are adjacent only to the source or the sink. An example of a closure graph related to a maximum closure problem, where the source set of the minimum cut is the maximum closed set, is depicted in Figure 30.
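Picard's construction can be sketched end to end (our own self-contained Python illustration; a BFS augmenting-path max flow is chosen for brevity, and a large constant stands in for the infinite capacities).

```python
from collections import deque

INF = 10**9   # stands in for the infinite capacity on arcs of A

def max_closure(weights, arcs):
    """weights[i] = b_i; arcs = list of precedence arcs (i, j).
    Returns the maximum-weight closed set of nodes, via a min s-t cut."""
    n = len(weights)
    s, t = n, n + 1
    cap = {u: {} for u in range(n + 2)}
    for i, j in arcs:
        cap[i][j] = INF                    # arcs of A: infinite capacity
    for v, b in enumerate(weights):
        if b > 0:
            cap[s][v] = b                  # c(s, v) = b_v for v in V+
        elif b < 0:
            cap[v][t] = -b                 # c(v, t) = -b_v for v in V-
    res = {u: dict(cap[u]) for u in cap}   # residual graph
    for u in cap:
        for v in cap[u]:
            res[v].setdefault(u, 0)
    while True:
        parent = {s: None}                 # BFS for an augmenting path
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, r in res[u].items():
                if r > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return set(parent) - {s}       # source set of the min cut
        v, c = t, INF
        while parent[v] is not None:       # bottleneck capacity
            c = min(c, res[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:       # augment along the path
            res[parent[v]][v] -= c
            res[v][parent[v]] += c
            v = parent[v]

# Node 0 (weight 3) requires node 1 (weight -1); node 2 (weight -2) is free.
print(max_closure([3, -1, -2], [(0, 1)]))  # {0, 1}, total weight 2
```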


Figure 30: A maximum closure and min cut in closure graphs

9.4.3 The open-pit mining problem

Open-pit mining is a surface mining operation in which blocks of earth are extracted from the surface to retrieve the ore contained in them. During the mining process, the surface of the land is continuously excavated, and a deeper and deeper pit is formed until the operation terminates. The final contour of this pit mine is determined before the mining operation begins. To design the optimal pit – one that maximizes profit – the entire area is divided into blocks, and the value of the ore in each block is estimated using geological information obtained from drill cores. Each block has a weight associated with it, representing the value of the ore in it minus the cost involved in removing the block. While trying to maximize the total weight of the blocks to be extracted, there are also contour constraints that have to be observed. These constraints specify the slope requirements of the pit and precedence constraints that prevent blocks from being mined before others on top of them. Subject to these constraints, the objective is to mine the most profitable set of blocks.

The problem can be represented on a directed graph G = (V, A). Each block i corresponds to a node with a weight bi representing the net value of the individual block. The net value is computed as the assessed value of the ore in that block, from which the cost of extracting that block alone is deducted. There is a directed arc from node i to node j if block i cannot be extracted before block j, which is on a layer right above block i. This precedence relationship is determined by the engineering slope requirements. Suppose block i cannot be extracted before block j, and block j cannot be extracted before block k; by transitivity this implies that block i cannot be extracted before block k. We choose in this presentation not to include the arc from i to k in the graph; the existence of a directed path from i to k implies the precedence relation. Including only arcs between immediate predecessors reduces the total number of arcs in the graph. Thus, deciding which blocks to extract in order to maximize profit is equivalent to finding a maximum-weight set of nodes in the graph such that all successors of all nodes in the set are included in the set. This can be solved as the maximum closure problem in G described previously. Finding a subset of nodes within a graph such that their associated density is maximum is another interesting problem that can also be solved efficiently as a maximum closure problem.


9.4.4 Vertex Cover on Bipartite Graphs

Given G = (V, E), each node i has a weight wi associated with it. The vertex cover problem is to find C ⊆ V such that every edge (i, j) ∈ E has i ∈ C or j ∈ C, and Σ_{i∈C} wi is minimum.

The complement of a vertex cover C is an independent set (no two vertices in V \ C are adjacent): if two adjacent vertices were both outside C, the edge between them would not be covered by any vertex in C.

9.4.5 LP Formulation of Minimum Vertex Cover

G = (V, E) with vertex weights wi ≥ 0, i ∈ V.

  xi = 1 if i ∈ cover, 0 otherwise

  min Σ_{i∈V} wi xi
  s.t.  xi + xj ≥ 1, ∀ (i, j) ∈ E
        xi ≥ 0, integer

“Vertex Cover” on general graphs is an NP-complete problem, but on bipartite graphs it is polynomial. To see this, observe that for bipartite graphs the set of columns of the constraint matrix can be partitioned so that there is at most one 1 in each side of the partition. That proves that the matrix is totally unimodular. Moreover, we can replace the variables for one side of the bipartition by the negatives of those variables; each constraint then has one 1 and one −1, which is the signature of the dual of a maximum flow – or minimum cut – problem.

9.4.6 Solving the vertex cover problem on bipartite graphs

Let B = (V1, V2, E), where each node v has a weight wv associated with it. The goal is to find a vertex cover with minimum total weight. We construct a bipartite flow network as follows: add a source s and a sink t, and let wi be the capacity of the arc (s, i) if i ∈ V1, and of the arc (i, t) if i ∈ V2; each edge of E is directed from V1 to V2 with infinite capacity. This is demonstrated in Figure 31.

Let (S, T) be a finite cut. The following claim shows that it corresponds to a vertex cover.

Claim 9.1. For a finite cut (S, T), the set (V1 ∩ T) ∪ (V2 ∩ S) is a vertex cover of weight equal to the cut capacity.

Proof: If it were not a vertex cover, there would exist an edge (u, v) with u ∈ V1 ∩ S and v ∈ V2 ∩ T; but then (u, v) is in the cut, violating its finiteness.

Therefore, minimizing the capacity of the cut provides a minimum-weight vertex cover. Also, (S ∩ V1) ∪ (T ∩ V2) is an independent set, and

  min(value of the finite cut) + max(weight of the independent set) = Σ_{i∈V1} wi + Σ_{i∈V2} wi = constant.

The opposite direction works as well: if for X1 ⊆ V1 and X2 ⊆ V2 the set X1 ∪ X2 is a vertex cover, then ((V1 \ X1) ∪ X2, X1 ∪ (V2 \ X2)) (with s on the source side and t on the sink side) is a finite cut. This is so since there are no edges between V1 \ X1 and V2 \ X2, which are the only arcs that could make the cut infinite.


IEOR266 notes, Prof. Hochbaum, 2003 61

[Figure content: source s, sink t, and the cut (S, T); an arc of E crossing from S to T could not happen, else the cut would be of infinite capacity.]

Figure 31: Graph for bipartite vertex cover

9.4.7 Solving the LP relaxation of the non-bipartite vertex cover

The LP-relaxation of the non-bipartite vertex cover problem can be solved by finding an optimal cover in a bipartite graph with two vertices for each vertex in the original graph, and two edges for each edge in the original graph (see Figure 32). In the bipartite graph a vertex cover may be identified from the solution of a corresponding minimum cut problem. Specifically, as suggested by Edmonds and Pulleyblank and noted in [NT75], the LP-relaxation can be solved by finding an optimal cover C in the bipartite graph with two vertices aj , bj of weight wj for each vertex j of G, and two edges (ai, bj), (aj , bi) for each edge (i, j) of G; then it suffices to set

xj = 1 if aj , bj ∈ C,
xj = 1/2 if exactly one of aj , bj ∈ C,
xj = 0 if aj , bj ∉ C.

In turn, the problem of finding C can be reduced to a minimum cut problem: the bipartite graph is converted into a network by making each edge (ai, bj) into a directed arc (ai, bj) of infinite capacity, adding a source s with an arc (s, ai) of capacity wi for each i, and adding a sink t with an arc (bj , t) of capacity wj for each j. Now a minimum cut (S, T) with s ∈ S and t ∈ T points out the desired C: it suffices to set ai ∈ C iff ai ∈ T, and bj ∈ C iff bj ∈ S. A description of the resulting graph is given in Figure 32. The minimum cut can be found by efficient algorithms for maximum flow on (bipartite) graphs.

For instance, if one uses Goldberg and Tarjan’s algorithm [GT88], it takes only O(mn log(n²/m)) steps to preprocess G with n vertices and m edges by partitioning the set of vertices into P, Q and R. This algorithm is a special case of the algorithm used to find the half-integer solution for IP2, where min cut is also used for preprocessing (see the approximation algorithms book, Chapter 3).

We now present an alternative method of reducing the LP-relaxation to a flow problem that is illuminating as to why min cut solves this problem. Consider the LP-relaxation of the vertex cover problem (VCR),

Min ∑_{j=1}^{n} wj xj
subject to xi + xj ≥ 1 (for every edge (i, j) in the graph)
           0 ≤ xj ≤ 1 (j = 1, . . . , n).



Figure 32: The graph constructed to solve the LP relaxation of the vertex cover problem

Replace each variable xj by two variables, x+j and x−j , and each inequality by two inequalities:

x+i − x−j ≥ 1

−x−i + x+j ≥ 1 .

The two inequalities have one 1 and one −1 each, and thus correspond to a dual of a network flow problem. The upper and lower bound constraints are transformed to

0 ≤ x+j ≤ 1

−1 ≤ x−j ≤ 0 .

In the objective function, the variable xj is substituted by (1/2)(x+j − x−j).

The resulting constraint matrix of the new problem is totally unimodular. Hence the linear programming (optimal basic) solution is integer, and in particular can be obtained using a minimum cut algorithm. When the original variables are recovered, they are integer multiples of 1/2.
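The doubling construction of [NT75] described above can be sketched end to end. In this illustration (names and data layout are ours; the max flow inside is a plain BFS augmenting-path routine, not the [GT88] algorithm) we build the aj, bj graph, find a min cut, and recover the half-integral x values with the rule ai ∈ C iff ai ∈ T, bj ∈ C iff bj ∈ S.

```python
from collections import defaultdict, deque

def half_integral_vc(weights, edges):
    """Half-integral optimal solution of the vertex-cover LP relaxation,
    via the bipartite doubling: vertices a_j, b_j of weight w_j and arcs
    (a_i, b_j), (a_j, b_i) for each edge (i, j), solved by min cut.
    weights: dict j -> w_j; edges: iterable of (i, j).
    Returns dict j -> x_j in {0, 1/2, 1}.
    """
    INF = float('inf')
    cap = defaultdict(lambda: defaultdict(float))   # residual capacities
    for j, w in weights.items():
        cap['s'][('a', j)] = w                      # arc (s, a_j), capacity w_j
        cap[('b', j)]['t'] = w                      # arc (b_j, t), capacity w_j
    for i, j in edges:
        cap[('a', i)][('b', j)] = INF               # the two doubled edges
        cap[('a', j)][('b', i)] = INF

    def bfs_parents():
        parent = {'s': None}
        q = deque(['s'])
        while q:
            x = q.popleft()
            for y, c in cap[x].items():
                if c > 0 and y not in parent:
                    parent[y] = x
                    q.append(y)
        return parent

    while True:                                     # BFS augmenting paths
        parent = bfs_parents()
        if 't' not in parent:
            break
        path, x = [], 't'
        while x != 's':
            path.append((parent[x], x))
            x = parent[x]
        d = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= d
            cap[v][u] += d

    S = set(bfs_parents())                          # source side of min cut
    sol = {}
    for j in weights:
        # a_j ∈ C iff a_j ∈ T; b_j ∈ C iff b_j ∈ S; x_j = |{a_j,b_j} ∩ C| / 2
        in_c = (('a', j) not in S) + (('b', j) in S)
        sol[j] = in_c / 2
    return sol
```

On a triangle with unit weights the routine returns x = 1/2 for all three vertices, the familiar half-integral LP optimum on an odd cycle.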

Remark: When the problem is unweighted, the network flow that solves the LP relaxation is defined on simple networks. These are networks with all arcs of capacity 1, in which every node has either one incoming arc or one outgoing arc. In this case, the arcs in the bipartition of the type (ai, bj) can be assigned capacity 1 instead of ∞. For simple networks, Dinic’s algorithm for maximum flow works in O(√n · m) time – a significant improvement in running time.

9.4.8 Forest Clearing

The Forest Clearing problem is a real, practical problem. We are given a forest which needs to be selectively cut. The forest is divided into squares, where every square has a different kind of trees. Therefore, the value (or benefit from cutting) is different for each such square. The goal is to find the most beneficial way of cutting the forest while preserving the regulatory constraints.


Version 1. One of the constraints in forest clearing is preservation of deer habitat. For example, deer like to eat in big open areas, but they like to have trees around the clearing so they can run for cover if a predator comes along. If they don’t feel protected, they won’t go into the clearing to eat. So, the constraint in this version of Forest Clearing is the following: no two adjacent squares can be cut. Adjacent squares are two squares which share a side, i.e., are adjacent either vertically or horizontally.

Solution. Make a grid-graph G = (V, E). For every square of forest make a node v. Assign every node v a weight wv equal to the benefit gained from cutting the corresponding square. Connect two nodes if their corresponding squares are adjacent. The resulting graph is bipartite: we can color it in a chessboard fashion in two colors, so that no two adjacent nodes have the same color. Now, the above Forest Clearing problem is equivalent to finding a max-weight independent set in the grid-graph G. In the last class it was shown that the Maximum Independent Set problem is equivalent to the Minimum Vertex Cover problem. Therefore, we can solve our problem by solving the weighted vertex cover problem on G, by finding a min cut in the corresponding network. This problem is solved in polynomial time.

Version 2. Suppose deer can see if the diagonal squares are cut. Then the new constraint is the following: no two squares adjacent vertically, horizontally or diagonally can be cut. We can build a grid-graph in a similar fashion as in Version 1. However, it will not be bipartite anymore, since it will have odd cycles, so the above solution is not applicable in this case. Moreover, the problem of finding a max-weight independent set on such a grid-graph with diagonals is proven to be NP-complete.

Another variation of the Forest Clearing problem is when the forest itself is not of rectangular shape. Also, it might have “holes” in it, such as lakes. In this case the problem is also NP-complete.

The checkerboard pattern of squares permitted to be cleared.

Because there are no odd cycles in the above grid graph of Version 1, it is bipartite and the aforementioned approach works. Nevertheless, notice that this method breaks down for graphs where cells are neighbors also if they are adjacent diagonally. An example of such a graph is given below.


Such graphs are no longer bipartite. In this case, the problem has indeed been proven NP-complete in the context of producing memory chips (next application).

9.4.9 Producing Memory Chips (VLSI Layout)

We are given a silicon wafer of 64K RAM VLSI chips arranged in a grid-like fashion. Some chips are damaged; those are marked by a little dot on top and cannot be used. The objective is to make as many 256K chips as possible out of the good 64K chips. A valid 256K chip is one in which 4 little 64K chips form a square (i.e., they have to be adjacent to each other in a square fashion). Create a grid-graph G = (V, E) in the following manner: place a node in the middle of every possible 256K chip. (The adjacent 256K chips will overlap.) Similarly to Version 2 of the Forest Clearing problem, put an edge between every two nodes adjacent vertically, horizontally or diagonally. Now, to solve the problem we need to find a maximum independent set in G. This problem is proven to be NP-complete.


9.5 Flow Decomposition

Every flow can be decomposed into “primitive elements”, where a primitive element is either:
1. a path P from s to t with flow δ;
2. a cycle Γ with flow δ.

Theorem 9.4 (Flow Decomposition Theorem). Any flow f can be decomposed into no more than m primitive elements.

Proof. (Constructive)
Define Af = {(u, v) ∈ E : fuv > 0}.
While the flow from s to t is positive, do:
    Begin DFS in Af from s. Either t is reached along a simple path, or a cycle is found in Af .
    1. If a simple path P is found (one that does not contain a cycle), set δ = min_{(u,v)∈P} fuv. Then (P, δ) is a primitive element. Update the flow by setting fuv ← fuv − δ for (u, v) ∈ P , and update Af ; the number of arcs in Af is reduced by at least one.
    2. If a cycle Γ is found, set δ = min_{(u,v)∈Γ} fuv. Update fuv ← fuv − δ for (u, v) ∈ Γ, and update Af .
If there is no flow from s to t and Af = ∅, then we are done.
If |f | = 0 but Af ≠ ∅, continue by picking an arc of Af , say (v, w); by flow conservation there must be one. Follow arcs of Af until a cycle is found; the first time a node repeats, the cycle is set as Γ and the value of δ is determined as before.

⇒ There are at most |A| = m primitive elements.

An interesting question is the complexity of the algorithm that generates the primitive elements, and how it compares to maximum flow. We now show that the overall complexity is O(mn). Assume that for each node we have its adjacency list – the outgoing arcs. Follow a path from s in a DFS manner, visiting one arc at a time. If a node is repeated (visited twice) then we have found a cycle. We trace it back, find its bottleneck capacity, and record the primitive element. The bottleneck arcs are removed, and the adjacency lists affected are updated. This entire operation is done in O(n), as we visit at most n nodes. If a node is not repeated, then after at most n steps we reach the sink t. The path followed is then a primitive path with flow equal to its bottleneck capacity. We record it and update the flows and adjacency lists in O(n) time. Since there are O(m) primitive elements, all are found in O(mn) time. Notice that this is (pretty much) faster than any known maximum flow algorithm.
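The constructive proof above can be sketched directly. The helper below is our own illustration (for clarity it scans the arc dictionary rather than maintaining adjacency lists, so it does not attain the O(mn) bound): it repeatedly walks along positive-flow arcs, peeling off a path when t is reached or a cycle when a node repeats.

```python
def decompose(flow, s, t):
    """Decompose a flow into at most m primitive elements (paths/cycles).

    flow: dict mapping arcs (u, v) to positive flow values.
    Returns a list of (kind, nodes, delta), kind in {'path', 'cycle'}.
    """
    f = {a: x for a, x in flow.items() if x > 0}

    def succ(u):
        # any arc leaving u that still carries positive flow
        for (a, b), x in f.items():
            if a == u and x > 0:
                return b
        return None

    elements = []
    while f:
        # start from s while it still has outgoing flow, else from any arc tail
        start = s if succ(s) is not None else next(iter(f))[0]
        seen, nodes = {start: 0}, [start]
        while True:
            u = nodes[-1]
            if u == t and start == s:          # reached the sink: a path
                kind, walk = 'path', nodes
                break
            v = succ(u)
            assert v is not None, "flow conservation violated"
            if v in seen:                      # node repeated: a cycle
                kind, walk = 'cycle', nodes[seen[v]:] + [v]
                break
            seen[v] = len(nodes)
            nodes.append(v)
        arcs = list(zip(walk, walk[1:]))
        delta = min(f[a] for a in arcs)        # bottleneck flow of the element
        for a in arcs:                         # peel the element off
            f[a] -= delta
            if f[a] == 0:
                del f[a]
        elements.append((kind, walk, delta))
    return elements
```

Each element removal deletes at least one arc from f, which is the reason for the “at most m elements” bound in the theorem.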

9.6 Maximum Flow Algorithms

9.6.1 Ford-Fulkerson Algorithm

A maximum flow algorithm implied by the augmenting path theorem:


Initialize: f = 0, done := ”no”;
While done = ”no” do
    g := 0;
    Construct the layered network with respect to flow f , AN(f);
    if t is not reachable from s in AN(f) then done := ”yes”;
    else
        repeat
            while there is a node v with thru(v) = 0 do
                if v = s or v = t then go to increment
                else delete v and all incident arcs from AN(f);
            End Do
            let v be the node in AN(f) for which thru(v) is the smallest (nonzero);
            push thru(v) from v all the way to t;
            pull thru(v) from s to v;
        End Do
increment: f ← f + g;
End Do

Theorem 9.5. For integer capacities, the augmenting path algorithm finds a maximum flow in time O(mnU), where U = max_{(v,w)∈A} uvw.

Proof. The value of the flow increases at each iteration by at least one unit (this is so since the capacities are integer). Since there are at most n − 1 arcs in the cut ({s}, V − {s}), |f | ≤ nU. Each iteration takes O(m) time (BFS), for a total of O(mnU). The result also applies to rational capacities, as we can scale to convert the capacities to integer values.

Drawbacks of Ford-Fulkerson algorithm:

1. The running time is not polynomial. The complexity is exponential since it depends on U .

2. The Ford-Fulkerson algorithm does not specify which augmenting path to pick. Given the graph in Figure 33, the algorithm could take 4000 iterations for a problem with maximum capacity of 2000.

3. Another drawback is that for continuous data, this algorithm may converge to the wrong value (see Papadimitriou and Steiglitz, pp. 126–128).

Improvements for enhancing the performance of the Ford-Fulkerson algorithm could include:

1. using the concept of capacity scaling to cope with the nonpolynomial running time;

2. picking maximum capacity augmenting path;

3. picking an augmenting path with the shortest length (in number of edges).

Each alternative leads to different types of max-flow algorithm.
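Drawback 2 can be demonstrated concretely. The sketch below (our own construction) hard-codes the Figure 33 graph with outer capacities U and a middle arc of capacity 1, and adversarially always augments through the middle arc first; with U = 2000 it performs 4000 unit augmentations.

```python
def worst_case_iterations(U):
    """Count augmenting-path iterations on the Figure 33 graph when the
    path through the capacity-1 middle arc is always chosen first."""
    cap = {('s', '1'): U, ('s', '2'): U, ('1', 't'): U, ('2', 't'): U,
           ('1', '2'): 1}
    r = dict(cap)                       # residual capacities
    for (u, v) in cap:
        r.setdefault((v, u), 0)         # reverse (residual) arcs
    # the two zigzag paths through the middle arc, each moving 1 unit
    paths = [['s', '1', '2', 't'], ['s', '2', '1', 't']]
    count = 0
    while True:
        advanced = False
        for p in paths:
            arcs = list(zip(p, p[1:]))
            d = min(r[a] for a in arcs)        # bottleneck of this path
            if d > 0:
                for (u, v) in arcs:
                    r[(u, v)] -= d
                    r[(v, u)] += d
                count += 1
                advanced = True
                break                           # re-pick adversarially
        if not advanced:
            return count
```

Every augmentation moves exactly one unit (the middle arc flips direction in the residual graph each time), so reaching the maximum flow of 2U takes 2U iterations.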


[Figure content: arcs (s, 1), (s, 2), (1, t), (2, t) of capacity 2000, and a middle arc between nodes 1 and 2 of capacity 1.]

Figure 33: Graph leading to long running time for Ford-Fulkerson algorithm.

9.6.2 Capacity Scaling Algorithm

Recall: At each iteration i, consider the network Pi with capacities restricted to the i most significant digits. The number of significant digits of U is n := ⌊log2 U⌋ + 1; the k most significant bits of Uij are ⌊Uij/2^(n−k)⌋.

Algorithm

fij := 0;
Consider P0 and apply the augmenting path algorithm;
For k := 1 to n Do
    Multiply all residual capacities and flows of the residual graph from the previous iteration by 2;
    Add 1 to the capacity of each arc that has a 1 in the (k+1)st position (of the binary representation of its original capacity);
    Apply the augmenting path algorithm starting with the current flow;
End Do

The algorithm is illustrated in Figure 34.

Theorem 9.6. The capacity scaling algorithm runs in O(m² log2 U) time.

Proof. For the arcs on the minimum cut of the previous network, residual capacities were increased by at most one each, so the residual max flow in Pi is bounded by the number of arcs in the cut, which is ≤ m. Hence there are at most m augmentations per iteration, and the complexity of finding an augmenting path is O(m).

The total number of iterations is O(log2 U). Thus the total complexity is O(m² log2 U). Notice that this algorithm is polynomial, but still not strongly polynomial: it cannot handle real capacities.
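The bit-scaling bookkeeping can be checked in a few lines. This helper is our own (not part of the algorithm statement): it lists the restricted capacities ⌊U/2^(n−k)⌋ seen at each phase and the comment in the test verifies the “multiply by 2, add the next significant digit” recurrence.

```python
def restricted_caps(U):
    """Capacities of one arc across the scaling phases: the k most
    significant bits of U, i.e. U >> (n - k) for k = 1..n, where
    n = floor(log2 U) + 1 is the number of significant digits."""
    n = U.bit_length()
    return [U >> (n - k) for k in range(1, n + 1)]
```

For U = 9 = (1001) in binary this yields 1, 2, 4, 9: each phase doubles the previous restricted capacity and adds the next bit of U.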

9.6.3 Maximum Capacity Augmenting Path Algorithm

Maximum capacity augmenting path is a variation of the augmenting path algorithm in which each augmentation is done along a maximum capacity path in the residual graph. The generic algorithm is as follows:

Algorithm
Step 0: f = 0
Step 1: Construct Gf (the residual graph with respect to flow f)
Step 2: Find a max capacity path from s to t in Gf


[Figure content: an example with capacities U = 9 = (1001) in binary. Iteration 1 works with the most significant bit only; each subsequent iteration multiplies the flow by 2 and adds the next significant digit to the capacities, giving max flow 2, then 4, then 8; in the last iteration the flow is increased by 1 unit along each of two augmenting paths in the residual graph, reaching the maximum flow of 18.]

Figure 34: Example Illustrating the Capacity Scaling Algorithm


    Let the path capacity be δ;
    Augment f by δ along this path;
    Go to Step 1.
If there does not exist any path,
    Stop with f as a maximum flow.

Algorithm For Finding Maximum Capacity Augmenting Path

Dijkstra’s algorithm is appropriate for use here, where the label of a node is the max capacity of a path from s. At each iteration we add to the set of permanent nodes a node with the largest label reachable from the permanent set.

Algorithm
Initialize: P ← {s} and T ← V − {s}
While T is not empty do
    choose the node i ∈ T with maximum label l∗i and add it to the set of permanent nodes P;
    update the labels of the nodes j ∈ T : lj ← max{lj , min{l∗i , cij}}.
End Do

The complexity of finding the max capacity augmenting path is O(m+ n logn).
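The label-setting computation can be sketched in Python (names are ours; cap is a nested dict of residual capacities). The label of a node is the bottleneck capacity of the widest known path from s, and the node with the largest label is made permanent at each step, exactly as above.

```python
import heapq

def max_capacity_path(cap, s, t):
    """Maximum (bottleneck) capacity s-t path via a max-min Dijkstra.

    cap: dict of dicts, cap[u][v] = capacity of arc (u, v).
    Returns (bottleneck, path) or (0, None) if t is unreachable.
    """
    label = {s: float('inf')}      # widest-path label l_v
    parent = {s: None}
    heap = [(-label[s], s)]        # max-heap via negated labels
    done = set()                   # permanently labeled nodes P
    while heap:
        _, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == t:
            break
        for v, c in cap.get(u, {}).items():
            new = min(label[u], c)             # bottleneck through u
            if new > label.get(v, 0):
                label[v] = new
                parent[v] = u
                heapq.heappush(heap, (-new, v))
    if t not in done:
        return 0, None
    path, x = [], t
    while x is not None:                        # trace parents back to s
        path.append(x)
        x = parent[x]
    return label[t], path[::-1]
```

With a binary heap this runs in O(m log n); the O(m + n log n) bound quoted above requires a Fibonacci heap instead.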

The complexity of the max capacity augmenting path algorithm

Suppose that the value of the maximum flow f∗ is v∗, and that we are at an iteration with flow f of value v. What is the maximum flow value in Gf ? From the flow decomposition theorem, there are k ≤ m paths from s to t with sum of flows equal to v∗ − v:

∑_{i=1}^{k} δi = v∗ − v
⇒ ∃ i such that δi ≥ (v∗ − v)/k ≥ (v∗ − v)/m
⇒ the remaining flow (after augmenting along a maximum capacity path) is ≤ ((m−1)/m)(v∗ − v).

Repeat the above for q iterations, and stop when ((m−1)/m)^q v∗ ≤ 1, since flows must be integer. Thus it suffices to have q ≥ log(v∗)/log(m/(m−1)). Now log(m/(m−1)) ≈ 1/m, so q = O(m log v∗) = O(m log(nU)). Thus the overall complexity is O(m(m + n log n) log(nU)).

Why is the largest capacity augmenting path not necessarily a primitive path? A primitive path never travels backwards along an arc, yet an augmenting path may contain backward (residual) arcs. Thus, knowing the flow in advance is a significant advantage (not surprisingly).


9.7 Dinic’s Algorithm

Dinic’s algorithm is also known as the blocking flow algorithm, or the shortest augmenting paths algorithm. Given a flow f and the corresponding residual graph Gf = (V, Af ), we find a shortest augmenting path by constructing a layered network AN(f) (AN stands for Auxiliary Network) as follows. Using breadth-first search in Gf , we construct a tree rooted at s. Each node in V gets a distance label equal to its distance from s in the BFS tree: the nodes that have an incoming arc from s get label 1, nodes at the next level of the tree get label 2, etc. Nodes with the same label are said to be at the same layer. Let the distance label of t be k = d(t); then all nodes in layers ≥ k other than t are removed from the layered network, and all arcs are removed except those which go from a node at one layer to a node at the next consecutive layer. We call stage k the augmentations along paths in the layered network where k = d(t) (a network that includes k layers).

See the handout titled Dinic’s Algorithm.

Definition 9.1. A flow is called a blocking flow if every s, t path in G intersects at least one saturated arc.

Notice that a blocking flow is not necessarily a maximum flow. A flow of one unit on the Ford-Fulkerson bad-complexity example is blocking but not maximum.

Definition 9.2. The throughput of a node is the minimum of the sum of its incoming arc capacities and the sum of its outgoing arc capacities. It is the largest amount of flow that could possibly be routed through the node.

Dinic’s Algorithm for max flow problem

Initialize: f = 0, done := ”no”;
While done = ”no” do
    Construct a layered network with respect to flow f , AN(f);
    if t is not reachable from s in AN(f) then done := ”yes”;
    else
        repeat
            while there is a node v ≠ s, t with thru(v) = 0 do
                delete v and all incident arcs from AN(f);
            End Do
            let v be the node in AN(f) for which thru(v) = δ is the smallest (nonzero);
            push thru(v) from v all the way to t: set f ← f + δ for each arc on (any) selected path from v to t;
            pull thru(v) from s to v: set f ← f + δ for each arc on (any) selected path from s to v;
        End Do
End Do

Theorem 9.7. Dinic’s algorithm correctly solves the max-flow problem for a network G = (V, A) in O(|V|³) arithmetic operations.


Proof: We first address the issue of correctness. In the last stage, s and t are disconnected, so there is no augmenting path (or any path) in the residual network from s to t; thus, by the augmenting path theorem, the flow at the last stage is maximum.

Consider now the complexity bound. The algorithm has O(n) stages, since the longest shortest path in a network on n nodes has length at most n. At each stage we need to create the layered network with BFS, which takes O(m). The operations at each stage are either saturating or unsaturating pushes along an arc, so we can write the number of operations as the sum of saturating and unsaturating pull and push steps, N = Ns + Nu. Once an arc is saturated it is deleted from the graph, therefore Ns = O(m). However, we may have many unsaturating steps for the same arc. The observation is that we have at most n executions of the push and pull procedures (along a path) at each stage, since each such execution results in the deletion of the node v with the lowest throughput where the execution started. Furthermore, in each execution of the push and pull procedures there are at most n unsaturating pushes – one for each node along the path. Therefore, at each stage there are at most O(n²) unsaturating pushes, and N = Ns + Nu = O(m) + O(n²) = O(n²). So the complexity per stage is O(m + n²) = O(n²), and the total complexity is O(n · n²) = O(n³).

9.7.1 Dinic’s Algorithm for Unit Capacity Networks

Examples of unit capacity networks:

1. Maximum Bipartite matching,

2. Maximum number of edge disjoint Paths.

Lemma 9.2. In a unit capacity network with distance from s to t greater than or equal to ℓ, the maximum flow value |f̂| satisfies √|f̂| ≤ 2|V|/ℓ.

Proof: Construct a layered network with Vi the set of the nodes in layer i (see Figure 35), and let Si = V0 ∪ V1 ∪ · · · ∪ Vi, 1 ≤ i ≤ ℓ − 1. Then (Si, S̄i) is an s-t cut. Using the fact that the maximum flow value is at most the capacity of any cut,

|f̂| ≤ |Vi||Vi+1| ⇒ either |Vi| ≥ √|f̂| or |Vi+1| ≥ √|f̂|.

The first inequality comes from the fact that the maximum possible capacity of the cut is |Vi||Vi+1|: each arc has capacity one, and the maximum number of arcs from Vi to Vi+1 is |Vi||Vi+1| (all combinations of a node of Vi with a node of Vi+1). So at least ℓ/2 of the layers have at least √|f̂| nodes each (consider the pairs V0V1, V2V3, . . .). Hence

(ℓ/2)√|f̂| ≤ ∑_i |Vi| ≤ |V| ⇒ √|f̂| ≤ 2|V|/ℓ.



Figure 35: The layered network, distance from s to t is greater or equal to `

Remark: In Homework assignment 6 you will prove that under the same conditions as in this lemma, |f̂| ≤ 2|A|/ℓ.

Claim 9.2. Dinic’s algorithm, applied to a unit capacity network, requires at most O(n^{2/3}) stages.

Proof: Notice that in this case it is impossible to have non-saturating arc processing, because of the unit capacities, so the total computation per stage is no more than O(m). Each stage increments the flow by at least one unit, so if the max flow is ≤ 2n^{2/3} we are done. If the max flow is > 2n^{2/3}, run the algorithm for 2n^{2/3} stages. The distance label from s to t increases by at least 1 at each stage, so the shortest path from s to t in the resulting residual network satisfies ℓ ≥ 2n^{2/3}. Let g be the max flow in this residual network, and apply Lemma 9.2 to it:

√|g| ≤ 2|V|/(2n^{2/3}) = n^{1/3} (for |V| = n) ⇒ |g| ≤ n^{2/3}.

It follows that no more than n^{2/3} additional stages are required, so the total number of stages is ≤ 3n^{2/3}.

Complexity Analysis
Any arc processed is saturated (since this is a unit capacity network). Hence there is O(m) work per stage, and there are O(n^{2/3}) stages. This yields an overall complexity of O(mn^{2/3}).

Remark: Recently (1998) A. Goldberg and S. Rao [GR98] devised a new algorithm for maximum flow in general networks that uses ideas of the unit capacity case as a subroutine. The overall running time of their algorithm is O(min{n^{2/3}, m^{1/2}} · m log(n²/m) log U), for U the largest capacity in the network.

9.7.2 Dinic’s Algorithm for Simple Networks

In a simple network each node has either exactly one incoming arc of capacity one or exactly oneoutgoing arc of capacity one (See Figure 36 b). Bipartite matching is an example of a problemthat can be formulated as a maximum flow problem on a simple network.



Figure 36: a) not a simple network b) a simple network

For simple networks, Dinic’s Algorithm works more efficiently than it does for unit capacitynetworks, since the throughput of each node is one.

Lemma 9.3. In a simple network with distance from s to t greater than or equal to ℓ, the max flow f̂ satisfies |f̂| ≤ |V|/(ℓ − 1).

Proof. Consider layers V0, V1, . . . and the cuts (Si, S̄i), where Si = V0 ∪ · · · ∪ Vi. Since the throughput of each node is one, for each layer Vi,

|f̂| ≤ |Vi|,  i = 1, . . . , ℓ − 1.

Hence |f̂| ≤ min_{i=1,...,ℓ−1} |Vi|, and since ∑_{i=1}^{ℓ−1} |Vi| ≤ |V|, we get

min_{i=1,...,ℓ−1} |Vi| ≤ |V|/(ℓ − 1).

Claim 9.3. Applying Dinic’s algorithm to a simple network, the number of stages required is O(√n).

Proof. If the max flow is ≤ √n, we are done. Otherwise, run Dinic’s algorithm for √n stages. Since each stage increments the flow by at least one unit and increases the distance label from the source to the sink by at least one, in the resulting residual network ℓ ≥ √n + 1. By Lemma 9.3 the max flow g in the residual network satisfies |g| ≤ |V|/(√n + 1 − 1) = √n. So we need at most √n additional stages, and the total number of stages is O(√n).

Complexity Analysis
Recall that we are considering simple networks. Every push/pull operation saturates a node, since each node v has only one incoming arc or one outgoing arc, of capacity 1; the processing thus makes the throughput of node v equal to 0. Note that even though we only have to check O(n) nodes, we still have to update the layered network, which requires O(m) work. Hence the work per stage is O(m), and the complexity of the algorithm is O(m√n).


References

[Din1] E. A. Dinic. Algorithm for solution of a problem of maximum flows in networks with power estimation. Soviet Math. Dokl., 11, 1277–1280, 1970.

[EK1] J. Edmonds and R. M. Karp. Theoretical improvements in algorithmic efficiency for network flow problems. J. of ACM, 19, 248–264, 1972.

[FF1] L. R. Ford, Jr. and D. R. Fulkerson. Maximal flow through a network. Canad. J. Math., 8, 399–404, 1956.

[GR98] A. V. Goldberg and S. Rao. Beyond the flow decomposition barrier. Journal of the ACM, 45, 783–797, 1998.

[GT1] A. V. Goldberg and R. E. Tarjan. A new approach to the maximum flow problem. J. of ACM, 35, 921–940, 1988.

[Pic1] J. C. Picard. Maximal closure of a graph and applications to combinatorial problems. Management Science, 22, 1268–1272, 1976.

[Jo1] T. B. Johnson. Optimum Open Pit Mine Production Technology. Ph.D. Dissertation, Operations Research Center, University of California, Berkeley, 1968.

10 Necessary and Sufficient Condition for Feasible Flow in a Network

First the discussion is for zero lower bounds. We assume throughout that the total supply in thegraph is equal to the total demand, else there is no feasible solution.

10.1 For a network with zero lower bounds

A cut is a partition of the nodes, (D, D̄). We first demonstrate a necessary condition, and then show that it is also sufficient for flow feasibility.

The set D contains some supply nodes and some demand nodes. Let the supply or demand of each node j ∈ V be denoted by bj , where bj is positive if j is a supply node and negative if j is a demand node. The cut (D, D̄) must have sufficient capacity to accommodate the net supply (which could be negative) that must go out of D:

∑_{j∈D} bj ≤ C(D, D̄) = ∑_{i∈D, j∈D̄} uij ,  ∀ D ⊂ V.  (21)

This condition must be satisfied for every partition, namely, for every subset of nodes D ⊂ V . The interesting part is that this condition is also sufficient, and that it can be verified by solving a certain minimum cut problem.

In order to show that the condition is sufficient we construct an auxiliary s, t graph. Let S be the set of all supply nodes. We append to the graph a source node s with arcs going to all supply nodes, with capacity equal to their supplies. We place a sink node t with arcs going from all demand nodes to the sink, with capacity equal to the absolute value of the demand at each node. There exists a feasible flow in the original graph if and only if the


minimum s, t cut in this auxiliary graph is at least equal to the total supply, sup = ∑_{bj>0} bj (or the total demand, which is equal to it). In other words, since the total supply value is the capacity of the cut adjacent to the source (and of the cut adjacent to the sink), either one of these cuts must then be a minimum cut. In particular, for any partition (D, D̄) of V , the cut capacity C({s} ∪ D, {t} ∪ D̄) must be at least as big as the total supply:

sup ≤ C({s} ∪ D, {t} ∪ D̄) = C(s, D̄) + C(D, D̄) + C(D, t)
    = ∑_{j∈D̄, bj>0} bj + C(D, D̄) + ∑_{j∈D, bj<0} (−bj).

Reordering the terms yields

∑_{j∈D, bj>0} bj + ∑_{j∈D, bj<0} bj ≤ C(D, D̄).

The left-hand side is simply the sum of the weights bj of the nodes in D. This necessary condition is therefore identical to the condition stated in (21).

Verification of condition (21). A restatement of condition (21) is that for each subset of nodes D ⊂ V ,

∑_{j∈D} bj − C(D, D̄) ≤ 0.

The term on the left is called the s-excess of the set D. In Section 12 it is shown that the maximum s-excess set in a graph can be found by solving a minimum cut problem. The condition is thus satisfied if and only if the maximum s-excess in the graph is of value 0.
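For small graphs, condition (21) can be checked directly by enumerating all subsets D. This brute-force sketch (exponential in |V|, for illustration only; names are ours) states the condition verbatim.

```python
from itertools import combinations

def condition_21_holds(nodes, b, u):
    """Check condition (21): for every D ⊆ V,
    sum_{j in D} b_j <= C(D, D-bar) = sum of u_ij over arcs leaving D.

    b: dict node -> supply (positive) or demand (negative).
    u: dict (i, j) -> arc capacity.
    """
    for r in range(len(nodes) + 1):
        for D in combinations(nodes, r):
            Dset = set(D)
            net_supply = sum(b[j] for j in D)
            cut = sum(c for (i, j), c in u.items()
                      if i in Dset and j not in Dset)   # C(D, D-bar)
            if net_supply > cut:
                return False                            # (21) violated by D
    return True
```

On a two-node network with supply 1 at a, demand 1 at b and an arc (a, b) of capacity 1, the condition holds; shrinking the supply side of any cut below the demand it must ship makes it fail.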

10.2 In the presence of positive lower bounds

In this case condition (21) still applies, where the cut value is C(D, D̄) = ∑_{i∈D, j∈D̄} uij − ∑_{u∈D̄, v∈D} ℓuv.

10.3 For a circulation problem

Here all supplies and demands are zero. The condition is thus: for all D ⊂ V ,

∑_{i∈D, j∈D̄} uij − ∑_{u∈D̄, v∈D} ℓuv ≥ 0.

10.4 For the transportation problem

Here all arc capacities are infinite. In the s, t graph we construct, as before, only the source-adjacent and sink-adjacent arcs have finite capacities. Let the transportation problem be given on the bipartite graph (V1 ∪ V2, A), with V1 a set of supply nodes and V2 a set of demand nodes. Consider a partition (D, D̄) of the graph. Since all arc capacities in A are infinite, condition (21) is trivially satisfied whenever an arc of A is included in the cut. So we restrict our attention to cuts that do not include an arc of A. Such cuts have the set D closed, namely, all the successors of the supply nodes in D are also included in D. For closed sets D the cut capacity is 0. The condition is then: for each closed set D,

∑_{j∈D, bj>0} bj ≤ ∑_{j∈D, bj<0} (−bj).

IEOR266 notes, Prof. Hochbaum, 2003

This condition is an extension of Hall's theorem for the existence of a perfect matching in bipartite graphs. (See also pages 194–196 of the text.)
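On a small instance this closed-set condition can be checked by direct enumeration. The sketch below is illustrative only: the instance and the names `supply`, `demand`, `arcs` and `feasible_by_closed_sets` are our own, and enumerating all subsets is exponential, so this is for tiny examples.

```python
from itertools import combinations

def feasible_by_closed_sets(supply, demand, arcs):
    """Gale/Hall-type test for a transportation problem with infinite arc
    capacities: feasible iff for every closed set D (every successor of a
    supply node of D is in D), supply in D does not exceed demand in D."""
    nodes = list(supply) + list(demand)
    for r in range(len(nodes) + 1):
        for comb in combinations(nodes, r):
            D = set(comb)
            if any(u in D and v not in D for (u, v) in arcs):
                continue                     # D is not closed
            if sum(supply.get(j, 0) for j in D) > sum(demand.get(j, 0) for j in D):
                return False                 # condition violated on closed set D
    return True

# Supply node 'a' ships only to 'x'; 'b' ships to 'x' and 'y'.
arcs = {('a', 'x'), ('b', 'x'), ('b', 'y')}
ok = feasible_by_closed_sets({'a': 2, 'b': 3}, {'x': 4, 'y': 1}, arcs)   # True
bad = feasible_by_closed_sets({'a': 2, 'b': 3}, {'x': 1, 'y': 4}, arcs)  # False: D = {'a', 'x'}
```

In the second instance the closed set D = {'a', 'x'} has supply 2 but demand only 1, so no feasible shipment exists.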

11 Goldberg’s Algorithm - the Push/Relabel Algorithm

11.1 Overview

All algorithms discussed till now were "primal" algorithms, that is, they were all based on augmenting feasible flows. The push/relabel algorithm for the maximum flow problem is based on using "preflows". This algorithm was originally developed by Goldberg and is known as the Preflow Algorithm. A generic algorithm and a particular variant (lift-to-front) will be presented.

Goldberg's algorithm can be contrasted with the algorithms we have learned so far in the following ways:

• As opposed to working on feasible flows at each iteration of the algorithm, the preflow algorithm works on preflows: the flows through the arcs need not satisfy the flow balance constraints.

• There is never an augmenting path in the residual network. The preflow is possibly infeasible, but super-optimal.

Definitions

Preflow: A preflow f in a network is a flow vector that

1. satisfies the capacity constraints (i.e., 0 ≤ f_ij ≤ U_ij, ∀(i, j));

2. has incoming flow at any node other than the source greater than or equal to the outgoing flow, i.e., ∑_j f_ji − ∑_k f_ik ≥ 0.

Excess: The excess of a node is defined w.r.t. a flow and is given by

e_f(v) = incoming flow(v) − outgoing flow(v).

Active Node: A node v ≠ t is called active if e_f(v) > 0.

Distance Labels: Each node v is assigned a label. The labeling is such that it satisfies the following inequality:

d(i) ≤ d(j) + 1 ∀(i, j) ∈ A_f.

Admissible Arc: An arc (u, v) is called admissible if (u, v) ∈ A_f and d(u) > d(v). Notice that together with the validity of the distance labels this implies that d(u) = d(v) + 1.
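As a small illustration of these two definitions, the helper below (not from the notes; the names are hypothetical) checks label validity over a set of residual arcs and extracts the admissible ones:

```python
def valid_and_admissible(residual_arcs, d):
    """Check label validity d(i) <= d(j) + 1 over the residual arcs, and list
    the admissible arcs, i.e. those with d(i) = d(j) + 1."""
    valid = all(d[i] <= d[j] + 1 for (i, j) in residual_arcs)
    admissible = [(i, j) for (i, j) in residual_arcs if d[i] == d[j] + 1]
    return valid, admissible

# Residual arcs of a 3-node example with labels d(0)=2, d(1)=1, d(2)=0:
valid, adm = valid_and_admissible([(0, 1), (1, 2), (2, 0)], {0: 2, 1: 1, 2: 0})
# valid is True and adm == [(0, 1), (1, 2)]; note the admissible arcs are acyclic
```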

Lemma 11.1. The admissible arcs form a directed acyclic graph (DAG).

Proof: For contradiction, suppose there is a cycle (i1, i2, ..., ik, i1). From the definition of an admissible arc, d(i1) > d(i2) > ... > d(ik) > d(i1), from which the contradiction follows.


11.2 The Generic Algorithm

begin
    /* Preprocessing steps */
    f = 0
    d(s) = n, d(i) = 0 ∀ i ≠ s
    ∀ (s, j) ∈ E: f_sj = u_sj, e_f(j) = u_sj
    while the network contains an active node do
    begin
        select an active node v
        if ∃ an admissible arc (v, w) then
            /* Push step */
            let δ = min{e_f(v), U_f(v, w)}, where (v, w) is admissible
            send δ units of flow from v to w
        else
            /* Relabel step (all (v, w) are inadmissible) */
            set d(v) = min_{(v,w)∈A_f} {d(w) + 1}
    end
end
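A direct, unoptimized transcription of the generic algorithm into Python might look as follows. This is a sketch, not the notes' implementation: the function name `push_relabel_max_flow` and the data layout are our own, arc scanning is a naive O(n) per step, and none of the bookkeeping needed for the O(n²m) bound derived below is attempted.

```python
from collections import defaultdict

def push_relabel_max_flow(n, s, t, capacities):
    """Generic push/relabel on nodes 0..n-1; capacities: dict (u, v) -> capacity."""
    cap = defaultdict(int)
    for (u, v), c in capacities.items():
        cap[(u, v)] += c
    flow = defaultdict(int)
    excess = [0] * n
    d = [0] * n
    d[s] = n                                   # d(s) = n, d(i) = 0 otherwise
    for (u, v), c in list(cap.items()):        # preprocessing: saturate source arcs
        if u == s:
            flow[(s, v)] = c
            excess[v] += c
            excess[s] -= c

    def residual(u, v):
        # unused forward capacity plus flow that can be pushed back on (v, u)
        return cap[(u, v)] - flow[(u, v)] + flow[(v, u)]

    def push(u, v, delta):
        back = min(delta, flow[(v, u)])        # cancel reverse flow first
        flow[(v, u)] -= back
        flow[(u, v)] += delta - back
        excess[u] -= delta
        excess[v] += delta

    internal = [v for v in range(n) if v not in (s, t)]
    while True:
        active = [v for v in internal if excess[v] > 0]
        if not active:
            break
        v = active[0]                          # select an active node
        for w in range(n):
            if residual(v, w) > 0 and d[v] == d[w] + 1:   # admissible arc
                push(v, w, min(excess[v], residual(v, w)))
                break
        else:
            # relabel: all residual arcs out of v are inadmissible
            d[v] = min(d[w] for w in range(n) if residual(v, w) > 0) + 1
    return excess[t]

# Example: 4 nodes, s = 0, t = 3; the minimum cut capacity is 5.
# push_relabel_max_flow(4, 0, 3, {(0, 1): 3, (0, 2): 2,
#                                 (1, 2): 2, (1, 3): 2, (2, 3): 3}) returns 5.
```

Note that an active node always has a residual arc leaving it (some flow entered it), so the relabel step is well defined, matching Lemma 11.5 below.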

Now we prove the following lemmas about this algorithm which will establish its correctness.

Lemma 11.2. Relabeling step maintains the validity of distance labels.

Proof: Relabeling is applied to a node i with d(i) ≤ d(j) ∀(i, j) ∈ A_f. Suppose the new label is

d′(i) = min_{(i,j)∈A_f} {d(j) + 1}.

This implies d′(i) ≤ d(j) + 1 ∀(i, j) ∈ A_f, so the labels remain valid.

Lemma 11.3 (Superoptimality). For a preflow f and distance labeling d there is no augmenting path from s to t in the residual graph G_f.

Proof: For contradiction, suppose [s, v1, ..., vk, t] is an augmenting path in the residual graph. We start with d(s) = n, and the labeling of the source does not change. Also d(t) = 0. Now, from the validity of the distance labels, we have

n = d(s) ≤ d(v1) + 1 ≤ d(v2) + 2 ≤ ··· ≤ d(vk) + k ≤ d(t) + k + 1 = k + 1.

Since a simple path has at most n nodes, k ≤ n − 2, so n ≤ k + 1 ≤ n − 1, which is a contradiction.

Lemma 11.4 (Optimality). When there is no active node in the residual graph (∀v ≠ s, t, e_f(v) = 0), the preflow is a maximum flow.

Proof: Since there is no node with excess flow, the preflow is a feasible flow. Also, from the previous lemma there is no augmenting path; hence the flow is optimal.


Lemma 11.5. For any active node i, there exists a path in Gf from i to s.

Proof: Let S_i be the set of nodes reachable in G_f from i. Suppose s ∉ S_i; then all nodes in S_i have nonnegative excess. (Note that s is the only node in the graph that is allowed to have a negative excess – a deficit.) The in-flow into S_i must be 0 (else i could reach more nodes than S_i):

0 = inflow(S_i)
  ≥ inflow(S_i) − outflow(S_i)
  = ∑_{j∈S_i} [inflow(j) − outflow(j)]
  = ∑_{j∈S_i\{i}} [inflow(j) − outflow(j)] + e_f(i)
  > 0.

The last inequality holds since node i is active and hence has strictly positive excess, while all other nodes in S_i have nonnegative excess. This is a contradiction. Therefore s ∈ S_i, and there exists a path in G_f from i to s.

Lemma 11.6. When all active nodes satisfy d(i) ≥ n, these nodes form the source set of a min-cut.

Proof: Let S = {i | d(i) ≥ n, e_f(i) > 0}. For each i ∈ S there is no path to t in G_f, by the same arguments as in Lemma 11.3 (valid distance labels satisfy d(i) ≤ d(j) + 1 ∀(i, j) ∈ A_f; otherwise d(i) < n). From Lemma 11.5, each active node can send its excess back to s. Hence the preflow can be converted to a (feasible) flow, the saturated cut (S, S̄) is a min-cut, and the resulting flow is a maximum flow. (Although this conversion will be accomplished by the algorithm, it is more efficient to send flow from excess nodes back to the source by using the flow decomposition algorithm.)

Notice that the source set S is a minimal source set of a min-cut. The reason is that the same paths that are used to send the excess flows back to the source can then be used in the residual graph to reach those nodes of S.

Lemma 11.7. ∀v, d(v) ≤ 2n− 1

Proof: Consider an active node v. From Lemma 11.5, s is reachable from v in G_f. Consider a simple path P = (i0, i1, ..., ik) from v to s, where i0 = v and ik = s. Now, d(s) = n (fixed), and by the validity of the labels,

d(i0) ≤ d(i1) + 1 ≤ d(i2) + 2 ≤ ··· ≤ d(ik) + k = n + k.

Since a simple path has no more than n nodes, k ≤ n − 1, which implies that d(i0) ≤ 2n − 1.

Lemma 11.8. 1. d(v) never decreases when relabeled

2. Immediately after relabel(v), v does not have incoming admissible arcs

Proof:


[Figure omitted: the heights (labels) of nodes u and v at the first, second, and third saturating push on (u, v).]

Figure 37: Sequence of saturating pushes on (u, v)

1. Before relabel(v), d(v) ≤ d(w) ∀(v, w) ∈ A_f. After the relabel, the new label is min_{(v,w)∈A_f} d(w) + 1 ≥ d(v) + 1, so d(v) increased after relabeling.

2. Assume (u, v) ∈ A_f was previously admissible, so d(u) = d(v) + 1. After relabel(v), the new label satisfies d′(v) ≥ d(v) + 1 = d(u). Since v is now at least as high as u, (u, v) is inadmissible.

Lemma 11.9 (Complexity of Relabel).

1. Total number of relabels = O(n²)

2. Complexity of all relabels = O(mn)

Proof:

1. By Lemma 11.7 each node can be relabeled at most 2n − 1 times. The total number of nodes is n, so the total number of relabels is less than n(2n − 1) = O(n²).

2. The cost of relabel(v) is equal to the out-degree of v in G_f, because at each relabel of v we have to check all its adjacent nodes. Therefore the cost of relabeling node v throughout the algorithm is O(n) · deg(v), and the cost of relabeling all nodes is at most ∑_{v∈V} O(n) · deg(v) = O(mn).

Lemma 11.10 (Complexity of Push). The total number of saturating pushes is O(mn).

Proof: After a saturating push on (u, v), we cannot push again on (u, v) until v pushes back on (v, u). For that, d(v) must have increased by at least 2, and hence d(u) must also have increased by at least 2 between consecutive saturating pushes on (u, v) (see Figure 37). Since labels never exceed 2n − 1, the number of saturating pushes on (u, v) is at most (2n − 1)/2 ≤ n. Therefore, for O(m) arcs, the total number of saturating pushes is O(mn).

Lemma 11.11. Total number of non-saturating pushes = O(n²m)


Proof: Define Φ = ∑_{v active} d(v).

Initially Φ = 0, because all initial labels are zero. A non-saturating push on (v, w) reduces Φ by at least 1: it deactivates node v and reduces Φ by d(v) (note that d(v) ≥ 1 for a push from v to occur), while w may become active and add d(w) to Φ. Therefore Φ′ = Φ + d(w) − d(v), and since d(v) = d(w) + 1 (d(v) ≤ d(w) + 1 by validity of the labels; d(v) ≥ d(w) + 1 for the push to occur), Φ is reduced by at least 1. In fact the decrease is exactly one if w was not previously active. The total number of non-saturating pushes is therefore bounded by the total amount that Φ can increase during the algorithm. Φ can increase due to:

1. Saturating pushes: a saturating push can increase Φ by at most 2n − 1, since it can activate at most one node and d(w) ≤ 2n − 1. So Φ can increase by O(n) per saturating push, and since the total number of saturating pushes is O(nm), the total amount that Φ can increase due to all saturating pushes is O(n²m).

2. Relabeling: each relabel increases Φ by the increase in the label. The maximum increase of a label per node is 2n − 1, so the total increase in Φ by relabeling is O(n²). This term is dominated by O(n²m).

Since the total increase is O(n²m), the number of non-saturating pushes is at most O(n²m).

Overall Complexity

Cost of relabels = O(nm)
Cost of saturating pushes = O(nm)
Cost of non-saturating pushes = O(n²m)

Therefore the total complexity is O(n²m).

11.3 Variants of Push/Relabel Algorithm

• FIFO: A node that becomes active is placed at the beginning of the active node list. Running time = O(n³).

• Highest label preflow: picks an active node to process that has the highest distance label. Running time = O(√m n²).

• Dynamic tree implementation: running time = O(mn log(n²/m)).

11.4 Wave Implementation of Goldberg’s Algorithm (Lift-to-front)

This algorithm has running time O(n³). The algorithm is listed below:

preprocess()
v ← first node of the list L
while there exists an active node do
    if v is active then push/relabel(v)
    if v gets relabeled, place v in front of L


    else replace v by the node after v in L
end

See the handout titled Wave Algorithm (from Cormen, Leiserson and Rivest). Since the run times of the relabels and the saturating pushes are less than O(n³), we only have to show the run time of the non-saturating pushes to be O(n³) to prove the overall running time of O(n³).

Lemma 11.12. The complexity of wave algorithm is O(n3).

Proof. It suffices to show that there are O(n³) non-saturating pushes for the wave algorithm. Define a phase to be the period between two consecutive relabels. There are ≤ n(2n − 1) relabels, so there are O(n²) phases, and we need to show that for each node there is at most one non-saturating push per phase (Lemma 11.13).

Claim 11.1. The list L is in topological order for the admissible arcs of Af .

Proof. By induction on the iteration index. After preprocess, there are no admissible arcs, and so any order is topological. Given a list in topological order, process node u. 'Push' can only eliminate admissible arcs (with a saturating push) and 'relabel' can create admissible arcs, but none into u (Lemma 11.8), so moving u to the front of L maintains topological order.

Lemma 11.13. At most one non-saturating push per node per phase.

Proof. Let u be a node from which a non-saturating push is executed. Then u becomes inactive. Also, all the nodes preceding u in L are inactive. In order for u to become active again, some node preceding it has to become active first. But every node can only activate nodes lower than itself. Hence the only way for u to become active is for some node to be relabeled, which ends the phase.

It follows that the complexity of wave implementation of Goldberg’s algorithm is O(n3).

12 A sketch of the pseudoflow algorithm for maximum flow – Hochbaum's algorithm

7The material presented in this chapter is extracted from "The pseudoflow algorithm: A new algorithm and a new simplex algorithm for the maximum flow problem", by Dorit S. Hochbaum, March 97, and a revised version of 2001.


12.1 Preliminaries

Let G be a directed graph (V, A) and let G_st be a closure graph (V ∪ {s, t}, A_st), where A_st = A ∪ A(s) ∪ A(t), and A(s) and A(t) are the sets of arcs adjacent to the source and the sink respectively. Let A_f be the set of residual arcs for pseudoflow f in G_st. That is, A_f = {(i, j) | f_ij < c_ij and (i, j) ∈ A, or f_ij > 0 and (j, i) ∈ A}. With respect to f we define excess(v) to be the total amount of incoming flow minus the outgoing flow of node v,

excess(v) = ∑_{(i,v)∈A_st} f_iv − ∑_{(v,j)∈A_st} f_vj.

Let f_ij be the flow on arc (i, j) and c_ij its capacity. Then the residual capacity of arc (i, j) is c^f_ij = c_ij − f_ij, and the residual capacity of the reverse arc (j, i) is c^f_ji = f_ij. We refer to an arc as an edge and denote it by [i, j] whenever the direction is immaterial to the discussion.

The pseudoflow algorithm works with the graph G = (V, A), excluding from G_st the nodes s and t and their adjacent sets of arcs A(s) and A(t). Throughout the algorithm there is a corresponding pseudoflow on G_st that is modified at each iteration on the arcs of A and maintains A(s) and A(t) saturated.

Figure 38: An initial pseudoflow saturates A(s) and A(t), and 0 on all other arcs

A normalized tree is a rooted spanning forest in G = (V, A) where each rooted tree (also called a branch) has a root node which is the only node permitted to carry excess or deficit. We define the extended network G_ext, where we add to G a super-root node r and connect each root of a branch to r with an arc that is called an excess arc if the root is an excess node, and a deficit arc if the root is a deficit node: G_ext = (V ∪ {r}, A ∪ {(r, i), (i, r) | ∀i ∈ V}). In G_ext the normalized tree is a spanning tree. If a root of a branch has excess, then all nodes in that branch are called strong. If the root has a nonpositive excess, then all the nodes in that branch are called weak. Thus 0-deficit branches are considered weak.

The root is placed at the top of the tree, so that the topologically downwards direction is equivalent to "pointing away from the root", and the upwards direction is equivalent to "pointing towards the root".

Consider a spanning tree T in G_ext consisting of rooted branches in G = (V, A) and a pseudoflow f in G_st, satisfying the properties:

Proposition 12.1. The pseudoflow f saturates all source-adjacent and sink-adjacent arcs, A(s) ∪ A(t).


Figure 39: A normalized tree with strong and weak branches

Proposition 12.2. The pseudoflow f on out-of-tree arcs of A \ T is either at the lower or the upper bound capacity of the arcs.

Proposition 12.3. In every branch all downwards residual capacities are strictly positive.

Proposition 12.4. The only nodes that do not satisfy the flow balance constraints are the roots of their respective branches.

Definition 12.1. A spanning tree T in G_ext with associated pseudoflow f in G_st is called normalized if it satisfies properties 12.1, 12.2, 12.3, and 12.4.

A schematic illustration of a normalized tree is provided in Figure 39. An example of a normalized tree that we call simple is generated as follows: a simple normalized tree corresponds to a pseudoflow saturating all arcs in A(s) and A(t), with 0 flow on all other arcs. In this case each node is a separate branch for which the node serves as the root. Thus all nodes adjacent to the source, N(s), are strong nodes, and all those adjacent to the sink, N(t), are weak nodes. All the remaining nodes have zero inflow and outflow, and are thus of zero deficit and set as weak. We use a simple normalized tree to initialize the algorithm. Such a tree is illustrated in Figure 40.

12.2 The maximum flow problem and the maximum s-excess problem

The pseudoflow algorithm described here ultimately finds a maximum flow in a graph. Rather than solving the problem directly, the algorithm solves instead the maximum s-excess problem:

Problem Name: Maximum s-Excess

Instance: Given a directed graph G = (V, A), node weights (positive or negative) w_i for all i ∈ V, and nonnegative arc weights c_ij for all (i, j) ∈ A.

Optimization Problem: Find a subset of nodes S ⊆ V such that ∑_{i∈S} w_i − ∑_{i∈S, j∈S̄} c_ij is maximum.


Figure 40: Initial simple normalized tree

The s-excess problem is closely related to the maximum flow and minimum cut problems, as shown next.

Lemma 12.1. For S ⊆ V, {s} ∪ S is the source set of a minimum cut in G_st if and only if S is a set of maximum s-excess C(s, S) − C(S, S̄ ∪ {t}) in the graph G.

Proof: We rewrite the objective function in the maximum s-excess problem:

max_{S⊆V} [C(s, S) − C(S, S̄ ∪ {t})] = max_{S⊆V} [C(s, V) − C(s, S̄) − C(S, S̄ ∪ {t})]
                                    = C(s, V) − min_{S⊆V} [C(s, S̄) + C(S, S̄ ∪ {t})].

In the last expression the term C(s, V) is a constant from which the minimum cut value is subtracted. Thus the set S maximizing the s-excess is also the source set of a minimum cut and, vice versa, the source set of a minimum cut also maximizes the s-excess.

We note that the s-excess problem has appeared in several forms in the literature. For instance, the boolean quadratic minimization problem with all the quadratic terms having positive coefficients is a restatement of the s-excess problem. More closely related is the feasibility condition of Gale [Gale57] for a network with supplies and demands, or of Hoffman (1960) for a network with lower and upper bounds. Verifying feasibility is equivalent to ensuring that the maximum s-excess is zero in a graph with node weights equal to the respective supplies and demands with opposite signs: if the s-excess is positive, then there is no feasible flow satisfying the supply and demand balance requirements. The problem appeared also under the names maximum blocking cut and maximum surplus cut in Radzik (1993).
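Lemma 12.1 is easy to verify by brute force on a tiny instance. In the sketch below (hypothetical names and numbers; exponential enumeration, so illustration only), positive node weights play the role of source-adjacent capacities and negative weights the role of sink-adjacent capacities:

```python
from itertools import combinations

# Hypothetical tiny instance of the s-excess problem:
w = {1: 3, 2: -1, 3: -2}               # node weights; w_i > 0 source side, w_i < 0 sink side
c = {(1, 2): 1, (2, 3): 2, (1, 3): 1}  # nonnegative arc capacities within G

V = list(w)
subsets = [set(comb) for r in range(len(V) + 1) for comb in combinations(V, r)]

def s_excess(S):
    # sum of node weights in S minus the capacity of arcs from S to its complement
    return sum(w[i] for i in S) - sum(k for (i, j), k in c.items() if i in S and j not in S)

def cut_value(S):
    # capacity of the cut ({s} ∪ S, {t} ∪ complement) in the associated s,t-graph
    src = sum(wi for i, wi in w.items() if wi > 0 and i not in S)  # arcs (s, i) crossing
    snk = sum(-wi for i, wi in w.items() if wi < 0 and i in S)     # arcs (i, t) crossing
    mid = sum(k for (i, j), k in c.items() if i in S and j not in S)
    return src + snk + mid

best = max(subsets, key=s_excess)
# best == {1} with s-excess 1; cut_value({1}) == 2 is the minimum cut, and
# C(s, V) - min cut = 3 - 2 = 1 equals the maximum s-excess, as in Lemma 12.1.
```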

12.2.1 The s-excess problem and maximum closure problem

The pseudoflow algorithm is a generalization of the algorithm solving the maximum closure problem in closure graphs. The s-excess problem is a natural extension of the maximum closure problem. The maximum closure problem is to find, in a node-weighted graph, a subset of nodes that forms a closed set (i.e., with each node in a closed set all its successors are in the set as well), so that


the total weight of the nodes in the closed subset is maximized. Picard (1976) demonstrated how to formulate the maximum closure problem as a flow problem: all arc capacities in the graph are set to infinity, arcs are added from a source s to all nodes of positive weight with capacity equal to that weight, and from all nodes of negative weight to a sink t with capacities equal to the absolute values of the weights. In this graph, the term C(S, S̄ ∪ {t}) in any finite cut value must be finite, and thus (S, S̄) = ∅. This implies that the set S is closed, as required. It follows that C(S, S̄ ∪ {t}) is equal to C(S, {t}), and Lemma 12.1 demonstrates that the source set of a minimum cut is also a maximum weight closed set.

The s-excess problem may be viewed as a maximum closure problem with a relaxation of the closure requirement: nodes that are successors of other nodes in S (i.e., that have arcs originating from nodes of S to these nodes) may be excluded from the set, but at a penalty that is equal to the capacity of those arcs. A family of problems derived by a similar type of relaxation was studied by Hochbaum and Pathria (1996). One such example is the generalized independent set problem, which is to find a maximum weight set of nonadjacent nodes (an independent set of nodes); if adjacent nodes are selected, a penalty is charged to the objective function for each adjacent pair.

With the discussion in the previous section we conclude that any maximum flow or minimum cut algorithm solves the maximum s-excess problem: the source set of a minimum cut is a maximum s-excess set. The pseudoflow algorithm works in the opposite direction, first solving for a maximum s-excess set, which immediately provides a minimum cut, and then recovering a maximum flow.

12.3 Superoptimality

Proposition 12.5 (Superoptimality). The set of strong nodes of a normalized tree T is a superoptimal solution to the s-excess problem. That is, the sum of the excesses of the strong branches is an upper bound on the maximum s-excess.

Our proof of the superoptimality property uses the next lemma about the relationship between the excess of a set, the s-excess of a set, and a "relaxed" concept of s-excess for a normalized tree T in G_ext. The relaxation is in removing all out-of-tree arcs from the graph and defining the s-excess concept with respect to the set of arcs T only.

Definition 12.2. For a pseudoflow f, a set of nodes D ⊆ V and a given normalized tree T,

s-excess_T(D) = inflow(D) − outflow(D) − ∑_{(i,j)∈(D,D̄)∩T} c_ij.

Lemma 12.2. For any pseudoflow f saturating source- and sink-adjacent arcs, a normalized tree T and a subset of nodes D, excess(D) ≥ s-excess_T(D) ≥ s-excess(D).

Proof:

excess(D) = inflow(D) − outflow(D)
          ≥ inflow(D) − outflow(D) − ∑_{(i,j)∈(D,D̄)∩T} c_ij = s-excess_T(D)
          ≥ inflow(D) − outflow(D) − ∑_{(i,j)∈(D,D̄)} c_ij = s-excess(D).

The three forms of excess are equal if and only if the residual capacity C(D, D̄) = 0, i.e., f(D, D̄) = C(D, D̄) and f(D̄, D) = 0. (Again, recall that D̄ = V \ D.)

For the set of strong nodes S, excess(S) = s-excess_T(S). To see that, observe that for any set B that is a union of entire branches, (B, B̄) ∩ T = ∅, and thus excess(B) = s-excess_T(B). We


show next, in Lemma 12.3, that the set D maximizing s-excess_T(D) is the set of strong nodes S. This leads to the conclusion that for any normalized tree with a set of strong nodes S, excess(S) is always an upper bound on the maximum s-excess in the graph, and furthermore, that it is equal to the maximum s-excess when all residual capacities out of the set S are 0.

Lemma 12.3. For a normalized tree T with pseudoflow f, strong nodes S and weak nodes S̄,

max_{D⊆V} s-excess_T(D) = s-excess_T(S).

Proof: For any strong branch B ⊆ S and a proper subset B1 ⊂ B, the set of residual arcs in (B1, B̄1) ∩ T can be nonempty. Furthermore, since T is a normalized tree, each node v that is not a root satisfies inflow(v) − outflow(v) = 0. So including the root of B can only increase the s-excess of a set, and including all nodes of the branch can only decrease the cut and thus increase the s-excess. Thus B maximizes the value max_{D⊆B} s-excess_T(D). Any weak branch or a proper subset of a weak branch has nonpositive s-excess_T. Since every strong branch contributes a positive amount to s-excess_T and every weak branch contributes a nonpositive amount, the maximum is attained for the set of all strong nodes S.

In particular, observing that (S, S̄) ∩ T = ∅, it follows that

excess(S) = max_{D⊆V} s-excess_T(D) ≥ max_{D⊆V} s-excess(D).

Thus the excess of the strong nodes S is an upper bound on the value of the optimal solution to the s-excess problem. Moreover, when the residual capacity of the arcs in (S, S̄) is zero, then excess(S) = s-excess_T(S) = s-excess(S). It thus follows from Lemma 12.3:

Corollary 12.1 (Optimality condition). Consider a normalized tree, a pseudoflow f and the collection of strong nodes S in the normalized tree. If f saturates all arcs in (S, S̄) and f(S̄, S) = 0, then S is a maximum s-excess set in the graph.
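The three quantities compared in Lemma 12.2 differ only in which cut arcs are charged against the flow balance, which a toy computation makes concrete (all names and numbers below are made up for illustration):

```python
# Flows in and out of a node set D, and capacities of the arcs in (D, D̄):
inflow, outflow = 7, 3
cut_arcs = {('a', 'x'): 2, ('b', 'y'): 4}   # arcs from D to its complement
tree_arcs = {('a', 'x')}                    # the subset of those arcs that lies in T

excess_D = inflow - outflow                                                    # charges no cut arcs: 4
s_excess_T = excess_D - sum(k for a, k in cut_arcs.items() if a in tree_arcs)  # charges tree cut arcs: 2
s_excess_D = excess_D - sum(cut_arcs.values())                                 # charges all cut arcs: -2
# excess_D >= s_excess_T >= s_excess_D, matching the lemma
```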

12.4 The description of the pseudoflow algorithm

At the termination of the algorithm there is a partition of the nodes of the graph into two sets such that there is no residual arc from one set to the other. According to the superoptimality property this partition provides an optimal solution to the s-excess problem and is also a minimum s,t-cut.

The pseudoflow at termination is usually infeasible, and an additional procedure is required to recover a feasible flow. The complexity of feasible flow recovery, O(m log n), is dominated by the complexity of the main algorithm. Further, for closure problems the recovery is O(n) [Hoc01].

The pseudoflow algorithm begins with an initial normalized tree – a simple normalized tree – and an associated pseudoflow which saturates source- and sink-adjacent arcs. An iteration of the algorithm consists of identifying a residual (unsaturated) arc from a strong node to a weak node (called a merger arc), if one exists. Once a merger arc has been found, the entire excess of the respective strong branch is sent along the path from the root of the strong branch to the root of the weak branch. An arc encountered along this path that does not have sufficient residual capacity is split, and the head node of that arc becomes a root of a strong branch with excess equal to the difference between the amount pushed and the capacity. A merger iteration is schematically described in Figure 41. In that figure the strong merger node is denoted by s, as there is no risk of ambiguity – in that graph there is no source node.


[Figure omitted: a normalized tree before and after a merger, with strong (S) and weak (W) branches hanging from the root r; the merger path runs from the root of the strong branch through the merger arc into the weak branch.]

Figure 41: Illustration of a merger iteration

The absence of a merger arc is a certificate of optimality of the normalized tree. For an optimal normalized tree with a set of strong nodes S* and a set of weak nodes W* = V \ S*, the cut ({s} ∪ S*, {t} ∪ W*) is a minimum cut in the s,t-graph.

To summarize, the algorithm consists, at each iteration, of the following operations:

1. Finding a merger arc, once;
2. Pushing the flow to the next bottleneck arc;
3. Splitting a bottleneck edge.

Operations 2 and 3 may be repeated several times in a single merger iteration, depending on the number of bottlenecks encountered along the merger path.

Finding a merger arc amounts to scanning the strong nodes for a weak neighbor in the residual graph. The merger arc (s′, w) is added to the normalized tree and the excess arc of the respective strong branch is removed. Then the full amount of excess from the root r_s′ of the strong branch is pushed along the merger path to s′, then to w, and up to the root r_w of the branch containing w. This process may not succeed in pushing the full amount of excess if it encounters an arc of residual capacity less than the amount of flow to be pushed. In that case this bottleneck arc is saturated and removed from the tree in the procedure split. The head of such an arc then has positive excess, and it becomes a root of a strong branch. The excess that goes through is further pushed along the merger path.

A pseudopolynomial bound applies to the number of iterations of the algorithm. It depends on the initial amount of excess M+ or on the value of the minimum cut or maximum flow in the network.

Lemma 12.4 (Hoc01). Let the initial excess of the strong nodes be M+ and the minimum residual cut capacity be C*. Then the number of merger iterations performed by the algorithm is O(n min{M+, C*}).

Lemma 12.5 (Hoc01). Throughout the algorithm the number of calls to split is no more than the number of merger iterations plus n.

The work per merger iteration consists of:

1. Finding a merger arc (this is discussed later).
2. Finding the next bottleneck arc, of residual capacity δ, along the path.


3. Adding −δ to the residual capacities of the edges along the segment of the path through the bottleneck arc (v_i, v_{i+1}).
4. Splitting the bottleneck arc and suspending the subtree rooted at v_i as a separate strong branch if its excess is positive; if the excess is 0, suspending it as a 0-deficit weak branch.

We use the dynamic trees data structure for operations 2, 3 and 4. All these operations are performed in O(log n) amortized time each.

12.5 Finding a merger arc and use of labels

The naive procedure for finding a merger arc may require up to O(m) operations per merger arc, scanning all neighbors of all strong nodes. A labeling procedure is useful in speeding up this operation as well as some other operations. All nodes carry a label, ℓ_v for all v ∈ V. Initially all labels are set to the value 1. At each iteration the search is for a merger arc from a strong node of label ℓ to a node of label ℓ − 1, such that ℓ is a lowest label of a strong node. An ℓ-labeled strong node is relabeled if it has no neighbor of label ℓ − 1 and if all its children in the normalized tree have label ℓ + 1.

With these rules the labels satisfy the property that the labels of the nodes are monotone nondecreasing with their distance from the root. This monotonicity property facilitates finding a lowest label strong node – it is always the root of its respective branch. Another property is that for every residual arc (u, v), ℓ_u ≤ ℓ_v + 1.

The list of nodes of each label ℓ is maintained in a label bucket B_ℓ. Each label bucket B_ℓ is an n-bit word with entry i equal to 1 if node i has label ℓ. Every time a node changes its label, it is moved from bucket ℓ to bucket ℓ + 1, in O(1) time. We maintain also a second set of buckets used exclusively for roots of branches.

For each node v it is easy to maintain its out-neighbors, the set of nodes u such that (v, u) is in the residual graph. That is because every time an arc changes its orientation in the residual graph, the procedure split was called. In that case we update the status of both endpoints of the arc to show the current residual direction. So this can be done, according to Lemma 12.5, in O(Q) time for Q the number of iterations.

Lemma 12.6 (Hoc01). Throughout the algorithm the total complexity of identifying merger arcs adjacent to strong nodes of label ℓ is at most O(m).

As a corollary we get that the complexity of finding all merger arcs is at most O(mn), as n is the longest possible shortest path in the residual graph and thus the largest label value.

With the labeling procedure we can show that the maximum label of each node is at most n, and that each arc can be used at most n/2 times as a merger arc. The latter is proved with a "seesaw" argument similar to the one used in push/relabel. We thus have:

Lemma 12.7 (Hoc01). Throughout the algorithm the total number of merger iterations is at most O(mn).

We use dynamic trees to implement each merger operation in O(log n) time. The total complexity of the algorithm is then O(mn log n).


IEOR266 notes, Prof. Hochbaum, 2003 89

13 Algorithms for MCNF

13.1 Network Simplex

First, let’s consider the linear programming formulation of the minimum cost flow problem:

Minimize   z = Σ_{i=1}^n Σ_{j=1}^n c_ij x_ij

Subject to:  Σ_{j=1}^n x_ij − Σ_{k=1}^n x_ki = b_i    i = 1, 2, . . . , n

             0 ≤ x_ij ≤ u_ij    i, j = 1, 2, . . . , n

It is easily seen that the constraint matrix is not of full rank; in fact, it has rank n − 1. This can be remedied by adding an artificial variable corresponding to an arbitrary node. Another way is simply to discard one constraint, say the one corresponding to node 1. Assuming that the capacities u_ij are infinite, the dual is:

Maximize   z = Σ_{j=2}^n b_j π_j

Subject to:  π_i − π_j ≤ c_ij    (i, j) ∈ A

We denote by c^π_ij the reduced cost c_ij − π_i + π_j; the reduced costs are the slacks in the dual constraints. As usual, the sum of the reduced costs along a cycle equals the sum of the original costs along that cycle.
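This invariance is easy to check numerically, since the potentials telescope around a cycle. The cycle, costs, and potentials below are arbitrary illustrative data:

```python
# Reduced cost c^pi_ij = c_ij - pi_i + pi_j. Along any directed cycle the
# potentials telescope, so the sum of reduced costs equals the sum of costs.

cycle = [(1, 2), (2, 3), (3, 1)]           # a directed cycle (hypothetical data)
cost = {(1, 2): 4, (2, 3): -1, (3, 1): 2}  # arbitrary arc costs
pi = {1: 0, 2: 5, 3: -3}                   # arbitrary node potentials

def reduced_cost(i, j):
    return cost[(i, j)] - pi[i] + pi[j]

total_cost = sum(cost[a] for a in cycle)
total_reduced = sum(reduced_cost(i, j) for (i, j) in cycle)
print(total_cost, total_reduced)  # 5 5 -- the two sums agree
```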

The simplex method assumes that an initial “basic” feasible flow is given, obtained by solving a phase I problem or another network flow problem. By a basis we mean n − 1 variables x_ij whose columns in the constraint matrix are linearly independent. The nonbasic variables are always set to zero (or, in the case of finite upper bounds, nonbasic variables may be set to their respective upper bounds). We will prove in Section 13.3 that a basic solution corresponds to a spanning tree. The mechanics of the simplex algorithm will be illustrated in the handout, but first we need to define some important terminology used in the algorithm.

Given a flow vector x, a basic arc is called free iff 0 < x_ij < u_ij. An arc (i, j) is called restricted iff x_ij = 0 or x_ij = u_ij. From the concept of a basic solution and its corresponding spanning tree we can recognize a characteristic of an optimal solution. A solution is said to be cycle-free iff in every cycle at least one arc assumes its lower or upper bound.

Proposition 13.1. If there exists a finite optimal solution, then there exists an optimal solution which is cycle-free.

Proof: By contradiction. Suppose we have an optimal solution which is not cycle-free and, among such solutions, has the minimum number of free arcs. Then we can identify a cycle along which flow can be sent in either the clockwise or the counterclockwise direction. By linearity, in at least one of the two directions the objective value does not increase, as the sum of the costs of the arcs along that direction is nonpositive. By the finiteness assumption, sending a certain amount of flow along this legitimate direction makes at least one arc achieve its lower or upper bound. This way we get another optimal solution with fewer free arcs, which is a contradiction.

13.2 (T,L,U) structure of an optimal solution

There are three types of arcs in any feasible solution: free arcs, arcs with zero flow, and arcs with flow at the upper bound. We state the corollary which stems from Proposition 13.1, namely that there exists an optimal solution that is cycle-free.

Corollary 13.1. There exists an optimal solution with the structure (T, L, U), where T is a spanning tree containing all free arcs, L is the set of arcs at their lower bounds, and U is the set of arcs at their upper bounds.

Proof: Since there exists an optimal solution which is cycle-free, its free arcs form a spanning forest. It is then possible to add restricted arcs to form a spanning tree.

13.3 Simplex’s basic solutions and the corresponding spanning trees

The following is an important observation for applying the simplex method to the minimum cost network flow problem.

Theorem 13.1. There is a one-to-one correspondence between basic solutions and spanning trees.

Proof: A basic solution corresponds to a set of linearly independent columns. Recall that we add an artificial variable to make the constraint matrix full rank. The fictitious column we add is always in a basis, for otherwise the original matrix would be of full rank. So a basis can be characterized by n nodes and n − 1 arcs. If we can show that the arcs of a basis contain no cycle, then we are done. For the sake of contradiction, suppose there exists a cycle in the graph corresponding to a basis. Select some arc (i, j) in the cycle and orient the cycle in the direction of that arc. Then to each column in the basis associated with an arc in the cycle, assign the coefficient 1 if the arc is in the direction of the orientation of the cycle and −1 otherwise. Applying these coefficients to the respective columns, we find that the weighted sum of these columns is the zero vector. Thus the columns of the basis could not have been linearly independent.

Note that not every basis is feasible, just as in general linear programming. Furthermore, it can be shown that we can always permute the columns of a basis so that it is in lower-triangular form, which is ideal for Gaussian elimination.

The simplex algorithm works as follows. Given a feasible spanning tree, an entering arc is a nonbasic arc with negative reduced cost; equivalently, we identify a negative cost cycle obtained by adding this arc to the spanning tree of basic arcs, all of which have zero reduced cost. The leaving arc, the counterpart of the entering arc, is the bottleneck arc on the residual cycle.

This corresponds to optimality condition 1, which says that a solution is optimal if the residual graph contains no negative cost cycle (the sum of reduced costs along a cycle is equal to the sum of costs along the cycle).
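The pricing step described above can be sketched in code: fix the root potential, propagate potentials so every tree arc has zero reduced cost, then scan the nonbasic arcs. The tree, arc set, and costs below are made up for illustration:

```python
# Pricing step of network simplex (sketch): set pi at the root arbitrarily,
# propagate potentials so each basic (tree) arc has zero reduced cost, then
# look for a nonbasic arc with negative reduced cost -- an entering arc.
from collections import deque

tree_arcs = [(1, 2), (1, 3), (3, 4)]   # basic arcs: a spanning tree (hypothetical)
nonbasic = [(2, 3), (4, 2), (2, 4)]
cost = {(1, 2): 3, (1, 3): 1, (3, 4): 2, (2, 3): 4, (4, 2): -1, (2, 4): 6}

# Undirected adjacency of the tree, remembering each arc's true direction.
adj = {}
for (i, j) in tree_arcs:
    adj.setdefault(i, []).append((j, (i, j)))
    adj.setdefault(j, []).append((i, (i, j)))

pi = {1: 0}                            # root potential, chosen arbitrarily
queue = deque([1])
while queue:
    u = queue.popleft()
    for v, (i, j) in adj[u]:
        if v not in pi:
            # Solve c_ij - pi_i + pi_j = 0 for the unknown endpoint.
            pi[v] = pi[i] - cost[(i, j)] if v == j else pi[j] + cost[(i, j)]
            queue.append(v)

entering = [(i, j) for (i, j) in nonbasic if cost[(i, j)] - pi[i] + pi[j] < 0]
print(pi, entering)  # {1: 0, 2: -3, 3: -1, 4: -3} [(4, 2)]
```

Here arc (4, 2) has reduced cost −1 and is eligible to enter the basis; adding it to the tree creates the cycle along which flow would be pushed.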

13.4 An example of simplex iteration

See the handout titled An Example of a Minimum Cost Flow Problem (not used in class this fall, 2003).

The example shown in the handout is a minimum cost flow problem. Each arc has a capacity, with zero lower bound and finite upper bound. The numbers next to each arc are its capacity and its unit flow cost. The demand/supply of each node is given by the number next to the node. Note that only node 3 is a transshipment node (b_3 = 0). We are given an initial feasible flow, as shown in (a). The basic variables are x_14, x_13, x_23, and x_25, which together form a spanning tree. Of the nonbasic arcs, all are at their lower bounds, except for x_34, which is at its upper bound.

To identify a lesser cost flow, we calculate the potential of each node as shown in (b). Here we

arbitrarily set π_1 = 0. We then set the reduced costs of all basic variables to zero; that is, c^π_14 = c^π_13 = c^π_23 = c^π_25 = 0, where c^π_ij = c_ij − π_i + π_j. The resulting potential of each node is shown in (b). Part (c) shows the reduced cost of each nonbasic arc (which is in general nonzero). We can see that only c^π_21 = −2 < 0. Therefore x_21 will be the variable entering the basis.

Adding arc (2,1) to the basis, we form a cycle, as shown in (d). The numbers next to each

boldface arc are its current flow and capacity, respectively. Notice that we can now send flow from node 2 along the arcs (2,1), (1,3), and (3,2). The bottleneck arc is (2,3), which is selected as the outgoing arc. The flow that we can send along this cycle is 2 units, so we update the flow on each arc as shown in (e). Note that this iteration attains a lower total cost. We then reiterate the whole process, as shown in parts (f) and (g). The reduced costs shown in part (g) are now all nonnegative, which means that we can no longer lower the total cost and the flow is optimal. Part (h) summarizes the optimal flows of this problem.

13.5 Optimality Conditions

For a flow x∗, we define a residual network G(x∗) in which every arc (i, j) with capacity u_ij, flow x_ij, and cost c_ij in the original network corresponds to two arcs in the residual network: one going from i to j with cost c_ij and residual capacity u_ij − x_ij, the other from j to i with cost −c_ij and residual capacity x_ij.

There are three equivalent statements of the optimality of a flow x∗ for the minimum cost flow

problem.

1. The following are equivalent to optimality of x∗.

• x∗ is feasible.

• There is no negative cost cycle in G(x∗).

2. The following are equivalent to optimality of x∗: there is a potential vector π such that,

• x∗ is feasible.

• c^π_ij ≥ 0 for each arc (i, j) in G(x∗).

3. The following are equivalent to optimality of x∗: there is a potential vector π such that,

• x∗ is feasible,

• c^π_ij > 0 ⇒ x∗_ij = 0,

• c^π_ij = 0 ⇒ 0 ≤ x∗_ij ≤ u_ij,

• c^π_ij < 0 ⇒ x∗_ij = u_ij.

Theorem 13.2 (Optimality condition 1). A feasible flow x∗ is optimal iff the residual network G(x∗) contains no negative cost cycle.

Proof: ⇐ direction.
Let x0 be an optimal flow with x0 ≠ x∗. Then x0 − x∗ is a feasible circulation. A circulation is decomposable into at most m primitive cycles. The sum of the costs of the flows on these cycles is c(x0 − x∗) ≥ 0, since all cycles in G(x∗) have nonnegative cost. But that means cx0 ≥ cx∗, so x∗ is a minimum cost flow.

⇒ direction.
Suppose not; then there is a negative cost cycle in G(x∗). But then we can augment flow along that cycle and obtain a new feasible flow of lower cost, a contradiction.

Theorem 13.3 (Optimality condition 2). A feasible flow x∗ is optimal iff there exist node potentials π such that c^π_ij ≥ 0 for each arc (i, j) in G(x∗).

Proof: ⇐ direction.
If c^π_ij ≥ 0 for each arc (i, j) in G(x∗), then in particular for any cycle C in G(x∗),

Σ_{(i,j)∈C} c^π_ij = Σ_{(i,j)∈C} c_ij ≥ 0.

Thus every cycle’s cost is nonnegative, and from optimality condition 1 it follows that x∗ is optimal.

⇒ direction.
x∗ is optimal, so by optimality condition 1 there is no negative cost cycle in G(x∗). Let d(i) be the shortest path distance from node 1 (say) to node i, and let the node potentials be π_i = −d(i). From the validity of the distance labels it follows that

c_ij + d(i) ≥ d(j).

Thus c^π_ij = c_ij − π_i + π_j = c_ij + d(i) − d(j) ≥ 0.
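The construction in the ⇒ direction is easy to exercise numerically: compute shortest path distances with Bellman-Ford, set π = −d, and check that all reduced costs come out nonnegative. The arcs and costs below are illustrative data with no negative cycle:

```python
# Potentials from shortest path distances, as in the proof: pi_i = -d(i)
# makes every arc's reduced cost c_ij - pi_i + pi_j = c_ij + d(i) - d(j) >= 0.
INF = float("inf")
arcs = {(1, 2): 2, (1, 3): 5, (2, 3): -1, (3, 4): 2, (2, 4): 4}  # hypothetical
nodes = {1, 2, 3, 4}

d = {v: INF for v in nodes}
d[1] = 0
for _ in range(len(nodes) - 1):        # Bellman-Ford relaxation rounds
    for (i, j), c in arcs.items():
        if d[i] + c < d[j]:
            d[j] = d[i] + c

pi = {v: -d[v] for v in nodes}
reduced = {(i, j): c - pi[i] + pi[j] for (i, j), c in arcs.items()}
print(all(rc >= 0 for rc in reduced.values()))  # True
```

Arcs on shortest paths end up with reduced cost exactly 0, which is why shortest path trees are natural candidates for simplex bases.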

Theorem 13.4 (Optimality condition 3). A feasible flow x∗ is optimal iff there exist node potentials π such that,

• c^π_ij > 0 ⇒ x∗_ij = 0,

• c^π_ij = 0 ⇒ 0 ≤ x∗_ij ≤ u_ij,

• c^π_ij < 0 ⇒ x∗_ij = u_ij.

Proof: ⇐ direction.
If these conditions are satisfied, then in particular c^π_ij ≥ 0 for all residual arcs, which means that optimality condition 2 is satisfied.

⇒ direction.
From optimality condition 2, c^π_ij ≥ 0 for all (i, j) in G(x∗). Suppose that for some (i, j), c^π_ij > 0 but x∗_ij > 0. Then (j, i) ∈ G(x∗) and thus c^π_ji = −c^π_ij < 0, a contradiction to optimality condition 2. Therefore the first condition on the list is satisfied. The second is satisfied due to feasibility. Suppose now that c^π_ij < 0. Then arc (i, j) is not residual, as all residual arcs have nonnegative reduced cost, so the arc must be saturated and x∗_ij = u_ij, as stated.


13.6 Implied algorithms – Cycle cancelling

Optimality condition 1 implies a negative cycle cancelling algorithm: whenever a negative cost cycle is detected in the residual graph, it is “cancelled” by augmenting flow along the cycle in the amount of its bottleneck residual capacity. Finding a negative cycle in the residual graph is done by applying the Bellman-Ford algorithm.
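The loop can be sketched as follows, under some simplifying assumptions: integral data, a feasible starting flow is given, and no pair of antiparallel arcs (so each residual arc has a unique origin). The instance at the bottom is made up for illustration.

```python
# Cycle cancelling (sketch): repeatedly find a negative cost cycle in the
# residual graph with Bellman-Ford and cancel it by sending the bottleneck
# residual capacity around it. Assumes no antiparallel arc pairs.

def residual(arcs, flow):
    """Residual graph: (i, j) with capacity u - x and cost c,
    plus (j, i) with capacity x and cost -c."""
    res = {}
    for (i, j), (u, c) in arcs.items():
        if flow[(i, j)] < u:
            res[(i, j)] = (u - flow[(i, j)], c)
        if flow[(i, j)] > 0:
            res[(j, i)] = (flow[(i, j)], -c)
    return res

def negative_cycle(res, nodes):
    """Bellman-Ford with an implicit 0-cost source into every node.
    Returns the node sequence of a negative cost cycle, or None."""
    d = {v: 0 for v in nodes}
    parent = {}
    last = None
    for _ in range(len(nodes)):
        last = None
        for (i, j), (cap, c) in res.items():
            if d[i] + c < d[j]:
                d[j] = d[i] + c
                parent[j] = i
                last = j
    if last is None:                    # n-th round relaxed nothing: no cycle
        return None
    for _ in range(len(nodes)):         # walk back to land on the cycle
        last = parent[last]
    cycle, v = [last], parent[last]
    while v != last:
        cycle.append(v)
        v = parent[v]
    cycle.reverse()                     # consecutive nodes are residual arcs
    return cycle

def cancel_cycles(arcs, nodes, flow):
    while True:
        res = residual(arcs, flow)
        cycle = negative_cycle(res, nodes)
        if cycle is None:
            return flow                 # optimal: no negative cycle remains
        pairs = list(zip(cycle, cycle[1:] + cycle[:1]))
        delta = min(res[a][0] for a in pairs)   # bottleneck residual capacity
        for (i, j) in pairs:
            if (i, j) in arcs:
                flow[(i, j)] += delta
            else:
                flow[(j, i)] -= delta

# Toy instance: arc -> (capacity, cost); 2 units must travel from node 1 to 3.
arcs = {(1, 2): (2, 1), (2, 3): (2, 1), (1, 3): (2, 5)}
flow = cancel_cycles(arcs, {1, 2, 3}, {(1, 2): 0, (2, 3): 0, (1, 3): 2})
print(flow, sum(flow[a] * arcs[a][1] for a in arcs))  # cost drops from 10 to 4
```

On this instance the cycle 1→2→3→1 in the residual graph has cost 1 + 1 − 5 = −3, and one cancellation reroutes both units through node 2.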

If no negative cost cycle is detected, then the current flow is optimal. In the simplex method it is easier to detect a negative cost cycle: since all in-tree arcs have reduced cost 0, it is only necessary to find a negative reduced cost arc out of the tree. So it seems that simplex is a more efficient form of the negative cycle cancelling algorithm, isn’t it?

The answer is no. The reason is that while simplex identifies a negative cost cycle more easily than the Bellman-Ford algorithm does, this cycle is not necessarily in the residual graph. In other words, the tree, and hence the basis, can contain arcs that are at their lower or upper bounds. This is precisely the case of degeneracy. In that case, although a negative cost cycle was found, the bottleneck capacity along that cycle can be 0, and the simplex pivot leads to another solution of equal value.

The complexity of the cycle cancelling algorithm is finite: every iteration reduces the cost of the solution by at least one unit. For maximum arc cost equal to C and maximum capacity value equal to U, the number of iterations cannot exceed 2mUC. The complexity is thus O(m²nUC), which is not polynomial; it is exponential in the length of the numbers in the input.

In order to make this complexity polynomial we can apply a scaling algorithm. The scaling can be applied to either capacities, or costs, or both.

13.7 Analogy to maximum flow

We know how to represent the maximum flow problem as a minimum cost network flow problem: we add an arc of cost −1 and infinite capacity from t to s. Since all other costs in the network are 0, the total cost is minimized when the flow on the arc from t to s is maximum. Maximum flow as a minimum cost network flow problem is described in Figure 42.

[Figure: source s on the left, sink t on the right, intermediate nodes i, j in between; all original arcs have finite capacity and cost 0, and an added return arc from t to s has cost −1 and capacity ∞.]

Figure 42: Maximum flow as MCNF problem
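The reduction in Figure 42 is mechanical enough to write down. The helper below is a sketch (the function name and instance representation are my own); since flow values never exceed the total capacity, a large finite number stands in for the infinite capacity:

```python
# Max flow as MCNF (sketch of the Figure 42 reduction): keep all original arcs
# with cost 0, add an arc (t, s) with cost -1 and effectively infinite
# capacity; minimizing cost then maximizes the flow on (t, s).
def max_flow_as_mcnf(arcs, s, t):
    """arcs: {(i, j): capacity}. Returns {(i, j): (capacity, cost)}."""
    big = sum(arcs.values()) + 1            # stands in for infinite capacity
    mcnf = {(i, j): (u, 0) for (i, j), u in arcs.items()}
    mcnf[(t, s)] = (big, -1)
    return mcnf

net = max_flow_as_mcnf({("s", "a"): 3, ("a", "t"): 2}, "s", "t")
print(net[("t", "s")])  # (6, -1)
```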

If we solve this MCNF problem with the cycle cancelling algorithm, we observe that the only negative cost arc is (t, s), and therefore each negative cost cycle is a path from s to t augmented with this arc. When we look for a negative cost cycle of minimum mean, that is, minimizing the ratio of the cost to the number of arcs, this translates in the context of maximum flow to −1 divided by the length of the s,t path in terms of the number of arcs. The minimum mean thus corresponds to a path with the smallest number of arcs – the shortest augmenting path. You can easily verify the other analogies in the following table.

Minimum cost network flow                 Maximum flow
----------------------------------------  ----------------------------------------------
Negative cost cycle in G(x)               Augmenting path in G(x)
Cycle cancelling algorithm                Augmenting path algorithm
Minimum mean cost cycle cancelling        Shortest augmenting path
Most negative cost cycle                  Augmenting path (all paths have the same cost)

Notice that finding a most negative cost cycle is NP-hard, as it is at least as hard as finding a Hamiltonian cycle if one exists (set all arc costs to −1 and all capacities to 1; finding a most negative cost cycle is then equivalent to finding a longest cycle). But finding a minimum mean cost cycle can be done in polynomial time. The mean cost of a cycle is its cost divided by the number of arcs on the cycle.
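One classical polynomial-time method for the minimum mean cycle, not derived in these notes, is Karp's O(mn) dynamic program: let D_k(v) be the minimum cost of a walk with exactly k edges ending at v (walks may start anywhere, so D_0(v) = 0), and then the minimum mean equals min_v max_k (D_n(v) − D_k(v))/(n − k). A sketch, on a small made-up instance:

```python
# Karp's minimum mean cycle (sketch). D[k][v] = min cost of a walk with
# exactly k edges ending at v; D[0][v] = 0 since walks may start anywhere.
INF = float("inf")

def min_mean_cycle(nodes, arcs):
    n = len(nodes)
    D = [{v: INF for v in nodes} for _ in range(n + 1)]
    for v in nodes:
        D[0][v] = 0
    for k in range(1, n + 1):
        for (i, j), c in arcs.items():
            if D[k - 1][i] + c < D[k][j]:
                D[k][j] = D[k - 1][i] + c
    best = INF
    for v in nodes:
        if D[n][v] < INF:
            worst = max((D[n][v] - D[k][v]) / (n - k)
                        for k in range(n) if D[k][v] < INF)
            best = min(best, worst)
    return best          # INF if the graph is acyclic

# Two cycles: 1-2-3-1 with mean (1+2+3)/3 = 2, and 1-4-1 with mean (1+1)/2 = 1.
arcs = {(1, 2): 1, (2, 3): 2, (3, 1): 3, (1, 4): 1, (4, 1): 1}
print(min_mean_cycle({1, 2, 3, 4}, arcs))  # 1.0
```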

13.8 The gap between the primal and dual

For the max flow min cut problem the gap between the primal and the dual is significant, as we already discussed. We now investigate the gap between primal and dual for MCNF. We want to show that for an optimal flow x∗ there is a corresponding set of potentials π_i, and vice versa.

Flow → potentials: Let x∗ be the flow. The residual graph G(x∗) is connected, as there must be at least one arc in the residual graph for each (i, j) ∈ A. Let the length of the arc (i, j) ∈ G(x∗) be:

d_ij = c_ij    if (i, j) ∈ A,
d_ij = −c_ji   if (j, i) ∈ A.

Now find the shortest path distances d(i) from node 1, and let π_i = −d(i). The total complexity of this procedure is O(mn) using the Bellman-Ford algorithm.

Potentials → flow:

If c^π_ij > 0, then set x∗_ij = 0.
If c^π_ij < 0, then set x∗_ij = u_ij.

The arcs with c^π_ij = 0 form an acyclic graph – a forest. We now use the maximum flow procedure to find a flow that is feasible on these arcs: connect the excess nodes to a source and the deficit nodes to a sink, and find a flow that saturates the source and sink arcs, that is, a maximum flow. The complexity of this process is that of finding a maximum flow, e.g. O(mn log(n²/m)).
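The potentials-to-flow direction pins down the flow on every arc of nonzero reduced cost, exactly as in optimality condition 3. A small checker for those complementary slackness conditions (the instance, flow, and potentials below are hypothetical illustrative data):

```python
# Check the conditions of optimality condition 3 for a given flow and
# potentials: c^pi > 0 forces x = 0, c^pi < 0 forces x = u, and
# c^pi = 0 only requires 0 <= x <= u.
def satisfies_condition_3(arcs, flow, pi):
    for (i, j), (u, c) in arcs.items():
        rc = c - pi[i] + pi[j]          # reduced cost
        x = flow[(i, j)]
        if rc > 0 and x != 0:
            return False
        if rc < 0 and x != u:
            return False
        if not (0 <= x <= u):
            return False
    return True

# Hypothetical instance: arc -> (capacity, cost).
arcs = {(1, 2): (3, 2), (2, 3): (3, 1), (1, 3): (2, 5)}
pi = {1: 0, 2: -2, 3: -3}
flow = {(1, 2): 3, (2, 3): 3, (1, 3): 0}
print(satisfies_condition_3(arcs, flow, pi))  # True
```

With these potentials, arcs (1,2) and (2,3) have reduced cost 0 and may carry any feasible flow, while arc (1,3) has reduced cost 2 > 0 and must carry none.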