
    MA4254: Discrete Optimization

    Defeng Sun

    Department of Mathematics

    National University of Singapore

    Office: S14-04-25

    Telephone: 6516 3343

    Aims/Objectives: Discrete optimization deals with problems of maximizing or minimizing a function over a feasible region of discrete structure. These problems come from many fields like operations research, management science, and computer science. The primary objective of this course is twofold: a) to study key techniques to separate easy problems from difficult ones and b) to use typical methods to deal with difficult problems.

    Mode of Evaluation: Tutorial class performance (10%); Mid-Term test (20%) and Final examination (70%)

    This course is taught at the Department of Mathematics, National University of Singapore, Semester I, 2009/2010. E-mail: [email protected]

    References:

    1) D. Bertsimas and J. N. Tsitsiklis, Introduction to Linear Optimization. Athena Scientific, 1997.

    2) G. L. Nemhauser and L. A. Wolsey, Integer and Combinatorial Optimization. John Wiley and Sons, 1999.

    3) C. H. Papadimitriou and K. Steiglitz, Combinatorial Optimization: Algorithms and Complexity. Prentice-Hall, 1982. Second edition by Dover, 1998.

    PARTIAL lecture notes will be made available on my webpage

    http://www.math.nus.edu.sg/ matsundf/


    1 Introduction

    In this Chapter we will briefly discuss the problems we are going to study; give a

    short review about simplex methods for solving linear programming problems and

    introduce some basic concepts in graphs and digraphs.

    1.1 Linear Programming (LP): a short review

    Consider the following linear programming problem

    $$(P)\qquad \min\ c^T x \quad \text{s.t.}\quad Ax \ge b,\ x \ge 0$$

    and its dual

    $$(D)\qquad \max\ b^T y \quad \text{s.t.}\quad A^T y \le c,\ y \ge 0.$$

    Simplex Method
      - Dantzig (1947)
      - Very efficient
      - Not a polynomial time algorithm. Klee and Minty (1972) gave a counterexample.
      - Average analysis versus worst-case analysis

    Russians' Ellipsoid Method
      - Polynomial time algorithm (Khachiyan, 1979)
      - Less efficient

    Interior-Point Algorithms
      - Karmarkar (1984)
      - Polynomial time algorithm
      - Efficient for some large-scale sparse LPs

    Others

    1.2 Discrete Optimization (DO)

    Also Combinatorial Optimization (CO)

    Mathematical formulation in general:

    $$\min\ f(x) \quad \text{s.t.}\quad x \in F$$

      - $x$: decision policy
      - $F$ is the collection of feasible decision policies
      - $f(x)$ measures the value of members of $F$.

    A typical DO (CO) problem:

    $$(IP)\qquad \min\ c^T x \quad \text{s.t.}\quad Ax \ge b,\ x \ge 0,\ x_j \text{ integer for } j \in I \subseteq N := \{1, \ldots, n\}.$$

    where c

    2. The Assignment Problem

      - There are n people and m jobs, where $n \ge m$.
      - Each job must be assigned to exactly one person, and each person can do at most one job.
      - The cost of person j doing job i is $c_{ij}$.

    Then the Assignment Problem can be formulated as

    $$\begin{aligned}
    \min\ & \sum_{i=1}^{m} \sum_{j=1}^{n} c_{ij} x_{ij} \\
    \text{s.t.}\ & \sum_{j=1}^{n} x_{ij} = 1, \quad i = 1, \ldots, m, \\
    & \sum_{i=1}^{m} x_{ij} \le 1, \quad j = 1, \ldots, n, \\
    & x \in B^{mn}.
    \end{aligned}$$
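The formulation above can be checked on tiny instances by plain enumeration. The following is a minimal sketch, not an efficient method: the function name `solve_assignment` and the 2-job, 3-person cost table are made up for illustration, and the running time grows factorially.

```python
from itertools import permutations

def solve_assignment(cost):
    """cost[i][j] = cost of person j doing job i (m jobs as rows, n >= m people as columns).

    Enumerates every way to give the m jobs to m distinct people, which is
    exactly the feasible set of the 0-1 formulation.
    """
    m, n = len(cost), len(cost[0])
    best_value, best_people = None, None
    for people in permutations(range(n), m):  # people[i] is the person doing job i
        value = sum(cost[i][people[i]] for i in range(m))
        if best_value is None or value < best_value:
            best_value, best_people = value, people
    return best_value, best_people

# Hypothetical instance: 2 jobs, 3 people
cost = [[4, 1, 3],
        [2, 2, 5]]
value, people = solve_assignment(cost)
print(value, people)  # job 0 goes to person 1, job 1 to person 0
```

Enumeration is only meant to make the constraints concrete; the total unimodularity results of Section 2 explain why this problem can be solved as a linear program instead.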

    Extensions: Three-Index Assignment Problem

    3. Set-Covering, Set-Packing, and Set-Partitioning Problems

    The Set-Covering Problem is

    $$\min\ c^T x \quad \text{s.t.}\quad Ax \ge \mathbf{1},\ x \in B^n.$$

    The Set-Packing Problem is

    $$\max\ c^T x \quad \text{s.t.}\quad Ax \le \mathbf{1},\ x \in B^n.$$

    4. Traveling Salesman Problem (TSP)

    We are given a set of nodes $V = \{1, \ldots, n\}$ and a set of arcs $A$. The nodes represent cities, and the arcs represent ordered pairs of cities between which direct travel is possible.

    For $(i, j) \in A$, $c_{ij}$ is the direct travel time from city i to city j. The TSP is to find a tour, starting at city 1, that

    (a) visits each other city exactly once and then returns to city 1, and

    (b) takes the least total travel time.
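To make the definition concrete, here is a brute-force sketch that tries every ordering of cities 2, ..., n. The function name `brute_force_tsp` and the 4-city travel times are illustrative assumptions; the (n-1)! orderings make this usable only for very small n, which is the exponential growth discussed in the next subsection.

```python
from itertools import permutations

def brute_force_tsp(n, travel_time):
    """travel_time maps an arc (i, j) to c_ij; cities are 1..n.

    A pair missing from travel_time means no direct travel is possible.
    Returns (least total time, tour) or (None, None) if no tour exists.
    """
    best_time, best_tour = None, None
    for order in permutations(range(2, n + 1)):
        tour = (1,) + order + (1,)           # start and end at city 1
        arcs = list(zip(tour, tour[1:]))
        if any(arc not in travel_time for arc in arcs):
            continue                          # tour uses a non-existent arc
        total = sum(travel_time[arc] for arc in arcs)
        if best_time is None or total < best_time:
            best_time, best_tour = total, tour
    return best_time, best_tour

# Hypothetical asymmetric instance on 4 cities
c = {(1, 2): 1, (2, 3): 1, (3, 4): 1, (4, 1): 1,
     (1, 3): 5, (3, 2): 5, (2, 4): 5, (4, 3): 5,
     (3, 1): 5, (2, 1): 5, (4, 2): 5, (1, 4): 5}
best_time, best_tour = brute_force_tsp(4, c)
print(best_time, best_tour)
```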

    5. Facility Location Problem, Network Flow Problem, and many more

    1.4 Why is DO (CO) difficult?

    That the number of arrangements grows exponentially is only the superficial reason.

    Total Unimodularity (TU) Theory; Shortest Path; Matroids and Greedy Algorithm; Complexity ($P \ne NP$ conjecture); Interior-Point Algorithms; Cutting Plane; Branch and Bound; Decomposition; Flowshop Scheduling, etc.

    1.5 Convex sets

    In linear programming and nonlinear programming, we have already met many convex sets. For example, the line segment between two points in

    1.6 Hyperplanes and half spaces

    Definition 1.2 Let a be a nonzero vector in

    Let x0 be any point on the hyperplane {x

    It is noted that these halfspaces are finite in number. The intersection of two polyhedra is again a polyhedron. So {x

    over which we are optimizing. There are quite a number of different but equivalent ways to define the concept of a corner. Here we introduce two of them: extreme points and basic feasible solutions.

    Our first definition defines an extreme point of a polyhedron as a point that cannot be expressed as a convex combination of two other points of the polyhedron.

    Definition 1.6 Let P

    where $a_1 = (0, 0, 2)^T$, $a_2 = (4, 0, 0)^T$ and $a_3 = (1, 1, 1)^T$. Let $a_4 = e_1$, $a_5 = e_2$ and $a_6 = e_3$. Then

    $$M_1 = \{1, 4, 5, 6\}, \quad M_2 = \{2\}, \quad M_3 = \{3\}.$$

    Definition 1.7 If a vector $x$ satisfies $a_i^T x = b_i$ for some $i \in M_1$, $M_2$ or $M_3$, we say that the corresponding constraint is active or binding at $x$. The active set of $P$ at $x$ is defined as

    $$I(x) = \{ i \in M_1 \cup M_2 \cup M_3 \mid a_i^T x = b_i \},$$

    i.e., $I(x)$ is the set of indices of constraints that are active at $x$.

    For example, suppose that $P$ is defined by (1.1). Let $x = (0.5, 0, 0.5)^T$. All active constraints at $x$ are

    $$a_1^T x \ge 1, \quad a_3^T x = 1, \quad a_5^T x\ (= x_2) \ge 0$$

    and

    $$I(x) = \{1, 3, 5\}.$$

    Recall that vectors x1, . . . , xk

    (a) There exist n vectors in the set $\{a_i \mid i \in I(x)\}$, which are linearly independent.

    (b) The span of the vectors $a_i$, $i \in I(x)$, is all of

    which is orthogonal to the subspace spanned by these vectors. If $x$ satisfies $a_i^T x = b_i$ for all $i \in I(x)$, we also have $a_i^T (x + d) = b_i$ for all $i \in I(x)$, thus obtaining multiple solutions. We have therefore established that (b) and (c) are equivalent. Q.E.D.

    With a slight abuse of language, we will often say that certain constraints are linearly independent, meaning that the corresponding vectors $a_i$ are linearly independent. We are now ready to provide an algebraic definition of a corner point of the polyhedron $P$.

    Definition 1.8 Let x


    Note that if the number m of constraints used to define a polyhedron P


    The set P = {x


    Definition 1.11

    (a) A nonzero element d of a polyhedral cone C

    1.10 Simplex Method Revisited

    Consider the standard linear programming problem

    $$(P)\qquad \min\ c^T x \quad \text{s.t.}\quad Ax = b,\ x \ge 0, \tag{1.2}$$

    where A

    Finding an initial basic feasible solution: the artificial variables method and the big-M method.

    For the dual simplex method, we have the tableaus

    $$\begin{array}{c|ccc}
    0 & c_1 & \cdots & c_n \\ \hline
    b_1 & | & & | \\
    \vdots & A_1 & \cdots & A_n \\
    b_m & | & & |
    \end{array}$$

    and

    $$\begin{array}{c|ccc}
    -c_B^T x_B & \bar c_1 & \cdots & \bar c_n \\ \hline
    x_{B(1)} & | & & | \\
    \vdots & B^{-1}A_1 & \cdots & B^{-1}A_n \\
    x_{B(m)} & | & & |
    \end{array}$$

    We do not require $B^{-1}b$ to be nonnegative, which means that we have a basic, but not necessarily feasible solution to the primal problem. However, we assume that $\bar c \ge 0$; equivalently, the vector $y^T = c_B^T B^{-1}$ satisfies $y^T A \le c^T$, and we have a feasible solution to the dual problem. The cost of this dual feasible solution is $y^T b = c_B^T B^{-1} b = c_B^T x_B$, which is the negative of the entry at the upper left corner of the tableau.

    1.11 Graphs and Digraphs

    1.11.1 Graphs

    Definition 1.12 A graph G is a pair (V, E), where V is a finite set and E is a set of unordered pairs of elements of V. Elements of V are called vertices and elements of E edges. We say that a pair of distinct vertices are adjacent if they define an edge, and that the edge is incident to its defining vertices. The degree of a vertex v (denoted deg(v)) is the number of edges incident to that vertex.

    An Example.

    [Figure 1.3: A Graph (edges labeled e1, e2, e3, e4)]

    Definition 1.13 A $v_1v_k$-path (or path connecting $v_1$ and $v_k$) is a sequence of edges

    $$v_1v_2, \ldots, v_{i-1}v_i, \ldots, v_{k-1}v_k.$$

    A cycle is a sequence of edges

    $$v_1v_2, \ldots, v_{i-1}v_i, \ldots, v_{k-1}v_k, v_kv_1.$$

    In both cases vertices are all distinct. A graph is acyclic if it has no cycle.

    Proposition 1.2 If every vertex of G has degree of at least two then G has a cycle.

    Proof. Let $P = v_1v_2, \ldots, v_{k-1}v_k$ be a path of G with a maximum number of edges. Since $\deg(v_k) \ge 2$, there is an edge $v_kw$ where $w \ne v_{k-1}$. It follows from the choice of P that w is a vertex of P, i.e., $w = v_i$ for some $i \in \{1, \ldots, k-2\}$. Then $v_iv_{i+1}, \ldots, v_{k-1}v_k, v_kv_i$ is a cycle. Q.E.D.
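The argument in the proof is constructive, and a sketch of it in code may help. Instead of starting from a longest path, the version below simply keeps extending a path, never returning along the edge just used, until it meets a vertex already on the path. The function name `find_cycle` and the small graph are illustrative assumptions; the input is assumed to be a simple graph in which every vertex has degree at least two.

```python
def find_cycle(adj):
    """adj: dict mapping each vertex to a list of its neighbours.

    Assumes a simple graph with every degree >= 2.
    Returns the vertices of a cycle in order.
    """
    start = next(iter(adj))
    path = [start]
    position = {start: 0}   # index of each vertex on the current path
    prev = None
    while True:
        v = path[-1]
        # deg(v) >= 2 guarantees a neighbour other than the one we came from
        w = next(u for u in adj[v] if u != prev)
        if w in position:
            # w = v_i lies on the path, so v_i, ..., v_k together with edge v_k v_i is a cycle
            return path[position[w]:]
        position[w] = len(path)
        path.append(w)
        prev = v

# Hypothetical graph on 4 vertices, all of degree >= 2
graph = {1: [2, 3], 2: [1, 3, 4], 3: [1, 2, 4], 4: [2, 3]}
cycle = find_cycle(graph)
print(cycle)
```

The loop must stop after at most |V| extensions, because the path cannot contain more than |V| distinct vertices.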

    Definition 1.14 G is connected if each pair of vertices is connected by a path.

    Proposition 1.3 Let G be a connected graph with a cycle C and let e be an edge of C. Then $G - e$ is connected.

    Proof. Let $v_1, v_2$ be vertices of $G - e$. We need to show there exists a $v_1v_2$-path of $G - e$. Since G is connected there exists a $v_1v_2$-path P of G. If P does not use e then we are done. Otherwise P implies there exists a $v_1w_1$-path $P_1$ and a $w_2v_2$-path $P_2$, where $w_1, w_2$ are endpoints of e. Moreover, $C - w_1w_2$ is a $w_1w_2$-path. The result now follows. Q.E.D.

    Definition 1.15 H is a subgraph of G if $V(H) \subseteq V(G)$ and $E(H) \subseteq E(G)$. It is a spanning subgraph if in addition $V(H) = V(G)$.

    Definition 1.16 A tree is a connected acyclic graph.

    Theorem 1.2 If T = (V, E) is a tree, then $|E| = |V| - 1$.

    Proof. Let us proceed by induction on the number of vertices of V. The base case $|V| = 1$ is trivial since then $|E| = 0$. Assume now $|V| \ge 2$ and suppose the theorem holds for all trees with $|V| - 1$ vertices. Since T is acyclic, it follows from Proposition 1.2 that there is a vertex v with $\deg(v) \le 1$. Since T is connected and $|V| \ge 2$, $\deg(v) \ne 0$. Thus, there is a unique edge uv incident to v. Let $T'$ be defined as follows: $V(T') = V - \{v\}$ and $E(T') = E - \{uv\}$. Observe that $T'$ is a tree. Hence by induction $|E(T')| = |V(T')| - 1$ and it follows $|E| = |V| - 1$. Q.E.D.

    Proposition 1.4 Let G = (V, E) be a connected graph. Then $|E| \ge |V| - 1$. Moreover, if equality holds then G is a tree.

    Proof. If G has a cycle then remove from G any edge on the cycle. Repeat until the resulting graph T is acyclic. It follows from Proposition 1.3 that T is connected. Hence T is a tree and by Theorem 1.2,

    $$|E(G)| \ge |E(T)| = |V(G)| - 1.$$

    If equality holds, then no edge was removed, so G = T is a tree. Q.E.D.

    1.11.2 Bipartite Graph

    G = (S, T, E): every edge in E has one vertex in S and the other in T.

    1.11.3 Vertex-Edge Incidence Matrix

    Definition 1.17 The vertex-edge incidence matrix of a graph G = (V, E) is a matrix A with |V| rows and |E| columns whose entries are either 0 or 1 such that

      - the rows correspond to the vertices of G,
      - the columns correspond to the edges of G, and
      - the entry $A_{v,ij}$ for vertex v and edge ij is given by

    $$A_{v,ij} = \begin{cases} 0 & \text{if } v \ne i \text{ and } v \ne j, \\ 1 & \text{if } v = i \text{ or } v = j. \end{cases}$$

    1.11.4 Digraphs (Directed Graphs)

    Definition 1.18 A directed graph (or digraph) D is a pair (N, A) where N is a finite set and A is a set of ordered pairs of elements of N. Elements of N are called nodes and elements of A arcs. Node i (resp. j) is the tail (resp. head) of arc ij. The in-degree (resp. out-degree) of node v, denoted $\deg^+(v)$ (resp. $\deg^-(v)$), is the number of arcs with head (resp. tail) v.

    1.11.5 Bipartite Digraph

    D = (S, T,A)

    1.11.6 Node-Arc Incidence Matrix

    Definition 1.19 The node-arc incidence matrix of a digraph D = (N, A) is a matrix M with |N| rows and |A| columns whose entries are either 0, +1, or −1 such that

      - the rows correspond to the nodes of D,
      - the columns correspond to the arcs of D, and
      - the entry $M_{v,ij}$ for node v and arc ij is given by

    $$M_{v,ij} = \begin{cases} 0 & \text{if } v \ne i \text{ and } v \ne j, \\ +1 & \text{if } v = j, \\ -1 & \text{if } v = i. \end{cases}$$
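As a small illustration of the definition, the sketch below builds the node-arc incidence matrix of a digraph as a list of rows. The function name `node_arc_incidence` and the three-node digraph are assumptions made for the example.

```python
def node_arc_incidence(nodes, arcs):
    """Return M with one row per node and one column per arc (i, j):
    -1 in the row of the tail i, +1 in the row of the head j, 0 elsewhere."""
    row = {v: r for r, v in enumerate(nodes)}
    M = [[0] * len(arcs) for _ in nodes]
    for col, (i, j) in enumerate(arcs):
        M[row[i]][col] = -1   # v = i (tail)
        M[row[j]][col] = +1   # v = j (head)
    return M

# Hypothetical digraph with arcs 1->2, 2->3, 1->3
M = node_arc_incidence([1, 2, 3], [(1, 2), (2, 3), (1, 3)])
for r in M:
    print(r)
```

Each column holds exactly one +1 and one −1, so the rows sum to the zero vector; this redundancy is what later allows one row to be dropped in the shortest path formulation.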

    2 Total Unimodularity (TU) and Its Applications

    In this section we will discuss the total unimodularity theory and its applications to flows in networks.

    2.1 Total Unimodularity: Definition and Properties

    Consider the following integer linear programming problem

    $$(P)\qquad \max\ c^T x \quad \text{s.t.}\quad Ax = b,\ x \ge 0, \tag{2.1}$$

    where $A \in Z^{m \times n}$, $b \in Z^m$ and $c \in Z^n$ are all integer.

    Definition 2.1 A square, integer matrix B is called unimodular if $|\mathrm{Det}(B)| = 1$. An integer matrix A is called totally unimodular if every square, nonsingular submatrix of A is unimodular.

    The above definition means that a TU matrix is a $\{1, 0, -1\}$-matrix. But a $\{1, 0, -1\}$-matrix is not necessarily a TU matrix, e.g.,

    $$A = \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}.$$
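Definition 2.1 can be tested directly on small matrices by computing the determinant of every square submatrix. The sketch below does exactly that; the function names `det` and `is_totally_unimodular` are illustrative, the work grows exponentially with the matrix size, and the two test matrices (a 2-by-2 matrix with entries ±1 and determinant 2, and the node-arc incidence matrix of a three-node digraph) are examples chosen here.

```python
from itertools import combinations

def det(M):
    """Determinant of a square integer matrix by cofactor expansion along the first row."""
    if not M:
        return 1
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, a in enumerate(M[0]):
        if a:
            minor = [row[:j] + row[j + 1:] for row in M[1:]]
            total += (-1) ** j * a * det(minor)
    return total

def is_totally_unimodular(A):
    """True if every square submatrix of A has determinant -1, 0 or +1."""
    m, n = len(A), len(A[0])
    for k in range(1, min(m, n) + 1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                sub = [[A[i][j] for j in cols] for i in rows]
                if det(sub) not in (-1, 0, 1):
                    return False
    return True

not_tu = [[1, 1], [-1, 1]]                        # entries in {1, 0, -1}, determinant 2
incidence = [[-1, 0, -1], [1, -1, 0], [0, 1, 1]]  # arcs 1->2, 2->3, 1->3
print(is_totally_unimodular(not_tu), is_totally_unimodular(incidence))
```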

    Lemma 2.1 Suppose that $A \in Z^{n \times n}$ is a unimodular matrix and that $b \in Z^n$ is an integer vector. If A is nonsingular, then $Ax = b$ has the unique integer solution $x = A^{-1} b$.

    Proof. Let $a_{ij}$ be the ij-th entry of A, i, j = 1, ..., n. For any $a_{ij}$, define the cofactor of $a_{ij}$ as

    $$\mathrm{Cof}(a_{ij}) = (-1)^{i+j}\, \mathrm{Det}\big(A^{\{1,\ldots,n\} \setminus \{j\}}_{\{1,\ldots,n\} \setminus \{i\}}\big),$$

    where $A^{\{1,\ldots,n\} \setminus \{j\}}_{\{1,\ldots,n\} \setminus \{i\}}$ is the matrix obtained by removing the i-th row and the j-th column of A. Then

    $$\mathrm{Det}(A) = \sum_{i=1}^{n} a_{i1}\, \mathrm{Cof}(a_{i1}).$$

    The Adjoint of A is

    $$\mathrm{Adj}(A) = \mathrm{Adj}(\{a_{ij}\}) = \{\mathrm{Cof}(a_{ij})\}^T$$

    and the inverse of A is

    $$A^{-1} = \frac{1}{\mathrm{Det}(A)}\, \mathrm{Adj}(A).$$

    Since $A \in Z^{n \times n}$ is a unimodular nonsingular integer matrix, every $\mathrm{Cof}(a_{ij})$ is an integer and $\mathrm{Det}(A) = \pm 1$. Hence $A^{-1}$ is an integer matrix and $x = A^{-1} b$ is integer whenever b is. Q.E.D.
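The proof translates into a short computation that never leaves the integers. The sketch below forms the adjoint from cofactors and uses the fact that 1/Det(A) equals Det(A) when the determinant is +1 or −1. The function names and the two small matrices are illustrative assumptions.

```python
def det(M):
    """Determinant by cofactor expansion along the first row (integer arithmetic only)."""
    if not M:
        return 1
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * a * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j, a in enumerate(M[0]) if a)

def adjugate(A):
    """Transpose of the matrix of cofactors of A."""
    n = len(A)
    def minor(i, j):  # remove the i-th row and the j-th column
        return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]
    cof = [[(-1) ** (i + j) * det(minor(i, j)) for j in range(n)] for i in range(n)]
    return [[cof[j][i] for j in range(n)] for i in range(n)]

def solve_unimodular(A, b):
    """Solve Ax = b for a unimodular integer matrix A; the result is an integer vector."""
    d = det(A)
    if d not in (1, -1):
        raise ValueError("A is not unimodular")
    adj = adjugate(A)
    # 1/d equals d when d is +1 or -1, so A^{-1} b = d * Adj(A) b
    return [d * sum(adj[i][k] * b[k] for k in range(len(b))) for i in range(len(b))]

x1 = solve_unimodular([[2, 1], [1, 1]], [5, 3])   # Det = +1
x2 = solve_unimodular([[0, 1], [1, 0]], [7, 9])   # Det = -1
print(x1, x2)
```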

    Theorem 2.1 If A is TU, every basic solution to P is integer.

    Proof. Suppose that $x$ is a basic solution to P. Let $N$ be the set of indices of $x$ such that $x_j = 0$. Since $x$ is a basic solution to P, there exist two nonnegative integers p and q with $p + q = n$ and indices $B(1), \ldots, B(p) \in \{1, \ldots, m\}$ and $N(1), \ldots, N(q) \in N$ such that

    $$\{A_{B(i)}^T\}_{i=1}^{p} \cup \{e_{N(j)}^T\}_{j=1}^{q}$$

    are linearly independent, where $e_{N(j)}$ is the $N(j)$-th unit vector in

    Proposition 2.3 $A \in Z^{m \times n}$ is TU $\Longrightarrow$ $(A\ \ I)$ is TU, where I


    Obviously, |Det(B)| = |Det(B)| and

    |Det(B)| = |Det(A1)||Det(I )| = |Det(A1)|.

    Now A is totally unimodular implies $|\mathrm{Det}(A_1)| = 0$ or 1 and since B is assumed to be nonsingular, $|\mathrm{Det}(B)| = 1$. Again, from Lemma 2.1, $y_B$ is integer. Hence y is integer because $y_j = 0$, $j \notin B$. This implies that x is integer. [One may also make use of Theorem 2.1 and Proposition 2.3 to get the proof immediately.]

    (2 ⇒ 3). Let $B \in Z^{p \times p}$ be any square nonsingular submatrix of A. It is sufficient to prove that $b_j$ is an integer vector, where $b_j$ is the jth column of $B^{-1}$, j = 1, ..., p.

    Let t be an integer vector such that $t + b_j > 0$ and $b_B(t) = Bt + e_j$, where $e_j$ is the jth unit vector. Then

    $$x_B = B^{-1} b_B(t) = B^{-1}(Bt + e_j) = t + B^{-1} e_j = t + b_j > 0.$$

    Choose $b_N$ ($N = \{1, \ldots, n\} \setminus B$) sufficiently large such that $(Ax)_j < b_j$, $j \in N$, where $x_j = 0$, $j \in N$. Hence x is an extreme point of $S(b(t))$. As $x_B$ and t are integer vectors, $b_j$ is an integer vector too for j = 1, ..., p and $B^{-1}$ is an integer matrix.

    (3 ⇒ 1). Let B be an arbitrary square, nonsingular submatrix of A. Then

    $$1 = |\mathrm{Det}(I)| = |\mathrm{Det}(BB^{-1})| = |\mathrm{Det}(B)|\,|\mathrm{Det}(B^{-1})|.$$

    By the assumption, B and $B^{-1}$ are integer matrices. Thus

    $$|\mathrm{Det}(B)| = |\mathrm{Det}(B^{-1})| = 1,$$

    and A is TU. Q.E.D.

    Theorem 2.3 (A sufficient condition of TU) An integer matrix A with all $a_{ij} = 0$, 1, or −1 is TU if

    1. no more than two nonzero elements appear in each column,

    2. the rows of A can be partitioned into two subsets $M_1$ and $M_2$ such that

    (a) if a column contains two nonzero elements with the same sign, one element is in each of the subsets,

    (b) if a column contains two nonzero elements of opposite signs, both elements are in the same subset.

    Proof. The proof is by induction on the order of the submatrix. A one-element submatrix of A has a determinant equal to 0, 1, or −1. Assume that the theorem is true for all submatrices of A of order k − 1 or less, and let B be a submatrix of order k.

    If B contains a column with only one nonzero element, we expand Det(B) by that column and apply the induction hypothesis.

    Finally, consider the case in which every column of B contains two nonzero elements. Then from 2(a) and 2(b), for every column j,

    $$\sum_{i \in M_1} b_{ij} = \sum_{i \in M_2} b_{ij}, \quad j = 1, \ldots, k.$$

    Let $b_i$ be the ith row. Then the above equality gives

    $$\sum_{i \in M_1} b_i - \sum_{i \in M_2} b_i = 0,$$

    which implies that $\{b_i\}$, $i \in M_1 \cup M_2$, are linearly dependent and thus B is singular, i.e., Det(B) = 0. Q.E.D.

    Corollary 2.1 The vertex-edge incidence matrix of a bipartite graph is TU.

    Corollary 2.2 The node-arc incidence matrix of a digraph is TU.

    2.2 Applications

    In this section we show that the assumptions of the theorems in Section 2.1 are fulfilled for integer programming problems connected with optimization of flows in networks. This means that these problems can be solved by the SIMPLEX METHOD.

    However, it is not necessary to use the simplex method because more efficient methods have been developed by taking into consideration the specific structure of these problems.

    Many commodities, such as gas, oil, etc., are transported through networks in which we distinguish sources, intermediate transportation or distribution points and destination points.

    We will represent a network as a directed graph G = (V, E) and associate with each arc $(i, j) \in E$ the flow $x_{ij}$ of the commodity and the capacity $d_{ij}$ (possibly infinite) that bounds the flow through the arc. The set V is partitioned into three sets:

      - $V_1$: set of sources or origins,
      - $V_2$: set of intermediate points,
      - $V_3$: set of destinations or sinks.

    [Figure 2.1: A network (node sets $V_1$, $V_2$, $V_3$)]

    For each $i \in V_1$, let $a_i$ be a supply of the commodity and for each $i \in V_3$, let $b_i$ be a demand for the commodity.

    We assume that there is no loss of the flow at intermediate points. Additionally, denote $V_{\mathrm{out}}(i)$ ($V_{\mathrm{in}}(i)$) as

    $$V_{\mathrm{out}}(i) = \{ j \mid (i, j) \in E \} \quad\text{and}\quad V_{\mathrm{in}}(i) = \{ j \mid (j, i) \in E \},$$

    respectively.

    Then the minimum cost capacitated problem may be formulated as

    $$(P)\qquad v(P) = \min \sum_{(i,j) \in E} c_{ij} x_{ij}$$

    subject to

    $$\sum_{j \in V_{\mathrm{out}}(i)} x_{ij} - \sum_{j \in V_{\mathrm{in}}(i)} x_{ji} \ \begin{cases} \le a_i, & i \in V_1, \\ = 0, & i \in V_2, \\ \le -b_i, & i \in V_3, \end{cases} \tag{2.2}$$

    $$0 \le x_{ij} \le d_{ij}, \quad (i, j) \in E. \tag{2.3}$$

    Constraint (2.2) requires the conservation of flow at intermediate points, a net flow into sinks at least as great as demanded, and a net flow out of sources equal to or less than the supply. In some applications, demand must be satisfied exactly and all of the supply must be used. If all of the constraints of (2.2) are equalities, the problem has no feasible solutions unless

    $$\sum_{i \in V_1} a_i = \sum_{i \in V_3} b_i.$$

    To avoid pathological cases, we assume for each cycle in the network G = (V, E) either that the sum of costs of arcs in the cycle is positive or that the minimal capacity of an arc in the cycle is bounded.

    Theorem 2.4 The constraint matrix corresponding to (2.2) and (2.3) is totally unimodular.

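    Proof. The constraint matrix has the form

    $$A = \begin{pmatrix} A_1 \\ I \end{pmatrix},$$

    where $A_1$ is the matrix for (2.2) and I is an identity matrix for (2.3). In the last section, we showed that $A_1$ being totally unimodular implies that A is totally unimodular. Each variable $x_{ij}$ appears in exactly two constraints of (2.2) with coefficients +1 or −1. Thus $A_1$ is an incidence matrix for a digraph and therefore it is totally unimodular. Q.E.D.

    The most popular case of P is the so-called (capacitated) transportation problem. We obtain it if we put in P: $V_2 = \emptyset$, no arc enters a source ($V_{\mathrm{in}}(i) = \emptyset$ for all $i \in V_1$) and no arc leaves a sink ($V_{\mathrm{out}}(i) = \emptyset$ for all $i \in V_3$). So we get

    $$(TP)\qquad \begin{aligned}
    v(T) = \min\ & \sum_{(i,j) \in E} c_{ij} x_{ij}, \\
    \text{s.t.}\ & \sum_{j \in V_{\mathrm{out}}(i)} x_{ij} \le a_i, \quad i \in V_1, \\
    & \sum_{j \in V_{\mathrm{in}}(i)} x_{ji} \ge b_i, \quad i \in V_3, \\
    & 0 \le x_{ij} \le d_{ij}, \quad (i, j) \in E.
    \end{aligned}$$

    If $d_{ij} = \infty$ for all $(i, j) \in E$, the uncapacitated version of P is sometimes called the transshipment problem.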

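    If all $a_i = 1$, and all $b_i = 1$, and additionally, $|V_1| = |V_3|$, the transshipment problem reduces to the so-called assignment problem of the form

    $$(AP)\qquad \begin{aligned}
    v(AP) = \min\ & \sum_{i \in V_1} \sum_{j \in V_{\mathrm{out}}(i)} c_{ij} x_{ij}, \\
    \text{s.t.}\ & \sum_{j \in V_{\mathrm{out}}(i)} x_{ij} = 1, \quad i \in V_1, \\
    & \sum_{j \in V_{\mathrm{in}}(i)} x_{ji} = 1, \quad i \in V_3, \\
    & x_{ij} \ge 0.
    \end{aligned}$$

    Here $V_{\mathrm{out}}(i)$ is the set of nodes reached by arcs leaving i and $V_{\mathrm{in}}(i)$ is the set of nodes with arcs entering i. Note that $|V_1| = |V_3|$ implies that all constraints in (AP) must be satisfied as equalities.

    Let $V = \{1, \ldots, m\}$. Still another important practical problem obtained from P is called the maximum flow problem. In this problem, $V_1 = \{1\}$, $V_3 = \{m\}$, $V_{\mathrm{in}}(1) = \emptyset$, $V_{\mathrm{out}}(m) = \emptyset$, $a_1 = \infty$, $b_m = \infty$.

    The problem is to maximize the total flow into the vertex m under the capacity constraints

    $$(MF)\qquad \begin{aligned}
    v(MF) = \max\ & \sum_{i \in V_{\mathrm{in}}(m)} x_{im}, \\
    \text{s.t.}\ & \sum_{j \in V_{\mathrm{out}}(i)} x_{ij} - \sum_{j \in V_{\mathrm{in}}(i)} x_{ji} = 0, \quad i \in V_2 = \{2, \ldots, m-1\}, \\
    & 0 \le x_{ij} \le d_{ij}, \quad (i, j) \in E.
    \end{aligned}$$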

    Finally, consider the shortest path problem. Let $c_{ij}$ be interpreted as the length of edge (i, j). Define the length of a path in G to be the sum of the edge lengths over all edges in the path. The objective is to find a path of minimum length from vertex 1 to vertex m. It is assumed that all cycles have nonnegative length. This problem is a special case of the transshipment problem in which $V_1 = \{1\}$, $V_3 = \{m\}$, $a_1 = 1$ and $b_m = 1$.

    Let A be the incidence matrix of the digraph G = (V, E), where $V = \{1, \ldots, m\}$ and $E = \{e_1, \ldots, e_n\}$. With each arc $e_j$ we associate its length $c_j \ge 0$ and its flow $x_j \ge 0$. The shortest path problem may be formulated as:

    $$(SP)\qquad v(SP) = \min \sum_{j=1}^{n} c_j x_j, \quad \text{s.t.}\quad Ax = \begin{pmatrix} -1 \\ 0 \\ \vdots \\ 0 \\ +1 \end{pmatrix}, \quad x \ge 0.$$

    The first constraint corresponds to the source vertex, the mth constraint corresponds to the demand vertex, while the remaining constraints correspond to the intermediate vertices, i.e., the points of distribution of the unit flow.

    The dual problem to SP is

    $$(DSP)\qquad v(DSP) = \max\ (-u_1 + u_m), \quad \text{s.t.}\quad A^T u \le c. \tag{2.4}$$
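A quick numerical check of the pair (SP), (DSP) may help. In the sketch below the 4-node digraph, its arc lengths, the path flow x and the potentials u are all made-up illustrative data; the code only verifies that x satisfies Ax = (−1, 0, 0, +1), that u satisfies the dual constraints, and that the two objective values agree, which by weak duality certifies that both are optimal.

```python
def node_arc_incidence(m, arcs):
    """Rows are nodes 1..m, columns are arcs; -1 at the tail and +1 at the head of each arc."""
    A = [[0] * len(arcs) for _ in range(m)]
    for col, (i, j) in enumerate(arcs):
        A[i - 1][col] = -1
        A[j - 1][col] = +1
    return A

# Hypothetical instance (not from the notes)
arcs = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
c = [1, 4, 2, 6, 3]
m = 4
A = node_arc_incidence(m, arcs)

# Primal candidate: one unit of flow along the path 1 -> 2 -> 3 -> 4
x = [1, 0, 1, 0, 1]
Ax = [sum(A[v][k] * x[k] for k in range(len(arcs))) for v in range(m)]
primal_value = sum(ck * xk for ck, xk in zip(c, x))

# Dual candidate: u_v = length of a shortest path from node 1 to node v
u = [0, 1, 3, 6]
# The row of A^T u belonging to arc (i, j) is -u_i + u_j
dual_feasible = all(-u[i - 1] + u[j - 1] <= ck for (i, j), ck in zip(arcs, c))
dual_value = -u[0] + u[m - 1]

print(Ax, primal_value, dual_feasible, dual_value)
```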

    3 The Shortest Path

    3.1 The Primal-Dual Method

    Consider the standard linear programming problem

    $$(P)\qquad \min\ c^T x \quad \text{s.t.}\quad Ax = b \ge 0,\ x \ge 0$$

    and its dual

    $$(D)\qquad \max\ \pi^T b \quad \text{s.t.}\quad \pi^T A \le c^T.$$

    Suppose that we have a current $\pi$ which is feasible to the dual problem (D). Define the index set J by

    $$J = \{ j : \pi^T A_j = c_j \},$$

    where $A_j$ is the jth column of A. Then for any $j \notin J$, we have $\pi^T A_j < c_j$. We call J the set of admissible columns. In order to search for an x that is not only feasible to the primal problem (P) but also, together with $\pi$, satisfies the complementary slackness conditions of (P) and (D), we invent a new LP, called the restricted primal (RP), as follows

    $$(RP)\qquad \begin{aligned}
    \xi = \min\ & \sum_{i=1}^{m} x_i^a \\
    \text{s.t.}\ & Ax + x^a = b, \\
    & x_j \ge 0 \text{ for all } j, \\
    & x_j = 0, \quad j \notin J, \\
    & x_i^a \ge 0, \quad i = 1, \ldots, m,
    \end{aligned}$$

    i.e.,

    $$(RP)\qquad \begin{aligned}
    \xi = \min\ & 0^T x_J + \sum_{i=1}^{m} x_i^a \\
    \text{s.t.}\ & A_J x_J + x^a = b, \\
    & x_J \ge 0,\ x^a \ge 0.
    \end{aligned}$$

    The dual of (RP) is

    $$(DRP)\qquad \begin{aligned}
    w = \max\ & \pi^T b \\
    \text{s.t.}\ & \pi^T A_j \le 0, \quad j \in J, \\
    & \pi_i \le 1, \quad i = 1, \ldots, m.
    \end{aligned}$$

    Let $(x_J, x^a)$ be an optimal basic feasible solution to (RP) and $\bar\pi$ be an optimal basic feasible solution to (DRP) obtained from $(x_J, x^a)$. If $w = 0$, then $\xi = 0$. Such an x is found. Otherwise, $w > 0$ and we can update $\pi$ to

    $$\pi^{new} = \pi + \theta \bar\pi.$$

    The new cost to (D) is

    $$(\pi^{new})^T b = \pi^T b + \theta \bar\pi^T b = \pi^T b + \theta w,$$

    which means that we shall get a better $\pi$ if we can take $\theta > 0$. On the other hand, $\pi^{new}$ should be feasible to (D), i.e.,

    $$(\pi^{new})^T A_j = \pi^T A_j + \theta \bar\pi^T A_j \le c_j.$$

    Since for every $j \in J$, $\bar\pi^T A_j \le 0$, we only need to consider those $\bar\pi^T A_j > 0$, $j \notin J$. Therefore, we can take

    $$\theta = \min_{j \notin J \text{ such that } \bar\pi^T A_j > 0} \frac{c_j - \pi^T A_j}{\bar\pi^T A_j}.$$

    [Figure 3.1: An illustration of the primal-dual method (Primal P, Dual, Restricted Primal (RP), Dual of RP (DRP), adjustment to $\pi$)]

    3.2 The Primal-Dual Method for the Shortest Path Problem

    Let A be the incidence matrix of the digraph G = (V, E), where $V = \{1, \ldots, m\}$ and $E = \{e_1, \ldots, e_n\}$. With each arc $e_j$ we associate its length $c_j \ge 0$ and its flow $x_j \ge 0$. The shortest path problem, as we already know, may be formulated as:

    $$\min \sum_{j=1}^{n} c_j x_j, \quad \text{s.t.}\quad Ax = \begin{pmatrix} -1 \\ 0 \\ \vdots \\ 0 \\ +1 \end{pmatrix}, \quad x \ge 0. \tag{3.1}$$

    Let $\bar A$ be the submatrix of A that remains after removing the last row of A (it is redundant because the sum of all rows of A is zero). Then (3.1) turns into

    $$\min \sum_{j=1}^{n} c_j x_j, \quad \text{s.t.}\quad \bar A x = \begin{pmatrix} -1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad x \ge 0. \tag{3.2}$$

    The dual problem to (3.2) is

    $$\max\ -\pi_1 \quad \text{s.t.}\quad -\pi_i + \pi_j \le c_{ij} \text{ for all } (i, j) \in E, \quad \pi_m = 0, \tag{3.3}$$

    where we must fix $\pi_m = 0$ because the last row of A is omitted in $\bar A$.

    The idea of the primal-dual algorithm is derived from the idea of searching for a feasible point x such that

    $$x_{ij} = 0 \ (\text{some } x_k) \text{ whenever } -\pi_i + \pi_j < c_{ij},$$

    for a given feasible $\pi$ (Remark: think about the complementary slackness conditions). We search for such an x by solving an auxiliary problem, called the restricted primal (RP), determined by the $\pi$ we are working with. If our search for the x is not successful, we nevertheless obtain information from the dual of RP, which we call DRP, that tells us how to improve the particular $\pi$ with which we started.

    Next, we give the details. The shortest-path problem can be written as

    $$\min \sum_{j=1}^{n} c_j x_j, \quad \text{s.t.}\quad \hat A x = \begin{pmatrix} +1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad x \ge 0, \tag{3.4}$$

    where $\hat A = -\bar A$ and $\bar A$ is A without its last row. The purpose of introducing $\hat A$ is to make the right hand side of the constraint $\hat A x = b$ nonnegative. Now, the dual problem of (3.4) is

    $$\max\ \pi_1 \quad \text{s.t.}\quad \pi_i - \pi_j \le c_{ij} \text{ for all } (i, j) \in E, \quad \pi_m = 0. \tag{3.5}$$

    For a given feasible $\pi$ to (3.5), the set of admissible arcs is defined by

    $$J = \{ \text{arcs } (i, j) : \pi_i - \pi_j = c_{ij} \}.$$

    The corresponding restricted primal problem (RP) is

    $$\begin{aligned}
    \xi = \min\ & \sum_{i=1}^{m-1} x_i^a, \\
    \text{s.t.}\ & \hat A x + x^a = (+1, 0, \ldots, 0)^T, \\
    & x_j \ge 0 \text{ for all } j, \\
    & x_j = 0, \quad j \notin J, \\
    & x_i^a \ge 0, \quad i = 1, \ldots, m-1,
    \end{aligned} \tag{3.6}$$

    and the dual of the restricted primal (DRP) is

    $$\begin{aligned}
    w = \max\ & \pi_1 \\
    \text{s.t.}\ & \pi_i - \pi_j \le 0 \text{ for all } (i, j) \in J, \\
    & \pi_i \le 1 \text{ for all } i = 1, \ldots, m-1, \\
    & \pi_m = 0.
    \end{aligned} \tag{3.7}$$

    DRP (3.7) is very easy to solve:

    Since $\pi_1 \le 1$ and we wish to maximize $\pi_1$, we try $\pi_1 = 1$. If there is no path from node 1 to node m using only arcs in J, then we can propagate the 1 from node 1 to all nodes reachable by a path from node 1 without violating the $\pi_i - \pi_j \le 0$ constraints, and an optimal solution to the DRP is then

    $$\bar\pi_i = \begin{cases} 1 & \text{for all nodes reachable by paths from node 1 using arcs in } J, \\ 0 & \text{for all nodes from which node } m \text{ is reachable using arcs in } J, \\ 1 & \text{for all other nodes.} \end{cases}$$

    (Notice that this $\bar\pi$ is not unique.)

    We can then calculate

    $$\theta_1 = \min \{ c_{ij} - (\pi_i - \pi_j) : \text{arcs } (i, j) \notin J \text{ such that } \bar\pi_i - \bar\pi_j > 0 \},$$

    update $\pi$ by

    $$\pi := \pi + \theta_1 \bar\pi,$$

    update J, and re-solve the DRP.

    [Figure 3.2: A solution to the restricted dual problem]

    If we get to a point where there is a path from node 1 to node m using arcs in J, $\bar\pi_1 = 0$, and we find an optimal solution because $\xi = w = 0$. Any path from node 1 to node m using only arcs in J is optimal.

    The primal-dual algorithm reduces the shortest path problem to repeated solution

    of the simpler problem of finding the set of nodes reachable from a given node.
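The whole loop can be sketched in a few lines. The version below is an illustration under stated assumptions rather than a tuned implementation: arc lengths are nonnegative integers (so the admissibility test can use exact equality), the dual starts at zero, the DRP solution is taken to be 0 on the set W of nodes from which node m is reachable along admissible arcs and 1 elsewhere, and the function name and the 4-node instance are made up.

```python
def primal_dual_shortest_path(m, cost):
    """cost maps an arc (i, j) to a nonnegative integer length; nodes are 1..m.

    Returns (length, path) for a shortest path from node 1 to node m,
    or None if node m cannot be reached from node 1.
    """
    pi = {v: 0 for v in range(1, m + 1)}   # feasible: pi_i - pi_j = 0 <= c_ij, pi_m = 0
    while True:
        J = [(i, j) for (i, j), c in cost.items() if pi[i] - pi[j] == c]
        # W = nodes from which m is reachable by admissible arcs (backward search from m);
        # succ[i] remembers the next node on an admissible path from i to m
        succ = {m: None}
        stack = [m]
        while stack:
            j = stack.pop()
            for (i, jj) in J:
                if jj == j and i not in succ:
                    succ[i] = j
                    stack.append(i)
        if 1 in succ:
            path = [1]
            while path[-1] != m:
                path.append(succ[path[-1]])
            return pi[1], path
        # pi_bar is 0 on W and 1 elsewhere, so only arcs entering W have pi_bar_i - pi_bar_j > 0
        slack = [c - (pi[i] - pi[j]) for (i, j), c in cost.items()
                 if i not in succ and j in succ]
        if not slack:
            return None          # no arc enters W, so theta would be unbounded
        theta = min(slack)
        for v in pi:
            if v not in succ:    # pi := pi + theta * pi_bar
                pi[v] += theta

# Hypothetical instance
cost = {(1, 2): 1, (1, 3): 4, (2, 3): 2, (2, 4): 6, (3, 4): 3}
result = primal_dual_shortest_path(4, cost)
print(result)
```

On this instance the set W grows as {4}, {3, 4}, {2, 3, 4} and finally contains node 1, with step sizes 3, 2 and 1, so the nodes are absorbed in order of their distance to node m.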

    Interpretation: Define at any point in the algorithm the set

    $$W = \{ i : \text{node } m \text{ is reachable from } i \text{ by admissible arcs} \} = \{ i : \bar\pi_i = 0 \}.$$

    Then the variable $\pi_i$ remains fixed from the time that i enters W to the conclusion of the algorithm, because the corresponding $\bar\pi_i$ will always be zero.

    Every arc that becomes admissible (enters J) stays admissible throughout the algorithm, because once we have

    $$\pi_i - \pi_j = c_{ij} \text{ for } (i, j) \in E,$$

    we always change $\pi_i$ and $\pi_j$ by the same amount.

    $\pi_i$, $i \in W$, is the length of the shortest path from node i to node m and the algorithm proceeds by adding to W, at each stage, the nodes not in W next closest to node m.

    At most $|V| = m$ stages.

    Dijkstra's algorithm is an efficient implementation of the primal-dual algorithm for the shortest path problem.

    3.3 Bellman's Equation

    Let $c_{ij}$ be the length of arc (i, j) (positive arcs if $c_{ij} > 0$; nonnegative if $c_{ij} \ge 0$).

    Let $u_{ij}$ be the length of the shortest path from i to j. Define

    $$u_i = u_{1i}.$$

    Then Bellman's Equations are

    $$u_1 = 0, \qquad u_i = \min_{k \ne i} \{ u_k + c_{ki} \}.$$
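One common way to solve Bellman's equations numerically is successive approximation in the style of the Bellman-Ford method, which is not spelled out in this part of the notes and is used here only as an illustration. The sketch assumes nodes 1..n, no negative cycles, and a made-up 4-node instance; the name `solve_bellman` is hypothetical.

```python
import math

def solve_bellman(n, c):
    """c maps an arc (k, i) to its length c_ki; nodes are 1..n.

    Starts from u_1 = 0 and u_i = infinity, then repeatedly applies
    u_i <- min(u_i, u_k + c_ki) until nothing changes (at most n - 1 passes
    when there are no negative cycles).
    """
    u = {i: math.inf for i in range(1, n + 1)}
    u[1] = 0
    for _ in range(n - 1):
        changed = False
        for (k, i), length in c.items():
            if u[k] + length < u[i]:
                u[i] = u[k] + length
                changed = True
        if not changed:
            break
    return u

# Hypothetical instance
c = {(1, 2): 1, (1, 3): 4, (2, 3): 2, (2, 4): 6, (3, 4): 3}
u = solve_bellman(4, c)
print(u)
```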

    3.4 Dijkstra's Algorithm

    In this section we assume that $c_{ij} \ge 0$. Denote

    P: permanently labeled nodes;

    T: temporarily labeled nodes.

    [Figure 3.3: Bellman's equation]

    P and T always satisfy

    $$P \cap T = \emptyset \quad \& \quad P \cup T = V.$$

    Label for node j: $[u_j, l_j]$, where $u_j$ is the length of the (possibly temporary) shortest path from node 1 to j and $l_j$ is the preceding node in the path.

    Dijkstra's algorithm can be summarized as follows.

    Step 0. $P = \{1\}$, $u_1 = 0$, $l_1 = 0$, $T = V \setminus P$. Compute

    $$u_j = \begin{cases} c_{1j} & \text{if } (1, j) \in E, \\ \infty & \text{if } (1, j) \notin E, \end{cases} \qquad l_j = \begin{cases} 1 & \text{if } (1, j) \in E, \\ 0 & \text{if } (1, j) \notin E. \end{cases}$$

    Step 1. Find $k \in T$ such that

    $$u_k = \min_{j \in T} \{ u_j \}.$$

    Let $P = P \cup \{k\}$ and $T = T \setminus \{k\}$. If k = n, stop.

    Step 2. For $j \in T$, if $u_k + c_{kj} < u_j$, let $[u_j = u_k + c_{kj},\ l_j = k]$ and go back to Step 1.

    Claim: At any step, $u_j$ is the length of the shortest path from 1 to j, only passing nodes in P.

    [Suppose not and j is the first violation... ].

    Claim: The total cost is $O(n^2)$.
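Steps 0 to 2 map almost line by line onto code. The sketch below keeps the labels $[u_j, l_j]$ in two dictionaries and scans T for the minimum, which gives the $O(n^2)$ bound; the function names and the 4-node instance are illustrative assumptions, and arc lengths are assumed nonnegative.

```python
import math

def dijkstra(n, c):
    """Nodes are 1..n; c maps an arc (i, j) to its nonnegative length.

    Returns the labels (u, l) once node n has been permanently labeled.
    """
    # Step 0
    P = {1}
    T = set(range(2, n + 1))
    u = {1: 0}
    l = {1: 0}
    for j in T:
        u[j] = c.get((1, j), math.inf)
        l[j] = 1 if (1, j) in c else 0
    while T:
        # Step 1: the temporary node with the smallest label becomes permanent
        k = min(T, key=lambda j: u[j])
        P.add(k)
        T.remove(k)
        if k == n:
            break
        # Step 2: try to improve the remaining temporary labels through k
        for j in T:
            if (k, j) in c and u[k] + c[(k, j)] < u[j]:
                u[j] = u[k] + c[(k, j)]
                l[j] = k
    return u, l

def path_to(j, l):
    """Follow the predecessor labels back from j until the label 0 is reached."""
    path = [j]
    while l[path[-1]] != 0:
        path.append(l[path[-1]])
    return path[::-1]

# Hypothetical instance
c = {(1, 2): 1, (1, 3): 4, (2, 3): 2, (2, 4): 6, (3, 4): 3}
u, l = dijkstra(4, c)
print(u[4], path_to(4, l))
```

A priority queue would lower the cost on sparse graphs, but the plain scan matches the steps as stated.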

    3.5 PERT or CPM Network

    A large project is divisible into many unit tasks. Each task requires a certain amount of time for its completion, and the tasks are partially ordered.

    This network is sometimes called a PERT (Project Evaluation and Review Technique) or CPM (Critical Path Method) network. A PERT network is necessarily acyclic.

    Theorem 3.1 A digraph is acyclic if and only if its nodes can be renumbered in such a way that for every arc (i, j), i < j. [The work of this is $O(n^2)$]

    Claim: For any acyclic graph, at least one node has indegree 0. After renumbering it, we have for all (i, j), i < j.
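The claim suggests a renumbering procedure: repeatedly pick a node of indegree 0, give it the next number, and delete it. The following sketch does this; the function name `renumber_acyclic` and the task names are made up, and for simplicity each removal rescans the arc list rather than using adjacency lists.

```python
def renumber_acyclic(nodes, arcs):
    """Return a dict old_name -> new_number (1..n) with new[i] < new[j] for every arc (i, j),
    or None if the digraph contains a cycle."""
    indegree = {v: 0 for v in nodes}
    for _, j in arcs:
        indegree[j] += 1
    ready = [v for v in nodes if indegree[v] == 0]   # nodes with no remaining predecessor
    new = {}
    while ready:
        v = ready.pop()
        new[v] = len(new) + 1
        for i, j in arcs:
            if i == v:                 # deleting v lowers the indegree of its successors
                indegree[j] -= 1
                if indegree[j] == 0:
                    ready.append(j)
    return new if len(new) == len(nodes) else None

# Hypothetical project with four tasks
tasks = ["a", "b", "c", "d"]
precedence = [("c", "a"), ("a", "b"), ("c", "b"), ("b", "d")]
numbering = renumber_acyclic(tasks, precedence)
print(numbering)
```

If some nodes never reach indegree 0, the remaining subgraph has every indegree at least one and therefore contains a cycle, which is why the function returns None in that case.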

    Bellman's equations are

    $$u_1 = 0, \qquad u_i = \min_{k \ne i} \{ u_k + c_{ki} \}.$$

    For acyclic graphs, they turn out to be u1 = 0,ui = mink

  • 44

    3.7 Floyd-Warshall Method for Shortest Paths Between All Pairs

    Again, we need the assumption that the networks contain no negative cycles in order

    that the Floyd-Warshall method works.

    Step 0. $u^{(1)}_{ij} = c_{ij}$, $i, j = 1, \dots, n$.

    Step k. For $k = 1, \dots, n$,

    $$u^{(k+1)}_{ij} = \min\{u^{(k)}_{ij},\ u^{(k)}_{ik} + u^{(k)}_{kj}\}, \quad i, j = 1, \dots, n.$$

    Claim: $u^{(k)}_{ij}$ is the length of a shortest path from $i$ to $j$, subject to the condition that the path does not pass through nodes $k, k+1, \dots, n$ ($i$ and $j$ excepted). [This means $u^{(n+1)}_{ij} = u_{ij}$.]

    Proof by induction. It is clearly true for Step 0. Suppose it is true for $u^{(k)}_{ij}$ for all $i$ and $j$. Now consider $u^{(k+1)}_{ij}$. If a shortest path from node $i$ to node $j$ which does not pass through nodes $k+1, k+2, \dots, n$ does not pass through $k$, then $u^{(k+1)}_{ij} = u^{(k)}_{ij}$. Otherwise, if it does pass through node $k$, $u^{(k+1)}_{ij} = u^{(k)}_{ik} + u^{(k)}_{kj}$.

    It is easy to see that the complexity of the Floyd-Warshall method is $O(n^3)$.

    The Floyd-Warshall method requires the storage of an $n \times n$ matrix. Initially this is $U^{(1)} = C$. Thereafter, $U^{(k+1)}$ is obtained from $U^{(k)}$ by using row $k$ and column $k$ to revise the remaining elements. That is, $u_{ij}$ is compared with $u_{ik} + u_{kj}$, and if the latter is smaller, $u_{ik} + u_{kj}$ is substituted for $u_{ij}$ in the matrix.

    There are other methods of the above type, e.g., the method of G. B. Dantzig.
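    The single-matrix update of the Floyd-Warshall method can be sketched as below. The function name and the input convention (a list of lists with `math.inf` for missing arcs and 0 on the diagonal, nodes indexed from 0) are assumptions for illustration, and the network is assumed to have no negative cycles.

```python
import math

def floyd_warshall(c):
    """All-pairs shortest path lengths from an n x n arc-length matrix c."""
    n = len(c)
    u = [row[:] for row in c]  # U^(1) = C, revised in place from here on
    for k in range(n):
        # Row k and column k do not change during pass k (u[k][k] = 0),
        # so one matrix is enough to hold both U^(k) and U^(k+1).
        for i in range(n):
            for j in range(n):
                if u[i][k] + u[k][j] < u[i][j]:
                    u[i][j] = u[i][k] + u[k][j]
    return u
```

    The three nested loops over $n$ nodes give the $O(n^3)$ operation count directly.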

    3.8 Other Cases

    1. Sparse graphs

    |A|

    not allow repetitive arcs

    not allow repetitive nodes

    3. with time constraints

    4. with fixed charge


    4 The Greedy Algorithm and Computational Complexity

    4.1 Matroid

    1935: matroid theory was founded by H. Whitney.

    1965: J. Edmonds pointed out the significance of matroid theory to combinatorial optimization (CO).

    Importance: 1) Many CO problems can be formulated as matroid problems, and solved by the same algorithm;

    2) Matroids give insight into the structure of CO problems;

    3) A special tool for CO.

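    Definition 4.1 Suppose we have a finite ground set $S$, $|S| < \infty$, and a collection, $\mathcal{I}$, of subsets of $S$. Then $H := (S, \mathcal{I})$ is said to be an independent system if the empty set is in $\mathcal{I}$ and $\mathcal{I}$ is closed under inclusion; that is,

    i) $\emptyset \in \mathcal{I}$;

    ii) $X \subseteq Y \in \mathcal{I} \implies X \in \mathcal{I}$.

    Elements in $\mathcal{I}$ are called independent sets, and subsets of $S$ not in $\mathcal{I}$ are called dependent sets.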


    Example: Matching system. $G = (V, E)$,

    $\mathcal{I} = \{\text{all matchings in } G\}$.

    [A matching $M$ of a graph $G = (V, E)$ is a subset of the edges with the property that no two edges of $M$ share the same node. A matching $M$ is a pairwise disjoint edge set.]

    Figure 4.1: A Matching Example

    In Figure 4.1,

    $S = \{e_1, e_2, e_3, e_4\}$, $\mathcal{I} = \{\emptyset, \{e_1\}, \{e_2\}, \{e_3\}, \{e_4\}, \{e_2, e_3\}\}$.


    Definition 4.2 If $H = (S, \mathcal{I})$ is an independent system such that

    $$X, Y \in \mathcal{I},\ |X| = |Y| + 1 \implies \text{there exists } e \in X \setminus Y \text{ such that } Y + e \in \mathcal{I},$$

    then $H$ (or the pair $(S, \mathcal{I})$) is called a matroid.

    Examples: i) Matric matroid: A matrix $A = (a_1, \dots, a_n)_{m \times n}$, $S = \{a_1, \dots, a_n\}$,

    $X \in \mathcal{I} \iff X = \{a_{i_1}, \dots, a_{i_k}\}$ is linearly independent.

    ii) Graphic matroid: $G = (V, E)$, $S = E$,

    $X \in \mathcal{I} \iff X \subseteq E$ and $X$ has no cycle.

    ii) is a special case of i) with $A$ = the vertex-edge incidence matrix.
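    For small ground sets, the two definitions can be checked by direct enumeration. The sketch below is only illustrative (the function names are assumptions, and the enumeration is exponential in $|S|$); it confirms that the matching system listed for Figure 4.1 is an independent system but fails the exchange property of Definition 4.2.

```python
from itertools import combinations

def is_independent_system(family):
    """Check that the family contains the empty set and is closed under inclusion."""
    fam = {frozenset(x) for x in family}
    if frozenset() not in fam:
        return False
    return all(frozenset(sub) in fam
               for y in fam
               for r in range(len(y))
               for sub in combinations(y, r))

def exchange_violation(family):
    """Return a pair (X, Y) violating the exchange property, or None if there is none."""
    fam = {frozenset(x) for x in family}
    for x in fam:
        for y in fam:
            if len(x) == len(y) + 1 and not any((y | {e}) in fam for e in x - y):
                return set(x), set(y)
    return None

# Independent sets of the matching system in Figure 4.1, as listed in the text.
matchings = [(), ("e1",), ("e2",), ("e3",), ("e4",), ("e2", "e3")]
```

    With $X = \{e_2, e_3\}$ and $Y = \{e_1\}$, neither $\{e_1, e_2\}$ nor $\{e_1, e_3\}$ is a matching, so this matching system is not a matroid.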

    4.2 The Greedy Algorithm

    Suppose that $H = (S, \mathcal{I})$ is an independent system and $W : S$

    Greedy Algorithm:

    Suppose $W(e_1) \ge W(e_2) \ge \dots \ge W(e_n)$.

    Step 0. Let $X = \emptyset$.

    Step k. If $X + e_k \in \mathcal{I}$, let $X := X + e_k$, where $k = 1, \dots, n$.

    Theorem 4.1 (Rado, Edmonds) The above algorithm finds a maximum weight independent set for every nonnegative weight function $W$ if and only if $H$ is a matroid.

    Applications:

    1) The Maximum Spanning Tree Problem.

    Suppose that there is a television network leasing video links so that its stations in various places can be formed into a connected network. Each link $(i, j)$ has a different rental cost $c_{ij}$. The question is how the network can be constructed at minimum cost. Obviously, what is wanted is a minimum cost spanning tree of video links. Replacing $c_{ij}$ by $M - c_{ij}$, where $M$ is a large number, we can see that it then turns into a maximum spanning tree (MST) problem. Kruskal proposed the following solution: Choose the edges one at a time in order of their weights, largest first, rejecting an edge only if it forms a cycle with edges already chosen.
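    Kruskal's rule is the greedy algorithm applied to the graphic matroid, and can be sketched as follows. The function name, the edge format `(weight, i, j)` with nodes numbered from 0, and the use of a union-find structure for the cycle test are assumptions of this sketch.

```python
def greedy_max_spanning_tree(n, edges):
    """Greedy (Kruskal) maximum weight spanning forest on nodes 0..n-1.

    edges is a list of (weight, i, j) tuples.
    """
    parent = list(range(n))

    def find(v):
        # Root of v's component, with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    chosen = []
    for w, i, j in sorted(edges, reverse=True):  # largest weight first
        ri, rj = find(i), find(j)
        if ri != rj:  # adding the edge creates no cycle: X + e_k stays independent
            parent[ri] = rj
            chosen.append((w, i, j))
    return chosen

# Minimum cost spanning tree via the substitution c_ij -> M - c_ij.
costs = {(0, 1): 4, (0, 2): 1, (1, 2): 3, (1, 3): 2, (2, 3): 5}
M = 10
tree = greedy_max_spanning_tree(4, [(M - c, i, j) for (i, j), c in costs.items()])
min_cost = sum(M - w for w, _, _ in tree)
```

    On this small hypothetical instance the chosen links are $(0,2)$, $(1,3)$ and $(1,2)$, with total rental cost 6.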

    2) A Sequencing Problem.

    Suppose that there are a number of jobs which are to be processed by a single machine. All jobs require the same processing time. Each job $j$ has assigned to it a deadline $d_j$ and a penalty $p_j$, which must be

    paid if the job is not completed by its deadline. What ordering of the

    jobs minimizes the total penalty costs? It can be easily seen that there

    exists an optimal sequence in which all jobs completed on time appear

    at the beginning of the sequence in order of deadlines, earliest deadline

    first. The late jobs follow, in arbitrary order. Thus, the problem is to

    choose an optimal set of jobs which can be completed on time. The

    following procedure can be shown to accomplish that objective.

    Choose the jobs one at a time in order of penalties, largest first,

    rejecting a job only if its choice would mean that it, or one of the jobs

    already chosen, cannot be completed on time. [This requires checking to

    see that the total amount of processing to be completed by a particular

    deadline does not exceed the deadline in question.]

    For example, consider the set of jobs below, where the processing

    time of each job is one hour, and the deadlines are expressed in hours

    of elapsed time.

    Job j   Deadline d_j   Penalty p_j
    1       1              10
    2       1              9
    3       3              7
    4       2              6
    5       3              4
    6       6              2

    Job 1 is chosen, but job 2 is discarded, because the two together

    require two hours of processing time and the deadline for job 2 is at

    the end of the first hour. Jobs 3 and 4 are chosen, job 5 is discarded, and job 6 is chosen. An optimal sequence is jobs 1, 4, 3, and 6, followed by the late jobs 2 and 5.
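    The worked example can be reproduced with a short sketch of the procedure. The function names and the dictionary layout are assumptions; the feasibility test relies on all jobs taking one time unit, so a set of jobs can be completed on time exactly when, sorted by deadline, the job in position $t$ has deadline at least $t$.

```python
def on_time_feasible(deadlines):
    """Unit-time jobs can all meet their deadlines iff the t-th earliest deadline is >= t."""
    return all(position <= d
               for position, d in enumerate(sorted(deadlines), start=1))

def greedy_sequence(jobs):
    """jobs maps job -> (deadline, penalty); returns (on-time sequence, late jobs)."""
    chosen, late = [], []
    for j in sorted(jobs, key=lambda j: jobs[j][1], reverse=True):  # largest penalty first
        if on_time_feasible([jobs[i][0] for i in chosen + [j]]):
            chosen.append(j)
        else:
            late.append(j)
    chosen.sort(key=lambda j: jobs[j][0])  # earliest deadline first
    return chosen, late

# The six jobs of the table: job -> (deadline, penalty).
jobs = {1: (1, 10), 2: (1, 9), 3: (3, 7), 4: (2, 6), 5: (3, 4), 6: (6, 2)}
on_time, late = greedy_sequence(jobs)
total_penalty = sum(jobs[j][1] for j in late)
```

    For this data the sketch selects jobs 1, 4, 3 and 6 and leaves jobs 2 and 5 late, for a total penalty of 13.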

    3) A Semimatching Problem.

    Let $W$ be an $m \times n$ nonnegative matrix. Suppose we wish to choose a maximum weight subset of elements, subject to the constraint that no two elements are from the same row of the matrix. Or, in other words, the problem is to

    $$\begin{aligned} \text{maximize} \quad & \sum_{i,j} w_{ij} x_{ij} \\ \text{subject to} \quad & \sum_{j} x_{ij} \le 1, \quad i = 1, \dots, m, \\ & x_{ij} \in \{0, 1\}. \end{aligned}$$

    This semimatching problem can be solved by choosing the largest el-

    ement in each row of W . Or alternatively: choose the elements one

    at a time in order of size, largest first, rejecting an element only if an

    element in the same row has already been chosen.

    4.3 General Introduction on Computational Complexity

    The study of computational complexity in the area of discrete optimization was initiated in large measure by the seminal papers of S. A. Cook (1971) and R. M. Karp (1972).

    Definition 4.3 An instance of an optimization problem consists of

    a feasible set $F$ and a cost function $c : F \to \mathbb{R}$.

    Some instances are larger than others, and it is convenient to define

    the notion of the size of an instance.

    Definition 4.4 The size of an instance is defined as the number of

    bits used to describe the instance, according to a prescribed format.

    Given that arbitrary numbers cannot be represented in binary, this

    definition is geared towards instances involving integer (or rational)

    numbers. Note that any nonnegative integer $r$ less than or equal to $U$ can be written in binary as follows:

    $$r = a_k 2^k + a_{k-1} 2^{k-1} + \dots + a_1 2^1 + a_0,$$

    where the scalars $a_0, \dots, a_k$ are 0 or 1. The number $k$ is clearly at most $\lfloor \log_2 U \rfloor$, since $r \le U$. We can then represent $r$ by the binary vector $(a_0, a_1, \dots, a_k)$. With an extra bit for sign, we can also represent negative numbers. In other words, we can represent any integer with absolute value less than or equal to $U$ using at most $\lfloor \log_2 U \rfloor + 2$ bits.

    Consider now an instance of a linear programming problem in standard form, i.e., an $m \times n$ matrix $A$, an $m$-vector $b$, and an $n$-vector $c$, and assume that the magnitude of the largest element of $\{A, b, c\}$ is equal to $U$. Since there are $(mn + m + n)$ entries in $A$, $b$, and $c$, the size of such an instance is at most

    $$(mn + m + n)(\lfloor \log_2 U \rfloor + 2).$$

    In fact, this count is not exactly correct: more bits will be needed to encode "flags" that indicate where a number ends and another starts. However, our count is right as far as the order of magnitude is concerned. To avoid details of this kind, we will be using instead the order-of-magnitude notation, and we will simply say that the size of such an instance is $O(mn \log U)$.
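    The bit counts above are easy to check numerically. The helper names below are assumptions; the sketch uses the fact that Python's `int.bit_length()` equals $\lfloor \log_2 U \rfloor + 1$ for $U \ge 1$, and it ignores the separator flags just mentioned.

```python
import math

def bits_for_bound(U):
    """Bits needed for any integer of absolute value <= U (U >= 1), sign bit included."""
    return U.bit_length() + 1  # (floor(log2 U) + 1) magnitude bits + 1 sign bit

def lp_size_bound(m, n, U):
    """Upper bound (mn + m + n)(floor(log2 U) + 2) on the size of a standard-form LP."""
    return (m * n + m + n) * bits_for_bound(U)
```

    For instance, entries bounded in magnitude by $U = 255$ need at most 9 bits each, so a $2 \times 3$ instance takes at most $11 \cdot 9 = 99$ bits.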

    Optimization problems are solved by algorithms. The running time

    of an algorithm will, in general, depend on the instance to which it is

    applied. Let T (n) be the worst-case running time of some algorithm

    over all instances of size n, under the bit model.

    Definition 4.5 An algorithm runs in polynomial time if there exists an integer $k$ such that $T(n) = O(n^k)$.

    Fact: Suppose that an algorithm takes polynomial time under the

    arithmetic model. Furthermore, suppose that on instances of size n,

    any integer produced in the course of execution of the algorithm has

    size bounded by a polynomial in n. Then, the algorithm runs in poly-

    nomial time under the bit model as well.

    The class P: A combinatorial optimization (CO) problem is in P if it admits an algorithm of polynomial complexity.

    The class NP: A combinatorial problem is in NP if for all YES instances, there exists a polynomial length certificate that can be used to verify in polynomial time that the answer is indeed yes.

    NP: e.g., verify the optimality of an LP solution.

    Obviously, $P \subseteq NP$. But,

    $P = NP$?

    Definition 4.6 Suppose that there exists an algorithm for some problem $A$ that consists of a polynomial time computation in addition to a polynomial number of subroutine calls to an algorithm for problem $B$. We then say that problem $A$ reduces (in polynomial time) to problem $B$. For short, $A \overset{R}{\Longrightarrow} B$.

    In the above definition, all references to polynomiality are with respect to the size of an instance of problem $A$.

    Theorem 4.2 If $A \overset{R}{\Longrightarrow} B$ and $B \in P$, then $A \in P$.

    The above theorem says that if $A \overset{R}{\Longrightarrow} B$, then problem $A$ is not much more difficult than problem $B$.

    For example, let us consider the following scheduling problem: a set

    of jobs are to be processed on two machines where no job requires in

    excess of three operations. A job may require, for example, processing

    on machine one first, followed by machine two, and finally back on

    machine one. Our objective is to minimize makespan, i.e., complete

    the set of jobs in minimum time. Let us refer to this problem as (PJ).


    Now, take the one-row integer program or knapsack problem that we state in the equality form: given integers $a_1, a_2, \dots, a_n$ and $b$, does there exist a subset $S \subseteq \{1, 2, \dots, n\}$ such that $\sum_{j \in S} a_j = b$? Calling the latter problem (PK), our objective is to show that (PK) polynomially reduces to (PJ).

    For a given (PK) we construct an instance of (PJ) wherein the first $n$ jobs require only one operation, this being on machine one. Job $j$ has processing time $a_j$ for $j = 1, 2, \dots, n$. Job $n+1$ possesses three operations constrained in such a way that the first is on machine two, the second on machine one, and the last on machine two again. The first such operation has duration $b$, the second duration 1, and the third duration $\sum_{j=1}^{n} a_j - b$.

    Clearly, one lower bound on the completion time of all jobs in this instance of (PJ) is the sum of the processing times for job $n+1$, i.e., $\sum_{j=1}^{n} a_j + 1$. Any feasible schedule for all jobs achieving this makespan value must be optimal. Suppose a subset $S$ exists such that the knapsack problem is solvable. For (PJ) we can schedule the jobs implied by $S$ first on machine one, followed by the second operation of job $n+1$, and complete with the remaining jobs (those not given by $S$). The first and last operations for job $n+1$ (on machine two) finish at times $b$ and $\sum_{j=1}^{n} a_j + 1$, respectively. Thus, the completion time of this schedule is $\sum_{j=1}^{n} a_j + 1$.

    If, conversely, there is no subset $S \subseteq \{1, 2, \dots, n\}$ with $\sum_{j \in S} a_j = b$, our scheduling instance would be forced into one of two situations: either job $n+1$ waits before it obtains the needed unit of time on machine one, or some of jobs $1, 2, \dots, n$ wait to keep job $n+1$ progressing. Either way the last job will complete after time $\sum_{j=1}^{n} a_j + 1$.

    We can conclude that the question of whether (PK) has a solution can be reduced to asking whether the corresponding (PJ) has makespan no greater than $\sum_{j=1}^{n} a_j + 1$. Since (as is usually the case) the size of the required (PJ) instance is a simple polynomial (in fact linear) function of the size of (PK), we have a polynomial reduction. Problem (PK) indeed reduces polynomially to (PJ).
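    The construction used in this reduction can be written out concretely. The sketch below is an illustration under assumed names and data layout (jobs as lists of `(machine, duration)` operations, subsets as sets of 0-based indices); it builds the (PJ) instance and, given a subset $S$ that solves (PK), the schedule described above.

```python
def build_pj_instance(a, b):
    """Jobs as ordered lists of (machine, duration) operations."""
    jobs = [[(1, aj)] for aj in a]                  # jobs 1..n: one operation on machine one
    jobs.append([(2, b), (1, 1), (2, sum(a) - b)])  # job n+1: machine two, one, two
    return jobs

def schedule_from_subset(a, b, S):
    """Build the schedule from a subset S (0-based indices) with sum of a_j over S equal to b.

    Returns (start times on machine one for jobs 1..n,
             start time of job n+1's middle operation, makespan).
    """
    if sum(a[j] for j in S) != b:
        raise ValueError("S does not solve the knapsack instance")
    start, t = {}, 0
    for j in sorted(S):            # jobs in S fill machine one during [0, b]
        start[j] = t
        t += a[j]
    middle = t                     # job n+1 uses machine one during [b, b + 1]
    t += 1
    for j in range(len(a)):        # remaining jobs follow on machine one
        if j not in S:
            start[j] = t
            t += a[j]
    # Job n+1's last operation runs on machine two from b + 1 to sum(a) + 1,
    # the same moment machine one becomes idle.
    return start, middle, t
```

    For the hypothetical data $a = (3, 5, 2, 7)$, $b = 10$ and $S = \{1, 4\}$, the schedule finishes at time $17 + 1 = 18$, which meets the lower bound.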

    4.4 Three Forms of a CO Problem

    A CO problem: $F$ is the feasible solution set and $c : F \to \mathbb{R}$ is a cost function,

    $$\min\ c(f) \quad \text{s.t.}\ f \in F.$$

    The above CO problem has three versions:

    a) Optimization version: Find the optimal solution.

    b) The evaluation version: Find the optimal value of $c(f)$, $f \in F$.

    c) The recognition version: Given an integer $L$, is there a feasible solution $f \in F$ such that $c(f) \le L$?

    These three types of problems are closely related in terms of algorithmic difficulty. In particular, the difficulty of the recognition problem

    is usually a very good indicator of the difficulty of the corresponding

    evaluation and optimization problems. For this reason, we can focus,

    without loss of generality, on recognition problems.

    Consider the following combinatorial optimization problem, called

    the maximum clique problem:

    Given a graph $G = (V, E)$, find the largest subset $C \subseteq V$ such that for all distinct $u, v \in C$, $(v, u) \in E$.

    The maximum clique problem is in NP or, in short, Clique $\in NP$.

    Assume that we have a procedure cliquesize which, given any graph

    G, will evaluate the size of the maximum clique of G. In other words

    cliquesize solves the evaluation version of the maximum clique problem.

    We can then make efficient use of this routine in order to solve the

    optimization version.

    Step 0. $X = \emptyset$.

    Step 1. Find $v \in V$ such that cliquesize($G(v)$) = cliquesize($G$), where $G(v)$ is the subgraph of $G$ consisting of $v$ and all its adjacent nodes.

    Step 2. $X = X + v$. $G = G(v) \setminus v$. If $G = \emptyset$, stop; otherwise, go to Step 1.
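    The three steps can be sketched as follows. Since no polynomial time `cliquesize` is known, the sketch substitutes a brute-force stand-in for the oracle; that stand-in, the function names, and the adjacency-dictionary format (node mapped to its set of neighbours) are all assumptions for illustration.

```python
from itertools import combinations

def cliquesize(adj):
    """Brute-force stand-in for the evaluation oracle (exponential time)."""
    nodes = list(adj)
    for r in range(len(nodes), 0, -1):
        for c in combinations(nodes, r):
            if all(v in adj[u] for u, v in combinations(c, 2)):
                return r
    return 0

def neighborhood(adj, v):
    """G(v): the subgraph induced by v and all its adjacent nodes."""
    keep = adj[v] | {v}
    return {u: adj[u] & keep for u in keep}

def max_clique(adj):
    """Recover a maximum clique using only calls to cliquesize."""
    X = set()                                    # Step 0
    G = {u: set(nb) for u, nb in adj.items()}
    while G:
        size = cliquesize(G)
        # Step 1: a node v whose neighborhood still contains a maximum clique.
        v = next(v for v in G if cliquesize(neighborhood(G, v)) == size)
        # Step 2: keep v and continue inside G(v) \ v.
        X.add(v)
        Gv = neighborhood(G, v)
        G = {u: Gv[u] - {v} for u in Gv if u != v}
    return X
```

    Each round makes at most $|V| + 1$ oracle calls and removes at least one node, so the number of calls to `cliquesize` is polynomial in $|V|$.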

    We now discuss the relation between the three variants in general.

    Let us assume that the cost $c(f)$ of any feasible $f \in F$ can be computed in polynomial time. It is then clear that a polynomial time algorithm for the optimization problem leads to a polynomial time algorithm for the evaluation problem. (Once an optimal solution is found, use it to evaluate, in polynomial time, the optimal cost.) Similarly, a polynomial time algorithm for the evaluation problem immediately translates to a polynomial time algorithm for the recognition problem. For many

    interesting problems, the converse is also true: namely a polynomial

    time algorithm for the recognition problem often leads to polynomial

    time algorithms for the evaluation and optimization problems.

    Suppose that the optimal cost is known to take one of $M$ values. We can then perform binary search and solve the evaluation problem using $\lceil \log M \rceil$ calls to an algorithm for the recognition problem. If $\log M$ is bounded by a polynomial function of the instance size (which is often the case), and if the recognition algorithm runs in polynomial time, we obtain a polynomial time algorithm for the evaluation problem.
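    A minimal sketch of this binary search is given below. It assumes, for illustration only, that the optimal cost is an integer known to lie in a range `[lo, hi]` (so $M$ = `hi - lo + 1`) and that `recognize(L)` answers the recognition question "is there a feasible $f$ with $c(f) \le L$?"; both names are hypothetical.

```python
def evaluate_by_binary_search(recognize, lo, hi):
    """Return (optimal cost, number of recognition calls).

    recognize(L) must be False for L below the optimal cost and True from
    the optimal cost upward; the optimal cost must lie in [lo, hi].
    """
    calls = 0
    while lo < hi:
        mid = (lo + hi) // 2
        calls += 1
        if recognize(mid):
            hi = mid        # some feasible solution costs at most mid
        else:
            lo = mid + 1    # every feasible solution costs more than mid
    return lo, calls

# Toy instance: the feasible costs are listed explicitly.
costs = [17, 42, 23, 35]
value, calls = evaluate_by_binary_search(lambda L: any(c <= L for c in costs), 0, 63)
```

    Here $M = 64$, and the search settles on the optimal cost 17 after $\log_2 64 = 6$ recognition calls.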

    We will now give another example to show how a polynomial time

    evaluation algorithm can lead to a polynomial time optimization al-

    gorithm by using the zero-one integer programming problem (ZOIP).

    Given an instance $I$ of ZOIP, let us consider a particular component of the vector $x$ to be optimized, say $x_1$, and let us form a new instance $I'$ by adding the constraint $x_1 = 0$. We run an evaluation algorithm on instances $I$ and $I'$. If the outcome is the same for both instances,

    we can set x1 to zero without any loss of optimality. If the outcome

    is different, we conclude that x1 should be set to 1. In either case, we

    have arrived at an instance involving one less variable to be optimized.

    Continuing the same way, fixing the value of one variable at a time, we

    obtain an optimization algorithm whose running time is roughly equal

    to the running time of the evaluation algorithm times the number of

    variables.
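    The variable-fixing argument can be sketched as follows. The text does not fix a format for ZOIP, so the sketch assumes the form: minimize $c^T x$ subject to $Ax \le b$, $x \in \{0, 1\}^n$. The evaluation oracle is replaced by a brute-force stand-in, and all names are hypothetical.

```python
from itertools import product

def evaluate(c, A, b, fixed):
    """Brute-force stand-in for the evaluation oracle.

    Returns the optimal cost when the variables in `fixed` (index -> value)
    are held at the given values, or None if no feasible point exists.
    """
    n = len(c)
    best = None
    for x in product((0, 1), repeat=n):
        if any(x[i] != v for i, v in fixed.items()):
            continue
        if all(sum(row[j] * x[j] for j in range(n)) <= bi for row, bi in zip(A, b)):
            value = sum(c[j] * x[j] for j in range(n))
            if best is None or value < best:
                best = value
    return best

def optimize_via_evaluation(c, A, b):
    """Recover an optimal x by fixing one variable at a time."""
    fixed = {}
    target = evaluate(c, A, b, fixed)       # optimal cost of the instance I
    if target is None:
        return None
    for i in range(len(c)):
        trial = dict(fixed)
        trial[i] = 0                         # instance I': add the constraint x_i = 0
        # Same optimal cost: x_i = 0 loses nothing. Different: x_i must be 1.
        fixed[i] = 0 if evaluate(c, A, b, trial) == target else 1
    return [fixed[i] for i in range(len(c))]
```

    Because the optimal cost of the current instance never changes after a variable is fixed, the sketch reuses it and makes $n + 1$ oracle calls in total.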

    4.5 NPC

    The class coNP: A combinatorial problem is in coNP if for all NO instances, there exists a polynomial length certificate that can be used to verify in polynomial time that the answer is indeed no.

    Obviously, $P \subseteq \text{coNP}$. But,

    $P = \text{coNP}$?

    The next definition deals with the simplest type of a reduction,

    where an instance of problem A is replaced by an equivalent instance

    of problem B. Rather than developing a general definition of equiv-

    alence, it is more convenient to focus on the recognition problems,

    that is, problems that have a binary answer (e.g., YES or NO).

    Figure 4.2: Relationships among P, NP and co-NP

    Definition 4.7 Let $A$ and $B$ be two recognition problems. We say that problem $A$ transforms to problem $B$ (in polynomial time) if there exists a polynomial time algorithm which, given an instance $I_1$ of problem $A$, outputs an instance $I_2$ of $B$, with the property that $I_1$ is a YES instance of $A$ if and only if $I_2$ is a YES instance of $B$. [$A \overset{R}{\Longrightarrow} B$.]

    The class NP-hard: A problem $A$ is NP-hard if for any problem $B \in NP$, $B \overset{R}{\Longrightarrow} A$.

    Theorem 4.3 Suppose that a problem $C$ is NP-hard and that $C$ can be transformed (in polynomial time) to another problem $D$. Then $D$ is NP-hard.

    Define a set of Boolean variables $\{x_1, x_2, \dots, x_n\}$ and let the complement of any of these variables $x_i$ be denoted by $\bar{x}_i$. In the language of logic, these variables are referred to as literals. To each literal we assign a label of true or false such that $x_i$ is true if and only if $\bar{x}_i$ is false.

    Let the symbol $\vee$ denote "or" and the symbol $\wedge$ denote "and". We can then write any Boolean expression in what is referred to as conjunctive normal form, i.e., as a finite conjunction of disjunctions using each literal at most once. For example, with the set of variables $\{x_1, x_2, x_3, x_4\}$ one might encounter the following conjunctive normal form expression

    $$(x_1 \vee x_2 \vee x_4) \wedge (x_1 \vee x_2 \vee x_3) \wedge (x_2 \vee x_4).$$

    Each disjunctive grouping in parentheses is referred to as a clause. The

    satisfiability problem is

    Given a set of literals and a conjunction of clauses defined over the

    literals, is there an assignment of values to the literals for which the

    Boolean expression is true?

    If so, then the expression is said to be satisfiable. The Boolean expres-

    sion above is satisfiable via the following assignment: x1 = x2 = x3 =

    true and x4 = false. Let SAT denote the satisfiability problem and Q

    be any member of NP .

    Theorem 4.4 (Cook (1971)) Every problem $Q \in NP$ polynomially reduces to SAT.

    Karp (1972) showed that SAT polynomially reduces to many com-

    binatorial problems.
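    Checking a proposed truth assignment against a conjunctive normal form expression is exactly the kind of polynomial time certificate verification that places SAT in NP, while the obvious way to find an assignment tries all $2^n$ of them. The sketch below illustrates both; the encoding of a literal as $+i$ for $x_i$ and $-i$ for its complement, the function names, and the particular clauses (including where the complements are placed) are assumptions for illustration.

```python
from itertools import product

def satisfies(clauses, assignment):
    """Certificate check: every clause contains at least one true literal.

    A literal +i stands for x_i and -i for its complement; assignment maps
    each variable index i to True or False.
    """
    return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses)

def brute_force_sat(clauses, n):
    """Exhaustive search over all 2^n assignments (exponential time)."""
    for values in product((True, False), repeat=n):
        assignment = dict(enumerate(values, start=1))
        if satisfies(clauses, assignment):
            return assignment
    return None

# Hypothetical formula on x1..x4:
# (x1 or not x2 or x4) and (not x1 or x2 or x3) and (not x2 or not x4)
clauses = [[1, -2, 4], [-1, 2, 3], [-2, -4]]
certificate = {1: True, 2: True, 3: True, 4: False}
```

    The check in `satisfies` takes time linear in the total length of the clauses, whereas `brute_force_sat` may examine every assignment.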

    The class NPC: A recognition problem $A$ is NPC if

    i) $A \in NP$ and

    ii) for any problem $B \in NP$, $B \overset{R}{\Longrightarrow} A$.

    Cook's Theorem shows SAT $\in$ NPC because it can be checked easily that SAT $\in NP$.

    Examples of NPC problems: ILP, ZOIP, Clique, Vertex Packing, TSP, TSP, 3-Index Assignment, Knapsack, etc.

    Figure 4.3: Relationships among P, NP, NPC, and NP-hard

    NP-hardness is not a definite proof that no polynomial time algorithm exists. For all we know, it is always possible that ZOIP belongs to $P$, and $P = NP$. Nevertheless, NP-hardness suggests that we should stop searching for a polynomial time algorithm, unless we are willing to tackle the $P = NP$ question.

    For a good guide to the theory of NPC, see

    1979, M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness.

    1995, C. H. Papadimitriou, Computational Complexity.