lncs 4331 - computational experience with branch, cut and ... filelp-based branch and bound is a...

Computational Experience with Branch, Cut

and Price Algorithms in Grid Environments

Sonya Marcarelli, Emilio Pasquale Mancini, and Umberto Villano

Universita del Sannio, Dipartimento di Ingegneria, RCOST, Benevento, Italy{sonya.marcarelli, epmancini, villano}@unisannio.it

Abstract. This paper presents our computational experience with par-allel Branch, Cut and Price algorithms in a geographically-distributedgrid environment. After a brief description of our framework for solvinglarge-scale optimization problems, we describe the experimental grid en-vironment and the tests performed, presenting the obtained performanceresults.

1 Introduction

In the field of integer optimization, Branch and Bound is one of the most commonmethods used to solve hard optimization problems. It uses a divide-and-conquerstrategy to explore the set of feasible solutions and takes trace of them using asearch tree. Unfortunately, many real-world problems are NP-Hard and may re-quire search trees of exponential size. Then, it is natural to try to parallelize thesearch in order to make the solution more practical. Currently, there are severalframeworks using a parallel approach to solve optimization problems, while thepotential of computing grids seems to have been only partially exploited. Thispaper aims to explore this field, since it describes our computational experiencein using the Branch, Cut and Price platform, described in [1], for solving large-scale optimization problems in a grid environment. In a previous paper [1], wehave shown the tests performed in a cluster of Globus nodes on a single LAN.Here, the target computing environment is a grid made up of two clusters be-longing to two different LANs, where the front-ends have public addresses andthe compute nodes have private addresses.

The software system developed for our test is composed of two framework,BCP-G and Meta-PBC, and a web portal, SWI-Portal. BCP-G is a customizedversion of COIN/BCP, an open source framework developed within the IBMCOIN-OR project. The original COIN/BCP framework, based on the use ofPVM libraries, has been provided with a new MPI communication API ableto exploit the MPICH-G2 system, a grid-enabled MPI implementation [2, 3].Meta-PBC is instead a brand new framework, implementing a master-workerschema [4]. SWI-Portal is a web portal that manages users and jobs.

In the next section, we introduce the Branch, Cut and Price algorithms andthe architecture of our grid-enabled system. Then, we describe our experimen-tal grid environment with the interconnection network used and we present

G. Min et al. (Eds.): ISPA 2006 Ws, LNCS 4331, pp. 125–134, 2006.c© Springer-Verlag Berlin Heidelberg 2006

126 S. Marcarelli, E.P. Mancini, and U. Villano

several case studies and the obtained performance results. The paper closes witha discussion on future work and our conclusions.

2 The Branch, Cut and Price Algorithms

Branch and Bound algorithms are among the most widely used methods forsolving complex optimization problems [5]. An optimization problem is the taskof minimizing (maximizing) an objective function, a function that associates acost to each solution. Branch and Bound uses a divide-and-conquer strategy thatpartitions the solution space into subsets. As is well known, it is made up of twophases: a branching one, where the subsets of solutions are examined forming atree structure called search tree, and a bounding one, where they are evaluatedfinding upper and lower bounds to the optimal solution.

LP-based Branch and Bound is a Branch and Bound where the lower boundis computed solving the LP-relaxation of the problem. For example, in a genericMILP (Mixed Integer Linear Programming) problem, there is a finite numberof variables subject to the integrality constraint. By relaxing this constraint,we have a LP-relaxation, whose optimal value is a lower bound to the originalproblem. A typical branching operation is to select a variable with fractionalvalue in the LP solution.

Branch and Cut algorithms are a type of Branch and Bound where a finitenumber of cuts, that is, valid inequalities, are dynamically added to the searchtree, in order to improve the lower bound to the LP-relaxation [6, 7].

Branch and Price algorithms are instead based on column generation in orderto solve problems with a very large number of variables. They use initially onlya small subset of the problem variables and of the respective columns in theconstraints matrix, thus defining a reduced problem. In fact, in the originalproblem, there are too many columns and great part of them will have therespective variables equal to zero in an optimal solution.

Branch, Cut and Price joins the two methods used by Branch and Cut andBranch and Price, producing dynamically both cutting planes and variables [5].

3 The Architecture of the Solver System

The architecture of our solver [1] is shown in Fig. 1. In the figure, the upperlayer is the portal interface. In the middle, there are the two framework BCP-Gand MetaPBC, all of which rely on the lower layer (Globus and MPICH-G2).

BCP-G is an optimization framework based on the Branch, Cut And Pricemethod, which we have implemented from COIN/BCP, adding to it a newcommunication interface written in MPI. COIN/BCP is an open-source frame-work based on Branch, Cut and Price for solving mixed integer programmingproblems [8]. It offers a parallel implementation of the algorithm based onthe message-passing environment PVM (Parallel Virtual Machine). COIN/BCPimplements a single-node pool algorithm, where there is a single central list

Computational Experience with Branch, Cut and Price Algorithms 127

of candidate sub-problems to be processed, owned by the tree manager. The mod-ules communicate with each other by exchanging messages through a message-passing protocol defined in a separate communications API.

Fig. 1. Solver architecture

Our communication interface in MPI is implemented by the two classes BCPmpi environment and BCP mpi id, which manage the communications betweencomputational modules and the process ids, respectively. This new interface,which is now integrated in the COIN-OR framework (http://www.coin-or.org/download.html), allows the use of this framework in a Globus grid environmentthrough the grid-enabled implementation of MPI, MPICH-G2. The user has sim-ply to write a Globus rsl script and, through the globusrun command, he/shecan launch the solver execution [9].

Meta-PBC is a parallel solver for solving optimization problems using theBranch and Cut algorithm. It consists of three modules: manager, worker andtree monitor [4]. The manager is responsible for the initialization of the problemand manages the message handling between the workers. The worker is a sequen-tial Branch and Cut solver, with additional functionality to communicate in theparallel layer. The workers communicate with each other through the parallelAPI to know the state of the overall solution process. The parallel interactionbetween modules is achieved by a separate communication API. The current ver-sion is implemented in MPI. In this way, the processes can be executed on a gridusing MPICH-G2. The tree monitor collects information about the search tree.

SWI-Portal is a web portal that allows users to submit jobs and hence to solveoptimization problems, to monitor their job, to view their output and to down-load the results. Users interact with the portal, and, therefore, with the solversand the grid, through this interface. SWI-Portal is implemented using the JavaServer Pages technology (JSP). It consists of an user interface and of a set ofJava classes, wrapping the most important and useful Globus functions. Fur-thermore, it uses a database to collect information on users, jobs and resources.

(http://www.coin-or.org/download.html)

(http://www.coin-or.org/download.html)


All the services and the functions supplied by the SWI-Portal are grouped infour subsystems. The first, the account subsystem, is responsible for managinguser access in conjunction with the users DB. It allows a user to register in thesystem and to enter the portal, giving his login and password. The second one isthe scheduling subsystem. It invokes the Globus gatekeeper and the associatedjob-manager to start the run with the parameters supplied by the user. Thesubsystem also records information about the runs in the database. From thepages of the Monitoring subsystem, a user can check the status and any otherinformation about all the started processes (such as output, error, rsl, and searchtree). Through the Download Subsystem, a user can download all informationregarding his jobs and/or cancel this from the server. More details about thesolver system architecture can be found in [1].

4 The Grid Testbed

To test the system on slow extra-LAN connections, we have configured an exper-imental grid environment made up of two Rocks clusters (fab4 and e-science)at two different sites at the University of Sannio [10]. Fig. 2 shows the archi-tecture of our grid environment. Each cluster has front-end with public IP andcompute nodes with hidden IPs. We used 4 workstation on fab4, each equippedwith Pentium Xeon, 2.8 GHz CPU and 1 GB of RAM, and 16 workstationson e-science, each equipped with Pentium Xeon, 2.8 GHz CPU and 1 GB ofRAM. The intra-cluster connection is 100 GigaEthernet. The two clusters arenot on the same campus LAN, and are actually connected by a very slow connec-tion (the details are provided later). We installed on each front-end the GlobusToolkit and the local Sun Grid Engine (SGE) scheduler. The Sun Grid Engine isa distributed resource management (DRM) software and it provides functions toutilize effectively the resources within the cluster as submitting, monitoring andmanaging jobs. In particular, the Globus gatekeeper uses the SGE scheduler asits job-manager. On each head node, we installed MPICH-G2, which allows inter-cluster and intra-cluster communication. MPICH-G2, based on Globus Toolkitservices, allows to run MPI application on a grid environment. It uses TCP forinter-machine messaging and a vendor-supplied MPI (where available) for intra-machine messaging. MPICH-G2 requires point-to-point communication betweenthe nodes where the jobs are running. Unfortunately, this requires that all com-pute nodes have public IP addresses, but this is in contrast with a classicalcluster configuration like ours, where the compute nodes have private IPs.

In order to solve this problem, and to use all of the processors available, we havechosen a solution based on the Realm Specific IP (RSIP) framework and protocol.RSIP is a network address translation technology that performs a function similarto NAT. It allows the communication between two hosts belonging to differentaddress spaces. In our solution, we installed on each head node an RSIP serverand on each compute node an RSIP client. When a compute node of e-science(RSIP client) wishes to contact a node of fab4, it queries the RSIP server for aport number and a public IP address. The client then tunnels the packets to the


Fig. 2. The Grid Testbed

Fig. 3. Bandwidth in log scale

RSIP server, which strips off the tunnel headers and sends the packets to the targetnode. On incoming packets, the RSIP server looks up the client IP, based on portnumber, adds the tunnel header and sends them to the RSIP client.

We measured the bandwidth of the extra-LAN connection between the twoclusters, including the overhead of the RSIP protocol, through a simple MPIping-pong program, which calculates the communication time between two pro-cesses using blocking send and receive (Fig. 3). The figure shows the band-width of the intra-cluster network, on e-science and fab4, and of the inter-cluster


Table 1. Transmission latency

Latency (µs)

inter-cluster 650158.2

e-science 2067.8

fab4 1994.08

network. Moreover, Table 1 shows the transmission latency measured betweentwo nodes in a single cluster, and in different clusters (inter-cluster).

5 Case Studies

We present here the performance results obtained by our solver for the solutionof an optimization problem in the above-described grid environment. In partic-ular, we have implemented a generic MIP solver to solve mixed integer linearprogramming problems. A MIP problem has the following form:

min cT x s.t. Ax ≥ b xz ∈ Zn, xc ∈ Rn

where c ∈ Rn, A ∈ Rm × n, b ∈ Rm. In the computational experiments of BCP-G with the MIP solver we take advantage of the MIPLIB library [11], which,since its introduction, has become a standard test set, and is commonly usedto compare the performance of mixed integer optimizers. As LP solver, we useCLP, an open source solver of the COIN-OR project. Table 2 shows the detailsof the tested problem instances. Column name is the name of the probleminstance, rows the number of constraints, cols the number of variables, ints thenumber of integer variables and nonzeros the number of nonzero elements inthe constraints matrix.

Table 2. Tested instances

name rows cols ints nonzeros

Stein45 331 45 45 1034Misc07 212 260 259 861910teams 230 2025 —— 12150

Figures 4(a), 4(b), 4(c) compare the response times of BCP-G using differ-ent scheduling strategies. Using the first strategy, s1, slave processes first arespawned on e-science, and, only when all the nodes of this cluster have beenused, on fab4. In the second one, s2, we spawn immediately slave processes al-ternatively on both clusters. Of course, in the second case the response time ishigher, especially for low number of hosts, due to the latency introduced by theinter-cluster connection (its max measured bandwidth is 0,2 MB/s). All the testspresented show that from 16 hosts onward the use of additional hosts does notinvolve any significant gain in the solver performance. The reason is the used


(a) stein45 (b) misc07

(c) 10teams

Fig. 4. Response times for the stein45, misc07 and 10teams problem, using twoscheduling strategies on a variable number of hosts

Fig. 5. The centralized approach

parallelization strategy along with the architecture of our grid environment. Thesolver system uses a classical centralized approach where all the communicationsare between the master and the slave processes. During the execution, there isa large amount of data exchanged between the master and the slave processes.In our tests, the master process, the tree manager, is on a compute node one-science and the slave processes are on all the other compute nodes. The use ofthe additional compute nodes of fab4 increases the time spent in communicationand hence the parallel overhead because of the high transmission latency of theextra-LAN connection. The alternative is to use a decentralized approach whereeach cluster has its local pool of problems to solve in order to reduce the com-munication on slow connections. Figure 5 and 6 show a centralized approach,


Fig. 6. The decentralized approach

like our solver, and a decentralized approach in a grid environment made up ofthree clusters on three LANs.

6 Related Work

Many software packages implementing parallel branch and bound have beendeveloped. SYMPHONY [5] is a parallel framework, similar to COIN/BCP, forsolving mixed integer linear programs; PICO [12]. PARINO [13] and FATCOP[14, 15] are generic parallel MIP solvers. Some other parallel solver are PUBB[16] and PPBB-Lib [17]. ALPS [18] is a framework for implementing parallelgraph search algorithms and MW [19] is a framework for making master-workerapplication in grid-environment using Condor.

The literature dealing with the management and the performance of MPIapplications in grid environments made up of private IP clusters is relativelylimited. The paper [20] presents MPICH-GP, an extension of MPICH-G2 forsupporting Private IP, whereas [21] describes a solution based on IMPI standardwith Network Address Translation mechanism and [22] proposes a solution basedon RSIP. Papers on similar topics are [23] and [24]. The paper [25] presentsa performance analysis on hierarchical grid system with different bandwidthsbetween clusters.

7 Conclusions and Future Work

In this paper, we have described the configuration of our experimental grid en-vironment, which is made up of two clusters located in two different LAN. Wehave described the problems encountered using MPICH-G2 in such environ-ment, where the compute nodes have hidden IPs, showing the solution basedon RSIP and its performance evaluation. The computational tests performedand presented here led to unsatisfactory results, in that the high latency of theextra-LAN connection minimizes the performance gains due to the use of a highnumber of compute nodes. A grid is a set of resources of heterogeneous naturewith different computational power connected by network with different perfor-mance characteristics. In this context, it is necessary to make grid-aware the


application in order to achieve good performance. A simple master-worker ap-proach, as our tests prove, is not a good solution because does not take intoaccount the topology of the grid environment. In our future work, we wish tochange the architecture of the solver system using a decentralized approach. Wewould divide the search tree in many sub-trees and assign one of them to eachcluster, which will solve it individually (Fig. 6). In this way, we think that theoverhead introduced by slow networks, as in the case described in our tests,should be reduced.

References

1. Mancini, E., Marcarelli, S., Ritrovato, P., Vasil’ev, I., Villano, U.: A grid-awarebranch, cut and price implementation. Lecture Notes in Computer Science 3666(2005) 38–47

2. Ferreira, L., Jacob, B., Slevin, S., Brown, M., Sundararajan, S., Lepesant, J., Bank,J.: Globus Toolkit 3.0 Quick Start. IBM. (2003)

3. Karonis, N., Toonen, B., Foster, I.: MPICH-G2: A Grid-Enabled Implementationof the Message Passing Interface. J. of Parallel and Dist. Comp. 63 (2003) 551–563

4. Vasil’ev, I., Avella, P.: PBC: A parallel branch-and-cut framework. In: Proc. of35th Conference of the Italian Operations Res. Society, Lecce, Italy (2004) 138

5. Ralphs, T., Ladanyi, L., Saltzman, M.: Parallel Branch, Cut, and Price for Large-Scale Discrete Optmization. Mathematical Programming 98 (2003) 253–280

6. Margot, F.: BAC: A BCP Based Branch-and-Cut Example. (2003)7. Cordiery, C., Marchandz, H., Laundyx, R., Wolsey, L.: bc-opt: a Branch-and-Cut

Code for Mixed Integer Programs. Mathematical Programming (86) (1999) 335–354

8. Ralphs, T., Ladanyi, L.: COIN/BCP User’s Manual. (2001) http://www.coin-or.org/Presentations/bcp-man.pdf.

9. Globus Alliance: WS GRAM: Developer’s Guide. (2005) http://www-unix.

globus.org/toolkit/docs/3.2/gram/ws/developer .10. Papadopoulos, P.M., Katz, M.J., Bruno, G.: NPACI Rocks: tools and techniques

for easily deploying manageable Linux clusters. Concurrency and Computation:Practice and Experience 15 (2003) 707–728

11. Bixby, R.E., Ceria, S., McZeal, C.M., Savelsbergh, M.W.P.: An updated mixedinteger programming library MIPLIB 3.0. Optima (58) (1998) 12–15

12. Eckstein, J., Phillips, C., Hart, W.: Pico: An object-oriented framework for parallelbranch and bound. Technical report, Rutgers University, Piscataway, NJ (2000)

13. Linderoth, J.: Topics in Parallel Integer Optimization. PhD thesis, School ofIndustrial and Systems Engineering, Georgia Inst. of Tech., Atlanta, GA (1998)

14. Chen, Q., Ferris, M.C.: Fatcop: A fault tolerant condor-pvm mixed integer pro-gramming solver. Technical report, University of Wisconsin CS Department Tech-nical Report 99-05, Madison, WI (1999)

15. Chen, Q., Ferris, M., Linderoth, J.: Fatcop 2.0: Advanced features in an oppor-tunistic mixed integer programming solver. Annals of Op. Res. (103) (2001) 17–32

16. Shinano, Y., Higaki, M., Hirabayashi, R.: Control schemas in a generalized utilityfor parallel branch and bound. In: Proc. of the 1997 Eleventh International ParallelProcessing Symposium, Los Alamitos, CA, IEEE Computer Society Press (1997)

17. Tschoke, S., Polzer, T.: Portable Parallel Branch-And-Bound Library PPBB-LibUser Manual. Department of computer science Univ. of Paderborn. (1996)

http://www.coin-or.org/Presentations/bcp-man.pdf

http://www.coin-or.org/Presentations/bcp-man.pdf

http://www-unix.globus.org/toolkit/docs/3.2/gram/ws/developer

http://www-unix.globus.org/toolkit/docs/3.2/gram/ws/developer


18. Ralphs, T.K., Ladanyi, L., Saltzman, M.J.: A library hierarchy for implementingscalable parallel search algorithms. J. Supercomput. 28(2) (2004) 215–234

19. Goux, J., Kulkarni, S., Yoder, M., Linderoth, J.: An enabling framework for master-worker applications on the computational grid. In: HPDC ’00: Proceedings of theNinth IEEE International Symposium on High Performance Distributed Comput-ing (HPDC’00), Washington, DC, USA, IEEE Computer Society (2000) 43

20. Park, K., Park, S., Kwon, O., Park, H.: Mpich-gp: A private-ip-enabled mpiover grid environments. In: Lecture Notes in Computer Science, Proc. of Paral-lel and Distributed Processing and Applications, Second International Symposium(ISPA04). Volume 3358., Berlin, DE, Springer (2004) 469–473

21. Velusamy, V., Bangalore, P., Raman, P.: Communication strategies for private-ip-enabled interoperable message passing across grid environments. In: Proc.of First International Workshop on Networks for Grid Applications. (2004)http://www.broadnets.org/2004/workshop-papers/Gridnets/Velusamy V.pdf.

22. Das, D., Sabharwal, R., Saraswati, S., Anantharaman, P.N., Oh, J.: A networkarchitecture for enabling execution of mpi applications on the grid. InternationalJournal of Information Technology 11(4) (2004) 74–83

23. Heymann, E., Senar, M.A., Fernandez, E., Fernandez, A., Salt, J.: Managing mpiapplications in grid environments. In Dikaiakos, M.D., ed.: Grid Computing: Sec-ond European AcrossGrids Conference, Lecture Notes in Computer Science. Vol-ume 3165. (2004) 42–50

24. Choi, S., Park, K., Han, S., Park, S., Kwon, O., Kim, Y., Park, H.: An nat-basedcommunication relay scheme for private-ip-enabled mpi over grid environments.In: International Conference on Computational Science. (2004) 499–502

25. Chen, C., Schmidt, B.: Performance analysis of computational biology applicationson hierarchical grid systems. In: CCGRID, IEEE Computer Society (2004) 426–433

http://www.broadnets.org/2004/workshop-papers/Gridnets/Velusamy_V.pdf

lncs 4331 - computational experience with branch, cut and ... filelp-based branch and bound is a...

Documents