integer programming methods for reserve selection and design

15
Im CHAPTER 4 Integer Programming Methods for Reserve Selection and Design Robert G. Haight and Stephanie A. Snyder 4.1 Introduction How many nature reserves should there be? Where should they be located? Which places have high- est priority for protection? Conservation biolo- gists, economists, and operations researchers have been developing quantitative methods to address these questions since the 1980s. The first formu- lations (Kirkpatrick 1983; Margules et al. 1988; Chapters 3 and 19) focused on species protection and efficiency - select the minimum number of reserves from a list of candidate sites to represent all species - and were solved using iterative heu- ristics, which guarantee only an approximation of the optimal solution (Chapter 5). Researchers later formulated the problem as a 0-1 linear integer pro- gramming (IP) model and found mathematically proven optimal solutions using the branch and bound (B&B) method available in commercial opti- mization software (Saetersdal et al. 1993; Underhill 1994). The main advantage of formulating a reserve selection problem as an IF model is the availability of solution methods such as B&B that guarantee finding the optimal solution. Researchers quickly recognized this advantage and have formulated reserve selection problems with a remarkable array of objectives and constraints. We review IP methods and formulations applied to reserve selection and design. Because IP is a branch of linear programming (LP), we begin with a discussion LP models, assumptions, solution methods, and linearization techniques (Section 4.2). Then, we use a fundamental reserve selection model called the maximum species covering prob- lem (MSCP) to discuss IF formulations and solution methods (Section 4.3). In Section 4.4, we discuss computational aspects of solving the MSCP. In Sec- tions 4.5-4.8, we present a range of IP extensions (' the MSCP (Table 4.1). We conclude with a discu'- sion of the limitations of IF models and directions for future work. Throughout the chapter, we attach specific mean- ing to the terms 'site', 'reserve', and 'reserve sys- tem'. Following Williams et al. (2005a), a site is a selection unit - a piece of land that may be selected for protection. A site is usually undeveloped open space belonging to one or more cover types, includ- ing forest, grassland, pasture, or cropland. A reserve is a single site or a contiguous cluster of sites that has been selected for protection. A reserve system or network is a set of multiple, spatially separated reserves. We also distinguish reserve selection from reserve design models. Reserve selection models identify sites to protect to maximize some measure of biological diversity (e.g. species richness) sub- ject to a budget constraint with no consideration given to the spatial attributes of the reserve system. Reserve design problems use spatial attributes of the reserve system as objectives. Excellent reviews of IF models for reserve selec- tion and design are available. Rodrigues and Gaston (2002) provide a comprehensive list of published studies that use IF formulations of reserve selec- tion problems. ReVelle et al. (2002) discuss the rela- tionship between IF reserve selection models and their counterparts in the facility location literature. Williams et al. (2005a) discuss spatial attributes of reserve systems and how they can be incorporated as objectives in IF reserve design models. 43

Upload: others

Post on 03-Apr-2022

1 views

Category:

Documents


0 download

TRANSCRIPT

Im

CHAPTER 4

Integer Programming Methodsfor Reserve Selection and Design

Robert G. Haight and Stephanie A. Snyder

4.1 IntroductionHow many nature reserves should there be? Whereshould they be located? Which places have high-est priority for protection? Conservation biolo-gists, economists, and operations researchers havebeen developing quantitative methods to addressthese questions since the 1980s. The first formu-lations (Kirkpatrick 1983; Margules et al. 1988;Chapters 3 and 19) focused on species protectionand efficiency - select the minimum number ofreserves from a list of candidate sites to representall species - and were solved using iterative heu-ristics, which guarantee only an approximation ofthe optimal solution (Chapter 5). Researchers laterformulated the problem as a 0-1 linear integer pro-gramming (IP) model and found mathematicallyproven optimal solutions using the branch andbound (B&B) method available in commercial opti-mization software (Saetersdal et al. 1993; Underhill1994). The main advantage of formulating a reserveselection problem as an IF model is the availabilityof solution methods such as B&B that guaranteefinding the optimal solution. Researchers quicklyrecognized this advantage and have formulatedreserve selection problems with a remarkable arrayof objectives and constraints.

We review IP methods and formulations appliedto reserve selection and design. Because IP is abranch of linear programming (LP), we begin witha discussion LP models, assumptions, solutionmethods, and linearization techniques (Section4.2). Then, we use a fundamental reserve selectionmodel called the maximum species covering prob-lem (MSCP) to discuss IF formulations and solution

methods (Section 4.3). In Section 4.4, we discusscomputational aspects of solving the MSCP. In Sec-tions 4.5-4.8, we present a range of IP extensions ('the MSCP (Table 4.1). We conclude with a discu'-sion of the limitations of IF models and directionsfor future work.

Throughout the chapter, we attach specific mean-ing to the terms 'site', 'reserve', and 'reserve sys-tem'. Following Williams et al. (2005a), a site is aselection unit - a piece of land that may be selectedfor protection. A site is usually undeveloped openspace belonging to one or more cover types, includ-ing forest, grassland, pasture, or cropland. A reserveis a single site or a contiguous cluster of sites thathas been selected for protection. A reserve systemor network is a set of multiple, spatially separatedreserves. We also distinguish reserve selection fromreserve design models. Reserve selection modelsidentify sites to protect to maximize some measureof biological diversity (e.g. species richness) sub-ject to a budget constraint with no considerationgiven to the spatial attributes of the reserve system.Reserve design problems use spatial attributes ofthe reserve system as objectives.

Excellent reviews of IF models for reserve selec-tion and design are available. Rodrigues and Gaston(2002) provide a comprehensive list of publishedstudies that use IF formulations of reserve selec-tion problems. ReVelle et al. (2002) discuss the rela-tionship between IF reserve selection models andtheir counterparts in the facility location literature.Williams et al. (2005a) discuss spatial attributes ofreserve systems and how they can be incorporatedas objectives in IF reserve design models.

43

44 SPATIAL CONSERVATION PRIORITIZATION

Table 4.1 Reserve selection (A) and design problems (B) that have been formulated and solved using integer programming methods. Reserveselection models identify sites to protect to maximize some measure of biological diversity (e.g. species richness) subject to a budget constraintwith no consideration given to the spatial attributes of the reserve system. Reserve design problems use spatial attributes of the reserve system asobjectives

Problem

A. Non-spatial reserve selectionproblems

Maximum species coveringBi-criteria reserve selection

Reserve selection with uncertain speciespresence

Dynamic reserve selection with uncertainsite availability

B.Spatial reserve design problemsReserve proximityReserve connectivityReserve compactnessReserve core and buffer areas

Objective

Maximize number of species protected for a given budgetMaximize number of species protected and some other

conservation objectiveMaximize expected number of species protected for a

given budgetMaximize expected number of species protected at endof horizon

Reference

Church et al. (1996)Church et al. (2000); Ruliffson et al,

(2003)Carom et al. (2002); Arthur et al. (2004)

Haight et al. (2005)

Minimize sum of pairwise distances between reserves Onal and Briers (2002)Maximize number of adjacent reserves Nalle et at (2002)Minimize boundary length of reserves Fischer and Church (2003)Maximize core area protected Williams and ReVelle (1998)

4.2 Linear programming

Linear programming is a mathematical modellingand optimization technique invented in the 1940s(see Dantzig 1963). LP problems involve findingvalues of decision variables to optimize a linearobjective function subject to linear equality andinequality constraints. The related problem of IPrequires some or all of the variables to take inte-ger values. LP and IP methods have proved valu-able for modelling problems in planning, routing,scheduling, assignment, and design in a wide rangeof industries. An excellent guide to LP and IP prob-lems, algorithms, textbooks and software is avail-able online (Fourer 2000).

4.2.1 Land allocation problem

LP models have been used for decades in forestmanagement to allocate land to mutually exclusiveuses such as timber production and wildlife habitat(see Hof and Bevers 1998 for review). We describea simple version of this problem to illustrate theassumptions of LP models. Suppose a planner hasforest sites that can be used for timber productionor wildlife habitat and must allocate a proportion of

each site to each use. The planner knows the abun-dance of each species in each site and the minimumtotal abundance desired for each species acrosssites. To put a cost on habitat protection, the plan-ner uses the revenue of foregone timber production.The land allocation problem is an LP model withthe following notation:1, I = index and set of sites,j, J= index and set of species,

a', = abundance of species] in site i,c,=cost of allocating a unit of site i to wildlifehabitat,

= minimum desired abundance of species],x= proportion of site i allocated to wildlife habitat,z=objective function value.

The model is formulated as follows:

Minimize = (4.1)

7' for all ]mJ (4.2)

0!^x:^1 for all im (4.3)

The objective is to minimize the cost of allocatingland to habitat (Equation 4.1) subject to a set of con-straints requiring a minimum abundance of each

INTEGER PROGRAMMING METHODS FOR RESERVE SELECTION AND DESIGN 45

species across sites (Equation 4.2) and restrictionson the decision variables (Equation 4.3).

This land allocation problem illustrates threeproperties of linear programmes: proportional-ity, additivity, and continuity. The proportionalityassumption states that if it costs ç units to protectall of site i (i.e. x , = 1), then it costs cx to protect aproportion x. of site i. Likewise, if a units of spe-cies j are produced by protecting all of site i, thenax units of species j are produced by protecting aproportion x of site i. The proportionality assump-tion implies constant returns to scale (e.g. the unitcost of protecting a proportion of site i, c, does notdepend on the proportion of site i protected, x). Theadd itivity assumption states that the total contribu-tion of all variables equals the sum of the individ-ual variable contributions regardless of the valuesof the variables. The additivity assumption impliesthat the objective function is separable in the vari-ables: z(x i ,..., x,) is separable in variables x 1 ,..., x,,if z can be written as a sum of n functions each ofwhich involves only one variable in the model (i.e.z = 1z 1 (x 1 ) +...+ z(x)I, where z.(x) is the contribu-tion of the variable x, to the objective function). Thethird assumption is continuity of the variables: eachvariable can take on all real values in its allowedrange. In the optimization problem above, the deci-sion variable for the proportion of each site pro-tected, x, can take on any real value between 0 and1 (Equation 4.3).

4.2.2 Solving LP problems

The importance of LP problems derives fromthe existence of general purpose (independ-ent of the problem being solved) and computa-tionally effective (able to solve large problems)solution algorithms. LP problems having tens orhundreds of thousands of continuous variablesare regularly solved on Pentium-based personalcomputers or Unix workstations. Two familiesof solution algorithms are in wide use: the sim-plex algorithm introduced in the 1940s (Dantzig1963) and interior point methods introduced inthe 1980s (Kamarkar 1984). Both visit a progres-sively improving series of trial solutions until asolution is reached that satisfies the mathemati-cal conditions for optimality.

LP software comes in two related but differentkinds of packages. The first, algorithmic codes,finds and lists optimal solutions to specified LPproblems. The second, modelling systems, helpspeople solve LP problems by taking a descriptionof an LP problem in a straightforward, logical for-mat, converting the model to a form required bythe algorithmic code, and displaying the results ofthe optimal solution. Modelling systems includeprogramming languages that allow users to specifymodels in concise algebraic statements (e.g. Gen-eralized Algebraic Modelling System, GAMS) andspreadsheets in which models are represented assystems of linear equations (e.g. Microsoft Excel).Most modelling systems support a variety of algo-rithmic codes, while the more popular codes can beused with many different modelling systems. TheInstitute for Operations Research and the Man-agement Sciences (INFORMS) regularly publishessurveys of commercial modelling systems andalgorithmic codes, including both LP and IP solvers(Fourer 2007).

4.2.3 Linear approximations of non-linearoptimization problems

While LP problems are easy to solve, many real-lifeproblems are better expressed with non-linear equa-tions and inequalities that violate the proportional-ity and additivity assumptions. It may be possibleto approximate the non-linear parts of the problemwith linear approximations that satisfy the additiv-ity and proportionality assumptions and then useLP to solve the approximation. For example, sup-pose the unit cost of site protection c in the landallocation model above does not vary by site so theobjective function (Equation 4.1) is z=c(x 1 +. . . + x)where a is the number of sites. Further, supposeunit cost c increases as a linear function of the totalamount of protected land, c = b(x 1 +. . . + x), whereb is the slope of the unit cost function. Then, theobjective function is z=b(x1 +...+ x, )(x 1 +...+

which violates both additivity and proportionalityassumptions. We can address the additivity viola-tion by defining anew variable y = +...+ x,, whichis a linear function of the decision variables, andwriting the objective z=by2. While this objective is

46 SPATIAL CONSERVATION PRIORITIZATION

separable in the variables (it has only one variable!),it still violates the proportionality assumption. Theviolation of the proportionality assumption can beaddressed with a piecewise linear approximation.First, divide the total amount of protected land, y,into a set of cost classes, K, ordered from lowest costto highest cost, where Ck is the unit cost of protectingan amount of land in class k and c 1 <c2 <Ck. Each Ck isa linear approximation of the slope of the total costcurve by' in the kth interval of y. Further, define Ykas a decision variable for the total amount of landselected for protection in cost class k and d be theupper bound on the amount available in class k,

so that 0 <Yk < dk . Then, we can formulate the fol-lowing linear approximation of the non-linear landallocation problem:

Minimize z =>Ckyk (4.4)kEK

Yk=X (4.5)

^! T,for all EJ (4.6)

0:^x!^1 for all iEI (4.7)

O_< Yk _< dk for all kcK (4.8)

The non-linear objective function z = by2 is replacedby a piecewise linear approximation (Equation 4.4),and Equations 4.5 and 4.8 define conditions for thedecision variables Yk If the model selects any landfor protection, the model will select land with thelowest unit cost first. As a result, for any k, if Yk> 0,then y, = d, for all t <k and Equation 4.4 is piecewiselinear. The major disadvantage of formulating alinear approximation of a non-linear problem is theconsiderable increase in the number of variablesand constraints. Nevertheless, linear approxima-tions of non-linear site selection models have beenformulated with thousands of decision variablesand successfully solved using LP software (e.g. Hofand Raphael 1997; Arthur et al. 2004).

4.3 Integer programmingInteger programming methods are used to solvelinear optimization problems in which one or

more of the variables are restricted to be inte-gers. Zero—one or binary IP problems restricttheir integer variables to be 0 or 1. Many reserveselection and design problems are formulated asbinary IP problems because site selection deci-sions are binary and the logic of the conservationobjective can be modelled with binary variables.To illustrate a 0-1 IF problem, we present themaximum species covering problem, one of thefirst of many reserve selection and design prob-lems formulated in the conservation biologyliterature.

4.3.1 Maximum species covering problem

Suppose a planner has identified a set of poten-tial reserve sites and the cost of protecting eachsite. The planner only knows whether or not eachspecies is present in each site and does not haveinformation on its abundance. Further, the plan-ner must decide whether or not to protect eachsite in total and cannot protect a portion of thesite. This restriction fits the common situationin which sites are indivisible ownerships. TheMSCP identifies the sites to protect to maximizethe number of species represented - where a spe-cies is represented if it is present in at least oneprotected site - subject to a budget constraint(Church et al. 1996). The MSCP is analogous tothe maximum covering location problem in thelocation science literature (Church and ReVelle1974; ReVelle et al. 2002), and it provides sets ofsites that efficiently achieve conservation goalsand trade-offs between conservation goals. TheMSCP is a binary IF problem with the followingnotation:i, I = index and set of potential reserve sites,j, I = index and set of species in need of protection,a ,.,. = 0-1 parameter: 1 if species j is present in site i,0 otherwise,B = upper bound on budget,C 1. = cost of protecting site i,

x. = 0-1 variable: 1 if site i is selected for protection,0 otherwise,Y] = 0-1 variable: 1 if species j is represented in atleast one protected site, 0 otherwise,z = objective function value.

INTEGER PROGRAMMING METHODS FOR RESERVE SELECTION AND DESIGN 47

The model is formulated as follows:

Maximize = I y1

for all j c j (4.10)

< B(4.11)

x,y 1 ei0,1l for all icl and jeJ (4.12)

The objective (Equation 4.9) is to maximize the numberof species that are represented or covered in the setof selected sites. Equation 4.10 enforces the logic ofcovering: a species is covered (y, =1) if at least onesite that contains the species is selected for protection.Equation 4.11 is the budget constraint. Equation 4.12describes the integer restrictions on the variables.

Note that the same IP model structure couldinclude species abundance data rather than pres-ence—absence data. In this case, the abundanceparameter a , would be added to the objective func-tion (Equation 4.9) to maximize the total abundanceof species contained in the selected set of sites:

Maximize z =

Note also that the MSCP problem is written interms of maximizing species coverage subject to abudget constraint. This budget-constrained formu-lation fits a common situation in which resources forsite protection are limited and the decision-makerwants to allocate resources to optimize a conser-vation objective. Solving the problem for a givenbudget B allows the determination of an efficientset of sites, where efficiency means that there areno other sets of sites that provide a higher level ofspecies coverage and stay within the budget. Solv-ing the problem with increasing budgets allowsconstruction of a cost curve, which shows the costof increasing the number of species covered. Box4.1 describes an application of the MSCP in LakeCounty, Illinois, USA, where planners want todetermine the impact of budget restrictions on effi-cient sets of sites for species protection in the face ofurban development.

4.3.2 Solving lP problems

The integer requirements on the decision variablesmake IP problems difficult to solve. In contrast toLP problems, there are no general-purpose and

Box 4.1

We describe an application of the maximalspecies covering problem (Equations 4.9-4.12)in a case study in the Lake County portion ofthe Fox River watershed north-west of the cityof Chicago (Figure 4.1). In response to rapidpopulation growth and conversion of open spaceto housing and commercial development, LakeCounty planners want to acquire land to protectrare animals and plants and provide equitableaccess to recreation. To help planners identifycost-effective sets of sites, we solved an MSCPand analysed the cost of increasing the number ofspecies covered.

The analysis is conducted using data for 31privately owned open-space sites (see Haightet al. 2005 for details). The sites vary in size from1 t 313 ha, with median of 29 ha. Each site isdescribed by a list of rare plants and animalspresent. Collectively, 27 rare species occur in the

31 sites, and species richness of individual sitesvaries from 1 to 9 species. We used areas of sitesas proxies for site costs and assumed that theplanner has an area budget for selecting sites.

We determined the optimal sets of sites toprotect for area budgets ranging from 1 to618 ha and plot the cost curve in Figure 4.2.Each point represents a cost-effective set of sitesfor a given budget. The slope of the cost curveis the marginal cost of species protection - thearea required to protect an additional species.Marginal cost is small (4 ha/species) as coverageincreases from 5 to 20 species, moderate (34ha/species) in the range of 20 to 25 species, andlarge (195 ha/species) for levels of speciescoverage greater than 25.

As the budget increases, the optimal set of sitesis not always found by adding another site to thepreviously selected set. For example, to increase

continues E

48 SPATIAL CONSERVATION PRIORITIZATION

Box 4.1 continued

N

0 20 40 Kilometres /'\

Figure 4.1 Fox River watershed (shaded grey) in counties of north-eastern Illinois, USA. The study area (shaded black)

is the north-eastern portion of the watershed located in Lake County, Illinois, USA.

coverage from 20 to 22 species, one site can beadded to the list of protected sites (Table 4.2).However, increasing species coverage above 22species involves dropping one site and adding

700 --600500

co400

300

<200

100. . .

0 5 10 15 20 25 30Coverage (number of species)

Figure 4.2 Cost curve showing'area protected versus numberof species covered for the site selection options in the Fox Riverwatershed of Lake County, Illinois, USA.

Table 4.2 Optimal sets of sites selected for protection underincreasing area budgets in the Fox River watershed of Lake County,Illinois, USA

Objective Site numbers selected for protection

values

Species Area 3 7 8 14 15 17 18 20 21 22 23 30(ha)

20 59 X X X X X X

22 123 X X X X X X X X

25 228 X X X X X X X X X

27 618 X X X X X X X X

up to four others. There is consistency in sitesselected for protection. Six sites are selectedwhenever the budget is greater than 50 ha, Thesesites are small (<16ha), have higher numbers ofspecies per hectare, and contain endemics.

INTEGER PROGRAMMING METHODS FOR RESERVE SELECTION AND DESIGN 49

computationally effective algorithms for solving IPproblems. Solution methods for IP problems can becategorized as 'optimal' and 'heuristic'. An optimalalgorithm is one that mathematically guaranteesfinding the optimal solution while a heuristic is analgorithm that should find a feasible solution which,in objective function terms, is close to the optimalsolution. The choice between using an optimal orheuristic algorithm depends in part on whetherthe problem size is beyond the computational limitof the optimal algorithm and whether the analystwants to spend the effort needed to solve the prob-lem optimally.

The most effective optimal algorithm for IPproblems is LP-based branch and bound (B&B).The algorithm proceeds by solving a sequence ofLP problems using either efficient simplex methodor interior-point method until a mathematicallyproven optimal solution to the IF is found. TheB&B algorithm begins by solving a 'relaxed' LPversion of the problem, ignoring the binary inte-ger restrictions on the decision variables. If thisrelaxed solution is all-integer, then it is the opti-mal solution and the algorithm terminates. If therelaxed solution contains one or more binary vari-ables with fractional values, the problem is itera-tively branched into sub-problems. Branching isdone by picking one of the binary variables withfractional value in the preceding solution. Twonew sub-problems are created by setting this vari-able to 1 and 0, respectively, and the remainingbinary variables are allowed to have continuousvalues between 0 and 1. In this manner, each sub-problem is solved as a linear programme. Thissequence of creating sub-problems and solvinglinear programmes is continued until either aninteger solution is found for each sub-problembranch or the relaxation causes the sub-problemto become infeasible. An exhaustive search of allof the sub-problems can be prohibitive because 2problems must be solved when there are n binaryvariables. To reduce the number of sub-problems,the B&B algorithm employs a 'bounding' feature,which computes a bound on the optimum solu-tion at each step. If the solution to a sub-problemis worse than the best integer solution previouslyfound, then that sub-problem is discarded, therebyreducing the search space. Most commercial

modelling packages contain B&B algorithms(Fourer 2007).

4.4 Computational aspects of IP models

The computer time required to solve any particu-lar IP problem is hard to predict. Problems with ahundred variables can be challenging, while otherswith tens of thousands of variables solve readily.Examples of the MSCP with hundreds or thousandsof 0-1 decision variables have been solved usingB&B software in one to several minutes (Churchet al. 1996; Rodrigues and Gaston 2002; Snyderet al. 2004a); however, reserve design problems withspatial objectives may take much longer to solve. Inthis section, we offer tips to explore when solvingthe MSCP with B&B.

4.4.1 Formulation tips

Camm et al. (1996) recommend carefully pre-processing MSCP datasets to reduce problem sizeprior to attempting solution with B&B. Specifically,they suggest removing any eligible site from con-sideration if it contains all the same species or asubset of the species of another site that costs less.Similarly, if two species are found in the same set ofeligible sites, then only one of the species needs tobe included.

As recognized by Camm et al. (1996) and Churchet al. (1996), the species coverage variables, y, neednot be explicitly defined as binary decision vari-ables. The structure of the MSCP and the binary siteselection variables, x1, will automatically produceinteger values for the coverage variables whensolved via B&B. As a result, the coverage variablescan be defined as non-negative variables with anupper bound of 1. This reduction in the number ofinteger variables may provide computational sav-ings because the solution difficulty of integer pro-grammes is often significantly influenced by thenumber of integer variables.

A mathematically equivalent specification of theMSCP can be formulated based upon a 'minimaluncovering' model, which may have some com-putational advantage (Church and ReVelle 1974;

50 SPATIAL CONSERVATION PRIORITIZATION

'I

Church et al. 1996; Arthur et al. 1997; Snyder et al.2004n). In the uncovering formulation, each deci-sion variable for species coverage is 0 if the spe-cies is represented in at least one site and I if thespecies is unrepresented. The objective of this newformulation is to minimize the number of unrepre-sented species. As a result, the uncoverage variablescan be declared non-negative (rather than integer)variables. This lack of a required upper bound maymake the minimal uncovering version of the MSCPmore computationally efficient than the equivalentmaximal covering version.

4.4.2 Solver settings - optimality gap

The B&B algorithm provides, for each feasiblesolution obtained, a provable upper bound on itsdistance from optimality. Each time a sub-prob-lem is solved, the algorithm calculates the abso-lute and percent deviation between the objectivefunction value of the current best feasible integersolution and the current upper bound on the opti-mal objective function value. Based on this devia-tion, the analyst can specify an optimality gap'that functions as a termination criterion (McDilland Braze 2001; Onal 2003; Snyder et al. 2004a).For example, to ensure that the optimal solutionis found, the analyst would set the optimality gapequal to 0. Then, all sub-problem branches willbe explored as long as a non-0 difference existsbetween the best current solution and the cur-rent upper bound. When the problem takes toolong to solve to optimality, an analyst can set theoptimality gap to a non-0 value (e.g. 1 %), whichreduces problem size and hastens algorithm ter-mination. However, allowing a non-0 optimalitygap also allows for the possibility of a suboptimalsolution. Thus, one must be judicious in settingthe optimality gap because better solutions couldbe missed by allowing the algorithm to terminatebefore reaching optimality.

An important feature of the MSCP allows theabsolute optimality gap to be just less than I whilestill guaranteeing that the optimal solution will befound (Onal 2003). Because the objective function ofthe MSCP can only take integer values (e.g. numberof species covered), no improvement in the objec-tive function value is possible when the absolute

optimality gap is below I. Then, the current integersolution is optimal and the algorithm can termi-nate. This is an important feature of the MSCP thatshould be exploited when using B&B. If the objec-tive function includes non-binary coefficients on thedecision variables (e.g. weights for species impor-tance or rarity), then optimality gaps less than oneno longer guarantee optimal solutions.

4.5 Bi-criteria reserve selection problem

While the MSCP has traditionally focused on thesingle objective of maximizing species representa-tion, additional objectives can be considered in bi-criteria formulations, including spatial attributes ofthe reserve systems (e.g. Rothley 1999; Cerdeira etal. 2005; Onal and Briers 2005, 2006; Alagador andCerdeira 2007), total habitat area (Snyder et al. 2004ii),habitat quality (Church et al. 2000), and public accessor proximity (Ruliffson et al. 2003; Haight et al. 2005).Cohon (1978) provides a comprehensive discussion ofmulti-objective programming theories and methods.While solution of a single objective problem yields anoptimal solution, solution of a multi-objective modelyields a set of solutions that are termed non-domi-nated or non-inferior. A non-inferior solution is one inwhich no other feasible solution exists to the problemthat will lead to an increase in the value of one objec-tive without simultaneously causing a degradation inthe value of another. The choice of a preferred solu-tion from the set of non-inferior solutions depends onthe preferences of a decision-maker.

To illustrate a bi-criteria problem, we extend theMSCP to handle a second objective of maximiz-ing the number of people with access to reserves(Ruliffson et al. 2003). We assume that people in atown have access if one or more reserves are withina required distance of the town. The data for eachtown include its population and a list of sites thatare within the required distance. In addition to thenotation of the MSCP, the bi-criteria site selectionmodel has the following notation:k, K= index and set of towns,Q 1 = number of species represented in the protectedsites,

Q2= number of people with access to protected sites,rk= number of people in town k,

M, = set of sites that contain species j,

INTEGER PROGRAMMING METHODS FOR RESERVE SELECTION AND DESIGN 51

Nk set of sites that are within the required distanceof town k,

Zk =Ol variable: I if town k has at least one pro-tected site within the required distance,o otherwise.

The model is formulated as follows:

Maximize Q1 = (4.13)

Maximize Q, = I rZ (4.14)

K, ^!Z for all k E K (4.15)

for all j J (4.16)

c,x B (4.17)

x,y/ ,zk eO,11 for all id, jdJ,kEK (4.18)

The problem has two objective functions: maxi-mize the number of species represented in pro-tected sites (Equation 4.13) and maximize thenumber of people with access to protected sites(Equation 4.14). Public access is the number oftowns with access weighted by population size, r.Equation 4.15 is the condition under which town khas access (i.e. Zk= 1): at least one site that is withinthe required distance of town k must be selectedfor protection. Equations 4.16-4.18 define speciescoverage, the budget constraint, and the integerrestrictions on the variables.

One approach to solving a bi-criteria optimiza-tion problem is the weighting method, which cre-ates a single objective function as a weighted sumof the two objectives. In our case, the problem isto maximize the weighted sum, wQ 1 + (1-w) Q2,where 0 w 1. The value of the weight, w, is sys-tematically varied between 0 and 1, and the prob-lem re-solved many times to produce an estimateof the non-inferior set of solutions. The weight w

represents the decision-maker's position on the rel-ative importance of the two objectives.

The constraint method is another approach tosolving bi-criteria problems in which one of theobjectives is transformed into a constraint. Forexample, the bi-criteria problem above could be

solved by optimizing the species' representationobjective (QI) subject to a constraint requiring thetotal number of people with access to protectedsites to be greater than, or equal to, some specifiedthreshold:

tZk ^!T (4.19)

The value of the parameter T would then be system-atically varied and the problem resolved to yield anestimate of the non-inferior set of solutions.

While both solution methods are effective meansof transforming a bi-criteria problem into a problemwith a single objective function and generating anestimate of the non-inferior set of solutions, thereare some potential computational differences. AsReVelle (1993) suggests and Snyder et al. (2004a)illustrate, an MSCP model with a constraint inwhich the coefficients are not 0 or 1 (e.g. Equation4.19) is not likely to be integer-friendly (i.e. a struc-ture that is amenable to integer solutions) nor solvequickly to optimality. In such cases, including theconstraint as an objective in a bi-criteria optimiza-tion formulation using the weighting method mightbe computationally more efficient. A computationalissue known as gap points, however, can arise whenthe weighting method is applied to integer models(Cohon 1978). Gap points are non-inferior solutionsto a multi-objective integer model that cannot hefound using the weighting method because theyare located within the interior of the convex hullof the trade-off curve formed by the non-gap solu-tions. These solution points can be found, however,through the use of the constraint method.

For these computational reasons, analystsuse both the weighting method and constraintmethod. The weighting method is used to quicklygenerate an estimate of the non-inferior set ofsolutions. Then, if an analyst wanted to hone inon a particular segment of the curve, the con-straint method is used to explore small ranges ofthe curve that the weighting method might missas a result of gap points. Box 4.2 illustrates thebi-criteria site selection problem (Equations 4.13-4.18) with results from an application to lakeCounty, Illinois, USA.

30

25

20a)>0

15a)C.)

E- 10C')

5

00

EAO

B

20 40 60 80 100 120 140

People with access (1,000s)

Budget:• 81 hao 200 ha

52 SPATIAL CONSERVATION PRIORITIZATION

wool

Box 4.2

We used the bi-criteria site selection model(Equations 4.13-4.18) to analyse how the optimalset of protected sites in Lake County, Illinois, USA,varies as we trade off species representation andpublic access under different budgets. There are 34towns in western Lake County. Based on the 2000U.S. Census, the towns collectively held 222,000people, and individual towns were home to from1,000 to 30,000 people. We assume that peoplein a town have access to a site if the site is within3.2 km of the town, and we know the sites thatare within the required distance of each town. Alltowns have at least one site within 3.2 km.

We used the multi-objective weighting method tosolve the bi-criteria problem. We computed optimalsets of sites for problems in which the objectivefunction weight is decreased from 1.0 to 0.0 inincrements of 0.05 subject to area budgets of 81 and200 ha. The curves showing the trade-offs betweenspecies representation and public access haveconcave shapes in which species representation dropsas public access increases (Figure 4.3). The points oneach curve represent non-dominated sets of sites andtheir performance with respect to the two objectivesfor a given budget. For each non-dominated setof sites, improvement in one objective cannot beachieved without simultaneously causing degradationin the value of the other objective. As a result, thepoints on each trade-off curve represent a frontierbeyond which no better solutions can be found.

Among the non-dominated solutions for agiven budget, the best depends on the decision-maker's preference for the two objectives. Ifspecies representation is most important andthe budget is 81 ha, the choice is alternative A,in which species representation is 20 (74 % ofthe maximum representation without a budgetconstraint) and public access is 73,000 people(33 % of the maximum accessibility). The dashedhorizontal line between the y-axis and point Aindicates that several solutions exist with the samespecies representation as alternative A, but withless public access. The highest level of public access(point D, 91,000 people) is obtained with a 35 %reduction in species representation. The dashedvertical line from point D to the x-axis indicatesthat several solutions exist with the same level ofpublic access as alternative D but with less speciesrepresentation. Increasing the budget from 81 ha to200 ha shifts the trade-off curve up and to the rightwhile reducing the trade-off between objectives.

To complement the trade-off curves, we look atthe site selection decisions and identify core sites,which are sites selected for protection regardless ofthe weights given to the objective functions. With abudget of 81 ha, three core sites are protected in allfour solutions (Table 4.3). With a budget of 200 ha,there are four additional core sites. The core sitesare typically small (<30 ha) and have relatively largenumbers of species and people with access.

Figure 4.3 Trade-offs between open-space protection objectives of maximizing species coverage and maximizing public access underdifferent area budgets.

INTEGER PROGRAMMING METHODS FOR RESERVE SELECTION AND DESIGN 53

Table 4.3 Objective function values and sites selected for protection for non-dominated solutions with area budgets of 81 ha(solutions A, B, C, D) and 200 ha (solutions E, F, G) in the Fox River watershed of Lake County, Illinois, USA

Objective value Site numbers selected for protection Area Protected

Solution Species People (lOCUs) 3 7 8 14 15 16 17 18 19 20 21 30 (ha)A 20 73 X X X X X X X 73B 18 85 X X X X 75C 17 88 X X X X X X X X 800 13 91 X X X X X X 81E 24 85 X X X X X X X X 198F 23 121 X X X X X X X X 196G 21 124 X X X X X X X X X 193

4.6 Reserve selection with uncertainspecies presence

In many cases, species presence in each site is notknown with certainty and expressed as a probabilityof occurrence. The MSCP can be extended to handleprobabilities of occurrence and maximize the expectednumber of species covered subject to a budget con-straint. Let p 1 be the probability that speciesj is presentin site i where the probability of species presence isindependent of its occurrence in neighbouring sites.This is an important assumption because it allows usto write the probability that species j is not coveredin the sites selected for protection as a product of theabsence probabilities over all sites:

= [1 (1 - for all j J (4.20)

where x is the 0-1 decision variable for whetheror not site i is selected for protection. If occurrenceprobabilities are not independent, then we wouldhave a much more complicated expression for V1

With the independence assumption, the problemis to select sites to maximize the expected numberof species covered subject to a budget constraint:

and the model has been illustrated using probabilis-tic occurrence data for 403 terrestrial vertebrates in147 candidate sites in western Oregon, USA (Arthuret al. 2004).

In some situations, the decision-maker may be con-cerned about the likelihood that a subset of endan-gered species is represented in the reserves. Thisconcern can be addressed by imposing additionalconstraints for minimum coverage probabilities fortarget species (Haight et al. 2000). Letting E be the setof endangered species and be the minimum cover-age probability for endangered species j (e.g. 0.95),the minimum threshold coverage constraints are

1—v1 ^!h 1 for all j€E (4.24)

Rearranging and taking the natural logarithm pro-duces an equivalent set of linear constraints:

x1 ln(1 - p) ln(1 - h) for all j E E (4.25)

Arthur et al. (2004) added these constraints to the max-imum expected species covering problem to estimatethe trade-offs between total species coverage and thelikelihood of endangered species representation.

Maximize: (1 - v) (4.21)

cx B (4.22) 4.7 Dynamic reserve selectionWith

X, E {0,1} for all i e 1 (4.23) uncertain site availability

The MSCP assumes that site selections are made all

A linear approximation of this non-linear problem at once and protection takes place rapidly before

can be solved using IP methods (Camm et al. 2002), site degradation or loss. In practice, decisions are

54 SPATIAL CONSERVATION PRIORITIZATION

made sequentially with budget restrictions anduncertainties about site degradation and loss.Although dynamic site selection problems andheuristic solution methods are discussed in detailin Chapter 10, we discuss how to formulate andsolve a dynamic site selection problem as an IPmodel.

Snyder et al. (2004h) developed a two-periodlinear-integer model for sequential site selectionin which uncertainty about future site availabilityis represented with a set of probabilistic scenarios.The two-period problem maximizes the expectednumber of species covered at the end of the sec-ond period subject to an upper bound on the totalcost of site protection. The model employs a listof sites, some of which are available for protec-tion in the first period and others which are not.Each site, not protected in the first period, has aprobability of remaining undeveloped and beingavailable for protection in the second period.Uncertainty about the development of unpro-tected sites is represented with a set of develop-ment scenarios. Each scenario is one possibledevelopment outcome identifying which sites areundeveloped and available for protection in thesecond period. Associated with each scenario is aprobability of occurrence. The model has two setsof 0-1 site selection variables. The first set includesthe protection choices for sites in the first period.The model assumes that protection decisions inthe second period are made after the decisionsin the first period are implemented and the sitedevelopment scenario is revealed. Thus, the sec-ond set of decision variables includes the protec-tion choices for sites in the second period undereach development scenario. The two-period prob-lem is readily solvable using IP methods (Snyderet al. 2004b) and provides information about howuncertain site availability affects current site selec-tion decisions (Haight et al. 2005).

4.8 Spatial reserve design problems

may not maintain or support long-term persistenceof target species (Cabeza and Moilanen 2001) andmay increase the difficulty and expense of reservemanagement. Scattered reserves are particularlytroublesome when they are surrounded by a matrixof land uses and cover types that adversely impactspecies' persistence.

Reserve design attributes such as reserve proxim-ity, connectivity, and shape can be incorporated intoIP models for site selection (Williams et al. 2005a).Here, we discuss how these attributes can be for-mulated as spatial objectives in IP models. Eachmodel includes a spatial objective combined witha species coverage constraint. By varying the levelof the constraint, trade-offs between the spatial andcoverage objectives can be obtained.

4.8.1 Reserve proximity

When a reserve system consists of disjunct areas ofprotected habitat, the distance between reserves mayinfluence species' mobility and viability. A reservesystem in which the reserves are closer togethermay be preferred because shorter migration dis-tances facilitate recolonization of areas where a spe-cies has become locally extinct and help prevent theloss of genetic diversity because of inbreeding. Oneway to reduce the distances between reserves is tominimize the sum of distances between all pairs ofselected sites. Letting dk be the distance betweensites i and k and 11k

be a 0-1 variable for whether ornot both sites i and k are selected, the problem canbe written:

Minimize: 11 d,k u k (4.26)

U k 2^ X +Xk —1 for all i,k c I,k > i (4.27)

X ^ I/ for all j c 1 (4.28)

(4.29)JJ

e {0,l},u,k c= 10,1) for all i,kel,jeJ (4.30)

One shortcoming of reserve selection models isthat they do not consider the spatial distribution of The objective (Equation 4.26) minimizes the sumselected sites. As a consequence, MSCP solutions of the pairwise distances between selected sitesmay consist of scattered reserves with little spa-tial coherence. A scattered distribution of reserves

subject to constraints (Equations 4.28 and 4.29)that require at least R species to he represented.

INTEGER PROGRAMMING METHODS FOR RESERVE SELECTION AND DESIGN 55

Equation 4.27 enforces the definition of u , byrequiring both x = I and

x = 1 for u = 1. Onal and

Briers (2002) apply a similar formulation to theproblem of selecting a subset of 131 pond sitesin Oxfordshire, UK, to protect 256 invertebratespecies. Variants of this approach include maxi -mizing the inverse pairwise distance between allselected reserves (Rothley 1999), minimizing themaximum intersite distance between the selectedreserves (Onal and Briers 2002), constraining themaximum distance between eligible reserves(Malcolm and ReVelle 2002), and minimizing thepairwise distance between reserves and existingcore areas (Alagador and Cerdeira 2007; Snyderet al. 2007).

4.8.2 Reserve connectivity

The degree to which separate reserves are struc-turally or functionally connected is an importantattribute contributing to species persistence. Con-nectivity can be defined as the degree to whichseparate habitat reserves are accessible from otherpatches. Structural connectivity calls for strict adja-cency of reserves while functional connectivity isachieved if reserves are within a certain distanceof each other. Connectivity is a species-specific andlandscape-specific function which is dependentupon mobility characteristics of individual species.

In situations where the landscape is subdividedinto contiguous polygons representing candidatesites, structural connectivity can be promoted byselecting sites for protection that are adjacent to eachother. Letting U ik be a 0-1 variable for whether or notboth sites i and k are selected, the objective is to max-imize the number of adjacent pairs of selected sites:

Maximize: I Y_U,. (4.31),I kA ,k>i

U k ^: X. +; —1 for all iE I,k E A,k > i (4.32)

X, ^! y 1for all j e J (4.33)

^: R (4.34)jeJ

xe(O, 1), uIk E{0, 11 for all i, kEI, jsJ (4.35)

where the set A represents all sites that are adjacentto site i (Williams et al. 2005a) and Equations 4.33

and 4.34 represent the species coverage constraints.Nalle et al. (2002) employ a similar formulation tothe problem of selecting a subset of 4,181 sites inJosephine County, Oregon, USA, to protect exam-ples of 13 habitat types.

Another approach to reserve connectivity usesconstructs from graph theory and network opti-mization (e.g. Williams 1998,2002; Onal and Briers2005, 2006; Cerdeira et al. 2005). These modelsenforce rather than promote structural connectivityof the reserve system, although model variationsallow for contiguity gaps (e.g. Williams 2002; ()naland Briers 2005). In network terminology, sites areviewed as network nodes. Network arcs are definedfor each pair of adjacent nodes or sites. A contigu-ous reserve system is a network formed by a set ofselected sites linked by arcs between those sites.These constructs can be integrated into an IP modelfor reserve selection (e.g. Williams 2002; Onal andBriers 2005, 2006).

4.8.3 Reserve shape

The shape of reserves may be important for spe-cies survival and many authors advocate creat-ing compact reserves that are nearly circular andhave low edge/area ratios. Compact reservesare better for edge-intolerant species that pre-fer large areas of interior habitat. One approachto forming reserves that meet size and shaperequirements is to predefine desirable clustersof sites and include the clusters as decision vari-ables in a site selection model (Williams andReVelle 1998; Rebain and McDill 2003; Man-anov et al. 2008). Another approach, the 'coreand buffer' method, involves creating an inner,protected reserve area surrounded by a ringof land managed to buffer the core areas fromnegative impacts of the surrounding landscape(Williams and ReVelle 1998). A third approachinvolves selecting sites to minimize a measureof compactness of the resulting reserves, wherecompactness is the total length of the reserveboundaries (Fischer and Church 2003; Onal andBriers 2003). Total boundary length is the differ-ence between the length of the boundaries of allthe selected sites and two times the length of theshared boundaries between the selected sites.

56 SPATIAL CONSERVATION PRIORITIZATION

Letting b , be the length of the boundary of sitei and sb k be the length of the shared boundarybetween sites i and k, the problem of minimizingtotal boundary length is:

Minimize: yb , x, - 21 Y sb ik il (4.36)

U k ^ X, + Xk 1 for all ic I, k cA, k > i (4.37)

X, y 1for all jcJ (4.38)

(4.39)

X,, y 1 , U c{O, 1) for all i, kcl, jcJ (4.40)

In the objective function (Equation 4.36), the bound-ary length of the reserve system is calculated by add-ing the boundary lengths of the selected sites andthen subtracting twice the length of the boundariesshared by adjacent sites. Fischer and Church (2003)utilized this model to analyse trade-offs betweentotal area and compactness of reserve systems toprotect examples of 55 plant community types innorthern California forests.

4.9 Applicability and limitations of IP

We have presented a wide range of reserve selec-tion and design problems that have been success-fully formulated and solved using IP methods.These problems involve tens, hundreds, or thou-sands of 0—I site selection variables and assumethat parameters such as site cost and species pres-ence are fixed and independent of the sites selectedfor protection. While IF is an efficient way to findoptimal solutions to problems that satisfy theseassumptions, IP models have limitations. Mostimportantly, any relevant non-linear relationshipbetween the decision variables, the objective func-tion, and the constraints must be transformed into,or approximated by, linear equations. Non-linearrelationships arise, for example, when species pres-ence depends on the spatial arrangement of the setof selected sites (Moilanen 20055), when speciesrepresentation is valued with non-linear utilityfunctions (Arponen et al. 2005), or when site protec-tion cost depends on the num6er and location of theselected sites (Tajibaeva et al. 2008). Although somenon-linear relationships can be captured with linear

transformations and approximations (e.g. Hof andRaphael 1997; Arthur et al. 2004), more complexand possibly more realistic non-linear models canbe analysed using non-linear optimization methodsor heuristics that approximate optimal solutions(e.g. Arponen et al. 2005; Moilanen 2005b; Tajibaevaet al. 2008). Another limitation of IP models is prob-lem size. It is difficult to generalize about the limitsof problem size because IP solution time is often afunction of the structure of the data matrix. How-ever, generally speaking, problem complexity andsolution time tend to increase with the number ofbinary variables. Thus, a reserve selection or designproblem with thousands of potential reserve sites(e.g. remote sensing grid cells at a very fine-grainspatial scale) might be difficult to solve to optimal-ity in a reasonable time frame using IP methods.

4.10 Next stepsWhile a rich body of research has been published onthe application of IP to reserve selection and design,more work is needed in a number of areas. Ofgrowing importance is the need to address species'dynamics and persistence in reserve systems. Oneapproach is to incorporate stochastic, non-linearpopulation dynamics directly into site selectionmodels and use heuristic algorithms to search forsolutions that come close to maximizing species per-sistence (e.g. Moilanen and Cabeza 2002; Haight andTravis 2008). Another approach is to approximatepopulation dynamics with a system of linear equa-tions and formulate linear or integer programmingmodels for site selection (e.g. Bevers et al.1997; Hofet al. 2002). While optimal solutions can be obtainedfor the linear models, additional work is needed toevaluate under what conditions linear equations aresuitable to represent population dynamics.

While site selection models address problemsinvolving site acquisition and protection, plan-ners also face problems involving the allocation ofresources to a wider range of conservation activities,including fire management, invasive species con-trol, and reintroduction of extirpated species. Wilsonet al. (2007) describe the decision steps involved in ageneral conservation investment framework, includ-ing definition of the conservation objective, threatsto achieving the objective, and possible actions for

INTEGER PROGRAMMING METHODS FOR RESERVE SELECTION AND DESIGN 57

each site along with their costs and benefits. Whenthe conservation objective function can be expressedas a linear function of the activities, then LP or IPmethods can be used to optimally allocate resourcesover time (e.g. Bevers et al. 1997).

Another area of current research is the develop-ment of models that address species-specific reservedesign and habitat needs. Moilanen et al. (2005)develop a heuristic for selecting core areas for mul-tiple species based on species-specific habitat con-nectivity requirements and conservation weights.More research is needed to develop IF models withspecies-specific habitat requirements and analysethe trade-offs associated with reserve systems thatfavour certain species over others.

Finally, more research is needed to benchmarkand compare the solution performance of heuristic

methods to exact IP optimization techniques in avariety of reserve selection and design problems.While the use of heuristics may allow an analystto more rapidly reach a solution, little is known orreported about the quality of solutions generatedvia heuristics. If heuristics are the preferred optionfor complex reserve selection and design problems,much more needs to be known about how wellthese solution methods perform.

Acknowledgement

We are grateful to the reviewers, Atte Moilenan andHelen Regan, for thoughtful suggestions and com-ments that improved earlier drafts. Research associ-ated with this chapter was supported by the U.S.Forest Service Northern Research Station.