graph cut segment

Upload: shickson8

Post on 05-Apr-2018

222 views

Category:

Documents


0 download

TRANSCRIPT

  • 7/31/2019 Graph Cut Segment

    1/23

    International Journal of Computer Vision 70(2), 109131, 2006c 2006 Springer Science + Business Media, LLC. Manufactured in The Netherlands.

    DOI: 10.1007/s11263-006-7934-5

    Graph Cuts and Efcient N-D Image Segmentation

    YURI BOYKOVComputer Science, University of Western Ontario, London, ON, Canada

    [email protected]

    GARETH FUNKA-LEA

    Imaging & Visualization, Siemens Corp. Research, Princeton, NJ, [email protected]

    Received May 12, 2003; Revised April 2, 2004; Accepted April 30, 2004

    Abstract. Combinatorial graph cut algorithms have been successfully applied to a wide range of problems invision and graphics. This paper focusses on possibly the simplest application of graph-cuts: segmentation of objectsin image data. Despite its simplicity, this application epitomizes the best features of combinatorial graph cutsmethods in vision: global optima, practical efciency, numerical robustness, ability to fuse a wide range of visualcues and constraints, unrestricted topological properties of segments, and applicability to N-D problems. Graphcuts based approaches to object extraction have also been shown to have interesting connections with earlier segmentation methods suchas snakes, geodesic active contours, and level-sets. The segmentation energiesoptimizedby graph cuts combine boundary regularization with region-based properties in the same fashion as Mumford-Shahstyle functionals. We present motivation and detailed technical description of the basic combinatorial optimizationframework for imagesegmentation via s / t graph cuts. After the general concept of using binary graph cut algorithmsfor object segmentation was rst proposed and tested in Boykov and Jolly (2001), this idea was widely studiedin computer vision and graphics communities. We provide links to a large number of known extensions basedon iterative parameter re-estimation and learning, multi-scale or hierarchical approaches, narrow bands, and other techniques for demanding photo, video, and medical applications.

    1. Introduction

    In the last 20 years the computer vision communityhas produced a number of useful algorithms for lo-calizing object boundaries in images. There are snakes(Kass et al., 1988; Cohen, 1991), active contours (Isardand Blake, 1998), geodesic active contours (Caselleset al., 1997; Yezzi et al., 1997), shortest path tech-niques (Mortensen and Barrett, 1998; Falc ao et al.,1998) and many other examples of methods for par-titioning an image into two segments: object andbackground. Each method comes with its own setof features. This paper describes a novel approach to

    object/background segmentation based on s-t graphcuts. 1 This technique from combinatorial optimization

    has already demonstrated a great potential for solv-ing many problems in vision and graphics 2 (e.g. imagerestoration (Greig et al., 1989; Boykov et al., 2001),stereo (Roy and Cox, 1998; Ishikawa and Geiger,1998; Boykov et al., 1998), multi-view reconstruction(Kolmogorov and Zabih, 2002; Vogiatzis et al., 2005;Lempitsky et al., 2006), texture synthesis (Kwatraet al., 2003)). This paper presents a detailed de-scription of the basic framework for efcient ob- ject extraction from N-D image data using s / t graphcuts.

  • 7/31/2019 Graph Cut Segment

    2/23

    110 Boykov and Funka-Lea

    Figure 1 . Energy-based approaches to object extraction. Combinatorial s / t graph cut framework for object segmentation was rst proposedin Boykov and Jolly (2001) and recently further developed in (Boykov and Kolmogorov, 2003; Blake et al., 2004; Rother et al., 2004; Liet al., 2004; Kirsanov and Gortler, 2004; Li et al., 2005; Wang et al., 2005; Kolmogorov et al., 2005; Kumar et al., 2005; Kolmogorov andBoykov, 2005; Lombaert et al., 2005; Kohli and Torr, 2005; Li et al., 2006; Bray et al., 2006; Kohli and Torr, 2006; Juan and Boykov, 2006).

    In particular, it was shown that graph cuts can approximate global (Boykov and Kolmogorov, 2003; Kirsanov and Gortler, 2004; Kolmogorovand Boykov, 2005) and local (Boykov et al., 2006) optima of continuous surface functionals. The majority of current graph cut methodsfor object segmentation use implicit representation of object boundaries. One notable exception is an explicit technique recently shown inKirsanov and Gortler (2004).

    It should be noted that our graph cut approach toobject segmentation was preceded by a number of graph-based methods for image clustering that useeither combinatorial optimization algorithms (Wu andLeahy, 1993;Ishikawa andGeiger, 1998;Felzenszwalband Huttenlocher, 2004;Veksler, 2000)or approximate

    spectral analysis techniques, e.g., normalized cuts (Shiand Malik, 2000). The main goal of such methods isa completely automatic high-level grouping of imagepixels. Typically, this means that they divide an imageinto blobs or clusters using only generic cues of coherence or afnity between pixels. 3 Even though wealso use a graph-based framework, our work is directlyrelated to a fairly different group of image segmenta-tion methods that includes snakes (Kass et al., 1988;Cohen, 1991), active contours (Isard and Blake, 1998;Caselles et al., 1997), intelligent scissors (Mortensenand Barrett, 1998), live-wire (Falc ao et al., 1998), andmany techniques based on level-sets (Sethian, 1999;Osher and Fedkiw, 2002). These methods integratemodel-specic visual cues and contextual informationin order to accurately delineate particular object(s) of interest.

    The major contribution of our work originally out-lined in Boykov and Jolly (2001) is that it rst demon-strated how to use binary graph cuts to build efcientobject extraction tools for N-D applications based ona wide range of model-specic (boundary and region-based) visual cues, contextual information, and useful

    topological constraints. Relationship of our combina-torial graph cuts framework for object extraction toprevious methods is illustrated in Fig. 1. Interestingly,graph cuts framework uses implicit representation of object boundaries which makes them a discrete coun-terpart of level-sets . Relationship with level-sets is fur-

    ther studied in Boykov et al. (2006).The effectiveness of formulating the object segmen-tation problem via binary graph cuts is also demon-strated by a large number of recent publications incomputer vision and graphics that directly build uponthe basic concept outlined in Boykov and Jolly (2001).They extendedoursegmentation technique in a number of interesting directions: geometric cues (Boykov andKolmogorov, 2003; Kolmogorov and Boykov, 2005)(geo-cuts ), regional cues based on Gaussian mixturemodels (Blake et al., 2004) for improved interac-tivity (Rother et al., 2004) ( grab-cuts ), using super-pixels (Li et al., 2004) ( lazy snapping ), integratinghigh-level contextual information (Kumar et al., 2005)(obj-cuts ), multi-level and banded methods (Lombaertet al., 2005; Xu et al., 2003; Juan and Boykov, 2006),binary segmentation using stereo cues (Kolmogorovet al., 2005), efcient algorithms for dynamic applica-tions (Kohli and Torr, 2005; Juan and Boykov, 2006)( ow- and cut-recycling ), extraction of moving or for-ground objects from video (Li et al., 2005; Wang et al.,2005), simultaneous segmentation of multiple objects(Li et al., 2006), combining segmentation with 3D pose

  • 7/31/2019 Graph Cut Segment

    3/23

    Graph Cuts and Efcient N-D Image Segmentation 111

    estimation(Brayet al., 2006), computing segmentation

    uncertainty (Kohli and Torr, 2006), and methods for solving surface evolution PDEs (Boykov et al., 2006).

    1.1. Previous Object Segmentation Methods

    There are many methods for object/background seg-mentation that predate graph cuts. The simplest tech-niques, such as region growing or split-and-merge (seeChapter 10 in Haralick and Shapiro (1992)), do not in-corporate a clear cost function and are largely basedon ad-hoc ideas. Unfortunately, these simple methodsare not robust in practice. They are prone to many

    problems, most notably, to leaking through weakspots in object boundaries. Such weak spots can befound in practically any real image. Despite signicantproblems, methods like region growing and thresh-olding are still the most widely known image seg-mentation techniques. This can be explained by their simplicity and speed. For example, they could be eas-ily run on personal computers available even 1520years ago. Currently available PCs allow much morerobust segmentation algorithms, most of which rely onoptimizing some form of energy function.

    Energy Based Object Segmentation. Energy-basedsegmentation methods can be distinguished by the typeof energy function they use and by the optimizationtechnique for minimizing it. The majority of standardalgorithms can be divided into two large groups:

    (A) Optimization of a functional dened on a contin-uous contour or surface

    (B) Optimization of a cost function dened on a dis-crete set of variables

    The standard methods in group A formulate segmen-tation problem in the domain of continuous functions

    R . Most of them rely on a variational approach andgradient descent for optimization. The correspondingnumerical techniques are based on nite differences or on nite elements . The segmentation methods in groupB either directly formulate the problem as combinato-rial optimization in nitedimensional space Z n or opti-mize some discrete energy function whose minima ap-proximates solution of some continuous problem. Ex-amplesof methods ingroupsA and B are given inFig.1.

    The methods in group (A) include snakes (Kasset al., 1988; Cohen,1991), region competition (Zhu andYuille, 1996), geodesic active contours (Caselles et al.,

    1997), and other methods based on level-sets (Sethian,

    1999; Osher and Fedkiw, 2002; Sapiro, 2001; Os-her and Paragios, 2003). Typically, continuous surfacefunctionalsincorporate various regional and bound-ary properties of segments some of which can be geo-metrically motivated (Caselles et al., 1997; VasilevskiyandSiddiqi, 2002). In most cases,methods in group (A)use variational optimization techniques that can guar-antee to nd only a local minima of the correspondingenergy functional.

    Our new graph cut approach to object extractionbelongs to group B. Most of the discrete optimizationmethods for object segmentation minimize an energydened over a nite set of integer-valued variables. 4

    Such variables are usually associated with graph nodescorresponding to image pixels or control points. Tothe best of our knowledge, all previous combinatorialmethods for object segmentation use discrete variableswhose values encode direction of a path alongthe graph. Many path-based methods use DynamicProgramming (DP) to compute optimal paths. For example, (Mortensen and Barrett, 1998) ( intelligent scissors ) and (Falc ao et al., 1998) ( live-wire ) useDijkstra algorithm while (Amir et al., 1990) ( DP-snakes ) use Viterbi algorithm. Note that all path-basedmethods can naturally encode boundary-based seg-

    mentation cues while the incorporation of regionproperties in segments is less obvious (Jermyn andIshikawa, 1999; Reese, 1999). In any case, all path-based methods are limited to 2D applications becauseobject boundary in 3D volumes can not be representedby a path.

    Global vs. Local Optimization. Before the s / t graphcuts approach for object segmentation was rst pre-sented in Boykov and Jolly (2001), computing a globaloptima was possible only for some 2D object segmen-tation methods. In general, global solutions are attrac-

    tive because of their potentially better stability. For example, imperfections in a globally optimal solutionare guaranteed to directly relate to the cost functionrather than to a numerical problem during minimiza-tion. Thus, global methods can be more reliable androbust. Some versions of active contours (Cohen andKimmel, 1997), shortest path algorithms (Mortensenand Barrett, 1998; Falc ao et al., 1998), ratio regions(Cox et al., 1996), and some other segmentation meth-ods (Jermyn and Ishikawa, 1999) compute a globallyoptimal solution in 2D applications when a segmen-tation boundary is a 1D curve. To the best of our

  • 7/31/2019 Graph Cut Segment

    4/23

    112 Boykov and Funka-Lea

    knowledge, none of the previous global techniques

    generalize to 3D segmentation problems leaving thatdomain for variational techniques and for all kinds of ad-hoc approximation methods (e.g. extrapolating seg-ments from slice to slice).

    Our s / t graph cuts framework offers a globally op-timal object extraction method for N-dimensional im-ages. We describe a fairly general cost function that caninclude both region and boundary properties of seg-ments and certain types of topological constrains thatnaturally t into our global optimization framework.

    Recent developments of the graph cuts methods for image segmentation have somewhat blurred the dif-ferences between continuous methods (A) that usevariational techniques for local optimization and dis-crete combinatorial methods (B). For example, graphcut approaches have inspired some global optimiza-tion techniques for continuous problems (Boykovand Kolmogorov, 2003; Kirsanov and Gortler, 2004;Kolmogorov and Boykov, 2005; Appleton and Talbot,2006) and a new class of local optimization techniques(Boykov et al., 2006).

    1.2. Why Graph Cuts?

    The main technical novelty of the object extraction ap-proach presented in this paper is that we formulate seg-mentation energy over binary variables whose valuesonly indicate whether the pixel is inside or outside theobjectof interest. In contrast to theearlier path-basedcombinatorial methods (see Sec. 1.1), this can be seenas a region-based approach to encoding image seg-ments. In fact, the difference between path-based andregion-based representations of segments on a discretegraph is analogous to the difference between explicitcontour representation, e.g. snakes, and the implicitlevel-sets approach (see Fig. 1).

    Robust and Efcient Optimization in N-D. Numer-ically, our image segmentation framework relies onpowerful graph cut algorithms from combinatorialoptimization (Ford and Fulkerson, 1962; Goldbergand Tarjan, 1988; Boykov and Kolmogorov, 2004).We start with a discrete energy formulation and di-rectly solve it with an exact graph-based optimizationmethod.

    In contrast, variational methods inherently rely onapproximating numerical schemes (e.g. nite differ-ences or nite elements) that must be very carefullydesigned to insure robustness. Convergence of such

    numericalmethods is an importantand non-trivial issue

    that hasto be carefully addressed. Segmentationresultsgeneratedby twovariational techniques using thesameenergy may depend on their implementation details. Incontrast, the discrete optimization approach advocatedin this paper is very straightforward and robust numer-ically. It is also repeatable. Assuming the same energyfunction, one would always get identical segmentseventhoughonecan choosefrom a numberof differentcom-binatorial min-cut/max-ow algorithms for computingminimum s-t cuts on graphs(Fordand Fulkerson, 1962;Goldberg and Tarjan, 1988; Boykov and Kolmogorov,2004).

    Recently, (Boykov and Kolmogorov, 2004) studiedthe practical efciency of combinatorial min-cut/max-ow algorithms on applications in computer vision. Itwas shown that some max-ow techniques could solve2D and 3D segmentation problems in close to real-time usingregular PCs. Further signicantaccelerationwas demonstrated for dynamic segmentation problemsusing ow-recycling (Kohli and Torr, 2005) and cut-recycling (Juan and Boykov, 2006). 5 Some versionsof max-ow/min-cut algorithms can be run on parallelprocessors (Goldberg and Tarjan, 1988). Parallel im-plementations are also possible on Graphics Process-ing Units. 6 While straightforward implementation of

    graph cuts may require a lot of memory for 3D appli-cations, recent results in Lombaert et al. (2005) showedthat multi-level andbanded techniques canalleviate theproblem.

    Relation to Previous Segmentation Methods.Figure 2 demonstrates two simple examples of objectsegmentation via graph cuts and Example (b) showsthat graph cuts include image thresholding as a trivialspecial case. Another special case in (a) shows that,in 2D applications, graph cuts are dual to shortestpath methods like intelligent scissors (Mortensen and

    Barrett, 1998) and live-wire (Falc ao et al., 1998). Thesetechniques nd the cheapest subset of edges (a path)that connects two seeds placed on the desired objectboundary. In contrast, graph cuts nd the cheapestsubset of edges (a cut) that separates seeds marking theinside of the object and background regions. It is easyto show that on planar graphs the two algorithms solvetwo equivalent problems formulated on dual graphs.Unlike intelligent scissors or live-wire , however,graph cuts can naturally integrate any regional biasin addition to boundary cues and they extend to N-Dsegmentation problems.

  • 7/31/2019 Graph Cut Segment

    5/23

    Graph Cuts and Efcient N-D Image Segmentation 113

    Figure 2 . Two basic examples of object extraction via s / t graph cuts. Image pixels form a 2D grid graph. In (a) all neighboring pixels areconnected by edges, n-links , whose capacitiesdepend on intensity differences | I p I q | . The seeds,circles,are also connectedby edges, t-links , totheterminals(source and sink).If t-links areexpensive enough,the seeds arehardwiredto the terminalsthereby providing topological constraintsfor segmentation. Max-ow algorithms (Ford and Fulkerson, 1962; Boykov and Kolmogorov, 2004) gradually increase the ow of water fromsource s to sink t along the edges (pipes) until enough edges get saturated to form a boundary (cut) separating the terminals. Typically, thebottlenecks are cheap n-links between pixels caused by high intensity differences. While example (a) mostly relies on boundary-based cuesencoded by n-links, (b) shows an extreme opposite case of graph cut segmentation using only t-links. In general, t-links encode region-basedcues. Assuming that all n-links have zero capacity and each pixel p is connected to both terminals via t-links with capacities | I p A| and| I p B| then s / t graph cuts will generate a segmentation, shown in red/blue in (b), equivalent to thresholding the image around intensity level A+ B

    2 . One of the main strengths of our graph cut approach to segmentation is that it can combine boundary cues, regional cues, and topologicalconstraints in a unied global optimization framework.

    Implementations of graph cuts using push-relabel(Goldberg and Tarjan, 1988) and algorithms based onpseudo-ows (Hochbaum, 1998; Juan and Boykov,2006) can show a sequence of cuts as they converge tothe global optima. The corresponding imagesegmenta-tion has dynamics that may resemble methods like re-giongrowing and split-and-merge . Of course, themajor difference is that the principles built into combinato-rial algorithms for graph cuts guarantee convergenceto the exact minima of a clearly dened global energyfunction.

    Recent results in (Boykov and Kolmogorov, 2003;Kirsanov and Gortler, 2004; Kolmogorov and Boykov,

    2005) connect graph cuts to a very popular geometricapproach to image segmentation, geodesic active con-tours (Caselles et al., 1997). For example, (Boykovand Kolmogorov, 2003) demonstrated that geomet-ric artifacts previously attributed to discrete segmen-tation methods can be avoided. They show that dis-crete cut metrics on a grid can approximate any con-tinuous Riemannian metric. Then, (Kolmogorov andBoykov, 2005) obtained a complete geometric charac-terization of continuous surface functionals that graphcut methods can approximate. The result is derivedfrom submodularity (Murota, 2003; Kolmogorov and

    Zabih, 2004) of discrete energies that network ow al-gorithms can minimize exactly.

    More recently, (Boykov et al., 2006) showed a verystrong connection between graph cuts and level-sets(Sethian, 1999; Osher and Fedkiw, 2002; Sapiro, 2001;Osher andParagios, 2003).In particular,(Boykovet al.,2006) developed a novel integral approach to solv-ing surface propagation PDEs based on combinatorialgraph cuts algorithms. Such PDEs arise when comput-ing gradient ow evolution of active contours whichare very widely used in computer vision and medicalimage analysis. The results in Boykov et al. (2006)suggest that combinatorial graph cuts algorithms can

    be used as a robust numerical method for an impor-tant class of variational problems that was previouslyaddressed mainly with level-sets .

    Integrating Regions, Boundary, and Shape. Graphcuts optimize discrete energies that combine bound-ary regularization with regularization of regional prop-erties of segments in the same style as continu-ous Mumford-Shah functionals (Mumford and Shah,1989). 7 This work describes in details how to incor-porate the most basic types of visual cues into the bi-nary graph cut framework for object extraction. Such

  • 7/31/2019 Graph Cut Segment

    6/23

    114 Boykov and Funka-Lea

    model-specic visual cues allow ne localization of

    the desired object boundaries.We show that the regional bias can encode any desir-able intensity distributions of the object and/or back-ground while the most typical boundary cue is basedon the intensity differences between neighboring pix-els. Boundary cues based on intensity differences arefairly standard for many applications of graph cuts.For example, this is how static cues were encoded for stereo in Boykov et al. (2001). This is also a typicalgeneric cue widely used in image clustering (Wu andLeahy, 1993;Ishikawa andGeiger, 1998;Felzenszwalband Huttenlocher, 2004; Shi and Malik, 2000; Veksler,2000). In general, graph cuts allow more generaldirected n-links. We demonstrate that model-specic directed n-links avoid some segmentationartifacts.

    Note that a number of papers that followed our orig-inal publication in Boykov and Jolly (2001) further developed object segmentation cues that graph cutscan encode. For example, (Boykov and Kolmogorov,2003; Kolmogorov and Boykov, 2005) showed that n-links can be used to encode geometric functionals suchas length/area and ux. Consequently, graph cut seg-mentation methods can use geometrically motivatedcues rst introduced by geodesicactive contour models

    (Caselles et al., 1997). For example, incorporating uxcan improve edge alignment (Vasilevskiy and Siddiqi,2002; Kimmel and Bruckstein, 2003; Kolmogorov andBoykov, 2005; Funka-Lea et al., 2006) and greatlyhelps to segment thin objects.

    A number of recent publications also further de-veloped the use of regional cues. Blake et al. (2004)suggested to use a Gaussian Mixture (GM) model toapproximate the regional properties of segments. Asdemonstrated by grab-cuts (Rother et al., 2004), iter-ative re-estimation of GM intensity models of the ob- ject and background help to signicantly improve seg-mentation of colored photographs. A very innovativeuse of regional cues from stereo was demonstrated for foreground-background segmentation in Kolmogorovet al. (2005).

    Binary graph cuts can also integrate some globalshape priors. In the simplest case, one can incorporatea regional bias based on values of a signed distancemap function representing some shape. The same ba-sic idea is widely used in level-sets , e.g. (Cremers et al.,2006; Cremers, 2006). Another comparable techniquefor incorporating shape priors into graph cuts uses ux(Kolmogorov and Boykov, 2005). The results in

    Boykov et al. (2006) show that a regional bias based

    on a signed distance map penalizes deviation betweenthe segment and the prior shape according to L2 dis-tance in the space of contours. Other metrics can beapproximated as well. Boykov et al. (2006) use graphcuts to compute a gradient ow evolution of a surfaceby adding an L2 shape-penalty for a drift from its cur-rent position. This additional current-shape penaltyworks as a breaker that does not allow the graph cutsto jump too far from the current solution. This iterativegraph cut technique can generate gradual motion of asurface consistent with continuous gradient ow PDEs(Boykov et al., 2006).

    Integrating Topological Constraints. Our graph cutframework for segmentation is based on an implicitrepresentation of object boundaries. Thus, it allowsthe same exible topological properties of segmentsas in level-sets techniques. Our segmentation re-sults may have isolated objects which may alsocontain holes. However, it may be useful to im-pose some topological constraints reecting certainhigh-level contextual information about the object of interest.

    Graph cuts can incorporate some types of topologi-cal constraints. For example, the hard constraints can

    indicate some image pixels a priori known to be a partof the object or background. We show that topologi-cal constraints can be used to reduce the search spaceof feasible segmentations. Our algorithm also allowsvery efcient editing of segmentation results, if nec-essary. The optimal segmentation can be efciently re-computed if some hard constraints (seeds) are added or removed. Our algorithm efciently adjusts the currentsegmentation without recomputing the whole solutionfrom scratch.

    Region-based topological constraints naturally tinto graph cuts. 8 These constraints correspond to aninnite cost regional bias that guarantees that somegiven subsets of pixels should be inside or outside of the object, as shown in Fig. 2(a). Such constraints canrestrict the search space for the object of interest in theimage. BoykovandVeksler (2006) discussessome non-regional topological constraints that potentially couldbe used in the graph cut algorithms for object extrac-tion.

    Notethat intelligent scissors (Mortensen andBarrett,1998)and livewire (Falc ao et al., 1998) usea boundary-based hard constraints where the user can indicate cer-tain pixels where the segmentation boundary should

  • 7/31/2019 Graph Cut Segment

    7/23

    Graph Cuts and Efcient N-D Image Segmentation 115

    pass. The segmentation boundary is then computed as

    the shortest path between the marked pixels accord-ing to some energy function based on image gradient.Onedifcultywith such hard constraints is that theuser inputs have to be very accurately positioned at the de-sired boundary. In contrast, regional hard constraintsfor graph cuts do not have to be precisely positioned.Moving the seeds around the object of interest (withinsome limits) does not typically change the segmenta-tion results.

    Relation to Markov Random Fields. Our segmenta-tion method is based on s-t graph cut algorithm. Greiget al. (1989) were the rst to discover that powerfulgraph cut algorithms from combinatorial optimizationcan be useful for computer vision problems. In particu-lar, they showed that graph cuts can be used for restora-tion of binary images. 9 The problem was formulated asMaximum A Posterior estimation of a Markov RandomField (MAP-MRF) that required minimization of pos-terior energy

    E ( I ) = pP

    ln Pr( I p | I o ) +{ p,q } N

    I p = I q , (1)

    where

    I p = I q =1 if I p = I q0 if I p = I q .

    is a Kronecker delta representing interaction potential, I = { I p | p P } is an unknown vector of originalbinary intensities I p {0, 1} of image pixels P , vector I o represents observed binary intensities corrupted bynoise, and N is a set of all pairs of neighboring pixels.

    Greig et al. constructed a two terminal graph suchthat the minimum cost cut of the graph gives a globallyoptimal binary vector I . Previously, exact minimizationof energies like (1) was not possible and such energies

    were approached mainly with iterative algorithms likesimulated annealing. In fact, Greig et al. used their re-sults mainly to show that in practice simulated anneal-ingreaches solutionsvery farfrom theglobal minimumeven in very simple binary image restoration examples.

    Unfortunately, the graph cut technique in Greig et al.remainedunnoticedfor almost10 years mainlybecausebinary image restoration looked very limited as anapplication. In the late 90s new computer vision tech-niques appeared that showed how to use s-t cut algo-rithm on graphs for more interesting non-binary prob-lems. Roy and Cox (1998) was the rst to use graph

    cuts to compute multi-camera stereo. Later, (Ishikawa

    and Geiger, 1998; Boykov et al., 1998) showed thatwith the right edge weights on a graph similar to (Royand Cox, 1998) one can minimize a non-binary caseof (1) with linear interaction penalties. This graph con-struction was further generalized to handle arbitraryconvex interactions in Ishikawa (2003). Another gen-eral case of multi-label energies where the interactionpenalty is a metric (on the space of labels) was stud-ied in (Boykov et al., 1998, 2001). Their -expansionalgorithm nds provably good approximate solutionsby iteratively running min-cut/max-ow algorithms onappropriate graphs. The case of metric interactions in-cludes many kinds of robust cliques that are fre-quently used in practice. Later it was shown that -expansion technique can be also used for non-metricinteractions, often loosing optimality guarantees (e.g.,Rother et al., 2005).

    In this paper we consider a binary segmentationproblem where a given object has to be accuratelyseparated from its background. One of our insightsis that such a problem can be formulated as a bi-nary (object/background) labeling problem with en-ergy similar to (1). Indeed, binary image restorationenergy (1) contains two terms representing regionaland boundary properties. Such a combination looks

    very appropriate for an object segmentation method.Moreover, a binary energy like (1) can be minimizedexactly even in N-dimensional cases using standard s-t graph cut algorithms. Technically, our basic methoduses a graph constructionsimilar to (Greiget al., 1989).Our main contribution is that our work rst demon-strated that global optimization of discrete energy like(1) can be effectively used for accurate object extrac-tion from N-D images and showed how (1) can inte-grate segmentation cues previously used in snakes andin implict active contour models (e.g. level-sets).

    2. Optimal Object Segmentation via Graph Cuts

    In this section we describe our object segmentationframework in detail. Section 2.1 presents our ba-sic ideas relating graph cuts and binary segmenta-tion. Minimal background on s-t cuts from graphtheory is provided. Section 2.2 formulates our seg-mentation technique in terms of energy minimiza-tion. Our cost function serves as a soft constraint onregional and boundary properties of segments. Thesynergy of regional and boundary properties is dis-cussed in Section 2.3. In Section 2.4 we introduce hard

  • 7/31/2019 Graph Cut Segment

    8/23

    116 Boykov and Funka-Lea

    Figure 3 . A simple 2D segmentation example for a 3 3 image. The seeds are O = { v} and B = { p}. The cost of each edge is reectedby the edges thickness. The boundary term (4) denes the costs of n-links while the regional term (3) denes the costs of t-links. Inexpensiveedges are attractive choices for the minimum cost cut. Hard constraints (seeds) (8,9) are implemented via innity cost t-links. A globally optimalsegmentation satisfying hard constraintscan be computed efciently in low-order polynomial time using max-ow/min-cut algorithms on graphs(Ford and Fulkerson, 1962; Goldberg and Tarjan, 1988; Cook et al., 1998).

    constraints and show how they can be used to restrictthe search space of feasible solutions. Section 2.5 pro-videsimplementational details and formally shows thata minimum cost s-t cut on an appropriately constructedgraphcorresponds to a globallyoptimal solutionamongall binary segmentations satisfying a given set of hardconstraints. Section 2.6 shows an efcient solution for recomputing optimal segments when hard constraintsare changed. This feature of our method is very use-ful for fast object editing, especially in 3D applica-tions. Generalization to directed graphs is discussed inSection 2.7. In some cases directed edges can signi-cantly improve segmentation results.

    2.1. Basic Ideas and Background Information

    First, we will introduce some terminology. A graphG = V , E is dened as a set of nodes or vertices V and a set of edges E connecting neighboring nodes.For simplicity, we mainly concentrate on undirected

    graphswhere each pair of connectednodes is describedby a single edge e = { p, q } E .10 A simple 2D exam-ple of an undirected graph that can be used for imagesegmentation is shown in Fig. 3(b).

    The nodes of our graphs represent image pixels or voxels. There are also two specially designated termi-nal nodes S (source ) and T (sink ) that represent ob- ject and background labels. Typically, neighboringpixelsare interconnected by edges in a regular grid-likefashion. Edges between pixels are called n-links wheren stands for neighbor. Note that a neighborhood sys-tem can be arbitrary and may include diagonal or anyother kind of n-links. Another type of edges, called t-links , are used to connect pixels to terminals. All graphedges e E including n-links and t-links are assignedsome nonnegative weight (cost) w e . In Fig. 3(b) edgecosts are shown by the thickness of edges.

    An s-t cut is a subset of edges C E such thatthe terminals S and T become completely separated onthe induced graph G(C ) = V , E \ C . Note that a cut

  • 7/31/2019 Graph Cut Segment

    9/23

    Graph Cuts and Efcient N-D Image Segmentation 117

    (see Fig. 3(c)) divides the nodes between the terminals.

    As illustrated in Fig. 3 (c-d), any cut corresponds tosome binary partitioning of an underlying image intoobject and background segments. Note that in thesimplistic example of Fig. 3 the image is divided intooneobjectand onebackground regions.In general,cuts can generate binary segmentation with arbitrarytopological properties. Examples in Section 3 illustratethat object and background segments may consist of several isolated connected blobs that also may haveholes.

    Our goalis to compute the best cut that would giveanoptimalsegmentation. In combinatorial optimizationthe cost of a cut is dened as the sum of the costs of edges that it severs

    |C | =eC

    w e

    Note that severed n-links are located at the segmen-tation boundary. Thus, their total cost represents thecost of segmentation boundary. On the other hand, sev-ered t-links canrepresent theregionalproperties of seg-ments. Thus, a minimum cost cut may correspond toa segmentation with a desirable balance of boundaryand regional properties. In Section 2.2 we formulate aprecise segmentation energy function that can be en-coded via n-links and t-links. Note that innity costt-links make it possible to impose hard constraints onsegments.

    Numerically, our techniqueis based on a well-knowncombinatorial optimization fact that a globally mini-mum s-t cut can be computed efciently in low-order polynomial time (Ford and Fulkerson, 1962; GoldbergandTarjan,1988;Cook etal., 1998).The correspondingalgorithms work on any graphs. Therefore, our graphcut segmentation method is not restricted to 2D im-ages and computes globally optimal segmentation onvolumes of any dimensions. In Section 3 we show anumber of 3D examples.

    Note that a fast implementation of graph cut algo-rithms can be an issue in practice. The most straight-forward implementations of the standard graph cut al-gorithms, e.g. max-ow (Ford and Fulkerson, 1962) or push-relabel (Goldberg and Tarjan, 1988), can be slow.The experiments in Boykov and Kolmogorov (2004)compare several well-known tuned versions of thesestandard algorithms in thecontext of graph based meth-ods in vision. The same paper also describes a newversion of the max-ow algorithm that (on typical vi-sion examples) signicantly outperformed the standardtechniques. Our implementation of the segmentation

    method of this paper uses the new graph cut algorithm

    from (Boykov and Kolmogorov, 2004).

    2.2. Segmentation Energy

    Consider an arbitrary set of data elements (pixels or voxels) P and some neighborhood system representedby a set N of all (unordered) pairs { p, q } of neighbor-ing elements in P . For example, P can contain pixels(or voxels) in a 2D (or 3D) grid and N can containall unordered pairs of neighboring pixels (voxels) un-der a standard 8- (or 26-) neighborhood system. Let A = ( A1 , . . . , A p , . . . , A|P | ) be a binary vector whose

    components A p specify assignments to pixels p in P .Each A p can be either obj or bkg (abbreviations of object and background). Vector A denes a seg-mentation. Then, the soft constraints that we imposeon boundary and region properties of A are describedby the cost function

    E ( A) = R( A) + B( A) (2)

    where

    R( A) = pP

    R p( A p) (regional term ) (3)

    B( A) ={ p,q} N

    B p, q A p = Aq (boundary term )

    (4)

    and

    A p = Aq =1 if A p = Aq0 if A p = Aq .

    The coefcient 0 in (2) species a relative im-portance of the region properties term R( A) versusthe boundary properties term B( A). The regional term R( A) assumes that the individual penalties for assign-

    ing pixel p to object and background, correspond-ingly R p(obj) and R p(bkg), are given. For exam-ple, R p() may reect on how the intensity of pixel pts into given intensity models (e.g. histograms) of theobject and background

    R p(obj) = ln Pr( I p|obj) (5)

    R p(bkg) = ln Pr( I p|bkg) (6)

    This use of negative log-likelihoods is motivated by theMAP-MRF formulationsin (Greiget al., 1989; Boykovet al., 2001).

  • 7/31/2019 Graph Cut Segment

    10/23

    118 Boykov and Funka-Lea

    Figure 4 . Synthetic Gestalt example. The optimal object segment (red tinted area in (c)) nds a balance between region and boundaryterms in (2). The solution is computed using graph cuts as explained in Section 2.5. Some ruggedness of the segmentation boundary is dueto metrication artifacts that can be realized by graph cuts in textureless regions. Such artifacts can be eliminated using results in Boykov andKolmogorov (2003).

    The term B( A) comprises the boundary proper-ties of segmentation A. Coefcient B p,q 0 should beinterpreted as a penalty for a discontinuity between pand q . Normally, B p,q is large when pixels p and q aresimilar (e.g. in their intensity) and B p,q is close to zerowhen the two are very different. The penalty B p,q canalso decrease as a function of distance between p andq . Costs B p,q may be based on local intensity gradient,Laplacian zero-crossing, gradient direction, geomet-

    ric (Boykov and Kolmogorov, 2003; Kolmogorov andBoykov, 2005) or other criteria. Often, it is sufcient toset the boundary penalties from a simple function like

    B p,q exp ( I p I q )2

    2 2

    1dist ( p, q )

    . (7)

    This function penalizes a lot for discontinuities be-tween pixels of similar intensities when | I p I q | < .However, if pixels are very different, | I p I q | > ,then the penalty is small. Intuitively, this function cor-

    responds to the distribution of noise among neighbor-ing pixels of an image. Thus, can be estimated ascamera noise.

    2.3. Region vs. Boundary

    A simple example of Fig. 4 illustrates some interestingproperties of ourcost function(2).The objectof interestis a cluster of black dots in Fig. 4(a) that we would liketo segment as one blob. We combine boundary andregion terms (3,4) taking > 0 in (2). The penalty for

    discontinuity in the boundary cost is

    B p, q =1 if I p = I q0.2 if I p = I q

    To describe regionalproperties of segments we use a priori known intensityhistograms (Fig. 4(b)).Note thatthe background histogram concentrates exclusively onbright values while the object allows dark intensitiesobserved in the dots. If these histograms are usedin (5,6) then we get the following regional penalties R p( A p) for pixels of different intensities.

    I p R p(obj) R p(bkg)

    dark 2.3 +bright 0.1 0

    The optimal segmentation in Fig. 4(c) nds a bal-ance between the regional and the boundary term of energy (2). Individually, bright pixels slightly prefer tostay with the background (see table above). However,spatial coherence term (4) forcessome of them to agreewith nearby dark dots which have a strong bias towardsthe object label (see table).

    2.4. Hard Constraints

    In the simple example of Fig. 4 the regional propertiesof the object of interest are distinct enough to segmentit from the background. In real examples, however,

  • 7/31/2019 Graph Cut Segment

    11/23

    Graph Cuts and Efcient N-D Image Segmentation 119

    Figure 5 . Automatic segmentation of cardiac MR data. Initialization in (b) is based on hard constraints that can be placed automatically usingsimple template matching. Then, graph cuts accurately localize object boundaries in (c).

    objects maynot have sufciently distinct regionalprop-erties. In such cases it becomes necessary to further constraint the search space of possible solutions be-fore computing an optimal one. We propose topologi-cal (hard) constraints as an important source of highlevel contextual information about the object of inter-est in real images.

    Assume that O and B denote the subsets of pixels a priori known to be a part of object andbackground,correspondingly. Naturally, the subsets O P andB P are such that O B = . For example, consider sets O (red pixels) and B (blue pixels) in Fig. 5(b). Our goal is to compute the global minimum of (2) amongall segmentations A satisfying hard constraints

    p O : A p = obj (8) p B : A p = bkg . (9)

    Figure 5(c) shows an example of an optimal segmenta-tion satisfying the hard constraints in (b). Throughoutthis paper we use red tint to display object segmentsand blue tint for background.

    Ability to incorporate hard constraints (8,9) is oneof the most interesting features of our segmentationmethod. There is a lot of exibility in how these hardconstraints can be used to adjust the algorithm for dif-ferent tasks. The hard constraints can be used to ini-tialize the algorithm and to edit the results. The hardconstraints can be set either automatically or manu-ally depending on an application. Manually controlled

    seeds is the most likely way to enter hard constraintsin many generic applications, e.g. photo-editing shop.Manual seeds are also useful for editing segments (seeSection 2.6) when initial segmentation results requirecorrections. On the other hand, automatically set hardconstraints can be used to initialize the algorithm inhighly specialized applications such as organ segmen-tation from medical images or volumes.

    For example, consider a medical application whereone should segment the blood pool of a left ventricle(one of the heart chambers) captured in an MR imageof Fig. 5(a). A simple template matching can roughlylocalize the left ventricle in the image, e.g. using itsknown circular shape. The hard constraints in Fig. 5(b)can be set automatically as soon as a rough position of the blood pool is known. Then, our graph cut techniquecan accurately localize the boundary of the blood poolin Fig. 5(c).

    Note that the hard constraints in Fig. 5 restrict the set

    of feasible cuts to closed contours in a band betweenthe blue and red seeds. It is possible to show that theminimum cost cut in that band can be interpreted asthe shortest length path on a dual graph. In fact, our graph cut approach can be seen as a generalization of path-based segmentation techniques in (Geiger et al.,1995;Mortensen andBarrett,1998;Falc aoetal., 1998).These methods are intrinsically 2D while our graph cutapproach can compute optimal segmentation boundaryfor N-dimensional cases as well.

    The band in Fig. 5 also restricts the area where theactual computation takes place. It is enough to build a

  • 7/31/2019 Graph Cut Segment

    12/23

    120 Boykov and Funka-Lea

    Figure 6 . Editing segments by adding hard constraints (seeds). A fragment of an original photo is shown in (a). Initial seeds and segmentationare shown in (b).The results in (c,d,e) illustrate changes in optimal segmentation as newhard constraint aresuccessively added. Thecomputationtime for consecutive corrections in (c,d,e) is marginal compared to time for initial results in (b).

    graph only in the area of the band since max-ow/min-

    cut algorithms would not access any other nodes.An example in Fig. 6 further illustrates how hardconstraints (8,9) affect the search space of feasible so-lutions. Naturally, adding hard constraints helps to seg-ment desirableobjectas unacceptablesolutionsare dis-carded and the search space reduces. Hard constraintsplaced in one 2D slice may be enough to properly con-strain the search space of segmentations in a 3D vol-ume, e.g. in Fig. 10.

    Note that in a practical implementation of our method it is possible to make a double use of the seeds.First of all, they can constrain the search space as dis-cussed above. In addition, we can use intensities of pixels (voxels) marked as seeds to learn the histogramsfor object and background intensity distributions:Pr( I |obj) and Pr( I |bkg) in (5,6). Other ideas for initializing intensity distributions are studies in (Blakeet al., 2004; Rother et al., 2004).

    2.5. Optimal Solution Via Graph Cuts

    In this section we provide algorithmic details aboutour segmentation technique. The general work-ow isshown in Fig. 3. We describe the details of the corre-sponding graph construction and prove that the mini-mum cost cut gives an optimal segmentation for energy(2) and hard constraints (8,9).

    To segment a given image we create a graph G =V , E with nodes corresponding to pixels p P of

    the image. There are two additional nodes: an objectterminal (a source S) and a background terminal (asink T ). Therefore,

    V = P {S, T }.

    The set of edges E consists of two types of undi-rected edges: n-links (neighborhood links) and t-links

    (terminal links). Each pixel p has two t-links { p, S}

    and { p, T } connecting it to each terminal. Each pair of neighboring pixels { p, q } in N is connected by ann-link. Without any ambiguity, an n-link connecting apair of neighbors p and q is also denoted by { p, q }.Therefore,

    E = N pP

    {{ p, S}, { p, T }}.

    The following table gives weights of edges in E

    edge weight (cost) for

    { p, q } B p,q { p, q } N

    R p(bkg) p P , p O B{ p, S} K p O

    0 p B

    R p(obj) p P , p O B{ p, T } 0 p O

    K p B

    whereK = 1 + max

    pP q : { p,q } N

    B p,q .

    The graph G is now completely dened. We drawthe segmentation boundary between the object and thebackground by nding the minimum cost cut on thegraph G. The minimum cost cut C on G can be com-puted exactly in polynomial time via algorithms for twoterminal graph cuts (see Section 2.1) assuming thatthe edge weights specied in the table above are non-negative. 11

  • 7/31/2019 Graph Cut Segment

    13/23

    Graph Cuts and Efcient N-D Image Segmentation 121

    Below we state exactly how the minimum cut C de-

    nes a segmentation A and prove this segmentation isoptimal. We need one technical lemma. Assume thatF denotes a set of all feasible cuts C on graph G suchthat

    C severs exactly one t-link at each p { p, q } C iff p and q are t-linked to different ter-

    minals if p O then { p, T } C if p B then { p, S} C .

    Lemma 1. The minimum cut on G is feasible , i.e.C F .

    Proof: C severs at least one t-link at each pixel sinceit is a cutthat separatesthe terminals. Ontheotherhand,it can not sever both t-links. In such a case it would notbe minimal since one of the t-links could be returned.Similarly, a minimum cut should sever an n-link { p, q }if p and q are connected to the opposite terminals justbecause any cut must separate the terminals. If p and qare connected to the same terminal then C should notsever unnecessary n-link { p, q } due to its minimality.The last two properties are true for C because the con-

    stant K is larger than the sum of all n-links costs for any given pixel p . For example, if p O and C severs{ p, S} (costs K ) then we would construct a smaller costcut by restoring { p, S} and severing all n-links from p(costs less then K ) as well as the opposite t-link { p, T }(zero cost).

    For any feasible cut C F we can dene a uniquecorresponding segmentation A(C ) such that

    A p(C ) =obj, if { p, T } C

    bkg, if { p, S} C .(10)

    The denition above is coherent since any feasible cutsevers exactly one of the two t-links at each pixel p .The lemma showed that a minimum cut C is feasible.Thus,we can denea corresponding segmentation A = A(C ). The next theorem completes the description of our algorithm.

    Theorem 1 . The segmentation A = A(C ) dened bythe minimum cut C as in (10) minimizes (2) among allsegmentations satisfying constraints (8 , 9) .

    Proof: Using the table of edge weights, denition of

    feasible cuts F , and Eq. (10) one can show that a costof any C F is

    |C | = pOB

    R p( A p(C )) +{ p,q } N

    B p,q A p (C )= Aq (C )

    = E ( A(C )) pO

    R p(obj) pB

    R p(bkg) .

    Therefore, |C | = E ( A(C )) const . Note that for anyC F assignment A(C ) satises constraints (8,9).In fact, Eq. (10) gives a one-to-one correspondencebetween the set of all feasible cuts in F and the set Hof all assignments A that satisfy hard constraints (8,9).Then,

    E ( A) = | C | + const = minC F

    |C | + const

    = minC F

    E ( A(C )) = min AH

    E ( A)

    and the theorem is proved.

    2.6. Fast Editing of Segments

    In practice, no segmentation algorithm can guarantee100% accuracy. Thus, it is convenient to have a simplewayto correct segments if necessary. Within ourframe-work segment editing canbe done by placing additionalhard constraints (seeds) in incorrectly segmented im-age areas. 12 Figure 6 shows one example of editing ina photo-shop context.

    In fact, our technique can efciently recompute anew globally optimal solution that satises additionalconstraints by adjusting a current segmentation. Thisfast editing feature is very useful in practical appli-cations. In particular, this is very important for fastediting of objects in 3D cases. Indeed, initial segmen-tation may take 5-30 seconds or more depending onvolume size. Corrections, however, can be computed

    within a second.Below we describe an efcient method to recompute

    an optimal solution when hard constraints are changed.We assume that a max-ow algorithm (see (Ford andFulkerson, 1962; Cook et al., 1998)) is used to deter-mine the minimum cut on G. The max-ow algorithmgradually increases the ow sent from the source S tothe sink T along the edges in G given their capacities(weights). Upon termination the maximum ow satu-rates the graph. The saturated edges correspond to theminimum cost cut on G giving us an optimal segmen-tation.

  • 7/31/2019 Graph Cut Segment

    14/23

  • 7/31/2019 Graph Cut Segment

    15/23

    Graph Cuts and Efcient N-D Image Segmentation 123

    the left (bone) by a thin layer of a relatively dark muscle

    tissue. The contrast between the bone and the muscle ismuch better then the contrast between the muscle andthe liver. Thus, according to (7) the standard undi-rected cost of edges between the bone and the muscleis much cheaper than the cost of edges between themuscle and the liver. Consequently, an optimal cut onan undirected graph produces segmentation in Fig. 7(c)that sticks to the bone instead of following the actualliver boundary. Correcting this error may require seg-ment editing as in Section 2.6.

    Alternatively, directed graphs can automatically dis-tinguish between the incorrect boundary in Fig. 7(c)and the desirable one in (d). The key observation isthat the weights of directed edges ( p, q ) can dependon a sign of intensity difference ( I q I q ). In contrast,weights of undirected edges should be symmetric withrespect to its end points and could depend only on theabsolute value | I p I q | as in (7). Note that the objectboundary that stuck to the bone in (c) separates darker tissue (muscle) in the object segment from brighter tis-sue (bone) in the background. On the other hand, thecorrect object boundary in (d) goes from brighter tis-sue (liver) in the object to darker tissue (muscle) in thebackground. Note that directed edge weights

    w ( p,q ) =

    1 if I p I q

    exp ( I p I q )2

    2 2if I p > I q

    would specically encourage cuts from brighter tissuein the object to darker tissue in the background. Theresults in Fig. 7(d) show optimal segmentation on adirected graph using such edge weights.

    Formally, a directed version of segmentation en-ergy (2)canbe presentedas follows.Theneighborhoodsystem N should include all ordered pairs ( p, q ) of neighboring pixels. Then the boundary term of energy

    (2) is

    B( A) =( p,q ) N

    B( p,q ) A p= obj , Aq = bkg (11)

    Note that B( p,q ) and B(q , p) are two distinct (directed)discontinuity penalties for cases when p object,q background and when p background, q object. Straightforward generalization of the tech-nical results in Section 2.5 shows that energy (2) withboundary term (11) canbe minimizedvia s-t graph cutsusing directed n-links w ( p,q ) = B( p,q ) 0.

    In fact, non-negativity of directed penalties B( p,q )

    0 is not an essential limitation of our general s / t graph cut framework for object extraction. Vladimir Kolmogorov has pointed out to us that it generalizes tosubmodular penalties

    B( p, q ) + B(q , p) 0

    In the context of object segmentation, such di-rected submodular penalties were recently studied inKolmogorov and Boykov (2005) where directednessof the boundary cost (11) is geometrically character-ized in terms of ux of a vector eld. Consistenly withsome variational methods (Vasilevskiy and Siddiqi,

    2002; Kimmel and Bruckstein, 2003) from group A inSection 1.1, (Kolmogorov and Boykov, 2005) demon-strates that optimization of ux helps graph cuts to bet-ter align with object boundaries and to segment thinstructures.

    3. Experimental Results

    We demonstrate our general-purpose segmentationmethod on several generic examples includingphoto/video editing and medical data processing. Themain goal is to prove the concept of object extractionvia s / t graph cuts proposed in our work. We show orig-inal data and segments generated by our technique for a given set of hard constraints. Our actual implemen-tation allows a user to enter hard constraints (seeds)via mouse operated brush of red (for object) or blue(for background) color. We present segmentation re-sults in different formats depending on what is moreappropriate in each case.

    Note that we used simple 4-neighborhood systemsin 2D examples and 26-neighborhood system in 3Dexamples. All running times are given for 1.4 GHzPentium III. Our implementation is based on a new

    max-owalgorithm from (Boykov and Kolmogorov,2004).

    3.1. Photo and Video Editing

    The results in this Section are obtained with =0, that is without the regional term (3). There areexamples where some objects in real images havedistinct intensity distributions that may help to seg-ment them. However, a segmentation algorithm in ageneral-purpose photo/video editor can not count on

  • 7/31/2019 Graph Cut Segment

    16/23

    124 Boykov and Funka-Lea

    Figure 8 . Segmentation of photographs (early 20th century). Initial segmentation for a given set of hard constraints (seeds) takes less than asecond for most 2D images (up to 1000 1000). Correcting seeds are incorporated in the blink of an eye. Thus, the speed of our method for photo editing mainly depends on time for placing seeds. An average user will not need much time to enter seeds in (a) and (b).

    any specic regional properties. Indeed, objects of in-terest are determined by random users and can not bepredicted. On theotherhand,we canstilluse thebound-aryterm (4)with thediscontinuity penalty (7). Most ob- jects of interest will have borders along high contrastboundaries inside images. Hard constraints entered asseeds can bring high level contextual information.

    In Fig. 8 we show segmentation results for somephotographs. 14 The user can start with a few objectand background seeds loosely positioned inside and,correspondingly, outside the object(s) of interest. Byreviewing the results of initial segmentation the user may observe that someareas are segmentedincorrectly.Then,onecan edit segmentsas describedin Section 2.6.

    Naturally, the hope is that the method can quicklyidentify the right objectboundaries with minimum user interaction. The algorithm would not be practical if theuser has to add correcting seeds until the whole image

    is basically manually segmented. The performance of the algorithm can be judged by the amount of con-straints/seeds that had to be placed. For that reason,the results in Fig. 8 are shown with the correspondingseeds.

    In Fig. 9 we segmented moving cars in a video se-quence. The sequence of 21 video frames (255 189)was treated as a single 3D volume. The necessary seedswere entered in a simple 3D interface where we couldbrowse through individual 2D slices (frames) of thevol-ume. The whole volume is segmented based on hardconstraints placed in just a few frames. Note that cor-

    recting seeds in one frame can x imperfections inmany adjacent frames. Each car was segmented in anindependent experiment. Segmentation of the wholevolume of 21 frames can be obtained by placing hardconstraints in only a few of the frames.

    The initial segmentation might take from 23 sec-onds on smaller volumes (200 200 10) to a fewminutes on bigger ones (512 512 50). Thus, ef-cient editing described in Section 2.6 is crucial for 3Dapplications. Usually, the new seeds are incorporatedin a few seconds even on bigger volumes.

    3.2. Medical Images and Volumes

    Figures 10, 11, and 12 show segmentation results thatwe obtained on a 3D medical volumes. Each objectwas segmented in 10 to 30 seconds. In the examplesof Figs. 10 and 12 the objects were extracted from 3D

    volumes after entering seeds in only one slice shownin (a). In Fig. 11 some correcting seeds were added ina couple of slices in addition to initial hard constraintsshown in (a). Note that the object in Fig. 10 consistof two isolated blobs one of which has a hole. Thisclearly demonstrates that our method does not restricttopological properties of segments.

    As in Section 3.1 we did not use regional term (3) for the experiments in Figs. 10, 11 and 12 as it was not use-ful. In some applications, however, the regional termmay signicantly simplify, if not completrely automate(Kolmogorov et al., 2005), the segmentation process.

  • 7/31/2019 Graph Cut Segment

    17/23

    Graph Cuts and Efcient N-D Image Segmentation 125

    Figure 9 . Segmentation of a video sequence (21 frames). Initial segmentation takes 35 seconds while correcting seeds are incorporated withina second. The car in the center is the simplest to segment. It is enough to place hard constraints (seeds) in one frame. The car on the left requiredsome editing due to low contrast and seeds were placed in 3 frames. The bus on the right moves behind a tree and its segmentation requiredseeds in 4-5 frames.

    Figure 10 . Segmentation of bones in a CT volume [256x256x119].

    In Fig. 13 we demonstrate segmentation on 3D kidneyMR data that beneted from regional term (3). We seg-mented out cortex, medulla, and collecting system of akidney in three consecutive steps. First, the whole kid-ney is separated from the background and the latter iscropped. The remaining pixels belong to three distincttypes of kidney tissue (cortex, medulla, or collectingsystem) with identiable regional properties. At thispoint it becomes useful to engage the regional term (3)of energy.

    The results in Fig. 13 are shown without seeds sincethe process involved three different segmentations. Us-ing regional bias allows to get 3D segmentation resultsby entering only a few seeds in one slice. Initial opti-mal segments are computed in 110 seconds and minor correction can be incorporated in less then a second.This example also demonstrates unrestricted topolog-ical properties of our segments. Fully automatic seg-mentation of kidney might be possible with more so-phisticated models for regional.

  • 7/31/2019 Graph Cut Segment

    18/23

    126 Boykov and Funka-Lea

    Figure 11 . Segmentation of liver in a CT volume [170x170x144].

    Figure 12 . Segmentation of lung lobes in a CT volume [205x165x253].

    4. Discussion

    We presented a novel framework for extracting objectsfrom images/volumes. In many ways our graph-cut ap-proach to object extraction can be seen as a unify-ing framework for segmentation that combines many

    good features of the previous methods like snakes,active contours, and level sets while providing ef-cient and robust global optimization applicable to N-Dproblems.

    The graph cuts framework is very exible withinitialization. It can be based either on topologicalconstraints reecting high-level contextual informa-tion (Boykov and Jolly, 2001), or on various vi-sual cues like object/background color distributions(Boykov and Jolly, 2001; Blake et al., 2004; Rother et al., 2004), texture, or even stereo cues (Kolmogorovet al., 2005). The method does not have to have an

    initial contour/surface, but it may take advantage of some shape prior (Kolmogorov and Boykov, 2005;Boykov et al., 2006) or of some initial approximateguess (Juan and Boykov, 2006), if available. The algo-rithm can also be accelerated in dynamic applicationswhere previously segmented image frame is similar to

    the new frame (Kohli and Torr, 2005). Similar to level-set based methods, graph cuts use implicit boundaryrepresentation and the topological properties of the re-covered segments are unrestricted, unless additionalconstraints are imposed (Boykov and Veksler, 2006).In addition, our algorithm allows effective editing of segments if necessary.

    To the best of our knowledge, our graph cut ap-proach is the rst global optimization object extractiontechnique that extends to N-dimensional images. Theunderlying numerical optimization scheme is straight-forward and robust. It uses exact min-cut/max-ow

  • 7/31/2019 Graph Cut Segment

    19/23

    Graph Cuts and Efcient N-D Image Segmentation 127

    Figure 13 . Kidney in a 3D MRI angio data [55 80 32] segmented into cortex, medulla, and collecting system.

    algorithms from combinatorial optimization and thereare no numerical convergence issues.

    The downside of our global optimization approachis a restricted set of constraints that we have avail-able to guide the segmentation process (Kolmogorovand Zabih, 2004; Kolmogorov and Boykov, 2005). Atthe same time, the range of available constraints issufciently wide to make graph cuts very useful inpractice. Surprisingly, graph cuts can implement thesame geometrically-motivated cues (Kolmogorov andBoykov, 2005) that arewidely used in continuous level-sets methods.

    Our main technical tools are weights and topol-ogy of n-links and t-links of image-embedded graphs.

    Typically, n-links encode the segmentation boundarycost that is closely related to its geometric length

    in a non-Euclidean image-based metric (Boykov andKolmogorov, 2003). At the moment we can not opti-mize second order properties of the boundary (e.g. cur-vature) which may be helpful in some applications. Itis plausible that combinatorial representation of suchproperties would require an energy function (4) withtriple-cliques interactions in the boundary term 15 whileour n-links describe only pairwise relations. Recent re-sults in Kolmogorov and Zabih (2004) describe a classof triple-clique interactions that can be optimized withgraph cuts. It is not clear, though, if this class is goodenough to model any particular second-order boundary

  • 7/31/2019 Graph Cut Segment

    20/23

    128 Boykov and Funka-Lea

    properties of segments. In case this open question has a

    positiveanswerourobject extraction frameworkshouldbe able to benet from the corresponding construction.Our examples with directed n-links (see Section 2.7)

    show that ourmethod canmodel certain boundary prop-erties which might be hard to handle with other objectextraction methods. In fact, there are many other cre-ative ways to use specic features of our graph-basedapproach. For example, directed n-links can be usedto optimize ux (Kolmogorov and Boykov, 2005) of a given vector eld through the segmentation bound-ary. A eld of such vectors for each image pixel canbe a priori dened from data (e.g. intensity gradients)or other contextual information (e.g. gradients of a dis-tancemap fora priorobject/shapemodel) (Kolmogorovand Boykov, 2005).

    Note that (Kolmogorov and Boykov, 2005) charac-terized the boundary component of the energy that canbe minimized with graph cuts through geometric termslike length / area and ux. Note that these two geomet-ric concepts have somewhat dual effect on the segmen-tation boundary. While optimization of length causesshrinking of the boundary, optimization of ux mayhave the opposite ballooning effect on the boundary.The discovery of ux component in Kolmogorov andBoykov (2005) allows to counteract the shrinking

    bias that graph cut techniques were previously criti-cized for. The shrinking bias can be also addressedby integrating an appropriate regional bias.

    The t-links are a tool for implementing region-basedtopological constraints and regional bias in the rstterm of energyformulation(2). Constraining thesearchspace of cuts isa critical feature of our approach and theregional bias is an option that should be used carefullydesigned dependingon a specic domain. For example,intensity-based regional models in (5,6) may not helpto segment an object if its intensity properties are verysimilar to the background as in Fig. 12(a) (also knownas camouage problem). In this case, intensity dis-tributions (5,6) are almost identical and relative weightfor that type of regional bias in the energy should beset close to zero.

    Our articial example in Fig. 4 shows that a regionalbias may work well even when there is a signicantoverlap in the intensity properties of the object andbackground. In real images, however, it may be hard toset the right value of and the balance between the re-gion andboundarytermsof energy(2) might beelusive.

    Robustness of the regional term can be improved if simple gray-scale intensity histograms in (5,6) are re-

    placed with more elaborate regional bias models based

    on color, local texture, etc. For example, in Fig. 13 weused

    R p( A p) = ln Pr( I p | A p)

    based on multi-dimensional vector I p of intensities ob-tainedat each pixel in a sequence of timely acquisitionsas a contrast agent propagatedthrough different tissues.Distinct functional properties of these tissues gave in-formative regional models based on multi-dimensionalintensity distributions. Importance of the careful esti-mation of regional intensity properties of segments isalso emphasized in (Blake et al., 2004; Rother et al.,2004) where object/background color distribution isestimated using Gaussian Mixture Models with fairlyimpressive results.

    One additional option studied in (Blake et al., 2004;Rother et al., 2004) is that the parameters of the energy(e.g. regional bias) can be iteratively re-estimated. Itis also possible to use new correcting seeds duringthe editing stage (see Section 2.6) to learn the energyparameters. A new global optima with updatedregional term can be efciently recomputed from aprevious solution using algorithmic ideas similar tothose in Section 2.6. Efcient updating of n-links was

    studied in Kohli and Torr (2005) in the context of dynamic applications.Finally, there is an interesting way to generalize

    our segmentation framework to multi-object extractionproblem. Figures 12 (b) and 13(b) show examples of multi-label segmentations obtained in a sequence of consecutive independent binary steps based on con-strained s-t graph cuts described in Section 2. In fact,the multi-way graph-cut algorithms in Boykov et al.(2001) can be used to extract multiple objects at once.In practice, extracting multiple objects simultaneouslyshould be more robust, convenient, and faster whencompared to iterative binary approach.

    Our graph-cuts approach can be extended to simul-taneous multi-object extraction as follows. A graphshould have multiple terminals representing each ob- ject of interest. 16 A user can appropriately place seeds(hard constraints) of different colors to provide high-level contextual information andto constrain thesearchspace of feasible multi-way cuts. In fact, seeds of each color would represent innity-cost t-links to thecorresponding graph terminal (object label). N-linksbetween image pixels can still represent a segmen-tation boundary cost. The multi-way cut approach in

  • 7/31/2019 Graph Cut Segment

    21/23

    Graph Cuts and Efcient N-D Image Segmentation 129

    Boykov et al. (2001) allows to set different disconti-

    nuity penalties B p,q ( A p , Aq ) between a given pair of neighboring pixels depending on a specic pair of la-bels assigned to them. A minimum cost multi-way cutis a graph partitioning that separates all terminals intoisolated segments by severing edges of the smallestpossible accumulative cost. Innity cost t-links guar-anteethat multi-waypartitioning of a constrained graphdescribed above infers an optimal multi-object imagesegmentation. The corresponding segmentation energyis a direct generalization of (2) to a multi-label setting

    E ( A) = pP

    R p( A p)

    +{ p,q} N

    B p,q ( A p , Aq ) A p= Aq (12)

    where A {obj 1 , obj 2 , . . . , obj N }|P | is a multi-objectimage labeling.

    Unfortunately, multi-way cut problem is NP-hard.Thus, we loose guaranteed global optimality that wehave in the binary object/background setting of Sec-tion 2. At the same time, -expansion algorithm inBoykov et al. (2001) can efciently nd provably goodapproximating solutions to the multi-way cut problemabove. In practice, it has been shown that these ap-proximatingsolutions have sufciently high quality for many problems in vision(e.g.for stereo(ScharsteinandSzeliski,2002;Szeliski and Zabih, 1999)). Constrainedmulti-way graph cuts is a promising natural extensionof our object/background segmentation approach to amore general multi-object extraction problem.

    Acknowledgements

    We would like to thank (in no particular order) Marie-Pierre Jolly, Christophe Chefdhotel (Siemens Re-search, NJ), Vladimir Kolmogorov (University Col-lege London, UK), Andrew Blake, Carsten Rother

    (Microsoft Research, UK), Philip Torr (OxfordBrookes University, UK), Daniel Cremers (Universityof Bonn, Germany), Nikos Paragios (Ecole Centralede Paris, France), Kaleem Siddiqi (McGill University,Canada), David Fleet (University of Toronto, Canada),Davi Geiger (New York University, NY), and OlgaVeksler (University of Western Ontario) for numer-ous scientic discussions and for their motivation thatgreatly helpedin ourresearch. Cheng-Chung Liang andBernard Geiger (Siemens Research, NJ) contributed3D object rendering tools. We also thank our secondanonymous reviewer for a very careful reading of the

    paper. His comments (besides revealing a large number

    of typos, including some in our proofs) motivated us toadd a detailed discussion of limitations and extensionsof our method in Section 4. Our work was supported bytheresearch grant from Natural Sciences andEngineer-ingResearch Council of Canada, by research funding atSiemens Corporation (Germany), and by contributionsfrom Microsoft Research Lab in Cambridge, UK.

    Notes

    1. Preliminary version of this work appeared in Boykov and Jolly(2001).

    2. One survey can be found in Boykov and Veksler (2006).

    3. Impossibility theorem in Kleinberg (2002) shows that no cluster-ing algorithm can simultaneously satisfy three basic axioms onscale-invariance , richness , and consistency . Thus, anyclusteringmethod has some bias.

    4. One interesting exception is a recent random walker approach toimage segmentation (Grady, 2005) based on optimization prob-lem over a nite set of real-valued variables.

    5. It seems feasible that ow-recycling (Kohli and Torr, 2005) andcut recycling (Juan and Boykov, 2006) can be combined for amultiplied effect.

    6. A patent application jointly with G. Paladini (Siemens Corp.Research, Princeton, NJ) is pending.

    7. The relationship of binary graph cuts with piece-wice constantMumford-Shah model is obvious from the characterization of the segmentation boundary cost in terms of geometric length

    (Boykov and Kolmogorov, 2003).The general form of Mumford-Shah functional can be seen as a continuous counterpart for thepiece-wice smooth discrete MRF models which, for example,can be optimized with non-binary graph cuts methods like -expansion (Boykov et al., 2001).

    8. Some versions of region growing method (Grifn et al., 1994;Reese, 1999) use region-based hard constraints similar to ours.

    9. A typed or hand-written letter is an example of a binary image.Restoration of such an image may involve removal of a salt andpepper noise.

    10. In contrast, each pair of connected nodes on a directed graph islinked by two distinct (directed) edges ( p, q ) and (q , p). Suchdirected edges can be very useful in some applications (see Sec-tion 2.7).

    11. In fact, non-negativity assumption is not essential. For example,

    atany pixel p we canalways increasethe valuesof R p(obj) and R p(bkg) by the same amount without changing the minimumof energy (2). The actual limitations for discontinuity penalties B are discussed in Section 2.7.

    12. Note that adding correcting object seeds to pixels that are al-ready segmented as object cannot changethe optimal segmen-tation. Indeed, such hard constraints are already satised by thecurrent optimal solution. The same argument applies to correct-ing background seeds.

    13. Note that the minimum cut severs exactly one of two t-links atpixel p .

    14. Courtesy of the Library of Congress collection of slides by S.M.Prokudin-Gorskii made in Russia in the early 20th century. Thewhole collection is available at www.loc.gov/exhibits/empire/

  • 7/31/2019 Graph Cut Segment

    22/23

    130 Boykov and Funka-Lea

    15. This point wasraised by Davi Geiger(NY University) in a privatecommunication.

    16. In Section 2 we useonly twographterminals s and t representingobject and background labels.

    References

    Amini, A.A., Weymouth, T.E., and Jain, R.C. 1990. Using dy-namic programming for solving variational problems in vision. IEEE Transactions on Pattern Analysis and Machine Intelligence ,12(9):855867.

    Appleton, B. and Talbot, H. 2006. Globally minimal surfaces bycontinuous maximal ows. IEEE transactions on Pattern Analysisand Pattern Recognition (PAMI) , 28(1):106118.

    Blake, A., Rother, C., Brown, M., Perez, P., and Torr, P. 2004. In-teractive image segmentation using an adaptive gmmrf model. In European Conferenceon Computer Vision(ECCV) , Prague, ChechRepublic.

    Boykov, Y. and Kolmogorov, V. 2003. Computing geodesics andminimal surfaces via graph cuts. In International Conference onComputer Vision , vol. I, pp. 2633.

    Boykov, Y., Veksler, O., and Zabih, R. 1998. Markov random eldswith efcient approximations. In IEEE Conference on Computer Vision and Pattern Recognition , pp. 648655.

    Boykov, Y. and Jolly, M.-P. 2001. Interactive graph cuts for optimalboundary & region segmentation of objects in N-D images. In International Conference on Computer Vision , vol. I, pp. 105 112, July 2001.

    Boykov, Y. and Kolmogorov, V. 2004. An experimental comparisonof min-cut/max-owalgorithms for energy minimization in vision. IEEE Transactions on Pattern Analysis and Machine Intelligence ,26(9):11241137.

    Boykov, Y., Kolmogorov, V., Cremers, D., and Delong, A. 2006.An integral solution to surface evolution PDEs via geo-cuts. In European Conference on Computer Vision , LNCS 3953, Graz,Austria, vol. III, pp. 409422.

    Boykov, Y. and Veksler, O. 2006. Graph cuts in vision and graph-ics: Theories and applications. In: N. Paragios, Y. Chen, andO. Faugeras, (Eds.), Handbook of Mathematical Models in Com-puter Vision, Springer-Verlag, pp. 7996.

    Boykov, Y., Veksler, O., and Zabih, R. 2001. Fast approximate en-ergy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence , 23(11):12221239.

    Bray, M., Kohli, P., and Torr, P.H.S. 2006. Posecut: Simultaneoussegmentation and 3D pose estimation of humans using dynamicgraph-cuts. In European Conference on Computer Vision , Graz,Austria, May 2006, (to appear).

    Caselles, V., Kimmel, R., and Sapiro, G. 1997. Geodesic active con-tours. International Journal of Computer Vision , 22(1):6179.

    Cohen, L.D. 1991. On active contour models and ballons. Computer Vision, Graphics, and Image Processing: Image Understanding ,53(2):211218.

    Cohen, L.D. and Kimmel, R. 1997. Global minimum for active con-tour models: A minimal path approach. International Journal of Computer Vision , 24(1):5778.

    Cook, W.J., Cunningham, W.H., Pulleyblank, W.R., and Schrijver,A. 1998. Combinatorial Optimization . John Wiley & Sons.

    Cox,I.J.,Rao,S.B.,and Zhong,Y. 1996. Ratioregions: a techniquefor image segmentation. In International Conference on Pattern Recognition , vol. II, pp. 557564.

    Cremers, D. 2006. Dynamical statistical shape priors for level setbased tracking. IEEE Trans. on Pattern Analysis and Machine Intelligence , (to appear).

    Cremers, D., Osher, S.J., and Soatto, S. 2006. Kernel density estima-tion and intrinsic alignment for shape priors in level set segmen-tation. International Journal of Computer Vision , (to appear).

    Falc ao, A.X., Udupa, J.K., Samarasekera, S., and Sharma, S.1998. User-steered image segmentation paradigms: Live wireand live lane. Graphical Models and Image Processing , 60:233 260.

    Felzenszwalb, P. and Huttenlocher, D. 2004. Efcient graph-basedimage segmentation. International Journal of Computer Vision ,59(2):167181.

    Ford, L. and Fulkerson, D. 1962. Flows in Networks . Princeton Uni-versity Press.

    Funka-Lea, G., Boykov, Y., Florin, C., Jolly, M.-P., Moreau-Gobard,R., Ramaraj, R., and Rinck,D. 2006. Automatic heart isolation for CT coronary visualization usinggraph-cuts.In IEEE InternationalSymposium on Biomedical Imaging , Arlington, VA, April 2006.

    Geiger, D., Gupta, A., Costa, L.A., and Vlontzos, J. 1995. Dynamicprogramming for detecting, tracking, and matching deformablecontours. IEEE Transactions on Pattern Analysis and Machine Intelligence , 17(3):294402.

    Goldberg, A.V. and Tarjan, R.E. 1988. A new approach to themaximum-ow problem. Journal of the Association for Comput-ing Machinery , 35(4):921940.

    Grady, L. 2005. Multilabel random walker segmentation usingprior models. In IEEE Conference of Computer Vision and Pat-tern Recognition , San Diego, CA, June 2005, vol. 1, pp. 763 770.

    Greig, D., Porteous, B., and Seheult, A. 1989. Exact maximum aposteriori estimation for binary images. Journal of the Royal Sta-tistical Society, Series B , 51(2):271279.

    Grifn, L.D., Colchester, A.C.F., R oll, S.A., and Studholme, C.S.1994. Hierarchical segmentation satisfying constraints. In British Machine Vision Conference , pp. 135144.

    Haralick, R.M. and Shapiro, L.G. 1992. Computer and Robot Vision .Addison-Wesley Publishing Company.

    Hochbaum, D.S. 1998. The pseudoow algorithm for the maximumow problem. Manuscript, UC Berkeley , revised 2003, Extendedabstract in: The pseudoow algorithm and the pseudoow-basedsimplex for the maximum ow problem. Proceedings of IPCO98,June 1998. Lecture Notes in Computer Science , Bixby, Boyd andRios-Mercado (Eds.) 1412, Springer, pp. 325337.

    Isard, M. and Blake, A. 1998. Active contours . Springer-Verlag.Ishikawa, H. and Geiger, D. 1998. Occlusions, discontinuities, and

    epipolar lines in stereo. In 5th European Conference on Computer Vision , pp. 232248.Ishikawa, H. and Geiger, D. 1998. Segmentation by grouping junc-

    tions. In IEEE Conference on Computer Visionand Pattern Recog-nition , pp. 125131.

    Ishikawa, H. 2003. Exact optimization for Markov Random Fieldswith convex priors. IEEE Transactions on Pattern Analysis and Machine Intelligence , 25(10):13331336.

    Jermyn, I.H. and Ishikawa, H. 1999. Globally optimal regions andboundaries. In International Conference on Computer Vision ,vol. II, pp. 904910.

    Juan, O. and Boykov, Y. 2006. Active Graph Cuts. In IEEE Con- ference of Computer Vision and Pattern Recognition , 2006 (toappear).

  • 7/31/2019 Graph Cut Segment

    23/23

    Graph Cuts and Efcient N-D Image Segmentation 131

    Kass,M., Witkin, A., and Terzolpoulos,D. 1988. Snakes: Active con-tour models. International Journal of Computer Vision , 1(4):321 331.

    Kimmel, R. and Bruckstein, A.M. 2003. Regularized Laplacian zerocrossings as optimal edge integrators. International Journal of Computer Vision , 53(3):225243.

    Kirsanov, D. and Gortler, S.J. 2004. A discrete global minimizationalgorithm for continuous variational problems. Harvard Computer Science Technical Report , TR-14-04, July 2004, (also submittedto a journal).

    Kleinberg, J. 2002. An impossibility theorem for clustering. InThe 16th conference on Neural Information Processing Systems(NIPS) .

    Kohli, P. and Torr, P.H.S. 2005. Efciently solving dynamic markovrandom elds using graph cuts. In International Conference onComputer Vision .

    Kohli, P. and Torr, P.H.S. 2006. Measuring uncertainty in graph cutsolutionsefcientlycomputing min-marginal energies usingdy-namic graph cuts. In European Conference on Computer Vision ,Graz, Austria, May 2006 (to appear).

    Kolmogorov, V. and Boykov, Y. 2005. What metrics can be approxi-mated by geo-cuts, or global optimization of length/area and ux.In International Conference on Computer Vision , Beijing, China,vol. I, pp. 564571.

    Kolmogorov, V., Criminisi, A., Blake, A., Cross, G., and Rother, C.2005. Bi-layer segmentation of binocular stereo video. In IEEE Conference of Computer Vision and Pattern Recognition , SanDiego, CA.

    Kolmogorov, V. and Zabih, R. 2002. Multi-camera scenereconstruc-tion via graph cuts. In 7th European Conference on Computer Vision , volume III of LNCS 2352 , pp. 8296, Copenhagen, Den-mark, May 2002. Springer-Verlag.

    Kolmogorov, V. and Zabih, R. 2004. What energy functions can beminimized via graph cuts. IEEE Transactions on Pattern Analysisand Machine Intelligence , 26(2):147159.

    Kumar, M.P., Torr, P.H.S., and Objcut, A.Z. 2005. In IEEE Confer-ence of Computer Vision and Pattern Recognition , pp. 1825.

    Kwatra, V., Schodl, A., Essa, I., and Bobick, A. 2003. GraphCuttextures: image and video synthesis using graph cuts. In ACM Transactions on Graphics (SIGGRAPH) , vol. 22, July 2003.

    Lempitsky, V., Boykov, Y., and Ivanov, D. 2006. Oriented visibilityfor multiview reconstruction. In European Conference on Com- puter Vision , Graz, Austria, May 2006 (to appear).

    Li, K., Wu, X., Chen, D.Z., and Sonka, M. 2006. Optimal surfacesegmentation in volumetric images-a graph-theoretic approach. IEEE transactions on Pattern Analys is and Pattern Recognition

    (PAMI) , 28(1):119134.Li, Y., Sun, J., and Shum, H.-Y. 2005. Video object cut and paste. InSIGGRAPH (ACM Transaction on Graphics) ,

    Li, Y., Sun, J., Tang, C.-K., and Shum, H.-Y. 2004. Lazy snapping.In SIGGRAPH (ACM Transaction on Graphics) .

    Lombaert, H., Sun, Y., Grady, L., and Xu, C. 2005. A multilevelbanded graph cuts method for fast image segmentation. In Inter-national Conference on Computer Vision , October 2005.

    Mortensen, E.N. and Barrett, W.A. 1998. Interactive segmentationwith intelligentscissors. Graphical Modelsand ImageProcessing ,60:349384.

    Mumford, D. and Shah, J. 1989. Optimal approximations bypiecewise smooth functions and associated variational problems.

    Comm. Pure Appl. Math. , 42:577685.Murota, K. 2003. Discrete Convex Analysis . SIAM Monographs on

    Discrete Mathematics and Applications.Osher, S. and Paragios, N. 2003. Geometric Level Set Methods in

    Imaging, Vision, and Graphics . Springer Verlag.Osher, S.J. and Fedkiw, R.P. 2002. Level Set Methods and Dynamic

    Implicit Surfaces . Springer Verlag.Reese, L.J. 1999. Intelligent paint: Region-based interactive image

    segmentation. Masters thesis, Brigham Young University.Rother, C., Kolmogorov, V., and Blake, A. 2004. Grabcut

    interactive foreground extractionusing iterated graphcuts. In ACM Transactions on Graphics (SIGGRAPH) .

    Rother, C., Kumar, S., Kolmogorov, V., and Blake, A. 2005. Digi-tal tapestry. In IEEE Conference of Computer Vision and Pattern Recognition , San Diego, CA.

    Roy, S. and Cox, I. 1998. A maximum-ow formulation of the n-camera stereo correspondence problem. In IEEE Proc. of Int. Con- ference on Computer Vision , pp. 492499.

    Sapiro, G. 2001. Geometric PartialDifferentialEquationsand Image Analysis . Cambridge University Press.

    Scharstein, D. and Szeliski, R. 2002. A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. International Journal of Computer Vision , 47(1/3):742.

    Sethian, J.A. 1999. Level Set Methods and Fast Marching Methods .Cambridge University Press.

    Shi, J. and Malik, J. 2000. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence ,22(8):888905.

    Szeliski, R. and Zabih, R. 1999. An experimental comparison of stereo algorithms. In Vision Algorithms: Theory and Practice ,number 1883in LNCS, pp. 119, Springer-Verlag, Corfu, Greece,September 1999.

    Vasilevskiy, A. and Siddiqi, K. 2002. Flux maximizing geometricows. PAMI , 24(12):15651578.

    Veksler, O. 2000. Image segmentation by nested cuts. In IEEE Conference on Computer Vision and Pattern Recognition , vol. 1,pp. 339344.

    Vogiatzis, G., Torr, P.H.S., and Cipolla, R. 2005. Multi-view stereoviavolumetricgraph-cuts. In IEEE Conference of Computer Visionand Pattern Recognition , pp. 391398.

    Wang, J., Bhat, P., Colburn, R.A., Agrawala, M., and Cohen, M.F.2005. Interactive video cutout. In SIGGRAPH (ACM Transactionon Graphics) .

    Wu, Z. and Leahy, R. 1993. An optimal graph theoretic approach todata clustering: Theory and its application to image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence ,

    15(11):11011113.Xu, N., Bansal, R., and Ahuja, N. 2003. Object segmentationusing graph cuts based active contours. In IEEE Conferenceon Computer Vision and Pattern Recognition , vol. II, pp. 46 53.

    Yezzi, A., Jr., Kichenassamy, S., Kumar, A., Olver, P., and Tannen-baum,A. 1997. A geometricsnakemodelforsegmentation ofmed-ical imagery. IEEE Transactions on Medical Imaging , 16(2):199 209.

    Zhu,S.C. and Yuille,A. 1996. Regioncompetition: Unifying snakes,region growing, and Bayes/MDL for multiband image segmenta-tion. IEEE Transactions on Pattern Analysis and Machine Intelli-gence , 18(9):884900.