arXiv:1607.01092v1 [cs.CV] 5 Jul 2016

Incorporating prior knowledge in medical image segmentation: a survey

Masoud S. Nosrati and Ghassan Hamarneh

Medical Image Analysis Lab, School of Computing Science, Simon Fraser University, 8888 University Dr., BC, Canada. {smn6, hamarneh}@sfu.ca

Abstract

Medical image segmentation, the task of partitioning an image into meaningful parts, is an important step toward automating medical image analysis and is at the crux of a variety of medical imaging applications, such as computer aided diagnosis, therapy planning and delivery, and computer aided interventions. However, the existence of noise, low contrast, and object complexity in medical images are critical obstacles that stand in the way of achieving an ideal segmentation system. Incorporating prior knowledge into image segmentation algorithms has proven useful for obtaining more accurate and plausible results. This paper surveys the different types of prior knowledge that have been utilized in different segmentation frameworks. We focus our survey on optimization-based methods that incorporate prior information into their frameworks. We review and compare these methods in terms of the types of prior employed, the domain of formulation (continuous vs. discrete), and the optimization techniques (global vs. local). We also created an interactive online database of existing works, categorized by the type of prior knowledge they use; the website is interactive so that researchers can contribute to keep the database up to date. We conclude the survey by discussing different aspects of designing an energy functional for image segmentation, open problems, and future perspectives.

Keywords: Prior knowledge, targeted object segmentation, review, survey, medical image segmentation

1. Introduction

Image segmentation is the process of partitioning an image into smaller meaningful regions based in part on some homogeneity characteristics. The goal of segmentation is to delineate (extract or contour) targeted objects for further analysis.

For example, in medical image analysis (MIA), image segmentation of organs or tissue types is a necessary first step for numerous applications, e.g. measuring tumour burden (or volume) from positron emission tomography (PET) or computed tomography (CT) scans (Hatt et al., 2009; Bagci et al., 2013), analyzing vasculature from magnetic resonance angiography (MRA) (e.g. measuring tortuosity) (Bullitt et al., 2003; Yan and Kassim, 2006), grading cancer from histopathology images (Tabesh et al., 2007), performing fetal measurements from prenatal ultrasound (Carneiro et al., 2008), performing augmented reality in robotic image-guided surgery (Su et al., 2009; Pratt et al., 2012), and building statistical atlases for population studies and voxel-based morphometry (Ashburner and Friston, 2000).

Given an input image, I, the goal of a typical image segmentation system is to assign every pixel in I a specific label, where each label represents a structure of interest. Several traditional segmentation algorithms have been proposed for assigning labels to pixels; these include thresholding (Otsu, 1975; Sahoo et al., 1988), region-growing (Adams and Bischof, 1994; Pohle and Toennies, 2001; Pan and Lu, 2007), watershed (Vincent and Soille, 1991; Grau et al., 2004; Hamarneh and Li, 2009), and optimization-based methods (Grady, 2012; McIntosh and Hamarneh, 2013a; Ulen et al., 2013). The existence of noise, low contrast, and object complexity in medical images typically causes the aforementioned methods to fail. In addition, all of these traditional methods assume that an object's entire appearance has some notion of homogeneity; however, this is not necessarily the case for complex objects (e.g. multi-region cells with membrane, nucleus, and nucleolus; or brain regions affected by the magnetic field non-uniformity of a magnetic resonance imaging (MRI) device). Many real-world objects are better described by a combination of regions with distinct appearance models. This is where more elaborate prior information about the targeted objects becomes helpful.

The majority of state-of-the-art image segmentation methods are formulated as optimization problems, i.e. energy minimization or maximum-a-posteriori estimation, mainly because of their: 1) formal and rigorous mathematical formulation; 2) availability of mathematical tools for optimization; 3) capability to incorporate multiple (competing) criteria as terms in the objective function; 4) ability to quantitatively measure the extent to which a method satisfies the different criteria/terms; and 5) ability to examine the relative performance of different solutions.

In this paper, we review the various types of prior information that are utilized in different optimization-based frameworks for the segmentation of targeted objects. Prior information can take many forms: user interaction; appearance models; boundaries and edge polarity; shape models; topology specification; moments (e.g. area/volume and centroid constraints); geometrical interaction and distance priors between different regions/labels; and atlases or pre-known models. We compare the different methods utilizing prior information in image segmentation in terms of the type of prior information utilized, the domain of formulation (continuous vs. discrete), and the optimization techniques (global vs. local) used.

The rest of the paper is organized as follows. In Section 2, we briefly review previous surveys that covered medical image segmentation (MIS) problems and justify the need for our survey. In Section 3, we review the fundamentals of optimization-based image segmentation techniques. In Section 4, we give a concrete overview of the different types of prior knowledge devised to improve image segmentation. Finally, in Section 5, we summarize our notes and elaborate on future perspectives.

2. Why yet another survey paper on MIS?

Many survey papers on the topic of medical image segmentation have appeared before; in this section, we explore the related surveys.

McInerney and Terzopoulos (1996) reviewed the development and application of deformable models to problems in medical image analysis, including segmentation, shape representation, matching, and motion tracking. Pham et al. (2000) surveyed segmentation approaches such as thresholding, region growing, and deformable models, with an emphasis on the advantages and disadvantages of these methods for medical imaging applications. Olabarriaga and Smeulders (2001) presented a review of user interaction in image segmentation; their goal was to identify patterns in the use of user interaction and to propose criteria for evaluating interactive segmentation techniques. All of these surveys are limited to specific segmentation techniques that adopted a few basic priors such as intensity and texture. In addition, the aforementioned surveys (McInerney and Terzopoulos, 1996; Pham et al., 2000; Olabarriaga and Smeulders, 2001) are more than 10 years old, and many important new developments have appeared since then.

Elnakib et al. (2011) and Hu et al. (2009) focused on appearance and shape features and overviewed the most popular medical image segmentation techniques, such as deformable models and atlas-based segmentation. Heimann and Meinzer (2009) reviewed several statistical shape modelling approaches, structuring their survey around shape representation, shape correspondence, model construction, local appearance models, and search algorithms. Peng et al. (2013) presented five categories of graph-based discrete optimization strategies for MIS from a graph-theoretical perspective, and hence did not discuss specific forms of priors the way we do in this survey. McIntosh and Hamarneh (2013b) surveyed the field of energy minimization in MIS and provided an overview of the energy function, the segmentation and image representation, the training data, and the minimizers. Many advanced priors proposed in recent years are not covered by the aforementioned surveys.

Some surveys are focused on specific modalities only. As an example, Noble and Boukerroui (2006) reviewed methods for ultrasound segmentation in different medical applications, including cardiology, breast, and prostate. In (Sharma and Aggarwal, 2010), automated segmentation methods were discussed in detail, specifically in the context of CT and MR images. All the methods discussed in that survey adopted limited forms of priors such as appearance, edge, and shape.

Other surveys focused on specific organs. For example, Lesage et al. (2009) reviewed the literature on 3D vessel lumen segmentation, while Petitjean and Dacher (2011) reviewed fully and semi-automated methods for segmenting short-axis images of cardiac cine MRI sequences. They proposed a categorization of cardiac segmentation methods based on what level of external information (prior knowledge) is required and how it is used to constrain segmentation. However, the priors discussed in Petitjean and Dacher (2011) are limited to shape and appearance information in deformable contour and atlas-based methods.

The recent paper by Grady (2012) is the most similar to this survey and includes a solid introduction to targeted object segmentation. However, it focused on graph-based methods only and studied only a subset of the priors we cover in this report.

In this survey, we categorize prior information based on its type and discuss how these priors have been encoded into segmentation frameworks in both continuous and discrete settings. The prior information discussed in this survey has been proposed in both the computer vision and medical image analysis communities. We hope this survey will be useful for the MIA community and for users of medical image segmentation techniques by summarizing: what types of priors exist so far; which ones can be useful for one's own targeted segmentation problems; which methods (papers) have already incorporated such priors in their formulation; and the approaches adopted for incorporating certain priors (with the associated complexities and trade-offs). We also hope that by surveying what has been done, researchers can more easily identify existing "gaps" and "weaknesses", e.g. identifying other important priors that have been missed so far and require future research, or proposing better techniques for incorporating known priors.

3. Fundamentals of image segmentation

3.1. Traditional image segmentation methods

As mentioned in Section 1, several traditional segmentation algorithms have been proposed in the literature, including thresholding, region-growing, and watershed.

Thresholding is the simplest segmentation technique: for the simple case of binary segmentation (foreground vs. background), the input image is divided into two regions, pixels with values below a threshold and pixels with values above it. Determining more than one threshold value is called multithresholding (Sahoo et al., 1988). In thresholding, segmentation results depend on the image properties and on how the threshold is chosen (e.g. using Otsu's method (Otsu, 1975)). Thresholding techniques do not take into account the spatial relationships between features in an image and are thus very sensitive to noise.
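Otsu's method mentioned above can be sketched in a few lines. The following is an illustrative implementation (ours, not taken from any of the surveyed papers) that picks the threshold maximizing the between-class variance of the two classes it induces:

```python
# Minimal sketch of Otsu's method: choose the threshold t that maximizes
# the between-class variance w_b * w_f * (mu_b - mu_f)^2 of the background
# (<= t) and foreground (> t) classes.

def otsu_threshold(pixels, levels=256):
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    total_sum = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_b = sum_b = 0
    for t in range(levels):
        w_b += hist[t]               # background pixel count (values <= t)
        if w_b == 0 or w_b == total:
            continue
        sum_b += t * hist[t]
        w_f = total - w_b            # foreground pixel count (values > t)
        mu_b = sum_b / w_b
        mu_f = (total_sum - sum_b) / w_f
        var_between = w_b * w_f * (mu_b - mu_f) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Two well-separated intensity populations: threshold falls between them.
img = [10, 12, 11, 13, 200, 210, 205, 198]
assert 13 <= otsu_threshold(img) < 198
```

As the text notes, the chosen threshold depends only on the intensity histogram, with no spatial information, which is the source of the noise sensitivity discussed above.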

Region growing methods start from a set of seed pixels defined by the user and examine neighbouring pixels of the seeds to determine whether the neighbouring pixels should be added to the region, preserving some uniformity and connectivity criteria (Adams and Bischof, 1994). Different variations of this technique have been applied to different medical image modalities (Pohle and Toennies, 2001; Pan and Lu, 2007). Region growing methods consider the neighbourhood information of pixels and are hence more robust to noise than thresholding methods. However, these methods are sensitive to the chosen "uniformity predicate" (a logical statement for evaluating the membership of a pixel) and the corresponding threshold (Adams and Bischof, 1994), the location of the seeds, and the type of pixel connectivity.
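A minimal sketch of the seeded region growing idea described above (illustrative only; the particular uniformity predicate, tolerance `tol`, and 4-connectivity are our own choices, not prescribed by the cited works):

```python
# Seeded region growing on a 2D grid: 4-connected neighbours are added while
# they satisfy the uniformity predicate |I(q) - mean(region)| <= tol.
from collections import deque

def region_grow(img, seed, tol=10.0):
    h, w = len(img), len(img[0])
    region = {seed}
    total = float(img[seed[0]][seed[1]])   # running intensity sum
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            q = (y + dy, x + dx)
            if 0 <= q[0] < h and 0 <= q[1] < w and q not in region:
                # uniformity predicate against the current region mean
                if abs(img[q[0]][q[1]] - total / len(region)) <= tol:
                    region.add(q)
                    total += img[q[0]][q[1]]
                    queue.append(q)
    return region

img = [[10, 11, 90],
       [12, 10, 95],
       [11, 13, 92]]
seg = region_grow(img, seed=(0, 0), tol=10)
assert seg == {(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1)}
```

Changing the seed to the bright right-hand column, or loosening `tol`, yields a different region, illustrating the sensitivity to the seed location and the uniformity predicate noted above.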

In conventional watershed algorithms (Vincent and Soille, 1991), an image may be seen as a topographic relief, where the intensity value of a pixel is interpreted as its altitude in the relief. Suppose that the entire topography is flooded with water through virtual "holes" at the bottom of the basins; then, as the water level rises and water from different basins is about to merge, a dam is built to prevent the merging. These dam boundaries correspond to the watershed lines, or object boundaries. An improved version of the watershed technique has been used to segment brain MR images (Grau et al., 2004; Hamarneh and Li, 2009). In practice, watershed produces over-segmentation due to noise or local irregularities in the image. Marker-based watershed (Grau et al., 2004; Vincent, 1993; Beucher, 1994) prevents over-segmentation by limiting the number of regional minima. However, this method is also sensitive to noise.

Having prior knowledge about the objects of interest and incorporating this knowledge into the segmentation framework helps us overcome the shortcomings associated with the traditional methods and obtain more plausible results. As mentioned earlier, formulating image segmentation as an optimization problem allows for the use of multiple criteria and prior information as energy terms in an objective functional. In the following section, we briefly review the fundamentals of optimization-based techniques for medical image segmentation.

3.2. Optimization-based image segmentation

Given an image $I : \Omega \subset \mathbb{R}^n \to \mathbb{R}^m$, image segmentation partitions $\Omega$ into $k$ disjoint regions $S = \{S_1, \cdots, S_k\} \subset \mathcal{S}$ such that $\Omega = \cup_{i=1}^{k} S_i$ and $S_i \cap S_j = \emptyset$ for all $i \neq j$, where $\mathcal{S}$ is the solution space. The aforementioned partitioning is referred to as a crisp binary (when $k = 2$) or multi-region ($k > 2$) segmentation. In a fuzzy or probabilistic segmentation, each element in $\Omega$ (e.g. a pixel) is assigned a vector $p = [p_1, p_2, \cdots, p_k]$ of length $k$ quantifying the memberships or probabilities of belonging to each of the $k$ classes, where $p_i \geq 0$ for $i = 1, \cdots, k$ and $\sum_{i=1}^{k} p_i = 1$.

This task of image partitioning can be formulated as an energy minimization problem. An energy function, $E : \mathcal{S} \to \mathbb{R}$, usually consists of several objectives that are divided into two main categories: regularization terms, $R_i : \mathcal{S} \to \mathbb{R}$, and data terms, $D_i : \mathcal{S} \to \mathbb{R}$. The regularization terms correspond to priors on the space of feasible solutions and penalize any deviation from the enforced prior, such as shape, length, etc. The data terms measure how strongly a pixel should be associated with a specific label/segment. These objectives (regularization and data terms) can then be scalarized as:

$$E(S) = \lambda \sum_{i} R_i(S) + \sum_{j} D_j(S; I) \, . \quad (1)$$

The optimization problem is then formulated as:

$$S^* = \arg\min_{S} E(S) = \arg\min_{S} \, \lambda R(S) + D(S; I) \, , \quad (2)$$

where $S^* = \{S^*_1, \cdots, S^*_k\}$ is the optimal solution and, for simplicity, $R$ and $D$ represent all the regularization and data terms, respectively. $\lambda$ is a constant weight that balances the contribution/importance of the data term and the regularization term in the minimization problem.

One example of such energy is written as:

$$\{S^*_1, \cdots, S^*_k\} = \arg\min_{S_1, \cdots, S_k} \, \lambda \sum_{i=1}^{k} \int_{\partial S_i} dx + \sum_{i=1}^{k} \int_{S_i} D_i(x; I) \, dx \, , \quad (3)$$

where the first term (regularization term) measures the perimeter of the segmented regions $S_i$ and penalizes large perimeters, thus favouring smooth boundaries. $D_i(x) : \Omega \to \mathbb{R}$, associated with region $S_i$, measures how strongly pixel $x \in \Omega$ should be associated with region $S_i$. In Section 4.3, we discuss different types of regularization terms used in image segmentation problems.
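As a toy illustration of how an energy like (3) is evaluated in practice, the following sketch (our own example, not from the surveyed literature) discretizes the perimeter term as a count of label discontinuities between 4-connected neighbours; the `data_cost` values are made up for the example:

```python
# Discretized two-term segmentation energy: lam * perimeter + sum of
# per-pixel data costs, for a labeling of a small 2D grid.

def energy(labels, data_cost, lam=1.0):
    h, w = len(labels), len(labels[0])
    perimeter = 0
    data = 0.0
    for y in range(h):
        for x in range(w):
            data += data_cost[labels[y][x]][y][x]
            # count each horizontal/vertical label discontinuity once
            if x + 1 < w and labels[y][x] != labels[y][x + 1]:
                perimeter += 1
            if y + 1 < h and labels[y][x] != labels[y + 1][x]:
                perimeter += 1
    return lam * perimeter + data

# data_cost[i][y][x]: cost of assigning pixel (y, x) to region i
data_cost = [
    [[0, 0, 9], [0, 0, 9]],   # region 0 is cheap on the left
    [[9, 9, 0], [9, 9, 0]],   # region 1 is cheap on the right
]
smooth = [[0, 0, 1], [0, 0, 1]]   # short boundary, agrees with data
noisy  = [[0, 1, 1], [0, 0, 1]]   # longer boundary and one bad data term
assert energy(smooth, data_cost) < energy(noisy, data_cost)
```

The regularization term alone would favour a single region; the data term alone could produce a noisy labeling; the weight `lam` trades the two off, exactly as in (2).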

An optimization-based image segmentation problem can also be formulated as a maximization problem:

$$S^* = \arg\max_{S} P(S|I) \, , \quad (4)$$

where $S^*$ is the optimal segmentation. Using Bayes' theorem, (4) can be written as:

$$S^* = \arg\max_{S} \frac{P(I|S) P(S)}{P(I)} \equiv \arg\max_{S} P(I|S) P(S) \, . \quad (5)$$

In (4) and (5), $P(S|I)$ is the posterior probability that defines the degree of belief in $S$ given the evidence $I$ (or some features of $I$), $P(I|S)$ is the image likelihood measuring the probability of the evidence in $I$ given the segmentation $S$, and $P(S)$ is the prior probability that indicates the initial (prior to observing $I$) degree of belief in $S$. Maximizing the posterior probability (5) is equivalent to minimizing its negative logarithm:

$$S^* = \arg\min_{S} \, -\log P(I|S) - \log P(S) \, . \quad (6)$$

The probability (6) and energy (2) notations are related via the Gibbs (Boltzmann) distribution. Ignoring the Boltzmann constant and the thermodynamic temperature (as they do not affect the optimization) and substituting $P(I|S) \propto e^{-D(S;I)}$ and $P(S) \propto e^{-\lambda R(S)}$ into (6), we obtain (2).
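The equivalence between maximizing (5) and minimizing (6) can be checked numerically; in this toy snippet the candidate segmentations and their probabilities are entirely hypothetical:

```python
import math

# Toy check that maximizing P(I|S)P(S), as in (5), selects the same
# candidate as minimizing -log P(I|S) - log P(S), as in (6).

likelihood = {"A": 0.6, "B": 0.3, "C": 0.1}   # P(I|S), hypothetical values
prior      = {"A": 0.2, "B": 0.7, "C": 0.1}   # P(S),   hypothetical values

best_map = max(likelihood, key=lambda s: likelihood[s] * prior[s])
best_nll = min(likelihood,
               key=lambda s: -math.log(likelihood[s]) - math.log(prior[s]))
assert best_map == best_nll
```

Because $-\log$ is strictly decreasing, the argmax of the product and the argmin of the summed negative logs always coincide; the logs are preferred in practice for numerical stability.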

To avoid terminological confusion, we emphasize that, to improve a segmentation, prior knowledge can be incorporated into one or both of the regularization and data terms. Hence, the term prior knowledge itself should not be confused with the prior probability in (5).


3.3. Domain of formulation: continuous vs. discrete

In general, a segmentation problem can be formulated in a spatially discrete or continuous domain. The community that advocates continuous methods assumes that the world we live in is a continuous world (continuous $\Omega$). However, images captured by digital cameras are discrete both in space and in color/intensity. The discretization in space is called sampling (discrete $\Omega$) and the discretization in color/intensity or value space is called quantization. Given this categorization, we have four different cases for image representation (Figure 1).

The energy function describing a segmentation problem can also be formulated in a discrete or continuous domain. Depending on the solution space (discrete vs. continuous) and the energy values, four possible cases can be considered for an energy functional (Figure 2). In the spatially discrete setting, the energy function is defined over a set of finite variables (nodes $P \subset \Omega$ and edges), leading to the adoption of graphical models (Wang et al., 2013). One of the most commonly used graphical models is the Markov random field (MRF) (Wang et al., 2013). In MRF formulations, solutions are often calculated using graph cut methods, e.g. max-flow/min-cut algorithms or graph partitioning methods. Conversely, in the spatially continuous setting, energy functionals are continuous and so are the optimality conditions, which are written in terms of a set of partial differential equations (PDEs). The minimization problem in (3) is a continuous version of a multi-region segmentation functional, often called the minimal partition problem in the PDE community (Nieuwenhuis et al., 2013). Note that in Figure 2, the objective function is a cost or an energy function that has to be minimized. Nevertheless, an objective function can also be a fitness or utility function that has to be maximized.

In the discrete setting, the segmentation task usually begins with an undirected graph, $G(P, E)$, composed of vertices $P$ and undirected edges $E$. Each node of the graph ($p \in P$) represents a random variable ($f^i_p$) taking on different labels ($i \in L = \{l_1, \cdots, l_k\}$), and each edge encodes the dependency between neighbouring variables. The optimization problem corresponding to (3) in the discrete domain is:

$$\min_{f} \, \sum_{pq \in N^i} V(f^i_p, f^i_q) + \sum_{p \in P} D_p(f_p) \quad (7)$$

$$\text{s.t.} \quad \sum_{i \in L} f^i_p = 1, \; \forall p \in P \, ,$$

where $V$ is the regularization term (pairwise term) that encourages spatial coherence by penalizing discontinuities between neighbouring pixels, $D$ is the data penalty term (unary term), $f \in \mathbb{B}^{L \times P}$ are the binary variables ($f^i_p = 1$ if pixel $p \in P$ belongs to region $i \in L$ and $f^i_p = 0$ otherwise), and $N^i$ is the neighbourhood, typically defined as nearest-neighbour grid connectivity.
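As a concrete, illustrative instance of (7), the sketch below uses a Potts pairwise term $V(f_p, f_q) = [f_p \neq f_q]$ on a 1D chain and minimizes the energy approximately with iterated conditional modes (ICM). ICM is our choice for brevity only; it is not one of the graph-cut solvers discussed here, which would optimize this two-label energy exactly.

```python
# Potts-model instance of the discrete energy (7) on a 1D chain, minimized
# approximately by iterated conditional modes (ICM): each pixel greedily
# takes the label minimizing its local unary + pairwise cost.

def potts_energy(f, unary, lam=1.0):
    n = len(f)
    e = sum(unary[p][f[p]] for p in range(n))                 # data terms
    e += lam * sum(f[p] != f[p + 1] for p in range(n - 1))    # Potts terms
    return e

def icm(unary, labels, lam=1.0, sweeps=10):
    # initialize from the unary terms alone
    f = [min(range(labels), key=lambda i: unary[p][i]) for p in range(len(unary))]
    for _ in range(sweeps):
        for p in range(len(f)):
            def local(i):
                c = unary[p][i]
                if p > 0:
                    c += lam * (i != f[p - 1])
                if p < len(f) - 1:
                    c += lam * (i != f[p + 1])
                return c
            f[p] = min(range(labels), key=local)
    return f

# Noisy 1D signal: the unary term prefers label 1 at position 2,
# but the smoothness (pairwise) term overrules it.
unary = [[0, 2], [0, 2], [1.5, 1], [0, 2], [0, 2]]
f = icm(unary, labels=2, lam=1.0)
assert f == [0, 0, 0, 0, 0]
```

The initialization `[0, 0, 1, 0, 0]` has energy 3.0, while the smoothed result has energy 1.5, showing the pairwise term doing its job of penalizing discontinuities.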

There are several advantages and drawbacks associated with discrete and continuous methods:

Figure 1: Image as a mapping $I(x, y)$: continuous vs. discrete domain and image values. (Panels (a)-(d) show the four combinations of continuous/discrete domain and range.)

Figure 2: Energy function $E(S)$: continuous vs. discrete. $\mathcal{S}$ is the space of possible segmentations. (Panels (a)-(d) show the four combinations of continuous/discrete domain and range.)

• Parameter tuning: in the continuous domain, PDE-based approaches typically require setting a step size during the optimization procedure. More formally, in the PDE community, the Euler-Lagrange equation provides a necessary condition for a stationary point of the energy functional. Let $u$ be a differentiable labeling function in a continuous domain and $E(u)$ be an energy functional. Then, the Euler-Lagrange equation applied to $E$ is:

$$\frac{\partial E}{\partial u} - \frac{d}{dx} \left( \frac{\partial E}{\partial u_x} \right) - \frac{d}{dy} \left( \frac{\partial E}{\partial u_y} \right) = 0 \, , \quad (8)$$

where $u_x$ and $u_y$ are the derivatives of $u$ in the $x$ and $y$ directions, respectively. A minimizer of $E$ may be computed via the steady-state solution of the following update equation:

$$\frac{\partial u}{\partial t} = -\frac{\partial E}{\partial u} \, , \quad (9)$$

where $\partial t$ is an artificial time step size. A step size that is too large leads to a non-optimal solution and numerical instability, while a step size that is too small increases the convergence time. One way to ensure numerical stability during the optimization is to place an upper bound on the time step using the Courant-Friedrichs-Lewy (CFL) condition (Courant et al., 1967). Under some conditions, optimal step sizes may be computed automatically, as proposed by Pock and Chambolle (2011). In the discrete domain, on the other hand, graph-cut-based methods do not require such parameter tuning and have proven to be numerically stable.

Note that other parameters in the segmentation energy function, including the weighting parameters that balance the energy terms (e.g. $\lambda$ in (2)) and the hyper-parameters within each energy term or objective (e.g. the number of histogram bins used to calculate the regional/data term), are common to continuous and discrete approaches. These parameters can be set based on training data (learning-based) (Gennert and Yuille, 1988; McIntosh and Hamarneh, 2007) or based on the image content (Rao et al., 2010).
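The step-size trade-off in (9) can be seen on a toy one-dimensional energy; the quadratic $E(u) = \frac{1}{2}(u - c)^2$ below is our own example, not one from the literature:

```python
# Explicit-Euler gradient descent u <- u - dt * dE/du for the toy per-pixel
# energy E(u) = 0.5 * (u - c)^2, whose minimizer is u = c and whose gradient
# is dE/du = u - c. The iteration is stable only for 0 < dt < 2.

def descend(u0, c, dt, iters=100):
    u = u0
    for _ in range(iters):
        u = u - dt * (u - c)   # one explicit Euler step of du/dt = -dE/du
    return u

c = 3.0
small = descend(0.0, c, dt=0.5)   # stable: residual shrinks by 0.5 per step
large = descend(0.0, c, dt=2.5)   # unstable: residual grows by 1.5 per step
assert abs(small - c) < 1e-6
assert abs(large - c) > 1e6
```

Here the residual obeys $u_{n+1} - c = (1 - dt)(u_n - c)$, so the iteration converges iff $|1 - dt| < 1$; CFL-style bounds generalize exactly this kind of stability limit to PDE discretizations.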

• Termination criterion: While graph-based methods have an exact termination criterion, finding a general-purpose termination criterion for PDE-based methods is difficult. Strategies for stopping the optimization procedure include performing a fixed number of iterations and/or iterating until the change in the solution or energy falls below a predefined threshold.

• Metrication error: Metrication error, also known as grid bias, refers to the artifacts that appear in graph-based segmentation methods due to penalizing region boundaries only across axis-aligned edges. Figure 3 compares the discrete and continuous versions of a max-flow algorithm. As seen in Figure 3, the contours obtained by graph cuts are noticeably blocky in areas with weak regional cues (weak data term), while the contours obtained by the continuous method are smooth. The discrete nature of graph-based methods makes it difficult to efficiently implement a convex regularizer like total variation in the discrete domain. Metrication error can be reduced in graph-based methods by increasing the graph connectivity, e.g. (Boykov and Kolmogorov, 2003), but that also increases memory usage and computation time. In contrast, within the continuous domain there is no such limitation, and regularizers can be implemented efficiently, which makes PDE approaches free from metrication error. Note that although approaches with continuous energy formulations do not induce metrication errors, due to the discrete nature of digital images, all continuous operations are approximated by their discrete versions in the implementation stage.

Figure 3: Metrication artifacts. Brain segmentation using (a) the classical max-flow algorithm or graph cuts (GC) and (b) combinatorial continuous max-flow (CCMF) (Couprie et al., 2011). (c,e) Zoomed regions of (a). (d,f) Zoomed regions of (b). (Images adapted from Couprie et al., 2011.)

• Parallelization: Unlike PDE approaches, which are easily parallelizable on GPUs, graph-based techniques are not straightforward to parallelize. As an example, max-flow/min-cut, a core algorithm of many state-of-the-art graph-based segmentation methods, is a P-complete problem and hence probably not efficiently parallelizable (Goldschlager et al., 1982; Nieuwenhuis et al., 2013), for two reasons: (1) augmenting-path operations in min-cut/max-flow algorithms are interdependent, as different augmentation paths can share edges; (2) the updates of the edge residuals have to be performed simultaneously in each augmentation operation, as they all depend on the minimum capacity within the augmentation path (Nieuwenhuis et al., 2013). Several attempts have focused on parallelizing the max-flow/min-cut computation. Push-relabel algorithms (Boykov et al., 1998; Delong and Boykov, 2008) relaxed the first issue mentioned above, but the update operations are still interdependent. Other techniques split the graph into multiple parts and obtain the global optimum by iteratively solving sub-problems in parallel (Strandmark and Kahl, 2010; Liu and Sun, 2010), while Shekhovtsov and Hlavac (2013) combined the path-augmentation and push-relabel techniques.


• Memory usage: With respect to memory consumption, continuous optimization methods are often the winner. While continuous methods require only a few floating-point values per pixel, graphical models require explicit storage of the edges as well as one floating-point value per edge. This difference becomes important for very large images, where a huge number of graph edges must be stored, e.g. microscopy images with hundreds of millions of pixels, and 3D volumes (Appleton and Talbot, 2006).

• Runtime: The runtime variance of graph-based methods is higher than that of PDE-based methods. For example, considering α-expansion (Boykov et al., 2001), a popular multi-label optimization technique, the number of max-flow problems that need to be solved depends strongly on the input image and the chosen label order. In addition, the number of augmentation steps needed to solve a max-flow problem depends on the graph structure and edge residuals (Nieuwenhuis et al., 2013). On the other hand, PDE-based methods have less runtime variance as they perform the same computation steps on each pixel.

For more qualitative and quantitative comparisons between the continuous and discrete domains, refer to (Nieuwenhuis et al., 2013; Couprie et al., 2011; Nosrati and Hamarneh, 2014).

3.4. Optimization: convex (submodular) vs. non-convex (non-submodular)

In the continuous domain, an energy function may be classified as non-convex, convex, pseudoconvex or quasiconvex (Figure 4). Below, we define each of these terms mathematically.

An energy function E : S → R is convex if

• the energy domain S (or the solution space) is a convex set, and

• ∀S1, S2 ∈ S and 0 ≤ λ ≤ 1,

E(λS1 + (1 − λ)S2) ≤ λE(S1) + (1 − λ)E(S2). (10)

A set S is a convex set if S1, S2 ∈ S and 0 ≤ λ ≤ 1 ⇒ λS1 +

(1 − λ)S2 ∈ S. If E is differentiable at S1 ∈ S, E is said to be pseudoconvex at S1 if

∇E(S1) · (S2 − S1) ≥ 0, S2 ∈ S ⇒ E(S2) ≥ E(S1). (11)

We call E a quasiconvex function if

• the energy domain S is a convex set, and

• the sub-level sets Sα = {S ∈ S | E(S) ≤ α} are convex for all α. (12)

Pseudoconvex functions share the property of convex functions that, if ∇E(S) = 0, then S is a global minimum of E. Pseudoconvexity is strictly weaker than convexity; in fact, every convex function is pseudoconvex. For example, E(S) = S + S³ is pseudoconvex and non-convex. Also, every pseudoconvex function is quasiconvex, but the converse does not hold, e.g. E(S) = S³ is quasiconvex and not pseudoconvex.

Figure 4: One-dimensional examples of (a) a non-convex, (b) a convex, (c) a pseudoconvex and (d) a quasiconvex function E(x). Red and green dots indicate the local minima and the global solutions, respectively.

In this paper we focus on convex and non-convex optimization problems; more details on quasiconvex problems can be found in (dos Santos Gromicho, 1998). In the continuous domain, an optimization problem must meet two conditions to be a convex optimization problem: 1) the objective function must be convex, and 2) the feasible set must also be convex. The drawbacks associated with non-convex problems are that, in general, there is no guarantee of finding the global solution and results depend strongly on the initial guess/initialization. In contrast, for a convex problem, a local minimizer is also a global minimizer and results are independent of the initialization. However, non-convex energy functionals often give more accurate models (see Section 3.5).

The corresponding terminologies for convex and non-convex problems in the discrete domain are submodular and non-submodular (supermodular) problems, respectively. Let E be a function of n binary variables with E(f1, ..., fn) = Σ_i Ei(fi) + Σ_{i<j} Eij(fi, fj). Then the discrete energy functional E is submodular if the following condition holds:

Eij(0, 0) + Eij(1, 1) ≤ Eij(0, 1) + Eij(1, 0). (13)

For higher-order energy terms, e.g. Eijk(fi, fj, fk), E is submodular if all projections¹ of E onto two variables are submodular (Kolmogorov and Zabih, 2004).
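Condition (13) is straightforward to check programmatically. The following sketch (our illustration, not from the survey) tests a Potts-style pairwise smoothness term and its supermodular counterpart:

```python
def is_submodular_pairwise(E_ij):
    """A pairwise term on binary labels is submodular iff
    E(0,0) + E(1,1) <= E(0,1) + E(1,0)  (condition (13))."""
    return E_ij(0, 0) + E_ij(1, 1) <= E_ij(0, 1) + E_ij(1, 0)

w = 2.0  # neighbour interaction weight (arbitrary toy value)

# Potts-style smoothness: penalize neighbours with different labels.
potts = lambda fi, fj: w * (fi != fj)
assert is_submodular_pairwise(potts)        # optimizable via graph cuts

# A term penalizing *agreeing* neighbours is supermodular instead.
anti_potts = lambda fi, fj: w * (fi == fj)
assert not is_submodular_pairwise(anti_potts)
```

Submodular pairwise terms of this form are exactly the ones that min-cut/max-flow can optimize globally for binary labellings.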

Submodular energies can be optimized efficiently via graph cuts. Greig et al. (1989) were the first to utilize min-cut/max-flow algorithms to find the globally optimal solution for binary segmentation. Later, Ishikawa (2003) generalized the graph cut technique to find the exact solution for a special class of multi-label problems (more detail on Ishikawa's approach in Section 4.5).

In recent years, many efforts have been made to bridge the gap between convex and non-convex optimization problems in the continuous domain through convex approximations of non-convex models. Historically, the two-region segmentation problem (foreground and background) was convexified in 2006 by Chan et al. (2006) and the multi-region segmentation problem was convexified in 2008 by Chambolle et al. (2008) and

¹Suppose E has n binary variables. If m < n of these variables are fixed, then we get a new function E′ of n − m binary variables; E′ is called a projection of E.



Figure 5: Fidelity vs. optimizability. The axes are optimizability and fidelity; one direction improves optimizability (e.g. convexification), the other improves fidelity (e.g. new energy terms). Ideally, an energy function is designed in a way that is faithful to the underlying segmentation problem and, at the same time, easy to optimize.

Pock et al. (2008) for the first time (additional details on the continuous multi-region segmentation problem are given in Section 4.5).

3.5. Fidelity vs. Optimizability

In energy-based segmentation problems there is a trade-off between fidelity and optimizability (Hamarneh, 2011; McIntosh and Hamarneh, 2012; Ulen et al., 2013; Nosrati and Hamarneh, 2014). Fidelity describes how faithful the energy function is to the data and how accurately it can model and capture intricate problem details. Optimizability refers to how easily we can optimize the objective function and attain the global optimum.

Generally, the better the objective function models the problem, the more complicated it becomes and the harder it is to optimize. If we instead sacrifice fidelity to obtain a globally optimizable objective function, the solution might not be accurate enough for our segmentation purpose.

In the image segmentation literature, many works have focused on increasing the fidelity and improving the modeling capability of objective functions by (i) adding new energy terms, e.g. edge, region, shape, statistical overlap and area prior terms (Gloger et al., 2012; Shen et al., 2011; Andrews et al., 2011b; Bresson et al., 2006; Pluempitiwiriyawej et al., 2005; Ayed et al., 2009, 2008); (ii) extending binary segmentation methods to multi-label segmentation (Vese and Chan, 2002; Mansouri et al., 2006; Rak et al., 2013); (iii) modeling spatial relationships between labels, objects, or object regions (Felzenszwalb and Veksler, 2010; Liu et al., 2008; Rother et al., 2009; Colliot et al., 2006; Gould et al., 2008); and (iv) learning objective function parameters (Alahari et al., 2010; Nowozin et al., 2010; Szummer et al., 2008; McIntosh and Hamarneh, 2007; Kolmogorov et al., 2007).

Other works chose to improve optimizability by approximating non-convex energies with convex ones (Lellmann et al., 2009; Bae et al., 2011a; Boykov et al., 2001; Chambolle et al., 2008).

An ideal method improves both optimizability and fidelity without sacrificing either property (green contour in Figure 5).

3.6. Uncertainty and fuzzy/probabilistic vs. crisp labelling

In an MIS problem, ideally, we are interested in finding an optimal ground-truth labeling for an image, where each label

Figure 6: One-dimensional example of two energy functions, E1(X) with (a) a less certain solution and E2(X) with (b) a more certain solution.

represents a single structure of interest. However, as medical images are approximate representations of physical tissues, and due to noise coming from the internal body structures and/or imaging devices, it is often difficult to precisely define a ground-truth labeling. Even manual segmentations of an image by several experts have some degree of inter-expert (different experts) and intra-expert (same expert at different times) variability due to ambiguities in the image. Therefore, it is beneficial to encode uncertainty into segmentation frameworks (Koerkamp et al., 2010). This information can be used to highlight ambiguous image regions so as to prompt users to confirm or manually edit the segmentation of these regions.

Uncertainty in object boundaries may arise from numerous sources, including graded composition (Udupa and Grevera, 2005), image acquisition artifacts, and partial volume effects. Therefore, various image segmentation methods have been intentionally designed to output probabilistic or fuzzy results to better capture uncertainty in segmentation solutions (Grady, 2006; Zhang et al., 2001). Figure 6 demonstrates an example of how uncertainty information can be observed in an energy function. E1 and E2 in Figure 6 are two 1-D energy functions with the same optimal solution. However, segmentations near the minimal solution of E1 have very similar energy values (high uncertainty), as opposed to solutions near the same optimal point of E2 (less uncertainty/more certain). In fact, under the energy E1, a small perturbation of the image (e.g. additional noise) may change the segmentation result significantly. Given a probability distribution function over the label space, i.e. P(x) in (4), one way to calculate the uncertainty at pixel x is to use Shannon's entropy: h(x) = −Σ P(x) log2(P(x)). The entropy can be used as an energy term in a segmentation energy function; in this case, lower entropy corresponds to larger certainty and vice versa.
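For a pixelwise label distribution, the entropy term can be computed as in the following sketch (our illustration; the array layout and per-label indexing are our own convention, not the survey's):

```python
import numpy as np

def label_entropy(P):
    """Per-pixel Shannon entropy h(x) = -sum_l P_l(x) log2 P_l(x).
    P has shape (num_labels, H, W), with probabilities summing to 1
    over the first axis; higher entropy means a more uncertain pixel."""
    P = np.clip(P, 1e-12, 1.0)            # avoid log2(0)
    return -(P * np.log2(P)).sum(axis=0)

# Toy 2-label, 2x2 "probabilistic segmentation".
P = np.array([[[0.5, 0.9], [1.0, 0.99]],
              [[0.5, 0.1], [0.0, 0.01]]])
h = label_entropy(P)
assert np.isclose(h[0, 0], 1.0)      # 50/50 pixel: maximal uncertainty
assert h[0, 0] > h[0, 1] > h[1, 1]   # confidence rises, entropy drops
```

Such an entropy map directly highlights the ambiguous regions that a user could be asked to confirm or edit.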

As stated in Section 3.2, in addition to crisp labelling, where each pixel is mapped to exactly one object label, two common ways to encode uncertainty into a segmentation framework are the adoption of probabilistic and fuzzy labelling. In probabilistic labelling, the probability of each label at each pixel is reported (Wells III et al., 1996; Grady, 2006; Saad et al., 2008, 2010b; Changizi and Hamarneh, 2010; Andrews et al., 2011a,b). In contrast, fuzzy labelling reports a partial membership of each pixel to each class of labels via a membership function (Bueno et al., 2004; Howing et al., 1997).



Figure 7: A sample segmentation (a) with and (b) without sub-pixel accuracy (grid artifact). (c) Representing sub-pixel accuracy using a fuzzy representation.

One important issue with probabilistic methods is that most standard techniques for statistical shape analysis (e.g. principal component analysis (PCA)) assume that the probabilistic data lie in unconstrained real Euclidean space, which is not valid, as the sample space for probabilistic data is the unit simplex. Neglecting this unit simplex in statistical shape analysis may produce invalid shapes. In fact, moving along PCA modes results in invalid probabilities that need to be projected back onto the unit simplex; this projection also discards valuable uncertainty information. To avoid generating invalid shapes, Pohl et al. (2007) proposed a method based on the logarithm of odds (LogOdds) transform, which maps probabilistic labels to an unconstrained Euclidean vector space and whose inverse maps a vector of real values (e.g. values of the signed distance map at a pixel) to a probabilistic label. However, a shortcoming of the LogOdds transform is that it is asymmetric in one of the labels, usually chosen as the background, and changes in this label's probability are magnified in the LogOdds space. This limitation was addressed by Changizi and Hamarneh (2010) and Andrews et al. (2014), where the authors first used the isometric log-ratio (ILR) transformation to isometrically and bijectively map the simplex to Euclidean real space, then analyzed the transformed data in that space, and finally transformed the analysis results back to the unit simplex. More recently, Andrews and Hamarneh (2015) proposed a generalized log-ratio (GLR) transformation that offers more refined control over the distances between different labels.
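A minimal sketch of the LogOdds idea (our illustration, with the last component playing the role of the background reference label; not the survey's code):

```python
import numpy as np

def logodds(p):
    """Map a probability vector on the open simplex to R^(K-1) via
    log ratios against the last label (e.g. background). Note the
    built-in asymmetry in the chosen reference label."""
    p = np.asarray(p, dtype=float)
    return np.log(p[:-1] / p[-1])

def inv_logodds(v):
    """Inverse map: a generalized logistic function (softmax with a
    fixed zero logit for the reference label)."""
    z = np.exp(np.append(v, 0.0))
    return z / z.sum()

p = np.array([0.7, 0.2, 0.1])            # 3-label probabilistic pixel
v = logodds(p)                           # unconstrained 2-D vector
assert np.allclose(inv_logodds(v), p)    # round-trip back to the simplex
```

Statistics (e.g. PCA) computed on the unconstrained vectors `v` always map back to valid probabilities, which is the motivation behind the LogOdds, ILR and GLR transforms discussed above.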

3.7. Sub-pixel accuracy

In the spatially discrete setting, objects are converted into a discrete graph. This discretization causes a loss of spatial information, which causes object boundaries to align with the axes or graph edges, as demonstrated in Figure 7(b). In contrast, the continuous domain does not have such a shortcoming. In other words, sub-pixel accuracy allows for assigning a label to one part of a pixel and another label to the other part. This sub-pixel label assignment allows the segmentation accuracy to exceed the nominal pixel resolution of the image (Figure 7(a)). However, as images are digitized in computers, the accuracy of a crisp segmentation is always limited to the image pixel resolution. One way to achieve sub-pixel accuracy is to use a fuzzy representation (see Section 3.6) where, at each pixel, the degree of membership to a label is proportional to the area covered by that label (Figure 7(c)).

We should emphasize that although, in the continuous domain, image representation and energy formulations are continuous (Figure 2(a) and Figure 1(a)), the implementation of these methods for image processing involves a discretization step (e.g. estimating a derivative by a discrete forward difference). However, while label values are discrete (e.g. integer values) in the discrete setting, label values in the continuous setting can be real-valued. Nevertheless, from a theoretical point of view, continuous models correspond to the limit of infinitely fine discretization.
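As a concrete illustration of such a discretization step (our sketch, not part of the survey), a forward difference approximates the horizontal derivative of a continuous function sampled on a pixel grid:

```python
import numpy as np

def forward_diff_x(u):
    """Forward-difference estimate of du/dx on a pixel grid -- the kind
    of discrete operator used when implementing continuous PDE models.
    The last column is set to 0 (a simple Neumann-style boundary)."""
    d = np.zeros_like(u, dtype=float)
    d[:, :-1] = u[:, 1:] - u[:, :-1]
    return d

u = np.array([[0., 1., 3.],
              [2., 2., 2.]])
assert np.array_equal(forward_diff_x(u), [[1., 2., 0.],
                                          [0., 0., 0.]])
```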

4. Prior knowledge for targeted image segmentation

In this section, we review the prior knowledge information devised to improve image segmentation. Table 1 presents some of these important priors and compares them in terms of the nature of the achievable solution under a given formulation (i.e. globally vs. locally optimal), metrication error, domain of action (continuous vs. discrete), and other properties. We also created an interactive online database to categorize existing works based on the type of prior knowledge they use. We made our website interactive so that researchers can contribute to keep the database up to date. Figure 8 illustrates a snapshot of our online database showing the different types of prior information that have been used in the literature for targeted image segmentation.

4.1. User interaction

Incorporating user input into a segmentation framework may be an intuitive and easy way for users to assist with characterizing the desired object and obtain usable results. In an interactive segmentation system, the user input is used to encode prior knowledge about the targeted object. The specific prior knowledge that the user is considering is unknown to the method; only the implication of such prior knowledge (e.g. pixel x must be part of the object) is passed on to the interactive algorithm. Given a high-level intuitive user-interactive system, the end-user does not need to know about the low-level underlying optimization and energy function details.

User input can be incorporated in several ways, such as through mouse clicking (or even via eye gaze (Sadeghi et al., 2009)) to provide seed points, specifying subsets of the object boundary, or specifying sub-regions (bounding boxes) that contain the object of interest. The work proposed by Kass et al. (1988) is perhaps one of the earliest to incorporate user interaction into the segmentation framework: they enable users to enforce spring-like forces between the snake's control points to affect the energy functional and to push the snake out of a local minimum into another, more desirable location.

The first form of user input (providing seeds) involves the user specifying labels of some pixels inside and outside the targeted object by mouse-clicking or brushing. This allows a user to enforce hard constraints on labeled pixels. For example, in a binary segmentation scenario in the discrete setting, one can



Table 1: Some important prior information for targeted image segmentation

[Table 1 matrix. Rows (surveyed methods): Cootes et al. (1995); Cootes and Taylor (1995); Rousson and Paragios (2002); Chen et al. (2002); Tsai et al. (2003); Slabaugh and Unal (2005); Zhu-Jacquot and Zabih (2007); Veksler (2008); Song et al. (2010); Andrews et al. (2011b); Han et al. (2003); Zeng et al. (2008); Vicente et al. (2008); Foulonneau et al. (2006); Ayed et al. (2008); Klodt and Cremers (2011); Lim et al. (2011); Wu et al. (2011); Zhao et al. (1996); Samson et al. (2000); Li et al. (2006); Zeng et al. (1998); Goldenberg et al. (2002); Paragios (2002); Vazquez-Reina et al. (2009); Ukwatta et al. (2012); Rajchl et al. (2012); Delong and Boykov (2009); Ulen et al. (2013); Schmidt and Boykov (2012); Nosrati and Hamarneh (2014); Nosrati et al. (2013); Nosrati and Hamarneh (2013); Liu et al. (2008); Felzenszwalb and Veksler (2010); Strekalovskiy and Cremers (2011); Strekalovskiy et al. (2012); Bergbauer et al. (2013); Zhu and Yuille (1996); Brox and Weickert (2006); Ben Ayed and Mitiche (2008); Delong et al. (2012a); Yuan et al. (2012); Iosifescu et al. (1997); Collins and Evans (1997); Prisacariu and Reid (2012); Sandhu et al. (2011); Prisacariu et al. (2013). Columns (properties and priors each method incorporates): Multi-object; Shape; Topology; Moments (0th: size/area/volume; 1st: location/centroid; higher orders); Geometrical/region interaction (containment, exclusion, attraction); Spatial distance (min., max. (centroid)); Adjacency; Relative position; No. of regions/labels; Model/Atlas; No grid artifact; Guarantees on global solution.]



Figure 8: Snapshot of our interactive online database of segmentation articles categorized by the type of prior information devised in their framework (http://goo.gl/gy9pyn). Our online system allows users to update the records to ensure an up-to-date database.

enforce f_p = 1 if p ∈ foreground and f_p = 0 if p ∈ background in (7).

In the continuous domain, Paragios (2003), Cremers et al. (2007) and Ben-Zadok et al. (2009) have proposed level set-based methods in which a user can correct the solution interactively by clicking on incorrectly labelled pixels. Mathematically, let φ be the level set function (often represented by the signed distance map of the foreground), where φ > 0 and φ < 0 represent the inside and outside regions of the object of interest, respectively. Cremers et al. (2007) proposed to add the following user interaction term to their energy functional, which consists of other data and regularization terms:

Euser(φ) = −∫_Ω L(x) sign(φ(x)) dx , (14)

where L : Ω → {−1, 0, +1} reflects the user input and is defined as:

L(x) =
  +1 if x is marked as 'object'
  −1 if x is marked as 'background'
   0 if x is not marked.   (15)
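A discretized version of the user-interaction term (14)-(15) can be sketched as follows (our toy illustration on a 2×2 level set, not the authors' implementation):

```python
import numpy as np

def user_energy(phi, L):
    """Discretized E_user(phi) = -sum_x L(x) * sign(phi(x)):
    L is +1 on pixels the user marked 'object', -1 on 'background',
    0 elsewhere. The energy decreases when sign(phi) agrees with
    the user's marks, pulling the level set toward them."""
    return -np.sum(L * np.sign(phi))

phi = np.array([[ 1.0, -1.0],
                [ 2.0, -0.5]])       # phi > 0 inside the object
L = np.zeros_like(phi)
L[0, 0] = +1                         # user marks pixel (0,0) as object
L[0, 1] = -1                         # user marks pixel (0,1) as background
agree = user_energy(phi, L)          # both marks already satisfied
L[0, 0] = -1                         # now contradict the segmentation
assert user_energy(phi, L) > agree   # disagreement raises the energy
```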

Ben-Zadok et al. (2009) used an energy functional similar to that of Cremers et al. (2007). Assuming that {xi}, i = 1, ..., n, denotes the set of user inputs, which indicate the incorrectly labelled regions, they defined M : Ω → {0, 1} as:

M(z) = Σ_{i=1}^{n} δ(z − xi) , (16)

where δ is the Dirac delta function. The function L : Ω → R is defined as:

L(x) = H(φ(x)) + (1 − 2H(φ(x))) ∫_{z∈N} M(z) dz , (17)

where H is the Heaviside step function and N is the neighbourhood of the coordinate x. L(x) = 0 if the user's click is within the segmented region, and L(x) = 1 if the click is on the background; L(x) = H(φ(x)) if x is not marked. The user interaction term proposed by Ben-Zadok et al. (2009) is then defined as:

Euser(φ) = ∫_{x∈Ω} ∫_{x′∈Ω} (L(x′) − H(φ(x)))² K(x, x′) dx′ dx , (18)

where K(x, x′) is a Gaussian kernel.

Another form of user input is object boundary specification, where all or part of the object boundary is roughly specified by drawing a contour (in 2D) or initializing a surface (in 3D) around the object boundary. This form of user input is more suitable for 2D images, as providing manual rough segmentations of 3D images (which is often the case in many medical image analysis problems) is not straightforward. Examples that require the user to provide an initial guess close to the object's boundary include (Wang et al., 2007) in the discrete setting, and edge-based active contours (e.g. gradient vector field (GVF) (Xu and Prince, 1997, 1998) and geodesic active contours (Goldenberg et al., 2001)) in the continuous setting. Live-wire, proposed by Barrett and Mortensen (1997), is another effective tool for 2D segmentation that benefits from user-defined seeds on the boundary of the desired object. The 2D live-wire uses the gradient magnitude, gradient direction, and Canny edge detector to build cost terms. After an initial seed point is provided on the boundary of the object, live-wire calculates the local cost for each pixel starting from the provided seed and finds the minimal path between the initial seed point p and the next point q chosen by the user. The 2D live-wire was extended to 3D by Hamarneh et al. (2005).
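The minimal-path computation at the core of live-wire can be sketched with a Dijkstra-style search on a toy pixel-cost grid (our illustration only; the actual live-wire cost combines the gradient and Canny terms mentioned above):

```python
import heapq

def livewire_path(cost, seed, target):
    """Minimal-cost path between two boundary points on a pixel grid,
    computed with Dijkstra's algorithm on 4-connected neighbours."""
    H, W = len(cost), len(cost[0])
    dist = {seed: 0.0}
    prev = {}
    pq = [(0.0, seed)]
    while pq:
        d, (r, c) = heapq.heappop(pq)
        if (r, c) == target:
            break
        if d > dist.get((r, c), float("inf")):
            continue                      # stale queue entry
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < H and 0 <= nc < W:
                nd = d + cost[nr][nc]     # cost of entering the neighbour
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = (r, c)
                    heapq.heappush(pq, (nd, (nr, nc)))
    path, node = [target], target         # backtrack from target to seed
    while node != seed:
        node = prev[node]
        path.append(node)
    return path[::-1]

# A low-cost "edge" runs along the top row and right column of this map,
# so the path hugs it instead of cutting through the high-cost interior.
cost = [[1, 1, 1],
        [9, 9, 1],
        [9, 9, 1]]
path = livewire_path(cost, (0, 0), (2, 2))
assert path == [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2)]
```

In an interactive session, the seed is the user's last click and the path to the cursor position is recomputed on the fly.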

Another form of user input, and probably the most convenient one for a user, is sub-region specification, where a user



is asked to draw a box around the targeted object. This bounding box can also be provided automatically using machine learning techniques for object detection. In the discrete setting, GrabCut, proposed by Rother et al. (2004), is one of the most well-known methods with this kind of initialization. Lempitsky et al. (2009) proposed a method that shows how a bounding box can be used to impose a powerful topological prior that prevents the solution from excessively shrinking and splitting, and ensures that the solution is sufficiently close to each of the sides of the bounding box. Grady et al. (2011) performed a user study and showed that a single box input is in fact enough for segmenting the targeted object. In the continuous setting, this kind of user input (sub-region specification) is taken into account by methods like geodesic active contours (Caselles et al., 1997), in which the user initializes the active contour around the object of interest.

Similar interaction is utilized in 3D live-wire (Hamarneh et al., 2005) as implemented in the TurtleSeg software² (Top et al., 2011). In 3D live-wire, a few slices in different orientations of a 3D volume are segmented using 2D live-wire. Then, the segmented 2D slices are used to segment the whole 3D volume by automatically generating additional contours on new slices. The new contours are obtained by calculating optimal paths connecting the points of intersection between the new slice planes and the original contours provided semi-automatically by the user.

Saad et al. (2010a) proposed another type of interactive image analysis in which a user is able to examine the uncertainty in the segmentation results and improve them, e.g. by changing the parameters of the segmentation algorithm. For an expanded study on interaction in MIS, interested readers may refer to (Saad et al., 2010b,a).

4.2. Appearance prior

Appearance is one of the most important visual cues for distinguishing between different structures in an image. Appearance is described by studying the distribution of different features, such as intensity values in gray-scale images, color, and texture, inside each object. In most cases, appearance models are incorporated into the data term in (2) and (7). The purpose of incorporating an appearance prior is to fit the appearance distribution of the segmented objects to the distribution of the objects of interest, e.g. using a Gaussian mixture model (GMM) (Rother et al., 2004). In the literature, there are two ways to model appearance: 1) adaptively learning the appearance during the segmentation procedure, and 2) knowing the appearance model prior to performing segmentation (e.g. by observing the appearance distribution of the training data). In the former case, the appearance model is learned as the segmentation is performed (Vese and Chan, 2002) (computed online). In the second case, it is assumed that the probability of each pixel belonging to a particular label is known, i.e. if Fi(x) represents a particular set of feature values (e.g. intensity/color) associated with each image location for the ith object, then it is assumed that P(x|Fi(x)) is known (or pre-computed offline). This probability is usually

²www.turtleseg.org

Figure 9: Examples of regional probabilities. (a) Original endoscopic image. (b-d) Probability of background, kidney and tumour for the frame shown in (a). A lower intensity in (b-d) corresponds to higher probability.

learned and estimated from the distribution of features inside small samples of each object. Figure 9 illustrates the probability of different structures (the kidney, the tumour, and the background) in an endoscopic scene. A lower intensity in Figures 9(b-d) corresponds to higher probability.

To fit the segmentation appearance distribution to the prior distribution, a dissimilarity measure d is usually needed, where d(gi, ḡi) measures the difference between the appearance distribution of the ith object (gi) and its corresponding prior distribution ḡi. This dissimilarity measure can be encoded into the energy functional (2) directly as the data term or via a probabilistic formulation. For example, considering the appearance prior of an object in a scalar-valued image I, ḡi would be the mean (μi) and variance (σi²) of the intensities of the targeted object. Then, assuming a Gaussian approximation of the object's intensity, the corresponding probability distribution is:

P(x|ḡi) = (1 / √(2πσi²)) exp( −(I(x) − μi)² / (2σi²) ) . (19)
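Equation (19) translates directly into code. The sketch below (our illustration, with made-up object statistics) scores each pixel of a toy intensity image:

```python
import numpy as np

def gaussian_appearance_prob(I, mu, sigma2):
    """Per-pixel likelihood P(x | g_i) under a Gaussian appearance
    model with mean mu and variance sigma2 for object i (eq. (19))."""
    return np.exp(-(I - mu) ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)

I = np.array([[100., 102.],
              [150.,  30.]])               # toy intensity image
P_obj = gaussian_appearance_prob(I, mu=101.0, sigma2=25.0)

# Pixels near the object's mean intensity score highest:
assert P_obj[0, 1] > P_obj[1, 0] > P_obj[1, 1]
```

In an energy functional, such a probability map typically enters the data term as −log P, so high-probability pixels are cheap to assign to the object.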

Besides scalar-valued medical images such as MR (Pluempitiwiriyawej et al., 2005) and US (Noble and Boukerroui, 2006), appearance models can be extracted from other types of images, such as color images (e.g. skin (Celebi et al., 2009), endoscopy (Figueiredo et al., 2010), microscopy (Nosrati and Hamarneh, 2013)), other vector-valued images (dynamic positron emission tomography, dPET (Saad et al., 2008)), and tensor-valued or manifold-valued images (Feddern et al., 2003; Wang and Vemuri, 2004; Weldeselassie and Hamarneh, 2007). For vector-valued images, one can use a multivariate Gaussian density as the appearance model; the formulation is similar to (19), with the covariance matrix Σi used instead of σi². For tensor-valued images, several distance measures in the space of tensors have been proposed, such as:



• The Log-Euclidean tensor distance is defined as:

dLE(Ti, T̄i) = √( trace( (logm(Ti) − logm(T̄i))² ) ) ,

where Ti and T̄i are a tensor from the ith region and its corresponding prior tensor model, respectively.

• The symmetrized Kullback-Leibler (SKL) divergence (also known as the J-divergence) (Wang and Vemuri, 2004) is defined as:

dSKL(Ti, T̄i) = (1/2) √( trace(Ti⁻¹T̄i + T̄iTi⁻¹) − 2n ) ,

where n is the size of the tensors Ti and T̄i (n = 3 in DT-MRI). This measure is affine invariant.

• The Rao distance (Lenglet et al., 2004) is defined as:

dR(Ti, T̄i) = √( (1/2) Σ_j log²(λj) ) ,

where λj denotes the eigenvalues of T̄i^(−1/2) Ti T̄i^(−1/2).
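As an illustration (our sketch, not from the survey), the Log-Euclidean distance can be computed via an eigendecomposition-based matrix logarithm, which is valid for symmetric positive-definite tensors:

```python
import numpy as np

def spd_logm(T):
    """Matrix logarithm of a symmetric positive-definite tensor via
    eigendecomposition: T = V diag(w) V^T with w > 0, so
    logm(T) = V diag(log w) V^T."""
    w, V = np.linalg.eigh(T)
    return (V * np.log(w)) @ V.T

def log_euclidean_dist(T, T_bar):
    """Log-Euclidean tensor distance:
    sqrt(trace((logm(T) - logm(T_bar))^2))."""
    D = spd_logm(T) - spd_logm(T_bar)
    return np.sqrt(np.trace(D @ D))

T1 = np.diag([2.0, 1.0, 1.0])        # toy 3x3 diffusion tensors
T2 = np.diag([8.0, 1.0, 1.0])
assert np.isclose(log_euclidean_dist(T1, T1), 0.0)
assert np.isclose(log_euclidean_dist(T1, T2), np.log(4.0))
```

For the diagonal toy tensors the distance reduces to |log 2 − log 8| = log 4, which the assertions confirm.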

Intensity and color information are not always sufficient to distinguish different objects. Hence, several methods have proposed to model objects with more complex appearance using texture information as a complementary feature (Huang et al., 2005; Malcolm et al., 2007; Santner et al., 2009).

Bigun et al. (1991) introduced a simple texture feature model consisting of the Jacobian matrix convolved with a Gaussian kernel (Kσ), which results in three different feature channels; in the case of a 2D image, the features are Kσ ∗ (Ix², IxIy, Iy²). However, these features ignore non-textured objects that might be of interest. Therefore, Rousson et al. (2003) proposed to use the following texture features in order to segment objects with and without texture: (I, Ix²/|∇I|, IxIy/|∇I|, Iy²/|∇I|).

More advanced texture features, such as those based on Haar

and Gabor filter banks, have shown many successes in medical image segmentation (Huang et al., 2005; Malcolm et al., 2007; Santner et al., 2009). Koss et al. (1999) and Frangi et al. (1998) are two works that utilized advanced features to segment abdominal organs and to measure vesselness, respectively. In (Frangi et al., 1998), the eigenvalues of the image Hessian matrix are used for measuring the vesselness of pixels in images. This measure is used for liver vessel segmentation both in a variational framework (Freiman et al., 2009) and in a graph-based framework (Esneault et al., 2010). The statistical overlap prior proposed by Ayed et al. (2009) is another strong appearance prior. Their method embeds statistical information (e.g. histograms of intensities) about the overlap between the distributions within the object and the background in a variational image segmentation framework. They used the Bhattacharyya coefficient to measure the amount of overlap between two distributions, i.e. dB(gi, ḡi) = Σ_z √(gi(z) ḡi(z)), ∀z ∈ Z, where I : Ω → Z. Ben Ayed et al. (2009) used this strong prior to segment the left ventricle in MR images.
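The Bhattacharyya coefficient is a one-liner over normalized histograms; the sketch below (our toy example, not from the cited works) shows its behaviour at the two extremes:

```python
import numpy as np

def bhattacharyya(g, g_bar):
    """Bhattacharyya coefficient between two normalized histograms:
    sum_z sqrt(g(z) * g_bar(z)); 1 means identical distributions,
    0 means disjoint supports."""
    return np.sum(np.sqrt(g * g_bar))

g_prior = np.array([0.1, 0.6, 0.3])          # learned object histogram
assert np.isclose(bhattacharyya(g_prior, g_prior), 1.0)

g_other = np.array([0.0, 0.0, 1.0])          # mostly disjoint histogram
assert bhattacharyya(g_prior, g_other) < 1.0
```

In an overlap prior, maximizing (or minimizing) this coefficient pulls the segmented region's histogram toward (or away from) a reference distribution.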

Other features, such as frequency, bag of visual words, gradient location and orientation histogram (GLOH) (Mikolajczyk and Schmid, 2005), DAISY (Tola et al., 2008), GIST (spatial envelope) (Oliva and Torralba, 2001), local binary patterns (LBP) (Heikkila et al., 2009), SURF (Bay et al., 2006), histogram of oriented gradients (HOG) (Dalal and Triggs, 2005), and the scale-invariant feature transform (SIFT) (Lowe, 2004), are sometimes helpful as appearance features (Bosch et al., 2007).

Sometimes the appearance of structures is so complicated that regular features cannot describe them accurately. To extract the appearance characteristics of such structures, different machine learning techniques have been proposed. These techniques learn the appearance either by combining several features like texture, color, intensity, HOG, etc., and feeding the combined feature vectors to a classifier like a random decision forest (RF) or a support vector machine (SVM) (Tu et al., 2006; Nosrati et al., 2014), or by learning a dictionary that describes the object of interest (Mairal et al., 2008; Nieuwenhuis et al., 2014; Nayak et al., 2013).

In general, appearance features can be extracted in the following domains, based on the type of the medical data:

• spatial domain: several methods have been developed to segment 2D or 3D static images (Chan et al., 2000; Cootes et al., 2001; Vese and Chan, 2002; Feddern et al., 2003; Wang and Vemuri, 2004; Huang et al., 2005; Malcolm et al., 2007; Santner et al., 2009);

• time domain: in dynamic medical images, it is beneficial to consider the temporal dimension along with the spatial dimensions. For example, extracting appearance features in the temporal direction is very informative in dynamic positron emission tomography (dPET) images, where each pixel in the image represents a time activity curve (TAC) that describes the metabolic activity of a tissue as a result of tracer uptake (Saad et al., 2008). Other examples include (Mirzaei et al., 2013), where spatio-temporal features are used to distinguish tumour regions in 4D lung CT (3D+time), and (Amir-Khalili et al., 2014), where the likelihood of vessel regions is calculated based on temporal and frequency analysis.

• scale domain: for some objects with more complex texture, it is useful to estimate the appearance model at different scales to obtain more accurate results and to ensure that the model is scale invariant (Han et al., 2009; Mirzaalian and Hamarneh, 2010).

Regardless of where the appearance information comes from, it is encoded through a data energy term ($D$) that assigns each pixel a probability of belonging to each class of objects (5).
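As a concrete sketch of such a data term (the per-class Gaussian appearance model and all names below are illustrative choices, not taken from any of the surveyed methods), the per-pixel, per-class cost can be taken as the negative log-likelihood of the pixel intensity:

```python
import numpy as np

def data_term(image, means, sigmas):
    """Negative log-likelihood data term under a per-class Gaussian
    appearance model (an illustrative choice; any probabilistic
    appearance model could be substituted)."""
    image = np.asarray(image, dtype=float)
    means = np.asarray(means, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    # D[..., l] = -log N(I_p; mu_l, sigma_l): one cost per pixel and class
    diff = image[..., None] - means  # broadcast over classes
    return 0.5 * (diff / sigmas) ** 2 + np.log(sigmas * np.sqrt(2 * np.pi))

# Two classes: dark background (mu = 0.2) and bright object (mu = 0.8).
I = np.array([[0.1, 0.9],
              [0.2, 0.7]])
D = data_term(I, means=[0.2, 0.8], sigmas=[0.1, 0.1])
labels = D.argmin(axis=-1)  # per-pixel maximum-likelihood label
```

Each pixel then incurs the lowest cost for the class whose appearance model explains it best; a richer model (mixture, dictionary, or learned classifier) would slot into the same $D$.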

4.3. Regularization

The regularization term corresponds to priors on the space of feasible solutions. Several regularization terms have been proposed in the literature. The most famous one is the Mumford-Shah model (Mumford and Shah, 1989), which penalizes the boundary length of the different regions in a spatially continuous domain, i.e. $\sum_{i=1}^{n} |\partial S_i|$. The corresponding regularization model in the discrete domain is the Potts model, which penalizes any appearance discontinuity between neighbouring pixels and is defined as $\sum_{p,q \in \mathcal{N}} w_{pq}\,\delta(I_p, I_q)$.

The regularization term formulated in the discrete setting is biased by the discrete grid and favours curves oriented along the grid, e.g. in horizontal and vertical or diagonal directions in a 4-connected lattice of pixels. As mentioned in Section 3.3, this produces grid artifacts, also known as metrication error (Figure 3). The regularization term in the continuous setting, on the other hand, allows one to accurately represent geometrical entities such as curve length (or surface area) without any grid bias.
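A minimal sketch of a discrete Potts-style regularizer on a 4-connected grid, counting label disagreements between neighbours (assuming, for simplicity, a constant edge weight `w`; in practice $w_{pq}$ usually varies per edge, e.g. with image contrast):

```python
import numpy as np

def potts_energy(labels, w=1.0):
    """Discrete Potts regularizer on a 4-connected grid: adds w for every
    pair of neighbouring pixels carrying different labels (a sketch with a
    constant weight; per-edge weights w_pq are the general case)."""
    labels = np.asarray(labels)
    horiz = np.sum(labels[:, 1:] != labels[:, :-1])  # left-right neighbours
    vert = np.sum(labels[1:, :] != labels[:-1, :])   # up-down neighbours
    return w * (horiz + vert)

seg = np.array([[0, 0, 1],
                [0, 1, 1]])
E = potts_energy(seg)  # 2 horizontal + 1 vertical disagreement
```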

Some other regularization terms in the continuous domain, written in level set notation ($\phi$), are listed as follows:

• Length regularization: $\int_\Omega |\nabla H(\phi(x))|\,dx$, where $H(\cdot)$ is the Heaviside step function.

• Total variation (TV): $\int_\Omega |\nabla\phi(x)|\,dx$, which only smooths in the tangent direction of the level set curve. This term is used especially when a single function $\phi$ is used to segment multiple regions, i.e. $\phi$ is not necessarily a signed distance function. It is worth mentioning that there are two variants of the total variation term: the isotropic variant using the $\ell_2$ norm,

$\int_\Omega |\nabla\phi(x)|_2\,dx = \int_\Omega \sqrt{|\phi_{x_1}|^2 + \cdots + |\phi_{x_N}|^2}\,dx$ , (20)

and the anisotropic variant using the $\ell_1$ norm,

$\int_\Omega |\nabla\phi(x)|_1\,dx = \int_\Omega |\phi_{x_1}| + \cdots + |\phi_{x_N}|\,dx$ . (21)

The anisotropic version is not rotationally invariant and therefore favours results that are aligned along the grid system. The isotropic version is typically preferred but cannot be properly handled by discrete optimization algorithms, as the derivatives are not available in all directions in the discrete setting.

• $H^1$ norm: $\int_\Omega |\nabla\phi(x)|^2\,dx$, which applies a purely isotropic smoothing at every pixel $x$.

A comparison of the above-mentioned regularization terms can be found in (Chung and Vese, 2009).
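The difference between the isotropic (20) and anisotropic (21) variants can be checked numerically with forward finite differences (the discretization and border handling below are arbitrary choices made for illustration):

```python
import numpy as np

def tv(phi, isotropic=True):
    """Finite-difference total variation of a 2D function phi.
    Isotropic: sum of per-pixel l2 norms of the gradient, cf. eq. (20);
    anisotropic: sum of per-pixel l1 norms, cf. eq. (21).
    Forward differences with a replicated border are an arbitrary choice."""
    gx = np.diff(phi, axis=1, append=phi[:, -1:])  # d/dx1
    gy = np.diff(phi, axis=0, append=phi[-1:, :])  # d/dx2
    if isotropic:
        return np.sqrt(gx ** 2 + gy ** 2).sum()
    return (np.abs(gx) + np.abs(gy)).sum()

phi = np.array([[0., 0., 1.],
                [0., 1., 1.]])
```

For any field the isotropic value is never larger than the anisotropic one, and the two differ exactly where the gradient is oblique to the grid.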

Higher order regularization terms were also proposed to encode more constraints on the optimization problem. For example, Duchenne et al. (2011) introduced a ternary term (along with the unary and pairwise terms of the standard MRF) for a graph matching (not image segmentation) application, and Delong et al. (2012b) proposed an efficient optimization framework for sparse higher-order energies in the discrete domain.

Curvature regularization is another useful type of regularization that has been shown to be capable of capturing thin and elongated structures (Schoenemann et al., 2009; El-Zehiry and Grady, 2010). In addition, there is evidence that cells in the visual cortex are responsible for detecting curvature (Dobbins et al., 1987).

Figure 10: Curvature regularization term. (a) Approximating the curvature term by tessellating the image domain into a cell complex; $e_{ij}$ is the boundary variable and $f_1, f_2, f_3, f_4$ are its corresponding region variables. (b) Vessel segmentation, top: without and bottom: with the curvature regularization term. (Image (b) adapted from (Strandmark et al., 2013))

While the curvature regularization term can be easily formulated in local optimization frameworks, e.g. in a level set formulation (Leventon et al., 2000a) or the Snakes model (Kass et al., 1988), it is much more difficult to incorporate such a prior into a global optimization framework. Strandmark and Kahl (2011) proposed several improvements to approximate curvature regularization within a global optimization framework. They defined the curvature term as $\int_{\partial R} k(x)^2\,dx$, where $\partial R$ is the boundary of the foreground region and $k$ is the curvature function. They approximated this curvature term with discrete computation techniques by tessellating the image domain into a cell complex, e.g. a hexagonal mesh, which is a collection of non-overlapping basic regions whose union gives the whole domain. They recast the problem as an integer linear program (along with a data term and length/area regularization terms) and optimized the total energy via linear programming (LP) relaxation. Figure 10(a) shows how Strandmark and Kahl (2011) discretized the image domain into cells. If $f_i$, $i = 1, \cdots, m$, denotes the binary variable associated with each cell region and $e_{ij}$ denotes the boundary variable, then the curvature regularization term is written as the linear function $\sum_{i,j} b_{ij} e_{ij}$, where $e_{ij}$ denotes the boundary pairs and

$b_{ij} = \min(l_i, l_j) \left( \dfrac{\alpha}{\min(l_i, l_j)} \right)^2$ , (22)

where $l_i$ is the length of edge $i$ and $\alpha$ is the angle difference between the two lines. Later, Strandmark et al. (2013) extended their previous work (Strandmark and Kahl, 2011) and proposed a globally optimal shortest path method that minimizes general functionals of higher-order curve properties, e.g. curvature and torsion. Figure 10(b) illustrates the usefulness of the curvature prior for vessel segmentation.

4.4. Boundary information

Boundary and edge information is a powerful feature for delineating the objects of interest in an image. To incorporate such information, it is often assumed that object boundaries are more likely to pass between pixels with large intensity/color contrast or, more generally, between regions with different appearance (as captured by any of the measures in Section 4.2). As object boundaries are locations where we expect discontinuities in the labels, this information is usually linked to the regularization term in (2) such that the regularization penalty is decreased in high-contrast regions (most likely objects' boundaries) to allow for discontinuity in labels. The functions $w_{ij} = \exp(-\beta \|I_i - I_j\|_2^2)$ and $w'_{ij} = 1/(1 + \beta \|I_i - I_j\|_2^2)$ are two examples of boundary weighting functions, where $I_i$ and $I_j$ represent the intensity/color values associated with pixels $i$ and $j$ in image $I$, respectively (Grady, 2012). These boundary weights are used as multiplicative factors along with the regularization terms mentioned in Section 4.3. Geodesic active contours (Caselles et al., 1997), normalized cuts (Shi and Malik, 2000), and the random walker (Grady, 2006) are three examples that employ such a boundary weighting technique.
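The two weighting functions can be written down directly (a sketch; vector-valued pixels are assumed and $\beta$ is a free contrast-sensitivity parameter):

```python
import numpy as np

def contrast_weights(Ii, Ij, beta=1.0):
    """The two boundary weighting functions from the text:
    w_ij  = exp(-beta * ||Ii - Ij||^2)   and
    w'_ij = 1 / (1 + beta * ||Ii - Ij||^2).
    Both approach 1 for similar pixels and drop toward 0 across strong
    edges, so multiplying the regularizer by them makes label
    discontinuities cheap exactly where the contrast is high."""
    d2 = np.sum((np.asarray(Ii, float) - np.asarray(Ij, float)) ** 2, axis=-1)
    return np.exp(-beta * d2), 1.0 / (1.0 + beta * d2)

# Identical pixels -> weight 1; strongly contrasting pixels -> small weight.
w_same, wp_same = contrast_weights([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
w_edge, wp_edge = contrast_weights([0.0, 0.0, 0.0], [1.0, 1.0, 1.0], beta=2.0)
```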

Boundary and edge information can also be linked to the data term in (2) via the use of edge detectors, which typically involve first and second order spatial differential operators. Several methods have been proposed to calculate first and second order differences in scalar images (Canny, 1986; Frangi et al., 1998) and color images (Shi et al., 2008; Tsai et al., 2002). However, some medical images are manifold-valued (e.g. DT-MRI). To address this, Nand et al. (2011) extended the first order differential as $g(x) = \sqrt{\lambda}\,e$, where $\lambda$ and $e$ are, respectively, the largest eigenvalue and corresponding eigenvector of $S(x) = J(x)^T J(x)$, and $J(x)$ is the Jacobian matrix generalizing the gradient of a scalar field to the derivatives of the 3D DT image. Similarly, the authors extended the second order differential as $G'(x) = \frac{G(x) + G(x)^T}{2}$, where $G(x)$ is the Jacobian matrix of $g(x)$, i.e. $G_{ij} = \frac{\partial g_i}{\partial x_j}$. A similar approach has been proposed for boundary detection in color images, e.g. in color snakes (Sapiro, 1997) and for detecting the boundaries of oral lesions in color images (Chodorowski et al., 2005).

Boundary polarity: A problem with the aforementioned boundary models is that they describe a boundary point that passes between two pixels with high image contrast without accounting for the direction of the transition (Boykov and Funka-Lea, 2006; Grady, 2012). Singaraju et al. (2008) considered the transition direction in boundary detection. For example, it is possible to distinguish between boundaries from bright to dark and from dark to bright (boundary polarity). This boundary polarity is incorporated into a graph-based framework by replacing each undirected edge, $e_{ij}$, by two directed edges, $e_{ij}$ and $e_{ji}$, with edge weights calculated as:

$w_{ij} = \begin{cases} \exp(-\beta_1 \|I_i - I_j\|_2^2) & \text{if } I_i > I_j \\ \exp(-\beta_2 \|I_i - I_j\|_2^2) & \text{otherwise,} \end{cases}$ (23)

where $\beta_1 > \beta_2$. In (23), a boundary transition from bright to dark is less costly than a boundary transition from dark to bright. One example of encoding boundary polarity is shown in Figure 11, where the boundary ambiguity is resolved by specifying the boundary polarity, i.e., in this example, bright-to-dark boundaries.
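For scalar intensities, the directed weight of (23) reduces to a two-branch expression (the β values below are arbitrary; the only requirement stated in the text is $\beta_1 > \beta_2$):

```python
import math

def directed_weight(Ii, Ij, beta1=4.0, beta2=1.0):
    """Directed edge weight of eq. (23) for scalar intensities: with
    beta1 > beta2, a bright-to-dark edge (Ii > Ij) gets a smaller weight,
    so cutting it (placing the boundary there) is cheaper than cutting
    the opposite direction."""
    beta = beta1 if Ii > Ij else beta2
    return math.exp(-beta * (Ii - Ij) ** 2)

w_bright_to_dark = directed_weight(0.9, 0.1)  # preferred cut direction
w_dark_to_bright = directed_weight(0.1, 0.9)
```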

Figure 11: Cardiac right ventricle segmentation (a) without encoding edge polarity and (b) with encoding edge polarity, by specifying bright-to-dark edges as the desired ones. Note how the incorrect boundary transition (the yellow arrow) in (a) has been corrected in (b) by specifying the boundary polarity.

The assumption of high contrast at objects' boundaries might not always be valid in medical images, e.g. soft tissue boundaries in CT images. In addition, the two contrast models above, $w_{ij}$ and $w'_{ij}$, are suitable for objects with smooth appearance and not for textured objects. One possible way to address these issues (low-contrast images and textured objects) is to utilize the piecewise constant case of the Mumford-Shah model (Mumford and Shah, 1989) and replace $I_i$ with $\tau(I_i)$, where $\tau$ is a function that maps the pixel content to a transformed space in which the object appearance is relatively constant (Grady, 2012). The Mumford-Shah model segments the image into a set of pairwise disjoint regions with minimal appearance variance and minimal boundary length. Among the most popular methods that adopted the Mumford-Shah model is the active contours without edges (ACWOE) method proposed by Chan and Vese (2001). As an example, Sandberg et al. (2002) proposed a level set-based active contour algorithm to segment textured objects. Another example is the work of Paragios and Deriche (2002), where boundary- and region-based segmentation modules were exploited and unified into a geodesic active contour model to segment textured objects.

4.5. Extending binary to multi-label segmentation

In many medical image analysis problems, we are often interested in segmenting multiple objects (e.g. segmenting retinal layers from optical coherence tomography (Yazdanpanah et al., 2011)). Unlike a large class of binary labeling problems that can be solved globally, multi-label problems cannot, in general, be minimized globally. In 2001, Boykov et al. (2001) proposed two algorithms (α-expansion and α-β swap) based on graph cuts that efficiently find a local minimum of a multi-label problem. They consider the following energy functional

$E(f) = \sum_{p \in \mathcal{P}} D_p(f_p) + \sum_{p,q \in \mathcal{N}} V_{pq}(f_p, f_q)$ , (24)

where $\mathcal{P}$ is the set of all pixels, $f = \{f_p \mid p \in \mathcal{P}\}$ is a labeling of the image, $D_p(f_p)$ measures how well label $f_p$ fits pixel $p$, and $V_{pq}$ is a penalty term for every pair of neighbouring pixels $p$ and $q$ that encourages neighbouring pixels to have the same label. The second term ensures that the segmentation boundary is smooth. The methods proposed in (Boykov et al., 2001) require $V_{pq}$ to be either a metric or a semimetric. $V$ is a metric on the space of labels $\mathcal{L}$ if it satisfies the following three conditions:

$V(\alpha, \beta) = 0 \Leftrightarrow \alpha = \beta$ , (25)

$V(\alpha, \beta) = V(\beta, \alpha) \geq 0$ , (26)

$V(\alpha, \beta) \leq V(\alpha, \gamma) + V(\gamma, \beta)$ , (27)

for any labels $\alpha, \beta, \gamma \in \mathcal{L}$. If $V$ only satisfies (25) and (26), then $V$ is a semimetric. Boykov et al. (2001) find a local minimum by iteratively swapping a pair of labels (α-β swap) or expanding a label (α-expansion) and evaluating the energy using graph cuts. Later, in 2003, Ishikawa (Ishikawa, 2003) showed that, if $V_{pq}(f_p, f_q)$ is convex and symmetric in $f_p - f_q$, one can compute the exact solution of the multi-label problem. Ishikawa used the following formulation:

$E(f) = \sum_{p \in \mathcal{P}} D(f_p) + \sum_{(p,q) \in \mathcal{N}} g(\ell(f_p) - \ell(f_q))$ , (28)

where $D(\cdot)$ in the first (data) term is any bounded function that can be non-convex, $g(\cdot)$ is a convex function, and $\ell$ is a function that gives the index of a label, i.e. $\ell(\text{label } i) = i$. The term $g(\ell(f_p) - \ell(f_q))$ expresses that there is a linear order among the labels and that the regularization depends only on the difference of their ordinal numbers. Ishikawa showed that if $g(\cdot)$ is convex over the linearly ordered label set, problem (28) can be optimized exactly by finding the min-cut on a specially constructed multi-layered graph in which each layer corresponds to one label.
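Conditions (25)-(27) can be checked mechanically for a given pairwise penalty table (a small sketch; `classify_V` is our name, not from the cited works):

```python
import itertools

def classify_V(V):
    """Classify a pairwise penalty matrix V (V[a][b]) as 'metric',
    'semimetric', or 'neither' according to conditions (25)-(27)."""
    L = range(len(V))
    identity = all((V[a][b] == 0) == (a == b) for a in L for b in L)       # (25)
    symmetric = all(V[a][b] == V[b][a] >= 0 for a in L for b in L)        # (26)
    if not (identity and symmetric):
        return "neither"
    triangle = all(V[a][b] <= V[a][c] + V[c][b]
                   for a, b, c in itertools.product(L, repeat=3))         # (27)
    return "metric" if triangle else "semimetric"

potts = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]      # Potts penalty: a metric
truncated = [[0, 1, 4], [1, 0, 1], [4, 1, 0]]  # violates the triangle inequality
```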

In the continuous domain, Vese and Chan (2002) extended their level set-based method to multiphase level sets. To segment $N$ objects, their method needs $\lceil \log_2 N \rceil$ level set functions. The number of regions is upper-bounded by a power of two (Figure 12(a)). Therefore, the actual number of regions the method yields is sometimes not clear, as it depends on the image and the regularization weights. This issue arises specifically when the number of regions of interest is less than $2^{\lceil \log_2 N \rceil}$. Mansouri et al. (2006) proposed to assign an individual level set function to each object of interest (excluding the background), i.e. their method needs $N - 1$ non-overlapping level set functions to segment $N$ objects (Figure 12(b)). Chung and Vese (2009) proposed another method that uses a single level set function for multi-object segmentation: different layers (or levels) of the level set function represent different regions, as opposed to using only the zero level set (Figure 12(c)). None of the aforementioned continuous methods guarantees a globally optimal solution for multi-label problems. Pock et al. (2008) proposed a spatially continuous formulation of Ishikawa's multi-label problem. In their method, the non-convex variational problem is reformulated as a convex variational problem via a technique they called functional lifting. They used the following energy functional

$E(u) = \int_\Omega \rho(u(x), x)\,dx + \int_\Omega |\nabla u(x)|\,dx$ , (29)

which can be seen as the continuous version of Ishikawa's formulation (28). Here, $u : \Omega \to \Gamma$ is the unknown labeling function and $\Gamma = [\gamma_{\min}, \gamma_{\max}]$ is the range of $u$.

Figure 12: Multi-region level set methods proposed by (a) Vese and Chan (2002), (b) Mansouri et al. (2006), and (c) Chung and Vese (2009).

The first term in (29) is the data term, which can be a non-convex function, and the second term is the total variation regularization, which is convex. The idea of the functional lifting technique is to transfer the original problem formulation to a higher-dimensional space by representing $u$ in terms of its super level sets $\phi$, defined as:

$\phi(x, \gamma) = \begin{cases} 1 & \text{if } u(x) > \gamma \\ 0 & \text{otherwise.} \end{cases}$ (30)

Now, (29) can be re-written in terms of the super level set function as

$E(\phi) = \int_\Sigma \rho(x, \gamma)\,|\partial_\gamma \phi(x, \gamma)|\,d\Sigma + \int_\Sigma |\nabla \phi(x, \gamma)|\,d\Sigma$ , (31)

which is convex in $\phi$, where $\Sigma = \Omega \times \Gamma$. The minimization of $E(\phi)$ is nevertheless not a convex optimization problem, since $\phi : \Sigma \to \{0, 1\}$. Hence, $\phi$ is relaxed to vary in $[0, 1]$. We emphasize that the method of Pock et al. cannot always guarantee the globally optimal solution of the original problem (before $\phi$ is relaxed, i.e. when $\phi$ is binary). Brown et al. (2009) utilized the functional lifting technique of (Pock et al., 2008) and proposed a dual formulation for the multi-label problem; their method guarantees a globally optimal solution. Recently, inspired by Ishikawa, Bae et al. (2011b) proposed a continuous max-flow model for the multi-labeling problem via convex relaxed formulations. Not only can their continuous max-flow formulations obtain exact and global optimizers to the original problem, but they also showed that their method is significantly faster than the primal-dual algorithm of Pock et al. (2008).

4.6. Shape prior

Shape information is a powerful semantic descriptor for specifying targeted objects in an image. In our categorization, shape priors can be modelled in three ways: geometrical (template-based), statistical, and physical.

4.6.1. Geometrical model (template)

Sometimes the shape of the targeted object is known a priori (e.g. an ellipse or cup-like shape). In this case, the shape can be modelled either parametrically (e.g. an ellipse can be parametrized by its center coordinates, major and minor radii, and orientation) or in a non-parametric way (e.g. by its level set representation) and incorporated into a segmentation framework.

One way to incorporate a geometrical shape model into a segmentation framework is to penalize any deviation from the model. In the continuous domain, given two shapes represented by their signed distance functions $\phi_1$ and $\phi_2$, a simple way to calculate the dissimilarity between them is $\int_\Omega (\phi_1 - \phi_2)^2\,dx$. The problem with this measure is that it depends on $\Omega$, i.e. as the size of $\Omega$ increases, the difference becomes larger. An alternative is to constrain the integral to the domain of $\phi_1$, i.e. $\int_\Omega (\phi_1 - \phi_2)^2 H(\phi_1)\,dx$, as proposed in (Rousson and Paragios, 2002). The aforementioned formulas are usable if the pose of the object of interest (location, rotation and scale) is known. If the pose of the object is unknown, one can include the pose parameters in the shape energy term and optimize the energy functional with respect to both the pose parameters and the level set. For example, the authors in (Chen et al., 2002) imposed the shape prior on the extracted contour after each iteration of their level set-based algorithm. Pluempitiwiriyawej et al. (2005) described the shape of an ellipse with five parameters, including its pose parameters, and optimized their energy functional by alternating between optimizing the shape energy term and the regional term.
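Both dissimilarity measures can be evaluated on discrete signed distance maps (a sketch assuming the convention that $\phi > 0$ inside the shape; pixel sums approximate the integrals):

```python
import numpy as np

def shape_dissimilarity(phi1, phi2, restrict=True):
    """Dissimilarity between two shapes given as signed distance maps
    (assumed positive inside the shape). With restrict=True the squared
    difference is integrated only over the interior of phi1, i.e.
    weighted by the Heaviside H(phi1), following Rousson and Paragios
    (2002); with restrict=False the whole domain contributes."""
    phi1 = np.asarray(phi1, float)
    phi2 = np.asarray(phi2, float)
    diff2 = (phi1 - phi2) ** 2
    if restrict:
        diff2 = diff2 * (phi1 > 0)  # Heaviside of phi1
    return diff2.sum()              # pixel-area approximation of the integral

phi_a = np.array([[-1., 1.], [-1., 1.]])
phi_b = np.array([[-3., 1.], [-1., -1.]])
```

Note how the unrestricted measure also accumulates differences far outside the shape, which is exactly the domain dependence the restricted version removes.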

In the discrete domain, the method of Slabaugh and Unal (2005) is one of the first works to incorporate an explicit shape model into a graph-based segmentation framework. They proposed the following extra term (in addition to the data and regularization terms) that constrains the segmentation to return an elliptical object:

$E_{ellipse}(f, \theta) = \sum_{i \in \mathcal{P}} |M^\theta_i - f_i|$ , (32)

where $M^\theta$ is the mask of an ellipse parametrized by $\theta$. As minimizing such a term is not straightforward, the authors optimize the energy functional iteratively, i.e. by finding the best $f$ for a fixed $\theta$ and then optimizing $\theta$ for a fixed $f$. For complex shapes that are hard to parametrize, an alternative approach is to fit a shape template to the current segmentation, as proposed in (Freedman and Zhang, 2005). Veksler (2008) proposed to incorporate a more general class of shapes, known as star shapes, into graph-based segmentation. In Veksler's work, it is assumed that the center point ($c$) of the object is given. According to their definition, "an object has a star shape if for any point $p$ inside the object, all points on the straight line between the center $c$ and $p$ also lie inside the object" (Figure 13). The following pairwise term was introduced to impose the star shape prior:

$E^{Star}_{pq}(f_p, f_q) = \begin{cases} 0 & \text{if } f_p = f_q \\ \infty & \text{if } f_p = 1 \text{ and } f_q = 0 \\ \beta & \text{if } f_p = 0 \text{ and } f_q = 1 \end{cases}$ . (33)

This prior is particularly useful for the segmentation of convex objects, e.g. optic cup and disc segmentation (Bai et al., 2014).
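The pairwise table (33) translates directly to code (a sketch, following the convention of Figure 13 that $q$ lies between the center $c$ and $p$ on the ray through $c$):

```python
INF = float("inf")

def star_pairwise(fp, fq, beta=0.1):
    """Star shape pairwise term of eq. (33). fp is the label of the pixel
    farther from the center c along a ray through c, fq the label of the
    closer one: labeling p as object while q is background is forbidden
    (infinite cost), the reverse costs beta, agreement is free."""
    if fp == fq:
        return 0.0
    if fp == 1 and fq == 0:
        return INF
    return beta
```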

Figure 13: A star shape with a given center $c$. $p$ and $q$ are two points on the line passing through $c$. If $p$ is labeled as object, then $q$ must also be labeled as object.

4.6.2. Statistical model

In many practical applications, objects of the same class are not identical or rigid. For example, in medical images, the shapes of organs vary from one subject to another, or even over time, so assuming a fixed shape template may be inappropriate. A typical way to capture the intra-class variation of shapes is to build a shape probability model, i.e. $P(\text{shape})$. Two questions must then be investigated: 1) how to represent a shape: explicitly (e.g. point cloud), implicitly (e.g. level set), boundary-based (e.g. surface mesh) or medial-based (e.g. m-reps (Pizer et al., 2003)); and 2) what probability distribution model to adopt, e.g. a Gaussian distribution, a Gaussian mixture model, or kernel density estimation (KDE).

Cootes et al. (1995) generated a compact shape representation and performed PCA (assuming a Gaussian distribution) on a set of training shapes to obtain the main modes of variation. The idea is to model the plausible deformations of an object's shape ($S$) by its principal modes of variation:

$S = \bar{S} + \sum_{i=1}^{k} w_i P_i$ , (34)

where $\bar{S}$ is the average shape, $P_i$ is the $i$th principal component and $w_i$ is its corresponding weight (or shape parameter). Cootes et al. (1995) used the object's coordinates to represent $S$. Given an initial estimate of the position of an object, the segmentation is performed by directly optimizing an energy functional over the weights $w_i$. This model was later improved by Tsai et al. (2001, 2003), Leventon et al. (2000b) and Van Ginneken et al. (2002). For example, Leventon et al. (2000b) represent $S$ by its level sets to automatically handle topological changes during the contour evolution. Tsai et al. (2003) used the same level set-based shape representation as Leventon et al. (2000b) and incorporated the shape prior into a region-based energy functional, as opposed to the edge-based energy proposed in (Cootes et al., 1995). Van Ginneken et al. (2002) proposed to use a general set of local image structure descriptors, including the moments of local histograms, instead of the normalized first order derivative profiles used in Collins et al. (1995).
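A toy version of the point-based statistical shape model (34) (the square shape and the stretching mode below are fabricated purely for illustration; real models are built from aligned training segmentations):

```python
import numpy as np

# Training shapes are aligned point clouds flattened into vectors; PCA
# yields the mean shape S_bar and principal modes P_i, and plausible
# shapes are synthesized as S = S_bar + sum_i w_i * P_i, cf. eq. (34).
mode = np.array([0., -1., 0., -1., 0., 1., 0., 1.])  # vertical stretching
mode = mode / np.linalg.norm(mode)
square = np.array([0., 0., 1., 0., 1., 1., 0., 1.])  # 4 corners as (x, y) pairs
shapes = np.stack([square + t * mode for t in (-0.2, 0.0, 0.2)])

S_bar = shapes.mean(axis=0)
U, s, Vt = np.linalg.svd(shapes - S_bar, full_matrices=False)
P1 = Vt[0]                    # first principal mode of variation

new_shape = S_bar + 0.3 * P1  # synthesize a shape via eq. (34) with w_1 = 0.3
```

Since the training set varies along a single direction, PCA recovers that direction (up to sign) as the first mode.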

Similar to Tsai et al. (2003) in the continuous domain, Zhu-Jacquot and Zabih (2007) employed an iterative approach that accounts for shape variability in a graph-based setting. At each iteration, they optimize the weights of the principal modes of variation and the set of rigid transformation parameters given a tentative segmentation. Then, the segmentation is updated given the fitted shape template by minimizing an energy functional consisting of a regional term. The procedure is repeated until convergence. Recently, Andrews et al. (2014) proposed a probabilistic framework that incorporates a shape prior to segment multiple anatomical structures. They utilized PCA in the isometric log-ratio space, since PCA assumes that the data lie in unconstrained real Euclidean space. This assumption does not hold for probabilistic data, whose sample space is the unit simplex; applying PCA directly may generate invalid probabilities, and hence invalid shapes.

In the above-mentioned PCA-based methods, aligning the shapes before computing the principal modes of variation is necessary, and performing this alignment often requires point-to-point correspondences between landmarks of different subjects, which can be tedious to provide. Hence, some methods propose to capture shape variations in the frequency domain by representing shapes with the coefficients of the discrete cosine transform (DCT) (Hamarneh and Gustavsson, 2000), the Fourier transform (Staib and Duncan, 1992) or the spherical wavelet transform (Nain et al., 2006).

While PCA is a popular linear dimensionality reduction technique, it has the restrictive assumption that the input data are drawn from a Gaussian distribution. If the shape variation does not follow a Gaussian distribution, we might end up with invalid shapes or be unable to represent valid shapes. In this case, a more accurate estimation of the shape parameters might be obtained with Gaussian mixture models, as proposed in (Cootes and Taylor, 1999). In addition, PCA is only capable of describing global shape variations, i.e. changing a parameter corresponding to one eigenvector deforms the entire shape, which makes it difficult to obtain a proper local segmentation. To control the statistical shape parameters locally, Davatzikos et al. (2003) presented a hierarchical formulation of active shape models using the wavelet transform. Their wavelet-based encoding of deformable contours is followed by PCA. The statistical properties extracted by PCA are then used as priors on the contour's deformation, some of which capture global shape characteristics of the object boundaries while others capture local and high-frequency shape characteristics. Hamarneh et al. (2004) also proposed a method to locally control the statistical shape parameters. They used a medial-based profile for shape representation and developed spatially-localized feasible deformations using hierarchical (multi-scale) and regional (multi-location) PCA, deforming the medial profile at certain locations and scales. Uzumcu et al. (2003) proposed to use independent component analysis (ICA) instead of PCA; ICA does not assume a Gaussian distribution of the input data and can better capture localized shape variations. However, the ICA representation of shape variability is not as compact as that of PCA. Ballester et al. (2005) proposed to use principal factor analysis (PFA) as an alternative to PCA. PFA represents the observed $D$-dimensional data $O$ as a linear function $F$ of an $L$-dimensional ($L < D$) latent variable $z$ plus an independent Gaussian noise term: $F(O) = \Lambda z + \mu + err$, where $\Lambda$ is the $D \times L$ factor loading matrix defining the linear function $F$, $\mu$ is a $D$-dimensional vector representing the mean of the distribution of $O$, and $err$ is a $D$-dimensional vector representing the noise variability associated with each of the $D$ observed variables. As PFA models the covariance between variables and generates "interpretable" modes of variation, while PCA determines the factors that account for the total variance, (Ballester et al., 2005) argued that PFA is not only adequate for the study of shape variability but also gives better "interpretability" than PCA, and thus concluded that PFA is better suited for medical image analysis.

Nevertheless, both PCA and ICA are linear factor analysis techniques, which makes it difficult for them to model non-linear shape variations. Kernel PCA (Scholkopf et al., 1998) and kernel density estimation (KDE) are two alternatives for describing non-linear data. The works of (Cremers et al., 2006; Kim et al., 2007; Lu et al., 2012) are examples that used such non-linear dimensionality reduction techniques to incorporate shape priors into image segmentation frameworks. For more information about other linear and non-linear factor analysis techniques, we refer the reader to (Fodor, 2002; Bowden et al., 2000).

In addition to representing shapes as a set of points (as usually done in, e.g., PCA, cf. (34)), shapes can be described by distance and angle information between different anatomical landmarks (Wang et al., 2010; Nambakhsh et al., 2013). For example, Wang et al. (2010) proposed a scale-invariant shape description by measuring the relative distances between pairs of landmarks in a triplet, while Nambakhsh et al. (2013) model the shape of the left ventricle (LV) of the heart by calculating the distance between each point on the LV surface and a user-provided reference point in the middle of the LV. More reviews on statistical shape models for 3D medical image segmentation can be found in (Heimann and Meinzer, 2009).

Besides the aforementioned statistical methods, some methods employ learning algorithms to impose a shape model on the segmentation (Zhang et al., 2012; Kawahara et al., 2013). In (Zhang et al., 2012), the authors proposed a deformable segmentation method based on sparse shape composition and dictionary learning. In another work, Kawahara et al. (2013) augmented the auto-context method (Tu and Bai, 2010) and trained sequential classifiers for segmentation. Auto-context (Tu and Bai, 2010) is an iterative learning framework that jointly learns the appearance and regularization distributions, where the predicted labels from the previous iteration are used as input to the current iteration. Kawahara et al. (2013) used auto-context to learn what shape features (e.g. the volume of a segmentation) a good segmentation should have.

4.6.3. Physical model

In some medical applications, the biomechanical characteristics of tissues can be estimated and modeled in a segmentation framework as additional prior information, thereby leading to more reliable segmentations.

The incorporation of material elasticity properties into image segmentation was first introduced in 1988 by Kass et al. (1988), in which spring-like forces between the snake's points are enforced.


Following Kass's snakes model, several researchers examined ways to extract vibrational (physical) modes of shapes based on the finite element method (FEM); these include the methods proposed by Karaolani et al. (1989), Nastar and Ayache (1993) and Pentland and Sclaroff (1991). In these frameworks, an object is modelled based on its vibrational modes, similar to (34), where $P_i$ is a vibrational mode rather than a statistical mode.

When the physical characteristics of a tissue are known and several samples of the same tissue are available, one can take advantage of both statistical and physical models to obtain a more accurate segmentation, as done by Cootes and Taylor (1995) and Hamarneh et al. (2008), where statistical and vibrational modes of variation are combined into a single objective function.

Schoenemann and Cremers (2007) encoded an elastic shape prior into a segmentation framework by combining the shape matching and segmentation tasks. Given a shape template, they proposed an elastic shape matching energy term that maps the points of the evolving shape to the template based on two criteria: 1) points of similar curvature should be matched, and 2) a curve piece of the evolving shape should be matched to a piece of the template of equal size. Their method achieves globally optimal solutions.

4.7. Topological prior

Many anatomical objects in medical images have a specific topology that has to be preserved after segmentation in order to obtain plausible results. There are two types of topology specification in the literature: connectivity and genus. Connectivity specification ensures that the segmentation of a single object is connected³. The genus information ensures that the final segmentation does not have any void regions (if the object is known to be connected) or does not incorrectly fill void regions when the object is known to have internal holes (Grady, 2012). For example, a doughnut-shaped initial segmentation should keep its doughnut shape during the segmentation process.

Han et al. (2003) proposed a level set-based method for segmenting objects with topology preservation. Their method is based on the simple point concept from digital topology (Bertrand, 1994). A simple point is one that does not change the topology of the segmentation when it is added to or removed from the segmentation. Specifically, the proposed method checks the topological number at each iteration to detect topological changes during the contour evolution. If the segmentation algorithm adds or removes only simple points from an initial segmentation, then the new segmentation will have the same genus as before.

Inspired by Han et al. (2003), Zeng et al. (2008) introduced topology cuts and cast the formulation of Han et al. (2003) in a discrete setting. They showed that the optimization of their energy functional with topology preservation is NP-hard. In another work, Vicente et al. (2008) proposed an interactive method in the discrete domain to segment objects with

³Formally, a segmentation S is connected if ∀x, y ∈ S, ∃ Path_{xy} s.t. if z ∈ Path_{xy}, then z ∈ S.

Figure 14: The effect of incorporating topological constraints into medical image segmentation frameworks. Top row: Segmentation of carpal bones in a CT image (Han et al., 2003). Bottom row: Segmentation of cardiac ventricles in an MR image (Zeng et al., 2008). (a) Initialization. (b,c) Segmentation without (b) and with (c) topological constraints.

topology preservation. Their algorithm guarantees the connectivity between two designated points. Further, the authors showed that their method can, under some conditions, find the global optimum. Figure 14 shows examples of encoding topological constraints in segmenting carpal bones and cardiac ventricles.

4.8. Moment prior

In most segmentation methods that impose a shape prior, deviations of the observed shape from the training shapes are suppressed by the imposed prior. This is undesirable in medical image segmentation where pathological cases occur (i.e. abnormal cases that deviate from the training shapes of healthy organs). Lower-order moment constraints can be an alternative that avoids this limitation.

• 0th order moment (size/area/volume): The 0th order moment corresponds to the size of an object. Ayed et al. (2008) proposed to add an area prior to the level set framework to speed up the curve evolution and to prevent leakage in the final segmentation. Given an image I and the approximate area value of the targeted object (A), their area energy term is defined as:

E_Area(x) = (1/A²) ( ∫_{Ω_in} dx − A )² ∫_{Ω_in} g(I(x)) dx , (35)

where Ω_in is the region inside the current segmentation and g(·) attracts the evolving contour toward high-gradient regions (object boundaries).

• 1st order moment (location/centroid): In case of having some rough information about the centroid of the targeted object, this valuable information can be encoded into a segmentation framework using the 1st order moment, as proposed in (Klodt and Cremers, 2011) (see below for more details).

• Higher-order moment: Generally, we can impose moment constraints of any order to refine the segmentation and capture fine-scale shape details. Foulonneau et al. (2006) proposed to encode higher-order moments into a level set framework using a local optimization scheme. Recently, Klodt and Cremers (2011) proposed a convex formulation to encode moment constraints. They used an objective function of the form

E(u) = ∫_Ω ρ(x) u(x) dx + ∫_Ω g(x) |Du(x)| dx , (36)

where u ∈ BV : R^d → {0, 1} is the labeling function and Du is the distributional derivative (Du(x) = ∇u(x) for a differentiable u). Relaxing u to vary between 0 and 1, (36) becomes a convex optimization problem over the convex set BV : R^d → [0, 1]. The global minimizer of the original problem (E(u) before relaxing u) is obtained by finding the global minimum of the relaxed energy functional, u*, and thresholding u* by a value µ ∈ (0, 1).

Klodt and Cremers (2011) imposed the 0th order moment (i.e. an area constraint in a 2D image) by bounding the area of u between c1 and c2, where c1 ≤ c2, such that u lies in the set

C_0 = { u | c1 ≤ ∫_Ω u dx ≤ c2 } . (37)

The exact area prior can be imposed by setting c1 = c2. The 1st moment (i.e. a centroid constraint) is imposed by constraining the solution u to be in the set C_1:

C_1 = { u | µ1 ≤ (∫_Ω x u dx) / (∫_Ω u dx) ≤ µ2 } , (38)

where µ1, µ2 ∈ R^d. The set C_1 ensures that the centroid of the segmented object lies between µ1 and µ2. The centroid is fixed when µ1 = µ2.

In general, the nth order moment constraint is imposed as:

C_n = { u | A1 ≤ (∫_Ω (x1 − µ1)^{i1} · · · (xd − µd)^{id} u dx) / (∫_Ω u dx) ≤ A2 } , (39)

where i1 + · · · + id = n, and A1, A2 ∈ R^{d×d} are symmetric matrices with A1 ≤ A2 element-wise. Klodt and Cremers (2011) proved that all these sets are convex. In their work, the above constraints (37)-(39) are all hard constraints. Alternatively, all of the aforementioned constraints can be enforced as soft constraints by including them in the energy functional using Lagrange multipliers. Klodt and Cremers (2011) mentioned that, in practice, imposing moments of order higher than 2 is not very useful, as users cannot interpret these moments visually and the improvements are very small.
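As an illustration, a hard area constraint like the set C_0 of Eq. (37) can be enforced within a projected-descent scheme by projecting the relaxed labeling u onto C_0 after each step. A minimal sketch (the function name and bisection scheme are ours, not from Klodt and Cremers (2011); it assumes 0 ≤ c1 ≤ c2 ≤ N for N pixels):

```python
import numpy as np

def project_area(u, c1, c2, iters=60):
    """Euclidean projection of a relaxed labeling u onto the 0th-moment set
    C0 = {u in [0,1]^N : c1 <= sum(u) <= c2} of Eq. (37).  The projection is
    clip(u + lam, 0, 1), with the scalar shift lam found by bisection."""
    s = np.clip(u, 0, 1).sum()
    if c1 <= s <= c2:
        return np.clip(u, 0, 1)          # already feasible
    target = c1 if s < c1 else c2        # project onto the nearest face
    lo, hi = -1.0, 1.0
    while np.clip(u + lo, 0, 1).sum() > target:  # grow bracket downward
        lo *= 2
    while np.clip(u + hi, 0, 1).sum() < target:  # grow bracket upward
        hi *= 2
    for _ in range(iters):               # the clipped sum is monotone in lam
        mid = 0.5 * (lo + hi)
        if np.clip(u + mid, 0, 1).sum() < target:
            lo = mid
        else:
            hi = mid
    return np.clip(u + 0.5 * (lo + hi), 0, 1)
```

Setting c1 = c2 recovers the exact-area case mentioned above; the same shift-and-clip structure extends to the soft (Lagrangian) variant.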

Figure 15: CT segmentation (b) without and (c) with moment constraints. The area constraint limits the segmentation to the size of the ellipse clicked by the user (a), yielding more accurate results. (Images adopted from Klodt and Cremers (2011))

In the discrete settings, Lim et al. (2011) encoded area, centroid and covariance (2nd order) constraints into a graph-based method. While their method does not guarantee a globally optimal solution, it can impose non-linear combinations of the aforementioned constraints, as opposed to (Klodt and Cremers, 2011).

Figure 15 illustrates an example application of moment constraints in CT segmentation.

4.9. Geometrical and region interactions prior

Anatomical objects often consist of multiple regions, each with a unique appearance model, and each has meaningful geometrical relationships or interactions with other regions of the object. Over the past decade, much attention has been given to incorporating geometrical constraints into the segmentation objective function.

In the continuous domain, several methods have been proposed based on coupled surfaces propagation to segment a single object in an image (Zeng et al., 1998; Goldenberg et al., 2002; Paragios, 2002). Vazquez-Reina et al. (2009) defined elastic coupling between multiple level set functions to model ribbon-like partitions. However, their approach was not designed to handle interactions between more than two regions.

Nosrati and Hamarneh (2014) augmented the level set framework with the ability to handle two important and intuitive geometric relationships, containment and exclusion, along with a distance constraint between the boundaries of multi-region objects (Figures 16 and 17). Level sets' important property of automatically handling topological changes of evolving contours/surfaces enables them to segment spatially recurring objects (e.g. multiple instances of multi-region cells in a large microscopy image) while satisfying the two aforementioned constraints.

Bloch (2005) briefly reviewed the main fuzzy approaches that define spatial relationships, including topological relations (i.e. set relationships and adjacency) as well as metrical relations (i.e. distances and directional relative position).

None of the aforementioned methods guarantees a globally optimal solution. In contrast, Wu et al. (2011) proposed a method that yields a globally optimal solution for segmenting a region bounded by two coupled terrain-like surfaces. They do so by minimizing the intraclass variance. Despite the global


Figure 16: Urethra segmentation in a histology image: (a) original image; (b) initialization; (c) no geometric constraint; (d) with geometric constraints. The constraint is set such that the urethra epithelium (A) contains the urethra lumen (B) and excludes the other regions with intensity similar to B, i.e. the corpus spongiosum (C). Here A, B and C are represented by red, green and blue, respectively. (Images adopted from (Nosrati and Hamarneh, 2014))

Figure 17: Brain dPET segmentation. (a) Raw image. (b) Ground truth. (c) Initialization. (d) No geometric constraint. (e) Result with containment but without exclusion constraint. (f) Result with containment and exclusion constraints. (g) Color coding of the employed geometric constraints: the skull contains the gray matter; the gray matter contains the white matter and cerebellum; the white matter contains the putamen; the putamen and cerebellum are mutually excluded. Note how the putamen is contained by the white matter (red) as it should be, whereas (d) and (e) are anatomically incorrect. (Images from (Nosrati and Hamarneh, 2014))

optimality of their solutions, their method can only segment a single object in an image and is limited to handling objects that can be "unfolded" into two coupled surfaces.

Ukwatta et al. (2012) also proposed a method based on coupling two surfaces for carotid adventitia (AB) and lumen-intima (LIB) segmentation. The advantage of their work over previous works is that they optimized their energy functional by means of convex relaxation. However, their method could only segment objects with coupled surfaces. Using the same framework as (Ukwatta et al., 2012), Rajchl et al. (2012) presented a graphical model to segment the myocardium, blood cavities and scar tissue. Their method used seed points as hard constraints to distinguish the background from the myocardium. Nambakhsh et al. (2013) proposed an efficient method for left ventricle (LV) segmentation that iteratively minimizes a convex upper bound energy functional for a coupled surface. Their method implicitly imposes a distance between the two surfaces by learning the LV shape. Recently, Nosrati et al. (2013) proposed a convex formulation to incorporate containment and detachment constraints between different regions with a specified minimum distance between their boundaries. Their framework used a single labeling function to encode such constraints while maintaining global optimality. They showed that their convex continuous method is superior to other state-of-the-art methods, including its discrete counterpart, in terms of memory usage and metrication errors.

In the discrete domain, Li et al. (2006) proposed a method to segment "nested objects" by defining distance constraints between the object's surfaces with respect to a center point. As their formulation employed polar coordinates, their method could only handle star-shaped objects. Containment and exclusion constraints between distinct regions have been encoded into a graph-cut framework by (Delong and Boykov, 2009) and (Ulen et al., 2013). If only the containment constraint is enforced, both approaches guarantee the global solution. More formally, for a two-region object scenario (regions A, B and background), the idea of (Delong and Boykov, 2009) is to create two graphs for A and B, i.e. G(V_A, E_A) and G(V_B, E_B). The segmentations of A and B are represented by the binary variables x_A and x_B, respectively. The geometrical constraints between regions A and B are enforced by adding an additional penalty term W^{AB}, defined in Table 2. This interaction term, W, is implemented in the graph construction by adding inter-layer edges with infinite weights, as shown in Figure 18(a). Delong and Boykov (2009) employed the interaction term W as follows:

∑_{pq ∈ N_AB} W^{AB}_{pq}( f^A_p , f^B_q ) , (40)

where N_AB is the set of all pixel pairs (p, q) at which region A is assigned some geometric interaction with region B. Table 2 lists the energy terms for the region interaction constraints proposed in (Delong and Boykov, 2009). Since the energy terms for containment are submodular, their graph-based method guarantees the globally optimal solution if only containment terms are used. However, the energy for the exclusion constraint is nonsubmodular and thus harder to optimize. In some cases, because exclusion is nonsubmodular everywhere, it is possible to make all these nonsubmodular terms submodular by flipping the meaning of layer B's variables so that f^B_p = 0 designates the region's interior. Nonetheless, there are many useful interaction energy terms that cannot be modelled and optimized efficiently by (Delong and Boykov, 2009), and other approximations like quadratic pseudo-boolean optimization (QPBO) (Kolmogorov and Rother, 2007; Rother et al., 2007) or αβ-swap (Boykov et al., 2001) should be used for the optimization of these terms.

4.10. Spatial distance prior

In the literature, works that incorporate spatial distance priors may be categorized as follows:

• Minimum distance: In some applications the minimum distance between two structures must be enforced to ensure that sufficient separation between regions exists


Figure 18: Enforcing a containment constraint between objects A and B. Graph constructions used to enforce (a) 'A contains B' as proposed by (Delong and Boykov, 2009), and (b) a 1-pixel distance between the boundaries of regions A and B (shown for a single pixel), respectively.

Table 2: Energy terms for encoding containment and exclusion constraints between regions A and B in (40) (Delong and Boykov, 2009).

A contains B:
  f^A_p  f^B_q  W^{AB}_{pq}
  0      0      0
  0      1      ∞
  1      0      0
  1      1      0

A excludes B:
  f^A_p  f^B_q  W^{AB}_{pq}
  0      0      0
  0      1      0
  1      0      0
  1      1      ∞

to obtain plausible results (e.g. the distance between the carotid AB and LIB). Examples of methods that employ this constraint include (Zeng et al., 1998; Goldenberg et al., 2002; Paragios, 2002; Nosrati and Hamarneh, 2014; Nosrati et al., 2013) in the continuous settings, and (Wu et al., 2011; Li et al., 2006; Delong and Boykov, 2009; Ulen et al., 2013) in the discrete settings. Looking at (40) for example, Delong and Boykov (2009) (and similarly, Ulen et al. (2013)) enforce a minimum distance between two regions by defining the set N_AB in (40). Figure 18(b) shows how a 1-pixel margin between region boundaries is enforced by (Delong and Boykov, 2009; Ulen et al., 2013).

• Maximum distance: In other medical applications, the maximum distance between regions is known a priori. For example, in cardiac LV segmentation, the maximum distance between the LV and its myocardium can be approximated. Enforcing a maximum distance between the LV and myocardium boundaries prevents the myocardium segmentation from growing too far from the LV. A maximum distance between two boundaries/surfaces is enforced as proposed in (Zeng et al., 1998; Goldenberg et al., 2002; Paragios, 2002; Nosrati and Hamarneh, 2014) in the continuous settings. There is not much work on incorporating maximum distance between region boundaries in discrete settings, except for Wu et al. (2011) and Schmidt and Boykov (2012). In (Wu et al., 2011), the maximum distance, along with a minimum distance prior, can be enforced for segmenting two-region ribbon-like objects. To the best of our knowledge, the only work that solely focused on incorporating maximum distance between regions for multi-region object segmentation is the approach proposed by (Schmidt and Boykov, 2012). They modified the framework of (Delong and Boykov, 2009) by adding a Hausdorff distance prior to the MRF-based segmentation framework to impose maximum distance constraints. They showed that incorporating this prior into multi-surface segmentation is NP-hard due to the existence of supermodular energy terms.

• Attraction/repulsion distance: In applications like multi-region cell segmentation, the distance between regions should lie in a specific range. A specific distance between different regions can be maintained by enforcing attraction and repulsion forces between region boundaries, as proposed in (Zeng et al., 1998; Goldenberg et al., 2002; Paragios, 2002; Nosrati and Hamarneh, 2014) in the continuous settings. Vazquez-Reina et al. (2009) specifically focused on attraction/repulsion interaction between two boundaries. They defined elastic couplings between level set functions using dynamic force fields to model ribbon-like partitions. Note that none of the above mentioned methods guarantees the globally optimal solution.

In the discrete domain, Wu et al. (2011) can impose an attraction/repulsion force between two surfaces by controlling the minimum and maximum distances between them. Delong and Boykov (2009), and similarly (Ulen et al., 2013; Schmidt and Boykov, 2012), enforced attraction/repulsion between pairs of regions, e.g. A and B, by penalizing the intersection of A and B (i.e. the area/volume of A − B). Such constraints are encoded in (40) using the penalty terms shown in Table 3.

Table 3: Energy terms for encoding containment with attraction/repulsion between regions A and B.

A attracts B:
  f^A_p  f^B_q  W^{AB}_{pq}
  0      0      0
  0      1      0
  1      0      α
  1      1      0

In the containment setting mentioned in Table 2, replacing the infinity value with a positive value W^{AB}_{pq}(0, 1) > 0 creates a spring-like repulsion force between the inner and outer boundaries. Hence, an attraction/repulsion constraint is similar to a containment constraint but with a different orientation.

In graph-based methods, e.g. (Delong and Boykov, 2009; Ulen et al., 2013), increasing the distance (or thickness) between regions requires more edges to be added to the underlying graph, which increases memory usage and computation time. In fact, to impose a distance constraint of w


pixels between two regions, (Delong and Boykov, 2009) and (Ulen et al., 2013) need to add O(w²) extra edges per pixel. Therefore, although these graph-based methods are highly efficient in segmenting images of reasonable size and thickness constraint, they are not as efficient for large distance constraints. On the other hand, the memory usage and time complexity of the continuous methods (e.g. the works proposed by (Nosrati and Hamarneh, 2014; Nosrati et al., 2013)) are independent of thickness constraints.

In addition to the above mentioned approaches, methods based on the artificial life framework (deformable organisms) also employ spatial distance constraints to maintain the organism's structure (Hamarneh et al., 2009; Prasad et al., 2011). In these models, the deformable organism evolves in a restricted way such that the distance between its skeleton and its boundary is restricted to be within a certain range.

4.11. Adjacency prior

Recently, several methods have focused on ordering constraints and adjacency relationships on labels for semantic segmentation. As an example, "sheep" and "wolf" are unlikely to be next to each other, and a label transition from "sheep" to "wolf" should be penalized (Strekalovskiy et al., 2012).

In the discrete settings, Liu et al. (2008) proposed a graph-based method to incorporate label ordering constraints in scene labeling and tiered⁴ segmentation. They assumed that an image is to be segmented into five parts ("centre", "left", "right", "above" and "bottom") such that a pixel labeled as "left" cannot be on the right of any pixel labeled as "center", etc. Liu et al. (2008) encoded such constraints into the pairwise energy term (regularization), i.e. ∑_{(p,q)∈N} V_pq(f_p, f_q). For example, if pixel p is immediately to the left of q, to prohibit f_p = "center" and f_q = "left", one defines V_pq("center", "left") = ∞. Generalizing this rule to other cases gives the following settings for V_pq:

p is the left neighbour of q:
  f_p\f_q   L      R      C      T      B
  L         0      ∞      w_pq   w_pq   w_pq
  R         ∞      0      ∞      ∞      ∞
  C         ∞      w_pq   0      ∞      ∞
  T         ∞      w_pq   ∞      0      ∞
  B         ∞      w_pq   ∞      ∞      0

p is the top neighbour of q:
  f_p\f_q   L      R      C      T      B
  L         0      ∞      ∞      ∞      w_pq
  R         ∞      0      ∞      ∞      w_pq
  C         ∞      ∞      0      ∞      w_pq
  T         w_pq   w_pq   w_pq   0      ∞
  B         ∞      ∞      ∞      ∞      0
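The ordering tables above can be transcribed directly into code. A small sketch (the names are ours) that builds the left-neighbour table and checks a forbidden configuration:

```python
INF = float('inf')
LABELS = ("L", "R", "C", "T", "B")

def v_left(w):
    """Pairwise ordering term V_pq(f_p, f_q) of Liu et al. (2008) for the
    case where p is the immediate LEFT neighbour of q; w > 0 is the usual
    pairwise smoothness weight."""
    rows = {
        "L": (0,   INF, w,   w,   w),
        "R": (INF, 0,   INF, INF, INF),
        "C": (INF, w,   0,   INF, INF),
        "T": (INF, w,   INF, 0,   INF),
        "B": (INF, w,   INF, INF, 0),
    }
    return {(fp, fq): rows[fp][j]
            for fp in LABELS for j, fq in enumerate(LABELS)}

V = v_left(1.0)
# V[("C", "L")] is infinite: "left" may never appear to the right of "center";
# V[("L", "C")] is just the finite smoothness cost w.
```

Any labeling whose pairwise sum includes an infinite entry is infeasible, which is how the tiered layout is enforced without changing the data term.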

Figure 19 illustrates an example of a tiered labelling. From an optimization point of view, and according to (Liu et al., 2008),

⁴The tiered labeling problem partitions an input image into multiple horizontal and/or vertical tiers.

the α-expansion technique is more likely to get stuck in a local minimum when ordering constraints are used, as α-expansion acts on a single label (α) at each move. In order to improve on α-expansion moves, the authors introduced horizontal and vertical moves and allowed a pixel a choice of labels to switch to, as opposed to just a single label α. Although their proposed optimization approach leads to better results (compared to the α-expansion approach), the globally optimal solution is still not guaranteed. Felzenszwalb and Veksler (2010) proposed an efficient dynamic programming algorithm that imposes similar constraints to (Liu et al., 2008) but with much lower complexity. Their method computes the globally optimal solution in the class of tiered labelings.

In the continuous domain, Strekalovskiy and Cremers (2011) proposed a generalized label ordering constraint which can enforce many complex geometric constraints while maintaining convexity. This method requires that the constraint term obeys the triangle inequality, a requirement that was later relaxed by introducing a convex relaxation method for non-metric priors (Strekalovskiy et al., 2012). To do so, the authors enforce non-metric label distances in order to model arbitrary probabilities for label adjacency. The distance between different labels⁵ operates only on directly neighbouring pixels. This often leads to artificial one-pixel-wide regions between labels to allow the transition between labels with very high or infinite distance. For example, both the "wolf" and "sheep" labels can be next to "grass", but they cannot be next to each other (Strekalovskiy et al., 2012). The method proposed in (Strekalovskiy et al., 2012) would create an artificial "grass" region between "wolf" and "sheep" to allow for this transition. Obviously, this one-pixel-wide buffer between "wolf" and "sheep" would not make the sheep any safer! Generally, a neighbourhood larger than one pixel is needed to avoid these artificial labeling artifacts. Bergbauer et al. (2013) addressed this issue and proposed a morphological proximity prior for semantic image segmentation in a variational framework. The idea is to consider pixels as adjacent if they are within a specified neighbourhood of arbitrary size. Consider two regions i and j and their indicator functions u_i and u_j, respectively. To see if i and j are close to each other, the overlap between the dilation of the indicator function u_i, denoted by d_i, and the indicator function u_j is computed. The dilation of u_i is formulated as:

d_i(x) = max_{z∈S} u_i(x + z), ∀x ∈ Ω (41)

with a structuring element S. For each pair of regions i and j, a proximity penalty term is defined as:

∑_{1≤i≤j≤n} ∫_Ω A(i, j) d_i(x) u_j(x) dx , (42)

where A(i, j) indicates the penalty for the co-occurrence of label j in the proximity of label i, such that A(i, i) = 0. In

⁵Note that this is not a spatial distance but a distance between label classes.


Figure 19: Tiered labelling. (a) Input image. Segmentation result (b) without and (c) with label ordering constraints. (Images from (Strekalovskiy and Cremers, 2011))

Bergbauer et al. (2013), this penalty term is relaxed and added to an energy functional (along with regional and regularization terms), which is then optimized with the help of Lagrange multipliers.
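A discrete sketch of Eqs. (41)-(42), assuming soft indicator maps stacked as an (n, H, W) array and a square structuring element (the function name and stacking convention are ours, not from Bergbauer et al. (2013)):

```python
import numpy as np
from scipy.ndimage import grey_dilation

def proximity_penalty(u, A, size=5):
    """Morphological proximity prior: dilate each region indicator u[i]
    with a size x size structuring element (Eq. 41) and penalize its
    overlap with u[j], weighted by the co-occurrence penalty A[i, j]
    (Eq. 42).  u has shape (n, H, W); A is an n x n matrix with zero
    diagonal."""
    n = u.shape[0]
    d = np.stack([grey_dilation(u[i], size=(size, size)) for i in range(n)])
    total = 0.0
    for i in range(n):
        for j in range(i, n):        # sum over 1 <= i <= j <= n as in Eq. (42)
            total += A[i, j] * (d[i] * u[j]).sum()
    return total
```

With size=1 the dilation is the identity and only true overlaps are penalized; larger structuring elements also penalize labels that merely come close, which is exactly the behaviour needed to keep "wolf" and "sheep" apart by more than one pixel.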

To the best of our knowledge, the adjacency and proximity priors described above have not yet been utilized in medical image segmentation.

4.12. Number of regions/labels

In most segmentation problems, the number of regions is assumed to be known beforehand. However, this is not the case in many applications, and predefining a fixed number of labels in these cases often causes over-segmentation.

The intuitive way to handle this problem is to penalize the total number of labels. Given a maximum number of regions/labels (at most n labels), which is available in most applications, Zhu and Yuille (1996) proposed to partition images based on the following energy functional in the continuous domain:

min_{Ω_i} ∑_{i=1}^{n} [ ∫_{Ω_i} ρ(ℓ_i, x) dx + ∫_{∂Ω_i} ds ] + γM , (43)

where Ω_i is the region corresponding to label ℓ_i; ρ(ℓ_i, x) is the data term that encodes the model of ℓ_i at pixel x; the second term is the regularization term; and M in the third term is the number of non-empty partitions (known as the label cost prior). Zhu and Yuille (1996) optimized the above energy functional using a local optimization technique which converges to a local minimum. This approach was later adapted to the level set formulation by (Kadir and Brady, 2003; Ben Ayed and Mitiche, 2008; Brox and Weickert, 2006), allowing region-merging. A convex formulation of such a constraint was proposed by Yuan et al. (2012). They enforced the label cost prior in multi-label segmentation by solving the following convex optimization problem:

min_{u(x)} ∑_{i=1}^{n} [ ∫_Ω u_i(x) ρ(ℓ_i, x) dx + ∫_Ω |∇u_i(x)| dx ] + γ ∑_{i=1}^{n} max_{x∈Ω} u_i(x) ,

s.t. ∑_{i=1}^{n} u_i(x) = 1, u_i(x) ≥ 0; ∀x ∈ Ω. (44)

In the discrete domain, Delong et al. (2012a) developed an α-expansion method to optimize a general energy functional incorporating label costs in a graph-based framework.

Figure 20: Endoscopic video segmentation. (a) Result of the active contour without edges (blue arrows indicate errors). (b,c) Nosrati et al. (2014)'s method (b) without and (c) with motion prior. Green: kidney; red: tumor; yellow: ground truth. (Images adopted from (Nosrati et al., 2014))

Along with the unary (data) and pairwise (regularization) terms, Delong et al. (2012a) penalized each unique label that appears in the image by introducing the following term:

∑_{l∈L} h_l · δ_l(f) , (45)

δ_l(f) = { 1 if ∃p ∈ Ω : f_p = l
         { 0 otherwise, (46)

where h_l is the non-negative label cost of label l.
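Eqs. (45)-(46) reduce to paying h_l once for every label that is actually used. A one-line sketch (the function name is ours):

```python
import numpy as np

def label_cost(f, h):
    """Label-cost term of Eqs. (45)-(46): sum h[l] over every label l
    that appears at least once in the labeling f."""
    return sum(h[l] for l in np.unique(f))
```

For example, `label_cost(np.array([0, 0, 2]), {0: 1.0, 2: 2.5})` evaluates to 3.5; a solution that merges label 2 into label 0 would save its cost h[2], which is how this term discourages over-segmentation.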

4.13. Motion prior

The segmentation and tracking of moving objects in videos have a wide variety of applications in medical image analysis, e.g. in echocardiography (Dydenko et al., 2006). Paragios and Deriche (1999) used a motion prior to constrain the evolution of a level set function by integrating motion estimation and tracking into a level set-based framework. Dydenko et al. (2006) proposed a method to segment and track the cardiac structure in high frame rate echocardiographic images. The motion field is estimated from the level set evolution. Both (Paragios and Deriche, 1999) and (Dydenko et al., 2006) perform motion estimation under the constraint of an affine model. More complex motion priors are used in object tracking. For example, to track the LV in echocardiography, Orderud et al. (2007) employed the Kalman filter, an optimal recursive algorithm that uses a series of measurements observed over time to estimate the desired variables (i.e. displacement in motion estimation).
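To make the recursion concrete, here is a toy scalar Kalman predict/update cycle; all model parameters are illustrative defaults of ours, and Orderud et al. (2007) use a far richer state and measurement model:

```python
def kalman_step(x, P, z, F=1.0, Q=1e-3, H=1.0, R=1e-2):
    """One predict+update cycle of a scalar Kalman filter.
    x, P: state estimate and its variance; z: new measurement;
    F, Q: state transition and process noise; H, R: measurement model."""
    # Predict: propagate the state and inflate uncertainty by process noise.
    x_pred = F * x
    P_pred = F * P * F + Q
    # Update: blend prediction and measurement via the Kalman gain.
    K = P_pred * H / (H * P_pred * H + R)
    x_new = x_pred + K * (z - H * x_pred)
    P_new = (1 - K * H) * P_pred
    return x_new, P_new
```

Fed a stream of consistent measurements, the estimate converges toward the measured value while its variance shrinks, which is the behaviour that makes the filter attractive for frame-to-frame displacement estimation.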

Recently, Nosrati et al. (2014) proposed an efficient technique to segment multiple objects in intra-operative multi-view endoscopic videos based on priors captured from pre-operative data. Their method allows for the inclusion of a laparoscopic camera motion model to stabilize the segmentation in the presence of large occlusions (Figure 20). This feature is especially useful in robotic minimally invasive surgeries, as camera motion signals can be easily read through the robot's API and incorporated into their formulation.

4.14. Model/Atlas

Atlas-based segmentation has also been particularly useful in medical image analysis applications. Works that adopt an atlas-based approach include (Gee et al., 1993; Collins et al., 1995;


Collins and Evans, 1997; Iosifescu et al., 1997). An atlas has the ability to encode (non-pathological) spatial relationships between multiple tissues, anatomical structures or organs. In atlas-based image segmentation, the image is non-rigidly deformed and registered with a model or atlas that has been labelled previously. Applying the inverse transformation of the labels to the image space gives the segmentation. However, atlas-based segmentation has so far been restricted to a single (albeit multi-part or multi-region) object instance, and does not address spatially-recurring objects in the scene. Also, atlases are usually built from datasets of manually segmented images. These manual segmentations may not always be available, and/or cannot be used to define a representative template for a given object in a straightforward manner.

The performance of atlas-based segmentation techniques relies on an accurate registration. Surveying registration methods is beyond the scope of this paper. Interested readers may refer to (Sotiras et al., 2013), (Hill et al., 2001), and Tang and Hamarneh (2013) for more details on image registration.

In the field of computer vision (non-medical), a few techniques have used 3D models of objects (more realistic but more complex) to segment 2D images. Prisacariu and Reid (2012) proposed a variational method to segment an object in a 2D image by optimizing a Chan-Vese energy functional with respect to the six pose parameters of the object model in 3D. The idea is to transform the object's model in 3D so that its projection onto the 2D image delineates the object of interest. To segment a single object in an image, Prisacariu and Reid (2012) used the following energy:

E(φ(x)) = ∫_Ω [ ρ_f(x) H(φ(x)) + ρ_b(x) (1 − H(φ(x))) ] dΩ , (47)

where ρ_f and ρ_b are two monotonically decreasing functions measuring the matching quality of the image pixels with respect to the foreground and background models, respectively. Instead of optimizing E(φ) with respect to the level set function φ, the authors in (Prisacariu and Reid, 2012) proposed to minimize E(φ) with respect to the pose parameters ξ_i of the object of interest in 3D space:

∂E/∂ξ_i = (ρ_f − ρ_b) ∂H(φ)/∂ξ_i = (ρ_f − ρ_b) δ(φ) [ ∂φ/∂x  ∂φ/∂y ] [ ∂x/∂ξ_i ; ∂y/∂ξ_i ] . (48)

Unlike (Prisacariu and Reid, 2012), Sandhu et al. (2011) derived a gradient flow for the task of non-rigid pose estimation for a single object and used kernel PCA to capture the variance in the space of shapes. Later, Prisacariu et al. (2013) introduced non-rigid pose parameters into the same optimization framework. They capture 3D shape variance by learning non-linear probabilistic low-dimensional latent spaces, using the Gaussian process latent variable dimensionality reduction technique. All three aforementioned works (Prisacariu and Reid, 2012; Prisacariu et al., 2013; Sandhu et al., 2011) assume that the camera parameters (for 3D-to-2D projection) are given.

Recently, inspired by (Prisacariu and Reid, 2012), Nosrati et al. (2014) proposed a closed-form solution to

segment multiple tissues in multi-view endoscopic videos based on pre-operative data. Their method simultaneously estimates the 3D pose of tissues in the pre-operative domain as well as their non-rigid deformations from their pre-operative state. They validated their approach on in vivo surgery data of partial nephrectomy and showed the potential of their method in an augmented reality environment for minimally invasive surgeries. Figure 20 shows an example of segmentation of kidney and tumour in an endoscopic view produced by their method.

5. Summary, discussion, and conclusions

Segmentation techniques are aimed at partitioning (crisply or fuzzily) an image into meaningful parts (two or more). Traditional segmentation approaches (e.g. thresholding, watershed, or region growing) proved incapable of robust and accurate segmentation due to noise, low contrast and the complexity of objects in medical images. By incorporating prior knowledge of objects into rigorous optimization-based segmentation formulations, researchers developed more powerful techniques capable of segmenting specific (targeted) objects.

In recent years, several types of prior knowledge have been utilized in a variety of forms (e.g. via user interaction, object shape and appearance, interior and boundary properties, regularization mechanisms, topological and geometrical constraints, moment priors, distance and adjacency constraints, as well as motion and model/atlas-based priors). In this paper, we attempted to provide a comprehensive survey of such image segmentation priors, with a focus on medical imaging applications, including both high-level information as well as the essential technical and mathematical details. We compared different priors in terms of the domain settings (continuous vs. discrete) and the optimizability (e.g. convex or not).

It is important to appreciate that, although incorporating richer priors into an objective function may increase the fidelity of the energy functional (by better modelling the underlying problem), this typically comes at the expense of complicating its optimization (lower optimizability). On the other hand, focusing on optimizability by simplifying the energies might decrease the fidelity of the energy functions. In other words, be wary of segmentation algorithms that always converge to the globally optimal but inaccurate solution, or ones that rely heavily on intricate initialization or meticulously tweaked parameters. Consequently, recent research surveyed has focused on developing methods that increase the optimizability of energy functions (e.g. by proposing convex or submodular energy terms) without sacrificing fidelity.

In addition to the optimizability-fidelity tradeoff that is impacted by the choice of priors, it is important to observe the runtime and memory efficiency of proposed medical image segmentation algorithms. For example, graph-based approaches may not be very efficient in handling very large images, and they often produce artifacts like grid-bias errors (also known as metrication error) due to their discrete nature.

Despite the great advances that have been made in terms of increasing the fidelity and optimizability of various segmentation energy models, there is still more to be done. We believe that through ongoing research, new methods will be proposed that allow for models that are faithful to the underlying problems, while being globally optimizable, memory- and time-efficient regardless of image size, and free from artifacts like metrication error.

In extending prior information in medical image segmentation, there are several directions to explore. One direction may focus on consolidating all of these previously mentioned priors such that a user (or an automatic system) can add one or more of these priors as a module to the segmentation task at hand. Such a system is expected to minimize user inputs like manual initialization. Although many efforts have been made to convexify energy terms, many priors (especially when combined together) are non-convex (non-submodular) and hard to optimize. As convex relaxation and convex optimization techniques have recently become popular, research that focuses on the convexification of energy functions with as many priors as needed would be an important step toward automatic image segmentation.

In optimization-based segmentation that encodes a set of desired priors, it is important to consider how to combine their respective energy (or objective) terms. The most common approach for dealing with such a multi-objective optimization is to scalarize the energies (via a linear sum of terms). Aside from choosing which priors are relevant and which mathematical formulae encode them, how to learn and set the contribution weight of each term needs to be explored carefully, especially when there is not enough training data. When large sets of training data are available, machine learning techniques have been used to discover a good set of weights that adapt to image class, weights that change per image, and spatially adaptive weights.
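The scalarization described above reduces to a weighted sum E(S) = Σₖ wₖ Eₖ(S). A minimal sketch, with hypothetical term values and weights (the learning of which is the open question discussed above):

```python
def total_energy(term_values, weights):
    """Linear scalarization of a multi-objective segmentation energy:
    E = sum_k w_k * E_k, evaluated for one candidate segmentation."""
    return sum(w * e for w, e in zip(weights, term_values))

# e.g. a data-fidelity term, a boundary-smoothness term, and a shape-prior
# term evaluated for one candidate segmentation (illustrative values only):
terms = [2.5, 0.8, 1.2]
weights = [1.0, 0.5, 0.25]
E = total_energy(terms, weights)  # 2.5*1.0 + 0.8*0.5 + 1.2*0.25 = 3.2
```

The segmentation returned by the optimizer depends entirely on the weights, which is why setting them per image class, per image, or per pixel is itself a learning problem.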

The priors we reviewed and introduced in this survey have been specifically and carefully designed to address particular segmentation problems. Another potential complementary approach that is worthy of future exploration is to attempt to learn the priors themselves (not only their weights in the objective function) from available training data.

Future research directions could also focus on combining hand-crafted features with machine learning techniques when training data are available. For example, it makes sense to use expert knowledge when training data is not available, and to increase the contribution of machine learning techniques as more data becomes available and/or expert knowledge becomes harder to collect.

Conflict of Interest

The authors declare that there is no conflict of interest regarding the publication of this article.

References

Adams, R., Bischof, L., 1994. Seeded region growing. IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI) 16, 641–647.

Alahari, K., Russell, C., Torr, P., 2010. Efficient piecewise learning for conditional random fields, in: IEEE Conference on Computer Vision and Pattern Recognition (IEEE CVPR), pp. 895–901.

Amir-Khalili, A., Peyrat, J.M., Abinahed, J., Al-Alao, O., Al-Ansari, A., Hamarneh, G., Abugharbieh, R., 2014. Auto localization and segmentation of occluded vessels in robot-assisted partial nephrectomy, in: Medical Image Computing and Computer-Assisted Intervention (MICCAI). Springer, pp. 407–414.

Andrews, S., Changizi, N., Hamarneh, G., 2014. The isometric log-ratio transform for probabilistic multi-label anatomical shape representation. IEEE Transactions on Medical Imaging (IEEE TMI).

Andrews, S., Hamarneh, G., 2015. The generalized log-ratio transformation: Learning shape and adjacency priors for simultaneous thigh muscle segmentation. IEEE Transactions on Medical Imaging (IEEE TMI).

Andrews, S., Hamarneh, G., Yazdanpanah, A., HajGhanbari, B., Reid, W.D., 2011a. Probabilistic multi-shape segmentation of knee extensor and flexor muscles, in: Medical Image Computing and Computer-Assisted Intervention (MICCAI), pp. 651–658.

Andrews, S., McIntosh, C., Hamarneh, G., 2011b. Convex multi-region probabilistic segmentation with shape prior in the isometric log-ratio transformation space. IEEE International Conference on Computer Vision (IEEE ICCV).

Appleton, B., Talbot, H., 2006. Globally minimal surfaces by continuous maximal flows. IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI) 28, 106–118.

Ashburner, J., Friston, K.J., 2000. Voxel-based morphometry - the methods. Neuroimage 11, 805–821.

Ayed, I.B., Li, S., Islam, A., Garvin, G., Chhem, R., 2008. Area prior constrained level set evolution for medical image segmentation, in: SPIE Medical Imaging, p. 691402.

Ayed, I.B., Li, S., Ross, I., 2009. A statistical overlap prior for variational image segmentation. International Journal of Computer Vision (IJCV) 85, 115–132.

Bae, E., Yuan, J., Tai, X.C., 2011a. Global minimization for continuous multiphase partitioning problems using a dual approach. International Journal of Computer Vision (IJCV) 92, 112–129.

Bae, E., Yuan, J., Tai, X.C., Boykov, Y., 2011b. A fast continuous max-flow approach to non-convex multilabeling problems. Efficient Global Minimization Methods for Variational Problems in Imaging and Vision.

Bagci, U., Udupa, J.K., Mendhiratta, N., Foster, B., Xu, Z., Yao, J., Chen, X., Mollura, D.J., 2013. Joint segmentation of anatomical and functional images: Applications in quantification of lesions from PET, PET-CT, MRI-PET, and MRI-PET-CT images. Medical Image Analysis (MedIA) 17, 929–945.

Bai, J., Miri, M.S., Liu, Y., Saha, P., Garvin, M., Wu, X., 2014. Graph-based optimal multi-surface segmentation with a star-shaped prior: Application to the segmentation of the optic disc and cup, in: IEEE International Symposium on Biomedical Imaging (ISBI), IEEE. pp. 525–528.

Ballester, M.A.G., Linguraru, M.G., Aguirre, M.R., Ayache, N., 2005. On the adequacy of principal factor analysis for the study of shape variability, in: Medical Imaging, International Society for Optics and Photonics. pp. 1392–1399.

Barrett, W.A., Mortensen, E.N., 1997. Interactive live-wire boundary extraction. Medical Image Analysis (MedIA) 1, 331–341.

Bay, H., Tuytelaars, T., Van Gool, L., 2006. SURF: Speeded up robust features, in: European Conference on Computer Vision (ECCV). Springer, pp. 404–417.

Ben Ayed, I., Li, S., Ross, I., 2009. Embedding overlap priors in variational left ventricle tracking. IEEE Transactions on Medical Imaging (IEEE TMI) 28, 1902–1913.

Ben Ayed, I., Mitiche, A., 2008. A region merging prior for variational level set image segmentation. IEEE Transactions on Image Processing (IEEE TIP) 17, 2301–2311.

Ben-Zadok, N., Riklin-Raviv, T., Kiryati, N., 2009. Interactive level set segmentation for image-guided therapy, in: IEEE International Symposium on Biomedical Imaging (ISBI), IEEE. pp. 1079–1082.

Bergbauer, J., Nieuwenhuis, C., Souiai, M., Cremers, D., 2013. Morphological proximity priors: Spatial relationships for semantic segmentation.

Bertrand, G., 1994. Simple points, topological numbers and geodesic neighborhoods in cubic grids. Pattern Recognition Letters 15, 1003–1011.

Beucher, S., 1994. Watershed, hierarchical segmentation and waterfall algorithm, in: Mathematical morphology and its applications to image processing. Springer, pp. 69–76.

Bigun, J., Granlund, G.H., Wiklund, J., 1991. Multidimensional orientation estimation with applications to texture analysis and optical flow. IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI) 13, 775–790.

Bloch, I., 2005. Fuzzy spatial relationships for image processing and interpretation: a review. Image and Vision Computing 23, 89–110.

Bosch, A., Zisserman, A., Munoz, X., 2007. Image classification using random forests and ferns.

Bowden, R., Mitchell, T.A., Sarhadi, M., 2000. Non-linear statistical models for the 3D reconstruction of human pose and motion from monocular image sequences. Image and Vision Computing 18, 729–737.

Boykov, Y., Funka-Lea, G., 2006. Graph cuts and efficient N-D image segmentation. International Journal of Computer Vision (IJCV) 70, 109–131.

Boykov, Y., Kolmogorov, V., 2003. Computing geodesics and minimal surfaces via graph cuts, in: IEEE International Conference on Computer Vision (IEEE ICCV), pp. 26–33.

Boykov, Y., Veksler, O., Zabih, R., 1998. Markov random fields with efficient approximations, in: IEEE Conference on Computer Vision and Pattern Recognition (IEEE CVPR), IEEE. pp. 648–655.

Boykov, Y., Veksler, O., Zabih, R., 2001. Fast approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI) 23, 1222–1239.

Bresson, X., Vandergheynst, P., Thiran, J.P., 2006. A variational model for object segmentation using boundary information and shape prior driven by the Mumford-Shah functional. International Journal of Computer Vision (IJCV) 68, 145–162.

Brown, E., Chan, T., Bresson, X., 2009. Convex formulation and exact global solutions for multi-phase piecewise constant Mumford-Shah image segmentation. UCLA CAM Report, 09–66.

Brox, T., Weickert, J., 2006. Level set segmentation with multiple regions. IEEE Transactions on Image Processing (IEEE TIP) 15, 3213–3218.

Bueno, G., Martínez-Albala, A., Adan, A., 2004. Fuzzy-snake segmentation of anatomical structures applied to CT images, in: Image Analysis and Recognition. Springer, pp. 33–42.

Bullitt, E., Gerig, G., Pizer, S.M., Lin, W., Aylward, S.R., 2003. Measuring tortuosity of the intracerebral vasculature from MRA images. IEEE Transactions on Medical Imaging (IEEE TMI) 22, 1163–1171.

Canny, J., 1986. A computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI), 679–698.

Carneiro, G., Georgescu, B., Good, S., Comaniciu, D., 2008. Detection and measurement of fetal anatomies from ultrasound images using a constrained probabilistic boosting tree. IEEE Transactions on Medical Imaging (IEEE TMI) 27, 1342–1355.

Caselles, V., Kimmel, R., Sapiro, G., 1997. Geodesic active contours. International Journal of Computer Vision (IJCV) 22, 61–79.

Celebi, M.E., Iyatomi, H., Schaefer, G., Stoecker, W.V., 2009. Lesion border detection in dermoscopy images. Computerized Medical Imaging and Graphics 33, 148–153.

Chambolle, A., Cremers, D., Pock, T., 2008. A Convex Approach for Computing Minimal Partitions. Technical report TR-2008-05. Dept. of Computer Science, University of Bonn. Bonn, Germany.

Chan, T., Sandberg, B., Vese, L., 2000. Active contours without edges for vector-valued images. Journal of Visual Communication and Image Representation 11, 130–141.

Chan, T., Vese, L., 2001. Active contours without edges. IEEE Transactions on Image Processing (IEEE TIP) 10, 266–277.

Chan, T.F., Esedoglu, S., Nikolova, M., 2006. Algorithms for finding global minimizers of image segmentation and denoising models. SIAM Journal on Applied Mathematics 66, 1632–1648.

Changizi, N., Hamarneh, G., 2010. Probabilistic multi-shape representation using an isometric log-ratio mapping, in: Medical Image Computing and Computer-Assisted Intervention (MICCAI), pp. 563–570.

Chen, Y., Tagare, H.D., Thiruvenkadam, S., Huang, F., Wilson, D., Gopinath, K.S., Briggs, R.W., Geiser, E.A., 2002. Using prior shapes in geometric active contours in a variational framework. International Journal of Computer Vision 50, 315–328.

Chodorowski, A., Mattsson, U., Langille, M., Hamarneh, G., 2005. Color lesion boundary detection using live wire, in: Medical Imaging, International Society for Optics and Photonics. pp. 1589–1596.

Chung, G., Vese, L.A., 2009. Image segmentation using a multilayer level-set approach. Computing and Visualization in Science 12, 267–285.

Collins, D., Evans, A., 1997. Animal: validation and applications of nonlinear registration-based segmentation. International Journal of Pattern Recognition and Artificial Intelligence 11, 1271–1294.

Collins, D., Holmes, C., Peters, T., Evans, A., 1995. Automatic 3-D model-based neuroanatomical segmentation. Human Brain Mapping 3, 190–208.

Colliot, O., Camara, O., Bloch, I., 2006. Integration of fuzzy spatial relations in deformable models - application to brain MRI segmentation. Pattern Recognition 39, 1401–1414.

Cootes, T., Edwards, G., Taylor, C., 2001. Active appearance models. IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI) 23, 681–685.

Cootes, T.F., Taylor, C.J., 1995. Combining point distribution models with shape models based on finite element analysis. Image and Vision Computing 13, 403–409.

Cootes, T.F., Taylor, C.J., 1999. A mixture model for representing shape variation. Image and Vision Computing 17, 567–573.

Cootes, T.F., Taylor, C.J., Cooper, D.H., Graham, J., 1995. Active shape models - their training and application. Computer Vision and Image Understanding (CVIU) 61, 38–59.

Couprie, C., Grady, L., Talbot, H., Najman, L., 2011. Combinatorial continuous maximum flow. SIAM Journal on Imaging Sciences 4, 905–930.

Courant, R., Friedrichs, K., Lewy, H., 1967. On the partial difference equations of mathematical physics. IBM Journal of Research and Development 11, 215–234.

Cremers, D., Fluck, O., Rousson, M., Aharon, S., 2007. A probabilistic level set formulation for interactive organ segmentation, in: Medical Imaging, International Society for Optics and Photonics. pp. 65120V–65120V.

Cremers, D., Osher, S.J., Soatto, S., 2006. Kernel density estimation and intrinsic alignment for shape priors in level set segmentation. International Journal of Computer Vision (IJCV) 69, 335–351.

Dalal, N., Triggs, B., 2005. Histograms of oriented gradients for human detection, in: IEEE Conference on Computer Vision and Pattern Recognition (IEEE CVPR), IEEE. pp. 886–893.

Davatzikos, C., Tao, X., Shen, D., 2003. Hierarchical active shape models, using the wavelet transform. IEEE Transactions on Medical Imaging (IEEE TMI) 22, 414–423.

Delong, A., Boykov, Y., 2008. A scalable graph-cut algorithm for N-D grids, in: IEEE Conference on Computer Vision and Pattern Recognition (IEEE CVPR), IEEE. pp. 1–8.

Delong, A., Boykov, Y., 2009. Globally optimal segmentation of multi-region objects, in: IEEE International Conference on Computer Vision (IEEE ICCV), pp. 285–292.

Delong, A., Osokin, A., Isack, H.N., Boykov, Y., 2012a. Fast approximate energy minimization with label costs. International Journal of Computer Vision (IJCV) 96, 1–27.

Delong, A., Veksler, O., Osokin, A., Boykov, Y., 2012b. Minimizing sparse high-order energies by submodular vertex-cover, in: Advances in Neural Information Processing Systems (NIPS), pp. 971–979.

Dobbins, A., Zucker, S.W., Cynader, M.S., 1987. Endstopped neurons in the visual cortex as a substrate for calculating curvature. Nature 329, 438–441.

Duchenne, O., Bach, F., Kweon, I.S., Ponce, J., 2011. A tensor-based algorithm for high-order graph matching. IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI) 33, 2383–2395.

Dydenko, I., Jamal, F., Bernard, O., Dhooge, J., Magnin, I.E., Friboulet, D., 2006. A level set framework with a shape and motion prior for segmentation and region tracking in echocardiography. Medical Image Analysis (MedIA) 10, 162–177.

El-Zehiry, N.Y., Grady, L., 2010. Fast global optimization of curvature, in: IEEE Conference on Computer Vision and Pattern Recognition (IEEE CVPR), IEEE. pp. 3257–3264.

Elnakib, A., Gimelfarb, G., Suri, J.S., El-Baz, A., 2011. Medical image segmentation: A brief survey, in: Multi Modality State-of-the-Art Medical Image Segmentation and Registration Methodologies. Springer, pp. 1–39.

Esneault, S., Lafon, C., Dillenseger, J.L., 2010. Liver vessels segmentation using a hybrid geometrical moments/graph cuts method. IEEE Transactions on Biomedical Engineering (IEEE TBME) 57, 276–283.

Feddern, C., Weickert, J., Burgeth, B., 2003. Level-set methods for tensor-valued images, in: IEEE Workshop on Geometric and Level Set Methods in Computer Vision, pp. 65–72.

Felzenszwalb, P., Veksler, O., 2010. Tiered scene labeling with dynamic programming, in: IEEE Conference on Computer Vision and Pattern Recognition (IEEE CVPR), pp. 3097–3104.

Figueiredo, I.N., Figueiredo, P.N., Stadler, G., Ghattas, O., Araujo, A., 2010. Variational image segmentation for endoscopic human colonic aberrant crypt foci. IEEE Transactions on Medical Imaging (IEEE TMI) 29, 998–1011.

Fodor, I.K., 2002. A survey of dimension reduction techniques.

Foulonneau, A., Charbonnier, P., Heitz, F., 2006. Affine-invariant geometric shape priors for region-based active contours. IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI) 28, 1352–1357.

Frangi, A.F., Niessen, W.J., Vincken, K.L., Viergever, M.A., 1998. Multiscale vessel enhancement filtering, in: Medical Image Computing and Computer-Assisted Intervention (MICCAI). Springer, pp. 130–137.

Freedman, D., Zhang, T., 2005. Interactive graph cut based segmentation with shape priors, in: IEEE Conference on Computer Vision and Pattern Recognition (IEEE CVPR), pp. 755–762.

Freiman, M., Joskowicz, L., Sosna, J., 2009. A variational method for vessels segmentation: algorithm and application to liver vessels visualization, in: SPIE Medical Imaging, International Society for Optics and Photonics. pp. 72610H–72610H.

Gee, J., Reivich, M., Bajcsy, R., 1993. Elastically deforming 3D atlas to match anatomical brain images. Journal of Computer Assisted Tomography 17, 225.

Gennert, M.A., Yuille, A.L., 1988. Determining the optimal weights in multiple objective function optimization, in: IEEE International Conference on Computer Vision (IEEE ICCV), Citeseer. pp. 87–89.

Gloger, O., Toennies, K.D., Liebscher, V., Kugelmann, B., Laqua, R., Volzke, H., 2012. Prior shape level set segmentation on multistep generated probability maps of MR datasets for fully automatic kidney parenchyma volumetry. IEEE Transactions on Medical Imaging (IEEE TMI) 31, 312–325.

Goldenberg, R., Kimmel, R., Rivlin, E., Rudzsky, M., 2001. Fast geodesic active contours. IEEE Transactions on Image Processing (IEEE TIP) 10, 1467–1475.

Goldenberg, R., Kimmel, R., Rivlin, E., Rudzsky, M., 2002. Cortex segmentation: A fast variational geometric approach. IEEE Transactions on Medical Imaging (IEEE TMI) 21, 1544–1551.

Goldschlager, L.M., Shaw, R.A., Staples, J., 1982. The maximum flow problem is log space complete for P. Theoretical Computer Science 21, 105–111.

Gould, S., Rodgers, J., Cohen, D., Elidan, G., Koller, D., 2008. Multi-class segmentation with relative location prior. International Journal of Computer Vision (IJCV) 80, 300–316.

Grady, L., 2006. Random walks for image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI) 28, 1768–1783.

Grady, L., 2012. Targeted image segmentation using graph methods. Image Processing and Analysis with Graphs.

Grady, L., Jolly, M.P., Seitz, A., 2011. Segmentation from a box, in: IEEE International Conference on Computer Vision (IEEE ICCV), IEEE. pp. 367–374.

Grau, V., Mewes, A., Alcaniz, M., Kikinis, R., Warfield, S.K., 2004. Improved watershed transform for medical image segmentation using prior information. IEEE Transactions on Medical Imaging (IEEE TMI) 23, 447–458.

Greig, D., Porteous, B., Seheult, A.H., 1989. Exact maximum a posteriori estimation for binary images. Journal of the Royal Statistical Society. Series B (Methodological), 271–279.

Hamarneh, G., 2011. The Optimizability-Fidelity Trade-Off in Image Analysis. Technical Report SFU-Summit-10897. School of Computing Science, Simon Fraser University, Burnaby, BC, Canada.

Hamarneh, G., Abugharbieh, R., McInerney, T., 2004. Medial profiles for modeling deformation and statistical analysis of shape and their use in medical image segmentation. International Journal of Shape Modeling 10, 187–210.

Hamarneh, G., Gustavsson, T., 2000. Statistically constrained snake deformations, in: IEEE International Conference on Systems, Man, and Cybernetics, IEEE. pp. 1610–1615.

Hamarneh, G., Jassi, P., Tang, L., 2008. Simulation of ground-truth validation data via physically- and statistically-based warps, in: Medical Image Computing and Computer-Assisted Intervention (MICCAI), pp. 459–467. doi:10.1007/978-3-540-85988-8_55.

Hamarneh, G., Li, X., 2009. Watershed segmentation using prior shape and appearance knowledge. Image and Vision Computing 27, 59–68.

Hamarneh, G., McIntosh, C., McInerney, T., Terzopoulos, D., 2009. Deformable organisms: An artificial life framework for automated medical image analysis. Computational Intelligence in Medical Imaging: Techniques and Applications, 433.

Hamarneh, G., Yang, J., McIntosh, C., Langille, M., 2005. 3D live-wire-based semi-automatic segmentation of medical images, in: Medical Imaging, International Society for Optics and Photonics. pp. 1597–1603.

Han, S., Tao, W., Wang, D., Tai, X.C., Wu, X., 2009. Image segmentation based on GrabCut framework integrating multiscale nonlinear structure tensor. IEEE Transactions on Image Processing (IEEE TIP) 18, 2289–2302.

Han, X., Xu, C., Prince, J.L., 2003. A topology preserving level set method for geometric deformable models. IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI) 25, 755–768.

Hatt, M., Cheze le Rest, C., Turzo, A., Roux, C., Visvikis, D., 2009. A fuzzy locally adaptive Bayesian segmentation approach for volume determination in PET. IEEE Transactions on Medical Imaging (IEEE TMI) 28, 881–893.

Heikkila, M., Pietikainen, M., Schmid, C., 2009. Description of interest regions with local binary patterns. Pattern Recognition 42, 425–436.

Heimann, T., Meinzer, H.P., 2009. Statistical shape models for 3D medical image segmentation: A review. Medical Image Analysis (MedIA) 13, 543–563.

Hill, D.L., Batchelor, P.G., Holden, M., Hawkes, D.J., 2001. Medical image registration. Physics in Medicine and Biology 46, R1.

Howing, F., Wermser, D., Dooley, L., 1997. Fuzzy snakes, in: International Conference on Image Processing and Its Applications, IET. pp. 627–630.

Hu, Y., Grossberg, M., Mageras, G., 2009. Survey of recent volumetric medical image segmentation techniques. Biomedical Engineering, 321–346.

Huang, X., Qian, Z., Huang, R., Metaxas, D., 2005. Deformable-model based textured object segmentation, in: Energy Minimization Methods in Computer Vision and Pattern Recognition (EMMCVPR), Springer. pp. 119–135.

Iosifescu, D., Shenton, M., Warfield, S., Kikinis, R., Dengler, J., Jolesz, F., McCarley, R., 1997. An automated registration algorithm for measuring MRI subcortical brain structures. Neuroimage 6, 13–26.

Ishikawa, H., 2003. Exact optimization for Markov random fields with convex priors. IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI) 25, 1333–1336.

Kadir, T., Brady, M., 2003. Unsupervised non-parametric region segmentation using level sets, in: IEEE International Conference on Computer Vision (IEEE ICCV), IEEE. pp. 1267–1274.

Karaolani, P., Sullivan, G.D., Baker, K.D., Baines, M., 1989. A finite element method for deformable models, in: Alvey Vision Conference, Citeseer. pp. 1–6.

Kass, M., Witkin, A., Terzopoulos, D., 1988. Snakes: Active contour models. International Journal of Computer Vision (IJCV) 1, 321–331.

Kawahara, J., McIntosh, C., Tam, R., Hamarneh, G., 2013. Augmenting auto-context with global geometric features for spinal cord segmentation, in: Machine Learning in Medical Imaging. Springer, pp. 211–218.

Kim, J., Cetin, M., Willsky, A.S., 2007. Nonparametric shape priors for active contour-based image segmentation. Signal Processing 87, 3021–3044.

Klodt, M., Cremers, D., 2011. A convex framework for image segmentation with moment constraints, in: IEEE International Conference on Computer Vision (IEEE ICCV), IEEE. pp. 2236–2243.

Koerkamp, B.G., Weinstein, M.C., Stijnen, T., Heijenbrok-Kal, M.H., Hunink, M.M., 2010. Uncertainty and patient heterogeneity in medical decision models. Medical Decision Making.

Kolmogorov, V., Boykov, Y., Rother, C., 2007. Applications of parametric maxflow in computer vision, in: IEEE International Conference on Computer Vision (IEEE ICCV), pp. 1–8.

Kolmogorov, V., Rother, C., 2007. Minimizing nonsubmodular functions with graph cuts - a review. IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI) 29, 1274–1279.

Kolmogorov, V., Zabin, R., 2004. What energy functions can be minimized via graph cuts? IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI) 26, 147–159.

Koss, J.E., Newman, F., Johnson, T., Kirch, D., 1999. Abdominal organ segmentation using texture transforms and a Hopfield neural network. IEEE Transactions on Medical Imaging (IEEE TMI) 18, 640–648.

Lellmann, J., Kappes, J., Yuan, J., Becker, F., Schnorr, C., 2009. Convex multi-class image labeling by simplex-constrained total variation, in: Scale Space and Variational Methods in Computer Vision, pp. 150–162.

Lempitsky, V., Kohli, P., Rother, C., Sharp, T., 2009. Image segmentation with a bounding box prior, in: IEEE International Conference on Computer Vision (IEEE ICCV), IEEE. pp. 277–284.

Lenglet, C., Rousson, M., Deriche, R., 2004. Segmentation of 3D probability density fields by surface evolution: Application to diffusion MRI, in: Medical Image Computing and Computer-Assisted Intervention (MICCAI). Springer, pp. 18–25.

Lesage, D., Angelini, E.D., Bloch, I., Funka-Lea, G., 2009. A review of 3D vessel lumen segmentation techniques: Models, features and extraction schemes. Medical Image Analysis (MedIA) 13, 819–845.

Leventon, M.E., Faugeras, O., Grimson, W.E.L., Wells, W.M., 2000a. Level set based segmentation with intensity and curvature priors, in: IEEE Workshop on Mathematical Methods in Biomedical Image Analysis, IEEE. pp. 4–11.

Leventon, M.E., Grimson, W.E.L., Faugeras, O., 2000b. Statistical shape influence in geodesic active contours, in: IEEE Conference on Computer Vision and Pattern Recognition (IEEE CVPR), IEEE. pp. 316–323.

Li, K., Wu, X., Chen, D., Sonka, M., 2006. Optimal surface segmentation in volumetric images - a graph-theoretic approach. IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI) 28, 119–134.

Lim, Y., Jung, K., Kohli, P., 2011. Constrained discrete optimization via dual space search.

Liu, J., Sun, J., 2010. Parallel graph-cuts by adaptive bottom-up merging, in: IEEE Conference on Computer Vision and Pattern Recognition (IEEE CVPR), IEEE. pp. 2181–2188.

Liu, X., Veksler, O., Samarabandu, J., 2008. Graph cut with ordering constraints on labels and its applications, in: IEEE Conference on Computer Vision and Pattern Recognition (IEEE CVPR), pp. 1–8.

Lowe, D.G., 2004. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision (IJCV) 60, 91–110.

Lu, C., Chelikani, S., Jaffray, D.A., Milosevic, M.F., Staib, L.H., Duncan, J.S., 2012. Simultaneous nonrigid registration, segmentation, and tumor detection in MRI guided cervical cancer radiation therapy. IEEE Transactions on Medical Imaging (IEEE TMI) 31, 1213–1227.

Mairal, J., Bach, F., Ponce, J., Sapiro, G., Zisserman, A., 2008. Discriminative learned dictionaries for local image analysis, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–8.

Malcolm, J., Rathi, Y., Tannenbaum, A., 2007. A graph cut approach to imagesegmentation in tensor space, in: IEEE Conference on Computer Vision andPattern Recognition (IEEE CVPR), IEEE. pp. 1–8.

Mansouri, A., Mitiche, A., Vazquez, C., 2006. Multiregioncompetition: Alevel set extension of region competition to multiple region image partition-ing. Computer Vision and Image Understanding (CVIU) 101, 137–150.

McInerney, T., Terzopoulos, D., 1996. Deformable models inmedical imageanalysis: a survey. Medical Image Analysis (MedIA) 1, 91–108.

McIntosh, C., Hamarneh, G., 2007. Is a single energy functional sufficient?adaptive energy functionals and automatic initialization. Medical ImageComputing and Computer-Assisted Intervention (MICCAI) , 503–510.

McIntosh, C., Hamarneh, G., 2012. Medial-based deformablemodels in non-convex shape-spaces for medical image segmentation. IEEE Transactionson Medical Imaging (IEEE TMI) 31, 33–50.

McIntosh, C., Hamarneh, G., 2013a. Medical image segmentation: Energyminimization and deformable models. Medical Imaging: Technology andApplications , 619–660.

McIntosh, C., Hamarneh, G., 2013b. Medical image segmentation: Energyminimization and deformable models (chapter 23). Medical Imaging: Tech-nology and Applications , 661–692.

Mikolajczyk, K., Schmid, C., 2005. A performance evaluation of local de-scriptors. IEEE Transactions on Pattern Analysis and Machine Intelligence(IEEE TPAMI) 27, 1615–1630.

Mirzaalian, H., Hamarneh, G., 2010. Vessel scale-selection using mrf opti-mization, in: IEEE Conference on Computer Vision and Pattern Recognition(IEEE CVPR), IEEE. pp. 3273–3279.

Mirzaei, H., Tang, L., Werner, R., Hamarneh, G., 2013. Decision forests withspatio-temporal features for graph-based tumor segmentation in 4d lung ct,in: Machine Learning in Medical Imaging. Springer, pp. 179–186.

Mumford, D., Shah, J., 1989. Optimal approximations by piecewise smoothfunctions and associated variational problems. Communications on Pureand Applied Mathematics 42, 577–685.

Nain, D., Haker, S., Bobick, A., Tannenbaum, A., 2006. Shape-driven 3D segmentation using spherical wavelets, in: Medical Image Computing and Computer-Assisted Intervention (MICCAI). Springer, pp. 66–74.

Nambakhsh, C., Yuan, J., Punithakumar, K., Goela, A., Rajchl, M., Peters, T., Ayed, I.B., 2013. Left ventricle segmentation in MRI via convex relaxed distribution matching. Medical Image Analysis (MedIA) 17, 1010–1024.

Nand, K.K., Abugharbieh, R., Booth, B.G., Hamarneh, G., 2011. Detecting structure in diffusion tensor MR images, in: Medical Image Computing and Computer-Assisted Intervention (MICCAI). Springer, pp. 90–97.

Nastar, C., Ayache, N., 1993. Non-rigid motion analysis in medical images: a physically based approach, in: Information Processing in Medical Imaging (IPMI), Springer. pp. 17–32.

Nayak, N., Chang, H., Borowsky, A., Spellman, P., Parvin, B., 2013. Classification of tumor histopathology via sparse feature learning, in: IEEE International Symposium on Biomedical Imaging (IEEE ISBI), pp. 410–413.

Nieuwenhuis, C., Hawe, S., Kleinsteuber, M., Cremers, D., 2014. Co-sparse textural similarity for interactive segmentation, in: European Conference on Computer Vision (ECCV), pp. 285–301.

Nieuwenhuis, C., Toppe, E., Cremers, D., 2013. A survey and comparison of discrete and continuous multi-label optimization approaches for the Potts model. International Journal of Computer Vision (IJCV) 104, 223–240.

Noble, J.A., Boukerroui, D., 2006. Ultrasound image segmentation: a survey. IEEE Transactions on Medical Imaging (IEEE TMI) 25, 987–1010.

Nosrati, M., Hamarneh, G., 2014. Local optimization based segmentation of spatially-recurring, multi-region objects with part configuration constraints. IEEE Transactions on Medical Imaging (IEEE TMI) 33, 1845–1859. doi:10.1109/TMI.2014.2323074.

Nosrati, M.S., Andrews, S., Hamarneh, G., 2013. Bounded labeling function for global segmentation of multi-part objects with geometric constraints, in: IEEE International Conference on Computer Vision (IEEE ICCV), IEEE. pp. 2032–2039.

Nosrati, M.S., Hamarneh, G., 2013. Segmentation of cells with partial occlusion and part configuration constraint using evolutionary computation, in: Medical Image Computing and Computer-Assisted Intervention (MICCAI). Springer, pp. 461–468.

Nosrati, M.S., Peyrat, J.M., Abinahed, J., Al-Alao, O., Al-Ansari, A., Abugharbieh, R., Hamarneh, G., 2014. Efficient multi-organ segmentation in multi-view endoscopic videos using pre-operative priors, in: Medical Image Computing and Computer-Assisted Intervention (MICCAI). Springer, pp. 324–331.

Nowozin, S., Gehler, P., Lampert, C., 2010. On parameter learning in CRF-based approaches to object class image segmentation. European Conference on Computer Vision (ECCV), 98–111.

Olabarriaga, S.D., Smeulders, A.W., 2001. Interaction in the segmentation of medical images: A survey. Medical Image Analysis (MedIA) 5, 127–142.

Oliva, A., Torralba, A., 2001. Modeling the shape of the scene: A holistic representation of the spatial envelope. International Journal of Computer Vision (IJCV) 42, 145–175.

Orderud, F., Hansgård, J., Rabben, S.I., 2007. Real-time tracking of the left ventricle in 3D echocardiography using a state estimation approach, in: Medical Image Computing and Computer-Assisted Intervention (MICCAI). Springer, pp. 858–865.

Otsu, N., 1975. A threshold selection method from gray-level histograms. Automatica 11, 23–27.

Pan, Z., Lu, J., 2007. A Bayes-based region-growing algorithm for medical image segmentation. Computing in Science and Engineering 9, 32–38.

Paragios, N., 2002. A variational approach for the segmentation of the left ventricle in cardiac image analysis. International Journal of Computer Vision (IJCV) 50, 345–362.

Paragios, N., 2003. User-aided boundary delineation through the propagation of implicit representations, in: Medical Image Computing and Computer-Assisted Intervention (MICCAI). Springer, pp. 678–686.

Paragios, N., Deriche, R., 1999. Geodesic active regions for motion estimation and tracking, in: IEEE International Conference on Computer Vision (IEEE ICCV), IEEE. pp. 688–694.

Paragios, N., Deriche, R., 2002. Geodesic active regions and level set methods for supervised texture segmentation. International Journal of Computer Vision (IJCV) 46, 223–247.

Peng, B., Zhang, L., Zhang, D., 2013. A survey of graph theoretical approaches to image segmentation. Pattern Recognition 46, 1020–1038. doi:10.1016/j.patcog.2012.09.015.

Pentland, A., Sclaroff, S., 1991. Closed-form solutions for physically based shape modeling and recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI) 13, 715–729.

Petitjean, C., Dacher, J.N., 2011. A review of segmentation methods in short axis cardiac MR images. Medical Image Analysis (MedIA) 15, 169–184.

Pham, D.L., Xu, C., Prince, J.L., 2000. Current methods in medical image segmentation. Annual Review of Biomedical Engineering 2, 315–337.

Pizer, S.M., Fletcher, P.T., Joshi, S., Thall, A., Chen, J.Z., Fridman, Y., Fritsch, D.S., Gash, A.G., Glotzer, J.M., Jiroutek, M.R., et al., 2003. Deformable m-reps for 3D medical image segmentation. International Journal of Computer Vision (IJCV) 55, 85–106.

Pluempitiwiriyawej, C., Moura, J., Wu, Y., Ho, C., 2005. STACS: New active contour scheme for cardiac MR image segmentation. IEEE Transactions on Medical Imaging (IEEE TMI) 24, 593–603.

Pock, T., Chambolle, A., 2011. Diagonal preconditioning for first order primal-dual algorithms in convex optimization, in: IEEE International Conference on Computer Vision (IEEE ICCV), pp. 1762–1769.

Pock, T., Schoenemann, T., Graber, G., Bischof, H., Cremers, D., 2008. A convex formulation of continuous multi-label problems, in: European Conference on Computer Vision (ECCV), pp. 792–805.

Pohl, K., Fisher, J., Bouix, S., Shenton, M., McCarley, R., Grimson, W., Kikinis, R., Wells, W., 2007. Using the logarithm of odds to define a vector space on probabilistic atlases. Medical Image Analysis (MedIA) 11, 465–477.

Pohle, R., Toennies, K.D., 2001. Segmentation of medical images using adaptive region growing, in: Medical Imaging, International Society for Optics and Photonics. pp. 1337–1346.

Prasad, G., Joshi, A.A., Feng, A., Barysheva, M., Mcmahon, K.L., De Zubicaray, G.I., Martin, N.G., Wright, M.J., Toga, A.W., Terzopoulos, D., et al., 2011. Deformable organisms and error learning for brain segmentation, in: International Workshop on Mathematical Foundations of Computational Anatomy – Geometrical and Statistical Methods for Modelling Biological Shape Variability, pp. 135–147.

Pratt, P., Mayer, E., Vale, J., Cohen, D., Edwards, E., Darzi, A., Yang, G.Z., 2012. An effective visualisation and registration system for image-guided robotic partial nephrectomy. Journal of Robotic Surgery 6, 23–31.

Prisacariu, V., Reid, I., 2012. PWP3D: Real-time segmentation and tracking of 3D objects. International Journal of Computer Vision (IJCV) 98, 335–354.

Prisacariu, V.A., Segal, A.V., Reid, I., 2013. Simultaneous monocular 2D segmentation, 3D pose recovery and 3D reconstruction, in: Asian Conference on Computer Vision (ACCV), pp. 593–606.

Rajchl, M., Yuan, J., White, J.A., Nambakhsh, C.M., Ukwatta, E., Li, F., Stirrat, J., Peters, T.M., 2012. A fast convex optimization approach to segmenting 3D scar tissue from delayed-enhancement cardiac MR images, in: Medical Image Computing and Computer-Assisted Intervention (MICCAI), pp. 659–666.

Rak, M., Konig, T., Toennies, K.D., 2013. An adaptive subdivision scheme for quadratic programming in multi-label image segmentation. Proceedings of the British Machine Vision Conference.

Rao, J., Abugharbieh, R., Hamarneh, G., 2010. Adaptive regularization for image segmentation using local image curvature cues, in: European Conference on Computer Vision (ECCV). Springer, pp. 651–665.

Rother, C., Kohli, P., Feng, W., Jia, J., 2009. Minimizing sparse higher order energy functions of discrete variables, in: IEEE Conference on Computer Vision and Pattern Recognition (IEEE CVPR), pp. 1382–1389.

Rother, C., Kolmogorov, V., Blake, A., 2004. GrabCut: Interactive foreground extraction using iterated graph cuts, in: ACM Transactions on Graphics (TOG), ACM. pp. 309–314.

Rother, C., Kolmogorov, V., Lempitsky, V., Szummer, M., 2007. Optimizing binary MRFs via extended roof duality, in: IEEE Conference on Computer Vision and Pattern Recognition (IEEE CVPR), IEEE. pp. 1–8.

Rousson, M., Brox, T., Deriche, R., 2003. Active unsupervised texture segmentation on a diffusion based feature space, in: IEEE Conference on Computer Vision and Pattern Recognition (IEEE CVPR), IEEE. pp. II–699.

Rousson, M., Paragios, N., 2002. Shape priors for level set representations, in: European Conference on Computer Vision (ECCV). Springer, pp. 78–92.

Saad, A., Hamarneh, G., Moller, T., 2010a. Exploration and visualization of segmentation uncertainty using shape and appearance prior information. IEEE Transactions on Visualization and Computer Graphics 16, 1366–1375.

Saad, A., Hamarneh, G., Moller, T., Smith, B., 2008. Kinetic modeling based probabilistic segmentation for molecular images. Medical Image Computing and Computer-Assisted Intervention (MICCAI), 244–252.

Saad, A., Moller, T., Hamarneh, G., 2010b. ProbExplorer: Uncertainty-guided exploration and editing of probabilistic medical image segmentation, in: Computer Graphics Forum, Wiley Online Library. pp. 1113–1122.

Sadeghi, M., Tien, G., Hamarneh, G., Atkins, M.S., 2009. Hands-free interactive image segmentation using eyegaze, in: SPIE Medical Imaging, International Society for Optics and Photonics. pp. 72601H–72601H.

Sahoo, P.K., Soltani, S., Wong, A.K., 1988. A survey of thresholding techniques. Computer Vision, Graphics, and Image Processing 41, 233–260.

Samson, C., Blanc-Feraud, L., Aubert, G., Zerubia, J., 2000. A level set model for image classification. International Journal of Computer Vision (IJCV) 40, 187–197.

Sandberg, B., Chan, T., Vese, L., 2002. A level-set and Gabor-based active contour algorithm for segmenting textured images, in: UCLA Department of Mathematics CAM report, Citeseer.

Sandhu, R., Dambreville, S., Yezzi, A., Tannenbaum, A., 2011. A nonrigid kernel-based framework for 2D-3D pose estimation and 2D image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI) 33, 1098–1115.

Santner, J., Unger, M., Pock, T., Leistner, C., Saffari, A., Bischof, H., 2009. Interactive texture segmentation using random forests and total variation, in: BMVC, Citeseer. pp. 1–12.

dos Santos Gromicho, J.A., 1998. Quasiconvex optimization and location theory. volume 9, Springer.

Sapiro, G., 1997. Color snakes. Computer Vision and Image Understanding (CVIU) 68, 247–253.

Schmidt, F.R., Boykov, Y., 2012. Hausdorff distance constraint for multi-surface segmentation, in: European Conference on Computer Vision (ECCV), pp. 598–611.

Schoenemann, T., Cremers, D., 2007. Globally optimal image segmentation with an elastic shape prior, in: IEEE International Conference on Computer Vision (IEEE ICCV), IEEE. pp. 1–6.

Schoenemann, T., Kahl, F., Cremers, D., 2009. Curvature regularity for region-based image segmentation and inpainting: A linear programming relaxation, in: IEEE International Conference on Computer Vision (IEEE ICCV), IEEE. pp. 17–23.

Scholkopf, B., Smola, A., Muller, K.R., 1998. Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation 10, 1299–1319.

Sharma, N., Aggarwal, L.M., 2010. Automated medical image segmentation techniques. Journal of Medical Physics/Association of Medical Physicists of India 35, 3.

Shekhovtsov, A., Hlavac, V., 2013. A distributed mincut/maxflow algorithm combining path augmentation and push-relabel. International Journal of Computer Vision (IJCV) 104, 315–342.

Shen, T., Li, H., Huang, X., 2011. Active volume models for medical image segmentation. IEEE Transactions on Biomedical Engineering (IEEE TBME) 30, 774–791.

Shi, J., Malik, J., 2000. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI) 22, 888–905.

Shi, L., Funt, B., Hamarneh, G., 2008. Quaternion color curvature, in: Color and Imaging Conference, Society for Imaging Science and Technology. pp. 338–341.

Singaraju, D., Grady, L., Vidal, R., 2008. Interactive image segmentation via minimization of quadratic energies on directed graphs, in: IEEE Conference on Computer Vision and Pattern Recognition (IEEE CVPR), IEEE. pp. 1–8.

Slabaugh, G., Unal, G., 2005. Graph cuts segmentation using an elliptical shape prior, in: IEEE International Conference on Image Processing (IEEE ICIP), IEEE. pp. II–1222.

Song, Q., Wu, X., Liu, Y., Sonka, M., Garvin, M., 2010. Simultaneous searching of globally optimal interacting surfaces with shape priors, in: IEEE Conference on Computer Vision and Pattern Recognition (IEEE CVPR), IEEE. pp. 2879–2886.

Sotiras, A., Davatzikos, C., Paragios, N., 2013. Deformable medical image registration: A survey. IEEE Transactions on Medical Imaging (IEEE TMI) 32, 1153–1190.

Staib, L.H., Duncan, J.S., 1992. Deformable Fourier models for surface finding in 3-D images, in: Visualization in Biomedical Computing, International Society for Optics and Photonics. pp. 90–104.

Strandmark, P., Kahl, F., 2010. Parallel and distributed graph cuts by dual decomposition, in: IEEE Conference on Computer Vision and Pattern Recognition (IEEE CVPR), IEEE. pp. 2085–2092.

Strandmark, P., Kahl, F., 2011. Curvature regularization for curves and surfaces in a global optimization framework, in: Energy Minimization Methods in Computer Vision and Pattern Recognition (EMMCVPR), Springer. pp. 205–218.


Strandmark, P., Ulen, J., Kahl, F., Grady, L., 2013. Shortest paths with curvature and torsion, in: IEEE International Conference on Computer Vision (IEEE ICCV), IEEE. pp. 2024–2031.

Strekalovskiy, E., Cremers, D., 2011. Generalized ordering constraints for multilabel optimization, in: IEEE International Conference on Computer Vision (IEEE ICCV), IEEE. pp. 2619–2626.

Strekalovskiy, E., Nieuwenhuis, C., Cremers, D., 2012. Nonmetric priors for continuous multilabel optimization, in: European Conference on Computer Vision (ECCV). Springer, pp. 208–221.

Su, L.M., Vagvolgyi, B.P., Agarwal, R., Reiley, C.E., Taylor, R.H., Hager, G.D., 2009. Augmented reality during robot-assisted laparoscopic partial nephrectomy: toward real-time 3D-CT to stereoscopic video registration. Urology 73, 896–900.

Szummer, M., Kohli, P., Hoiem, D., 2008. Learning CRFs using graph cuts. European Conference on Computer Vision (ECCV), 582–595.

Tabesh, A., Teverovskiy, M., Pang, H.Y., Kumar, V.P., Verbel, D., Kotsianti, A., Saidi, O., 2007. Multifeature prostate cancer diagnosis and Gleason grading of histological images. IEEE Transactions on Medical Imaging (IEEE TMI) 26, 1366–1378.

Tang, L., Hamarneh, G., 2013. Medical image registration: A review (chapter 22). Medical Imaging: Technology and Applications, 619–660.

Tola, E., Lepetit, V., Fua, P., 2008. A fast local descriptor for dense matching, in: IEEE Conference on Computer Vision and Pattern Recognition (IEEE CVPR), IEEE. pp. 1–8.

Top, A., et al., 2011. Active learning for interactive 3D image segmentation, in: Medical Image Computing and Computer-Assisted Intervention (MICCAI). Springer. volume 6893, pp. 603–610.

Top, A., Hamarneh, G., Abugharbieh, R., 2011. Spotlight: Automated confidence-based user guidance for increasing efficiency in interactive 3D image segmentation, in: Medical Computer Vision. Recognition Techniques and Applications in Medical Imaging. Springer, pp. 204–213.

Tsai, A., Yezzi Jr, A., Wells, W., Tempany, C., Tucker, D., Fan, A., Grimson, W.E., Willsky, A., 2003. A shape-based approach to the segmentation of medical imagery using level sets. IEEE Transactions on Medical Imaging (IEEE TMI) 22, 137–154.

Tsai, A., Yezzi Jr, A., Wells III, W., Tempany, C., Tucker, D., Fan, A., Grimson, W.E., Willsky, A., 2001. Model-based curve evolution technique for image segmentation, in: IEEE Conference on Computer Vision and Pattern Recognition (IEEE CVPR), IEEE. pp. I–463.

Tsai, P., Chang, C.C., Hu, Y.C., 2002. An adaptive two-stage edge detection scheme for digital color images. Real-Time Imaging 8, 329–343.

Tu, Z., Bai, X., 2010. Auto-context and its application to high-level vision tasks and 3D brain image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI) 32, 1744–1757.

Tu, Z., Zhou, X.S., Comaniciu, D., Bogoni, L., 2006. A learning based approach for 3D segmentation and colon detagging, in: European Conference on Computer Vision (ECCV), pp. 436–448.

Udupa, J.K., Grevera, G.J., 2005. Go digital, go fuzzy, in: Pattern Recognition and Machine Intelligence. Springer, pp. 137–146.

Ukwatta, E., Yuan, J., Rajchl, M., Fenster, A., 2012. Efficient global optimization based 3D carotid AB-LIB MRI segmentation by simultaneously evolving coupled surfaces, in: Medical Image Computing and Computer-Assisted Intervention (MICCAI), pp. 377–384.

Ulen, J., Strandmark, P., Kahl, F., 2013. An efficient optimization framework for multi-region segmentation based on Lagrangian duality. IEEE Transactions on Medical Imaging (IEEE TMI) 32, 178–188.

Uzumcu, M., Frangi, A.F., Sonka, M., Reiber, J.H., Lelieveldt, B.P., 2003. ICA vs. PCA active appearance models: Application to cardiac MR segmentation, in: Medical Image Computing and Computer-Assisted Intervention (MICCAI), pp. 451–458.

Van Ginneken, B., Frangi, A.F., Staal, J.J., ter Haar Romeny, B.M., Viergever, M.A., 2002. Active shape model segmentation with optimal features. IEEE Transactions on Medical Imaging (IEEE TMI) 21, 924–933.

Vazquez-Reina, A., Miller, E., Pfister, H., 2009. Multiphase geometric couplings for the segmentation of neural processes, in: IEEE Conference on Computer Vision and Pattern Recognition (IEEE CVPR), pp. 2020–2027.

Veksler, O., 2008. Star shape prior for graph-cut image segmentation, in: European Conference on Computer Vision (ECCV). Springer, pp. 454–467.

Vese, L., Chan, T., 2002. A multiphase level set framework for image segmentation using the Mumford and Shah model. International Journal of Computer Vision (IJCV) 50, 271–293.

Vicente, S., Kolmogorov, V., Rother, C., 2008. Graph cut based image segmentation with connectivity priors, in: IEEE Conference on Computer Vision and Pattern Recognition (IEEE CVPR), pp. 1–8.

Vincent, L., 1993. Morphological grayscale reconstruction in image analysis: applications and efficient algorithms. IEEE Transactions on Image Processing 2, 176–201.

Vincent, L., Soille, P., 1991. Watersheds in digital spaces: an efficient algorithm based on immersion simulations. IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI) 13, 583–598.

Wang, C., Komodakis, N., Paragios, N., 2013. Markov random field modeling, inference & learning in computer vision & image understanding: A survey. Computer Vision and Image Understanding (CVIU) 117, 1610–1627. doi:10.1016/j.cviu.2013.07.004.

Wang, C., Teboul, O., Michel, F., Essafi, S., Paragios, N., 2010. 3D knowledge-based segmentation using pose-invariant higher-order graphs, in: Medical Image Computing and Computer-Assisted Intervention (MICCAI). Springer, pp. 189–196.

Wang, J., Agrawala, M., Cohen, M.F., 2007. Soft scissors: an interactive tool for realtime high quality matting, in: ACM Transactions on Graphics (TOG), ACM. p. 9.

Wang, Z., Vemuri, B.C., 2004. An affine invariant tensor dissimilarity measure and its applications to tensor-valued image segmentation, in: IEEE Conference on Computer Vision and Pattern Recognition (IEEE CVPR), IEEE. pp. I–228.

Weldeselassie, Y.T., Hamarneh, G., 2007. DT-MRI segmentation using graph cuts, in: Medical Imaging, International Society for Optics and Photonics. pp. 65121K–65121K.

Wells III, W.M., Grimson, W.E.L., Kikinis, R., Jolesz, F.A., 1996. Adaptive segmentation of MRI data. IEEE Transactions on Medical Imaging (IEEE TMI) 15, 429–442.

Wu, X., Dou, X., Wahle, A., Sonka, M., 2011. Region detection by minimizing intraclass variance with geometric constraints, global optimality, and efficient approximation. IEEE Transactions on Medical Imaging (IEEE TMI) 30, 814–827.

Xu, C., Prince, J.L., 1997. Gradient vector flow: A new external force for snakes, in: IEEE Conference on Computer Vision and Pattern Recognition (IEEE CVPR), IEEE. pp. 66–71.

Xu, C., Prince, J.L., 1998. Generalized gradient vector flow external forces for active contours. Signal Processing 71, 131–139.

Yan, P., Kassim, A.A., 2006. Segmentation of volumetric MRA images by using capillary active contour. Medical Image Analysis (MedIA) 10, 317–329.

Yazdanpanah, A., Hamarneh, G., Smith, B.R., Sarunic, M.V., 2011. Segmentation of intra-retinal layers from optical coherence tomography images using an active contour approach. IEEE Transactions on Medical Imaging (IEEE TMI) 30, 484–496.

Yuan, J., Bae, E., Boykov, Y., Tai, X.C., 2012. A continuous max-flow approach to minimal partitions with label cost prior, in: Scale Space and Variational Methods in Computer Vision. Springer, pp. 279–290.

Zeng, X., Staib, L.H., Schultz, R.T., Duncan, J.S., 1998. Volumetric layer segmentation using coupled surfaces propagation, in: IEEE Conference on Computer Vision and Pattern Recognition (IEEE CVPR), pp. 708–715.

Zeng, Y., Samaras, D., Chen, W., Peng, Q., 2008. Topology cuts: A novel min-cut/max-flow algorithm for topology preserving segmentation in n-D images. Computer Vision and Image Understanding (CVIU) 112, 81–90.

Zhang, S., Zhan, Y., Metaxas, D.N., 2012. Deformable segmentation via sparse representation and dictionary learning. Medical Image Analysis (MedIA) 16, 1385–1396.

Zhang, Y., Brady, M., Smith, S., 2001. Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm. IEEE Transactions on Medical Imaging (IEEE TMI) 20, 45–57.

Zhao, H., Chan, T., Merriman, B., Osher, S., 1996. A variational level set approach to multiphase motion. Journal of Computational Physics 127, 179–195.

Zhu, S.C., Yuille, A., 1996. Region competition: Unifying snakes, region growing, and Bayes/MDL for multiband image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI) 18, 884–900.

Zhu-Jacquot, J., Zabih, R., 2007. Graph cuts segmentation with statistical shape priors for medical images, in: International IEEE Conference on Signal-Image Technologies and Internet-Based System, pp. 631–635.


[Figure residue: schematic illustrations of image intensity I(x,y) and energy functionals E(S) (non-convex vs. convex, less vs. more certainty); an average time-activity plot over brain tissue labels (background, skull, gray matter, white matter, putamen, cerebellum); and a taxonomy of segmentation energy terms — data terms (raw pixel value, texture, histogram, frequency, other transformations; descriptors such as SIFT, HOG, GLOH, DAISY, LBP, SURF) and regularization terms (total variation, length, curvature, torsion, shape, topology with connectivity and genus constraints; 1st/2nd order derivatives, boundary polarity; geometric (template), statistical, physical, and learning-based priors; moments, spatial distance).]