Available online at www.sciencedirect.com
IMAVIS 2679 No. of Pages 14, Model 5+
10 March 2008 Disk UsedARTICLE IN PRESS
www.elsevier.com/locate/imavis
Image and Vision Computing xxx (2008) xxx–xxx
Semiautomatic segmentation with compact shape prior q
Piali Das c, Olga Veksler a,*, Vyacheslav Zavadsky b, Yuri Boykov a
a University of Western Ontario, Computer Science, Middlesex College, 361, London, Ont., Canada N6A 5B7
b Semiconductor Insight Inc., Ottawa, Ont., Canada
c Atamai Inc., London, Ont., Canada
Received 9 February 2007; received in revised form 7 February 2008; accepted 7 February 2008
Abstract
In recent years, interactive methods for segmentation have been increasing in popularity due to their success in different domains such as medical image processing, photo editing, etc. We present an interactive segmentation algorithm that can segment an object of interest from its background with minimum guidance from the user, who just has to select a single seed pixel inside the object of interest. Due to minimal requirements from the user, we call our algorithm semiautomatic. To obtain a reliable and robust segmentation with such low user guidance, we have to make several assumptions. Our main assumption is that the object to be segmented is of compact shape, or can be approximated by several connected roughly collinear compact pieces. We base our work on the powerful graph cut segmentation algorithm of Boykov and Jolly, which allows straightforward incorporation of the compact shape constraint. In order to make the graph cut approach suitable for our semiautomatic framework, we address several well-known issues of the graph cut segmentation technique. In particular, we counteract the bias towards shorter segmentation boundaries and develop a method for automatic selection of parameters. We demonstrate the effectiveness of our approach on the challenging industrial application of transistor gate segmentation in images of integrated chips. Our approach produces highly accurate results in real-time.
© 2008 Elsevier B.V. All rights reserved.
Keywords: Segmentation; Shape prior; Graph cut; Parameter estimation
1. Introduction
Segmentation is generally defined as the problem of partitioning an image into two or more constituent components, where each component has a short summary representation. This definition is rather vague, because general purpose segmentation is not well defined. Segmentation becomes a much better defined problem when it is developed for a particular application, since then one frequently has a clearer idea of the properties a segmentation should have.
0262-8856/$ - see front matter © 2008 Elsevier B.V. All rights reserved.
doi:10.1016/j.imavis.2008.02.006
q This research was partially supported by NSERC and Semiconductor Insights.
* Corresponding author. Tel.: +1 519 858 4403.
E-mail addresses: [email protected] (P. Das), [email protected] (O. Veksler), [email protected] (V. Zavadsky), [email protected] (Y. Boykov).
Please cite this article in press as: P. Das et al., Semiautomatic segmentation with compact shape prior, Image Vis. Comput. (2008), doi:10.1016/j.imavis.2008.02.006
There are mainly three approaches to segmentation: automatic, manual and interactive. Manual segmentation is labor intensive and extremely time consuming. Purely automatic segmentation is very challenging, due to ambiguities in the presence of multiple objects, image noise, weak edges, etc. Ambiguity problems can be eased with user guidance, which is the idea of interactive segmentation methods. Hence, their popularity is increasing in applications in different domains [18,24,5,3,23,1,4].
The motivation behind our work is to reduce interaction to the minimum, asking the user to just choose the object of interest by clicking inside it. We call our approach semiautomatic segmentation, to distinguish it from general interactive segmentation, where the user is allowed to provide a potentially unlimited amount of guidance. The name semiautomatic is used to emphasize that our algorithm is only a step away from the automatic segmentation, since only one seed point is required from the user. General interactive segmentation can be quite far from automatic segmentation if lots of input is required from the user in order to achieve satisfactory results.
¹ We use the word compact informally; we will explain what we mean by it later.
To produce an accurate and robust segmentation, we have to develop our algorithm with some application in mind, since, as we have already mentioned, general purpose segmentation is an ill-defined problem. We chose to design our algorithm in the context of an interesting industrial application, which requires transistor gates to be segmented from the images of integrated chips.
Over the years, researchers have developed different techniques for segmentation. Some of the primitive methods that have been popular because of their simplicity are region growing, split-and-merge, edge detection and thresholding, see, for example, Gonzalez and Woods [15]. Although these methods and their variants are still widely used, they are not robust as they are based on local decisions. For example, the major problem with region growing is the "leaking" through weak points in the boundary, which is inevitable in most images. Likewise, thresholding fails when the object of interest is not homogeneous. In particular, objects with smoothly varying intensities are split into several segments.
To overcome problems due to local decision strategies, global properties have to be included in the segmentation. The graph theoretic approach to segmentation allows us to do so. Various graph based algorithms have been proposed over the years [33,27,30,5,17,12,32,4,16]. They differ in the way the segmentation is interpreted and in the techniques employed to solve the problem. However, all these methods typically involve two main steps: formulating an objective function and optimizing it.
In some approaches, such as live wire [11,24], a global objective function is implicit. Live wire is a paradigm for segmentation that requires the user to mark a seed on the object boundary. As the user moves the cursor (the free point) close to the object boundary, a curve (livewire) clings to the object boundary and segments the object. The curve position is optimized by finding the shortest path on a certain graph. In this approach a considerable amount of interaction may be required in order to find the appropriate segmentation.
Level sets [25], normalized cut [27], active contour (snake) evolution [18,7,2], and graph cut [5] formulate the energy function explicitly based on various global properties that the segmentation is expected to have. Unfortunately, for many energy functions that one may wish to formulate, finding their global minimum is computationally prohibitive. Normalized cut computes only an approximation to the global minimum, and in most cases, active contours and level sets compute only a local minimum (a few notable special case exceptions are Cohen and Kimmel [8,21]).
The advantage of the graph cut compared to the above listed methods is that it guarantees a globally optimal solution for a family of energy functions. An additional benefit is that one can easily incorporate both regional and boundary properties of segmentation. Also, unlike most active contour/level set methods, graph cut is not sensitive to the initialization [4]. Furthermore, level sets/snakes would be unsuitable for our semiautomatic approach since they require the user to initialize a contour, not just one point. These advantages make the graph cut method much more attractive than others in achieving our goal.
As segmentation is a subjective problem, we start with the already mentioned application of transistor gate segmentation in the images of integrated chips. We make several assumptions based on the prior knowledge of our data and fit them into the framework of the algorithm in Boykov and Jolly [5]. The most important assumption that we make is that an object to be segmented is compact¹ in shape. While this assumption allows us to produce very robust segmentations, it is also our most restrictive assumption, making our algorithm not suitable for segmentation of objects of general shapes. However, apart from the transistor gates there are important applications (industrial and medical) where the objects of interest are approximately compact. Furthermore, we can also handle objects with somewhat more general shapes, specifically the objects that can be divided either vertically or horizontally into several approximately collinear pieces, where each piece is compact in shape.
There are several related methods that incorporate shape priors into graph cut segmentation. In Slabaugh and Unal [28] the authors incorporate an elliptical prior in an iterative refinement process. The disadvantages of this approach are that it is iterative and that the elliptic shape assumption is overly restrictive for many applications. In Freedman and Zhang [14], the shape prior can be arbitrary, but their method requires a very accurate registration of the assumed shape with the actual location of the object of interest in the image, which is a difficult task in itself. In Kumar et al. [20], they also require fitting of a model of a certain shape to an image, and their method, which uses sampling for estimation of the model's parameters, is very computationally intensive.
The use of shape priors for segmentation has been investigated before. Recently there has been a lot of work on using shape priors in level set segmentation, some examples are Leventon et al. [22], Tsai et al. [29], Rousson and Paragios [26], Cremers et al. [10], Cremers et al. [9]. However, level set segmentation is not numerically stable and the solution is prone to getting stuck in a local minimum.
Another issue that we address is the parameter selection. In the framework of Boykov and Jolly [5], the values of parameters have a direct impact on the result produced by the algorithm. An unfortunate choice of parameters can produce unacceptable segmentation results that have to be detected by the user and corrected by possibly a considerable amount of interaction. This is not acceptable for our semiautomatic approach, since our goal is to reduce user interaction to a single click. If the segmentation algorithm is used for a collection of images that do not exhibit large variability, then it is possible to select the parameters that work well for that type of images beforehand. However, we found that for our application, the images do exhibit considerable variability and selecting fixed parameters that work well for most instances is not possible. For each image, there is an optimal setting of parameters that works well, but estimating that range is difficult. Our solution is to run the segmentation algorithm for a range of parameters and choose the highest quality segmentation. This, of course, requires some way of judging the quality of segmentation. We devise a simple but intuitive test to check the quality of the segment automatically. This "quality check" is application dependent. If the current segment does not pass the quality check, the parameters are readjusted and the graph cut step is redone with the new parameters. We iterate this process using a search over parameter space until the resulting segment passes the quality check. Thus in our work, we estimate all the important parameters of the algorithm automatically.
If we could directly incorporate our "quality check" into the energy function, then we would not have to search over a range of parameters but could compute the best quality segment in one step. Unfortunately we cannot incorporate our quality check into the energy function in such a way that it still can be minimized with a graph cut.
When the user provides many seed points, or when an accurate color model of the object of interest is known, the regional properties of the object can be relied on, and are included in the graph cut segmentation with a large weight. Our goal is to have a very low input from the user, who just marks one object seed point. Thus we do not have enough samples from the user to construct a reliable model for the color distribution of the object. In this case we have to allow the object to deviate from the unreliable color model, and therefore the regional terms are given a smaller weight (the smaller the weight of the regional terms, the more the object is allowed to deviate from the color model). When regional terms have smaller weight, boundary terms become relatively more important. This makes sense intuitively, since if there is no reliable color model, we must rely more on the fact that we expect the object boundary to align with intensity edges in the image. A serious difficulty in graph cut segmentation in the case when regional terms have a small weight is that there is a bias towards producing segments with shorter boundaries. In our framework, we can easily counteract this bias. It turns out that due to incorporating the compact shape prior in the graph cut framework, we can introduce a new parameter bias, which biases the algorithm towards a larger object segment.² The bias is exactly the parameter for which we search over a range of values to find the segmentation that passes the quality check mentioned above.
² Without the compact shape prior, incorporating the bias parameter results in an energy function which is not submodular, and thus cannot be minimized exactly with a graph cut, see Section 3.4.
Thus our main contributions to the graph cut segmentation framework of Boykov and Jolly [5] are as follows. We introduce the idea of an application dependent "quality check" which can be effectively used for automatic parameter selection. We introduce the compact shape prior, which lets us deal with the objects of compact shape very robustly. Lastly, due to the shape prior, we are able to introduce a bias parameter which allows us to counteract the shrinking bias of the graph cut segmentation.
We evaluate our approach on a transistor segmentation application for Semiconductor Insights, which is an engineering consultancy company specializing in intellectual property protection and competitive intelligence in the integrated circuit domain. Our segmentation algorithm produces highly accurate results in real-time,³ and was used to upgrade their manual system to a semiautomatic one.
This paper is organized as follows. In Section 2, we review the graph cut segmentation framework of Boykov and Jolly [5], in Section 3 we describe our work, in Section 4, we present our experimental results and we finally conclude with a discussion in Section 5.
2. Graph cut segmentation
In this section we briefly review the graph cut segmentation algorithm in Boykov and Jolly [5].
2.1. Graph cut
Let $G = \langle V, E \rangle$ be a graph consisting of a set of vertices $V$ and a set of edges $E$ connecting the vertices. Each edge $e \in E$ in $G$ is assigned a non-negative cost $w_e$. There are two special vertices called terminals identified as the source, $s$, and the sink, $t$. A cut $C$ is a subset of edges $C \subset E$, which when removed from $G$ partitions $V$ into two disjoint sets $S$ and $T = V - S$ such that $s \in S$ and $t \in T$. The cost of the cut $C$ is just the sum of its edge weights:
$$|C| = \sum_{e \in C} w_e.$$
The minimum cut is the cut with the smallest cost. The max-flow/min-cut algorithm of Ford and Fulkerson [13] can be used to obtain the minimum cut. We use the max-flow algorithm developed by Boykov and Kolmogorov [6], which was designed specifically for computer vision applications and has the best performance in practice.
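As a small illustration of these definitions, the following sketch computes a minimum s-t cut on a toy directed graph. It is only an illustration under stated assumptions: it uses a plain breadth-first augmenting-path (Edmonds-Karp) max-flow rather than the algorithm of Boykov and Kolmogorov [6], and the function name `min_cut` and the toy graph are hypothetical. An undirected edge can be modeled as two directed edges of equal weight.

```python
from collections import defaultdict, deque


def min_cut(edges, s, t):
    """Return (cost, S) for a minimum s-t cut of a directed graph.

    edges: iterable of (u, v, w) with non-negative weight w.
    Illustrative Edmonds-Karp max-flow, not the algorithm of [6].
    """
    # Residual capacities; reverse edges start at 0.
    cap = defaultdict(lambda: defaultdict(int))
    for u, v, w in edges:
        cap[u][v] += w
        cap[v][u] += 0

    def bfs_parents():
        # Breadth-first search over edges with residual capacity left.
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:
        parent = bfs_parents()
        if t not in parent:
            # No augmenting path left: vertices reachable from s form S.
            return flow, set(parent)
        # Collect the path from t back to s, then push the bottleneck flow.
        path = []
        v = t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= push
            cap[v][u] += push
        flow += push


# Toy graph (hypothetical weights).
toy_edges = [("s", "a", 4), ("s", "b", 2), ("a", "b", 1),
             ("a", "t", 2), ("b", "t", 3)]
cost, source_side = min_cut(toy_edges, "s", "t")
print(cost, sorted(source_side))  # prints: 5 ['a', 's']
```

The returned set is the source side $S$ of the cut; the cut edges are those leaving $S$, and their weights sum to the returned cost.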
2.2. Segmentation algorithm
In Boykov and Jolly [5], the problem of segmenting an object from its background is interpreted as a binary labeling problem, which can be solved in an energy minimization framework. The labeling corresponding to the minimum energy is chosen as the solution. Let $P$ be the set of all pixels in the image, and let $N$ be the standard 4-connected neighborhood system on $P$, that is $N$ is a set of pixel pairs $\{p, q\}$ where $p$ is either immediately to the right, or left, or top, or bottom of $q$.
³ The system is real time in the sense that the user does not have to wait more than a couple of seconds after he/she places a seed in the image of the transistor gate to be segmented.
⁴ Rerunning the algorithm after the addition of new seeds usually takes much less time than the first run of the algorithm because the flow from the previous iteration can be reused and the max flow program does not start from scratch. However, the time required from the user to enter the new seeds can still be considerable.
Each pixel in the image has to be assigned a label from the label set $L = \{0, 1\}$, where 0 and 1 represent the background and the object, respectively. Let $S = \{S_1, \ldots, S_p, \ldots, S_{|P|}\}$ be a binary set that defines a segmentation, where each $S_p \in L$ is the label assigned to pixel $p$. Thus the set $P$ is partitioned into two subsets, where pixels in one subset are labeled 0 and the ones in the other subset are labeled 1.
The energy function has the following form:
$$E(S) = \alpha R(S) + B(S). \qquad (1)$$
In Eq. (1), $R(S)$ is called the regional term because it incorporates the regional constraints into the segmentation. Specifically, $R(S)$ measures how well pixels fit into the object or background models under labeling $S$. It has the following form:
$$R(S) = \sum_{p \in P} R_p(S_p), \qquad (2)$$
where $R_p(S_p)$ is the penalty of assigning the label $S_p$ to pixel $p$. If label $S_p$ is likely for a pixel $p$, then $R_p(S_p)$ should be small. If label $S_p$ is unlikely for a pixel $p$, then $R_p(S_p)$ should be large.
The term $B(S)$ in Eq. (1) is called the boundary term because it incorporates the boundary constraints. A segmentation boundary occurs whenever two neighboring pixels are assigned different labels. Thus $B(S)$ is defined as a sum over neighboring pixel pairs:
$$B(S) = \sum_{\{p,q\} \in N,\ p<q} B_{pq}(S_p, S_q), \qquad (3)$$
where $N$ is the set of all neighboring pixel pairs, and $B_{pq}(S_p, S_q)$ describes the penalty for assigning labels $S_p$ and $S_q$ to two neighboring pixels. The term $B_{pq}$ is used to incorporate the prior knowledge that most nearby pixels tend to have the same label. Thus there is no penalty if neighboring pixels have the same label and a penalty otherwise. Typically, $B_{pq}(S_p, S_q) = w_{pq} \cdot T(S_p \neq S_q)$, where $T(\cdot)$ is an indicator function of a boolean argument defined as:
$$T(S_p \neq S_q) = \begin{cases} 1 & \text{if } S_p \neq S_q, \\ 0 & \text{otherwise.} \end{cases}$$
To align the segmentation boundary with intensity edges, $w_{pq}$ is typically chosen to be a non-increasing function of $|I_p - I_q|$, where $I_p$ and $I_q$ are the intensities of pixels $p$ and $q$, respectively.
Note that the term $\alpha \geq 0$ in (1) decides the relative importance of the regional and boundary terms. The larger the value of $\alpha$ is, the more importance the regional constraints $R(S)$ have compared with the boundary constraints $B(S)$. Larger values of $\alpha$ result in a segmentation which obeys the regional model more. Smaller values of $\alpha$ result in a segmentation with smaller boundary cost, which usually means shorter boundary length. Therefore, this parameter is one of the most important parameters in the graph cut framework, and the hardest parameter to pick beforehand. Typically different images have different optimal values for parameter $\alpha$.
In Boykov and Jolly [5], it is shown how to construct the graph such that the labeling corresponding to the minimum cut on that graph is the labeling optimizing the energy in (1).
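To make the energy in (1) concrete, the sketch below evaluates it for a given labeling on a small 4-connected grid. It only evaluates the energy and does not minimize it. The function names, the caller-supplied `regional` callback, and the Gaussian form of the boundary weight (one common example of a non-increasing function of the intensity difference, with a hypothetical `sigma`) are assumptions made for illustration.

```python
import math


def boundary_weight(ip, iq, sigma=10.0):
    # Assumed example of a non-increasing function of |Ip - Iq|.
    return math.exp(-((ip - iq) ** 2) / (2.0 * sigma ** 2))


def energy(labels, image, regional, alpha, sigma=10.0):
    """Evaluate E(S) = alpha * R(S) + B(S) on a 4-connected grid.

    labels[y][x]          : 0 (background) or 1 (object), the labeling S
    image[y][x]           : pixel intensity
    regional(y, x, label) : penalty R_p(S_p), supplied by the caller
    """
    h, w = len(labels), len(labels[0])
    r_term = sum(regional(y, x, labels[y][x])
                 for y in range(h) for x in range(w))
    b_term = 0.0
    for y in range(h):
        for x in range(w):
            # Visit each unordered neighbor pair once: right and down.
            for ny, nx in ((y, x + 1), (y + 1, x)):
                if ny < h and nx < w and labels[y][x] != labels[ny][nx]:
                    b_term += boundary_weight(image[y][x], image[ny][nx], sigma)
    return alpha * r_term + b_term
```

With a zero regional term, a labeling whose boundary follows a strong intensity edge receives a much lower energy than one whose boundary crosses a uniform region, which is the behavior the boundary term is meant to encourage.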
3. Our work
The goal of our semiautomatic segmentation is accurate and robust segmentation with user interaction restricted to a single click inside the object of interest. The graph cut algorithm [5] has several issues which make its direct use unsuitable for semiautomatic segmentation. We address these issues in our work.
In Boykov and Jolly [5], the user has to initially select a few object and background seeds. After running the algorithm the user has to inspect the quality of the segmentation. If required, he/she has to repeatedly add new seeds and rerun the algorithm until an acceptable segmentation is obtained.⁴ Moreover, the results of the algorithm depend heavily on the choice of parameter $\alpha$ for the energy function in Eq. (1). If the choice of $\alpha$ is far from optimal, the user might have to perform a significant amount of interaction.
Application specific semiautomatic segmentation is a more tractable problem than general purpose semiautomatic segmentation. One of our main ideas is that for a specific application, it may not be too hard to come up with a goal-dependent measure of segment quality. We develop a relatively simple "quality check" which lets us decide whether segmentation under current parameters in Eq. (1) is satisfactory. With this quality check at hand, we can then search over a range of parameters to quickly and automatically find the parameter value corresponding to a suitable segmentation. Our particular segment "quality check" was designed for a specific application, but it may be possible to design suitable quality checks for other applications. For example, when it is known that an object has a specific shape, a quality check can be based on the shape of the object segment.
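The search described above can be sketched as a simple loop. This is a minimal sketch under stated assumptions: `segment` and `quality_check` are hypothetical callables standing in for the graph cut step and the application-dependent test, and trying the candidates in the given order is just one possible search strategy.

```python
def search_parameters(segment, quality_check, candidates):
    """Return (value, segmentation) for the first candidate passing the check.

    segment(value)        : hypothetical callable running graph cut for one value
    quality_check(result) : hypothetical application-dependent test, True if OK
    candidates            : parameter values to try, in search order
    """
    for value in candidates:
        result = segment(value)
        if quality_check(result):
            return value, result
    return None, None  # no candidate produced an acceptable segment


# Toy usage: the "segmentation" is just an area, accepted within size bounds.
value, area = search_parameters(
    segment=lambda v: v * 10,
    quality_check=lambda a: 25 <= a <= 60,
    candidates=[1, 2, 3, 4],
)
print(value, area)  # prints: 3 30
```

Keeping the quality check as a separate callable reflects the point made in the text: the check is application dependent, while the search loop itself is not.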
In our particular application, the objects are of compact shape (or close to compact shape); we explain what we mean by compact in Section 3.1. Thus, we introduce a compact shape as a hard constraint in our segmentation. Many objects can be approximated by a compact shape, so a similar construction can be used in other applications. A major benefit of including the compact shape prior is that the objects of this shape are segmented more robustly and reliably. Weak boundaries, background clutter, and image noise are easier to overcome with the use of a shape prior. An additional and very important benefit of using the compact shape prior is that we can include a new parameter in our energy function which incorporates a bias to larger objects, as explained in Section 3.4. This helps to solve another general issue in graph cut segmentation, namely its bias to produce segments of smaller size.
Fig. 1. This figure shows how the object segment is restricted in different quadrants drawn with respect to the object seed (marked in dark gray). The quadrants intersect along the pixels through which the bold lines pass. The segment in light gray color shows an example of a compact shape.
⁵ Recall that we use a 4-connected neighborhood system.
This section is organized as follows. In Section 3.1, we explain the compact shape prior, in Section 3.2, we discuss the assumptions made by our algorithm, in Section 3.3, we give the regional term that we use for the energy in (1), in Section 3.4, we explain our boundary term and show that our energy function can be minimized exactly with a graph cut, in Section 3.6, we discuss shapes more general than compact that our algorithm can handle, and in Section 3.7, we give an overview of our algorithm.
3.1. Compact shape
In this section, we define the compact shape precisely. As we have already mentioned, incorporating a shape prior helps to achieve a more robust segmentation, because all shapes inconsistent with the assumed shape are ignored. This results in an increased robustness to weak boundaries, noise, and clutter. However, incorporating a shape prior within the graph cut framework is a difficult task. We are aware of only three previous approaches: Slabaugh and Unal [28], Freedman and Zhang [14], and Kumar et al. [20]; their disadvantages have been discussed in Section 1.
We develop a shape prior which can be incorporated in the graph cut framework directly, without the need for iterative optimization or registration. We call our shape prior compact, borrowing the idea from Veksler [31]. The word compact is used informally. In Veksler [31], the word compact was chosen to reflect the fact that for compact shapes, the perimeter to area ratio tends to be small. Intuitively, this shape prior encourages objects with boundaries that are relatively simple. Our shape prior is especially appropriate for industrial parts, and includes rectangles and ellipses as special cases.
We now formally define our shape prior. Consider Fig. 1. In this figure, the squares represent the image pixels, and the dark gray square represents the seed point that the user has selected. We divide the image into four slightly overlapping quadrants with respect to the seed, as shown in the figure. Let us name these quadrants $P_1$, $P_2$, $P_3$, and $P_4$. Quadrant $P_1$ consists of all pixels above and to the right of the seed, including the seed. Quadrant $P_2$ consists of all the pixels above and to the left of the seed, including the seed. Notice that quadrants $P_1$ and $P_2$ have in common all pixels exactly above the seed, including the seed. Similarly, $P_3$ consists of all the pixels below and to the left of the seed, and, finally, $P_4$ consists of all the pixels to the right and below the seed. We say that an object is compact if its boundary can be fully traced using only the edges in each quadrant shown in Fig. 1. Intuitively, in each quadrant, the boundary of the object is allowed to follow along only two out of four possible directions. This implies that the boundary in each quadrant is relatively simple and short.
In the graph cut framework, in order for the object segment to be compact, we must prohibit a certain set of label assignments to neighboring pixels. For example, for any neighboring pixels $p$ and $q$ in the first quadrant, we must prohibit assigning 0 to $p$ and 1 to $q$ if $p$ is either to the left or below $q$.⁵ We will use notation $p <_l q$ to denote that pixel $p$ is to the left of $q$. Similarly, notation $p <_a q$ means that pixel $p$ is above pixel $q$. If $l, l'$ are labels, we will denote the assignment of $l$ to pixel $p$ and $l'$ to pixel $q$ by $(p \leftarrow l, q \leftarrow l')$. Now, we can define the set of prohibited assignments:
$$\begin{aligned}
A^{*} ={}& \{(p \leftarrow 0, q \leftarrow 1) \mid p, q \in P_1 \cup P_4,\ p <_l q\} \\
\cup{}& \{(p \leftarrow 0, q \leftarrow 1) \mid p, q \in P_2 \cup P_3,\ q <_l p\} \\
\cup{}& \{(p \leftarrow 0, q \leftarrow 1) \mid p, q \in P_1 \cup P_2,\ q <_a p\} \\
\cup{}& \{(p \leftarrow 0, q \leftarrow 1) \mid p, q \in P_3 \cup P_4,\ p <_a q\}.
\end{aligned}$$
We say that an object segment is of compact shape if no prohibited assignments are made in its segmentation.
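The definition can be checked directly on a binary mask. The sketch below tests every pair of 4-connected neighbors against the four prohibited cases. It is an illustration under stated assumptions: the function name `is_compact` is hypothetical, rows are indexed top to bottom, and the quadrants are taken to overlap along the seed's row and column.

```python
def is_compact(mask, seed):
    """Check a binary mask against the prohibited assignments.

    mask[y][x] is 1 for object and 0 for background; y grows downward.
    seed = (sy, sx). Quadrants overlap along the seed's row and column.
    """
    sy, sx = seed
    h, w = len(mask), len(mask[0])
    for y in range(h):
        for x in range(w - 1):
            left, right = mask[y][x], mask[y][x + 1]
            # P1 and P4 (right of the seed): background may not precede object.
            if x >= sx and left == 0 and right == 1:
                return False
            # P2 and P3 (left of the seed): object may not precede background.
            if x + 1 <= sx and left == 1 and right == 0:
                return False
    for y in range(h - 1):
        for x in range(w):
            upper, lower = mask[y][x], mask[y + 1][x]
            # P1 and P2 (above the seed): object may not sit above background.
            if y + 1 <= sy and upper == 1 and lower == 0:
                return False
            # P3 and P4 (below the seed): background may not sit above object.
            if y >= sy and upper == 0 and lower == 1:
                return False
    return True
```

For instance, under these conventions a filled rectangle passes the check for any seed inside it, while a mask in which an object pixel is separated from the seed by a background pixel on the same row does not.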
Our definition of a compact shape might sound similar to that of a convex shape, but these two types of shapes are actually quite different. The classes of compact and convex shapes overlap but neither class contains the other. There are convex shapes which are compact, for example the rectangular object in Fig. 2(a). The shape of the object in Fig. 2(b) on the other hand is not convex but it is compact. The object in Fig. 2(c) is an example of an object which is neither compact nor convex. The object in Fig. 2(d) is convex but not compact.
A weakness of the compact shape prior is that it is not rotationally invariant, since the definition relies on the vertical and horizontal axes. Suppose an object is compact with respect to the vertical and horizontal axes rotated through an angle $\theta$. If we can compute $\theta$ (for example, if the object is rectangular but rotated by angle $\theta$), then we can use our algorithm by defining the compactness of the object with respect to the calculated axes.
Another weakness of the compact shape prior is that it is defined with respect to the seed location. Depending on where the user clicks, the object may or may not be compact. We have noticed that users tend to click in the center of the object (or can be specifically instructed to click in the center of the object), therefore we make an implicit assumption here that the shape of interest is compact with respect to its center. Notice that some common shapes, such as rectangles and ellipses, are compact with respect to any seed location.
3.2. Our assumptions
In this paper, we make the following assumptions: (a) the average magnitude of the gradient along the boundary of the object of interest is larger than the average magnitude of the gradient among pixels inside the object. In other words, on average, the intensity difference between pairs of pixels both of which lie inside the object is smaller than the intensity difference between pairs of pixels of which one is inside the object and the other is outside the object; (b) the minimum and maximum object sizes are known; (c) the objects to be segmented are compact in shape or can be divided either vertically or horizontally into approximately collinear compact parts. The first assumption is often satisfied in practice, since an object of interest frequently has a boundary corresponding to a strong intensity edge. The minimum/maximum size of the object can frequently be determined for a specific application. The last assumption is the most restrictive, but can still be satisfied by certain applications, for example by the application we test our segmentation algorithm on.
Fig. 2. (a and b) are examples of objects which are compact in shape with respect to the seeds shown with a white checked box. (c and d) are examples of objects which are not compact in shape.
3.3. Regional term
In this section, we discuss the regional term that we use in Eq. (1). In the segmentation algorithm of Boykov and Jolly [5], the user initially has to provide a few object and background seeds. We have only one object seed provided by the user, so we find the background seeds automatically using the maximum object size information. For the foreground seed pixel $p$, we set $R_p(0) = \mathit{MaxInt}$ and $R_p(1) = 0$, where $\mathit{MaxInt}$ is the maximum integer allowed by the programming environment. This ensures that the foreground seed pixel will always be assigned to the foreground in the optimal labeling. Similarly, if $p$ is an automatically detected background seed pixel, we set $R_p(1) = \mathit{MaxInt}$ and $R_p(0) = 0$.
Since the background is unknown in our application, we use a uniform distribution as the background intensity model, that is, the probability of each intensity is 1/256, given that there are 256 intensity levels in the images. For the object, we have only one pixel marked as the object seed. We use the knowledge of the minimum object size to collect more data around the seed point to build the intensity histogram. Our assumption here is that the user clicks roughly in the center of the object. The user can be instructed to click close to the center, but we have noticed that in many cases users will intuitively prefer to click close to the center. If the click is not close to the center, the object histogram may be inaccurate when the object size is actually the minimum size. However, this case is not very frequent, and as we will see below, we do not overly rely on the object histogram, so the object can still be segmented accurately in most cases.
Even after we collect more data around the object seed, we do not have a sufficient amount of data to faithfully model the intensity distribution of the object. Therefore we use a weighted mixture of a uniform distribution and the smoothed normalized histogram. The actual costs $R_p(S_p)$ are taken as negative logarithms of these likelihood models. Therefore, for pixel $p$,
$$R_p(1) = -\ln\left(\gamma P_{\mathit{hist}}(I_p) + (1-\gamma)\frac{1}{256}\right), \qquad (4)$$
and
$$R_p(0) = -\ln(1/256), \qquad (5)$$
where we assume that there are 256 gray levels possible in an image, and $P_{\mathit{hist}}(I_p)$ is the likelihood of an object pixel having intensity $I_p$ according to the distribution modeled by the smoothed histogram. We smooth the histogram with a Gaussian with $\sigma = 2$ to avoid the problems due to sparse sampling. Notice that adding the uniform model to the histogram-based model in Eq. (4) makes the regional terms more robust. We know that our histogram-based model is not very accurate. By adding a uniform model to it, we make sure that the penalty for an intensity that is not present in the histogram is not so large as to prohibit a pixel with this intensity from being part of the foreground.
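As a rough illustration of Eqs. (4) and (5), the sketch below builds a normalized histogram from sample intensities collected around the seed, smooths it with a Gaussian, and evaluates the two regional costs. The function names, the truncation of the kernel at three standard deviations, and the renormalization after smoothing are assumptions made here for illustration; the text does not specify them.

```python
import math

LEVELS = 256  # number of gray levels assumed in the text


def smoothed_histogram(samples, sigma=2.0):
    """Normalized intensity histogram smoothed with a Gaussian kernel.

    `samples` must be a non-empty list of integers in [0, 255].
    Truncating the kernel at 3*sigma and renormalizing afterwards are
    assumptions of this sketch, not details given in the text.
    """
    hist = [0.0] * LEVELS
    for value in samples:
        hist[value] += 1.0
    radius = int(math.ceil(3 * sigma))
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in offsets]
    smoothed = [0.0] * LEVELS
    for i in range(LEVELS):
        for k, weight in zip(offsets, kernel):
            j = i + k
            if 0 <= j < LEVELS:
                smoothed[i] += weight * hist[j]
    total = sum(smoothed)
    return [v / total for v in smoothed]


def regional_costs(intensity, p_hist, gamma=0.4):
    """Return (R_p(0), R_p(1)) for a non-seed pixel, following Eqs. (4)-(5)."""
    uniform = 1.0 / LEVELS
    r_object = -math.log(gamma * p_hist[intensity] + (1.0 - gamma) * uniform)
    r_background = -math.log(uniform)
    return r_background, r_object
```

Because of the uniform component, an intensity that never occurs in the histogram still receives a finite object cost, which is the robustness property the paragraph above describes.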
6 It is enough to make $K$ equal to the cost $E(S')$, where $S'$ is any segmentation not containing prohibited assignments.
3.4. Boundary term
In this section, we discuss the boundary term that we use in Eq. (1). As in the framework of Boykov and Jolly [5], the boundary term serves to ensure that most nearby pixels are assigned the same label (so that the object and the background regions form coherent blobs) and that the boundary between the object and the background lies on intensity edges. In addition to these two purposes, we use the boundary term to make sure the object segment follows the compact shape described in Section 3.1 and to incorporate a bias towards a larger object segment.
Our boundary terms have the following form:
$$B_{pq}(S_p, S_q) = \begin{cases} 0 & \text{if } S_p = S_q, \\ w_{pq} & \text{if } (p \leftarrow S_p,\, q \leftarrow S_q) \notin A, \\ K & \text{if } (p \leftarrow S_p,\, q \leftarrow S_q) \in A, \end{cases} \qquad (6)$$
where $A$ is the set of prohibited assignments defined in Section 3.1, the constant $K$ is large enough so that any assignment in $A$ is prohibitively expensive (see footnote 6), and
$$w_{pq} = e^{-\frac{(I_p - I_q)^2}{2\sigma^2}} - \mathit{bias}. \qquad (7)$$
The parameter $\sigma$ in Eq. (7) affects the segmentation by controlling when the intensity difference $|I_p - I_q|$ is large enough to be a good place for a segmentation boundary. When $|I_p - I_q| > \sigma$, the weight $w_{pq}$ is typically small enough to allow a boundary. Thus, we compute $\sigma$ as the average difference of the intensities of two adjacent pixels in a region around the user-marked object seed. The size of this region is the same as the smallest possible object size, which is known to us beforehand.
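The estimate of $\sigma$ and the weight of Eq. (7) can be sketched as follows. This is a minimal illustration, assuming a 4-connected neighborhood, an image stored as a list of rows, and a square window centered on the seed; the names `estimate_sigma` and `boundary_weight` are hypothetical. A perfectly flat window would give $\sigma = 0$ and would need separate handling, which the sketch does not attempt.

```python
import math


def estimate_sigma(image, seed, size):
    """Mean absolute intensity difference between adjacent pixels inside a
    size-by-size window centred on seed = (row, col).

    4-connectivity and window clipping at the image border are assumptions
    of this sketch. The window must contain at least two pixels.
    """
    rows, cols = len(image), len(image[0])
    half = size // 2
    r0, r1 = max(0, seed[0] - half), min(rows - 1, seed[0] + half)
    c0, c1 = max(0, seed[1] - half), min(cols - 1, seed[1] + half)
    diffs = []
    for r in range(r0, r1 + 1):
        for c in range(c0, c1 + 1):
            if c + 1 <= c1:  # right neighbour
                diffs.append(abs(image[r][c] - image[r][c + 1]))
            if r + 1 <= r1:  # bottom neighbour
                diffs.append(abs(image[r][c] - image[r + 1][c]))
    return sum(diffs) / len(diffs)


def boundary_weight(i_p, i_q, sigma, bias):
    """w_pq of Eq. (7); it becomes negative once bias exceeds the
    Gaussian term."""
    return math.exp(-((i_p - i_q) ** 2) / (2.0 * sigma ** 2)) - bias
```

Subtracting the bias shifts every weight by the same amount, so the cost of a longer boundary drops while the dependence on the intensity difference keeps its shape.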
Parameter bias in Eq. (7) implements a bias towards a larger segmentation boundary, and it is chosen automatically. When the bias increases, the boundary cost decreases, though the gradient of the function remains the same. We devise a simple, intuitive test that automatically assesses the quality of the segment. The first part of our quality check requires the average absolute intensity difference between pairs of neighboring pixels such that one pixel is in the object segment and the other pixel is in the background segment to be greater than the average absolute intensity difference between pairs of neighboring pixels such that both pixels in the pair are inside the object. This test comes directly from our first assumption in Section 3.2, that is, we simply check whether the segmentation satisfies our first assumption. The second part of our quality check makes sure that the object size is within the bounds specified by the minimum and maximum object sizes, and this part follows from our second assumption in Section 3.2. If the value of bias is too small, the object segment is very small due to the bias of the graph cut towards a short segmentation boundary. In this case, most likely our first assumption will not be satisfied, since the segment consists of a small area around the seed which usually does not have strong intensity edges on its boundary. If the value of bias is too large, the object segment is too large, that is, larger than the specified maximum object size. We search over a range to find an appropriate value of bias that results in a segmentation passing this quality check.
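The two-part quality check described above can be sketched as below. This is an illustration under stated assumptions: pixels are 4-connected, the segment is given as a boolean mask, object size is measured as pixel area, and a segment with no interior pair or no boundary pair is treated as failing. The function name and these conventions are hypothetical, not taken from the text.

```python
def passes_quality_check(image, mask, min_area, max_area):
    """Return True if the segment `mask` (True = object) passes both parts
    of the quality check.

    Measuring size as pixel area and rejecting degenerate segments are
    assumptions of this sketch.
    """
    rows, cols = len(image), len(image[0])

    # Part 2: object size within the known bounds.
    area = sum(1 for r in range(rows) for c in range(cols) if mask[r][c])
    if not (min_area <= area <= max_area):
        return False

    # Part 1: boundary contrast must exceed interior contrast on average.
    inside, across = [], []
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((0, 1), (1, 0)):  # visit each neighbour pair once
                rr, cc = r + dr, c + dc
                if rr >= rows or cc >= cols:
                    continue
                diff = abs(image[r][c] - image[rr][cc])
                if mask[r][c] and mask[rr][cc]:
                    inside.append(diff)
                elif mask[r][c] != mask[rr][cc]:
                    across.append(diff)
    if not inside or not across:
        return False
    return sum(across) / len(across) > sum(inside) / len(inside)
```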
Our search algorithm is a very simple iterative search. We search for bias in the range [0, 0.8] using a step size of 0.1. We are looking for the smallest value of bias in that range such that the object passes the quality check described above. That is, we start with bias = 0 and increment it in steps of 0.1 until the object segment passes the quality check. Occasionally, there is no bias value in the allowed range for which the object passes the quality check. In this rare case, we perform segmentation with the smallest value of bias which results in an object segment passing at least part of our quality check, namely that the object size is within the minimum–maximum allowed bounds.
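The search over bias, including the fallback to the size-only check, can be sketched as follows. The callables `segment`, `quality_ok` and `size_ok` are hypothetical placeholders for the graph cut run at a given bias, the full quality check, and the size-bounds check; returning `None` when even the size check never passes is an assumption, since the text does not cover that case.

```python
def select_bias(segment, quality_ok, size_ok, step=0.1, max_bias=0.8):
    """Return (bias, segmentation) for the smallest bias in [0, max_bias]
    whose segmentation passes the full quality check.

    If no value passes, fall back to the smallest bias whose segmentation
    passes the size check alone; return None if there is none (assumption).
    """
    n_steps = int(round(max_bias / step))
    fallback = None
    for k in range(n_steps + 1):
        bias = round(k * step, 10)  # avoid drift such as 0.30000000000000004
        result = segment(bias)
        if quality_ok(result):
            return bias, result
        if fallback is None and size_ok(result):
            fallback = (bias, result)
    return fallback
```

With the default arguments at most nine graph cuts are computed for the initial segment, which is consistent with the small, fixed search range given in the text.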
Observe that when changing the value of bias we are changing the values of the boundary terms. This alters the relative importance of the regional and the boundary terms in the energy function (1). Recall that the parameter $\alpha$ in Eq. (1) also weights the relative importance of the regional and the boundary terms. We found that it is enough to search over the bias parameter while keeping $\alpha$ fixed. We chose a fixed value of $\alpha$ which works well for all the images.
3.5. Regularity of the energy function
In this section, we prove that our energy function can be minimized globally and exactly with a graph cut. We rewrite the energy function in Eq. (1) with the boundary and the regional terms defined in Eq. (3) and Eq. (2), respectively:
$$E(S) = \sum_{p} R_p(S_p) + \sum_{\{p,q\} \in N,\; p<q} B_{pq}(S_p, S_q),$$
where $E(S)$ is a function of $|P|$ binary variables, which is a sum of functions of up to 2 binary variables. This energy function can be minimized using a graph cut if it is submodular, that is, if the following property is satisfied [19]:
$$B_{pq}(0,0) + B_{pq}(1,1) \le B_{pq}(1,0) + B_{pq}(0,1). \qquad (8)$$
According to the definition of $B_{pq}(S_p, S_q)$, the left-hand side of Eq. (8) is always 0, and the right-hand side is always non-negative, even when $w_{pq}$ is negative, since $K$ is chosen to be very large. Thus $E(S)$ is submodular.
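Inequality (8) can be checked numerically for a single pair of neighboring pixels, as in the sketch below. The `prohibited` argument is a hypothetical stand-in for the prohibited assignments of that pair; which assignments are actually prohibited is determined by the definition in Section 3.1 and is not reproduced here.

```python
def pairwise_cost(s_p, s_q, w_pq, big_k, prohibited):
    """B_pq(S_p, S_q) following Eq. (6).

    `prohibited` is a set of (S_p, S_q) label pairs for this pixel pair;
    it is an input assumed by this sketch.
    """
    if s_p == s_q:
        return 0.0
    return big_k if (s_p, s_q) in prohibited else w_pq


def is_submodular(w_pq, big_k, prohibited):
    """Check inequality (8) for one pair of neighbouring pixels."""
    lhs = (pairwise_cost(0, 0, w_pq, big_k, prohibited)
           + pairwise_cost(1, 1, w_pq, big_k, prohibited))
    rhs = (pairwise_cost(1, 0, w_pq, big_k, prohibited)
           + pairwise_cost(0, 1, w_pq, big_k, prohibited))
    return lhs <= rhs
```

The check makes the role of $K$ explicit: with a negative $w_{pq}$ the inequality holds for a pair when one of its two mixed assignments costs $K$, and in the sketch it fails if the hypothetical `prohibited` set is empty.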
3.6. More general shapes
We started with the assumption that the object to be segmented has to be of compact shape. However, we can somewhat relax this assumption, making it possible to segment objects of shapes more general than compact. Suppose the object can be divided either vertically or horizontally into several approximately collinear adjacent pieces, where each piece is of compact shape. If we apply our algorithm above to such an object, we obtain an initial segment of compact shape around the user-entered seed, but either the vertical or the horizontal boundaries of this initial segment do not align with the object boundaries and therefore do not lie on strong intensity edges. We check whether all the edges of the current segment satisfy the criteria for being a "strong edge". In our application, we require 85% of the pixels lying on that edge to have an intensity difference greater than the standard deviation inside the object. For this test, we use the pixels of the boundary which lie inside the object (as opposed to those lying on the outside of the object boundary). Other criteria can also be used, of course. If an edge does not pass the "strong edge" test, a new seed point is chosen at the center of the weak edge but slightly inside the current segment (to be precise, two pixels inside). The last part emphasizes our assumption that the object can be divided into approximately collinear compact pieces. Then the graph cut is run again in the same way as already described in this section, except that we reuse the value of the bias parameter estimated at the previous step, since we found that there is no need to re-estimate it. Experiments show that the value of the bias parameter, if re-estimated for each extension piece, is so close to the value of bias inside the first piece that almost no difference in segmentation results is observed. Thus, by repeatedly finding a new seed and running the graph cut algorithm, it is possible to segment the whole object accurately. Fig. 3 illustrates the above process. The white circles show the original seed selected by the user, and the white squares show the automatically selected extension seeds.
Fig. 3. On the left we show the original image and on the right we illustrate the extension to more general shapes. The white circle marks the seed selected by the user, and the white squares show the automatically selected seeds for extension.
This approach is especially helpful for our semiautomatic segmentation, since information about the exact size of the object is not provided. We can segment the objects in smaller pieces, saving computational time. In addition, we can segment thin and long objects, which would otherwise be impossible using the basic graph cut segmentation algorithm of Boykov and Jolly [5] due to its bias towards shorter boundaries.
The basic step of our algorithm is to segment a piece of the object which is compact in relation to the user-provided or automatically detected seed. The exact seed placement determines the size of the first piece. If the seed is placed close to the object boundary, then the compact piece segmented may be of smaller size compared to the compact segment produced if the seed was placed closer to the center.
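The "strong edge" test and the placement of an extension seed described in this section can be sketched as follows. Several details are assumptions made for illustration: the intensity difference is taken between each boundary pixel inside the object and its neighbor just outside, the population standard deviation is used, and "85% of the pixels" is read as "at least 85%". The function names are hypothetical.

```python
import statistics


def is_strong_edge(inner, outer, object_pixels, fraction=0.85):
    """Decide whether one edge of the current segment is a strong edge.

    inner[i] is the intensity of the i-th boundary pixel inside the object
    and outer[i] that of its neighbour just outside; this pairing is an
    assumption of the sketch. object_pixels holds intensities inside the
    object, used for the standard deviation threshold.
    """
    threshold = statistics.pstdev(object_pixels)
    strong = sum(1 for a, b in zip(inner, outer) if abs(a - b) > threshold)
    return strong >= fraction * len(inner)


def extension_seed(edge_pixels, inward, depth=2):
    """Seed at the centre of a weak edge, `depth` pixels inside the segment.

    edge_pixels is the list of (row, col) positions along the edge and
    inward is the unit step (d_row, d_col) pointing into the segment.
    """
    r, c = edge_pixels[len(edge_pixels) // 2]
    return r + depth * inward[0], c + depth * inward[1]
```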
3.7. Our algorithm
We now summarize our algorithm as follows. We assume that the objects are compact in shape or can be divided either vertically or horizontally into compact, roughly collinear parts. We build a graph of size greater than the maximum possible size of the object around the seed. Once the initial segment is obtained, it is likely to contain only a portion of the object that agrees with the compact shape. Then the boundary of the segment is checked to find weak edges, if any. The initial segment is extended in smaller pieces along the direction of the weak edges detected, as described above. Thus, by iteratively running the graph cut, we can segment the whole object regardless of its length.
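The summary above can be read as a simple loop. The sketch below shows only the control flow; `segment_piece` (one graph cut around a seed, returning a set of pixels) and `weak_edge_seeds` (the extension seeds for the weak edges of the current segment) are hypothetical callables, and the `max_rounds` guard is an assumption added to guarantee termination.

```python
def segment_object(user_seed, segment_piece, weak_edge_seeds, max_rounds=1000):
    """Grow a segmentation piece by piece from a single user seed.

    segment_piece(seed) -> set of pixels of the compact piece around seed.
    weak_edge_seeds(segment) -> list of new seeds, one per weak edge.
    Both callables are placeholders assumed by this sketch.
    """
    segment = set(segment_piece(user_seed))
    used = {user_seed}
    for _ in range(max_rounds):
        new_seeds = [s for s in weak_edge_seeds(segment) if s not in used]
        if not new_seeds:
            break  # every edge is strong: the whole object is covered
        for seed in new_seeds:
            used.add(seed)
            segment |= set(segment_piece(seed))
    return segment
```

A toy usage on a one-row "object" of ten pixels, where each piece covers the seed and two pixels on either side:

```python
obj = {(0, c) for c in range(10)}


def piece(seed):
    return {(0, c) for c in range(seed[1] - 2, seed[1] + 3)} & obj


def weak(seg):
    cols = [c for _, c in seg]
    seeds = []
    if min(cols) > 0:
        seeds.append((0, min(cols)))
    if max(cols) < 9:
        seeds.append((0, max(cols)))
    return seeds


whole = segment_object((0, 4), piece, weak)
```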
Notice that our piecewise segmentation approach may seem similar to region growing methods. However, this similarity is superficial. In region growing, neighboring regions are repeatedly merged, based on some criterion of region similarity. In our approach, the best new region to add to the current segmentation is found by optimization, namely we choose the best region to add out of combinatorially many possible regions.
4. Results
We explored the challenging industrial problem of transistor gate segmentation in the images of integrated chips. It is an important preliminary step for performing intellectual property protection and competitive intelligence analysis in the integrated circuitry domain. To obtain the images, the integrated circuit is de-layered and SEM micro-photographed. The images of the upper layers of the chip, which contain the metal wiring, are typically of high quality and can be segmented by automated means. The lower levels contain the dopant, the silicon implementation of the transistors. The images of these layers are typically of low quality and can have substantial variation in brightness and contrast. They occasionally contain artifacts due to the remains of the upper layer left during delayering. Two of the most important parameters in integrated chip circuitry are the length and the width of the transistor gates. They determine the circuitry power characteristics and are crucial for proper modeling and understanding of its functionality, which is essential for determining if the functionality is replicating a patented design. In order to obtain these measurements, accurate segmentation of the transistor gates is essential. Prior to the development of the application described in this paper, the measurements were taken manually by a human operator. This was done by selecting the gate in the image using a computer application, which also involved time-consuming operations like zooming and panning across the image. Attempts to use off-the-shelf segmentation algorithms, such as magic wand or local thresholding, were unsuccessful.
Fig. 4 shows some of the images of integrated chips provided by Semiconductor Insight Inc., which are representative of the images used regularly. The images are in gray scale with 256 intensity values, where 0 represents black and 255 represents white. The transistor gates appear roughly rectangular in shape, and therefore can be well approximated with a compact shape. Notice the large variation in the noise level across the images. Hence accurate estimation of the parameter $\sigma$ in the boundary term of the energy function is a crucial part of our work in order to accommodate the variation. From Fig. 4, it is also evident that other challenges are the large variation in contrast and intensity range of the transistor gates. Another challenge is the wide variability in the transistor gate sizes, which range in length from 10 to a few thousand pixels.
Fig. 4. Sample of the images provided by Semiconductor Insight Inc., showing the variation in image contrast, noise characteristics and the size of the object. Different scales are used for displaying. In each of these images, the transistor gate has horizontal orientation and is at the center of the image. Fig. 6 clearly delineates the gates for each of these images in gray.
For all the experiments in this section, the parameters were set to the following values: $\alpha = 0.007$, $\gamma = 0.4$. The minimum size for the object was set to 3 by 3 pixels, and the largest size for the object was set to 130 by 130 pixels. Parameter bias is chosen automatically, as discussed in Section 3.4. As also discussed in Section 3.4, the parameter $\sigma$ is computed as the average intensity difference between two adjacent pixels, using the data collected around the seed and the knowledge of the minimum possible size of the object.
We first compare our results with the results of the algorithm of Boykov and Jolly [5]. Fig. 5(a) shows the result of the algorithm of Boykov and Jolly [5] using only the boundary term. Only a small part of the transistor gate is segmented, due to the bias to small segments if the regional term is small or completely absent. On including the intensity model for describing the regional property of the segment, the algorithm of Boykov and Jolly [5] produces multiple segments with very complex boundaries, most of them being false alarms, as shown in Fig. 5(b). This happens because of considerable overlap of the background and object intensity distributions. After we estimate an appropriate value for bias, that is, a value which results in an initial segment passing our "quality check", we get the part of the transistor gate shown in Fig. 5(c). The results in Fig. 5(c) correspond to an acceptable initial segment, which is then extended iteratively to the whole object as shown in Fig. 5(d). Every time a segment is extended, a new seed point, marked with a white square in Fig. 5(d), is located.
Fig. 5. In (a and b), we show segmentation results obtained using the algorithm of Boykov and Jolly [5]: (a) with the boundary term only; (b) with intensity model as regional term along with the boundary term. In (c and d), we show segmentation results obtained with our algorithm: (c) initial segment obtained with an automatically determined value of bias > 0; (d) final segmentation obtained by extending the initial segment obtained in (c). The seeds are marked with white squares, and the initial user-entered seed is labeled. The large dotted square shows the maximum allowed segment size. The gray color shows the segmented object. Please note that in (b), the gray color which indicates the object is perceived to be much darker than the gray color in (a, c, and d).
Fig. 6. Segmentation results obtained using our algorithm on the images in Fig. 4. The segmented transistor gate is shown in gray.
Fig. 8. (a) Original image of the transistor along with the user-marked object seed. (b) The true boundary of the transistor is too weak, so the segmentation boundary sticks to the stronger but wrong boundary.
Fig. 6 shows the segmentation results obtained using our algorithm on the images in Fig. 4. The user-provided seed pixel is marked with a white circle and the automatically detected extension seeds are marked with white squares. Despite large variation in size, intensity distribution, noise type, shape and contrast, each transistor gate is accurately segmented.
To evaluate the performance of our system, we considered 10 images, each containing dozens of transistor gates, and segmented 100 transistor gates chosen at random. In 91 cases the transistor gates were segmented accurately, giving an overall accuracy of 91%. In 6 cases the initial segments were segmented accurately but the extension failed. In 3 cases the segmentation boundaries aligned with the wrong but stronger intensity edges. Figs. 7 and 8 show some failure cases.
Fig. 9 shows more results which illustrate the challenges our system can deal with. In the first two rows, the transistor gate is very narrow, and in addition, in the first row its shape is far from compact. In the third row, the transistor gate includes large white circular spots and the seed is placed far from the center. In the last row, the transistor gate has a noticeable artifact (on the left) which is inconsistent with the overall intensity histogram and creates strong intensity edges inside the gate. In all these cases, our segmentation system gave an accurate segmentation.
Fig. 10 shows the results of segmentation of a long thin transistor gate which has nearly horizontal orientation but is rotated several degrees, and therefore is not compact. Our system has no problem extracting it in several compact pieces. In (b) and (c), we show the results under very different seed placements. Results are essentially identical, which shows the insensitivity of our system to the exact seed placement.
The application is implemented using C++ on a P4 2.8 GHz computer. The time varies with the size of the object. For small gates, it only takes a fraction of a second. Larger gates, such as those of size 120 × 3000 pixels, are segmented in less than 2 s. It is currently being used by Semiconductor Insight Inc. to upgrade their existing manual segmentation system to a semiautomatic one.
Fig. 7. (a) Original image of the transistor. (b) The transistor gate is segmented and is 4 pixels wide. The extension fails when the segmentation boundary stops at the wrong edge in the image.
We also applied our algorithm to segment objects in other types of images. Note that we chose to segment objects which are well approximated with several nearly collinear pieces of compact shape. The results are shown in Fig. 11. The tool in Fig. 11(a) is very far from a compact shape, but we were able to extract it by extending it in compact pieces. The roof in Fig. 11(b) is also not compact and was extracted in several pieces from a complex background.
We also applied our algorithm to segment the eye sockets in a 2D slice of an MR brain image, which is required as a first step in the process of cortex segmentation. The eye sockets are elliptical in shape and follow the convex shape assumption. Fig. 12(a) shows the eye socket segmented with our semiautomatic algorithm. Fig. 12(b) shows the result obtained with the basic graph cut algorithm, which requires more interaction, yet is unable to segment the whole sockets.
Fig. 9. The left column shows the original images, the right column shows the segmentation results. We used black color in the first three rows and white color in the last row to outline the segment boundary. A white dot outlined in black is used to show the seed provided by the user.
Fig. 10. (a) Shows the original image; (b and c) show the result of segmentation with different seed placement. The results of segmentation are outlined in black.
5. Discussion
In this paper, we presented a semiautomatic segmentation algorithm developed by modifying the basic graph cut segmentation algorithm of Boykov and Jolly [5]. We showed how problem-specific assumptions and constraints can be utilized to reduce the user interaction and also the complexity of the problem. The main contribution of our work is the introduction of the compact shape prior into the graph cut segmentation, which adds robustness to the algorithm. An additional benefit of using the compact shape prior is that we are able to introduce a parameter bias into the framework. This parameter biases the graph cut algorithm to segment objects with longer boundaries. We also showed how an application-specific "quality check" for segmentation can be used to automatically select the appropriate parameters in graph cut segmentation.
Fig. 11. (a) The tool has been segmented well in spite of the variation of width and shading within the object. (b) The roof part of the building has been segmented as several pieces of compact shape.
Fig. 12. (a) Segmentation of the eye socket using our algorithm. The seeds are marked as the white pixels inside the eye socket. Note that one eye socket is segmented at a time. (b) Segmentation of the eye socket using the basic graph cut algorithm. The object seeds are shown as white boxes, each 5 × 5 pixels, and the background pixels are shown as gray boxes outlined with white, each 7 × 7 pixels.
A weakness of the compact shape prior is that it is not rotationally invariant, as it is defined with respect to the vertical and horizontal axes. If an object is compact with respect to a rotated set of axes, it can still be segmented in compact pieces as long as the rotation is not too large; in experiments we have found that we can tolerate rotations of about 6–7 degrees.
Acknowledgements
We thank Stephen Begg and Dale Carlson of Semiconductor Insights for developing the interactive application incorporating the method.
References
[1] A. Agarwala, M. Dontcheva, M. Agrawala, S. Drucker, A. Colburn, B. Curless, D. Salesin, M. Cohen, Interactive digital photomontage, in: ACM Transactions on Graphics, vol. 23, 2004, pp. 294–302.
[2] A. Amini, T. Weymouth, R. Jain, Using dynamic programming for solving variational problems in vision 12 (9) (1990) 855–867, September.
[3] A. Blake, C. Rother, M. Brown, P. Perez, P. Torr, Interactive image segmentation using an adaptive GMMRF model, in: European Conference on Computer Vision, vol. I, 2004, pp. 428–441.
[4] Y. Boykov, G. Funka Lea, Graph cuts and efficient n-d image segmentation, International Journal of Computer Vision 69 (2) (2006) 109–131, September.
[5] Y. Boykov, M.P. Jolly, Interactive graph cuts for optimal boundary and region segmentation, in: International Conference on Computer Vision (ICCV), vol. I, 2001, pp. 105–112.
[6] Y. Boykov, V. Kolmogorov, An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision, IEEE Transaction on PAMI 26 (9) (2004) 1124–1137, September.
[7] L. Cohen, On active contour models and balloons, Computer Vision, Graphics and Image Processing 53 (2) (1991) 211–218.
[8] L.D. Cohen, R. Kimmel, Global minimum for active contour models: a minimal path approach, in: IEEE Conference on Computer Vision and Pattern Recognition, 1996, pp. 666–673.
[9] D. Cremers, Nonlinear dynamical shape priors for level set segmentation, in: IEEE Conference on Computer Vision and Pattern Recognition, 2007, pp. 1–7.
[10] D. Cremers, S. Osher, S. Soatto, Kernel density estimation and intrinsic alignment for shape priors in level set segmentation, International Journal of Computer Vision 69 (3) (2006) 335–351, September.
[11] A.X. Falaco, J. Udupa, S. Samarasekara, S. Sharma, User-steered image segmentation paradigms: live wire and live lane, in: Graphical Models and Image Processing, vol. 60, 1998, pp. 233–260.
[12] P. Felzenszwalb, D. Huttenlocher, Efficient graph-based image segmentation, International Journal of Computer Vision 59 (2) (2004) 167–181, September.
[13] L. Ford, D. Fulkerson, Flows in Networks, Princeton University Press, 1962.
[14] D. Freedman, T. Zhang, Interactive graph cut based segmentation with shape priors, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. I, 2005, pp. 755–762.
[15] Gonzalez, Woods, Digital Image Processing, second ed., Prentice Hall, Berlin, Heidelberg, New York, 1996.
[16] L. Grady, E.L. Schwartz, Isoperimetric graph partitioning for image segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence 28 (3) (2006) 469–475, March.
[17] I. Jermyn, H. Ishikawa, Globally optimal regions and boundaries as minimum ratio weight cycles, IEEE Transaction on PAMI 23 (10) (2001) 1075–1088, October.
[18] M. Kass, A. Witkin, D. Terzopoulos, Snakes: active contour models, International Journal of Computer Vision 2 (1998) 321–331.
[19] V. Kolmogorov, R. Zabih, What energy function can be minimized via graph cuts? IEEE Transaction on PAMI 26 (2) (2004) 147–159, February.
[20] M. Kumar, P. Torr, A. Zisserman, Obj cut, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. I, 2005, pp. 18–25.
[21] S. Lee, J. Seo, Level set-based bimodal segmentation with stationary global minimum 15 (9) (2006) 2843–2852, August.
[22] M. Leventon, W. Grimson, O. Faugeras, Statistical shape influence in geodesic active contours, in: IEEE Conference on Computer Vision and Pattern Recognition, vol. I, 2000, pp. 316–323.
[23] Y. Li, J. Sun, C. Tang, H. Shum, Lazy snapping, ACM Transactions on Graphics 23 (3) (2004) 309–314.
[24] E.N. Mortensen, W.A. Barrett, Interactive segmentation with intelligent scissors, in: Graphical Models and Image Processing (GMIP), vol. 60, 1998, pp. 349–384.
[25] S. Osher, J. Sethian, Fronts propagating with curvature dependent speed: algorithm based on Hamilton–Jacobi formulations, Journal of Computational Physics 79 (1988) 12–49.
[26] M. Rousson, N. Paragios, Shape priors for level set representations, in: European Conference on Computer Vision, vol. II, 2002, p. 78 ff.
[27] J. Shi, J. Malik, Normalized cuts and image segmentation, in: IEEE Conference on Computer Vision (ICCV), 1997, pp. 731–737.
[28] G. Slabaugh, G. Unal, Graph cuts segmentation using an elliptical shape prior, in: International Conference on Image Processing (ICIP), vol. II, 2005, pp. 1222–1225.
[29] A. Tsai, A. Yezzi Jr., W. Wells III, C. Tempany, D. Tucker, A. Fan, W. Grimson, A. Willsky, Model-based curve evolution technique for image segmentation, in: IEEE Conference on Computer Vision and Pattern Recognition, vol. I, 2001, pp. 463–468.
[30] O. Veksler, Image segmentation by nested cuts, in: IEEE Conference on Computer Vision and Pattern Recognition, vol. I, 2000, pp. 339–344.
[31] O. Veksler, Stereo correspondence with compact windows via minimum ratio cycle, IEEE Transaction on PAMI 24 (12) (2001) 1654–1660, December.
[32] S. Wang, T. Kubota, J. Siskind, J. Wang, Salient closed boundary extraction with ratio contour, IEEE Transaction on PAMI 27 (4) (2005) 546–561, April.
[33] Z. Wu, R. Leahy, An optimal graph theoretic approach to data clustering: theory and its application to image segmentation, IEEE Transaction on PAMI 15 (11) (1993) 1101–1113, November.