
An edges-based attribute filtering method dedicated to image segmentation.

Edwin Carlinet

Technical Report no. 1005, June 2010, revision 2195

We propose a new edge-based method dedicated to image segmentation. The last few years have seen the development of image processing with connected filters. Indeed, the contour-preserving properties of connected operators are highly desirable for image segmentation. Segmentation methods based on connected operators usually proceed in two steps: they first compute an attribute on the connected components, then filter out the components whose attribute does not satisfy a criterion. In these methods, attributes are computed on the pixels of the connected components. We propose a new union-find based algorithm that enables the evaluation of attributes on the edges of connected components, so that an attribute can be computed on their contours. We therefore introduce edge-based attributes and propose some that evaluate the energy of a connected component. Finally, we perform edge-based attribute filtering to produce a new image segmentation method.

We propose a new method, based on the use of contours, dedicated to image segmentation. Recent years have been marked by the development of image processing techniques that use connected filters, which preserve object contours and thus become powerful tools for segmentation. Segmentation methods based on connected operators generally proceed in two steps: they compute an attribute on the connected components, then filter out those that do not satisfy a criterion. We propose a new union-find based algorithm that makes it possible to compute an attribute on the contours of connected components. We thus introduce contour-based attributes and propose some that evaluate the energy of a connected component. Finally, we conclude with contour-based filtering to produce a new segmentation method.

Keywords: segmentation, energy, union-find, component tree, connected components, contours, parallel programming

Laboratoire de Recherche et Développement de l’Epita
14-16, rue Voltaire – F-94276 Le Kremlin-Bicêtre cedex – France

Tél. +33 1 53 14 59 47 – Fax. +33 1 53 14 59 [email protected] – http://www.lrde.epita.fr/

[email protected]


Copying this document

Copyright © 2010 LRDE.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with the Invariant Sections being just “Copying this document”, no Front-Cover Texts, and no Back-Cover Texts.

A copy of the license is provided in the file COPYING.DOC.

Preface

The LRDE develops its projects under two constraints: genericity and performance. OLENA, its image processing framework, is no exception to the rule: the LRDE imaging team keeps developing the most innovative and most efficient solutions for its partners. One of the most important projects led by the OLENA team deals with document segmentation, which is discussed at length in this report.

In its way of thinking about and solving problems, the team usually takes advantage of mathematical morphology. MILENA, the image processing framework, is proof of this involvement, since it is one of the most complete libraries dealing with mathematical morphology. In recent years, this field has expanded very quickly, especially with the introduction of connected filters. Most segmentation methods built on this theory have a strong background and have shown interesting results. We therefore chose to follow this path and develop our own segmentation method using the fundamentals of mathematical morphology. The first part of this report is dedicated to this method.

The team has then worked on making this project efficient. This requirement is not only a constraint set by the LRDE: since algorithms have to deal with larger and larger images, performance has become a fundamental need in image processing. As a consequence, the whole second part is dedicated to optimizing the algorithms used by our method and to a complete comparison with other existing algorithms.

Acknowledgment

Thanks to my supervisor Thierry Geraud for his help throughout this year, and to Roland Levillain for his advice on parallelism.

Contents

1 An edges-based attribute filtering method dedicated to image segmentation.
    1.1 Introduction
    1.2 Edge-attribute computation
        1.2.1 Union-find algorithm
        1.2.2 Edge-oriented union-find algorithm
    1.3 Using contours information
        1.3.1 Gradient computation
        1.3.2 Mean of contour gradient
        1.3.3 Selection of objects
    1.4 Results
    1.5 Conclusion

2 Comparison of max tree algorithms
    2.1 Salembier’s algorithm
        2.1.1 Hierarchical pixel queues algorithm
        2.1.2 Analysis and results
    2.2 Union-find based algorithms
        2.2.1 Union-find
        2.2.2 Union by rank
        2.2.3 Analysis and results
    2.3 Parallel max-tree using Intel TBB’s
        2.3.1 Motivation
        2.3.2 Parallel algorithm
        2.3.3 Analysis and results
    2.4 Conclusion

3 Bibliography

Chapter 1

An edges-based attribute filtering method dedicated to image segmentation.

1.1 Introduction

In recent years, the need for efficient segmentation techniques has become more and more important. Many fields of computer science dealing with image processing use segmentation as a basis of their methods: image and video indexing, pattern recognition, data compression, digitization of paper-based documents... Moreover, there are almost as many segmentation approaches as application fields. Some of them are based on regions (clustering (Cutting et al., 1992), region growing, watershed (Beucher, 1992), connected operators (Salembier et al., 1998)...), others are based on edge processing, and some mix both. Mumford and Shah (1989) proposed a segmentation method based on an optimization problem in which the image energy has to be minimized under a complexity constraint. The energy is computed on regions, whereas the complexity constraint generally depends on edges. Recently, with the development of connected operators, many methods have used connected component filtering as the basis of a robust segmentation. Indeed, connected filters are edge-preserving operators, which makes them a good approach for image partitioning.

In this chapter, we present a segmentation method combining the connected component and boundary-based approaches. We thus take advantage of the edge-preserving properties of connected filters while performing the computation on the contour, which is where the object information actually lies.

1.2 Edge-attribute computation

Connected filters are edge-preserving operators that keep or remove connected components depending on whether they satisfy a predicate. Area openings and closings are the best-known connected operators; they remove objects smaller or bigger than a certain size. An easy way to perform attribute filtering is to use a component tree (Vincent, 1993). Component trees are structures that organize connected components as a hierarchy in which set inclusion is represented by the parent relationship of the tree. Two classes of algorithms exist to construct such trees: the first are based on queues (Salembier and Serra, 1995; Vincent, 1993), the others on Tarjan’s union-find (Berger et al., 2007; Wilkinson and Roerdink, 2000). The algorithm we propose belongs to the second class.

1.2.1 Union-find algorithm

The union-find algorithm provides a set of operations for handling disjoint sets. By keeping track of each set, it makes it possible to merge those that are considered equivalent. Since connected components are by definition disjoint sets of pixels, union-find can be used to perform algebraic openings and closings. The standard union-find algorithm proceeds as follows: it first sorts the pixels in decreasing order of gray level, builds as many singletons as there are pixels, and then merges disjoint sets to build up a hierarchy that forms a component tree. The three main operations are therefore:

MakeSet(x): Make a singleton {x}

FindRoot(x): Find the root of the set that contains x

Union(x,y): Merge the two disjoint sets that contain x and y.

We are not concerned with implementation details in this section; the underlying structure for handling disjoint sets is presented later in this report. A generic canvas for algorithms using union-find can be written as follows:

    for all pixels x in decreasing order of gray levels do
        MakeSet(x)
        {Attribute initialization}
        for all n ∈ N(x) such that n has already been processed do
            q ← FindRoot(n)
            if q ≠ x then
                Union(q, x)
                {Attribute merging}
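The canvas above can be sketched in Python as follows. This is only a minimal illustration, not the MILENA implementation: the image is assumed to be a list of rows, 4-connectivity is assumed for N(x), the attribute is the component area (chosen here because it is the simplest one to merge), and the names `neighbors4`, `find_root` and `area_attribute` are hypothetical.

```python
def neighbors4(p):
    """4-connected neighbors of pixel p = (row, col); some may lie outside the image."""
    r, c = p
    return [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]


def find_root(zpar, x):
    """FindRoot(x): follow the links up to the root, then compress the path."""
    root = x
    while zpar[root] != root:
        root = zpar[root]
    while zpar[x] != root:
        nxt = zpar[x]
        zpar[x] = root
        x = nxt
    return root


def area_attribute(image):
    """Run the generic canvas with an 'area' attribute (assumed example attribute)."""
    f = {(r, c): v for r, row in enumerate(image) for c, v in enumerate(row)}
    zpar, area = {}, {}
    # Pixels in decreasing order of gray level (stable sort, raster order on ties).
    for x in sorted(f, key=lambda p: -f[p]):
        zpar[x] = x          # MakeSet(x)
        area[x] = 1          # attribute initialization
        for n in neighbors4(x):
            if n in zpar:    # n has already been processed (and is inside the image)
                q = find_root(zpar, n)
                if q != x:
                    zpar[q] = x          # Union(q, x): x becomes the root
                    area[x] += area[q]   # attribute merging
    return area
```

On the 3×3 image `[[2, 3, 2], [2, 2, 3], [1, 3, 1]]`, the value stored at the last processed pixel is the area of the whole image, 9.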

1.2.2 Edge-oriented union-find algorithm

Before going into details, we first explain how to take edges into account in the union-find algorithm. Let f denote the original image and D_f its domain. We first introduce the mapping active-edges : D_f^2 → {True, False}, which records, for each edge e = (x, y) (x and y being pixels of f), whether it belongs to the contour of any set. Then, let us introduce the concept of an edge-attribute attr, for which the following operations are defined [1]:

InitAttr(attr): Initialize the attribute.

MergeAttr(attr1, attr2): Merge two attributes in the same way that two disjoint sets are merged.

TakeAttr(attr, e): Consider the edge e in the attribute valuation.

UntakeAttr(attr, e): No longer consider the edge e in the attribute valuation.

Let attribute be an image of attributes as large as the original image. The algorithm proceeds as follows. When the singleton set {x} is built, there may or may not be active edges around this pixel. Since we do not know yet how x is going to contribute to the contour of its connected component, InitAttr(attribute(x)) is often a no-op. When two disjoint sets are merged, their attributes are merged as well. In the algorithm, disjoint sets always connect via the pixel x, so that in a last step we can deactivate the boundary edges that have been removed by the merge. Algorithm 1 shows how to compute an attribute on the edges of connected components.

[1] From a programming point of view, an edge-attribute typically fits the accumulator design pattern.

Figure 1.1: Edge-oriented union-find. (a) Left: image gray levels; right: pixel labels.

    2 3 2      A B C
    2 2 3      D E F
    1 3 1      H I J

For each level λ = 3, 2, 1, the remaining panels show the pixels in process (B, F, I at λ = 3; A, C, D, E at λ = 2; H, J at λ = 1), the current max tree and the active edges after the process.

Algorithm 1 Edge-attribute computation algorithm
    for all edges e do
        active-edges(e) ← False
    for all pixels x in decreasing order of gray levels do
        MakeSet(x)
        InitAttr(attribute(x))
        for all n ∈ N(x) such that n has already been processed do
            q ← FindRoot(n)
            if q ≠ x then
                Union(q, x)
                MergeAttr(attribute(q), attribute(x))
        for all edges e of x do
            if not active-edges(e) then
                active-edges(e) ← True
                TakeAttr(attribute(x), e)
            else
                active-edges(e) ← False
                UntakeAttr(attribute(x), e)
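A possible Python reading of Algorithm 1 is sketched below, under several assumptions that are not in the report: the edge-attribute is simply the contour length (number of active edges), N(x) is the 4-neighborhood, the four edges of a pixel include those leading outside the image domain, and active-edges is stored as a set of active edges. The function name `contour_lengths` is hypothetical.

```python
def contour_lengths(image):
    """Edge-oriented union-find with 'contour length' as the edge-attribute (assumed)."""
    f = {(r, c): v for r, row in enumerate(image) for c, v in enumerate(row)}
    zpar = {}        # disjoint-set links; a pixel is in zpar once processed
    attribute = {}   # attribute(x): contour length of the set rooted at x
    active = set()   # active-edges(e) is True if and only if e is in this set

    def find_root(x):
        while zpar[x] != x:
            zpar[x] = zpar[zpar[x]]   # path halving
            x = zpar[x]
        return x

    for x in sorted(f, key=lambda p: -f[p]):   # decreasing gray levels
        r, c = x
        around = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
        zpar[x] = x                 # MakeSet(x)
        attribute[x] = 0            # InitAttr: nothing known about the contour yet
        for n in around:
            if n in zpar:           # n already processed
                q = find_root(n)
                if q != x:
                    zpar[q] = x                    # Union(q, x)
                    attribute[x] += attribute[q]   # MergeAttr
        for n in around:
            e = tuple(sorted((x, n)))   # undirected edge between x and n
            if e not in active:
                active.add(e)
                attribute[x] += 1       # TakeAttr: e joins the contour
            else:
                active.remove(e)
                attribute[x] -= 1       # UntakeAttr: e became interior
    return attribute
```

Each edge is visited at most twice, once per end point, which is why a single Boolean per edge is enough. On the image of Figure 1.1, the isolated maximum B gets a contour length of 4, and the last processed pixel gets 12, the perimeter of the 3×3 domain.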

1.3 Using contours information

In the previous section, we have seen an algorithm that computes an attribute on the boundaries of connected components. In this section, we present simple attributes that use contour information to detect objects and perform an image segmentation.

1.3.1 Gradient computation

Like other boundary-based segmentation methods, we postulate that pixel values change rapidly at the boundary between two regions. In other words, contours exhibit a high gradient value, a property used by many contour detection filters (the Sobel filter and others). As a consequence, we compute on each edge e = (x, y) the gradient value between x and y. The gradient we use is non-oriented: it is the 1-norm distance between the gray values of the two pixels. We finally get the mapping:

$$\nabla(e) = |f(x) - f(y)| \qquad (1.1)$$

1.3.2 Mean of contour gradient

Our method relies on a second assumption: each object is represented by a single node in the component tree. Thus, we compute for each node the mean m_g of its contour gradient. Let Γ be the peak component (a connected component obtained by thresholding the image) associated with a node, and Γ_c the set of edges that form its contour; then:

$$m_g(\Gamma) = \frac{1}{|\Gamma_c|} \sum_{e \in \Gamma_c} \nabla(e) \qquad (1.2)$$

A high value of m_g means that the peak component is an object (or a set of objects) clearly distinct from the background; conversely, a low value of m_g means that the object has started merging with components unrelated to it.
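As an illustration of equations (1.1) and (1.2), the sketch below computes m_g directly for a component given as a set of pixels, without the union-find machinery. It assumes 4-connectivity and ignores edges that lead outside the image domain, since f is undefined there; both choices, and the name `mean_contour_gradient`, are assumptions made for this example.

```python
def mean_contour_gradient(image, component):
    """Mean of |f(x) - f(y)| over the contour edges (x, y) of a pixel set.

    A contour edge links a pixel x of the component to a 4-neighbor y that is
    inside the image but outside the component (assumption of this sketch).
    """
    height, width = len(image), len(image[0])
    gradients = []
    for (r, c) in component:
        for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= rr < height and 0 <= cc < width
            if inside and (rr, cc) not in component:
                gradients.append(abs(image[r][c] - image[rr][cc]))
    return sum(gradients) / len(gradients) if gradients else 0.0
```

With the 3×3 image `[[2, 3, 2], [2, 2, 3], [1, 3, 1]]`, the single-pixel component `{(0, 1)}` has three contour edges of gradient 1, so its mean is 1.0; the seven-pixel component of level 2 has four contour edges of gradients 1, 2, 2 and 2, so its mean is 1.75.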

1.3.3 Selection of objects

In most segmentation methods based on connected operators, we find some recurrent problems:

1. over-segmentation of the image because of a too large number of extrema (generally dueto some noise),

2. merge of two objects because boundaries are not sharp enough (generally due to blur),

3. object masking because a single node of the component tree is selected whereas a branchcan contain several objects.

Object masking is a well-known problem in segmentation based on connected operators. Many solutions, such as hyper-connectivity or merge thresholding (Fabrizio and Marcotegui, 2008; Wilkinson, 2009), have been proposed to handle the fact that an object might not be uniform and that two objects might overlap. This problem is typical of region-based segmentation, since the selected connected component is generally the one having the highest contrast with its parent.

By contrast, our segmentation can retrieve several overlapping objects in the same branch. To do so, we first trace the attribute signature (the curve of the attribute value along a branch; see Jones (1999)) along each branch and retrieve its local maxima. To avoid over-segmentation, the curve is smoothed to reduce the number of maxima, so that each of them can represent an object (see Figure 1.2).
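The selection step can be sketched as follows. The report does not say which smoothing is used, so a plain moving average is assumed here; the signature is assumed to be a list of attribute values ordered along the branch, and the function names are hypothetical.

```python
def smooth(signature, radius=1):
    """Moving average with a window of 2 * radius + 1 samples (truncated at the ends)."""
    out = []
    for i in range(len(signature)):
        window = signature[max(0, i - radius): i + radius + 1]
        out.append(sum(window) / len(window))
    return out


def local_maxima(values):
    """Indices of interior samples greater than the previous one and not below the next."""
    return [i for i in range(1, len(values) - 1)
            if values[i] > values[i - 1] and values[i] >= values[i + 1]]


def select_objects(signature, radius=1, threshold=0.0):
    """Smooth the signature, then keep the local maxima above the threshold."""
    smoothed = smooth(signature, radius)
    return [i for i in local_maxima(smoothed) if smoothed[i] > threshold]
```

For the toy signature `[0, 3, 0, 3, 0, 0, 0, 9, 0, 0]`, the raw curve has three local maxima while the smoothed one has two, and a threshold of 2.5 keeps only the strongest one. Note that a moving average may shift the position of a maximum by a few nodes along the branch.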

Figure 1.2: Segmentation of overlapping objects. (a) Attribute signature from a leaf in M. (b) Object from the 1st local maximum. (c) Object from the 2nd local maximum.

As such, this method suffers from over-segmentation when the image is noisy. We assume that each branch contains at least one object, which is actually not the case. Indeed, noise is characterized by a large number of local extrema, which create many branches in the component tree. To reduce the number of false positives, we post-process the maxima and keep only those whose value is greater than a given threshold.


1.4 Results

Figure 1.3: Some segmentation results. In red, the objects detected by our method.
(a) Manifestation: original.
(b) Manifestation: segmentation. Most text has been detected.
(c) Wolkswagen: original advertisement.
(d) Wolkswagen: segmentation. Most text has been detected.
(e) Wolkswagen: zoom in. Over-segmentation due to the low resolution.
(f) Spoke-man: original.
(g) Spoke-man: segmentation. Blurred text has been totally ignored.

In this section, we present the results of our method, keeping the objects with the best scores. Indeed, as we cannot display overlapping objects, we focus on those with the highest attribute value. Once an object has been retrieved, the whole branch is set inactive, so there is at most one object per branch.

Figure 1.3 shows segmentation results obtained with our method. Sharp objects are detected, even those whose content is not uniform (see the ’N’ of ’occupation’ in Figure 1.3(b)), whereas a standard region-based segmentation method would split such objects. The second row shows that the method has limits: to avoid over-segmentation of letters, images have to be in high resolution. The third row illustrates the limitations of edge-only segmentation, since blurred text has been totally ignored. A region-based method such as the watershed would have been able to detect this text.

1.5 Conclusion

We introduced a union-find based algorithm that makes it possible to compute attributes on both component regions and component boundaries. This algorithm has been used in a segmentation method that relies exclusively on contour information. The method gave good results on most documents, showing some robustness to noise, blur and object masking. However, we found that it fails on documents that would have been handled well by region-based segmentation methods.

This tends to show that, even if most of the information is carried by contours, we cannot ignore the information given by the region. As a consequence, further work would be to introduce the information given by internal regions when computing the image energy. Some energies (Mumford and Shah, 1989) already consider both regions and boundaries, so this direction remains to be explored.

Chapter 2

Comparison of max tree algorithms

Connected operators, proposed by Breen and Jones (1996), form a subclass of morphological operators in that they use the shape of objects to filter an image. They usually outperform standard filters because they preserve the contours of the objects they filter. As a consequence, connected operators are an interesting tool for image segmentation (Salembier and Serra, 1995), video segmentation and data compression (Salembier et al., 1998).

Connected operators can be unified by a single structure that represents a hierarchical set of connected components, the so-called component tree (Jones, 1999). Filtering methods using the component tree proceed in three steps: construction, attribute evaluation, and tree pruning. Even if processing the three steps independently is usually slower than building and filtering at the same time (Wilkinson and Roerdink, 2000), the component tree offers more flexibility and makes it possible to handle many kinds of pruning strategies with non-increasing attributes.

Efficient algorithms designed to build component trees are based either on the pixel priority queue introduced by Vincent (1993) or on Tarjan’s union-find algorithm (Tarjan, 1975). Several improvements have since been proposed to make them more efficient. Salembier et al. (1998) replaced the pixel queue of Vincent’s algorithm by hierarchical queues, Berger et al. (2007) provided an efficient representation of the tree with a parent image, and Najman and Couprie (2004) introduced the union-by-rank technique in the original union-find algorithm. More recently, with the development of multi-core processors, developers have worked on the parallelization of Salembier’s max-tree algorithm (Matas et al., 2008; Ouzounis and Wilkinson, 2007).

In this part, we make a practical comparison of these algorithms using their implementation in MILENA. We propose some variations to make them faster, and we propose a new parallel version of the union-find algorithm.

Max tree representation

We compare the max tree algorithms on their ability to build the parent image. We briefly present its properties here; more detailed explanations can be found in Berger et al. (2007). Let P^h_k denote a peak component, that is, the k-th set of connected pixels p such that f(p) ≥ h; let r^h_k denote its canonical element, a pixel of P^h_k that represents the whole peak component; and let ⊥ denote the root element. The parent image satisfies the following properties:

1. parent(⊥) = ⊥

2. ∀p ∈ P^h_k, f(r^h_k) ≤ f(parent(p)) ≤ f(p)

3. ∀p ∈ P^h_k such that f(p) = h, parent(p) = r^h_k

4. p is canonical if p = ⊥ ∨ f(parent(p)) < f(p)

Finally, we also want the parent image to satisfy an extra property: every pixel has a canonical parent:

5. ∀p, parent(p) is canonical.
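Properties 4 and 5 translate directly into two small predicates. The sketch below uses dictionaries keyed by pixel labels; the gray levels are those of the 3×3 example image used in the figures of this report, and the two parent mappings are illustrative values (one before and one after a canonization step), not data taken from MILENA.

```python
def is_canonical(p, f, parent):
    """Property 4: p is the root, or its parent lies at a strictly lower level."""
    return parent[p] == p or f[parent[p]] < f[p]


def all_parents_canonical(f, parent):
    """Property 5: every pixel points to a canonical element."""
    return all(is_canonical(parent[p], f, parent) for p in parent)


# Gray levels of the example image, pixels labeled A..J.
f = {"A": 2, "B": 3, "C": 2, "D": 2, "E": 2, "F": 3, "H": 1, "I": 3, "J": 1}

# Illustrative parent image with chains of same-level pixels (non-canonical tree).
raw = {"A": "C", "B": "A", "C": "D", "D": "E", "E": "H",
       "F": "C", "H": "J", "I": "E", "J": "J"}

# The same tree once every pixel points to the canonical element of its parent node.
canonized = {"A": "E", "B": "E", "C": "E", "D": "E", "E": "J",
             "F": "E", "H": "J", "I": "E", "J": "J"}

canonical_elements = sorted(p for p in canonized if is_canonical(p, f, canonized))
```

Here `all_parents_canonical(f, raw)` is false because, for instance, B points to A, which is not canonical, whereas the canonized mapping satisfies property 5 and has five canonical elements, one per node of the max tree.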

Figure 2.1: Max trees and canonical max trees. (a) Non-canonical tree. (b) Canonical tree. Canonical elements are encircled twice.

2.1 Salembier’s algorithm

2.1.1 Hierarchical pixel queues algorithm

The pixel-queue algorithm for max tree construction is briefly described here; a more detailed version can be found in Breen and Jones (1996); Vincent (1993). The algorithm starts by retrieving the global minimum of the image, which is inserted in the queue at level hmin. We then call the flooding function for this level.

Let hqueue be an array of queues with as many levels as the original image f. The flooding function flood(h) aims at building the peak component P^h_k and proceeds in two steps. It first retrieves every pixel p from the queue at level h and pushes each neighbor n of p in the queue at level f(n). If a neighbor has a level greater than h, there exists a peak component at level f(n) which is included (directly or not) in P^h_k. We therefore flood those levels recursively until there are no more pixels in hqueue[h′], h′ > h. Once hqueue[h] is empty, the whole peak component P^h_k has been processed. The second step of the algorithm consists in retrieving the parent component of P^h_k, which is the highest component P^{h′}_k with h′ < h.

An important optimization of this algorithm is to pre-allocate the hierarchical queues. We first scan the whole image to build its histogram. For each gray level, we then allocate a queue large enough to contain all pixels at this level. This results in a structure as large as the original image.
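The flooding procedure described in this section can be sketched in Python as below. This is a simplified illustration rather than the MILENA code: it assumes integer gray levels, 4-connectivity and an image given as a list of rows, it uses one `collections.deque` per level instead of pre-allocated queues, and it makes the root its own parent, a detail this section leaves open. The name `salembier_max_tree` is hypothetical.

```python
from collections import deque


def salembier_max_tree(image):
    """Build a parent mapping by recursive flooding over hierarchical queues."""
    height, width = len(image), len(image[0])
    f = {(r, c): image[r][c] for r in range(height) for c in range(width)}
    hmin, hmax = min(f.values()), max(f.values())
    hqueue = {h: deque() for h in range(hmin, hmax + 1)}  # one queue per level
    levroot = {}      # level -> canonical pixel of the component being built
    parent = {}
    seen = set()      # pixels already pushed in a queue

    def neighbors(p):
        r, c = p
        for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= rr < height and 0 <= cc < width:
                yield (rr, cc)

    def attach_to_parent(h):
        x = levroot.pop(h)                      # the component at level h is complete
        while h > hmin and h not in levroot:    # highest open level below h
            h -= 1
        parent[x] = levroot.get(h, x)           # assumption: the root is its own parent
        return h

    def flood(h):
        while hqueue[h]:
            p = hqueue[h].popleft()
            parent[p] = levroot[h]
            for n in neighbors(p):
                if n in seen:
                    continue
                seen.add(n)
                hn = f[n]
                hqueue[hn].append(n)
                if hn not in levroot:
                    levroot[hn] = n
                while hn > h:                   # flood the higher components first
                    hn = flood(hn)
        return attach_to_parent(h)

    start = min(f, key=f.get)                   # a pixel at the global minimum
    seen.add(start)
    hqueue[hmin].append(start)
    levroot[hmin] = start
    flood(hmin)
    return parent
```

The recursion depth is bounded by the number of gray levels actually present on a branch, so this sketch is only suitable for lowly-quantized images, which is consistent with the analysis of this algorithm.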

Algorithm 2 Salembier’s flooding function: flood(h)
    while is_not_empty(hqueue[h]) do
        p ← pop(hqueue[h])
        parent(p) ← levroot[h]
        for all n ∈ N(p) such that n has not been treated yet do
            h′ ← f(n)
            push(hqueue[h′], n)
            if levroot[h′] is not defined then
                levroot[h′] ← n
            while h < h′ do
                h′ ← flood(h′)
    {Attach the peak component to its parent.}
    return attach-to-parent(h)

Algorithm 3 attach-to-parent(h)
    x ← levroot[h]
    Undefine levroot[h]
    while h > hmin and levroot[h] is not defined do
        h ← h − 1
    parent(x) ← levroot[h]
    return h

2.1.2 Analysis and results

Complexity analysis

Let N denote the number of pixels and L the number of gray levels. From a theoretical point of view, the complexity of Salembier’s algorithm is Θ(N) (Darbon and Akgul, 2005), but its efficiency actually depends strongly on the number of gray levels. Indeed, Salembier’s algorithm is dominated by the flooding function, which constantly goes up and down the hierarchical queues. As a consequence, this algorithm is very efficient on lowly-quantized images (up to 16 bits) but quickly becomes unusable as quantization increases. Figure 2.2 shows the execution time of the algorithm on an Intel Core 2 Duo P8600 at 2.40 GHz with 3 MB of L2 cache and 4 GB of RAM. Running Salembier’s algorithm on 12 bits is about 50% slower than on the 8-bit version of the same image. The gap is even bigger with 16 bits, where the algorithm is about twice as slow as the 8-bit version.

Figure 2.2: Salembier’s algorithm performance of several quantizations (8 bits, 12 bits, 16 bits). CPU time (ms, logarithmic scale) against the number of pixels (2^18 to 2^27).

Figure 2.3: Union-find process. (a) Left: image gray levels; right: pixel labels.

    2 3 2      A B C
    2 2 3      D E F
    1 3 1      H I J

For each level λ = 3, 2, 1, the remaining panels show the pixels in process, the current max tree, the parent image and the zpar image.

Use of memory

Concerning memory usage, this algorithm uses a parent image of integers and a deja_vu image of Booleans that flags the pixels already processed. It also uses a levroot array of integers and an is_defined array of Booleans to define and store canonical elements; both arrays have L elements. Finally, we use pre-allocated hierarchical queues as large as the original image. Therefore, this algorithm needs N·(2·I + 1) + L·(I + 1) bytes, where I is the size in bytes of an integer. Note that since L ≪ N and most architectures have 4-byte integers, the memory used by the algorithm is approximately 9·N bytes.
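As a worked instance of this estimate, take I = 4 and, as an illustrative assumption, an 8-bit image (L = 256 gray levels) of N = 2^20 pixels. The first term counts the parent image, the deja_vu image and the queues; the second term counts the two arrays of L elements:

```latex
\begin{align*}
N(2I + 1) + L(I + 1) &= 9N + 5L && (I = 4)\\
                     &= 9 \cdot 2^{20} + 5 \cdot 256\\
                     &= 9\,437\,184 + 1\,280 = 9\,438\,464 \text{ bytes} \approx 9N .
\end{align*}
```

The contribution of the two level arrays is about 0.01% of the total in this case, which is why it is neglected.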

2.2 Union-find based algorithms

2.2.1 Union-find

The union-find algorithm was originally introduced by Tarjan (1975) to manipulate families of disjoint sets in graph theory. It has since been extended to connected operators by considering an image as a graph in which each pixel is a node connected to its neighbors at different gray levels. Dillencourt et al. (1992) had already used Tarjan’s union-find in a new approach to connected component labeling.

The algorithm is based on three types of instructions for manipulating disjoint sets. MakeSet(x) creates a singleton that contains only x. FindRoot(x) computes the root of the connected component containing x, while Union(A, B) merges the sets A and B into a single one. Efficient implementations of union-find use tree structures; as proposed by Berger et al. (2007), we encode the tree as a parent image.

As connectivity between pixels is induced by their gray levels, we need to sort the pixels in decreasing order of level, so that we process regional maxima first and build the tree up, as shown in Figure 2.3. Briefly, the algorithm proceeds as follows: it retrieves the pixels x in decreasing order and looks for the neighbors n of x that have already been processed. We then compute the root r of P^h(n), the peak component at level h = f(x) that contains n. Two cases are possible: either r and x are the same point, in which case P^h(x) and P^h(n) denote the same component, or r ≠ x and we join the sets P^h(x) and P^h(n) by setting parent(r) = x. At the end of the process (see Algorithm 7), we get the component tree shown in Figure 2.3, which does not satisfy the fifth property of the parent image given at the beginning of this chapter. We therefore need to post-process the parent image with the canonization step detailed in Algorithm 8.

Path root compression

An important optimization of the union-find method is path root compression. Indeed, using the parent image in the FindRoot method is very expensive. That is why we introduce a zpar image, where zpar(x) stores the furthest root of the connected component containing x. The zpar structure is updated by the FindRoot function.

Algorithm 4 MakeSet(x)
    zpar(x) ← x
    parent(x) ← x

Algorithm 5 Union(x, y)
    zpar(x) ← y
    parent(x) ← y

Algorithm 6 FindRoot(x)
    if zpar(x) = x then
        return x
    else
        zpar(x) ← FindRoot(zpar(x))
        return zpar(x)
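Algorithms 4 to 8 can be assembled into the following Python sketch. It is a minimal illustration under assumptions that are not part of the report: the image is a list of rows, N(x) is the 4-neighborhood, pixels of equal level are processed in raster order, and FindRoot is written iteratively rather than recursively. The name `max_tree_union_find` is hypothetical.

```python
def max_tree_union_find(image):
    """Union-find max tree with path compression, followed by canonization."""
    f = {(r, c): v for r, row in enumerate(image) for c, v in enumerate(row)}
    order = sorted(f, key=lambda p: -f[p])   # S: pixels by decreasing level
    parent, zpar = {}, {}                    # zpar(x) undefined = x not in zpar

    def find_root(x):
        root = x
        while zpar[root] != root:
            root = zpar[root]
        while zpar[x] != root:               # path compression on zpar only
            nxt = zpar[x]
            zpar[x] = root
            x = nxt
        return root

    for x in order:
        zpar[x] = x                          # MakeSet(x)
        parent[x] = x
        r, c = x
        for n in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if n in zpar:                    # zpar(n) is defined
                q = find_root(n)
                if q != x:
                    zpar[q] = x              # Union(q, x)
                    parent[q] = x

    for x in reversed(order):                # CanonizeTree()
        q = parent[x]
        if f[parent[q]] == f[q]:             # q is not canonical: skip over it
            parent[x] = parent[q]
    return parent
```

Path compression only touches `zpar`, so the `parent` image keeps the tree structure. On the 3×3 example image, the level-3 pixels and the level-2 pixels all end up pointing to the pixel labeled E, which itself points to the root J.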

Algorithm 7 UnionFind(f)
    Initialize ∀x, zpar(x) ← undef
    S ← sort(f) in decreasing order
    for all x ∈ S do
        MakeSet(x)
        for all n ∈ N(x) such that zpar(n) is defined do
            q ← FindRoot(n)
            if q ≠ x then
                Union(q, x)
    CanonizeTree()

Algorithm 8 CanonizeTree()
    for all x ∈ S in reverse order do
        q ← parent(x)
        if f(parent(q)) = f(parent(x)) then
            parent(x) ← parent(q)

2.2.2 Union by rank

Najman and Couprie (2004) proposed an extension of the union-find algorithm that uses the rank when joining disjoint sets. This method, originally introduced by Tarjan (1975), prevents the degeneration of component trees. Although the Union operation is done with a single index assignment, we can minimize the height of the tree to get a shorter path from a node to the root. Thus, for each node, we maintain its rank, which is an approximation of the logarithm of the size and an upper bound of the sub-tree height. Then, when joining two connected components, we choose as root the node with the highest rank (see Algorithm 9). Nevertheless, trees built using union by rank are not correct as such, since the parent relationship is set according to the rank instead of the gray levels of the pixels (see Figure 2.4). As a consequence, we introduce an extra image zpar_to_par that gives, for each node of the tree built using union by rank, its equivalent in the max-tree.

Figure 2.4: Differences between max-tree and union-by-rank parent relationship, for a component Γ rooted in B and a pixel A. (a) Original image. (b) Max-tree parent relationship. (c) Union-by-rank parent relationship. The rank of A being lower than that of B, A is attached to B regardless of gray level order.
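The following Python sketch shows one way union by rank and the zpar_to_par image can work together. It rests on two assumptions that should be checked against the original references: the two sets are joined when their roots differ, and the max-tree link is made explicitly by attaching the max-tree representative of the absorbed set to the current pixel x. The image layout (list of rows), the 4-neighborhood and the name `max_tree_union_by_rank` are also assumptions of this example.

```python
def max_tree_union_by_rank(image):
    """Union-find max tree where zpar is balanced by rank (sketch under assumptions)."""
    f = {(r, c): v for r, row in enumerate(image) for c, v in enumerate(row)}
    order = sorted(f, key=lambda p: -f[p])
    parent, zpar, rnk, zpar_to_par = {}, {}, {}, {}

    def find_root(x):
        root = x
        while zpar[root] != root:
            root = zpar[root]
        while zpar[x] != root:
            nxt = zpar[x]
            zpar[x] = root
            x = nxt
        return root

    for x in order:
        parent[x] = x                  # make set
        zpar[x] = x
        rnk[x] = 0
        zpar_to_par[x] = x             # max-tree representative of the set of x
        r0, c0 = x
        for n in ((r0 - 1, c0), (r0 + 1, c0), (r0, c0 - 1), (r0, c0 + 1)):
            if n not in zpar:          # zpar(n) undefined: not processed yet
                continue
            r = find_root(n)
            q = find_root(x)
            if r != q:                 # assumption: compare the two roots
                # Assumption: link the max-tree root of n's set below x.
                parent[zpar_to_par[r]] = x
                if rnk[q] < rnk[r]:
                    q, r = r, q        # q is now the root of highest rank
                zpar[r] = q
                zpar_to_par[q] = x     # x represents the merged set in the max tree
                if rnk[q] == rnk[r]:
                    rnk[q] += 1

    for x in reversed(order):          # CanonizeTree()
        q = parent[x]
        if f[parent[q]] == f[q]:
            parent[x] = parent[q]
    return parent
```

With this design, the disjoint-set forest stored in `zpar` may be rooted anywhere, while `parent` is only ever written through `zpar_to_par`, so the resulting parent image does not depend on how ranks broke ties.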


Algorithm 9 UnionFindByRnk(f)
    Initialize for all x: zpar(x) ← undef; rnk(x) ← 0
    S ← sort(f) in decreasing order
    for all x ∈ S do
        {Make set}
        parent(x) ← x
        zpar(x) ← x
        zpar_to_par(x) ← x
        for all n ∈ N(x) such that zpar(n) is defined do
            r ← FindRoot(n)
            q ← FindRoot(x)
            if r ≠ q then
                {Join disjoint sets r and q using rank}
                if rnk(q) < rnk(r) then
                    swap(q, r)
                zpar(r) ← q
                zpar_to_par(q) ← x
                if rnk(q) = rnk(r) then
                    rnk(q) ← rnk(q) + 1
    CanonizeTree()

Complexity analysis

Let N denote the number of pixels. The algorithm starts by sorting the pixels in decreasing order. For lowly-quantized values, this can be done by a radix sort in linear time; on highly-quantized data, we can quick-sort the pixels in quasi-linear time. The union-find algorithm with union by rank is Θ(N α(N)), where α is the inverse of the Ackermann function (the full proof can be found in Tarjan (1975)). In both cases, the full process is quasi-linear. Figure 2.5 shows that union by rank has a clearly positive influence on the efficiency of the algorithm, since it generally runs between 10 and 30% faster.

Contrary to Salembier’s algorithm, whose complexity depends on the number of gray levels, the complexity of union-find only depends on the number of pixels (sort step excluded). However, practical experiments have shown that quantization impacts the efficiency of union-find as well. The algorithm running on 16-bit values is two or three times as slow as the 8-bit version. This is due first to the radix sort, whose efficiency decreases significantly with the number of gray levels, and second to the internal architecture of processors, which process 8-bit values faster than 12-, 16- and 32-bit ones.

Figure 2.5: Union-by-rank influence on union-find algorithm. (a) Union-find algorithm on several quantizations. (b) Union-find algorithm with union-by-rank on several quantizations. Both plots show CPU time (ms, logarithmic scale) against the number of pixels (2^18 to 2^27).

Use of memory

To compute the max-tree, the algorithm needs a parent image, a zpar image for path root compression, and an array to sort the pixels. All of them are as large as the original image, so it uses 3·N integers. On the other hand, although union by rank is more efficient, it is also greedier, since it needs two more images, i.e. 5·N integers.

Comparison with Salembier’s algorithm

Figure 2.6: Comparison of Union-find (with union-by-rank) and Salembier’s algorithms. Axes: CPU time (ms) versus number of pixels (2^18 to 2^27). Curves: Salembier 8, 12 and 16 bits; Union-find 8, 12 and 16 bits.

Figure 2.6 shows that Salembier’s algorithm outperforms union-find on 8-, 12- and 16-bit quantized images whatever their size, since it runs two to three times as fast as union-find on average. However, union-find keeps a significant advantage since it is the only one able to process highly-quantized images (i.e. float images (Berger et al., 2007)).

2.3 Parallel max-tree using Intel TBB

2.3.1 Motivation

With the development of multi-core processors, designing efficient algorithms requires taking advantage of the new abilities of hardware. Matas et al. (2008) and Ouzounis and Wilkinson (2007) have introduced parallelism in max-tree algorithms. They first divide the whole image into sub-domains, which are processed by individual threads to build local max-trees. Then, adjacent trees are merged recursively to produce the whole max-tree. In their implementation, Ouzounis and Wilkinson (2007) used Salembier’s algorithm to build local max-trees, while Matas et al. (2008) used a point-tree that runs on 1D images only. To keep the high-genericity spirit of MILENA, we propose a new parallel max-tree algorithm based on union-find, so that we are able to handle multi-dimensional and float images as well. We are going to see that introducing union-find does not only change the building step: the merging process is modified as well.

2.3.2 Parallel algorithm

Description

The algorithm proceeds in two steps: a sub-tree is computed on each sub-domain using the union-find algorithm, and neighboring sub-trees are then merged to build the global max-tree. The union-find algorithm used here is exactly the same as the one described in section 2.2.1, except that we process a sub-domain instead of the whole domain. So we are going to focus on the merging process. For each pair of points (x, y) on the border splitting the domain of f, we modify the parent function from x and y to get a single max-tree. This is done by the procedure Connect(x, y), which traverses the trees upward to merge their nodes into a single path. The merge function proposed here is very similar to the one introduced by Matas et al. (2008), except that we introduce the procedure FindCanonical(x), which looks for the canonical element of the node containing x. While searching for this element, FindCanonical(x) performs a path compression similar to that of the FindRoot procedure presented in section 2.2.1.

Algorithm 10 FindCanonical(x)
z ← parent(x)
if f(z) = f(x) and z ≠ x then
    return parent(x) ← FindCanonical(z)
return x

Algorithm 11 Connect(x, y)
x ← FindCanonical(x)
y ← FindCanonical(y)
if f(y) > f(x) then
    swap(x, y)
while x ≠ y do
    if parent(x) = x then
        parent(x) ← y
        return
    else
        z ← FindCanonical(parent(x))
        if f(z) < f(y) then
            parent(x) ← y
            x ← y
            y ← z
        else
            x ← z
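The two procedures above can be transcribed almost line for line. The following Python sketch is one such transcription, under the assumption that the image f and the parent function are stored as flat lists indexed by pixel; the 1D signal and the two hand-built sub-trees in the example are illustrative data, not taken from the report.

```python
def find_canonical(parent, f, x):
    """Return the canonical element of the node containing x (Algorithm 10)."""
    z = parent[x]
    if f[z] == f[x] and z != x:
        # Same level as the parent: keep climbing and compress the path.
        parent[x] = find_canonical(parent, f, z)
        return parent[x]
    return x


def connect(parent, f, x, y):
    """Merge the branches containing x and y into a single path (Algorithm 11)."""
    x = find_canonical(parent, f, x)
    y = find_canonical(parent, f, y)
    if f[y] > f[x]:
        x, y = y, x                  # keep f[x] >= f[y]
    while x != y:
        if parent[x] == x:           # x is a root: hang it under y and stop
            parent[x] = y
            return
        z = find_canonical(parent, f, parent[x])
        if f[z] < f[y]:              # y belongs between x and z
            parent[x] = y
            x, y = y, z
        else:                        # keep climbing the branch of x
            x = z


# 1D signal split into two sub-domains: pixels 0..2 and pixels 3..5.
f = [3, 1, 2, 2, 0, 4]
# Local max-trees: pixel 1 is the root on the left, pixel 4 on the right.
parent = [1, 1, 1, 4, 4, 4]
connect(parent, f, 2, 3)             # pixels 2 and 3 face each other on the border
print(parent)                        # [1, 4, 3, 1, 4, 4]
```

After the call, pixel 4 (the global minimum) is the only root, pixels 2 and 3 form a single node at level 2 whose canonical element is pixel 3, and that node hangs under the level-1 node represented by pixel 1.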


Parallelism strategies

Our implementation uses Intel’s TBB v3.0. Matas et al. (2008) explained two main parallelism strategies. The first one, called the parallelism maximization strategy, consists in recursively splitting the ranges until reaching the grain size, which is the minimum number of pixels for each sub-domain. This strategy can be easily implemented using TBB’s simple partitioner. However, if the grain size does not suit the image size, it usually leads to over-segmented domains and a significant overhead.

The second strategy tries to minimize communication between threads and can be implemented using TBB’s auto partitioner. It attempts to minimize domain splitting while providing opportunities for work-stealing. Thus, a domain is split only if an idle thread becomes available, so that each thread processes a significant amount of data and context switching is minimized.
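The effect of the grain size on the first strategy can be illustrated with a small sketch. The Python function below is not TBB code and does not reproduce its API; it only mimics, under that simplifying assumption, the recursive halving of a range until each piece is no larger than the grain size. The name `split_to_grain` is hypothetical.

```python
def split_to_grain(begin, end, grain_size):
    """Recursively halve [begin, end) until each piece holds at most grain_size items."""
    if end - begin <= grain_size:
        return [(begin, end)]
    mid = begin + (end - begin) // 2
    return (split_to_grain(begin, mid, grain_size)
            + split_to_grain(mid, end, grain_size))


# A suitable grain size gives a handful of sub-domains...
print(split_to_grain(0, 1000, 300))
# [(0, 250), (250, 500), (500, 750), (750, 1000)]

# ...whereas a grain size that is too small over-segments the domain.
print(len(split_to_grain(0, 1000, 10)))  # 128
```

Every extra sub-domain adds a merge along its border, which is why an ill-chosen grain size translates into overhead.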


Figure 2.7: Performance of parallel union-find. First row, panels (a) to (c): wall clock time of execution (ms) versus number of threads (1 to 64). Second row, panels (d) to (f): gain compared to the non-parallel version of union-find versus number of threads. From left to right: 2 Mb image, 10 Mb image and 20 Mb image. Curves: Intel(R) Core(TM)2 Duo and Intel(R) Core(TM) i7.

The tests measure wall clock time (which is different from the CPU time measured in the previous section) and are performed on two multi-core machines:

• an Intel(R) Core(TM)2 Duo CPU P8600 @ 2.40GHz,

• an Intel(R) Core(TM) i7 CPU Q720 @ 1.60GHz


The results are quite surprising. First, Figure 2.7 shows that even with a single thread, the parallel algorithm is more efficient than the sequential one since it runs about 30% faster. This can be explained by the fact that a standard bottleneck in image processing is cache misses. Dividing the image into sub-domains enables processors to perform some cache optimizations¹. This effect is even more visible in the Intel Core 2 Duo results. Indeed, we got the best improvement in Figure 2.7(f), a 370% performance gain, whereas this processor has only two cores and no hyper-threading. This tends to show how important cache considerations are for algorithms.

The second main observation we can draw from these results is that the number of threads needed to get optimal performance highly depends on the image size. While the algorithm performs best on a 2-megapixel image with 4 threads, it performs best with 32 threads on an image ten times bigger. As a rule, to get maximal speedups, the number of threads should increase with the image size.

To finish, note that these conclusions remain valid whether we include the rank technique or not, i.e. the performance gain due to parallelism does not depend on the build algorithm.

2.4 Conclusion

In this chapter, we have seen two classes of algorithms dedicated to max-tree construction. Salembier’s algorithm generally outperforms the ones based on Tarjan’s union-find. Nevertheless, it suffers from a main drawback: it cannot handle highly-quantized images like float ones. Algorithms based on union-find do not suffer from this limitation; as a consequence, they provide a generic way to build a max-tree with few penalties compared to Salembier’s one. We have seen several techniques to optimize union-find, such as path compression and union-by-rank, so that we can decrease execution time by one fourth.

We also showed how important parallelism is in image processing. We provided parallelism techniques that clearly outperform sequential algorithms. Beyond our expectations, we got very good speedups (up to 3.6 times faster on a dual-core processor).

Further work

Even if this thesis provides a good basis for a complete comparison of max-tree algorithms, this work is not complete. First, we have to study the parallel version of Salembier’s algorithm to see if we get the same speedups as for parallel union-find. Then, we should conduct deeper research on the way data is cached in parallel algorithms to understand more precisely the remarkable speedups we have presented.

¹We are currently involved in a deeper study of the cache optimizations provided by parallel algorithms. However, the best tools for parallelism analysis are Visual Studio plug-ins, and MILENA still does not compile under Visual Studio.

Chapter 3

Bibliography

Berger, C., Géraud, T., Levillain, R., and Widynski, N. (2007). Effective component tree computation with application to pattern recognition in astronomical imaging. In Proc. IEEE Int. Conf. Image Processing 2007. Citeseer.

Beucher, S. (1992). The watershed transformation applied to image segmentation. Scanning Microscopy Supplement, pages 299–299.

Breen, E. and Jones, R. (1996). Attribute openings, thinnings, and granulometries. Computer Vision and Image Understanding, 64(3):377–389.

Cutting, D., Karger, D., Pedersen, J., and Tukey, J. (1992). Scatter/gather: A cluster-based approach to browsing large document collections. In Proceedings of the 15th annual international ACM SIGIR conference on Research and development in information retrieval, pages 318–329. ACM.

Darbon, J. and Akgul, C. (2005). An efficient algorithm for attribute openings and closings.

Dillencourt, M., Samet, H., and Tamminen, M. (1992). A general approach to connected-component labeling for arbitrary image representations. Journal of the ACM (JACM), 39(2):253–280.

Fabrizio, J. and Marcotegui, B. (2008). Ouverture Ultime: un outil pour la segmentation. Application à la localisation de texte. In 31th meeting of the International Society for Stereology (ISS08), French section.

Jones, R. (1999). Connected filtering and segmentation using component trees. Computer Vision and Image Understanding, 75(3):215–228.

Matas, P., Dokládalová, E., Akil, M., Grandpierre, T., Najman, L., Poupa, M., and Georgiev, V. (2008). Parallel Algorithm for Concurrent Computation of Connected Component Tree. In Advanced concepts for intelligent vision systems: 10th International Conference, ACIVS 2008, Juan-les-Pins, France, October 20-24, 2008; proceedings, page 230. Springer-Verlag New York Inc.

Mumford, D. and Shah, J. (1989). Optimal approximations by piecewise smooth functions and associated variational problems. Comm. Pure Appl. Math, 42(5):577–685.

Najman, L. and Couprie, M. (2004). Quasi-linear algorithm for the component tree. SPIE Vision Geometry XII, 5300:98–107.


Ouzounis, G. and Wilkinson, M. (2007). A parallel implementation of the dual-input Max-Tree algorithm for attribute filtering.

Salembier, P., Oliveras, A., and Garrido, L. (1998). Antiextensive connected operators for image and sequence processing. IEEE Transactions on Image Processing, 7(4):555–570.

Salembier, P. and Serra, J. (1995). Flat zones filtering, connected operators, and filters by reconstruction. IEEE Transactions on Image Processing, 4(8):1153–1160.

Tarjan, R. (1975). Efficiency of a good but not linear set union algorithm. Journal of the ACM (JACM), 22(2):215–225.

Vincent, L. (1993). Morphological area openings and closings for grey-scale images. Shape in Picture: Mathematical Description of Shape in Grey-level Images, pages 196–208.

Wilkinson, M. (2009). Hyperconnectivity, Attribute-Space Connectivity and Path Openings: Theoretical Relationships. In Mathematical Morphology and Its Application to Signal and Image Processing: 9th International Symposium on Mathematical Morphology, ISMM 2009, Groningen, the Netherlands, August 24-27, 2009, Proceedings, page 47. Springer-Verlag New York Inc.

Wilkinson, M. and Roerdink, J. (2000). Fast morphological attribute operations using Tarjan’s union-find algorithm. Mathematical morphology and its applications to image and signal processing, pages 311–320.
