Parallel implementation of geodesic distance transform with application in superpixel segmentation



DESCRIPTION

This poster presents a parallel implementation of geodesic distance transform using OpenMP. This work forms part of a C implementation for geodesic superpixel segmentation of natural images. Presented at the DICTA 2013 conference.

TRANSCRIPT

Parallel implementation of geodesic distance transform with application in superpixel segmentation

Tuan Q. Pham

Canon Information Systems Research Australia (CISRA)

[email protected]

References:

1. Achanta et al., SLIC superpixels compared to state-of-the-art superpixel methods, PAMI 34(11), 2012.

2. Levinshtein et al., TurboPixels: Fast superpixels using geometric flows, PAMI 31(12), 2009.

Contact details: Tuan Q. Pham ([email protected]), 1 Thomas Holt Drive, North Ryde, NSW 2113, Australia

Presented at Int’l Conf. on Digital Image Computing: Techniques and Applications (DICTA), Paper 5, Poster session 2, on Thursday 8th November, 2013, Hobart, Australia

Superpixel segmentation

Segment image into superpixels using GDT where

Cost image = gradient energy + a small offset

Seed points = well-separated local gradient minima by adaptive non-maximum suppression
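As a rough illustration of the cost image described above, the following C sketch computes a gradient energy and adds a constant offset. The function name `cost_image`, the use of central differences with clamped borders, and the reading of "gradient energy" as the squared gradient magnitude are all assumptions made for this example, not details taken from the poster. Seed selection is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: cost = gx^2 + gy^2 + offset for a row-major
 * greyscale image. Central differences are used, with neighbour
 * coordinates clamped at the image border (an assumption). */
void cost_image(const float *img, float *cost, int w, int h, float offset)
{
    assert(w > 0 && h > 0 && offset > 0.0f);
    for (int y = 0; y < h; ++y) {
        int ym = y > 0 ? y - 1 : y;          /* clamped row above */
        int yp = y < h - 1 ? y + 1 : y;      /* clamped row below */
        for (int x = 0; x < w; ++x) {
            int xm = x > 0 ? x - 1 : x;      /* clamped column to the left  */
            int xp = x < w - 1 ? x + 1 : x;  /* clamped column to the right */
            float gx = 0.5f * (img[(size_t)y * w + xp] - img[(size_t)y * w + xm]);
            float gy = 0.5f * (img[(size_t)yp * w + x] - img[(size_t)ym * w + x]);
            cost[(size_t)y * w + x] = gx * gx + gy * gy + offset;
        }
    }
}
```

A positive offset keeps the cost strictly positive in flat regions, so the geodesic distance still grows with path length there.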

Summary

We proposed a parallel implementation of geodesic distance transform using OpenMP

Our geodesic segmentation method produces more regular, edge-following superpixels and runs from several times to over two orders of magnitude faster than the state-of-the-art methods compared (Figs. 7 and 8).

Parallel distance transform

Within a pass, the chamfer algorithm is sequential

The algorithm can be parallelised if multiple passes are allowed (acceptable since GDT is iterative)

The image is divided into bands for parallel processing

The distance transform is propagated across bands in the next iteration (may require more iterations)
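The band-based scheme can be sketched in C with one OpenMP directive over the bands. This is a minimal sketch under stated assumptions, not the poster's code: it assumes 4-connected neighbours and that a step into pixel p costs f(p), and the names `relax`, `band_pass` and `gdt_banded` are hypothetical. Each band's pass is confined to its own rows, and distances are then exchanged across band boundaries in a short sequential step. That is one race-free way to let information cross bands between iterations; the poster does not say how the original code handles band borders. The array `d` is expected to hold 0 at seed pixels and a very large value elsewhere.

```c
#include <assert.h>
#include <stddef.h>

/* Relax pixel i against neighbour j (assumed step cost: f[i]). */
static int relax(float *d, const float *f, size_t i, size_t j)
{
    float cand = d[j] + f[i];
    if (cand < d[i]) { d[i] = cand; return 1; }
    return 0;
}

/* Forward + backward pass over rows [y0, y1) only; no other band is touched. */
static int band_pass(float *d, const float *f, int w, int y0, int y1)
{
    int changed = 0;
    for (int y = y0; y < y1; ++y)
        for (int x = 0; x < w; ++x) {
            size_t i = (size_t)y * w + x;
            if (x > 0)  changed += relax(d, f, i, i - 1);
            if (y > y0) changed += relax(d, f, i, i - w);
        }
    for (int y = y1 - 1; y >= y0; --y)
        for (int x = w - 1; x >= 0; --x) {
            size_t i = (size_t)y * w + x;
            if (x < w - 1)  changed += relax(d, f, i, i + 1);
            if (y < y1 - 1) changed += relax(d, f, i, i + w);
        }
    return changed;
}

/* Iterate until nothing changes or max_iter is reached; returns iterations run. */
int gdt_banded(float *d, const float *f, int w, int h, int n_bands, int max_iter)
{
    assert(w > 0 && h > 0 && n_bands > 0 && n_bands <= h);
    int it = 0, changed = 1;
    while (changed && it < max_iter) {
        changed = 0;
        /* schedule(static, 1) hands bands to threads round-robin. */
        #pragma omp parallel for schedule(static, 1) reduction(+:changed)
        for (int b = 0; b < n_bands; ++b) {
            int y0 = (int)((long)h * b / n_bands);
            int y1 = (int)((long)h * (b + 1) / n_bands);
            changed += band_pass(d, f, w, y0, y1);
        }
        /* Sequential stitch: relax both ways across each band boundary. */
        for (int b = 1; b < n_bands; ++b) {
            int y = (int)((long)h * b / n_bands);   /* first row of band b */
            for (int x = 0; x < w; ++x) {
                size_t i = (size_t)y * w + x;
                changed += relax(d, f, i, i - w);
                changed += relax(d, f, i - w, i);
            }
        }
        ++it;
    }
    return it;
}
```

With GCC or Clang the directive takes effect when compiling with `-fopenmp`; without that flag the pragma is ignored and the same code runs sequentially.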

Geodesic distance

Geodesic distance between two points = sum of pixel costs along a minimum-cost path

Geodesic distance transform

Geodesic distance transform d(cost image f, seed points) = geodesic distance from every pixel to its nearest seed
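In symbols, the two definitions above can be written as follows. The notation (the path set Pi, the seed set S, and the names d_f and D_f) is introduced here for illustration only; whether the cost of the starting pixel is included in the sum is a convention the poster does not state.

```latex
% f(p): cost at pixel p
% \Pi(a,b): set of connected pixel paths from a to b
% S: set of seed points
d_f(a, b) = \min_{\pi \in \Pi(a, b)} \sum_{p \in \pi} f(p)
\qquad\text{and}\qquad
D_f(x) = \min_{s \in S} d_f(s, x)
```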

Fig. 1. Cost image (left) and its geodesic distance transform (right). Figure labels: source, destination, seed; minimum path, cost = 1.7; straight path, cost = 11.1.

Chamfer distance algorithm = multiple iterations of a forward propagation and a backward propagation

Fig. 2. One iteration of a forward pass (left) and a backward pass (right)
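The forward and backward propagation can be sketched in C as two raster scans that relax each pixel against the neighbours already visited in that scan. This is a minimal sketch under stated assumptions: 4-connected neighbours, a step into pixel p costing f(p), and hypothetical names `gdt_init`, `gdt_iteration` and `gdt`. The chamfer mask and weights used on the poster may differ.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Set every distance to infinity, then zero the seed pixels. */
void gdt_init(float *d, size_t n, const size_t *seeds, size_t n_seeds)
{
    for (size_t i = 0; i < n; ++i) d[i] = INFINITY;
    for (size_t s = 0; s < n_seeds; ++s) d[seeds[s]] = 0.0f;
}

/* Relax pixel i against neighbour j (assumed step cost: f[i]). */
static int relax(float *d, const float *f, size_t i, size_t j)
{
    float cand = d[j] + f[i];
    if (cand < d[i]) { d[i] = cand; return 1; }
    return 0;
}

/* One forward and one backward pass; returns the number of updates made. */
int gdt_iteration(float *d, const float *f, int w, int h)
{
    assert(w > 0 && h > 0);
    int changed = 0;
    for (int y = 0; y < h; ++y)              /* forward: top-left to bottom-right */
        for (int x = 0; x < w; ++x) {
            size_t i = (size_t)y * w + x;
            if (x > 0) changed += relax(d, f, i, i - 1);   /* left neighbour  */
            if (y > 0) changed += relax(d, f, i, i - w);   /* upper neighbour */
        }
    for (int y = h - 1; y >= 0; --y)         /* backward: bottom-right to top-left */
        for (int x = w - 1; x >= 0; --x) {
            size_t i = (size_t)y * w + x;
            if (x < w - 1) changed += relax(d, f, i, i + 1);   /* right neighbour */
            if (y < h - 1) changed += relax(d, f, i, i + w);   /* lower neighbour */
        }
    return changed;
}

/* Repeat until stable; returns how many iterations changed at least one pixel. */
int gdt(float *d, const float *f, int w, int h, int max_iter)
{
    int it = 0;
    while (it < max_iter && gdt_iteration(d, f, w, h) > 0) ++it;
    return it;
}
```

A single forward pass cannot route a path around an obstacle that lies above or to the left of the target, which is why the backward pass and repeated iterations are needed.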

Geodesic distance transform (GDT) produces an edge-following Voronoi tessellation if edge strength is used as the cost
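One way to obtain the tessellation is to carry a seed label along with each distance update, so that every pixel ends up with the index of its geodesically nearest seed. The sketch below assumes 4-connected neighbours and a step cost of f(p) for entering pixel p; the names `relax_label` and `geodesic_voronoi` are hypothetical, and a label of -1 marks a pixel that no seed has reached yet.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Relax pixel i against neighbour j; on improvement, inherit j's seed label. */
static int relax_label(float *d, int *label, const float *f, size_t i, size_t j)
{
    float cand = d[j] + f[i];
    if (cand < d[i]) { d[i] = cand; label[i] = label[j]; return 1; }
    return 0;
}

/* Fills d with geodesic distances and label with nearest-seed indices.
 * Returns the number of forward+backward iterations that were run. */
int geodesic_voronoi(float *d, int *label, const float *f, int w, int h,
                     const size_t *seeds, int n_seeds, int max_iter)
{
    assert(w > 0 && h > 0 && n_seeds > 0);
    size_t n = (size_t)w * h;
    for (size_t i = 0; i < n; ++i) { d[i] = INFINITY; label[i] = -1; }
    for (int s = 0; s < n_seeds; ++s) { d[seeds[s]] = 0.0f; label[seeds[s]] = s; }

    int it = 0, changed = 1;
    while (changed && it < max_iter) {
        changed = 0;
        for (int y = 0; y < h; ++y)              /* forward pass */
            for (int x = 0; x < w; ++x) {
                size_t i = (size_t)y * w + x;
                if (x > 0) changed += relax_label(d, label, f, i, i - 1);
                if (y > 0) changed += relax_label(d, label, f, i, i - w);
            }
        for (int y = h - 1; y >= 0; --y)         /* backward pass */
            for (int x = w - 1; x >= 0; --x) {
                size_t i = (size_t)y * w + x;
                if (x < w - 1) changed += relax_label(d, label, f, i, i + 1);
                if (y < h - 1) changed += relax_label(d, label, f, i, i + w);
            }
        ++it;
    }
    return it;
}
```

Because crossing a high-cost pixel is expensive, the boundary between two labels tends to settle on high-cost pixels, which is what makes the cells follow edges.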

Fig. 3. Geodesic distance transform (2nd row) and tessellation (3rd row)

Fig. 4. Band-based image partitioning for parallel GDT implementation

Fig. 3 panels: Input image; Cost image & 4 seed points; Intermediate GDT after a first forward pass; Intermediate GDT after a first backward pass; GDT after 10 fwd+bwd propagation iterations; Nearest seed label after a first forward pass; Nearest seed label after a first backward pass; Nearest seed label after 10 iterations. Annotations: fragmentation; region without a nearest seed.

OpenMP implementation

OpenMP = an easy-to-use Open Multi-Processing platform that is designed for multicore processors and is supported by most compilers

OpenMP parallelises loops by compiler directives

Segmentation comparison

Geodesic superpixel segmentation is faster & follows edges better


Fig. 6. 1000 geodesic superpixels from a 2-megapixel image (1936×1288)

Fig. 7. Three state-of-the-art superpixel methods on 2MP image in Fig. 6. Panels: SLIC [1] (4.6 seconds); Geodesic (0.64 seconds); TurboPixels [2] (207 seconds)

Fig. 8. Segmentation of 1MP image (# denotes number of superpixels returned)

Method       #      Time    Platform
Watershed    1008   3.2s    C/Matlab
FH           1024   2.3s    C
Quickshift   992    13.3s   C
Entropy      1000   6.5s    C
Geodesic     1000   0.3s    C
CVT          1000   2.7s    Matlab
Lattice      1024   1.4s    C
SLIC         990    1.2s    C
Turbo        1067   58.1s   Matlab
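To put the speed comparison in numbers, the runtime ratios implied by the figures reported in Fig. 7 and in the Fig. 8 table work out as below. The ratios are simple divisions of the reported times and are not stated on the poster; since the methods run on different platforms (C and Matlab), part of each gap may reflect the implementation rather than the algorithm.

```latex
% 2MP image (Fig. 7)
\frac{t_{\mathrm{SLIC}}}{t_{\mathrm{Geodesic}}} = \frac{4.6}{0.64} \approx 7.2,
\qquad
\frac{t_{\mathrm{TurboPixels}}}{t_{\mathrm{Geodesic}}} = \frac{207}{0.64} \approx 323
\\
% 1MP image (Fig. 8 table)
\frac{t_{\mathrm{SLIC}}}{t_{\mathrm{Geodesic}}} = \frac{1.2}{0.3} = 4,
\qquad
\frac{t_{\mathrm{Turbo}}}{t_{\mathrm{Geodesic}}} = \frac{58.1}{0.3} \approx 194
```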

Evaluation of parallel GDT

Best with static scheduling (where bands are assigned to threads in a round-robin fashion)

Number of fwd+bwd propagation iterations increases slightly under parallel implementation (10 iterations are often enough for segmentation)

Sub-second runtime on images of 5 MP or smaller

Speedup of 1.3× on a 2-core CPU, 2.6× on a 4-core CPU
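The reported speedups can also be read as parallel efficiency, that is, speedup divided by the number of cores. This measure is a standard one added here for illustration; it does not appear on the poster.

```latex
% E(p) = S(p) / p, with S(p) the speedup on p cores
E(2) = \frac{1.3}{2} = 0.65,
\qquad
E(4) = \frac{2.6}{4} = 0.65
```

Both core counts give the same efficiency of about 65%.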

Fig. 5. Runtime & speedup factor on 2.8GHz quad-core CPU with 12GB RAM. Runtime panel: runtime (seconds, 0 to 2) against image width (pixels, 0 to 4000) for three curves: without OpenMP, static schedule, dynamic schedule. Speedup factor panel: speedup factor (0 to 3.5) against image width (pixels, 0 to 4000) for two curves: static schedule, dynamic schedule.

Geodesic Voronoi tessellation