research on graph-cut for stereo vision

111/03/15
VLSI Signal Processing Lab, Institute of Electronics, National Chiao Tung University, Hsinchu, Taiwan
Platform-Based Design Group


Research on Graph-Cut for Stereo Vision

Presenter: Nelson Chang

Institute of Electronics, National Chiao Tung University


Outline

• Research Overview
• Brief Review of Stereo Vision
• Hierarchical Exhaustive Search
• Partitioned Graph-Cut for Stereo Vision
• Hierarchical Parallel Graph-Cut


Our Research

• A fast vision system for robotics
  – Stereo vision
    • Local block-based + diffusion (M)
    • Graph-cut (PhD)
    • Belief propagation (PhD)
  – Segmentation
    • Watershed (M)
    • Meanshift
• Approaches
  – Embedded solutions
    • DSP (U)
    • ASIC
  – PC-based solutions
    • Dual webcam stereo (U)

[Figures: HRP-2 Head; HRP-2 Tri-Camera Head]


My Research

• A fast graph-cut VLSI engine for stereo vision
  – ASIC approach
  – Goal: 256x256 pixels, 30 depth labels, 30 fps
• Stereo vision system prototypes
  – PC-based
  – DSP-based
  – FPGA/ASIC-based


Review on Stereo Vision


Concept of Stereo Vision

• Computational Stereo – to determine the 3-D structure of a scene from 2 or more images taken from distinct view points.

Triangulation of non-verged geometry:

$$Z = \frac{f\,T}{d}, \qquad d = x_p - x_{p'}$$

d: disparity, Z: depth, T: baseline, f: focal length

M. Z. Brown et al., "Advances in Computational Stereo," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 8, August 2003.
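The triangulation relation above can be tried out with a few lines of Python. This is a minimal sketch: the function name `depth_from_disparity` and the rig values (f = 500 pixels, T = 0.1 m) are illustrative assumptions, not values from the slides.

```python
def depth_from_disparity(d, f, T):
    """Depth Z = f * T / d for a non-verged stereo rig.

    d: disparity in pixels, f: focal length in pixels, T: baseline.
    The result is in the same unit as T.
    """
    if d <= 0:
        raise ValueError("disparity must be positive to give a finite depth")
    return f * T / d


# Hypothetical rig: f = 500 px, T = 0.1 m (assumed values for illustration).
z_near = depth_from_disparity(50, f=500, T=0.1)  # large disparity, near point
z_far = depth_from_disparity(5, f=500, T=0.1)    # small disparity, far point
print(z_near, z_far)
```

A larger disparity gives a smaller depth, which is why the brightest values in a disparity map mark the nearest points.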


Disparity Image

• Disparity map/image
  – The disparities of all the pixels in the image
• Example: left and right camera images with their left and right disparity maps
  – d = 255: nearest
  – d = 0: farthest

Disparity map of the 4x4 block (the figure marks the entry 110 as "110 pixels"):

 0    0    0    0
 0    0  110    0
 0  100  138    0
80  123  156  176


How to find the disparity of a pixel? (1/2)

• Simple local method
  – Block matching
    • SAD (Sum of Absolute Differences): ∑|IL-IR|
    • Find the candidate disparity with the minimal SAD
  – Assumption
    • Disparities within a block should be the same
  – Limitations
    • Works poorly in texture-less regions
    • Works poorly on repeating patterns

Example: the 3x3 block in the left image is

  0    0    0
  0  100    0
200  300    0

Candidate blocks in the right image:
d = k-1: SAD = 400
d = k:   SAD = 0 (best match)
d = k+1: SAD = 600
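Block matching with SAD can be sketched as follows. The left image reuses the 3x3 block from the example; the right image, the search direction (the right-image block sits at column x - d), and all function names are assumptions made for this sketch, so the SAD values of the non-matching candidates are not meant to reproduce the figure.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def block(image, top, left, size):
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in image[top:top + size]]


def best_disparity(left_img, right_img, top, left, size, max_d):
    """Return (disparity, SAD) of the candidate with the minimal SAD."""
    ref = block(left_img, top, left, size)
    candidates = []
    for d in range(max_d + 1):
        if left - d < 0:          # candidate block would leave the image
            break
        candidates.append((sad(ref, block(right_img, top, left - d, size)), d))
    cost, d = min(candidates)
    return d, cost


# Left image containing the 3x3 block of the example at columns 3..5.
left_img = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 100, 0],
    [0, 0, 0, 200, 300, 0],
]
# Hypothetical right image: the same content shifted 2 pixels to the left.
right_img = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 100, 0, 0, 0],
    [0, 200, 300, 0, 0, 0],
]
result = best_disparity(left_img, right_img, top=0, left=3, size=3, max_d=3)
print(result)
```

In a texture-less region every candidate would give nearly the same SAD, which is the limitation noted above.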


How to find the disparity of a pixel? (2/2)

• Complex global methods
  – Graph-cut, belief propagation
• Disparity estimation → optimal labeling problem
  – Assign a label (disparity) to each pixel such that a given global energy is minimal
    • The energy is a function of the label set (disparity map/image)
    • The energy considers
      – Intensity similarity of the corresponding pixels
        » Example: Absolute Difference (AD), D=|IL-IR|
      – Disparity smoothness of neighboring pixels
        » Example: Potts model: if di≠dj, V=K; else, V=0

Example: the pixel "?" has four neighbors with disparities 0, 0, 16, and 32. Its smoothness cost for each candidate disparity is:
d=0:  V=2K
d=16: V=3K
d=32: V=3K
d=2:  V=4K
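The Potts example can be checked with a short sketch. The helper name `potts_cost` is an assumption; costs are expressed in units of K.

```python
def potts_cost(d, neighbor_disparities, K=1):
    """Potts smoothness cost: K for every neighbor whose disparity differs from d."""
    return K * sum(1 for n in neighbor_disparities if n != d)


neighbors = [0, 0, 16, 32]          # the four neighbors from the example
costs = {d: potts_cost(d, neighbors) for d in (0, 16, 32, 2)}
print(costs)                        # cost of each candidate, in units of K
```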


Swap and Expansion Moves

• Weak move
  – Modifies 1 label at a time
  – Standard move
• Strong move
  – Modifies multiple labels at a time
  – Proposed α-β swap and α-expansion moves

[Figure panels: initial labeling, standard move, α-β swap, α-expansion; energy E for the initial labeling, a weak move, and a strong move]

Strong moves give a better chance of reaching a lower local minimum.


4-connected structure

• Most common graph/MRF (BP) structure in stereo

[Figure: 2-variable graph-cut graph (labels: Source, Sink, α, α', D, D', V) and MRF in belief propagation (labels: observable nodes, hidden nodes, D, V), where D and V are vectors]


Hierarchical Exhaustive Search on


Outline

• Combinatorial Optimization
• Graph-Cut
• Exhaustive Search
• Iterated Conditional Modes
• Hierarchical Exhaustive Search
• Result
• Summary & Next Step


Combinatorial Optimization

• Determine a combination (pattern, set of labels) such that the energy of this combination is minimal
• Example: 4-bit binary label problem
  – Find the label set that yields the minimal energy
    • Each individual bit can be set to 0 or 1
      – Each label has an energy cost
    • Neighboring bits should preferably have the same label (smoothness)

Bit                 0     1     2     3
Cost of label 0    99    92   100   101
Cost of label 1   100    79   114    98
Cost for each neighboring pair with different labels: 10

Energy(0000) = 99+92+100+101 = 392
Energy(0001) = 99+92+100+98+10 = 399


Graph-Cut

• Formulate the previous problem as a graph-cut problem
  – Find the cut with the minimum total capacity (cost, energy)
• Solving the graph-cut: Ford-Fulkerson method

[Figure: flow graph of the 4-bit problem (bits 0 to 3) with its link capacities and the flow pushed along each path]

Total flow pushed = 99+79+100+98+1+10+3 = 390 = max flow (energy of the cut 1100)
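The max-flow value of 390 can be reproduced with a small solver. The sketch below uses Edmonds-Karp, which is the Ford-Fulkerson method with augmenting paths chosen by BFS; the graph layout (source links carry the cost of label 1, sink links the cost of label 0, neighbor links carry 10 in both directions) is one possible construction assumed here, not necessarily the one drawn on the slide.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp max flow. Returns (flow value, nodes on the source side of the min cut)."""
    # residual[u][v] = remaining capacity of edge u -> v
    residual = {u: dict(edges) for u, edges in capacity.items()}
    for u, edges in capacity.items():
        for v in edges:
            residual.setdefault(v, {}).setdefault(u, 0)  # reverse edge
    flow = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break  # no augmenting path left: the flow is maximal
        # Find the bottleneck capacity along the path
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push the bottleneck flow along the path
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck
    return flow, set(parent)  # the last BFS reached exactly the source side


cost0 = [99, 92, 100, 101]   # cost of label 0 for bits 0..3
cost1 = [100, 79, 114, 98]   # cost of label 1 for bits 0..3
K = 10                       # smoothness cost between neighboring bits

# Assumed convention: a bit left on the source side takes label 0.
capacity = {"s": {i: cost1[i] for i in range(4)}}
for i in range(4):
    capacity[i] = {"t": cost0[i]}
    for j in (i - 1, i + 1):
        if 0 <= j < 4:
            capacity[i][j] = K

flow, source_side = max_flow(capacity, "s", "t")
labels = [0 if i in source_side else 1 for i in range(4)]
print(flow, labels)
```

By the max-flow/min-cut theorem the flow value equals the energy of the best label set, so no enumeration of the 16 combinations is needed.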


Exhaustive Search

• List all the combinations and their corresponding energies
• Example: 1100 has the minimal energy of 390

Label set  Energy    Label set  Energy
0000       392       1000       403
0001       399       1001       410
0010       426       1010       437
0011       413       1011       424
0100       399       1100       390
0101       406       1101       397
0110       413       1110       404
0111       400       1111       391
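The exhaustive search over the 4-bit problem is small enough to run directly. This sketch uses the per-bit costs and the smoothness cost of 10 from the example; the names `COST`, `energy`, and `table` are illustrative.

```python
from itertools import product

# (cost of label 0, cost of label 1) for bits 0..3, taken from the example
COST = [(99, 100), (92, 79), (100, 114), (101, 98)]
K = 10  # cost for each neighboring pair with different labels


def energy(labels):
    """Data cost of every bit plus K for each neighboring pair that differs."""
    data = sum(COST[i][label] for i, label in enumerate(labels))
    smooth = K * sum(a != b for a, b in zip(labels, labels[1:]))
    return data + smooth


# Enumerate all 2^4 label sets and keep the one with the lowest energy.
table = {"".join(map(str, labels)): energy(labels)
         for labels in product((0, 1), repeat=4)}
best = min(table, key=table.get)
print(best, table[best])
```

The number of combinations doubles with every added bit, which is why this approach only suits very small graphs.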


Iterated Conditional Modes

• Iteratively finds the best label under the current given condition
  – Greedy
  – Different starting decisions (initial conditions) lead to different results
  – Can end in a local minimum
• Example:
  – Start with bit 1 because it is more reliable
  – Iteration order: bit 1 → bit 0 → bit 2 → bit 3
    • Bit 1: 79 (1) < 92 (0), so bit 1 = 1
    • Bit 0: 100 (1) < 99+10 (0), so bit 0 = 1
    • Bit 2: 100+10 (0) < 114 (1), so bit 2 = 0
    • Bit 3: 101 (0) < 98+10 (1), so bit 3 = 0
  – Final solution: 1100
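The greedy pass of the example can be written as a short loop. This is a sketch under assumptions: bits start unlabeled, unlabeled neighbors contribute no smoothness cost, and sweeps repeat in the given order until no label changes; the function name `icm` and the `max_sweeps` limit are illustrative.

```python
def icm(cost, K, order, max_sweeps=10):
    """Iterated conditional modes on a 1-D chain of binary labels."""
    labels = [None] * len(cost)          # None = not decided yet
    for _ in range(max_sweeps):
        changed = False
        for i in order:
            def local_energy(label):
                # Data cost plus K for each already labeled neighbor that differs
                e = cost[i][label]
                for j in (i - 1, i + 1):
                    if 0 <= j < len(cost) and labels[j] is not None and labels[j] != label:
                        e += K
                return e
            best = min((0, 1), key=local_energy)
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break                         # converged: no label changed in this sweep
    return labels


# (cost of label 0, cost of label 1) for bits 0..3, as in the example
cost = [(99, 100), (92, 79), (100, 114), (101, 98)]
solution = icm(cost, K=10, order=[1, 0, 2, 3])
print(solution)
```

Here the greedy result happens to coincide with the exhaustive-search optimum; with another starting bit or other costs it may stop at a worse labeling.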


Exhaustive Search Engine

• Exhaustive search can be implemented in hardware
• Less sequential dependency
• Not suitable for graphs larger than 4x4

[Figure: exhaustive search engine block diagram with a pattern generator, a DV table, K sets of energy computation units for pattern 0 to pattern (K-1) (inputs D0…Dn and V01…V(n-1)(n), adders, CMP), a CMP array, and post comparison]

Result of fully connected graph, NOT 4-connected graph


Hierarchical Graph-Cut?

• Solve a large-n graph hierarchically with multiple small-n GCEs
• Example:
  – Solve n=16 with 4+1 n=4 graph-cuts
  – For each sub-graph (sub-graphs 0 to 3), find the best 2 label sets
  – For each sub-graph vertex (0 to 3): label 0 = 1st label set, label 1 = 2nd label set

Assumption: !! The optimal solution must be within the combinations of sub-graph label sets !!


HGC Speed up Evaluation

• For an 8-point GCE with 8 sets of ECUs
  – Cost: 300 eq. adders
  – Latency: 41 cycles per graph
• If only 1 GCE is used to compute a 64-point 2-variable graph-cut:

Latency = 41 cycles x 8 + 41 cycles + TV = 369 cycles + TV

If V is computed for each pixelsTv=(8x8)X(8x7/2)X4=3584

Total latency ~ 3953 cycles

Question: Is this solution the optimal label set for n=64?


Hierarchical Exhaustive Search

• 64x64 nodes
  – 4x4 based pyramid structure
  – 3 levels

Level 2: D@lv2 = E0/E1@lv1, Label0@lv2 = pat0@lv1, Label1@lv2 = pat1@lv1
Level 1: D@lv1 = E0/E1@lv0, Label0@lv1 = pat0@lv0, Label1@lv1 = pat1@lv0
Level 0: D@lv0 = D0/D1@lv0, Label0@lv0 = Label0, Label1@lv0 = Label1

pat0 is the best candidate pattern
pat1 is the 2nd best candidate pattern


Computing V term at Level 1

• For 1st-order neighboring sub-graphs Gi and Gj
  – Possible neighboring pair combinations
    • (pat0i, pat0j)
    • (pat0i, pat1j)
    • (pat1i, pat0j)
    • (pat1i, pat1j)
• Compute V(patXi, patXj) with the original neighboring cost
  – Example:
    • V(pat0i, pat0j) = K
    • V(pat0i, pat1j) = K+K+K = 3K

Boundary labels in the example (right column of Gi next to left column of Gj, top to bottom):
pat0i: 0 0 0 1
pat0j: 0 0 1 1
pat1j: 1 0 1 0
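The level-1 V term of the example follows from comparing the two boundary columns. A minimal sketch, assuming a Potts cost of K per differing boundary pair; the function name `boundary_cost` is illustrative and the result is in units of K.

```python
def boundary_cost(right_col_of_gi, left_col_of_gj, K=1):
    """V between two sub-graph patterns: K per differing pair along the shared boundary."""
    return K * sum(a != b for a, b in zip(right_col_of_gi, left_col_of_gj))


# Boundary columns read from the example, top to bottom
pat0i_right = [0, 0, 0, 1]
pat0j_left = [0, 0, 1, 1]
pat1j_left = [1, 0, 1, 0]

v_00 = boundary_cost(pat0i_right, pat0j_left)   # V(pat0i, pat0j)
v_01 = boundary_cost(pat0i_right, pat1j_left)   # V(pat0i, pat1j)
print(v_00, v_01)
```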


Result of 16x16 (256) 2 level HES

• 100 randomly generated graphs
  – D/V ~ 10
  – Symmetric V = 20
• Error rate
  – Max: 17/256 ~ 6.6%
  – Average: 7/256 ~ 2.7%
  – Min: 2/256 ~ 0.8%


Result of 64x64 (4096) 3 level HES

• 100 randomly generated graphs
  – D/V ~ 10
  – Symmetric V = 20
• Error rate
  – Max: 185/4096 ~ 4.5%
  – Average: 146/4096 ~ 3.6%
  – Min: 115/4096 ~ 2.8%


Death Sentence to HES


Error Rate vs. Graph Size

• (D,V) = (~163:20)

Graph size          Avg. energy increase %   Avg. error rate %     Error rate std. dev. %   Min error rate %   Max error rate %
16x16 (2 level)     0.20                     2.74 (7/256)          1.43                     0.39               8.59
64x64 (3 level)     0.25                     3.63 (149/4096)       0.40                     2.58               4.54
256x256 (4 level)   0.28                     3.65 (2393/65536)     0.09                     3.36               3.89

• 3.63 vs. 3.65: the error rate did not increase significantly
• The error rate range became smaller


Impact of different V cost

• 64x64 (3 level) HES
  – 100 patterns per V cost value
• D cost (average over s-link caps of 10 patterns, 2 for each V)
  – Average: 162.8
  – Std. dev.: 94.4
• V cost
  – 10, 20, 40, 60, 80

K     Error rate   Energy difference
10    1.53%        0.05%
20    3.63%        0.25%
40    8.75%        1.15%
60    14.55%       2.63%
80    20.95%       4.50%

[Figures: "HGCE vs. GCE Error Percentage" (error rate vs. K value) and "HGCE vs. GCE Energy Difference Percentage" (energy difference vs. K value); a second error-rate plot is labeled "256x256 1 pattern result"]


Stereo Matching Case

• Stereo pair: Tsukuba
• Expansion with random label order
  – 15 labels → 15 graph-cut computations
• Graph size: 256 x 256
• D term: truncated Sum of Squared Error (tSSE)
  – Truncated at AD=20
• V term: Potts model
  – K=20


1st iteration result

• Error rate might exceed 20% for important expansion moves

Label (α)   Error rate (%)   Energy difference (%)
0           0.62             12.7
1           1.07             16.1
2           0.00             0.0
3           2.76             24.7
4           21.59            38.7
5           22.91            44.2
6           9.21             32.1
7           7.83             40.0
8           5.01             28.3
9           12.01            42.0
10          5.55             32.2
11          5.18             30.5
12          5.33             31.3
13          7.07             34.6
14          2.98             23.2

Important expansions: labels 4, 5, and 9

[Figure: BnK's expansion result]


Reason for failure

• The best 2 local candidates do NOT include the final optimal solution
  – Errors often happen near lv2 and lv3 block boundaries
    • The majority of nodes have 0 capacity on both the source and sink links
    • Their labels depend more on the neighboring nodes' labels
  – D:V ratio ~ 56:20 ≈ 2.8:1
    • Similar to the D:V = 163:60 case
    • Error rate for random patterns ~ 15%

Best 2 patterns in does NOT consider the pattern of


Partitioned (Block) Graph-Cut


Motivation

• Global
  – Considers the whole picture
  – More information
• Local
  – Considers a limited region of a picture
  – Less information

Is it necessary to use that much information in global methods?


Concept

• Original full GC
  – 1 big graph
• Partitioned GC
  – N smaller graphs

What’s the smallest possible partition to achieve the same performance?


Experiment Setting

• Energy
  – D term
    • Luma only
    • Birchfield-Tomasi cost (best result at half-pel position)
    • Square error
  – V term
    • Potts model: V = K x T(di≠dj)
    • The constant K is the same for all partitions
• Partition size
  – 4x4, 16x16, 32x32, 64x64, 128x128
• Stereo pairs
  – Tsukuba, Teddy, Cones, Venus


Tsukuba 4x4, 16x16, 32x32, 64x64

[Figure panels: 4x4, 16x16, 32x32, 64x64]


Tsukuba 96x96, 128x128

[Figure panels: Full GC, 96x96, 128x128]


Venus 32x32, 64x64

[Figure panels: 32x32, 64x64]


Venus 96x96, 128x128

[Figure panels: Full GC, 96x96, 128x128]


Teddy 32x32, 64x64

[Figure panels: 32x32, 64x64]


Teddy 96x96, 128x128

[Figure panels: Full GC, 96x96, 128x128]


Cones 32x32, 64x64

[Figure panels: 32x32, 64x64]


Cones 96x96, 128x128

[Figure panels: Full GC, 96x96, 128x128]


Middlebury Result

Each cell lists nonocc / all / disc.

Block size   Tsukuba              Venus                Teddy                Cones
Best         6.0 / 6.8 / 24.7     2.9 / 4.4 / 19.4     14.5 / 22.8 / 29.1   11.2 / 20.9 / 20.9
Full         8.6 / 9.4 / 27.4     3.3 / 4.7 / 18.4     14.5 / 22.8 / 29.1   14.5 / 23.8 / 24.2
32           16.2 / 17.0 / 33.7   24.0 / 24.9 / 29.6   27.6 / 34.9 / 35.2   24.4 / 32.7 / 30.2
64           10.6 / 11.5 / 29.6   10.0 / 11.3 / 19.5   19.1 / 27.1 / 29.8   19.2 / 28.0 / 27.3
96           9.4 / 10.2 / 28.7    9.1 / 10.4 / 20.5    16.3 / 24.7 / 29.5   15.1 / 24.3 / 24.7
128          8.8 / 9.5 / 27.3     8.4 / 9.6 / 21.5     15.2 / 23.5 / 28.6   14.6 / 23.9 / 24.0

Best: Full GC with the best parameter
Full: Full GC with k=20 (Tsukuba) and 60 (others)

Evaluation web page: http://cat.middlebury.edu/stereo/


Summary

• Smallest possible partition size (2% accuracy drop)
  – Tsukuba: 64x64
  – Teddy & Cones: 96x96
  – Venus: larger than 128x128
• Benefits
  – Possible complexity or storage reduction
  – Parallelism increase
• Drawbacks
  – Performance (disparity accuracy) drop
  – PC computation becomes longer
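The partition sizes in the summary can be recovered from the Middlebury result table. The sketch below assumes that the "2% accuracy drop" compares the nonocc column of each partition size with the "Full" row; the slides do not state which column was used. Values are stored in tenths of a percent to keep the arithmetic exact, and all names are illustrative.

```python
# nonocc error in tenths of a percent, copied from the result table
FULL = {"Tsukuba": 86, "Venus": 33, "Teddy": 145, "Cones": 145}
PARTITIONED = {
    32:  {"Tsukuba": 162, "Venus": 240, "Teddy": 276, "Cones": 244},
    64:  {"Tsukuba": 106, "Venus": 100, "Teddy": 191, "Cones": 192},
    96:  {"Tsukuba": 94,  "Venus": 91,  "Teddy": 163, "Cones": 151},
    128: {"Tsukuba": 88,  "Venus": 84,  "Teddy": 152, "Cones": 146},
}


def smallest_partition(pair, max_drop=20):
    """Smallest block size whose error exceeds full GC by at most max_drop (tenths of %)."""
    for size in sorted(PARTITIONED):
        if PARTITIONED[size][pair] - FULL[pair] <= max_drop:
            return size
    return None  # no tested size is close enough


summary = {pair: smallest_partition(pair) for pair in FULL}
print(summary)
```

Under this reading Venus returns `None`, matching "larger than 128x128".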


Hierarchical Parallel Graph-Cut


Concept of Hierarchical Parallel GC

• Bottom up
  – Solve graph-cut for smaller subgraphs
  – Solve graph-cut for larger subgraphs
    • Larger subgraph = set of neighboring smaller subgraphs

[Figure: Level 0 subgraphs sg0, sg1, sg2, sg3; Level 1 larger subgraph = sg0+sg1+sg2+sg3]

!! Each subgraph is temporarily independent !!


HPGC for solving a 256x256 graph

Step 1: 64 32x32 Lv0 subgraphs
Step 2: 16 64x64 Lv1 subgraphs
Step 3: 4 128x128 Lv2 subgraphs
Step 4: 1 256x256 Lv3 subgraph

Total graph-cut computations = 64+16+4+1 = 85

!! HPGC must use Ford-Fulkerson-based methods !!
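The step counts follow from doubling the subgraph side at each level. A minimal sketch; the function name `hpgc_levels` is an assumption, and square power-of-two sizes are assumed.

```python
def hpgc_levels(graph_size, lv0_size):
    """List (subgraph side, number of subgraphs) for each level, bottom up."""
    levels = []
    size = lv0_size
    while size <= graph_size:
        per_side = graph_size // size          # subgraphs along one image side
        levels.append((size, per_side * per_side))
        size *= 2                              # each level merges 2x2 subgraphs
    return levels


levels = hpgc_levels(256, 32)
total = sum(count for _, count in levels)
print(levels, total)
```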


Boykov and Kolmogorov’s Motivation

• Dinic method
  – Search for the shortest augmenting path
  – Use Breadth-First Search (BFS)
• Example:
  – Search for shortest paths (length = k)
    • Use BFS, expand the search tree
    • Find all paths of length k
  – Search for shortest paths (length = k+1)
    • Use BFS, RE-expand the search tree again
    • Find all paths of length (k+1)
  – Search for shortest paths (length = k+2)
    • Use BFS, RE-RE-expand the search tree again
    • …

[Figure: example graph with edges labeled 1]

Why don't we REUSE the expanded tree?


BnK’s Method

• Concept:
  – Reuse the already expanded trees
  – Avoid re-expanding the trees from scratch
• 3 stages
  – Growth
    • Grow the search tree
  – Augmentation
    • Ford-Fulkerson style augmentation
  – Adoption
    • Reconnect the unconnected sub-trees
    • Connect the orphans to a new parent

[Figure panels: Augmenting Path, Saturate Critical Edge, Adopt Orphans]


Feature of BnK method

• Based on Ford-Fulkerson
  – Bidirectional search tree construction
  – Searched tree reuse
  – Determine the label (source or sink) using tree connectivity

[Figure: source tree and sink tree]


Connectivity is why HPGC works

• Example: a 2x4 binary variable graph

Graph view

Tree view

Case 2

Case 3

Case 4

Case 1


Connectivity of the various cases

Case 2

Case 3

Case 4


How to add edges

• When should nodes A and B check their edge?
  – If A and B belong to different search trees
    • A is in a sink tree, B is in a source tree
    • A is in a source tree, B is in a sink tree
    • Implies a source->sink path
  – If A or B is an orphan (not connected to any tree)
    • A is an orphan, B is not an orphan
    • A is not an orphan, B is an orphan
    • Check for possible connectivity of the orphan

[Figure: neighboring nodes A and B]
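The two conditions above can be expressed as a small predicate. This is a sketch only: representing tree membership as the strings "source" and "sink" with `None` for an orphan is an assumption made for illustration, not the data structure of the actual software model.

```python
SOURCE, SINK, ORPHAN = "source", "sink", None


def should_check_edge(tree_a, tree_b):
    """Decide whether neighboring nodes A and B need to examine their shared edge."""
    # Case 1: one node in the source tree, the other in the sink tree
    different_trees = {tree_a, tree_b} == {SOURCE, SINK}
    # Case 2: exactly one of the two nodes is an orphan
    one_orphan = (tree_a is ORPHAN) != (tree_b is ORPHAN)
    return different_trees or one_orphan


print(should_check_edge(SOURCE, SINK))    # implies a source->sink path
print(should_check_edge(ORPHAN, SINK))    # the orphan may reconnect through B
print(should_check_edge(SOURCE, SOURCE))  # same tree, nothing to check
```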


Complexity Result

• Method
  – Annotate each line of code with basic operations
    • Read (R)
    • Write (W)
    • Arithmetic (A)
    • Logic (L)
    • Compare (C)
    • Branch (B)
• Examples
  – C=A+B → 2R, 1W, 1A
  – If(A==B) → 2R, 1C, 1B
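The annotation method amounts to summing per-statement operation counts over an execution trace. A minimal sketch using the two annotated examples; the trace itself and the names `COSTS` and `tally` are hypothetical.

```python
from collections import Counter

# Operation counts per statement, as annotated in the examples
COSTS = {
    "C=A+B": Counter(R=2, W=1, A=1),
    "if(A==B)": Counter(R=2, C=1, B=1),
}


def tally(trace):
    """Sum the annotated operation counts over a sequence of executed statements."""
    total = Counter()
    for statement in trace:
        total += COSTS[statement]
    return total


# Hypothetical trace: two additions and one comparison
profile = tally(["C=A+B", "if(A==B)", "C=A+B"])
memory_share = (profile["R"] + profile["W"]) / sum(profile.values())
print(dict(profile), memory_share)
```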


Stereo Matching Case

• Stereo pair: Tsukuba
• Expansion with random label order
  – 15 labels → 15 graph-cut computations
• Graph size: 256 x 256
• D term: truncated Sum of Squared Error (tSSE)
  – Truncated at AD=20
• V term: Potts model
  – K=20


1st iteration result

• Labels 4, 5, and 9 are key moves

Label (α)   Error rate (%)   Energy difference (%)
0           0.62             12.7
1           1.07             16.1
2           0.00             0.0
3           2.76             24.7
4           21.59            38.7
5           22.91            44.2
6           9.21             32.1
7           7.83             40.0
8           5.01             28.3
9           12.01            42.0
10          5.55             32.2
11          5.18             30.5
12          5.33             31.3
13          7.07             34.6
14          2.98             23.2

Important expansions: labels 4, 5, and 9

[Figure: BnK's expansion result]


Full BnK Graph-cut Operation Distribution

• 256x256 graph
  – Tsukuba, iteration 0, label 5
  – 77,407,307 operations

Operation distribution:
Read (R)         39%
Write (W)        18%
Arithmetic (A)    3%
Logic (L)         2%
Compare (C)      20%
Branch (B)       18%

• Memory access is dominant
• Control ~22%
• Arithmetic is insignificant


Full GC vs. HPGC

• 256x256 graph
  – Tsukuba, iteration 0, label 5

[Figure: "Operation vs. PE Utilization": operation count (this case, worst case; reference 77,407,307 operations) and overall PE utilization for 4, 8, 16, 32, and 64 PEs]

Overall PE utilization:
4 PE    96.59%
8 PE    88.54%
16 PE   75.89%
32 PE   53.13%
64 PE   33.20%


Conclusion

• HPGC can improve speed with multiple PEs
• To perform 30 fps, 30 labels, 256x256 graph-cut
  – 1 PE @ 100 MHz
  – Average cycle budget for each subgraph ~1.3K cycles
    • Lv0 subgraph is 32x32
• Next step
  – Small BnK graph-cut engine architecture design
    • Estimate speed/cost
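One way to arrive at a budget of roughly 1.3K cycles is sketched below. It assumes one expansion (one hierarchical graph-cut) per label per frame and 64+16+4+1 = 85 subgraph computations per graph-cut; the slides do not spell out this derivation, so treat it as a plausible reading rather than the author's calculation.

```python
def cycle_budget(clock_hz, fps, labels, subgraphs_per_graph_cut):
    """Average number of cycles available for one subgraph computation on one PE."""
    subgraph_cuts_per_second = fps * labels * subgraphs_per_graph_cut
    return clock_hz / subgraph_cuts_per_second


# 1 PE at 100 MHz, 30 fps, 30 labels, 85 subgraphs for a 256x256 graph
budget = cycle_budget(100e6, fps=30, labels=30, subgraphs_per_graph_cut=64 + 16 + 4 + 1)
print(round(budget))
```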


Progress Check

• Previous plan
  – Parallel graph-cut engine for binary-variable graphs
    • Based on Boykov and Kolmogorov's graph-cut algorithm
  – Complexity analysis: done
  – Hierarchical parallel algorithm SW model: done
  – Small BnK graph-cut engine architecture design: next 2 weeks
    • Based on Boykov and Kolmogorov's algorithm
  – Hierarchical parallel graph-cut engine architecture design: June/July
    • Based on my hierarchical parallel algorithm, modified from BnK's algorithm