Research on Graph-Cut for Stereo Vision
Research on Graph-Cut for Stereo Vision. Presenter: Nelson Chang, Institute of Electronics, National Chiao Tung University. Outline: Research Overview; Brief Review of Stereo Vision; Hierarchical Exhaustive Search; Partitioned Graph-Cut for Stereo Vision; Hierarchical Parallel Graph-Cut.
112/04/19
VLSI Signal Processing Lab, Institute of Electronics, National Chiao Tung University, Hsinchu, Taiwan
Platform-Based Design Group
Research on Graph-Cut for Stereo Vision
Presenter: Nelson Chang
Institute of Electronics, National Chiao Tung University
Outline
• Research Overview
• Brief Review of Stereo Vision
• Hierarchical Exhaustive Search
• Partitioned Graph-Cut for Stereo Vision
• Hierarchical Parallel Graph-Cut
Our Research
• A fast vision system for robotics
  – Stereo vision
    • Local block-based + diffusion (M)
    • Graph-cut (PhD)
    • Belief propagation (PhD)
  – Segmentation
    • Watershed (M)
    • Meanshift
• Approaches
  – Embedded solutions
    • DSP (U)
    • ASIC
  – PC-based solutions
    • Dual webcam stereo (U)
HRP-2 Head
HRP-2 Tri-Camera Head
My Research
• A fast graph-cut VLSI engine for stereo vision
  – ASIC approach
  – Goal: 256x256 pixels, 30 depth labels, 30 fps
• Stereo vision system prototypes
  – PC-based
  – DSP-based
  – FPGA/ASIC-based
Review on Stereo Vision
Concept of Stereo Vision
• Computational stereo: determine the 3-D structure of a scene from 2 or more images taken from distinct viewpoints.
$$Z = \frac{fT}{d}, \qquad d = x_p - x_{p'}$$
d: disparity, Z: depth, T: baseline, f: focal length
Triangulation of non-verged geometry
M. Z. Brown et al., “Advances in Computational Stereo,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 8, August 2003.
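The triangulation relation for non-verged geometry can be sketched in a few lines. This is a minimal illustration, assuming the standard relation Z = f·T/d; the function name and the sample focal length and baseline are hypothetical and not taken from the slides.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline):
    """Depth Z = f * T / d for a non-verged (parallel-axis) stereo rig.

    disparity_px and focal_length_px are in pixels; the returned depth
    has the same unit as the baseline.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline / disparity_px


if __name__ == "__main__":
    # Hypothetical rig: f = 500 px, T = 0.1 m. Larger disparity means nearer.
    for d in (5, 10, 50):
        print(d, depth_from_disparity(d, 500.0, 0.1))
```

Depth is inversely proportional to disparity, which is why a disparity map can be displayed directly as a nearness image.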
Disparity Image
• Disparity map/image
  – The disparities of all the pixels in the image
• Example: left camera and right camera images with the left and right disparity maps
  – d = 255: nearest
  – d = 0: farthest
Disparity map of the 4x4 block (the highlighted pixel has a disparity of 110 pixels):
  0    0    0    0
  0    0  110    0
  0  100  138    0
 80  123  156  176
How to find the disparity of a pixel? (1/2)
• Simple local method
  – Block matching
    • SAD (Sum of Absolute Differences): ∑|IL-IR|
    • Find the candidate disparity with minimal SAD
  – Assumption
    • Disparities within a block should be the same
  – Limitation
    • Performs poorly in texture-less regions
    • Performs poorly in repeating patterns
Figure: a 3x3 left block with rows (0 0 0), (0 100 0), (200 300 0) compared against right blocks at candidate disparities: d = k-1 gives SAD = 400, d = k gives SAD = 0, d = k+1 gives SAD = 600.
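Block matching with SAD can be sketched as follows. This is a minimal illustration, not the presenter's implementation: the function names, the tiny synthetic images, and the convention that a left pixel at column c matches the right pixel at column c - d are all assumptions made for the example.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def crop(image, row, col, size):
    """Extract a size x size block whose top-left corner is (row, col)."""
    return [line[col:col + size] for line in image[row:row + size]]


def best_disparity(left, right, row, col, size, max_disp):
    """Return (disparity, SAD) of the candidate with minimal SAD."""
    reference = crop(left, row, col, size)
    best = None
    for d in range(max_disp + 1):
        if col - d < 0:          # candidate block would leave the image
            break
        cost = sad(reference, crop(right, row, col - d, size))
        if best is None or cost < best[1]:
            best = (d, cost)
    return best


# Synthetic pair: the right image is the left image shifted by 2 pixels.
LEFT = [[0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 100, 0],
        [0, 0, 0, 200, 300, 0]]
RIGHT = [[0, 0, 0, 0, 0, 0],
         [0, 0, 100, 0, 0, 0],
         [0, 200, 300, 0, 0, 0]]

if __name__ == "__main__":
    print(best_disparity(LEFT, RIGHT, 0, 3, 3, 3))
```

In a texture-less block every candidate has nearly the same SAD, so the minimum is ambiguous; that is the limitation listed on the slide.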
How to find the disparity of a pixel? (2/2)
• Complex global method
  – Graph-cut, belief propagation
• Disparity estimation → optimal labeling problem
  – Assign the label (disparity) of each pixel such that a given global energy is minimal
    • Energy is a function of the label set (disparity map/image)
    • The energy considers
      – Intensity similarity of the corresponding pixels
        » Example: Absolute Difference (AD), D = |IL-IR|
      – Disparity smoothness of neighboring pixels
        » Example: Potts model: if two neighboring pixels have different disparities (dL≠dR), V = K; else, V = 0
Figure: a pixel with unknown disparity "?" whose four neighbors have disparities 0, 0, 16, and 32. Potts cost per candidate: d = 0 gives V = 2K, d = 16 gives V = 3K, d = 32 gives V = 3K, d = 2 gives V = 4K.
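The Potts smoothness cost in the example can be checked with a short sketch. The function name is hypothetical; the neighbor disparities (0, 0, 16, 32) and the candidate labels are the ones shown on the slide, and K is set to 1 so the result reads as a multiple of K.

```python
def potts_cost(center_label, neighbor_labels, k):
    """Potts model: pay k for every neighbor whose label differs."""
    return k * sum(1 for label in neighbor_labels if label != center_label)


NEIGHBORS = [0, 0, 16, 32]   # the four neighbors of the unknown pixel
K = 1                        # cost unit, so results are multiples of K

if __name__ == "__main__":
    for candidate in (0, 16, 32, 2):
        print(candidate, potts_cost(candidate, NEIGHBORS, K))
```

The Potts model only asks whether two labels differ, not by how much, so d = 2 is penalized more than d = 0 even though it is numerically close to it.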
Swap and Expansion Moves
• Weak move
  – Modifies 1 label at a time
  – Standard move
• Strong move
  – Modifies multiple labels at a time
  – Proposed swap and expansion moves
Figure: initial labeling, standard move, α-β swap, and α-expansion.
Figure: energy E of the initial, weak-move, and strong-move labelings.
Strong moves have more chances of reaching a better local minimum.
4-connected structure
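• Most common graph/MRF (BP) structure in stereo
Figure: 2-variable graph-cut, with source and sink terminals (α, α’), terminal links D and D’, and neighbor links V.
Figure: MRF in belief propagation, with observable nodes, hidden nodes, and D and V links; D and V are vectors.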
Hierarchical Exhaustive Search on
Outline
• Combinatorial Optimization
• Graph-Cut
• Exhaustive Search
• Iterated Conditional Modes
• Hierarchical Exhaustive Search
• Result
• Summary & Next Step
Combinatorial Optimization
• Determine a combination (pattern, set of labels) such that the energy of this combination is minimum
• Example: 4-bit binary label problem
  – Find a label set which yields the minimal energy
    • Each individual bit can be set to 0 or 1
    • Each label corresponds to an energy cost
    • Each neighboring bit pair should preferably have the same label (smoothness)
Bit                  0     1     2     3
Cost of label 0     99    92   100   101
Cost of label 1    100    79   114    98
Smoothness cost between neighboring bits with different labels: 10
Energy(0000) = 99+92+100+101 = 392
Energy(0001) = 99+92+100+98+10 = 399
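The energy of a label set in this 4-bit example can be written as a small function. This is a minimal sketch using the costs from the slide; the names `COST`, `SMOOTH`, and `energy` are illustrative.

```python
# (cost of label 0, cost of label 1) for bits 0..3, taken from the slide.
COST = [(99, 100), (92, 79), (100, 114), (101, 98)]
SMOOTH = 10  # penalty when two neighboring bits take different labels


def energy(labels):
    """Data cost of every bit plus smoothness cost of every neighboring pair."""
    data = sum(COST[i][label] for i, label in enumerate(labels))
    smooth = SMOOTH * sum(1 for a, b in zip(labels, labels[1:]) if a != b)
    return data + smooth


if __name__ == "__main__":
    print(energy([0, 0, 0, 0]))  # 392
    print(energy([0, 0, 0, 1]))  # 399
```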
Graph-Cut
• Formulate the previous problem as a graph-cut problem
  – Find the cut with minimum total capacity (cost, energy)
• Solving the graph-cut: Ford-Fulkerson method
Figure: flow network for the 4-bit example, with terminal-link capacities 99, 92, 100, 101 and 100, 79, 114, 98, neighbor-link capacity 10, and the residual capacities left after augmentation.
Total flow pushed = 99+79+100+98+1+10+3 = 390 = max flow (energy of the cut 1100)
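The max-flow value for the 4-bit example can be reproduced with a small solver. The sketch below uses Edmonds-Karp (the Ford-Fulkerson method with BFS path selection) rather than any specific variant from the slides; the graph construction, with the source link carrying the label-1 cost and the sink link carrying the label-0 cost, is one common convention and is an assumption here.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp: augment along shortest residual paths found by BFS."""
    n = len(capacity)
    residual = [row[:] for row in capacity]
    total = 0
    while True:
        parent = [-1] * n
        parent[source] = source
        queue = deque([source])
        while queue and parent[sink] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[sink] == -1:       # no augmenting path left
            return total
        # Find the bottleneck capacity along the path.
        bottleneck, v = float("inf"), sink
        while v != source:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push the flow and update residual capacities.
        v = sink
        while v != source:
            residual[parent[v]][v] -= bottleneck
            residual[v][parent[v]] += bottleneck
            v = parent[v]
        total += bottleneck


def build_chain_graph(cost0, cost1, smooth):
    """Nodes 0..n-1 are bits; node n is the source, node n+1 the sink."""
    n = len(cost0)
    s, t = n, n + 1
    cap = [[0] * (n + 2) for _ in range(n + 2)]
    for i in range(n):
        cap[s][i] = cost1[i]   # cut if bit i ends on the sink side (label 1)
        cap[i][t] = cost0[i]   # cut if bit i ends on the source side (label 0)
        if i + 1 < n:
            cap[i][i + 1] = smooth
            cap[i + 1][i] = smooth
    return cap, s, t


CAP, S, T = build_chain_graph([99, 92, 100, 101], [100, 79, 114, 98], 10)

if __name__ == "__main__":
    print(max_flow(CAP, S, T))
```

By the max-flow/min-cut theorem the returned value equals the minimum cut capacity, which for this construction is the minimum energy over all 16 label sets.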
Exhaustive Search
• List all the combinations and their corresponding energies
• Example: 1100 has the minimal energy of 390
Label set   Energy     Label set   Energy
0000        392        1000        403
0001        399        1001        410
0010        426        1010        437
0011        413        1011        424
0100        399        1100        390
0101        406        1101        397
0110        413        1110        404
0111        400        1111        391
Figure: the 4-bit example (label-0 costs 99, 92, 100, 101; label-1 costs 100, 79, 114, 98; smoothness cost 10).
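The exhaustive search over the 16 label sets can be sketched directly. The names below are illustrative; the costs are those of the 4-bit example, and the smoothness term assumes a chain where only adjacent bits interact.

```python
from itertools import product

COST0 = [99, 92, 100, 101]   # cost of assigning label 0 to bits 0..3
COST1 = [100, 79, 114, 98]   # cost of assigning label 1 to bits 0..3
SMOOTH = 10                  # penalty for differing neighboring bits


def energy(labels):
    data = sum(COST1[i] if b else COST0[i] for i, b in enumerate(labels))
    smooth = SMOOTH * sum(1 for a, b in zip(labels, labels[1:]) if a != b)
    return data + smooth


def exhaustive_search(n_bits):
    """Evaluate all 2**n_bits label sets and return (min energy, labels)."""
    return min((energy(labels), labels)
               for labels in product((0, 1), repeat=n_bits))


if __name__ == "__main__":
    print(exhaustive_search(4))
```

The cost grows as 2^n, which is why the following slides restrict exhaustive search to very small graphs.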
Iterated Conditional Modes
• Iteratively finds the best label under the current given condition
  – Greedy
  – Different starting decisions (initial conditions) give different results
  – Finds a local minimum, not necessarily the global one
• Example:
  – Start with bit 1 because it is more reliable
  – Iteration order: bit1 → bit0 → bit2 → bit3
  – Final solution: 1100
Steps for the 4-bit example (label-0 costs 99, 92, 100, 101; label-1 costs 100, 79, 114, 98; smoothness cost 10):
  bit 1: 79 (1) < 92 (0), so bit 1 = 1
  bit 0: 100 (1) < 99+10 (0), so bit 0 = 1
  bit 2: 100+10 (0) < 114 (1), so bit 2 = 0
  bit 3: 101 (0) < 98+10 (1), so bit 3 = 0
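The single greedy pass in the example can be sketched as below. This is a simplified illustration of Iterated Conditional Modes: it performs one pass in a given order and ignores neighbors that have not been decided yet, which is an assumption chosen to mirror the steps on the slide. The names are hypothetical.

```python
# (cost of label 0, cost of label 1) for bits 0..3, taken from the slide.
COST = [(99, 100), (92, 79), (100, 114), (101, 98)]
SMOOTH = 10


def icm_pass(order):
    """Decide each bit in the given order, conditioned on decided neighbors."""
    n = len(COST)
    labels = [None] * n
    for i in order:
        best_label, best_cost = None, None
        for candidate in (0, 1):
            cost = COST[i][candidate]
            for j in (i - 1, i + 1):
                decided = 0 <= j < n and labels[j] is not None
                if decided and labels[j] != candidate:
                    cost += SMOOTH
            if best_cost is None or cost < best_cost:
                best_label, best_cost = candidate, cost
        labels[i] = best_label
    return labels


if __name__ == "__main__":
    print(icm_pass([1, 0, 2, 3]))  # order used on the slide
    print(icm_pass([0, 1, 2, 3]))  # a different starting decision
```

Starting from bit 1 yields 1100 (energy 390), while starting from bit 0 yields 0100 (energy 399), which illustrates the dependence on the starting decision.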
Exhaustive Search Engine
• Exhaustive search can be implemented in hardware
• Less sequential dependency
• Not suitable for graphs larger than 4x4
Figure: exhaustive search engine block diagram. A pattern generator and a DV table (D0…Dn, V01…V(n-1)(n)) feed K sets of energy computation units (adder trees, one per pattern 0 … pattern (K-1)), whose outputs go to a post comparison array of comparators (CMP).
Result of fully connected graph, NOT 4-connected graph
Hierarchical Graph-Cut?
• Solve a large-n graph hierarchically with multiple small-n GCEs
• Example:
  – Solve n=16 with 4+1 n=4 graph-cuts
  – For each sub-graph, find the best 2 label sets
  – For each sub-graph vertex: Label 0 = 1st label set, Label 1 = 2nd label set
Figure: a 16-node graph split into Sub-graph 0, Sub-graph 1, Sub-graph 2, and Sub-graph 3, which become vertices 0, 1, 2, and 3 of the upper-level graph.
Assumption: !! The optimal solution must be within the combinations of sub-graph label sets !!
HGC Speed up Evaluation
• For an 8-point GCE with 8 sets of ECUs
  – Cost: 300 eq. adders
  – Latency: 41 cycles per graph
• If only 1 GCE is used to compute a 64-point 2-variable graph-cut
  – Latency = 41 cycles x 8 + 41 cycles + TV = 369 cycles + TV
  – If V is computed for each pixel: TV = (8x8) x (8x7/2) x 4 = 3584
  – Total latency ~ 3953 cycles
Question: Is this solution the optimal label set for n=64?
Hierarchical Exhaustive Search
• 64x64 nodes
  – 4x4 based pyramid structure
  – 3 levels
Figure: pyramid with Level 2, Level 1, and Level 0.
  Level 2: D@lv2 ← E0/E1@lv1; Label0@lv2 ← pat0@lv1; Label1@lv2 ← pat1@lv1
  Level 1: D@lv1 ← E0/E1@lv0; Label0@lv1 ← pat0@lv0; Label1@lv1 ← pat1@lv0
  Level 0: D@lv0 ← D0/D1@lv0; Label0@lv0 ← Label0; Label1@lv0 ← Label1
pat0 is the best candidate pattern
pat1 is the 2nd best candidate pattern
Computing V term at Level 1
• For 1st-order neighboring sub-graphs Gi and Gj
  – Possible neighboring pair combinations
    • (pat0i, pat0j)
    • (pat0i, pat1j)
    • (pat1i, pat0j)
    • (pat1i, pat1j)
• Compute V(patXi, patXj) with the original neighboring cost
  – Example:
    • V(pat0i, pat0j) = K
    • V(pat0i, pat1j) = K+K+K = 3K
Figure: boundary labels of Gi and Gj. The right column of pat0i is (0, 0, 0, 1); the left column of pat0j is (0, 0, 1, 1), which differs in 1 row; the left column of pat1j is (1, 0, 1, 0), which differs in 3 rows.
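The level-1 V term in the example is a Potts cost summed along the shared boundary of two sub-graphs. A minimal sketch follows, with the boundary columns read from the slide's figure; the function and variable names are hypothetical, and K is set to 1 so results read as multiples of K.

```python
def boundary_cost(right_column_gi, left_column_gj, k):
    """Sum the Potts cost over node pairs facing each other on the boundary."""
    return k * sum(1 for a, b in zip(right_column_gi, left_column_gj) if a != b)


K = 1
PAT0_I_RIGHT = [0, 0, 0, 1]   # right column of pat0 in Gi
PAT0_J_LEFT = [0, 0, 1, 1]    # left column of pat0 in Gj
PAT1_J_LEFT = [1, 0, 1, 0]    # left column of pat1 in Gj

if __name__ == "__main__":
    print(boundary_cost(PAT0_I_RIGHT, PAT0_J_LEFT, K))  # K
    print(boundary_cost(PAT0_I_RIGHT, PAT1_J_LEFT, K))  # 3K
```

Only the boundary columns matter for this term, because the interior smoothness cost of each pattern is already included in its own energy.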
Result of 16x16 (256) 2 level HES
• 100 randomly generated graphs
  – D/V ~ 10
  – Symmetric V = 20
• Error rate
  – Max: 17/256 ~ 6.6%
  – Average: 7/256 ~ 2.8%
  – Min: 2/256 ~ 0.8%
Result of 64x64 (4096) 3 level HES
• 100 randomly generated graphs
  – D/V ~ 10
  – Symmetric V = 20
• Error rate
  – Max: 185/4096 ~ 4.5%
  – Average: 146/4096 ~ 3.6%
  – Min: 115/4096 ~ 2.8%
Death Sentence to HES
Error Rate vs. Graph Size
• D:V ~ 163:20
Graph size          Avg. energy    Avg. error rate %     Error rate     Min error   Max error
                    increase %                           std. dev. %    rate %      rate %
16x16 (2 level)     0.20           2.74 (7/256)          1.43           0.39        8.59
64x64 (3 level)     0.25           3.63 (149/4096)       0.40           2.58        4.54
256x256 (4 level)   0.28           3.65 (2393/65536)     0.09           3.36        3.89
• 3.63 vs. 3.65: the error rate did not increase significantly
• The error rate range became smaller
Impact of different V cost
• 64x64 (3 level) HES
  – 100 patterns per V cost value
• D cost (average over s-link capacities of 10 patterns, 2 for each V)
  – Average: 162.8
  – Std. dev.: 94.4
• V cost
  – 10, 20, 40, 60, 80
K     Error rate   Energy difference
10    1.53%        0.05%
20    3.63%        0.25%
40    8.75%        1.15%
60    14.55%       2.63%
80    20.95%       4.50%
Figure: HGCE vs. GCE Error Percentage (error rate vs. K value).
Figure: HGCE vs. GCE Energy Difference Percentage (energy difference vs. K value).
Figure: HGCE vs. GCE Error Percentage (error rate vs. K value), 256x256 1 pattern result.
Stereo Matching Case
• Stereo pair: Tsukuba
• Expansion with random label order
  – 15 labels → 15 graph-cut computations
• Graph size: 256 x 256
• D term: truncated Sum of Squared Error (tSSE)
  – Truncated at AD = 20
• V term: Potts model
  – K = 20
1st iteration result
• Error rate might exceed 20% for important expansion moves
Label (α)   Error rate (%)   Energy difference (%)
0           0.62             12.7
1           1.07             16.1
2           0.00             0.0
3           2.76             24.7
4           21.59            38.7
5           22.91            44.2
6           9.21             32.1
7           7.83             40.0
8           5.01             28.3
9           12.01            42.0
10          5.55             32.2
11          5.18             30.5
12          5.33             31.3
13          7.07             34.6
14          2.98             23.2
Important expansions: 4, 5, 9
Figure: BnK’s expansion result.
Reason for failure
• The best 2 local candidates do NOT include the final optimal solution
  – Errors often occur near lv2 and lv3 block boundaries
    • The majority of nodes have zero capacity on both the source and sink links
    • More dependent on the neighboring nodes’ labels
  – D:V ratio ~ 56:20 = 2.8:1
    • Similar to the D:V = 163:60 case
    • Error rate for random patterns ~ 15%
Best 2 patterns in does NOT consider the pattern of
Partitioned (Block) Graph-Cut
Motivation
• Global
  – Considers the whole picture
  – More information
• Local
  – Considers a limited region of a picture
  – Less information
Is it necessary to use that much information in global methods?
Concept
• Original full GC
  – 1 big graph
• Partitioned GC
  – N smaller graphs
What’s the smallest possible partition to achieve the same performance?
Experiment Setting
• Energy
  – D term
    • Luma only
    • Birchfield-Tomasi cost (best result at half-pel position)
    • Square error
  – V term
    • Potts model: V = K x T(di≠dj)
    • The constant K is the same for all partitions
• Partition size
  – 4x4, 16x16, 32x32, 64x64, 128x128
• Stereo pairs
  – Tsukuba, Teddy, Cones, Venus
Tsukuba 4x4, 16x16, 32x32, 64x64
Figure panels: 4x4, 16x16, 32x32, 64x64
Tsukuba 96x96, 128x128
Full GC
96x96
128x128
Venus 32x32, 64x64
32x32 64x64
Venus 96x96, 128x128
Full GC
96x96 128x128
Teddy 32x32, 64x64
32x32 64x64
Teddy 96x96, 128x128
Full GC
96x96 128x128
Cones 32x32, 64x64
32x32 64x64
Cones 96x96, 128x128
Full GC
96x96 128x128
Middlebury Result
Block   Tsukuba                Venus                  Teddy                  Cones
size    nonocc  all    disc    nonocc  all    disc    nonocc  all    disc    nonocc  all    disc
Best    6.0     6.8    24.7    2.9     4.4    19.4    14.5    22.8   29.1    11.2    20.9   20.9
Full    8.6     9.4    27.4    3.3     4.7    18.4    14.5    22.8   29.1    14.5    23.8   24.2
32      16.2    17.0   33.7    24.0    24.9   29.6    27.6    34.9   35.2    24.4    32.7   30.2
64      10.6    11.5   29.6    10.0    11.3   19.5    19.1    27.1   29.8    19.2    28.0   27.3
96      9.4     10.2   28.7    9.1     10.4   20.5    16.3    24.7   29.5    15.1    24.3   24.7
128     8.8     9.5    27.3    8.4     9.6    21.5    15.2    23.5   28.6    14.6    23.9   24.0
Best: full GC with the best parameter
Full: full GC with K=20 (Tsukuba) and K=60 (others)
Evaluation Web Page http://cat.middlebury.edu/stereo/
Summary
• Smallest possible partition size (2% accuracy drop)
  – Tsukuba: 64x64
  – Teddy & Cones: 96x96
  – Venus: larger than 128x128
• Benefits
  – Possible complexity or storage reduction
  – Parallelism increase
• Drawbacks
  – Performance (disparity accuracy) drop
  – PC computation becomes longer
Hierarchical Parallel Graph-Cut
Concept of Hierarchical Parallel GC
• Bottom up
  – Solve graph-cut for smaller subgraphs
  – Solve graph-cut for larger subgraphs
    • Larger subgraph = set of neighboring smaller subgraphs
Figure: Level 0 consists of subgraphs sg0, sg1, sg2, and sg3; at Level 1 the larger subgraph = sg0+sg1+sg2+sg3.
!! Each subgraph is temporarily independent !!
HPGC for solving a 256x256 graph
Step 1: 64 32x32 Lv0 subgraphs
Step 2: 16 64x64 Lv1 subgraphs
Step 3: 4 128x128 Lv2 subgraphs
Step 4: 1 256x256 Lv3 subgraph
Total graph-cut computations = 64+16+4+1 = 85
!! HPGC must use Ford-Fulkerson-based methods !!
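The number of subgraphs per level follows from doubling the subgraph side at each step. A small sketch of that count is given below; the function name is hypothetical, and it assumes square graphs whose side is the level-0 side times a power of two.

```python
def hpgc_levels(graph_side, lv0_side):
    """List (subgraph side, number of subgraphs) for each level, bottom up."""
    levels = []
    side = lv0_side
    while side <= graph_side:
        per_axis = graph_side // side
        levels.append((side, per_axis * per_axis))
        side *= 2          # each level merges 2x2 neighboring subgraphs
    return levels


if __name__ == "__main__":
    levels = hpgc_levels(256, 32)
    print(levels)
    print(sum(count for _, count in levels))  # total graph-cut computations
```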
Boykov and Kolmogorov’s Motivation
• Dinic method
  – Search for the shortest augmenting path
  – Use breadth-first search (BFS)
• Example:
  – Search shortest path (length = k)
    • Use BFS, expand the search tree
    • Find all paths of length k
  – Search shortest path (length = k+1)
    • Use BFS, RE-expand the search tree again
    • Find all paths of length (k+1)
  – Search shortest path (length = k+2)
    • Use BFS, RE-RE-expand the search tree again
    • …
Figure: example graph with every edge labeled 1.
Why don’t we REUSE the expanded tree?
BnK’s Method
• Concept:
  – Reuse the already expanded trees
  – Avoid re-expanding the trees from scratch (nothing)
• 3 stages
  – Growth
    • Grow the search tree
  – Augmentation
    • Ford-Fulkerson style augmentation
  – Adoption
    • Reconnect the unconnected sub-trees
    • Connect the orphans to a new parent
Augmenting Path
Saturate Critical Edge
Adopt Orphans
Feature of BnK method
• Based on Ford-Fulkerson
  – Bidirectional search tree construction
  – Searched tree reuse
  – Determine the label (source or sink) using tree connectivity
Source tree
Sink tree
Connectivity is why HPGC works
• Example: a 2x4 binary variable graph
Graph view
Tree view
Case 2
Case 3
Case 4
Case 1
Connectivity of the various cases
Case 2
Case 3
Case 4
How to add edges
• When should nodes A and B check the edge between them?
  – If A and B belong to different search trees
    • A is in a sink tree, B is in a source tree
    • A is in a source tree, B is in a sink tree
    • Implies a source->sink path
  – If A or B is an orphan (not connected to any tree)
    • A is an orphan, B is not an orphan
    • A is not an orphan, B is an orphan
    • Check for possible connectivity of the orphan
Figure: nodes A and B joined by an edge.
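The two conditions above can be expressed as a small predicate. This is only a sketch of the rule as stated on the slide; the state names and the function name are hypothetical, and a real engine would store tree membership per node.

```python
SOURCE, SINK, ORPHAN = "source", "sink", "orphan"


def should_check_edge(tree_a, tree_b):
    """Decide whether the edge between nodes A and B needs to be examined."""
    # Case 1: the nodes sit in different search trees (possible s->t path).
    different_trees = {tree_a, tree_b} == {SOURCE, SINK}
    # Case 2: exactly one node is an orphan that may be adopted by the other.
    one_orphan = (tree_a == ORPHAN) != (tree_b == ORPHAN)
    return different_trees or one_orphan


if __name__ == "__main__":
    print(should_check_edge(SOURCE, SINK))    # True
    print(should_check_edge(ORPHAN, SOURCE))  # True
    print(should_check_edge(SOURCE, SOURCE))  # False
```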
Complexity Result
• Method
  – Annotate each line of code with basic operations
    • Read (R)
    • Write (W)
    • Arithmetic (A)
    • Logic (L)
    • Compare (C)
    • Branch (B)
• Examples
  – C=A+B → 2R, 1W, 1A
  – If(A==B) → 2R, 1C, 1B
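The annotation method amounts to summing per-statement operation counts over the executed code. A minimal sketch of such a tally follows; the lookup table holds only the two examples from the slide, and the names `ANNOTATION` and `tally` are hypothetical.

```python
from collections import Counter

# Operation counts per statement: R=read, W=write, A=arithmetic,
# C=compare, B=branch (the two annotated examples from the slide).
ANNOTATION = {
    "C = A + B": Counter(R=2, W=1, A=1),
    "if (A == B)": Counter(R=2, C=1, B=1),
}


def tally(executed_statements):
    """Accumulate operation counts over a trace of executed statements."""
    total = Counter()
    for statement in executed_statements:
        total += ANNOTATION[statement]
    return total


if __name__ == "__main__":
    trace = ["C = A + B", "if (A == B)", "C = A + B"]
    print(dict(tally(trace)))
```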
Stereo Matching Case
• Stereo pair: Tsukuba
• Expansion with random label order
  – 15 labels → 15 graph-cut computations
• Graph size: 256 x 256
• D term: truncated Sum of Squared Error (tSSE)
  – Truncated at AD = 20
• V term: Potts model
  – K = 20
1st iteration result
• Labels 4, 5, and 9 are key moves
Label (α)   Error rate (%)   Energy difference (%)
0           0.62             12.7
1           1.07             16.1
2           0.00             0.0
3           2.76             24.7
4           21.59            38.7
5           22.91            44.2
6           9.21             32.1
7           7.83             40.0
8           5.01             28.3
9           12.01            42.0
10          5.55             32.2
11          5.18             30.5
12          5.33             31.3
13          7.07             34.6
14          2.98             23.2
Important expansions: 4, 5, 9
Figure: BnK’s expansion result.
Full BnK Graph-cut Operation Distribution
• 256x256 graph
  – Tsukuba iteration 0 label 5
Operation distribution (77,407,307 operations):
  R: 39%
  W: 18%
  A: 3%
  L: 2%
  C: 20%
  B: 18%
• Memory access is dominant
• Control ~22%
• Arithmetic is insignificant
Full GC vs. HPGC
• 256x256 graph
  – Tsukuba iteration 0 label 5
Figure: Operation vs. PE Utilization. Operation count (this case and worst case) for the 4PE, 8PE, 16PE, 32PE, and 64PE configurations, compared with the 77,407,307 operations of full GC.
Overall PE utilization values shown: 96.59%, 88.54%, 75.89%, 53.13%, 33.20%
Conclusion
• HPGC can improve speed with multiple PEs
• To perform 30 fps, 30 labels, 256x256 graph-cut
  – 1 PE @ 100 MHz
  – Average cycle budget for each subgraph ~1.3K cycles
    • Lv0 subgraph is 32x32
• Next step
  – Small BnK graph-cut engine architecture design
    • Estimate speed/cost
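One way to arrive at a budget of this size is to divide the clock rate by the number of subgraph cuts needed per second. The sketch below assumes one graph-cut pass per label per frame and 85 subgraph cuts per pass (32x32 level-0 subgraphs in a 256x256 graph); both the function name and that accounting are assumptions, not a statement of how the presenter derived the figure.

```python
def cycle_budget_per_subgraph(clock_hz, fps, labels, graph_side, lv0_side):
    """Average cycles available per subgraph cut on a single PE."""
    cuts_per_pass = 0
    side = lv0_side
    while side <= graph_side:
        cuts_per_pass += (graph_side // side) ** 2
        side *= 2
    cuts_per_second = fps * labels * cuts_per_pass
    return clock_hz / cuts_per_second


if __name__ == "__main__":
    budget = cycle_budget_per_subgraph(100e6, 30, 30, 256, 32)
    print(round(budget))  # roughly 1.3K cycles
```

Under these assumptions the budget is 100e6 / (30 x 30 x 85), which is consistent with the ~1.3K cycles quoted on the slide.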
Progress Check
• Previous plan
  – Parallel graph-cut engine for binary-variable graphs
    • Based on Boykov and Kolmogorov’s graph-cut algorithm
  – Complexity analysis: done
  – Hierarchical parallel algorithm SW model: done
  – Small BnK graph-cut engine architecture design: next 2 weeks
    • Based on Boykov and Kolmogorov’s algorithm
  – Hierarchical parallel graph-cut engine architecture design: June/July
    • Based on my hierarchical parallel algorithm modified from BnK’s algorithm