Speeding Up MRF Optimization using
Graph Cuts for Computer Vision
Vibhav Vineet
Adviser: Prof. P. J. Narayanan
Labelling Problem
[Figure: flower image; extracting the foreground pixels and the foreground object]
Pixel Labeling: Assigning a label to each pixel in image.
Image Segmentation: Involves Separating Foreground layer from background layer
Stereo Correspondence: Involves Calculation of depth map using left and right images
[Figure: left Tsukuba image and the disparity map calculated from it]
Image Denoising: Involves assigning denoised intensity value to each pixel in image.
[Figure: image denoising; noisy House image and the denoised image]
Labelling Problem- MAP Estimation
Goal: find the best possible configuration.
- The complexity increases with the number of variables (pixels) and with the number of labels in the label set.
- Evaluating the best possible configuration directly from joint or conditional probabilities is very hard with limited computation and memory.
- Energy minimization methods: by the MAP-MRF equivalence, they provide approximate solutions in moderate time.
- In computer vision, an energy function generally involves unary costs and pairwise interactions between variables.
Labelling Problem
Image-Graph Equivalence
Graph G(V, E)
Unary Cost (Per-Vertex Cost)
Cost of Assignment for “fg” is low. Cost of Assignment for “bg” is high.
Image-Graph Equivalence
Graph G(V, E)
Pairwise Cost (Per-Edge Cost)
Cost of Assignment for “same label” is low. Cost of Assignment for “different labels” is high.
Image-Graph Equivalence
Total Cost = Unary Cost + Pairwise Cost
Different labelings have different costs.
Energy E ( X ) = Unary Potential + PairWise Potential
Labeling Problem: Find a Labeling ( X ) with Minimum Cost or Energy Value
MAP-MRF Formulation
$$ E(x) = \sum_{p} D_p(x_p) + \sum_{\{p,q\}} V_{pq}(x_p, x_q) $$
MAP(X) ⇔ Min Energy(X*)
MAP estimation of a configuration X is equivalent to the minimum energy defined over the configuration
Energy = Data Term + Smoothness Term
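As a rough illustration of "Energy = Data Term + Smoothness Term", the sketch below evaluates the energy of a labeling on a 4-connected grid. The function name mrf_energy, the Potts-style smoothness term (a fixed penalty when neighbours disagree) and the toy costs are assumptions made for the example; they are not taken from the slides.

```python
def mrf_energy(labels, unary, pairwise_weight):
    """Energy of a labeling on a 4-connected grid.

    labels[r][c]   : label assigned to pixel (r, c)
    unary[r][c][l] : data cost D_p of giving label l to pixel (r, c)
    pairwise_weight: assumed Potts penalty V_pq paid when two
                     neighbouring pixels take different labels
    """
    rows, cols = len(labels), len(labels[0])
    # Data term: sum of per-pixel costs of the chosen labels.
    data = sum(unary[r][c][labels[r][c]]
               for r in range(rows) for c in range(cols))
    # Smoothness term: visit each right and down neighbour pair once.
    smooth = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols and labels[r][c] != labels[r][c + 1]:
                smooth += pairwise_weight
            if r + 1 < rows and labels[r][c] != labels[r + 1][c]:
                smooth += pairwise_weight
    return data + smooth


# Hypothetical 2x2 image with two labels (0 = "fg", 1 = "bg").
unary = [[[0, 5], [1, 4]],
         [[4, 1], [5, 0]]]
labeling_a = [[0, 0], [1, 1]]   # follows the data costs
labeling_b = [[0, 0], [0, 0]]   # perfectly smooth
print(mrf_energy(labeling_a, unary, 2))  # 6
print(mrf_energy(labeling_b, unary, 2))  # 10
```

Different labelings give different energies; the labeling problem is to find the one with the minimum value.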
Graph Cuts in Computer Vision
Image → Energy Function → Graph Construction → st-MinCut
Graph Construction
[Figure: image with foreground pixels and background pixels marked]
Image-Graph Equivalence
• Graphs constructed for vision problems are grid graphs with low connectivity: connectivity is limited to 4, 8, or 27.
• One vertex per pixel.
• Graph G(V,E): add n-edges between neighbouring vertices, then add t-edges to the terminals s and t.
The st-Mincut Problem
• Given a graph G(V,E,W) and two vertices s and t.
• Partition G into two disjoint components containing s and t respectively, such that the sum of the weights of the edges going from the s component to the t component is minimum.
[Figure: example graph with terminals s and t and its mincut]
Computing the st-Mincut
• Solve the dual Maximum Flow problem (st-Mincut ⇔ Max Flow).
• In every network, the maximum flow equals the cost of the st-mincut.
• Two approaches: the Edmonds-Karp augmenting path method and Goldberg's Push-Relabel method.
Edmonds-Karp Method
• Initialize the flow in G to 0.
• Find a shortest path from s to t in the residual graph.
• Augment the flow along the path by the minimum residual capacity on the path.
• Repeat while there exists a path from s to t.
• Edge capacity must be positive.
• Flow <= Edge Capacity
[Figure: worked example on a small graph with terminals s and t; over successive augmentations the current flow grows 0 → 3 → 10 → 16]
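The augmenting path steps above can be sketched as follows. This is a minimal illustration, not the implementation used in the thesis: the dict-of-dicts graph representation, the function name edmonds_karp and the five-edge example graph are all assumptions made for the example.

```python
from collections import deque


def edmonds_karp(capacity, s, t):
    """Max flow by shortest augmenting paths.

    capacity: dict of dicts, capacity[u][v] = capacity of edge u -> v.
    Returns (max_flow_value, set_of_vertices_on_the_source_side).
    """
    # Residual graph: copy the capacities and add zero-capacity reverse edges.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)

    flow = 0
    while True:
        # Breadth-first search finds a shortest s-t path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            # No path left: vertices reached from s form the source side of the cut.
            return flow, set(parent)

        # Bottleneck = minimum residual capacity along the path.
        bottleneck = float("inf")
        v = t
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Augment: reduce forward capacities, increase reverse capacities.
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


# Hypothetical example graph (not the one drawn on the slides).
graph = {
    "s": {"a": 3, "b": 2},
    "a": {"b": 1, "t": 2},
    "b": {"t": 3},
}
max_flow, source_side = edmonds_karp(graph, "s", "t")
print(max_flow, source_side)  # 5 {'s'}
```

Here the cut separating s from every other vertex has cost 3 + 2 = 5, which equals the maximum flow, as the max-flow/min-cut duality states.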
Goldberg's Push-Relabel Algorithm
• Initialize excess flow and heights in G.
• Perform an applicable Push or Relabel operation.
• Repeat while there exists an applicable Push or Relabel operation.
• Edge capacity must be positive.
• Flow <= Edge Capacity
[Figure: worked example on a small graph with terminals s and t, showing vertex heights and excess flows; the current flow grows 0 → 9 → 16]
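A sequential sketch of the push and relabel operations is given below, under the assumption of a generic discharge order (the order in which active vertices are processed is not specified on the slides). The function name push_relabel, the dict-of-dicts representation and the example graph are hypothetical.

```python
def push_relabel(capacity, s, t):
    """Max flow value by the generic push-relabel method.

    capacity: dict of dicts, capacity[u][v] = capacity of edge u -> v.
    """
    nodes = set(capacity)
    for nbrs in capacity.values():
        nodes.update(nbrs)

    # Residual capacities, with zero-capacity reverse edges.
    residual = {u: {} for u in nodes}
    for u, nbrs in capacity.items():
        for v, cap in nbrs.items():
            residual[u][v] = residual[u].get(v, 0) + cap
            residual[v].setdefault(u, 0)

    # Initialize heights and excess flow: saturate every edge leaving s.
    height = {u: 0 for u in nodes}
    excess = {u: 0 for u in nodes}
    height[s] = len(nodes)
    for v in list(residual[s]):
        cap = residual[s][v]
        if cap > 0:
            residual[s][v] = 0
            residual[v][s] += cap
            excess[v] += cap

    active = [u for u in nodes if u not in (s, t) and excess[u] > 0]
    while active:
        u = active.pop()
        while excess[u] > 0:
            pushed = False
            for v in residual[u]:
                # Push is allowed only "downhill" by exactly one level.
                if residual[u][v] > 0 and height[u] == height[v] + 1:
                    delta = min(excess[u], residual[u][v])
                    residual[u][v] -= delta
                    residual[v][u] += delta
                    excess[u] -= delta
                    excess[v] += delta
                    # v had no excess before this push, so it just became active.
                    if v not in (s, t) and excess[v] == delta:
                        active.append(v)
                    pushed = True
                    if excess[u] == 0:
                        break
            if not pushed:
                # Relabel: one more than the lowest neighbour reachable
                # through an edge with residual capacity.
                height[u] = 1 + min(height[v] for v in residual[u]
                                    if residual[u][v] > 0)
    return excess[t]


# Hypothetical example graph (not the one drawn on the slides).
graph = {
    "s": {"a": 3, "b": 2},
    "a": {"b": 1, "t": 2},
    "b": {"t": 3},
}
print(push_relabel(graph, "s", "t"))  # 5
```

Unlike the augmenting path method, every operation here touches only a vertex and its neighbours, which is what makes the method attractive for a thread-per-pixel mapping.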
Motivation
• Fast Computation Required
• Robot navigation, surveillance, video processing, etc.
• Video processing in real time
• YouTube and other web servers
• Processing of large images
• Even off-the-shelf cameras take high-resolution images
• Interactive tools
Mapping to CUDA
[Figure: the image grid is mapped onto the CUDA grid of CUDA blocks, with one thread per pixel]
Push-Relabel Algorithm on CUDA
• Push is a local operation, with each node sending flow to its neighbors.
• Relabel is also a local operation: each vertex updates its own height.
• Problems faced: read-after-write consistency and synchronization of threads.
Handling Problems using Atomics
• Push operations can be performed without any read-after-write inconsistencies.
• Relabel is a per-vertex operation.
• Employing atomic capabilities and combining the push and pull kernels.
[Figure: Push Kernel, Relabel Kernel]
• This lowers global memory access; empirically, faster convergence is observed.
The Push Kernel
• Load heights from global memory to shared memory.
• Synchronize threads, ensuring completion of the load operation.
• Push flows to eligible neighbors atomically.
• Update the edge weights atomically in the residual graph.
• Update excess flow atomically in the residual graph.
The Relabel Kernel
• Load heights from global memory to shared memory.
• Synchronize threads, ensuring completion of the load operation.
• Compute the minimum height of all neighbors and set own height to one more than this minimum.
• Write the new height to global memory.
Using Shared Memory
[Figure: a CUDA block padded to (Block size + 2) x (Block size + 2), showing a thread and the heights needed per thread]
• Each CUDA block requires (Block_size+2) x (Block_size+2) heights to be loaded into shared memory.
• Corner pixels need heights from other blocks.
Heuristics on Push and Relabel
• On grid graphs, global relabel (BFS based) is an expensive operation.
• Local relabel performs better empirically.
• Multiple pushes can be performed before applying a relabel step, using the schedule (m*Push + Relabel)*k + Global Relabel.
• For most general graphs, m=3 and k=7 are found to be optimal.
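To make the schedule concrete, the sketch below expands (m*Push + Relabel)*k + Global Relabel into the ordered list of kernel launches for one outer round. The function name and the string labels are hypothetical; in a real system each entry would correspond to a kernel launch.

```python
def push_relabel_schedule(m=3, k=7):
    """Operations of one round: (m*Push + Relabel)*k + Global Relabel."""
    ops = []
    for _ in range(k):
        ops.extend(["push"] * m)   # m pushes ...
        ops.append("relabel")      # ... followed by one local relabel
    ops.append("global_relabel")   # one expensive BFS-based relabel per round
    return ops


schedule = push_relabel_schedule()
print(len(schedule))               # 29
print(schedule.count("push"))      # 21
```

With the values m=3 and k=7 quoted above, one round issues 21 pushes and 7 local relabels for every global relabel.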
Stochastic Cuts
• An MRF consists of simple and difficult pixels.
• Simple pixels get their correct labels in the first few iterations.
• The (few) difficult pixels exchange flows with their neighbors in later iterations.
• Stochastic Cuts processes pixels based on their activity.
• Activity is based on the change in flows from the previous to the current iteration. Low activity is observed for simple pixels.
• Heuristically, simple pixels are processed only after a fixed number of iterations.
Experimental Results
Image      Size      CPU (ms)  Non-Atomic (ms)  Atomic (ms)  Stochastic (ms)
Sponge     640x480   142       28               16           11
Flower     608x456   188       33               26           16
Person     608x456   140       31               27           20
Synthetic  1Kx1K     655       19               10           7
Graph Reparameterization
[Figure: a graph with terminals S and t and edge weights 5, 9, 4, 2, 1, 2, shown with its graph cut. A constant of 2 is added to two of the edge weights (5+2 = 7 and 2+2 = 4), and the graph cut is computed again.]
Graph reparameterized: no change in cut.
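The invariance can be checked on a toy graph. The sketch below assumes the reparameterization adds the same constant to both terminal edges (s → a and a → t) of one vertex, as in the work of Kohli and Torr; the graph, the vertex names and the exhaustive search over partitions are assumptions made for the example and are not the graph drawn on the slide.

```python
from itertools import product


def min_cut_brute_force(capacity, inner_nodes):
    """Exhaustively find the cheapest s-t cut of a tiny graph.

    capacity: dict mapping (u, v) -> capacity of edge u -> v.
    Returns (cost, bits) where bits[i] is 0 if inner_nodes[i] ends up on
    the source side and 1 if it ends up on the sink side.
    """
    best = None
    for bits in product([0, 1], repeat=len(inner_nodes)):
        side = dict(zip(inner_nodes, bits))
        side["s"], side["t"] = 0, 1
        # A cut pays for every edge going from the source side to the sink side.
        cost = sum(cap for (u, v), cap in capacity.items()
                   if side[u] == 0 and side[v] == 1)
        if best is None or cost < best[0]:
            best = (cost, bits)
    return best


original = {("s", "a"): 5, ("a", "t"): 2,
            ("s", "b"): 1, ("b", "t"): 9,
            ("a", "b"): 4, ("b", "a"): 2}

# Add the constant 2 to both terminal edges of vertex a.
reparameterized = dict(original)
reparameterized[("s", "a")] += 2
reparameterized[("a", "t")] += 2

before = min_cut_brute_force(original, ["a", "b"])
after = min_cut_brute_force(reparameterized, ["a", "b"])
print(before)  # (6, (1, 1))
print(after)   # (8, (1, 1))
```

Every cut separates a from exactly one terminal, so every cut cost rises by the same constant and the minimizing partition is unchanged.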
Dynamic Cuts
[Figure: Problem Instance 1 (energy E_A, solution S_A) and Problem Instance 2 (energy E_B, solution S_B)]
• The problem instances differ only slightly; solving each independently is computationally expensive.
• Example: consecutive frames in a video.
• Edge capacities are updated and reparameterized using the previous frame's edge capacities, the previous frame's residual flow, and the current frame's edge capacities.
[Figure: previous frame; previous frame after st-MinCut; current frame]
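Dynamic Cuts: Steps Involved
• Approximate cut using the previous frame and its st-MinCut.
• Final st-MinCut of the current frame.
Notation: r_i, c_i and f_i are the residual capacity, capacity and flow of edge i in the previous frame; primed symbols refer to the current frame.
Update step:
$$ r'_i = r_i + c'_i - c_i $$
Reparameterization step:
$$ r'_{si} = 0 $$
$$ r'_{it} = c_{it} - f_{it} + f_{si} - c'_{si} $$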
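The two steps can be sketched for a single vertex i whose source edge capacity changes between frames. The condition used here (reparameterize only when the updated residual capacity would become negative, i.e. when the new capacity is below the flow already routed) is an assumption based on the dynamic graph cuts formulation of Kohli and Torr; the function name and the numbers are hypothetical.

```python
def update_source_edge(r_si, r_it, c_si_old, c_si_new):
    """Update the residual capacities at vertex i when c_si changes.

    r_si, r_it : residual capacities of edges (s, i) and (i, t)
                 after the previous frame's st-MinCut
    c_si_old   : capacity of (s, i) in the previous frame
    c_si_new   : capacity of (s, i) in the current frame
    """
    # Update step: r' = r + c' - c
    r_si_new = r_si + c_si_new - c_si_old
    r_it_new = r_it
    if r_si_new < 0:
        # The flow already sent exceeds the new capacity. Add the deficit
        # (f_si - c'_si) to both terminal edges, which leaves the cut unchanged.
        deficit = -r_si_new
        r_si_new = 0
        r_it_new = r_it + deficit
    return r_si_new, r_it_new


# Previous frame: c_si = 5, f_si = 4 (r_si = 1); c_it = 6, f_it = 4 (r_it = 2).
print(update_source_edge(1, 2, 5, 7))  # (3, 2): capacity grew, plain update
print(update_source_edge(1, 2, 5, 2))  # (0, 4): capacity fell below the flow
```

In the second call the result agrees with the formula above: c_it - f_it + f_si - c'_si = 6 - 4 + 4 - 2 = 4. The operation reads and writes only data local to the vertex.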
Dynamic Cuts are parallelizable
• Update and reparameterization are independent, parallelizable operations that work locally at every vertex.
• The st-MinCut is computed using a parallel implementation of the Push-Relabel algorithm.
Dynamic Cuts Empirically
• Running time depends on the percentage of weights that changed
• On a low-resolution video, dynamic cuts take about 2 ms, compared to 7 ms for the st-MinCut on the same image.
[Figure: consecutive frames of a video segmented using dynamic cuts]
The Multilabeling problem
• Multi-way cut on any graph is an NP-Hard problem for L > 2
• Approximate solutions based on graph cuts: α-Expansion and αβ-Swap
The α-Expansion
1: Initialize the MRF with an arbitrary labeling X
2: For each label α ∈ L do
3: Construct the graph based on the current configuration
4: Perform one α-Expansion step (st-cut)
5: Update the configuration if the energy decreases
6: End For
7: Repeat steps 2 to 6 till convergence.
Steps 2-6 form a cycle and steps 3-5 form an iteration.
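The loop structure of cycles and iterations can be sketched on a toy problem. To keep the example short, the st-cut of step 4 is replaced by an exhaustive search over all expansion moves, which is only feasible for a handful of variables; a real implementation solves that binary subproblem with a graph cut. The Potts smoothness term, the function names and the four-variable chain are assumptions made for the example.

```python
from itertools import product


def energy(labels, unary, edges, weight):
    """Unary costs plus a Potts penalty on every edge whose ends disagree."""
    data = sum(unary[i][label] for i, label in enumerate(labels))
    smooth = sum(weight for i, j in edges if labels[i] != labels[j])
    return data + smooth


def best_expansion_move(labels, alpha, unary, edges, weight):
    """Best labeling in which each variable keeps its label or switches to alpha.

    Exhaustive stand-in for the st-cut step; exponential in the number of
    variables, so usable on toy problems only.
    """
    best, best_energy = list(labels), energy(labels, unary, edges, weight)
    for switch in product([False, True], repeat=len(labels)):
        cand = [alpha if sw else label for sw, label in zip(switch, labels)]
        cand_energy = energy(cand, unary, edges, weight)
        if cand_energy < best_energy:
            best, best_energy = cand, cand_energy
    return best, best_energy


def alpha_expansion(unary, edges, weight, num_labels):
    labels = [0] * len(unary)                  # step 1: arbitrary labeling
    current = energy(labels, unary, edges, weight)
    improved = True
    while improved:                            # step 7: one pass = one cycle
        improved = False
        for alpha in range(num_labels):        # steps 2-6: one iteration per label
            cand, cand_energy = best_expansion_move(
                labels, alpha, unary, edges, weight)
            if cand_energy < current:          # step 5: accept only if energy drops
                labels, current, improved = cand, cand_energy, True
    return labels, current


# Hypothetical chain of 4 variables with 3 labels.
unary = [[0, 5, 5], [0, 5, 5], [5, 0, 5], [5, 5, 0]]
edges = [(0, 1), (1, 2), (2, 3)]
result = alpha_expansion(unary, edges, weight=1, num_labels=3)
print(result)  # ([0, 0, 1, 2], 2)
```

On this example the first cycle lowers the energy from 10 to 6 (expanding label 1) and then to 2 (expanding label 2); the second cycle finds no improving move, so the loop stops.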
Incremental α-Expansion
• Reusability of flows, as in dynamic MRFs: better initializations for the next graph cut.
• Incremental/dynamic: reuse the flows from label to label and recycle flows from cycle to cycle.
[Figure: input and the results for Label 1, Label 2 and Label 3 over Cycle 1 and Cycle 2]
Incremental α-Expansion Results
[Figure: results on the Tsukuba, Teddy, Penguin and Panorama datasets]
Incremental α-Expansion Results
[Figure: total timings on different datasets]
Processing on High Detailed Scene
• High detailed scene:
  • High-resolution image
  • High dynamic range of colors
  • Wide view angles
• Challenges:
  • High computation cost
  • High memory requirement
  • Interaction with high-resolution images
• Statistics of image sizes available on Google Images:
  • An overwhelming fraction of images are of size 2 to 10 million pixels.
  • Only 0.6% or fewer of the images had more than 40 megapixels.
Processing on High Detailed Scene
Solve an optimization problem at the coarser level to dynamically update the optimization instance for the next level, giving a better initialization.
[Figure: define E(x) for the coarsest image → final result for this level → define E(x) for the next finer level → final result at this level]
Pyramid Reparameterization
• A pyramid is constructed, with the largest (input) image at its base.
• Each pixel of a coarser-level image is the mean of 4 pixels at the previous finer level.
[Figure: pyramid construction]
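The construction rule can be sketched in a few lines. The sketch assumes a grayscale image stored as a list of rows with even dimensions at every level, and that the 4 pixels are a non-overlapping 2x2 block; the function names and pixel values are hypothetical.

```python
def downsample(image):
    """Halve each dimension: every coarse pixel is the mean of a 2x2 block."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [[(image[2 * r][2 * c] + image[2 * r][2 * c + 1] +
              image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(cols)]
            for r in range(rows)]


def build_pyramid(image, levels):
    """pyramid[0] is the input image (the base); later entries are coarser."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


image = [[0, 2, 4, 4],
         [2, 4, 4, 4],
         [8, 8, 1, 3],
         [8, 8, 5, 7]]
pyramid = build_pyramid(image, levels=3)
print(pyramid[1])  # [[2.0, 4.0], [8.0, 4.0]]
print(pyramid[2])  # [[4.5]]
```

Optimization then starts at the last (coarsest) entry and proceeds toward the base.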
Pyramid Reparameterization
[Figure: image at level (i-1) → graph G(V,E) → minimization at (i-1) → segmented image at (i-1) and residual graph G(V,E). Upsampling step: upsampled initial graph and upsampled residual graph.]
• Going directly from the graph at the current level to its final residual graph requires computationally expensive graph cuts.
• Instead, the difference between the upsampled graph of the previous level and the graph at the current level is used to reuse flows, giving a cheaper solution.
Upsampling Rules
[Figure: rules for graph upsampling and residual graph upsampling, before and after graph cuts, in terms of the terminal edge weights w_si, w_sj, w_sk, w_it, w_jt, w_kt, the neighbor edge weights w_ij, w_jk and w'_jk, and zero-weight edges]
Image Segmentation Results
[Figure: Horse image, 3.3 MP (2048x1600)]
Interactive Image Segmentation Tool
User interaction is important in foreground/background separation.
• Results of a user study on image size for comfortable manipulation, for two display sizes.
• Average subjective response for six image sizes.
• Images that are larger than the display are disfavored by users.
[Figure: responses of User 1 and User 2]
Pyramid Segmentation System
[Figure: actual image and display window]
• The user interacts at the display window, which is of a comfortable size.
• Quick Segment:
  • Displays the segmentation results on this display image.
  • Provides a perceptual response to start planning further interactions.
• The actual segmentation goes on in the background on the other levels.
User Study
Results of the user study on the CPU and GPU versions of Pyramid Segmentation, compared with GrabCut and Quick Selection.
[Figure panels: Interaction Time; Response Time; Total Time; Subjective Response of Users]
Multiresolution alpha-expansion
- Build pyramid of graphs.
- Perform alpha-expansion at a lower resolution graph.
- Save the initial and final residual graphs for all the labels.
- Upsample and reparameterize the previous resolution initial and final graphs and current resolution initial graph.
- Perform alpha-expansion at this level.
- Repeat this for all the levels in the pyramid.
Stereo Correspondence
Image size: 1328 x 1104
Number of labels (disparities): 200 to 290
Stereo Correspondence
Running time in seconds for stereo correspondence using Pyramid Cuts on the GPU (G-PyCut), on the CPU (C-PyCut), and using single-level Graph Cuts (GCuts).
[Figure panels: Optimization Time; Total Time (optimization time + graph construction time + energy function calculation time)]
A speed-up of 5-6 times on the CPU is observed.
Image Denoising
Image size: 1000 x 1000
Number of labels (intensities): 256
Image Denoising
Running time in seconds for image denoising using Pyramid Cuts on the GPU (G-PyCut), on the CPU (C-PyCut), and using single-level Graph Cuts (GCuts).
[Figure panels: Optimization Time; Total Time (optimization time + graph construction time + energy function calculation time)]
A speed-up of 5-6 times on the CPU is observed.
Future Work
• Higher order Interactions of variables in MRF
• Computationally more challenging
• Modelling this on our hierarchical and multiresolution framework
• Using multiple GPUs to parallelize the alpha-expansion
• Better interactive tools:
• Both global and local interactions
Conclusion
• Two methods to optimize basic graph cuts algorithm
• Using facilities provided by parallel accelerators like GPU
• Modelling graph cuts on hierarchical and dynamic framework for better initialization
• Graph cuts methods have proved instrumental in solving many computationally challenging problems
• Successes of graph cuts -> Promising future in the realm of energy minimization methods
Related Publications
• P. J. Narayanan, Vibhav Vineet and Timo Stitch. Fast Graph Cuts on the GPU. GPU Computing Gems (GCG), Volume 1, Dec. 2010 (Book Chapter).
• Vibhav Vineet and P. J. Narayanan. Solving Multi-label MRFs using incremental alpha-expansion move on the GPUs. In Proceedings of the Ninth Asian Conference on Computer Vision (ACCV-2009), China, 2009.
• Vibhav Vineet and P. J. Narayanan. CUDA Cuts: Fast Graph Cuts on the GPU. In Proceedings of the CVPR Workshop on Visual Computer Vision on GPUs (CVGPU-2008), Alaska, USA, 2008.
• Vibhav Vineet, Pawan Harish, Suryakant Patidar and P. J. Narayanan. Fast Minimum Spanning Tree for Large Graphs on the GPU. In Proceedings of ACM SIGGRAPH High Performance Graphics (HPG-2009), New Orleans, LA, USA, 2009.
• Pawan Harish, Vibhav Vineet and P. J. Narayanan. Large Graph Algorithms for Massively Multithreaded Architectures. IIIT Tech Report, IIIT/TR/2009/74.
• CUDA Cuts: Fast Graph Cuts on the GPU. http://cvit.iiit.ac.in/index.php?page=resources. (Software).
• Vibhav Vineet, Pawan Harish, Suryakant Patidar and P. J. Narayanan. Fast Minimum Spanning Tree for Large Graphs on the GPU. GPU Computing Gems (GCG). (Book Chapter).
Changes made to the thesis
Reviewer 1 (Dr. Kishore)
• Tables containing experimental results on more standard images (around 60 images) added.
• Sections on related and background work expanded.
Reviewer 2 (Dr. Srinivasan)
• Missed references added to the related work section.
• Formal results on speed-up for dynamic graph cuts on video segmentation added to the results section.
• Figure captions properly referenced with the paper of Kohli and Torr.
• Other minor changes made as recommended.
Thank You