
Speeding Up MRF Optimization using Graph Cuts for Computer Vision

Vibhav Vineet

Adviser: Prof. P. J. Narayanan

Labelling Problem

[Figure: flower image; extracting the foreground pixels; extracted foreground object]

Pixel Labeling: assigning a label to each pixel in an image.

Image Segmentation: separating the foreground layer from the background layer.

Stereo Correspondence: computing a depth (disparity) map from the left and right images.

[Figure: left Tsukuba image and its disparity map]

Image Denoising: assigning a denoised intensity value to each pixel in the image.

[Figure: noisy House image and the denoised image]

Labelling Problem- MAP Estimation

Goal: find the best possible configuration (labeling).

- The complexity increases with the number of variables/pixels and with the number of labels in the label set.

- Evaluating the best configuration directly from joint or conditional probabilities is very hard with limited computation and memory.

- Energy minimization methods: by the MAP-MRF equivalence, the most probable configuration is the one with minimum energy. These methods provide approximate solutions in moderate time.

- Generally, in computer vision, an energy function involves a unary cost and pairwise interactions between variables.

Labelling Problem

Image-Graph Equivalence

Graph G(V, E): Unary Cost (Per-Vertex Cost)

Cost of assignment for “fg” is low. Cost of assignment for “bg” is high.

Graph G(V, E): Pairwise Cost (Per-Edge Cost)

Cost of assignment for “same label” is low. Cost of assignment for “different labels” is high.

Total Cost = Unary Cost + Pairwise Cost. Different labelings have different costs.

Energy E ( X ) = Unary Potential + PairWise Potential

Labeling Problem: Find a Labeling ( X ) with Minimum Cost or Energy Value

MAP-MRF Formulation

$$E(x) = \sum_{p} D_p(x_p) + \sum_{\{p,q\}} V_{pq}(x_p, x_q)$$

MAP(X) ⇔ Min Energy(X*)

MAP estimation of a configuration X is equivalent to finding the configuration with the minimum energy.

Energy = Data Term + Smoothness Term
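As a minimal sketch of how such an energy is evaluated, the Python below sums a data term and a smoothness term for a given labeling. The function names (`mrf_energy`, `potts`), the Potts smoothness cost, and the three-pixel example values are illustrative assumptions, not taken from the slides.

```python
from itertools import product


def potts(a, b, penalty=2):
    """Assumed smoothness term: zero for equal labels, a fixed penalty otherwise."""
    return 0 if a == b else penalty


def mrf_energy(labels, unary, edges, pairwise):
    """Energy = data term (per vertex) + smoothness term (per edge)."""
    data = sum(unary[p][labels[p]] for p in range(len(labels)))
    smooth = sum(pairwise(labels[p], labels[q]) for p, q in edges)
    return data + smooth


# Hypothetical 1x3 strip of pixels with labels 0 = "fg" and 1 = "bg".
unary = [[1, 5], [4, 3], [6, 1]]   # unary[p][label]
edges = [(0, 1), (1, 2)]           # neighboring pixel pairs

# Exhaustive search is only feasible for tiny problems like this one.
best = min(product([0, 1], repeat=3),
           key=lambda x: mrf_energy(x, unary, edges, potts))
print(best, mrf_energy(best, unary, edges, potts))
```

The exhaustive search grows as (number of labels)^(number of pixels), which is why the rest of the talk relies on graph cuts instead.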

Graph Cuts in Computer Vision

Image → Energy Function → Graph Construction → st-MinCut

Graph Construction

Foreground Pixels

Background Pixels

Graphs constructed for vision problems:

- Grid graphs

- Low connectivity: limited to 4, 8, or 27

Graph G(V, E):

- One vertex per pixel

- Add n-edges

- Add t-edges

The st-Mincut Problem

• Given a graph G(V, E, W) and two vertices s and t.

• Partition G into two disjoint components, containing s and t respectively, such that the sum of the weights of the edges going from the s component to the t component is minimum.

Mincut

Computing the st-Mincut

• Solve the dual Maximum Flow problem

• Two approaches: the Edmonds-Karp augmenting path method and Goldberg’s push-relabel method

st-Mincut ⇔ Max Flow (dual problems)

In every network, the value of the maximum flow equals the cost of the st-mincut.

Edmonds-Karp Method

• Initialize the flow in G to 0.

• Find a shortest path from s to t in the residual graph.

• Augment the flow along this path by the smallest residual capacity on the path.

• Repeat while there exists a path from s to t.

Constraints:

• Edge capacity must be positive

• Flow <= Edge Capacity

[Figure: worked example on a small graph; the current flow grows from 0 to a final value of 16]
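The augmenting-path steps above can be sketched in Python as follows. This is a sequential illustration on a dense capacity matrix; the function name `edmonds_karp` and the four-vertex example graph are assumptions for illustration and are not the graph drawn on the slides.

```python
from collections import deque


def edmonds_karp(capacity, s, t):
    """Return (max flow value, set of vertices on the s side of a min cut)."""
    n = len(capacity)
    residual = [row[:] for row in capacity]
    max_flow = 0
    while True:
        # Breadth-first search finds a shortest s-t path in the residual graph.
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            break  # no augmenting path left
        # Bottleneck = smallest residual capacity along the path.
        bottleneck = float("inf")
        v = t
        while v != s:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Augment: reduce forward capacities, increase reverse capacities.
        v = t
        while v != s:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        max_flow += bottleneck
    # Vertices reached by the last (failed) search form the s side of the cut.
    source_side = {v for v in range(n) if parent[v] != -1}
    return max_flow, source_side


# Hypothetical graph: vertex 0 = s, vertex 3 = t.
capacity = [
    [0, 10, 9, 0],
    [0, 0, 2, 7],
    [0, 0, 0, 9],
    [0, 0, 0, 0],
]
flow, source_side = edmonds_karp(capacity, 0, 3)
print(flow, source_side)
```

The returned vertex set gives the segmentation directly: vertices on the s side take one label and the rest take the other.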

Goldberg’s Push-Relabel Algorithm

• Initialize excess flow and heights in G

• Perform an applicable Push or Relabel operation

• Repeat while there exists an applicable push or relabel operation

[Figure: step-by-step animation of push-relabel on a small example graph, showing vertex heights and excess flows; the current flow goes from 0 to 9 and finally to 16]
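For reference, a minimal sequential sketch of the generic push-relabel method is given below. It is not the CUDA implementation discussed later; the function name `push_relabel`, the dense-matrix representation, and the example graph are assumptions made for illustration.

```python
def push_relabel(capacity, s, t):
    """Return the max flow value using generic (sequential) push-relabel."""
    n = len(capacity)
    residual = [row[:] for row in capacity]
    height = [0] * n
    excess = [0] * n
    height[s] = n
    # Initialization: saturate every edge leaving the source.
    for v in range(n):
        if residual[s][v] > 0:
            amount = residual[s][v]
            residual[s][v] = 0
            residual[v][s] += amount
            excess[v] += amount
            excess[s] -= amount

    def active_vertices():
        return [u for u in range(n) if u not in (s, t) and excess[u] > 0]

    active = active_vertices()
    while active:
        u = active[0]
        pushed = False
        for v in range(n):
            # Push is allowed only "downhill" by exactly one height level.
            if residual[u][v] > 0 and height[u] == height[v] + 1:
                delta = min(excess[u], residual[u][v])
                residual[u][v] -= delta
                residual[v][u] += delta
                excess[u] -= delta
                excess[v] += delta
                pushed = True
                if excess[u] == 0:
                    break
        if not pushed:
            # Relabel: one above the lowest neighbor reachable in the residual graph.
            height[u] = 1 + min(height[v] for v in range(n) if residual[u][v] > 0)
        active = active_vertices()
    return excess[t]


# Hypothetical graph: vertex 0 = s, vertex 3 = t.
capacity = [
    [0, 10, 9, 0],
    [0, 0, 2, 7],
    [0, 0, 0, 9],
    [0, 0, 0, 0],
]
print(push_relabel(capacity, 0, 3))
```

Because push and relabel only touch a vertex and its neighbors, the same operations can be assigned to one thread per vertex, which is what motivates the GPU mapping.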

Motivation

• Fast Computation Required

• Robot navigation, surveillance, video processing etc

• Video processing in real time

• YouTube and other web servers

• Processing of large images

• Even off-the-shelf cameras take high-resolution images

• Interactive tools

Mapping to CUDA

Image Grid

CUDA Grid CUDA Block

Thread per pixel

Push-Relabel Algorithm on CUDA

• Push is a local operation, with each node sending flow to its neighbors.

• Relabel is also a local operation: each vertex updates its own height.

• Problems faced: read-after-write consistency and synchronization of threads.

Handling Problems using Atomics

• Push operations can be performed without any read-after-write inconsistencies.

• Relabel is a per-vertex operation.

• Employing atomic capabilities and combining the push and pull kernels

Push Kernel → Relabel Kernel

• Lowers global memory access; empirically, faster convergence is observed.

The Push Kernel

• Load heights from global memory into shared memory.

• Synchronize threads to ensure the load has completed.

• Push flow to eligible neighbors atomically.

• Update the edge weights in the residual graph atomically.

• Update the excess flow in the residual graph atomically.

The Relabel Kernel

• Load heights from global memory into shared memory.

• Synchronize threads to ensure the load has completed.

• Compute the minimum height among the neighbors and set the vertex's own height to that minimum plus one.

• Write the new height to global memory.
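The relabel kernel can be mimicked sequentially as below. This is a Python sketch, not CUDA code: every pixel reads the old heights and writes into a separate buffer, which stands in for the load, synchronize, compute, write sequence. The function name `relabel_step`, the dictionary-based residual graph, and the restriction to neighbors with positive residual capacity (as in standard push-relabel) are assumptions.

```python
def relabel_step(height, excess, residual):
    """One synchronous relabel pass over a pixel grid.

    residual[(r, c)] maps each neighbor of pixel (r, c) to the residual
    capacity of the edge toward it; a neighbor is a (row, col) tuple or
    the string "sink", whose height is taken to be 0.
    """
    rows, cols = len(height), len(height[0])
    new_height = [row[:] for row in height]   # write buffer
    for r in range(rows):
        for c in range(cols):
            if excess[r][c] <= 0:
                continue  # only vertices holding excess flow are relabeled
            candidates = []
            for neighbor, cap in residual[(r, c)].items():
                if cap > 0:
                    if neighbor == "sink":
                        candidates.append(0)
                    else:
                        candidates.append(height[neighbor[0]][neighbor[1]])
            if candidates:
                new_height[r][c] = min(candidates) + 1
    return new_height


# Hypothetical 1x2 grid: the left pixel holds excess and can reach the right one.
height = [[0, 0]]
excess = [[5, 0]]
residual = {
    (0, 0): {(0, 1): 3},
    (0, 1): {(0, 0): 0, "sink": 2},
}
print(relabel_step(height, excess, residual))
```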

Using Shared Memory

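[Figure: a CUDA block of threads with a one-pixel border, Block size + 2 per side, showing the heights needed per thread]

• Each CUDA block needs a (Block_size + 2) x (Block_size + 2) tile of heights loaded into shared memory.

• Pixels on the block boundary need heights from neighboring blocks.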

Heuristics on Push and Relabel

• On grid graphs, global relabel (BFS-based) is an expensive operation.

• Local relabel performs better empirically.

• Multiple pushes can be performed before applying a relabel step, using the schedule:

(m*Push + Relabel)*k + Global Relabel

• For most general graphs, m=3 and k=7 are found to be optimal.
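The schedule (m*Push + Relabel)*k + Global Relabel can be read as a sequence of kernel launches. The generator below is only a sketch of that ordering; the name `kernel_schedule` and the string labels are hypothetical, while the defaults m=3 and k=7 are the values quoted on the slide.

```python
def kernel_schedule(m=3, k=7):
    """Yield the kernel order for one outer iteration of the schedule."""
    for _ in range(k):
        for _ in range(m):
            yield "push"
        yield "relabel"          # local relabel after every m pushes
    yield "global_relabel"       # expensive BFS-based relabel, once per outer iteration


schedule = list(kernel_schedule())
print(len(schedule), schedule[:5])
```

With the default values, one outer iteration launches 21 push kernels, 7 local relabels and a single global relabel.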

Stochastic Cuts

• An MRF consists of simple and difficult pixels.

• Simple pixels get their correct labels in the first few iterations.

• The (few) difficult pixels keep exchanging flow with their neighbors in later iterations.

• Stochastic Cuts processes pixels based on their activity.

• Activity is measured by the change in flow from the previous iteration to the current one. Low activity is observed for simple pixels.

• Heuristically, simple pixels are processed only once every fixed number of iterations.
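One way to picture the activity test is the sketch below, which picks the pixels to process in a given iteration. The function name `pixels_to_process`, the `threshold` and `period` parameters, and their default values are hypothetical; the slides do not state how activity is thresholded or how often inactive pixels are revisited.

```python
def pixels_to_process(prev_flow, curr_flow, iteration, threshold=0, period=10):
    """Return indices of pixels to process in this iteration.

    A pixel is active when its flow changed by more than `threshold`
    since the previous iteration. Every `period` iterations all pixels
    are processed, so that inactive (simple) pixels are still revisited.
    """
    if iteration % period == 0:
        return list(range(len(curr_flow)))
    return [i for i, (before, after) in enumerate(zip(prev_flow, curr_flow))
            if abs(after - before) > threshold]


print(pixels_to_process([5, 5, 5], [5, 7, 5], iteration=3))
print(pixels_to_process([5, 5, 5], [5, 7, 5], iteration=10))
```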

Experimental Results

Image      Size     Time CPU (ms)  Time Non-Atomic (ms)  Time Atomic (ms)  Time Stochastic (ms)
Sponge     640x480  142            28                    16                11
Flower     608x456  188            33                    26                16
Person     608x456  140            31                    27                20
Synthetic  1Kx1K    655            19                    10                7

Graph Reparameterization

[Figure: graph cuts applied to a graph and to its reparameterized version. Graph Reparameterized: No change in]

Dynamic Cuts

[Figure: energy E_A with solution S_A, and energy E_B with solution S_B]

Problem instances that differ only slightly from one another; solving each independently is computationally expensive.

Example: consecutive frames in a video

[Figure: Problem Instance 1 and Problem Instance 2]

• Edge capacities are updated and reparameterized using:

- Previous frame edge capacities

- Previous frame residual flow

- Current frame edge capacities

Previous Frame

Previous Frame after st-MinCut

Current Frame

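Dynamic Cuts: Steps Involved

Approximate cut using the previous frame and its st-MinCut

Final st-MinCut of the current frame

Here r is a residual capacity, c an edge capacity and f a flow; primed symbols belong to the current frame and unprimed ones to the previous frame.

Updation Step:

$$r'_i = r_i + c'_i - c_i$$

Reparameterization Step:

$$r'_{si} = 0$$

$$r'_{it} = c_{it} - f_{it} + f_{si} - c'_{si}$$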
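The two steps can be sketched for a single vertex i and its terminal edges as follows. The function name `update_source_edge` is hypothetical, and the trigger for reparameterization (the updated residual becoming negative because the old flow exceeds the new capacity) is an assumption taken from the standard dynamic graph cuts formulation; the slides give only the formulas.

```python
def update_source_edge(c_si, f_si, c_it, f_it, c_si_new):
    """Return (r_si', r_it') after the capacity of edge s->i changes to c_si_new.

    c_* are previous-frame capacities and f_* previous-frame flows.
    """
    r_si = c_si - f_si
    r_it = c_it - f_it
    # Updation step: r' = r + c' - c
    r_si_new = r_si + c_si_new - c_si
    if r_si_new >= 0:
        return r_si_new, r_it
    # Reparameterization step (assumed to apply only when r' < 0):
    # r_si' = 0 and r_it' = c_it - f_it + f_si - c_si'
    return 0, c_it - f_it + f_si - c_si_new


print(update_source_edge(c_si=10, f_si=6, c_it=8, f_it=6, c_si_new=12))
print(update_source_edge(c_si=10, f_si=6, c_it=8, f_it=6, c_si_new=3))
```

In the second call the old flow of 6 no longer fits in the new capacity of 3, so the same constant is effectively added to both terminal edges of i, which leaves the minimum cut unchanged while keeping all residuals non-negative.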

Dynamic Cuts are parallelizable

• Updation and reparameterization are independent, parallelizable operations that work locally at every vertex.

• st-Mincut is performed using a parallel implementation of Push Relabel algorithm.

Dynamic Cuts Empirically

• Running time depends on the percentage of weights that changed

• On a low-resolution video, dynamic cuts takes about 2 ms compared to 7 ms on the same image for the st-MinCut.

[Figure: consecutive frames of a video segmented using dynamic cuts]

The Multilabeling problem

• Multi-way cut on a general graph is an NP-hard problem when the number of labels L > 2

• Approximate solutions based on graph cuts: α-Expansion and αβ-Swap

The α-Expansion

1: Initialize the MRF with an arbitrary labeling X

2: For each label α ∈ L do

3: Construct the graph based on the current configuration

4: Perform one α-Expansion step (st-cut)

5: Update the configuration if energy decreases

6: End For

7: Repeat steps 2 to 6 till convergence.

Steps 2-6 form a cycle and steps 3-5 form an iteration.
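The control flow of steps 1 to 7 can be illustrated with the sketch below. To stay short and self-contained, the expansion move in step 4 is solved by exhaustive enumeration instead of an st-cut, which only works for a handful of pixels; the function names, the Potts penalty, and the four-pixel example are assumptions for illustration.

```python
from itertools import product


def energy(labels, unary, edges, penalty):
    """Unary costs plus a Potts penalty on every edge with differing labels."""
    data = sum(unary[p][label] for p, label in enumerate(labels))
    smooth = sum(penalty for p, q in edges if labels[p] != labels[q])
    return data + smooth


def expansion_move(labels, alpha, unary, edges, penalty):
    """Best labeling in which each pixel keeps its label or switches to alpha.

    Brute-force stand-in for the st-cut of step 4 (tiny problems only).
    """
    best, best_energy = list(labels), energy(labels, unary, edges, penalty)
    for switch in product([False, True], repeat=len(labels)):
        candidate = [alpha if s else old for s, old in zip(switch, labels)]
        e = energy(candidate, unary, edges, penalty)
        if e < best_energy:
            best, best_energy = candidate, e
    return best, best_energy


def alpha_expansion(unary, edges, penalty, num_labels):
    labels = [0] * len(unary)                       # step 1: arbitrary labeling
    current = energy(labels, unary, edges, penalty)
    improved = True
    while improved:                                 # step 7: repeat till convergence
        improved = False
        for alpha in range(num_labels):             # steps 2-6: one cycle
            candidate, e = expansion_move(labels, alpha, unary, edges, penalty)
            if e < current:                         # step 5: accept if energy decreases
                labels, current, improved = candidate, e, True
    return labels, current


# Hypothetical chain of 4 pixels with 3 labels.
unary = [[0, 5, 5], [4, 1, 5], [5, 1, 4], [5, 5, 0]]
edges = [(0, 1), (1, 2), (2, 3)]
result = alpha_expansion(unary, edges, penalty=2, num_labels=3)
print(result)
```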

Incremental α-Expansion

• Reusability of flows, as in dynamic MRFs: better initialization for the next graph cut

• Incremental/Dynamic: reuse the flows from label to label and recycle flows from cycle to cycle.

[Figure: flow reuse across Cycle 1 and Cycle 2 for Input, Label1, Label2, Label3]

Incremental α-Expansion Results

[Figure: results on the Tsukuba, Teddy, Penguin and Panorama datasets]

Incremental α-Expansion Results

Total Timings on Different Datasets

Processing on High Detailed Scene

• High Detailed Scene

- High-resolution image

- High dynamic range of colors

- Wide view angles

• Challenges

- High computation cost

- High memory requirement

- Interaction with high-resolution images

• Statistics of image sizes available on Google Images

- An overwhelming fraction of images are of size 2 to 10 million pixels.

- Only 0.6% or fewer of the images had more than 40 megapixels.

Processing on High Detailed Scene

Solve an optimization problem at the coarser level, and use it to dynamically update the optimization instance at the next level for better initialization.

[Figure: define E(x) for the coarsest level → final result for this level → define E(x) for the next finer level → final result at this level]

Pyramid Reparameterization

A pyramid is constructed, with the largest input image at the base of the pyramid.

Each pixel of a coarser-level image is the mean of 4 pixels at the previous finer level.
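A minimal sketch of this pyramid construction is shown below, assuming that the 4 pixels are a non-overlapping 2x2 block and that image dimensions are even (a trailing odd row or column would be dropped). The function names `downsample` and `build_pyramid` and the `min_size` parameter are hypothetical.

```python
def downsample(image):
    """Halve each dimension; every output pixel is the mean of a 2x2 block."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def build_pyramid(image, min_size=1):
    """Return [finest, ..., coarsest]; the input image is the base level."""
    levels = [image]
    while len(levels[-1]) >= 2 * min_size and len(levels[-1][0]) >= 2 * min_size:
        levels.append(downsample(levels[-1]))
    return levels


# Hypothetical 4x4 grayscale image.
image = [
    [0, 2, 4, 6],
    [2, 4, 6, 8],
    [8, 8, 0, 0],
    [8, 8, 0, 0],
]
pyramid = build_pyramid(image)
print([len(level) for level in pyramid], pyramid[1])
```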

Pyramid Construction

[Figure: at level (i-1): Image (i-1) → Graph G(V,E) → Minimization at (i-1) → Segmented Image at (i-1) and Residual Graph G(V,E)]

Upsampling Step

[Figure: the upsampled initial graph and upsampled residual graph of the previous level, alongside the graph at the current level. Running graph cuts directly on the current-level graph to obtain its final residual graph is computationally expensive. Using the difference between the upsampled graph of the previous level and the graph at the current level allows reuse of flows and gives a cheaper solution.]

Upsampling Rules

[Figure: rules for graph upsampling and residual graph upsampling, each followed by graph cuts, stated in terms of the terminal edge weights w_sj, w_sk, w_it, w_jt, w_kt and the neighbor edge weight w_jk]

Image Segmentation Results

Horse, 3.3 MP (2048x1600)

Image Segmentation Results

Interactive Image Segmentation Tool

User interaction is important in foreground/background separation.

• Results of a user study on the image size preferred for comfortable manipulation, for two display sizes.

• Average subjective response for six image sizes.

• Images that are larger than the display are disfavored by users.

[Figure panels: User 1, User 2]

Pyramid Segmentation System

Actual Image

Display Window

• The user interacts with a display window of comfortable size.

• Quick Segment:

- Displays the segmentation result on this display image

- Provides a perceptual response so that the user can start planning further interactions

• The actual segmentation goes on in the background on the other levels.

User Study

Results of the user study on the CPU and GPU versions of Pyramid Segmentation, compared with GrabCut and Quick Selection

[Figure panels: Interaction Time, Response Time, Total Time, Subjective Response of Users]

Multiresolution alpha-expansion

- Build pyramid of graphs.

- Perform alpha-expansion on the lower-resolution graph.

- Save the initial and final residual graphs for all the labels.

- Upsample and reparameterize using the previous resolution's initial and final graphs and the current resolution's initial graph.

- Perform alpha-expansion at this level.

- Repeat this for all the levels in the pyramid.

Stereo Correspondence

Image size – 1328 x 1104

Number of Labels (Disparity) – 200 - 290

Stereo Correspondence

Running time in seconds for stereo correspondence using Pyramid Cuts on the GPU (G-PyCut), Pyramid Cuts on the CPU (C-PyCut), and single-level Graph Cuts (GCuts).

[Figure panels: Optimization Time; Total Time (optimization time + graph construction time + energy function calculation time)]

A speed up of 5-6 times on the CPU is observed.

Image Denoising

Image size – 1000 x 1000

Number of Labels (Intensity) – 256

Image Denoising

Running time in seconds for image denoising using Pyramid Cuts on the GPU (G-PyCut), Pyramid Cuts on the CPU (C-PyCut), and single-level Graph Cuts (GCuts).

[Figure panels: Optimization Time; Total Time (optimization time + graph construction time + energy function calculation time)]

A speed up of 5-6 times on the CPU is observed.

Future Work

• Higher order Interactions of variables in MRF

• Computationally more challenging

• Modelling this on our hierarchical and multiresolution framework

• Using multiple GPUs to parallelize the alpha-expansion

• Better interactive tools:

• Both global and local interactions

Conclusion

• Two methods to optimize basic graph cuts algorithm

• Using facilities provided by parallel accelerators like GPU

• Modelling graph cuts on hierarchical and dynamic framework for better initialization

• Graph cuts methods have proved instrumental in solving many computationally challenging problems

• Successes of graph cuts -> Promising future in the realm of energy minimization methods

Related Publications

• P. J. Narayanan, Vibhav Vineet and Timo Stich. Fast Graph Cuts on the GPU. GPU Computing Gems (GCG), Volume 1, Dec. 2010 (Book Chapter).

• Vibhav Vineet and P. J. Narayanan. Solving Multi-label MRFs using incremental alpha-expansion move on the GPUs. In Proceedings of the Ninth Asian Conference on Computer Vision (ACCV-2009), China, 2009.

• Vibhav Vineet and P. J. Narayanan. CUDA Cuts: Fast Graph Cuts on the GPU. In Proceedings of the CVPR Workshop on Visual Computer Vision on GPUs (CVGPU-2008), Alaska, USA, 2008.

• Vibhav Vineet, Pawan Harish, Suryakant Patidar and P. J. Narayanan. Fast Minimum Spanning Tree for Large Graphs on the GPU. In Proceedings of ACM SIGGRAPH High Performance Graphics (HPG-2009), New Orleans, LA, USA, 2009.

• Pawan Harish, Vibhav Vineet and P. J. Narayanan. Large Graph Algorithms for Massively Multithreaded Architectures. IIIT Tech Report, IIIT/TR/2009/74.

• CUDA Cuts: Fast Graph Cuts on the GPU. http://cvit.iiit.ac.in/index.php?page=resources. (Software).

• Vibhav Vineet, Pawan Harish, Suryakant Patidar and P. J. Narayanan. Fast Minimum Spanning Tree for Large Graphs on the GPU. GPU Computing Gems (GCG). (Book Chapter).

Changes made to the thesis

Reviewer1 (Dr. Kishore)

• Tables containing experimental results on more standard images (around 60 images) added.

• Sections on related and background works expanded.

Reviewer 2 (Dr. Srinivasan)

• Missed references added to the related work section.

• Formal results on speed-up for dynamic graph cuts on video segmentation added to the results section.

• Figure captions properly referenced to the paper of Kohli and Torr.

• Other minor changes made as recommended.

Thank You
