Post on 28-Mar-2015

Mean-Field Theory and Its Applications in Computer Vision 3

Gaussian Pairwise Potential

• The kernel combines a spatial term and a range term

• Expensive message passing can be performed by cross-bilateral filtering

Cross bilateral filter

$$BF[I]_p = \frac{1}{W_p} \sum_{q \in S} G_{\sigma_s}(\|p - q\|)\, G_{\sigma_r}(|I_p - I_q|)\, I_q$$

[Figure: input and output images, reproduced from [Durand 02]]
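To make the filter concrete, here is a minimal brute-force sketch in Python with NumPy. It weights every neighbour q by a spatial Gaussian on pixel distance and a range Gaussian on the difference of guide intensities, then normalises. The function name `cross_bilateral_filter` and the separate `values` and `guide` arguments are illustrative assumptions, not taken from the slides. Passing the same array for both gives the plain bilateral filter.

```python
import numpy as np

def cross_bilateral_filter(values, guide, sigma_s, sigma_r):
    """Brute-force cross-bilateral filter on 2-D arrays.

    Illustrative only: cost is quadratic in the number of pixels.
    """
    h, w = guide.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pos = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)  # (N, 2)
    g = guide.ravel().astype(float)
    v = values.ravel().astype(float)
    # Squared spatial distance ||p - q||^2 for every pixel pair.
    d_space = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(axis=-1)
    # Squared range distance |I_p - I_q|^2, measured on the guide image.
    d_range = (g[:, None] - g[None, :]) ** 2
    weights = np.exp(-d_space / (2 * sigma_s**2) - d_range / (2 * sigma_r**2))
    # Normalise by W_p, the sum of weights for each output pixel p.
    out = (weights @ v) / weights.sum(axis=1)
    return out.reshape(h, w)
```

With a small range standard deviation, pixels on opposite sides of a strong edge in the guide barely influence each other, so the edge is preserved. The quadratic cost of this direct form is the reason for the efficient approximations that follow.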

Efficient Cross-Bilateral Filtering

• Based on the permutohedral lattice (PLBF) [2]
  • Embed the points on the permutohedral lattice
  • Apply Gaussian blurring

• Based on the domain transform (DTBF) [3]
  • Project the points to a lower dimension
  • Perform filtering in the transformed domain

• Filtering in the frequency domain
  • Apply the fast Fourier transform (FFT)
  • Convolution in the spatial (s) domain = multiplication in the frequency (f) domain
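The last point is the convolution theorem. The sketch below checks it numerically with NumPy for a 1-D circular convolution. The function names are illustrative assumptions, and the circular boundary handling is chosen for simplicity rather than taken from the slides.

```python
import numpy as np

def fft_circular_convolve(signal, kernel):
    """Circular convolution computed as a product in the frequency domain."""
    n = len(signal)
    # np.fft.fft(kernel, n) zero-pads the kernel to the signal length.
    spectrum = np.fft.fft(signal) * np.fft.fft(kernel, n)
    return np.real(np.fft.ifft(spectrum))

def direct_circular_convolve(signal, kernel):
    """Reference implementation: the same convolution as an explicit sum."""
    n = len(signal)
    out = np.zeros(n)
    for i in range(n):
        for j in range(len(kernel)):
            out[i] += kernel[j] * signal[(i - j) % n]
    return out
```

The FFT route costs O(n log n) regardless of kernel length, whereas the direct sum costs O(n k) for a kernel of length k.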

Barycentric Interpolation

Permutohedral Lattice based filtering

• For each pixel (X, Y)

• Downsample all the points (dependent on the standard deviations):

$$(x, y, z) = \left(\frac{X}{\sigma_s},\ \frac{Y}{\sigma_s},\ \frac{I(X, Y)}{\sigma_r}\right)$$
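As a small illustration of this step, the sketch below builds one feature vector per pixel by dividing the pixel position by the spatial standard deviation and the intensity by the range standard deviation. It assumes a single-channel image; the function name `scaled_features` is hypothetical.

```python
import numpy as np

def scaled_features(image, sigma_s, sigma_r):
    """Return an (H, W, 3) array of (X / sigma_s, Y / sigma_s, I / sigma_r)."""
    h, w = image.shape
    Y, X = np.mgrid[0:h, 0:w]  # row index is Y, column index is X
    return np.stack([X / sigma_s, Y / sigma_s, image / sigma_r], axis=-1)

# Example: a 3x4 ramp image, spatial sigma 2, range sigma 4.
features = scaled_features(np.arange(12.0).reshape(3, 4), 2.0, 4.0)
```

After this scaling a unit-variance Gaussian in feature space corresponds to the desired spatial and range kernel, so larger standard deviations map the points to a coarser set of lattice cells.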

Embed to the permutohedral lattice

• Embed each downsampled point in the lattice

Gaussian blurring

• Apply Gaussian blurring along each lattice axis
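Blurring one axis at a time works because a Gaussian is separable. The sketch below shows the idea on an ordinary rectangular grid, which is a simplifying assumption: the permutohedral lattice blurs along its own lattice directions instead. The kernel radius and function names are illustrative.

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    """Normalised 1-D Gaussian kernel with 2 * radius + 1 taps."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def blur_along_axes(grid, sigma=1.0, radius=2):
    """Apply the same 1-D Gaussian along every axis of the grid in turn."""
    k = gaussian_kernel(sigma, radius)
    out = np.asarray(grid, dtype=float)
    for axis in range(out.ndim):
        out = np.apply_along_axis(
            lambda v: np.convolve(v, k, mode="same"), axis, out
        )
    return out
```

For a d-dimensional grid this costs d one-dimensional passes rather than one pass with a full d-dimensional kernel.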

Slicing

• Upsample the points by interpolating the blurred lattice values back at the original positions

PLBF

• Final upsampled points
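The whole pipeline (downsample, embed, blur, upsample) can be sketched on a 1-D signal. The version below deliberately swaps in a regular grid with nearest-neighbour embedding in place of the permutohedral lattice with barycentric weights, so it is a simplified stand-in and not the PLBF algorithm itself. The stage names splat, blur and slice follow the usual permutohedral-lattice terminology; the function name and the padding value are assumptions.

```python
import numpy as np

def grid_bilateral_1d(signal, sigma_s, sigma_r):
    """Bilateral filtering of a 1-D signal on a coarse (position, value) grid."""
    signal = np.asarray(signal, dtype=float)
    n = len(signal)
    pad = 2  # empty border so the blur never runs off the grid
    # Downsample: express position and value in units of the std deviations.
    gx = np.rint(np.arange(n) / sigma_s).astype(int) + pad
    gz = np.rint((signal - signal.min()) / sigma_r).astype(int) + pad
    vals = np.zeros((gx.max() + pad + 1, gz.max() + pad + 1))
    wts = np.zeros_like(vals)
    # Splat: accumulate each sample and a unit weight into its grid cell.
    np.add.at(vals, (gx, gz), signal)
    np.add.at(wts, (gx, gz), 1.0)
    # Blur: small separable kernel along each grid axis.
    k = np.array([1.0, 2.0, 1.0]) / 4.0
    blur = lambda v: np.convolve(v, k, mode="same")
    for axis in (0, 1):
        vals = np.apply_along_axis(blur, axis, vals)
        wts = np.apply_along_axis(blur, axis, wts)
    # Slice: read the blurred cells back at the sample positions, normalised.
    return vals[gx, gz] / wts[gx, gz]
```

Because the blur acts on grid cells rather than on pixels, the cost of the middle stage shrinks as the standard deviations grow.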

Domain Transform Filtering

• Project the points to a low-dimensional space, preserving their distances in the high-dimensional space

• Perform the filtering in the low-dimensional space

• Project the result back to the original space

Distance in high-dimensional space

Filtering in high-dimensional space

• Filtering jointly over the spatial and range dimensions is inefficient

Projection to low-dimensional space

• Project to a low-dimensional space
• Maintain the geodesic distances of the high-dimensional space
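For a 1-D signal, one way to realise such a projection is the domain transform of Gastal and Oliveira: each sample is moved to a new 1-D coordinate whose spacing grows with the local intensity change, so that distance along the new axis approximates distance along the curve in (position, intensity) space. The sketch below assumes this is the transform meant by DTBF; the function name is hypothetical.

```python
import numpy as np

def domain_transform_1d(signal, sigma_s, sigma_r):
    """Map sample indices to transformed coordinates ct.

    ct[i] = sum over j <= i of (1 + (sigma_s / sigma_r) * |I[j] - I[j-1]|)
    """
    signal = np.asarray(signal, dtype=float)
    # Absolute forward difference; the first sample has no left neighbour.
    grad = np.abs(np.diff(signal, prepend=signal[0]))
    return np.cumsum(1.0 + (sigma_s / sigma_r) * grad)

# Example: a jump of 10 between samples 2 and 3 stretches the coordinate axis.
ct = domain_transform_1d([0, 0, 0, 10, 10], sigma_s=2.0, sigma_r=1.0)
```

Samples separated by an edge end up far apart in the transformed coordinate, so a purely spatial filter applied there behaves like an edge-aware filter in the original domain.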

Gaussian blurring in low-dimensional space

• Apply Gaussian blurring in the low-dimensional space

Project

• Project the blurred values back to the original space
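Putting the three stages together for a 1-D signal gives the sketch below. It assumes the Gastal-Oliveira form of the domain transform and, for clarity, blurs with a brute-force Gaussian over all sample pairs in the transformed coordinate. Practical implementations use linear-time recursive or box filters, which are not shown. Because every sample keeps its index, projecting back simply means reading the blurred value at the same index.

```python
import numpy as np

def domain_transform_filter_1d(signal, sigma_s, sigma_r):
    """Edge-aware smoothing of a 1-D signal via a 1-D coordinate transform."""
    signal = np.asarray(signal, dtype=float)
    # 1. Project: transformed coordinate grows faster across intensity edges.
    grad = np.abs(np.diff(signal, prepend=signal[0]))
    ct = np.cumsum(1.0 + (sigma_s / sigma_r) * grad)
    # 2. Blur: Gaussian weights on distances in the transformed coordinate.
    d2 = (ct[:, None] - ct[None, :]) ** 2
    weights = np.exp(-d2 / (2.0 * sigma_s**2))
    # 3. Project back: normalised average, stored at the original index.
    return (weights @ signal) / weights.sum(axis=1)
```

On a clean step edge with a small range standard deviation the two sides are too far apart in the transformed coordinate to mix, so the step passes through unchanged.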

PLBF Vs DTBF

• Filter parameters:
  • PLBF runtime is inversely proportional to the kernel size defined over space and range
  • Use PLBF with a relatively large (~10) range
  • Use DTBF with a relatively small (~1-2) range

• Processing time:
  • Both are linear in the number of pixels

Filtering in frequency domain

Convergence

• Iteration vs. KL-divergence value
• In theory: convergence is not guaranteed, since the updates are performed in parallel
• In practice: convergence is observed
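A minimal sketch of how such a curve can be produced is given below. It assumes a small CRF with unary costs `U` (one row per variable, one column per label) and a Potts pairwise cost `W[i, j]` paid when variables i and j take different labels, with `W` symmetric and zero on the diagonal. These names, and the dense matrix product standing in for the filtering-based message passing, are illustrative assumptions. The KL-divergence is tracked up to the constant log-partition term, which does not change with the iteration.

```python
import numpy as np

def softmax(a):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    a = a - a.max(axis=1, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=1, keepdims=True)

def kl_up_to_const(Q, U, W):
    """KL(Q || P) minus log Z: negative entropy plus expected energy."""
    neg_entropy = np.sum(Q * np.log(Q + 1e-12))
    unary = np.sum(Q * U)
    agree = Q @ Q.T  # agree[i, j] = probability that i and j share a label
    pairwise = 0.5 * np.sum(W * (1.0 - agree))  # each pair counted once
    return neg_entropy + unary + pairwise

def mean_field(U, W, iters=10):
    """Parallel mean-field updates; returns marginals and the KL history."""
    Q = softmax(-U)
    history = [kl_up_to_const(Q, U, W)]
    for _ in range(iters):
        # All variables are updated at once from the previous Q.
        Q = softmax(-U + W @ Q)
        history.append(kl_up_to_const(Q, U, W))
    return Q, history
```

Plotting `history` against the iteration index gives the kind of curve the slide refers to. Since every marginal is refreshed simultaneously, nothing in this scheme forces the sequence to decrease monotonically.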

MSRC-21 dataset

• 591 colour images, 320x213 size, 21 object classes

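| Method | Runtime | Standard ground truth: Global (%) | Standard ground truth: Average (%) | Accurate ground truth: Global (%) | Accurate ground truth: Average (%) |
|---|---|---|---|---|---|
| Unary classifiers | - | 84.0 | 76.6 | 83.2±1.5 | 80.6±2.3 |
| Grid CRF | 1 sec | 84.6 | 77.2 | 84.8±1.5 | 82.4±1.8 |
| Robust P^n | 30 sec | 84.9 | 77.5 | 86.5±1.0 | 83.1±1.5 |
| Dense CRF | 0.2 sec | 86.0 | 78.3 | 88.2±0.7 | 84.7±0.7 |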

PascalVOC-10 dataset

| Method | Runtime | Overall (%) | Av. Recall (%) | Av. I/U (%) |
|---|---|---|---|---|
| Dense CRF | 0.67 sec | 71.63 | 34.53 | 28.4 |

Long-range connections

• Accuracy as the spatial and range standard deviations are increased
• On MSRC-21: spatial standard deviation 61 pixels, range standard deviation 11

• Sometimes propagates misleading information

Mean-field Vs. Graph-cuts

• Measure the I/U score on PascalVOC-10 segmentation
• Increase the standard deviation for mean-field
• Increase the window size for the graph-cuts method

• Both achieve similar accuracy

• The time complexity of graph-cuts is very high, making it infeasible to work with a large neighbourhood system
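For reference, the I/U (intersection over union) score used in this comparison can be computed per class and then averaged, as in the sketch below. The function name is hypothetical, and skipping classes that appear in neither the prediction nor the ground truth is one common convention, assumed here rather than taken from the slides.

```python
import numpy as np

def average_iou(pred, gt, num_classes):
    """Mean over classes of |pred AND gt| / |pred OR gt| for label arrays."""
    pred = np.asarray(pred)
    gt = np.asarray(gt)
    ious = []
    for c in range(num_classes):
        p, g = pred == c, gt == c
        union = np.logical_or(p, g).sum()
        if union == 0:
            continue  # class absent from both maps: skip it (assumption)
        ious.append(np.logical_and(p, g).sum() / union)
    return float(np.mean(ious))

# Example: class 0 scores 1/2 and class 1 scores 2/3, so the mean is 7/12.
score = average_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

Unlike overall pixel accuracy, this score penalises both missed pixels and false positives for every class, so small classes count as much as large ones.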
