Multi-view object segmentation in space and time
Abdelaziz Djelouah, Jean Sebastien Franco, Edmond Boyer
Outline
• Addressed problem • Method • Results and Conclusion
Addressed problem
Automatic segmentation of a single object seen from multiple calibrated cameras
Method
• Graph cuts for segmentation • Basic idea of this paper • Overview • Formulation • Algorithm
Graph cuts for segmentation
Yuri Y. Boykov and Marie-Pierre Jolly ICCV 2001
Graph cuts for segmentation
Advantages: 1. Clearly defined cost function 2. Globally optimal solution
Graph cuts for segmentation
Preliminaries:
• $\mathcal{P}$: the set of pixels
• $\mathcal{N}$: the set of pairs of neighboring pixels
• $A = (A_1, \dots, A_p, \dots, A_{|\mathcal{P}|})$: a binary vector defining a segmentation
• $A_p \in \{\text{"obj"}, \text{"bkg"}\}$: the assignment to pixel $p$
Cost function:
$$E(A) = \lambda \sum_{p \in \mathcal{P}} R_p(A_p) + \sum_{\{p,q\} \in \mathcal{N}} B_{\{p,q\}} \, \delta(A_p, A_q)$$
where
$$\delta(A_p, A_q) = \begin{cases} 1 & \text{if } A_p \neq A_q \\ 0 & \text{otherwise} \end{cases}$$
and
$$R_p(\text{"obj"}) = -\ln \Pr(I_p \mid \mathcal{O}), \qquad R_p(\text{"bkg"}) = -\ln \Pr(I_p \mid \mathcal{B})$$
$$B_{\{p,q\}} \propto \exp\left(-\frac{(I_p - I_q)^2}{2\sigma^2}\right) \cdot \frac{1}{\operatorname{dist}(p,q)}$$
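As a rough illustration of the cost function above, the following sketch evaluates $E(A)$ on a four-pixel toy image. The intensities, the likelihood model (Pr(I|O) = I, Pr(I|B) = 1 - I), and the function names are assumptions made only for this example; they are not taken from the slides or from Boykov and Jolly's paper.

```python
import math

def boundary_weight(i_p, i_q, sigma, dist=1.0):
    """B_{p,q} up to a constant factor: close to 1 for similar intensities."""
    return math.exp(-((i_p - i_q) ** 2) / (2.0 * sigma ** 2)) / dist

def energy(labels, r_obj, r_bkg, pairs, lam=1.0):
    """E(A) = lam * sum_p R_p(A_p) + sum_{p,q} B_{p,q} * delta(A_p, A_q)."""
    regional = sum(r_obj[p] if a == "obj" else r_bkg[p]
                   for p, a in enumerate(labels))
    # delta(A_p, A_q) is 1 only where the two labels differ.
    boundary = sum(b for (p, q), b in pairs.items() if labels[p] != labels[q])
    return lam * regional + boundary

# Hypothetical 1-D "image" of four pixels with intensities in (0, 1).
intensities = [0.9, 0.8, 0.2, 0.1]
# Toy likelihoods (an assumption): Pr(I|O) = I and Pr(I|B) = 1 - I.
r_obj = [-math.log(i) for i in intensities]
r_bkg = [-math.log(1.0 - i) for i in intensities]
# Neighboring pairs along the line, weighted by intensity similarity.
pairs = {(p, p + 1): boundary_weight(intensities[p], intensities[p + 1], sigma=0.5)
         for p in range(3)}

good = ["obj", "obj", "bkg", "bkg"]   # bright pixels labeled as object
bad = ["bkg", "bkg", "obj", "obj"]    # labels swapped
print(energy(good, r_obj, r_bkg, pairs), energy(bad, r_obj, r_bkg, pairs))
```

Both labelings cut the same boundary edge, so the difference in energy comes entirely from the regional term.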
Graph cuts for segmentation
Graph
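• $\mathcal{G} = \langle \mathcal{V}, \mathcal{E} \rangle$: a set of nodes $\mathcal{V}$ and a set of edges $\mathcal{E}$
• $w_e$: nonnegative weight of edge $e$
• $S$ and $T$: two terminal nodes
• t-links: edges between nodes and terminals
• n-links: all other edges
Cut
• s-t cut: a subset of edges $C \subset \mathcal{E}$ such that $S$ and $T$ become separated on $\mathcal{G}(C) = \langle \mathcal{V}, \mathcal{E} \setminus C \rangle$
• Cost of cut:
$$|C| = \sum_{e \in C} w_e$$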
Graph cuts for segmentation
segmentation ⇔ cut
Graph cuts for segmentation
Feasible cut $C$:
1. $C$ severs exactly one t-link at each $p$
2. $\{p, q\} \in C$ iff $p$ and $q$ are t-linked to different terminals on $\mathcal{G}(C)$
Feasible cut $C$ ⇔ Segmentation $A(C)$
$$A_p(C) = \begin{cases} \text{"obj"} & \text{if } \{p, T\} \in C \\ \text{"bkg"} & \text{if } \{p, S\} \in C \end{cases}$$
Min cut $\hat{C}$ on $\mathcal{G}$ is feasible
Graph cuts for segmentation
Edge weights:

Edge        Weight (cost)                  For
{p, q}      $B_{\{p,q\}}$                  $\{p, q\} \in \mathcal{N}$
{p, S}      $\lambda \cdot R_p(\text{"bkg"})$    $p \in \mathcal{P}$
{p, T}      $\lambda \cdot R_p(\text{"obj"})$    $p \in \mathcal{P}$

Cost of a feasible cut = cost function:
$$|C| = \sum_{e \in C} w_e = \lambda \sum_{p \in \mathcal{P}} R_p(A_p(C)) + \sum_{\{p,q\} \in \mathcal{N}} B_{\{p,q\}} \, \delta(A_p(C), A_q(C)) = E(A(C))$$
Graph cuts for segmentation
• Feasible cut $C$ ⇔ Segmentation $A(C)$
• Cost of $C$ = cost function $E(A(C))$
• Min cut $\hat{C}$ on $\mathcal{G}$ is feasible
Minimize 𝐸𝐸(𝐴𝐴) ⇔ find a minimum s-t cut
Min-cut/max-flow algorithms (Boykov and Kolmogorov PAMI 2004)
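The equivalence above can be checked on a toy problem. The sketch below builds the graph with the edge weights from the earlier slide, runs a plain Edmonds-Karp max-flow (swapped in here for brevity; it is not the Boykov-Kolmogorov algorithm cited above), and compares the result with a brute-force search over all labelings. All numeric values and function names are hypothetical.

```python
from collections import deque
from itertools import product

def energy(labels, r_obj, r_bkg, pairs, lam=1.0):
    regional = sum(r_obj[p] if a == "obj" else r_bkg[p] for p, a in enumerate(labels))
    boundary = sum(b for (p, q), b in pairs.items() if labels[p] != labels[q])
    return lam * regional + boundary

def min_cut_labels(r_obj, r_bkg, pairs, lam=1.0):
    n = len(r_obj)
    S, T = n, n + 1
    cap = [[0.0] * (n + 2) for _ in range(n + 2)]
    for p in range(n):
        cap[S][p] = lam * r_bkg[p]      # t-link {p, S}
        cap[p][T] = lam * r_obj[p]      # t-link {p, T}
    for (p, q), b in pairs.items():     # n-link {p, q}, one arc per direction
        cap[p][q] += b
        cap[q][p] += b

    def reachable():
        # Breadth-first search over edges with remaining capacity.
        parent = {S: None}
        queue = deque([S])
        while queue:
            u = queue.popleft()
            for v in range(n + 2):
                if v not in parent and cap[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0.0
    while True:
        parent = reachable()
        if T not in parent:
            break                        # no augmenting path left
        path, v = [], T
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][w] for u, w in path)
        for u, w in path:
            cap[u][w] -= push
            cap[w][u] += push
        flow += push
    # Nodes still reachable from S keep their link to S, so they are "obj".
    labels = ["obj" if p in parent else "bkg" for p in range(n)]
    return labels, flow

# Hypothetical regional and boundary costs for four pixels in a row.
r_obj = [0.1, 0.2, 1.6, 2.3]
r_bkg = [2.3, 1.6, 0.2, 0.1]
pairs = {(0, 1): 0.9, (1, 2): 0.5, (2, 3): 0.9}

labels, flow = min_cut_labels(r_obj, r_bkg, pairs)
best = min(energy(list(a), r_obj, r_bkg, pairs)
           for a in product(["obj", "bkg"], repeat=4))
print(labels, flow, best)
```

The max-flow value equals the minimum cut cost, which in turn equals the minimum of $E(A)$ found by enumeration.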
Graph cuts for segmentation
Important notes
– Generally, directed graphs are used to solve energy minimization.
– Energies should satisfy the submodularity constraint.
– A general method of graph construction is available.
– More information can be found in Kolmogorov and Zabih, PAMI 2004.
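For a pairwise term over binary labels, the submodularity (regularity) condition from Kolmogorov and Zabih reads $E(0,0) + E(1,1) \le E(0,1) + E(1,0)$. A minimal check, with hypothetical example tables:

```python
INF = float("inf")

def is_submodular(e):
    """e[a][b] is the pairwise energy for binary labels a, b in {0, 1}."""
    return e[0][0] + e[1][1] <= e[0][1] + e[1][0]

# Potts-style term: penalize only differing labels.
potts = [[0.0, 0.7],
         [0.7, 0.0]]
# Term that rewards differing labels: not graph-representable.
anti_potts = [[0.7, 0.0],
              [0.0, 0.7]]
# Hard constraint: infinite cost on one mixed configuration only.
hard = [[0.0, 0.0],
        [INF, 0.0]]

print(is_submodular(potts), is_submodular(anti_potts), is_submodular(hard))
```

Terms that penalize disagreement pass the test, which is why smoothness terms of the kind used in this talk can be minimized with a single cut.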
Graph cuts for segmentation
Advantages: 1. Clearly defined cost function 2. Globally optimal solution
A better cost function gives a better segmentation.
Basic idea
• Use multi-view coherence
• Create inter-view links with 3D samples
Overview
• Initialize appearance model
• Divide each image into superpixels
• Label pixels
• Label superpixels
• Update appearance model
• Iterate until convergence
Formulation
• Preliminaries
  – $I^t = \{I^{1,t}, \dots, I^{n,t}\}$: the set of input images at instant $t$
  – $\mathcal{P}_i^t$: the set of superpixels $p$ in $I^{i,t}$
  – $x_p \in \{f, b\}$: the label of $p \in \mathcal{P}_i^t$
  – $\mathcal{S}^t$: the set of 3D samples $s$ uniformly sampled in the common visibility volume
  – $x_s \in \{f, b\}$: the label of $s \in \mathcal{S}^t$
Formulation
• Foreground and background models
  – $I_r^i$: descriptor of pixel $r$, an 11-dimensional vector encoding gradient magnitude response at 4 scales, Laplacian at 2 scales, and RGB values
  – $H_i^B$ and $H_i^F$: background and foreground histograms of pixel descriptors in $I^{i,t}$, computed on clusters of pixel descriptors
  – $H_i$: histogram of the whole image
  – $A_p$: descriptor of superpixel $p$, a histogram over clusters of pixel descriptors
[Figure: example histogram over clusters 1 to 4, vertical axis from 0 to 0.6]
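One way to read "histogram computed on clusters of pixel descriptors" is as a normalized count of how many descriptors fall nearest to each cluster center. The sketch below follows that reading. The 2-D toy descriptors (instead of 11-D), the Euclidean nearest-center rule, the given cluster centers, and the small smoothing constant `eps` are all assumptions for illustration; the slides do not specify how the clusters are obtained.

```python
def nearest_cluster(descriptor, centers):
    """Index of the center closest to the descriptor (squared Euclidean)."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(descriptor, center))
    return min(range(len(centers)), key=lambda k: sq_dist(centers[k]))

def cluster_histogram(descriptors, centers, eps=1e-6):
    """Normalized histogram of descriptors over cluster indices.

    eps keeps every bin strictly positive so that -ln(H) stays finite
    (a smoothing choice assumed here, not taken from the slides).
    """
    counts = [eps] * len(centers)
    for d in descriptors:
        counts[nearest_cluster(d, centers)] += 1.0
    total = sum(counts)
    return [c / total for c in counts]

# Hypothetical cluster centers and foreground pixel descriptors.
centers = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
fg_pixels = [(0.9, 1.1), (1.0, 0.8), (0.1, 0.1)]
hist = cluster_histogram(fg_pixels, centers)
print(hist)
```

The same function could serve for $H_i^F$, $H_i^B$, $H_i$, and the superpixel descriptor $A_p$ by changing which pixels are passed in.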
Formulation
• Foreground and background models – Model initialization
Formulation
• Energy principles
  – Individual appearance
    • The appearance of a superpixel should comply with the image-wide background or foreground model, depending on its label.
  – Appearance continuity
    • Neighbouring superpixels are likely to have the same label.
  – Appearance similarity
    • Superpixels with similar color/texture are likely to have the same label.
Formulation
• Energy principles
  – Multi-view coherence
    • 3D samples are considered object-consistent if they project to foreground regions with high likelihood.
  – Projection constraint
    • A superpixel should be foreground if it sees at least one object-consistent sample; otherwise it should be background.
  – Time consistency
    • Temporally linked superpixels are likely to have the same label.
Formulation
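• Energy terms
  – Individual appearance term, where $\mathcal{R}_p$ is the set of pixels in superpixel $p$:
$$E_c(x_p) = \begin{cases} \sum_{r \in \mathcal{R}_p} -\ln H_i^B(I_r^i) & \text{if } x_p = b \\ \sum_{r \in \mathcal{R}_p} -\ln H_i^F(I_r^i) & \text{if } x_p = f \end{cases}$$
  – Appearance continuity term
$$E_n(x_p, x_q) = \begin{cases} \exp\left(-\dfrac{d(A_p, A_q)^2}{2 \langle d(A_p, A_q) \rangle^2}\right) & \text{if } x_p \neq x_q \\ 0 & \text{otherwise} \end{cases}$$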
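The individual appearance term above sums a negative log-likelihood over the pixels of a superpixel. A minimal sketch, assuming each pixel has already been mapped to the index of its descriptor cluster and that all histogram bins are strictly positive (both assumptions of this example, with made-up histogram values):

```python
import math

def appearance_cost(pixel_bins, hist_fg, hist_bg, label):
    """E_c(x_p): sum over pixels r in the superpixel of -ln H(I_r).

    pixel_bins holds, for each pixel of the superpixel, the cluster index
    of its descriptor; label is "f" (foreground) or "b" (background).
    """
    hist = hist_fg if label == "f" else hist_bg
    return sum(-math.log(hist[k]) for k in pixel_bins)

# Hypothetical image-wide histograms over three descriptor clusters.
hist_fg = [0.7, 0.2, 0.1]
hist_bg = [0.1, 0.3, 0.6]
# A superpixel whose pixels mostly fall in cluster 0.
superpixel = [0, 0, 1, 0]

cost_f = appearance_cost(superpixel, hist_fg, hist_bg, "f")
cost_b = appearance_cost(superpixel, hist_fg, hist_bg, "b")
print(cost_f, cost_b)
```

Because cluster 0 is frequent in the foreground histogram, labeling this superpixel as foreground is the cheaper choice.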
Formulation
• Energy terms
  – Appearance similarity term
$$E_a(x_p, x_q) = \begin{cases} \exp\left(-\dfrac{d(A_p, A_q)^2}{2 \langle d(A_p, A_q) \rangle^2}\right) & \text{if } x_p \neq x_q \\ 0 & \text{otherwise} \end{cases}$$
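The continuity and similarity terms share the same contrast-sensitive form: the penalty for giving two superpixels different labels is large when their descriptors are close. The sketch below computes that penalty for a set of linked pairs. The slides do not say which distance $d$ is used or how $\langle \cdot \rangle$ is taken, so the chi-squared histogram distance and the mean over the given pairs are assumptions, as is the optional weight `theta`.

```python
import math

def chi2_distance(a, b):
    """Chi-squared distance between two histograms (an assumed choice of d)."""
    return 0.5 * sum((x - y) ** 2 / (x + y) for x, y in zip(a, b) if x + y > 0)

def pairwise_costs(descriptors, pairs, theta=1.0):
    """Cost paid by each pair (p, q) when x_p != x_q.

    Uses theta * exp(-d^2 / (2 * <d>^2)), with <d> taken as the mean
    distance over the given pairs (an assumption).
    """
    dists = {(p, q): chi2_distance(descriptors[p], descriptors[q]) for p, q in pairs}
    if not dists:
        return {}
    mean = sum(dists.values()) / len(dists)
    if mean == 0.0:
        return {pq: theta for pq in dists}   # all descriptors identical
    return {pq: theta * math.exp(-d ** 2 / (2.0 * mean ** 2))
            for pq, d in dists.items()}

# Hypothetical superpixel descriptors (histograms over three clusters).
descriptors = {
    "p": [0.8, 0.1, 0.1],
    "q": [0.7, 0.2, 0.1],
    "r": [0.1, 0.1, 0.8],
}
costs = pairwise_costs(descriptors, [("p", "q"), ("q", "r")])
print(costs)
```

Separating the similar pair ("p", "q") costs more than separating the dissimilar pair ("q", "r"), which pushes label boundaries toward appearance changes.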
Formulation
• Energy terms
  – Sample objectness term
$$E_s(x_s) = \begin{cases} -\ln(1 - P_s^f) & \text{if } x_s = b \\ -\ln P_s^f & \text{if } x_s = f \end{cases}$$
$$P_s^f = P(x_s = f \mid I_s^1, \dots, I_s^n) = \frac{P(x_s = f)\, P(I_s^1, \dots, I_s^n \mid x_s = f)}{P(I_s^1, \dots, I_s^n)} = \frac{\pi_F \prod_{i=1}^{n} H_i^F(I_s^i)}{\prod_{i=1}^{n} H_i(I_s^i)}$$
    where $\pi_F$ is the proportion of 3D samples from the object.
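The sample objectness computation above can be sketched as follows. The inputs are the histogram values looked up at the sample's projection in each view; the numbers are made up, and the clamping of the result to $(0, 1)$ is an assumption added so that both logarithms stay finite (the ratio of histogram products is not guaranteed to stay below 1).

```python
import math

def sample_foreground_probability(fg_values, image_values, pi_f, eps=1e-6):
    """P_s^f = pi_F * prod_i H_i^F(I_s^i) / prod_i H_i(I_s^i).

    fg_values[i] is the foreground histogram value and image_values[i]
    the whole-image histogram value at the projection of s in view i.
    The clamp to [eps, 1 - eps] is an assumption of this sketch.
    """
    p = pi_f * math.prod(fg_values) / math.prod(image_values)
    return min(max(p, eps), 1.0 - eps)

def sample_cost(p_f, label):
    """E_s(x_s): -ln(P_s^f) for label "f", -ln(1 - P_s^f) for label "b"."""
    return -math.log(p_f) if label == "f" else -math.log(1.0 - p_f)

# Hypothetical values for one sample seen in two views.
p_f = sample_foreground_probability(fg_values=[0.5, 0.6],
                                    image_values=[0.4, 0.5],
                                    pi_f=0.3)
print(p_f, sample_cost(p_f, "f"), sample_cost(p_f, "b"))
```

A sample that projects onto foreground-looking regions in every view gets a high $P_s^f$ and is therefore cheap to label as foreground.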
Formulation
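• Energy terms
  – Sample-pixel junction term
$$E_j(x_s, x_p) = \begin{cases} \infty & \text{if } x_s = f \text{ and } x_p = b \\ 0 & \text{otherwise} \end{cases}$$
  – Sample projection term
$$E_p(x_p) = \begin{cases} -\ln\left(1 - P(x_p \mid \mathcal{V}_p)\right) & \text{if } x_p = b \\ -\ln P(x_p \mid \mathcal{V}_p) & \text{if } x_p = f \end{cases}$$
    where $P(x_p \mid \mathcal{V}_p) = \max_{s \in \mathcal{V}_p} P_s^f$ and $\mathcal{V}_p$ is the set of samples seen by superpixel $p$.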
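The two terms above can be sketched together: the junction term forbids a foreground sample from projecting into a background superpixel, and the projection term scores a superpixel by the most object-consistent sample it sees. Treating a superpixel that sees no sample as having probability `eps`, and clamping to keep the logarithms finite, are assumptions of this example.

```python
import math

INF = float("inf")

def junction_cost(x_s, x_p):
    """E_j: infinite cost if sample s is foreground but superpixel p is background."""
    return INF if (x_s == "f" and x_p == "b") else 0.0

def projection_probability(sample_probs, eps=1e-6):
    """P(x_p | V_p): maximum P_s^f over the samples seen by superpixel p.

    An empty list (no visible sample) falls back to eps (an assumption).
    """
    p = max(sample_probs, default=0.0)
    return min(max(p, eps), 1.0 - eps)

def projection_cost(sample_probs, label):
    """E_p(x_p): -ln(P) for label "f", -ln(1 - P) for label "b"."""
    p = projection_probability(sample_probs)
    return -math.log(p) if label == "f" else -math.log(1.0 - p)

# Hypothetical P_s^f values of the samples seen by one superpixel.
seen = [0.2, 0.9, 0.4]
print(projection_cost(seen, "f"), projection_cost(seen, "b"))
print(projection_cost([], "f"), projection_cost([], "b"))
```

One confident sample is enough to make the foreground label cheap, which matches the projection constraint stated earlier in the talk.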
Formulation
• Energy terms
  – Time consistency term
$$E_f(x_p^t, x_q^{t+1}) = \begin{cases} \theta_f \exp\left(-\dfrac{d(A_p^t, A_q^{t+1})^2}{2 \langle d(A_p^t, A_q^{t+1}) \rangle^2}\right) & \text{if } x_p^t \neq x_q^{t+1} \\ 0 & \text{otherwise} \end{cases}$$
Formulation
• Energy function
  – With submodularity satisfied, the minimization is solved by graph cuts.
Algorithm
Results and conclusion
• Conclusion
  – A unified framework dealing with intra-view, inter-view, and temporal cues
  – Inter-view propagation of segmentation information using 3D samples
References
[1] Abdelaziz Djelouah et al., Multi-View Object Segmentation in Space and Time, ICCV 2013.
[2] Abdelaziz Djelouah et al., N-tuple Color Segmentation for Multi-View Silhouette Extraction, ECCV 2012.
[3] Yuri Y. Boykov, Marie-Pierre Jolly, Interactive Graph Cuts for Optimal Boundary & Region Segmentation of Objects in N-D Images, ICCV 2001.
[4] Yuri Boykov, Vladimir Kolmogorov, An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision, PAMI 2004.
[5] Vladimir Kolmogorov, Ramin Zabih, What Energy Functions Can Be Minimized via Graph Cuts?, PAMI 2004.
Thanks!