ICVSS2008: Randomized Decision Forests

DESCRIPTION

Jamie Shotton's ICVSS 2008 tutorial.

TRANSCRIPT
![Page 1: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/1.jpg)
Randomized Decision Forests for Segmentation and Recognition
Jamie Shotton
ICVSS 2008, Sicily, Italy
http://jamie.shotton.org/work/presentations/ICVSS2008.zip
![Page 2: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/2.jpg)
Randomized Decision Forests
• Very fast
  – for classification
  – for clustering
• Generalization through random training
• Inherently multi-class
  – automatic feature sharing [Torralba et al. 07]
• Simple training / testing algorithms
“Randomized Decision Forests” = “Randomized Forests” = “Random Forests™”

(detailed references at end of slides)
![Page 3: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/3.jpg)
Randomized Forests in Vision (among others...)

• [Amit & Geman, 97] digit recognition
• [Lepetit et al., 06] keypoint recognition
• [Moosmann et al., 06] visual word clustering
• [Shotton et al., 08] object segmentation

(example segmentation labels: water, boat, chair, tree, road)
![Page 4: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/4.jpg)
Live Demo [Shotton et al. 08]
• Real-time object segmentation using randomized decision forests
  – trained on the MSRC 21-category database
• Segment the image and label the segments:
  airplane, bicycle, bird, boat, body, book, building, car, cat, chair, cow, dog, face, flower, grass, road, sheep, sign, sky, tree, water
Winner CVPR 2008 Best Demo Award!
![Page 5: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/5.jpg)
Outline
• Tutorial on Randomized Decision Forests
• Applications to Vision
  – keypoint recognition [Lepetit et al. 06]
  – object segmentation [Shotton et al. 08]
please ask questions as we go!
![Page 6: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/6.jpg)
The Basics: Is The Grass Wet?
(figure: a decision tree over the world state. "Is it raining?" yes → P(wet) = 0.95; no → "Is the sprinkler on?" yes → P(wet) = 0.9; no → P(wet) = 0.1)
![Page 7: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/7.jpg)
The Basics: Binary Decision Trees
(figure: a binary tree with numbered split nodes and leaf nodes; a feature vector v is routed left where f_n(v) < t_n and right where f_n(v) ≥ t_n, until it reaches a leaf)

• feature vector v
• split functions f_n(v)
• thresholds t_n
• classifications P_n(c) over category c
![Page 8: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/8.jpg)
Decision Tree Pseudo-Code
```
double[] ClassifyDT(node, v)
  if node.IsSplitNode then
    if node.f(v) >= node.t then
      return ClassifyDT(node.right, v)
    else
      return ClassifyDT(node.left, v)
    end
  else
    return node.P
  end
end
```
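A minimal runnable sketch of ClassifyDT in Python; the dict-based node layout and the example "is the grass wet?" tree are my own illustration of the slides' idea, not code from the tutorial:

```python
def classify_dt(node, v):
    """Route feature vector v down the tree; return the leaf's class distribution."""
    if 'P' in node:                        # leaf node: stored distribution P_n(c)
        return node['P']
    if node['f'](v) >= node['t']:          # split function vs. threshold: >= goes right
        return classify_dt(node['right'], v)
    return classify_dt(node['left'], v)

# The "is the grass wet?" tree from Page 6, with v = [raining, sprinkler_on]
# and distributions over [dry, wet]:
tree = {
    'f': lambda v: v[0], 't': 0.5,                 # is it raining?
    'right': {'P': [0.05, 0.95]},                  # raining: P(wet) = 0.95
    'left': {
        'f': lambda v: v[1], 't': 0.5,             # is the sprinkler on?
        'right': {'P': [0.1, 0.9]},                # sprinkler on: P(wet) = 0.9
        'left':  {'P': [0.9, 0.1]},                # otherwise: P(wet) = 0.1
    },
}
print(classify_dt(tree, [0, 1]))   # not raining, sprinkler on -> [0.1, 0.9]
```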
![Page 9: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/9.jpg)
Toy Learning Example

(figure: 2D feature space with axes x and y)

• feature vectors are x, y coordinates: v = [x, y]^T
• split functions are lines with parameters a, b: f_n(v) = ax + by
• the threshold t_n determines the intercept
• four classes: purple, blue, red, green

• Try several lines, chosen at random
• Keep the line that best separates the data (by information gain)
• Recurse
![Page 10: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/10.jpg)
(animation step; same text as Page 9)
![Page 11: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/11.jpg)
(animation step; same text as Page 9)
![Page 12: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/12.jpg)
(animation step; same text as Page 9)
![Page 13: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/13.jpg)
Randomized Learning

• Recursive algorithm: the set I_n of training examples that reach node n is split into
  – left split:  I_l = { i ∈ I_n | f(v_i) < t }
  – right split: I_r = { i ∈ I_n | f(v_i) ≥ t }
  where t is the threshold and f(v_i) is the split function applied to example i's feature vector
• Features f and thresholds t are chosen at random
• At leaf node n, P_n(c) is the histogram of the examples I_n that reach it
![Page 14: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/14.jpg)
More Randomized Learning

• Features f(v) are chosen from a feature pool, f ∈ F
• Thresholds t are chosen at random within the range of observed feature values
• Choose the f and t that maximize the gain in information between the left and right splits (see the reconstruction below)
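The gain equation itself was lost in extraction; the standard entropy-based information gain, which is presumably what the slide showed, is:

```latex
\Delta E \;=\; E(I_n) \;-\; \frac{|I_l|}{|I_n|}\,E(I_l) \;-\; \frac{|I_r|}{|I_n|}\,E(I_r),
\qquad
E(I) \;=\; -\sum_{c} p_I(c)\,\log_2 p_I(c)
```

where \(p_I(c)\) is the class histogram of the examples in \(I\), normalised to sum to one.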
![Page 15: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/15.jpg)
Implementation Details

• How many features and thresholds to try?
  – just one = "extremely randomized" [Geurts et al. 06]
  – few → fast training, may under-fit
  – many → slower training, may over-fit
• When to stop?
  – maximum depth
  – minimum entropy gain
  – delta class distribution
  – pruning?
• Unsupervised training
  – information gain → most balanced split
![Page 16: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/16.jpg)
Randomized Learning Pseudo-Code

```
TreeNode LearnDT(I)
  repeat featureTests times
    let f = RndFeature()
    repeat threshTests times
      let t = RndThreshold(I, f)
      let (I_l, I_r) = Split(I, f, t)
      let gain = InfoGain(I_l, I_r)
      if gain is best then remember f, t, I_l, I_r
    end
  end
  if best gain is sufficient then
    return SplitNode(f, t, LearnDT(I_l), LearnDT(I_r))
  else
    return LeafNode(HistogramExamples(I))
  end
end
```
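A runnable Python/NumPy sketch of LearnDT. Axis-aligned features (pick a coordinate of v) stand in for the slides' generic feature pool, and the stopping rules (max_depth, min_gain) are illustrative choices, not the slides':

```python
import numpy as np

def entropy(labels, n_classes):
    """Shannon entropy of the class histogram of `labels`."""
    p = np.bincount(labels, minlength=n_classes) / max(len(labels), 1)
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

def info_gain(labels, left, right, n_classes):
    """Entropy decrease achieved by splitting `labels` into left/right."""
    n = len(labels)
    return (entropy(labels, n_classes)
            - len(left) / n * entropy(left, n_classes)
            - len(right) / n * entropy(right, n_classes))

def learn_dt(X, y, n_classes, feature_tests=10, thresh_tests=10,
             depth=0, max_depth=8, min_gain=1e-3):
    best = None
    for _ in range(feature_tests):
        f = np.random.randint(X.shape[1])            # RndFeature()
        lo, hi = X[:, f].min(), X[:, f].max()
        for _ in range(thresh_tests):
            t = np.random.uniform(lo, hi)            # RndThreshold(I, f)
            go_right = X[:, f] >= t                  # Split(I, f, t)
            gain = info_gain(y, y[~go_right], y[go_right], n_classes)
            if best is None or gain > best[0]:       # remember the best f, t
                best = (gain, f, t, go_right)
    gain, f, t, go_right = best
    if (gain < min_gain or depth == max_depth
            or go_right.all() or not go_right.any()):
        hist = np.bincount(y, minlength=n_classes)   # LeafNode: histogram P_n(c)
        return {'P': hist / hist.sum()}
    return {'f': f, 't': t,                          # SplitNode: recurse both sides
            'left':  learn_dt(X[~go_right], y[~go_right], n_classes,
                              feature_tests, thresh_tests, depth + 1,
                              max_depth, min_gain),
            'right': learn_dt(X[go_right], y[go_right], n_classes,
                              feature_tests, thresh_tests, depth + 1,
                              max_depth, min_gain)}
```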
![Page 17: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/17.jpg)
A Forest of Trees    [Amit & Geman 97] [Breiman 01] [Lepetit et al. 06]

• Forest is an ensemble of several decision trees
  – classification averages the per-tree distributions: P(c|v) = (1/T) Σ_t P_t(c|v)

(figure: trees t1 … tT, each with split nodes and leaf nodes, each classifying the same feature vector v into a distribution over category c)
![Page 18: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/18.jpg)
Decision Forests Pseudo-Code
```
double[] ClassifyDF(forest, v)
  // allocate memory
  let P = double[forest.CountClasses]
  // loop over trees in forest
  for t = 1 to forest.CountTrees
    let P' = ClassifyDT(forest.Tree[t], v)
    P = P + P'  // sum distributions
  end
  // normalise
  P = P / forest.CountTrees
  return P
end
```
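The slide's pseudo-code drops off without returning P, so the `return P` above is made explicit. The same forest averaging in Python (node dicts as in the learn_dt sketch, with 'f' a feature index):

```python
import numpy as np

def classify_dt(node, v):
    """Single-tree lookup (dict node layout from the learn_dt sketch)."""
    if 'P' in node:
        return np.asarray(node['P'])
    branch = 'right' if v[node['f']] >= node['t'] else 'left'
    return classify_dt(node[branch], v)

def classify_df(forest, v):
    """Average the per-tree distributions: P(c|v) = (1/T) * sum_t P_t(c|v)."""
    P = sum(classify_dt(tree, v) for tree in forest)   # sum distributions
    return P / len(forest)                             # normalise
```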
![Page 19: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/19.jpg)
Learning a Forest
• Divide training examples into T subsets I_t ⊆ I
  – improves generalization
  – reduces memory requirements & training time
• Train each decision tree t on subset I_t
  – same decision tree learning as before
• Multi-core friendly

• Subsets can be chosen at random or hand-picked
• Subsets can have overlap (and usually do)
• Could also divide the feature pool into subsets
![Page 20: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/20.jpg)
Learning a Forest Pseudo Code
```
Forest LearnDF(countTrees, I)
  // allocate memory
  let forest = Forest(countTrees)
  // loop over trees in forest
  for t = 1 to countTrees
    let I_t = RandomSplit(I)
    forest[t] = LearnDT(I_t)
  end
  // return forest object
  return forest
end
```
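A matching Python sketch of LearnDF, reusing learn_dt from the earlier sketch; sampling roughly two thirds of the data without replacement is my illustrative choice of RandomSplit (the slides allow random or hand-picked, overlapping subsets):

```python
import numpy as np

def learn_df(count_trees, X, y, n_classes, subset_frac=2/3):
    forest = []
    for _ in range(count_trees):                     # loop over trees in forest
        idx = np.random.choice(len(X), int(subset_frac * len(X)),
                               replace=False)        # RandomSplit(I)
        forest.append(learn_dt(X[idx], y[idx], n_classes))
    return forest
```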
![Page 21: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/21.jpg)
Toy Forest Classification Demo
Toy Classification Demo
![Page 22: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/22.jpg)
Randomized Forests for Clustering [Moosmann et al. 06]
• Visual words are good for e.g. matching and recognition [Sivic et al. 03] [Csurka et al. 04], but k-means clustering is very slow
• Randomized forests can instead cluster the descriptors
  – e.g. SIFT, texton filter-banks, etc.
• Leaf nodes in the forest are the clusters
  – concatenate the histograms from the trees in the forest (see the sketch below)

(figure: trees t1 … tT with numbered leaf nodes acting as cluster indices)
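A sketch of the clustering use in Python (same illustrative node dicts as the earlier sketches): each descriptor is dropped through every tree, the leaf it lands in acts as its visual word, and the per-tree leaf histograms are concatenated:

```python
import numpy as np

def collect_leaves(node, leaves):
    """Depth-first pass assigning each leaf dict an integer id."""
    if 'P' in node:
        node['leaf_id'] = len(leaves)
        leaves.append(node)
    else:
        collect_leaves(node['left'], leaves)
        collect_leaves(node['right'], leaves)
    return leaves

def leaf_of(node, v):
    """Index of the leaf that descriptor v falls into."""
    if 'P' in node:
        return node['leaf_id']
    branch = 'right' if v[node['f']] >= node['t'] else 'left'
    return leaf_of(node[branch], v)

def bag_of_leaves(forest, descriptors):
    """Concatenate the per-tree leaf histograms into one 'bag of words'."""
    leaves = [collect_leaves(tree, []) for tree in forest]
    hists = [np.zeros(len(l)) for l in leaves]
    for v in descriptors:
        for t, tree in enumerate(forest):
            hists[t][leaf_of(tree, v)] += 1
    return np.concatenate(hists)
```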
![Page 23: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/23.jpg)
Randomized Forests for Clustering [Moosmann et al. 06]
(figure: descriptors fall into the numbered leaf nodes of trees t1 … tT; the per-tree frequency histograms over node indices are concatenated into a "bag of words". We'll see later how to use the whole tree hierarchy!)
![Page 24: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/24.jpg)
Relation to Cascades [Viola & Jones 04]
• Cascades
  – very unbalanced tree
  – good for unbalanced binary problems, e.g. sliding window object detection
• Randomized forests
  – less deep, fairly balanced
  – ensemble of trees gives robustness
  – good for multi-class problems
![Page 25: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/25.jpg)
Random Ferns
• Naïve Bayes classifier over random sets of features [Özuysal et al. 07] [Bosch et al. 07]
• Can be a good alternative to randomized forests

(equations on the slide: Bayes' rule; "naïve Bayes" over individual features; "random ferns" over sets of features; see the reconstruction below)
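The slide's equations did not survive extraction; the standard factorisations from [Özuysal et al. 07], which match the slide's labels, are:

```latex
% Bayes' rule over binary features f_1, ..., f_N:
P(c \mid f_1,\dots,f_N) \;\propto\; P(f_1,\dots,f_N \mid c)\, P(c)
% "naive Bayes": individual features treated as independent:
P(f_1,\dots,f_N \mid c) \;\approx\; \prod_{i=1}^{N} P(f_i \mid c)
% "random ferns": features grouped into M sets F_1, ..., F_M,
% independent between sets, jointly modelled within each set:
P(f_1,\dots,f_N \mid c) \;\approx\; \prod_{k=1}^{M} P(F_k \mid c)
```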
![Page 26: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/26.jpg)
Short Pause
Any Questions So Far?
![Page 27: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/27.jpg)
Outline
• Tutorial on Randomized Decision Forests
• Applications to Vision Problems
  – keypoint recognition [Lepetit et al. 06]
  – object segmentation [Shotton et al. 08]
![Page 28: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/28.jpg)
Fast Keypoint Recognition [Lepetit et al. 06]
• Wide-baseline matching cast as a classification problem
• Extract prominent keypoints in training images
• Forest classifies: patches → keypoints
• Features: pixel comparisons (see the sketch below)
• Augmented training set
  – gives robustness to patch scaling, translation, rotation
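The slides only say the features are pixel comparisons; a plausible sketch of one such binary split test (the positions and the comparison form are my illustration):

```python
import numpy as np

def pixel_comparison_split(patch, p1, p2):
    """Binary test: is the intensity at p1 greater than at p2?"""
    return patch[p1] > patch[p2]

patch = np.random.rand(32, 32)                       # a keypoint patch
print(pixel_comparison_split(patch, (5, 7), (20, 13)))
```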
![Page 29: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/29.jpg)
Fast Keypoint Recognition [Lepetit et al. 06]
• Example videos– from http://cvlab.epfl.ch/research/augm/detect.php
![Page 30: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/30.jpg)
Real-Time Object Segmentation [Shotton et al. 2008]
• Aim: a better visual vocabulary
  – image categorization: does this image contain cows, trees, etc.?
  – object segmentation: draw and label the outlines of the cow, grass, etc.
• Design goals
  – fast and accurate
  – use learned semantic information
![Page 31: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/31.jpg)
Object Recognition Pipeline
(pipeline diagram: extract features (SIFT, filter bank; hand-crafted) → clustering (k-means; unsupervised) → assignment (nearest neighbour) → classification algorithm (SVM, decision forest, boosting; supervised))
![Page 32: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/32.jpg)
Object Recognition Pipeline
(pipeline diagram: the STF replaces the first stages: clustering into 'semantic textons' plus local classification → classification algorithm (SVM, decision forest, boosting; supervised))

Semantic Texton Forest (STF)
• decision forest for both clustering & classification
• tree nodes have learned object category associations
![Page 33: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/33.jpg)
Object Recognition Pipeline
(pipeline diagram: test image → STF (clustering into 'semantic textons' + local classification) → SF → object segmentation (dog, road, building), and → SVM → image categorization (building, dog, road))

Semantic Texton Forest (STF)
• decision forest for both clustering & classification
• tree nodes have learned object category associations

Support Vector Machine (SVM)
• pyramid match kernel in learned tree hierarchies

Segmentation Forest (SF)
• second decision forest
• features use layout & context
• semantic context
![Page 34: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/34.jpg)
Object Recognition Pipeline
(pipeline overview repeated; see Page 33)
![Page 35: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/35.jpg)
Textons & Visual Words

• Textons [Julesz 81]
  – computed densely
  – clustered filter-bank responses [Malik 01] [Varma 05]
  – used for object recognition [Winn 05] [Shotton 07]
• Visual words
  – usually computed sparsely [Mikolajczyk 04]
  – clustered local descriptors [Lowe 04]
  – used for object recognition [Sivic 03] [Csurka 04]

(diagram: extract features (filter bank / local descriptors, e.g. [Lowe 04]) → clustering (k-means) → assignment (nearest neighbour). Expensive!)
![Page 36: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/36.jpg)
Semantic Texton Forests (STF)
• A STF is
  – a decision forest applied at each image pixel
  – built from simple pixel-based features
• How is this new?
  – no descriptors or filter-banks
  – the decision forest gives fast clustering & assignment (very fast)
  – and local classification (learned semantic information)
![Page 37: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/37.jpg)
Image Patch Features

(figure: pixel i gives a patch p, 21x21 pixels in the experiments; the tree split function asks: f(p) > learned threshold?)
![Page 38: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/38.jpg)
Example Semantic Texton Forest

(figure: input image, ground truth, and example patches reaching the nodes of a tree whose split tests include: A[r] + B[r] > 363, A[b] > 98, A[g] - B[b] > 28, A[g] - B[b] > 13, A[b] + B[b] > 284, |A[r] - B[b]| > 21, |A[b] - B[g]| > 37)
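The split tests in the example tree combine the colour channels of at most two pixels, A and B, inside the patch: a single value, a sum, a difference, or an absolute difference, compared against a learned threshold. A sketch of that feature family (function name and pixel choices are illustrative):

```python
import numpy as np

def stf_feature(patch, op, a, ca, b=None, cb=None):
    """patch: HxWx3 colour patch; a, b: pixel coords; ca, cb: channels (0=r, 1=g, 2=b)."""
    A = float(patch[a][ca])
    B = float(patch[b][cb]) if b is not None else 0.0
    return {'value': A, 'sum': A + B,
            'diff': A - B, 'absdiff': abs(A - B)}[op]

patch = np.random.randint(0, 256, (21, 21, 3))
# e.g. a test like the example tree's root, "A[r] + B[r] > 363":
goes_right = stf_feature(patch, 'sum', (3, 4), 0, (10, 12), 0) > 363
```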
![Page 39: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/39.jpg)
Leaf Node Visualization
• Average of all training patches at each leaf node
(figure: leaf node visualizations for trees 1-5)
![Page 40: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/40.jpg)
STF Training Examples
• Supervised training
• Regular grid
• Random transformations
  – to learn invariances [Lepetit et al. 06]

(ground-truth colors denote categories)
![Page 41: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/41.jpg)
Different Levels of Supervision
• STF can be trained with:
  – no supervision (just the images)
    • clustering only, no local classification
  – weak supervision (image labels)
    • trained as if the image labels held at every pixel
  – full supervision (pixel labels)

(example image labels: tree, bench, grass)
![Page 42: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/42.jpg)
Balancing the Training Set
• Datasets often unbalanced
  – leads to poor average class accuracy
• Weight training examples by inverse class frequency (see the sketch below)

(chart: proportion of pixels by class on the MSRC dataset, over building, grass, tree, cow, sheep, sky, airplane, water, face, car, bike, flower, sign, bird, book, chair, road, cat, dog, body, boat)
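A sketch of the inverse-class-frequency weighting in Python; the clipping constant is an illustrative guard, not from the slides:

```python
import numpy as np

def inverse_frequency_weights(pixel_labels, n_classes):
    """Per-class training weight proportional to 1 / class pixel frequency."""
    freq = np.bincount(pixel_labels, minlength=n_classes).astype(float)
    freq /= freq.sum()
    return 1.0 / np.maximum(freq, 1e-12)   # guard against empty classes
```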
![Page 43: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/43.jpg)
Semantic Textons & Local Classification
(figure: test image; ground truth, for reference; semantic textons, colored by leaf node index; local classification, colored by most likely category. The local classification is already comparable to the ground truth.)

Live Demo
![Page 44: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/44.jpg)
MSRC Naïve Segmentation Baseline
• Use only the local classification P(c|l) from the STF

| STF training      | global accuracy | average accuracy |
|-------------------|-----------------|------------------|
| supervised        | 49.7%           | 34.5%            |
| weakly supervised | 14.8%           | 24.1%            |
![Page 45: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/45.jpg)
Bags of Semantic Textons (BoSTs)

(figure: for an image region r, two summaries are built across trees t1 … tT: a semantic texton histogram, giving the frequency of each tree node index over both split nodes and leaf nodes at depths 2-5; and a region prior, giving the probability of each object category averaged over all trees)
![Page 46: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/46.jpg)
Choice of Regions for BoSTs
• Image categorization
  – region r = whole image
• Object segmentation
  – many image regions r1, r2, r3, … around each pixel i
![Page 47: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/47.jpg)
Other Clustering Methods
• Efficient codebooks [Jurie et al. 05]
• Hyper-grid clustering [Tuytelaars et al. 07]
• Hierarchical k-means [Nister & Stewénius 06]
• Discriminant embedding [Hua et al. 07]
• Randomized clustering forests [Moosmann et al. 06]
  – tree hierarchy not used
  – ignores the classification ability of the forest
  – uses expensive local descriptors
![Page 48: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/48.jpg)
Object Recognition Pipeline
(pipeline overview repeated as a progress marker; see Page 33)
![Page 49: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/49.jpg)
Image Categorization
• SVM with a learned Pyramid Match Kernel (PMK)
  – descriptor space [Grauman et al. 05]
  – image location space [Lazebnik et al. 06]
• New PMK acts on the semantic texton histogram
  – matches P and Q in the learned hierarchical histogram space
  – deeper node matches are more important

(equation: new matches at depth d are weighted by a normalised depth weight; see the reconstruction below)
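A reconstruction of the kernel in the spirit of the pyramid match kernel [Grauman et al. 05], with the slide's "deeper node matches are more important" realised as a depth weight \(w_d\) that grows with \(d\) (the exact weights and the normalisation \(Z\) are my notation):

```latex
% histogram intersection over the tree nodes at depth d:
I_d(P, Q) \;=\; \sum_{n \,:\, \mathrm{depth}(n) = d} \min\big(P[n],\, Q[n]\big)
% matches whose deepest shared node is at depth d, weighted by w_d:
K(P, Q) \;=\; \frac{1}{Z} \sum_{d=1}^{D} w_d \,\big( I_d(P, Q) - I_{d+1}(P, Q) \big),
\qquad I_{D+1} \equiv 0
```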
![Page 50: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/50.jpg)
Categorization Experiments on MSRC
• Learned PMK vs. radial basis function (RBF)

| kernel      | Mean AP |
|-------------|---------|
| RBF         | 49.9    |
| learned PMK | 76.3    |

(plot: mean average precision against the number of trees T)

NB: mean average precision is a tougher metric than EER or AuC
![Page 51: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/51.jpg)
Object Recognition Pipeline
(pipeline overview repeated as a progress marker; see Page 33)
![Page 52: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/52.jpg)
Segmentation Forest
• Object segmentation
• Adapt TextonBoost [Shotton et al. 06]
  – boosted classifier → randomized decision forest
  – textons → semantic textons + region priors
  – no conditional random field

(example segmentation labels: bicycle, road, building)
![Page 53: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/53.jpg)
Features in Segmentation Forest
(figure: a feature at pixel i looks up an offset rectangle r and counts, within it, either a semantic texton bin (the frequency of a tree node index) or a region prior bin (the probability of an object category); the tree split function asks: bin count > learned threshold?)
![Page 54: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/54.jpg)
How the Features Work
• Pairing rectangles with semantic textons can capture appearance, layout, and textural context [Shotton et al. 07]

(figure: input image and its semantic texton map; feature1 = (r1, t1) and feature2 = (r2, t2) give different responses at pixels i1 … i4)
![Page 55: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/55.jpg)
Features in Segmentation Forest
• Learning the randomized forest
  – regular grid (10x10 pixels)
  – discriminative pairs of region r and BoST bin
• Region prior allows semantic context
  – "sheep tend to stand on grass"
• Efficient calculation
  – compute bins only as required
  – use integral images [Viola & Jones 04] (see the sketch below)
  – sub-sample the integral images

Live Demo
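A sketch of the integral-image trick the slide refers to: precompute a 2D cumulative sum per BoST bin so that the bin count inside any offset rectangle costs four lookups [Viola & Jones 04] (helper names are illustrative):

```python
import numpy as np

def integral_image(bin_map):
    """bin_map: HxW 0/1 map marking pixels assigned to one BoST bin."""
    ii = bin_map.cumsum(axis=0).cumsum(axis=1)
    return np.pad(ii, ((1, 0), (1, 0)))    # zero row/col makes lookups uniform

def rect_count(ii, top, left, bottom, right):
    """Bin count inside rows [top, bottom) x cols [left, right): four lookups."""
    return (ii[bottom, right] - ii[top, right]
            - ii[bottom, left] + ii[top, left])
```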
![Page 56: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/56.jpg)
Image-Level Prior (ILP)

• Combine
  – image categorization (SVM with learned PMK)
  – object segmentation (decision forest)

(equation: the segmentation posterior combines the SF output with the image categorization result, which acts as a prior for segmentation, via a weighting; see the reconstruction below)

See also [Verbeek et al. 07]
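A plausible form of the combination, with \(\alpha\) as the slide's weighting (my notation; the SF output at pixel i is multiplied by the powered image-level prior):

```latex
P(c \mid i) \;\propto\; P_{\mathrm{SF}}(c \mid i)\;\cdot\; P_{\mathrm{ILP}}(c)^{\alpha}
```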
![Page 57: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/57.jpg)
MSRC Segmentation Results
(figure rows: test image, SF, SF + ILP; category legend: building, grass, tree, cow, sheep, sky, airplane, water, face, car, bicycle, flower, sign, bird, book, chair, road, cat, dog, body, boat)
![Page 58: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/58.jpg)
More MSRC Results
(category legend: building, grass, tree, cow, sheep, sky, airplane, water, face, car, bicycle, flower, sign, bird, book, chair, road, cat, dog, body, boat)
![Page 59: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/59.jpg)
MSRC Quantitative Comparison
• Pixel-wise segmentation accuracy (%); red = winner in the original slides

| method | Building | Grass | Tree | Cow | Sheep | Sky | Airplane | Water | Face | Car | Bicycle | Flower | Sign | Bird | Book | Chair | Road | Cat | Dog | Body | Boat | Global | Average |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| [Shotton 06] | 62 | 98 | 86 | 58 | 50 | 83 | 60 | 53 | 74 | 63 | 75 | 63 | 35 | 19 | 92 | 15 | 86 | 54 | 19 | 62 | 7 | 71 | 58 |
| [Verbeek 07] | 52 | 87 | 68 | 73 | 84 | 94 | 88 | 73 | 70 | 68 | 74 | 89 | 33 | 19 | 78 | 34 | 89 | 46 | 49 | 54 | 31 | - | 64 |
| SF | 41 | 84 | 75 | 89 | 93 | 79 | 86 | 47 | 87 | 65 | 72 | 61 | 36 | 26 | 91 | 50 | 70 | 72 | 31 | 61 | 14 | 68 | 63 |
| SF + ILP | 49 | 88 | 79 | 97 | 97 | 78 | 82 | 54 | 87 | 74 | 72 | 74 | 36 | 24 | 93 | 51 | 78 | 75 | 35 | 66 | 18 | 72 | 67 |

• Computation time

| method | Training Time | Test Time |
|---|---|---|
| [Shotton 06] | 2 days | 30 sec / image |
| [Verbeek 07] | 1 hr | 2 sec / image |
| SF | 2 hrs | < 0.125 sec / image |
![Page 60: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/60.jpg)
MSRC Quantitative Comparison

• Pixel-wise segmentation accuracy (%), now including [Tu 08]; red = winner in the original slides

| method | Building | Grass | Tree | Cow | Sheep | Sky | Airplane | Water | Face | Car | Bicycle | Flower | Sign | Bird | Book | Chair | Road | Cat | Dog | Body | Boat | Global | Average |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| [Shotton 06] | 62 | 98 | 86 | 58 | 50 | 83 | 60 | 53 | 74 | 63 | 75 | 63 | 35 | 19 | 92 | 15 | 86 | 54 | 19 | 62 | 7 | 71 | 58 |
| [Verbeek 07] | 52 | 87 | 68 | 73 | 84 | 94 | 88 | 73 | 70 | 68 | 74 | 89 | 33 | 19 | 78 | 34 | 89 | 46 | 49 | 54 | 31 | - | 64 |
| SF | 41 | 84 | 75 | 89 | 93 | 79 | 86 | 47 | 87 | 65 | 72 | 61 | 36 | 26 | 91 | 50 | 70 | 72 | 31 | 61 | 14 | 68 | 63 |
| SF + ILP | 49 | 88 | 79 | 97 | 97 | 78 | 82 | 54 | 87 | 74 | 72 | 74 | 36 | 24 | 93 | 51 | 78 | 75 | 35 | 66 | 18 | 72 | 67 |
| [Tu 08] | 69 | 96 | 87 | 78 | 80 | 95 | 83 | 67 | 84 | 70 | 79 | 47 | 61 | 30 | 80 | 45 | 78 | 68 | 52 | 67 | 27 | 78 | 69 |

• Computation time

| method | Training Time | Test Time |
|---|---|---|
| [Shotton 06] | 2 days | 30 sec / image |
| [Verbeek 07] | 1 hr | 2 sec / image |
| SF | 2 hrs | < 0.125 sec / image |
| [Tu 08] | a few days | 30-70 sec / image |
![Page 61: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/61.jpg)
MSRC Influence of Design Decisions

• MSRC dataset; category average accuracy

| variant | average accuracy |
|---|---|
| only leaf node bins | 64.1% |
| all tree node bins | 65.5% |
| only region prior bins | 66.1% |
| full model | 66.9% |
| no transformations | 64.4% |
| unsupervised STF | 64.2% |
| weakly supervised STF | 64.6% |
![Page 62: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/62.jpg)
VOC 2007 Segmentation Results
(figure: example VOC 2007 segmentations with table, person, dog, chair)

| method | Average Accuracy |
|---|---|
| [Brookes] | 9 |
| SF | 20 |
| SF + Image-Level Prior | 24 |
| [TKK] | 30 |
| SF + Detection-Level Prior | 42 |

• Detection-Level Prior
  – [TKK] detection bounding boxes used as a segmentation prior
![Page 63: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/63.jpg)
Driving Video Database
• [Brostow, Shotton, Fauqueur, Cipolla, ECCV 2008]
  – new structure-from-motion cues can improve object segmentation

(figure: test image, ground truth (!), STF + SF result)
![Page 64: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/64.jpg)
Semantic Texton Forests Summary
• Semantic texton forests
  – an effective alternative to textons for recognition
• Image categorization improves segmentation
  – "image-level prior"
  – can use identical image features
• Memory
  – high memory requirements for training
• Efficiency
  – very fast on CPU
![Page 65: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/65.jpg)
References (red = most relevant in the original slides)

• Amit & Geman. Shape Quantization and Recognition with Randomized Trees. Neural Computation 1997.
• Bosch et al. Image Classification using Random Forests and Ferns. ICCV 2007.
• Breiman. Random Forests. Machine Learning Journal 2001.
• Brostow et al. To appear, ECCV 2008.
• Csurka et al. Visual Categorization with Bags of Keypoints. ECCV Workshop on Statistical Learning in Computer Vision, 2004.
• Geurts et al. Extremely Randomized Trees. Machine Learning 2006.
• Grauman & Darrell. The Pyramid Match Kernel: Discriminative Classification with Sets of Image Features. ICCV 2005.
• Hua et al. Discriminant Embedding for Local Image Descriptors. ICCV 2007.
• Jurie & Triggs. Creating Efficient Codebooks for Visual Recognition. ICCV 2005.
• Lazebnik et al. Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. CVPR 2006.
• Lepetit et al. Keypoint Recognition using Randomized Trees. PAMI 2006.
• Lowe. Distinctive Image Features from Scale-Invariant Keypoints. IJCV 2004.
• Malik et al. Contour and Texture Analysis for Image Segmentation. IJCV 2001.
• Mikolajczyk & Schmid. Scale and Affine Invariant Interest Point Detectors. IJCV 2004.
• Moosmann et al. Fast Discriminative Visual Codebooks using Randomized Clustering Forests. NIPS 2006.
• Nister & Stewenius. Scalable Recognition with a Vocabulary Tree. CVPR 2006.
• Özuysal et al. Fast Keypoint Recognition in Ten Lines of Code. CVPR 2007.
• Sharp. To appear, ECCV 2008.
• Shotton et al. Semantic Texton Forests for Image Categorization and Segmentation. CVPR 2008.
• Shotton et al. TextonBoost for Image Understanding: Multi-Class Object Recognition and Segmentation by Jointly Modeling Texture, Layout, and Context. IJCV 2007.
• Sivic & Zisserman. Video Google: A Text Retrieval Approach to Object Matching in Videos. ICCV 2003.
• Torralba et al. Sharing Visual Features for Multiclass and Multiview Object Detection. PAMI 2007.
• Tu. Auto-context and Its Application to High-level Vision Tasks. CVPR 2008.
• Tuytelaars & Schmid. Vector Quantizing Feature Space with a Regular Lattice. ICCV 2007.
• Varma & Zisserman. A Statistical Approach to Texture Classification from Single Images. IJCV 2005.
• Verbeek & Triggs. Region Classification with Markov Field Aspect Models. CVPR 2007.
• Viola & Jones. Robust Real-time Object Detection. IJCV 2004.
• Winn et al. Object Categorization by Learned Universal Visual Dictionary. ICCV 2005.
![Page 66: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/66.jpg)
Take Home Message
• Randomized decision forests are
  – very fast (GPU friendly [Sharp, ECCV 08])
  – simple to implement
  – flexible tools for computer vision

• Ideas for more research
  – biasing the randomness
  – optimal fusion of trees from different modalities
    • e.g. appearance, SfM, optical flow
![Page 67: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/67.jpg)
Thank you!
[email protected]
http://jamie.shotton.org/work/presentations/ICVSS2008.zip
Internships at MSRC available for next year. Talk to me or see:
http://research.microsoft.com/aboutmsr/jobs/internships/about_uk.aspx
![Page 68: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/68.jpg)
Example Tree in Segmentation Forest
Maximum Depth 14
![Page 69: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/69.jpg)
More Results
(figure rows: test image, no ILP, with ILP; object classes: sky, mountain, tree, road, sidewalk, building, grass, rock, sand, plant, car, snow, sign, water, person)
![Page 70: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/70.jpg)
Effect of Color Space
![Page 71: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/71.jpg)
MSRC Categorization Results
(precision-recall curves, recall on the x-axis and precision on the y-axis, both from 0.0 to 1.0: categories 1-5 (building, grass, tree, cow, sheep) and categories 6-10 (sky, aeroplane, water, face, car))
![Page 72: ICVSS2008: Randomized Decision Forests](https://reader033.vdocument.in/reader033/viewer/2022061108/544d5c9db1af9f1a4d8b4576/html5/thumbnails/72.jpg)
MSRC Categorization Results
(precision-recall curves, recall on the x-axis and precision on the y-axis, both from 0.0 to 1.0: categories 11-15 (bicycle, flower, sign, bird, book) and categories 16-21 (chair, road, cat, dog, body, boat))