learning better object models using video dataasamir/cifar/flobject presentation.pdfmotivation...
TRANSCRIPT
Learning Better Object Models using Video DataPatrick Li, Inmar Givoni, Brendan Frey
Motivation
Training on a collection of static monocular images is unnatural.
Labelled Training Images are hard to get. And the lack of is becoming a problem.
�ere is a wealth of video data available.
First Attempt: Learning Bags of Features Modelsfor Image Classi�cation
Goal:
Represent Objects as Bags of SIFT Features
Use unsupervised learning to learn models of objects
Use learned models for image classi�cation
Image Classi�cationINPUT: OUTPUT:
TRAINING:
“Cow”
“Boat” “Car” “Sofa”
...
PART 3
Overview of the TechniqueUnsupervised Training from Video
Supervised Training on Labelled Images
Testing
PART 1 PART 2 PART 3 PART 60
PART 2
...
“Cow”
PART 8PART 1 “Sofa”
Bags of Features Models
PART 1
PART 2
PART 60
...
Latent Dirichlet Allocation for Topic Modelling
SPORT POLITICS BANKINGBASEBALL
HITKICKSOCCER
LEADER
DEM
OC
RAC
YC
APITALISMSH
OU
TERS
MONEY
TRANSACTIONS
TRANSACTIONS
ANIM
ALS
CATDO
G
FROG
CAT
FROG
DO
G
BASEBALLSOCCER
CAPITALISM
LEADER
DEM
OC
RAC
Y
20% ANIMALS40% POLITICS39% BANKING1% SPORTSSingle Document
Latent Dirichlet Allocation for Topic Modelling
Corpus of Documents
? ? ? ?1 2 3 ... 60
Latent Dirichlet Allocation for Topic Modelling
Corpus of Documents
? ? ?1 2 3 ... 60
Money
Transactions
Latent Dirichlet Allocation for Object Modelling
Single Image
COW CAR BOAT
SOFT
DRI
NKS
90% SOFT DRINKS10% CORPORATE LOGOS
Latent Dirichlet Allocation for Object Modelling
Image Collection
? ? ? ?1 2 3 ... 60
Flow-LDA for Motion Modelling
COW CAR BOAT
VILLA
IN
50% SWORD50% VILLAIN
Pair of Consecutive Frame Pairs
Flow-LDA for Motion Modelling
? ? ? ?1 2 3 ... 60
Frame Pair Collection
Flow-LDA for Motion Modelling
Unsupervised Training from Video using FLDA
Training And Testing Images
Image Recognition
PART 1
PART 2
PART 60
...
0.8 Part 10.2 Part 2
0.7 Part 10.2 Part 30.1 Part 4
0.6 Part 20.2 Part 130.2 Part 24
Initial Results
Naive Guesser: 8.6% ErrorSVM trained on SIFT histograms directly: 8.6% ErrorSVM trained using LDA model (no motion): 5.6% ErrorSVM trained using FLDA model (motion): 3.7% Error
... to continue
Experiment on Real Dataset
Go beyond Bags of Features models -Hierarchical Models -Account for Spatial Relations -Account for temporal relations between more than 2 frames
�ank you!