Latent Boosting for Action Recognition
Zhi Feng Huang et al.
BMVC 2011
2014. 6. 12. Jeany Son
Background – learning with latent variables
• Multiple Instance Learning (MI-SVM, mi-SVM)
• (−) Single plain latent variable
• Latent SVM
• (+) Structured latent variable
• (−) Must control parameters / normalize different features
• MILBoost
• (+) Does not require normalizing different features
• (−) Single latent variable / not structured
• hCRF
• Learns parameters and weights for features
• Latent Boosting: structured latent variables, no need to normalize different features, built-in feature selection
Boosting
• Combining many weak predictors to produce an ensemble predictor
• Training examples with high error are weighted higher than those with lower error
• Difficult instances get more attention
• AdaBoost: “shortcomings” are identified by high-weight data points (see the sketch below)
• Gradient Boosting: “shortcomings” are identified by gradients
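As a concrete illustration of AdaBoost's reweighting, here is a minimal NumPy sketch of one boosting round (the function name and interface are my own, not from the slides):

```python
import numpy as np

def adaboost_round(w, y, y_pred):
    """One AdaBoost round: upweight the examples the weak learner got wrong.

    w      : current example weights (non-negative, summing to 1)
    y      : true labels in {-1, +1}
    y_pred : weak learner predictions in {-1, +1}
    """
    err = np.sum(w * (y != y_pred))                    # weighted error of the weak learner
    alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))  # learner coefficient
    w = w * np.exp(-alpha * y * y_pred)                # misclassified points gain weight
    return w / w.sum(), alpha
```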
Gradient Boost
• Gradient Boosting = Gradient Descent + Boosting
• Analogous to line search in steepest descent
• Construct the new base-learners to be maximally correlated with the negative gradient of the loss function, associated with the whole ensemble
• Arbitrary loss functions can be applied
• Function estimate (parametric)
• Change the function optimization problem into a parameter estimation one
Function estimation
• Goal: estimate a function $\hat{f}(\mathbf{x})$ that minimizes the expected loss, given training examples $\{(\mathbf{x}_i, y_i)\}_{i=1}^N$:
$$\hat{f} = \arg\min_{f}\; \mathbb{E}_{\mathbf{x},y}\big[\Psi\big(y, f(\mathbf{x})\big)\big]$$
Steepest descent optimization
• “Greedy stage-wise” approach of function incrementing with the base-learners
• The optimal step-size $\rho$ should be specified at each iteration
• The optimization rule is defined as:
$$(\rho_t, \theta_t) = \arg\min_{\rho, \theta} \sum_{i=1}^{N} \Psi\big(y_i,\; \hat{f}_{t-1}(\mathbf{x}_i) + \rho\, h(\mathbf{x}_i; \theta)\big), \qquad \hat{f}_t = \hat{f}_{t-1} + \rho_t\, h(\mathbf{x}; \theta_t)$$
Gradient Boost
• A solution to the parameter estimates can be difficult to obtain
• Instead, choose the new function h to be most correlated with −g(x)
• This is a classic least-squares minimization problem:
$$\theta_t = \arg\min_{\theta} \sum_{i=1}^{N} \big[-g_t(\mathbf{x}_i) - h(\mathbf{x}_i; \theta)\big]^2$$
• The step size $\rho_t$ is then found by line search via Newton’s method (a sketch of the whole loop follows)
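A minimal sketch of this loop in Python (the regression-tree base learner, the grid line search standing in for Newton's method, and the binary log-loss example are my assumptions):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gradient_boost(X, y, loss, loss_grad, n_rounds=100):
    """Friedman-style gradient boosting sketch: fit each base learner h
    to the negative gradient by least squares, then pick the step size
    rho by a 1-D line search."""
    F = np.zeros(len(y))                       # current ensemble scores
    ensemble = []
    for _ in range(n_rounds):
        neg_g = -loss_grad(y, F)               # pseudo-residuals
        h = DecisionTreeRegressor(max_depth=2).fit(X, neg_g)
        hx = h.predict(X)
        # line search for the stage-wise update F <- F + rho * h(x)
        rhos = np.linspace(0.0, 2.0, 41)
        rho = rhos[np.argmin([loss(y, F + r * hx).sum() for r in rhos])]
        F += rho * hx
        ensemble.append((rho, h))
    return ensemble

# example: binary log-loss with labels y in {-1, +1}
log_loss = lambda y, F: np.log1p(np.exp(-y * F))
log_loss_grad = lambda y, F: -y / (1.0 + np.exp(y * F))
```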
K-class Gradient Boost
• Goal: learn a set of scoring functions $F_1(\mathbf{x}), \dots, F_K(\mathbf{x})$, one per class
• by minimizing the negative log-loss of the training data
• Probability of an example x being class k:
$$p_k(\mathbf{x}) = \frac{\exp\big(F_k(\mathbf{x})\big)}{\sum_{j=1}^{K} \exp\big(F_j(\mathbf{x})\big)}$$
Weak classifier
• Solve the optimization problem one class at a time
• Select the h most parallel with −g(x) by the following minimization:
$$h_{t,k} = \arg\min_{h} \sum_{i=1}^{N} \big[-g_k(\mathbf{x}_i) - h(\mathbf{x}_i)\big]^2, \qquad -g_k(\mathbf{x}_i) = \mathbb{1}[y_i = k] - p_k(\mathbf{x}_i)$$
• The scoring function is updated as $F_k(\mathbf{x}) \leftarrow F_k(\mathbf{x}) + \rho\, h_{t,k}(\mathbf{x})$ (a sketch of the gradient computation follows)
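A small sketch of the softmax probabilities and per-class negative gradients this step needs (array shapes and helper names are my assumptions):

```python
import numpy as np

def softmax_probs(F):
    """p_k(x) = exp(F_k(x)) / sum_j exp(F_j(x)); F has shape (N, K)."""
    Z = np.exp(F - F.max(axis=1, keepdims=True))   # shift for numerical stability
    return Z / Z.sum(axis=1, keepdims=True)

def neg_gradients(F, y, K):
    """Negative gradient of the log-loss w.r.t. each score F_k:
    -g_ik = 1[y_i = k] - p_k(x_i).  A weak regressor per class is then
    fit to its column of this (N, K) matrix."""
    onehot = np.eye(K)[y]                          # y holds integer class labels
    return onehot - softmax_probs(F)
```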
LatentBoost for Human Action Recognition
• A tracklet is denoted by a tuple per frame (T = 5 in the figure):
• x_t : image feature
• p_t : position of the person in the t-th frame of the tracklet
• l_t : latent variables
[Figure: chain-structured model linking latent variables l1–l5 to observations x1–x5]
Features
• Optical flow features (unary)
• Split into 4 scalar-field channels & motion magnitude
• Color histogram features (pairwise)
• Difference between color histograms in rectangular sub-windows taken from adjacent frames (see the sketch below)
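A rough sketch of how such features could be computed (the exact half-wave channel split and histogram binning are guesses; the slides only name the feature types):

```python
import numpy as np

def flow_channels(fx, fy):
    """Split an optical-flow field into 4 half-wave rectified scalar
    channels plus the motion magnitude (unary feature maps)."""
    channels = [np.maximum(fx, 0), np.maximum(-fx, 0),
                np.maximum(fy, 0), np.maximum(-fy, 0)]
    magnitude = np.sqrt(fx ** 2 + fy ** 2)
    return np.stack(channels + [magnitude])

def color_hist_diff(frame_a, frame_b, window, bins=8):
    """Pairwise feature: difference between color histograms of the same
    rectangular sub-window taken from two adjacent frames."""
    x0, y0, x1, y1 = window
    def hist(img):
        patch = img[y0:y1, x0:x1].reshape(-1, 3)   # RGB pixels in the window
        h, _ = np.histogramdd(patch, bins=bins, range=[(0, 256)] * 3)
        return h.ravel() / max(h.sum(), 1)
    return hist(frame_a) - hist(frame_b)
```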
Positive optical flow features
[Figure: positive optical flow features for (a) bend, (b) jack, (c) jump, (d) pjump, (e) run, (f) side, (g) walk, (h) wave1, (i) wave2]
Latent Boosting
• Assume that
• an example (x, y) is associated with a set of latent variables L = {l1, l2, …, lT}
• these latent variables are constrained by an undirected graph structure G = (V, E)
• Scoring function of the (x, L) pair for the k-th class:
$$F_k(\mathbf{x}, L) = \sum_{j \in V} \phi_k(\mathbf{x}, l_j) + \sum_{(j,m) \in E} \psi_k(\mathbf{x}, l_j, l_m)$$
where φ_k and ψ_k are the unary and pairwise potentials, each accumulated as a sum of weak learners over the boosting rounds
Weak learners of the unary & pairwise potentials
• −g^φ : gradient of the loss function w.r.t. the unary potential
• −g^ψ : gradient of the loss function w.r.t. the pairwise potential
• At each round, one weak learner is fit to each of these negative gradients
Marginal distributions
• These marginals can be computed efficiently using Belief Propagation, since the latent structure is a tree (see the sketch below)
[Figure: graphical model with observations x1–x5, latent variables l1–l5, class label y, and overall score F]
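Because the latent structure here is a chain, the marginals follow from one forward and one backward message pass. A minimal sum-product sketch (array layout and normalization choices are my assumptions, not the paper's implementation):

```python
import numpy as np

def chain_marginals(unary, pairwise):
    """Sum-product belief propagation on a chain l1 - l2 - ... - lT.

    unary    : (T, A) non-negative potentials exp(phi_t(a))
    pairwise : (T-1, A, A) non-negative potentials exp(psi_t(a, b))
    returns  : (T, A) marginals P(l_t = a)
    """
    T, A = unary.shape
    fwd = np.zeros((T, A)); bwd = np.zeros((T, A))
    fwd[0] = unary[0]
    for t in range(1, T):                          # forward messages
        fwd[t] = unary[t] * (fwd[t - 1] @ pairwise[t - 1])
        fwd[t] /= fwd[t].sum()                     # rescale for stability
    bwd[-1] = 1.0
    for t in range(T - 2, -1, -1):                 # backward messages
        bwd[t] = pairwise[t] @ (unary[t + 1] * bwd[t + 1])
        bwd[t] /= bwd[t].sum()
    marg = fwd * bwd
    return marg / marg.sum(axis=1, keepdims=True)
```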
Weizmann dataset (83 videos, 9 events)
[Figure: typical tracklets (29×60) from the Weizmann dataset: jacking, running, jumping, waving]
TRECVID dataset (5 cameras, 10 videos, 7 events)
[Figure: typical tracklets (29×60) from the TRECVID dataset]
Limitations
• Not guaranteed to find the global optimum, since the problem is non-convex
• Performance of the final classifier is very sensitive to the initialization
• If the latent structure is not a tree, LatentBoost must run inference with loopy belief propagation (LBP), which is slower and less exact than BP on a tree
• Summation over all possible latent variable configurations may cause problems
Summary
• A novel boosting algorithm with latent variables
• Applied to the task of human action recognition
• A new way to solve problems with structured latent variables