Silhouette Lookup for Automatic Pose Tracking
NICK HOWE
Goal: 3D Pose Tracking
Full 3D “motion capture” from 2D video: single camera, unmarked video
Difficulties: 3D ambiguity, self-occlusion, foreshortening, appearance changes, shadowing
↑(Uses hand-entered data)
The “Old” Way:
Incremental Tracking
Previous frame
Compare with Image
Refine 2D Pose
2D Pose + Appearance
Numerical Optimization
Next frame
Creeping Error
Incremental errors accumulate and grow.
May be mitigated by: better motion models (more guidance), better appearance models (3D), better tracking (multiple hypotheses) [Sidenbladh et al.; Sminchisescu et al.]
Intrinsic problems still remain (initialization, error recovery).
Direct Pose Estimation
Consider human abilities: estimate pose from still photo, estimate pose from stick figure, estimate pose from silhouette
[Brand ’99; Rosales et al. ’01]
Recognition/Retrieval
Hypothesis: Humans can recognize pose by recalling similar examples. Pose Recognition Retrieval
New Approach:
1. Store many silhouettes with known poses
2. Given video, extract silhouettes
3. Retrieve best candidate matches
4. Look for plausible series of poses over time
Some Related Work
Estimating Human Body Configuration Using Shape Context Matching. Mori & Malik, ECCV 2002
3D Tracking = Classification + Interpolation. Tomasi, Petrov, & Sastry, ICCV 2003
Temporal Integration of Multiple Silhouette-based Body-part Hypotheses. Kwatra, Bobick, & Johnson, CVPR 2001
3D Human Pose from Silhouettes by Relevance Vector Regression. Agarwal & Triggs, CVPR 2004
Silhouette Comparison
Turning angle (captures morphology)
Chamfer distance (captures overlap)
Combine using Belkin technique (score = sum of individual ranks)
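The chamfer measure and the rank-sum combination can be sketched roughly as follows. This is a minimal illustration, not the talk's implementation: silhouettes are assumed to be given as lists of 2D boundary points, the chamfer distance is computed by brute force, and the turning-angle scores are assumed to be precomputed elsewhere. The function names are hypothetical.

```python
import math

def directed_chamfer(points_a, points_b):
    """Mean distance from each point of A to its nearest point of B."""
    total = sum(min(math.dist(p, q) for q in points_b) for p in points_a)
    return total / len(points_a)

def symmetric_chamfer(points_a, points_b):
    """Average of the two directed chamfer distances (assumed symmetrization)."""
    return 0.5 * (directed_chamfer(points_a, points_b)
                  + directed_chamfer(points_b, points_a))

def ranks(scores):
    """Rank of each score, 0 = smallest (best); ties broken by position."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    rank = [0] * len(scores)
    for r, i in enumerate(order):
        rank[i] = r
    return rank

def combine_by_rank(turning_scores, chamfer_scores):
    """Combined score = sum of individual ranks; lower is better."""
    return [a + b for a, b in zip(ranks(turning_scores), ranks(chamfer_scores))]

# Three library candidates scored against one query silhouette
# (the score values are made up for illustration).
turning = [0.1, 0.3, 0.9]
chamfer = [1.0, 5.0, 2.0]
combined = combine_by_rank(turning, chamfer)   # [0, 3, 3]
best_candidate = combined.index(min(combined))  # candidate 0
```

Summing ranks rather than raw scores avoids having to put the two measures on a common scale.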
Sample Retrievals
(Hits from a small library of 1600 poses)
Coordination Between Frames
Need to pick from top matches at each frame.
Want good image match at all frames
Want small change between frames
Markov chain minimization!
Best local choices minimize global error
frame i-1, frame i, frame i+1, etc.
Too Much Coffee?
Initial solution shows “twitches”
Smoothing it Out
Jitters in motion parameters smoothed via polynomial splines
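As a rough illustration of polynomial smoothing of one motion parameter over time, the sketch below fits a quadratic by least squares in a sliding window and keeps the fitted value at the window center. This local fit is a stand-in for the polynomial splines named on the slide, not the talk's method; the function name, the quadratic degree, and the default half-width of 5 (an 11-frame window) are assumptions. Angle wrap-around is ignored.

```python
def smooth_track(values, half_width=5):
    """Smooth one motion parameter with a local quadratic least-squares fit.

    For each frame, fit a + b*t + c*t^2 over a symmetric window centered
    on that frame (t = 0 at the center) and return the fitted value a.
    Near the ends the window shrinks; the first and last frames are
    returned unchanged.
    """
    n = len(values)
    smoothed = []
    for i in range(n):
        h = min(half_width, i, n - 1 - i)
        if h == 0:
            smoothed.append(float(values[i]))
            continue
        ts = range(-h, h + 1)
        ys = values[i - h:i + h + 1]
        # Normal equations for a symmetric window: odd power sums vanish,
        # so a and c decouple from b.
        s0 = 2 * h + 1
        s2 = sum(t * t for t in ts)
        s4 = sum(t ** 4 for t in ts)
        sy = sum(ys)
        st2y = sum(t * t * y for t, y in zip(ts, ys))
        a = (s4 * sy - s2 * st2y) / (s0 * s4 - s2 * s2)
        smoothed.append(a)
    return smoothed

# A steadily increasing joint angle with frame-to-frame "twitches".
jittery = [t + (0.5 if t % 2 else -0.5) for t in range(21)]
smooth = smooth_track(jittery)
```

A sequence that is already quadratic in time passes through unchanged, while isolated jumps are pulled toward their neighbors.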
Making it Match
Problem: poor overlap between observed silhouette & smoothed solution
Work with 11-frame splines
Optimize spline parameters to reduce chamfer distance
Result: better match to observations, still smooth
Walking Sequence Result
Re-rendering
Same scene, different viewpoint.
Another Example
Tracked using library of ballet poses
Incremental Tracking
Markov chain is best for offline use
But: convergence after ~10 frames
Incremental tracking with latency
Key Points
Silhouette lookup provides set of potential poses for each frame
Markov chain selects best temporal pose sequence (HMM)
Smoothing & optimization based upon temporal splines
Result: simple tracker, tolerates errors
Thank you! Questions?
Continuing Challenges
Mistakes in rotational direction
No data for parts not on silhouette
Incorporate optical flow
Some unrealistic motions generated
Incorporate motion model
Correct pose not always retrieved
Improve library coverage, retrieval
Future Research
People carrying objects
Multiple overlapping people (sports)
Time considerations: optimization slow, chaining currently slow
Holy Grail: real-time tracking
Markov Chain Minimization
Frame 1: State 1A, State 1B, State 1C
Frame 2: State 2A, State 2B, State 2C
...
Frame n: State nA, State nB, State nC
1. Compute least expense to reach each state from previous frame (cost = estimate of plausibility)
2. Identify best (least expensive) result
3. Backtrack, picking out path that gave best result.
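The three steps above amount to a standard dynamic-programming (Viterbi-style) search, which might be sketched as follows. The cost definitions here are placeholders: `match_costs` and `transition_cost` are hypothetical names, and the toy example uses 1-D numbers in place of real poses.

```python
def best_pose_sequence(match_costs, transition_cost):
    """Pick one candidate per frame so that total cost is minimal.

    match_costs[t][k]      : image-match cost of candidate k at frame t
    transition_cost(t,j,k) : cost of going from candidate j at frame t-1
                             to candidate k at frame t
    Returns (total_cost, path), where path[t] is the chosen candidate index.
    """
    best = [list(match_costs[0])]          # least expense to reach each state
    back = [[None] * len(match_costs[0])]  # predecessor that achieved it
    # Step 1: least expense to reach each state from the previous frame.
    for t in range(1, len(match_costs)):
        row_cost, row_back = [], []
        for k, m in enumerate(match_costs[t]):
            j_best = min(range(len(best[t - 1])),
                         key=lambda j: best[t - 1][j] + transition_cost(t, j, k))
            row_cost.append(best[t - 1][j_best] + transition_cost(t, j_best, k) + m)
            row_back.append(j_best)
        best.append(row_cost)
        back.append(row_back)
    # Step 2: identify the least expensive final state.
    k = min(range(len(best[-1])), key=lambda i: best[-1][i])
    total = best[-1][k]
    # Step 3: backtrack along the stored predecessors.
    path = [k]
    for t in range(len(match_costs) - 1, 0, -1):
        k = back[t][k]
        path.append(k)
    path.reverse()
    return total, path

# Toy example: two candidates per frame, "poses" reduced to single numbers.
poses = [[0, 10], [1, 9], [2, 8]]
match = [[0.0, 0.0], [0.5, 0.0], [0.0, 0.0]]
change = lambda t, j, k: abs(poses[t][k] - poses[t - 1][j])
total, path = best_pose_sequence(match, change)  # (2.0, [1, 1, 1])
```

The work grows with the number of frames times the square of the number of candidates per frame, so keeping only the top few matches per frame keeps the chaining cheap.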
Silhouette Extraction
Many candidate approaches. Moving & fixed camera
This work: static camera, graph-based segmentation
Making it Match
Solution doesn’t match exactly yet.