Po-Hsiang Chen, advisor: Sheng-Jyh Wang, 2/13/2012
Major References
• Shotton, J., A. Fitzgibbon, et al. (2011). "Real-Time Human Pose Recognition in Parts from Single Depth Images." Microsoft Research Cambridge & Xbox Incubation
• CVPR 2011 Best Paper
• Freedman, B., A. Shpunt, et al. (2008). Depth mapping using projected patterns, US 2010/0118123A1
• PrimeSense Patent
Outline
• What is Kinect?
• Kinect Architecture
• From IR to depth image
• History of Structured Light
• PrimeSense Invented Structured Light
• From depth image to joint positions
• Body Part Inference
• Joint Proposals
• Experiments and Results
• Conclusion
• References
What is Kinect?
• Motion sensing input device by Microsoft
• Depth camera technology developed by PrimeSense
• Invented in 2005
• Software technology developed by Rare
• First announced at E3 2009 as “Project Natal”
• Windows SDK Releases
http://www.microsoft.com/en-us/kinectforwindows/discover/features.aspx
Kinect Architecture
[Pipeline diagram: IR → Structured Light → Depth Image → Random Decision Forest → Body Parts → Mean Shift → Joint Position]
Triangulation
• Main Problem
• To recover shape from multiple views, we need correspondences between the images
• The matching (correspondence) problem is hard
• Occlusions, texture, colors, etc.
• Solution: Structured light
• Idea: Simplify matching
• Strategy: Use illumination to create your own correspondences
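As a rough illustration of why correspondences matter: once a point has been matched in two rectified views, its depth follows from the standard triangulation relation Z = f * b / d. The sketch below assumes a rectified pair, a focal length in pixels, and a baseline in metres. The function name and the numbers are illustrative assumptions, not Kinect calibration values.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth (metres) of a matched point in a rectified two-view setup.

    Uses the standard relation Z = f * b / d. All inputs here are
    illustrative assumptions, not values from the Kinect.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px


# A larger disparity means the point is closer to the cameras.
near = depth_from_disparity(focal_px=580.0, baseline_m=0.075, disparity_px=29.0)
far = depth_from_disparity(focal_px=580.0, baseline_m=0.075, disparity_px=14.5)
```

Halving the disparity doubles the estimated depth, which is why an unreliable match translates directly into a depth error.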
Structured Light
• Basic Principle
• Use a projector to create unambiguous correspondences
• Light projection
• If we project a single point, matching is unique
Structured Light
• Line projection (Line Scan)
• For calibrated cameras, the epipolar geometry is known
• Project a line instead of a single point
Structured Light
• Project Multiple Stripes or Grids
• Which stripe matches which?
• The correspondence problem again
Structured Light
• Answer 2: Coloured stripes (De Bruijn)
• Difficult to use for coloured surfaces
Structured Light
• Answer 2: Coloured dots (M-array)
• Difficult to use for coloured surfaces
Structured Light
• Answer 3: Pattern dots (M-array)
• Difficult for industrial manufacturing
Structured Light
• Answer 4: Time-coded light patterns (time multiplexing)
• Use a sequence of binary patterns → only about log2(N) images are needed to label N stripes
• Each stripe has a unique binary illumination code
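The time-multiplexing idea can be sketched in a few lines. This is a minimal illustration assuming plain binary codes, with stripe indices starting at 0 and the most significant bit projected first. The function names are hypothetical, and practical systems often prefer Gray codes.

```python
def binary_stripe_patterns(num_stripes):
    """Return one on/off pattern per bit; pattern k holds bit k (MSB first)
    of every stripe index, so only about log2(N) patterns are projected."""
    num_bits = max(1, (num_stripes - 1).bit_length())
    return [
        [(stripe >> (num_bits - 1 - k)) & 1 for stripe in range(num_stripes)]
        for k in range(num_bits)
    ]


def decode_stripe(observed_bits):
    """Rebuild the stripe index from the bits seen at one pixel over time."""
    index = 0
    for bit in observed_bits:
        index = (index << 1) | bit
    return index


patterns = binary_stripe_patterns(8)        # 8 stripes need 3 patterns
bits_at_stripe_5 = [p[5] for p in patterns]  # what a pixel lit by stripe 5 sees
recovered = decode_stripe(bits_at_stripe_5)
```

Each pixel recovers its stripe index independently of its neighbours, at the cost of needing a static scene while the sequence is projected.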
Structured Light
• All of the above are categorized as discrete methods
• There are also many continuous structured light methods, such as phase shifting
• Salvi, J., S. Fernandez, et al. (2010). "A state of the art in structured light patterns for surface profilometry." Pattern Recognition 43(8): 2666-2680
Structured Light
• All of the above are human-designed patterns.
• Random Speckle
• Structured light using randomly generated patterns
• May obtain denser depth information by solving the correspondence problem
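A random pattern makes every local window nearly unique, so the shift of a window can be found by direct comparison. The sketch below is a one-dimensional toy under that assumption: it searches for the displacement of a window of a reference row inside an observed row using a sum of squared differences. The names and sizes are hypothetical, and this is not the matching procedure of any particular device.

```python
import random


def best_shift(reference, observed, start, window, max_shift):
    """Return the shift in [0, max_shift] at which the reference window
    best matches the observed row (minimum sum of squared differences)."""
    patch = reference[start:start + window]
    best, best_cost = 0, float("inf")
    for shift in range(max_shift + 1):
        candidate = observed[start + shift:start + shift + window]
        if len(candidate) < window:
            break  # ran off the end of the observed row
        cost = sum((a - b) ** 2 for a, b in zip(patch, candidate))
        if cost < best_cost:
            best, best_cost = shift, cost
    return best


rng = random.Random(0)
reference = [rng.randint(0, 1) for _ in range(200)]  # random dot row
true_shift = 7
observed = [0] * true_shift + reference              # same row, displaced
shift = best_shift(reference, observed, start=50, window=25, max_shift=20)
```

With a periodic pattern several shifts would give the same cost. The randomness is what makes the minimum unambiguous.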
What can we do better?
• A projector is just the inverse of a camera
• One projector and one camera are enough for triangulation
• Needs calibration
PrimeSense Patents
• US 2010/0118123
• Projector-Camera system
• Already calibrated structure
• δZ results in δX in 32
PrimeSense Patents
• US 2010/0118123
• Structured Light-1
• Pseudo-random distribution
• Local: Random
• Global: Gray level decreases
• Can make a rough estimate in a low resolution image
PrimeSense Patents
• US 2010/0118123
• Structured Light-2
• Quasi-periodic pattern
• Five-fold symmetry
• Results in distinct peaks in the frequency domain
• Contains no unit cell that repeats over the spatial domain
• Used to reduce noise and ambient light from the environment
PrimeSense Patents
• US 2010/0290698
• Uses a special (“astigmatic”) lens with different focal lengths in the x- and y-directions
• A projected circle becomes an ellipse whose orientation indicates depth
From depth to joints
• Shotton, J., A. Fitzgibbon, et al. (2011). "Real-Time Human Pose Recognition in Parts from Single Depth Images." Microsoft Research Cambridge & Xbox Incubation
• Treat body segmentation as a per-pixel classification task (no pairwise terms or CRF are used)
• The algorithm runs in about 5 ms per frame on the Xbox GPU
• Novelty: Intermediate body parts representation
Body Part Inference
• Body part labeling
• 31 body parts
• Distinct parts for left and right allow the classifier to disambiguate the left and right sides of the body
Body Part Inference
• Depth image features
• d_I(x) is the depth at pixel x in image I
• θ = (u, v) describes offsets u and v
• Each feature needs to read at most 3 image pixels and perform at most 5 arithmetic operations
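The feature formula itself did not survive in this transcript. In Shotton et al. (2011) it is the depth comparison f_θ(I, x) = d_I(x + u / d_I(x)) − d_I(x + v / d_I(x)), where dividing the offsets by the depth at x makes the feature depth invariant. The sketch below assumes the image is a list of rows of depths in metres, that x is a (row, col) pair, and that probes falling off the image read a large constant. The constant and helper names are illustrative assumptions.

```python
BACKGROUND_DEPTH = 1e6  # assumed "large constant" for off-image probes


def depth_at(image, row, col):
    """Depth at (row, col), or the background constant outside the image."""
    if 0 <= row < len(image) and 0 <= col < len(image[0]):
        return image[row][col]
    return BACKGROUND_DEPTH


def depth_feature(image, x, u, v):
    """f_theta(I, x) = d_I(x + u / d_I(x)) - d_I(x + v / d_I(x)).

    Reads three pixels: the one at x and the two depth-normalised probes.
    """
    row, col = x
    d = depth_at(image, row, col)

    def probe(offset):
        return depth_at(image, row + round(offset[0] / d), col + round(offset[1] / d))

    return probe(u) - probe(v)


# Toy scene: a thin vertical limb at 2 m in front of a far background.
image = [[BACKGROUND_DEPTH] * 5 for _ in range(5)]
for r in (1, 2, 3):
    image[r][2] = 2.0

# u probes sideways (hits background), v probes downwards (stays on the limb).
response = depth_feature(image, x=(2, 2), u=(0, 4), v=(2, 0))
```

A large positive response says "background to the side, body below", which is the kind of weak cue the forest combines over many (u, v) pairs.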
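Randomized Decision Forests
• Fast and effective multi-class classifier
• Each split node consists of a feature f_θ and a threshold τ
• At the leaf node reached in tree t, a learned distribution P_t(c | I, x) over body part labels c is stored
• Final classification: average the distributions of all T trees, P(c | I, x) = (1/T) Σ_t P_t(c | I, x)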
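How a forest turns per-tree leaf distributions into one answer can be sketched as follows. This is a minimal illustration assuming each tree is a nested dict, that a pixel's feature responses are precomputed in a list, and that responses below the threshold go left. The data layout, the branch convention, and the two toy trees are assumptions made for the example.

```python
def leaf_distribution(node, feature_values):
    """Walk from the root to a leaf and return its stored label distribution."""
    while "dist" not in node:
        go_left = feature_values[node["feature"]] < node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["dist"]


def forest_posterior(trees, feature_values):
    """Average the leaf distributions: P(c) = (1/T) * sum over trees of P_t(c)."""
    posterior = {}
    for tree in trees:
        for label, p in leaf_distribution(tree, feature_values).items():
            posterior[label] = posterior.get(label, 0.0) + p / len(trees)
    return posterior


tree_a = {"feature": 0, "threshold": 0.5,
          "left": {"dist": {"hand": 0.75, "head": 0.25}},
          "right": {"dist": {"hand": 0.25, "head": 0.75}}}
tree_b = {"feature": 1, "threshold": 0.0,
          "left": {"dist": {"hand": 0.25, "head": 0.75}},
          "right": {"dist": {"hand": 0.5, "head": 0.5}}}

posterior = forest_posterior([tree_a, tree_b], feature_values=[0.2, 3.0])
best_label = max(posterior, key=posterior.get)
```

Averaging keeps a full distribution per pixel rather than a hard label, which is what the later joint proposal step weights by.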
Combining Models
• Multiple classifiers work together
• Committees
• E.g. averaging the predictions of a set of individual models
• E.g. majority votes
• Boosting
• Classifiers trained in sequence
• E.g. AdaBoost
• Decision Tree
• Binary selection corresponding to the traversal of a tree
Decision Tree
• Three major aspects
• A splitting criterion
• A stop-splitting rule
• A rule to assign each leaf to a specific class
• Decision Forests
• A Decision Tree Committee
Randomized Decision Forests
• How to train?
Randomized Decision Forests
• Training
• Each tree is trained on a different set of images
• 2000 example pixels are picked from each image
• Algorithm
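The algorithm figure is not part of this transcript. Shotton et al. (2011) describe a greedy procedure: propose random split candidates (θ, τ), partition the examples, and keep the candidate with the largest information gain. The sketch below shows only that selection step, assuming precomputed feature responses per example and Shannon entropy as the impurity measure. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def best_split(examples, candidates):
    """examples: list of (feature_values, label).
    candidates: list of (feature_index, threshold) proposals.
    Returns the candidate with the largest information gain and that gain."""
    labels = [label for _, label in examples]
    best, best_gain = None, 0.0
    for feature_index, threshold in candidates:
        left = [lab for vals, lab in examples if vals[feature_index] < threshold]
        right = [lab for vals, lab in examples if vals[feature_index] >= threshold]
        if not left or not right:
            continue  # a split that separates nothing is useless
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        gain = entropy(labels) - weighted
        if gain > best_gain:
            best, best_gain = (feature_index, threshold), gain
    return best, best_gain


examples = [([0.1, 5.0], "hand"), ([0.2, 1.0], "hand"),
            ([0.8, 4.0], "head"), ([0.9, 2.0], "head")]
chosen, gain = best_split(examples, candidates=[(0, 0.5), (1, 3.0)])
```

A full trainer would recurse on the left and right subsets until the gain is too small or a maximum depth is reached.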
Randomized Decision Forests
• Algorithm (cont.)
• Training is expensive
• Training 3 trees to depth 20 from 1 million images takes about a day on a 1000-core cluster
• Where does the training data come from?
Training Data
• Depth imaging
• Simplifies the task of background subtraction
• Most important: easy to synthesize!
• Take real images → learn synthesis parameters → generate lots of training data
Joint Position Proposals
• From the previous section,
• Use Mean Shift with a weighted Gaussian kernel
Mean Shift
• Kernel density estimator
• Discrete points → continuous function
• Calculate the gradient at the initial point and shift along it
• Iterate until convergence
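The iteration above can be sketched in one dimension. With a Gaussian kernel, following the density gradient amounts to repeatedly moving to the kernel-weighted mean of the points. The sketch assumes scalar positions, per-point weights, and a single bandwidth. The names, tolerance, and sample values are illustrative assumptions.

```python
import math


def mean_shift_mode(points, weights, start, bandwidth, tol=1e-6, max_iter=100):
    """Climb from `start` to a nearby mode of the weighted Gaussian
    kernel density estimate built on `points`."""
    x = start
    for _ in range(max_iter):
        kernel = [w * math.exp(-((x - p) / bandwidth) ** 2)
                  for p, w in zip(points, weights)]
        total = sum(kernel)
        if total == 0.0:
            return x  # too far from every point to get a usable gradient
        new_x = sum(k * p for k, p in zip(kernel, points)) / total
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x


# Three agreeing samples near 1.0 and one weak outlier at 5.0.
points = [0.9, 1.0, 1.1, 5.0]
weights = [1.0, 1.0, 1.0, 0.2]
mode = mean_shift_mode(points, weights, start=1.5, bandwidth=0.5)
```

The outlier barely moves the result because its kernel weight at the mode is negligible, which is the robustness that makes mode finding preferable to a plain average.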
Conclusion
• Depth images may contain enough information to solve human pose problems
• Depth images are color and texture invariant, which greatly simplifies the correspondence problem
• A deep combined model with sufficient training data can become a good classifier even with simple features
• Buy a Kinect for the lab
References
• Shotton, J., A. Fitzgibbon, et al. (2011). "Real-Time Human Pose Recognition in Parts from Single Depth Images." Microsoft Research Cambridge & Xbox Incubation
• Freedman, B., A. Shpunt, et al. (2008). Depth mapping using projected patterns, US 2010/0118123A1
• Freedman, B., A. Shpunt, et al. (2008). Distance-Varying Illumination and Imaging Techniques for Depth Mapping, US 2010/0290698A1
• Salvi, J., S. Fernandez, et al. (2010). "A state of the art in structured light patterns for surface profilometry." Pattern Recognition 43(8): 2666-2680.
• Albitar, I., P. Graebling, et al. (2007). "Robust structured light coding for 3D reconstruction." IEEE.
• Scharstein, D. and R. Szeliski (2003). "High-accuracy stereo depth maps using structured light." IEEE.
• Breiman, L. (2001). "Random forests." Machine learning 45(1): 5-32.
• Amit, Y. and D. Geman (1997). "Shape quantization and recognition with randomized trees." Neural computation 9(7): 1545-1588.
• MacCormick, J. "How does the Kinect work?" users.dickinson.edu/~jmac/selected-talks/kinect.pdf
• “Structured Light”, www.igp.ethz.ch/photogrammetry/.../MV-SS2011-structured.pdf
• http://en.wikipedia.org/wiki/Kinect
• http://en.wikipedia.org/wiki/Structured-light_3D_scanner
• http://en.wikipedia.org/wiki/Triangulation
• http://dms.irb.hr/tutorial/tut_dtrees.php
• http://www.anandtech.com/show/4057/microsoft-kinect-the-anandtech-review/2
• Chen, Y. S. and B. T. Chen (2003). "Measuring of a three-dimensional surface by use of a spatial distance computation." Applied optics 42(11): 1958-1972.