Academic excellence for business and the professions
Hand gesture recognition using Kinect
Muhammad Asad
Table of Contents
1. Introduction to Kinect
I. What is Kinect?
II. How does it work?
III. Why Kinect?
IV. Does Kinect have any limitations?
2. Hand Gesture recognition
I. Types of Hand gestures
II. Distance-Invariant Segmentation of Hand
III. Feature Extraction
IV. Neural Network and HMM training
3. Guidance system for visually impaired
I. Feature extraction
II. Feature Selection
III. Guidance decision
IV. Test sequences
03/05/2013 2
What is Kinect?
• A sensor which is capable of providing
– Depth Image
– RGB(intensity) Image
– Audio from Multi-array microphone
• Real-time provision
• Low cost
• Operating range: 0.5m to 8m
• Originally developed for Microsoft Xbox 360 gaming console
• Now used in different computer vision research areas [1]
[1] Z. Zhang, “Microsoft Kinect sensor and its effect,” IEEE Multimedia, vol. 19, no. 2, pp. 4–10, 2012.
Image taken from: http://en.wikipedia.org/wiki/Kinect
Figure 1. Kinect Sensor
How does it work?
• A pattern of infrared points is projected onto the scene
• The infrared camera captures an image of the projected pattern
• The captured image is correlated with the reference pattern for a known distance
• Real-time depth image (30 fps)
Image taken from [1].
Figure 2. Inside Kinect Sensor
How does it work?
Figure 3. How Kinect Sensor Works?
How does it work?
• Depth Image
– Grayscale image
– Normalized Range: 0 – 255
– Invalid Depth: 0
– Darker pixels = Less distance
– Bright pixels = More distance
– Black pixels = no depth
Figure 4. Kinect Depth Map Visualization
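The depth-image convention above can be sketched in a few lines. This is a minimal sketch, assuming raw depth arrives in millimetres and that the 0.5 m to 8 m operating range quoted earlier is used as the normalization range; the function name `normalize_depth`, its parameters and the clamping behaviour are illustrative and not taken from the slides.

```python
def normalize_depth(depth_mm, near_mm=500, far_mm=8000):
    """Map raw depth (mm) to 8-bit grayscale; 0 is reserved for invalid depth."""
    out = []
    for row in depth_mm:
        out_row = []
        for d in row:
            if d <= 0:
                out_row.append(0)  # no depth reading -> black pixel
            else:
                d = min(max(d, near_mm), far_mm)  # clamp to the assumed range
                # Scale valid depth to 1..255 so it never collides with invalid 0;
                # nearer pixels come out darker, farther pixels brighter.
                out_row.append(1 + round((d - near_mm) * 254 / (far_mm - near_mm)))
        out.append(out_row)
    return out
```

Reserving 0 for invalid depth keeps black pixels distinguishable from very near surfaces, which matches the "Black pixels = no depth" convention on the slide.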
Why Kinect?
• Depth Image
– 3D Shape Information
– Invariant to illumination/lighting changes
– Segmentation
• RGB (Intensity) Image
– Can be aligned to depth
– Details about texture/colour
• Audio from Multi-array microphone
– Source localization
– Ambient noise suppression
• Real-time
• Low cost
Why Kinect?
Images taken from [1] and [2].
[2] T. Weise, S. Bouaziz, H. Li, and M. Pauly, “Realtime performance-based facial animation,” ACM Trans. Graph., vol. 30, no. 4, p. 77, 2011.
Does Kinect have any limitations?
• Occlusion of projected pattern
• Small objects
• Light absorbing surfaces
• Scattering of infra-red pattern
• Noise with increased distance [3]
[3] K. Khoshelham and S.O. Elberink, “Accuracy and resolution of kinect depth data for indoor mapping applications,” Sensors, vol. 12, no. 2, pp. 1437–1454, 2012.
Does Kinect have any limitations?
Figure 5. Limitations of Kinect Sensor
Types of Hand gestures
• Static hand gestures
– Defined by hand shape, position and orientation only
– Example: symbolic gestures in sign language
• Dynamic hand gestures
– Temporal integration of static hand gestures
– Defined by hand shape, position, orientation, motion, acceleration and displacement
– More natural gestures can be modelled
– Examples: telepresence robotics, mapped gestures on mobile devices
OpenNI hand tracker
Distance-invariant segmentation
• OpenNI hand tracker used
Figure 6. Motivation behind Distance Invariant Segmentation
Distance-invariant segmentation
• Inverse relation between:
– distance Pz of hand
– side length S of segmented hand region
• Dataset collection:
– Varying distance Pz of hand
– Ground truth segmentation size S
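The inverse relation between Pz and S can be illustrated with a small fitting sketch. It assumes the simplest inverse model, S = k / Pz, fitted by least squares to (Pz, S) pairs; the slides state only that the relation is inverse, so the model form, the function names and the sample numbers in the usage note are hypothetical.

```python
def fit_inverse_model(distances_mm, sides_px):
    """Least-squares fit of the assumed model S = k / Pz; returns k."""
    inv = [1.0 / pz for pz in distances_mm]
    # Minimising sum((S - k * (1/Pz))^2) gives k = sum(S/Pz) / sum(1/Pz^2).
    return sum(s * x for s, x in zip(sides_px, inv)) / sum(x * x for x in inv)


def crop_box(px, py, pz, k):
    """Square crop (left, top, side) centred on the tracked hand position."""
    side = round(k / pz)  # the segmented region shrinks as the hand moves away
    return px - side // 2, py - side // 2, side
```

For example, with made-up ground-truth pairs (600 mm, 160 px), (800 mm, 120 px) and (1200 mm, 80 px), the fit returns k close to 96000, and `crop_box(320, 240, 960, 96000)` gives a 100-pixel square centred on the hand.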
Results: Distance-invariant segmentation
Figure 7. Distance-Invariant Segmentation vs Fixed size Segmentation
Feature Extraction
• Projection extraction:
where Cz = Pz – S/2
Feature Extraction
• Projection extraction
• Based on projection based action recognition [4]
[4] W. Li, Z. Zhang, and Z. Liu, “Action recognition based on a bag of 3d points,” in IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2010. IEEE, 2010, pp. 9–14.
Figure 8. Projection Mask extraction; (a) XY Projection (b) ZX Projection (c) ZY Projection
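The three projection masks in Figure 8 can be sketched as follows. This is a minimal sketch, assuming the hand is a list of 3D points inside a cube of side S centred on the tracked hand position, with the cube corner at P – S/2 on each axis (the slides give this only for the z axis, as Cz). The function name, the grid size `n` and the axis-to-row/column mapping are illustrative; `n=64` is chosen only because it matches the 64x64x3 network input listed later in the slides.

```python
import math


def projection_masks(points, centre, side, n=64):
    """Binary XY, ZX and ZY masks of the points that fall inside the cube."""
    cx, cy, cz = (c - side / 2 for c in centre)  # cube corner, e.g. Cz = Pz - S/2
    xy = [[0] * n for _ in range(n)]
    zx = [[0] * n for _ in range(n)]
    zy = [[0] * n for _ in range(n)]
    for x, y, z in points:
        # Quantize each coordinate into one of n cells along its axis.
        i, j, k = (math.floor((v - c) * n / side)
                   for v, c in zip((x, y, z), (cx, cy, cz)))
        if all(0 <= idx < n for idx in (i, j, k)):
            xy[j][i] = 1  # rows = y, columns = x
            zx[i][k] = 1  # rows = x, columns = z
            zy[j][k] = 1  # rows = y, columns = z
    return xy, zx, zy
```

Points outside the cube are dropped, so the masks depend only on the hand region and not on the background.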
Quantization Error in Projections
Figure 9. Quantization and Random Error noise in Projection Masks at varying distance from Kinect sensor; (a) 700mm (b) 950mm (c) 1200mm (d) 1450mm (e) 1700mm
Quantization Error Reduction [5]
• Morphological Operation
• Averaging based interpolation
[5] M. Asad and C. Abhayaratne, “Kinect Depth Stream Pre Processing for Hand Gesture Recognition,” IEEE International Conference on Image Processing (ICIP'13), September 15-18, 2013 (Accepted)
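One common morphological operation for filling small gaps in a binary projection mask is closing (dilation followed by erosion). The slides do not say which operation or structuring element [5] uses, so the 3x3 square neighbourhood and the helper names below are assumptions for illustration only; the averaging based interpolation step is not sketched here.

```python
def _filter3x3(mask, reducer):
    """Apply max (dilation) or min (erosion) over each pixel's 3x3 neighbourhood."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            # The window is clipped at the image border.
            window = [mask[rr][cc]
                      for rr in range(max(r - 1, 0), min(r + 2, h))
                      for cc in range(max(c - 1, 0), min(c + 2, w))]
            out[r][c] = reducer(window)
    return out


def close_mask(mask):
    """Morphological closing: dilation followed by erosion."""
    return _filter3x3(_filter3x3(mask, max), min)
```

Closing fills one-pixel holes of the kind quantization leaves in a mask while leaving isolated foreground pixels unchanged.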
Feature Extraction
Figure 10. Contour feature extraction
Gesture Stages
• Swipe right and swipe left gestures
• Divided into four gesture stages
• Swipe left: 1->2->3->4
• Swipe right: 4->3->2->1
Figure 11. Gesture stages for swipe gestures; (a) Stage 1 (b) Stage 2 (c) Stage 3 (d) Stage 4
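The stage orderings above can be checked with a simple deterministic matcher. This is a stand-in for illustration, not the HMM used in the slides: it assumes a per-frame sequence of stage labels (1 to 4) is already available, and the function name `classify_swipe` is hypothetical.

```python
def classify_swipe(stages):
    """Return 'left', 'right' or None from a per-frame sequence of stage labels."""
    # Collapse consecutive repeats, e.g. [1, 1, 2, 3, 3, 4] -> [1, 2, 3, 4].
    collapsed = [s for i, s in enumerate(stages) if i == 0 or s != stages[i - 1]]
    if collapsed == [1, 2, 3, 4]:
        return "left"
    if collapsed == [4, 3, 2, 1]:
        return "right"
    return None  # incomplete or out-of-order sequence
```

Unlike an HMM, this matcher has no tolerance for a misclassified frame, which is one reason a probabilistic temporal model is the more robust choice.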
Neural Network Training
• Number of Neurons
– 64x64x3
– 300
– 15
– 1
Figure 12. Neural Network Structure
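The neuron counts above give a sense of the size of the network. The sketch below counts trainable parameters, assuming fully connected layers with one bias per neuron; the slides list only the neuron counts, so the connectivity and the use of biases are assumptions.

```python
def count_parameters(layer_sizes):
    """Weights plus biases for a fully connected feed-forward network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


layers = [64 * 64 * 3, 300, 15, 1]  # neuron counts from the slide
total = count_parameters(layers)    # 3,686,700 + 4,515 + 16 = 3,691,231
```

Under these assumptions almost all parameters sit between the 12,288 inputs and the first 300 neurons.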
Neural Network Training
Neural Network Output with Varying Distance
Figure 13. Neural Network Response with varying distance from the sensor
Neural Network and HMM
Figure 14. Neural Network and HMM Response against time
Demo
Guidance system for Visually Impaired Person [6]
[6] M. Asad and W. Ikram, “Smartphone based Guidance System for Visually Impaired Person,” IEEE International Conference on Image Processing Theory, Tools and Applications (IPTA'12), October 15-18, 2012, Turkey
Figure 15. Flowchart of the guidance system
Feature Extraction: Edge Detection
Figure 16. Edge Detection using (b) Canny (c) Method in [7]
[7] Y. Zhao, W. Gui, and Z. Chen, “Edge detection based on multi-structure elements morphology,” in Intelligent Control and Automation, 2006. WCICA 2006. The Sixth World Congress on, vol. 2. IEEE, 2006, pp. 9795–9798.
Feature Selection: Hough Transform
Figure 17. Hough Transform of Fig. 16 with Hough Peaks
Feature Selection: Vanishing Point
Figure 18. Vanishing Point Extraction
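A vanishing point can be estimated as the intersection of two dominant lines found by the Hough transform. The sketch below assumes each Hough peak is a line in normal form, x cos θ + y sin θ = ρ, and intersects two of them; the slides do not state how [6] combines the peaks, so this pairing and the function name `intersect` are assumptions.

```python
import math


def intersect(line_a, line_b):
    """Intersection of two lines given as (rho, theta); None if they are parallel."""
    (r1, t1), (r2, t2) = line_a, line_b
    # Solve the 2x2 system  cos(t) * x + sin(t) * y = rho  for both lines.
    det = math.cos(t1) * math.sin(t2) - math.sin(t1) * math.cos(t2)
    if abs(det) < 1e-9:
        return None
    x = (r1 * math.sin(t2) - r2 * math.sin(t1)) / det
    y = (r2 * math.cos(t1) - r1 * math.cos(t2)) / det
    return x, y
```

For example, the vertical line x = 3 is (ρ, θ) = (3, 0) and the horizontal line y = 4 is (4, π/2); their intersection is the point (3, 4).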
Guidance Decision
Figure 19. Guidance Decision
Test Sequences
Thank you!!