Using the Kinect for fun and profit – by Tam Hanna
DESCRIPTION
Very few devices offer features as fascinating as those of the Microsoft Kinect. This seminar teaches you what the Kinect can do and how you can develop for it. Attendees are encouraged to bring a notebook with Visual C# 2010 Express Edition and the latest Kinect SDK installed so that they can profit fully from the talk. A sensor will be available for testing your own applications.
TRANSCRIPT
Using the Kinect
for fun
and profit
About /me
• Tam HANNA
– Director, Tamoggemon Holding k.s.
– Runs web sites about mobile computing
– Writes scientific books
Agenda
• Kinect – what is that?
• Streams
• Skeletons
• Facial tracking
• libfreenect
• OpenNI
Slide download
• http://www.tamoggemon.com/test/Codemotion-Kinect.ppt
• URL IS case sensitive
Kinect – what is that?
History - I
• Depth: PrimeSense technology
– Not from Redmond
• First public mention: 2007
– Bill Gates, D5 conference
– "Camera for game control"
Contrast detection
Where does the shirt end?
Dot matrix
Shadows / dead areas
Shadows / dead areas - II
History - II
• 2006: Wii ships
– Best-selling console of its generation
• 2009: E3 conference
– Announcement of "Project Natal"
• 2010: no CPU in sensor
– Takes 10% of the Xbox 360's CPU
History - III
• November 4, 2010
– First shipment
– "We will sue anyone who reverse engineers"
• June 2011
– Official SDK
System overview
Kinect provides
• Video stream
• Depth stream
– (IR stream)
• Accelerometer data
• Rest: computed
Family tree
• Kinect for XBOX
– Normal USB
• Kinect bundle
– MS-Fucked USB
– Needs PSU
• Kinect for Windows
– Costs more
– Legal to deploy
Cheap from China
Streams
Kinect provides "streams"
• Repeatedly updated bitmaps
• Push or pull processing possible
– Attention: processing time!!!
Color stream
• Two modes
– VGA@30fps
– 1280x960@12fps
• Simple data format
– 8 bits / component
– R / G / B / A components
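A minimal sketch of addressing one pixel in that buffer; note that with ColorImageFormat.RgbResolution640x480Fps30 the SDK actually fills the array in Bgr32 order (blue, green, red, unused filler byte). myColorArray is the frame buffer from the sample code later in this talk.

// Minimal sketch: read the pixel at (x, y) from a 640x480 colour frame.
// myColorArray is the byte[] filled via ColorImageFrame.CopyPixelDataTo().
int x = 320, y = 240;
int i = (y * 640 + x) * 4;         // four bytes per pixel
byte blue  = myColorArray[i];      // Bgr32 layout: blue comes first
byte green = myColorArray[i + 1];
byte red   = myColorArray[i + 2];  // myColorArray[i + 3] is unused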
Depth stream
• Two modes
– Unlimited range
– Reduced range, with player indexing
Depth stream - II
• 16-bit words
• Special encoding for limited range (depth and player index packed into one word):
– Bits 15–3: Depth[12] … Depth[0] (depth value)
– Bits 2–0: Player[2] … Player[0] (player index)
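The SDK exposes bitmask constants for unpacking such a word; a minimal sketch (myArray is the short[] depth buffer from the sample code later in this talk):

// Minimal sketch: split one depth word into player index and depth value.
short raw = myArray[i];
int player  = raw & DepthImageFrame.PlayerIndexBitmask;        // lower 3 bits
int depthMm = raw >> DepthImageFrame.PlayerIndexBitmaskWidth;  // upper 13 bits, millimetres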
Depth stream - III
IR stream
• Instead of color data
• 640x480@30fps
• 16-bit words
• IR data in the 10 most significant bits
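A minimal sketch of recovering the intensity from one such word; myIrArray is a hypothetical ushort[] holding the assembled 16-bit IR words:

// Minimal sketch: the IR value sits in the 10 most significant bits.
ushort raw = myIrArray[i];   // myIrArray: hypothetical IR frame buffer
int intensity = raw >> 6;    // discard the 6 unused low bits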
Finding the Kinect
• SDK supports multiple sensors per PC
• Find one
• Microsoft.Kinect.Toolkit
XAML part
<Window x:Class="KinectWPFD2.MainWindow"
        xmlns:toolkit="clr-namespace:Microsoft.Kinect.Toolkit;assembly=Microsoft.Kinect.Toolkit"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="MainWindow" Height="759" Width="704">
  <Grid>
    <Image Height="480" HorizontalAlignment="Left" Name="image1"
           Stretch="Fill" VerticalAlignment="Top" Width="640" />
    <toolkit:KinectSensorChooserUI x:Name="SensorChooserUI" IsListening="True"
           HorizontalAlignment="Center" VerticalAlignment="Top" />
    <CheckBox Content="Render overlay" Height="16" HorizontalAlignment="Left"
              Margin="267,500,0,0" Name="ChkRender" VerticalAlignment="Top" />
  </Grid>
</Window>
Code - I
public partial class MainWindow : Window
{
    KinectSensor mySensor;
    KinectSensorChooser myChooser;

    public MainWindow()
    {
        InitializeComponent();
        myChooser = new KinectSensorChooser();
        myChooser.KinectChanged += new EventHandler<KinectChangedEventArgs>(myChooser_KinectChanged);
        this.SensorChooserUI.KinectSensorChooser = myChooser;
        myChooser.Start();
Code - II
void myChooser_KinectChanged(object sender, KinectChangedEventArgs e)
{
    if (null != e.OldSensor)
    {
        if (mySensor != null) { mySensor.Dispose(); }
    }
    if (null != e.NewSensor)
    {
        mySensor = e.NewSensor;

Initialize stream
mySensor.DepthStream.Enable(DepthImageFormat.Resolution640x480Fps30);
mySensor.ColorStream.Enable(ColorImageFormat.RgbResolution640x480Fps30);
myArray = new short[this.mySensor.DepthStream.FramePixelDataLength];
myColorArray = new byte[this.mySensor.ColorStream.FramePixelDataLength];
mySensor.AllFramesReady += new EventHandler<AllFramesReadyEventArgs>(mySensor_AllFramesReady);
try
{
    this.mySensor.Start();
    SensorChooserUI.Visibility = Visibility.Hidden;
}
Process stream
void mySensor_AllFramesReady(object sender, AllFramesReadyEventArgs e)
{
    ColorImageFrame c = e.OpenColorImageFrame();
    DepthImageFrame d = e.OpenDepthImageFrame();
    if (c == null || d == null) return;
    c.CopyPixelDataTo(myColorArray);
    d.CopyPixelDataTo(myArray);
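To put the colour frame on screen via the image1 element from the XAML above, one option is a WPF WriteableBitmap. A minimal sketch, assuming a myBitmap field that is not part of the original sample:

// Minimal sketch: render myColorArray through a WriteableBitmap.
// myBitmap is a hypothetical field of type WriteableBitmap.
if (myBitmap == null)
{
    myBitmap = new WriteableBitmap(640, 480, 96, 96, PixelFormats.Bgr32, null);
    image1.Source = myBitmap;
}
myBitmap.WritePixels(new Int32Rect(0, 0, 640, 480), myColorArray, 640 * 4, 0);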
Problem: Calibration
• Depth and color sensors are not aligned
• Positions in the two data arrays do not match
Solution
• CoordinateMapper class
• Maps between various frame types
– Depth and color
– Skeleton and color
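A minimal sketch of mapping a single depth pixel into colour-frame coordinates (the point and depth values are made up for illustration):

// Minimal sketch: where does this depth pixel land in the colour image?
DepthImagePoint dp = new DepthImagePoint { X = 320, Y = 240, Depth = 1500 };
ColorImagePoint cp = mySensor.CoordinateMapper.MapDepthPointToColorPoint(
    DepthImageFormat.Resolution640x480Fps30, dp,
    ColorImageFormat.RgbResolution640x480Fps30);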
On Push mode
• Kinect can push data to the application
• Preferred mode of operation
• But: sensitive to processing time
• If the handler takes too long -> the app stalls
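The pull alternative is polling a stream yourself; a minimal sketch, assuming the sensor has already been started:

// Minimal sketch of pull mode: wait up to 40 ms for the next depth frame
// instead of relying on the AllFramesReady event.
using (DepthImageFrame d = mySensor.DepthStream.OpenNextFrame(40))
{
    if (d != null)
    {
        d.CopyPixelDataTo(myArray);
    }
}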
Skeletons
What is tracked?
• Data format
– Real-life coordinates
• Color-mappable
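"Real-life coordinates" means every joint position is a SkeletonPoint in metres, relative to the sensor; a minimal sketch, with aSkeleton being a tracked Skeleton as in the loop a few slides further on:

// Minimal sketch: joint positions arrive in metres, sensor-relative.
Joint head = aSkeleton.Joints[JointType.Head];
if (head.TrackingState == JointTrackingState.Tracked)
{
    float x = head.Position.X;   // metres, left/right of the sensor
    float y = head.Position.Y;   // metres, up/down
    float z = head.Position.Z;   // metres, distance from the sensor
}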
Initialize stream
if (null != e.NewSensor)
{
    mySensor = e.NewSensor;
    mySensor.SkeletonStream.Enable();
Get joints
void mySensor_AllFramesReady(object sender, AllFramesReadyEventArgs e)
{
    ColorImageFrame c = e.OpenColorImageFrame();
    SkeletonFrame s = e.OpenSkeletonFrame();
    if (c == null || s == null) return;
    c.CopyPixelDataTo(myColorArray);
    s.CopySkeletonDataTo(mySkeletonArray);
    foreach (Skeleton aSkeleton in mySkeletonArray)
    {
        DrawBone(aSkeleton.Joints[JointType.HandLeft],
                 aSkeleton.Joints[JointType.WristLeft], armPen, drawingContext);
Use joints
private void DrawBone(Joint jointFrom, Joint jointTo, Pen aPen, DrawingContext aContext)
{
    if (jointFrom.TrackingState == JointTrackingState.NotTracked ||
        jointTo.TrackingState == JointTrackingState.NotTracked)
    {
    }
    if (jointFrom.TrackingState == JointTrackingState.Inferred ||
        jointTo.TrackingState == JointTrackingState.Inferred)
    {
        ColorImagePoint p1 = mySensor.CoordinateMapper.MapSkeletonPointToColorPoint(
            jointFrom.Position, ColorImageFormat.RgbResolution640x480Fps30);
    }
    if (jointFrom.TrackingState == JointTrackingState.Tracked ||
        jointTo.TrackingState == JointTrackingState.Tracked)
Facial tracking
What is tracked - I
What is tracked - II
What is tracked - III
AUs?
• Action Units: research by Paul EKMAN
• Quantify facial motion
Structure
• C++ library with the algorithms
• Basic .NET wrapper provided
– Incomplete
– Might change!!
Initialize face tracker
myFaceTracker = new FaceTracker(mySensor);
Feed face tracker
FaceTrackFrame myFrame = null;
foreach (Skeleton aSkeleton in mySkeletonArray)
{
    if (aSkeleton.TrackingState == SkeletonTrackingState.Tracked)
    {
        myFrame = myFaceTracker.Track(ColorImageFormat.RgbResolution640x480Fps30,
            myColorArray, DepthImageFormat.Resolution640x480Fps30, myArray, aSkeleton);
        if (myFrame.TrackSuccessful == true) { break; }
    }
}
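Once a frame has tracked successfully, the wrapper exposes the AU coefficients; a minimal sketch:

// Minimal sketch: read the animation-unit coefficients of a tracked face.
var aus = myFrame.GetAnimationUnitCoefficients();
float jawLower = aus[AnimationUnit.JawLower];   // roughly -1 .. +1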
Calibration
• OUCH!
– Not all snouts are equal
• Maximums vary
libfreenect
What is it?
• Result of the Kinect hacking competition
• Bundled with most Linux distributions
• "Basic Kinect data parser"
Set-up
• /etc/udev/rules.d/66-kinect.rules
#Rules for Kinect ####################################################
SYSFS{idVendor}=="045e", SYSFS{idProduct}=="02ae", MODE="0660", GROUP="video"
SYSFS{idVendor}=="045e", SYSFS{idProduct}=="02ad", MODE="0660", GROUP="video"
SYSFS{idVendor}=="045e", SYSFS{idProduct}=="02b0", MODE="0660", GROUP="video"
### END #############################################################
Set-up II
• sudo adduser $USER plugdev
• sudo usermod -a -G video tamhan
• tamhan@tamhan-X360:~$ freenect-glview
Kinect camera test
Number of devices found: 1
Could not claim interface on camera: -6
Could not open device
Set-up III
Problems
• gspca-kinect
– Kernel module, uses the Kinect as a webcam
– Blocks other libraries
– sudo modprobe -r gspca_kinect
• Outdated version widely deployed
– API not compatible
Update library
• sudo foo
• sudo add-apt-repository ppa:floe/libtisch
• sudo apt-get update
• sudo apt-get install libfreenect libfreenect-dev libfreenect-demos
libfreenect - II
color stream
Implementing it
• libfreenect: C++ library
• Question: which framework?
• Answer: Qt (what else? ;))
The .pro file
QT += core gui
TARGET = QtDepthFrame
CONFIG += i386
DEFINES += USE_FREENECT
LIBS += -lfreenect
The freenect thread
• Library needs processing time
– Does not multithread itself
• Should be provided outside of the main app thread
class QFreenectThread : public QThread
{
    Q_OBJECT
public:
    explicit QFreenectThread(QObject *parent = 0);
    void run();
signals:
public slots:
public:
    bool myActive;
    freenect_context *myContext;
};

QFreenectThread::QFreenectThread(QObject *parent) : QThread(parent)
{
}

void QFreenectThread::run()
{
    while(myActive)
    {
        if(freenect_process_events(myContext) < 0)
        {
            qDebug("Cannot process events!");
            QApplication::exit(1);
        }
    }
}
QFreenect
• Main engine module
– Contact point between the Kinect and the app
• Fires off signals on frame availability
class QFreenect : public QObject
{
    Q_OBJECT
public:
    explicit QFreenect(QObject *parent = 0);
    ~QFreenect();
    void processVideo(void *myVideo, uint32_t myTimestamp=0);
    void processDepth(void *myDepth, uint32_t myTimestamp=0);
signals:
    void videoDataReady(uint8_t* myRGBBuffer);
    void depthDataReady(uint16_t* myDepthBuffer);
public slots:
private:
    freenect_context *myContext;
    freenect_device *myDevice;
    QFreenectThread *myWorker;
    uint8_t* myRGBBuffer;
    uint16_t* myDepthBuffer;
    QMutex* myMutex;
public:
    bool myWantDataFlag;
    bool myFlagFrameTaken;
    bool myFlagDFrameTaken;
    static QFreenect* mySelf;
};
Some C++
QFreenect* QFreenect::mySelf;

static inline void videoCallback(freenect_device *myDevice, void *myVideo, uint32_t myTimestamp=0)
{
    QFreenect::mySelf->processVideo(myVideo, myTimestamp);
}

static inline void depthCallback(freenect_device *myDevice, void *myDepth, uint32_t myTimestamp=0)
{
    QFreenect::mySelf->processDepth(myDepth, myTimestamp);
}
Bring-up
QFreenect::QFreenect(QObject *parent) :
    QObject(parent)
{
    myMutex=NULL;
    myRGBBuffer=NULL;
    myMutex=new QMutex();
    myWantDataFlag=false;
    myFlagFrameTaken=true;
    mySelf=this;

    if (freenect_init(&myContext, NULL) < 0)
    {
        qDebug("init failed");
        QApplication::exit(1);
    }

Bring-up – II
    freenect_set_log_level(myContext, FREENECT_LOG_FATAL);

    int nr_devices = freenect_num_devices(myContext);
    if (nr_devices < 1)
    {
        freenect_shutdown(myContext);
        qDebug("No Kinect found!");
        QApplication::exit(1);
    }

    if (freenect_open_device(myContext, &myDevice, 0) < 0)
    {
        qDebug("Open Device Failed!");
        freenect_shutdown(myContext);
        QApplication::exit(1);
    }

    myRGBBuffer = (uint8_t*)malloc(640*480*3);
    freenect_set_video_callback(myDevice, videoCallback);
    freenect_set_video_buffer(myDevice, myRGBBuffer);
    freenect_frame_mode vFrame = freenect_find_video_mode(FREENECT_RESOLUTION_MEDIUM, FREENECT_VIDEO_RGB);
    freenect_set_video_mode(myDevice, vFrame);
    freenect_start_video(myDevice);

    myWorker=new QFreenectThread(this);
    myWorker->myActive=true;
    myWorker->myContext=myContext;
    myWorker->start();
Shut-Down
QFreenect::~QFreenect()
{
    freenect_close_device(myDevice);
    freenect_shutdown(myContext);
    if(myRGBBuffer!=NULL) free(myRGBBuffer);
    if(myMutex!=NULL) delete myMutex;
}
Data passing
void QFreenect::processVideo(void *myVideo, uint32_t myTimestamp)
{
    QMutexLocker locker(myMutex);
    if(myWantDataFlag && myFlagFrameTaken)
    {
        uint8_t* mySecondBuffer=(uint8_t*)malloc(640*480*3);
        memcpy(mySecondBuffer, myVideo, 640*480*3);
        myFlagFrameTaken=false;
        emit videoDataReady(mySecondBuffer);
    }
}
Format of data word
• Array of bytes
• Three bytes = one pixel
Format of data word - II
for(int x=2; x<640; x++)
{
    for(int y=0; y<480; y++)
    {
        r=(myRGBBuffer[3*(x+y*640)+0]);
        g=(myRGBBuffer[3*(x+y*640)+1]);
        b=(myRGBBuffer[3*(x+y*640)+2]);
        myVideoImage->setPixel(x,y,qRgb(r,g,b));
    }
}
libfreenect - III
depth stream
Extra bring-up
myDepthBuffer = (uint16_t*)malloc(640*480*2);
freenect_set_depth_callback(myDevice, depthCallback);
freenect_set_depth_buffer(myDevice, myDepthBuffer);
freenect_frame_mode aFrame = freenect_find_depth_mode(FREENECT_RESOLUTION_MEDIUM, FREENECT_DEPTH_REGISTERED);
freenect_set_depth_mode(myDevice, aFrame);
freenect_start_depth(myDevice);
Extra processing
void QFreenect::processDepth(void *myDepth, uint32_t myTimestamp)
{
    QMutexLocker locker(myMutex);
    if(myWantDataFlag && myFlagDFrameTaken)
    {
        uint16_t* mySecondBuffer=(uint16_t*)malloc(640*480*2);
        memcpy(mySecondBuffer, myDepth, 640*480*2);
        myFlagDFrameTaken=false;
        emit depthDataReady(mySecondBuffer);
    }
}
Data extraction
void MainWindow::depthDataReady(uint16_t* myDepthBuffer)
{
    if(myDepthImage!=NULL) delete myDepthImage;
    myDepthImage=new QImage(640,480,QImage::Format_RGB32);
    unsigned char r, g, b;
    for(int x=2; x<640; x++)
    {
        for(int y=0; y<480; y++)
        {
            int calcval=(myDepthBuffer[(x+y*640)]);

Data is in millimetres
            if(calcval==FREENECT_DEPTH_MM_NO_VALUE)
            {
                r=255; g=0; b=0;
            }
            else if(calcval>1000 && calcval<2000)
            {
                QRgb aVal=myVideoImage->pixel(x,y);
                r=qRed(aVal); g=qGreen(aVal); b=qBlue(aVal);
            }
            else
            {
                r=0; g=0; b=0;
            }
            myDepthImage->setPixel(x,y,qRgb(r,g,b));
Example
OpenNI
What is OpenNI?
• Open standard for Natural Interaction
– Very Asus-centric
• Provides a generic NI framework
• VERY complex API
Version 1.5 vs Version 2.0
Supported platforms
• Linux
• Windows
– 32-bit only
Want more?
• Book
– German language
– 30 euros
• Launch
– When it's done!