Using the Kinect for fun and profit – by Tam Hanna
DESCRIPTION
Very few devices offer features as fascinating as those of the Microsoft Kinect. This seminar teaches you what the Kinect can do and how you can develop for it. Attendees are encouraged to bring a notebook with Visual C# 2010 Express Edition and the latest Kinect SDK installed so that they can profit fully from the talk. A sensor will be available for testing your own applications.
TRANSCRIPT
Using the Kinect
for fun
and profit
About /me
• Tam HANNA
– Director, Tamoggemon Holding k.s.
– Runs web sites about mobile computing
– Writes scientific books
Agenda
• Kinect – what is that?
• Streams
• Skeletons
• Facial tracking
• libfreenect
• OpenNI
Slide download
• http://www.tamoggemon.com/test/Codemotion-Kinect.ppt
• URL IS case sensitive
Kinect – what is that?
History - I
• Depth: PrimeSense technology
– Not from Redmond
• First public mention: 2007
– Bill Gates, D5 conference
– "Camera for game control"
Contrast detection
Where does the shirt end?
Dot matrix
Shadows / dead areas
Shadows / dead areas - II
History - II
• 2006: Wii ships
– Best-selling console of its generation
• 2009: E3 conference
– Announcement of "Project Natal"
• 2010: no CPU in sensor
– Takes 10% of the Xbox 360's CPU
History - III
• November 4, 2010
– First shipment
– "We will sue anyone who reverse engineers"
• June 2011
– Official SDK
System overview
Kinect provides
• Video stream
• Depth stream
– (IR stream)
• Accelerometer data
• Rest: computed
Family tree
• Kinect for XBOX
– Normal USB
• Kinect bundle
– MS-Fucked USB
– Needs PSU
• Kinect for Windows
– Costs more
– Legal to deploy
Cheap from China
Streams
Kinect provides "streams"
• Repeatedly updated bitmaps
• Push or pull processing possible
– Attention: processing time!!!
Color stream
• Two modes
– VGA@30fps
– 1280x960@12fps
• Simple data format
– 8 bits / component
– R / G / B / A components
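A minimal sketch of addressing one pixel in that buffer; note that with ColorImageFormat.RgbResolution640x480Fps30 the SDK actually fills the array in Bgr32 order (blue, green, red, unused filler byte). myColorArray is the frame buffer from the sample code later in this talk.

// Minimal sketch: read the pixel at (x, y) from a 640x480 colour frame.
// myColorArray is the byte[] filled via ColorImageFrame.CopyPixelDataTo().
int x = 320, y = 240;
int i = (y * 640 + x) * 4;         // four bytes per pixel
byte blue  = myColorArray[i];      // Bgr32 layout: blue comes first
byte green = myColorArray[i + 1];
byte red   = myColorArray[i + 2];  // myColorArray[i + 3] is unused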
Depth stream
• Two modes
– Unlimited range
– Reduced range, with player indexing
Depth stream - II
• 16-bit words
• Special encoding for limited range (depth and player index packed into one word):
– Bits 15–3: Depth[12] … Depth[0] (depth value)
– Bits 2–0: Player[2] … Player[0] (player index)
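The SDK exposes bitmask constants for unpacking such a word; a minimal sketch (myArray is the short[] depth buffer from the sample code later in this talk):

// Minimal sketch: split one depth word into player index and depth value.
short raw = myArray[i];
int player  = raw & DepthImageFrame.PlayerIndexBitmask;        // lower 3 bits
int depthMm = raw >> DepthImageFrame.PlayerIndexBitmaskWidth;  // upper 13 bits, millimetres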
Depth stream - III
IR stream
• Instead of color data
• 640x480@30fps
• 16-bit words
• IR data in the 10 most significant bits
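A minimal sketch of recovering the intensity from one such word; myIrArray is a hypothetical ushort[] holding the assembled 16-bit IR words:

// Minimal sketch: the IR value sits in the 10 most significant bits.
ushort raw = myIrArray[i];   // myIrArray: hypothetical IR frame buffer
int intensity = raw >> 6;    // discard the 6 unused low bits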
Finding the Kinect
• SDK supports multiple sensors per PC
• Find one
• Microsoft.Kinect.Toolkit
XAML part
<Window x:Class="KinectWPFD2.MainWindow"
        xmlns:toolkit="clr-namespace:Microsoft.Kinect.Toolkit;assembly=Microsoft.Kinect.Toolkit"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="MainWindow" Height="759" Width="704">
  <Grid>
    <Image Height="480" HorizontalAlignment="Left" Name="image1"
           Stretch="Fill" VerticalAlignment="Top" Width="640" />
    <toolkit:KinectSensorChooserUI x:Name="SensorChooserUI" IsListening="True"
           HorizontalAlignment="Center" VerticalAlignment="Top" />
    <CheckBox Content="Render overlay" Height="16" HorizontalAlignment="Left"
              Margin="267,500,0,0" Name="ChkRender" VerticalAlignment="Top" />
  </Grid>
</Window>
Code - I
public partial class MainWindow : Window
{
    KinectSensor mySensor;
    KinectSensorChooser myChooser;

    public MainWindow()
    {
        InitializeComponent();
        myChooser = new KinectSensorChooser();
        myChooser.KinectChanged += new EventHandler<KinectChangedEventArgs>(myChooser_KinectChanged);
        this.SensorChooserUI.KinectSensorChooser = myChooser;
        myChooser.Start();
Code - II
void myChooser_KinectChanged(object sender, KinectChangedEventArgs e)
{
    if (null != e.OldSensor)
    {
        if (mySensor != null) { mySensor.Dispose(); }
    }
    if (null != e.NewSensor)
    {
        mySensor = e.NewSensor;

Initialize stream
mySensor.DepthStream.Enable(DepthImageFormat.Resolution640x480Fps30);
mySensor.ColorStream.Enable(ColorImageFormat.RgbResolution640x480Fps30);
myArray = new short[this.mySensor.DepthStream.FramePixelDataLength];
myColorArray = new byte[this.mySensor.ColorStream.FramePixelDataLength];
mySensor.AllFramesReady += new EventHandler<AllFramesReadyEventArgs>(mySensor_AllFramesReady);
try
{
    this.mySensor.Start();
    SensorChooserUI.Visibility = Visibility.Hidden;
}
Process stream
void mySensor_AllFramesReady(object sender, AllFramesReadyEventArgs e)
{
    ColorImageFrame c = e.OpenColorImageFrame();
    DepthImageFrame d = e.OpenDepthImageFrame();
    if (c == null || d == null) return;
    c.CopyPixelDataTo(myColorArray);
    d.CopyPixelDataTo(myArray);
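To put the colour frame on screen via the image1 element from the XAML above, one option is a WPF WriteableBitmap. A minimal sketch, assuming a myBitmap field that is not part of the original sample:

// Minimal sketch: render myColorArray through a WriteableBitmap.
// myBitmap is a hypothetical field of type WriteableBitmap.
if (myBitmap == null)
{
    myBitmap = new WriteableBitmap(640, 480, 96, 96, PixelFormats.Bgr32, null);
    image1.Source = myBitmap;
}
myBitmap.WritePixels(new Int32Rect(0, 0, 640, 480), myColorArray, 640 * 4, 0);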
Problem: Calibration
• Depth and color sensors are not aligned
• Positions in the two data arrays do not match
Solution
• CoordinateMapper class
• Maps between various frame types
– Depth and color
– Skeleton and color
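A minimal sketch of mapping a single depth pixel into colour-frame coordinates (the point and depth values are made up for illustration):

// Minimal sketch: where does this depth pixel land in the colour image?
DepthImagePoint dp = new DepthImagePoint { X = 320, Y = 240, Depth = 1500 };
ColorImagePoint cp = mySensor.CoordinateMapper.MapDepthPointToColorPoint(
    DepthImageFormat.Resolution640x480Fps30, dp,
    ColorImageFormat.RgbResolution640x480Fps30);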
On Push mode
• Kinect can push data to the application
• Preferred mode of operation
• But: sensitive to processing time
• If the handler takes too long -> the app stalls
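The pull alternative is polling a stream yourself; a minimal sketch, assuming the sensor has already been started:

// Minimal sketch of pull mode: wait up to 40 ms for the next depth frame
// instead of relying on the AllFramesReady event.
using (DepthImageFrame d = mySensor.DepthStream.OpenNextFrame(40))
{
    if (d != null)
    {
        d.CopyPixelDataTo(myArray);
    }
}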
Skeletons
What is tracked?
• Data format
– Real-life coordinates
• Color-mappable
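"Real-life coordinates" means every joint position is a SkeletonPoint in metres, relative to the sensor; a minimal sketch, with aSkeleton being a tracked Skeleton as in the loop a few slides further on:

// Minimal sketch: joint positions arrive in metres, sensor-relative.
Joint head = aSkeleton.Joints[JointType.Head];
if (head.TrackingState == JointTrackingState.Tracked)
{
    float x = head.Position.X;   // metres, left/right of the sensor
    float y = head.Position.Y;   // metres, up/down
    float z = head.Position.Z;   // metres, distance from the sensor
}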
Initialize stream
if (null != e.NewSensor)
{
    mySensor = e.NewSensor;
    mySensor.SkeletonStream.Enable();
Get joints
void mySensor_AllFramesReady(object sender, AllFramesReadyEventArgs e)
{
    ColorImageFrame c = e.OpenColorImageFrame();
    SkeletonFrame s = e.OpenSkeletonFrame();
    if (c == null || s == null) return;
    c.CopyPixelDataTo(myColorArray);
    s.CopySkeletonDataTo(mySkeletonArray);
    foreach (Skeleton aSkeleton in mySkeletonArray)
    {
        DrawBone(aSkeleton.Joints[JointType.HandLeft],
                 aSkeleton.Joints[JointType.WristLeft], armPen, drawingContext);
Use joints
private void DrawBone(Joint jointFrom, Joint jointTo, Pen aPen, DrawingContext aContext)
{
    if (jointFrom.TrackingState == JointTrackingState.NotTracked ||
        jointTo.TrackingState == JointTrackingState.NotTracked)
    {
    }
    if (jointFrom.TrackingState == JointTrackingState.Inferred ||
        jointTo.TrackingState == JointTrackingState.Inferred)
    {
        ColorImagePoint p1 = mySensor.CoordinateMapper.MapSkeletonPointToColorPoint(
            jointFrom.Position, ColorImageFormat.RgbResolution640x480Fps30);
    }
    if (jointFrom.TrackingState == JointTrackingState.Tracked ||
        jointTo.TrackingState == JointTrackingState.Tracked)
Facial tracking
What is tracked - I
What is tracked - II
What is tracked - III
AUs?
• Action Units: research by Paul EKMAN
• Quantify facial motion
Structure
• C++ library with the algorithms
• Basic .NET wrapper provided
– Incomplete
– Might change!!
Initialize face tracker
myFaceTracker = new FaceTracker(mySensor);
Feed face tracker
FaceTrackFrame myFrame = null;
foreach (Skeleton aSkeleton in mySkeletonArray)
{
    if (aSkeleton.TrackingState == SkeletonTrackingState.Tracked)
    {
        myFrame = myFaceTracker.Track(ColorImageFormat.RgbResolution640x480Fps30,
            myColorArray, DepthImageFormat.Resolution640x480Fps30, myArray, aSkeleton);
        if (myFrame.TrackSuccessful == true) { break; }
    }
}
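Once a frame has tracked successfully, the wrapper exposes the AU coefficients; a minimal sketch:

// Minimal sketch: read the animation-unit coefficients of a tracked face.
var aus = myFrame.GetAnimationUnitCoefficients();
float jawLower = aus[AnimationUnit.JawLower];   // roughly -1 .. +1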
Calibration
• OUCH!
– Not all snouts are equal
• Maximums vary
libfreenect
What is it?
• Result of the Kinect hacking competition
• Bundled with most Linux distributions
• "Basic Kinect data parser"
Set-up
• /etc/udev/rules.d/66-kinect.rules
#Rules for Kinect ####################################################
SYSFS{idVendor}=="045e", SYSFS{idProduct}=="02ae", MODE="0660", GROUP="video"
SYSFS{idVendor}=="045e", SYSFS{idProduct}=="02ad", MODE="0660", GROUP="video"
SYSFS{idVendor}=="045e", SYSFS{idProduct}=="02b0", MODE="0660", GROUP="video"
### END #############################################################
Set-up II
• sudo adduser $USER plugdev
• sudo usermod -a -G video tamhan
• tamhan@tamhan-X360:~$ freenect-glview
Kinect camera test
Number of devices found: 1
Could not claim interface on camera: -6
Could not open device
Set-up III
Problems
• gspca-kinect
– Kernel module, uses the Kinect as a webcam
– Blocks other libraries
– sudo modprobe -r gspca_kinect
• Outdated version widely deployed
– API not compatible
Update library
• sudo foo
• sudo add-apt-repository ppa:floe/libtisch
• sudo apt-get update
• sudo apt-get install libfreenect libfreenect-dev libfreenect-demos
libfreenect - II
color stream
Implementing it
• libfreenect: C++ library
• Question: which framework?
• Answer: Qt (what else? ;))
The .pro file
QT += core gui
TARGET = QtDepthFrame
CONFIG += i386
DEFINES += USE_FREENECT
LIBS += -lfreenect
The freenect thread
• Library needs processing time
– Does not multithread itself
• Should be provided outside of the main app thread
class QFreenectThread : public QThread
{
    Q_OBJECT
public:
    explicit QFreenectThread(QObject *parent = 0);
    void run();
signals:
public slots:
public:
    bool myActive;
    freenect_context *myContext;
};

QFreenectThread::QFreenectThread(QObject *parent) : QThread(parent)
{
}

void QFreenectThread::run()
{
    while(myActive)
    {
        if(freenect_process_events(myContext) < 0)
        {
            qDebug("Cannot process events!");
            QApplication::exit(1);
        }
    }
}
QFreenect
• Main engine module
– Contact point between the Kinect and the app
• Fires off signals on frame availability
class QFreenect : public QObject
{
    Q_OBJECT
public:
    explicit QFreenect(QObject *parent = 0);
    ~QFreenect();
    void processVideo(void *myVideo, uint32_t myTimestamp=0);
    void processDepth(void *myDepth, uint32_t myTimestamp=0);
signals:
    void videoDataReady(uint8_t* myRGBBuffer);
    void depthDataReady(uint16_t* myDepthBuffer);
public slots:
private:
    freenect_context *myContext;
    freenect_device *myDevice;
    QFreenectThread *myWorker;
    uint8_t* myRGBBuffer;
    uint16_t* myDepthBuffer;
    QMutex* myMutex;
public:
    bool myWantDataFlag;
    bool myFlagFrameTaken;
    bool myFlagDFrameTaken;
    static QFreenect* mySelf;
};
Some C++
QFreenect* QFreenect::mySelf;

static inline void videoCallback(freenect_device *myDevice, void *myVideo, uint32_t myTimestamp=0)
{
    QFreenect::mySelf->processVideo(myVideo, myTimestamp);
}

static inline void depthCallback(freenect_device *myDevice, void *myDepth, uint32_t myTimestamp=0)
{
    QFreenect::mySelf->processDepth(myDepth, myTimestamp);
}
Bring-up
QFreenect::QFreenect(QObject *parent) :
    QObject(parent)
{
    myMutex=NULL;
    myRGBBuffer=NULL;
    myMutex=new QMutex();
    myWantDataFlag=false;
    myFlagFrameTaken=true;
    mySelf=this;

    if (freenect_init(&myContext, NULL) < 0)
    {
        qDebug("init failed");
        QApplication::exit(1);
    }

Bring-up – II
    freenect_set_log_level(myContext, FREENECT_LOG_FATAL);

    int nr_devices = freenect_num_devices(myContext);
    if (nr_devices < 1)
    {
        freenect_shutdown(myContext);
        qDebug("No Kinect found!");
        QApplication::exit(1);
    }

    if (freenect_open_device(myContext, &myDevice, 0) < 0)
    {
        qDebug("Open Device Failed!");
        freenect_shutdown(myContext);
        QApplication::exit(1);
    }

    myRGBBuffer = (uint8_t*)malloc(640*480*3);
    freenect_set_video_callback(myDevice, videoCallback);
    freenect_set_video_buffer(myDevice, myRGBBuffer);
    freenect_frame_mode vFrame = freenect_find_video_mode(FREENECT_RESOLUTION_MEDIUM, FREENECT_VIDEO_RGB);
    freenect_set_video_mode(myDevice, vFrame);
    freenect_start_video(myDevice);

    myWorker=new QFreenectThread(this);
    myWorker->myActive=true;
    myWorker->myContext=myContext;
    myWorker->start();
Shut-Down
QFreenect::~QFreenect()
{
    freenect_close_device(myDevice);
    freenect_shutdown(myContext);
    if(myRGBBuffer!=NULL) free(myRGBBuffer);
    if(myMutex!=NULL) delete myMutex;
}
Data passing
void QFreenect::processVideo(void *myVideo, uint32_t myTimestamp)
{
    QMutexLocker locker(myMutex);
    if(myWantDataFlag && myFlagFrameTaken)
    {
        uint8_t* mySecondBuffer=(uint8_t*)malloc(640*480*3);
        memcpy(mySecondBuffer, myVideo, 640*480*3);
        myFlagFrameTaken=false;
        emit videoDataReady(mySecondBuffer);
    }
}
Format of data word
• Array of bytes
• Three bytes = one pixel
Format of data word - II
for(int x=2; x<640; x++)
{
    for(int y=0; y<480; y++)
    {
        r=(myRGBBuffer[3*(x+y*640)+0]);
        g=(myRGBBuffer[3*(x+y*640)+1]);
        b=(myRGBBuffer[3*(x+y*640)+2]);
        myVideoImage->setPixel(x,y,qRgb(r,g,b));
    }
}
libfreenect - III
depth stream
Extra bring-up
myDepthBuffer = (uint16_t*)malloc(640*480*2);
freenect_set_depth_callback(myDevice, depthCallback);
freenect_set_depth_buffer(myDevice, myDepthBuffer);
freenect_frame_mode aFrame = freenect_find_depth_mode(FREENECT_RESOLUTION_MEDIUM, FREENECT_DEPTH_REGISTERED);
freenect_set_depth_mode(myDevice, aFrame);
freenect_start_depth(myDevice);
Extra processing
void QFreenect::processDepth(void *myDepth, uint32_t myTimestamp)
{
    QMutexLocker locker(myMutex);
    if(myWantDataFlag && myFlagDFrameTaken)
    {
        uint16_t* mySecondBuffer=(uint16_t*)malloc(640*480*2);
        memcpy(mySecondBuffer, myDepth, 640*480*2);
        myFlagDFrameTaken=false;
        emit depthDataReady(mySecondBuffer);
    }
}
Data extraction
void MainWindow::depthDataReady(uint16_t* myDepthBuffer)
{
    if(myDepthImage!=NULL) delete myDepthImage;
    myDepthImage=new QImage(640,480,QImage::Format_RGB32);
    unsigned char r, g, b;
    for(int x=2; x<640; x++)
    {
        for(int y=0; y<480; y++)
        {
            int calcval=(myDepthBuffer[(x+y*640)]);

Data is in millimetres
            if(calcval==FREENECT_DEPTH_MM_NO_VALUE)
            {
                r=255; g=0; b=0;
            }
            else if(calcval>1000 && calcval<2000)
            {
                QRgb aVal=myVideoImage->pixel(x,y);
                r=qRed(aVal); g=qGreen(aVal); b=qBlue(aVal);
            }
            else
            {
                r=0; g=0; b=0;
            }
            myDepthImage->setPixel(x,y,qRgb(r,g,b));
Example
OpenNI
What is OpenNI?
• Open standard for Natural Interaction
– Very Asus-centric
• Provides a generic NI framework
• VERY complex API
Version 1.5 vs Version 2.0
Supported platforms
• Linux
• Windows
– 32-bit only
Want more?
• Book
– German language
– 30 euros
• Launch
– When it's done!