

A Novel Gesture Driven Fuzzy Interface System For Car Racing Game

Chiranjib Saha∗, Debdipta Goswami∗, Sriparna Saha∗, Amit Konar∗, Anna Lekova† and Atulya K. Nagar‡

∗Department of Electronics and Telecommunication Engineering, Jadavpur University, Kolkata-700050, India
Email: [email protected], [email protected], [email protected] and [email protected]

†Institute of System Engineering and Robotics, Bulgarian Academy of Sciences, Sofia, Bulgaria
Email: [email protected]

‡Department of Mathematics and Computer Science, Liverpool Hope University, United Kingdom
Email: [email protected]

Abstract—The recently developed Kinect sensor has opened a new horizon in Human-Computer Interaction (HCI), and its native connection with Microsoft's product line of Xbox 360 and Xbox One video game consoles makes completely hands-free control possible in the next generation of gaming. Games that require many degrees of freedom, especially the driving control of a car in racing games, are best suited to gesture control, as the use of simple buttons does not scale to the increased number of assistive, comfort, and infotainment functions. In this paper, we propose a Mamdani type-I fuzzy inference system based data-processing module which effectively takes into account the dependence of the actual steering angle on the distance between the two palm positions and the angle they generate with respect to the sagittal plane. The FIS output variable controls the duration of a virtual "key-pressed" event which mimics the user's pressing of the actual keys assigned to control the car's direction in the original game. The acceleration and brake (deceleration) of the vehicle are controlled using the relative displacement of the left and right feet. The proposed experimental setup, interfacing Kinect with a desktop racing game, has shown that the virtual driving environment can easily be applied to any game belonging to this genre.

I. INTRODUCTION

The rapid growth of computer applications for ordinary users in the second half of the 20th century called for a robust and reliable system for interacting with the machine, thus initiating a new subject area: Human-Computer Interaction, also known as HCI. The evolution of input devices toward more intricate man-machine interaction with physical feedback has been made possible by recent advances in sensor, display and rendering technology. The principle used to recognize gestures may be based on mimicry of the human vision system: the reception of photons on the retina and the processing of the signal by the visual cortex [1]. A line of motion-sensing devices that can track human motion and basic gestures using infrared sensing has been developed by different corporations. The most popular motion-sensing devices are Microsoft Kinect, Nintendo Wii, Sony PlayStation Eye and Sega Dreameye. The Microsoft Kinect gives a skeleton of a human standing in its field of view, with the coordinates of 20 joints that can be used by gesture recognition algorithms. Equipped with such revolutionary input devices, computer games, which have been an integral part of computer-based entertainment and depend solely on the features offered by HCI devices, have experienced a paradigm shift toward advanced control, adding more degrees of freedom to the gaming environment. Considering driving-based games, gesture tracking can lead to a new level of experience beyond classical joystick control, enabling the player to move his/her limbs as if controlling an imaginary steering wheel or joystick to move through the game, while the tracking of feet movements yields speed control. Keeping in mind the simulation of virtual reality by head-mounted holographic displays such as the Oculus Rift or Microsoft HoloLens, gesture-controlled steering seems an indispensable part of driving, eliminating the joystick or steering wheel and pedals. However, the challenge lies in the fact that gesture inputs are ridden with more inaccuracy, redundancy and noise than those from a classical contact-based input device, and must be processed with intelligent data-processing and recognition algorithms.

Gesture recognition algorithms and applications have taken centre stage in HCI in the recent decade. The development of an SDK by Microsoft for the Kinect sensor has made it an unparalleled motion-sensing device for research and development. Bonanni [2] developed gesture-driven interaction with a mobile robot using Kinect sensors, employing a hidden Markov model [3]. The use of the Kinect sensor in hand-gesture recognition was carried out by Li [4]. Oszust et al. [5] have proposed a method for recognizing signed expressions observed by the Kinect sensor, where skeletons of the human body along with the shape and position of the hands are taken into account for Polish sign language recognition; the recognition is implemented using a k-nearest neighbour (kNN) classifier. Burba et al. [6] have used the Microsoft Kinect to monitor respiratory rate, estimated by measuring the visual expansion and contraction of the user's chest cavity. This work also highlights the measurement of fidgeting behavior, again using the Kinect sensor, focusing on vertical oscillations of the user's knees.

The feedback-actuator system for steering-wheel driving has been improved and modified with different interfaces over the past years to provide more stability and robustness. Zhang et al. [7] experimented on the driving control of electric vehicles and proposed a new H∞ robust control strategy. Wada and Kimura [8] analysed the stability of a joystick interface for driving where a steering wheel and gas/brake pedals are actuated by electric motors whose rotations are controlled by microcomputers based on 2-DOF joystick commands.

Recent works have focussed on integrating gesture into racing video games, especially in the form of touch-screen gestures or a dummy steering wheel. Pfleging et al. [9] combined speech and gesture on a multi-touch steering wheel; using gestures for manipulation, they provided fine-grained control with immediate feedback. Siversten [10] used a steering-wheel input device with pointer-event detection and angle-compensation technology that can serve as the input signal for vehicular control. However, due to the extremely high degree of freedom required by a driving control system, free-air gesture is best suited to it.

The proposed model of driving control for a racing game mimics the natural driving of an auto-transmission car, with the two hands kept in such a position that they seem to hold an imaginary steering wheel. The player sits in a chair and moves his/her right and left feet forward to actuate acceleration and brake. The steering angle is determined by the angle between the axis created by joining the two hand coordinates and the transverse plane. Since, in a free-air gesture, the player may perceive linear displacement as the governing quantity and may turn the imaginary wheel through a greater angle when the steering radius is small and vice versa, the latter quantity is also taken into account while generating the control signal. The control signal is generated as keyboard interrupts for "keydown" and "keyup" events, with the duration between them as the "key-press duration". This duration is controlled by the steering angle and radius using a type-I Mamdani Fuzzy Inference System to make it precise and robust. The acceleration and brake are controlled by detecting the relative feet positions from the skeleton tracked by the Kinect.

Fig. 1: Twenty body joints of a Kinect-captured skeleton: J1 hip center, J2 spine, J3 shoulder center, J4 head, J5 shoulder left, J6 elbow left, J7 wrist left, J8 hand left, J9 shoulder right, J10 elbow right, J11 wrist right, J12 hand right, J13 hip left, J14 knee left, J15 ankle left, J16 foot left, J17 hip right, J18 knee right, J19 ankle right, J20 foot right.

The main distinctions that our driving model presents in the universe of gesture-driven racing games are its enhanced realism, more precise and comfortable control of the car's movement, and the incorporation of acceleration and brake under gesture control. Inspired by the driving system presented in Forza Motorsport 4 [11], which also includes a Kinect-driven mode, we have chosen normal steering-wheel control for our model. However, in that game the player's gesture is constrained by the fact that he has to keep the hands at a right angle to the body axis for movement, which may be tiresome, as noted by critics [11]. Moreover, its acceleration and brake control is automatic, cutting down the degrees of freedom. We have eliminated the first constraint by considering only the line joining the two hand-joint coordinates, so that the hands can be placed comfortably at any angle with folded elbow joints. Another major improvement is that in the proposed model acceleration and brake can also be controlled by gesture, through feet movements that simulate the real-time driving experience, which was not available in Forza Motorsport 4. Most importantly, in our model even wild turning of the imaginary steering wheel will not cause the disastrous effect it does in Forza Motorsport 4, since the steering angle is processed by a Fuzzy Inference System (FIS) along with the steering diameter.

The rest of the paper is organised as follows. Section II gives a brief overview of the Kinect for Xbox 360 sensor and a description of its skeleton-tracking activity. Section III gives a detailed analysis of the proposed driving model and insight into the Fuzzy Inference System. We describe the experimental setup in Section IV. The relevant simulation results and a discussion of them constitute Section V. Section VI discusses future work and concludes the paper.

II. KINECT SENSOR AND ITS SPECIFICATIONS

Kinect is a line of motion-sensing input devices originally developed for the Xbox and Xbox 360 video game consoles and later extended to non-gaming uses on Windows PCs. The Kinect sensor for Xbox 360 [12]-[14] consists of an RGB camera, an IR camera, an IR projector and a microphone array. It basically looks like a webcam (8-bit VGA resolution of 640×480 pixels with a Bayer color filter). Its long horizontal bar is provided with a motorized base and tilt. The Kinect sensor, along with the associated Software Development Kit (SDK), tracks human motion by generating a skeleton with three-dimensional coordinates within a finite range of distance (roughly 1.2 to 3.5 m, or 3.9 to 11 ft). This is achieved using the IR depth sensor, from which the z-direction value for each joint is obtained, together with the RGB camera. The skeleton produced by the Kinect sensor has twenty body joints. Fig. 1 shows all the joints of the human body considered in the skeleton, along with their indices Ji (1 ≤ i ≤ 20). The background, dress color and lighting of the room are irrelevant to skeleton detection using the Kinect sensor; hence it can recognize human motion over a very wide range of surrounding physical conditions.

Fig. 2: Block diagram of the proposed driving control system. Driving gestures are captured by the Kinect as skeleton data and passed to a data-processing module, which feeds θ and d to the FIS and sgn(θ) and δ to a virtual key-press event simulator; the FIS output T times the key-press events for steering and driving, which reach the game controls as keyboard interrupt signals.

III. GESTURE CONTROLLED DRIVING MODEL

The principal objective of gesture-controlled driving is to provide smooth steering control by means of a comparatively easy dynamic gesture, and to control the acceleration and brake of the car by a similarly easy gesture unrelated to that used in steering, so that the two do not interfere. The most intuitive and easiest solution is to mimic the normal gestures used in real-life driving. When driving a real automobile, the direction is controlled by turning a steering wheel with both hands, and the acceleration and brake are actuated by pressing the corresponding pedals with the right and left feet respectively. The proposed model adopts this steering-wheel concept by tracking the gestures produced by the two hands while holding an imaginary steering wheel. The angle generated by the straight line connecting the two palm positions (left- and right-hand joints in the Kinect skeleton) with the transverse or horizontal plane is given as input to the steering system, along with its sign. An overview of the proposed mechanism is shown as a block diagram in Fig. 2. The complete algorithm is given in Fig. 4.
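As a minimal sketch of this geometry (in Python, with hypothetical (x, y, z) joint tuples; the paper's implementation is in Visual C++ with the Kinect SDK, and the sign convention below is an assumption), θ and d can be recovered from the two hand joints as follows:

```python
import math

def steering_inputs(left_hand, right_hand):
    """Recover the two FIS inputs from the Kinect hand joints.

    left_hand and right_hand are (x, y, z) tuples in Kinect camera
    coordinates (x right, y up, z away from the sensor).  d is the palm
    separation (the steering diameter); theta is the angle, in degrees,
    between the palm-to-palm line and the transverse (horizontal) plane,
    here taken positive when the left hand is above the right, i.e. a
    clockwise (rightward) turn of the imaginary wheel."""
    dx = right_hand[0] - left_hand[0]
    dy = right_hand[1] - left_hand[1]
    dz = right_hand[2] - left_hand[2]
    d = math.sqrt(dx * dx + dy * dy + dz * dz)      # steering diameter
    horiz = math.sqrt(dx * dx + dz * dz)            # projection on the transverse plane
    theta = math.degrees(math.atan2(-dy, horiz))    # left hand higher -> positive
    return theta, d
```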

A. Fuzzy Inference System for Steering

The principal challenge for a gesture-driven steering control system is to achieve the desired robustness. The success of such a system depends on the precise detection and measurement of human limb movements, which are noisy and sometimes unintentionally spurious. Additionally, there are inherent IF-THEN conditional relationships between the input and output variables which are easily identifiable, and their existence serves as motivation to form a fuzzy rule base instead of formulating an exact functional mapping. We propose a Type-I Mamdani Fuzzy Inference System (FIS) to generate the steering output for the racing game. The FIS consists of two input variables and one output variable.

1) Input 1: The Angle of Steering (θ): The fundamental input for steering a car in the racing game is the steering angle, i.e. the angle through which the player seems to turn the imaginary steering wheel. To get a proper measurement of this angle, we calculate the acute angle between the transverse plane and the straight line joining the left- and right-hand joint coordinates in the skeleton tracked by the Kinect. Experiment shows that even during an intended extreme turn this angle does not exceed a maximum value of 60 degrees, so the scale of the input steering angle has been calibrated from −60° to +60°. The angle θ is shown in Fig. 3 with the help of a 3D image of a player in the driving position.

Fig. 3: Three-dimensional model of a player in driving position.

2) Input 2: The Steering Diameter (d): The steering diameter, i.e. the diameter of the steering wheel, is fixed in a real driving system; but in gesture-controlled driving its analogue, the distance between the two hand joints, may vary widely. According to the common behaviour of the human sensory-motor system, the driver tends to rotate the wheel through a greater angle when the diameter is small. But since the actual rotation depends on the steering angle, the smaller the diameter of the wheel, the smaller the angular displacement needed to produce an equal amount of turn; this implies increasing sensitivity of rotation as the wheel diameter decreases. It is worth noting that for small vehicles like motorcycles or small cars, where frequent large turns are required, the steering diameter is small, whereas for larger vehicles and ships, where smaller (and more precise) turns are necessary, a large wheel is provided. Here, however, the intended turning radius of the car must remain within a certain range, free from the error arising from the human perception of turn under a variable steering diameter. So the steering diameter is also calculated from the Kinect-tracked skeleton and fed to the FIS to calculate a precise key-press duration. The input to the FIS is a normalized diameter, for the sake of generalization; the minimum and maximum feasible diameters should be calibrated for each gamer. The steering diameter d is also shown in Fig. 3 with the help of a 3D image.

3) Steering Output (T): Our principal aim is to generate a signal that controls the direction of the car's motion in the gaming environment. In a keyboard-controlled racing game, the car rotates as long as the user presses the proper key; e.g., to turn the car through a desired angle to the left, the left-arrow key is pressed for a proportional duration. This "keypress" event can be simulated virtually by combined "keydown" and "keyup" events, which are capable of generating system calls for keyboard interrupts.

The time duration of the "keypress" event, which is essentially the time lapse between the instants of the "keydown" and "keyup" events related to the right-arrow or left-arrow key (i.e. the virtual key codes), is the factor to be modulated. The width of this time pulse, hereafter termed the "key-press duration" (T), is the quantity governing the amount of rotation the car undergoes during the steering movement. In the absence of a greater level of integration with the game console for the time being, the output of the proposed Fuzzy Inference System (FIS) provides this quantity to the program producing the virtual key-stroke signals at the user end. We have experimentally determined that the response of the racing game is logarithmic, and have hence converted the output scale of the FIS into a logarithmic one. It has also been determined experimentally that T = 200 ms is satisfactory as the maximum key-press duration for sharp bends.
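Since the FIS output universe is calibrated as log10(T/2) over [0, 2] (see Fig. 5(c)), converting it back to a duration is a one-liner; note how the upper end of the universe reproduces the 200 ms maximum:

```python
def keypress_duration_ms(fis_output):
    """Invert the logarithmic output calibration: the FIS emits
    y = log10(T/2) on [0, 2], so T = 2 * 10**y ranges from 2 ms for the
    gentlest correction up to the 200 ms maximum used for sharp bends."""
    y = min(max(fis_output, 0.0), 2.0)   # clamp to the calibrated universe
    return 2.0 * 10.0 ** y
```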

4) Fuzzy Rule Base:
• Inputs: θ, d
• Output: T

For d we choose two linguistic constants, viz. Low and High; for θ, Low, Mid and High; and for T we choose Very Low (V. Low), Low, Mid and High. As we need precise control over T, it has the maximum number of linguistic constants. The primary input for steering is the steering angle θ, which takes precedence over the steering diameter d; so θ has more linguistic constants than d.

1. IF d is Low AND θ is Low THEN T is Very Low.
2. IF d is Low AND θ is Mid THEN T is Low.
3. IF d is Low AND θ is High THEN T is Mid.
4. IF d is High AND θ is Low THEN T is Low.
5. IF d is High AND θ is Mid THEN T is Mid.
6. IF d is High AND θ is High THEN T is High.

procedure EVALFIS(d, θ)
    return the output of the FIS
end procedure

procedure PRESS_KEY(T, keycode k)    ▷ k is the virtual key code of the corresponding key
    key_down(k)
    Sleep(T)
    key_release(k)
end procedure

ParBegin
procedure STEER
    while true do
        T ← EVALFIS(d, θ)
        if |θ| ≥ θ0 then
            if θ > 0 then
                PRESS_KEY(T, RIGHT_ARROW)
            else
                PRESS_KEY(T, LEFT_ARROW)
            end if
        end if
    end while
end procedure

procedure DRIVE
    while true do
        δ ← left_foot.z − right_foot.z
        T ← k · |δ|    ▷ k is the proportionality constant
        if |δ| > threshold then
            if δ > 0 then
                PRESS_KEY(T, UP_ARROW)
            else
                PRESS_KEY(T, DOWN_ARROW)
            end if
        end if
    end while
end procedure

procedure UPDATE
    while true do
        draw data from the Kinect sensors and process θ, d and δ
    end while
end procedure
ParEnd

Fig. 4: Proposed algorithm

Fig. 5: Membership functions of input and output variables: (a) d over [0, 1] (Low, High), (b) θ over [10°, 60°] (Low, Mid, High) and (c) log10(T/2) over [0, 2] (V. Low, Low, Mid, High).

TABLE I: Membership value parameters

Variable  Ling. Const.  Mean   Std. Dev.  α     β     γ
d         Low           0      0.265      -     -     -
          High          1      0.265      -     -     -
θ         Low           9.41   25.3       -     -     -
          Mid           46.77  10.3       -     -     -
          High          61.03  15.3       -     -     -
T         V. Low        -      -          0.8   0     0.6
          Low           -      -          0     0.5   1
          Mid           -      -          0.5   1.2   2
          High          -      -          1.2   1.75  2.88

5) Memberships of Input and Output Variables: For the input variables, Gaussian membership functions have been used; the membership functions of the output variable are triangular. The membership function plots for d, θ and T are shown in Fig. 5 (a), (b) and (c) respectively. The output scale is calibrated logarithmically, i.e. the scale is taken as a power of 10. The parameters of the membership functions are tabulated in Table I. For Gaussian memberships, the mean and standard deviation are given; for triangular ones, three parameters α, β and γ are given, where α, β and γ are the starting threshold, peak and ending threshold of the triangular function. For θ, the scale starts from θ0 = 10° instead of 0°, as the threshold steering angle for actuating a key-press is taken at ±10°. To maintain a sharp sigmoid characteristic, the standard deviation for "Low" is chosen much higher than those of "Mid" and "High" for θ.
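A compact runnable sketch of the inference step is given below (Python; the paper's implementation is in Visual C++). Min implication, max aggregation and centroid defuzzification are standard Mamdani choices assumed here rather than stated in the paper, and the "V. Low" triangle parameters, which appear garbled in our copy of Table I, are assumed to be (−0.8, 0, 0.6):

```python
import math

def gauss(x, mean, sigma):
    """Gaussian membership value."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)

def tri(x, a, b, c):
    """Triangular membership with start a, peak b, end c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Input memberships from Table I (d normalized to [0, 1], theta in degrees).
D_MF = {"Low": (0.0, 0.265), "High": (1.0, 0.265)}
TH_MF = {"Low": (9.41, 25.3), "Mid": (46.77, 10.3), "High": (61.03, 15.3)}
# Output memberships on the log10(T/2) universe [0, 2]; V. Low assumed.
T_MF = {"V.Low": (-0.8, 0.0, 0.6), "Low": (0.0, 0.5, 1.0),
        "Mid": (0.5, 1.2, 2.0), "High": (1.2, 1.75, 2.88)}
# The six rules: (d term, theta term) -> T term.
RULES = [("Low", "Low", "V.Low"), ("Low", "Mid", "Low"), ("Low", "High", "Mid"),
         ("High", "Low", "Low"), ("High", "Mid", "Mid"), ("High", "High", "High")]

def evalfis(d, theta, steps=400):
    """Mamdani inference (min implication, max aggregation, centroid
    defuzzification over a sampled universe); returns log10(T/2)."""
    num = den = 0.0
    for i in range(steps + 1):
        y = 2.0 * i / steps                       # sample the output universe
        mu = 0.0
        for rd, rt, rout in RULES:
            w = min(gauss(d, *D_MF[rd]), gauss(abs(theta), *TH_MF[rt]))
            mu = max(mu, min(w, tri(y, *T_MF[rout])))
        num += y * mu
        den += mu
    return num / den if den else 0.0
```

Under these assumptions the output grows with both |θ| and d, reproducing the precedence encoded in the rule base.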

B. Acceleration and Brake Control

The normal gesture of pressing the pedals with the left and right feet is used for acceleration and brake control. The gamer sits in a driving position and puts his/her right foot forward relative to the left foot to signal acceleration of the car, invoking the corresponding "keypressed" event. The reverse gesture is reserved for braking, i.e. the left foot is advanced with respect to the right foot. Keeping the feet in the normal position signals the maintenance of uniform speed. The difference of the z-coordinates (δ) of the right and left foot joints of the skeleton is used to generate the acceleration and deceleration signal. When δ exceeds a suitable threshold, determined experimentally, the car accelerates, decelerates or keeps a uniform speed depending on the sign of δ. The signal is actually generated in the form of a keyboard interrupt for the up-arrow or down-arrow key together with a key-press duration, which is proportional to the difference quantity measured from the skeleton. Fig. 3 gives a clear depiction of δ in the 3D image of a player.
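This mapping can be sketched as follows; the proportionality constant k and the dead-zone threshold are illustrative calibration values of our own, not the paper's:

```python
def pedal_command(left_foot_z, right_foot_z, k=1500.0, threshold=0.1):
    """Map the forward offset between the feet to an arrow-key press.

    delta = z_left - z_right is positive when the right foot is forward
    (nearer the Kinect), which signals acceleration; the reverse signals
    braking.  Returns (virtual key name, key-press duration in ms), or
    None inside the dead zone (feet level: hold the current speed)."""
    delta = left_foot_z - right_foot_z
    if abs(delta) <= threshold:
        return None                                   # uniform speed
    key = "UP_ARROW" if delta > 0 else "DOWN_ARROW"
    return key, k * abs(delta)                        # T proportional to |delta|
```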

IV. EXPERIMENTAL SETUP

To test the efficiency and smoothness of the proposed driving control system, it was applied to a real-time simulator racing game with turns and an acceleration function. Final Drive Fury by WildTangent Games has been used as the test-bed for the driving mechanism. This game is ideal for testing the mechanism because its driving inputs are sufficient to match a real-life driving scenario, unlike arcade racing games; moreover, the inputs needed to control it are brief, explicit and well-defined.

Final Drive Fury is a competitive racing game in which the player races against others on a cosmopolitan racing track. The controls comprise acceleration with the up-arrow key, braking and reversing with the down-arrow key, and steering with the left and right arrows. The game needs a Windows XP or later operating system with a minimum of 256 MB of RAM. There is no provision for "nitro" or "drift", making the game more realistic; it demands the rigor and precision of a simulator racing game.

Fig. 6: The player in driving position with acceleration actuated with the right foot: (a) Straight Drive, (b) Turn Left, (c) Turn Right.

Fig. 7: Tracked skeleton with the player in driving position, with θ and d marked: (a) Straight Drive, (b) Turn Left, (c) Turn Right.

A Windows 7 PC with 2 GB RAM has been used to test the driving control system on this game. A Kinect 360 sensor is used to track the player's skeleton, and the project is developed in Visual C++ using Kinect SDK 1.8. The player sits in a comfortable position, as shown in Fig. 6, while playing the game. The Kinect is placed at a distance of 1.5-2.1 m from the player and at a height of 0.4-0.8 m. The Kinect tracks the skeleton and sends data to the PC, where it is processed by the FIS and T is generated for steering and acceleration-brake. The player gets the visual information of the game on a screen placed in front of him/her.

The system was tested on 20 subjects as players, all within the age group of 20-30 years. Ten of them were completely inexperienced with any type of simulator racing game before being subjected to the test, while six had prior simulator-racing experience, though not with this particular game. The rest were sufficiently habituated to this game.

The joints used for this model are enumerated below:
• Left Hand, Right Hand (for measuring the steering diameter)
• Shoulder Centre, Spine and Hip Centre (for constructing the sagittal plane)
• Left Foot, Right Foot (for acceleration and brake control)

The input steering angle is the angle between the axis joining the left and right hands and the transverse plane. But since it is difficult to get transverse-plane data from the Kinect skeleton, we calculated the angle with the sagittal plane (the plane formed by the Shoulder Centre, Spine and Hip Centre) and took its complementary angle, as shown in Fig. 7.
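The complement trick above can be sketched as follows (Python, hypothetical (x, y, z) joint tuples; the paper's Visual C++ code is not reproduced). Note that the construction assumes the three torso joints are not collinear, otherwise the plane normal degenerates:

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def steering_angle(shoulder_c, spine, hip_c, left_hand, right_hand):
    """Angle (degrees) between the palm-to-palm line and the transverse
    plane, obtained as the complement of the line's angle with the
    sagittal plane spanned by shoulder centre, spine and hip centre."""
    n = _cross(_sub(spine, shoulder_c), _sub(hip_c, shoulder_c))  # plane normal
    u = _sub(right_hand, left_hand)                               # hand axis
    sin_a = abs(_dot(u, n)) / math.sqrt(_dot(u, u) * _dot(n, n))  # sin(line-plane angle)
    return 90.0 - math.degrees(math.asin(min(1.0, sin_a)))
```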

V. RESULTS AND DISCUSSION

The player sits as shown in Fig. 6 while playing the game. The skeleton tracked by the Kinect application is shown in Fig. 7 for the three principal hand positions during play. The data generated by the Kinect is processed, and the processed data in the form of d, θ and δ are fed to the FIS.

Table II shows sampled data from the Kinect skeleton and the corresponding key-press duration T and key-press type. When |θ| < θ0, where θ0 is set at 10°, the output T is irrelevant, as no keyboard interrupt signal is generated for a right or left turn; this ensures that noisy hand movements produce no turning. The game snapshots for turning left, turning right and driving straight are shown in Fig. 8, corresponding to the skeleton positions in Fig. 7.

The output of the FIS over the range of interest is given in Fig. 9 by means of a surface plot. The surface plot shows the sigmoid cross-section in the θ-log T plane, marked by a red line for d = 0.7. The sigmoid function is necessary for precise control at lower steering angles and sharp bends at higher ones.

To show the non-linearity of the input-output relationship achieved by the FIS, we compare its output with that of a linear non-fuzzy system, where T = c1·d + c2·θ + c3. Fig. 10 illustrates the plane corresponding to a particular (c1, c2, c3), obtained as the best linear fit to the surface plot of T from the FIS. As seen from Fig. 10, the proportional system fails to achieve the precision required at lower θ and d, falls below the FIS output surface, and may not be able to generate a significant T at all; at the same time, it is not able to produce a large T at large d and θ. It only approximates the sigmoid curve at moderate values of the inputs.

Fig. 11 explains the continuous variation of d and θ during gameplay and the generation of T. For clarity, only a 500 ms time span is chosen, during which the player was trying to turn right. The diameter d was varying continuously and θ was continuously decreasing from 58° to 51°. The skeleton is sampled at an interval of 50 ms (at a rate of 20 fps) and a simple sample-and-hold mechanism is used to store the sampled value. At the zeroth instant, the key-press duration T is evaluated and the "keydown" event for the right-arrow key is generated. T is not evaluated again until the "key-release" event is actuated

Fig. 8: Game snapshots of Final Drive Fury for frame numbers 128, 256 and 314: (a) straight drive, (b) turn left, (c) turn right.

TABLE II: Sample results for a particular instance for an unknown subject

| Intended operation | θ       | d (normalized) | δ    | T for steering | Left/Right | T for accel./brake | Up/Down |
|--------------------|---------|----------------|------|----------------|------------|--------------------|---------|
| Turn Right         | +42.3°  | 0.68           | N.R. | 73.18          | Right      | N.R.               | N.R.    |
| Turn Left          | −31.27° | 0.62           | N.R. | 45.92          | Left       | N.R.               | N.R.    |
| Go Straight        | +4.33°  | 0.71           | N.R. | 5              | N.R.       | N.R.               | N.R.    |
| Accelerate         | N.R.    | N.R.           | 0.29 | N.R.           | N.R.       | 481                | Up      |
| Brake              | N.R.    | N.R.           | 0.23 | N.R.           | N.R.       | 162                | Down    |

*N.R. means not relevant

(after the time interval of the current T), completing one key-press pulse. T is then evaluated again, and the process repeats. The same holds for acceleration and braking, where the "keydown" and "key-release" events for the up-arrow and down-arrow keys are actuated in the same fashion. Between two key-press pulses, a guard time of 10 ms is kept to avoid over-sampling.
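The pulse-train scheduling above can be sketched as a small loop. This is a minimal sketch assuming a 50 ms skeleton sampling period and a 10 ms guard time; `evaluate_T` stands in for the FIS, and real key events would go through the OS keyboard API, which is omitted here.

```python
# Minimal sketch of the key-press pulse scheduling: evaluate T from the
# held (d, theta) sample, hold the key for T ms, release, wait out the
# guard time, then re-evaluate. Names are illustrative assumptions.

SAMPLE_MS = 50   # Kinect skeleton sampled at 20 fps
GUARD_MS = 10    # guard time between consecutive key-press pulses

def pulse_schedule(samples, evaluate_T, horizon_ms):
    """samples: function t_ms -> (d, theta); evaluate_T: FIS stand-in.
    Returns a list of (start_ms, duration_ms) key-press pulses."""
    pulses, t = [], 0
    while t < horizon_ms:
        held_t = (t // SAMPLE_MS) * SAMPLE_MS  # sample-and-hold: last 50 ms boundary
        d, theta = samples(held_t)
        T = evaluate_T(d, theta)               # key-press duration from the FIS
        pulses.append((t, T))                  # "keydown" at t, "key-release" at t + T
        t += T + GUARD_MS                      # next evaluation after the guard time
    return pulses
```

For a constant T of 40 ms, this schedules a pulse every 50 ms (40 ms pressed plus the 10 ms guard), matching the pulse train visible in Fig. 11.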

Fig. 9: Fuzzy inference system output surface plot (log10 T against d and θ, with the cross-section at d = 0.7 marked).

It can be observed from Fig. 11 that the FIS output effectively performs pulse-width modulation of the key-press pulse: according to the values of d and θ, the output T modulates the duration of the virtual key press. This provides an effective method to control the turning as well as the acceleration of the car.

VI. CONCLUSION

The ideas proposed in this paper seek to provide a robust gesture-controlled driving system that mimics normal driving behaviour in automobiles. We have sought

Fig. 10: FIS output and proportional output: a comparison (T plotted against d and θ).

to provide the most natural driving experience while playing a simulator racing game. The novelty of the proposed model, which takes natural driving gestures as the input signals to a racing game, lies in its low sensitivity to erroneous human perception and in a precise virtual key-press duration derived from the quantities measured from the skeleton data. A Type-I Mamdani fuzzy inference system processes the raw information received from the MS Kinect, and the sigmoid input-output relationship achieved by the FIS contributes significantly to the ease of driving, giving precise control at low angles and sharp turns at high angles, while incorporating both acceleration and braking for a complete driving experience. The final output takes the form of keyboard interrupt signals with a given key-press duration, which could be integrated more elegantly into game engines at the development stage to provide a new gesture-driven control. The veracity of our claims is supported by graphs that show the relationship between random inputs and the generated key-press

Fig. 11: Sampling of d and θ over time and the generation of T (panels show d, θ, activation of key-press 'right', and T over a 500 ms window, with a 10 ms guard time between pulses).

duration.

As this model is developed in accordance with the rigour

and precision of a simulator racer, the proposed approach can readily be applied to similar types of games, such as an "arcade racer" that needs the activation of "Nitro" or "Powerslide". Hidden Markov Models [3], [15] or Dynamic Time Warping [16] may be used to recognise specific sets of gestures alongside those proposed here for steering and acceleration.

Future work will focus on expanding the degrees of freedom of the driving system so that it can be applied to an aircraft simulator, where the imaginary steering wheel is replaced by an imaginary control stick. The work can also be applied directly to a robot car, where the output signal becomes the pulse signal driving a servo-motor.

ACKNOWLEDGMENT

This work is supported by the University Grants Commission, India, under the University with Potential for Excellence Program (Phase II) in Cognitive Science, Jadavpur University.

REFERENCES

[1] Fraunhofer-Gesellschaft, "Gesture-driven Computers Will Take Computer Gaming To New Level", ScienceDaily, 7 March 2008.

[2] T. M. Bonanni, "Person-tracking and gesture-driven interaction with a mobile robot using the Kinect sensor", Master's Thesis, Faculty of Engineering, Sapienza Università di Roma.

[3] L. E. Baum and T. Petrie, "Statistical Inference for Probabilistic Functions of Finite State Markov Chains", The Annals of Mathematical Statistics, vol. 37, no. 6, 1966, pp. 1554-1563.

[4] Y. Li, "Hand gesture recognition using Kinect", Software Engineering and Service Science (ICSESS), 2012 IEEE 3rd International Conference on, 22-24 June 2012, pp. 196-199.

[5] M. Oszust and M. Wysocki, "Recognition of signed expressions observed by Kinect Sensor", Advanced Video and Signal Based Surveillance (AVSS), 2013 10th IEEE International Conference on, 2013, pp. 220-225.

[6] N. Burba, M. Bolas, D. M. Krum, and E. A. Suma, "Unobtrusive measurement of subtle nonverbal behaviors with the Microsoft Kinect", Virtual Reality Workshops (VR), 2012 IEEE, 2012, pp. 1-4.

[7] C. Zhang, Z. Bai, B. Cao, and J. Lin, "Simulation and Experiment of Driving Control System for Electric Vehicle", International Journal of Information and Systems Science, vol. 1, no. 3-4, pp. 283-292.

[8] M. Wada and Y. Kimura, "Stability analysis of car driving with a joystick interface", IEEE 4th International Conference on Cognitive Infocommunications (CogInfoCom), 2-5 December 2013, pp. 493-496.

[9] B. Pfleging, S. Schneegass, and A. Schmidt, "Multimodal Interaction in the Car: Combining Speech and Gestures on the Steering Wheel", AutomotiveUI'12, October 17-19, Portsmouth, NH, USA.

[10] C. Sivertsen, "Steering Wheel Input Device Having Gesture Recognition and Angle Compensation Capabilities", Google Patents, January 24, 2013.

[11] M. Kato, "Forza Motorsport 4 - Forza 4 Is A Finely Tuned Racing Machine", Game Informer, October 6, 2011.

[12] J. Solaro, "The Kinect Digital Out-of-Box Experience", Computer (Long Beach, Calif.), pp. 97-99, 2011.

[13] T. Leyvand, C. Meekhof, Y. C. Wei, J. Sun, and B. Guo, "Kinect identity: Technology and experience", Computer (Long Beach, Calif.), vol. 44, no. 4, pp. 94-96, 2011.

[14] T. Dutta, "Evaluation of the Kinect sensor for 3-D kinematic measurement in the workplace", Appl. Ergon., vol. 43, no. 4, pp. 645-649, 2012.

[15] L. Piyathilaka and S. Kodagoda, "Gaussian mixture based HMM for human daily activity recognition using 3D skeleton features", Industrial Electronics and Applications (ICIEA), 2013 8th IEEE Conference on, pp. 567-572, 19-21 June 2013.

[16] S. Masood, M. Parvez Qureshi, M. B. Shah, S. Ashraf, Z. Halim, and G. Abbas, "Dynamic time wrapping based gesture recognition", Robotics and Emerging Allied Technologies in Engineering (iCREATE), 2014 International Conference on, pp. 205-210, 22-24 April 2014.