
Segmentation and tracking of the upper body model from range data with applications in hand gesture recognition

Navin Goel, Intel Corporation

Department of Computer Science, University of Nevada, Reno

Overview

• Introduction
• Overall System
• Upper Body Model
• The Segmentation Problem
• Tracking
• Color Based Segmentation
• Results
• Conclusion and Future Work

Introduction

Applications:
• 3D editing systems / HCI systems,
• American Sign Language recognition,
• Entertainment,
• Industrial control,
• Video coding, teleconferencing.

Requirements:
• Background and illumination independence,
• Handling of occlusions and self-occlusions of the body components,
• Robust hands-free initialization,
• Robust tracking.

Overall System

[System block diagram: a stereo (RGB+Z) video sequence feeds the initial segmentation and the tracking loop, with a valid-track / invalid-track decision; a color video sequence feeds color-based segmentation and hue moment calculation, producing features (x, y, z, h1, h2, ..., h6) used to train and recognize the upper body model.]

Upper Body Model

Observation vector at pixel (i, j): O_ij = [O^d_ij, O^c_ij], where O^d_ij = [x, y, z]_ij is the range observation and O^c_ij = h_ij is the hue observation.

Body-component labels: q_ij ∈ C = {Hal, Fl, Ul, He, T, Ur, Fr, Har} (left hand, left forearm, left upper arm, head, torso, right upper arm, right forearm, right hand).

Joints: J = {Wl, El, Sl, N, Sr, Er, Wr} (left wrist, left elbow, left shoulder, neck, right shoulder, right elbow, right wrist).

Anthropological measures: L = {L_Ha, L_He, L_T, L_U, L_F} (hand, head, torso, upper arm, forearm).

The joint probability factorizes as

P(O_ij, q_ij, J, L) = P(O_ij | q_ij, L) · P(q_ij | J, L) · P(J | L) · P(L)
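To make the factorization concrete, here is a minimal Python sketch that evaluates the per-pixel joint probability in log space; the placeholder likelihood and prior functions are hypothetical stand-ins, not the model's actual component pdfs.

```python
import numpy as np

# Hypothetical per-pixel score under the factorization
#   P(O_ij, q_ij, J, L) = P(O_ij | q_ij, L) P(q_ij | J, L) P(J | L) P(L)
# computed in log space to avoid underflow.
def log_joint(o_ij, q_ij, J, L,
              log_obs_lik,      # log P(O_ij | q_ij, L)
              log_label_prior,  # log P(q_ij | J, L)
              log_joint_prior,  # log P(J | L)
              log_L_prior):     # log P(L)
    return (log_obs_lik(o_ij, q_ij, L)
            + log_label_prior(q_ij, J, L)
            + log_joint_prior(J, L)
            + log_L_prior(L))

# Toy call with trivial placeholder terms (uniform priors, Gaussian depth term).
labels = ["Hal", "Fl", "Ul", "He", "T", "Ur", "Fr", "Har"]
score = log_joint(
    o_ij=np.array([0.1, 0.2, 1.5, 0.4]),          # [x, y, z, h]
    q_ij="He", J=None, L=None,
    log_obs_lik=lambda o, q, L: -0.5 * np.sum(o[:3] ** 2),
    log_label_prior=lambda q, J, L: -np.log(len(labels)),
    log_joint_prior=lambda J, L: 0.0,
    log_L_prior=lambda L: 0.0)
print(score)
```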

Head: normal component model

P(O^d_ij | He, L) = K_He · N(O^d_ij | C_He, Σ_He) · U(O^d_ij | C_He, L_He)

where C_He and Σ_He are the mean and covariance of the head blob, K_He is a normalization constant, and U(· | C_He, L_He) is a uniform window of extent L_He centered at C_He.
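A minimal numpy sketch of a normal-component likelihood of this form, assuming a box-shaped uniform window of side L_He; the centroid, covariance and window values below are illustrative, not taken from the presentation.

```python
import numpy as np

def head_log_likelihood(o_d, c_he, sigma_he, l_he, k_he=1.0):
    """log of K_He * N(o_d | C_He, Sigma_He) * U(o_d | C_He, L_He) for a 3D point."""
    diff = o_d - c_he
    # Uniform window: zero probability outside a box of side L_He around the centroid.
    if np.any(np.abs(diff) > l_he / 2.0):
        return -np.inf
    inv = np.linalg.inv(sigma_he)
    _, logdet = np.linalg.slogdet(sigma_he)
    log_norm = -0.5 * (3 * np.log(2 * np.pi) + logdet)
    return np.log(k_he) + log_norm - 0.5 * diff @ inv @ diff

# Example: a range point 5 cm from an assumed head centroid.
c_he = np.array([0.0, 0.0, 1.2])                 # head centroid (m), illustrative
sigma_he = np.diag([0.010, 0.012, 0.008])        # head blob covariance, illustrative
print(head_log_likelihood(np.array([0.05, 0.0, 1.2]), c_he, sigma_he, l_he=0.3))
```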

Upper Body Model

[Figure: upper body geometry, with the head size, neck and neck width annotated on the head blob and torso plane.]

Torso: planar component model

P(O^d_ij | T, L) = K_T · (1 / sqrt(2π·σ_z²)) · exp(-(z_ij - ẑ_ij)² / (2σ_z²)) · U([x, y]_ij | [x_T, y_T], [L_Tx, L_Ty])

where ẑ_ij = a·x_ij + b·y_ij + c is the depth of the torso plane at (x_ij, y_ij), and the uniform term restricts [x, y]_ij to the torso window of extent [L_Tx, L_Ty] around [x_T, y_T].
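The planar component can be sketched as follows: fit the torso plane z = a·x + b·y + c by least squares, score the depth residual with a Gaussian, and gate (x, y) with a uniform window over the torso extent. All numeric values below are illustrative assumptions.

```python
import numpy as np

def torso_log_likelihood(o_d, plane, center_xy, extent_xy, sigma_z, k_t=1.0):
    """log of K_T * N(z | a*x + b*y + c, sigma_z^2) * U([x, y] | torso window)."""
    a, b, c = plane
    x, y, z = o_d
    # Uniform window on (x, y): outside the torso bounding box the likelihood is zero.
    if np.any(np.abs(np.array([x, y]) - center_xy) > extent_xy / 2.0):
        return -np.inf
    z_hat = a * x + b * y + c                      # plane depth at (x, y)
    return (np.log(k_t)
            - 0.5 * np.log(2 * np.pi * sigma_z ** 2)
            - (z - z_hat) ** 2 / (2 * sigma_z ** 2))

# Fit the plane to synthetic torso points by least squares (z ~ a*x + b*y + c).
pts = np.random.default_rng(0).normal([0, 0, 1.5], [0.2, 0.3, 0.01], (500, 3))
A = np.c_[pts[:, 0], pts[:, 1], np.ones(len(pts))]
plane = np.linalg.lstsq(A, pts[:, 2], rcond=None)[0]
print(torso_log_likelihood(np.array([0.1, -0.1, 1.5]), plane,
                           center_xy=np.array([0.0, 0.0]),
                           extent_xy=np.array([0.5, 0.7]), sigma_z=0.02))
```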

Linear component models: upper arms and forearms

[Figure: arm segment between the elbow and wrist joints, with spherical coordinates (r, φ, θ).]

P(O^d_ij | A, L) = K_A · exp(-d²_ij,A / (2σ_A²)) · U(O^d_ij | -r_max, r_max)

where d_ij,A is the distance from O^d_ij to the axis of arm segment A, and the uniform term limits the extent along the axis.

Linear PDF parameters: each arm segment is defined by its parent and child joints (Jp, Jc), where (r_Jc, φ_Jc, θ_Jc) are the spherical coordinates of Jc with the origin in Jp.

The conditional probability of a joint Jc given its parent joint Jp and the anthropological measure L is given by:

P(Jc | Jp, L) = K_Jc, if r_Jc = L and φ_Jc ∈ [φ_min, φ_max] and θ_Jc ∈ [θ_min, θ_max]
P(Jc | Jp, L) = 0, otherwise

where K_Jc is a normalization constant and φ_min, φ_max, θ_min, θ_max represent the minimum and maximum values of the parameters φ_Jc and θ_Jc.
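A small sketch of a joint prior of this shape: the child joint must lie (within a tolerance) at distance L from its parent and inside the allowed angular ranges. The spherical-coordinate convention and the tolerance are assumptions for illustration.

```python
import numpy as np

def log_p_child_given_parent(j_c, j_p, L, phi_range, theta_range,
                             k_jc=1.0, r_tol=0.02):
    """log P(Jc | Jp, L): K_Jc if r ~= L and (phi, theta) are in range, else 0."""
    d = j_c - j_p
    r = np.linalg.norm(d)
    theta = np.arccos(np.clip(d[2] / max(r, 1e-9), -1.0, 1.0))   # polar angle
    phi = np.arctan2(d[1], d[0])                                  # azimuth
    ok = (abs(r - L) <= r_tol
          and phi_range[0] <= phi <= phi_range[1]
          and theta_range[0] <= theta <= theta_range[1])
    return np.log(k_jc) if ok else -np.inf

# Example: elbow relative to shoulder with an assumed upper-arm length of 0.30 m.
shoulder = np.array([0.2, 0.0, 1.4])
elbow = shoulder + np.array([0.0, -0.28, -0.1])
print(log_p_child_given_parent(elbow, shoulder, L=0.30,
                               phi_range=(-np.pi, np.pi),
                               theta_range=(0.0, np.pi)))
```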

The Segmentation Problem

Simplifying assumptions:
• Only one user is visible and his/her torso is the largest body component,
• The torso plane is perpendicular to the camera, and
• The head is in a vertical position.

Notations: Q_B, Q_A and J_B, J_A denote the state assignments and joints for the body (head & torso) and arm regions, respectively.

Searching over all possible joint configurations is computationally impractical. Therefore, segmentation takes place in two stages (Stage I: body, Stage II: arms).

The Upper Body Segmentation. Stage I

• Step 1: Estimate the torso plane parameters from all data using EM. Estimate the torso and head bounding boxes, and the plane that includes N.

• Step 2: Estimate the head blob parameters from all data using EM.

• Step 3: Compute the best body state assignment

  q̃_ij = argmax over q_ij ∈ Q_B of log P(O_ij, q_ij, J_B)

• Step 4: Estimate the joints N, Sl and Sr: their image coordinates follow from the head blob and the torso bounding box (head size L_He, torso width L_T), and the z coordinate of each joint is taken from the torso plane z = a·x + b·y + c:

  N = [N_x, N_y, a·N_x + b·N_y + c]^T
  Sl = [S_l,x, S_l,y, a·S_l,x + b·S_l,y + c]^T
  Sr = [S_r,x, S_r,y, a·S_r,x + b·S_r,y + c]^T

• Step 5: Repeat steps 3-4 until convergence of F = log P(O | J_B, Q_B).
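A compact sketch of the Stage I alternation (steps 3 to 5): given per-pixel, per-component log-likelihoods, choose the best body label per pixel, re-estimate the joints, and stop when F stops improving. The likelihood table is random toy data and the joint update is stubbed out.

```python
import numpy as np

def stage1_alternation(log_lik, update_joints, n_iter=20, tol=1e-4):
    """log_lik(joints) -> (n_pixels, n_body_labels) array of log P(O_ij, q_ij, J_B)."""
    joints, prev_f = None, -np.inf
    for _ in range(n_iter):
        ll = log_lik(joints)                 # step 3: per-pixel, per-label scores
        q = np.argmax(ll, axis=1)            # best body state assignment
        f = ll[np.arange(len(q)), q].sum()   # F = log P(O | J_B, Q_B)
        joints = update_joints(q)            # step 4: re-estimate N, Sl, Sr (stub)
        if f - prev_f < tol:                 # step 5: stop at convergence of F
            break
        prev_f = f
    return q, joints, f

# Toy run with random scores for 3 body labels (e.g. head, torso, background).
rng = np.random.default_rng(1)
scores = rng.normal(size=(1000, 3))
q, joints, f = stage1_alternation(lambda J: scores, lambda q: None)
print(q[:10], f)
```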

The Upper Body Segmentation. Stage II

Step 1. For each possible set of arm parameters, estimate the mean of the linear pdfs corresponding to the upper arms and forearms, and the mean of the normal pdf for the hands.

Step 2. For each joint configuration JA:

• a) compute the best state assignment of the observation vectors given the joint configuration,

• b) compute the observation likelihood given the joint configuration.

Step 3. Find the maximum likelihood over all joint configurations and determine the “best” set of joints and the corresponding best state assignment.

Given the fixed positions of Sl and Sr, we subsample the joint space to get NE = 18 possible positions for each of the joints El and Er. Given each position of the elbow joints, we search NW = 16 possible positions for each of the joints Wl and Wr.
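The Stage II search can be sketched as a brute-force sweep over the sub-sampled joint space (NE = 18 elbow candidates, NW = 16 wrist candidates per elbow), scoring each configuration with a user-supplied arm log-likelihood; the sphere sampling and the toy likelihood below are assumptions.

```python
import numpy as np

def sample_sphere(center, radius, n, rng):
    """n candidate joint positions on a sphere around `center` (uniform directions)."""
    v = rng.normal(size=(n, 3))
    v /= np.linalg.norm(v, axis=1, keepdims=True)
    return center + radius * v

def stage2_search(shoulder, arm_log_lik, l_upper=0.30, l_fore=0.28,
                  n_e=18, n_w=16, seed=0):
    rng = np.random.default_rng(seed)
    best = (-np.inf, None, None)
    for elbow in sample_sphere(shoulder, l_upper, n_e, rng):      # NE elbow candidates
        for wrist in sample_sphere(elbow, l_fore, n_w, rng):      # NW wrist candidates
            score = arm_log_lik(shoulder, elbow, wrist)           # steps 2a/2b
            if score > best[0]:
                best = (score, elbow, wrist)                      # step 3: keep the max
    return best

# Toy arm likelihood: prefer the wrist to be as low as possible.
score, elbow, wrist = stage2_search(np.array([0.2, 0.0, 1.4]),
                                    lambda s, e, w: -w[2])
print(score, elbow, wrist)
```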

Arm Tracking

• Step 1: estimate the mean of the linear pdfs corresponding to the upper arms and forearms, and the mean of the normal pdf for the hands.

• For each joint Jp we build a set [Jc1, Jc2, Jc3, Jc4, Jc5] of five possible child joint positions, each lying on the surface of the sphere centered at the parent joint:

  Jc1 = (r, Φ, θ)  (joint center from the last frame, J̃^(t-1))
  Jc2 = (r, Φ - ΔΦ, θ)
  Jc3 = (r, Φ, θ + Δθ)
  Jc4 = (r, Φ, θ - Δθ)
  Jc5 = (r, Φ + ΔΦ, θ)

• Step 2: for each joint configuration we determine the best state assignment of the observations:

  q̃_ij(J_A) = argmax over q_ij ∈ Q_A of log P(O_ij, q_ij, J_A),  Q̃_A(J_A) = {q̃_ij(J_A)}

• Step 3: the maximum log likelihood log P(O_A, Q̃_A, J_A) over the candidate configurations determines the best joint configuration.
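The candidate generation for tracking can be sketched as follows: the previous frame's child joint, expressed in spherical coordinates around its parent, is perturbed by ±ΔΦ and ±Δθ, and the best of the five candidates is kept. Step sizes and the toy scoring function are assumptions.

```python
import numpy as np

def spherical_to_xyz(parent, r, phi, theta):
    """Child joint position from spherical coordinates (r, phi, theta) around parent."""
    return parent + r * np.array([np.sin(theta) * np.cos(phi),
                                  np.sin(theta) * np.sin(phi),
                                  np.cos(theta)])

def track_child_joint(parent, r, phi, theta, log_lik, d_phi=0.1, d_theta=0.1):
    """Evaluate the five candidates Jc1..Jc5 on the sphere and keep the best one."""
    candidates = [(phi, theta),                 # Jc1: joint center from the last frame
                  (phi - d_phi, theta),         # Jc2
                  (phi, theta + d_theta),       # Jc3
                  (phi, theta - d_theta),       # Jc4
                  (phi + d_phi, theta)]         # Jc5
    scored = [(log_lik(spherical_to_xyz(parent, r, p, t)), p, t)
              for p, t in candidates]
    return max(scored)                          # best (score, phi, theta)

# Toy example: favour candidates close to a "measured" wrist position.
target = np.array([0.35, -0.1, 1.1])
parent = np.array([0.25, 0.0, 1.3])
best = track_child_joint(parent, r=0.28, phi=-0.5, theta=2.0,
                         log_lik=lambda j: -np.linalg.norm(j - target))
print(best)
```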

Color Based Segmentation

Pixels with no depth information cannot be assigned to body components by the previous, depth-based segmentation algorithm. We need to estimate the depth of all pixels and perform a global segmentation.

Depth segmentation assigns each pixel its most likely component:

q*_ij = argmax over all q_ij of P(O_ij | q_ij)

For a pixel (i, j) with hypothesized depth Z ∈ [z_min ... z_max], the corresponding 3D point is X = i·Z/f, Y = j·Z/f (f is the focal length). The likelihood of component k combines a spatial term and a color term,

P(O_ij | q_ij = k) = P(i·Z/f, j·Z/f, Z | k) · P(O^C_ij | k)

integrated over the admissible depth range Z_k of component k and the depth ranges W_l of the remaining components l:

P(O_ij | q_ij = k) = ∫_{Z_k} ∫_{W_l} P(X, Y, Z, k | (i, j)) · P(X, Y, W, l | (i, j)) dW dZ

Color Based Segmentation

The depth of component k is estimated as the value that maximizes its likelihood relative to the competing components l (with depth ranges Z_l ∈ Z \ Z_k):

Z̃_k = argmax over all Z_k of [ P(i·Z_k/f, j·Z_k/f, Z_k | k) · P(O^C_ij | k) ] / [ P(i·Z_l/f, j·Z_l/f, Z_l | l) · P(O^C_ij | l) ]

and the label likelihood is then evaluated at that depth:

P(O_ij | q_ij = k) = P(i·Z̃_k/f, j·Z̃_k/f, Z̃_k | q_ij = k) · P(O^C_ij | q_ij = k)

In practice

Suppose k = “left forearm”; then l = “all the body components except the left forearm”, and if Z_k = a then Z_l = [z_min … z_max] > a.
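A sketch of this depth search for a pixel with no depth reading: candidate depths are swept over [z_min, z_max], each is back-projected to (i·Z/f, j·Z/f, Z), and the depth that maximizes component k's spatial-times-color likelihood relative to the remaining components is kept. The likelihood functions below are placeholders.

```python
import numpy as np

def estimate_depth(i, j, f, z_range, spatial_lik, color_lik, k, labels, n_z=64):
    """Pick the depth for component k that maximizes its likelihood relative to the rest."""
    best_z, best_ratio = None, -np.inf
    for z in np.linspace(z_range[0], z_range[1], n_z):
        p3d = np.array([i * z / f, j * z / f, z])        # back-projected 3D point
        num = spatial_lik(p3d, k) * color_lik(i, j, k)   # component k at depth z
        den = sum(spatial_lik(p3d, l) * color_lik(i, j, l)
                  for l in labels if l != k) + 1e-12     # all other components
        if num / den > best_ratio:
            best_ratio, best_z = num / den, z
    return best_z

# Toy likelihoods: component "Fl" prefers depth ~1.2 m, the others ~1.6 m.
spatial = lambda p, l: np.exp(-((p[2] - (1.2 if l == "Fl" else 1.6)) ** 2) / 0.02)
color = lambda i, j, l: 1.0
print(estimate_depth(i=120, j=80, f=500.0, z_range=(0.8, 2.5),
                     spatial_lik=spatial, color_lik=color,
                     k="Fl", labels=["Fl", "T", "He"]))
```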

Color Segmentation

Upper Body Segmentation and Tracking. Results

Conclusion and Future Work

Contributions:
• Articulated upper body model from dense disparity maps,
• Linear pdf for the forearms and upper arms,
• Hands-free initialization of the system from the optimal joint configuration,
• Upper body tracking, seen as a particular case of the initialization.

Future work:
• Improvements to the background segmentation,
• Learning the anthropological measures,
• Integration with other HCI systems (gesture recognition, face recognition, speech recognition, speaker identification, etc.).