Segmentation and Tracking of Multiple Humans in Crowded Environments. Tao Zhao, Ram Nevatia, Bo Wu. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 30, NO. 7, JULY 2008

Post on 20-Dec-2015



Segmentation and Tracking of Multiple Humans in Crowded Environments

Tao Zhao, Ram Nevatia, Bo Wu

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 30, NO. 7, JULY 2008


Outline

• Introduction

• Overview

• Probabilistic modeling

• Computing MAP by efficient MCMC

• Experimental results

• Conclusion


Introduction

• Segmentation and tracking of multiple humans in crowded situations is made difficult by interobject occlusion.


Introduction

• The method is feasible for a crowded scene:
  – Occlusion may be persistent and temporarily heavy.
  – Humans are not required to be isolated when they first enter the scene.
  – More complex shape models are needed.
  – Joint reasoning about the collection of objects is needed.


Introduction

• Main features of this work:
  – A three-dimensional, part-based human body model, which enables the segmentation and tracking of humans in 3D and makes the inference of interobject occlusion natural.
  – A Bayesian framework that integrates segmentation and tracking based on a joint likelihood for the appearance of multiple objects.


Introduction

– The design of an efficient Markov chain dynamics, directed by proposal probabilities based on image cues.

– The incorporation of a color-based background model in a mean-shift tracking step.


Overview

• The prior models:
  – Background model:
    • Based on a background model, the foreground blobs are extracted as the basic observation.
  – 3D human shape model:
    • Since the hypotheses are in 3D, occlusion reasoning is straightforward.
  – Camera model and ground plane:
    • Multiple 3D human hypotheses are projected onto the image plane and matched with the foreground blobs.


Overview

• Segmentation and tracking are integrated in a unified framework and interoperate over time:
  – Segment the foreground blobs into multiple humans and associate the segmented humans with the existing trajectories.
  – Use the tracks to propose human hypotheses in the next frame.


Overview

• We formulate the problem as one of Bayesian inference to find the best interpretation given the image observations, the prior model, and the estimates from the previous frame analysis.

• That is, maximum a posteriori (MAP) estimation.


Overview

• The state to be estimated at each frame:
  – The number of objects
  – Their correspondences to the objects in the previous frame (if any)
  – Their parameters (for example, position)
  – Uncertainty of the parameters
  – …


Probabilistic modeling

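• Our goal is to estimate the state at time t, $\theta^{(t)}$, given the image observations $I^{(1)}, \ldots, I^{(t)}$:

\[
\theta^{(t)*} = \arg\max_{\theta^{(t)} \in \Theta} P(\theta^{(t)} \mid I^{(1,\ldots,t)})
= \arg\max_{\theta^{(t)} \in \Theta} \{ P(I^{(t)} \mid \theta^{(t)}) \, P(\theta^{(t)} \mid I^{(1,\ldots,t-1)}) \}
\]

$\theta$: the state of the objects. $\Theta$: the solution space.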
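The step from the posterior to the product of a likelihood and a prior is an application of Bayes' rule. The short derivation below spells it out; it assumes (this is not stated on the slide) that the current image $I^{(t)}$ is conditionally independent of the earlier images given the current state $\theta^{(t)}$.

```latex
\begin{aligned}
P(\theta^{(t)} \mid I^{(1,\ldots,t)})
  &= \frac{P(I^{(t)} \mid \theta^{(t)}, I^{(1,\ldots,t-1)}) \,
           P(\theta^{(t)} \mid I^{(1,\ldots,t-1)})}
          {P(I^{(t)} \mid I^{(1,\ldots,t-1)})} \\
  &\propto P(I^{(t)} \mid \theta^{(t)}) \,
           P(\theta^{(t)} \mid I^{(1,\ldots,t-1)})
\end{aligned}
```

The denominator does not depend on $\theta^{(t)}$, so it can be dropped inside the arg max.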


Probabilistic modeling

• A state containing n objects can be written as

\[
\theta = \{(k_1, m_1), \ldots, (k_n, m_n)\} \in \Theta_n
\]

where $k_i$ is the unique identity of the ith object, whose parameters are $m_i$, and $\Theta_n$ is the solution space of exactly n objects.

• The entire solution space is

\[
\Theta = \bigcup_{n=0}^{N_{\max}} \Theta_n
\]


3D human shape model

• The parameters of an individual human, m, are defined based on a 3D human shape model.

• The model does not attempt to capture the detailed shape and articulation parameters of the human body.

• It consists of a head, a torso, and legs, with a fixed spatial relationship.


3D human shape model

• The parameters ($m_i$) that describe a 3D human hypothesis:
  – size ($h_i$): the 3D height of the model; it also controls the overall scaling of the object in the three directions.
  – thickness ($f_i$): captures extra scaling in the horizontal directions.
  – position ($u_i$ or $(x_i, y_i)$): the image position of the head.


3D human shape model

  – orientation ($o_i$): 3D orientation of the body
    • Orientations of the models are quantized into a few levels for computational efficiency.
  – inclination ($i_i$): 2D inclination of the body
    • The body may be inclined slightly.

\[
m_i = \{o_i, x_i, y_i, h_i, f_i, i_i\}
\]


Object appearance model

• We use a color histogram of the object, $\tilde{p} = \{\tilde{p}_1, \ldots, \tilde{p}_m\}$, defined within the object shape.

• It helps establish correspondence in tracking because it is insensitive to the nonrigidity of human motion.

• Efficient algorithms exist, for example, the mean-shift technique, to optimize a histogram-based objective function.
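A minimal sketch of how an m-bin color histogram could be built from the pixels inside an object's shape. The bin layout (a uniform quantization of each RGB channel) and the function name `color_histogram` are assumptions made for illustration; the slides do not say how the colors are binned.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized RGB histogram with bins_per_channel**3 bins.

    pixels: iterable of (r, g, b) tuples with values in 0..255, assumed to be
    already restricted to the object's shape mask.
    """
    m = bins_per_channel ** 3
    hist = [0.0] * m
    step = 256 / bins_per_channel
    top = bins_per_channel - 1
    for r, g, b in pixels:
        # Quantize each channel, clamping to the last bin for safety.
        qr = min(int(r / step), top)
        qg = min(int(g / step), top)
        qb = min(int(b / step), top)
        hist[(qr * bins_per_channel + qg) * bins_per_channel + qb] += 1.0
    total = sum(hist)
    # Normalize so the bins sum to 1; an empty mask yields an all-zero histogram.
    return [c / total for c in hist] if total else hist


# Usage: three red pixels and one blue pixel.
hist = color_histogram([(255, 0, 0)] * 3 + [(0, 0, 255)])
```

Because the histogram discards where each color occurs inside the shape, it changes little when limbs move, which is the insensitivity to nonrigid motion mentioned above.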


Background appearance model

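• The probability of pixel j being from the background is

\[
P_b(I_j) = P_b(r_j, g_j, b_j) \propto
\exp\left[ -\max\left\{
\left(\frac{r_j - \bar{r}_j}{\sigma_{r_j}}\right)^2,
\left(\frac{g_j - \bar{g}_j}{\sigma_{g_j}}\right)^2,
\left(\frac{b_j - \bar{b}_j}{\sigma_{b_j}}\right)^2
\right\} \right]
\]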


The prior distribution

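• The prior is the second factor of the MAP objective

\[
\theta^{(t)*} = \arg\max_{\theta^{(t)} \in \Theta} \{ P(I^{(t)} \mid \theta^{(t)}) \, P(\theta^{(t)} \mid I^{(1,\ldots,t-1)}) \}
\]

and is written as the product of two terms:

\[
P(\theta^{(t)} \mid I^{(1,\ldots,t-1)}) \propto P(\theta^{(t)}) \, P(\theta^{(t)} \mid I^{(1,\ldots,t-1)})
\]

• The first term, $P(\theta^{(t)})$:
  – is independent of time and is defined by

\[
P(\theta^{(t)}) = \prod_{i=1}^{n} P(S_i) \, P(m_i)
\]

\[
P(S_i) \propto \exp(-\lambda_1 |S_i|) \, [1 - \exp(-\lambda_2 |S_i|)]
\]

  – $S_i$ is the projected image of the ith object and $|S_i|$ is its area.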


The prior distribution

\[
P(m_i) = P(o_i) \, P(x_i, y_i) \, P(h_i) \, P(f_i) \, P(i_i)
\]

– $P(o_{frontal}) = P(o_{profile}) = 1/2$
– $P(x_i, y_i)$ is a uniform distribution in the region where a human head is plausible.
– $P(h_i)$ is a Gaussian distribution $N(\mu_h, \sigma_h^2)$ truncated to the range $[h_{\min}, h_{\max}]$.
– $P(f_i)$ is a Gaussian distribution $N(\mu_f, \sigma_f^2)$ truncated to the range $[f_{\min}, f_{\max}]$.
– $P(i_i)$ is a Gaussian distribution $N(\mu_i, \sigma_i^2)$.
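The factorized prior over one human's parameters can be sketched as follows. All numeric values in `PARAMS` are hypothetical placeholders (the slides give no means, variances, or ranges), the truncated Gaussians are left unnormalized, and the uniform position prior is reduced to an indicator that the head position is plausible.

```python
import math

# Hypothetical parameter values, chosen only for illustration.
PARAMS = {
    "h": {"mean": 1.7, "std": 0.1, "lo": 1.5, "hi": 1.9},  # height
    "f": {"mean": 1.0, "std": 0.2, "lo": 0.8, "hi": 1.2},  # thickness
    "i": {"mean": 0.0, "std": 0.1},                         # inclination
}


def gaussian(x, mean, std):
    """Gaussian density N(mean, std^2) evaluated at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def truncated_gaussian(x, mean, std, lo, hi):
    """Unnormalized truncated Gaussian: the density inside [lo, hi], zero outside."""
    return gaussian(x, mean, std) if lo <= x <= hi else 0.0


def prior_m(o, head_plausible, h, f, incl, params=PARAMS):
    """P(m) = P(o) P(x, y) P(h) P(f) P(i), up to a constant factor."""
    p_o = 0.5 if o in ("frontal", "profile") else 0.0
    p_xy = 1.0 if head_plausible else 0.0  # uniform over the plausible head region
    p_h = truncated_gaussian(h, **params["h"])
    p_f = truncated_gaussian(f, **params["f"])
    p_i = gaussian(incl, **params["i"])
    return p_o * p_xy * p_h * p_f * p_i
```

Since only ratios of posterior values matter during sampling, leaving out the normalizing constants does not change which hypothesis is preferred.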


The prior distribution

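• The second term, $P(\theta^{(t)} \mid I^{(1,\ldots,t-1)})$:
  – We approximate it by $P(\theta^{(t)} \mid \theta^{(t-1)})$.
  – We rearrange $\theta^{(t)}$ and $\theta^{(t-1)}$ as

\[
\tilde{\theta}^{(t)} = \{(\tilde{k}_i^{(t)}, \tilde{m}_i^{(t)})\}_{i=1}^{N}, \qquad
\tilde{\theta}^{(t-1)} = \{(\tilde{k}_i^{(t-1)}, \tilde{m}_i^{(t-1)})\}_{i=1}^{N}
\]

    such that one of the following is true for each i:
    • $\tilde{k}_i^{(t)} = \tilde{k}_i^{(t-1)}$, $\tilde{m}_i^{(t)} \neq \emptyset$, $\tilde{m}_i^{(t-1)} \neq \emptyset$: $\tilde{k}_i^{(t-1)}$ is a tracked object.
    • $\tilde{m}_i^{(t)} = \emptyset$: $\tilde{k}_i^{(t-1)}$ is a dead object.
    • $\tilde{m}_i^{(t-1)} = \emptyset$: $\tilde{k}_i^{(t)}$ is a new object.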


The prior distribution

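\[
P(\theta^{(t)} \mid \theta^{(t-1)}) = P(\tilde{\theta}^{(t)} \mid \tilde{\theta}^{(t-1)})
= \prod_{i=1}^{N} P(\tilde{m}_i^{(t)} \mid \tilde{m}_i^{(t-1)})
\]

\[
P(\tilde{m}_i^{(t)} \mid \tilde{m}_i^{(t-1)}) =
\begin{cases}
P_{assoc}(\tilde{m}_i^{(t)} \mid \tilde{m}_i^{(t-1)}), & \tilde{k}_i^{(t)} = \tilde{k}_i^{(t-1)} \\
P_{new}(\tilde{m}_i^{(t)}), & \tilde{m}_i^{(t-1)} = \emptyset \\
P_{dead}(\tilde{m}_i^{(t-1)}), & \tilde{m}_i^{(t)} = \emptyset
\end{cases}
\]

– $P_{assoc}$
  • We assume that the position and the inclination of an object follow constant-velocity models with Gaussian noise.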


The prior distribution

  • The height and thickness follow a Gaussian distribution.
  • We use Kalman filters for temporal estimation.

– $P_{new}$ and $P_{dead}$
  • $P_{new}$ is the likelihood of the initialization of a new track: $P_{new}(\tilde{m}_i^{(t)}) = P_{new}(\tilde{u}_i^{(t)})$
  • $P_{dead}$ is the likelihood of the termination of an existing track: $P_{dead}(\tilde{m}_i^{(t-1)}) = P_{dead}(\tilde{u}_i^{(t-1)})$
  • They are set empirically according to the distance of the object to the entrances/exits.


Joint image likelihood for multiple objects and the background

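\[
\theta^{(t)*} = \arg\max_{\theta^{(t)} \in \Theta} \{ P(I^{(t)} \mid \theta^{(t)}) \, P(\theta^{(t)} \mid I^{(1,\ldots,t-1)}) \}
\]

• The visible part of object i ($\tilde{S}_i$):
  – determined by the depth order of all of the objects, which can be inferred from their 3D positions and the camera model.

\[
\bigcup_{i=1}^{n} \tilde{S}_i = \bigcup_{i=1}^{n} S_i = S
\]

• Non-object region ($\bar{S}$)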


Joint image likelihood for multiple objects and the background

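• The joint likelihood $P(I \mid \theta)$ consists of two terms:

\[
P(I \mid \theta) = P(I_S \mid \theta) \, P(I_{\bar{S}} \mid \theta)
\]

• The first term:

\[
P(I_S \mid \theta) = \prod_{i=1}^{n} P(I_{\tilde{S}_i} \mid m_i)
\propto \exp\left\{ -\lambda_S \sum_{i=1}^{n} |\tilde{S}_i| \, [\lambda_b B(p_i, d_i) - \lambda_f B(p_i, \tilde{p}_i)] \right\}
\]

  – Background exclusion, $B(p_i, d_i)$: the likelihood favors an object hypothesis that differs from the background.
  – Object attraction, $B(p_i, \tilde{p}_i)$: the likelihood favors similarity to the corresponding object in the previous frame.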


Joint image likelihood for multiple objects and the background

– $d_i$ is the color histogram of the background image within the visibility mask of object i.

– $p_i$ is the color histogram of the object.

– $B(p, d)$ is the Bhattacharyya coefficient, which reflects the similarity of the two histograms:

\[
B(p, d) = \sum_{j=1}^{m} \sqrt{p_j d_j}
\]
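The Bhattacharyya coefficient is simple to compute once both histograms are normalized. The sketch below is a plain-Python illustration; the helper names `normalize` and `bhattacharyya` are chosen here and do not come from the slides.

```python
import math


def normalize(counts):
    """Scale non-negative bin counts so that they sum to 1."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("histogram must have positive total mass")
    return [c / total for c in counts]


def bhattacharyya(p, d):
    """Bhattacharyya coefficient B(p, d) = sum_j sqrt(p_j * d_j).

    p and d are normalized histograms with the same number of bins.
    The result lies in [0, 1]: 1 for identical histograms, 0 for
    histograms with no overlapping bins.
    """
    if len(p) != len(d):
        raise ValueError("histograms must have the same number of bins")
    return sum(math.sqrt(pj * dj) for pj, dj in zip(p, d))


# Usage: identical histograms versus histograms with disjoint support.
same = bhattacharyya(normalize([2, 2]), normalize([5, 5]))
disjoint = bhattacharyya(normalize([1, 0]), normalize([0, 1]))
```

A high coefficient between the object histogram and the background histogram means the hypothesis looks like background; a high coefficient between the object histogram and its histogram from the previous frame supports the association.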


Joint image likelihood for multiple objects and the background

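• The second term is:

\[
P(I_{\bar{S}} \mid \theta) = \prod_{j \in \bar{S}} \left( P_b(I_j) \right)^{\lambda_{\bar{S}}}
= \exp\left( \lambda_{\bar{S}} \sum_{j \in \bar{S}} e_j \right)
\]

  – $e_j = \log(P_b(I_j))$ is the log-probability of pixel j belonging to the background model.
  – The likelihood penalizes the difference from the background model.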


Computing MAP by efficient MCMC

• Computing the MAP is an optimization problem.

• The optimization is challenging:
  – With an unknown number of objects, the solution space contains subspaces of varying dimension.
  – It includes both discrete and continuous variables.

• We adapt a data-driven Markov chain Monte Carlo (MCMC) approach to explore this complex solution space.


Computing MAP by efficient MCMC

• An MCMC method with jump/diffusion dynamics is used to sample the posterior probability.
  – Jumps: cause the Markov chain to move between subspaces of different dimension and to traverse the discrete variables.
  – Diffusions: make the Markov chain sample the continuous variables.

• In the process of sampling, the best solution is recorded and the uncertainty associated with the solution is also obtained.


Computing MAP by efficient MCMC

• MCMC method:
  – We want to design a Markov chain with stationary distribution $P(\theta) = P(\theta^{(t)} \mid I^{(t)}, \theta^{(t-1)})$.
  – At the gth iteration, we sample a candidate state $\theta'$ from a proposal distribution $q(\theta_g \mid \theta_{g-1})$.
  – If the candidate state $\theta'$ is accepted, $\theta_g = \theta'$.
  – Otherwise, $\theta_g = \theta_{g-1}$.
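The accept/reject loop can be sketched with the standard Metropolis-Hastings rule, which accepts a candidate with probability min(1, P(θ')q(θ|θ') / (P(θ)q(θ'|θ))). The slide text does not show the acceptance rule, so its use here is an assumption, and the five-state target below is a toy stand-in for the posterior over human hypotheses. The sketch also records the best state visited, which is how a sampler can be used to approximate the MAP.

```python
import math
import random


def metropolis_hastings(log_p, propose, log_q, theta0, iterations, rng):
    """Run Metropolis-Hastings; return (last state, best state seen).

    log_p(theta): unnormalized log target density.
    propose(theta, rng): draws a candidate state.
    log_q(new, old): log proposal density of moving from old to new.
    """
    theta, theta_lp = theta0, log_p(theta0)
    best, best_lp = theta, theta_lp
    for _ in range(iterations):
        cand = propose(theta, rng)
        cand_lp = log_p(cand)
        # log of P(cand) q(theta | cand) / (P(theta) q(cand | theta))
        log_ratio = (cand_lp + log_q(theta, cand)) - (theta_lp + log_q(cand, theta))
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            theta, theta_lp = cand, cand_lp  # accept the candidate
            if theta_lp > best_lp:
                best, best_lp = theta, theta_lp
        # otherwise the chain stays at the previous state
    return theta, best


# Toy target over states 0..4 (hypothetical weights, mode at state 2).
TOY_WEIGHTS = [1.0, 2.0, 8.0, 2.0, 1.0]


def toy_log_p(s):
    return math.log(TOY_WEIGHTS[s]) if 0 <= s < len(TOY_WEIGHTS) else float("-inf")


def toy_propose(s, rng):
    return s + rng.choice([-1, 1])  # symmetric random-walk proposal


def toy_log_q(new, old):
    return math.log(0.5)


last, best = metropolis_hastings(toy_log_p, toy_propose, toy_log_q, 0, 500, random.Random(0))
```

Working with log-probabilities avoids numerical underflow when the target is a product of many per-pixel and per-object terms.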


Computing MAP by efficient MCMC

• A Markov chain constructed in this way has its stationary distribution equal to $P(\theta)$, independent of the choice of the proposal probability $q(\cdot)$ and the initial state $\theta_0$.

• The choice of the proposal probability $q(\cdot)$ can affect the efficiency of MCMC significantly.

• Using more informed proposal probabilities, for example, as in data-driven MCMC, makes the Markov chain traverse the solution space more efficiently. Therefore, the proposal distribution is written as $q(\theta_g \mid \theta_{g-1}, I)$.


Markov chain dynamic

• The dynamics correspond to a proposal distribution with a mixture density

\[
q(\theta' \mid \theta_{g-1}, I) = \sum_{a \in A} p_a \, q_a(\theta' \mid \theta_{g-1}, I), \qquad \sum_{a \in A} p_a = 1
\]

where A = {add, remove, establish, break, exchange, diff} is the set of all dynamics.

• We assume that we have the sample from the (g-1)th iteration,

\[
\theta_{g-1}^{(t)} = \{(k_1, m_1), \ldots, (k_n, m_n)\}
\]

and now propose a candidate $\theta'$ for the gth iteration.
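Sampling from a mixture proposal is a two-stage draw: first pick a dynamic a with probability p_a, then draw the candidate from that dynamic's own proposal q_a. The sketch below shows only the first stage. The mixing weights in `DYNAMIC_WEIGHTS` are hypothetical; the slides do not give values for p_a.

```python
import random

# The six dynamics named on the slide, with hypothetical mixing weights p_a.
DYNAMIC_WEIGHTS = {
    "add": 0.1,
    "remove": 0.1,
    "establish": 0.1,
    "break": 0.1,
    "exchange": 0.1,
    "diff": 0.5,
}


def choose_dynamic(rng, weights=DYNAMIC_WEIGHTS):
    """Draw one dynamic a from the set A with probability p_a."""
    names = list(weights)
    return rng.choices(names, weights=[weights[a] for a in names], k=1)[0]


# Usage: draw a sequence of dynamics for successive iterations.
rng = random.Random(0)
draws = [choose_dynamic(rng) for _ in range(1000)]
```

Giving the diffusion dynamic a larger share, as in this made-up setting, would spend most iterations refining continuous parameters and fewer on changing the number of objects or their identities.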


Markov chain dynamic

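• Dynamics:
  – Object hypothesis addition
    • Sample the parameters of a new human hypothesis $(k_{n+1}, m_{n+1})$ and add it to $\theta_{g-1}$:
      $q_{add}(\theta_{g-1} \cup \{(k_{n+1}, m_{n+1})\} \mid \theta_{g-1}, I)$
  – Object hypothesis removal
    • $q_{remove}(\theta_{g-1} \setminus \{(k_r, m_r)\} \mid \theta_{g-1}) = 1/n$
  – Establish correspondence
    • $q_{establish}(\theta' \mid \theta_{g-1}) \propto \| u_r - u_{r'} \|^{-2}$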


Markov chain dynamic

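  – Break correspondence
    • $q_{break}(\theta' \mid \theta_{g-1}) = 1/n'$
  – Exchange identity
    • $q_{exchange}(r_1, r_2) \propto \| u_{r_1} - u_{r_2} \|^{-2}$
  – Parameter update
    • $q_{diff}(\theta' \mid \theta_{g-1}) = (1/n) \, q_d(m_r' \mid m_r)$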
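A proposal weighted by inverse squared distance prefers nearby objects, which are the ones most likely to have swapped identities. The sketch below normalizes such weights over all pairs of objects; the function name `exchange_probabilities`, the dictionary layout, and the small floor on the squared distance are assumptions made for illustration.

```python
import itertools


def exchange_probabilities(positions):
    """Probability of proposing each pair (r1, r2) for an identity exchange.

    positions: dict mapping object identity to its head position (x, y).
    Each pair is weighted by 1 / ||u_r1 - u_r2||^2 and the weights are
    normalized to sum to 1. Returns an empty dict if there are fewer
    than two objects.
    """
    weights = {}
    for (a, ua), (b, ub) in itertools.combinations(positions.items(), 2):
        d2 = (ua[0] - ub[0]) ** 2 + (ua[1] - ub[1]) ** 2
        weights[(a, b)] = 1.0 / max(d2, 1e-9)  # floor avoids division by zero
    total = sum(weights.values())
    return {pair: w / total for pair, w in weights.items()} if weights else {}


# Usage: objects 1 and 2 are close together, object 3 is far away.
probs = exchange_probabilities({1: (0.0, 0.0), 2: (1.0, 0.0), 3: (10.0, 0.0)})
```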


Experimental results

• Evaluation on an outdoor scene


Experimental results

– There are 20 occlusion events overall, nine of which are heavy occlusions.
– We use 500 iterations per frame.
– Trajectory-based errors:
  • The trajectories of three objects are broken once (ID 28 -> ID 35, ID 31 -> ID 32, ID 30 -> ID 41).
– Trajectory initialization:
  • Some trajectories start when the objects are only partially inside the scene.
  • Only the initializations of three objects (objects 31, 50, and 52) are noticeably delayed.
  • Partial occlusion and/or a lack of contrast with the background cause the delays.
– The detection rate and the false-alarm rate are 98.13 percent and 0.27 percent, respectively.


Conclusion

• A principled approach to simultaneously detect and track humans in a crowded scene.

• We formulate the problem as a Bayesian MAP estimation problem.

• The inference is performed by an MCMC-based approach to explore the joint solution space.

• The success lies in the integration of the top-down Bayesian formulation following the image formation process and the bottom-up features that are directly extracted from images.