Change Detection in Stochastic Shape Dynamical Models with
Application to Activity Recognition
Namrata Vaswani
Thesis Advisor: Prof. Rama Chellappa
Problem Formulation
• The Problem:
– Model activities performed by a group of moving and interacting objects (which can be people, vehicles, robots, or different parts of the human body). Use the models for abnormal activity detection and tracking
• Our Approach:
– Treat objects as point objects: "landmarks"
– Changing configuration of objects: deforming shape
– "Abnormality": change from learnt shape dynamics
• Related Approaches for Group Activity:
– Co-occurrence statistics, Dynamic Bayes Nets
The Framework
• Define a Stochastic State-Space Model (a continuous state HMM) for shape deformations in a given activity, with shape & scaled Euclidean motion forming the hidden state vector and configuration of objects forming the observation.
• Use a particle filter to track a given observation sequence, i.e. estimate the hidden state given observations.
• Define Abnormality as a slow or drastic change in the shape dynamics with unknown change parameters. We propose statistics for slow & drastic change detection.
Overview
• The Group Activity Recognition Problem
• Slow and Drastic Change Detection
• Landmark Shape Dynamical Models
• Applications, Experiments and Results
• Principal Components Null Space Analysis
• Future Directions & Summary of Contributions
A Group of People: Abnormal Activity Detection
[Videos: Normal Activity (left), Abnormal Activity (right)]
Human Action Tracking
Cyan: Observed, Green: Ground Truth, Red: SSA, Blue: NSSA
Overview
• The Group Activity Recognition Problem
• Slow and Drastic Change Detection
• Landmark Shape Dynamical Models
• Applications, Experiments and Results
• Principal Components Null Space Analysis
The Problem
• General Hidden Markov Model (HMM): Markov state sequence {Xt}, observation sequence {Yt}
• Finite-duration change in the system model which causes a permanent change in the probability distribution of the state
• Change is slow or drastic: Tracking Error & Observation Likelihood do not detect slow changes. Use the distribution of Xt
• Change parameters unknown: use Log-Likelihood, LL(Xt)
• State is partially observed: use the MMSE estimate of LL(Xt) given observations, ELL = E[LL(Xt)|Y1:t]
• Nonlinear dynamics: particle filter to estimate ELL
Yt = h(Xt) + wt,  Xt = ft(Xt-1) + nt,  {wt}, {nt} independent
The General HMM
[Graphical model: Xt → Xt+1 → Xt+2 → … with state transition kernel Qt(Xt+1|Xt); each state Xt emits an observation Yt with likelihood ψt(Yt|Xt)]
Related Work
• Change detection using particle filters, unknown change parameters
– CUSUM on Generalized LRT (Y1:t): assumes a finite parameter set
– Modified CUSUM statistic on Generalized LRT (Y1:t)
– Testing if {uj = Pr(Yt < yj | Y1:t-1)} are uniformly distributed
– Tracking Error (TE): error between Yt & its prediction based on the past
– Threshold negative log likelihood of observations (OL)
• All of these approaches use observation statistics: do not detect slow changes
– PF is stable and hence is able to track a slow change
• Average log likelihood of i.i.d. observations used often
– But ELL = E[-LL(Xt)|Y1:t] (MMSE of LL given observations) in the context of general HMMs is new
Particle Filtering
• Aim: evaluate the filtering distribution, πt|tN(dx) ≈ Pr(Xt ∈ dx | Y1:t). Also get the prediction distribution, πt|t-1N(dx) ≈ Pr(Xt ∈ dx | Y1:t-1)
1. Initialization: generate N Monte Carlo samples from the initial prior, x0(i) ~ π0, π0|0N(dx) = (1/N) Σi δx0(i)(dx)
2. Prediction: generate samples from the prior state transition kernel, x̃t(i) ~ Qt(·|xt-1(i)), πt|t-1N(dx) = (1/N) Σi δx̃t(i)(dx)
3. Update: weight each sample by the probability of the observation given the sample, wt(i) ∝ ψt(Yt|x̃t(i)), πt|tN(dx) = Σi wt(i) δx̃t(i)(dx)
   Resample step: xt(i) ~ Multinomial({(x̃t(i), wt(i))}), πtN(dx) = (1/N) Σi δxt(i)(dx)
4. Set t ← t+1, go to step 2
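The four steps above can be sketched for a scalar model; everything here (the AR(1) transition `f`, identity `h`, Gaussian noise scales) is an illustrative assumption, not the models used in the thesis:

```python
import numpy as np

def pf_step(x, rng, f, sigma_n, y, h, sigma_w):
    """One prediction-update-resample cycle of a SIR particle filter.

    x: particles approximating the posterior at t-1 (equal weights after resampling).
    Returns (posterior particles, prediction particles, normalized weights).
    """
    N = len(x)
    # 2. Prediction: sample from the state transition kernel Q_t(. | x_{t-1})
    x_pred = f(x) + rng.normal(0.0, sigma_n, size=N)
    # 3. Update: weight each particle by the observation likelihood psi_t(Y_t | x)
    log_lik = -0.5 * ((y - h(x_pred)) / sigma_w) ** 2
    w = np.exp(log_lik - log_lik.max())   # subtract max for numerical stability
    w /= w.sum()
    # Resample (multinomial) so the returned particles are equally weighted
    idx = rng.choice(N, size=N, p=w)
    return x_pred[idx], x_pred, w
```

The prediction particles and weights are returned as well because the change statistics introduced later (ELL, OL) need the prediction distribution in addition to the filtering distribution.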
Change Detection Statistics
• Slow change: propose Expected Log Likelihood (ELL)
– ELL = Kerridge Inaccuracy between πt (posterior) and pt0 (prior):
  ELL(Y1:t) = E[-log pt0(Xt)|Y1:t] = Eπ[-log pt0(Xt)] = K(πt : pt0)
• A sufficient condition for "detectable changes" using ELL
– E[ELL(Y1:t0)] = K(pt0 : pt0) = H(pt0),  E[ELL(Y1:tc)] = K(ptc : pt0)
– Chebyshev inequality: with false alarm & miss probabilities of 1/9, ELL detects all changes s.t.
  K(ptc : pt0) - H(pt0) > 3 [√Var{ELL(Y1:tc)} + √Var{ELL(Y1:t0)}]
• Drastic change: ELL does not work, use OL or TE
– OL: negative log of the current observation likelihood given the past,
  OL = -log Pr(Yt|Y0:t-1, H0) = -log ⟨πt|t-1, ψt⟩
– TE: Tracking Error. If white Gaussian observation noise, TE ≈ OL
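With a particle approximation, both statistics are averages over particles; a minimal sketch (the Gaussian prior and likelihood used in the test are illustrative assumptions), assuming posterior particles are equally weighted after resampling:

```python
import numpy as np

def ell_estimate(x_post, neg_log_p0):
    """ELL = E[-log p_t^0(X_t) | Y_1:t]: average -log p_t^0 over posterior particles."""
    return np.mean(neg_log_p0(x_post))

def ol_estimate(x_pred, y, lik):
    """OL = -log <pi_{t|t-1}, psi_t>: -log of the mean observation likelihood
    over prediction particles."""
    return -np.log(np.mean(lik(y, x_pred)))
```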
ELL & OL: Slow & Drastic Change
• ELL fails to detect drastic changes
– Approximating the posterior for changed-system observations using a PF optimal for the unchanged system: error large for drastic changes
– OL relies on the error introduced due to the change to detect it
• OL fails to detect slow changes
– Particle Filter tracks slow changes "correctly"
– Assuming the change till t-1 was tracked "correctly" (error in posterior small), OL only uses the change introduced at t, which is also small
– ELL uses the total change in the posterior till time t, & the posterior is approximated "correctly" for a slow change: so ELL detects a slow change when its total magnitude becomes "detectable"
• ELL detects change before loss of track, OL detects after
A Simulated Example
• Change introduced in the system model from t=5 to t=15
[Plots: ELL (left), OL (right)]
Practical Issues
• Defining pt0(x):
– Use the part of the state vector which has linear Gaussian dynamics: can define pt0(x) in closed form, OR
– Assume a parametric family for pt0(x), learn parameters using training data
• Declare a change when either ELL or OL exceeds its respective threshold
– Set the ELL threshold a little above H(pt0)
– Set the OL threshold a little above E[OL0,0] = H(Yt|Y1:t-1)
• Single-frame estimates of ELL or OL may be noisy
– Average the statistic, or average the no. of detects, or modify CUSUM
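One way to implement the last point is a sliding-window average of the statistic before thresholding; a sketch, where the window length and thresholds are placeholders to be tuned:

```python
import numpy as np

def first_detect(stat_seq, thresh, win=5):
    """Average the statistic over a sliding window; return the first time index
    (in the original sequence) where the average exceeds the threshold, else None."""
    stat_seq = np.asarray(stat_seq, dtype=float)
    if len(stat_seq) < win:
        return None
    avg = np.convolve(stat_seq, np.ones(win) / win, mode="valid")
    hits = np.flatnonzero(avg > thresh)
    # valid index i covers stat_seq[i : i+win]; report the window's last frame
    return int(hits[0]) + win - 1 if hits.size else None
```

A slow change is then declared at `first_detect(ell_seq, ell_thresh)` and a drastic one at `first_detect(ol_seq, ol_thresh)`.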
Change Detection
[Block diagram: Yt → PF → πt|t-1N, πt|tN = πtN;
 ELL = E[-log pt0(Xt)] > threshold? → Yes → Change (Slow);
 OL = -log ⟨πt|t-1N, ψt⟩ > threshold? → Yes → Change (Drastic)]
Approximation Errors
• Total error ≤ Bounding error + Exact filtering error + PF error
– Bounding error: stability results hold only for bounded functions, but LL is unbounded. So approximate LL by min{-log pt0(Xt), M}
– Exact filtering error: error between exact filtering with the changed system model & with the original model. Evaluating πtc,0 (using Qt0) instead of πtc,c (using Qtc)
– PF error: error between exact filtering with the original model & particle filtering with the original model. Evaluating πtc,0,N, which is a Monte Carlo estimate of πtc,0
Stability / Asymptotic Stability
• The ELL approximation error averaged over observation sequences & PF realizations is eventually monotonically decreasing (& hence stable), for large enough N, if
– The change lasts for a finite time
– The "unnormalized filter kernels" are mixing
– Certain boundedness (or uniform convergence of bounded approximation) assumptions hold
• Asymptotically stable if the kernels are uniformly "mixing"
• Use stability results of [LeGland & Oudjane]
• The analysis generalizes to errors in the MMSE estimate of any function of the state evaluated using a PF with system model error
"Unnormalized filter kernel" & "mixing"
• The "unnormalized filter kernel", Rt, is the state transition kernel, Qt, weighted by the likelihood of the observation given the state:
  Rt,Yt(x, dx') = Qt(x, dx') gYt(x'),  gYt(x') = ψt(Yt|x')
• "Mixing": measures the rate at which the transition kernel "forgets" its initial condition or, equivalently, how quickly the state sequence becomes ergodic. Mathematically,
  a kernel K is mixing if ∃ ε > 0 and a nonnegative measure λ s.t.
  ε λ(A) ≤ K(x, A) ≤ (1/ε) λ(A),  for all x ∈ E and Borel subsets A ⊆ E
• Example [LeGland et al]: the state transition Xt = Xt-1 + nt is not mixing. But if Yt = h(Xt) + wt, with wt truncated noise, then Rt is mixing
Complementary Behavior of ELL & OL
• The ELL approximation error, etc,0, is upper bounded by an increasing function of OLkc,0, tc < k ≤ t:
  etc,0 ≤ Σk=tc..t exp(OLkc,0) DQ,k(Qkc, Qk0)
• Implication: assume a "detectable" change, i.e. ELLc,c large
– OL fails ⇒ OLkc,0, tc < k ≤ t, small ⇒ ELL error etc,0 small ⇒ ELLc,0 large ⇒ ELL detects
– ELL fails ⇒ ELLc,0 small ⇒ ELL error etc,0 large ⇒ at least one of OLkc,0, tc < k ≤ t, large ⇒ OL detects
"Rate of Change" Bound
• The total error in ELL estimation is upper bounded by increasing functions of the "rate of change" (or "system model error per time step") with all increasing derivatives
• OLc,0 is upper bounded by an increasing function of the "rate of change"
• Metric for the "rate of change" (or, equivalently, "system model error per time step") for a given observation Yt:
  DQ,t(Qtc, Qt0) = supx∈E ∫ ψYt(x') |qtc(x, x') - qt0(x, x')| dx'
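The DQ,t metric can be approximated numerically, with a max over a grid of x values and a Riemann sum over x'; a sketch using illustrative Gaussian transition densities and observation likelihood:

```python
import numpy as np

def d_q(q_c, q_0, psi_y, x_grid, xp_grid):
    """D_{Q,t} = sup_x  integral psi_{Y_t}(x') |q^c(x,x') - q^0(x,x')| dx',
    approximated on grids (q_c, q_0 are transition densities, psi_y the likelihood)."""
    dxp = xp_grid[1] - xp_grid[0]
    best = 0.0
    for x in x_grid:
        diff = np.abs(q_c(x, xp_grid) - q_0(x, xp_grid))
        best = max(best, float(np.sum(psi_y(xp_grid) * diff) * dxp))
    return best
```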
The Bound
1. etc,0 is upper bounded by an increasing function of {DQ,k, tc ≤ k ≤ t} (and of the LL bound M and particle count N), with increasing derivatives of all orders
2. OLkc,0 ≤ log of an increasing function of {DQ,k, tc ≤ k ≤ t}
Assume: Change for finite time, Unnormalized filter kernels mixing,Posterior state space bounded
Implications
• If change slow, ELL works and OL does not work
• ELL error can blow up very quickly as rate of change increases (its upper bound blows up)
– A small error in both normal & changed system models introduces less total error than a perfect transition kernel for normal system & large error in changed system
– A sequence of small changes will introduce less total error than one drastic change of same magnitude
Possible Applications
• Abnormal activity detection, Detecting motion disorders in human actions, Activity Segmentation
• Neural signal processing: detecting changes in stimuli
• Congestion Detection
• Video Shot change or Background model change detection
• System model change detection in target tracking problems, without the tracker losing track
Overview
• The Group Activity Recognition Problem
• Slow and Drastic Change Detection
• Landmark Shape Dynamical Models
• Applications, Experiments and Results
• Principal Components Null Space Analysis
What is Shape?
• Shape is the geometric information that remains when location, scale and rotation effects are filtered out [Kendall]
• Shape of k landmarks in 2D
– Represent the X & Y coordinates of the k points as a k-dimensional complex vector: Configuration
– Translation normalization: Centered Configuration
– Scale normalization: Pre-shape
– Rotation normalization: Shape
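The normalization chain above can be sketched as follows (the function names are mine; `align` performs the Procrustes rotation fit to a reference pre-shape):

```python
import numpy as np

def to_preshape(landmarks):
    """k landmarks as (x, y) rows -> centered, unit-norm complex pre-shape."""
    w = landmarks[:, 0] + 1j * landmarks[:, 1]  # configuration (complex k-vector)
    w = w - w.mean()                            # translation normalization
    return w / np.linalg.norm(w)                # scale normalization

def align(z, mu):
    """Rotation normalization: rotate pre-shape z to best fit the reference mu."""
    theta = np.angle(np.vdot(z, mu))            # optimal rotation angle, arg(z* mu)
    return z * np.exp(1j * theta)
```

Applying `align(to_preshape(...), mu)` to two configurations that differ only by translation, scale, and rotation yields the same shape vector.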
Related Work
• Related Approaches for Group Activity
– Co-occurrence statistics
– Dynamic Bayesian Networks
– Shape for robot formation control
• Shape Analysis/Deformation:
– Pairs of thin plate splines, principal warps
– Active Shape Models: affine deformation in configuration space
– 'Deformotion': scaled Euclidean motion of shape + deformation
– Piecewise geodesic models for tracking on Grassmann manifolds
• Particle Filters for Multiple Moving Objects:
– JPDAF (Joint Probability Data Association Filter): for tracking multiple independently moving objects
Motivation
• A generic and sensor-invariant approach for "activity"
– Only need to change the observation model depending on the "landmark", the landmark extraction method and the sensor used
– Easy to fuse sensors in a particle filtering framework
• "Shape": invariant to translation, zoom, in-plane rotation
• Single global framework for modeling and tracking independent motion + interactions of groups of objects
– Co-occurrence statistics: requires individual & joint histograms
– JPDAF: cannot model object interactions for tracking
– Active Shape Models: good only for approximately rigid objects
• Particle Filter is better than the Extended Kalman Filter
– Able to get back in track after loss of track due to outliers
– Handles multimodal system or observation processes
Hidden Markov Shape Model
• Hidden state: Xt = [Shape (zt), Shape Velocity (ct), Scale (st), Rotation (θt)] → Xt+1 → …
• Observation: Yt = Centered Configuration
• Observation Model: Yt = h(Xt) + wt = zt st e^{jθt} + wt,  wt: i.i.d. observation noise
• State Dynamics: Shape × Rotation (SO(2)) × Scale (R+) = Centered Configuration, C^{k-1}, using complex notation
State Dynamics
Shape Dynamics: linear Markov model on shape velocity
• Shape "velocity" at t in the tangent space w.r.t. the shape at t-1, zt-1
• Orthogonal basis of the tangent space, U(zt-1)
• Linear Gauss-Markov model for shape velocity:
  ct = Ac ct-1 + nt,  nt ~ N(0, Σn),  vt = U(zt-1) ct
• "Move" zt-1 by an amount vt on the shape manifold to get zt:
  zt = (1 - vt*vt)^{1/2} zt-1 + vt
Motion (Scale, Rotation):
• Linear Gauss-Markov dynamics for log st, unwrapped θt
The HMM
Observation Model: [Shape, Motion] → Centered Configuration
  Yt = h(Xt) + wt,  wt ~ N(0, Σobs) + outliers
  h(Xt) = zt st e^{jθt}
System Model: Shape and Motion Dynamics
Shape dynamics:
  ct = Ac ct-1 + nt,  nt ~ N(0, Σn)
  vt = U(zt-1) ct,  U(zt-1) = basis([I - zt-1 zt-1*] C)
  zt = (1 - vt*vt)^{1/2} zt-1 + vt
Motion dynamics:
  Linear Gauss-Markov models for log st and θt; can be stationary or non-stationary
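One step of the shape dynamics (ct = Ac ct-1 + nt, vt = U(zt-1) ct, zt = (1 - vt*vt)^{1/2} zt-1 + vt) can be sketched as below. Building the tangent basis from an SVD of the projector is my choice of construction, and for simplicity it ignores the centering constraint that the full model's U(zt-1) also removes:

```python
import numpy as np

def tangent_basis(z):
    """Orthonormal basis of the orthogonal complement of the pre-shape z,
    i.e. of the tangent space to the pre-shape sphere at z."""
    P = np.eye(len(z)) - np.outer(z, z.conj())  # Hermitian projector removing z
    U, s, _ = np.linalg.svd(P)
    return U[:, s > 1e-10]                      # k-1 columns, all orthogonal to z

def shape_step(z_prev, c_prev, A, sigma_n, rng):
    """c_t = A c_{t-1} + n_t;  v_t = U(z_{t-1}) c_t;
    z_t = (1 - v_t* v_t)^{1/2} z_{t-1} + v_t."""
    c = A @ c_prev + rng.normal(0.0, sigma_n, size=len(c_prev))
    v = tangent_basis(z_prev)[:, :len(c)] @ c   # velocity in the tangent space
    z = np.sqrt(max(1.0 - np.vdot(v, v).real, 0.0)) * z_prev + v
    return z, c
```

Because vt is orthogonal to zt-1, the update keeps ||zt|| = 1, i.e. zt stays on the pre-shape sphere.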
Three Cases
• Non-Stationary Shape Activity (NSSA)
– Tangent space, U(zt-1), changes at every t
– Most flexible: detect abnormality and also track it
• Stationary Shape Activity (SSA)
– Tangent space, U(μ), is constant (μ is a mean shape)
– Track normal behavior, detect abnormality
• Piecewise Stationary Shape Activity (PSSA)
– Tangent space is piecewise constant, U(μk)
– Change time: fixed or decided on the fly using ELL
– PSSA + ELL: Activity Segmentation
Stationary, Non-Stationary
[Figure: example shape sequences, Stationary Shape vs. Non-Stationary Shape]
Learning Procrustes' Mean
• Procrustes mean of a set of pre-shapes wi [Dryden, Mardia]:
  μ̂ = argminμ Σi dF²(wi, μ)
     = argmin||μ||=1 Σi mina,b,θ ||wi - ai - bi e^{jθi} μ||²
     = argmin||μ||=1 Σi [1 - (μ* wi)(wi* μ)]
     = argmax||μ||=1 Σi μ* wi wi* μ  =  largest eigenvector of Σi wi wi*
• Shape: zi = wi e^{jθi},  θi = arg(wi* μ̂)
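Computing the Procrustes mean thus reduces to one eigendecomposition; a sketch:

```python
import numpy as np

def procrustes_mean(preshapes):
    """Procrustes mean of pre-shapes w_i: the eigenvector of
    S = sum_i w_i w_i* with the largest eigenvalue."""
    S = sum(np.outer(w, w.conj()) for w in preshapes)
    vals, vecs = np.linalg.eigh(S)   # S is Hermitian; eigenvalues ascending
    return vecs[:, -1]               # eigenvector of the largest eigenvalue
```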
Overview
• The Group Activity Recognition Problem
• Slow and Drastic Change Detection
• Landmark Shape Dynamical Models
• Applications, Experiments and Results
• Principal Components Null Space Analysis
Abnormal Activity Detection
• Define abnormal activity as
– A slow or drastic change in shape statistics with change parameters unknown
– The system is a nonlinear HMM, tracked using a PF
• This motivated research on slow & drastic change detection in general HMMs
– Tracking Error detects drastic changes. We proposed a statistic called ELL for slow change
– Use a combination of ELL & Tracking Error and declare a change if either exceeds its threshold
Tracking to Obtain Observations
• CONDENSATION tracker framework
• State: shape, shape velocity, scale, rotation, translation; Observation: configuration vector
• Measurement model: motion detection locally around predicted object locations to obtain the observation
• Predicted object configuration obtained by the prediction step of the particle filter
• Predicted motion information can be used to move the camera (or any other sensor)
• Combine with abnormality detection: for drastic abnormalities we will not get an observation for a set of frames; if an outlier, then only for 1-2 frames
Activity Segmentation
• Use the PSSA model for tracking
• At time t, let the current mean shape = μk
• Use ELL w.r.t. μk to detect the change time, tk+1 (segmentation boundary)
• At tk+1, set the current mean shape to the posterior Procrustes mean of the current shape, i.e.
  μk+1 = largest eigenvector of Eπ[zt zt*] = Σi=1..N zt(i) zt(i)*
• Setting the current mean as above is valid only if the tracking error (or OL) has not exceeded its threshold (PF still in track)
A Common Framework for…
• Tracking
– Groups of people or vehicles
– Articulated human body tracking
• Abnormal Activity Detection / Activity Identification
– Suspicious behavior, lane change detection
– Abnormal action detection, e.g. motion disorders
– Human action classification, gait recognition
• Activity Sequence Segmentation
• Fusing different sensors
– Video, Audio, Infra-Red, Radar
Experiments
• Group activity:
– Normal activity: a group of people deplaning & walking towards the airport terminal; used the SSA model
– Abnormality: a person walks away in an un-allowed direction, distorting the normal shape
– Simulated walking speeds of 1, 2, 4, 16, 32 pixels per time step (slow to drastic distortion in shape)
– Compared detection delays using TE and ELL
– Plotted ROC curves to compare performance
• Human actions:
– Defined an NSSA model for tracking a figure skater
– Abnormality: abnormal motion of one body part
– Able to detect as well as track a slow abnormality
Abnormality
• Abnormality introduced at t=5
• Observation noise variance = 9
• OL plot very similar to TE plot (both same to first order)
[Plots: ELL (left), Tracking Error (TE) (right)]
ROC: ELL
• Plot of detection delay against mean time between false alarms (MTBFA) for varying detection thresholds
• Plots for increasing observation noise
[Plots: Slow change (ELL works); Drastic change (ELL fails)]
ROC: Tracking Error (TE)
• ELL: detection delay = 7 for slow change, detection delay = 60 for drastic
• TE: detection delay = 29 for slow change, detection delay = 4 for drastic
[Plots: Slow change (TE fails); Drastic change (TE works)]
ROC: Combined ELL-TE
• Plots for observation noise variance = 81 (maximum)
• Detection delay < 8 achieved for all rates of change
[Plots: Slow change (works); Drastic change (works)]
[Figure: Normal action (SSA better than NSSA); Abnormality (NSSA works, SSA fails)]
Green: Observed, Magenta: SSA, Blue: NSSA
Human Action Tracking
• NSSA tracks and detects the abnormality (Red: SSA, Blue: NSSA)
• Abnormality introduced at t=20
[Plots: ELL (left), Tracking Error (right)]
Overview
• The Group Activity Recognition Problem
• Slow and Drastic Change Detection
• Landmark Shape Dynamical Models
• Applications, Experiments and Results
• Principal Components Null Space Analysis
Typical Data Distributions
‘Apples from Apples’ problem: All algorithms work well
‘Apples from Oranges’ problem: Worst case for SLDA, PCA
PCNSA Algorithm
• Subtract the common mean μ, obtain the PCA space
• Project all training data into PCA space; evaluate the class mean & covariance in PCA space: μi, Σi
• Obtain the class Approximate Null Space (ANS) for each class: the Mi trailing eigenvectors of Σi
• Valid classification directions in the ANS: those along which the distance between class means is "significant": WiNSA
• Classification: project query Y into PCA space, X = WPCA^T (Y - μ), choose the most likely class c as
  c = argmini di(X),  di(X) = ||WiNSA,T (X - μi)||
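The classification rule can be sketched as below (array shapes are my assumptions: `x` is the PCA-projected query, `mus` the per-class means in PCA space, and `Ws` the per-class matrices of valid ANS directions as columns):

```python
import numpy as np

def pcnsa_classify(x, mus, Ws):
    """Assign the PCA-projected query x to the class minimizing the
    approximate-null-space distance d_i(x) = ||W_i^T (x - mu_i)||."""
    d = [np.linalg.norm(W.T @ (x - mu)) for mu, W in zip(mus, Ws)]
    return int(np.argmin(d))
```

A small projection of the query onto a class's ANS directions means the query is consistent with that class, since within-class variance is nearly zero along those directions.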
Classification Error Probability
• Two-class problem. Assumes a 1-dim ANS and 1 LDA direction
• Generalizes to M-dim ANS and to non-Gaussian but unimodal & symmetric distributions
• Both PPCNSA(E) and PSLDA(E) are standard normal tail integrals, ∫ N(z; 0, 1) dz, whose lower limits grow with the projected distance between the class means: |WANS^T(μ1 - μ2)| for PCNSA, and |WLDA^T(μ1 - μ2)| / ||WLDA|| for SLDA, each scaled by the corresponding projected noise standard deviation
Applications
• Image & video retrieval
– Applied to human action retrieval
– Hierarchical image/video retrieval: PCNSA followed by LDA
• Activity classification & abnormal activity detection
Applications
• Face recognition, large expression variation
• Face recognition, large pose variation
• Facial feature matching
• Object recognition
Discussion & Ideas
• The PCNSA test approximates the LRT (optimal Bayes solution) as the condition number of Σi tends to infinity:
  LRT: Class = argmini (X - μi)^T Σi^{-1} (X - μi)
• Fuse PCNSA and LDA: get an algorithm very similar to Multispace KL
• For multiclass problems, use the error probability expressions to decide which of PCNSA or SLDA is better for a given pair of classes
• Perform facial feature matching using PCNSA; use this for face registration followed by warping to standard geometry
Ongoing and Future Work
• Change Detection
– Implications of the bound: the error is an increasing function of the rate of change
– CUSUM on ELL & OL
– Quantitative performance analysis of ELL & OL
– Find examples of mixing "unnormalized filter kernels"
• Non-Stationary & Piecewise Stationary Shape Activities
– Application to sequences of different kinds of actions
– PSSA + ELL for activity segmentation
– Joint tracking and abnormality detection
• Time-varying number of landmarks?
– What is the "best" strategy to get a fixed no. 'k' of landmarks?
– Can we deal with a changing dimension of shape space?
• Multiple simultaneous activities, multi-sensor fusion
• 3D shape, general shape spaces
Contributions
• ELL for slow change detection; stability of the ELL approximation error
• Complementary behavior of ELL & OL; ELL error bounded by an increasing function of the "rate of change" with all increasing derivatives
• Stochastic dynamical models for landmark shapes: NSSA, SSA, PSSA
• Modeling the changing configuration of a group of moving point objects as a deforming shape: “shape activity”.
• Using ELL + PSSA for activity segmentation
• PCNSA & its error probability analysis, application to action retrieval, abnormal activity detection