Change Detection in Stochastic Shape Dynamical Models with
Application to Activity Recognition
Namrata Vaswani
Thesis Advisor: Prof. Rama Chellappa
Problem Formulation
• The Problem:
– Model activities performed by a group of moving and interacting objects (which can be people, vehicles, robots, or different parts of the human body). Use the models for abnormal activity detection and tracking
• Our Approach:
– Treat objects as point objects: "landmarks"
– Changing configuration of objects: deforming shape
– "Abnormality": change from learnt shape dynamics
• Related Approaches for Group Activity:
– Co-occurrence statistics, Dynamic Bayes Nets
The Framework
• Define a Stochastic State-Space Model (a continuous state HMM) for shape deformations in a given activity, with shape & scaled Euclidean motion forming the hidden state vector and configuration of objects forming the observation.
• Use a particle filter to track a given observation sequence, i.e. estimate the hidden state given observations.
• Define Abnormality as a slow or drastic change in the shape dynamics with unknown change parameters. We propose statistics for slow & drastic change detection.
Overview
• The Group Activity Recognition Problem
• Slow and Drastic Change Detection
• Landmark Shape Dynamical Models
• Applications, Experiments and Results
• Principal Components Null Space Analysis
• Future Directions & Summary of Contributions
A Group of People: Abnormal Activity Detection
[Videos: Normal Activity (left), Abnormal Activity (right)]
Human Action Tracking
Cyan: Observed, Green: Ground Truth, Red: SSA, Blue: NSSA
Overview
• The Group Activity Recognition Problem
• Slow and Drastic Change Detection
• Landmark Shape Dynamical Models
• Applications, Experiments and Results
• Principal Components Null Space Analysis
The Problem
• General Hidden Markov Model (HMM): Markov state sequence {Xt}, observation sequence {Yt}
• Finite-duration change in the system model which causes a permanent change in the probability distribution of the state
• Change is slow or drastic: Tracking Error & Observation Likelihood do not detect slow changes. Use the distribution of Xt
• Change parameters unknown: use Log-Likelihood, LL(Xt)
• State is partially observed: use the MMSE estimate of LL(Xt) given observations, ELL = E[LL(Xt)|Y1:t]
• Nonlinear dynamics: particle filter to estimate ELL
Yt = h(Xt) + wt,  Xt = ft(Xt-1) + nt,  {wt}, {nt} independent
The General HMM
[Graphical model: Xt → Xt+1 → Xt+2 → … with state transition kernel Qt(Xt+1|Xt); each state Xt emits an observation Yt with likelihood ψt(Yt|Xt)]
Related Work
• Change detection using particle filters, unknown change parameters
– CUSUM on Generalized LRT (Y1:t): assumes a finite parameter set
– Modified CUSUM statistic on Generalized LRT (Y1:t)
– Testing if {uj = Pr(Yt < yj | Y1:t-1)} are uniformly distributed
– Tracking Error (TE): error between Yt & its prediction based on the past
– Threshold negative log likelihood of observations (OL)
• All of these approaches use observation statistics: do not detect slow changes
– PF is stable and hence is able to track a slow change
• Average log likelihood of i.i.d. observations used often
– But ELL = E[-LL(Xt)|Y1:t] (MMSE of LL given observations) in the context of general HMMs is new
Particle Filtering
• Aim: evaluate the filtering distribution, πt|tN(dx) ≈ Pr(Xt ∈ dx | Y1:t). Also get the prediction distribution, πt|t-1N(dx) ≈ Pr(Xt ∈ dx | Y1:t-1)
1. Initialization: generate N Monte Carlo samples from the initial prior, x0(i) ~ π0, π0|0N(dx) = (1/N) Σi δx0(i)(dx)
2. Prediction: generate samples from the prior state transition kernel, x̃t(i) ~ Qt(·|xt-1(i)), πt|t-1N(dx) = (1/N) Σi δx̃t(i)(dx)
3. Update: weight each sample by the probability of the observation given the sample, wt(i) ∝ ψt(Yt|x̃t(i)), πt|tN(dx) = Σi wt(i) δx̃t(i)(dx)
   Resample step: xt(i) ~ Multinomial({(x̃t(i), wt(i))}), πtN(dx) = (1/N) Σi δxt(i)(dx)
4. Set t ← t+1, go to step 2
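The four steps above can be sketched for a scalar model; everything here (the AR(1) transition `f`, identity `h`, Gaussian noise scales) is an illustrative assumption, not the models used in the thesis:

```python
import numpy as np

def pf_step(x, rng, f, sigma_n, y, h, sigma_w):
    """One prediction-update-resample cycle of a SIR particle filter.

    x: particles approximating the posterior at t-1 (equal weights after resampling).
    Returns (posterior particles, prediction particles, normalized weights).
    """
    N = len(x)
    # 2. Prediction: sample from the state transition kernel Q_t(. | x_{t-1})
    x_pred = f(x) + rng.normal(0.0, sigma_n, size=N)
    # 3. Update: weight each particle by the observation likelihood psi_t(Y_t | x)
    log_lik = -0.5 * ((y - h(x_pred)) / sigma_w) ** 2
    w = np.exp(log_lik - log_lik.max())   # subtract max for numerical stability
    w /= w.sum()
    # Resample (multinomial) so the returned particles are equally weighted
    idx = rng.choice(N, size=N, p=w)
    return x_pred[idx], x_pred, w
```

The prediction particles and weights are returned as well because the change statistics introduced later (ELL, OL) need the prediction distribution in addition to the filtering distribution.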
Change Detection Statistics
• Slow change: propose Expected Log Likelihood (ELL)
– ELL = Kerridge Inaccuracy between πt (posterior) and pt0 (prior):
  ELL(Y1:t) = E[-log pt0(Xt)|Y1:t] = Eπ[-log pt0(Xt)] = K(πt : pt0)
• A sufficient condition for "detectable changes" using ELL
– E[ELL(Y1:t0)] = K(pt0 : pt0) = H(pt0),  E[ELL(Y1:tc)] = K(ptc : pt0)
– Chebyshev inequality: with false alarm & miss probabilities of 1/9, ELL detects all changes s.t.
  K(ptc : pt0) - H(pt0) > 3 [√Var{ELL(Y1:tc)} + √Var{ELL(Y1:t0)}]
• Drastic change: ELL does not work, use OL or TE
– OL: negative log of the current observation likelihood given the past,
  OL = -log Pr(Yt|Y0:t-1, H0) = -log ⟨πt|t-1, ψt⟩
– TE: Tracking Error. If white Gaussian observation noise, TE ≈ OL
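With a particle approximation, both statistics are averages over particles; a minimal sketch (the Gaussian prior and likelihood used in the test are illustrative assumptions), assuming posterior particles are equally weighted after resampling:

```python
import numpy as np

def ell_estimate(x_post, neg_log_p0):
    """ELL = E[-log p_t^0(X_t) | Y_1:t]: average -log p_t^0 over posterior particles."""
    return np.mean(neg_log_p0(x_post))

def ol_estimate(x_pred, y, lik):
    """OL = -log <pi_{t|t-1}, psi_t>: -log of the mean observation likelihood
    over prediction particles."""
    return -np.log(np.mean(lik(y, x_pred)))
```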
ELL & OL: Slow & Drastic Change
• ELL fails to detect drastic changes
– Approximating the posterior for changed-system observations using a PF optimal for the unchanged system: error large for drastic changes
– OL relies on the error introduced due to the change to detect it
• OL fails to detect slow changes
– Particle Filter tracks slow changes "correctly"
– Assuming the change till t-1 was tracked "correctly" (error in posterior small), OL only uses the change introduced at t, which is also small
– ELL uses the total change in the posterior till time t, & the posterior is approximated "correctly" for a slow change: so ELL detects a slow change when its total magnitude becomes "detectable"
• ELL detects change before loss of track, OL detects after
A Simulated Example
• Change introduced in the system model from t=5 to t=15
[Plots: ELL (left), OL (right)]
Practical Issues
• Defining pt0(x):
– Use the part of the state vector which has linear Gaussian dynamics: can define pt0(x) in closed form, OR
– Assume a parametric family for pt0(x), learn parameters using training data
• Declare a change when either ELL or OL exceeds its respective threshold
– Set the ELL threshold a little above H(pt0)
– Set the OL threshold a little above E[OL0,0] = H(Yt|Y1:t-1)
• Single-frame estimates of ELL or OL may be noisy
– Average the statistic, or average the no. of detects, or modify CUSUM
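One way to implement the last point is a sliding-window average of the statistic before thresholding; a sketch, where the window length and thresholds are placeholders to be tuned:

```python
import numpy as np

def first_detect(stat_seq, thresh, win=5):
    """Average the statistic over a sliding window; return the first time index
    (in the original sequence) where the average exceeds the threshold, else None."""
    stat_seq = np.asarray(stat_seq, dtype=float)
    if len(stat_seq) < win:
        return None
    avg = np.convolve(stat_seq, np.ones(win) / win, mode="valid")
    hits = np.flatnonzero(avg > thresh)
    # valid index i covers stat_seq[i : i+win]; report the window's last frame
    return int(hits[0]) + win - 1 if hits.size else None
```

A slow change is then declared at `first_detect(ell_seq, ell_thresh)` and a drastic one at `first_detect(ol_seq, ol_thresh)`.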
Change Detection
[Block diagram: Yt → PF → πt|t-1N, πt|tN = πtN;
 ELL = E[-log pt0(Xt)] > threshold? → Yes → Change (Slow);
 OL = -log ⟨πt|t-1N, ψt⟩ > threshold? → Yes → Change (Drastic)]
Approximation Errors
• Total error ≤ Bounding error + Exact filtering error + PF error
– Bounding error: stability results hold only for bounded functions, but LL is unbounded. So approximate LL by min{-log pt0(Xt), M}
– Exact filtering error: error between exact filtering with the changed system model & with the original model. Evaluating πtc,0 (using Qt0) instead of πtc,c (using Qtc)
– PF error: error between exact filtering with the original model & particle filtering with the original model. Evaluating πtc,0,N, which is a Monte Carlo estimate of πtc,0
Stability / Asymptotic Stability
• The ELL approximation error averaged over observation sequences & PF realizations is eventually monotonically decreasing (& hence stable), for large enough N, if
– The change lasts for a finite time
– The "unnormalized filter kernels" are mixing
– Certain boundedness (or uniform convergence of bounded approximation) assumptions hold
• Asymptotically stable if the kernels are uniformly "mixing"
• Use stability results of [LeGland & Oudjane]
• The analysis generalizes to errors in the MMSE estimate of any function of the state evaluated using a PF with system model error
"Unnormalized filter kernel" & "mixing"
• The "unnormalized filter kernel", Rt, is the state transition kernel, Qt, weighted by the likelihood of the observation given the state:
  Rt,Yt(x, dx') = Qt(x, dx') gYt(x'),  gYt(x') = ψt(Yt|x')
• "Mixing": measures the rate at which the transition kernel "forgets" its initial condition or, equivalently, how quickly the state sequence becomes ergodic. Mathematically,
  a kernel K is mixing if ∃ ε > 0 and a nonnegative measure λ s.t.
  ε λ(A) ≤ K(x, A) ≤ (1/ε) λ(A),  for all x ∈ E and Borel subsets A ⊆ E
• Example [LeGland et al]: the state transition Xt = Xt-1 + nt is not mixing. But if Yt = h(Xt) + wt, with wt truncated noise, then Rt is mixing
Complementary Behavior of ELL & OL
• The ELL approximation error, etc,0, is upper bounded by an increasing function of OLkc,0, tc < k ≤ t:
  etc,0 ≤ Σk=tc..t exp(OLkc,0) DQ,k(Qkc, Qk0)
• Implication: assume a "detectable" change, i.e. ELLc,c large
– OL fails ⇒ OLkc,0, tc < k ≤ t, small ⇒ ELL error etc,0 small ⇒ ELLc,0 large ⇒ ELL detects
– ELL fails ⇒ ELLc,0 small ⇒ ELL error etc,0 large ⇒ at least one of OLkc,0, tc < k ≤ t, large ⇒ OL detects
"Rate of Change" Bound
• The total error in ELL estimation is upper bounded by increasing functions of the "rate of change" (or "system model error per time step") with all increasing derivatives
• OLc,0 is upper bounded by an increasing function of the "rate of change"
• Metric for the "rate of change" (or, equivalently, "system model error per time step") for a given observation Yt:
  DQ,t(Qtc, Qt0) = supx∈E ∫ ψYt(x') |qtc(x, x') - qt0(x, x')| dx'
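The DQ,t metric can be approximated numerically, with a max over a grid of x values and a Riemann sum over x'; a sketch using illustrative Gaussian transition densities and observation likelihood:

```python
import numpy as np

def d_q(q_c, q_0, psi_y, x_grid, xp_grid):
    """D_{Q,t} = sup_x  integral psi_{Y_t}(x') |q^c(x,x') - q^0(x,x')| dx',
    approximated on grids (q_c, q_0 are transition densities, psi_y the likelihood)."""
    dxp = xp_grid[1] - xp_grid[0]
    best = 0.0
    for x in x_grid:
        diff = np.abs(q_c(x, xp_grid) - q_0(x, xp_grid))
        best = max(best, float(np.sum(psi_y(xp_grid) * diff) * dxp))
    return best
```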
The Bound
1. etc,0 is upper bounded by an increasing function of {DQ,k, tc ≤ k ≤ t} (and of the LL bound M and particle count N), with increasing derivatives of all orders
2. OLkc,0 ≤ log of an increasing function of {DQ,k, tc ≤ k ≤ t}
Assume: Change for finite time, Unnormalized filter kernels mixing,Posterior state space bounded
Implications
• If change slow, ELL works and OL does not work
• ELL error can blow up very quickly as rate of change increases (its upper bound blows up)
– A small error in both normal & changed system models introduces less total error than a perfect transition kernel for normal system & large error in changed system
– A sequence of small changes will introduce less total error than one drastic change of same magnitude
Possible Applications
• Abnormal activity detection, Detecting motion disorders in human actions, Activity Segmentation
• Neural signal processing: detecting changes in stimuli
• Congestion Detection
• Video Shot change or Background model change detection
• System model change detection in target tracking problems, without the tracker losing track
Overview
• The Group Activity Recognition Problem
• Slow and Drastic Change Detection
• Landmark Shape Dynamical Models
• Applications, Experiments and Results
• Principal Components Null Space Analysis
What is Shape?
• Shape is the geometric information that remains when location, scale and rotation effects are filtered out [Kendall]
• Shape of k landmarks in 2D
– Represent the X & Y coordinates of the k points as a k-dimensional complex vector: Configuration
– Translation normalization: Centered Configuration
– Scale normalization: Pre-shape
– Rotation normalization: Shape
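The normalization chain above can be sketched as follows (the function names are mine; `align` performs the Procrustes rotation fit to a reference pre-shape):

```python
import numpy as np

def to_preshape(landmarks):
    """k landmarks as (x, y) rows -> centered, unit-norm complex pre-shape."""
    w = landmarks[:, 0] + 1j * landmarks[:, 1]  # configuration (complex k-vector)
    w = w - w.mean()                            # translation normalization
    return w / np.linalg.norm(w)                # scale normalization

def align(z, mu):
    """Rotation normalization: rotate pre-shape z to best fit the reference mu."""
    theta = np.angle(np.vdot(z, mu))            # optimal rotation angle, arg(z* mu)
    return z * np.exp(1j * theta)
```

Applying `align(to_preshape(...), mu)` to two configurations that differ only by translation, scale, and rotation yields the same shape vector.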
Related Work
• Related Approaches for Group Activity
– Co-occurrence statistics
– Dynamic Bayesian Networks
– Shape for robot formation control
• Shape Analysis/Deformation:
– Pairs of thin plate splines, principal warps
– Active Shape Models: affine deformation in configuration space
– 'Deformotion': scaled Euclidean motion of shape + deformation
– Piecewise geodesic models for tracking on Grassmann manifolds
• Particle Filters for Multiple Moving Objects:
– JPDAF (Joint Probability Data Association Filter): for tracking multiple independently moving objects
Motivation
• A generic and sensor-invariant approach for "activity"
– Only need to change the observation model depending on the "landmark", the landmark extraction method and the sensor used
– Easy to fuse sensors in a particle filtering framework
• "Shape": invariant to translation, zoom, in-plane rotation
• Single global framework for modeling and tracking independent motion + interactions of groups of objects
– Co-occurrence statistics: requires individual & joint histograms
– JPDAF: cannot model object interactions for tracking
– Active Shape Models: good only for approximately rigid objects
• Particle Filter is better than the Extended Kalman Filter
– Able to get back in track after loss of track due to outliers
– Handles multimodal system or observation processes
Hidden Markov Shape Model
• Hidden state: Xt = [Shape (zt), Shape Velocity (ct), Scale (st), Rotation (θt)] → Xt+1 → …
• Observation: Yt = Centered Configuration
• Observation Model: Yt = h(Xt) + wt = zt st e^{jθt} + wt,  wt: i.i.d. observation noise
• State Dynamics: Shape × Rotation (SO(2)) × Scale (R+) = Centered Configuration, C^{k-1}, using complex notation
State Dynamics
Shape Dynamics: linear Markov model on shape velocity
• Shape "velocity" at t in the tangent space w.r.t. the shape at t-1, zt-1
• Orthogonal basis of the tangent space, U(zt-1)
• Linear Gauss-Markov model for shape velocity:
  ct = Ac ct-1 + nt,  nt ~ N(0, Σn),  vt = U(zt-1) ct
• "Move" zt-1 by an amount vt on the shape manifold to get zt:
  zt = (1 - vt*vt)^{1/2} zt-1 + vt
Motion (Scale, Rotation):
• Linear Gauss-Markov dynamics for log st, unwrapped θt
The HMM
Observation Model: [Shape, Motion] → Centered Configuration
  Yt = h(Xt) + wt,  wt ~ N(0, Σobs) + outliers
  h(Xt) = zt st e^{jθt}
System Model: Shape and Motion Dynamics
Shape dynamics:
  ct = Ac ct-1 + nt,  nt ~ N(0, Σn)
  vt = U(zt-1) ct,  U(zt-1) = basis([I - zt-1 zt-1*] C)
  zt = (1 - vt*vt)^{1/2} zt-1 + vt
Motion dynamics:
  Linear Gauss-Markov models for log st and θt; can be stationary or non-stationary
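One step of the shape dynamics (ct = Ac ct-1 + nt, vt = U(zt-1) ct, zt = (1 - vt*vt)^{1/2} zt-1 + vt) can be sketched as below. Building the tangent basis from an SVD of the projector is my choice of construction, and for simplicity it ignores the centering constraint that the full model's U(zt-1) also removes:

```python
import numpy as np

def tangent_basis(z):
    """Orthonormal basis of the orthogonal complement of the pre-shape z,
    i.e. of the tangent space to the pre-shape sphere at z."""
    P = np.eye(len(z)) - np.outer(z, z.conj())  # Hermitian projector removing z
    U, s, _ = np.linalg.svd(P)
    return U[:, s > 1e-10]                      # k-1 columns, all orthogonal to z

def shape_step(z_prev, c_prev, A, sigma_n, rng):
    """c_t = A c_{t-1} + n_t;  v_t = U(z_{t-1}) c_t;
    z_t = (1 - v_t* v_t)^{1/2} z_{t-1} + v_t."""
    c = A @ c_prev + rng.normal(0.0, sigma_n, size=len(c_prev))
    v = tangent_basis(z_prev)[:, :len(c)] @ c   # velocity in the tangent space
    z = np.sqrt(max(1.0 - np.vdot(v, v).real, 0.0)) * z_prev + v
    return z, c
```

Because vt is orthogonal to zt-1, the update keeps ||zt|| = 1, i.e. zt stays on the pre-shape sphere.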
Three Cases
• Non-Stationary Shape Activity (NSSA)
– Tangent space, U(zt-1), changes at every t
– Most flexible: detect abnormality and also track it
• Stationary Shape Activity (SSA)
– Tangent space, U(μ), is constant (μ is a mean shape)
– Track normal behavior, detect abnormality
• Piecewise Stationary Shape Activity (PSSA)
– Tangent space is piecewise constant, U(μk)
– Change time: fixed or decided on the fly using ELL
– PSSA + ELL: Activity Segmentation
Stationary, Non-Stationary
[Figure: example shape sequences, Stationary Shape vs. Non-Stationary Shape]
Learning Procrustes' Mean
• Procrustes mean of a set of pre-shapes wi [Dryden, Mardia]:
  μ̂ = argminμ Σi dF²(wi, μ)
     = argmin||μ||=1 Σi mina,b,θ ||wi - ai - bi e^{jθi} μ||²
     = argmin||μ||=1 Σi [1 - (μ* wi)(wi* μ)]
     = argmax||μ||=1 Σi μ* wi wi* μ  =  largest eigenvector of Σi wi wi*
• Shape: zi = wi e^{jθi},  θi = arg(wi* μ̂)
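Computing the Procrustes mean thus reduces to one eigendecomposition; a sketch:

```python
import numpy as np

def procrustes_mean(preshapes):
    """Procrustes mean of pre-shapes w_i: the eigenvector of
    S = sum_i w_i w_i* with the largest eigenvalue."""
    S = sum(np.outer(w, w.conj()) for w in preshapes)
    vals, vecs = np.linalg.eigh(S)   # S is Hermitian; eigenvalues ascending
    return vecs[:, -1]               # eigenvector of the largest eigenvalue
```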
Overview
• The Group Activity Recognition Problem
• Slow and Drastic Change Detection
• Landmark Shape Dynamical Models
• Applications, Experiments and Results
• Principal Components Null Space Analysis
Abnormal Activity Detection
• Define abnormal activity as
– A slow or drastic change in shape statistics with change parameters unknown
– The system is a nonlinear HMM, tracked using a PF
• This motivated research on slow & drastic change detection in general HMMs
– Tracking Error detects drastic changes. We proposed a statistic called ELL for slow change
– Use a combination of ELL & Tracking Error and declare a change if either exceeds its threshold
Tracking to Obtain Observations
• CONDENSATION tracker framework
• State: shape, shape velocity, scale, rotation, translation; Observation: configuration vector
• Measurement model: motion detection locally around predicted object locations to obtain the observation
• Predicted object configuration obtained by the prediction step of the particle filter
• Predicted motion information can be used to move the camera (or any other sensor)
• Combine with abnormality detection: for drastic abnormalities we will not get an observation for a set of frames; if an outlier, then only for 1-2 frames
Activity Segmentation
• Use the PSSA model for tracking
• At time t, let the current mean shape = μk
• Use ELL w.r.t. μk to detect the change time, tk+1 (segmentation boundary)
• At tk+1, set the current mean shape to the posterior Procrustes mean of the current shape, i.e.
  μk+1 = largest eigenvector of Eπ[zt zt*] = Σi=1..N zt(i) zt(i)*
• Setting the current mean as above is valid only if the tracking error (or OL) has not exceeded its threshold (PF still in track)
A Common Framework for…
• Tracking
– Groups of people or vehicles
– Articulated human body tracking
• Abnormal Activity Detection / Activity Identification
– Suspicious behavior, lane change detection
– Abnormal action detection, e.g. motion disorders
– Human action classification, gait recognition
• Activity Sequence Segmentation
• Fusing different sensors
– Video, Audio, Infra-Red, Radar
Experiments
• Group activity:
– Normal activity: a group of people deplaning & walking towards the airport terminal; used the SSA model
– Abnormality: a person walks away in an un-allowed direction, distorting the normal shape
– Simulated walking speeds of 1, 2, 4, 16, 32 pixels per time step (slow to drastic distortion in shape)
– Compared detection delays using TE and ELL
– Plotted ROC curves to compare performance
• Human actions:
– Defined an NSSA model for tracking a figure skater
– Abnormality: abnormal motion of one body part
– Able to detect as well as track a slow abnormality
Abnormality
• Abnormality introduced at t=5
• Observation noise variance = 9
• OL plot very similar to TE plot (both same to first order)
[Plots: ELL (left), Tracking Error (TE) (right)]
ROC: ELL
• Plot of detection delay against mean time between false alarms (MTBFA) for varying detection thresholds
• Plots for increasing observation noise
[Plots: Slow change (ELL works); Drastic change (ELL fails)]
ROC: Tracking Error (TE)
• ELL: detection delay = 7 for slow change, detection delay = 60 for drastic
• TE: detection delay = 29 for slow change, detection delay = 4 for drastic
[Plots: Slow change (TE fails); Drastic change (TE works)]
ROC: Combined ELL-TE
• Plots for observation noise variance = 81 (maximum)
• Detection delay < 8 achieved for all rates of change
[Plots: Slow change (works); Drastic change (works)]
[Figure: Normal action (SSA better than NSSA); Abnormality (NSSA works, SSA fails)]
Green: Observed, Magenta: SSA, Blue: NSSA
Human Action Tracking
• NSSA tracks and detects the abnormality (Red: SSA, Blue: NSSA)
• Abnormality introduced at t=20
[Plots: ELL (left), Tracking Error (right)]
Overview
• The Group Activity Recognition Problem
• Slow and Drastic Change Detection
• Landmark Shape Dynamical Models
• Applications, Experiments and Results
• Principal Components Null Space Analysis
Typical Data Distributions
‘Apples from Apples’ problem: All algorithms work well
‘Apples from Oranges’ problem: Worst case for SLDA, PCA
PCNSA Algorithm
• Subtract the common mean μ, obtain the PCA space
• Project all training data into PCA space; evaluate the class mean & covariance in PCA space: μi, Σi
• Obtain the class Approximate Null Space (ANS) for each class: the Mi trailing eigenvectors of Σi
• Valid classification directions in the ANS: those along which the distance between class means is "significant": WiNSA
• Classification: project query Y into PCA space, X = WPCA^T (Y - μ), choose the most likely class c as
  c = argmini di(X),  di(X) = ||WiNSA,T (X - μi)||
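The classification rule can be sketched as below (array shapes are my assumptions: `x` is the PCA-projected query, `mus` the per-class means in PCA space, and `Ws` the per-class matrices of valid ANS directions as columns):

```python
import numpy as np

def pcnsa_classify(x, mus, Ws):
    """Assign the PCA-projected query x to the class minimizing the
    approximate-null-space distance d_i(x) = ||W_i^T (x - mu_i)||."""
    d = [np.linalg.norm(W.T @ (x - mu)) for mu, W in zip(mus, Ws)]
    return int(np.argmin(d))
```

A small projection of the query onto a class's ANS directions means the query is consistent with that class, since within-class variance is nearly zero along those directions.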
Classification Error Probability
• Two-class problem. Assumes a 1-dim ANS and 1 LDA direction
• Generalizes to M-dim ANS and to non-Gaussian but unimodal & symmetric distributions
• Both PPCNSA(E) and PSLDA(E) are standard normal tail integrals, ∫ N(z; 0, 1) dz, whose lower limits grow with the projected distance between the class means: |WANS^T(μ1 - μ2)| for PCNSA, and |WLDA^T(μ1 - μ2)| / ||WLDA|| for SLDA, each scaled by the corresponding projected noise standard deviation
Applications
• Image & video retrieval
– Applied to human action retrieval
– Hierarchical image/video retrieval: PCNSA followed by LDA
• Activity classification & abnormal activity detection
Applications
• Face recognition, large expression variation
• Face recognition, large pose variation
• Facial feature matching
• Object recognition
Discussion & Ideas
• The PCNSA test approximates the LRT (optimal Bayes solution) as the condition number of Σi tends to infinity:
  LRT: Class = argmini (X - μi)^T Σi^{-1} (X - μi)
• Fuse PCNSA and LDA: get an algorithm very similar to Multispace KL
• For multiclass problems, use the error probability expressions to decide which of PCNSA or SLDA is better for a given pair of classes
• Perform facial feature matching using PCNSA; use this for face registration followed by warping to standard geometry
Ongoing and Future Work
• Change Detection
– Implications of the bound: the error is an increasing function of the rate of change
– CUSUM on ELL & OL
– Quantitative performance analysis of ELL & OL
– Find examples of mixing "unnormalized filter kernels"
• Non-Stationary & Piecewise Stationary Shape Activities
– Application to sequences of different kinds of actions
– PSSA + ELL for activity segmentation
– Joint tracking and abnormality detection
• Time-varying number of landmarks?
– What is the "best" strategy to get a fixed no. 'k' of landmarks?
– Can we deal with a changing dimension of shape space?
• Multiple simultaneous activities, multi-sensor fusion
• 3D shape, general shape spaces
Contributions
• ELL for slow change detection; stability of the ELL approximation error
• Complementary behavior of ELL & OL; ELL error bounded by an increasing function of the "rate of change" with all increasing derivatives
• Stochastic dynamical models for landmark shapes: NSSA, SSA, PSSA
• Modeling the changing configuration of a group of moving point objects as a deforming shape: “shape activity”.
• Using ELL + PSSA for activity segmentation
• PCNSA & its error probability analysis, application to action retrieval, abnormal activity detection