Learning Riemannian metrics for motion
classification
Fabio Cuzzolin, INRIA Rhone-Alpes
Computational Imaging Group, Pompeu Fabra University, Barcelona
25/1/2007
Myself
Master’s thesis on gesture recognition at the University of Padova
Visiting student, ESSRL, Washington University in St. Louis
Ph.D. thesis on the theory of belief functions
Young researcher in Milan with the Image and Sound Processing group
Post-doc at UCLA in the Vision Lab
Marie Curie fellowship, INRIA Rhone-Alpes
My research
Discrete mathematics: linear independence on lattices
Belief functions and imprecise probabilities: geometric approach, algebraic analysis, combinatorial analysis
Computer vision: object and body tracking, data association, gesture and action recognition, identity recognition
Today’s talk
Motion classification is one of the most popular vision problems
Applications: surveillance, biometrics, human-computer interaction
Issue: the choice of the distance function
Learning Riemannian metrics for motion classification
Riemannian metrics for classification
Distances between dynamical models
Learning a metric from a training set
Pullback metrics
Spaces of linear systems and Fisher metric
Experiments on scalar models
Distances between dynamical models
Problem: motion classification
Approach: representing each movement as a linear dynamical model; for instance, each image sequence can be mapped to an ARMA or AR linear model
Classification then reduces to finding a suitable distance function in the space of dynamical models
We can then use this distance in any distance-based classification scheme: k-NN, SVM, etc.
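As an illustration, once a distance between models is available, nearest-neighbour classification is only a few lines of code. A minimal sketch, where the function names, the toy Euclidean distance on AR coefficients, and the labels are all illustrative rather than from the talk:

```python
import numpy as np

def knn_classify(test_model, train_models, train_labels, dist, k=1):
    """k-nearest-neighbour vote under an arbitrary model distance."""
    d = np.array([dist(test_model, m) for m in train_models])
    nearest = np.argsort(d)[:k]
    votes, counts = np.unique(np.asarray(train_labels)[nearest], return_counts=True)
    return votes[np.argmax(counts)]

# Toy distance: Euclidean distance between AR coefficient vectors.
dist = lambda p, q: float(np.linalg.norm(p - q))
train = [np.array([0.9, -0.2]), np.array([0.1, 0.05])]
labels = ["walk", "run"]
print(knn_classify(np.array([0.85, -0.1]), train, labels, dist))  # -> walk
```

Any distance function can be plugged in for `dist`, which is the point of the approach: the classifier is fixed, only the metric changes.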
A review of the literature
Several distances have been proposed:
Fisher metric: a family of probability distributions depending on an n-dimensional parameter can in fact be regarded as an n-dimensional manifold, with Fisher information matrix [Amari]
g_ij = E[ ∂ log p(x; θ)/∂θ_i · ∂ log p(x; θ)/∂θ_j ]
Kullback-Leibler divergence
Gap metric [Zames, El-Sakkary]: compares the graphs associated with linear systems thought of as input-output maps
Cepstrum norm [Martin]
Subspace angles between the column spaces of the observability matrices
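The Fisher information matrix above can also be estimated numerically, by Monte-Carlo averaging of outer products of the score vector. A minimal sketch, where the Gaussian example and all function names are illustrative assumptions, not material from the talk:

```python
import numpy as np

def fisher_information(score, sampler, theta, n=200_000, seed=0):
    """Monte-Carlo estimate of g_ij = E[ d_i log p(x; theta) * d_j log p(x; theta) ]."""
    rng = np.random.default_rng(seed)
    x = sampler(rng, theta, n)
    s = score(x, theta)      # (n, d) matrix of score vectors
    return s.T @ s / n

# Gaussian family N(mu, sigma^2): score = [(x-mu)/s^2, ((x-mu)^2 - s^2)/s^3].
def gauss_score(x, th):
    mu, s = th
    return np.column_stack([(x - mu) / s**2, ((x - mu)**2 - s**2) / s**3])

g = fisher_information(gauss_score,
                       lambda rng, th, n: rng.normal(th[0], th[1], n),
                       (0.0, 1.0))
# Closed form at (mu, sigma) = (0, 1): diag(1/sigma^2, 2/sigma^2) = diag(1, 2)
```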
Learning metrics from a training set
All those metrics are task-specific
Besides, it makes no sense to choose a single distance for all possible classification problems, as labels can be assigned arbitrarily to dynamical systems, no matter what the underlying structure is
When some a-priori information is available (a training set), we can learn in a supervised fashion the “best” metric for the classification problem!
A feasible approach: volume minimization of pullback metrics
Learning distances
Many unsupervised algorithms take an input dataset and embed it in some other space, implicitly learning a metric (LLE, Laplacian eigenmaps, etc.); however, they fail to learn a full metric for the whole input space, yielding only the images of a set of samples
[Xing, Jordan]: maximizes classification performance for linear maps y = A^(1/2) x; yields an optimal Mahalanobis distance and reduces to convex optimization
[Shental et al.]: relevant component analysis changes the feature space by a global linear transformation which assigns large weights to “relevant dimensions” and low weights to “irrelevant dimensions”
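The Mahalanobis family mentioned above is easy to state in code. A sketch with a hand-picked diagonal A chosen for illustration (not a matrix actually learned by either method), showing how the weighting separates relevant from irrelevant feature directions:

```python
import numpy as np

def mahalanobis(x, y, A):
    """Distance induced by the linear map y = A^(1/2) x:
    d_A(x, y)^2 = (x - y)^T A (x - y)."""
    d = np.asarray(x, float) - np.asarray(y, float)
    return float(np.sqrt(d @ A @ d))

# Toy A weighting the first ("relevant") feature and ignoring the second.
A = np.diag([10.0, 0.01])
d_rel = mahalanobis([0, 0], [1, 0], A)   # sqrt(10), approx 3.16
d_irr = mahalanobis([0, 0], [0, 1], A)   # sqrt(0.01) = 0.1
```

A unit step along the relevant axis costs about thirty times more than one along the irrelevant axis, which is exactly the effect the learned transformations aim for.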
Learning pullback metrics
Some notions of differential geometry give us a tool to build a parameterized family of metrics
Consider then a family of diffeomorphisms F between the original space M and a metric space N
Each diffeomorphism F induces on M a pullback metric, so the family of maps yields a family of pullback metrics
The geodesics of the pullback metric are the liftings of the geodesics associated with the original metric
[Diagram: F maps M to N, which carries the metric D]
Pullback metrics - detail
Diffeomorphism on M:
F : M → M′, m ↦ F(m)
Push-forward map:
F_* : T_m M → T_{F(m)} M′, v ↦ F_* v
Given a metric g : TM × TM → ℝ on M, the pullback metric is
g̃_m(u, v) = g_{F(m)}(F_* u, F_* v)
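In coordinates, the pullback metric is the matrix J^T g(F(m)) J, where J is the Jacobian of F at m. A numeric sketch with a finite-difference Jacobian; the function names are illustrative, and the sanity check uses the fact that pulling the Euclidean metric back through a linear map A must give A^T A:

```python
import numpy as np

def pullback_metric(F, g, m, eps=1e-6):
    """Pullback of the metric g through F at m: J^T g(F(m)) J,
    with the Jacobian J of F estimated by central finite differences."""
    m = np.asarray(m, float)
    f0 = np.asarray(F(m), float)
    J = np.empty((f0.size, m.size))
    for i in range(m.size):
        dm = np.zeros_like(m)
        dm[i] = eps
        J[:, i] = (np.asarray(F(m + dm)) - np.asarray(F(m - dm))) / (2 * eps)
    return J.T @ g(f0) @ J

# Sanity check: Euclidean metric pulled back through a linear map.
A = np.array([[2.0, 1.0], [0.0, 3.0]])
G = pullback_metric(lambda x: A @ x, lambda y: np.eye(2), [0.5, -0.2])
# G is (numerically) A.T @ A
```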
Inverse volume:
O(D) = [ Σ_{k=1..N} det(g̃(m_k))^(−1/2) ] / [ ∫_M det(g̃(m))^(−1/2) dm ]
Inverse volume maximization
The natural criterion would be to optimize the classification performance directly
In a nonlinear setup this is hard to formulate and solve
It is reasonable to choose a different but related objective function
Effect: finding the manifold which best interpolates the data (i.e. forcing the geodesics to pass through “crowded” regions)
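The unnormalised part of such an inverse-volume objective is straightforward to evaluate: sum the inverse volume elements det(g̃(m_k))^(−1/2) over the training samples. A sketch under stated assumptions: the normalising volume integral is omitted, and the toy metric (which grows away from the origin) is purely illustrative:

```python
import numpy as np

def inverse_volume(metric_at, samples):
    """Unnormalised inverse-volume objective: sum_k det(g(m_k))^(-1/2)."""
    return float(sum(1.0 / np.sqrt(np.linalg.det(metric_at(m))) for m in samples))

# Toy metric whose volume element grows away from the origin, so the
# objective rewards samples lying in the low-volume region near 0.
metric = lambda m: (1.0 + float(np.dot(m, m))) * np.eye(2)
clustered = [np.array([0.1, 0.0]), np.array([-0.1, 0.1])]
spread = [np.array([2.0, 0.0]), np.array([0.0, -2.0])]
print(inverse_volume(metric, clustered) > inverse_volume(metric, spread))  # True
```

Maximizing this quantity over the parameters of the diffeomorphism family favours metrics whose volume element is small where the data are dense, which is the interpolation effect described above.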
Space of AR(2) models
Given an input sequence, we can identify the parameters of the linear model that best describes it
We chose the class of autoregressive models of order 2, AR(2):
x(k) = a1 x(k−1) + a2 x(k−2) + e(k)
Its Fisher metric is
g(a1, a2) = 1/((1 + a2)(1 − a1 − a2)(1 + a1 − a2)) · [ 1 − a2, a1 ; a1, 1 − a2 ]
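The identification step can be sketched with ordinary least squares on the AR(2) regression; this is an illustrative choice, as the talk does not specify which identification algorithm was used in the experiments:

```python
import numpy as np

def fit_ar2(x):
    """Least-squares estimate of (a1, a2) in x_t = a1 x_{t-1} + a2 x_{t-2} + e_t."""
    X = np.column_stack([x[1:-1], x[:-2]])   # regressors x_{t-1}, x_{t-2}
    a, *_ = np.linalg.lstsq(X, x[2:], rcond=None)
    return a

# Simulate a stable AR(2) process and recover its coefficients.
rng = np.random.default_rng(0)
x = np.zeros(5000)
for t in range(2, 5000):
    x[t] = 0.5 * x[t - 1] - 0.3 * x[t - 2] + rng.normal()
est = fit_ar2(x)   # close to (0.5, -0.3)
```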
Fisher metric on AR(2)
To get a distance: compute the geodesics of the pullback metric on M
Under stability (|a| < 1) and minimality (b ≠ 0), the first-order systems considered next form a manifold:
M(1,1,1) = {(a, b) : |a| < 1, b > 0} ∪ {(a, b) : |a| < 1, b < 0}
Space of M(1,1,1) models
Consider instead the class of stable discrete-time linear systems of order 1:
x(k+1) = a x(k) + b u(k)
y(k) = c x(k)
After choosing the canonical setting c = 1, the transfer function becomes h(z) = b/(z − a)
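A quick sanity check of these state equations: with c = 1, the impulse response of the system is h(k) = b·a^(k−1) for k ≥ 1, the inverse z-transform of b/(z − a). A sketch (function name illustrative):

```python
import numpy as np

def impulse_response(a, b, n):
    """Impulse response of x(k+1) = a x(k) + b u(k), y(k) = x(k) (c = 1)."""
    x, ys = 0.0, []
    for k in range(n):
        u = 1.0 if k == 0 else 0.0   # unit impulse input
        ys.append(x)                 # y(k) = x(k)
        x = a * x + b * u            # state update
    return np.array(ys)

h = impulse_response(0.5, 2.0, 5)
# h = [0, 2, 1, 0.5, 0.25], i.e. b * a**(k-1) for k >= 1
```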
Fisher tensor: in the coordinates
λ = arctan(a/√(1 − a²)), r = b
the Fisher tensor is diagonal:
g(λ, r) = [ 1, 0 ; 0, 2/r² ]
Families of diffeomorphisms
We chose two different families of diffeomorphisms
For AR(2) systems: F_λ(m) = (λ1 m1, λ2 m2, λ3 m3) / (λ1 m1 + λ2 m2 + λ3 m3)
For M(1,1,1) systems: F_p(r, λ) = (a r + b r², λ), with parameters p = (a, b)
MOBO database
25 people performing 4 different walking actions, seen from 6 cameras
Each sequence has three labels: action, identity, view
Classification of scalar models
Recognition of actions and identities from image sequences
Scalar feature; AR(2) and M(1,1,1) models
Compared the performance of all the known distances with the pullback Fisher metric
Built the geodesic distance and used the NN algorithm to classify new sequences
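The full pipeline (sequence → fitted model → nearest neighbour in model space) can be sketched as follows; the line-fit “model” and Euclidean distance here are trivial placeholders standing in for the AR/M(1,1,1) identification and the pullback Fisher geodesic distance of the talk:

```python
import numpy as np

def classify_sequence(seq, train_seqs, train_labels, model_fit, dist):
    """Map each sequence to a model, then 1-NN classify in model space."""
    test_m = model_fit(seq)
    d = [dist(test_m, model_fit(s)) for s in train_seqs]
    return train_labels[int(np.argmin(d))]

# Placeholder "model": slope/intercept of a line fit; placeholder distance.
fit = lambda s: np.polyfit(np.arange(len(s)), s, 1)
dist = lambda p, q: float(np.linalg.norm(p - q))
train = [np.arange(5.0), 5.0 - np.arange(5.0)]
labels = ["up", "down"]
print(classify_sequence(np.arange(5.0) * 1.1, train, labels, fit, dist))  # up
```

Swapping `model_fit` for an AR(2) identification routine and `dist` for a learned geodesic distance recovers the structure of the experiments.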
Results - action
[Figure: action recognition performance, all views considered, for the second-best distance function]
[Figure: action recognition performance, all views considered, for the pullback Fisher metric]
[Figure: action recognition, view 5 only: difference between the classification rates of the pullback metric and the second-best distance]
Results – action 2
Recognition performance of the second-best distance (blue) and the optimal pullback metric (red), for increasing size of the training set
[Panels: view 1, view 3, view 5, view 6]
Effect of the training set
The size of the training set obviously affects the recognition rate
Systems of the class M(1,1,1); increasing size of the training set on the abscissae
[Panels: all views considered; view 2 only]
Conclusions
Movements can be represented as dynamical systems
Motion classification then reduces to finding a distance between dynamical models
Given a training set of such models, we can learn the “best” metric for a given classification problem…
… and use it to classify new sequences
Pullback metrics induced by the Fisher metric structure on linear models are a possible choice
We designed a family of diffeomorphisms for each class of models
Future work: multidimensional observations, a better objective function