
Spatio-temporal constraints for recognizing 3D objects in videos

http://slipguru.disi.unige.it

Nicoletta Noceti

Università degli Studi di Genova

Outline of the presentation

3D object recognition
  View based approaches
  Local descriptors for object recognition
  Our approach

Spatio-temporal models for 3D object recognition
  Modeling sequences
  Representation of video sequences w.r.t. the model
  2-stage matching procedure

Recognizing objects: experiments and results

Object recognition

Localisation means to determine the pose of each object relative to a sensor

Categorization means recognising the class to which an object belongs instead of recognising that particular object

The goal of recognition systems is to identify which objects are present in a scene

Unlike "merely" perceiving a shape, recognising it involves memory, that is, accessing representations of shapes seen in the past [Wittgenstein73]

Object recognition

View based object recognition

View based approaches to 3D object recognition gained attention as a way to deal with appearance variation [Murase et al. 95, Pontil et al. 98]: no explicit model is required

Local approaches produce relatively compact descriptions of the image content and do not suffer from the presence of cluttered background and occlusions [Mikolajczyk et al. 03]

Local object models are often inspired by text categorization [Cristianini et al. 02]

Many view based local approaches to recognition have been proposed [Leibe et al. 04, Csurka et al. 04]

Our approach to recognition

We observe an object from slightly different viewpoints and exploit local features distinctive in space and stable in time to perform recognition

Our approach shares some similarities with codebook methods, but it extends the concept to the temporal domain

Our approach to recognition

View-based recognition systems do not need explicit computation of 3D object models

Local approaches produce compact descriptions and do not suffer from cluttered background and occlusions

Spatial constraints improve quality of recognition [Ferrari et al. 06]

Biological vision systems gather information through motion, which provides important cues for depth perception and object recognition [Stringer et al. 06]

Drawbacks of locality

His eyes would dart from one thing to another, picking up tiny features, individual features, as they had done with my face. A striking brightness, a colour, a shape would arrest his attention and elicit comment – but in no case did he get the scene-as-a-whole. He failed to see the whole, seeing only details, which he spotted like blips on a radar screen. He never entered into relation with the picture as a whole - never faced, so to speak, its physiognomy. He had no sense whatever of a landscape or a scene.

"The Man Who Mistook His Wife For A Hat: And Other Clinical Tales", by Oliver Sacks, 1985

Ideas

Obtain a 3D object recognition method based on a compact description of image sequences

Exploit spatial information on proximity of features appearing contemporaneously

Exploit temporal continuity in both training and test

E. Delponte, N. Noceti, F. Odone and A. Verri. Spatio temporal constraints for matching view-based descriptions of 3D objects. In WIAMIS 2007.

Recognizing objects with ST models

Video sequences

Keypoints detection and description

Keypoints tracking

Cleaning procedure

Building the spatio temporal model

2-stage matching procedure

Object recognition

Spatio-temporal model for training

Spatio-temporal model for test

From sequence to spatio-temporal model

For each image of the sequence:
  extract Harris corners
  assign them a scale and a principal direction
  assign them a SIFT descriptor

Tracking of keypoints with a Kalman filter

Cleaning procedure based on length of trajectories and robustness of descriptors
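The cleaning step can be illustrated with a minimal sketch. The slides only say that cleaning is based on trajectory length and descriptor robustness; the concrete criteria below (a minimum number of frames, and a cap on the mean distance of the descriptors from their centroid) are assumptions made for illustration, as are the names `clean_trajectories`, `min_length` and `max_spread`.

```python
from statistics import mean


def clean_trajectories(trajectories, min_length=5, max_spread=0.5):
    """Keep only trajectories that are long and whose descriptors are stable.

    trajectories: dict mapping a track id to the list of descriptors
    (lists of floats) observed along that keypoint's trajectory.
    min_length and max_spread are hypothetical thresholds.
    """
    kept = {}
    for track_id, descriptors in trajectories.items():
        # Criterion 1 (assumed): discard trajectories that are too short.
        if len(descriptors) < min_length:
            continue
        dim = len(descriptors[0])
        centroid = [mean(d[i] for d in descriptors) for i in range(dim)]
        # Criterion 2 (assumed): mean Euclidean distance from the centroid
        # as a rough measure of how stable the descriptor is over time.
        spread = mean(
            sum((d[i] - centroid[i]) ** 2 for i in range(dim)) ** 0.5
            for d in descriptors
        )
        if spread <= max_spread:
            kept[track_id] = descriptors
    return kept
```

A short or unstable trajectory is simply dropped, so only features that survive both checks contribute to the model.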

Computation of time invariant features

Video sequence

Keypoints detection and description

Keypoints tracking

Cleaning procedure

Building the spatio temporal model

Time invariant feature

We obtain a set of time-invariant features, each made of:

  a spatial appearance descriptor, that is the average of all SIFT vectors of its trajectory

  a temporal descriptor, that contains information on when the feature first appeared in the sequence and on when it was last observed
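As a minimal sketch of this definition, a time-invariant feature can be built from one cleaned trajectory as follows. The class and function names (`TimeInvariantFeature`, `build_feature`) and the input layout (a list of frame index and descriptor pairs) are illustrative assumptions, not the authors' code.

```python
from dataclasses import dataclass


@dataclass
class TimeInvariantFeature:
    appearance: list   # average of the descriptors along the trajectory
    first_frame: int   # frame in which the feature first appeared
    last_frame: int    # frame in which the feature was last observed


def build_feature(observations):
    """Build one feature from a trajectory.

    observations: non-empty list of (frame_index, descriptor) pairs,
    where each descriptor is a list of floats of the same length.
    """
    frames = [frame for frame, _ in observations]
    descriptors = [desc for _, desc in observations]
    dim = len(descriptors[0])
    average = [
        sum(d[i] for d in descriptors) / len(descriptors) for i in range(dim)
    ]
    return TimeInvariantFeature(average, min(frames), max(frames))
```

With real data each descriptor would be a 128-dimensional SIFT vector; the sketch works for any fixed length.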

The spatio-temporal model

The collection of time-invariant features constitutes a spatio-temporal model that we use to train our system

We emphasise the temporal coherence of the model and we exploit features appearing simultaneously
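One simple way to find features that appear simultaneously is to compare their temporal descriptors as frame intervals. The sketch below assumes each feature is reduced to a `(first_frame, last_frame)` pair with inclusive bounds; the helper names `overlap` and `co_occurring` are hypothetical.

```python
def overlap(a, b):
    """Number of frames in which both features are visible.

    a, b: (first_frame, last_frame) pairs, bounds inclusive.
    """
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def co_occurring(intervals, index):
    """Indices of the features sharing at least one frame with intervals[index]."""
    ref = intervals[index]
    return [
        j for j, other in enumerate(intervals)
        if j != index and overlap(ref, other) > 0
    ]
```

For example, with intervals `[(0, 10), (5, 15), (11, 20)]` the first feature co-occurs only with the second, while the second co-occurs with both of the others.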

Matching spatio-temporal models

2-stage matching procedure

Object recognition

Spatio-temporal model for training

Spatio-temporal model for test

Matching of sequence models

For each video sequence we compute its spatio-temporal model

Given a test sequence, we perform a two-stage matching procedure that exploits the spatial and temporal coherence of time-invariant features:
  we compute a first set of matches
  we reinforce the procedure by analysing the spatial and temporal neighborhood of the matches
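The two stages could be sketched roughly as below. The slides do not give the similarity measure, the thresholds, or the exact neighborhood test, so everything here is an assumption: cosine similarity, a strict threshold for the first stage, a relaxed one for the second, and a purely temporal neighborhood (a spatial proximity check would be added in the same place). Features are plain dicts with hypothetical keys `desc`, `first` and `last`.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def overlaps(f, g):
    # True when the two features are visible in at least one common frame.
    return f["first"] <= g["last"] and g["first"] <= f["last"]


def two_stage_match(test, model, strict=0.9, relaxed=0.7):
    """Return a dict mapping test feature index -> model feature index."""
    if not model:
        return {}
    # Stage 1: accept only confident appearance matches.
    matches = {}
    for i, t in enumerate(test):
        scores = [cosine(t["desc"], m["desc"]) for m in model]
        j = max(range(len(model)), key=scores.__getitem__)
        if scores[j] >= strict:
            matches[i] = j
    # Stage 2: for unmatched test features that co-occur with a stage-1
    # match, search only among model features co-occurring with the
    # matched model feature, using a relaxed threshold.
    seeds = list(matches.items())
    for i, t in enumerate(test):
        if i in matches:
            continue
        for ti, mj in seeds:
            if not overlaps(t, test[ti]):
                continue
            candidates = [
                j for j, m in enumerate(model)
                if j != mj and overlaps(m, model[mj])
            ]
            if not candidates:
                continue
            best = max(candidates, key=lambda j: cosine(t["desc"], model[j]["desc"]))
            if cosine(t["desc"], model[best]["desc"]) >= relaxed:
                matches[i] = best
                break
    return matches
```

The design choice illustrated is that weaker appearance evidence is accepted only when it is supported by a confident match nearby in time.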

Matching of sequence models

Experiments and results

Matching assessment
  Illumination, scale and background changes
  Changes in motion
  Increasing the number of objects

Object recognition on a 20 objects dataset

Recognition on a video stream

3D objects

Matching assessment

Matches obtained on sequences with simple changes compared w.r.t. ST models of 4 objects

Changing motion

Matches obtained w.r.t. ST models of 4 objects

Matches obtained in the first step of matching

Models of 3D objects

Matching assessment

Models of 3D objects

Recognizing 20 objects

Number of experiments: 840

TP=51 FN=13

FP=11 TN=765

Recall: 80%

Precision: 82%
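The reported precision and recall follow directly from the confusion counts. A short check, using the standard definitions precision = TP / (TP + FP) and recall = TP / (TP + FN); the function name is illustrative:

```python
def precision_recall(tp, fp, fn):
    """Standard precision and recall from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Counts taken from the slide.
tp, fn, fp, tn = 51, 13, 11, 765
total = tp + fn + fp + tn          # 840, the number of experiments

precision, recall = precision_recall(tp, fp, fn)
print(f"precision = {precision:.0%}, recall = {recall:.0%}")
# prints "precision = 82%, recall = 80%"
```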

Recognition on a video stream

Conclusion and future work

We exploited the compactness and the expressiveness of local image descriptions to address the problem of 3D object recognition

We devised a system based on spatial and temporal information, and we have shown how the model of a 3D object benefits from both kinds of information

The system could benefit from adding information on the image context [Tor03]

Thanks for your attention!
