Loris Bazzani*, Marco Cristani*†, Alessandro Perina*, Michela Farenzena*, Vittorio Murino*†

*Computer Science Department, University of Verona, Italy
†Istituto Italiano di Tecnologia (IIT), Genova, Italy

Multiple-shot Person Re-identification by HPE signature

This research is funded by the EU-Project FP7 SAMURAI, grant FP7-SEC-2007-01 No. 217899



Analysis of the problem (1)

• Person re-identification: recognizing an individual in diverse locations over different (non-)overlapping camera views

[Figure: example frames at T = 1, T = 23, T = 145 and T = 222, labeled "Same camera" and "Different cameras"]

Analysis of the problem (2)

• We focus on the problem with non-overlapping cameras
• Problems in real scenarios:
– Very low resolution
– Severe occlusions
– Illumination variations
– Pedestrians with very similar clothes
– Pose and viewpoint changes
– No geometry of the environment
• Solution:
– Histogram Plus Epitome (HPE) descriptor, and
– Multiple-shot approach

Outline

Overview of the proposed method
Pre-processing: Background Subtraction
“Images selection” for Multiple-shot
HPE descriptor
- Global descriptor
- Local descriptors
HPEs’ Matching
Results
Conclusions

Overview of the proposed method

• Employing global and local appearance-based features
• Exploiting temporal consistency to make the descriptor robust

Background Subtraction

We employ a novel generative model: STEL [Jojic et al. 2009]
Captures the structure of an image class as a mixture of component segmentations
Isolates meaningful parts that exhibit tight feature distributions

[Figure: learned mixture components]

“Images selection” for Multiple-shot

Objective: discard redundant information and images with occlusions
- Gaussian Mixture Model clustering [Figueiredo and Jain 2002] of HSV histograms
- Automatic model selection employing the Bayesian Information Criterion [Figueiredo and Jain 2002]
- Discard the clusters with a low number of instances
- Keep a random instance for each cluster

[Figure: examples of ruled-out images]
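The last two selection steps can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the cluster label of each image has already been produced by the GMM clustering step, and the names `select_key_images`, `min_cluster_size` and `seed` are hypothetical (the slide only says "low number of instances" without giving a threshold).

```python
import random
from collections import defaultdict


def select_key_images(labels, min_cluster_size=2, seed=0):
    """Return one randomly chosen image index per sufficiently large cluster.

    labels[i] is the cluster id of image i, assumed to come from a prior
    GMM clustering of HSV histograms. min_cluster_size is a hypothetical
    threshold standing in for "low number of instances".
    """
    clusters = defaultdict(list)
    for index, label in enumerate(labels):
        clusters[label].append(index)

    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    selected = []
    for label in sorted(clusters):
        members = clusters[label]
        if len(members) < min_cluster_size:
            continue  # sparse cluster: discarded, as on the slide
        selected.append(rng.choice(members))  # keep one random instance
    return selected


labels = [0, 0, 0, 1, 2, 2, 0, 2]
print(select_key_images(labels))
```

In this toy input, cluster 1 has a single member and is dropped, so two images are returned, one from cluster 0 and one from cluster 2.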

HPE descriptor: Global feature

• Capture chromatic global information
36-dimensional HSV histogram (H=16, S=16, V=4)
Average the histograms of the multiple instances
Robust to illumination and pose variations, keeping the predominant chromatic information only

[Figure label: caused by illumination changes]
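A possible reading of the global feature is sketched below. It assumes the 36 dimensions are the concatenation of three per-channel histograms (16 + 16 + 4 = 36), that each histogram is computed on foreground pixels only, and that the result is normalized to sum to 1; none of these details is stated on the slide. The function names are hypothetical.

```python
import colorsys

BINS = (16, 16, 4)  # H, S, V bins; 16 + 16 + 4 = 36 dimensions


def hsv_histogram(pixels):
    """Concatenated per-channel HSV histogram of (r, g, b) pixels in 0..255."""
    hist = [0.0] * sum(BINS)
    count = 0
    for r, g, b in pixels:
        hsv = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        offset = 0
        for value, bins in zip(hsv, BINS):
            index = min(int(value * bins), bins - 1)  # clamp value == 1.0
            hist[offset + index] += 1.0
            offset += bins
        count += 1
    if count:
        # Each pixel adds three counts, so divide by 3 * count to sum to 1.
        hist = [v / (3.0 * count) for v in hist]
    return hist


def average_histograms(histograms):
    """Element-wise mean over the histograms of the selected images."""
    n = len(histograms)
    return [sum(values) / n for values in zip(*histograms)]


red = hsv_histogram([(255, 0, 0)] * 4)
blue = hsv_histogram([(0, 0, 255)] * 4)
global_feature = average_histograms([red, blue])
print(len(global_feature))
```

Averaging keeps the bins that are populated consistently across the selected images and attenuates bins that appear in only a few of them.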

HPE descriptor: Local feature (1)

Epitome [Jojic et al. 2003]: generative model that analyzes the presence of recurrent, structured local patterns

[Figure: Generic Epitome and Local Epitome examples]

HPE descriptor: Local feature (2)

Generic Epitome: 36-dimensional HSV histogram of the epitome

Local Epitome:
- Keep the patches with a high probability that the epitome patch having (i, j) as its upper-left corner represents several ingredient patches
- Discard the patches with low entropy
- Extract a 36-dimensional HSV histogram of the surviving patches
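The entropy filter alone can be illustrated as follows. The slide does not say how patch entropy is measured, so this sketch assumes the Shannon entropy of quantized gray levels; `levels` and `min_entropy` are hypothetical parameters, and the probability criterion of the previous step is not covered here.

```python
import math
from collections import Counter


def patch_entropy(patch, levels=8):
    """Shannon entropy (bits) of a patch's gray values (0..255), quantized."""
    values = [min(v * levels // 256, levels - 1) for row in patch for v in row]
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def keep_informative(patches, min_entropy=1.0):
    """Discard patches whose entropy falls below the assumed threshold."""
    return [p for p in patches if patch_entropy(p) >= min_entropy]


flat = [[128] * 4 for _ in range(4)]              # uniform patch
textured = [[0, 64, 128, 255] for _ in range(4)]  # four distinct levels
print(len(keep_informative([flat, textured])))
```

A uniform patch has zero entropy and is dropped, while a patch with varied content survives.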

HPEs’ Matching

Re-identification: associating each element in the probe set B to the corresponding element in the gallery set A

Minimize the following distance

[Equation not preserved in the transcript; it is defined in terms of the Bhattacharyya distance]
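Since the formula itself is not available here, the sketch below only illustrates the general idea of nearest-gallery matching with a Bhattacharyya distance. Two points are assumptions rather than the authors' definition: the distance is taken in the form sqrt(1 - BC), where BC is the Bhattacharyya coefficient of two normalized histograms, and the component distances of a descriptor are combined by an unweighted mean. All names are hypothetical.

```python
import math


def bhattacharyya(p, q):
    """Distance sqrt(1 - BC) between two histograms that each sum to 1."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return math.sqrt(max(0.0, 1.0 - bc))  # clamp guards rounding above 1


def hpe_distance(desc_a, desc_b):
    """Assumed combination: unweighted mean over component histograms."""
    parts = [bhattacharyya(a, b) for a, b in zip(desc_a, desc_b)]
    return sum(parts) / len(parts)


def match(probe, gallery):
    """Return the gallery id whose descriptor is closest to the probe."""
    return min(gallery, key=lambda gid: hpe_distance(probe, gallery[gid]))


# Toy descriptors: tuples of (global, generic epitome, local epitome)
# histograms, shortened to two bins each for readability.
gallery = {
    "person_1": ([1.0, 0.0], [0.5, 0.5], [1.0, 0.0]),
    "person_2": ([0.0, 1.0], [0.5, 0.5], [0.0, 1.0]),
}
probe = ([0.9, 0.1], [0.5, 0.5], [0.8, 0.2])
print(match(probe, gallery))
```

Repeating `match` for every element of the probe set B yields the association with the gallery set A described on the slide.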

Results (1)

iLIDS dataset:
- Multiple images of 119 pedestrians, 128x64 pixels
- Comparison with the context-based method [Zheng et al. 2009]
- Cross-validation: SvsS 10 trials, MvsS/MvsM 100 trials

Results (2)

ETHZ dataset:
- Three datasets of 83, 35 and 28 pedestrians, 64x32 pixels
- Comparison with the Partial Least Squares (PLS) method [Schwartz and Davis 2009]
- Cross-validation: settings as for iLIDS

Results (3)


How many images do we need to perform a “good” person re-identification?

N = Number of images for the multi-shot approach

N = 5 seems to be the best trade-off

Conclusions

We proposed a novel descriptor for the person re-identification problem: the HPE descriptor
The descriptor is robust to low resolution, occlusions, illumination variations, pedestrians with very similar clothes, and pose changes
It is based on the accumulation of images to gain robustness
The person re-identification problem is still far from being solved
The results suggest that further improvements can be reached

References

[Jojic et al. 2009] N. Jojic, A. Perina, M. Cristani, V. Murino, and B. Frey, “Stel component analysis: Modeling spatial correlations in image class structure,” in IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 2044–2051.

[Figueiredo and Jain 2002] M. Figueiredo and A. Jain, “Unsupervised learning of finite mixture models,” IEEE Trans. PAMI, vol. 24, no. 3, pp. 381–396, 2002.

[Jojic et al. 2003] N. Jojic, B. J. Frey, and A. Kannan, “Epitomic analysis of appearance and shape,” in IEEE International Conference on Computer Vision. Washington, DC, USA: IEEE Computer Society, 2003, p. 34.

[Schwartz and Davis 2009] W. Schwartz and L. Davis, “Learning discriminative appearance-based models using partial least squares,” in XXII SIBGRAPI, 2009.

[Zheng et al. 2009] W. Zheng, S. Gong, and T. Xiang, “Associating groups of people,” in BMVC, 2009.