Loris Bazzani*, Marco Cristani*†, Alessandro Perina*, Michela Farenzena*,
Vittorio Murino*†
*Computer Science Department, University of Verona, Italy
†Istituto Italiano di Tecnologia (IIT), Genova, Italy
Multiple-shot Person Re-identification by HPE signature
This research is funded by the EU project FP7 SAMURAI, grant FP7-SEC-2007-01 No. 217899
Analysis of the problem (1)
• Person re-identification: recognizing an individual in diverse locations over different (non-)overlapping camera views
[Figure: example frames at T = 1, T = 23, T = 145, and T = 222, labeled "Same camera" and "Different cameras"]
Analysis of the problem (2)
• We focus on the problem with non-overlapping cameras
• Problems in real scenarios:
– Very low resolution
– Severe occlusions
– Illumination variations
– Pedestrians with very similar clothes
– Pose and viewpoint changes
– No geometry of the environment
• Solution:
– Histogram Plus Epitome (HPE) descriptor, and
– Multiple-shot approach
Outline
• Overview of the proposed method
• Pre-processing: Background Subtraction
• “Images selection” for Multiple-shot
• HPE descriptor
– Global descriptor
– Local descriptors
• HPEs’ Matching
• Results
• Conclusions
Overview of the proposed method
• Employing global and local appearance-based features
• Exploiting temporal consistency to make the descriptor robust
Background Subtraction
We employ a novel generative model: STEL [Jojic et al. 2009]
Capture the structure of an image class as a mixture of component segmentations
Isolate meaningful parts that exhibit tight feature distributions
[Figure: learned mixture components]
“Images selection” for Multiple-shot
Objective: discard redundant information and images with occlusions
• Gaussian Mixture Model clustering [Figueiredo and Jain 2002] of HSV histograms
• Automatic model selection employing the Bayesian Information Criterion [Figueiredo and Jain 2002]
• Discard the clusters with a low number of instances
• Keep a random instance for each cluster
Examples of discarded images:
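The last two selection steps can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the cluster labels have already been produced by some clustering of the HSV histograms, and the function name `select_images` and the `min_cluster_size` threshold are hypothetical.

```python
import random
from collections import defaultdict


def select_images(labels, min_cluster_size=2, seed=0):
    """Return one randomly chosen image index per sufficiently large cluster.

    labels[i] is the cluster label assigned to image i (assumed to come from
    a prior clustering step). min_cluster_size is a hypothetical threshold.
    """
    rng = random.Random(seed)

    # Group image indices by cluster label.
    clusters = defaultdict(list)
    for index, label in enumerate(labels):
        clusters[label].append(index)

    selected = []
    for label in sorted(clusters):
        members = clusters[label]
        if len(members) < min_cluster_size:
            # Small clusters are discarded (for example, occluded frames).
            continue
        # Keep a single random representative of the cluster.
        selected.append(rng.choice(members))
    return selected
```

For example, `select_images([0, 0, 0, 1, 2, 2])` drops the single-image cluster 1 and returns one index from cluster 0 and one from cluster 2.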
HPE descriptor: Global feature
36-dimensional HSV histogram (H=16, S=16, V=4)
Average the histograms of the multiple instances
Robust to illumination and pose variations, keeping the predominant chromatic information only
• Capture chromatic global information
[Figure annotation: variations caused by illumination changes]
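A possible way to compute and average such a histogram is sketched below. Since 16 + 16 + 4 = 36, the sketch assumes the descriptor concatenates separate per-channel histograms; that reading, the normalization to unit sum, and the names `hsv_histogram` and `average_histograms` are assumptions, not details given on the slide.

```python
import colorsys

H_BINS, S_BINS, V_BINS = 16, 16, 4  # 16 + 16 + 4 = 36 bins in total


def _bin(value, n_bins):
    """Map a value in [0, 1] to a bin index, clamping 1.0 into the last bin."""
    return min(int(value * n_bins), n_bins - 1)


def hsv_histogram(rgb_pixels):
    """Build a 36-bin histogram from (r, g, b) pixels with channels in [0, 1].

    The H, S, and V histograms are concatenated and the result sums to 1.
    """
    hist = [0.0] * (H_BINS + S_BINS + V_BINS)
    count = 0
    for r, g, b in rgb_pixels:
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        hist[_bin(h, H_BINS)] += 1
        hist[H_BINS + _bin(s, S_BINS)] += 1
        hist[H_BINS + S_BINS + _bin(v, V_BINS)] += 1
        count += 1
    if count == 0:
        raise ValueError("no pixels given")
    total = 3 * count  # each pixel contributes one count per channel
    return [x / total for x in hist]


def average_histograms(histograms):
    """Element-wise mean of the histograms of multiple images of one person."""
    n = len(histograms)
    return [sum(column) / n for column in zip(*histograms)]
```

In practice only the foreground pixels left after background subtraction would be passed to `hsv_histogram`.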
HPE descriptor: Local feature (1)
Epitome [Jojic et al. 2003]: generative model that analyzes the presence of recurrent, structured local patterns
[Figure: examples of the generic epitome and the local epitome]
HPE descriptor: Local feature (2)
Generic Epitome: 36-dimensional HSV histogram of the epitome
Local Epitome:
• Keep the patches with a high probability of representing several ingredient patches; the probability is defined for the epitome patch having (i, j) as its upper-left corner
• Discard the patches with low entropy
• Extract a 36-dimensional HSV histogram of the surviving patches
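The two filtering steps for the local epitome can be illustrated as follows. This is only a sketch under stated assumptions: each patch is a flat list of quantized pixel values, its probability is supplied by the epitome model, and the thresholds `min_probability` and `min_entropy` as well as the function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    total = len(values)
    if total == 0:
        return 0.0
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def keep_patches(patches, probabilities, min_probability=0.5, min_entropy=1.0):
    """Keep patches that are both probable and textured.

    patches[k] is a flat list of quantized pixel values; probabilities[k] is
    the probability attached to that patch (assumed given by the epitome).
    """
    kept = []
    for patch, probability in zip(patches, probabilities):
        if probability < min_probability:
            continue  # patch represents too few ingredient patches
        if shannon_entropy(patch) < min_entropy:
            continue  # nearly uniform patch carries little information
        kept.append(patch)
    return kept
```

A uniform patch has entropy 0 and is dropped, while a patch with four equally frequent values has entropy 2 bits and is kept if its probability is high enough.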
HPEs’ Matching
Re-identification: associating each element in the probe set B to the corresponding element in the gallery set A
Minimize the following distance:
[Equation not preserved in the transcript; its terms use the Bhattacharyya distance]
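Nearest-gallery matching with a Bhattacharyya-based distance can be sketched as below. The combination rule is not given in the text above, so the sketch assumes an unweighted sum of per-feature distances; it also uses the form sqrt(1 - BC), one of several common definitions of the Bhattacharyya distance. The function names and the dictionary layout of a signature are hypothetical.

```python
import math


def bhattacharyya_distance(p, q):
    """Distance between two histograms that each sum to 1.

    Uses sqrt(1 - BC), where BC is the Bhattacharyya coefficient
    (an assumed form; -ln(BC) is another common definition).
    """
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return math.sqrt(max(0.0, 1.0 - bc))


def signature_distance(sig_a, sig_b):
    """Unweighted sum of distances over the feature histograms (assumption)."""
    return sum(bhattacharyya_distance(sig_a[k], sig_b[k]) for k in sig_a)


def match(probe, gallery):
    """Assign each probe identity to the gallery identity at minimum distance."""
    result = {}
    for probe_id, probe_sig in probe.items():
        result[probe_id] = min(
            gallery,
            key=lambda gallery_id: signature_distance(probe_sig, gallery[gallery_id]),
        )
    return result
```

Here a signature is a dictionary mapping a feature name (for example, a global histogram and the two epitome histograms) to a normalized histogram.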
Results (1)
iLIDS dataset:
- Multiple images of 119 pedestrians, 128x64 pixels
- Comparison with the Context-based method [Zheng et al. 2009]
- Cross-validation: SvsS 10 trials, MvsS/MvsM 100 trials
Results (2)
ETHZ dataset:
- Three datasets of 83, 35 and 28 pedestrians, 64x32 pixels
- Comparison with the Partial Least Squares (PLS) method [Schwartz and Davis 2009]
- Cross-validation: settings as for iLIDS
Results (3)
How many images do we need to perform a “good” person re-identification?
N = Number of images for the multi-shot approach
N = 5 seems to be the best trade-off
Conclusions
We proposed a novel descriptor for the person re-identification problem: the HPE descriptor
The descriptor is robust to low resolution, occlusions, illumination variations, pedestrians with very similar clothes, pose changes
It is based on the accumulation of images to gain robustness
The person re-identification problem is still far from being solved
The results suggest that further improvements can be reached
References
[Jojic et al. 2009] N. Jojic, A. Perina, M. Cristani, V. Murino, and B. Frey, “Stel component analysis: Modeling spatial correlations in image class structure,” in IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 2044–2051.
[Figueiredo and Jain 2002] M. Figueiredo and A. Jain, “Unsupervised learning of finite mixture models,” IEEE Trans. PAMI, vol. 24, no. 3, pp. 381–396, 2002.
[Jojic et al. 2003] N. Jojic, B. J. Frey, and A. Kannan, “Epitomic analysis of appearance and shape,” in IEEE International Conference on Computer Vision. Washington, DC, USA: IEEE Computer Society, 2003, p. 34.
[Schwartz and Davis 2009] W. Schwartz and L. Davis, “Learning discriminative appearance-based models using partial least squares,” in XXII SIBGRAPI, 2009.
[Zheng et al. 2009] W. Zheng, S. Gong, and T. Xiang, “Associating groups of people,” in BMVC, 2009.