visuelle perzeption für mensch- maschine schnittstellen€¦ · comparison for each shape location...

56
Interactive Systems Laboratories, Universität Karlsruhe (TH) Edgar Seemann, 04.12.09 1 Visuelle Perzeption für Mensch- Maschine Schnittstellen Vorlesung, WS 2009 Prof. Dr. Rainer Stiefelhagen Dr. Edgar Seemann Institut für Anthropomatik Universität Karlsruhe (TH) http://cvhci.ira.uka.de [email protected] [email protected]

Upload: others

Post on 16-Aug-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Interactive Systems Laboratories, Universität Karlsr uhe (TH)Edgar Seemann, 04.12.09 1

Visuelle Perzeption für Mensch-Maschine Schnittstellen

Vorlesung, WS 2009

Prof. Dr. Rainer StiefelhagenDr. Edgar Seemann

Institut für AnthropomatikUniversität Karlsruhe (TH)

http://[email protected]

[email protected]

Page 2: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Interactive Systems Laboratories, Universität Karlsr uhe (TH)Edgar Seemann, 04.12.09 2

Computer Vision:

People Detection II

WS 2009/10

Dr. Edgar Seemann

[email protected]

Page 3: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Interactive Systems Laboratories, Universität Karlsr uhe (TH)Edgar Seemann, 04.12.09 3

Termine (1)

TBA21.12.2009

Stereo and Optical Flow18.12.2009

Termine Thema19.10.2009 Introduction, Applications

23.10.2009 Basics: Cameras, Transformations, Color

26.10.2009 Basics: Image Processing

30.10.2009 Basics: Pattern recognition

02.11.2009 Computer Vision: Tasks, Challenges, Learning, Performance measures

06.11.2009 Face Detection I: Color, Edges (Birchfield)

09.11.2009 Project 1: Intro + Programming tips13.11.2009 Face Detection II: ANNs, SVM, Viola & Jones

16.11.2009 Project 1: Questions

20.11.2009 Face Recognition I: Traditional Approaches, Eigenfaces, Fisherfaces, EBGM

23.12.2009 Face Recognition II

27.11.2009 Head Pose Estimation: Model-based, NN, Texture Mapping, Focus of Attention

30.11.2009 People Detection I

04.12.2009 People Detection II

07.12.2009 Project 1: Student Presentations, Project 2: Intro11.12.2009 People Detection III (Part-Based Models)

14.12.2009 Scene Context and Geometry

Page 4: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 4

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciToday

� Last time: Global approaches� Compute a singe feature vector / representation

for the complete body

� Today:� Silhouette matching (global approach)

� Part-based approaches

Page 5: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 5

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ci

Silhouette Matching

Page 6: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 6

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciChamfer Matching [Gavrila & Philomin ICCV’99]

Goal

• Align known object shapes with image

Object shapesReal-world image of object

Requirements for an alignment algorithm

• High detection rate

• Few false positives

• Robustness

• Computationally inexpensive

Page 7: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 7

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ci

� Used to compare/align two (typically binary) shapes

1. Compute for each pixel the distance to the next edge pixel� Here the eculidean distances are

approximated by the 2-3 distance

Distance Transform

Shape 1 Shape 2

Distance = ?

Distance transform

20

3

36

0

85222025

2024

630

2

520002

42222

4333

555

668

522024

30024

2024

235244

Page 8: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 8

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciDistance Transform

2. Overlay second shape over distance transform

3. Accumulate distances along shape 24. Find best matching position by an exhaustive search

� Distance is not symmetric� Distance has to be normalized w.r.t. the length of the

shapes

Distance transform

20

3

36

0

85222025

2024

630

2

520002

42222

4333

555

668

522024

30024

2024

235244

Distance = 14

Page 9: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 9

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciChamfer Matching

Distance measureDistance transform of a real-world image

Distance transform

20

3

36

0

85222025

2024

630

2

520002

42222

4333

555

668

522024

30024

2024

235244

Binary image

• Compute distance transform (DT)

• For each possible object location

• Position known object shape over DT

• Accumulate distances along the

contour

Chamfer Matching

Page 10: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 10

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciEfficient implementation

� The distance transform can be efficiently computed by two scans over the complete image

� Forward-Scan� Starts in the upper-left corner and moves from left to right, top to

bottom

� Uses the following mask

� Backward-Scan� Starts in the lower-right corner and moves from right to left,

bottom to top

� Uses the following mask

3 2 3

2 0

3 2 3

20

Page 11: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 11

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ci

3

2

? ? ?

?

?

Forward scan

� We can choose different values for the filter mask

� The local distances, d, s and c, in the mask are added to the pixel values of the distance map and the new value of the zero pixel is the minimum of the five sums

� Example:

d s d

s c

2+2 2+3 3+5

2+0 2

3 2 3

2 0 ?

? ? ?

5

?

?

Page 12: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 12

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciAdvantages and Disadvantages

� Fast� Distance transform has to be computed only once

� Comparison for each shape location is cheap

� Good performance on uncluttered images (with few background structures)

� Bad performance for cluttered images

� Needs a huge number of people silhouettes� But computation effort increases with the number of

silhouettes

Page 13: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 13

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciTemplate Hierachy

� To reduce the number of silhouettes to consider, silhouettes can be organized in a template hierarchy

� For this, the shapes are clustered by similarity

Page 14: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 14

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciSearch in the hierarchy

� Matching the shapes, then corresponds to a traversal of the template hierarchy

� How can we prune search branches to speed up matching?� Thresholds depend on:

� Edge detector (likelihood of gaps)

� Silhouette sizes

� Hierarchy level

� Allowed shape variation

� Thresholds are set statisticallyduring training

Page 15: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 15

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciExample Detections

Page 16: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 16

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciVideo

Page 17: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 17

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciCoarse-To-Fine Search

� Goal: Reduce search effort by discarding unlikely regions with minimal computation

� Idea:� Subsample image and search

first at a coarse scale

� Only consider regions with alow distance when searchingfor a match on finer scales

� Again, we have to findreasonable thresholds

Level 1

Level 2

Level 3

Page 18: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 18

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciProtector System (Daimler)

Page 19: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 19

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciAdding edge orientation

� So far edge orientation has been completely ignored

� Idea: Consider edge orientation for each pixel

Distance = small

Page 20: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 20

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciEdge orientation - The math

� Given two shapes S, C, we can express the chamfer distance in the following manner

� The orientation correspondence between two points is then measured by

� The combined distance measure:

Page 21: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 21

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciStatistical Relevance

� Adding statistical relevance of silhouette regions further improves the results[Dimitrijevic06]

Page 22: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 22

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciSpatio-Temporal templates

� Use multiple successive frames to build a spatio-temporal template (T={T1,…,TN})

� Allow spatial variations of dx, dy (due to motion or camera movements)

Page 23: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 23

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciExample: single-frame vs. 3 frames

Page 24: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 24

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciQuantitative Results

� Red: spatio-temporal templates + statistical relevance

Page 25: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 25

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciVideo

� Restrict detection to a single articulation (when legs are in a v-shaped position)

� Spatio-Temporal templates:� Allows more reliable detection of motion direction

� Avoids confusions and some false positive detections

Page 26: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 26

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciAlternatives: Earth Mover’s Distance

� Originally developed to compare histograms

� Idea: Find the minimal ‘flow’ to transform one histogram to another

� Example:

Page 27: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 27

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciEarth mover’s distance (EMD)

Example detection

• Detect edges in image

• For each possible object location

• Optimize correspondences between

known shape and edge image

EMD MatchingChamfer:

EMD:

Distance measure basis

Page 28: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 28

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciEMD – The math

� Variant of the transportation problem (possible solutions: Stepping Stone Algorithm, Transportation-simplex method)

� Constraints

� EMD-Distance

Page 29: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 29

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciAdvantages and Disadvantages

� Optimizes matching between silhouette and edge structure in image

� Enforces one-to-one matchings (unlike chamfer)

� Allows partial matches

� Can deal with arbitrary features

� High computational complexity� Approximation is possible [Graumann, Darrel CVPR’94]

Page 30: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 30

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ci

Part-Based Models

Page 31: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 31

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciPart-Based Models

� Fischler & Elschlager1973

� Model has two components� parts

(2D image fragments)

� structure (configuration of parts)

Page 32: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 32

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciConfiguration of parts

� Fixed Spatial Layout� Local parts have a mostly fixed position and orientation

with respect to the center object center or detection window

� Flexible Spatial Layout� Local parts are allowed to shift in location and scale� Able to compensate for deformations or articulation

changes� Well suited for non-rigid objects � Spatial relations often modeled probabilistically

Page 33: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 33

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ci

K. Grauman, B. Leibe

Different Connectivity Structures

Fergus et al. ’03Fei-Fei et al. ‘03

Leibe et al. ’04, ‘08Crandall et al. ‘05Fergus et al. ’05

Crandall et al. ‘05 Felzenszwalb & Huttenlocher ‘05

Bouchard & Triggs ‘05 Carneiro & Lowe ‘06Csurka ’04Vasconcelos ‘00

from [Carneiro & Lowe, ECCV’06]

O(N6) O(N2) O(N3) O(N2)

Page 34: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 34

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ci

Fixed Spatial Layout

Page 35: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 35

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciMotivation

� Breakdown overall variability into more manageable pieces

� Pieces can be classified by a less complex classifier

� Apply prior information by (manually) grouping the global feature vector into meaningful parts

Page 36: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 36

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciExample: Sashua et al. IVS’04

� Divide person into 9 overlapping parts

� 4 pair combinations

Page 37: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 37

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciFeature Representation

� Edge orientation histograms (similar to HOG)

� Each part is divided into 2x2 regions� i.e. 4 histograms per part

� 8 gradient orientations

� Features are insensitive to small local shifts

Page 38: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 38

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciPart Classifier and Combination

� Parts are classified by a linear regression� i.e. distance to a separating hyperplane

� One score with respect to each of the 9 parts� Separate head vs. leg feature, head vs. upper body etc.

� The values of the individual part detectors are combined by AdaBoost

Page 39: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 39

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciAttention mechanism

� Attention mechanism� Considers scene geometry, camera perspective

� Filters out regions with lack of texture properties

� Only 75 image windows are considered for pedestrian classification per frame

Page 40: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 40

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciResults

� Lower curve:Global SVM

� Middle curve:Mohan et al.

� Upper curve:Sashua et al.

Page 41: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 41

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciResults

Page 42: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 42

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciExample: Mohan et al.

� Body divided into 4 parts� Face/Shoulder

� Legs

� Right/Left arm

� Detection� Sliding window

� 64x128 pixels

Page 43: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 43

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciGeometric constraints

� Body parts are not always at the exact same position

� Allow local shifts� Position

� Scale

� Best location has to befound for each detectionwindow

Page 44: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 44

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciTraining Data

� MIT pedestrian database

Page 45: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 45

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciFeatures

� Based on wavelets� Multi-Resolution representation

� Haar wavelets

� Quadrupledensity

Page 46: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 46

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciDiscrete Wavelet Transform (DWT)

Motivation

� Representation with basis functions

� Given: Signal [9 7 3 5]

� Disadvantages:� Coefficients provide only

local information

� No information aboutglobal signal change

http://nt.eit.uni-kl.de/wavelet/transformation.html

Page 47: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 47

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciHaar Wavelet Basis

� Scaling function provides information about the mean value or signal level

� Representation inwavelet basis:� [12 4 sqrt(2) -sqrt(2)]

Page 48: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 48

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ci1D Wavelet Transformation

� Step wise transformation with orthogonal

spaces

� Basis function for Vj� Scaling function

� Basis function for Wj

� Wavelet function

Page 49: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 49

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciOur Example

� Project signal onto subspaces

V1 W1

V2 W2

Page 50: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 50

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ci2D Haar-Wavelets

� Same princple as in 1D

� Quadruple densityi.e. overcompleterepresentation

Page 51: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 51

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciFeature Discussion

� Haar wavelets represent pixel/region differences

� Strong responses of wavelet coefficients on image boundaries, i.e. coefficients encode shape

� Similar to gradients at different resolutions

� Overcomplete basis allows good spatial resolution

� Fine scales are discarded as they represent noise

� Coarse scales are discarded as they represent global intensity levels and not the shape

Page 52: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 52

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciPerformance: Part Detectors

Page 53: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 53

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciClassifier Combination

� Voting� E.g. majority of part detectors, classify detection

window as person

� SVM combination� Feed SVM scores of part detectors in a second stage

SVM

� Referred to as Adaptive Classifier Combination

Page 54: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 54

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciCombination Performance

Page 55: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 55

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciResults

Page 56: Visuelle Perzeption für Mensch- Maschine Schnittstellen€¦ · Comparison for each shape location is cheap Good performance on uncluttered images (with few background ... Allows

Edgar Seemann, 04.12.09 56

Com

pute

r V

isio

n fo

r H

uman

-Com

pute

r In

tera

ctio

n

Res

earc

h G

roup

, Uni

vers

itätK

arls

ruhe

(T

H)

cv:h

ciOcclusion