Trinity College Dublin PixelGT: A new Ground Truth specification for video surveillance Dr. Kenneth Dawson-Howe, Graphics, Vision and Visualisation Group (GV2), School of Computer Science & Statistics, Trinity College Dublin

Post on 21-Dec-2015

Contents

Introduction Review Specification Annotation Conclusions

Available video sequences

Video sequences with no ground truth

Video sequences with moving object pixel masks

Video sequences with moving object bounding boxes

Video sequences with labeled events

Video sequences with moving object pixel masks

Video sequences with moving object and moving shadow pixel masks.

Artificial video sequences with moving object pixel masks.

Video sequences with labeled pixels.

[Figure: example frames labelled CamVid dataset, VSSN dataset, PETS/LIMU dataset, ATON dataset]

Video sequences with moving object bounding boxes

[Figure: example frames labelled ETISEO dataset, CAVIAR dataset]

Video sequences with labeled events

[Figure: example frames labelled PETS 2006 dataset]

Motivation

Existing ground truth requires a choice between:

binary pixel masks (with no specific object identification), OR

object identification (with bounding boxes only and no shadows).

Relatively little pixel-accurate ground truth is available.

There is no existing means of dealing with transparency and reflections.

Ground truth specification

Pixel based ground truth

Bit organisation

Meta data

Gradient profiles

Bounding boxes & events

Pixel based ground truth

Object class (8 bits)

Object instance (9 bits)

Shadow level (4 bits)

Transparency level (3 bits)
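The four fields add up to 8 + 9 + 4 + 3 = 24 bits per pixel. As a minimal sketch of how such a label could be packed into and read back from a single integer: the field widths below are taken from the slide, but the order of the fields within the 24-bit word and the function names `pack_label` and `unpack_label` are assumptions made for illustration, not part of the PixelGT specification.

```python
# Field widths as listed on the slide.
CLASS_BITS, INSTANCE_BITS, SHADOW_BITS, TRANSPARENCY_BITS = 8, 9, 4, 3

# ASSUMPTION: class occupies the lowest bits, then instance, shadow,
# transparency. The real PixelGT bit layout may differ.
INSTANCE_SHIFT = CLASS_BITS
SHADOW_SHIFT = INSTANCE_SHIFT + INSTANCE_BITS
TRANSPARENCY_SHIFT = SHADOW_SHIFT + SHADOW_BITS


def pack_label(object_class, instance, shadow, transparency):
    """Pack the four per-pixel fields into one 24-bit integer."""
    fields = (
        ("object_class", object_class, CLASS_BITS),
        ("instance", instance, INSTANCE_BITS),
        ("shadow", shadow, SHADOW_BITS),
        ("transparency", transparency, TRANSPARENCY_BITS),
    )
    for name, value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name}={value} does not fit in {bits} bits")
    return (
        object_class
        | (instance << INSTANCE_SHIFT)
        | (shadow << SHADOW_SHIFT)
        | (transparency << TRANSPARENCY_SHIFT)
    )


def unpack_label(word):
    """Recover (object_class, instance, shadow, transparency) from a word."""
    return (
        word & ((1 << CLASS_BITS) - 1),
        (word >> INSTANCE_SHIFT) & ((1 << INSTANCE_BITS) - 1),
        (word >> SHADOW_SHIFT) & ((1 << SHADOW_BITS) - 1),
        (word >> TRANSPARENCY_SHIFT) & ((1 << TRANSPARENCY_BITS) - 1),
    )
```

A 24-bit word fits in the three 8-bit channels of an ordinary colour image, which is one plausible way to store such labels per frame.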

Bit organisation

Legend: L = Label class, I = ID, S = Shadow, T = Transparency, 1 = LSB, 8 = MSB

Meta data

Based on CVML

Includes: file names, bit organisation, meaning of object labels, gradient profile.

Gradient profiles

Width of edges?

Required for assessing performance?

Bounding boxes and events

Bounding boxes for moving objects: implicitly included; can be explicitly specified; aids comparison of techniques.

Events: some implicitly included; can be explicitly specified; aids comparison of techniques.
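Bounding boxes are implicit in pixel-level labels because each object's box is simply the extent of its labelled pixels. The sketch below shows this for one frame, assuming the frame is given as a list of rows of object instance IDs with 0 meaning background; that representation and the name `bounding_boxes` are illustrative assumptions, not part of the specification.

```python
def bounding_boxes(instance_map, background_id=0):
    """Return {instance_id: (x_min, y_min, x_max, y_max)} for one frame.

    instance_map is a list of rows, each a list of integer instance IDs.
    ASSUMPTION: background pixels carry background_id (0 here).
    """
    boxes = {}
    for y, row in enumerate(instance_map):
        for x, instance_id in enumerate(row):
            if instance_id == background_id:
                continue
            if instance_id not in boxes:
                boxes[instance_id] = (x, y, x, y)
            else:
                x0, y0, x1, y1 = boxes[instance_id]
                boxes[instance_id] = (
                    min(x0, x), min(y0, y), max(x1, x), max(y1, y)
                )
    return boxes


frame = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 2],
    [0, 0, 0, 2],
]
print(bounding_boxes(frame))
```

Coordinates are inclusive pixel indices, so a single-pixel object has identical minimum and maximum corners.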

Creation of ground truth

Difficult to label all pixels in all frames. Short videos? Not all frames / pixels?

Method:
1. Background annotation
2. Object and shadow boundary specification
3. Object class and ID annotation
4. Transparency, Meta-data, Bounding boxes, Events and Gradient Profile specification
5. Propagation of ground truth

Background annotation

For static cameras: annotate all background pixels. This facilitates automatic event detection.

Specifying object and shadow boundaries

Identify boundary edges of objects / shadows. Once the shadow boundary is complete, the shadow level is computed automatically.

Specifying object class & ID

Once the boundary is complete: select the object class and (optionally) ID, then click on the object.

Transparency, Meta-data, Bounding boxes, Events and Gradient Profile specification

Transparency not yet addressed

Meta data – specified in CVML

Bounding boxes & events – extract automatically from pixel based ground truth

Gradient profile – extract automatically from each image
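As a rough illustration of extracting events automatically from pixel based ground truth, the sketch below derives simple appearance events from the set of object IDs present in each frame. The slides do not define the event vocabulary, so the "enter" and "exit" labels, the input format, and the name `appearance_events` are all hypothetical.

```python
def appearance_events(ids_per_frame):
    """Return a list of (frame_index, event, instance_id) tuples.

    ids_per_frame is a sequence with one collection of instance IDs per
    frame, e.g. the IDs found in that frame's label image.
    ASSUMPTION: an ID appearing is an "enter" event and an ID vanishing
    is an "exit" event. Real event definitions may be richer.
    """
    events = []
    previous = set()
    for frame_index, ids in enumerate(ids_per_frame):
        current = set(ids)
        for instance_id in sorted(current - previous):
            events.append((frame_index, "enter", instance_id))
        for instance_id in sorted(previous - current):
            events.append((frame_index, "exit", instance_id))
        previous = current
    return events


print(appearance_events([{1}, {1, 2}, {2}, set()]))
```

An ID that vanishes behind an occlusion and later returns would show up here as an exit followed by an enter, so a real tool would need extra logic to tell the two cases apart.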

Propagation of ground truth

Propagation only required for moving objects

Working with every frame results in very little movement between frames (0-5 pixels). Search for the closest similar contour in the new frame where the pixels on the object side of the contour appear similar…

Need to identify & correct:
1. when new objects enter the scene.
2. when objects reappear from behind occlusions.
3. when internal holes appear within an object.
4. when errors occur in the propagation.
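The small inter-frame motion is what makes a local search practical. The sketch below is a deliberately simplified stand-in for the contour search described above: rather than matching contours, it tries every rigid shift of the whole object mask within ±5 pixels and keeps the shift whose pixels look most similar in the new frame. The greyscale list-of-rows frame format, the sum-of-absolute-differences cost, and the name `propagate_mask` are assumptions made for illustration.

```python
def propagate_mask(prev_frame, next_frame, mask_pixels, max_shift=5):
    """Estimate where an object's pixels moved to in the next frame.

    prev_frame, next_frame: greyscale images as lists of rows of ints.
    mask_pixels: list of (x, y) positions of the object in prev_frame.
    Returns ((dx, dy), shifted_pixels).
    ASSUMPTION: the object moves rigidly by at most max_shift pixels;
    this replaces the contour matching described in the slides.
    """
    height, width = len(next_frame), len(next_frame[0])
    best_shift, best_cost = (0, 0), float("inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            cost = 0
            inside = True
            for x, y in mask_pixels:
                nx, ny = x + dx, y + dy
                if not (0 <= nx < width and 0 <= ny < height):
                    inside = False  # shift pushes the object off the image
                    break
                cost += abs(prev_frame[y][x] - next_frame[ny][nx])
            if inside and cost < best_cost:
                best_shift, best_cost = (dx, dy), cost
    dx, dy = best_shift
    return best_shift, [(x + dx, y + dy) for x, y in mask_pixels]
```

A rigid shift cannot cope with new objects, reappearance after occlusion, or holes opening inside an object, which is consistent with the list of cases that still need manual identification and correction.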

The positive spin…

PixelGT is a new form of ground truth which will facilitate:

pixel-level assessment of trackers;

semi-automatic generation of object location, bounding boxes and event descriptions;

comparison of trackers which assess performance using different types of ground truth.

Questions

Do we really need this detailed ground truth?

Should we change anything about the specification?

Other questions?
