

MULTI-TARGET TRACKING THROUGH OPPORTUNISTIC CAMERA CONTROL IN A RESOURCE-CONSTRAINED MULTIMODAL SENSOR NETWORK

Jayanth Nayak1, Luis Gonzalez-Argueta2, Bi Song2,

Amit Roy-Chowdhury2, Ertem Tuncel2

Department of Electrical Engineering,

University of California, Riverside

ICDSC'08, 9/8/2008

Bourns College of Engineering, Information Processing Laboratory, www.ipl.ee.ucr.edu


Overview

Introduction

Problem Formulation

Audio And Video Processing

Camera Control Strategy

Computing Final Tracks Of All Targets

Experimental Results

Conclusion

Acknowledgements


Motivation

Obtaining multi-resolution video from a highly active environment requires a large number of cameras.

Disadvantages:

Cost of buying, installing, and maintaining

Bandwidth limitations

Processing and storage

Privacy

Our goal: minimize the number of cameras through a control mechanism that directs the cameras' attention to the interesting parts of the scene.


Proposed Strategy

Audio sensors direct the pan/tilt/zoom of the camera to the location of the event.

Audio data intelligently turns on the camera and video data turns off the camera.

Audio and video data are fused to obtain tracks of all targets in the scene.


Example Scenario


An example scenario in which audio can be used to efficiently control two video cameras. Four tracks need to be inferred. Directly indicated on the tracks are the time instants of interest, i.e., the initiation and end of each track, mergings, splittings, and cross-overs. The mergings and crossovers are further emphasized by X. The two innermost tracks coincide over the entire time interval (t2, t3). The cameras C1 and C2 need to be panned, zoomed, and tilted as decided based on their own output and that of the audio sensors a1, . . . , aM.


Relation To Previous Work

Prior work fuses simultaneously captured audio and video data; our audio and video data are captured over disjoint time intervals.

Prior work uses dense networks of vision sensors; to cover a large field, we focus on controlling a reduced set of vision sensors.

Our video and audio data are analyzed from dynamic scenes.


Problem Formulation

Audio sensors A = {a1, . . . , aM} are distributed across a ground plane R. R is also observable from a set of controllable cameras C = {c1, . . . , cL}.

However, the entire region R may not be covered by a single set of camera settings.

p-tracks: tracks belonging to targets

a-tracks: tracks obtained by clustering audio

Resolving p-track ambiguity:

Camera control

Person matching


Tracking System Overview


Overall camera control system. Audio sensors A = {a1, . . . , aM} are distributed across regions Ri. The set of audio clusters is denoted by Bt, and Kt− represents the set of confirmed a-tracks estimated from observations before time t. P/T/Z cameras are denoted by C = {c1, . . . , cL}. Ground plane positions are denoted by Otk.


Processing Audio and Video

a-tracks are clusters of audio data above an amplitude threshold

Tracked using a Kalman filter

In video, people are detected using histograms of oriented gradients (HOG) and tracked using an auxiliary particle filter
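As a rough illustration of the a-track filtering step, the sketch below applies a constant-velocity Kalman filter to the 2-D ground-plane centroid of an audio cluster that exceeds the amplitude threshold. It is a minimal sketch, not the authors' implementation; the state layout, noise values, and class name are assumptions.

```python
import numpy as np

class ConstantVelocityKF:
    """2-D constant-velocity Kalman filter; state = [x, y, vx, vy]."""

    def __init__(self, dt=1.0, process_var=1e-2, meas_var=1e-1):
        self.F = np.array([[1, 0, dt, 0],
                           [0, 1, 0, dt],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)  # state transition
        self.H = np.array([[1, 0, 0, 0],
                           [0, 1, 0, 0]], dtype=float)  # observe position only
        self.Q = process_var * np.eye(4)                 # process noise
        self.R = meas_var * np.eye(2)                    # measurement noise
        self.x = np.zeros(4)
        self.P = np.eye(4)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, z):
        """z: 2-D centroid of an audio cluster above the amplitude threshold."""
        y = z - self.H @ self.x                          # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)         # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]
```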


Mapping From Image Plane to Ground Plane

Learned parameters are used to transform tracks from image plane to ground plane

Estimate projective transformation matrix H during a calibration phase

Precompute H for each PTZ setting of each camera


[Figure: calibration view of the ground plane showing the vanishing line.]
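A minimal sketch of the mapping step, assuming the projective transformation H has already been estimated for the current PTZ setting during calibration. The table of homographies, its keys, and the identity stand-in are hypothetical placeholders, not values from the paper.

```python
import numpy as np

def image_to_ground(point_uv, H):
    """Map an image-plane point (u, v) to ground-plane coordinates (X, Y)
    via a 3x3 projective transformation H (one precomputed per PTZ setting)."""
    p = np.array([point_uv[0], point_uv[1], 1.0])  # homogeneous image point
    q = H @ p
    return q[:2] / q[2]                            # dehomogenize

# Hypothetical table of homographies, keyed by (camera id, PTZ setting index),
# filled in during the calibration phase.
H_table = {("c1", 0): np.eye(3)}                   # identity as a stand-in

ground_xy = image_to_ground((320.0, 480.0), H_table[("c1", 0)])
```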


Camera Control

Goal: avoid ambiguity, or disambiguate, when tracks are created or deleted, intersect, or merge

Set pan/tilt/zoom parameters accordingly


Setting Camera Parameters

Heuristic algorithm:

Cover the ground plane with regions Ril, where region Ril lies in the field of view of camera cl

Camera parameters:

The tracking algorithm specifies a point of interest x from the last known a-track

If no camera is on, find the region Ril containing x

Reassign a camera and set its parameters when x approaches the boundary of the current Ril


Each region Ril has a precomputed camera setting (Pil, Til, Zil).
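A minimal sketch of such a region-based selection rule, assuming rectangular ground-plane regions. The region extents, camera names, PTZ values, and boundary margin below are illustrative assumptions, not values from the paper.

```python
# Sketch of the camera-selection heuristic: regions cover the ground plane,
# each with a precomputed PTZ setting; a camera is (re)assigned when the
# point of interest x nears the boundary of the current region.

REGIONS = {
    # (camera, region index) -> (x_min, x_max, y_min, y_max, (pan, tilt, zoom))
    ("c1", 0): (0.0, 5.0, 0.0, 5.0, (-10.0, -5.0, 1.5)),
    ("c1", 1): (5.0, 10.0, 0.0, 5.0, (10.0, -5.0, 1.5)),
    ("c2", 0): (0.0, 10.0, 5.0, 10.0, (0.0, -8.0, 1.0)),
}
BOUNDARY_MARGIN = 0.5  # meters; reassign when x gets this close to the edge

def containing_region(x):
    """Find a (camera, region) whose ground-plane rectangle contains point x."""
    for key, (xmin, xmax, ymin, ymax, ptz) in REGIONS.items():
        if xmin <= x[0] <= xmax and ymin <= x[1] <= ymax:
            return key, ptz
    return None, None                                 # x outside all regions

def near_boundary(x, region):
    xmin, xmax, ymin, ymax, _ = REGIONS[region]
    return min(x[0] - xmin, xmax - x[0], x[1] - ymin, ymax - x[1]) < BOUNDARY_MARGIN

def select_camera(x, active_region):
    """Return (camera/region, ptz) for point of interest x from the last a-track."""
    if active_region is None or near_boundary(x, active_region):
        return containing_region(x)                   # (re)assign a camera
    return active_region, REGIONS[active_region][4]   # keep the current assignment
```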


Camera Control Based on Track Trajectories

[Figure: ground-plane track plots (location in meters vs. time in seconds) for the trajectory events that drive camera control: intersection, separation, merger, sudden appearance, undetected disappearance, and sudden disappearance. Annotations mark where the system switches to video.]
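As a rough sketch of how such events could trigger the switch to video: flag any pair of a-tracks whose predicted ground-plane positions come within a threshold of each other, signalling an imminent merger or crossover. The threshold value and the simple pairwise rule are assumptions for illustration, not the authors' exact logic.

```python
import numpy as np

MERGE_THRESHOLD = 1.0  # meters; assumed value

def detect_events(tracks):
    """tracks: dict mapping track id -> predicted ground-plane position (shape (2,)).
    Returns pairs of a-tracks that are about to merge or cross over."""
    ids = list(tracks)
    events = []
    for i in range(len(ids)):
        for j in range(i + 1, len(ids)):
            d = np.linalg.norm(tracks[ids[i]] - tracks[ids[j]])
            if d < MERGE_THRESHOLD:
                events.append((ids[i], ids[j]))  # ambiguity imminent: switch to video
    return events
```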


Creating Final Tracks Of All Targets

Bipartite graph matching over a set of color histograms

We collect features as the target enters and exits the scene in video.

For every new a-track, features are collected from a small set of frames.

The weight of an edge is the distance between the observed video features.

Additionally, audio constraints are enforced on the weights.
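A small sketch of this matching step as a minimum-weight assignment between exit and entry nodes. The histogram distance, the infeasible-pair rule standing in for the audio constraint, and all constants are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

INFEASIBLE = 1e6  # large weight for pairs ruled out by the audio constraint

def histogram_distance(h1, h2):
    """Bhattacharyya-style distance between two normalized color histograms."""
    return 1.0 - np.sum(np.sqrt(h1 * h2))

def match_tracks(exit_hists, entry_hists, audio_feasible):
    """exit_hists[i], entry_hists[j]: normalized color histograms;
    audio_feasible[i][j]: False if audio timing rules out linking exit i to entry j."""
    W = np.full((len(exit_hists), len(entry_hists)), INFEASIBLE)
    for i, he in enumerate(exit_hists):
        for j, hn in enumerate(entry_hists):
            if audio_feasible[i][j]:
                W[i, j] = histogram_distance(he, hn)
    rows, cols = linear_sum_assignment(W)            # minimum-weight bipartite matching
    return [(i, j) for i, j in zip(rows, cols) if W[i, j] < INFEASIBLE]
```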


Creating Final Tracks Using Bipartite Matching


[Figure: tracking in audio and video vs. tracking in audio only (location in meters vs. time in seconds). Entry and exit nodes are labeled a+ through g, and the corresponding bipartite graph matchings are shown with and without the audio constraint.]

Tracking in audio and video: three tracks are recovered by matching every node (entry into and exit from the scene) where video was captured.

Tracking in audio only: two tracks are recovered; however, the red and green tracks follow the wrong paths. Audio alone cannot disambiguate the targets once their clusters have merged.


Experimental Results


[Figure: inter p-track distance at a merge event and at a crossover event.]


Experimental Results (Cont.)



Conclusion

Goal: minimize camera usage in a surveillance system

Save power, bandwidth, storage and money

Alleviate privacy concerns

Proposed a probabilistic scheme for opportunistically deploying cameras in a multimodal network.

Showed detailed experimental results on real data collected in multimodal networks.

The final set of tracks is computed by bipartite matching


Acknowledgements

This work was supported by Aware Building: ONR-N00014-07-C-0311 and the NSF CNS 0551719.

Bi Song2 and Amit Roy-Chowdhury2 were additionally supported by NSF-ECCS 0622176 and ARO-W911NF-07-1-0485.


Thank You.

Questions?

Jayanth Nayak1

[email protected]

Luis Gonzalez-Argueta2, Bi Song2,

Amit Roy-Chowdhury2, Ertem Tuncel2

{largueta,bsong,amitrc,ertem}@ee.ucr.edu
