vldb 2008. 2008-08-28 2 motivation traclass: trajectory feature generation trajectory...

TRACLASS: TRAJECTORY CLASSIFICATION USING HIERARCHICAL REGION-BASED AND TRAJECTORY-BASED CLUSTERING JAE-GIL LEE, JIAWEI HAN, XIAOLEI LI, HECTOR GONZALEZ UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN VLDB 2008



TRACLASS: TRAJECTORY CLASSIFICATION USING HIERARCHICAL REGION-BASED AND TRAJECTORY-BASED CLUSTERING

JAE-GIL LEE, JIAWEI HAN, XIAOLEI LI, HECTOR GONZALEZ
UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

VLDB 2008


Outline
- Motivation
- TraClass: Trajectory Feature Generation
  - Trajectory Partitioning
  - Region-Based Clustering
  - Trajectory-Based Clustering
- Classification Strategy
- Performance Evaluation
- Related Work
- Conclusions


Classification

Diagram labels: Training data, Features, Class label, Feature Generation, Classifier, Prediction, Scope of this paper

Training data:

NAME   RANK            YEARS  TENURED
Mike   Assistant Prof  3      no
Mary   Assistant Prof  7      yes
Bill   Professor       2      yes
Jim    Associate Prof  7      yes
Dave   Assistant Prof  6      no
Anne   Associate Prof  3      no

Unseen data: (Jeff, Professor, 4, ?)
Prediction: Tenured = Yes


Trajectory Data
- A trajectory is a sequence of the locations and timestamps of a moving object
- Examples (pictured): Hurricanes, Turtles, Vessels, Vehicles
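As an illustration of the definition above, a labeled trajectory could be represented as follows. This is a minimal sketch: the names `Point`, `Trajectory`, and `segments`, and the choice of planar `x`/`y` coordinates, are assumptions made for the example and do not come from the talk.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Point:
    x: float  # assumed planar coordinate, e.g. a projected easting
    y: float  # assumed planar coordinate, e.g. a projected northing
    t: float  # timestamp, e.g. seconds since the start of tracking


@dataclass
class Trajectory:
    object_id: str
    label: str           # class label of the moving object, e.g. "tanker"
    points: List[Point]  # ordered by timestamp

    def segments(self) -> List[Tuple[Point, Point]]:
        """Consecutive point pairs, i.e. the line segments of the path."""
        return list(zip(self.points, self.points[1:]))


traj = Trajectory(
    "vessel-1",
    "tanker",
    [Point(0.0, 0.0, 0.0), Point(1.0, 0.5, 60.0), Point(2.0, 0.5, 120.0)],
)
```

A trajectory with three points yields two line segments, which is the granularity at which partitioning and clustering operate in the later slides.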


Trajectory Classification
- Definition: The process of predicting the class labels of moving objects based on their trajectories and other features
- Applications: Homeland security, weather forecasting, law enforcement, etc.
- Example: Detection of vessel types (e.g., container ships, tankers, and fishing boats) from satellite images


Previous Studies
- Several trajectory classification methods have been proposed, mainly in the fields of pattern recognition, bioengineering, and video surveillance
- A common characteristic of earlier methods is that they use the shapes of whole trajectories to do classification, e.g., by using the HMM
- Note: Although a few methods partition trajectories, the purpose of their partitioning is just to approximate or smooth trajectories


Problem Statement and Observations
- Problem Statement: Given a set of labeled trajectories, generate discriminative trajectory features that make a specific class distinguishable from other classes
- Observations:
  (1) Discriminative features are likely to appear at parts of trajectories, not at whole trajectories
  (2) Discriminative features appear not only as common movement patterns, but also as regions


Motivating Example
- Observation 1: Parts of trajectories near the container port and near the refinery enable us to distinguish between container ships and tankers even if they share common long paths
- Observation 2: Those in the fishery enable us to recognize fishing boats even if they have no common path there

Figure labels: Region, Sub-trajectory


Limitations of Earlier Methods
- The classification accuracy of earlier methods might not be high since the overall shapes of whole trajectories are similar to each other
  ⇒ Our framework TraClass aims at discovering both region and sub-trajectory features

Figure label: Overall shape


Overall Procedure of TraClass
- Extract features in a top-down fashion, first by region-based clustering and then by trajectory-based clustering

Diagram: Trajectory partitions → Region-Based Clustering → Trajectory partitions in non-homogeneous regions → Trajectory-Based Clustering → Region-based and trajectory-based clusters
- Region-Based Clustering: Recursively quantize non-homogeneous regions
- Trajectory-Based Clustering: Repeatedly find finer-granularity clusters


Our Contributions
- Achieve high classification accuracy owing to the collaboration between the two types of clustering
  - Region features ← Region-based clustering
  - Sub-trajectory features ← Trajectory partitioning and trajectory-based clustering


Where We Are Now

[Diagram: the overall procedure of TraClass, from trajectory partitions through region-based clustering and trajectory-based clustering to the resulting clusters]


Class-Conscious Trajectory Partitioning
1. Trajectories are partitioned based on their shapes as in the partition-and-group framework [12]
2. Trajectory partitions are further partitioned by the class labels
   - The real interest here is to guarantee that trajectory partitions do not span the class boundaries

Figure labels: Additional partitioning points; Non-discriminative; Discriminative; Class A; Class B


Partitioning Condition
- If the most prevalent class around one endpoint of a trajectory partition is different from that around the other endpoint, partition it further
- Example (figure): the prevalent class around one endpoint is Class A and around the other is Class B, so the trajectory partition needs to be further partitioned
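The partitioning condition can be sketched in a few lines. This is only an illustration under stated assumptions: the slide does not say how "around an endpoint" is measured, so the fixed `radius`, the Euclidean distance, and the function names `prevalent_class` and `needs_further_partitioning` are all hypothetical choices.

```python
from collections import Counter
from math import hypot


def prevalent_class(point, labeled_points, radius):
    """Most common class label among labeled points within `radius` of `point`.

    Returns None when no labeled point lies inside the radius.
    """
    px, py = point
    nearby = [label for (x, y), label in labeled_points
              if hypot(x - px, y - py) <= radius]
    if not nearby:
        return None
    return Counter(nearby).most_common(1)[0][0]


def needs_further_partitioning(segment, labeled_points, radius):
    """True when the prevalent classes around the two endpoints differ."""
    start, end = segment
    a = prevalent_class(start, labeled_points, radius)
    b = prevalent_class(end, labeled_points, radius)
    return a is not None and b is not None and a != b


# Toy data: class A dominates near x = 0, class B dominates near x = 10.
labeled_points = [((0, 0), "A"), ((0, 1), "A"), ((1, 0), "B"),
                  ((10, 0), "B"), ((10, 1), "B"), ((9, 0), "A")]

spans_boundary = needs_further_partitioning(((0, 0), (10, 0)), labeled_points, 2.0)
stays_inside = needs_further_partitioning(((0, 0), (0, 1)), labeled_points, 2.0)
```

In this toy data the long segment from (0, 0) to (10, 0) crosses from an A-dominated area into a B-dominated one and is flagged, while the short segment inside the A-dominated area is not.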


Where We Are Now

[Diagram: the overall procedure of TraClass, from trajectory partitions through region-based clustering and trajectory-based clustering to the resulting clusters]


Region-Based Clustering
- Discover regions that have trajectories mostly of one class regardless of their movement patterns
  - The region-based cluster is a set of trajectory partitions of the same class within a rectangular region regardless of their movement patterns

Figure panels: (1), (2)


Desirable Properties of Region-Based Clustering
- Homogeneity: The class distribution in each region should be as homogeneous as possible
- Conciseness: The number of regions should be as small as possible
- Note: The two properties are contradictory to each other
  ⇒ Need to find a good tradeoff between the properties

Figure labels: One large region; Many small regions; homogeneity; conciseness


Translation into MDL Optimization
- The minimum description length (MDL) cost consists of the description cost and the code cost
  - The former measures conciseness, and the latter homogeneity
- The best hypothesis is the one that minimizes the sum of the description cost and the code cost
- Finding a good quantization translates to finding the best hypothesis using the MDL principle
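To make the tradeoff concrete, the sketch below scores a quantization as description cost plus code cost. The slides do not give the formulas, so both terms here are stand-ins chosen for illustration: a fixed, hypothetical `bits_per_region` charge per region for the description cost, and the number of labels times their class entropy for the code cost.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of the class distribution of `labels`."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def code_cost(regions):
    """Assumed stand-in: bits needed to encode class labels inside each region.

    Low when every region is homogeneous.
    """
    return sum(len(labels) * entropy(labels) for labels in regions if labels)


def description_cost(regions, bits_per_region=8.0):
    """Assumed stand-in: a fixed charge per region. Low when regions are few."""
    return bits_per_region * len(regions)


def mdl_cost(regions, bits_per_region=8.0):
    return description_cost(regions, bits_per_region) + code_cost(regions)


# Each region is represented only by the class labels of its trajectory partitions.
one_region = [["A"] * 10 + ["B"] * 10]   # concise but mixed
two_regions = [["A"] * 10, ["B"] * 10]   # less concise but homogeneous
```

Under these assumptions the single mixed region costs 8 + 20 = 28 bits and the two pure regions cost 16 + 0 = 16 bits, so the finer quantization is the better hypothesis for this toy data.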


Region-Based Clustering Algorithm
- Progressively find a better partitioning alternately for the X axis and for the Y axis as long as the MDL cost decreases
  - Select the partition that has the maximum code cost and divide it into two parts in order to decrease the MDL cost

Figure panels: (1), (2), (3), (4)
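The greedy loop described above can be sketched on a single axis. This is a simplification, not the algorithm from the talk: the real procedure alternates between the X and Y axes, and the slide does not say where a partition is cut, so the midpoint split, the per-interval charge `bits_per_interval`, and the entropy-style code cost are assumptions.

```python
from collections import Counter
from math import log2


def code_cost(labels):
    """Assumed code cost: bits needed to encode the labels of one interval."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum(c * log2(c / n) for c in Counter(labels).values())


def labels_in(interval, points):
    lo, hi = interval
    return [label for x, label in points if lo <= x < hi]


def mdl(intervals, points, bits_per_interval):
    return (bits_per_interval * len(intervals)
            + sum(code_cost(labels_in(iv, points)) for iv in intervals))


def greedy_split(points, lo, hi, bits_per_interval=4.0):
    """Split the interval with the largest code cost while the MDL cost drops."""
    intervals = [(lo, hi)]
    while True:
        worst = max(intervals, key=lambda iv: code_cost(labels_in(iv, points)))
        mid = (worst[0] + worst[1]) / 2  # assumed split position
        candidate = sorted([iv for iv in intervals if iv != worst]
                           + [(worst[0], mid), (mid, worst[1])])
        if mdl(candidate, points, bits_per_interval) < mdl(intervals, points, bits_per_interval):
            intervals = candidate
        else:
            return intervals


# Class A occupies x < 4 and class B occupies x > 4.
points = ([(x, "A") for x in (0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5)]
          + [(x, "B") for x in (4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5)])
result = greedy_split(points, 0.0, 8.0)
```

On this toy data the first split at x = 4 separates the two classes and lowers the cost from 18 to 8 bits; any further split only adds description cost, so the loop stops with two intervals.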


Where We Are Now

[Diagram: the overall procedure of TraClass, from trajectory partitions through region-based clustering and trajectory-based clustering to the resulting clusters]


Trajectory-Based Clustering
- Discover sub-trajectories that indicate common movement patterns of each class
  - The trajectory-based cluster is a set of trajectory partitions of the same class which share a common movement pattern

Figure panels: (3), (4)


Trajectory-Based Clustering Algorithm
- Similar to our trajectory clustering algorithm [12], but incorporates the class labels into clustering
  - The algorithm is based on DBSCAN [5]
  - If an ε-neighborhood contains trajectory partitions mostly of the same class, it is used for clustering; otherwise, it is discarded immediately

Figure labels: Non-homogeneous ε-neighborhood of L1 (X); Homogeneous ε-neighborhood of L2 (O)
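The class-aware neighborhood test might look like the sketch below. Several pieces are assumptions made for the example: the distance between trajectory partitions is replaced by the Euclidean distance between segment midpoints, "mostly of the same class" is read as a hypothetical purity threshold `min_purity`, and the rest of the DBSCAN-style cluster expansion is not shown.

```python
from collections import Counter
from math import hypot


def midpoint(segment):
    (x1, y1), (x2, y2) = segment
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def eps_neighborhood(i, segments, eps):
    """Indices of segments whose midpoint lies within `eps` of segment i's midpoint.

    Midpoint distance is an assumed stand-in for a proper segment distance.
    """
    cx, cy = midpoint(segments[i])
    result = []
    for j, seg in enumerate(segments):
        mx, my = midpoint(seg)
        if hypot(mx - cx, my - cy) <= eps:
            result.append(j)
    return result


def is_homogeneous(neighbor_ids, labels, min_purity=0.8):
    """True when the majority class covers at least `min_purity` of the neighborhood."""
    counts = Counter(labels[j] for j in neighbor_ids)
    return counts.most_common(1)[0][1] / len(neighbor_ids) >= min_purity


segments = [((0, 0), (1, 0)), ((0, 0.2), (1, 0.2)), ((0, 0.4), (1, 0.4)),
            ((5, 0), (6, 0)), ((5, 0.2), (6, 0.2))]
labels = ["A", "A", "A", "A", "B"]

kept = is_homogeneous(eps_neighborhood(0, segments, 1.0), labels)
discarded = not is_homogeneous(eps_neighborhood(3, segments, 1.0), labels)
```

The neighborhood of the first segment contains only class A partitions and would be used for clustering, while the neighborhood of the fourth segment is split evenly between A and B and would be discarded.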


Selection of Trajectory-Based Clusters
- After trajectory-based clusters are found, discriminative clusters are selected for effective classification
  - If the average distance to other clusters of different classes is high, the discriminative power of the cluster is high
- Example (figure): clusters C1 and C2 among Class A and Class B; C1 is more discriminative than C2
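The selection criterion can be illustrated with a small score function. This is a sketch under assumptions: each cluster is reduced to a hypothetical centroid, and the distance between clusters is taken to be the Euclidean distance between centroids, neither of which is specified on the slide.

```python
from math import hypot


def discriminative_power(name, clusters):
    """Average distance from cluster `name` to all clusters of other classes.

    `clusters` maps a cluster name to (class_label, (centroid_x, centroid_y)).
    """
    label, (cx, cy) = clusters[name]
    others = [center for other_label, center in clusters.values()
              if other_label != label]
    if not others:
        return float("inf")  # no competing class at all
    return sum(hypot(x - cx, y - cy) for x, y in others) / len(others)


clusters = {
    "C1": ("A", (0.0, 0.0)),   # far from every class B cluster
    "C2": ("A", (8.0, 0.0)),   # close to the class B clusters
    "C3": ("B", (10.0, 0.0)),
    "C4": ("B", (12.0, 0.0)),
}

power_c1 = discriminative_power("C1", clusters)
power_c2 = discriminative_power("C2", clusters)
```

Here C1 scores 11.0 and C2 scores 3.0, which mirrors the slide's example in which C1 is more discriminative than C2.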


Generation of Cluster Links
- A cluster link is a sequence of connectable (i.e., consecutive) trajectory-based clusters
  - Two clusters are connectable if they share enough trajectories (more formally, the ratio of common trajectories is higher than χ)
- Cluster links also make it possible to derive whole-trajectory features
  - Cluster links are added to the set of trajectory-based clusters for use in classification
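The connectability test can be sketched as a set comparison. The slide does not define the ratio's denominator, so dividing by the size of the smaller cluster is an assumption, as are the function names; the sketch also checks only pairwise connectability and does not model the order in which trajectories pass through the clusters.

```python
from itertools import combinations


def common_ratio(a, b):
    """Share of common trajectory ids, relative to the smaller cluster (assumed)."""
    return len(a & b) / min(len(a), len(b))


def connectable_pairs(clusters, chi):
    """Pairs of cluster names whose ratio of common trajectories exceeds chi."""
    return [(p, q) for p, q in combinations(sorted(clusters), 2)
            if common_ratio(clusters[p], clusters[q]) > chi]


# Each cluster is represented by the ids of the trajectories it draws partitions from.
clusters = {
    "C1": {"t1", "t2", "t3", "t4"},
    "C2": {"t2", "t3", "t4", "t5"},
    "C3": {"t5", "t9"},
}

pairs = connectable_pairs(clusters, chi=0.5)
```

With χ = 0.5, C1 and C2 share three of four trajectories and are connectable, while C2 and C3 share exactly half and fall just short of the strict threshold.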


Classification Strategy
1. Partition trajectories by considering the class labels
2. Perform region-based clustering
3. Perform trajectory-based clustering
4. Select discriminative trajectory-based clusters
5. Find cluster links from trajectory-based clusters
6. Convert each trajectory into a feature vector
   - Each feature is either a region-based cluster or a trajectory-based cluster
   - The i-th entry of a feature vector is the frequency that the i-th feature occurs in the trajectory
7. Feed the feature vectors to the SVM
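Step 6 amounts to counting how often each feature occurs in a trajectory. The sketch below assumes that an earlier stage has already mapped every partition of a trajectory to the name of the cluster it belongs to; the feature names and the function `to_feature_vector` are hypothetical, and the SVM training of step 7 is not shown.

```python
from collections import Counter


def to_feature_vector(trajectory_features, feature_index):
    """Frequency of each feature, in the fixed order given by `feature_index`."""
    counts = Counter(trajectory_features)
    return [counts.get(feature, 0) for feature in feature_index]


# Fixed feature order: two region-based clusters, then two trajectory-based clusters.
feature_index = ["R1", "R2", "T1", "T2"]

# Clusters hit by the partitions of one trajectory (assumed to be precomputed).
trajectory = ["R1", "T1", "T1", "T2"]

vector = to_feature_vector(trajectory, feature_index)
```

The resulting vector is `[1, 0, 2, 1]`. Vectors of this form, paired with the trajectories' class labels, are what would be handed to an SVM implementation.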


Experimental Setting (1/2)
- Use three real trajectory data sets
  - Animal movement data set
    - Movements of elk, deer, and cattle for the years 1993 through 1996
    - Three classes: Elk, Deer, and Cattle
    - Number of trajectories (points): 38 (7117), 30 (4333), and 34 (3540)
  - Vessel navigation data set
    - Navigation paths of two vessels in August 2000
    - Two classes: Point Lobos and Point Sur
    - Number of trajectories (points): 600 (65500) and 550 (125750)
  - Hurricane track data set
    - Atlantic hurricanes for the years 1950 through 2006
    - Two classes: Category 2 and Category 3
    - Number of trajectories (points): 61 (2459) and 72 (3126)
- Randomly select 20% of trajectories for the test set


Experimental Setting (2/2)
- Measure classification accuracy, training time, and prediction time for the three data sets
- Compare two versions of the algorithm
  - TB-ONLY: Perform trajectory-based clustering only
  - RB-TB: Perform both types of clustering
  - TB-ONLY is expected to be no worse than earlier methods since it also discovers whole-trajectory features by cluster-link generation

Classification accuracy = (# of test trajectories correctly classified) / (total # of test trajectories)
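The evaluation protocol, a random 20% hold-out followed by the accuracy ratio, can be sketched as follows. The function names, the fixed `seed`, and the rounding of the test-set size are assumptions for the example; the talk does not say how the random selection was implemented.

```python
import random


def holdout_split(items, test_fraction=0.2, seed=0):
    """Randomly hold out a fraction of the items; returns (train, test)."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predicted, actual):
    """Number of correctly classified test trajectories over the total number."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


train, test = holdout_split(range(100))
score = accuracy(["A", "B", "A", "A"], ["A", "B", "B", "A"])
```

With 100 items the split yields 80 training and 20 test items, and three correct predictions out of four give an accuracy of 0.75.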


Overall Results

Data Set              Animal            Vessel            Hurricane
Version               TB-ONLY  RB-TB    TB-ONLY  RB-TB    TB-ONLY  RB-TB
Accuracy (%)          50.0     83.3     84.4     98.2     65.4     73.1
Training Time (ms)    3542     2406     44683    22902    331      317
Prediction Time (ms)  104      98       722      608      48       46

- The classification accuracy of RB-TB is much higher than that of TB-ONLY
- The training time of RB-TB is much shorter than that of TB-ONLY


Features for the Animal Data
- Data: Three classes (Red: Elk, Blue: Deer, Black: Cattle)
- Features: 10 region-based clusters, 37 trajectory-based clusters
- Accuracy = 83.3%


Features for the Hurricane Data
- Legend: Red: Category 2; Blue: Category 3
- Features: 1 region-based cluster, 15 trajectory-based clusters
- Figure label: Gulf of Mexico
- Stronger hurricanes tend to go further than weaker ones
- These hurricanes entered the Gulf of Mexico and thus stayed longer at sea before landfall than others; they are likely to become strong because hurricanes gain energy from the evaporation of warm ocean water


Results for Synthetic Data
- Figure: Effect of region-based clustering
- Figure: Effect of the data size (scalability test)


Related Work
- Pattern recognition [1], e.g., speech, handwriting, signature, and gesture recognition
  - Classifying human motion trajectories
  - Employing the hidden Markov model (HMM)
- Bioengineering [16]
  - Classifying biological motion trajectories
- Video surveillance [15]
  - Detecting suspicious behaviors of pedestrians
- Time-series classification [20,21]
- Moving-object anomaly detection [14]


Conclusions
- A novel and comprehensive feature generation framework for trajectories has been proposed
- The primary advantage is the high classification accuracy owing to the collaboration between the two types of clustering
- Various real-world applications, e.g., vessel classification, can benefit from our framework


Thank You!