
Crime Forecasting Using Boosted Ensemble Classifiers Chung-Hsien Yu

Crime Forecasting Using Boosted Ensemble Classifiers

Department of Computer Science University of Massachusetts Boston

2012 GRADUATE STUDENTS SYMPOSIUM

Presented by: Chung-Hsien Yu

Advisor: Prof. Wei Ding

Abstract

• Retaining spatiotemporal knowledge by applying multi-clustering to monthly aggregated crime data.

• Training baseline learners on the clusters obtained from clustering.

• Adapting a greedy algorithm to find a rule-based ensemble classifier during each boosting round.

• Pruning the ensemble classifier to prevent it from overfitting.

• Constructing a strong hypothesis based on the ensemble classifiers obtained from each round.


Original Data

[Figure: original data sets: Residential Burglary, 911 Calls, Arrest, Foreclosure, Street Robbery]


Aggregated Data

[Figure: aggregated crime counts]
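A minimal sketch of how incident records might be aggregated into per-cell counts. The square grid, the `(x, y)` coordinate format, the `cell_size` parameter, and the function name `aggregate_to_grid` are all assumptions made for illustration; the slides do not give these details.

```python
from collections import Counter

def aggregate_to_grid(incidents, cell_size):
    """Count incidents per grid cell.

    incidents: iterable of (x, y) coordinates (hypothetical format).
    cell_size: side length of a square cell, in the same units as x and y.
    Returns a Counter mapping (row, col) -> number of incidents.
    """
    counts = Counter()
    for x, y in incidents:
        col = int(x // cell_size)
        row = int(y // cell_size)
        counts[(row, col)] += 1
    return counts

# Hypothetical incident coordinates, not taken from the presentation.
incidents = [(0.5, 0.2), (0.7, 0.9), (1.4, 0.1), (2.2, 2.8)]
grid = aggregate_to_grid(incidents, cell_size=1.0)
print(grid[(0, 0)], grid[(0, 1)], grid[(2, 2)])
```

Running the same aggregation once per month would give one grid of counts per month.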


Monthly Data

[Figure: monthly grids of aggregated crime counts]


Monthly Clusters (k=3)


Monthly Clusters (k=4)
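The slides do not say which clustering algorithm produced these clusters. As a minimal sketch, assuming plain k-means over each grid cell's vector of monthly counts, running the clustering for several values of k could look like this. The `kmeans` helper and the sample counts are hypothetical.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means (Lloyd's algorithm) on equal-length tuples of numbers.

    Initial centroids are the first k points, so results are deterministic.
    Returns a list with the cluster index assigned to each point.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            labels[i] = dists.index(min(dists))
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical monthly counts for six grid cells (three months each).
cells = [(0, 1, 0), (9, 8, 9), (4, 5, 4), (1, 0, 0), (8, 9, 9), (5, 4, 5)]
for k in (3, 4):
    print(k, kmeans(cells, k))
```

Each value of k gives a different partition of the same cells, which is one way to read "multi-clustering" in the abstract.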


Flow Chart


Algorithm (Part I)


Algorithm (Part II)


Confidence Value

From AdaBoost (Schapire & Singer 1998) we have

$$Z_t = \sum_i D_t(i) \exp(-\alpha_t y_i h_t(x_i))$$

Let $w(i) = D_t(i)$ and $C_R = \alpha_t h_t(x_i)$, and ignore the boosting round $t$:

$$Z = \sum_i w(i) \exp(-C_R y_i)$$

$C_R$ is defined as the confidence value for the rule $R$, and $C_R = 0$ if $x_i \notin R$.


Objective Function

Therefore,

$$Z = W_0 + W_+ \exp(-C_R) + W_- \exp(C_R)$$

where

$$W_0 = \sum_{\{i \mid x_i \notin R\}} w(i), \quad W_+ = \sum_{\{i \mid x_i \in R \text{ and } y_i = 1\}} w(i), \quad W_- = \sum_{\{i \mid x_i \in R \text{ and } y_i = -1\}} w(i)$$

$$W_0 + W_+ + W_- = 1$$


Minimum Z Value

$$\frac{dZ}{dC_R} = -W_+ \exp(-C_R) + W_- \exp(C_R) = 0$$

$$\rightarrow W_- \exp(C_R) = W_+ \exp(-C_R)$$

$$\rightarrow \ln(W_- \exp(C_R)) = \ln(W_+ \exp(-C_R))$$

$$\rightarrow \ln(W_-) + C_R = \ln(W_+) - C_R$$

$$\rightarrow 2C_R = \ln(W_+) - \ln(W_-)$$

$$\rightarrow C_R = \frac{1}{2} \ln\left(\frac{W_+}{W_-}\right)$$

$Z$ has the minimum value when $C_R = \frac{1}{2} \ln(W_+ / W_-)$, since

$$\frac{d^2 Z}{dC_R^2} = W_+ \exp(-C_R) + W_- \exp(C_R) > 0$$
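As a quick numerical check of this result, the sketch below evaluates Z over a grid of candidate confidence values and compares the best candidate with the closed form. The weights are made-up values chosen only so that they sum to 1, and the function names are hypothetical.

```python
import math

def z_value(c_r, w0, w_plus, w_minus):
    """Z = W0 + W+ * exp(-C_R) + W- * exp(C_R)."""
    return w0 + w_plus * math.exp(-c_r) + w_minus * math.exp(c_r)

def optimal_confidence(w_plus, w_minus):
    """Closed-form minimizer C_R = 1/2 * ln(W+ / W-); needs both weights > 0."""
    return 0.5 * math.log(w_plus / w_minus)

# Hypothetical weights satisfying W0 + W+ + W- = 1.
w0, w_plus, w_minus = 0.5, 0.4, 0.1

c_star = optimal_confidence(w_plus, w_minus)

# Brute-force search over C_R in [-3, 3] with step 0.001.
candidates = [i / 1000 for i in range(-3000, 3001)]
c_grid = min(candidates, key=lambda c: z_value(c, w0, w_plus, w_minus))

print(round(c_star, 3), round(c_grid, 3))
print(round(z_value(c_star, w0, w_plus, w_minus), 3))
```

The closed form is undefined when either weight is zero, so a practical implementation would need some form of smoothing; the slides do not say which one is used.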


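BuildChain Function

Substituting $C_R = \frac{1}{2} \ln(W_+ / W_-)$ into $Z$ and using $W_0 + W_+ + W_- = 1$:

$$Z = W_0 + 2\sqrt{W_+ W_-} = 1 - \left(\sqrt{W_+} - \sqrt{W_-}\right)^2$$

Repeatedly add a classifier to $R$ until $\left|\sqrt{W_+} - \sqrt{W_-}\right|$ is maximized. This minimizes $Z$ as well.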
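The greedy chain construction can be sketched as follows. This is an illustration under stated assumptions, not the presenter's implementation: each baseline classifier is modeled as a boolean predicate, a chain covers an example only when every classifier in it fires, and growth stops when no remaining classifier improves the score. The classifiers and data are hypothetical.

```python
import math

def chain_weights(chain, xs, ys, w):
    """Return (W+, W-) for the examples covered by every classifier in chain."""
    w_plus = w_minus = 0.0
    for x, y, wi in zip(xs, ys, w):
        if all(clf(x) for clf in chain):
            if y == 1:
                w_plus += wi
            else:
                w_minus += wi
    return w_plus, w_minus

def score(chain, xs, ys, w):
    """|sqrt(W+) - sqrt(W-)|; maximizing it minimizes Z."""
    w_plus, w_minus = chain_weights(chain, xs, ys, w)
    return abs(math.sqrt(w_plus) - math.sqrt(w_minus))

def build_chain(classifiers, xs, ys, w):
    """Greedily append the classifier that most improves the score."""
    chain, best = [], 0.0
    remaining = list(classifiers)
    while remaining:
        gains = [(score(chain + [c], xs, ys, w), i) for i, c in enumerate(remaining)]
        top, idx = max(gains)
        if top <= best:
            break  # no remaining classifier improves the score
        best = top
        chain.append(remaining.pop(idx))
    return chain

def high_burglary(x):  # hypothetical baseline classifier
    return x[0] >= 2

def many_calls(x):  # hypothetical baseline classifier
    return x[1] >= 3

# Hypothetical examples: (burglary count, 911-call count) with labels +1/-1.
xs = [(3, 5), (2, 1), (0, 4), (3, 4), (1, 0), (2, 2)]
ys = [1, -1, -1, 1, -1, -1]
w = [1 / 6] * 6
chain = build_chain([high_burglary, many_calls], xs, ys, w)
print([clf.__name__ for clf in chain])
```

On this toy data the chain first picks `many_calls` and then adds `high_burglary`, because the two together cover only positive examples.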


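PruneChain Function

Loss Function:

$$\hat{Z} = W_0 + W_+ \exp(-\hat{C}_R) + W_- \exp(\hat{C}_R)$$

Minimize $\hat{Z}$ by removing the last classifier from $R$.

$\hat{C}_R$ is obtained from GrowSet.

$W_0$, $W_+$, and $W_-$ are obtained from applying $R$ to PruneSet.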
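A minimal sketch of the pruning step, assuming a chain is a list of boolean predicates that covers an example only when all of them fire, and that pruning tries every shorter prefix of the chain. The `eps` smoothing term is an added assumption to keep the logarithm defined when a weight is zero; the data and classifier names are hypothetical.

```python
import math

def covered_weights(chain, data):
    """Return (W+, W-) over (x, y, weight) triples covered by all of chain."""
    w_plus = sum(wi for x, y, wi in data if y == 1 and all(c(x) for c in chain))
    w_minus = sum(wi for x, y, wi in data if y == -1 and all(c(x) for c in chain))
    return w_plus, w_minus

def confidence(chain, grow_set, eps=1e-3):
    """C_R = 1/2 ln(W+ / W-) on GrowSet; eps smoothing is an assumption."""
    w_plus, w_minus = covered_weights(chain, grow_set)
    return 0.5 * math.log((w_plus + eps) / (w_minus + eps))

def prune_loss(chain, grow_set, prune_set):
    """Loss using the GrowSet confidence value and the PruneSet weights."""
    c_hat = confidence(chain, grow_set)
    w_plus, w_minus = covered_weights(chain, prune_set)
    w0 = sum(wi for _, _, wi in prune_set) - w_plus - w_minus
    return w0 + w_plus * math.exp(-c_hat) + w_minus * math.exp(c_hat)

def prune_chain(chain, grow_set, prune_set):
    """Try every shorter prefix and keep the one with the lowest loss."""
    best = list(chain)
    best_loss = prune_loss(best, grow_set, prune_set)
    current = list(chain)
    while len(current) > 1:
        current = current[:-1]  # remove the last classifier
        loss = prune_loss(current, grow_set, prune_set)
        if loss < best_loss:
            best, best_loss = current, loss
    return best

def many_calls(x):  # hypothetical baseline classifier
    return x[1] >= 3

def high_burglary(x):  # hypothetical baseline classifier
    return x[0] >= 2

# Hypothetical (x, y, weight) triples; each subset's weights sum to 1.
grow_set = [((3, 5), 1, 0.25), ((0, 4), -1, 0.25), ((3, 4), 1, 0.25), ((1, 0), -1, 0.25)]
prune_set = [((2, 3), -1, 0.25), ((4, 6), 1, 0.25), ((1, 5), 1, 0.25), ((0, 1), -1, 0.25)]

pruned = prune_chain([many_calls, high_burglary], grow_set, prune_set)
print([clf.__name__ for clf in pruned])
```

Here the two-classifier chain looks perfect on the grow set but covers a negative example in the prune set, so its large confidence value is penalized and the shorter chain wins.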


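Update Weights

Calculate $\hat{C}_R$ with ensemble classifier $R$ on the entire data set.

$$w_{t+1}(i) = \frac{w_t(i) \exp(-\hat{C}_R y_i)}{Z_t}$$

where $Z_t = \sum_i w_t(i) \exp(-\hat{C}_R y_i)$ and $\hat{C}_R = 0$ if $x_i \notin R$.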
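The weight update can be sketched as below, following the standard confidence-rated AdaBoost update. The `covers` predicate, the confidence value `c_hat = 0.5`, and the data are hypothetical and chosen only to show the effect.

```python
import math

def update_weights(xs, ys, w, covers, c_hat):
    """Return normalized weights after one boosting round.

    covers(x) is True when x satisfies the ensemble classifier R.
    Uncovered examples get prediction 0, so only normalization changes them.
    """
    raw = [
        wi * math.exp(-y * c_hat) if covers(x) else wi
        for x, y, wi in zip(xs, ys, w)
    ]
    z = sum(raw)  # normalization factor Z
    return [r / z for r in raw]

# Hypothetical data: R covers x >= 3 and predicts the positive class.
xs = [1, 2, 3, 4]
ys = [1, -1, 1, -1]
w = [0.25, 0.25, 0.25, 0.25]
new_w = update_weights(xs, ys, w, covers=lambda x: x >= 3, c_hat=0.5)
print([round(v, 3) for v in new_w])
```

The covered example that R gets wrong (x = 4) gains weight, the covered example it gets right (x = 3) loses weight, and the uncovered examples are only rescaled.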


Strong Hypothesis

At the end of boosting, there are $T$ chains, $R_1, \dots, R_T$.

$$H(x) = \operatorname{sign}\left(\sum_{t=1}^{T} \hat{C}_{R_t}\right)$$

$$\hat{C}_{R_t} = 0 \text{ if } x \notin R_t$$
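Evaluating the strong hypothesis amounts to summing the confidence values of the chains that cover an example and taking the sign. In this sketch each chain is represented as a hypothetical `(covers, c_hat)` pair, and a sum of exactly zero is mapped to -1, which is an assumption the slides do not address.

```python
def strong_hypothesis(x, chains):
    """Return +1 or -1 from the sign of the summed confidence values.

    chains: list of (covers, c_hat) pairs, one per boosting round.
    A chain that does not cover x contributes 0 to the sum.
    """
    total = sum(c_hat for covers, c_hat in chains if covers(x))
    return 1 if total > 0 else -1

# Hypothetical chains and confidence values from three boosting rounds.
chains = [
    (lambda x: x[0] >= 2, 0.8),
    (lambda x: x[1] >= 3, -0.3),
    (lambda x: x[0] >= 5, 0.4),
]
print(strong_hypothesis((3, 4), chains))
print(strong_hypothesis((0, 4), chains))
print(strong_hypothesis((0, 0), chains))
```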

SUMMARY

1. Grid cells with similar crime counts that are clustered together are also geographically close to each other on the map. In addition, the clustering separates high-crime-rate areas from low-crime-rate areas.

2. The original data set is randomly divided into two subsets each round. The greedy weak-learner algorithm uses confidence-rated evaluation to "chain" the baseline classifiers using one subset, and then "trims" the chain using the other subset.

3. The strong hypothesis is easy to calculate.


Q & A

THANK YOU!!
