Crime Forecasting Using Boosted Ensemble Classifiers

Department of Computer Science, University of Massachusetts Boston

2012 GRADUATE STUDENTS SYMPOSIUM

Presented by: Chung-Hsien Yu

Advisor: Prof. Wei Ding


Abstract

• Retaining spatiotemporal knowledge by applying multi-clustering to monthly aggregated crime data.

• Training baseline learners on the clusters obtained from this clustering.

• Adapting a greedy algorithm to find a rule-based ensemble classifier during each boosting round.

• Pruning the ensemble classifier to prevent it from overfitting.

• Constructing a strong hypothesis based on the ensemble classifiers obtained from each round.
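The first step above (clustering monthly aggregated counts with more than one cluster count) can be sketched as follows. This is only an illustration: the slides do not name the clustering algorithm, so plain k-means is an assumption here, and the `cells` data and the `kmeans` helper are hypothetical.

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on equal-length count vectors (assumed algorithm)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each grid cell to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; empty clusters stay put.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Hypothetical data: one row per grid cell, one column per data layer
# (for example burglary count and 911 calls) for a single month.
cells = [[0, 1], [1, 0], [0, 0], [8, 9], [9, 8], [4, 5], [5, 4]]

# Cluster the same month with k=3 and k=4, as in the two cluster slides.
clusterings = {k: kmeans(cells, k)[0] for k in (3, 4)}
```

Each entry of `clusterings` assigns every grid cell a cluster label; baseline learners would then be trained per cluster.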


Original Data


Residential Burglary

911 Calls

Arrest

Foreclosure

Street Robbery


Aggregated Data

[Figure: crime counts aggregated per grid cell]

Monthly Data

[Figure: monthly grids of per-cell crime counts]


Monthly Clusters (k=3)



Monthly Clusters (k=4)



Flow Chart



Algorithm (Part I)



Algorithm (Part II)



Confidence Value

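From AdaBoost (Schapire & Singer 1998) we have

$$Z_t = \sum_i D_t(i) \exp\left(-\alpha_t y_i h_t(x_i)\right)$$

Let $w(i) = D_t(i)$ and $C_R = \alpha_t h_t(x_i)$, and ignore the boosting round $t$:

$$Z = \sum_i w(i) \exp\left(-C_R\, y_i\right)$$

$C_R$ is defined as the confidence value for the rule $R$, and $C_R = 0$ if $x_i \notin R$.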


Objective Function

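Therefore,

$$Z = W_0 + W_+ \exp(-C_R) + W_- \exp(C_R)$$

where

$$W_0 = \sum_{\{i \mid x_i \notin R\}} w(i)$$

$$W_+ = \sum_{\{i \mid x_i \in R \text{ and } y_i = 1\}} w(i)$$

$$W_- = \sum_{\{i \mid x_i \in R \text{ and } y_i = -1\}} w(i)$$

$$W_0 + W_+ + W_- = 1$$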
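To make the grouping concrete, here is a small sketch that computes the three weight sums for one rule and checks that the grouped form of $Z$ matches the example-by-example sum. The toy weights, labels, coverage flags and the value of `c_r` are hypothetical, chosen only for illustration.

```python
import math

def weight_sums(w, y, covered):
    """Return (W0, W_plus, W_minus) for weights w, labels y in {-1, +1}, coverage flags."""
    w0 = sum(wi for wi, c in zip(w, covered) if not c)
    wp = sum(wi for wi, yi, c in zip(w, y, covered) if c and yi == 1)
    wm = sum(wi for wi, yi, c in zip(w, y, covered) if c and yi == -1)
    return w0, wp, wm

def z_direct(w, y, covered, c_r):
    """Z as the raw sum: confidence c_r on covered examples, 0 on the rest."""
    return sum(wi * math.exp(-(c_r if c else 0.0) * yi)
               for wi, yi, c in zip(w, y, covered))

# Hypothetical toy data: five examples with uniform weights.
w = [0.2] * 5
y = [1, 1, -1, 1, -1]
covered = [True, True, True, False, False]
c_r = 0.3  # arbitrary confidence value for the check

w0, wp, wm = weight_sums(w, y, covered)
z_grouped = w0 + wp * math.exp(-c_r) + wm * math.exp(c_r)
```

Uncovered examples contribute $\exp(0) = 1$ times their weight, which is why they collapse into the single term $W_0$.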


Minimum Z Value

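$$\frac{dZ}{dC_R} = -W_+ \exp(-C_R) + W_- \exp(C_R) = 0$$

$$\rightarrow W_- \exp(C_R) = W_+ \exp(-C_R)$$

$$\rightarrow \ln\left(W_- \exp(C_R)\right) = \ln\left(W_+ \exp(-C_R)\right)$$

$$\rightarrow \ln(W_-) + C_R = \ln(W_+) - C_R$$

$$\rightarrow 2 C_R = \ln\left(\frac{W_+}{W_-}\right)$$

$$\rightarrow C_R = \frac{1}{2} \ln\left(\frac{W_+}{W_-}\right)$$

$Z$ has the minimum value when $C_R = \frac{1}{2} \ln\left(W_+ / W_-\right)$, because

$$\frac{d^2 Z}{dC_R^2} = W_+ \exp(-C_R) + W_- \exp(C_R) > 0$$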
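A quick numeric check of this result, as a sketch: the weight sums below are hypothetical, and the code assumes both $W_+$ and $W_-$ are strictly positive so that the logarithm is defined.

```python
import math

def z_value(w0, wp, wm, c_r):
    """Z = W0 + W+ exp(-C_R) + W- exp(C_R)."""
    return w0 + wp * math.exp(-c_r) + wm * math.exp(c_r)

def best_confidence(wp, wm):
    """Closed-form minimizer C_R = 0.5 * ln(W+ / W-); requires wp > 0 and wm > 0."""
    return 0.5 * math.log(wp / wm)

# Hypothetical weight sums that add up to 1.
w0, wp, wm = 0.4, 0.4, 0.2

c_star = best_confidence(wp, wm)
z_min = z_value(w0, wp, wm, c_star)

# Substituting c_star back into Z gives W0 + 2 * sqrt(W+ * W-).
z_closed_form = w0 + 2.0 * math.sqrt(wp * wm)
```

Evaluating `z_value` slightly to either side of `c_star` gives a larger value, which agrees with the positive second derivative.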


BuildChain Function

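$$W_0 + W_+ + W_- = 1$$

Substituting $C_R = \frac{1}{2} \ln\left(W_+ / W_-\right)$ into $Z$ gives $Z = W_0 + 2\sqrt{W_+ W_-} = 1 - \left(\sqrt{W_+} - \sqrt{W_-}\right)^2$.

Repeatedly add a classifier to $R$ until it maximizes $\left|\sqrt{W_+} - \sqrt{W_-}\right|$. This will minimize $Z$ as well.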
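The greedy growth step might look roughly like the sketch below. Several things here are assumptions rather than details from the slides: a chain is treated as a conjunction (it covers an example only when every classifier in it fires), growth stops when no remaining classifier improves the objective, and the function names and toy data are hypothetical.

```python
import math

def objective(w, y, covered):
    """|sqrt(W+) - sqrt(W-)| over the examples covered by the chain."""
    wp = sum(wi for wi, yi, c in zip(w, y, covered) if c and yi == 1)
    wm = sum(wi for wi, yi, c in zip(w, y, covered) if c and yi == -1)
    return abs(math.sqrt(wp) - math.sqrt(wm))

def covers(chain, x):
    # Assumption: a chain covers x only if every classifier in it fires.
    return all(clf(x) for clf in chain)

def build_chain(classifiers, xs, y, w):
    """Greedily append the classifier that most increases the objective."""
    chain, best = [], 0.0
    remaining = list(classifiers)
    while remaining:
        scored = [(objective(w, y, [covers(chain + [clf], x) for x in xs]), i)
                  for i, clf in enumerate(remaining)]
        score, i = max(scored)
        if score <= best:  # no remaining classifier improves the chain
            break
        best = score
        chain.append(remaining.pop(i))
    return chain, best

# Hypothetical baseline classifiers on (burglary_count, call_count) features.
clf_a = lambda x: x[0] >= 1
clf_b = lambda x: x[1] >= 2

xs = [(5, 9), (4, 8), (3, 2), (0, 7), (1, 1), (0, 0)]
y = [1, 1, 1, -1, -1, -1]
w = [1 / 6] * 6

chain, score = build_chain([clf_a, clf_b], xs, y, w)
```

On this toy set each classifier alone also covers one negative example, while the two together cover only the three positives, so the chain grows to length two.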


PruneChain Function

Loss function:

Minimize the loss by removing the last classifier from $R$.

$\hat{C}_R$ is obtained from GrowSet. The weight sums are obtained from applying $R$ to PruneSet.
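One way the pruning step could work is sketched below. The loss used here is an assumption: it has the same form as $Z$, with the confidence taken from GrowSet and the covered weight sums (called `vp` and `vm` here, hypothetical names) taken from PruneSet, in the style of the SLIPPER rule learner. The smoothing term `eps`, the conjunction semantics of a chain, and the toy data are also assumptions.

```python
import math

def covered_sums(chain, xs, y, w):
    """Weights of covered positives and covered negatives (conjunction assumed)."""
    wp = wm = 0.0
    for x, yi, wi in zip(xs, y, w):
        if all(clf(x) for clf in chain):
            if yi == 1:
                wp += wi
            else:
                wm += wi
    return wp, wm

def prune_chain(chain, grow, prune, eps=1e-6):
    """grow and prune are (xs, y, w) triples; returns the lowest-loss prefix of chain."""
    best_chain, best_loss = None, float("inf")
    candidate = list(chain)
    while candidate:
        gp, gm = covered_sums(candidate, *grow)
        # Confidence from GrowSet; eps is an assumed smoothing term to avoid log(0).
        c_hat = 0.5 * math.log((gp + eps) / (gm + eps))
        vp, vm = covered_sums(candidate, *prune)  # weight sums from PruneSet
        # Assumed loss: same form as Z, evaluated on PruneSet.
        loss = (1 - vp - vm) + vp * math.exp(-c_hat) + vm * math.exp(c_hat)
        if loss < best_loss:
            best_chain, best_loss = list(candidate), loss
        candidate.pop()  # remove the last classifier and try the shorter chain
    return best_chain, best_loss

# Hypothetical classifiers and data sets; weights in each set sum to 1.
clf_a = lambda x: x[0] >= 1
clf_b = lambda x: x[1] >= 2
grow = ([(5, 9), (4, 8), (3, 2), (0, 7), (1, 1), (0, 0)],
        [1, 1, 1, -1, -1, -1], [1 / 6] * 6)
prune = ([(2, 3), (3, 1), (2, 5), (0, 4)], [1, 1, -1, -1], [0.25] * 4)

pruned, loss = prune_chain([clf_a, clf_b], grow, prune)
```

In this toy case the two-classifier chain looks perfect on GrowSet and receives a very large confidence, but it covers a negative example in PruneSet, so the shorter chain has the lower loss and is kept.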


Update Weights

Calculate $\hat{C}_R$ with ensemble classifier $R$ on the entire data set.

where
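A sketch of the reweighting step, assuming it follows the standard confidence-rated AdaBoost update: each covered example's weight is multiplied by $\exp(-y_i \hat{C}_R)$, uncovered examples keep their weight (their confidence is 0), and everything is divided by the normalizer $Z$. The toy data are hypothetical.

```python
import math

def update_weights(w, y, covered, c_r):
    """Return (new_weights, Z) after one confidence-rated boosting update."""
    raw = [wi * math.exp(-yi * c_r) if c else wi
           for wi, yi, c in zip(w, y, covered)]
    z = sum(raw)  # normalizer, equal to W0 + W+ exp(-C_R) + W- exp(C_R)
    return [r / z for r in raw], z

# Hypothetical toy data: W+ = 0.4, W- = 0.2, W0 = 0.4.
w = [0.2] * 5
y = [1, 1, -1, 1, -1]
covered = [True, True, True, False, False]
c_r = 0.5 * math.log(0.4 / 0.2)

new_w, z = update_weights(w, y, covered, c_r)
```

The covered negative example (index 2) ends up with more weight than the covered positives, so the next boosting round concentrates on the examples this chain got wrong.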


Strong Hypothesis

At the end of boosting, there are $T$ chains, $R_1, \dots, R_T$, where

$$\hat{C}_{R_t} = 0 \quad \text{if } x \notin R_t$$
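Under the usual confidence-rated boosting convention, the strong hypothesis is the sign of the summed confidences of the chains that cover $x$. The sketch below assumes that form, assumes a chain is a conjunction of its classifiers, and breaks a zero sum toward the negative class; the chains and confidence values are hypothetical.

```python
def strong_hypothesis(chains, x):
    """chains: list of (chain, confidence) pairs; returns +1 or -1."""
    # Chains that do not cover x contribute a confidence of 0.
    total = sum(conf for chain, conf in chains
                if all(clf(x) for clf in chain))
    return 1 if total > 0 else -1  # assumption: a zero sum predicts -1

# Hypothetical chains over (burglary_count, call_count) features.
chains = [
    ([lambda x: x[0] >= 1, lambda x: x[1] >= 2], 0.9),
    ([lambda x: x[1] >= 8], 0.4),
    ([lambda x: x[0] == 0], -0.7),
]

prediction_hot = strong_hypothesis(chains, (3, 2))   # covered by the first chain
prediction_cold = strong_hypothesis(chains, (0, 9))  # second and third chains fire
```

Prediction is a single pass over the chains, which is what makes the strong hypothesis cheap to evaluate.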


SUMMARY

1. Grid cells with similar crime counts that are clustered together are also geographically close to each other on the map. In addition, high-crime-rate and low-crime-rate areas are separated by the clustering.

2. The original data set is randomly divided into two subsets in each round. The greedy weak-learner algorithm uses confidence-rated evaluation to "chain" the baseline classifiers using one subset, and then "trims" the chain using the other subset.

3. The strong hypothesis is easy to calculate.


Q & A

THANK YOU!!
