HunchLab 2.0 Predictive Missions: Under the Hood

Date post: 12-Jan-2015
Upload: azavea
Transcript
Page 1: HunchLab 2.0 Predictive Missions: Under the Hood

340 N 12th St, Suite 402 Philadelphia, PA 19107

215.925.2600 [email protected]

www.hunchlab.com

Missions: Under the Hood

Page 2: HunchLab 2.0 Predictive Missions: Under the Hood

Amelia Longo Business Development Associate [email protected] 215.701.7715

Jeremy Heffner HunchLab Product Manager [email protected] 215.701.7712

Page 3: HunchLab 2.0 Predictive Missions: Under the Hood

Places, People, Patterns → Prioritization

Page 4: HunchLab 2.0 Predictive Missions: Under the Hood

Predictive Missions

Page 5: HunchLab 2.0 Predictive Missions: Under the Hood

It’s the fourth Tuesday in January and school is in session. There were 3 burglaries and 2 robberies yesterday. Six bars, three take-out stores, and a school are in the neighborhood. The forecast is 17° with cloudy skies. Where do you focus your 2 vehicles?

Page 11: HunchLab 2.0 Predictive Missions: Under the Hood

How would you do it?

Page 13: HunchLab 2.0 Predictive Missions: Under the Hood

Analyst Process

•  Identify relevant factors
   –  Training / Literature
   –  Experience

•  Use heuristics
   –  high concentration of past crime → higher risk
   –  near a bar on a Friday night → higher risk
   –  near the police station → lower risk
   –  concentration of ex-offenders → higher risk
   –  near transit stops → higher risk

Page 18: HunchLab 2.0 Predictive Missions: Under the Hood

?  

Page 19: HunchLab 2.0 Predictive Missions: Under the Hood

How HunchLab Works

Page 20: HunchLab 2.0 Predictive Missions: Under the Hood

term: machine learning

A computer system designed to learn how to accomplish a task by using historic data sets. There are different ways (algorithms) to accomplish this training process.

Page 21: HunchLab 2.0 Predictive Missions: Under the Hood

term: algorithm

The step-by-step procedure to accomplish a given calculation. Different algorithms have different qualities. Algorithms are used to train a machine learning model.

Page 22: HunchLab 2.0 Predictive Missions: Under the Hood

Overall Process

1.  Generate training examples of outcomes

2.  Enrich with relevant variables

3.  Build models

4.  Evaluate accuracy

5.  Select best performing model

Page 23: HunchLab 2.0 Predictive Missions: Under the Hood

Generate Examples

Page 24: HunchLab 2.0 Predictive Missions: Under the Hood

~ 500 ft cells & 1+ hour time slices

Page 26: HunchLab 2.0 Predictive Missions: Under the Hood

Data Volume

•  Space
   –  Lincoln, NE is 90 sq miles
   –  500 ft cell size creates 12,000 cells

•  Time
   –  3 years of data
   –  1 hour resolution
   –  26,000 hour blocks

•  Space x Time
   –  312,000,000 hour block cells (examples)

•  Sampling FTW!
   –  Outcomes are sparse (small % of examples have crimes)
   –  Sampling strategy preserves crime events
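The arithmetic on this slide, together with one possible sampling scheme, can be sketched as follows. The rounded figures come from the slide. The scheme itself (keep every example that contains a crime, keep a random share of the rest and up-weight them) is an assumption for illustration; the slides do not say how HunchLab samples. The names `sample_examples` and `keep_fraction` are hypothetical.

```python
import random

# Rounded figures from the slide.
CELLS = 12_000          # 500 ft cells covering Lincoln, NE
HOUR_BLOCKS = 26_000    # about 3 years at 1 hour resolution (3 * 365 * 24 = 26,280)
EXAMPLES = CELLS * HOUR_BLOCKS  # 312,000,000 space-time examples


def sample_examples(examples, keep_fraction, seed=0):
    """Keep every example with a crime; keep a random share of the rest.

    `examples` is an iterable of (example_id, crime_count) pairs.
    Returns (example_id, crime_count, weight) triples. Sampled
    zero-crime examples get weight 1 / keep_fraction so that weighted
    totals still approximate the full data set (an assumed design).
    """
    rng = random.Random(seed)
    sampled = []
    for example_id, crimes in examples:
        if crimes > 0:
            sampled.append((example_id, crimes, 1.0))
        elif rng.random() < keep_fraction:
            sampled.append((example_id, crimes, 1.0 / keep_fraction))
    return sampled


# Toy data: 1,000 examples, 10 of which contain a crime.
toy = [(i, 1 if i % 100 == 0 else 0) for i in range(1000)]
subset = sample_examples(toy, keep_fraction=0.1)
crime_rows = [row for row in subset if row[1] > 0]
```

Every crime example survives the sampling, while the much larger pool of zero-crime examples shrinks to roughly `keep_fraction` of its original size.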

Page 27: HunchLab 2.0 Predictive Missions: Under the Hood

Representing Crime Theories

Page 29: HunchLab 2.0 Predictive Missions: Under the Hood

Predictive Missions

•  Crime predictions based on:
   –  Baseline crime levels
      •  Similar to traditional hotspot maps
   –  Near repeat patterns
      •  Event recency (contagion)
   –  Risk Terrain Modeling
      •  Proximity and density of geographic features
      •  Points, Lines, Polygons (bars, bus stops, etc.)
   –  Collective Efficacy
      •  Socioeconomic indicators (poverty, unemployment, etc.)

Page 30: HunchLab 2.0 Predictive Missions: Under the Hood

Predictive Missions

•  Crime predictions based on:
   –  Routine Activity Theory
      •  Offender: proximity and concentration of known offenders
      •  Guardianship: police presence (AVL / GPS)
      •  Targets: measures of exposure (population, parcels, vehicles)
   –  Temporal cycles
      •  Seasonality, time of month, day of week, time of day
   –  Recurring temporal events
      •  Holidays, sporting events, etc.
   –  Weather
      •  Temperature, precipitation

Page 31: HunchLab 2.0 Predictive Missions: Under the Hood

Representing Crime Theories: Risk Terrain Modeling

Page 32: HunchLab 2.0 Predictive Missions: Under the Hood

Gun shootings example. Source: Rutgers, http://www.rutgerscps.org/rtm/irvrtmgoogearth.htm

Page 40: HunchLab 2.0 Predictive Missions: Under the Hood

crimes  prior7  prior364  dayssincelast  bardist  dow
0       0       0         365            >2000ft  Monday
0       0       1         234            >2000ft  Monday
1       1       3         3              750ft    Tuesday
0       0       2         43             500ft    Wednesday
2       0       2         74             500ft    Friday
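As a rough illustration of how the history columns in this table could be derived, here is a minimal sketch for a single cell. The window definitions (crimes in the 7 and 364 days before the target date), the cap of 365 on `dayssincelast`, and the function name `build_features` are assumptions, not details taken from the slides. The geographic column `bardist` is left out because it needs spatial data.

```python
from datetime import date, timedelta

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def build_features(event_dates, target_date, max_days=365):
    """Derive the history columns for one cell on one target date.

    `event_dates` lists the dates of crimes in the cell, one entry per
    crime. Crimes on the target date itself are the outcome ("crimes");
    everything earlier feeds the predictor columns.
    """
    def count_prior(days_back):
        start = target_date - timedelta(days=days_back)
        return sum(1 for d in event_dates if start <= d < target_date)

    past = [d for d in event_dates if d < target_date]
    if past:
        days_since_last = min((target_date - max(past)).days, max_days)
    else:
        days_since_last = max_days  # no earlier crime on record

    return {
        "crimes": sum(1 for d in event_dates if d == target_date),
        "prior7": count_prior(7),
        "prior364": count_prior(364),
        "dayssincelast": days_since_last,
        "dow": DAYS[target_date.weekday()],
    }


# Hypothetical event history for one cell.
history = [date(2014, 6, 2), date(2014, 11, 20),
           date(2015, 1, 10), date(2015, 1, 13)]
row = build_features(history, date(2015, 1, 13))
```

With this made-up history the result has the same shape as the Tuesday row of the table: one crime on the target day, one in the prior week, three in the prior 364 days, and three days since the last event.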

Page 41: HunchLab 2.0 Predictive Missions: Under the Hood

Representing Crime Theories: Aoristic Analysis

Page 42: HunchLab 2.0 Predictive Missions: Under the Hood

crimes  probability
0       0
1       a
2       b
3       c
4       d

Page 43: HunchLab 2.0 Predictive Missions: Under the Hood

crimes  weights  prior7  prior364  dayssincelast  bardist  dow
0       1        0       0         365            >2000ft  Monday
0       1        0       1         234            >2000ft  Monday
0       0.5      1       3         3              750ft    Tuesday
1       0.5      1       3         3              750ft    Tuesday
0       0        0       2         43             500ft    Wednesday
0       0.13     0       2         74             500ft    Friday
1       0.32     0       2         74             500ft    Friday
2       0.55     0       2         74             500ft    Friday
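The weighted table can be read as one uncertain example expanded into several rows, one per possible crime count, each weighted by its probability. The sketch below shows that expansion under the assumption that the probabilities for an example sum to 1. The function name `expand_aoristic` and the dictionary layout are hypothetical.

```python
def expand_aoristic(features, count_probabilities):
    """Turn one uncertain example into several weighted examples.

    `count_probabilities` maps a possible crime count to the probability
    that this many crimes fell inside the time slice. Each possible
    count becomes its own row carrying that probability as its weight.
    """
    total = sum(count_probabilities.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return [
        {"crimes": count, "weight": probability, **features}
        for count, probability in sorted(count_probabilities.items())
    ]


# The Friday example from the table above.
friday = {"prior7": 0, "prior364": 2, "dayssincelast": 74,
          "bardist": "500ft", "dow": "Friday"}
rows = expand_aoristic(friday, {0: 0.13, 1: 0.32, 2: 0.55})
```

A learner that accepts per-row weights can then train on the expanded rows directly.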

Page 44: HunchLab 2.0 Predictive Missions: Under the Hood

Building Models

Page 45: HunchLab 2.0 Predictive Missions: Under the Hood

Models

•  Baseline models (6)
   –  Counts
      •  28 day
      •  56 day
      •  364 day
   –  Kernel Densities
      •  28 day
      •  56 day
      •  364 day

•  HunchLab models
   –  Variations of a stacked ensemble:
      •  examples → gradient boosting machine (gbm) → y/n probabilities
      •  y/n probabilities → generalized additive model (gam) → counts

Page 46: HunchLab 2.0 Predictive Missions: Under the Hood

term: decision tree

A machine learning algorithm that recursively partitions a data set based upon variable values forming a tree-like structure.

Page 47: HunchLab 2.0 Predictive Missions: Under the Hood

crimes  prior7  prior364  dayssincelast  bardist  dow
0       0       0         365            >2000ft  Monday
0       0       1         234            >2000ft  Monday
1       1       3         3              750ft    Tuesday
0       0       2         43             500ft    Wednesday
2       0       2         74             500ft    Friday

Page 48: HunchLab 2.0 Predictive Missions: Under the Hood

term: gradient boosting machine (GBM)

A machine learning algorithm that uses a series of weaker models (typically decision trees) that are trained upon the residuals of prior iterations (boosting) to form one stronger model.

Iteration 1: Build Decision Tree 1 → Predict with 1 → Calculate errors

Iteration 2: Build Decision Tree 2 → Predict with 1 & 2 → Calculate errors

Iteration 3: Build Decision Tree 3 → Predict with 1-3 → Calculate errors

…
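The build, predict, calculate-errors loop can be sketched in a few lines of plain Python. This is a deliberately simplified illustration, not HunchLab's implementation: it uses one numeric input, one-split trees ("stumps"), squared error on counts rather than yes/no probabilities, and an assumed learning rate. The names `fit_stump` and `fit_gbm` are hypothetical.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split on x that best fits the residuals.

    Requires at least two distinct x values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def fit_gbm(xs, ys, iterations=20, learning_rate=0.5):
    """Boost stumps on residuals; returns a predict(x) function."""
    base = sum(ys) / len(ys)   # start from the mean outcome
    stumps = []

    def predict(x):
        return base + learning_rate * sum(stump(x) for stump in stumps)

    for _ in range(iterations):
        # Calculate errors of the trees built so far...
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        # ...and build the next tree on those errors.
        stumps.append(fit_stump(xs, residuals))
    return predict


# dayssincelast and crimes columns from the example table.
xs = [365, 234, 3, 43, 74]
ys = [0, 0, 1, 0, 2]
model = fit_gbm(xs, ys)
```

Each added stump corrects part of what the earlier ones got wrong, so the training error shrinks as iterations accumulate. That is also why the number of iterations has to be chosen with held-out data.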

Page 49: HunchLab 2.0 Predictive Missions: Under the Hood

term: generalized additive model (GAM)

A regression model that fits smoothed functions to the input variables. Compare to a generalized linear model which fits just a single coefficient to each variable.
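In standard textbook notation (the symbols are not from the slides), the contrast between the two model families looks like this, where g is a link function, the x_j are the input variables, the beta_j are single coefficients, and the f_j are smooth functions estimated from the data:

```latex
% Generalized linear model: one coefficient per variable
g\bigl(\mathbb{E}[y]\bigr) = \beta_0 + \sum_{j=1}^{p} \beta_j x_j

% Generalized additive model: one smooth function per variable
g\bigl(\mathbb{E}[y]\bigr) = \beta_0 + \sum_{j=1}^{p} f_j(x_j)
```

The only change is that each term beta_j x_j is replaced by f_j(x_j), which lets the effect of a variable bend rather than stay a straight line.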

Page 50: HunchLab 2.0 Predictive Missions: Under the Hood

HunchLab Model Building

1.  Build a GBM
    –  examples → gradient boosting machine → y/n probabilities

Page 51: HunchLab 2.0 Predictive Missions: Under the Hood

[Diagram: 312 million examples → Sampling → 4 million examples → 4 folds of 1 mil each → GBM → Evaluate on 1 mil; figure labels: 43, 200]

Page 52: HunchLab 2.0 Predictive Missions: Under the Hood

[Diagram: 312 million examples → Sampling → 4 million examples → GBM (43)]

Page 53: HunchLab 2.0 Predictive Missions: Under the Hood

HunchLab Model Building

1.  Build a GBM
    –  examples → gradient boosting machine → y/n probabilities
       •  Segment examples into several folds
          –  For each fold build a GBM model on the rest of the data
          –  For each iteration in the GBMs:
             »  Randomly sample a portion of the data (stochastic)
             »  Adjust weights of observations (adaptive boosting)
       •  Determine how many iterations result in the most accurate model
       •  Build a GBM on all of the data for that many iterations
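The fold-based choice of iteration count can be sketched as below. The sketch assumes each fold's model has already been evaluated on its held-out fold after every boosting iteration, giving one error curve per fold. The round-robin fold assignment, the error values, and the names `make_folds` and `best_iteration_count` are illustrative assumptions.

```python
def make_folds(examples, k):
    """Assign examples to k folds round-robin (an illustrative choice)."""
    return [examples[i::k] for i in range(k)]


def training_set(folds, held_out):
    """All examples except those in the held-out fold."""
    return [ex for i, fold in enumerate(folds) if i != held_out for ex in fold]


def best_iteration_count(error_curves):
    """Pick the iteration count with the lowest mean held-out error.

    `error_curves[f][i]` is the held-out error of fold f's model after
    i + 1 boosting iterations.
    """
    n_iterations = min(len(curve) for curve in error_curves)
    mean_errors = [
        sum(curve[i] for curve in error_curves) / len(error_curves)
        for i in range(n_iterations)
    ]
    best_index = min(range(n_iterations), key=mean_errors.__getitem__)
    return best_index + 1


folds = make_folds(list(range(10)), 4)

# Hypothetical held-out error curves for 4 folds over 4 iterations.
curves = [
    [0.90, 0.60, 0.50, 0.55],
    [0.80, 0.70, 0.40, 0.45],
    [1.00, 0.50, 0.50, 0.60],
    [0.90, 0.60, 0.45, 0.50],
]
n_best = best_iteration_count(curves)
```

The final model would then be rebuilt on all of the data and stopped at `n_best` iterations.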

Page 55: HunchLab 2.0 Predictive Missions: Under the Hood

HunchLab Model Building

2.  Build a GAM
    –  y/n probabilities → generalized additive model → counts
       •  Transforms (“bends”) GBM output into counts
       •  Calibrates count levels with other key variables

Page 57: HunchLab 2.0 Predictive Missions: Under the Hood

Example

Page 58: HunchLab 2.0 Predictive Missions: Under the Hood

Lincoln NE

Page 59: HunchLab 2.0 Predictive Missions: Under the Hood

Lincoln Assaults

Page 64: HunchLab 2.0 Predictive Missions: Under the Hood

Selecting Models

Page 65: HunchLab 2.0 Predictive Missions: Under the Hood

Selecting Models

1.  Build models holding out last 28 days of data

2.  Score each model

–  Combine different metrics into a selection score

3.  Select best score

4.  Rebuild the best model (including last 28 days data)
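The four steps can be sketched with two of the baseline count models standing in as candidates. Everything beyond the 28-day holdout is an assumption for illustration: the single capture-based score replaces the combined selection score, and the data layout, the toy data, and the names `count_model`, `capture_score`, and `select_model` are hypothetical.

```python
HOLDOUT_DAYS = 28


def count_model(window):
    """Baseline model: score a cell by its crime count in the last `window` days."""
    def fit(history):
        return {cell: sum(days[-window:]) for cell, days in history.items()}
    return fit


def capture_score(scores, holdout, top_n):
    """Share of held-out crimes that fall in the top_n highest-scored cells."""
    ranked = sorted(scores, key=lambda cell: (-scores[cell], cell))
    total = sum(sum(days) for days in holdout.values())
    captured = sum(sum(holdout[cell]) for cell in ranked[:top_n])
    return captured / total if total else 0.0


def select_model(daily_counts, candidates, top_n=1):
    """Hold out the last 28 days, score each candidate, refit the winner."""
    train = {c: d[:-HOLDOUT_DAYS] for c, d in daily_counts.items()}
    holdout = {c: d[-HOLDOUT_DAYS:] for c, d in daily_counts.items()}
    results = {
        name: capture_score(fit(train), holdout, top_n)
        for name, fit in candidates.items()
    }
    best = max(results, key=results.get)
    # Rebuild the best model, this time including the last 28 days.
    return best, results, candidates[best](daily_counts)


# Toy data: daily crime counts per cell over 392 days.
daily_counts = {
    "A": [1] * 300 + [0] * 92,   # busy long ago, quiet recently
    "B": [0] * 336 + [1] * 56,   # active recently, including the holdout
    "C": [0] * 392,
}
candidates = {"28 day": count_model(28), "364 day": count_model(364)}
best, results, final_scores = select_model(daily_counts, candidates)
```

In this toy data the 28-day model ranks cell B first and captures all held-out crimes, while the 364-day model is misled by cell A's old activity, so the 28-day model is selected and refit on the full history.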

Page 66: HunchLab 2.0 Predictive Missions: Under the Hood

[Diagram: a map represented as a grid of cells; cells ranked highest to lowest along a 0% to 100% axis, with crime locations marked]

Page 67: HunchLab 2.0 Predictive Missions: Under the Hood

[Diagram: cells ranked highest to lowest along a 0% to 100% axis, annotated with two metrics: Percent of Patrol Area to Capture All Crimes, and Average Crime Rank]

[Chart: Percent of Crimes Captured vs. Percent of Patrol Area; both axes run from 0% to 100%]
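One plausible reading of these ranking metrics is sketched below. The definitions are assumptions inferred from the slide labels: cells are sorted by predicted risk, each crime takes the rank of its cell as a fraction of all cells, and the metrics summarize those fractions. The function names are hypothetical.

```python
def rank_fractions(scores, crimes):
    """Return, for each crime, its cell's rank as a fraction of all cells.

    `scores` maps cell -> predicted risk; `crimes` maps cell -> observed
    crime count. The highest-scored cell gets fraction 1 / n_cells.
    """
    ranked = sorted(scores, key=lambda cell: (-scores[cell], cell))
    position = {cell: (i + 1) / len(ranked) for i, cell in enumerate(ranked)}
    return [position[cell] for cell, n in crimes.items() for _ in range(n)]


def area_to_capture_all(scores, crimes):
    """Share of ranked cells needed before every crime is covered."""
    return max(rank_fractions(scores, crimes))


def average_crime_rank(scores, crimes):
    """Mean rank fraction over all crimes (lower is better)."""
    fractions = rank_fractions(scores, crimes)
    return sum(fractions) / len(fractions)


def share_captured(scores, crimes, area_fraction):
    """Share of crimes inside the top `area_fraction` of ranked cells."""
    fractions = rank_fractions(scores, crimes)
    return sum(1 for f in fractions if f <= area_fraction) / len(fractions)


# Toy example: 10 cells, c0 predicted riskiest; 4 crimes in 3 cells.
scores = {f"c{i}": 10 - i for i in range(10)}
crimes = {"c0": 2, "c1": 1, "c4": 1}

all_area = area_to_capture_all(scores, crimes)     # 0.5
avg_rank = average_crime_rank(scores, crimes)      # about 0.225
top_20 = share_captured(scores, crimes, 0.2)       # 0.75
```

Evaluating `share_captured` over a range of area fractions traces out the crimes-captured versus patrol-area curve.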

Page 68: HunchLab 2.0 Predictive Missions: Under the Hood

[Chart: Percent of Patrol Area to Capture All Crimes, by crime type (Assault, Burglary, MVT, Rape, Robbery); axis from 0% to 100%]

Page 69: HunchLab 2.0 Predictive Missions: Under the Hood

[Chart: Average Crime Rank, by crime type (Assault, Burglary, MVT, Rape, Robbery); axis from 0 to 0.6]

Page 70: HunchLab 2.0 Predictive Missions: Under the Hood

[Chart: Theft of Motor Vehicle; Percent of Crimes Captured (0 to 0.4) vs. Percent of Land Area (0 to 0.16)]

Page 71: HunchLab 2.0 Predictive Missions: Under the Hood

Overall Process

1.  Generate training examples of outcomes

2.  Enrich with relevant variables

3.  Build models

4.  Evaluate accuracy

5.  Select best performing model

Page 73: HunchLab 2.0 Predictive Missions: Under the Hood

Our Solution

•  Learns from several years of your data
•  Automatically determines which theories apply
   –  more than just crime data
•  Prevents over-fitting
•  Calibrates predictions
•  Selects a model based upon a blind evaluation
   –  prioritization and count-based metrics
•  But it still cannot make your morning coffee

Page 74: HunchLab 2.0 Predictive Missions: Under the Hood

Additional Information

•  How did HunchLab originate?
•  How does HunchLab represent crime theories?
•  What data is needed?
•  How does the modeling work specifically?

Page 75: HunchLab 2.0 Predictive Missions: Under the Hood

Questions

340 N 12th St, Suite 402 Philadelphia, PA 19107

215.925.2600 [email protected]

www.hunchlab.com
