Crime Early Warning: Automated Data Mining of CAD and RMS
DESCRIPTION
The genesis of HunchLab was the idea of mining law enforcement agencies' computer-aided dispatch (CAD) and records management system (RMS) databases to detect unusual levels of activity in particular areas and then send alerts to the appropriate police staff. While crime analysis tools often aim to display what has happened, a geographic early warning system, such as the one within HunchLab, tries to answer the question: "What is happening that is unusual?"
http://www.azavea.com/products/hunchlab/features/early-warning/
TRANSCRIPT
340 N 12th St, Suite 402
Philadelphia, PA 19107
www.azavea.com/hunchlab
Crime Early Warning Systems
Automated Data Mining of CAD and RMS Databases
About Us
Robert Cheetham
President & [email protected]
Jeremy Heffner
HunchLab Product [email protected]
Agenda
• Company Background
• The Backstory
• HunchLab
– Concept of Early Warning / Data Mining
– Demonstration of Hunches
– Underlying Statistics
• Q&A
About Azavea
• Founded in 2000
• 27 people
• Based in Philadelphia
– Also Boston & Minneapolis
• Geospatial + web + mobile
– Software development
– Spatial analysis services
Clients & Industries
• Public Safety
• Municipal Services
• Public Health
• Human Services
• Culture
• Elections & Politics
• Land Conservation
• Economic Development
Azavea & Governments
The Backstory
How Phila PD uses GIS
Customized Map Products
Weekly CompStat Meetings
Web Crime Analysis
[Diagram: flow of incident data. Labels: Complainant; Verizon 911; 911 Operator; Radio Dispatcher; CAD; Police Officer; Incident Report Completed by Officer; District 48 Desk; District X, District Y, District Z; INCT; PARS; Daily Download & Geocoding Routines; Maps Distributed Through Intranet, Printing, CompStat]
INCT & PARS – main database sources
Over 5,000 incidents daily, over 2 million annually
The Context
1,500,000 people
7,000 police officers
1,000 civilian employees
2,000,000 new incidents / year
3 crime analysts
What we did
• Weekly CompStat
• Lots of maps
• Automation of map creation
• Web-based systems
… but what if we could…
• Accelerate the cycle
• Proactively notify
• Automate the process
Prototype
[Diagram: prototype architecture. Labels: ArcView; VB & MapObjects; MS SQL Server; Crime Incidents Database; Shapefiles and GRIDs; Process Documentation; .ini file]
… but there was a problem …
It was crap … sort of.
We needed ….
1. Better Statistics
2. Notification
3. Very Straightforward
web-based crime analysis, early warning, and risk forecasting
Crime Analysis
– Mapping (spatial / temporal densities)
– Trending
– Intelligence Dashboard
Early Warning
– Statistical & Threshold-based Hunches (data mining)
– Alerting
Risk Forecasting
– Near Repeat Pattern
– Load Forecasting
Crime Analysis – What has happened?
– Mapping (spatial / temporal densities)
– Trending
– Intelligence Dashboard
Early Warning – What is out of the ordinary?
– Statistical & Threshold-based Hunches (data mining)
– Alerting
Risk Forecasting – What is likely to happen?
– Near Repeat Pattern
– Load Forecasting
Early Warning
Early Warning
• Geographic Early Warning System
– A system to alert staff of an unusual situation in a particular location
– Ingests data sets to automatically “cook on” and only involves staff when a statistically unusual situation is found
[Diagram: system architecture. Labels: Operational Databases; HunchLab Database; Geostatistical Engine; Alerting System]
Data Mining
• What do we mean by data mining?
– The process of “cooking on” the data to reveal something new (unusual)
• Benefits
– Automated discovery process
– Can examine large data sets without additional staff time
  • Major crime incidents
  • Minor crime incidents
– Near real-time alerts
• Limitations
– Can’t determine why something unusual is happening, only that it is happening
Early Warning
bit.ly/crimespikedetector
Demo
What is a Hunch?
• A proposed hypothesis, saved into the system, and continually tested for validity
• Incident Attribute Requirements
– Location (x, y)
– Time (timestamp)
– Classification
• Hunch Attributes
– Location (area)
– Time (recent / historic periods)
– Classification
• Analyses
– Statistical Hunch
– Threshold Hunch
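As a rough illustration of how the incident and hunch attributes listed above fit together, here is a minimal Python sketch of a threshold hunch. It is not HunchLab's implementation: the class names, the circular area, the planar distance calculation, and the rule "alert when the recent count reaches the threshold" are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import hypot


@dataclass(frozen=True)
class Incident:
    """The three attributes every incident needs (hypothetical field names)."""
    x: float
    y: float
    timestamp: datetime
    classification: str


@dataclass(frozen=True)
class ThresholdHunch:
    """A saved hypothesis: area + recent period + classification + threshold."""
    center_x: float
    center_y: float
    radius: float              # same planar units as x / y (assumption)
    recent_window: timedelta   # how far back "recent" reaches
    classification: str
    threshold: int             # alert when the count reaches this value

    def matches(self, incident: Incident, now: datetime) -> bool:
        # An incident counts only if it agrees on area, time, and classification.
        in_area = hypot(incident.x - self.center_x,
                        incident.y - self.center_y) <= self.radius
        in_window = now - self.recent_window <= incident.timestamp <= now
        return in_area and in_window and incident.classification == self.classification

    def count_matches(self, incidents, now: datetime) -> int:
        return sum(1 for incident in incidents if self.matches(incident, now))

    def is_triggered(self, incidents, now: datetime) -> bool:
        return self.count_matches(incidents, now) >= self.threshold
```

Re-running `is_triggered` each time new incidents are loaded is one way to read "continually tested for validity". A statistical hunch would replace the fixed threshold with a comparison against the historic period.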
Hunch Parameters: Location
• Address & Radius
• Precinct/County/Country
• Custom Drawn Area
• Mass Hunch
Hunch Parameters: Time
• Statistical Hunch
– Recent Past
– Historic Past
Hunch Parameters: Classification
• Category
• Time of Day
• Narrative
Hunch Helper
Email Alert
Hunch Details
The Statistics
What do we know?
• Hunch
– Geographic region (that we care about)
– Recent time frame (to alert on)
– Historic time frame (to compare against)
– Classification (that we are interested in)
                Within Hunch    Outside of Hunch
Recent past          ?                 ?
Historic past        ?                 ?
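The four unknown counts in the table above could be tallied along these lines. This is a sketch under stated assumptions rather than HunchLab's code: the function name, the `(x, y, timestamp)` tuple layout, the `in_hunch_area` callback, and the period boundaries are hypothetical, and the incidents are assumed to be already filtered to the hunch's classification.

```python
from datetime import datetime
from typing import Callable, Dict, Iterable, Tuple


def tally_hunch_table(
    incidents: Iterable[Tuple[float, float, datetime]],
    in_hunch_area: Callable[[float, float], bool],
    historic_start: datetime,
    recent_start: datetime,
    now: datetime,
) -> Dict[Tuple[str, str], int]:
    """Count incidents by period (recent / historic) and area (within / outside)."""
    table = {
        ("recent", "within"): 0,
        ("recent", "outside"): 0,
        ("historic", "within"): 0,
        ("historic", "outside"): 0,
    }
    for x, y, timestamp in incidents:
        if recent_start <= timestamp <= now:
            period = "recent"
        elif historic_start <= timestamp < recent_start:
            period = "historic"
        else:
            continue  # outside both periods, so it does not enter the table
        area = "within" if in_hunch_area(x, y) else "outside"
        table[(period, area)] += 1
    return table
```

The two periods are kept disjoint here so that every incident lands in exactly one cell of the table.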
Hypergeometric Distribution
• Arises when selecting items at random from a heterogeneous pool without replacement
– Example
  • A bag contains 45 black marbles and 5 white marbles
  • What is the chance of picking 4 white marbles when we draw 10 marbles?
Tony Smith
University of Pennsylvania
                Drawn    Not Drawn
White Marbles     4          1
Black Marbles     6         39
Hypergeometric Distribution
                Drawn        Not Drawn             Total
White Marbles   4 = k        1 = m – k             5 = m
Black Marbles   6 = n – k    39 = N + k – n – m    45 = N – m
Total           10 = n       40 = N – n            50 = N
en.wikipedia.org/wiki/Hypergeometric_distribution
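The marble question can be worked through directly with the standard hypergeometric probability mass function, P(X = k) = C(m, k) · C(N – m, n – k) / C(N, n), using the same symbols as the table (N = 50 marbles, m = 5 white, n = 10 drawn, k = 4 white drawn). The short Python sketch below is only an illustration; the function name is made up for this example.

```python
from math import comb


def hypergeom_pmf(k: int, N: int, m: int, n: int) -> float:
    """Probability of exactly k white marbles when drawing n from N, m of them white."""
    return comb(m, k) * comb(N - m, n - k) / comb(N, n)


# The marble example: 50 marbles, 5 white, draw 10, exactly 4 white.
p_four_white = hypergeom_pmf(k=4, N=50, m=5, n=10)
print(f"P(exactly 4 white) = {p_four_white:.4%}")
```

The result is roughly 0.4%, so drawing 4 of the 5 white marbles in a sample of 10 would be quite unusual if the draw were purely random.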
What do we know?
• Hunch
– Geographic region (that we care about)
– Recent time frame (to alert on)
– Historic time frame (to compare against)
– Classification (that we are interested in)
                Within Hunch    Outside of Hunch
Recent past          ?                 ?
Historic past        ?                 ?
What do we know?
• Valid Hunch
– The current condition (and all worse conditions) is unlikely to simply be due to chance
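One way to read "the current condition (and all worse conditions)" is as an upper-tail hypergeometric probability over the recent/historic by within/outside table. The Python sketch below makes that reading concrete. The mapping of counts to N, m, n, and k, the function names, and the 0.05 cutoff are assumptions for illustration; the slides do not state the exact test or threshold HunchLab uses.

```python
from math import comb


def hypergeom_tail(k: int, N: int, m: int, n: int) -> float:
    """P(X >= k): the observed count and every more extreme count."""
    upper = min(m, n)
    favorable = sum(comb(m, i) * comb(N - m, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


def is_valid_hunch(recent_in: int, recent_out: int,
                   historic_in: int, historic_out: int,
                   alpha: float = 0.05):
    """Return (valid, p_value) for the four counts of the hunch table.

    Assumed mapping onto the marble table:
      N = all incidents, m = incidents within the hunch area (both periods),
      n = incidents in the recent period (both areas), k = recent incidents within.
    alpha is a hypothetical cutoff, not a documented HunchLab setting.
    """
    N = recent_in + recent_out + historic_in + historic_out
    m = recent_in + historic_in
    n = recent_in + recent_out
    p_value = hypergeom_tail(recent_in, N, m, n)
    return p_value <= alpha, p_value


# Reusing the marble numbers as incident counts: 4 of the 5 in-area incidents are recent.
valid, p_value = is_valid_hunch(recent_in=4, recent_out=6, historic_in=1, historic_out=39)
print(valid, round(p_value, 5))
```

Under this reading, a small tail probability says only that the recent concentration of incidents in the area is unlikely to be chance. As noted under the limitations of data mining, it does not say why it is happening.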
Demo
Research Topics
Research Topics
• Mobile Interfaces
• Analysis
– Real-time Functionality
  • Consume real-time data streams & conduct ongoing analysis
Research Topics
• Risk Forecasting
– Load forecasting enhancements
  • Machine learning-based model selection
  • Weather and special events
– Combining short and long term risk forecasts
– Risk Terrain Modeling
Q&A
Contact Us
Robert Cheetham
President & [email protected]
Jeremy Heffner
HunchLab Product [email protected]