IS 6833 ANALYTICS ASSIGNMENT
PREDICTING HOMICIDE RATE IN ST. LOUIS CITY FOR 2013
Uzair Bhatti, Dan Diecker, Puji Bandi, Latoya Lewis
Homicide is the killing of one human being by another. Homicide is a general term; it includes murder, manslaughter, and other criminal homicides as well as noncriminal killings. Murder is the crime of intentionally and unjustifiably killing another. In the U.S., first-degree murder is a homicide committed with premeditation or in the course of a serious felony.
Manslaughter is commonly divided into two types. The first type (voluntary manslaughter) encompasses any homicide resulting from an intentional act done without malice or premeditation and while in the heat of passion or on sudden provocation.
The second type (involuntary manslaughter) is variously defined in different jurisdictions but often includes an element of unlawful recklessness or negligence.
Noncriminal homicides include killings committed in defense of oneself or another and deaths resulting from accidents caused by persons engaged in lawful acts.
DEFINITION
2008: 167 total murders for the year
2009: 143 total murders for the year
2010: 144 total murders for the year
2011: 113 total murders for the year
2012: 113 total murders for the year
St. Louis is ranked the fourth most dangerous city in the US for murders
HOMICIDE OVERVIEW IN ST. LOUIS
Data Segmentation
• We collected data by neighborhoods and districts
• St. Louis city consists of 9 districts, 79 neighborhoods, and 3 patrol zones
Data analysis
• Formulated four variables that correlate with the homicide rates in neighborhoods and districts
• Analyzed and depicted the relation between these four variables and homicide occurrence
Variables
• Organized data in Excel using pivot tables
• Analyzed data based on year, month, and zip code
• Built a regression analysis from all the data collected to predict the murder rate for 2013
Conclusion
• The ultimate goal is to predict the number and location of unlawful homicides in St. Louis city for 2013.
OUR APPROACH/OBJECTIVE
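The pivot-table step described above can be sketched in a few lines. This is a minimal illustration only: the incident records and the helper name `pivot_count` are hypothetical and are not the team's data or spreadsheet.

```python
from collections import Counter

# Hypothetical incident records as (year, month, zip code).
# These values are made up for illustration; they are NOT the team's data.
incidents = [
    (2011, 6, "63107"),
    (2011, 7, "63112"),
    (2012, 6, "63107"),
    (2012, 8, "63106"),
    (2012, 8, "63107"),
]


def pivot_count(records, key_index):
    """Count records grouped by one field, like a one-dimensional pivot table."""
    return Counter(record[key_index] for record in records)


by_year = pivot_count(incidents, 0)
by_month = pivot_count(incidents, 1)
by_zip = pivot_count(incidents, 2)

# Zip codes ordered from most to fewest incidents.
ranked_zips = [zip_code for zip_code, _ in by_zip.most_common()]
```

Grouping on one field at a time mirrors the year, month, and zip code breakdowns shown in the charts that follow.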
MURDER FOR PAST FOUR YEARS
[Bar chart: Murder for the Past Four Years. X-axis: year (2008 to 2012); y-axis: total murders (0 to 160)]
MURDER DISTRIBUTION BY ZIP CODE
[Bar chart: total murders by zip code (y-axis 0 to 120). Zip codes shown: 63101, 63102, 63103, 63104, 63106, 63107, 63108, 63109, 63110, 63111, 63112, 63113, 63115, 63116, 63118, 63120, 63139, 63147]
MURDERS BY MONTH
[Bar chart: total murders by month. X-axis: month (1 to 12); y-axis: total murders (0 to 70)]
As a team, Group A considered many variables to determine potential relationships to homicide.
Due to the randomness of homicides, these variables only help identify potential relationships; they do not establish causality.
Variables:
• Time: Year, Month
• Education (High School Diploma)
• Home / Renter vacancy
• Income
• Unemployment
• Age / Gender
• Race
• Location: Districts, Zip code, Neighborhoods, and Streets
• Poverty
• Drugs
• Gangs / Violence
VARIABLES CONSIDERED
Variables used to develop the Regression Model
Median Household Income
Determined median household income by Zip code
Education
Determined by average high school graduation rate by Zip code
Vacancy percentage of Rented/Owned Houses
Determined average home vacancy by Zip code
Unemployment Rate
VARIABLES USED TO PREDICT NUMBER OF HOMICIDES AND LOCATION
Based on the available data, we chose a regression model to establish a correlation between the data gathered on St. Louis city and the number of homicides.
The variables used have an established potential relationship with the number of homicides. (Source 5)
We used regression analysis to show the relationship between the significant variables and built a regression model to predict future homicides.
PREDICTION APPROACH
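To make the regression step concrete, here is a minimal sketch of ordinary least squares with a single predictor. It is a simplification: the team's model was built in Excel with up to four predictors, and the graduation-rate and homicide values below are hypothetical numbers chosen only to show the mechanics.

```python
def fit_simple_regression(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical observations (one per zip code); NOT the team's data.
graduation_rates = [60.0, 70.0, 80.0, 90.0]
homicide_counts = [18.0, 12.0, 8.0, 2.0]

intercept, slope = fit_simple_regression(graduation_rates, homicide_counts)
```

A negative slope in such a fit means that a higher graduation rate is associated with fewer homicides in the sample; it says nothing about causality.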
Inconsistent data availability
Data compatibility issues converting zip codes to districts, districts to neighborhoods
Inadequate data for the required variables
Lack of current data
Each department collects data based on different geographic specifications
CONSTRAINTS FACING THE MODEL
REGRESSION OUTPUT WITH ALL VARIABLES
The regression output indicates that the number of homicides is correlated with the high school graduation rate (P-value about 0.025)
The correlation of homicides with mean household income, unemployment rate, and vacancy is weak (P-values above 0.05)
Regression Statistics
Standard Error    5.116553
Observations      95

ANOVA
             df    SS          MS             F              Significance F
Regression    4    1442.301    360.5751634    13.77338982    0.0000000083968
Residual     90    2356.12     26.17911554
Total        94    3798.421

                       Coefficients  Standard Error  t Stat        P-value      Lower 95%     Upper 95%  Lower 95.0%  Upper 95.0%
Intercept              56.33116      12.89684        4.367826049   3.34948E-05  30.70933607   81.95299   30.70934     81.95299
Mean Household Income  -8E-05        0.000102        -0.78558752   0.434172486  -0.000281923  0.000122   -0.00028     0.000122
Graduation Rate        -0.38253      0.167728        -2.280654171  0.024929887  -0.715751242  -0.04931   -0.71575     -0.04931
Unemployment Rate      -0.50587      0.457429        -1.105891241  0.271721003  -1.414628454  0.402896   -1.41463     0.402896
Vacancy                -0.48516      0.269212        -1.802154984  0.074868842  -1.019998119  0.049675   -1.02        0.049675
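The figures in this output are tied together by standard identities, which the following sketch recomputes from the reported sums of squares and coefficients. The variable names are illustrative; the R Square value it derives is not reported in the surviving output, so treat it as a derived quantity rather than a quoted one.

```python
# Values copied from the ANOVA and coefficient output for the four-variable model.
ss_regression, df_regression = 1442.301, 4
ss_residual, df_residual = 2356.12, 90

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F is the ratio of the two mean squares.
f_statistic = ms_regression / ms_residual

# The regression standard error is the square root of the residual mean square.
standard_error = ms_residual ** 0.5

# R Square is the share of total variation explained by the regression.
r_squared = ss_regression / (ss_regression + ss_residual)

# Each t Stat is the coefficient divided by its standard error.
# Mean Household Income is omitted: its coefficient is rounded too coarsely.
coefficients = {
    "Intercept": (56.33116, 12.89684),
    "Graduation Rate": (-0.38253, 0.167728),
    "Unemployment Rate": (-0.50587, 0.457429),
    "Vacancy": (-0.48516, 0.269212),
}
t_stats = {name: coef / se for name, (coef, se) in coefficients.items()}
```

Small differences from the reported values are expected because the inputs are rounded.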
REGRESSION OUTPUT WITH DROPPED VARIABLES
More accurate estimate of homicide numbers using stronger correlating data:
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.59396
R Square             0.352788
Adjusted R Square    0.345829
Standard Error       5.141423
Observations         95

ANOVA
             df    SS          MS          F           Significance F
Regression    1    1340.038    1340.038    50.69327    2.23E-10
Residual     93    2458.383    26.43423
Total        94    3798.421

                 Coefficients  Standard Error  t Stat    P-value   Lower 95%  Upper 95%  Lower 95.0%  Upper 95.0%
Intercept        47.5547       5.823379        8.16617   1.52E-12  35.99062   59.11877   35.99062     59.11877
Graduation Rate  -0.5022       0.070535        -7.11992  2.23E-10  -0.64227   -0.36213   -0.64227     -0.36213
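As a consistency check on the single-variable output, the sketch below recomputes R Square, Multiple R, and Adjusted R Square from the ANOVA figures, and confirms that with one predictor the F statistic equals the square of the predictor's t Stat. Variable names are illustrative.

```python
# Values copied from the single-variable (Graduation Rate) regression output.
ss_regression = 1340.038
ss_total = 3798.421
n_observations = 95
n_predictors = 1

# R Square: explained variation over total variation.
r_squared = ss_regression / ss_total

# Multiple R is the square root of R Square.
multiple_r = r_squared ** 0.5

# Adjusted R Square penalizes R Square for the number of predictors.
adjusted_r_squared = 1 - (1 - r_squared) * (n_observations - 1) / (
    n_observations - n_predictors - 1
)

# With a single predictor, F equals the squared t Stat of that predictor.
t_graduation = -0.5022 / 0.070535
f_from_t = t_graduation ** 2
```

The recomputed values agree with the reported 0.352788, 0.59396, 0.345829, and 50.69327 up to rounding of the inputs.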
The number of homicides predicted for 2013 can be derived from the statistical model.
A combination of variables can be used to predict the number of homicides: high school graduation rate, home / rental vacancy, unemployment rate, and household income.
Because Significance F is less than 0.05, we can still claim that the combination of variables can be used to predict 2013 homicides.
The prediction for high school degree attainment based on the past 5 years is 26.5%, whereas the prediction based on the past 3 years is 26.6%. We therefore predict that the number of homicides will be 109.
REGRESSION MODEL EQUATION
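The transcript does not preserve the equation itself, so the following is a sketch that assumes the usual linear form (intercept plus coefficient times input) with the coefficients from the two regression outputs. The dictionary keys, the `predict` helper, and the example input of 80 are hypothetical, and the units of each input are assumed to match whatever units the team used when fitting. Because the source does not show how the total of 109 was obtained, the sketch does not attempt to reproduce that figure.

```python
# Coefficients copied from the regression outputs; key names are illustrative.
REDUCED_MODEL = {"intercept": 47.5547, "graduation_rate": -0.5022}
FULL_MODEL = {
    "intercept": 56.33116,
    "mean_household_income": -8e-05,
    "graduation_rate": -0.38253,
    "unemployment_rate": -0.50587,
    "vacancy": -0.48516,
}


def predict(model, inputs):
    """Evaluate intercept + sum(coefficient * input) for one observation."""
    total = model["intercept"]
    for name, coefficient in model.items():
        if name == "intercept":
            continue
        # Raises KeyError if a required input is missing.
        total += coefficient * inputs[name]
    return total


# Hypothetical input: a graduation rate of 80, in the units used for fitting.
example = predict(REDUCED_MODEL, {"graduation_rate": 80.0})
```

Keeping the coefficients in a dictionary makes it easy to swap between the four-variable and the single-variable model when asking "what if" questions.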
Based on current trends in the education levels of people living in these areas, this model predicts a decrease in the number of homicides for 2013.
Studies show that the graduation rate for St. Louis City has gone up significantly (currently 26.5%).
Based on past observations of murder occurrences, we predict that zip code 63107 will have the highest murder rate, followed by 63112 and 63106, respectively.
PREDICTION
Education level is a well-recorded data source and can be used for estimation of future trends in homicides.
High school graduation rate has an inverse relation with the homicide rate.
Future data-gathering should be limited to data points that are strongly correlated with homicides and easy to gather. Benefits:
• Ease of data maintenance
• Easier ‘What if?’ functionality if there are fewer data to consider
• Ease of use and timeliness of predictions: quicker to respond and deploy resources where needed
RECOMMENDATIONS
http://factfinder2.census.gov/faces/nav/jsf/pages/community_facts.xhtml
http://www.city-data.com/
http://www.city-data.com/crime/crime-St.-Louis-Missouri.html (homicide overview in St. Louis)
www.forbes.com (4th dangerous city in the US for murders)
http://www.gwu.edu/~soc/docs/Kubrin_neighborhood_correlates.pdf
www.socialexplorer.com
www.factfinder.com
www.stlrcga.org
REFERENCES