spatial analysis of county level crash risk in new jersey using...

19
Spatial Analysis of County Level Crash Risk in New Jersey using Severity 1 based Hierarchical Bayesian models 2 Sami Demiroluk, M.Sc. (Corresponding Author) 3 Graduate Research Assistant, 4 Rutgers Intelligent Transportation Systems Laboratory 5 Department of Civil and Environmental Engineering, 6 Rutgers, The State University of New Jersey, 7 623 Bowser Road, Piscataway, NJ 08854 USA, 8 Tel: (732) 445-4012 9 Fax: (732) 445-0577 10 e-mail: [email protected] 11 12 13 Kaan Ozbay, Ph.D. 14 Professor, 15 Rutgers Intelligent Transportation Systems Laboratory 16 Department of Civil and Environmental Engineering, 17 Rutgers, The State University of New Jersey, 18 623 Bowser Road, Piscataway, NJ 08854 USA, 19 Tel: (732) 445-2792 20 Fax: (732) 445-0577 21 e-mail: [email protected] 22 23 24 Word count: 25 Text: 5,248 26 Tables: 4 x 250 27 Figures: 5 x 250 28 Total: 7,498 29 Submission Date: August 1 st , 2013 30 31 32 33 34 35 36 37 38 39 TRB 2014 Annual Meeting Paper revised from original submittal.

Upload: others

Post on 21-Jun-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Spatial Analysis of County Level Crash Risk in New Jersey using …engineering.nyu.edu/citysmart/trbpaper/14-5127.pdf · 2014-02-04 · 2 1 ABSTRACT 2 This study presents an innovative

Spatial Analysis of County Level Crash Risk in New Jersey using Severity 1

based Hierarchical Bayesian models 2

Sami Demiroluk, M.Sc. (Corresponding Author) 3 Graduate Research Assistant, 4 Rutgers Intelligent Transportation Systems Laboratory 5 Department of Civil and Environmental Engineering, 6 Rutgers, The State University of New Jersey, 7 623 Bowser Road, Piscataway, NJ 08854 USA, 8 Tel: (732) 445-4012 9 Fax: (732) 445-0577 10 e-mail: [email protected] 11 12 13 Kaan Ozbay, Ph.D. 14 Professor, 15 Rutgers Intelligent Transportation Systems Laboratory 16 Department of Civil and Environmental Engineering, 17 Rutgers, The State University of New Jersey, 18 623 Bowser Road, Piscataway, NJ 08854 USA, 19 Tel: (732) 445-2792 20 Fax: (732) 445-0577 21 e-mail: [email protected] 22 23 24 Word count: 25 Text: 5,248 26 Tables: 4 x 250 27 Figures: 5 x 250 28 Total: 7,498 29 Submission Date: August 1st, 2013 30 31

32

33

34

35

36

37

38

39

TRB 2014 Annual Meeting Paper revised from original submittal.

Page 2: Spatial Analysis of County Level Crash Risk in New Jersey using …engineering.nyu.edu/citysmart/trbpaper/14-5127.pdf · 2014-02-04 · 2 1 ABSTRACT 2 This study presents an innovative

2

ABSTRACT 1

This study presents an innovative hierarchical Bayesian model for spatial modeling of county 2 level crashes in New Jersey. First, the model is estimated using raw crash counts. Then, weights 3 are applied to crashes with different severities to obtain a weighted crash count. The goal in 4 incorporating severities in the spatial model is to demonstrate the importance of representing 5 spatial variation of crashes as well as their severity. As a contribution to existing literature, crash 6 rates are also analyzed by road type. Finally, crash rate maps are developed based on modeling 7 results to visualize the effects of spatial covariates. The results of the study indicate that the most 8 influential covariate for the crashes is the road curvature, followed by roadway mileage and 9 roadway defects. It is also found that it is possible to represent the crash risk better by applying 10 severity weights to the individual crashes. The developed crash rate maps can help transportation 11 professionals on identifying and ranking the locations at an aggregate level, which requires 12 closer attention. 13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

TRB 2014 Annual Meeting Paper revised from original submittal.

Page 3: Spatial Analysis of County Level Crash Risk in New Jersey using …engineering.nyu.edu/citysmart/trbpaper/14-5127.pdf · 2014-02-04 · 2 1 ABSTRACT 2 This study presents an innovative

3

INTRODUCTION 1

Crashes are random, yet very common events on our roadways. Every year many lives are lost 2 due to crashes. According to Center for Disease Control and Prevention, motor vehicle crashes 3 are the leading cause of death among young people age 5 to 34 in the United States (1). 4 Moreover, in 2010, 2,764,122 people were injured as a result of motor vehicles crashes that year 5 (1). Therefore, there is a need to improve roadway safety. By better handling the crashes, the 6 identification of high risk locations and contributing factors, roadway safety can be improved. 7

Mapping of spatial events is a very popular technique that is widely used in disease 8 mapping by researchers. “Mapping transforms spatial data into a visual form, enhancing the 9 ability of users to observe, conceptualize, validate, and communicate information” (2). After the 10 introduction of Markov Chain Monte Carlo (MCMC) methods, hierarchical Bayesian models are 11 employed for developing risk maps of diseases in the public health studies (3). Additionally, 12 these models are used for ranking disease risks across regions based on spatial and temporal 13 variations. It is possible to make an analogy between diseases and crashes since crash risks vary 14 at different regions and there is need for ranking these regions as well. Currently, there are few 15 documented studies that use this technique to estimate spatial crash models, even though there 16 are several benefits of utilizing these methods in transportation. Since crashes are random events, 17 there is almost always high variance in crash data between different regions. This is called small 18 area problem (4). One way to overcome this problem is to use hierarchical models that account 19 for spatial correlation. The hierarchical models are also very effective in developing models in 20 presence of multiple sources of data and uncertainty in parameterizations (5). 21 Current spatial crash models generally deal with the aggregated data at the county or 22 region level (2, 6-9). Similar to this previous research, hierarchical Bayesian models will be used 23 in this study to analyze the factors contributing to crash risk (e.g. horizontal curves, roadway 24 defects and wet weather) and spatial variation among different regions. County level crash data 25 that contains all crashes from 2001 to 2010 in New Jersey will be used in the model calibration. 26 In addition to a general model for raw crash counts, we propose the use of different models for 27 roadways under various jurisdictions. One of the major contributions of this paper is the 28 development of a severity based spatial model. Since different crashes have varying impacts, we 29 propose the use of severity weights to capture the effects of the crashes more realistically. 30

31

LITERATURE REVIEW 32

Previous research on the spatial analysis of crashes utilizes data aggregated at different level. For 33 example, Noland (10) analyzed the effects of infrastructure improvements on fatalities and 34 injuries. Data from 50 states was used to develop a negative binomial regression model which 35 includes infrastructure and socio-economic variables. Additionally, Negative binomial models 36 are used by researchers to analyze the crashes at county level (11-14). While others utilized such 37 models for modeling crashes for road sections (15-16). 38 However, negative binomial regression models suffer from the fact that they cannot 39 handle spatial and temporal correlation (6-7, 17). Therefore, more sophisticated methods are 40 proposed by researchers such as hierarchical Bayesian models (2, 6-9). 41 Miaou et al. (2) developed a set of Poisson hierarchical Bayes models for estimating 42 crash risk using crash frequencies for fatal, incapacitating and non-incapacitating injuries at 43 county level on low volume roads in Texas. Spatial covariates include percentage of time that the 44

TRB 2014 Annual Meeting Paper revised from original submittal.

Page 4: Spatial Analysis of County Level Crash Risk in New Jersey using …engineering.nyu.edu/citysmart/trbpaper/14-5127.pdf · 2014-02-04 · 2 1 ABSTRACT 2 This study presents an innovative

4

road surface is wet, the number of sharp curves, and roadside hazards. However, surrogate 1 variables are used in modeling due to limitations of the data. Conditional Auto-regressive model 2 (CAR) (18) is used to model spatial correlation. Then, the samples are drawn from posterior 3 probability distributions using Markov Chain Monte Carlo (MCMC) method. In the study, it is 4 recognized that although most of the disease mapping was done for area-based data, traffic 5 crashes occur at roadway network. 6 MacNab (6) developed a hierarchical Bayesian model to analyze variations of accident 7 risk factors at the regional level. The study considers ecological (regional) analysis of accident 8 and injury variations, covariate effects, random spatial effects and age effects simultaneously. 9 Hospital separation data for 83 local health areas in British Columbia (BC), Canada is used to 10 investigate ecological/contextual factors of accident injury among males between 0 to 24 ages. 11 Different spatial covariates such as socio-economic indicators, residential environment indicators 12 (roads and parks), availability of medical services and utilizations are included in the study. The 13 model for injury rates assumes injury hospitalization count to follow a Poisson distribution and 14 the spatial random effects are modeled using a Markov random field (MRF) Gaussian 15 distribution. 16 Aguero-Valverde and Jovanis (7) estimated hierarchical Bayesian models considering 17 spatial and temporal effects and space–time interactions by using injury and fatality crash data 18 from Pennsylvania at county level. Weather conditions, transportation infrastructure, socio-19 demographics and amount of travel are included as covariates in these models. Crashes are 20 assumed to follow a Poisson distribution and spatial correlation; the CAR model is used to 21 quantify spatial correlation. Bayesian fatality and injury models are also compared to negative 22 binomial models in the study. It is reported that while negative binomial and Bayesian models 23 are consistent for significant variables. However, marginally significant variables in negative 24 binomial models are not found significant in Bayesian models. 25 Quddus (8 developed a set of negative binomial regression and Bayesian hierarchical 26 models for London crash data. Census wards in Greater London metropolitan area are used as the 27 spatial unit. Variables in different categories such as traffic characteristics, road characteristics 28 and socio-economic factors are considered as the explanatory variables. The relationships 29 between these variables and crash casualties are analyzed in the study using both techniques. It is 30 reported that “Bayesian hierarchical models are more appropriate in developing a relationship 31 between area wide traffic crashes and the contributing factors associated with the road 32 infrastructure, socioeconomic and traffic conditions of the area”. 33 Huang et al. (9) analyzed the crashes at county level in Florida using hierarchical 34 Bayesian models. It is argued that the previous research on aggregate crash data does not 35 explicitly differentiate exposure variables and risk factors which might result in inconsistent 36 results in similar studies. Hence, the exposure variables such as daily vehicle miles traveled and 37 population is explicitly controlled in the study. Existence of spatial correlation between counties 38 is checked by using Moran’s I. Moran’s I is a metric that range from -1 to 1 and a positive value 39 indicates spatial correlation in the region under investigation whereas negative value indicates 40 dispersion (19). It was found that the data was spatially correlated. They used set of road and 41 traffic related variables, and demographic and socio-economic variables in model development. 42

43

44

TRB 2014 Annual Meeting Paper revised from original submittal.

Page 5: Spatial Analysis of County Level Crash Risk in New Jersey using …engineering.nyu.edu/citysmart/trbpaper/14-5127.pdf · 2014-02-04 · 2 1 ABSTRACT 2 This study presents an innovative

1

2 3 4 5 6 7 8 9

10 11 12 13 14 15 16 17 18 19 20 21 22 23

24

25 26 27

DATA

In this stused for different the generinformatiin this stuand ratiofactored field in complain Nall crashemodel. Fthe countarea, havdevelopedaily vehconsidereover the(DVMT)observedFigure 1.

FIGUR

tudy, New Jdeveloping ttables: accid

ral informatiion on each udy (ratio ofo of wet weby the mosthe occupan

nt of pain, mNew Jersey hes in the statigure 1.a shoties in the ceve higher cred roadway hicle miles ted as the ex exposure)

) in each cod that the cr.b generates

RE 1 (a) An

Jersey Depathe crash risdent, driver, ion about thoccupant of

f crashes releather crasht severe injunt table. In

moderate injuhas 21 countte. All crashows the raw entral and norash counts network in traveled (DVxposure vari

map is devounty, whichrash count ina smoother m

(a) nnual crash

artment of Tk model (20vehicle, ped

he accident, of the vehicleated to road

hes) are conury in the c

n the databaury, incapacities and the lh records froannual cras

orthern part than the resthat region

VMT) in eacable in this veloped usinh is shown n a county imap.

counts by c

(per tho

5

Transportatio0). NJDOT kdestrian, andother tables e(s) includedway defectsntained in arash which

ase, Physicatated, and kilocation of eom 2001 to h counts by of New Jers

st of the sta(see Figure ch county fostudy. The

ng crash coin Figure 1is highly re

county in 20ousand DVM

on (NJDOT)keeps this yed occupant. Wsuch as occ

d in the cras, ratio of cra

accident andis gathered

al condition illed. each crash is2010 are uscounty in 20sey, close to

ate. This cou2). NJDOT

or each yearcrash risk (

ounts and d.b. From thlated to exp

(b) 010 (b) CrasMT).

)’s crash recearly databasWhile accidecupant table sh. The covaashes occurrd vehicle ta

from the ph of the vic

s available ated in the de010. As seeno the New Yuld be a resT publish roar on their w(the ratio ofdaily vehicl

his crash riskposure to tra

sh risks by c

cords database containingent table inccontains det

ariates considred on the cuables. Severihysical cond

ctim is code

t county levevelopment on from the fi

York metroposult of the hadway miles

website. DVMf crash freque miles travk map, it caaffic (DVM

county in 20

ase is g five cludes tailed dered urves, ity is dition ed as

el for of the igure, olitan

highly s and

MT is uency veled an be T) as

010

TRB 2014 Annual Meeting Paper revised from original submittal.

Page 6: Spatial Analysis of County Level Crash Risk in New Jersey using …engineering.nyu.edu/citysmart/trbpaper/14-5127.pdf · 2014-02-04 · 2 1 ABSTRACT 2 This study presents an innovative

1 2 3 4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

In

roadway does not roadway proposedcrashes idetailed availableor curve.variable. surface cconditionice, etc. county. Eoccurs thcategorieRoadwayproportiodefects inseveral pvariablescovariate

FIGU

n this studycharacteristinclude anymileage in

d in Miaou (is consideredroadway da

e. On the oth Therefore, The detaile

condition is rn is used to i Another co

Equivalentlyhe apparent es: driver acty factors inclon of crashen a county. Mpossible relas cannot quaes and it is e

URE 2 New

, the goal itics. This tasy spatial coeach county

(2), is followd as surroga

ata which inher hand, whthe ratio of td weather darecorded. Thindicate the ovariate cony to the other

contributintions, vehiclelude obstrucs related to Miaou (2) exationships aantify the imessential to

w Jersey roa

s to developk was challevariates. On

y. Instead, suwed to repreate variablesncludes the nhen a crash othe crashes tata is also nohe proportiopercentage sidered in thr covariates,

ng circumstae factors, pe

ction/debris iroadway de

xplains in deand has limimpacts spatiinclude them

6

ad network (

p a crash menging due tonly covariateurrogate varsent the spas to represennumber of hoccurs, the rthat occurredot available;n of crashesof time that he study is t, this data wances are reedestrian factin road, physefects is devetail that surited explanaial covariatem in the mo

(except surf

model that do the limitate directly avriables approatial covariatnt the numbhorizontal curoad charactd on the curv; however, ws that occurr

the roadwaythe number

was not availaecorded in ttors, and roasical obstrucvised to indirrogate variaatory poweres fully, theodel. Surrog

face streets)

depends on tions of the cvailable fromoach, similartes. The pro

ber of horizourves on theteristics are ves is includwhen a crashred on a wet ay is wet due

of roadwayable. Howevthe databaseadway/enviroctions, etc. Sicate the numables approar”. Althoughey are relategate variable

).

spatially vacrash data sinm NJDOT ir to the approportion of contal curvese roadway icoded as str

ded as a surrh occurs, the

roadway sue to rain or sy hazards in ver, when a e in one of onmental fac

Surrogate varmber of roaach above “mh these surred to the dees are widely

arying nce it is the roach curve . The is not raight rogate e road urface snow,

each crash

f four ctors. riable dway mixes ogate esired y and

TRB 2014 Annual Meeting Paper revised from original submittal.

Page 7: Spatial Analysis of County Level Crash Risk in New Jersey using …engineering.nyu.edu/citysmart/trbpaper/14-5127.pdf · 2014-02-04 · 2 1 ABSTRACT 2 This study presents an innovative

7

successfully used in medical sciences since outcome cannot be measured during the treatment 1

and surrogate outcomes are necessary to evaluate the effectiveness of treatment in advance. In 2

transportation many surrogate measures for safety are also defined such as aggressive lane 3

merging, speed, accepted gaps, shock-waves (21). The detailed review of the surrogate safety 4

measures can be found in (21). Thus, the use of similar surrogate measures in this in order to 5

compensate for the unavailability of certain data is consistent with the past practice. The 6

descriptive statistics of the variables included in the study is given in Table 1. 7

8 TABLE 1 Descriptive statistics of the variables included in the study (per county per year) 9

Variable Description Mean Min Max S.D.

All crashes Frequency of all crashes 14716.64 1718 57047 11429.75

Fatal Frequency of fatal crashes 30.33 5 70 14.956

Incapacitating Frequency of incapacitating injury crashes 22.6 4 57 11.207

Injury Frequency of other injury crashes 3504.54 424 10041 2276.659

Property damage Frequency of property damage crashes 11159.16 1268 27723 7226.155

DVMT Daily vehicle miles traveled (in thousands) 9486.79 2160 21587 5367.45Roadway mileage Total length of roadways (in miles) 1823.34 620 3509 758.089

Wet roadway Proportion of wet roadway crashes 0.24 0.12 0.38 0.049

Curve crashes Proportion of curve crashes 0.13 0.06 0.3 0.05

Roadway Defects Proportion of roadway related crashes 0.08 0.02 0.25 0.054

10 11 METHODOLOGY 12 13 In this study, the hierarchical Bayesian generalized linear model is considered for estimating the 14 model. The full hierarchical Bayesian modeling requires a three step approach. At the first step, 15 conditional on mean, it , weighted crash counts, itY are assumed to follow a Poisson 16

distribution: 17

~ ( )ind

it itY Po 18

where itY is the observed number of crashes in county i at time t (years), i=1,..,N and t=1,…,T; 19

and it is the mean of the Poisson process for county i at time t. The mean is further formulated 20

as: 21

it it ite 22

In the above equation, crash risk rate, it , is assumed to be proportional to traffic exposure, ite , 23

hence DVMT is considered as an offset. Then, the crash risk rate, it , is modeled as: 24

log( )it k itk i ik

X 25

where represents the intercept term, k is the coefficient for spatial covariate k, itkX is the 26

observed value of kth covariate for county i at time t, i stands for region-wide or statewide 27

global heterogeneity, and i represents spatially correlated random effects for county i. 28

Next, coefficients are modeled by using non-informative priors. For the random effect 29

TRB 2014 Annual Meeting Paper revised from original submittal.

Page 8: Spatial Analysis of County Level Crash Risk in New Jersey using …engineering.nyu.edu/citysmart/trbpaper/14-5127.pdf · 2014-02-04 · 2 1 ABSTRACT 2 This study presents an innovative

8

terms, second level of hierarchy is defined. The heterogeneity term is modeled using a normal 1 prior: 2

1~ 0,iid

ih

N

3

where h is a precision parameter that controls the amount of i among municipalities. The 4

heterogeneity term enables us to include extra-Poisson variability due to unobserved variables 5 over the entire state. For modeling the spatially correlated random effects, a CAR prior is 6 adopted, proposed by Besag (18): 7 8

*

1| ~ ,ij

i i jj i i c i

wN

w w

9

where c is a precision parameter that controls clustering, ij is a neighbor municipality adjacent 10

to municipality i, ijw is the weight of the neighbor j, and iw represents the sum of the weights of 11

the neighbors of municipality i. Note that, in this study, it is assumed that all neighbors have 12 equal weight. By including a spatially correlated random effects term, extra-Poisson variability 13 in the log-relative risk which varies from municipality to municipality can be modeled in such a 14 way that nearby municipalities will have more similar rates. 15 In the next step, the hyperpriors are defined for structured random effect terms which are 16 heterogeneity and clustering precision parameters. Setting same priors for h and c to give 17

same prior emphasis on them is incorrect due to two reasons. First, h uses a marginal 18

specification, whereas, c is specified conditionally. Second, c is multiplied by number of 19

neighbors before it is involved in prior specification. Therefore, hyperpriors for the precision 20 parameters, h and c , are defined based on a fair priors criteria approximated by Bernardinelli et 21

al. (22) as follows: 22 1 1

( ) ( )0.7

i ih c

sd sdw

23

where ( )isd and ( )isd are standard deviations of i and i respectively, and w average 24

number of fair neighbors. The graphical representation of the hierarchical model is shown in 25 Figure 3. 26

TRB 2014 Annual Meeting Paper revised from original submittal.

Page 9: Spatial Analysis of County Level Crash Risk in New Jersey using …engineering.nyu.edu/citysmart/trbpaper/14-5127.pdf · 2014-02-04 · 2 1 ABSTRACT 2 This study presents an innovative

9

it

hc

kii itkX

itY

ite

1 FIGURE 3 Graphical representation of the hierarchical Bayesian crash model 2

3 4 5 The proportion of variability in the random effects is another metric which is included in 6

modeling to analyze clustering later and it is defined as: 7 8

( )

( ) ( )i

i i

sd

sd sd

9

Several parameterization approaches of the proposed hierarchical model are considered in 10 the study and to evaluate the model fit deviance information criterion (DIC) is used. DIC is 11 proposed by Spiegelhalter et al. (23) to compare the fit and complexity of hierarchical models in 12 which the number of parameters is not clearly defined. DIC is calculated using the posterior 13 distribution of deviance and can be considered as more general form of Akaike’s information 14 criterion (AIC). DIC is defined as in the following equation: 15

16

( ) 2 DDIC D p 17

where ( )D is the classical estimate of fit (posterior mean of deviance) and Dp is the effective 18

number of parameters. It should be noted that similar to AIC, the models with lower DIC are 19 preferred. 20 21 22 DISCUSSION OF RESULTS 23

The hierarchical Bayesian model, which takes into account spatial correlation and unobserved 24 heterogeneity, was estimated using the software package called WinBUGS (24). WinBUGS uses 25 MCMC methods to sample from the joint posterior probability distribution of the multiple 26 variables. Starting with only an intercept, different subsets of variables were considered in the 27

TRB 2014 Annual Meeting Paper revised from original submittal.

Page 10: Spatial Analysis of County Level Crash Risk in New Jersey using …engineering.nyu.edu/citysmart/trbpaper/14-5127.pdf · 2014-02-04 · 2 1 ABSTRACT 2 This study presents an innovative

10

model development. Multiple chains were simulated for these models. 50,000 iterations were 1 performed for each model and the first 10,000 iterations were considered as burn-ins and were 2 not included in the sampling. The model with the lowest DIC was selected. The convergence of 3 the model was evaluated using the mixing of MCMC chains and Gelman-Rubin statistics in 4 WINBUGS. 5

Although MCMC methods are very effective in estimation of hierarchical models, 6 sometimes they can be very slow to converge due to high correlation between parameters or due 7 to non-informative priors (25). Several methods are proposed to overcome this issue and increase 8 the efficiency of MCMC estimation of hierarchical models such as hierarchical centering (25), 9 orthogonal polynomials (26) and parameter expansion (27). In order to improve convergence of 10 the model in this study, covariates were standardized (i.e. mean 0 and standard deviation 1) and 11 hierarchical centering was used for random effects. After these two procedures, the MCMC 12 chains are mixed well and the convergence is reached rather quickly based on the visual 13 inspection of monitoring plots in WINBUGS. 14

Table 2 shows the estimated hierarchical crash risk model for raw county level crash 15 counts. All variables considered in the study are found to be statistically significant. The results 16 indicate that based on the estimates of covariate effects, the “curve crashes” is the most 17 significant variable in explaining crash risk over space. This result confirms the findings of 18 Miaou et al. (2) in which the curve crashes were also the most significant covariate effects. 19 “Roadway mileage” is found to be the second most influential variable. This result shows the 20 general trend that as the roadway mileage increases in a county, if all other variables are 21 controlled in the model, the number of crashes increases. Previous research found that the 22 roadway mileage increases the crash risk (7-8). “Roadway defects” are found to be the third most 23 influential variable. This result implies that although a surrogate variable was used in the study to 24 represent the number of roadway defects, roadway defects elevate the crash risk. The least 25 influential variable in the model is found to be wet roadway. This was the case in (7) as well. 26 However, this might stem from the fact that during the wet weather, traffic volume might be 27 lower. Furthermore, drivers might be more careful and potentially drive more slowly during the 28 wet conditions. 29 30

TABLE 2 Modeling results of hierarchical model for raw crash counts 31 Credible Set Variable Mean S.D. 2.5% 97.5% Intercept 0.17730 0.01619 0.00006 0.17730 Roadway mileage 0.07259 0.00834 0.00037 0.07207 Curve crashes 0.11170 0.00667 0.00021 0.11170 Wet roadway 0.02788 0.00195 0.00004 0.02787 Roadway defects 0.05564 0.00065 0.00001 0.05564

0.47370 0.01637 0.00077 0.47580

0.06362 0.02647 0.00129 0.06010 0.88320

: 2554.63, ( ) 2343.62, 24.816, 2765.65DD D p DIC 32

The model also includes terms that capture the global heterogeneity and spatial 33 correlation. The spatial correlation is significantly higher than the global heterogeneity. The 34

TRB 2014 Annual Meeting Paper revised from original submittal.

Page 11: Spatial Analysis of County Level Crash Risk in New Jersey using …engineering.nyu.edu/citysmart/trbpaper/14-5127.pdf · 2014-02-04 · 2 1 ABSTRACT 2 This study presents an innovative

11

standard deviation of is 0.4737 while the standard deviation of is 0.06362. Based on the 1

proportion of variability in the spatial random effects, , it can be implied that there is a strong 2

spatial correlation in crash risk between neighboring counties which will be discussed more in 3 detail in the conclusion section of the paper. 4

In the literature, there is a lack of analysis of crash rates according to road types using 5 this kind of hierarchical Bayesian models. To address this gap in the literature, crashes are 6 classified by the type of roadways under different jurisdictions to re-estimate the model. In New 7 Jersey, there are four levels of jurisdiction for roadways: state (also include interstate highways 8 maintained by NJDOT), authority (toll roads), county, and municipal. Although the roadway 9 jurisdiction is provided in the crash records, DVMT for the roadways was not available; 10 therefore, the county DVMT values are used in the each model. On the other hand, the roadway 11 mileage by jurisdiction is available and is used for model estimations. Six counties are excluded 12 from the analysis of authority roadways since the length of authority roadways are less than five 13 miles. Table 3 shows the estimated hierarchical crash risk model for roadways by jurisdiction. As 14 expected, the effects of the contributing factors differ by the roadway jurisdiction. While 15 roadway mileage is the most influential factor on state and authority roadways, it is the second 16 most influential factor on county and municipal roadways. Moreover, it is only negatively related 17 to crash risk on state roadways. This result might be related to low average speed of vehicles due 18 to high congestion on the state roadways. The curve crash is also another significant variable for 19 all roadway jurisdictions. However, it is positively related to the crash risk for state roadways. 20 This might suggest that more safety precautions are needed to be taken in the vicinity of sharp 21 curves on state roadways. The effect of wet roadways was found to be significant and positive on 22 authority roadways, and municipal roadways. This can be attributed to higher speed limits on 23 authority roadways. On the other hand, municipal roadways are generally two-lane roadways and 24 there is only a small margin for error for drivers when the roadway is wet. The roadway defect is 25 found to be a factor for all road types except for county roadways. Based on the proportion of 26 variability in the spatial random effects, , the spatial correlation is strong for all jurisdictions. 27

The highest value of 0.82 is found for municipal roadway, which indicate that they are 28

highly clustered in the neighboring regions. 29 30

31 32

TABLE 3 Modeling results of hierarchical model for crashes by jurisdiction 33 Credible Set Variable Mean S.D. 2.50% 97.50%

State1

Intercept -0.9493 0.03721 -1.026 -0.8723 Roadway mileage -0.2091 0.02726 -0.2651 -0.1549 Curve crashes 0.03301 0.006056 0.0212 0.04499 Wet roadway 0.003339 0.003007 -0.00255 0.009248 Roadway defects 0.03294 0.006668 0.01983 0.04603

0.24 0.04201 0.1567 0.3136

0.1508 0.05595 0.03403 0.2423 0.6206

TRB 2014 Annual Meeting Paper revised from original submittal.

Page 12: Spatial Analysis of County Level Crash Risk in New Jersey using …engineering.nyu.edu/citysmart/trbpaper/14-5127.pdf · 2014-02-04 · 2 1 ABSTRACT 2 This study presents an innovative

12

Authority2

Intercept -2.409 0.06581 -2.554 -2.266 Roadway mileage 0.2218 0.03592 0.1517 0.2914 Curve crashes -0.03364 0.006284 -0.04597 -0.0213 Wet roadway 0.009369 0.004767 1.28E-05 0.01861 Roadway defects 0.03012 0.011 0.008366 0.05154

0.5563 0.09232 0.2789 0.6597

0.1935 0.1356 0.02669 0.5039 0.7539

County3

Intercept -1.011 0.03142 -1.079 -0.9433 Roadway mileage 0.06992 0.003202 0.06366 0.07622 Curve crashes -0.1006 0.0129 -0.1261 -0.07604 Wet roadway -0.00325 0.003561 -0.01025 0.003712 Roadway defects -0.02149 0.01428 -0.04959 0.006426

0.3563 0.04159 0.2459 0.4096

0.1132 0.07493 0.02235 0.2929 0.7701

Municipal4

Intercept -1.05 0.03888 -1.135 -0.9675 Roadway mileage 0.2073 0.01293 0.1818 0.2321 Curve crashes -0.5059 0.01428 -0.5349 -0.4784 Wet roadway 0.03914 0.004267 0.03075 0.04746 Roadway defects -0.0473 0.009711 -0.06643 -0.02846

0.6068 0.04957 0.4705 0.6754

0.1394 0.09315 0.02308 0.3561 0.8212

1

2

3

4

: 6219.92, ( ) 6195.14, 24.783, 6244.71

: 2715.11, ( ) 2696.15, 18.954, 2734.06

: 9977.43, ( ) 9952.36, 25.066, 10002.5

:12066.5, ( ) 12041.4, 25.079, 12091.6

D

D

D

D

D D p DIC

D D p DIC

D D p DIC

D D p DIC

1

2 Not all crashes have the same impact. The more severe crashes result in higher costs and 3

potential fatalities. Consequently, instead of only using a frequency based approach, it is possible 4 to assign different weights to the crashes with varying severity. FHWA Five Percent Report 5 proposes a severity weighing scheme for crashes in which the weights for fatality is 15, 6 incapacitating injury is 7, non-incapacitating injury is 4, possible injury is 2 and property damage 7 is only 1 (28). Although the theoretical background behind the weighing scheme is not explained 8 in the report, it suggests a guideline for weighing the crashes according to their severity. The 9 severity weights in this report will be adopted to re-estimate the previous model. Since non-10 incapacitating injury and possible injury crashes are not differentiated in the data, for all injury 11

TRB 2014 Annual Meeting Paper revised from original submittal.

Page 13: Spatial Analysis of County Level Crash Risk in New Jersey using …engineering.nyu.edu/citysmart/trbpaper/14-5127.pdf · 2014-02-04 · 2 1 ABSTRACT 2 This study presents an innovative

13

crashes, except incapacitating injury, 3 is assigned as the weight. Table 4 shows the results of 1 modeling when the severity weights are applied to crash counts. The results indicate a similar 2 trend to the model based on raw crash counts. Note that the spatial correlation decreased in this 3 case while global unobserved heterogeneity increases. This might suggest that there are other 4 unobserved underlying factors affecting the severity of the crashes. 5 6

TABLE 4 Modeling results based on severity weighted crash counts 7

Credible Set

Variable Mean S.D. 2.5% 97.5%

Intercept 0.56660 0.02632 0.51050 0.62290

Roadway mileage 0.09741 0.00657 0.08404 0.11010 Curve crashes 0.12100 0.00532 0.11070 0.13160 Wet roadway 0.02312 0.00157 0.02005 0.02622

Roadway defects 0.06717 0.00053 0.06611 0.06821

0.47050 0.02996 0.40940 0.52900

0.10300 0.04928 0.02962 0.18780 0.82420

: 2643.32, ( ) 2432.57, 24.646, 2854.06DD D p DIC 8

9

Figure 4 shows the crash rate maps for 2010 which are developed based on the model 10 findings of mean crash rates from raw crash counts (Figure 4.a) and from crash counts when 11 severity weights are applied (Figure 4.b). In Figure 4.a, the natural breaks in Figure 1.b is used to 12 make the comparison easy between the crash rates from the model and the real data. It can be 13 observed that the model estimates are proximate to the real crash rates, which confirms the fit of 14 the model to the data. The comparison of Figure 4.a and 4.b indicates that the crash severity is 15 not uniformly distributed in New Jersey. Figure 4.c presents the differences in crash rates 16 between both models. Three of the counties (Passaic, Essex and Hudson) have the highest 17 difference when the severity weights are considered. Since these are neighboring counties, the 18 site specific factors due to shared roadways might play a role in the higher rankings. 19 Additionally, the five counties that follow the top three counties have a similar trend. While 20 Union county neighbors the three counties mentioned earlier, four southern counties, Ocean, 21 Atlantic, Camden, and Cumberland are also neighbors of each other. This trend is not followed 22 for the remaining counties, except for Warren, Hunterdon, and Morris counties which have the 23 smallest change between two models. Based on the results, it can be recommended that the 24 counties that have higher rankings when severity weights are applied need to be further 25 researched. It is necessary to investigate the factors of the likelihood of occurrence of more 26 severe crashes at these locations and the cause behind the neighboring counties to have similar 27 trends. 28

29 30 31 32

TRB 2014 Annual Meeting Paper revised from original submittal.

Page 14: Spatial Analysis of County Level Crash Risk in New Jersey using …engineering.nyu.edu/citysmart/trbpaper/14-5127.pdf · 2014-02-04 · 2 1 ABSTRACT 2 This study presents an innovative

1 2

3 4 5 6 7

FIGURcounts

(a)

RE 4 Mean cs (c) differen

crash rates nce between

(a) from ran raw and se

(per tho

14

(c) aw crash coueverity weigousand DVM

(b)

unts (b) fromghted crash MT).

m severity wcounts by c

weighted crcounty in 20

rash 010

TRB 2014 Annual Meeting Paper revised from original submittal.

Page 15: Spatial Analysis of County Level Crash Risk in New Jersey using …engineering.nyu.edu/citysmart/trbpaper/14-5127.pdf · 2014-02-04 · 2 1 ABSTRACT 2 This study presents an innovative

15

Crash rate maps are also developed for the different road types as shown in Figure 5. 1 Since DVMT data is not available at the jurisdiction level, county DVMT miles are used for 2 estimating the road based on crash rates. However, the roadway length by jurisdiction is 3 available and the model was estimated using this data. The limitations of the data might have a 4 negative effect on the accuracy of the results as the exposure to traffic is different for various 5 road types. The model results reveal lower crash rates than the actual rates for each road type, 6 but the sum of the crash rates for different road types is equal to the crash rate of a county for all 7 crashes. This is due to the fact that the same county level DVMT is used for all models. These 8 results indicate that crash rates vary for different counties in terms of road types. Thus, crash 9 rates are not spatially consistent within each county independent of the road type. This can have 10 major policy implications because one county that appears to have lower overall crash rates can 11 have higher crash rates in terms of its county or municipal roadways. Moreover, it is found that 12 the state and authority roadways have lower crash rates than the county and municipal roadways. 13 This result can be used to address possible funding and other geometric and operational 14 improvement issues for roads where there is room for improvement compared to other roads 15 located within the same county. 16

17 CONCLUSION 18

This paper presents a county level hierarchical Bayesian model framework for New Jersey. 19 Unlike regular NB models, use of hierarchical Bayesian models enables us to include spatial 20 correlation and global heterogeneity simultaneously. To develop a truly spatial model, only 21 spatial covariates related to roadway conditions are considered in the model development stage. 22 However, due to limitations of the data, spatial covariates are generated from a set of surrogate 23 variables. It is recognized that the surrogate variables may not fully represent the effects of the 24 considered spatial covariates which is a limitation of this study. The spatial variation of crashes 25 is also analyzed by roadway type. However, due to data limitations, DVMT in a county is used 26 instead of the DVMT by roadway type. In addition to common practice of modeling based on 27 raw crash counts, we used of crash counts with severity weights in the hierarchical crash models. 28 Severity weights proposed in FHWA’s “Five Percent Report” is used to represent the impact of 29 the different severity crashes. Finally, the crash rate maps are developed from modeling results 30 for raw crash counts, weighted crash counts, and crashes by roadway type. These maps can be 31 valuable tools to visualize the spatial meaning of models that are not always easy to understand 32 by just observing estimated model equations. Furthermore, the maps based on the spatial models 33 developed in this study enables a fast and intuitive understanding of model predictions especially 34 for non-statisticians. 35 The results of the study indicate that the most influential covariate for the crashes is the 36 roadway curvature, followed by roadway mileage and roadway defects. “Wet roadway” is found 37 to be the least influential factor. By applying severity weights to the crashes in the hierarchical 38 models, it is found that it is possible to represent the crash risk better using this new model. The 39 developed maps based on the raw as well as the weighted crash rates from the model identified 40 counties which are prone to more severe crashes. It is believed that the results of this study will 41 help transportation professionals on identifying and ranking the locations at an aggregate level, 42 which requires closer attention. The crash rate maps can be used to pin-point local needs and 43 allocate funds by the state or federal governments. The maps of crash rate by roadway type can 44 also be used to understand the effectiveness of local governments in terms of highway safety. 45 46

TRB 2014 Annual Meeting Paper revised from original submittal.

Page 16: Spatial Analysis of County Level Crash Risk in New Jersey using …engineering.nyu.edu/citysmart/trbpaper/14-5127.pdf · 2014-02-04 · 2 1 ABSTRACT 2 This study presents an innovative

1

2

3

4

5 6 7

FIGU

(

URE 5 Meancounty (d)

(a)

(c)

n crash ratemunicipal

es for (a) staroadways b

16

ate (b) authby county in

(b)

(d)

ority (six con 2010 (per t

ounties are ethousand D

excluded) (DVMT).

(c)

TRB 2014 Annual Meeting Paper revised from original submittal.

Page 17: Spatial Analysis of County Level Crash Risk in New Jersey using …engineering.nyu.edu/citysmart/trbpaper/14-5127.pdf · 2014-02-04 · 2 1 ABSTRACT 2 This study presents an innovative

17

For future studies, it is recommended that the methodology described in this paper is 1 extended to analyze more disaggregate level data such as crashes at the municipality level or 2 crashes on individual roadway segments. It will be helpful to investigate the differences in the 3 crash rates and spatial correlation in smaller regions. Although a model for various roadway 4 characteristics is developed in the study, it is also possible to examine the effects of socio-5 economic, demographic and land use variables. Additionally, more accurate results can be 6 obtained from the model if data from surrogate variables can be replaced with real data. Bayesian 7 hierarchical models offer a flexible framework to model temporal random effects in crash data as 8 well as spatial random effects. Future efforts should focus on the investigation of temporal 9 effects and space-time interactions in crash data. For the severity weights, different weighing 10 approaches such as economic impact approach in which property damage only equivalents of 11 accidents are considered (29) can be investigated. 12 13 14 ACKNOWLEDGEMENTS 15 The contents of this paper reflect views of the authors who are responsible for the facts and 16 accuracy of the data presented herein. The contents of the paper do not necessarily reflect the 17 official views or policies of the agencies. 18

19 REFERENCES 20

1. Center for Disease Control and Preventon. WISQARS (Web-based Injury Statistics Query 21 and Reporting System). Atlanta, GA: US Department of Health and Human Services. 22 http://www.cdc.gov/injury/wisqars. Accessed January 16, 2012. 23

2. Miaou, S-P., Song, J. J., and B. K. Mallick. Roadway Traffic Crash Mapping: A Space-24 Time Modeling Approach. Journal of Transportation and Statistics, Volume 6, Number 25 1, 2003, pp. 33-57. 26

3. Waller, L.A., and C. A. Gotway. Applied spatial statistics for public health data. John 27 Wiley & Sons, Inc., Hoboken, NJ, 2004. 28

4. Ghosh, M., and J. N. K. Rao. Small Area Estimation: An Appraisal. Statistical 29 Science Volume 9, Number 1, 1994, pp. 55-76. 30

5. Gelfand, A.E., Diggle, P. J., Fuentes, M., and P. Guttorp. Handbook of Spatial Statistics. 31 CRC Press, Boca Raton, FL, 2010. 32

6. MacNab, Y. C. Bayesian spatial and ecological models for small-area accident and injury 33 analysis. Accident Analysis and Prevention 36 (2004), pp. 1019–1028. 34

7. Aguero-Valverde, J., and P. P. Jovanis. Spatial analysis of fatal and injury crashes in 35 Pennsylvania. Accident Analysis and Prevention 38 (2006), pp. 618–625. 36

8. Quddus, M. A. Modelling area-wide count outcomes with spatial correlation and 37 heterogeneity: An analysis of London crash data. Accident Analysis and Prevention, 38 Vol.40, 2008, pp. 1486-1497. 39

9. Huang, H., Abdel-Aty, M. A., and A. L. Darwiche. County-Level Crash Risk Analysis in 40 Florida. In Transportation Research Record: Journal of the Transportation Research 41 Board, No. 2148, Transportation Research Board of the National Academies, 42 Washington, D.C., 2010, pp. 27-37. 43

10. Noland, R. B. Traffic fatalities and injuries: the effect of changes in infrastructure and 44 other trends. Accident Analysis and Prevention, Vol.35, 2003 599–611 45

TRB 2014 Annual Meeting Paper revised from original submittal.

Page 18: Spatial Analysis of County Level Crash Risk in New Jersey using …engineering.nyu.edu/citysmart/trbpaper/14-5127.pdf · 2014-02-04 · 2 1 ABSTRACT 2 This study presents an innovative

18

11. Fridstrom, L., and S. Ingebrigtsen. An Aggregate Accident Model Based on Pooled, 1 Regional Time-Series Data. Accident Analysis and Prevention, Vol. 23, No. 5, 1991, pp. 2 363–378 3

12. Karlaftis, M. G., and A. P. Tarko. Heterogeneity Considerations in Accident Modeling. 4 Accident Analysis and Prevention, Vol. 30, No. 4, 1998, pp.425–433. 5

13. Amoros, E., Martin, J. L., and B. Laumon. Comparison of Road Crashes Incidence and 6 Severity Between Some French Counties. Accident Analysis and Prevention, Vol. 35, No. 7 4, 2003, pp. 537–547. 8

14. Noland, R. B., and L. Oh. The Effect of Infrastructure and Demographic Change on 9 Traffic-Related Fatalities and Crashes: A Case Study of Illinois County-Level Data. 10 Accident Analysis and Prevention, Vol. 36, No. 4, 2004, pp. 525–532. 11

15. Abdel-Aty, M. A., and A. E. Radwan. Modeling traffic accident occurrence and 12 involvement”. Accident Analysis and Prevention, Vol. 32, No.5, 2000, pp. 633-642. 13

16. Lee, J., and F. Mannering. Impact of roadside features on the frequency and severity of 14 run-off-roadway accidents: an empirical analysis. Accident Analysis and Prevention, Vol. 15 34, No.2, 2002, pp. 149-161. 16

17. Lord, D., and F. Mannering. The statistical analysis of crash-frequency data: A review 17 and assessment of methodological alternatives. Transportation Research: Part A, Vol. 18 44, No.5, 2010, pp. 291-305. 19

18. Besag, J. Spatial Interaction and the Statistical Analysis of Lattice Systems. Journal of 20 the Royal Statistical Society. Series B 36, 1974, pp. 192–236. 21

19. Moran, P. A. P. Notes on Continuous Stochastic Phenomena. Biometrika 37 (1), 1950, 22 pp. 17–23. 23

20. New Jersey Department of Transportation (2012). Crash Records. Trenton, NJ. 24 http://www.state.nj.us/transportation/refdata/accident/ Accessed on May 16, 2012 25

21. Tarko, A., G. Davis, N. Saunier, T. Sayed, and S. Washington (2009). Surrogate 26 Measures of Safety. Subcommittee on Surrogate Measures of Safety, Transportation 27 Research Board of the National Academies, Washington, D.C., 2009. 28

22. Bernardinelli, L., Clayton, D., and C. Montomoli. Bayesian Estimates of Disease Maps: 29 How Important Are Priors? Statistics in Medicine, Vol. 14, 1995, pp. 2411-2431. 30

23. Spiegelhalter, D.J., Best, N., Carlin, B. P., and A. Van Der Linde. Bayesian measures of 31 model complexity and fit”. Journal of the Royal Statistical Society: Series B, Volume 64, 32 Issue 4, 2002, 583–639. 33

24. Spiegelhalter, D., Thomas, A., Best, N., and D. Lunn. WINBUGS User Manual Version 34 1.4. MRC Biostatistics Unit, Cambridge, UK. http://www.mrc-35 bsu.cam.ac.uk/bugs/winbugs/contents.shtml Accessed on February 12, 2012 36

25. Gelfand, A.E., Sahu, S.K., and B.P. Carlin. Efficient parameterizations for normal linear 37 mixed models. Biometrika, 82, 1995, 479-488. 38

26. Hills, S.E., and, A.F.M. Smith. Parameterization Issues in Bayesian Inference. Bayesian 39 Statistics 4, (J M Bernardo, J O Berger, A P Dawid, and A F M Smith, eds), 1992, 40 Oxford University Press, 227-246. 41

27. Liu, C., Rubin, D.B., and Y.N. Wu. Parameter expansion to accelerate EM: The PX-EM 42 algorithm. Biometrika 85 (4), 1998, pp. 755-770. 43

28. Five Percent Report. Federal Highway Administration. Accessed on March 12, 2012. 44 http://safety.fhwa.dot.gov/hsip/fivepercent/ 45

TRB 2014 Annual Meeting Paper revised from original submittal.

Page 19: Spatial Analysis of County Level Crash Risk in New Jersey using …engineering.nyu.edu/citysmart/trbpaper/14-5127.pdf · 2014-02-04 · 2 1 ABSTRACT 2 This study presents an innovative

19

29. Oh, J., Washington, S., and D. Lee. Property Damage Crash Equivalency Factors to 1 Solve Crash Frequency-Severity Dilemma: Case Study on South Korean Rural Roads. In 2 Transportation Research Record: Journal of the Transportation Research Board, No. 3 2148, Transportation Research Board of the National Academies, Washington, D.C., 4 2010, pp. 83-92. 5

TRB 2014 Annual Meeting Paper revised from original submittal.