
Upload: tiffany-long

Post on 08-Jan-2018


SPF workshop UBCO February 2014

CH1. What is what
CH2. A simple SPF
CH3. EDA
CH4. Curve fitting
CH5. A first SPF
CH6. Which fit is fitter
CH7. Choosing the objective function
CH8. Theoretical stuff (skip)
CH9. Adding variables
CH10. Choosing a model equation

Workshop Objectives:
a. Learn how to fit an SPF to data
b. Understand what SPFs can and cannot do


What is what.

1. What are SPFs?
2. What information do (should) they give us?
3. What is that information used for?

Loosely speaking, SPFs are tools that give information about the safety of units such as road segments, intersections, ramps, grade crossings …

What is this?


What is Safety?

Here is a count of injury accidents for a Freeway Segment in Colorado. What is its SAFETY?

Here is a (monthly) count of accidents for an Intersection in Toronto. What is its SAFETY?

[Figures: segment of urban freeway in Denver; intersection in Toronto]


… “what is its safety?” implies that SAFETY is a property of UNITS

What is a ‘Unit’?

A Unit can be a road segment, an intersection, Mr. C.J. Smith, heavy trucks on the 401, etc.


What is the Safety of a unit?

[Figure: 1.9 mile long segment of 6-lane urban freeway in Denver, Colorado]

Had I defined Safety = Accident Counts, that would mean that safety improved from 1986 to 1987, deteriorated from 1987 to 1988, etc.

Such a definition is not useful for safety management, because under it safety changes even when there is no change in safety-relevant traits (exposure, traffic control, physical features, user demography, etc.).


We need a definition of the safety of a unit such that, as long as the ‘safety-relevant’ traits of the unit do not change, its ‘safety’ does not change.

[Figures: three-period running averages, freeway segment, Colorado; thirteen-period running averages, intersection, Toronto]

One can rightly imagine that behind the fluctuations there is a gradually changing safety property that is some kind of average.
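To make the running-average idea concrete, here is a minimal sketch in Python. The counts are hypothetical and are not the Colorado or Toronto data, and the helper name `running_average` is an assumption made for this illustration only.

```python
def running_average(counts, window=3):
    """Running average: one value per full window of consecutive periods."""
    if window < 1 or window > len(counts):
        raise ValueError("window must be between 1 and len(counts)")
    return [sum(counts[i:i + window]) / window
            for i in range(len(counts) - window + 1)]

# Hypothetical yearly accident counts (illustrative only, not workshop data).
counts = [12, 7, 15, 9, 11, 14, 8]
smoothed = running_average(counts, window=3)

for value in smoothed:
    print(round(value, 2))
```

The smoothed series fluctuates less than the raw counts, which is what suggests a slowly changing property behind the noise.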


There are three elements in the graph:
1. Observed values ●
2. The invisible (unknown) safety property μ
3. Our estimate of the unknown property ○

[Figure: thirteen-period running averages, intersection, Toronto; labels: Reality, Abstraction]


What is the ‘safety of a unit’?

We are now ready.

Definition: The safety property of a unit is the number of accidents, by type and severity, expected to occur on it in a specified period of time. It will always be denoted by μ and its estimate by μ̂.

Accident type     Accident severity
                  PDO     Injury   Fatal
Rear-end          3.10    1.70     0.20
Angle             1.40    0.90     0.10
Single-vehicle    0.30    0.10     0.02
Pedestrian                0.05     0.03


We are gradually assembling the elements needed to say with clarity what an SPF is. Eventually it will be a function of ‘variables’. What is the link between safety and variables?

The ‘safety’ of a unit depends on its ‘traits’


Traits & Safety


S-R traits

Definition: A trait is ‘safety-related’ (s-r) if, when it changes, μ changes.

Consequence: Units with the same s-r traits have the same μ.

Corollary: Units that differ in some s-r traits differ in their μ’s.


Populations

Units that share some traits form a population of units. Example: (1) rural, (2) two-lane road segments in (3) flat terrain of (4) Colorado.

Because only some traits are common, the units differ in many s-r traits and therefore differ in their μ.

We will describe the safety of a population by:
Mean of μ’s, E{μ}, and
Standard deviation of μ’s, σ{μ}


Populations: real and imagined

Example: segments of rural two-lane roads in Colorado form a population

Their shared traits are: (1) State: Colorado, (2) Road Type: two-lane, (3) Setting: rural.

A new population (subset): (1) & (2) & (3) & (4) Terrain: flat.


The more traits the fewer units.

Colorado data:

Shared traits                                                         Segments
(1) State: Colorado, (2) Road Type: two-lane, (3) Setting: rural      5323
Add: 2.5 < Segment Length < 3.5 miles                                 597
Add: 1000 < AADT < 2000 vpd                                           119

If the bin is 2400 < AADT < 2420, there are no units even in this rich data. But the SPF will still provide an estimate of E{μ} for that population, albeit an ‘imagined’ one.


Finally: “What is an SPF?”

A Safety Performance Function is a tool which, for a multitude of populations, provides estimates of:
1. The mean of the μ’s in populations, E{μ}, and
2. The standard deviation of the μ’s in these populations, σ{μ}.


Notational conventions to remember

μ - the expected number of crashes for a unit.
μ̂ - estimate of μ. Caret above always means: estimate of ...
E{μ} - average of μ’s in a population of units. E{.} always means ‘average or expected value of whatever the dot stands for.’
σ{μ} - standard deviation of μ’s in a population of units. σ{.} always means ‘standard deviation of whatever the dot stands for.’


The information we get from an SPF is not about units; it is always about a population of units. When we use the SPF information to estimate the safety of a specific unit we argue as follows: “This unit has the same traits as the units in the population. Therefore my best guess of its μ is E{μ}.”


In interim summary

We needed to be clear about what an SPF is.

To get there we had to say what we mean by ‘safety of a unit’, and that it depends on the unit’s safety-relevant traits.

Further, we had to mention that units that share some safety-relevant traits form populations of units.

The safety of a population of units can be described by E{μ} and σ{μ}.

These are necessary for practical applications.

An SPF provides estimates of E{μ} and σ{μ} for many populations.


What are E{μ} and σ{μ} needed for?

Two groups of applications:

Group I: We really need the E{μ}.
Examples: (a) To judge what is deviant we have to know what is ‘normal’. (b) How different are the E{μ}’s of segments with and without, say, paved shoulders?

Group II: We really need the μ of a specific unit, and E{μ} helps us to estimate it.
Examples: (a) Is this road segment a ‘blackspot’? (b) How did the μ of this unit change from ‘before’ treatment to ‘after’ treatment?


Group I: We need the E{μ} of a population
• What is normal for a unit?
• How different are the means of two populations?
To answer: E{μ} and σ{E{μ}}

Group II: We need the μ of a unit
• Is this unit a ‘blackspot’?
• What might be the safety benefit of treating it?
• What was the safety benefit of treating it?
To answer: E{μ}, σ{E{μ}} and σ{μ}


Is there a Group III?

Some believe that we want to know the function linking E{μ} and traits in order to be able to say how a change in the level of a trait will affect the E{μ} of units.

Opinions differ on whether such a use of an SPF can be trusted. I do not think so, and will give my reasons in Session 5. I hope that by the end of the workshop there will be more CMF skeptics.


What are E{μ} and σ{μ} used for? A sequence of simple illustrations.

Go to ‘Spreadsheets to accompany PowerPoints.’ Open Spreadsheet #1 ‘Connecticut Drivers’ on the ‘1. Data’ workbook.

1. How many units are deviant?

2. How well will my screen work?

3. What will be the accident savings of a treatment?

4. How effective was the treatment?


Data

Connecticut drivers (1931-1936)

Crashes, k    Drivers, n(k)
0             23881
1             4503
2             936
3             160
4             33
5             14
6             3
7             1
Total         29531

Preliminaries: get E{μ} and σ{μ}.


Computing sample mean and variance.

Open workbook 2, ‘Mean and variance estimates’ (of #1).

A      B       C         D       E
k      n(k)    B/B$11    A*C     (A-D$11)^2*C
0      23881   0.8087    0.000   0.047
1      4503    0.1525    0.152   0.088
2      936     0.0317    0.063   0.098
3      160     0.0054    0.016   0.041
4      33      0.0011    0.004   0.016
5      14      0.0005    0.002   0.011
6      3       0.0001    0.001   0.003
7      1       0.0000    0.000   0.002
Sums   29531             0.240   0.306          0.26


Stay on workbook 2, ‘Mean and variance estimates’ (of #1).

Estimate of V{μ}: 0.306 - 0.240 = 0.066, so σ̂{μ} = √0.066 ≈ 0.26.

Naturally σ{μ} > 0. Even if we used age, gender and exposure as traits, there still would be differences.
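The spreadsheet columns can be reproduced with a few lines of Python. This is only a sketch. It rests on an assumption the slides do not spell out: each driver's count is Poisson distributed around that driver's μ, so the variance of the counts equals E{μ} + V{μ}, and V{μ} is estimated as sample variance minus sample mean. The variable names are illustrative.

```python
import math

# Connecticut drivers, 1931-1936: crashes k -> number of drivers n(k).
n_k = {0: 23881, 1: 4503, 2: 936, 3: 160, 4: 33, 5: 14, 6: 3, 7: 1}

total = sum(n_k.values())

# Column D: sample mean of the counts, used as the estimate of E{mu}.
mean_k = sum(k * n for k, n in n_k.items()) / total

# Column E: sample variance of the counts.
var_k = sum((k - mean_k) ** 2 * n for k, n in n_k.items()) / total

# Assumption: Poisson counts around each mu, hence Var{k} = E{mu} + V{mu}.
var_mu = var_k - mean_k
sd_mu = math.sqrt(var_mu)

print(total, round(mean_k, 3), round(var_k, 3), round(sd_mu, 2))
```

Subtracting the mean removes the part of the spread that is pure Poisson randomness and leaves the part that reflects real differences between drivers.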


Use E{μ} and σ{μ} for: Screening.

Question: What % of these drivers have a μ that is, say, more than 5 times the mean (μ > 5 × 0.24 = 1.2 acc. in six years)?

Open workbook 3, ‘How many High mu drivers’ (of #1).


Answer:
1. Assume that the μ’s are Gamma distributed.
2. Compute the parameters a and b of the Gamma distribution.
3. Use the Excel function GAMMADIST(μ, b, 1/a, TRUE).
4. P(μ<1.20) = 0.99.
5. There are (≈ 29,531 × 0.01 =) 295 such (5×) drivers.

[Figure: P(μ<1.20)]
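For readers without the spreadsheet, the steps above can be sketched in plain Python. Several things here are assumptions. The values a = 3.55 and b = 0.85 are taken from the worked example later in the deck, (4+0.85)/(3.55+1). The function `gamma_cdf` is a hand-written stand-in for Excel's GAMMADIST with shape b and scale 1/a, built on the standard power series for the incomplete gamma function. It is not the workshop's own code.

```python
import math

def gamma_cdf(x, shape, rate):
    """P(X <= x) for X ~ Gamma(shape, rate), using the power series
    of the regularized lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    z = rate * x
    term = 1.0 / shape
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (shape + n)
        total += term
        n += 1
    return total * math.exp(shape * math.log(z) - z - math.lgamma(shape))

a, b = 3.55, 0.85        # assumed Gamma parameters (rate a, shape b)
threshold = 5 * 0.24     # five times the mean: 1.2 crashes in six years
drivers = 29531

p_below = gamma_cdf(threshold, shape=b, rate=a)
n_high = drivers * (1 - p_below)

print(round(p_below, 2), round(n_high))
```

The slide rounds the tail probability to 0.01 before multiplying, so the count printed here need not match 295 exactly.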


Use E{μ} and σ{μ} for: Screen Performance

Question: If we decide to ‘treat’ those 51 (out of 29,531) who had 4 or more accidents, how will such a screen do?

Connecticut drivers (1931-1936)

Crashes, k    Drivers, n(k)
0             23881
1             4503
2             936
3             160
4             33
5             14
6             3
7             1
Total         29531


To answer, we have to determine how many of those drivers with 4, 5, 6 or 7 crashes are truly ‘high μ’.

If in a population of units μ is Gamma distributed, then the μ’s of those units with k crashes are also Gamma distributed, with parameters a+1 and b+k (EB).

Open workbook 4, ‘Gamma with k=4, 5, 6, 7’ (of #1).


Modify formula in B7 and copy down

First answer: Amongst those who recorded 4 crashes, 66% have μ<1.2.

Do same for k=5, 6, and 7. Record.


Use E{μ} and σ{μ} for: Screen Performance

k      n(k)   P(μ≤1.2)   False positives   Correct positives
4      33     0.66       22                11
5      14     0.49       7                 7
6      3      0.33       1                 2
7      1      0.20       0                 1
Sums   51                30                21

Answer: Of the 295 drivers with μ > 1.2, 21 are correctly identified and the remaining 274 are missed; 30 drivers are incorrectly identified (false positives).

[Figure: 274 missed, 21 caught, 30 false]
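The screen-performance table can be checked with a short Python sketch. It assumes a = 3.55 and b = 0.85 (the values used in the deck's worked example) and assumes that the μ's of drivers with k crashes are Gamma distributed with shape b+k and rate a+1. The `gamma_cdf` helper is a hand-written stand-in for the Excel function, not the workbook itself.

```python
import math

def gamma_cdf(x, shape, rate):
    """P(X <= x) for X ~ Gamma(shape, rate), using the power series
    of the regularized lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    z = rate * x
    term = 1.0 / shape
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (shape + n)
        total += term
        n += 1
    return total * math.exp(shape * math.log(z) - z - math.lgamma(shape))

a, b = 3.55, 0.85                   # assumed Gamma parameters of the population
threshold = 1.2                     # 'high mu' means mu > 1.2
n_k = {4: 33, 5: 14, 6: 3, 7: 1}    # drivers flagged by the screen (k >= 4)

rows = []
for k, n in n_k.items():
    # Share of drivers with k crashes whose mu is NOT high.
    p_low = gamma_cdf(threshold, shape=b + k, rate=a + 1)
    rows.append((k, n, p_low, n * p_low, n * (1 - p_low)))

total_false = sum(r[3] for r in rows)
total_correct = sum(r[4] for r in rows)

for k, n, p_low, false_pos, correct_pos in rows:
    print(k, n, round(p_low, 2), round(false_pos), round(correct_pos))
print("sums", round(total_false), round(total_correct))
```

Most of the flagged drivers turn out not to be high-μ drivers, which is the point of the illustration: a screen based on raw counts alone catches few of the truly deviant units.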


Use E{μ} and σ{μ} for: Anticipating benefit

Preliminaries

CMF ≡ (Expected accidents ‘with’) / (Expected accidents ‘without’)

Reduction in accidents = μ(1 - CMF)

Question: How many accidents will be saved if a treatment with CMF = 0.95 is administered to Connecticut drivers with k ≥ 4?


Open workbook 5, ‘Anticipating benefit’ workpage (of #1).

Recall that (EB):

E{μ|k} = (k+b)/(a+1)

Thus, e.g., for k=4, (4+0.85)/(3.55+1) = 1.07 crashes.

k      n(k)   (k+b)/(a+1)   n(k)*(k+b)/(a+1)
4      33     1.07          35.2
5      14     1.29          18.0
6      3      1.51          4.5
7      1      1.73          1.7
Sum                         59.4

Expected reduction = 59.4 × (1 - 0.95) = 2.97 acc. in six years.
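The arithmetic of this slide can be sketched in Python as follows. The values a = 3.55, b = 0.85 and CMF = 0.95 come from the slides. The function name `eb_mean` and the variable names are illustrative assumptions.

```python
a, b = 3.55, 0.85                   # Gamma parameters used on the slide
cmf = 0.95                          # crash modification factor of the treatment
n_k = {4: 33, 5: 14, 6: 3, 7: 1}    # treated drivers, by recorded crashes k

def eb_mean(k, a, b):
    """E{mu | k} = (k + b) / (a + 1) for a driver who recorded k crashes."""
    return (k + b) / (a + 1)

# Crashes expected in six years without treatment, summed over the 51 drivers.
expected_without = sum(n * eb_mean(k, a, b) for k, n in n_k.items())

# Reduction in accidents = mu * (1 - CMF), applied to the group total.
expected_saving = expected_without * (1 - cmf)

print(round(expected_without, 1), round(expected_saving, 2))
```

The benefit is computed from the expected μ of the treated drivers, not from the crashes they happened to record.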


Use E{μ} and σ{μ} for: Research about CMF

The 51 drivers with k ≥ 4 received some treatment.

Question: If the treatment had no effect, and nothing else changed, how many crashes are they expected to have in a 6-year ‘after treatment’ period?

Just as before: E{μ|k} = (k+b)/(a+1)

k      n(k)   (k+b)/(a+1)   n(k)*(k+b)/(a+1)
4      33     1.07          35.2
5      14     1.29          18.0
6      3      1.51          4.5
7      1      1.73          1.7
Sum                         59.4


How come that drivers with 227 accidents are expected to have only 59.4?

Before: 4*33 + 5*14 + 6*3 + 7*1 = 227 crashes in six years.
If ineffective, expected after = 59 crashes in six years.
227 - 59 = 168: regression to the mean!
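The size of the regression-to-the-mean effect can be reproduced with the short sketch below. It reuses a = 3.55 and b = 0.85 from the slides; the variable names are illustrative assumptions.

```python
a, b = 3.55, 0.85                   # Gamma parameters used on the slides
n_k = {4: 33, 5: 14, 6: 3, 7: 1}    # the 51 drivers selected because k >= 4

# Crashes these drivers recorded in the six 'before' years.
observed_before = sum(k * n for k, n in n_k.items())

# Crashes expected in six 'after' years if the treatment does nothing:
# the sum of E{mu | k} = (k + b) / (a + 1) over the same drivers.
expected_after = sum(n * (k + b) / (a + 1) for k, n in n_k.items())

regression_to_mean = observed_before - expected_after

print(observed_before, round(expected_after, 1), round(regression_to_mean))
```

A naive before-after comparison would credit the treatment with this entire drop, even though none of it is caused by the treatment.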


Summary of illustrations:
We used estimates of E{μ} and VAR{μ} to:
• Estimate how many deviant units are in a population;
• Estimate how many deviants are in subpopulations of units with many crashes (correct and false positives and negatives);
• Estimate how many crashes will be saved and how many to expect after an ineffective treatment.


Two perspectives on SPF

E{μ} and σ{μ} = f(Traits, parameters)

Applications centered perspective

Cause and effect centered perspective

The perspective determines how modeling is done


Applications centered perspective

Here the question is: “How should modeling be done to get good estimates of E{μ} and σ{μ}?”


Cause and effect centered perspective

Here the question is: “How should modeling be done to get the right ‘f’ and parameters, so that I can compute the change in E{μ} caused by a change in a trait?”


Summary of 1.
1. We defined ‘safety’;
2. The safety of a unit is determined by its s-r traits;
3. Units that share some traits form a population;
4. The safety of a population is described by E{μ} and σ{μ};
5. The SPF is ...

A Safety Performance Function is a tool which, for a multitude of populations, provides estimates of:
1. The mean of the μ’s in populations, E{μ}, and its accuracy;
2. The standard deviation of the μ’s in these populations, σ{μ}.