
Promise, Progress, and Pitfalls in Big Data Analyses of Safety Data

Carol Flannagan, Ph.D.

Director, Center for the Management of Information for Safe and Sustainable Transportation (CMISST)

University of Michigan Transportation Research Institute (UMTRI)


Big Transportation Data

• Crash Data

• Driving Data

• Spatial Data

• V2X Data

• Cell-phone data

• …


Datasets and Linkages

[Figure: diagram of linked datasets. Datasets shown: Exposure Data, Driving Datasets, Crash Databases, Vehicle Characteristics, GIS Databases, Other State Datasets, Anthropometry Data. Link labels: Population Demographics and Travel Exposure; Population Demographics; Vehicle ID Number; Location and Roadway; Surrogates; Occupant.]


Promise of Big Data

Large datasets provide:

• More information (larger sample size)

• More opportunity to find the specific cases of interest (e.g., corner cases)

• More variety of drivers/situations/conditions

• Prospective sample of rare events


IS BIG DATA ALL THE DATA?

[Figure: data matrix with two axes: Information (variables) and Cases/Events/Drivers.]


WHY NOT ALL?

• World-wide, vehicles travel more than 1 light-year per year

• Over-the-air data transfer is required for any large-scale collection

- Airtime is still expensive (in very large quantities)

• Manufacturers may (or may not) have all data for their vehicles, but even then, it is usually stored on the vehicle for later download

- Physical download requires access to vehicle, limiting possible sample size

• Production cars are not over-built for either bandwidth or storage

- Cuts into profit margin
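
For a sense of scale on the light-year figure in the first bullet of this slide, the short sketch below converts one light-year to kilometers and miles. It uses only the defined speed of light and the Julian year; it does not estimate world-wide vehicle travel, which the slide states but does not source.

```python
# One light-year expressed in road-distance units.
SPEED_OF_LIGHT_M_PER_S = 299_792_458      # exact, by definition of the meter
SECONDS_PER_JULIAN_YEAR = 365.25 * 86_400  # Julian year used for the light-year
METERS_PER_MILE = 1609.344                 # international mile

light_year_m = SPEED_OF_LIGHT_M_PER_S * SECONDS_PER_JULIAN_YEAR
light_year_km = light_year_m / 1000
light_year_miles = light_year_m / METERS_PER_MILE

print(f"1 light-year = {light_year_km:.3e} km = {light_year_miles:.3e} miles")
```

That is roughly 9.46 trillion kilometers, or about 5.88 trillion miles, of driving per year that would have to be recorded, transferred, and stored.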


Data Collection Strategies

[Figure: data matrix with two axes: Information (variables) and Cases/Events/Drivers, showing the two strategies below.]

Instrumented Vehicle Dataset

• Few cases

• Extensive info

• Data stored in the vehicle, physical download later

Triggered Dataset

• Lots of cases

• Limited, targeted info

• Data stored in module, transferred over the air (cellular)


Big Data Progress: Large-Scale Triggered Data Collection

C. Flannagan, D. LeBlanc, et al. (UMTRI); R. Kiefer et al. (GM)

• Evaluated driver behavior in conjunction with two warning systems

• A new approach to large-scale field data on driving

o A large-sample view of how new driver-interactive technologies actually play out – what works

o A possible tool for large-scale feedback to designers and safety advocates, e.g., for safe and smooth deployment of automated vehicles


Study data per ignition cycle:

1. Trip aggregated statistics, e.g., distance, speeds, ranges, night/day, time, etc.

2. Alert-triggered data: kinematics, alerts, settings, system states, lane position, brake status.

[Figure: timeline along a Time axis marked Pre, Alert, and Post. Captured: data at Pre-Alert, data at Alert, and data at Post-Alert plus key data between Alert and Post-Alert. Interval labels: 3-6 sec, 4 sec. Other labels: Data Collection; Analysis of De-Identified Data.]
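
The alert-triggered capture on this slide can be sketched as a small rolling buffer that emits one compact record per alert. This is only an illustration, not the study's onboard software: the class name `TriggeredRecorder`, the sample rate, and the pre/post offsets are all hypothetical, and the sketch stores only the three snapshots (pre-alert, alert, post-alert), not the "key data between Alert and Post-Alert".

```python
from collections import deque


class TriggeredRecorder:
    """Keep a short rolling history and emit one compact record per alert.

    All parameter values are illustrative assumptions, not the study's settings.
    """

    def __init__(self, pre_s=4.0, post_s=4.0, rate_hz=10):
        self.pre_n = max(1, int(pre_s * rate_hz))    # samples from pre-alert to alert
        self.post_n = max(1, int(post_s * rate_hz))  # samples from alert to post-alert
        self.history = deque(maxlen=self.pre_n + 1)  # oldest entry is the pre-alert sample
        self.pending = []   # alerts still waiting for their post-alert sample
        self.records = []   # completed records, ready for over-the-air transfer

    def add_sample(self, sample, alert=False):
        self.history.append(sample)

        # Age pending alerts; complete those whose post-alert time has arrived.
        still_pending = []
        for rec in self.pending:
            rec["remaining"] -= 1
            if rec["remaining"] == 0:
                rec["post"] = sample
                del rec["remaining"]
                self.records.append(rec)
            else:
                still_pending.append(rec)
        self.pending = still_pending

        if alert:
            self.pending.append({
                "pre": self.history[0],  # oldest sample still buffered
                "alert": sample,
                "remaining": self.post_n,
            })


# Usage: one sample per second, alert raised at sample 5.
recorder = TriggeredRecorder(pre_s=2, post_s=3, rate_hz=1)
for t in range(10):
    recorder.add_sample(t, alert=(t == 5))
print(recorder.records)
```

Only the completed records need to leave the vehicle, which is what keeps the per-alert payload small enough for cellular transfer.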


Triggered Data Collection: Onboard Modules

• Participants opted in; no further contact after opt-in email

• Data collected telematically


Data Integration: Benefits of Safety Content

Problem: How effective are different safety technologies in the field? Which are worth the cost?


Benefits of Safety Content

Big Data Solution:

1) Manufacturers provide large sales datasets with safety content linked to Vehicle Identification Number (VIN)

2) UMTRI houses 12 state crash databases with full VIN included

3) Datasets are matched on VIN to search for crashes involving vehicles with and without new technologies
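
The VIN match in step 3 amounts to a keyed join between crash records and safety-content records. The sketch below shows the idea on in-memory rows; the field names (`vin`, `crash_id`, `has_fab`) and the VIN strings are made up for illustration and do not reflect the actual file layouts used in the study.

```python
def normalize_vin(vin):
    """Uppercase and strip a VIN so formatting differences do not block a match."""
    return vin.strip().upper()


def match_crashes_to_content(crash_rows, content_rows):
    """Return crash rows joined to safety-content rows that share a VIN."""
    content_by_vin = {normalize_vin(row["vin"]): row for row in content_rows}
    matched = []
    for crash in crash_rows:
        vin = normalize_vin(crash["vin"])
        content = content_by_vin.get(vin)
        if content is not None:
            # Merge both records; keep the normalized VIN as the key field.
            matched.append({**crash, **content, "vin": vin})
    return matched


# Hypothetical rows: two crashes, one of which involves a vehicle in the sales data.
crashes = [
    {"crash_id": 1, "vin": " 1aaaa00000a000001 "},
    {"crash_id": 2, "vin": "1BBBB00000B000002"},
]
content = [
    {"vin": "1AAAA00000A000001", "has_fab": True},
    {"vin": "1CCCC00000C000003", "has_fab": False},
]
matched = match_crashes_to_content(crashes, content)
print(matched)
```

Crashes with no matching VIN are simply dropped here; a real analysis would also need a comparison group of crashes involving vehicles without the technology.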


NHTSA-Funded Study: Crash Avoidance Technology Evaluation Using Real-World Crash Data

Primary objectives:

• Develop and analyze a large database of crashes in the U.S. that includes vehicle safety content

• Estimate safety benefits of a set of vehicle features

• Envision and build towards a more sustainable approach to incorporating safety content into an analyzable database across multiple OEMs


Data

Crash Data

• >10M crashes from 12 states

Safety Content data:

• Equipment data from GM for 1,215,618 vehicles with Model Year 2013-2015

Result:

• 35,401 matched vehicles in crashes
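
As a quick check on scale, the two counts on this slide imply that only a small fraction of the vehicles in the equipment data appear in the crash data. The calculation below uses only the slide's numbers.

```python
# Counts taken directly from the slide.
vehicles_with_content = 1_215_618
matched_in_crashes = 35_401

match_share = matched_in_crashes / vehicles_with_content
print(f"Matched share: {match_share:.1%}")
```

This works out to about 2.9% of the vehicles, which is why a very large sales dataset is needed before the crash-involved subset is big enough to analyze.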


Safety Content Evaluated

Front Crash Prevention:

• Forward Collision Alert (FCA)

• Front Automatic Braking (FAB)

• Adaptive Cruise Control with FAB

Rear Crash Prevention:

• Rear Vision Camera (RVC)

• Rear Park Assist (RPA)

• Rear Cross-Traffic Alert (RCTA)

• Rear Automatic Braking (RAB)


Frontal Crash Prevention


Backing Crash Prevention


Summary of Results

Largest effectiveness:

• Front Auto Brake effectiveness: 45%

• Rear Auto Brake effectiveness: 83%

* More automated systems produce greater benefits (e.g., AEB vs. FCW, RAB vs. RVC/RPA)
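
The slides do not say how effectiveness was estimated. A common reading of such a figure is the percent reduction in crash rate for equipped vehicles relative to unequipped ones, sketched below. The function name, the choice of exposure measure, and the counts are all hypothetical and were picked only so the example lands on 45%.

```python
def effectiveness(crashes_equipped, exposure_equipped,
                  crashes_unequipped, exposure_unequipped):
    """Fractional reduction in crash rate for equipped vs. unequipped vehicles."""
    rate_equipped = crashes_equipped / exposure_equipped
    rate_unequipped = crashes_unequipped / exposure_unequipped
    return 1.0 - rate_equipped / rate_unequipped


# Hypothetical counts: 55 vs. 100 target crashes per 1,000 units of exposure.
example = effectiveness(55, 1000, 100, 1000)
print(f"Effectiveness: {example:.0%}")
```

Under this reading, 45% means equipped vehicles had a little over half the crash rate of comparable unequipped vehicles.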


Safety Content Data Sharing

• Safety technology is changing very quickly and regular field data analysis is needed to keep up

• BUT, large samples of vehicles are key to earliest analysis

• Pooling OEM safety content supports:

o Faster assessment of safety benefits

o Estimation of variation in performance across vehicle types, conditions, systems

• Pooled data can be anonymized relative to OEM (to prevent competitive comparisons)


Pitfalls of Big Data

1. Consider the sample!!!

• Big Data is typically sampled for convenience from any source available

• Often those who provide data (e.g., smart-phone owners) are not a random sample of those who might be affected by algorithms and results from those data
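
To see why a convenience sample can mislead, the sketch below compares a naive average with one reweighted to known population shares (post-stratification). The slides do not propose this technique; it is used here only as an illustration, and the groups, values, and shares are invented.

```python
def post_stratified_mean(sample, population_share):
    """Average within each group, then weight groups by their population share."""
    groups = {}
    for group, value in sample:
        groups.setdefault(group, []).append(value)
    return sum(
        population_share[group] * sum(values) / len(values)
        for group, values in groups.items()
    )


# Invented data: younger drivers are 80% of the sample but 50% of the population.
sample = [("young", 1.0)] * 8 + [("older", 3.0)] * 2
population_share = {"young": 0.5, "older": 0.5}

naive_mean = sum(value for _, value in sample) / len(sample)
adjusted_mean = post_stratified_mean(sample, population_share)
print(naive_mean, adjusted_mean)
```

Reweighting only helps for characteristics that are measured and known in the population; it cannot correct for differences that the data do not capture.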


Pitfalls of Big Data

2. “Flying blind”

• No video = no ground truth

• Models of how big data relates to drivers and situations are based on assumptions that are ideally tested (but might not be)


Pitfalls of Big Data

3. Human intelligence needed more than ever

• Easy to assume that machine learning and automated methods are providing good answers

• “Big Data requires Big Judgment”


What’s Next?

• New EDRs for AV safety data: SAE committee considering contents

• Exploration of non-crash safety data possibilities

• Need to consider how to capture context information for triggered data, e.g., images, proximity

• Pre-competitive data sharing/pooling for safety (aviation model)


Thank you