Promise, Progress, and Pitfalls in Big Data Analyses of Safety Data
Carol Flannagan, Ph.D.
Director, Center for the Management of Information for Safe and Sustainable Transportation (CMISST)
University of Michigan Transportation Research Institute (UMTRI)
Big Transportation Data
• Crash Data
• Driving Data
• Spatial Data
• V2X Data
• Cell-phone data
• …
Datasets and Linkages
[Diagram of linked datasets. Labels shown: Crash Databases; Exposure Data; Driving Datasets; Vehicle Characteristics; GIS Databases; Other State Datasets; Occupant; Anthropometry Data; Population Demographics; Travel exposure; Vehicle ID Number; Location and Roadway; Surrogates]
Promise of Big Data
Large datasets provide:
• More information (larger sample size)
• More opportunity to find the specific cases of interest (e.g., corner cases)
• More variety of drivers/situations/conditions
• Prospective sample of rare events
IS BIG DATA ALL THE DATA?
[Figure: axes are Information (variables) and Cases/Events/Drivers]
WHY NOT ALL?
• World-wide, vehicles travel more than 1 light-year per year
• Over-the-air data transfer is required for any large-scale collection
- Airtime is still expensive (in very large quantities)
• Manufacturers may (or may not) have all data for their vehicles, but even then, it is usually stored on the vehicle for later download
- Physical download requires access to vehicle, limiting possible sample size
• Production cars are not over-built for either bandwidth or storage
- Cuts into profit margin
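To give a feel for the "1 light-year per year" figure above, here is a rough back-of-the-envelope sketch. The length of a light-year follows from the defined speed of light and the Julian year. The fleet size is a round, hypothetical number chosen only for illustration; it is not taken from the talk.

```python
# Scale check for "vehicles travel more than 1 light-year per year".
SPEED_OF_LIGHT_KM_PER_S = 299_792.458        # exact by definition
SECONDS_PER_JULIAN_YEAR = 365.25 * 86_400    # year used to define a light-year

LIGHT_YEAR_KM = SPEED_OF_LIGHT_KM_PER_S * SECONDS_PER_JULIAN_YEAR

# ASSUMPTION: a hypothetical world fleet of one billion vehicles.
HYPOTHETICAL_FLEET_SIZE = 1_000_000_000


def km_per_vehicle(total_km: float, fleet_size: float) -> float:
    """Average annual distance per vehicle if total_km is shared evenly."""
    return total_km / fleet_size


print(f"1 light-year is about {LIGHT_YEAR_KM:.3e} km")
print(
    "Spread over the hypothetical fleet, that is about "
    f"{km_per_vehicle(LIGHT_YEAR_KM, HYPOTHETICAL_FLEET_SIZE):,.0f} km per vehicle per year"
)
```

Under that assumed fleet size, one light-year works out to an ordinary annual mileage per vehicle, which is why collecting "all the data" quickly becomes a bandwidth and storage problem.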
Data Collection Strategies
[Figure: axes are Information (variables) and Cases/Events/Drivers]
Instrumented Vehicle Dataset
• Few cases
• Extensive info
• Data stored in the vehicle, physical download later
Triggered Dataset
• Lots of cases
• Limited, targeted info
• Data stored in module, transferred over the air (cell)
Big Data Progress: Large-Scale Triggered Data Collection
C. Flannagan, D. LeBlanc, et al. (UMTRI), R. Kiefer et al. (GM)
• Evaluated driver behavior in conjunction with two warning systems
• A new approach to large-scale field data on driving
o A large-sample view of how new driver-interactive technologies actually play out – what works
o A possible tool for large-scale feedback to designers and safety advocates, e.g., for safe and smooth deployment of automated vehicles
Study data per ignition cycle:
1. Trip aggregated statistics, e.g., distance, speeds, ranges, night/day, time, etc.
2. Alert-triggered data: kinematics, alerts, settings, system states, lane position, brake status.
[Timeline figure: Pre-Alert, Alert, and Post-Alert along a Time axis. Data are captured at Pre-Alert, at Alert, and at Post-Alert, plus key data between Alert and Post-Alert. Interval labels shown: "3-6 sec" and "4 sec"]
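The alert-triggered capture described above (data at pre-alert, at alert, and at post-alert) can be sketched roughly as follows. This is only an illustration of the general idea of a rolling buffer plus a trigger, not the onboard module's actual logic. The class name, the signal fields, and the 4-second window lengths are all assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Sample:
    t: float             # seconds since ignition
    speed: float         # hypothetical kinematic signal
    alert: bool = False  # True when a warning system fires on this sample


@dataclass
class TriggeredRecorder:
    pre_window_s: float = 4.0   # ASSUMPTION: look-back before the alert
    post_window_s: float = 4.0  # ASSUMPTION: look-ahead after the alert
    _buffer: deque = field(default_factory=deque)  # recent samples only
    _pending: list = field(default_factory=list)   # alerts awaiting post-alert data
    events: list = field(default_factory=list)     # completed event records

    def add(self, s: Sample) -> None:
        # Drop buffered samples older than the pre-alert window.
        while self._buffer and self._buffer[0].t < s.t - self.pre_window_s:
            self._buffer.popleft()
        # Complete any open event whose post-alert window has elapsed.
        still_open = []
        for ev in self._pending:
            if s.t >= ev["alert"].t + self.post_window_s:
                ev["post"] = s
                self.events.append(ev)
            else:
                still_open.append(ev)
        self._pending = still_open
        # On an alert, snapshot the oldest sample still inside the window.
        if s.alert:
            pre = self._buffer[0] if self._buffer else None
            self._pending.append({"pre": pre, "alert": s, "post": None})
        self._buffer.append(s)


# Usage: a 1 Hz stream with a single alert at t = 10 s.
recorder = TriggeredRecorder()
for second in range(21):
    recorder.add(Sample(t=float(second), speed=25.0, alert=(second == 10)))

event = recorder.events[0]
print(event["pre"].t, event["alert"].t, event["post"].t)
```

Only three snapshots per alert are kept here, which is the point of a triggered design: limited, targeted information that is small enough to send over the air. Alerts still open when the trip ends stay in `_pending` in this sketch and would need separate handling.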
[Figure labels: Data Collection; Analysis of De-Identified Data]
Triggered Data Collection: Onboard Modules
• Participants opted in; no further contact after opt-in email
• Data collected telematically
Data Integration: Benefits of Safety Content
Problem: How effective are different safety technologies in the field? Which are worth the cost?
Benefits of Safety Content
Big Data Solution:
1) Manufacturers provide large sales datasets with safety content linked to Vehicle Identification Number (VIN)
2) UMTRI houses 12 state crash databases with full VIN included
3) Datasets are matched on VIN to search for crashes involving vehicles with and without new technologies
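The VIN-matching step can be illustrated with a small sketch. It is a minimal example of a key-based join, not the study's actual pipeline. The field names (`vin`, `features`, `crash_id`, `crash_type`) and the VIN strings are made up for the example, and real VINs would need more careful cleaning than the simple normalization shown.

```python
from collections import Counter


def normalize_vin(vin):
    """Trim and upper-case a VIN; return '' for missing values."""
    return (vin or "").strip().upper()


def match_on_vin(equipment_rows, crash_rows):
    """Join crash rows to safety content for vehicles present in both files."""
    equipment_by_vin = {normalize_vin(r["vin"]): r for r in equipment_rows}
    matched = []
    for crash in crash_rows:
        equip = equipment_by_vin.get(normalize_vin(crash.get("vin")))
        if equip is not None:
            matched.append({**crash, "features": equip["features"]})
    return matched


# HYPOTHETICAL data: fake VINs and invented feature sets.
equipment = [
    {"vin": "FAKEVIN0000000001", "features": {"FCA"}},
    {"vin": "FAKEVIN0000000002", "features": set()},
    {"vin": "FAKEVIN0000000003", "features": {"FCA", "RVC"}},
]
crashes = [
    {"crash_id": 1, "vin": "fakevin0000000001 ", "crash_type": "front"},
    {"crash_id": 2, "vin": "FAKEVIN0000000002", "crash_type": "backing"},
    {"crash_id": 3, "vin": "FAKEVIN0000000999", "crash_type": "backing"},
    {"crash_id": 4, "vin": None, "crash_type": "front"},
]

matched = match_on_vin(equipment, crashes)
with_fca = Counter("FCA" in m["features"] for m in matched)
print([m["crash_id"] for m in matched], dict(with_fca))
```

Crashes whose VIN is missing or not in the equipment file simply drop out, which is why the matched set is much smaller than either input.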
NHTSA-Funded Study: Crash Avoidance Technology Evaluation Using Real-World Crash Data
Primary objectives:
• Develop and analyze a large database of crashes in the U.S. that includes vehicle safety content
• Estimate safety benefits of a set of vehicle features
• Envision and build towards a more sustainable approach to incorporating safety content into an analyzable database across multiple OEMs
Data
Crash Data:
• >10M crashes from 12 states
Safety Content data:
• Equipment data from GM for 1,215,618 vehicles with Model Year 2013-2015
Result:
• 35,401 matched vehicles in crashes
Safety Content Evaluated
Front Crash Prevention:
• Forward Collision Alert (FCA)
• Front Automatic Braking
• Adaptive Cruise Control with FAB
Rear Crash Prevention:
• Rear Vision Camera (RVC)
• Rear Park Assist (RPA)
• Rear Cross-Traffic Alert (RCTA)
• Rear Automatic Braking (RAB)
Frontal Crash Prevention
Backing Crash Prevention
Summary of Results
Largest effectiveness:
• Front Auto Brake effectiveness: 45%
• Rear Auto Brake effectiveness: 83%
* More automated systems produce greater benefits (e.g., AEB vs. FCW, RAB vs. RVC/RPA)
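For readers unfamiliar with how an "effectiveness" percentage is usually read: one common convention is one minus an odds ratio that compares system-relevant crashes with control crashes, for equipped versus unequipped vehicles. The slides do not state which estimator the study used, so the sketch below is only an illustration of that convention. All counts are made up and were chosen only so that the arithmetic lands on 45%.

```python
def effectiveness_from_counts(
    target_equipped: int,
    control_equipped: int,
    target_unequipped: int,
    control_unequipped: int,
) -> float:
    """Return 1 - odds ratio.

    'target' = crashes the system is meant to prevent;
    'control' = crashes the system should not affect (an exposure proxy).
    """
    odds_equipped = target_equipped / control_equipped
    odds_unequipped = target_unequipped / control_unequipped
    return 1.0 - odds_equipped / odds_unequipped


# HYPOTHETICAL counts, not study data.
eff = effectiveness_from_counts(
    target_equipped=55,
    control_equipped=1000,
    target_unequipped=100,
    control_unequipped=1000,
)
print(f"Illustrative effectiveness: {eff:.0%}")
```

Read this way, 45% effectiveness means equipped vehicles show roughly 45% lower odds of the relevant crash type than comparable unequipped vehicles.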
Safety Content Data Sharing
• Safety technology is changing very quickly and regular field data analysis is needed to keep up
• BUT, large samples of vehicles are key to earliest analysis
• Pooling OEM safety content supports:
o Faster assessment of safety benefits
o Estimation of variation in performance across vehicle types, conditions, systems
• Pooled data can be anonymized relative to OEM (to prevent competitive comparisons)
Pitfalls of Big Data
1. Consider the sample!
• Big Data is typically sampled for convenience from any source available
• Often those who provide data (e.g., smart-phone owners) are not a random sample of those who might be affected by algorithms and results from those data
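The sampling pitfall above can be made concrete with a toy calculation. The sketch compares a naive average from a convenience sample with a post-stratified average that reweights each group to its population share. The strata, shares, and values are invented for the example. Post-stratification is one standard adjustment, not something the slides prescribe.

```python
from collections import defaultdict


def naive_mean(sample):
    """Plain average over (stratum, value) pairs, ignoring who was sampled."""
    return sum(value for _, value in sample) / len(sample)


def poststratified_mean(sample, population_shares):
    """Average of per-stratum means, weighted by population share."""
    by_stratum = defaultdict(list)
    for stratum, value in sample:
        by_stratum[stratum].append(value)
    missing = set(population_shares) - set(by_stratum)
    if missing:
        # Weighting cannot repair a group that was never sampled.
        raise ValueError(f"no sampled cases for strata: {sorted(missing)}")
    return sum(
        share * sum(by_stratum[s]) / len(by_stratum[s])
        for s, share in population_shares.items()
    )


# HYPOTHETICAL data: events per 100 miles by age group.
# The convenience sample over-represents younger drivers (8 of 10 cases).
sample = [("younger", 2.0)] * 8 + [("older", 1.0)] * 2
population_shares = {"younger": 0.5, "older": 0.5}

print(naive_mean(sample))
print(poststratified_mean(sample, population_shares))
```

The naive figure is pulled toward the over-sampled group. Reweighting helps only for characteristics that are observed, so it does not remove bias from differences the data cannot see.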
Pitfalls of Big Data
2. “Flying blind”
• No video = no ground truth
• Models of how big data relates to drivers and situations are based on assumptions that are ideally tested (but might not be)
Pitfalls of Big Data
3. Human intelligence needed more than ever
• Easy to assume that machine learning and automated methods are providing good answers
• “Big Data requires Big Judgment”
What’s Next?
• New EDRs for AV safety data: SAE committee considering contents
• Exploration of non-crash safety data possibilities
• Need to consider how to capture context information for triggered data, e.g., images, proximity
• Pre-competitive data sharing/pooling for safety (aviation model)
Thank you