Driving Predictive Roadway Analytics with the Power of Neo4j

Page 1: Driving Predictive Roadway Analytics with the Power of Neo4j

Driving Predictive Roadway Analytics with the Power of Neo4j

Blake Nelson | Principal of Waveonics
Deve Palakkattukudy | Principal Software Engineer, Mobile Engineering, Agero

October 13, 2016

© 2016 AGERO, INC. PROPRIETARY AND CONFIDENTIAL. A CROSS COUNTRY GROUP COMPANY.

Page 2: Driving Predictive Roadway Analytics with the Power of Neo4j

Enhance Driving Safety and Experience: Detect, Analyze, Predict, Improve

• Agero, Inc. is a leading provider of vehicle and driver safety, security, and information services.
• Waveonics, LLC is a software development and consulting firm.
• Together, they are leveraging Neo4j 3.x, the open source Spatial plug-in, and crowdsourced Open Street Map data to:
  - Detect changing roadway and driving conditions
  - Analyze dynamic conditions for developing trends
  - Predict potential consequences of developing trends
  - Improve driver safety and the driving experience

Page 3: Driving Predictive Roadway Analytics with the Power of Neo4j

Who We Are

44 Years of Experience

• Financial Institutions
• Dealers & Repair Shops
• Auto Manufacturers
• Insurance Companies
• Road & Tow Companies

Page 4: Driving Predictive Roadway Analytics with the Power of Neo4j

Creating Strong and Lasting Connections Between Our Clients and Their Drivers

• 9.5 MILLION EVENTS
• SERVICES IN 75% VEHICLES
• 20 YEARS CONNECTED VEHICLE
• 40 YEARS INDUSTRY LEADER
• 350 APIs

Page 5: Driving Predictive Roadway Analytics with the Power of Neo4j

What Are We Dealing With? (Business)

Agero Connects Service Providers with Drivers
• Protecting over 85 million drivers
• Over 9.5 million service dispatches each year

Multiple Platforms
• Mobile
• Web & Cloud
• Telephone

Multiple Needs
• Roadside Service
• Data Analytics

Page 6: Driving Predictive Roadway Analytics with the Power of Neo4j

Open Street Map (OSM)

Community Driven (Crowdsourced)
• Mappers, GIS professionals, engineers, and humanitarians providing accurate and up-to-date global map data
• Databases, local aerial imagery, GPS devices, low-tech field maps
• Free to use, with credit to OSM and contributors

http://www.openstreetmap.org

Page 7: Driving Predictive Roadway Analytics with the Power of Neo4j

Open Street Map Data Files

Hierarchy of Structural Information: ‘way’, ‘node’, ‘relation’

A ‘way’ (route) is an ordered sequence of ‘nodes’
• ‘way’ ID and sequence of ‘node’ IDs
• May have ‘tags’, which are key/value properties

A ‘node’ (point) is the lowest level
• Latitude, longitude, and ‘node’ ID
• May have ‘tags’, which are key/value properties

A ‘relation’ is the highest (semantic) level
• ‘relation’ ID and…
  1. Sequence of other relations
  2. Sequence of ways
  3. Sequence of nodes
• May have ‘tags’, which are key/value properties

Page 8: Driving Predictive Roadway Analytics with the Power of Neo4j

What does OSM data look like?

Lat/Lon points, with properties:

<node id="261728686" lat="54.0906309" lon="12.2441924" />
<node id="1831881213" lat="54.0900666" lon="12.2539381">
  <tag k="name" v="Neu Broderstorf"/>
  <tag k="traffic_sign" v="city_limit"/>
</node>

Road segment, with properties:

<way id="26659127" >
  <nd ref="292403538"/>
  <nd ref="298884289"/>
  ...
  <nd ref="261728686"/>
  <tag k="highway" v="unclassified"/>
  <tag k="name" v="Pastower Straße"/>
</way>

Note: a number of attributes (timestamps, version, user, changeset, etc.) were removed for brevity.
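As a rough illustration of how the node and way elements on this slide map onto simple data structures, the following sketch parses the excerpt with Python's standard library. The wrapping `<osm>` root element and the `parse_osm` helper are assumptions added for the example, and the `<nd>` entries elided on the slide are simply left out.

```python
import xml.etree.ElementTree as ET

# Excerpt from the slide, wrapped in an assumed <osm> root element.
# The <nd> references elided on the slide ("...") are omitted here.
OSM_XML = """<osm>
  <node id="261728686" lat="54.0906309" lon="12.2441924" />
  <node id="1831881213" lat="54.0900666" lon="12.2539381">
    <tag k="name" v="Neu Broderstorf"/>
    <tag k="traffic_sign" v="city_limit"/>
  </node>
  <way id="26659127">
    <nd ref="292403538"/>
    <nd ref="298884289"/>
    <nd ref="261728686"/>
    <tag k="highway" v="unclassified"/>
    <tag k="name" v="Pastower Straße"/>
  </way>
</osm>"""


def parse_osm(xml_text):
    """Return (nodes, ways) dictionaries keyed by OSM ID (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    nodes = {}
    for node in root.findall("node"):
        nodes[node.get("id")] = {
            "lat": float(node.get("lat")),
            "lon": float(node.get("lon")),
            "tags": {t.get("k"): t.get("v") for t in node.findall("tag")},
        }
    ways = {}
    for way in root.findall("way"):
        ways[way.get("id")] = {
            # Ordered node references: the order defines the road geometry.
            "node_refs": [nd.get("ref") for nd in way.findall("nd")],
            "tags": {t.get("k"): t.get("v") for t in way.findall("tag")},
        }
    return nodes, ways


nodes, ways = parse_osm(OSM_XML)
print(ways["26659127"]["tags"]["name"])
```

Keeping the node references as an ordered list preserves the "ordered sequence of nodes" that defines a way.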

Page 9: Driving Predictive Roadway Analytics with the Power of Neo4j

Nodes and Ways are Relationships

[Figure: OSM map view (labels: 1 Cabot Rd #4, Revere Beach Parkway) with callouts for the OSM way, the nodes in the way, and its tags (properties)]

Page 10: Driving Predictive Roadway Analytics with the Power of Neo4j

Scale and Complexity

OSM is Complex and Dynamic Information
• Roads, trails, waterways, regions, points of interest
• 1.2M adds, 302K mods, 120K dels daily

OSM is a Large, Rich Data Set. For North America:
• 862.4M ‘nodes’ (lat/lon points)
• 60.4M ‘ways’ (sequences of related nodes)
• 332M ‘tags’ (properties of nodes & ways)
• 972K ‘relations’ (between ways and nodes)

Page 11: Driving Predictive Roadway Analytics with the Power of Neo4j

What Do We Want To Do?

Know about way segments
• Surface / speeds / widths
• Lanes / bridges / tunnels
• Intersections / access
• Hazards / obstructions

EVERY WAY IS UNIQUE

Page 12: Driving Predictive Roadway Analytics with the Power of Neo4j

Why Graph Database

Maps are graphs
• Travel from start (node) to end (node) along roadways (edges)
• Turns only possible at intersections (nodes)
• Road segments have properties (speed, surface, lanes, etc.)

https://www.researchgate.net/figure/221252013_fig1_Fig-1-a-Example-of-road-map-extracted-from-a-city-street-map-b-Zones-shown-in

https://neo4j.com/blog/neo4j-3-0-massive-scale-developer-productivity/
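The "maps are graphs" idea can be sketched in a few lines of Python: intersections become nodes, road segments become directed edges, and segment properties feed the routing cost. The network, the property names (`km`, `kmh`), and the `fastest_route` helper below are all made up for illustration; this is a plain in-memory Dijkstra search, not how Neo4j stores or traverses a graph.

```python
import heapq

# Hypothetical road network: intersections are nodes, road segments are
# directed edges carrying properties (length in km, speed limit in km/h).
ROADS = {
    "A": [("B", {"km": 2.0, "kmh": 50}), ("C", {"km": 5.0, "kmh": 100})],
    "B": [("D", {"km": 3.0, "kmh": 50})],
    "C": [("D", {"km": 1.0, "kmh": 100})],
    "D": [],
}


def fastest_route(graph, start, end):
    """Dijkstra's algorithm using travel time in hours as the edge cost."""
    queue = [(0.0, start, [start])]
    settled = {}
    while queue:
        hours, node, path = heapq.heappop(queue)
        if node == end:
            return hours, path
        if node in settled and settled[node] <= hours:
            continue  # already reached this intersection at least as fast
        settled[node] = hours
        for neighbor, props in graph[node]:
            cost = props["km"] / props["kmh"]
            heapq.heappush(queue, (hours + cost, neighbor, path + [neighbor]))
    return None  # no route exists


hours, path = fastest_route(ROADS, "A", "D")
print(path, round(hours * 60, 1), "minutes")
```

Here the longer route through C wins because its segments have a higher speed property, which is the kind of question that edge properties make easy to ask.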

Page 13: Driving Predictive Roadway Analytics with the Power of Neo4j

Why Neo4j 3?

Bolt interface
• Native drivers for Python, Java, JavaScript, .NET
• Our Data Scientists work in Python

Stored Procedures
• Callable from Cypher through Bolt
• Develop on client, migrate to server

[Diagram: Applications, Neo4j Execution Engine, Java Stored Procedure]
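A minimal sketch of what querying over Bolt from Python might look like, using the official `neo4j` driver package. The `Way` label, the `osm_id`, `name`, and `highway` properties, and both helper functions are assumptions for illustration; the schema actually used in this project is not shown in the slides.

```python
# Hypothetical schema: nodes labelled Way with osm_id, name and highway
# properties. None of these names come from the slides.
WAY_QUERY = (
    "MATCH (w:Way {osm_id: $osm_id}) "
    "RETURN w.name AS name, w.highway AS highway"
)


def fetch_way(session, osm_id):
    """Run the parameterized Cypher query on an already open session."""
    result = session.run(WAY_QUERY, osm_id=osm_id)
    return [dict(record) for record in result]


def fetch_way_from_server(uri, user, password, osm_id):
    """Open a Bolt connection and look up one way (needs the neo4j package)."""
    from neo4j import GraphDatabase  # imported lazily so the sketch loads without it

    driver = GraphDatabase.driver(uri, auth=(user, password))
    try:
        with driver.session() as session:
            return fetch_way(session, osm_id)
    finally:
        driver.close()
```

A call such as `fetch_way_from_server("bolt://localhost:7687", "neo4j", "<password>", 26659127)` would need a running server. Stored procedures are invoked the same way, by sending a Cypher `CALL` statement through `session.run`.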

Page 14: Driving Predictive Roadway Analytics with the Power of Neo4j

Neo4j Spatial Uses RTree Indexing

• Rectangular ‘bounding boxes’ (bboxes)
• Points close to each other are often in the same bounding box
• Performance problems at scale
  - Bboxes can overlap
  - Close points can be in different bboxes
  - Poor worst-case performance
• Maintenance overhead: splitting, rebalancing, etc.

https://en.wikipedia.org/wiki/R-tree
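To make the overlap problem concrete, here is a small sketch with two made-up sibling bounding boxes. The `BBox` type, the coordinates, and `boxes_to_search` are illustrative assumptions and are not taken from the Neo4j Spatial code. When boxes overlap, a point lookup has to descend into every box that contains the point.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def contains(self, lon, lat):
        return (self.min_lon <= lon <= self.max_lon
                and self.min_lat <= lat <= self.max_lat)

    def overlaps(self, other):
        return (self.min_lon <= other.max_lon and other.min_lon <= self.max_lon
                and self.min_lat <= other.max_lat and other.min_lat <= self.max_lat)


# Two hypothetical sibling boxes at one level of an R-tree (values made up).
left = BBox(-71.10, 42.35, -71.05, 42.42)
right = BBox(-71.07, 42.38, -71.00, 42.45)


def boxes_to_search(boxes, lon, lat):
    """Every box containing the point must be searched."""
    return [box for box in boxes if box.contains(lon, lat)]


print(left.overlaps(right))                              # True
print(len(boxes_to_search([left, right], -71.06, 42.40)))  # 2: both subtrees
print(len(boxes_to_search([left, right], -71.09, 42.36)))  # 1: only the left box
```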

Page 15: Driving Predictive Roadway Analytics with the Power of Neo4j

Use GeoHash 1D Index

Map 2D to 1D space
• Create a bit string by slicing the world by longitude & latitude
• Convert the bit string into characters
• Similar strings are usually close

https://mapzen.com/blog/geohashes-and-you/

Page 17: Driving Predictive Roadway Analytics with the Power of Neo4j

Encoding is Simple and Fast

Lat/Lon → bits → base32
Division and bit shift
• (37.77564, -122.41365)
• “0100110110010001111011110”
• “9q8yy”

Binary   01001  10110  01000  11110  11110
Decimal  9      22     8      30     30
Base 32  9      q      8      y      y
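The encoding on this slide can be reproduced with a short Python sketch. It uses repeated interval halving (longitude first, then latitude, alternating) rather than the division and bit shift the slide mentions. The function names are assumptions, while the base32 alphabet is the standard geohash one.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_bits(lat, lon, nbits):
    """Interleave bisection bits, starting with longitude."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    bits = []
    for i in range(nbits):
        # Even bit positions refine longitude, odd positions refine latitude.
        rng, value = (lon_range, lon) if i % 2 == 0 else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits.append("1")
            rng[0] = mid  # keep the upper half
        else:
            bits.append("0")
            rng[1] = mid  # keep the lower half
    return "".join(bits)


def geohash_encode(lat, lon, length=5):
    """Pack each group of 5 bits into one base32 character."""
    bits = geohash_bits(lat, lon, 5 * length)
    return "".join(BASE32[int(bits[i:i + 5], 2)] for i in range(0, len(bits), 5))


print(geohash_bits(37.77564, -122.41365, 25))  # 0100110110010001111011110
print(geohash_encode(37.77564, -122.41365))    # 9q8yy
```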

Page 18: Driving Predictive Roadway Analytics with the Power of Neo4j

Decoding is Simple and Fast

Reverse the Process to Decode
• Base32 → Bits → Latitude/Longitude

Base 32    9      q      8      y      y
Decimal    9      22     8      30     30
Binary     01001  10110  01000  11110  11110
Longitude  0-0-1  -0-1-  0-0-0  -1-1-  1-1-0
Latitude   -1-0-  1-1-0  -1-0-  1-1-0  -1-1-
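Decoding can be sketched the same way: each base32 character yields five bits, even bit positions narrow the longitude interval and odd positions narrow the latitude interval. The `geohash_decode` name and its return shape are assumptions for this example.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_decode(geohash):
    """Return ((lat_lo, lat_hi), (lon_lo, lon_hi)) for the cell named by geohash."""
    lat = [-90.0, 90.0]
    lon = [-180.0, 180.0]
    is_lon = True  # the first bit always refines longitude
    for char in geohash:
        value = BASE32.index(char)
        for shift in range(4, -1, -1):  # most significant bit first
            bit = (value >> shift) & 1
            rng = lon if is_lon else lat
            mid = (rng[0] + rng[1]) / 2
            if bit:
                rng[0] = mid  # 1 keeps the upper half
            else:
                rng[1] = mid  # 0 keeps the lower half
            is_lon = not is_lon
    return (lat[0], lat[1]), (lon[0], lon[1])


(lat_lo, lat_hi), (lon_lo, lon_hi) = geohash_decode("9q8yy")
print(lat_lo, lat_hi, lon_lo, lon_hi)
```

The result is a bounding box rather than a single point: a 5-character geohash only pins the location down to a cell, and the original coordinate (37.77564, -122.41365) falls inside it.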

Page 19: Driving Predictive Roadway Analytics with the Power of Neo4j

Quick to Determine Closeness

CITY           GEOHASH      LATITUDE  LONGITUDE
San Francisco  9q8yym901hw  37.77926  -122.41923
Oakland        9q9p1d5zfks  37.80531  -122.27258
Berkeley       9q9p3tvj8uf  37.86947  -122.27093
Los Angeles    9q5ctr60zyr  34.05366  -118.24276
New York City  dr5regw2z6y  40.71273  -74.00599
London         gcpvn0ntjut  51.50479  -0.07871
Greenwich      u10hb5403uy  51.47651  0.00283

Things that are close are in the same GeoHash region or a neighboring GeoHash region
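One simple way to read closeness off the table is to count how many leading characters two geohashes share. The sketch below uses the geohashes from the slide; the `shared_prefix_len` helper is an assumption for illustration.

```python
# Geohashes copied from the table on this slide.
CITIES = {
    "San Francisco": "9q8yym901hw",
    "Oakland": "9q9p1d5zfks",
    "Berkeley": "9q9p3tvj8uf",
    "Los Angeles": "9q5ctr60zyr",
    "New York City": "dr5regw2z6y",
    "London": "gcpvn0ntjut",
    "Greenwich": "u10hb5403uy",
}


def shared_prefix_len(a, b):
    """Number of leading characters two geohashes have in common."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


for first, second in [("Oakland", "Berkeley"),
                      ("San Francisco", "Oakland"),
                      ("San Francisco", "New York City"),
                      ("London", "Greenwich")]:
    print(first, second, shared_prefix_len(CITIES[first], CITIES[second]))
```

Oakland and Berkeley share four characters, San Francisco and Oakland share two, and San Francisco and New York City share none. London and Greenwich also share none even though they are close, because they sit on opposite sides of the prime meridian. That is why neighboring GeoHash regions have to be checked as well as the same region.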

Page 20: Driving Predictive Roadway Analytics with the Power of Neo4j

Fast Indexing with GeoHash Trie

Trie data structure using GeoHash
• Walk tree using GeoHash string
• Leaf node identifies Bbox

Close things are identified by:
• Same leaf node
• Leaf node of Neighboring GeoHash

‘Fits’ with Neo4j relationships

[Figure: trie path from the GeoHash Root through 9 → q → 8 → y → y for San Francisco (9q8yy)]
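The trie walk described here can be sketched with nested dictionaries, one level per geohash character. This is an in-memory illustration only: the `GeoHashTrie` class and its `_items` marker key are assumptions, whereas the talk describes a trie built from Neo4j nodes and relationships.

```python
class GeoHashTrie:
    """Minimal in-memory trie keyed by geohash characters (illustrative only)."""

    def __init__(self):
        self.root = {}

    def insert(self, geohash, item):
        node = self.root
        for char in geohash:
            node = node.setdefault(char, {})  # one level per character
        # "_items" cannot clash with a child key: child keys are single characters.
        node.setdefault("_items", set()).add(item)

    def lookup(self, geohash):
        """Walk the trie along the geohash and return the items stored at its end."""
        node = self.root
        for char in geohash:
            node = node.get(char)
            if node is None:
                return set()
        return set(node.get("_items", ()))


trie = GeoHashTrie()
trie.insert("9q8yy", "San Francisco")
print(trie.lookup("9q8yy"))  # {'San Francisco'}
print(trie.lookup("9q8yz"))  # set()
```

A lookup costs one step per geohash character regardless of how many items are indexed, which is what makes the walk fast.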

Page 21: Driving Predictive Roadway Analytics with the Power of Neo4j

One OSM Way Intersects Several GeoHash Regions

Walk Relationships Using GeoHash String

[Figure: an OSM Way crossing the GeoHash regions drt3j2yc, drt3j2y8, drt3j2ww, drt3j2wx, and drt3j2wm, with the path from Trie Root to Trie Leaf]

Page 22: Driving Predictive Roadway Analytics with the Power of Neo4j

Why Graph Database? Why Neo4j?

Supports and Models our Business Domain
• Directed Graph Models Roadways with Properties
• Spatial Plugin Supports OSM Data We Depend Upon

Bolt Supports our Data Scientists
• Python for Machine Learning and Predictive Analytics
• Java for our Developers
• Access to Stored Procedures when Needed

Open Source Code / Open Source Community
• Ability to Customize for Data Model and Performance Needs (e.g. Indexing)
• Add Features with Plugin Technology

https://github.com/codeforamerica/DemoDexter

Page 23: Driving Predictive Roadway Analytics with the Power of Neo4j

Questions

Page 24: Driving Predictive Roadway Analytics with the Power of Neo4j

Backup & Errata

Page 25: Driving Predictive Roadway Analytics with the Power of Neo4j

Select Node Component

Terminal nodes of this way are shared with connecting ways

A way is split into different ways where a property (or properties) differs
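As a small sketch of the point above, connecting ways can be found by checking which other ways contain one of a way's terminal nodes. The way IDs, node IDs, tags, and the `connecting_ways` helper are all made up for illustration.

```python
# Hypothetical ways: each has an ordered list of node IDs plus tags.
# way_a and way_b are the same road, split where the bridge property changes.
WAYS = {
    "way_a": {"nodes": [1, 2, 3], "tags": {"highway": "primary"}},
    "way_b": {"nodes": [3, 4, 5], "tags": {"highway": "primary", "bridge": "yes"}},
    "way_c": {"nodes": [7, 8], "tags": {"highway": "residential"}},
}


def connecting_ways(ways, way_id):
    """IDs of other ways that contain one of this way's terminal nodes."""
    nodes = ways[way_id]["nodes"]
    terminals = {nodes[0], nodes[-1]}
    return sorted(
        other_id
        for other_id, other in ways.items()
        if other_id != way_id and terminals & set(other["nodes"])
    )


print(connecting_ways(WAYS, "way_a"))  # ['way_b']
```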

Page 26: Driving Predictive Roadway Analytics with the Power of Neo4j

Node is Displayed – Part of 2 Ways

2 (connecting) Ways for Terminal Node

Page 27: Driving Predictive Roadway Analytics with the Power of Neo4j

Connected Way

Page 28: Driving Predictive Roadway Analytics with the Power of Neo4j

Property Change for Way – One is a Bridge

Page 29: Driving Predictive Roadway Analytics with the Power of Neo4j

Graph Data Model Superior for Relationships

Page 30: Driving Predictive Roadway Analytics with the Power of Neo4j

Graph DB Model Optimized for Traversals

The division of storage facilitates graph traversals
• Nodes and Relationships

• A graph is nodes (vertices) and edges (relations)
• Storage is structured on nodes connected by relations
• Locality of reference optimizes graph traversal

Page 31: Driving Predictive Roadway Analytics with the Power of Neo4j

Neo4j Physical Storage

[Diagram labels: Properties, Node, Label, Relation]

Page 32: Driving Predictive Roadway Analytics with the Power of Neo4j

Open Street Map in Neo4j

Page 33: Driving Predictive Roadway Analytics with the Power of Neo4j

Neo4j OSM Spatial Plugin

Page 34: Driving Predictive Roadway Analytics with the Power of Neo4j

Neo4j Spatial Stored Procedures