Page 1: Intro to Neo4j and Graph Databases - gotocon.comgotocon.com/dl/goto-aar-2014/slides/DavidMontag_LearnHowGraphs... · Intro to Neo4j ! and Graph Databases David Montag ... WWW Indexing

Intro to Neo4j and Graph Databases

David Montag Neo Technology



Early Adopters of Graph Technology


Evolution of Web Search: Survival of the Fittest

Pre-1999: WWW Indexing (Discrete Data)

1999-2012: Google Invents PageRank (Connected Data, Simple)

2012-?: Google Knowledge Graph, Facebook Graph Search (Connected Data, Rich)


Evolution of Online Recruiting: Survival of the Fittest

1999: Keyword Search (Discrete Data)

2011-12: Social Discovery (Connected Data)


Neo Technology, Inc Confidential

Early Adopter Segments (what we expected to happen: view from several years ago)

Core Industries: Software, Financial Services, Telecommunications

Use Cases: Network & Data Center Management, Master Data Management, Social, Geo


Neo4j Adoption Snapshot: Select Commercial Customers*

*Community Users Not Included

Core Industries: Software, Financial Services, Telecommunications

Use Cases: Network & Data Center Management, Master Data Management, Social, Geo

Other label on the slide: Finance


Neo4j Adoption Snapshot: Select Commercial Customers*

*Community Users Not Included

Core Industries:
• Software (Web / ISV)
• Financial Services
• Telecommunications
• Health Care & Life Sciences
• Web Social, HR & Recruiting
• Media & Publishing
• Energy, Services, Automotive, Gov’t, Logistics, Education, Gaming, Other

Use Cases:
• Network & Data Center Management
• MDM / System of Record (Master Data Management)
• Social
• Geo
• Recommendations
• Identity & Access Mgmt
• Content Management
• BI, CRM, Impact Analysis, Fraud Detection, Resource Optimization, etc.

Other labels on the slide: Accenture, Aviation, Finance, Finance IT


What’s a Graph Database?


[Figure: example property graph]

Nodes:
• Account (balance: $100, ovd_prot: false)
• Customer (name: David, address: …)
• Branch (location: Menlo Park, CA)
• Employee (name: Mike)

Relationships: MANAGES, CUSTOMER_OF, OWNS, CREATED_AT

Relationship property: since: 2010
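
The property graph on this slide can be sketched in a few lines of Python. This is only an illustration, not how Neo4j stores data. The transcript does not preserve which node each relationship connects or which relationship carries `since: 2010`, so the endpoints below and the `neighbours` helper are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    props: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node
    rel_type: str
    end: Node
    props: dict = field(default_factory=dict)


# Nodes and properties as listed on the slide (address is elided there, so it is omitted).
account = Node("Account", {"balance": 100, "ovd_prot": False})
customer = Node("Customer", {"name": "David"})
branch = Node("Branch", {"location": "Menlo Park, CA"})
employee = Node("Employee", {"name": "Mike"})

# ASSUMPTION: the endpoints and the placement of `since` are guesses;
# the slide transcript only preserves the relationship names.
relationships = [
    Relationship(customer, "OWNS", account),
    Relationship(customer, "CUSTOMER_OF", branch, {"since": 2010}),
    Relationship(account, "CREATED_AT", branch),
    Relationship(employee, "MANAGES", branch),
]


def neighbours(node, rel_type):
    """Follow outgoing relationships of one type from a node (hypothetical helper)."""
    return [r.end for r in relationships if r.start is node and r.rel_type == rel_type]
```

For example, `neighbours(customer, "OWNS")` returns the account node directly, with no ID lookup in between.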


Customers
143  Alice

Accounts
326  $100
725  $632
981  $212

Customer_Accounts
143  981
143  725
143  326


[Figure: the same rows drawn as connected records]

143  Alice

326  $100
725  $632
981  $212

143  981
143  725
143  326


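[Figure: the same data as a graph]

Nodes: (name: Alice), (bal: $100), (bal: $632), (bal: $212)

Relationships: three OWNS relationships, one from the Alice node to each account node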
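
The move from tables to nodes and relationships can be sketched as follows. The data is taken from the slides, while the function names and the in-memory `Node` class are hypothetical and only meant to contrast an ID-matching lookup with following direct references.

```python
# Relational form: three tables, linked only by matching IDs.
customers = {143: "Alice"}
accounts = {326: 100, 725: 632, 981: 212}
customer_accounts = [(143, 981), (143, 725), (143, 326)]


def balances_via_join(customer_id):
    """Scan the join table, then look each account ID up in the accounts table."""
    return sorted(accounts[a] for c, a in customer_accounts if c == customer_id)


# Graph form: the owner node holds direct references to the account nodes.
class Node:
    def __init__(self, **props):
        self.props = props
        self.owns = []  # outgoing OWNS relationships


alice = Node(name="Alice")
for bal in (100, 632, 212):
    alice.owns.append(Node(bal=bal))


def balances_via_graph(owner):
    """Follow the OWNS references; no ID matching is needed."""
    return sorted(n.props["bal"] for n in owner.owns)
```

Both functions return the same balances; the difference is that the graph form reaches the accounts by traversal instead of by comparing IDs across tables.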


Quick Demo


Graph history, benefits & differentiators


Not Only SQL

[Figure: database landscape]

Category labels: NoSQL, RDBMS, Analytics

Products shown: Neo4j, Teradata, Hadoop, MySQL, DB2, Sybase, Postgres, Oracle, Cassandra, MongoDB, Riak, Redis, Couchbase, GigaSpaces, Coherence


Built ground-up for graphs

From the storage layer to the query language, graphs are native to Neo4j.

Relational databases do it very poorly: a rigid schema, and costly joining of IDs is required on every lookup.

[Figure: a Person row (John, 326) joined by ID to an Account row (326, BofA #1234)]

Other NoSQL databases don’t do it at all.

[Figure label: mongo]


RDBMS vs. Native Graph Database: Connected Query Performance

[Figure: Response Time (y-axis) plotted against Connectedness of Data Set (x-axis)]

RDBMS: Degree: < 3; Size: Thousands; # Hops: 0-1

Neo4j: Degree: Thousands+; Size: Billions+; # Hops: Tens to Hundreds

Callout: 1000x faster


(SELECT T.directReportees AS directReportees, sum(T.count) AS count
 FROM (
   SELECT manager.pid AS directReportees, 0 AS count
   FROM person_reportee manager
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   UNION
   SELECT manager.pid AS directReportees, count(manager.directly_manages) AS count
   FROM person_reportee manager
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   GROUP BY directReportees
   UNION
   SELECT manager.pid AS directReportees, count(reportee.directly_manages) AS count
   FROM person_reportee manager
   JOIN person_reportee reportee ON manager.directly_manages = reportee.pid
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   GROUP BY directReportees
   UNION
   SELECT manager.pid AS directReportees, count(L2Reportees.directly_manages) AS count
   FROM person_reportee manager
   JOIN person_reportee L1Reportees ON manager.directly_manages = L1Reportees.pid
   JOIN person_reportee L2Reportees ON L1Reportees.directly_manages = L2Reportees.pid
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   GROUP BY directReportees
 ) AS T
 GROUP BY directReportees)
UNION
(SELECT T.directReportees AS directReportees, sum(T.count) AS count
 FROM (
   SELECT manager.directly_manages AS directReportees, 0 AS count
   FROM person_reportee manager
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   UNION
   SELECT reportee.pid AS directReportees, count(reportee.directly_manages) AS count
   FROM person_reportee manager
   JOIN person_reportee reportee ON manager.directly_manages = reportee.pid
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   GROUP BY directReportees
   UNION
   SELECT L1Reportees.pid AS directReportees, count(L2Reportees.directly_manages) AS count
   FROM person_reportee manager
   JOIN person_reportee L1Reportees ON manager.directly_manages = L1Reportees.pid
   JOIN person_reportee L2Reportees ON L1Reportees.directly_manages = L2Reportees.pid
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   GROUP BY directReportees
 ) AS T
 GROUP BY directReportees)
UNION
(SELECT T.directReportees AS directReportees, sum(T.count) AS count
 FROM (
   SELECT reportee.directly_manages AS directReportees, 0 AS count
   FROM person_reportee manager
   JOIN person_reportee reportee ON manager.directly_manages = reportee.pid
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   GROUP BY directReportees
   UNION
   SELECT L2Reportees.pid AS directReportees, count(L2Reportees.directly_manages) AS count
   FROM person_reportee manager
   JOIN person_reportee L1Reportees ON manager.directly_manages = L1Reportees.pid
   JOIN person_reportee L2Reportees ON L1Reportees.directly_manages = L2Reportees.pid
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   GROUP BY directReportees
 ) AS T
 GROUP BY directReportees)
UNION
(SELECT L2Reportees.directly_manages AS directReportees, 0 AS count
 FROM person_reportee manager
 JOIN person_reportee L1Reportees ON manager.directly_manages = L1Reportees.pid
 JOIN person_reportee L2Reportees ON L1Reportees.directly_manages = L2Reportees.pid
 WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName"))

MATCH (boss)-[:MANAGES*0..3]->(sub),
      (sub)-[:MANAGES*1..3]->(report)
WHERE boss.name = "John Doe"
RETURN sub.name AS Subordinate, count(report) AS Total

Cypher vs SQL
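
To make the variable-length patterns in the Cypher query concrete, here is a plain-Python sketch of roughly what it computes: every subordinate within 0 to 3 MANAGES hops of the boss, together with the number of people within 1 to 3 hops below that subordinate. The names and the hierarchy are made up, and the sketch assumes the hierarchy is a tree, so that counting paths and counting people give the same result. It is not how Neo4j executes the query.

```python
from collections import deque

# Hypothetical MANAGES edges: manager -> direct reports (assumed to form a tree).
MANAGES = {
    "John Doe": ["Ann", "Bob"],
    "Ann": ["Cid", "Dee"],
    "Bob": ["Eve"],
    "Cid": ["Fay"],
}


def within_hops(start, min_hops, max_hops):
    """People reachable from start in min_hops..max_hops MANAGES steps."""
    found = []
    queue = deque([(start, 0)])
    while queue:
        person, depth = queue.popleft()
        if depth >= min_hops:
            found.append(person)
        if depth < max_hops:
            for report in MANAGES.get(person, []):
                queue.append((report, depth + 1))
    return found


def report_totals(boss):
    """Mirror of the query: subordinate name -> count of reports below them."""
    totals = {}
    for sub in within_hops(boss, 0, 3):      # (boss)-[:MANAGES*0..3]->(sub)
        reports = within_hops(sub, 1, 3)     # (sub)-[:MANAGES*1..3]->(report)
        if reports:  # as with MATCH, a subordinate with no reports yields no row
            totals[sub] = len(reports)
    return totals
```

With this sample hierarchy, `report_totals("John Doe")` gives `{"John Doe": 6, "Ann": 3, "Bob": 1, "Cid": 1}`.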


Forrester estimates that over 25% of enterprises will be using graph databases by 2017 to support the next-generation applications that need connected data sets.

– Forrester Research (TechRadar: Enterprise DBMS, Q1 2014)

"… they are the solution that can deliver truly new insights from data."

– Svetlana Sicular, Research Director, Gartner


4 Case Studies


Real-time Logistics Routing (Logistics)

Challenge

eCommerce Delivery: Changing Network Dynamics

Hierarchical Routing System Did Not Support Point-to-Point Deliveries

[Figure: delivery between points A and B]


Real-time Logistics Routing (Logistics)

5M Packages Per Day. 3K Per Second.

Other examples

[Figure: route between points A and B]

Solution

Model the Logistics Network as a Graph
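
As a rough illustration of what modelling a delivery network as a graph makes possible, the sketch below finds the cheapest route between two points with Dijkstra's algorithm. The slides do not say which algorithm the customer used; the locations, costs, and the `cheapest_route` function are all hypothetical.

```python
import heapq

# Hypothetical network: location -> {neighbouring location: transit cost}.
NETWORK = {
    "A": {"Hub1": 4, "Hub2": 2},
    "Hub1": {"B": 5},
    "Hub2": {"Hub1": 1, "B": 8},
    "B": {},
}


def cheapest_route(source, target):
    """Dijkstra's algorithm: return (total cost, path) or None if unreachable."""
    queue = [(0, source, [source])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in settled:
            continue  # a cheaper route to this node was already processed
        settled.add(node)
        for nxt, weight in NETWORK[node].items():
            if nxt not in settled:
                heapq.heappush(queue, (cost + weight, nxt, path + [nxt]))
    return None
```

Because routes are just paths in the graph, a point-to-point delivery from A to B does not need to follow a fixed hierarchy: here the cheapest route is A, Hub2, Hub1, B at a cost of 8.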


Content Management

Challenge

• Next-gen site required 360° deep view of any entity in the system

• RDBMS environment slow, difficult to manage and grow

Results

• Next-gen site deployed on Neo4j

• Statistics and drill-downs are easily created & customized


Route Planning

Challenge

• Maintain large network of routes covering many carriers and couriers

• MySQL-based solution not fast enough for real-time use

Results

• 50x less code, 2000x faster calculations

• Complete ownership of data, and flexibility to modify algorithms


Support Case Avoidance

[Figure: nodes labeled Support Case, Knowledge Base Article, Solution, and Message]

Challenge

• Support cost & resolution times too high

• RDBMS infrastructure did not support expansion

Results

• Faster answers for customers, with lower reliance on support

"Relational databases have a hard time dealing with the complexities of connected data."

– Prem Malhotra, Director Enterprise Architecture


MDM / Recommendations

Challenge

• Constructing a 360° view of the customer for the sales team

• IBM DB2 system not able to meet performance requirements

Results

• Flexibly search for insurance policies and associated personal data

• Migration and deployment was easy


Patient transition & referral

Challenge

• Real-time search on Oracle not fast enough for next gen product

Results

• Handling 15% of all transitions nationwide in the US

• Real-time deep recommendations on widely heterogeneous data


Come talk to us about the graphs you see!

Upcoming events:

• Stockholm Training on Oct 17

• Øredev Training on Nov 4