Switching from the Relational to the Graph model

Luca Garulli, Founder and CEO @NuvolaBase Ltd, Author of OrientDB Doc/Graph DB

Oct 6th 2012 in Barcelona

(c) Luca Garulli. Licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License. www.orientechnologies.com

Upload: luca-garulli

Post on 10-May-2015

DESCRIPTION

One of the main reasons RDBMS users resist moving to a NoSQL product is the complexity of the model: OK, NoSQL products are great for Big Data and big scale, but what about the model?

TRANSCRIPT

Page 1: Switching from the Relational to the Graph model

Luca Garulli, Founder and CEO @NuvolaBase Ltd, Author of OrientDB Doc/Graph DB

Oct 6th 2012 in Barcelona

Switching from the Relational to the Graph model

Page 2: Switching from the Relational to the Graph model

One of the main reasons RDBMS users resist moving to a NoSQL product is the complexity of the model:

OK, NoSQL products are great for Big Data and big scale

but...

Page 3: Switching from the Relational to the Graph model

...but what about the model?

Page 4: Switching from the Relational to the Graph model

What is the NoSQL answer to managing complex domains?

Key-Value stores

Column-Based stores

Document databases

Graph databases

Page 5: Switching from the Relational to the Graph model

CAUTION! This presentation will not use a social-like domain with the classic friend-of-friend^N paradigm, where graph databases are already widely used...

Page 6: Switching from the Relational to the Graph model

...But rather we will explore how to think «graphically» with one of the most common domains in the enterprise world:

The good old CRM* domain

* today an RDBMS is used in 99% of cases

Page 7: Switching from the Relational to the Graph model

Every developer knows the Relational Model, but who knows the Graph one?

Page 8: Switching from the Relational to the Graph model

Back to school: Graph Theory crash course

Page 9: Switching from the Relational to the Graph model

Basic Graph

[Diagram: vertex «Luca» connected to vertex «NoSQL Matters conference» by the edge «Likes»]

Page 10: Switching from the Relational to the Graph model

Property Graph Model*

Vertex «Luca»
name: Luca
surname: Garulli
company: NuvolaBase

Vertex «NoSQL Matters conference»
editions: [Cologne, Barcelona]

Edge «Likes» (from «Luca» to «NoSQL Matters conference»)
since: 2012

Vertices and Edges can have properties

Edges are directed

* https://github.com/tinkerpop/blueprints/wiki/Property-Graph-Model
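The property graph above can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: the class names `PropertyGraphSketch`, `Vertex` and `Edge` are hypothetical and are not the Blueprints or OrientDB API, and the `name` key on the conference vertex is an assumption (the slide shows only its `editions` property).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a property graph: not the Blueprints or OrientDB API.
class PropertyGraphSketch {

    static class Vertex {
        final Map<String, Object> properties = new LinkedHashMap<>();
        final List<Edge> out = new ArrayList<>(); // edges leaving this vertex
        final List<Edge> in = new ArrayList<>();  // edges entering this vertex

        Vertex set(String key, Object value) {
            properties.put(key, value);
            return this;
        }
    }

    static class Edge {
        final String label;
        final Vertex from; // an edge is directed: it starts here...
        final Vertex to;   // ...and ends here
        final Map<String, Object> properties = new LinkedHashMap<>();

        Edge(String label, Vertex from, Vertex to) {
            this.label = label;
            this.from = from;
            this.to = to;
            from.out.add(this); // register the edge on both endpoints
            to.in.add(this);
        }

        Edge set(String key, Object value) {
            properties.put(key, value);
            return this;
        }
    }

    // Builds the example from the slide and returns the «Likes» edge.
    static Edge buildExample() {
        Vertex luca = new Vertex()
                .set("name", "Luca")
                .set("surname", "Garulli")
                .set("company", "NuvolaBase");
        Vertex conference = new Vertex()
                .set("name", "NoSQL Matters") // assumed key, see the note above
                .set("editions", Arrays.asList("Cologne", "Barcelona"));
        return new Edge("Likes", luca, conference).set("since", 2012);
    }

    public static void main(String[] args) {
        Edge likes = buildExample();
        System.out.println(likes.from.properties.get("name")
                + " -" + likes.label + "-> "
                + likes.to.properties.get("name"));
    }
}
```

Both vertices and the edge carry their own property map, and the edge keeps its direction by distinguishing `from` and `to`.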

Page 11: Switching from the Relational to the Graph model

Property Graph Model

[Diagram: vertex «Luca» connected to vertex «NoSQL Matters conference» by two edges]

Edge «Likes»
since: 2012

Edge «Speaks»
title: «Switching...»
abstract: «This talk presents...»

An Edge connects 2 vertices: use multiple edges to represent 1-N and N-M relationships

Page 12: Switching from the Relational to the Graph model

Property Graph Model

[Diagram: vertices «Katja», «Luca», «Pere» and «NoSQL Matters conference», connected by the edges «Likes», «Organizes», «Joins» and two «FriendOf» edges]

Page 13: Switching from the Relational to the Graph model

Congratulations, this is your diploma in «Graph Theory»

Page 14: Switching from the Relational to the Graph model

Now let's go back to our domain: the CRM

Page 15: Switching from the Relational to the Graph model

Domain: minimal CRM

[Diagram: entities «Customer», «Address», «Order» and «Stock», grouped into a «Registry system» and an «Order system»]

Page 16: Switching from the Relational to the Graph model

Domain: minimal CRM

[Diagram: entities «Customer», «Address», «Order» and «Stock», grouped into a «Registry system» and an «Order system»]

How does a Relational DBMS manage relationships?

Page 17: Switching from the Relational to the Graph model

Relational World: 1-1 Relationships

JOIN Customer.Address -> Address.Id

Customer

Id  Name    Address
10  Luca    34
11  Katja   44
34  Sylvia  54
56  Mark    66
88  Steve   68

Address

Id  Location
34  Rome, London
44  Cologne
54  Rome
66  New Mexico
68  Palo Alto

Page 18: Switching from the Relational to the Graph model

Relational World: 1-N Relationships

Inverse JOIN Address.Customer -> Customer.Id

Customer

Id  Name
10  Luca
11  Katja
34  Sylvia
56  Mark
88  Steve

Address

Id  Customer  Location
24  10        Rome
33  10        London
44  34        Rome
66  11        Cologne
68  88        Palo Alto

Page 19: Switching from the Relational to the Graph model

Relational World: N-M Relationships

Additional table with 2 JOINs:
(1) CustomerAddress.Id -> Customer.Id and
(2) CustomerAddress.Address -> Address.Id

Customer

Id  Name
10  Luca
11  Katja
34  Sylvia
56  Mark
88  Steve

Address

Id  Location
24  Rome
33  London
44  Rome
66  Cologne
68  Palo Alto

CustomerAddress

Id  Address
10  24
10  33
34  24
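To make the two JOINs of the N-M case concrete, here is a minimal Java sketch that resolves them by hand over the sample rows above. It is an illustration only: `JoinSketch` and its methods are hypothetical names, the tables are modelled as in-memory maps keyed by Id, and a real RDBMS would plan and execute the joins very differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: resolves the N-M relationship by hand, not a real RDBMS API.
class JoinSketch {
    // Tables keyed by Id. TreeMap is a balanced (red-black) tree, like a typical index.
    static final Map<Integer, String> CUSTOMER = new TreeMap<>();
    static final Map<Integer, String> ADDRESS = new TreeMap<>();
    // CustomerAddress rows: {customer id, address id}
    static final int[][] CUSTOMER_ADDRESS = { {10, 24}, {10, 33}, {34, 24} };

    static {
        CUSTOMER.put(10, "Luca");
        CUSTOMER.put(11, "Katja");
        CUSTOMER.put(34, "Sylvia");
        CUSTOMER.put(56, "Mark");
        CUSTOMER.put(88, "Steve");
        ADDRESS.put(24, "Rome");
        ADDRESS.put(33, "London");
        ADDRESS.put(44, "Rome");
        ADDRESS.put(66, "Cologne");
        ADDRESS.put(68, "Palo Alto");
    }

    // All the locations of one customer: one key lookup in Address per matching row.
    static List<String> locationsOf(int customerId) {
        List<String> result = new ArrayList<>();
        for (int[] row : CUSTOMER_ADDRESS) {      // plain scan of the join table
            if (row[0] == customerId) {
                result.add(ADDRESS.get(row[1]));  // JOIN (2): search the key in Address
            }
        }
        return result;
    }

    // All the customers registered at one address: one key lookup in Customer per matching row.
    static List<String> customersAt(int addressId) {
        List<String> result = new ArrayList<>();
        for (int[] row : CUSTOMER_ADDRESS) {
            if (row[1] == addressId) {
                result.add(CUSTOMER.get(row[0])); // JOIN (1): search the key in Customer
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("Customer 10 lives in: " + locationsOf(10));
        System.out.println("Address 24 hosts: " + customersAt(24));
    }
}
```

Every hop across the relationship costs a key search in another table, and it is repeated each time the relationship is traversed.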

Page 21: Switching from the Relational to the Graph model

What’s wrong with the Relational Model?

Page 22: Switching from the Relational to the Graph model

The JOIN is evil!

These are all JOINs executed every time you traverse a relationship!

[Tables: Customer, Address and CustomerAddress, as in the N-M example]

Page 23: Switching from the Relational to the Graph model

A JOIN means searching for a key in another table

The first rule to improve performance is indexing all the keys

An index speeds up searches but slows down inserts, updates and deletes

Page 24: Switching from the Relational to the Graph model

So in the best case a JOIN is a lookup in an index

This is done for every single JOIN!

If you traverse hundreds of relationships you’re executing hundreds of JOINs

Page 25: Switching from the Relational to the Graph model

Index Lookup: is it really that fast?

Page 26: Switching from the Relational to the Graph model

Index Lookup: how does it work?

A-Z: A-L | M-Z

Think of an address book where we have to find Luca’s phone number

Page 27: Switching from the Relational to the Graph model

Index Lookup: how does it work?

A-Z: A-L | M-Z
A-L: A-D | E-L
M-Z: M-R | S-Z

Index algorithms are all similar and based on balanced trees

Page 28: Switching from the Relational to the Graph model

Index Lookup: how does it work?

A-Z: A-L | M-Z
A-L: A-D | E-L
M-Z: M-R | S-Z
A-D: A-B | C-D
E-L: E-G | H-L

Page 29: Switching from the Relational to the Graph model

Index Lookup: how does it work?

A-Z: A-L | M-Z
A-L: A-D | E-L
M-Z: M-R | S-Z
A-D: A-B | C-D
E-L: E-G | H-L
E-G: E-F | G
H-L: H-J | K-L

Page 30: Switching from the Relational to the Graph model

Index Lookup: how does it work?

A-Z: A-L | M-Z
A-L: A-D | E-L
M-Z: M-R | S-Z
A-D: A-B | C-D
E-L: E-G | H-L
E-G: E-F | G
H-L: H-J | K-L
K-L: Luca

Found! Each lookup takes X steps, where X grows with the index size!
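How X grows can be sketched with a plain binary search that counts its own steps. This is an assumption-laden illustration, not how any specific DBMS implements its indexes: `IndexLookupSketch` is a hypothetical name, and the index is modelled as the sorted keys 0..size-1, where entry i holds key i.

```java
// Hypothetical sketch: counts the steps of a binary search over a sorted index.
class IndexLookupSketch {

    // Returns how many entries are inspected before `key` is found
    // in a sorted index holding the keys 0..size-1 (entry i holds key i).
    static int lookupSteps(int size, int key) {
        int low = 0;
        int high = size - 1;
        int steps = 0;
        while (low <= high) {
            steps++;
            int mid = (low + high) >>> 1; // middle of the remaining range
            if (mid == key) {
                return steps;             // found
            }
            if (mid < key) {
                low = mid + 1;            // continue in the upper half
            } else {
                high = mid - 1;           // continue in the lower half
            }
        }
        return steps;                     // key not present
    }

    public static void main(String[] args) {
        // Look up the last key of indexes of growing size.
        for (int size : new int[] {1_000, 1_000_000, 1_000_000_000}) {
            System.out.println(size + " entries: "
                    + lookupSteps(size, size - 1) + " steps");
        }
    }
}
```

The step count grows roughly with the logarithm of the number of entries: each step halves the remaining range, as in the address book example.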

Page 31: Switching from the Relational to the Graph model

An index lookup is executed for each JOIN

Querying more tables can easily produce millions of JOINs/lookups!

Here is the rule: more entries = more lookup steps = slower JOIN

Page 32: Switching from the Relational to the Graph model

Is there a better way to manage relationships?

Page 33: Switching from the Relational to the Graph model

“A graph database is any storage system that provides index-free adjacency”

- Marko Rodriguez

Page 34: Switching from the Relational to the Graph model

How does a GraphDB manage index-free relationships?

Page 35: Switching from the Relational to the Graph model

OrientDB: an Open Source (Apache 2) document-graph NoSQL DBMS

supports: transactions, extended SQL, Multi-Master replication, etc.

Page 36: Switching from the Relational to the Graph model

OrientDB: traverse a relationship

Vertex «Luca», RID = #13:35
out: [#14:54]
label: 'Customer'
name: 'Luca'

Edge «Lives», RID = #14:54
out: [#13:35]
in: [#13:100]
label: 'Lives'

Vertex «Rome», RID = #13:100
in: [#14:54]
label: 'Address'
name: 'Rome'

The Record ID (RID) is a physical position

Page 37: Switching from the Relational to the Graph model

A GraphDB handles a relationship as a physical LINK to the record, assigned when the edge is created

On the other side, an RDBMS computes the relationship every time you query the database

Isn't that crazy?!

Page 38: Switching from the Relational to the Graph model

This means jumping from an O(log N) algorithm to a near O(1) one

The traversal cost is no longer affected by the database size!

This is huge in the Big Data age
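The idea of a physical link can be sketched in plain Java. This is a simplified model under stated assumptions, not OrientDB's storage engine: `AdjacencySketch` is a hypothetical name, the storage is an in-memory list, and a record ID is simply the record's position in that list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of index-free adjacency: not OrientDB's actual storage layer.
class AdjacencySketch {

    static class Record {
        final String name;
        final List<Integer> out = new ArrayList<>(); // positions of the linked records
        Record(String name) { this.name = name; }
    }

    // A record ID is the position of the record in this list.
    final List<Record> storage = new ArrayList<>();

    // Stores a new record and returns its position (its record ID).
    int create(String name) {
        storage.add(new Record(name));
        return storage.size() - 1;
    }

    // The link is written once, when the relationship is created.
    void link(int from, int to) {
        storage.get(from).out.add(to);
    }

    // Each hop is a positional access: no key search, whatever the storage size.
    List<String> neighbours(int rid) {
        List<String> names = new ArrayList<>();
        for (int target : storage.get(rid).out) {
            names.add(storage.get(target).name);
        }
        return names;
    }

    public static void main(String[] args) {
        AdjacencySketch db = new AdjacencySketch();
        int luca = db.create("Luca");
        int lives = db.create("Lives");
        int rome = db.create("Rome");
        db.link(luca, lives);
        db.link(lives, rome);
        System.out.println(db.neighbours(luca) + " then " + db.neighbours(lives));
    }
}
```

Following a link is a direct jump to a known position, so the cost of one hop stays constant while the number of stored records grows.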

Page 39: Switching from the Relational to the Graph model

OrientDB in the Blueprints micro-benchmark, on common hardware, with a hot cache, traverses 29.6 million records in less than 5 seconds

about 6 million nodes traversed per second!

Do not try this at home with an RDBMS*!

* unless you live in Google’s server farm

Page 40: Switching from the Relational to the Graph model

Create the graph in SQL

$luca> cd bin
$luca> ./console.sh
OrientDB console v.1.2.0-SNAPSHOT (www.orientdb.org)
Type 'help' to display all the commands supported.

orientdb> create vertex V set name = 'Luca', label = 'Customer'
Created vertex #13:35 in 0.03 secs

orientdb> create vertex V set name = 'Rome', label = 'Address'
Created vertex #13:100 in 0.02 secs

orientdb> create edge E from #13:35 to #13:100 set label = 'Lives'
Created edge #14:54 in 0.02 secs

Page 41: Switching from the Relational to the Graph model

Create the graph in Java

import com.orientechnologies.orient.core.db.graph.OGraphDatabase;
import com.orientechnologies.orient.core.record.impl.ODocument;

OGraphDatabase graph = new OGraphDatabase("local:/tmp/db/graph");

// Vertex for the customer
ODocument luca = graph.createVertex();
luca.field("name", "Luca");
luca.field("label", "Customer");

// Vertex for the address
ODocument rome = graph.createVertex();
rome.field("name", "Rome");
rome.field("label", "Address");

// Edge «Lives» from the customer to the address
ODocument edge = graph.createEdge(luca, rome).field("label", "Lives");
edge.save();

graph.close();

Page 42: Switching from the Relational to the Graph model

Query the graph in SQL

orientdb> select in[label='Lives'].out from V where label = 'Address' and name = 'Rome'
---+--------+--------------------+--------------------+--------------------+
  #| REC ID |label               |out                 |in                  |
---+--------+--------------------+--------------------+--------------------+
  0|   13:35|Luca                |[#14:54]            |                    |
---+--------+--------------------+--------------------+--------------------+
1 item(s) found. Query executed in 0.007 sec(s).

orientdb> select * from V where label = 'Address' AND in[label='Lives'].size() > 0
---+--------+--------------------+--------------------+--------------------+
  #| REC ID |label               |out                 |in                  |
---+--------+--------------------+--------------------+--------------------+
  0|  13:100|Rome                |                    |[#14:54]            |
---+--------+--------------------+--------------------+--------------------+
1 item(s) found. Query executed in 0.007 sec(s).

Page 43: Switching from the Relational to the Graph model

Query the graph in Java

OGraphDatabase graph = new OGraphDatabase("local:/tmp/db/graph");

// Get all the customers from Rome, Italy
List<ODocument> result = graph.command(
    new OCommandSQL("select in[label='Lives'].out from V where label = 'Address' and name = ?")
  ).execute("Rome");

for (ODocument v : result) {
  System.out.println("Result: " + v.field("label"));
}

---------------------------------------------------------------------------------------
Result: Luca

Page 44: Switching from the Relational to the Graph model

Query vs traversal

Once you have a well-connected database in the form of a Super Graph, you can cross records instead of querying them!

All you need is some root vertices from which to start traversing

Page 45: Switching from the Relational to the Graph model

Query vs traversal

[Diagram: vertices «Customers», «SpecialCustomers» and «Stocks»; customers «Luca», «John» and «Sylvia»; orders «Order2332» and «Order8834»; product «WhiteSoap». Callout: «This is a root vertex»]

Page 46: Switching from the Relational to the Graph model

Query the graph in SQL

Supposing that the root node #30:0 links all the Customer vertices

[Diagram: root vertex «Customers», RID #30:0]

Get all the customers:

orientdb> select out.in from #30:0

Get all the customers who bought at least one 'White Soap' product:

orientdb> select * from ( select out.in from #30:0 ) where out.in.out[label='Bought'].in.name = 'White Soap'
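The same traversal can be sketched in plain Java, starting from a root vertex and crossing links instead of querying. Everything here is hypothetical and for illustration only: the class names, the `Has` label on the root's edges, and the purchases themselves are made up (the slides do not say who bought what), and this is not the OrientDB API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a traversal from a root vertex: not the OrientDB API.
class RootTraversalSketch {

    static class Vertex {
        final String name;
        final List<Edge> out = new ArrayList<>(); // outgoing edges
        Vertex(String name) { this.name = name; }

        void link(String label, Vertex target) {
            out.add(new Edge(label, target));
        }
    }

    static class Edge {
        final String label;
        final Vertex in; // the vertex this edge points to
        Edge(String label, Vertex in) {
            this.label = label;
            this.in = in;
        }
    }

    // Walks root -> customers -> «Bought» edges -> products.
    static List<String> customersWhoBought(Vertex root, String product) {
        List<String> result = new ArrayList<>();
        for (Edge toCustomer : root.out) {
            Vertex customer = toCustomer.in;
            for (Edge e : customer.out) {
                if (e.label.equals("Bought") && e.in.name.equals(product)) {
                    result.add(customer.name);
                    break; // one matching purchase is enough
                }
            }
        }
        return result;
    }

    // Made-up sample data: the purchases are an assumption, not taken from the slides.
    static Vertex buildExample() {
        Vertex root = new Vertex("Customers");
        Vertex luca = new Vertex("Luca");
        Vertex john = new Vertex("John");
        Vertex sylvia = new Vertex("Sylvia");
        Vertex whiteSoap = new Vertex("White Soap");
        Vertex otherProduct = new Vertex("Other product");
        root.link("Has", luca);
        root.link("Has", john);
        root.link("Has", sylvia);
        luca.link("Bought", whiteSoap);
        sylvia.link("Bought", otherProduct);
        return root;
    }

    public static void main(String[] args) {
        System.out.println(customersWhoBought(buildExample(), "White Soap"));
    }
}
```

Only the records reachable from the root are touched; no table is searched by key at any point.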

Page 47: Switching from the Relational to the Graph model

Demo time!

Page 48: Switching from the Relational to the Graph model

NuvolaBase.com

The first Graph Database on the Cloud

always available

set up in a few seconds

use it from apps & mobile

Page 49: Switching from the Relational to the Graph model

«Graphs change the way of modelling data»

Luca Garulli

www.twitter.com/lgarulli

CEO at NuvolaBase Ltd, London UK

Author of OrientDB, Document-Graph NoSQL Open Source project