ghislain fourny big data fall 2019 · 2020-01-31 · ghislain fourny big data fall 2019 14. graph...


Post on 15-Mar-2020


Ghislain Fourny

Big Data Fall 2019

14. Graph Databases

pinkyone / 123RF Stock Photo, tovovan / 123RF Stock Photo

Why graph databases?


The NoSQL paradigms

Key-value stores

Column stores

Document stores

Triple stores

Relational databases...

[Figure: two entities linked by a relationship]

... have expensive joins!

... are not that efficient at relationships!

We already know how to partly solve this, though

3NF

0NF

... but it has its limits, too!

Traversals...

... translate into multiple joins!

Reverse traversals...

... need even more indices!

What if links were more "direct"?

Index-free adjacency
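The contrast between join-style traversal and index-free adjacency can be sketched in Python. This is a minimal illustration on assumed toy data (a graph A -> B -> C), not how any particular engine is implemented: the first traversal consults a global edge table on every hop, as a join would, while the second follows references stored directly on each node. All names (`EDGE_TABLE`, `Node`, the helper functions) are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical toy graph: A -> B -> C
EDGE_TABLE = [("A", "B"), ("B", "C")]


def neighbors_via_table(node_id):
    """Join-style hop: scan a global edge table for rows whose source matches."""
    return [target for source, target in EDGE_TABLE if source == node_id]


@dataclass
class Node:
    name: str
    out: list = field(default_factory=list)  # direct references to neighbor nodes


def build_nodes(edges):
    """Materialize nodes that point at each other directly."""
    nodes = {}
    for source, target in edges:
        src = nodes.setdefault(source, Node(source))
        dst = nodes.setdefault(target, Node(target))
        src.out.append(dst)
    return nodes


def two_hops_via_table(start):
    """Each hop goes back to the shared table."""
    return [c for b in neighbors_via_table(start) for c in neighbors_via_table(b)]


def two_hops_via_pointers(start_node):
    """Each hop only dereferences what the current node already holds."""
    return [c.name for b in start_node.out for c in b.out]


nodes = build_nodes(EDGE_TABLE)
print(two_hops_via_table("A"))
print(two_hops_via_pointers(nodes["A"]))
```

Both calls reach the same node; the difference is that the pointer version never touches a structure whose size depends on the whole graph.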

Graphs

Graphs: ingredients

Nodes Edges

Graphs: nodes

Graphs: edges

Graphs: directed graph

Graphs: undirected graph

Graph representation: adjacency list

[Figure: a directed graph with nodes A, B, and C]

Node  Edges
A     [ ]
B     [ A, C ]
C     [ A ]

Graph representation: adjacency matrix

[Figure: a directed graph with nodes A, B, and C]

    A  B  C
A   0  1  1
B   0  0  0
C   0  1  0

Graph representation: incidence matrix

[Figure: a directed graph with nodes A, B, and C]

Rows are nodes, columns are edges:

     1   2   3
A    1   1   0
B   -1   0  -1
C    0  -1   1
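The three representations can be built from one edge list. The sketch below assumes the example graph has the edges A->B, A->C, and C->B, that matrix rows are sources, and that the incidence matrix uses +1 at the source and -1 at the target. These conventions are assumptions chosen so that both matrices match the ones shown. Under this reading the adjacency list shown on the slide holds incoming neighbors, whereas this sketch lists outgoing ones.

```python
# Assumed reading of the example graph: edges A->B, A->C, C->B.
NODES = ["A", "B", "C"]
EDGES = [("A", "B"), ("A", "C"), ("C", "B")]


def adjacency_list(nodes, edges):
    """Map each node to the list of nodes it points to."""
    result = {n: [] for n in nodes}
    for source, target in edges:
        result[source].append(target)
    return result


def adjacency_matrix(nodes, edges):
    """matrix[i][j] is 1 when there is an edge from nodes[i] to nodes[j]."""
    index = {n: i for i, n in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for source, target in edges:
        matrix[index[source]][index[target]] = 1
    return matrix


def incidence_matrix(nodes, edges):
    """One row per node, one column per edge: +1 at the source, -1 at the target."""
    index = {n: i for i, n in enumerate(nodes)}
    matrix = [[0] * len(edges) for _ in nodes]
    for column, (source, target) in enumerate(edges):
        matrix[index[source]][column] = 1
        matrix[index[target]][column] = -1
    return matrix


for row in adjacency_matrix(NODES, EDGES):
    print(row)
for row in incidence_matrix(NODES, EDGES):
    print(row)
```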


Labeled property graphs: ingredients

Nodes Edges Properties Labels

Property graph

Properties

Name: Einstein

First name: Albert

Profession: Physicist

Labeled graph

Labels on nodes

Names (labels) on relationships

[Figure: a graph whose relationships are named A or B]

Labeled property graph

Node with properties and label

Person

Name: Einstein

First name: Albert

Profession: Physicist

In Switzerland
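A node with a label and properties can be modeled in a few lines of Python. This is only a sketch of the data model, with hypothetical class names `Node` and `Relationship`; the second node and the VISITED relationship are borrowed from the CREATE example later in the deck, purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set = field(default_factory=set)          # e.g. {"Person"}
    properties: dict = field(default_factory=dict)    # key/value pairs


@dataclass
class Relationship:
    source: Node
    target: Node
    name: str                                          # the relationship's name (label)
    properties: dict = field(default_factory=dict)


einstein = Node(
    labels={"Person"},
    properties={"Name": "Einstein", "First name": "Albert", "Profession": "Physicist"},
)
# Illustrative second node and relationship, not part of this slide.
eth = Node(labels={"University"}, properties={"Name": "ETH Zurich"})
visited = Relationship(einstein, eth, "VISITED")

print(sorted(einstein.labels), einstein.properties["Profession"])
```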

Graph database

Graph databases: families

Property graph

Triple stores (RDF)

Graph databases: native or not

Native graph database

Graph stored in an RDBMS, document store, ...

Source  Target  Name
Alice   Bob     knows
Eve     Bob     eavesdrop
Eve     Alice   eavesdrop
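The non-native option can be tried out with Python's built-in `sqlite3` module. The sketch below stores the three rows of the table above in a relational table (the table and column names are assumptions) and finds two-hop paths with a self-join, which illustrates why each additional hop costs another join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (source TEXT, target TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?, ?)",
    [
        ("Alice", "Bob", "knows"),
        ("Eve", "Bob", "eavesdrop"),
        ("Eve", "Alice", "eavesdrop"),
    ],
)

# Two hops = one self-join; every extra hop adds another join on the same table.
two_hop_paths = conn.execute(
    """
    SELECT e1.source, e1.target, e2.target
    FROM edges AS e1
    JOIN edges AS e2 ON e1.target = e2.source
    """
).fetchall()
print(two_hop_paths)
```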

RDF

Triple-based graph

RDF: one triple

Subject      Property        Object
ETH Zürich   Is located in   Switzerland

IRI

http://www.ethz.ch/#school

http://www.example.com/Switzerland


Literal

includes XML Schema types!

Foo

2012-12-16

3.1415926535

Blank Node

Subject        Property             Object
ETH Zürich     Is built on ground   (blank node)
(blank node)   Is subset of         Switzerland

What can appear where?

             Subject   Property   Object
IRI          yes       yes        yes
Literal      no        no         yes
Blank node   yes       no         yes

Generalized Graphs

             Subject   Property   Object
IRI          yes       yes        yes
Literal      yes       yes        yes
Blank node   yes       yes        yes
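The position rules can be written down as a small validity check. This is a sketch with hypothetical class and function names. It encodes the standard RDF rule (subject: IRI or blank node; property: IRI; object: IRI, literal, or blank node) and the generalized variant, in which any kind of term may appear in any position.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


# Allowed term kinds per position in a standard RDF triple.
ALLOWED = {
    "subject": (IRI, BlankNode),
    "property": (IRI,),
    "object": (IRI, Literal, BlankNode),
}


def is_valid_triple(subject, prop, obj, generalized=False):
    """Check term kinds per position; generalized triples accept any kind anywhere."""
    if generalized:
        return all(isinstance(t, (IRI, Literal, BlankNode)) for t in (subject, prop, obj))
    return (
        isinstance(subject, ALLOWED["subject"])
        and isinstance(prop, ALLOWED["property"])
        and isinstance(obj, ALLOWED["object"])
    )


school = IRI("http://www.ethz.ch/#school")
located_in = IRI("http://www.example.com/geography#isLocatedIn")
switzerland = IRI("http://www.example.com/Switzerland")

print(is_valid_triple(school, located_in, switzerland))
print(is_valid_triple(Literal("Foo"), located_in, switzerland))
print(is_valid_triple(Literal("Foo"), located_in, switzerland, generalized=True))
```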

Syntax

RDF Formats

- RDF/XML
- Turtle
- JSON-LD
- RDFa
- N-Triples

RDF/XML

<rdf:RDF

xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"

xmlns:geo="http://www.example.com/geography#">

<rdf:Description rdf:about="http://www.ethz.ch/#self">

<geo:isLocatedIn

rdf:resource="http://www.example.com/Switzerland"/>

<geo:population>20000</geo:population>

</rdf:Description>

</rdf:RDF>

RDF/XML: Subject

The rdf:about attribute of rdf:Description gives the subject: http://www.ethz.ch/#self

RDF/XML: Property

Each child element of rdf:Description names a property: geo:isLocatedIn, geo:population

RDF/XML: Object

The rdf:resource attribute (an IRI) or the text content of the property element (a literal) gives the object

RDF/XML

<rdf:RDF

xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"

xmlns:geo="http://www.example.com/geography#">

<rdf:Description rdf:about="http://www.ethz.ch/#self">

<geo:isLocatedIn

rdf:resource="http://www.example.com/Switzerland"/>

<geo:population>8000000</geo:population>

</rdf:Description>

</rdf:RDF>

geo:isLocatedIn expands to http://www.example.com/geography#isLocatedIn

RDF/XML

<rdf:RDF

xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"

xmlns:geo="http://www.example.com/geography#">

<rdf:Description rdf:about="http://www.ethz.ch/#self">

<geo:isLocatedIn

rdf:resource="http://www.example.com/Switzerland"/>

<rdf:type

rdf:resource="http://www.example.com/geography#school"/>

<geo:population>8000000</geo:population>

</rdf:Description>

</rdf:RDF>

RDF/XML

<rdf:RDF

xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"

xmlns:geo="http://www.example.com/geography#">

<geo:school rdf:about="http://www.ethz.ch/#self">

<geo:isLocatedIn

rdf:resource="http://www.example.com/Switzerland"/>

<geo:population>20000</geo:population>

</geo:school>

</rdf:RDF>


JSON-LD

{
  "@context": {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "geo": "http://www.example.com/geography#"
  },
  "@id": "http://www.ethz.ch/#self",
  "rdf:type": { "@id": "geo:school" },
  "geo:isLocatedIn": { "@id": "http://www.example.com/Switzerland" },
  "geo:population": 8000000
}

Turtle

@prefix geo: <http://www.example.com/geography#> .

@prefix countries: <http://www.example.com/> .

@prefix eth: <http://www.ethz.ch/#> .

eth:self geo:isLocatedIn countries:Switzerland .

eth:self geo:population 8000000 .

Turtle

@prefix geo: <http://www.example.com/geography#> .

@prefix countries: <http://www.example.com/> .

@prefix eth: <http://www.ethz.ch/#> .

eth:self geo:isLocatedIn countries:Switzerland ;
         geo:population 8000000 .

Turtle

@prefix geo: <http://www.example.com/geography#> .

@prefix countries: <http://www.example.com/> .

@prefix eth: <http://www.ethz.ch/#> .

eth:self geo:isLocatedIn countries:Switzerland,
                         countries:Europe ;
         geo:population 8000000 .
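The ";" and "," shorthands only save typing: each combination still stands for one full triple. The Python sketch below expands the prefixes and shorthands of the last Turtle example into N-Triples lines. It does not parse Turtle; the input is given as a Python dictionary, the function names are hypothetical, and the property name geo:isLocatedIn follows the RDF/XML examples.

```python
PREFIXES = {
    "geo": "http://www.example.com/geography#",
    "countries": "http://www.example.com/",
    "eth": "http://www.ethz.ch/#",
}
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def expand(term):
    """Render one term in N-Triples syntax: a full IRI in <>, or a typed integer literal."""
    if isinstance(term, int):
        return f'"{term}"^^<{XSD_INTEGER}>'
    prefix, local = term.split(":", 1)
    return f"<{PREFIXES[prefix]}{local}>"


def to_ntriples(subject, predicate_objects):
    """Expand 'subject p1 o1, o2 ; p2 o3 .' into one full triple per line."""
    lines = []
    for predicate, objects in predicate_objects.items():
        for obj in objects:
            lines.append(f"{expand(subject)} {expand(predicate)} {expand(obj)} .")
    return lines


lines = to_ntriples(
    "eth:self",
    {
        "geo:isLocatedIn": ["countries:Switzerland", "countries:Europe"],
        "geo:population": [8000000],
    },
)
print("\n".join(lines))
```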

Querying

Querying paradigms

Classical declarative querying

Query by example

?

Two languages

Cypher SPARQL

[Figure: a table of triples, each reading ETH Zürich, Is located in, Switzerland]

Querying labeled property graphs by example

[Figure: an example graph whose relationships are named A or B]

Cypher pattern

(alpha)-[:A]->(beta)-[:B]->(gamma)

Cypher pattern: anchoring a label

(alpha)-[:A]->(beta:yellow)-[:B]->(gamma)

Cypher pattern: filtering a property

(alpha {name: 'Einstein'})-[:A]->(beta)-[:B]->(gamma)

Cypher pattern: anchoring and filtering

(alpha)-[:A]->(beta)-[:B]->(gamma:blue {name: 'ETH'})

Cypher pattern: right to left

(alpha)-[:A]->(beta)-[:B]->(gamma)<-[:B]-(delta)

Cypher pattern: variable repetition

(alpha)-[:A]->(beta)-[:B]->(gamma)<-[:B]-(delta)-[:B]->(alpha)

Cypher pattern: variable length path

(alpha)-[*1..4]->(beta)

Cypher pattern: MATCH clause

MATCH (alpha {name: 'Einstein' })-[:A]->(beta)-[:B]->(gamma)


RETURN gamma

Cypher pattern: WHERE clause

The same query, with the property filter moved into a WHERE clause:

MATCH (alpha)-[:A]->(beta)-[:B]->(gamma)

WHERE alpha.name = 'Einstein'

RETURN gamma
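What the MATCH, WHERE, and RETURN clauses do together can be emulated naively in Python. The sketch below is not how a Cypher engine evaluates queries. It simply enumerates edges on assumed toy data (the node ids, names, and edges are hypothetical) to find bindings for (alpha)-[:A]->(beta)-[:B]->(gamma), filters on a property of alpha, and returns gamma.

```python
# Hypothetical toy data: node properties plus (source, relationship name, target) edges.
NODES = {
    1: {"name": "Einstein"},
    2: {"name": "beta-node"},
    3: {"name": "gamma-node"},
    4: {"name": "Curie"},
}
EDGES = [(1, "A", 2), (2, "B", 3), (4, "A", 2), (2, "A", 3)]


def match_two_hops(first_rel, second_rel, where=lambda alpha: True):
    """Emulate (alpha)-[:first_rel]->(beta)-[:second_rel]->(gamma) with a WHERE on alpha."""
    results = []
    for alpha, rel1, beta in EDGES:
        if rel1 != first_rel or not where(NODES[alpha]):
            continue
        for source, rel2, gamma in EDGES:
            if source == beta and rel2 == second_rel:
                results.append(gamma)  # RETURN gamma
    return results


gammas = match_two_hops("A", "B", where=lambda alpha: alpha["name"] == "Einstein")
print([NODES[g]["name"] for g in gammas])
```

Dropping the `where` argument returns one result per matching path, so the same gamma can appear more than once.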


Cypher pattern: CREATE clause

CREATE (einstein:Scientist {name: 'Einstein', first: 'Albert' }),

(eth:University {name: 'ETH Zurich' }),

(einstein)-[:VISITED]->(eth)


Other clauses

WITH

DELETE

MERGE

FOREACH

SET

UNION

START


Querying RDF: SPARQL

PREFIX geo: <http://www.example.com/geography#>

PREFIX countries: <http://www.example.com/>

SELECT ?s

WHERE { ?s geo:isLocatedIn countries:Switzerland }


SPARQL

PREFIX geo: <http://www.example.com/geography#>

PREFIX countries: <http://www.example.com/>

SELECT ?s

WHERE {

?s geo:isLocatedIn countries:Switzerland .

?s :deliversDiplom :bachelor .

}


SPARQL

PREFIX geo: <http://www.example.com/geography#>

PREFIX countries: <http://www.example.com/>

SELECT ?s

WHERE {

?s geo:isLocatedIn ?c .

?c geo:isInContinent geo:America .

}
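A query with two triple patterns that share a variable, such as ?c above, amounts to a join over the matches of each pattern. The Python sketch below evaluates such a basic graph pattern over a small list of triples. The data, the `ex:` prefix, and the function names are hypothetical, and real SPARQL engines are far more elaborate than this.

```python
# Hypothetical triples; terms starting with "?" in a pattern are variables, as in SPARQL.
TRIPLES = [
    ("ex:MIT", "geo:isLocatedIn", "countries:USA"),
    ("ex:ETH", "geo:isLocatedIn", "countries:Switzerland"),
    ("countries:USA", "geo:isInContinent", "geo:America"),
    ("countries:Switzerland", "geo:isInContinent", "geo:Europe"),
]


def match_pattern(pattern, triple, binding):
    """Try to extend a binding so that the pattern equals the triple; None on mismatch."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Bind the variable, or check it against its existing value.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def solve(patterns, triples):
    """Evaluate a basic graph pattern: join the matches of each triple pattern."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in triples
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return bindings


query = [
    ("?s", "geo:isLocatedIn", "?c"),
    ("?c", "geo:isInContinent", "geo:America"),
]
print([b["?s"] for b in solve(query, TRIPLES)])
```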


SPARQL

PREFIX geo: <http://www.example.com/geography#>

PREFIX countries: <http://www.example.com/>

SELECT ?s

WHERE {

?s geo:isLocatedIn countries:Switzerland .

?s :deliversDiplom :bachelor

}

LIMIT 10


SPARQL

PREFIX geo: <http://www.example.com/geography#>

PREFIX countries: <http://www.example.com/>

SELECT ?s ?name

WHERE {

?s geo:isLocatedIn countries:Switzerland .

?s :deliversDiplom :bachelor .

?s :hasName ?name .

}

ORDER BY ?name

LIMIT 10

Architecture (Neo4j)

No sharding

Document stores don't like joins

Graph databases don't like shards

Why? Fast traversal

Master-slave architecture

[Figure: one master and several slaves]

Data replication (full)

[Figure: synchronization between the master and the slaves]

Read scale-up


Writes

Write to the master or write to a slave

Caching and pages

Fixed-size records

Index-free adjacency


Label storage

[Figure: a node carrying the labels Person, Jedi, and Geek]

Properties storage

name: Einstein

first-name: Albert

Relationship storage

A

A

102

Page 103: Ghislain Fourny Big Data Fall 2019 · 2020-01-31 · Ghislain Fourny Big Data Fall 2019 14. Graph Databases pinkyone/ 123RF Stock Photo tovovan/ 123RF Stock Photo 1

Relationship storage

A

A

103

Page 104: Ghislain Fourny Big Data Fall 2019 · 2020-01-31 · Ghislain Fourny Big Data Fall 2019 14. Graph Databases pinkyone/ 123RF Stock Photo tovovan/ 123RF Stock Photo 1

Relationship storage

A

A

104

Page 105: Ghislain Fourny Big Data Fall 2019 · 2020-01-31 · Ghislain Fourny Big Data Fall 2019 14. Graph Databases pinkyone/ 123RF Stock Photo tovovan/ 123RF Stock Photo 1

Relationship storage

A

105

Relationship storage

A A

114

Relationship storage

A → B

Source: A

Target: B

s-previous, s-next

t-previous, t-next
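
The record layout above can be sketched in Python as follows. This is a minimal illustration, assuming each relationship record sits in two doubly linked chains, one per endpoint. The names `Node`, `Relationship`, `first_rel`, `add_relationship` and `relationships_of` are hypothetical and not taken from any actual storage engine, and self-loops are excluded to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass(eq=False)
class Node:
    name: str
    first_rel: Optional["Relationship"] = None  # head of this node's chain


@dataclass(eq=False)
class Relationship:
    source: Node
    target: Node
    # Neighbours in the source node's chain of relationships.
    s_previous: Optional["Relationship"] = None
    s_next: Optional["Relationship"] = None
    # Neighbours in the target node's chain of relationships.
    t_previous: Optional["Relationship"] = None
    t_next: Optional["Relationship"] = None


def _set_previous(rel: Relationship, node: Node, prev: Relationship) -> None:
    """Set the 'previous' pointer of rel in the chain that belongs to node."""
    if rel.source is node:
        rel.s_previous = prev
    else:
        rel.t_previous = prev


def add_relationship(source: Node, target: Node) -> Relationship:
    """Create a relationship and push it onto the head of both chains."""
    if source is target:
        raise ValueError("self-loops are not handled in this sketch")
    rel = Relationship(source, target)
    # Link into the source node's chain.
    rel.s_next = source.first_rel
    if source.first_rel is not None:
        _set_previous(source.first_rel, source, rel)
    source.first_rel = rel
    # Link into the target node's chain.
    rel.t_next = target.first_rel
    if target.first_rel is not None:
        _set_previous(target.first_rel, target, rel)
    target.first_rel = rel
    return rel


def relationships_of(node: Node) -> Iterator[Relationship]:
    """Walk a node's chain by following pointers only, without any index."""
    rel = node.first_rel
    while rel is not None:
        yield rel
        rel = rel.s_next if rel.source is node else rel.t_next
```

Traversal from a node touches only that node's own relationships, which is the point of keeping the pointers inside the relationship records.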

117

Typical sizes

Node: 9 bytes

Relationship: 33 bytes

Relationship name: 5 bytes

Property: 33 bytes
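
Fixed record sizes matter because a record ID can then be turned into a byte offset by a single multiplication. The sketch below assumes each store is a contiguous array of fixed-size records using the sizes listed above. `RECORD_SIZES`, `record_offset` and `store_size` are hypothetical names used only for illustration.

```python
# Record sizes in bytes, as listed on the slide.
RECORD_SIZES = {
    "node": 9,
    "relationship": 33,
    "relationship_name": 5,
    "property": 33,
}


def record_offset(kind: str, record_id: int) -> int:
    """Byte offset of a record in its store, assuming contiguous fixed-size records."""
    if record_id < 0:
        raise ValueError("record_id must be non-negative")
    return record_id * RECORD_SIZES[kind]


def store_size(kind: str, count: int) -> int:
    """Total bytes needed to hold `count` records of the given kind."""
    return count * RECORD_SIZES[kind]
```

Under this assumption, node 100 starts at byte 900 of the node store, and one million relationships occupy 33,000,000 bytes.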

118

Semantics

119

RDF has no semantics

Schwiiz

Schoggi

Chuchichäschtli

120

RDF Schema

Class Property

121

Classes

rdfs:Resource

rdfs:Class

rdf:Property

rdfs:Literal

rdfs:Datatype

rdf:HTML

rdf:XMLLiteral

122

Properties

On any resource: rdf:type, rdfs:label, rdfs:comment

On properties: rdfs:range, rdfs:domain, rdfs:subPropertyOf

On classes: rdfs:subClassOf
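
As an illustration of what rdfs:domain and rdfs:range buy, here is a minimal sketch of how a reasoner can derive rdf:type statements from them (the RDFS entailment patterns usually called rdfs2 and rdfs3). Triples are modelled as plain tuples of strings. The `ex:` resources and the function name `infer_types` are hypothetical. The function performs one pass only and does not compute a full closure.

```python
Triple = tuple[str, str, str]


def infer_types(graph: set[Triple]) -> set[Triple]:
    """Derive rdf:type triples from rdfs:domain and rdfs:range declarations."""
    inferred: set[Triple] = set()
    for prop, schema_prop, cls in graph:
        if schema_prop == "rdfs:domain":
            # Every subject of a `prop` triple is an instance of the domain class.
            inferred |= {(s, "rdf:type", cls) for s, p, _ in graph if p == prop}
        elif schema_prop == "rdfs:range":
            # Every object of a `prop` triple is an instance of the range class.
            inferred |= {(o, "rdf:type", cls) for _, p, o in graph if p == prop}
    return inferred - graph


# Hypothetical example data.
graph = {
    ("ex:bornIn", "rdfs:domain", "ex:Person"),
    ("ex:bornIn", "rdfs:range", "ex:Place"),
    ("ex:Einstein", "ex:bornIn", "ex:Ulm"),
}
new_triples = infer_types(graph)
```

Here `new_triples` contains `("ex:Einstein", "rdf:type", "ex:Person")` and `("ex:Ulm", "rdf:type", "ex:Place")`, neither of which was stated explicitly.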

123

Self-awareness

rdfs:Resource rdf:type rdfs:Resource

124

Self-awareness

rdfs:Class rdfs:subClassOf rdfs:Resource

125

Self-awareness

rdf:type rdfs:range rdfs:Class

126

Self-awareness

rdfs:subClassOf rdf:type rdf:Property

127

Simple Entailment (RDF semantics)

I

E

I(E) = true

128
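
A graph G simply entails a graph E when every interpretation I with I(G) = true also gives I(E) = true. Enumerating interpretations is not practical, but the interpolation lemma in the RDF 1.1 Semantics specification gives an equivalent syntactic test: G simply entails E exactly when some subgraph of G is an instance of E, that is, when the blank nodes of E can be mapped to terms so that every resulting triple is in G. The sketch below implements that test by brute force. It assumes triples are tuples of strings and that blank nodes are written with a `_:` prefix. The function names and the `ex:` resources are hypothetical.

```python
from itertools import product

Triple = tuple[str, str, str]


def is_blank(term: str) -> bool:
    """Assumption of this sketch: blank nodes are strings starting with '_:'."""
    return term.startswith("_:")


def simply_entails(g: set[Triple], e: set[Triple]) -> bool:
    """True if some instance of e (blank nodes replaced by terms) is a subgraph of g."""
    blanks = sorted({t for s, _, o in e for t in (s, o) if is_blank(t)})
    # Candidate values: any term in subject or object position of g.
    terms = sorted({t for s, _, o in g for t in (s, o)})
    for choice in product(terms, repeat=len(blanks)):
        mapping = dict(zip(blanks, choice))
        instance = {
            (mapping.get(s, s), p, mapping.get(o, o)) for s, p, o in e
        }
        if instance <= g:
            return True
    return False
```

For a graph E without blank nodes the test reduces to checking that E is a subset of G. The brute-force search is exponential in the number of blank nodes and is meant only to make the definition concrete.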

OWL

129

OWL

(In principle) standalone

(Much) More powerful than RDF(S)

130

OWL

<xml/>

131

OWL and description logic / AI

132

Entailment (and Syllogisms)

Major: All men are mortal.

Minor: Socrates is a man.

Conclusion: Therefore, Socrates is mortal.
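
The syllogism maps directly onto RDFS: the major premise is an rdfs:subClassOf statement, the minor premise is an rdf:type statement, and the conclusion follows from the subclass entailment patterns (usually called rdfs9 and rdfs11). The sketch below applies those two patterns until nothing new is derived. The `ex:` resources and the function name `subclass_closure` are hypothetical, and all other RDFS rules are left out.

```python
Triple = tuple[str, str, str]


def subclass_closure(graph: set[Triple]) -> set[Triple]:
    """Apply the rdfs:subClassOf entailment patterns until a fixpoint is reached."""
    closure = set(graph)
    while True:
        new: set[Triple] = set()
        for c, p, d in closure:
            if p != "rdfs:subClassOf":
                continue
            for x, q, y in closure:
                if y != c:
                    continue
                if q == "rdf:type":
                    # x is a C, and every C is a D, so x is a D.
                    new.add((x, "rdf:type", d))
                elif q == "rdfs:subClassOf":
                    # Subclass relationships are transitive.
                    new.add((x, "rdfs:subClassOf", d))
        if new <= closure:
            return closure
        closure |= new


premises = {
    ("ex:Man", "rdfs:subClassOf", "ex:Mortal"),  # major: all men are mortal
    ("ex:Socrates", "rdf:type", "ex:Man"),       # minor: Socrates is a man
}
conclusion = ("ex:Socrates", "rdf:type", "ex:Mortal")
entailed = conclusion in subclass_closure(premises)
```

With these premises `entailed` is `True`: the conclusion is not stated anywhere in the input but appears in the closure.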

133

More graph databases...

134

Trees...

135

... and Graphs

136