Graph Databases and Neo4j - University of Stirling


19/08/2015


Graph Databases and Neo4j

Kevin Swingler

Relationships

• We all know how a relational database models relationships

• But there are limitations to the approach

– Relationships can’t have a type, or any properties

– Permissible relationships (PK and FK) are strictly defined and cannot be added in an ad-hoc way

– Relationships must be implemented within the relational model itself (as key columns and join tables)


Flexible Relationships

• Imagine a set of objects that can have arbitrary properties and arbitrary relationships between the objects

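[Figure: two nodes with their own properties, a fish (Animal: Fish, Lives in: Water, Moves: Swims) and a cat (Animal: Cat, Lives in: Land, Blood: Warm), joined by the relationships “Eats” and “Likes: a lot”]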

Main Graph DB Features

• Each entity (object) can have different properties, just like a document database

• Any entity can have a relationship with any other entity

• Relationships have a type, and any pair of entities can have a relationship of any type

• Relationships have properties, so can be thought of as entities that join other entities

• Entity pairs can have more than one relationship


Anatomy of a Graph DB

• Nodes represent entities, for example people in a social network or an organisation

• Edges represent relationships, e.g. ‘Works for’

• Edges are directional: A works for B doesn’t mean B works for A

• So relationships are INCOMING or OUTGOING with respect to a node

• Edges have properties: A works for B: since 2003, as secretary

Labels, Types and Properties

• Nodes

– Label: E.g. Person, Movie

– Properties: E.g. Name, Age

• Edges

– Type: E.g. Works for, Loves

– Properties: E.g. Since when, how much
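The anatomy described above (labelled nodes, typed and directed edges, properties on both) can be sketched in plain Java. This is only an illustrative in-memory model: the class and method names (PropertyGraphSketch, Node, Edge, connect) are hypothetical and are not part of the Neo4j API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model for illustration; not the Neo4j API.
class PropertyGraphSketch {

    // A node carries a label (e.g. Person) and arbitrary key/value properties.
    static class Node {
        final String label;
        final Map<String, Object> properties = new HashMap<>();
        final List<Edge> outgoing = new ArrayList<>();
        final List<Edge> incoming = new ArrayList<>();

        Node(String label) {
            this.label = label;
        }
    }

    // An edge has a type, a direction (from -> to) and its own properties.
    static class Edge {
        final String type;
        final Node from;
        final Node to;
        final Map<String, Object> properties = new HashMap<>();

        Edge(String type, Node from, Node to) {
            this.type = type;
            this.from = from;
            this.to = to;
        }
    }

    // Connect two nodes; the edge is OUTGOING for 'from' and INCOMING for 'to'.
    static Edge connect(Node from, String type, Node to) {
        Edge edge = new Edge(type, from, to);
        from.outgoing.add(edge);
        to.incoming.add(edge);
        return edge;
    }

    public static void main(String[] args) {
        Node a = new Node("Person");
        a.properties.put("Name", "A");
        Node b = new Node("Person");
        b.properties.put("Name", "B");

        // "A works for B: since 2003, as secretary"
        Edge worksFor = connect(a, "WORKS_FOR", b);
        worksFor.properties.put("Since", 2003);
        worksFor.properties.put("As", "secretary");

        System.out.println(a.outgoing.size() + " outgoing from A, "
                + b.incoming.size() + " incoming to B");
    }
}
```

Keeping separate incoming and outgoing lists on each node is one simple way to make both directions cheap to follow; a real graph database will store this differently.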


Example

Relationship Depth

• Relationship depth measures the steps between one entity and another

• For example, friend is depth 1, friend of a friend is depth 2, etc.
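As a rough illustration of relationship depth, the sketch below counts the fewest steps between two people using a breadth-first search over a friendship map. The names (RelationshipDepthSketch, depth, Ann, Bob, Cat) are made up for the example, and the adjacency map is an assumed stand-in for a real graph store.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical helper for illustration; not part of Neo4j.
class RelationshipDepthSketch {

    // Returns the fewest steps from start to target, or -1 if unreachable.
    static int depth(Map<String, List<String>> friends, String start, String target) {
        Map<String, Integer> distance = new HashMap<>();
        Queue<String> queue = new ArrayDeque<>();
        distance.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (current.equals(target)) {
                return distance.get(current);
            }
            for (String next : friends.getOrDefault(current, List.of())) {
                if (!distance.containsKey(next)) {
                    // First time we reach 'next': it is one step further than 'current'.
                    distance.put(next, distance.get(current) + 1);
                    queue.add(next);
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        Map<String, List<String>> friends = Map.of(
                "Ann", List.of("Bob"),
                "Bob", List.of("Ann", "Cat"),
                "Cat", List.of("Bob"));
        System.out.println(depth(friends, "Ann", "Bob")); // friend: prints 1
        System.out.println(depth(friends, "Ann", "Cat")); // friend of a friend: prints 2
    }
}
```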


Graph Traversal

• Traversing a graph draws out relationships

• Traversing means moving from one node to another along the relationship edges

• As a node can have more than one relationship, traversal is not trivial

• There are algorithms that try to optimise the traversal of a graph

Traversal Type

• A graph traversal starts with a chosen node, either a specified root, or any given node

• It can follow INCOMING or OUTGOING relationships, so go in either direction

• Useful for asking “Who works for A?” or “Who does B report to?”

• Can traverse DEPTH_FIRST or BREADTH_FIRST


DEPTH or BREADTH First

• From a node with multiple edges, leading to long paths:

– Depth first follows the first path to its end, then returns and follows the second ...

– Breadth first follows all the first steps first, then the depth 2 paths, and so on

Example (Right to Left)

From Morpheus, starting with the edge to Reagan, following KNOWS:

Breadth first order is (Reagan), (Trinity), (Reagan – Agent Smith)

Depth first order is (Reagan – Agent Smith), (Trinity)
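The two orders can be reproduced with a small Java sketch. The KNOWS edges below are an assumption read off the example text (Morpheus knows Reagan and Trinity, Reagan knows Agent Smith), since the figure itself is not reproduced here; the class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the two traversal orders; not the Neo4j traversal API.
class TraversalOrderSketch {

    // Assumed OUTGOING KNOWS edges, in the order they are followed.
    static Map<String, List<String>> knows() {
        Map<String, List<String>> g = new LinkedHashMap<>();
        g.put("Morpheus", List.of("Reagan", "Trinity"));
        g.put("Reagan", List.of("Agent Smith"));
        return g;
    }

    // Breadth first: visit every depth-1 node before any depth-2 node.
    static List<String> breadthFirst(Map<String, List<String>> g, String start) {
        List<String> order = new ArrayList<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.removeFirst();
            for (String next : g.getOrDefault(current, List.of())) {
                if (!order.contains(next) && !next.equals(start)) {
                    order.add(next);
                    queue.addLast(next);
                }
            }
        }
        return order;
    }

    // Depth first: follow each path to its end before trying the next edge.
    static List<String> depthFirst(Map<String, List<String>> g, String start) {
        List<String> order = new ArrayList<>();
        visit(g, start, start, order);
        return order;
    }

    private static void visit(Map<String, List<String>> g, String start,
                              String current, List<String> order) {
        for (String next : g.getOrDefault(current, List.of())) {
            if (!order.contains(next) && !next.equals(start)) {
                order.add(next);
                visit(g, start, next, order);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(breadthFirst(knows(), "Morpheus")); // [Reagan, Trinity, Agent Smith]
        System.out.println(depthFirst(knows(), "Morpheus"));   // [Reagan, Agent Smith, Trinity]
    }
}
```

The only difference between the two methods is when a newly found node's own edges are followed: later (queue) for breadth first, immediately (recursion) for depth first.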


Indexing

• Nodes and edges are indexed to speed up searches for single entities and relationships

• This means the graph doesn’t need to be traversed to find them
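The idea behind an index can be sketched as a map from a property value straight to the node, so that finding a single entity is one lookup and not a walk over the graph. This is a deliberately simplified assumption about how an index behaves, not a description of Neo4j's index implementation; NameIndexSketch and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical name index for illustration; not Neo4j's index implementation.
class NameIndexSketch {

    // Maps the value of the "Name" property to the node (here just its property map).
    private final Map<String, Map<String, Object>> byName = new HashMap<>();

    // Register a node under its "Name" property.
    void add(Map<String, Object> node) {
        byName.put((String) node.get("Name"), node);
    }

    // A single hash lookup, no traversal; returns null if the name is not indexed.
    Map<String, Object> find(String name) {
        return byName.get(name);
    }

    public static void main(String[] args) {
        NameIndexSketch index = new NameIndexSketch();
        index.add(Map.of("Name", "Neo"));
        index.add(Map.of("Name", "Trinity"));
        System.out.println(index.find("Trinity"));
    }
}
```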

[Figure: an index pointing at the nodes Neo and Trinity, which are joined by “Loves” relationships]

Why Use a Graph?

• Many data structures are examples of graphs:

– Linked lists

– Trees

– Maps

• So a graph is a generic data structure

• One way to address the impedance mismatch we discussed in lecture 1: the objects in your Java code don’t match the structure of a DB

• The maths and algorithms of graphs are well understood


Facebook Graphs

• Using Touchgraph

Facebook Relationships

• Friends is the obvious one

• But you might also include

– Liked

• How many times

– Commented on

– In a relationship with

– Has chatted with


Queries

• Who likes me?

• Who has liked something I’ve posted?

• Who likes somebody I’ve liked?

• Which of my friends have chatted with each other?

• Do any of my old school friends know any of my university friends?

Query = Traversal

• The query “Who likes me” requires a traversal of depth 1 along the incoming relationships to me with the type “Likes”

• “Do any of my old school friends know any of my university friends” requires a traversal from you to all your friends, then from friend to friend
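Both queries can be sketched as traversals over a list of typed, directed edges. Everything here is an illustrative assumption: the class name, the relationship types (LIKES, SCHOOL_FRIEND, UNI_FRIEND, KNOWS) and the sample people are invented, and scanning a flat edge list is far simpler than what a graph database actually does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of "query = traversal"; not the Neo4j API.
class SocialQuerySketch {

    // A directed, typed edge between two named nodes.
    static class Edge {
        final String from;
        final String type;
        final String to;

        Edge(String from, String type, String to) {
            this.from = from;
            this.type = type;
            this.to = to;
        }
    }

    // Depth-1 traversal: sources of INCOMING edges of one type.
    static List<String> incoming(List<Edge> edges, String node, String type) {
        List<String> result = new ArrayList<>();
        for (Edge e : edges) {
            if (e.to.equals(node) && e.type.equals(type)) {
                result.add(e.from);
            }
        }
        return result;
    }

    // Depth-1 traversal: targets of OUTGOING edges of one type.
    static List<String> outgoing(List<Edge> edges, String node, String type) {
        List<String> result = new ArrayList<>();
        for (Edge e : edges) {
            if (e.from.equals(node) && e.type.equals(type)) {
                result.add(e.to);
            }
        }
        return result;
    }

    // Depth-2 query: from me to my friends, then from friend to friend.
    static List<String> schoolKnowsUniversity(List<Edge> edges, String me) {
        List<String> pairs = new ArrayList<>();
        for (String s : outgoing(edges, me, "SCHOOL_FRIEND")) {
            for (String u : outgoing(edges, me, "UNI_FRIEND")) {
                boolean linked = outgoing(edges, s, "KNOWS").contains(u)
                        || outgoing(edges, u, "KNOWS").contains(s);
                if (linked) {
                    pairs.add(s + " - " + u);
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<Edge> edges = List.of(
                new Edge("Ann", "LIKES", "Me"),
                new Edge("Bob", "LIKES", "Me"),
                new Edge("Me", "LIKES", "Cat"),
                new Edge("Me", "SCHOOL_FRIEND", "Ann"),
                new Edge("Me", "UNI_FRIEND", "Bob"),
                new Edge("Me", "UNI_FRIEND", "Dan"),
                new Edge("Ann", "KNOWS", "Dan"));
        System.out.println(incoming(edges, "Me", "LIKES"));     // [Ann, Bob]
        System.out.println(schoolKnowsUniversity(edges, "Me")); // [Ann - Dan]
    }
}
```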


Compared to Other DBs

A Graph Database transforms a RDBMS

Topple the stacks of records in a relational database while keeping all the relationships, and you’ll see a graph. Where an RDBMS is optimized for aggregated data, Neo4j is optimized for highly connected data.

http://docs.neo4j.org/chunked/stable/tutorial-comparing-models.html

Compared to Other DBs

A Graph Database elaborates a Key-Value Store

A Key-Value model is great for lookups of simple values or lists. When the values are themselves interconnected, you’ve got a graph. Neo4j lets you elaborate the simple data structures into more complex, interconnected data.


Compared to Other DBs

A Graph Database navigates a Document Store

The container hierarchy of a document database accommodates schema-free data that can easily be represented as a tree. Which is of course a graph. Refer to other documents (or document elements) within that tree and you have a more expressive representation of the same data. When in Neo4j, those relationships are easily navigable.

D=Document, S=Subdocument, V=Value, D2/S2 = reference to subdocument in (other) document

Neo4j Query Language - Cypher

• Neo4j’s query language, Cypher, is a declarative language, like SQL

• Graph traversal is handled at a lower level, so you don’t need to write traversals

• Commands are built from clauses that represent matches to patterns and relationships


Create Nodes

• Create a node bound to the variable Kev, label it Person and provide properties:

CREATE (Kev:Person {Name: 'Kevin', Age: 45})

• And

CREATE (Beer:Drink {Name: 'Beer', Alcoholic: 'Yes'})

Retrieve Nodes

• Use the MATCH clause

MATCH (a:Person) WHERE a.Name = "Kevin" RETURN a

• Binds a to the node with the name "Kevin"

• i.e. {Name: 'Kevin', Age: 45}


Create Relationships

• Create a relationship between Kevin and Beer (the Beer node was created with the label Drink, so that is the label to match)

MATCH (a:Person), (b:Drink)
WHERE a.Name = "Kevin" AND b.Name = "Beer"
CREATE (a)-[r:Likes]->(b)

Neo4j in Java

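• Neo4j is written in Java (that is what the 4j means)

// graphDb is assumed to be an open GraphDatabaseService, and
// RelTypes an enum implementing RelationshipType with a LIKES constant.
Node kevin = graphDb.createNode();
kevin.setProperty("name", "Kevin");
kevin.setProperty("Age", "45");

Node beer = graphDb.createNode();
beer.setProperty("Alcoholic", "Yes");

// Directed relationship kevin -[LIKES]-> beer, with a property of its own
Relationship rel = kevin.createRelationshipTo(beer, RelTypes.LIKES);
rel.setProperty("HowMuch", "Quite a lot");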


Summary

• Graph databases are good for representing entities and the relationships between them

• Relationships are far richer than in a traditional relational database

• Support ACID transactions

• Good for modelling naturally graph-like structures such as geographic locations, social networks, etc.