the benefits of graph databases
8/14/2019 The Benefits of Graph Databases
http://slidepdf.com/reader/full/the-benets-of-graph-databases 1/62
Neo4j: the benefits of graph databases
Emil Eifrem, CEO, Neo Technology
#neo4j@[email protected]
What's the plan?
Why now? – Four trends
NoSQL overview
Graph databases && Neo4j
Conclusions
Food
Trend 1: data set size
[Chart. Source: IDC 2007]
Trend 2: connectedness
[Chart: information connectivity (y-axis) over time (x-axis: 1990, 2000, 2010, 2020; web 1.0, web 2.0, “web 3.0”). Items plotted: text documents, hypertext, RSS, blogs, wikis, user-generated content, tagging, folksonomies, ontologies, RDF, Giant Global Graph (GGG)]
Trend 3: semi-structure
Individualization of content!
In the salary lists of the 1970s, all elements had exactly one job
In the salary lists of the 2000s, we need 5 job columns! Or 8? Or 15?
Trend accelerated by the decentralization of content generation that is the hallmark of the age of participation (“web 2.0”)
Aside: RDBMS performance
[Chart: relational database performance (y-axis) versus data complexity (x-axis). Labels along the curve: salary list, majority of webapps, social network, semantic trading; a brace labelled “custom”]
Trend 4: architecture
2000s: (Slowly towards...) Decoupled services with own backend
Why NoSQL 2009?
Trend 1: Size.
Trend 2: Connectivity.
Trend 3: Semi-structure.
Trend 4: Architecture.
NoSQL
overview
First off: the damn name
NoSQL is NOT “Never SQL”
NoSQL is NOT “No To SQL”
NoSQL is NOT “WE HATE CHRIS' DOG”
Four (emerging) NoSQL categories
Key-value stores
Based on Amazon's Dynamo paper
Data model: (global) collection of K-V pairs
Example: Dynomite, Voldemort, Tokyo
BigTable clones
Based on Google's BigTable paper
Data model: big table, column families
Example: HBase, Hypertable
Four (emerging) NoSQL categories
Document databases
Inspired by Lotus Notes
Data model: collections of K-V collections
Example: CouchDB, MongoDB
Graph databases
Inspired by Euler & graph theory
Data model: nodes, rels, K-V on both
Example: AllegroGraph, VertexDB, Neo4j
NoSQL data models
[Chart: size (y-axis) versus complexity (x-axis), plotting key-value stores, Bigtable clones, document databases and graph databases]
NoSQL data models
[Chart: size (y-axis) versus complexity (x-axis), plotting key-value stores, Bigtable clones, document databases and graph databases, with a region marked “90% of use cases”]
(This is still billions of nodes & relationships)
Graph DBs
& Neo4j intro
The Graph DB model: representation
Core abstractions:
Nodes
Relationships between nodes
Properties on both
[Diagram: nodes 1, 2 and 3. One node has name = “Emil”, age = 29, sex = “yes”; one node has type = car, vendor = “SAAB”, model = “95 Aero”; a relationship has type = KNOWS, time = 4 years]
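The three core abstractions can be sketched in plain Java without any database behind them. This is only an illustration of the data model, not Neo4j's implementation: the class names `PropertyGraphSketch`, `PropertyContainer`, `Node` and `Relationship` and the helper `buildExample` are hypothetical, and the second node is left without properties because the slide does not say what it carries.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PropertyGraphSketch {

    /** Anything that carries key-value properties: nodes and relationships alike. */
    public static class PropertyContainer {
        private final Map<String, Object> properties = new HashMap<>();

        public void setProperty(String key, Object value) {
            properties.put(key, value);
        }

        public Object getProperty(String key) {
            return properties.get(key);
        }
    }

    /** A node keeps track of its outgoing relationships. */
    public static class Node extends PropertyContainer {
        public final List<Relationship> outgoing = new ArrayList<>();

        public Relationship createRelationshipTo(Node other, String type) {
            Relationship rel = new Relationship(this, other, type);
            outgoing.add(rel);
            return rel;
        }
    }

    /** A typed, directed link between two nodes; it has properties of its own. */
    public static class Relationship extends PropertyContainer {
        public final Node start;
        public final Node end;
        public final String type;

        Relationship(Node start, Node end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    /** Builds the KNOWS part of the slide's diagram (hypothetical helper). */
    public static Node buildExample() {
        Node emil = new Node();
        emil.setProperty("name", "Emil");
        emil.setProperty("age", 29);

        Node other = new Node(); // properties not given on the slide

        Relationship knows = emil.createRelationshipTo(other, "KNOWS");
        knows.setProperty("time", "4 years");
        return emil;
    }

    public static void main(String[] args) {
        Node emil = buildExample();
        Relationship rel = emil.outgoing.get(0);
        System.out.println(emil.getProperty("name") + " -[" + rel.type
                + ", time = " + rel.getProperty("time") + "]-> (node 2)");
    }
}
```

Putting properties on a shared base class is one way to express "properties on both"; a real store would also index incoming relationships.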
Example: The Matrix
[Diagram: nodes with ids 1, 2, 3, 7, 13 and 42. Node properties: name = “Thomas Anderson”, age = 29; name = “Trinity”; name = “Morpheus”, rank = “Captain”, occupation = “Total badass”; name = “Cypher”, last name = “Reagan”; name = “Agent Smith”, version = 1.0b, language = C++; name = “The Architect”. Relationships: several KNOWS (some with properties such as age = 3 days; disclosure = public; disclosure = secret, age = 6 months) and one CODED_BY]
Code (1): Building a node space
NeoService neo = ... // Get factory

// Create Thomas 'Neo' Anderson
Node mrAnderson = neo.createNode();
mrAnderson.setProperty( "name", "Thomas Anderson" );
mrAnderson.setProperty( "age", 29 );

// Create Morpheus
Node morpheus = neo.createNode();
morpheus.setProperty( "name", "Morpheus" );
morpheus.setProperty( "rank", "Captain" );
morpheus.setProperty( "occupation", "Total bad ass" );

// Create a relationship representing that they know each other
mrAnderson.createRelationshipTo( morpheus, RelTypes.KNOWS );

// ...create Trinity, Cypher, Agent Smith, Architect similarly
Code (1): Building a node space
NeoService neo = ... // Get factory
Transaction tx = neo.beginTx();

// Create Thomas 'Neo' Anderson
Node mrAnderson = neo.createNode();
mrAnderson.setProperty( "name", "Thomas Anderson" );
mrAnderson.setProperty( "age", 29 );

// Create Morpheus
Node morpheus = neo.createNode();
morpheus.setProperty( "name", "Morpheus" );
morpheus.setProperty( "rank", "Captain" );
morpheus.setProperty( "occupation", "Total bad ass" );

// Create a relationship representing that they know each other
mrAnderson.createRelationshipTo( morpheus, RelTypes.KNOWS );

// ...create Trinity, Cypher, Agent Smith, Architect similarly
tx.commit();
Code (1b): Defining RelationshipTypes
// In package org.neo4j.api.core
public interface RelationshipType
{
    String name();
}

// In package org.yourdomain.yourapp
// Example on how to roll dynamic RelationshipTypes
class MyDynamicRelType implements RelationshipType
{
    private final String name;
    MyDynamicRelType( String name ) { this.name = name; }
    public String name() { return this.name; }
}

// Example on how to kick it, static-RelationshipType-like
enum MyStaticRelTypes implements RelationshipType
{
    KNOWS,
    WORKS_FOR,
}
Whiteboard friendly
[Diagram: whiteboard sketch with the labels Björn, Big Car and DayCare, connected by relationships labelled owns, drives and build]
The Graph DB model: traversal
Traverser framework for high-performance traversing across the node space
[Diagram: nodes 1, 2 and 3. One node has name = “Emil”, age = 29, sex = “yes”; one node has type = car, vendor = “SAAB”, model = “95 Aero”; a relationship has type = KNOWS, time = 4 years]
Example: Mr Anderson's friends
[Diagram: the Matrix graph again, with nodes Thomas Anderson, Trinity, Morpheus, Cypher, Agent Smith and The Architect, connected by KNOWS and CODED_BY relationships]
Code (2): Traversing a node space
// Instantiate a traverser that returns Mr Anderson's friends
Traverser friendsTraverser = mrAnderson.traverse(
    Traverser.Order.BREADTH_FIRST,
    StopEvaluator.END_OF_GRAPH,
    ReturnableEvaluator.ALL_BUT_START_NODE,
    RelTypes.KNOWS,
    Direction.OUTGOING );

// Traverse the node space and print out the result
System.out.println( "Mr Anderson's friends:" );
for ( Node friend : friendsTraverser )
{
    System.out.printf( "At depth %d => %s%n",
        friendsTraverser.currentPosition().getDepth(),
        friend.getProperty( "name" ) );
}
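To make the traversal order concrete, here is a rough stand-alone sketch of what a breadth-first walk over outgoing KNOWS relationships does. It does not use the Neo4j API: `FriendsTraversalSketch` and `friendsOf` are hypothetical names, and the edge list is an assumption read off the Matrix diagram (Thomas Anderson knows Morpheus and Trinity, Morpheus knows Trinity and Cypher, Cypher knows Agent Smith).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

public class FriendsTraversalSketch {

    // Outgoing KNOWS edges, assumed from the Matrix diagram.
    static final Map<String, List<String>> KNOWS = new HashMap<>();
    static {
        KNOWS.put("Thomas Anderson", List.of("Morpheus", "Trinity"));
        KNOWS.put("Morpheus", List.of("Trinity", "Cypher"));
        KNOWS.put("Cypher", List.of("Agent Smith"));
    }

    /**
     * Breadth-first walk to the end of the graph, returning one line per
     * reached node and skipping the start node itself.
     */
    public static List<String> friendsOf(String start) {
        List<String> result = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        Map<String, Integer> depth = new HashMap<>();
        Queue<String> queue = new ArrayDeque<>();

        visited.add(start);
        depth.put(start, 0);
        queue.add(start);

        while (!queue.isEmpty()) {
            String current = queue.remove();
            int d = depth.get(current);
            if (d > 0) {
                result.add("At depth " + d + " => " + current);
            }
            for (String friend : KNOWS.getOrDefault(current, List.of())) {
                // Set.add returns false for nodes already seen, so each node is visited once.
                if (visited.add(friend)) {
                    depth.put(friend, d + 1);
                    queue.add(friend);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("Mr Anderson's friends:");
        friendsOf("Thomas Anderson").forEach(System.out::println);
    }
}
```

The depth check plays the role of ALL_BUT_START_NODE, and running until the queue is empty corresponds to END_OF_GRAPH.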
$ bin/start-neo-example
Mr Anderson's friends:
At depth 1 => Morpheus
At depth 1 => Trinity
At depth 2 => Cypher
At depth 3 => Agent Smith
$

friendsTraverser = mrAnderson.traverse(
    Traverser.Order.BREADTH_FIRST,
    StopEvaluator.END_OF_GRAPH,
    ReturnableEvaluator.ALL_BUT_START_NODE,
    RelTypes.KNOWS,
    Direction.OUTGOING );
[Diagram: the Matrix graph, with nodes Thomas Anderson, Trinity, Morpheus, Cypher, Agent Smith and The Architect, connected by KNOWS and CODED_BY relationships]
Example: Friends in love?
[Diagram: the Matrix graph, with nodes Thomas Anderson, Trinity, Morpheus, Cypher, Agent Smith and The Architect, connected by KNOWS and CODED_BY relationships, plus one added LOVES relationship]
Code (3a): Custom traverser
// Create a traverser that returns all “friends in love”
Traverser loveTraverser = mrAnderson.traverse(
    Traverser.Order.BREADTH_FIRST,
    StopEvaluator.END_OF_GRAPH,
    new ReturnableEvaluator()
    {
        public boolean isReturnableNode( TraversalPosition pos )
        {
            return pos.currentNode().hasRelationship(
                RelTypes.LOVES, Direction.OUTGOING );
        }
    },
    RelTypes.KNOWS,
    Direction.OUTGOING );
Code (3a): Custom traverser
// Traverse the node space and print out the result
System.out.println( "Who's a lover?" );
for ( Node person : loveTraverser )
{
    System.out.printf( "At depth %d => %s%n",
        loveTraverser.currentPosition().getDepth(),
        person.getProperty( "name" ) );
}
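The custom evaluator only changes which visited nodes are returned, not which nodes are visited. A plain-Java sketch of that separation follows, with a `Predicate` standing in for the ReturnableEvaluator. The names `LoveTraversalSketch`, `traverse` and `friendsInLove` are hypothetical, and both edge lists are assumptions read off the diagram, in particular that the LOVES relationship goes out from Trinity.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.function.Predicate;

public class LoveTraversalSketch {

    // Outgoing edges per relationship type, assumed from the diagram.
    static final Map<String, List<String>> KNOWS = new HashMap<>();
    static final Map<String, List<String>> LOVES = new HashMap<>();
    static {
        KNOWS.put("Thomas Anderson", List.of("Morpheus", "Trinity"));
        KNOWS.put("Morpheus", List.of("Trinity", "Cypher"));
        KNOWS.put("Cypher", List.of("Agent Smith"));
        LOVES.put("Trinity", List.of("Thomas Anderson"));
    }

    /** Breadth-first over KNOWS; a node is reported only if the predicate accepts it. */
    public static List<String> traverse(String start, Predicate<String> returnable) {
        List<String> result = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        Map<String, Integer> depth = new HashMap<>();
        Queue<String> queue = new ArrayDeque<>();

        visited.add(start);
        depth.put(start, 0);
        queue.add(start);

        while (!queue.isEmpty()) {
            String current = queue.remove();
            int d = depth.get(current);
            if (returnable.test(current)) {
                result.add("At depth " + d + " => " + current);
            }
            for (String friend : KNOWS.getOrDefault(current, List.of())) {
                if (visited.add(friend)) {
                    depth.put(friend, d + 1);
                    queue.add(friend);
                }
            }
        }
        return result;
    }

    /** Returns every reachable person that has an outgoing LOVES edge. */
    public static List<String> friendsInLove(String start) {
        return traverse(start,
                person -> !LOVES.getOrDefault(person, List.of()).isEmpty());
    }

    public static void main(String[] args) {
        System.out.println("Who's a lover?");
        friendsInLove("Thomas Anderson").forEach(System.out::println);
    }
}
```

Keeping the walk and the filter apart means a new question about the same graph needs only a new predicate.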
new ReturnableEvaluator()
{
    public boolean isReturnableNode( TraversalPosition pos )
    {
        return pos.currentNode().hasRelationship(
            RelTypes.LOVES, Direction.OUTGOING );
    }
},

$ bin/start-neo-example
Who's a lover?
At depth 1 => Trinity
$
[Diagram: the Matrix graph, with nodes Thomas Anderson, Trinity, Morpheus, Cypher, Agent Smith and The Architect, connected by KNOWS and CODED_BY relationships, plus one LOVES relationship]
Bonus code: domain model
How do you implement your domain model?
Use the delegator pattern, i.e. every domain entity wraps a Neo4j primitive:
// In package org.yourdomain.yourapp
class PersonImpl implements Person
{
    private final Node underlyingNode;
    PersonImpl( Node node ) { this.underlyingNode = node; }

    public String getName()
    {
        // getProperty returns Object, so cast to the expected type
        return (String) this.underlyingNode.getProperty( "name" );
    }
    public void setName( String name )
    {
        this.underlyingNode.setProperty( "name", name );
    }
}
Domain layer frameworks
Qi4j (www.qi4j.org)
Framework for doing DDD in pure Java5
Defines Entities / Associations / Properties
Sound familiar? Nodes / Rels / Properties!
Neo4j is an “EntityStore” backend
NeoWeaver (http://components.neo4j.org/neo-weaver)
Weaves Neo4j-backed persistence into domain objects in runtime (dynamic proxy / cglib based)
Veeeery alpha
Neo4j system characteristics
Disk-based
Native graph storage engine with custom binary on-disk format
Transactional
JTA/JTS, XA, 2PC, Tx recovery, deadlock detection, MVCC, etc
Scales up (what's the x and the y?)
Several billions of nodes/rels/props on single JVM
Robust
6+ years in 24/7 production
Social network pathExists()
~1k persons
Avg 50 friends per person
pathExists(a, b) limit depth 4
Two backends
Eliminate disk IO so warm up caches
[Diagram: fragment of the social network graph with numbered person nodes]
Social network pathExists()
[Diagram: person nodes 1 Mike, 2 Emil, 3 Marcus, 4 Leigh, 5 Kevin, 7 John, 9 Bruce]
                          # persons    query time
Relational database       1 000        2 000 ms
Graph database (Neo4j)    1 000        2 ms
Graph database (Neo4j)    1 000 000    2 ms
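For readers who want to see what a depth-limited pathExists(a, b) amounts to, here is a minimal in-memory sketch. It is not the benchmark code from the talk, which is not shown on the slides: `PathExistsSketch` and its method signature are hypothetical, and the adjacency map is a stand-in for whichever backend stores the friendships.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class PathExistsSketch {

    /**
     * Returns true if b can be reached from a by following at most maxDepth
     * friendship edges. The map holds outgoing edges per person; for a
     * symmetric friendship the caller lists the edge in both directions.
     */
    public static boolean pathExists(Map<Integer, List<Integer>> friends,
                                     int a, int b, int maxDepth) {
        if (a == b) {
            return true;
        }
        Set<Integer> visited = new HashSet<>();
        visited.add(a);
        List<Integer> frontier = List.of(a);

        // Expand one level of the friend graph per iteration.
        for (int depth = 1; depth <= maxDepth && !frontier.isEmpty(); depth++) {
            List<Integer> next = new ArrayList<>();
            for (int person : frontier) {
                for (int friend : friends.getOrDefault(person, List.of())) {
                    if (friend == b) {
                        return true;
                    }
                    if (visited.add(friend)) {
                        next.add(friend);
                    }
                }
            }
            frontier = next;
        }
        return false;
    }

    public static void main(String[] args) {
        // A simple chain 1 -> 2 -> 3 -> 4 -> 5 -> 6, purely for illustration.
        Map<Integer, List<Integer>> friends = new HashMap<>();
        for (int i = 1; i < 6; i++) {
            friends.put(i, List.of(i + 1));
        }
        System.out.println(pathExists(friends, 1, 5, 4));
        System.out.println(pathExists(friends, 1, 6, 4));
    }
}
```

The work done depends on how many people sit within four hops of the start, not on the total number of persons stored, which is one way to read the flat query time in the table.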
Pros & Cons compared to RDBMS
+ No O/R impedance mismatch (whiteboard friendly)
+ Can easily evolve schemas
+ Can represent semi-structured info
+ Can represent graphs/networks (with performance)
- Lacks in tool and framework support
- Few other implementations => potential lock in
- No support for ad-hoc queries
More consequences
Ability to capture semi-structured information
=> allowing individualization of content
No predefined schema
=> easier to evolve model
=> can capture ad-hoc relationships
Can capture non-normative relations
=> easy to model specific links to specific sets
All state is kept in transactional memory
=> improves application concurrency
The Neo4j ecosystem
Neo4j is an embedded database
Tiny teeny lil jar file
Component ecosystem
index-util
neo-meta
neo-utils
pattern-match
sparql-engine
...
See http://components.neo4j.org
Language bindings
Neo4j.py – bindings for Jython and CPython
http://components.neo4j.org/neo4j.py
Neo4jrb – bindings for JRuby (incl RESTful API)
http://wiki.neo4j.org/content/Ruby
Clojure
http://wiki.neo4j.org/content/Clojure
Scala (incl RESTful API)
http://wiki.neo4j.org/content/Scala
… .NET? Erlang?
Grails Neoclipse screendump
Scale out – replication
Rolling out Neo4j HA before end-of-year
Side note: ppl roll it today w/ REST frontends & online backup
Master-slave replication, 1st configuration
MySQL style... ish
Except all instances can write, synchronously between writing slave & master (strong consistency)
Updates are asynchronously propagated to the other slaves (eventual consistency)
This can handle billions of entities...
… but not 100B
Scale out – partitioning
Sharding possible today
… but you have to do manual work
… just as with MySQL
Great option: shard on top of resilient, scalable OSS app server, see: www.codecauldron.org
Transparent partitioning? Neo4j 2.0
100B? Easy to say. Sliiiiightly harder to do.
Fundamentals: BASE & eventual consistency
Generic clustering algorithm as base case, but give lots of knobs for developers
How ego are you? (aka other impls?)
Franz AllegroGraph (http://agraph.franz.com)
Proprietary, Lisp, RDF-oriented but real graphdb
FreeBase graphd (http://bit.ly/13VITB)
In-house at Metaweb
Kloudshare (http://kloudshare.com)
Graph database in the cloud, still stealth mode
Google Pregel (http://bit.ly/dP9IP)
We are oh-so-secret
Some academic papers from ~10 years ago
G = {V, E} #FAIL
Conclusion
Graphs && Neo4j => teh awesome!
Available NOW under AGPLv3 / commercial license
AGPLv3: “if you're open source, we're open source”
If you have proprietary software? Must buy a commercial license
But up to 1M primitives it's free for all uses!
Download
http://neo4j.org
Feedback
http://lists.neo4j.org
Party pooper slides
Poop 1
Key-value stores?
=> the awesome
… if you have 1000s of BILLIONS of records OR you don't care about programmer productivity
What if you had no variables at all in your programs except a single globally accessible hashtable?
Would your software be maintainable?
Poop 2
In a not-suck architecture...
… the only thing that makes sense is to have an embedded database.
Poop 3
Exposing your data model on the wire is bad. Period.
Adding a couple of buzzwords doesn't make it less bad.
If it was bad with SQL-over-sockets (hint: it was) then – surprise! – it's still bad even tho you use Hype-compliant (tm) JSON-over-REST.
We don't want to couple everything to a specific data model again!
Poop 4
In-memory database
What the hell?
That's an oxymoron!
Up next: ascii-only JPEG
Up next: loopback-only web server
If you're not durable, you're a cache!
If you happen to asynchronously spill over to disk, you're a cache that asynchronously spills over to disk.
Ait
so, srsly?
Looking ahead: polyglot persistence
NoSQL && SQL
Questions?
Image credit: lost again! Sorry :(
http://neotechnology.com