shutl
Shutl delivers with Neo4j
Tuesday, 30 July 13
Volker Pacher
senior developer @shutl
@vpacher
http://github.com/vpacher
Problems?
problems with our previous attempt (v1):
• exponential growth of joins in MySQL as features were added
• code base too complex and unmaintainable
• API response times kept growing as more data was added
• our fastest delivery was quicker than our slowest query!
The case for graph databases:
• relationships are stored explicitly (an RDBMS has no first-class relationships; they are reconstructed with joins)
• domain modelling is simplified because adding new 'subgraphs' doesn't affect the existing structure and queries (additive model)
• white board friendly
• schema-less
• db performance remains relatively constant as data is added, because each query is localized to its own portion of the graph: roughly O(1) for the same query, regardless of total graph size
• traversals of relationships are easy and very fast
What is a graph anyway?
a collection of vertices (nodes)
connected by edges (relationships)
[figure: four nodes, Node 1 to Node 4, joined by edges]
a short history: the seven bridges of Königsberg (1735)
Leonhard Euler
directed graph
each relationship has a direction, i.e. one start node and one end node
[figure: four nodes, Node 1 to Node 4, joined by directed edges]
property graph:
• nodes contain properties (key, value)
• relationships have a type and are always directed
• relationships can contain properties too
[figure: four nodes (name: Volker, name: Sam, name: Megan, name: Paul) joined by :friends, :knows and :works_for relationships; one :knows relationship carries the property since: 2005]
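The three rules above can be sketched in a few lines of plain Python. This is only an in-memory illustration of the property graph model, not Neo4j's API: the `Node` and `Relationship` classes and the `outgoing` helper are hypothetical names, and which people are connected to which is an assumption, since the slide only names the nodes and relationship types.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Node:
    # Nodes contain properties as key/value pairs.
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node  # relationships are always directed: start -> end
    end: Node
    type: str    # every relationship has a type
    # Relationships can contain properties too.
    properties: Dict[str, Any] = field(default_factory=dict)


volker = Node({"name": "Volker"})
sam = Node({"name": "Sam"})
megan = Node({"name": "Megan"})
paul = Node({"name": "Paul"})

# Assumed wiring, for illustration only.
relationships: List[Relationship] = [
    Relationship(volker, sam, "friends"),
    Relationship(volker, megan, "knows", {"since": 2005}),
    Relationship(sam, paul, "works_for"),
]


def outgoing(node: Node, rel_type: str) -> List[Node]:
    """Return the end nodes of all rel_type relationships starting at node."""
    return [r.end for r in relationships
            if r.start is node and r.type == rel_type]


known = outgoing(volker, "knows")
```

Here `known` holds the nodes Volker points to through a `knows` relationship, and the `since` value lives on the relationship itself rather than on either node.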
a graph is its own index (constant query performance)
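One way to read this claim: when every node holds direct references to its neighbours, following a relationship costs the same no matter how large the rest of the graph is. The sketch below is a simplified in-memory model under that assumption, not Neo4j's storage engine; the `Node` class, `connect`, and `friends_of` are hypothetical names.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.outgoing = []  # direct references to neighbouring nodes


def connect(start, end):
    start.outgoing.append(end)


def friends_of(node):
    """Follow stored references; return (neighbour names, nodes touched)."""
    touched = 0
    names = []
    for neighbour in node.outgoing:
        touched += 1
        names.append(neighbour.name)
    return names, touched


me, steve, sam = Node("me"), Node("Steve"), Node("Sam")
connect(me, steve)
connect(me, sam)
small = friends_of(me)

# Grow the graph with 10,000 unrelated nodes chained to each other.
unrelated = [Node(f"n{i}") for i in range(10_000)]
for a, b in zip(unrelated, unrelated[1:]):
    connect(a, b)
large = friends_of(me)
```

The lookup touches the same two nodes before and after the graph grows, because it never scans a global table or index.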
Querying the graph: Cypher
• declarative query language specific to Neo4j
• easy to learn and intuitive
• lets the user describe the pattern to query for (something that looks like 'this')
• inspired partly by SQL (WHERE and ORDER BY) and SPARQL (pattern matching)
• focuses on what to query for and not how to query for it
• switching from a MySQL world is easier with Cypher, because there is no need to learn a traversal framework straight away
cypher clauses
• START: Starting points in the graph, obtained via index lookups or by element IDs.
• MATCH: The graph pattern to match, bound to the starting points in START.
• WHERE: Filtering criteria.
• RETURN: What to return.
• CREATE: Creates nodes and relationships.
• DELETE: Removes nodes, relationships and properties.
• SET: Set values to properties.
• FOREACH: Performs updating actions once per element in a list.
• WITH: Divides a query into multiple, distinct parts.
an example graph
Node 1: me
Node 2: Steve
Node 3: Sam
Node 4: David
Node 5: Megan
me -[:knows]-> Steve -[:knows]-> David
me -[:knows]-> Sam -[:knows]-> Megan
Megan -[:knows]-> David
the query
START me=node(1)
MATCH me-[:knows]->()-[:knows]->fof
RETURN fof
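To see what this pattern matches on the example graph, the same two-hop walk can be written out in plain Python. This is a sketch of the query's semantics over an adjacency dict, not how Neo4j evaluates it; `KNOWS` and `friends_of_friends` are hypothetical names.

```python
# The :knows relationships of the example graph, as start -> [end, ...].
KNOWS = {
    "me": ["Steve", "Sam"],
    "Steve": ["David"],
    "Sam": ["Megan"],
    "Megan": ["David"],
    "David": [],
}


def friends_of_friends(start):
    """Mirror start-[:knows]->()-[:knows]->fof: one result per two-hop path."""
    return [fof for friend in KNOWS[start] for fof in KNOWS[friend]]


fof = friends_of_friends("me")
print(fof)  # ['David', 'Megan']
```

David is reached through Steve and Megan through Sam; the anonymous `()` in the Cypher pattern corresponds to the intermediate `friend` variable, which is never returned.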
START me=node(1)
MATCH me-[:knows*2..]->fof
WHERE fof.name =~ 'Da.*'
RETURN fof
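The variable-length pattern `*2..` means "two or more :knows hops", and `=~` tests the name against a regular expression. A rough Python equivalent over the example graph might look like the following; `KNOWS`, `paths_from`, and `fof_matching` are hypothetical names, and the cycle guard on nodes is a simplification of Cypher's rule that a relationship is not traversed twice within one match.

```python
import re

# The :knows relationships of the example graph, as start -> [end, ...].
KNOWS = {
    "me": ["Steve", "Sam"],
    "Steve": ["David"],
    "Sam": ["Megan"],
    "Megan": ["David"],
    "David": [],
}


def paths_from(start, min_hops):
    """Yield every :knows path from start with at least min_hops hops."""
    stack = [[start]]
    while stack:
        path = stack.pop()
        if len(path) - 1 >= min_hops:
            yield path
        for nxt in KNOWS[path[-1]]:
            if nxt not in path:  # guard against cycles (the example has none)
                stack.append(path + [nxt])


def fof_matching(start, pattern, min_hops=2):
    """Return the end node of each long-enough path whose name matches."""
    regex = re.compile(pattern)
    return [p[-1] for p in paths_from(start, min_hops)
            if regex.fullmatch(p[-1])]


matches = fof_matching("me", "Da.*")
```

On this graph `matches` contains David twice, once for the two-hop path through Steve and once for the three-hop path through Sam and Megan. Megan is reachable in two hops but is filtered out by the name pattern.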
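representing dates/times
[figure: a date tree hanging off root (0). Relationships named after the year (2013, 2014) lead to year nodes, relationships named after the month (01, 05, 06) lead to month nodes, and relationships named after the day (24, 25, 26) lead to day nodes. Event 1, Event 2 and Event 3 are attached to day nodes by four :happens relationships.]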
find all events on a specific day
START root=node(0)
MATCH root-[:`2013`]-()-[:`05`]-()-[:`24`]-()-[:happens]-event
RETURN event
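The idea behind this query is that the year, month and day are encoded as relationship types, so finding a day's events is a fixed three-step walk from the root followed by one :happens hop. A minimal Python sketch of that walk, using nested dicts in place of the graph, is shown below. The `root` layout and `events_on` are hypothetical, and the assignment of events to days is an assumption, because the slide does not make it recoverable.

```python
# Each level maps a relationship type ("2013", "05", "24") to its child.
# Which event hangs off which day is assumed for illustration.
root = {
    "2013": {"05": {"24": ["Event 1"], "25": ["Event 2"]}},
    "2014": {"06": {"26": ["Event 3"]}},
}


def events_on(tree, year, month, day):
    """Follow year -> month -> day, then return the events of that day."""
    node = tree
    for key in (year, month, day):
        node = node.get(key)
        if node is None:  # no such year, month or day in the tree
            return []
    return list(node)


events = events_on(root, "2013", "05", "24")
```

The lookup never inspects events on other days, which is the same locality argument the deck makes for graph queries in general.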
QUESTIONS?
Volker Pacher