graph database

Post on 07-Aug-2015

Graph Database

General Discussion

Richard Kuo

References

Extracted from:

• http://neo4j.org/, Tobias Ivarsson, Emil Eifrem

• http://markorodriguez.com, Marko A. Rodriguez

• http://www.jayway.com/, Andreas Ronge

• etc.

4/12/2011 Creative Commons Attribution-Share Alike 3.0 2

Outline

• NoSQL

– What, Why, Who

• Graph Database

– Graph Theory

– Benefit

• Neo4J

– Function & Feature

– Code & Demo

Why? Not only SQL

Size

• Distributed data with accelerating growth of data

• Scalability & elasticity (at low cost!)

Connectedness

• Global linked data

Semi-structure

• Flexible schemas / semi-structured data

• Complex queries

Architecture

• Data mining and association toward more complex data modeling

• Transactions / strong consistency / integrity

• Geographic distribution (multiple datacenters)

http://richard.cyganiak.de/2007/10/lod/lod-datasets_2010-09-22_colored.html

NoSQL Taxonomy

Key-Value stores

• Simple K/V lookups (DHT)

Column stores

• Each key is associated with many attributes (columns)

• NoSQL column stores are actually hybrid row/column stores

– Different from “pure” relational column stores!

Document stores

• Store semi-structured documents (JSON)

• Map/Reduce based materialization, sorting, aggregation, etc.

Graph databases

• Scalable, semi-structured data model

More …

Graph Database Comparison

http://nosql.mypopescu.com/post/619181345/nosql-graph-database-matrix

GRAPH DATABASE

Why Graph Databases?

Data mining

• You can build algorithms that search for patterns and add AI techniques

Highly critical environments

• You can apply Neo4j to high-load databases to optimize queries and reduce hardware costs

Engineering of biochemical components

• You can build algorithms that help study protein synthesis, for example

Discrete event simulation

• You can model patterns and behaviors and map them onto a graph database

Social graph

• Everything related to a user's “tastes” can be organized in a graph

Network architecture

When should I use a Graph DB?

Massive data volumes

• Massively distributed architecture required to store the data

• Google, Amazon, Yahoo, Facebook – 10-100K servers

Extreme query workload

• Impossible to efficiently do joins at that scale with an RDBMS

Have a complex and evolving data model

• Big part of domain is expressed as relationships

• Schema flexibility (migration) is not trivial at large scale

• Schema changes can be gradually introduced with NoSQL

• Few mandatory and many optional attributes

• Have SQL queries that span many table joins

Many YES => maybe a Graph DB is a good choice

When NOT to use a Graph DB

• Don't have a graph-related problem?

• Requirements don't change much?

• Data is easy to organize into:

− Tables, Documents or Key-Value models?

• Few & well-defined relationships in the domain?

• Don't have SQL queries that span many table joins?

Many YES => maybe a Graph DB is not a good choice

Undirected Graph

• Dots (vertices) + lines (edges) = graphs.

• The Undirected Graph

Vertices

• All vertices denote the same type of object.

Edges

• All edges denote the same type of relationship.

• All edges denote a symmetric relationship.
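
The undirected graph described above can be sketched in a few lines of Python. This is only an illustration, not code from the slides: the class name `UndirectedGraph`, its methods, and the sample vertex names are all assumptions. Every vertex is the same kind of object (a string), there is a single edge type, and each edge is stored in both directions so the relationship is symmetric.

```python
from collections import defaultdict


class UndirectedGraph:
    """Hypothetical sketch: one vertex type, one symmetric edge type."""

    def __init__(self):
        # Maps each vertex to the set of its neighbours.
        self._adjacent = defaultdict(set)

    def add_edge(self, a, b):
        # Symmetric relationship: record the edge in both directions.
        self._adjacent[a].add(b)
        self._adjacent[b].add(a)

    def neighbours(self, vertex):
        # Return a copy so callers cannot mutate internal state.
        return set(self._adjacent.get(vertex, ()))


graph = UndirectedGraph()
graph.add_edge("alice", "bob")
graph.add_edge("bob", "carol")
```

Because edges are stored both ways, asking for the neighbours of either endpoint returns the other one.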

Directed, Multiple Relational Graph

Vertices

• Vertices can denote different types of objects.

Edges

• Edges can denote different types of relationships.

• All edges denote an asymmetric relationship.
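
A directed, multi-relational graph can be sketched by giving each vertex a kind and each edge a label and a direction. The names below (`Vertex`, `MultiRelationalGraph`, `relate`, `outgoing`, and the `KNOWS` / `LIKES` labels) are illustrative assumptions, not an API from Neo4j or from the slides.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Vertex:
    name: str
    kind: str  # vertices may be different types of object, e.g. "person" or "movie"


@dataclass
class MultiRelationalGraph:
    # Each edge is a (source, label, target) triple; the order encodes direction.
    edges: set = field(default_factory=set)

    def relate(self, source, label, target):
        self.edges.add((source, label, target))

    def outgoing(self, source, label):
        # Follow only edges of the given type that leave `source`.
        return {t for (s, l, t) in self.edges if s == source and l == label}


alice = Vertex("alice", "person")
bob = Vertex("bob", "person")
film = Vertex("The Matrix", "movie")

g = MultiRelationalGraph()
g.relate(alice, "KNOWS", bob)
g.relate(alice, "LIKES", film)
```

In this sketch `alice` KNOWS `bob`, but the reverse does not hold unless a second edge is added, which is what makes the relationship asymmetric.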

Benefits of Graph Database

• Express your domain as a Graph

− Domain Modeling Friendly

− No O/R mismatch

− Efficient storage of semi-structured information

− Schema-less

• Express Queries as Traversals

− Fast deep traversal instead of slow SQL queries that span many table joins
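
To make "queries as traversals" concrete, here is a minimal breadth-first traversal over a plain adjacency dictionary. It is a sketch under stated assumptions: the function `traverse`, the `knows` data, and the depth limit are hypothetical and stand in for what a graph database's traversal engine does natively. In a relational schema, each extra hop would typically be another self-join on a relationship table.

```python
from collections import deque


def traverse(adjacency, start, max_depth):
    """Breadth-first traversal; returns {vertex: depth} for vertices within max_depth hops."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if depths[current] == max_depth:
            continue  # do not expand beyond the requested depth
        for neighbour in adjacency.get(current, ()):
            if neighbour not in depths:
                depths[neighbour] = depths[current] + 1
                queue.append(neighbour)
    return depths


# Hypothetical "knows" relationships, stored as outgoing adjacency lists.
knows = {
    "alice": ["bob"],
    "bob": ["carol"],
    "carol": ["dave"],
    "dave": [],
}

# Friends of friends: vertices exactly two hops away from alice.
friends_of_friends = [v for v, d in traverse(knows, "alice", 2).items() if d == 2]
```

The cost of this traversal depends on the number of vertices and edges actually visited rather than on the total size of the data set, which is the intuition behind the "fast deep traversal" claim.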

Semi-structured information

NEO4J

Why Neo4j?

• Widely deployed graph DB worldwide

• ACID, persistent, embedded/server

• Robust: 24/7 production since 2003

• Mature: lots of production deployments

• Scalable: High Availability, Master failover

• Community: ecosystem of tools, bindings, frameworks

• Product: OSGi, Spatial, RDF, languages

• Available under AGPLv3 and as commercial product

• But the first one is free! For ALL use-cases

DEMO

BACKUP SLIDES

Create Node

Create Relationship & Traverse (1/2)

Traverse (2/2)

NeoEclipse
