Graph Database General Discussion Richard Kuo

Upload: richard-kuo

Post on 07-Aug-2015


Page 1: Graph Database

Graph Database

General Discussion

Richard Kuo

Page 2: Graph Database

References

Extracted from:

• http://neo4j.org/, Tobias Ivarsson, Emil Eifrem

• http://markorodriguez.com, Marko A. Rodriguez

• http://www.jayway.com/, Andreas Ronge

• etc.

4/12/2011 Creative Commons Attribution-Share Alike 3.0 2

Page 3: Graph Database

Outline

• NoSQL

– What, Why, Who

• Graph Database

– Graph Theory

– Benefit

• Neo4J

– Function & Feature

– Code & Demo


Page 4: Graph Database

Why? Not only SQL

Size

• Distributed data with accelerating growth of data

• Scalability & elasticity (at low cost!)

Connectedness

• Global linked data

Semi-structure

• Flexible schemas / semi-structured data

• Complex queries

Architecture

• Data mining and association toward more complex data modeling

• Transactions / strong consistency / integrity

• Geographic distribution (multiple datacenters)


Page 5: Graph Database


http://richard.cyganiak.de/2007/10/lod/lod-datasets_2010-09-22_colored.html

Page 6: Graph Database


Page 7: Graph Database


Page 8: Graph Database

NoSQL Taxonomy

Key-Value stores

• Simple K/V lookups (DHT)

Column stores

• Each key is associated with many attributes (columns)

• NoSQL column stores are actually hybrid row/column stores

• Different from “pure” relational column stores!

Document stores

• Store semi-structured documents (JSON)

• Map/Reduce based materialization, sorting, aggregation, etc.

Graph databases

• Scale, semi-structured data model

More …
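As a rough illustration of the families above, the following Python sketch stores the same small fact ("alice follows bob") in each shape. All names and sample values are hypothetical, and plain dicts and lists stand in for real stores; it shows the shape of the data only, not any product's API.

```python
# Hypothetical sample data; plain Python structures stand in for real stores.

# Key-value store: an opaque value looked up by a single key.
kv_store = {"user:alice": '{"name": "Alice", "follows": ["bob"]}'}

# Column store: each row key is associated with many named columns.
column_store = {
    "alice": {"profile:name": "Alice", "follows:bob": "1"},
    "bob": {"profile:name": "Bob"},
}

# Document store: one semi-structured, nested document per key.
document_store = {
    "alice": {"name": "Alice", "follows": [{"user": "bob"}]},
    "bob": {"name": "Bob", "follows": []},
}

# Graph database: nodes and relationships are both first-class records.
graph_nodes = {"alice": {"name": "Alice"}, "bob": {"name": "Bob"}}
graph_edges = [("alice", "FOLLOWS", "bob")]


def followers_of(user):
    """Answer "who follows user?" by scanning the relationship records."""
    return [src for (src, rel, dst) in graph_edges
            if rel == "FOLLOWS" and dst == user]


print(followers_of("bob"))  # ['alice']
```

In the first three shapes the "follows" fact lives inside alice's record, so asking who follows bob means inspecting every other record; in the graph shape the relationship is its own record and can be followed from either end.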


Page 9: Graph Database


Page 10: Graph Database

Graph Database Comparison

http://nosql.mypopescu.com/post/619181345/nosql-graph-database-matrix


Page 11: Graph Database

GRAPH DATABASE

Page 12: Graph Database

Why Graph Databases?

Data mining

• You can write algorithms that search for patterns and add AI on top

Highly critical environments

• You can apply Neo4j to high-load databases to optimize queries and reduce hardware costs

Engineering in biochemical components

• You can write algorithms that help study protein synthesis, for example

Discrete event simulation

• You can model patterns and behaviors and store everything in a graph database

Social graph

• Everything related to user “tastes” can be organized in a graph

Network architecture


Page 13: Graph Database

When should I use a Graph DB?

Massive data volumes

• Massively distributed architecture required to store the data

• Google, Amazon, Yahoo, Facebook – 10-100K servers

Extreme query workload

• Impossible to efficiently do joins at that scale with an RDBMS

Have a complex and evolving data model

• Big part of domain is expressed as relationships

• Schema flexibility (migration) is not trivial at large scale

• Schema changes can be gradually introduced with NoSQL

• Few mandatory and many optional attributes

• Have SQL queries that span many table joins

Many YES => maybe a Graph DB is a good choice


Page 14: Graph Database

When NOT to use a Graph DB

• Don't have a graph-related problem?

• Requirements don't change much?

• Easy to organize data into:

− Tables, Documents or Key-Value models?

• Few & well-defined relationships in the domain?

• Don't have SQL queries that span many table joins?

Many YES => maybe a Graph DB is not a good choice


Page 15: Graph Database

Undirected Graph

• dots (vertices) + lines (edges) = graphs.

• The Undirected Graph

Vertices

• All vertices denote the same type of object.

Edges

• All edges denote the same type of relationship.

• All edges denote a symmetric relationship.
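The undirected graph described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the class name `UndirectedGraph` and the vertex names are hypothetical, and an adjacency set per vertex is just one possible representation.

```python
from collections import defaultdict


class UndirectedGraph:
    """Vertices of one type joined by edges of one symmetric type."""

    def __init__(self):
        self._adj = defaultdict(set)  # vertex -> set of adjacent vertices

    def add_edge(self, u, v):
        # Recording both directions is what makes the relationship symmetric.
        self._adj[u].add(v)
        self._adj[v].add(u)

    def neighbors(self, v):
        return sorted(self._adj.get(v, ()))

    def has_edge(self, u, v):
        return v in self._adj.get(u, ())


g = UndirectedGraph()
g.add_edge("a", "b")
g.add_edge("b", "c")
print(g.neighbors("b"))      # ['a', 'c']
print(g.has_edge("b", "a"))  # True
```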


Page 16: Graph Database

Directed, Multi-Relational Graph

Vertices

• Vertices can denote different types of objects.

Edges

• Edges can denote different types of relationships.

• All edges denote an asymmetric relationship.
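A directed, multi-relational graph differs from the undirected case in two ways: vertices and edges carry a type, and each edge has a direction. The Python sketch below illustrates this under stated assumptions: the class `MultiRelationalGraph`, the types `Person` and `Product`, and the relationship names `KNOWS` and `USES` are all hypothetical, and the edge list is scanned linearly for simplicity.

```python
class MultiRelationalGraph:
    """Directed graph whose vertices and edges each carry a type."""

    def __init__(self):
        self.vertices = {}  # vertex id -> {"type": ..., plus other properties}
        self.edges = []     # (source id, relationship type, target id)

    def add_vertex(self, vid, vtype, **props):
        self.vertices[vid] = {"type": vtype, **props}

    def add_edge(self, src, rel_type, dst):
        # Only src -> dst is stored, so the relationship is asymmetric.
        self.edges.append((src, rel_type, dst))

    def out(self, src, rel_type):
        """Targets reached from src over outgoing edges of rel_type."""
        return [d for (s, r, d) in self.edges if s == src and r == rel_type]


g = MultiRelationalGraph()
g.add_vertex("alice", "Person", name="Alice")
g.add_vertex("bob", "Person", name="Bob")
g.add_vertex("neo4j", "Product", name="Neo4j")
g.add_edge("alice", "KNOWS", "bob")
g.add_edge("alice", "USES", "neo4j")
print(g.out("alice", "KNOWS"))  # ['bob']
print(g.out("bob", "KNOWS"))    # []
```

A symmetric relationship can still be modeled here by adding an edge in each direction.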


Page 17: Graph Database


Page 18: Graph Database

Benefits of Graph Database

• Express your domain as a Graph

− Domain Modeling Friendly

− No O/R mismatch

− Efficient storage of Semi-Structured Information

− Schema-less

• Express Queries as Traversals

− Fast deep traversal instead of slow SQL queries that span many table joins
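To make "queries as traversals" concrete, here is a small Python sketch of a depth-limited breadth-first traversal. The `KNOWS` data and the function `traverse` are hypothetical and only illustrate the idea; they do not reproduce Neo4j's traversal API. In a relational schema, each extra hop would typically be another self-join on a relationship table, whereas here each hop just follows stored adjacency.

```python
from collections import deque

# Hypothetical KNOWS relationships, stored as outgoing adjacency lists.
KNOWS = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave", "erin"],
    "dave": ["frank"],
    "erin": [],
    "frank": [],
}


def traverse(start, max_depth):
    """Breadth-first traversal; returns {vertex: depth} up to max_depth."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if depth[current] == max_depth:
            continue  # do not expand beyond the requested depth
        for neighbor in KNOWS.get(current, []):
            if neighbor not in depth:  # visit each vertex only once
                depth[neighbor] = depth[current] + 1
                queue.append(neighbor)
    del depth[start]  # report only the vertices reached, not the start
    return depth


print(traverse("alice", 2))
# {'bob': 1, 'carol': 1, 'dave': 2, 'erin': 2}
```

Raising `max_depth` to 3 also reaches `frank` without changing the query's structure, which is the practical appeal of traversals for deep relationship queries.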


Page 19: Graph Database


Page 20: Graph Database

Semi-structured information


Page 21: Graph Database

NEO4J

Page 22: Graph Database


Page 23: Graph Database

Why Neo4j?

• A widely deployed graph database

• ACID, persistent, embedded/server

• Robust: 24/7 production since 2003

• Mature: lots of production deployments

• Scalable: High Availability, Master failover

• Community: ecosystem of tools, bindings, frameworks

• Product: OSGi, Spatial, RDF, languages

• Available under AGPLv3 and as a commercial product

• But the first one is free! For ALL use-cases


Page 24: Graph Database

DEMO

Page 25: Graph Database

BACKUP SLIDES

Page 26: Graph Database

Create Node


Page 27: Graph Database

Create Relationship & Traverse (1/2)


Page 28: Graph Database

Traverse (2/2)


Page 29: Graph Database

NeoEclipse


Page 30: Graph Database


Page 31: Graph Database
