kickoff research project tu ilmenau


Page 1: Kickoff research project TU Ilmenau


Introduction to graph databases

Kickoff research project

TU-Ilmenau 11/2011

Henning Rauch



Agenda

● Introduction
● Graph databases
● Pros
● Cons
● Use cases
● Sones GraphDB


Introduction – /me

● Studied computer science at TU-Ilmenau
● 02/2009 – 11/2010: sones, core developer of the sones GraphDB
  ● GraphQL
  ● Type-Management
● 11/2010 – 11/2011: sones, Head of R&D
  ● Design of v2
  ● Refactoring of v1 → v2 (de-facto rewrite)
● 11/2011 – now: NoSQL freelancer & visiting lecturer


Introduction – Current situation

● Data-intensive, complex and distributed applications
  ● Semantic web
  ● Recommendation systems
  ● Social networks
● Similarities
  ● Strongly connected data in large amounts
  ● Complex structures
  ● Continuous growth in data volume
  ● Mix of structured and unstructured (schema-less) data


Introduction – Example

http://www.facebook.com/press/info.php?statistics


Introduction – Challenges

● Recursively connected information as a new design goal
● Simple management of structured, semi-structured and unstructured data
● Replication
● Versioning
● Efficient partitioning of data
● Graph-oriented operations


Graph databases – Data model

● Graph G(V, E)
  ● V – Vertices
  ● E – Edges

[Figure: example graph with the vertices Vertex0 and Vertex1]


Graph databases – Data model

[Figure: weighted graph of the cities Jena, Berlin and Stuttgart; edge labels 383 km, 633 km and 260 km]
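
The city graph above can be written down directly as G(V, E) with weighted edges. The following is a minimal Python sketch, not taken from the slides. The assignment of the three distances to city pairs is an assumption, since the layout of the figure is not recoverable here, and the names `V`, `E` and `neighbours` are purely illustrative.

```python
# Vertices V and weighted, undirected edges E of the city graph.
# Assumption: which distance belongs to which pair of cities.
V = {"Jena", "Berlin", "Stuttgart"}
E = {
    frozenset({"Jena", "Berlin"}): 260,       # km
    frozenset({"Jena", "Stuttgart"}): 383,    # km
    frozenset({"Berlin", "Stuttgart"}): 633,  # km
}


def neighbours(vertex):
    """Return {adjacent vertex: distance in km} for the given vertex."""
    result = {}
    for pair, km in E.items():
        if vertex in pair:
            (other,) = pair - {vertex}  # the other endpoint of the edge
            result[other] = km
    return result


print(neighbours("Jena"))
```

Using a `frozenset` as the edge key makes the edge undirected: the pair {Jena, Berlin} is the same key regardless of order.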


Graph databases – Property graph

● Extension of the graph data model
● Additional properties on vertices and edges
● The properties are key/value pairs (Age: 23)
● Keys are specified by the schema of the vertex type

[Figure: vertex (ID: 0, Name: Alice, Age: 23) and vertex (ID: 1, Name: Bob, Age: 42), connected by an edge CommunicatesWith (Encrypted: true, Method: RSA)]
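
To make the property graph idea concrete, here is a small Python sketch of the Alice/Bob example. It is an illustration only: the classes `Vertex` and `Edge` and the helper `link` are hypothetical names, the edge direction (Alice to Bob) is an assumption, and the schema check on property keys mentioned above is left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    id: int
    properties: dict = field(default_factory=dict)  # key/value pairs
    out_edges: list = field(default_factory=list)   # edges starting here


@dataclass
class Edge:
    label: str
    source: Vertex
    target: Vertex
    properties: dict = field(default_factory=dict)  # edges carry properties too


def link(source, label, target, **properties):
    """Create an edge with properties and attach it to its source vertex."""
    edge = Edge(label, source, target, properties)
    source.out_edges.append(edge)
    return edge


alice = Vertex(0, {"Name": "Alice", "Age": 23})
bob = Vertex(1, {"Name": "Bob", "Age": 42})
# Assumption: the edge points from Alice to Bob.
link(alice, "CommunicatesWith", bob, Encrypted=True, Method="RSA")

for edge in alice.out_edges:
    print(edge.source.properties["Name"], edge.label,
          edge.target.properties["Name"], edge.properties)
```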


Graph databases – Property graph

[Figure: property graph with the vertices Alice (ID: 0, Age: 23), Bob (ID: 1, Age: 42), Carol (ID: 3, Age: 18), TU Ilmenau and Uni Stuttgart; edges CommunicatesWith (Encrypted: true, Method: RSA), CommunicatesWith (Encrypted: false), RelativeOf (Degree: Sister), StudiesIn (Since: 2007), StudiesIn (Since: 2004) and StudiesIn (Since: 2010)]


Graph databases – Definition

"A graph database is a database that uses graph structures with nodes, edges, and properties to represent and store information. General graph databases that can store any graph are distinct from specialized graph databases such as triplestores and network databases."

http://en.wikipedia.org/wiki/Graph_database


Pros – Data model

● Explicit data model
● Direct mapping of real-world network structures


Pros – Efficient graph traversal

● The most important operation of graph databases
● Recursive search for vertices/edges with certain properties
● Finding paths in graphs
● GraphDB is able to do ~80M vertex-traversals per second
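
The two traversal tasks named above, searching for a vertex with certain properties and finding a path to it, can be sketched together. The following Python example uses a plain breadth-first search (an iterative stand-in for the recursive search, chosen here for brevity); it is not how sones GraphDB implements traversal. The `knows` graph and the ages of Dave and Eve are hypothetical sample data.

```python
from collections import deque


def find_path(graph, start, matches):
    """Breadth-first traversal from `start`.

    Returns the path with the fewest edges to the first vertex for which
    `matches(vertex)` is true, or None if no such vertex is reachable.
    """
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        vertex = path[-1]
        if matches(vertex):
            return path
        for neighbour in graph.get(vertex, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


# Hypothetical "knows" graph as adjacency lists, plus one property per vertex.
knows = {
    "Alice": ["Bob", "Carol"],
    "Bob": ["Dave"],
    "Carol": ["Dave"],
    "Dave": ["Eve"],
}
ages = {"Alice": 23, "Bob": 42, "Carol": 18, "Dave": 31, "Eve": 65}

# Path from Alice to the nearest person older than 60.
print(find_path(knows, "Alice", lambda v: ages[v] > 60))
```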


Pros – Index-free adjacency

● Relations (edges) are modeled directly on the vertex → no need for an additional mapping
● No need for a global index for relations
● Data locality → adjacent vertices can be persisted "close together" (efficient storage)
● → The vertex-traversal performance is independent of the size of the graph
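
The difference between a global relation structure and index-free adjacency can be illustrated with a toy comparison. This Python sketch is only a conceptual model under simplifying assumptions (in-memory objects, no persistence); the names `edge_table`, `neighbours_via_table` and `neighbours_via_vertex` are illustrative.

```python
class Vertex:
    def __init__(self, name):
        self.name = name
        self.neighbours = []  # direct references to adjacent vertices


# Variant 1: relations kept in one global structure, like a join table.
# Finding the neighbours of a vertex scans (or index-searches) all edges,
# so the cost depends on the size of the whole graph.
edge_table = [("Alice", "Bob"), ("Bob", "Carol"), ("Alice", "Carol")]


def neighbours_via_table(name):
    return [dst for src, dst in edge_table if src == name]


# Variant 2: index-free adjacency. Each vertex holds its own edges, so a
# traversal step only touches the vertex at hand and its neighbours.
alice, bob, carol = Vertex("Alice"), Vertex("Bob"), Vertex("Carol")
alice.neighbours += [bob, carol]
bob.neighbours.append(carol)


def neighbours_via_vertex(vertex):
    return [n.name for n in vertex.neighbours]


print(neighbours_via_table("Alice"))
print(neighbours_via_vertex(alice))
```

Both variants return the same neighbours; only the work needed per traversal step differs.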


Cons

● In general, importing data is slower than in an RDBMS
● Relatively new technology
● Lack of standards


Use cases

● Rating of websites in search engines – PageRank
● Who-knows-who in social networks – Shortest path
● Recommendation systems – Bipartite matching

● ...
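
As an illustration of the first use case, here is a compact Python sketch of PageRank computed by power iteration. It is a textbook-style simplification, not a search-engine implementation; the damping factor 0.85, the iteration count and the three-page `web` graph are assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank for a dict {page: [pages it links to]}."""
    pages = set(links) | {p for targets in links.values() for p in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives the same base share from random jumps.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            # A page without outgoing links spreads its rank over all pages.
            targets = links.get(page) or pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical three-page web: A links to B and C, B to C, C to A.
web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(web)
print(sorted(ranks, key=ranks.get, reverse=True))
```

The ranks always sum to 1, so they can be read as the share of time a random surfer spends on each page.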


Sones GraphDB – Overview

● http://www.sones.com
● Object-oriented graph database
● Property-Hypergraph data model
● Written in C# (97%)
● C# embedded/remote API
● GraphQL
● Non-persistent OSE and proprietary persistent GraphFS


Sones GraphDB – Architecture


Sones GraphDB – Architecture


Sones GraphDB – GraphQL

// define Vertex Type
CREATE VERTEX User
ADD ATTRIBUTES (String Name, SET<User> Friends)
INDICES (Name)

// add vertices Alice and Bob
INSERT INTO User VALUES (Name = "Alice", Age = 23)
INSERT INTO User VALUES (Name = "Bob", Age = 42)

// add edges between Alice and Bob
LINK User(Name = 'Alice') VIA Friends TO User(Name = 'Bob')
LINK User(Name = 'Bob') VIA Friends TO User(Name = 'Alice')


Sones GraphDB – HowTo run it

● Windows: Install Visual Studio (Professional or higher) or MonoDevelop
● Linux: Install mono-complete and MonoDevelop
● Download the source from https://github.com/cosh/sones
● Open "CoreDeveloper.sln"
● Have phun


Sones GraphDB – Documentation

● Blog: http://developers.sones.de/

● Wiki: http://developers.sones.de/wiki/doku.php

● Forum: http://forum.sones.de/

● BugTracking: http://jira.sones.de/

● The fastest way to information: /me :)


Graph visualization

● http://gephi.org/screenshots/

● http://mbostock.github.com/d3/

● http://www.fluidops.net/information-workbench/