Graph Algebra with Pattern Matching and Aggregation Support 1

Upload: pauline-cain

Post on 23-Dec-2015


TRANSCRIPT

Page 1: Graph Algebra with Pattern Matching and Aggregation Support 1

Graph Algebra with Pattern Matching and Aggregation Support

Page 2: Graph Algebra with Pattern Matching and Aggregation Support 1

Nowadays Graph

Variety of Sources
◦ Scientific Studies
◦ Business Activities
◦ Social Needs
◦ Internet

Data are often
◦ Large Scale
◦ Highly Linked
◦ Schema-less

Page 3: Graph Algebra with Pattern Matching and Aggregation Support 1

Managing Graph Data

Primary Role of Database
◦ Persistent store
◦ Efficient Query

RDBMS
◦ Storage Model: vertices and edges as tuples
◦ Query: links are followed by joins

Graph Database
◦ Storage Model: graphs
◦ Query: path traversal
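As a rough illustration of the two query styles on this slide, the sketch below answers the same two-hop question once by self-joining an edge table and once by traversing an adjacency list. The data and the function names are made up for the example; no particular database is modelled.

```python
# Hypothetical toy data: three "follows" links stored as relational tuples.
edge_table = [("alice", "bob"), ("bob", "carol"), ("carol", "dave")]


def two_hop_by_join(edges, start):
    """Relational style: self-join the edge table on first.dst == second.src."""
    return sorted(
        second_dst
        for first_src, first_dst in edges
        for second_src, second_dst in edges
        if first_src == start and first_dst == second_src
    )


# Graph style: build an adjacency list once, then follow links by lookup.
adjacency = {}
for src, dst in edge_table:
    adjacency.setdefault(src, []).append(dst)


def two_hop_by_traversal(adj, start):
    """Graph style: walk two outgoing links from the start vertex."""
    return sorted(
        hop2
        for hop1 in adj.get(start, [])
        for hop2 in adj.get(hop1, [])
    )


join_result = two_hop_by_join(edge_table, "alice")
traversal_result = two_hop_by_traversal(adjacency, "alice")
```

Both calls return the same vertices; the join version scans the whole edge table for every hop, while the traversal only touches the neighbours it visits.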

Page 4: Graph Algebra with Pattern Matching and Aggregation Support 1

Why not RDBMS?

Schema Issue
◦ Every data item inserted may have a different schema (Web Graph)
◦ Hard to represent semi-structured info

Scalability Issues
◦ ACID properties vs. CAP theorem

Query performance
◦ Difficult to optimize join-intensive queries

Page 5: Graph Algebra with Pattern Matching and Aggregation Support 1

Graph Databases and Query Languages

No Universal Languages !!!

Page 6: Graph Algebra with Pattern Matching and Aggregation Support 1

No Universal Language Like SQL?

No commonly agreed algebra

Relational Algebra?
◦ Expressive, and proven effective over time
◦ NOT suitable for GRAPH

Graph Algebra?
◦ Still at a preliminary stage

Page 7: Graph Algebra with Pattern Matching and Aggregation Support 1

Issues with Relational Algebra (RA)

Defined on Tuples or Sets of Tuples
◦ Mismatch with graph nature
◦ Operators lose semantics
  What is Union, Intersection, Join in GRAPH?
◦ I/O type? Tables, not GRAPH

Domain centric, not Data centric
◦ Doesn't anticipate out-of-order data
◦ Treats tuples as independent
  Not aware of the links among tuples

Queries written using RA are verbose and complex

Page 8: Graph Algebra with Pattern Matching and Aggregation Support 1

Advantage of Graph Algebra

An algebra itself is a query language
◦ Easy to work out a language with strong theoretical support

Evaluate expressiveness of given languages
◦ Justify when to use what: Gremlin, Cypher etc.

Query Optimization
◦ Operator order EQUALS execution plan
◦ Algebraic Equivalence IMPLIES query optimization

Page 9: Graph Algebra with Pattern Matching and Aggregation Support 1

Advantage of Graph Algebra

Separation of Query and System:
◦ One can write queries on any system as long as a common algebra is supported.
◦ Knowing RA, one can write SQL, PL/SQL, MS/SQL on MySQL, Oracle, SQLServer

Integrate new operators into the database:
◦ Current graph database systems do not support newly developed queries: Graph OLAP, Graph Cube, Graph Aggregation etc.
◦ A proper algebra can incorporate these operators

Page 10: Graph Algebra with Pattern Matching and Aggregation Support 1

Existing Works on Graph Algebra

Graph QL [1]
◦ A graph-based algebra; operators are based on graphs
◦ Selection
◦ Join – not properly defined
◦ Template

VAQL [2]
◦ Focused on visualization
◦ Selection
◦ Aggregation – restricted
◦ Visualization

Selection is restricted to isomorphism
Aggregation is not defined over edges
No algebraic equivalence

[1] He, Huahai, and Ambuj K. Singh. "Graphs-at-a-time: query language and access methods for graph databases." Proceedings of the 2008 ACM SIGMOD international conference on Management of data. ACM, 2008.
[2] Shaverdian, Anna A., et al. "A graph algebra for scalable visual analytics." Computer Graphics and Applications, IEEE 32.4 (2012): 26-33.

Page 11: Graph Algebra with Pattern Matching and Aggregation Support 1

What do we want from a Graph Algebra?

Universal
◦ Independent of graph types: Directed vs. Undirected, Simple vs. Hyper, Homogeneous vs. Heterogeneous

Expressive
◦ Able to answer typical graph queries: Pattern match, Reachability, Path finding etc.
◦ Covers Relational Algebra (RA)
  This ensures that a graph database can handle relational data as well

Scale
◦ Able to manage data at scale
  Supports queries to summarize and aggregate data

Page 12: Graph Algebra with Pattern Matching and Aggregation Support 1

Extended Algebra – Graph Model

The data model is an attributed graph with five components:

A vertex set; each vertex has a unique ID
An edge set
Attributes for each vertex
Attributes for each edge
◦ Edges carry an identifier as well
◦ In a simple graph, an edge can be represented by its endpoints
Information for the graph as a whole
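One possible in-memory shape for such an attributed graph is sketched below. The class name, field names, and sample data are assumptions made for illustration, and the vertex set and vertex attributes are folded into a single dictionary for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class AttributedGraph:
    """Hypothetical container for the attributed graph model."""

    # vertex id -> attribute dict (vertex set and vertex attributes together)
    vertices: dict = field(default_factory=dict)
    # edge id -> (source vertex id, target vertex id)
    edges: dict = field(default_factory=dict)
    # edge id -> attribute dict
    edge_attrs: dict = field(default_factory=dict)
    # information about the graph as a whole
    graph_attrs: dict = field(default_factory=dict)

    def add_vertex(self, vid, **attrs):
        self.vertices[vid] = dict(attrs)

    def add_edge(self, eid, src, dst, **attrs):
        # Edges have their own identifier, and both endpoints must exist.
        if src not in self.vertices or dst not in self.vertices:
            raise KeyError("both endpoints must be existing vertices")
        self.edges[eid] = (src, dst)
        self.edge_attrs[eid] = dict(attrs)


g = AttributedGraph(graph_attrs={"name": "toy social network"})
g.add_vertex("v1", name="Ann", birthday=1990)
g.add_vertex("v2", name="Bob", birthday=1985)
g.add_edge("e1", "v1", "v2", label="friend")
```

Keying edges by their own identifier allows parallel edges between the same endpoints, which an endpoint pair alone could not distinguish.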

Page 13: Graph Algebra with Pattern Matching and Aggregation Support 1

Extended Algebra – Operators

Projection

Restriction

Unification

Pattern Matching

Aggregation

Page 14: Graph Algebra with Pattern Matching and Aggregation Support 1

Operator: Projection

Purpose:
◦ Select the data the user is interested in from the base graph

Syntax:

The arguments are the attribute lists for vertex, edge and graph

The result is a new graph whose attributes are trimmed to those lists
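A minimal sketch of projection on a dictionary-based graph follows. The graph layout, the function name `project`, and the sample data are assumptions for illustration only.

```python
def project(graph, vertex_attrs, edge_attrs, graph_attrs):
    """Return a new graph that keeps only the listed attributes."""

    def trim(attrs, keep):
        return {key: value for key, value in attrs.items() if key in keep}

    return {
        "vertices": {
            vid: trim(attrs, vertex_attrs)
            for vid, attrs in graph["vertices"].items()
        },
        "edges": {
            eid: (src, dst, trim(attrs, edge_attrs))
            for eid, (src, dst, attrs) in graph["edges"].items()
        },
        "attrs": trim(graph["attrs"], graph_attrs),
    }


# Hypothetical base graph: edges are (source, target, attributes) triples.
base = {
    "vertices": {
        "v1": {"name": "Ann", "birthday": 1990},
        "v2": {"name": "Bob", "birthday": 1985},
    },
    "edges": {"e1": ("v1", "v2", {"label": "friend", "since": 2010})},
    "attrs": {"source": "toy", "owner": "demo"},
}

projected = project(base, ["name"], ["label"], [])
```

The structure of the graph is untouched; only the attribute dictionaries shrink, and the base graph is left unmodified.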

Page 15: Graph Algebra with Pattern Matching and Aggregation Support 1

Operator: Restriction

Purpose:
◦ Restrict the base graph by attribute values

Syntax:

Vertex restriction: select all the vertices (and their induced edges) that match the vertex predicate

Edge restriction: select all the edges (and their endpoints) that match the edge predicate

Graph restriction: select the graphs in which every vertex matches the vertex predicate, every edge matches the edge predicate, and the graph itself matches the graph predicate
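The three restriction variants can be sketched as follows, assuming a dictionary-based graph whose edges are (source, target, attributes) triples. All names and data are hypothetical.

```python
def restrict_vertices(graph, predicate):
    """Keep the vertices matching the predicate, plus the edges they induce."""
    kept = {vid: a for vid, a in graph["vertices"].items() if predicate(a)}
    edges = {
        eid: e for eid, e in graph["edges"].items()
        if e[0] in kept and e[1] in kept
    }
    return {"vertices": kept, "edges": edges}


def restrict_edges(graph, predicate):
    """Keep the edges matching the predicate, plus their endpoints."""
    edges = {eid: e for eid, e in graph["edges"].items() if predicate(e[2])}
    endpoints = {v for src, dst, _ in edges.values() for v in (src, dst)}
    vertices = {vid: a for vid, a in graph["vertices"].items() if vid in endpoints}
    return {"vertices": vertices, "edges": edges}


def restrict_graphs(graphs, vertex_pred, edge_pred, graph_pred):
    """Keep the graphs whose every vertex, every edge, and graph attributes match."""
    return [
        g for g in graphs
        if all(vertex_pred(a) for a in g["vertices"].values())
        and all(edge_pred(e[2]) for e in g["edges"].values())
        and graph_pred(g.get("attrs", {}))
    ]


base = {
    "vertices": {
        "v1": {"name": "Ann", "birthday": 1990},
        "v2": {"name": "Bob", "birthday": 1985},
        "v3": {"name": "Cat", "birthday": 1992},
    },
    "edges": {
        "e1": ("v1", "v2", {"label": "friend"}),
        "e2": ("v1", "v3", {"label": "friend"}),
        "e3": ("v2", "v3", {"label": "comment"}),
    },
}

young = restrict_vertices(base, lambda a: a["birthday"] > 1989)
comments = restrict_edges(base, lambda a: a["label"] == "comment")
selected = restrict_graphs(
    [base, young],
    lambda a: a["birthday"] > 1989,
    lambda a: True,
    lambda a: True,
)
```

Vertex and edge restriction return a subgraph of one graph, while graph restriction filters a collection of graphs and returns each surviving graph whole.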

Page 16: Graph Algebra with Pattern Matching and Aggregation Support 1

Operator: Unification

Purpose:
◦ Concatenate graphs

Syntax:

Vertex unification: unify vertices with identical ids

Edge unification: add an edge between two vertices that match a predicate P

Attribute unification: create a virtual vertex for each distinct value of the given attribute
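The sketch below illustrates the vertex and edge variants only; attribute unification is left out. It assumes edges are plain (source, target) pairs and that the predicate P receives the attribute dictionaries of one vertex from each input graph. These choices, like the names and data, are assumptions made for the example.

```python
def unify_vertices(g1, g2):
    """Union two graphs; vertices with the same id collapse into one."""
    vertices = {vid: dict(attrs) for vid, attrs in g1["vertices"].items()}
    for vid, attrs in g2["vertices"].items():
        vertices.setdefault(vid, {}).update(attrs)
    edges = set(g1["edges"]) | set(g2["edges"])
    return {"vertices": vertices, "edges": edges}


def unify_edges(g1, g2, predicate):
    """Union two graphs and add an edge (u, v) wherever predicate(u, v) holds."""
    result = unify_vertices(g1, g2)
    for u, u_attrs in g1["vertices"].items():
        for v, v_attrs in g2["vertices"].items():
            if u != v and predicate(u_attrs, v_attrs):
                result["edges"].add((u, v))
    return result


left = {
    "vertices": {
        "v1": {"name": "Ann", "city": "Oslo"},
        "v2": {"name": "Bob", "city": "Rome"},
    },
    "edges": {("v1", "v2")},
}
right = {
    "vertices": {
        "v2": {"age": 30},
        "v3": {"name": "Cat", "city": "Oslo"},
    },
    "edges": {("v2", "v3")},
}

merged = unify_vertices(left, right)
linked = unify_edges(left, right, lambda a, b: a.get("city") == b.get("city"))
```

Here `v2` appears in both inputs and ends up as a single vertex carrying the attributes of both, and `linked` gains one extra edge between the two vertices that share a city.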

Page 17: Graph Algebra with Pattern Matching and Aggregation Support 1

Operator: Unification

P(v1,v1) and P(v4,v5) are true

Page 18: Graph Algebra with Pattern Matching and Aggregation Support 1

Operator: Unification

Page 19: Graph Algebra with Pattern Matching and Aggregation Support 1

Operator: Pattern Matching

Purpose:
◦ Find subgraphs of the base graph that match a given pattern

Syntax:

The pattern is itself a graph. The definition comes from [1]

One form returns all the matching graphs

The other form returns an abstractive matching, in which only the vertices that appear in the pattern are returned

[1] Fan, Wenfei, et al. "Adding regular expressions to graph reachability and pattern queries." Data Engineering (ICDE), 2011 IEEE 27th International Conference on. IEEE, 2011.
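To make the idea of matching a pattern graph concrete, here is a brute-force sketch. It is a deliberate simplification: it requires exact edge labels and an injective vertex mapping, and it does not implement the regular-expression semantics of [1]. The function name and data are hypothetical.

```python
from itertools import permutations


def match_pattern(graph_edges, pattern_edges):
    """Return every injective mapping of pattern vertices to graph vertices
    under which each labelled pattern edge exists in the graph."""
    graph_vertices = sorted({v for s, _, d in graph_edges for v in (s, d)})
    pattern_vertices = sorted({v for s, _, d in pattern_edges for v in (s, d)})
    edge_set = set(graph_edges)
    matches = []
    # Try every ordered choice of distinct graph vertices (exponential cost).
    for candidate in permutations(graph_vertices, len(pattern_vertices)):
        mapping = dict(zip(pattern_vertices, candidate))
        if all(
            (mapping[s], label, mapping[d]) in edge_set
            for s, label, d in pattern_edges
        ):
            matches.append(mapping)
    return matches


# Edges are (source, label, target) triples.
base_edges = [
    ("ann", "friend", "bob"),
    ("ann", "friend", "cat"),
    ("bob", "comment", "cat"),
    ("cat", "comment", "bob"),
    ("ann", "friend", "dan"),
]
# Pattern: x has two friends y and z, and y comments on z.
pattern = [("x", "friend", "y"), ("x", "friend", "z"), ("y", "comment", "z")]

matches = match_pattern(base_edges, pattern)
```

Each result maps the pattern vertices to concrete vertices of the base graph. Enumerating permutations is only workable for tiny graphs; it is used here because it keeps the matching condition easy to read.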

Page 20: Graph Algebra with Pattern Matching and Aggregation Support 1

Operator: Pattern Matching

Page 21: Graph Algebra with Pattern Matching and Aggregation Support 1

Operator: Aggregation

Purpose:
◦ To summarize a given graph

Syntax:

Graph aggregation: every vertex is supplied to a vertex aggregation and every edge set is supplied to an edge aggregation

Vertex aggregation: given a set of vertices, group them by the given grouping criterion

Edge aggregation: given a set of edges, group them by the given grouping criterion
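A small sketch of what such a summary could look like: vertices are grouped by one attribute, and edges are grouped by the groups of their endpoints, with a count per group. Counting is just one choice of aggregate, and the function name, grouping key, and data are assumptions for the example.

```python
from collections import Counter


def aggregate(vertices, edges, group_key):
    """Collapse vertices that share group_key and count vertices per group
    and edges per (source group, target group) pair."""
    group_of = {vid: attrs[group_key] for vid, attrs in vertices.items()}
    vertex_counts = Counter(group_of.values())
    edge_counts = Counter((group_of[src], group_of[dst]) for src, dst in edges)
    return dict(vertex_counts), dict(edge_counts)


people = {
    "v1": {"city": "Oslo"},
    "v2": {"city": "Oslo"},
    "v3": {"city": "Rome"},
    "v4": {"city": "Rome"},
}
friendships = [("v1", "v2"), ("v1", "v3"), ("v2", "v3"), ("v3", "v4")]

summary_vertices, summary_edges = aggregate(people, friendships, "city")
```

The result is itself a small graph: one summary vertex per city, and one weighted edge per pair of cities that are linked in the base graph.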

Page 22: Graph Algebra with Pattern Matching and Aggregation Support 1

Operator: Aggregation

Page 23: Graph Algebra with Pattern Matching and Aggregation Support 1

Expressiveness

This set of operators is more expressive than Relational Algebra and Graph QL

It can represent many graph queries
◦ Reachability
◦ Graph Cube computation
◦ I-OLAP and T-OLAP

Page 24: Graph Algebra with Pattern Matching and Aggregation Support 1

Algebra Equivalence

When operators are chained up, they can form a query execution plan

Find the network induced by the person whose friends comment on each other’s posts with birthday greater than 1989. Output those names as a graph

$$\oplus_v\Bigl(\pi\bigl(\sigma_v(\Gamma(RM, G),\ \mathit{birthday} > 1989),\ v.\mathit{name}\bigr)\Bigr)$$

[Figure: a pattern with friend and Comment edges is matched against the Base Graph; the stages shown are Matched, Restriction (birthday > 1989), projection on v.name, and V-Unification, giving the Result]
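The idea that a chain of operators is already an execution plan can be sketched with plain function composition. The example covers only the restriction and projection stages of the query above; the input graph stands in for the output of the pattern-matching stage, and every name here is hypothetical.

```python
from functools import reduce


def restrict_vertices(predicate):
    """Build an operator that keeps matching vertices and their induced edges."""
    def op(graph):
        kept = {vid: a for vid, a in graph["vertices"].items() if predicate(a)}
        edges = [(s, d) for s, d in graph["edges"] if s in kept and d in kept]
        return {"vertices": kept, "edges": edges}
    return op


def project_vertices(keep):
    """Build an operator that trims vertex attributes to the listed names."""
    def op(graph):
        vertices = {
            vid: {k: a[k] for k in keep if k in a}
            for vid, a in graph["vertices"].items()
        }
        return {"vertices": vertices, "edges": list(graph["edges"])}
    return op


def run_plan(graph, plan):
    """Apply the operators in order; the list itself is the execution plan."""
    return reduce(lambda g, op: op(g), plan, graph)


# Stand-in for the subgraph returned by the pattern-matching stage.
matched = {
    "vertices": {
        "v1": {"name": "Ann", "birthday": 1990},
        "v2": {"name": "Bob", "birthday": 1985},
        "v3": {"name": "Cat", "birthday": 1992},
    },
    "edges": [("v1", "v2"), ("v1", "v3"), ("v2", "v3")],
}

plan = [
    restrict_vertices(lambda a: a["birthday"] > 1989),
    project_vertices(["name"]),
]
result = run_plan(matched, plan)
```

Reordering the list changes the plan. With this data, projecting first would drop `birthday` before the restriction could test it, which is why an optimizer needs equivalence rules to know which reorderings are safe.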

Page 25: Graph Algebra with Pattern Matching and Aggregation Support 1

Algebra Equivalence

To generate multiple execution plans for the same query, we need theoretical support:

Identity Equivalence:

◦ An operator can be represented by other operators (p is a common attribute predicate)

◦ D(P) decomposes a pattern P into edges

...
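As an illustration of how an equivalence yields alternative plans, the sketch below checks one rule that holds for vertex-induced restriction: applying two predicates in either order gives the same graph as applying their conjunction once. This rule is chosen for the example and is not claimed to be one of the rules listed on the slide; names and data are hypothetical.

```python
def restrict_vertices(graph, predicate):
    """Keep the vertices matching the predicate, plus the edges they induce."""
    kept = {vid: a for vid, a in graph["vertices"].items() if predicate(a)}
    edges = {(s, d) for s, d in graph["edges"] if s in kept and d in kept}
    return {"vertices": kept, "edges": edges}


def born_after_1989(attrs):
    return attrs["birthday"] > 1989


def lives_in_oslo(attrs):
    return attrs["city"] == "Oslo"


base = {
    "vertices": {
        "v1": {"name": "Ann", "birthday": 1990, "city": "Oslo"},
        "v2": {"name": "Bob", "birthday": 1985, "city": "Oslo"},
        "v3": {"name": "Cat", "birthday": 1992, "city": "Rome"},
        "v4": {"name": "Dan", "birthday": 1995, "city": "Oslo"},
    },
    "edges": {("v1", "v2"), ("v1", "v4"), ("v3", "v4"), ("v2", "v4")},
}

# Three plans for the same query.
plan_a = restrict_vertices(restrict_vertices(base, born_after_1989), lives_in_oslo)
plan_b = restrict_vertices(restrict_vertices(base, lives_in_oslo), born_after_1989)
plan_c = restrict_vertices(
    base, lambda a: born_after_1989(a) and lives_in_oslo(a)
)

plans_agree = plan_a == plan_b == plan_c
```

Because the three plans are interchangeable, an optimizer is free to run the more selective predicate first or to fuse both into a single pass.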

Page 26: Graph Algebra with Pattern Matching and Aggregation Support 1

Conclusion

Graph Algebra plays an important role in graph database development

We make one step forward by proposing a Graph Algebra which:
◦ extends existing algebraic work with
  Regular pattern matching
  Aggregation
◦ is expressive and well-defined
◦ contains equivalence rules for further query optimization

Page 27: Graph Algebra with Pattern Matching and Aggregation Support 1

Thank you!