The Graph Traversal Programming Pattern

The Graph Traversal Programming Pattern Marko A. Rodriguez Graph Systems Architect http://markorodriguez.com http://twitter.com/twarko WindyCityDB - Chicago, Illinois – June 26, 2010 June 25, 2010


DESCRIPTION

A graph is a structure composed of a set of vertices (i.e. nodes, dots) connected to one another by a set of edges (i.e. links, lines). The concept of a graph has been around since the late 19th century; however, only in recent decades has there been a strong resurgence in the development of both graph theories and applications. In applied computing, since the late 1960s, the interlinked table structure of the relational database has been the predominant information storage and retrieval paradigm. With the growth of graph/network-based data and the need to efficiently process such data, new data management systems have been developed. In contrast to the index-intensive, set-theoretic operations of relational databases, graph databases make use of index-free traversals. This presentation will discuss the graph traversal programming pattern and its application to problem-solving with graph databases.

TRANSCRIPT

Page 1: The Graph Traversal Programming Pattern

The Graph Traversal Programming Pattern

Marko A. Rodriguez

Graph Systems Architect

http://markorodriguez.com

http://twitter.com/twarko

WindyCityDB - Chicago, Illinois – June 26, 2010

June 25, 2010

Page 2: The Graph Traversal Programming Pattern

Abstract

A graph is a structure composed of a set of vertices (i.e. nodes, dots) connected to one another by a set of edges (i.e. links, lines). The concept of a graph has been around since the late 19th century; however, only in recent decades has there been a strong resurgence in the development of both graph theories and applications. In applied computing, since the late 1960s, the interlinked table structure of the relational database has been the predominant information storage and retrieval paradigm. With the growth of graph/network-based data and the need to efficiently process such data, new data management systems have been developed. In contrast to the index-intensive, set-theoretic operations of relational databases, graph databases make use of index-free traversals. This presentation will discuss the graph traversal programming pattern and its application to problem-solving with graph databases.

Page 3: The Graph Traversal Programming Pattern

Outline

• Graph Structures

• Graph Databases

• Graph Traversals

  ◦ Artificial Example

  ◦ Real-World Examples

Page 5: The Graph Traversal Programming Pattern

Dots and Lines

There are dots and there are lines.

Let's call them vertices and edges, respectively.

Page 6: The Graph Traversal Programming Pattern

Constructions from Dots and Lines

It's possible to arrange the dots and lines into various configurations.

Let's call such configurations graphs.

Page 7: The Graph Traversal Programming Pattern

Dots and Lines Make a Graph

Page 8: The Graph Traversal Programming Pattern

The Undirected Graph

1. Vertices

• All vertices denote the same type of object.

2. Edges

• All edges denote the same type of relationship.

• All edges denote a symmetric relationship.

Page 9: The Graph Traversal Programming Pattern

Denoting an Undirected Structure in the Real World

Collaborator graph is an undirected graph. Road graph is an undirected graph.

Page 10: The Graph Traversal Programming Pattern

A Line with a Triangle

Dots and lines are boring.

Let's add a triangle to one side of each line.

However, let's call a triangle-tipped line a directed edge.

Page 11: The Graph Traversal Programming Pattern

The Directed Graph

1. Vertices

• All vertices denote the same type of object.

2. Edges

• All edges denote the same type of relationship.

• All edges denote an asymmetric relationship.

Page 12: The Graph Traversal Programming Pattern

Denoting a Directed Structure in the Real World

Twitter follow graph is a directed graph. Web href-citation graph is a directed graph.

Page 13: The Graph Traversal Programming Pattern

Single Relational Structures

• Without a way to demarcate edges, all edges have the same meaning/type. Such structures are called single-relational graphs.

• Single-relational graphs are perhaps the most common graph type in graph theory and network science.

Page 14: The Graph Traversal Programming Pattern

How Do You Model a World with Multiple Structures?

[Figure: one diagram mixing several kinds of things and relationships. Edges are labeled lives_in, is, follows, cites, and created; the roads I-25 and I-40 also appear.]

Page 15: The Graph Traversal Programming Pattern

The Limitations of the Single-Relational Graph

• A single-relational graph is only able to express a single type of vertex (e.g. person, city, user, webpage).¹

• A single-relational graph is only able to express a single type of edge (e.g. collaborator, road, follows, citation).²

• For modelers, these are very limiting graph types.

¹ This is not completely true. All n-partite single-relational graphs allow for the division of the vertex set into n subsets, where $V = \bigcup_{i}^{n} A_i$ and $A_i \cap A_j = \emptyset$ for $i \neq j$. Thus, it's possible to implicitly type the vertices.

² This is not completely true. There exists an injective, information-preserving function that maps any multi-relational graph to a single-relational graph, where edge types are denoted by topological structures. Rodriguez, M.A., “Mapping Semantic Networks to Undirected Networks,” International Journal of Applied Mathematics and Computer Sciences, 5(1), pp. 39–42, 2009. [http://arxiv.org/abs/0804.0277]

Page 16: The Graph Traversal Programming Pattern

The Gains of the Multi-Relational Graph

• A multi-relational graph allows for the explicit typing of edges (e.g. “follows,” “cites,” etc.).

• By labeling edges, edges can have different meanings and vertices can have different types.

  ◦ follows : user → user

  ◦ created : user → webpage

  ◦ cites : webpage → webpage

  ◦ ...

Page 17: The Graph Traversal Programming Pattern

Increasing Expressivity with Multi-Relational Graphs

[Figure: a multi-relational graph of users and webpages whose edges are labeled follows, created, and cites.]

Page 18: The Graph Traversal Programming Pattern

The Flexibility of the Property Graph

• A property graph extends a multi-relational graph by allowing for both vertices and edges to maintain a key/value property map.

• These properties are useful for expressing non-relational data (i.e. data not needing to be graphed).

• This allows for the further refinement of the meaning of an edge.

  ◦ Peter Neubauer created the Neo4j webpage on 2007/10.

[Figure: a vertex with name=peterneubauer is connected by a created edge with date=2007/10 to a vertex with name=neo4j and views=56781.]
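A property graph like the one on this slide can be sketched in a few lines of Python. This is only an illustration, not how Neo4j or any other graph database stores its data: the class names `Vertex` and `Edge`, the helper `add_edge`, and the integer ids are assumptions made for the example. The naming follows the deck's convention that an edge leaves its out vertex and arrives at its in vertex.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Vertex:
    id: int
    properties: dict = field(default_factory=dict)  # key/value property map
    out_edges: list = field(default_factory=list)   # edges leaving this vertex
    in_edges: list = field(default_factory=list)    # edges arriving at this vertex


@dataclass(eq=False)
class Edge:
    label: str
    out_vertex: Vertex  # the vertex the edge leaves
    in_vertex: Vertex   # the vertex the edge arrives at
    properties: dict = field(default_factory=dict)


def add_edge(out_vertex, label, in_vertex, **properties):
    """Create a labeled edge and register it on both of its endpoints."""
    edge = Edge(label, out_vertex, in_vertex, dict(properties))
    out_vertex.out_edges.append(edge)
    in_vertex.in_edges.append(edge)
    return edge


# "Peter Neubauer created the Neo4j webpage on 2007/10."
peter = Vertex(1, {"name": "peterneubauer"})
neo4j = Vertex(2, {"name": "neo4j", "views": 56781})
created = add_edge(peter, "created", neo4j, date="2007/10")

print(created.label, created.properties["date"])  # created 2007/10
```

The edge's own property map is what refines its meaning: the date belongs to the act of creating, not to Peter or to the webpage.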

Page 19: The Graph Traversal Programming Pattern

Increasing Expressivity with Property Graphs

[Figure: the multi-relational graph of users and webpages (edges labeled follows, created, and cites), now with properties attached. Properties shown include name=twarko, age=30; name=ahzf; name=graph_blog, views=1000; name=tenderlove, gender=male; name=neo4j, views=56781; name=peterneubauer; page_rank=0.023; and date=2007/10 on a created edge.]

Page 20: The Graph Traversal Programming Pattern

Property Graph Instance Schema/Ontology

• user vertex: name=<string>, age=<integer>, gender=<string>

• webpage vertex: name=<string>, views=<integer>, page_rank=<double>

• created edge (user → webpage): date=<string>

• follows edge (user → user)

• cites edge (webpage → webpage)

There is no standard convention, but in general, specify the types of vertices, edges, and the properties they may contain. Look into the world of RDFS and OWL for more rigorous, expressive specifications of graph-based schemas.

Page 21: The Graph Traversal Programming Pattern

Property Graphs Can Model Other Graph Types

[Figure: a diagram relating the property graph to other graph types: weighted graph, semantic graph, multi-graph, labeled graph, RDF graph, directed graph, undirected graph, and simple graph. Each arrow is annotated with the operation that turns one type into another: add weight attribute; remove attributes; remove edge labels; make labels URIs; remove directionality; remove loops, directionality, and multiple edges; or no op.]

NOTE: Given that a property graph is a binary edge graph, it is difficult to model an n-ary edge graph (i.e. a hypergraph).

Page 22: The Graph Traversal Programming Pattern

Outline

• Graph Structures

• Graph Databases

• Graph Traversals

  ◦ Artificial Example

  ◦ Real-World Examples

Page 23: The Graph Traversal Programming Pattern

Persisting a Graph Data Structure

• A graph is a relatively simple data structure. It can be seen as the most fundamental data structure: something is related to something else.

• Almost every database can model a graph.³

³ For the sake of simplicity, the following examples are with respect to a directed, single-relational graph. However, note that property graphs can be easily modeled by such databases as well.

Page 24: The Graph Traversal Programming Pattern

Representing a Graph in a Relational Database

outV | inV
-----+----
A    | B
A    | C
C    | D
D    | A

[Figure: the directed graph with edges A → B, A → C, C → D, and D → A.]
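To make the edge table concrete, here is a minimal sketch using Python's built-in `sqlite3` module (SQLite stands in for any relational database; the table name `edge` is an assumption, while the columns `outV` and `inV` come from the slide). One step from A is a single lookup; two steps already require joining the table with itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edge (outV TEXT, inV TEXT)")
conn.executemany(
    "INSERT INTO edge VALUES (?, ?)",
    [("A", "B"), ("A", "C"), ("C", "D"), ("D", "A")],
)

# One step from A: a single lookup on the edge table.
one_step = [row[0] for row in conn.execute(
    "SELECT inV FROM edge WHERE outV = ? ORDER BY inV", ("A",))]

# Two steps from A: the edge table joined with itself.
two_steps = [row[0] for row in conn.execute(
    """
    SELECT e2.inV
    FROM edge AS e1
    JOIN edge AS e2 ON e1.inV = e2.outV
    WHERE e1.outV = ?
    ORDER BY e2.inV
    """, ("A",))]

print(one_step)   # ['B', 'C']
print(two_steps)  # ['D']
```

Every additional step adds another self-join, which is the cost the later Neo4j vs. MySQL experiment measures.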

Page 25: The Graph Traversal Programming Pattern

Representing a Graph in a JSON Database

{
  "A": { "out": ["B", "C"], "in": ["D"] },
  "B": { "in": ["A"] },
  "C": { "out": ["D"], "in": ["A"] },
  "D": { "out": ["A"], "in": ["C"] }
}

[Figure: the directed graph with edges A → B, A → C, C → D, and D → A.]
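A document like this can be traversed by repeatedly looking up each vertex's `out` list. The sketch below uses Python's standard `json` module; the helper name `step` is an assumption for illustration, and vertex B is given an explicit empty `out` list for readability.

```python
import json

document = json.loads("""
{
  "A": {"out": ["B", "C"], "in": ["D"]},
  "B": {"out": [], "in": ["A"]},
  "C": {"out": ["D"], "in": ["A"]},
  "D": {"out": ["A"], "in": ["C"]}
}
""")


def step(doc, vertex_ids):
    """Follow every outgoing edge of every vertex in vertex_ids."""
    return [target for v in vertex_ids for target in doc[v].get("out", [])]


one_step = step(document, ["A"])
two_steps = step(document, one_step)

print(one_step)   # ['B', 'C']
print(two_steps)  # ['D']
```

Note that each hop still resolves a vertex id through the document's key lookup; the adjacency is stored as ids, not as direct references.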

Page 26: The Graph Traversal Programming Pattern

Representing a Graph in an XML Database

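<graphml>
  <graph edgedefault="directed">
    <node id="A" />
    <node id="B" />
    <node id="C" />
    <node id="D" />
    <edge source="A" target="B" />
    <edge source="A" target="C" />
    <edge source="C" target="D" />
    <edge source="D" target="A" />
  </graph>
</graphml>

[Figure: the directed graph with edges A → B, A → C, C → D, and D → A.]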

Page 27: The Graph Traversal Programming Pattern

Defining a Graph Database

“If any database can represent a graph, then what is a graph database?”

Page 28: The Graph Traversal Programming Pattern

Defining a Graph Database

A graph database is any storage system that provides index-free adjacency.⁴ ⁵

⁴ There is no “official” definition of what makes a database a graph database. The one provided is my definition. However, hopefully the following argument will convince you that this is a necessary definition.

⁵ There is adjacency between the elements of an index, but if the index is not the primary data structure of concern (to the developer), then there is indirect/implicit adjacency, not direct/explicit adjacency. A graph database exposes the graph as an explicit data structure (not an implicit data structure).

Page 29: The Graph Traversal Programming Pattern

Defining a Graph Database

• Every element (i.e. vertex or edge) has a direct pointer to its adjacent element.

• No O(log2(n)) index lookup is required to determine which vertex is adjacent to which other vertex.

• If the graph is connected, the graph as a whole is a single atomic data structure.

Page 30: The Graph Traversal Programming Pattern

Defining a Graph Database by Example

[Figure: a toy graph with vertices A, B, C, D, and E, and a gremlin (the “stuntman”) that will walk it.]

Page 31: The Graph Traversal Programming Pattern

Graph Databases and Index-Free Adjacency

[Figure: the toy graph with the gremlin standing on vertex A.]

• Our gremlin is at vertex A.

• In a graph database, vertex A has direct references to its adjacent vertices.

• The cost to move from A to B and C does not depend on the size of the graph. It depends only on the number of edges emanating from vertex A (local).

Page 32: The Graph Traversal Programming Pattern

Graph Databases and Index-Free Adjacency

[Figure: the toy graph with vertices A to E, labeled “The Graph (explicit)”.]

Page 34: The Graph Traversal Programming Pattern

Non-Graph Databases and Index-Based Adjacency

[Figure: the toy graph next to an index that lists, for each vertex, its adjacent vertices (e.g. A → B, C).]

• Our gremlin is at vertex A.

• In a non-graph database, the gremlin needs to look at an index to determine what is adjacent to A.

• There is an O(log2(n)) time cost to move to B and C. It depends on the total number of vertices and edges in the database (global).
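The difference between the two kinds of adjacency can be sketched in Python. Everything here is illustrative: the toy adjacency (A → B, C; B → E; C → D, E) is an assumption loosely based on the figure, a sorted list searched with `bisect` stands in for a tree-structured database index, and the `Vertex` class stands in for a graph database's direct references.

```python
from bisect import bisect_left

# Assumed toy graph for illustration.
ADJACENCY = {"A": ["B", "C"], "B": ["E"], "C": ["D", "E"], "D": [], "E": []}

# Index-based adjacency: one global, sorted (source, target) edge table.
EDGE_TABLE = sorted((s, t) for s, targets in ADJACENCY.items() for t in targets)


def neighbors_by_index(vertex):
    """Binary-search the global table: O(log n) in the total number of edges."""
    i = bisect_left(EDGE_TABLE, (vertex, ""))
    result = []
    while i < len(EDGE_TABLE) and EDGE_TABLE[i][0] == vertex:
        result.append(EDGE_TABLE[i][1])
        i += 1
    return result


# Index-free adjacency: each vertex holds direct references to its neighbors.
class Vertex:
    def __init__(self, name):
        self.name = name
        self.out = []  # direct references to adjacent Vertex objects


VERTICES = {name: Vertex(name) for name in ADJACENCY}
for source, targets in ADJACENCY.items():
    VERTICES[source].out = [VERTICES[t] for t in targets]


def neighbors_by_reference(vertex):
    """No global lookup: the cost depends only on this vertex's own degree."""
    return [neighbor.name for neighbor in vertex.out]


print(neighbors_by_index("A"))                  # ['B', 'C']
print(neighbors_by_reference(VERTICES["A"]))    # ['B', 'C']
```

Both functions return the same neighbors. What differs is that the first must search a structure whose size grows with the whole database, while the second only touches the current vertex.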

Page 35: The Graph Traversal Programming Pattern

Non-Graph Databases and Index-Based Adjacency

[Figure: the toy graph next to its adjacency index. The index is labeled “The Index (explicit)” and the graph is labeled “The Graph (implicit)”.]

Page 37: The Graph Traversal Programming Pattern

Index-Free Adjacency

• While any database can implicitly represent a graph, only a graph database makes the graph structure explicit.

• In a graph database, each vertex serves as a “mini index” of its adjacent elements.⁶

• Thus, as the graph grows in size, the cost of a local step remains the same.⁷

⁶ Each vertex can be interpreted as a “parent node” in an index with its children being its adjacent elements. In this sense, traversing a graph is analogous in many ways to traversing an index, albeit the graph is not an acyclic connected graph (tree).

⁷ A graph, in many ways, is like a distributed index.

Page 38: The Graph Traversal Programming Pattern

Graph Databases Make Use of Indices

[Figure: the graph (“The Graph”) alongside an index of its vertices A to E (“Index of Vertices (by id)”).]

• There is more to the graph than the explicit graph structure.

• Indices index the vertices, by their properties (e.g. ids).

Page 39: The Graph Traversal Programming Pattern

Graph Databases and Endogenous Indices

• Many indices are trees.⁸

• A tree is a type of constrained graph.⁹

• You can represent a tree with a graph.¹⁰

⁸ Even an “index” that is simply an O(n) container can be represented as a graph (e.g. linked list).

⁹ A tree is an acyclic connected graph with each vertex having at most one parent.

¹⁰ This follows as a consequence of a tree being a graph.

Page 40: The Graph Traversal Programming Pattern

Graph Databases and Endogenous Indices

• Graph databases allow you to explicitly model indices endogenous to your domain model. Your indices and domain model are one atomic entity: a graph.¹¹

• This has benefits in designing special-purpose index structures for your data.

  ◦ Think about all the numerous types of indices in the geo-spatial community.¹²

  ◦ Think about all the indices that you have yet to think about.

¹¹ Originally, Neo4j used itself as its own indexing system before moving to Lucene.

¹² Craig Taverner explores the use of graph databases in GIS-based applications.

Page 41: The Graph Traversal Programming Pattern

Graph Databases and Endogenous Indices

[Figure: the property graph of users and webpages (edges labeled follows, created, and cites) with three index structures attached to it: a name property index, a views property index, and a gender property index.]

Page 42: The Graph Traversal Programming Pattern

Graph Databases and Endogenous Indices

[Figure: the same property graph together with its name, views, and gender property indices, enclosed as a whole and labeled “The Graph Dataset”.]

Page 43: The Graph Traversal Programming Pattern

Outline

• Graph Structures

• Graph Databases

• Graph Traversals

  ◦ Artificial Example

  ◦ Real-World Examples

Page 44: The Graph Traversal Programming Pattern

Graph Traversals as the Foundation

• Question: Once I have my data represented as a graph, what can I do with it?

• Answer: You can traverse over the graph to solve problems.

Page 45: The Graph Traversal Programming Pattern

Outline

• Graph Structures

• Graph Databases

• Graph Traversals

  ◦ Artificial Example

  ◦ Real-World Examples

Page 46: The Graph Traversal Programming Pattern

Graph Database vs. Relational Database

• While any database can represent a graph, it takes time to make what is implicit explicit.

• The graph database represents an explicit graph.

• The experiment that follows demonstrates the problem with using lots of table JOINs to accomplish the effect of a graph traversal.¹³

¹³ Though not presented in this lecture, similar results were seen with JSON document databases.

Page 47: The Graph Traversal Programming Pattern

Neo4j vs. MySQL – Generating a Large Graph

• Generated a 1 million vertex/4 million edge graph with “natural statistics.”¹⁴

• Loaded the graph into both Neo4j and MySQL in order to empirically evaluate the effect of index-free adjacency.

¹⁴ What is diagrammed is a small subset of this 1 million vertex graph.

Page 48: The Graph Traversal Programming Pattern

Neo4j vs. MySQL – The Experiment

• For each run of the experiment, a traverser (gremlin) is placed on a single vertex.

• For each step, the traverser moves to its adjacent vertices.

  ◦ Neo4j (graph database): the adjacent vertices are provided by the current vertex.¹⁵

  ◦ MySQL (relational database): the adjacent vertices are provided by a table JOIN.

• For the experiment, this process goes up to 5 steps.

¹⁵ You can think of a graph traversal, in a graph database, as a local neighborhood JOIN.
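The stepping process described above can be sketched as follows. This is not the benchmark code from the experiment (which is not shown in the slides); the function name `traverse` and the four-vertex graph are assumptions for illustration.

```python
def traverse(adjacency, root, steps):
    """Return the traversers' positions after `steps` hops from `root`.

    A vertex appears once for every distinct path that reaches it, which is
    why the amount of work can grow quickly with the number of steps.
    """
    frontier = [root]
    for _ in range(steps):
        frontier = [nxt for v in frontier for nxt in adjacency.get(v, [])]
    return frontier


# Small directed graph: A -> B, A -> C, C -> D, D -> A.
adjacency = {"A": ["B", "C"], "B": [], "C": ["D"], "D": ["A"]}

for n in range(1, 6):
    print(n, traverse(adjacency, "A", n))
# 1 ['B', 'C']
# 2 ['D']
# 3 ['A']
# 4 ['B', 'C']
# 5 ['D']
```

In a relational database each iteration of the loop corresponds to one more JOIN; in a graph database it corresponds to following the references held by the current vertices.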

Page 49: The Graph Traversal Programming Pattern

Neo4j vs. MySQL – The Experiment (Zoom-In Subset)

Page 50: The Graph Traversal Programming Pattern

Neo4j vs. MySQL – The Experiment (Step 1)

Page 51: The Graph Traversal Programming Pattern

Neo4j vs. MySQL – The Experiment (Step 2)

Page 52: The Graph Traversal Programming Pattern

Neo4j vs. MySQL – The Experiment (Step 3)

Page 53: The Graph Traversal Programming Pattern

Neo4j vs. MySQL – The Experiment (Step 4)

Page 54: The Graph Traversal Programming Pattern

Neo4j vs. MySQL – The Experiment (Step 5)

Page 55: The Graph Traversal Programming Pattern

Neo4j vs. MySQL – The Results

[Figure: total running time (ms) for traversals of length n, MySQL vs. Neo4j, for traversal lengths 1 to 4 (time axis from 0 to 100,000 ms). Times are averaged over the 250 most dense vertices used as the root of the traversal. Annotations on the plot mark Neo4j as 4.5x, 1.9x, 2.6x, and 2.3x faster.]

• At step 5, Neo4j completed it in 14 minutes.

• At step 5, MySQL was still running after 2 hours (process stopped).

Page 57: The Graph Traversal Programming Pattern

Why Use a Graph Database? – Data Locality

• If the solution to your problem can be represented as a local process within a larger global data structure, then a graph database may be the optimal solution for your problem.

• If the solution to your problem can be represented as being with respect to a set of root elements, then a graph database may be the optimal solution to your problem.

• If the solution to your problem does not require a global analysis of your data, then a graph database may be the optimal solution to your problem.

Page 58: The Graph Traversal Programming Pattern

Why Use a Graph Database? – Data Locality

Page 59: The Graph Traversal Programming Pattern

Outline

• Graph Structures

• Graph Databases

• Graph Traversals

  ◦ Artificial Example

  ◦ Real-World Examples

Page 60: The Graph Traversal Programming Pattern

Some Graph Traversal Use Cases

• Local searches: “What is in the neighborhood around A?”¹⁶

• Local recommendations: “Given A, what should A include in their neighborhood?”¹⁷

• Local ranks: “Given A, how would you rank B relative to A?”¹⁸

¹⁶ A can be an individual vertex or a set of vertices. This set is known as the root vertex set.

¹⁷ Recommendation can be seen as trying to increase the connectivity of the graph by recommending vertices (e.g. items) for another vertex (e.g. person) to extend an edge to (e.g. purchased).

¹⁸ In this presentation, there will be no examples provided for this use case. However, note that searching, ranking, and recommendation are presented in the WindyCityDB OpenLab Graph Database Tutorial. Other terms for local rank are “rank with priors” or “relative rank.”

Page 61: The Graph Traversal Programming Pattern

Graph Traversals with Gremlin Programming Language

Gremlin: G = (V, E)

http://gremlin.tinkerpop.com

The examples to follow are as basic as possible to get the idea across. Note that numerous variations to the themes presented can be created. Such variations are driven by the richness of the underlying graph data set and the desired speed of evaluation.

Page 62: The Graph Traversal Programming Pattern

Graph Traversals with Gremlin Programming Language

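[Figure: a small property graph with vertices 1, 3, and 4, one knows edge, and two created edges. One vertex carries the properties name=peter and age=37.]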

Page 63: The Graph Traversal Programming Pattern

Graph Traversals with Gremlin Programming Language

[Figure: the same small graph with its parts annotated: the id of vertex 4, the out edges of vertex 1, an edge label, the in edges of vertex 3, the in vertex and out vertex of an edge, and the properties of vertex 4 (name=peter, age=37).]

Page 64: The Graph Traversal Programming Pattern

Graph Traversal in Property Graphs

[Figure: a property graph with five people (name=emil, peter, tobias, alex, johan) and five webpages (name=A, B, C, D, E), connected by four follows edges and six created edges.]

Red vertices are people and blue vertices are webpages.

Page 65: The Graph Traversal Programming Pattern

Local Search: “Who are the followers of Emil Eifrem?”

[Figure: the example graph of people and webpages, with the traversal steps numbered and the resulting vertices highlighted.]

Traversal: ./outE[@label='follows']/inV

Result: name=alex, name=johan

Page 66: The Graph Traversal Programming Pattern

Local Search: “What webpages did Emil’s followers create?”

[Figure: the example graph of people and webpages, with the traversal steps numbered and the resulting vertices highlighted.]

Traversal: ./outE[@label='follows']/inV/outE[@label='created']/inV

Result: name=A, name=B

Page 67: The Graph Traversal Programming Pattern

Local Search: “What webpages did Emil’s followers’ followers create?”

[Figure: the example graph of people and webpages, with the traversal steps numbered and the resulting vertices highlighted.]

Traversal: ./outE[@label='follows']/inV/outE[@label='follows']/inV/outE[@label='created']/inV

Result: name=D, name=C, name=E, name=E (webpage E is reached by two paths)
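The three local searches can be re-expressed in Python to show what each path expression does. This is a sketch, not Gremlin: the helper `out` plays the role of `outE[@label=...]/inV`, and the edge list is an assumption. The slides' results pin down that emil's follows edges lead to alex and johan, that those two created A and B, and that the next hop reaches the creators of C, D, and E (twice); which person is attached to which page is guessed here.

```python
from collections import defaultdict

# Assumed edge list, consistent with the results shown on the slides.
EDGES = [
    ("emil", "follows", "alex"), ("emil", "follows", "johan"),
    ("alex", "follows", "peter"), ("johan", "follows", "tobias"),
    ("alex", "created", "A"), ("johan", "created", "B"),
    ("peter", "created", "C"), ("peter", "created", "E"),
    ("tobias", "created", "D"), ("tobias", "created", "E"),
]

OUT = defaultdict(list)  # (vertex, label) -> vertices at the head of the edge
for tail, label, head in EDGES:
    OUT[(tail, label)].append(head)


def out(vertices, label):
    """In the spirit of outE[@label=label]/inV, applied to every vertex."""
    return [head for v in vertices for head in OUT[(v, label)]]


followed = out(["emil"], "follows")
pages = out(followed, "created")
pages_two_hops = out(out(followed, "follows"), "created")

print(followed)        # ['alex', 'johan']
print(pages)           # ['A', 'B']
print(pages_two_hops)  # ['C', 'E', 'D', 'E']
```

Each extra path segment in the Gremlin expression corresponds to one more call to `out`. Webpage E appears twice because two different paths reach it.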

Page 68: The Graph Traversal Programming Pattern

Local Recommendation: “If you like webpage E, you may also like...”

[Figure: the example graph of people and webpages, with the traversal steps numbered and the resulting vertices highlighted.]

Traversal: ./inE[@label='created']/outV/outE[@label='created']/inV[g:except($_)]

Result: name=D, name=C

Assumption: if you like a webpage by X, you will like others that they have created.

Page 69: The Graph Traversal Programming Pattern

Local Recommendation: “If you like Johan, you may also like...”

[Figure: the example graph of people and webpages, with the traversal steps numbered and the resulting vertex highlighted.]

Traversal: ./inE[@label='follows']/outV/outE[@label='follows']/inV

Result: name=alex

Assumption: if many people follow the same two people, then those two may be similar.
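Both recommendations follow the same shape: walk one edge label backwards from the root, then forwards again. A Python sketch of that shape is below. The function name `recommend` and the edge list are assumptions (the slides do not say which person created which page), and the function always drops the root from its result, which is what `g:except($_)` does in the webpage example.

```python
from collections import defaultdict

# Assumed edge list, consistent with the results shown on the slides.
EDGES = [
    ("emil", "follows", "alex"), ("emil", "follows", "johan"),
    ("alex", "follows", "peter"), ("johan", "follows", "tobias"),
    ("alex", "created", "A"), ("johan", "created", "B"),
    ("peter", "created", "C"), ("peter", "created", "E"),
    ("tobias", "created", "D"), ("tobias", "created", "E"),
]

OUT = defaultdict(list)  # (tail vertex, label) -> head vertices
IN = defaultdict(list)   # (head vertex, label) -> tail vertices
for tail, label, head in EDGES:
    OUT[(tail, label)].append(head)
    IN[(head, label)].append(tail)


def recommend(root, label):
    """Walk `label` edges backwards from root, then forwards, dropping root."""
    sources = IN[(root, label)]
    return [head for s in sources for head in OUT[(s, label)] if head != root]


print(recommend("E", "created"))      # ['C', 'D']
print(recommend("johan", "follows"))  # ['alex']
```

Changing only the edge label changes the meaning of the recommendation: "created" yields other pages by the same authors, while "follows" yields other people followed by the same followers.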

Page 70: The Graph Traversal Programming Pattern

Assortment of Other Specific Graph Traversal Use Cases

• Missing friends: Find all the friends of person A. Then find all the friends of the friends of person A that are not also person A’s friends.¹⁹

  ◦ ./outE[@label='friend']/inV[g:assign('$x')]/outE[@label='friend']/inV[g:except($x)]

• Collaborative filtering: Find all the items that person A likes. Then find all the people that like those same items. Then find which items those people like that are not already liked by person A.²⁰

  ◦ ./outE[@label='likes']/inV[g:assign('$x')]/inE[@label='likes']/outV/outE[@label='likes']/inV[g:except($x)]

¹⁹ This algorithm is based on the notion of trying to close “open triangles” in the friendship graph. If many of person A’s friends are friends with person B, then it is likely that A and B know each other.

²⁰ This is the most concise representation of collaborative filtering. There are numerous modifications to this general theme that can be taken advantage of to alter the recommendations.
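The two traversals can be sketched in Python as follows. The data set, the helper names, and the use of a `Counter` to rank results by the number of paths that reach them are assumptions added for illustration; the Gremlin expressions on the slide return the raw results only.

```python
from collections import Counter, defaultdict

# Hypothetical data: people A, B, C, D and items i1..i4.
EDGES = [
    ("A", "friend", "B"), ("A", "friend", "C"),
    ("B", "friend", "C"), ("B", "friend", "D"), ("C", "friend", "D"),
    ("A", "likes", "i1"), ("A", "likes", "i2"),
    ("B", "likes", "i1"), ("B", "likes", "i3"),
    ("C", "likes", "i2"), ("C", "likes", "i3"), ("C", "likes", "i4"),
]

OUT, IN = defaultdict(list), defaultdict(list)
for tail, label, head in EDGES:
    OUT[(tail, label)].append(head)
    IN[(head, label)].append(tail)


def out(vertices, label):
    """outE[@label=label]/inV for every vertex in `vertices`."""
    return [h for v in vertices for h in OUT[(v, label)]]


def in_(vertices, label):
    """inE[@label=label]/outV for every vertex in `vertices`."""
    return [t for v in vertices for t in IN[(v, label)]]


def missing_friends(person):
    friends = out([person], "friend")    # g:assign('$x')
    candidates = out(friends, "friend")  # friends of friends
    return Counter(c for c in candidates if c not in friends)  # g:except($x)


def collaborative_filtering(person):
    liked = out([person], "likes")       # g:assign('$x')
    peers = in_(liked, "likes")          # people who like the same items
    items = out(peers, "likes")          # everything those people like
    return Counter(i for i in items if i not in liked)  # g:except($x)


print(missing_friends("A").most_common())          # [('D', 2)]
print(collaborative_filtering("A").most_common())  # [('i3', 2), ('i4', 1)]
```

The counts show how many paths lead to each result: D is a friend of two of A's friends, and item i3 is reached through two people who share A's tastes.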

Page 71: The Graph Traversal Programming Pattern

Assortment of Other Specific Graph Traversal Use Cases

• Question expert identification: Find all the tags associated with question A. For all those tags, find all answers (for any question) that are tagged by those tags. For those answers, find who created those answers.²¹

  ◦ ./inE[@label='tag']/outV[@type='answer']/inE[@label='created']/outV

• Similar tags: Find all the things that tag A has been used as a tag for. For all those things, determine what else they have been tagged with.²²

  ◦ ./inE[@label='tag']/outV/outE[@label='tag']/inV[g:except($_)]

²¹ If two resources share a “bundle” of resources in common, then they are similar.

²² This is the notion of “co-association” and can be generalized to find the similarity of two resources based upon their co-association through a third resource (e.g. co-author, co-usage, co-download, etc.). The third resource and the edge labels traversed determine the meaning of the association.

Page 72: The Graph Traversal Programming Pattern

Some Tips on Graph Traversals

• Ranking, scoring, recommendation, searching, etc. are all variations on the basic theme of defining abstract paths through a graph and realizing instances of those paths through traversal.

• The type of path taken determines the meaning (i.e. semantics) of the rank, score, recommendation, search, etc.

• Given the data locality aspect of graph databases, many of these traversals run in real-time (< 100ms).

Page 73: The Graph Traversal Programming Pattern

Property Graph Algorithms in General

• There is a general framework for mapping all the standard single-relational graph analysis algorithms over to the property graph domain.²³

  ◦ Geodesics: shortest path, eccentricity, radius, diameter, closeness, betweenness, etc.²⁴

  ◦ Spectral: random walks, page rank, spreading activation, priors, etc.²⁵

  ◦ Assortativity: scalar or categorical.

  ◦ ... any graph algorithm in general.

• All able to be represented in Gremlin.

²³ Rodriguez M.A., Shinavier, J., “Exposing Multi-Relational Networks to Single-Relational Network Analysis Algorithms,” Journal of Informetrics, 4(1), pp. 29–41, 2009. [http://arxiv.org/abs/0806.2274]

²⁴ Rodriguez, M.A., Watkins, J.H., “Grammar-Based Geodesics in Semantic Networks,” Knowledge-Based Systems, in press, 2010.

²⁵ Rodriguez, M.A., “Grammar-Based Random Walkers in Semantic Networks,” Knowledge-Based Systems, 21(7), pp. 727–739, 2008. [http://arxiv.org/abs/0803.4355]
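One simple way to see how a single-relational algorithm carries over to a labeled graph is to restrict which edge labels the algorithm may follow. The sketch below does this for shortest path using breadth-first search. It is an illustration only and is not the formal framework from the cited papers; the function name `shortest_path` and the five-edge graph are assumptions.

```python
from collections import deque

# Hypothetical labeled edges for illustration.
EDGES = [
    ("emil", "follows", "alex"),
    ("alex", "follows", "peter"),
    ("emil", "created", "A"),
    ("peter", "created", "C"),
    ("A", "cites", "C"),
]


def shortest_path(edges, source, target, allowed_labels):
    """Breadth-first search that only follows edges whose label is allowed."""
    adjacency = {}
    for tail, label, head in edges:
        if label in allowed_labels:
            adjacency.setdefault(tail, []).append(head)
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in adjacency.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # target not reachable through the allowed labels


print(shortest_path(EDGES, "emil", "C", {"created", "cites"}))
# ['emil', 'A', 'C']
print(shortest_path(EDGES, "emil", "C", {"follows", "created"}))
# ['emil', 'alex', 'peter', 'C']
```

The same pair of vertices has different shortest paths depending on the labels allowed, which is the sense in which the path type determines the meaning of the result.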

Page 74: The Graph Traversal Programming Pattern

Conclusion

• Graph databases are efficient with respect to local data analysis.

• Locality is defined by direct referent structures.

• Frame all solutions to problems as a traversal over local regions of the graph.

  ◦ This is the Graph Traversal Pattern.

Page 75: The Graph Traversal Programming Pattern

Acknowledgements

• Pavel Yaskevich for advancing Gremlin. Pavel is currently writing a new compiler that will make Gremlin faster and more memory efficient.

• Peter Neubauer for his collaboration on many of the ideas discussed in this presentation.

• The rest of the Neo4j team (Emil, Johan, Mattias, Alex, Tobias, David, Anders (1 and 2)) for their comments.

• WindyCityDB organizers for their support.

• AT&T Interactive (Aaron, Rand, Charlie, and the rest of the Buzz team) for their support.

Page 76: The Graph Traversal Programming Pattern

References to Related Work

• Rodriguez, M.A., Neubauer, P., “Constructions from Dots and Lines,” Bulletin of the American Society of Information Science and Technology, June 2010. [http://arxiv.org/abs/1006.2361]

• Rodriguez, M.A., Neubauer, P., “The Graph Traversal Pattern,” AT&Ti and NeoTechnology Technical Report, April 2010. [http://arxiv.org/abs/1004.1001]

• Neo4j: A Graph Database [http://neo4j.org]

• TinkerPop [http://tinkerpop.com]

  ◦ Blueprints: Data Models and their Implementations [http://blueprints.tinkerpop.com]

  ◦ Pipes: A Data Flow Framework using Process Graphs [http://pipes.tinkerpop.com]

  ◦ Gremlin: A Graph-Based Programming Language [http://gremlin.tinkerpop.com]

  ◦ Rexster: A Graph-Based Ranking Engine [http://rexster.tinkerpop.com]

    ∗ Wreckster: A Ruby API for Rexster [http://github.com/tenderlove/wreckster]