getting started with graph databases
DESCRIPTION
Exploiting graph database to discover value in complex Big Data. Lunch will be provided while you discover the power of graph database technology for your Big Data needs. Bring your charged laptops to this upcoming meetup to walk through how to get started with InfiniteGraph. Nick Quinn, Senior Software Developer for InfiniteGraph, will walk you through the initial installation of InfiniteGraph and the HelloGraph sample to get you started with your graph database. Download InfiniteGraph for free here: http://www.objectivity.com/downloads Once we get through the tutorial, there will be time for Q&A and more hands on support from additional members of the InfiniteGraph technical team. If you have a complex Big Data problem and are looking to discover deeper connections and relationships within your data to create next-generation applications for social networks, healthcare, finance, telecom and security this is a must attend event! Get started quickly with our enterprise proven, massively scalable and distributed graph database!TRANSCRIPT
www.Objectivity.com
Getting Started with Graph Databases
Nick Quinn
Principal Engineer, InfiniteGraph
04/10/2023 1
What are we talking about today?
•Big Data and Databases•What is a Graph Database?•What is InfiniteGraph?•Demo and Q&A – Hands On
– Installing InfiniteGraph• https://download.infinitegraph.com
– FlightPlan Sample• http://wiki.infinitegraph.com “Download
Examples” FlightPlanSample.zip
Images Courtesy of IMDB (www.imdb.com)
04/10/2023
NoSQL 2013
• Developers are embracing choice• More than Dynamo and BigTable clones• Incorporates specialized data models like
Document, Object and Graph • 100+ projects and products (Wikipedia)• ~250 Meetup.com Groups (5 meetups this week!)• NoSQL fans consume 12% of the worlds Beer & Pizza
04/10/2023 4
NoSQL and BigData – What’s the Connection ?
• Making big data “appear” smaller• Partitioning, replication & distributed query• Storage model optimizations• Consistency trade offs• Simplified query models• Dynamic views
big data is a loosely-defined term used to describe data sets
so large and complex that they become awkward to work with using on-hand database management tools (wikipedia)
04/10/2023 5
The Specialist !
• Everyone specializes – Doctors, Lawyers, Bankers, Developers
• Why was data so normalized for so long !• NoSQL is all about the data specialist• Specializing in…
– Distribution / deployment– Physical data storage– Logical data model– Query mechanism
04/10/2023 6
Polyglot NoSQL Architectures
Distributed Data Processing Platform Document
GraphDatabase
RDBMS
Partitioned Distributed DB (often Document / KV)
Users
Appl
icati
ons
External / Legacy Data Tr
ansf
orm
ation
\ M
DM
Business
04/10/2023 7
NoSQL Landscape - How it all stacks up!
Data Model Performance Scalability Flexibility Complexity Functionality
Key–value Stores high high high none variable
(none)
Column Store high high moderate low minimal
Document Store high variable high low variable (low)
Graph Database variable variable high high graph theory
Relational Database variable variable low moderate relational
algebra.
From…http://wikipedia.org/wiki/NoSQL
04/10/2023 8
Navigational Query Performance
04/10/2023 9
The Physical Data Model
• Becoming a relationship specialist…
Meetings
P1 Place TimeP2Alice Denver 5-27-10Bob
Calls
From Time DurationToBob 13:20 25CarlosBob 17:10 15Charlie
Payments
From Date AmountToCarlos 5-12-10 100000Charlie
Met5-27-10Alice
Called13:20Bob
Paid100000Carlos
Charlie
Called17:10
Rows/Columns/Tables Relationship/Graph Optimized
04/10/2023 10
Sometimes Big Data is just Fast Data !
• Some data is only actionable momentarily– Intelligence– IT Security– Site/page visit– Financial / trading behavior
• Presents a different type of challenge• Latency of batch data processing becomes
problematic
04/10/2023 11
Scaling Writes
• Big/Fast data demands write performance• Most NoSQL solutions allow you to scale writes by…
– Partitioning the data– Understanding your consistency requirements– Allowing you to defer conflicts
Why a Graph Database ?
04/10/2023 12
04/10/2023 13
Relationships are everywhere
CRM, Sales & Marketing Network Mgmt,
TelecomIntelligence
(Government& Business)
Finance
HealthcareResearch: Genomics
Social Networks
PLM (Product Lifecycle Mgmt)
04/10/2023 14
Exploding Connections
• More often than not… graphs are big !
The Graph Database Landscape
Copyright © InfiniteGraph
• Neo4J • Titan (Aurelius)• AllegroGraph (RDF) • FlockDB (Twitter)• DEX (Sparsity)• OrientDB (Document)• + 24 others (from wikipedia.org)
The Graph Database Landscape Cont’d
Copyright © InfiniteGraph
• Graph Analytics: High latency, Batch Processing, offline– Apache Giraph– GraphLab– Intel’s Graph Builder
• Visual Analytics: In Memory, High Performance, Poor Scalability
– Tom Sawyer– D3JS– KeyLines– InfoVis
• Tinkerpop stack (Blueprints/Gremlin)– 16 implementations and counting…
04/10/2023 17
Why InfiniteGraph™?
• Objectivity/DB is a proven foundation
– Building highly connected databases since 1993– A complete database management system
• Concurrency, transactions, cache, schema, query, indexing
• It’s a Graph Specialist !
– Simple but powerful API tailored for navigation through data
– Easy to configure distribution model
04/10/2023 18
InfiniteGraph™ Basic Architecture
InfiniteGraph - Core/API
ConfigurationNavigation Execution
Management Extensions
BlueprintsUser Apps
Distributed Object and Relationship Persistence Layer
Session / TX ManagementPlacement
04/10/2023 19
Zone 2Zone 1
HostA
Fully Distributed Data Model
IG Core/API
Distributed Object and Relationship Persistence Layer
ADP Placement
HostB HostC HostX
AddVertex()
04/10/2023 20
InfiniteGraph is a Complete Database• InfiniteGraph helps manage the things you don’t want to do, but
want to have done:– Concurrency
• Transactions (commit/rollback)
• Controlled multi-user reading during updates
– Schema Control• Build complex data structures, make changes easily and migrate existing data
– Distribution• Sharing large amounts of distributed data between distributed processes
– Indexes• Choose built-in key-value, b-tree or other indexes
– Cache• Keep large sections of the graphs in configurable memory caches
Copyright © InfiniteGraph
Scaling Graph Writes
InfiniteGraph
Objectivity/DB Persistence Layer
App-1(Ingest V1)
App-2(Ingest V2)
App-3(Ingest V3)
V1 V2 V3
App-1(E1 2{ V1V2})
App-2(E23{ V2V3})
App-3
E12 E23
04/10/2023 22
High Performance Edge Ingest
IG Core/API
C1
C2
C3
E12
E23
Targ
et C
onta
iner
s
Pipeline ContainersE(1->2)
E(3->1)
E(2->3)
E(2->1)
E(2->3)E(3->1)
E(1->2)
E(3->2)
E(1->2)
E(2->3)
E(3->1)
E(2->1)
E(2->3)
E(3->1)
E(3->2)
E(1->2)
Pipeline
Agent
04/10/2023 23
Result…
1
2
4
0
50000
100000
150000
200000
250000
300000
350000
400000
450000
500000
1 client
2 clients
4 clients
8 clients
1 client
2 clients
4 clients
8 clients
# of Processes
Nod
es a
nd E
dges
per
seco
nd
8 Hosts
4 Hosts
2 Hosts
Single Host
Distributed API
Application(s)
Partition 1 Partition 3Partition 2 Partition ...n
Processor Processor Processor Processor
Copyright © InfiniteGraph
Scaling Reads and QueryPartitioning and Read Replicas… easy right !
04/10/2023 25
Distributed API
Application(s)
Partition 1 Partition 3Partition 2 Partition ...n
Processor Processor Processor Processor
Why are Graphs Different ?
04/10/2023 26
Optimizing Distributed Navigation• Detect local hops and perform in memory
traversal– Intelligently cache freq accessed remote data
• Route tasks to other hosts when it is optimal
Processor
Distributed API
Partition 1 Partition 2
Processor
Application
A
XY
B
C
D
EP(A,B,C,D)
F
G
04/10/2023 27
Super Simple API
Person alice = new Person(“Alice”);helloGraphDB.addVertex( alice );
Person bob = new Person(“Bob”);helloGraphDB.addVertex( bob );
Person carlos = new Person(“Carlos”);helloGraphDB.addVertex( carlos );
Person charlie = new Person(“Charlie”);helloGraphDB.addVertex( charlie );
04/10/2023 28
Adding Edges
MyEdgeType edge = new MyEdgeType();
vertexA.addEdge ( edge, vertexB, EdgeKind.???, weight );
Meeting denverMeeting = new Meeting("Denver", "5-27-10");alice.addEdge(denverMeeting, bob, EdgeKind.BIDIRECTIONAL, (short)1);
Call bobToCarlos = new Call(getRandomJulyTime());bob.addEdge(bobToCarlos, carlos, EdgeKind.OUTGOING, (short)0);
Payment payment = new Payment(10000.00);carlos.addEdge(payment, charlie, EdgeKind.OUTGOING, (short)2);
Call bobToCharlie = new Call(getRandomJulyTime());bob.addEdge(bobToCharlie, charlie, EdgeKind.INCOMING, (short)0);
04/10/2023 29
The Result…
04/10/2023 30
Graph Traversal (Navigation) Queries
• Use an instance of the Navigator class to perform a navigation query.
• A navigation instance is highly customizable, but is comprised of the following basic parts:– The vertex from which to start the navigation query.– A guide strategy, which is a high-level navigational aid. You can
create a custom guide, or there are several available built-in guide strategies.
• Guide.Strategy.NONE• Guide.Strategy.SIMPLE_BREADTH_FIRST• Guide.Strategy.SIMPLE_DEPTH_FIRST
– Qualifiers• A path qualifier• A result qualifier
– Handlers• A result handler
04/10/2023 31
Schema – It’s not your enemy ! (well not all the time...)
• Schema vs Schema-less– Database religion– No time for a full debate here– InfiniteGraph supports schema– Planning to also support optional properties on
schema types
• Graph Views : A Great Use Case for Schema!– Filter by type and predicate during navigation– Connection Inference!
Graph Views and Bacon!• Filter out uninteresting projects connected to Kevin Bacon
GraphView view = new GraphView();//Excludes all instances of TvShow from navigationview.excludeClass(myDb.getTypeId(TvShow.class.getName()));//Excludes all movies made for TV/Videoview.excludeClass(myDb.getTypeId(Movie.class.getName()),
“details.madeForTv || details.madeForVideo”);//Include ActedIn w/ characterName not containing “Himself”view.excludeClass(myDb.getTypeId(WorkedOn.class.getName()));view.includeClass(myDb.getTypeId(ActedIn.class.getName()),
“!CONTAINS(characterName, “Himself”)”);
Kevin Bacon
Actor
The Following
TV ShowBehind the
Scenes
Movie
Apollo 13
Movie
HimselfRyan Hardy
Jack Swigert
04/10/2023 33
Tools To Suit the Solution
Demo
Installing InfiniteGraph FlightPlan Sample