Webinar: An Introduction to InfiniteGraph, and Connecting the Dots in Big Data
DESCRIPTION
This August 16, 2011 webinar, hosted by DBTA with InfiniteGraph, examines the technology behind InfiniteGraph and explores common use cases involving very large scale graph processing and social network analysis. InfiniteGraph was designed specifically to traverse complex relationships in big data and to provide the framework for products that deliver real-time network analysis, business decision support and relationship analytics. Moderator: Tom Wilson, President, DBTA and Unisphere Research. Presenters: Darren Wood, Chief Architect, InfiniteGraph, and Mark Maagdenberg, Senior Field Engineer, InfiniteGraph.
TRANSCRIPT
Graph Database Overview and Feature Update
Darren Wood, Chief Architect, InfiniteGraph
History
• Objectivity – Massively scalable, distributed object-oriented database
  – Used in Government (DoD, Intelligence)
    • Machine generated data such as sensor, acoustic…
  – OEM Markets
    • Either complex data models, or high ingest, or both
• Significant technical advantage in highly connected (many-to-many) data models
Copyright © InfiniteGraph
Graph Databases
• Key technical attributes
• How InfiniteGraph addresses these
• Query and navigation
• Challenges/Requirements of Distribution
• Practical applications
Graph Databases
• Optimized around data relationships
  – Relationships as first class citizens
  – Super fast traversal between entities
  – Rich/flexible annotation of connections
• Small focused API (typically not SQL)
  – Natively work with concepts of Vertex/Edge
  – SQL has no concept of “navigation”
  – Most attempts based in SQL are convoluted
Distributed Graph Must Haves
• High performance distributed persistence
• Ability to deal with remote data reads (fast)
• Intelligent local cache of subgraphs
• Distributed navigation processing
• Distributed, multi-source concurrent ingest
• Write modes supporting both strict and eventual consistency
Some Code
Vertex alice = myGraph.addVertex(new Person("Alice"));
Vertex bob = myGraph.addVertex(new Person("Bob"));
Vertex carlos = myGraph.addVertex(new Person("Carlos"));
Vertex charlie = myGraph.addVertex(new Person("Charlie"));

alice.addEdge(new Meeting("Denver", "5-27-10"), bob);
bob.addEdge(new Call(timestamp), carlos);
carlos.addEdge(new Payment(100000.00), charlie);
bob.addEdge(new Call(timestamp), charlie);
[Diagram: Alice -(Meets)-> Bob, Bob -(Calls)-> Carlos, Carlos -(Pays)-> Charlie, Bob -(Calls)-> Charlie]
Physical Storage Comparison
Rows/Columns/Tables

Meetings
P1     Place   Time     P2
Alice  Denver  5-27-10  Bob

Calls
From  Time   Duration  To
Bob   13:20  25        Carlos
Bob   17:10  15        Charlie

Payments
From    Date     Amount  To
Carlos  5-12-10  100000  Charlie

Relationship/Graph Optimized

Alice -(Met 5-27-10)-> Bob
Bob -(Called 13:20)-> Carlos
Bob -(Called 17:10)-> Charlie
Carlos -(Paid 100000)-> Charlie
Query and Navigation
• Queries – but not as you know them
• More like a rules-based search and discovery
• Asynchronous Results
[Diagram: Alice -(Meets)-> Bob, Bob -(Calls)-> Carlos, Carlos -(Pays)-> Charlie, Bob -(Calls)-> Charlie]
“Find all paths between Alice and Charlie”
“Find all paths between Alice and Charlie – within 2 degrees”
“Find all paths between Alice and Charlie – events in May 2010”
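The first two queries above amount to a path search between two vertices, optionally bounded by depth. As a rough illustration of what such a search does, here is a minimal sketch in plain Java. It does not use the InfiniteGraph API: the class `PathSketch`, the method `findPaths` and the adjacency map are hypothetical names introduced only for this example, and the edges are the four created on the "Some Code" slide, treated as directed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class, not part of InfiniteGraph.
class PathSketch {
    // Adjacency list: vertex name -> vertices reachable over one outgoing edge.
    static final Map<String, List<String>> EDGES = new LinkedHashMap<>();
    static {
        EDGES.put("Alice", List.of("Bob"));              // Meets
        EDGES.put("Bob", List.of("Carlos", "Charlie"));  // Calls, Calls
        EDGES.put("Carlos", List.of("Charlie"));         // Pays
        EDGES.put("Charlie", List.of());
    }

    /** Returns every cycle-free path from start to target that uses at most maxHops edges. */
    static List<List<String>> findPaths(String start, String target, int maxHops) {
        List<List<String>> results = new ArrayList<>();
        List<String> path = new ArrayList<>();
        path.add(start);
        walk(start, target, maxHops, path, results);
        return results;
    }

    // Depth-first walk that records a copy of the current path whenever the target is reached.
    private static void walk(String current, String target, int hopsLeft,
                             List<String> path, List<List<String>> results) {
        if (current.equals(target)) {
            results.add(new ArrayList<>(path));
            return;
        }
        if (hopsLeft == 0) {
            return; // depth limit reached without finding the target
        }
        for (String next : EDGES.getOrDefault(current, List.of())) {
            if (path.contains(next)) {
                continue; // never revisit a vertex already on the path
            }
            path.add(next);
            walk(next, target, hopsLeft - 1, path, results);
            path.remove(path.size() - 1); // backtrack
        }
    }

    public static void main(String[] args) {
        // "Find all paths between Alice and Charlie"
        System.out.println(findPaths("Alice", "Charlie", Integer.MAX_VALUE));
        // prints [[Alice, Bob, Carlos, Charlie], [Alice, Bob, Charlie]]

        // "... within 2 degrees"
        System.out.println(findPaths("Alice", "Charlie", 2));
        // prints [[Alice, Bob, Charlie]]
    }
}
```

The third query (events in May 2010) would add a filter on edge attributes inside the loop; it is left out here because the slides do not give dates for the calls.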
Navigation Example
// Create a qualifier that describes the target vertex
Qualifier findCharliePredicate =
    new VertexPredicate(personType, "name == 'Charlie'");

// Construct a navigator which starts with Alice and uses a result qualifier
// to find all paths in the graph to Charlie
Navigator charlieFinder = alice.navigate(
    Guide.SIMPLE_BREADTH_FIRST, // default guide
    Qualifier.ANY,              // no path constraints
    findCharliePredicate,       // find paths ending with Charlie
    myResultHandler);           // fire results to supplied handler

// Start the navigator
charlieFinder.start();
Management of Large Data Graphs
• Graphs grow quickly
  – Billions of phone calls / day in US
  – Emails, social media events, IP Traffic
  – Financial transactions
• Some analytics require navigation of large sections of the graph
• Each step (often) depends on the last
• Must distribute data and go parallel
Basic Architecture
[Architecture diagram. Components shown: User Apps; Blueprints; IG Core/API (Configuration, Navigation Execution, Management Extensions, Session / TX Management, Placement); Objectivity/DB Distributed Database]
Feature Update
2.0
Accelerated Ingest
[Diagram: Standard Blocking Ingest/Placement (MDP Plugin). Labels shown: IG Core/API (Configuration, Navigation Execution, Management Extensions, Session / TX Management, Placement); Objectivity/DB; App-1 (Ingest V1), App-2 (Ingest V2), App-3 (Ingest V3); vertices V1, V2, V3; App-1 (E12 {V1, V2}), App-2 (E23 {V2, V3}), App-3; edges E12, E23]
Accelerated Ingest
[Diagram: accelerated ingest. Labels shown: IG Core/API (Configuration, Navigation Execution, Management Extensions, Session / TX Management); Placement (Standard); Placement (Accelerated); Distributed Pipelines; Staging Containers; Pipeline Containers; vertices V1, V2, V3; edges E12, E23; queued edge entries E(1->2), E(2->1), E(2->3), E(3->1), E(3->2)]
InfiniteGraph Visualizer
• Really nice flexible graph viewer
• Browser style navigation and history
• Full index support – search your data
• Display connections around a selected point
• Fully customize display to your data model
• Full data view via selection
Indexing Framework
• Focused on providing choice!
• Manual Indexes for grouping data
• Automatic Indexes for cross population
• Query interface with qualification language
• Pluggable query operators
• External index support (Lucene)
• Automated Distributed Navigation
• Stored Loadable Navigators
• Visualizer Navigation Plugins
• More Visualizer Enhancements
• More Import/Export support
>> next
Graphs are used everywhere!
• Social Network Analysis
  – Targeted Advertising
  – Recommendation Engines
• Transportation
• Network Analysis
• Fraud Detection/Prevention
• Crime Detection/Prevention
Social Network Analysis
[Diagram: social network of Sam, Bob, Julie, Kate, Mary, Mike, Joe, Susan, Jim and Laura]

Value     Degree Centrality  Betweenness Centrality  Closeness  Eigenvalue
High      Bob                Sam                     Sam        Bob, Sam
Moderate  Sam                Bob, Joe                Bob, Joe   Julie, Kate

Finding and measuring key players and relationships
Transportation
Given a list of flights between airports represented as…

FLIGHT NO  DEPART AIRPORT  ARRIVE AIRPORT  DEPART TIME       ARRIVE TIME       PRICE
0          AMS             LHR             2007-03-01-11.30  2007-03-01-12.30  160.17
1          LHR             ORD             2007-03-01-13.30  2007-03-01-19.30  964.29
2          ORD             LAX             2007-03-01-20.30  2007-03-02-01.30  583.11
3          LAX             SYD             2007-03-02-02.30  2007-03-02-12.30  1663.04
4          AMS             TYO             2007-03-01-11.00  2007-03-01-22.00  1595.86
5          TYO             SYD             2007-03-02-03.00  2007-03-02-14.00  1487.33
6          AMS             LAX             2007-03-01-18.00  2007-03-02-07.00  1374.15
7          AMS             JFK             2007-03-01-10.00  2007-03-01-16.00  964.61
8          JFK             PHX             2007-03-01-19.00  2007-03-02-01.00  1069.99
9          AMS             LGA             2007-03-01-10.00  2007-03-01-16.00  1081.56
10         LGA             PHX             2007-03-01-20.00  2007-03-02-02.00  911.92
11         AMS             EWR             2007-03-01-10.00  2007-03-01-17.00  911.36
12         EWR             PHX             2007-03-01-19.00  2007-03-02-00.00  937.98
13         AMS             CAI             2007-03-01-09.00  2007-03-01-16.00  1208.67
14         CAI             TYO             2007-03-01-19.00  2007-03-02-00.00  977.95
15         AMS             JFK             2007-03-01-15.00  2007-03-01-21.00  1155.43
16         AMS             LGA             2007-03-01-12.00  2007-03-01-18.00  923.61
17         AMS             LHR             2007-03-01-15.00  2007-03-01-16.00  114.23

… try to answer the following

“Find me the cheapest flight from Amsterdam to Phoenix leaving on March 1, 2007, with a maximum of two stops, and each stop should be less than 4 hours”
Transportation (graph model)
[Graph diagram: airports AMS, LHR, ORD, LAX, SYD, TYO, JFK, LGA, PHX, EWR and CAI as vertices; flights F0 to F17 as edges, each labeled with its price from the flight table (for example F0-160.17, F16-923.61, F17-114.23)]
Path 1: AMS -(F16)-> LGA -(F10)-> PHX Total Price: $1835.53
Path 2: AMS -(F11)-> EWR -(F12)-> PHX Total Price: $1849.34
Path 3: AMS -(F09)-> LGA -(F10)-> PHX Total Price: $1993.48
Path 4: AMS -(F07)-> JFK -(F08)-> PHX Total Price: $2034.60
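To make the graph formulation concrete, here is a minimal plain-Java sketch of the search behind these four paths. It is an illustration under stated assumptions, not the InfiniteGraph API and not necessarily how the presenters produced the results: `FlightSketch`, `Flight`, `search` and `describe` are hypothetical names, and the flight rows are copied from the table with timestamps rewritten in ISO form. One assumption is worth flagging: Path 3 has a layover of exactly four hours at LGA (F09 arrives 16:00, F10 departs 20:00), so the sketch treats the four-hour limit as inclusive in order to match the slide.

```java
import java.math.BigDecimal;
import java.time.Duration;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration class, not part of InfiniteGraph.
class FlightSketch {
    // One edge of the graph: a flight between two airport vertices.
    static final class Flight {
        final int no; final String from, to;
        final LocalDateTime depart, arrive; final BigDecimal price;
        Flight(String row) {
            String[] f = row.split(" ");
            no = Integer.parseInt(f[0]); from = f[1]; to = f[2];
            depart = LocalDateTime.parse(f[3]); arrive = LocalDateTime.parse(f[4]);
            price = new BigDecimal(f[5]);
        }
    }

    static final List<Flight> FLIGHTS = new ArrayList<>();
    static {
        String[] rows = { // flight table from the slide
            "0 AMS LHR 2007-03-01T11:30 2007-03-01T12:30 160.17",
            "1 LHR ORD 2007-03-01T13:30 2007-03-01T19:30 964.29",
            "2 ORD LAX 2007-03-01T20:30 2007-03-02T01:30 583.11",
            "3 LAX SYD 2007-03-02T02:30 2007-03-02T12:30 1663.04",
            "4 AMS TYO 2007-03-01T11:00 2007-03-01T22:00 1595.86",
            "5 TYO SYD 2007-03-02T03:00 2007-03-02T14:00 1487.33",
            "6 AMS LAX 2007-03-01T18:00 2007-03-02T07:00 1374.15",
            "7 AMS JFK 2007-03-01T10:00 2007-03-01T16:00 964.61",
            "8 JFK PHX 2007-03-01T19:00 2007-03-02T01:00 1069.99",
            "9 AMS LGA 2007-03-01T10:00 2007-03-01T16:00 1081.56",
            "10 LGA PHX 2007-03-01T20:00 2007-03-02T02:00 911.92",
            "11 AMS EWR 2007-03-01T10:00 2007-03-01T17:00 911.36",
            "12 EWR PHX 2007-03-01T19:00 2007-03-02T00:00 937.98",
            "13 AMS CAI 2007-03-01T09:00 2007-03-01T16:00 1208.67",
            "14 CAI TYO 2007-03-01T19:00 2007-03-02T00:00 977.95",
            "15 AMS JFK 2007-03-01T15:00 2007-03-01T21:00 1155.43",
            "16 AMS LGA 2007-03-01T12:00 2007-03-01T18:00 923.61",
            "17 AMS LHR 2007-03-01T15:00 2007-03-01T16:00 114.23"
        };
        for (String r : rows) FLIGHTS.add(new Flight(r));
    }

    static BigDecimal total(List<Flight> itinerary) {
        BigDecimal sum = BigDecimal.ZERO;
        for (Flight f : itinerary) sum = sum.add(f.price);
        return sum;
    }

    /** All itineraries origin -> dest leaving on 'day', cheapest first. */
    static List<List<Flight>> search(String origin, String dest, LocalDate day,
                                     int maxStops, Duration maxLayover) {
        List<List<Flight>> found = new ArrayList<>();
        extend(origin, dest, day, maxStops + 1, maxLayover, new ArrayList<>(), found);
        found.sort(Comparator.comparing(FlightSketch::total));
        return found;
    }

    // Depth-first expansion: each step depends on the arrival time of the previous leg.
    private static void extend(String at, String dest, LocalDate day, int legsLeft,
                               Duration maxLayover, List<Flight> soFar, List<List<Flight>> found) {
        if (at.equals(dest)) { found.add(new ArrayList<>(soFar)); return; }
        if (legsLeft == 0) return;
        for (Flight f : FLIGHTS) {
            if (!f.from.equals(at)) continue;
            if (soFar.isEmpty()) {
                if (!f.depart.toLocalDate().equals(day)) continue; // must leave on the requested day
            } else {
                Duration layover = Duration.between(soFar.get(soFar.size() - 1).arrive, f.depart);
                // Assumption: limit is inclusive, to match Path 3 on the slide.
                if (layover.isNegative() || layover.compareTo(maxLayover) > 0) continue;
            }
            soFar.add(f);
            extend(f.to, dest, day, legsLeft - 1, maxLayover, soFar, found);
            soFar.remove(soFar.size() - 1); // backtrack
        }
    }

    static String describe(List<Flight> itinerary) {
        StringBuilder sb = new StringBuilder(itinerary.get(0).from);
        for (Flight f : itinerary) sb.append(String.format(" -(F%02d)-> %s", f.no, f.to));
        return sb + " Total Price: $" + total(itinerary);
    }

    public static void main(String[] args) {
        for (List<Flight> it : search("AMS", "PHX", LocalDate.of(2007, 3, 1), 2, Duration.ofHours(4)))
            System.out.println(describe(it));
    }
}
```

Run as written, `main` should list the same four itineraries in the same order as the slide, with AMS -(F16)-> LGA -(F10)-> PHX at $1835.53 as the cheapest. Prices are held as `BigDecimal` so the totals add up exactly.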
Finding Criminal Activity (by association)
Finding Criminal Activity (by location)