the 5 graphs of love
DESCRIPTION
Recorded webinar: neotechnology.com/webinar-five-graphs-love
The iDating industry cares about interactions and connections, and the two are closely linked: someone who is connected to another person through a shared friend or a shared interest is much more likely to interact with them. Graph databases are optimized for querying connections between people, things, interests, or really anything that can be connected. Dating sites and apps worldwide have begun to use graph databases to gain a competitive edge. Neo4j provides thousand-fold performance improvements and massive agility benefits over relational databases, enabling new levels of performance and insight. Amanda Laucher discusses the five graphs of love and how companies like eHarmony, Hinge and AreYouInterested.com are now using graph algorithms to create more interactions and connections.
TRANSCRIPT
Amanda Laucher Neo Technology @pandamonial
(Neo4j)-[:POWERS]->(Love)
Most of your favorite dating sites
The 5 Graphs of Love
• The Friends-of-Friends Graph
• The Passion Graph
• The Location Graph
• The Safety Graph
• The Poser Graph
Meet Jeremy...
๏from: California
๏appearance: very handsome
๏personality: super friendly nerd
๏interests: piano, coding
Jeremy has some friends
๏Kerstin: his sister
๏Peter: his buddy
๏Andreas: his coworker
Graph: Kerstin, Andreas, Jeremy, Peter
His friends introduced more friends
๏Michael: master hacker, divorced, 2 kids
๏Johan: technology sage, likes fast cars
๏Madelene: polyglot journalist, loves dogs
๏Allison: marketing maven, likes long walks on the beach
Graph: Johan, Kerstin, Allison, Andreas, Michael, Madelene, Jeremy, Peter
So, we have a bunch of people
๏how do we know they are friends?
๏either ask each pair: are you friends?
๏or, we can add explicit connections
๏Twitter, Facebook, LinkedIn, etc.
This is really just data
๏it's just a graph
Graph: Johan, Kerstin, Allison, Anna, Adam, Andreas, Michael, Madelene, Jeremy, Peter
A graph?
Yes, a graph...
๏you know the common data structures
•linked lists, trees, object "graphs"
๏a graph is the general purpose data structure
•suitable for any connected data
๏well-understood patterns and algorithms
•studied since Leonhard Euler's Seven Bridges of Königsberg (1736)
•Codd's Relational Model (1970)
•not a new idea, just an idea whose time is now
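To make "it's just a graph" concrete, here is a minimal sketch of the friendship data as an adjacency map in plain Python. It is an in-memory illustration only, not how Neo4j stores data. Jeremy's three friends come from the slides; which friend introduced whom is not stated, so the remaining edges are assumptions.

```python
from collections import defaultdict

# Jeremy's friends (Kerstin, Peter, Andreas) are named on the slides.
# Every edge marked "assumed" is invented for this illustration.
FRIENDSHIPS = [
    ("Jeremy", "Kerstin"),
    ("Jeremy", "Peter"),
    ("Jeremy", "Andreas"),
    ("Kerstin", "Johan"),     # assumed
    ("Peter", "Michael"),     # assumed
    ("Andreas", "Madelene"),  # assumed
    ("Andreas", "Allison"),   # assumed
]


def build_graph(edges):
    """Return an undirected adjacency map: person -> set of friends."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


graph = build_graph(FRIENDSHIPS)
print(sorted(graph["Jeremy"]))
```

Each person is a node and each friendship is an edge, so "who is connected to whom" becomes a lookup rather than a join.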
How can you use this? With a Graph Database
A graph database...
๏optimized for the connections between records
๏really, really fast at querying across records
๏a database: transactional with the usual operations
๏“A relational database may tell you the average age of everyone here, but a graph database will tell you who is most likely to buy you a beer later.”
What’s love got to do with it?
Friends of Friends Graph
According to SNAP Interactive, if you are a female user, you have a:
๏4% likelihood of interacting with a stranger
๏10% likelihood of interacting with a friend of a friend
๏7% chance of interacting with a 3rd-degree connection (friend of friend of friend)
๏Connections mean a much larger number of interactions!
Graph: Jeremy, Peter, Johan, Jennifer, Allison, Anna, Adam, Andreas, Michael, Madelene
Friends of friends = larger dating pool
Friends
Graph: Peter, Jennifer, Andreas, Jeremy
Friends of friends
Graph: Peter, Johan, Jennifer, Allison, Andreas, Jeremy, Madelene, Frank, Amanda
Friends of friends of friends
Find Jeremy’s FoFs
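The query shown on this slide did not survive in the transcript. As a stand-in, here is a minimal Python sketch of the same idea: collect everyone two hops from Jeremy and drop the people he already knows. The `GRAPH` data is hypothetical apart from Jeremy's three friends, and `friends_of_friends` is a name chosen for this example.

```python
# Hypothetical friendship data; only Jeremy's three friends come from the slides.
GRAPH = {
    "Jeremy": {"Kerstin", "Peter", "Andreas"},
    "Kerstin": {"Jeremy", "Johan"},
    "Peter": {"Jeremy", "Michael", "Andreas"},
    "Andreas": {"Jeremy", "Peter", "Madelene", "Allison"},
    "Johan": {"Kerstin"},
    "Michael": {"Peter"},
    "Madelene": {"Andreas"},
    "Allison": {"Andreas"},
}


def friends_of_friends(graph, person):
    """People two hops away who are neither the person nor a direct friend."""
    direct = graph.get(person, set())
    two_hops = set()
    for friend in direct:
        two_hops |= graph.get(friend, set())
    return two_hops - direct - {person}


print(sorted(friends_of_friends(GRAPH, "Jeremy")))
```

Peter and Andreas are friends with each other here, which is why direct friends have to be subtracted from the two-hop set.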
Demo - Find who Jeremy shares the most friends with
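The demo itself is not captured in the transcript. One way to sketch "who shares the most friends with Jeremy" is to count, for each friend of a friend, how many of Jeremy's friends lead to them. The data below is invented, and `shared_friend_counts` is a hypothetical helper, not something from the webinar.

```python
from collections import Counter

# Hypothetical data: Johan is reachable through two of Jeremy's friends.
GRAPH = {
    "Jeremy": {"Peter", "Andreas", "Kerstin"},
    "Peter": {"Jeremy", "Johan"},
    "Andreas": {"Jeremy", "Johan", "Allison"},
    "Kerstin": {"Jeremy", "Madelene"},
    "Johan": {"Peter", "Andreas"},
    "Allison": {"Andreas"},
    "Madelene": {"Kerstin"},
}


def shared_friend_counts(graph, person):
    """Rank friends of friends by the number of friends shared with person."""
    direct = graph[person]
    counts = Counter()
    for friend in direct:
        for candidate in graph[friend]:
            if candidate != person and candidate not in direct:
                counts[candidate] += 1
    # Highest count first; ties are broken alphabetically for stable output.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


print(shared_friend_counts(GRAPH, "Jeremy"))
```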
Complicated Relationships
Graph: Jake, Peter, Jennifer, Andreas; relationship types :WORKS_FOR, :FRIENDS, :WANTS_TO_DATE, :NO_DATE
Slide captions, in order: Friends; Awkward!!; Friends of Friends; Too complex!; Friends of Friends of Friends
Meet Jon...
๏from: UK
๏seeking: Females
๏appearance: Hot, hot, hot!
๏personality: Fun loving, easy going
๏interests: cooking, chemistry
Location Graph
Jon wants to find a date and refuses to have a long-distance relationship
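A rough sketch of a "no long-distance" filter, using the haversine great-circle distance in plain Python. This is not the Neo4j Spatial API. The coordinates, the 100 km radius, and the placement of each person are all assumptions made up for the example.

```python
import math

# Hypothetical (latitude, longitude) pairs; the slides only say Jon is from the UK.
LOCATIONS = {
    "Jon": (51.5074, -0.1278),       # London (assumed)
    "Jennifer": (51.4543, -0.9781),  # Reading (assumed)
    "Maria": (53.4808, -2.2426),     # Manchester (assumed)
    "Jane": (40.7128, -74.0060),     # New York (assumed)
}


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def within_radius(person, locations, radius_km):
    """Names of everyone else located within radius_km of person."""
    origin = locations[person]
    return sorted(
        name
        for name, point in locations.items()
        if name != person and haversine_km(origin, point) <= radius_km
    )


print(within_radius("Jon", LOCATIONS, radius_km=100))
```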
Location Graph
*Neo4j Spatial
Passion Graph
Jon wants to find someone he can share his passions with.
Match Specific Interests
Graph: Jon, :REPORTED_INTEREST, Cooking; Jennifer, Anne, Julia
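Matching on reported interests amounts to finding people who point at the same interest nodes. A small sketch with Python sets follows. Jon's interests (cooking, chemistry) are from his profile slide; everyone else's interests are assumptions, and `matches_for` is a hypothetical helper.

```python
# person -> interests they reported. Only Jon's entry comes from the slides.
REPORTED_INTEREST = {
    "Jon": {"cooking", "chemistry"},
    "Jennifer": {"cooking", "football"},  # assumed
    "Anne": {"cooking"},                  # assumed
    "Julia": {"cooking", "chemistry"},    # assumed
    "Greta": {"football"},                # assumed
}


def matches_for(person, interests):
    """People sharing at least one interest with person, best overlap first."""
    mine = interests[person]
    scored = [
        (other, sorted(mine & theirs))
        for other, theirs in interests.items()
        if other != person and mine & theirs
    ]
    return sorted(scored, key=lambda item: (-len(item[1]), item[0]))


for name, shared in matches_for("Jon", REPORTED_INTEREST):
    print(name, shared)
```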
Safety Graph
Jon uses social networks
Let’s dig into his Twitter
He follows some strange people
…and tweets about strange things!
Some basic word analysis
Let’s update based on behavior
Graph: Jon, :DEMONSTRATED_INTEREST
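The slides do not say what the word analysis consisted of. One simple possibility is keyword counting: a topic mentioned in enough tweets becomes a demonstrated interest. The tweets, the keyword map, and the two-mention threshold below are all invented for illustration.

```python
import re
from collections import Counter

# Hypothetical topic -> keyword map and sample tweets; none of this is from the talk.
TOPIC_KEYWORDS = {
    "taxidermy": {"taxidermy", "stuffed"},
    "cooking": {"recipe", "cooking"},
}
TWEETS = [
    "Just finished my third taxidermy project this week",
    "Anyone selling a stuffed owl?",
    "New recipe tonight",
]


def demonstrated_interests(tweets, topic_keywords, min_mentions=2):
    """Topics whose keywords appear in at least min_mentions tweets."""
    counts = Counter()
    for tweet in tweets:
        words = set(re.findall(r"[a-z']+", tweet.lower()))
        for topic, keywords in topic_keywords.items():
            if words & keywords:
                counts[topic] += 1
    return {topic for topic, n in counts.items() if n >= min_mentions}


print(demonstrated_interests(TWEETS, TOPIC_KEYWORDS))
```

Each resulting topic would then be recorded as a :DEMONSTRATED_INTEREST relationship from Jon, alongside the interests he reported himself.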
Any ladies ok with this?
Graph: Jennifer, Jane, Maria
Passion Graph
Jon loves the New England Patriots
Graph: Jon, :HAS_INTEREST, Sports; relationship types :IS_A and :HAS_TEAM
Find ladies who like football
Graph: Jennifer, Katie, Greta
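The point of the :HAS_TEAM hierarchy is that Jon's interest in one team can be generalized to the sport, so he matches people who like football or any football team. A sketch under assumed data follows: the slides name only the Patriots, so the other teams, the interest assignments, and the helper names are hypothetical.

```python
# sport -> teams, standing in for :HAS_TEAM edges (teams other than the Patriots are assumed).
HAS_TEAM = {
    "Football": {"New England Patriots", "Green Bay Packers"},
    "Baseball": {"Boston Red Sox"},
}
# person -> interests, standing in for :HAS_INTEREST edges (all but Jon's are assumed).
HAS_INTEREST = {
    "Jon": {"New England Patriots"},
    "Jennifer": {"Green Bay Packers"},
    "Katie": {"Football"},
    "Greta": {"New England Patriots"},
    "Maria": {"Boston Red Sox"},
}


def sports_of(interests, has_team):
    """Generalize a set of interests (teams or sports) to the sports they belong to."""
    sports = set()
    for interest in interests:
        if interest in has_team:
            sports.add(interest)
        for sport, teams in has_team.items():
            if interest in teams:
                sports.add(sport)
    return sports


def shares_sport_with(person, has_interest, has_team):
    """Everyone else whose interests fall under one of person's sports."""
    mine = sports_of(has_interest[person], has_team)
    return sorted(
        other
        for other, interests in has_interest.items()
        if other != person and sports_of(interests, has_team) & mine
    )


print(shares_sport_with("Jon", HAS_INTEREST, HAS_TEAM))
```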
Poser Graph
Jon has no luck with online dating. All of his interactions are with spam profiles.
Find real people with at least 1 social network & minimum 2 posts
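The rule on this slide translates directly into a filter. A minimal sketch follows, assuming "minimum 2 posts" means total posts across all linked networks. The `Profile` class, its field names, and the sample profiles are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    name: str
    social_networks: list = field(default_factory=list)  # linked accounts
    post_count: int = 0  # total posts across those accounts (assumed meaning)


def looks_real(profile, min_networks=1, min_posts=2):
    """Apply the slide's rule: at least 1 social network and at least 2 posts."""
    return len(profile.social_networks) >= min_networks and profile.post_count >= min_posts


# Invented sample profiles.
profiles = [
    Profile("Jennifer", ["twitter", "facebook"], 40),
    Profile("Katie", ["twitter"], 2),
    Profile("xX_HotSingle_Xx"),
    Profile("Bot42", ["twitter"], 1),
]
real = [p.name for p in profiles if looks_real(p)]
print(real)
```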
Find ladies who aren’t spam bots
Put it all together
Find Jon’s perfect date
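The combined query is not in the transcript. Conceptually, putting the five graphs together means keeping only the candidates that every graph agrees on, which can be sketched as a set intersection. The candidate sets below are invented, and a real system would more likely score and rank than filter this strictly.

```python
# Hypothetical output of each of the five graphs for Jon.
CANDIDATES = {
    "friends_of_friends": {"Jennifer", "Allison", "Madelene"},
    "nearby": {"Jennifer", "Maria", "Allison"},
    "shared_passions": {"Jennifer", "Katie", "Greta", "Allison"},
    "ok_with_behavior": {"Jennifer", "Jane", "Maria"},
    "not_spam": {"Jennifer", "Katie", "Jane", "Allison"},
}


def perfect_dates(candidate_sets):
    """People who appear in every candidate set."""
    sets = list(candidate_sets.values())
    return set.intersection(*sets) if sets else set()


print(perfect_dates(CANDIDATES))
```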
Graph: Jennifer, Jon, :PERFECT_FOR
Graph: Jennifer, Jon, :HAS_DATE_WITH
Jon & Jennifer delete their profiles and go off into the sunset!
Graph: Jon, Jennifer, Love; relationship types [:FOUND], [:AIDS], [:POWERS]
RDBMS/Other vs. Native Graph Database
Performance Challenges with Connected Data
Chart axes: Connectedness of Data Set, Response Time
RDBMS / Other NOSQL: # Hops: 0-2, Degree: < 3, Size: Thousands
Neo4j: # Hops: Tens to Hundreds, Degree: Thousands+, Size: Billions+
1000x faster
Neo Technology, Inc Confidential
Core Industries & Use Cases:
Industries: Web / ISV; Financial Services; Telecommunications
Use cases: Network & Data Center Management; Master Data Management; Social; Geo
Core Industries & Use Cases:
Industries: Software; Financial Services; Telecommunications; Health Care & Life Sciences; Web Social, HR & Recruiting; Media & Publishing; Energy, Services, Automotive, Gov’t, Logistics, Education, Gaming, Other
Use cases: Network & Data Center Management; MDM / System of Record; Social; Geo; Recommendations; Identity & Access Mgmt; Content Management; BI, CRM, Impact Analysis, Fraud Detection, Resource Optimization, etc.
Neo4j Adoption Snapshot
Select Commercial Customers* (some NDA)
*Community Users Not Included
Accenture
Aviation
Graph Database Deployment
Application
Other Databases
ETL
Graph Database Cluster
Data Storage & Business Rules Execution
Reporting
Graph Dashboards & Ad-hoc Analysis
Graph Visualization
End User Ad-hoc visual navigation & discovery
Bulk Analytic Infrastructure (e.g. Graph Compute Engine)
ETL
Graph Mining & Aggregation
Data Scientist
Ad-Hoc Analysis
Experiencing Query Pain
Actual HR Query* (in SQL)
(SELECT T.directReportees AS directReportees, sum(T.count) AS count
 FROM (
   SELECT manager.pid AS directReportees, 0 AS count
   FROM person_reportee manager
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   UNION
   SELECT manager.pid AS directReportees, count(manager.directly_manages) AS count
   FROM person_reportee manager
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   GROUP BY directReportees
   UNION
   SELECT manager.pid AS directReportees, count(reportee.directly_manages) AS count
   FROM person_reportee manager
   JOIN person_reportee reportee ON manager.directly_manages = reportee.pid
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   GROUP BY directReportees
   UNION
   SELECT manager.pid AS directReportees, count(L2Reportees.directly_manages) AS count
   FROM person_reportee manager
   JOIN person_reportee L1Reportees ON manager.directly_manages = L1Reportees.pid
   JOIN person_reportee L2Reportees ON L1Reportees.directly_manages = L2Reportees.pid
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   GROUP BY directReportees
 ) AS T
 GROUP BY directReportees)
UNION
(SELECT T.directReportees AS directReportees, sum(T.count) AS count
 FROM (
   SELECT manager.directly_manages AS directReportees, 0 AS count
   FROM person_reportee manager
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   UNION
   SELECT reportee.pid AS directReportees, count(reportee.directly_manages) AS count
   FROM person_reportee manager
   JOIN person_reportee reportee ON manager.directly_manages = reportee.pid
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   GROUP BY directReportees
   UNION
   SELECT L1Reportees.pid AS directReportees, count(L2Reportees.directly_manages) AS count
   FROM person_reportee manager
   JOIN person_reportee L1Reportees ON manager.directly_manages = L1Reportees.pid
   JOIN person_reportee L2Reportees ON L1Reportees.directly_manages = L2Reportees.pid
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   GROUP BY directReportees
 ) AS T
 GROUP BY directReportees)
UNION
(SELECT T.directReportees AS directReportees, sum(T.count) AS count
 FROM (
   SELECT reportee.directly_manages AS directReportees, 0 AS count
   FROM person_reportee manager
   JOIN person_reportee reportee ON manager.directly_manages = reportee.pid
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   GROUP BY directReportees
   UNION
   SELECT L2Reportees.pid AS directReportees, count(L2Reportees.directly_manages) AS count
   FROM person_reportee manager
   JOIN person_reportee L1Reportees ON manager.directly_manages = L1Reportees.pid
   JOIN person_reportee L2Reportees ON L1Reportees.directly_manages = L2Reportees.pid
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   GROUP BY directReportees
 ) AS T
 GROUP BY directReportees)
UNION
(SELECT L2Reportees.directly_manages AS directReportees, 0 AS count
 FROM person_reportee manager
 JOIN person_reportee L1Reportees ON manager.directly_manages = L1Reportees.pid
 JOIN person_reportee L2Reportees ON L1Reportees.directly_manages = L2Reportees.pid
 WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName"))
*“Find all direct reports and how many they manage, up to 3 levels down”
Experiencing Query Pain
Same Query*, using Cypher
MATCH (boss)-[:MANAGES*0..3]->(sub), (sub)-[:MANAGES*1..3]->(report)
WHERE boss.name = "John Doe"
RETURN sub.name AS Subordinate, count(report) AS Total
*“Find all direct reports and how many they manage, up to 3 levels down”
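To show what the variable-length :MANAGES traversal is doing, here is a plain-Python sketch of the same logic over an in-memory management tree. The org chart is invented, the helper names are hypothetical, and the sketch assumes each person has at most one manager. As I read the Cypher, subordinates with nobody under them produce no row, so the sketch omits them too.

```python
# manager -> direct reports, standing in for :MANAGES edges (invented org chart).
MANAGES = {
    "John Doe": ["Ann", "Bob"],
    "Ann": ["Cid", "Dee"],
    "Cid": ["Eve"],
}


def descendants(manages, person, min_depth, max_depth):
    """People reachable from person in min_depth..max_depth MANAGES hops."""
    found = []
    frontier = [person]
    for depth in range(max_depth + 1):
        if depth >= min_depth:
            found.extend(frontier)
        # Step one level further down the hierarchy.
        frontier = [report for m in frontier for report in manages.get(m, [])]
    return found


def report_totals(manages, boss):
    """For the boss and everyone up to 3 levels below, count reports 1..3 levels down."""
    totals = {}
    for sub in descendants(manages, boss, 0, 3):
        reports = descendants(manages, sub, 1, 3)
        if reports:
            totals[sub] = len(reports)
    return totals


print(report_totals(MANAGES, "John Doe"))
```

The depth bounds play the role of `*0..3` and `*1..3` in the Cypher pattern. The SQL version has to spell out each level as a separate join instead.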