new opportunities for connected data
DESCRIPTION
Today's complex data is not only big, but also semi-structured and densely connected. In this session we'll look at how size, structure and connectedness have converged to change the way we work with data. We'll then go on to look at some of the new opportunities for creating end-user value that have emerged in a world of connected data, illustrated with practical examples implemented using Neo4j, the world's leading graph database. Speaker: Ian Robinson, Director of Customer Success, Neo Technology Ian is an engineer at Neo Technology, currently working on research and development for future versions of the Neo4j graph database. Prior to joining the engineering team, Ian served as Neo's Director of Customer Success, managing training, professional services and support, and working with customers to design and develop graph database solutions. He is a co-author of 'Graph Databases' (O'Reilly) and 'REST in Practice' (O'Reilly), and a contributor to 'REST: From Research to Practice' (Springer) and 'Service Design Patterns' (Addison-Wesley). He presents at conferences worldwide on REST and the graph capabilities of Neo4j, and blogs at http://iansrobinson.com.TRANSCRIPT
#neo4j
Outline
• Data complexity • Graph databases – features and benefits • Querying graph data • Use cases
#neo4j
Data Complexity
complexity = f(size, semi-structure, connectedness)
#neo4j
complexity = f(size, semi-structure, connectedness)
Data Complexity
#neo4j
Semi-‐Structure
#neo4j
Semi-‐Structure
Email: [email protected] Email: [email protected] Twi*er: @iansrobinson Skype: iansrobinson
FIRST_NAME LAST_NAME USER_ID EMAIL_1 EMAIL_2 TWITTER FACEBOOK SKYPE
Ian Robinson 315 [email protected] [email protected] @iansrobinson NULL iansrobinson
USER
CONTACT
CONTACT_TYPE
0..n
#neo4j
Social Network
#neo4j
Network Impact Analysis
#neo4j
Route Finding
#neo4j
Recommenda0ons
#neo4j
Logis0cs
#neo4j
Access Control
#neo4j
Fraud Analysis
#neo4j
Securi0es and Debt
Image: orgnet.com
#neo4j
Graphs Are Everywhere
#neo4j
Graph Databases
• Store • Manage • Query data
#neo4j
Neo4j is a Graph Database
Java APIs
Applica0on
REST API REST API REST API
REST Client
Applica0on
Write LB Read LB
#neo4j
Property Graph Data Model
#neo4j
Nodes
#neo4j
Labels
#neo4j
Rela0onships
#neo4j
Graph Database Benefits
“Minutes to milliseconds” performance • Millions of ‘joins’ per second • Consistent query 0mes as dataset grows
Fit for the domain • Lots of join tables? Connectedness • Lots of sparse tables? Semi-‐structure
Business responsiveness • Easy to evolve
#neo4j
Querying Graph Data
• Describing graphs • Crea0ng nodes, rela0onships and proper0es • Querying graphs
#neo4j
Describing Graphs
#neo4j
Cypher
(neo4j)<-[:HAS_SKILL]-(ben)-[:HAS_SKILL]->(rest),(ben)-[:WORKS_FOR]->(acme)
#neo4j
Cypher
(ben)-[:HAS_SKILL]->(neo4j),(ben)-[:HAS_SKILL]->(rest),(ben)-[:WORKS_FOR]->(acme)
#neo4j
Create Some Data CREATE (ben:person { name:'Ben' }), (acme:company { name:'Acme' }), (rest:skill { name:'REST' }), (neo4j:skill= { name:'Neo4j' }), (ben)-[:WORKS_FOR]->(acme), (ben)-[:HAS_SKILL]->(rest), (ben)-[:HAS_SKILL]->(graphs)RETURN ben
#neo4j
Create Nodes CREATE (ben:person { name:'Ben' }), (acme:company { name:'Acme' }), (rest:skill { name:'REST' }), (neo4j:skill= { name:'Neo4j' }), (ben)-[:WORKS_FOR]->(acme), (ben)-[:HAS_SKILL]->(rest), (ben)-[:HAS_SKILL]->(graphs)RETURN ben
#neo4j
Node
(ben:person { name:'Ben' })
#neo4j
Iden0fier
(ben:person { name:'Ben' })
ben =
#neo4j
Label
(ben:person { name:'Ben' })
ben =
#neo4j
Proper0es
(ben:person { name:'Ben' })
ben =
#neo4j
Create Rela0onships CREATE (ben:person { name:'Ben' }), (acme:company { name:'Acme' }), (rest:skill { name:'REST' }), (neo4j:skill= { name:'Neo4j' }), (ben)-[:WORKS_FOR]->(acme), (ben)-[:HAS_SKILL]->(rest), (ben)-[:HAS_SKILL]->(graphs)RETURN ben
#neo4j
Return Node CREATE (ben:person { name:'Ben' }), (acme:company { name:'Acme' }), (rest:skill { name:'REST' }), (neo4j:skill= { name:'Neo4j' }), (ben)-[:WORKS_FOR]->(acme), (ben)-[:HAS_SKILL]->(rest), (ben)-[:HAS_SKILL]->(graphs)RETURN ben
#neo4j
Eventually…
#neo4j
Querying a Graph
Graph local: • One or more start nodes • Explore surrounding graph • Millions of joins per second
#neo4j
Pafern Matching
Pafern
#neo4j
Pick Start Node
Pafern
#neo4j
First Match
Pafern
#neo4j
Second Match
Pafern
#neo4j
Third Match
Pafern
#neo4j
Not A Match
Pafern
#neo4j
Not A Match
Pafern
#neo4j
Who Shares My Skill Set?
#neo4j
Colleagues Who Share My Skills
#neo4j
Cypher Pafern
(company)<-[:WORKS_FOR]-(me) -[:HAS_SKILL]->(skill),(company)<-[:WORKS_FOR]-(colleague) -[:HAS_SKILL]->(skill)
#neo4j
Find Colleagues MATCH (company)<-[:WORKS_FOR]-(me:person) -[:HAS_SKILL]->(skill), (company)<-[:WORKS_FOR]-(colleague) -[:HAS_SKILL]->(skill)WHERE me.name = 'Ian'RETURN colleague.name AS name, count(skill) AS score, collect(skill.name) AS skillsORDER BY score DESC
#neo4j
Find Colleagues MATCH (company)<-[:WORKS_FOR]-(me:person) -[:HAS_SKILL]->(skill), (company)<-[:WORKS_FOR]-(colleague) -[:HAS_SKILL]->(skill)WHERE me.name = 'Ian'RETURN colleague.name AS name, count(skill) AS score, collect(skill.name) AS skillsORDER BY score DESC
#neo4j
Find Colleagues MATCH (company)<-[:WORKS_FOR]-(me:person) -[:HAS_SKILL]->(skill), (company)<-[:WORKS_FOR]-(colleague) -[:HAS_SKILL]->(skill)WHERE me.name = 'Ian'RETURN colleague.name AS name, count(skill) AS score, collect(skill.name) AS skillsORDER BY score DESC
#neo4j
Find Colleagues MATCH (company)<-[:WORKS_FOR]-(me:person) -[:HAS_SKILL]->(skill), (company)<-[:WORKS_FOR]-(colleague) -[:HAS_SKILL]->(skill)WHERE me.name = 'Ian'RETURN colleague.name AS name, count(skill) AS score, collect(skill.name) AS skillsORDER BY score DESC
#neo4j
Results +--------------------------------------+| name | score | skills |+--------------------------------------+| "Ben" | 2 | ["Neo4j","REST"] || "Charlie" | 1 | ["Neo4j"] |+--------------------------------------+2 rows
#neo4j
Search the En0re Network MATCH (me:person)-[:HAS_SKILL]->(skill), (company)<-[:WORKS_FOR]-(person) -[:HAS_SKILL]->(skill)WHERE me.name = 'Ian'RETURN person.name AS name, company.name AS company, count(skill) AS score, collect(skill.name) AS skillsORDER BY score DESC
#neo4j
Search the En0re Network MATCH (me:person)-[:HAS_SKILL]->(skill), (company)<-[:WORKS_FOR]-(person) -[:HAS_SKILL]->(skill)WHERE me.name = 'Ian'RETURN person.name AS name, company.name AS company, count(skill) AS score, collect(skill.name) AS skillsORDER BY score DESC
#neo4j
Results +--------------------------------------------------------------+| name | company | score | skills |+--------------------------------------------------------------+| "Arnold" | "Startup, Ltd" | 3 | ["Java","Neo4j","REST"] || "Ben" | "Acme, Inc" | 2 | ["Neo4j","REST"] || "Gordon" | "Startup, Ltd" | 1 | ["Neo4j"] || "Charlie" | "Acme, Inc" | 1 | ["Neo4j"] |+--------------------------------------------------------------+4 rows
#neo4j
Find People With Matching Skills MATCH p=(me:person)-[:WORKED_ON*2..4]-(person) -[:HAS_SKILL]->(skill)WHERE me.name = 'Ian' AND person <> me AND skill.name IN ['Java','Clojure','SQL']WITH person, skill, min(length(p)) as pathLengthRETURN person.name AS name, count(skill) AS score, collect(skill.name) AS skills, ((pathLength - 1)/2) AS distanceORDER BY score DESC
#neo4j
Find People With Matching Skills MATCH p=(me:person)-[:WORKED_ON*2..4]-(person) -[:HAS_SKILL]->(skill)WHERE me.name = 'Ian' AND person <> me AND skill.name IN ['Java','Clojure','SQL']WITH person, skill, min(length(p)) as pathLengthRETURN person.name AS name, count(skill) AS score, collect(skill.name) AS skills, ((pathLength - 1)/2) AS distanceORDER BY score DESC
#neo4j
Results +---------------------------------------------------+| name | score | skills | distance |+---------------------------------------------------+| "Arnold" | 2 | ["Clojure","Java"] | 2 || "Charlie" | 1 | ["SQL"] | 1 |+---------------------------------------------------+2 rows
#neo4j
Case Studies
#neo4j
Network Impact Analysis
• Which parts of network does a customer depend on?
• Who will be affected if we replace a network element?
#neo4j
Asset Management & Access Control
• Which assets can an admin control?
• Who can change my subscrip0on?
#neo4j
Logis0cs
• What’s the quickest delivery route for this parcel?
#neo4j
Social Network & Recommenda0ons
• Which assets can I access?
• Who shares my interests?
#neo4j
Download the free book from O’Reilly
hfp://graphdatabases.com
Ian Robinson, Jim Webber & Emil Eifrem
Graph Databases
h
Compliments
of Neo Technology
Thank you @ianSrobinson
github.com/iansrobinson