20141216 graph database prototyping ams meetup

Graph Database Prototyping @ AMS GraphDB meetup

Upload: rik-van-bruggen

Post on 12-Jul-2015

TRANSCRIPT

Page 1: 20141216 graph database prototyping ams meetup

Graph Database Prototyping @ AMS GraphDB meetup

Page 2: 20141216 graph database prototyping ams meetup

Agenda for Tonight

• Building a Graph Database Prototype
• 3 parts
  – Graph database & modeling concepts
  – Prototyping tools & import
  – Graph querying with Cypher

Page 3: 20141216 graph database prototyping ams meetup

Data Modeling With Neo4j

Page 4: 20141216 graph database prototyping ams meetup

Topics

• Graph model building blocks
• Quick intro to Cypher
• Example modeling process
• Modeling tips
• Recipes for common modeling scenarios
• Refactoring
• Test-driven data modeling

Page 5: 20141216 graph database prototyping ams meetup

Graph Model Building Blocks

Page 6: 20141216 graph database prototyping ams meetup

Property Graph Data Model

Page 7: 20141216 graph database prototyping ams meetup

Four Building Blocks

• Nodes
• Relationships
• Properties
• Labels

Page 8: 20141216 graph database prototyping ams meetup

Nodes  

Page 9: 20141216 graph database prototyping ams meetup

Nodes  

• Used to represent entities and complex value types in your domain
• Can contain properties
  – Used to represent entity attributes and/or metadata (e.g. timestamps, version)
  – Key-value pairs
    • Java primitives
    • Arrays
    • null is not a valid value
  – Every node can have different properties
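The points about node properties can be sketched in Cypher. This is a minimal illustration only: the property names (version, active, createdAt, nicknames, email) and their values are assumptions made for the example, not part of the deck.

```cypher
// Create one entity node with attribute and metadata properties.
CREATE (ian:Person {
  name: 'Ian',                 // string
  version: 3,                  // integer (metadata)
  active: true,                // boolean
  createdAt: 1418688000000,    // timestamp kept as epoch milliseconds
  nicknames: ['ian', 'ir']     // array of strings
})
// Every node can have different properties, even under the same label.
CREATE (lucy:Person {name: 'Lucy', email: 'lucy@example.com'})
// null is never stored as a value: assigning null removes the property.
SET ian.version = null
RETURN ian, lucy
```

After this statement the node for Ian no longer has a version property at all, which is how "null is not a valid value" shows up in practice.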

Page 10: 20141216 graph database prototyping ams meetup

Entities and Value Types

• Entities
  – Have unique conceptual identity
  – Change attribute values, but identity remains the same
• Value types
  – No conceptual identity
  – Can substitute for each other if they have the same value
    • Simple: single value (e.g. colour, category)
    • Complex: multiple attributes (e.g. address)

Page 11: 20141216 graph database prototyping ams meetup

Relationships

Page 12: 20141216 graph database prototyping ams meetup

Relationships

• Every relationship has a name and a direction
  – Add structure to the graph
  – Provide semantic context for nodes
• Can contain properties
  – Used to represent quality or weight of relationship, or metadata
• Every relationship must have a start node and end node
  – No dangling relationships

Page 13: 20141216 graph database prototyping ams meetup

Relationships (continued)

Nodes can have more than one relationship

Self relationships are allowed

Nodes can be connected by more than one relationship
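A small sketch of these relationship rules, assuming two Person nodes. The FRIEND type appears in the backup slides; COLLEAGUE_OF, EMAILED and the since property are made up for the illustration.

```cypher
MERGE (peter:Person {name: 'Peter'})
MERGE (lucy:Person {name: 'Lucy'})
// A relationship always has a type, a direction, a start node and an end node.
// Properties on it describe the relationship itself (here: when it began).
MERGE (peter)-[:FRIEND {since: 2010}]->(lucy)
// The same two nodes can be connected by more than one relationship.
MERGE (peter)-[:COLLEAGUE_OF]->(lucy)
// A self relationship starts and ends at the same node.
MERGE (peter)-[:EMAILED]->(peter)
RETURN peter, lucy
```

Because every relationship needs both ends, there is no way to write a pattern here that leaves one side unattached.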

Page 14: 20141216 graph database prototyping ams meetup

Variable Structure

• Relationships are defined with regard to node instances, not classes of nodes
  – Two nodes representing the same kind of “thing” can be connected in very different ways
    • Allows for structural variation in the domain
  – Contrast with relational schemas, where foreign key relationships apply to all rows in a table
    • No need to use null to represent the absence of a connection

Page 15: 20141216 graph database prototyping ams meetup

Labels  

Page 16: 20141216 graph database prototyping ams meetup

Labels  

• Every node can have zero or more labels
• Used to represent roles (e.g. user, product, company)
  – Group nodes
  – Allow us to associate indexes and constraints with groups of nodes
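As a hedged illustration of labels as roles, the sketch below gives one node two labels and leaves another node unlabelled. The Employee label and the note property are assumptions for the example.

```cypher
// One node, two labels: Person plus a role it plays.
CREATE (ian:Person:Employee {name: 'Ian'})
// Zero labels is also allowed.
CREATE (misc {note: 'unlabelled'})
WITH ian
// A label-based MATCH only considers the nodes in that group.
MATCH (e:Employee)
RETURN e.name AS name, labels(e) AS labels
```

Indexes and constraints are declared against a label in the same spirit; the exact statement syntax differs between Neo4j versions, so it is left out here.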

Page 17: 20141216 graph database prototyping ams meetup

Four Building Blocks

• Nodes
  – Entities
• Relationships
  – Connect entities and structure domain
• Properties
  – Entity attributes, relationship qualities, and metadata
• Labels
  – Group nodes by role

Page 18: 20141216 graph database prototyping ams meetup

Designing a Graph Model

Page 19: 20141216 graph database prototyping ams meetup

Models

Images: en.wikipedia.org

Purposeful abstraction of a domain designed to satisfy particular application/end-user goals

Page 20: 20141216 graph database prototyping ams meetup

Design for Queryability

[Figure: Model and Query]

Page 21: 20141216 graph database prototyping ams meetup

Method  

1. Identify application/end-user goals
2. Figure out what questions to ask of the domain
3. Identify entities in each question
4. Identify relationships between entities in each question
5. Convert entities and relationships to paths
   – These become the basis of the data model
6. Express questions as graph patterns
   – These become the basis for queries

Page 22: 20141216 graph database prototyping ams meetup

Application/End-User Goals

As an employee

I want to know who in the company has similar skills to me

So that we can exchange knowledge

Page 23: 20141216 graph database prototyping ams meetup

Questions To Ask of the Domain

Which people, who work for the same company as me, have similar skills to me?

As an employee
I want to know who in the company has similar skills to me
So that we can exchange knowledge

Page 24: 20141216 graph database prototyping ams meetup

Identify Entities

Which people, who work for the same company as me, have similar skills to me?

Person
Company
Skill

Page 25: 20141216 graph database prototyping ams meetup

Identify Relationships Between Entities

Which people, who work for the same company as me, have similar skills to me?

Person WORKS_FOR Company
Person HAS_SKILL Skill

Page 26: 20141216 graph database prototyping ams meetup

Convert to Cypher Paths

Person WORKS_FOR Company
Person HAS_SKILL Skill

Relationship: [:WORKS_FOR], [:HAS_SKILL]

Label: :Person, :Company, :Skill

(:Person)-[:WORKS_FOR]->(:Company),
(:Person)-[:HAS_SKILL]->(:Skill)

Page 27: 20141216 graph database prototyping ams meetup

Consolidate Paths

(:Person)-[:WORKS_FOR]->(:Company),
(:Person)-[:HAS_SKILL]->(:Skill)

(:Company)<-[:WORKS_FOR]-(:Person)-[:HAS_SKILL]->(:Skill)

Page 28: 20141216 graph database prototyping ams meetup

Create Person Subgraph

MERGE (c:Company{name:'Acme'})
MERGE (p:Person{name:'Ian'})
MERGE (s1:Skill{name:'Java'})
MERGE (s2:Skill{name:'C#'})
MERGE (s3:Skill{name:'Neo4j'})
CREATE UNIQUE (c)<-[:WORKS_FOR]-(p),
              (p)-[:HAS_SKILL]->(s1),
              (p)-[:HAS_SKILL]->(s2),
              (p)-[:HAS_SKILL]->(s3)
RETURN c, p, s1, s2, s3
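To have something to compare Ian against, the graph needs colleagues as well. The sketch below adds two more people in the same shape. It assumes Lucy and Bill work for Acme and picks their skills so they overlap with Ian's; the Ruby skill is invented for the example. MERGE is used for the relationships instead of CREATE UNIQUE, since CREATE UNIQUE was dropped from later Neo4j releases.

```cypher
// Assumed sample data: two colleagues at the same company.
MERGE (c:Company {name: 'Acme'})
MERGE (java:Skill {name: 'Java'})
MERGE (neo:Skill {name: 'Neo4j'})
MERGE (ruby:Skill {name: 'Ruby'})
MERGE (lucy:Person {name: 'Lucy'})
MERGE (bill:Person {name: 'Bill'})
// Lucy shares two skills with Ian (Java, Neo4j).
MERGE (lucy)-[:WORKS_FOR]->(c)
MERGE (lucy)-[:HAS_SKILL]->(java)
MERGE (lucy)-[:HAS_SKILL]->(neo)
// Bill shares one skill with Ian (Neo4j).
MERGE (bill)-[:WORKS_FOR]->(c)
MERGE (bill)-[:HAS_SKILL]->(neo)
MERGE (bill)-[:HAS_SKILL]->(ruby)
RETURN c, lucy, bill
```

With this data in place, a similar-skills query for Ian would be expected to rank Lucy above Bill.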

Page 29: 20141216 graph database prototyping ams meetup

Candidate Data Model

(:Company)<-[:WORKS_FOR]-(:Person)-[:HAS_SKILL]->(:Skill)

Page 30: 20141216 graph database prototyping ams meetup

Express Question as Graph Pattern

Which people, who work for the same company as me, have similar skills to me?

Page 31: 20141216 graph database prototyping ams meetup

Cypher Query

Which people, who work for the same company as me, have similar skills to me?

MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill),
      (company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)
WHERE me.name = {name}
RETURN colleague.name AS name,
       count(skill) AS score,
       collect(skill.name) AS skills
ORDER BY score DESC

Page 32: 20141216 graph database prototyping ams meetup

Graph Pattern

Which people, who work for the same company as me, have similar skills to me?

MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill),
      (company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)
WHERE me.name = {name}
RETURN colleague.name AS name,
       count(skill) AS score,
       collect(skill.name) AS skills
ORDER BY score DESC

Page 33: 20141216 graph database prototyping ams meetup

Anchor Pattern in Graph

Which people, who work for the same company as me, have similar skills to me?

MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill),
      (company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)
WHERE me.name = {name}
RETURN colleague.name AS name,
       count(skill) AS score,
       collect(skill.name) AS skills
ORDER BY score DESC

If an index for Person.name exists, Cypher will use it
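A sketch of what creating such an index could look like. The first statement uses the form from the Neo4j 2.x/3.x era this deck dates from; newer releases use a different form, included only as a comment. Which one applies depends on the installed version, so treat both as assumptions to check against your server.

```cypher
// Neo4j 2.x/3.x form:
CREATE INDEX ON :Person(name);

// Form used by later releases (4.x onwards), shown for comparison only:
// CREATE INDEX FOR (p:Person) ON (p.name);

// The label plus the property predicate give the planner an indexed starting point.
MATCH (me:Person)
WHERE me.name = 'Ian'
RETURN me;
```

Without the :Person label on me, the planner has no label to tie the name predicate to, so the anchor would typically fall back to scanning nodes.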

Page 34: 20141216 graph database prototyping ams meetup

Create Projection of Results

Which people, who work for the same company as me, have similar skills to me?

MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill),
      (company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)
WHERE me.name = {name}
RETURN colleague.name AS name,
       count(skill) AS score,
       collect(skill.name) AS skills
ORDER BY score DESC

Page 35: 20141216 graph database prototyping ams meetup

First Match

Page 36: 20141216 graph database prototyping ams meetup

Second Match

Page 37: 20141216 graph database prototyping ams meetup

Third Match

Page 38: 20141216 graph database prototyping ams meetup

Running the Query

+-----------------------------------+
| name   | score | skills           |
+-----------------------------------+
| "Lucy" | 2     | ["Java","Neo4j"] |
| "Bill" | 1     | ["Neo4j"]        |
+-----------------------------------+
2 rows

Page 39: 20141216 graph database prototyping ams meetup

From User Story to Model and Query

As an employee
I want to know who in the company has similar skills to me
So that we can exchange knowledge

Which people, who work for the same company as me, have similar skills to me?

Person WORKS_FOR Company
Person HAS_SKILL Skill

(:Company)<-[:WORKS_FOR]-(:Person)-[:HAS_SKILL]->(:Skill)

MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill),
      (company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)
WHERE me.name = {name}
RETURN colleague.name AS name,
       count(skill) AS score,
       collect(skill.name) AS skills
ORDER BY score DESC

Page 40: 20141216 graph database prototyping ams meetup

Modeling Tips

Page 41: 20141216 graph database prototyping ams meetup

Properties Versus Relationships

Page 42: 20141216 graph database prototyping ams meetup

Use Relationships When…

• You need to specify the weight, strength, or some other quality of the relationship
• AND/OR the attribute value comprises a complex value type (e.g. address)
• Examples:
  – Find all my colleagues who are expert (relationship quality) at a skill (attribute value) we have in common
  – Find all recent orders delivered to the same delivery address (complex value type)
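The first example could be sketched as follows. It assumes that HAS_SKILL carries a level property with values such as 'expert'; that property name and its values are hypothetical. The company check is left out to keep the pattern short.

```cypher
MERGE (ian:Person {name: 'Ian'})
MERGE (lucy:Person {name: 'Lucy'})
MERGE (neo:Skill {name: 'Neo4j'})
// The quality of the relationship lives on the relationship itself.
MERGE (ian)-[:HAS_SKILL {level: 'intermediate'}]->(neo)
MERGE (lucy)-[:HAS_SKILL {level: 'expert'}]->(neo)
WITH ian
// People who are expert at a skill we have in common.
MATCH (ian)-[:HAS_SKILL]->(skill)<-[r:HAS_SKILL]-(colleague)
WHERE r.level = 'expert'
RETURN colleague.name AS name, skill.name AS skill
```

Because the skill is a node of its own, "in common" is simply two relationships arriving at the same node; a skill stored as a property on each person would need a value comparison instead.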

Page 43: 20141216 graph database prototyping ams meetup

Use Properties When…

• There’s no need to qualify the relationship
• AND the attribute value comprises a simple value type (e.g. colour)
• Examples:
  – Find those projects written by contributors to my projects that use the same language (attribute value) as my projects
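A minimal sketch of that example, with the language kept as a plain property. The Project label, the WROTE and CONTRIBUTED_TO relationship types, and the project names are all assumptions made for the illustration.

```cypher
MERGE (ian:Person {name: 'Ian'})
MERGE (lucy:Person {name: 'Lucy'})
// 'language' is a simple value, so it stays a property on the project.
MERGE (p1:Project {name: 'graph-tools', language: 'Java'})
MERGE (p2:Project {name: 'csv-loader', language: 'Java'})
MERGE (ian)-[:WROTE]->(p1)
MERGE (lucy)-[:CONTRIBUTED_TO]->(p1)
MERGE (lucy)-[:WROTE]->(p2)
WITH ian
// Projects written by contributors to my projects, in the same language as mine.
MATCH (ian)-[:WROTE]->(mine)<-[:CONTRIBUTED_TO]-(contributor)-[:WROTE]->(theirs)
WHERE theirs.language = mine.language
RETURN theirs.name AS project, theirs.language AS language
```

Nothing needs to be said about the link between a project and its language, which is why a property comparison in WHERE is enough here.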

Page 44: 20141216 graph database prototyping ams meetup

If Performance is Critical…

• Small property lookup on a node will be quicker than traversing a relationship
  – But traversing a relationship is still faster than a SQL join…
• However, many small properties on a node, or a lookup on a large string or large array property will impact performance
  – Always performance test against a representative dataset

Page 45: 20141216 graph database prototyping ams meetup

Relationship Granularity

Page 46: 20141216 graph database prototyping ams meetup

Align With Use Cases

• Relationships are the “royal road” into the graph
• When querying, well-named relationships help discover only what is absolutely necessary
  – And eliminate unnecessary portions of the graph from consideration

Page 47: 20141216 graph database prototyping ams meetup

General Relationships

• Qualified by property

Page 48: 20141216 graph database prototyping ams meetup

Specific Relationships

Page 49: 20141216 graph database prototyping ams meetup

Best of Both Worlds

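The diagrams for these three slides did not survive in the transcript, so the following is only a guess at the kind of contrast they drew. It assumes a person with a home address and a delivery address; the Address label, the relationship types and the address values are all hypothetical.

```cypher
MERGE (ian:Person {name: 'Ian'})
MERGE (home:Address {line1: '1 Example Street'})
MERGE (office:Address {line1: '2 Sample Road'})
// General relationship, qualified by a property.
MERGE (ian)-[:ADDRESS {type: 'home'}]->(home)
MERGE (ian)-[:ADDRESS {type: 'delivery'}]->(office)
// Specific relationships: the type itself carries the meaning.
MERGE (ian)-[:HOME_ADDRESS]->(home)
MERGE (ian)-[:DELIVERY_ADDRESS]->(office)
WITH ian
// "All addresses" follows the general type...
MATCH (ian)-[:ADDRESS]->(any)
WITH ian, collect(any.line1) AS allAddresses
// ...while "where do I deliver" follows the specific type, with no property filter.
MATCH (ian)-[:DELIVERY_ADDRESS]->(d)
RETURN allAddresses, d.line1 AS deliverTo
```

Keeping both forms side by side costs extra relationships on write, in exchange for each query being able to follow exactly the type it needs.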

Page 50: 20141216 graph database prototyping ams meetup

Model and Query Recipes

Page 51: 20141216 graph database prototyping ams meetup

Events and Actions

• Often involve multiple parties
• Can include other circumstantial detail, which may be common to multiple events
• Examples
  – Patrick worked for Acme from 2001 to 2005 as a Software Developer
  – Sarah sent an email to Lucy, copying in David and Claire
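One way to sketch the first example is to give the event a node of its own, so that every party and every piece of circumstantial detail can attach to it. The Employment and Role labels, the relationship types and the startYear/endYear property names are assumptions for this illustration, not the deck's own model.

```cypher
MERGE (patrick:Person {name: 'Patrick'})
MERGE (acme:Company {name: 'Acme'})
// The role is shared detail: other employments can point at the same node.
MERGE (role:Role {name: 'Software Developer'})
// The event itself, carrying its own period.
CREATE (job:Employment {startYear: 2001, endYear: 2005})
CREATE (patrick)-[:EMPLOYMENT]->(job),
       (job)-[:EMPLOYER]->(acme),
       (job)-[:ROLE]->(role)
RETURN patrick, job, acme, role
```

The email example would follow the same shape: an email node in the middle, with separate relationships for the sender, the recipient and each person copied in.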

Page 52: 20141216 graph database prototyping ams meetup

Timeline Trees

• Discrete events
  – No natural relationships to other events
• You need to find events at differing levels of granularity
  – Between two days
  – Between two months
  – Between two minutes
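A possible shape for such a tree, sketched under assumptions: Year, Month and Day nodes linked top-down, with events hanging off the day they occurred on. The labels, relationship types, the value property and the sample event are all hypothetical.

```cypher
// Build (or reuse) one branch of the tree: 2014 -> December -> 16th.
MERGE (y:Year {value: 2014})
MERGE (y)-[:MONTH]->(m:Month {value: 12})
MERGE (m)-[:DAY]->(d:Day {value: 16})
// Attach a discrete event to its day.
CREATE (e:Event {name: 'AMS GraphDB meetup'})
CREATE (e)-[:OCCURRED_ON]->(d)
WITH y
// All events in December 2014: enter at the year, stop at month granularity,
// then fan out to that month's days.
MATCH (y)-[:MONTH]->(:Month {value: 12})-[:DAY]->(day)<-[:OCCURRED_ON]-(event)
RETURN day.value AS dayOfMonth, event.name AS event
ORDER BY dayOfMonth
```

Finer granularity (hours, minutes) would add further levels below the day in the same way, so a query only descends as deep as the question requires.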

Page 53: 20141216 graph database prototyping ams meetup

Example Timeline Tree

Page 54: 20141216 graph database prototyping ams meetup

Pitfalls and Anti-Patterns

Page 55: 20141216 graph database prototyping ams meetup

Modeling Entities as Relationships

• Limits data model evolution
  – A relationship connects two things
  – Modeling an entity as a relationship prevents it from being related to more than two things
• Smells:
  – Lots of attribute-like properties
  – Heavy use of relationship indexes
• Entities hidden in verbs:
  – E.g. emailed, reviewed

 

Page 56: 20141216 graph database prototyping ams meetup

Example: Movie Reviews

• Initial requirements:
  – People review films
  – Application aggregates reviews from multiple sites

Page 57: 20141216 graph database prototyping ams meetup

Initial Model

Page 58: 20141216 graph database prototyping ams meetup

New Requirements

• Allow users to comment on each other’s reviews
  – Can’t connect a review to a third entity

Page 59: 20141216 graph database prototyping ams meetup

Revised model
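The model diagrams are missing from the transcript, so this sketch reconstructs the idea rather than the exact slides. It assumes the initial model held the review as properties on a REVIEWED relationship, and that the revision promotes the review to a node. All labels, relationship types and property names below are hypothetical.

```cypher
// Initial model (assumed shape): the review is hidden inside a relationship,
// so nothing else can point at it.
//   (:Person)-[:REVIEWED {rating: 4, text: '...'}]->(:Film)

// Revised model (assumed shape): the review is a node in its own right.
MERGE (sarah:Person {name: 'Sarah'})
MERGE (lucy:Person {name: 'Lucy'})
MERGE (film:Film {title: 'Example Film'})
CREATE (review:Review {rating: 4, text: 'Worth watching', site: 'example.com'})
CREATE (sarah)-[:WROTE_REVIEW]->(review),
       (review)-[:REVIEW_OF]->(film)
// The new requirement now fits: a comment can attach to the review.
CREATE (comment:Comment {text: 'Agreed'})
CREATE (lucy)-[:WROTE_COMMENT]->(comment),
       (comment)-[:COMMENT_ON]->(review)
RETURN sarah, review, film, lucy, comment
```

The verb "reviewed" was hiding an entity; once it is a node, connecting it to a third, fourth or fifth thing needs no further change to the model.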

Page 60: 20141216 graph database prototyping ams meetup

Model Actions in Terms of Products

Page 61: 20141216 graph database prototyping ams meetup

Now for Some Prototyping!

Page 62: 20141216 graph database prototyping ams meetup

Draw a Model!

E.g. using Visio, www.apcjones.com/arrows, http://graphjson.io, OmniGraffle

Page 63: 20141216 graph database prototyping ams meetup

Creating a prototype DB out of our model?

Page 64: 20141216 graph database prototyping ams meetup

Now for Some Queries!

Page 65: 20141216 graph database prototyping ams meetup

Next meetup!

• January 22nd: how to create an APPLICATION on top of our newly created database

Page 66: 20141216 graph database prototyping ams meetup

BACKUP slides: Cypher Query Language

Page 67: 20141216 graph database prototyping ams meetup

Nodes and Relationships

()-->()

Page 68: 20141216 graph database prototyping ams meetup

Labels and Relationship Types

(:Person)-[:FRIEND]->(:Person)

Page 69: 20141216 graph database prototyping ams meetup

Properties

(:Person{name:'Peter'})-[:FRIEND]->(:Person{name:'Lucy'})

Page 70: 20141216 graph database prototyping ams meetup

Identifiers

(p1:Person{name:'Peter'})-[r:FRIEND]->(p2:Person{name:'Lucy'})

Page 71: 20141216 graph database prototyping ams meetup

Cypher

MATCH graph_pattern
WHERE binding_and_filter_criteria
RETURN results

Page 72: 20141216 graph database prototyping ams meetup

Cypher

MATCH (p:Person)-[:FRIEND]->(friends)
WHERE p.name = 'Peter'
RETURN friends

Page 73: 20141216 graph database prototyping ams meetup

Lookup Using Identifier + Label

MATCH (p:Person)-[:FRIEND]->(friends)
WHERE p.name = 'Peter'
RETURN friends