do i need a graph database?

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com Do I need a Graph Database? Juan F. Sequeda, Ph.D Co-Founder Capsenta 1 Data/Graph Day Texas – January 14, 2017

Upload: juan-sequeda

Post on 13-Apr-2017


TRANSCRIPT

Page 1: Do I need a Graph Database?

Do I need a Graph Database?

Juan F. Sequeda, Ph.D.
Co-Founder, Capsenta

Data/Graph Day Texas – January 14, 2017

Page 2: Do I need a Graph Database?

• Last year, a talk at Data Day
  – Client thought they had a graph problem
  – Evaluated graph databases

• Guess what…
  – queries were faster in Postgres

• Did they really have a graph problem?

Page 3: Do I need a Graph Database?

What type of graphs are we talking about?

Page 4: Do I need a Graph Database?

Property Graphs vs RDF Graphs

[Figure: the same data drawn as an RDF graph and as a property graph]

RDF graph:
:Bob foaf:knows :Alice
:Bob foaf:name “Bob Smith”
:Alice foaf:name “Alice Smith”
:g1 :since 2005 (where :g1 identifies the foaf:knows statement)

Property graph:
node id1 (key/value: name = Bob Smith)
node id2 (key/value: name = Alice Smith)
edge knows from id1 to id2 (key/value: since = 2005)

RDF Graphs
• W3C Standard
• Based on Triples
• Graph Data Model for the Web (URIs)
• http://db-engines.com/en/ranking/rdf+store

Property Graphs
• No Standard (Cypher, Titan, etc.)
• Key/Values on Nodes and Edges
• http://db-engines.com/en/ranking/graph+dbms
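As a rough illustration of the two models, the figure's data can be written down with plain Python structures. This is only a sketch: the tuple and dictionary layouts are assumptions chosen for readability, not the storage format of any RDF store or property graph engine, and treating `:g1` as an identifier for the `foaf:knows` statement is an interpretation of the figure. The helper names `rdf_knows_since` and `pg_knows_since` are hypothetical.

```python
# RDF view: everything is a (subject, predicate, object) triple.
# The fourth element is an assumed identifier for the statement itself
# (":g1" in the figure), so that further triples can describe it.
rdf_quads = [
    (":Bob", "foaf:knows", ":Alice", ":g1"),
    (":Bob", "foaf:name", "Bob Smith", None),
    (":Alice", "foaf:name", "Alice Smith", None),
    (":g1", ":since", 2005, None),
]

# Property graph view: nodes and edges each carry their own key/value map.
pg_nodes = {
    "id1": {"name": "Bob Smith"},
    "id2": {"name": "Alice Smith"},
}
pg_edges = [
    {"from": "id1", "to": "id2", "label": "knows", "props": {"since": 2005}},
]


def rdf_knows_since(quads, subject, obj):
    """Find the year attached to the statement 'subject foaf:knows obj'."""
    for s, p, o, stmt_id in quads:
        if (s, p, o) == (subject, "foaf:knows", obj) and stmt_id is not None:
            # A second lookup is needed: the year hangs off the statement id.
            for s2, p2, o2, _ in quads:
                if s2 == stmt_id and p2 == ":since":
                    return o2
    return None


def pg_knows_since(edges, src, dst):
    """Read the 'since' property directly off the 'knows' edge."""
    for edge in edges:
        if (edge["from"], edge["to"], edge["label"]) == (src, dst, "knows"):
            return edge["props"].get("since")
    return None
```

In this toy form the property graph answers "since when?" with one edge lookup, while the RDF form needs an extra hop through the statement identifier.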

Page 5: Do I need a Graph Database?

[Table: rows are Cold Data and Warm Data; columns are Flexibility, Data Integration, Semantics, Provenance, “Graphy” Queries, Graph Visualization]

Page 6: Do I need a Graph Database?

Cold Data
• New Project
• New Data
• Should I use a Relational DB or a Graph DB?

Page 7: Do I need a Graph Database?

Warm Data
• Data already exists
• Applications consuming existing data
• Should I move my relational data to a Graph DB?
• Do I keep two copies of my data (relational and graph)?

Page 8: Do I need a Graph Database?

Flexibility

Page 9: Do I need a Graph Database?

Flexible

[Figure: RDF graph, listed here edge by edge]

:US_Constitution_1992/section/123 :text “Excessive bail shall not be required, nor excessive fines imposed, nor cruel and unusual punishments inflicted.”
:US_Constitution_1992/section/123 :isSectionOf :US_Constitution_1992
:US_Constitution_1992 :text “United States of America 1789 (rev. 1992)”
:US_Constitution_1992/section/123 :hasTopic :Cruelty
:Cruelty :label “Prohibition of cruel or degrading treatment”
:Cruelty :keyword “inhumane treatment”

Page 10: Do I need a Graph Database?

Data and Metadata are One

[Figure: the RDF graph from Page 9, extended with its schema in the same graph]

Data: :US_Constitution_1992/section/123, :US_Constitution_1992 and :Cruelty, with their :text, :isSectionOf, :hasTopic, :label and :keyword edges.

Metadata: the classes :Section, :Constitution, :Topic, :Rights_and_Duties and :Physical_Integrity_Rights. They are attached to the data by :type edges and related to each other by :subClass edges; :hasTopic and :isSectionOf also appear at the schema level.

Page 11: Do I need a Graph Database?

Flexibility in RDBMS

One wide table:
id | attr1 | attr2 | attr3 | attr4 | … | attrn | …

One attribute/value table:
id | attribute | value
id | attr1     | val1
id | attr2     | val2
id | attr3     | val3

One table per attribute:
attr1: id | value
attr2: id | value
attr3: id | value

Copeland and Khoshafian. A decomposition storage model. SIGMOD 1985

Agrawal et al. Storage and Querying of E-Commerce Data. VLDB 2001
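The three layouts can be tried side by side with Python's built-in `sqlite3` module. This is a minimal sketch, assuming a single record with three attributes; the table names `wide` and `eav` are hypothetical, and the per-attribute tables simply reuse the slide's `attr1`, `attr2`, `attr3` labels.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Layout 1: one wide table; a new attribute means changing the schema.
cur.execute("CREATE TABLE wide (id INTEGER PRIMARY KEY, attr1 TEXT, attr2 TEXT)")
cur.execute("INSERT INTO wide VALUES (1, 'val1', 'val2')")
cur.execute("ALTER TABLE wide ADD COLUMN attr3 TEXT")
cur.execute("UPDATE wide SET attr3 = 'val3' WHERE id = 1")

# Layout 2: one (id, attribute, value) table; a new attribute is just a row.
cur.execute("CREATE TABLE eav (id INTEGER, attribute TEXT, value TEXT)")
cur.executemany(
    "INSERT INTO eav VALUES (?, ?, ?)",
    [(1, "attr1", "val1"), (1, "attr2", "val2"), (1, "attr3", "val3")],
)

# Layout 3: one two-column table per attribute; a new attribute is a new table.
for name, value in [("attr1", "val1"), ("attr2", "val2"), ("attr3", "val3")]:
    cur.execute(f"CREATE TABLE {name} (id INTEGER, value TEXT)")
    cur.execute(f"INSERT INTO {name} VALUES (?, ?)", (1, value))

# Reading the same record back from each layout.
wide_row = cur.execute(
    "SELECT attr1, attr2, attr3 FROM wide WHERE id = 1"
).fetchone()

eav_row = dict(
    cur.execute("SELECT attribute, value FROM eav WHERE id = 1").fetchall()
)

split_row = cur.execute(
    "SELECT attr1.value, attr2.value, attr3.value "
    "FROM attr1 JOIN attr2 ON attr1.id = attr2.id "
    "JOIN attr3 ON attr1.id = attr3.id WHERE attr1.id = 1"
).fetchone()
```

The trade-off shows up in the reads: the wide table returns the record directly, the attribute/value table has to be pivoted back into a record, and the per-attribute tables need one join per attribute.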

Page 12: Do I need a Graph Database?

Virtualize Relational Data as (RDF) Graphs

[Figure: Query Federation]

Virtualize Relational Databases as RDF Graphs using R2RML

Keep your legacy data in the RDBMS

Run graph queries over the virtual graph data

Add new data that doesn’t fit into the schema into a separate graph

Federate queries over Virtualized Graph and the Real Graph
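The idea can be sketched in a few lines of Python. This is a toy stand-in, not R2RML itself (real R2RML mappings are written in RDF and executed by a mapping engine); the `section` table, its columns, the `MAPPING` dictionary and the function names are all hypothetical.

```python
import sqlite3

# Legacy relational data stays in the RDBMS (hypothetical table and columns).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE section (id INTEGER PRIMARY KEY, text TEXT, topic TEXT)")
conn.execute("INSERT INTO section VALUES (123, 'Excessive bail shall ...', 'Cruelty')")

# A toy mapping in the spirit of R2RML: one subject template per table,
# one predicate per column.
MAPPING = {
    "table": "section",
    "subject": ":US_Constitution_1992/section/{id}",
    "columns": {"text": ":text", "topic": ":hasTopic"},
}


def virtual_triples(conn, mapping):
    """Yield triples computed from rows at query time; nothing is copied."""
    cols = ["id"] + list(mapping["columns"])
    rows = conn.execute(f"SELECT {', '.join(cols)} FROM {mapping['table']}")
    for row in rows:
        record = dict(zip(cols, row))
        subject = mapping["subject"].format(id=record["id"])
        for column, predicate in mapping["columns"].items():
            value = record[column]
            # Assumption: topics become graph nodes, other columns stay literals.
            obj = f":{value}" if column == "topic" else value
            yield (subject, predicate, obj)


# New data that does not fit the relational schema lives in a separate, real graph.
real_graph = [(":Cruelty", ":keyword", "inhumane treatment")]


def federated_match(predicate):
    """Answer one triple pattern over the virtual and the real graph together."""
    combined = list(virtual_triples(conn, MAPPING)) + real_graph
    return [(s, o) for s, p, o in combined if p == predicate]
```

A call such as `federated_match(":hasTopic")` is answered from the relational rows, while `federated_match(":keyword")` is answered from the separate graph; the caller does not need to know which is which.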

Page 13: Do I need a Graph Database?

Flexibility

Cold Data
- Graph, but depends on flexibility needs
- RDF vs PG?
- RDF if Metadata is Key

Warm Data
- Hybrid Relational/RDF, but depends on flexibility needs
- RDF+R2RML

Data Integration

Page 14: Do I need a Graph Database?

Graphs are a Common Denominator

XML:
<constitution id="US_Constitution_1992">
  <section id="US_Constitution_1992/section/123">
    <text>Excessive bail shall ...</text>
  </section>
  <topic>Cruelty</topic>
</constitution>

Text:
“Excessive bail shall not be required, nor excessive fines imposed, nor cruel and unusual punishments inflicted.”

Tabular:
id  | text                  | topic
123 | Excessive bail shall… | Cruelty

Graph:
:US_Constitution_1992/section/123 :text “Excessive bail shall not be required, nor excessive fines imposed, nor cruel and unusual punishments inflicted.”
:US_Constitution_1992/section/123 :hasTopic :Cruelty
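A small Python sketch of the "common denominator" idea: the XML and tabular forms of the same section are both converted to triples and merged into one set. The conversion rules are assumptions made for this illustration (for example, attaching the constitution-level `<topic>` to each section), and the function names are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

XML_SOURCE = """
<constitution id="US_Constitution_1992">
  <section id="US_Constitution_1992/section/123">
    <text>Excessive bail shall ...</text>
  </section>
  <topic>Cruelty</topic>
</constitution>
"""

CSV_SOURCE = "id,text,topic\n123,Excessive bail shall ...,Cruelty\n"


def triples_from_xml(xml_text):
    """Turn each <section> into triples, using its id attribute as the subject."""
    root = ET.fromstring(xml_text)
    topic = root.findtext("topic")
    triples = set()
    for section in root.findall("section"):
        subject = ":" + section.get("id")
        triples.add((subject, ":text", section.findtext("text")))
        triples.add((subject, ":isSectionOf", ":" + root.get("id")))
        if topic:
            # Assumption: the document-level topic applies to every section.
            triples.add((subject, ":hasTopic", ":" + topic))
    return triples


def triples_from_csv(csv_text, constitution_id):
    """Turn each row into triples, building the same subject identifiers."""
    triples = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f":{constitution_id}/section/{row['id']}"
        triples.add((subject, ":text", row["text"]))
        triples.add((subject, ":hasTopic", ":" + row["topic"]))
    return triples


# Both sources land in one graph; identical triples collapse automatically.
graph = triples_from_xml(XML_SOURCE) | triples_from_csv(CSV_SOURCE, "US_Constitution_1992")
```

Because both converters produce the same subject identifier, the overlapping facts merge and the XML-only `:isSectionOf` edge is simply added alongside them.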

Page 15: Do I need a Graph Database?

Integration

[Figure: the RDF graph from Page 9, linked to case-law data; the added edges are listed below]

:US_Constitution_1992/section/123 :sameAs :EighthAmendment_USConstitution
:Farmer_vs_Brennan :lawsApplied :EighthAmendment_USConstitution
:Farmer_vs_Brennan :holding “A prison official’s ‘deliberate indifference’ to a substantial risk of a serious harm to an inmate violates the Eighth Amendment”
:Farmer_vs_Brennan :subject :Prisons_in_Indiana
:Farmer_vs_Brennan :subject :LGBT_right_case_laws

Page 16: Do I need a Graph Database?

Integrate Data using Graphs

Page 17: Do I need a Graph Database?

Virtually Integrate Data using Graphs

[Figure: Query Federation]

Page 18: Do I need a Graph Database?

Data Integration

Cold Data
- RDF Graphs because of URIs!
- SPARQL has federation

Warm Data
- Virtualize RDBMS as RDF

Semantics

Page 19: Do I need a Graph Database?

Semantics

[Figure: RDF graph, listed here edge by edge]

:US_Constitution_1992/section/123 :text “Excessive bail shall not be required, nor excessive fines imposed, nor cruel and unusual punishments inflicted.”
:US_Constitution_1992/section/123 :hasTopic :Cruelty
:Cruelty :label “Prohibition of cruel or degrading treatment”
:Cruelty :keyword “inhumane treatment”

Also shown: a :subClass edge and a second :hasTopic edge, both involving :Physical_Integrity_Rights.
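One way to read the figure is as subclass inference: if :Cruelty is a subclass of :Physical_Integrity_Rights, then anything with topic :Cruelty also has topic :Physical_Integrity_Rights. The Python sketch below implements that single rule under that assumed reading of the edge directions. It is not an OWL reasoner, and the function names are hypothetical.

```python
def superclasses(cls, sub_class_of):
    """All classes reachable from cls by following :subClass edges."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in sub_class_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def infer_topics(triples, sub_class_of):
    """Add a :hasTopic triple for every superclass of an asserted topic."""
    inferred = set(triples)
    for s, p, o in triples:
        if p == ":hasTopic":
            for parent in superclasses(o, sub_class_of):
                inferred.add((s, ":hasTopic", parent))
    return inferred


asserted = {
    (":US_Constitution_1992/section/123", ":hasTopic", ":Cruelty"),
}
# Assumed direction: :Cruelty is a subclass of :Physical_Integrity_Rights.
sub_class_of = {":Cruelty": [":Physical_Integrity_Rights"]}

closed = infer_topics(asserted, sub_class_of)
```

After the call, `closed` holds the asserted triple plus the inferred one, so a query for sections about :Physical_Integrity_Rights finds section 123 even though that fact was never stored.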

Page 20: Do I need a Graph Database?

Semantics

Cold Data
- RDF supports inference with OWL ontologies

Warm Data
- Limited inference available over Virtual Relational RDF graphs

Provenance

Page 21: Do I need a Graph Database?

Bob Smith knows Alice Smith

RDF graph:
:Bob foaf:knows :Alice
:Bob foaf:name “Bob Smith”
:Alice foaf:name “Alice Smith”

Property graph:
node id1 (name = Bob Smith), node id2 (name = Alice Smith), edge knows from id1 to id2

Bob Smith knows Alice Smith since 2005

Property graph:
the knows edge from id1 to id2 carries the key/value since = 2005

RDF graph, three alternatives shown:
- the foaf:knows statement is identified as :g1, with :g1 :since 2005
- the foaf:knows statement is identified as :s1, with :s1 :since 2005
- :Bob :knowsSince2005 :Alice, with :knowsSince2005 subProperty foaf:knows

Page 22: Do I need a Graph Database?

Juan said in 2017: Bob Smith knows Alice Smith since 2005

[Figure: RDF graph. The :Bob foaf:knows :Alice statement is identified as :s1, with :s1 :since 2005; additional nodes :Juan and :Event1, with edge labels :actorInvovled, :stated and :created (2017).]

[Figure: property graph. Nodes id1 (name = Bob Smith), id2 (name = Alice Smith) and id3 (name = Juan); the knows edge carries since = 2005; further key/value boxes read created = 2017 and Prop = knows.]

International Semantic Web Conference 2016

Page 23: Do I need a Graph Database?

Provenance

Cold Data
- PG if statements on edges
- Otherwise RDF vs PG vs RDB?

Warm Data
- Hybrid Relational/RDF

“Graphy” Queries

Page 24: Do I need a Graph Database?

Traversal, Navigation, Reachability

[Figure: the integrated RDF graph from Page 15]

• Conjunctive Query (CQ)
• Regular Path Queries (RPQ)
  • regular expression over edge labels
• Conjunctive Regular Path Query (CRPQ)
• Union Conjunctive Regular Path Query (UCRPQ)
• Shortest Path
• Page Rank
• Graph Algorithms …

https://github.com/graphMark/gmark

Bagan et al. gMark: Schema-Driven Generation of Graphs and Queries. IEEE Transactions on Knowledge and Data Engineering. 2017
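To make "regular expression over edge labels" concrete, here is a minimal Python sketch of a regular path query evaluated by breadth-first search. It supports only a sequence of labels, each optionally starred; the edge directions, the `:subClass` chain up to `:Rights_and_Duties`, and the names `EDGES`, `PATH` and `rpq` are assumptions made for the illustration.

```python
from collections import deque

# Edge list loosely based on the example graph; directions are assumed.
EDGES = [
    (":Farmer_vs_Brennan", ":lawsApplied", ":EighthAmendment_USConstitution"),
    (":EighthAmendment_USConstitution", ":sameAs", ":US_Constitution_1992/section/123"),
    (":US_Constitution_1992/section/123", ":hasTopic", ":Cruelty"),
    (":Cruelty", ":subClass", ":Physical_Integrity_Rights"),
    (":Physical_Integrity_Rights", ":subClass", ":Rights_and_Duties"),
]

# The expression  :lawsApplied / :sameAs / :hasTopic / :subClass*
PATH = [
    (":lawsApplied", False),
    (":sameAs", False),
    (":hasTopic", False),
    (":subClass", True),
]


def rpq(edges, start, path):
    """Nodes reachable from start along a path matching the expression.

    path is a list of (label, starred) steps: starred=False means exactly
    one edge with that label, starred=True means zero or more such edges.
    """
    out = {}
    for s, p, o in edges:
        out.setdefault(s, []).append((p, o))

    results = set()
    seen = {(start, 0)}          # search states are (node, position in path)
    queue = deque(seen)
    while queue:
        node, i = queue.popleft()
        if i == len(path):
            results.add(node)
            continue
        label, starred = path[i]
        successors = []
        if starred:
            successors.append((node, i + 1))  # stop repeating this step
        for p, o in out.get(node, ()):
            if p == label:
                # A starred step stays at position i; a plain step advances.
                successors.append((o, i if starred else i + 1))
        for state in successors:
            if state not in seen:
                seen.add(state)
                queue.append(state)
    return results
```

Calling `rpq(EDGES, ":Farmer_vs_Brennan", PATH)` walks from the case to the law it applied, across `:sameAs` to the constitution section, on to its topic, and then up any number of `:subClass` edges. The `seen` set keeps the search finite even if the graph contains cycles.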

Page 25: Do I need a Graph Database?

“Graphy” Queries

Cold Data
- RDF and PG both support

Warm Data
- Virtualize RDBMS as RDF and use recursion (?)
- Move to Graph (?)

Graph Visualizations

Page 26: Do I need a Graph Database?

• Gruff
• Linkurious
• D3
• Tom Sawyer
• Keylines
• …

Page 27: Do I need a Graph Database?

Cold Data
  Flexibility
  - Graph, but depends on flexibility needs
  - RDF vs PG?
  - RDF if Metadata is Key
  Data Integration
  - RDF Graphs because of URIs!
  - SPARQL has federation
  Semantics
  - RDF supports inference with OWL ontologies
  Provenance
  - PG if statements on edges is sufficient
  - Otherwise RDF vs PG vs RDB?
  “Graphy” Queries
  - RDF and PG both support
  Graph Visualization
  - PG seems to have more Graph Viz tooling
  - RDF is not that far behind though

Warm Data
  Flexibility
  - Hybrid Relational/RDF, but depends on flexibility needs
  - RDF+R2RML
  Data Integration
  - Virtualize RDBMS as RDF
  Semantics
  - Limited inference available over Virtual Relational RDF graphs
  Provenance
  - Hybrid Relational/RDF
  “Graphy” Queries
  - Virtualize RDBMS as RDF and use recursion (?)
  - Move to Graph (?)
  Graph Visualization
  - Virtualize RDBMS as RDF and use Viz tools (?)
  - Move to Graph (?)

Conclusion

Page 28: Do I need a Graph Database?

Takeaway: Tipping Point

[Figure: tipping point between Relational Database and Graphs]

• Flexible
• Data Integration
• Semantics
• Provenance
• “Graphy” Queries
• Graph Visualizations

Be skeptical! Ask Why?

Do you really need another database?

Page 29: Do I need a Graph Database?

THANK YOU

Juan Sequeda, Ph.D.
Co-Founder – [email protected]
@juansequeda

Sequeda J. Integrating Relational Databases with the Semantic Web. IOS Press. 2016
http://www.iospress.nl/book/integrating-relational-databases-with-the-semantic-web/