nosql databases - imaglig-membres.imag.fr/.../uploads/sites/125/2017/11/nosql.pdfdatabases – sql...

NoSQL Databases

Vincent Leroy

Database

•  Large-scale data processing
    –  First 2 classes: Hadoop, Spark
    –  Perform some computation / transformation over a full dataset
    –  Process all data
•  Selective query
    –  Access a specific part of the dataset
    –  Manipulate only data needed (1 record among millions)
→ Database system

Processing / Database Link

[Figure: a Database connected to a Batch Job (Hadoop, Spark), a Stream Job (Spark, Storm) and Applications 1, 2 and 3. Arrow labels: Load data, Write results, Write results, Insert records, Serve queries. Examples: sentiment analysis, Twitter trends page.]

Different types of databases

•  So far we used HDFS
    –  A file system can be seen as a very basic database
    –  Directories / files to organize data
    –  Very simple queries (file system path)
    –  Very good scalability, fault tolerance…
•  Other end of the spectrum: Relational Databases
    –  SQL query language, very expressive
    –  Limited scalability (generally 1 server)

Size / Complexity

[Figure: database families placed on two axes, Size and Complexity: Graph DB, Relational DB, Document DB, Column DB, Key/Value DB, Filesystem.]

The NoSQL Jungle

Goal of these slides

•  Present an overview of the NoSQL landscape
    –  Trade-off in choosing a solution
    –  Theorems and principles
•  Not a manual to learn specific DBs
    –  Too many of them
    –  Not that complicated (especially K/V stores)
    –  Focus on Neo4j graph DB in lab work

Relational Databases: SQL

•  SQL language born 1974
    –  Still used by most data processing systems (including Spark)
→ Learn it! Don't be a victim of the NoSQL hype!

Relational Databases model

•  Data organized as tables
    –  Row = record
    –  Column = attribute
•  Relations between tables
    –  Integrity constraints

Select title from courses natural join takes_courses group by ClassID having count(*) > 10
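
The query on this slide can be tried with Python's built-in sqlite3 module. This is only a sketch: the slide gives the query text but not the schema, so the column layout of courses and takes_courses (including the StudentID column) and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: the slide only names the two tables and ClassID / title.
conn.executescript("""
    CREATE TABLE courses (ClassID INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE takes_courses (
        StudentID INTEGER,
        ClassID INTEGER REFERENCES courses(ClassID)
    );
""")
conn.executemany("INSERT INTO courses VALUES (?, ?)",
                 [(1, "Databases"), (2, "Compilers")])
# 12 students take course 1, only 3 take course 2.
conn.executemany("INSERT INTO takes_courses VALUES (?, ?)",
                 [(s, 1) for s in range(12)] + [(s, 2) for s in range(3)])

query = """
    SELECT title
    FROM courses NATURAL JOIN takes_courses
    GROUP BY ClassID
    HAVING COUNT(*) > 10
"""
# Titles of the courses taken by more than 10 students.
popular = [row[0] for row in conn.execute(query)]
print(popular)
```

The natural join matches rows on the shared ClassID column, so the query reads as "courses with more than 10 enrolled students".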

ACID properties

•  Atomicity
    –  Transactions are all or nothing (e.g. when adding a bi-directional friendship relation, it's added both ways or not at all)
•  Consistency
    –  Only valid data written (e.g. cannot say a student takes a course that is not in the courses table)
•  Isolation
    –  When multiple transactions execute simultaneously, they appear as if they were executed sequentially (aka serializability)
•  Durability
    –  When data has been written and validated, it is permanent (i.e. no data loss, even in the case of some failures)
→ Easy life for the developer
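
The friendship example given for atomicity can be sketched with sqlite3 transactions. The friends table, the add_friendship helper and the names used are hypothetical; the point is only that when the second insert fails, the first one is rolled back too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE friends (a TEXT, b TEXT, UNIQUE (a, b))")
conn.execute("INSERT INTO friends VALUES ('bob', 'alice')")  # pre-existing row
conn.commit()


def add_friendship(conn, x, y):
    """Insert both directions in one transaction: all or nothing."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("INSERT INTO friends VALUES (?, ?)", (x, y))
            conn.execute("INSERT INTO friends VALUES (?, ?)", (y, x))
        return True
    except sqlite3.IntegrityError:
        return False


# The second insert ('bob', 'alice') violates UNIQUE, so the first insert
# ('alice', 'bob') is rolled back as well.
ok = add_friendship(conn, "alice", "bob")
rows = conn.execute("SELECT a, b FROM friends ORDER BY a, b").fetchall()
print(ok, rows)
```

Without the transaction, the table would be left with a one-way friendship, which is exactly the state atomicity rules out.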

Why NoSQL then?

•  What does NoSQL mean?
    –  No SQL
    –  New SQL
    –  Not only SQL…
•  SQL strong properties limit its ability to scale to very large datasets
    –  Relax some properties to deal with larger datasets (ACID)
    –  But at what cost?
•  SQL is very structured (each record has the same columns…), Web data often is not
    –  Semi-structured data
    –  Unstructured data
    –  Graph data

CAP

•  Consistency
    –  When multiple operations execute simultaneously, it appears as if they were executed one after the other (cf. Isolation, the I of ACID)
•  Availability
    –  Every request received by a non-failed node must be answered
•  Partition tolerance
    –  System must respond correctly even if network fails

CAP theorem

•  Impossible to have 3 simultaneously
    –  Choose CA, CP, or AP
    –  In a centralized system, no need for P
        •  Relational databases have CA
    –  In a distributed system, you cannot ignore P
        •  Distributed databases choose CP or AP

CAP intuition

[Figure: Client 1, Client 2 and a Partition; values shown: A:2, B:5, A:3, B:6, A:3.]

2 solutions:
•  Refuse to answer in case of partition
•  Answer but risk inconsistencies
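
The two solutions can be illustrated with a toy pair of replicas. This is a sketch under strong assumptions, not how any particular database works: the Replica class, the "CP" / "AP" mode labels and the synchronous replication step are all invented for the example; only the starting values A:2 and B:5 are borrowed from the figure.

```python
class Replica:
    """One copy of the data; mode is 'CP' or 'AP' (labels assumed here)."""

    def __init__(self, mode):
        self.mode = mode
        self.data = {"A": 2, "B": 5}
        self.peer = None
        self.partitioned = False

    def write(self, key, value):
        if self.partitioned:
            if self.mode == "CP":
                # Cannot reach the peer: refuse rather than diverge.
                raise RuntimeError("unavailable during partition")
            self.data[key] = value      # AP: accept locally, peer becomes stale
            return
        self.data[key] = value
        self.peer.data[key] = value     # no partition: replicate synchronously

    def read(self, key):
        if self.partitioned and self.mode == "CP":
            raise RuntimeError("unavailable during partition")
        return self.data[key]


def make_pair(mode):
    r1, r2 = Replica(mode), Replica(mode)
    r1.peer, r2.peer = r2, r1
    return r1, r2


def partition(r1, r2):
    r1.partitioned = r2.partitioned = True


# AP: client 1 writes on one side, client 2 reads a stale value on the other.
ap1, ap2 = make_pair("AP")
partition(ap1, ap2)
ap1.write("A", 3)
stale = ap2.read("A")

# CP: the same write is refused while the partition lasts.
cp1, cp2 = make_pair("CP")
partition(cp1, cp2)
try:
    cp1.write("A", 3)
    refused = False
except RuntimeError:
    refused = True
```

In the AP pair both clients get an answer but disagree on A; in the CP pair nobody gets a wrong answer because nobody gets an answer at all.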

NoSQL and CAP

Weaker consistency models

•  Eventual consistency
    –  When there is no partition, DB is consistent
    –  In case of partition, DB can return stale data
    –  Once partition is gone, there is a time limit on how long it takes for consistency to return
•  Different levels of consistency (consistency / cost trade-off)
    –  Causal consistency
    –  Read-your-writes consistency
    –  Session consistency
    –  Monotonic read consistency
    –  Monotonic write consistency
→ Again, many choices, so many different systems
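
As an illustration of one of these levels, here is a rough sketch of read-your-writes consistency on top of an eventually consistent replica. The single-primary layout, the version counter and the Replica / Session / sync names are assumptions made for the example; real systems implement this guarantee in different ways.

```python
class Replica:
    def __init__(self):
        self.version = 0
        self.value = None

    def apply(self, version, value):
        if version > self.version:      # ignore updates that are not newer
            self.version, self.value = version, value


class Session:
    """Client session enforcing read-your-writes over lagging replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = replicas
        self.last_written = 0           # highest version this session wrote

    def write(self, value):
        self.primary.apply(self.primary.version + 1, value)
        self.last_written = self.primary.version

    def read(self):
        # Use the first replica that has caught up with our own writes,
        # and fall back to the primary otherwise.
        for replica in self.replicas:
            if replica.version >= self.last_written:
                return replica.value
        return self.primary.value


def sync(primary, replica):
    """Asynchronous replication step, assumed to run 'eventually'."""
    replica.apply(primary.version, primary.value)


primary, lagging = Replica(), Replica()
session = Session(primary, [lagging])
session.write("x")
stale_value = lagging.value     # the replica has not seen the write yet
own_read = session.read()       # the session still sees its own write
sync(primary, lagging)
after_sync = lagging.value      # the replica has converged
```

A client without the session bookkeeping could read the lagging replica directly and miss its own write, which is what plain eventual consistency allows.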

Vector clocks & conflict detection
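
The figures for these slides did not survive extraction, so the following is a generic sketch of vector clocks rather than the example shown in the lecture. Clocks are plain dictionaries from node name to counter; the function names and the node names n1 and n2 are assumptions.

```python
def increment(clock, node):
    """Return a copy of clock with node's counter advanced by one."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def merge(a, b):
    """Element-wise maximum: a clock that has seen everything a and b saw."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def compare(a, b):
    """Return 'equal', 'before', 'after' or 'concurrent'."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"     # neither dominates: a conflict to resolve


v0 = increment({}, "n1")    # first write, handled by n1
v1 = increment(v0, "n1")    # n1 updates the value again
v2 = increment(v0, "n2")    # n2 updates from v0 without seeing v1
resolved = increment(merge(v1, v2), "n1")   # n1 reconciles both versions

print(compare(v0, v1))        # before
print(compare(v1, v2))        # concurrent
print(compare(v2, resolved))  # before
```

Two versions conflict exactly when compare returns 'concurrent': each has seen an update the other has not, so neither can silently overwrite the other.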

Key/Value store

•  2 basic operations, similar to the HashMap data structure
    –  Put(K, V)
    –  Get(K)
•  Often used for caching information in memory
    –  Facebook uses them a lot
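
A minimal in-memory sketch of the two operations and of the caching use case. KeyValueStore, cached_lookup and slow_load are hypothetical names; a real store would add distribution, persistence and eviction on top.

```python
class KeyValueStore:
    """Put / Get over a Python dict, the equivalent of a HashMap."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)


def cached_lookup(cache, key, load):
    """Try the cache first, fall back to the slow load function on a miss."""
    value = cache.get(key)
    if value is None:
        value = load(key)
        cache.put(key, value)
    return value


calls = []


def slow_load(key):
    """Stand-in for an expensive query against a backing database."""
    calls.append(key)
    return key.upper()


cache = KeyValueStore()
first = cached_lookup(cache, "user:42", slow_load)    # miss: calls slow_load
second = cached_lookup(cache, "user:42", slow_load)   # hit: served from memory
```

The second lookup never reaches slow_load, which is the whole benefit of putting a K/V store in front of a slower system.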

Column / Tabular DB

•  Data organized as rows with a primary key
    –  Flexible format, each row can have different fields in a column family
    –  Relies on HDFS for fault tolerance
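
The data model can be pictured as nested maps: row key, then column family, then column. The sketch below assumes that column families are declared when the table is created while columns inside a family are free-form; the ColumnTable class and the sample rows are invented, and storage on HDFS is not modelled.

```python
from collections import defaultdict


class ColumnTable:
    def __init__(self, families):
        self.families = set(families)   # assumed to be fixed at creation time
        # row key -> column family -> column -> value
        self.rows = defaultdict(lambda: {f: {} for f in self.families})

    def put(self, row_key, family, column, value):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self.rows[row_key][family][column] = value

    def get(self, row_key, family, column=None):
        fam = self.rows[row_key][family] if row_key in self.rows else {}
        return fam if column is None else fam.get(column)


table = ColumnTable(["info"])
table.put("user1", "info", "name", "Ann")
table.put("user2", "info", "email", "bob@example.org")  # different field, same family

user1_info = table.get("user1", "info")
user2_name = table.get("user2", "info", "name")   # this row has no such column
```

Unlike a relational table, user1 and user2 do not share a column list: each row only stores the fields it actually has.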

Document DB

•  Data stored as documents (often JSON)
•  Richer than K/V stores
    –  Insert
    –  Delete
    –  Update
    –  Find
    –  Aggregation functions (Map, Reduce…)
    –  Indexes
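
To make the operation list concrete, here is a toy in-memory collection of JSON-like documents. It is a sketch, not the API of any real document database: the Collection class and its method signatures are assumptions, and indexes are left out (find simply scans every document).

```python
class Collection:
    def __init__(self):
        self.docs = []

    def insert(self, doc):
        self.docs.append(dict(doc))

    def find(self, **criteria):
        """Documents whose fields equal all the given criteria."""
        return [d for d in self.docs
                if all(d.get(k) == v for k, v in criteria.items())]

    def update(self, criteria, changes):
        matched = self.find(**criteria)
        for d in matched:
            d.update(changes)
        return len(matched)

    def delete(self, **criteria):
        doomed = self.find(**criteria)
        self.docs = [d for d in self.docs if d not in doomed]
        return len(doomed)

    def map_reduce(self, map_fn, reduce_fn):
        """Group the (key, value) pairs emitted by map_fn, then reduce each group."""
        groups = {}
        for d in self.docs:
            for key, value in map_fn(d):
                groups.setdefault(key, []).append(value)
        return {k: reduce_fn(k, vs) for k, vs in groups.items()}


people = Collection()
people.insert({"name": "Ann", "city": "Grenoble", "age": 30})
people.insert({"name": "Bob", "city": "Lyon"})            # no age field
people.insert({"name": "Cid", "city": "Grenoble", "age": 40})

in_grenoble = [d["name"] for d in people.find(city="Grenoble")]
per_city = people.map_reduce(lambda d: [(d["city"], 1)],
                             lambda city, ones: sum(ones))
```

Documents in the same collection need not share fields (Bob has no age), yet queries and aggregations still work on the fields that are present.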

Graph DB

•  Represent data as graphs
    –  Nodes / relationships with properties as K/V pairs
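
A small sketch of this property-graph model in plain Python. The Graph class, the match helper and the sample nodes are invented for illustration and are far simpler than what a real graph database offers.

```python
class Graph:
    def __init__(self):
        self.nodes = {}     # node id -> properties (K/V pairs)
        self.edges = []     # (source id, relationship type, target id, properties)

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props

    def add_edge(self, src, rel_type, dst, **props):
        self.edges.append((src, rel_type, dst, props))

    def match(self, rel_type, **src_props):
        """Yield (source, target) ids for edges of rel_type whose source
        node has all the given properties."""
        for src, rel, dst, _ in self.edges:
            node = self.nodes[src]
            if rel == rel_type and all(node.get(k) == v
                                       for k, v in src_props.items()):
                yield src, dst


g = Graph()
g.add_node("alice", kind="student")
g.add_node("bob", kind="student")
g.add_node("db", kind="course", title="Databases")
g.add_edge("alice", "TAKES", "db", year=2017)   # relationships carry properties too
g.add_edge("bob", "KNOWS", "alice")

enrolments = list(g.match("TAKES", kind="student"))
```

Both nodes and relationships carry their own K/V properties, so a fact such as the enrolment year lives on the relationship itself rather than in a separate join table.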

Graph DB: Neo4j

•  Rich data format
    –  Query language as pattern matching
    –  Limited scalability
        •  Replication to scale reads, writes need to be done to every replica
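
The scalability remark can be illustrated by counting the work each replica does. This is a generic full-replication sketch, not a description of Neo4j's implementation: the ReplicatedStore class, the round-robin read policy and the operation counters are assumptions.

```python
import itertools


class ReplicatedStore:
    def __init__(self, n_replicas):
        self.replicas = [{} for _ in range(n_replicas)]
        self.ops = [0] * n_replicas             # work done by each replica
        self._next = itertools.cycle(range(n_replicas))

    def write(self, key, value):
        # Every replica must apply every write.
        for i, replica in enumerate(self.replicas):
            replica[key] = value
            self.ops[i] += 1

    def read(self, key):
        # A read is served by a single replica, chosen round-robin.
        i = next(self._next)
        self.ops[i] += 1
        return self.replicas[i].get(key)


store = ReplicatedStore(3)
for n in range(3):
    store.write(f"k{n}", n)                         # 3 writes: 3 ops per replica
values = [store.read("k1") for _ in range(6)]       # 6 reads: 2 ops per replica
print(store.ops)
```

Adding replicas divides the read load among them, while the write load on each replica stays the same, which is why replication alone does not scale writes.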
