![Page 1: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/1.jpg)
Big Data
NoSQL Database Types: episode II
![Page 2: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/2.jpg)
Content
▪ Document Store
▪ Graph DB
![Page 3: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/3.jpg)
Graph
![Page 4: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/4.jpg)
Graph DB
▪ Why Graph DB
▪ OrientDB
▪ OrientDB vs Neo4J
![Page 5: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/5.jpg)
Graph DB: Why
Graph databases have been around for a long time, in some form
![Page 6: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/6.jpg)
Graph DB: Why
![Page 7: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/7.jpg)
Graph DB: Why
Can it handle complexity?
▪ Key/Value
▪ Column Store
▪ Document Store
These cannot handle relations.
▪ Graph Database!
![Page 8: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/8.jpg)
Graph DB: Why
![Page 9: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/9.jpg)
Graph DB: RDBMS relations
Customer Address
![Page 10: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/10.jpg)
Graph DB: 1 to 1
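Address

| id | address |
|----|---------|
| 2 | Antwerp |
| 4 | Brussels |
| 5 | Essen |

Customer

| id | name | address_id |
|----|------|------------|
| 1 | Tom VdB | 5 |
| 2 | Tom C. | 4 |
| 3 | Andriy | 2 |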
![Page 11: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/11.jpg)
Graph DB: 1 to N
Customer

| id | name |
|----|------|
| 1 | Tom |
| 2 | Andriy |
| 3 | Jos |

Address

| id | customer | location |
|----|----------|----------|
| 1 | 3 | Antwerp |
| 2 | 3 | Brussels |
| 3 | 1 | Rome |
![Page 12: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/12.jpg)
Graph DB: N to M
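Customer

| id | name |
|----|------|
| 1 | Tom |
| 2 | Andriy |
| 3 | Jos |

CustomerAddress

| customer | address |
|----------|---------|
| 3 | 1 |
| 3 | 5 |
| 2 | 1 |

Address

| id | location |
|----|----------|
| 1 | Antwerp |
| 5 | Brussels |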
![Page 13: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/13.jpg)
Graph DB: what is wrong
![Page 14: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/14.jpg)
Graph DB: The join
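Customer

| id | name |
|----|------|
| 1 | Tom |
| 2 | Andriy |
| 3 | Jos |

CustomerAddress

| customer | address |
|----------|---------|
| 3 | 1 |
| 3 | 5 |
| 2 | 1 |

Address

| id | location |
|----|----------|
| 1 | Antwerp |
| 5 | Brussels |

These joins are all executed every time you traverse the relationship.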
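A minimal Python sketch of what that traversal costs, assuming the three tables above are plain in-memory lists. The `name` key for the customer column and the function `locations_of` are illustrative names, not taken from the slides.

```python
# Hypothetical in-memory copies of the three tables shown above.
customers = [
    {"id": 1, "name": "Tom"},
    {"id": 2, "name": "Andriy"},
    {"id": 3, "name": "Jos"},
]
customer_address = [
    {"customer": 3, "address": 1},
    {"customer": 3, "address": 5},
    {"customer": 2, "address": 1},
]
addresses = [
    {"id": 1, "location": "Antwerp"},
    {"id": 5, "location": "Brussels"},
]


def locations_of(customer_name):
    """Resolve Customer -> CustomerAddress -> Address with nested scans."""
    result = []
    for customer in customers:
        if customer["name"] != customer_name:
            continue
        # First join: find the link rows that point at this customer.
        for link in customer_address:
            if link["customer"] != customer["id"]:
                continue
            # Second join: find the address row each link points at.
            for address in addresses:
                if address["id"] == link["address"]:
                    result.append(address["location"])
    return result


print(locations_of("Jos"))     # ['Antwerp', 'Brussels']
print(locations_of("Andriy"))  # ['Antwerp']
```

Every call repeats both searches from scratch, which is the point the slide makes.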
![Page 15: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/15.jpg)
Graph DB: what is wrong
![Page 16: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/16.jpg)
Graph DB: what is wrong
A join means searching for a key in another table
In order to improve performance one adds indexing
But that slows down inserts, updates and deletes
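The trade-off can be sketched with a toy table that keeps a sorted index next to its rows. This is an illustration only, not how any particular database implements indexing; `IndexedTable` is a made-up name.

```python
import bisect


class IndexedTable:
    """Toy table with a sorted secondary index on one column."""

    def __init__(self, key):
        self.key = key
        self.rows = []   # rows in insertion order
        self.index = []  # sorted list of (key value, row position)

    def insert(self, row):
        self.rows.append(row)
        # Extra work on every write: the index has to stay sorted.
        bisect.insort(self.index, (row[self.key], len(self.rows) - 1))

    def find(self, value):
        # Binary search on the index instead of scanning every row.
        pos = bisect.bisect_left(self.index, (value, -1))
        found = []
        while pos < len(self.index) and self.index[pos][0] == value:
            found.append(self.rows[self.index[pos][1]])
            pos += 1
        return found


table = IndexedTable("name")
for name in ["Tom", "Andriy", "Jos"]:
    table.insert({"name": name})

print(table.find("Jos"))  # [{'name': 'Jos'}]
```

Reads get faster, but `insert` now does more than append a row. Updates and deletes would have to maintain the index in the same way.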
![Page 17: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/17.jpg)
Graph DB: index lookup
[Figure: index tree that splits the range A-Z into ever smaller ranges; looking up “Jos” descends A-Z → A-L → E-L → H-L → H-J → Jos]
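A rough sketch of such a lookup, using a binary search over sorted keys as a stand-in for the index tree. The step counter is added purely for illustration.

```python
def lookup_steps(sorted_keys, target):
    """Binary search that also reports how many halving steps it needed."""
    low, high = 0, len(sorted_keys) - 1
    steps = 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if sorted_keys[mid] == target:
            return mid, steps
        if sorted_keys[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return None, steps


print(lookup_steps(["Andriy", "Jos", "Tom"], "Jos"))  # (1, 1)

# With a million keys a single lookup needs at most 20 steps.
position, steps = lookup_steps(list(range(1_000_000)), 765_432)
print(position, steps <= 20)  # 765432 True
```

One lookup is cheap. The cost comes from how often it has to be repeated.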
![Page 18: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/18.jpg)
Graph DB: index lookup
Now imagine billions of records
![Page 19: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/19.jpg)
Graph DB: index lookup
This join is executed for every table involved, multiplied by the number of scanned records
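As a back-of-the-envelope model, assuming each index lookup costs about log2(table size) comparisons and every scanned record triggers one lookup per joined table. The function name and the numbers are illustrative, not measurements.

```python
import math


def estimated_comparisons(scanned_records, joined_table_sizes):
    """Rough cost model: one index lookup per joined table for every scanned record."""
    per_record = sum(math.ceil(math.log2(size)) for size in joined_table_sizes)
    return scanned_records * per_record


# 1,000 scanned records joined against two tables of a billion rows each.
print(estimated_comparisons(1_000, [1_000_000_000, 1_000_000_000]))  # 60000
```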
![Page 20: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/20.jpg)
Graph DB: What about document databases
{ "_id": 1, "name": "Tom", "address_id": 4 }
![Page 21: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/21.jpg)
Graph DB: Is there a better way
“A graph database is any storage system that provides index-free adjacency”
Marko Rodriguez, author of TinkerPop Blueprints
![Page 22: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/22.jpg)
Graph DB: Is there a better way
Index-free relationships?
![Page 23: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/23.jpg)
Graph DB: Back to school
![Page 24: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/24.jpg)
Graph DB: Back to School
Tom --[lives in]--> Essen
Tom and Essen are vertices; “lives in” is an edge.
![Page 25: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/25.jpg)
Graph DB: Back to School
Vertex Tom (firstname: Tom, surname: VdB, company: Ordina)
Vertex Essen (population: 17000)
Edge “lives in” from Tom to Essen (since: 1982)
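A minimal sketch of this property graph in Python, assuming vertices simply hold direct references to their edges, which is one way to picture index-free adjacency. The class and function names are illustrative.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare by identity; the object graph is cyclic
class Vertex:
    label: str
    properties: dict = field(default_factory=dict)
    out_edges: list = field(default_factory=list)  # direct references, no index
    in_edges: list = field(default_factory=list)


@dataclass(eq=False)
class Edge:
    label: str
    source: Vertex
    target: Vertex
    properties: dict = field(default_factory=dict)


def connect(source, label, target, **properties):
    """Create an edge and register it on both of its vertices."""
    edge = Edge(label, source, target, properties)
    source.out_edges.append(edge)
    target.in_edges.append(edge)
    return edge


tom = Vertex("Tom", {"firstname": "Tom", "surname": "VdB", "company": "Ordina"})
essen = Vertex("Essen", {"population": 17000})
connect(tom, "lives in", essen, since=1982)

# Traversal follows the stored reference; nothing is searched.
home = tom.out_edges[0].target
print(home.label, home.properties["population"])  # Essen 17000
```

Vertices and edges both carry properties, and getting from Tom to Essen is a matter of following a reference rather than looking up a key.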
![Page 26: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/26.jpg)
Graph DB: Back to School
1 to N relationships
Tom --[lives in, since: 1982]--> Essen
Tom --[walked in, when: 1990, 1992]--> Essen
![Page 27: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/27.jpg)
Graph DB: Back to School
Graph Example
Vertices: Tom, Ordina, meetup: bigdata.be
Edges: isMemberOf, Works For, Hosted By, VisitedOffice
![Page 28: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/28.jpg)
Graph DB: Back to school
Congratulations, you have now graduated in graph theory
![Page 29: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/29.jpg)
GraphDB: Index Lookup vs Relations
![Page 30: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/30.jpg)
GraphDB: Index Lookup vs Relations
![Page 31: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/31.jpg)
Graph DB: OrientDB
▪ How does OrientDB manage relationships
▪ Some Limits
▪ Hybrid
▪ Transactions and ACID
▪ Create the Graph
▪ Query vs Traversal
▪ Schema
![Page 32: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/32.jpg)
OrientDB: Manage Relationships
Tom (Vertex): Rid: #13.35, Label: “customer”, Name: Tom
Essen (Vertex): Rid: #13.100, Label: “city”, Name: Essen
![Page 33: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/33.jpg)
OrientDB: Manage Relationships
Tom (Vertex): Rid: #13.35, Label: “customer”, Name: Tom, out: #14.3
Essen (Vertex): Rid: #13.100, Label: “city”, Name: Essen, in: #14.3
Lives in (Edge): Rid: #14.3, Label: “Lives in”, In: #13.35, Out: #13.100
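The idea can be mimicked with a toy record store keyed by record id. This is not the OrientDB API: the dict only stands in for direct access to a record by its Rid, and the field names and values mirror the slide exactly as shown.

```python
# Toy record store keyed by Rid, copied from the slide above.
records = {
    "#13.35": {"label": "customer", "name": "Tom", "out": ["#14.3"]},
    "#13.100": {"label": "city", "name": "Essen", "in": ["#14.3"]},
    "#14.3": {"label": "Lives in", "in": "#13.35", "out": "#13.100"},
}


def neighbours(rid):
    """Follow each outgoing edge Rid, then the Rid stored on the edge's far end."""
    result = []
    for edge_rid in records[rid].get("out", []):
        edge = records[edge_rid]
        result.append(records[edge["out"]]["name"])
    return result


print(neighbours("#13.35"))  # ['Essen']
```

Each hop is a direct jump to a known record id; no key has to be searched for in another table.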
![Page 34: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/34.jpg)
OrientDB: Some Limits
Databases
Clusters
Records per cluster (Edges, Vertices and Documents)
Records per database
Record Size
Document Properties
![Page 35: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/35.jpg)
OrientDB: Some Limits
Indexes
Queries
Concurrency Level
![Page 36: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/36.jpg)
OrientDB: Class - Records - Cluster
![Page 37: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/37.jpg)
OrientDB: Hybrid Model
![Page 38: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/38.jpg)
OrientDB: Transactions and ACID
![Page 39: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/39.jpg)
OrientDB: Transactions and ACID
![Page 40: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/40.jpg)
OrientDB: Transactions and ACID
![Page 41: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/41.jpg)
OrientDB: Create the Graph - SQL
![Page 42: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/42.jpg)
OrientDB: Create the Graph - Java
![Page 43: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/43.jpg)
OrientDB: Query vs Traversal
[Figure: Calendar → Year 2014 → Month 12/2014 → Day: 1 dec 2014 and Day: 6 dec 2014; Order 1, Order 2 and Order 3; labels “Special Order” and “Orders”]
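A hedged sketch of the difference, assuming orders hang off day vertices in a calendar tree. Which order belongs to which day is not recoverable from the slide, so the assignment below is hypothetical, as are the function names.

```python
from datetime import date

# Hypothetical assignment of orders to days.
orders = [
    {"name": "Order 1", "day": date(2014, 12, 1)},
    {"name": "Order 2", "day": date(2014, 12, 1)},
    {"name": "Order 3", "day": date(2014, 12, 6)},
]


def query_orders(day):
    """Query style: test every order record against the condition."""
    return [order["name"] for order in orders if order["day"] == day]


# Traversal style: a calendar tree whose day vertices point straight at their orders.
calendar = {
    2014: {
        12: {
            1: [orders[0], orders[1]],
            6: [orders[2]],
        }
    }
}


def traverse_orders(year, month, day):
    """Walk Calendar -> Year -> Month -> Day and read the attached orders."""
    return [order["name"] for order in calendar[year][month][day]]


print(query_orders(date(2014, 12, 6)))  # ['Order 3']
print(traverse_orders(2014, 12, 6))     # ['Order 3']
```

Both return the same orders. The query touches every order, while the traversal only touches the vertices on the path.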
![Page 44: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/44.jpg)
OrientDB: Schema
▪ schema-full
▪ schema-mixed
▪ schema-less
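One way to picture the three modes, as a sketch rather than OrientDB's actual validation logic: schema-full accepts only declared fields, schema-mixed checks declared fields but tolerates extra ones, and schema-less accepts anything. The `validate` function and its arguments are illustrative.

```python
def validate(record, declared_fields, mode):
    """Check a record against declared field types under one of three schema modes."""
    if mode == "schema-less":
        return True  # anything goes
    for name, expected_type in declared_fields.items():
        if name in record and not isinstance(record[name], expected_type):
            return False  # declared fields must have the declared type
    if mode == "schema-full":
        # No fields beyond the declared ones are allowed.
        return set(record) <= set(declared_fields)
    return True  # schema-mixed: extra, undeclared fields are fine


declared = {"name": str}
record = {"name": "Tom", "company": "Ordina"}

print(validate(record, declared, "schema-full"))   # False
print(validate(record, declared, "schema-mixed"))  # True
print(validate(record, declared, "schema-less"))   # True
```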
![Page 45: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/45.jpg)
OrientDB: Schema Design
Vertices: Jos, Tom, André
Edges: Sends Email to (twice)
![Page 46: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/46.jpg)
OrientDB: Schema Design
Vertices: Jos, Tom, André, Email
Edges: sends, TO, CC
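The two designs can be compared with plain edge lists. Who sends to whom is not recoverable from the slides, so the sender and recipients below are hypothetical, as is the vertex name `email-1`.

```python
# Design 1: a direct edge between people.
direct_edges = [
    ("Jos", "Sends Email to", "Tom"),
    ("Jos", "Sends Email to", "André"),
]

# Design 2: the email is its own vertex, so TO and CC become separate edges.
email_edges = [
    ("Jos", "sends", "email-1"),
    ("email-1", "TO", "Tom"),
    ("email-1", "CC", "André"),
]


def targets(edges, source, label):
    """Return the vertices reached from `source` over edges with `label`."""
    return [dst for src, lbl, dst in edges if src == source and lbl == label]


print(targets(direct_edges, "Jos", "Sends Email to"))  # ['Tom', 'André']
print(targets(email_edges, "email-1", "CC"))           # ['André']
```

With the email as a vertex, the graph can tell a TO recipient from a CC recipient, and one email stays a single thing even when it has several recipients.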
![Page 47: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/47.jpg)
OrientDB: Gremlin
![Page 48: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/48.jpg)
OrientDB: Gremlin
Pipeline of steps
▪ transform
▪ filter
▪ sideEffect
▪ branch
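The pipeline idea can be sketched with Python generators. This is not Gremlin syntax, only an illustration of how the four kinds of step chain together; all function names are made up, and `branch` is modelled here as a simple if/else.

```python
def transform(items, fn):
    """Transform step: map each incoming element to a new one."""
    for item in items:
        yield fn(item)


def keep(items, predicate):
    """Filter step: pass on only the elements that satisfy the predicate."""
    for item in items:
        if predicate(item):
            yield item


def side_effect(items, action):
    """Side-effect step: run an action per element, pass the element on unchanged."""
    for item in items:
        action(item)
        yield item


def branch(items, predicate, if_true, if_false):
    """Branch step: route each element through one of two functions."""
    for item in items:
        yield if_true(item) if predicate(item) else if_false(item)


names = ["Tom", "Andriy", "Jos"]
seen = []

pipeline = transform(names, str.upper)
pipeline = keep(pipeline, lambda name: len(name) == 3)
pipeline = side_effect(pipeline, seen.append)
pipeline = branch(pipeline, lambda name: name.startswith("T"),
                  lambda name: name + "!", lambda name: name + "?")

result = list(pipeline)
print(result)  # ['TOM!', 'JOS?']
print(seen)    # ['TOM', 'JOS']
```

Nothing runs until the result is consumed; each element then flows through all the steps in order.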
![Page 49: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/49.jpg)
OrientDB: Gremlin
![Page 50: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/50.jpg)
OrientDB: Gremlin
![Page 51: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/51.jpg)
OrientDB: Gremlin
![Page 52: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/52.jpg)
Graph DB: Use Cases
▪ Recommendation engines
▪ Ranking/Credibility
▪ Path Finding (shortest, longest, mutual friends)
▪ Social (friendship, following, key connectors)
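Two of these use cases, shortest path and mutual friends, fit in a few lines on a plain adjacency structure. The friendship graph below is hypothetical and only serves the example.

```python
from collections import deque

# Hypothetical friendship graph as adjacency sets.
friends = {
    "Tom": {"Andriy", "Jos"},
    "Andriy": {"Tom", "André"},
    "Jos": {"Tom", "André"},
    "André": {"Andriy", "Jos"},
}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns one shortest path or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        # Sorted so the result is deterministic when several paths tie.
        for neighbour in sorted(graph[path[-1]]):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


def mutual_friends(graph, a, b):
    """People directly connected to both a and b."""
    return graph[a] & graph[b]


print(shortest_path(friends, "Tom", "André"))           # ['Tom', 'Andriy', 'André']
print(sorted(mutual_friends(friends, "Tom", "André")))  # ['Andriy', 'Jos']
```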
![Page 53: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/53.jpg)
Some code to play with
1. Go to https://github.com/tomvdbulck/orientdb_initiation
2. Make sure the following items have been installed on your machine:
o Java 7 or higher
o Git (if you like a pretty interface to deal with git, try SourceTree)
o Maven
3. Install VirtualBox https://www.virtualbox.org/wiki/Downloads
4. Install Vagrant https://www.vagrantup.com/downloads.html
5. Clone the repository into your workspace
6. Open a command prompt, go to the vagrant folder and run
vagrant up
7. This will start up the vagrant box. The first time will take a while (approx. 5 min) as it has to download the OS image and other dependencies.
![Page 54: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/54.jpg)
Want More?
![Page 55: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/55.jpg)
Even More?
Upcoming meetup on 17/06 - @ Ordina
1st meetup of Spark Belgium http://www.meetup.com/Spark-Belgium/events/222632697/
![Page 56: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/56.jpg)
Want More?
Upcoming meetup hosted @ordina on Wednesday 24/06 - Neo4j
http://www.meetup.com/graphdb-belgium/events/222504421
![Page 57: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/57.jpg)
Even More?
Upcoming workshop on 2/7 - @ Ordina
Introduction to Hadoop and its zoo
![Page 58: Big data document and graph d bs - couch-db and orientdb](https://reader034.vdocument.in/reader034/viewer/2022042619/5884a55b1a28ab76798b4a31/html5/thumbnails/58.jpg)
Questions or Suggestions?