scale out your graph across servers and clouds with orientdb
TRANSCRIPT
![Page 1: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/1.jpg)
Scale Out Your Graph Across Servers and Clouds
with OrientDB
#gdsf17
Luca Garulli, Founder and CEO @lgarulli GraphDay - San Francisco - June 17, 2017
![Page 2: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/2.jpg)
Copyright (c) - OrientDB LTD 2
We all want the same thing: an open source GraphDB that is
Fast, Flexible, Scalable and … Unbreakable!
![Page 3: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/3.jpg)
Copyright (c) - OrientDB LTD 3
Complexity
Scal
ability
Single Thread
Multi Thread
Distributed
Systems/Complexity
![Page 4: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/4.jpg)
Copyright (c) - OrientDB LTD 4
Master Node
Auto-Discovery
I’m the only one!
C
![Page 5: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/5.jpg)
Copyright (c) - OrientDB LTD 5
Auto-Discovery
Master Node
Master Node
Connected!
C
![Page 6: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/6.jpg)
Copyright (c) - OrientDB LTD 6
Master Node
Master Node
C
Updated distributed configuration is broadcasted to
all the connected clients
Clients See the Distributed Configuration
![Page 7: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/7.jpg)
Copyright (c) - OrientDB LTD 77
Master Node
Master Node
CC
Master Node
Auto-reconnect in Case of Failure
In case of failure, the clients auto-reconnect to
the available nodes
![Page 8: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/8.jpg)
Copyright (c) - OrientDB LTD 88
Master Node
Master Node
C
DBs are automatically deployed
to the newly joined nodes
DB DB
Database Auto-deploy
CC
C
![Page 9: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/9.jpg)
Copyright (c) - OrientDB LTD 9
Replication: Under the Hood Client commits a transaction
99999
Master Node
Master Node
Master Node
HA Queue
HA Queue
HA Queue
C
Transaction
Requests
![Page 10: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/10.jpg)
Copyright (c) - OrientDB LTD 10101010101010
Master Node
Master Node
Master Node
HA Queue
HA Queue
HA Queue
C
HA Queue
HA Queue
HA Queue
Replication: Under the Hood Response Handling
WriteQuorum = 2
Sends OK
OK
Requests
Responses
![Page 11: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/11.jpg)
Copyright (c) - OrientDB LTD 11
Replication: Under the Hood Fix the unaligned node
11111111111111
Master Node
Master Node
Master Node
HA Queue
HA Queue
HA Queue
HA Queue
HA Queue
HA Queue
Requests
Responses
Fix
Commit
![Page 12: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/12.jpg)
Copyright (c) - OrientDB LTD 12
Replication: Under the Hood 2pc messages
12121212121212
Fix (response != quorum)
Commit (response == quorum)
Rollback (quorum not reached)
![Page 13: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/13.jpg)
Copyright (c) - OrientDB LTD 13
Master Node A
Optimistic MVCC Distributed Transaction
(2) tx task
(4) return result
(7) asynch* commit, rollback or fix
(1) Lock all records in order
+ Execute TX locally
Master Node B
(3) Lock all records in order
+ Execute TX locally
(5) Check results
+ unlock all records
(8) Execute the message and Unlock all records
C Node A becomes the transaction coordinator
(6) send response back to the client
![Page 14: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/14.jpg)
Copyright (c) - OrientDB LTD 14
Consistency- Distributed Locks don’t block reads - During the transaction the old version of the
record is retrieved - 2pc message is asynchronous, so a record could
not be updated yet on server X - If you need higher consistency, don’t use load
balancer policy ROUND_ROBIN_REQUEST
![Page 15: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/15.jpg)
Copyright (c) - OrientDB LTD 15
{ "autoDeploy": true, "writeQuorum": “majority”, "newNodeStrategy": “static”, "clusters": { "*": { "servers": ["<NEW_NODE>"] } } }
Default Configuration (json)
deploys the database automatically on new
nodes
![Page 16: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/16.jpg)
Copyright (c) - OrientDB LTD 16
Default Configuration (json)
writeQuorum=2 means at least 2
nodes must agree on writes. Set it to the majority to ensure
consistency
{ "autoDeploy": true, "writeQuorum": “majority”, "newNodeStrategy": “static”, "clusters": { "*": { "servers": ["<NEW_NODE>"] } } }
![Page 17: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/17.jpg)
Copyright (c) - OrientDB LTD 17
Default Configuration (json)
newNodeStrategy: static (default) or
dynamic. Static = once nodes join, they are
part of configuration
{ "autoDeploy": true, "writeQuorum": “majority”, "newNodeStrategy": “static”, "clusters": { "*": { "servers": ["<NEW_NODE>"] } } }
![Page 18: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/18.jpg)
Copyright (c) - OrientDB LTD 18
Default Configuration (json)
clusters contain the distributed
configuration per cluster
{ "autoDeploy": true, "writeQuorum": “majority”, "newNodeStrategy": “static”, "clusters": { "*": { "servers": ["<NEW_NODE>"] } } }
![Page 19: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/19.jpg)
Copyright (c) - OrientDB LTD 19
Default Configuration (json)
cluster “*” represents the default cluster
configuration. Any new node will join,
containing all the clusters
{ "autoDeploy": true, "writeQuorum": “majority”, "newNodeStrategy": “static”, "clusters": { "*": { "servers": ["<NEW_NODE>"] } } }
![Page 20: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/20.jpg)
Copyright (c) - OrientDB LTD 20
Custom Setting per Cluster
cluster “customer” extends the default configuration with a custom writeQuorum
{ "autoDeploy": true, "writeQuorum": “majority”, "newNodeStrategy": “static”, "clusters": { "*": { "servers": ["<NEW_NODE>"] }, "customer": { "writeQuorum": 3, "servers": ["<NEW_NODE>"] } } }
![Page 21: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/21.jpg)
Copyright (c) - OrientDB LTD
What about scalability?
21
![Page 22: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/22.jpg)
Copyright (c) - OrientDB LTD 22
Performance with Reads
2222222222
Master Node
Master Node
Master Node
C
10,000 req/sec
C C
10,000 req/sec
10,000 req/sec
![Page 23: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/23.jpg)
Copyright (c) - OrientDB LTD
Full replication provides linear scalability on reads,
but what about scalability on writes?
23
![Page 24: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/24.jpg)
Copyright (c) - OrientDB LTD 24
Performance with Writes
2424242424
Master Node
C
12,000 req/sec
C C
![Page 25: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/25.jpg)
Copyright (c) - OrientDB LTD 25
Performance with Writes
2525252525
Master Node
Master Node
C
7000 req/sec
C C
7000 req/sec
![Page 26: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/26.jpg)
Copyright (c) - OrientDB LTD 26
Performance with Writes
2626262626
Master Node
Master Node
Master Node
C
5000 req/sec
C C
5000 req/sec
5000 req/sec
The more replicas you
configure, the more the propagation cost
will impact performance, based
on writeQuorum
![Page 27: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/27.jpg)
Copyright (c) - OrientDB LTD
In order to scale up writes, you need
sharing + replication
We used a solution similar to RAID for Hard Drives
27
![Page 28: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/28.jpg)
Copyright (c) - OrientDB LTD
Sharding in OrientDB v2.2
28
![Page 29: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/29.jpg)
Copyright (c) - OrientDB LTD 29
Assign 1 Cluster per Node
2929
customer_usa customer_europe customer_china
Master Node usa
Master Node europe
Master Node china
Customer
![Page 30: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/30.jpg)
Copyright (c) - OrientDB LTD
Master Node usa
Master Node china
Master Node europe
30
RAID for Databases
303030
customer_usa customer_europe customer_china
Customer
customer_europecustomer_usacustomer_china
Replica factor = 2
![Page 31: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/31.jpg)
Copyright (c) - OrientDB LTD
Master Node usa
Master Node china
Master Node europe
31
Traversal with Spark (Pregel)
313131
C
https://spark.apache.org/docs/latest/graphx-programming-guide.html#pregel-api
![Page 32: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/32.jpg)
Copyright (c) - OrientDB LTD 32
Sharding ConfigurationLEGEND: X = Owner, o = Copy +---------------+-----------+----------+-------+-------+-------+ | | | |MASTER |MASTER |MASTER | | | | |ONLINE |ONLINE |ONLINE | +---------------+-----------+----------+-------+-------+-------+ |CLUSTER |writeQuorum|readQuorum| usa |europe |china | +---------------+-----------+----------+-------+-------+-------+ |* | 2 | 1 | X | | | |customer_usa | 2 | 1 | X | | o | |customer_europe| 2 | 1 | o | X | | |customer_china | 2 | 1 | | o | X | +---------------+-----------+----------+-------+-------+-------+
![Page 33: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/33.jpg)
Copyright (c) - OrientDB LTD 33
Sharding Configuration
first node in list is the “owner” for that
cluster
{ "clusters": { “customer_usa": { "servers": [“usa”, “china”] }, “customer_europe": { "servers": [“europe”, “usa”] }, “customer_china": { "servers": [“china”, “europe”] } } }
![Page 34: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/34.jpg)
Copyright (c) - OrientDB LTD 34
Static OwnerThe owner can be static. If the
owner is not online, new records can’t be
inserted in the cluster
{ "clusters": { "client_usa": { "owner": "usa", "servers" : [ "usa", "europe", "asia" ] } } } This assures
the owner is not assigned
dynamically
![Page 35: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/35.jpg)
Copyright (c) - OrientDB LTD 353535353535
Master Node
C
5000 req/sec
C C
Performance with Sharding
![Page 36: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/36.jpg)
Copyright (c) - OrientDB LTD 363636363636
Master Node
Master Node
C
5000 req/sec
C C
5000 req/sec
Performance with Sharding
![Page 37: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/37.jpg)
Copyright (c) - OrientDB LTD 373737373737
Master Node
Master Node
C
5000 req/sec
C C
5000 req/sec
Performance with Sharding
Master Node
5000 req/sec
Master Node
5000 req/sec
Master Node
5000 req/sec
Each write is replicated on only 2 nodes
![Page 38: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/38.jpg)
Copyright (c) - OrientDB LTD 38
Master Node
Master Node
CC
C
CC
C Master Node
CC
C
Master Node
CC
C
Master Node
CC
C
Master Node
CC
C
Master Node
C
C
Linear and Elastic Scalability on both Read & Writes!
![Page 39: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/39.jpg)
Copyright (c) - OrientDB LTD 39
Replica only Nodes
3939
Replica NodeMaster
Node usa
Master Node usa
Master Node usa
Replica NodeReplica NodeReplica NodeReplica NodeReplica NodeReplica NodeReplica NodeReplica NodeReplica NodeReplica NodeReplica Node
Replica servers don’t concur in writeQuorum
![Page 40: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/40.jpg)
Copyright (c) - OrientDB LTD 40
Server Role Configuration
“*” by default each node is a master.
Following nodes are replica only
{ "servers": { "*": “master”, “usa_r1": “replica", “usa_r2": “replica", “usa_r3": “replica", “europe_r1": “replica", “china1_r1": "replica" } }
![Page 41: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/41.jpg)
Copyright (c) - OrientDB LTD 41
Master Node
Master Node
C
Load-Balancing on Client Side
Master Node
![Page 42: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/42.jpg)
Copyright (c) - OrientDB LTD 42
Load-Balancing Configuration
final OrientGraphFactory factory = new OrientGraphFactory(“remote:localhost/demo");
factory.setConnectionStrategy( OStorageRemote.CONNECTION_STRATEGY.ROUND_ROBIN_CONNECT);
OrientGraph graph = factory.getTx();
Available strategies: - STICKY, - ROUND_ROBIN_CONNECT, - ROUND_ROBIN_REQUEST
![Page 43: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/43.jpg)
Copyright (c) - OrientDB LTD 43
What if some records are not aligned
for **ANY** reason?
![Page 44: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/44.jpg)
Copyright (c) - OrientDB LTD 44
OrientDB Auto-Repairer
- Executed in chain - It works in batch of 50 records per time (configurable) - Strategies: - Quorum: checks if it meets the configured write quorum - Content: checks the content - Majority: checks if at least there is a majority - Version: gets the higher version - DC (EE only). Example: dc{winner:asia}
![Page 45: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/45.jpg)
Copyright (c) - OrientDB LTD 45
Master Node
A
Master Node
B
Auto-Repair Flow
(1) tx read [#10:33,#43:90,#12:23]
(3) return [{#10:33 v1},{#43:90 v5},{#12:23 v9}]
(2) lock [#10:33,#43:90,#12:
23](4) fix [{#43:90 v6}]
![Page 46: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/46.jpg)
Copyright (c) - OrientDB LTD 46
Dynamic Timeouts (from v2.2.18)
orientdb> HA STATUS -latency -output=text
REPLICATION LATENCY AVERAGE (in milliseconds) +-------+-----+------+-----+ |Servers|node1|node2*|node3| +-------+-----+------+-----+ |node1 | | 0.60| 0.43| |node2* | 0.39| | 0.38| |node3 | 0.35| 0.53| | +-------+-----+------+-----+
![Page 47: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/47.jpg)
Copyright (c) - OrientDB LTD 47
Cross Datacenter & Cloud
Dublin
Austin
Cross Data Centre and Cloud replication. Data
Centers don’t need to have the same number
of servers
![Page 48: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/48.jpg)
Copyright (c) - OrientDB LTD 48
Data-Center (Enterprise only)"dataCenters": { "rome": { "writeQuorum": "all", "servers": [ "europe-0", "europe-1", "europe-2" ] }, "austin": { "writeQuorum": "all", "servers": [ "usa-0", "usa-1", "usa-2" ] } }
![Page 49: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/49.jpg)
Copyright (c) - OrientDB LTD 49
Performance
![Page 50: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/50.jpg)
Copyright (c) - OrientDB LTD 50
Yahoo Benchmark 3 Nodes
OrientDB is 2x-3x faster than Cassandra on the
same HW/SW configuration, same workload
Ops/sec
![Page 51: Scale Out Your Graph Across Servers and Clouds with OrientDB](https://reader031.vdocument.in/reader031/viewer/2022013013/5a6484827f8b9a5d568b49ed/html5/thumbnails/51.jpg)
Copyright (c) - OrientDB LTD 51
OrientDB v3.1 (Q1 2018)- DHT-like algorithm = Distributed Sharded index
- Native support for batch-traversal (Pregel-like) without Spark
- Background refactoring of the Graph based on usage statistics