for the social graph -  · center for research in intelligent storage 1. complex distributed write...

29
Center for Research in Intelligent Storage Authors: Nathan Bronson, et. al Presentors: Zhichao Cao, Fenggang Wu 10/01/2018 TAO: Facebook’s Distributed Data Store for the Social Graph

Upload: others

Post on 21-May-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage

Authors: Nathan Bronson, et. al

Presentors: Zhichao Cao, Fenggang Wu

10/01/2018

TAO: Facebook’s Distributed Data Store

for the Social Graph

Page 2: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage Source: https://www.usenix.org/node/174510

Page 3: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage Source: https://www.usenix.org/node/174510

Page 4: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage Source: https://www.usenix.org/node/174510

Page 5: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage Source: https://www.usenix.org/node/174510

Page 6: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage

Stage 1: Relational Database Supported

MySQL

WebServers

……

1. Low write latency

2. High read latency, difficult

to achieve load balancing

3. Low scalability

Page 7: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage

Stage 2: Cache Supported (memcache)

Source: https://www.usenix.org/node/174510

Page 8: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage

Stage 2: Cache Supported (cache in between)

MySQL

WebServers

……

1. Higher write latency (clean

the cache)

2. Lower read latency

3. Higher scalability

Cache Layer

Page 9: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage

Stage 3: TAO (The Associations and Objects)

Source: https://www.usenix.org/node/174510

Page 10: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage

TAO Design 1: One Tier with Sharding

Source: https://www.usenix.org/node/174510

Page 11: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage

TAO Design 2: Multiple Caching Tiers

Source: https://www.usenix.org/node/174510

Page 12: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage

1. Complex distributed write

logic to handle to ensure

data correctness,

consistency

2. A lot of concurrent read

or write go to the

MySQL server, for the

same data

3. Make tradeoffs between

limiting the maximum

tier size and scaling the

cache

Source: https://www.usenix.org/node/174510

Page 13: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage Source: https://www.usenix.org/node/174510

Page 14: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage Source: https://www.usenix.org/node/174510

Page 15: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage Source: https://www.usenix.org/node/174510

Page 16: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage Source: https://www.usenix.org/node/174510

Page 17: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage

Master regionSlave region 1

Slave region 2

Master DBSlave DB

Slave DB

How to Scale out

Page 18: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage Source: https://www.usenix.org/node/174510

Page 19: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage Source: https://www.usenix.org/node/174510

Page 20: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage Source: https://www.usenix.org/node/174510

Page 21: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage Source: https://www.usenix.org/node/174510

Page 22: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage

MySQL Architecture

Source: https://techsoftcompute.wordpress.com/2013/08/19/mysql-internal-architecture/

Page 23: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage Source: http://data-base-recovery.blogspot.com/2015/10/storage-engines-in-mysql.html

Page 24: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage Source: https://www.slideshare.net/morgo/inno-db-presentation

Page 25: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage

Page 26: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage

RocksDB

MySQL Architecture in FaceBook

Source: https://techsoftcompute.wordpress.com/2013/08/19/mysql-internal-architecture/

Page 27: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage Source: https://www.slideshare.net/meeeejin/rocksdb-detail

Page 28: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage Source: https://www.usenix.org/node/174510

Page 29: for the Social Graph -  · Center for Research in Intelligent Storage 1. Complex distributed write logic to handle to ensure data correctness, consistency 2.A lot of concurrent read

Center for Research in Intelligent Storage

Thanks

Q/A