FAWN: A Fast Array of Wimpy Nodes Presented by: Clint Sbisa & Irene Haque

Post on 21-Dec-2015




Motivation

Large-scale data-intensive applications
    Facebook, LinkedIn, Dynamo
CPU-I/O Gap
    storage, network and memory bottlenecks
    low CPU utilization
CPU Power
    slower CPUs execute more queries per second per Watt
    1 billion vs. 100 million instructions per Joule
    inefficient energy saving techniques
Memory Power


FAWN

Data-intensive, computationally simple workloads
Small objects: 100 B to 1 KB
Cluster of embedded CPUs using flash storage
    Efficient
    Fast random reads
    Slow random writes
FAWN-KV
    Key-value storage
    Consistent hashing
FAWN-DS
    Data store
    Log structured


FAWN - DS

Log-structured key-value store
Contains all values in a key range for each virtual ID
Maps 160-bit key
    Hash Index bucket = i low order index bits
    key fragment = next 15 low order bits
6-byte in-memory Hash Index entry stores the key fragment and a pointer into the log
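The key mapping on this slide can be sketched in a few lines. This is an illustration only: the value of i (`INDEX_BITS = 16`), the use of SHA-1 to produce a 160-bit key, and the function names are assumptions, not details taken from the slides.

```python
import hashlib

INDEX_BITS = 16      # "i" on the slide; 16 is an assumed value for illustration
FRAGMENT_BITS = 15   # key fragment width stated on the slide


def key_to_int(raw_key: bytes) -> int:
    """Hash an application key to a 160-bit integer (SHA-1 is an assumption)."""
    return int.from_bytes(hashlib.sha1(raw_key).digest(), "big")


def split_key(key160: int, index_bits: int = INDEX_BITS):
    """Return (bucket, fragment) taken from the low-order bits of the key."""
    # The i lowest bits select the Hash Index bucket.
    bucket = key160 & ((1 << index_bits) - 1)
    # The next 15 bits are kept in memory as the key fragment.
    fragment = (key160 >> index_bits) & ((1 << FRAGMENT_BITS) - 1)
    return bucket, fragment
```

For example, `split_key((5 << 2) | 3, index_bits=2)` returns `(3, 5)`: the two lowest bits give bucket 3 and the bits above them give fragment 5.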


FAWN - DS

Basic Functions:
    Store
    Lookup
    Delete

Concurrent operations

Virtual Node Maintenance:
    Split
    Merge
    Compact
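A minimal sketch of how Store, Lookup, Delete, and Compact could behave on an append-only log. It assumes a Python list as the log and a dict as the in-memory index; the class and method names are hypothetical, and Split and Merge are left out.

```python
class LogStore:
    """Append-only log plus an in-memory index (a dict stands in for the hash index)."""

    def __init__(self):
        self.log = []      # list of (key, value) entries; value None marks a delete
        self.index = {}    # key -> offset of the newest live entry in the log

    def store(self, key, value):
        # Writes are always appended, never applied in place.
        self.log.append((key, value))
        self.index[key] = len(self.log) - 1

    def lookup(self, key):
        offset = self.index.get(key)
        if offset is None:
            return None
        return self.log[offset][1]

    def delete(self, key):
        # Append a delete entry and drop the key from the index.
        if key in self.index:
            self.log.append((key, None))
            del self.index[key]

    def compact(self):
        # Rewrite the log, keeping only entries the index still points at.
        new_log, new_index = [], {}
        for offset, (key, value) in enumerate(self.log):
            if self.index.get(key) == offset:
                new_index[key] = len(new_log)
                new_log.append((key, value))
        self.log, self.index = new_log, new_index
```

Overwrites and deletes leave stale entries in the log, which is why a compaction step is needed to reclaim space.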


FAWN - KV

Consistent hashing of back-end VIDs
Management node
    assigns each front-end to circular key space
Front-end nodes
    manages its key space
    forwards out-of-range requests
Back-end nodes - VIDs
    contacts front-end when joining
    owns a key range
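Consistent hashing of VIDs onto a circular key space can be sketched as a sorted list of ring positions with a successor lookup. The `Ring` class, the VID names, and the choice of SHA-1 for ring positions are assumptions made for this example.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Place a VID name or a key on the ring (SHA-1 is an assumed hash)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


class Ring:
    def __init__(self, vids):
        # Sort VIDs by their position on the circular key space.
        self.points = sorted((ring_position(v), v) for v in vids)
        self.positions = [p for p, _ in self.points]

    def owner(self, key: str) -> str:
        """First VID at or clockwise after the key's position, wrapping around."""
        i = bisect.bisect_left(self.positions, ring_position(key))
        return self.points[i % len(self.points)][1]
```

A front-end holding such a ring for its own slice of the key space could call `owner(key)` to pick the back-end VID for a request.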


FAWN - KV

Chain replication
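In chain replication, a write enters at the head of a chain of replicas, is passed from replica to replica, and is acknowledged by the tail, while reads are served by the tail. A minimal in-process sketch, with hypothetical class and function names and no failure handling:

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.successor = None   # next replica in the chain, None at the tail

    def put(self, key, value):
        """Apply the write locally, then pass it down the chain.

        Returns the name of the replica that acknowledged the write (the tail).
        """
        self.data[key] = value
        if self.successor is None:
            return self.name
        return self.successor.put(key, value)

    def get(self, key):
        # Intended to be called on the tail, which has seen every acknowledged write.
        return self.data.get(key)


def build_chain(names):
    """Link replicas in order and return (head, tail)."""
    replicas = [Replica(n) for n in names]
    for a, b in zip(replicas, replicas[1:]):
        a.successor = b
    return replicas[0], replicas[-1]
```

Because the acknowledgement comes from the tail, any value a client reads from the tail has already reached every replica in the chain.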


FAWN - KV

Join
    split key range
    pre-copy
    chain insertion
    log flush
Leave
    merge key range
    Join into each chain
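The "split key range" step of a join can be illustrated as follows. The sketch assumes ring positions are plain integers and that a joining node takes over the interval between its predecessor and itself; `in_range` and `split_range` are hypothetical helpers, and pre-copy, chain insertion, and log flush are not modeled.

```python
def in_range(pos, start, end):
    """True if pos lies in the ring interval (start, end], with wraparound."""
    if start < end:
        return start < pos <= end
    return pos > start or pos <= end


def split_range(owner_data, pred_pos, new_pos):
    """Split the current owner's data when a node joins at new_pos.

    Entries in (pred_pos, new_pos] move to the new node; the rest stay.
    """
    moved = {p: v for p, v in owner_data.items() if in_range(p, pred_pos, new_pos)}
    kept = {p: v for p, v in owner_data.items() if p not in moved}
    return kept, moved
```

A leave would be the reverse operation: the departing node's range is merged back into its successor's range.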


Individual Node Performance

• Lookup speed

• Bulk store speed: 23.2 MB/s, or 96% of raw speed


Individual Node Performance

• Put speed

• Compared to BerkeleyDB: 0.07 MB/s – shows necessity of log-based filesystems


Individual Node Performance

• Read- and write-intensive workloads


System Benchmarks

• System throughput and power consumption


Impact of Ring Membership Changes

• Query throughput during node join and maintenance operations


Impact of Ring Membership Changes

• Query latency


Alternative Architectures

• Large Dataset, Low Query Rate → FAWN+Disk

• Small Dataset, High Query Rate → FAWN+DRAM

• Middle Range → FAWN+SSD


Conclusion

• Fast and energy efficient processing of random read-intensive workloads

• Over an order of magnitude more queries per Joule than traditional disk-based systems
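The queries-per-Joule metric is throughput divided by power draw, since one Watt is one Joule per second. A small helper makes the unit relationship explicit; the function name is hypothetical, and any numbers passed to it are the caller's own measurements, not results from the slides.

```python
def queries_per_joule(queries_per_second: float, watts: float) -> float:
    """Energy efficiency: (queries / s) / (Joules / s) = queries / Joule."""
    if watts <= 0:
        raise ValueError("power draw must be positive")
    return queries_per_second / watts
```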