the performance analysis of cache architecture based on...

15
The Performance Analysis of Cache Architecture based on Alluxio over Virtualized Infrastructure Xu Chang, Li Zha 1

Upload: others

Post on 22-Jul-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: The Performance Analysis of Cache Architecture based on ...web.cse.ohio-state.edu/~lu.932/hpbdc2018/slides/xu-slides.pdf · based on Alluxio over Virtualized Infrastructure Xu Chang,

The Performance Analysis of Cache Architecture

based on Alluxio over Virtualized Infrastructure

Xu Chang, Li Zha

1

Page 2: The Performance Analysis of Cache Architecture based on ...web.cse.ohio-state.edu/~lu.932/hpbdc2018/slides/xu-slides.pdf · based on Alluxio over Virtualized Infrastructure Xu Chang,

•  Background •  Related Works •  Motivation •  Experiments •  Results •  Conclusion •  Future Work

Contents

Page 3: The Performance Analysis of Cache Architecture based on ...web.cse.ohio-state.edu/~lu.932/hpbdc2018/slides/xu-slides.pdf · based on Alluxio over Virtualized Infrastructure Xu Chang,

Background

•  Cloud Computing – Computing as a service – Application of resources on demand and payment

on demand •  Virtualization –  Integrates and encapsulates the resources – Provide the resource in piece – Transparent to users

Page 4: The Performance Analysis of Cache Architecture based on ...web.cse.ohio-state.edu/~lu.932/hpbdc2018/slides/xu-slides.pdf · based on Alluxio over Virtualized Infrastructure Xu Chang,

Background

Decoupling vs Traditional

Advantage: •  More flexible •  Overall cost is reduced

Shortcoming: •  Performance decline

Compute NodeData Node

Compute NodeData Node

Compute NodeData Node

Traditional Architecture

Decoupling architecture of computing and storage

Compute cluster

Compute Node

Compute Node

Compute Node

Data Center (Object Storage)

Data NodeData NodeData Node

Page 5: The Performance Analysis of Cache Architecture based on ...web.cse.ohio-state.edu/~lu.932/hpbdc2018/slides/xu-slides.pdf · based on Alluxio over Virtualized Infrastructure Xu Chang,

Related Works

For making up the loss of performance •  Traditional optimization method – Speed up the shuffle part of jobs with SSDs –  [kambatla2014truth] [ruan2017improving]

•  Reduce the frequency of accessing the object storage – Construct the cache layer between applications and

object storage –  [shankar2017performance] [qureshi2014cache]

Page 6: The Performance Analysis of Cache Architecture based on ...web.cse.ohio-state.edu/~lu.932/hpbdc2018/slides/xu-slides.pdf · based on Alluxio over Virtualized Infrastructure Xu Chang,

Related Works

Alluxio (Tachyon) •  The world’s first memory speed virtual distributed storage

system •  Resides between computation frameworks and storage systems

Source:https://www.alluxio.org/

Page 7: The Performance Analysis of Cache Architecture based on ...web.cse.ohio-state.edu/~lu.932/hpbdc2018/slides/xu-slides.pdf · based on Alluxio over Virtualized Infrastructure Xu Chang,

Motivation

•  Only concern about performance, do not care about cost

•  Cost reduction is critical •  Question: – How to design the caching architecture to make the

cost performance highest?

Page 8: The Performance Analysis of Cache Architecture based on ...web.cse.ohio-state.edu/~lu.932/hpbdc2018/slides/xu-slides.pdf · based on Alluxio over Virtualized Infrastructure Xu Chang,

Experiments

System architecture

CloudStorage

MapReduce

Alluxio

MapReduce

Alluxio

MapReduce

Alluxio

Source:https://www.alluxio.org/

Page 9: The Performance Analysis of Cache Architecture based on ...web.cse.ohio-state.edu/~lu.932/hpbdc2018/slides/xu-slides.pdf · based on Alluxio over Virtualized Infrastructure Xu Chang,

Experiments

Experimental environment

Experiment 1: Platform: AWS Servers: m3.2xlarge * 4 Object storage: S3

Experiment 2: Platform: G-Cloud Servers: 8 cores & 30G memory * 4 Object storage: Ceph

Page 10: The Performance Analysis of Cache Architecture based on ...web.cse.ohio-state.edu/~lu.932/hpbdc2018/slides/xu-slides.pdf · based on Alluxio over Virtualized Infrastructure Xu Chang,

Experiments

Experimental scheme •  Experiment 1: – Workload: Terasort * 6

•  Experiment 2: – Workload: Hive-Join * 3

•  Data Size: 120G •  Cost ratio of memory to SSD Memory :

SSD 8:0 7:1 5:3 3:5 1:7 0:8

Page 11: The Performance Analysis of Cache Architecture based on ...web.cse.ohio-state.edu/~lu.932/hpbdc2018/slides/xu-slides.pdf · based on Alluxio over Virtualized Infrastructure Xu Chang,

Results

Experimental 1:

76.0078.0080.0082.0084.0086.0088.0090.0092.00

Throughp

ut(M

B/s)

Performance

0.000.501.001.502.002.503.003.504.004.505.00

100%MEM 87.5%MEM12.5%SSD

62.5%MEM37.5%SSD

37.5%MEM62.5%SSD

12.5%MEM87.5%SSD

100%SSD

COSTPER

FORM

ANCE

CostPerformance

Page 12: The Performance Analysis of Cache Architecture based on ...web.cse.ohio-state.edu/~lu.932/hpbdc2018/slides/xu-slides.pdf · based on Alluxio over Virtualized Infrastructure Xu Chang,

Results

Experimental 2:

175180185190195200205210

Throughp

ut(M

B/s)

Performance

0

0.5

1

1.5

2

2.5

3

100%MEM 87.5%MEM12.5%SSD

62.5%MEM37.5%SSD

37.5%MEM62.5%SSD

12.5%MEM87.5%SSD

100%SSD

COSTPER

FORM

ANCE

CostPerformance

Page 13: The Performance Analysis of Cache Architecture based on ...web.cse.ohio-state.edu/~lu.932/hpbdc2018/slides/xu-slides.pdf · based on Alluxio over Virtualized Infrastructure Xu Chang,

Conclusion

•  Hybrid cache architecture is recommended.

•  For the workload with large size of output and small size of hot data, the cost ratio of memory to SSD in cache should be around 1:7

•  For the workload with small size of output and large size of hot data, the cost ratio of memory to SSD in cache should be around 5:3

Page 14: The Performance Analysis of Cache Architecture based on ...web.cse.ohio-state.edu/~lu.932/hpbdc2018/slides/xu-slides.pdf · based on Alluxio over Virtualized Infrastructure Xu Chang,

Future Work

•  Study several aspects that affect the cost performance, and try to give a configuration scheme with the best cost performance

•  Increase workload types and application scenarios, so that the conclusion is closer to the real scene and has generality

Page 15: The Performance Analysis of Cache Architecture based on ...web.cse.ohio-state.edu/~lu.932/hpbdc2018/slides/xu-slides.pdf · based on Alluxio over Virtualized Infrastructure Xu Chang,

Q & A

Thanks!