Quartet: Harmonizing task scheduling and caching for cluster computing Francis Deslauriers, Peter McCormick, George Amvrosiadis, Ashvin Goel & Angela Demke Brown June 23, 2016

TRANSCRIPT

Page 1

Quartet: Harmonizing task scheduling and caching for cluster computing

Francis Deslauriers, Peter McCormick, George Amvrosiadis, Ashvin Goel &

Angela Demke Brown

June 23, 2016

Page 2

Analyses for the masses

• Data collection is cheap, so datasets are growing exponentially

• Cluster computing makes it easy to analyze these datasets, enabling:
  ◦ Queries on entire datasets
  ◦ Many analysts running queries on the same corpus
  ◦ Tuning queries

2 of 18

Page 3

Data is often re-accessed

• The result is many queries running on the same large datasets
• This leads to significant data reuse

[Figure: File access frequency vs. input file rank by descending access frequency (log-log scale) for Facebook and Cloudera customer traces CC-b, CC-c, CC-d, CC-e, and FB-2010 [VLDB'12]; CDF of path accesses vs. cumulative distribution of input paths for the CMU academic clusters OpenCloud, M45, and WebMining [VLDB'13]]

3 of 18

Page 5

Data reuse does not help

• We should expect data reuse to improve performance due to caching

• We find that jobs do not see the benefits of reuse

4 of 18

Page 6

Example: Repeating a Hadoop job

5 of 18

Page 13

Missed Opportunities

• Working sets don’t fit in the page cache

• Jobs consume data independently of one another

6 of 18

Page 14

Solution

• Key idea:
  ◦ Reorder work to consume cached data first
  ◦ Jobs are made of small tasks with no ordering requirements

• Challenges:
  ◦ Cache visibility: jobs need to know what data is cached on the different nodes
  ◦ Task reordering: jobs need to reorder their tasks based on this knowledge

• Our Quartet system addresses both these challenges
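The key idea above can be sketched in a few lines. This is an illustrative sketch, not Quartet's actual code: the `Task` class and `cached_blocks` set are hypothetical stand-ins for a job's task queue and the residency information Quartet provides. Because tasks have no ordering requirements, a stable sort that moves cached inputs to the front is all the reordering needs.

```python
# Hypothetical sketch of Quartet's key idea: run tasks whose input
# blocks are already resident in some node's page cache first.
# Task and cached_blocks are illustrative names, not Quartet's API.
from dataclasses import dataclass

@dataclass
class Task:
    task_id: int
    block_id: str  # HDFS block holding this task's input split

def reorder_tasks(tasks, cached_blocks):
    """Return tasks with cached inputs first; relative order otherwise kept."""
    # Stable sort: key False (cached) sorts before True (not cached).
    return sorted(tasks, key=lambda t: t.block_id not in cached_blocks)

tasks = [Task(0, "blk_1"), Task(1, "blk_2"), Task(2, "blk_3")]
ordered = reorder_tasks(tasks, cached_blocks={"blk_3"})
print([t.task_id for t in ordered])  # [2, 0, 1]
```

Because the sort is stable, tasks with uncached inputs keep their original relative order, so the reordering never starves any part of the job.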

7 of 18

Page 15

Challenge 1: Cache Visibility

• Datanodes collect information about HDFS blocks that reside in memory

  ◦ Requires the Duet kernel module, which informs applications when pages are cached or evicted

• Nodes periodically send this information to the Quartet Manager

  ◦ Changes to the number of resident pages of each block
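The datanode side of this design can be sketched as follows. This is a simplified model, not Duet's real interface: `BlockResidencyWatcher` and its method names are hypothetical. The point is that page-level cache events are aggregated into per-block resident-page counts, and only the counts that changed since the last report are sent to the manager.

```python
# Illustrative sketch (not Duet's real interface): a datanode-side
# watcher aggregates page-cache events into per-block resident-page
# counts and periodically reports only the deltas to the manager.
from collections import defaultdict

class BlockResidencyWatcher:
    def __init__(self):
        self.resident = defaultdict(int)  # block_id -> resident pages
        self.reported = {}                # counts as of the last report

    def on_page_event(self, block_id, cached):
        # Called when a page of this block enters or leaves the page cache.
        self.resident[block_id] += 1 if cached else -1

    def deltas(self):
        # Blocks whose resident count changed since the last report;
        # this is what gets sent periodically to the Quartet Manager.
        changed = {b: n for b, n in self.resident.items()
                   if self.reported.get(b) != n}
        self.reported.update(changed)
        return changed

w = BlockResidencyWatcher()
w.on_page_event("blk_1", cached=True)
w.on_page_event("blk_1", cached=True)
w.on_page_event("blk_2", cached=True)
print(w.deltas())  # {'blk_1': 2, 'blk_2': 1}
print(w.deltas())  # {} (nothing changed since the last report)
```

Reporting deltas rather than full state keeps the periodic messages small even when many blocks are resident.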

8 of 18

Page 16

Challenge 2: Task Reordering

1. The Application Master registers blocks of interest with the Quartet manager

2. The Quartet manager informs the Application Master about cached blocks

3. The Application Master prioritizes and places tasks based on block residency information
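The three steps above can be sketched as a small publish/subscribe exchange. All names here (`QuartetManager`, `AppMaster`, the callback shape) are made up for illustration; the real protocol runs between YARN Application Masters and the Quartet manager over the network.

```python
# Hypothetical sketch of the three-step protocol, with made-up names:
# the Application Master registers the blocks it cares about, and the
# manager notifies it when any of them becomes resident in a cache.
class QuartetManager:
    def __init__(self):
        self.interest = {}  # block_id -> list of callbacks

    def register(self, block_ids, callback):          # step 1
        for b in block_ids:
            self.interest.setdefault(b, []).append(callback)

    def on_residency_update(self, block_id, pages):   # from datanode watchers
        for cb in self.interest.get(block_id, []):    # step 2
            cb(block_id, pages)

class AppMaster:
    def __init__(self):
        self.cached = set()

    def on_cached(self, block_id, pages):
        if pages > 0:
            self.cached.add(block_id)  # step 3: feeds task prioritization

mgr = QuartetManager()
am = AppMaster()
mgr.register(["blk_1", "blk_2"], am.on_cached)
mgr.on_residency_update("blk_1", pages=512)
print(sorted(am.cached))  # ['blk_1']
```

Registering interest up front means the manager only pushes notifications for blocks some job actually wants, rather than broadcasting every cache change.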

9 of 18

Page 17

Example: Repeating a Hadoop job

10 of 18

Page 22

Experiments

• Spark and Hadoop implementations

• 24 nodes with a total of 384 GB of memory

• Different input sizes:
  ◦ Smaller than physical memory (256 GB)
  ◦ Slightly larger than memory (512 GB)
  ◦ Approximately 3 times memory (1024 GB)

• 3 replicas per block

11 of 18

Page 23

Results - Cache Hit Rate of the second job

[Bar chart: cache hit rate (%) of the second job, by job size, for Hadoop and Spark, Baseline vs. Quartet. Data labels from the chart:

                Baseline   Quartet
  Hadoop 256GB    112GB     240GB
  Hadoop 512GB     22GB     262GB
  Hadoop 1024GB     1GB     249GB
  Spark  256GB    107GB     251GB
  Spark  512GB      3GB     287GB
  Spark  1024GB     0GB     284GB]

12 of 18

Page 24

Results - Runtime reduction of the second job

[Bar chart: normalized runtime of the second job (%), by job size (256GB, 512GB, 1024GB), for Hadoop and Spark, Baseline vs. Quartet]

13 of 18

Page 25

Conclusions

• Observation: workloads show a significant amount of data reuse

• Problem: Jobs are unable to take advantage of this reuse

• Solution:
  ◦ Add visibility into what is cached on each of the cluster nodes
  ◦ Reorder tasks to take advantage of this cached data

• Future work: More realistic workloads and scalability

14 of 18

Page 26

Conclusion

Thank you!

15 of 18

Page 27

Conclusion

Questions?

16 of 18

Page 28

Network Overhead

• Watcher updates are aggregated per HDFS block (128-256 MB)

• The upper bound is the storage bandwidth

• Manager notification traffic is proportional to the input size and the hardware
  ◦ 10-100 KB/s in our experiments

17 of 18

Page 29

Related Work

HDFS Cache Manager

• Requires manual changes when data popularity changes

• Can’t be used when input is larger than memory

PACMan

• Avoids stragglers in a single wave of mappers

• Modifies the cache eviction policy to ensure that entire computation stages have memory locality

18 of 18