
The Role of Container Technology in Reproducible Computer Systems Research

Ivo Jimenez, Carlos Maltzahn (UCSC)
Adam Moody, Kathryn Mohror (LLNL)
Jay Lofstead (Sandia)
Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau (UW)

Why Care About Reproducibility

•  Research can be theoretical, experimental, computational, or data-intensive.
•  Reproducibility is well established for the first two, but remains impractically hard for the last two.
•  This has a negative impact on science, engineering, and education.

Status Quo of Reproducibility

[Diagram: an experiment's code and data sit on top of the libraries, the OS, and the hardware of the machine where it ran.]

Sharing Code Is Not Enough


Results Rely on Complete Context


Potential Solution: Containers

•  Can containers reproduce any experiment?
   – Taxonomize CS experiments.
   – Determine which ones are challenging.
•  What is container technology missing?
   – Answer empirically by reproducing an already-published experiment.
•  Delineate missing components.
   – Based on lessons learned, define characteristics that enhance reproducibility capabilities.


Effects of Containerizing Experiments

[Diagram: the experiment's code, data, libraries, and OS are packaged inside a container; only the hardware remains outside.]

Does it work for any experiment?

✓ Analyze output data.
✓ Evaluate analytic models.
✓ Handle small amounts of data.
✗ Depend on special hardware.
✗ Observe performance metrics.

Experiments in systems research

•  Runtime.
•  Throughput.
•  Latency.
•  Caching effects.
•  Performance model.
•  etc.

In a sample of 100 papers from 10 distinct venues spanning 5 years, ~80% have one or more experiments measuring runtime.

[Diagram: the stack of code, data, libs, and OS on hardware, and the same stack packaged inside a container.]

Ceph OSDI ‘06

•  Select the scalability experiment.
   – Distributed; makes use of all resources.
•  Scaled-down version of the original.
   – 1 client instead of 20.
•  Implement the experiment in containers (see the sketch below).
   – Docker 1.3 and LXC 1.0.6.
•  Experiment goal: the system scales linearly.
   – This is the reproducibility criterion.
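The deck does not show how the containers were launched. As a rough, hypothetical illustration of one data point of the scaled-down setup, the cluster could be scripted along these lines; the image name "ceph-osdi06" and the in-container commands are placeholders, not the authors' actual artifacts:

    # Hypothetical sketch of one data point of the scaled-down run: a cluster of
    # N OSD containers plus a single client container (the original used 20 clients).
    import subprocess

    N = 4                   # OSD cluster size; the experiment sweeps 1..13
    IMAGE = "ceph-osdi06"   # hypothetical image holding the Ceph OSDI '06 build

    def docker(*args):
        subprocess.check_call(["docker", *args])

    for i in range(N):
        # one detached container per OSD
        docker("run", "-d", "--name", f"osd{i}", IMAGE, "osd", str(i))

    # the client drives the write workload and reports aggregate throughput
    docker("run", "--rm", "--name", "client", IMAGE, "client")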

[Figure: throughput (MB/s, 20-140) vs. OSD cluster size (1-13), comparing the Original and Reproduced runs; the reproduced run exhibits non-scalable behavior.]

Repeatability Problems

1.  High variability in old disk drives.
    – Causes the cluster to be unbalanced.
2.  The paper assumes uniform behavior.
    – The original author (Sage Weil) had to filter disks out in order to get stable behavior.

Solution: throttle I/O to get uniform raw-disk performance, with 30 MB/s as the lowest common denominator (sketched below).
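The slides do not show the throttling mechanism itself. One way to impose such a cap is the cgroup-v1 blkio controller; LXC, for example, exposes these knobs through its lxc.cgroup.* configuration keys. A minimal sketch, in which the cgroup path and device numbers are examples for a typical host:

    # Sketch: cap raw-disk bandwidth at 30 MB/s using the cgroup-v1 blkio controller.
    # Paths and device numbers (8:0 = /dev/sda here) are examples only.
    import os

    CGROUP = "/sys/fs/cgroup/blkio/ceph-osd0"   # example cgroup for one OSD container
    DEVICE = "8:0"                              # major:minor of the throttled disk
    LIMIT = 30 * 1024 * 1024                    # 30 MB/s, the lowest common denominator

    os.makedirs(CGROUP, exist_ok=True)
    for knob in ("blkio.throttle.read_bps_device", "blkio.throttle.write_bps_device"):
        with open(os.path.join(CGROUP, knob), "w") as f:
            f.write(f"{DEVICE} {LIMIT}\n")

    # Any process whose PID is added to this cgroup's "tasks" file is now throttled.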

Ceph OSDI ’06 (throttled I/O)

[Figure: throughput (MB/s, 20-140) vs. OSD cluster size (1-13) for the reproduced experiment with throttled I/O.]

Lessons

1.  The resource-management features of containers make it easier to control sources of noise.
    – I/O and network bandwidth, CPU allocation, amount of available memory, etc.
2.  Information that is not in the original paper but is important for reproducibility cannot be captured in container images.
    – Details about the context matter.

Container Execution Engine (LXC)

[Diagram: the application runs inside LXC, which relies on the cgroups and namespaces provided by the Linux kernel on the host.]

cgroups


Container “Virtual Machine”

host's raw performance + cgroups configuration = "virtual machine"
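For instance, pinning the experiment to a fixed set of cores and a fixed amount of memory turns the host's raw performance into a reproducible "virtual machine". A minimal sketch using the cgroup-v1 interface; the cgroup name and the limits are illustrative, and container engines expose equivalent settings:

    # Sketch: define a cgroup-based "virtual machine" with the cgroup-v1 interface.
    # The cgroup name and the limits below are illustrative values only.
    import os

    SPEC = {
        "cpuset": {"cpuset.cpus": "0-3", "cpuset.mems": "0"},     # 4 cores, NUMA node 0
        "memory": {"memory.limit_in_bytes": str(4 * 1024**3)},    # 4 GiB of RAM
    }

    for controller, knobs in SPEC.items():
        path = f"/sys/fs/cgroup/{controller}/experiment-vm"
        os.makedirs(path, exist_ok=True)
        for knob, value in knobs.items():
            with open(os.path.join(path, knob), "w") as f:
                f.write(value)

    # Launching the containerized experiment inside this cgroup (e.g. by writing
    # its PID into each controller's "tasks" file) restricts it to these resources.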

Experiment Execution Engine

[Diagram: on top of the host's Linux kernel (cgroups, namespaces), LXC runs the experiment while a monitor records its behavior into a profile repository.]

Experiment Profile

1.  Container image
2.  Platform profile
3.  Container configuration
4.  Execution profile

Platform Profile

•  Host characteristics
   – Hardware specs
   – Age of system
   – OS/BIOS conf.
   – etc.
•  Baseline behavior
   – Microbenchmarks
   – Raw performance characterization
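Much of a platform profile can be gathered from what the OS already exposes. A minimal sketch, where the field names are illustrative and the microbenchmark step is left as a placeholder:

    # Sketch: capture a (partial) platform profile of the host. Field names are
    # illustrative; a real profile would also include BIOS settings, disk specs,
    # system age, and microbenchmark results.
    import json, platform

    def read(path):
        with open(path) as f:
            return f.read()

    profile = {
        "hostname": platform.node(),
        "kernel": platform.release(),
        "arch": platform.machine(),
        "cpuinfo": read("/proc/cpuinfo"),
        "meminfo": read("/proc/meminfo"),
        # "baseline": results of raw CPU/memory/disk/network microbenchmarks
    }

    with open("platform-profile.json", "w") as f:
        json.dump(profile, f, indent=2)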

Container Configuration

•  cgroups configuration

[Diagram: the experiment container with its CPU, memory, network, and block-I/O allocations.]

Execution Profile

•  Container metrics
   – Usage statistics (overall)
   – Over time
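The per-container usage statistics are available from the cgroup accounting files. A sketch that samples a few of them while the experiment runs; the cgroup path is an example for a cgroup-v1 host:

    # Sketch: sample per-container usage counters from cgroup-v1 accounting files.
    # The cgroup path is an example; container engines create one per container.
    import time

    CG = "/sys/fs/cgroup"
    COUNTERS = {
        "cpu_ns":    f"{CG}/cpuacct/experiment-vm/cpuacct.usage",
        "mem_bytes": f"{CG}/memory/experiment-vm/memory.usage_in_bytes",
    }

    def snapshot():
        out = {"t": time.time()}
        for name, path in COUNTERS.items():
            with open(path) as f:
                out[name] = int(f.read())
        return out

    # Overall statistics come from a single snapshot at the end of the run;
    # the over-time view comes from sampling while the experiment executes.
    samples = []
    for _ in range(10):            # e.g. ten one-second samples
        samples.append(snapshot())
        time.sleep(1)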

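Putting the four pieces together, an experiment profile could be serialized as something like the following sketch; the layout and field names are not prescribed by the talk:

    # Sketch: one possible on-disk layout for an experiment profile.
    import json

    experiment_profile = {
        "container_image": "ceph-osdi06:latest",      # hypothetical image reference
        "platform_profile": "platform-profile.json",  # host specs + baseline behavior
        "container_configuration": {                  # the cgroup limits used
            "cpuset.cpus": "0-3",
            "memory.limit_in_bytes": 4 * 1024**3,
            "blkio.throttle.write_bps_device": "8:0 31457280",   # 30 MB/s
        },
        "execution_profile": "samples.json",          # usage statistics over time
    }

    with open("experiment-profile.json", "w") as f:
        json.dump(experiment_profile, f, indent=2)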

Having all this information at hand is invaluable when validating an experimental result, and it is usually not found in an academic article.

Mapping Between Hosts

To reproduce on host B an experiment that originally ran on host A:
1.  Obtain the platform profile of A.
2.  Obtain the container configuration used on A.
3.  Obtain the platform profile of B.
4.  Using 1-3, generate a container configuration for B.

Example: emulate memory/I/O/network on B so that the characteristics of A are reflected (see the sketch after the diagram below).

[Diagram: the container configuration maps the CPU, memory, network, and block-I/O characteristics of System A onto the experiment container running on System B.]
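A rough sketch of step 4, generating B's configuration from A's profile and configuration, assuming the profiles record measured raw-disk bandwidth and available memory; the field names are illustrative, not a fixed schema:

    # Sketch: derive a container configuration for host B from (1) host A's platform
    # profile, (2) the container configuration used on A, and (3) B's platform profile.

    def map_config(profile_a, config_a, profile_b):
        # sanity check: the cap used on A must have been attainable on A's raw disks
        assert config_a["io_bps"] <= profile_a["raw_disk_bps"]
        config_b = dict(config_a)
        # keep the same effective I/O bandwidth, as long as B can sustain it
        config_b["io_bps"] = min(config_a["io_bps"], profile_b["raw_disk_bps"])
        # give the container the same memory as on A, bounded by what B has
        config_b["memory_bytes"] = min(config_a["memory_bytes"],
                                       profile_b["memory_bytes"])
        return config_b

    profile_a = {"raw_disk_bps": 80e6,  "memory_bytes": 16 * 1024**3}
    config_a  = {"io_bps": 30e6,        "memory_bytes": 4 * 1024**3}
    profile_b = {"raw_disk_bps": 200e6, "memory_bytes": 64 * 1024**3}

    print(map_config(profile_a, config_a, profile_b))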

Does it work?


Mapping doesn’t always work

•  Experiments that rely on unmanaged operations and resources.
   – Asynchronous I/O, memory bandwidth, L3 cache, etc.
•  Enhancing the isolation guarantees of the container execution engine would support more of these cases.
   – E.g., if cgroups isolated asynchronous I/O for every distinct group.

Open Question

•  Given strong isolation guarantees, can we automatically check for repeatability by looking at low-level metrics?

[Diagram: compare the low-level metrics collected on System A against those collected on System B; are they equal?]
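One conceivable check, sketched under the assumption that both runs recorded the same low-level counters and using made-up numbers: compare the two execution profiles metric by metric and flag any that diverge beyond a tolerance.

    # Sketch: naive repeatability check between two execution profiles, assuming
    # each is a dict of metric name -> overall value for the same workload.

    def diverging_metrics(profile_a, profile_b, tolerance=0.1):
        """Return the metrics whose relative difference exceeds the tolerance."""
        diverging = {}
        for metric, a in profile_a.items():
            b = profile_b.get(metric)
            if b is None:
                continue
            rel_diff = abs(a - b) / max(abs(a), abs(b), 1e-9)
            if rel_diff > tolerance:
                diverging[metric] = (a, b, rel_diff)
        return diverging

    run_a = {"cpu_seconds": 120.0, "io_bytes": 9.1e9, "throughput_mbps": 58.0}
    run_b = {"cpu_seconds": 123.5, "io_bytes": 9.0e9, "throughput_mbps": 41.0}

    print(diverging_metrics(run_a, run_b))   # only throughput_mbps exceeds 10%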

Ongoing and Future Work

•  Reproduce more already-published experiments to test the robustness of our approach.
•  Integrate our profiling mechanism into container orchestration tools.

Thanks!
