An Integrated Experimental Environment for Distributed Systems and Networks

B. White, J. Lepreau, L. Stoller, R. Ricci, S. Guruprasad, M. Newbold, M. Hibler, C. Barb, A. Joglekar

Presented by Sunjun Kim
Jonathan di Costanzo
2009/04/13



Outline

Motivation
Netbed structure
Validation and testing
Netbed contribution
Conclusion


Background

Researchers need a platform on which they can develop, debug, and evaluate their systems

A single lab is not enough
  Resources are lacking: more computers are needed
  Scalability in terms of distance and number of nodes cannot be reached
  Developing large-scale experiments requires a huge amount of time


Previous approaches

Simulation: NS
  Controlled, repeatable environment
  Loses accuracy due to abstraction

Live networks: PlanetLab
  Achieves realism
  Not easy to repeat an experiment

Emulation: Dummynet, NSE
  Controlled packet loss and delay
  Manual configuration is tedious


Netbed ideas

Derives from “Emulab Classic”
  A universally available time- and space-shared network emulator
  Automatic configuration from an ns script

Adds virtual topologies for network experimentation
  Integrates simulation, emulation, and live-network experimentation with wide-area nodes in a single framework


Netbed goals

Accuracy
  Provide an artifact-free environment

Universality
  Anyone can use any resource the way they want

Conservative resource-allocation policy
  No multiplexing (virtual machines)
  The resources of a node can be fully utilized


Resources

Local-Area Resources
Distributed Resources
Simulated Resources
Emulated Resources

WAN emulator (integrated yet)

PlanetLab, ModelNet (work in progress)


Netbed structure

Resource

Life cycle


Local-Area resources

3 clusters
  168 PCs in Utah, 48 in Kentucky, and 40 in Georgia

Each node can be used as
  An edge node, router, traffic-shaping node, or traffic generator

Each machine is used exclusively by one experiment while it runs

A default OS is provided but is entirely replaceable


Distributed resources

Also called wide-area resources

50-60 nodes at approximately 30 sites

Provide the characteristics of a live network

Very few nodes
  These nodes are shared among many users
  FreeBSD jail mechanism (a kind of virtual machine)
  Non-root access


Simulated resources

Based on nse (ns emulation)
  Enables interaction with real traffic

Provides scalability beyond physical resources
  Many simulated nodes can be multiplexed


Emulated resources

VLANs
  Emulate wide-area links within a local-area network

Dummynet
  Emulates queue and bandwidth limitations, introducing delay and packet loss between physical nodes
  Dummynet nodes act as Ethernet bridges, transparent to experimental traffic
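
To make the traffic-shaping idea concrete, here is a minimal Python sketch of what such a node does to packets: serialize them at a fixed bandwidth, add a fixed delay, and drop them when a bounded queue is full. It is only a toy model, not Dummynet's implementation; the class name `ShapedLink`, its fields, and the tail-drop policy are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ShapedLink:
    """Toy model of a shaping node: fixed bandwidth, fixed delay, bounded queue."""
    bandwidth_bps: float  # link capacity in bits per second
    delay_s: float        # one-way delay added to every packet, in seconds
    queue_limit: int      # max packets queued or in transmission (assumed tail drop)

    def deliver(self, arrivals):
        """arrivals: list of (time_s, size_bytes) sorted by time.

        Returns one delivery time per packet, or None for a dropped packet.
        """
        finish_times = []   # transmission-finish time of every accepted packet
        link_free_at = 0.0  # when the link can start sending the next packet
        deliveries = []
        for t, size in arrivals:
            # Packets that finished transmitting by time t have left the queue.
            backlog = sum(1 for f in finish_times if f > t)
            if backlog >= self.queue_limit:
                deliveries.append(None)  # queue full: drop
                continue
            start = max(t, link_free_at)
            link_free_at = start + size * 8 / self.bandwidth_bps
            finish_times.append(link_free_at)
            deliveries.append(link_free_at + self.delay_s)
        return deliveries
```

With a 1.5 Mbps, 20 ms link, a 1500-byte packet takes 8 ms to transmit, so the first packet of a burst arrives after 28 ms and later ones queue behind it or are dropped.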


Life cycle

$ns duplex-link $A $B 1.5Mbps 20ms

[Figure: experiment life cycle. Stages: Specification, Parsing, Global Resource Allocation, Node Self-Configuration, Experiment Control, Swap Out, Swap In]


Accessing Netbed

Experiment creation
  A project leader proposes a project on the web
  Netbed staff accept or reject the project
  All experiments are then accessible from the web

Experiment management
  Log on to the allocated nodes or to the usershost (fileserver)
  The fileserver sends the OS images and the home and project directories to the other nodes


Specification

Experimenters use ns scripts written in Tcl
  They can use as many functions and loops as they want

Netbed defines a small set of ns extensions
  A specific hardware type can be chosen
  Simulation, emulation, or real implementation can be chosen
  Program objects can be defined using a Netbed-specific ns extension

A graphical UI can also be used


Parsing

Front-end Tcl/ns parser
  Recognizes the subset of ns relevant to topology and traffic generation

Database
  Stores an abstraction of everything about the experiment
    ▪ Fixed generated events
    ▪ Information about hardware, users, and experiments
    ▪ Procedures
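
As an illustration of turning one ns statement into an abstract record that could be stored in a database, the following Python sketch parses the `duplex-link` line shown on the life-cycle slide. It is a regex-only toy and an assumption on our part: the real front end is a Tcl/ns parser, and the function name `parse_duplex_link` and the record layout are hypothetical.

```python
import re

# Multipliers for the bandwidth units this sketch accepts.
_UNITS = {"bps": 1.0, "Kbps": 1e3, "Mbps": 1e6, "Gbps": 1e9}

_LINK_RE = re.compile(
    r"\$ns\s+duplex-link\s+\$(\w+)\s+\$(\w+)\s+([\d.]+)([KMG]?bps)\s+([\d.]+)ms"
)


def parse_duplex_link(line):
    """Parse '$ns duplex-link $A $B 1.5Mbps 20ms' into an abstract link record."""
    m = _LINK_RE.fullmatch(line.strip())
    if m is None:
        raise ValueError(f"unsupported statement: {line!r}")
    a, b, bandwidth, unit, delay = m.groups()
    return {
        "endpoints": (a, b),
        "bandwidth_bps": float(bandwidth) * _UNITS[unit],
        "delay_ms": float(delay),
    }
```

Calling `parse_duplex_link("$ns duplex-link $A $B 1.5Mbps 20ms")` yields a record with endpoints `A` and `B`, a bandwidth of 1,500,000 bps, and a delay of 20 ms.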


Global Resource Allocation

Binds abstractions from the database to physical or simulated entities
  Best effort to match the specification
  On-demand allocation (no reservations)

2 different algorithms for local and distributed nodes (different constraints)
  Simulated annealing for local nodes
  Genetic algorithm for distributed nodes


Global Resource Allocation

Over-reservation of the bottleneck
  Inter-switch bandwidth is too small (2 Gbps)
  Goes against their conservative policy

Dynamic changes to the topology are allowed
  Nodes can be added and removed

Consistent naming across instantiations
  Virtualization of IP addresses and host names
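
The naming point can be sketched in a few lines of Python: the experiment always refers to its own virtual names, and each swap-in rebinds them to whatever physical machines happen to be free. The class `NameMap` and the node names used below are hypothetical, chosen only to illustrate the idea.

```python
class NameMap:
    """Binds stable virtual node names to the physical nodes currently allocated."""

    def __init__(self, virtual_names):
        self.virtual_names = list(virtual_names)
        self.binding = {}  # virtual name -> physical node; empty while swapped out

    def swap_in(self, free_physical_nodes):
        if len(free_physical_nodes) < len(self.virtual_names):
            raise RuntimeError("not enough free nodes")
        # zip stops at the shorter list, so surplus free nodes are left unused.
        self.binding = dict(zip(self.virtual_names, free_physical_nodes))

    def swap_out(self):
        self.binding = {}

    def resolve(self, virtual_name):
        return self.binding[virtual_name]
```

Scripts written against the virtual names `A` and `B` keep working after a swap-out and swap-in, even though `A` may land on a different machine each time.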


Node Self-Configuration

Dynamic linking and loading from the DB
  Gives each node the proper context (hostname, disk image, script to start the experiment)

No persistent configuration state
  Only volatile memory on the node
  If required, the current soft state can be stored in the DB as hard state
    ▪ Swap out / swap in


Node Self-Configuration

Local nodes
  All nodes are rebooted in parallel
  Each node contacts the masterhost, which loads the kernel directed by the database
  A second-level boot may be required

Distributed nodes
  Boot from a CD-ROM, then contact the masterhost
  A new FreeBSD jail is instantiated
  Testbed Master Control Client


Experiment Control

Netbed supports dynamic experiment control
  Start, stop, and resume processes, traffic generators, and network monitors

Signals between nodes
  Uses a publish/subscribe event routing system
  Static events are retrieved from the DB
  Dynamic events are also possible
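
A publish/subscribe event system of the kind described above can be sketched as follows. This is an in-process toy, not Netbed's event system: the names `EventBus` and `replay_static_events`, the topic strings, and the `(time, topic, payload)` record shape are all assumptions for illustration.

```python
from collections import defaultdict


class EventBus:
    """Minimal in-process publish/subscribe hub keyed by topic name."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    def publish(self, topic, payload):
        # Copy the list so a handler may subscribe others while we iterate.
        for handler in list(self._handlers.get(topic, ())):
            handler(payload)


def replay_static_events(bus, events):
    """Publish (time_s, topic, payload) records in time order, ignoring real time."""
    for _time_s, topic, payload in sorted(events, key=lambda e: e[0]):
        bus.publish(topic, payload)
```

Static events would be loaded once and replayed in order, while a dynamic event is simply a later call to `publish` made while the experiment runs.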


Experiment Control

ns configuration files provide only high-level control

Experimenters can also exercise low-level control
  On local nodes: root privileges
    ▪ Kernel modification and access to raw sockets
  On distributed nodes: jail-restricted root privileges
    ▪ Access to raw sockets bound to a specific IP address

Each local node has a separate network, isolated from the experimental one
  A node can be controlled through a tunnel, as if we were on it, without interfering with the experiment


Preemption and Scheduling

Netbed tries to prevent idling
  3 metrics: traffic, use of pseudo-terminal devices, and CPU load average
  To be sure, a message is sent to the user, who can manually object
  A challenge for distributed nodes with several jails

Netbed offers automated batch experiments
  For when no interaction is required
  Makes it possible to wait for available resources
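
The three idleness metrics listed above could be combined roughly as in this Python sketch. The thresholds, the one-hour window, and the names `NodeActivity`, `is_idle`, and `experiment_idle` are assumptions; the slides do not say how Netbed weighs the metrics.

```python
from dataclasses import dataclass


@dataclass
class NodeActivity:
    packets_last_hour: int  # experimental-network traffic seen recently
    tty_active: bool        # any recent pseudo-terminal activity
    load_average: float     # CPU load average


def is_idle(activity, max_packets=100, max_load=0.05):
    """A node counts as idle only if all three metrics are quiet."""
    return (
        activity.packets_last_hour <= max_packets
        and not activity.tty_active
        and activity.load_average <= max_load
    )


def experiment_idle(nodes, **thresholds):
    """An experiment is a swap-out candidate only if every node is idle."""
    return all(is_idle(n, **thresholds) for n in nodes)
```

A positive result would only trigger the message to the user, who can still object before anything is swapped out.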


Validation

1st row: emulation overhead

Dummynet gives better results than nse


Validation

They expect better results with future improvements to nse


Validation

5 nodes communicate over 10 links

Evaluation of a derivative of DOOM

The goal is to send 30 tics/sec


Testing

Challenges
  Depends on physical artifacts (cannot be cloned)
  Should evaluate arbitrary programs
  Must run continuously

Minibed: 8 separate Netbed nodes

Test mode: prevents hardware modifications

Full-test mode: provides isolated hardware


Practical benefits

All-in-one set of tools

Automated and efficient realization of virtual topologies

Efficient use of resources through time-sharing and space-sharing

Increased fault tolerance (resource virtualization)


Practical benefits

Examples
  The “dumbbell” network
    ▪ 3 h 15 min --> 3 min
  Improved utilization of a scarce and expensive infrastructure: 12 months and 168 PCs in Utah
    ▪ Time-sharing (swapping): 1064 nodes
    ▪ Space-sharing (isolation): 19.1 years
  Virtualization of names and IP addresses
    ▪ No problems when swapping


Key services

Experiment creation and swapping

[Figure: phases shown: Mapping, Reservation, Reboot issuing, Reboot, Miscellaneous]

Booting from a custom disk image takes twice as long


Key services

Mapping local resources: assign

Matches the user’s requirements
Based on simulated annealing
Tries to minimize the number of switches and the inter-switch bandwidth used
Takes less than 13 seconds
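
The flavor of a simulated-annealing mapper can be shown with a small Python sketch. It is not the assign program: it optimizes a single simplified objective (the number of virtual links that cross switches), and the function names, the linear cooling schedule, and the parameters `steps`, `t0`, and `seed` are assumptions for illustration.

```python
import math
import random


def cut_cost(assign, links):
    """Number of virtual links whose endpoints sit on different switches."""
    return sum(1 for a, b in links if assign[a] != assign[b])


def anneal(nodes, links, switches, capacity, steps=5000, t0=2.0, seed=0):
    """Place nodes on switches (at most `capacity` each), minimizing cut links."""
    if len(nodes) > len(switches) * capacity:
        raise ValueError("not enough switch ports")
    rng = random.Random(seed)
    # Initial feasible placement: fill the switches in order.
    assign = {n: switches[i // capacity] for i, n in enumerate(nodes)}
    cost = cut_cost(assign, links)
    best, best_cost = dict(assign), cost
    for step in range(steps):
        temp = t0 * (1 - step / steps) + 1e-9  # linear cooling, never exactly zero
        a, b = rng.sample(nodes, 2)
        if assign[a] == assign[b]:
            continue
        # Swapping two nodes keeps every switch within its capacity.
        assign[a], assign[b] = assign[b], assign[a]
        new_cost = cut_cost(assign, links)
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = dict(assign), cost
        else:
            assign[a], assign[b] = assign[b], assign[a]  # reject: undo the swap
    return best, best_cost
```

Accepting some worsening moves while the temperature is high is what lets the search escape poor local placements; as the temperature falls it behaves more and more like plain hill climbing.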


Key services

Mapping distributed resources: wanassign

Different constraints
  ▪ Fully connected via the Internet
  ▪ “Last mile”: type instead of topology
  ▪ Specific topologies may be guaranteed by requesting particular network characteristics (bandwidth, latency, and loss)
  ▪ Based on a genetic algorithm
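
A genetic algorithm for this kind of mapping can be sketched in Python as below. It is not wanassign: it matches only requested latencies against a measured latency matrix, and the function names, the crossover and mutation operators, the 0.3 mutation rate, and the population settings are all assumptions for illustration.

```python
import random


def mismatch(choice, wanted, measured):
    """Sum of |measured - wanted| latency over all requested virtual links."""
    return sum(abs(measured[choice[u]][choice[v]] - ms) for (u, v), ms in wanted.items())


def crossover(p1, p2, rng):
    """Keep a prefix of p1, then fill from p2 while skipping duplicates."""
    cut = rng.randrange(1, len(p1))
    child = list(p1[:cut])
    for gene in p2:
        if len(child) == len(p1):
            break
        if gene not in child:
            child.append(gene)
    return child


def mutate(child, n_physical, rng):
    """Replace one gene with an unused physical node, or swap two genes."""
    unused = [p for p in range(n_physical) if p not in child]
    i = rng.randrange(len(child))
    if unused:
        child[i] = rng.choice(unused)
    else:
        j = rng.randrange(len(child))
        child[i], child[j] = child[j], child[i]


def evolve(n_virtual, n_physical, wanted, measured, pop_size=30, generations=60, seed=0):
    """Return a list mapping each virtual node index to a distinct physical node."""
    rng = random.Random(seed)

    def score(c):
        return mismatch(c, wanted, measured)

    pop = [rng.sample(range(n_physical), n_virtual) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score)
        parents = pop[: pop_size // 2]  # truncation selection keeps the best half
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            child = crossover(p1, p2, rng)
            if rng.random() < 0.3:
                mutate(child, n_physical, rng)
            children.append(child)
        pop = parents + children
    return min(pop, key=score)
```

Because the best half survives every generation unchanged, the best score in the population never gets worse from one generation to the next.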


Key services

Mapping distributed resources: wanassign
  16 nodes and 100 edges: ~1 s
  256 nodes and 40 edges per node: 10 min to 2 h


Key services

Disk reloading
  2 possibilities
    ▪ Complete disk image loading
    ▪ Incremental synchronization (hash tables on files or blocks)
  Good
    ▪ Faster (in their specific case)
    ▪ No corruption
  Bad
    ▪ Wastes time when similar images are needed repeatedly
    ▪ Paced reloading of freed nodes (reserved for 1 user)
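
The incremental-synchronization alternative mentioned above (hash tables on blocks) can be sketched in Python: hash each block of the node's disk and of the reference image, and rewrite only the blocks whose hashes differ. The 4096-byte block size, the choice of SHA-256, and the function names are assumptions for illustration.

```python
import hashlib

BLOCK = 4096  # bytes per block; an arbitrary choice for this sketch


def block_hashes(image):
    """SHA-256 digest of each fixed-size block of a bytes object."""
    return [
        hashlib.sha256(image[i:i + BLOCK]).digest()
        for i in range(0, len(image), BLOCK)
    ]


def blocks_to_rewrite(disk, golden):
    """Indices of golden-image blocks that are missing or different on disk."""
    d, g = block_hashes(disk), block_hashes(golden)
    return [i for i in range(len(g)) if i >= len(d) or d[i] != g[i]]
```

The trade-off noted on the slide still applies: every block must be read and hashed before anything is saved, which is why a full reload can be faster in some setups.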


Key services

Disk reloading: Frisbee
  Performance techniques:
    ▪ Uses a domain-specific algorithm to skip unused blocks
    ▪ Delivers images via a custom reliable multicast protocol
  117 s for 80 nodes, writing 550 MB instead of 3 GB
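
Skipping unused blocks amounts to sending only the allocated runs of the disk. The Python sketch below assumes a per-block allocation bitmap is already available (in practice it would come from filesystem-specific knowledge, which is not modeled here); the function names `used_runs` and `image_chunks` are hypothetical.

```python
def used_runs(allocated):
    """Collapse a per-block allocation bitmap into (start, length) runs of used blocks."""
    runs, start = [], None
    for i, used in enumerate(allocated):
        if used and start is None:
            start = i                       # a run of used blocks begins
        elif not used and start is not None:
            runs.append((start, i - start))  # the run ended just before block i
            start = None
    if start is not None:
        runs.append((start, len(allocated) - start))
    return runs


def image_chunks(disk_blocks, allocated):
    """Pair each used run's start index with the blocks it covers."""
    return [
        (start, disk_blocks[start:start + length])
        for start, length in used_runs(allocated)
    ]
```

In the figures quoted on the slide, writing 550 MB instead of 3 GB means roughly 18% of the disk is actually transferred and written.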


Key services

Scaling of simulated resources
  Simulated nodes are multiplexed on 1 physical node
    ▪ Must keep up with real time, taking into account the user’s specification: rate of events
  Test of a live TCP at 2 Mb CBR
    ▪ 850 MHz PC with UDP background traffic, 2 Mb CBR / 50 ms
    ▪ Able to sustain 150 links for 300 nodes
    ▪ Routing is a problem in very complex topologies


Example of a new possibility

Batch experiments can be programmed so that only 1 parameter changes at a time

The Armada file system from Oldfield & Kotz
  7 bandwidths x 5 latencies x 3 application settings x 4 configurations of 20 nodes
  420 tests in 30 hrs (~4.3 min per experiment)
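
The size of this parameter sweep can be checked with a short Python sketch that enumerates every combination. The axis values are placeholder indices, since the slides do not list the actual bandwidths, latencies, or settings, and the helper name `sweep` is hypothetical.

```python
from itertools import product


def sweep(**axes):
    """Enumerate every combination of the given parameter axes as dicts."""
    names = list(axes)
    return [dict(zip(names, combo)) for combo in product(*axes.values())]


# Placeholder index values: only the axis sizes come from the slide.
runs = sweep(
    bandwidth=range(7),
    latency=range(5),
    app_setting=range(3),
    node_config=range(4),
)

total_runs = len(runs)                    # 7 * 5 * 3 * 4
minutes_per_run = 30 * 60 / total_runs    # 30 hours spread over all runs
```

This gives 420 runs and about 4.3 minutes per run, matching the figures on the slide.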


Summary

Netbed deals with 3 test environments

Reuse of ns scripts

Quick setup of the test environment

Virtualization techniques provide an artifact-free environment

Enables qualitatively new experimental techniques


Future Work

Reliability/Fault Tolerance

Distributed Debugging: Checkpoint/Rollback

Security “Petri Dish”


Thank you

Any questions?