Page 1: Cluster or Network? An Emulation Facility for Research

Cluster or Network? An Emulation Facility for Research

Jay Lepreau, Chris Alfeld, David Andersen (MIT), Mac Newbold, Rob Place, Kristin Wright

Dept. of Computer Science, University of Utah

http://www.cs.utah.edu/flux/testbed/

February 3, 2000

Page 2: Cluster or Network? An Emulation Facility for Research

Research We Do

• Operating systems, local and distributed

• Distributed systems

Web caching schemes, distributed objects, ...

• Active Networks

code in every packet: route me!

Configurable router

• Router operating systems

Page 3: Cluster or Network? An Emulation Facility for Research

What?

• A configurable Internet (cluster) in a room

230 nodes, 1000 links, BFS (switch)

virtualizable topology, links, software

• An instrument for experimental CS research

• Universally available to any remote experimenter

• Simple to use!

Page 4: Cluster or Network? An Emulation Facility for Research

Why?

• “We evaluated our system on five nodes.” (job talk from a university with a 300-node cluster)

• “We evaluated our Web proxy design with 10 clients on 100 Mbit Ethernet.”

• “Simulation results indicate ...”

• “Memory and CPU demands on the individual nodes were not measured, but we believe will be modest.”

• “The authors ignore interrupt handling overhead in their evaluation, which likely dominates all other costs.”

• “Resource control remains an open problem.”

Page 5: Cluster or Network? An Emulation Facility for Research

Why 2

• “You have to know the right people to get access to the cluster.”

• “The cluster is hard to use.”

• “<Experimental network X> runs FreeBSD 2.2.x.”

• “October’s schedule for <experimental network Y> is…”

• “<Experimental network Z> is tunneled through the Internet”

Page 6: Cluster or Network? An Emulation Facility for Research

Complementary to Other Experimental Environments

• Simulation

• Small static testbeds

• Live networks

• Maybe someday, a large-scale set of distributed small testbeds (“Access”)

Page 7: Cluster or Network? An Emulation Facility for Research

Some Unique Characteristics

• Significant scale: initially 225 nodes, degree four 100Mb links between 42 core routers.

• User-configurable control of “physical” characteristics: shaping of link latency/bandwidth/drops/errors (via invisibly interposed “shaping nodes”), router processing power, buffer space, …

• Node breakdown: 42 core, 160 edge, 26 shaping, 2 management
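
A rough sketch of what an interposed shaping node might do to each packet, in Python purely for illustration. The names (LinkShape, ShapingNode, forward) are hypothetical and not taken from the testbed software. The model assumes a single FIFO link with fixed latency, fixed bandwidth, and independent random drops; bit errors are left out.

```python
import random
from dataclasses import dataclass


@dataclass
class LinkShape:
    """Hypothetical description of one emulated link."""
    latency_s: float      # one-way delay added to every packet
    bandwidth_bps: float  # emulated link rate in bits per second
    drop_prob: float      # probability that a packet is silently dropped


class ShapingNode:
    """Toy model of a node interposed on a link (an assumption, not the real design)."""

    def __init__(self, shape, rng=None):
        self.shape = shape
        self.rng = rng or random.Random()
        self.link_free_at = 0.0  # time at which the emulated link becomes idle

    def forward(self, arrival_s, size_bytes):
        """Return the delivery time of a packet, or None if it is dropped."""
        if self.rng.random() < self.shape.drop_prob:
            return None
        # The packet waits until the emulated link has finished earlier packets.
        start = max(arrival_s, self.link_free_at)
        tx_time = size_bytes * 8 / self.shape.bandwidth_bps
        self.link_free_at = start + tx_time
        # Delivery = end of transmission plus the configured latency.
        return self.link_free_at + self.shape.latency_s
```

Usage: ShapingNode(LinkShape(0.010, 1e6, 0.0)).forward(0.0, 1250) models a 1250-byte packet on a 1 Mb/s link with 10 ms latency. A second packet arriving at the same instant queues behind the first, so it is delivered one transmission time later.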

Page 8: Cluster or Network? An Emulation Facility for Research

More Unique Characteristics

• Capture of low-level node behavior such as interrupt load and memory bandwidth

• User-replaceable node OS software

• User-configurable physical link topology (VLAN via BFS; “P-LAN” via BFPP)

• Completely configurable and usable by external researchers, including node power cycling

Page 9: Cluster or Network? An Emulation Facility for Research

Fundamental Research Leverage:

Extremely Configurable

Page 10: Cluster or Network? An Emulation Facility for Research

Obligatory Pictures

Page 11: Cluster or Network? An Emulation Facility for Research

Prototype Pieces: edge nodes

Page 12: Cluster or Network? An Emulation Facility for Research

Big Iron

Page 13: Cluster or Network? An Emulation Facility for Research

A View from the Dark Side

Page 14: Cluster or Network? An Emulation Facility for Research

And the Light Side

Page 15: Cluster or Network? An Emulation Facility for Research

Artist’s Conception

Page 16: Cluster or Network? An Emulation Facility for Research

Zoom in: “Delay” Node

Page 17: Cluster or Network? An Emulation Facility for Research

Feature: Automatic mapping of desired topologies and characteristics to physical resources

• Algorithm goals:

minimize likelihood of experimental artifacts (bottlenecks)

“optimal” packing of multiple simultaneous experiments

Complete in finite time!

• Constraint-based heuristic algorithm (version 2!)

• Feature: accepts ns-compatible specification

Page 18: Cluster or Network? An Emulation Facility for Research

Current Algorithm

• Simulated annealing

Make random change (move node from one switch to another), compute score, accept/reject based on current temp.

• Heuristic algorithm

• ~4 seconds for 30 nodes; polynomial

• Improve:

Hardwired node connections will slow it down 100x

Edge nodes

Speed - incremental score recomputation
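
The annealing loop described on this slide can be sketched in Python as follows. This is an illustration under stated assumptions, not the testbed's mapper: the score function (virtual links that cross switches, plus a penalty for exceeding a per-switch port budget), the geometric cooling schedule, and all names (score, anneal, capacity) are hypothetical. The score is recomputed from scratch on every step, which is exactly the cost that incremental recomputation would remove.

```python
import math
import random


def score(assign, links, capacity):
    """Lower is better: inter-switch links plus a penalty for overfull switches."""
    crossing = sum(1 for a, b in links if assign[a] != assign[b])
    load = {}
    for sw in assign.values():
        load[sw] = load.get(sw, 0) + 1
    overflow = sum(max(0, n - capacity) for n in load.values())
    return crossing + 10 * overflow


def anneal(nodes, links, switches, capacity,
           steps=5000, temp=2.0, cooling=0.999, seed=0):
    """Map virtual nodes to switches by simulated annealing (toy version)."""
    rng = random.Random(seed)
    # Start from a simple round-robin placement.
    assign = {n: switches[i % len(switches)] for i, n in enumerate(nodes)}
    current = score(assign, links, capacity)
    best, best_score = dict(assign), current
    for _ in range(steps):
        # Random change: move one node to a randomly chosen switch.
        node = rng.choice(nodes)
        old = assign[node]
        assign[node] = rng.choice(switches)
        new = score(assign, links, capacity)
        delta = new - current
        # Always accept improvements; accept regressions with a probability
        # that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = new
            if new < best_score:
                best, best_score = dict(assign), new
        else:
            assign[node] = old  # reject: undo the move
        temp *= cooling
    return best, best_score
```

Usage: call anneal(nodes, links, switches, capacity) with a list of virtual node names, a list of (node, node) link pairs, a list of switch names, and the number of nodes a switch can host. It returns the best assignment seen and its score.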

Page 19: Cluster or Network? An Emulation Facility for Research

Virtual Topology

Page 20: Cluster or Network? An Emulation Facility for Research

Mapping into Physical Topology

Page 21: Cluster or Network? An Emulation Facility for Research

Roatan: Remote Console for a Node

Page 22: Cluster or Network? An Emulation Facility for Research

Early Network Configuration GUI

Page 23: Cluster or Network? An Emulation Facility for Research

Research Applications

• Simulation validation

• Active networks

• Resource demands of services inside routers

• Denial-of-service resistance

• Interaction of adaptive applications and protocols

• All sorts of distributed system experiments

• ...

Page 24: Cluster or Network? An Emulation Facility for Research

Research Applications (continued)

• Detailed performance monitoring and analysis

• Relationships between {node, link, topology} characteristics and

Application performance

Task scheduling and assignment

Communication software

Application algorithms

….

Page 25: Cluster or Network? An Emulation Facility for Research

Study: Interconnection Techniques

• Point-to-point vs. always through a switch

Salmon et al. (Caltech)

• Cost vs. performance

• Of most interest on large clusters

• Locality of communication patterns

• Interference with local processing

• Ad hoc mobile networking

Page 26: Cluster or Network? An Emulation Facility for Research

Research Issues and Other Challenges

• Calibration, validation, and scaling: how to emulate different speed networks? Scaling behavior of emulating faster links by slowing nodes?

• Can we sufficiently capture real router internal behavior in a PC?

• Assuring validity: detecting switch bottlenecks, measuring and controlling physical characteristics without introducing artifacts.

• Algorithms and software to map requirements to resources while minimizing artifacts.

• Integrate with ns?

• Providing a reasonable user interface to all this.

Page 27: Cluster or Network? An Emulation Facility for Research

Final Remarks

• Should be limping next month

• Looking for feedback on your potential use

• Looking for early users

• Collaborators/clients: UU Physics, CMU CS, MIT CS, Georgia Tech, IBM Research

• Sponsors: University of Utah, Novell, DARPA, Compaq, Nortel, <your_name_here>

