Physical Buildout of the OptIPuter at UCSD

TRANSCRIPT

Page 1: Physical Buildout of the OptIPuter  at UCSD

Physical Buildout of the OptIPuter at UCSD

Page 2: Physical Buildout of the OptIPuter  at UCSD

What Speeds and Feeds Have Been Deployed Over the Last 10 Years

[Chart adapted from Scientific American, January 2001: performance per dollar spent plotted against number of years (0 to 10). Annotations mark uplink speed (10000 Mb), endpoint speed (10 Mb to 1000 Mb to 10000 Mb), and DWDM capability (16 - 32 x 10000 Mb), with 7, 10, and 13 doublings over the decade; the Wiglaf and Rockstar clusters and the OptIPuter infrastructure are marked on the curves.]
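As a rough aid to reading the doublings annotations above, n doublings imply a 2^n growth factor; a minimal Python sketch of that arithmetic (illustrative only, not data from the slide):

```python
# Growth factor implied by the "doublings" annotations: n doublings = 2**n.
for doublings in (7, 10, 13):
    print(f"{doublings} doublings -> {2 ** doublings:,}x growth")
# 7 -> 128x, 10 -> 1,024x, 13 -> 8,192x; e.g. a step from 10 Mb to 10,000 Mb
# is a factor of 1,000, or roughly 10 doublings.
```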

Page 3: Physical Buildout of the OptIPuter  at UCSD

The UCSD OptIPuter Deployment: UCSD is Prototyping a Campus-Scale OptIPuter

[Campus map, with a half-mile scale bar, showing the OptIPuter fiber plant. Dedicated fibers between sites link Linux clusters at SIO, SDSC, the SDSC Annex, CRCA, Physical Sciences - Keck, SOM (Medicine), JSOE (Engineering), the Preuss School (high school), 6th College, Node M, and Earth Sciences, with a collocation point connecting to CENIC and NLR. The switching core at Calit2 comprises a Chiaro Enstara and a Juniper T320 (0.320 Tbps backplane bandwidth); sites attach through Cisco 6509 switches with 8 - 10GigE links.]

Source: Phil Papadopoulos, SDSC; Greg Hidley, Cal-(IT)2

Page 4: Physical Buildout of the OptIPuter  at UCSD

UCSD Packet Test Bed: OptIPuter Year 2

[Network diagram of the year-2 packet test bed. A Chiaro Enstara core router connects the campus sites (SDSC, JSOE, CSE, SIO, SOM, CRCA, 6th College, Preuss) over 1 GigE and 10 GigE links, with paths to UCI, ISI, and StarLight via CalREN-XD and NLR, a separate 10 GigE path to StarLight via NLR, and a connection to the shared UCSD & CalREN-HPR IP network. Endpoints shown include an IBM 48-node storage cluster, an IBM 128-node compute cluster, a Sun 128-node compute cluster, a Sun 17-node storage cluster, Sun 17-node compute clusters, an IBM 9-node viz cluster, a Sun 22-node viz cluster, a Sun 5-node viz cluster, a 3-node viz cluster, IBM 9-megapixel display pairs, GeoWall2 tiled displays, a Dell GeoWall and Dell viz node, an HP 28-node shared cluster with a 4-node HP control cluster, a Fujitsu 7-node shared cluster, 8-node and 9-node shared clusters, and InfiniBand groups of 4 and 64 nodes. Edge switching uses Extreme 400, Dell 5224, and Dell 6024F switches.]

Page 5: Physical Buildout of the OptIPuter  at UCSD

Different Kind of Experimental Infrastructure

• UCSD Campus Infrastructure
  – A campus-wide experimental apparatus
• Different Kinds of Cluster Endpoints (scaling in the usual dimensions)
  – Compute
  – Storage
  – Visualization
  – 300+ nodes available for experimentation (IA-32, Opteron, Linux)
  – 7 different labs
• Clusters and network can be allocated and configured by the researcher at the lowest level
  – Machine SW configuration: OS (kernel, networking modules, etc.), middleware, OptIPuter system software, application software
  – Root access given to researchers when needed
  – As close to chaos as we can get
• Networks
  – Packet-oriented network: 10 Gbps/site, multiple 10GigE where needed
  – Adding lambda capability (Quartzite: Research Instrumentation Award)

Page 6: Physical Buildout of the OptIPuter  at UCSD

What’s Coming Soon?

• 10 GigE Switching
  – Force 10 E1200, initially with sixteen 10GigE connections
  – Expansion is $6K/port + optics ($2K for grey, $5K for DWDM)
  – Line cards and grey optics are here; awaiting the chassis
  – Force 10 S50 edge switches: 48-port GigE + two 10GigE uplinks, ~$10K with grey optics
• 10 GigE NICs
  – Neterion PCI-X (Intel OEM) with XFP (just received)
  – Myrinet 10G (PCI Express): ready to place order
• DWDM
  – On order: four 10GigE XFPs, 40 km, channels 31 and 32 (2 each)
  – Delayed: expect arrival in March (sigh)
  – Following NASA's lead on the DWDM hardware (very good results on DRAGON)
  – Arrived: two 8-channel mux/demux units from Finisar
• DWDM Switching
  – Expect a wavelength-selective switch this summer

Page 7: Physical Buildout of the OptIPuter  at UCSD

What’s Changing II

• “Center Switching Complex” moving to Calit2
  – Should be done by the end of March
• A modest number of endpoints for OptIPuter research will be added
• A larger number (e.g., CAMERA) of “production” resources will be added
• Increasing emphasis on longer-haul connections
  – Connections to UCI

Page 8: Physical Buildout of the OptIPuter  at UCSD

Quartzite: Reconfigurable Networking

• NSF Research Instrumentation award; Papadopoulos, PI
• Packet network is great
  – Give me bigger and faster versions of what I already know
  – Even though TCP is challenged on big pipes
  – What about lambdas? And switching lambdas?
• Existing fiber plant is fixed
  – Want to experiment with different topologies? -> “buy” a telecom worker to reconnect cables as needed
• Quartzite: Research Instrumentation Award (started 15 Sep)
  – Hybrid network “switch stack” at our collocation point
  – Packet switch
  – Transparent optical switch
    – Allows us to physically build new topologies without physical rewiring
  – Wavelength-selective switch
    – Experimental device from Lucent

Page 9: Physical Buildout of the OptIPuter  at UCSD

Quartzite: DWDM

www.aurora.com | www.optoway.com | www.fibredyne.com

Component costs: $5K per XFP, $2K per mux/demux channel, $10K per switch; roughly $14K per connected pair over a single fiber pair (see the arithmetic sketch below).

• Cheap uncooled lasers
• Passive (0 W) optical splitters/combiners
• 0.8 nm spacing for DWDM
• 1GigE, 10GigE - bonded or separate
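A minimal arithmetic sketch of the per-pair figure above, under the assumption (not stated explicitly on the slide) that each end of a connected pair needs one DWDM XFP and one mux/demux channel, with the packet switch treated as shared rather than per-pair cost:

```python
# Hedged cost sketch for one DWDM-connected pair over a single fiber pair.
# Assumption (not explicit on the slide): each endpoint needs one DWDM XFP
# ($5K) and one mux/demux channel ($2K); the $10K switch is shared infrastructure.
XFP_COST = 5_000
CHANNEL_COST = 2_000
ENDPOINTS_PER_PAIR = 2   # two ends of a single fiber pair

pair_cost = ENDPOINTS_PER_PAIR * (XFP_COST + CHANNEL_COST)
print(f"Cost per connected pair: ${pair_cost:,}")   # -> Cost per connected pair: $14,000
```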

Page 10: Physical Buildout of the OptIPuter  at UCSD

UCSD Quartzite Core at Completion (Year 5 of OptIPuter)

[Diagram of the Quartzite communications core (year 3 configuration). A Quartzite core switch stack at the collocation point combines a packet switch, a production OOO (all-optical) switch, and a wavelength-selective switch, and interconnects the Chiaro Enstara (32 x 10GigE), a Juniper T320, the CalREN-HPR research cloud, and the campus research cloud. GigE switches with dual 10GigE uplinks fan out to cluster nodes, 10GigE links run directly to cluster-node interfaces and other switches, and 4 GigE over 4 fiber pairs reach other nodes.]

• Funded 15 Sep 2004
• Physical HW to enable OptIPuter and other campus networking research
• Hybrid network instrument: reconfigurable network and endpoints

Page 11: Physical Buildout of the OptIPuter  at UCSD

Scalable and Automated Network Mapping for the OptIPuter/Quartzite Network

OptIPuter AHM Meeting, San Diego, CA, January 17, 2006

Praveen Jagadishprasad, Hassan Elmadi (Calit2, UCSD)
Phil Papadopoulos, Mason Katz (SDSC)

Page 12: Physical Buildout of the OptIPuter  at UCSD

Network Map (01/16/2006)

Page 13: Physical Buildout of the OptIPuter  at UCSD

Motivation

• Management
  – Inventory
  – Troubleshooting
• Programming the network
  – Ability to view and manipulate the network as a single entity
  – Aid network reconfiguration in a heterogeneous network
  – Experimental networks have a high degree of reconfiguration
    • Glimmerglass-based physical changes
    • VLAN-based logical topology changes
  – Final goal is to automate the reconfiguration process
    • Focus on the switch/router configuration process

Page 14: Physical Buildout of the OptIPuter  at UCSD

Automated Discovery

• Minimal input needed
  – One gateway might be sufficient
• SNMP-based discovery
  – Not tied to a vendor protocol
  – Tested with Cisco, HP, Dell, Extreme, etc.
  – Almost all major vendors support SNMP
• Fast
  – Discovery process highly threaded (a minimal sketch follows below)
  – 3 minutes for the UCSD OptIPuter network (~600 hosts and 20 switches)
• Framework-based
  – Extensible to include MIBs for specific switch/router models, for example:
    – Cisco VLANs
    – Extreme trunking
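To make the threaded, gateway-seeded discovery concrete, here is a minimal hypothetical sketch of that pattern. The snmp_get_neighbors helper and the tiny fake topology are placeholders for real SNMP queries (e.g., walking ARP/bridge/LLDP tables); this is not the actual OptIPuter mapper code:

```python
# Hypothetical sketch of threaded, SNMP-driven discovery starting from one gateway.
# snmp_get_neighbors() stands in for real SNMP queries; FAKE_TOPOLOGY exists only
# so the sketch runs end to end and is not OptIPuter data.
from concurrent.futures import ThreadPoolExecutor

FAKE_TOPOLOGY = {
    "10.0.0.1": ["10.0.1.1", "10.0.2.1"],    # gateway -> two switches
    "10.0.1.1": ["10.0.1.10", "10.0.1.11"],  # switch -> hosts
    "10.0.2.1": ["10.0.2.10"],
}

def snmp_get_neighbors(device_ip):
    """Stand-in for an SNMP query returning devices adjacent to device_ip."""
    return FAKE_TOPOLOGY.get(device_ip, [])

def discover(seed_gateway, max_workers=64):
    """Breadth-first discovery; each round queries the whole frontier in parallel."""
    topology = {}
    frontier = [seed_gateway]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while frontier:
            neighbor_lists = list(pool.map(snmp_get_neighbors, frontier))
            next_frontier = []
            for device, neighbors in zip(frontier, neighbor_lists):
                topology[device] = neighbors
                next_frontier.extend(n for n in neighbors if n not in topology)
            # dedupe while preserving order before the next round
            frontier = list(dict.fromkeys(n for n in next_frontier if n not in topology))
    return topology

if __name__ == "__main__":
    print(discover("10.0.0.1"))
```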

Page 15: Physical Buildout of the OptIPuter  at UCSD

Design for discovery and mapping

• Phase 1 (Layer 3)
  – Router discovery
  – Subnet discovery
• Phase 2 (Layer 2)
  – Switch discovery
  – Host discovery
  – Switch <---> host mapping (sketched below)
  – IP ARP mapping
• Phase 3
  – Network mapping
  – Form an integrated map through novel algorithms
  – Area of research
• Phase 4
  – Web-based viz
  – Database storage
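As an illustration of the Phase 2 switch-to-host mapping step, the sketch below infers host attachment points from switch MAC (bridge forwarding) tables, treating ports that learn only a few MACs as edge ports. The data, names, and threshold are hypothetical; the real mapper's algorithms, particularly the Phase 3 integration, are its own research contribution:

```python
# Hypothetical sketch: map hosts to switch ports from bridge forwarding tables.
# A host is assumed to hang off a port that learns few MACs; ports that see many
# MACs are treated as inter-switch trunks. The table below is illustrative only.
fdb = {
    # switch -> port -> set of MACs learned on that port (as read via SNMP bridge MIB)
    "sw-jsoe": {"g1": {"aa:aa"}, "g2": {"bb:bb"}, "uplink": {"aa:aa", "bb:bb", "cc:cc"}},
    "sw-sio":  {"g5": {"cc:cc"}, "uplink": {"aa:aa", "bb:bb", "cc:cc"}},
}

def map_hosts(fdb, trunk_threshold=2):
    """Return {mac: (switch, port)} for ports that look like host (edge) ports."""
    attachment = {}
    for switch, ports in fdb.items():
        for port, macs in ports.items():
            if len(macs) < trunk_threshold:   # likely an edge port, not a trunk
                for mac in macs:
                    attachment[mac] = (switch, port)
    return attachment

print(map_hosts(fdb))
# {'aa:aa': ('sw-jsoe', 'g1'), 'bb:bb': ('sw-jsoe', 'g2'), 'cc:cc': ('sw-sio', 'g5')}
```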

Page 16: Physical Buildout of the OptIPuter  at UCSD

Future work

• Reliable discovery of logical topology (VLANs)
• Automate generation of switch/router configs (a template sketch follows below)
  – Use physical topology information to aid config generation
  – Fixed templates for each switch/router model
  – Templates are extended depending on the configuration needed
• Batch configuration of switches/routers
  – Support custom VLANs with only end-host specification
  – Construct a spanning tree of end hosts and intermediate switches/routers
  – Schedule dependencies for step-by-step configuration
  – Physical topology information is essential
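A minimal illustration of the fixed-template idea above: a per-model template is filled in with the VLAN and the ports selected from the physical topology. The template text is generic IOS-style syntax and the function is hypothetical, not the project's actual config generator:

```python
# Hypothetical per-model config template, extended with the VLAN and ports that
# the physical topology says lie between the requested end hosts.
VLAN_TEMPLATE = """\
vlan {vlan_id}
 name {vlan_name}
{interface_stanzas}"""

IFACE_TEMPLATE = """\
interface {port}
 switchport access vlan {vlan_id}"""

def render_vlan_config(vlan_id, vlan_name, ports):
    stanzas = "\n".join(IFACE_TEMPLATE.format(port=p, vlan_id=vlan_id) for p in ports)
    return VLAN_TEMPLATE.format(vlan_id=vlan_id, vlan_name=vlan_name,
                                interface_stanzas=stanzas)

# Ports would come from the discovered topology (spanning tree between end hosts).
print(render_vlan_config(310, "optiputer-expt", ["GigabitEthernet1/1", "GigabitEthernet1/7"]))
```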

Page 17: Physical Buildout of the OptIPuter  at UCSD

OptIPuter Network Inventory Management – Logical View

[Figure: logical topology graph of a single VLAN.]

• The logical topology adds a VLAN table to the physical topology tables (a schema sketch follows below).
  – A VLAN is composed of trunks.
  – Each trunk can be a single or multiple port-to-port connection between the same pair of switches.
  – The schema supports retaining the VLAN ID when modifying trunks, and vice versa.
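A hypothetical sketch of how such a VLAN/trunk table could layer over the physical tables; table and column names are guesses for illustration, not the actual inventory schema:

```python
# Hypothetical VLAN-over-physical-topology schema (names are illustrative).
import sqlite3

schema = """
CREATE TABLE switch (switch_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE link   (link_id   INTEGER PRIMARY KEY,   -- physical port-to-port connection
                     switch_a INTEGER REFERENCES switch(switch_id), port_a TEXT,
                     switch_b INTEGER REFERENCES switch(switch_id), port_b TEXT);
CREATE TABLE vlan   (vlan_id   INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE trunk  (trunk_id  INTEGER PRIMARY KEY,
                     vlan_id  INTEGER REFERENCES vlan(vlan_id));
-- a trunk groups one or more physical links between the same pair of switches,
-- so trunk membership can change while the VLAN id is retained (and vice versa)
CREATE TABLE trunk_link (trunk_id INTEGER REFERENCES trunk(trunk_id),
                         link_id  INTEGER REFERENCES link(link_id),
                         PRIMARY KEY (trunk_id, link_id));
"""

conn = sqlite3.connect(":memory:")
conn.executescript(schema)
print([row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type='table'")])
```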

Page 18: Physical Buildout of the OptIPuter  at UCSD

Look at Parallel Data Serving
• 128-node Rockstar cluster (same as the SC2003 build)
• 1 SCSI drive per file server node

[Diagram: the cluster is arranged in four groups, each with 8 Lustre clients and two banks of 10 Lustre file servers, attached to 48-port GigE switches with 10GigE uplinks.]

Page 19: Physical Buildout of the OptIPuter  at UCSD

Basic Performance

• 32, 8, 16, and 4 nodes reading the same 32 GB file
• Under these ideal circumstances, able to read more than 1.4 GB/sec from disk (rough per-client arithmetic is sketched below)
• Writing different 10 GB files from each node: about 700 MB/s
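For scale, dividing the reported aggregate read rate evenly across the reader counts gives the rough per-client share (illustrative arithmetic only; the slide reports only aggregate numbers):

```python
# Rough per-client share of the >1.4 GB/s aggregate read rate (illustrative only).
AGGREGATE_READ_MB_S = 1.4 * 1024   # ~1,434 MB/s aggregate
for readers in (4, 8, 16, 32):
    print(f"{readers:2d} readers -> ~{AGGREGATE_READ_MB_S / readers:.0f} MB/s each")
```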

Page 20: Physical Buildout of the OptIPuter  at UCSD

Why a Hybrid Structure

• Create different physical topologies quickly
• Change whether a site/node is connected via packet, lambda, or a hybrid combination
  – Want to understand the practical challenges in different circumstances
• Circuits don't scale in the Internet sense
• Packet switches will be congested for long-haul traffic
  – Real QoS is unreachable in the ossified Internet
• The engineering compromise is likely a hybrid network
  – Packet paths always exist (Internet scalability argument)
  – Circuit paths on demand
  – Think private high-speed networks, not just point-to-point

Page 21: Physical Buildout of the OptIPuter  at UCSD

Summary

• OptIPuter is addressing a subset of the research needed to figure out how to waste (I mean utilize) bandwidth
• Work at multiple levels of the software stack: protocols, virtual machine construction, storage retrieval
• Trying to understand how lambdas are presented to applications
  – Explicit?
  – Hidden?
  – Hybrid?
• Building an experimental infrastructure as large as our budget will allow
  – OptIPuter is already international in scale at 10 gigabits
  – Approximating the Terabit Campus with Quartzite