summary of switching theory balaji prabhakar stanford university
TRANSCRIPT
![Page 1: Summary of switching theory Balaji Prabhakar Stanford University](https://reader036.vdocument.in/reader036/viewer/2022083008/56649ea25503460f94ba6490/html5/thumbnails/1.jpg)
Summary of switching theory
High PerformanceSwitching and RoutingTe le c o m C e n te r W o rksh o p : S ep t 4 , 1 9 9 7 .
Balaji PrabhakarBalaji Prabhakar
Stanford University
![Page 2: Summary of switching theory Balaji Prabhakar Stanford University](https://reader036.vdocument.in/reader036/viewer/2022083008/56649ea25503460f94ba6490/html5/thumbnails/2.jpg)
2
Outline of Notes
• Focus on four types of architecture– Output-queued switches (ideal architecture, not much to say)– Input-queued crossbars– Combined input- and output-queued switches– Buffered crossbars (mentioned briefly)
![Page 3: Summary of switching theory Balaji Prabhakar Stanford University](https://reader036.vdocument.in/reader036/viewer/2022083008/56649ea25503460f94ba6490/html5/thumbnails/3.jpg)
3
A Detailed Sketch of a Router
NetworkProcessor
LookupEngine
NetworkProcessor
Lookup Engine
NetworkProcessor
LookupEngine
InterconnectionFabric
Switch
Output Scheduler
Line cards Outputs
PacketBuffers
PacketBuffers
PacketBuffers
![Page 4: Summary of switching theory Balaji Prabhakar Stanford University](https://reader036.vdocument.in/reader036/viewer/2022083008/56649ea25503460f94ba6490/html5/thumbnails/4.jpg)
4
Things to Remember/Look for
• Switch design is mainly influenced by – Cost– Heat dissipation
• Key technological factors affecting cost and heat– Memory bandwidth (not the size of memory, but its speed)– Complexity of algorithms– Number of off-chip operations (this affects speed)
• Winning algorithms – Make the right trade-offs– Are very simple
• In hardware architecture design, switch/router design seems an exception in that theory has made a surprising amount of difference to the practice
![Page 5: Summary of switching theory Balaji Prabhakar Stanford University](https://reader036.vdocument.in/reader036/viewer/2022083008/56649ea25503460f94ba6490/html5/thumbnails/5.jpg)
5
Evolution of Switches
• In the beginning, there were only telephone switches
• Data packet/cell switches came in with ATM – Almost all original designs were either of the shared memory or the
output-queued architecture– These architectures were difficult to scale to high bandwidths,
because of their very high memory bandwidth requirement
• Input-queued switches require a low memory bandwidth, hence were seen as very scaleable
![Page 6: Summary of switching theory Balaji Prabhakar Stanford University](https://reader036.vdocument.in/reader036/viewer/2022083008/56649ea25503460f94ba6490/html5/thumbnails/6.jpg)
6
Evolution of Switches
• 1987: A very influential paper in switching, by Karol et. al.– IQ switches suffered from the head-of-line blocking phenomenon,
which limits their throughput to 58%– This very poor performance nearly killed the IQ architecture
121
221
• Switching theory bifurcates: The IQ and CIOQ researches– The negative result of Karol et. al. generated much interest in the Combined
Input- and Output-queued (CIOQ) architecture during the years 1987--1995. We will return to CIOQ switches later – For now we will look at developments in the IQ architecture
![Page 7: Summary of switching theory Balaji Prabhakar Stanford University](https://reader036.vdocument.in/reader036/viewer/2022083008/56649ea25503460f94ba6490/html5/thumbnails/7.jpg)
7
Input-queued Switches
![Page 8: Summary of switching theory Balaji Prabhakar Stanford University](https://reader036.vdocument.in/reader036/viewer/2022083008/56649ea25503460f94ba6490/html5/thumbnails/8.jpg)
8
Evolution of IQ Switches
• 1993: Appearance of paper by Anderson et. al.– Showed that head-of-line blocking is easily overcome by the use of virtual output queues, hence higher throughputs are possible; however, VOQs required the switch fabric to be “scheduled” (this is a key trade-off: scheduling problem for memory bandwidth)– Showed that switch scheduling is equivalent to bipartite graph matching,
introduced the Parallel Iterative Matching algorithm
121
221
11 2
1 22
![Page 9: Summary of switching theory Balaji Prabhakar Stanford University](https://reader036.vdocument.in/reader036/viewer/2022083008/56649ea25503460f94ba6490/html5/thumbnails/9.jpg)
9
Evolution of IQ Switches
• 1995: Nick McKeown develops the iSLIP algorithm in his thesis– Used, in 1996, in Cisco Sytems’ flagship GSR family of routers
• 1996: Influential paper by McKeown, Walrand and Anantharam– Showed that the Maximum Size Matching does not give 100% throughput– Showed that Maximum Weight Matching does give 100% throughput
• 1992: Paper by Tassiulas and Ephremides– Showed that the Maximum Weight Matching gives 100% throughput– And many other interesting theoretical results
• 1998: Tassiulas introduces a randomized version of the MWM algorithm– He showed that this simple algorithm gives 100% throughput– But, its delay performance was very poor
• 2000: Giaccone, Prabhakar and Shah introduce other randomized algorithms which give 100% throughput with delay very nearly equal to that of the MWM algorithm
![Page 10: Summary of switching theory Balaji Prabhakar Stanford University](https://reader036.vdocument.in/reader036/viewer/2022083008/56649ea25503460f94ba6490/html5/thumbnails/10.jpg)
10
Performance Analysis of IQ Switches
• Analyzing throughput– Bernoulli IID input processes: Lyapunov analysis of the Markov chain
corresponding to the queue-size process
(all papers mentioned previously)
– SLLN input processes: Fluid models introduced by Dai and Prabhakar
– Adversarial input processes: Analyzed by Andrews and Zhang
• Analyzing delay performance– Bounds from Lyapunov analysis: Leonardi et al, Kopikare and Shah
– Heavy traffic analysis: Stolyar analyzes the MWM algorithm under heavy traffic
– Shah and Wischik build on this and analyze MWM algorithms with different queue weights
– See talks by Shah and Williams
![Page 11: Summary of switching theory Balaji Prabhakar Stanford University](https://reader036.vdocument.in/reader036/viewer/2022083008/56649ea25503460f94ba6490/html5/thumbnails/11.jpg)
11
Combined Input- and Output-queued Switches
![Page 12: Summary of switching theory Balaji Prabhakar Stanford University](https://reader036.vdocument.in/reader036/viewer/2022083008/56649ea25503460f94ba6490/html5/thumbnails/12.jpg)
12
CIOQ Switches
• Recall the negative result on IQ switches in the paper by Karol et al
• It started a lot of work on CIOQ switches– The aim was to get the performance of OQ switches at very near the
cost of IQ switches
– A number of heuristic algorithms, simulations and special-case analyses showed that with a speedup of about 4, a CIOQ switch could approach the performance of an OQ switch
IQ CIOQ OQ
Speedup = 1
Inexpensive
Poor performance
Speedup = 4 or 5?
Inexpensive
Great performance
Speedup = N
Expensive
Great performance
![Page 13: Summary of switching theory Balaji Prabhakar Stanford University](https://reader036.vdocument.in/reader036/viewer/2022083008/56649ea25503460f94ba6490/html5/thumbnails/13.jpg)
13
CIOQ Switches
• Prabhakar and McKeown (1999) – Prove that a CIOQ switch with a speedup of 4 exactly emulates an OQ switch; i.e.
there does not exist an input pattern of packets that can distinguish the two switches
– They introduced an algorithm called MUCF, which is of the stable marriage type– This result was later improved to 2 by Chuang, Goel, McKeown and P– Related other work due to Charny et al, Krishna et al
• Iyer, Zhang and McKeown (2002?) generalize the above to switches with a single stage of buffers
– Thereby making a theoretical analysis of the Juniper router architecture (which has a shared memory architecture)
• Dai and Prabhakar (2000) and Leonardi et al (2000) show that any maximal matching algorithm delivers a 100% throughput at a speedup of 2
– This result has a lot of significance for practice because (essentially) all commercial switches employ a speedup close to 2 and (truncated) maximal matching algorithms; so it validated a popular practice
![Page 14: Summary of switching theory Balaji Prabhakar Stanford University](https://reader036.vdocument.in/reader036/viewer/2022083008/56649ea25503460f94ba6490/html5/thumbnails/14.jpg)
14
Buffered Crossbars
• This type of fabric is very attractive because– It completely decouples the input from the output– It can handle variable-length packets in a natural way– It sits in some hot-selling networking products: e.g. Cisco’s Catalyst 6000 switch– Very ripe for theoretical study
11 2
1 22