The Concurrent Matching Switch Architecture
Bill Lin (University of California, San Diego)
Isaac Keslassy (Technion, Israel)
IEEE INFOCOM, Barcelona, April 23-29, 2006
Motivation
- Traffic demands are expected to grow, driven in part by increasing broadband adoption: a 10x increase in broadband subscriptions in just the last 3 years, already over 100 million subscribers, with 1.25-2.4 Gbps fiber to the home emerging (GPON, GEPON, EPON, BPON, ...)
- Larger routers are needed for consolidation
- Operators need scalable routers that provide good performance
Limitations of Previous Routers
- Output-Queueing (OQ) switch: well known to provide good performance, but scalability is hampered by the need for an internal speedup of N
- Crossbar switches using Input-Queueing (IQ) or Combined Input-Output Queueing (CIOQ): a huge body of literature, but scalability is hampered by the need for centralized scheduling and arbitrary per-packet switch configurations
- Load-balanced routers: no centralized scheduler; a scalable, fixed-configuration switch fabric in optics; guaranteed 100% throughput; a 100 Tb/s design with 160 Gb/s linecards has been shown
- But packets may be delivered "out of order"
[Figure: Basic Load-Balanced Router. Input linecards at rate R spread packets over fixed R/N channels to the middle linecards, which forward them over a second set of R/N channels to the output linecards at rate R.]
[Figure: Basic Load-Balanced Router, with example packets (A1-A3, B1-B2, C1-C2) spread across the middle linecards.]
Many fabric options (any spreading device):
- Space: full uniform mesh
- Wavelength: static WDM
- Time: round-robin switches
Just need fixed uniform-rate channels at R/N; no dynamic switch reconfigurations.
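Because the channels run at a fixed rate of R/N, no scheduler is needed in the fabric itself: each input simply spreads its traffic evenly over the N middle linecards. A minimal sketch of that spreading stage, with illustrative names (the fabric could equally be a mesh, static WDM, or a round-robin switch):

```python
# Each input spreads arriving packets round-robin over the N middle
# linecards, one fixed-rate R/N channel per middle linecard.
N = 4  # number of linecards (assumed for illustration)

class InputLinecard:
    def __init__(self, n):
        self.n = n
        self.next_middle = 0  # round-robin pointer over middle linecards

    def spread(self, packet):
        """Return the middle linecard this packet is sent to."""
        middle = self.next_middle
        self.next_middle = (self.next_middle + 1) % self.n
        return middle

inp = InputLinecard(N)
print([inp.spread(f"pkt{k}") for k in range(6)])  # [0, 1, 2, 3, 0, 1]
```

The pointer advances every time slot regardless of destination, which is what makes the fabric configuration fixed.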
[Figure: Basic Load-Balanced Router. Because packets of the same flow traverse different middle linecards, they can arrive at the output out of order.]
Packet Ordering Problem
- Out-of-order packet delivery is undesirable (e.g., bad for TCP)
- Previous techniques (e.g., EDF, UFS, FOFF) accumulate and delay packets at the input/middle ports, and/or delay and re-order packets at the middle/output ports
- However, these techniques are unsatisfactory because they add substantial delays
[Figure: Impact on average delay (N = 128, uniform traffic). UFS and FOFF add significant delay over the basic load-balanced router.]
Concurrent Matching Switch (CMS)
Basic idea:
- Retain the load-balanced router structure and the scalability of a fixed optical mesh, with no dynamic reconfiguration
- Instead of packets, load-balance "request tokens" to N parallel "schedulers"
- Each scheduler independently solves its own matching
- Packets are delivered in order based on the matching results
The goal is to provide much lower average delay than accumulation-based methods for ensuring packet order, while retaining 100% throughput and scalability.
[Figure: CMS architecture. Retain the fixed-configuration meshes, but move the packet buffers to the input linecards.]
[Figure: CMS architecture. Each middle linecard adds N² token counters, one per input-output flow.]
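Each middle linecard keeps an N x N matrix of token counters, one per input-output flow: a request token for flow (i, j) increments entry (i, j), and a later grant decrements it. A minimal sketch of this state, with illustrative names not taken from the paper:

```python
# Hypothetical sketch of one middle linecard's N x N token counters.
N = 3

class MiddleScheduler:
    def __init__(self, n):
        # tokens[i][j] = outstanding requests from input i to output j
        self.tokens = [[0] * n for _ in range(n)]

    def add_token(self, i, j):
        """Arrival phase: a request token for flow (i, j) arrives."""
        self.tokens[i][j] += 1

    def grant(self, i, j):
        """Matching phase: one request for flow (i, j) is granted."""
        assert self.tokens[i][j] > 0
        self.tokens[i][j] -= 1

sched = MiddleScheduler(N)
sched.add_token(0, 2)
sched.add_token(0, 2)
sched.add_token(1, 0)
print(sched.tokens)  # [[0, 0, 2], [1, 0, 0], [0, 0, 0]]
```

Tokens are small fixed-size counts, so spreading them is far cheaper than spreading variable-length packets.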
[Figure: Arrival phase, shown step by step. New packets are buffered at the input linecards while a request token per packet is load-balanced across the middle linecards; each token increments the corresponding counter at its middle linecard.]
[Figure: Matching phase, shown step by step. Each middle linecard independently computes a matching over its own token counters; granted tokens are removed from the counters.]
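In the matching phase, each scheduler pairs inputs with outputs based on its own token counters, with at most one grant per input and per output. The paper leaves the matching algorithm open; as a stand-in, here is a simple greedy maximal matching that favors backlogged flows:

```python
# Illustrative matching step for one middle scheduler: pick a set of
# (input, output) pairs with outstanding tokens, at most one per row
# and per column. Greedy-by-weight is an assumed example algorithm.

def greedy_match(tokens):
    """Return a list of granted (input, output) pairs."""
    n = len(tokens)
    used_in, used_out, match = set(), set(), []
    # consider heavier (more backlogged) flows first
    for i, j in sorted(((i, j) for i in range(n) for j in range(n)),
                       key=lambda p: -tokens[p[0]][p[1]]):
        if tokens[i][j] > 0 and i not in used_in and j not in used_out:
            match.append((i, j))
            used_in.add(i)
            used_out.add(j)
    return match

tokens = [[2, 0, 1],
          [0, 0, 3],
          [1, 0, 0]]
print(sorted(greedy_match(tokens)))  # [(0, 0), (1, 2)]
```

Any matching discipline can be dropped in here; the CMS structure only requires that each scheduler produce a valid matching over its own counters.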
[Figure: Departure phase. For each grant, the corresponding packet is taken from the input buffers and delivered through the meshes to its output.]
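The departure phase is what restores packet order: a grant for flow (i, j) always releases the oldest buffered packet of that flow, so packets of a flow leave in arrival order no matter which scheduler issued the grant. A minimal sketch, with an assumed per-flow FIFO at the input:

```python
# Sketch of the departure phase: each grant releases the head-of-line
# packet of the granted flow from the input-side buffer.
from collections import deque

voq = {(0, 2): deque(["A1", "A2", "A3"])}  # per-flow FIFO at input 0

def depart(i, j):
    """Serve one grant for flow (i, j): the oldest packet departs."""
    return voq[(i, j)].popleft()

print([depart(0, 2), depart(0, 2)])  # ['A1', 'A2'] -- in arrival order
```

Since grants only name a flow, not a specific packet, FIFO service at the input is sufficient to guarantee ordering.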
Distributed Operation
- All linecards operate in parallel in a fully distributed manner
- The arrival, matching, and departure phases overlap in a pipelined manner
Main Ideas
- Each middle linecard acts as a "micro-router" handling 1/Nth of the arrival traffic
- It therefore gets N time slots to compute each schedule, so time complexity is amortized by a factor of N
- If each micro-router can guarantee 100% throughput, so can the overall switch
- Each micro-router can work whichever way it wants, leveraging the huge body of existing work on scheduling
CMS provides a new way of aggregating routers together, and therefore a new way of thinking about scaling routers.
Practicality
- Well-studied randomized approximations to Maximum Weight Matching have been shown to achieve very good results [Tassiulas 1998] [Giaccone, Prabhakar & Shah 2003]
- These algorithms require only O(N) complexity using sequential hardware, yet can provide 100% throughput guarantees with no speedup and good delay results
- Amortized over N time slots, CMS with these scheduling algorithms can achieve O(1) time complexity (independent of switch size), 100% throughput, good delay results, and packet ordering
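The core idea behind this family of randomized algorithms is to keep the previous matching, draw a random candidate matching, and retain whichever carries more weight (here, outstanding tokens). The sketch below illustrates that compare-and-keep step in that spirit; it is not the authors' implementation, and the candidate generation shown (one random permutation) is a deliberately simple choice:

```python
# Illustrative randomized-matching step: keep the heavier of the
# previous match and one random candidate match.
import random

def weight(match, tokens):
    return sum(tokens[i][j] for i, j in match)

def randomized_match(tokens, prev_match):
    n = len(tokens)
    perm = list(range(n))
    random.shuffle(perm)                      # O(N) random candidate
    candidate = [(i, perm[i]) for i in range(n)]
    if weight(candidate, tokens) > weight(prev_match, tokens):
        return candidate
    return prev_match

tokens = [[5, 0], [0, 3]]
m = randomized_match(tokens, [(0, 1), (1, 0)])  # previous match has weight 0
```

Because the weight of the retained matching never decreases within a step, such schemes can be shown to track the maximum-weight matching closely over time, which is the basis of the cited throughput results.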
[Figure: Experimental results (N = 128, uniform traffic). CMS closely tracks the basic load-balanced router, differing only by the N time slots of the matching phase, while UFS and FOFF incur much larger delays.]
Conclusions
- CMS is scalable: it leverages the scalability of fixed optical meshes, is fully distributed, and can achieve O(1) time complexity
- CMS achieves good performance: it guarantees 100% throughput, guarantees packet ordering, and experimentally achieves low packet delays
- CMS provides a new way of thinking about scaling routers and connects the huge body of existing literature on scheduling to load-balanced routers