packet-mode emulation of output-queued switches

Packet-Mode Emulation of Output- Queued Switches David Hay, CS, Technion Joint work with Hagit Attiya (CS) and Isaac Keslassy (EE)

Post on 31-Jan-2016

Page 1: Packet-Mode Emulation of Output-Queued Switches

Packet-Mode Emulation of Output-Queued Switches

David Hay, CS, Technion

Joint work with Hagit Attiya (CS) and Isaac Keslassy (EE)

Page 2: Packet-Mode Emulation of Output-Queued Switches

Outline

Cell-Mode Scheduling vs. Packet-Mode Scheduling
Impossibility of an Exact Emulation
Speedup-RQD Tradeoff
Emulation with S≈4
Emulation with S≈2
Emulation of OQ switch w/ bounded buffer
Simulation Results

Page 3: Packet-Mode Emulation of Output-Queued Switches

CIOQ Switches

Page 4: Packet-Mode Emulation of Output-Queued Switches

Cell-Mode Scheduling

Page 7: Packet-Mode Emulation of Output-Queued Switches

Trend towards Packet-Mode

Cell-mode scheduling is getting too hard:
  Fragmentation and reassembly should work very fast, at the external rate.
  Extra header for each cell ⇒ loss of bandwidth.
For optical switches, such fragmentation and reassembly are prohibitive.
Cell-mode schedulers are packet-oblivious ⇒ degradation of the overall performance.

Page 8: Packet-Mode Emulation of Output-Queued Switches

Packet-Mode Scheduling

Page 9: Packet-Mode Emulation of Output-Queued Switches

Packet-Mode Scheduling

No need for fragmentation and reassembly.
Must ensure contiguous packet delivery over the fabric: while input i delivers a packet to output j, neither input i nor output j can handle other packets.

Can packet-mode schedulers provide similar performance guarantees as cell-mode schedulers?
[Marsan et al., 2002][Ganjali et al., 2003][Turner, 2006]
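The contiguity constraint above can be checked mechanically. The following is a minimal sketch, not taken from the slides: the schedule encoding (one dict per scheduling decision), the function name `is_packet_mode`, and the choice to reject schedules that end with an unfinished packet are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: one dict per scheduling decision, mapping an input
# port to the (output port, packet id) of the cell it forwards.
Matching = Dict[int, Tuple[int, str]]

def is_packet_mode(schedule: List[Matching], packet_len: Dict[str, int]) -> bool:
    """True if every packet crosses the fabric in consecutive decisions and
    all packets are complete when the schedule ends."""
    sent: Dict[str, int] = {}                    # cells forwarded so far, per packet
    in_flight: Dict[int, Tuple[int, str]] = {}   # input -> unfinished (output, packet)
    for matching in schedule:
        outputs = [out for out, _ in matching.values()]
        if len(outputs) != len(set(outputs)):
            return False                         # an output receives two cells at once
        for inp, cell in in_flight.items():
            if matching.get(inp) != cell:
                return False                     # an unfinished packet was interrupted
        for inp, (out, pkt) in matching.items():
            sent[pkt] = sent.get(pkt, 0) + 1
            if sent[pkt] > packet_len[pkt]:
                return False                     # more cells than the packet has
            if sent[pkt] < packet_len[pkt]:
                in_flight[inp] = (out, pkt)
            else:
                in_flight.pop(inp, None)
    return not in_flight

lengths = {"p": 2, "q": 1, "r": 1}
contiguous = [{0: (1, "p")}, {0: (1, "p")}, {0: (0, "q")}]
interleaved = [{0: (1, "p")}, {0: (0, "q")}, {0: (1, "p")}]
```

Here `contiguous` passes the check, while `interleaved` fails because input 0 starts packet q before finishing packet p.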

Page 10: Packet-Mode Emulation of Output-Queued Switches

Output Queuing Emulation

OQ switches are considered optimal with respect to queuing delay and throughput, but they are too hard to implement in practice.

Emulation: same input traffic ⇒ same output traffic.

How hard is it for a cell-mode / packet-mode CIOQ switch to emulate an OQ switch?

Page 12: Packet-Mode Emulation of Output-Queued Switches

Cell-Mode Emulation is Possible

Easy with speedup S=N: N scheduling decisions every time-slot.
  In the 1st decision, forward the cell of input 1.
  In the 2nd decision, forward the cell of input 2.
  ⋮
  In the Nth decision, forward the cell of input N.

Page 14: Packet-Mode Emulation of Output-Queued Switches

Cell-Mode Emulation w/ S=2 [Chuang et al., 1999]

1st Key Concept: slackness of a cell (in the input side): L(C) = OC(C) - IT(C)
  Input Thread, IT(C) ("bad guys"): how many cells precede C in its input-port buffer?
  Output Cushion, OC(C) ("good guys"): how many cells are queued in the output buffer of C's destination and should leave the OQ switch before C?

Slackness may decrease by at most 2 in every time-slot:
  A cell leaves the destination of C ⇒ OC--
  A cell arrives at the input and is queued before C ⇒ IT++

Initial slackness can be made non-negative: when C arrives, insert it in the OC(C)th place of its input buffer.

Plan: ensure that slackness always increases by 2 ⇒ slackness is never negative ⇒ all cells are delivered on time.
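As a small illustration of the slackness definition, the sketch below computes L(C) = OC(C) - IT(C) from a snapshot of the switch. The data layout (lists of cell ids and a dict of shadow-OQ departure times) and the function name are assumptions, not part of the original slides.

```python
def slackness(cell, input_buffer, output_buffer, departure):
    """L(C) = OC(C) - IT(C) for a cell still waiting at its input.

    input_buffer  : ordered cell ids at C's input port (head first)
    output_buffer : cell ids already queued at C's destination output
    departure     : cell id -> departure time in the shadow OQ switch
    """
    # Input thread: cells ahead of C in its input buffer.
    input_thread = input_buffer.index(cell)
    # Output cushion: queued cells at the destination that leave before C.
    output_cushion = sum(1 for c in output_buffer
                         if departure[c] < departure[cell])
    return output_cushion - input_thread

departure = {"x": 1, "y": 2, "z": 9, "C": 5}
before = slackness("C", ["a", "C"], ["x", "y", "z"], departure)
# One time-slot later, in the worst case: x has left the output (OC--)
# and a new cell n was pushed in ahead of C (IT++).
after = slackness("C", ["a", "n", "C"], ["y", "z"], departure)
```

In this snapshot `before` is 1 and `after` is -1, which matches the claim that arrivals and departures alone can lower the slackness by at most 2 per time-slot.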

Page 15: Packet-Mode Emulation of Output-Queued Switches

Cell-Mode Emulation w/ S=2 [Chuang et al., 1999]

Stable Marriage (stable matching): Given two equal-size sets M, W and preference lists from every m ∈ M, w ∈ W, find a matching in which there are no two pairs (m,w), (m',w') s.t. m prefers w' over w and w' prefers m over m'.

Classical problem in CS.
A stable marriage always exists.
Many algorithms.
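One of the many algorithms is the classical Gale-Shapley proposal procedure. The slides do not say which algorithm is used, so the sketch below is only an illustration; it assumes complete preference lists over two equal-size sets, and the names `stable_matching` and `is_stable` are hypothetical.

```python
def stable_matching(m_prefs, w_prefs):
    """Gale-Shapley: members of M propose, members of W accept or trade up.
    m_prefs[m] and w_prefs[w] list the other side, most preferred first."""
    rank = {w: {m: r for r, m in enumerate(prefs)} for w, prefs in w_prefs.items()}
    next_choice = {m: 0 for m in m_prefs}   # index of the next w that m proposes to
    partner_of_w = {}
    free = list(m_prefs)
    while free:
        m = free.pop()
        w = m_prefs[m][next_choice[m]]
        next_choice[m] += 1
        current = partner_of_w.get(w)
        if current is None:
            partner_of_w[w] = m
        elif rank[w][m] < rank[w][current]:
            partner_of_w[w] = m             # w prefers m, the old partner is free again
            free.append(current)
        else:
            free.append(m)                  # w rejects m
    return {m: w for w, m in partner_of_w.items()}

def is_stable(matching, m_prefs, w_prefs):
    """True if no pair (m, w') prefers each other over their partners."""
    partner_of_w = {w: m for m, w in matching.items()}
    for m, prefs in m_prefs.items():
        for w in prefs[:prefs.index(matching[m])]:   # every w that m prefers
            if w_prefs[w].index(m) < w_prefs[w].index(partner_of_w[w]):
                return False
    return True

m_prefs = {"m1": ["w1", "w2"], "m2": ["w1", "w2"]}
w_prefs = {"w1": ["m2", "m1"], "w2": ["m1", "m2"]}
result = stable_matching(m_prefs, w_prefs)
```

Both m1 and m2 want w1, but w1 prefers m2, so the stable outcome pairs m2 with w1 and m1 with w2.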

Page 16: Packet-Mode Emulation of Output-Queued Switches

Cell-Mode Emulation w/ S=2 [Chuang et al., 1999]

The Critical Cell First (CCF) algorithm performs a stable marriage at each scheduling decision:
  M is the set of inputs, W is the set of outputs.
  Input i prefers o1 over o2 if there is a cell for o1 that is queued before all cells for o2.
  Output o prefers i1 over i2 if there is a cell from i1 that should leave before all cells from i2.
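The two preference rules can be turned into preference lists directly from the buffer state. The sketch below is an assumption-laden illustration: the buffer layout, the `departure` map of shadow-OQ departure times, and the function name `ccf_preferences` are hypothetical, and ports with no relevant cells are simply left out of the lists.

```python
def ccf_preferences(input_buffers, departure):
    """Build CCF-style preference lists.

    input_buffers[i] : ordered list of (cell id, destination output), head first
    departure        : cell id -> departure time in the shadow OQ switch
    """
    input_prefs = {}
    earliest = {}   # output -> {input: earliest departure among its cells}
    for i, buf in input_buffers.items():
        prefs = []
        for cell, out in buf:
            if out not in prefs:
                prefs.append(out)        # outputs ordered by their first queued cell
            per_input = earliest.setdefault(out, {})
            per_input[i] = min(per_input.get(i, departure[cell]), departure[cell])
        input_prefs[i] = prefs
    # Each output ranks inputs by the most urgent cell they hold for it.
    output_prefs = {o: sorted(per_input, key=per_input.get)
                    for o, per_input in earliest.items()}
    return input_prefs, output_prefs

buffers = {0: [("a", 1), ("b", 0)], 1: [("c", 1)]}
departure = {"a": 5, "b": 1, "c": 2}
input_prefs, output_prefs = ccf_preferences(buffers, departure)
```

In this example output 1 prefers input 1, because cell c must leave before cell a.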

Page 17: Packet-Mode Emulation of Output-Queued Switches

Cell-Mode Emulation w/ S=2 [Chuang et al., 1999]

For each cell C from input port i to output port j, and each scheduling decision, one of the following holds:
  C is forwarded (and we don't care about it).
  Another cell C' was forwarded from i, and i preferred to forward it ⇒ IT--
  Another cell C' was forwarded to j, and j preferred to receive it ⇒ OC++

Two scheduling decisions every time-slot ⇒ slackness always increases by 2.

Page 18: Packet-Mode Emulation of Output-Queued Switches

Cell-Mode Emulation

Easy with speedup S=N.
Possible with speedup S=2 (w/ CCF).
Lower bound: S ≥ 2-1/N is required [Chuang et al., 1999].

What is the speedup required for packet-mode emulation?

Page 19: Packet-Mode Emulation of Output-Queued Switches

Outline

Cell-Mode Scheduling vs. Packet-Mode Scheduling
Impossibility of an Exact Emulation
Speedup-RQD Tradeoff
Emulation with S≈4
Emulation with S≈2
Emulation of OQ switch w/ bounded buffer
Simulation Results

Page 20: Packet-Mode Emulation of Output-Queued Switches

Packet-Mode Emulation is Impossible

Regardless of speedup
Even with speedup S=N

Page 26: Packet-Mode Emulation of Output-Queued Switches

Outline

Cell-Mode Scheduling vs. Packet-Mode Scheduling
Impossibility of an Exact Emulation
Speedup-RQD Tradeoff
Emulation with S≈4
Emulation with S≈2
Emulation of OQ switch w/ bounded buffer
Simulation Results

Page 27: Packet-Mode Emulation of Output-Queued Switches

Emulation w/ Relative Queuing Delay

The CIOQ switch is allowed a bounded lag behind the shadow OQ switch

Exact same behavior as the optimal OQ switch, but with some extra delay, called the relative queuing delay (RQD).

Can we provide packet-mode OQ emulation with bounded RQD and small speedup?
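To make the notion concrete, the relative queuing delay of one run can be measured by comparing departure times cell by cell. This is a minimal sketch under the assumption that RQD is the worst-case lag over all cells; the function name and the dict-based inputs are hypothetical.

```python
def relative_queuing_delay(oq_departure, cioq_departure):
    """Largest lag of the CIOQ switch behind the shadow OQ switch.

    Both arguments map a cell id to its departure time-slot."""
    lags = [cioq_departure[c] - oq_departure[c] for c in oq_departure]
    if min(lags) < 0:
        raise ValueError("a cell left the CIOQ switch before its OQ departure time")
    return max(lags)

oq = {"a": 3, "b": 4, "c": 4}
cioq = {"a": 5, "b": 7, "c": 6}
lag = relative_queuing_delay(oq, cioq)
```

An emulation with bounded RQD would keep this value below a fixed bound for every input traffic pattern, not just for one run.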

Page 28: Packet-Mode Emulation of Output-Queued Switches

Our Results: Speedup-RQD tradeoff

[Figure: speedup-RQD tradeoff plot; axes: Speedup, RQD; axis marks: 2, 4, 2Lmax]

Figure annotations:
Lower bound on RQD (even with infinite speedup)
Lower bound on the speedup (from cell-mode scheduling)
Generalization of cell-mode scheduling with S=2: taking each packet of size ≤ Lmax as one huge cell

Lmax = maximum packet size (known value)

Page 29: Packet-Mode Emulation of Output-Queued Switches

Intuition for Emulation Algorithms

Packet Mode CIOQ

Packet Mode OQ

Cell Mode CIOQ w/ S=2

Page 32: Packet-Mode Emulation of Output-Queued Switches

PIFO Cell-Mode OQ Switch

FIFO = First-In First-Out
PIFO = Push-In First-Out

FIFO Packet-Mode OQ Switch is a PIFO Cell-Mode Switch
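A PIFO queue lets an arriving item be pushed into any position, while removal is always from the head. The sketch below is an illustration only: the class `PIFO`, the helper `push_cell`, and the rule "place a cell right after the queued cells of its own packet" are assumptions meant to show why a FIFO packet-mode queue looks like a PIFO cell-mode queue. It ignores departures that happen between arrivals.

```python
class PIFO:
    """Push-In First-Out queue: insert anywhere, remove only from the head."""

    def __init__(self):
        self.items = []

    def push_in(self, item, position=None):
        # Default position is the tail, which gives plain FIFO behavior.
        if position is None:
            position = len(self.items)
        self.items.insert(position, item)

    def pop(self):
        return self.items.pop(0)

    def __len__(self):
        return len(self.items)

def push_cell(queue, packet, seq):
    """Push a cell right after the queued cells of its packet, else at the tail."""
    position = None
    for idx, (pkt, _) in enumerate(queue.items):
        if pkt == packet:
            position = idx + 1
    queue.push_in((packet, seq), position)

q = PIFO()
# Cells of packets A and B arrive interleaved at the same output.
for packet, seq in [("A", 1), ("B", 1), ("A", 2), ("B", 2), ("A", 3)]:
    push_cell(q, packet, seq)
order = [q.pop() for _ in range(len(q))]
```

Although the cells arrive interleaved, they leave packet by packet: all of A, then all of B.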

Page 33: Packet-Mode Emulation of Output-Queued Switches

Underlying CCF Algorithm

Cell-Mode CIOQ w/ CCF (and speedup S=2) emulates any PIFO cell-mode OQ switch [Chuang et al.,1999]

But, CCF does not maintain contiguous packet forwarding over the fabric!

[Diagram labels: Packet Mode CIOQ; Packet Mode OQ = PIFO Cell-Mode OQ; Cell Mode CIOQ w/ S=2]

Page 34: Packet-Mode Emulation of Output-Queued Switches

Intuition for Emulation Algorithms

Packet Mode CIOQ

Packet Mode OQ

Cell Mode CIOQ w/ S=2

Two sub-steps:
1. Framing
2. Contiguous Decomposition

Page 35: Packet-Mode Emulation of Output-Queued Switches

Frame-Based Schedulers

Works in a pipelined, frame-based manner.

Within each frame:
  Build a demand matrix for this frame.
  Schedule the demand matrix of the previous frame.

[Figure: frames laid out along the time axis]

Page 36: Packet-Mode Emulation of Output-Queued Switches

Building the Demand Matrix

At each frame of size T, CCF forwards at most 2T cells from each input and to each output.

Demand matrix (rows = inputs, columns = outputs; the top-left entry is the number of cells CCF sent from input 1 to output 1 in the last frame):

3 0 1 2
1 2 2 1
2 2 2 0
0 2 1 3

The sum of each row is ≤ 2T, and the sum of each column is ≤ 2T.

Problem: A packet may span several frames.

Page 37: Packet-Mode Emulation of Output-Queued Switches

Building the Demand Matrix

Count only packets whose last cell is forwarded by the CCF in the frame.

Each row/column in the matrix is bounded by 2T+N(Lmax-1): for each input-output pair, only cells of one additional packet can be added.

Translates into RQD of 2T+(Lmax-2).
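The counting rule can be sketched as follows. The event format (one tuple per cell forwarded by CCF) and the function name `demand_matrix` are assumptions made for illustration; ports are numbered from 0.

```python
def demand_matrix(events, packet_len, frame, T, N):
    """Demand matrix of one frame, in cells.

    events     : (time_slot, input, output, packet id, is_last_cell) per cell
                 forwarded by the underlying CCF schedule
    packet_len : packet id -> number of cells
    A packet is counted, with all of its cells, in the frame in which CCF
    forwards its last cell."""
    start, end = frame * T, (frame + 1) * T
    demand = [[0] * N for _ in range(N)]
    for t, i, j, pkt, is_last in events:
        if is_last and start <= t < end:
            demand[i][j] += packet_len[pkt]
    return demand

# Packet p (3 cells) starts in frame 0 and ends in frame 1; q is a 1-cell packet.
events = [(1, 0, 1, "p", False), (2, 0, 1, "p", False), (3, 0, 1, "p", True),
          (4, 1, 0, "q", True)]
lengths = {"p": 3, "q": 1}
frame0 = demand_matrix(events, lengths, frame=0, T=3, N=2)
frame1 = demand_matrix(events, lengths, frame=1, T=3, N=2)
```

Packet p contributes nothing to frame 0 and all three of its cells to frame 1, which is why a row or column can exceed 2T by the cells of packets that started earlier.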

Page 38: Packet-Mode Emulation of Output-Queued Switches

Intuition for Emulation Algorithms

Packet Mode CIOQ

Packet Mode OQ

Cell Mode CIOQ w/ S=2

Two sub-steps:
1. Framing
2. Contiguous Decomposition

Page 39: Packet-Mode Emulation of Output-Queued Switches

Decomposing the Demand Matrix

Challenge: Decompose the matrix into permutations while maintaining contiguous packet delivery.
Each permutation dictates a scheduling decision.

First try: optimal Birkhoff-von Neumann decomposition results in 2T+N(Lmax-1) permutations.

Example: the demand matrix

3 0 1 2
1 2 2 1
2 2 2 0
0 2 1 3

is the sum of six permutation matrices:

0 0 1 0    1 0 0 0    1 0 0 0
0 1 0 0    0 0 1 0    0 1 0 0
1 0 0 0    0 1 0 0    0 0 1 0
0 0 0 1    0 0 0 1    0 0 0 1

0 0 0 1    0 0 0 1    1 0 0 0
0 0 1 0    1 0 0 0    0 0 0 1
1 0 0 0    0 1 0 0    0 0 1 0
0 1 0 0    0 0 1 0    0 1 0 0
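A Birkhoff-von Neumann style decomposition of an integer matrix can be sketched by repeatedly extracting a perfect matching on the positive entries. This is an illustrative sketch, not the authors' implementation: it assumes all row and column sums are equal (as in the 4x4 example, where every sum is 6), and the function names are hypothetical.

```python
def perfect_matching(demand):
    """Perfect matching on the positive entries (augmenting paths), or None."""
    n = len(demand)
    row_of_col = [-1] * n

    def augment(i, seen):
        for j in range(n):
            if demand[i][j] > 0 and not seen[j]:
                seen[j] = True
                if row_of_col[j] == -1 or augment(row_of_col[j], seen):
                    row_of_col[j] = i
                    return True
        return False

    for i in range(n):
        if not augment(i, [False] * n):
            return None
    return [row_of_col.index(i) for i in range(n)]   # perm[i] = column of row i

def bvn_decompose(demand):
    """Split a matrix with equal row/column sums into permutations."""
    demand = [row[:] for row in demand]
    perms = []
    while any(any(row) for row in demand):
        perm = perfect_matching(demand)
        if perm is None:
            raise ValueError("row and column sums must all be equal")
        for i, j in enumerate(perm):
            demand[i][j] -= 1
        perms.append(perm)
    return perms

D = [[3, 0, 1, 2], [1, 2, 2, 1], [2, 2, 2, 0], [0, 2, 1, 3]]
perms = bvn_decompose(D)
rebuilt = [[sum(1 for p in perms if p[i] == j) for j in range(4)] for i in range(4)]
```

The number of permutations equals the common row/column sum, but nothing in this procedure keeps the cells of one packet in consecutive permutations, which is the problem the contiguous decomposition addresses.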

Page 40: Packet-Mode Emulation of Output-Queued Switches

Contiguous Greedy Decomposition

To maintain contiguous packet delivery:
  If (i,j) was matched in iteration t-1 and there are more (i,j) cells to schedule, keep the match for iteration t.
  Find a greedy matching for the rest of the matrix.

Speedup: S = 4 + N(Lmax-1)/T
RQD: 2T+Lmax-1
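The two rules above can be sketched as a loop over iterations. This is a minimal illustration under assumptions: the greedy step scans inputs and outputs in index order (the slides do not specify an order), and the function name `contiguous_greedy` is hypothetical.

```python
def contiguous_greedy(demand):
    """Decompose a demand matrix into matchings, keeping each (i, j) pair
    matched in consecutive iterations until its demand is exhausted."""
    n = len(demand)
    demand = [row[:] for row in demand]
    schedule, prev = [], {}
    while any(any(row) for row in demand):
        match, used_outputs = {}, set()
        # Rule 1: keep pairs from the previous iteration that still have cells.
        for i, j in prev.items():
            if demand[i][j] > 0:
                match[i] = j
                used_outputs.add(j)
        # Rule 2: greedy (maximal) matching over the remaining ports.
        for i in range(n):
            if i in match:
                continue
            for j in range(n):
                if j not in used_outputs and demand[i][j] > 0:
                    match[i] = j
                    used_outputs.add(j)
                    break
        for i, j in match.items():
            demand[i][j] -= 1
        schedule.append(match)
        prev = match
    return schedule

D = [[3, 0, 1, 2], [1, 2, 2, 1], [2, 2, 2, 0], [0, 2, 1, 3]]
schedule = contiguous_greedy(D)
served = [[sum(1 for m in schedule if m.get(i) == j) for j in range(4)] for i in range(4)]
slots = {(i, j): [t for t, m in enumerate(schedule) if m.get(i) == j]
         for i in range(4) for j in range(4)}
```

Because the matching is only maximal, it can need up to roughly twice as many iterations as the largest row or column sum, which is consistent with the speedup of about 4 instead of 2.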

Page 41: Packet-Mode Emulation of Output-Queued Switches

Our Results: Speedup-RQD tradeoff

[Figure: speedup-RQD tradeoff plot; axes: Speedup, RQD; axis marks: 2, 4, 2Lmax]

Result marked on the plot: S = 4 + N(Lmax-1)/T, RQD = 2T+Lmax-1

Next…

Page 42: Packet-Mode Emulation of Output-Queued Switches

Intuition for Emulation Algorithms

Packet Mode CIOQ

Packet Mode OQ

Cell Mode CIOQ w/ S=2

Two sub-steps:
1. Framing
2. Contiguous Decomposition

Page 43: Packet-Mode Emulation of Output-Queued Switches

Emulation w/ S≈2 - Framing

Keep a separate demand matrix for every possible packet size.

Example: possible packet sizes are 3, 4, 6.

# of size 3 packets:
13  0  4  7
11  5  3 10
 2  9 21  0
 0  2 10 13

# of size 4 packets:
 1  8 15 10
 0  1  5  0
 5 10  1  9
 6  7  4 12

# of size 6 packets:
 1 10  4  0
 8  6  1 10
15  1  5  7
 0  2  3  1

Page 44: Packet-Mode Emulation of Output-Queued Switches

Emulation w/ S≈2 - Framing

Concatenate packets of the same size into mega-packets of size k=LCM(1,…,Lmax).
Leftover matrix for each size m.

size 3:
13  0  4  7
11  5  3 10
 2  9 21  0
 0  2 10 13

size 4:
 1  8 15 10
 0  1  5  0
 5 10  1  9
 6  7  4 12

size 6:
 1 10  4  0
 8  6  1 10
15  1  5  7
 0  2  3  1

Mega Packets (of size 12):
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0

Page 46: Packet-Mode Emulation of Output-Queued Switches

Emulation w/ S≈2 - Framing

Concatenate packets of the same size into mega-packets of size k=LCM(1,…,Lmax).
Leftover matrix for each size m.

size 3 (leftovers):
1 0 0 3
3 1 3 2
2 1 1 0
0 2 2 1

size 4:
 1  8 15 10
 0  1  5  0
 5 10  1  9
 6  7  4 12

size 6:
 1 10  4  0
 8  6  1 10
15  1  5  7
 0  2  3  1

Mega Packets (of size 12):
3 0 1 1
2 1 0 2
0 2 5 0
0 0 2 3

Page 47: Packet-Mode Emulation of Output-Queued Switches

Emulation w/ S≈2 - Framing

Concatenate packets of the same size into mega-packets of size k=LCM(1,…,Lmax).
Leftover matrix for each size m.

size 3 (leftovers):
1 0 0 3
3 1 3 2
2 1 1 0
0 2 2 1

size 4 (leftovers):
1 2 0 1
0 1 2 0
2 1 1 0
0 1 1 0

size 6:
 1 10  4  0
 8  6  1 10
15  1  5  7
 0  2  3  1

Mega Packets (of size 12):
3 2 6 4
2 1 1 2
1 5 5 3
2 2 3 7

Page 48: Packet-Mode Emulation of Output-Queued Switches

Emulation w/ S≈2 - Framing

Concatenate packets of the same size into mega-packets of size k=LCM(1,…,Lmax).
Leftover matrix for each size m.

size 3 (leftovers):
1 0 0 3
3 1 3 2
2 1 1 0
0 2 2 1

size 4 (leftovers):
1 2 0 1
0 1 2 0
2 1 1 0
0 1 1 0

size 6 (leftovers):
1 0 0 0
0 0 1 0
1 1 1 1
0 0 1 1

Mega Packets (of size 12):
3 7 8 4
6 4 1 7
8 5 7 6
2 3 4 7
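The framing step in this example amounts to an integer division per matrix entry: k/m packets of size m fill one mega-packet, and the remainder stays in the leftover matrix. The sketch below reproduces the example numbers; the packet-count matrices are this editor's reading of the slide's 4x4 example, and the function name `split_mega` is hypothetical.

```python
def split_mega(counts, m, k):
    """Split a matrix of size-m packet counts into mega-packets and leftovers."""
    per_mega = k // m                      # packets of size m per mega-packet
    mega = [[c // per_mega for c in row] for row in counts]
    leftover = [[c % per_mega for c in row] for row in counts]
    return mega, leftover

K = 12  # mega-packet size used in the example (sizes 3, 4 and 6 all divide it)
COUNTS = {
    3: [[13, 0, 4, 7], [11, 5, 3, 10], [2, 9, 21, 0], [0, 2, 10, 13]],
    4: [[1, 8, 15, 10], [0, 1, 5, 0], [5, 10, 1, 9], [6, 7, 4, 12]],
    6: [[1, 10, 4, 0], [8, 6, 1, 10], [15, 1, 5, 7], [0, 2, 3, 1]],
}

total_mega = [[0] * 4 for _ in range(4)]
leftovers = {}
for m, counts in COUNTS.items():
    mega, leftovers[m] = split_mega(counts, m, K)
    for i in range(4):
        for j in range(4):
            total_mega[i][j] += mega[i][j]
```

Every leftover entry is smaller than k/m by construction, which is what bounds the leftover matrices independently of the frame size T.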

Page 49: Packet-Mode Emulation of Output-Queued Switches

Emulation w/ S≈2 - Framing

Sum of each row/column is bounded:
  For the mega-packets matrix: ≤ (2T+N(Lmax-1))/k
  For each leftover matrix of size m: ≤ N(k-1)/m

size 3 (leftovers), each entry < 12/3:
1 0 0 3
3 1 3 2
2 1 1 0
0 2 2 1

size 4 (leftovers), each entry < 12/4:
1 2 0 1
0 1 2 0
2 1 1 0
0 1 1 0

size 6 (leftovers), each entry < 12/6:
1 0 0 0
0 0 1 0
1 1 1 1
0 0 1 1

Mega Packets (of size 12):
3 7 8 4
6 4 1 7
8 5 7 6
2 3 4 7

Page 50: Packet-Mode Emulation of Output-Queued Switches

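Emulation w/ S≈2 - Decomposition

Optimally decompose (w/ Birkhoff-von Neumann) the mega-packets matrix and then the leftover matrices.

$$
\begin{aligned}
S = \frac{\#\,\text{permutations}}{T}
  &\le \frac{1}{T}\left[\, k \cdot \frac{2T + N(L_{\max}-1)}{k}
      + \sum_{m=1}^{L_{\max}} m \cdot \frac{N(k-1)}{m} \right] \\
  &= \frac{1}{T}\left[\, 2T + N(L_{\max}-1) + L_{\max} N (k-1) \right] \\
  &= 2 + \frac{N(k L_{\max}-1)}{T}
\end{aligned}
$$

The term (2T+N(Lmax-1))/k is the bound on the mega-packets matrix.
The factor k: hold each permutation k times for contiguous (mega-)packet delivery.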

Page 51: Packet-Mode Emulation of Output-Queued Switches

Our Results: Speedup-RQD tradeoff

[Figure: speedup-RQD tradeoff plot; axes: Speedup, RQD; axis marks: 2, 4, 2Lmax]

Results marked on the plot:
S = 4 + N(Lmax-1)/T, RQD = 2T+Lmax-1
S = 2 + N(k·Lmax-1)/T, RQD = 2T+Lmax-1
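To get a feel for the tradeoff, the two speedup expressions and the RQD can be evaluated for concrete parameters. This is a small sketch: the function names are hypothetical, the parameter values are arbitrary, and k is taken as LCM(1,…,Lmax) as stated on the framing slides.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def lcm_upto(n):
    """LCM(1, ..., n), the mega-packet size k."""
    return reduce(lambda a, b: a * b // gcd(a, b), range(1, n + 1), 1)

def speedup_greedy(N, Lmax, T):
    """S = 4 + N(Lmax-1)/T for the contiguous greedy decomposition."""
    return 4 + Fraction(N * (Lmax - 1), T)

def speedup_mega(N, Lmax, T):
    """S = 2 + N(k*Lmax-1)/T for the mega-packet scheme."""
    k = lcm_upto(Lmax)
    return 2 + Fraction(N * (k * Lmax - 1), T)

def rqd(T, Lmax):
    """RQD = 2T + Lmax - 1 for both schemes."""
    return 2 * T + Lmax - 1

N, Lmax = 4, 6
for T in (100, 10_000, 1_000_000):
    print(T, float(speedup_greedy(N, Lmax, T)), float(speedup_mega(N, Lmax, T)), rqd(T, Lmax))
```

Larger frames push the speedups towards 4 and 2 respectively, at the cost of an RQD that grows linearly in T; the mega-packet scheme needs a much larger T before its speedup gets close to 2 because of the factor k.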

Page 52: Packet-Mode Emulation of Output-Queued Switches

Wrap-up

Packet-mode scheduling can be done with the same speedup as cell-mode scheduling, at the price of bounded RQD.

Future work: lower bounds.

Page 53: Packet-Mode Emulation of Output-Queued Switches

Thank You!