outline - web.eecs.umich.edusugih/courses/eecs589/f16/hrf.pdf · outline background input queued...

7
1 Lawrence Yeung Department of Electrical & Electronic Engineering The University of Hong Kong Dec. 1, 2016 Designing the Most Efficient Iterative Scheduling Algorithms for Input- queued Switches 2 Outline Background Input queued switch Iterative scheduling algorithms Highest rank first (HRF) HRF-basic HRF-refined HRF with request coding (HRF-RC) Performance evaluations Conclusions 3 Two Types of Switches N*R N*R N*R R R R Output buffer Output buffer Output buffer Switch Fabric R R R Input buffer Input buffer Input buffer Switch Fabric Scheduler Output-queued switch Input-queued switch 4 Input-queued Switch R R R R R R Input Input Input Switch Fabric 3 3 Scheduler - Maximum size matching - Maximum weight matching VOQ (i,j) 1 2 3

Upload: dangnhu

Post on 19-Jul-2018

221 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Outline - web.eecs.umich.edusugih/courses/eecs589/f16/HRF.pdf · Outline Background Input queued switch ... Thesis, 1995. 11 Rank-based Priority: HRF its N ... DRRM are stable under

1

Lawrence YeungDepartment of Electrical & Electronic Engineering

The University of Hong KongDec. 1, 2016

Designing the Most Efficient Iterative Scheduling Algorithms for Input-

queued Switches

2

Outline

BackgroundInput queued switchIterative scheduling algorithms

Highest rank first (HRF) HRF-basicHRF-refinedHRF with request coding (HRF-RC)

Performance evaluationsConclusions

3

Two Types of Switches

N*R

N*R

N*R

R

R

R

Output buffer

Output buffer

Output bufferSwitch Fabric

R

R

R

Input buffer

Input buffer

Input buffer

Switch Fabric

Scheduler

Output-queued switch Input-queued switch

4

Input-queued Switch

R

R

R

R

R

R

Input

Input

Input

Switch Fabric

3333

2233

Scheduler

- Maximum size matching- Maximum weight matching

VOQ (i,j)

1

2

3

Page 2: Outline - web.eecs.umich.edusugih/courses/eecs589/f16/HRF.pdf · Outline Background Input queued switch ... Thesis, 1995. 11 Rank-based Priority: HRF its N ... DRRM are stable under

5

Iterative Scheduling Algorithms

Maximal size matching (MSM) is simpler as no backtracking on established connections.

Iterative scheduling algorithms are good for finding MSM, and hardware implementation. Each iteration consists of 3 phases:

Request: Inputs send matching requests to outputsGrant: Each output grants at most one requestAccept: Each input accepts at most one grant

1) An iterative MSM algorithm guarantees maximal size matching in N iterations, where N is the switch size.

2) In practice, only a small fixed number of iterations are used.

6

PIM (Parallel Iterative Matching)

1234

1234

I1: Requests

1234

1234

I2: Grant

1234

1234

I3: Accept

Random selection1234

1234

Random selection

1234

1234

#1

#2

1234

1234

1234

1234

Maximum size: 4

Maximal size: 3

A matching is of maximal size if “no input or output is left unnecessarily idle”.

Request: only for VOQ > 0Grant/accept: only for winners

7

iSLIP (iterative RR with slip)

)1: Requests

1234

1234

)3: Accept

1234

1234

1234

1234

)2: Grant

12

3

41

2

3

4

12

3

41

2

3

4

12

3

4

1234

1234

2

15436

31

2

15436

31

iLQF

1234

1234

436

3

436

3

1234

1234

4

6

3

4

6

3

Single-bit request

Multiple-bit request LQF

LQF

Size

Weight

8

DRRM (Dual RR Matching*)

Single request from each inputNot to unnecessarily attract > 1 grants (but ..)A grant is guaranteed to be accepted => 2-phase, simpler

Single-iteration performance comparable to iSLIP-1

)1: Requests

1234

1234

1234

1234

)2: Grant

12

3

4

12

3

41234

1234

1234

1234

)3: Accept)1: Requests

12

3

4

* Yihan Li, Shivendra Panwar and H. Jonathan Chao, “On the Performance of a Dual Round-Robin Switch,” IEEE INFOCOM 2001, vol. 3, pp. 1688-1697, April 2001

Page 3: Outline - web.eecs.umich.edusugih/courses/eecs589/f16/HRF.pdf · Outline Background Input queued switch ... Thesis, 1995. 11 Rank-based Priority: HRF its N ... DRRM are stable under

9

SRR (Synchronous RR*)

Single request from each input based on a global RR (gRR) schedule.Implicit; no local RR arbiters, simpler

Scheduling priority is given topreferred I/O pair first, and longest VOQ next.

Outperforms iSLIP-1 & DRRM under uniform traffic

)1: Requests

1234

1234

1234

1234

)2: Grant)1: Requests

* A. Scicchitano, A. Bianco, P. Giaccone, E. Leonardi and E. Schiattarella, “Distributed scheduling in input queued switches” IEEE ICC 2007, June 2007, Glasgow, Scotland.

LQF

Preferred I/O pairs

10

Iterative Scheduling AlgorithmsNon-weighted matching:

iSLIP* / DRRM / ...Rotating priority via local RR arbitersTDM-like high-load performance

Weighted matching:iLQF* / … Queue-based priority, where the LQF is always served firstBut difficult to implement, and size is limited

Hybrid:SRR | gRR (size) + LQF (weight)What is the right balance between size and weight?

(A minor change can have a big impact on performance!)

Our goal: A single-iteration scheduling algorithm that is simple to implement and better in performance.

* N. McKeown, “Scheduling algorithms for input-queued cell switches,” PhD. Thesis, University of California at Berkeley, 1995.

11

Rank-based Priority: HRF

Each input ranks its NVOQs according to queue size.

N ranks (1 to N)A special rank, R(i,j) = 0, is reserved for empty VOQ Î log (N+1) bits

In arbitration, priority is given to VOQ with the highest rank, i.e. HRFRank-based priority vs queue-based priority

Rank (R(i,j):3241

2103

3210

1032

Input

1

2

3

4

Output

1

2

3

4

Scheduler

12

R(i,j):3241

21

3

312

1

32

HRF-Basic

1234

3241

312

3241

312

213

132

213

132

3231

423

3231

423

211

132

211

132

1234

3425

121

3425

121

231

312

231

312

3213

211

3213

211

432

512

432

512

1234

1234

3425

3425

1234

1234

55

Size = 3, Weight = 10

Size = 1, Weight = 5

iLQF

)1: Requests

1234

1234

11

1

12

1

12

)2: Grant

1234

1234

1

1

1

1

11

)3: Accept

HRF

HRF

LQF

LQF

A grant to a rank 1 request will be accepted for sure.

A grant to a LQF request may NOT be accepted.

Page 4: Outline - web.eecs.umich.edusugih/courses/eecs589/f16/HRF.pdf · Outline Background Input queued switch ... Thesis, 1995. 11 Rank-based Priority: HRF its N ... DRRM are stable under

13

E.g. under uniform traffic

HRF-basic vs iLQF-1Rank-based priority is more effective

BUTPoor high-load performance Multiple-bit requests

HRF-basic

iLQF-1

14

HRF-Refined

gRR (as in SRR): Each input has a distinct preferred output in each slot.Each input prefers each output exactly once in every Nslots.Input i at time slot t, its preferred output j is given by

j = ( i + t ) mod N

Scheduling priority is given topreferred input-output pair first, and highest rank VOQ next.

15

HRF-Refined

Request: If output j is the preferred output and VOQ(i,j) > 0, input i sends 1 to output j and 0 to all others. Otherwise, send R(i,j) to all. Grant: An output grants the request from its preferred input first. If no preferred request, grants the request with the highest rank.Accept: Input accepts the grant from its preferred output. If no preferred grant, accepts the grant with the highest rank.

Note: Rank 0 = “empty”16

E.g. under uniform traffic

HRF-refined vsHRF-basic

High-load performance is improved

HRF-refined vsSRR

HRF + gRRLQF + gRR

HRF-refined

SRR

HRF-basic

Page 5: Outline - web.eecs.umich.edusugih/courses/eecs589/f16/HRF.pdf · Outline Background Input queued switch ... Thesis, 1995. 11 Rank-based Priority: HRF its N ... DRRM are stable under

17

HRF with Request Coding (HRF-RC)

Idea: use the single-bit request (Xt) to indicate the increase or decrease of the VOQ rank

vs “empty” or “non-empty”

Maintaining full-rank info at each input?

HRF-basic: successful VOQs ranked highOur approach: 3 ranks

Multiple-bit request Î single-bit request

18

Request Coding & Decoding

Based on the value of Xt Xt-1

Priority: Lowest Highest

19

Empty

Others

Xt-1 = 1

Empty Others

Xt = 0

Others

Others

Empty

Longest

Empty Others

Xt = 0

Longest

Longest

Empty

Initial state

slot t-1

slot t

Xt-1 = 1 Xt-1 = 1

Xt = 0

Xt-1 = 1

Xt = 0 Xt = 0 Xt = 0

slot t-2

Initial state

Initial state Time

Longest

Xt-1 = 1

Empty Others

Xt = 0 Xt = 0

Others

Xt = 0

Others

Xt = 0

E.g. Xt Xt-1=“01”All possible state changes for Xt Xt-1=“01”

20

HRF-RC Request: If an input’s preferred output is backlogged at slot t, sends Xt = 1 to output j and Xt = 0 to others. Otherwise, using the original RC.Grant: Each output decodes Xt from

its preferred input as an occupancy indicator (VOQ(i,j) = 0 or not), and other inputs using the Xt+1Xt decoding table

Accept: Each input accepts the grant from its preferred output first. Otherwise, accept the grant with the highest rank.

Page 6: Outline - web.eecs.umich.edusugih/courses/eecs589/f16/HRF.pdf · Outline Background Input queued switch ... Thesis, 1995. 11 Rank-based Priority: HRF its N ... DRRM are stable under

21

Properties of HRF-RC

Simple to implement:Three VOQ states/ranks Single-bit requestTwo-bit comparators

HRF-RC is stable if each flow’s arrival rate ≤ 1/N.iSLIP & DRRM are stable under uniform traffic (≤ 1/N).

HRF-RC satisfies the max-min fairness criteria.iSLIP & DRRM ensures no starvation.

Three states:2221

2102

212

1022

1234

0101

110

0101

110

011

100

011

100

)1: Requests

Single-bit requests:

Two-bit comparators:

22

Outline

Iterative scheduling algorithmsHighest rank first (HRF)

HRF-basicHRF-refinedHRF with request coding (HRF-RC)

Performance evaluationsConclusions

23

Uniform64 x 64 switch

* S. Mneimneh, “Match form the first iteration: an iterative switching algorithm for input queued switch,” IEEE/ACM Trans. on Networking, Vol. 16, Issue 1, pp. 206 – 217, Feb. 2008.

HRF-refined

HRF-basic

HRF-RC

*

24

BurstyBurst size = 30 cells

HRF-refined

HRF-basic

HRF-RC

* B. Hu, K. L. Yeung, Q. Zhou and C. He, “On Iterative Scheduling for Input-queued Switches with a Speedup of 2-1/N,” Accepted by IEEE/ACM Transactions on Networking, Feb. 2016.

*

Page 7: Outline - web.eecs.umich.edusugih/courses/eecs589/f16/HRF.pdf · Outline Background Input queued switch ... Thesis, 1995. 11 Rank-based Priority: HRF its N ... DRRM are stable under

25

“Output” HotspotEach input has a distinct hotspot output.

HRF-refined

HRF-basic

HRF-RC

26

“Input” HotspotInput 1 is always fully loaded.

HRF-refined

HRF-basic

HRF-RC

27

Conclusions

We reviewed existing work on iterative scheduling algorithm design.We proposed a rank-based priority scheme (HRF)We designed a request coding scheme for keeping single-bit request