Techniques for Fast Packet Buffers Sundar Iyer, Ramana Rao, Nick McKeown (sundaes,ramana, nickm)@stanford.edu Departments of Electrical Engineering & Computer Science, Stanford University


High Performance Switching and Routing Telecom Center Workshop: Sept 4, 1997.

Techniques for Fast Packet Buffers

Sundar Iyer, Ramana Rao, Nick McKeown
(sundaes, ramana, nickm)@stanford.edu
Departments of Electrical Engineering & Computer Science, Stanford University


Problem Statement Redefined

Motivation:
To design an extremely high-speed packet buffer architecture with fast access time and large size.

This talk:
Is about the analysis of one such well-known approach.


Characteristics of Packet Buffer Architectures

• The total memory throughput needed is at least 2 × (ingress rate R)

• The size of the buffer is at least R × RTT

• The buffers have one or more FIFOs

• The sequence in which the FIFOs are accessed is determined by an arbiter and is not known a priori
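The first two characteristics are simple arithmetic, sketched below. The function name and the example values (R = 10 Gb/s, RTT = 0.25 s) are assumptions chosen for illustration and do not come from the talk.

```python
def packet_buffer_requirements(ingress_rate_bps, rtt_seconds):
    """Minimum memory throughput and buffer size from the two rules of thumb."""
    return {
        # Every arriving bit is written once and later read once: 2 x R.
        "min_throughput_bps": 2 * ingress_rate_bps,
        # Buffer size rule from the slide: R x RTT.
        "min_buffer_bits": ingress_rate_bps * rtt_seconds,
    }

# Hypothetical example values, not taken from the talk.
req = packet_buffer_requirements(ingress_rate_bps=10e9, rtt_seconds=0.25)
print(req["min_throughput_bps"] / 1e9, "Gb/s of memory throughput")
print(req["min_buffer_bits"] / 1e9, "Gb of buffering")
```

With these assumed numbers the buffer must sustain 20 Gb/s and hold 2.5 Gb, which is why the talk pairs a large DRAM with small, fast SRAM caches.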


Memory Hierarchy of Packet Buffer

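[Figure: Arriving packets enter at rate R into an ingress SRAM (cache of FIFO tails, queues 1 to Q). The memory management algorithm moves b cells at a time into a large DRAM memory with access time T’, and b cells at a time from the DRAM into an egress SRAM (cache of FIFO heads, queues 1 to Q). Departing packets leave at rate R according to the grants issued by the arbiter. Write access time = T = 2T’; read access time = T = 2T’.]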


System Design Parameters

Main Parameters
– SRAM size
– Latency faced by a cell

System Parameters
– I/O bandwidth
– Number of addresses
  • Use a single address on every DRAM
  • Use different addresses on every DRAM
– Use or non-use of DRAM burst mode
– Existence or non-existence of bank conflicts


Today’s Talk…

Optimize Main Parameters
– Minimize latency at the cost of SRAM size
– (Necessity and Sufficiency)

…… (later) Minimize SRAM size at the cost of latency

Assumptions on system parameters
• No speedup on I/O
  – I/O = 2R
• Simple address architecture
  – Use a single address from every DRAM


More Assumptions ...

• We assume that only fixed-size cells of size “C” arrive in the system

• No use of DRAM Burst Mode

• No bank conflicts


Symmetry Argument

• The analysis and working of the ingress and egress buffer architectures are similar

• We shall analyze only the egress buffer architecture


A Bad Case for the Queues … 1

[Figure: snapshots of the queues at t = 0, 1, 2, …, 7; the SRAM width of a queue is marked “w”]


A Bad Case for the Queues … 2

[Figure: snapshots of the queues at t = 8, 9, …, 14 and t = 17]


Observation

• There exists some value of “w” for which the buffer does not overflow

• w = qb is one such sufficient value

• The threshold value “Ti” governs “w”

[Figure: Q queues of width w, with the threshold Ti and b − 1 marked on a queue]


Definitions

• Occupancy
– The number of cells in the SRAM for a particular queue

• Active Queue
– A queue whose occupancy is less than the threshold and which has cells present in the DRAM


One More Definition

• Deficit
– The difference between the threshold “Ti” and the occupancy of an active queue
– For a queue which is not active, the deficit is zero

[Figure: a queue showing its occupancy, its deficit, the threshold Ti, and b − 1]
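The three definitions (occupancy, active queue, deficit) can be written down directly. This is a minimal sketch: the `QueueState` class, its field names, and the example threshold of 6 cells are hypothetical and only illustrate the definitions.

```python
from dataclasses import dataclass

@dataclass
class QueueState:
    sram_cells: int  # occupancy: cells of this queue currently in the SRAM
    dram_cells: int  # cells of this queue still held in the DRAM

def is_active(queue, threshold):
    """Active: occupancy below the threshold and cells present in the DRAM."""
    return queue.sram_cells < threshold and queue.dram_cells > 0

def deficit(queue, threshold):
    """Threshold minus occupancy for an active queue, zero otherwise."""
    return threshold - queue.sram_cells if is_active(queue, threshold) else 0

# Hypothetical threshold and queue states.
THRESHOLD = 6
print(deficit(QueueState(sram_cells=2, dram_cells=10), THRESHOLD))  # active queue
print(deficit(QueueState(sram_cells=2, dram_cells=0), THRESHOLD))   # nothing in DRAM
print(deficit(QueueState(sram_cells=6, dram_cells=10), THRESHOLD))  # at the threshold
```

Only the first queue is active, so it is the only one with a non-zero deficit (4 cells here).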


Can we Bound the Maximum Value of the Deficit?

• Define f(i,q)
– The maximum deficit that a set of “i” queues can have in a system of “q” queues

• We are interested in f(1,q)

• f(q,q) < qb … trivially


Largest Deficit Queue First

Recurrence Equations

• $f(2,q) \ge [f(1,q) - b] + [f(1,q) - b]$
• $f(3,q) \ge [f(2,q) - b] + [f(2,q) - b]/2$
• $f(4,q) \ge [f(3,q) - b] + [f(3,q) - b]/3$
• ……
• $f(q,q) \ge [f(q-1,q) - b] + [f(q-1,q) - b]/(q-1)$


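Dirty Math ..

$$
\begin{aligned}
qb &> f(q,q) \quad \text{(trivially)} \\
&\ge [f(q-1,q) - b] + \frac{f(q-1,q) - b}{q-1} \\
&= f(q-1,q)\,\frac{q}{q-1} - b\,\frac{q}{q-1} \\
&\ge \left\{ f(q-2,q)\,\frac{q-1}{q-2} - b\,\frac{q-1}{q-2} \right\}\frac{q}{q-1} - b\,\frac{q}{q-1} \\
&= f(q-2,q)\,\frac{q}{q-2} - \frac{bq}{q-2} - \frac{bq}{q-1} \\
&\ge f(q-3,q)\,\frac{q}{q-3} - \frac{bq}{q-3} - \frac{bq}{q-2} - \frac{bq}{q-1} \\
&\;\;\vdots \\
&\ge f(1,q)\,\frac{q}{1} - bq \sum_{i=1}^{q-1} \frac{1}{i}
\end{aligned}
$$

• Approximating the sum by ln q, this gives f(1,q) <= b[1 + ln q]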
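As a numerical cross-check, the recurrence f(i,q) >= [f(i-1,q) - b] · i/(i-1) can be inverted to f(i-1,q) <= f(i,q) · (i-1)/i + b and unrolled downward from the trivial bound f(q,q) < qb. The sketch below does this; the function names are hypothetical. Note that the unrolled value equals b[1 + H(q-1)], where H is the harmonic sum, and that H(q-1) is somewhat larger than ln q, so the b[1 + ln q] figure on the slide is best read as an approximation.

```python
import math

def harmonic(n):
    """Harmonic sum 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / i for i in range(1, n + 1))

def deficit_bound(q, b):
    """Unroll f(i-1,q) <= f(i,q)*(i-1)/i + b from i = q down to i = 2."""
    bound = q * b                    # trivial bound on f(q,q)
    for i in range(q, 1, -1):        # i = q, q-1, ..., 2
        bound = bound * (i - 1) / i + b
    return bound                     # bound on f(1,q)

q, b = 8, 2                          # values used on the necessity slide
unrolled = deficit_bound(q, b)
print(f"unrolled recurrence : {unrolled:.3f}")
print(f"b[1 + H(q-1)]       : {b * (1 + harmonic(q - 1)):.3f}")
print(f"b[1 + ln q] (slide) : {b * (1 + math.log(q)):.3f}")
```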


Results

• If the MMA (memory management algorithm)
– services the queue with the largest deficit,
– has a simple address architecture,
– and has no I/O speedup,

• then
– a latency of zero can be guaranteed when the width of the SRAM is b[1 + ln q] + b = b[2 + ln q]
– and the size of the SRAM is [2 + ln q]qb
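The result can be explored with a toy simulation of a largest-deficit-first MMA. Everything about the model below is an assumption made for illustration: deficits are counted as cells missing from each queue's egress cache, the DRAM always has cells for every queue, the MMA completes one refill of b cells every b time slots and only refills a queue whose deficit is at least b, and the arbiter follows a hypothetical pattern that keeps draining the queues that have not yet been replenished. It is a sketch, not the authors' model.

```python
import math

def simulate_ldf(q, b, slots):
    """Toy model: return the largest per-queue deficit observed."""
    deficit = [0] * q            # cells missing from each queue's SRAM cache
    targets = list(range(q))     # queues the hypothetical arbiter keeps draining
    worst = 0
    for slot in range(1, slots + 1):
        # Arbiter: read one cell per slot, cycling over the current target set.
        j = targets[slot % len(targets)]
        deficit[j] += 1
        worst = max(worst, deficit[j])
        if slot % b == 0:
            # MMA: one DRAM access every b slots, largest deficit first.
            k = max(range(q), key=lambda i: deficit[i])
            if deficit[k] >= b:          # refill only when b cells fit
                deficit[k] -= b
                if k in targets:
                    targets.remove(k)    # arbiter abandons replenished queues
                if not targets:
                    targets = list(range(q))
    return worst

q, b = 8, 2
worst = simulate_ldf(q, b, slots=10_000)
bound = b * (2 + math.log(q))
print(f"largest deficit seen: {worst} cells; b[2 + ln q] = {bound:.2f} cells")
```

Swapping in other arbiter patterns is a quick way to see how close a given request sequence gets to the b[2 + ln q] width.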


Necessity Traffic Pattern – b = 2, q = 8

[Figure: snapshots of the queues, each of width w, at t = 0, t = 8, t = 8 + 8/2, and t = 8 + 8/2 + 8/4]


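Necessity Analysis … 1

• In the 1st iteration
– $q\left(\frac{b-1}{b}\right)$ queues with deficit 1

• In the 2nd iteration
– $q\left(\frac{b-1}{b}\right)^2$ queues with deficit 2

• In the xth iteration
– $q\left(\frac{b-1}{b}\right)^x = 1$ queue with deficit x

• $x = \log_{b/(b-1)} q = \dfrac{\ln q}{\ln\left(1 + \frac{1}{b-1}\right)} \approx (b-1)\ln q$, using $\ln(1 + \epsilon) \approx \epsilon$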


Necessity Analysis … 2

• In the xth iteration
– We can remove another “b” cells
– Deficit is $x + b = (b-1)\ln q + b = b\left[1 + \frac{b-1}{b}\ln q\right] \approx b[1 + \ln q]$

• Width of SRAM = b[2 + ln q]
• Size of SRAM = qb[2 + ln q]
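The iteration count in the necessity argument can be checked numerically. The sketch below compares the exact x = log_{b/(b-1)} q with the approximation (b-1) ln q, and also counts iterations directly by shrinking the set of un-replenished queues by a factor (b-1)/b each time. The function names are hypothetical, and b >= 2 is assumed.

```python
import math

def iterations_exact(q, b):
    """x = log base b/(b-1) of q (requires b >= 2)."""
    return math.log(q) / math.log(b / (b - 1))

def iterations_approx(q, b):
    """Approximation from the slide: (b - 1) * ln q."""
    return (b - 1) * math.log(q)

def iterations_counted(q, b):
    """Shrink the un-replenished set by (b-1)/b until one queue is left."""
    remaining, x = q, 0
    while remaining > 1:
        remaining = remaining * (b - 1) / b
        x += 1
    return x

q, b = 8, 2   # values from the traffic-pattern slide
print(iterations_counted(q, b), iterations_exact(q, b), iterations_approx(q, b))
```

For b = 2 and q = 8 the set shrinks 8, 4, 2, 1, so x = 3, while (b-1) ln q is about 2.08. The step ln(1 + ε) ≈ ε is coarse when b is this small and tightens as b grows.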


A Dose of Reality

• Typical values
– “b” is typically <= 10
– q = Np, where
  • N = # of ports (for VOQ)
  • p = number of classes per port

• Implementations
– VOQ
  • N = 32, p = 1, q = 2^5, b = 2^3, SRAM = 700 kb
– Diffserv
  • N = 32, p = 16, q = 2^9, b = 2^3, SRAM = 17 Mb
– Intserv
  • Let’s not think about it!
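The SRAM figures can be roughly reproduced from the size formula qb[2 + ln q]. The sketch below reads the slide's exponents as q = 2^5 or 2^9 (that is, q = Np) and b = 2^3 = 8, and it assumes 64-byte cells. The cell size is not stated on the slide; it is an assumption chosen because it lands close to the quoted 700 kb and 17 Mb.

```python
import math

CELL_BITS = 64 * 8   # assumption: 64-byte cells (not stated in the talk)

def sram_bits(num_ports, classes_per_port, b, cell_bits=CELL_BITS):
    """SRAM size q*b*[2 + ln q] cells, converted to bits, with q = N*p."""
    q = num_ports * classes_per_port
    return q * b * (2 + math.log(q)) * cell_bits

voq = sram_bits(num_ports=32, classes_per_port=1, b=8)
diffserv = sram_bits(num_ports=32, classes_per_port=16, b=8)
print(f"VOQ      : {voq / 1e3:.0f} kb")
print(f"Diffserv : {diffserv / 1e6:.1f} Mb")
```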


Future Work

• Discussion on trading off latency for SRAM size

• Analysis of other parameters
– Relaxing I/O, address constraints

• Implementation pain

• … Still a long way to go