Nick McKeown: Memory for High Performance Internet Routers. Micron, February 12th 2003. Nick McKeown, Professor of Electrical Engineering and Computer Science, Stanford University. www.stanford.edu/~nickm


High Performance Switching and Routing
Telecom Center Workshop: Sept 4, 1997.

Memory for High Performance Internet Routers

Micron
February 12th 2003

Nick McKeown
Professor of Electrical Engineering and Computer Science, Stanford University

www.stanford.edu/~nickm


Ways to get involved

1. Weekly group meetings, talks and papers: http://klamath.stanford.edu

2. Optics in Routers Project: http://klamath.stanford.edu/or

3. Networking classes at Stanford:
• Introduction to Computer Networks: EE284, CS244a, EE384a.
• Packet Switch Architectures: EE384x, EE384y.
• Multimedia Networking: EE384b,c

4. Stanford Network Seminar Series: http://netseminar.stanford.edu

5. Stanford Networking Research Center: http://snrc.stanford.edu


Outline

• Context: High Performance Routers
• Trends and Consequences
• Fast Packet Buffers


What a High Performance Router Looks Like

Cisco GSR 12416: 6ft × 19” × 2ft. Capacity: 160Gb/s. Power: 4.2kW.

Juniper M160: 3ft × 19” × 2.5ft. Capacity: 80Gb/s. Power: 2.6kW.


Points of Presence (POPs)

[Figure: a network of eight POPs (POP1 to POP8) connecting nodes A to F.]


Generic Router Architecture

[Figure: a packet (Data, Hdr) passes through Header Processing (Lookup IP Address, Update Header) and is then queued (Queue Packet).]

Address Table (IP Address to Next Hop): ~1M prefixes, off-chip DRAM.

Buffer Memory: ~1M packets, off-chip DRAM.


Generic Router Architecture

[Figure: three parallel copies of the datapath. Each has Header Processing (Lookup IP Address, Update Header) with its Address Table, and a Buffer Manager with its Buffer Memory.]


Outline

• Context: High Performance Routers
• Trends and Consequences
  • Routing Tables
  • Network Processors
  • Circuit Switches
  • Bigger Routers
  • Multi-rack Routers
  • Packet Buffers
• Fast Packet Buffers


Trends in Routing Tables

[Figure: number of prefixes (0 to 70,000) versus prefix length (1 to 32). Source: Geoff Huston.]

• 99.5% prefixes are 24-bits or shorter
• Moore’s Law: 2x / 18 months
• IPv6 adoption is extremely slow

Consequences:
1. Whole of IPv4 Address space will fit on one 4Gb DRAM.
2. Whole of IPv4 Address space as table on one 32Gb DRAM.
3. All 24-bit prefixes as a lookup table fit in 10% of a 1Gb DRAM.
4. 1M entry address table already fits in the corner of an ASIC.
5. TCAMs (for IP lookup) have limited life.
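The DRAM sizes in the consequences above can be sanity-checked with a little arithmetic. The sketch below is only one plausible reading: it assumes "Gb" means 2**30 bits, that the 4Gb figure corresponds to one bit per IPv4 address, and that the 32Gb figure corresponds to one byte per address. The helper name `dram_gbits` is hypothetical, not something from the talk.

```python
# Back-of-envelope check of the DRAM sizes quoted on the slide.
# Assumptions (not from the talk): 1 Gb = 2**30 bits, and the entry
# widths below are illustrative guesses at what the slide intends.
GBIT = 2**30


def dram_gbits(entries: int, bits_per_entry: float) -> float:
    """Size in Gb of a direct-indexed table with the given entry width."""
    return entries * bits_per_entry / GBIT


ipv4_addresses = 2**32   # every possible 32-bit address
prefixes_24 = 2**24      # every possible 24-bit prefix

one_bit_each = dram_gbits(ipv4_addresses, 1)    # one bit per address
one_byte_each = dram_gbits(ipv4_addresses, 8)   # one byte per address

# Entry width at which a 2**24-entry table uses 10% of a 1Gb DRAM.
bits_per_24bit_entry = 0.10 * GBIT / prefixes_24

print(one_bit_each, one_byte_each, bits_per_24bit_entry)
```

Under these assumptions the first two results match the 4Gb and 32Gb figures, and the third suggests a lookup-table entry of about six bits per 24-bit prefix.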


Trends in Technology, Routers & Traffic

[Figure: normalized growth since 1980 (log scale, 1 to 1,000,000), plotted from 1980 to 2001.]

• DRAM Random Access Time: 1.1x / 18 months
• Moore’s Law: 2x / 18 months
• Router Capacity: 2.2x / 18 months
• Line Capacity: 2x / 7 months
• User Traffic: 2x / 12 months
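The five trends are quoted over different periods, which makes them hard to compare directly. A minimal sketch that converts each to a per-year factor, assuming simple compounding; the dictionary and function names are illustrative only:

```python
# Growth multipliers and the period (in months) over which each applies,
# as quoted on the slide.
TRENDS = {
    "DRAM random access time": (1.1, 18),
    "Moore's Law": (2.0, 18),
    "Router capacity": (2.2, 18),
    "Line capacity": (2.0, 7),
    "User traffic": (2.0, 12),
}


def annual_factor(multiplier: float, months: float) -> float:
    """Equivalent growth factor per 12 months, assuming compounding."""
    return multiplier ** (12.0 / months)


for name, (mult, months) in TRENDS.items():
    print(f"{name:26s} {annual_factor(mult, months):5.2f}x per year")
```

On this basis line capacity grows fastest and DRAM random access time barely improves, which is the gap the rest of the talk addresses.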


Trends and Consequences

[Figure 1: CPU instructions per minimum length packet (log scale, 1 to 1000), 1996 to 2001.]

[Figure 2: Disparity between traffic and router growth. Normalized growth (0 to 600) of traffic and router capacity, 2003 to 2012, showing a 5-fold disparity.]

Consequences:
1. Packet processing is getting harder, and eventually network processors will be used less for high performance routers.
2. (Much) bigger routers will be developed.
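The "5-fold disparity" is roughly what compounding gives if the chart uses the rates quoted on the previous slide (user traffic 2x / 12 months, router capacity 2.2x / 18 months) over 2003 to 2012. That choice of rates is an assumption about how the chart was drawn, so treat this as a consistency check rather than the author's calculation:

```python
def growth(multiplier: float, months: float, years: float) -> float:
    """Compounded growth after `years`, given `multiplier` per `months`."""
    return multiplier ** (12.0 * years / months)


years = 2012 - 2003                  # span of the chart's x-axis
traffic = growth(2.0, 12, years)     # user traffic: 2x / 12 months
router = growth(2.2, 18, years)      # router capacity: 2.2x / 18 months
disparity = traffic / router

print(f"traffic x{traffic:.0f}, router capacity x{router:.0f}, "
      f"disparity x{disparity:.1f}")
```

Traffic grows by about 512x, which also fits the chart's 0 to 600 axis, while router capacity grows by a little over 100x.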


Trends and Consequences (2)

[Figure 3: Router power in kW (0 to 6, approximate), 1990 to 2002. Power consumption will exceed POP limits.]

[Figure 4: Disparity between line-rate and memory access time. Normalized growth rate (log scale, 1 to 1,000,000), 1980 to 1998.]

Consequences:
3. Multi-rack routers will spread power over multiple racks.
4. It will get harder to build packet buffers for linecards…


Outline

• Context: High Performance Routers
• Trends and Consequences
• Fast Packet Buffers
  • Work with Sundar Iyer (PhD Student)
  • Problem of big, fast memories
  • Hybrid SRAM-DRAM
  • How big does the SRAM need to be?
  • Prototyping


The Problem

All packet switches (e.g. Internet routers, ATM switches) require packet buffers for periods of congestion.

Size: For TCP to work well, the buffers need to hold one RTT (about 0.25s) of data.

Speed: Clearly, the buffer needs to store (retrieve) packets as fast as they arrive (depart).

[Figure: a single Memory written and read at line rate R, and a Memory shared by N inputs and N outputs, each at line rate R.]


An Example: Packet buffers for a 40Gb/s router linecard

[Figure: a Buffer Manager in front of a 10Gbit Buffer Memory. Write rate R: one 40B packet every 8ns. Read rate R: one 40B packet every 8ns.]
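Both numbers in this example follow from the line rate. A short sketch, using the 0.25s RTT given earlier in the talk; the function names are illustrative:

```python
def buffer_bits(line_rate_bps: float, rtt_s: float) -> float:
    """Buffer needed to hold one round-trip time of data at line rate."""
    return line_rate_bps * rtt_s


def packet_time_ns(packet_bytes: int, line_rate_bps: float) -> float:
    """Time in nanoseconds to receive one packet at line rate."""
    return packet_bytes * 8 / line_rate_bps * 1e9


R = 40e9  # 40Gb/s linecard
print(buffer_bits(R, 0.25) / 1e9, "Gbits of buffer")
print(packet_time_ns(40, R), "ns per 40B packet")
```

So the memory must hold 10Gbits and still accept a write and serve a read every 8ns.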


Memory Technology

Use SRAM?
+ Fast enough random access time, but
- Too low density to store 10Gbits of data.

Use DRAM?
+ High density means we can store data, but
- Can’t meet random access time.


Can’t we just use lots of DRAMs in parallel?

[Figure: a Buffer Manager stripes each 320B block across eight Buffer Memories holding bytes 0-39, 40-79, …, 280-319. Write rate R: one 40B packet every 8ns. Read rate R: one 40B packet every 8ns. Read/write 320B every 32ns.]
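The "320B every 32ns" figure follows from striping across eight memories. The sketch below assumes one 40B slice per memory, and that the memory alternates one block write and one block read in the time it takes to collect a block; that alternation is my reading of the diagram, and `striped_access` is a hypothetical helper:

```python
def striped_access(n_memories: int, packet_bytes: int, packet_time_ns: float):
    """Return (block size in bytes, ns between block accesses).

    Assumes each memory holds one packet-sized slice of every block, and
    that one block write plus one block read must fit in the time it
    takes n_memories packets to arrive.
    """
    block_bytes = n_memories * packet_bytes
    block_fill_ns = n_memories * packet_time_ns
    access_interval_ns = block_fill_ns / 2  # one write and one read per fill
    return block_bytes, access_interval_ns


print(striped_access(8, 40, 8.0))
```

Widening the stripe relaxes the per-access time in proportion, which is why the approach looks attractive before multiple queues are considered.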


Works fine if there is only one FIFO

[Figure: arriving 40B packets (write rate R, one every 8ns) are gathered by the Buffer Manager into 320B blocks, written across the Buffer Memory (bytes 0-39, 40-79, …, 280-319), and read back as 320B blocks to depart as 40B packets (read rate R, one every 8ns).]


Works fine if there is only one FIFO

Variable Length Packets

[Figure: the same single-FIFO arrangement, with packets of unknown length (?B) entering and leaving the Buffer Manager while the Buffer Memory (bytes 0-39, 40-79, …, 280-319) is still written and read in 320B blocks.]


In practice, buffer holds many FIFOs

[Figure: the Buffer Memory (bytes 0-39, 40-79, …, 280-319) holds FIFOs 1, 2, …, Q, each made of 320B blocks. The Buffer Manager receives and sends variable-length (?B) packets. Write rate R: one 40B packet every 8ns. Read rate R: one 40B packet every 8ns.]

e.g. In an IP Router, Q might be 200. In an ATM switch, Q might be 10^6.

How can we write multiple variable-length packets into different queues?


Problems

1. A 320B block will contain packets for different queues, which can’t be written to, or read from the same location.

2. If instead a different address is used for each memory, and packets in the 320B block are written to different locations, how do we know the memory will be available for reading when we need to retrieve the packet?


Hybrid Memory Hierarchy

[Figure: arriving packets (rate R) enter a small tail SRAM cache for FIFO tails (queues 1 to Q). Data is written b bytes at a time to a large DRAM memory that holds the body of the FIFOs, and read b bytes at a time into a small head SRAM cache for FIFO heads (queues 1 to Q). An arbiter or scheduler issues requests, and departing packets leave at rate R.]


Some Thoughts

The buffer architecture itself is well known. Usually designed to work OK on average. We would like deterministic guarantees.

1. What is the minimum SRAM needed to guarantee that a byte is always available in SRAM when requested?

2. What algorithm should we use to manage the replenishment of the SRAM “cache” memory?


An Example: Q = 5, w = 9+, b = 6

[Figure: bytes held for each queue at t = 0 through t = 7, with Replenish events marked.]


An Example: Q = 5, w = 9+, b = 6

[Figure: bytes held for each queue at t = 8 through t = 13, then t = 19 and t = 23, with Replenish and Read events marked.]


Theorem

Impatient Arbiter: An SRAM cache of size Qb(2 + ln Q) bytes is sufficient to guarantee a byte is always available when requested. Algorithm is called MDQF (Most Deficit Queue First).

Examples:
1. 40Gb/s linecard, b=640, Q=128: SRAM = 560kBytes
2. 160Gb/s linecard, b=2560, Q=512: SRAM = 10MBytes
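The two example sizes can be reproduced from the formula, and the replenishment rule can be illustrated with a toy simulation. Everything beyond the formula is an assumption made for illustration: the talk does not define "deficit" on these slides, so here it is taken to be bytes requested from a queue minus bytes replenished to it, and every b timeslots the queue with the largest deficit receives b bytes.

```python
import math


def mdqf_sram_bytes(Q: int, b: int) -> float:
    """SRAM size from the theorem: Q * b * (2 + ln Q) bytes."""
    return Q * b * (2 + math.log(Q))


def max_deficit_under_mdqf(Q: int, b: int, requests) -> int:
    """Toy model: one byte is requested per timeslot from queue requests[t].

    Assumed definition (not from the slides): a queue's deficit is bytes
    requested minus bytes replenished. Every b timeslots the queue with
    the largest deficit is replenished with b bytes.
    Returns the largest deficit any single queue ever reached.
    """
    deficit = [0] * Q
    worst = 0
    for t, q in enumerate(requests, start=1):
        deficit[q] += 1
        worst = max(worst, deficit[q])
        if t % b == 0:
            neediest = max(range(Q), key=lambda i: deficit[i])
            deficit[neediest] -= b
    return worst


print(round(mdqf_sram_bytes(128, 640)))    # about 560 kBytes
print(round(mdqf_sram_bytes(512, 2560)))   # about 10.8 MBytes

round_robin = [t % 4 for t in range(400)]
print(max_deficit_under_mdqf(4, 4, round_robin))
```

A round-robin request pattern is benign: no queue falls more than b bytes behind. The theorem's bound covers the worst case over all request patterns, which this toy run does not attempt to find.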


Reducing the size of the SRAM

Intuition: If we use a lookahead buffer to peek at the requests “in advance”, we can replenish the SRAM cache only when needed.

This increases the latency from when a request is made until the byte is available.

But because it is a pipeline, the issue rate is the same.


Theorem

Patient Arbiter: An SRAM cache of size Q(b – 1) bytes is sufficient to guarantee that a requested byte is available within Q(b – 1) + 1 request times. Algorithm is called ECQF (Earliest Critical Queue first).

Example: 160Gb/s linecard, b=2560, Q=512: SRAM = 1.3MBytes, delay bound is 65µs (equivalent to 13 miles of fiber).
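The example can be checked from the two expressions in the theorem. The sketch assumes one "request time" is the time to send one byte at line rate; that interpretation is mine, and it reproduces a delay bound in the tens of microseconds:

```python
def ecqf_sram_bytes(Q: int, b: int) -> int:
    """SRAM size from the theorem: Q * (b - 1) bytes."""
    return Q * (b - 1)


def ecqf_delay_us(Q: int, b: int, line_rate_bps: float) -> float:
    """Delay bound of Q(b - 1) + 1 request times, in microseconds.

    Assumes one request time equals one byte time at line rate.
    """
    request_times = Q * (b - 1) + 1
    byte_time_s = 8 / line_rate_bps
    return request_times * byte_time_s * 1e6


print(ecqf_sram_bytes(512, 2560), "bytes")
print(round(ecqf_delay_us(512, 2560, 160e9), 1), "microseconds")
```

Compared with the impatient case for the same linecard, the lookahead cuts the SRAM from roughly 10MBytes to roughly 1.3MBytes at the cost of this fixed pipeline delay.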


Maximum Deficit Queue First with Latency (MDQFL)

What if application can only tolerate a latency lmax < Q(b – 1) + 1 timeslots?

Algorithm: Maximum Deficit Queue First with latency (MDQFL) services a queue, once every b timeslots in the following order:

1. If there is an earliest critical queue, replenish it.
2. If not, then replenish the queue that will have the most deficit lmax timeslots in the future.
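One way to read the two-step rule is sketched below. It rests on assumptions the slide does not spell out: the lookahead holds the next lmax requests, a queue is "critical" if those requests would drain its cached bytes below zero, and "most deficit" is taken as the lowest projected occupancy, which presumes all queues share the same fill target. The function name is hypothetical.

```python
def choose_queue_mdqfl(occupancy, lookahead):
    """Pick the queue to replenish next.

    occupancy: bytes currently cached for each queue.
    lookahead: queue indices of the next lmax requests, in order.

    Assumptions (not from the slides): a queue is critical if the
    lookahead requests would take its occupancy below zero, and the
    queue with the most deficit is the one with the lowest projected
    occupancy after the lookahead.
    """
    projected = list(occupancy)
    for q in lookahead:
        projected[q] -= 1
        if projected[q] < 0:
            return q  # rule 1: the earliest critical queue
    # Rule 2: no critical queue, so take the most deficited one.
    return min(range(len(projected)), key=lambda i: projected[i])


print(choose_queue_mdqfl([1, 3, 2], [0, 1, 0, 2]))  # queue 0 runs dry first
print(choose_queue_mdqfl([3, 3, 3], [1, 1, 2]))     # no critical queue
```

With a short lookahead rule 2 applies most of the time and the choice resembles MDQF; with a lookahead of Q(b – 1) + 1 timeslots rule 1 applies more often and the choice resembles ECQF.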


Queue Length vs. Latency (Q=1000, b = 10)

[Figure: queue length w plotted against latency, ranging between the queue length for zero latency (MDF) and the queue length for maximum latency (ECQF).]


What’s Next

We plan to prototype a 160Gb/s linecard buffer.

Part of Optics in Routers Project at Stanford: http://klamath.stanford.edu/or

Funding: Cisco, MARCO (US Government-Industry consortium), TI.

Would Micron like to work with us?