![Page 1: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/1.jpg)
Department of Computer and IT Engineering, University of Kurdistan
Computer Networks II: Router Architecture
By: Dr. Alireza Abdollahpouri
![Page 2: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/2.jpg)
What Is Routing and Forwarding?
[Figure: hosts A, B, C, D, E, and F interconnected through routers R1 to R5]
![Page 3: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/3.jpg)
History …
Introduction
![Page 4: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/4.jpg)
History …
And future trends!
Introduction
![Page 5: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/5.jpg)
What a Router Looks Like
Cisco GSR 12416: dimension labels 6 ft, 19 in, 2 ft; capacity 160 Gb/s; power 4.2 kW
Juniper M160: dimension labels 3 ft, 2.5 ft, 19 in; capacity 80 Gb/s; power 2.6 kW
![Page 6: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/6.jpg)
Packet Processing Functions
Basic network system functionality:
• Address lookup
• Packet forwarding and routing
• Fragmentation and re-assembly
• Security
• Queuing
• Scheduling
• Packet classification
• Traffic measurement
• …
![Page 7: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/7.jpg)
Per-packet Processing in a Router
1. Accept packet arriving on an ingress line.
2. Look up packet destination address in the forwarding table, to identify outgoing interface(s).
3. Manipulate packet header: e.g., decrement TTL, update header checksum.
4. Send packet to outgoing interface(s).
5. Queue until line is free.
6. Transmit packet onto outgoing line.
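The per-packet steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the forwarding table contents, the interface names (`eth0`, `eth1`, `eth2`), and the helper names `lookup`, `ipv4_checksum`, and `forward` are hypothetical, the lookup is a linear longest-prefix scan rather than the trie or TCAM a real line card would likely use, and the checksum is recomputed in full rather than updated incrementally.

```python
import ipaddress
from collections import deque

# Hypothetical forwarding table (prefix -> outgoing interface), for illustration only.
FORWARDING_TABLE = {
    ipaddress.ip_network("10.0.0.0/8"): "eth0",
    ipaddress.ip_network("10.1.0.0/16"): "eth1",
    ipaddress.ip_network("0.0.0.0/0"): "eth2",  # default route, so a match always exists
}

def lookup(dst_bytes) -> str:
    """Step 2: longest-prefix match by linear scan over the table."""
    addr = ipaddress.ip_address(bytes(dst_bytes))
    matches = [net for net in FORWARDING_TABLE if addr in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return FORWARDING_TABLE[best]

def ipv4_checksum(header) -> int:
    """One's-complement sum of the header's 16-bit words, complemented."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def forward(header: bytearray, output_queues: dict) -> bool:
    """Steps 2 to 5 for one 20-byte IPv4 header; returns False if dropped."""
    if header[8] <= 1:            # byte 8 is the TTL field
        return False              # TTL expired: drop the packet
    header[8] -= 1                # step 3: decrement TTL
    header[10:12] = b"\x00\x00"   # zero the checksum field before recomputing
    header[10:12] = ipv4_checksum(header).to_bytes(2, "big")
    port = lookup(header[16:20])  # bytes 16..19 hold the destination address
    # Steps 4 and 5: hand the packet to the output interface and queue it.
    output_queues.setdefault(port, deque()).append(bytes(header))
    return True
```

Step 6 (transmit) would then correspond to a `popleft()` on the interface's queue whenever the outgoing line is free.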
![Page 8: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/8.jpg)
Basic Architecture of a Router
Control plane (how routing protocols establish routes, etc.): may be slow, “typically in software”
• Routing
 - Routing table update (OSPF, RIP, IS-IS)
 - Admission control
 - Congestion control
 - Reservation
Data plane (how packets get forwarded; per-packet processing): must be fast, “typically in hardware”
• Routing lookup
• Packet classifier
• Switching
• Arbitration
• Scheduling
![Page 9: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/9.jpg)
Generic Router Architecture
[Figure: packets (header and data) entering and leaving three parallel paths, each with a buffer manager and buffer memory]
![Page 10: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/10.jpg)
Functions in a Packet Switch
[Figure: ingress linecard, interconnect, and egress linecard, with blocks for framing, route lookup, TTL processing, buffering, interconnect scheduling, QoS scheduling, and the control plane; the data path, control path, and scheduling path are marked]
Memory is usually used in multiple ways (DRAM for the packet buffer, SRAM for queues and tables).
![Page 11: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/11.jpg)
Line Card Picture
![Page 12: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/12.jpg)
Major Components of Routers: Interconnect
The interconnect links input ports to output ports. There are three types:
• Bus: all input ports transfer data over the shared bus. Problem: the shared bus often becomes congested.
• Shared memory: input ports write data into the shared memory; after the destination lookup is performed, the output port reads the data from the memory. Problem: requires fast memory read/write and memory-management technology.
• Crossbar: each of the N input ports has a dedicated data path to each of the N output ports, resulting in an N×N switching matrix. Problem: blocking (input, output, head-of-line (HOL)). The maximum switch load for random traffic is about 59%.
[Figures: bus, shared memory, and crossbar interconnects]
![Page 13: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/13.jpg)
Interconnects: Two basic techniques
• Input Queueing
• Output Queueing
Usually a non-blocking switch fabric (e.g. crossbar)
![Page 14: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/14.jpg)
Output Queued (OQ) Switch
How an OQ Switch Works
![Page 15: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/15.jpg)
Input Queueing: Head of Line Blocking
[Figure: delay versus load, with the load axis marked at 58.6% and 100%]
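The 58.6% mark can be reproduced approximately with a small simulation. The sketch below is an illustration, not the slides' own model: it assumes a saturated N×N switch with one FIFO per input, uniformly random destinations, and random arbitration among inputs contending for the same output. The function name and default parameters are hypothetical. For large N the measured throughput should approach 2 − √2 ≈ 0.586; for moderate N it comes out slightly higher.

```python
import random

def hol_saturation_throughput(n_ports=32, n_slots=5000, seed=1):
    """Fraction of output capacity used when every input FIFO is always backlogged."""
    rng = random.Random(seed)
    # heads[i] is the destination of the packet at the head of input i's FIFO.
    heads = [rng.randrange(n_ports) for _ in range(n_ports)]
    delivered = 0
    for _ in range(n_slots):
        # Group inputs by the output their head-of-line packet wants.
        contenders = {}
        for inp, out in enumerate(heads):
            contenders.setdefault(out, []).append(inp)
        # Each requested output serves exactly one contender per time slot.
        for out, inputs in contenders.items():
            winner = rng.choice(inputs)
            # The winner's next packet moves to the head with a fresh destination;
            # the losers stay blocked, and so does everything queued behind them.
            heads[winner] = rng.randrange(n_ports)
            delivered += 1
    return delivered / (n_ports * n_slots)

if __name__ == "__main__":
    print(round(hol_saturation_throughput(), 3))
```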
![Page 16: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/16.jpg)
Head of Line Blocking
![Page 17: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/17.jpg)
![Page 18: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/18.jpg)
![Page 19: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/19.jpg)
Virtual Output Queues (VoQ)
Virtual Output Queues:
• At each input port, there are N queues, each associated with an output port
• Only one packet can go from an input port at a time
• Only one packet can be received by an output port at a time
• It retains the scalability of FIFO input-queued switches
• It eliminates the HoL problem with FIFO input queues
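The VoQ constraints listed above can be sketched as a data structure plus a per-slot scheduler. This is a minimal illustration under assumptions: the class name `VoqSwitch` and its methods are hypothetical, and the scheduler is a simple greedy maximal matching that always scans ports in the same order, which is unfair; practical schedulers rotate priorities.

```python
from collections import deque

class VoqSwitch:
    """N x N input-queued switch with one queue per (input, output) pair."""

    def __init__(self, n_ports):
        self.n = n_ports
        # voq[i][j] holds packets waiting at input i for output j.
        self.voq = [[deque() for _ in range(n_ports)] for _ in range(n_ports)]

    def enqueue(self, inp, out, packet):
        self.voq[inp][out].append(packet)

    def schedule_slot(self):
        """Pick transfers for one time slot.

        Each input sends at most one packet and each output receives at
        most one packet, matching the constraints in the list above.
        """
        busy_outputs = set()
        transfers = []
        for inp in range(self.n):
            for out in range(self.n):
                if out not in busy_outputs and self.voq[inp][out]:
                    transfers.append((inp, out, self.voq[inp][out].popleft()))
                    busy_outputs.add(out)
                    break  # this input is done for the current slot
        return transfers
```

With a single FIFO per input, a packet for a free output can be stuck behind a packet for a busy output; here it sits in its own queue, so the scheduler can still select it.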
![Page 20: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/20.jpg)
Input Queueing: Virtual output queues
![Page 21: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/21.jpg)
Input Queueing: Virtual output queues
[Figure: delay versus load, with the load axis marked at 100%]
![Page 22: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/22.jpg)
The Evolution of Router Architecture
First Generation Routers
Modern Routers
![Page 23: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/23.jpg)
First Generation Routers
Bus-based Router Architectures with Single Processor
[Figure: a CPU with route table and buffer memory, and three line interfaces (each with a MAC), all attached to a shared backplane]
![Page 24: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/24.jpg)
First Generation Routers
Based on software implementations on a single CPU.
Limitations:
• Serious processing bottleneck in the central processor
• Memory-intensive operations (e.g. table lookups and data movements) limit the effectiveness of processor power
![Page 25: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/25.jpg)
Second Generation Routers
Bus-based Router Architectures with Multiple Processors
[Figure: a CPU with route table and buffer memory, and three line cards (each with MAC, buffer memory, and forwarding cache), connected by a bus]
![Page 26: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/26.jpg)
Second Generation Routers
Architectures with route caching
• Distribute packet forwarding operations
 – Network interface cards
 – Processors
 – Route caches
• Packets are transmitted once over the shared bus
Limitations:
• The central routing table is a bottleneck at high speeds
• Traffic-dependent throughput (cache)
• Shared bus is still a bottleneck
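Why route caching makes throughput traffic-dependent can be seen with a small sketch. The `ForwardingCache` class, its LRU eviction policy, and the `slow_lookup` callback standing in for the central route table are all assumptions made for illustration; the slides do not specify how the caches are organized.

```python
from collections import OrderedDict

class ForwardingCache:
    """Per-line-card route cache with LRU eviction (illustrative only)."""

    def __init__(self, capacity, slow_lookup):
        self.capacity = capacity
        self.slow_lookup = slow_lookup  # hypothetical query to the central route table
        self.entries = OrderedDict()    # destination -> output port, oldest first
        self.hits = 0
        self.misses = 0

    def lookup(self, dst):
        if dst in self.entries:
            self.entries.move_to_end(dst)  # mark as most recently used
            self.hits += 1
            return self.entries[dst]
        # Miss: fall back to the central table, which is the slow path.
        self.misses += 1
        port = self.slow_lookup(dst)
        self.entries[dst] = port
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
        return port
```

Traffic with locality (repeated destinations) mostly hits the cache, while traffic cycling through more destinations than the cache holds misses on every packet and pushes all lookups back to the central table.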
![Page 27: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/27.jpg)
Third Generation Routers
Switch-based Architectures with Fully Distributed Processors
[Figure: line cards (each with MAC, local buffer memory, and forwarding table) and a CPU card (CPU, memory, routing table), connected by a switched backplane]
![Page 28: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/28.jpg)
Third Generation Routers
To avoid bottlenecks:
• Processing power
• Memory bandwidth
• Internal bus bandwidth
Each network interface is equipped with appropriate processing power and buffer space.
Data vs. control plane
• Data plane: line cards
• Control plane: processor
![Page 29: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/29.jpg)
Fourth Generation Routers/Switches
Optics inside a router for the first time
[Figure: switch core and linecards connected by optical links, 100s of metres long]
0.3 - 10 Tb/s routers in development
![Page 30: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/30.jpg)
Demand for More Powerful Routers
Do we still need higher processing power in networking devices?
Of course, YES. But why? And how?
![Page 31: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/31.jpg)
Demands for Faster Routers (why?)
Beyond Moore’s law
[Figure: processing complexity, from hundreds of instructions per packet to thousands of instructions per packet, across Layer 2 switching, IPv4 routing, flow classification, encryption, and intrusion detection]
Packet inter-arrival time (for 40 Gbps):
• Big packet: 300 ns
• Small packet: 12 ns
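The inter-arrival figures follow from packet size divided by link rate. The short sketch below reproduces them under the assumption that a "big" packet is 1500 bytes and a "small" packet is 60 bytes; those sizes, the function names, and the 3 GHz core used for the cycle budget are illustrative choices, not values stated on the slide.

```python
def inter_arrival_ns(packet_bytes, link_gbps):
    """Time between back-to-back packets; 1 Gb/s equals 1 bit per nanosecond."""
    return packet_bytes * 8 / link_gbps

def cycles_per_packet(packet_bytes, link_gbps, core_ghz):
    """Clock cycles a single core has per packet at line rate (assumed core speed)."""
    return inter_arrival_ns(packet_bytes, link_gbps) * core_ghz

if __name__ == "__main__":
    print(inter_arrival_ns(1500, 40))      # 300.0 ns for an assumed 1500-byte packet
    print(inter_arrival_ns(60, 40))        # 12.0 ns for an assumed 60-byte packet
    print(cycles_per_packet(60, 40, 3.0))  # 36.0 cycles on an assumed 3 GHz core
```

A few dozen cycles per small packet is far below the hundreds to thousands of instructions per packet on the complexity axis, which is the gap a single core cannot close.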
![Page 32: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/32.jpg)
Future applications will demand TIPS (tera-instructions per second)
Demands for Faster Routers (why?)
![Page 33: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/33.jpg)
Future applications will demand TIPS Power? Heat?
Demands for Faster Routers (why?)
![Page 34: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/34.jpg)
Demands for Faster Routers (summary)
Technology push:
- Link bandwidth is scaling much faster than CPU and memory technology
- Transistor scaling and VLSI technology help, but not enough
Application pull:
- More complex applications are required
- Processing complexity is defined as the number of instructions and the number of memory accesses needed to process one packet
![Page 35: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/35.jpg)
“Future applications will demand TIPS”
“Think platform beyond a single processor”
“Exploit concurrency at multiple levels”
“Power will be the limiter due to complexity and leakage”
Distribute workload on multiple cores
Demands for faster routers (How?)
![Page 36: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/36.jpg)
Symmetric multi-processors allow multi-threaded applications to achieve higher performance at less die area and power consumption than single-core processors
Asymmetric multi-processors consume power and provide increased computational power only on demand
Multi-Core Processors
![Page 37: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/37.jpg)
Performance Bottlenecks
• Memory
 – Bandwidth available, but access time too slow
 – Increasing delay for off-chip memory
• I/O
 – High-speed interfaces available
 – Cost problem with optical interfaces
• Internal bus
 – Can be solved with an effective switch, allowing simultaneous transfers between network interfaces
• Processing power
 – Individual cores are getting more complex
 – Problems with access to shared resources
 – Control processor can become a bottleneck
![Page 38: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/38.jpg)
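Different Solutions
• ASIC (application-specific integrated circuit)
• FPGA (field-programmable gate array)
• NP (network processor)
• GPP (general-purpose processor)
[Figure: flexibility versus performance for ASIC, FPGA, NP, and GPP]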
![Page 39: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/39.jpg)
By: Niraj Shah
Different Solutions
![Page 40: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/40.jpg)
“It is always something. (corollary) Good, Fast, Cheap: Pick any two (you can’t have all three).”
RFC 1925, “The Twelve Networking Truths”
![Page 41: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/41.jpg)
Why not ASIC?
• High cost to develop
 – Network processing is a moderate-quantity market
• Long time to market
 – Network processing has quickly changing services
• Difficult to simulate
 – Complex protocols
• Expensive and time-consuming to change
 – Little reuse across products
 – Limited reuse across versions
• No consensus on framework or supporting chips
• Requires expertise
![Page 42: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/42.jpg)
Network Processors
• Introduced several years ago (1999+)
• A way to introduce flexibility and programmability in network processing
• Many players were there (Intel, Motorola, IBM)
• Only a few players still there
![Page 43: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/43.jpg)
Intel IXP 2800
Initial release: August 2003
![Page 44: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/44.jpg)
What Was Correct With NPs?
CPU-level flexibility
– A giant step forward compared to ASICs
How?
– Hardware coprocessors
– Memory hierarchies
– Multiple hardware threads (zero context-switching overhead)
– Narrow (and multiple) memory buses
– Some other ad-hoc solutions for network processing, e.g., fast switching fabric, memory accesses, etc.
![Page 45: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/45.jpg)
What Was Wrong With NPs?
Programmability issues
– Completely new programming paradigm
– Developers are not familiar with the unprecedented parallelism of the NPU and do not know how to exploit it fully
– New (proprietary) languages
– Portability across different network processor families
![Page 46: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/46.jpg)
What Happened in NP Market?
Intel went out of the market in 2007
Many other small players disappeared
High risk when selecting an NP maker that may disappear
![Page 47: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/47.jpg)
“Every old idea will be proposed again with a different name and a different presentation, regardless of whether it works.”
RFC 1925, “The Twelve Networking Truths”
![Page 48: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/48.jpg)
Software Routers
Processing in General-purpose CPUs
CPUs optimized for few threads, high performance per thread
– High CPU frequencies
– Maximize instruction-level parallelism
 • Pipeline
 • Superscalar
 • Out-of-order execution
 • Branch prediction
 • Speculative loads
![Page 49: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/49.jpg)
Software Routers
Aim: low cost, flexibility, and extensibility
Linux on a PC with a bunch of NICs
Changing functionality is as simple as a software upgrade
![Page 50: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/50.jpg)
Software Routers (examples)
• RouteBricks [SOSP’09]: uses the Intel Nehalem architecture
• PacketShader [SIGCOMM’10]: GPU-accelerated; developed at KAIST, Korea
![Page 51: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/51.jpg)
Intel Nehalem Architecture
[Figure: four cores (C0 to C3) sharing a common L3 cache]
![Page 52: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/52.jpg)
Intel Nehalem Architecture
• NUMA architecture: the latency to access local memory is approximately 65 nanoseconds; the latency to access remote memory is approximately 105 nanoseconds
• The bandwidth of the QPI link is 12.8 GB/s
• Three DDR3 channels to local DRAM support a bandwidth of 31.992 GB/s
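The figures above can be related with a little arithmetic. The sketch below is a hedged illustration: the 31.992 GB/s value is consistent with three channels at 1333 mega-transfers per second and 8 bytes per transfer, but that channel speed is an assumption, since the slide does not state it. The function names and the remote-access fraction are likewise hypothetical.

```python
LOCAL_NS = 65.0    # approximate local-memory latency from the slide
REMOTE_NS = 105.0  # approximate remote-memory latency from the slide

def average_latency_ns(remote_fraction):
    """Expected memory latency when a given fraction of accesses goes to the remote socket."""
    return (1.0 - remote_fraction) * LOCAL_NS + remote_fraction * REMOTE_NS

def dram_bandwidth_gb_s(channels=3, mega_transfers_per_s=1333, bytes_per_transfer=8):
    """Peak DRAM bandwidth; the 1333 MT/s channel speed is an assumed value."""
    return channels * mega_transfers_per_s * bytes_per_transfer / 1000.0

if __name__ == "__main__":
    print(dram_bandwidth_gb_s())    # close to 31.992 GB/s
    print(average_latency_ns(0.5))  # 85 ns if half the accesses are remote
```

For a software router this suggests keeping packet buffers in the memory local to the core that processes them, since each remote access costs roughly 40 ns more.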
![Page 53: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/53.jpg)
Intel Nehalem Architecture
[Figure: Nehalem quad-core: cores 0 to 3, each with L1-I and L1-D caches and an L2 cache; a shared L3 cache; an integrated memory controller (IMC) with 3 DRAM channels; two QPI links (QPI1, QPI2); power and clock; an I/O controller hub with PCI slots; a PCI bus to the network card and disk; and a software stack of application, file system, and communication system]
![Page 54: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/54.jpg)
Other Possible Platforms
• Intel Westmere-EP
• Intel Jasper Forest
![Page 55: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/55.jpg)
Workload Partitioning (parallelization)
• Pipeline
• Parallel
• Hybrid
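One way to read the three partitioning options is as different mappings from packets and processing stages to cores. The sketch below is only an interpretation: the function names, the use of a CRC32 flow hash, and the core numbering are assumptions for illustration and are not taken from the slides.

```python
import zlib

def parallel_core(flow_key: bytes, n_cores: int) -> int:
    """Parallel: every core runs all stages; a flow hash picks the core.

    Hashing on the flow keeps packets of one flow on one core, which
    avoids reordering within the flow.
    """
    return zlib.crc32(flow_key) % n_cores

def pipeline_core(stage_index: int, n_cores: int) -> int:
    """Pipeline: each core runs one stage; a packet visits the cores in order."""
    return stage_index % n_cores

def hybrid_core(flow_key: bytes, stage_index: int,
                n_pipelines: int, n_stages: int) -> int:
    """Hybrid: several pipelines side by side.

    The flow hash selects a pipeline and the stage selects the core inside it.
    """
    pipeline = zlib.crc32(flow_key) % n_pipelines
    return pipeline * n_stages + stage_index
```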
![Page 56: Department of Computer and IT Engineering University of Kurdistan Computer Networks II](https://reader031.vdocument.in/reader031/viewer/2022013004/56812fbf550346895d954183/html5/thumbnails/56.jpg)
Questions!