An Architecture and Prototype Implementation for TCP/IP Hardware Support
Mirko Benz
Dresden University of Technology, Germany
TERENA 2001
Motivation – Demanding Services
[Figure: Required Processing Power versus Application Complexity for Routing, Switching, Quality of Service Support and Internet Security Provision]
• 1000+ Instructions per Packet
Motivation – MIPS versus Bandwidth Trend
[Figure: MIPS Performance / Bandwidth versus Technological Progress / Time; curves: Processor Performance Evolution (~100% / 18 months), Available Bandwidth]
Hardware Support for Protocol Processing Acceleration
Case Study: TCP
• Assumptions & Preconditions
- Restriction to Local Area Networks (e.g. Gigabit Ethernet)
- High Bandwidth and Low Error Probability
- Concentration on Host Implementations
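A rough back-of-the-envelope sketch of the processing demand behind these assumptions. It combines the 1000 instructions per packet figure from the motivation slide with a Gigabit Ethernet link. The frame sizes (1500 and 64 byte) and the function name `required_mips` are illustrative assumptions, and Ethernet framing overhead (preamble, inter-frame gap) is ignored.

```python
def required_mips(link_bit_rate, frame_bytes, instructions_per_packet):
    """Estimate the MIPS needed to keep up with a saturated link.

    Assumes back-to-back frames of a fixed size and ignores framing
    overhead, so the result is an upper-bound style estimate only.
    """
    packets_per_second = link_bit_rate / (frame_bytes * 8)
    return packets_per_second * instructions_per_packet / 1e6


if __name__ == "__main__":
    for frame_bytes in (1500, 64):  # assumed frame sizes
        mips = required_mips(1e9, frame_bytes, 1000)
        print(f"{frame_bytes:5d} byte frames: {mips:8.1f} MIPS")
```

Under these assumptions, full-size frames need roughly 83 MIPS and minimum-size frames roughly 1950 MIPS for protocol processing alone.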
Project Overview
[Diagram labels: Protocol Analysis, TCP/IP Partitioning, System Simulation, Evaluation / Optimisation, Efficient OS Integration, Prototype Variants, Flexible Protocol Engine, Domain Specific Methodology]
Talk Outline
TCP Protocol Performance Evaluation
TCP Acceleration Approach
System Simulation Environment
Operating System Integration
Hardware Implementation Directions
Myrinet Implementation and Results
Conclusions and Outlook
TCP Protocol Performance Evaluation
• TCP Software Implementation Structure
- Layers (top to bottom): Application, Socket, TCP, IP, Driver, Network
• Sources of Protocol Processing Overhead
- Communication, Synchronisation
- Operating System Call Overhead
- Copy Operations
- Classification: Per-Byte / Per-Packet
• Optimisation Opportunities
- Interrupt Suppression
- Zero Copy Mechanisms
- User Level Networking
- Checksum Offloading (e.g. Task Offload)
- Extending Frame Sizes (e.g. Jumbo Frames)
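Checksumming is a typical per-byte cost in the classification above and the target of checksum offloading. As an illustration, here is a minimal sketch of the standard Internet checksum (RFC 1071) that TCP and IP use; the function name `internet_checksum` is illustrative, and a real stack would also cover the TCP pseudo-header.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's complement Internet checksum (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    # Per-byte work: every 16-bit word of the payload is touched once.
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold carries back into the low 16 bits (end-around carry).
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


if __name__ == "__main__":
    sample = bytes.fromhex("0001f203f4f5f6f7")  # example words from RFC 1071
    print(hex(internet_checksum(sample)))
```

A receiver can verify a segment by running the same routine over the data including the transmitted checksum; a result of zero means no error was detected.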
TCP Protocol Performance Evaluation
• Performance TCP versus Myrinet GM:
- Throughput 335/967 Mbit/s (TCP/Myrinet)
- Latency 81/29 µs (TCP/Myrinet)
- 100% CPU Utilisation
- (RedHat Linux 6.2 / PIII 500 MHz)
[Plot: y-axis 0 to 1000 [Mbit/s], x-axis 0 to 30000 [byte]; curves: GM 1.2, TCP/Myrinet]
Goals?
• Software Implementation as a Foundation
• Achieve On Wire Compatibility
• Consider Different Target Architectures
• Develop Reusable Hardware Components
• Integration of High Level Tools
• System Wide Optimisation
• Efficient, Transparent Operating System Integration
[Diagram labels: Domain Specific Methodology, Flexible Protocol Engine]
TCP Acceleration Approach
• TCP SW Stack Complexity
- General Purpose Protocol
- Not Designed for High End Networking
- Many Interdependent Algorithms
- Often Modified, Adapted, Optimised
- ~15,000 Lines of C
• Approach
- TCP Partitioning -> Fast Path Extraction
- Hardware Support -> Acceleration
- Operating System Bypass
• HW/SW Synchronisation
- Initialisation, Termination/Error
• Transparent Integration
- Socket Level Switch
[Diagram labels: Application, Socket, TCP, IP, Driver, Network, PE]
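One way to picture the socket level switch and the HW/SW synchronisation points is the sketch below. It is only an illustration under stated assumptions: the slides do not show the actual interface, so `Connection`, `Path`, `socket_send` and the two send callbacks are hypothetical names. The assumed behaviour is that the software stack sets up the connection, hands it to the fast path once established, and takes it back on termination or error.

```python
from enum import Enum, auto


class Path(Enum):
    SOFTWARE = auto()  # full TCP stack in the operating system
    FAST = auto()      # fast path on the protocol engine


class Connection:
    """Hypothetical per-socket state used by the socket level switch."""

    def __init__(self):
        self.path = Path.SOFTWARE  # connection setup always runs in software

    def on_established(self, engine_has_free_context: bool):
        # Initialisation: hand over only if the engine can hold the context.
        if engine_has_free_context:
            self.path = Path.FAST

    def on_error_or_close(self):
        # Termination/Error: the software stack takes over again.
        self.path = Path.SOFTWARE


def socket_send(conn, payload, fast_send, sw_send):
    """Dispatch a send call to whichever path currently owns the connection."""
    handler = fast_send if conn.path is Path.FAST else sw_send
    return handler(payload)
```

Because the switch sits at the socket level, an application keeps calling the same send routine and does not see which path served it.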
Fast Path Protocol Processing
[Diagram: Sender and Receiver joined by the Network; each side shows TCP Send, TCP Recv, Send Ack and a Connection Context; Data and Ack are exchanged]
• Only for User Data Exchange
• No Connection Management
• No Error Recovery – Only Detection
• Complexity ~10% of SW Stack
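The restrictions above can be illustrated with a receive-side check, loosely in the spirit of TCP header prediction. This is a minimal sketch, not the partitioning used in the talk: the field names, the `ConnectionContext` layout and the exact conditions are assumptions. Anything that is not an in-order data or acknowledgement segment is detected and left to the software stack.

```python
from dataclasses import dataclass

FIN, SYN, RST, PSH, ACK, URG = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20
SEQ_MASK = 0xFFFFFFFF  # TCP sequence numbers are 32 bit and wrap around


@dataclass
class ConnectionContext:  # assumed minimal per-connection state
    rcv_nxt: int  # next sequence number expected from the peer
    snd_una: int  # oldest unacknowledged sequence number
    snd_nxt: int  # next sequence number to be sent


@dataclass
class Segment:
    seq: int
    ack: int
    flags: int
    payload_len: int
    checksum_ok: bool


def fast_path_receive(ctx: ConnectionContext, seg: Segment) -> bool:
    """Return True if handled on the fast path, False to defer to software."""
    if seg.flags & (SYN | FIN | RST | URG):
        return False  # connection management is not part of the fast path
    if not seg.flags & ACK or not seg.checksum_ok:
        return False  # error detected; recovery belongs to the software stack
    if seg.seq != ctx.rcv_nxt:
        return False  # out-of-order or retransmitted data
    acked = (seg.ack - ctx.snd_una) & SEQ_MASK
    in_flight = (ctx.snd_nxt - ctx.snd_una) & SEQ_MASK
    if acked > in_flight:
        return False  # acknowledgement outside the sent window
    ctx.rcv_nxt = (ctx.rcv_nxt + seg.payload_len) & SEQ_MASK
    ctx.snd_una = seg.ack
    return True
```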
System Simulation Environment
[Diagram labels: Netperf and Netserver instances, each on Socket / User Mode Linux; Network Simulator; CORBA; ISA Simulator (TCP Fast Path SW); VHDL Simulator (TCP Fast Path HW); Evaluation]
• Complex Communication System
• Real Applications, Operating System (User Mode Linux)
• Network Simulation – Error Injection
• Fast Path Implementation: Hardware/Software
• System Evaluation: Functionality & Performance
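Error injection in a network simulation can be as simple as dropping or corrupting packets with a given probability. The sketch below illustrates the idea only; it is not the simulator from the talk, and the function name, parameters and the single bit flip error model are assumptions. A seeded generator keeps runs reproducible.

```python
import random


def inject_errors(packets, drop_prob, corrupt_prob, seed=0):
    """Return the packets that survive a lossy, error-prone channel.

    Each packet is dropped with probability drop_prob; a surviving,
    non-empty packet gets one random bit flipped with probability
    corrupt_prob.
    """
    rng = random.Random(seed)
    delivered = []
    for pkt in packets:
        if rng.random() < drop_prob:
            continue  # packet lost in the simulated network
        if pkt and rng.random() < corrupt_prob:
            pos = rng.randrange(len(pkt))
            bit = 1 << rng.randrange(8)
            pkt = pkt[:pos] + bytes([pkt[pos] ^ bit]) + pkt[pos + 1:]
        delivered.append(pkt)
    return delivered
```

Feeding such a channel between two protocol instances exercises the error detection of the fast path and the hand-over to the software stack for recovery.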
Fast Path Hardware Implementation Directions
• Embedded RISC Processor
- LEON SPARC 33 MHz, INTEL StrongARM 200 MHz
- OS: uClinux, GNU C Environment
• Intelligent Network Adapter (Myrinet)
- RISC Core with User/Network Interface, DMA Engines
- Control Program Modification, no Operating System
• Network Processor (INTEL IXP1200)
- 6 Multithreaded Microengines
- Development: IXP Assembler, Simulator
• Specific Hardware
- High Level FPGA Design Flow, XILINX Virtex
- SYNOPSYS Protocol Compiler
[Diagram axis: Software to Hardware]
Myrinet Implementation Platform
[Diagram: LANai 7 adapter with Local SRAM, Host Interface, Packet Interface, RISC, PCI Bridge, DMA Controller and Myrinet Link; labels: 64 bit; 64 bit, 33 MHz; 1280 Mbit/s]
• Technology
- Packet-Communication and Switching Technology
- High-Performance, Highly Reliable
- System-Area Network, Cluster Interconnect
• Intelligent Network Adapter
TCP Fast Path/Myrinet
• Development Environment
- Host SW GM (message passing), Firmware MCP – open source
- GNU C Suite, no OS, one context only, no Interrupts
• Implementation
- MCP: 4 Event Driven State Machines
- Fast Path Integration within Network Send & Recv Code
- Exploitation of Hardware Support for Checksum Computation
- No Specific Optimisations, Some Limitations
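Firmware that runs without an operating system, in a single context and without interrupts, is commonly organised as cooperating state machines driven from one dispatch loop. The sketch below illustrates that pattern only; it is not the MCP code, and the two machines `send_dma` and `net_send` with their events are hypothetical stand-ins rather than the four MCP state machines.

```python
from collections import deque


class StateMachine:
    """Table-driven state machine: (state, event) -> (next state, emitted events)."""

    def __init__(self, initial, transitions):
        self.state = initial
        self.transitions = transitions

    def handle(self, event):
        key = (self.state, event)
        if key not in self.transitions:
            return []  # event not relevant in this state: ignore it
        self.state, emitted = self.transitions[key]
        return emitted  # list of (target machine name, event)


def run(machines, pending):
    """Single-context dispatch loop: handle events until none are pending."""
    queue = deque(pending)
    trace = []
    while queue:
        target, event = queue.popleft()
        queue.extend(machines[target].handle(event))
        trace.append((target, event, machines[target].state))
    return trace


def build_send_pipeline():
    """Hypothetical two-machine example: host DMA feeds the network sender."""
    return {
        "send_dma": StateMachine("idle", {
            ("idle", "host_request"): ("busy", [("net_send", "frame_ready")]),
            ("busy", "frame_sent"): ("idle", []),
        }),
        "net_send": StateMachine("idle", {
            ("idle", "frame_ready"): ("idle", [("send_dma", "frame_sent")]),
        }),
    }
```

Each handler runs to completion before the next event is taken, which is why such a design needs no locking. In this pattern, fast path processing would be extra work done inside the send and receive handlers.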
TCP Fast Path / Myrinet Performance Results
[Plot: y-axis 0 to 1000 [Mbit/s], x-axis 0 to 30000 [byte]; curves: GM 1.2, Fast Path, TCP/Myrinet]
• Performance
- Test Setup: INTEL PIII/500MHz, Myrinet LAN Adapter, Linux OS
- Netperf Benchmark Throughput/Delay
- Throughput Peak: 967, 816, 333 Mbit/s (GM, Fast Path, TCP)
- Delay Minimum: 16.5, 49, 81 µs (GM, Fast Path, TCP)
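To put the reported peak throughput and minimum delay into proportion, the short calculation below derives the ratios between the three variants. The numbers are taken from the results slide, with delays read as microseconds; the variable and function names are illustrative.

```python
# Peak throughput in Mbit/s and minimum delay in microseconds, from the slide.
throughput = {"GM": 967.0, "Fast Path": 816.0, "TCP": 333.0}
delay_us = {"GM": 16.5, "Fast Path": 49.0, "TCP": 81.0}


def ratio(table, a, b):
    """Return table[a] / table[b], rounded to two decimals."""
    return round(table[a] / table[b], 2)


if __name__ == "__main__":
    print("Fast Path / GM throughput: ", ratio(throughput, "Fast Path", "GM"))
    print("Fast Path / TCP throughput:", ratio(throughput, "Fast Path", "TCP"))
    print("Fast Path / TCP delay:     ", ratio(delay_us, "Fast Path", "TCP"))
    print("Fast Path / GM delay:      ", ratio(delay_us, "Fast Path", "GM"))
```

The fast path reaches about 84% of the GM peak throughput and about 2.45 times that of the software TCP stack, while its minimum delay is about 60% of the TCP delay and about three times that of GM.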
Summary & Outlook
• Integrated Architecture and Design Flow for Protocol Processing Acceleration
- TCP Partitioning
- System Simulation Environment
- Integration with existing SW TCP Stack & OS
• Prototype with Promising Performance
• Present Work:
- Fast Path HW Implementation and SoC Integration
[Diagram labels: Protocol Analysis, TCP/IP Partitioning, System Simulation, Efficient OS Integration, Prototype Variants, Evaluation / Optimisation, Flexible Configurable Protocol Engine]