IXP Lab 2012: Part 1
Network Processor Brief
NCKU CSIE CIAL Lab 2
Outline
• Network Processor
• Intel IXP2400
  - Processing Element
  - Register
  - Memory Interface
• IXP Programming Language
  - Programming Model
  - Programming Syntax

Router Development (1)
• Software Based
  - General Purpose Processor
  - Flexible
  - Poor Performance
  - …
• Hardware Based
  - ASIC
  - Best Performance
  - Long Development Time

Router Development (2)
• Network Processor (NPU) Based
  - Balance of both
• How?
  - Parallel processors
  - Multi-threaded cores
  - Programmable processors with non-programmable coprocessors

Network Processor Overview
• For high-speed packet processing
• Comprises
  - Multiple cores for parallel execution
  - Multi-threaded cores
  - Reduced instruction set
  - Multiple memory interfaces

Hierarchical Layer
• Data-Plane
  - Fast-Path
  - Slow-Path
• Control-Plane
  - Routing Protocol
• Management-Plane
  - Monitor Applications
  - User Interface

Data-Plane
• Fast-Path
  - General Packet Handling
  - As fast as possible
• Slow-Path
  - Exception Packet Handling
    - Packet with options
    - Local TCP/IP Stack

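The fast-path/slow-path split above can be sketched as a small classifier. This is a minimal illustration in plain C, not MicroC and not code from the lab: the function name `classify_ipv4`, the `enum path` type, and the idea of passing the router's own address as `local_addr` are all assumptions made for the example. It treats an IPv4 header with options (header length above 5 words) or a packet addressed to the router itself as an exception packet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum path { FAST_PATH, SLOW_PATH };

/* Hypothetical classifier: decide which path handles an IPv4 packet.
 * hdr points at the IPv4 header (at least 20 bytes); local_addr is the
 * router's own address in host byte order (an assumption of this sketch). */
static enum path classify_ipv4(const uint8_t *hdr, uint32_t local_addr)
{
    unsigned ihl;
    uint32_t dst;

    assert(hdr != NULL);

    ihl = hdr[0] & 0x0Fu;   /* header length in 32-bit words */
    dst = ((uint32_t)hdr[16] << 24) | ((uint32_t)hdr[17] << 16) |
          ((uint32_t)hdr[18] << 8)  |  (uint32_t)hdr[19];

    if (ihl > 5)
        return SLOW_PATH;   /* header carries IP options */
    if (dst == local_addr)
        return SLOW_PATH;   /* addressed to the router: local TCP/IP stack */
    return FAST_PATH;       /* general packet handling */
}
```

On an IXP2400 the fast-path branch would stay on a Microengine, while a slow-path result would mean handing the packet to the XScale.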
Internet eXchange Processor
• First Generation
  - IXP1200, IXP1240, IXP1250
• Second Generation
  - IXP2400, IXP2800, IXP2850, IXP2805, IXP2855
• Others
  - IXP4XX

Network Flow Processor
• By Netronome
• From Intel IXP2XXX
• NFP-3240, NFP-3216

Intel IXP2400 Block Diagram
IXP2400 Overview
• Functional Block
  - Processing Element
  - Memory Interfaces
  - Coprocessors
  - Other Interfaces
• Hierarchical View

Processing Element
• Programmability
• Hierarchical Processing Elements
  - XScale
  - Microengine (ME)

XScale
• RISC-based processor (ARMv5TE)
• Real-time OS
  - MontaVista Linux
• ME Management
  - Control ME execution
  - Resource Management

MicroEngine (1)
• Eight MEs per IXP2400 (working in parallel)
• Eight threads per ME
• The ME instruction set is reduced, for packet processing only
  - Not as powerful as a general-purpose processor
  - No floating-point instructions
  - No divide instruction

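Because the ME has no divide instruction, any division in packet-processing code has to be built from simpler operations (or avoided, for example by using power-of-two sizes and shifts). The following is a minimal sketch in plain C of the classic shift-and-subtract method; the helper name `soft_udiv32` is hypothetical and this is not the routine any Intel library actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: unsigned 32-bit division using only shift,
 * compare, and subtract (restoring division, one quotient bit per step).
 * If remainder is not NULL, the remainder is stored through it. */
static uint32_t soft_udiv32(uint32_t dividend, uint32_t divisor,
                            uint32_t *remainder)
{
    uint32_t quotient = 0;
    uint32_t rem = 0;
    int bit;

    assert(divisor != 0);

    for (bit = 31; bit >= 0; bit--) {
        /* Bring down the next dividend bit, most significant first. */
        rem = (rem << 1) | ((dividend >> bit) & 1u);
        if (rem >= divisor) {
            rem -= divisor;
            quotient |= (uint32_t)1 << bit;
        }
    }
    if (remainder != NULL)
        *remainder = rem;
    return quotient;
}
```

The loop always runs 32 steps, so the cost is fixed and easy to budget per packet, which is one reason to prefer shifts by a constant wherever the divisor is known in advance.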
MicroEngine (2)
• No OS
  - Not interactive
  - Managed by XScale
• Code Store (4K instructions)
  - Executing

MicroEngine Threads
• Concurrent execution
• Non-preemptive
• Round-robin execution
• Each thread owns its private set of registers
• Zero-overhead context switching

Registers of ME
• 256 GPRs
• 256 SRAM Transfer Registers
  - 128 Read
  - 128 Write
• 256 DRAM Transfer Registers
  - 128 Read
  - 128 Write
• 128 Next Neighbor Registers

Context Switch
• Register contents need not be swapped out and back in during a context switch
• With this mechanism, when a thread swaps out after issuing a memory request, another thread can swap in and do useful work to cover the long memory latency

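The latency-hiding effect described above can be sketched with a small simulation. This is a simplified model in plain C under stated assumptions, not a description of the real hardware scheduler: every thread repeats "compute for `compute_cycles`, issue one memory request, swap out", context switches cost zero cycles, and threads are picked round-robin without preemption. The function name `busy_cycles` is hypothetical.

```c
#include <assert.h>

#define MAX_THREADS 8u   /* eight threads per ME, as stated on the slides */

/* Count how many of total_cycles one ME spends doing useful work. */
static unsigned busy_cycles(unsigned n_threads, unsigned compute_cycles,
                            unsigned mem_latency, unsigned total_cycles)
{
    unsigned ready_at[MAX_THREADS] = {0};  /* cycle at which each thread may run again */
    unsigned now = 0, busy = 0, next = 0;
    unsigned i, chosen, run;

    assert(n_threads >= 1 && n_threads <= MAX_THREADS);
    assert(compute_cycles > 0);

    while (now < total_cycles) {
        /* Round-robin: scan from the thread after the last one that ran. */
        chosen = n_threads;
        for (i = 0; i < n_threads; i++) {
            unsigned t = (next + i) % n_threads;
            if (ready_at[t] <= now) {
                chosen = t;
                break;
            }
        }
        if (chosen == n_threads) {   /* every thread is waiting on memory */
            now++;                   /* the ME idles for one cycle */
            continue;
        }
        /* Non-preemptive: the thread runs until it issues its memory request. */
        run = compute_cycles;
        if (run > total_cycles - now)
            run = total_cycles - now;
        busy += run;
        now += run;
        ready_at[chosen] = now + mem_latency;  /* swapped out until the data returns */
        next = (chosen + 1) % n_threads;
    }
    return busy;
}
```

With 10 compute cycles per request and the roughly 90-cycle SRAM latency quoted later in the deck, this model keeps the ME busy for 100 of 1000 cycles with one thread and 800 of 1000 with eight threads.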
Memory Interface of IXP2400
• Local Memory
  - Smallest and Fastest
• Scratchpad
  - Passing handle of the packet
• SRAM
  - Hold data structure for packet processing
• DRAM
  - Largest and Slowest
  - Hold packet's content

Local Memory
• Per ME
  - Not accessible by other MEs
  - Not accessible by XScale
• Size: 2560 bytes (640 longwords)
• Usage
  - Variable Spilling
  - Caching
• Latency: 3 cycles

Scratchpad
• On-Chip Memory
• Shared by all MEs
• Size: 16 KB (Fixed)
• Usage:
  - Scratchpad
  - Scratch Ring (Hardware FIFO)
• Latency: ~60 cycles

SRAM
• Off-Chip Memory
• Shared by all MEs (2 channels)
• Size: 64 MB (maximum per channel)
• Usage:
  - Hardware FIFO
  - Hold data structure
  - Hold meta-data of packets
• Latency: ~90 cycles

DRAM
• Off-Chip Memory
• Shared by all MEs (1 channel)
• Size: 1 GB (maximum)
• Usage:
  - Hold whole packet contents
  - Alternative space for data structure
• Latency: ~120 cycles

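The four memory levels above can be collected into one table. The sketch below, in plain C, records the sizes and approximate latencies quoted on the slides and picks the fastest level whose capacity can hold a data structure of a given size. The names `mem_level`, `ixp2400_mem`, and `fastest_fit` are hypothetical, and the placement rule is a deliberate simplification: it ignores sharing (Local Memory is private to one ME), contention, and whatever else already occupies each memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mem_level {
    const char *name;
    uint64_t    capacity_bytes;  /* maximum size quoted on the slides */
    unsigned    latency_cycles;  /* approximate access latency */
};

/* Ordered from fastest to slowest. */
static const struct mem_level ixp2400_mem[] = {
    { "Local Memory", 2560u,                   3u },
    { "Scratchpad",   16u * 1024u,            60u },
    { "SRAM",         64ull * 1024 * 1024,    90u },  /* per channel */
    { "DRAM",         1024ull * 1024 * 1024, 120u },
};

/* Return the fastest level that can hold `bytes`, or NULL if none can. */
static const struct mem_level *fastest_fit(uint64_t bytes)
{
    size_t i;

    assert(bytes > 0);

    for (i = 0; i < sizeof ixp2400_mem / sizeof ixp2400_mem[0]; i++) {
        if (bytes <= ixp2400_mem[i].capacity_bytes)
            return &ixp2400_mem[i];
    }
    return NULL;
}
```

For example, a 1 KB lookup table fits in Local Memory at about 3 cycles per access, while a 1 MB table lands in SRAM at about 90 cycles.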
Coprocessor
• MSF (Media Switch Fabric)
  - Receive Packet to DRAM
  - Transmit Packet from DRAM
• SHaC
  - Scratchpad
  - Hash Unit
  - CAP

Packet META-DATA (1)
• Data for processing packets
• How to identify a packet?
  - Packet Handle
• Packet Temporal Information
  - Not related to packet content
• Meta-data
  - Input Port, Output Port
  - Info for Packet
  - Address in DRAM

Packet META-DATA (2)
• How to pass this info between MEs?
• Hardware FIFO
  - Scratch Ring
  - SRAM Ring
  - Next-Neighbor Ring
• Issues

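The ring-based hand-off above can be modeled in software as a fixed-size FIFO of packet handles. The sketch below is plain C and single-threaded; on the IXP2400 the scratch and SRAM rings are implemented in hardware, so this only illustrates the put/get behaviour a producer ME and a consumer ME would see. The names `handle_ring`, `ring_put`, `ring_get` and the depth `RING_SLOTS` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SLOTS 8u   /* hypothetical ring depth; must be a power of two */

struct handle_ring {
    uint32_t slot[RING_SLOTS];  /* 32-bit packet handles */
    uint32_t head;              /* number of handles ever put */
    uint32_t tail;              /* number of handles ever taken */
};

/* Producer side: returns 1 on success, 0 if the ring is full. */
static int ring_put(struct handle_ring *r, uint32_t handle)
{
    assert(r != NULL);
    if (r->head - r->tail == RING_SLOTS)
        return 0;
    r->slot[r->head % RING_SLOTS] = handle;
    r->head++;
    return 1;
}

/* Consumer side: returns 1 and stores the oldest handle, 0 if empty. */
static int ring_get(struct handle_ring *r, uint32_t *handle)
{
    assert(r != NULL && handle != NULL);
    if (r->head == r->tail)
        return 0;
    *handle = r->slot[r->tail % RING_SLOTS];
    r->tail++;
    return 1;
}
```

Only the handle travels through the ring; the meta-data and the packet contents stay in SRAM and DRAM, so each hand-off between MEs moves a single word.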
Hierarchical View (Setting #1)
• Only one IXP2400-based board
• Data-Plane
  - Fast-Path: Microengine
  - Slow-Path: XScale
• Control-Plane
  - XScale
• Management-Plane
  - XScale

Hierarchical View (Setting #2)
• Multiple IXP2400-based boards
• Data-Plane
  - Fast-Path: Microengine
  - Slow-Path: XScale
• Control-Plane
  - CPU
• Management-Plane
  - CPU

Programming IXP2400
• XScale
  - Programming with C
• Microengine
  - Programming with MicroC or Microcode
  - We will focus on this part!

IDE Tool--IXA SDK Workbench
ME Language
• MicroC
  - Subset of ANSI C
  - Only a limited part of the standard C libraries is implemented
  - Intrinsic library supporting IXP operations
• Microcode
  - High-level assembly

Programming Model (1)
• Receive – Processing – Transmit
• Intel has provided sample code for receive and transmit.
• We focus only on the processing part.
[Diagram: RX -> PROCESSING -> TX]

Programming Model (2)
• Processing ME
  - Pipeline Model
  - Parallel Model
  - Mixed Model
[Diagram: RX -> PROCESSING -> TX]

Pipeline Model
[Diagram: RX -> PROC #1 -> PROC #2 -> TX]
• Each stage controls the whole resource of an ME
• Hard to balance between different stages

Parallel Model
[Diagram: RX -> PROC #1 and PROC #2 in parallel -> TX]
• Balance is easy
• Higher performance
• Resource is limited

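The trade-off between the pipeline and parallel models can be sketched with a rough steady-state estimate. This is a simplification in plain C, not a measurement: it assumes one ME per pipeline stage, ignores pipeline fill time, memory latency, and the cost of passing packets between MEs, and the function names `pipeline_packets` and `parallel_packets` are hypothetical. In the pipeline the slowest stage sets the rate; in the parallel model every ME runs all stages for its own packets.

```c
#include <assert.h>
#include <stddef.h>

/* Packets completed in `cycles` by a pipeline with one ME per stage. */
static unsigned long pipeline_packets(const unsigned *stage_cycles,
                                      size_t n_stages, unsigned long cycles)
{
    unsigned slowest = 0;
    size_t i;

    assert(stage_cycles != NULL && n_stages > 0);

    for (i = 0; i < n_stages; i++) {
        if (stage_cycles[i] > slowest)
            slowest = stage_cycles[i];
    }
    assert(slowest > 0);
    return cycles / slowest;          /* the slowest stage is the bottleneck */
}

/* Packets completed in `cycles` when each of n_mes MEs runs every stage. */
static unsigned long parallel_packets(const unsigned *stage_cycles,
                                      size_t n_stages, unsigned n_mes,
                                      unsigned long cycles)
{
    unsigned long per_packet = 0;
    size_t i;

    assert(stage_cycles != NULL && n_stages > 0);

    for (i = 0; i < n_stages; i++)
        per_packet += stage_cycles[i];
    assert(per_packet > 0);
    return n_mes * (cycles / per_packet);
}
```

With two stages of 30 and 90 cycles on two MEs, the pipeline finishes 40 packets in 3600 cycles and the parallel model 60; with balanced 60-cycle stages both finish 60. The estimate leaves out the parallel model's cost noted on the slide: every ME must hold the code and state for all stages within its limited resources.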
Mixed Model
[Diagram: RX, PROC #1, PROC #2, PROC #3, TX]

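MicroC Example 1 (1)

    /* Prototype; the function body is shown in part (2). */
    void reverse_array(volatile int* old, volatile int* new, int size);

    void main()
    {
        /* Source and destination arrays, both placed in shared SRAM. */
        _declspec(shared sram) int old_array[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        _declspec(shared sram) int new_array[sizeof(old_array)/sizeof(int)];

        global_label("start_reverse");
        reverse_array(old_array, new_array, sizeof(old_array)/sizeof(int));
        global_label("end_reverse");
    }
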
MicroC Example 1 (2)

    void reverse_array(volatile int* old, volatile int* new, int size)
    {
        int index = 0;

        /* Copy the elements of old into new in reverse order. */
        for (index = 0; index < size; index++) {
            new[index] = old[size - index - 1];
        }
    }

MicroC Example 2

    /* Issue the SRAM read, then wait for its completion signal. */
    sram_read(&sram_egt_dim1_2_node,
              (__declspec(sram) unsigned int *)(PACKET_CLASSIFICATION_SRAM_BASE1 + current*8),
              2, sig_done, &sram_read_sig_dim1_2);
    __wait_for_all(&sram_read_sig_dim1_2);
    temp = sram_egt_dim1_2_node.next_dim;

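1. Copy IXA_SDK_3.51 and ixp_book to D:\, then reboot.
3. Press [Ctrl+Enter] to enter the recovery card's administrator mode.
4. Password: davidchang
5. Unzip ixasdk351cd1windows.zip, ixasdk351cd3.zip, and ixasdk351framework.zip, then install them in that order (a reboot is required after cd1 is installed).
6. Copy the ixp_book directory to C:\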