Spring Camp 2013 1
NetFPGA Spring Camp, Day 1
Presented by: Andrew W. Moore
with Marek Michalski, Neelakandan Manihatty-Bojan, Gianni Antichi,
Georgina Kalogeridou, Jong Hun Han, Noa Zilberman
aided by Yury Audzevich, Dimosthenis Pediaditakis
Poznan University of Technology, May 20 – 24, 2013
http://NetFPGA.org
Tutorial Outline
• Background
– Introduction
– The NetFPGA Platform
• The Base Reference Router
– Motivation: Basic IP review
– Example: Reference Router running on the NetFPGA
• Infrastructure
– Tree
– Build System
– Scripts
• The Life of a Packet Through the NetFPGA
– Hardware Datapath
– Interface to software: Exceptions and Host I/O
• Implementation
– Module Template
– Write Crypto NIC using a static key
• Simulation and Debug
– Write and Run Simulations for Crypto NIC
• Concluding Remarks
Section I: Motivation
NetFPGA = Networked FPGA
A line-rate, flexible, open networking platform for teaching and research
NetFPGA consists of…
Four elements:
• NetFPGA board
• Tools + reference designs
• Contributed projects
• Community
[Figures: NetFPGA 1G Board; NetFPGA 10G Board]
NetFPGA-10G
• A major upgrade over the 1Gb/s predecessor
• State-of-the-art technology
10 Gigabit Ethernet
• 4 SFP+ Cages
• AEL2005 PHY
• 10G Support
– Direct Attach Copper
– 10GBASE-R Optical Fiber
• 1G Support
– 1000BASE-T Copper
– 1000BASE-X Optical Fiber
Others
• QDRII-SRAM
– 27MB
– Storing routing tables, counters and statistics
• RLDRAM-II
– 288MB
– Packet Buffering
• PCI Express x8
– PC Interface
• Expansion Slot
Xilinx Virtex 5 TX240T
• Optimized for ultra high-bandwidth applications
• 48 GTX Transceivers
• 4 hard Tri-mode Ethernet MACs
• 1 hard PCI Express Endpoint
NetFPGA Board Comparison
NetFPGA 1G                  NetFPGA 10G
4 x 1Gbps Ethernet Ports    4 x 10Gbps Ethernet Ports
4.5 MB ZBT SRAM             27 MB QDRII-SRAM
64 MB DDR2 SDRAM            288 MB RLDRAM-II
PCI                         PCI Express x8
Virtex II-Pro 50            Virtex 5 TX240T
Beyond Hardware
• NetFPGA-10G Board
• Xilinx EDK based IDE
• Reference designs with ARM AXI4
• Software (embedded and PC)
• Public Repository
• Public Wiki
[Diagram layers: Reference Designs; AXI4 IPs; Xilinx EDK; MicroBlaze SW; PC SW; GitHub, User Community]
PC with NetFPGA
• Networking software running on a standard PC
• A hardware accelerator built with a Field Programmable Gate Array driving 10 Gigabit network links
[Diagram: NetFPGA board (FPGA, memory, four 10GE ports) attached over PCIe to the PC's CPU and memory]
Tools + Reference Designs
Tools:
• Compile designs
• Verify designs
• Interact with hardware
Reference designs:
• Router (HW)
• Switch (HW)
• Network Interface Card (HW)
• Router Kit (SW)
• SCONE (SW)
Contributed Projects
Project               Contributor
OpenFlow switch       Stanford University
IPv4 Router           Pisa & Cambridge Uni.
RLDRAM I/F            Cambridge University
ARP Reply             Stanford University
Encap & DCAP          Stanford University
Learning CAM Switch   Pisa & Cambridge Uni.
More projects:
• https://github.com/NetFPGA/NetFPGA-public/wiki/Projects
Community
Wiki
• Documentation
– User’s Guide “so you just got your first NetFPGA”
– Developer’s Guide “so you want to build a …”
• Encourage users to contribute
Forums
• Support by users for users
• Active community - 10s-100s of posts/week
• Incubating the 10G mailing-list into a forum
International Community
Over 1,000 users, using 2,000 cards at 150 universities in 40 countries
NetFPGA’s Defining Characteristics
• Line-Rate
– Processes back-to-back packets
  • Without dropping packets
  • At full rate of 10 Gigabit Ethernet Links
– Operating on packet headers
  • For switching, routing, and firewall rules
– And packet payloads
  • For content processing and intrusion prevention
• Open-source Hardware
– Similar to open-source software
  • Full source code available
  • BSD-Style License for 1G and LGPL 2.1 for 10G
– But harder, because
  • Hardware modules must meet timing
  • Verilog & VHDL components have more complex interfaces
  • Hardware designers need high confidence in specification of modules
Test-Driven Design
• Regression tests
– Have repeatable results
– Define the supported features
– Provide clear expectation on functionality
• Example: Internet Router
– Drops packets with bad IP checksum
– Performs Longest Prefix Matching on destination address
– Forwards IPv4 packets of length 64-1500 bytes
– Generates ICMP message for packets with TTL <= 1
– Defines how packets with IP options or non IPv4 … and dozens more …
Every feature is defined by a regression test
Who, How, Why
Who uses the NetFPGA?
– Teachers
– Students
– Researchers
How do they use the NetFPGA?
– To run the Router Kit
– To build modular reference designs
  • IPv4 router
  • 4-port NIC
  • Ethernet switch, …
Why do they use the NetFPGA?
– To measure performance of Internet systems
– To prototype new networking systems
Spring Camp Objectives
• Overall picture of NetFPGA
• How reference designs work
• How you can work on a project
– NetFPGA Design Flow
– Directory Structure, library modules and projects
– How to utilize contributed projects
  • Interface/Registers
– How to verify a design (Simulation and Regression Tests)
– Things to do when you get stuck
AND… You can build your own projects!
Section II: Network review
Internet Protocol (IP)
[Figure: three rows. Data to be transmitted: blocks of Data. IP packets: each block of Data prefixed with an IP Hdr. Ethernet frames: each IP packet prefixed with an Eth Hdr.]
Internet Protocol (IP)
[Figure: IP header layout, 32 bits per row, 20 bytes before options, followed by Data]
Ver | HLen | T.Service | Total Packet Length
Fragment ID | Flags | Fragment Offset
TTL | Protocol | Header Checksum
Source Address
Destination Address
Options (if any)
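The header layout above can be made concrete with a short sketch. This is only an illustration, not code from the NetFPGA tree: the function names `ipv4_checksum` and `parse_ipv4_header` and the sample header bytes are assumptions chosen for the example. It unpacks the fixed 20-byte header and computes the header checksum as the one's-complement sum of its 16-bit words.

```python
import ipaddress
import struct


def ipv4_checksum(header: bytes) -> int:
    """Return the 16-bit one's-complement checksum over the header bytes."""
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def parse_ipv4_header(raw: bytes) -> dict:
    """Unpack the fixed 20-byte part of an IPv4 header (options ignored)."""
    (ver_hlen, tos, total_len, frag_id, flags_frag,
     ttl, proto, csum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_hlen >> 4,
        "header_len_bytes": (ver_hlen & 0xF) * 4,
        "tos": tos,
        "total_len": total_len,
        "frag_id": frag_id,
        "flags": flags_frag >> 13,
        "frag_offset": flags_frag & 0x1FFF,
        "ttl": ttl,
        "protocol": proto,
        "checksum": csum,
        "src": str(ipaddress.IPv4Address(src)),
        "dst": str(ipaddress.IPv4Address(dst)),
    }


# Hypothetical sample header with the checksum field set to zero.
EXAMPLE_HEADER = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
fields = parse_ipv4_header(EXAMPLE_HEADER)
```

A receiver can validate a header by running the same sum over all 20 bytes including the stored checksum; a correct header yields 0.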
Basic operation of an IP router
[Figure: hosts A, B, C, D, E, F connected through routers R1, R2, R3, R4, R5; a packet addressed to D is forwarded using the table below]
Destination   Next Hop
D             R3
E             R3
F             R5
Basic operation of an IP router
[Figure: hosts A, B, C, D, E, F connected through routers R1, R2, R3, R4, R5]
Forwarding tables
IP address 32 bits wide → ~ 4 billion unique addresses
Naïve approach: One entry per address
Entry   Destination        Port
1       0.0.0.0            1
2       0.0.0.1            2
⋮       ⋮                  ⋮
2^32    255.255.255.255    12
~ 4 billion entries
Improved approach: Group entries to reduce table size
Entry   Destination                    Port
1       0.0.0.0 – 127.255.255.255      1
2       128.0.0.1 – 128.255.255.255    2
⋮       ⋮                              ⋮
50      248.0.0.0 – 255.255.255.255    12
IP addresses as a line
[Figure: all IP addresses drawn as a line from 0 to 2^32 − 1, with nested ranges labelled All IP addresses, North America, Asia, Berkeley, Stanford, and points labelled Your computer and My computer]
Entry   Destination            Port
1       Stanford               1
2       Berkeley               2
3       North America          3
4       Asia                   4
5       Everywhere (default)   5
Longest Prefix Match (LPM)
Entry   Destination            Port
1       Stanford               1
2       Berkeley               2
3       North America          3
4       Asia                   4
5       Everywhere (default)   5
Entries 1–2: Universities. Entries 3–4: Continents. Entry 5: Planet.
Data To: Stanford
Matching entries:
• Stanford (most specific)
• North America
• Everywhere
Longest Prefix Match (LPM)
Entry   Destination            Port
1       Stanford               1
2       Berkeley               2
3       North America          3
4       Asia                   4
5       Everywhere (default)   5
Entries 1–2: Universities. Entries 3–4: Continents. Entry 5: Planet.
Data To: Canada
Matching entries:
• North America (most specific)
• Everywhere
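Implementing Longest Prefix Match
Entry   Destination            Port
1       Stanford               1     (most specific)
2       Berkeley               2
3       North America          3
4       Asia                   4
5       Everywhere (default)   5     (least specific)
[Figure: the table is searched from the most specific entry toward the least specific; the first matching entry is marked FOUND]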
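The search order on this slide can be sketched in software as follows. This is a minimal illustration, not the hardware implementation: the prefixes and port numbers are hypothetical stand-ins for the Stanford / North America / Everywhere entries, and `build_table` and `lookup` are names made up for the example. Entries are sorted by prefix length so that a linear scan returns the most specific match first.

```python
import ipaddress


def build_table(routes):
    """Sort (prefix, port) routes so the longest prefixes are searched first."""
    table = [(ipaddress.ip_network(prefix), port) for prefix, port in routes]
    table.sort(key=lambda entry: entry[0].prefixlen, reverse=True)
    return table


def lookup(table, address):
    """Return the port of the first (most specific) matching entry, or None."""
    ip = ipaddress.ip_address(address)
    for network, port in table:
        if ip in network:
            return port
    return None


# Hypothetical prefixes standing in for the entries on the slide.
ROUTES = [
    ("0.0.0.0/0", 5),    # Everywhere (default)
    ("10.0.0.0/8", 3),   # a "continent"
    ("20.0.0.0/8", 4),   # another "continent"
    ("10.1.0.0/16", 1),  # a "university" inside the first continent
    ("10.2.0.0/16", 2),  # another "university"
]
TABLE = build_table(ROUTES)
```

With this table, `lookup(TABLE, "10.1.5.5")` matches three entries (/16, /8 and the default) and returns the port of the /16, while an address outside every specific prefix falls through to the default entry.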
Basic components of an IP router
Software: Control Plane
• Routing Protocols
• Routing Table
• Management & CLI
Hardware: Data Plane (per-packet processing)
• Forwarding Table
• Switching
• Queuing
IP router components in NetFPGA
Software: SCONE (Routing Protocols, Routing Table, Management & CLI) OR Linux with Router Kit (Routing Protocols, Routing Table, Management & CLI)
Hardware: Input Arbiter, Output Port Lookup (Forwarding Table, Switching), Output Queues (Queuing)
Section III: Example I
Operational IPv4 router
Software: Control Plane
• SCONE: Routing Protocols, Routing Table, Management & CLI
• Java GUI
Hardware: Data Plane (per-packet processing)
• Reference router: Forwarding Table, Switching, Queuing
Streaming video
[Figures: PCs each hosting a NetFPGA (NetFPGA in PC) running the reference router; video is streamed from the video server to the video client over the shortest path]
Observing the routing tables
Columns:
• Subnet address
• Subnet mask
• Next hop IP
• Output ports
Example 1
Review
NetFPGA as IPv4 router:
• Reference hardware + SCONE software
• Routing protocol discovers topology
Demo:
• Ring topology
• Traffic flows over shortest path
• Broken link: automatically route around failure
Section III: Example II
Buffers in Routers
• Internal Contention
• Pipelining
• Congestion
[Figure: router with Rx ports on the input side and Tx ports on the output side]
Buffer Sizing Story
Using NetFPGA to explore buffer size
• Need to reduce buffer size and measure occupancy
• Alas, not possible in commercial routers
• So, we will use the NetFPGA instead
Objective:
– Use the NetFPGA to understand how large a buffer we need for a single TCP flow.
Reference Router Pipeline
• Five stages
– Input interfaces
– Input arbitration
– Routing decision and packet modification
– Output queuing
– Output interfaces
• Packet-based module interface
• Pluggable design
[Diagram: four MAC RxQ and four CPU RxQ → Input Arbiter → Output Port Lookup → Output Queues → four MAC TxQ and four CPU TxQ]
Extending the Reference Pipeline
[Diagram: the reference router pipeline (MAC/CPU RxQ → Input Arbiter → Output Port Lookup → Output Queues → MAC/CPU TxQ) with Rate Limiter and Event Capture modules added]
Enhanced Router Pipeline
Two modules added
1. Event Capture to capture output queue events (writes, reads, drops)
2. Rate Limiter to create a bottleneck
[Diagram: MAC/CPU RxQ → Input Arbiter → Output Port Lookup → Output Queues → MAC/CPU TxQ, with the Rate Limiter and Event Capture modules in place]
Topology for Exercise 2
Recall:
NetFPGA host PC is life-support: power & control
So:
The host PC may physically route its traffic through the local NetFPGA
[Figure: Iperf client and Iperf server on PCs hosting NetFPGAs (NetFPGA in PC) running the extended reference router; labelled interfaces: nf2c2, eth1, nf2c1, eth2]
Example 2
Review
NetFPGA as flexible platform:
• Reference hardware + SCONE software
• New modules: event capture and rate-limiting
Example 2: Client – Router – Server topology
– Observed router with new modules
– Started TCP transfer, looked at queue occupancy
– Observed queue change in response to TCP ARQ
Section IV: Infrastructure
Infrastructure
• Tree structure
• NetFPGA package contents
– Reusable Verilog modules
– Verification infrastructure
– Build infrastructure
– Utilities
– Software libraries
NetFPGA package contents
• Projects:
– HW: router, switch, NIC, buffer sizing router
– SW: router kit, SCONE
• Reusable Verilog modules
• Verification infrastructure:
– simulate full board with PCI-express + physical interfaces
– run tests against hardware
– test data generation libraries (e.g. packets)
• Build infrastructure
• Utilities:
– register I/O, packaging, …
• Software libraries
Tree Structure (1)
NetFPGA-10G
projects (including reference designs)
contrib-projects (contributed user projects)
lib (custom and reference IP Cores and software libraries)
tools (scripts for running simulations etc.)
docs (design documentations and user-guides)
https://github.com/NetFPGA/NetFPGA-10G-live
Tree Structure (2)
lib
  hw (hardware logic as IP cores)
    std (reference cores)
    contrib (contributed cores)
    xilinx (Xilinx provided cores)
  sw (core specific software drivers/libraries)
    std (reference libraries)
    contrib (contributed libraries)
    xilinx (Xilinx provided libraries)
Tree Structure (3)
projects/reference_nic
  bitfiles (FPGA executables)
  hw (EDK based project)
    data (contains user constraint files)
    nf10 (contains files used to run various tools)
    pcores (contains local hardware peripherals)
  sw
    embedded (contains code for microblaze)
    host (contains code for host communication etc.)
Reusable logic (IP cores)
* PBS – packet bus stream in NF1G
** RBS – register bus stream in NF1G
A brief detour
What is a PCore?
What’s pcore?
IP Core? P Core? Pcore? pcore? - all the same.
• “Processor Core” in EDK
• HDL (Verilog/VHDL) + metadata files
• ≈ module
• Examples:
– 10G MAC
– SRAM Controller
– NIC Output port lookup
HDL (Verilog)
• All NetFPGA-10G pcores are AXI-compliant
• AXI = Advanced eXtensible Interface
– Used in ARM-based embedded systems
– Standard interface to bridge pcores together
– AXI4/AXI4-Lite: Control and status interface
– AXI4-Stream: Data path interface
• Xilinx IPs and tool chains are migrating to AXI
Pcore metadata files
• Help EDK put pcores together
• PAO: Peripheral Analyze Order
– Files that need synthesizing
• MPD: Microprocessor Peripheral Definition
– Defines interface of the pcore (e.g. # of AXI’s)
• Consult Xilinx UG642 for details
Verification Infrastructure (1)
• Simulation and Debugging
– built on industry standard Xilinx “iSim” simulator and “Scapy”
– Python scripts for stimuli construction and verification
Verification Infrastructure (2)
• iSim
– a Hardware Description Language (HDL) simulator
– performs functional and timing simulations for embedded, VHDL, Verilog and mixed designs
• Scapy
– a powerful interactive packet manipulation library for creating “test data”
– provides primitives for many standard packet formats
– allows addition of custom formats
Build Infrastructure (1)
• Based on the design flow of:
– “Xilinx Embedded Development Kit”
– “ARM’s AXI4 interface specification”
Build Infrastructure (2)
• Build/Synthesis
– collection of shared hardware peripheral cores (pcores) stitched together with “AXI4”, “Lite” and “Stream” buses
– bitfile generation and verification using Xilinx synthesis and implementation tools
  • XST
  • Translate, MAP, PAR
  • BitGen
Build Infrastructure (3)
• Register system
– collates and generates addresses for all the registers and memories in a project
– uses Xilinx EDK – libgen utility to generate “include” files for software applications
Inter-Module Communication
– Using AXI4-Stream (packets are moved as a stream)
[Diagram: Module i connected to Module i+1 by the signals TDATA, TUSER, TSTRB, TVALID, TREADY, TLAST]
AXI4-Stream
AXI4-Stream Description
TDATA Data Stream
TSTRB Strobe signal (i.e. byte enable)
TVALID Valid Indication
TREADY Flow control indication
TLAST End of packet/burst indication
TUSER Out of band metadata
Packet Format
TLAST TUSER TSTRB TDATA
0 X 0xFF…F Eth Hdr
0 X 0xFF…F IP Hdr
0 X 0xFF…F …
1 X 0x0…1F Last word
TUSER
Position Content
[15:0] length of the packet in bytes
[23:16] source port: one-hot encoded
[31:24] destination port: one-hot encoded
[127:32] 6 user defined slots, 16bit each
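The TUSER layout in the table can be sketched as plain bit packing. This is an illustration under stated assumptions, not code from the NetFPGA tree: `pack_tuser` and `unpack_tuser` are hypothetical helper names, the placement of user slot 0 at bits [47:32] is assumed, and which bit of the one-hot port fields corresponds to which physical port is not specified here.

```python
def pack_tuser(length, src_port, dst_port, slots=(0, 0, 0, 0, 0, 0)):
    """Pack the fields into a 128-bit TUSER value held in a Python int."""
    if len(slots) != 6:
        raise ValueError("expected exactly 6 user-defined slots")
    value = length & 0xFFFF                 # [15:0]  packet length in bytes
    value |= (src_port & 0xFF) << 16        # [23:16] source port, one-hot
    value |= (dst_port & 0xFF) << 24        # [31:24] destination port, one-hot
    for index, slot in enumerate(slots):    # [127:32] six 16-bit slots
        value |= (slot & 0xFFFF) << (32 + 16 * index)  # slot order assumed
    return value


def unpack_tuser(value):
    """Split a 128-bit TUSER value back into its fields."""
    return {
        "length": value & 0xFFFF,
        "src_port": (value >> 16) & 0xFF,
        "dst_port": (value >> 24) & 0xFF,
        "slots": tuple((value >> (32 + 16 * i)) & 0xFFFF for i in range(6)),
    }


# A 64-byte packet; one-hot bit 0 as source and bit 2 as destination.
example = pack_tuser(64, 1 << 0, 1 << 2, (0xBEEF, 0, 0, 0, 0, 0))
fields = unpack_tuser(example)
```

One-hot encoding lets a module select several destination ports at once by setting more than one bit in the destination field.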
TVALID/TREADY Signal timing
– No waiting!
– Assert TREADY/TVALID whenever appropriate
– TVALID should not depend on TREADY
[Timing diagram: TVALID, TREADY]
Byte ordering
• In compliance with AXI, NF 10G has a specific byte ordering
– 1st byte of the packet @ TDATA[7:0]
– 2nd byte of the packet @ TDATA[15:8]
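The packet format and byte ordering slides can be tied together with a small software model. It is a sketch only: the 32-byte (256-bit) TDATA width and the name `packet_to_words` are assumptions for illustration, and TVALID/TREADY handshaking is not modelled. Each word places the first packet byte in TDATA[7:0], sets one TSTRB bit per valid byte, and raises TLAST on the final word.

```python
def packet_to_words(packet: bytes, width_bytes: int = 32):
    """Split a packet into AXI4-Stream style words (TDATA, TSTRB, TLAST)."""
    words = []
    for offset in range(0, len(packet), width_bytes):
        chunk = packet[offset:offset + width_bytes]
        words.append({
            # little-endian: byte 0 of the chunk lands in TDATA[7:0]
            "tdata": int.from_bytes(chunk, "little"),
            # one strobe bit per valid byte, starting from bit 0
            "tstrb": (1 << len(chunk)) - 1,
            # TLAST is 1 only on the word that ends the packet
            "tlast": int(offset + width_bytes >= len(packet)),
        })
    return words


# A 69-byte dummy packet: two full words plus a 5-byte last word.
WORDS = packet_to_words(bytes(range(69)))
```

With 5 valid bytes in the last word, TSTRB comes out as 0x1F, which matches the shape of the last row in the packet format table.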
Section V: Life of a Packet
Reference Router Pipeline
• Five stages
– Input
– Input arbitration
– Routing decision and packet modification
– Output queuing
– Output
• Packet-based module interface
• Pluggable design
[Diagram: four MAC RxQ and one CPU RxQ feed the Input Arbiter, followed by Output Port Lookup and Output Queues, which feed the output-side queues]
Full System Components
[Diagram: host software (network interfaces nf0, nf1, nf2, nf3 and ioctl) connects over the PCIe bus to the NetFPGA; inside the NetFPGA are AXI Lite, the user data path, CPU RxQ and CPU TxQ, and four MAC RxQ / MAC TxQ pairs attached to Ethernet]
Life of a Packet through the Hardware
[Figure: an IP packet crossing the router; labels: port0, port2, 192.168.1.x, 192.168.2.y]
MAC Rx Queue
Rx Queue
TUSER: Length, Src port, Dst port, User defined
TDATA:
• Eth Hdr: Dst MAC = port 0, Ethertype = IP
• IP Hdr: IP Dst: 192.168.2.3, TTL: 64, Csum: 0x3ab4
• Payload
Input Arbiter
[Diagram: packets (Pkt) waiting in Rx Q 0, Rx Q 1, …, Rx Q 4 feed the Input Arbiter]
Output Port Lookup
Packet entering the module:
• Eth Hdr: Dst MAC = port 0, Ethertype = IP
• IP Hdr: IP Dst: 192.168.2.3, TTL: 64, Csum: 0x3ab4
• Payload
Output Port Lookup
1- Check input port matches Dst MAC
2- Check TTL, checksum
3- Lookup next hop IP & output port (LPM)
4- Lookup next hop MAC address (ARP)
5- Update output port in TUSER
6- Modify MAC Dst and Src addresses
7- Decrement TTL and update checksum
Packet leaving the module:
TUSER: Length, Src port, Dst port, User defined
TDATA:
• Eth Hdr: Dst MAC = nextHop, Src MAC = port 4, Ethertype = IP
• IP Hdr: IP Dst: 192.168.2.3, TTL: 63, Csum: 0x3ac2
• Payload
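Step 7 (decrement TTL and update checksum) can be illustrated in software. This is a hedged sketch, not the Verilog used in the reference router: it uses the incremental update rule from RFC 1624, the names `full_checksum` and `decrement_ttl` are made up for the example, and the sample header is hypothetical rather than the packet shown on the slide. The incremental result is compared against a full recomputation.

```python
import struct


def ones_add(a: int, b: int) -> int:
    """16-bit one's-complement addition with end-around carry."""
    total = a + b
    return (total & 0xFFFF) + (total >> 16)


def full_checksum(header: bytes) -> int:
    """Recompute the header checksum from scratch (checksum field zeroed)."""
    zeroed = header[:10] + b"\x00\x00" + header[12:]
    total = 0
    for (word,) in struct.iter_unpack("!H", zeroed):
        total = ones_add(total, word)
    return ~total & 0xFFFF


def decrement_ttl(header: bytes) -> bytes:
    """Decrement TTL and patch the checksum incrementally (RFC 1624)."""
    if header[8] <= 1:
        raise ValueError("TTL expired: exception packet, send to CPU")
    (old_word,) = struct.unpack("!H", header[8:10])   # TTL and Protocol
    (old_csum,) = struct.unpack("!H", header[10:12])
    new_word = old_word - 0x0100                      # TTL is the high byte
    # HC' = ~(~HC + ~m + m')
    acc = ones_add(~old_csum & 0xFFFF, ~old_word & 0xFFFF)
    acc = ones_add(acc, new_word)
    new_csum = ~acc & 0xFFFF
    return header[:8] + struct.pack("!HH", new_word, new_csum) + header[12:]


# Hypothetical 20-byte header with TTL 64 and a valid checksum (0xb861).
HEADER = bytes.fromhex("4500007300004000" "4011b861" "c0a80001c0a800c7")
UPDATED = decrement_ttl(HEADER)
```

The incremental form touches only the 16-bit word that changed, which is why it suits a per-packet hardware pipeline better than summing the whole header again.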
Output Queues
[Diagram: Output Queues module with queues OQ0, OQ2, OQ4]
MAC Tx Queue
MAC Tx Queue
TUSER: Length, Src port, Dst port, User defined
TDATA:
• Eth Hdr: Dst MAC = nextHop, Src MAC = port 4, Ethertype = IP
• IP Hdr: IP Dst: 192.168.2.3, TTL: 63, Csum: 0x3ac2
• Payload
Exception Packets
• Example: TTL = 0 or TTL = 1
• Packet has to be sent to the CPU
• Host generates an ICMP packet response
• Difference starts at the Output Port Lookup
Exception Packet Path
[Diagram: full system view (software with nf0, nf1, nf2, nf3 and ioctl; PCI bus; NetFPGA with AXI Lite, user data path, CPU RxQ, CPU TxQ and four MAC RxQ / MAC TxQ pairs on Ethernet) showing the path of an exception packet toward the CPU]
Output Port Lookup
1- Check input port matches Dst MAC
2- Check TTL, checksum – EXCEPTION!
3- Add output port module
Packet:
TUSER: Length, Src port, Dst port, User defined
TDATA:
• Eth Hdr: Dst MAC = 0, Src MAC = port x, Ethertype = IP
• IP Hdr: IP Dst: 192.168.2.3, TTL: 1, Csum: 0x3ab4
• Payload
Output Queues
[Diagram: Output Queues module with queues OQ0, OQ1, OQ2, OQ4]
CPU Tx Queue
CPU Tx Queue
TUSER: Length, Src port, Dst port, User defined
TDATA:
• Eth Hdr: Dst MAC = 0, Src MAC = port x, Ethertype = IP
• IP Hdr: IP Dst: 192.168.2.3, TTL: 1, Csum: 0x3ab4
• Payload
ICMP Packet
• Packet arrives at the CPU Rx Queue from the PCI-express Bus
• Same path as a packet from the MAC until it reaches the Output Port Lookup (OPL)
• The OPL module sees the packet is from the CPU Rx Queue 1 and sets the output port directly to 0
• The packet continues on the same path as the non-exception packet to the Output Queues and then MAC Tx queue 0
ICMP Packet Path
[Diagram: full system view (software with nf0, nf1, nf2, nf3 and ioctl; PCI bus; NetFPGA with AXI Lite, user data path, CPU RxQ, CPU TxQ and four MAC RxQ / MAC TxQ pairs on Ethernet) showing the path of the ICMP packet from the host out to a MAC Tx queue]
NetFPGA-Host Interaction
• Linux driver interfaces with hardware
– Packet interface via standard Linux network stack
– Register reads/writes via ioctl system call with wrapper functions:
  • rdaxi(int address, unsigned *rd_data);
  • wraxi(int address, unsigned *wr_data);
e.g.: rdaxi(0x7d4000000, &val);
NetFPGA-Host Interaction
NetFPGA to host packet transfer (across the PCI Bus)
1. Packet arrives – forwarding table sends to CPU queue
2. Interrupt notifies driver of packet arrival
3. Driver sets up and initiates DMA transfer
NetFPGA-Host Interaction
NetFPGA to host packet transfer (cont.)
4. NetFPGA transfers packet via DMA
5. Interrupt signals completion of DMA
6. Driver passes packet to network stack
NetFPGA-Host Interaction
Host to NetFPGA packet transfers (across the PCI Bus)
1. Software sends packet via network sockets; packet delivered to driver
2. Driver sets up and initiates DMA transfer
3. Interrupt signals completion of DMA
NetFPGA-Host Interaction
Register access (across the PCI Bus)
1. Software makes ioctl call on network socket; ioctl passed to driver
2. Driver performs PCIe memory read/write
Section V: Contributed Projects
Usage #1: Running the Router Kit
User-space development, 4x10GE line-rate forwarding
[Diagram: on the PC (CPU, memory), user-space routing software (OSPF, BGP, My Protocol) sits above the kernel routing table; over PCI-Express the routing table is mirrored (“Mirror”) into the forwarding table (Fwding Table) of the IPv4 Router on the FPGA, which has packet buffer memory and four 10GbE ports]
Usage #2: Enhancing Modular Reference Designs
[Diagram: the PC (CPU, memory) runs PW-OSPF, an extensible Java GUI front panel and the NetFPGA driver; over PCI-Express, the FPGA (with memory and four 10GbE ports) holds the modules In Q Mgmt, L2 Parse, L3 Parse, IP Lookup, Out Q Mgmt and a user module, My Block]
• Verilog modules interconnected by FIFO interfaces
• My Block: Verilog, System Verilog, VHDL, Bluespec….
• EDA Tools (Xilinx, Mentor, etc.): 1. Design 2. Simulate 3. Synthesize 4. Download
Usage #3: Creating new systems
[Diagram: the PC (CPU, memory) runs the NetFPGA driver; over PCI-Express, the FPGA (with memory and four 10GbE ports) holds My Design]
• My Design (10GE MAC is soft/replaceable)
• Verilog, System Verilog, VHDL, Bluespec….
• EDA Tools (Xilinx, Mentor, etc.): 1. Design 2. Simulate 3. Synthesize 4. Download
NetThreads, NetThreads-RE, NetTM
Martin Labrecque, Gregory Steffan (ECE Dept.); Geoff Salmon, Monia Ghobadi, Yashar Ganjali (CS Dept.); U. of Toronto
• Efficient multithreaded design
– Parallel threads deliver performance
• System Features
– System is easy to program in C
– Time to results is very short
Soft Processors in FPGAs
• Soft processors: processors in the FPGA fabric
• User uploads program to soft processor
• Easier to program software than hardware in the FPGA
• Could be customized at the instruction level
[Diagram: FPGA containing Processor(s), DDR controller, Ethernet MAC]
Example 1G Contributed Projects
Project Contributor
OpenFlow switch Stanford University
Packet generator Stanford University
NetFlow Probe Brno University
NetThreads University of Toronto
zFilter (Sp)router Ericsson
Traffic Monitor University of Catania
DFA UMass Lowell
10G Contributions
Alongside reference projects we have…
Project                  Contributor
Bluespec switch          MIT/SRI International
Traffic Monitor          University of Pisa
NF1G legacy on NF10G     Uni Pisa & Uni Cambridge
Simple/better DMA core   Stanford RAMcloud project
Your ideas               You?
Future for NetFPGA 10G
Non-reference Projects we know about
• High precision capture card
– 4 10GbE ports, GPS input, 8ns resolution.
• Traffic generator
– 4 port 10GbE real-traffic generator
• OpenFlow native port (Verilog) to OVS
How might we use NetFPGA?
Well I’m not sure about you but here is a list I created:
• Build an accurate, fast, line-rate NetDummy/nistnet element
• A flexible home-grown monitoring card
• Evaluate new packet classifiers
– (and application classifiers, and other neat network apps….)
• Prototype a full line-rate next-generation Ethernet-type
• Trying any of Jon Crowcroft’s ideas (Sourceless IP routing for example)
• Demonstrate the wonders of Metarouting in a different implementation (dedicated hardware)
• Provable hardware (using a C# implementation and kiwi with NetFPGA as target h/w)
• Hardware supporting Virtual Routers
• Check that some brave new idea actually works, e.g. Rate Control Protocol (RCP), Multipath TCP
• Toolkit for hardware hashing
• MOOSE implementation
• IP address anonymization
• SSL decoding “bump in the wire”
• Xen specialist NIC
• Computational co-processor
• Distributed computational co-processor
• IPv6 anything
• IPv6 – IPv4 gateway (6in4, 4in6, 6over4, 4over6, ….)
• Netflow v9 reference
• PSAMP reference
• IPFIX reference
• Different driver/buffer interfaces (e.g. PFRING)
• or “escalators” (from gridprobe) for faster network monitors
• Firewall reference
• GPS packet-timestamp things
• High-Speed Host Bus Adapter reference implementations
– Infiniband
– iSCSI
– Myrinet
– Fiber Channel
• Smart Disk adapter (presuming a direct-disk interface)
• Software Defined Radio (SDR) directly on the FPGA (probably UWB only)
• Routing accelerator
– Hardware route-reflector
– Internet exchange route accelerator
• Hardware channel bonding reference implementation
• TCP sanitizer
• Other protocol sanitizer (applications… UDP DCCP, etc.)
• Full and complete Crypto NIC
• IPSec endpoint/ VPN appliance
• VLAN reference implementation
• Metarouting implementation
• virtual <pick-something>
• Intelligent proxy
• Application embargo-er
• Layer-4 gateway
• h/w gateway for VoIP/SIP/skype
• h/w gateway for video conference spaces
• Security pattern/rules matching
• Anti-spoof traceback implementations (e.g. BBN stuff)
• IPtv multicast controller
• Intelligent IP-enabled device controller (e.g. IP cameras or IP powermeters)
• DES breaker
• Platform for flexible NIC API evaluations
• snmp statistics reference implementation
• sflow (hp) reference implementation
• Trajectory sampling (reference implementation)
• Implementation of zeroconf/netconf configuration language for routers
• h/w openflow and (simple) NOX controller in one…
• Network RAID (multicast TCP with redundancy)
• Inline compression
• Hardware accelerator for TOR
• Load-balancer
• openflow with (netflow, ACL, ….)
• Reference NAT device
• Active measurement kit
• Network discovery tool
• Passive performance measurement
• Active sender control (e.g. performance feedback fed to endpoints for control)
• Prototype platform for NON-Ethernet or near-Ethernet MACs
– Optical LAN (no buffers)
How might YOU use NetFPGA?
Section VI: Example Project
Project: Cryptographic NIC
Implement a network interface card (NIC) that encrypts upon transmission and decrypts upon reception
Cryptography
XOR function
XOR written as: ^ ⊻ ⨁
XOR is associative: (A ^ B) ^ C = A ^ (B ^ C)
A B A ^ B
0 0 0
0 1 1
1 0 1
1 1 0
XORing a value with itself always yields 0
Cryptography (cont.)
Simple cryptography:
– Generate a secret key
– Encrypt the message by XORing the message and key
– Decrypt the ciphertext by XORing with the key
Explanation:
(M ^ K) ^ K = M ^ (K ^ K)   (associativity)
            = M ^ 0         (A ^ A = 0)
            = M             (A ^ 0 = A)
Cryptography (cont.)
Example:
Message: 00111011
Key: 10110001
Message ^ Key: 10001010
Key: 10110001
Message ^ Key ^ Key: 00111011
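The worked example above can be checked in a few lines of Python. This is only a sketch: the helper name xor_bytes is ours and is not part of the NetFPGA code base.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the key, repeating the key as needed."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

message = bytes([0b00111011])   # Message from the example
key = bytes([0b10110001])       # Key from the example

ciphertext = xor_bytes(message, key)
recovered = xor_bytes(ciphertext, key)

print(format(ciphertext[0], "08b"))  # Message ^ Key
print(format(recovered[0], "08b"))   # Message ^ Key ^ Key
```

The same function encrypts and decrypts, which is why the NIC can use one datapath for both directions.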
Cryptography (cont.)
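Idea: Implement simple cryptography using XOR
– 32-bit key
– Encrypt every word in payload with key
Note: XORing with a one-time pad of the same length as the message is secure/uncrackable, provided the pad is random and never reused. A short key repeated across the payload, as in this exercise, does not have that property. See: http://en.wikipedia.org/wiki/One-time_pad
[Figure: packet consisting of Header and Payload; each payload word is XORed (⨁) with a copy of the Key]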
Section VII: Implementation
Embedded Development Kit (1)
• A development environment for building “embedded systems”
• NetFPGA-10G is largely built upon the “EDK”
Embedded Development Kit (2)
• Xilinx EDK suite is composed of:
  – Platform Studio (XPS), a top-level integrated design tool for “hardware” synthesis, implementation and bitstream generation
  – Software Development Kit (SDK), a development environment for “software applications” running on embedded processors like MicroBlaze
Xilinx Platform Studio (1)
• An XPS project consists of the following:
  – system.xmp
    • top level XPS project file
  – system.mhs
    • microprocessor hardware specification file
    • defines the embedded processor system, and includes bus architecture, peripherals, processor, connectivity, etc.
  – system.ucf
    • user constraint file
    • defines constraints such as timing, area, IO placement, etc.
Xilinx Platform Studio (2)
• To invoke XPS design tool, run:
  # xps <project_root>/hw/system.xmp
• This will open the project in the XPS graphical user interface
• open a new terminal
• cd ~/NetFPGA-10G-live/projects/crypto_nic
• source /opt/Xilinx/13.4/ISE_DS/settings64.sh
• xps hw/system.xmp
XPS Design Tool (1)
[Figure: XPS main window, with the IP Catalog, Bus Assembly and System Assembly panes labelled]
XPS Design Tool (2)
• IP Catalog: contains categorized list of all available peripheral cores
• Bus Assembly: shows connectivity of various modules over AXI bus
• System Assembly: provides a complete view of instantiated cores
  – Cores can be added and deleted in this view
XPS Design Tool (2)
Port view
• Port view: provides listing of internal and external ports for each peripheral core
XPS Design Tool (3)
Address view
• Address view: defines base and high address value for peripheral cores connected to AXI4 or AXI-LITE bus
• These values can be changed manually, here
Getting started with a new project (1)
• Projects:
  – Each design represented by a project
  – Location: NetFPGA-10G-live/projects/<proj_name>
    • ~/NetFPGA-10G-live/projects/crypto_nic
  – Consists of:
    • Verilog source
    • Simulation tests
    • Hardware tests
    • Libraries
    • Optional software
Getting started with a new project (2)
– Normally:
  • copy an existing project as the starting point
– Today:
  • pre-created project
– Missing from pre-created project:
  • Verilog files (with crypto implementation)
  • Simulation tests
  • Hardware tests
  • Custom software
Getting started with a new project (3)
Typically implement functionality in one or more modules under the top wrapper
[Figure: reference datapath: four MAC RxQs and the CPU RxQ feed the Input Arbiter, followed by Output Port Lookup and Output Queues, which feed the per-port MAC and CPU queues; a Crypto block is added to this pipeline]
Crypto module to encrypt and decrypt packets
Getting started with a new project (4)
– Shared modules included from netfpga/lib/verilog
  • Generic modules that are re-used in multiple projects
  • Specify shared modules in project’s mhs file
– Local src modules override shared modules
– crypto_nic:
  Local      Shared
  crypto.v   Everything else
Getting started with a new project (5)
Create crypto.v from module_template.v:
1. Rename project project_template to crypto
2. Rename the local module_template.v to crypto.v
3. Change the module name inside crypto.v (first non-comment line of the file)
4. Add the crypto module to the mhs file
5. Update mpd file
6. Update pao file
Running XPS
cd ~/NetFPGA-10G-live/projects/crypto_nic
xps hw/system.xmp
Adding The Crypto Module
• Double click the Crypto module in IP catalog
  – It’s under “Project Local Pcores”
• Select 256 bit data width and click OK
Connect Crypto module
• In “Bus Interfaces”, click Crypto’s S_AXIS and select Input Output_lookup’s M_AXIS
• Click Output_queue’s S_AXIS and select Crypto’s M_AXIS
Connect Clock and Reset
• Click “Ports” Tab
• Clock -> clock_generator_0::CLKOUT1
• Reset -> reset_0::Peripheral_aresetn
Implementing the Crypto Module (1)
• What do we want to encrypt?
  – IP payload only
    • Plaintext IP header allows routing
    • Content is hidden
  – Encrypt bytes 35 onward
    • Bytes 1-14 – Ethernet header
    • Bytes 15-34 – IPv4 header (assume no options)
    • Remember AXI byte ordering
  – For simplicity, assume all packets are IPv4 without options
Implementing the Crypto Module (2)
• State machine (draw on next page):
  – Module headers on each packet
  – Datapath 256-bits wide
    • 34 / 32 is not an integer!
• Inside the crypto module
[Figure: block diagram of the crypto module between S_AXIS and M_AXIS; labels: Registers, in_fifo_empty, in_fifo_rd_en, data/ctrl, valid, ready]
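Because 34 header bytes do not fill a whole number of 32-byte words, the second word of each packet is part header and part payload. The sketch below works out which byte positions of each 256-bit word belong to the payload. It assumes the packet starts at the first data word and that position 0 is the first byte of the word in packet order; how positions map onto the bits of the data bus depends on the AXI byte ordering, which must be checked against the design. The function name payload_positions is illustrative only.

```python
HEADER_BYTES = 14 + 20      # Ethernet header + IPv4 header without options
WORD_BYTES = 256 // 8       # one 256-bit datapath word holds 32 bytes

def payload_positions(word_index: int) -> list[int]:
    """Byte positions within the given word that belong to the IP payload.

    Position 0 is assumed to be the first byte of the word in packet order.
    """
    first = word_index * WORD_BYTES
    return [p for p in range(WORD_BYTES) if first + p >= HEADER_BYTES]

for w in range(3):
    print(w, len(payload_positions(w)))
```

Under these assumptions word 0 is left untouched, word 1 has its first two bytes left in the clear, and every later word is encrypted in full.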
Crypto Module State Diagram
Hint: We suggest 3 states
[Figure: state diagram, with one state labelled “Detect Packet’s Header”]
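One possible way to arrive at three states is sketched below as a software model, not as the intended solution. The state names HEADER_0, HEADER_1 and PAYLOAD are hypothetical, and the model assumes the packet starts at the first data word and that the last word of each packet is flagged, as it would be by an end-of-packet signal.

```python
from enum import Enum, auto

class State(Enum):
    HEADER_0 = auto()   # first word: all header, pass through unchanged
    HEADER_1 = auto()   # second word: 2 header bytes, then payload
    PAYLOAD = auto()    # remaining words: all payload

def step(state: State, last: bool) -> tuple[State, int]:
    """Return (next_state, number of leading bytes to leave unencrypted)."""
    if state is State.HEADER_0:
        skip, nxt = 32, State.HEADER_1
    elif state is State.HEADER_1:
        skip, nxt = 2, State.PAYLOAD
    else:
        skip, nxt = 0, State.PAYLOAD
    # The last word of a packet returns the machine to header detection.
    return (State.HEADER_0 if last else nxt), skip

state = State.HEADER_0
trace = []
# A four-word packet followed by a two-word packet.
for last in [False, False, False, True, False, True]:
    state, skip = step(state, last)
    trace.append(skip)
print(trace)
```

In hardware the same decisions would be made in the combinational block that reads the FIFO, with the state held in a clocked register.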
Implementing the Crypto Module (3)
Implement your state machine inside crypto.v
– Use a static key initially
Suggested sequence of steps:
1. Create a static key value
  • Constants can be declared in the module with localparam:
    localparam MY_EXAMPLE = 32'h01234567;
2. Implement your state machine without modifying the packet
3. Update your state machine to modify the packet by XORing the key and the payload
  • Use eight copies of the key to create a 256-bit value to XOR with data words
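The last step can be modelled in software to see what the 256-bit XOR value looks like. This is a sketch only: the constant is the example value from the localparam line, not a real key, and replicate_key is an illustrative name. In Verilog the same widening is usually written with the replication operator, for example {8{MY_EXAMPLE}}.

```python
KEY = 0x01234567            # example constant from the slide, not a real key
DATA_WIDTH = 256
KEY_WIDTH = 32

def replicate_key(key: int) -> int:
    """Concatenate DATA_WIDTH // KEY_WIDTH copies of key into one integer."""
    wide = 0
    for _ in range(DATA_WIDTH // KEY_WIDTH):
        wide = (wide << KEY_WIDTH) | key
    return wide

wide_key = replicate_key(KEY)
word = int.from_bytes(bytes(range(32)), "big")   # arbitrary 256-bit test word
encrypted = word ^ wide_key
decrypted = encrypted ^ wide_key

print(f"{wide_key:064x}")
```

XORing the encrypted word with the same wide key restores the original word, so the receive path needs no separate logic.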
module_template.v (1)
module module_template
#(
    parameter C_FAMILY = "virtex5",
    parameter C_M_AXIS_DATA_WIDTH=256,
    parameter C_S_AXIS_DATA_WIDTH=256,
    parameter C_M_AXIS_TUSER_WIDTH=128,
    parameter C_S_AXIS_TUSER_WIDTH=128,
    ...
)
(
    ...   // Module port declaration
)
//----------------------- Signals ----------------------------
...
//------------------ Local assignments -----------------------
...
//------------------------ Registers ----------------------
...
module_template.v (2)
//------------------------- Modules -------------------------------
fallthrough_small_fifo
#(
    .WIDTH(C_S_AXIS_DATA_WIDTH+C_S_AXIS_TUSER_WIDTH+(C_S_AXIS_DATA_WIDTH / 8)+1),
    .MAX_DEPTH_BITS(2)
)
input_fifo
(
    .din (),          // Data in
    .wr_en (),        // Write enable
    .rd_en (),        // Read the next word
    .dout (),         // Data out
    .full (),         // FIFO full indication
    .nearly_full (),  // Almost full indication
    .empty (),        // FIFO empty indication
    .reset (),        // Reset
    .clk ()           // Clock
);
Packet data dumped in a FIFO. Allows some “decoupling” between input and output.
module_template.v (3)
// -- AXILITE IPIF
axi_lite_ipif_1bar #(. . .) axi_lite_ipif_inst ( . . . );
// -- IPIF REGS
ipif_regs #(. . .) ipif_regs_inst ( . . . );
//------------------------ Registers logic ----------------------
Registers section.
Ignore for now – we’ll explore this later
module_template.v (4)
//------------------------- Logic -------------------------------
always @(*) begin
    // Default values
    out_wr_int = 0;
    in_fifo_rd_en = 0;
    if (!in_fifo_empty && out_rdy) begin
        out_wr_int = 1;
        in_fifo_rd_en = 1;
    end
end
Combinational logic to read data from the FIFO. (Data is output to output ports.)
You’ll want to add your state machine in this section.
More Verilog: Assignments 1
• Continuous assignments
  – appear outside processes (always @ blocks):
    assign foo = baz & bar;
  – targets must be declared as wires
  – always “happening” (i.e., are concurrent)
More Verilog: Assignments 2
• Non-blocking assignments
  – appear inside processes (always @ blocks)
  – use only in sequential (clocked) processes:
    always @(posedge clk) begin
        a <= b;
        b <= a;
    end
  – occur in next delta (‘moment’ in simulation time)
  – targets must be declared as regs
  – never clock any process other than with a clock!
More Verilog: Assignments 3
• Blocking assignments
  – appear inside processes (always @ blocks)
  – use only in combinatorial processes:
    • (combinatorial processes are much like continuous assignments)
    always @(*) begin
        a = b;
        b = a;
    end
  – occur one after the other (as in sequential langs like C)
  – targets must be declared as regs – even though not a register
  – never use in sequential (clocked) processes!
More Verilog: Assignments 3
• Blocking assignments
  – appear inside processes (always @ blocks)
  – use only in combinatorial processes:
    • (combinatorial processes are much like continuous assignments)
    always @(*) begin
        tmp = a;
        a = b;
        b = tmp;
    end
  – occur one after the other (as in sequential langs like C)
  – targets must be declared as regs – even though not a register
  – never use in sequential (clocked) processes!
  – unlike non-blocking, have to use a temporary signal
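The difference in ordering between the two assignment styles can be mimicked in ordinary software. The Python functions below are only an analogy for the evaluation order, not a hardware model, and their names are illustrative.

```python
def nonblocking_swap(a, b):
    # All right-hand sides are sampled first, then all targets update together.
    next_a, next_b = b, a
    return next_a, next_b

def blocking_no_tmp(a, b):
    a = b       # a is overwritten immediately
    b = a       # b receives the new a, which is the old b
    return a, b

def blocking_with_tmp(a, b):
    tmp = a     # keep the old a before it is overwritten
    a = b
    b = tmp
    return a, b

print(nonblocking_swap(1, 2))
print(blocking_no_tmp(1, 2))
print(blocking_with_tmp(1, 2))
```

Only the version without a temporary loses a value, which matches the note that blocking assignments need a temporary signal to swap.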
(hints)
• Never assign one signal from two processes:
  always @(posedge clk) begin
      foo <= bar;
  end
  always @(posedge clk) begin
      foo <= quux;
  end
(hints)
• In combinatorial processes:
  – take great care to assign in all possible cases
    always @(*) begin
        if (cond) begin
            foo = bar;
        end
    end
  – here foo is not assigned when cond is false, so a latch is inferred (latches ‹as opposed to flip-flops› are bad for timing closure)
(hints)
• In combinatorial processes:
  – take great care to assign in all possible cases
    always @(*) begin
        if (cond) begin
            foo = bar;
        end else begin
            foo = quux;
        end
    end
(hints)
• In combinatorial processes:
  – (or assign a default)
    always @(*) begin
        foo = quux;
        if (cond) begin
            foo = bar;
        end
    end
Section VIII: Simulation and Debug
Testing: Simulation
• Simulation allows testing without requiring lengthy synthesis process
• NetFPGA simulation environment allows:
  – Send/receive packets
    • Physical ports and CPU
  – Read/write registers
  – Verify results
• Simulations run in iSim
Testing: Simulation
• We will simulate the “crypto_nic” design under the “10G simulation framework”
• We will show you how to
  – create simple packets using scapy
  – transmit and reconcile packets sent over 10G Ethernet and PCIe interfaces
Testing: Simulation(2)
Run a simulation to verify changes:
1. cd ~/NetFPGA-10G-live/projects/crypto_nic/hw
2. make sim
or
1. cd ~/NetFPGA-10G-live/projects/crypto_nic/hw
2. make simgui
make simgui opens iSim (more on this later)
Now we can implement the crypto functionality
cd ~/NetFPGA-10G-live/projects/crypto_nic/hw
vim make_pkts.py
• This will create eight simple IP packets for injecting into the system from each port including CPU
• Using the created packets we create the stimulus (input to the simulator) and the expected output from the simulator
• NB. line 97 deciphers the packets from the device
• To test without a key, edit ~/NetFPGA-10G-live/projects/crypto_nic/hw/Makefile and set the argument of -k to be 0x0
The destination in this code snippet is the DMA
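The expected output for a packet can be derived from the stimulus by XORing everything after the headers with the key. The sketch below does this on raw bytes without scapy. It is not the code in make_pkts.py: the function name crypt_payload is hypothetical, and it assumes the key is applied most-significant byte first starting at the first payload byte, which may differ from the alignment used by the real design.

```python
HEADER_BYTES = 34   # Ethernet (14) + IPv4 without options (20)

def crypt_payload(pkt: bytes, key: int) -> bytes:
    """XOR everything after the headers with the 32-bit key, repeated.

    Assumption: the key is applied most-significant byte first, starting
    at the first payload byte.
    """
    key_bytes = key.to_bytes(4, "big")
    body = bytes(b ^ key_bytes[i % 4] for i, b in enumerate(pkt[HEADER_BYTES:]))
    return pkt[:HEADER_BYTES] + body

pkt = bytes(HEADER_BYTES) + b"hello NetFPGA"   # dummy headers plus payload
expected = crypt_payload(pkt, 0x01234567)
unchanged = crypt_payload(pkt, 0x0)            # a key of 0x0 disables encryption
```

With a key of 0x0 the expected packets equal the stimulus, which is why setting -k to 0x0 tests the datapath without encryption.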
The destinations in this code snippet are the ports
The results are in…
• As expected, two packets are received from PCIe on each 10G interface
• Total of 32 packets, eight from each 10G interface, are received on the PCIe
Running simulation in iSim (1)
• To invoke iSim for design simulation, run:
  # cd <project_root>/hw/
  # make simgui
• This will open simulation in the iSim graphical user interface
Running simulation in iSim (2)
[Figure: iSim main window, with the Instance panel, Objects panel, Waveform window and Tcl console labelled]
Running simulation in iSim (3)
• Instance panel: displays process and instance hierarchy
• Objects panel: displays simulation objects associated with the instance selected in the instance panel
• Waveform window: displays wave configuration consisting of signals and busses
• Tcl console: displays simulator-generated messages and can execute Tcl commands
Simulation gone wild
When make sim goes wrong:
1. source /opt/Xilinx/13.4/ISE_DS/settings64.sh
2. …XPS% make: *** No rule to make target `___xps/simgen.opt’, needed …..
   Go fix the address in ….hw/system.mhs
3. ERROR:EDK – for point-2-point interface connector ‘nf10_crypto_0_m_axis’…..
   3a. Add the crypto module to the MHS file (edit system.mhs OR use the GUI)
   3b. Check the connectivity of the module (to each of the OPL and Output Queue modules)
4. ….Cannot find MPD for weird core name ….
5. ERROR: Simulator:778 – Static elaboration of top level Verilog design unit(s) in library work failed
   Edit the pao file (~/NetFPGA-10G-live/projects/crypto_nic/hw/pcores/nf10_crypto_v1_00_a/data/nf10_crypto_v2_1_0.pao)
   For simulation: uncomment the first four lines (39-42), comment out the last line here (43)
   For synthesis: uncomment lines 39, 40, 43; comment out lines 41 and 42
6. If sim finishes but complains that each test passes 2 packets but all tests FAIL, this means your static key is different between your code and your Makefile (argument to make_pkts.py)
• check the log (verilog
Simple Verilog error Lots of errors
When the file is empty…
Section IX: Conclusion
Acknowledgments
NetFPGA Team at Stanford University (Past and Present):
Nick McKeown, Glen Gibb, Jad Naous, David Erickson, G. Adam Covington, John W. Lockwood, Jianying Luo, Brandon Heller, Paul Hartke, Neda Beheshti, Sara Bolouki, James Zeng, Jonathan Ellithorpe, Sachidanandan Sambandan, Eric Lo
NetFPGA Team at University of Cambridge (Past and Present):
Andrew Moore, David Miller, Muhammad Shahbaz, Martin Zadnik, Matthew Grosvenor, Gianni Antichi, Neelakandan Manihatty-Bojan, Georgina Kalogeridou, Jong Hun Han, Noa Zilberman
All Community members (including but not limited to):
Paul Rodman, Kumar Sanghvi, Wojciech A. Koszek, Yashar Ganjali, Martin Labrecque, Jeff Shafer, Eric Keller, Tatsuya Yabe, Bilal Anwer, Kees Vissers, Michaela Blott, Shep Siegel
Thanks to our Sponsors:
• Support for the NetFPGA project has been provided by the following companies and institutions
Disclaimer: Any opinions, findings, conclusions, or recommendations expressed in these materials do not necessarily reflect the views of the National Science Foundation or of any other sponsors supporting this project. This effort is also sponsored by the Defense Advanced Research Projects Agency (DARPA) and the Air Force Research Laboratory (AFRL), under contract FA8750-11-C-0249. This material is approved for public release, distribution unlimited. The views expressed are those of the authors and do not reflect the official policy or position of the Department of Defense or the U.S. Government.