Lecture 3 1
ECE 412: Microcomputer Laboratory
Lecture 3: Introduction to FPGAs
Objectives
• Understand the basics of how FPGAs work
– So that the CAD tools make sense to you
Review Questions
• This semester, as quiz preparation, I hope to begin each lecture with one or two questions based on the last lecture’s material. The questions and their answers will both be posted on the web in the LECTURES portion, as usual. We will spend a few minutes discussing the answers and any other items not clear from the earlier lectures; feel free to ask anything!
Programmable Logic Device Advantages
• Short TAT (total turnaround time)
• No or very low NRE (non-recurring engineering) cost
• Field-reprogrammable
• Platform-based design
Today - Two Major Types of Programmable Logic
• CPLD (complex programmable logic device)
– coarse-grained two-level AND-OR programmable logic arrays (PLAs)
– fast and more predictable delay
– simpler interconnect structures
• FPGA (field programmable gate array)
– fine-grained logic cells
– high logic density
– good design flexibility
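A Generic FPGA Architecture
[Figure: a generic FPGA made of programmable IO, programmable logic blocks (CLBs), and programmable routing. Each CLB has I inputs, a clock, and N outputs, and contains N basic logic elements (BLE #1 to BLE #N). Each BLE is a K-input LUT followed by a D flip-flop (D FF), with a clock input and an output (Out).]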
Conceptual Structure of a Logic Block and Its Peripheral
[Figure: a logic block containing N BLEs (each a K-input LUT plus FF) behind local buffers and routing muxes. Labels: I inputs to logic block, N + I, routing wire segments on both sides of the block, and SRAM-controlled programmable switches.]
An Implementation of a 4-input Look-up Table (4-LUT)
Out = f(in0, in1, in2, in3)
[Figure: 16 SRAM cells, one of which is selected by In0, In1, In2, and In3 to drive Out.]
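A minimal Python sketch of the 4-LUT idea, offered as an illustration rather than as any vendor's implementation: the 16 SRAM bits are simply the truth table of f, and the four inputs form the address of the bit to read. The names `make_lut` and `lut_read`, and the example function `f`, are assumptions made for this sketch.

```python
from itertools import product


def make_lut(func, k=4):
    """Return the 2**k configuration bits (the SRAM contents) for func."""
    bits = []
    for index in range(2 ** k):
        # Bit i of the index is the value of input in_i for this SRAM cell.
        inputs = [(index >> i) & 1 for i in range(k)]
        bits.append(func(*inputs) & 1)
    return bits


def lut_read(bits, *inputs):
    """Use the inputs as an address that selects one stored bit."""
    address = sum(bit << i for i, bit in enumerate(inputs))
    return bits[address]


# Hypothetical example function: out = (in0 AND in1) XOR (in2 OR in3)
def f(in0, in1, in2, in3):
    return (in0 & in1) ^ (in2 | in3)


sram = make_lut(f)

# The configured LUT reproduces f for every input combination.
lut_matches = all(
    lut_read(sram, *ins) == f(*ins) for ins in product((0, 1), repeat=4)
)
```

Any other 4-input function is implemented by loading a different 16-bit pattern; the read path does not change.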
Interconnect
• 2-dimensional mesh of wires, with switching elements at wire crossings to control routing
– Bit patterns stored into the switch SRAMs determine routing
– Switch connections programmed as part of configuring array
• To optimize for speed, many designs include multiple lengths of wire
– Single-length (connect adjacent switches)
– Double-length (connect to switches two hops away)
– Long lines (run entire length/width of array)
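The following Python sketch shows how a stored bit pattern can determine routing at one wire crossing. It assumes a hypothetical switch point where four wire ends (N, E, S, W) meet and each pair of ends is joined by one switch controlled by one SRAM bit; real devices differ in how many of these switches they provide.

```python
from itertools import combinations

SIDES = ("N", "E", "S", "W")
# Assumed structure: one switch (and one SRAM bit) per pair of wire ends.
# Order: N-E, N-S, N-W, E-S, E-W, S-W
SWITCH_PAIRS = list(combinations(SIDES, 2))


def connected_groups(sram_bits):
    """Return the groups of wire ends joined by the configured switches."""
    if len(sram_bits) != len(SWITCH_PAIRS):
        raise ValueError("expected one SRAM bit per switch")
    group = {side: {side} for side in SIDES}
    for bit, (a, b) in zip(sram_bits, SWITCH_PAIRS):
        if bit:
            # A closed switch merges the two electrical nets.
            merged = group[a] | group[b]
            for side in merged:
                group[side] = merged
    unique = {frozenset(g) for g in group.values()}
    return sorted(sorted(g) for g in unique)


# Bits 1 and 4 set: N-S and E-W pass straight through without touching.
straight = connected_groups([0, 1, 0, 0, 1, 0])
# Bits 0 and 3 set: N-E and E-S closed, so N, E, and S form one net.
fanout = connected_groups([1, 0, 0, 1, 0, 0])
```

Configuring the array amounts to loading such bit patterns into every switch in the mesh.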
Interconnects
• Architecture parameters:
– channel width (W)
– switch block flexibility (Fs, the number of wires to which each incoming wire can connect in a switch block)
– connection block flexibility (Fc, the number of wires in each channel to which a logic block input or output pin can connect)
– segmented wire lengths (the number of logic blocks a wire segment spans)
Commercial FPGAs
• Commercial FPGA chips contain a large number of dedicated interconnects with different fixed lengths.
• These interconnects are usually point-to-point and uni-directional connections for performance improvement.
• There are dedicated carry chain and register chain interconnects within and between logic blocks as well.
[Figure: Dedicated interconnects in Xilinx’s Spartan-3E]
[Figure: An 8-bit carry-select adder in Altera’s Stratix. Logic elements LE1 to LE4 take A0..A3 and B0..B3 and produce Sum0..Sum3; LE5 to LE8 take A4..A7 and B4..B7 and produce Sum4..Sum7. Each group of four has a 1/0 select; Cin enters the chain and Cout leaves it.]
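As a rough behavioral model of the carry-select idea named in the figure caption, the Python sketch below splits an 8-bit addition into two 4-bit blocks, computes each block for both possible carry-ins, and lets the real carry pick the result. It does not reproduce the Stratix LE wiring; the function names and the 4-bit block size are assumptions for illustration.

```python
def ripple_add(a_bits, b_bits, carry_in):
    """Add two little-endian bit lists with a ripple-carry chain."""
    sums, carry = [], carry_in
    for a, b in zip(a_bits, b_bits):
        sums.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return sums, carry


def carry_select_add8(a, b, cin=0):
    """Return (8-bit sum, carry out) using two 4-bit carry-select blocks."""
    a_bits = [(a >> i) & 1 for i in range(8)]
    b_bits = [(b >> i) & 1 for i in range(8)]
    sum_bits, carry = [], cin
    for lo in (0, 4):
        blk_a, blk_b = a_bits[lo:lo + 4], b_bits[lo:lo + 4]
        # Both candidate results are computed without waiting for the carry.
        s0, c0 = ripple_add(blk_a, blk_b, 0)
        s1, c1 = ripple_add(blk_a, blk_b, 1)
        # The incoming carry acts as the select input of a 2:1 mux.
        sum_bits += s1 if carry else s0
        carry = c1 if carry else c0
    total = sum(bit << i for i, bit in enumerate(sum_bits))
    return total, carry


result = carry_select_add8(200, 100)  # 300 = 256 + 44, so sum 44, carry 1
```

The benefit in hardware is that the upper block does not wait for the lower block's carry to ripple through it; only the final selection depends on that carry.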
Commercial FPGAs
• Modern FPGAs also provide embedded IP cores, such as memories, DSP blocks, and processors, to facilitate the implementation of system-on-a-chip (SoC) designs.
Xilinx CLB – a.k.a. “Slice”
[Figure: Xilinx slice schematic, with labels for input variables, control variables, clock, and output registers.]
Input-Output Blocks
• One IOB per FPGA pin
– Allows pin to be used as input, output, or bidirectional (tri-state)
• Inputs
– Direct
– Registered
– Drive dedicated decoder logic for address recognition
• IOB may also include logic for boundary scan (JTAG)
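The bidirectional and registered-input behavior listed above can be sketched in Python as follows. This is a simplified behavioral model under stated assumptions, not the Xilinx IOB: the class name, method names, and the use of `None` for an undriven pin are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IOB:
    """Toy model of one I/O block with an optional input register."""
    registered_input: bool = False
    _input_ff: int = 0

    def pad_value(self, out, output_enable, external=None):
        """Value seen on the pin.

        With the tri-state buffer enabled the IOB drives the pin;
        otherwise the pin carries whatever the outside world drives
        (None models an undriven, floating pin).
        """
        return out if output_enable else external

    def read_input(self, pad, clock_edge=False):
        """Direct inputs pass through; registered inputs update on a clock edge."""
        if not self.registered_input:
            return pad
        if clock_edge:
            self._input_ff = pad
        return self._input_ff


direct = IOB()
registered = IOB(registered_input=True)

driven = direct.pad_value(out=1, output_enable=True, external=0)
released = direct.pad_value(out=1, output_enable=False, external=0)
before_edge = registered.read_input(1)                 # register still holds 0
after_edge = registered.read_input(1, clock_edge=True)
```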
Xilinx IOB
[Figure: Xilinx IOB schematic, with labels for the tri-state output buffer, input, and output.]
A Typical FPGA Design Flow (1)
[Flow: RTL design → RTL elaboration → Architecture-independent optimization → Technology mapping and architecture-specific optimization → Clustering and placement → Placement-driven optimization and incremental placement → Routing → Bitstream generation → Bitstream]
• RTL design
– The most widely used design specification languages are Verilog and VHDL at the register transfer (RT) level
– There is a new trend toward specification at a higher level of abstraction, where a behavioral synthesis tool is used to generate the RTL specification in Verilog or VHDL
• RTL elaboration
– Identifies and/or infers datapath operations, such as additions, multiplications, register files, and/or memory blocks, and control logic
A Typical FPGA Design Flow (2)
• Architecture-independent optimization
– Datapath optimization
– Control logic optimization
• Technology mapping and architecture-specific optimization
– Datapath to on-chip dedicated circuit structures
– Control logic to basic programmable logic elements (BLEs)
– Datapath operations to BLEs
A Typical FPGA Design Flow (3)
• Clustering and placement
– Clustering and placement can be carried out separately or simultaneously
• Placement-driven optimization and incremental placement
– Deal with the interconnect bottleneck
– Incremental placement to legalize
• Routing
– Global routing and detailed routing
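To make the link between placement and interconnect concrete, here is a small Python sketch of half-perimeter wirelength (HPWL), a common estimate of routing demand used by placement tools. The slides do not name a cost function, so treating HPWL as the metric is an assumption of this example, as are the block names and coordinates.

```python
def hpwl(net, placement):
    """Half-perimeter of the bounding box around all blocks on one net."""
    xs = [placement[block][0] for block in net]
    ys = [placement[block][1] for block in net]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_wirelength(nets, placement):
    """Sum the HPWL estimate over every net in the design."""
    return sum(hpwl(net, placement) for net in nets)


# Hypothetical design: four logic blocks connected by two nets.
nets = [("a", "b"), ("b", "c", "d")]

# Two candidate placements, given as (column, row) positions on the array.
spread = {"a": (0, 0), "b": (3, 3), "c": (0, 3), "d": (3, 0)}
compact = {"a": (0, 0), "b": (1, 0), "c": (1, 1), "d": (0, 1)}

spread_cost = total_wirelength(nets, spread)    # 6 + 6 = 12
compact_cost = total_wirelength(nets, compact)  # 1 + 2 = 3
```

A placer would prefer the compact arrangement, since shorter estimated wires leave more routing resources free and tend to reduce delay.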
FPGAs -- Pros
• Reasonably cheap at low volume
– Good for low-volume parts, more expensive than IC for high-volume parts
– Can migrate from SRAM based to fuse based when volume ramps up
• Short design cycle (~1 sec programming time)
• Reprogrammable
– Can download bug fix into units you’ve already shipped
• Large capacity (millions of gates, though we won’t use any that big)
– FPGAs in the lab are “rated” at ~1M gates for 30K LEs
• More flexible than PLDs -- can have internal state
FPGAs -- Cons
• Lower capacity and speed, and higher power consumption, than an ASIC
– Sub-optimal mapping of logic into CLBs, often 60% utilization
– Much lower clock frequency than max CLB toggle rate, often 40%
– Less dense layout and placement and slower operation due to programmability
  • Overhead of configurable interconnect and logic blocks
• CPLDs may be faster than FPGAs for designs they can handle
Evolution of the FPGA
• Early FPGAs used mainly for “glue logic” between other components
– Simple CLBs, small number of inputs
– Focus was on implementing “random” logic efficiently
• As capacities grew, other applications emerged
– FPGAs as alternative to custom ICs for entire applications
– Computing with FPGAs
• FPGAs have changed to meet new application demands
– Carry chains, better support for multi-bit operations
– Integrated memories, such as the block RAMs in the devices we’ll use
– Specialized units, such as multipliers, to implement functions that are slow/inefficient in CLBs
– Newer devices incorporate entire CPUs: Xilinx Virtex device family has 1-4 PowerPC CPUs
  • Devices that don’t have CPU hardware generally support synthesized CPUs
Next Lecture
• Introduction to VHDL