COE 405 Programmable Logic and Storage Devices

Dr. Aiman H. El-Maleh, Computer Engineering Department, King Fahd University of Petroleum & Minerals


Post on 12-Jan-2016



Page 2: Outline

History of Computational Fabrics
ASIC vs. FPGA
Reconfigurable Logic
Anti-Fuse-Based Approach (Actel)
RAM Based Field Programmable Logic (Xilinx)
CLBs
Carry & Control Logic
FPGA Memory Implementation

Page 3: History of Computational Fabrics

Discrete devices: relays, transistors (1940s-50s)
Discrete logic gates (1950s-60s)
Integrated circuits (1960s-70s)
• e.g. TTL packages: Data Book for 100s of different parts
Gate Arrays (IBM, 1970s)
• Transistors are pre-placed on the chip, and place-and-route software puts the chip together automatically; only the interconnect is programmed (mask programming)
Software-Based Schemes (1970s-present)
• Run instructions on a general-purpose core

Page 4: History of Computational Fabrics

ASIC Design (1980s to present)
• Turn Verilog directly into layout using a library of standard cells
• Effective for high-volume production and efficient use of silicon area
Programmable Logic (1980s to present)
• A chip that can be programmed (and, for many device types, reprogrammed) after it has been fabricated
• Examples: PALs, PLAs, EPROM, EEPROM, PLDs, FPGAs
• Excellent support for mapping from Verilog

Page 5: What is an FPGA?

A field-programmable gate array (FPGA) is a reprogrammable silicon chip.

Using prebuilt logic blocks and programmable routing resources, you can configure these chips to implement custom hardware functionality without ever having to pick up a breadboard or soldering iron.

You develop digital computing tasks in software and compile them down to a configuration file, or bitstream, that contains information on how the components should be wired together.

Page 6: ASIC vs. FPGA

ASIC (Application Specific Integrated Circuit)
• designed all the way from behavioral description to physical layout
• designs must be sent for expensive and time-consuming fabrication in a semiconductor foundry

FPGA (Field Programmable Gate Array)
• no physical layout design; design ends with a bitstream used to configure a device
• bought off the shelf and reconfigured by designers themselves

Page 7: ASIC vs. FPGA

ASICs
• High performance
• Low power
• Low cost in high volumes

FPGAs
• Off-the-shelf
• Low development cost
• Short time to market
• Reconfigurability

Page 8: Other FPGA Advantages

The manufacturing cycle for an ASIC is very costly and lengthy, and engages a lot of manpower
• Mistakes not detected at design time have a large impact on development time and cost
• FPGAs are perfect for rapid prototyping of digital circuits
Easy upgrades, as in the case of software
FPGAs provide a flexible platform for implementing digital computing
A rich set of macros and I/Os is supported (multipliers, block RAMs, ROMs, high-speed I/O)
A wide range of applications, from prototyping (to validate a design before ASIC mapping) to high-performance spatial computing

Page 9: How are FPGAs Used?

Prototyping
• Ensemble of gate arrays used to emulate a circuit to be manufactured
• Get more/better/faster debugging done than with simulation
Reconfigurable hardware
• One hardware block used to implement more than one function
Special-purpose computation engines
• Hardware dedicated to solving one problem (or class of problems)
• Accelerators attached to general-purpose computers (e.g., in a cell phone!)

Page 10: Major FPGA Vendors

SRAM-based FPGAs
• Xilinx, Inc.
• Altera Corp.
• Atmel
• Lattice Semiconductor
Flash & antifuse FPGAs
• Actel Corp.
• QuickLogic Corp.

Share over 60% of the market

Page 11: Reconfigurable Logic

Page 12: Anti-Fuse-Based Approach (Actel)

Page 13: Actel Logic Module

Figure panels: Combinational Block; Example Gate Mapping; S-R Latch

Page 14: Actel Routing & Programming

Page 15: RAM Based Field Programmable Logic - Xilinx

Page 16: Xilinx FPGA Families

Old families
• XC3000, XC4000, XC5200
• Old 0.5µm, 0.35µm and 0.25µm technology. Not recommended for modern designs.
High-performance families
• Virtex (0.22µm)
• Virtex-E, Virtex-EM (0.18µm)
• Virtex-II, Virtex-II PRO (0.13µm)
• Virtex-4 (0.09µm)
Low-cost family
• Spartan/XL (derived from XC4000)
• Spartan-II (derived from Virtex)
• Spartan-IIE (derived from Virtex-E)
• Spartan-3

Page 17: FPGA Nomenclature

Page 18: Device Part Marking

Page 19: The Xilinx 4000 CLB

Page 20: Two 4-input Functions, Registered Output and a Two Input Function

Page 21: 5-input Function, Combinational Output

Page 22: 5-Input Functions implemented using two LUTs

Page 23: LUT Mapping

An N-LUT is a direct implementation of a truth table: it can realize any function of N inputs.

An N-LUT requires 2^N storage elements (latches)
The N inputs select one latch location (like a memory address)
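The truth-table view of a LUT can be sketched in Verilog for N = 4. This is only a behavioral model under stated assumptions: the module name `lut4_model` and the `INIT` parameter are illustrative and are not a vendor primitive.

```verilog
// Behavioral sketch of a 4-input LUT (hypothetical module, not a vendor cell).
// INIT holds the 2^4 = 16 truth-table bits; the inputs act as an address.
module lut4_model #(
    parameter [15:0] INIT = 16'h8000   // assumed default: 4-input AND
) (
    input  wire [3:0] in,    // the N = 4 select inputs
    output wire       out
);
    wire [15:0] truth = INIT;          // the 16 stored bits
    assign out = truth[in];            // inputs select one stored bit
endmodule
```

With `INIT = 16'h8000` only entry 15 is set, so the output is 1 only when all four inputs are 1. Changing `INIT` changes the function without changing the structure.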

Page 24: Configuring the CLB as a RAM

Page 25: Xilinx 4000 Interconnect

Page 26: Xilinx 4000 Interconnect Details

Page 27: Xilinx 4000 Flexible IOB

Page 28: Basic I/O Block Structure

Page 29: IOB Functionality

The IOB provides the interface between the package pins and the CLBs

Each IOB can work as uni- or bi-directional I/O
Outputs can be forced into high impedance
Inputs and outputs can be registered
• advised for high-performance I/O
Inputs can be delayed
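A bidirectional, registered I/O of the kind listed above might be described as follows. This is a minimal sketch: the module and signal names (`iob_model`, `out_d`, `oe`, `in_q`) are assumptions, and the programmable input delay is not modeled.

```verilog
// Sketch of a bidirectional pad with registered input, output and
// output-enable (hypothetical names; input delay is not modeled).
module iob_model (
    input  wire clk,
    input  wire out_d,   // data from the fabric
    input  wire oe,      // 1 = drive the pad, 0 = high impedance
    output reg  in_q,    // registered input to the fabric
    inout  wire pad      // package pin
);
    reg out_q;
    reg oe_q;

    always @(posedge clk) begin
        out_q <= out_d;  // output register
        oe_q  <= oe;     // tri-state control register
        in_q  <= pad;    // input register
    end

    // Tri-state driver: the pad floats (Z) when the output is disabled.
    assign pad = oe_q ? out_q : 1'bz;
endmodule
```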

Page 30: Additional Features in Modern FPGAs

Page 31: Spartan-3 Xilinx FPGA Block Diagram

Page 32: CLB Structure

Page 33: CLB Slice Structure

Each slice contains two sets of the following:
• Four-input LUT
  • Any 4-input logic function,
  • or 16-bit x 1 sync RAM
  • or 16-bit shift register
• Carry & Control
  • Fast arithmetic logic
  • Multiplier logic
  • Multiplexer logic
• Storage element
  • Latch or flip-flop
  • Set and reset
  • True or inverted inputs
  • Sync. or async. control

Page 34: Xilinx Multipurpose LUT (MLUT)

Figure labels: 16-bit SR; 16 x 1 RAM; 4-input LUT / 16 x 1 ROM (logic)

Page 35: 5-Input Functions implemented using two LUTs

One CLB slice can implement any function of 5 inputs
The logic function is partitioned between two LUTs
The F5 multiplexer selects which LUT output is used
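The partitioning above is a Shannon expansion on the fifth input: one LUT holds the function for `in[4] = 0`, the other for `in[4] = 1`, and the F5 multiplexer picks between them. A behavioral sketch, with the module and parameter names (`func5_two_luts`, `INIT_LO`, `INIT_HI`) chosen here purely for illustration:

```verilog
// Sketch: any 5-input function from two 4-input LUTs plus a 2:1 mux.
// Names are hypothetical; this models behavior, not a specific slice.
module func5_two_luts #(
    parameter [15:0] INIT_LO = 16'h0000,  // truth table when in[4] = 0
    parameter [15:0] INIT_HI = 16'h0000   // truth table when in[4] = 1
) (
    input  wire [4:0] in,
    output wire       out
);
    wire [15:0] tt_lo = INIT_LO;
    wire [15:0] tt_hi = INIT_HI;

    wire f = tt_lo[in[3:0]];     // first LUT output
    wire g = tt_hi[in[3:0]];     // second LUT output

    assign out = in[4] ? g : f;  // F5 multiplexer
endmodule
```

For example, `INIT_LO = 16'h0000` and `INIT_HI = 16'h8000` would give a 5-input AND.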

Page 36: Distributed RAM

CLB LUT configurable as Distributed RAM
• A LUT equals 16x1 RAM
• Implements single and dual ports
• Cascade LUTs to increase RAM size
Synchronous write
Synchronous/asynchronous read
• Accompanying flip-flops used for synchronous read
Two LUTs can make
• 32 x 1 single-port RAM
• 16 x 2 single-port RAM
• 16 x 1 dual-port RAM
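The 16 x 1 dual-port case can be sketched behaviorally as a small array with a clocked write and combinational reads. The module and port names below are assumptions; whether a given synthesis tool maps this description onto LUT RAM depends on the tool and its settings.

```verilog
// Sketch of a 16 x 1 dual-port distributed RAM (hypothetical names):
// synchronous write, asynchronous read on both ports.
module dist_ram16x1d (
    input  wire       clk,
    input  wire       we,
    input  wire [3:0] a,     // write address, also read address of port 1
    input  wire [3:0] dpra,  // read-only address of port 2
    input  wire       d,
    output wire       spo,   // port 1 read data
    output wire       dpo    // port 2 read data
);
    reg mem [0:15];          // 16 one-bit locations

    always @(posedge clk)
        if (we) mem[a] <= d; // synchronous write

    assign spo = mem[a];     // asynchronous reads
    assign dpo = mem[dpra];
endmodule
```

Registering `spo` or `dpo` in the accompanying flip-flops would turn the read into a synchronous one, as the slide notes.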

Page 37: Shift Register

Each LUT can be configured as a shift register
• Serial in, serial out
Dynamically addressable delay of up to 16 cycles
For programmable pipelines
Cascade for greater cycle delays
Use CLB flip-flops to add depth
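A behavioral sketch of such an addressable shift register follows. The module name `srl16_model` and its ports are illustrative assumptions rather than a vendor primitive; the address input selects the tap, so the delay is `a + 1` clock cycles (1 to 16).

```verilog
// Sketch of a 16-bit addressable shift register (hypothetical names).
// Serial in on d; q taps the chain at position a.
module srl16_model (
    input  wire       clk,
    input  wire       ce,   // clock enable: shift only when high
    input  wire       d,    // serial input
    input  wire [3:0] a,    // tap select: delay = a + 1 cycles
    output wire       q     // serial output
);
    reg [15:0] sr = 16'h0000;

    always @(posedge clk)
        if (ce) sr <= {sr[14:0], d};  // shift towards the MSB

    assign q = sr[a];                 // dynamically addressable tap
endmodule
```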

Page 38: Shift Register

Register-rich FPGA
• Allows for addition of pipeline stages to increase throughput
Data paths must be balanced to keep the desired functionality
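Balancing can be illustrated with a small example computing `a * b + c`. When the product is registered, `c` needs a matching register so that both operands of the adder belong to the same input sample. The module name and widths below are assumptions made for the sketch.

```verilog
// Sketch of a balanced two-stage pipeline for y = a * b + c
// (hypothetical module; widths chosen for illustration).
module balanced_pipe (
    input  wire        clk,
    input  wire [7:0]  a, b,
    input  wire [15:0] c,
    output reg  [16:0] y
);
    reg [15:0] prod_r;  // stage 1: registered product
    reg [15:0] c_r;     // stage 1: matching delay for c

    always @(posedge clk) begin
        prod_r <= a * b;
        c_r    <= c;              // keeps c aligned with the product
        y      <= prod_r + c_r;   // stage 2: add operands of the same sample
    end
endmodule
```

Without `c_r`, the adder would combine a product from one cycle with a `c` value from the next, which changes the function.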

Page 39: Carry & Control Logic

Page 40: Fast Carry Logic

Each CLB contains separate logic and routing for the fast generation of sum & carry signals
• Increases efficiency and performance of adders, subtractors, accumulators, comparators, and counters
Carry logic is independent of normal logic and routing resources
All major synthesis tools can infer carry logic for arithmetic functions
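In practice, inference means writing arithmetic with ordinary operators and letting the tool choose the carry resources. A minimal sketch of an adder/subtractor in that style, with hypothetical module and port names:

```verilog
// Sketch of an adder/subtractor written so that a synthesis tool can
// infer carry logic from the "+" operator (hypothetical names).
module add_sub #(
    parameter W = 16
) (
    input  wire         sub,     // 0: a + b, 1: a - b
    input  wire [W-1:0] a, b,
    output wire [W-1:0] result,
    output wire         cout     // carry out (for subtraction, 1 = no borrow)
);
    // Two's complement subtraction: a - b = a + ~b + 1
    wire [W-1:0] b_eff = b ^ {W{sub}};

    assign {cout, result} = a + b_eff + sub;
endmodule
```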

Page 41: The Virtex II CLB (Half Slice Shown)

Page 42: Adder Implementation

Page 43: Carry Chain

Page 44: New 18 x 18 Embedded Multiplier

Embedded 18-bit x 18-bit multiplier
• 2's complement signed operation
Multipliers are organized in columns
Fast arithmetic functions
• Optimized to implement multiply/accumulate modules
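A multiply/accumulate module of the kind mentioned above could be described as follows. This is a sketch under assumptions: the module name `mac18`, the synchronous clear, and the 48-bit accumulator width are all illustrative choices, and mapping onto an embedded multiplier is left to the synthesis tool.

```verilog
// Sketch of a signed 18 x 18 multiply-accumulate (hypothetical names;
// the 48-bit accumulator width is an assumption).
module mac18 (
    input  wire               clk,
    input  wire               clr,   // synchronous clear of the accumulator
    input  wire signed [17:0] a, b,
    output reg  signed [47:0] acc
);
    reg signed [35:0] prod;          // registered 36-bit signed product

    always @(posedge clk) begin
        prod <= a * b;               // 2's complement 18 x 18 multiply
        if (clr)
            acc <= 48'sd0;
        else
            acc <= acc + prod;       // prod is sign-extended to 48 bits
    end
endmodule
```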

Page 45: Design Flow - Mapping

Technology Mapping: Schematic/HDL to physical logic units
• Compile functions into basic LUT-based groups (a function of the target architecture)

Page 46: Design Flow - Placement & Route

Placement: assign each piece of logic a location on a particular device

Routing: an iterative process that connects CLB inputs/outputs and IOBs. It optimizes critical path delay and can take hours or days for large, dense designs

Challenge! Cannot use the full chip at reasonable speeds (wires are not ideal). Typically no more than 50% utilization.

Page 47: Example: Verilog to FPGA

Page 48: Memory Types

Page 49: FPGA Memory Implementation

Regular registers in logic blocks
• Resource-hungry, but convenient & fast if small
[Xilinx Virtex-II] use the LUTs:
• Single port: 16x(1,2,4,8), 32x(1,2,4,8), 64x(1,2), 128x1
• Dual port (1 R/W, 1 R): 16x1, 32x1, 64x1
• Can fake extra read ports by cloning memory: all clones are written with the same addr/data, but each clone can have a different read address
[Xilinx Virtex-II] use block RAM:
• 18K bits: 16Kx1, 8Kx2, 4Kx4
• with parity: 2Kx(8+1), 1Kx(16+2), 512x(32+4)
• Single or dual port
• Pipelined (clocked) operations
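The cloning trick for extra read ports can be sketched as two copies of the same array, written together and read independently. The module name, parameters, and port names below are assumptions for illustration; a synthesis tool may also perform this replication on its own.

```verilog
// Sketch: one write port, two read ports, built from two cloned
// memories (hypothetical names and default sizes).
module ram_2r1w #(
    parameter AW = 4,
    parameter DW = 8
) (
    input  wire          clk,
    input  wire          we,
    input  wire [AW-1:0] waddr,
    input  wire [DW-1:0] wdata,
    input  wire [AW-1:0] raddr0, raddr1,
    output wire [DW-1:0] rdata0, rdata1
);
    reg [DW-1:0] copy0 [0:(1<<AW)-1];
    reg [DW-1:0] copy1 [0:(1<<AW)-1];

    always @(posedge clk)
        if (we) begin
            copy0[waddr] <= wdata;   // every clone sees the same write
            copy1[waddr] <= wdata;
        end

    assign rdata0 = copy0[raddr0];   // each clone has its own read address
    assign rdata1 = copy1[raddr1];
endmodule
```

The cost is one full copy of the memory per additional read port.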

Page 50: LUT-Based RAMs

Page 51: LUT-Based RAMs

Page 52: LUT-Based RAM Modules

Page 53: LUT-Based RAM Modules

// instantiate a LUT-based RAM module
RAM16X1S mymem (.D(din), .O(dout), .WE(we), .WCLK(clock_27mhz),
                .A0(a[0]), .A1(a[1]), .A2(a[2]), .A3(a[3]));

// Note: the literal below has 20 binary digits but is sized as 16 bits.
// Verilog truncates from the left, so the effective value is 16'b1111001101011100.
defparam mymem.INIT = 16'b01101111001101011100;  // msb first

Page 54: Example of Inferred Memory

Page 55: Block RAM

Most efficient memory implementation
• Dedicated blocks of memory
Ideal for most memory requirements
• 4 to 104 memory blocks
• 18 kbits = 18,432 bits per block (16 k without parity bits)
• Use multiple blocks for larger memories
Builds both single and true dual-port RAMs

Synchronous write and read (different from distributed RAM)
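The difference from distributed RAM shows up in the read path: the read is clocked. A sketch of a single-port 1K x 18 memory (18,432 bits, matching one block) written in a style that synthesis tools commonly map to block RAM; the module and port names are assumptions, and the actual mapping depends on the tool.

```verilog
// Sketch of a single-port 1K x 18 RAM with synchronous write and
// synchronous read (hypothetical names; 1024 * 18 = 18,432 bits).
module bram_1kx18 (
    input  wire        clk,
    input  wire        en,    // port enable
    input  wire        we,    // write enable
    input  wire [9:0]  addr,
    input  wire [17:0] din,
    output reg  [17:0] dout
);
    reg [17:0] mem [0:1023];

    always @(posedge clk)
        if (en) begin
            if (we) mem[addr] <= din;
            dout <= mem[addr];   // clocked read; on a write, returns the old data
        end
endmodule
```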

Page 56: Block RAM

Support of two independent 9 Kb blocks, or a single 18 Kb block RAM.

Each 9 Kb block RAM can be set to simple dual-port mode, doubling the data width of the block RAM to a maximum of 36 bits.

Simple dual-port mode is defined as having one read-only port and one write-only port with independent clocks.

18- or 36-bit wide ports can have an individual write enable per byte. This feature is popular for interfacing to an on-chip microprocessor.

All inputs are registered with the port clock and have a setup-to-clock timing specification.
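Simple dual-port mode with per-byte write enables might be described as below. This is a sketch under assumptions: the names are hypothetical, the 256 x 32 organization ignores the parity bits, and whether the description maps onto a block RAM depends on the synthesis tool.

```verilog
// Sketch of a simple dual-port RAM: one write-only port with byte
// write enables, one read-only port, independent clocks
// (hypothetical names; parity bits are not modeled).
module sdp_ram_bytewe (
    input  wire        wclk,
    input  wire [3:0]  we,     // one write enable per byte
    input  wire [7:0]  waddr,
    input  wire [31:0] wdata,
    input  wire        rclk,
    input  wire [7:0]  raddr,
    output reg  [31:0] rdata
);
    reg [31:0] mem [0:255];
    integer i;

    // Write port: only the enabled bytes of the addressed word change.
    always @(posedge wclk)
        for (i = 0; i < 4; i = i + 1)
            if (we[i]) mem[waddr][8*i +: 8] <= wdata[8*i +: 8];

    // Read port: clocked by its own, independent clock.
    always @(posedge rclk)
        rdata <= mem[raddr];
endmodule
```

A processor storing a single byte would assert only one bit of `we`, leaving the other three bytes of the word untouched.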

Page 57: Block RAM

A write operation requires one clock edge.
A read operation requires one clock edge.
All output ports are latched. The state of the output port does not change until the port executes another read or write operation. The default block RAM output is latch mode.

The output data path has an optional internal pipeline register. Using the register mode is strongly recommended: it allows a higher clock rate, but adds one clock cycle of latency.
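The latency trade-off between the two output modes can be modeled behaviorally. In this sketch the parameter `USE_OUTPUT_REG` and all other names are assumptions, and for simplicity the port performs a read on every clock edge.

```verilog
// Sketch contrasting latch mode (1-cycle read latency) with register
// mode (2-cycle read latency). Hypothetical names throughout.
module bram_outreg #(
    parameter USE_OUTPUT_REG = 1
) (
    input  wire        clk,
    input  wire        we,
    input  wire [9:0]  addr,
    input  wire [17:0] din,
    output wire [17:0] dout
);
    reg [17:0] mem [0:1023];
    reg [17:0] latch_q;   // memory output, valid one cycle after addr
    reg [17:0] reg_q;     // optional pipeline register, one cycle later

    always @(posedge clk) begin
        if (we) mem[addr] <= din;
        latch_q <= mem[addr];
        reg_q   <= latch_q;
    end

    assign dout = USE_OUTPUT_REG ? reg_q : latch_q;
endmodule
```

The extra register shortens the clock-to-output path, which is why register mode supports a higher clock rate at the cost of one more cycle.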

Page 58: COE 405 Programmable Logic and Storage Devices Dr. Aiman H. El-Maleh Computer Engineering Department King Fahd University of Petroleum & Minerals Dr. Aiman

1-58

Block RAMBlock RAMBlock RAMBlock RAM

Page 59: COE 405 Programmable Logic and Storage Devices Dr. Aiman H. El-Maleh Computer Engineering Department King Fahd University of Petroleum & Minerals Dr. Aiman

1-59

Block RAM Logic DiagramBlock RAM Logic DiagramBlock RAM Logic DiagramBlock RAM Logic Diagram

Block RAM Data Combinations and ADDR Locations

Block RAM Port Aspect Ratios

Dual-Port Bus Flexibility

Each port can be configured with a different data bus width.

This provides easy data width conversion without any additional logic.
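As a rough illustration of width conversion, the sketch below writes four bytes through an 8-bit view and reads them back as one 32-bit word through a wider view of the same storage. The class `MixedWidthRam`, its 8192-bit size, and the mapping of lower narrow-port addresses to the least-significant bits of the wide word are assumptions for the example; the real address mapping should be checked against the device documentation.

```python
class MixedWidthRam:
    """Bit-addressed storage viewed through ports of different widths."""

    def __init__(self, total_bits: int = 8192):
        self.bits = [0] * total_bits

    def write(self, addr: int, data: int, width: int) -> None:
        # Address `addr` on a `width`-bit port covers bits addr*width onward.
        base = addr * width
        for i in range(width):
            self.bits[base + i] = (data >> i) & 1

    def read(self, addr: int, width: int) -> int:
        base = addr * width
        return sum(self.bits[base + i] << i for i in range(width))


ram = MixedWidthRam()
# Port A view: 8 bits wide, four consecutive byte writes.
for addr, byte in enumerate([0x78, 0x56, 0x34, 0x12]):
    ram.write(addr, byte, width=8)
# Port B view: 32 bits wide, a single read returns all four bytes.
word = ram.read(0, width=32)
print(hex(word))  # 0x12345678
```

No multiplexers or shift registers are modelled outside the memory: the conversion falls out of addressing the same bits with two different widths.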

Simple Dual-Port Mode Allowed Combinations for 9 Kb Block RAM

True Dual-Port Mode Allowed Combinations for 9 Kb Block RAM

18 Kb Block RAM: True Dual-Port Operation

Read & Write Operations

Read Operation
• In latch mode, the read operation uses one clock edge. The read address is registered on the read port, and the stored data is loaded into the output latches after the RAM access time.
• When using the output register, the read operation takes one extra cycle of latency to arrive at the output.

Write Operation
• A write operation is a single clock-edge operation. The write address is registered on the write port, and the data input is stored in memory.
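The difference between latch mode and register mode can be sketched as a cycle-level model. This is a simplified illustration under stated assumptions: the class name `ReadPort` is hypothetical, each call to `clock` stands for one rising clock edge, and the optional output register is assumed to be clocked on every edge (any separate register enable is not modelled).

```python
class ReadPort:
    """Cycle-level model of a block RAM read port."""

    def __init__(self, contents, use_output_register=False):
        self.mem = list(contents)
        self.use_output_register = use_output_register
        self.latch = 0    # output latch, always present
        self.out_reg = 0  # optional pipeline register

    def clock(self, addr=None):
        """Apply one clock edge; addr=None means no read this cycle.

        Returns the value visible on the port output after the edge.
        """
        # The pipeline register captures what the latch held before this edge.
        if self.use_output_register:
            self.out_reg = self.latch
        # A read loads the latch; with no read, the latch holds its value.
        if addr is not None:
            self.latch = self.mem[addr]
        return self.out_reg if self.use_output_register else self.latch


latch_port = ReadPort([10, 20, 30])
print(latch_port.clock(1))  # 20: data available after one edge

reg_port = ReadPort([10, 20, 30], use_output_register=True)
print(reg_port.clock(1))    # 0: the data for address 1 is still in the latch
print(reg_port.clock(2))    # 20: address 1 data arrives one cycle later
print(reg_port.clock())     # 30
```

The extra register shortens the path from the memory array to the fabric, which is why it permits a higher clock rate at the price of one cycle.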

Write Modes

Three settings of the write mode determine the behavior of the data available on the output latches after a write clock edge: WRITE_FIRST, READ_FIRST, and NO_CHANGE.

The write mode attribute can be selected individually for each port. The default mode is WRITE_FIRST.

WRITE_FIRST outputs the newly written data onto the output bus.

READ_FIRST outputs the previously stored data while new data is being written.

NO_CHANGE maintains the output previously generated by a read operation.
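The three write modes can be compared side by side with a small behavioral model. The sketch below is illustrative only: the names `WriteMode` and `Port` are hypothetical, and timing is reduced to "one method call equals one clock edge".

```python
from enum import Enum


class WriteMode(Enum):
    WRITE_FIRST = "WRITE_FIRST"
    READ_FIRST = "READ_FIRST"
    NO_CHANGE = "NO_CHANGE"


class Port:
    """One block RAM port with a selectable write mode."""

    def __init__(self, depth, mode=WriteMode.WRITE_FIRST):
        self.mem = [0] * depth
        self.mode = mode
        self.latch = 0  # output latch

    def read(self, addr):
        self.latch = self.mem[addr]
        return self.latch

    def write(self, addr, data):
        old = self.mem[addr]
        self.mem[addr] = data
        if self.mode is WriteMode.WRITE_FIRST:
            self.latch = data   # new data appears on the output
        elif self.mode is WriteMode.READ_FIRST:
            self.latch = old    # previously stored data appears
        # NO_CHANGE: the latch keeps the result of the last read
        return self.latch


for mode in WriteMode:
    port = Port(depth=4, mode=mode)
    port.mem[0], port.mem[1] = 5, 7   # preload two locations
    port.read(1)                      # output latch now holds 7
    print(mode.name, port.write(0, 9))
# WRITE_FIRST 9
# READ_FIRST 5
# NO_CHANGE 7
```

In all three cases location 0 ends up holding 9; only the value left on the output latch differs.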

WRITE_FIRST or Transparent Mode (Default)

In WRITE_FIRST mode, the input data is simultaneously written into memory and stored in the data output (transparent write).

READ_FIRST or Read-Before-Write Mode

In READ_FIRST mode, data previously stored at the write address appears on the output latches, while the input data is being stored in memory (read before write).

NO_CHANGE Mode

In NO_CHANGE mode, the output latches remain unchanged during a write operation.

Conflict Avoidance

Block RAM memory is a true dual-port RAM where both ports can access any memory location at any time.

When accessing the same memory location from both ports, however, the user must observe certain restrictions.

There are no timing restrictions when both ports perform a read operation.

When one port performs a write operation, the other port must not read or write the exact same memory location.
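The rule above can be expressed as a small check that a testbench or simulation model might apply on each cycle. This is a sketch under simplifying assumptions: the names `Access` and `conflicts` are hypothetical, both ports are assumed to use the same data width (so equal addresses mean the same location), and any same-cycle overlap is flagged, which is a conservative reading of the restriction.

```python
from typing import NamedTuple


class Access(NamedTuple):
    addr: int
    is_write: bool


def conflicts(port_a: Access, port_b: Access) -> bool:
    """Return True if two same-cycle accesses violate the dual-port rule.

    Two reads never conflict. A write on either port conflicts with any
    access by the other port to the same address.
    """
    same_location = port_a.addr == port_b.addr
    any_write = port_a.is_write or port_b.is_write
    return same_location and any_write


print(conflicts(Access(3, False), Access(3, False)))  # False: read/read
print(conflicts(Access(3, True), Access(3, False)))   # True: write/read
print(conflicts(Access(3, True), Access(4, True)))    # False: different cells
```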

Spartan-3 Block RAM Amounts

Spartan-3 FPGA Family Members

Virtex-II 1.5V Architecture

Virtex-II 1.5V

Device     CLB Array  Slices  Maximum I/O  Block RAM (18 Kb)  Multiplier Blocks  Distributed RAM bits
XC2V40     8x8        256     88           4                  4                  8,192
XC2V80     16x8       512     120          8                  8                  16,384
XC2V250    24x16      1,536   200          24                 24                 49,152
XC2V500    32x24      3,072   264          32                 32                 98,304
XC2V1000   40x32      5,120   432          40                 40                 163,840
XC2V1500   48x40      7,680   528          48                 48                 245,760
XC2V2000   56x48      10,752  624          56                 56                 344,064
XC2V3000   64x56      14,336  720          96                 96                 458,752
XC2V4000   80x72      23,040  912          120                120                737,280
XC2V6000   96x88      33,792  1,104        144                144                1,081,344
XC2V8000   112x104    46,592  1,108        168                168                1,490,944
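The Slices and Distributed RAM columns follow from the CLB array size by simple arithmetic. The sketch below assumes 4 slices per CLB and two 16-bit LUTs per slice (32 distributed RAM bits per slice); these figures are not stated on the slide and are taken here as assumptions that reproduce the table's numbers. The function name `virtex2_resources` is hypothetical.

```python
SLICES_PER_CLB = 4           # assumed; matches the table
LUT_BITS_PER_SLICE = 2 * 16  # assumed: two 16-bit LUTs per slice


def virtex2_resources(rows: int, cols: int):
    """Return (slices, distributed RAM bits) for a rows x cols CLB array."""
    slices = rows * cols * SLICES_PER_CLB
    return slices, slices * LUT_BITS_PER_SLICE


print(virtex2_resources(8, 8))      # (256, 8192)       XC2V40
print(virtex2_resources(40, 32))    # (5120, 163840)    XC2V1000
print(virtex2_resources(112, 104))  # (46592, 1490944)  XC2V8000
```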

Using Core Generator

Single Port BRAM

Dual Port BRAM

Distributed RAM
