coe 405 programmable logic and storage devices dr. aiman h. el-maleh computer engineering department...

Post on 12-Jan-2016

218 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

COE 405COE 405 Programmable Logic and Programmable Logic and

Storage DevicesStorage Devices

COE 405COE 405 Programmable Logic and Programmable Logic and

Storage DevicesStorage Devices

Dr. Aiman H. El-Maleh

Computer Engineering Department

King Fahd University of Petroleum & Minerals

Dr. Aiman H. El-Maleh

Computer Engineering Department

King Fahd University of Petroleum & Minerals

1-2

OutlineOutlineOutlineOutline

History of Computational Fabrics ASIC vs. FPGA Reconfigurable Logic Anti-Fuse-Based Approach (Actel) RAM Based Field Programmable Logic (Xilinx) CLBs Carry & Control Logic FPGA Memory Implementation

History of Computational Fabrics ASIC vs. FPGA Reconfigurable Logic Anti-Fuse-Based Approach (Actel) RAM Based Field Programmable Logic (Xilinx) CLBs Carry & Control Logic FPGA Memory Implementation

1-3

History of Computational FabricsHistory of Computational FabricsHistory of Computational FabricsHistory of Computational Fabrics

Discrete devices: relays, transistors (1940s-50s) Discrete logic gates (1950s-60s) Integrated circuits (1960s-70s)

• e.g. TTL packages: Data Book for 100’s of different parts

Gate Arrays (IBM 1970s)• Transistors are pre-placed on the chip & Place and Route

software puts the chip together automatically – only program the interconnect (mask programming)

Software Based Schemes (1970’s- present)• Run instructions on a general purpose core

Discrete devices: relays, transistors (1940s-50s) Discrete logic gates (1950s-60s) Integrated circuits (1960s-70s)

• e.g. TTL packages: Data Book for 100’s of different parts

Gate Arrays (IBM 1970s)• Transistors are pre-placed on the chip & Place and Route

software puts the chip together automatically – only program the interconnect (mask programming)

Software Based Schemes (1970’s- present)• Run instructions on a general purpose core

1-4

History of Computational FabricsHistory of Computational FabricsHistory of Computational FabricsHistory of Computational Fabrics

ASIC Design (1980’s to present)• Turn Verilog directly into layout using a library of standard

cells

• Effective for high-volume and efficient use of silicon area

Programmable Logic (1980’s to present)• A chip that is reprogrammed after it has been fabricated

• Examples: PALs, PLAs, EPROM, EEPROM, PLDs, FPGAs

• Excellent support for mapping from Verilog

ASIC Design (1980’s to present)• Turn Verilog directly into layout using a library of standard

cells

• Effective for high-volume and efficient use of silicon area

Programmable Logic (1980’s to present)• A chip that is reprogrammed after it has been fabricated

• Examples: PALs, PLAs, EPROM, EEPROM, PLDs, FPGAs

• Excellent support for mapping from Verilog

1-5

What is an FPGA?What is an FPGA?What is an FPGA?What is an FPGA?

A filed programmable gate array (FPGA) is a reprogrammable silicon chip.

Using prebuilt logic blocks and programmable routing resources, you can configure these chips to implement custom hardware functionality without ever having to pick up a breadboard or soldering iron.

You develop digital computing tasks in software and compile them down to a configuration file or bitstream that contains information on how the components should be wired together.

A filed programmable gate array (FPGA) is a reprogrammable silicon chip.

Using prebuilt logic blocks and programmable routing resources, you can configure these chips to implement custom hardware functionality without ever having to pick up a breadboard or soldering iron.

You develop digital computing tasks in software and compile them down to a configuration file or bitstream that contains information on how the components should be wired together.

1-6

ASIC vs. FPGAASIC vs. FPGAASIC vs. FPGAASIC vs. FPGA

• designs must be sent for expensive and time consuming fabrication in semiconductor foundry

• bought off the shelf and reconfigured by designers themselves

ASICApplication Specific

Integrated Circuit

FPGAField Programmable

Gate Array

• designed all the way from behavioral description to physical layout

• no physical layout design; design ends with a bitstream used to configure a device

1-7

ASIC vs. FPGAASIC vs. FPGAASIC vs. FPGAASIC vs. FPGA

Off-the-shelf

Low development cost

Short time to market

Reconfigurability

High performance

ASICs FPGAs

Low power

Low cost inhigh volumes

1-8

Other FPGA AdvantagesOther FPGA AdvantagesOther FPGA AdvantagesOther FPGA Advantages

Manufacturing cycle for ASIC is very costly, lengthy and engages lots of manpower• Mistakes not detected at design time have large impact on

development time and cost

• FPGAs are perfect for rapid prototyping of digital circuits

Easy upgrades like in case of software FPGA provide a flexible platform for implementing

digital computing A rich set of macros and I/Os supported (multipliers,

block RAMS, ROMS, high-speed I/O) A wide range of applications from prototyping (to

validate a design before ASIC mapping) to high performance spatial computing

Manufacturing cycle for ASIC is very costly, lengthy and engages lots of manpower• Mistakes not detected at design time have large impact on

development time and cost

• FPGAs are perfect for rapid prototyping of digital circuits

Easy upgrades like in case of software FPGA provide a flexible platform for implementing

digital computing A rich set of macros and I/Os supported (multipliers,

block RAMS, ROMS, high-speed I/O) A wide range of applications from prototyping (to

validate a design before ASIC mapping) to high performance spatial computing

1-9

How are FPGAs Used?How are FPGAs Used?How are FPGAs Used?How are FPGAs Used?

Prototyping• Ensemble of gate arrays used to

emulate a circuit to be manufactured

• Get more/better/faster debugging done than with simulation

Reconfigurable hardware• One hardware block used to

implement more than one function

Special-purpose computation engines• Hardware dedicated to solving one

problem (or class of problems)

• Accelerators attached to general-purpose computers (e.g., in a cell phone!)

Prototyping• Ensemble of gate arrays used to

emulate a circuit to be manufactured

• Get more/better/faster debugging done than with simulation

Reconfigurable hardware• One hardware block used to

implement more than one function

Special-purpose computation engines• Hardware dedicated to solving one

problem (or class of problems)

• Accelerators attached to general-purpose computers (e.g., in a cell phone!)

1-10

Major FPGA VendorsMajor FPGA VendorsMajor FPGA VendorsMajor FPGA Vendors

SRAM-based FPGAs Xilinx, Inc. Altera Corp. Atmel Lattice Semiconductor

Flash & antifuse FPGAs Actel Corp. Quick Logic Corp.

SRAM-based FPGAs Xilinx, Inc. Altera Corp. Atmel Lattice Semiconductor

Flash & antifuse FPGAs Actel Corp. Quick Logic Corp.

Share over 60% of the market

1-11

Reconfigurable LogicReconfigurable LogicReconfigurable LogicReconfigurable Logic

1-12

Anti-Fuse-Based Approach (Actel)Anti-Fuse-Based Approach (Actel)Anti-Fuse-Based Approach (Actel)Anti-Fuse-Based Approach (Actel)

1-13

Actel Logic ModuleActel Logic ModuleActel Logic ModuleActel Logic Module

Combinational BlockExample Gate Mapping

S-R Latch

1-14

Actel Routing & ProgrammingActel Routing & ProgrammingActel Routing & ProgrammingActel Routing & Programming

1-15

RAM Based Field ProgrammableRAM Based Field ProgrammableLogic - XilinxLogic - XilinxRAM Based Field ProgrammableRAM Based Field ProgrammableLogic - XilinxLogic - Xilinx

1-16

Xilinx FPGA FamiliesXilinx FPGA FamiliesXilinx FPGA FamiliesXilinx FPGA Families

Old families• XC3000, XC4000, XC5200

• Old 0.5µm, 0.35µm and 0.25µm technology. Not recommended for modern designs.

High-performance families• Virtex (0.22µm)

• Virtex-E, Virtex-EM (0.18µm)

• Virtex-II, Virtex-II PRO (0.13µm)

• Virtex-4 (0.09µm)

Low Cost Family• Spartan/XL – derived from XC4000

• Spartan-II – derived from Virtex

• Spartan-IIE – derived from Virtex-E

• Spartan-3

Old families• XC3000, XC4000, XC5200

• Old 0.5µm, 0.35µm and 0.25µm technology. Not recommended for modern designs.

High-performance families• Virtex (0.22µm)

• Virtex-E, Virtex-EM (0.18µm)

• Virtex-II, Virtex-II PRO (0.13µm)

• Virtex-4 (0.09µm)

Low Cost Family• Spartan/XL – derived from XC4000

• Spartan-II – derived from Virtex

• Spartan-IIE – derived from Virtex-E

• Spartan-3

1-17

FPGA NomenclatureFPGA NomenclatureFPGA NomenclatureFPGA Nomenclature

1-18

Device Part MarkingDevice Part MarkingDevice Part MarkingDevice Part Marking

1-19

The Xilinx 4000 CLBThe Xilinx 4000 CLBThe Xilinx 4000 CLBThe Xilinx 4000 CLB

1-20

Two 4-input Functions, Registered Two 4-input Functions, Registered Output and a Two Input FunctionOutput and a Two Input FunctionTwo 4-input Functions, Registered Two 4-input Functions, Registered Output and a Two Input FunctionOutput and a Two Input Function

1-21

5-input Function, Combinational 5-input Function, Combinational OutputOutput5-input Function, Combinational 5-input Function, Combinational OutputOutput

1-22

5-Input Functions implemented using 5-Input Functions implemented using two LUTstwo LUTs5-Input Functions implemented using 5-Input Functions implemented using two LUTstwo LUTs

1-23

LUT MappingLUT MappingLUT MappingLUT Mapping

N-LUT direct implementation of a truth table: any function of n-inputs.

N-LUT requires 2N storage elements (latches) N-inputs select one latch location (like a memory)

N-LUT direct implementation of a truth table: any function of n-inputs.

N-LUT requires 2N storage elements (latches) N-inputs select one latch location (like a memory)

1-24

Configuring the CLB as a RAMConfiguring the CLB as a RAMConfiguring the CLB as a RAMConfiguring the CLB as a RAM

1-25

Xilinx 4000 InterconnectXilinx 4000 InterconnectXilinx 4000 InterconnectXilinx 4000 Interconnect

1-26

Xilinx 4000 Interconnect DetailsXilinx 4000 Interconnect DetailsXilinx 4000 Interconnect DetailsXilinx 4000 Interconnect Details

1-27

Xilinx 4000 Flexible IOBXilinx 4000 Flexible IOBXilinx 4000 Flexible IOBXilinx 4000 Flexible IOB

1-28

Basic I/O Block StructureBasic I/O Block StructureBasic I/O Block StructureBasic I/O Block Structure

1-29

IOB FunctionalityIOB FunctionalityIOB FunctionalityIOB Functionality

IOB provides interface between the package pins and CLBs

Each IOB can work as uni- or bi-directional I/O Outputs can be forced into High Impedance Inputs and outputs can be registered

• advised for high-performance I/O

Inputs can be delayed

IOB provides interface between the package pins and CLBs

Each IOB can work as uni- or bi-directional I/O Outputs can be forced into High Impedance Inputs and outputs can be registered

• advised for high-performance I/O

Inputs can be delayed

1-30

Additional Features in Modern FPGAsAdditional Features in Modern FPGAsAdditional Features in Modern FPGAsAdditional Features in Modern FPGAs

1-31

Spartan-3 Xilinx FPGA Block DiagramSpartan-3 Xilinx FPGA Block DiagramSpartan-3 Xilinx FPGA Block DiagramSpartan-3 Xilinx FPGA Block Diagram

1-32

CLB StructureCLB StructureCLB StructureCLB Structure

1-33

CLB Slice StructureCLB Slice StructureCLB Slice StructureCLB Slice Structure

Each slice contains two sets of the following:• Four-input LUT

• Any 4-input logic function,• or 16-bit x 1 sync RAM• or 16-bit shift register

• Carry & Control• Fast arithmetic logic• Multiplier logic• Multiplexer logic

• Storage element• Latch or flip-flop• Set and reset• True or inverted inputs• Sync. or async. control

Each slice contains two sets of the following:• Four-input LUT

• Any 4-input logic function,• or 16-bit x 1 sync RAM• or 16-bit shift register

• Carry & Control• Fast arithmetic logic• Multiplier logic• Multiplexer logic

• Storage element• Latch or flip-flop• Set and reset• True or inverted inputs• Sync. or async. control

1-34

Xilinx Multipurpose LUT (MLUT)Xilinx Multipurpose LUT (MLUT)Xilinx Multipurpose LUT (MLUT)Xilinx Multipurpose LUT (MLUT)

16-bit SR

16 x 1 RAM

4-input LUT16 x 1 ROM(logic)

1-35

5-Input Functions implemented using 5-Input Functions implemented using two LUTstwo LUTs5-Input Functions implemented using 5-Input Functions implemented using two LUTstwo LUTs One CLB Slice can implements any function of 5 inputs Logic function is partitioned between two LUTs F5 multiplexer selects LUT

One CLB Slice can implements any function of 5 inputs Logic function is partitioned between two LUTs F5 multiplexer selects LUT

1-36

Distributed RAMDistributed RAMDistributed RAMDistributed RAM

CLB LUT configurable as Distributed RAM• A LUT equals 16x1 RAM

• Implements Single and Dual-Ports

• Cascade LUTs to increase RAM size

Synchronous write Synchronous/Asynchronous read

• Accompanying flip-flops used for synchronous read

Two LUTs can make• 32 x 1 single-port RAM• 16 x 2 single-port RAM• 16 x 1 dual-port RAM

CLB LUT configurable as Distributed RAM• A LUT equals 16x1 RAM

• Implements Single and Dual-Ports

• Cascade LUTs to increase RAM size

Synchronous write Synchronous/Asynchronous read

• Accompanying flip-flops used for synchronous read

Two LUTs can make• 32 x 1 single-port RAM• 16 x 2 single-port RAM• 16 x 1 dual-port RAM

1-37

Shift RegisterShift RegisterShift RegisterShift Register

Each LUT can be configured as shift register• Serial in, serial out

Dynamically addressable delay up to 16 cycles

For programmable pipeline Cascade for greater cycle

delays Use CLB flip-flops to add

depth

Each LUT can be configured as shift register• Serial in, serial out

Dynamically addressable delay up to 16 cycles

For programmable pipeline Cascade for greater cycle

delays Use CLB flip-flops to add

depth

1-38

Shift RegisterShift Register Shift RegisterShift Register

Register-rich FPGA• Allows for addition of pipeline stages to increase

throughput

Data paths must be balanced to keep desired functionality

Register-rich FPGA• Allows for addition of pipeline stages to increase

throughput

Data paths must be balanced to keep desired functionality

1-39

Carry & Control LogicCarry & Control LogicCarry & Control LogicCarry & Control Logic

1-40

Fast Carry LogicFast Carry LogicFast Carry LogicFast Carry Logic

Each CLB contains separate logic and routing for the fast generation of sum & carry signals• Increases efficiency and performance

of adders, subtractors, accumulators, comparators, and counters

Carry logic is independent of normal logic and routing resources

All major synthesis tools can infer carry logic for arithmetic functions

Each CLB contains separate logic and routing for the fast generation of sum & carry signals• Increases efficiency and performance

of adders, subtractors, accumulators, comparators, and counters

Carry logic is independent of normal logic and routing resources

All major synthesis tools can infer carry logic for arithmetic functions

1-41

The Virtex II CLB (Half Slice Shown)The Virtex II CLB (Half Slice Shown)The Virtex II CLB (Half Slice Shown)The Virtex II CLB (Half Slice Shown)

1-42

Adder ImplementationAdder ImplementationAdder ImplementationAdder Implementation

1-43

Carry ChainCarry ChainCarry ChainCarry Chain

1-44

New 18 x 18 Embedded MultiplierNew 18 x 18 Embedded MultiplierNew 18 x 18 Embedded MultiplierNew 18 x 18 Embedded Multiplier

Embedded 18-bit x 18-bit multiplier• 2’s complement signed operation

Multipliers are organized in columns

Fast arithmetic functions• Optimized to implement multiply /

accumulate modules

Embedded 18-bit x 18-bit multiplier• 2’s complement signed operation

Multipliers are organized in columns

Fast arithmetic functions• Optimized to implement multiply /

accumulate modules

1-45

Design Flow - MappingDesign Flow - MappingDesign Flow - MappingDesign Flow - Mapping

Technology Mapping: Schematic/HDL to Physical Logic units• Compile functions into basic LUT-based groups (function of

target architecture)

Technology Mapping: Schematic/HDL to Physical Logic units• Compile functions into basic LUT-based groups (function of

target architecture)

1-46

Design Flow – Placement & RouteDesign Flow – Placement & RouteDesign Flow – Placement & RouteDesign Flow – Placement & Route

Placement – assign logic location on a particular device

Routing – iterative process to connect CLB inputs/outputs and IOBs. Optimizes critical path delay – can take hours or days for large, dense designs

Placement – assign logic location on a particular device

Routing – iterative process to connect CLB inputs/outputs and IOBs. Optimizes critical path delay – can take hours or days for large, dense designs

Challenge! Cannot use full chip for reasonable speeds (wires are not ideal).Typically no more than 50% utilization.

1-47

Example: Verilog to FPGAExample: Verilog to FPGAExample: Verilog to FPGAExample: Verilog to FPGA

1-48

Memory TypesMemory TypesMemory TypesMemory Types

1-49

FPGA Memory ImplementationFPGA Memory ImplementationFPGA Memory ImplementationFPGA Memory Implementation

Regular registers in logic blocks• Piggy use of resources, but convenient & fast if small

[Xilinx Vertex II] use the LUTs:• Single port: 16x(1,2,4,8), 32x(1,2,4,8), 64x(1,2), 128x1

• Dual port (1 R/W, 1R): 16x1, 32x1, 64x1

• Can fake extra read ports by cloning memory: all clones are written with the same addr/data, but each clone can have a different read address

[Xilinx Vertex II] use block ram:• 18K bits: 16Kx1, 8Kx2, 4Kx4

• with parity: 2Kx(8+1), 1Kx(16+2), 512x(32+4)

• Single or dual port

• Pipelined (clocked) operations

Regular registers in logic blocks• Piggy use of resources, but convenient & fast if small

[Xilinx Vertex II] use the LUTs:• Single port: 16x(1,2,4,8), 32x(1,2,4,8), 64x(1,2), 128x1

• Dual port (1 R/W, 1R): 16x1, 32x1, 64x1

• Can fake extra read ports by cloning memory: all clones are written with the same addr/data, but each clone can have a different read address

[Xilinx Vertex II] use block ram:• 18K bits: 16Kx1, 8Kx2, 4Kx4

• with parity: 2Kx(8+1), 1Kx(16+2), 512x(32+4)

• Single or dual port

• Pipelined (clocked) operations

1-50

LUT-Based RAMSLUT-Based RAMSLUT-Based RAMSLUT-Based RAMS

1-51

LUT-Based RAMSLUT-Based RAMSLUT-Based RAMSLUT-Based RAMS

1-52

LUT-Based RAM ModulesLUT-Based RAM ModulesLUT-Based RAM ModulesLUT-Based RAM Modules

1-53

LUT-Based RAM ModulesLUT-Based RAM ModulesLUT-Based RAM ModulesLUT-Based RAM Modules

// instantiate a LUT-based RAM module

RAM16X1S mymem (.D(din),.O(dout),.WE(we),.WCLK(clock_27mhz), .A0(a[0]),.A1(a[1]),.A2(a[2]),.A3(a[3]));

defparam mymem.INIT = 16’b01101111001101011100;

// msb first

// instantiate a LUT-based RAM module

RAM16X1S mymem (.D(din),.O(dout),.WE(we),.WCLK(clock_27mhz), .A0(a[0]),.A1(a[1]),.A2(a[2]),.A3(a[3]));

defparam mymem.INIT = 16’b01101111001101011100;

// msb first

1-54

Example of Inferred MemoryExample of Inferred MemoryExample of Inferred MemoryExample of Inferred Memory

1-55

Block RAMBlock RAMBlock RAMBlock RAM

Most efficient memory implementation• Dedicated blocks of memory

Ideal for most memory requirements• 4 to 104 memory blocks

• 18 kbits = 18,432 bits per block (16 k without parity bits)

• Use multiple blocks for larger memories

Builds both single and true dual-port RAMs

Synchronous write and read (different from distributed RAM)

Most efficient memory implementation• Dedicated blocks of memory

Ideal for most memory requirements• 4 to 104 memory blocks

• 18 kbits = 18,432 bits per block (16 k without parity bits)

• Use multiple blocks for larger memories

Builds both single and true dual-port RAMs

Synchronous write and read (different from distributed RAM)

1-56

Block RAMBlock RAMBlock RAMBlock RAM

Support of two independent 9 Kb blocks, or a single 18 Kb block RAM.

Each 9 Kb block RAM can be set to simple dual-port mode, doubling data width of the block RAM to a maximum of 36 bits.

Simple dual-port mode is defined as having one read-only port and one write-only port with independent clocks.

18 or 36-bit wide ports can have an individual write enable per byte. This feature is popular for interfacing to an on-chip microprocessor.

All inputs are registered with the port clock and have a setup-to-clock timing specification.

Support of two independent 9 Kb blocks, or a single 18 Kb block RAM.

Each 9 Kb block RAM can be set to simple dual-port mode, doubling data width of the block RAM to a maximum of 36 bits.

Simple dual-port mode is defined as having one read-only port and one write-only port with independent clocks.

18 or 36-bit wide ports can have an individual write enable per byte. This feature is popular for interfacing to an on-chip microprocessor.

All inputs are registered with the port clock and have a setup-to-clock timing specification.

1-57

Block RAMBlock RAMBlock RAMBlock RAM

A write operation requires one clock edge. A read operation requires one clock edge. All output ports are latched. The state of the output

port does not change until the port executes another read or write operation. The default block RAM output is latch mode.

The output data path has an optional internal pipeline register. Using the register mode is strongly recommended. This allows a higher clock rate, however, it adds a clock cycle latency of one.

A write operation requires one clock edge. A read operation requires one clock edge. All output ports are latched. The state of the output

port does not change until the port executes another read or write operation. The default block RAM output is latch mode.

The output data path has an optional internal pipeline register. Using the register mode is strongly recommended. This allows a higher clock rate, however, it adds a clock cycle latency of one.

1-58

Block RAMBlock RAMBlock RAMBlock RAM

1-59

Block RAM Logic DiagramBlock RAM Logic DiagramBlock RAM Logic DiagramBlock RAM Logic Diagram

1-60

Block RAM Data Combinations and Block RAM Data Combinations and ADDR LocationsADDR LocationsBlock RAM Data Combinations and Block RAM Data Combinations and ADDR LocationsADDR Locations

1-61

Block RAM Port Aspect RatiosBlock RAM Port Aspect RatiosBlock RAM Port Aspect RatiosBlock RAM Port Aspect Ratios

1-62

Dual-Port Bus FlexibilityDual-Port Bus FlexibilityDual-Port Bus FlexibilityDual-Port Bus Flexibility

Each port can be configured with a different data bus width

Provides easy data width conversion without any additional logic

Each port can be configured with a different data bus width

Provides easy data width conversion without any additional logic

1-63

Simple Dual-Port Mode Allowed Simple Dual-Port Mode Allowed Combinations for 9 Kb Block RAMCombinations for 9 Kb Block RAMSimple Dual-Port Mode Allowed Simple Dual-Port Mode Allowed Combinations for 9 Kb Block RAMCombinations for 9 Kb Block RAM

1-64

True Dual-Port Mode Allowed True Dual-Port Mode Allowed Combinations for 9 Kb Block RAMCombinations for 9 Kb Block RAMTrue Dual-Port Mode Allowed True Dual-Port Mode Allowed Combinations for 9 Kb Block RAMCombinations for 9 Kb Block RAM

1-65

18 Kb Block RAM—True Dual-Port 18 Kb Block RAM—True Dual-Port OperationOperation18 Kb Block RAM—True Dual-Port 18 Kb Block RAM—True Dual-Port OperationOperation

1-66

Read & Write OperationsRead & Write OperationsRead & Write OperationsRead & Write Operations

Read Operation• In latch mode, the read operation uses one clock edge. The

read address is registered on the read port, and the stored data is loaded into the output latches after the RAM access time.

• When using the output register, the read operation will take one extra latency cycle to arrive at the output.

Write Operation• A write operation is a single clock-edge operation. The write

address is registered on the write port, and the data input is stored in memory.

Read Operation• In latch mode, the read operation uses one clock edge. The

read address is registered on the read port, and the stored data is loaded into the output latches after the RAM access time.

• When using the output register, the read operation will take one extra latency cycle to arrive at the output.

Write Operation• A write operation is a single clock-edge operation. The write

address is registered on the write port, and the data input is stored in memory.

1-67

Write ModesWrite ModesWrite ModesWrite Modes

Three settings of the write mode determines the behavior of the data available on the output latches after a write clock edge: WRITE_FIRST, READ_FIRST, and NO_CHANGE.

The Write mode attribute can be individually selected for each port. The default mode is WRITE_FIRST.

WRITE_FIRST outputs the newly written data onto the output bus.

READ_FIRST outputs the previously stored data while new data is being written.

NO_CHANGE maintains the output previously generated by a read operation.

Three settings of the write mode determines the behavior of the data available on the output latches after a write clock edge: WRITE_FIRST, READ_FIRST, and NO_CHANGE.

The Write mode attribute can be individually selected for each port. The default mode is WRITE_FIRST.

WRITE_FIRST outputs the newly written data onto the output bus.

READ_FIRST outputs the previously stored data while new data is being written.

NO_CHANGE maintains the output previously generated by a read operation.

1-68

WRITE_FIRST or Transparent Mode WRITE_FIRST or Transparent Mode (Default)(Default)WRITE_FIRST or Transparent Mode WRITE_FIRST or Transparent Mode (Default)(Default) In WRITE_FIRST mode, the input data is

simultaneously written into memory and stored in the data output (transparent write).

In WRITE_FIRST mode, the input data is simultaneously written into memory and stored in the data output (transparent write).

1-69

READ_FIRST or Read-Before-Write READ_FIRST or Read-Before-Write ModeModeREAD_FIRST or Read-Before-Write READ_FIRST or Read-Before-Write ModeMode In READ_FIRST mode, data previously stored at the

write address appears on the output latches, while the input data is being stored in memory (read before write).

In READ_FIRST mode, data previously stored at the write address appears on the output latches, while the input data is being stored in memory (read before write).

1-70

NO_CHANGE ModeNO_CHANGE ModeNO_CHANGE ModeNO_CHANGE Mode

In NO_CHANGE mode, the output latches remain unchanged during a write operation.

In NO_CHANGE mode, the output latches remain unchanged during a write operation.

1-71

Conflict AvoidanceConflict AvoidanceConflict AvoidanceConflict Avoidance

Block RAM memory is a true dual-port RAM where both ports can access any memory location at any time.

When accessing the same memory location from both ports, the user must, however, observe certain restrictions.

There are no timing restrictions when both ports perform a read operation.

When one port performs a write operation, the other port must not read- or write access the exact same memory location.

Block RAM memory is a true dual-port RAM where both ports can access any memory location at any time.

When accessing the same memory location from both ports, the user must, however, observe certain restrictions.

There are no timing restrictions when both ports perform a read operation.

When one port performs a write operation, the other port must not read- or write access the exact same memory location.

1-72

Spartan-3 Block RAM AmountsSpartan-3 Block RAM AmountsSpartan-3 Block RAM AmountsSpartan-3 Block RAM Amounts

1-73

Spartan-3 FPGA Family MembersSpartan-3 FPGA Family MembersSpartan-3 FPGA Family MembersSpartan-3 FPGA Family Members

1-74

Virtex-II 1.5V ArchitectureVirtex-II 1.5V ArchitectureVirtex-II 1.5V ArchitectureVirtex-II 1.5V Architecture

1-75

Virtex-II 1.5VVirtex-II 1.5VVirtex-II 1.5VVirtex-II 1.5V

Device CLB Array

Slices Maximum I/O

BlockRAM

(18kb)

Multiplier Blocks

Distributed RAM bits

XC2V40 8x8 256 88 4 4 8,192

XC2V80 16x8 512 120 8 8 16,384

XC2V250 24x16 1,536 200 24 24 49,152

XC2V500 32x24 3,072 264 32 32 98,304

XC2V1000 40x32 5,120 432 40 40 163,840

XC2V1500 48x40 7,680 528 48 48 245,760

XC2V2000 56x48 10,752 624 56 56 344,064

XC2V3000 64x56 14,336 720 96 96 458,752

XC2V4000 80x72 23,040 912 120 120 737,280

XC2V6000 96x88 33,792 1,104 144 144 1,081,344

XC2V8000 112x104 46,592 1,108 168 168 1,490,944

1-76

Using Core GeneratorUsing Core GeneratorUsing Core GeneratorUsing Core Generator

1-77

Single Port BRAMSingle Port BRAMSingle Port BRAMSingle Port BRAM

1-78

Single Port BRAMSingle Port BRAMSingle Port BRAMSingle Port BRAM

1-79

Single Port BRAMSingle Port BRAMSingle Port BRAMSingle Port BRAM

1-80

Single Port BRAMSingle Port BRAMSingle Port BRAMSingle Port BRAM

1-81

Dual Port BRAMDual Port BRAMDual Port BRAMDual Port BRAM

1-82

Dual Port BRAMDual Port BRAMDual Port BRAMDual Port BRAM

1-83

Dual Port BRAMDual Port BRAMDual Port BRAMDual Port BRAM

1-84

Distributed RAMDistributed RAMDistributed RAMDistributed RAM

1-85

Distributed RAMDistributed RAMDistributed RAMDistributed RAM

1-86

Distributed RAMDistributed RAMDistributed RAMDistributed RAM

top related