George Mason University ECE 645 – Computer Arithmetic Introduction to FPGA Devices

Upload: neeraj-kadiyan

Post on 25-Nov-2014


World of Integrated Circuits

Integrated Circuits
• Full-Custom ASICs
• Semi-Custom ASICs
• User Programmable
  • PLD: PAL, PLA, PML
  • FPGA: LUT (Look-Up Table), MUX, Gates

Two competing implementation approaches

ASIC (Application Specific Integrated Circuit)
• designed all the way from behavioral description to physical layout
• designs must be sent for expensive and time-consuming fabrication in a semiconductor foundry

FPGA (Field Programmable Gate Array)
• no physical layout design; design ends with a bitstream used to configure a device
• bought off the shelf and reconfigured by designers themselves

What is an FPGA?

[Figure: FPGA layout showing Configurable Logic Blocks, I/O Blocks, and Block RAMs]

Which Way to Go?

ASICs
• High performance
• Low power
• Low cost in high volumes

FPGAs
• Off-the-shelf
• Low development cost
• Short time to market
• Reconfigurability

Other FPGA Advantages

• The manufacturing cycle for an ASIC is very costly and lengthy, and engages a lot of manpower
  • Mistakes not detected at design time have a large impact on development time and cost
  • FPGAs are perfect for rapid prototyping of digital circuits
• Easy upgrades, as in the case of software
• Unique applications
  • reconfigurable computing

Major FPGA Vendors

SRAM-based FPGAs
• Xilinx, Inc.
• Altera Corp.
• Atmel
• Lattice Semiconductor

Xilinx and Altera together share over 60% of the market

Flash & antifuse FPGAs
• Actel Corp.
• QuickLogic Corp.

Xilinx

• Primary products: FPGAs and the associated CAD software
  • Programmable Logic Devices
  • ISE Alliance and Foundation Series Design Software
• Main headquarters in San Jose, CA
• Fabless* Semiconductor and Software Company
  • UMC (Taiwan) {*Xilinx acquired an equity stake in UMC in 1996}
  • Seiko Epson (Japan)
  • TSMC (Taiwan)

Xilinx FPGA Families

• Old families
  • XC3000, XC4000, XC5200
  • Old 0.5µm, 0.35µm and 0.25µm technology. Not recommended for modern designs.
• High-performance families
  • Virtex (0.22µm)
  • Virtex-E, Virtex-EM (0.18µm)
  • Virtex-II, Virtex-II PRO (0.13µm)
  • Virtex-4 (0.09µm)
• Low Cost Family
  • Spartan/XL – derived from XC4000
  • Spartan-II – derived from Virtex
  • Spartan-IIE – derived from Virtex-E
  • Spartan-3

Xilinx FPGA Block Diagram

CLB Structure

CLB Slice Structure

• Each slice contains two sets of the following:
  • Four-input LUT
    • Any 4-input logic function,
    • or 16-bit x 1 sync RAM
    • or 16-bit shift register
  • Carry & Control
    • Fast arithmetic logic
    • Multiplier logic
    • Multiplexer logic
  • Storage element
    • Latch or flip-flop
    • Set and reset
    • True or inverted inputs
    • Sync. or async. control

LUT (Look-Up Table) Functionality

• Look-Up tables are primary elements for logic implementation
• Each LUT can implement any function of 4 inputs

[Figure: the same LUT (inputs x1, x2, x3, x4; output y) loaded with two different truth tables; the second one realizes a gate of x1 and x2 only]

x1 x2 x3 x4 | y (example 1) | y (example 2)
 0  0  0  0 |       0       |       1
 0  0  0  1 |       1       |       1
 0  0  1  0 |       0       |       1
 0  0  1  1 |       0       |       1
 0  1  0  0 |       0       |       1
 0  1  0  1 |       1       |       1
 0  1  1  0 |       0       |       1
 0  1  1  1 |       1       |       1
 1  0  0  0 |       0       |       1
 1  0  0  1 |       1       |       1
 1  0  1  0 |       0       |       1
 1  0  1  1 |       0       |       1
 1  1  0  0 |       1       |       0
 1  1  0  1 |       1       |       0
 1  1  1  0 |       0       |       0
 1  1  1  1 |       0       |       0
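A behavioral view can make the look-up idea concrete: the LUT is just 16 stored bits, and the four inputs select one of them. The sketch below is a simulation-style model, not a vendor primitive. The entity name LUT4_MODEL, the port names, and the convention that INIT(i) holds the output for input value i (with x1 as the most significant bit) are assumptions made for illustration. The default INIT value X"0FFF" encodes the slide's second truth-table column (y = 1 for inputs 0000 through 1011, y = 0 for 1100 through 1111).

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.NUMERIC_STD.all;

-- Hypothetical behavioral model of a 4-input LUT (not a library primitive).
entity LUT4_MODEL is
  generic (
    -- Assumed convention: INIT(i) is the output for input value i.
    -- X"0FFF" = second truth table on the slide; X"32A2" = first one.
    INIT : STD_LOGIC_VECTOR(15 downto 0) := X"0FFF"
  );
  port (
    X : in  STD_LOGIC_VECTOR(3 downto 0);  -- X(3) = x1 ... X(0) = x4
    Y : out STD_LOGIC
  );
end LUT4_MODEL;

architecture BEHAVIORAL of LUT4_MODEL is
begin
  -- The inputs act as an address into the 16 stored bits.
  Y <= INIT(to_integer(unsigned(X)));
end BEHAVIORAL;
```

Changing only the INIT generic changes the implemented function, which is the sense in which a LUT can implement any function of 4 inputs.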

5-Input Functions implemented using two LUTs

• One CLB Slice can implement any function of 5 inputs
• Logic function is partitioned between two LUTs
• F5 multiplexer selects LUT

[Figure: two LUT/ROM/RAM blocks with address inputs A4 to A1; their outputs feed the F5 multiplexer, which is selected by BX and drives outputs X and F5]

5-Input Functions implemented using two LUTs

[Figure: the truth table is split between two LUTs whose outputs are combined into OUT]

X5 X4 X3 X2 X1 | Y
 0  0  0  0  0 | 0
 0  0  0  0  1 | 1
 0  0  0  1  0 | 0
 0  0  0  1  1 | 0
 0  0  1  0  0 | 1
 0  0  1  0  1 | 1
 0  0  1  1  0 | 0
 0  0  1  1  1 | 0
 0  1  0  0  0 | 1
 0  1  0  0  1 | 0
 0  1  0  1  0 | 0
 0  1  0  1  1 | 1
 0  1  1  0  0 | 1
 0  1  1  0  1 | 1
 0  1  1  1  0 | 1
 0  1  1  1  1 | 1
 1  0  0  0  0 | 0
 1  0  0  0  1 | 0
 1  0  0  1  0 | 0
 1  0  0  1  1 | 0
 1  0  1  0  0 | 0
 1  0  1  0  1 | 0
 1  0  1  1  0 | 0
 1  0  1  1  1 | 1
 1  1  0  0  0 | 0
 1  1  0  0  1 | 1
 1  1  0  1  0 | 0
 1  1  0  1  1 | 1
 1  1  1  0  0 | 0
 1  1  1  0  1 | 1
 1  1  1  1  0 | 0
 1  1  1  1  1 | 0
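The partitioning can be sketched behaviorally: one 16-entry table covers the rows with X5 = 0, another covers the rows with X5 = 1, and a 2-to-1 multiplexer (the role played by F5) picks between them. The entity name FUNC5_TWO_LUTS, the signal names, and the assignment of X5 to the multiplexer select are assumptions for illustration; the two constants are transcribed from the 32-row truth table on this slide, with bit i holding Y for X4..X1 = i.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.NUMERIC_STD.all;

-- Hypothetical model of a 5-input function built from two 4-input tables.
entity FUNC5_TWO_LUTS is
  port (
    X : in  STD_LOGIC_VECTOR(5 downto 1);  -- X(5) = X5 ... X(1) = X1
    Y : out STD_LOGIC
  );
end FUNC5_TWO_LUTS;

architecture BEHAVIORAL of FUNC5_TWO_LUTS is
  -- Rows of the slide's truth table with X5 = 0 and X5 = 1, respectively.
  constant LUT_LOW  : STD_LOGIC_VECTOR(15 downto 0) := X"F932";
  constant LUT_HIGH : STD_LOGIC_VECTOR(15 downto 0) := X"2A80";
  signal F_LOW, F_HIGH : STD_LOGIC;
begin
  -- Both tables are addressed by the same four low-order inputs.
  F_LOW  <= LUT_LOW(to_integer(unsigned(X(4 downto 1))));
  F_HIGH <= LUT_HIGH(to_integer(unsigned(X(4 downto 1))));

  -- The fifth input selects which table drives the output (F5 mux role).
  Y <= F_HIGH when X(5) = '1' else F_LOW;
end BEHAVIORAL;
```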

Distributed RAM

[Figure: one LUT used as RAM16X1S (D, WE, WCLK, A0 to A3, O); two LUTs used as RAM32X1S (D, WE, WCLK, A0 to A4, O), or RAM16X2S (D0, D1, WE, WCLK, A0 to A3, O0, O1), or RAM16X1D (D, WE, WCLK, A0 to A3, DPRA0 to DPRA3, SPO, DPO)]

• CLB LUT configurable as Distributed RAM
  • A LUT equals 16x1 RAM
  • Implements Single and Dual-Ports
  • Cascade LUTs to increase RAM size
• Synchronous write
• Synchronous/Asynchronous read
  • Accompanying flip-flops used for synchronous read
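The synchronous-write, asynchronous-read behavior described above can be written as plain behavioral VHDL. This is a minimal sketch with hypothetical names (RAM_16X1_INFERRED, MEM); whether a given synthesis tool actually maps it onto a LUT-based RAM is tool dependent and is not guaranteed by the code itself.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.NUMERIC_STD.all;

-- Hypothetical 16x1 single-port RAM: synchronous write, asynchronous read.
entity RAM_16X1_INFERRED is
  port (
    CLK      : in  STD_LOGIC;
    WE       : in  STD_LOGIC;
    ADDR     : in  STD_LOGIC_VECTOR(3 downto 0);
    DATA_IN  : in  STD_LOGIC;
    DATA_OUT : out STD_LOGIC
  );
end RAM_16X1_INFERRED;

architecture BEHAVIORAL of RAM_16X1_INFERRED is
  signal MEM : STD_LOGIC_VECTOR(15 downto 0) := (others => '0');
begin
  -- Write happens only on the rising clock edge when WE is high.
  process (CLK)
  begin
    if rising_edge(CLK) then
      if WE = '1' then
        MEM(to_integer(unsigned(ADDR))) <= DATA_IN;
      end if;
    end if;
  end process;

  -- Read is combinational; registering DATA_OUT in a flip-flop
  -- would turn it into the synchronous-read variant.
  DATA_OUT <= MEM(to_integer(unsigned(ADDR)));
end BEHAVIORAL;
```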

Shift Register

[Figure: a LUT drawn as a chain of D flip-flops with clock enable; inputs IN, CE, CLK, DEPTH[3:0]; output OUT]

• Each LUT can be configured as shift register
  • Serial in, serial out
• Dynamically addressable delay up to 16 cycles
• For programmable pipeline
• Cascade for greater cycle delays
• Use CLB flip-flops to add depth
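A behavioral sketch of such a dynamically addressable shift register is given below. The entity and port names are hypothetical (D_IN and D_OUT are used because IN and OUT are reserved words in VHDL), and the mapping onto a single LUT is up to the synthesis tool. With this coding, the delay from D_IN to D_OUT is DEPTH + 1 clock cycles, that is, 1 to 16 cycles.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.NUMERIC_STD.all;

-- Hypothetical 16-stage serial-in, serial-out shift register
-- with a run-time selectable tap.
entity SHIFT_REG_16 is
  port (
    CLK   : in  STD_LOGIC;
    CE    : in  STD_LOGIC;
    D_IN  : in  STD_LOGIC;
    DEPTH : in  STD_LOGIC_VECTOR(3 downto 0);
    D_OUT : out STD_LOGIC
  );
end SHIFT_REG_16;

architecture BEHAVIORAL of SHIFT_REG_16 is
  signal SR : STD_LOGIC_VECTOR(15 downto 0) := (others => '0');
begin
  -- Shift one position per enabled clock edge; no reset is used.
  process (CLK)
  begin
    if rising_edge(CLK) then
      if CE = '1' then
        SR <= SR(14 downto 0) & D_IN;
      end if;
    end if;
  end process;

  -- DEPTH selects the tap: SR(0) is 1 cycle old, SR(15) is 16 cycles old.
  D_OUT <= SR(to_integer(unsigned(DEPTH)));
end BEHAVIORAL;
```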

Shift Register

• Register-rich FPGA
  • Allows for addition of pipeline stages to increase throughput
• Data paths must be balanced to keep desired functionality

[Figure: a 64-bit path through Operation A (4 cycles) and Operation B (8 cycles) takes 12 cycles, while the parallel 64-bit path through Operation C takes 3 cycles, leaving a 9-cycle imbalance]
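In the figure's example, the longer path takes 4 + 8 = 12 cycles and the shorter one 3 cycles, so the shorter path needs 12 - 3 = 9 extra register stages. A generic delay line is one way to add them. The sketch below uses hypothetical names (DELAY_LINE, WIDTH, STAGES) and assumes the "64" in the figure denotes a 64-bit bus; the defaults match that reading.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.all;

-- Hypothetical balancing delay: Q is D delayed by STAGES clock cycles.
entity DELAY_LINE is
  generic (
    WIDTH  : positive := 64;  -- assumed bus width from the figure
    STAGES : positive := 9    -- 12 cycles - 3 cycles
  );
  port (
    CLK : in  STD_LOGIC;
    D   : in  STD_LOGIC_VECTOR(WIDTH-1 downto 0);
    Q   : out STD_LOGIC_VECTOR(WIDTH-1 downto 0)
  );
end DELAY_LINE;

architecture BEHAVIORAL of DELAY_LINE is
  type PIPE_T is array (0 to STAGES-1) of STD_LOGIC_VECTOR(WIDTH-1 downto 0);
  signal PIPE : PIPE_T := (others => (others => '0'));
begin
  process (CLK)
  begin
    if rising_edge(CLK) then
      PIPE(0) <= D;
      -- Each stage takes the value of the previous stage.
      for I in 1 to STAGES-1 loop
        PIPE(I) <= PIPE(I-1);
      end loop;
    end if;
  end process;

  Q <= PIPE(STAGES-1);
end BEHAVIORAL;
```

Placed after Operation C, such a delay would let both paths arrive after 12 cycles.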

Carry & Control Logic

[Figure: one slice with two look-up tables (inputs G4 to G1 and F4 to F1), two carry & control logic blocks, and two D flip-flops; slice signals CIN, COUT, CLK, CE, SR, BX, BY, F5IN, X, XB, Y, YB]

Fast Carry Logic

• Each CLB contains separate logic and routing for the fast generation of sum & carry signals
  • Increases efficiency and performance of adders, subtractors, accumulators, comparators, and counters
• Carry logic is independent of normal logic and routing resources

[Figure: carry logic routing running through a column of CLBs from LSB to MSB]

Accessing Carry Logic

All major synthesis tools can infer carry logic for arithmetic functions

• Addition (SUM <= A + B)

• Subtraction (DIFF <= A - B)

• Comparators (if A < B then…)

• Counters (count <= count +1)
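The operations listed above can be collected in one small design to see what "inferring carry logic" looks like in source code. This is a sketch with hypothetical names (ARITH_DEMO, N, A_LT_B). It uses IEEE.NUMERIC_STD with explicit unsigned conversions, which is an assumption on my part; the one-line expressions on the slide would instead need an arithmetic package that defines "+" on STD_LOGIC_VECTOR. Whether the dedicated carry chain is used is decided by the synthesis tool.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.NUMERIC_STD.all;

-- Hypothetical demo of arithmetic operators that tools typically
-- implement with carry logic.
entity ARITH_DEMO is
  generic (N : positive := 16);
  port (
    CLK    : in  STD_LOGIC;
    A, B   : in  STD_LOGIC_VECTOR(N-1 downto 0);
    SUM    : out STD_LOGIC_VECTOR(N-1 downto 0);
    DIFF   : out STD_LOGIC_VECTOR(N-1 downto 0);
    A_LT_B : out STD_LOGIC;
    COUNT  : out STD_LOGIC_VECTOR(N-1 downto 0)
  );
end ARITH_DEMO;

architecture BEHAVIORAL of ARITH_DEMO is
  signal CNT : unsigned(N-1 downto 0) := (others => '0');
begin
  -- Addition and subtraction (results wrap modulo 2**N).
  SUM  <= STD_LOGIC_VECTOR(unsigned(A) + unsigned(B));
  DIFF <= STD_LOGIC_VECTOR(unsigned(A) - unsigned(B));

  -- Unsigned comparison.
  A_LT_B <= '1' when unsigned(A) < unsigned(B) else '0';

  -- Free-running counter.
  process (CLK)
  begin
    if rising_edge(CLK) then
      CNT <= CNT + 1;
    end if;
  end process;
  COUNT <= STD_LOGIC_VECTOR(CNT);
end BEHAVIORAL;
```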

Block RAM

[Figure: Spartan-II True Dual-Port Block RAM with Port A and Port B]

• Most efficient memory implementation
  • Dedicated blocks of memory
• Ideal for most memory requirements
  • 4 to 104 memory blocks
    • 18 kbits = 18,432 bits per block
  • Use multiple blocks for larger memories
• Builds both single and true dual-port RAMs
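For comparison with the LUT-based memories, a memory with a registered (synchronous) read is the coding style that synthesis tools commonly map onto block RAM. The sketch below is a single-port example with hypothetical names (BRAM_1K_X18, RAM_T); the 1024 x 18 organization is chosen so that it fills exactly one 18,432-bit block, and the actual mapping remains a tool decision.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.NUMERIC_STD.all;

-- Hypothetical single-port 1024 x 18 memory with synchronous read.
entity BRAM_1K_X18 is
  port (
    CLK  : in  STD_LOGIC;
    EN   : in  STD_LOGIC;
    WE   : in  STD_LOGIC;
    ADDR : in  STD_LOGIC_VECTOR(9 downto 0);
    DI   : in  STD_LOGIC_VECTOR(17 downto 0);
    DO   : out STD_LOGIC_VECTOR(17 downto 0)
  );
end BRAM_1K_X18;

architecture BEHAVIORAL of BRAM_1K_X18 is
  type RAM_T is array (0 to 1023) of STD_LOGIC_VECTOR(17 downto 0);
  signal RAM : RAM_T := (others => (others => '0'));
begin
  process (CLK)
  begin
    if rising_edge(CLK) then
      if EN = '1' then
        if WE = '1' then
          RAM(to_integer(unsigned(ADDR))) <= DI;
        end if;
        -- Registered read; on a write, the old word is returned
        -- (read-before-write behavior).
        DO <= RAM(to_integer(unsigned(ADDR)));
      end if;
    end if;
  end process;
end BEHAVIORAL;
```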

Spartan-3 Block RAM Amounts

Block RAM Port Aspect Ratios

[Figure: one block RAM organized in each of its port aspect ratios]
• 16k x 1 (addresses 0 to 16,383)
• 8k x 2 (addresses 0 to 8,191)
• 4k x 4 (addresses 0 to 4,095)
• 2k x (8+1) (addresses 0 to 2,047)
• 1024 x (16+2) (addresses 0 to 1,023)

Dual Port Block RAM

Dual-Port Bus Flexibility

• Each port can be configured with a different data bus width
• Provides easy data width conversion without any additional logic

[Figure: RAMB4_S4_S16 symbol. Port A: 1K-bit depth in, 18-bit width out; signals WEA, ENA, RSTA, ADDRA[9:0], CLKA, DIA[17:0], DOA[17:0]. Port B: 2k-bit depth in, 9-bit width out; signals WEB, ENB, RSTB, ADDRB[8:0], CLKB, DIB[15:0], DOB[8:0].]

Two Independent Single-Port RAMs

• Added advantage of True Dual-Port
  • No wasted RAM bits
• Can split a Dual-Port 16K RAM into two Single-Port 8K RAMs
  • Simultaneous independent access to each RAM
• To access the lower RAM
  • Tie the MSB address bit to Logic Low
• To access the upper RAM
  • Tie the MSB address bit to Logic High

[Figure: RAMB4_S1_S1 symbol with both ports 8K-bit deep and 1 bit wide. One port is addressed with GND, ADDR[12:0] and the other with VCC, ADDR[12:0]. Port A signals: WEA, ENA, RSTA, ADDRA[12:0], CLKA, DIA[0], DOA[0]. Port B signals: WEB, ENB, RSTB, ADDRB[12:0], CLKB, DIB[0], DOB[0].]

New 18 x 18 Embedded Multiplier

• Fast arithmetic functions
  • Optimized to implement multiply / accumulate modules
• 18 x 18 signed multiplier
• Fully combinatorial
• Optional registers with CE & RST (pipeline)
• Independent from adjacent block RAM

18 x 18 Multiplier

• Embedded 18-bit x 18-bit multiplier
• 2’s complement signed operation
• Multipliers are organized in columns

[Figure: 18 x 18 Multiplier with inputs Data_A (18 bits) and Data_B (18 bits) and Output (36 bits)]

Note: See Virtex-II Data Sheet for updated performances
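A signed 18 x 18 multiplication with an optional output register can be described in a few lines. The sketch below uses hypothetical names (MULT_18X18, P_COMB, P_REG) and provides both a combinational and a registered product; whether it lands in an embedded multiplier block depends on the synthesis tool and target device.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.NUMERIC_STD.all;

-- Hypothetical 18 x 18 two's complement multiplier.
entity MULT_18X18 is
  port (
    CLK    : in  STD_LOGIC;
    CE     : in  STD_LOGIC;
    RST    : in  STD_LOGIC;
    DATA_A : in  STD_LOGIC_VECTOR(17 downto 0);
    DATA_B : in  STD_LOGIC_VECTOR(17 downto 0);
    P_COMB : out STD_LOGIC_VECTOR(35 downto 0);  -- combinational product
    P_REG  : out STD_LOGIC_VECTOR(35 downto 0)   -- pipelined product
  );
end MULT_18X18;

architecture BEHAVIORAL of MULT_18X18 is
  signal PRODUCT : signed(35 downto 0);
begin
  -- In NUMERIC_STD the product width is the sum of the operand widths.
  PRODUCT <= signed(DATA_A) * signed(DATA_B);
  P_COMB  <= STD_LOGIC_VECTOR(PRODUCT);

  -- Output register with synchronous reset and clock enable.
  process (CLK)
  begin
    if rising_edge(CLK) then
      if RST = '1' then
        P_REG <= (others => '0');
      elsif CE = '1' then
        P_REG <= STD_LOGIC_VECTOR(PRODUCT);
      end if;
    end if;
  end process;
end BEHAVIORAL;
```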

Basic I/O Block Structure

[Figure: I/O block with three D flip-flops (each with clock enable and set/reset), one each for the three-state control, the output path, and the input path; signals Three-State, Output, Clock, Set/Reset, FF Enable, Direct Input, Registered Input]

IOB Functionality

• IOB provides interface between the package pins and CLBs

• Each IOB can work as uni- or bi-directional I/O

• Outputs can be forced into High Impedance

• Inputs and outputs can be registered
  • advised for high-performance I/O

• Inputs can be delayed

Routing Resources

[Figure: array of CLBs interconnected through Programmable Switch Matrices (PSM)]

Clock Distribution

Spartan-3 FPGA Family Members

FPGA Nomenclature

Device Part Marking

We’re Using: XC3S100-4FG256

Virtex-II 1.5V Architecture

[Figure: array of Configurable Logic Blocks with columns of Block RAMs and 18 x 18 Multipliers, surrounded by I/O Blocks]

Virtex-II 1.5V

Device     CLB Array   Slices   Maximum I/O   BlockRAM (18kb)   Multiplier Blocks   Distributed RAM bits
XC2V40     8x8         256      88            4                 4                   8,192
XC2V80     16x8        512      120           8                 8                   16,384
XC2V250    24x16       1,536    200           24                24                  49,152
XC2V500    32x24       3,072    264           32                32                  98,304
XC2V1000   40x32       5,120    432           40                40                  163,840
XC2V1500   48x40       7,680    528           48                48                  245,760
XC2V2000   56x48       10,752   624           56                56                  344,064
XC2V3000   64x56       14,336   720           96                96                  458,752
XC2V4000   80x72       23,040   912           120               120                 737,280
XC2V6000   96x88       33,792   1,104         144               144                 1,081,344
XC2V8000   112x104     46,592   1,108         168               168                 1,490,944

Virtex-II Block SelectRAM

• Virtex-II BRAM is 18 kbits
• Additional “parity” bits available in selected configurations

[Figure: dual-port Block SelectRAM symbol. Port A: WEA, ENA, SSRA, CLKA, ADDRA[#:0], DIA[#:0], DIPA[#:0], DOA[#:0], DOPA[#:0]. Port B: WEB, ENB, SSRB, CLKB, ADDRB[#:0], DIB[#:0], DIPB[#:0], DOB[#:0], DOPB[#:0].]

Width   Depth    Address   Data     Parity
1       16,384   [13:0]    [0]      N/A
2       8,192    [12:0]    [1:0]    N/A
4       4,096    [11:0]    [3:0]    N/A
9       2,048    [10:0]    [7:0]    [0]
18      1,024    [9:0]     [15:0]   [1:0]
36      512      [8:0]     [31:0]   [3:0]

Using Library Components in VHDL Code

RAM 16x1 (1)

library IEEE;
use IEEE.STD_LOGIC_1164.all;

library UNISIM;
use UNISIM.all;

entity RAM_16X1_DISTRIBUTED is
  port(
    CLK      : in  STD_LOGIC;
    WE       : in  STD_LOGIC;
    ADDR     : in  STD_LOGIC_VECTOR(3 downto 0);
    DATA_IN  : in  STD_LOGIC;
    DATA_OUT : out STD_LOGIC
  );
end RAM_16X1_DISTRIBUTED;

RAM 16x1 (2)

architecture RAM_16X1_DISTRIBUTED_STRUCTURAL of RAM_16X1_DISTRIBUTED is

  attribute INIT : string;
  -- The label named here must match the instance label used below
  attribute INIT of RAM_16X1_S_1 : label is "F0C1";

  -- Component declaration of the "ram16x1s(ram16x1s_v)" unit
  -- File name contains "ram16x1s" entity: ./src/unisim_vital.vhd
  component ram16x1s
    generic(
      INIT : BIT_VECTOR(15 downto 0) := X"0000");
    port(
      O    : out std_ulogic;
      A0   : in  std_ulogic;
      A1   : in  std_ulogic;
      A2   : in  std_ulogic;
      A3   : in  std_ulogic;
      D    : in  std_ulogic;
      WCLK : in  std_ulogic;
      WE   : in  std_ulogic);
  end component;

RAM 16x1 (3)

begin

  -- One LUT used as a 16x1 RAM with initial contents X"F0C1"
  RAM_16X1_S_1 : ram16x1s
    generic map (INIT => X"F0C1")
    port map(
      O    => DATA_OUT,
      A0   => ADDR(0),
      A1   => ADDR(1),
      A2   => ADDR(2),
      A3   => ADDR(3),
      D    => DATA_IN,
      WCLK => CLK,
      WE   => WE
    );

end RAM_16X1_DISTRIBUTED_STRUCTURAL;

RAM 16x8 (1)

library IEEE;
use IEEE.STD_LOGIC_1164.all;

library UNISIM;
use UNISIM.all;

entity RAM_16X8_DISTRIBUTED is
  port(
    CLK      : in  STD_LOGIC;
    WE       : in  STD_LOGIC;
    ADDR     : in  STD_LOGIC_VECTOR(3 downto 0);
    DATA_IN  : in  STD_LOGIC_VECTOR(7 downto 0);
    DATA_OUT : out STD_LOGIC_VECTOR(7 downto 0)
  );
end RAM_16X8_DISTRIBUTED;

RAM 16x8 (2)

architecture RAM_16X8_DISTRIBUTED_STRUCTURAL of RAM_16X8_DISTRIBUTED is

  attribute INIT : string;

  -- Component declaration of the "ram16x1s(ram16x1s_v)" unit
  -- File name contains "ram16x1s" entity: ./src/unisim_vital.vhd
  component ram16x1s
    generic(
      INIT : BIT_VECTOR(15 downto 0) := X"0000");
    port(
      O    : out std_ulogic;
      A0   : in  std_ulogic;
      A1   : in  std_ulogic;
      A2   : in  std_ulogic;
      A3   : in  std_ulogic;
      D    : in  std_ulogic;
      WCLK : in  std_ulogic;
      WE   : in  std_ulogic);
  end component;

RAM 16x8 (3)

begin

  -- Eight 16x1 RAMs side by side, one per data bit, sharing address and control
  GENERATE_MEMORY:
  for I in 0 to 7 generate
    -- The instance label is local to the generate statement,
    -- so the attribute is specified here and uses the same label
    attribute INIT of RAM_16X1_S_1 : label is "0000";
  begin
    RAM_16X1_S_1 : ram16x1s
      generic map (INIT => X"0000")
      port map(
        O    => DATA_OUT(I),
        A0   => ADDR(0),
        A1   => ADDR(1),
        A2   => ADDR(2),
        A3   => ADDR(3),
        D    => DATA_IN(I),
        WCLK => CLK,
        WE   => WE
      );
  end generate;

end RAM_16X8_DISTRIBUTED_STRUCTURAL;

ROM 16x1 (1)

library IEEE;
use IEEE.STD_LOGIC_1164.all;

library UNISIM;
use UNISIM.all;

entity ROM_16X1_DISTRIBUTED is
  port(
    ADDR     : in  STD_LOGIC_VECTOR(3 downto 0);
    DATA_OUT : out STD_LOGIC
  );
end ROM_16X1_DISTRIBUTED;

ROM 16x1 (2)

architecture ROM_16X1_DISTRIBUTED_STRUCTURAL of ROM_16X1_DISTRIBUTED is

  attribute INIT : string;
  -- The label named here must match the instance label used below
  attribute INIT of ROM_16X1_S_1 : label is "F0C1";

  component ram16x1s
    generic(
      INIT : BIT_VECTOR(15 downto 0) := X"0000");
    port(
      O    : out std_ulogic;
      A0   : in  std_ulogic;
      A1   : in  std_ulogic;
      A2   : in  std_ulogic;
      A3   : in  std_ulogic;
      D    : in  std_ulogic;
      WCLK : in  std_ulogic;
      WE   : in  std_ulogic);
  end component;

  -- Constant low level used to disable writing
  signal Low : std_ulogic := '0';

ROM 16x1 (3)

begin

  -- A RAM whose write enable is tied low behaves as a ROM
  -- holding its initial contents X"F0C1"
  ROM_16X1_S_1 : ram16x1s
    generic map (INIT => X"F0C1")
    port map(
      O    => DATA_OUT,
      A0   => ADDR(0),
      A1   => ADDR(1),
      A2   => ADDR(2),
      A3   => ADDR(3),
      D    => Low,
      WCLK => Low,
      WE   => Low
    );

end ROM_16X1_DISTRIBUTED_STRUCTURAL;