1 Heterogeneous Logic Blocks 1. Mixture of two different sizes of LUTs: Larger LUT and cluster sizes: higher speed Smaller sizes: more area efficient Up to the CAD tool to select the resource 2. Mixture of PAL-like LBs and LUT-based LBs: PAL blocks: improved circuit speed LUT blocks: area efficiency 3. Mixture of “specific-purpose logic” and general-purpose LBs: SP LBs: superior area, speed, and power consumption If the function is not used, the silicon area is wasted

Post on 20-Dec-2015


Heterogeneous Logic Blocks

1. Mixture of two different sizes of LUTs:
− Larger LUT and cluster sizes: higher speed
− Smaller sizes: more area efficient
− Up to the CAD tool to select the resource

2. Mixture of PAL-like LBs and LUT-based LBs:
− PAL blocks: improved circuit speed
− LUT blocks: area efficiency

3. Mixture of “specific-purpose logic” (SP) and general-purpose (GP) LBs:
− SP LBs: superior area, speed, and power consumption
− If the function is not used, the silicon area is wasted


Heterogeneous Logic Blocks

• Key questions:

1. Which kinds of SP functions?

2. What should be the ratio: SP/GP?

3. What can be done about SP LBs not used in a specific application?
− Rose’s golden rule: “build structures that are always useful, even if that use is less than perfectly efficient.”
− “The more useful a hard structure is, across a wider range of applications, then the greater its net benefit - provided the cost of the extra functionality is not excessive.”
− Rose. Hard vs. Soft: The Central Question of Pre-Fabricated Silicon. In Proceedings of the 34th International Symposium on Multiple-Valued Logic (ISMVL’04), 2004.


Hard Blocks

• Common hard blocks in modern FPGAs:
− Memory
− Multipliers
− MAC for DSP applications
− Microprocessors


Embedded Memories


Memory in Altera Flex10K


Memory in FLEX 10K


Memory in FLEX 10K


Heterogeneous Logic Blocks

• Each EAB:
− 2048 bits if used as memory (dual-port RAM, ROM, FIFO, …)
− 100 to 600 gates if used as logic


Configuration as Memory

• 256x8: address A[7..0], data D[7..0]

• 512x4: address A[8..0], data D[3..0]

• 1024x2: address A[9..0], data D[1..0]

• 2048x1: address A[10..0], data D0
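The aspect ratios on this slide follow from the fixed 2048-bit capacity: depth is 2048 divided by the data width, and the address width is log2 of the depth. A minimal Python sketch of that arithmetic follows; the names `EAB_BITS` and `eab_configs` are illustrative and not taken from any vendor tool.

```python
EAB_BITS = 2048  # capacity of one EAB, as stated on the earlier slide


def eab_configs(data_widths=(8, 4, 2, 1)):
    """Return (depth, width, address_bits) for each requested data width."""
    configs = []
    for width in data_widths:
        depth = EAB_BITS // width
        # depth is a power of two, so bit_length() - 1 equals log2(depth)
        address_bits = depth.bit_length() - 1
        configs.append((depth, width, address_bits))
    return configs


for depth, width, abits in eab_configs():
    print(f"{depth}x{width}: A[{abits - 1}..0], D[{width - 1}..0]")
```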


Configuration as Memory

• Can be used independently

• Can be combined for a larger memory

[Figure: two 512x4 EABs share the address lines A[8..0]; each supplies D[3..0], giving a combined 512x8 memory with data D[7..0]]
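Combining two 512x4 blocks into one 512x8 memory can be modelled behaviourally: both blocks see the same address, and each stores one half of the data word. The sketch below is only a software model under that assumption; the class names `Ram512x4` and `Ram512x8` are hypothetical.

```python
class Ram512x4:
    """Behavioural model of one EAB configured as 512 words of 4 bits."""

    def __init__(self):
        self.cells = [0] * 512

    def write(self, addr, nibble):
        self.cells[addr & 0x1FF] = nibble & 0xF

    def read(self, addr):
        return self.cells[addr & 0x1FF]


class Ram512x8:
    """Two 512x4 blocks side by side: shared address, split data bus."""

    def __init__(self):
        self.low = Ram512x4()   # holds D[3..0]
        self.high = Ram512x4()  # holds D[7..4]

    def write(self, addr, byte):
        self.low.write(addr, byte & 0xF)
        self.high.write(addr, (byte >> 4) & 0xF)

    def read(self, addr):
        return (self.high.read(addr) << 4) | self.low.read(addr)
```

Widening the word this way keeps the address decoding inside each block unchanged; deepening the memory instead would need an extra address bit to select between blocks.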


Altera Cyclone III Architecture


Cyclone III


Configuration as a Logic Function

• Can be used as a LUT, e.g. a square-root unit (with one EAB: 8 inputs, 8 outputs).

• Advantage over an implementation with several LEs: predictable delay and higher speed.

• An EAB can be used on its own, or several can be combined to implement a more complex function.

Remember:

3. What can be done about SP LBs not used in a specific application?


Cyclone III M9K


Memory Modes

• Embedded shift register mode

• ROM mode

• FIFO buffer

• Single/dual-port


Memory Volume in Cyclone III


Memory Modes

• Simple dual-port mode: Supports simultaneous read and write operations to different locations.

• True dual-port mode: Supports any combination of two-port operations:
− two reads,
− two writes,
− one read and one write,
at two different clock frequencies.


Memory Block Megafunctions

• Can instantiate memory blocks with the Quartus MegaWizard

• Can instantiate them in your VHDL/Verilog code. Refer to

− “RAM Megafunction User Guide,” 2007, http://www.altera.com/literature/ug/ug_ram.pdf


Altera Stratix II Embedded Memory


TriMatrix Memory Structure


Stratix II RAM Blocks


Stratix IV RAM Blocks


Embedded Memory Applications

• 4x4 multiplier (or any complex mathematical function, e.g. the B-th root of a number A)

• For larger multipliers, we use several 4x4 multipliers and several adders.
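A 4x4 multiplier needs 8 address bits and 8 data bits, which matches the 256x8 memory configuration. The sketch below illustrates the idea in Python: a 256-entry table acts as the 4x4 multiplier, and an 8x8 product is assembled from four table lookups plus shifted additions. The names `MUL4_ROM`, `mul4` and `mul8` are illustrative, and the decomposition shown is the usual partial-product scheme rather than anything specified on the slide.

```python
# 256-entry table: address = {a[3..0], b[3..0]}, data = 8-bit product
MUL4_ROM = [(addr >> 4) * (addr & 0xF) for addr in range(256)]


def mul4(a, b):
    """4x4 multiply by table lookup (one 256x8 memory)."""
    return MUL4_ROM[((a & 0xF) << 4) | (b & 0xF)]


def mul8(a, b):
    """8x8 multiply built from four 4x4 lookups and adders."""
    a_hi, a_lo = (a >> 4) & 0xF, a & 0xF
    b_hi, b_lo = (b >> 4) & 0xF, b & 0xF
    return ((mul4(a_hi, b_hi) << 8)
            + ((mul4(a_hi, b_lo) + mul4(a_lo, b_hi)) << 4)
            + mul4(a_lo, b_lo))
```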


Embedded Memory Applications

• Constant multiplier (in DSP and control systems):

• The constant value determines the pattern of the EAB contents.

• If the constant changes at run time, the new pattern can be loaded into the EAB.

• The multiplier's precision can be adjusted by setting the number of output bits (to save resources).
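As a rough illustration of the constant-multiplier idea, the sketch below generates the memory contents for y = constant × x and drops low-order product bits when fewer output bits are requested. The function name `build_constant_rom`, the truncation scheme and the 8-bit defaults are assumptions made for this example.

```python
def build_constant_rom(constant, in_bits=8, out_bits=8):
    """Return (rom, shift): rom[x] holds the top out_bits of constant * x.

    shift is the number of low-order product bits that were dropped,
    so the stored value approximates (constant * x) / 2**shift.
    """
    full_bits = in_bits + max(constant.bit_length(), 1)
    shift = max(full_bits - out_bits, 0)
    rom = [(x * constant) >> shift for x in range(1 << in_bits)]
    return rom, shift


# Initial constant, then a "reload" with a new constant at run time.
rom, shift = build_constant_rom(13)
rom2, shift2 = build_constant_rom(5)
```

Changing the constant only changes the table contents, not the surrounding logic, which is why a reload is enough.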


Embedded Memory Applications

• FSMs with complex state transitions:

• General-purpose FSM:


Memory Applications


Embedded Memory Applications

• Transcendental functions:

• Sine, ..., logarithm, ..., which are difficult to compute algorithmically and to implement in hardware.

• Function argument: applied to the address lines.

• Result: on the data output.
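A small Python sketch of a sine lookup table, with the argument on the address lines and the result on the data output. The 8-bit address, 8-bit data and quarter-wave range (0 to π/2) are assumptions chosen so the table fits a 256x8 memory; they are not taken from the slides.

```python
import math

ADDR_BITS = 8  # assumed argument resolution
DATA_BITS = 8  # assumed result resolution

# Address i maps to the angle (pi/2) * i / 255; data is sin scaled to 0..255.
SINE_ROM = [
    round(math.sin(math.pi / 2 * i / (2 ** ADDR_BITS - 1)) * (2 ** DATA_BITS - 1))
    for i in range(2 ** ADDR_BITS)
]


def sine_lookup(addr):
    """Read the precomputed sine value: one memory access, no arithmetic."""
    return SINE_ROM[addr & (2 ** ADDR_BITS - 1)]
```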


Embedded Memory Applications

• Large code converters:

• Code converter from an 8-bit number to a 10-bit number
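The slide does not say which 8-bit to 10-bit code is meant. One conversion with exactly these widths is binary to BCD (values 0 to 255 need 2 + 4 + 4 bits), so the sketch below uses it purely as an assumed example; `to_bcd10` and `BCD_ROM` are illustrative names.

```python
def to_bcd10(value):
    """Pack an 8-bit value as BCD: hundreds (2 bits), tens (4), ones (4)."""
    hundreds, rest = divmod(value, 100)
    tens, ones = divmod(rest, 10)
    return (hundreds << 8) | (tens << 4) | ones


# The 8-bit input is the address; the 10-bit code is the stored word.
BCD_ROM = [to_bcd10(v) for v in range(256)]
```

Such a table holds 256 ten-bit words, which is more than one 256x8 memory provides, so it would need two blocks side by side.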


Xilinx Virtex II Pro

(Digital Clock Manager)


Xilinx Virtex II Pro


Xilinx Virtex 4


Virtex 5


Computation-Oriented Tiles


Virtex Family


18x18 Multipliers

• For computational and DSP tasks


18x18 Multipliers

• In Virtex 5: DSP48E slices
− 25 x 18 two’s complement multiplication
− One adder, one subtracter and an accumulator


Multipliers in Altera Cyclone III


Embedded Multipliers


Embedded Multipliers

• Can configure each embedded multiplier as one 18 × 18 or two 9 × 9.

• For > 18 × 18, the Quartus II software cascades multipliers.

• No restriction on the data width, but the greater the data width, the slower the multiplication process.

• Can also implement soft multipliers using Cyclone III M9K memory blocks, which increases the number of available multipliers.


Number of Multipliers


Multiplier Block Architecture


9-Bit Mode


Multiplier Megafunctions

• For instantiating multipliers, refer to: Quartus User Guide, Synthesis,

http://www.altera.com/literature/hb/qts/qts_qii5v1_03.pdf


Stratix II DSP Blocks


Stratix II DSP Blocks


Stratix II DSP Blocks


Stratix II DSP Blocks


Stratix Architecture


Ratio-Based Architectures

• If multipliers are not needed by an application, they provide little benefit.

• One way: multiple sub-families within a device family with different ratios of soft logic to hard logic.

• Designer can select the device with the most appropriate ratio
− minimize “wasted” area
− FPGA vendor must support a larger number of devices

[Figure: soft/hard ratio for different devices: 223, 449, 275, 373]


Ratio-Based Architectures

• Virtex 4/Virtex 5 sub-families:

1. LX: focus on soft logic and memory

2. SX: focus on arithmetic computational units

3. FX: focus on a processor and high-speed serial interfaces

• Virtex 6:

1. LXT: High-performance logic with advanced serial connectivity

2. SXT: Highest signal processing capability with advanced serial connectivity

3. HXT: Highest bandwidth serial connectivity


Xilinx Virtex 4


Virtex 5


Embedded Processors


System-Level Design

• Until recently, the CPU and its peripherals were implemented as discrete chips.

• Two Scenarios:
− Memory connected to the CPU via a general-purpose processor bus
− Tightly-coupled memory (TCM) connected to the processor via a dedicated bus


Embedded System Design

• Dedicated chips for CPU and peripherals:
− High area cost,
− Low reliability.

• For a relatively small amount of memory, the integrated memory in the FPGA is used.


Challenges

• Challenges:
− Decision on hardware/software partitioning.
− Design environment must support hardware/software co-verification.


SoPC

• SoC: A chip that integrates the major functional elements of a complete end product.

• Complex FPGAs:
− CPU
− Memory
− Arithmetic units (multipliers, …)
− Peripheral modules
− Logic

• Whole system on a chip (SoPC)


Microprocessor Cores

• Two types:

Hard Core
− Implemented as a hardwired component
− E.g. PowerPC in Xilinx
− E.g. ARM in Altera
− E.g. MIPS in QuickLogic

Soft Core
− Configure logic blocks to act as microprocessor(s)
− E.g. MicroBlaze in Xilinx
− E.g. Nios II in Altera
− E.g. Q90C1 in QuickLogic


Hard Microprocessor Cores

• Two Scenarios:

1. Locate it in a strip to the side of the FPGA fabric.
− Easier for tools because the main FPGA fabric is identical for devices with or without the hard core.
− FPGA vendor can embed a lot of additional functions in the strip to complement the microprocessor.
− Altera: ARM in Excalibur


Hard Microprocessor Cores

• Two Scenarios:

2. Embed core(s) directly into the main FPGA fabric
− Design tools must consider the presence of these blocks in the fabric.
− Memory used by the core comes from embedded RAM blocks.
− Speed advantages from proximity to the main FPGA fabric.
− Xilinx: PowerPC in Virtex II-Pro, Virtex 4, and Virtex 5.


Hard Microprocessor Cores

2. (cont.) Embed core(s) directly into the main FPGA fabric
− No dedicated processor bus or peripheral bus; these buses must be implemented using FPGA logic.
− Advantage: flexibility to define the architecture of the embedded system.
− Disadvantage: the processor cannot perform useful work without configuring the FPGA logic.


Soft Processor Core

• Disadvantages:
− Generally slower
− Larger

• Advantage: can often be customized to exactly suit the needs of the application
− Gains back some of the lost performance and area efficiency.


Soft Microprocessor Cores

• Firm or Soft:
− Soft: if in the form of an RTL netlist that will be synthesized,
− Firm: if placed and routed.

• Peripherals in soft or firm form:
− E.g. memory controllers, interrupt controllers, communication functions, timer counters.
− Refer to the library of the FPGA vendor.

• Xilinx:
− MicroBlaze: 32-bit microprocessor (~1000 logic cells)
− PicoBlaze: 8-bit microprocessor (~150 logic cells)

• Altera:
− Nios II: 32-bit

