lecture 1: course introduction september 8, 2004 ece 697f reconfigurable computing lecture 1 course...

Post on 18-Jan-2016

217 Views

Category:

Documents

1 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Lecture 1: Course Introduction September 8, 2004

ECE 697F

Reconfigurable Computing

Lecture 1

Course Introduction

Prof. Russell Tessier

Lecture 1: Course Introduction September 8, 2004

What is Reconfigurable Computing?

• Computation using hardware that can adapt at the logic level to solve specific problems

° Why is this interesting?• Some applications are poorly suited to

microprocessor.

• VLSI “explosion” provides increasing resources.

• Hardware/Software

• Relatively new research area.

° Acknowledgement: Wolf text

Lecture 1: Course Introduction September 8, 2004

Background needed

• Basic VLSI – transistors, delay models.

• Basic algorithms – graph algorithms, seaches

• Computer Architecture – ALU, microprocessor

• Digital Design – adder, counter, etc.

Topic self-contained!

Lecture 1: Course Introduction September 8, 2004

Course Organization

• Two lecture a week (about 27 overall)

• 3 Homework assignments (20%)

• Final project (35%)

• Transcription assignment (20%)

• Mid-term (25%)

• Required text – FPGA-based System Design (Wolf)

Lecture 1: Course Introduction September 8, 2004

Microprocessor-based Systems

• Generalized to perform many functions well.

• Operates on fixed data sizes.

• Inherently sequential.

Data Storage(Register File)

ALU

A B C

64

Lecture 1: Course Introduction September 8, 2004

Reconfigurable Computing

• Create specialized hardware for each application.

• Functional units optimized to perform a special task.

Functional Unit

A B

H L

If (A > B) { H = A; L = B;}Else { H = B; L = A;}

Lecture 1: Course Introduction September 8, 2004

Example: Bubblesort

• Adapt interconnect to problem.

• Take advantage of parallelism.

A B

H L

A B

H L

A B

H L

A B

H L

A B

H L

Smallest Largest

Lecture 1: Course Introduction September 8, 2004

Implementation Spectrum

• ASIC gives high performance at cost of inflexibility.

• Processor is very flexible but not tuned to the application.

• Reconfigurable hardware is a nice compromise.

Microprocessor Reconfigurable Hardware

ASIC

What does it look like?

Lecture 1: Course Introduction September 8, 2004

Moore’s Law

° Gordon Moore: co-founder of Intel.

° Predicted that number of transistors per chip would grow exponentially (double every 18 months).

° Exponential improvement in technology is a natural trend: steam engines, dynamos, automobiles.

Lecture 1: Course Introduction September 8, 2004

Moore’s Law plot

Lecture 1: Course Introduction September 8, 2004

The cost of fabrication

° Current cost: $2-3 billion.

° Typical fab line occupies about 1 city block, employs a few hundred people.

° New fabrication processes require 6-8 month turnaround.

° Most profitable period is first 18 months-2 years.

Lecture 1: Course Introduction September 8, 2004

Reconfigurable Hardware

• Each logic element operates on four one-bit inputs.

• Output is one data bit.

• Can perform any boolean function of four inputs

2 = 64K functions!

Logic Element

ABCD

Out

A B C D = out

24

Lecture 1: Course Introduction September 8, 2004

Field-Programmable Gate Array

• Each logic element outputs one data bit.

• Interconnect programmable between elements.

• Interconnect tracks grouped into channels.

LE LE

LE LE

LE LE LE LE

LE LE

LE LE

Logic Element Tracks

Lecture 1: Course Introduction September 8, 2004

FPGA Architecture Issues

• Need to explore architectural issues.

• How much functionality should go in a logic element?

• How many routing tracks per channel?

• Switch “population”?

LogicElement

Lecture 1: Course Introduction September 8, 2004

Real World Physical Issues

• Modelling FPGA delay.

• Improving performance through buffering/segmentation.

• Technology dependent.

• The cost of reconfigurability.

S S

Wires have real cost

Lecture 1: Course Introduction September 8, 2004

Translating a Design to an FPGA

• CAD to translate circuit from text description to physical implementation well understood.

• CAD to translate from C program to circuit not well understood.

• Very difficult for application designers to successfully write high-performance applications

C program

. . C = A+B .

Circuit

AB + C

Array

Need for design automation!

Lecture 1: Course Introduction September 8, 2004

High-level Compilers

• Difficult to estimate hardware resources.

• Some parts of program more appropriate for processor (hardware/software codesign).

• Compiler must parallelize computation across many resources.

• Engineers like to write in C rather than pushing little blocks around.

C = A+B

A B

+

C

for (i = 0; i<n, i++){ . .}

Lecture 1: Course Introduction September 8, 2004

Design abstractions

specification

behavior

register-transfer

logic

circuit

layout

English

Executableprogram

Sequentialmachines

Logic gates

transistors

rectangles

Throughput,design time

Function units,clock cycles

Literals, logic depth

nanoseconds

microns

function cost

Lecture 1: Course Introduction September 8, 2004

Circuit Compilation

1. Technology Mapping

2. Placement

3. Routing

LUT

LUT

?

Assign a logical LUT to a physical location.

Select wire segmentsAnd switches forInterconnection.

Lecture 1: Course Introduction September 8, 2004

Two Bit Adder

FA

A B

Co Ci

S

Made of Full Adders

A+B = D

Logic synthesis tool reduces circuit to SOP form

Co = ABCi + ABCi + ABCi + ABCi

S = ABCi + ABCi + ABCi + ABCi

LUT CoCi

BA

LUT SCi

BA

Lecture 1: Course Introduction September 8, 2004

FPGA design

° FPGA manufacturer creates an FPGA fabric; system designer uses the fabric.

° FPGA fabric design issues:• Study sample user designs.

• Select interconnect topology.

• Create logic element structures.

• Design circuits, layout.

Lecture 1: Course Introduction September 8, 2004

Why do we care about layout?

° We won’t design layout.

° Layout determines:• Logic delay.

• Interconnect delay.

• Energy consumption.

° We want to understand sources of FPGA characteristics.

Lecture 1: Course Introduction September 8, 2004

Design validation

° Must check at every step that errors haven’t been introduced-the longer an error remains, the more expensive it becomes to remove it.

° Forward checking: compare results of less- and more-abstract stages.

° Back annotation: copy performance numbers to earlier stages.

Lecture 1: Course Introduction September 8, 2004

Processor + FPGA

1. FPGA serves as coprocessor for data intensive applications – possible project.

Three possibilities

Backplane bus(e.g. PCI)

Proc

chip

daughtercard

FPGA

chipFPGAProc

2. FPGA serves as embedded computer for low latency transfer.“Reconfigurable Functional Unit”

Lecture 1: Course Introduction September 8, 2004

Processor + FPGA (cont..)

• FPGA logic embedded inside processor.

• A number of problems with 2 and 3.

- Process technology an issue.

- ALU much faster than FPGA generally.

- FPGA much faster than the entire processor.

RF

ALU FPGA

Processor

3. Processor integration

Lecture 1: Course Introduction September 8, 2004

Multi-FPGA Systems

• Most applications don’t fit on one device.

• Create need for partitioning designs across many devices.

• Effectively a “netlist computer”

Each FPGA is a logic processor interconnected in a given topology.

F

F

F

F

F

F

F

F

F

Lecture 1: Course Introduction September 8, 2004

Dynamic Reconfiguration

• What if I want to exchange part of the design in the device with another piece?

• Need to create architectures and software to incrementally change designs.

• Effectively a “configuration cache”

Examples: encryption, filtering.

L L

L L

Lecture 1: Course Introduction September 8, 2004

Research Areas

• Storing configuration info inside device.

• Architecture evaluation.

- Size and performance tradeoff.

• Layout of a new logic element.

• Algorithm for place and route.

• Apply an application to FPGA logic.

Lecture 1: Course Introduction September 8, 2004

Versatile Place and Route

• Written by Vaughn Betz at the University of Toronto

• Performs FPGA placement and routing.

• Written in C

• Runs on Suns, Alphas, Linux

• Estimates device sizes and performance.

Lecture 1: Course Introduction September 8, 2004

Xilinx XC4000 Cell

• 2 4-input look-up tables

• 1 3-input look-up table

• 2 D flip flops

Lecture 1: Course Introduction September 8, 2004

Xilinx XC4000 Routing

25

Lecture 1: Course Introduction September 8, 2004

Altera Flex10K

Lecture 1: Course Introduction September 8, 2004

Altera Flex10K

Lecture 1: Course Introduction September 8, 2004

Xilinx Virtex-II Pro

Lecture 1: Course Introduction September 8, 2004

Altera Stratix

Lecture 1: Course Introduction September 8, 2004

Xilinx Virtex CLB

Lecture 1: Course Introduction September 8, 2004

Embedded RAM

° Xilinx – Block SelectRAM• 18Kb dual-port RAM arranged in columns

° Altera – TriMatrix Dual-Port RAM• M512 – 512 x 1

• M4K – 4096 x 1

• M-RAM – 64K x 8

Lecture 1: Course Introduction September 8, 2004

Xilinx: Embedded Multipliers

Lecture 1: Course Introduction September 8, 2004

Summary

° Reconfigurable computing relies heavily on new VLSI technology

° Device architectures maturing

° Application development progressing at rapid pace

° Integration of hardware and software a difficult challenge

° Active area of research at UMass.

top related