1 Kurt Keutzer Lecture 11: Interfaces, I/O and Configurable Processors Professor Kurt Keutzer Computer Science 252 Spring 2000 With contributions from Prof. David Patterson Niraj Shah, Scott Weber

Page 1:

Lecture 11: Interfaces, I/O and Configurable Processors

Professor Kurt Keutzer

Computer Science 252

Spring 2000

With contributions from Prof. David Patterson, Niraj Shah, Scott Weber

Page 2:

Embedded Systems vs. General Purpose Computing - 1

Embedded System
• Runs a few applications often known at design time
• Not end-user programmable
• Operates in fixed run-time constraints, additional performance may not be useful/valuable

General purpose computing
• Intended to run a fully general set of applications
• End-user programmable
• Faster is always better

Page 3:

Embedded Systems vs. General Purpose Computing - 2

Embedded System

Differentiating features:

power

cost

speed (must be predictable)

General purpose computing

Differentiating features

speed (need not be fully predictable)

speed

did we mention speed?

cost (largest component power)

Page 4:

Configurability and Embedded Systems

Advantages of configuration:
• Pay (in power, design time, area) only for what you use
• Gain additional performance by adding features tailored to your application

Particularly for embedded systems:
Principally in embedded controller microprocessor applications
Some use in DSP

Page 5:

What to Configure?

What parts of the microcontroller/microprocessor system to configure?

Easy answers:
• Memory and Cache Sizes - get precisely the sizes your application needs
• Register file sizes
• Interrupt handling and addresses

Harder answers:
• Peripherals
• Instructions

But first we need more context

Page 6:

I/O Interrupts

An I/O interrupt is just like an exception, except:
• An I/O interrupt is asynchronous
• Further information needs to be conveyed

An I/O interrupt is asynchronous with respect to instruction execution:
• I/O interrupt is not associated with any instruction
• I/O interrupt does not prevent any instruction from completing
• You can pick your own convenient point to take an interrupt

An I/O interrupt is more complicated than an exception:
• Needs to convey the identity of the device generating the interrupt
• Interrupt requests can have different urgencies: interrupt requests need to be prioritized
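The two extra pieces of information named above (device identity and urgency) can be illustrated with a minimal sketch. The class name `InterruptRequest`, the function `next_to_service`, the "larger number is more urgent" convention and the tie-break rule are all assumptions made for illustration; they are not taken from the lecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterruptRequest:
    device_id: int  # identity of the device raising the interrupt
    priority: int   # assumed convention: larger number = more urgent


def next_to_service(pending):
    """Return the most urgent pending request, or None if none is pending."""
    if not pending:
        return None
    # Assumed tie-break: among equal priorities, the lower device id wins.
    return max(pending, key=lambda r: (r.priority, -r.device_id))


pending = [
    InterruptRequest(device_id=3, priority=1),
    InterruptRequest(device_id=7, priority=5),
    InterruptRequest(device_id=2, priority=5),
]
chosen = next_to_service(pending)
print(chosen)
```

Real hardware usually does this selection in a priority encoder or interrupt controller; the sketch only shows the decision, not the mechanism.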

Page 7:

Example: Device Interrupt

User program:
    add  $r1,$r2,$r3
    subi $r4,$r1,#4
    slli $r4,$r4,#2
        Hiccup(!)   <- External Interrupt: PC saved, Disable All Ints, Supervisor Mode
    lw   $r2,0($r4)
    lw   $r3,4($r4)
    add  $r2,$r2,$r3
    sw   8($r4),$r2

“Interrupt Handler”:
    Raise priority
    Reenable All Ints
    Save registers
    lw   $r1,20($r0)
    lw   $r2,0($r1)
    addi $r3,$r0,#5
    sw   $r3,0($r1)
    Restore registers
    Clear current Int
    Disable All Ints
    Restore priority
    RTI             -> Restore PC, User Mode

Advantage: User program progress is only halted during actual transfer

Disadvantage: special hardware is needed to:
• Cause an interrupt (I/O device)
• Detect an interrupt (processor)
• Save the proper states to resume after the interrupt (processor)

Page 8:

Interrupt Driven Data Transfer

[Figure: CPU, Memory, IOC and device. The user program (add, sub, and, or, nop) and the interrupt service routine (read, store, ..., rti) sit in memory. Steps: (1) I/O interrupt, (2) save PC, (3) interrupt service addr, (4) interrupt service routine.]

User program progress only halted during actual transfer

1000 transfers at 1 ms each:
    1000 interrupts @ 2 µsec per interrupt
    1000 interrupt service @ 98 µsec each
    = 0.1 CPU seconds

Device xfer rate = 10 MBytes/sec => 0.1 x 10^-6 sec/byte => 0.1 µsec/byte
    => 1000 bytes = 100 µsec
    1000 transfers x 100 µsecs = 100 ms = 0.1 CPU seconds

Still far from device transfer rate! 1/2 in interrupt overhead
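The arithmetic on this slide can be checked with a few lines. This is only a worked restatement of the slide's own numbers; the variable names are made up for the example, and integer microseconds are used to avoid floating-point rounding.

```python
# All times in whole microseconds.
transfers = 1000
interrupt_overhead_us = 2   # per interrupt, from the slide
service_routine_us = 98     # per interrupt service, from the slide

cpu_time_us = transfers * (interrupt_overhead_us + service_routine_us)
cpu_time_s = cpu_time_us / 1_000_000

# Device side: 10 MBytes/sec means 0.1 µsec per byte,
# so a 1000-byte block takes 100 µsec on the device.
device_rate_bytes_per_s = 10_000_000
block_bytes = 1000
block_time_us = block_bytes * 1_000_000 // device_rate_bytes_per_s

print(cpu_time_us, cpu_time_s, block_time_us)
```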

Page 9:

Better Way to Handle Interrupts?

Handling all interrupts with the CPU could bring it to a halt in a real-time system

Isn’t there a better way?

Hint: remember the trickle-down theory of embedded processor architecture.

Page 10:

Trickle Down Theory of Embedded Architectures

Mainframe/supercomputers

High-end servers/workstations

High-end personal computers

Personal computers

Lap tops/palm tops

Gadgets

Watches

...

Features tend to trickle down:
• #bits: 4->8->16->32->64
• ISA’s
• Floating point support
• Dynamic scheduling
• Caches
• I/O controllers/processors
• LIW/VLIW
• Superscalar

Page 11:

I/O Interface

Independent I/O Bus
[Figure: CPU connected to Memory over the memory bus and to Interface/Peripheral pairs over an independent I/O bus]
Separate I/O instructions (in, out)

Common memory & I/O bus
[Figure: CPU, Memory and Interface/Peripheral pairs on one shared bus]
Lines distinguish between I/O and memory transfers
VME bus, Multibus-II, Nubus
40 Mbytes/sec optimistically
10 MIP processor completely saturates the bus!
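One way to read the saturation claim is as a bandwidth estimate. The sketch below assumes each instruction causes one 4-byte fetch over the shared bus and ignores data traffic and caches; that 4-byte figure is an assumption for illustration and is not stated on the slide.

```python
instructions_per_s = 10_000_000   # "10 MIP" processor, from the slide
bytes_per_instruction = 4         # ASSUMPTION: one 32-bit fetch per instruction
bus_bytes_per_s = 40_000_000      # 40 Mbytes/sec, from the slide

demand_bytes_per_s = instructions_per_s * bytes_per_instruction
utilization = demand_bytes_per_s / bus_bytes_per_s

print(demand_bytes_per_s, utilization)
```

Under that assumption instruction fetch alone uses the whole bus, leaving nothing for I/O traffic.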

Page 12:

Delegating I/O Responsibility from the CPU: IOP

[Figure: CPU and IOP share the main memory bus with Mem; the IOP connects to devices D1, D2, ..., Dn over the I/O bus]

(1) CPU issues instruction to IOP:
        OP | Device | Address
        Device = target device; Address = where cmnds are
(2) IOP looks in memory for commands:
        OP | Addr | Cnt | Other
        OP = what to do; Addr = where to put data; Cnt = how much; Other = special requests
(3) Device to/from memory transfers are controlled by the IOP directly. IOP steals memory cycles.
(4) IOP interrupts CPU when done
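The four steps above can be sketched as a small simulation. The field names follow the slide (OP, Device, Address, Addr, Cnt, Other); the class names, the "read"/"write" opcodes, the dictionary used as a command table and the use of `bytearray` for memory and devices are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class IopInstruction:
    """Step (1): what the CPU hands to the IOP."""
    op: str
    device: int    # target device
    address: int   # where the command block is


@dataclass
class IopCommand:
    """Step (2): what the IOP finds in memory."""
    op: str          # what to do ("read" or "write" are assumed opcodes)
    addr: int        # where to put (or get) the data
    cnt: int         # how much
    other: str = ""  # special requests


def run_iop(instr, command_table, memory, devices):
    cmd = command_table[instr.address]          # step (2)
    dev = devices[instr.device]
    if cmd.addr + cmd.cnt > len(memory) or cmd.cnt > len(dev):
        raise ValueError("transfer out of range")
    if cmd.op == "read":                        # step (3): device -> memory
        memory[cmd.addr:cmd.addr + cmd.cnt] = dev[:cmd.cnt]
    elif cmd.op == "write":                     # step (3): memory -> device
        dev[:cmd.cnt] = memory[cmd.addr:cmd.addr + cmd.cnt]
    else:
        raise ValueError("unknown op")
    return "interrupt"                          # step (4): signal the CPU


memory = bytearray(16)
devices = {1: bytearray(b"ABCD")}
commands = {0x100: IopCommand(op="read", addr=4, cnt=4)}
result = run_iop(IopInstruction("start", device=1, address=0x100),
                 commands, memory, devices)
print(result, bytes(memory))
```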

Page 13:

Memory Mapped I/O

Single Memory & I/O Bus, No Separate I/O Instructions

[Figure: CPU, Memory and Interface/Peripheral pairs on a single bus; the address space holds ROM, RAM and I/O]

[Figure: CPU with cache ($) and L2 $ on the Memory Bus; a Memory Bus Adaptor connects the Memory Bus to the I/O bus]

Page 14:

Delegating I/O Responsibility from the CPU: DMA

Direct Memory Access (DMA):
• External to the CPU
• Acts as a master on the bus
• Transfers blocks of data to or from memory without CPU intervention

[Figure: CPU, Memory, DMAC and IOC (with its device) on a shared bus]

CPU sends a starting address, direction, and length count to DMAC. Then issues "start".

DMAC provides handshake signals for Peripheral Controller, and Memory Addresses and handshake signals for Memory.
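The programming sequence described here (starting address, direction, length count, then "start") can be sketched as follows. The class `Dmac`, its method names and the byte-at-a-time loop are assumptions for illustration; the handshake signals mentioned on the slide are not modelled.

```python
class Dmac:
    """Toy DMA controller: programmed by the CPU, then moves a block on its own."""

    def __init__(self, memory, device):
        self.memory = memory
        self.device = device
        self.address = 0
        self.to_memory = True
        self.count = 0
        self.interrupt_pending = False

    def program(self, address, to_memory, count):
        # CPU writes starting address, direction and length count.
        self.address, self.to_memory, self.count = address, to_memory, count

    def start(self):
        # The block moves without further CPU involvement.
        for i in range(self.count):
            if self.to_memory:
                self.memory[self.address + i] = self.device[i]
            else:
                self.device[i] = self.memory[self.address + i]
        self.interrupt_pending = True  # single interrupt at the end


memory = bytearray(8)
device = bytearray(b"\x01\x02\x03")
dmac = Dmac(memory, device)
dmac.program(address=2, to_memory=True, count=3)
dmac.start()
print(bytes(memory), dmac.interrupt_pending)
```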

Page 15:

Direct Memory Access

[Figure: CPU, Memory, DMAC and IOC (with its device) on a shared bus]

Time to do 1000 xfers at 1 msec each:
    1 DMA set-up sequence @ 50 µsec
    1 interrupt @ 2 µsec
    1 interrupt service sequence @ 48 µsec
    = .0001 second of CPU time

CPU sends a starting address, direction, and length count to DMAC. Then issues "start".

DMAC provides handshake signals for Peripheral Controller, and Memory Addresses and handshake signals for Memory.

[Figure: Memory Mapped I/O address space from 0 to n: ROM, RAM, Peripherals, DMAC]
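The CPU cost quoted for DMA can be compared with the interrupt-driven figure from the earlier slide (1000 interrupts at 2 µsec plus 98 µsec of service each). The sketch below only restates those slide numbers; the variable names are invented for the example.

```python
# All times in whole microseconds.
dma_setup_us = 50
dma_interrupt_us = 2
dma_service_us = 48
dma_cpu_us = dma_setup_us + dma_interrupt_us + dma_service_us

# Interrupt-driven transfer, numbers from the earlier slide.
interrupt_driven_cpu_us = 1000 * (2 + 98)

dma_cpu_s = dma_cpu_us / 1_000_000
speedup = interrupt_driven_cpu_us // dma_cpu_us

print(dma_cpu_us, dma_cpu_s, speedup)
```

With these numbers the CPU time drops from 0.1 second to 0.0001 second, a factor of 1000.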

Page 16:

68332 Family

68K was the most successful embedded controller in history

CISC instruction set - good code density

Table lookup for compressed tables

Time processing unit - breakthrough in modular peripheral handling!

Page 17:

MC68332 - Top level

[Figure: blocks CPU32, time processing unit (TPU) with I/O channels 0 to 15, serial I/O, RAM and IMB control, connected by the inter module bus (IMB)]

Designed for automotive applications with a mixture of computation-intensive tasks and complex I/O functions

Idea: off-load the CPU from frequent I/O interactions to make use of its computation performance

Page 18:

68332 CPU Block Diagram

Page 19:

Addressing Modes in 68332

Seven modes

• Register direct

• Register indirect

• Register indirect with index

• Program counter indirect with displacement

• Program counter indirect with Index

• Absolute

• Immediate

Why so many modes? Antiquated architectural feature?

Page 20:

Addressing Modes in 68332

Seven modes

• Register direct

• Register indirect

• Register indirect with index

• Program counter indirect with displacement

• Program counter indirect with Index

• Absolute

• Immediate

Complex addressing modes allow for more dense code … but …

MCore - Mot’s embedded microcontroller rewrite uses simple DLX-like Load Store instructions - code size impact?

Page 21:

MC68332 Time Processing Unit

[Figure: TPU block diagram. Blocks: Host Interface, System Configuration, Channel Control, Parameter RAM, Development Support and Test, Scheduler, Microengine (Control Store, Execution Unit), Timer Channels (Channel 0, Channel 1, ..., Channel 15) with time base and pins. Labels: IMB, Data, Control, Service Requests, Control and Data, Channel Control Store.]

TPU: time processing unit: peripheral coprocessor

independent programmable timer channels: single-shot "capture & compare"

channel coupling and sequence control with control processor

Page 22:

Time Processing Unit

Page 23:

Time Processing Unit

Semi-autonomous microcontroller

Operates concurrently with CPU

• Schedules tasks

• Processes ROM instructions

• Accesses shared data with CPU

• Performs Input/Output

Page 24:

Uses of Time Processing Unit

Programmable series of two operations
• Match
• Capture

Each operation is called an "event"

A pre-programmed series of events is called a "function"

Pre-programmed functions
• Input capture/input transition counter
• Output compare
• Period measurement with additional/missing transition detect
• Position-synchronized pulse generator
• Period/pulse-width accumulator

Page 25:

Time Bases

Two sixteen-bit counters provide time bases for all

Pre-scalers controlled by CPU via bit-fields in TPU module configuration register TPUMCR

Current values accessible via TCR1 and TCR2 registers

TCR1, TCR2 can be read/written by TPU microcode - not available to CPU

TCR1 qualified by system clock

TCR2 qualified by system clock or external clock

Page 26:

Timer Channels

Sixteen channels - each one connected to an MCU pin

Each channel has symmetric hardware:
• Event register
    16-bit capture register
    16-bit compare/match register
    16-bit comparator
• Pin control logic - pin direction determined by TPU microengine
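The per-channel hardware listed above (capture register, match register, comparator) can be sketched as a small model. This is a simplified illustration, not the MC68332 behaviour: the class `TimerChannel`, its method names, and the use of an exact-equality comparison against the time base are all assumptions.

```python
MASK16 = 0xFFFF  # registers are 16 bits wide


class TimerChannel:
    def __init__(self):
        self.capture_reg = 0
        self.match_reg = 0
        self.match_armed = False

    def arm_match(self, value):
        # Load the compare/match register and enable the comparator.
        self.match_reg = value & MASK16
        self.match_armed = True

    def tick(self, tcr, pin_transition=False):
        """Process one time-base value; return the events raised."""
        events = []
        tcr &= MASK16
        if pin_transition:
            # Capture: latch the time base when the pin changes.
            self.capture_reg = tcr
            events.append("capture")
        if self.match_armed and tcr == self.match_reg:
            # Match: the time base reached the programmed value.
            self.match_armed = False
            events.append("match")
        return events


ch = TimerChannel()
ch.arm_match(0x10005)  # truncated to 16 bits -> 5
log = [(t, ch.tick(t, pin_transition=(t == 3))) for t in range(7)]
print(log, ch.capture_reg)
```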

Page 27:

Scheduler

Determines which of sixteen channels is serviced by the microengine

Channel can request service for one of four reasons
• host service
• link to another channel
• match event
• capture event

Host system assigns to each channel a priority
• high
• middle
• low
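A minimal sketch of the selection the scheduler has to make: among the channels requesting service, pick one according to the host-assigned priority. The function `pick_channel`, the strict-priority rule and the lowest-channel-number tie-break are assumptions for illustration; the slide does not say how the actual TPU scheduler arbitrates between levels.

```python
PRIORITY_RANK = {"high": 0, "middle": 1, "low": 2}


def pick_channel(requests, priorities):
    """requests: set of channel numbers (0..15) asking for service.
    priorities: dict mapping channel number -> "high" | "middle" | "low".
    Returns the channel to service next, or None if nothing is requested."""
    if not requests:
        return None
    # Assumed policy: strict priority, lowest channel number breaks ties.
    return min(requests, key=lambda ch: (PRIORITY_RANK[priorities[ch]], ch))


priorities = {0: "low", 5: "high", 9: "high", 12: "middle"}
first = pick_channel({0, 9, 12, 5}, priorities)
second = pick_channel({0, 12}, priorities)
print(first, second)
```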

Page 28:

Microengine

Determines which of sixteen channels is serviced by the microengine

Channel can request service for one of four reasons
• host service
• link to another channel
• match event
• capture event

Host system assigns to each channel a priority
• high
• middle
• low

Page 29:

Another Motorola Microprocessor

Page 30:

Concepts so far ...

• Interrupts

• Memory Mapping of I/O

• Time Processing Unit / Peripheral Processor

other configurable elements

Peripherals

Instructions

Page 31:

Configurability in ARM Processor

ARM allows for configurability via AMBA bus

Offers "prime cell" peripherals which hook into AMBA Peripheral Bus (APB)

• UART

• Real Time Clock

• Audio Codec Interface

• Keyboard and mouse interface

• General purpose I/O

• Smart card interface

• Generic IR interface

http://www.arm.com/Pro+Peripherals/PrimeCell/index.html

Page 32:

ARM7 core

Page 33:

ARM’s AMBA open standard

Advanced System Bus, (ASB) - high performance, CPU, DMA, external

Advanced Peripheral Bus, (APB) - low speed, low power, parallel I/O, UART’s

External interface

http://www.arm.com/Documentation/Overviews/AMBA_Intro/#intro

Page 34:

Ex1: ARM Infrared (IR) Interface

Page 35:

Ex 2: ARM Smart Card Interface

Page 36:

Ex 3: Audio Codec

Page 37:

Another Kind of Configurability

[Figure: synthesis flow: HDL -> RTL synthesis -> netlist -> logic optimization -> netlist -> physical design -> layout, drawing on a library]

Synthesis of a processor core from an RTL description allows for:

• full range of other types of configurability

• additional degrees of freedom in quality of implementation

Examples:

• ARM7

• Motorola Coldfire

• Tensilica Xtensa

Page 38:

Quality of Results Tradeoffs

[Figure: plot with axes Delay and Area]

Synthesizable implementation allows for exploration of a wide range of implementations

Page 39:

ARM Core7 Thumb Embedded

Page 40:

Ultimate configurability: the Tensilica solution

[Figure labels:]
• Fast, safe tailoring of cores
• Extensibility with synchronization to the hardware
• DSP and peripheral blocks
• uP Generator
• uP Cores
• Pre-verified function library
• S/W development environment
• Ultra small and efficient, new architectures

Page 41:

Tensilica Viterbi Implementation

Niraj Shah

Scott Weber

290A Final Presentation

Page 42:

Tensilica Flow

[Figure: flow diagram. Labels: uArch Designer, TIE, Tensilica Processor Generator, gen, xt-gcc, .c, .o, xt-run]

Page 43:

Xtensa Architecture

[Figure: Xtensa Core connected to the TIE unit by Rs, Rt, Rr and I]

TIE Extensions:
• single cycle
• state free
• no new exceptions
• no stalls
• typeless data

Rs, Rt, Rr are 32 bit regs

I is the instruction controlling the TIE unit

Xtensa Core is a 32 bit configurable RISC processor

Page 44:

Viterbi Architecture

[Figure: block diagram. Blocks: ADC, I/O Device, Init, ACS, TraceBack, RAM. Annotation: "Measured Performance Here"]

Page 45:

TIE SetupBMreg (ACS)

[Figure: SetupBMreg datapath. Labels: Rs (I in bits 7:0), Rt (Q in bits 7:0), Rr, constant 0x7F, add/subtract units, branch metrics bm3, bm2, bm1, bm0 in four 8-bit fields (31:24, 23:16, 15:8, 7:0), control instruction]

Page 46:

ACS TIE Extension (ACS)

[Figure: ACS datapath. Labels: Rs, Rt, Rr, branch metrics bm3, bm2, bm1, bm0 in four 8-bit fields, path metrics (pm), two adders, comparison, msb, decision bit, instruction. Instruction variants: ACS03 || ACS12 || ACS30 || ACS21]
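For readers unfamiliar with the operation named in the slide title, here is a generic add-compare-select step as used in Viterbi decoding. It is not a reconstruction of the datapath in the figure: the function name `acs`, the "smaller metric wins" convention and the meaning of the decision bit are assumptions for illustration.

```python
def acs(pm_a, bm_a, pm_b, bm_b):
    """Add-compare-select for one trellis state.

    pm_a, pm_b: path metrics of the two predecessor states
    bm_a, bm_b: branch metrics of the two incoming transitions
    Returns (surviving path metric, decision bit).
    Assumed convention: smaller metric survives; decision bit is 1
    when the second candidate wins, 0 otherwise (ties keep the first).
    """
    cand_a = pm_a + bm_a      # add
    cand_b = pm_b + bm_b
    if cand_b < cand_a:       # compare
        return cand_b, 1      # select second path
    return cand_a, 0          # select first path


print(acs(10, 3, 7, 4))
print(acs(5, 1, 5, 1))
```

The decision bits are what a traceback stage later walks through to recover the decoded sequence.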

Page 47:

ACS TIE Extension with State (ACS)

[Figure: ACS datapath with state. Labels: branch metrics bm3, bm2, bm1, bm0 in four 8-bit fields; two add/compare units, one fed from Rs and one from Rt, each with path metrics (pm), msb and a decision bit; results and decision bits collected in Rr; control instruction]

Page 48:

TIE Zmask (TraceBack)

[Figure: Zmask datapath. Labels: Rs, Rt, Rr, constants 0x7F and 0x3F, shift left by 1 (<<1), AND (&) and OR (|) units, control instruction]

Page 49:

Designs

All designs had a BER of 0.000095 after 10 million iterations

Design 1:  100 MHz, 48 mW, 1K DCache, 1K ICache, TIE
Design 1+: 222 MHz, 144 mW, 1K DCache, 1K ICache, TIE
Design 2-: 100 MHz, 69 mW, 16K DCache, 16K ICache, TIE
Design 2:  222 MHz, 191 mW, 16K DCache, 16K ICache, TIE
Design 3:  222 MHz, 191 mW, 16K DCache, 16K ICache, TIE with state

Page 50:

Performance

[Bar chart, Kb/s]

             Cache    Perfect Cache
Design 1      118         409
Design 1+     263         909
Design 2-     357         409
Design 2      793         909
Design 3      966        1142

Page 51:

Energy Dissipation

[Bar chart, uJ/bit]

             Cache    Perfect Cache
Design 1      0.4         0.12
Design 1+     0.54        0.16
Design 2-     0.19        0.17
Design 2      0.24        0.21
Design 3      0.2         0.17
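The energy figures appear to follow from dividing each design's power by its measured throughput. A quick check with Design 1's numbers from the Designs and Performance slides (48 mW; 118 Kb/s with cache, 409 Kb/s with a perfect cache) is sketched below. The function name is invented, and the sketch assumes Kb/s means 1000 bits per second.

```python
def energy_uj_per_bit(power_mw, throughput_kbps):
    """Energy per decoded bit in microjoules.

    mW / (Kb/s) = (1e-3 J/s) / (1e3 bit/s) = 1e-6 J/bit, i.e. uJ/bit,
    assuming Kb/s means 1000 bits per second.
    """
    return power_mw / throughput_kbps


design1_cache = energy_uj_per_bit(48, 118)
design1_perfect = energy_uj_per_bit(48, 409)
print(round(design1_cache, 2), round(design1_perfect, 2))
```

Both results round to the values plotted for Design 1 (0.4 and 0.12 uJ/bit).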

Page 52:

n(s*J)/Bit

[Bar chart, n(s*J)/Bit]

             Cache    Perfect Cache
Design 1      3.39        0.293
Design 1+     2.05        0.176
Design 2-     0.532       0.416
Design 2      0.315       0.231
Design 3      0.207       0.148
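The plotted values are consistent with an energy-delay style metric: energy per bit multiplied by time per bit. The sketch below assumes that interpretation and checks it against Design 1, using the rounded energy figures (0.4 and 0.12 uJ/bit) and throughputs (118 and 409 Kb/s) from the earlier slides. The function name is invented, and how the resulting number maps onto the slide's "n(s*J)" unit label is not stated in the source.

```python
def energy_delay_per_bit(energy_uj_per_bit, throughput_kbps):
    """Energy per bit (uJ) times time per bit (us).

    Time per bit in microseconds is 1000 / (Kb/s), assuming Kb/s means
    1000 bits per second. The result is a plain number in uJ*us per bit.
    """
    time_us_per_bit = 1000.0 / throughput_kbps
    return energy_uj_per_bit * time_us_per_bit


design1_cache = energy_delay_per_bit(0.4, 118)
design1_perfect = energy_delay_per_bit(0.12, 409)
print(round(design1_cache, 2), round(design1_perfect, 3))
```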

Page 53:

Die Area

[Bar chart, mm2]

             Cache    Perfect Cache
Design 1      2.1         2.1
Design 1+     2.37        2.37
Design 2-     6.14        6.14
Design 2      6.7         6.7
Design 3      6.7         6.7

Page 54:

Summary: Levels of Configurability

Configurability is highly desirable in embedded applications

There are many levels of configuration:
• Memory and Cache Sizes - get precisely the sizes your application needs
• Register file sizes
• Interrupt handling and addresses
• Peripherals
• Instructions
• Physical implementation