An Energy-Efficient Reconfigurable Multiprocessor IC for DSP Applications


Upload: cedric-alston

Post on 30-Dec-2015



TRANSCRIPT

Page 1

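[Figure: Processor Architecture. Block labels: ring A processors P1A, P2A, P3A, P4A and ring B processors P1B, P2B, P3B, P4B; inter-processor register files R12A, R23A, R34A, R41A and R12B, R23B, R34B, R41B. Per-processor labels: Register file, Operation Scheduling, ALU, ALUR, Program Memory, Control Input, Output, To Register file.]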

An Energy-Efficient Reconfigurable Multiprocessor IC for DSP Applications

• Multiple programmable VLIW processors arranged in a ring topology
– Balances its functionalities between ASICs and general-purpose digital signal processors
– Distributed memories along with direct inter-processor communications through register files
– Flexible choice of computing resources

• Energy-efficient DSP applications can be achieved by exploiting its multi-level reconfigurable architecture
– Efficient mapping of algorithms onto the multiprocessor
– Inside each processor, computation modules, e.g. multipliers, can be turned off by the instructions to improve the energy-efficiency
– Scalable datapath provides a means of trading off performance vs. power efficiency
– Memory localization through distributed memories also contributes to power savings
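The ring arrangement with register-file links can be pictured with a small sketch. It assumes, from the figure labels only, that a register file named Rij sits between neighbouring processors Pi and Pj of the same ring; the function name `ring_links` and its parameters are hypothetical and not taken from the slides.

```python
def ring_links(num_procs=4, ring="A"):
    """Map each shared register file to the two processors it connects.

    Assumption (inferred from labels such as R12A and R41A): register
    file Rij links processor Pi to its ring neighbour Pj.
    """
    links = {}
    for i in range(1, num_procs + 1):
        j = i % num_procs + 1  # next processor, wrapping from 4 back to 1
        links[f"R{i}{j}{ring}"] = (f"P{i}{ring}", f"P{j}{ring}")
    return links


if __name__ == "__main__":
    for reg, (src, dst) in ring_links().items():
        print(f"{reg}: {src} <-> {dst}")
```

With the defaults this reproduces the four ring A register-file names that appear in the figure (R12A, R23A, R34A, R41A); passing `ring="B"` gives the ring B names.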

Page 2

• Variable word-length 20-tap FIR and 8-point FFT (16-, 24- and 48-bit)
– In 16- and 24-b resolution, ring A in use and ring B in “sleep mode”; in 48-b mode, both rings are active: ring A for 24 MSBs and ring B for 24 LSBs
– Booth multipliers used in 16-b mode; serial multipliers employed in 24- and 48-b modes
– Multipliers are active only for the multiplication with W_8^1 and W_8^3 in the FFT

• Reconfigurable Viterbi decoder (K = 6 to 9, r = 1/2 and 1/3)
– Efficient ACS implementation and path metric memory localization
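A short sketch of why only two twiddle factors need a multiplier in an 8-point FFT: of the four factors W_8^0 to W_8^3, two are 1 and -j, which reduce to sign changes and real/imaginary swaps. The sign convention W_N^k = exp(-j2πk/N) and the helper names are assumptions for illustration; the slides do not give an implementation.

```python
import cmath


def twiddle(n, k):
    """Twiddle factor W_n^k, assuming the forward-DFT convention exp(-j*2*pi*k/n)."""
    return cmath.exp(-2j * cmath.pi * k / n)


def is_trivial(w, tol=1e-12):
    """True for +/-1 and +/-j, which need no real multiplication."""
    return abs(w.real) < tol or abs(w.imag) < tol


# A radix-2 8-point FFT uses W_8^k for k = 0..3.
needs_multiplier = [k for k in range(4) if not is_trivial(twiddle(8, k))]

if __name__ == "__main__":
    for k in range(4):
        w = twiddle(8, k)
        print(f"W_8^{k} = {w.real:+.4f} {w.imag:+.4f}j")
    print("exponents needing a multiplier:", needs_multiplier)
```

The non-trivial exponents come out as 1 and 3, matching the slide's statement; the result is the same under the opposite sign convention.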

[Figure: ACS datapath for the Viterbi decoder. Block labels: two 8-bit adders (ports A<7:0>, B<7:0>, Cin, Co), Compare, MUX. Operands: {6'h00, BM0, 6'h00, BM1} and {PM0, PM1}, each 16 bits wide. Control signals: Mode<1>, Mode<0>, sub, Cin_ext, Viterbi enable. Outputs: survivor metric, General addition output.]

Energy-Efficient DSP Applications on the Multiprocessor IC
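The add-compare-select (ACS) step named on this slide can be sketched as follows. This is a minimal model under stated assumptions, not the chip's logic: it assumes the smaller candidate metric survives, treats the packed operands as Verilog-style concatenations with the first field in the upper byte, and ignores path-metric overflow handling, none of which the slides specify. The names `pack16` and `acs` are hypothetical.

```python
def pack16(hi, lo):
    """Verilog-style {hi, lo}: two 8-bit fields packed into a 16-bit word."""
    return ((hi & 0xFF) << 8) | (lo & 0xFF)


def acs(pm_word, bm_word):
    """One add-compare-select step on packed path and branch metrics.

    pm_word models {PM0, PM1}; bm_word models {6'h00, BM0, 6'h00, BM1},
    i.e. each branch metric zero-extended to 8 bits.
    Assumption: the smaller sum is the survivor; sums wrap at 8 bits.
    """
    pm0, pm1 = (pm_word >> 8) & 0xFF, pm_word & 0xFF
    bm0, bm1 = (bm_word >> 8) & 0xFF, bm_word & 0xFF
    cand0 = (pm0 + bm0) & 0xFF  # first 8-bit adder
    cand1 = (pm1 + bm1) & 0xFF  # second 8-bit adder
    decision = 1 if cand1 < cand0 else 0  # compare
    survivor = cand1 if decision else cand0  # select (MUX)
    return survivor, decision


if __name__ == "__main__":
    print(acs(pack16(10, 7), pack16(1, 2)))
```

Packing both path metrics into one 16-bit word lets the two 8-bit additions run side by side, which is presumably why the figure shows the same adders also producing a "General addition output".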

Page 3

[Figure: four plots versus supply voltage VDD (V).
(a) FIR & FFT: power consumption vs. VDD. Y axis: Power (mW), ticks 2 to 23. X axis: VDD, 1.2 to 2 V. Curves: 16-, 24- and 48-bit FIR and FFT.
(b) FIR & FFT: maximum throughput vs. VDD. Y axis: Maximum Throughput (MHz), ticks 0.10 to 1000.00. X axis: VDD, 1.2 to 2 V. Curves: 16-, 24- and 48-bit FIR and FFT.
(c) Viterbi decoder: power consumption vs. VDD. Y axis: Power (mW), ticks 0 to 300. X axis: VDD, 1.1 to 2.0 V. Curves: K = 6, 7, 8, 9, each at 10 Mbps.
(d) Viterbi decoder: maximum throughput vs. VDD. Y axis: Maximum Decode Rate (Mbps), ticks 0.00 to 80.00. X axis: VDD, 0.7 to 1.9 V. Curves: K = 6, 7, 8, 9.]
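As background for reading the power-versus-VDD plots, the standard first-order model of CMOS dynamic power is given below. It is a textbook approximation, not a formula from the slides; the symbols (activity factor α, effective switched capacitance C_eff, clock frequency f) are introduced here for illustration only.

```latex
P_{\text{dyn}} \approx \alpha \, C_{\text{eff}} \, V_{DD}^{2} \, f
\qquad\Longrightarrow\qquad
E_{\text{op}} = \frac{P_{\text{dyn}}}{f} \approx \alpha \, C_{\text{eff}} \, V_{DD}^{2}
```

Under this model, lowering VDD reduces the energy per operation quadratically while also lowering the maximum clock rate, which is the kind of performance versus power trade-off the plots are meant to show.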

Results and Conclusion

• Conclusion: the multiprocessor IC achieves performance close to ASIC solutions while possessing a degree of flexibility available only in general-purpose digital signal processors
