register file access reduction by data reuse

Post on 31-Jan-2016


DESCRIPTION

REGISTER FILE ACCESS REDUCTION BY DATA REUSE. Hiroshi Takamura, Koji Inoue, Vasily G. Moshnyaga. Dept. of Electronics Engineering and Computer Science, Fukuoka University, Japan. Overview of the talk: motivation of this work, the data-reuse approach, experimental results, conclusion.

TRANSCRIPT

1

REGISTER FILE ACCESS REDUCTION BY DATA REUSE

Hiroshi Takamura

Koji Inoue

Vasily G. Moshnyaga

Dept. of Electronics Engineering and Computer Science

Fukuoka University, Japan

2

Overview of the talk

Motivation of this work

The Data-Reuse approach

Experimental Results

Conclusion

3

Motivation of this work

Extending battery lifetime.

Lowering cost.

Reducing the energy consumption of microprocessors is necessary.

4

Power distribution in Motorola’s M-core (Source: D. Gonzales, IEEE Micro, 19(4), 1999)

Register file takes 16% of the total power and 42% of the data path power!

Clock: 36%

Data path: 36%

Controller: 28%

Total: 100%

5

Register File Energy Dissipation

Energy = (Nread + Nwrite) * Eacc

Nread: total number of RF reads

Nwrite: total number of RF writes

Eacc: average energy per RF access

Our goal: to lower N (Nread and Nwrite) through architectural optimizations that exploit how operands vary between instructions.

Assumption: a read and a write consume equal energy.
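As a worked instance of the formula above, the following minimal Python sketch evaluates it for the access counts reported in the six-instruction example later in this talk (11 reads and 6 writes conventionally, 2 reads and 5 writes with the proposed technique). The function name rf_energy and the normalization Eacc = 1 are assumptions made for illustration; the slides give no absolute energy value.

```python
def rf_energy(n_read: int, n_write: int, e_acc: float = 1.0) -> float:
    """Register-file energy under the slide's model: (Nread + Nwrite) * Eacc.

    e_acc defaults to 1.0 (normalized energy per access); this value is an
    assumption, since the slides do not state an absolute figure.
    """
    return (n_read + n_write) * e_acc


# Access counts taken from the six-instruction example in the talk.
conventional = rf_energy(11, 6)
proposed = rf_energy(2, 5)
saving = 1.0 - proposed / conventional

print(f"conventional={conventional}, proposed={proposed}, saving={saving:.1%}")
```

Because reads and writes are assumed to cost the same, the relative saving depends only on the access counts: 17 accesses drop to 7, a reduction of roughly 59% for this example.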

6

Problem of conventional RF operation

add $t0, $s1, $t1 (i)
mul $t3, $s1, $t1 (ii)

[Figure: register file and ALU, connected through Rs and Rt. Legend: the first source operand, the second source operand, the destination operand.]

The values of $s1 and $t1 are not updated between (i) and (ii), yet the two instructions make 4 read accesses.

Therefore, there are unnecessary RF reads.

7

Problem of conventional RF operation

[Figure: pipeline datapath showing the register file, ALU, data memory, forwarding unit, and the ID/EX, EX/MEM, and MEM/WB pipeline registers; signal labels include $rs, $rt, rs, rd, A, B, x, x1, and x2.]

Almost all results are provided to the following instructions via the forwarding unit, so they are consumed before they are written to the RF.

So, there are unnecessary RF writes.

8

Register file access reduction approach (reuse of the same source operand value)

add $t0, $s1, $t1 (i)
mul $t3, $s1, $t1 (ii)

R-mode

[Figure: register file, control logic, and ALU, connected through Rs and Rt. Legend: the first source operand, the second source operand, the destination operand.]
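The R-mode idea can be sketched as a small Python model. It is an illustration under stated assumptions, not the authors' hardware or simulator: it tracks register names only, treats Rs and Rt as latches that keep the last operand read in each position, and assumes the latched value is still valid. The function name reads_with_r_mode is hypothetical.

```python
def reads_with_r_mode(program):
    """Count RF reads when an operand in the same position as in the
    previous instruction is reused instead of being read again.

    program is a list of (rs, rt) source-register name pairs.
    Assumption: only names are tracked; value staleness is ignored.
    """
    latch = [None, None]          # names last read through Rs and Rt
    reads = 0
    for srcs in program:
        for pos, reg in enumerate(srcs):
            if latch[pos] != reg:  # different register: read the RF
                reads += 1
                latch[pos] = reg
    return reads


# Source operands of (i) add $t0,$s1,$t1 and (ii) mul $t3,$s1,$t1.
program = [("$s1", "$t1"), ("$s1", "$t1")]
print(reads_with_r_mode(program))
```

For the two instructions on this slide the model performs 2 reads, for (i) only, where a conventional register file performs 4.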

9

Register file access reduction approach (operand swapping)

add $t0, $s1, $t1 (i)
mul $t3, $t1, $s1 (ii)

S-mode

[Figure: register file, control logic, and ALU, with two MUXes on the Rs and Rt paths. Legend: the first source operand, the second source operand, the destination operand.]
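A minimal Python sketch of operand swapping follows. It assumes, for illustration only, that the two MUXes can route the latched Rs and Rt values to the opposite ALU inputs when the next instruction names the same two registers in reverse order; otherwise it falls back to same-position reuse. Only register names are tracked, and reads_with_swap is a hypothetical name.

```python
def reads_with_swap(program):
    """Count RF reads with same-position reuse plus whole-pair swapping.

    program is a list of (rs, rt) source-register name pairs.
    Assumption: only names are tracked; value staleness is ignored.
    """
    latch = (None, None)          # names held by the Rs and Rt latches
    reads = 0
    for rs, rt in program:
        if (rt, rs) == latch:
            # Both operands are latched in the opposite order: the MUXes
            # swap them, no RF read is needed, and the latches are unchanged.
            continue
        # Otherwise read only the operands that differ in their own position.
        reads += (rs != latch[0]) + (rt != latch[1])
        latch = (rs, rt)
    return reads


# Source operands of (i) add $t0,$s1,$t1 and (ii) mul $t3,$t1,$s1.
program = [("$s1", "$t1"), ("$t1", "$s1")]
print(reads_with_swap(program))
```

In this model (ii) needs no RF read even though its operands appear in the opposite order, so the pair costs 2 reads instead of 4.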

10

RF access reduction approach (delayed operand reuse)

sub $t3, $s1, $t1 (i)
lw $t2, 20($s2) (ii)
sub $t4, $t2, $t1 (iii)

J-mode

[Figure: register file, control logic, and ALU, connected through Rs and Rt. Legend: the first source operand, the second source operand, the destination operand.]
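Delayed reuse can be sketched in Python as follows, assuming that a latch simply keeps its value while an intervening instruction (here the lw, which has no second source register) does not use it. This is a name-only illustration, not the authors' design; reads_with_delay and its delayed flag are hypothetical.

```python
def reads_with_delay(program, delayed):
    """Count RF reads with same-position reuse, optionally keeping an idle
    latch alive across instructions that do not use it (J-mode).

    program is a list of (rs, rt) pairs; rt is None when unused.
    Assumption: only names are tracked; value staleness is ignored.
    """
    latch = [None, None]
    reads = 0
    for srcs in program:
        for pos, reg in enumerate(srcs):
            if reg is None:
                if not delayed:
                    latch[pos] = None   # idle latch is not reusable later
                continue
            if latch[pos] != reg:
                reads += 1
                latch[pos] = reg
    return reads


# (i) sub $t3,$s1,$t1   (ii) lw $t2,20($s2)   (iii) sub $t4,$t2,$t1
program = [("$s1", "$t1"), ("$s2", None), ("$t2", "$t1")]
print(reads_with_delay(program, delayed=False))
print(reads_with_delay(program, delayed=True))
```

With delayed=True the $t1 operand of (iii) is taken from the latch filled by (i), so the model counts 4 reads instead of 5.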

11

Reduction of RF writing (write operation omission)

add $t1, $t1, $s1 (i)
sub $t1, $s1, $t1 (ii)

[Figure: pipeline timing diagram over clock cycles c.c.1 to c.c.6 showing the stages (IM, Reg, DM, Reg) of instructions (i) and (ii), with a useless write access marked. Legend: the first source operand, the second source operand, the destination operand.]
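A minimal Python sketch of write omission follows. It assumes (this rule is not spelled out on the slide) that a write is skipped exactly when the very next instruction overwrites the same destination register, so the intermediate value only travels through the forwarding path. count_writes is a hypothetical name.

```python
def count_writes(dests, omit=False):
    """Count RF writes for a list of destination-register names.

    With omit=True, a write is skipped when the next instruction writes the
    same register (assumed omission rule, one-instruction look-ahead).
    """
    writes = 0
    for i, rd in enumerate(dests):
        overwritten = omit and i + 1 < len(dests) and dests[i + 1] == rd
        if not overwritten:
            writes += 1
    return writes


# (i) add $t1,$t1,$s1   (ii) sub $t1,$s1,$t1
dests = ["$t1", "$t1"]
print(count_writes(dests), count_writes(dests, omit=True))
```

Here the result of (i) reaches (ii) by forwarding and is then overwritten by (ii), so only the write of (ii) has to reach the register file.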

12

An example: number of accesses in a conventional register file

add $t0, $s1, $t1 (i)
mul $t3, $s1, $t1 (ii)
add $t1, $t1, $s1 (iii)
sub $t1, $s1, $t1 (iv)
lw $t2, 20($s1) (v)
sub $t4, $s1, $t1 (vi)

Number of accesses

Mode    Nread  Nwrite
CONV    11     6
R
S
RSJ
W+RSJ

Legend: Dest., Source1, Source2

13

Operand reuse between consecutive instructions

add $t0, $s1, $t1 (i)
mul $t3, $s1, $t1 (ii)
add $t1, $t1, $s1 (iii)
sub $t1, $s1, $t1 (iv)
lw $t2, 20($s1) (v)
sub $t4, $s1, $t1 (vi)

Number of accesses

Mode    Nread  Nwrite
CONV    11     6
R       7      6
S
RSJ
W+RSJ

Legend: Dest., Source1, Source2

14

Operand swapping

add $t0, $s1, $t1 (i)
mul $t3, $s1, $t1 (ii)
add $t1, $t1, $s1 (iii)
sub $t1, $s1, $t1 (iv)
lw $t2, 20($s1) (v)
sub $t4, $s1, $t1 (vi)

Number of accesses

Mode    Nread  Nwrite
CONV    11     6
R       7      6
S       3      6
RSJ
W+RSJ

Legend: Dest., Source1, Source2

15

Operand reuse between non-consecutive instructions

add $t0, $s1, $t1 (i)
mul $t3, $s1, $t1 (ii)
add $t1, $t1, $s1 (iii)
sub $t1, $s1, $t1 (iv)
lw $t2, 20($s1) (v)
sub $t4, $s1, $t1 (vi)

Number of accesses

Mode    Nread  Nwrite
CONV    11     6
R       7      6
S       3      6
RSJ     2      6
W+RSJ

Legend: Dest., Source1, Source2

16

Write operation omission

add $t0, $s1, $t1 (i)
mul $t3, $s1, $t1 (ii)
add $t1, $t1, $s1 (iii)
sub $t1, $s1, $t1 (iv)
lw $t2, 20($s1) (v)
sub $t4, $s1, $t1 (vi)

Number of accesses

Mode    Nread  Nwrite
CONV    11     6
R       7      6
S       3      6
RSJ     2      6
W+RSJ   2      5

Legend: Dest., Source1, Source2

17

RF accesses by the proposed technique

add $t0, $s1, $t1 (i)
mul $t3, $s1, $t1 (ii)
add $t1, $t1, $s1 (iii)
sub $t1, $s1, $t1 (iv)
lw $t2, 20($s1) (v)
sub $t4, $s1, $t1 (vi)

Mode    Nread  Nwrite
CONV    11     6
R       7      6
S       3      6
RSJ     2      6
W+RSJ   2      5

Number of reads: 11 → 2

Number of writes: 6 → 5

Total number of accesses: 17 → 7
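The access counts in this example can be reproduced with a small name-tracking model, sketched below in Python. It is a reconstruction under assumptions, not the authors' simulator: the Rs and Rt latches are tracked by register name only, any newer value is assumed to arrive through forwarding, the S row is read as R plus swapping, and a write is omitted when the next instruction overwrites the same register. All function and variable names are hypothetical.

```python
def count_reads(program, reuse=False, swap=False, delayed=False):
    """Count RF reads for a list of (rs, rt) source-register name pairs.

    reuse   -- R-mode: reuse an operand latched in the same position
    swap    -- S-mode: also reuse an operand latched in the other position
    delayed -- J-mode: keep a latch across instructions that do not use it
    """
    latch = [None, None]              # names held by the Rs and Rt latches
    reads = 0
    for srcs in program:
        slot_of = {}                  # operand position -> latch slot serving it
        # Pass 1: find operands that are already latched.
        for pos, reg in enumerate(srcs):
            if reg is None:
                continue
            if reuse and latch[pos] == reg:
                slot_of[pos] = pos
            elif swap and latch[1 - pos] == reg:
                slot_of[pos] = 1 - pos
        # Pass 2: read the remaining operands from the RF into a free latch.
        for pos, reg in enumerate(srcs):
            if reg is None or pos in slot_of:
                continue
            reads += 1
            slot = pos if pos not in slot_of.values() else 1 - pos
            latch[slot] = reg
            slot_of[pos] = slot
        # Without J-mode, a latch this instruction did not use is dropped.
        if not delayed:
            for slot in (0, 1):
                if slot not in slot_of.values():
                    latch[slot] = None
    return reads


def count_writes(dests, omit=False):
    """Count RF writes; with omit=True skip a write that the next
    instruction overwrites (assumed one-instruction look-ahead)."""
    return sum(
        1
        for i, rd in enumerate(dests)
        if not (omit and i + 1 < len(dests) and dests[i + 1] == rd)
    )


# Source and destination registers of instructions (i) to (vi).
SOURCES = [("$s1", "$t1"), ("$s1", "$t1"), ("$t1", "$s1"),
           ("$s1", "$t1"), ("$s1", None), ("$s1", "$t1")]
DESTS = ["$t0", "$t3", "$t1", "$t1", "$t2", "$t4"]


def access_table():
    rsj = dict(reuse=True, swap=True, delayed=True)
    return {
        "CONV": (count_reads(SOURCES), count_writes(DESTS)),
        "R": (count_reads(SOURCES, reuse=True), count_writes(DESTS)),
        "S": (count_reads(SOURCES, reuse=True, swap=True), count_writes(DESTS)),
        "RSJ": (count_reads(SOURCES, **rsj), count_writes(DESTS)),
        "W+RSJ": (count_reads(SOURCES, **rsj), count_writes(DESTS, omit=True)),
    }


for mode, (n_read, n_write) in access_table().items():
    print(f"{mode:6} {n_read:5} {n_write:6}")
```

Under these assumptions the model yields the same five rows as the table on this slide. A model that also invalidated latches whenever their register is written would count more reads, so the agreement depends on the forwarding assumption.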

18

Experimental Evaluation

Flexible Architecture Simulation Tool

Cycle-accurate instruction simulation on a 5-stage RISC-type microprocessor (similar to MIPS)

Traces user-level instructions and records RF access information as well as the total number of operand reuses.

32-entry RF (1 write port, 2 read ports)

SPEC95 and MediaBench benchmarks: adpcm_c, adpcm_d, compress, go, mpeg_d, mpeg_e, pegwit_g, pegwit_enc, pegwit_dec

We described a simple RISC microprocessor in Verilog-HDL and synthesized it with Synopsys Design Compiler. A 0.35 μm process technology was assumed.

SUN UltraSparc-3 environment

20

Reduction rate (%) for the RF read

[Figure: bar chart, y-axis from 0 to 70%; legend: R, S, J, RSJ.]

RF access reduction: 62.7% (maximum)!

21

Reduction rate (%) for the RF write

[Figure: bar chart, y-axis from 0 to 70%; legend: 2inst, 1inst.]

RF access reduction: 60% (maximum)!

22

Reduction rate (%) for read & write

[Figure: bar chart, y-axis from 0 to 70%; benchmarks on the x-axis: ade, add, com_n, com_t, com_b, go, mpd_m, mpd_t, mpd_tv, mpd_tm, mpe, pegc, pege, pegd; legend: W+R, W+S, W+J, W+RSJ.]

RF access reduction: 61% (maximum)!

23

Area comparison

Conventional type: 100.00%
Read Reuse: 101.70%
Read & Write Reuse: 103.23%

[Figure: bar chart; y-axis: the increase rate of area (%), from 97% to 105%.]

Hardware Overhead: +3.2% (maximum)!

24

Conclusion

We proposed a technique to reduce the energy dissipation of the register file by operand reuse.

Energy savings vary by application:

Read: 62% (max), 29% (avg.)

Write: 60% (2 instr.), 55% (1 instr.)

Total: 61% (max), 39% (avg.)

Hardware overhead: Read: 1.7%, Read & Write: 3.2%

Future Work

Verification at a cycle level

Evaluation based on detailed energy models

A detailed estimation of the control circuitry overhead
