
Dependence-Based Value Prediction

Yiannakis Sazeides, University of Cyprus

yanos@ucy.ac.cy

UPC-Barcelona 17/5/2001

Yanos Sazeides, University of Cyprus

UPC 17/5/01

Motivation

• Improve Performance (reduce complexity, save power...)

• What limits performance? Dependences

• Break dependences: prediction
• Prediction exploits regularity and/or non-uniformity in the predicted information


Predicted Information

• Architectural
  – dependences: control, address, value, memory
• Non-Architectural
  – structural constraints: cache misses, bank conflicts
  – reduce hardware complexity: cache sets, shared patterns


Use of Predicted Information

• Speculative/Non-Speculative
  – Depends on whether the predicted information can directly or indirectly cause an incorrect update of architected state
    • non-speculative: branch prediction for instruction prefetching
    • speculative: branch prediction for instruction execution

• Hardware/Software


Trends

• Identify and eliminate predictability at computational levels below programming
• Predictors in almost all high-performance processors and many compilers
• Technological evolution and limitations will increase latency and make predictive techniques more important
  – deeper pipelines
  – fast processor/slow memory
  – distributed microarchitectures


Our motivation

• Can value prediction help?
• How to use value predictability?
• Holistic approach: all types of prediction, not just value prediction, and at coarser grain
• Curious to discover, understand and use the esoteric program properties that cause the predictability and non-uniformity observed in program information


This talk...

• Hypothesis: dependence information influences the predictability of program values

• Propose and evaluate various dependence-based value predictors

• Approach: theoretical and practical
• Work in progress...


Outline

• Introduction
• Background on value prediction
• Dependence Info and Predictability
• Dependence-based Value Predictors
• Results
• Future work


Value Sequences Produced by Instructions

• Basic Sequences
  – Constant: 0 0 0 0 0
  – Stride: 4 3 2 1 0
  – Non-Stride: -1 23 10 94
• Repeating Sequences (composition of basic sequences):
  – Repeated Stride: 4 3 2 1 0 4 3 2 1 0
  – Repeated Non-Stride: -1 23 10 94 -1 23
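As a minimal sketch of the three basic sequence types, the function below labels a sequence by looking at the differences between consecutive values. The name `classify` is hypothetical, and the test is applied to the whole sequence, so a repeating sequence is reported as non-stride here.

```python
def classify(values):
    """Label a value sequence as 'constant', 'stride', or 'non-stride'."""
    deltas = [b - a for a, b in zip(values, values[1:])]
    if all(d == 0 for d in deltas):
        return "constant"          # every value equals the previous one
    if len(set(deltas)) == 1:
        return "stride"            # one fixed non-zero difference
    return "non-stride"


print(classify([0, 0, 0, 0, 0]))    # constant
print(classify([4, 3, 2, 1, 0]))    # stride
print(classify([-1, 23, 10, 94]))   # non-stride
```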


Local History Value Predictors

• Computational-Based
  – compute next value by performing a computation on previous value(s)
• Context-Based
  – learn the value that follows a finite number of previous values (context) and predict that value when the context repeats
  – need to observe a context-value pair before predicting correctly

• Hybrid: Delta-Predictor


Last Value Predictor

[Diagram: the PC indexes the Value Prediction Table (VPT); the entry holds the Last Value, which is used as the Prediction]
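A minimal sketch of a last value predictor, assuming an unbounded dictionary in place of a finite, PC-indexed VPT. The class and helper names are hypothetical.

```python
class LastValuePredictor:
    def __init__(self):
        self.vpt = {}                 # PC -> last value produced at that PC

    def predict(self, pc):
        return self.vpt.get(pc)       # None until the PC has been seen once

    def update(self, pc, value):
        self.vpt[pc] = value


def accuracy(predictor, pc, values):
    """Fraction of values predicted correctly, updating after each one."""
    hits = 0
    for v in values:
        hits += predictor.predict(pc) == v
        predictor.update(pc, v)
    return hits / len(values)


print(accuracy(LastValuePredictor(), 0x400, [0, 0, 0, 0, 0]))  # 0.8
print(accuracy(LastValuePredictor(), 0x400, [4, 3, 2, 1, 0]))  # 0.0
```

The constant sequence is captured after one observation, while the stride sequence is never predicted correctly.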


Stride Predictor

[Diagram: each table entry holds the Last Value and the Stride; the Prediction is Last Value + Stride]
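A minimal sketch of a simple stride predictor, assuming the stride is replaced on every update (no confidence counter or two-delta filtering). Names are hypothetical and a dictionary stands in for the finite table.

```python
class StridePredictor:
    def __init__(self):
        self.table = {}               # PC -> (last value, stride)

    def predict(self, pc):
        if pc not in self.table:
            return None
        last, stride = self.table[pc]
        return last + stride

    def update(self, pc, value):
        # On the first update the stride is 0, i.e. last-value behaviour.
        last, _ = self.table.get(pc, (value, 0))
        self.table[pc] = (value, value - last)


def predictions(values, pc=0x400):
    p, out = StridePredictor(), []
    for v in values:
        out.append(p.predict(pc))
        p.update(pc, v)
    return out


print(predictions([4, 3, 2, 1, 0]))   # [None, 4, 2, 1, 0]
```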


Context-Based Predictor

[Diagram: the PC indexes the Context Table (CT), which holds the Value History; the history forms the VPT Index into the Value Prediction Table (VPT), which holds the Last value seen after that history and supplies the Prediction]
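A minimal sketch of the two-level context-based predictor. It assumes the VPT index is simply the pair (PC, value history) rather than a hash into a finite table; the class name and the `order` parameter are hypothetical.

```python
class ContextPredictor:
    def __init__(self, order=2):
        self.order = order
        self.ct = {}     # PC -> tuple of the last `order` values (the context)
        self.vpt = {}    # (PC, context) -> value that followed that context

    def predict(self, pc):
        return self.vpt.get((pc, self.ct.get(pc, ())))

    def update(self, pc, value):
        ctx = self.ct.get(pc, ())
        self.vpt[(pc, ctx)] = value                     # learn context -> value
        self.ct[pc] = (ctx + (value,))[-self.order:]    # shift value into history


seq = [-1, 23, 10, 94] * 3        # repeated non-stride sequence
p, hits = ContextPredictor(order=2), 0
for v in seq:
    hits += p.predict(0x400) == v
    p.update(0x400, v)
print(hits)   # 6
```

The first six values are mispredicted while the contexts are being learned; every later value is predicted correctly because each context has been observed once.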


Hybrid Predictor

[Diagram: the PC indexes the Context Table (CT), which holds the Deltas History and the Last value; the history forms the ST Index into the Stride Table (ST), which holds a stride; the Prediction combines that stride with the Last value]
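A minimal sketch of the hybrid (delta) predictor: the context is a history of deltas rather than values, and the table learns the delta that follows each context. The index (PC, delta history) and all names are assumptions made for illustration.

```python
class DeltaContextPredictor:
    def __init__(self, order=2):
        self.order = order
        self.ct = {}     # PC -> (last value, tuple of recent deltas)
        self.st = {}     # (PC, delta history) -> delta that followed

    def predict(self, pc):
        if pc not in self.ct:
            return None
        last, hist = self.ct[pc]
        stride = self.st.get((pc, hist))
        return None if stride is None else last + stride

    def update(self, pc, value):
        if pc in self.ct:
            last, hist = self.ct[pc]
            delta = value - last
            self.st[(pc, hist)] = delta
            hist = (hist + (delta,))[-self.order:]
        else:
            hist = ()                 # no delta exists before the first value
        self.ct[pc] = (value, hist)


p, hits = DeltaContextPredictor(order=2), 0
for v in range(0, 14, 2):             # 0, 2, 4, ..., 12
    hits += p.predict(0x400) == v
    p.update(0x400, v)
print(hits)   # 3
```

Because the learned quantity is a delta, the predictor keeps predicting a stride sequence for values it has never seen, which a value-context predictor cannot do.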


Causes of Predictability

• Model based on the dependence graph
• Robustness: predictability is due to program control structure and immediate values, not input data
• Predictability generation, propagation and termination
  – need to consider information beyond local scope
  – dependence information holds potential to increase accuracy of value prediction


Hypothesis/Fact

• The predictability of an instruction is determined by the information on the dependence path that leads to it

• Predict the value produced by an instruction based on the values and/or information of its predecessors



Example

[Figure: control-flow and data-flow graphs of a loop with blocks A, B (several basic blocks) and C]

      #   Opcode  Operands    GR     Output Values
A     1   srl     $2,$6,5     6      (0)^32 (1)^32
      2   sll     $2,$2,2     6      (0)^32 (4)^32
      3   addu    $2,$2,$19   6,19   (0x1002f8b0)^32 (0x1002f8b4)^32
      4   lw      $2,($2)     6,19   (0x8000bfff)^32 (0xffffffff)^32
      5   andi    $3,$6,31    6      (0,1,..,31)^2
      6   srlv    $2,$2,$3    6,19   v0, v1, ..., v62, v31
      7   andi    $2,$2,1     6,19   (1)^14 0 1 (0)^15 (1)^33
      8   beq     $2,0,C      6,19
C     9   addiu   $6,$6,1     6      1,2,..,64
      10  slti    $2,$6,64    6      (0)^63 1
      11  bne     $2,0,A      6

(v)^n denotes the value v repeated n times.


Observations

• All instructions can trace their dependences back to $6
• $6 is used as an induction variable
• Use information about $6 to get predictions


Example

• Use value in $6 to predict instruction 7:

  Value of $6            Output of Instr. 7
  0..13, 15, 31..63      1
  14, 16..30             0

• Local history with <47 previous values mispredicts
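The comparison can be sketched with the table above. This is an illustration under a stated assumption: the 64-iteration loop is executed twice with the same data, and only the second pass is scored. `instr7_output` encodes the table; the other names are hypothetical.

```python
def instr7_output(r6):
    """Output of instruction 7 as a function of $6, taken from the table."""
    return 0 if r6 == 14 or 16 <= r6 <= 30 else 1


def run(passes=2):
    by_r6 = {}        # value of $6 -> output last seen for instruction 7
    last = None       # state of a last-value predictor for instruction 7
    hits_dep = hits_last = 0
    for p in range(passes):
        for r6 in range(64):
            out = instr7_output(r6)
            if p == passes - 1:                 # score only the final pass
                hits_dep += by_r6.get(r6) == out
                hits_last += last == out
            by_r6[r6] = out
            last = out
    return hits_dep, hits_last


print(run())   # (64, 60)
```

Keyed on the value of $6, every output in the second pass is predicted; the last-value predictor misses at each of the four transitions in the output sequence.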


Dependence-Based Prediction

• Prediction based on information from the component of the predicted instruction i

  Output = PF(Component_i)

• Component:
  – backward dynamic data dependence slice
  – node info: pcs, optypes, immediates, outcomes
  – livein values (register and memory values not produced by the component)


Dependence-Based Prediction

• In general, use information from the components of previous instructions

  Output = PF(C_{i-n}, ..., C_{i-1}, C_i)


Perspective

• Virtually all predictors are functions of component approximations to values

• “Practical Limitation”: predictors can rely on a subset of components and information


Component Approximation

• Approximation accuracy depends on information used from component

• What limits approximation accuracy:
  – finite resources and aliasing
  – amount of information to be stored
  – how soon a prediction can be made


Existing Predictors

• It is “interesting” how little information predictors need to work well
• Global History Branch Predictors
  – use information from multiple previous branch components: Output = PF(C_{i-n}, ..., C_{i-1}, C_i)
  – note that the components may be unrelated (longer learning and destructive aliasing)


Dependence Information

• A lot of information to choose from
  – Which predecessors:
    • recent, recurrent, earliest, all
  – Information
    • values
    • register, memory names
    • pc, optypes, immediates
    • dependence distance
  – Propagation through registers or memory


Ideas… predict based on:

– Values of recurrent predecessors
  • values indicating location in program

[Figure: Based on Recurrent Predecessors ($6, $19)]


– Values of recurrent predecessors
  • values indicating location in program
  • no sp
  • sp with last SP value and PC selection of stride


Dependence-Based Stack Pointer Predictor

[Diagram: Stride Table (VPT), Last Value, Stride and Prediction blocks]


– Distance from predecessors
  • distance indicates coordinates in program
  • predict values not based on values
  • no sp

[Figure: Based on Dependence Distance ($6, $19)]


– Distance from predecessors
  • distance indicates coordinates in program
  • predict values not based on values
  • no sp
– Most recently known processor state
  • isolate non-determinism

[Figure: Based on Recent Predecessors ($6, $19)]



Prediction Process

• To predict an instruction
  – Construct a Dependence Record (DR)
    • an approximation of the information on the component of the instruction
  – Use the DR to obtain a prediction (directly or indirectly)


DBVP Predictor with CT

[Diagram: Dependence Record, Context Table (CT), PT Index, Prediction Table (PT) holding the Last Value, Prediction]


DBVP Register (DBVP-R)

• Component subset: those instructions that have been fetched but have not yet updated the architectural state
  – isolates the non-determinism in execution
• Information used:
  – Generate Registers (GR): livein registers (most recently known architected state)
  – Dependence Path Id (DPI): pcs, optypes, immediates (info about unknown state)


Construction of DR for an instruction (GR)

1   srl    $2,$6,5
2   sll    $2,$2,2
3   addu   $2,$2,$19
4   lw     $2,($2)
5   andi   $3,$6,31
6   srlv   $2,$2,$3
7   andi   $2,$2,1
8   beq    $2,0,C
9   addiu  $6,$6,1
10  slti   $2,$6,64
11  bne    $2,0,A

[Figure: dependence graph with liveins $6 and $19]


Construction of DR for an instruction

[Figure: the same eleven-instruction loop and its dependence graph, with liveins $6 and $19]

• Many instructions have the same GRs
• Differentiate them with the DPI
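The GR column of the example can be reproduced with a short forward pass over the loop body. This is a sketch under assumptions: the window starts at instruction 1, every register not yet written inside the window is a livein, and the load propagates dependence information through its address operand. The encoding of the loop and the function name are hypothetical.

```python
# (destination register or None, source registers) for instructions 1..11
LOOP = [
    (2, [6]),       # 1  srl   $2,$6,5
    (2, [2]),       # 2  sll   $2,$2,2
    (2, [2, 19]),   # 3  addu  $2,$2,$19
    (2, [2]),       # 4  lw    $2,($2)
    (3, [6]),       # 5  andi  $3,$6,31
    (2, [2, 3]),    # 6  srlv  $2,$2,$3
    (2, [2]),       # 7  andi  $2,$2,1
    (None, [2]),    # 8  beq   $2,0,C
    (6, [6]),       # 9  addiu $6,$6,1
    (2, [6]),       # 10 slti  $2,$6,64
    (None, [2]),    # 11 bne   $2,0,A
]


def generate_registers(window):
    """Return, per instruction, the livein registers of its backward slice."""
    producer_gr = {}    # register -> GR set of the instruction that last wrote it
    result = []
    for dest, sources in window:
        gr = set()
        for src in sources:
            # A register not written inside the window is itself a livein.
            gr |= producer_gr.get(src, {src})
        result.append(gr)
        if dest is not None:
            producer_gr[dest] = gr
    return result


for number, gr in enumerate(generate_registers(LOOP), start=1):
    print(number, sorted(gr))
```

Instructions 1, 2, 5, 9, 10 and 11 all end up with GR {6}, which is why the DPI is needed to tell them apart.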


DR Construction

• Information is encoded in instructions
• May be done incrementally off-line and stored in a table from which it can be retrieved (imprecise?)

• No investigation of implementation specifics of DR construction

• Memory instructions propagate the dependence info through address operands (can be problematic - later)


DBVP-R Predictor

[Diagram: the Instruction Dependence Record consists of the Dependence Path Id (pcs, optypes, immediates etc.) and the Generate Registers; each Reg Num indexes the Register History Table (RHT); the RHT contents and the Dependence Path Id form the VPT Index into the Value Prediction Table (VPT), which supplies the Prediction]

History Table indexed with register names contains architected state
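A minimal sketch of the DBVP-R lookup, assuming the VPT index is the pair (DPI, architected values of the generate registers) and that dictionaries stand in for finite tables. The class, its methods and the DPI strings are hypothetical; how a real design would hash these fields is not specified in the talk.

```python
class DBVPR:
    def __init__(self):
        self.rht = {}    # register number -> most recent architected value
        self.vpt = {}    # (DPI, GR values) -> value produced by the instruction

    def _index(self, dpi, gr):
        return (dpi, tuple(self.rht.get(r) for r in sorted(gr)))

    def predict(self, dpi, gr):
        return self.vpt.get(self._index(dpi, gr))

    def train(self, dpi, gr, value):
        self.vpt[self._index(dpi, gr)] = value

    def commit(self, reg, value):
        self.rht[reg] = value          # architected state update


p = DBVPR()
p.commit(6, 5)                         # $6 = 5
p.train("srl@1", {6}, 5 >> 5)          # instruction 1: srl  $2,$6,5
p.train("andi@5", {6}, 5 & 31)         # instruction 5: andi $3,$6,31
print(p.predict("srl@1", {6}), p.predict("andi@5", {6}))   # 0 5
p.commit(6, 37)
print(p.predict("srl@1", {6}))         # None
```

Both instructions have GR {6}; the DPI keeps their entries apart. After $6 changes to a value not yet seen, no prediction is available.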


DBVP Memory (DBVP-M)

• DBVP-R ignores memory dependences
  – often no correlation between load address and value
• DBVP-M: same as DBVP-R with additional functionality and information:
  – propagation of DR through memory
    • converts a def-store-load-use dependence in a component to def-use
  – livein memory locations (generate locations, GLs):
    • converts values from memory to liveins


DR Propagation Through Memory (spilled variable)

#  OpType  Operands    GR reg  GR reg+mem  Output Values
1  add     $2,$2,1     2       2           1,2,..,64
2  st      $2,($9)     2,9     2
   ----------- intervening instructions -----------
3  ld      $2,($9)     9       2           1,2,..,64
4  slti    $2,$2,64    9       2           (0)^63 1
5  bne     $2,$0,...   9       2

[Figure: dependence graphs for GR reg and GR reg+mem, showing propagation through memory]
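Both GR columns of the spilled-variable example can be reproduced with one pass over the five instructions. This is a sketch under assumptions: a store and a load are matched when they use the same address register, which stands in for whatever memory dependence mechanism a real design would use, and $0 is ignored as a constant. The encoding and names are hypothetical.

```python
# (op, destination register or None, data source registers, address register or None)
SPILL = [
    ("add",  2,    [2], None),   # 1 add  $2,$2,1
    ("st",   None, [2], 9),      # 2 st   $2,($9)
    ("ld",   2,    [],  9),      # 3 ld   $2,($9)
    ("slti", 2,    [2], None),   # 4 slti $2,$2,64
    ("bne",  None, [2], None),   # 5 bne  $2,$0,...
]


def generate_sets(window, through_memory):
    reg_dr = {}    # register -> GR set of its in-window producer
    mem_dr = {}    # address register -> GR set of the value last stored through it
    out = []
    for op, dest, sources, addr in window:
        data = set()
        for r in sources:
            data |= reg_dr.get(r, {r})
        addr_dr = reg_dr.get(addr, {addr}) if addr is not None else set()
        if op == "st" and through_memory:
            mem_dr[addr] = data          # remember the DR of the stored value
            gr = data
        elif op == "ld" and through_memory and addr in mem_dr:
            gr = mem_dr[addr]            # def-store-load-use becomes def-use
        else:
            gr = data | addr_dr          # registers only: follow the address
        out.append(gr)
        if dest is not None:
            reg_dr[dest] = gr
    return out


print(generate_sets(SPILL, through_memory=False))
print(generate_sets(SPILL, through_memory=True))
```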


Livein Memory Locations

#  OpType  Operands    GR reg+mem  Output Values
1  ld      $2,($9)     9           0,1,2,..,63
2  add     $2,$2,1     9           1,2,3,..,64
3  st      $2,($9)     9
4  slti    $2,$2,64    9           (0)^63 1
5  bne     $2,$0,...   9

[Figure: converting a store dependence to a livein]


Generate Locations (GL) and Memory History Table (MHT)

• GL is an index into a table (MHT) with values written to memory
• Store-load dependent pairs are assigned an MHT location

• Memory Dependence Prediction mechanisms can provide the extra functionality required by DBVP-M


DBVP-M Predictor

[Diagram: the Instruction Dependence Record consists of the Dependence Path Id (pcs, optypes, immediates etc.) and the Generate Regs/Locs; each Reg Num indexes the Register History Table (RHT) and each Mem Num indexes the Memory History Table (MHT); together with the Dependence Path Id they form the VPT Index that yields the Prediction]


Differences DBVP and CB

• Information used to access tables, and table sizes
  – History table is register indexed (vs PC indexed)
    + smaller
    + may be easier to manage speculative updates
    - possibly provides predictions later
  – Smaller context is required to capture repeated behavior



Experimental Framework

• SPECINT95 benchmarks
• Study predictors with a simple delay model:
  – predict d instructions (or up to the first misprediction)
  – update
• d is also the maximum component size
• DPI: pc, optypes and immediates
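One possible reading of the delay model, as a sketch: predictions are made for groups of up to d instructions, a group ends early at its first misprediction, and the tables are updated only once the group is complete. The trace format and the tiny last-value predictor used to drive the loop are assumptions for illustration, not the setup used for the results.

```python
class LastValue:
    def __init__(self):
        self.table = {}

    def predict(self, key):
        return self.table.get(key)

    def update(self, key, value):
        self.table[key] = value


def simulate(trace, predictor, d):
    """trace: non-empty list of (key, value) pairs. Returns prediction accuracy."""
    correct = 0
    i = 0
    while i < len(trace):
        done = 0
        for key, value in trace[i:i + d]:
            done += 1
            if predictor.predict(key) != value:
                break                      # group ends at the first misprediction
            correct += 1
        for key, value in trace[i:i + done]:
            predictor.update(key, value)   # updates happen after the whole group
        i += done
    return correct / len(trace)


trace = [("a", 0), ("a", 1), ("a", 1), ("a", 1)]
print(simulate(trace, LastValue(), d=2))   # 0.5
```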


DBVP-R: DPI Information d=1, 64K entry VPT


DBVP-R: VPT Size (d=1)


DBVP-R: Sensitivity to Delay (64K VPT size)


Problems with increasing d

• Capacity: instruction in different components requires more information to be learned

[Figure: Component Redundancy, with liveins $6, $19 and $2, $19 (GR 2,6,19)]


• Capacity: instruction in different components requires more information to be learned

• Destructive aliasing: multiple instances of the same instruction get the same prediction

[Figure: multiple instances with the same GR and DPI ($6, $19)]


• Capacity: instruction in different components requires more information to be learned (destructive aliasing)

• Destructive aliasing: multiple instances of the same instruction get the same prediction

• Capacity: increasing d means more information needs to be learned


Information to be learned

[Figure: values 1,2,3,..,64; two combinations]


Information to be learned

[Figure: values 1,2,3,..,64; 64 combinations]


Solutions for increasing d

• Careful partitioning
• Transform different structure to same
• Better DPI:
  – position
  – count of outstanding instances of an instruction
  – predict only liveouts
• Predict based on other information
  – recurrence, dependence distance...


DBVP-M: MHT Size (d=1, 64K VPT size)


DBVP-M: Sensitivity to Delay


DBVP vs CB


Conclusions

• Dependence-based prediction holds potential to improve accuracy

• Use dependence information for prediction

• Propagation of dependence information through memory useful for increased accuracy


Current and Future Work

• Refinement and evaluation of dependence-based prediction methods

• Evaluation in a processor
• DR construction
• Characterize full program components
  – size, uniqueness, correlation to predictability


Other Work/Interests

• Prediction and Speculation
• Program Behavior
• Distributed Microarchitectures
• SMT
• Benchmark characterization

I had a great time - thank you all!

See you later, friends!
