
Evaluation of the Gini-index for Studying Branch Prediction Features

Veerle Desmet, Lieven Eeckhout, Koen De Bosschere

2

A simple prediction example

[Diagram: features (outlook, t°, windy, season, ...) plus past observations feed a prediction mechanism that outputs an umbrella prediction; goal = prediction accuracy of 100%]

3

A simple prediction example

• Daily prediction
• Binary prediction: yes or no
• Outcome known in the evening
• Prediction strategies:
  – No umbrella needed in summer, yes otherwise: easy, but not very accurate
  – Based on humidity and temperature: more complex, very accurate

4

Predicting

• How to improve prediction accuracy?
• Shortcomings of existing models?
  – Feature set
  – Prediction mechanism
  – Implementation limits
  – ...
• This talk: evaluation of prediction features for branch prediction

5

Program execution

• Phases during instruction execution:
  – Fetch = read the next instruction
  – Decode = analyze the instruction type and read the operands
  – Execute
  – Write Back = write the result

[Diagram: the four pipeline stages Fetch, Decode, Execute, Write Back applied to the addition R1=R2+R3: the operands 4 and 3 are read, the computation is executed, and R1 finally contains 7]

6

Pipelined architectures

Parallel versus sequential:
• A constant flow of instructions becomes possible
• Faster applications
• Limitation due to branches

[Diagram: successive instructions (R1=R2+R3, R5=R2+1, R4=R3-1, R7=2*R1, R5=R6, R1=4) advancing through the Fetch, Decode, Execute, and Write Back stages simultaneously, one stage apart per cycle]

7

Branches

[Diagram: an example program containing the conditional branch "test R1=0" that selects between two paths (R7=2*R1 versus R2=R2-1); in the pipeline, the instructions following the branch cannot be fetched until the test has been executed, leaving question marks (bubbles) in the Fetch stage]

• Branches determine the program flow or execution path
• Branches introduce 2 bubbles, hurting pipeline throughput

8

Solution

• 1 out of 8 instructions is a branch
• Waiting for the outcome of branches seriously reduces the amount of parallelism
• The number of pipeline stages keeps increasing
  – Pentium 4: up to 20 stages

Solution: predict the outcome of branches
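As a back-of-the-envelope illustration (the per-instruction stall model is a simplification assumed here; the counts come from the slides): with 1 branch per 8 instructions and 2 bubbles per branch, always waiting for the outcome costs 1/8 × 2 = 0.25 stall cycles per instruction, while a predictor with accuracy a only stalls on mispredictions, i.e. 1/8 × (1 − a) × 2 stall cycles per instruction, or 0.0125 at a = 95%.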

9

Branch prediction

• Fetch those instructions that are likely to be executed

• Correct prediction eliminates bubbles

[Diagram: the same example program, now with the branch outcome predicted: the likely path (R7=2*R1, R2=R2-1) is fetched immediately after "test R1=0", so the pipeline keeps flowing without bubbles]

10

Branch prediction

• A prediction for each branch execution
• Binary prediction: taken or not-taken
• Outcome known after the test is executed

• Prediction strategies:
  – Many predictors in the literature
  – Static versus dynamic

11

Static branch prediction

• BTFNT: Backward Taken, Forward Not Taken (sketched below)
  – Loops (e.g. for, while)
  – Cf. "no umbrella needed in summer"
• Based on the type of test in the branch
  – Branch-if-equal is mostly not-taken
  – Cf. "no umbrella needed on Sunday"
• Easy; the prediction is fixed at compile time
• Prediction accuracy: about 75%
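A minimal sketch of the BTFNT rule in Python (assuming the branch target address is available when predicting, as it is for direct branches):

    def btfnt_predict(branch_pc, target_pc):
        # Backward branches (target before the branch) typically close
        # loops and are taken; forward branches are predicted not-taken.
        return target_pc < branch_pc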

12

Dynamic branch prediction

• Bimodal
• Global
• Gshare
• Local

Simulation setup:
• SimpleScalar/Alpha
• SPEC2000 integer benchmarks
• 250M branches

13

Bimodal branch predictor

Analogy: averaging the outcomes from previous years.

[Diagram: the branch address indexes a table of saturating counters; a counter value of e.g. 2 yields the prediction "taken", and the update with the outcome (e.g. taken) raises the counter to 3]
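A minimal sketch in Python (the table size and the 2-bit counters initialized to "weakly taken" are illustrative assumptions):

    class Bimodal:
        def __init__(self, index_bits=12):
            self.mask = (1 << index_bits) - 1
            self.ctr = [2] * (1 << index_bits)  # 2-bit saturating counters

        def predict(self, pc):
            # The low bits of the branch address select a counter;
            # 2 or 3 means "taken", 0 or 1 means "not-taken".
            return self.ctr[pc & self.mask] >= 2

        def update(self, pc, taken):
            i = pc & self.mask
            # Saturate the counter at 0 and 3.
            self.ctr[i] = min(self.ctr[i] + 1, 3) if taken else max(self.ctr[i] - 1, 0)

Because the table is finite, different branches can map to the same counter, which is the aliasing problem mentioned later.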

14

Global branch predictor

Analogy: averaging the outcomes of the last days.

[Diagram: the global history (e.g. 0111) indexes the table of saturating counters; a counter of e.g. 2 yields the prediction "taken"; after the update with the outcome (e.g. taken) the counter becomes 3 and the global history shifts to 1111]
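The same sketch style, with the one difference that the counter table is indexed by the global history instead of the branch address (the history length is an assumption):

    class Global:
        def __init__(self, history_bits=12):
            self.mask = (1 << history_bits) - 1
            self.hist = 0                         # outcomes of the last branches
            self.ctr = [2] * (1 << history_bits)

        def predict(self, pc):
            return self.ctr[self.hist] >= 2

        def update(self, pc, taken):
            i = self.hist
            self.ctr[i] = min(self.ctr[i] + 1, 3) if taken else max(self.ctr[i] - 1, 0)
            # Shift the new outcome in, e.g. 0111 -> 1111 for "taken".
            self.hist = ((self.hist << 1) | int(taken)) & self.mask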

15

Gshare branch predictor

[Diagram: the branch address is XORed with the global history (e.g. 1010) to index the table of saturating counters; a counter of e.g. 2 yields the prediction "taken" and is updated with the outcome]

Used in the AMD K6.
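In the same sketch style, gshare hashes the branch address and the global history together, so one counter reflects a particular branch under a particular history (sizes again assumed):

    class Gshare:
        def __init__(self, index_bits=12):
            self.mask = (1 << index_bits) - 1
            self.hist = 0
            self.ctr = [2] * (1 << index_bits)

        def _index(self, pc):
            return (pc ^ self.hist) & self.mask   # branch address XOR history

        def predict(self, pc):
            return self.ctr[self._index(pc)] >= 2

        def update(self, pc, taken):
            i = self._index(pc)
            self.ctr[i] = min(self.ctr[i] + 1, 3) if taken else max(self.ctr[i] - 1, 0)
            self.hist = ((self.hist << 1) | int(taken)) & self.mask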

16

Local branch predictor

Analogy: record the day outcomes of previous years and average over the same day histories.

[Diagram: the branch address indexes a table of local histories (e.g. 1111), and that local history in turn indexes the table of saturating counters (e.g. 2), which yields the prediction]
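A sketch of this two-level structure (both table sizes are assumptions):

    class Local:
        def __init__(self, addr_bits=10, hist_bits=10):
            self.amask = (1 << addr_bits) - 1
            self.hmask = (1 << hist_bits) - 1
            self.hist = [0] * (1 << addr_bits)  # one history per branch entry
            self.ctr = [2] * (1 << hist_bits)   # shared saturating counters

        def predict(self, pc):
            h = self.hist[pc & self.amask]      # this branch's own history
            return self.ctr[h] >= 2

        def update(self, pc, taken):
            a = pc & self.amask
            h = self.hist[a]
            self.ctr[h] = min(self.ctr[h] + 1, 3) if taken else max(self.ctr[h] - 1, 0)
            self.hist[a] = ((h << 1) | int(taken)) & self.hmask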

17

Accuracy versus storage

[Chart: prediction accuracy (%) from 75 to 100 versus predictor size in bytes (1 to 100000, logarithmic scale) for the bimodal, global, gshare, and local predictors]

18

Branch prediction strategies

• All use a saturating counter mechanism
• All use tables of limited size
  – hence the problem of so-called aliasing
• Different prediction features
• Accuracies up to 95%

• Room for further improvement?
• What is the predictive power of the features?

19

Feature selection

• Which features are relevant?
• Fewer features
  – require less storage
  – allow faster prediction

[Diagram: feature selection filters the features before they enter the prediction mechanism that produces the prediction]

20

Systematic feature evaluation

• Feature = input to the predictor
• Power of the features themselves
  – the predictor size is not fixed
  – the prediction strategy is not fixed
• Decision trees:
  – select a feature
  – split the observations
  – recursive algorithm
  – easily understandable

21

Decision Tree Construction

Past observations (features outlook, t°, windy; outcome = umbrella):

outlook  | t°   | windy | umbrella
sunny    | high | no    | no
sunny    | low  | yes   | yes
overcast | high | no    | no
overcast | low  | no    | no
overcast | high | yes   | yes
overcast | low  | yes   | yes
rain     | low  | no    | yes
rain     | high | yes   | yes

[Diagram: the resulting prediction mechanism, a decision tree: split on outlook; rain leads to YES; sunny and overcast lead to a split on windy, where windy = yes gives YES and windy = no gives NO]
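A minimal sketch of one construction step (the data representation and function names are mine, not from the talk): each observation is a tuple of feature values ending in a 0/1 outcome, and the feature with the lowest Gini-split index, as defined on the next slides, is selected for the split:

    def gini(rows):
        # Gini(S) = 1 - p0^2 - p1^2, with pi the relative class frequency.
        p1 = sum(label for *_, label in rows) / len(rows)
        return 1 - (1 - p1) ** 2 - p1 ** 2

    def gini_split(rows, f):
        # Partition on feature f; weight each subset's Gini by its size.
        parts = {}
        for row in rows:
            parts.setdefault(row[f], []).append(row)
        return sum(len(p) / len(rows) * gini(p) for p in parts.values())

    def best_feature(rows, features):
        # The tree splits on the feature with the lowest Gini-split
        # index, then recurses on each resulting subset.
        return min(features, key=lambda f: gini_split(rows, f))

    table = [("sunny", "high", "no", 0), ("sunny", "low", "yes", 1),
             ("overcast", "high", "no", 0), ("overcast", "low", "no", 0),
             ("overcast", "high", "yes", 1), ("overcast", "low", "yes", 1),
             ("rain", "low", "no", 1), ("rain", "high", "yes", 1)]
    best = best_feature(table, [0, 1, 2])  # 0 = outlook, 1 = t°, 2 = windy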

22

Gini-index

Metric for the partition purity of a data set S:

Gini(S) = 1 − p0² − p1²

where pi is the relative frequency of class i in S.

For binary prediction: minimum 0, maximum 0.5.

The higher the Gini-index, the more difficult the set is to predict.
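For example, on the eight past observations of the previous slide (5 × yes, 3 × no):

Gini(S) = 1 − (5/8)² − (3/8)² ≈ 0.47

which is close to the maximum of 0.5: the set as a whole is hard to predict.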

23

Finding good split points

• If a data set S is split into two subsets S0 and S1 with sizes N0 and N1 (N = N0 + N1):

Ginisplit(S) = (N0/N) · Gini(S0) + (N1/N) · Gini(S1)

• The feature with the lowest Ginisplit is chosen

• Extensible to non-binary features

• Looking for features with a low Ginisplit-index, i.e. features with good predictive power
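Continuing the example: splitting the eight observations on windy gives S0 (windy = no) = {no, no, no, yes} with Gini(S0) = 1 − (3/4)² − (1/4)² = 0.375 and S1 (windy = yes) = {yes, yes, yes, yes} with Gini(S1) = 0, so

Ginisplit(S) = (4/8) · 0.375 + (4/8) · 0 ≈ 0.19

well below the 0.47 of the unsplit set: knowing windy makes the prediction much easier.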

24

Individual feature bits

[Chart: Ginisplit-index (0 to 0.5) for individual feature bits, grouped into dynamic features (global history, local history, branch address, gshare-index) and static features (target direction, branch type, ending type, successor basic block)]

25

Individual features

• Local history bits very good– perfect local history uses branch

address• Static features powerful

– non-binary– except target direction– known at compile-time

• Looking for good feature combinations...

26

Features as used in predictors

[Chart: Ginisplit-index (0 to 0.5) versus feature length (0 to 20 bits) for the global history, branch address, gshare-index, and local history]

27

Features as used in predictors

• Static features better for small lengths• Better if longer features• A few local history bits enough

• Same behaviour as accuracy curves– low Gini-index implies high accuracy

• Independent to predictor size• Independent to prediction strategy

28

Remark

• Limitation of decision trees: outliers
  – majority vote
  – clean data
• Keep the implementation in mind

Example of an outlier (identical features, conflicting outcome):

outlook | t°   | windy | umbrella
sunny   | high | no    | no
sunny   | high | no    | yes
sunny   | high | no    | no

29

Conclusion

• Modern microprocessors need accurate branch prediction
• Towards systematic predictor development
  – selecting features
  – predictive power of features
• The Gini-index is useful for studying branch prediction features
  – without fixing any predictor aspect

Thanks for Listening
