
Evaluation of the Gini-index for Studying Branch Prediction Features

Veerle Desmet, Lieven Eeckhout, Koen De Bosschere

2

A simple prediction example

[Diagram: features (outlook, t°, windy, season, ...) plus past observations feed a prediction mechanism that outputs an umbrella prediction; goal = prediction accuracy of 100%]

3

A simple prediction example

• Daily prediction
• Binary prediction: yes or no
• Outcome known in the evening
• Prediction strategies:
  – No umbrella needed in summer, yes otherwise: easy, but not very accurate
  – Based on humidity and temperature: more complex, very accurate

4

Predicting

• How to improve prediction accuracy?
• Shortcomings of existing models?
  – Feature set
  – Prediction mechanism
  – Implementation limits
  – ...
• This talk: evaluation of prediction features for branch prediction

5

Program execution

• Phases during instruction execution:
  – Fetch = read the next instruction
  – Decode = analyze the instruction type and read the operands
  – Execute
  – Write Back = write the result

[Diagram: the four pipeline stages Fetch, Decode, Execute, Write Back applied to the addition R1=R2+R3: the operands 4 and 3 are read, the computation is executed, and R1 finally contains 7]

6

Pipelined architectures

Parallel versus sequential:
• A constant flow of instructions becomes possible
• Faster applications
• Limitation due to branches

[Diagram: successive instructions (R1=R2+R3, R5=R2+1, R4=R3-1, R7=2*R1, R5=R6, R1=4) advancing through the Fetch, Decode, Execute, and Write Back stages simultaneously, one stage apart per cycle]

7

Branches

[Diagram: an example program containing the conditional branch "test R1=0" that selects between two paths (R7=2*R1 versus R2=R2-1); in the pipeline, the instructions following the branch cannot be fetched until the test has been executed, leaving question marks (bubbles) in the Fetch stage]

• Branches determine the program flow or execution path
• Branches introduce 2 bubbles, hurting pipeline throughput

8

Solution

• 1 out of 8 instructions is a branch
• Waiting for the outcome of branches seriously reduces the amount of parallelism
• The number of pipeline stages keeps increasing
  – Pentium 4: up to 20 stages

Solution: predict the outcome of branches
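As a back-of-the-envelope illustration (the per-instruction stall model is a simplification assumed here; the counts come from the slides): with 1 branch per 8 instructions and 2 bubbles per branch, always waiting for the outcome costs 1/8 × 2 = 0.25 stall cycles per instruction, while a predictor with accuracy a only stalls on mispredictions, i.e. 1/8 × (1 − a) × 2 stall cycles per instruction, or 0.0125 at a = 95%.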

9

Branch prediction

• Fetch those instructions that are likely to be executed

• Correct prediction eliminates bubbles

[Diagram: the same example program, now with the branch outcome predicted: the likely path (R7=2*R1, R2=R2-1) is fetched immediately after "test R1=0", so the pipeline keeps flowing without bubbles]

10

Branch prediction

• A prediction for each branch execution
• Binary prediction: taken or not-taken
• Outcome known after the test is executed

• Prediction strategies:
  – Many predictors in the literature
  – Static versus dynamic

11

Static branch prediction

• BTFNT: Backward Taken, Forward Not Taken (sketched below)
  – Loops (e.g. for, while)
  – Cf. "no umbrella needed in summer"
• Based on the type of test in the branch
  – Branch-if-equal is mostly not-taken
  – Cf. "no umbrella needed on Sunday"
• Easy; the prediction is fixed at compile time
• Prediction accuracy: about 75%
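A minimal sketch of the BTFNT rule in Python (assuming the branch target address is available when predicting, as it is for direct branches):

    def btfnt_predict(branch_pc, target_pc):
        # Backward branches (target before the branch) typically close
        # loops and are taken; forward branches are predicted not-taken.
        return target_pc < branch_pc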

12

Dynamic branch prediction

• Bimodal
• Global
• Gshare
• Local

Simulation setup:
• SimpleScalar/Alpha
• SPEC2000 integer benchmarks
• 250M branches

13

Bimodal branch predictor

Analogy: averaging the outcomes from previous years.

[Diagram: the branch address indexes a table of saturating counters; a counter value of e.g. 2 yields the prediction "taken", and the update with the outcome (e.g. taken) raises the counter to 3]
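A minimal sketch in Python (the table size and the 2-bit counters initialized to "weakly taken" are illustrative assumptions):

    class Bimodal:
        def __init__(self, index_bits=12):
            self.mask = (1 << index_bits) - 1
            self.ctr = [2] * (1 << index_bits)  # 2-bit saturating counters

        def predict(self, pc):
            # The low bits of the branch address select a counter;
            # 2 or 3 means "taken", 0 or 1 means "not-taken".
            return self.ctr[pc & self.mask] >= 2

        def update(self, pc, taken):
            i = pc & self.mask
            # Saturate the counter at 0 and 3.
            self.ctr[i] = min(self.ctr[i] + 1, 3) if taken else max(self.ctr[i] - 1, 0)

Because the table is finite, different branches can map to the same counter, which is the aliasing problem mentioned later.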

14

Global branch predictor

Analogy: averaging the outcomes of the last days.

[Diagram: the global history (e.g. 0111) indexes the table of saturating counters; a counter of e.g. 2 yields the prediction "taken"; after the update with the outcome (e.g. taken) the counter becomes 3 and the global history shifts to 1111]
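The same sketch style, with the one difference that the counter table is indexed by the global history instead of the branch address (the history length is an assumption):

    class Global:
        def __init__(self, history_bits=12):
            self.mask = (1 << history_bits) - 1
            self.hist = 0                         # outcomes of the last branches
            self.ctr = [2] * (1 << history_bits)

        def predict(self, pc):
            return self.ctr[self.hist] >= 2

        def update(self, pc, taken):
            i = self.hist
            self.ctr[i] = min(self.ctr[i] + 1, 3) if taken else max(self.ctr[i] - 1, 0)
            # Shift the new outcome in, e.g. 0111 -> 1111 for "taken".
            self.hist = ((self.hist << 1) | int(taken)) & self.mask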

15

Gshare branch predictor

[Diagram: the branch address is XORed with the global history (e.g. 1010) to index the table of saturating counters; a counter of e.g. 2 yields the prediction "taken" and is updated with the outcome]

Used in the AMD K6.
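In the same sketch style, gshare hashes the branch address and the global history together, so one counter reflects a particular branch under a particular history (sizes again assumed):

    class Gshare:
        def __init__(self, index_bits=12):
            self.mask = (1 << index_bits) - 1
            self.hist = 0
            self.ctr = [2] * (1 << index_bits)

        def _index(self, pc):
            return (pc ^ self.hist) & self.mask   # branch address XOR history

        def predict(self, pc):
            return self.ctr[self._index(pc)] >= 2

        def update(self, pc, taken):
            i = self._index(pc)
            self.ctr[i] = min(self.ctr[i] + 1, 3) if taken else max(self.ctr[i] - 1, 0)
            self.hist = ((self.hist << 1) | int(taken)) & self.mask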

16

Local branch predictor

Analogy: record the day outcomes of previous years and average over the same day histories.

[Diagram: the branch address indexes a table of local histories (e.g. 1111), and that local history in turn indexes the table of saturating counters (e.g. 2), which yields the prediction]
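A sketch of this two-level structure (both table sizes are assumptions):

    class Local:
        def __init__(self, addr_bits=10, hist_bits=10):
            self.amask = (1 << addr_bits) - 1
            self.hmask = (1 << hist_bits) - 1
            self.hist = [0] * (1 << addr_bits)  # one history per branch entry
            self.ctr = [2] * (1 << hist_bits)   # shared saturating counters

        def predict(self, pc):
            h = self.hist[pc & self.amask]      # this branch's own history
            return self.ctr[h] >= 2

        def update(self, pc, taken):
            a = pc & self.amask
            h = self.hist[a]
            self.ctr[h] = min(self.ctr[h] + 1, 3) if taken else max(self.ctr[h] - 1, 0)
            self.hist[a] = ((h << 1) | int(taken)) & self.hmask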

17

Accuracy versus storage

[Chart: prediction accuracy (%) from 75 to 100 versus predictor size in bytes (1 to 100000, logarithmic scale) for the bimodal, global, gshare, and local predictors]

18

Branch prediction strategies

• All use a saturating counter mechanism
• All use tables of limited size
  – hence the problem of so-called aliasing
• Different prediction features
• Accuracies up to 95%

• Room for further improvement?
• What is the predictive power of the features?

19

Feature selection

• Which features are relevant?
• Fewer features
  – require less storage
  – allow faster prediction

[Diagram: feature selection filters the features before they enter the prediction mechanism that produces the prediction]

20

Systematic feature evaluation

• Feature = input to the predictor
• Power of the features themselves
  – the predictor size is not fixed
  – the prediction strategy is not fixed
• Decision trees:
  – select a feature
  – split the observations
  – recursive algorithm
  – easily understandable

21

Decision Tree Construction

Past observations (features outlook, t°, windy; outcome = umbrella):

outlook  | t°   | windy | umbrella
sunny    | high | no    | no
sunny    | low  | yes   | yes
overcast | high | no    | no
overcast | low  | no    | no
overcast | high | yes   | yes
overcast | low  | yes   | yes
rain     | low  | no    | yes
rain     | high | yes   | yes

[Diagram: the resulting prediction mechanism, a decision tree: split on outlook; rain leads to YES; sunny and overcast lead to a split on windy, where windy = yes gives YES and windy = no gives NO]
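A minimal sketch of one construction step (the data representation and function names are mine, not from the talk): each observation is a tuple of feature values ending in a 0/1 outcome, and the feature with the lowest Gini-split index, as defined on the next slides, is selected for the split:

    def gini(rows):
        # Gini(S) = 1 - p0^2 - p1^2, with pi the relative class frequency.
        p1 = sum(label for *_, label in rows) / len(rows)
        return 1 - (1 - p1) ** 2 - p1 ** 2

    def gini_split(rows, f):
        # Partition on feature f; weight each subset's Gini by its size.
        parts = {}
        for row in rows:
            parts.setdefault(row[f], []).append(row)
        return sum(len(p) / len(rows) * gini(p) for p in parts.values())

    def best_feature(rows, features):
        # The tree splits on the feature with the lowest Gini-split
        # index, then recurses on each resulting subset.
        return min(features, key=lambda f: gini_split(rows, f))

    table = [("sunny", "high", "no", 0), ("sunny", "low", "yes", 1),
             ("overcast", "high", "no", 0), ("overcast", "low", "no", 0),
             ("overcast", "high", "yes", 1), ("overcast", "low", "yes", 1),
             ("rain", "low", "no", 1), ("rain", "high", "yes", 1)]
    best = best_feature(table, [0, 1, 2])  # 0 = outlook, 1 = t°, 2 = windy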

22

Gini-index

Metric for the partition purity of a data set S:

Gini(S) = 1 − p0² − p1²

where pi is the relative frequency of class i in S.

For binary prediction: minimum 0, maximum 0.5.

The higher the Gini-index, the more difficult the set is to predict.
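For example, on the eight past observations of the previous slide (5 × yes, 3 × no):

Gini(S) = 1 − (5/8)² − (3/8)² ≈ 0.47

which is close to the maximum of 0.5: the set as a whole is hard to predict.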

23

Finding good split points

• If a data set S is split into two subsets S0 and S1 with sizes N0 and N1 (N = N0 + N1):

Ginisplit(S) = (N0/N) · Gini(S0) + (N1/N) · Gini(S1)

• The feature with the lowest Ginisplit is chosen

• Extensible to non-binary features

• Looking for features with a low Ginisplit-index, i.e. features with good predictive power
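Continuing the example: splitting the eight observations on windy gives S0 (windy = no) = {no, no, no, yes} with Gini(S0) = 1 − (3/4)² − (1/4)² = 0.375 and S1 (windy = yes) = {yes, yes, yes, yes} with Gini(S1) = 0, so

Ginisplit(S) = (4/8) · 0.375 + (4/8) · 0 ≈ 0.19

well below the 0.47 of the unsplit set: knowing windy makes the prediction much easier.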

24

Individual feature bits

[Chart: Ginisplit-index (0 to 0.5) for individual feature bits, grouped into dynamic features (global history, local history, branch address, gshare-index) and static features (target direction, branch type, ending type, successor basic block)]

25

Individual features

• Local history bits very good– perfect local history uses branch

address• Static features powerful

– non-binary– except target direction– known at compile-time

• Looking for good feature combinations...

26

Features as used in predictors

[Chart: Ginisplit-index (0 to 0.5) versus feature length (0 to 20 bits) for the global history, branch address, gshare-index, and local history]

27

Features as used in predictors

• Static features better for small lengths• Better if longer features• A few local history bits enough

• Same behaviour as accuracy curves– low Gini-index implies high accuracy

• Independent to predictor size• Independent to prediction strategy

28

Remark

• Limitation of decision trees: outliers
  – majority vote
  – clean data
• Keep the implementation in mind

Example of an outlier (identical features, conflicting outcome):

outlook | t°   | windy | umbrella
sunny   | high | no    | no
sunny   | high | no    | yes
sunny   | high | no    | no

29

Conclusion

• Modern microprocessors need accurate branch prediction
• Towards systematic predictor development
  – selecting features
  – predictive power of features
• The Gini-index is useful for studying branch prediction features
  – without fixing any predictor aspect

Thanks for Listening
