Predicting Conditional Branches With Fusion-Based Hybrid Predictors


Post on 12-Feb-2016



Page 1: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

Predicting Conditional Branches With Fusion-Based Hybrid Predictors

Gabriel H. Loh, Yale University, Dept. of Computer Science

Dana S. Henry, Yale University, Depts. of Elec. Eng. & Comp. Sci.

This research was funded by NSF Grant MIP-9702281

Page 2: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

The Branch Prediction Problem

• 1 out of 5 instructions is a branch
• May require many cycles to resolve
  – P4 has a 20-cycle branch resolution pipeline
  – Future pipeline depths likely to increase [Sprangle02]
• Predict branches to keep pipeline full

[Figure: pipeline diagram spanning PC Compute to Branch resolution]
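As a rough back-of-the-envelope illustration of why this matters, the sketch below combines the two figures from this slide (1 in 5 instructions is a branch, 20 cycles to resolve). The misprediction rates and the function name `mispredict_cpi_overhead` are illustrative assumptions, and the model assumes every misprediction costs the full resolution depth.

```python
def mispredict_cpi_overhead(branch_fraction, mispredict_rate, penalty_cycles):
    """Extra cycles per instruction lost to branch mispredictions."""
    return branch_fraction * mispredict_rate * penalty_cycles


BRANCH_FRACTION = 1 / 5   # from the slide: 1 out of 5 instructions is a branch
PENALTY_CYCLES = 20       # from the slide: 20-cycle branch resolution

if __name__ == "__main__":
    # Hypothetical misprediction rates, chosen only for illustration.
    for rate in (0.10, 0.05, 0.02):
        overhead = mispredict_cpi_overhead(BRANCH_FRACTION, rate, PENALTY_CYCLES)
        print(f"mispredict rate {rate:.0%}: +{overhead:.2f} cycles per instruction")
```

Under these assumptions the overhead scales linearly with both the misprediction rate and the pipeline depth, which is why deeper pipelines raise the value of each point of accuracy.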

Page 3: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

Bigger Predictors = More Accurate (but bigger predictors = slower)

• Larger predictors tend to yield more accurate predictions
• Faster cycle times force smaller branch predictors
• Overriding predictor couples a small, fast predictor with a large, multi-cycle predictor [Jiménez2000]
  – Performs close to an ideal large-fast predictor
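The overriding idea can be sketched in a few lines. This is a simplified model, not the mechanism from [Jiménez2000]: the function name `overriding_fetch`, the default latencies, and the cost model (a disagreement wastes the latency gap between the two predictors) are all assumptions made for illustration.

```python
def overriding_fetch(fast_pred, slow_pred, fast_latency=1, slow_latency=4):
    """Return (final_prediction, wasted_cycles) for one branch.

    Fetch proceeds on the fast prediction. When the slow predictor later
    disagrees, its answer wins and the work fetched in between is discarded.
    Assumption: the wasted work equals the latency gap.
    """
    if fast_pred == slow_pred:
        return fast_pred, 0
    return slow_pred, slow_latency - fast_latency


if __name__ == "__main__":
    print(overriding_fetch(True, True))    # predictors agree, nothing wasted
    print(overriding_fetch(True, False))   # slow predictor overrides
```

In this model the average cost per branch is the disagreement rate times the latency gap, so the scheme pays off when the fast predictor usually agrees with the slow one.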

Page 4: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

Hybrid Predictors

• Wide variety of branch prediction algorithms available
• Hybrid combines more than one “stand-alone” or component predictor [McFarling93]:

[Figure: component predictors P1 and P2 and a Meta-Predictor, producing the Final Prediction]
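A minimal sketch of a selection-based hybrid in the style of [McFarling93] follows. The class name `SelectionHybrid`, the table size, and the initial counter value are assumptions; the component predictors themselves are left out and their predictions are passed in as booleans.

```python
class SelectionHybrid:
    """Two-component hybrid: a table of 2-bit counters picks P1 or P2."""

    def __init__(self, index_bits=10):
        self.mask = (1 << index_bits) - 1
        # Counter values 0-1 favor P1, 2-3 favor P2. Assumed start: weakly P1.
        self.meta = [1] * (1 << index_bits)

    def predict(self, pc, p1_pred, p2_pred):
        use_p2 = self.meta[pc & self.mask] >= 2
        return p2_pred if use_p2 else p1_pred

    def update(self, pc, p1_pred, p2_pred, taken):
        i = pc & self.mask
        p1_ok = p1_pred == taken
        p2_ok = p2_pred == taken
        # Move toward whichever component was right when they disagree.
        if p2_ok and not p1_ok:
            self.meta[i] = min(3, self.meta[i] + 1)
        elif p1_ok and not p2_ok:
            self.meta[i] = max(0, self.meta[i] - 1)


if __name__ == "__main__":
    hybrid = SelectionHybrid()
    pc = 0x40
    print(hybrid.predict(pc, False, True))   # follows P1 at first
    hybrid.update(pc, False, True, True)     # P2 was right, P1 was wrong
    print(hybrid.predict(pc, False, True))   # now follows P2
```

Note that the final prediction is always a copy of one component's output, so the hybrid can never be right when every component is wrong.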

Page 5: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

Multi-Hybrids

[Figure: “Multi-Hybrid” [Evers96]: P1, P2, …, Pn feed a Pr. Encoder that produces the Final Prediction]

[Figure: “Quad-Hybrid” [Evers00]: P1 and P2 with meta-predictor M1, P3 and P4 with meta-predictor M2, and M3 producing the Final Prediction]

Page 6: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

Our Idea: Prediction Fusion

[Figure: Prediction Selection over P1, P2, P3, …, Pn, contrasted with Prediction Fusion over the same P1, P2, P3, …, Pn]

Page 7: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

Early Attempt from ML

• Weighted Majority algorithm [LW94]
  – Better predictors get assigned larger weights
  – Make final prediction with larger sum
• Predictor with largest weight not always correct

[Figure: weight sums 0.487 and 0.513 for the two groups below]
P2, P6 and P7 say “not-taken”
P1, P3, P4, P5 and P8 say “taken”
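For reference, a small sketch of the Weighted Majority scheme of [LW94]: every predictor starts with weight 1, the side with the larger weight sum wins, and predictors that were wrong have their weight multiplied by a factor below 1. The class name, the value `beta=0.5`, and the tie-break toward "taken" are assumptions of this sketch.

```python
class WeightedMajority:
    """Weighted vote over n predictors with multiplicative weight decay."""

    def __init__(self, n_predictors, beta=0.5):
        self.weights = [1.0] * n_predictors
        self.beta = beta  # penalty factor for a wrong predictor (assumed value)

    def predict(self, votes):
        taken = sum(w for w, v in zip(self.weights, votes) if v)
        not_taken = sum(w for w, v in zip(self.weights, votes) if not v)
        return taken >= not_taken  # ties go to "taken" (assumption)

    def update(self, votes, outcome):
        self.weights = [w * self.beta if v != outcome else w
                        for w, v in zip(self.weights, votes)]


if __name__ == "__main__":
    wm = WeightedMajority(3)
    votes = [True, False, False]       # only the first predictor says taken
    print(wm.predict(votes))           # the two not-taken votes outweigh it
    wm.update(votes, True)
    wm.update(votes, True)
    print(wm.weights, wm.predict(votes))
```

The output is still a function of the weight sums alone, so it cannot learn that a particular combination of component outputs maps to a particular outcome.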

Page 8: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

Outline

• COLT Predictor
• Choosing parameters and components
• Performance
• Prediction distributions, component choice

Page 9: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

COLT Organization

[Figure: Branch Address and Branch History index the VMT to select a Mapping Table; the prediction bits of P1, P2, P3, …, Pn (1 0 1 … 0) index that table to give the Final Prediction]

Page 10: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

Pathological Example

[Figure: P1, P2 and P3 each predict 0]

Actual outcome = 1 (taken)

Page 11: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

Example (cont’d)

Selection:
[Figure: P1, P2 and P3 each predict 0]
Outcome is always wrong

COLT:
[Figure: the predictions 0 0 0 of P1, P2 and P3 index a mapping table in the VMT, and the selected entry gives 1]
Can recognize and remember this pattern
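The pathological case can be reproduced with a minimal sketch of the COLT lookup. This is an illustration under stated assumptions, not the authors' implementation: the class name `ColtSketch`, the way address and history bits are concatenated into the VMT index, the table dimensions, and the initial counter value are all hypothetical. The 4-bit saturating counters match the counter width reported later in the deck.

```python
class ColtSketch:
    """Fusion sketch: component predictions index a table of counters."""

    def __init__(self, n_components, addr_bits=4, hist_bits=4, counter_bits=4):
        self.addr_mask = (1 << addr_bits) - 1
        self.hist_bits = hist_bits
        self.hist_mask = (1 << hist_bits) - 1
        self.max_count = (1 << counter_bits) - 1
        self.threshold = 1 << (counter_bits - 1)
        n_tables = 1 << (addr_bits + hist_bits)
        # One mapping table per VMT index, 2^n counters per table.
        # Assumption: counters start just below the "taken" threshold.
        self.vmt = [[self.threshold - 1] * (1 << n_components)
                    for _ in range(n_tables)]

    def _locate(self, pc, history, preds):
        # Assumed index function: address bits concatenated with history bits.
        table = ((pc & self.addr_mask) << self.hist_bits) | (history & self.hist_mask)
        entry = 0
        for p in preds:                      # P1 becomes the leftmost bit
            entry = (entry << 1) | int(p)
        return table, entry

    def predict(self, pc, history, preds):
        table, entry = self._locate(pc, history, preds)
        return self.vmt[table][entry] >= self.threshold

    def update(self, pc, history, preds, taken):
        table, entry = self._locate(pc, history, preds)
        count = self.vmt[table][entry]
        self.vmt[table][entry] = (min(self.max_count, count + 1) if taken
                                  else max(0, count - 1))


if __name__ == "__main__":
    colt = ColtSketch(n_components=3)
    pc, history, preds = 0x400, 0b1010, (0, 0, 0)   # all components say not-taken
    print(colt.predict(pc, history, preds))          # not-taken at first
    colt.update(pc, history, preds, taken=True)      # actual outcome = 1
    print(colt.predict(pc, history, preds))          # pattern 000 now maps to taken
```

A selector over the same three components could only ever output 0 here, because it must copy one of them.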

Page 12: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

COLT Lookup Delay

[Figure: lookup timing diagram showing the component predictors P1, P2, …, Pn, their prediction bits (1 0 0 1 1 …), MT Select, the final Prediction, a time axis, and the critical delay]

Page 13: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

Design Choices

• # of branch address bits
• # of branch history bits
  – Together, these determine the number of mapping tables
• # of components
  – Determines the size of individual MTs
• Choice of components
  – gshare, PAs, gskewed, …
  – History length, PHT size, …
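How these choices translate into storage can be sketched with a small calculation. The formula assumes the address and history bits are concatenated to form the VMT index and that each mapping table holds one counter per combination of component predictions; the function name and the parameter values in the example are hypothetical.

```python
def colt_storage(addr_bits, hist_bits, n_components, counter_bits):
    """Return (number of mapping tables, entries per table, total bits)."""
    n_tables = 2 ** (addr_bits + hist_bits)      # set by address + history bits
    entries_per_table = 2 ** n_components        # set by the number of components
    total_bits = n_tables * entries_per_table * counter_bits
    return n_tables, entries_per_table, total_bits


if __name__ == "__main__":
    # Hypothetical configuration, chosen only to show the arithmetic.
    tables, entries, bits = colt_storage(addr_bits=4, hist_bits=6,
                                         n_components=4, counter_bits=4)
    print(tables, entries, bits // 8, "bytes")
```

Each extra component doubles the size of every mapping table, which is one reason the number of components has to be chosen together with the other parameters.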

Page 14: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

Predictor Components

• Global History
  – gshare [McFarling93]
  – Bi-Mode [Lee97]
  – Enhanced gskewed [Michaud97]
  – YAGS [Eden98]
• Local History
  – PAs [Yeh94]
  – pskewed [Evers96]
• Other
  – 2bC (bimodal) [Smith81]
  – Loop [Chang95]
  – alloyed Perceptron [Jiménez02]

History lengths optimized on test data sets
Total of 59 configurations
Sizes vary up to 64KB

Page 15: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

Huge Search Space

• 2^59 ways to choose components, times the ways to choose COLT parameters
• We use a genetic search

gene format: one bit per candidate component (…), VMT Size, history length
  bit-k = 0 means don’t include Pk
  bit-k = 1 means do include Pk
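The gene encoding described above can be sketched as follows. Only the layout (one include/exclude bit per candidate component plus the VMT size and history length) comes from the slide; the tuple representation, the function names, and the bit-flip mutation operator are assumptions, since the slides do not say which genetic operators were used.

```python
import random

N_CANDIDATES = 59  # from the slides: 59 component configurations


def decode(gene):
    """Turn a gene into the component indices and COLT parameters it encodes."""
    bits, vmt_size, history_length = gene
    chosen = [k for k, bit in enumerate(bits, start=1) if bit == 1]
    return {"components": chosen, "vmt_size": vmt_size,
            "history_length": history_length}


def mutate(gene, rng, flip_prob=0.02):
    """Flip each component bit with probability flip_prob (assumed operator)."""
    bits, vmt_size, history_length = gene
    flipped = [b ^ 1 if rng.random() < flip_prob else b for b in bits]
    return flipped, vmt_size, history_length


if __name__ == "__main__":
    rng = random.Random(0)
    gene = ([1, 0, 1] + [0] * (N_CANDIDATES - 3), 2048, 8)  # hypothetical gene
    print(decode(gene))
    print(decode(mutate(gene, rng)))
    print(2 ** N_CANDIDATES, "possible component subsets")
```

With 2^59 subsets before the COLT parameters are even considered, exhaustive evaluation is out of reach, which motivates a heuristic search.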

Page 16: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

Methodology

• SPEC2000 integer benchmarks
  – For tuning/optimization: 10M branches from test
  – For evaluation: 500M branches from train
    • Skipped first 100M branches
  – Compiled with cc -arch ev6 -O4 -fast -non_shared
• SimpleScalar simulator
  – sim-safe for trace collection
  – MASE for ILP simulations

Page 17: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

Genetic Search COLT Results

Name / Size (KB) | Components | VMT | Counter width | History length
16 | alpct(34/10), gskewed(12), gshare(8) | 2048 | 4 | 8
32 | alpct(34/10), gshare(15), gshare(9), PAs(7) | 8192 | 4 | 7
64 | alpct(40/14), gshare(16), YAGS(11), pskewed(6) | 16384 | 4 | 10
128 | alpct(40/14), alpct(38/14), gshare(16), gskewed(13), YAGS(12), PAs(8) | 16384 | 4 | 7
256 | alpct(50/18), alpct(34/10), gshare(18), Bi-Mode(16), gskewed(15), PAs(8) | 32768 | 4 | 4

Page 18: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

Overall Predictor Performance

Page 19: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

Per-Benchmark Performance

Page 20: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

ILP Performance

• Simulated CPU:
  – 6-issue
  – 20 cycle pipeline
  – Same functional units, latencies, caches as Intel P4/NetBurst microarchitecture

[Figure: predictor configurations compared, built from a 1-cycle 2bC, a 4-cycle OR alpct and a 4-cycle OR COLT (joined by “+” signs), plus an Ideal 1-cycle COLT]

Page 21: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

ILP Impact

Page 22: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

COLT Parameter Sensitivity

• Mapping table counter widths
• Number of mapping tables
• Number of history bits for VMT index

Page 23: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

Counter Width

Page 24: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

VMT Size

Page 25: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

History Length

Page 26: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

Explaining Choice of Components

• Parameter sensitivity results show the GA performed well for the COLT parameters
• Why did it choose the component predictors that it did?

Page 27: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

Classifying COLT Predictions

• We examined the (32KB) COLT config.
• For each mapping table lookup, we examine the neighboring entries:

[Figure: P1, P2, P3 and P4 predict 1, 0, 0, 1, so the lookup uses mapping table entry 1001; neighboring entries differ from it in one bit]
  entry 0001 = NT
  entry 1001 = T
  entry 1101 = T

Page 28: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

Classifying Predictions (cont’d)

32KB COLT: gshare(9), gshare(14), PAs(7), alpct(34/10)

Classes:
• easy: all neighboring entries agree
• short: only gshare(9) distinguishes
• long: only gshare(14) distinguishes
• local: only PAs(7) distinguishes
• perceptron: only alpct(34/10) distinguishes
• multi-length: mix of gshare(9), (14) or alpct
• mixed: both global and local components
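One way to read the classification is sketched below. A component "distinguishes" when flipping its bit in the mapping table index lands on an entry with a different prediction. The order in which the four components map to index bits, the treatment of alpct as a global-style component, and the function names are assumptions of this sketch.

```python
# Assumed bit order: the leftmost index bit belongs to the first component.
COMPONENTS = ["gshare(9)", "gshare(14)", "PAs(7)", "alpct(34/10)"]
SINGLE_CLASS = {"gshare(9)": "short", "gshare(14)": "long",
                "PAs(7)": "local", "alpct(34/10)": "perceptron"}


def distinguishing(table, index):
    """Components whose flipped bit leads to an entry with another prediction."""
    n = len(COMPONENTS)
    result = []
    for pos, name in enumerate(COMPONENTS):
        neighbor = index ^ (1 << (n - 1 - pos))
        if table[neighbor] != table[index]:
            result.append(name)
    return result


def classify(table, index):
    found = distinguishing(table, index)
    if not found:
        return "easy"
    if len(found) == 1:
        return SINGLE_CLASS[found[0]]
    # Assumption: any set that includes the local component counts as mixed.
    return "mixed" if "PAs(7)" in found else "multi-length"


if __name__ == "__main__":
    # Mapping table matching the earlier example: entry 0001 = NT, the rest T.
    table = [True] * 16
    table[0b0001] = False
    print(classify(table, 0b1001))   # only the first component's bit matters
    print(classify(table, 0b1111))   # every neighbor agrees
```

With the earlier example (entry 1001 = T, entry 0001 = NT, entry 1101 = T) only the leftmost bit changes the outcome, so under the assumed bit order the lookup falls in the class of its first component.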

Page 29: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

Prediction Classifications

Page 30: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

Related Work/Issues

• Alloyed history [Skadron00]
• Variable path history length [Stark98]
• Dynamic history length fitting [Juan98]
• Interference reduction [lots…]

COLT handles all of these cases*

Doesn’t support partial update policies

Page 31: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

Open Research

• Better individual components
• Augment with SBI [Manne99], agree [Sprangle97]
• Better fusion algorithms
• Hybrid fusion/selection algorithms
• Other domains (branch confidence prediction, value prediction, memory dependence prediction, instruction criticality prediction, …)

Page 32: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

Summary

• Fusion is more powerful than selection
  – Combines multiple sources of information
• Branch behavior is very varied
  – Need long, short, global and local histories, multiple simultaneous lengths and types of history
• COLT is one possible fusion-based predictor
  – Combines multiple types of information
  – Current “best” purely dynamic predictor*

Page 33: Predicting Conditional Branches With Fusion-Based Hybrid Predictors

Questions?