2D-Profiling Detecting Input-Dependent Branches with a Single Input Data Set Hyesoon Kim M. Aater Suleman Onur Mutlu Yale N. Patt HPS Research Group The University of Texas at Austin

Post on 22-Dec-2015



2D-Profiling: Detecting Input-Dependent Branches with a Single Input Data Set

Hyesoon Kim, M. Aater Suleman, Onur Mutlu, Yale N. Patt

HPS Research Group The University of Texas at Austin


Motivation

Profile-guided code optimization has become essential for achieving good performance.
  Run-time behavior ≈ profile-time behavior: Good!
  Run-time behavior ≠ profile-time behavior: Bad!


Motivation

Profiling with one input set is not enough, because a program can show different behavior with different input data sets.

Example: Performance of predicated execution is highly dependent on the input data set, because some branches behave differently with different input sets.


Input-dependent Branches

Definition: A branch is input-dependent if its misprediction rate differs by more than some Δ over different input data sets.

                              Inp. 1   Inp. 2   Inp. 1 - Inp. 2
Misprediction rate of Br. X   30%      29%      1%
Misprediction rate of Br. Y   30%      5%       25%   (input-dependent branch)

Input-dependent br. ≠ hard-to-predict br.
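The definition can be checked against the two rows of the table with a few lines of Python. This is only an illustrative sketch: the function name `is_input_dependent` is hypothetical, and Δ = 5% is taken from the value the talk uses later in its experimental methodology.

```python
def is_input_dependent(rate_a, rate_b, delta):
    """True if two misprediction rates (in percent) differ by more than delta."""
    return abs(rate_a - rate_b) > delta


DELTA = 5.0  # percent; the threshold used later in the talk's evaluation

# Misprediction rates (Inp. 1, Inp. 2) copied from the table above.
branches = {"Br. X": (30.0, 29.0), "Br. Y": (30.0, 5.0)}

labels = {
    name: is_input_dependent(r1, r2, DELTA)
    for name, (r1, r2) in branches.items()
}
# Br. X differs by 1% (not input-dependent); Br. Y differs by 25% (input-dependent).
```

Both branches have the same 30% misprediction rate on the first input, so both are equally hard to predict there, yet only Br. Y is input-dependent.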


An Example Input-dependent Branch

Example from Gap (SPEC2K): Type checking branch

    TypHandle Sum (A,B)
      If ((type(A) == INT) && (type(B) == INT)) {
        Result = A + B;
        Return Result;
      }
      Return SUM(A, B);

Train input set: A&B are integers 90% of the time; misprediction rate: 10%
Reference input set: A&B are integers 42% of the time; misprediction rate: 30%
(30% - 10%) > Δ   //input-dependent br


Predicated Execution

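Eliminate hard-to-predict branches, but fetch blocks B and C all the time.

Source code:

    if (cond) { b = 0; }
    else { b = 1; }

(normal branch code)

    p1 = (cond)
    branch p1, TARGET
    mov b, 1
    jmp JOIN
    TARGET: mov b, 0

(predicated code)

    p1 = (cond)
    (!p1) mov b, 1
    (p1) mov b, 0

[Figure: control-flow graphs over basic blocks A, B, C, and D. In the normal branch code, A branches (T/N) to C or B, which join at D; in the predicated code, A, B, and C are fetched as one straight-line sequence.]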


Predicated Code Performance vs. Branch Misprediction Rate

Converting a branch to predicated code could hurt performance if run-time misprediction rate is lower than profile-time misprediction rate.

[Figure: execution time (cycles) vs. branch misprediction rate (%, 0 to 15) for predicated code and normal branch code. Normal branch code performs better at low misprediction rates; predicated code performs better at high misprediction rates. Marked points: profile-time (input A) and run-time (input B).]
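The crossover in the plot can be illustrated with a toy cost model. The sketch below is an assumption made for illustration, not the model used in the talk: it treats predicated code as a fixed cost and branch code as a base cost plus a misprediction penalty weighted by the misprediction rate. All function names and cycle counts are hypothetical.

```python
def branch_code_cycles(misp_rate, base_cycles, misp_penalty):
    """Expected cycles of the normal branch code at a given misprediction rate."""
    return base_cycles + misp_rate * misp_penalty


def predication_helps(misp_rate, base_cycles, pred_cycles, misp_penalty):
    """True if the fixed-cost predicated code beats the branch code."""
    return pred_cycles < branch_code_cycles(misp_rate, base_cycles, misp_penalty)


# Hypothetical costs, chosen only to show the effect.
BASE, PRED, PENALTY = 3.0, 5.0, 30.0

# Profile-time input: 9% mispredictions, so the compiler predicates the branch.
profile_decision = predication_helps(0.09, BASE, PRED, PENALTY)

# Run-time input: 4% mispredictions, so the predicated code is now the slower one.
run_time_benefit = predication_helps(0.04, BASE, PRED, PENALTY)
```

Under these assumed numbers the break-even rate is (5 - 3) / 30, about 6.7%: a decision made at a 9% profile-time rate turns into a loss at a 4% run-time rate.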


Predicated Code Performance vs. Input Set

[Figure: execution time normalized to non-predicated code (0 to 1.2) for gzip, vpr, gcc, mcf, crafty, parser, perlbmk, gap, vortex, bzip2, and twolf, with run-time inputs A, B, and C; the non-predicated baseline is marked. Annotated values: 9%, -4%, -16%, 1%. Measured on an Itanium-II machine.]

Predicated execution loses performance because of input-dependent branches.


If We Know a Branch is Input-Dependent

May not convert it to predicated code.
  May convert it to a wish branch. [Kim et al. Micro’05]

May not perform other compiler optimizations or may perform them less aggressively.
  Hot-path/trace/superblock-based optimizations [Fisher’81, Pettis’90, Hwu’93, Merten’99]


Our Goal

Identify input-dependent branches by using a single input set for profiling


Talk Outline

Motivation
2D-profiling Mechanism
Experimental Results
Conclusion


Key Insight of 2D-profiling

Phase behavior in prediction accuracy is a good indicator of input dependence.

[Figure: two plots of branch prediction accuracy (0 to 1) vs. time (in terms of number of executed instructions x 100M, 0 to 1500): one for an input-dependent branch, whose accuracy moves through distinct phases (phase 1, phase 2, phase 3), and one for an input-independent branch.]


Traditional Profiling

[Figure: prediction accuracy (pr. Acc) over time for two branches, brA and brB, each with its MEAN pr.Acc marked.]

MEAN pr.Acc(brA) = MEAN pr.Acc(brB)
behavior of brA = behavior of brB


2D-profiling

[Figure: prediction accuracy (pr. Acc) over time for brA and brB, each with its MEAN pr.Acc and STD pr.Acc marked.]

MEAN pr.Acc(brA) = MEAN pr.Acc(brB)
STD pr.Acc(brA) ≠ STD pr.Acc(brB)
behavior of brA ≠ behavior of brB

A: input-dependent br, B: input-independent br


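2D-profiling Mechanism

The profiler collects branch prediction accuracy information for every static branch over time.

The profile run is divided into slices (Slice 1, Slice 2, …, Slice N); slice size = M instructions. For each slice, the profiler records the mean prediction accuracy of every branch:

  Slice 1: mean Pr.Acc(brA,s1), mean Pr.Acc(brB,s1), ...
  Slice 2: mean Pr.Acc(brA,s2), mean Pr.Acc(brB,s2), ...
  ...
  Slice N: mean Pr.Acc(brA,sN), mean Pr.Acc(brB,sN), ...

Calculate MEAN (brA, brB, …), Standard deviation (brA, brB, …), PAM: Points Above Mean (brA, brB, …)

[Figure: per-slice prediction accuracy over time for brA and brB with the mean of each marked; brA has PAM: 50%, brB has PAM: 0%.]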
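The per-slice bookkeeping can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' profiler: the event format `(instruction_index, predicted_correctly)`, the function names, and the reading of PAM as the fraction of slices whose accuracy lies above the branch's mean are all assumptions, and the sketch stores every slice rather than updating statistics incrementally.

```python
from collections import defaultdict
from statistics import mean, pstdev


def slice_accuracies(events, slice_size):
    """Per-slice prediction accuracy of one static branch.

    events: iterable of (instruction_index, predicted_correctly) pairs.
    slice_size: M, the number of instructions per slice.
    Slices in which the branch never executed are skipped.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for instr_index, was_correct in events:
        s = instr_index // slice_size
        total[s] += 1
        correct[s] += int(was_correct)
    return [correct[s] / total[s] for s in sorted(total)]


def two_d_stats(accs):
    """Return (MEAN, STD, PAM) over a branch's per-slice accuracies."""
    m = mean(accs)
    sd = pstdev(accs)
    pam = sum(a > m for a in accs) / len(accs)  # share of slices above the mean
    return m, sd, pam


# Two made-up branches with the same mean accuracy but different behavior.
br_a = [0.9, 0.5, 0.9, 0.5]  # phase behavior: STD = 0.2, PAM = 0.5
br_b = [0.7, 0.7, 0.7, 0.7]  # flat: STD = 0.0, PAM = 0.0
stats_a = two_d_stats(br_a)
stats_b = two_d_stats(br_b)
```

The two made-up branches mirror the figure: a traditional profile that keeps only the mean cannot tell them apart, while STD and PAM separate them.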


Input-dependence Tests

STD&PAM-test: Identify branches that have large variations in accuracy over time (phase behavior)
  STD-test (STD > threshold): Identify branches that have large variations in the prediction accuracy over time
  PAM-test (PAM > threshold): Filter out branches that pass STD-test due to a few outlier samples

MEAN&PAM-test: Identify branches that have low prediction accuracy and some time-variation in accuracy
  MEAN-test (MEAN < threshold): Identify branches that have low prediction accuracy
  PAM-test (PAM > threshold): Identify branches that have some variation in the prediction accuracy over time

A branch is classified as input-dependent if it passes either STD&PAM-test or MEAN&PAM-test.
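The two compound tests translate directly into a small predicate. The sketch below is illustrative only: the function name, the use of one PAM threshold for both tests, and every threshold value are assumptions, since the slide does not give them.

```python
def passes_input_dependence_tests(mean_acc, std_acc, pam,
                                  mean_th, std_th, pam_th):
    """Classify a branch from its MEAN, STD, and PAM (all as fractions)."""
    pam_test = pam > pam_th
    std_and_pam = (std_acc > std_th) and pam_test    # STD&PAM-test
    mean_and_pam = (mean_acc < mean_th) and pam_test  # MEAN&PAM-test
    return std_and_pam or mean_and_pam


# Hypothetical thresholds, for illustration only.
TH = dict(mean_th=0.90, std_th=0.05, pam_th=0.05)

# Large variation over many slices: passes STD&PAM-test.
phase_branch = passes_input_dependence_tests(0.95, 0.20, 0.50, **TH)

# High STD caused by a few outlier slices: filtered out by the PAM-test.
outlier_branch = passes_input_dependence_tests(0.98, 0.10, 0.01, **TH)

# Low accuracy with some variation: passes MEAN&PAM-test.
hard_branch = passes_input_dependence_tests(0.70, 0.02, 0.30, **TH)

# Accurate and flat: passes neither test.
stable_branch = passes_input_dependence_tests(0.99, 0.00, 0.00, **TH)
```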


Talk Outline

Motivation
2D-profiling Mechanism
Experimental Results
Conclusion


Experimental Methodology

Profiler: PIN, a binary instrumentation tool
Benchmarks: SPEC 2K INT
Input sets
  Profiler: Train input set
  Input-dependent branches: Reference input set and train/other extra input sets
Input-dependent branch: misprediction rate of the branch changes more than Δ = 5% when input data set changes
  Different Δ are examined in our TechReport [reference 11].
Branch predictors
  Profiler: 4KB Gshare, Machine: 4KB Gshare
  Profiler: 4KB Gshare, Machine: 16KB Perceptron (in paper)
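For readers unfamiliar with the predictor being modeled, here is a minimal software gshare sketch in Python. It shows the general technique only and is not the profiler's code: the class name, the 14-bit index (assuming "4KB" means 16K two-bit counters), a history length equal to the index width, and the initial counter state are all assumptions.

```python
class Gshare:
    """Gshare: 2-bit saturating counters indexed by PC XOR global history."""

    def __init__(self, index_bits=14):
        # Assumption: 2**14 counters x 2 bits = 32 Kbit = 4 KB of state.
        self.mask = (1 << index_bits) - 1
        self.counters = [2] * (1 << index_bits)  # start at "weakly taken"
        self.history = 0

    def predict_and_update(self, pc, taken):
        """Predict one branch, train on its outcome, return True if correct."""
        idx = (pc ^ self.history) & self.mask
        prediction = self.counters[idx] >= 2
        if taken:
            self.counters[idx] = min(3, self.counters[idx] + 1)
        else:
            self.counters[idx] = max(0, self.counters[idx] - 1)
        # Shift the outcome into the global history register.
        self.history = ((self.history << 1) | int(taken)) & self.mask
        return prediction == taken


# A branch at a hypothetical address that is never taken.
predictor = Gshare()
results = [predictor.predict_and_update(0x400123, False) for _ in range(8)]
```

A profiler would feed the stream of correct/incorrect outcomes from such a model, per static branch, into the per-slice accuracy counts.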


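Evaluation Metrics

Coverage and Accuracy for input-dependent branches

A: Actual input-dependent br.
B: Predicted input-dependent br. (2D-profiler)
A ∩ B: Correctly predicted input-dependent br.

$$\mathrm{COV} = \frac{\text{Correctly Predicted Input-dependent}}{\text{Actual Input-dependent}} = \frac{|A \cap B|}{|A|}$$

$$\mathrm{ACC} = \frac{\text{Correctly Predicted Input-dependent}}{\text{Predicted Input-dependent}} = \frac{|A \cap B|}{|B|}$$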
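With branches represented as sets, both metrics reduce to an intersection. A minimal sketch, assuming branches are identified by arbitrary hashable IDs; the function name, the branch IDs, and the choice to return 0.0 for an empty set are illustrative.

```python
def coverage_and_accuracy(actual, predicted):
    """COV = |A ∩ B| / |A| and ACC = |A ∩ B| / |B| for two sets of branch IDs."""
    correct = actual & predicted
    cov = len(correct) / len(actual) if actual else 0.0
    acc = len(correct) / len(predicted) if predicted else 0.0
    return cov, acc


# Made-up example: four actual input-dependent branches, three predicted.
actual = {"br1", "br2", "br3", "br4"}
predicted = {"br2", "br3", "br5"}
cov, acc = coverage_and_accuracy(actual, predicted)
# Two of four actual branches are found; two of three predictions are right.
```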


Input-dependent Branches

[Figure: % of branches that are input-dependent (0% to 100%) for bzip2, gzip, twolf, gap, crafty, and gcc, with one bar each for 2, 3, 4, 5, 6, 7, and 8 input sets.]


2D-profiling Results

[Figure: coverage and accuracy (as fractions, 0 to 1) for input-dependent branches and for input-independent branches, with one bar each for 2 to 8 input sets. Annotated values: 63%, 88%, 93%, 76%.]

Phase behavior and input-dependence are strongly correlated!


The Cost of 2D-profiling?

[Figure: normalized execution time (0 to 1.8) of the Edge, Gshare, and 2D+Gshare profilers for bzip2, gzip, twolf, gap, crafty, gcc, and AVG. Annotated values: 54%, 1%.]

2D-profiling adds only 1% overhead over modeling the branch predictor in software.
  Using a H/W branch predictor [Conte’96]


Conclusion

2D-profiling is a new profiling technique to find input-dependent characteristics by using a single input data set for profiling.
  2D-profiling uses time-varying information instead of just average data.
  Phase behavior in prediction accuracy in a profile run → input-dependent

2D-profiling accurately identifies input-dependent branches with very little overhead (1% more than modeling the branch predictor in the profiler).

Applications of 2D-profiling are an open research topic.
  Better predicated code/wish branch generation algorithms
  Detecting other input-dependent program characteristics


Questions?