princeton university electrical engineering 12th international symposium on high-performance...

93
Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006 Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques Canturk ISCI Margaret MARTONOSI

Upload: sheryl-blake

Post on 17-Jan-2016

212 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Princeton University Electrical Engineering

12th International Symposium on High-Performance Computer Architecture

HPCA-12, Austin, TX. 2006

Feb 14, 2006

Phase Characterization for Power:

Evaluating Control-Flow-Based and Event-Counter-Based Techniques

Canturk ISCI

Margaret MARTONOSI

Page 2: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi2

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

What are Program Phases? Distinct and often-recurring regions of program behavior

Ex: Vortex

0.20.40.60.8

11.2

105 149 193 237 281 325 369 413 457 501 545

Billions of Instructions

IPC

0.4

0.6

0.8

1

105 149 193 237 281 325 369 413 457 501 545

Billions of Instructions

L2

Re

fs

00.20.40.60.8

1

0 44 88 132 176 220 264 308 352 396 440

Billions of Instructions

Me

m R

efs

Page 3: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi3

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

38

42

46

50

0 44 88 132 176 220 264 308 352 396 440Billions of Instructions

Po

we

r [W

]

Power Behavior has Phases, too

Recurring intervals of distinct power behavior

Ex: Vortex

Page 4: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi4

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Phase Methods for Power

Two main approaches:

Key Question:

How do these methods perform in terms of useful representations of power phase behavior?

Several studied program characteristics

Control Flow Methods Basic Block Vectors (BBVs)

[Sherwood et al. ASPLOS’02]

Event Monitoring Techniques Performance Monitoring Counters (PMCs)

[Isci and Martonosi Micro’03]

Page 5: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi5

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Outline

Real-system experimentation framework

How do the control-flow-based and event-counter-based approaches perform in power characterization?

Reasons why the two approaches can differ

Page 6: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi6

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

PinPintoolApplication

Binary

Application

Experimental Setup

OS

Hardware

Instrumentbasic block

heads

Sample basic block

head addresses

Collect PMC event rates

Enable/Disable counters

Read/Flush power history

Enable/Disablepower input

Performance CounterHardware

External Power

Measurement via Current

Probe

OS serial device

file

Page 7: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi7

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

A {BBV,PMC,Power} Sample

0x80485200x80485540x8048570

0x80487740x804878d0x804879c

Visited basic blocks:10844

PMCs:

463832

4862349

299303

36382

Power history:

35.9W36.9W

37.2W37.5W36.5W

BBV32

5 0 15 12 44 6

Hash

1 Sample

5

0

15

13

44

6

1 BBV

0.5

0.02

0.7

1.4

0.16

1 PMCvector

37W

1 Powernumber

13

0x8048554

Page 8: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi8

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

From Sample Vectors to Phases

We have vectors as proxy for power Like vectors => like power

Cluster similar vectors together and consider them a power phase

Here: First Pivot Clustering Paper also shows agglomerative clustering

Page 9: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi9

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Evaluation Main Steps

Cluster BBV samples

Cluster PMC vectors

Compare each to true measured power

Also compare to Oracle: classify directly for power

Random: assign samples to target clusters randomly

Benchmarks: 46 benchmark-input pairs from SPEC2K and other document

creation, media and scientific applications

Page 10: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi10

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Results

0%

5%

10%

15%

20%

25%

30%

AVE(SPECint) AVE(SPECfp) AVE(OTHER) AVE(Overall)

Pe

rce

nt

Err

or

w.r

.t. A

ctu

al P

ow

er

Random

BBV

PMC

Oracle

Consistent results regardless of clustering method SPECfp < SPECint < Others following from variability of power and memory

behavior

Page 11: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi11

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Errors with respect to Bounds

Consistent results regardless of clustering method SPECfp < SPECint < Others following from variability of power and memory

behavior BBV and PMCs both improve on upper bounds, but also significant gap over

lower bound

0%

5%

10%

15%

20%

25%

30%

AVE(SPECint) AVE(SPECfp) AVE(OTHER) AVE(Overall)

Pe

rce

nt

Err

or

w.r

.t. A

ctu

al P

ow

er

Random

BBV

PMC

Oracle

BBVs 70% of Random

PMCs 40% of Random

Page 12: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi12

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Errors with respect to Bounds

Consistent results regardless of clustering method SPECfp < SPECint < Others following from variability of power and memory

behavior BBV and PMCs both improve on upper bounds, but also significant gap over

lower bound

0%

5%

10%

15%

20%

25%

30%

AVE(SPECint) AVE(SPECfp) AVE(OTHER) AVE(Overall)

Pe

rce

nt

Err

or

w.r

.t. A

ctu

al P

ow

er

Random

BBV

PMC

Oracle

Oracle 30% of BBVs

Oracle 50% of PMCs

Page 13: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi13

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Comparing PMCs to BBVs

Consistent results regardless of clustering method SPECfp < SPECint < Others following from variability of power and memory

behavior BBV and PMCs both improve on upper bounds, but also significant gap over

lower bound PMCs generally lead to less errors than BBVs

0%

5%

10%

15%

20%

25%

30%

AVE(SPECint) AVE(SPECfp) AVE(OTHER) AVE(Overall)

Pe

rce

nt

Err

or

w.r

.t. A

ctu

al P

ow

er

Random

BBV

PMC

Oracle PMCs achieve 40%

less errors than BBVs

Page 14: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi14

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Different Target Number of Clusters

PMCs perform relatively better for the practical range of target clusters

Relative BBV error is significantly larger than PMCs for small number of phases [1-10]

20%

30%

40%

50%

60%

70%

80%

90%

100%

0 20 40 60 80 100

AVE Error (BBV)

AVE Error (PMC)

Page 15: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi15

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Why BBV and PMC Phases Differ

Same Basic Blocks can have different power behavior

Different cache hit/miss patterns at different points

Operand dependent behavior

Different basic blocks can have similar power behavior

Different execution paths with effectively same execution characteristics

Page 16: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi16

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

1 51 101 151 201 251 301

10

20

30

40

50

60

Po

we

r [W

] H L M

1 51 101 151 201 251 301

10

20

30

40

50

60 H L M

0

0.2

0.4

0.6

0.8

1

4 12 30 51 73 94 115

1 51 101 151 201 251 301

10

20

30

40

50

60 H L M

0

0.2

0.4

0.6

0.8

1

4 12 30 51 73 94 115

Instructions [xBillion]

L2FPU

Effectively Same Execution Mesh: Various computationally similar tasks

Lead to many control-flow phases, not binding to application behavior

M1 M2 M3 M1 M2 M3 M1 M2 M3

BB

V P

atte

rns

PM

C E

ven

ts

Page 17: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi17

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Resulting Phases

For power management, too detailed control flow information can be detrimental!

1 51 101 151 201 251 30110

20

30

40

50

60

Po

we

r [W

] H L M

0

1

2

3

4

5

6

4 12 30 51 73 94 115

BBV Phases (N=5) PMC Phases (N=5)

M1 M2 M3 M1 M2 M3 M1 M2 M3

H L MDesiredPhases:

B AA C B AC B AC BBBV:

B A CPMC:

ObservedPhases:

BB

V P

att

ern

s

Page 18: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi18

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Potentials & Challenges Control flow (BBVs):

Perfect repeatability Architectural independence Detail at program level

Runtime applicability BBV phases ≢ power phases No physical binding to power

Event counters (PMCs):

Runtime monitoring Strong relation to power

Imperfect repeatability Lack of detail

Combining the strengths of two sides? Mutual information, but direct combination of vectors does not help! Future direction: Consider in terms of hierarchy/cooperation

Page 19: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi19

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Conclusions Phase characterizations with control flow and event counter

features can provide insight to application power behavior on real systems 2-6X better characterization than upper bounds

Two features can suggest significantly different power phase characterizations under different scenarios

PMC based approaches generally provide a better proxy to changes in power behavior 40% less errors than BBVs

Resulting experimental framework and observations can help guide phase-oriented characterization and system adaptation work on real systems

Page 20: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi22

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Thanks!

Page 21: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi23

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

EXTRAS 1.1) Why care about phases examples 1.2) Why care about pwr phases examples 1.3) What are different features that prev studies looked at? 2) Experiment setup details 3) Illustration of BBV, PMC & Power collection methodology 3.1) Similarity/Vector Distance concept 3.2) Error computation 3.3) Tracked Events 3.6) Results for agglomerative 3.7) More on Sources of contradiction 3.8) Explaining BBV patterns 3.9) Dcache example to varying data locality 3.95) Stream Example to Operand Dependent Behavior 3.96) 5 Phases to effectively same execution? 3.97) BBV+PMC Hierarchy 4) Power Behavior Under Pin 5) How bad are the sampling effects on BBVs? 10) How much power phases different from IPC phases?

(What is the scatter for IPC & PWR from measurement)

Page 22: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi24

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Characterizing execution regions

1.1) Why Care About Phases?

00.10.20.30.40.50.60.70.80.9

1

5 10 15 20 25Time [s]

E1 E2 E3 E4

Page 23: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi25

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

0

0.2

0.4

0.6

0.8

1

2 7 12Time [s]

Store Refs

Load Refs

Load Misses

Store Misses

Committed Instrns

1.1) Why Care About Phases? Characterizing execution regions

Managing dynamic adaptation

OFFON

Page 24: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi26

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

3 8 13Time [s]

Load Refs

Store Misses

1.1) Why Care About Phases? Characterizing execution regions

Managing dynamic adaptation

Use current phase/behavior to predict future behavior

Page 25: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi27

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

1.2) Why Care About Power Phases?

Useful for: Guiding power budget / temperature limit management

40

45

50

10 54 98

35

45

55

65

75

10 54 98

Slow down!

Power [W] Temp. [oC]

Time [s] Time [s]

Uncontrolled T

Enforced T

I.e. Montecito/Foxton

Page 26: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi28

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

1.2) Why Care About Power Phases?

Useful for: Guiding power budget / temperature limit management Power/Temperature aware scheduling

Time [s]

Po

wer

[W

]

[Bellosa et al. COLP’03]

Page 27: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi29

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

1.2) Why Care About Power Phases?

Useful for: Guiding power budget / temperature limit management Power/Temperature aware scheduling Power balancing for multiprocessor systems/activity migration

Power PowerTask1 Task2

Swap hot task

Slow down!Speed up!

Core/μP 1 Core/μP 2

Page 28: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi30

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

1.3) Evaluating Phase Methods for Power

All lead to different interesting characterizations

Several methods, looking at different similarity features

- Specific metrics (IPC, EPI)

- Hardware performance vectors

- Branch counts

- Working sets

- Basic block vectors

- Procedures

From event monitors

From (sampled) control flow

We specifically look at basic block vectors (BBVs) & performance counter based (PMC) vectors

How do these behave in terms of power representation?

Is there a dominant method or does a combination work better?

Page 29: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi31

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

2) Experimental Framework Goal: To acquire control flow, performance metric and

power behavior of workload execution at matching & controlled observation points on a real system

Control flow: Sampled PC Basic block vectors (BBVs)

Performance related events: Performance monitoring counters (PMCs) PMC vectors

Power: External measurements via current probe/DMM Verification

Page 30: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi32

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Pin

Analy

sis

Inst

rum

enta

tio

n

2) Setup

Application Binary

Performance CounterHardware

External Power

Measurement via Current

Probe

OS serial device

file

Experimental

Machine

Instrumentbasic block

heads

Sample basic block

head addresses

Collect PMC event rates

Start/stop/reset

counters

Read/Flush power history

Detach/Attachpower input

Page 31: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi33

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Pin

Experimental Setup

Application Binary

Performance CounterHardware

External Power

Measurement via Current

Probe

OS serial device

file

Experimental

Machine

Instrumentbasic block

heads

Sample basic block

head addresses

Collect PMC event rates

Start/stop/reset

counters

Read/Flush power history

Detach/Attachpower input

Page 32: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi34

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

2) Experiment Setup Details PIN kit 1795 3 level Trace instrumentation

~Every user trace: Conditional inlined trace count Every 50-200K Trace call: Sample EIP Every 5-20M Trace call:

Generate BBV & Collect PMCs & Read PWR history

Constraint: Instrumentation should not overwhelm Power variations!!

BBV Generation: Sample BBL heads hash into 32 dimensions

PMC Reading: Single rotation subset of 15 Sample via syscalls at major instrumentation & reset for next

Power Reading: Read from serial device buffer Disable device at major instrumentation & exhaust buffer

Page 33: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi35

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

3) Methodology IllustrationExecution timeline:

0x80485200x80485540x8048570

0x80487740x804878d0x804879c

Visited basic blocks:

Powerhistory:

00003

PMCs:

00005

00004

00008

00007

00003

1st level analysis:

2nd level analysis:

3rd level analysis:

Instr-n count = 5

BBV = 0 0 0 0 0 0

Page 34: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi36

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

3) Methodology IllustrationExecution timeline:

0x80485200x80485540x8048570

0x80487740x804878d0x804879c

36.5W

Visited basic blocks:

Powerhistory:

00009

PMCs:

00015

00013

00023

00040

00007

1st level analysis:

2nd level analysis:

3rd level analysis:

Instr-n count = 11

BBV = 0 0 0 0 0 0

Page 35: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi37

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

3) Methodology IllustrationExecution timeline:

0x80485200x80485540x8048570

0x80487740x804878d0x804879c

36.5W

Visited basic blocks:

Powerhistory:

00024

PMCs:

00033

00019

00043

00080

00015

1st level analysis:

2nd level analysis:

3rd level analysis:

Instr-n count = 18

BBV = 0 0 0 0 0 0

Page 36: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi38

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

3) Methodology IllustrationExecution timeline:

0x80485200x80485540x8048570

0x80487740x804878d0x804879c

37.2W

Visited basic blocks:

Powerhistory:

01024

PMCs:

34033

04519

993455

05373

00345

1st level analysis:

2nd level analysis:

3rd level analysis:

Instr-n count ~ 1M

BBV = 0 0 0 0 0 0

37.5W36.5W

2nd level analysis: H32

1 2

Page 37: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi39

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

3) Methodology IllustrationExecution timeline:

0x80485200x80485540x8048570

0x80487740x804878d0x804879c

37.2W

Visited basic blocks:

Powerhistory:

01024

PMCs:

34033

04519

993455

05373

00345

1st level analysis:

2nd level analysis:

3rd level analysis:

Instr-n count ~ 1M

BBV = 0 0 1 0 0 0

37.5W36.5W

Page 38: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi40

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

1 Sample

3) Methodology IllustrationExecution timeline:

0x80485200x80485540x8048570

0x80487740x804878d0x804879c

35.9W

Visited basic blocks:

Powerhistory:

10844

PMCs:

463832

58479

4862349

299303

36382

1st level analysis:

2nd level analysis:

3rd level analysis:

Instr-n count ~ 100M

BBV = 5 0 15 13 44 6

36.9W

37.2W37.5W36.5W

3rd level analysis:

5

0

15

13

44

6

1 BBV

0.5

0.02

0.7

1.4

0.16

1 PMCvector

37W

1 Powernumber

Page 39: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi41

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

3.1) What is the similarity of a vector? We use L1 (Manhattan) Distance

Simple Example: Consider 4 vectors,

each with 4 dimensions:

1

2

3

5

2

1

5

3

2

2

4

4

4

3

5

1

322214543

:2 &1 VectorsBetween Distance

:nCalculatio DistanceManhattan Exemplary

0 6 3 7

6 0 3 6

3 3 0 7

7 6 7 0

Log all distances in the similarity matrix:

For 2 Target Phases: {0,1,2} & {3}

Page 40: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi42

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

3.2) Error Computation For 5 Target Phases:

Representative Power:

Mean of power values for each phase

Error for a benchmark:

RMS error over all samples after phase classification, with respect to the representative power of that phase

Single number measure of how much power variation remains within each phase, averaged-nonuniformly-over all phases.

Page 41: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi43

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

3.3) Tracked EventsPMC Event Mask Description

IOQ_allocation 0x0EFE1 I/O Queue and Bus Sequence Queue allocations from all agents

BSQ_cache_ref 0x0507 L2 cache read and write accessesFSB_data_activity 0x03F Front Side Bus utilization for reading,

driving or reserving the bus.ITLB_reference 0x07 ITLB translations performeduop_queue_writes 0x07 All μops written to the μop queueTC_deliver_mode 0x038 Number of cycles the processor is

buiding traces from instruction decodeuop_queue_writes 0x04 μops written to the μop queue by

microcode ROMx87_FP_uop 0x08000 All x87 floating point μops executedLD_port_replay 0x02 Number of replays at the load portx87_SIMD_moves 0x018 Executed x87, MMX, SSE and SSE2

load, store and register move μops ST_port_replay 0x02 Number of replays at the store portbranch_retired 0x0F All branches retireduops_retired 0x03 Number of μops retired front_end_event 0x03 Number of loads and stores retireduop_type 0x06 Tags load and stores (Does not count)

Page 42: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi44

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

3.6) Results

Consistent results regardless of clustering method SPECfp < SPECint < Others following from variability of power and memory

behavior BBV and PMCs perform improve on upper bounds, but also significant gap over

lower bound PMCs generally lead to less errors than BBVs

0

2

4

6

8

10

12

AVE(SPECint) AVE(SPECfp) AVE(OTHER)

Err

or

[W]

Upper Bound

BBV

PMC

Lower Bound

0

2

4

6

8

10

12

AVE(SPECint) AVE(SPECfp) AVE(OTHER)

Err

or

[W]

Upper Bound

BBV

PMC

Lower Bound

First pivot Agglomerative(compl.)

Page 43: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi45

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

3.7) Sources of Contradiction between Control Flow and Performance Metrics Dynamic change in data locality

Sort a 10K / 10M entry (presorted/random)

More prominent with TP/DB applications

Effectively same execution Scientific/computational programs

Compute similarity between PMC / PC vector samples

Operand dependent behavior Overflow/Operand width dependent execution

More to be observed with power-aware architectural choicesi.e. Pentium M – Execution width scaling

[Wu et al. Micro’05]

[Gochman et al. ITJ Q2’03]

Page 44: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi46

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

3.7) The Reasons of Contradiction No global observation explains all cases

Memory subsystem Memory boundness

Dynamically varying data locality

Number of traversed basic blocks

Effectively same execution

Operand dependent behavior

CoD ~ 0.61

0%

2%

4%

6%

8%

10%

12%

14%

0 10000 20000 30000 40000BBL Count

BB

V E

rro

r

Page 45: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi47

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

1 51 101 151 201 251 301

10

20

30

40

50

60

Po

we

r [W

] H L M

0

0.2

0.4

0.6

0.8

1

4 12 30 51 73 94 115

Instructions [xBillion]

L2FPU

3.8) Effectively Same Execution Mesh: Various computationally similar tasks

Lead to many control-flow phases, not binding to application behavior

M1 M2 M3 M1 M2 M3 M1 M2 M3

30

5

65

32

6

62

33

4

63

0:4033:

630

...

0:6032:

620

0:5030:

650

Page 46: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi48

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

0.0%

20.0%

40.0%

60.0%

80.0%

100.0%

100% 90% 80% 70% 60% 50% 40%

Specified Hit Rate

L1

Hit

Rat

e

Expected

From Counters

3.9) Motivating Ex: Dcache Microkernel Specify L1 hit rate, generate desired hits via random linked list

traversal

A

C

M

P

Z

L1 S

ize

L2 S

ize

Mem

Siz

e

Page 47: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi49

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

3.9) Dcache: Control flow Sample PC every 1M instructions ± 100, map to basic blocks

134514480

134514490

134514500

134514510

134514520

134514530

134514540

134514550

134514560

134514570

134514580

2.5E+09 9.8E+09 1.7E+10 2.4E+10

0x8048736

0x8048753

0x8048766

0x8048773

0x804878D

L1 Intensive L2 Intensive Mem Intensive

Page 48: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi50

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

2.5E+09 1.1E+10 1.7E+10 2.4E+10

0

10

20

30

40

50

60

70

L2

MEM

IPC

PWR

3.9) Dcache: Performance counters and power

Sample PMCs every 100M instructions, collect power from current probe

L1 Intensive L2 Intensive Mem Intensive

Page 49: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi51

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

2.5 13.7 24.8 36.0 47.1 58.2 69.4 80.5 91.7

Instructions [xBillion]

0

10

20

30

40

50

Po

we

r [W

]

Overflow at iteration 261

3.95) Operand Dependent Behavior Stream: 4 repetitive operations Reaches OVF after 261 iterations

BBVsequences:

40

10

35

15

15

30

15

40 15

50

20

15

Page 50: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi52

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

3.95) Operand Dependent Behavior

Drastic change in power behavior

Not seen in control flow, but is followed by PMC vectors

Timeline shows the actual impact

0

10

20

30

40

50

60

0 40 80 120 160 200

Cycles [xBillion]

Po

we

r [W

]

0

0.5

1

1.5

2

Pe

rfo

rma

nc

e R

ate

s

PWRL2MEMIPC

Overflow at iteration 261

Page 51: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi53

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

3.96) Resulting Phases BBVs distinguish

all different regions of operation

However, the distance between the M phases still larger than the distance between H, L and M3 even for N=3

Too much granularity conceals the available information

1 51 101 151 201 251 30110

20

30

40

50

60

Po

we

r [W

] H L M

0

1

2

3

4

5

6

4 12 30 51 73 94 115

BBV Phases (N=5) PMC Phases (N=5)

M1 M2 M3 M1 M2 M3 M1 M2 M3

B DC E D E D E D

B DA C B DC B DC E

B A C

B AA C B AC B AC B

H L MDesiredPhases:

5 Phases:

PMC:

BBV:

3 Phases:

PMC:

BBV:

Page 52: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi54

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

3.97

Page 53: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi55

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

3.97) BBV+PMC Hierarchy, an EX:

30

5

6570

15

10

5

50

18

20

12

20

40

4070

15

10

5

50

18

20

12

20

40

40

30

5

6570

15

10

5

50

18

20

12

20

40

4070

15

10

5

50

18

20

12

20

40

40

BA C D B C D BA C D B C D

Page 54: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi56

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

3.97) BBV+PMC Hierarchy, an EX:

BA C D B C D BA C D B C D

PMCs: ?

BA A C

BA A B

Knowledge of BBV repetition information helps detect/predict actual recurrent behavior:

Page 55: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi57

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

3.97) Flagging ‘Problem’ Phases

Same control-flow with varying behavior can be identified to avoid false predictions based on this recurrence

30

5

65

30

5

65

30

5

65

?

Page 56: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi58

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

4) Gcc Native/Pin Powers

0

20

40

60

0 20 40 60

Time [s]

Po

wer

[W

]

0

20

40

60

0 100 200 300 400 500 600 700

Time [s]

Po

wer

[W

]

0

20

40

60

0 50 100 150 200Time [s]

Po

wer

[W

]

0

20

40

60

0 20 40 60 80Time [s]

Po

wer

[W

]

(a) Native execution power behavior without instrumentation

(b) Flattened power behavior with Pin basic block instrumentation

Instrumentation dominant

Execution dominant

(c) Improved external power behavior with Pin trace instrumentation and conditional inlining

(d) Power behavior assigned to application execution by Pintool

Page 57: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi59

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

30

40

50

60

0 44 88 132 176 220 264 308 352

Po

we

r [W

]

4) Power Results with Pin Do we still have the hook on power variability?

Native From PIN

30

40

50

60

0 44 88 132 176 220 264 308 352 396 440 484

35

40

45

50

0 44 88 132 176 220 264 308 352 396

Po

we

r [W

]

35

40

45

50

0 44 88 132 176 220 264 308 352 396 440 484 528 572 616 660

30

35

40

45

50

0 12 24 36 48 60 72

Po

we

r [W

]

30

35

40

45

50

0 44 88 132 176 220

Mcf

Vortex

Gzip

Page 58: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi60

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

5) Sampling Effect on BBVs Is sampling good enough? Are they Meaningful?

Full Blown BBV SimMatrices

Our sampled & hashed BBV Simmatrices

Page 59: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

5) Similarity Matrix Example

Consider 4 vectors, each with 4 dimensions:

1

2

3

5

2

1

5

3

2

2

4

4

4

3

5

1

0 6 3 7

6 0 3 6

3 3 0 7

7 6 7 0

322214543

:2 &1 VectorsBetween Distance

:nCalculatio DistanceManhattan Exemplary

Log all distances in the similarity matrix

0 6 3 7

6 0 3 6

3 3 0 7

7 6 7 0

Color-scale from black to white (only for upper diagonal)

Page 60: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi62

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

5) Interpreting Similarity Matrix Plot

Level of darkness at any location (r,c) shows the amount of similarity between vectors –samples– r & c.

i.e. 0 & 2

All samples are perfectly similar to themselves

All (r,r) are black

Vertically above the diagonal shows similarity of the sample at the diagonal to previous samples

i.e. 1 vs. 0

Horizontally right of the diagonal shows similarity of the sample at the diagonal to future samples

i.e. 1 vs. 2,3

Page 61: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi63

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

A B C D

AB

CD

0

50

100

150

200

250

300

350

400#

of

Ph

as

e P

air

s a

t (X

,Y)

c

5) Full vs. sampled BBV Phases Comparison with Gzip

Sampled BBV Phases

Full-blown BBV Phases

Page 62: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi64

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

10) Why Need Power Phases? (Power ≡? IPC)

How different is characterization for IPC from power?

Page 63: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi65

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

10) Why Need Power Phases? (Power ≡? IPC)

Bzip2-graphic: very strong relation between power and IPC domains

0

0.5

1

1.5

2

2.5

20 30 40 50 60 70

Power [W]

IPC

bzip2

Page 64: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi66

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

10) Why Need Power Phases? (Power ≡? IPC)

Swim: Still strong relation , but the power and IPC domains across benchmarks are different (different relation)

0

0.5

1

1.5

2

2.5

20 30 40 50 60 70

Power [W]

IPC

swim

Page 65: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi67

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

10) Why Need Power Phases? (Power ≡? IPC)

Mcf: Similar power range as bzip2 achieved with much lower IPC domains

0

0.5

1

1.5

2

2.5

20 30 40 50 60 70

Power [W]

IPC

mcf

Page 66: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi68

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

10) Why Need Power Phases? (Power ≡? IPC)

Ghostscript: Similar IPC can lead to drastically different power ranges within a benchmark at the lower end

0

0.5

1

1.5

2

2.5

20 30 40 50 60 70

Power [W]

IPC

gs

Page 67: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi69

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

10) Why Need Power Phases? (Power ≡? IPC)

Lame: Similar observation at higher end, with distinct IPC domains overlapping power

0

0.5

1

1.5

2

2.5

20 30 40 50 60 70

Power [W]

IPC

lame

Page 68: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi70

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

10) Why Need Power Phases? (Power ≡? IPC)

Art: The actual spread due to measurement is much smaller!

0

0.5

1

1.5

2

2.5

20 30 40 50 60 70

Power [W]

IPC

art

Page 69: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi71

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Ditched Slides

Page 70: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi72

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Phase Analysis & Real Systems Phases: Self-similar, mostly recurrent, execution regions

Useful for characterization, dynamic-adaptive management SimPoints [Sherwood et al., ASPLOS’02]

Multiconfigurable HW [Dhodapkar and Smith, ISCA’02]

Real systems impose additional constraints Larger granularities O(ms)

Applicability to large-scale management methods

Dynamic voltage/frequency scaling

Thermal Management

Identifying recurrence under inexact replication of repetitive behavior!

Page 71: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi73

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Phase Analysis & Real Systems Phases:

Self-similar, mostly recurrent, execution regions

Real-system experiments Long execution timescale observations Incorporating system effects / verification with real measurements

Useful for characterization, dynamic-adaptive management

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 5 10 15 20 25 30 35 40 45Time [s]

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 5 10 15 20 25 30 35 40 45Time [s]

E1 E2 E3 E4 E5

Page 72: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi74

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Phases Distinct and often-recurring regions of program behavior

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 5 10 15 20 25 30 35 40 45Time [s]

Store Refs

Load Refs

Load Misses

Store Misses

CommittedInstrns

Page 73: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi75

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

What are Program Phases? Distinct and often-recurring regions of program behavior

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

5 10 15 20 25Time [s]

Mem Refs

L3 Refs

IPC

Page 74: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi76

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Why Care About Phases?

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 5 10 15 20 25 30 35 40 45Time [s]

E1 E2 E3 E4 E5

Characterizing execution regions

Page 75: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi77

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

0

0.2

0.4

0.6

0.8

1

2 7 12Time [s]

Store Refs

Load Refs

Load Misses

Store Misses

Committed Instrns

Why Care About Phases? Characterizing execution regions

Managing dynamic adaptation

Page 76: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi78

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Why Care About Phases? Characterizing execution regions

Managing dynamic adaptation

Use current phase/behavior to predict future behavior

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

3 8 13Time [s]

Store Refs

Load Refs

Store Misses

Page 77: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi79

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Power Phases

Recurring intervals of distinct power behavior

Ex: Vortex

40

45

50

0 44 88 132 176 220 264 308 352 396 440

Power [W]

Billions of Instructions

Page 78: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi80

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Evaluating Phase Methods for Power

All lead to different interesting characterizations

We specifically look at basic block vectors (BBVs) & performance counter based (PMC) vectors

How do these behave in terms of power representation?

Is there a dominant method or does a combination work better?

Similarity Based On:

Metrics(IPC, EPI, etc)

Hardware Performance

Vectors

BBVs, Working

SetsProcedures Branches

Sampling Quanta:

Instruction/Code/Time/Energy intervals

From performance monitoring counters

From (sampled) control flow

Page 79: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi81

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Outline

Background and Definitions

Real-system experimentation framework

How do the control-flow-based and event-counter-based approaches perform in power characterization?

Reasons why the two approaches can differ

Page 80: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi82

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Experimental Framework Goal: To acquire control flow, performance metric and power

behavior of workload execution at matching & controlled observation points on a real system These will guide phase classification of control flow and event based

behavior with validation against power measurements

Control flow: From sampled PC

Will construct basic block vectors (BBVs) for each observed sample

Performance related events: From performance monitoring counters (PMCs)

Will construct PMC vectors for each sample

Power: From external measurements via current probe/DMM

Will provide the actual power behavior for each observed sample for verification

Page 81: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi83

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Experimental Setup

Application Binary

Application

OS

Hardware

Pin

Instrumentbasic block

heads

Sample basic block

head addresses

Collect PMC event rates

Start/stop/reset

counters

Read/Flush power history

Detach/Attachpower input

Performance CounterHardware

External Power

Measurement via Current

Probe

OS serial device

file

Page 82: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi84

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Pin

Experimental Setup

Application Binary

Application

OS

Hardware

Instrumentbasic block

heads

Sample basic block

head addresses

Collect PMC event rates

Enable/Disable counters

Read/Flush power history

Enable/Disablepower input

Performance CounterHardware

External Power

Measurement via Current

Probe

OS serial device

file

Page 83: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi85

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Phase Classification For each benchmark:

N sets of {BBV32,PMC15,Power}

Cluster BBV32 and PMC15 into F sets using L1-distance of vectors

Control-flow & event-counter based phases

Clustering methods: First Pivot: Simple, online method modified for a fixed target

number

Agglomerative: Complex, offline method. Link pairs of clusters based on linkage criterion

Average linkage

Complete linkage

Page 84: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi86

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Phase Classification For each benchmark: N sets of {BBV32,PMC15,Power}

Cluster BBV32 and PMC15 using L1-distance of vectors

Control-flow & event-counter based phases

Clustering methods: First Pivot, Agglomerative [Average & Complete linkage]

First Pivot:

Page 85: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi87

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

From Sample Vectors to Phases

If we knew power precisely We could divide power into ranges and classify points directly

But we have vectors as proxy for power Like vectors => like power

Cluster similar vectors together and consider them a power phase

Here: First Pivot Clustering Paper also shows agglomerative clustering

Page 86: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi88

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Evaluation Compute classification error for each method with sample

standard deviation of power

Upper and lower bounds: Oracle: classify directly for power

Random: assign samples to target clusters randomly

Benchmarks: 46 benchmark-input pairs from SPEC2K and other document

creation, media and scientific applications

Page 87: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi89

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Results

Consistent results regardless of clustering method

SPECfp < SPECint < Others following from variability of power and memory behavior

0

2

4

6

8

10

12

AVE(SPECint) AVE(SPECfp) AVE(OTHER) AVE(Overall)

Err

or

[W]

Random

BBV

PMC

Oracle

Page 88: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi90

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Evaluation and Results

Consistent results regardless of clustering method SPECfp < SPECint < Others following from variability of power and memory

behavior

0%

5%

10%

15%

20%

25%

30%

AVE(SPECint) AVE(SPECfp) AVE(OTHER) AVE(Overall)

Pe

rce

nt

Err

or

w.r

.t. A

ctu

al P

ow

er

Random

BBV

PMC

Oracle

BBVs 12.9%

PMCs 7.3%

BBVs 1.9%

PMCs 1.5%

BBVs 5.6%

PMCs 3.3%

Page 89: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi91

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Errors with respect to Bounds

Consistent results regardless of clustering method SPECfp < SPECint < Others following from variability of power and memory

behavior BBV and PMCs both improve on upper bounds, but also significant gap over

lower bound

0%

5%

10%

15%

20%

25%

30%

AVE(SPECint) AVE(SPECfp) AVE(OTHER) AVE(Overall)

Pe

rce

nt

Err

or

w.r

.t. A

ctu

al P

ow

er

Random

BBV

PMC

Oracle

BBVs 3.3X of Oracle

PMCs 1.9X of Oracle

Page 90: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi92

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Resulting Phases BBVs distinguish all

different regions of operation

However, the distance between the M phases still larger than the distance between H, L and M3 even for N=3

Too much granularity conceals the available information

1 51 101 151 201 251 30110

20

30

40

50

60

Po

we

r [W

] H L M

0

1

2

3

4

5

6

4 12 30 51 73 94 115

Ph

as

e N

o

BBV Phases (N=5) PMC Phases (N=5)

-1

0

1

2

3

4

4 12 30 51 73 94 115

Instructions [xBillion]

Ph

as

e N

o

BBV Phases (N=3) PMC Phases (N=3)

M1 M2 M3 M1 M2 M3 M1 M2 M3

Page 91: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi93

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Effectively Same Execution Mesh: Various computationally similar tasks

Lead to many control-flow phases, not binding to application behavior

1 51 101 151 201 251 301

10

20

30

40

50

60

Po

we

r [W

] H L M

0

0.2

0.4

0.6

0.8

1

4 12 30 51 73 94 115

Instructions [xBillion]

L2FPU

M1 M2 M3 M1 M2 M3 M1 M2 M3

Page 92: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi94

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Control flow (BBVs): Repeatability

Architecturally independent

Runtime applicability [sampling, mapping to BBLs]

Managing dimensions

False alarms with effectively same execution behavior

Misses on varying data locality and operand dependent behavior

Event counters (PMCs): Runtime applicability

Imperfect repeatability

Managing variable event ranges

Lack of detail

Combining the strengths of two sides? They have mutual info, but direct combination of vectors does not help!

Future direction: Consider in terms of hierarchy

Potentials & Challenges with the Phase Characterization Approaches

Page 93: Princeton University Electrical Engineering 12th International Symposium on High-Performance Computer Architecture HPCA-12, Austin, TX. 2006 Feb 14, 2006

Canturk Isci - Margaret Martonosi95

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques

[HPCA-12 ’06]

Background: Power and Phases

Runtime processor power monitoring and estimation [Micro’03] Sample PMCs to estimate powers for 22 chip components

Real measurement feedback for tuning and verification

Workload power phase behavior with power vectors [WWC’03] Consider power estimations as power vectors

Characterize “power phases” based on vector similarity