SCALABLY VERIFIABLE DYNAMIC POWER MANAGEMENT Opeoluwa Matthews, Meng Zhang, and Daniel J. Sorin 20th International Symposium on High Performance Computer Architecture (HPCA) Orlando, Florida, February 17-19, 2014 - Krishnaprasad K and Yashas Krishna

SOME BACKGROUND

- Power management is one of the biggest problems in current systems
  - How much power each component gets
  - When power is given
  - How the system gets power when needed
  - And so on
- Types of power management
  - Static power management: pre-allocate power to each component
  - Dynamic power management (DPM): allocate power when needed, e.g. dynamic voltage/frequency scaling

PROBLEMS WITH DPM

- Designing DPM is difficult because of the increasing scale of computer systems
  - Cores per processor are increasing
  - Processors per system are increasing
- Challenges to efficient DPM
  - Scalability: scalable to large-scale systems
  - Verifiability: verify correctness in all situations
- Scalability affects verifiability
- But there are no automated methods to verify DPM

IMPORTANT FACTORS IN DPM

- Scalability factor
  - Scalability is proportional to power consumption
  - High scale = high power requirement
  - Low scale = low power requirement
- Verification of DPM and its benefits
  - Find bugs in DPM
  - Prove correctness of DPM
- If not done
  - Component overheating
  - System failure and damage
- So a scalably verifiable DPM is needed

CONTENTS

- Existing system model and issues
- Introducing a new DPM system: Fractal DPM
- How is verification possible in the new system?
- Fractal DPM vs. performance: tradeoffs
- New system evaluation
- Implementation strategy
- Comparison to prior works
- Conclusion

INITIAL SYSTEM MODEL

- DPM model
  - Dynamically allocate power to each component Ci
  - Power allotted is proportional to current performance Xi
  - Xi = function of (current power allocation Pi, current unconstrained performance Xmaxi)
- Initial setting
  - Set a power budget
  - Allot power to components so that the budget is satisfied
  - Maximize Xi subject to Sum(Pi) < Budget
- Power-performance model
  - 5 possible power settings for each Ci: Low (L), Medium_Low (ML), Medium (M), Medium_High (MH), High (H)
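The model above can be made concrete with a small sketch. Everything numeric here is an assumption for illustration: the slides give no power values per setting, the `perf` function is only a stand-in for "Xi = function of (Pi, Xmaxi)", and maximizing the sum of Xi is one reading of "Maximize Xi". The brute-force search is not how a real DPM scheme would allocate power; it only shows what the optimization problem looks like.

```python
from enum import IntEnum
from itertools import product


class Setting(IntEnum):
    """The five power settings named on the slide."""
    L = 0
    ML = 1
    M = 2
    MH = 3
    H = 4


# Hypothetical power cost per setting (arbitrary units); not from the slides.
POWER = {Setting.L: 1.0, Setting.ML: 2.0, Setting.M: 3.0,
         Setting.MH: 4.0, Setting.H: 5.0}


def perf(setting, xmax):
    """Assumed stand-in for Xi = f(Pi, Xmaxi): grows with power, capped at Xmaxi."""
    return min(xmax, POWER[setting])


def best_allocation(xmaxes, budget):
    """Brute force: maximize the sum of Xi subject to Sum(Pi) < budget."""
    best, best_perf = None, -1.0
    for alloc in product(Setting, repeat=len(xmaxes)):
        total_power = sum(POWER[s] for s in alloc)
        if total_power >= budget:
            continue  # violates the budget
        total_perf = sum(perf(s, x) for s, x in zip(alloc, xmaxes))
        if total_perf > best_perf:
            best, best_perf = alloc, total_perf
    return best, best_perf


if __name__ == "__main__":
    # One demanding component (Xmax = 5) and one that cannot use extra power (Xmax = 1).
    print(best_allocation([5.0, 1.0], budget=7.0))
```

The search visits every combination of settings, which is exactly the growth that makes exhaustive approaches impractical at scale.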

INITIAL MODEL : ISSUES

- Design using existing tools
  - Fully automated formal verification methodologies
  - Tool: MurΦ model checker
    - Exhaustive state space search
    - Checks whether the invariant is satisfied
- Issue: state space explosion
  - As the number of components Ci increases, the number of states increases
  - Infeasible to traverse all states
  - E.g. 5 components with 5 settings each means 5^5 states
- Typical solution: check a small-scale system and, if it satisfies the invariant, assume the large-scale system also satisfies it
  - Need not always be true
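To put numbers on the state space explosion, the sketch below counts only the combinations of power settings across components. This is an assumption that understates the real problem: an actual model checker state would also include in-flight messages and controller state, so the true count is larger.

```python
def joint_states(num_components, num_settings=5):
    """Number of ways to assign one of `num_settings` settings to each component."""
    return num_settings ** num_components


if __name__ == "__main__":
    for n in (5, 10, 20, 40):
        print(f"{n:>2} components -> {joint_states(n)} setting combinations")
```

Five components give 3125 combinations, matching the 5^5 figure on the slide; each additional component multiplies the count by five.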

FRACTAL DPM DESIGN

- Fractal design
  - A design in which the system behaves the same at every scale
  - This makes inductive verification possible
    - Base case: verify that the minimum system satisfies its power constraints
    - Inductive step: verify that larger systems are equivalent to smaller systems
  - Both done using MurΦ

FRACTAL SYSTEM ORGANIZATION

- Hierarchical structure: binary tree model
  - Leaves: computing resources (CRs)
  - Intermediate nodes: DPM controllers
    - Record the power states of their child nodes
    - Handle the power requests of CRs
- Power requests
  - A CR can request more power by sending a request to its DPM controller (parent)
  - The DPM controller responds either directly or by passing the request to its parent controller
- A DPM controller and its two children are considered a single "node", like a single CR
  - Each such node has a combined power setting: the average of its child nodes (L:R)

FRACTAL SYSTEM ORGANIZATION

- E.g. if the children are H and L, then the average is MH
- The L:R format represents the power setting of the left child : the power setting of the right child

FRACTAL SYSTEM ORGANIZATION

FRACTAL POWER INVARIANT

- The invariant must be fractal
  - Applicable at all scales of the system
  - A plus point of Fractal DPM: this makes it unique among DPM schemes
- Fractal invariant
  - It is impossible for both children of a DPM controller to be at the High power setting at the same time
- Why?
  - Good for cases when Sum(Pi) > Budget
  - Limits system-wide power consumption
- Limitation
  - Other invariants are not considered or compared: future work
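The fractal invariant can be checked with the same rule at every level of the tree, which is what makes it fractal. The sketch below is an illustration only: the `Node` class and function name are hypothetical, and each controller's combined setting is supplied by the caller rather than computed, so no particular averaging rule is assumed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """A CR (leaf) or a DPM controller with two children (hypothetical structure)."""
    state: str                       # recorded setting: "L", "ML", "M", "MH" or "H"
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def satisfies_fractal_invariant(node):
    """True if no DPM controller in the tree has both children at H."""
    if node.left is None or node.right is None:
        return True  # a leaf CR has no children to constrain
    if node.left.state == "H" and node.right.state == "H":
        return False
    # The identical check is applied one level down: same rule at every scale.
    return (satisfies_fractal_invariant(node.left)
            and satisfies_fractal_invariant(node.right))


if __name__ == "__main__":
    ok = Node("M", Node("M"), Node("M"))
    bad = Node("H", Node("H"), Node("H"))
    print(satisfies_fractal_invariant(ok), satisfies_fractal_invariant(bad))
```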

FRACTAL DPM : SPECIFICATION

- Table-based specification method
  - Each entry in the table corresponds to a state/event combination, and the entry specifies what happens in that situation.

SPECIFICATION CONTINUED

- Special states
  - Pend-*: family of pending states in which the computing resource has requested a new power state and is waiting for a response
  - Block-*: family that includes states such as block-L:ML, in which the DPM controller has granted or denied a request to a child, is blocked waiting on the Ack from the child, and will then go to state L:ML
- Specification of the root DPM controller
  - Same as a non-root DPM controller, except that the root has no parent DPM controller from which to request power
  - No pending states, only block states
- A non-root DPM controller handles a request by itself only if its node state is left unchanged; otherwise it passes the request to its parent DPM controller
  - 4 exceptions: invariant not satisfied
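A table-based specification maps naturally onto a lookup keyed by (state, event). The sketch below shows the idea for a computing resource using one Pend-* state. The event names, action strings, and the specific entries are hypothetical and are not taken from the actual specification tables; only the Pend-* naming and the Ack message come from the slides.

```python
# (state, event) -> (action, next_state); illustrative entries only.
CR_TABLE = {
    ("L", "NeedML"): ("send Request-ML to parent", "Pend-ML"),
    ("Pend-ML", "Grant"): ("send Ack to parent", "ML"),
    ("Pend-ML", "Deny"): ("send Ack to parent", "L"),
}


def step(table, state, event):
    """Look up one state/event entry; undefined combinations are reported as errors."""
    try:
        return table[(state, event)]
    except KeyError:
        raise ValueError(f"no table entry for state={state!r}, event={event!r}")


if __name__ == "__main__":
    state = "L"
    for event in ("NeedML", "Grant"):
        action, state = step(CR_TABLE, state, event)
        print(f"{event}: {action} -> {state}")
```

Because every behavior is one table entry, a missing state/event combination shows up as an explicit gap rather than as undefined behavior.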

FRACTAL DPM : SCALABILITY ISSUES

- At high scale
  - Tree height increases
  - Requests from the leaves to the root take more time
  - Latency issues: more hops
- Possible solution
  - Higher-degree tree: reduces the height of the tree
  - Problem: MurΦ doesn't support this, so it couldn't be verified
- Scalability issues: no big concern
  - The latency of DPM itself is not critical.
  - Many requests can be satisfied without traveling far up the tree
  - Experimental results on a real, modestly sized system (16 computing resources) show that latencies are reasonable.
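The latency concern follows from how tree height grows with the number of computing resources. The sketch below assumes a full tree with CRs only at the leaves; the function name is made up for illustration.

```python
def tree_height(num_crs, degree=2):
    """Controller levels between a leaf CR and the root of a full tree."""
    height, capacity = 0, 1
    while capacity < num_crs:
        capacity *= degree  # each extra level multiplies the leaf capacity
        height += 1
    return height


if __name__ == "__main__":
    for crs in (16, 1024):
        print(crs, "CRs: binary height", tree_height(crs),
              "| degree-4 height", tree_height(crs, degree=4))
```

For the 16-CR system mentioned above, a request that must reach the root of a binary tree crosses 4 controller levels; a degree-4 tree would halve that, which is the appeal of the higher-degree option.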

VERIFICATION OF FRACTAL DPM

- Scalably verify: verification effort is independent of the number of CRs
- Steps
  - Base case verification
  - Induction step verification
- Base case: minimum system verification
  - The base system must be complete: it must include all basic components
  - Incomplete base system: some elements are not considered
    - Gives incomplete verification: spurious actions
  - MurΦ verifies whether the invariants are satisfied

BASE CASE :MINIMUM SYSTEM VERIFICATION

VERIFICATION OF FRACTAL DPM

- Inductive step: equivalence verification
  - Observational equivalence verification is chosen
    - Only the outside behavior of systems of different scale is considered
    - No internal actions are considered
    - Considers only how the system reacts to inputs
- Two perspectives
  - Looking Down: when the system is scaled downwards
  - Looking Up: when the system is scaled upwards
  - In both cases, verify that the larger system behaves the same as the subsystem
- Tool: MurΦ is used
  - Using the same tool for both steps reduces transitional errors
  - On-the-fly mode: no extra state space
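The idea of comparing only outside behavior can be illustrated with a toy check. The sketch below is a deliberately simplified, bounded comparison of output traces for two deterministic machines. It is not the on-the-fly method used with MurΦ and does not establish observational equivalence in general; the machines, state names, and inputs are all hypothetical.

```python
from itertools import product


def run(machine, start, inputs):
    """Feed inputs to a machine given as (state, input) -> (output, next_state)."""
    state, outputs = start, []
    for symbol in inputs:
        output, state = machine[(state, symbol)]
        outputs.append(output)
    return outputs


def same_observable_behavior(m1, s1, m2, s2, alphabet, max_len):
    """Compare outputs on every input sequence up to max_len; internal states are ignored."""
    for length in range(1, max_len + 1):
        for inputs in product(alphabet, repeat=length):
            if run(m1, s1, inputs) != run(m2, s2, inputs):
                return False
    return True


# Toy "small" system with two states.
SMALL = {
    ("lo", "up"): ("grant", "hi"), ("lo", "down"): ("deny", "lo"),
    ("hi", "up"): ("deny", "hi"), ("hi", "down"): ("grant", "lo"),
}
# Toy "large" system with an extra internal state but the same outside behavior.
LARGE = {
    ("lo", "up"): ("grant", "hi-a"), ("lo", "down"): ("deny", "lo"),
    ("hi-a", "up"): ("deny", "hi-b"), ("hi-a", "down"): ("grant", "lo"),
    ("hi-b", "up"): ("deny", "hi-a"), ("hi-b", "down"): ("grant", "lo"),
}

if __name__ == "__main__":
    print(same_observable_behavior(SMALL, "lo", LARGE, "lo", ("up", "down"), 4))
```

The two machines differ internally (`hi-a` and `hi-b` have no counterpart in the small one), yet an observer who sees only inputs and outputs cannot tell them apart within the tested bound.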

EQUIVALENCE VERIFICATION

POWER MANAGEMENT EFFICIENCY

- System-wide power consumption is upper bounded
  - Max power consumed: (C-1) MH + H
  - As C approaches infinity, max average power per CR = MH
- F-DPM allows all CRs to be in MH
- F-DPM does not permit certain cases
  - Causes inefficiency
  - But this is a tradeoff against the fractal invariant
  - Such cases are rare and the inefficiency caused is small
- Another inefficiency: F-DPM forces one CR at H down to MH
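The bound above can be checked numerically. The weights for MH and H below are hypothetical, since the slides do not give power values; only the formula (C-1) MH + H is taken from the slide.

```python
# Hypothetical power weights (arbitrary units); not from the slides.
W_MH = 4.0
W_H = 5.0


def max_total_power(c):
    """Upper bound from the slide: (C-1) CRs at MH plus one CR at H."""
    return (c - 1) * W_MH + W_H


def max_average_power(c):
    """Average per CR; equals W_MH + (W_H - W_MH) / c, so it tends to W_MH."""
    return max_total_power(c) / c


if __name__ == "__main__":
    for c in (1, 2, 16, 1024):
        print(c, max_total_power(c), max_average_power(c))
```

The single CR at H contributes an excess of (H - MH) that is spread over all C resources, which is why the average approaches MH as C grows.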

EVALUATION OF SYSTEM

- Goal: does Fractal DPM actually do its job well, allocating power to CRs dynamically and efficiently?
- Simulation methodology
  1. Dynamically set Xmaxi for all CRs
     - Keep it changing at each time step
  2. Give weights to the power settings
  3. Model the behavior of CRs and DPMCs
     - Specification tables
  4. Compute the performance of each CR
     - Function of the power it is granted by DPM at each time step

PERFORMANCE MODELING

- How to determine the performance of a given CR at a given power setting?
  - Each CR can use power in a different way
  - CRs may achieve different performance at the same setting
- Abstract way: as a function of Pi and Xmaxi
- Two functions
  - Perf1: decreasing marginal performance benefit
    - E.g. using more power to enable a faster core clock frequency helps performance, but eventually performance becomes memory-bound
  - Perf2: linear performance benefit
    - E.g. ideal voltage/frequency scaling
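The two performance models differ in how much an extra unit of power buys. The sketch below uses assumed functional forms, a square root for diminishing returns and a straight line for the linear case, with power normalized to the range 0 to 1. The slides do not state the actual formulas, so these are stand-ins chosen only to show the shapes.

```python
import math


def perf1(p, xmax):
    """Assumed concave form: diminishing marginal benefit, reaching xmax at p = 1."""
    return xmax * math.sqrt(p)


def perf2(p, xmax):
    """Assumed linear form: performance proportional to power, reaching xmax at p = 1."""
    return xmax * p


if __name__ == "__main__":
    xmax = 10.0
    for p in (0.2, 0.4, 0.6, 0.8, 1.0):
        print(p, round(perf1(p, xmax), 2), round(perf2(p, xmax), 2))
```

Under these assumed forms, the last step up to full power adds less under perf1 than under perf2, so being held one setting below the top costs more in the linear model.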

PERFORMANCE COMPARISON AND RESULTS

- Compare against an un-implementable oracle (ideal DPM)
  - Gives the best possible allocations, even H:H allocations
- Results (given #CRs = 8)
  - In the majority of the time steps (>72%): performance(FDPM) = performance(Oracle)
  - The performance gap is never more than 37% for perf1 and 46% for perf2
  - The performance difference is greater for perf2: perf2 models greater performance at higher power states, and thus being at a lower power state (to maintain the fractal invariant) is somewhat more costly
- Thus: the amount of performance sacrificed is small

IMPLEMENTATION STRATEGY

- Dynamic voltage/frequency scaling as the power adjustment strategy
  - V/F adjusted at core-pair granularity
  - Possible because of the fractal structure
- CRs and DPMCs implemented as Linux daemons
  - Communication through sockets
- Optimization: OptiFDPM
  - A CR re-requests the next lower power setting if its current request is rejected
  - The optimized version retains the scalable verifiability of FDPM
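The OptiFDPM retry rule can be sketched as a simple loop. The function and the `try_request` callback are hypothetical: the callback stands in for the socket round trip to the parent controller, and the sketch covers only requests for a higher setting.

```python
SETTINGS = ["L", "ML", "M", "MH", "H"]  # ordered from lowest to highest


def request_with_fallback(current, wanted, try_request):
    """Request `wanted`; on rejection, re-request the next lower setting.

    `try_request(setting)` is a hypothetical callback returning True if the
    parent controller grants the setting. Stops once the next candidate
    would be no higher than the current setting.
    """
    current_idx = SETTINGS.index(current)
    idx = SETTINGS.index(wanted)
    while idx > current_idx:
        if try_request(SETTINGS[idx]):
            return SETTINGS[idx]
        idx -= 1  # rejected: fall back one setting
    return current


if __name__ == "__main__":
    # Pretend the controller can grant at most M right now.
    grants_up_to_m = lambda setting: SETTINGS.index(setting) <= SETTINGS.index("M")
    print(request_with_fallback("L", "H", grants_up_to_m))
```

Without the retry, a rejected request for H would leave the CR at its current setting even when an intermediate setting was available.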

EVALUATION OF IMPLEMENTATION

Compare the power and performance of fractal DPM against an un-implementable oracle DPM scheme that always assigns the optimal power levels to core pairs.

Compare the power and performance of fractal DPM against a provably correct power management scheme that statically sets all cores to a given power level.

Determine the latency to service requests for new power levels

EVALUATION OF IMPLEMENTATION

Comparison to Oracle Power Management

EVALUATION OF IMPLEMENTATION

Comparison to Static Power Management

EVALUATION OF IMPLEMENTATION

Latency

COMPARISON : PREVIOUS WORKS

- Lungu et al.’s research on verifiable DPM for multicore processors [9]
  - Observed that DPM schemes cannot be verified at large scale
  - Showed state space explosion
- Zhang et al.’s work on Fractal Coherence [14]
  - Source of the fractal design idea
  - Fractal design is used for DPM for the first time
- Other works on DPM [10][8][6]
  - Did not use verification

CONCLUSION

- Design of a scalably verifiable DPM
- Uses fractal design for verifiability
- Only a small performance inefficiency
  - On par with the oracle model

REFERENCES

[1] D. Bergamini, N. Descoubes, C. Joubert, and R. Mateescu, “BISIMULATOR: A Modular Tool for On-the-Fly Equivalence Checking,” in Proceedings of TACAS’05, volume 3440 of LNCS, 2005, pp. 581–585.

[2] C. Bienia, S. Kumar, J. P. Singh, and K. Li, “The PARSEC Benchmark Suite: Characterization and Architectural Implications,” in Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2008.

[3] C.-T. Chou, P. Mannava, and S. Park, “A Simple Method for Parameterized Verification of Cache Coherence Protocols,” in Formal Methods in Computer-Aided Design, 2004, pp. 382–398.

[4] G. Dhiman, K. K. Pusukuri, and T. Rosing, “Analysis of Dynamic Voltage Scaling for System Level Energy Management,” in Proceedings of the 2008 Conference on Power Aware Computing and Systems, 2008.

[5] D. L. Dill, A. J. Drexler, A. J. Hu, and C. H. Yang, “Protocol Verification as a Hardware Design Aid,” in IEEE International Conference on Computer Design: VLSI in Computers and Processors, 1992, pp. 522–525.

[6] A. Efthymiou and J. D. Garside, “Adaptive Pipeline Depth Control for Processor Power-Management,” in Proceedings of the IEEE International Conference on Computer Design, 2002.

[7] J.-C. Fernandez, H. Garavel, A. Kerbrat, L. Mounier, R. Mateescu, and M. Sighireanu, “CADP - A Protocol Validation and Verification Toolbox,” in Proceedings of the 8th International Conference on Computer Aided Verification, 1996, pp. 437–440.

[8] C. Isci, A. Buyuktosunoglu, C.-Y. Cher, P. Bose, and M. Martonosi, “An Analysis of Efficient Multi-Core Global Power Management Policies: Maximizing Performance for a Given Power Budget,” in Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture, 2006.

[9] A. Lungu, P. Bose, D. J. Sorin, S. German, and G. Janssen, “Multicore Power Management: Ensuring Robustness via Early-Stage Formal Verification,” in Proceedings of the Seventh ACM-IEEE International Conference on Formal Methods and Models for Codesign (MEMOCODE), 2009.

[10] R. Maro, Y. Bai, and R. I. Bahar, “Dynamically Reconfiguring Processor Resources to Reduce Power Consumption in High-Performance Processors,” in Proceedings of the Workshop on Power-Aware Computer Systems, pp. 97–111, Nov. 2000.

[11] S. Park, S. Das, and D. L. Dill, “Automatic Checking of Aggregation Abstractions Through State Enumeration,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 19, no. 10, pp. 1202–1210, Nov. 2006.

[12] S. Park and D. L. Dill, “Verification of FLASH Cache Coherence Protocol by Aggregation of Distributed Transactions,” in Proceedings of the Eighth ACM Symposium on Parallel Algorithms and Architectures, 1996, pp. 288–296.

[13] D. J. Sorin, M. Plakal, M. D. Hill, A. E. Condon, M. M. K. Martin, and D. A. Wood, “Specifying and Verifying a Broadcast and a Multicast Snooping Cache Coherence Protocol,” IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 6, pp. 556–578, Jun. 2002.

[14] M. Zhang, A. R. Lebeck, and D. J. Sorin, “Fractal Coherence: Scalably Verifiable Cache Coherence,” in Proceedings of the 43rd Annual IEEE/ACM International Symposium on Microarchitecture, 2010.