ICE: A General and Validated Energy Complexity Model for Multithreaded Algorithms

Vi Tran, Phuong Ha

Department of Computer Science, UiT The Arctic University of Norway

The 2nd EXCESS workshop, Aug. 26, 2016

Motivation – Energy Complexity Models

Time complexity models contribute to

Analysis and development of performance-efficient algorithms

Energy complexity models are crucial to

Understand the energy consumption of algorithms

Improve energy efficiency of algorithms

Reduce energy consumption of computing systems

Energy complexity models must be

Applicable to both sequential and multithreaded algorithms

Aware of both algorithm and platform characteristics

Not only theoretical, but also validated on real platforms and application kernels

Motivation – ICE Complexity Model

Differences of ICE (Ideal Cache Energy) complexity model compared to available energy models

Contributions

Devise a new ICE model to answer the question:

Given two parallel algorithms A and B for a given problem, which algorithm consumes less energy analytically?

Conduct two case studies that apply the ICE model to

data-intensive algorithms

computation-intensive algorithms

Validate the ICE model

with different algorithms, using different input types, on different HPC platforms

Result: the energy comparisons from the ICE model match the experimental data in 100% of cases

The ICE complexity model does not provide an absolute estimate of energy consumption

Outline

Motivation

Contributions

Shared Memory Machine Model

ICE: Energy Complexity Models

A Case Study of Energy Complexity – SpMV

A Case Study of Energy Complexity – Matmul

ICE Model Validation

Conclusion

Shared Memory Machine Model

The energy consumption of a parallel algorithm:

Energy for memory accesses is analysed based on I/O complexity

There are two available I/O models for parallel algorithms

Model: PEM (Parallel External Memory)

Approach: N cores access n blocks simultaneously, I/O complexity = O(1)

Limitation: suitable for time complexity rather than energy complexity

Model: IDC (Ideal Distributed Cache)

Approach: N cores access n blocks simultaneously, I/O complexity = O(n)

Limitation: only applicable to divide-and-conquer algorithms

Model: Traditional IC (Ideal Cache)

Approach: I/O complexity = O(n); find an upper bound on I/O complexity

Applicability: applicable to both sequential and multithreaded algorithms

ICE Complexity Model - Parameters

The ICE model considers both machine and algorithm characteristics

ICE Complexity Model – Compute-Bound

The energy consumption of a parallel algorithm is the sum of:

the static (or leakage) energy

the dynamic energy of computation

the dynamic energy of memory accesses
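Written out, this decomposition might look as follows. The symbol names are assumed notation chosen here for illustration, since the slide's own symbols are not preserved in this transcript. The second line uses the standard relation that static energy is static power multiplied by execution time.

```latex
\begin{align}
E_{\text{total}}  &= E_{\text{static}} + E_{\text{comp}} + E_{\text{mem}} \\
E_{\text{static}} &= P_{\text{static}} \cdot T_{\text{exec}}
\end{align}
```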

If an algorithm is compute-bound

ICE Complexity Model – Memory-Bound

If an application is memory-bound:

ICE Complexity Model

If an application is compute-bound:

If an application is memory-bound:

Where

Platform Parameters

We provide the parameter values for 11 recent HPC platforms
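As a rough illustration of how platform parameters and an algorithm's work, span and I/O complexity could be combined into a compute-bound or memory-bound energy estimate, consider the sketch below. It is not the ICE formula from the slides, whose equations are not preserved in this transcript. Every name (`Platform`, `Algorithm`, `estimate_energy`), every parameter, and the runtime rule (the larger of computation time and memory-access time) is an assumption made for illustration, and the numbers are arbitrary units rather than measured platform values.

```python
from dataclasses import dataclass


@dataclass
class Platform:
    """Hypothetical platform parameters (not the values from the slides)."""
    eps_op: float    # dynamic energy per operation
    eps_io: float    # dynamic energy per block transfer
    p_static: float  # static (leakage) power
    t_op: float      # time per operation
    t_io: float      # time per block transfer
    cores: int       # number of cores


@dataclass
class Algorithm:
    """Algorithm characteristics expressed as counts."""
    work: float  # total number of operations
    span: float  # length of the critical path, in operations
    io: float    # number of block transfers


def estimate_energy(alg, plat):
    """Return (total energy, 'compute' or 'memory') under the assumed rule."""
    # Assumed computation time: limited by the work shared across cores
    # and by the span, whichever is larger.
    t_comp = max(alg.work / plat.cores, alg.span) * plat.t_op
    # Assumed memory time: block transfers are served one after another.
    t_mem = alg.io * plat.t_io
    bound = "compute" if t_comp >= t_mem else "memory"
    runtime = max(t_comp, t_mem)

    e_static = plat.p_static * runtime  # static energy = static power x time
    e_comp = plat.eps_op * alg.work     # dynamic energy of computation
    e_mem = plat.eps_io * alg.io        # dynamic energy of memory accesses
    return e_static + e_comp + e_mem, bound


# Illustrative numbers only.
plat = Platform(eps_op=1.0, eps_io=10.0, p_static=2.0,
                t_op=1.0, t_io=4.0, cores=2)
print(estimate_energy(Algorithm(work=100, span=10, io=5), plat))
print(estimate_energy(Algorithm(work=100, span=10, io=20), plat))
```

With the same work and span, raising the I/O count from 5 to 20 moves the example from the compute-bound branch to the memory-bound branch, which changes which term determines the static energy.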

Case Studies - SpMV Energy Complexity

SpMV energy complexity:

Analyse the work, I/O and span complexity of

Compressed Sparse Column (CSC)

Compressed Sparse Block (CSB)

Compressed Sparse Row (CSR)

Case Studies – CSC-SpMV

Compressed Sparse Column

SpMV is a memory-bound algorithm
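A minimal sketch of SpMV over a matrix stored in CSC form, assuming the usual three-array layout (column pointers, row indices, nonzero values). The function and array names are hypothetical and not taken from the slides. Each nonzero is read once and used in a single multiply-add, while the output vector is updated at scattered row indices.

```python
def csc_spmv(n_rows, col_ptr, row_idx, values, x):
    """Compute y = A @ x for a sparse matrix A stored in CSC form.

    col_ptr[j] .. col_ptr[j + 1] delimits the nonzeros of column j;
    row_idx[k] and values[k] give the row and value of nonzero k.
    """
    y = [0.0] * n_rows
    for j in range(len(col_ptr) - 1):
        xj = x[j]  # one element of x is reused for the whole column
        for k in range(col_ptr[j], col_ptr[j + 1]):
            y[row_idx[k]] += values[k] * xj  # scattered update of y
    return y


# The 3x3 matrix [[1, 0, 2], [0, 3, 0], [4, 0, 5]] in CSC form.
col_ptr = [0, 2, 3, 5]
row_idx = [0, 2, 1, 0, 2]
values = [1.0, 4.0, 3.0, 2.0, 5.0]
print(csc_spmv(3, col_ptr, row_idx, values, [1.0, 2.0, 3.0]))
```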

Case Studies – CSB-SpMV

SpMV is a memory-bound algorithm

Compressed Sparse Block

Case Study - Dense Matrix Multiplication (Matmul)

Matmul: A[n][m] * B[m][p] = C[n][p]

Simple Matmul (Simple-Matmul): a 3-loop implementation of Matmul

Cache-oblivious Matmul (CO-Matmul): a recursive Matmul. At each step,

If n >= max (m, p)

If m >= max (n, p)

If p >= max (n, m)

Example (n >= max(m, p)): A x B = [A1; A2] x B = [A1 x B; A2 x B] = [C1; C2] = C
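The two Matmul variants can be sketched as below. The split rule for each case is not preserved in this transcript, so the sketch assumes the standard cache-oblivious scheme, which halves the largest of the three dimensions: split A by rows when n is largest, split A by columns and B by rows when m is largest, and split B by columns when p is largest. The function names and the base-case threshold `base` are hypothetical.

```python
def simple_matmul(A, B):
    """Three-loop multiplication of A (n x m) by B (m x p)."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                C[i][j] += A[i][k] * B[k][j]
    return C


def _co_rec(A, B, C, i0, i1, k0, k1, j0, j1, base):
    """Accumulate A[i0:i1, k0:k1] * B[k0:k1, j0:j1] into C[i0:i1, j0:j1]."""
    n, m, p = i1 - i0, k1 - k0, j1 - j0
    if max(n, m, p) <= base:
        # Small enough: fall back to the three-loop kernel.
        for i in range(i0, i1):
            for j in range(j0, j1):
                for k in range(k0, k1):
                    C[i][j] += A[i][k] * B[k][j]
        return
    if n >= max(m, p):    # split A by rows; each half fills its rows of C
        mid = i0 + n // 2
        _co_rec(A, B, C, i0, mid, k0, k1, j0, j1, base)
        _co_rec(A, B, C, mid, i1, k0, k1, j0, j1, base)
    elif m >= max(n, p):  # split the shared dimension; both halves add into C
        mid = k0 + m // 2
        _co_rec(A, B, C, i0, i1, k0, mid, j0, j1, base)
        _co_rec(A, B, C, i0, i1, mid, k1, j0, j1, base)
    else:                 # split B by columns; each half fills its columns of C
        mid = j0 + p // 2
        _co_rec(A, B, C, i0, i1, k0, k1, j0, mid, base)
        _co_rec(A, B, C, i0, i1, k0, k1, mid, j1, base)


def co_matmul(A, B, base=2):
    """Recursive multiplication that always halves the largest dimension."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(n)]
    _co_rec(A, B, C, 0, n, 0, m, 0, p, base)
    return C


A = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]]
B = [[1, 0, 2, 1], [0, 1, 1, 3], [2, 2, 0, 1], [1, 1, 1, 1], [3, 0, 1, 2]]
print(co_matmul(A, B) == simple_matmul(A, B))
```

Both functions perform the same multiply-adds; they differ only in the order in which the entries of A, B and C are visited, which is what affects their I/O complexity.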

Matmul Complexity Analysis

Matmul is a compute-bound algorithm

Simple-Matmul Energy Complexity

Matmul complexity analysis

CO-Matmul Energy Complexity

Matmul complexity analysis

Model Validation

The objective of the ICE model is to answer the question:

Given two parallel algorithms A and B for a given problem, which algorithm consumes less energy analytically?

Validate the ICE model with different SpMV and Matmul algorithms

Validate the ICE model with different input types

Sparse matrices (a subset of the Florida sparse matrix collection): varied matrix sizes (n, m) and varied patterns (nz, nc)

Dense matrices: varied matrix sizes (n, m)

Validate the ICE model with experimental data on two HPC platforms (Xeon and Xeon Phi)

Model Validation – Expected Results

Compute the energy consumption ratio of

two SpMV algorithms (i.e., CSC-SpMV and CSB-SpMV) and

two Matmul algorithms (i.e., Simple-Matmul and CO-Matmul)

Expected result: the energy comparison from the model agrees with the comparison from experimental data

The energy ratio of CSC-SpMV to CSB-SpMV falls on the same side of 1 (greater or less) in both the model and the experimental data

The energy ratio of Simple-Matmul to CO-Matmul falls on the same side of 1 (greater or less) in both the model and the experimental data
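The matching criterion above can be sketched as a small check: a test case counts as a match when the model ratio and the measured ratio fall on the same side of 1. The function names are hypothetical, and the ratios in the example are made-up values for illustration, not results from the study.

```python
def same_side_of_one(model_ratio, measured_ratio):
    """True when both ratios agree on which algorithm uses less energy.

    A ratio exactly equal to 1 is treated as "not greater than 1";
    that tie-breaking choice is an assumption of this sketch.
    """
    return (model_ratio > 1.0) == (measured_ratio > 1.0)


def match_percentage(ratio_pairs):
    """Percentage of (model_ratio, measured_ratio) pairs that agree."""
    matches = sum(same_side_of_one(m, e) for m, e in ratio_pairs)
    return 100.0 * matches / len(ratio_pairs)


# Made-up ratios: the third pair disagrees (model > 1, measurement < 1).
example = [(1.8, 1.3), (0.7, 0.9), (1.2, 0.95)]
print(round(match_percentage(example), 1))
```

Only the direction of the comparison is checked, not the size of the ratio, which is consistent with the model not providing absolute energy estimates.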

ICE Model Validation - SpMV

Energy consumption ratio of CSC to CSB SpMV on Xeon and Xeon Phi

The model and the experimental data agree on the CSC-SpMV versus CSB-SpMV energy comparison in 100% of cases

ICE Model Validation - Matmul

Energy consumption ratio of Simple-Matmul to CO-Matmul on Xeon and Xeon Phi

The model and the experimental data agree on the Simple-Matmul versus CO-Matmul energy comparison in 100% of cases

Conclusion - Energy/Power Model Studies

Devise a new energy complexity model (ICE) for general and multithreaded algorithms

Analyse algorithms by their work, span and I/O complexity

Propose the Ideal Cache memory model to analyse I/O complexity in the ICE model

Consider the static and dynamic energy of computation and memory access as platform parameters

Propose a new way to analyse I/O complexity in an energy complexity model

Conduct two case studies (i.e., SpMV and Matmul) to demonstrate how to use the ICE model

Conduct experimental studies to validate the ICE model:

For data-intensive and computation-intensive algorithms

With different input matrix types and sizes on two HPC platforms (i.e., Intel Xeon and Xeon Phi)

The energy comparisons of the given algorithm pairs match the experimental data in 100% of cases

Thank you!

Contact: vi.tran@uit.no
