Characterizing NAS Benchmark Performance on Shared Heterogeneous Networks

Jaspal Subhlok, Shreenivasa Venkataramaiah, Amitoj Singh. University of Houston. Heterogeneous Computing Workshop, April 15, 2002.


Page 1: Characterizing NAS Benchmark Performance on Shared Heterogeneous Networks

Rice01, slide 1

Characterizing NAS Benchmark Performance on Shared Heterogeneous Networks

Jaspal Subhlok

Shreenivasa Venkataramaiah

Amitoj Singh

University of Houston

Heterogeneous Computing Workshop, April 15, 2002

Page 2: Characterizing NAS Benchmark Performance on Shared Heterogeneous Networks

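Mapping/Adapting Distributed Applications on Networks

[Figure: an application task graph (Data, Pre, Stream, Sim 1, Sim 2, Model, Vis) next to a network, with a "?" marking the open question of how to map the application onto the network]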

Page 3: Characterizing NAS Benchmark Performance on Shared Heterogeneous Networks

Automatic node selection

[Figure: network of compute nodes m-1 to m-8 and routers, with busy nodes, a congested route, and the selected nodes marked]

Select 4 nodes for execution: choice is easy

Page 4: Characterizing NAS Benchmark Performance on Shared Heterogeneous Networks

Automatic node selection

[Figure: network of compute nodes m-1 to m-8 and routers, with busy nodes, a congested route, and the selected nodes marked]

Select 5 nodes: choice depends on application
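Why the choice can depend on the application is easier to see in a small brute-force sketch. Everything in it is an assumption for illustration: the `select_nodes` function, the scoring rule, the `comm_weight` parameter, and the load and bandwidth values are hypothetical and are not the authors' algorithm or data.

```python
from itertools import combinations

def select_nodes(cpu_free, bandwidth, k, comm_weight):
    """Pick k >= 2 nodes maximizing a weighted score (hypothetical scoring rule).

    cpu_free: dict node -> fraction of CPU available (0..1)
    bandwidth: dict frozenset({a, b}) -> available Mbps between a and b
    comm_weight: 0 for a compute-bound application, 1 for a communication-bound one
    """
    max_bw = max(bandwidth.values())
    best, best_score = None, float("-inf")
    for group in combinations(sorted(cpu_free), k):
        # Assume the slowest node and the slowest path limit the whole application.
        min_cpu = min(cpu_free[n] for n in group)
        min_bw = min(bandwidth[frozenset(p)] for p in combinations(group, 2))
        score = (1 - comm_weight) * min_cpu + comm_weight * min_bw / max_bw
        if score > best_score:
            best, best_score = group, score
    return best

# Made-up four-node example: m-3 is busy, the m-1/m-2 route is congested.
nodes = ["m-1", "m-2", "m-3", "m-4"]
cpu_free = {"m-1": 1.0, "m-2": 1.0, "m-3": 0.5, "m-4": 1.0}
bandwidth = {frozenset(p): 100.0 for p in combinations(nodes, 2)}
bandwidth[frozenset(("m-1", "m-2"))] = 10.0

compute_bound_choice = select_nodes(cpu_free, bandwidth, 3, comm_weight=0.0)
comm_bound_choice = select_nodes(cpu_free, bandwidth, 3, comm_weight=1.0)
```

With these made-up inputs the compute-bound setting avoids the busy node and accepts the congested route, while the communication-bound setting does the opposite.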

Page 5: Characterizing NAS Benchmark Performance on Shared Heterogeneous Networks

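Mapping/Adapting Distributed Applications on Networks

[Figure: an application task graph (Data, Pre, Stream, Sim 1, Sim 2, Model, Vis) next to a network, with a "?" marking the open question of how to map the application onto the network]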

1) Discover application characteristics and model performance in a shared heterogeneous environment

2) Discover network structure and available resources (e.g., NWS, REMOS)

3) Algorithms to map/remap applications to networks

Page 6: Characterizing NAS Benchmark Performance on Shared Heterogeneous Networks

Methodology for Building Application Performance Signature

Performance signature = model to predict application execution time under given network conditions

1. Execute the application on a controlled testbed

2. Measure system level activity during execution, such as CPU, communication and memory usage

3. Analyze and discover program level activity (message sizes, sequences, synchronization waits)

4. Develop a performance signature

• No access to source code/libraries assumed
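The slides do not say what a performance signature contains. As a minimal sketch, assuming it is just a summary of the testbed measurements from steps 1 and 2, it could be recorded as below. The names `PerformanceSignature` and `build_signature`, the fields, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class PerformanceSignature:
    base_time_s: float  # execution time on the unloaded testbed
    cpu_util: dict      # node -> mean CPU utilization (0..1)
    link_mbps: dict     # link -> mean traffic in Mbps

def build_signature(base_time_s, cpu_samples, traffic_samples):
    """Summarize per-node CPU samples and per-link traffic samples by their means."""
    return PerformanceSignature(
        base_time_s=base_time_s,
        cpu_util={node: mean(s) for node, s in cpu_samples.items()},
        link_mbps={link: mean(s) for link, s in traffic_samples.items()},
    )

# Made-up samples, only to show the shape of the inputs.
sig = build_signature(
    100.0,
    {"node0": [0.5, 0.75, 1.0]},
    {("node0", "node1"): [2.0, 4.0, 6.0]},
)
```

A real signature would presumably also carry the program level activity from step 3 (message sizes, sequences, synchronization waits), which this sketch leaves out.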

Page 7: Characterizing NAS Benchmark Performance on Shared Heterogeneous Networks

Discovering application characteristics

Executable application code → benchmarking on a controlled testbed and analysis → model as a performance signature

Testbed: 500 MHz Pentium Duos, Ethernet switch (crossbar), 100 Mbps links

• Capture patterns of CPU loads and traffic during execution

Page 8: Characterizing NAS Benchmark Performance on Shared Heterogeneous Networks

Results in this paper

Executable application code → benchmarking on a controlled testbed → measure performance with resource sharing

Testbed: 500 MHz Pentium Duos, Ethernet switch (crossbar), 100 Mbps links

Demonstrate that measured resource usage on a testbed is a good predictor of performance on a shared network for NAS benchmarks

• Capture patterns of CPU loads and traffic during execution

Page 9: Characterizing NAS Benchmark Performance on Shared Heterogeneous Networks

Experiment Procedure

• Resource utilization of NAS benchmarks measured on a dedicated testbed

– CPU probes based on the “top” and “vmstat” utilities

– Bandwidth using “iptraf”, “tcpdump”, SNMP queries

• Performance of NAS benchmarks measured with competing loads and limited bandwidth

– Employ dummynet and NISTnet to limit bandwidth

• All measurements presented are on 500 MHz Pentium Duos, 100 Mbps network, TCP/IP, FreeBSD

• All results on Class A, MPI, NAS Benchmarks
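The slides do not describe how the bandwidth probes were implemented. A common approach with SNMP is to sample an interface octet counter twice and convert the difference into a rate. The sketch below assumes a 32-bit counter that wraps around (as `ifInOctets` in the standard interfaces MIB does). The function name `mbps_between` and the sample values are hypothetical.

```python
COUNTER32_MODULUS = 2 ** 32

def mbps_between(sample_a, sample_b, modulus=COUNTER32_MODULUS):
    """Average rate in Mbps between two (time_s, octets) counter samples."""
    (t0, c0), (t1, c1) = sample_a, sample_b
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    # The modulo handles at most one counter wrap between the two samples.
    octets = (c1 - c0) % modulus
    return octets * 8 / (t1 - t0) / 1e6

# 1,250,000 octets in one second is 10 Mbps.
rate = mbps_between((0.0, 0), (1.0, 1_250_000))
# Same amount of data, but the counter wrapped between the samples.
rate_wrapped = mbps_between((0.0, 2 ** 32 - 625_000), (1.0, 625_000))
```

Sampling often enough that the counter cannot wrap more than once per interval keeps the modulo correction valid.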

Page 10: Characterizing NAS Benchmark Performance on Shared Heterogeneous Networks

Discovered Communication Structure of NAS Benchmarks

[Figure: communication graphs among nodes 0 to 3 for each benchmark: BT, CG, IS, EP, LU, MG, SP]

Page 11: Characterizing NAS Benchmark Performance on Shared Heterogeneous Networks

Performance with competing computation loads

[Figure: bar chart of percentage increase in execution time (axis 0 to 140) for EP, BT, CG, IS, LU, MG, SP, with three series: all nodes are loaded, most busy node loaded, least busy node loaded]

• Increase beyond 50% due to lack of coordinated (gang) scheduling and synchronization

• Correlation between low CPU utilization and smaller increase in execution time (e.g. MG shows only ~60% CPU utilization)

• Execution time is lower if least busy node has a competing load (20% difference in the busyness level for CG)
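One way to see why lower CPU utilization would go with a smaller increase is a naive baseline. It assumes that only the CPU-bound part of the run stretches when the CPU is shared, and that the waiting time is unchanged. This is an illustrative assumption, not a model taken from the slides, and it ignores the scheduling and synchronization effects mentioned in the first bullet. The function name `cpu_sharing_increase_pct` is hypothetical.

```python
def cpu_sharing_increase_pct(cpu_util, competitors=1):
    """Naive percentage increase in execution time on one shared node.

    cpu_util: fraction of the run spent on the CPU when running alone (0..1)
    competitors: number of compute-bound processes sharing the CPU

    Assumes CPU time stretches by (competitors + 1) under fair sharing,
    so the extra time is cpu_util * competitors of the original run.
    """
    if not 0.0 <= cpu_util <= 1.0:
        raise ValueError("cpu_util must be between 0 and 1")
    return cpu_util * competitors * 100.0

fully_cpu_bound = cpu_sharing_increase_pct(1.0)  # run time doubles
partly_waiting = cpu_sharing_increase_pct(0.6)   # about 60
```

Under this baseline the predicted increase is proportional to CPU utilization, which is the direction of the correlation the slide reports.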

Page 12: Characterizing NAS Benchmark Performance on Shared Heterogeneous Networks

Performance with Limited Bandwidth (reduced from 100 to 10 Mbps) on one link

[Figure: chart for CG, IS, MG, SP, BT, LU, EP showing percentage increase in execution time (axis 0 to 140) together with link network traffic in Mbps (axis 0 to 16)]

Close correlation between link utilization and performance with a shared or slow link
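A simple way to reason about this correlation is to assume the slowed link is busy for a fraction of the run equal to its measured utilization, and that this busy time grows in proportion to the bandwidth reduction while the rest of the run is unaffected. The sketch below encodes that assumption. It is not a fit to the measurements, and the function name `slow_link_increase_pct` is hypothetical.

```python
def slow_link_increase_pct(traffic_mbps, old_mbps=100.0, new_mbps=10.0):
    """Estimated percentage increase in execution time when one link slows down.

    traffic_mbps: average traffic on the link measured on the unloaded testbed
    Assumes communication on the link is not overlapped with other work.
    """
    if traffic_mbps < 0 or traffic_mbps > old_mbps:
        raise ValueError("traffic must be between 0 and the original link speed")
    busy_fraction = traffic_mbps / old_mbps  # share of the run spent sending
    stretch = old_mbps / new_mbps            # how much longer sending takes
    return busy_fraction * (stretch - 1.0) * 100.0

no_traffic = slow_link_increase_pct(0.0)
ten_mbps = slow_link_increase_pct(10.0)  # about 90
```

Under these assumptions the estimated increase is linear in the measured link traffic, so benchmarks that send more over the slowed link are predicted to slow down more.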

Page 13: Characterizing NAS Benchmark Performance on Shared Heterogeneous Networks

Performance with Limited Bandwidth (reduced from 100 to 10 Mbps) on all links

[Figure: chart for IS, CG, SP, MG, BT, LU, EP showing percentage increase in execution time (axis 0 to 500) together with total network traffic in Mbps (axis 0 to 80)]

Close correlation between total network traffic and performance with all shared or slow links
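The slide states the correlation qualitatively. One standard way to quantify such a relationship is the Pearson correlation coefficient between per-benchmark traffic and per-benchmark slowdown. The sketch below uses made-up placeholder values, not the measurements behind the chart. The function name `pearson` and the variable names are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)

# Placeholder values only; NOT data from the slides.
example_traffic_mbps = [1.0, 2.0, 3.0, 4.0]
example_increase_pct = [10.0, 20.0, 30.0, 40.0]
r = pearson(example_traffic_mbps, example_increase_pct)
```

A value of `r` near 1 would correspond to the "close correlation" the slide describes. The placeholder inputs here are perfectly linear by construction.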

Page 14: Characterizing NAS Benchmark Performance on Shared Heterogeneous Networks

Results and Conclusions (not the last slide)

• Computation and communication patterns can be captured by passive, near non-intrusive, monitoring

• Benchmarked resource usage pattern is a strong indicator of performance with sharing

– Strong correlation between application traffic and performance with low bandwidth links

– CPU utilization during normal execution is a good indicator of performance with node sharing

Synchronization and timing effects were not dominant for NAS Benchmarks

Page 15: Characterizing NAS Benchmark Performance on Shared Heterogeneous Networks

Discussion and Ongoing Work (the last slide)

• Capture application level data exchange pattern from network probes (e.g. MPI message sequence, sizes)

– Slowdown different for different message sizes

• Infer the main synchronization/waiting patterns

– Impact of unbalanced execution and lack of gang scheduling

• Capture impact of CPU scheduling policy for accurate prediction with sharing

– Policies try to compensate for waits

Goal is to build a quantitative “performance signature” to estimate execution time under any given network conditions, and use it in a resource management prototype system