achieving application performance on the computational grid

Achieving Application Performance on the Computational Grid Francine Berman U. C. San Diego and NPACI

Post on 20-Jan-2016

TRANSCRIPT

Page 1: Achieving Application  Performance on the Computational Grid

Achieving Application Performance

on the Computational Grid

Francine Berman

U. C. San Diego

and

NPACI

Page 2: Achieving Application  Performance on the Computational Grid

The Computing Landscape

data archives

networks

visualization

instruments

MPPs

clusters

PCs

Workstations

Wireless

Page 3: Achieving Application  Performance on the Computational Grid

Computing Platforms

• Combining resources “in the box”

– focus is on new hardware

• Combining resources as a “virtual box”

– focus is on software infrastructure

Page 4: Achieving Application  Performance on the Computational Grid

The Computational Grid

Computational Grid = ensemble of distributed and heterogeneous resources

Metaphor: Electric Power Grid

– for users, power is ubiquitous

– you can plug in anywhere

– you don’t need to know where the power is coming from

Page 5: Achieving Application  Performance on the Computational Grid

Better Toast

• On the electric power grid, power is either adequate or it’s not

– On the computational grid, application performance depends on the underlying system state

• Major Grid research and development thrusts:

– Building the Grid

– Programming the Grid

Page 6: Achieving Application  Performance on the Computational Grid

Programming for Performance

• Performance Paradigm: To achieve performance, applications must be designed and implemented to leverage the performance characteristics of the underlying resources.

Performance Characteristics of the Grid

• Resources are distributed, heterogeneous

• Resources shared by multiple users

• Resource performance may be hard to predict

Page 7: Achieving Application  Performance on the Computational Grid

How Can Applications Achieve Performance on the Grid?

• Build programs to be grid-aware

• Leverage deliverable resource performance during execution

– Scheduling is fundamental

• Key Grid scheduling components

– dynamic information

– quantitative and qualitative predictions

– adaptivity

Page 8: Achieving Application  Performance on the Computational Grid

Achieving Application Performance

• Many entities will schedule the application

[Figure: Grid Application Development System. Labeled components: PSE, Config. object program, whole program compiler, Source application, libraries, Realtime perf monitor, Dynamic optimizer, Grid runtime system, negotiation, Software components, Service negotiator, Scheduler, Performance feedback, Perf problem]

Page 9: Achieving Application  Performance on the Computational Grid

Application Scheduling

• Application schedulers must

– perceive the performance impact of system resources on the application

– adapt application execution schedule to dynamic conditions

– optimize application schedule for Grid according to the user’s performance criteria

• Application scheduler tasked with promoting application performance over the performance of other applications and system components

Page 10: Achieving Application  Performance on the Computational Grid

Paradigm for Application Scheduling

• Self-Centered Scheduling: Everything in the system is evaluated in terms of its impact on the application.

• performance of each system component can be considered as a measurable quantity

• forecasts of quantities relevant to the application can be manipulated to determine schedule

• This simple paradigm forms the basis for AppLeS.

Page 11: Achieving Application  Performance on the Computational Grid

AppLeS

• AppLeS = Application-Level Scheduler

– agent-based approach

– each application integrated with its own AppLeS

– each AppLeS develops and implements a custom application schedule

[Figure: AppLeS agent architecture. Labeled components: NWS (Wolski), User Prefs, App Perf Model, Planner, Resource Selector, Application, Act., Grid/cluster resources/infrastructure]

• Joint project with Rich Wolski at U. Tenn

Page 12: Achieving Application  Performance on the Computational Grid

AppLeS Approach

• Select resources

• For each feasible resource set, plan a schedule

– For each schedule, predict application performance at execution time

– consider both the prediction and its qualitative attributes

• Implement the “best” of the schedules with respect to the user’s performance criteria

– execution time

– convergence

– turnaround time

[Figure: AppLeS agent architecture, as on the previous slide. Labeled components: NWS (Wolski), User Prefs, App Perf Model, Planner, Resource Selector, Application, Act., Grid/cluster resources/infrastructure]
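The select, plan, predict, and choose loop on this slide can be sketched in a few lines of Python. This is an illustration only, not the AppLeS code: the resource records, the `speed` and `startup` fields, and the performance model in `predict_time` are all hypothetical, and the only performance criterion considered is execution time.

```python
from itertools import combinations

def predict_time(work_units, resources):
    """Hypothetical performance model: work is divided in proportion to
    each resource's forecast speed, so all resources finish together.
    The slowest startup cost in the set delays the whole run."""
    total_speed = sum(r["speed"] for r in resources)
    startup = max(r["startup"] for r in resources)
    return startup + work_units / total_speed

def best_schedule(work_units, pool):
    """Enumerate every non-empty resource set, predict its execution
    time, and keep the set with the smallest prediction."""
    best = None
    for k in range(1, len(pool) + 1):
        for subset in combinations(pool, k):
            t = predict_time(work_units, subset)
            if best is None or t < best[0]:
                best = (t, [r["name"] for r in subset])
    return best

# Made-up forecasts: speed in work units per second, startup in seconds.
pool = [
    {"name": "a", "speed": 10.0, "startup": 1.0},
    {"name": "b", "speed": 5.0, "startup": 1.0},
    {"name": "c", "speed": 1.0, "startup": 30.0},
]
predicted, chosen = best_schedule(300.0, pool)
print(chosen, predicted)
```

With these numbers the slow-starting resource `c` is left out, because adding it raises the predicted time. Exhaustive enumeration is exponential in the pool size, so a real scheduler would need to prune the candidate sets.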

Page 13: Achieving Application  Performance on the Computational Grid

Network Weather Service (Wolski, U. Tenn.)

• The NWS provides dynamic resource information for AppLeS

• NWS is a stand-alone system

• NWS

– monitors current system state

– provides best forecast of resource load from multiple models

[Figure: NWS structure. Sensor Interface, Reporting Interface, Forecaster, and multiple Models]
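One way to read “best forecast from multiple models” is to run several simple predictors over the measurement history and report the one that has been most accurate so far. The Python sketch below makes that assumption; the three models, the absolute-error criterion, and all function names are illustrative and are not the NWS interface.

```python
def last_value(history):
    """Predict that the next measurement equals the most recent one."""
    return history[-1]

def running_mean(history):
    """Predict the mean of all measurements seen so far."""
    return sum(history) / len(history)

def exp_smoothing(history, alpha=0.5):
    """Exponentially smoothed estimate; alpha weights recent values."""
    estimate = history[0]
    for x in history[1:]:
        estimate = alpha * x + (1 - alpha) * estimate
    return estimate

MODELS = {"last": last_value, "mean": running_mean, "exp": exp_smoothing}

def forecast(measurements):
    """Replay the history, accumulate each model's absolute error,
    and return (model_name, prediction) for the most accurate model."""
    errors = {name: 0.0 for name in MODELS}
    for t in range(1, len(measurements)):
        past, actual = measurements[:t], measurements[t]
        for name, model in MODELS.items():
            errors[name] += abs(model(past) - actual)
    best = min(errors, key=errors.get)
    return best, MODELS[best](measurements)

print(forecast([1, 2, 3, 4, 5]))
```

On a steadily rising series the last-value model wins, since the averaging models lag behind the trend.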

Page 14: Achieving Application  Performance on the Computational Grid

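Using Forecasting in Scheduling

• How much work should each processor be given?

• Jacobi2D AppLeS solves equations for Area:

$T_i = \mathrm{Area}_i \times \frac{\mathrm{Oper}_i}{\mathrm{pt}} + \mathrm{Comm}_i$

[Figure: N × N grid divided into strips Area_i, one per processor P1, P2, P3]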

[Figure: Fast Ethernet Bandwidth at SDSC. Megabits per second (0 to 70) versus time of day, Tuesday through the following Tuesday. Series: Measurements, Exponential Smoothing Predictions]
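As a worked example of “solving equations for Area”, assume a cost model of the form T_i = Area_i × oper_i + comm_i, where oper_i is the forecast compute time per grid point on processor i and comm_i is its forecast communication time. Requiring every processor to finish at the same time T, with the areas summing to N², gives a closed form. The Python sketch below implements it; the model, the function name, and the numbers are assumptions made for illustration.

```python
def balance_areas(n, oper, comm):
    """Split an n x n grid so all processors finish together.

    Setting Area_i * oper_i + comm_i = T for every i and
    sum(Area_i) = n * n gives
        T = (n*n + sum(comm_i / oper_i)) / sum(1 / oper_i)
        Area_i = (T - comm_i) / oper_i
    A negative Area_i means that processor's communication cost alone
    exceeds T, so it should be dropped and the split recomputed.
    """
    total = n * n
    t = (total + sum(c / o for c, o in zip(comm, oper))) / sum(1.0 / o for o in oper)
    areas = [(t - c) / o for c, o in zip(comm, oper)]
    return t, areas

# Made-up forecasts: seconds per point, and seconds of communication.
oper = [1e-6, 2e-6, 4e-6]
comm = [0.05, 0.02, 0.01]
t, areas = balance_areas(1000, oper, comm)
print(round(t, 4), [round(a) for a in areas])
```

The fastest processor receives the largest strip, and the forecasts can be refreshed before each run so the partition follows current load.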

Page 15: Achieving Application  Performance on the Computational Grid

Good Predictions Promote Good Schedules

• Jacobi2D experiments

[Figure: Comparison of Execution Times. Execution time (seconds, 0 to 7) versus problem size (1000 to 2000). Series: Compile-time Blocked, Compile-time Irregular Strip, Runtime]

Page 16: Achieving Application  Performance on the Computational Grid

SARA: An AppLeS-in-Progress

• SARA = Synthetic Aperture Radar Atlas

– application developed at JPL and SDSC

• Goal: Assemble/process files for user’s desired image

– thumbnail image shown to user

– user selects desired bounding box for more detailed viewing

– SARA provides detailed image in variety of formats

Page 17: Achieving Application  Performance on the Computational Grid

Simple SARA

• AppLeS focuses on resource selection problem: Which site can deliver data the fastest?

• Goal is to optimize performance by minimizing transfer time

• Code developed by Alan Su

[Figure: one compute server connected to three data servers]

Computation servers and data servers are logical entities, not necessarily different nodes

Network shared by variable number of users

Computation assumed to be done at compute servers
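The resource selection step can be sketched as choosing the data server with the smallest predicted transfer time. In the Python below, the site names, the bandwidth and latency forecasts, and the latency-plus-size-over-bandwidth model are all assumptions for illustration; they are not taken from the Simple SARA code.

```python
def predicted_transfer_time(size_bytes, bandwidth_bps, latency_s):
    """Simple model: one latency delay plus size over forecast bandwidth."""
    return latency_s + size_bytes * 8 / bandwidth_bps

def pick_server(size_bytes, forecasts):
    """forecasts maps site name -> (bandwidth in bits/s, latency in s).
    Returns the site with the smallest predicted transfer time."""
    return min(
        forecasts,
        key=lambda site: predicted_transfer_time(size_bytes, *forecasts[site]),
    )

# Hypothetical forecasts for two sites holding the same 3-megabyte file.
forecasts = {
    "site-a": (2e6, 0.05),  # lower latency, less bandwidth
    "site-b": (5e6, 0.20),  # higher latency, more bandwidth
}
print(pick_server(3_000_000, forecasts))
```

For files of this size bandwidth dominates, so the higher-latency site is chosen; for a very small file the ranking can flip.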

Page 18: Achieving Application  Performance on the Computational Grid

Experimental Setup

• Data for image accessed over shared networks

• Data sets 1.4 - 3 megabytes, representative of SARA file sizes

• Servers used for experiments

– lolland.cc.gatech.edu

– sitar.cs.uiuc

– perigee.chpc.utah.edu

– mead2.uwashington.edu

– spin.cacr.caltech.edu

via vBNS

via general Internet

Page 20: Achieving Application  Performance on the Computational Grid

Which is “Closer”?

• Sites on the east coast or sites on the west coast?

• Sites on the vBNS or sites on the general Internet?

• Consistently the same site or different sites at different times?

Depends a lot on traffic ...

Page 21: Achieving Application  Performance on the Computational Grid

Preliminary Results

• Experiment with larger data set (3 Mbytes)

• During this time frame, the general Internet provides data mostly faster than the vBNS

Page 22: Achieving Application  Performance on the Computational Grid

More Preliminary Results

• Experiment with smaller data set (1.4 Mbytes)

• During this time frame, east coast sites provide data mostly faster than west coast sites

Page 23: Achieving Application  Performance on the Computational Grid

9/21/98 Experiments

• Clinton Grand Jury webcast commenced at trial 62

Page 24: Achieving Application  Performance on the Computational Grid

What if File Sizes are Larger?

Storage Resource Broker (SRB)

• SRB provides access to distributed, heterogeneous storage systems

– UNIX, HPSS, DB2, Oracle, ...

– files can be 16MB or larger

– resources accessed via a common SRB interface

Page 25: Achieving Application  Performance on the Computational Grid

An SRB AppLeS

[Figure: SRB AppLeS architecture. Labeled components: SRB Client, AppLeS, Network Weather Service, Network, SRB Server, MCAT, Distributed Physical Storage]

• Being developed by Marcio Faerman

• Like Simple SARA, the SRB AppLeS focuses on resource selection

• NWS probe is 64K, SRB file size is 16MB

• How to predict SRB file transfer time?

Page 26: Achieving Application  Performance on the Computational Grid

Predicting Large File Transfer Times

NWS and SRB present distinct behaviors

[Figure: two charts titled “Bandwidth x Time, Washington St. Louis - UCSD”. Bandwidth (Mbits/s, 0.0 to 3.0) versus time, with date labels Dec 3 and Dec 11. One chart plots SRB, NWS, and “Predicted” SRB; the other plots SRB and NWS]

Current approach: Use linear regression on NWS bandwidth measurements to track SRB behavior
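A minimal sketch of the regression idea: fit a line that maps NWS probe bandwidth to observed SRB bandwidth, then use the fitted line to turn a fresh NWS reading into a predicted transfer time. The paired sample values and function names below are invented for illustration; the slides do not give the actual data or the fitted coefficients.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys ~ slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict_transfer_time(file_mbits, nws_mbps, slope, intercept):
    """Map an NWS reading to predicted SRB bandwidth, then to seconds."""
    srb_mbps = slope * nws_mbps + intercept
    return file_mbits / srb_mbps

# Invented paired measurements (Mbits/s): NWS probe vs. SRB transfer.
nws = [1.0, 2.0, 3.0]
srb = [0.6, 1.1, 1.6]
slope, intercept = fit_line(nws, srb)

# A 16 MB file is 128 Mbits; predict its transfer time at NWS = 2.5.
seconds = predict_transfer_time(128.0, 2.5, slope, intercept)
print(round(slope, 3), round(intercept, 3), round(seconds, 1))
```

The fit should be refreshed as new paired measurements arrive, since the relationship between small probes and large transfers can drift with network conditions.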

Page 27: Achieving Application  Performance on the Computational Grid

Distributed Data Applications

[Figure: a client, a set of compute servers, and a set of data servers]

Move the computation or move the data?

Which compute servers to use?

Which servers to use for multiple files?

• Simple SARA and SRB representative of a larger class of distributed data applications

• Goal is to develop AppLeS scheduler for “end-to-end” applications

Page 28: Achieving Application  Performance on the Computational Grid

A Bushel of AppLeS … almost

• During the first “phase” of the project, we’ve focused on developing AppLeS applications

– Jacobi2D

– DOT

– SRB

– Simple SARA

– magnetohydrodynamics

– CompLib

– INS2D

– Tomography, ...

• What have we learned?

Page 29: Achieving Application  Performance on the Computational Grid

Lessons Learned From AppLeS

• Dynamic information is critical.

[Figure: Jacobi2D. Compile-time Blocked Partitioning versus Run-time AppLeS Non-Uniform Strip Partitioning]

Page 30: Achieving Application  Performance on the Computational Grid

Lessons Learned from AppLeS

• Program execution and parameters may exhibit a range of performance

Page 31: Achieving Application  Performance on the Computational Grid

Lessons Learned from AppLeS

• Knowing something about the “goodness” of performance predictions can improve scheduling

[Figure: Execution time. Time (s, 0 to 350) versus problem size (Small, Medium, Large). Series: SuperAppLeS, AppLeS, Mentat. Label: SOR CompLib]

Page 32: Achieving Application  Performance on the Computational Grid

Lessons Learned from AppLeS

• Performance of application sensitive to scheduling policy, data, and system characteristics

Page 33: Achieving Application  Performance on the Computational Grid

Achieving Application Performance on the Grid

• AppLeS uses adaptivity to leverage deliverable resource performance

• Performance impact of all components considered

• AppLeS agents target dynamic, multi-user distributed environments

• AppLeS is a leading project in application scheduling

Page 34: Achieving Application  Performance on the Computational Grid

Related Work

• Application Schedulers

– Mars, Prophet/Gallop, VDCE, ...

• Scheduling Services

– Globus GRAM

• Resource Allocators

– I-Soft, PBS, LSF, Maui Scheduler, Nile, Legion

• PSEs

– Nimrod, NEOS, NetSolve, Ninf

• High-Throughput Schedulers

– Condor

• Performance Steering

– Autopilot, SciRun

Page 35: Achieving Application  Performance on the Computational Grid

New Directions

• AppLeS Templates

– distributed data applications

– parameter sweeps

– master/slave applications

– data parallel stencil applications

[Figure: AppLeS Template Retargeting Engineering Environment. Application Module, Performance Module, Scheduling Module, and Deployment Module joined by APIs. Other labels: Network Weather Service, dynamic benchmarking, suite selection]

Page 36: Achieving Application  Performance on the Computational Grid

New Directions

• Expanding AppLeS target execution sites

– interactive clusters

• Linux, NT

– Globus, Legion

– batch systems

– high-throughput clusters (Condor)

– all of the above

[Figure labels: SCHED, AppLeS]

Page 37: Achieving Application  Performance on the Computational Grid

New Directions

• Real World Scheduling

• scheduling with

– partial information

– poor information

– dynamically changing information

• Multischeduling

• resource economies

• scheduling “social structure”

Page 38: Achieving Application  Performance on the Computational Grid

The Brave New World

• Design, development, and execution of grid-aware applications

[Figure: Grid Application Development System. Labeled components: PSE, Config. object program, whole program compiler, Source application, libraries, Realtime perf monitor, Dynamic optimizer, Grid runtime system, negotiation, Software components, Service negotiator, Scheduler, Performance feedback, Perf problem]

Page 39: Achieving Application  Performance on the Computational Grid

The AppLeS Project

• AppLeS Corps:

– Fran Berman, UCSD

– Rich Wolski, U. Tenn

– Henri Casanova

– Walfredo Cirne

– Marcio Faerman

– Jaime Frey

– Jim Hayes

– Graziano Obertelli

– Gary Shao

– Shava Smallen

– Alan Su

– Dmitrii Zagorodnov

• Thanks to NSF, NASA, NPACI, DARPA, DoD

• AppLeS Home Page: http://www-cse.ucsd.edu/groups/hpcl/apples.html