Allen D. Malony, Sameer Shende {malony,shende}@cs.uoregon.edu

Department of Computer and Information Science

Computational Science Institute

University of Oregon

Performance Engineering Technology for Complex Scientific Component Software

June 24, 2002 Argonne CCA Meeting

Outline

Complexity and performance technology
Developing performance interfaces for CCA
  Performance knowledge repository
  Performance observation
TAU performance system
Applications
Implementation
Concluding remarks

Problem Statement

How do we create robust and ubiquitous performance technology for the analysis and tuning of component software in the presence of (evolving) complexity challenges?

How do we apply performance technology effectively for the variety and diversity of performance problems that arise in the context of CCA components?

Extended Component Design

PKC: Performance Knowledge Component
POC: Performance Observability Component

[Figure: a generic component extended with a PKC and a POC]

Performance Knowledge

Describe and store a component's “known” performance
  Benchmark characterizations in performance database
  Empirical or analytical performance models
Saved information about component performance
  Use for performance-guided selection and deployment
  Use for runtime adaptation
Representation must be in common forms with standard means for accessing the performance information
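A minimal sketch of how stored performance knowledge could drive performance-guided selection. It is an illustration only: `PerformanceKnowledge`, `select_component`, the solver names, and all numbers are hypothetical and do not come from the talk or from any CCA specification.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class PerformanceKnowledge:
    """Hypothetical record of what is known about one component's performance."""
    component: str
    # Benchmark characterizations: problem size -> measured seconds.
    benchmarks: Dict[int, float] = field(default_factory=dict)
    # Optional analytical model: problem size -> predicted seconds.
    model: Optional[Callable[[int], float]] = None

    def predict(self, n: int) -> float:
        """Prefer a stored benchmark; otherwise fall back to the model."""
        if n in self.benchmarks:
            return self.benchmarks[n]
        if self.model is None:
            raise LookupError(f"no performance knowledge for n={n}")
        return self.model(n)


def select_component(candidates: List[PerformanceKnowledge], n: int) -> str:
    """Performance-guided selection: pick the candidate predicted fastest."""
    return min(candidates, key=lambda k: k.predict(n)).component


# Made-up knowledge for two interchangeable components.
dense = PerformanceKnowledge("DenseSolver", model=lambda n: 1e-9 * n ** 3)
sparse = PerformanceKnowledge("SparseSolver", benchmarks={100: 0.5},
                              model=lambda n: 1e-4 * n)
```

Calling `select_component([dense, sparse], n)` for different problem sizes shows how the same saved knowledge can support either deployment-time selection or runtime adaptation.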

Performance Knowledge Repository & Component

Component performance repository
  Implement in component architecture framework
  Similar to CCA component repository [Alexandria]
  Access by component infrastructure
View performance knowledge as component (PKC)
  PKC ports give access to performance knowledge
    to other components
    back to original component
  Static/dynamic component control and composition
  Component composition performance knowledge
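The idea of a PKC that exposes its knowledge through a port might be sketched as follows. This is a toy model under stated assumptions: `PerformanceKnowledgePort`, `PerformanceKnowledgeComponent`, `Framework`, and the metric name are all hypothetical, and the real CCA port mechanism is not reproduced here.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable


class PerformanceKnowledgePort(ABC):
    """Hypothetical 'provides' port through which knowledge is queried."""

    @abstractmethod
    def lookup(self, component: str, metric: str) -> float:
        ...


class PerformanceKnowledgeComponent(PerformanceKnowledgePort):
    """PKC sketch: an in-memory stand-in for a performance repository."""

    def __init__(self) -> None:
        self._store: Dict[str, Dict[str, float]] = {}

    def record(self, component: str, metric: str, value: float) -> None:
        self._store.setdefault(component, {})[metric] = value

    def lookup(self, component: str, metric: str) -> float:
        return self._store[component][metric]


class Framework:
    """Toy framework that hands a registered port to any requesting component."""

    def __init__(self) -> None:
        self._ports: Dict[str, object] = {}

    def add_provides_port(self, name: str, port: object) -> None:
        self._ports[name] = port

    def get_port(self, name: str) -> object:
        return self._ports[name]


def composition_estimate(port: PerformanceKnowledgePort,
                         components: Iterable[str],
                         metric: str = "seconds_per_call") -> float:
    """Composition knowledge as a naive sum over the composed components."""
    return sum(port.lookup(c, metric) for c in components)
```

The summation in `composition_estimate` is a deliberate simplification; a real composition model would also need to account for interaction between components.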

Performance Observation

Ability to observe execution performance is important
  Empirically-derived performance knowledge
    Does not require measurement integration in component
  Monitor during execution to make dynamic decisions
    Measurement integration is key
Performance observation integration
  Component integration: core and variant
  Runtime measurement and data collection
  On-line and off-line performance analysis

Performance Observation Component (POC)

Performance observation in a performance-engineered component model
Functional extension of original component design
  Include new component methods and ports for other components to access measured performance data
  Allow original component to access performance data
    Encapsulate as tightly coupled and co-resident performance observation object
    POC “provides” port allows use of optimized interfaces to access “internal” performance observations
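One way to picture a co-resident observation object is a small measurement wrapper that lives inside the component and offers a query method to others. The sketch below assumes wall-clock timing only; `PerformanceObservation`, `observe`, `query`, and `Integrator` are hypothetical names, not TAU or CCA interfaces.

```python
import time
from collections import defaultdict
from functools import wraps


class PerformanceObservation:
    """Hypothetical POC: accumulates call counts and inclusive wall time."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self._calls = defaultdict(int)
        self._seconds = defaultdict(float)

    def observe(self, func):
        """Decorator that measures each call to a component method."""
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = self._clock()
            try:
                return func(*args, **kwargs)
            finally:
                # Record the call even if the method raised.
                self._calls[func.__name__] += 1
                self._seconds[func.__name__] += self._clock() - start
        return wrapper

    def query(self, method):
        """Stand-in for a 'provides' port: read the measured data."""
        return {"calls": self._calls[method], "seconds": self._seconds[method]}


class Integrator:
    """Original component, extended with a co-resident observation object."""
    poc = PerformanceObservation()

    @poc.observe
    def step(self, n):
        return sum(i * i for i in range(n))
```

Both the owning component and any other component holding a reference to `Integrator.poc` can call `query("step")`, which mirrors the two access paths listed on the slide.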

Component Composition Performance Engineering

Performance of component-based scientific applications depends on interplay of component functions and the computational resources available

Management of component compositions throughout execution is critical to successful deployment and use

Identify key technological capabilities needed to support the performance engineering of component compositions

Two model concepts
  performance awareness
  performance attention

Performance Awareness of Component Ensembles

Composition performance knowledge and observation
Composition performance knowledge
  Can come from empirical and analytical evaluation
  Can utilize information provided at the component level
  Can be stored in repositories for future review
Extends the notion of component observation to ensemble-level performance monitoring
  Associate monitoring components with hierarchical component grouping
  Build upon component-level observation support
  Monitoring components act as performance integrators and routers
  Use component framework mechanisms
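The integrator-and-router role of monitoring components can be sketched with a small hierarchy of monitors, one per component group. Everything here is an assumption made for illustration: the `Monitor` class, its methods, and the component names are hypothetical.

```python
from typing import Callable, Dict, List


class Monitor:
    """Hypothetical monitoring component for one group of components.

    It integrates measurements reported by its members and child groups,
    and routes each update to any subscribed listeners.
    """

    def __init__(self, name: str) -> None:
        self.name = name
        self._members: Dict[str, float] = {}
        self._children: List["Monitor"] = []
        self._listeners: List[Callable[[str, float], None]] = []

    def add_child(self, child: "Monitor") -> None:
        self._children.append(child)

    def subscribe(self, listener: Callable[[str, float], None]) -> None:
        self._listeners.append(listener)

    def report(self, component: str, seconds: float) -> None:
        """Integrate a component-level observation, then route it onward."""
        self._members[component] = self._members.get(component, 0.0) + seconds
        for listener in self._listeners:
            listener(component, seconds)

    def total(self) -> float:
        """Ensemble-level time: own members plus all nested groups."""
        own = sum(self._members.values())
        return own + sum(child.total() for child in self._children)
```

Nesting monitors with `add_child` follows the hierarchical grouping of components, so the top-level `total()` gives one ensemble-wide figure built from component-level reports.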

Performance Engineered Component

Four parts
  Performance knowledge
    Characterization
    Model
  Performance observation
    Measurement
    Analysis
  Performance query
  Performance control
Extend component design for performance engineering
Keep consistent with CCA model
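To make the four parts concrete, here is a hedged sketch of a component interface with knowledge, observation, query, and control, plus a tiny adaptation rule that uses them. The interface, the `Smoother` component, its variants, and the cost figures are all hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class PerformanceEngineered(ABC):
    """Hypothetical interface grouping the four parts named on the slide."""

    @abstractmethod
    def knowledge(self) -> Dict[str, float]:
        """Characterization or model: expected seconds per variant."""

    @abstractmethod
    def observe(self, seconds: float) -> None:
        """Measurement: record one observed execution time."""

    @abstractmethod
    def query(self) -> float:
        """Analysis result offered to clients: mean observed time."""

    @abstractmethod
    def control(self, variant: str) -> None:
        """Steer the component by switching its implementation variant."""


class Smoother(PerformanceEngineered):
    def __init__(self) -> None:
        self.variant = "jacobi"
        self._samples: List[float] = []

    def knowledge(self) -> Dict[str, float]:
        # Made-up expected cost per sweep for each variant.
        return {"jacobi": 1.0, "gauss-seidel": 0.6}

    def observe(self, seconds: float) -> None:
        self._samples.append(seconds)

    def query(self) -> float:
        return sum(self._samples) / len(self._samples) if self._samples else 0.0

    def control(self, variant: str) -> None:
        if variant not in self.knowledge():
            raise ValueError(f"unknown variant: {variant}")
        self.variant = variant
        self._samples.clear()  # old samples describe the old variant


def adapt(component: Smoother, tolerance: float = 1.5) -> str:
    """Switch variant when observed time exceeds expectation by `tolerance`."""
    expected = component.knowledge()
    if component.query() > tolerance * expected[component.variant]:
        component.control(min(expected, key=expected.get))
    return component.variant
```

The `adapt` function is deliberately simple: it compares the queried observation against stored knowledge and uses the control part to react, which is the loop the four parts are meant to enable.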

TAU Performance System Framework

Tuning and Analysis Utilities
Performance system framework for scalable parallel and distributed high-performance computing
Targets a general complex system computation model
  nodes / contexts / threads
  Multi-level: system / software / parallelism
  Measurement and analysis abstraction
Integrated toolkit for performance instrumentation, measurement, analysis, and visualization
  Portable, configurable performance profiling/tracing facility
  Open software approach
University of Oregon, LANL, FZJ Germany
http://www.cs.uoregon.edu/research/paracomp/tau

General Complex System Computation Model

Node: physically distinct shared memory machine
  Message passing node interconnection network
Context: distinct virtual memory space within node
Thread: execution threads (user/system) in context

[Figure: physical view (SMP nodes, each with node memory, joined by an interconnection network) alongside the model view (nodes containing contexts, each a VM space holding threads, with inter-node message communication)]
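The node / context / thread model suggests a natural key for measurement data. The sketch below stores per-thread event times under a (node, context, thread) triple and aggregates them per node; the `Profile` class and the event name used in the example are hypothetical and are not TAU's internal data structures.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Profile:
    """Per-thread measurements, keyed by (node, context, thread)."""
    data: Dict[Tuple[int, int, int], Dict[str, float]] = field(default_factory=dict)

    def add(self, node: int, context: int, thread: int,
            event: str, seconds: float) -> None:
        """Accumulate time for one event on one thread."""
        events = self.data.setdefault((node, context, thread), {})
        events[event] = events.get(event, 0.0) + seconds

    def by_node(self, event: str) -> Dict[int, float]:
        """Aggregate one event's time over all contexts and threads per node."""
        totals: Dict[int, float] = {}
        for (node, _context, _thread), events in self.data.items():
            totals[node] = totals.get(node, 0.0) + events.get(event, 0.0)
        return totals
```

Keeping the triple as the key lets the same data be viewed at thread, context, or node level without storing it more than once.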

TAU Performance System Architecture

[Figure: TAU performance system architecture; surviving labels: EPILOG, Paraver]

TAU Status

Instrumentation supported:
  Source, preprocessor, compiler, MPI, runtime, virtual machine
Languages supported:
  C++, C, F90, Java, Python, HPF, ZPL, HPC++, pC++ ...
Packages supported:
  PAPI [UTK], PCL [FZJ] (hardware performance counter access)
  Opari, PDT [UO, LANL, FZJ], DyninstAPI [U. Maryland] (instrumentation)
  EXPERT, EPILOG [FZJ], Vampir [Pallas], Paraver [CEPBA] (visualization)
Platforms supported:
  IBM SP, SGI Origin, Sun, HP Superdome, Compaq ES, Linux clusters (IA-32, IA-64, PowerPC, Alpha), Apple, Windows, Hitachi SR8000, NEC SX, Cray T3E ...
Compiler suites supported:
  GNU, Intel KAI (KCC, KAP/Pro), Intel, SGI, IBM, Compaq, HP, Fujitsu, Hitachi, Sun, Apple, Microsoft, NEC, Cray, PGI, Absoft, …
Thread libraries supported:
  Pthreads, SGI sproc, OpenMP, Windows, Java, SMARTS

Program Database Toolkit

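[Figure: PDT workflow. Application / library source passes through the C/C++ parser or the Fortran 77/90 parser to IL, then through the C/C++ IL analyzer or the Fortran 77/90 IL analyzer to program database files. DUCTAPE gives tools access to those files: PDBhtml (program documentation), SILOON (application component glue), CHASM (C++ / F90 interoperability), TAU_instr (automatic source instrumentation)]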

Program Database Toolkit (PDT)

Program code analysis framework for developing source-based tools for C99, C++ and F90
High-level interface to source code information
Widely portable:
  IBM, SGI, Compaq, HP, Sun, Linux clusters, Windows, Apple, Hitachi, Cray T3E ...
Integrated toolkit for source code parsing, database creation, and database query
  commercial grade front end parsers (EDG for C99/C++, Mutek for F90)
  Intel/KAI C++ headers for std. C++ library distributed with PDT
  portable IL analyzer, database format, and access API
  open software approach for tool development
Target and integrate multiple source languages
Used in CCA for automated generation of SIDL
Used in TAU to build automated performance instrumentation tools (tau_instrumentor)
Can be used to generate code for performance ports in CCA

Performance Database Framework

[Figure: performance database framework; labels: raw performance data, PerfDML data description, PerfDML translators, ORDB (PostgreSQL), performance analysis and query toolkit, performance analysis programs]

• XML profile data representation

• Multiple experiment performance database
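A rough sketch of the translate-then-query flow: an XML profile is parsed and loaded into a relational table, and a query then runs across the stored trial data. Several assumptions apply. The XML layout and table schema are invented for illustration (the actual PerfDML schema is not shown in the slides), and Python's built-in `sqlite3` is swapped in for PostgreSQL so the example stays self-contained.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical XML layout; NOT the real PerfDML format.
TRIAL_XML = """
<trial experiment="scaling" name="p4">
  <event name="solve" node="0" context="0" thread="0" seconds="2.0"/>
  <event name="solve" node="1" context="0" thread="0" seconds="3.0"/>
  <event name="io" node="0" context="0" thread="0" seconds="0.5"/>
</trial>
"""


def open_database() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical profile table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE profile (experiment TEXT, trial TEXT, event TEXT, "
        "node INTEGER, context INTEGER, thread INTEGER, seconds REAL)"
    )
    return db


def load_trial(db: sqlite3.Connection, xml_text: str) -> int:
    """Translate one XML trial into table rows; return the row count."""
    trial = ET.fromstring(xml_text.strip())
    rows = [
        (trial.get("experiment"), trial.get("name"), e.get("name"),
         int(e.get("node")), int(e.get("context")), int(e.get("thread")),
         float(e.get("seconds")))
        for e in trial.iter("event")
    ]
    db.executemany("INSERT INTO profile VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    return len(rows)


def mean_seconds(db: sqlite3.Connection, experiment: str, event: str) -> float:
    """Average an event's time over every trial stored for an experiment."""
    row = db.execute(
        "SELECT AVG(seconds) FROM profile WHERE experiment = ? AND event = ?",
        (experiment, event),
    ).fetchone()
    return row[0]
```

Loading further trials with `load_trial` and repeating the query is what makes the store a multiple-experiment database rather than a single profile dump.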

Empirical-Based Performance Optimization Processes

[Figure: performance optimization cycle of performance observation, performance experimentation, performance diagnosis, and performance tuning, with characterization, properties, and hypotheses as the labeled links; supported by experiment schemas, PerfESL scripts, and experiment trials; backed by the ORDB holding performance data and meta-data, the performance query & analysis toolkit, performance analysis programs, PerfDML translators, raw performance data, and the PerfDML data description]

[Figure: Integrated Performance Evaluation Environment, combining the TAU performance system with PerfDBF. Surviving TAU labels: instrumentation stages (source code, pre-processor, instrumented source code, compiler, object code, linker, executable code, dynamic, virtual machine, binary rewrite, TAU API); run-time library modules (profile groups, function database, statistics, function callstack, hardware counters, user-level timers); PROFILE output (profiling data files, pprof, ASCII report, Racy, jRacy); TRACE output (event traces, event tables, merge, convert, trace logs, Vampir); URV]

Applications: VTF (ASCI ASAP Caltech)

C++, C, F90, Python
PDT, MPI

Applications: SAMRAI (LLNL)

C++
PDT, MPI
SAMRAI timers (groups)

Applications: Uintah (U. Utah ASCI L1 Center)

C++
Mapping performance data, EXPARE experiment system
MPI, sproc

Applications: Uintah (U. Utah)

TAU uses SCIRun [U. Utah] for visualization of performance data (online/offline)

Applications: Uintah (contd.)

Scalability analysis

Implementation

We need the CCA Forum to help:
  standardize component performance knowledge repository specification to facilitate sharing
  define protocols for accessing performance data
  define the interface for performance ports
  support this effort
Prototype implementation using TAU
Identify target CCA projects

Concluding Remarks

Complex component systems pose challenging performance analysis problems that require robust methodologies and tools

New performance problems will arise
  Instrumentation and measurement
  Data analysis and presentation
  Diagnosis and tuning
Performance engineered components
  Performance knowledge, observation, query and control

Support Acknowledgement

TAU and PDT support:
  Department of Energy (DOE)
    DOE 2000 ACTS contract
    DOE MICS contract
    DOE ASCI Level 3 (LANL, LLNL)
    U. of Utah DOE ASCI Level 1 subcontract
  DARPA
  NSF National Young Investigator (NYI) award