Sameer Shende

Department of Computer and Information Science

NeuroInformatics Center

University of Oregon

Generating Proxy Components using PDT

Apr. 15, 2004 Boulder CCA Meeting

Outline

- Overview of the TAU and PDT projects
- Proxy Component
- Auto-generation of proxies
- Applications
- Concluding remarks


TAU Performance System Framework

- Tuning and Analysis Utilities
- Performance system framework for scalable parallel and distributed high-performance computing
- Targets a general complex system computation model
  - nodes / contexts / threads
  - Multi-level: system / software / parallelism
  - Measurement and analysis abstraction
- Integrated toolkit for performance instrumentation, measurement, analysis, and visualization
  - Portable, configurable performance profiling/tracing facility
  - Open software approach
- University of Oregon, LANL, FZJ Germany
- http://www.cs.uoregon.edu/research/paracomp/tau


TAU Performance System Architecture

[Figure: TAU performance system architecture diagram; recoverable labels: EPILOG, Paraver]


TAU’s Paraprof Profile Browser (ESMF Data)


Callpath Profiling in TAU


Program Database Toolkit

[Figure: PDT architecture. Component source / library -> C / C++ parser and Fortran 77/90/95 parser -> IL -> C / C++ IL analyzer and Fortran 77/90/95 IL analyzer -> Program Database Files -> DUCTAPE -> tools: tau_pg (proxy component), SILOON (application component glue), CHASM (C++ / F90 interoperability), TAU_instr (automatic source instrumentation)]


Program Database Toolkit (PDT)

- Program code analysis framework for developing source-based tools for C99, C++ and F90
- High-level interface to source code information
- Widely portable: IBM (AIX, Linux Power4), SGI, Compaq, HP, Sun, Linux clusters, Windows, Apple, Hitachi, Cray X1, T3E, RedStorm...
- Integrated toolkit for source code parsing, database creation, and database query
  - commercial grade front end parsers
    - EDG for C99/C++
    - Mutek Solutions for F90
    - Cleanscape Flint Parser for F77/F90/F95
  - Intel/KAI C++ headers for std. C++ library distributed with PDT
  - portable IL analyzer, database format, and access API
  - open software approach for tool development
- Target and integrate multiple source languages
- Used in TAU to build automated performance instrumentation tools
- Used in CHASM, XMLGEN, Component method signature extraction, …


CCA Performance Observation Component

- Design measurement port and measurement interfaces
  - Timer: start/stop, set name/type/group
  - Control: enable/disable groups
  - Query: get timer names, metrics, counters, dump to disk
  - Event: user-defined events


CCA C++ (CCAFFEINE) Performance Interface

namespace performance {
namespace ccaports {
  class Measurement : public virtual classic::gov::cca::Port {
  public:
    virtual ~Measurement() {}

    /* Create a Timer interface */
    virtual performance::Timer* createTimer(void) = 0;
    virtual performance::Timer* createTimer(string name) = 0;
    virtual performance::Timer* createTimer(string name, string type) = 0;
    virtual performance::Timer* createTimer(string name, string type, string group) = 0;

    /* Create a Query interface */
    virtual performance::Query* createQuery(void) = 0;

    /* Create a user-defined Event interface */
    virtual performance::Event* createEvent(void) = 0;
    virtual performance::Event* createEvent(string name) = 0;

    /* Create a Control interface for selectively enabling and disabling
     * the instrumentation based on groups */
    virtual performance::Control* createControl(void) = 0;
  };
}
}

Measurement port

Measurement interfaces


CCA Timer Interface Declaration

namespace performance {
  class Timer {
  public:
    virtual ~Timer() {}

    /* Implement methods in a derived class to provide functionality */

    /* Start and stop the Timer */
    virtual void start(void) = 0;
    virtual void stop(void) = 0;

    /* Set name and type for Timer */
    virtual void setName(string name) = 0;
    virtual string getName(void) = 0;
    virtual void setType(string name) = 0;
    virtual string getType(void) = 0;

    /* Set the group name and group type associated with the Timer */
    virtual void setGroupName(string name) = 0;
    virtual string getGroupName(void) = 0;

    virtual void setGroupId(unsigned long group) = 0;
    virtual unsigned long getGroupId(void) = 0;
  };
}

Timer interface methods
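The Timer interface only declares pure virtual methods, so a measurement tool has to supply a derived class. Below is a minimal sketch of one such class. `ChronoTimer`, its `calls()` and `seconds()` accessors, and the use of `std::chrono` are assumptions made for illustration; this is not TAU's implementation, and the interface is trimmed to the start/stop and name methods.

```cpp
#include <cassert>
#include <chrono>
#include <string>

namespace performance {
// Trimmed copy of the Timer interface above (type and group methods omitted).
class Timer {
public:
  virtual ~Timer() {}
  virtual void start() = 0;
  virtual void stop() = 0;
  virtual void setName(std::string name) = 0;
  virtual std::string getName() = 0;
};
}  // namespace performance

// Hypothetical implementation: accumulates wall-clock time across start/stop pairs.
class ChronoTimer : public performance::Timer {
public:
  explicit ChronoTimer(std::string name) : name_(name) {}

  void start() override {
    if (running_) return;  // ignore a nested start
    begin_ = std::chrono::steady_clock::now();
    running_ = true;
  }

  void stop() override {
    if (!running_) return;  // ignore a stop without a matching start
    total_ += std::chrono::steady_clock::now() - begin_;
    running_ = false;
    ++calls_;
  }

  void setName(std::string name) override { name_ = name; }
  std::string getName() override { return name_; }

  // Accessors that are not part of the interface on the slide.
  unsigned long calls() const { return calls_; }
  double seconds() const {
    return std::chrono::duration<double>(total_).count();
  }

private:
  std::string name_;
  std::chrono::steady_clock::time_point begin_;
  std::chrono::steady_clock::duration total_{0};
  bool running_ = false;
  unsigned long calls_ = 0;
};
```

A component would only ever see the `performance::Timer*` returned by `createTimer()`, so a class like this stays hidden behind the Measurement port.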


Use of Observation Component in CCA Example

#include "ports/Measurement_CCA.h"
...
double MonteCarloIntegrator::integrate(double lowBound, double upBound, int count) {
  classic::gov::cca::Port * port;
  double sum = 0.0;
  // Get Measurement port
  port = frameworkServices->getPort("MeasurementPort");
  if (port)
    measurement_m = dynamic_cast<performance::ccaports::Measurement *>(port);
  if (measurement_m == 0) {
    cerr << "Connected to something other than a Measurement port";
    return -1;
  }
  static performance::Timer* t = measurement_m->createTimer(string("IntegrateTimer"));
  t->start();
  for (int i = 0; i < count; i++) {
    double x = random_m->getRandomNumber();
    sum = sum + function_m->evaluate(x);
  }
  t->stop();
}
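The example pairs `t->start()` with `t->stop()` by hand, which is easy to break once a routine gains early returns. One possible way to keep the pair balanced is an RAII guard. The sketch below is an illustration only: `ScopedTimer`, `CountingTimer`, and `sumSquares` are hypothetical names, and the Timer interface is reduced to the two methods the guard needs.

```cpp
#include <cassert>

namespace performance {
// Only the two methods the guard needs; the full interface is shown earlier.
class Timer {
public:
  virtual ~Timer() {}
  virtual void start() = 0;
  virtual void stop() = 0;
};
}  // namespace performance

// Hypothetical guard: starts the timer on construction, stops it on scope exit.
class ScopedTimer {
public:
  explicit ScopedTimer(performance::Timer* t) : t_(t) {
    if (t_) t_->start();
  }
  ~ScopedTimer() {
    if (t_) t_->stop();
  }
  ScopedTimer(const ScopedTimer&) = delete;
  ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
  performance::Timer* t_;  // not owned; may be null when no port is connected
};

// Stand-in timer that only counts calls, so the guard can be checked.
class CountingTimer : public performance::Timer {
public:
  void start() override { ++starts; }
  void stop() override { ++stops; }
  int starts = 0;
  int stops = 0;
};

// Hypothetical instrumented routine with an early return.
double sumSquares(performance::Timer* t, int count) {
  ScopedTimer guard(t);
  if (count <= 0) return 0.0;  // the guard still stops the timer here
  double sum = 0.0;
  for (int i = 1; i <= count; i++) sum += static_cast<double>(i) * i;
  return sum;
}
```

Because the guard accepts a null pointer, the same routine runs unchanged when no Measurement port is connected.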


Measurement Port Implementation

- Use of Measurement port (i.e., instrumentation)
  - independent of choice of measurement tool
  - independent of choice of measurement type
- TAU performance observability component
  - Implements the Measurement port
  - Implements Timer, Control, Query, Event
  - Port can be registered with the CCAFFEINE framework
- Components instrument to generic Measurement port
  - Runtime selection of TAU component during execution
  - TauMeasurement_CCA port implementation uses a specific TAU library for choice of measurement type
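The point that instrumentation is independent of the measurement tool can be sketched with two interchangeable implementations behind one abstract port. Everything below is a simplified stand-in written for illustration: `NullMeasurement`, `CountingMeasurement`, `selectMeasurement`, and the `calls()` query are hypothetical, and a real framework would wire the port through its own connection mechanism rather than through a string lookup.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Simplified stand-ins for the Timer and Measurement interfaces.
struct Timer {
  virtual ~Timer() {}
  virtual void start() = 0;
  virtual void stop() = 0;
  virtual long calls() const = 0;  // hypothetical query, for the demo only
};

struct Measurement {
  virtual ~Measurement() {}
  virtual std::unique_ptr<Timer> createTimer(const std::string& name) = 0;
};

// Implementation 1: measures nothing.
struct NullTimer : Timer {
  void start() override {}
  void stop() override {}
  long calls() const override { return 0; }
};
struct NullMeasurement : Measurement {
  std::unique_ptr<Timer> createTimer(const std::string&) override {
    return std::unique_ptr<Timer>(new NullTimer);
  }
};

// Implementation 2: counts completed start/stop pairs.
struct CountingTimer : Timer {
  void start() override { running_ = true; }
  void stop() override {
    if (running_) { ++calls_; running_ = false; }
  }
  long calls() const override { return calls_; }
private:
  bool running_ = false;
  long calls_ = 0;
};
struct CountingMeasurement : Measurement {
  std::unique_ptr<Timer> createTimer(const std::string&) override {
    return std::unique_ptr<Timer>(new CountingTimer);
  }
};

// Runtime selection: the instrumented code below never changes.
std::unique_ptr<Measurement> selectMeasurement(const std::string& tool) {
  if (tool == "counting")
    return std::unique_ptr<Measurement>(new CountingMeasurement);
  return std::unique_ptr<Measurement>(new NullMeasurement);
}

// Component code written against the generic port only.
long instrumentedWork(Measurement& port) {
  std::unique_ptr<Timer> t = port.createTimer("work");
  assert(t != nullptr);
  for (int i = 0; i < 3; i++) {
    t->start();
    t->stop();
  }
  return t->calls();
}
```

`instrumentedWork` is compiled once; which implementation records the three start/stop pairs is decided by the argument passed to `selectMeasurement`.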


What’s Going On Here?

[Figure: application components and a performance component exchanging runtime TAU performance data. Callouts: two instrumentation paths using TAU API; two query and control paths using TAU API; alternative implementations of performance component (other API)]


Proxy Component

- Interpose a proxy component for each port
- Inside the proxy, track caller/callee invocations, timings
- Automate the process of proxy component creation
  - Using PDT for static analysis of components

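[Figure: component wiring diagram. Driver (Go port; uses IntegratorPort) -> IntegratorProxy component (provides and uses IntegratorPort; uses MeasurementPort) -> MidpointIntegrator (provides IntegratorPort); the proxy's MeasurementPort connects to the Performance component]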
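The interposition idea can be sketched in a few lines: the proxy provides the same port as the real component, forwards every call to it, and records a call count and elapsed time around the forwarded call. This is a hand-written illustration, not output of the proxy generator. The `IntegratorPort` shape is assumed from the `integrate()` signature shown earlier, the `MidpointIntegrator` body is a stand-in that integrates f(x) = x, and timing uses `std::chrono` directly where a generated proxy would go through the MeasurementPort.

```cpp
#include <cassert>
#include <chrono>

// Assumed shape of the port, following the integrate() signature shown earlier.
class IntegratorPort {
public:
  virtual ~IntegratorPort() {}
  virtual double integrate(double lowBound, double upBound, int count) = 0;
};

// Stand-in callee: midpoint rule applied to f(x) = x.
class MidpointIntegrator : public IntegratorPort {
public:
  double integrate(double lowBound, double upBound, int count) override {
    double h = (upBound - lowBound) / count;
    double sum = 0.0;
    for (int i = 0; i < count; i++) sum += (lowBound + (i + 0.5) * h) * h;
    return sum;
  }
};

// Proxy: provides IntegratorPort to the caller, uses IntegratorPort of the callee.
class IntegratorProxy : public IntegratorPort {
public:
  explicit IntegratorProxy(IntegratorPort* callee) : callee_(callee) {}

  double integrate(double lowBound, double upBound, int count) override {
    auto begin = std::chrono::steady_clock::now();
    double result = callee_->integrate(lowBound, upBound, count);  // forward
    elapsed_ += std::chrono::steady_clock::now() - begin;
    ++calls_;
    return result;
  }

  long calls() const { return calls_; }
  double seconds() const {
    return std::chrono::duration<double>(elapsed_).count();
  }

private:
  IntegratorPort* callee_;  // not owned
  std::chrono::steady_clock::duration elapsed_{0};
  long calls_ = 0;
};
```

The caller holds an `IntegratorPort*` either way, so swapping the proxy in or out requires no change to the driver or to the integrator.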


TAU’s Proxy Generator for Classic C++ Interface

Proxy generator arguments:
  -p <port name>
  -t <type>
  -c <component>
  -d <PDB file>
  -o <output file>
  -f <selective instrumentation file>
  -x <Component tag>

e.g.,

% tau_pg -c integrators::ccaports::Integrator -t integrators.ccaports.Integrator -p IntegratorPort -d ParallelIntegrator_CCA.pdb -o Proxy.cc -h ports/Integrator_CCA.h -f select.dat -x ParallelInt

Creating PDB file:

% cxxparse <file.cpp> -I<dir> -D<flags>

creates file.pdb.

% pdbmerge -o merged.pdb file1.pdb file2.pdb …


Selective Instrumentation

Exclude or include list of routines

% tau_pg … -f select.dat

% cat select.dat

# Selective instrumentation: Specify an exclude/include list of routines/files.

BEGIN_EXCLUDE_LIST

void quicksort(int *, int, int)

void sort_5elements(int *)

void interchange(int *, int *)

END_EXCLUDE_LIST

# or use BEGIN_INCLUDE_LIST END_INCLUDE_LIST to bracket the event names

# Instruments routines in Main.cpp, Foo?.c and *.C files only

# Use BEGIN_[FILE]_INCLUDE_LIST with END_[FILE]_INCLUDE_LIST


Flame Reaction-Diffusion Demonstration

CCAFFEINE


CFRFS Profiles using Proxy Generator

NODE 0;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time  Exclusive    Inclusive    #Call   #Subrs  Inclusive Name
            msec   total msec                    usec/call
---------------------------------------------------------------------------------------
100.0      3,374     1:40.763        1        7  100763742 int main(int, char **)
 95.8      1,177     1:36.525        1      391   96525455 driver_proxy::go()
 48.6      9,023       48,947       15     9640    3263162 rk2_proxy::Advance()
 44.8     43,914       45,151        7      138    6450244 ee_proxy::Regrid()
 34.3      3,368       34,559      594     7129      58180 flux_proxy::compute()
 21.7     21,862       21,862     1188        0      18403 sc_proxy::compute()
  9.0      9,089        9,089     1188        0       7651 efm_proxy::compute()
  3.7      2,216        3,707      210    11250      17655 grace_proxy::GC_Synch()
  1.2        841        1,225        3     1496     408607 grace_proxy::GC_regrid_above_threshold()
  1.0        980          980      123        0       7970 icc_proxy::restrict()
  0.9        943          943     4460        0        212 icc_proxy::prolong()
  0.9        863          863        1       39     863722 MPI_Init()
  0.8        772          772    16764        0         46 TAU_GET_FUNCTION_VALUES()
  0.6        613          614      869      869        707 MPI_Isend()
  0.4        432          436       15       24      29093 c_proxy::compute()
  0.4        288          409       30      120      13665 stats_proxy::compute()
  0.4        393          393      954        0        413 MPI_Waitsome()
  0.2        217          217        6       18      36282 MPI_Comm_dup()
  0.2        182          182      215        0        849 MPI_Allreduce()
  0.1          3          126       15       75       8402 rk2_proxy::GetStableTimestep()
  0.1        101          120       15       45       8062 compute void (VectorFieldVariable *)
  0.1         62           62      533        0        118 bc_proxy::compute()
  0.0         17           17        3        0       5729 MPI_Barrier()
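The derived columns of the profile can be checked with a little arithmetic: %Time is a routine's inclusive time divided by the inclusive time of `main`, and usec/call is the inclusive time divided by the number of calls. The helper names below are hypothetical, and the inputs are the rounded millisecond values printed in the profile, so results agree with the printed columns only approximately.

```cpp
#include <cassert>
#include <cmath>

// Convert a "minutes:seconds" reading such as 1:40.763 to milliseconds.
double toMsec(int minutes, double seconds) {
  return (minutes * 60 + seconds) * 1000.0;
}

// Inclusive microseconds per call, from inclusive milliseconds and call count.
double usecPerCall(double inclusiveMsec, long calls) {
  assert(calls > 0);
  return inclusiveMsec * 1000.0 / calls;
}

// Share of the total run, in percent.
double percentOfTotal(double inclusiveMsec, double totalMsec) {
  assert(totalMsec > 0.0);
  return 100.0 * inclusiveMsec / totalMsec;
}
```

For `flux_proxy::compute()`, `usecPerCall(34559.0, 594)` comes to about 58180, matching the printed value. For `rk2_proxy::Advance()`, `usecPerCall(48947.0, 15)` gives about 3263133 against the printed 3263162; the gap comes from the millisecond rounding of the inclusive column.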


Performance Modeling

- Use Mastermind Component [IPDPS'04] with Measurement Component to track the arguments of each invocation
  - Proxy for Mastermind component currently tracks all methods.
- Develop performance models based on measurements
- Specify performance models to a model evaluator library (being developed at UO/Sandia [Nick Trebon, J. Ray]) for evaluating performance models of component ensembles
  - Specifying performance models: research
- Performance Database ties in historical performance data
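As an illustration of developing a model from measurements, the sketch below fits a straight line T(n) = a + b·n to (argument size, time) samples by ordinary least squares. The linear form, the `Sample` and `LinearModel` types, and `fitLinear` are all assumptions chosen for the example; the talk does not state which model forms or fitting methods are used.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// One measurement: an argument value (e.g. array size) and the time observed.
struct Sample {
  double n;
  double seconds;
};

// Hypothetical model form: T(n) = a + b * n.
struct LinearModel {
  double a;
  double b;
  double predict(double n) const { return a + b * n; }
};

// Ordinary least-squares fit of a line through the samples.
LinearModel fitLinear(const std::vector<Sample>& samples) {
  double sn = 0.0, st = 0.0, snn = 0.0, snt = 0.0;
  for (const Sample& s : samples) {
    sn += s.n;
    st += s.seconds;
    snn += s.n * s.n;
    snt += s.n * s.seconds;
  }
  double m = static_cast<double>(samples.size());
  double denom = m * snn - sn * sn;
  assert(denom != 0.0);  // needs at least two distinct argument values
  LinearModel model;
  model.b = (m * snt - sn * st) / denom;
  model.a = (st - model.b * sn) / m;
  return model;
}
```

Given samples recorded per invocation by a proxy, `fitLinear(samples).predict(n)` would estimate the cost of a call with a new argument value, under the assumption that the cost really is linear in n.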


TAU Performance Database Framework

[Figure: raw performance data -> PerfDML translators -> performance data description -> PerfDB (ORDB: PostgreSQL) -> performance analysis and query toolkit -> performance analysis programs and other tools]

- profile data only
- XML representation
- project / experiment / trial


Proxy Generator for other Applications

- PDT based proxy component for:
  - QoS tracking [Boyana, ANL]
  - Debugging Port Monitor (tracks arguments)
  - SCIRun2 Perfume components [Venkat, U. Utah]
- Exploring Babel for auto-generation of proxies:
  - Direct SIDL to proxy code generation
  - Generating client component interface in C++, using PDT for generating proxies


Concluding Remarks

Complex component systems pose challenging performance analysis problems that require robust methodologies and tools

- Automating Instrumentation of Component Software
  - Performance Measurement
  - Performance Prediction
  - Debugging
  - Performance-aware (QoS) intelligent components
- Performance engineered components
  - Performance knowledge, observation, query and control

Support Acknowledgement

- TAU and PDT support:
  - Department of Energy (DOE)
    - DOE MICS contracts
    - DOE ASCI Level 3 (LANL, LLNL)
    - U. of Utah DOE ASCI Level 1 subcontract
  - NSF National Young Investigator (NYI) award