Scalability Study of S3D using TAU Sameer Shende tau-team@cs.uoregon.edu

Post on 19-Dec-2015


Page 1: Scalability Study of S3D using TAU Sameer Shende tau-team@cs.uoregon.edu


TAU Performance System - S3D Scalability Study

Acknowledgements

Alan Morris [UO]
Kevin Huck [UO]
Allen D. Malony [UO]
Kenneth Roche [ORNL]
Bronis R. de Supinski [LLNL]

The performance data presented here is available at:

http://www.cs.uoregon.edu/research/tau/s3d

TAU Parallel Performance System

http://www.cs.uoregon.edu/research/tau/

Multi-level performance instrumentation
  Multi-language automatic source instrumentation
Flexible and configurable performance measurement
Widely-ported parallel performance profiling system
  Computer system architectures and operating systems
  Different programming languages and compilers
Support for multiple parallel programming paradigms
  Multi-threading, message passing, mixed-mode, hybrid

Scalability Study

Harness testcase
Platform: Jaguar Cray XT3 at ORNL
  Processor counts: 1p, 8p, 64p, 512p
Goal: to evaluate scaling properties of code regions
  Scalability of MPI operations

Introduction to ParaProf: Main Window

click left mouse button

click right mouse button

% paraprof *.ppk
  Load all 1p, 8p, 64p, 512p profile datasets together

ParaProf: MFLOPs sorted by Exclusive Time

Source Code View

Comparison Window: Inclusive Time

Comparing Level 1 Data Cache Misses

CPU Resource Stalls

ParaProf: 3D view for 512 cpus - Jagged Edges!

MPI_Wait - Jagged Edges Seen in 3D Window

Pattern repeats every 8 cpus!

512 cpus
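One way to check for a repeating pattern like the one called out on this slide is to fold the per-rank MPI_Wait times by rank modulo 8 and compare the averages for each position. The sketch below is a minimal illustration in Python and is not part of TAU or ParaProf; the function name `fold_by_period` and the timings are hypothetical, synthetic values rather than measured S3D data.

```python
from statistics import mean


def fold_by_period(values, period):
    """Average values[i] over all ranks i that share the same i % period."""
    buckets = [[] for _ in range(period)]
    for rank, value in enumerate(values):
        buckets[rank % period].append(value)
    return [mean(bucket) for bucket in buckets]


# Synthetic (made-up) MPI_Wait times in seconds for 512 ranks:
# every 8th rank waits longer, mimicking a pattern with period 8.
wait_times = [30.0 if rank % 8 == 0 else 10.0 for rank in range(512)]

# One averaged value per position within the assumed period of 8.
profile = fold_by_period(wait_times, 8)
```

If the folded profile is far from flat, the per-rank variation is tied to the position within each group of 8 ranks, which is what the jagged edges in the 3D view suggest.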

MPI_Wait - Histogram (Bins) View

Comparing MPI_Wait

MPI_Wait time increases steadily with processors!

PerfDMF: Performance Data Mgmt. Framework

PerfExplorer - Comparative Analysis

Relative speedup, efficiency
  total runtime, by event, one event, by phase
Breakdown of total runtime
Group fraction of total runtime
Correlating events to total runtime
Timesteps per second

PerfExplorer

TAU's PerfDMF database

S3D

PerfExplorer: Select Experiment & Analysis

Relative Efficiency By Event

Relative Efficiency For S3D - Weak Scaling
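The quantity charted for a weak-scaling study can be sketched as follows. This assumes a common convention: with fixed work per processor, relative efficiency is the baseline runtime divided by the runtime at p processors, and scaled relative speedup is that efficiency multiplied by p / p0. The slides do not state PerfExplorer's exact definitions, so treat these formulas as an assumption. The function names and the runtimes are hypothetical, not measured S3D data.

```python
def weak_scaling_efficiency(runtimes, base_procs=None):
    """Return {procs: T(base) / T(procs)} for a weak-scaling series.

    runtimes maps processor count to wall-clock time in seconds.
    The smallest processor count is the baseline unless one is given.
    """
    if base_procs is None:
        base_procs = min(runtimes)
    base_time = runtimes[base_procs]
    return {p: base_time / t for p, t in sorted(runtimes.items())}


def scaled_speedup(runtimes, base_procs=None):
    """Return {procs: (procs / base) * efficiency} for the same series."""
    if base_procs is None:
        base_procs = min(runtimes)
    efficiency = weak_scaling_efficiency(runtimes, base_procs)
    return {p: (p / base_procs) * e for p, e in efficiency.items()}


# Hypothetical wall-clock times in seconds for the four processor counts.
runtimes = {1: 100.0, 8: 110.0, 64: 125.0, 512: 160.0}

efficiency = weak_scaling_efficiency(runtimes)
speedup = scaled_speedup(runtimes)
```

Under this convention an efficiency of 1.0 means the runtime did not grow as processors and problem size were scaled up together.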

Relative Speedup

Relative Efficiency & Speedup for One Event

Data Mining: Event Correlation to Total Time

r = 1 implies direct correlation
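The correlation coefficient r referred to here can be illustrated with the Pearson formula. This is a minimal sketch that assumes r is the standard Pearson coefficient between one event's time and the total runtime across the processor counts; the slides do not say how PerfExplorer computes it internally. The function name `pearson_r` and all timings are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical times in seconds at 1, 8, 64 and 512 processors.
total_time = [100.0, 110.0, 125.0, 160.0]
growing_event = [5.0, 15.0, 30.0, 65.0]    # rises in step with the total
shrinking_event = [8.0, 7.0, 5.5, 2.0]     # falls as the total rises

r_growing = pearson_r(growing_event, total_time)
r_shrinking = pearson_r(shrinking_event, total_time)
```

An event whose time rises in lockstep with the total runtime gives r close to 1, which makes it a candidate explanation for the loss of scalability; r close to -1 means the event's time falls as the total rises.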

MPI Scaling

Total Runtime Breakdown by Events

S3D - Building with TAU

Change name of compiler in build/make.XT3
  ftn => tau_f90.sh
  cc => tau_cc.sh

Set compile time environment variables
  setenv TAU_MAKEFILE /spin/proj/perc/TOOLS/tau_latest/xt3/lib/Makefile.tau-callpath-multiplecounters-mpi-papi-pdt-pgi
    Choose callpath, PAPI counters, MPI profiling, PDT for source instrumentation
  setenv TAU_OPTIONS '-optTauSelectFile=select.tau -optPreProcess'
    Selective instrumentation file eliminates instrumentation in lightweight routines
    Pre-process Fortran source code using cpp before compiling

Set runtime environment variables for instrumentation control and PAPI event counter selection in job submission script:
  export TAU_THROTTLE=1
  export COUNTER1=GET_TIME_OF_DAY
  export COUNTER2=PAPI_FP_INS
  export COUNTER3=PAPI_L1_DCM
  export COUNTER4=PAPI_RES_STL
  export COUNTER5=PAPI_L2_DCM

Selective Instrumentation in TAU

% cat select.tau
BEGIN_EXCLUDE_LIST
MCADIF
GETRATES
TRANSPORT_M::MCAVIS_NEW
MCEDIF
MCACON
CKYTCP
THERMCHEM_M::MIXCP
THERMCHEM_M::MIXENTH
THERMCHEM_M::GIBBSENRG_ALL_DIMT
CKRHOY
MCEVAL4
THERMCHEM_M::HIS
THERMCHEM_M::CPS
THERMCHEM_M::ENTROPY
END_EXCLUDE_LIST

BEGIN_INSTRUMENT_SECTION
loops routine="#"
END_INSTRUMENT_SECTION
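A selective instrumentation file with this layout could also be generated from a list of lightweight routines instead of being written by hand. The sketch below is a hypothetical helper, not a TAU utility; it emits only the directives that appear in the select.tau listing on this slide, and the function name `build_select_file` is an assumption made for illustration.

```python
def build_select_file(excluded_routines, instrument_loops=True):
    """Return the text of a select.tau-style file.

    excluded_routines: routine names to leave uninstrumented.
    instrument_loops:  if True, append the loop instrumentation section
                       exactly as it appears in the slide's listing.
    """
    lines = ["BEGIN_EXCLUDE_LIST"]
    lines.extend(excluded_routines)
    lines.append("END_EXCLUDE_LIST")
    if instrument_loops:
        lines += [
            "",
            "BEGIN_INSTRUMENT_SECTION",
            'loops routine="#"',
            "END_INSTRUMENT_SECTION",
        ]
    return "\n".join(lines) + "\n"


# A few of the routine names taken from the exclude list on this slide.
text = build_select_file(["MCADIF", "GETRATES", "THERMCHEM_M::MIXCP"])
```

Writing `text` to a file named select.tau would make it available to the -optTauSelectFile option named in TAU_OPTIONS.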

Getting Access to TAU on Jaguar

set path=(/spin/proj/perc/TOOLS/tau_latest/x86_64/bin $path)

Choose Stub Makefiles (TAU_MAKEFILE env. var.) from /spin/proj/perc/TOOLS/tau_latest/xt3/lib/Makefile.*
  Makefile.tau-mpi-pdt-pgi (flat profile)
  Makefile.tau-mpi-pdt-pgi-trace (event trace, for use with Vampir)
  Makefile.tau-callpath-mpi-pdt-pgi (single metric, callpath profile)

Binaries of S3D can be found in: ~sameer/scratch/S3D-BINARIES
  withtau
    papi, multiplecounters, mpi, pdt, pgi options
  without_tau

Concluding Discussion

Performance tools must be used effectively
More intelligent performance systems for productive use
  Evolve to application-specific performance technology
  Deal with scale by "full range" performance exploration
  Autonomic and integrated tools
  Knowledge-based and knowledge-driven process
Performance observation methods do not necessarily need to change in a fundamental sense
  More automatically controlled and efficiently used
Develop next-generation tools and deliver to community
Open source with support by ParaTools, Inc.
http://www.cs.uoregon.edu/research/tau

Support Acknowledgements

Department of Energy (DOE)
  Office of Science
  LLNL, LANL, ORNL, ASC
PERI