
Allen D. Malony [email protected]
Department of Computer and Information Science
TAU Performance Research Laboratory
University of Oregon
Discussion: How to Address Tools Scalability

IBM Petascale Tools
Scale and Scaling

What is meant by scale?
• Processors: execution concurrency / parallelism
• Memory: memory behavior, problem size
• Network: concurrent communications
• File system: parallel file operations / data size
Scaling refers to growth in the physical size / concurrency of the system.

What else?
• Program: code size / interacting modules
• Power: electrical power consumption
• Performance: potential computational power

Dimension: terascale … petascale … and beyond

Tools Scalability

Types of tools:
• Performance: analytical / simulation / empirical
• Debugging: detect / correct concurrency errors
• Programming: parallel languages / computation
• Compiling: parallel code / libraries
• Scheduling: systems allocation and launching

What does it mean for a tool to be scalable?
• Tool dependent (different problems and scaling aspects)
• What changes about the tool?
• Naturally scalable vs. a change in function / operation
• Is a paradigm shift required?
• To what extent is portability important?

What tools would you say are scalable? How? Why?

Focus – Parallel Performance Tools/Technology
• Tools for performance problem solving
• Empirical-based performance optimization process
• Performance technology concerns
[Diagram: the empirical performance optimization cycle. Performance observation feeds characterization into performance experimentation; experimentation produces hypotheses and properties for performance diagnosis, which drives performance tuning. The cycle is supported by performance technology: instrumentation, measurement, analysis, and visualization, together with experiment management and performance data storage.]

Large Scale Performance Problem Solving
• How does our view of this process change when we consider very large-scale parallel systems?
• What are the significant issues that will affect the technology used to support the process?
• Parallel performance observation is required
• In general, there is a concern for intrusion, seen as a tradeoff with performance diagnosis accuracy
• Scaling complicates observation and analysis
• The nature of application development may change
• What will enhance productive application development?
• Is a paradigm shift in performance process and technology required?

Instrumentation and Scaling
Make events visible to the measurement system.
• Direct instrumentation (code instrumentation)
    - Static instrumentation modifies code prior to execution: it does not get removed (it will always be executed), and source instrumentation may alter optimization
    - Dynamic instrumentation modifies code at runtime: it can be inserted and deleted at runtime, but incurs a runtime cost
• Indirect instrumentation generates events outside of the code
• Does scale affect the number of events?
• Runtime instrumentation is more difficult with scale, as it is affected by increased parallelism
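The static/dynamic distinction can be sketched in a few lines. The decorator below mimics source-level instrumentation: the wrapping code is compiled in before execution and always runs, while a runtime flag stands in for dynamic insertion/deletion. All names (`instrument`, `events`, `solve`) are illustrative, not taken from any particular tool.

```python
import time
from functools import wraps

events = []                 # global event buffer (per-thread in a real tool)
instrumentation_on = True   # runtime switch, mimicking dynamic insert/delete

def instrument(fn):
    """Source-level instrumentation: the wrapper is baked in and always runs."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        if not instrumentation_on:       # the guard itself is still a cost
            return fn(*args, **kwargs)
        events.append(("enter", fn.__name__, time.perf_counter()))
        try:
            return fn(*args, **kwargs)
        finally:
            events.append(("exit", fn.__name__, time.perf_counter()))
    return wrapper

@instrument
def solve(n):
    return sum(i * i for i in range(n))

solve(1000)   # records one enter event and one exit event
```

Even when the flag disables recording, the wrapper still executes, which is exactly the "does not get removed" property of static instrumentation.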

Measurement and Scaling

What makes performance measurement not scalable?
• More parallelism means more performance data overall
    - performance data specific to each thread of execution
    - a possible increase in the number of interactions between threads
• Harder to manage the data (memory, transfer, storage)
• Issues of performance intrusion
• Performance data size = number of events generated × metrics per event
• Are there really more events? Which are important?
• Control the number of events generated
• Control what is measured (to a point)
• Weigh the need for performance data against the cost of obtaining it
• Portability!
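The "events × metrics" sizing can be made concrete with back-of-envelope arithmetic. The function below is a sketch: the per-event header size, metric width, and the 16K-thread / one-million-event / four-metric workload are assumed numbers, not measurements.

```python
def trace_volume_bytes(threads, events_per_thread, metrics_per_event,
                       bytes_per_value=8, header_bytes=16):
    """Rough trace size: each event carries a fixed header plus its metrics."""
    per_event = header_bytes + metrics_per_event * bytes_per_value
    return threads * events_per_thread * per_event

# Assumed workload: 16K threads, 1M events each, 4 metrics per event
gib = trace_volume_bytes(16_384, 1_000_000, 4) / 2**30   # ~732 GiB of trace data
```

Even this modest per-thread event rate lands in the hundreds of gigabytes, which motivates controlling event generation at the source rather than after the fact.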

Measurement and Scaling (continued)

Consider “traditional” measurement methods:
• Profiling: summary statistics calculated during execution
• Tracing: time-stamped sequence of execution events
• Statistical sampling: indirect triggers, PC + metrics
• Monitoring: access to performance data at runtime
How does the performance data grow?
• How does per-thread profile / trace size grow?
• Consider communication
Strategies for scaling:
• Control performance data production and volume
• Change the measurement type or approach
• Event and/or measurement control: filtering, throttling, and sampling
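One concrete form of event control is throttling: stop recording an event once it has fired more than a threshold number of times (TAU, for instance, can disable events based on call counts). The class below is a minimal sketch with made-up event names; it is not any tool's actual API.

```python
class ThrottledProfiler:
    """Per-event throttling: after `limit` occurrences an event is no longer
    recorded, bounding the data volume of very frequent, cheap events."""
    def __init__(self, limit=5):
        self.limit = limit
        self.counts = {}      # every occurrence is still counted
        self.recorded = []    # but only the first `limit` are recorded
    def event(self, name, value):
        n = self.counts.get(name, 0) + 1
        self.counts[name] = n
        if n <= self.limit:
            self.recorded.append((name, value))

p = ThrottledProfiler(limit=3)
for i in range(100):
    p.event("MPI_Isend", i)    # very frequent event: mostly throttled
p.event("solver_step", 0.5)    # rare event: always recorded
```

Occurrence counts stay exact while the recorded volume is bounded, which is the tradeoff the slide's "control what is measured (to a point)" refers to.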

Concern for Performance Measurement Intrusion

• Performance measurement can affect the execution
    - perturbation of “actual” performance behavior
    - minor intrusion can lead to major execution effects
• Problems exist even with a small degree of parallelism
• Intrusion is an accepted consequence of standard practice
• Consider the intrusion (perturbation) caused by trace buffer overflow
• Scale exacerbates the problem … or does it?
    - traditional measurement techniques tend to be localized
    - this suggests scale may not compound local intrusion globally
    - measuring parallel interactions likely will be affected
• Use accepted measurement techniques intelligently
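A first step in reasoning about intrusion is knowing the floor on per-event cost: even an empty measurement probe pays at least for a timestamp read. This sketch estimates that cost and turns it into a crude intrusion fraction; the event count and run time in the comment are hypothetical numbers.

```python
import time

def clock_overhead(samples=10_000):
    """Estimate the cost of one timestamp read, a lower bound on per-event intrusion."""
    t0 = time.perf_counter()
    for _ in range(samples):
        time.perf_counter()
    return (time.perf_counter() - t0) / samples

def intrusion_fraction(num_events, per_event_cost, run_time):
    """Fraction of execution time spent inside measurement probes."""
    return num_events * per_event_cost / run_time

per_event = clock_overhead()
# e.g., 10^6 events at ~100 ns each against a 10 s run is ~1% intrusion
```

Because this cost is local to the thread generating the event, it supports the point above: per-thread intrusion need not compound globally, but probes placed inside parallel interactions will.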

Analysis and Visualization Scalability
How do we understand all the performance data collected?
Objectives:
• Meaningful performance results in meaningful forms
• Tools that are reasonably fast and responsive
• Integrated, interoperable, portable, …
What does “scalability” mean here?
• Large performance data size should not impact analysis tool use
• Data complexity should not overwhelm interpretation
• Results presentation should be understandable
• Tool integration and usability

Analysis and Visualization Scalability (continued)
• Online analysis and visualization: potential interference with the execution
• Single-experiment analysis versus multiple experiments
Strategies:
• Statistical analysis: data dimension reduction, clustering, correlation, …
• Scalable and semantic presentation methods: statistical, 3D, relating metrics to the physical domain
• Parallelization of analysis algorithms (e.g., trace analysis)
• Increased system resources for analysis / visualization tools
• Integration with performance modeling
• Integration with the parallel programming environment
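Of these strategies, clustering is easy to make concrete: treat each thread's profile as a vector of metrics and group similar vectors, so thousands of threads collapse into a few representative behaviors. Below is a minimal pure-Python k-means sketch (deterministic initialization from the first k points, a simplification); the two-cluster "worker vs. I/O rank" data is fabricated for illustration.

```python
def kmeans(points, k, iters=10):
    """Minimal k-means over per-thread profile vectors."""
    centers = [list(p) for p in points[:k]]   # simplistic deterministic init
    groups = [[] for _ in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:                       # assign each point to nearest center
            nearest = min(range(k),
                          key=lambda c: sum((a - b) ** 2
                                            for a, b in zip(p, centers[c])))
            groups[nearest].append(p)
        centers = [[sum(col) / len(g) for col in zip(*g)] if g else centers[i]
                   for i, g in enumerate(groups)]   # recompute centroids
    return centers, groups

# Fabricated per-thread profiles: [compute_time, communication_time]
workers = [[9.0 + 0.01 * i, 1.0] for i in range(8)]    # compute-heavy threads
io_ranks = [[2.0, 8.0 + 0.01 * i] for i in range(2)]   # communication-heavy threads
centers, groups = kmeans(workers + io_ranks, k=2)
```

Instead of inspecting every profile individually, an analyst looks at two cluster centroids, and the same reduction applies unchanged at 16K threads.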

Role of Intelligence and Specificity
How do we make the process more effective (productive)?
• Scale forces performance observation to be intelligent
    - standard approaches deliver a lot of data with little value
• What are the important performance events and data?
    - tied to application structure and computational model
• Tools have poor support for application-specific aspects
• The process and tools can be more application-aware, allowing scalability issues to be addressed in context
• More control and precision in performance observation
• More guided performance experimentation / exploration
• Better integration with application development

Role of Automation and Knowledge Discovery
• Even with intelligent and application-specific tools, the decisions of what to analyze may become intractable
• Scale forces the process to become more automated
    - performance extrapolation must be part of the process
    - build autonomic capabilities into the tools
• Support broader experimentation methods and refinement
    - access and correlate data from several sources
    - automate performance data analysis / mining / learning
    - include predictive features and experiment refinement
• Knowledge-driven adaptation and optimization guidance
• Address scale issues through increased expertise
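One of the simplest automatable analyses is outlier detection over per-thread data: encode the rule "flag threads far from the mean" once, and it applies unchanged at any scale. A sketch, with a fabricated 64-thread profile containing one straggler:

```python
import statistics

def flag_outliers(per_thread_time, nsigma=2.0):
    """Automated diagnosis rule: flag threads whose time deviates from the
    mean by more than nsigma standard deviations."""
    mean = statistics.fmean(per_thread_time)
    sd = statistics.pstdev(per_thread_time)
    if sd == 0:
        return []                      # perfectly balanced: nothing to flag
    return [i for i, t in enumerate(per_thread_time)
            if abs(t - mean) > nsigma * sd]

times = [10.0] * 63 + [25.0]   # fabricated data: one straggler among 64 threads
suspects = flag_outliers(times)
```

At petascale, rules like this direct the analyst's attention to a handful of threads instead of presenting tens of thousands of profiles for manual inspection.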

ParaProf – Histogram View (Miranda)
8k processors
16k processors
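What makes a histogram view like this scale is that its size is fixed by the bin count, not the process count: 8K or 16K ranks reduce to the same handful of bars. A sketch of the binning, with fabricated per-rank times:

```python
def histogram(values, nbins=10):
    """Fixed-size summary of per-rank data: output size depends on nbins,
    not on the number of processes."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / nbins or 1.0   # avoid zero width for uniform data
    counts = [0] * nbins
    for v in values:
        i = min(int((v - lo) / width), nbins - 1)   # clamp max value to last bin
        counts[i] += 1
    return lo, width, counts

# Fabricated per-rank times for 16384 ranks, cycling over 16 distinct values
times = [100.0 + (r % 16) * 0.5 for r in range(16384)]
lo, width, counts = histogram(times, nbins=8)
```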

ParaProf – 3D Full Profile (Miranda)
16k processors

ParaProf – 3D Scatterplot (Miranda)
• Each point is a “thread” of execution
• A total of four metrics shown in relation
• ParaVis 3D profile visualization library (JOGL)

Hierarchical and K-means Clustering (sPPM)

Vampir Next Generation (VNG) Architecture
[Diagram: VNG architecture. Classic analysis was monolithic and sequential. In VNG, a monitor system observes the parallel program and produces event streams; traces (Trace 1 … Trace N) are merged on the file system. A parallel analysis server (a master plus workers 1 … m) reads the merged traces, using process-parallel I/O and message passing, and a visualization client connects over the Internet, showing a timeline with 16 visible traces, a 768-process thumbnail, and a segment indicator.]
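The master/worker scheme in the VNG design can be sketched in a few lines: partition the merged event stream, let each worker summarize its segment, and merge the partial results on the master. The sketch below uses threads within one process for simplicity, whereas VNG distributes workers across processes; event names and durations are fabricated.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def analyze_segment(segment):
    """Worker: summarize one slice of the event stream locally."""
    summary = Counter()
    for name, duration in segment:
        summary[name] += duration
    return summary

def parallel_analysis(trace, nworkers=4):
    """Master: split the merged trace, farm out segments, merge partial results."""
    chunk = max(1, len(trace) // nworkers)
    segments = [trace[i:i + chunk] for i in range(0, len(trace), chunk)]
    total = Counter()
    with ThreadPoolExecutor(max_workers=nworkers) as pool:
        for partial in pool.map(analyze_segment, segments):
            total += partial
    return total

# Fabricated merged trace of (event name, duration) records
trace = [("MPI_Send", 1.0), ("compute", 5.0)] * 1000
totals = parallel_analysis(trace)
```

Because each worker touches only its own segment, analysis time and per-worker memory stay bounded as the trace grows, which is the point of moving from monolithic sequential analysis to this architecture.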