A Report on the ACTS Toolkit (acts-support@nersc)
Osni Marques and Tony Drummond (LBNL/NERSC)
04/22/23 A Report on the ACTS Toolkit 2
What is the ACTS Toolkit?
• Advanced Computational Testing and Simulation
• Tools for developing parallel applications
  • developed (primarily) at DOE Labs
  • ~ 20 tools
• ACTS is an “umbrella” project
  • leverage numerous independently funded projects
  • collect tools in a toolkit
Information center: http://acts.nersc.gov
ACTS: Project Goals
• Extended support for experimental software
• Make ACTS tools available on DOE computers
• Provide technical support ([email protected])
• Maintain ACTS information center (http://acts.nersc.gov)
• Coordinate efforts with other supercomputing centers
• Enable large scale scientific applications
• Educate and train
What needs to be computed?
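Problems: $Ax = b$; $Az = \lambda z$; $A = U \Sigma V^T$; $\min \{ \tfrac{1}{2} \| r(x) \|^2 : x_l \le x \le x_u \}$; ODEs; PDEs
Tools: SuperLU, ScaLAPACK, Aztec/Trilinos, Hypre, PETSc, TAO, PVODE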
What codes are being developed?
• Parallel programs that use large distributed arrays
• Operations with grids for PDE applications
• Scripting interface for C++ numerics
• Expression templates for C++
• Infrastructure for distributed computing
• Interactive visualization
• Coupling distributed applications
• Performance analysis and monitoring
Tools: Global Arrays, Overture, POOMA, SILOON, PETE, CUMULVS, TAU, PAWS
ACTS: levels of support
• High
  • Intermediate level
  • Tool expertise
  • Conduct tutorials
• Intermediate
  • Basic level
  • Provide a higher level of support to users of the tool
• Basic
  • Basic knowledge of the tools
  • Help with installation
  • Compilation of user’s reports ([email protected])
ACTS Tools Installed on NERSC Computers
Tool            IBM SP (seaborg)       CRAY T3E (mcurie)       PC Cluster (alvarez)
Aztec           2.1                    2.0, 2.1                2.1
CUMULVS         -                      1.1.1, 1.1.2            -
Hypre           1.2.0                  -                       -
PETSc           2.1.0, 2.1.1, 2.1.2    2.0.24, 2.1.0, 2.1.2    -
PVODE           1998                   1998                    -
ScaLAPACK       1.6, 1.7               1.5, 1.7                1.6
SuperLU (dist)  1.0                    1.0                     -
TAO             1.4*                   1.0.2, 1.2              -
TAU             2.8b7, 2.11.9          2.6, 2.9                -
See also http://acts.nersc.gov/tools
Aztec
• Solves large sparse systems of linear equations on distributed memory machines
• Implements Krylov iterative methods (CG, CGS, Bi-CG-Stab, GMRES, TFQMR)
• Suite of preconditioners (Jacobi, Gauss-Seidel, overlapping domain decomposition with sparse LU, ILU, BILU within domains)
• Highly efficient, scalable (1000 processors on the “ASCI Red” machine)
http://acts.nersc.gov/aztec
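As an illustration of the kind of method listed above, here is a minimal serial sketch of conjugate gradient (CG) with a Jacobi (diagonal) preconditioner. It is not Aztec's API or implementation: the function names, the list-of-dicts sparse storage, and the 1D Laplacian test problem are all assumptions made for this example, and the matrix is assumed to be symmetric positive definite.

```python
import math

def matvec(rows, x):
    """Multiply a sparse matrix (one {column: value} dict per row) by x."""
    return [sum(v * x[j] for j, v in row.items()) for row in rows]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def jacobi_pcg(rows, b, tol=1e-10, max_iter=1000):
    """CG with a Jacobi preconditioner; assumes an SPD matrix.

    Returns (approximate solution, iterations used).
    """
    n = len(b)
    inv_diag = [1.0 / rows[i][i] for i in range(n)]
    x = [0.0] * n
    r = list(b)                                  # residual b - A x for x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]   # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b)) or 1.0
    for k in range(1, max_iter + 1):
        Ap = matvec(rows, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, k
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

def laplacian_1d(n):
    """Tridiagonal [-1, 2, -1] matrix, a common SPD test problem (hypothetical helper)."""
    rows = []
    for i in range(n):
        row = {i: 2.0}
        if i > 0:
            row[i - 1] = -1.0
        if i < n - 1:
            row[i + 1] = -1.0
        rows.append(row)
    return rows

A = laplacian_1d(20)
b = matvec(A, [1.0] * 20)      # right-hand side whose exact solution is all ones
x, iters = jacobi_pcg(A, b)
```

In a distributed-memory setting the matrix-vector product and the dot products are the steps that require communication; the sketch keeps everything on one process for clarity.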
AztecOO/Trilinos
• Trilinos encompasses efforts in linear solvers, eigensolvers, nonlinear and time-dependent solvers, and others.
• Provides a common framework for current and future solver projects:
  • A common set of concrete linear algebra objects for solver development and application interfaces.
  • A consistent set of solver interfaces via abstract classes (API).
• AztecOO improves on Aztec by:
  • Using objects for defining matrix and RHS.
  • Providing more preconditioners/scalings.
  • Using C++ class design to enable more sophisticated use.
• The AztecOO interface allows:
  • Continued use of Aztec for functionality.
  • Introduction of new solver capabilities outside of Aztec.
CUMULVS
• Collaborative User Migration, User Library for Visualization and Steering
• Enables parallel programming with the integration of:
  • Interactive visualization (local and remote)
  • Multiple views
  • Fault Tolerance
  • Computational Steering
http://acts.nersc.gov/cumulvs
Hypre
• Before writing your code:
  • choose a conceptual interface
  • choose a solver / preconditioner
  • choose a matrix type that is compatible with your solver / preconditioner and conceptual interface
• Now write your code:
  • build auxiliary structures (e.g., grids, stencils)
  • build matrix/vector through conceptual interface
  • build solver/preconditioner
  • solve the system
  • get desired information from the solver
http://acts.nersc.gov/hypre
Hypre: Interfaces
Data Layout: structured, composite, block-struc, unstruc, CSR
Linear Solvers: GMG, ...; FAC, ...; Hybrid, ...; AMGe, ...; ILU, ...
Linear System Interfaces
Multiple interfaces are necessary to provide “best” solvers and data layouts
PETSc
• Portable, Extensible Toolkit for Scientific Computing
• What can it do?
  • Support the development of parallel PDE solvers
  • Implicit or semi-implicit solution methods, finite element, finite difference, or finite volume type discretizations.
• Specification of the mathematics of the problem
  • Vectors (field variables) and matrices (operators)
• How to solve the problem?
  • Linear, non-linear, and time-stepping (ODE) solvers
http://acts.nersc.gov/petsc
PETSc: Features
• Parallelism
  • Uses MPI
  • Data Layout: structured and unstructured meshes
  • Partitioning and coloring
• Viewers
  • Printing Data Object information
  • Visualization of field and matrix data
• Profiling and performance tuning
  • -log_summary
  • Profiling by stages of an application
  • User-defined events
PVODE
PVODE actually refers to a trio of closely related solvers:
• PVODE, for systems of ordinary differential equations
• KINSOL, for systems of nonlinear algebraic equations
• IDA, for systems of differential-algebraic equations
PVODE has evolved into SUNDIALS (SUite of Nonlinear and DIfferential/ALgebraic equation Solvers)
http://acts.nersc.gov/pvode
ScaLAPACK
Software hierarchy (figure):
• ScaLAPACK (global): Version 1.7 released in August 2001.
• PBLAS (global): Parallel BLAS.
• LAPACK (local): Linear systems, least squares, singular value decomposition, eigenvalues.
• BLACS (local): Communication routines targeting linear algebra operations.
• BLAS (platform specific): Clarity, modularity, performance and portability. ATLAS can be used for automatic tuning.
• PVM/MPI/... (platform specific)
http://acts.nersc.gov/scalapack
ScaLAPACK: Goals
• Efficiency
  • Optimized computation and communication engines
  • Block-partitioned algorithms (Level 3 BLAS) for good node performance
• Reliability
  • Whenever possible, use LAPACK algorithms and error bounds.
• Scalability
  • As the problem size and number of processors grow
  • Replace LAPACK algorithms that did not scale (new ones into LAPACK)
• Portability
  • Isolate machine dependencies to BLAS and the BLACS
• Flexibility
  • Modularity: build rich set of linear algebra tools (BLAS, BLACS, PBLAS)
• Ease-of-Use
  • Calling interface similar to LAPACK
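To make "block-partitioned algorithms" concrete, the sketch below multiplies two dense matrices one block at a time, which is the access pattern that Level 3 BLAS kernels exploit for cache reuse. It is a serial illustration only, not ScaLAPACK or PBLAS code; the function name `blocked_matmul` and the block size are assumptions for this example.

```python
def blocked_matmul(A, B, block=2):
    """Compute C = A * B block by block (dense lists of lists)."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, block):
        for j0 in range(0, m, block):
            for k0 in range(0, k, block):
                # Update one block of C with the product of one block of A
                # and one block of B; edge blocks may be smaller.
                for i in range(i0, min(i0 + block, n)):
                    for kk in range(k0, min(k0 + block, k)):
                        a = A[i][kk]
                        for j in range(j0, min(j0 + block, m)):
                            C[i][j] += a * B[kk][j]
    return C

A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
B = [[1.0, 0.0], [0.0, 1.0], [2.0, 3.0]]
C = blocked_matmul(A, B, block=2)
```

In pure Python the blocking brings no speedup; the point is the loop structure, in which each inner update touches only a small working set.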
SuperLU
• Solves Ax=b by sparse Gaussian elimination
• Sequential, SMP and distributed memory (MPI) implementations
• Suitable for general sparse A, nonsymmetric, real or complex
• Performance depends strongly on
  • Sparsity structure, good if (number of flops) / (number of nonzeros) is large
  • Ordering of equations and unknowns (controls fill-in, parallelism)
http://acts.nersc.gov/superlu
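The effect of ordering on fill-in can be seen with a small symbolic elimination. The sketch below is not SuperLU code: it assumes a structurally symmetric pattern with a nonzero diagonal and no pivoting, and the function name `fill_in` and the "arrow" test matrix are hypothetical choices for this example.

```python
def fill_in(pattern, order):
    """Count fill-in created by symbolic Gaussian elimination.

    pattern: set of (i, j) nonzero positions (assumed structurally symmetric).
    order:   elimination order, a permutation of the row/column indices.
    Each new symmetric pair (a, b)/(b, a) is counted once.
    """
    adj = {v: set() for v in order}
    for i, j in pattern:
        if i != j:
            adj[i].add(j)
            adj[j].add(i)
    fill = 0
    eliminated = set()
    for v in order:
        nbrs = [u for u in adj[v] if u not in eliminated]
        # Eliminating v couples all of its remaining neighbours pairwise.
        for idx, a in enumerate(nbrs):
            for b in nbrs[idx + 1:]:
                if b not in adj[a]:
                    adj[a].add(b)
                    adj[b].add(a)
                    fill += 1
        eliminated.add(v)
    return fill

n = 6
# "Arrow" matrix: dense first row and column plus the diagonal.
arrow = ({(i, i) for i in range(n)}
         | {(0, j) for j in range(1, n)}
         | {(j, 0) for j in range(1, n)})
hub_first = fill_in(arrow, list(range(n)))           # dense row/column eliminated first
hub_last = fill_in(arrow, list(range(1, n)) + [0])   # dense row/column eliminated last
```

Eliminating the dense row and column first fills in the whole remaining matrix (10 new symmetric pairs for n = 6), while eliminating it last creates no fill at all.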
Distributed SuperLU: Performance Highlights
• Uses static instead of dynamic pivoting to be as scalable as Cholesky
• Performance on a 512 processor Cray T3E
  • 10.2 Gflops for MIXING-TANK, fluid flow, n = 29957, nonzeros/row = 67
  • 8.4 Gflops for ECL32, device simulation, n = 51993, nonzeros/row = 7.3
  • 2.5 Gflops for BBMAT, fluid flow, n = 38744, nonzeros/row = 46 (20% parallel efficiency)
• Used to solve open Quantum Mechanics problem (Science, 24 Dec 1999)
  • n = 736 K on 64 PEs, Cray T3E in 5.7 minutes
  • n = 1.8 M on 24 PEs, ASCI Blue Pacific in 24 minutes
TAO
• Toolkit for Advanced Optimization
  • Object-oriented techniques
  • Component-based interaction
  • Leverage of existing parallel computing infrastructure
  • Reuse of external toolkits
• Algorithms for:
  • Unconstrained optimization
  • Bound-constrained optimization
  • Linearly constrained optimization
  • Nonlinearly constrained optimization
http://acts.nersc.gov/tao
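For readers unfamiliar with bound-constrained optimization, the sketch below shows the simplest method of that class: gradient descent followed by projection onto the bounds. It is not one of TAO's algorithms and does not use TAO's interface; the function names, the fixed step size, and the test problem are assumptions chosen for illustration.

```python
def project(x, lower, upper):
    """Clip each component of x into [lower_i, upper_i]."""
    return [min(max(xi, lo), hi) for xi, lo, hi in zip(x, lower, upper)]

def projected_gradient(grad, x0, lower, upper, step=0.1, tol=1e-10, max_iter=10000):
    """Minimize a smooth function subject to lower <= x <= upper.

    grad: callable returning the gradient at x (a list of floats).
    Stops when an iteration changes no component by more than tol.
    """
    x = project(x0, lower, upper)
    for _ in range(max_iter):
        g = grad(x)
        x_new = project([xi - step * gi for xi, gi in zip(x, g)], lower, upper)
        if max(abs(a - b) for a, b in zip(x_new, x)) <= tol:
            return x_new
        x = x_new
    return x

# Example: minimize 0.5 * ||x - c||^2 subject to 0 <= x <= 1.
c = [1.5, -0.3, 0.4]

def grad(x):
    return [xi - ci for xi, ci in zip(x, c)]

x_star = projected_gradient(grad, [0.5, 0.5, 0.5], [0.0] * 3, [1.0] * 3)
```

For this problem the minimizer is simply `c` clipped to the bounds, so the first two components end on a bound and the third converges to the unconstrained value 0.4.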
TAU
• Profiling of Java, C++, C, and Fortran codes
• Detailed information (much more than prof/gprof)
• Profiles for each unique template instantiation
• Time spent exclusively and inclusively in each function
• Start/Stop timers
• Profiling data maintained for each thread, context, and node
• Parallel IO Statistics for the number of calls for each profiled function
• Profiling groups for organizing and controlling instrumentation
• Support for using CPU hardware counters (PAPI)
• Graphic display for parallel profiling data
• Graphical display of profiling results (built-in viewers, interface to Vampir)
http://acts.nersc.gov/tau
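The distinction between inclusive and exclusive time, and the role of start/stop timers, can be illustrated with a few lines of code. The sketch below is not TAU's instrumentation API; the `Profiler` class and its methods are hypothetical, and a fake clock is injected so the example is deterministic.

```python
import time

class Profiler:
    """Start/stop timers recording inclusive and exclusive time per name."""

    def __init__(self, clock=time.perf_counter):
        self.clock = clock
        self.stack = []        # entries: [name, start time, time spent in children]
        self.inclusive = {}
        self.exclusive = {}
        self.calls = {}

    def start(self, name):
        self.stack.append([name, self.clock(), 0.0])

    def stop(self, name):
        top_name, started, child_time = self.stack.pop()
        if top_name != name:
            raise ValueError(f"timer {name!r} stopped while {top_name!r} is running")
        elapsed = self.clock() - started
        self.inclusive[name] = self.inclusive.get(name, 0.0) + elapsed
        self.exclusive[name] = self.exclusive.get(name, 0.0) + elapsed - child_time
        self.calls[name] = self.calls.get(name, 0) + 1
        if self.stack:
            # Charge this interval to the enclosing timer's child time.
            self.stack[-1][2] += elapsed

# Deterministic usage with a fake clock that returns 0, 1, 4, 6 in turn.
ticks = iter([0.0, 1.0, 4.0, 6.0])
prof = Profiler(clock=lambda: next(ticks))
prof.start("solve")     # t = 0
prof.start("matvec")    # t = 1
prof.stop("matvec")     # t = 4
prof.stop("solve")      # t = 6
```

Here `solve` has an inclusive time of 6 and an exclusive time of 3, because 3 time units were spent inside the nested `matvec` timer. The sketch does not handle recursion, where a simple sum would count inclusive time more than once.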
TAU: Control Windows
• COSY: COmpile manager Status displaY
• FANCY: File ANd Class displaY
• SPIFFY: Structured Programming Interface and Fancy File displaY
• CAGEY: CAll Graph Extended displaY
• CLASSY: CLASS hierarchY browser
• RACY: Routine and data ACcess profile displaY
• SPEEDY: Speedup and Parallel Execution Extrapolation DisplaY
Why do we need these tools?
• High Performance Tools
  • portable
  • library calls
  • robust algorithms
  • help code optimization
• More code development in less time
• More simulation in less computer time
A computation that took 1 full year to complete in 1980 could be done in ~ 10 hours in 1992, in ~ 16 minutes in 1997 and in ~ 27 seconds in 2001!
[Figure: GFlop/s (0 to 7000) versus year (1990 to 2001), with 1992, 1997 and 2001 marked. Systems shown: Cray Y-MP (8), Fujitsu VP-2600, TMC CM-2 (2048), NEC SX-3 (4), TMC CM-5 (1024), Fujitsu VPP-500 (140), Intel Paragon (6788), Hitachi CP-PACS (2040), Intel ASCI Red (9152), SGI ASCI Blue Mountain (5040), Intel ASCI Red Xeon (9632), ASCI Blue Pacific SST (5808), ASCI White Pacific (7424).]
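The run times quoted on this slide translate into the following speedup factors. This is only the arithmetic implied by the approximate figures above, taking a year as 365 days; the variable names are chosen for this example.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600   # 31,536,000 s, ignoring leap years

# Approximate run times quoted on the slide, in seconds.
runtimes = {
    1980: SECONDS_PER_YEAR,   # 1 full year
    1992: 10 * 3600,          # ~ 10 hours
    1997: 16 * 60,            # ~ 16 minutes
    2001: 27,                 # ~ 27 seconds
}

# Speedup of each year relative to 1980.
speedup = {year: SECONDS_PER_YEAR / t for year, t in runtimes.items()}
```

Relative to 1980 this gives factors of roughly 876 (1992), 32,850 (1997) and 1,168,000 (2001).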
Lessons Learned
• There is still a gap between tool developers and application developers, which leads to duplication of efforts.
• The tools currently included in the ACTS Toolkit should be seen as dynamically configurable toolkits and should be grouped into toolkits upon user/application demand.
• Users demand long-term support of the tools.
• Applications and users play an important role in making the tools mature.
• Tools evolve or are superseded by other tools.
• There is a demand for tool interoperability and more uniformity in the documentation and user interfaces.
• There is a need for an intelligent and dynamic catalog/repository of high performance tools.
[Figure: ACTS and its User Community. Labels: Challenge Codes, Computing Systems, Interoperability, Pool of Software Tools, Testing and Acceptance Phase, Collaboratories, Workshops and Training, Scientific Computing Centers, Computer Vendors. Application areas: Numerical Simulations, Physics, Chemistry, Biology, Medicine, Mathematics, Bioinformatics, Computer Sciences, Engineering.]
http://acts.nersc.gov
• Agenda, accomplishments, conferences, releases, etc.
• Tool descriptions, installation details, examples, etc.
• Goals and other relevant information
• Points of contact
• Search engine
Please mark your calendars!
[email protected]
http://acts.nersc.gov