www.ci.anl.gov www.ci.uchicago.edu Matlab on the Cray XE6 Beagle, Beagle Team (beagle-support@ci.uchicago.edu), Computation Institute, University of Chicago & Argonne National Laboratory

Upload: sterling-beel

Post on 14-Dec-2015


Matlab on the Cray XE6 Beagle

Beagle Team (beagle-support@ci.uchicago.edu), Computation Institute, University of Chicago & Argonne National Laboratory

Outline

• Introduction to high performance computing
• Some relevant facts about Beagle’s hardware
• Basics about the work environment
• Data transfer using Globus Online
• Use of the compilers (C, C++, and Fortran)
• Launching and monitoring applications
• Using Matlab on Beagle

What the Heck is Supercomputing?

Credit: Henry Neeman, Director, OU Supercomputing Center for Education & Research

http://www.oscer.ou.edu/education.php
Contact: [email protected]

Why Beagle?

Not the kind of problem we can handle with Matlab at this point

What affects performance? Accessing data

Examples:
1. Data array too big to fit into cache (12 MB): we need to use main memory (32 GB)
2. An image too big to fit into memory (32 GB): use disk space or distributed memory (23 TB)
3. Too many genomes to fit on local storage (max ~50 TB per user): use network disks

What affects performance? Repetition

Examples:
1. Unrelated experiments (e.g., CT image reconstruction and molecular dynamics modeling) can be run at the same time
2. Each genome in an experiment can be analyzed independently
3. Slices or sub-images can be processed at the same time

Matlab examples:
1) If analyzing a single image is time-consuming (or images are large): slices or sub-images can be processed at the same time using different threads (e.g., with parallel tools, which are not working yet)
2) If images are small: different threads can analyze different images (not really shared memory, just in the same memory)

Some relevant facts about Beagle’s hardware

http://beagle.ci.uchicago.edu/
Contact: beagle-support@ci.uchicago.edu

Beagle: hardware overview

• Ranked about the 100th fastest machine in the world (December 2011)
• Cray XE6 system
• 151 teraflops
• 17,660 compute cores
• 23 TB memory
• 600 TB disk (450 TB formatted)
• Cray Gemini interconnect

Beagle “under the hood”

Compute nodes

• 2 AMD Opteron 6100 “Magny-Cours” processors
  – 12 cores each (24 per node)
  – 2.1 GHz
• 32 GB RAM (8 GB per processor)
• No disk on node (mounts DVS and Lustre network filesystems)
• To know more: http://www.ci.uchicago.edu/wiki/bin/view/Beagle/SystemSpecs#Overview

Details about the Processors (sockets)

• Superscalar:
  – 3 integer ALUs
  – 3 floating-point ALUs (can do 4 FP per cycle)
• Cache hierarchy:
  – Victim cache
  – 64 KB L1 instruction cache
  – 64 KB L1 data cache (latency 3 cycles)
  – 512 KB L2 cache per processor core (latency 9 cycles)
  – 12 MB shared L3 cache (latency 45 cycles)
• To know more: http://www.ci.uchicago.edu/wiki/bin/view/Beagle/SystemSpecs

Basics about the work environment

http://www.ci.uchicago.edu/wiki/bin/view/Beagle/ComputeOnBeagle
Contact: beagle-support@ci.uchicago.edu

Beagle’s operating system

• Cray XE6 uses Cray Linux Environment v3 (CLE3)
• SuSE Linux-based
• Compute nodes use Compute Node Linux (CNL)
• Login and sandbox nodes use a more standard Linux
• The two are different (relevant to Matlab).
• Compute nodes can operate in
  – ESM (extreme scalability mode), to optimize performance for large multi-node calculations
  – CCM (cluster compatibility mode), for out-of-the-box compatibility with Linux/x86 versions of software, more or less without recompilation or relinking! (It doesn’t work yet.)
• To know more: http://www.ci.uchicago.edu/wiki/bin/view/Beagle/ComputeOnBeagle#Basics_about_the_work_environmen

Beagle’s filesystems

• /lustre/beagle: local Lustre filesystem (read-write) -- this is where all input and output files should be; however, NO BACKUP!

• /gpfs/pads: PADS GPFS (read-write) – for permanent storage

• /home: CI home directories; largely useless, because you can’t write there from the compute nodes!

• To know more: http://www.ci.uchicago.edu/wiki/bin/view/Beagle/ComputeOnBeagle#How_to_work_on_the_filesystem

How to move data to and from Beagle

• Beagle is not HIPAA-compliant: no PHI data on Beagle
• Examples of factors for choosing a data movement tool:
  – how many files, how large the files are …
  – how much fault tolerance is desired
  – performance
  – security requirements
  – the overhead needed for software setup
• Recommended tools:
  – scp/sftp can be OK for moving a few small files (< a couple of GB)
    o pros: quick to initiate
    o cons: slow and not scalable
  – For optimal speed and reliability we recommend Globus Online:
    o high performance (i.e., fast)
    o reliable and easy to use
    o usable from either a command line or a web browser
    o provides fault-tolerant, fire-and-forget transfers
  – If you know you'll be moving a lot of data, or find scp too slow or unreliable, we recommend Globus Online.
• To know more: http://www.ci.uchicago.edu/wiki/bin/view/Beagle/ComputeOnBeagle#How_to_move_data_to_and_from_Bea

Applications on Beagle

• Applications on Beagle are (mostly) run from the command line, e.g.:

aprun -n 17664 myapp <myInput >& this.log

• How do I know if an application is on Beagle?
  – http://beagle.ci.uchicago.edu/software/
  – http://www.ci.uchicago.edu/wiki/bin/view/Beagle/SoftwareOnBeagle
  – On Beagle, use module avail, e.g.:

lpesce@login2:~> module avail 2>&1 | grep -i matlab
Matlab/7.13(default)
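The module check above can be wrapped in a small helper for use in job scripts. This is only a sketch: `module_available` is a hypothetical name, and it assumes (as the `2>&1` in the command above suggests) that `module avail` writes its listing to standard error.

```shell
# module_available NAME (hypothetical helper, not part of the modules package):
# succeed if `module avail` lists a module whose name matches NAME,
# ignoring case. The listing is assumed to arrive on stderr, hence 2>&1.
module_available() {
    module avail 2>&1 | grep -qi -- "$1"
}
```

A script could then guard the load step with `if module_available matlab; then module load matlab; fi`, so that a missing module is reported before any compilation is attempted.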

Applications on Beagle

• GUIs are in general not supported (true for both Matlab and Simulink)

• Licensing is similar to any other uchicago.edu machine
  – Packages charged by number of cores can be expensive on Beagle and aren’t usually supported
  – Packages which have a campus license can simply be installed and used on Beagle
• Octave is available at no charge and can in principle be installed (per serious request) on Beagle, even if porting is not easy

Matlab on Beagle

http://www.ci.uchicago.edu/wiki/bin/view/Beagle/MATLAB
http://beagle.ci.uchicago.edu/

Contact: beagle-support@ci.uchicago.edu

Matlab on Beagle: GUI

• The Matlab GUI is not supported and most likely will not be in the future:
  – In our experience, standard Matlab is not very effective at exploiting massively parallel supercomputers such as Beagle
  – Parallel tools have the potential to overcome at least some of these issues, but licensing and other practical issues make this approach infeasible at this time
  – If you have suggestions about how to use the GUI and parallel tools, let us know.

Matlab on Beagle: Compile code

• However, compiled executables from Matlab code can easily be run on Beagle:
  – MATLAB programs should be compiled using mcc (Matlab compiler) and run as command-line executables with MCR (Matlab Compiler Runtime).
• In our experience, Matlab has shown very limited ability to exploit multi-core processors effectively.
  – Therefore, to exploit parallelism, executables are compiled single-threaded and run in parallel using a scripting language such as the bash shell or Swift.
  – We are working on including parallel tools in the compiled programs, but we have no working solution at this point.
  – Suggestions?

Compiling Matlab: Matlab code

• The Matlab environment can compile any Matlab function of the form

foofunc(x1,x2,... ,xn)

• Matlab functions can call other Matlab functions from other files; usually, leaving them in the compilation directory will be sufficient.
• Calling parameters (x1, x2, …, xn above) become arguments for the executable.
• However, those arguments are passed as strings, so numeric arguments need to be converted inside the function:

if (isdeployed)
    x1 = str2num(x1);
    x2 = str2num(x2);
    ...
    xn = str2num(xn);
end

Compiling Matlab: mcc and MCR

• The Matlab compiler (mcc) produces executables that require the Matlab Compiler Runtime (MCR) in order to run. MCR is a set of shared libraries that enables the execution of Matlab files without an installed version of Matlab or a license.

• The mcc compiler is loaded with the command module load matlab

• See also http://www.mathworks.com/help/toolbox/compiler

Compiling Matlab: mcc and MCR

• Compilation can be done on other systems, as long as the MCR version corresponding to the mcc used to compile is installed on Beagle.

• Specific versions of MCR can be installed by users in directories on Lustre. Please contact us if you encounter problems while trying to do it.

• Currently MCR is available as
  – /soft/matlab/7.13/
  – /soft/mcr/v714/
  – (if you require other versions, let us know).

Compiling Matlab on Beagle: mcc options

• We recommend that users compile with

mcc -R -singleCompThread -R -nojvm -R -nodisplay -mv myapp.m -o my_app

  – -m generates a standalone application
  – -v (verbose) displays all the compilation steps; e.g., it helps identify which third-party compiler is used and what environment variables are referenced
  – -R specifies run-time options for MCR:
    o -R -nojvm disables the Java virtual machine
    o -R -nodisplay eliminates functions that would produce a display
    o -R -singleCompThread runs MCR single-threaded

• At this stage, it does not appear that there is a way to control how MATLAB creates threads, or that it can run a multi-threaded program efficiently on a 24-core Cray XE6 node (MATLAB directly checks /proc/cpuinfo to determine how many cores are available for a calculation and uses all of them, regardless of the instructions given by the aprun command).

• To know more: http://www.mathworks.com/help/toolbox/compiler/f0-985134.html

Matlab on Beagle: mcc output

• After the compilation, a number of files will be generated:
  – mccExcludedFiles.log: don’t worry about this one
  – my_app: the executable you will need to copy to Beagle
  – readme.txt: contains information such as where to find the version of MCRInstaller.bin for your specific MATLAB, which you will need if it differs from the ones available on Beagle
  – run_my_app.sh: a shell script that is used to run each copy of my_app. We recommend that you use it to avoid having to take care of too many variables in your PBS scripts. However, you will need to modify this script when using it on Beagle; see the next page
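Before submitting a job it can help to confirm that the two files needed at run time, the executable and its run script, were actually copied and are executable. The following is a minimal sketch; `check_mcc_outputs` is a hypothetical helper, and it assumes the run script follows the `run_<app>.sh` naming shown above.

```shell
# check_mcc_outputs DIR APP (hypothetical helper): verify that DIR contains
# the compiled executable APP and its run script run_APP.sh, both executable.
# Problems are reported on stderr; the function returns 1 if any are found.
check_mcc_outputs() {
    dir=$1
    app=$2
    missing=0
    for f in "$dir/$app" "$dir/run_$app.sh"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            missing=1
        elif [ ! -x "$f" ]; then
            echo "not executable: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_mcc_outputs . my_app || exit 1` near the top of a submission script stops early instead of failing later on a compute node.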

Matlab on Beagle: changes to run_my_app.sh

• To prevent the various scripts from blocking each other, you can add something like the following lines at the beginning of the script, right after the initial comments (the series of lines starting with "#"):

#Added to run on Beagle after August 2011
#TMP must be defined by the calling PBS script
tmp=`mktemp -d $TMP/matlabcachedir.XXXXXXXXXXX`
echo $tmp
export MCR_CACHE_ROOT=$tmp;
# end added

• In order to remove the temporary cache directories, after the line

eval "${exe_dir}"/my_app $args

add

#Added to run on Beagle after August 2011
rm -rf $tmp
#end added
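The same idea (one private MCR cache directory per running copy, removed afterwards) can also be expressed as a wrapper function rather than as edits to the generated script. This is a sketch of an alternative, not the approach the slides use: `with_private_cache` is a hypothetical name, and falling back to /tmp when TMP is unset is an assumption made only so the sketch runs anywhere.

```shell
# with_private_cache CMD [ARGS...] (hypothetical helper): run CMD with
# MCR_CACHE_ROOT pointing at a fresh directory, then delete that directory.
# TMP is expected to be set by the calling PBS script; /tmp is only a fallback.
with_private_cache() {
    cache=$(mktemp -d "${TMP:-/tmp}/matlabcachedir.XXXXXXXXXXX") || return 1
    status=0
    # The subshell keeps the exported variable from leaking into the caller.
    ( MCR_CACHE_ROOT=$cache; export MCR_CACHE_ROOT; "$@" ) || status=$?
    rm -rf "$cache"
    return "$status"
}
```

Because the cache directory is removed even when the command fails, and the command's exit status is passed through, a calling loop can still detect failed copies.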

Using Matlab on Beagle: scripting

• Run multiple copies of single-threaded run_my_app.sh using a scripting language:
  – Bash shell + PBS (batch submission)
  – Swift

• Remember that Beagle provides only 32 GB per node. Any request above that value will produce an Out of Memory (OOM) error, which will result in the termination of the process: be mindful about how much you “pack” calculations
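The packing limit can be estimated with simple integer arithmetic: the number of copies per node is bounded both by memory (32 GB) and by cores (24). The sketch below uses a hypothetical helper, `copies_per_node`, takes whole gigabytes only, and optimistically assumes that all 32 GB are available to the application; in practice some headroom should be left.

```shell
# copies_per_node MEM_PER_COPY_GB [NODE_MEM_GB] [CORES_PER_NODE]
# Hypothetical helper: print how many single-threaded copies fit on one node,
# limited by memory (default 32 GB) and by cores (default 24).
copies_per_node() {
    mem_per_copy=$1
    node_mem=${2:-32}
    cores=${3:-24}
    if [ "$mem_per_copy" -lt 1 ]; then
        echo "memory per copy must be at least 1 GB" >&2
        return 1
    fi
    by_mem=$(( node_mem / mem_per_copy ))
    if [ "$by_mem" -lt "$cores" ]; then
        echo "$by_mem"
    else
        echo "$cores"
    fi
}
```

With the defaults, `copies_per_node 4` prints 8 (memory-bound), `copies_per_node 1` prints 24 (core-bound), and `copies_per_node 40` prints 0, meaning a single copy does not fit on a node.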

Using Matlab on Beagle: Bash + PBS

#!/bin/bash
#PBS -N myTestMatlab
#PBS -l walltime=0:10:00
#PBS -l mppwidth=24
#PBS -j oe
# Load modules and set for dynamic environment
. /opt/modules/3.2.6.6/init/bash
# Sets the shared library environment
export CRAY_ROOTFS=DSL
# set the env variable where the root of MCR is (you might need to change this if you need a specific version of MCR)
#export MCRROOT=/soft/mcr/v714
export MCRROOT=/soft/matlab/7.13/
# Create, if necessary, a directory on /lustre to run the simulations
LUSTREDIR=/lustre/beagle/`whoami`/testMatlab/magicsquare${PBS_JOBID}
mkdir -p $LUSTREDIR
# Set up TMP and a cache root dir for MCR, it won't work if it isn't set
LUSTRETMP=${LUSTREDIR}/${PBS_JOBID}/tmp
mkdir -p $LUSTRETMP
export TMP=$LUSTRETMP
export MCR_CACHE_ROOT=$LUSTRETMP
# copy the file to the run dir and run the code
cd $PBS_O_WORKDIR
cp run_my_app.sh my_app $LUSTREDIR
cd $LUSTREDIR
aprun -b -n 1 -d 1 ./run_my_app.sh $MCRROOT 5 &>test_my_app.log

• To know more (e.g., packing and loops): http://www.ci.uchicago.edu/wiki/bin/view/Beagle/MATLAB#How_to_run_MATLAB_executables_vi
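The script above runs a single copy. A loop that starts several copies and waits for all of them might look like the sketch below. It covers only the loop itself: `run_packed` is a hypothetical helper, the `copy_<i>.log` file names are an illustrative choice, and how such a loop is placed on a compute node with aprun is described on the wiki page referenced above and is not reproduced here.

```shell
# run_packed N CMD [ARGS...] (hypothetical helper): start N copies of CMD in
# the background, wait for all of them, and succeed only if every copy did.
# Each copy receives its index (1..N) as an extra last argument and writes
# its output to copy_<index>.log in the current directory.
run_packed() {
    n=$1
    shift
    pids=""
    i=1
    while [ "$i" -le "$n" ]; do
        "$@" "$i" > "copy_$i.log" 2>&1 &
        pids="$pids $!"
        i=$((i + 1))
    done
    failed=0
    for pid in $pids; do
        wait "$pid" || failed=$((failed + 1))
    done
    [ "$failed" -eq 0 ]
}
```

A call such as `run_packed 8 ./run_my_app.sh "$MCRROOT"` would pass the copy index where the script above passes the fixed argument 5. Each copy still needs its own MCR cache directory, which is what the mktemp change to run_my_app.sh provides.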

Matlab on Beagle: note

• We are happy to help you use a scripting language effectively:
  – Bash shell
  – Swift (a presentation about it follows)

• In general, Matlab compiled executables do not use Beagle very efficiently (in terms of both CPU and memory), and this should be considered carefully when planning large calculations.

• Let us know if we can help with any of the steps involved in using Matlab on Beagle

Acknowledgments

• BSD for funding most of the operational costs of Beagle
• A lot of the images and content have been taken or learned from Cray documentation or Cray staff (Dave Strenski, mostly)
• Globus for providing us with many slides and support; special thanks to Mary Bass, manager for communications and outreach at the CI
• NERSC and its personnel provided us with both material and direct instruction; special thanks to Katie Antypas, group leader of the User Services Group at NERSC
• All the people at the CI who supported our work, from administering the facilities to taking pictures of Beagle
• Beagle users who helped with the content about using Matlab and Python

Thanks!

We look forward to working with you.

Questions?

(or later: beagle-support@ci.uchicago.edu)