Rhea Analysis & Post-processing Cluster
Rhea: Analysis & Post-processing Cluster
Robert D. French, NCCS User Assistance
Rhea Quick Overview
• 200 Dell PowerEdge C6220 nodes
  – 196 compute / 4 login
  – RHEL 6.4
  – 2 x 8-core Intel Xeon CPUs @ 2.0 GHz
    » Hyperthreading is enabled, so “top” shows 32 CPUs
  – 64 GB of RAM
  – New 56 Gb/s InfiniBand fabric
• Mounts Atlas
  – Does not mount Widow
• Replaces Lens
• No preemptive queue
Allocation & Billing
• Rhea is prioritized as an extra resource for INCITE and ALCC users through the end of the year.
  – DD projects may request access
• 1 node-hour is charged per node per hour
  – Ex: 10 nodes for 2 hours = 20 node-hours
• Each project will be awarded 1,000 node-hours per month
  – Separate from Titan / Eos usage
  – Request more if you run low
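The charging rule above is simple multiplication; as a quick sketch, using the 10-node, 2-hour example from the slide:

```shell
# Node-hours charged = nodes used x wall-clock hours
nodes=10
hours=2
echo "$((nodes * hours)) node-hours"   # prints "20 node-hours"
```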
Rhea Queue Policy
| Job Size | Job Length | Job Limits | Restricted by |
|---|---|---|---|
| 1 – 16 nodes | 0 – 12 hours | 3 eligible / unlimited running | User |
| 1 – 16 nodes | 12 – 36 hours | 2 active | System |
| 1 – 16 nodes | 36 – 96 hours | 1 active | System |
| 17 – 32 nodes | 0 – 12 hours | 2 active | System |
| 17 – 32 nodes | 12 – 36 hours | 1 active | System |
| 33 – 128 nodes | 0 – 3 hours | 1 active | System |

• These limits should keep large jobs from swamping the system
• Small runs should complete quickly
• Request a reservation for more nodes or longer wall times
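A job inside the smallest tier above might be submitted with a PBS/Torque-style batch script. This is a sketch only: the project ID `PRJ123`, the rank count, and the `post_process` binary are placeholders, not confirmed Rhea settings.

```shell
#!/bin/bash
#PBS -A PRJ123                 # project to charge (placeholder ID)
#PBS -l nodes=4                # 4 nodes: inside the 1-16 node tier
#PBS -l walltime=02:00:00      # 2 hours: inside the 0-12 hour bin
#PBS -N rhea-analysis

cd "$PBS_O_WORKDIR"
mpirun -n 64 ./post_process    # hypothetical run: 16 physical cores x 4 nodes
```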
Software Stack
• Most Lens software will already be installed
• Here are some highlights:
  – Visualization: ParaView, VisIt, VMD
  – Compilers: GCC, Intel, and PGI
  – Scientific Languages: MATLAB, Octave, R, SciPy
  – Data Management: Globus, BBCP, NetCDF, HDF5, ADIOS
  – Debugging: DDT, Vampir, Valgrind
• Full list of installed software available on our website
• If you can’t find what you need, just ask!
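On OLCF machines this software is typically exposed through environment modules; a hypothetical session is sketched below (the module names are guesses — run `module avail` for the real list):

```shell
module avail              # list the installed software stack
module load paraview      # hypothetical module name for ParaView
module load python        # e.g., to pick up the SciPy stack
module list               # confirm what is currently loaded
```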
Transitioning to Rhea
• Now:
  – Titan and Lens mount Widow
Transitioning to Rhea
• Soon (mid-to-late November):
  – Titan will mount both Atlas and Widow
  – Move data to Atlas and take advantage of Rhea
Transitioning to Rhea
• Near future:
  – Lens will be decommissioned
  – Titan and Rhea will mount Atlas
  – Rhea will be the center’s viz & analysis cluster
Questions?
Spider II Directory Layout Changes
Chris Fuson
OLCF Center-wide File Systems
• Spider
  – Center-wide scratch space
  – Temporary; not backed up
  – Available from compute nodes
  – Fast access to job-related temporary files and for staging large files to and from archival storage
  – Contains multiple Lustre file systems
Spider I vs. Spider II

| | Spider I (Widow [1-3]) | Spider II (Atlas [1-2]) |
|---|---|---|
| Bandwidth | 240 GB/s | 1 TB/s |
| Capacity | 10 PB | 30 PB |
| MDS | 3 | 2 |
| OSS | 192 | 288 |
| OST | 1,344 | 2,016 |
| Status | Current center-wide scratch; to be decommissioned early January 2014 | Available on additional OLCF systems soon |
Spider II Change Overview
Before using Spider II, please note the following:

1. New directory structure
   – Organized by project
   – Each project is given a directory on one of the Atlas file systems
   – WORKDIR is now within the project areas
     » You may have multiple WORKDIRs
     » Requires a change to your scripts
2. Quota increases
   – The larger file system allows for increased quotas
3. All areas purged
   – To help ensure space is available for all projects
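Because all areas are purged, it helps to spot files nearing the purge window. A minimal sketch using GNU `find`, with a temporary directory standing in for your scratch area (the 14-day threshold matches the Member Work and World Work purge policies described below):

```shell
dir=$(mktemp -d)                      # stands in for a scratch area
touch -d '20 days ago' "$dir/old.dat" # backdated file: past the window
touch "$dir/new.dat"                  # fresh file: safe for now
find "$dir" -type f -mtime +14        # lists only old.dat, a purge candidate
```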
Spider II Directory Structure: Member Work
• Purpose: Batch job I/O
• Path: $MEMBERWORK/<projid>
• Quota: 10 TB
• Purge: 14 days
• Permissions:
  – User is allowed to change permissions to share within the project
  – No automatic permission changes
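Sharing within the project is therefore a manual `chmod`; a minimal sketch (a temporary directory stands in for the real $MEMBERWORK/<projid> area, and the directory name `shared` is hypothetical):

```shell
dir=$(mktemp -d)            # stands in for $MEMBERWORK/<projid>
mkdir "$dir/shared"
chmod 750 "$dir/shared"     # owner rwx, project group r-x, world none
stat -c '%a' "$dir/shared"  # prints 750
```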
Spider II Directory Structure: Project Work
• Purpose: Data sharing within a project
• Path: $PROJWORK/<projid>
• Quota: 100 TB
• Purge: 90 days
• Permissions:
  – Read, Write, Execute access for project members
Spider II Directory Structure: World Work
• Purpose: Data sharing with users who are not members of the project
• Path: $WORLDWORK/<projid>
• Quota: 10 TB
• Purge: 14 days
• Permissions:
  – Read, Execute for world
  – Read, Write, Execute for project members
Spider II Directory Structure
• New directory structure
  – Organized by project
Before Using Atlas
• Modify scripts to point to the new directory structure:
  – /tmp/work/$USER or $WORKDIR → $MEMBERWORK/<projid>
  – /tmp/proj/<projid> → $PROJWORK/<projid>
• Migrate data
  – You will need to transfer needed data onto Spider II (Atlas)
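The data migration amounts to copying from the old Widow paths to the new Atlas ones. A sketch with stand-in temporary directories (the real source and target would be /tmp/work/$USER and $MEMBERWORK/<projid>; for large transfers, tools from the software stack such as Globus or BBCP are better suited than plain `cp`):

```shell
old=$(mktemp -d)                 # stands in for /tmp/work/$USER (Widow)
new=$(mktemp -d)                 # stands in for $MEMBERWORK/<projid> (Atlas)
echo "results" > "$old/run01.dat"
cp -r "$old/." "$new/"           # small transfers; prefer Globus/BBCP for big ones
ls "$new"                        # prints run01.dat
```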
Questions?
• More information: www.olcf.ornl.gov/kb_articles/atlas-transition/
• Email: [email protected]
Other Items
• Dec 17th - Titan to return to 100%
• 2013 User Survey– Available on olcf.ornl.gov
Thanks for your time.