CENTER FOR HIGH PERFORMANCE COMPUTING
Introduction to I/O in the HPC Environment
Brian Haymore, [email protected]
Sam Liston, [email protected]
Center for High Performance Computing
Spring 2013
Overview
• CHPCFS Topology
04/21/23 http://www.chpc.utah.edu Slide 2
Overview
• Couplet Topology
  – Redundancy
  – Performance
  – Used on:
    • Redbutte
    • Drycreek
    • Greenriver
    • Saltflats
    • IBrix Scratch
  – Vertically/Horizontally Scalable
[Figure: Network Clients, 10GigE/40GigE links]
Overview
• SAN storage redundancy – RAID
• RAID = Redundant Array of Inexpensive Disks
Overview
• Types of storage available at CHPC
  – Home Directory (e.g. /uufs/chpc.utah.edu/common/home/uNID)
    • Backed up per department (except the CHPC_HPC file system)
    • Intended for critical/volatile data
    • Expected to maintain a high level of responsiveness
  – Group Data Space (e.g. /uufs/chpc.utah.edu/common/home/pi_grp)
    • Optional per-department archive
    • Intended for active projects, persistent data, etc.
    • Usage expectations to be set by the group
  – Network Mounted Scratch (e.g. /scratch/serial)
    • No expectation of data retention (it’s scratch)
    • Expected to maintain a high level of I/O performance under significant load
  – Local Disk (e.g. /tmp)
    • Most consistent I/O
    • No expectation of data retention
    • Unique per machine
Overview
• Good Citizens
  – Shared Environment Characteristics
    • Many-to-one relationship (over-subscribed)
    • Globally accessible
    • Global resources are still finite
    • Consider your usage impact when choosing a storage location
    • Be aware of any usage policies
    • Evaluate different I/O methodologies
    • Seek additional assistance from CHPC
Best Practices
• Data Segregation: Where should files be stored?
  – Classify your data
    • How important is it?
    • Can it be recreated?
    • Is this dataset currently in use?
    • Will this data be used by others?
    • Does this data need to be backed up?
  – Put your data in the appropriate space
    • Home Directory
    • Group Space
    • Scratch
    • /tmp
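The classification questions above can be folded into a small decision helper. This is only a sketch: the function name, its three parameters, and the order in which the rules are checked are assumptions made for illustration, not CHPC policy.

```python
def suggest_space(critical: bool, shared: bool, multi_node: bool) -> str:
    """Suggest a storage space from three yes/no answers.

    critical   : important, hard to recreate, or needs a backup (assumed grouping)
    shared     : will be used by others in the group
    multi_node : must be readable from more than one machine

    The rule order is an illustrative assumption, not a site policy.
    """
    if critical:
        return "home directory"   # the space described as backed up
    if shared:
        return "group space"      # active projects and persistent data
    if multi_node:
        return "scratch"          # network mounted, no retention expected
    return "/tmp"                 # local disk, unique per machine


if __name__ == "__main__":
    # Temporary intermediate files used by a single-node job.
    print(suggest_space(critical=False, shared=False, multi_node=False))
```

A real decision would also weigh dataset size and any usage policies that apply to each space.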
Considerations
• Backup Impact
  – Performance Characteristics
    • Time (when, duration)
    • Competition/concurrent access
    • Capacity of files backed up
    • Quantity of files backed up
    • Unintended consequences
Considerations
• Network Performance
Considerations
• Data Migration
  – What are we moving? What does it look like?
  – From where to where are we moving the data?
  – What transfer performance do we expect?
  – What tool will we use to make the transfer?
    • SSH/SFTP
      – Simple
      – Very portable
    • rsync
      – Restartable
      – File verification
    • tar via SSH
      – More efficient with many small files
    • Compression?
    • Secure
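The point about tar and many small files can be sketched with Python's standard `tarfile` module: packing a directory into one archive turns a large number of per-file transfers into a single stream. The function name `bundle_directory` and its parameters are hypothetical, and the transfer step itself is deliberately left out.

```python
import tarfile
from pathlib import Path


def bundle_directory(src_dir: str, archive_path: str, compress: bool = True) -> int:
    """Pack every regular file under src_dir into one tar archive.

    Returns the number of files packed. Member names are stored relative to
    src_dir. archive_path should live outside src_dir so the archive does not
    try to include itself.
    """
    mode = "w:gz" if compress else "w"   # gzip is optional; see "Compression?"
    src = Path(src_dir)
    count = 0
    with tarfile.open(archive_path, mode) as tar:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                tar.add(path, arcname=str(path.relative_to(src)))
                count += 1
    return count
```

Whether compression helps depends on the data and on whether the CPU or the network is the bottleneck, so `compress` is left as a switch rather than a fixed choice.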
File Operations
• Directory Structure
  – Poor performance when too many files are in the same directory
  – Organizing files in a tree avoids this issue
  – Directory block count significance
• Network vs. Local
  – IOPS vs. Bandwidth
  – Network I/O
    • Overhead
    • Limited by network pipe
    • More efficient for bandwidth than for IOPS
  – Local I/O
    • Limited size
    • Not globally accessible
    • Depending on hardware, offers a fair balance between bandwidth and IOPS
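One common way to organize files in a tree, as the Directory Structure bullets recommend, is to derive subdirectory names from a hash of the file name so that no single directory grows very large. The sketch below is an assumption about how this could be done; the names `sharded_path` and `write_sharded`, the use of SHA-1, and the default of two levels of two hex characters are all illustrative choices.

```python
import hashlib
from pathlib import Path


def sharded_path(root: str, filename: str, levels: int = 2, width: int = 2) -> Path:
    """Return root/<h1>/<h2>/.../filename, where each <h> is a slice of the
    hex SHA-1 digest of the file name. The mapping is deterministic, so the
    same name always lands in the same directory."""
    digest = hashlib.sha1(filename.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return Path(root, *parts, filename)


def write_sharded(root: str, filename: str, data: bytes) -> Path:
    """Write data under the sharded path, creating directories as needed."""
    path = sharded_path(root, filename)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return path
```

With the defaults there are 256 × 256 possible leaf directories, so a million files average roughly 15 per directory instead of a million in one.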
File Operations
• Metadata Operations Performance Considerations
  – Create, Destroy, Stat, etc.
  – IOPS-oriented performance
• Application I/O Performance Considerations
  – How often do we open and close files?
  – At what I/O granularity do our applications write files?
  – Are we doing anything else silly?
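The granularity question can be explored with a small experiment: write the same records once with buffering disabled, so each record becomes its own write, and once with a large buffer that coalesces them. This is a minimal sketch; `write_records` and `timed_write` are hypothetical helper names, and no timings are claimed here because they depend entirely on the file system being tested.

```python
import time
from typing import Iterable


def write_records(path: str, records: Iterable[bytes], buffer_size: int) -> None:
    """Write records to path in a single open/close.

    buffer_size=0 disables buffering, so each record is written on its own
    (fine-grained I/O). A value greater than 1 sets the buffer size in bytes,
    letting Python combine many records into fewer, larger writes.
    """
    with open(path, "wb", buffering=buffer_size) as f:
        for rec in records:
            f.write(rec)


def timed_write(path: str, records: Iterable[bytes], buffer_size: int) -> float:
    """Return the wall-clock seconds taken by write_records."""
    start = time.perf_counter()
    write_records(path, records, buffer_size)
    return time.perf_counter() - start
```

Running `timed_write` with `buffer_size=0` and then with, say, `1 << 20` against a network mounted path and against local disk gives a rough feel for how sensitive each location is to small writes.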
Examples
• Code (user example)
  – Code wrote millions of small files (<10kB) to a single directory
  – Code wrote thousands of larger files (100s of MB)
  – Code used a compression algorithm to speed up I/O of the larger files
• Observations
  – Changing the default read/write chunk of the compression I/O from 16kB to 32kB/64kB improved performance by 10%-20%
  – Changing the code to write files in a hierarchical directory structure produced a speedup of several times
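The chunk-size observation can be reproduced in spirit with a streaming compressor whose read size is a parameter. The user's actual compression algorithm is not named in the slides, so gzip is used here purely as a stand-in, and `compress_file` is a hypothetical helper.

```python
import gzip
import shutil


def compress_file(src: str, dst: str, chunk_size: int = 64 * 1024) -> None:
    """Stream src into a gzip file at dst, reading chunk_size bytes at a time.

    gzip stands in for the unnamed compression algorithm in the example;
    chunk_size is the knob that was varied (16kB vs. 32kB/64kB).
    """
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, chunk_size)
```

Timing this function with `chunk_size` set to `16 * 1024`, `32 * 1024`, and `64 * 1024` on a representative file is a cheap way to check whether a similar gain applies to a given workload and file system.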
Examples
• Single Directory vs. Hierarchical Directory Structure
Examples
• Processing Many Small Files over Network vs. Local
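A typical pattern for this comparison is to stage the inputs onto local disk, process them there, and copy only the results back. The sketch below assumes that pattern; `run_on_local_disk`, the `work` callback, and the `/tmp` default are illustrative, and whether staging pays off depends on how often each file is touched.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def run_on_local_disk(input_dir: str, output_dir: str,
                      work: Callable[[Path, Path], None],
                      local_root: str = "/tmp") -> None:
    """Copy input_dir to a temporary directory under local_root, call
    work(local_input, local_output) there, then copy the outputs to
    output_dir. The temporary directory is removed afterwards, since
    local disk has no expectation of data retention."""
    with tempfile.TemporaryDirectory(dir=local_root) as scratch:
        local_in = Path(scratch) / "in"
        local_out = Path(scratch) / "out"
        shutil.copytree(input_dir, local_in)      # stage in
        local_out.mkdir()
        work(local_in, local_out)                 # all small-file I/O is local
        shutil.copytree(local_out, output_dir, dirs_exist_ok=True)  # stage out
```

`dirs_exist_ok` requires Python 3.8 or later. Local disk is limited in size, so the input set must fit on it.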
Examples
• Bonnie Test
Examples
• Tar (Linux Kernel)
Examples
• Compile (Linux Kernel)
Examples
• Fine vs. Coarse I/O (Read)
Examples
• Fine vs. Coarse I/O (Writes)
Troubleshooting
• Diagnosing Slowness
– Open a ticket ([email protected])
– File system
– System load
– Network load
• Future Monitoring
– Ganglia
• Additional Information
– http://www.chpc.utah.edu