
A PRACTICAL INTRODUCTION TO THE ANSELM SUPERCOMPUTER

Infrastructure, access, and user support

David Hrbáč, 2013-09-27

• Intro
• What is a supercomputer
• Infrastructure
• Access to the cluster
• Support
• Log-in

http://www.it4i.cz/files/hrbac.pdf
http://www.it4i.cz/files/hrbac.pptx

Why Anselm

• 6,000 name suggestions
• The very first coal mine in the region
• The very first to have a steam engine
• Anselm of Canterbury

Early days

Future - Hal

What is a supercomputer

• A bunch of computers
• A lot of CPU power
• A lot of RAM
• Local storage
• Shared storage
• High-speed interconnect
• Message Passing Interface (MPI)

Supercomputer

Supercomputer ?!?

Anselm HW

• 209 compute nodes
• 3,344 cores
• 15 TB RAM
• 300 TB /home
• 135 TB /scratch
• Bull Extreme Computing Linux (a RHEL clone)

Type of Nodes

• 180 compute nodes
• 23 GPU-accelerated nodes
• 4 MIC-accelerated nodes
• 2 fat nodes

General Node

• 180 nodes
• 2,880 cores in total
• Two Intel Sandy Bridge E5-2665 8-core 2.4 GHz processors per node
• 64 GB of physical memory per node
• One 500 GB SATA 2.5" 7.2k rpm HDD per node
• bullx B510 blade servers
• cn[1-180]

GPU Accelerated Nodes

• 23 nodes
• 368 cores in total
• Two Intel Sandy Bridge E5-2470 8-core 2.3 GHz processors per node
• 96 GB of physical memory per node
• One 500 GB SATA 2.5" 7.2k rpm HDD per node
• One NVIDIA Tesla Kepler K20 GPU accelerator per node
• bullx B515 blade servers
• cn[181-203]

MIC Accelerated Nodes

Intel Many Integrated Core Architecture

• 4 nodes
• 64 cores in total
• Two Intel Sandy Bridge E5-2470 8-core 2.3 GHz processors per node
• 96 GB of physical memory per node
• One 500 GB SATA 2.5" 7.2k rpm HDD per node
• One Intel Xeon Phi 5110P MIC accelerator per node
• bullx B515 blade servers
• cn[204-207]

Fat Nodes

• 2 nodes
• 32 cores in total
• Two Intel Sandy Bridge E5-2665 8-core 2.4 GHz processors per node
• 512 GB of physical memory per node
• Two 300 GB SAS 3.5" 15k rpm HDDs (RAID 1) per node
• Two 100 GB SLC SSDs per node
• bullx R423-E3 servers
• cn[208-209]

Storage

• 300 TB /home
• 135 TB /scratch
• InfiniBand 40 Gb/s
  – Native: 3,600 MB/s
  – Over TCP: 1,700 MB/s
• Ethernet: 114 MB/s
• LustreFS

Lustre File System

• Clustered
• OSS – object storage server
• MDS – metadata server
• Limits in petabytes
• Parallel – striped

Stripes

• Stripe count
  – Enables parallel access
  – Mind the number of processes in your script
  – Rule of thumb: one stripe per gigabyte
• lfs setstripe | lfs getstripe (see the sketch below)
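A minimal sketch of the two commands, assuming a user directory on /scratch (the path and stripe count here are illustrative, not site policy):

    # Stripe new files in this directory across 4 OSTs
    # (per the rule of thumb above: roughly one stripe per gigabyte).
    lfs setstripe -c 4 /scratch/hrb33/run01

    # Show the stripe count and layout actually in effect.
    lfs getstripe /scratch/hrb33/run01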

Quotas

• /home – 250 GB
• /scratch – no quota
• lfs quota -u hrb33 /home (usage sketched below)
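As a usage sketch (hrb33 is the presenter's example login; substitute your own, and note the /scratch path is an assumption):

    # Check /home usage against the 250 GB quota; Lustre reports
    # kbytes/quota/limit and files/quota/limit columns.
    lfs quota -u hrb33 /home

    # /scratch has no quota, but usage can still be checked directly.
    du -sh /scratch/hrb33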

Access to Anselm

• Internal Access Call – 4x a year
  – now in its 3rd round
• Open Access Call – 2x a year
  – now in its 2nd round

Proposals

• Proposals undergo evaluation
  – Scientific
  – Technical
  – Economic
• Principal Investigator
  – List of collaborators

Login Credentials

• Personal certificate
• Signed request
• Credentials delivered encrypted (see the sketch below):
  – Login
  – Password
  – SSH keys
  – Passphrase for the key
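A minimal sketch, assuming the issued private key is stored as ~/.ssh/id_rsa (the path is an assumption): once the encrypted credentials arrive, you can replace the delivered passphrase locally:

    # Change the passphrase protecting the issued private key
    # (prompts for the old passphrase, then the new one twice).
    ssh-keygen -p -f ~/.ssh/id_rsa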

Credentials lifetime

• Valid while you have an active project or an affiliation with IT4Innovations
• Deleted 1 year after the last project ends
• Announcements:
  – 3 months before removal
  – 1 month before removal
  – 1 week before removal

Support

• Bug tracking and trouble-ticketing system
• Documentation
• IT4I internal command-line tools
• IT4I web applications
• IT4I Android application
• End-user courses

Main Support Channel

• Request Tracker
• support@it4i.cz

Documentation

• https://support.it4i.cz/docs/anselm-cluster-documentation/

• Still evolving
• Changes almost every day

IT4I internal command line tools

• it4free
• rspbs
• License allocation

• Internal in-house scripts
  – Automation for handling credentials
  – Cluster automation
  – PBS accounting

IT4I web applications

• Internal information system
  – Project management
  – Project accounting
  – User management
• Cluster monitoring

IT4I android application

• Internal tool
• Considering a release to end users

• Features:
  – News
  – Graphs

• Feature requests:
  – Accounting
  – Support
  – Node allocation
  – Job status

Log-in to Anselm – Finally!

• SSH protocol
• Via anselm.it4i.cz (see the example below)
  – login1.anselm.it4i.cz
  – login2.anselm.it4i.cz
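A minimal log-in sketch; hrb33 is the presenter's example login, and the key path is an assumption:

    # Connect to the round-robin alias; it lands on login1 or login2.
    ssh -i ~/.ssh/id_rsa hrb33@anselm.it4i.cz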

VNC

• ssh anselm -L 5961:localhost:5961 (full workflow sketched below)
• Remmina
• vncviewer 127.0.0.1:5961
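A sketch of the whole VNC workflow, assuming display :61 (VNC maps display :N to TCP port 5900 + N, hence 5961) and that a VNC server is available on the login node; the geometry is illustrative:

    # 1. On the login node: start a VNC server on display :61.
    vncserver :61 -geometry 1440x900

    # 2. On your workstation: tunnel port 5961 over SSH.
    ssh anselm.it4i.cz -L 5961:localhost:5961

    # 3. On your workstation: point a viewer at the tunnel
    #    (Remmina works too, as noted above).
    vncviewer 127.0.0.1:5961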

Links

• https://support.it4i.cz/docs/anselm-cluster-documentation/

• https://support.it4i.cz/
• https://www.it4i.cz/

Questions

Thank you.
