11th October 2012 Graduate Lectures 1
Oxford University Particle Physics Unix Overview
Pete Gronbech
Senior Systems Manager and GridPP Project Manager
11th October 2012 Graduate Lectures 2
Strategy
Local Cluster Overview
Connecting to it
Grid Cluster
Computer Rooms
How to get help
11th October 2012 Graduate Lectures 3
Particle Physics Strategy
The Server / Desktop Divide
[Diagram: the desktop/server divide – Windows XP and Windows 7 PCs and Linux desktops on one side; General Purpose Unix servers, Group DAQ systems, Linux worker nodes, web server, Linux file servers, virtual machine host, NIS server and torque server on the other]
Approx. 200 desktop PCs, with Exceed, PuTTY or ssh/X Windows, are used to access the PP Linux systems.
11th October 2012 Graduate Lectures 4
Particle Physics Linux
Unix Team (Room 661):
Pete Gronbech – Senior Systems Manager and GridPP Project Manager
Ewan MacMahon – Grid and Local Systems Administrator
Kashif Mohammad – Grid Support
Sean Brisbane – Local Server and User Support
Aim to provide general-purpose Linux-based systems for code development, testing, and other Linux-based applications.
Interactive login servers and batch queues are provided.
Systems run Scientific Linux, a free distribution based on Red Hat Enterprise Linux.
Systems are currently running SL5, the same version as used on the Grid and at CERN. Students should use pplxint5 and 6.
A Grid User Interface is provided on the interactive nodes to allow job submission to the Grid.
Worker nodes form a PBS (aka torque) cluster accessed via batch queues.
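A minimal sketch of a torque batch job submitted from pplxint5 or 6 (the script name, queue name and resource request are illustrative, not the site defaults):

  #!/bin/bash
  #PBS -N myjob                 # job name
  #PBS -q normal                # example queue name; check the local queue list
  #PBS -l nodes=1:ppn=1         # one core on one node
  cd $PBS_O_WORKDIR             # start in the directory the job was submitted from
  ./my_analysis                 # your program

Submit with "qsub myjob.sh", watch it with "qstat -u $USER" and cancel it with "qdel <jobid>".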
11th October 2012 Graduate Lectures 5
Current Clusters
Particle Physics Local Batch cluster
Oxford's Tier 2 Grid cluster
11th October 2012 Graduate Lectures 6
PP Linux Batch Farm
[Diagram: the batch farm – all nodes run Scientific Linux 5; interactive login nodes pplxint5 and pplxint6 (8 * Intel 5420 cores each); worker nodes pplxwn4, pplxwn5, pplxwnnn, … (8 * Intel 5420 cores each); worker nodes pplxwn25 – pplxwn32 (16 * AMD Opteron 6128 cores each)]
Users log in to the interactive nodes pplxint5 & 6; the home directories and all the data disks (/home area or /data/group) are shared across the cluster and visible on the interactive machines and all the batch system worker nodes.
Approximately 350 cores, each with 4GB of RAM.
11th October 2012 Graduate Lectures 7
PP Linux Batch Farm Data Storage
[Diagram: NFS file servers (pplxfsn) – a 9TB home-area server and 19TB, 19TB and 30TB data-area servers]
NFS is used to export data to the smaller experimental groups, where the partition size is less than the total size of a server.
The data areas are too big to be backed up. The servers have dual redundant PSUs, RAID 6, and are running on uninterruptible power supplies. This safeguards against hardware failures, but does not help if you delete files.
The home areas are backed up nightly by two different systems: the OUCS HFS service and a local backup system. If you delete a file, tell us as soon as you can when you deleted it and its full name. The latest nightly backup of any lost or deleted files from your home directory is available at the read-only location "/data/homebackup/{username}" (see the sketch at the end of this slide).
The home areas are quota’d but if you require more space ask us.
Store your thesis on /home NOT /data.
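A minimal sketch of recovering a deleted file from the nightly home-area backup ($USER stands for your username; the file and directory names are illustrative):

  cp /data/homebackup/$USER/thesis/chapter1.tex ~/thesis/chapter1.tex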
11th October 2012 Graduate Lectures 8
Particle Physics Computing
[Diagram: Lustre object storage servers, e.g. Lustre OSS04, 44TB]

df -h /data/atlas
Filesystem      Size  Used Avail Use% Mounted on
/lustre/atlas   183T  147T   27T  85% /data/atlas

df -h /data/lhcb
Filesystem      Size  Used Avail Use% Mounted on
/lustre/lhcb     58T   40T   16T  72% /data/lhcb
The Lustre file system is used to group multiple file servers together to provide extremely large continuous file spaces. This is used for the Atlas and LHCb groups.
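A minimal sketch of checking the Lustre areas with the Lustre client tools (assuming the lfs utility is available on the interactive nodes; the standard df shown above gives the same overall numbers):

  lfs df -h /data/atlas    # per-server (OST) breakdown of the ATLAS Lustre file system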
11th October 2012 Graduate Lectures 10
Strong Passwords etc
Use a strong password not open to dictionary attack!
fred123 – No good
Uaspnotda!09 – Much better
Better to use ssh with a passphrased key stored on your desktop.
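A minimal sketch of setting up a passphrased key from a Linux or Mac desktop (the user and host names are illustrative; Windows users should follow the PuTTYgen and Pageant steps on the following slides):

  ssh-keygen -t rsa -b 4096                     # choose a passphrase when prompted
  ssh-copy-id [email protected]    # appends the public key to ~/.ssh/authorized_keys
  ssh [email protected]            # later logins use the key; ssh-add caches the passphrase for the session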
11th October 2012 Graduate Lectures 11
Connecting with PuTTY
Demo
1. Plain ssh terminal connection
2. With key and Pageant
3. ssh with X windows tunnelled to passive exceed
4. ssh, X windows tunnel, passive exceed, KDE Session
http://www.physics.ox.ac.uk/it/unix/particle/XTunnel%20via%20ssh.htm
http://www.howtoforge.com/ssh_key_based_logins_putty
11th October 2012 Graduate Lectures 12
Puttygen to create an ssh key on Windows
11th October 2012 Graduate Lectures 13
Paste this into ~/.ssh/authorized_keys on pplxint
If you are likely to then hop to other nodes, add:
ForwardAgent yes
to a file called config in the .ssh dir on pplxint (see the sketch at the end of this slide).
Save the public and private parts of the key to a subdirectory of your h: drive
Pageant
Run Pageant once after login to load your (Windows) ssh key.
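A minimal sketch of the two files involved on pplxint (the public key line is truncated and purely illustrative):

  # ~/.ssh/authorized_keys – one line per public key, pasted from PuTTYgen
  ssh-rsa AAAAB3NzaC1yc2E...rest-of-key... user@desktop

  # ~/.ssh/config – enables agent forwarding for onward hops
  ForwardAgent yes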
11th October 2012 Graduate Lectures 15
SouthGrid Member Institutions
Oxford
RAL PPD
Cambridge
Birmingham
Bristol
Sussex
JET at Culham
Current capacity
Compute Servers
Twin and twin-squared nodes – 1300 CPU cores
Storage
Total of ~700TB. The servers have between 12 and 36 disks; the more recent ones are 2TB each. These use hardware RAID and UPS to provide resilience.
11th October 2012 Graduate Lectures 17
Get a Grid Certificate
Must remember to use the same web browser to request and retrieve the Grid Certificate.
Once you have it in your browser, you can export it to the Linux cluster to run grid jobs (see the sketch at the end of this slide).
Details of these steps and how to request membership of the SouthGrid VO (if you do not belong to an existing group such as ATLAS, LHCb) are here:
http://www.gridpp.ac.uk/southgrid/VO/instructions.html
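A minimal sketch of installing the certificate exported from your browser onto the Linux cluster (assumes a PKCS#12 export called mycert.p12; the ~/.globus file names follow the usual grid convention):

  mkdir -p ~/.globus
  openssl pkcs12 -in mycert.p12 -clcerts -nokeys -out ~/.globus/usercert.pem
  openssl pkcs12 -in mycert.p12 -nocerts -out ~/.globus/userkey.pem
  chmod 444 ~/.globus/usercert.pem
  chmod 400 ~/.globus/userkey.pem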
11th October 2012 Graduate Lectures 18
Two New Computer Rooms provide excellent infrastructure for the future
The new computer room built at Begbroke Science Park, jointly for the Oxford Supercomputer and the Physics department, provides space for 55 (11kW) computer racks, 22 of which will be for Physics. Up to a third of these can be used for the Tier 2 centre. This £1.5M project was funded by SRIF and a contribution of ~£200K from Oxford Physics.
The room was ready in December 2007. The Oxford Tier 2 Grid cluster was moved there during spring 2008. All new Physics High Performance Clusters will be installed here.
11th October 2012 Graduate Lectures 19
Local Oxford DWB Physics Infrastructure Computer Room
Completely separate from the Begbroke Science Park, a computer room with 100kW of cooling and >200kW of power has been built, funded by ~£150K of Oxford Physics money.
Local Physics department Infrastructure computer room.
Completed September 2007.
This allowed local computer rooms to be refurbished as offices again, and racks that were in unsuitable locations to be re-housed.
Cold aisle containment
11th October 2012 Graduate Lectures 20
11th October 2012 Graduate Lectures 21
The end for now…
Ewan will give more details of the use of the clusters next week.
Help Pages
http://www.physics.ox.ac.uk/it/unix/default.htm
http://www2.physics.ox.ac.uk/research/particle-physics/particle-physics-computer-support
Email
[email protected]
Questions….
Network Topology