SouthGrid Status, Pete Gronbech: 27th June 2006, GridPP 16, QMUL
TRANSCRIPT
Status at RAL PPD
• SL3 cluster on gLite 3.0.1
• CPUs: 66 2.8 GHz Xeons
• 6.4 TB of RAID disks (dCache)
• 24 VOs supported
• Network monitoring machine installed and running

Currently installing new kit:
• 52 dual-core dual-Opteron nodes
• 64 TB of storage
• Nortel switches with a 10 Gb/s-capable uplink. Hope to get a 10 Gb/s link to the RAL Tier 1 this summer.
Status at Cambridge
• Currently gLite 3.0.1 on SL3
• CPUs: 42 2.8 GHz Xeons
• 3.2 TB storage
• Network monitoring box requested
• Condor batch system
• Lack of Condor support from the LCG teams
• Problems installing experiment software with the VO sgm account, as it does not map well to condoruser1/2. Currently overcome by local intervention, but a long-term solution is needed.
• Future upgrades to CamGrid in Autumn 06 could mean an additional 160 KSI2K plus 4–5 TB of disk storage.
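The sgm difficulty above is, at bottom, an account-mapping problem between the grid layer and the Condor batch system. As a purely illustrative sketch (the DN, file path and local account name below are hypothetical, not from the talk), one long-term approach would be to map the VO software manager's DN to a dedicated local account rather than the generic condoruser pool:

```
# /etc/grid-security/grid-mapfile (illustrative entry; DN and account are hypothetical)
"/C=UK/O=eScience/OU=Cambridge/L=HEP/CN=atlas software manager" atlassgm
```

With such an entry, sgm jobs would run as a stable, VO-owned account (here `atlassgm`) instead of condoruser1/2, so experiment software installs land in a predictable location. This is a sketch of one possible option, not the solution the site actually adopted.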
Status at Bristol
• Status
– Running gLite 3.0.1 on SL3
• Existing resources
– GridPP nodes plus local cluster nodes used to bring the site online.
– Now have 16 WN CPUs (an eight-fold increase since the last GridPP meeting).
– Network monitoring box installed and running.
• New resources
– 2 TB storage increase in the SE next week.
– University investment in hardware includes CPU, high-quality and scratch disk resources.
– Installation commences this summer (512 cores to be installed in August, 2048 by Jan 07).
– 1 Gbps private link to RAL for the grid cluster next year.
• Staff
– New SouthGrid support/development post (GridPP/HP) being filled.
Status at Birmingham
• Currently SL3 with gLite 3.0.1
• CPUs: 28 2.0 GHz Xeons (+98 800 MHz)
• 1.9 TB DPM installed Oct 2005
• BaBar farm runs SL3, and the Bristol farm is integrated
• Running the Pre-Production Service
• Network monitoring box installed and running
• ALICE VO box installed
Status at Oxford
• Currently gLite 3.0.1 on SL3.0.5
• 72 WN CPUs running
• CPUs: 88 2.8 GHz in total
• 3.0 TB DPM storage on one server node and one pool node
• Some further air-conditioning problems, now resolved for Room 650; the second rack is in an overheating basement.
• Five dual 2.8 GHz servers used for the SouthGrid Testzone have been used for pre-release testing of 2.7.0 and gLite 3.
• Network monitoring box in place and running
• Oxford was the first SouthGrid site to install gLite 3
• OPS VO enabled
• Also in talks with the Oxford CampusGrid and NGS
Oxford Tier 2 GridPP Cluster, June 2006
[Chart of queued and running jobs by VO (LHCb, ATLAS, Biomed). The biomed allocation has just been reduced as there are waiting LHC jobs.]
gLite 3.0.0 upgrade
SC4
• All sites now on gigabit Ethernet, at least to the SEs.
• All sites achieved >250 Mbps in the throughput tests.
• 4/5 sites have the network monitoring box installed and running; the remaining site is expecting theirs soon.
• All sites now supporting the OPS VO.
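As a back-of-envelope check on figures like the 250 Mbps target above, converting a measured transfer into a throughput rate is just bits over seconds. A minimal sketch (the sample transfer size and duration are made up for illustration, not taken from the SC4 tests):

```python
def throughput_mbps(bytes_transferred: int, seconds: float) -> float:
    """Convert a measured transfer (bytes over wall-clock seconds)
    into megabits per second."""
    return bytes_transferred * 8 / seconds / 1e6

# Illustrative numbers only: a 2 GB file moved in 60 seconds.
rate = throughput_mbps(2 * 10**9, 60.0)
print(f"{rate:.0f} Mbps; meets 250 Mbps target: {rate > 250}")
```

Running this prints roughly 267 Mbps, i.e. a single sustained transfer at that pace would already clear the 250 Mbps bar.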
Stability, Throughput and Involvement
• SouthGrid continues to perform well, with good stability and functionality.
• Bristol PP additional nodes
• All five sites running gLite 3 by 22nd June
• Support for many VOs across the Tier 2, including non-LHC VOs such as biomed, zeus, hone, ilc and pheno
• All sites SRM-enabled (1 dCache, 4 DPM) by Oct 05
• 5 sites running an LFC
Summary & outlook
• SouthGrid continues to maintain good momentum; all sites are running the latest release and have SRM-enabled SEs. SouthGrid was the joint first T2 to be running gLite 3.
• The new systems administrator at Bristol, Winnie Lacesso, has helped make progress at Bristol more rapid.
• RAL PPD is installing a large upgrade.
• Cambridge is expecting an upgrade in the autumn.
• Bristol will have a percentage of the new campus cluster.
• Birmingham will have a percentage of the new campus cluster.
• Oxford will be able to expand resources once the new computer room is built.
• Yves Coppens continues to provide valuable help across SouthGrid.