
Page 1:

11th Oct 2005, HEPiX, SLAC

Oxford University Particle Physics Site Report

Pete Gronbech
Systems Manager and South Grid Technical Co-ordinator

Page 2:


Physics Department Computing Services

Physics department restructuring. Reduced the staff involved in system management by one. Still trying to fill another post.

E-Mail hubs: A lot of work done to simplify the system and reduce manpower requirements. Haven't had much effort available for anti-spam. Increased use of the Exchange servers.

Windows Terminal Servers: Still a popular service. More use of remote access to users' own desktops (XP only).

Web / Database: A lot of work around supporting administration and teaching.

Exchange Servers: Increased the size of the information store disks from 73 GB to 300 GB. One major problem with the failure of a disk, but it was solved by reloading overnight.

Windows Front End server (WINFE):
Access to the Windows file system via SCP, SFTP or web browser
Access to the Exchange server (web and Outlook)
Access to address lists (LDAP) for email and telephone
VPN service
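To give a feel for the WINFE file-access route, here is a minimal sketch of fetching a file over SFTP. It assumes the third-party paramiko SSH library, and the hostname, username and paths are hypothetical examples, not details of the deployed service.

    # Minimal sketch: fetch a file from the Windows front-end over SFTP.
    # Hostname, username and paths below are hypothetical examples.
    import paramiko

    def fetch_from_winfe(remote_path: str, local_path: str) -> None:
        client = paramiko.SSHClient()
        # Accept the server key on first use; a real setup would pin known hosts.
        client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        client.connect("winfe.example.ac.uk", username="username")
        try:
            sftp = client.open_sftp()
            sftp.get(remote_path, local_path)   # copy the remote file locally
            sftp.close()
        finally:
            client.close()

    if __name__ == "__main__":
        fetch_from_winfe("/home/username/report.doc", "report.doc")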

Page 3:


Windows Status

Systems are now almost entirely Windows XP or 2000 on clients and Windows Server 2003 for services. More machines have been brought into the centrally managed domain. More automation of updates, vulnerability scans, etc. More laptops to support.

Grant year | Windows desktops installed | Minimum spec | Maximum spec | Laptops (individual) | Laptops (pool)
98/99      | 25                         | P2/350       | P3/450       | -                    | -
99/00      | 34                         | P3/450       | P3/650       | 1                    | 2
00/01      | 22                         | P3/733       | P3/866       | 3                    | 2
01/02      | 78                         | P3/1000      | P4/1800      | 5                    | 3
02/03      | 49                         | P4/2.0       | P4/2.6       | 6+8                  | 2
03/04      | 50                         | P4/2.6       | P4/3.0       | 9                    | 2
04/05      | 45                         | P4/3.0       | P4/3.2       | 11                   | 3

Page 4:


Software Licensing

Cost increasing each year.
Have continued the NAG deal (libraries only).
New deal for Intel compilers, run through the OSC group.
Also deals for LabVIEW, Mathematica, Maple, IDL, system management tools, and software for imaging, backup, anti-spyware, etc.
(MS operating systems and Office are covered by the Campus Select agreement.)

Page 5:


Network

Gigabit connection to campus operational since July 2005. Several technical problems with the link delayed this by over half a year.

Gigabit firewall installed. Purchased a commercial unit to minimise the manpower required for development and maintenance: a Juniper ISG 1000 running NetScreen.

The firewall also supports NAT and VPN services, which is allowing us to consolidate and simplify the network services.

Moving to the firewall NAT has solved a number of problems we were having previously, including unreliability of videoconferencing connections.

Physics-wide wireless network. Installed in DWB public rooms, Martin Wood and Theory. Will install the same in AOPP. The new firewall provides routing and security for this network.

Page 6:

Network Access

[Diagram: campus network access. Labels include the Campus Backbone Router, OUCS Firewall, Backbone Edge Routers and departmental links at 100 Mb/s and 1 Gb/s; the Physics Firewall and Physics Backbone Router connected at 1 Gb/s; campus backbone links at 10 Gb/s; and a 2.4 Gb/s connection to SuperJanet 4.]

Page 7:


Network Security

Constantly under threat from vulnerability scans, worms and viruses. We are attacking the problem in several ways:

Boundary firewalls (but these don't solve the problem entirely, as people bring infections in on laptops) – new firewall
Keeping operating systems patched and properly configured – new Windows update server
Antivirus on all systems – more use of Sophos, but some problems
Spyware detection – anti-spyware software running on all centrally managed systems
Segmentation of the network into trusted and un-trusted sections – new firewall

Strategy

Centrally manage as many machines as possible to ensure they are up to date and secure – most Windows machines moved into the domain
Use a Network Address Translation (NAT) service to separate centrally managed and 'un-trusted' systems into different networks – new firewall plus new virtual LANs (a minimal reachability check is sketched after this list)
Continue to lock down systems by invoking network policies. The client firewall in Windows XP SP2 is very useful for excluding network-based attacks – centralised client firewall policies
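As a rough illustration of the trusted/un-trusted segmentation, a check like the following could be run from a host on the un-trusted (NAT) network to confirm that a service on a managed machine is not reachable. This is only a sketch: the hostname and port are hypothetical examples, not part of our tooling.

    # Minimal sketch: confirm that a TCP service on a managed host is NOT
    # reachable from the un-trusted network. Host and port are hypothetical.
    import socket

    def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
        """Return True if a TCP connection to host:port succeeds within timeout."""
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    if __name__ == "__main__":
        # Windows file sharing (TCP 445) should be blocked from the un-trusted side.
        if tcp_reachable("managed-desktop.example.ac.uk", 445):
            print("WARNING: port 445 reachable - segmentation rule not working")
        else:
            print("OK: port 445 blocked from this network")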

Page 8:


Particle Physics Linux

Aim to provide a general purpose Linux based system for code development, testing and other Linux based applications.

A new Unix admin (Rosario Esposito) has joined us, so we now have more effort to put into improving this system.

New main server installed (ppslgen) running Scientific Linux (SL3).

File server upgraded to SL3 and a 6TB disk array added.

Two dual-processor worker nodes reclaimed from ATLAS barrel assembly and connected as SL3 worker nodes. RH7.3 worker nodes are being migrated to SL3.

ppslgen and the worker nodes form a MOSIX cluster, which we hope will provide a more scalable interactive service. These also support conventional batch queues.

Some performance problems with GNOME on SL3 from Exceed. Evaluating an alternative (NX) to Exceed which doesn't exhibit this problem (and also has better integrated SSL).

Page 9:


[Diagram: PP Linux Batch Farm and migration to Scientific Linux. Scientific Linux 3 side: ppslgen plus worker nodes pplxwn05–pplxwn11 (mostly dual 2.4/2.8 GHz P4, plus an 8-way 700 MHz P3 machine). Red Hat 7.3 side (being migrated): pplxgen plus worker nodes pplxwn01–pplxwn04, pplx2 and pplx3 (dual 2.2/2.4 GHz P4 and dual 450/800 MHz P3 systems). File server pplxfs1 (dual 1 GHz P3) with 1.1 TB and 4 TB arrays, plus the newly added 6 TB SATA RAID array.]

Page 10:


SouthGrid Member Institutions

Oxford, RAL PPD, Cambridge, Birmingham, Bristol, HP-Bristol, Warwick

Page 11:


Stability, Throughput and Involvement

Pete Gronbech has taken on the role of technical coordinator for the SouthGrid Tier-2 centre.

The last quarter has been a good, stable period for SouthGrid:

Addition of Bristol PP
All sites upgraded to LCG-2_6_0
Large involvement in the Biomed Data Challenge

Page 12:


Status at RAL PPD

SL3 cluster on LCG 2.6.0
CPUs: 11 × 2.4 GHz and 33 × 2.8 GHz, 100% dedicated to LCG
0.7 TB storage, 100% dedicated to LCG
Configured 6.4 TB of IDE RAID disks for use by dCache
5 systems to be used for the pre-production testbed

Page 13:


Status at Cambridge

Currently LCG 2.6.0 on SL3
CPUs: 42 × 2.8 GHz (of the extra nodes, only 2 out of 10 are any good), 100% dedicated to LCG
2 TB storage (have 3 TB but only 2 TB available), 100% dedicated to LCG
Condor batch system; lack of Condor support from the LCG teams

Page 14:


Status at Birmingham

Currently SL3 with LCG-2_6_0
CPUs: 24 × 2.0 GHz Xeon (plus 48 local nodes which could in principle be used, but…), 100% LCG
1.8 TB classic SE, 100% LCG
BaBar farm moving to SL3, and Bristol integrated but not yet on LCG

Page 15:


Status at Oxford

Currently LCG 2.6.0 on SL304
All 74 CPUs running since ~June 20th
CPUs: 80 × 2.8 GHz, 100% LCG
1.5 TB storage; the second 1.5 TB will be brought online as DPM or dCache. 100% LCG
Heavy use by Biomed during their Data Challenge
Plan to give local users access

Page 16:


Oxford Tier 2 GridPP Cluster, Summer 2005

[Chart: cluster usage by VO over summer 2005, showing LHCb, ATLAS and ZEUS contributions and the Biomed Data Challenge, which supported a non-LHC EGEE VO for about 4 weeks from the start of August 2005.]

Pages 17–21: [no text content on these slides]

Page 22:


Oxford Computer Room

Modern processors take a lot of power and generate a lot of heat.

We have had many problems with air conditioning units and a power trip.

We need a new, properly designed and constructed computer room to deal with the increasing requirements.

Design work has been done locally, and the Design Office has checked the cooling and air flow.

The plan is to use two of the old target rooms on level 1: one for Physics and one for the new Oxford Supercomputer (800 nodes).

Requirements call for power and cooling of between 0.5 and 1 MW (a rough estimate is sketched below).

SRIF funding has been secured, but this now means it is all in the hands of the University's Estates. Now unlikely to be ready before next summer.
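As a rough sanity check on the 0.5 to 1 MW figure, the sketch below estimates the load for the 800-node supercomputer alone. Only the node count comes from the slide; the per-node power draw and cooling overhead are assumed example values, not numbers from the planning documents.

    # Order-of-magnitude estimate of computer-room power and cooling.
    # Only the node count (800) comes from the slide; the rest are assumptions.
    nodes = 800                 # planned Oxford Supercomputer nodes
    watts_per_node = 400.0      # assumed average draw per node (W)
    cooling_overhead = 1.0      # assume ~1 W of cooling per W of IT load

    it_load_kw = nodes * watts_per_node / 1000.0        # ~320 kW of IT load
    total_kw = it_load_kw * (1.0 + cooling_overhead)    # ~640 kW including cooling

    print(f"IT load: {it_load_kw:.0f} kW, with cooling: {total_kw/1000:.2f} MW")

With the Physics racks added on top, an estimate of this kind sits comfortably within the stated 0.5 to 1 MW band.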

Page 23:


Centre of the Racks

Rows of racks arranged to form hot and cold aisles.

Flow simulation showing temperature of air.
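For context on why the airflow design matters, here is a back-of-envelope estimate of the cold-aisle airflow needed to carry away the heat of one rack. The rack power and temperature rise are assumed example values, not figures from the simulation.

    # Back-of-envelope airflow estimate for one rack (assumed example values).
    rack_power_w = 10_000.0     # assumed heat load per rack (W)
    delta_t = 12.0              # assumed air temperature rise across the rack (K)
    cp_air = 1005.0             # specific heat of air (J/(kg*K))
    rho_air = 1.2               # density of air (kg/m^3)

    mass_flow = rack_power_w / (cp_air * delta_t)   # kg/s of air needed
    volume_flow = mass_flow / rho_air               # m^3/s
    print(f"{mass_flow:.2f} kg/s  ~=  {volume_flow:.2f} m^3/s of cold air per rack")

Several racks at this density need far more cold air than an office-style unit can deliver, hence the hot/cold aisle layout and the CFD check.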

Page 24:


“Oxford Physics Level 1 Computer Room”

Last year you saw the space we could use.

There are now 5 racks of computers located here: Tier 2 Rack 2, Clarendon, Oxford grid development, and the ex-RAL CDF IBM 8-way.

Bad news: no air conditioning, and the room is also used as a store. Conversion of the room is taking time… but SRIF funding has been secured to build a joint room for Physics and the OSC.

Page 25:


Future

Intrusion detection system for increased network security.
Complete migration of desktops to XP SP2 and MS Office 2003.
Improve support for laptops – still more difficult to manage than desktops.
Once the migration of the central service to SL is complete, we will be developing a Linux desktop clone.
Investigating how best to integrate Mac OS X with the existing infrastructure.
Scale up ppslgen in line with demand. Money has been set aside for more worker nodes and a further 12+ TB of storage.
More worker nodes for the Tier-2 service.
Look to use university services where appropriate. Perhaps the new supercomputer?