Page 1

GridPP23 – Final Steps to Data

David Britton, 8/Sep/09

Page 2

Since GridPP22 in April…

• Validated the UK infrastructure with STEP09.

• Moved the Tier-1 to R89.

• Procured new hardware.

• Exercised our disaster management process (several times!)

… and before GridPP24 at RHUL we will have data.

Page 3

WLCG Growth

[Figure: WLCG growth, March 2009 and September 2009; >315,000 KSI2K]

Page 4

UK CPU Contribution

Same picture if non-LHC VOs included

Page 5

UK Site Contributions

Site          2007   2008   2009
NorthGrid      34%    22%    15%
London         28%    25%    32%
ScotGrid       18%    17%    22%
Tier-1         13%    15%    13%
SouthGrid       7%    16%    13%
GridIreland     0%     6%     5%

Page 6

UK Site Contributions: Non-LHC VOs

Page 7

CPU Efficiencies (CPU/Wall Time)
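As a minimal sketch of the ratio named in the slide title, CPU efficiency can be computed as CPU time divided by wall-clock time. The function names and the sample job figures below are illustrative assumptions, not values from the talk.

```python
def cpu_efficiency(cpu_seconds: float, wall_seconds: float) -> float:
    """Return CPU time divided by wall-clock time for a single job."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return cpu_seconds / wall_seconds


def aggregate_efficiency(jobs):
    """Efficiency over many jobs: total CPU time over total wall time.

    `jobs` is an iterable of (cpu_seconds, wall_seconds) pairs.
    """
    jobs = list(jobs)
    total_cpu = sum(cpu for cpu, _ in jobs)
    total_wall = sum(wall for _, wall in jobs)
    return cpu_efficiency(total_cpu, total_wall)


# Hypothetical jobs as (CPU seconds, wall seconds); not data from the slides.
sample_jobs = [(80.0, 100.0), (45.0, 50.0)]
site_efficiency = aggregate_efficiency(sample_jobs)
```

Summing CPU and wall time before dividing weights each job by its duration, so a long job stalled on I/O lowers the figure more than a short one would.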

Page 8

Storage

Gstat gives:

[Figure panels: September 2008, March 2009, September 2009]

… the last set could actually be sensible!

Page 9

Data Transfers

Page 10

OPN Resilience

Page 11

STEP09

Operations Report at WLCG MB; 16/Jun

The lack of “hero-mode” is a direct consequence of all the (heroic) effort that has been put in over the last year to make the UK Grid more resilient.

Page 12

More STEP09 Highlights

• I won’t preempt (too much) the upcoming talks…

– RAL was the best ATLAS Tier-1 after the BNL ATLAS-only Tier-1

– Glasgow ran more jobs than any of the 50-60 ATLAS Tier-2 sites throughout the world.

– Most Tier-2 sites made good contributions and many gained valuable insight into tuning issues during STEP09 and subsequent testing.

– “The responsiveness of RAL to CMS during STEP09 was in stark-contrast to many other Tier-1s.”

– CMS noted the tape performance at RAL was very good as was the CPU efficiency.

– Many (if not all) of the metrics for the experiments were met and, in some cases, significantly exceeded at RAL during STEP09.

Page 13

(GridPP22) Current Issues: R89

In the end, hand-over was delayed from Dec to Apr 09. Hardware was delayed but we were (almost) rescued by the LHC schedule change. Minor (?) issues remain with R89 (aircon trips; waterproof membrane?).

Page 14

Tier-1 Hardware

• The FY2008 hardware procurement had to await the acceptance of R89.

• The CPU is tested, accepted, and being deployed (14,000 HEPSPEC06)

• The disk procurement (2.2 PB) was split into two halves (different disks and controllers to mitigate acceptance risk). This has proved sensible, as one batch has demonstrated ejection issues.

• One half of the disk is being deployed; progress is being made on the other half, and the best guess is deployment by the end of November.

• A second SL8500 tape robot is available.

• The FY09 hardware procurement is underway.

Page 15

Disaster Management

• A four-stage disaster management process was established at the Tier-1 earlier this year as part of our focus on resilience and disaster management.

• Designed to be used regularly so that the process is familiar. This means a low threshold to trigger Stage-1 “disasters”.

• At Stage-3, the process formally involves stakeholders outside the Tier-1, including GridPP management. This has now happened several times, including:

– R89 aircon trip

– R89 water leak

– Disk procurement problem

– Swine flu planning

• The process is still being honed, but I believe it is very useful.

Page 16

Tier-2 Performance

Resource-weighted averages
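A resource-weighted average gives larger sites proportionally more influence on the overall figure. The sketch below assumes a per-site metric weighted by each site's capacity; the slide does not say which metric or which resource is used, and the function name and sample numbers are hypothetical.

```python
def resource_weighted_average(values, weights):
    """Return sum(value * weight) / sum(weight) across sites."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical example: two sites with different capacities.
site_metric = [0.9, 0.8]        # assumed per-site performance figure
site_capacity = [300.0, 100.0]  # assumed resource size used as the weight
weighted = resource_weighted_average(site_metric, site_capacity)
unweighted = sum(site_metric) / len(site_metric)
```

With these made-up numbers the weighted figure sits closer to the larger site's value than the plain mean does, which is the point of weighting by resource.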

Page 17

Tier-2 Resources

1/Apr/09 8/Sep/09

Page 18

EGI/NGI

[Diagram labels: EGI (coordinating body in Amsterdam); NGIs (national initiatives in member countries); UK-NGI; GridPP; NGS]

Involves STFC, EPSRC and JISC (at least) in the UK.

EGI is vital to GridPP but it is not GridPP’s core business to run an e-science infrastructure for the whole of the UK: seek a middle ground.

Page 19

Jigsaw Puzzle

[Diagram labels: EGI; EMI; SSC (three pieces); Heavy Users; (Roscoe); Unicore; ARC; gLite]

UK involvement with Ganga?

UK involvement via the UK NGI with global tasks such as GOCDB, security, dissemination, training...

UK involvement with APEL, GridSite? …

Page 20

Next Steps

• Oversight Committee meeting next week.

– Approval for OPN resilient link

– Confirmation of remaining GridPP3 spending profile

– Some guidance on GridPP4?

• The LHC start-up, round-2 (Roger Bailey’s talk next!)

• Moving towards a UK NGI in the context of EGI, SSCs, EMI, etc. (Monologue by John Cleese: “There will be a certain degree of uncertainty, of that we can be quite (long pause) … sure.”)

• Shaking down R89; settling down for a long run.

• Tier-2 hardware allocations.

• GridPP4

• … and data!

Page 21

Summary and the Future

LHC Data

Oh god!

• STEP09 validated the UK infrastructure for LHC data-taking and proved that we are in good shape.

• We are building on this with careful tuning and further improvements to resilience and management processes.

• Great care must be taken not to invalidate the validation (but we cannot sit still either).

Hand of god?

Thoroughly deserved team effort which did not require (much) divine intervention.
