Data and Database Challenges at CERN
Tony Cass, Leader, Database Services Group, Information Technology Department
29th May 2010



Page 4:

Outline
• Introduction to CERN, Experiments & Data
• Challenges
• Summary/Conclusion

Page 6:

CERN

Methodology

The fastest racetrack on the planet…

Trillions of protons will race around the 27km ring in opposite directions over 11,000 times a second, travelling at 99.999999991 per cent of the speed of light.
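The lap rate quoted above can be sanity-checked with a few lines of arithmetic. The sketch below assumes the rounded 27 km ring length from the slide and the SI value of the speed of light; the function and constant names are illustrative only.

```python
# Back-of-envelope check of the lap rate quoted on the slide.
SPEED_OF_LIGHT_M_PER_S = 299_792_458   # exact SI definition
RING_LENGTH_M = 27_000                 # "27km ring" as quoted (rounded)
FRACTION_OF_C = 0.99999999991          # 99.999999991 per cent


def laps_per_second(ring_length_m: float, fraction_of_c: float) -> float:
    """Number of complete laps per second for a particle at the given speed."""
    return fraction_of_c * SPEED_OF_LIGHT_M_PER_S / ring_length_m


laps = laps_per_second(RING_LENGTH_M, FRACTION_OF_C)
print(f"{laps:,.0f} laps per second")
```

With these inputs the result lands just above 11,000 laps per second, consistent with the figure on the slide.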

Page 7:

CERN

Methodology

The emptiest space in the solar system…

To accelerate protons to almost the speed of light requires a vacuum as empty as interplanetary space. There is 10 times more atmosphere on the moon than there will be in the LHC.

Page 8:

CERN

Methodology

One of the coldest places in the universe…

With an operating temperature of about -271 degrees Celsius, just 1.9 degrees above absolute zero, the LHC is colder than outer space.

Page 9:

CERN

Methodology

The hottest spots in the galaxy…

When two beams of protons collide, they will generate temperatures 1000 million times hotter than the heart of the sun, but in a minuscule space.

Page 10:

CERN

Methodology

The biggest, most sophisticated detectors ever built…

To sample and record the debris from up to 600 million proton collisions per second, scientists are building gargantuan devices that measure particles with micron precision.

Page 11:

CERN

Methodology

The most extensive computer system in the world…

To analyse the data, tens of thousands of computers around the world are being harnessed in the Grid. The laboratory that gave the world the web is now taking distributed computing a big step further.

Page 12:

CERN

Methodology

Why?

Page 13:

CERN

Methodology

To push back the frontiers of knowledge…

Newton’s unfinished business… what is mass?

Science’s little embarrassment… what is 96% of the Universe made of?

Nature’s favouritism… why is there no more antimatter?

The secrets of the Big Bang… what was matter like within the first second of the Universe’s life?

Page 16:

CERN

Methodology

To develop new technologies…

Information technology - the Web and the Grid

Medicine - diagnosis and therapy

Security - scanning technologies for harbours and airports

Vacuum - new techniques for flat screen displays or solar energy devices

Page 17:

CERN

Methodology

To unite people from different countries and cultures…

20 Member states

38 Countries with cooperation agreements

111 Nationalities

10000 People

Page 18:

CERN

Methodology

To train the scientists and engineers of tomorrow…

From mini-Einstein workshops for five to sixes, through to professional schools in physics, accelerator science and IT, CERN plays a valuable role in building enthusiasm for science and providing formal training.

Page 19:

“Compact” Detectors!

Page 22:

The Four LHC Experiments…

ATLAS
- General purpose
- Origin of mass
- Supersymmetry
- 2,000 scientists from 34 countries

CMS
- General purpose
- Origin of mass
- Supersymmetry
- 1,800 scientists from over 150 institutes

ALICE
- heavy ion collisions, to create quark-gluon plasmas
- 50,000 particles in each collision

LHCb
- to study the differences between matter and antimatter
- will detect over 100 million b and b-bar mesons each year

Page 23:

… generate lots of data …

The accelerator generates 40 million particle collisions (events) every second at the centre of each of the four experiments’ detectors

Page 24:

… generate lots of data …

reduced by online computers to a few hundred “good” events per second.

Which are recorded on disk and magnetic tape at 100-1,000 MegaBytes/sec: ~15 PetaBytes per year for all four experiments
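As a rough cross-check of the figures above, the yearly volume can be converted into an average rate and the event reduction into a factor. This is only a sketch: it assumes decimal units (1 PB = 10^15 bytes), continuous recording over a 365-day year for simplicity, and an illustrative value of 300 for "a few hundred" events per second.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600
BYTES_PER_MB = 10**6    # decimal megabyte (assumption)
BYTES_PER_PB = 10**15   # decimal petabyte (assumption)

# ~15 PB per year, spread evenly over the whole year.
yearly_volume_pb = 15
average_mb_per_s = yearly_volume_pb * BYTES_PER_PB / SECONDS_PER_YEAR / BYTES_PER_MB
print(f"{average_mb_per_s:.0f} MB/s averaged over a full year")

# 40 million collisions per second reduced to "a few hundred" recorded events.
collisions_per_s = 40_000_000
recorded_per_s = 300    # illustrative stand-in for "a few hundred"
reduction_factor = collisions_per_s / recorded_per_s
print(f"about 1 event kept in {reduction_factor:,.0f}")
```

The yearly average falls inside the quoted 100-1,000 MegaBytes/sec recording range, and the online selection keeps on the order of one event in a hundred thousand.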

Page 25:

which is distributed worldwide

Tier-0 (CERN):
• Data recording
• Initial data reconstruction
• Data distribution

Tier-1 (11 centres):
• Permanent storage
• Re-processing
• Analysis

Tier-2 (~130 centres):
• Simulation
• End-user analysis

Page 27:

Outline
• Introduction to CERN, Experiments & Data
• Challenges
  – Data Storage & Distribution
• Summary/Conclusion

Page 29:

Dataflows and rates

[Figure: dataflow diagram. Rate labels: 1000MB/s, 420MB/s, 1100MB/s, 1520MB/s, (2000MB/s), (2500MB/s), 1600MB/s]

Scheduled work only!

Averages! Need to be able to support 2x for recovery!

Remember this figure
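The "2x for recovery" rule can be written down as a one-line sizing calculation. The sketch below takes the unparenthesised rate labels from the figure as averages; which flow each label belongs to is not recoverable from the transcript, so they are kept as a plain list, and the function name is illustrative.

```python
# Average rates quoted on the slide, in MB/s (flow names are not recoverable).
AVERAGE_RATES_MB_S = [1000, 420, 1100, 1520, 1600]
RECOVERY_FACTOR = 2  # "Need to be able to support 2x for recovery!"


def required_capacity(average_mb_s, factor=RECOVERY_FACTOR):
    """Rate a flow must sustain so a backlog can be cleared after an outage."""
    return average_mb_s * factor


for rate in AVERAGE_RATES_MB_S:
    print(f"average {rate:>4} MB/s -> provision for {required_capacity(rate):>4} MB/s")
```

The point of the factor is that a system sized only for the average can never catch up once it has fallen behind.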

Page 30:

Castor

[Figure: Castor architecture diagram. Legible component labels: Client, RH, Scheduler, Stager, JobSvc, QrySvc, ErrorSvc, StagerJob, Mover, MigHunter, GC, NameServer, Disk Servers, RTCPClientD, Tape Daemon, RTCPD, VDQM, VMGR, Tape Servers]

Page 31:

Dataflows and rates

[Figure: the dataflow diagram from Page 29, shown again with the same rate labels]

Page 32:

The real challenge? Users!

Page 33:

The real challenge? Users!
• Such user behaviour can affect Oracle execution plans…
  – … which impacts our services
• We look forward to improved control over execution plans in Oracle 11g!

Page 34:

Outline
• Introduction to CERN, Experiments & Data
• Challenges
  – Metadata distribution
• Summary/Conclusion

Page 36:

Metadata Distribution
• To make sense of the raw data generated by the detectors, physicists need data about any conditions at the time it was taken that affect the detector calibration.
• This conditions data is stored in a relational database and needs distributing to Tier1 centres to enable future reprocessing of the raw data.
• Oracle Streams enables this distribution…

Page 38:

Oracle Streams Replication
• Technology for sharing information between databases
• Database changes captured from the redo-log and propagated asynchronously as Logical Change Records (LCRs)

[Figure: Source Database (Redo Logs, Capture), Propagate, Target Database (Apply)]
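The capture, propagate and apply stages can be illustrated with a small toy model. This is not Oracle's API: the class and function names below are invented for the sketch, the "redo log" is a plain list of tuples, and the target database is a dictionary. It only shows the shape of the idea, namely that changes are turned into records, queued, and replayed on the target later.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LogicalChangeRecord:
    """Toy stand-in for an LCR: one row-level change read from the redo log."""
    operation: str               # "INSERT", "UPDATE" or "DELETE"
    key: str
    value: Optional[str] = None  # DELETE carries no new value


def capture(redo_log):
    """Capture step: turn raw redo-log entries into LCRs."""
    return [LogicalChangeRecord(*entry) for entry in redo_log]


def propagate(lcrs, queue):
    """Propagate step: hand LCRs to the target's queue without waiting for apply."""
    queue.extend(lcrs)


def apply_changes(queue, target):
    """Apply step: replay queued LCRs on the target, in order."""
    while queue:
        lcr = queue.popleft()
        if lcr.operation == "DELETE":
            target.pop(lcr.key, None)
        else:  # INSERT and UPDATE both set the current value
            target[lcr.key] = lcr.value


# Hypothetical conditions-data changes on the source database.
redo_log = [
    ("INSERT", "run-1", "calib-a"),
    ("UPDATE", "run-1", "calib-b"),
    ("INSERT", "run-2", "calib-c"),
    ("DELETE", "run-2"),
]

queue = deque()
target = {}

propagate(capture(redo_log), queue)
pending_before_apply = len(queue)   # changes are queued; target not yet updated
target_before_apply = dict(target)

apply_changes(queue, target)
print(pending_before_apply, target_before_apply, target)
```

Because propagation only enqueues records, the source never waits for the target: the target is empty until the apply step runs, which is the asynchronous behaviour described on the slide.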

Page 39:

Streams setup for ATLAS

Page 40:

Metadata Distribution
• To make sense of the raw data generated by the detectors, physicists need data about any conditions at the time it was taken that affect the detector calibration.
• This conditions data is stored in a relational database and needs distributing to Tier1 centres to enable future reprocessing of the raw data.
• Oracle Streams enables this distribution…
• We will be evaluating GoldenGate in the near future as this offers greater flexibility.

Page 41:

Outline
• Introduction to CERN, Experiments & Data
• Challenges
  – The users again…
• Summary/Conclusion

Page 43:

Databases for Accelerators
• The accelerator operations team make extensive use of databases for logging many operational parameters.
• Our largest and most rapidly growing database
• Small beer in comparison to Walmart, say, …
• … but our users insist on being able to perform random queries on the full historical data at any time

[Figure: Daily additional data from March 2008 to 25 April 2010; vertical axis in GB, 0 to 140]

Page 44:

Backups: A digression
• Backing up databases is a chore, not a challenge...
• …the challenge is to recover a database…
  – … and this had better be rare, not a chore!
• To help reduce the challenge of, and increase confidence in, recovery procedures, we have developed an automated recovery test tool.

Page 46:

Backups: A digression
• Backing up databases is a chore, not a challenge...
• …the challenge is to recover a database…
  – … and this had better be rare, not a chore!
• To help reduce the challenge of, and increase confidence in, recovery procedures, we have developed an automated recovery test tool.
• Problem: days to recover some databases
  – At least 2.5 days for partial restore of the critical database for accelerator operations

…use Data Guard to create standby databases.

Page 47:

[Figure: diagram with labels Primary RAC database, Physical Standby RAC database, RMAN, WAN/Intranet]

We are extremely keen to migrate to 11g and exploit Active Data Guard!

Page 48:

Outline
• Introduction to CERN, Experiments & Data
• Challenges
  – Virtualisation & Fabric Management
• Summary/Conclusion

Page 50:

Size isn’t all…
• As well as being vital for accelerator and experiment operations, Oracle databases also underpin CERN’s administrative applications.
• Individual databases are small, but we have many servers
• Average load is very low, so clear opportunity for server consolidation with virtualisation.
• Excellent results from tests with Oracle VM and WebLogic Server Virtual Edition.

Page 51:

Physical vs Virtual

- Comparison of a Physical Machine (4GB memory, 8 cores) running WebLogic Server versus a Virtual Machine (4GB memory, 8 Virtual CPUs) running WebLogic Server Virtual Edition
- An unreleased version of the kernel of JRockit has been used for these tests.

Page 52:

Fabric Management
• Like most large organisations, CERN has standards and procedures for network and system configuration.
• “Easy to install” applications can be good, but they should be configurable to work correctly in a tightly managed environment.
• Good collaboration around Oracle VM
  – e.g. IP/MAC address binding, assumptions about NFS as shared file system.

Page 53:

Outline
• Introduction to CERN, Experiments & Data
• Challenges
  – Monitoring
• Summary/Conclusion

Page 55:

Up and seen to be up
• Understanding the current state of our many database servers and applications is a major challenge.
• Collaboration around Oracle Enterprise Manager has been a really fruitful aspect of the CERN openlab partnership with Oracle.
• Many tools developed at CERN for monitoring Streams have led to features in OEM 10.2 and 11.1.
• As in the fabric management area, integration with other tools is desirable. A future challenge?

Page 59:

Outline
• Introduction to CERN, Experiments & Data
• Challenges
• Summary/Conclusion

Page 63:

Thank You!

Thanks also to Eric Grancher, Eva Dafonte Perez, Luca Canali, Sebastien Ponce, Nilo Segura, Carlos Garcia Fernandez, Dawid Wojcik, Anton Topurov, Artur Wiecek