PhEDEx: a novel approach to robust Grid data management

Tim Barrass, Dave Newbold and Lassi Tuura

All Hands Meeting, Nottingham, UK, 22 September 2005

Tim Barrass, Bristol, [email protected]

What is PhEDEx?

• A data distribution management system
  - Used by the Compact Muon Solenoid (CMS) High Energy Physics (HEP) experiment at CERN, Geneva
• Blends traditional HEP data distribution practice with more recent technologies
  - Grid and peer-to-peer filesharing
• Scalable infrastructure for managing dataset replication
  - Automates low-level activity
  - Allows managers to work with high-level dataset concepts rather than low-level file operations
• Technology agnostic
  - Overlies Grid components
  - Currently couples LCG, OSG, NorduGrid, standalone sites

The HEP environment

• HEP collaborations are quite large
  - Order of 1000 collaborators, globally distributed
  - CMS is only one of four Large Hadron Collider (LHC) experiments being built at CERN
• Typically resources are globally distributed
  - Resources organised in tiers of decreasing capacity
      Tier 0: the detector facility
      Tier 1: large regional centres
      Tier 2+: smaller sites (universities, groups, individuals…)
  - Raw data partitioned between sites, highly processed ready-for-analysis data available everywhere
• LHC computing demands are large
  - Order 10 petabytes per year created for CMS alone
  - Similar order simulated
  - Also analysis and user data

CMS distribution use cases

• Two principal use cases: push and pull of data
  - Raw data is pushed onto the regional centres
  - Simulated and analysis data is pulled to a subscribing site
  - Actual transfers are third-party: the handshake between active components is what matters, not push or pull
• Maintain end-to-end multi-hop transfer state
  - Can only clean online buffers at the detector when data is safe at Tier 1
• Policy must be used to resolve these two use cases
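The buffer-cleaning rule above can be illustrated with a minimal sketch. The class, method and node names below are hypothetical and are not taken from PhEDEx; the sketch only assumes that each file has a route of nodes and that a node is marked safe once its copy has been verified.

```python
from dataclasses import dataclass, field


@dataclass
class FileTransferState:
    """Tracks which nodes on a multi-hop route hold a verified copy of one file.

    Hypothetical structure for illustration; not the PhEDEx schema.
    """
    name: str
    route: list[str]                       # e.g. ["T0_buffer", "T1_site", "T2_site"]
    safe_at: set[str] = field(default_factory=set)

    def mark_safe(self, node: str) -> None:
        """Record that `node` now holds a verified copy."""
        if node not in self.route:
            raise ValueError(f"{node} is not on the route for {self.name}")
        self.safe_at.add(node)

    def can_clean_source(self, custodial_node: str) -> bool:
        """The source buffer may be cleaned only once the custodial copy is confirmed."""
        return custodial_node in self.safe_at


state = FileTransferState("raw_0001.dat", ["T0_buffer", "T1_site", "T2_site"])
state.mark_safe("T0_buffer")
before = state.can_clean_source("T1_site")   # copy not yet confirmed at Tier 1
state.mark_safe("T1_site")
after = state.can_clean_source("T1_site")    # Tier 1 copy confirmed
```

Keeping the state per file and per route, not per hop, is what lets the detector-side buffer ask one question ("is it safe at Tier 1?") regardless of how many hops the transfer took.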

PhEDEx design

• Assume every operation is going to fail!
• Keep complex functionality in discrete agents
  - Handover between agents minimal
  - Agents are persistent, autonomous, stateless, distributed
  - System state maintained using a modified blackboard
• Layered abstractions make system robust
• Keep local information local where possible
  - Enable site administrators to maintain local infrastructure
  - Robust in face of most local changes
      Deletion and accidental loss require attention
• Draws inspiration from agent systems, “autonomic” and peer-to-peer computing
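As a rough illustration of the design points above, the sketch below shows stateless agents that keep all task state on a shared blackboard and treat failure as the normal case. The `Blackboard` class, the status strings and `flaky_transfer` are assumptions made for this example, not PhEDEx components.

```python
class Blackboard:
    """Shared state store; a stand-in for wherever the real system keeps its state."""

    def __init__(self):
        self.tasks = {}  # file name -> status string

    def post(self, name, status):
        self.tasks[name] = status

    def with_status(self, status):
        return sorted(n for n, s in self.tasks.items() if s == status)


def run_agent(board, takes, gives, operation):
    """One pass of a stateless agent.

    Picks up tasks in state `takes`, tries `operation`, and advances a task to
    `gives` only on success. A failed task stays where it was on the blackboard,
    so the next pass simply retries it.
    """
    for name in board.with_status(takes):
        try:
            operation(name)
        except Exception:
            continue  # failure is expected; nothing to undo, state is unchanged
        board.post(name, gives)


attempts = {}


def flaky_transfer(name):
    """Hypothetical transfer that fails the first time it is tried for each file."""
    attempts[name] = attempts.get(name, 0) + 1
    if attempts[name] == 1:
        raise IOError("simulated transfer failure")


board = Blackboard()
board.post("file_a", "requested")
board.post("file_b", "requested")

run_agent(board, "requested", "transferred", flaky_transfer)  # every attempt fails
first_pass = board.with_status("transferred")
run_agent(board, "requested", "transferred", flaky_transfer)  # retries succeed
second_pass = board.with_status("transferred")
```

Because the agent holds no memory between passes, it can be stopped, restarted or replaced without losing track of work; the blackboard alone says what remains to be done.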

Transfer workflow overview

Production performance

Service challenge performance

Future directions

• Contractual file routing
  - Cost-based offers for a given transfer
• Peer-to-peer data location
  - Using Kademlia to partition replica location information
• Semi-autonomy
  - Agents governed by many small tuning parameters
  - Self-modify, or use more intelligent protocols?
• Advanced policies for priority conflict resolution
  - Need to ensure that raw data is always flowing
  - Difficult real-time scheduling problem
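For the peer-to-peer data location item, a minimal sketch of the Kademlia idea may help: node identifiers and keys share one 160-bit space, and the distance between two identifiers is their bitwise XOR. The site labels, the file key and the choice of k below are hypothetical; how PhEDEx would actually use Kademlia is not described on the slide.

```python
import hashlib


def node_id(label: str) -> int:
    """Derive a 160-bit identifier from a label using SHA-1."""
    return int.from_bytes(hashlib.sha1(label.encode("utf-8")).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia's distance metric: the bitwise XOR of two identifiers."""
    return a ^ b


def closest_nodes(key: str, nodes: list[str], k: int = 2) -> list[str]:
    """Return the k nodes whose identifiers are closest to the key's identifier."""
    target = node_id(key)
    return sorted(nodes, key=lambda n: xor_distance(node_id(n), target))[:k]


# Hypothetical site labels and file key, for illustration only.
sites = ["site_a", "site_b", "site_c", "site_d"]
holders = closest_nodes("dataset/file_0001", sites, k=2)
```

Each key maps to a small, deterministic set of nodes, so replica location records are spread across sites instead of living in one central catalogue.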

• PhEDEx enables dataset-level replication for the CMS HEP experiment
  - Currently manages 200+ TB of data, globally distributed
  - Real-life performance of 1 TB per day sustained per site
  - Challenge performance of over 10 TB per day
• Not CMS-specific, or indeed HEP-specific
• Well-placed to meet future challenges
  - Ramping up to get to O(10) PB per year
      10-100 TB per day
  - Data starts flowing for real in the next two years
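A quick check of the target figures, as a sketch that assumes decimal units (1 PB = 1000 TB) and a perfectly even flow over a 365-day year:

```python
TB_PER_PB = 1000            # decimal units assumed
DAYS_PER_YEAR = 365
SECONDS_PER_DAY = 86_400
MB_PER_TB = 1_000_000


def tb_per_day(pb_per_year: float) -> float:
    """Average daily volume in TB for a given yearly volume in PB."""
    return pb_per_year * TB_PER_PB / DAYS_PER_YEAR


def mb_per_second(tb_day: float) -> float:
    """Average sustained rate in MB/s for a given daily volume in TB."""
    return tb_day * MB_PER_TB / SECONDS_PER_DAY


daily = tb_per_day(10)        # about 27.4 TB per day
rate = mb_per_second(daily)   # about 317 MB/s, averaged over the whole day
```

The 10 PB per year average therefore falls inside the quoted 10-100 TB per day range, and is roughly 27 times the 1 TB per day currently sustained per site.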

Extra information

• PhEDEx and CMS
  - [email protected] : feel free to subscribe!
  - CMS Computing model
• Agent frameworks
• Peer-to-peer
  - Kademlia
  - Kenosis
• Autonomic computing
• General agents and blackboards
  - Where should complexity go?
  - Agents and blackboards

Issues

• Most issues fabric-related
  - Most low-level components experimental or not production-hardened
• Tools typically unreliable under load
• MSS access a serious handicap
  - PhEDEx plays very fair, keeping within request limits and ordering requests by tape when possible
• Main problem is keeping in touch with the O(3) people at each site involved in deploying fabric, administration, etc.
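The "plays fair" behaviour towards mass storage can be sketched as follows. This is an illustration under stated assumptions, not the PhEDEx staging code: the tape labels, the request tuples and the `max_requests` limit are all hypothetical.

```python
from itertools import groupby


def batches_by_tape(requests, max_requests):
    """Order stage requests by tape and cap how many are issued at once.

    `requests` is a list of (file name, tape label) pairs. Sorting by tape keeps
    files on the same tape together; slicing enforces the request limit.
    """
    limited = sorted(requests, key=lambda r: (r[1], r[0]))[:max_requests]
    return [
        (tape, [name for name, _ in group])
        for tape, group in groupby(limited, key=lambda r: r[1])
    ]


# Hypothetical pending requests: (file name, tape label).
pending = [
    ("f3", "TAPE02"),
    ("f1", "TAPE01"),
    ("f4", "TAPE02"),
    ("f2", "TAPE01"),
    ("f5", "TAPE03"),
]
batches = batches_by_tape(pending, max_requests=4)
```

With a limit of four, the two files on each of the first two tapes are requested together and the request for the third tape waits for a later pass.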

Deployment

• 8 regional centres, 16 smaller sites
• 110 TB, replicated ~twice
• 1 TB per day sustained
  - On standard Internet

Testing and scalability

PhEDEx architecture