U.S. Physics Data Grid Projects

Paul Avery, University of Florida
http://www.phys.ufl.edu/~avery/
[email protected]

International Workshop on HEP Data Grids
Kyungpook National University, Daegu, Korea
Nov. 8-9, 2002
“Trillium”: US Physics Data Grid Projects
- Particle Physics Data Grid (PPDG): Data Grid for HENP experiments (ATLAS, CMS, D0, BaBar, STAR, JLab)
- GriPhyN: Petascale Virtual-Data Grids (ATLAS, CMS, LIGO, SDSS)
- iVDGL: Global Grid laboratory (ATLAS, CMS, LIGO, SDSS, NVO)

Shared characteristics:
- Data-intensive experiments
- Collaborations of physicists & computer scientists
- Infrastructure development & deployment
- Globus + VDT based
Why Trillium? Many common aspects

- Large overlap in project leadership
- Large overlap in participants
- Large overlap in experiments, particularly LHC
- Common projects (monitoring, etc.)
- Common packaging
- Common use of VDT and other GriPhyN software

Funding agencies like collaboration
- Good working relationship on grids between NSF and DOE
- Good complementarity: DOE (labs), NSF (universities)
- Collaboration of computer science/physics/astronomy encouraged

Organization from the “bottom up”, with encouragement from funding agencies
Driven by LHC Computing Challenges

- Complexity: millions of detector channels, complex events
- Scale: PetaOps (CPU), Petabytes (data)
- Distribution: global distribution of people & resources

1800 physicists, 150 institutes, 32 countries
Global LHC Data Grid

(Diagram, for one experiment such as CMS: the online system feeds the Tier 0 CERN Computer Center (>20 TIPS) at 100-200 MBytes/s; Tier 1 national centers (USA, Korea, Russia, UK) connect to CERN at 2.5 Gbits/s; Tier 2 centers connect at ~0.6-2.5 Gbits/s; Tier 3 institutes connect at 0.1-1 Gbits/s; Tier 4 is physics caches, PCs and other portals. Resources scale roughly as Tier0 : (sum of Tier1) : (sum of Tier2) ~ 1:1:1.)
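For a feel of the link speeds quoted in the diagram, here is a rough transfer-time calculation. This is illustrative only; the helper function, dataset sizes, and the 50% sustained-efficiency assumption are not from the slides.

```python
# Illustrative only: rough transfer-time arithmetic for the link speeds quoted
# in the tier diagram above. Not part of any Grid toolkit.

def transfer_days(dataset_tb: float, link_gbps: float, efficiency: float = 0.5) -> float:
    """Days needed to move `dataset_tb` terabytes over a `link_gbps` Gbit/s link,
    assuming a sustained utilization given by `efficiency` (assumed 50% here)."""
    bits = dataset_tb * 1e12 * 8                     # dataset size in bits
    seconds = bits / (link_gbps * 1e9 * efficiency)  # seconds at sustained rate
    return seconds / 86400

if __name__ == "__main__":
    # Example: shipping 100 TB from Tier 0 to a Tier 1 over a 2.5 Gbit/s link
    # at 50% sustained efficiency takes roughly a week.
    print(f"Tier0 -> Tier1 (2.5 Gb/s): {transfer_days(100, 2.5):.1f} days")
    # The same 100 TB over a ~0.6 Gbit/s link takes roughly a month.
    print(f"Tier1 -> Tier2 (0.6 Gb/s): {transfer_days(100, 0.6):.1f} days")
```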
LHC Tier2 Center (2001)
(Diagram: WAN link, router, FEth/GEth switch, data server and RAID in a “flat” switching topology.)
- 20-60 nodes, dual 0.8-1 GHz Pentium III
- 1 TByte RAID
LHC Tier2 Center (2002-2003)
(Diagram: WAN link, router, GEth switch feeding GEth/FEth switches, data server and RAID in a “hierarchical” switching topology.)
- 40-100 nodes, dual 2.5 GHz Pentium 4
- 2-4 TBytes RAID
LHC Hardware Cost Estimates

Buy late, but not too late: phased implementation (see the cost sketch below)
- R&D phase 2001-2004; implementation phase 2004-2007
- R&D to develop capabilities and the computing model itself
- Prototyping at increasing scales of capability & complexity

(Chart: cost estimates; time constants of 1.1, 1.2, 1.4 and 2.1 years shown.)
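The “buy late” argument rests on hardware price/performance improving steadily over time. A minimal sketch of the arithmetic, assuming (purely for illustration) that the 1.1-2.1 year figures in the chart behave like price/performance doubling times:

```python
# Minimal sketch of the "buy late" arithmetic. Assumption (illustrative only):
# hardware price/performance doubles every `doubling_years` years.

def relative_cost(capacity_units: float, years_delayed: float, doubling_years: float = 1.2) -> float:
    """Cost of buying `capacity_units` of capacity after waiting `years_delayed`
    years, relative to buying the same capacity today (today's cost = capacity_units)."""
    return capacity_units / 2 ** (years_delayed / doubling_years)

if __name__ == "__main__":
    # Deferring a purchase by 3 years (e.g., 2004 -> 2007) with a 1.2-year
    # doubling time cuts the cost of the same capacity by a factor of ~5.7.
    print(relative_cost(1.0, 0.0))   # 1.0
    print(relative_cost(1.0, 3.0))   # ~0.18
```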
Particle Physics Data Grid
“In coordination with complementary projects in the US and Europe, PPDG aims to meet the urgent needs for advanced Grid-enabled technology and to strengthen the collaborative foundations of experimental particle and nuclear physics.”
PPDG Goals

Serve high energy & nuclear physics (HENP) experiments
- Funded 2001-2004 @ US$9.5M (DOE)

Develop advanced Grid technologies
- Use Globus to develop higher-level tools
- Focus on end-to-end integration

Maintain practical orientation
- Networks, instrumentation, monitoring
- DB file/object replication, caching, catalogs, end-to-end movement

Serve urgent needs of experiments
- Unique challenges, diverse test environments
- But make tools general enough for the wider community!
- Collaboration with GriPhyN, iVDGL, EDG, LCG
- Recent work on ESnet Certificate Authority
PPDG Participants and Work Program
Physicist + CS involvement
- Experiments: D0, BaBar, STAR, CMS, ATLAS
- Institutions: SLAC, LBNL, JLab, FNAL, BNL, Caltech, Wisconsin, Chicago, USC

Computer Science Program of Work
- CS1: Job description language
- CS2: Schedule, manage data processing and data placement activities
- CS3: Monitoring and status reporting (with GriPhyN)
- CS4: Storage resource management
- CS5: Reliable replication services
- CS6: File transfer services (a minimal reliability sketch follows this list)
- CS7: Collect/document experiment practices, generalize…
- CS11: Grid-enabled data analysis
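To make items like CS5/CS6 concrete, here is a hypothetical sketch of what “reliable” file transfer means in practice: retry a transfer until the destination checksum matches the source. The copy_file callable stands in for whatever transport a real service would use (e.g., GridFTP); none of this is PPDG code.

```python
# Hypothetical sketch of a "reliable file transfer" loop (cf. PPDG items CS5/CS6):
# retry until a checksum of the copy matches the source, or give up.
# `copy_file` is a stand-in for the real transport layer; this is not PPDG code.

import hashlib
from pathlib import Path
from typing import Callable

def _md5(path: Path) -> str:
    return hashlib.md5(path.read_bytes()).hexdigest()

def reliable_copy(src: Path, dst: Path,
                  copy_file: Callable[[Path, Path], None],
                  max_attempts: int = 3) -> bool:
    """Copy src to dst with copy_file, verifying by checksum; retry on mismatch."""
    want = _md5(src)
    for attempt in range(1, max_attempts + 1):
        copy_file(src, dst)                       # may fail or produce a bad copy
        if dst.exists() and _md5(dst) == want:
            return True
        print(f"attempt {attempt}: checksum mismatch, retrying")
    return False
```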
GriPhyN = App. Science + CS + Grids
Participants
- US-CMS (high energy physics)
- US-ATLAS (high energy physics)
- LIGO/LSC (gravitational wave research)
- SDSS (Sloan Digital Sky Survey)
- Strong partnership with computer scientists

Design and implement production-scale grids
- Develop common infrastructure, tools and services (Globus based)
- Integration into the 4 experiments
- Broad application to other sciences via the “Virtual Data Toolkit”
- Strong outreach program

Funded by NSF for 2000-2005
- R&D for grid architecture (funded at $11.9M + $1.6M)
- Integrate Grid infrastructure into experiments through VDT
GriPhyN: PetaScale Virtual-Data Grids
(Architecture diagram: interactive user tools serving production teams, individual investigators and workgroups; virtual data tools, request planning & scheduling tools, and request execution & management tools layered over resource management, security and policy, and other Grid services; transforms applied to raw data sources; distributed resources (code, storage, CPUs, networks) at the scale of ~1 Petaflop and ~100 Petabytes.)
GriPhyN Research Agenda

Based on Virtual Data technologies (fig.)
- Derived data, calculable via algorithm
- Instantiated 0, 1, or many times (e.g., caches)
- “Fetch value” vs “execute algorithm”
- Very complex (versions, consistency, cost calculation, etc.)

LIGO example
- “Get gravitational strain for 2 minutes around each of 200 gamma-ray bursts over the last year”

For each requested data value, we need to (see the sketch below):
- Locate the item and its algorithm
- Determine the costs of fetching vs calculating
- Plan the data movements & computations required to obtain results
- Schedule the plan
- Execute the plan
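A minimal sketch of the “fetch value vs execute algorithm” decision listed above. The catalog entries and cost numbers are hypothetical; a real request planner (e.g., in GriPhyN) would also handle versions, consistency and site policy.

```python
# Minimal sketch of the "fetch value vs execute algorithm" decision for one
# requested data item. Costs and catalog entries are hypothetical.

from dataclasses import dataclass

@dataclass
class CatalogEntry:
    replicas: list[str]          # sites already holding an instantiated copy
    transfer_cost: float         # estimated cost to fetch an existing copy
    compute_cost: float          # estimated cost to (re)derive it from inputs

def plan_request(item: str, catalog: dict[str, CatalogEntry]) -> str:
    """Decide, per item, whether to fetch an existing copy or rerun the algorithm."""
    entry = catalog[item]
    if entry.replicas and entry.transfer_cost <= entry.compute_cost:
        return f"fetch {item} from {entry.replicas[0]}"
    return f"derive {item} by executing its algorithm"

if __name__ == "__main__":
    catalog = {
        "strain_grb_017": CatalogEntry(["tier2.example.org"], transfer_cost=5.0, compute_cost=120.0),
        "strain_grb_042": CatalogEntry([], transfer_cost=0.0, compute_cost=120.0),
    }
    for item in catalog:
        print(plan_request(item, catalog))
```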
Virtual Data Concept
A data request may: compute locally, compute remotely, access local data, or access remote data.
Scheduling is based on: local policies, global policies, and cost.
(Diagram: an item may be fetched from major facilities and archives, regional facilities and caches, or local facilities and caches.)
iVDGL: A Global Grid Laboratory
International Virtual-Data Grid Laboratory
- A global Grid laboratory (US, EU, Asia, South America, …)
- A place to conduct Data Grid tests “at scale”
- A mechanism to create common Grid infrastructure
- A laboratory for other disciplines to perform Data Grid tests
- A focus of outreach efforts to small institutions

U.S. part funded by NSF (2001-2006)
- $14.1M (NSF) + $2M (matching)
- International partners bring their own funds

“We propose to create, operate and evaluate, over a sustained period of time, an international research laboratory for data-intensive science.” (From the NSF proposal, 2001)
iVDGL Participants

Initial experiments (funded by NSF proposal)
- CMS, ATLAS, LIGO, SDSS, NVO

Possible other experiments and disciplines
- HENP: BTeV, D0, CMS HI, ALICE, …
- Non-HEP: biology, …

Complementary EU project: DataTAG
- DataTAG and the US pay for a 2.5 Gb/s transatlantic network

Additional support from the UK e-Science programme
- Up to 6 Fellows per year (none hired yet)
iVDGL Components

Computing resources
- Tier1 laboratory sites (funded elsewhere)
- Tier2 university sites: software integration
- Tier3 university sites: outreach effort

Networks
- USA (Internet2, ESnet), Europe (Géant, …)
- Transatlantic (DataTAG), transpacific, AMPATH, …

Grid Operations Center (GOC)
- Indiana (2 people)
- Joint work with TeraGrid on GOC development

Computer Science support teams
- Support, test, upgrade GriPhyN Virtual Data Toolkit

Coordination, management
iVDGL Management and Coordination
(Organization chart listing: US Project Directors; US Project Steering Group; US External Advisory Committee; Project Coordination Group; GLUE Interoperability Team; work teams (Facilities, Core Software, Operations, Applications, Outreach) forming the U.S. piece; collaborating Grid projects (TeraGrid, EDG, DataTAG, LCG?, Asia) and other experiments/disciplines (BTeV, D0, PDC, CMS HI, ALICE, Bio, Geo, …) forming the international piece.)
iVDGL Work Teams

Facilities Team
- Hardware (Tier1, Tier2, Tier3)

Core Software Team
- Grid middleware, toolkits

Laboratory Operations Team
- Coordination, software support, performance monitoring

Applications Team
- High energy physics, gravity waves, virtual astronomy
- Nuclear physics, bioinformatics, …

Education and Outreach Team
- Web tools, curriculum development, involvement of students
- Integrated with GriPhyN, connections to other projects
- Want to develop further international connections
US-iVDGL Data Grid (Sep. 2001)
(Map of Tier1, Tier2 and Tier3 sites: Fermilab, BNL, Argonne, Caltech, UCSD/SDSC, UF, Wisconsin, Indiana, Boston U, SKC, Johns Hopkins, Hampton, PSU, Brownsville.)
US-iVDGL Data Grid (Dec. 2002)
(Map of Tier1, Tier2 and Tier3 sites: Fermilab, BNL, Argonne, Caltech, UCSD/SDSC, UF, Wisconsin, Indiana, Boston U, SKC, Johns Hopkins, Hampton, PSU, Brownsville, plus new sites FIU, FSU, Arlington, Michigan, LBL, Oklahoma, Vanderbilt and NCSA.)
Possible iVDGL Participant: TeraGrid
(Diagram: TeraGrid sites NCSA/PACI (8 TF, 240 TB), SDSC (4.1 TF, 225 TB), Caltech and Argonne, each with site resources (HPSS/UniTree archival storage) and external network connections, linked by a 40 Gb/s backbone; 13 TeraFlops in total.)
International Participation

Existing partners
- European Data Grid (EDG)
- DataTAG

Potential partners
- Korea (T1), China (T1?), Japan (T1?), Brazil (T1), Russia (T1)
- Chile (T2), Pakistan (T2), Romania (?)
Current Trillium Work

Packaging technologies: PACMAN
- Used for VDT releases; very successful & powerful
- Evaluated for Globus, EDG

GriPhyN Virtual Data Toolkit 1.1.3 released
- Vastly simplifies installation of grid tools
- Upcoming changes will further reduce configuration complexity

Monitoring (joint efforts)
- Globus MDS 2.2 (GLUE schema)
- Caltech MonALISA
- Condor HawkEye
- Florida Gossip (low-level component)

Chimera Virtual Data System (more later)
Testbeds, demo projects (more later)
Virtual Data: Derivation and Provenance
Most scientific data are not simple “measurements”
- They are computationally corrected/reconstructed
- They can be produced by numerical simulation
- Science & engineering projects are increasingly CPU and data intensive

Programs are significant community resources (transformations)
- So are the executions of those programs (derivations)

Management of dataset transformations is important!
- Derivation: instantiation of a potential data product
- Provenance: exact history of an existing data product

Programs are valuable, like data. They should be community resources.
Motivations (1)

(Diagram: Transformation, Derivation and Data objects linked by “product-of”, “execution-of” and “consumed-by/generated-by” relations.)

- “I’ve detected a mirror calibration error and want to know which derived data products need to be recomputed.” (see the sketch below)
- “I’ve found some interesting data, but I need to know exactly what corrections were applied before I can trust it.”
- “I want to search a database for dwarf galaxies. If a program that performs this analysis exists, I won’t have to write one from scratch.”
- “I want to apply a shape analysis to 10M galaxies. If the results already exist, I’ll save weeks of computation.”
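A minimal sketch of how a recorded derivation graph answers the first question above. The data structures are hypothetical and not the Chimera object model; the idea is simply to walk the generated-by/consumed-by links downstream from the bad input.

```python
# Minimal sketch of using recorded derivations to answer "which derived data
# products need to be recomputed?" after a bad input (e.g., a mirror
# calibration) is found. Structures are hypothetical, not the Chimera model.

from collections import deque

# derivation graph: data product -> products directly derived from it
derived_from = {
    "mirror_calibration_v3": ["flatfielded_images"],
    "flatfielded_images": ["source_catalog"],
    "source_catalog": ["cluster_catalog", "dwarf_galaxy_candidates"],
}

def products_to_recompute(bad_product: str, graph: dict[str, list[str]]) -> list[str]:
    """Breadth-first walk of the derivation links downstream of bad_product."""
    stale, queue, seen = [], deque(graph.get(bad_product, [])), set()
    while queue:
        product = queue.popleft()
        if product in seen:
            continue
        seen.add(product)
        stale.append(product)
        queue.extend(graph.get(product, []))
    return stale

if __name__ == "__main__":
    print(products_to_recompute("mirror_calibration_v3", derived_from))
    # ['flatfielded_images', 'source_catalog', 'cluster_catalog', 'dwarf_galaxy_candidates']
```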
Motivations (2)
Data track-ability and result audit-ability
- Universally sought by GriPhyN applications

Facilitates tool and data sharing and collaboration
- Data can be sent along with its recipe

Repair and correction of data
- Rebuild data products (cf. “make”)

Workflow management
- A new, structured paradigm for organizing, locating, specifying, and requesting data products

Performance optimizations
- Ability to re-create data rather than move it
“Chimera” Virtual Data System

Virtual Data API
- A Java class hierarchy to represent transformations & derivations

Virtual Data Language (VDL)
- Textual form for people & illustrative examples
- XML for machine-to-machine interfaces

Virtual Data Database
- Makes the objects of a virtual data definition persistent

Virtual Data Service (future)
- Provides a service interface (e.g., OGSA) to persistent objects
Virtual Data Catalog Object Model (diagram)
Chimera as a Virtual Data System

- Virtual Data Language (VDL): describes virtual data products
- Virtual Data Catalog (VDC): used to store VDL
- Abstract Job Flow Planner: creates a logical DAG (dependency graph)
- Concrete Job Flow Planner: interfaces with a Replica Catalog and produces a physical DAG submission file for Condor-G (see the sketch below)
- Generic and flexible: as a toolkit and/or a framework, in a Grid environment or locally
- Currently in beta version

(Diagram: VDL/XML → Virtual Data Catalog → Abstract Planner → DAX (logical) → Concrete Planner + Replica Catalog → DAG (physical) → DAGMan.)
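A minimal sketch of the abstract-to-concrete planning step described above: bind the logical files of a dependency graph to physical replicas via a replica catalog lookup before jobs go to the scheduler. The class names, catalog contents and output format here are invented for illustration and are not Chimera's Java API or its DAX/DAG formats.

```python
# Illustrative sketch of abstract -> concrete planning: resolve the logical
# files of a job to physical replicas before handing work to a scheduler
# (Chimera hands a DAG to Condor-G/DAGMan; everything below is invented).

from dataclasses import dataclass

@dataclass
class AbstractJob:
    name: str
    transformation: str          # logical program name
    inputs: list[str]            # logical file names
    outputs: list[str]

replica_catalog = {              # logical file name -> physical locations (hypothetical)
    "raw.dat": ["gsiftp://tier1.example.org/store/raw.dat"],
    "calib.dat": ["gsiftp://tier2.example.edu/cal/calib.dat"],
}

def concretize(job: AbstractJob, catalog: dict[str, list[str]]) -> dict:
    """Resolve each logical input to one physical replica; outputs stay logical."""
    resolved = {}
    for lfn in job.inputs:
        replicas = catalog.get(lfn)
        if not replicas:
            raise LookupError(f"no replica registered for {lfn}")
        resolved[lfn] = replicas[0]          # trivial choice; real planners rank replicas
    return {"job": job.name, "executable": job.transformation,
            "inputs": resolved, "outputs": job.outputs}

if __name__ == "__main__":
    job = AbstractJob("reco_1", "reconstruct", ["raw.dat", "calib.dat"], ["reco.dat"])
    print(concretize(job, replica_catalog))
```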
Chimera Application: SDSS Analysis

Size distribution of galaxy clusters?
(Chart: galaxy cluster size distribution, number of clusters vs. number of galaxies on log-log axes.)

Chimera Virtual Data System + GriPhyN Virtual Data Toolkit + iVDGL Data Grid (many CPUs)
US-CMS Testbed
(Map: testbed sites at Fermilab, Caltech, UCSD, Florida and Wisconsin.)
Other CMS Institutes Encouraged to Join
Expressions of interest
- Princeton
- Brazil
- South Korea
- Minnesota
- Iowa
- Possibly others

(Map: existing testbed sites at Fermilab, Caltech, UCSD, Florida and Wisconsin.)
Grid Middleware Used in Testbed

Virtual Data Toolkit 1.1.3
- VDT Client: Globus Toolkit 2.0, Condor-G 6.4.3
- VDT Server: Globus Toolkit 2.0, mkgridmap, Condor 6.4.3, ftsh, GDMP 3.0.7

Virtual Organization (VO) Management
- LDAP server deployed at Fermilab
- GroupMAN (adapted from EDG) used to manage the VO
- Use DOE Science Grid certificates
- Accept EDG and Globus certificates
Commissioning the CMS Grid Testbed
A complete prototype (fig.)
- CMS production scripts, Globus, Condor-G, GridFTP

Commissioning: require production-quality results!
- Run until the testbed “breaks”
- Fix the testbed with middleware patches
- Repeat until the entire production run finishes!
- Discovered/fixed many Globus and Condor-G problems

A huge success from this point of view alone… but very painful
CMS Grid Testbed Production
(Diagram: a master site running IMPALA, mop_submitter, DAGMan, Condor-G and GridFTP dispatches work to remote sites 1…N, each with a batch queue and GridFTP.)
Production Success on CMS Testbed

(Diagram: MCRunJob components (Configurator, Linker, ScriptGenerator, Requirements, Self Description) with a master script (“DAGMaker”) producing VDL for MOP and Chimera.)

Results (a quick arithmetic check follows)
- 150k events generated, ~200 GB produced
- 1.5 weeks of continuous running across all 5 testbed sites
- 1M-event run just started on a larger testbed (~30% complete!)
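A quick, purely illustrative arithmetic check of the quoted numbers:

```python
# Back-of-the-envelope arithmetic on the quoted production numbers
# (150k events, ~200 GB, 1.5 weeks on 5 sites); illustrative only.

events  = 150_000
data_gb = 200
days    = 1.5 * 7
sites   = 5

print(f"average event size : {data_gb * 1024 / events:.2f} MB")                 # ~1.4 MB/event
print(f"aggregate rate     : {events / (days * 24):.0f} events/hour")           # ~600 events/hour
print(f"per-site rate      : {events / (days * 24 * sites):.0f} events/hour")   # ~120 events/hour
```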
Grid Coordination Efforts

Global Grid Forum (www.gridforum.org)
- International forum for general Grid efforts
- Many working groups, standards definitions
- Next meeting in Japan, early 2003

HICB (high energy physics)
- Joint development & deployment of Data Grid middleware
- GriPhyN, PPDG, iVDGL, EU DataGrid, LCG, DataTAG, CrossGrid
- GLUE effort (joint iVDGL-DataTAG working group)

LCG (LHC Computing Grid Project)
- Strong “forcing function”

Large demo projects
- IST2002 Copenhagen
- Supercomputing 2002 Baltimore

New proposal (joint NSF + Framework 6)?
WorldGrid Demo

Joint Trillium-EDG-DataTAG demo
- Resources from both sides in an intercontinental Grid testbed
- Several visualization tools (Nagios, MapCenter, Ganglia)
- Several monitoring tools (Ganglia, MDS, NetSaint, …)

Applications
- CMS: CMKIN, CMSIM
- ATLAS: ATLSIM

Operation
- Submit jobs from US or EU; jobs can run on any cluster
- Shown at IST2002 (Copenhagen)
- To be shown at SC2002 (Baltimore)

Brochures now available describing Trillium and the demos (I have 10 with me now; 2000 just printed)
WorldGrid
Summary

Very good progress on many fronts
- Packaging
- Testbeds
- Major demonstration projects

Current Data Grid projects are providing good experience

Looking to collaborate with more international partners
- Testbeds
- Monitoring
- Deploying VDT more widely

Working towards a new proposal
- Emphasis on Grid-enabled analysis
- Extending the Chimera virtual data system to analysis
Grid References

- Grid Book: www.mkp.com/grids
- Globus: www.globus.org
- Global Grid Forum: www.gridforum.org
- TeraGrid: www.teragrid.org
- EU DataGrid: www.eu-datagrid.org
- PPDG: www.ppdg.net
- GriPhyN: www.griphyn.org
- iVDGL: www.ivdgl.org