Ian Foster ARGONNE CHICAGO
Discussion Points
Maintaining the right balance between research and development
Maintaining focus vs. accepting broader scope
  E.g., international collaboration
  E.g., GriPhyN in the large (GriPhyN-2)
  E.g., Terascale
Creating a national cyberinfrastructure
  What is our appropriate role?
Discussion Points
Outreach to other disciplines
  Biology, NEES, …
Virtual data toolkit
  Inclusive or focused?
  Resource issue, again
Achieving critical mass of resources to deliver on the complete promise
Planning
Review of Year 1 milestones
Top 10 research challenges
Demonstration projects
Research projects + goals
Workshops
Year 1 Milestones: Virtual Data
Develop basic information model to represent data elements, relationships between different data types, characteristics of data elements
Develop protocols for storing, discovering, and retrieving these models
Design and develop tools for creating, accessing, and manipulating these models, for use by interactive tools and by planning and scheduling tools
Deploy centralized metadata and replica catalog services. Develop tools for managing catalogs
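As a rough illustration of the milestones above, an information model with data elements, relationships between them, and a centralized metadata and replica catalog might be sketched as follows. This is a minimal sketch only: the class names, field names, and example URLs are hypothetical and chosen for illustration; they are not the GriPhyN schema or the Globus replica catalog API.

```python
from dataclasses import dataclass, field


@dataclass
class DataElement:
    """A logical data element and its characteristics (hypothetical fields)."""
    name: str
    data_type: str
    attributes: dict = field(default_factory=dict)
    derived_from: list = field(default_factory=list)  # names of parent elements


class Catalog:
    """Toy centralized metadata and replica catalog held in memory."""

    def __init__(self):
        self.metadata = {}   # logical name -> DataElement
        self.replicas = {}   # logical name -> list of physical locations

    def store(self, element):
        self.metadata[element.name] = element

    def add_replica(self, name, location):
        if name not in self.metadata:
            raise KeyError(f"unknown data element: {name}")
        self.replicas.setdefault(name, []).append(location)

    def discover(self, **criteria):
        """Return sorted names of elements whose attributes match all criteria."""
        return sorted(
            name for name, e in self.metadata.items()
            if all(e.attributes.get(k) == v for k, v in criteria.items())
        )

    def locate(self, name):
        """Return the known physical locations of a logical name."""
        return list(self.replicas.get(name, []))

    def lineage(self, name):
        """Walk derived_from links back to the original inputs."""
        seen, stack = [], list(self.metadata[name].derived_from)
        while stack:
            parent = stack.pop()
            if parent not in seen:
                seen.append(parent)
                stack.extend(self.metadata[parent].derived_from)
        return seen


catalog = Catalog()
catalog.store(DataElement("raw-001", "raw_events", {"experiment": "CMS"}))
catalog.store(DataElement("reco-001", "reconstructed", {"experiment": "CMS"},
                          derived_from=["raw-001"]))
catalog.add_replica("raw-001", "gsiftp://site-a.example.org/data/raw-001")
catalog.add_replica("raw-001", "gsiftp://site-b.example.org/data/raw-001")
```

Keeping the logical description (metadata, derivation links) separate from physical locations (replicas) is one possible design choice; it lets a planner ask what a data element is independently of where copies currently live.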
Year 1 Milestones: Request Planning
Develop generic models for representing execution plans. Define API and tools for constructing, traversing, and manipulating plan data structures. Develop protocols and formats for storing and exchanging execution plans.
Develop uniform policy representation for code, data, and resource access. Develop a set of global and local policy scenarios that reflect the requirements of the user communities of the four physics experiments.
Develop simple optimization heuristics. The initial thrust will be on data movement only and will focus on the use of alternative, or branching, plans to compensate for both resource failure and changes in resource performance. Implement planning heuristics in a prototype planning module. Evaluate the performance of alternatives with simulation and model-based studies, as well as execution on the GriPhyN testbed.
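The idea of branching plans for data movement can be sketched roughly as below. This is an illustrative sketch under stated assumptions, not the GriPhyN planner: the `TransferStep` and `BranchingPlan` names, the bandwidth estimates, and the `fake_transfer` stand-in are all hypothetical. The heuristic shown simply tries the alternative with the lowest estimated transfer time first and falls through to the next one when a resource fails.

```python
from dataclasses import dataclass


@dataclass
class TransferStep:
    """One candidate data movement (all fields are illustrative)."""
    source: str
    dest: str
    size_mb: float
    bandwidth_mb_s: float   # estimated; may change as resource performance changes

    def estimated_seconds(self):
        return self.size_mb / self.bandwidth_mb_s


@dataclass
class BranchingPlan:
    """A plan node holding alternative ways to achieve the same movement."""
    alternatives: list

    def ranked(self):
        # Simple heuristic: try the fastest estimated transfer first.
        return sorted(self.alternatives, key=lambda s: s.estimated_seconds())

    def execute(self, run_step):
        """Try alternatives in ranked order; fall through on failure."""
        failures = []
        for step in self.ranked():
            try:
                run_step(step)
                return step, failures
            except RuntimeError as err:
                failures.append((step.source, str(err)))
        raise RuntimeError(f"all alternatives failed: {failures}")


def fake_transfer(step):
    # Stand-in for a real transfer; pretend site-a is down.
    if step.source == "site-a":
        raise RuntimeError("resource unavailable")


plan = BranchingPlan([
    TransferStep("site-a", "tier2", size_mb=1000, bandwidth_mb_s=100),
    TransferStep("site-b", "tier2", size_mb=1000, bandwidth_mb_s=25),
])
chosen, failures = plan.execute(fake_transfer)
print(chosen.source, failures)
```

Here the faster alternative (site-a) is tried first, fails, and the plan falls back to site-b. A real planner would presumably also refresh the bandwidth estimates between attempts.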
Year 1 Milestones: Request Execution
Develop and evaluate a task control language capable of capturing the requirements, preferences, and dependencies of a PVDG request. Implement a prototype of an interpreter for a basic subset of the language.
Enhance the "Gang Matching" capabilities of the ClassAd language and add these enhancements to the run-time support library
Explore ways to enhance the ClassAd language to support events and triggers
Develop a protocol for information exchange between the execution and planning agents
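Gang matching, as understood here, means matching a single request against several resources at once, with constraints that can span the matched resources. The sketch below illustrates that idea with plain Python predicates rather than ClassAd syntax. The request layout, the resource attributes, and the `gang_match` function are hypothetical, and the brute-force search is for illustration only.

```python
from itertools import permutations


def gang_match(request, resources):
    """Return the first assignment of distinct resources to slots that
    satisfies every per-slot constraint and the joint constraint, or None.

    request: dict with 'slots' (ordered list of (label, predicate)) and
             'joint' (predicate over the {label: resource} assignment).
    """
    labels = [label for label, _ in request["slots"]]
    for combo in permutations(resources, len(labels)):
        assignment = dict(zip(labels, combo))
        per_slot_ok = all(pred(assignment[label])
                          for label, pred in request["slots"])
        if per_slot_ok and request["joint"](assignment):
            return assignment
    return None


resources = [
    {"name": "cpu1", "kind": "cpu", "site": "UC", "memory_mb": 512},
    {"name": "cpu2", "kind": "cpu", "site": "CIT", "memory_mb": 2048},
    {"name": "store1", "kind": "storage", "site": "UC", "free_gb": 50},
    {"name": "store2", "kind": "storage", "site": "CIT", "free_gb": 500},
]

request = {
    "slots": [
        ("cpu", lambda r: r["kind"] == "cpu" and r["memory_mb"] >= 1024),
        ("storage", lambda r: r["kind"] == "storage" and r["free_gb"] >= 100),
    ],
    # Joint constraint: CPU and storage must be at the same site.
    "joint": lambda a: a["cpu"]["site"] == a["storage"]["site"],
}

match = gang_match(request, resources)
```

With these example resources, only the pairing of `cpu2` and `store2` satisfies both the per-slot constraints and the same-site constraint. Trying every permutation grows quickly with the number of resources, so a practical matchmaker would need a smarter search.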
Year 1 Milestones: Virtual Data Toolkit
VDT-1 (Basic Grid Services) provides an initial set of grid enabling services and tools, including security, information, metadata, CPU scheduling, and data transport. VDT-1 will support efficient operation on O(10 TB) datasets, O(100) CPUs, and O(100 MB/s) wide area networks and will build extensively on existing technology.
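To give a feel for the VDT-1 scale targets, the following back-of-the-envelope calculation estimates how long it would take to move a 10 TB dataset over a 100 MB/s wide area link. It assumes decimal units and a fully sustained transfer rate, which is optimistic; the O(...) figures on the slide are orders of magnitude, so the result is indicative only. The function name is hypothetical.

```python
def transfer_hours(dataset_tb, bandwidth_mb_s):
    """Hours to move a dataset at a sustained rate (decimal units)."""
    dataset_mb = dataset_tb * 1_000_000   # 1 TB = 10**6 MB
    return dataset_mb / bandwidth_mb_s / 3600


hours = transfer_hours(dataset_tb=10, bandwidth_mb_s=100)
print(f"{hours:.1f} hours")   # about 27.8 hours
```

A full copy of a dataset at this scale therefore takes on the order of a day even under ideal conditions.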
Year 1 Milestones: CMS & LIGO
CMS
  Build basic services and 1-2 prototype Tier 2 centers
  Complete High Level Trigger milestones and perform studies with ORCA, the CMS object-oriented reconstruction & analysis software
LIGO
  Develop a cataloging approach for data access methods & data location (metadata definition, design)
  Develop an access and use model for LIGO data across the GriPhyN system
Year 1 Milestones: ATLAS and SDSS
SDSS
  Build a prototype distributed analysis system
ATLAS
  Connect the Athena analysis framework to a set of prototype virtual data services
    Start with Globus replica catalog service
    Athena EventSelector service to a replica catalog (reading)
    Athena Replica catalog update service
  Testing of basic file replication and transport using 500 GB testbeam data sets
  Develop Condor interface to the ATLAS testbed
  Build basic services and 1-2 prototype Tier 2 centers
Schedule
Feb
  Doc: Data grid reference architecture v1
  Doc: Virtual data architecture v1
  Doc: LIGO application summary and virtual data requirements v1
  Doc: CMS application summary and virtual data requirements v1
Mar
  Doc: CAS policy architecture v0
Apr
  VDT: GSI, GRAM, GridFTP, replica catalog, Condor-G
  Doc: SDSS application summary and virtual data requirements v1
  App: CMS data analysis v1 [tests GSI, GRAM, Condor-G]
May
  Docs: Architecture, virtual data, application requirements v2
June
  Doc: Data grid failure models and fault management research plan
  Doc: GriPhyN simulation architecture and integrated simulation R&D plan
  Doc:
  Proto: CAS policy architecture prototype
  App: LIGO data analysis v1 [tests replica catalog]
Schedule
July
  Proto: Virtual data catalog
  Proto: Request scheduler
  App: ATLAS data analysis
Aug
  Tbed: Initial joint testbed with EDG
Sept
  Tbed: Joint testbed with EDG established with UC, ISI, CIT resources
  App: CMS data analysis over EDG-GriPhyN[-PPDG?] testbed
Breakouts/Workshops?
Virtual data representations
  Naming, etc.
Simulation strategies and tools
  UC, CIT, UCB, others?
Architecture
  What are the essential (and missing) pieces?
Failure models
Workloads
Three types: queries, objects, files
Koen Holtman’s modeling work
LIGO workloads (says Valerie)