


Ian Foster ARGONNE CHICAGO

Discussion Points

- Maintaining the right balance between research and development
- Maintaining focus vs. accepting broader scope
  - E.g., international collaboration
  - E.g., GriPhyN in the large (GriPhyN-2)
  - E.g., Terascale
- Creating a national cyberinfrastructure
  - What is our appropriate role?

Discussion Points

- Outreach to other disciplines
  - Biology, NEES, …
- Virtual data toolkit
  - Inclusive or focused?
  - Resource issue, again
- Achieving critical mass of resources to deliver on the complete promise

Planning

- Review of Year 1 milestones
- Top 10 research challenges
- Demonstration projects
- Research projects + goals
- Workshops

Year 1 Milestones: Virtual Data

- Develop a basic information model to represent data elements, relationships between different data types, and characteristics of data elements
- Develop protocols for storing, discovering, and retrieving these models
- Design and develop tools for creating, accessing, and manipulating these models, for use by interactive tools and by planning and scheduling tools
- Deploy centralized metadata and replica catalog services. Develop tools for managing catalogs
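
To make the catalog milestone concrete, here is a minimal sketch of what a combined metadata and replica catalog might look like. It is a toy in-memory model, not the Globus replica catalog or any GriPhyN interface: the class `ReplicaCatalog`, its methods, and the `*.example` URLs are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicaCatalog:
    """Toy in-memory catalog: logical file name -> metadata and physical replicas."""
    metadata: dict[str, dict[str, str]] = field(default_factory=dict)
    replicas: dict[str, set[str]] = field(default_factory=dict)

    def register(self, logical_name: str, physical_url: str, **attrs: str) -> None:
        """Record one physical copy of a logical file and merge in any metadata."""
        self.replicas.setdefault(logical_name, set()).add(physical_url)
        self.metadata.setdefault(logical_name, {}).update(attrs)

    def locate(self, logical_name: str) -> list[str]:
        """Return all known physical locations, sorted for stable output."""
        return sorted(self.replicas.get(logical_name, set()))

    def discover(self, **attrs: str) -> list[str]:
        """Return logical names whose metadata matches every given attribute."""
        return sorted(
            name for name, meta in self.metadata.items()
            if all(meta.get(k) == v for k, v in attrs.items())
        )


# Hypothetical entries; the sites and file names are made up.
catalog = ReplicaCatalog()
catalog.register("run42.events", "gsiftp://site-a.example/data/run42", experiment="CMS")
catalog.register("run42.events", "gsiftp://site-b.example/data/run42")
catalog.register("strain-001", "gsiftp://site-c.example/ligo/strain-001", experiment="LIGO")
```

Keeping metadata lookup (`discover`) separate from replica lookup (`locate`) mirrors the slide's distinction between describing data elements and finding where copies live.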

Year 1 Milestones: Request Planning

- Develop generic models for representing execution plans
  - Define API and tools for constructing, traversing, and manipulating plan data structures
  - Develop protocols and formats for storing and exchanging execution plans
- Develop uniform policy representation for code, data, and resource access
  - Develop a set of global and local policy scenarios that reflect the requirements of the user communities of the four physics experiments
- Develop simple optimization heuristics
  - Initial thrust will be on data movement only, focusing on the use of alternative, or branching, plans to compensate for both resource failure and changes in resource performance
  - Implement planning heuristics in a prototype planning module
  - Evaluate performance of alternatives with simulation- and model-based studies, as well as execution on the GriPhyN testbed
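
The idea of alternative (branching) plans for data movement can be sketched as follows. This is only an illustration under stated assumptions: the `Transfer` type, the `choose_plan` function, the site names, and the cost figures are all hypothetical, and the heuristic shown (pick the cheapest branch whose sites are all up) is one simple possibility, not the planner the project intended to build.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    """One data-movement step: copy a dataset from a source site to a destination."""
    dataset: str
    source: str
    destination: str


def choose_plan(alternatives, estimate_seconds, is_available):
    """Pick the cheapest alternative whose steps touch only available sites.

    alternatives: list of plans, each a list of Transfer steps.
    estimate_seconds: callable(Transfer) -> estimated cost of the step.
    is_available: callable(site name) -> bool.
    Returns the chosen plan, or None if every branch is blocked.
    """
    viable = [
        plan for plan in alternatives
        if all(is_available(s.source) and is_available(s.destination) for s in plan)
    ]
    if not viable:
        return None
    return min(viable, key=lambda plan: sum(estimate_seconds(s) for s in plan))


# Two branches that deliver the same dataset to siteC (all values are made up).
direct = [Transfer("d1", "siteA", "siteC")]
fallback = [Transfer("d1", "siteB", "siteC")]
cost_by_source = {"siteA": 100.0, "siteB": 250.0}


def estimate(step):
    return cost_by_source[step.source]


all_up = {"siteA", "siteB", "siteC"}
site_a_down = {"siteB", "siteC"}

normal_choice = choose_plan([direct, fallback], estimate, lambda s: s in all_up)
failure_choice = choose_plan([direct, fallback], estimate, lambda s: s in site_a_down)
```

With every site up the cheaper `direct` branch is chosen; when `siteA` fails the same call falls back to the other branch. Changing the cost table models a change in resource performance rather than an outright failure.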

Year 1 Milestones: Request Execution

- Develop and evaluate a task control language capable of capturing the requirements, preferences, and dependencies of a PVDG request. Implement a prototype interpreter for a basic subset of the language
- Enhance the "Gang Matching" capabilities of the ClassAd language and add these enhancements to the run-time support library
- Explore ways to enhance the ClassAd language to support events and triggers
- Develop a protocol for information exchange between the execution and planning agents
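
Gang matching means matching one request against several resources at once, so that the combination satisfies the request jointly. The sketch below illustrates the concept with plain Python dictionaries and a brute-force search. It does not use ClassAd syntax or the Condor libraries; `gang_match`, `needs`, and all attribute names and values are hypothetical.

```python
from itertools import product


def gang_match(request, pools):
    """Return the first combination of one ad per pool that satisfies the request.

    request: callable taking one ad per pool (in pool order) -> bool.
    pools: list of lists of ads; each ad is a plain dict of attributes.
    Returns a tuple of ads, or None if no combination matches.
    """
    for combo in product(*pools):
        if request(*combo):
            return combo
    return None


# Made-up resource advertisements.
machines = [
    {"name": "m1", "memory_mb": 512, "site": "UC"},
    {"name": "m2", "memory_mb": 2048, "site": "ISI"},
]
storage = [
    {"name": "s1", "free_gb": 50, "site": "UC"},
    {"name": "s2", "free_gb": 800, "site": "ISI"},
]


def needs(machine, store):
    """The job wants enough memory and disk, co-located at one site."""
    return (
        machine["memory_mb"] >= 1024
        and store["free_gb"] >= 500
        and machine["site"] == store["site"]
    )


match = gang_match(needs, [machines, storage])
```

The co-location constraint is what makes this a gang match: neither resource can be picked independently, because the requirement spans both ads.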

Year 1 Milestones: Virtual Data Toolkit

VDT-1 (Basic Grid Services) provides an initial set of grid-enabling services and tools, including security, information, metadata, CPU scheduling, and data transport. VDT-1 will support efficient operation on O(10 TB) datasets, O(100) CPUs, and O(100 MB/s) wide area networks, and will build extensively on existing technology.
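
A quick back-of-envelope calculation shows what these target scales imply for data movement. The figures in the slide are orders of magnitude, so the sketch below takes them at face value (exactly 10 TB, 100 CPUs, 100 MB/s) and assumes decimal units; both are simplifying assumptions.

```python
TB = 10**12  # bytes per terabyte, decimal units (assumed)
MB = 10**6   # bytes per megabyte, decimal units (assumed)

dataset_bytes = 10 * TB
wan_bytes_per_second = 100 * MB
cpus = 100

# Time to move the whole dataset once over the wide area network.
transfer_seconds = dataset_bytes / wan_bytes_per_second
transfer_hours = transfer_seconds / 3600

# Share of the dataset each CPU handles if the work is split evenly.
bytes_per_cpu = dataset_bytes / cpus

print(f"{transfer_seconds:.0f} s = {transfer_hours:.1f} h; {bytes_per_cpu / 10**9:.0f} GB per CPU")
# prints "100000 s = 27.8 h; 100 GB per CPU"
```

Moving a full dataset takes on the order of a day even at the full network rate, which is consistent with the planning milestones' initial focus on data movement.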

Year 1 Milestones: CMS & LIGO

CMS
- Build basic services and 1-2 prototype Tier 2 centers
- Complete High Level Trigger milestones and perform studies with ORCA, the CMS object-oriented reconstruction & analysis software

LIGO
- Develop a cataloging approach for data access methods & data location (metadata definition, design)
- Develop an access and use model for LIGO data across the GriPhyN system

Year 1 Milestones: ATLAS and SDSS

SDSS
- Build a prototype distributed analysis system

ATLAS
- Connect the Athena analysis framework to a set of prototype virtual data services
  - Start with Globus replica catalog service
  - Athena EventSelector service to a replica catalog (reading)
  - Athena Replica catalog update service
- Testing of basic file replication and transport using 500 GB testbeam data sets
- Develop Condor interface to the ATLAS testbed
- Build basic services and 1-2 prototype Tier 2 centers

Schedule

Feb
- Doc: Data grid reference architecture v1
- Doc: Virtual data architecture v1
- Doc: LIGO application summary and virtual data requirements v1
- Doc: CMS application summary and virtual data requirements v1

Mar
- Doc: CAS policy architecture v0

Apr
- VDT: GSI, GRAM, GridFTP, replica catalog, Condor-G
- Doc: SDSS application summary and virtual data requirements v1
- App: CMS data analysis v1 [tests GSI, GRAM, Condor-G]

May
- Docs: Architecture, virtual data, appln requirements v2

June
- Doc: Data grid failure models and fault management research plan
- Doc: GriPhyN simulation architecture and integrated simulation R&D plan
- Doc:
- Proto: CAS policy architecture prototype
- App: LIGO data analysis v1 [tests replica catalog]

Schedule

July
- Proto: Virtual data catalog
- Proto: Request scheduler
- App: ATLAS data analysis

Aug
- Tbed: Initial joint testbed with EDG

Sept
- Tbed: Joint testbed with EDG established with UC, ISI, CIT resources
- App: CMS data analysis over EDG-GriPhyN[-PPDG?] testbed

Breakouts/Workshops?

- Virtual data representations
  - Naming, etc.
- Simulation strategies and tools
  - UC, CIT, UCB, others?
- Architecture
  - What are the essential (and missing) pieces?
- Failure models

Workloads

- Three types: queries, objects, files
- Koen Holtman’s modeling work
- LIGO workloads (says Valerie)