


January 2002 FAST 2002 WIP Presentation 1

The Armada framework for parallel I/O on computational grids

Ron Oldfield and David Kotz

Department of Computer Science

Dartmouth College


Introduction

Computational grids: geographically distributed networks of heterogeneous computer systems and devices.

Data-intensive grid applications

• Must use large remote datasets
• Often computationally intensive
• Datasets often need pre- and/or post-processing
• Examples
  – Climate modeling (EOS-DAS)
  – Astronomy (Digital Sky Surveys)
  – Comp. Biology (Computed MicroTomography)
  – Computational physics

I/O system requirements

• Flexibility
  – Application control of the interface
  – Application control of system policies (caching, data-dist., …)
• Performance
  – Parallel data transfers
  – Remote execution of user code (e.g., filtering, transforms, compression, encryption)


Armada

An I/O framework for data-intensive grid applications

• Flexible design, based on stackable file systems.
• Applications access data through a graph of “ships” called an “armada”.
• Requests travel toward data servers.
• Data is pushed toward clients for reads, pulled toward servers for writes.
• The armada abstracts details of the I/O system
  – Caching, filtering, data distribution

[Figure: an armada graph. Labels: clients on site A; API; filter; replica; dist (two); file (five); data segments on site B; data segments from site C. Legend: from data provider; added by application; data flow.]
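The idea of an application reading through a graph of ships can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class names (`Ship`, `FileShip`, `DistShip`, `FilterShip`), the `read(offset, length)` method, and the fixed `part_size` are hypothetical and are not taken from the Armada API. The sketch also simplifies the data flow: a request is a method call toward the file ships and data comes back as a return value, so it does not model the push-based flow the slide describes.

```python
from dataclasses import dataclass
from typing import Callable, List


class Ship:
    """Hypothetical base class: one node in the armada graph."""

    def read(self, offset: int, length: int) -> bytes:
        raise NotImplementedError


@dataclass
class FileShip(Ship):
    """Leaf ship holding one data segment (stands in for a data server)."""
    segment: bytes

    def read(self, offset: int, length: int) -> bytes:
        return self.segment[offset:offset + length]


@dataclass
class DistShip(Ship):
    """Presents several equal-sized downstream segments as one byte range."""
    parts: List[Ship]
    part_size: int

    def read(self, offset: int, length: int) -> bytes:
        out = bytearray()
        end = offset + length
        while offset < end:
            index, local = divmod(offset, self.part_size)
            if index >= len(self.parts):
                break  # request runs past the last segment
            n = min(end - offset, self.part_size - local)
            out += self.parts[index].read(local, n)
            offset += n
        return bytes(out)


@dataclass
class FilterShip(Ship):
    """Applies user-supplied code to data on its way back to the client."""
    source: Ship
    transform: Callable[[bytes], bytes]

    def read(self, offset: int, length: int) -> bytes:
        return self.transform(self.source.read(offset, length))


# Three segments behind one distribution ship, with a filter added on top.
site_b = DistShip([FileShip(b"abcd"), FileShip(b"efgh"), FileShip(b"ijkl")],
                  part_size=4)
armada = FilterShip(site_b, bytes.upper)
result = armada.read(2, 6)  # spans the first two segments
print(result)
```

The client only holds a reference to the top ship; which ships sit below it, and how the segments are laid out, stays hidden behind the single `read` call.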


Improving Performance

[Figure: two armada graphs. The first repeats the graph from the previous slide (clients on site A; API; filter; replica; dist; file; data segments on site B; data segments from site C). The second uses the same endpoints with additional labels: API (three); file and filter (repeated pairs); dist (two); replica; combine.]


In progress…

• Automate graph restructuring.
  – Formalize rules and algorithms
• Develop placement algorithms.
  – Requires detailed information on ship requirements and available resources.
• Performance monitoring and analysis.
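To make "graph restructuring" concrete, here is a minimal Python sketch of one possible rewrite rule. It is an illustration only: the slides state that the rules are still to be formalized, so the rule, the tuple encoding of the graph, and the function name `push_filter_below_dist` are all assumptions, not Armada's algorithm. The rule moves a filter from above a distribution ship to just above each file, which is only valid if the filter can be applied to each segment independently.

```python
# Hypothetical graph encoding (not from the slides):
#   ("file", name)            a data segment
#   ("filter", child)         user code applied to the child's data
#   ("dist", [child, ...])    distribution over several children


def push_filter_below_dist(node):
    """Rewrite filter(dist(c1..cn)) into dist(filter(c1)..filter(cn))."""
    kind = node[0]
    if kind == "file":
        return node
    if kind == "dist":
        return ("dist", [push_filter_below_dist(c) for c in node[1]])
    if kind == "filter":
        child = push_filter_below_dist(node[1])
        if child[0] == "dist":
            # Assumes the filter is segment-wise, so it commutes with dist.
            return ("dist",
                    [push_filter_below_dist(("filter", c)) for c in child[1]])
        return ("filter", child)
    raise ValueError(f"unknown ship kind: {kind}")


before = ("filter", ("dist", [("file", "b1"), ("file", "b2"), ("file", "b3")]))
after = push_filter_below_dist(before)
print(after)
```

After the rewrite each file has its own filter, so the filters could run in parallel near the data. Deciding where each ship actually runs is the separate placement problem listed above.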

Contact Information

Ron Oldfield ([email protected])
David Kotz ([email protected])

http://www.cs.dartmouth.edu/~dfk/armada