ToPoS: High-Throughput Parallel Processing Pipelines on the Grid

Page 1: ToPoS: High-Throughput Parallel Processing Pipelines on the Grid

SARA Reken- en Netwerkdiensten

ToPoS: High-Throughput Parallel Processing Pipelines on the Grid

Pieter van Beek

SARA Computing and Networking Services

High Performance Computing and Visualization

e-Science Support

ToPoS | 23 October 2008

Page 2: ToPoS: High-Throughput Parallel Processing Pipelines on the Grid

User experiences with gLite

Overhead for starting jobs is considerable

Determining the best chunk size is difficult:

Too small -> large overhead

Too large -> timeouts and throughput problems.

Resource brokering is far from optimal

Jobs often fail and users create their own tools for administrative tasks

Page 3: ToPoS: High-Throughput Parallel Processing Pipelines on the Grid

Resource Brokering

Submitted jobs are sent to a CE immediately.

When another CE becomes available later, jobs that are already queued do not move to it automatically.

Page 4: ToPoS: High-Throughput Parallel Processing Pipelines on the Grid

Failing Jobs (1)

Common experiences:

Sorry, an Incomprehensible Error occurred

Your VOMS Credential has expired

What Job?

Success! (but there’s no output)

Failure! (but it ran just fine)

Out of Wall-time (but no CPU-time?)

A lot of “monitoring and resubmission” software is created again and again by many users.

Page 5: ToPoS: High-Throughput Parallel Processing Pipelines on the Grid

Failing Jobs (2)

A real-world example:

27,000 jobs

Duration: approx. 4 hrs per job

Approx. 280 WNs (worker nodes)

Theoretical duration: 16 days

But with a success rate of 70%: approx. 9 resubmissions

“Practical” duration: >2 months
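
The figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes that the 4 hours is the duration of a single job, that all 280 worker nodes stay busy, and that each round resubmits only the failed jobs at the same 70% success rate. The function names are illustrative and not part of ToPoS.

```python
def theoretical_days(jobs, hours_per_job, worker_nodes):
    """Wall-clock days if every job succeeds and all worker nodes stay busy."""
    return jobs * hours_per_job / worker_nodes / 24


def submission_rounds(jobs, success_rate):
    """Rounds needed until the expected number of unfinished jobs drops below one.

    Assumes each round resubmits only the failed jobs and that jobs fail
    independently with the same probability.
    """
    remaining = float(jobs)
    rounds = 0
    while remaining >= 1:
        remaining *= 1 - success_rate
        rounds += 1
    return rounds


print(round(theoretical_days(27_000, 4, 280), 1))  # 16.1
print(submission_rounds(27_000, 0.7))              # 9
```

Under these assumptions the model gives about 16 days and nine rounds, consistent with the slide. It does not model the waiting and bookkeeping time between rounds, which presumably accounts for much of the gap to the practical duration.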

Page 6: ToPoS: High-Throughput Parallel Processing Pipelines on the Grid

Pilot Jobs

[Figure: “normal” jobs compared with pilot jobs]

Page 7: ToPoS: High-Throughput Parallel Processing Pipelines on the Grid

Simplest possible solution: Topos I

An online counter, like a “page views” counter

Numbers are “leased” for some period

Leases must be renewed

Interfaced with HTTP (REST web service)

Can be used with any HTTP client (wget, browsers)

As little security as possible
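
The leasing idea can be illustrated with a small in-memory model. This is a sketch only, not the Topos I implementation or its HTTP interface: the class name `LeasedCounter`, its methods, and the injectable clock are assumptions made for illustration.

```python
import time


class LeasedCounter:
    """Hands out the numbers 0..size-1; each number is leased for a limited time."""

    def __init__(self, size, clock=time.monotonic):
        self._clock = clock
        # number -> time at which its lease expires (-inf: never leased)
        self._expires = {n: float("-inf") for n in range(size)}

    def lease(self, seconds):
        """Return the lowest number without a valid lease, or None if all are taken."""
        now = self._clock()
        for number, expires in self._expires.items():
            if expires <= now:
                self._expires[number] = now + seconds
                return number
        return None

    def renew(self, number, seconds):
        """Extend a lease; returns False if the number was already deleted."""
        if number not in self._expires:
            return False
        self._expires[number] = self._clock() + seconds
        return True

    def delete(self, number):
        """Remove a finished number so that it is never handed out again."""
        self._expires.pop(number, None)
```

A number whose lease is neither renewed nor deleted becomes available again once the lease expires, so work held by a crashed job is handed to another job without any extra bookkeeping.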

Page 8: ToPoS: High-Throughput Parallel Processing Pipelines on the Grid

Pilot job flow

[Flowchart. Labels: Pilot job; Affirm token use; Get unused token; Submit; Pilot job with token; Running pilot job; Execute token task; Finished? (no / yes); Delete token]
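
The flowchart labels suggest a loop along the following lines. The order of the steps is an assumption reconstructed from the labels, and the four callables are hypothetical stand-ins for requests to the token pool server, so treat this as a sketch rather than the actual pilot job script.

```python
def run_pilot_job(get_unused_token, affirm_token_use, execute_task, delete_token):
    """Process tokens until the pool has none left to hand out.

    All four arguments are caller-supplied callables (hypothetical names):
    get_unused_token() returns a token or None, and execute_task(token)
    returns True when the task finished successfully.
    """
    finished = []
    while True:
        token = get_unused_token()
        if token is None:
            return finished
        affirm_token_use(token)   # confirm that this pilot job works on the token
        if execute_task(token):
            delete_token(token)   # finished: remove the token from the pool
            finished.append(token)
        # Not finished: the token stays in the pool for a later attempt.


# Usage with a plain in-memory pool standing in for the token server.
pool = ["t1", "t2", "t3"]
handed_out = set()


def get_unused_token():
    for token in pool:
        if token not in handed_out:
            handed_out.add(token)
            return token
    return None


done = run_pilot_job(
    get_unused_token,
    affirm_token_use=lambda token: None,
    execute_task=lambda token: token != "t2",  # pretend that t2 fails
    delete_token=pool.remove,
)
print(done)  # ['t1', 't3']
print(pool)  # ['t2']
```

The failed token is still in the pool afterwards, so another pilot job can pick it up later.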

Page 9: ToPoS: High-Throughput Parallel Processing Pipelines on the Grid

Advantages

Simple design and use

Using HTTP REST

Automatic resubmissions

Less overhead for large numbers of jobs: one pilot job can execute several tasks in sequence.

Improved scheduling

Easy job administration by querying the Token Pool Server:

Progress

Fail rate

Page 10: ToPoS: High-Throughput Parallel Processing Pipelines on the Grid

Topos I screenshots

ToPoS | 14 November 2008

Page 11: ToPoS: High-Throughput Parallel Processing Pipelines on the Grid

Topos 2.x

Interfaced through WebDAV instead of plain HTTP

Tokens are files, i.e. they have:

identity

content

mime-type

properties

Token pools are directories

Tokens can be moved between directories

Allows users to build pipelines and workflows (high-level colored Petri nets)
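
The pipeline idea can be sketched with local directories standing in for token pools. WebDAV offers a MOVE method for the same purpose, but the example below does not talk to a server: the pool names `raw` and `aligned`, the function `run_stage`, and the use of the local filesystem are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def run_stage(source_pool, target_pool, transform):
    """Apply transform to every token in source_pool, then move it to target_pool."""
    moved = []
    for token in sorted(source_pool.iterdir()):
        token.write_text(transform(token.read_text()))
        # Moving the file hands the token over to the next stage of the pipeline.
        token.rename(target_pool / token.name)
        moved.append(token.name)
    return moved


with tempfile.TemporaryDirectory() as root:
    raw, aligned = Path(root, "raw"), Path(root, "aligned")
    raw.mkdir()
    aligned.mkdir()
    (raw / "token-1").write_text("acgt")
    (raw / "token-2").write_text("ttga")

    print(run_stage(raw, aligned, str.upper))     # ['token-1', 'token-2']
    print(sorted(p.name for p in raw.iterdir()))  # []
    print((aligned / "token-1").read_text())      # ACGT
```

Each directory plays the role of a place in a Petri net and each stage that moves tokens plays the role of a transition, so chaining several stages gives a simple workflow.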

Page 12: ToPoS: High-Throughput Parallel Processing Pipelines on the Grid

Topos 2 screenshot

Page 13: ToPoS: High-Throughput Parallel Processing Pipelines on the Grid

“Portfolio”

SciaGrid

Collaboration between SRON, KNMI, NIKHEF and SARA

Website where users can select satellite data (Sciamachy) and data processors

Arnold Kuzniar and Jack Leunissen (WUR)

BLAST protein sequence alignment

Bas Dutilh (CMBI)

HAMMER sequence alignment (?)

Jan Bot (TUD)

Page 14: ToPoS: High-Throughput Parallel Processing Pipelines on the Grid

Future directions

Documentation

Atom/RSS instead of WebDAV

Back to numbers instead of files

TODO

Page 15: ToPoS: High-Throughput Parallel Processing Pipelines on the Grid

[email protected]
