
Page 1: Title

Co-Allocation of Compute and Network Resources in the VIOLA Testbed

Christoph Barz and Markus Pilz
University of Bonn, Institute of Computer Science IV

Oliver Wäldrich and Wolfgang Ziegler
Fraunhofer Institute for Scientific Computing and Algorithms, Department of Bioinformatics

Thomas Eickermann and Lidia Kirtchakova
Research Centre Jülich, ZAM

TERENA Networking Conference 2006 (15-18 May 2006, Catania, Italy)

Page 2

Agenda

Motivation

Resource Orchestration by MetaScheduling

Network Reservations with ARGON

Future Work


Page 3

Motivation - Grid Projects (Examples)

Large Scale Scientific Applications

• Extremely high Data Volumes

• High Computational Demand

• Distributed Resources

http://www.realitygrid.org/Spice/

http://www.c3grid.de/

http://www.gac-grid.de/

Resource Orchestration via

• Advance Reservations

• Co-Scheduling

Resource Orchestration of

• Computational Resources

• Storage Resources

• Instruments and Sensors

• Network Resources

D-Grid Initiative

SPICE on TeraGrid + UK NGS

Page 4

Motivation - Applications in VIOLA (Examples)

AMG-OPT (simulation based on a hierarchical algebraic solver)

TechSim (distributed simulation of complex technological systems)

MetaTrace (simulation of pollutant transport in groundwater)

KoDaVis (collaborative visualization of huge atmospheric datasets in heterogeneous environments)

Page 5

MetaTrace Demonstration

Distribution of Chemicals in the Soil – Problem Decomposition

TRACE: calculation of water-flow

PARTRACE: distribution and chemical reactions of pollutants

Exchange of intermediate results: up to 1 gigabyte within 1 second

[Figure: MetaTrace deployment across the VIOLA sites FhG Sankt Augustin, FZ Jülich, caesar, Uni Bonn, and FH BRS. Clusters are reserved and coupled via MetaMPICH; the figure annotates the cluster requirements (Jülich Cray: PARTRACE, 30 nodes; FH BRS: TRACE, 6x2x2 CPUs; caesar: TRACE, 30x2 CPUs), the network requirements (water-flow exchanges 1x per step and 30-100x per step), and the resulting network service description.]

Network reservation:
• multiple point-to-point tunnels
• Layer 2/3 switching / routing in the MPLS network

Page 6

MetaScheduling Service - Architecture

[Architecture figure: a UNICORE Client and the MetaScheduler on top; UNICORE Gateways at Site A and Site B, each with a (Primary) NJS and Target System Interface in front of a Local Scheduler; each Local Scheduler has an Adapter and a Job Queue and manages a cluster; a further Adapter connects the MetaScheduler to the network RMS ARGON, which manages link usage.]

1) User specifies the job
2) MetaScheduling request (WS-Agreement)
3) Negotiation and reservation
4) MetaScheduler reply (WS-Agreement)
5) Job transfer to the UNICORE system
6) All job components, including network QoS, are provisioned automatically

Page 7

MetaScheduling Service - Algorithm

First-fit search for a common start time of all job components on all resources.

Input: the requested resources and time constraints. For each resource i = 1..n, determine its next availability; set nextStartup = max(nextStartup, freeSlots[i]) and increment i; while i ≤ n, keep probing. The search runs over the span of time in which the requested service can start and stops once a common free slot is found.

MetaTrace example (first-fit algorithm): Jülich Cray (PARTRACE, 30 nodes), FH BRS (TRACE, 6x2x2 CPUs), caesar (TRACE, 30x2 CPUs), and the network service are probed along their timelines until a common start time is found at nextStartup.

Page 8

ARGON - Network Service

Interface operations: Availability, Reservation, Bind, Query, Cancel, Modify
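The six operations listed on the slide can be sketched as a client-side API. Everything below is an illustrative assumption, not the real ARGON API: the class name `ArgonInterface`, the single-link capacity model, and all signatures are invented for the sake of a self-contained example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from itertools import count

# Hypothetical in-memory stand-in for the ARGON reservation interface.
# Only the operation names (Availability, Reservation, Bind, Query,
# Cancel, Modify) come from the slide; the rest is assumed.
@dataclass
class Reservation:
    start: datetime
    duration: timedelta
    bandwidth_mbps: int
    bound_params: dict = field(default_factory=dict)
    cancelled: bool = False

class ArgonInterface:
    def __init__(self, capacity_mbps=1000):
        self.capacity = capacity_mbps
        self.reservations = {}  # reservation id -> Reservation
        self._ids = count(1)

    def _load(self, start, duration):
        # Bandwidth of active reservations overlapping [start, start+duration).
        end = start + duration
        return sum(r.bandwidth_mbps for r in self.reservations.values()
                   if not r.cancelled
                   and r.start < end and start < r.start + r.duration)

    def availability(self, start, duration, bandwidth_mbps):
        # Availability: does the request fit the remaining capacity?
        return self._load(start, duration) + bandwidth_mbps <= self.capacity

    def reservation(self, start, duration, bandwidth_mbps):
        # Reservation: admission decision plus booking (negotiation phase).
        if not self.availability(start, duration, bandwidth_mbps):
            raise ValueError("request rejected: insufficient capacity")
        rid = next(self._ids)
        self.reservations[rid] = Reservation(start, duration, bandwidth_mbps)
        return rid

    def bind(self, rid, **params):
        # Bind: fix late service parameters (intermediate phase).
        self.reservations[rid].bound_params.update(params)

    def query(self, rid):
        return self.reservations[rid]

    def modify(self, rid, bandwidth_mbps):
        # Modify: renegotiate a parameter of an existing reservation.
        r = self.reservations[rid]
        old = r.bandwidth_mbps
        r.bandwidth_mbps = 0  # exclude this reservation while re-checking
        ok = self.availability(r.start, r.duration, bandwidth_mbps)
        r.bandwidth_mbps = bandwidth_mbps if ok else old
        return ok

    def cancel(self, rid):
        self.reservations[rid].cancelled = True
```

A caller would first probe `availability`, then `reservation`, later `bind` device-level parameters, and use `query`, `modify`, or `cancel` as needed, mirroring the order in which the slide introduces the operations.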

Page 9

ARGON – Reservation Lifetime

Advance reservation timeline: t_req → t_conf → t_bind → t_act → t_begin → t_end

The feasible solution space over time and resources is bounded by constraints: traffic engineering, service availability, policy rules, SLAs, user requirements, and more.

Negotiation phase (t_req to t_conf): availability check(s), admission decision, reservation.

Intermediate phase (until t_bind): re-optimization is inexpensive; binding of service parameters.

Activation phase (from t_act): automatic initiation; configuration of network devices; duration depends on service and devices.

Usage/Renegotiation phase (t_begin to t_end): re-optimization is expensive; modification of parameters.

Query and Cancel can be used anytime after negotiation
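The phases can be read directly off the timeline instants. The helper below is a small illustrative sketch, not part of ARGON: it maps a point in time to the phase names used on the slide, with t_bind omitted from the comparisons since it falls inside the intermediate phase.

```python
def reservation_phase(now, t_req, t_conf, t_act, t_begin, t_end):
    """Map a point in time onto the ARGON reservation lifetime phases.

    Works with any ordered time type (datetime, float, int). The
    instants follow the slide's timeline: request, confirmation,
    activation, service start, and service end.
    """
    if now < t_req:
        return "not yet requested"
    if now < t_conf:
        return "negotiation"           # availability checks, admission
    if now < t_act:
        return "intermediate"          # cheap re-optimization, binding
    if now < t_begin:
        return "activation"            # device configuration
    if now < t_end:
        return "usage/renegotiation"   # expensive re-optimization
    return "finished"
```

Query and Cancel, as the slide notes, are valid in every phase after negotiation and so do not appear in the mapping.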

Page 10

ARGON – Resource Optimisation

Timeline: Negotiation phase → Intermediate phase → Usage/Renegotiation phase

Rerouting of intermediate-phase flows is inexpensive; online and offline algorithms can be used.

• First fit / deadline: shift the start of a reservation within its capacity/time window
• Flexible reservations
• Malleable reservations: increase capacity to reduce duration, or reduce capacity to increase duration
• Flexible path selection
• Rerouting/planning of accepted flows
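The capacity/duration trade of a malleable reservation can be written down in a line, under the assumption (not stated explicitly on the slide) that the reserved data volume, capacity times duration, stays constant; the function name is hypothetical.

```python
def reshape_malleable(capacity_mbps, duration_s, new_capacity_mbps):
    """Return the new duration after reshaping a malleable reservation.

    The reservation must still transfer the same data volume, so
    capacity * duration is kept constant: raising the capacity shortens
    the reservation, lowering it stretches the reservation.
    """
    if new_capacity_mbps <= 0:
        raise ValueError("capacity must be positive")
    volume_mbit = capacity_mbps * duration_s  # invariant of the reshape
    return volume_mbit / new_capacity_mbps
```

For example, doubling the capacity of a 100 Mbit/s, 600 s reservation halves its duration to 300 s, which is exactly the "increase capacity, reduce duration" case drawn on the slide.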

Page 11

ARGON - Network Architecture

Overlay model: the optical domain is not visible to the IP domain; the MPLS domain cannot perform efficient traffic engineering; UNI signaling. Multi-region network.

[Architecture figure: the ARGON controller with AutoDiscovery, Listener, DataBase, SNMP server and client, RSVP, and a Proxy UNI client; administration and service provisioning via CLI and UNI towards the MPLS and ASON/GMPLS switches (Alcatel).]

Page 12

Co-Allocation in the VIOLA Testbed

MetaScheduler concept:
• Resource orchestration: negotiation of a common time frame for all resources; reservation of nodes at different clusters; reservation of network services via ARGON
• Interoperability: standardization activities (GRAAP, OGSA-RSS); technologies (UNICORE, Web Services, WS-Agreement)

ARGON concept (Allocation and Reservation in Grid-enabled Optical Networks):
• Network service provisioning: reservation, signaling and provisioning; end-to-end path computation; service modeling
• Policy-based framework: AAA support; service level agreements (SLA); policy-aware provisioning
• Network resource management: optimization; resource modeling
• Network architecture: multi-domain, multi-layer, multi-region (IETF nomenclature)

Page 13

ARGON - Where are we now?

Timeline over the three project years (with a "today" marker): application "gridifying", middleware resource broker, ARGON reservation service, ARGON service provisioning, and VIOLA network deployment and tests, moving from development to prototype deployment. The network architecture evolves from multi-layer, single-region, single-domain towards multi-region and multi-domain.

Overall VIOLA requirements:
• Infrastructure for applications
• Grid middleware integration
• Network as a Grid resource

Overall ARGON objectives:
• Bandwidth on demand in VIOLA
• Advance network reservation
• Interface inspired by EGEE specs
• Multi-vendor, multi-layer, multi-region

Page 14

Future Work

MetaScheduler
• More automated resource pre-selection
• MetaScheduling Service for workflows
• Porting to GT4
• LUCIFER integration

ARGON
• Multi-region: the MPLS and ASON/GMPLS layers must be coordinated
• Resource and service model enhancements
• GÉANT2 cooperation (JRA3 inter-domain manager)
• LUCIFER integration (ARGON, UCLPv2, D-RAC)

Page 15

Conclusion

Demanding applications benefit from the resources of multiple clusters and sites

Application-driven resource selection for UNICORE Grid applications

Co-scheduling of computational, storage and network resources

The MetaScheduling Service orchestrates resources across multiple domains

ARGON provides network services with advance-reservation capabilities and dedicated QoS

Page 16

The end

Thank You!

Contact: www.viola-testbed.de
{barz, pilz}@cs.uni-bonn.de
{th.eickermann, l.kirtchakova}@fz-juelich.de
{Wolfgang.Ziegler, Oliver.Waeldrich}@scai.fraunhofer.de

Page 17

MetaScheduling Service – Algorithm (2)

set n = number of requested resources
set res[1..n] = requested resources
set prop[1..n] = requested property per resource
set freeSlots[1..n] = null
set endOfPreviewWindow = false
set nextStartup = currentTime + someMinutes
set needNext = true

while (endOfPreviewWindow = false & needNext = true) do {
    for i = 1..n do in parallel {
        freeSlots[i] = AvailableAt(res[i], prop[i], nextStartup)
    }
    set needNext = false
    for i = 1..n do {
        if (nextStartup != freeSlots[i]) then {
            if (freeSlots[i] != null) then {
                if (nextStartup < freeSlots[i]) then {
                    set nextStartup = freeSlots[i]
                    set needNext = true
                }
            } else {
                set endOfPreviewWindow = true
            }
        }
    }
}
if (needNext = false & endOfPreviewWindow = false) then
    return freeSlots[1]
else
    return "no common slot found"
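The pseudocode above translates almost line for line into Python. `toy_available_at` and its busy-until schedule are invented stand-ins for the real `AvailableAt` query against a local scheduler or ARGON; only the control flow comes from the slide.

```python
def find_common_slot(resources, properties, available_at, current_time, lookahead=5):
    """First-fit search for a common start time, following the slide's pseudocode.

    available_at(resource, prop, t) returns the next time >= t at which the
    resource can start the job, or None once its preview window is exhausted.
    """
    n = len(resources)
    next_startup = current_time + lookahead  # 'currentTime + someMinutes'
    need_next, end_of_preview = True, False
    free_slots = [None] * n
    while not end_of_preview and need_next:
        # Ask every resource for its next free slot at next_startup
        # (done in parallel in the original; sequential here for brevity).
        free_slots = [available_at(resources[i], properties[i], next_startup)
                      for i in range(n)]
        need_next = False
        for slot in free_slots:
            if slot is None:
                end_of_preview = True        # preview window exhausted
            elif slot > next_startup:
                next_startup = slot          # a resource is busy: retry later
                need_next = True
    if not need_next and not end_of_preview:
        return free_slots[0]                 # common start time found
    return None                              # 'no common slot found'

# Toy stand-in for the schedulers: each resource is busy until some time,
# and every preview window ends at t = 100 (all values are invented).
BUSY_UNTIL = {"cray_juelich": 10, "cluster_fhbrs": 25, "cluster_caesar": 0}

def toy_available_at(resource, prop, t):
    if t >= 100:
        return None  # end of preview window
    return max(t, BUSY_UNTIL[resource])
```

With this toy schedule the first probe at t = 5 returns slots 10, 25, and 5, so nextStartup advances to 25; the second probe returns 25 for all three resources, which is the common start time, matching the convergence behaviour sketched on slide 7.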