Co-Allocation of Compute and Network Resources in the VIOLA
Testbed
Christoph Barz and Markus Pilz
University of Bonn, Institute of Computer Science IV
Oliver Wäldrich and Wolfgang Ziegler
Fraunhofer Institute for Scientific Computing and Algorithms, Department of Bioinformatics
Thomas Eickermann and Lidia Kirtchakova
Research Centre Jülich, ZAM
TERENA Networking Conference 2006 (15 - 18 May 2006, Catania, Italy)
Agenda
Motivation
Resource Orchestration by MetaScheduling
Network Reservations with ARGON
Future Work
Motivation - Grid Projects (Examples)
Large Scale Scientific Applications
• Extremely high data volumes
• High computational demand
• Distributed resources
Examples: SPICE on TeraGrid + UK NGS (http://www.realitygrid.org/Spice/); D-Grid Initiative (http://www.c3grid.de/, http://www.gac-grid.de/)
Resource Orchestration of
• Computational resources
• Storage resources
• Instruments and sensors
• Network resources
via
• Advance reservations
• Co-scheduling
Motivation - Applications in VIOLA (Examples)
AMG-OPT (simulation based on a hierarchical algebraic solver)
TechSim (distributed simulation of complex technological systems)
MetaTrace (simulation of pollutant transport in groundwater)
KoDaVis (collaborative visualization of huge atmospheric datasets in heterogeneous environments)
MetaTrace Demonstration
Distribution of Chemicals in the Soil – Problem Decomposition
• TRACE: calculation of water flow
• PARTRACE: distribution and chemical reactions of pollutants
• Exchange of intermediate results: up to 1 GByte in 1 second
Participating sites: FhG Sankt Augustin, FZ Jülich, caesar, Uni Bonn, FH BRS
Cluster requirements (job components coupled via MetaMPICH):
• Jülich Cray: PARTRACE, 30 nodes
• FH BRS: TRACE, 6x2x2 CPUs
• caesar: TRACE, 30x2 CPUs
Data exchange: water flow 1x/step; intermediate results up to 30-100x/step
Network reservation, derived from the network service description:
• Multiple point-to-point tunnels
• Layer 2/3 switching/routing in the MPLS network
MetaScheduling Service - Architecture
Components: a UNICORE Client submits jobs through the UNICORE Gateways; at each site an NJS (the primary NJS at Site A) drives a Target System Interface and a local scheduler with its job queue. The MetaScheduler talks to each local scheduler, and to the ARGON network RMS (for link usage), through adapters.
Message flow:
1) User specifies the job
2) MetaScheduling request (WS-Agreement)
3) Negotiation and reservation
4) MetaScheduler reply (WS-Agreement)
5) Job transfer to the UNICORE system
6) All job components, including network QoS, are provisioned automatically
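The numbered flow can be sketched in a few lines of Python. All class and method names below are illustrative stand-ins, not the actual UNICORE or ARGON interfaces:

```python
# Illustrative sketch of the MetaScheduling message flow (steps 1-6 above).
# Names are hypothetical stand-ins, not the real UNICORE/ARGON APIs.

class Adapter:
    """Hides a local scheduler or the ARGON network RMS behind one interface."""
    def __init__(self, name, next_free=0):
        self.name = name
        self.next_free = next_free

    def earliest_slot(self, t):
        # A real adapter would query the local job queue for free slots.
        return max(t, self.next_free)

    def reserve(self, start):
        return {"resource": self.name, "start": start}

class MetaScheduler:
    def submit(self, request_time, adapters):
        # 2) MetaScheduling request, 3) negotiation and reservation:
        # all components must start together, so take the latest offer.
        start = max(a.earliest_slot(request_time) for a in adapters)
        reservations = [a.reserve(start) for a in adapters]
        # 4) reply; in VIOLA this is expressed as a WS-Agreement document.
        return {"start": start, "reservations": reservations}

reply = MetaScheduler().submit(
    0, [Adapter("siteA", 10), Adapter("siteB", 5), Adapter("argon")])
print(reply["start"])  # 10
```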
MetaScheduling Service - Algorithm
First-fit search for a common start time of all job components on all resources:
• Input: the requested resources and their time constraints
• Determine the next availability of each resource i (queried in parallel)
• While i ≤ n: nextStartup = max(nextStartup, freeSlots[i]); i++
• Terminate when a common free slot is found within the time window in which the requested service can start
MetaTrace example: a common start time is found for the Jülich Cray (PARTRACE, 30 nodes), FH BRS (TRACE, 6x2x2 CPUs), caesar (TRACE, 30x2 CPUs), and the network service.
ARGON - Network Service
Interface operations: Availability, Reservation, Bind, Query, Cancel, Modify
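A minimal sketch of these six operations as a Python class; the method signatures and the reservation record are assumptions for illustration, not the real ARGON API:

```python
# Sketch of the ARGON interface operations listed above (Availability,
# Reservation, Bind, Query, Cancel, Modify); signatures are assumptions.
import itertools

class ArgonService:
    def __init__(self):
        self._ids = itertools.count(1)
        self._reservations = {}

    def availability(self, start, end, bandwidth_mbps):
        # A real check would consult topology and existing bookings.
        return True

    def reservation(self, start, end, bandwidth_mbps):
        rid = next(self._ids)
        self._reservations[rid] = {
            "start": start, "end": end,
            "bandwidth_mbps": bandwidth_mbps,
            "bound": False, "cancelled": False,
        }
        return rid

    def bind(self, rid):
        self._reservations[rid]["bound"] = True   # fix concrete parameters

    def query(self, rid):
        return self._reservations[rid]

    def modify(self, rid, bandwidth_mbps):
        self._reservations[rid]["bandwidth_mbps"] = bandwidth_mbps

    def cancel(self, rid):
        self._reservations[rid]["cancelled"] = True

argon = ArgonService()
rid = argon.reservation(start=100, end=200, bandwidth_mbps=1000)
argon.bind(rid)
argon.modify(rid, bandwidth_mbps=500)
print(argon.query(rid)["bandwidth_mbps"])  # 500
```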
ARGON – Reservation Lifetime
Advance reservation timeline: t_req (request) → t_conf (confirmation) → t_bind → t_act (activation) → t_begin → t_end
The feasible solution space over time and resources is shaped by constraints: traffic engineering, service availability, policy rules, SLAs, user requirements, …
• Negotiation Phase: availability check(s), admission decision, reservation
• Intermediate Phase: re-optimization is inexpensive; binding of service parameters
• Activation Phase: automatic initiation; configuration of network devices; duration depends on service & devices
• Usage/Renegotiation Phase: re-optimization is expensive; modification of parameters
Query and Cancel can be used anytime after negotiation
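A toy classifier makes the timeline concrete. The phase ordering is taken from the slide; treating t_bind as falling inside the intermediate phase is an assumption:

```python
# Toy mapping from time to the reservation lifetime phases described above.
def phase(t, t_conf, t_act, t_begin, t_end):
    if t < t_conf:
        return "negotiation"          # availability checks, admission, reservation
    if t < t_act:
        return "intermediate"         # cheap re-optimization, parameter binding
    if t < t_begin:
        return "activation"           # automatic configuration of network devices
    if t < t_end:
        return "usage/renegotiation"  # expensive re-optimization, modifications
    return "expired"

print(phase(25, t_conf=10, t_act=20, t_begin=30, t_end=40))  # activation
```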
ARGON – Resource Optimisation
Rerouting of intermediate phase flows is inexpensive.
Online and offline algorithms can be used.
Scheduling options (capacity over time):
• First Fit / Deadline: shift the start time within the allowed window
• Flexible Reservations
• Malleable Reservations: trade capacity against duration (increase capacity → reduce duration; reduce capacity → increase duration)
Routing options:
1) Flexible path selection
2) Rerouting/planning of accepted flows
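The malleable case can be captured in a few lines: the reservation's total data volume (capacity × duration) stays constant while its shape changes. The function name is illustrative:

```python
# Malleable reservation sketch: trade capacity against duration while the
# total volume (capacity * duration) stays constant. Name is illustrative.
def reshape_duration(duration_s, capacity_mbps, new_capacity_mbps):
    volume = duration_s * capacity_mbps      # fixed total data volume
    return volume / new_capacity_mbps        # new duration at the new capacity

print(reshape_duration(100, 1000, 500))   # 200.0 (half capacity, double duration)
```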
ARGON - Network Architecture
Overlay Model:
• The optical domain is not visible to the IP domain
• The MPLS domain cannot perform efficient TE
• UNI signaling
Multi-Region Network: ARGON controls both the MPLS and the ASON/GMPLS region.
(Architecture diagram: the ARGON controller, with database, SNMP server/client, proxy UNI client, RSVP listener and auto-discovery, performs administration and service provisioning towards Alcatel MPLS switches via CLI/SNMP and towards GMPLS switches via UNI.)
Co-Allocation in the VIOLA Testbed
MetaScheduler Concept:
• Resource Orchestration: negotiation of a common time frame for all resources; reservation of nodes at different clusters; reservation of network services via ARGON
• Interoperability: standardization activities (GRAAP, OGSA-RSS); technologies (UNICORE, Web Services, WS-Agreement)
ARGON Concept (Allocation and Reservation in Grid-enabled Optical Networks):
• Network Architecture: multi-domain, multi-layer, multi-region (IETF nomenclature)
• Network Resource Management: optimization; resource modeling
• Network Service Provisioning: reservation, signaling and provisioning; end-to-end path computation; service modeling
• Policy-based Framework: AAA support; Service Level Agreements (SLA); policy-aware provisioning
ARGON - Where are we now?
Timeline (1st year → 2nd year → 3rd year; today): development → prototype → deployment
• Application "gridifying"
• Middleware resource broker
• ARGON reservation service
• ARGON service provisioning
• VIOLA network deployment and tests
• Network scope: multi-layer, single-region, single-domain today; multi-region and multi-domain ahead
Overall VIOLA requirements:
• Infrastructure for applications
• Grid middleware integration
• Network as a Grid resource
Overall ARGON objectives:
• Bandwidth on Demand in VIOLA
• Advance network reservation
• Interface inspired by EGEE specs
• Multi-vendor, multi-layer, multi-region
Future Work
MetaScheduler:
• More automated resource pre-selection
• MetaScheduling Service for workflows
• Porting to GT4
• LUCIFER integration
ARGON:
• Multi-Region: the MPLS and ASON/GMPLS layers must be coordinated
• Resource and service model enhancements
• GÉANT2 cooperation (JRA3 inter-domain manager)
• LUCIFER integration (ARGON, UCLPv2, D-RAC)
Conclusion
Demanding applications benefit from the resources of multiple clusters and sites.
Application-driven resource selection for UNICORE Grid applications.
Co-scheduling of computational, storage, and network resources.
The MetaScheduling Service orchestrates resources across multiple domains.
ARGON provides network services with advance-reservation capabilities and dedicated QoS.
The end
Thank You!
Contact: www.viola-testbed.de
{barz, pilz}@cs.uni-bonn.de
{th.eickermann, l.kirtchakova}@fz-juelich.de
{Wolfgang.Ziegler, Oliver.Waeldrich}@scai.fraunhofer.de
MetaScheduling Service – Algorithm (2)
set n = number of requested resources
set res[1..n] = requested resources
set prop[1..n] = requested property per resource
set freeSlots[1..n] = null
set endOfPreviewWindow = false
set nextStartup = currentTime + someMinutes
set needNext = true
while (endOfPreviewWindow = false & needNext = true) do {
    for i = 1..n do in parallel {
        freeSlots[i] = AvailableAt(res[i], prop[i], nextStartup)
    }
    set needNext = false
    for i = 1..n do {
        if (nextStartup != freeSlots[i]) then {
            if (freeSlots[i] != null) then {
                if (nextStartup < freeSlots[i]) then {
                    set nextStartup = freeSlots[i]
                    set needNext = true
                }
            } else {
                set endOfPreviewWindow = true
            }
        }
    }
}
if (needNext = false & endOfPreviewWindow = false) then
    return freeSlots[1]
else
    return "no common slot found"
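The first-fit search translates almost line-for-line into runnable Python. Here AvailableAt is mocked by scanning a per-resource list of free start times, which is an assumption for illustration:

```python
# Runnable sketch of the first-fit search; AvailableAt is mocked by scanning
# a per-resource list of free start times (an illustrative assumption).
def available_at(free_slots, t):
    """Earliest free slot >= t for one resource, None past the preview window."""
    return next((s for s in free_slots if s >= t), None)

def first_fit(resources, now=0):
    next_startup = now
    while True:
        offers = [available_at(r, next_startup) for r in resources]
        if any(o is None for o in offers):
            return None                # end of preview window: no common slot
        if all(o == next_startup for o in offers):
            return next_startup        # common free slot found
        next_startup = max(offers)     # retry at the latest offered slot

# Free start times per resource (e.g. three clusters and a network service):
print(first_fit([[10, 20, 30], [20, 30], [0, 20, 40]]))  # 20
```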