Project Partners: Aicas, Bosch, CNRS, Rheon Media, The Open Group, University of Stuttgart, University of York
Every effort has been made to ensure that all statements and information contained herein are accurate, however
the Project Partners accept no liability for any error or omission in the same.
© 2014 Copyright in this document remains vested in the DreamCloud Project Partners.
Project Number 611411
D2.2 – Soft Real-Time Dynamic Resource Allocation
Version 1.0
8 December 2014 Final
Public Distribution
University of York
PROJECT PARTNER CONTACT INFORMATION
Aicas
Fridtjof Siebert
Haid-und-Neue Strasse 18
76131 Karlsruhe
Germany
Tel: +49 721 663 96823
E-mail: [email protected]
Bosch
Jochen Härdtlein
Robert-Bosch-Strasse 2
71701 Schwieberdingen
Germany
Tel: +49 711 811 24517
E-mail: [email protected]
CNRS
Gilles Sassatelli
Rue Ada 161
34392 Montpellier
France
Tel: +33 4 674 18690
E-mail: [email protected]
Rheon Media
Raj Patel
20 Leighton Avenue
Pinner Middlesex HA5 3BW
United Kingdom
Tel: +44 7547 162920
E-mail: [email protected]
The Open Group
Scott Hansen
Avenue du Parc de Woluwe 56
1160 Brussels, Belgium
Tel: +32 2 675 1136
E-mail: [email protected]
University of Stuttgart
Bastian Koller
Nobelstrasse 19
70569 Stuttgart
Germany
Tel: +49 711 68565891
E-mail: [email protected]
University of York
Leandro Soares Indrusiak
Deramore Lane
York YO10 5GH
United Kingdom
Tel: +44 1904 325571
E-mail: [email protected]
DOCUMENT CONTROL
Version Status Date
0.1 First overview of soft real-time dynamic resource allocation. 30 September 2014
0.2 Description of proposed resource allocation heuristics. 24 October 2014
0.3 Results and description of market-inspired heuristics 20 November 2014
0.4 Results and description of bio-inspired heuristic 21 November 2014
0.5 Modifications by reviewing of editors 1 December 2014
0.6 Modifications after internal partner review 6 December 2014
1.0 Final review and QA 8 December 2014
TABLE OF CONTENTS
1. Introduction
   1.1 Structure of this document
2. Dynamic Resource Allocation Techniques for Systems with Soft Real-Time Constraints
   2.1 Market-inspired Dynamic Resource Allocation
   2.2 Bio-inspired Dynamic Resource Allocation
3. Techniques for Market-inspired Dynamic Resource Allocation
   3.1 Review of Existing Techniques
   3.2 Proposed Market-inspired Techniques
       3.2.1 Hybrid techniques for bidding
       3.2.2 On-the-fly techniques for bidding
4. Techniques for Bio-inspired Resource Allocation
   4.1 Review of Existing Techniques
       4.1.1 Pheromone signalling (PS)
       4.1.2 SymbioticSphere
       4.1.3 Biological Task Mapping and Scheduling
   4.2 Proposed Bio-inspired Technique
       4.2.1 Preliminary experimental work
5. Compliance with the Dynamic Resource Allocation Requirements
References
EXECUTIVE SUMMARY
This deliverable describes soft real-time dynamic resource allocation techniques. First, existing techniques are analysed to identify their bottlenecks for DreamCloud applications such as video processing and high performance computing. To overcome these bottlenecks, several techniques are proposed that perform well in different dynamic scenarios. The proposed techniques fall into two categories: market-inspired and bio-inspired.

The market-inspired heuristics introduce the notion of a value for each task or job and employ market concepts in the resource allocation process. These concepts help to choose the most valuable jobs, while discarding the less valuable ones in system overload situations, in order to maximize the overall system utility, also referred to as the value (profit) achieved by the system. The proposed heuristics take both computation and communication overheads into account and provide better allocation decisions than the commonly used existing market-inspired heuristic.

The bio-inspired heuristics exploit concepts from biological systems. Specifically, a pheromone-signalling-based load balancing algorithm has been proposed that copes easily with changing workload dynamics and incurs low computation and communication overheads.

Finally, conformity with the dynamic resource allocation requirements, enumerated and described in Deliverable D1.2, has been analysed.
1. INTRODUCTION
Building on the earlier progress reported in deliverable D1.1, this deliverable addresses dynamic resource allocation for soft real-time systems from market-inspired and bio-inspired points of view.

Market-inspired approaches assign a notion of value to each application (or job) that needs to be allocated onto the multi-core platform. The value associated with each application helps to decide which applications to choose so that utility is maximized, which is particularly desirable in platform overload situations. Further, associating different values with different completion times of the same application raises the question of what exact value to bid for; market concepts are well suited to such propositions.

Bio-inspired approaches can be developed from several biological ideas. The concept of pheromone signalling has been identified as a promising candidate for resource allocation purposes.
1.1 STRUCTURE OF THIS DOCUMENT
This deliverable is structured as follows:
Section 2 briefly reviews the state of the art in dynamic resource allocation for systems with soft real-time constraints, motivating the market-inspired and bio-inspired mechanisms proposed for DreamCloud applications.
Section 3 reviews existing market-inspired resource allocation techniques to identify their limitations and proposes new techniques to overcome these shortcomings. A variety of techniques are proposed to address different kinds of end user needs.
Section 4 reviews existing bio-inspired techniques and proposes the pheromone-signalling-based load balancing technique.
Section 5 describes compliance of the techniques described in the previous sections with the requirements from deliverable D1.2.
Sources for additional information are footnoted throughout the document wherever required.
2. DYNAMIC RESOURCE ALLOCATION TECHNIQUES FOR SYSTEMS WITH
SOFT REAL-TIME CONSTRAINTS
It is well established that resource allocation is one of the most complex problems in large many-core and distributed systems, and in general it is considered NP-hard [1]. A well-tuned search algorithm may therefore need to evaluate hundreds of thousands of distinct allocations before it finds one solution that meets the system's performance requirements. Since such an evaluation can take a long time, possibly hours to days, it cannot be used to find a solution quickly, as is required for dynamic resource allocation. Further, search algorithms normally assume a static workload and thus cannot handle dynamic workload scenarios. These requirements can only be fulfilled by dynamic resource allocation techniques that find a performance-satisfying solution quickly under dynamic workloads.
The deliverables from the previous DreamCloud work packages (e.g., [2]) concluded that resource allocation has a significant impact on performance and energy dissipation in both grid-based and many-core based systems. A variety of existing resource allocation techniques that target performance guarantees and energy efficiency were introduced [3] [4]. It has been observed that state-of-the-art techniques employ search-based static allocation approaches in order to achieve performance guarantees [5] [6] [7]. These techniques do not take into account the overheads of dynamically allocating and migrating tasks (i.e. context saving and transferring). Further, most of them assume that tasks are independent and thus do not explicitly consider communication overheads [8].

There have been efforts to consider multiple dependent tasks in the resource allocation process [7] [9] [10] [11] [12] [13]. They employ various heuristics such as:
1) load balancing: balances the load across cores;
2) critical path method: the tasks on the critical path, i.e. the longest path in the execution flow, which governs overall performance, are allocated first;
3) dependency-based priority assignment: a task connected with many other tasks is assigned high priority, as it determines the execution of its child (dependent) tasks by processing the input received from its parent tasks;
4) hybrid of critical path method and weighted total execution time: the priority assignment of tasks for their allocation sequence considers both the critical path and a task weight determined by the total execution time of its successor tasks;
5) upward and downward ranks: each task is assigned an upward and a downward rank, weighted by the longest path from the task to its latest sink or earliest source, respectively;
6) Longest Remaining Time First (LRTF): sorts the tasks in decreasing order of upward rank to perform the allocation;
7) Shortest Remaining Time First (SRTF): orders the tasks in increasing order of upward rank.
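To make the rank-based orderings concrete, the upward-rank computation and the LRTF/SRTF orderings can be sketched as follows. This is an illustrative sketch only: the example DAG, execution times and function names are assumptions, and communication costs are omitted.

```python
# Illustrative sketch: upward ranks on a task DAG, then LRTF/SRTF ordering.
# Assumes known per-task execution times; communication costs are omitted.

def upward_rank(task, exec_time, successors, memo=None):
    """Length of the longest path from `task` to a sink, including its own time."""
    if memo is None:
        memo = {}
    if task not in memo:
        memo[task] = exec_time[task] + max(
            (upward_rank(s, exec_time, successors, memo) for s in successors[task]),
            default=0,
        )
    return memo[task]

def order_tasks(exec_time, successors, longest_first=True):
    """LRTF sorts by decreasing upward rank; SRTF by increasing upward rank."""
    ranks = {t: upward_rank(t, exec_time, successors) for t in exec_time}
    return sorted(ranks, key=ranks.get, reverse=longest_first)

# Tiny example DAG: A -> B -> D and A -> C -> D
exec_time = {"A": 2, "B": 3, "C": 1, "D": 2}
successors = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(order_tasks(exec_time, successors))          # LRTF order: A, B, C, D
print(order_tasks(exec_time, successors, False))   # SRTF order: D, C, B, A
```

Here the upward rank of A is 7 (2 + the longest downstream path B, D), so LRTF allocates A first, while SRTF starts from the sink D.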
The aforementioned heuristics fall into the category of best-effort heuristics and try to optimize overall compute performance in terms of response time (total execution time). In order to optimize for both performance and energy consumption, several heuristics have been reported in the literature [4] [14]. These heuristics employ a number of fundamental optimization procedures, such as iterative hierarchical allocation to reduce energy consumption while satisfying the required Quality of Service (QoS) [14], incremental dynamic allocation for finding a contiguous area to map an application [15], hybrid mapping that performs intensive computations at design time and uses the design-time analysis results at run time for efficient online allocation [16] [17], and neighbourhood-metric-inspired allocation that places communicating tasks on neighbouring cores [18] [19].
Drawbacks of Existing Heuristics
Although several efforts have been made to devise heuristics for dynamic resource allocation, they suffer from the following main bottlenecks:
They can lead to starvation, missed deadlines and reduced throughput in overload situations, where the demand for resources is higher than the supply. In such situations, it becomes difficult to decide which applications to discard and which to admit into the system when resources become available on completion of some tasks.
Most of the existing heuristics do not take into account any notion of the value of tasks to users. Further, industrial workloads do not currently have values assigned to tasks/jobs. If such notions can be associated, the system can better decide which tasks to consider in the allocation process and which to discard in order to maximize the overall system utility in overload situations.
They cannot efficiently deal with industrial applications containing dependent tasks. To do so, industrial applications need to be modelled as a set of tasks connected by a set of pre-defined edges, e.g. as a Directed Acyclic Graph (DAG) or a Synchronous Dataflow Graph (SDFG).
Research in energy-efficient allocation for high performance computing (HPC) and cloud systems is still incipient, with existing works addressing only the time and space fragmentation of resource utilisation at a very large granularity (server level), aiming to minimise energy by rearranging the load and freeing servers that are then turned off, or slowed down by employing DVFS [20].
The aforementioned drawbacks can be handled by the soft real-time dynamic resource allocation heuristics proposed within the DreamCloud project. In order to be applicable to a variety of platforms, from embedded to HPC systems, a number of heuristics need to be proposed, each appropriate to different systems and applications. Further, their applicability needs to be well defined so that the most suitable heuristic can be chosen for a given scenario. The dynamic resource allocation heuristics to be explored in the DreamCloud project are based on market and biology concepts, and are introduced subsequently.
2.1 MARKET-INSPIRED DYNAMIC RESOURCE ALLOCATION
Market-based resource allocation mechanisms are proven to provide promising alloca-
tion decisions for various computing platforms such as clusters, distributed databases,
grids, parallel and distributed systems, World Wide Web, etc. [21]. Market-based
mechanisms use available platform capacity measured by low-level heuristics as bids
within an auction-like allocation process in order to find the allocation that can provide
guarantees to satisfy the required level of QoS and can maximize the overall system
utility (profit). Figure 1 demonstrates the process of market-based dynamic resource allocation, where different tasks need to be allocated onto the multi-core platform. To incorporate market concepts in the allocation process, tasks are assigned values, and bids from resources (agents) are placed with the allocation engine (Manager Processor) in order to maximize the value returned by the multi-core system.
Figure 1: Market-based Dynamic Resource Allocation: DreamCloud Perspective
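The auction-like flow of Figure 1 can be sketched in a few lines. This is a simplified, centralised sketch under assumed names and a trivial bid model (spare capacity per core); it is not the project's implementation.

```python
# Simplified sketch of the Figure 1 flow: each core submits a bid reflecting
# its spare capacity, and the manager assigns the most valuable pending task
# to the highest-bidding core. Names and the bid model are illustrative.

def run_auction(tasks, bids):
    """tasks: {task: value}; bids: {core: spare_capacity}. Returns {task: core}."""
    allocation = {}
    pending = sorted(tasks, key=tasks.get, reverse=True)  # most valuable first
    free_cores = dict(bids)
    for task in pending:
        if not free_cores:
            break  # overload: remaining (less valuable) tasks are postponed
        winner = max(free_cores, key=free_cores.get)      # highest bid wins
        allocation[task] = winner
        del free_cores[winner]
    return allocation

alloc = run_auction({"t1": 500, "t2": 300, "t3": 100}, {"c0": 8, "c1": 5})
print(alloc)  # t1 -> c0, t2 -> c1; t3 is postponed for lack of cores
```

In an overload situation (three tasks, two cores) the least valuable task is simply left pending, which is exactly the behaviour the market mechanism is meant to provide.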
The value of tasks or applications (represented as task graphs) can change over time to reflect the impact of the computation on business processes. For example, finishing a computation earlier can enable fast business decisions, which in turn may result in increased earnings for a specific product, whereas a late finish may result in low earnings.
The value forecast associated with the completion of a given computation can be plotted over time, resulting in what is called a value curve. For each task, the value curve gives the value of the task to the user as a function of the task's completion time. Thus, the value curve assigns the appropriate benefit (profit) to task completion at a particular time. This is similar to the value curves used in several works reported in the literature, e.g. [30] [31] [32]. It has been shown that using value curves instead of fixed values for tasks gives greater market efficiency in the long run [29].
A value curve normally has value (profit) on one axis and completion time on the other. For example, Figure 2 shows the value curve for an HPC application. The figure shows different profit values at different completion times (ranging from t_i to t_f). The value of the application (job) decreases with increasing completion time and becomes zero at t_f. The DreamCloud applications are assumed to be associated with similar value curves. Such a value curve is normally obtained from the business unit by following an economic model.
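A decaying value curve of this shape can be modelled as a simple function of completion time. The piecewise-linear form and the breakpoints below are illustrative assumptions, loosely following the £500-to-zero range shown in Figure 2.

```python
# Illustrative piecewise-linear value curve: full value up to t_i, linear
# decay to zero at t_f, and zero afterwards (mirroring Figure 2's shape).

def value_curve(completion_time, t_i, t_f, max_value):
    """Value earned if the job completes at `completion_time`."""
    if completion_time <= t_i:
        return max_value
    if completion_time >= t_f:
        return 0.0
    # linear decay between t_i and t_f
    return max_value * (t_f - completion_time) / (t_f - t_i)

print(value_curve(5, t_i=10, t_f=50, max_value=500))   # 500 (early finish)
print(value_curve(30, t_i=10, t_f=50, max_value=500))  # 250.0 (halfway)
print(value_curve(60, t_i=10, t_f=50, max_value=500))  # 0.0 (too late)
```

The exact shape (linear, exponential, stepped) is an economic modelling choice; only the monotone decrease towards zero at t_f is assumed by the discussion above.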
Value curves facilitate bidding based on the available processing capacity on different processors, with each processor aiming to maximize its profit. For example, a processor with high available processing capacity can bid for the maximum value (£500 in Figure 2) to maximize its profit. This also helps to finish the job as soon as possible, enabling faster release of the occupied processor for future incoming jobs and thus better resource availability. However, the bidding process needs to ensure that the bid (desired) price can be achieved before starting to provide the service, i.e. bid only when the desired profit can be achieved. In case the application requirements are not satisfied, a penalty on the overall profit can be imposed. Such a bidding process is helpful for devising light-weight market-based bidding heuristics, which suits the platforms considered in the DreamCloud project, e.g. embedded and HPC systems, which need to process dynamic workloads with light-weight bidding.
Figure 2: Value curve for HPC application processing
Bidding-based resource allocation has clear potential, but it suffers from the following issues:
A weak understanding of how much value should be assigned to each task in industrial workloads.
Greater complexity when dealing with dependent tasks, as it requires an approach for deriving value curves for individual tasks from the value curve of the whole application.
Despite these issues, it has several advantages over other kinds of resource allocation approaches, for example:
Good handling of overload situations: In overload situations, where the demand for resources exceeds their supply, bidding-based allocation helps to find an allocation that maximizes the overall system profit. To this end, bidding-based allocation can employ several basic principles, such as:
o Least valuable tasks can be postponed or discarded.
o “Survival of the fittest” can be employed to let some jobs starve in order
that the majority are able to run to completion.
Can operate both in a centralised and a decentralised (distributed) way: The centralised approach achieves good performance thanks to the central manager's better view of the system resources, but the manager is always a bottleneck due to the heavy traffic around it and the extensive computation within it. The distributed approach overcomes this bottleneck, but incurs higher network traffic overhead when communicating all the bids and offers.
The overhead (in terms of time and energy) of dynamic allocation can be quite high for large problems, i.e. a high number of tasks in the job and of resources (e.g., processors) in the platform. The overhead increases polynomially with the number of tasks and grows further with the number of processors due to the increased allocation options for each task. This necessitates light-weight bidding heuristics to allocate the tasks efficiently.
2.2 BIO-INSPIRED DYNAMIC RESOURCE ALLOCATION
Bio-inspired heuristics imitate a particular biological system. Among bio-inspired algorithms, a class drawing inspiration from swarm intelligence can be singled out. The multiple agents in these algorithms follow a number of relatively simple rules, which results in their collective behaviour. These heuristics are worth analysing in the project since they are (usually) distributed and self-organising, while the algorithms implemented in each agent have low computational complexity and can be easily parallelised. To date, the project partners have analysed one particular algorithm of this kind, namely the Pheromone-Signalling-Based Load Balancing Algorithm. Although it is difficult to predict the final system's parameters, and in particular to guarantee that any constraints are met, this algorithm behaves promisingly in numerous situations according to the experiments already conducted (some results are presented in [43]). While incurring low computation and communication overheads, it copes easily with changing workload dynamics. Since each node uses only locally available information, the algorithm scales well and avoids generating hot-spots.
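As a rough illustration of the local-information principle behind pheromone-style load balancing, a node can compare its own load with the levels advertised by its neighbours and offload excess work to the least-loaded one. This is a deliberately simplified sketch, not the algorithm evaluated in [43]; the topology, one-unit migration rule and threshold are assumptions.

```python
# Simplified pheromone-style load balancing: each node sees only its own load
# and its neighbours' advertised level (a stand-in for pheromone strength),
# and migrates one unit of work per round towards the least-loaded neighbour.

def balance_step(load, neighbours):
    """One round of purely local balancing; returns the updated load map."""
    new_load = dict(load)
    for node, adjacent in neighbours.items():
        if not adjacent:
            continue
        target = min(adjacent, key=load.get)  # weakest signal = least loaded
        if load[node] - load[target] > 1:     # only offload a clear excess
            new_load[node] -= 1
            new_load[target] += 1
    return new_load

load = {"n0": 6, "n1": 1, "n2": 2}
neighbours = {"n0": ["n1", "n2"], "n1": ["n0", "n2"], "n2": ["n0", "n1"]}
load = balance_step(load, neighbours)
print(load)  # n0 sheds one unit to its least-loaded neighbour n1
```

No node needs a global view; repeated rounds of such purely local moves are what give these schemes their scalability and hot-spot avoidance.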
3. TECHNIQUES FOR MARKET-INSPIRED DYNAMIC RESOURCE ALLOCATION
3.1 REVIEW OF EXISTING TECHNIQUES
There are many cloud-based and grid-based HPC systems that use allocation and scheduling heuristics taking into account not only the timing constraints of the tasks but also their value (economic or otherwise). This problem has been well studied under the model of Deadline and Budget Constraints (DBC) [23], where each task or task-flow has a fixed deadline and a fixed budget. State-of-the-art allocation and scheduling techniques target objectives such as maximising the number of tasks completed within deadline and/or budget [24], maximising utility (profit) for the platform provider [25] or minimising cost to users [26] while still ensuring deadlines. Several approaches to the DBC problem use market-inspired techniques to balance the rewards between platform providers and users [27]. A comprehensive survey reviews several market-based allocation techniques supporting homogeneous or heterogeneous platforms, some of them supporting applications with dependent tasks modelled as DAGs [21].
Some allocation techniques using a bidding-based feedback approach are proposed in [28]. During allocation, each core (or group of cores) computes its bid independently and sends it to the allocation (optimization) engine, as shown in Figure 3. The bids are treated as feedback from the system and the tasks/processes as auctioned items. At regular time intervals, the engine receives the list of tasks and bids and decides the allocation of the pending tasks/processes. The cluster/core with the highest bid receives the task with the highest execution time.
Figure 3: Bidding-based system feedback
The bid calculation in [28] is performed as follows:

b(Ci) = b0(Ci) − w(Ci) − d(Ci)

where b(Ci) represents the estimated workload that the cluster can handle, and b0(Ci) is the total CPU instruction queue space available in the core (or cluster). Since an ideal execution time of one instruction per cycle is assumed (e.g. a RISC processor), the number of instructions equals the number of cycles. w(Ci) is the remaining workload of Ci, i.e. the number of instructions in the queue awaiting execution, computed using dedicated hardware counters. d(Ci) is the network delay for the optimization engine to receive the bidding packet, which also represents the estimated time for a new process to reach Ci.
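Under this description, a cluster's bid reduces to its free queue space minus its pending workload and the delivery delay, all expressed in cycles. The sketch below is reconstructed from the textual definitions; the numeric values are illustrative.

```python
# Sketch of the bid computation described for [28]: a cluster's bid is its
# free instruction-queue space minus its pending workload and the network
# delay, all in cycles (one instruction per cycle is assumed).

def compute_bid(queue_space, pending_instructions, network_delay):
    """b(Ci) = b0(Ci) - w(Ci) - d(Ci); a negative bid means no spare capacity."""
    return queue_space - pending_instructions - network_delay

print(compute_bid(1000, 600, 50))   # 350: the cluster can absorb more work
print(compute_bid(1000, 980, 50))   # -30: overloaded, should not win tasks
```

A cluster whose bid goes negative has no spare capacity once in-flight work is accounted for, which is why the allocation rounds described next only consider clusters with positive bids.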
Using the bids computed as described above, the authors propose two bidding-based heuristics with a slight variation between them, termed Necessary Resorting (NRS) and Dynamic Resorting (DRS). The NRS approach binds the highest bidder to the largest process, the second-highest bidder to the second-largest process, and so on. If the number of processes exceeds the number of cores (clusters), the allocation process updates the bids based on the allocation done so far and then repeats the same procedure for the remaining processes, until all processes have been bound or no clusters with positive bids remain. DRS follows the same rationale as NRS, with the distinction that a cluster's bid is re-calculated every time a process is bound to it and the list of available clusters is re-sorted to reflect the allocation.
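A single NRS round amounts to sorting both lists and zipping them together. The sketch below shows only that binding rule; the bid-update rounds of NRS and the per-binding re-sort of DRS are omitted, and all names are illustrative.

```python
# Sketch of one NRS round: the highest bidder gets the largest process, the
# second-highest the second-largest, and so on. Clusters with non-positive
# bids are excluded; re-bidding rounds and the DRS re-sort are omitted.

def nrs_round(bids, process_sizes):
    """bids: {cluster: bid}; process_sizes: {process: size}. Returns bindings."""
    clusters = sorted((c for c in bids if bids[c] > 0),
                      key=bids.get, reverse=True)
    processes = sorted(process_sizes, key=process_sizes.get, reverse=True)
    return dict(zip(processes, clusters))  # truncates at the shorter list

bindings = nrs_round({"c0": 120, "c1": 300, "c2": -10},
                     {"p0": 40, "p1": 90})
print(bindings)  # largest process p1 -> highest bidder c1; p0 -> c0
```

The overloaded cluster c2 (negative bid) never receives work; in the full heuristic, leftover processes would trigger a bid update and another round.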
It has been shown that using value curves instead of the fixed task values employed in the above-mentioned approaches gives greater market efficiency in the long run [29]. Based on the value curve concept, Irwin et al. [30] considered a model that assumes task values decaying linearly with waiting time. As a variation, a linear decrease in value followed by an exponential phase has also been considered [31]. Further, rising-and-falling value curves have been proposed to capture situations in real-time systems where early completion of work can be considered as bad as late completion [32]. In industrial scenarios, however, earlier completion is always valued; value curves can therefore be assumed to be decreasing. By using value, scheduling naturally lends itself to market-based scheduling [29].
A simple and straightforward resource allocation process may bid based on the highest value [28]. The allocation process keeps track of the unallocated tasks by putting them in a queue and targets the tasks with the highest value first in order to maximize the system utility (profit). The actual value (price) is less important than the order imposed on the jobs in the queue. This means that the 'sale value' of a position in the queue is the bid value, following the English style of auction. These intermediate values do not need to be stored, because the actual value paid to the platform operator depends not on the completion times of individual components of the task, but on the completion time of the whole task according to its value curve. Bidding based on the highest value may achieve a large value, but it might require a large amount of computational resources; it may instead be possible to achieve high value by running several small tasks that require fewer execution resources, thereby maximizing the overall system utility.
To overcome the problem associated with bidding based on the highest value, bidding based on value density was introduced [33], where the value density is a task's value divided by the amount of computational resources it requires. In a dynamic system, for situations where the workload is schedulable by Earliest Deadline First (EDF), it has been shown that it is optimal to order the tasks by decreasing value density. However, in overload situations, since EDF degrades rapidly, the optimality proof does not necessarily hold. In [34], variants of the value-density-based allocation approach are presented. Bansal et al. [35] introduced a similar approach termed highest density first. These approaches normally have limitations such as: tasks must have exactly known execution times and a fixed value for completion, and all tasks must execute on a uniprocessor.
An allocation approach that bids based on the value density squared has been introduced in [36], where the tasks with the highest value density squared are run first. This gives a more extreme separation between valuable and less-valuable tasks, with the intent that jobs never start if they are never going to finish. This helps to reduce the execution penalty and favours the most valuable jobs finishing their execution. However, this work has the drawback of assuming pre-emptive tasks, which might not suit industrial workloads.
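The sharper separation produced by squaring can be seen in a few lines. Squaring a positive density preserves the ranking but widens the gap between tasks, which is what "more extreme separation" refers to; the task values and resource figures below are illustrative.

```python
# Sketch: value density (value / resources) versus value density squared.
# Squaring keeps the ordering of positive densities but widens the gap
# between valuable and less-valuable tasks, as discussed for [36].

def densities(tasks, squared=False):
    """tasks: {name: (value, resources)}. Returns {name: density score}."""
    out = {}
    for name, (value, resources) in tasks.items():
        d = value / resources
        out[name] = d * d if squared else d
    return out

tasks = {"valuable": (800, 4), "cheap": (100, 2)}
print(densities(tasks))                # 200.0 vs 50.0 -> a 4x gap
print(densities(tasks, squared=True))  # 40000.0 vs 2500.0 -> a 16x gap
```

With plain density the valuable task scores 4x higher; after squaring, 16x, so any scheduler that weights tasks by this score allocates far more aggressively towards the valuable one.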
Bidding based on value density has been refined slightly to consider a task's value divided by its upward rank instead of its value divided by the amount of required computational resources. This refined approach is referred to as value critical path density; here, the tasks with the highest value critical path density are run first. This may be a more useful measure in large clusters, where responsiveness is important.
Another heuristic, termed minimum value remaining, has been proposed to ensure that the job that will lose its value soonest, i.e. the one with the minimum remaining value, is allocated first [37]. The remaining value is calculated as the area under the value curve from the current time to the time when the value reaches zero. However, neither this nor the aforementioned approaches can guarantee a higher overall profit than bidding based on the highest value, as high-value jobs might be postponed due to their low value density or high minimum remaining value. It is also important to note that the performance of these heuristics can depend a great deal on the exact parameters of the workload and platform. Further, these approaches do not use design-time profiling results and lack the concept of suspending low-value executing jobs in order to allocate the freed resources to newly arrived high-value jobs.
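The remaining value, i.e. the area under the value curve from the current time until the value reaches zero, can be approximated numerically. The sketch below uses a trapezoidal sum over an assumed linear curve; the curve shape and step count are illustrative.

```python
# Sketch of the "minimum value remaining" measure: integrate the value curve
# from the current time until the value drops to zero. The linear example
# curve and the step count are illustrative assumptions.

def remaining_value(curve, now, t_zero, steps=1000):
    """Trapezoidal approximation of the area under `curve` on [now, t_zero]."""
    if now >= t_zero:
        return 0.0
    h = (t_zero - now) / steps
    area = 0.5 * (curve(now) + curve(t_zero))      # endpoints weighted 1/2
    area += sum(curve(now + i * h) for i in range(1, steps))
    return area * h

# Example: value falls linearly from 500 at t=0 to 0 at t=50.
linear = lambda t: max(0.0, 500.0 * (1.0 - t / 50.0))
print(remaining_value(linear, 0, 50))    # 12500.0: full triangle area
print(remaining_value(linear, 40, 50))   # 500.0: little value left to earn
```

A job whose remaining area is small (here, at t = 40) is about to lose its value entirely, so the minimum-value-remaining heuristic would allocate it first.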
The aforementioned heuristics align with the context of the DreamCloud project and have therefore been reviewed. There have been several other market-inspired heuristics in various contexts, such as grid and distributed systems. However, they cannot simply be applied to fulfil the aims of the DreamCloud project due to their drawbacks. The major drawbacks of existing market-inspired heuristics are discussed subsequently.
Drawbacks of Existing Market-inspired Heuristics
The existing market-inspired heuristics suffer from the following drawbacks:
• Most of them do not consider tasks/processes having dependencies and thus cannot
simply be extended to DreamCloud applications such as embedded and high
performance computing (HPC) applications, as these typically contain multiple
dependent tasks. There have been some efforts to consider dependencies in the
context of market-inspired resource allocation [32], but static value-scheduling
has been adopted. Therefore, they cannot be applied to dynamic workloads.
• The values of tasks may vary with time, but the execution times of tasks are not
known in advance, and thus expected execution times (e.g. WCET, ACET) are
considered.
• Most of the approaches consider a single value for a task.
• A centralized management is employed in the bidding process, which might not
be scalable.
The DreamCloud project proposes market-inspired heuristics that try to overcome most
of the drawbacks of existing market-inspired heuristics.
3.2 PROPOSED MARKET-INSPIRED TECHNIQUES
The DreamCloud project aims to propose a number of heuristics that can be used to
provide different levels of performance guarantees and can cope with different levels of
dynamism in the application workload when applied to different systems. Further, the
heuristics should be able to perform well in different execution scenarios and with
different kinds of requirements. To perform an early analysis, some heuristics have been
developed and analysed while trying to satisfy the performance requirement. However,
they are expected to be refined during further project development, especially during the
progress of WP3 (Time and Energy Predictability in High Performance and Embedded
Cloud Systems) in order to consider both the performance and energy consumption re-
quirements.
The proposed market-inspired heuristics try to achieve the aims defined in the
DreamCloud project by considering the following aspects:
• The heuristics can handle dynamic workloads.
• The application model considers dependent tasks and thus the heuristics need to
handle communication issues.
• The considered platforms are multi/many-core.
• The heuristics can be applied in both a centralized and a decentralized way.
• For a given QoS requirement, a value curve is considered instead of a single
value, in contrast to most of the existing approaches.
Towards devising heuristics, two execution scenarios have been considered. In the first,
it is assumed that historical data, in terms of the jobs executed over the last year or few
months, is known. The historical jobs can be analysed in advance so that an incoming
job at runtime can be efficiently allocated to the system resources if the job belongs to
the historical jobs. In case the incoming job does not belong to the historical jobs, i.e.
advance analysis results are not available for the job, the analysis step can be employed
followed by the run-time resource allocation step. The other execution scenario assumes
that no prior information about the jobs is available before their arrival. Therefore, all
the processing has to be performed at runtime (i.e., on-the-fly) after the job has arrived
and been received by the system. In both scenarios, the value of a job is only realised
once all of its components have been completed. A partially-completed job is considered to be
effectively worthless. The proposed heuristics for these two execution scenarios follow
various principles and are as follows:
• Hybrid techniques for bidding
• On-the-fly techniques for bidding
3.2.1 Hybrid techniques for bidding
The hybrid technique performs optimizations both at design-time and run-time, and has
been an emerging trend for efficient resource allocation [4]. However, its potential
has not been exploited to facilitate bidding-based resource allocation. The proposed
hybrid techniques employ design-time (off-line) profiling and run-time (on-line)
resource allocation for the jobs considering their arrival time and profiled results, as
shown in Figure 4. The platform resource manager is invoked to perform the resource
allocation process for the arrived jobs.
[Figure 4 diagram: Jobs feed a Design-time Profiling stage on the HPC Platform, which
produces Profiling Results & Value Curves; at run-time, Incoming Jobs and these
profiled results feed the Run-time Platform Resource Manager, which outputs the
Allocation Result for the HPC Platform.]
Figure 4: Hybrid resource allocation
Design-time Profiling
At design-time, for a given job, the performance (makespan) is estimated when different
numbers of cores (proportional to the amount of computing power) are utilized. The
makespan for a job can also be referred to as the response time or completion time of
the job after it has been allocated for execution. A job contains a set of workflows (ap-
plications) where each workflow contains a set of tasks having predefined connections
amongst the tasks, as shown in Figure 5. The same job model has also been described in
deliverable D3.1 [38]. Different workflows (applications) might have different task
graph structures.
Figure 5: DreamCloud application components
For each job that might need to be allocated to the system, we use Interval Algebra
(D5.1 - Analytical Platform Model [39]) to estimate the performance (makespan) that
can be achieved when using different numbers of cores (computing power). The
makespan values are computed by assuming worst-case execution times of the tasks in
the job, so that the most pessimistic run-time system behaviour can be taken into
account. The resource allocation decisions at different numbers of used cores follow a
genetic algorithm approach in order to obtain an efficient allocation. Further details can
be found in deliverable D5.1 - Analytical Platform Model [39]. These resource
allocation decisions are stored for use during the run-time resource allocation.
The makespan values (for different numbers of used cores) can represent the time axis,
and values (profits) can be assigned for achieving the corresponding makespan values
in order to obtain the value-time curve. For example, Figure 6 shows the values (in
blue on the left vertical axis) that can be achieved by completing a job at different
moments in time (makespan values). Such a value-time curve is obtained from the
business unit and is normally based on the computation and communication overhead of
the tasks within the job. We assume the value curve is given and has similar properties
to the value curves reported in the literature. The main properties are: 1) the value
decreases with time, 2) the value is highest at the minimum possible time, and 3) the
value becomes zero at a particular time. The considered value-time curve complies with
these required properties and thus can be considered for evaluation. The profiling
output (in red on the right vertical axis) is plotted along with the given value curve (in
blue), which provides enriched information for the job to perform efficient run-time
resource allocation. The right vertical axis (# Cores) represents the number of
homogeneous cores required to obtain the corresponding makespan and value.
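To illustrate these properties, the following Python sketch models a value-time curve as a step function over profiled makespan samples. The sample values are made up for illustration and are not taken from Figure 6.

```python
def make_value_curve(samples):
    """Return a lookup v(t): the value earned if a job completes at time t.

    `samples` is a list of (makespan, value) pairs. The returned curve
    is highest at the minimum makespan, decreases step-wise with time,
    and is zero after the largest sampled makespan.
    """
    samples = sorted(samples)
    def value_at(t):
        for makespan, value in samples:
            if t <= makespan:
                return value
        return 0  # past the zero-value deadline
    return value_at

# Illustrative samples (made-up values):
v = make_value_curve([(15, 500), (29, 300), (90, 50)])
```

Here v(10) returns the maximum value 500, v(20) the lower value 300, and any completion time past 90 earns nothing, satisfying the three properties above.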
Figure 6: Value-time curve for a Job
Similar profiling is performed for all the jobs in the workload. For each job, this step
associates information about the required computing power (# Cores) to achieve a
certain value by executing the job over a fixed amount of time. This information, along
with the allocation decisions at different numbers of used cores, is stored (Profiling
Results & Value Curves, as shown in Figure 4) to be used for performing efficient
run-time resource allocation.
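The stored profiling results can be pictured as a per-job table keyed by core count. The following Python sketch shows one possible layout; the structure, names and numbers are our illustrative assumptions, not the deliverable's actual storage format.

```python
# For each job, one entry per profiled core count: the estimated makespan
# and the design-time task-to-core mapping found by the genetic algorithm.
profiling_results = {
    "job_A": [
        (1, 90, {"t0": 0, "t1": 0, "t2": 0}),   # (cores, makespan, allocation)
        (2, 50, {"t0": 0, "t1": 1, "t2": 0}),
        (4, 30, {"t0": 0, "t1": 1, "t2": 2}),
    ],
}

def best_option(job, available_cores, value_at):
    """Pick the profiled core count (at most `available_cores`) whose
    makespan yields the highest value on the job's value-time curve."""
    best = None
    for cores, makespan, allocation in profiling_results[job]:
        if cores > available_cores:
            continue
        value = value_at(makespan)
        if best is None or value > best[0]:
            best = (value, cores, allocation)
    return best
```

At run-time the manager only needs such a lookup: given the cores on offer and the job's value curve, the most profitable profiled configuration can be retrieved without re-running the genetic algorithm.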
Run-time resource allocation for Jobs
At run-time, the jobs arrive at different moments in time and the platform resources
need to be allocated to them based on the available computing power (cores). The
platform contains a set of nodes (Node 1, …, Node N), where each node contains a set
of homogeneous cores (PEs). The bottom part of Figure 7 shows an example HPC
platform. The communication amongst the cores of a node is established by employing
dedicated connections. The platform nodes are assumed to be connected by a shared
bus. A platform (global) resource manager is used to manage the platform resources
and perform resource allocation for the arrived jobs. During system operation, the
manager keeps an up-to-date status of the platform resources, i.e., which resources are
busy and which are idle, such that accurate and efficient allocations can be made. In our
case, the platform status is maintained as the number of available (idle) cores at
different nodes, and resource allocation is also referred to as core allocation.
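The platform status described above reduces to a per-node idle-core count. The following Python sketch illustrates such bookkeeping; the class and method names are our own.

```python
class PlatformStatus:
    """Tracks idle cores per node; each node's bid is its idle-core count."""

    def __init__(self, nodes, cores_per_node):
        self.idle = [cores_per_node] * nodes

    def bids(self):
        return list(self.idle)  # one bid per node

    def allocate(self, node, cores):
        if cores > self.idle[node]:
            raise ValueError("node cannot cover this allocation")
        self.idle[node] -= cores

    def release(self, node, cores):
        self.idle[node] += cores
```

With the 3-node, 9-core platform used later in the evaluation, allocating 4 cores on the second node leaves bids [9, 5, 9], and releasing them restores [9, 9, 9].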
[Figure 6 plot data: x-axis Time (in time-units), from 15 to 90; left vertical axis Value
(in $ or £), from 0 to 600; right vertical axis # Cores, from 0 to 8.]
[Figure 7 diagram: jobs J0 to Jn arrive over time; jobs up to the Current Time have been
allocated by the Global Resource Manager to Nodes 1 to N, each node containing an OS
and a set of PEs connected by a local interconnect.]
Figure 7: Run-time job arrival and resource allocation
The run-time resource allocation process can be understood by looking at Figure 7. It
can be seen that the jobs that arrived up to the current time have already been allocated
to different computing nodes, and jobs arriving in the future (after the current time)
need to be allocated. In order to allocate platform resources to the incoming jobs at
run-time, the platform resource manager is invoked to find an allocation. The manager
takes the profiling results of the jobs from storage along with their value curves and
arrival times as input, and identifies a profit-maximizing allocation for each job based
on the number of available cores collected as bids from the different nodes in the
platform. This helps to achieve a high overall profit by servicing (completing) different
jobs. Bidding, in this context, offers several inherent benefits. Bid computation is
distributed inside the nodes, eliminating unnecessary traffic. For each job, it is assumed
that all of its tasks will be allocated to one node in the platform, i.e., the tasks of a job
cannot be allocated to more than one node, in order to avoid a huge communication
delay between two nodes. In case of a newly arrived job for which no profiling result is
available, the profiling step is employed, followed by the run-time resource allocation
step based on the available number of cores.
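The manager's per-job decision, taking the highest bidding node and capping the core count at the profit-maximizing amount, can be sketched as follows (function and parameter names are our own):

```python
def choose_allocation(bids, cores_for_max_profit):
    """Pick the node with the highest bid (ties broken by index) and use
    the profit-maximizing core count if the node can offer it, otherwise
    all the cores the node has free."""
    node = max(range(len(bids)), key=lambda n: bids[n])
    return node, min(bids[node], cores_for_max_profit)
```

For bids [3, 7, 5] and a job whose maximum profit needs 4 cores, the second node wins and the job uses 4 of its 7 idle cores; if no node can offer 4 cores, the job uses everything the winning node has.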
The proposed resource allocation heuristics followed by the manager are as follows:
• Simple Job Queuing (SJQ)
• Simple Job Queuing with Holding (SJQH)
• Maximum Value Queued Job with Holding (maxVH)
• Maximum Value Density Queued Job with Holding (maxVDH)
• Minimum Value Remaining Queued Job with Holding (minVRH)
3.2.1.1 Simple Job Queuing (SJQ)
The proposed SJQ heuristic is presented in Heuristic 1. The heuristic takes incoming
jobs with their arrival time, the profiling results (value-time curve of jobs with compu-
ting power and allocation decisions), and platform availability as input and tries to find
an efficient allocation for each job. At each time step, the heuristic checks two decision
points: 1) whether any already allocated job(s) finish execution (line 2), and 2) whether
any job(s) arrive at the platform (line 14). First, decision point 1) is checked, followed
by decision point 2).
If any job(s) arrive at the current time step and there are available resources in the
platform, then the job(s) are allocated to the platform; otherwise they are put into a
queue for later allocation when resources become available upon completion of already
allocated job(s) (line 21). The incoming (arrived) jobs are handled in the same order as
they are perceived by the platform. For each incoming job, the platform node offering
the highest bid, i.e. the maximum available resources, is initially selected (line 16). In
case more than one platform node has the same maximum available resources (bids),
any of them is chosen. Choosing such a node helps to achieve better load balancing
amongst nodes and profit maximization by allocating the job to a large number of cores.
Then, the exact number of cores to be used by the incoming job is found (line 17). If the
number of available cores is greater than the number of cores needed to achieve
maximum profit, the latter is chosen as the number of cores to be used by the incoming
job; otherwise the former is chosen. Thereafter, the incoming job is allocated on the
chosen number of cores based on the design-time computed allocation, and the platform
resources are updated (line 18). The same process is applied to all the incoming job(s)
in order to allocate them on the platform resources.
Heuristic 1: Simple Job Queuing (SJQ)
Input: Incoming Jobs with arrival times, Value-time curves of Jobs with computing
power and allocation decision information, Platform nodes with available cores
Output: Resource Allocation for Incoming Jobs
1: for each time_step
2: if allocated job(s) finish execution
3: Update platform resources;
4: if JobQueue contains jobs
5: for each queued job
6: if queued job has value at time_step && resources available
7: Choose the maximum capacity node;
8: Find number of cores to be used by queued job to
maximize profit;
9: Allocate resources to the queued job and update
platform resources;
10: end
11: end
12: end
13: end
14: if Job(s) arrive && resources are available in nodes
15: for each Job
16: Choose the maximum capacity node;
17: Find number of cores to be used by Job to maximize profit;
18: Allocate cores to Job and update platform resources;
19: end
20: else
21: Put the Job(s) in JobQueue for late allocation;
22: end
23: end
In case any earlier allocated job(s) complete execution at the current time step, the
heuristic tries to allocate the queued jobs onto the freed resources (line 2). At the job(s)
completion, the platform resources are updated. Then, for each queued job that still has
a positive value at the current time step (line 6), resource allocation is performed in a
similar way as described earlier: first, the node having the highest bid, i.e., the
maximum number of available cores, is chosen; then the exact number of cores to be
used in the chosen node is found from the profit maximization point of view; and
finally the queued job is allocated on the found number of cores by using the
design-time allocation decision, and the platform resources are updated. The same
process is repeated for the next queued job if the job has a positive value at the current
time step and some platform resources are available. If no platform resource is
available, the current and earlier queued jobs remain in the queue, and resource
allocation for them is performed at later time steps when some resources become
available. Further, if a queued job has zero value at the current time step, it is dropped
from the queue as no profit can be made out of it.
The same process is repeated for all the incoming jobs until all of them are either
allocated or dropped due to having zero value at their late allocation points. This
approach tries to allocate most of the jobs while achieving some profit for each of them.
However, some of them might not achieve profit due to late availability of the platform
resources freed by the already allocated jobs.
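The queue handling just described can be sketched as follows. For brevity the platform is flattened into a single pool of idle cores, whereas the heuristic proper allocates per node; all names are illustrative.

```python
def process_queue(queue, idle_cores, value_of, cores_wanted, now):
    """One pass over the SJQ queue after job completions: drop jobs whose
    value has decayed to zero, allocate the rest in queuing order while
    cores remain."""
    allocated, still_queued = [], []
    for job in queue:
        if value_of(job, now) <= 0:
            continue                      # zero value: dropped, no profit possible
        use = min(idle_cores, cores_wanted(job))
        if use > 0:
            idle_cores -= use             # allocate on freed cores
            allocated.append((job, use))
        else:
            still_queued.append(job)      # wait for further completions
    return allocated, still_queued, idle_cores
```

With 4 idle cores and three queued jobs, a job whose value has reached zero is silently dropped, while the others receive the minimum of the idle cores and their profit-maximizing core count, in queuing order.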
3.2.1.2 Simple Job Queuing with Holding (SJQH)
This technique is similar to SJQ, with the extra capability of holding already executing
low value jobs during the run-time resource allocation process whenever profitable.
The run-time resource allocation process of the SJQH approach is presented in
Heuristic 2. The heuristic takes the same input as Heuristic 1 and follows most of the
same steps. Here, before queuing an incoming job at a particular arrival time because
no resources are available in the platform, it is checked whether any profit can be made
by holding low value executing jobs that are expected to lead to a small amount of
profit, and allocating the incoming job on the freed resources. The same check is
applied for all the incoming jobs at the same arrival time.
In order to identify the executing jobs to be held, first, the executing jobs of each node
are found and sorted based on the start time of their execution (lines 9 and 10). Then,
the net profit made by holding the executing jobs in each node is computed as the profit
minus the loss: the profit is calculated by allocating the incoming job on the freed cores,
and the loss is the profit previously expected from the executing jobs to be put on hold.
The allocation uses either all the freed cores or some of them. If the number of freed
cores is higher than the number of cores required by the job to make the maximum
profit, the latter is chosen as the number of cores to be used; otherwise the former is
chosen. Sorting the executing jobs based on their start times helps us to choose, first,
the job having the latest start time, then the latest start time job along with the second
latest, then the latest and second latest start time jobs along with the third latest, and so on.
Such consideration helps to identify and hold the jobs that have started recently and
avoids holding jobs that have been executing for a long time. This process tries to
identify the most profitable instance in terms of jobs to hold. For example, holding the
latest start time job might not be profitable, but it might be profitable to hold the latest
and second latest start time jobs together. In such cases, the profitable instance occurs
only when both the latest and second latest start time jobs are put on hold. Further, this
also avoids considering all the possible job combinations, which could be quite
numerous for a large number of executing jobs in a node.
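The cumulative, latest-started-first enumeration can be sketched in Python as follows; the representation of executing jobs and the gain function are our illustrative assumptions.

```python
def best_hold_instance(executing, gain):
    """Enumerate cumulative hold sets on one node, latest-started job
    first, and return (net_profit, jobs_to_hold) for the best one.

    `executing`: list of (job, start_time, cores, value_lost_if_held);
    `gain(freed_cores)`: profit the incoming job would make on that many
    freed cores. An empty hold set (net profit 0) is the fallback.
    """
    by_start = sorted(executing, key=lambda e: e[1], reverse=True)
    best = (0, [])
    freed, lost, held = 0, 0, []
    for job, _start, cores, value_lost in by_start:
        freed += cores                # cores released so far
        lost += value_lost            # profit forfeited by the held jobs
        held = held + [job]
        net = gain(freed) - lost
        if net > best[0]:
            best = (net, held)
    return best
```

With gains and losses chosen to mimic Node 1 of Figure 8, holding j7 alone would net 40$, holding j7 and j4 together 100$, and additionally holding j1 would turn the net negative, so the two-job instance is returned.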
After profit calculation for each instance in each node, the maximum net profit instance
is selected and the corresponding node and its jobs to be put on hold are chosen as
max_profitable_node and jobs_to_hold, respectively (line 16 of Heuristic 2). Thus, the
holding instance that leads to maximum profit is identified. Figure 8 demonstrates the
holding process, where three platform nodes are executing different sets of jobs at the
Heuristic 2: Simple Job Queuing with Holding (SJQH)
1: for each time_step
2: if allocated job(s) finish execution
3: Perform steps 3 to 12 of Heuristic 1 (SJQ approach);
4: end
5: if Job(s) arrive && resources are available in nodes
6: Perform steps 15 to 19 of Heuristic 1 (SJQ approach);
7: else //first perform holding if profitable; else put into job queue
8: for each arrived Job
9: Find executing_jobs in each platform node;
10: Sort executing_jobs in each node based on their start time;
11: for each node c of platform nodes
12: for each executing_job (∈ executing_jobs) of c
13: Find net_profit to hold the executing_job and allocate
Job on the freed resources;
14: end
15: end
16: Choose the max profitable holding instance in terms of
max_profitable_node and its jobs_to_hold;
17: if (net_profit at max profitable holding instance > 0)
18: Hold the jobs jobs_to_hold of max_profitable_node and
update resources;
19: Find number of cores to be used by Job to maximize profit;
20: Allocate cores to the Job and update platform resources;
21: else
22: Put the Job(s) in JobQueue for late allocation;
23: end
24: end
25: end
26: end
current time, e.g., Node 1 is executing jobs j1, j4 and j7. The executing jobs started at
different moments in time. At the current time, job j8 has arrived and no resource is
available in the platform; therefore, the holding process tries to identify the set of jobs
to be put on hold. The table on the right hand side shows the jobs to hold in various
nodes and the corresponding net profit from allocating the freed resources to job j8. It
can be seen that sometimes the net profit (in $) from holding two jobs may be higher
than that of one job, e.g., 100$ by holding j7 and j4, whereas only 40$ by holding j7.
The net profit can also be negative (e.g., -80$ by holding j5), representing a loss if the
jobs are put on hold, i.e., the achieved profit is less than the loss. The most profitable
holding instance is to hold the jobs j5 and j2 of Node 2, which results in a net profit of
200$. The holding process will choose this instance.
[Figure 8 data: Nodes 1-3 are executing jobs j1-j7, started at different times; at the
current time, j8 arrives and no cores are available. Net profit by allocating j8 on the
cores freed by holding:
Node 1: j7 → 40$; j7, j4 → 100$; j7, j4, j1 → -150$
Node 2: j5 → -80$; j5, j2 → 200$ (most profitable holding instance)
Node 3: j6 → -40$; j6, j3 → -140$]
Figure 8: Holding Demonstration
If the net profit at the most profitable holding instance is greater than zero, the jobs
jobs_to_hold in node max_profitable_node are held and the resources are updated.
Then, the number of resources to be used by the incoming job is identified based on the
available resources and the resources required to achieve maximum profit, as described
earlier (line 19). Thereafter, the incoming job is allocated to the cores based on the
design-time profiling allocation decision, and the resources are updated.
In case there is no profit at any holding instance, the incoming job is put into the queue
to be allocated at later time steps, when some resources become available through the
completion of executing job(s). The holding process helps to achieve a higher profit for
some jobs, whereas the profit for the held jobs becomes zero. In contrast to SJQ, this
approach tends to provide more overall profit as only profitable holdings are allowed.
3.2.1.3 Maximum Value Queued Job with Holding (maxVH)
The earlier two approaches have the following shortcomings: 1) they process the
queued jobs in queuing order, 2) they choose the queued jobs randomly for the holding
operation, and 3) they do not resume the held jobs when resources become available.
The maxVH approach addresses these shortcomings and is presented in Heuristic 3. At
each time step, the heuristic checks for three events: 1) any already allocated job(s)
finishing execution, to update the platform resources; 2) any job(s) arriving at the
platform, to be put into a job queue; and 3) the job queue containing job(s) with
non-zero values at the current time step, to perform resource allocation for such jobs.
To perform resource allocation for all valuable queued jobs (i.e., jobs having positive
values), the heuristic tries to allocate all of them (counter = 0 to JobQueue.size()) on
the platform resources as long as any resource is available or profit can be made by
holding some executing jobs. First, bids (in terms of numbers of available cores) from
the different platform nodes are collected; then the maximum bid (maxBid) and the
corresponding node are selected. Choosing such a node helps to achieve better load
balancing amongst nodes and thus better resource utilization. In case more than one
node has the same bid, any of them is chosen. If maxBid is greater than
Heuristic 3: Maximum Value Queued Job with Holding (maxVH)
Input: Incoming Jobs with arrival times, Value-time curves of Jobs with profiling
results, Platform nodes with available cores
Output: Resource Allocation for Incoming Jobs
1: for each time_step
2: if allocated_job(s) finish execution
3: Update platform resources;
4: end
5: if job(s) arrive
6: Put the job(s) in JobQueue;
7: end
8: If JobQueue contains job(s) having positive values
9: counter = 0;
10: do
11: Collect bids from different nodes and select maxBid;
12: if maxBid > 0
13: Compute profits of jobs by utilizing maxBid resources;
14: Select maxProfitableJob and its profit;
15: if profit > 0
16: Allocate resources of maxBid node to maxProfitableJob;
17: Update platform resources;
18: end
19: else
20: Find executing jobs_to_hold in the best_suitable_node for recently arrived
maxProfitJob (from JobQueue) and max_hold_profit by Holding heuristic;
21: if max_hold_profit > 0
22: Hold jobs jobs_to_hold and put them in JobQueue for later allocation;
23: Allocate resources of best_suitable_node to maxProfitJob;
24: Update platform resources;
25: end
26: end
27: counter++;
28: while (counter != JobQueue.size());
29: end
30: end for
zero (maxBid > 0), i.e., at least one resource is available in the platform, the profits of
jobs utilizing maxBid resources are computed and the job leading to maximum profit is
selected (maxProfitableJob) to be allocated to resources of the node having maxBid re-
sources provided the maximum profit is a positive value (profit > 0). The profit compu-
tation for each job considers the exact number of cores (computed as described earlier)
to be used by the job and its value at the current time step. The resource allocation on
the exact number of cores of the node containing maxBid cores is done based on the
allocation achieved on the same number of cores during design-time profiling. The
allocation process allocates tasks within a job to the cores (PEs) of a node. The
platform resources are updated after each allocation process to have up-to-date
resource availability information for the next allocation instance. Such information
helps to achieve an accurate and efficient allocation.
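The selection of the maximum profitable queued job for the maxBid resources (lines 13-14 of Heuristic 3) can be sketched as follows (names are our own):

```python
def pick_max_profit_job(queue, max_bid, profit_of):
    """Among queued jobs, choose the one whose profit when offered up to
    `max_bid` cores is largest; returns (job, profit), or (None, 0) when
    no queued job is profitable."""
    best_job, best_profit = None, 0
    for job in queue:
        p = profit_of(job, max_bid)
        if p > best_profit:
            best_job, best_profit = job, p
    return best_job, best_profit
```

Only a strictly positive profit wins, so jobs whose value has decayed to zero are never selected, matching the profit > 0 guard in the heuristic.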
In case no resource is available in the platform, i.e. maxBid = 0, it is checked whether
any profit can be made by holding low value executing jobs that are expected to lead to
a small amount of profit. For a profitable holding, max_hold_profit is greater than zero
(max_hold_profit > 0). The job holding logic is presented in Heuristic 4, which
provides the executing jobs to hold in the best suitable node, and the maximum
profitable queued job (maxProfitJob) along with the achieved profit (max_hold_profit)
from utilizing the freed cores of the held jobs. The holding process is carried out for the
recently arrived jobs (i.e., those arriving at the current time step) to avoid repeating the
same process for all the queued jobs at each time step.
Heuristic 4: Job Holding
1: max_hold_profit = 0;
2: for each recently arrived job j of JobQueue
3: Find executing_jobs in each platform node;
4: Sort executing_jobs in each node in ascending order based on their start times;
5: for each node n of platform
6: for each executing_job (∈ executing_jobs) of n
7: Find net_profit (as in earlier approaches) by holding executing_job;
8: if net_profit > max_hold_profit
9: max_hold_profit = net_profit;
10: maxProfitJob = j;
11: best_suitable_node = n;
12: Add executing_job to list jobs_to_hold;
13: end
14: end
15: end
16: end
If holding is profitable (max_hold_profit > 0), the jobs jobs_to_hold of node
best_suitable_node are put on hold for later allocation and their used cores are released.
Then, the incoming job is allocated to the freed cores based on the profiling allocation
decisions and the resources are updated. The holding process helps to achieve a higher
profit for some jobs, whereas jobs on hold achieve lower profits due to allocation at
later time steps with decreased job values. In case holding is not profitable, the recently
arrived jobs remain in the job queue and resource allocation for them is performed
later, when resources become available upon completion of the executing job(s). The
allocation process also ensures that a queued job having zero value at the current time
step is dropped from the queue as no profit can be made out of it. Further, the allocation
for a queued job that
was put on hold starts from the hold point (i.e., it is resumed) to ensure that only the
fraction of the job left after holding is executed, not the whole job from the beginning.
The allocation process continues until all the arrived jobs are allocated or dropped due
to having zero value while waiting in the job queue.
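Resuming a held job from its hold point requires bookkeeping of how much of the job has already run. The following Python sketch shows one way to track this; the deliverable's actual bookkeeping may differ.

```python
def record_hold(progress, job, elapsed, makespan):
    """Record that `job` ran for `elapsed` time units of a `makespan`-long
    execution before being held; returns the fraction still to execute,
    so a resumed job runs only the remainder rather than restarting."""
    done = progress.get(job, 0.0) + elapsed / makespan
    progress[job] = min(done, 1.0)
    return 1.0 - progress[job]
```

A job held a quarter of the way through its makespan resumes with 75% left; if it is held again halfway through the remainder, only the final quarter is executed on the next allocation.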
3.2.1.4 Maximum Value Density Queued Job with Holding (maxVDH)
This approach is similar to the maxVH approach, with the distinction that the queued
jobs are processed based on their value density when utilizing the available (or
required) computing power (number of cores) of the highest bid node (maxBid node).
The value density of a job is computed by dividing the achieved value by the number of
used cores. The queued job having the maximum value density is chosen first. This also
means that a job providing high value and requiring fewer cores is allocated first,
which also leaves cores for later arriving jobs.
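The maxVDH ordering reduces to sorting the queue by value per used core, as the following Python sketch illustrates (the job figures are made up):

```python
def value_density(value, cores_used):
    """maxVDH ordering key: achieved value divided by cores used."""
    return value / cores_used

# Jobs as (name, achievable value, cores used); highest density first:
jobs = [("a", 400, 8), ("b", 300, 4), ("c", 90, 1)]
ordered = sorted(jobs, key=lambda j: value_density(j[1], j[2]), reverse=True)
```

Here job c (density 90) precedes b (75) and a (50): high value per core is served first, leaving cores free for later arrivals even though job a offers the largest absolute value.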
3.2.1.5 Minimum Value Remaining Queued Job with Holding (minVRH)
This approach is also similar to the maxVH approach, with the distinction that the
queued jobs are processed based on their remaining value. The remaining value of a job
is calculated as the remaining area under the value-time curve from the current time to
the time when the value becomes zero. The job having the minimum remaining value is
chosen first. This signifies that the job that is going to lose its value soonest, i.e., has
the minimum remaining value, is chosen first, on the assumption that most of the jobs
will be serviced and some value will be obtained from them. However, in doing so, a
high value job might have a very low value by the time resources are available to
perform the allocation process for it. Therefore, it might result in low overall profit.
3.2.1.6 Results and comparison
To evaluate the quality of the heuristics, historical data from the High Performance
Computing Center Stuttgart (HLRS) have been considered as the workload. The
workload contains a set of jobs having varying arrival times. Each job contains a set of
tasks that have known worst-case execution times (WCETs). The considered HPC
platform model contains a set of 3 nodes, where each node consists of 9 cores.
However, any number of nodes and cores within them can be considered, subject to the
physical limitations of hardware integration.
The platform manager employs a heuristic to find an allocation for each job of the
workload by considering its arrival time, given value curve and profiled information
representing the computing power (used number of cores) and the corresponding alloca-
tion decision to achieve a certain value by executing over a fixed amount of time. The
profiling information is obtained by employing the design-time profiling step. For each
job set, the profiling is performed in advance in order to associate profiling results with
the given value-time curves. The profiling information includes the required computing
power (# cores) and resource allocation decision information to achieve different
makespans representing the time axis.
Overall profit by executing different numbers of jobs:
Figure 9 shows the overall profit obtained by employing the proposed approaches for
varying job sets, which are derived by choosing different numbers of jobs from the
workload. A small number of jobs in a set reflects the execution of jobs in the HPC centre
for a short, fixed amount of time, e.g. a few minutes or hours, rather than for the whole
period covered by the historical data. A couple of observations can be made from the
figure. 1) The profit obtained by the approaches employing the holding process (e.g.
maxVDH and maxVH) is always higher than that of the corresponding approaches
without holding. This improvement is achieved by holding low-value executing jobs and
allocating the freed resources to high-value incoming jobs. Since holding is performed
only when it is profitable, the approaches employing it achieve higher overall profit.
2) The overall profit increases with the number of jobs in a workload, as profit is made
from a larger number of jobs. 3) The maxVH approach achieves the maximum overall
profit for the considered job sets. This is because choosing the maximum-value queued
job leads to more favourable situations for maximizing profit across the completed jobs.
On average, maxVH achieves 8% higher profit than maxV, which can be significant when
serving (completing) a large number of jobs.
Figure 9: Overall profit for executing different job sets
Analysing the holding effect on profit of individual jobs:
Figure 10 shows the profit obtained by individual jobs when the approaches SJQ and
SJQH are employed for the job set containing 30 jobs. The interesting observations
from the figure are as follows. 1) SJQ makes a profit for most of the jobs; the jobs
achieving zero profit are those that were queued and whose value had decreased to zero
by the time resources became available. Job 27 is an example of such a zero-profit job.
2) SJQH makes a profit for fewer jobs than SJQ, since SJQH holds some jobs and makes
zero profit for them; jobs 20 and 24 are examples. It should be noted that some
zero-profit jobs under SJQH can arise for the same reason as under SJQ, i.e. queued
jobs for which profitable holding was not possible on arrival and whose value had
reached zero by the time resources became available. Similar results are obtained for
the other job sets.
[Figure 9 chart: overall profit (£) for job sets of 30, 40 and 50 jobs, comparing SJQ, MinVR, MaxVD, MaxV, SJQH, MinVRH, MaxVDH and MaxVH.]
Figure 10: Profit by different jobs
Overall profit with varying job holding penalty:
We have also evaluated the overall profit in the case where there is a penalty for
holding low-value executing jobs. This is a more favourable situation for customers,
as they know that their submitted jobs will most likely be serviced with the initially
promised quality; otherwise the HPC centre has to compensate them through the penalty.
We have assumed that if a job is put on hold, a penalty of some percentage of the
maximum value achievable for that job is incurred. However, no penalty is applied to
queued jobs that end up making zero profit, since such a job is not put on hold but
simply passes its profit-making point due to resource unavailability. The holding
penalty percentage has been varied to evaluate its impact on the overall profit
achieved by the most promising approach, maxVH.
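The penalised holding decision sketched in this paragraph could look as follows; the function and its signature are illustrative assumptions rather than the deliverable's implementation:

```python
# Hedged sketch of the penalised holding decision: hold a running job only
# if the value gained by the incoming job outweighs the held job's lost
# value plus the penalty (a percentage of the held job's maximum value).
def should_hold(running_value, incoming_value, max_value, penalty_pct):
    penalty = penalty_pct / 100.0 * max_value
    return incoming_value > running_value + penalty

# With a 10% penalty on a job worth at most 500, holding is profitable:
print(should_hold(running_value=120, incoming_value=200, max_value=500, penalty_pct=10))  # True
# At 70% the penalty (350) dominates and suppresses holding:
print(should_hold(running_value=120, incoming_value=200, max_value=500, penalty_pct=70))  # False
```

This also illustrates why the profit curve saturates at high penalties: once the penalty exceeds any plausible value gain, no holdings are performed at all.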
Figure 11 shows the overall profit obtained by maxVH when the holding penalty
percentage is varied from 0% to 70% for the job set containing 30 jobs. The overall
profits obtained by maxVH with no holding penalty (maxVH-0%) and by maxV are also
plotted for comparison. It can be observed that the overall profit of maxVH decreases
as the holding penalty percentage increases, and saturates beyond a particular penalty
percentage. The profit at lower penalties remains the same, as the penalty is not high
enough to affect the allocation decisions for the jobs. The decreasing trend arises
because fewer holdings are performed as the penalty increases, yielding a lower overall
profit. The later constant-profit region indicates that no further holdings are
performed due to the high cost (penalty). It should be noted that the holding penalty
also influences the holding decision, and thus different jobs are put on hold as the
penalty changes. Further, the overall profit at higher holding penalties is the same as
that of maxV, because maxVH then performs resource allocation in exactly the same
manner as maxV: no jobs are put on hold due to the high penalty, and each incoming job
is placed in the job queue for later allocation when resources become free.
[Figure 10 chart: profit ($) per job, job indices 1–30, under SJQ and SJQH.]
Figure 11: Profit by maxVH with varying holding penalty and by maxV
Profit by individual jobs at varying holding penalty:
Figure 12 shows the profit obtained by individual jobs when the approach SJQH is
employed with different job holding penalties for a job set containing 20 jobs. It can
be observed that the number of held jobs making zero profit is higher when the holding
penalty is low. For example, at a 10% holding penalty, jobs 7, 8, 9 and 18 are held,
resulting in zero profit for them. The number of held jobs reduces as the holding
penalty increases; for example, at a 30% holding penalty, only job 12 is held. It
should be noted that the holding penalty also influences the holding decision, and thus
different jobs are held as the penalty changes. This is clearly visible in the two
examples above, where jobs 7, 8, 9 and 18 are held at a 10% penalty, whereas job 12 is
held at a 30% penalty. Further, the holding of jobs stops entirely at higher penalties,
e.g. at 40%. It should also be noted that zero profit can likewise be caused by job
queuing and late allocation; however, for Figure 12 there is no such job, which has
been verified against the allocation results.
Figure 12: Profit by different jobs
[Figure 11 chart: total profit ($), axis range roughly 9500–10200, across the compared resource allocation approaches.]
[Figure 12 chart: profit ($) per job, job indices 1–20, under SJQH-10%, SJQH-20%, SJQH-30% and SJQH-40%.]
Discussions:
The SJQ approach is good from the user's (customer's) point of view, as it tries to
perform resource allocation for most of the jobs. Similarly, minVR tries to service
most of the jobs. In contrast, the approaches employing the holding operation hold jobs
to maximize the overall profit, and thus favour the HPC platform suppliers (i.e. the
companies) rather than the users. Further, these approaches perform the same
allocations as the corresponding approaches without holding when the job holding
penalty is sufficiently high.
The results indicate that maxVH can be employed if the holding penalty is not too high
and the HPC platform supplier wants to maximize the overall profit. However, users
might not wish to submit their jobs to HPC centres where their jobs could go unserved
due to holding. Therefore, from the user's point of view, SJQ and minVR can be
employed, as they try to serve all the jobs and make a profit from most of them.
Further, with SJQ, when no resource is available for an incoming job, the client can be
informed early that the service for the job cannot start immediately but will start
soon, once resources become available. This allows the client to withdraw the job if
dissatisfied and able to find a better HPC supplier in time. Such a possibility is not
feasible in the approaches employing the holding operation, as a job can be held at any
time.
3.2.2 On-the-fly techniques for bidding
The approaches described in Section 3.2.1 consider whether holding low-value executing
jobs and resuming them later (i.e. employing pre-emption) can lead to a higher achieved
value. Further, they utilize design-time profiled results of the jobs to perform an
efficient allocation that maximizes overall profit. The on-the-fly approaches, in
contrast, perform all processing at runtime after a job has arrived, do not use any
design-time profiled results, and do not consider pre-emption. These approaches suit
the scenario in which customers are expected to submit different kinds of jobs at
different moments in time and historical data cannot be made available by HPC centres.
Since no prior profiling information is used, the on-the-fly approaches have no
queue-management overhead anywhere other than inside the HPC platform itself. The
allocations found at runtime may be suboptimal, as valuable jobs may starve on a very
busy node even if there is free capacity on other nodes, simply because of the
allocation made on the job's arrival. As with the earlier approaches, a centralised
auctioneer managing the queue of jobs yet to be allocated is considered. The overhead
of managing the queue can be compensated by the higher value achieved. HPC platforms
tend to use a submission host machine, and the auction could be run on it without
needing yet another machine, though this would place more load on the submission host.
As before, auctions take place to determine which job to run next. Jobs and platform
nodes place bids, where a node's bid is its number of free cores. The highest-bidding
job is allocated to the highest-bidding node, i.e. the node with the most free cores,
achieving load balancing across the HPC platform; this repeats until there are no jobs
or free cores left.
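The auction round just described can be sketched as follows; for simplicity this sketch assumes each job occupies a single core, which is an illustrative assumption rather than the deliverable's actual platform model:

```python
# Minimal sketch of one auction round: jobs bid their value, nodes bid
# their number of free cores, and the highest-bidding job is matched to
# the node with the most free cores until either side runs out.
def run_auction(job_bids, free_cores):
    """job_bids: {job: value}; free_cores: {node: free core count}."""
    jobs = sorted(job_bids, key=job_bids.get, reverse=True)
    allocation = {}
    for job in jobs:
        node = max(free_cores, key=free_cores.get)
        if free_cores[node] == 0:
            break  # no free cores anywhere in the platform
        allocation[job] = node
        free_cores[node] -= 1  # assume one core per job, for illustration
    return allocation

alloc = run_auction({"j1": 50, "j2": 90}, {"n1": 3, "n2": 1})
print(alloc)  # {'j2': 'n1', 'j1': 'n1'}
```

Allocating to the node with the most free cores at each step is what produces the load-balancing behaviour described above.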
3.2.2.1 Techniques
The evaluated on-the-fly techniques are similar to those proposed in Section 3.2.1, but
they do not use any design-time profiling results and do not employ the holding process
(pre-emption). Without the profiling information and the holding process, the
techniques are referred to as: SJQ, which queues jobs when no resource is available and
processes the queued jobs in queuing order; maxV, which chooses the maximum-value
queued job first for allocation; maxVD, which chooses the maximum value-density queued
job first; maxVD2, which chooses the maximum value-density-squared job first; and
minVR, which chooses the minimum-remaining-value job first. It is important to note,
however, that the performance of these heuristics can depend a great deal on the exact
parameters of the workload and platform.
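The ordering policies just listed might be sketched as follows; the field names are illustrative, and reading "value density squared" as value squared over execution time is an assumption (ordering by the square of value/time alone would always pick the same job as maxVD):

```python
# Sketch of the queue-ordering policies. SJQ simply takes the head of the
# FIFO queue; the other policies select by a key. Field names are
# illustrative assumptions, not the deliverable's data model.
def pick(queue, policy):
    keys = {
        "maxV":   lambda j: j["value"],                        # highest value first
        "maxVD":  lambda j: j["value"] / j["exec_time"],       # highest value density
        "maxVD2": lambda j: j["value"] ** 2 / j["exec_time"],  # density weighted by value
        "minVR":  lambda j: -j["remaining_value"],             # least remaining value
    }
    return queue[0] if policy == "SJQ" else max(queue, key=keys[policy])

queue = [
    {"id": 1, "value": 90, "exec_time": 30, "remaining_value": 40},
    {"id": 2, "value": 60, "exec_time": 10, "remaining_value": 5},
]
print([pick(queue, p)["id"] for p in ("SJQ", "maxV", "maxVD", "minVR")])  # [1, 1, 2, 2]
```

Note how maxV and maxVD already disagree on this tiny queue: the cheaper job 2 has lower value but higher value per unit of execution time.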
3.2.2.2 Results and comparison
The workload size is designed to correspond to a year in the life of an HPC centre, to
ensure that sufficient weekly load cycles are present. Two simulation scenarios have
been evaluated: a small-scale scenario and a large-scale scenario. The small-scale
scenario considers a total of 1000 jobs to be executed on an HPC platform of two nodes
with 20 execution threads each (representing 10 hyper-threaded cores). The large-scale
scenario considers a total of 10,000 jobs, composed of more than 100,000 tasks, to be
executed on a platform containing 4 nodes with 1000 cores each; it is intended to
simulate the larger platforms that might be encountered in future.
The arrival rates of the jobs are varied to correspond to a normal working week, with
peaks of work arrival during normal working hours and quiet periods overnight and at
weekends. These peaks are intended to be high enough to oversaturate the system, in
order to evaluate the techniques' ability to appropriately prioritise jobs to achieve
the highest value. Further, a range of load levels is examined. If the load is
perfectly divisible across all cores, 100% load represents all cores being fully
utilised (a state called saturation). Above 100% load, some jobs must starve, whereas
below 100% load there is some slack time available.
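The load metric described here can be made concrete with a small sketch; the function and its inputs are illustrative assumptions:

```python
# Sketch of the load metric: total requested core-time divided by the
# platform's available core-time over the same window. 100% corresponds
# to saturation when work divides perfectly across all cores.
def load_level(jobs, num_cores, window):
    """jobs: list of (num_tasks, wcet) pairs arriving within the window."""
    demand = sum(n * wcet for n, wcet in jobs)
    capacity = num_cores * window
    return 100.0 * demand / capacity

# 40 jobs of 40 tasks x 90 s each on 40 cores over one hour -> saturation
print(load_level([(40, 90.0)] * 40, num_cores=40, window=3600.0))  # 100.0
```

Above this level some demand cannot be served within the window, which is why the results below expect starvation beyond 100% load.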
It is to be expected that as load rises, the proportion of the maximum value achievable
will fall. Especially beyond saturation, some work will have to starve and so will never
realise any value. In these results, no penalties are applied for jobs that do not run; they
just do not return any value.
Overall profit at varying load:
Figure 13 shows the proportion of overall profit at varying loads when the different
techniques are employed in the small-scale simulation scenario. In these results, the
maxVD2 policy attains the highest value across the spectrum of load (Figures 13 (a) and
(b)). This is because squaring the density allows a better separation between the most
and least valuable tasks, which benefits this scenario.
(a) Normal View (b) Zoomed-in View
Figure 13: Value against load for the small-scale platform
Figure 14 shows the proportion of overall profit at varying loads when the different
techniques are employed in the large-scale simulation scenario. This scenario is
designed to represent a large industrial HPC installation, or a platform owned by an
organisation selling its capacity as cloud computing. In the large-scale results
(Figures 14 (a) and (b)), maxVD dominates all the other policies across the spectrum of
load and is significantly better at load levels above 100%; however, maxVD2 dominates
in the load range 80% to 100%.
(a) Normal View (b) Zoomed-in View
Figure 14: Value against load for the large-scale platform
Number of starved (uncompleted) jobs:
Figure 15 shows the number of starved jobs (workflows) at varying loads for the
small-scale and large-scale platforms. It is remarkable that although maxVD2 and maxVD
achieve the highest overall value for the small-scale and large-scale platforms
respectively (Figure 13 and Figure 14), the number of jobs each technique starves
differs considerably. For the small-scale system, maxVD2 achieves a higher overall
value and a lower number of starved jobs, and therefore should be chosen to maximize
both overall profit and user satisfaction. For the large-scale system, even though
maxVD2 has a somewhat lower overall value (Figure 14), it may still be the better
choice as it starves fewer jobs; lower levels of starvation are likely to maintain
higher user satisfaction.
(a) Small-scale platform (b) Large-scale platform
Figure 15: Number of starved (incomplete) jobs against load
4. TECHNIQUES FOR BIO-INSPIRED RESOURCE ALLOCATION
4.1 REVIEW OF EXISTING TECHNIQUES
Biologically inspired heuristics are processes which draw inspiration from nature and
apply the observed characteristics when solving specific computational problems. They
are often based on the characteristics of self-organizing biological systems where global
patterns emerge from interactions at a lower-level in the system [40].
Bio-inspired heuristics have been well explored for balancing communication load in
networks and distributed systems [41] [42]. In this review, however, we concentrate on
the problem of allocating both computation and communication loads.
4.1.1 Pheromone signalling (PS)
A bio-inspired approach to load balancing in wireless sensor networks (WSNs) is
presented by Caliskanelli et al. [43]. It is based on the behaviour of honey bees when
choosing a queen for their colony. The authors identify the similarities between the
assignments of responsibilities to members of a beehive with the distribution of a
workload across a WSN. The goal of the heuristic is to optimise the trade-off between
energy efficiency and service availability in a WSN. The proposed scheme applies a
dynamic load balancing approach based on the idea of certain nodes propagating a
virtual 'pheromone' to make other nearby nodes aware of their presence and
responsibilities. The idea is to provide these optimisations at runtime in a completely
decentralised and self-organising fashion.
The heuristic can be summarised as follows: all nodes in the network are capable of
collecting sensor readings, executing tasks and communicating with other nearby nodes.
Nodes are differentiated into one of two roles at any given time: queen nodes and
worker nodes. Queen nodes (QNs) are those responsible for the mapping and execution of
tasks; all other nodes are Worker Nodes (WNs). QNs are the only nodes which will
voluntarily execute tasks; WNs will only perform execution if explicitly told to do so.
The level of a virtual pheromone seen by a node determines whether it can differentiate
itself into a QN, and it is possible for there to be multiple QNs in the network.
QNs periodically propagate pheromone to their neighbours in the network. The level of
pheromone decays with each hop from the QN. For example, a node which is a direct
neighbour of the QN will experience a higher pheromone level from that particular QN
compared to one which is two hops from the QN. It is possible that a node receives
doses of pheromone from multiple QNs. These pheromone doses are accumulated into
an individual pheromone level by each node. If the pheromone level experienced by a
node drops below a pre-defined threshold, the node will become a QN. The pheromone
level of a node will also decay over time, ensuring that if no pheromone doses are
received after an amount of time, the node will become a QN. This ensures the heuristic
is resilient to failure of QNs or if certain nodes are too many hops away from a QN. It
also means that the network will have an appropriate density of QNs so as to maintain a
high level of service availability. However, it is unspecified how the pheromone level
affects nodes which have already differentiated themselves as QNs.
While parameter-rich and difficult to tune to different scenarios, PS provides a
completely decentralised approach to load balancing, as each node dynamically makes
resource allocation decisions using only information that is locally available (i.e.
pheromone level), and its computation and communication overhead is minimal: simple
periodic and event-triggered computation tasks, and lightweight communication
(pheromone packets can be only one byte long).
4.1.2 SymbioticSphere
This approach tries to model the resource allocation problem as a bio-inspired process
involving agents that mimic the lifecycle, feeding, reproduction, evolution and death
of living beings [44]. Agents autonomously decide to follow so-called synergetic
behaviours that allow them to migrate towards runtime platforms that are more
resource-rich, or to duplicate themselves. The approach has been applied to cloud
systems, aiming to optimise service availability (referred to as throughput in [44]),
response time and resource efficiency (the amount of computational work performed by a
given agent divided by the amount of platform resources it used to do so). The choice
of synergetic behaviours is based on game-theoretical formulations that aim at the
stability of the agent populations, rather than at the optimisation of the metrics of
interest. The evolution of the agents, on the other hand, is driven by a genetic
heuristic. Both mechanisms are computationally heavy, which makes this approach
suitable only for large-scale systems and questionable for systems with timing
constraints.
4.1.3 Biological Task Mapping and Scheduling
Hamouda and Phillips [45] present a bio-inspired task mapping and scheduling heuristic,
"Biological Task Mapping and Scheduling" (BTMS), aiming to improve energy efficiency
and performance in wireless sensor networks. The heuristic is inspired by the zygote,
the fertilised cell from which human embryonic stem cells arise. These cells exhibit a
behaviour called differentiation, in which cells begin to specialise to perform
different functions. The work attempts to recreate this behaviour in network nodes to
achieve the aforementioned goals.
BTMS uses an application model based on a directed acyclic graph (DAG) to represent the
tasks and the dependencies between them. It orders tasks according to their
dependencies and divides them into execution stages (or levels), which are used to
guide the mapping. It assumes a homogeneous network in which all nodes begin with the
same energy level. The heuristic is divided into three phases: group discovery, service
provisioning and group management.
The group discovery phase is based on the concept of a target sensor node, which
influences other nodes based on their proximity to it. A request to participate in a
processing activity is broadcast from the target node to its nearby neighbours, which
can choose whether to participate in the activity. Each neighbour makes its decision
based on two factors. Firstly, if its energy level is below a pre-set threshold value,
a node will prefer to remain a relay node to preserve its energy. The authors do not
explicitly define the responsibility of relay nodes, but it can be assumed that they
participate in communication activities and not in computation activities. The second
decision factor
is whether the node has sufficient neighbours to relay the data through the network. If
both factors hold for a node, it will participate in the activity.
The target sensor then performs a local election heuristic to elect a Main Node (MN),
whose role is to perform the task mapping. The MN role is assigned to the node Si that
maximises the fitness function:

F(Si) = a · (ESi / Emax) + (1 − a) · (1 / Σj d(Si, Sj))

The first part of the equation refers to the remaining energy of the node, where ESi is
the remaining energy of node Si and Emax is the maximum energy of the node. The second
part refers to how close the node is to the centre of the group participating in the
activity: the summation term totals the distances d(Si, Sj) from the node to all of its
neighbours, so a smaller total yields a larger term. The a term is a parameter between
0 and 1 which can be tuned to place more precedence on the node's remaining energy or
on its centrality to the group.
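The election criterion described in the prose can be sketched as follows; the function shape and the weighting are assumptions based on that description, and the exact form in [45] may differ:

```python
# Sketch of the MN election fitness: a weighted sum of normalised
# remaining energy and centrality (taken here as the inverse of the total
# distance to all neighbours). Form and names are assumptions.
def fitness(remaining_energy, max_energy, neighbour_distances, a=0.5):
    energy_term = remaining_energy / max_energy
    centrality_term = 1.0 / sum(neighbour_distances)
    return a * energy_term + (1 - a) * centrality_term

# A central, well-charged node scores higher than a peripheral, drained one:
print(fitness(900, 1000, [1, 1, 2]) > fitness(400, 1000, [3, 4, 5]))  # True
```

Raising a favours well-charged candidates, lowering it favours central ones, matching the tuning role the text assigns to the a parameter.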
This fitness function seeks a node with a high energy level near the centre of the
group. Intuitively, such nodes are the most appropriate for the role of MN, and this is
reflected in the fitness function. The CPU power of the MN is not considered, because
the network is assumed to be homogeneous; if the heuristic were applied to a
heterogeneous network, a term factoring in the processing power of the node might have
to be incorporated, otherwise the role of MN could be given to a node that is not fast
enough to perform the BTMS heuristic. A similarity can be drawn here between main nodes
in BTMS and queen nodes in PS, reviewed in the previous subsection: in both, certain
nodes are elected to have responsibility over a number of nodes in a given area. The
additional responsibility often comes with an increased energy drain, meaning an
imbalance in the rate of battery drain between nodes.
4.2 PROPOSED BIO-INSPIRED TECHNIQUE
Pheromone Signalling for Manycores (PSIGMA) is the DreamCloud extension to the PS
heuristic presented in [43]. It uses the pheromone signalling mechanism to allow nodes
to advertise and procure the availability of resources over a manycore platform. Cores
that have high availability of resources differentiate themselves into queens (QNs) and
propagate pheromones to suppress the differentiation of neighbouring nodes, thereby
achieving efficient load balancing.
Similarly to PS, PSIGMA has three distinct phases which are executed on every
processing core: two of them are time-triggered (differentiation cycle and decay of
pheromone) and one of them is event-triggered (propagation of received pheromone).
The first time-triggered phase, referred to as the differentiation cycle, is executed by
every node of the network every TQN time units. On each execution, the core checks its
current pheromone level hi against a predefined level thresholdQN. The core will
differentiate itself into QN (or maintain its QN status) if hi < thresholdQN; otherwise it
will become (or remain) a regular core and will handle only its individual workload. If
the node is a QN, it then transmits pheromone to its network neighbourhood to make its
presence felt. Each pheromone dose hd is represented as a two-position vector. The first
element of the vector denotes the distance in network hops to the QN that has produced
it. The second element is the actual dosage of the pheromone that will be absorbed by
the neighbours.
The event-triggered part of the heuristic deals with the propagation of the pheromone
released by QNs (as described previously in the differentiation cycle) and received at
neighbouring nodes. The purpose of propagation is to extend the influence of QNs to
nodes other than their directly connected neighbours. Propagation is not a periodic
activity and happens every time a node receives a pheromone dose. Upon receiving a
pheromone dose, a node checks whether the QN that has produced it is sufficiently near
for the pheromone to be effective. It does that by comparing the first element of hd with
a predefined thresholdhopcount. If the hd has travelled more hops than the threshold, the
core simply discards it. If not, it adds the received dosage of the pheromone to its own
pheromone level hi and propagates the pheromone to its neighbourhood. Before
forwarding it, the core updates the hd vector element by incrementing the hop count and
by multiplying the dosage by a decay factor Khopdecay. This represents pheromone
transmission decaying with distance from the source.
The second time-triggered part of the heuristic proposed in [43] is a simple periodic
decay of the pheromone level of each node. Every Tdecay time units, hi is multiplied by a
decay factor Ktimedecay. In PSIGMA, we take into account the dynamic load scenarios
that DreamCloud systems are likely to encounter, and tune the Ktimedecay factor to reflect
the current availability of resources of each core. Following the PS principles, this is
done in a completely decentralised way and using only information available locally to
each core. Each core monitors the slack of the tasks and communication activities it
performs: how early do they finish with regard to their timing constraints (i.e. soft real-
time deadlines). If slacks are high, it means that the core is underloaded, since most
tasks and communications are processed well ahead of their deadlines. In that case, the
Ktimedecay factor is increased, aiming to accelerate the process of differentiating this core
into a QN. Conversely, if slacks are low or negative, it means that the node is
overloaded and therefore should not be differentiated into QN, so Ktimedecay is decreased.
Finally, we introduce another event-triggered part to the heuristic, allowing individual
cores to tune their thresholdQN according to the local availability of resources. It uses
the same slack monitors described in the previous paragraph: if the slacks are large and
increasing, the value of thresholdQN is also increased, in order to increase the likelihood
of a differentiation into QN. Conversely, low or negative slacks will result in a decrease
of thresholdQN, which could potentially force a differentiation of a core from QN back
to a regular node. The differentiation of a QN back into a regular node due to overload
is a particularly desirable behaviour in the case of DreamCloud applications, but this
was not achievable under the baseline PS algorithm, or by tuning Ktimedecay as described
above.
Heuristic 3: Pheromone Signalling for Manycores (PSIGMA)
differentiation
1 every TQN do
2   if hi < thresholdQN
3     isQN := true
4     broadcast hd = {0, hQN}
5   else
6     isQN := false

propagation
1 when hd is received
2   if hd[hops] < thresholdhopcount
3     hi := hi + hd[dose]
4     broadcast hd' = {hd[hops] + 1, hd[dose] × Khopdecay}
5   else
6     drop hd

decay
1 every Tdecay do
2   hi := hi × Ktimedecay

threshold tuning
1 every TQN do
2   increase thresholdQN if the monitored slacks are large and increasing; decrease it if slacks are low or negative
4.2.1 Preliminary experimental work
PSIGMA was validated within a simulated multi-stream video processing scenario,
similar to one of the DreamCloud case studies in task T6.3 of WP6.
We assume an open system in which several video streams can be initiated by end-users,
each with distinct frame rates, resolutions and QoS requirements. Each video stream is
divided into MPEG groups of pictures (GoPs), which are periodically processed by a
chain of intercommunicating tasks according to the stream's frame rate. The proposed
PSIGMA resource allocation mechanism tries to improve the timeliness of the video
processing against a baseline mechanism that allocates tasks to the cores with the lowest
utilisation value U = Σi (WCETi / Ti), where WCETi is the worst-case
execution time of task i and Ti is its minimum inter-arrival interval (i.e. derived from the
stream’s frame-rate). Instead, PSIGMA allocates tasks to cores that have differentiated
themselves as QN. Note that we do not consider task migration in this case study. Once
allocated, a task will be executed by the core it was assigned to, even if it is not QN
anymore by the time the task is scheduled to run. In that case, however, future jobs of
such a task (i.e. processing the next GoP of a stream) will be reallocated to a QN core.
The following parameters were used to tune and implement PSIGMA:
TQN = 0.072 s
Tdecay = 0.036 s
thresholdQN = 9
thresholdhopcount = 2
Khopdecay = 0.25
Ktimedecay = 0.7
hd = 14
pheromone packet size = 32 bytes
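Under the parameter values above, the periodic decay and differentiation phases can be sketched as follows; this is a minimal single-core sketch, with the event-driven propagation, the platform model and the slack-based tuning omitted:

```python
# Minimal sketch of PSIGMA's time-triggered phases on one core, using the
# parameter values listed above. Propagation and slack-based tuning are
# omitted; variable names are illustrative.
THRESHOLD_QN = 9
K_TIMEDECAY = 0.7
H_QN = 14          # dose emitted by a queen (the hd value above)

def decay(h):
    """Every Tdecay: the local pheromone level decays multiplicatively."""
    return h * K_TIMEDECAY

def differentiate(h):
    """Every TQN: become (or stay) a QN iff the level fell below the threshold."""
    return h < THRESHOLD_QN

h = 20.0               # a level well suppressed by nearby queens
for _ in range(3):     # three decay periods with no new doses received
    h = decay(h)
print(round(h, 2), differentiate(h))  # 6.86 True -> the core differentiates into a QN
```

This illustrates the resilience property described earlier: with no fresh doses arriving, any core's level eventually falls below thresholdQN and it differentiates into a QN on its own.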
Thirty-five different scenarios were simulated, each referred to by its seed number,
which was used to define the computation cost of the video processing tasks, the video
stream arrival times and the video resolutions. By covering a wide array of scenarios,
we can show the robustness of the proposed resource allocation mechanism, i.e. show
that it performs well for different types of workload.
Figure 16 below shows the distribution of GoP lateness, measured in seconds, in each of
the 35 scenarios with the baseline and with the PSIGMA resource allocation. There are
clearly noticeable reductions in lateness using PSIGMA, for instance in seeds 88117,
5558, 42198 and 18065; however, negative results can be seen in seeds 74076 and 83660.
Overall, the approach performs better for both average-case and worst-case GoP
lateness. Figure 17 shows boxplots of the percentage improvement of the mean and
maximum GoP lateness over all 35 scenarios. PSIGMA provides an average improvement of
about 8%, and at most a 22% improvement, in the mean GoP lateness of streams processed
by the system. Likewise, the overall average improvement in maximum lateness is in the
order of 5%.
Figure 16: MPEG GoP lateness (in seconds): baseline vs PSIGMA
Figure 17: PSIGMA MPEG GoP lateness percent reduction against baseline
5. COMPLIANCE WITH THE DYNAMIC RESOURCE ALLOCATION REQUIREMENTS
The heuristics proposed in Sections 3 and 4 should comply with the dynamic resource
allocation requirements reported in deliverable D1.2 [2]; this ensures that the
development of the heuristics falls within the DreamCloud perspective. The
requirements are listed in the leftmost column of Table 1, and the proposed
market-inspired and bio-inspired heuristics for soft real-time dynamic resource
allocation are analysed against each of them. In Table 1, a ‘+’ sign indicates that the
corresponding requirement has been fulfilled, whereas a ‘-’ sign indicates the opposite.
The superscript numbers attached to the signs refer to the footnotes just below the
table, which mainly explain cases of partial or lacking fulfilment that are to be
addressed in current or future deliverables. The fulfilment status is based on the
preliminary analysis and results; the analysis of some requirements is out of the scope
of this deliverable and will be performed in depth in future deliverables.
It can be seen that most of the requirements are fulfilled by the proposed soft real-time
dynamic resource allocation heuristics. The initial sets of analyses and results indicate
that the proposed heuristics are promising candidates for extension and further
investigation.
Table 1. Dynamic resource allocation requirement fulfilment by the proposed heuristics

Requirement                                            SJQH   maxVH  maxVDH minVRH PSIGMA
Objectives of dynamic resource management
should be configurable                                 +1,4   +1,4   +1,4   +1,4   +1,4
Dynamic resource allocation shall be used to
provide different levels of performance guarantees     +4     +4     +4     +4     +4
The average latency of jobs shall be minimised         +      +      +      +      +
The total energy dissipation of jobs shall be
minimised                                              -1     -1     -1     -1     -1
Communication overhead parameters shall be
predictable                                            +      +      +      +      +
Dynamic resource allocation overhead shall be
predictable and bounded                                +      +      +      +      +
The dynamic resource allocation mechanisms
shall cope with dynamic workload                       -2     -2     -2     -2     +
The dynamic resource allocation mechanisms
shall not limit hardware scaling                       +      +      +      +      +
The dynamic resource allocation mechanisms
shall cope with limited information about the
state of the overall system                            +3     +3     +3     +3     +3
The dynamic resource allocation mechanisms
shall respect mapping constraints that restrict
the allowed computational unit                         +4     +4     +4     +4     +4
The dynamic resource allocation mechanisms
shall consider cost, runtime and power efficiency
for the different types of resources available to
a multi-typed job                                      +5     +5     +5     +5     +5
1 Outside the scope of this deliverable; it will be covered in deliverable D2.3 through possible
extensions of the proposed heuristics.
2 In hybrid techniques, the workload is analysed at design time assuming that the same workload
pattern will appear at run time; the workload to be executed at different times must therefore be
known in advance.
3 The heuristics use limited information to make resource allocation decisions, especially for
distributed resource management.
4 Not covered explicitly by the proposed heuristics, but their extension to consider multiple
objectives/constraints is trivial.
5 Currently only one type of resource has been considered; future extensions will consider different
types of resources.
REFERENCES
1. S. H. Bokhari. On the mapping problem. IEEE Transactions on Computers, 30(3):207–214, 1981
2. University of York, University of Stuttgart, D1.2 – Dynamic Resource Allocation Requirements,
DreamCloud, 2014.
3. R. I. Davis and A. Burns. A survey of hard real-time scheduling for multiprocessor systems.
ACM Comput. Surv., 43(4):35:1–35:44, October 2011
4. A. K. Singh, M. Shafique, A. Kumar and J. Henkel. Mapping on multi/many-core systems: sur-
vey of current and emerging trends. In 50th Annual Design Automation Conference, 1:1 – 1:10,
2013
5. G. Ascia, V. Catania, and M. Palesi. A multi-objective genetic approach to mapping problem on
Network-on-Chip. Journal of Universal Computer Science, 12(4):370–394, 2006
6. A. Racu and L. S. Indrusiak. Using genetic algorithms to map hard real-time NoC-based sys-
tems. In 7th International Workshop on Reconfigurable Communication-centric Systems-on-
Chip (ReCoSoC), 2012
7. S. Stuijk, T. Basten, M.C.W. Geilen, and H. Corporaal. Multiprocessor resource allocation for
throughput-constrained synchronous dataflow graphs. In 44th ACM/IEEE Design Automation
Conference, DAC ’07, pp. 777 –782, 2007
8. O. Moreira, F. Valente, and M. Bekooij. Scheduling multiple independent hard-real-time jobs on
a heterogeneous multiprocessor. In Proceedings of the 7th ACM & IEEE International Confer-
ence on Embedded Software, pp. 57–66, 2007
9. J. E. Kelley, Jr. Critical-path planning and scheduling: Mathematical basis. Operations Research,
9(3):pp. 296–320, 1961
10. T.C.E. Cheng and Q. Ding. Scheduling start time dependent tasks with deadlines and identical
initial processing times on a single machine. Computers and Operations Research, 30(1):51 – 62,
2003
11. B. Shirazi, M. Wang, and G. Pathak. Analysis and evaluation of heuristic methods for static task
scheduling. Journal of Parallel and Distributed Computing, 10(3):222 – 232, 1990
12. H. Topcuoglu, S. Hariri, and M. Wu. Performance-effective and low-complexity task scheduling
for heterogeneous computing. IEEE Transactions on Parallel and Distributed Systems,
13(3):260–274, March 2002
13. E. Saule, D. Bozdağ, and U. V. Catalyurek. A moldable online scheduling algorithm and its
application to parallel short sequence mapping. In Eitan Frachtenberg and Uwe Schwiegelshohn,
editors, Job Scheduling Strategies for Parallel Processing, Lecture Notes in Computer Science,
vol. 6253, pp. 93–109, 2010
14. L.T. Smit, J.L. Hurink, and G.J.M. Smit. Run-time mapping of applications to a heterogeneous
SoC. In International Symposium on System-on-Chip, pp. 78 –81, 2005
15. C.-L. Chou and R. Marculescu. Incremental run-time application mapping for homogeneous
NoCs with multiple voltage levels. 5th IEEE/ACM/IFIP International Conference on Hard-
ware/Software Codesign and System Synthesis (CODES+ISSS), pp. 161 –166, 2007
16. A. Schranzhofer, J.-J. Chen, and L. Thiele. Dynamic power-aware mapping of applications onto
heterogeneous MPSoC platforms. IEEE Transactions on Industrial Informatics, 6(4):692 –707,
2010
17. A. K. Singh, A. Kumar, and T. Srikanthan. A hybrid strategy for mapping multiple throughput-
constrained applications on MPSoCs. In Proceedings of the 14th international conference on
Compilers, architectures and synthesis for embedded systems (CASES), 2011
18. A. K. Singh, T. Srikanthan, A. Kumar, and W. Jigang. Communication-aware heuristics for run-
time task mapping on NoC-based MPSoC platforms. J. Syst. Archit., 56(7):242–255, 2010
19. S.-S. Lu, C.-H. Lu, and P.-A. Hsiung. Congestion- and energy-aware run-time mapping for tile-
based network-on-chip architecture. In International Conference on Frontier Computing. Theory,
Technologies and Applications, pp. 300 –305, 2010
20. A. Beloglazov and R. Buyya. Energy efficient allocation of virtual machines in cloud data cen-
ters. In 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing
(CCGrid), pp. 577 –578, 2010
21. C. S. Yeo and R. Buyya. A taxonomy of market-based resource management systems for utility-
driven cluster computing. Software: Practice and Experience, 36(13):1381–1419, 2006
22. I. Caliskanelli, J. Harbin, L. S. Indrusiak, P. Mitchell, D. Chesmore, F. Polack. Bio-inspired load
balancing in large-scale WSNs using pheromone signalling. International Journal of Distributed
Sensor Networks, vol. 2013, Article ID 172012, 14 pages, 2013
23. R. Buyya and M. Murshed. A deadline and budget constrained cost-time optimisation algorithm
for scheduling task farming applications on global grids. arXiv:cs/0203020, March 2002. Tech-
nical Report, Monash University, 2002
24. Y. Tao and X. Yu. Classified optimization scheduling algorithm driven by multi-QoS attributes
in economical grid. In International Conference on Computer Science and Software Engineering,
volume 3, pp. 70–73, 2008
25. C. Li and L. Li. Multi-level scheduling for global optimization in grid computing. Computers &
Electrical Engineering, 34(3):202–221, 2008
26. O.O. Sonmez and A. Gursoy. A novel economic-based scheduling heuristic for computational
grids. International Journal of High Performance Computing Applications, 21(1):21–29, 2007
27. L. Xiao, Y. Zhu, L.M. Ni, and Z. Xu. Incentive-based scheduling for market-like computational
grids. IEEE Transactions on Parallel and Distributed Systems, 19(7):903–913, 2008
28. T. Theocharides, M. K. Michael, M. Polycarpou, and A. Dingankar. Hardware-enabled dynamic
resource allocation for manycore systems using bidding-based system feedback. EURASIP J.
Embedded Syst., Article 3, 21 pages, 2010
29. K. Lai. Markets are dead, long live markets. ACM SIGecom Exchanges, 5(4): 1–10, 2005
30. D. E. Irwin, L. E. Grit, and J. S. Chase. Balancing risk and reward in a market-based task ser-
vice. In Proceedings of the 13th IEEE International Symposium on High Performance Distribut-
ed Computing, pp. 160–169, 2004
31. E. D. Jensen, C. D. Locke, and H. Tokuda. A time-driven scheduling model for real-time operat-
ing systems. In IEEE Real-Time Systems Symposium, pp. 112–122, 1985
32. K. Chen and P. Muhlethaler. A scheduling algorithm for tasks described by time value function.
Real-Time Systems, 10(3):293–312, 1996
33. C. D. Locke. Best-effort decision-making for real-time scheduling. PhD thesis, Pittsburgh, PA,
USA, 1986
34. P. Li and B. Ravindran. Fast, best-effort real-time scheduling algorithms. IEEE Transactions on
Computers, 53(9):1159–1175, 2004
35. N. Bansal and K. R. Pruhs. Server scheduling to balance priorities, fairness, and average quality
of service. SIAM Journal on Computing, 39(7):3311–3335, 2010
36. S. A. Aldarmi and A. Burns. Dynamic value-density for scheduling real-time systems. In The
11th Euromicro Conference on Real-Time Systems, 1999
37. A. M. Burkimsher. Fair, responsive scheduling of engineering workflows on computing grids.
Ph.D. dissertation, UK, 2014
38. University of Stuttgart, D3.1 – Cloud Communications Patterns Analysis, DreamCloud, 2014.
39. University of York, D5.1 – Analytical Platform Model, DreamCloud, 2014.
40. S. Camazine, J.-L. Deneubourg, N. R. Franks, J. Sneyd, G. Theraulaz, and E. Bonabeau. Self-
Organization in Biological Systems. Princeton University Press, 2003
41. T. Nishitha and P. C. Reddy. Performance Evaluation of AntHocNet Routing Algorithm in Ad
Hoc Networks. In 2012 International Conference on Computing Sciences (ICCS), pp. 207–211,
2012
42. A. da Silva Rego, J. Celestino, A. dos Santos, E. C. Cerqueira, A. Patel, and M. Taghavi. BEE-
C: A bio-inspired energy efficient cluster-based algorithm for data continuous dissemination in
Wireless Sensor Networks. In 2012 18th IEEE International Conference on Networks (ICON),
2012, pp. 405–410, 2012
43. I. Caliskanelli, J. Harbin, L. S. Indrusiak, P. Mitchell, F. Polack, and D. Chesmore. Bioinspired
Load Balancing in Large-Scale WSNs Using Pheromone Signalling. Int. J. Distrib. Sens. Netw.,
vol. 2013, p. 14, May 2013
44. P. Champrasert, J. Suzuki, and C. Lee. Exploring self-optimization and self-stabilization proper-
ties in bio-inspired autonomic cloud applications. Concurrency Computat.: Pract. Exper.,
24(9):1015-1034, 2012
45. Y. E. M. Hamouda and C. Phillips. Biological Task Mapping and Scheduling in Wireless Sensor
Networks. In Proc. IEEE Int. Conf. Communications Technology and Applications (ICCTA), pp.
914–919, 2009