an application programming interface for hpc
TRANSCRIPT
-
8/13/2019 An Application Programming Interface for HPC
1/53
An Application Programming Interface for High
Performance Distributed Computing
For the Copernicus computing project
ASHKAN JAHANBAKHSH
HANIF FARAHMAND MOKARREMI
Master of Science Thesis, KTH
Supervisor: Iman Pouya, Erik Lindahl
Examiner: Patrik Henelius
Stockholm, Sweden 2013
Abstract
This master's thesis was performed at Lindahl Lab at Science for Life Laboratory
(SciLifeLab), located in Stockholm, Sweden.
The need for compute resources is increasing, and the speed of a single computer
is not enough for data-intensive computations. Distributed computing has been
developed to improve the computation of such tasks by distributing them to other
machines. Many platforms have been implemented for this purpose. Copernicus is
one such platform, providing the ability to distribute computationally intensive
tasks. The current Copernicus API handles only the distribution of entire
applications; there is no support for distributing sections of code in
Copernicus.
In this paper we provide a general API design, and a Python implementation of
it, for distributing sections of code on Copernicus. The implementation of the
API handles only Python code, but it is possible to extend it to other languages
with the help of Python wrappers. The API also lowers the learning threshold for
a new Copernicus user, especially for those who are not computer scientists or
who have little programming knowledge.
Referat (API for distributed computing)
This master's thesis was carried out at Science for Life Laboratory
(SciLifeLab), in the Lindahl Lab department, located in Stockholm.
The need for computer resources is increasing and the speed of a single computer
is not sufficient for data-intensive computations. Distributed computing has
been developed to improve the computation of such workloads by distributing them
to other machines. Many platforms have been implemented for this purpose.
Copernicus is one such platform, providing the ability to distribute
computationally intensive workloads. The current Copernicus API handles only the
distribution of a program as a whole, and there is no support for distributing
parts of code in Copernicus.
In this report we provide a general API design and its implementation in Python
for distributing parts of a code in Copernicus. The implementation of the API
handles only Python code, but it is possible to extend it to other languages
with the help of Python wrappers. This also reduces the learning threshold for a
new Copernicus user, especially for those who are not computer scientists or who
have little knowledge of programming.
Acknowledgements
We would like to thank everyone who helped us during this master's thesis. A
special thanks goes to Iman Pouya, our supervisor, who guided us in the right
direction to reach our goal. Special thanks also go to Sander Pronk and Patrik
Falkman, who gave us the opportunity to discuss our problems with them during
the work and who provided much valuable feedback. We also thank Professor Erik
Lindahl at SciLifeLab, who let us do this master's thesis.
Finally, we would like to thank our friends, especially Ali Mehrabi and Hannes
Salin, who took the time to read our report and gave us valuable feedback.
Contents
1 Introduction
  1.1 Problem statement
  1.2 Methodology
2 Parallelization and Distributed computing
  2.1 Parallelization
  2.2 Distributed computing
  2.3 MapReduce
  2.4 Advantages and Disadvantages
3 Copernicus
  3.1 Copernicus design
  3.2 Copernicus module
  3.3 Copernicus module example
    3.3.1 Defining the _import.xml file
    3.3.2 Defining the runner.py file
    3.3.3 Defining the executable.xml file
    3.3.4 Adding jobs to Copernicus
4 Related work
  4.1 Folding@home
  4.2 PiCloud
  4.3 Hadoop
  4.4 Techila
5 The API
  5.1 Design
    5.1.1 Considerations
    5.1.2 First attempt: Module generator
    5.1.3 Final attempt: Generic module
  5.2 API implementation
6 Results
  6.1 MD5 cracker with the new API
  6.2 Comparison
7 Discussion
  7.1 Future work
  7.2 Conclusion
Bibliography
Appendices
A Link to source code
Chapter 1
Introduction
Today, many scientists run a large number of computationally intensive
simulations[1]. These are mostly used in the academic world in biology,
chemistry and physics, but also in commercial industry, e.g. the automotive
industry[1]. Such simulations are usually done for several reasons:
- The problem is too complex to solve analytically.
- It is too expensive (in money or time) to solve it in the real world.
To be able to do a good simulation of a real-world application, one must build a
model that can represent that application. For example, in a car crash
simulation it could be too expensive and time consuming for a company to destroy
hundreds of cars in order to obtain a statistically representative result of how
a real car crash would turn out. Instead the company can use simulations to
minimize the number of real crash tests, thus saving both time and money.
Figure 1.1: A snapshot of a car crash simulation, source: BMW.
These simulations are usually computationally intensive, so the processing would
take a long time on a single workstation. The simulations are therefore usually
run on computational resources equipped with hundreds to millions of CPU cores.
These resources can either be a cluster of computers interconnected with a
high-performance link, or a cluster of smaller computers spread all over the
world and connected through the Internet. Clusters of the first kind (also
called supercomputers) are usually very large. They are specially built to be
energy efficient, but still consume a lot of energy and emit a huge amount of
heat that needs to be dissipated. They are therefore located in areas with a lot
of space, needed not only for the computers but also for the cooling system and
electricity.
Figure 1.2: A supercomputer, source: NASA.
On top of the hardware, the user must implement a mechanism for the application
to communicate across all computers and clusters. There is support for such a
mechanism both in the application layer and at the lower programming-language
layer. In the programming layer there are several language APIs, such as the
Message Passing Interface (MPI)[2], Remote Procedure Call (RPC)[3] and Remote
Method Invocation (RMI)[4]. The programmer needs to handle the communication
between the computers, fault tolerance and the available resources.
Many applications have been created to simplify and abstract the language
layer. While these applications may differ greatly in design and usage, most of
them share some fundamental functionality, such as handling the communication
between computers, resource management and fault tolerance.
In the Related work chapter, three such tools will be briefly explained, and in
the next section another tool is presented together with a problem statement.
1.1 Problem statement
As explained in the previous section, there are many software tools that aim to
simplify the distribution of work. One of these tools is Copernicus[5], which
has mainly been developed at the Royal Institute of Technology (KTH) in
cooperation with the universities of Virginia and Stanford. Copernicus is a
Peer-to-Peer (P2P) platform for distributed computing. It connects heterogeneous
computing resources and is used by defining a problem as a workflow. This means
that users can focus on formulating their problem without worrying about
parallel work distribution and fault tolerance.
It is designed primarily for molecular dynamics but can in practice be used for
any computationally intensive work that can be run in a distributed manner.
Figure 1.3: A molecular simulation of a protein folding, source: Stanford university.
The current structure of Copernicus allows binaries, custom programs and scripts
to be used in a workflow. However, this is a monolithic approach, and in certain
use cases one needs more granular control over the work distribution. The
workflow creation part of Copernicus is very powerful, but it is also quite time
consuming for a new Copernicus user or for users with little programming
knowledge. In some use cases it is utterly impractical to design a workflow just
to distribute some work on the Copernicus platform.
One such use case is when only some part of a code needs to be run in a
distributed manner. Here, the current Copernicus design would force the user to
change the code dramatically to make it run on the Copernicus platform, and each
time the user changes the code, the corresponding changes must be made again.
While this is time consuming and even frustrating for a code base with a single
developer, it is even more frustrating and impractical for a code base with
multiple developers. This high entry barrier is probably one of the reasons for
the small user base of Copernicus.
The goal of this project is to define an API that users can use to mark sections
of their code to be distributed in parallel on the Copernicus platform.
1.2 Methodology
The Copernicus project is a large free software project. It is lacking a flow chart and a
UML diagram that explains its design decisions. Its user API documentation was also un-
der construction, at the start of this thesis. From a developer standpoint these are two huge
drawbacks. This means that not only did we need to understand how to use Copernicus
but also read its code and find out about its current design, structure and its strength and
weaknesses. However, an introduction to how to use Copernicus was given by our supervi-
sor.
Because of the reasons above, we decided to first get familiar with the usage of Copernicus.
When a full understanding of its usage was obtained, literature studies were done in the area
of High Performance Computing (HPC) and similar applications. Most of them mentionedin the previous sections. The reason for this was to get good knowledge of what was out
there and how they are used.
After literature studies were done, we implemented a simple distributable application that
used Copernicus to compute some tasks. The goal was to understand and get an overview
on how Copernicus works and get used to the available Copernicus commands. After that,
we iteratively implemented a simple version of the API. This version was only intended
to get a flow in the development and understanding of the Copernicus design. When there
is a working flow, it is much easier to both understand and change each section of the
implementation.
The methodology of this thesis can be described by the following flow chart:
Figure 1.4: The methodology of this thesis.
Chapter 2
Parallelization and Distributed
computing
2.1 Parallelization
Processor clock speeds will no longer increase significantly, because higher
clock speeds require more voltage, which generates almost exponentially more
heat[6]. Instead, processor manufacturers use new transistors to add multiple
processor cores to each chip[6]. To use the power of these cores, programs have
to run in parallel. Parallel processing is a way of computing in which a large
problem is divided into smaller, independent tasks that are computed
concurrently, each on a separate core, usually on a single computer. In other
words, parallel processing is the use of two or more processor cores at the same
time to solve a single problem that can be divided into sub-problems.
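The decomposition described above can be sketched in a few lines of Python. This is only an illustrative sketch: a large problem (summing a range of numbers) is split into independent chunks, each of which could run on its own core; here a thread pool stands in for the cores, since the decomposition is the point.

```python
# Divide a large problem into independent sub-tasks and compute them
# concurrently, then combine the partial results.
from concurrent.futures import ThreadPoolExecutor

def subtask(chunk):
    lo, hi = chunk
    return sum(range(lo, hi))   # one independent piece of the work

# Split the problem into four independent chunks...
chunks = [(0, 250), (250, 500), (500, 750), (750, 1000)]

with ThreadPoolExecutor(max_workers=4) as pool:
    partial_sums = list(pool.map(subtask, chunks))

# ...and merge the sub-results into the final answer.
total = sum(partial_sums)
print(total)  # 499500, the same as sum(range(1000))
```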
Figure 2.1: A visualization of parallelization of a problem.
2.2 Distributed computing
One way to compute computationally intensive tasks in a reasonable timeframe is
to use a supercomputer, i.e. a powerful high-performance computer that consists
of many compute nodes1. In a supercomputer, tasks can take advantage of the huge
number of nodes and thus be computed in a parallelized way.
Another alternative is to set up a number of computers in a network and use
their resources to compute tasks. Distributed computing[7] is the field of
computer science that solves problems this way: a large problem is divided into
smaller parts, which are sent to many computers in a network to solve, and the
sub-solutions are then merged into a solution for the whole problem.
Figure 2.2: A visualization of distribution of a problem.
1A compute node is simply a single machine in a cluster or a network.
2.3 MapReduce
MapReduce is a programming model that allows developers to process large data
sets in a distributed way[8]. There are two key functions in the MapReduce
framework, the Map function and the Reduce function. A job is separated into
sub-problems which are processed by the mappers. The outputs of the mappers are
sent to the reducers, where they are collected into one result. While the idea
of having a mapper function and a reducer function is quite old and widely
used[9], the name MapReduce was coined in a paper on the subject published by
two Google engineers.
Figure 2.3: A visualization of MapReduce programming concept. A problem is separated
into sub-problems and sent to the mappers. When mappers are done mapping, their results
are sent to the reducers and they are collected and presented as a final output.
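The map and reduce phases described above can be sketched in plain Python with the classic word-count example: the map step emits (word, 1) pairs for each chunk of input, and the reduce step collects the pairs per key into one result.

```python
# A tiny MapReduce-style word count, sketched in plain Python.
from collections import defaultdict

def mapper(chunk):
    # Map phase: emit a (word, 1) pair for every word in this sub-problem.
    return [(word, 1) for word in chunk.split()]

def reducer(pairs):
    # Reduce phase: collect all pairs per key into one result.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

chunks = ["to be or", "not to be"]                     # sub-problems
mapped = [pair for c in chunks for pair in mapper(c)]  # map each chunk
result = reducer(mapped)                               # merge the outputs
print(result)  # {'to': 2, 'be': 2, 'or': 1, 'not': 1}
```

In a real MapReduce framework the chunks would be mapped on different machines and the pairs shuffled to several reducers, but the data flow is the same.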
2.4 Advantages and Disadvantages
The main advantage of a distributed system compared to a parallel computing
system is reliability. If one machine crashes, the remaining computers are
unaffected and the system as a whole keeps running, given that there is support
for fault tolerance. In a parallel computing system, the failure of a CPU or
other hardware may cause the whole system to stop. Another advantage of a
distributed system is flexibility: it is very easy to implement, install and
debug new services, which in turn can be accessed equally by every client.
Finally, there are many volunteers around the world who may want to contribute
CPU resources to help scientists compute data-intensive tasks in a distributed
manner, as in the Folding@home project[10].
One of the main disadvantages of distributed computing systems is
troubleshooting and diagnosing them. The maintainer may be required to connect
to remote machines or check whether the communication between computers in the
system is working. The network is another factor in the reliability of a
distributed system: if the network is overloaded or there are problems with data
transmission, the performance of the system will suffer.
Both parallel and distributed computing require one to modify the sections of
code that are meant to be parallelized into independent tasks. This requires a
good understanding of parallel and distributed computing.
Chapter 3
Copernicus
As mentioned in chapter 1, Copernicus is a highly scalable platform for
distributing computationally intensive tasks, connected in a model called a
workflow. The user can focus on formulating the problem as a workflow instead of
spending time on handling message passing and fault tolerance, which are crucial
parts of distributed computing.
In Copernicus, a user can create a workflow of connected instances of functions
that work as wrappers for running executables. For example, one might want to
connect the output of function A to the input of function B, and then connect
the output of B to function C. The idea of the Copernicus workflow design is
that a user should be able to change the inputs of the module functions while
the job is being executed.
Figure 3.1: An example of a simple workflow, with three connected functions in
Copernicus.
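Conceptually, chaining functions this way resembles ordinary function composition. The sketch below is only an analogy (the function names are hypothetical, and this is not the Copernicus API): each stage consumes the previous stage's output, as in Figure 3.1.

```python
# Hypothetical stand-in for a three-stage workflow A -> B -> C:
# the output of each stage feeds the input of the next.
def func_a(x):
    return x + 1    # stage A

def func_b(x):
    return x * 2    # stage B

def func_c(x):
    return x - 3    # stage C

# Connecting the output of A to B, and the output of B to C:
result = func_c(func_b(func_a(10)))
print(result)  # 19
```

The difference in Copernicus is that each stage is an instance running on distributed resources, and the connections (and inputs) can be changed while the workflow is executing.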
In section 3.2 a more detailed explanation of the Copernicus module is
presented.
Copernicus is also designed with scalability in mind: the user can add new
workers on demand, and the number of workers in a cluster is practically
unlimited.
The following sections describe the Copernicus design and the requirements for a
project to be distributed on it.
3.1 Copernicus design
The current version of Copernicus is designed around three different
environments: client, server and workers. On the client machine the user starts
a new project and issues all commands to the server. The server's job is to
handle all commands coming from the client and send jobs to the workers.
Copernicus commands are used as an interface between the client and the server,
i.e. to start a project, set inputs, add jobs, receive outputs, etc. The main
purpose of the server is to distribute the computationally intensive work to the
workers; the sequential part of the code should always run on the server. The
server also handles persistency and fault tolerance in case a worker does not
respond.
In order for Copernicus to scale dynamically, i.e. to add new workers to a
cluster, it is designed such that the workers ask for jobs instead of the other
way around. A worker tells the server what it is capable of computing, and the
server then checks whether it has any job matching that specific worker. If it
has a job that matches the worker's capabilities, it sends the job to the
worker; otherwise it lets the worker know that it has no job at the moment. The
worker then waits for a given amount of time and asks the server for jobs again,
since the server might have a job at a later time.
Figure 3.2: An illustration of jobs delivered to workers from a Copernicus server.
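The pull-based protocol just described can be sketched as a small worker loop. All names here are hypothetical stand-ins (including the in-memory server), not the actual Copernicus code; the point is the shape of the exchange: the worker announces its capabilities, and the server hands back a matching job or nothing.

```python
import time

# Hypothetical in-memory server: maps a capability to a queue of jobs.
class FakeServer:
    def __init__(self, jobs):
        self.jobs = jobs  # e.g. {"md5cracker": [job, job, ...]}

    def request_job(self, capabilities):
        for cap in capabilities:
            if self.jobs.get(cap):
                return self.jobs[cap].pop(0)  # a matching job
        return None                           # "no job at the moment"

def worker_loop(server, capabilities, poll_interval=0.0, max_polls=3):
    for _ in range(max_polls):
        job = server.request_job(capabilities)
        if job is not None:
            return job()              # run the matching job
        time.sleep(poll_interval)     # the server may have a job later
    return None

server = FakeServer({"md5cracker": [lambda: "cracked"]})
print(worker_loop(server, ["md5cracker"]))  # cracked
```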
To make Copernicus even more dynamic, it was designed around a P2P
architecture[5]. Befriended servers can help each other with load balancing: a
server asks another server whether it has unfinished jobs in its queue, and
those jobs can then be transferred to the other server so that the work
completes more efficiently.
Figure 3.3: An illustration of jobs being transferred to a trusted Copernicus
server when the other Copernicus server has unfinished jobs in its queue.
3.2 Copernicus module
To be able to connect different executables' inputs/outputs, files and different
data types, the user needs to define a module. The module specifies all the
wrapping functions and their corresponding executables, with the inputs/outputs
that need to be connected. This is defined in an XML file called _import.xml.
Figure 3.4: This figure shows where all files must be located in order to create a Copernicus
module to be able to run an application on the Copernicus platform.
Another file needed for a working Copernicus module is a Python script. Each
function defined in the _import.xml file needs to be implemented in a Python
script. When Copernicus reads the _import.xml file, it knows which functions
that specific module has, along with all the data types and inputs/outputs for
each function. In the Python script, a user can connect the instances of those
functions and manipulate the outputs of one executable before setting them as
the input of another executable.
Each time a change is made to a module instance, Copernicus calls the
corresponding function with some argument. The function must return, otherwise
the project will block. This is a crucial point for the implementation of the
API. Copernicus is designed this way so that users can interact with a running
project; for example, a user might want to change a value in the middle of a
long-running project.
3.3 Copernicus module example
This section covers the basics of creating a module in Copernicus through an
example. After the module creation, adding jobs to the Copernicus job queue is
explained.
The aim of this application, called MD5 Cracker, is to crack an MD5 hash, i.e.
to find the plaintext1 that corresponds to a given MD5 hash. An MD5 hash is
produced by a one-way hash algorithm. There are several techniques to crack
these kinds of hashes, such as brute-force attacks, dictionary attacks[11] and
rainbow tables[12]. The application uses the brute-force technique, i.e.
cracking by trying all possible combinations to find the plaintext. To crack a
single lowercase character (English alphabet), it takes a maximum of 26 tries,
and to crack a two-character text, a maximum of 26^2 tries. The exponential
nature of the brute-force attack makes it a computationally intensive task. The
problem can easily be divided into sub-problems by defining a range for each
task to compute. These two properties make this an ideal application to run on
the Copernicus platform.
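The growth of the search space can be checked quickly. The small helper below is only illustrative; it is not part of the MD5 Cracker application.

```python
# The brute-force search space over lowercase English letters grows as
# 26**n with plaintext length n, which is what makes the problem both
# computationally intensive and easy to split into per-task ranges.
def search_space(length, alphabet_size=26):
    return alphabet_size ** length

print(search_space(1))  # 26
print(search_space(2))  # 676
print(search_space(5))  # 11881376 candidates for a 5-character plaintext
```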
import hashlib, string, itertools, sys

def bruteforce(job):
    def validateWord(word, original_hash):
        return hashlib.md5("".join(word)).hexdigest() == original_hash

    def nextPermutation(FIRST_WORD_TUP, LAST_WORD_TUP, wSize):
        for x in itertools.product(string.ascii_lowercase, repeat=wSize):
            if x >= FIRST_WORD_TUP and x <= LAST_WORD_TUP:
3.3.1 Defining the _import.xml file
As described in section 3.2, a file called _import.xml must be created, and it
should be located on the Copernicus server.
<?xml version="1.0"?>
<!-- NOTE: the markup of this listing was lost in extraction; the element
     and attribute names below are an approximate reconstruction based on
     the surviving text and the surrounding description. -->
<cpc>
  <import name="MY_MODULE">
    <desc>MD5 cracker</desc>
    <function id="runner" type="python-extended">
      <desc>crack a given hash and find the corresponding plaintext</desc>
      <inputs>
        <field type="int" id="num"><desc>Length of the plaintext</desc></field>
        <field type="string" id="hash"><desc>The hash string</desc></field>
        <field type="string" id="start"><desc>The start point</desc></field>
        <field type="string" id="end"><desc>The end point</desc></field>
      </inputs>
      <outputs>
        <field type="file" id="out"><desc>The output of md5cracker</desc></field>
      </outputs>
      <controller function="runner" import="runner" />
    </function>
  </import>
</cpc>
Listing 3.2: The contents of "_import.xml"
In this XML example, the module is named MY_MODULE and it has one function. The
function is named runner; it takes four input parameters and has one output. The
inputs to the function are the total length of the plaintext, the hash value,
and the start and end values. The output is the plaintext value that is returned
in a file after it has been computed by the workers.
3.3.2 Defining the runner.py file
Under the controller tag in the _import.xml file there is a property called
function. This property tells Copernicus that the function is implemented in a
file that must be called runner.py.
import logging, cpc.command, cpc.util, os, shutil

log = logging.getLogger('cpc.lib.MY_MODULE')

def runner(inp):
    if inp.testing():
        return
    fo = inp.getFunctionOutput()
    persDir = inp.getPersistentDir()
    val1 = inp.getInput('num')
    val2 = inp.getInput('hash')
    val3 = inp.getInput('start')
    val4 = inp.getInput('end')
    fileExist = os.path.isfile(persDir + "/stdout")
    if not fileExist:
        for i in range(len(val1)):
            outputFiles = ["out.%d" % i]
            args = ["md5cracker", val1[i].get(), val2[i].get(),
                    val3[i].get(), val4[i].get()]
            cmd = cpc.command.Command(persDir, "MY_MODULE/runner", args,
                                      minVersion=cpc.command.Version("1.0"),
                                      addPriority=0, outputFiles=outputFiles)
            fo.addCommand(cmd)
    return fo
Listing 3.3: The contents of "runner.py"
This Python script tells Copernicus to run the md5cracker application on the
workers. The input values to the application are read from the module. The
internal Copernicus API function addCommand is then called to add the desired
job to the Copernicus queue.
3.3.3 Defining the executable.xml file
To let the Copernicus server know which functions a worker is capable of
running, a file called executable.xml must be created. This file must be copied
to all the workers that are meant to run the job.
<?xml version="1.0"?>
<!-- NOTE: the markup of this listing was lost in extraction; the element
     and attribute names below are an approximate reconstruction based on
     the surrounding description. -->
<executable-list>
  <executable name="MY_MODULE/runner" platform="smp" arch="" version="1.0">
    <run in-path="yes" cmdline="md5cracker" />
  </executable>
</executable-list>
Listing 3.4: The contents of "executable.xml"
Under the executable tag in this file there is a property called name. This
property tells Copernicus that this particular worker is capable of running
MY_MODULE/runner. As mentioned in the previous sections, MY_MODULE is the name
of the module and runner is its function.
3.3.4 Adding jobs to Copernicus
When the module creation is finished, a project can be set up and jobs can be
added to the Copernicus queue. To create a workflow and connect the
inputs/outputs, some Copernicus commands must be executed. In this particular
example, the hash value 95ebc3c7b3b9f1d2c40fec14415d3cb8, which represents the
plaintext zzzzz, is brute forced by the application. For the sake of simplicity,
the length of the plaintext in this example is known to be 5; in a real case, it
is not possible to know the length of the plaintext from the hash value.
//create a project
$cpcc start MY_CRACKER
//import the recently created module into the project
$cpcc import MY_MODULE
$cpcc transact
//create an instance of the job and name it "runner_1"
$cpcc instance MY_MODULE::runner runner_1
$cpcc activate
//set the length of the plaintext
$cpcc set runner_1:in.num[+] "5"
//set the hash value to be cracked
$cpcc set runner_1:in.hash[+] "95ebc3c7b3b9f1d2c40fec14415d3cb8"
//set the start point
$cpcc set runner_1:in.start[+] "aaaaa"
//set the end point
$cpcc set runner_1:in.end[+] "mzzzz"
//commit the first job
$cpcc commit
Doing the same but with different arguments to add the second job:
$cpcc transact
$cpcc instance MY_MODULE::runner runner_2
$cpcc activate
$cpcc set runner_2:in.num[+] "5"
$cpcc set runner_2:in.hash[+] "95ebc3c7b3b9f1d2c40fec14415d3cb8"
$cpcc set runner_2:in.start[+] "naaaa"
$cpcc set runner_2:in.end[+] "zzzzz"
//commit the second job
$cpcc commit
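The hash used in the example can be checked directly in Python, which also illustrates the per-candidate validation a worker performs. This is a standalone check, not part of the module:

```python
import hashlib

# The target hash from the example above; per the example it corresponds
# to the plaintext "zzzzz".
target = "95ebc3c7b3b9f1d2c40fec14415d3cb8"

def validate(word, original_hash):
    # Hash a candidate word and compare it against the target hash.
    return hashlib.md5(word.encode()).hexdigest() == original_hash

print(validate("zzzzz", target))  # True: "zzzzz" is the plaintext
print(validate("aaaaa", target))  # False
```

Note also that the two jobs split the 5-character search space into the ranges aaaaa-mzzzz and naaaa-zzzzz, so the second worker is the one that will find the plaintext.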
Two jobs are now added to the Copernicus queue to be run on the workers. If
there are workers with available computational resources, they will fetch the
jobs and the computation will start. When they are done, the results are sent
back to the server. After all jobs in the queue are done, the project is
considered finished by Copernicus.
While this is a very simple code example, it is clear that creating a Copernicus
module is a very time-consuming process even for a simple distribution.
Chapter 4
Related work
In this chapter a number of related works are briefly explained, with their
similarities/differences and advantages/disadvantages compared to Copernicus.
The goal is to get good insight into how other distributed platforms work and to
get some inspiration before starting to design the new API.
4.1 Folding@home
Folding@home (FaH)[13] is a distributed computing project whose goal is to
research protein folding[14], i.e. predicting the 3D structure of a protein from
its primary structure. Currently, more than 263,000 volunteers all around the
world contribute their computer resources to this project[13].
Copernicus is in fact highly influenced by the FaH design[15]. While most of the
FaH code is proprietary software and is only used for protein folding,
Copernicus is completely free software and is designed to do any kind of
distributed computing[5].
4.2 PiCloud
PiCloud[16] is a so-called cloud computing service, a commercial web application
that distributes computational work. Its API design is based on function calls:
you define the functions that you want to run in a distributed manner, and then
you call the PiCloud API functions to run them, check their progress and receive
their return values. PiCloud can both run a function sequentially on the cloud
and map a function to run in parallel on a distributed system.
The big difference between the APIs of Copernicus and PiCloud is that the latter
can distribute a single function and run it on the cloud, while in the current
Copernicus API a whole program can be distributed but a single function cannot.
On the other hand, Copernicus is capable of running several applications,
collecting outputs, connecting the inputs/outputs of each application to other
ones and distributing the desired jobs to the workers. PiCloud must receive all
the data needed to compute its tasks before the computation starts, whereas
Copernicus can start a job and receive the data it needs during the computation.
4.3 Hadoop
Hadoop[17] is a software framework for running applications on large clusters
with support for large amounts of data. It is derived from MapReduce and the
Google File System (GFS). It is free software, licensed under the Apache License
2.0, and it is widely used by large companies such as Facebook, Yahoo,
Amazon.com, IBM and HP. While Hadoop was not primarily designed for
computationally intensive work, it can certainly be used for it. Its core
strength, however, is the Hadoop Distributed File System (HDFS)[18]. HDFS
replicates data across the computers in the cluster, so that if one node goes
down another node can take its place without any data loss. Data-intensive jobs
can take advantage of Hadoop's data locality: Hadoop keeps track of where each
piece of data is and, instead of transferring data to the program, transfers the
program to where the data is located.
4.4 Techila
Techila[19] is a commercial distributed computation platform that lets intensive computations
be processed in a distributed way. It is meant to distribute sections of code, e.g.
a for-loop, and it supports many languages such as Perl, Python, Matlab, C/C++, etc. It
is only capable of distributing embarrassingly parallel workloads, i.e. tasks that are
completely independent and do not share any variables.
Techila is very similar to Copernicus in that both platforms distribute jobs to workers
and the end user can receive the computed results from them. The difference between
Techila and the current Copernicus API is that Techila is able to distribute sections
of code, while Copernicus distributes whole executable programs.
Chapter 5
The API
5.1 Design
Two fundamentally different design approaches were considered: function calls and
annotations. In the case of function calls, the user would need to move the part of the code
that needs to be distributed into a function, and add calls to our API functions together
with the needed arguments. The arguments would be a function pointer together with a list of
data that needs to be distributed. The function pointer points to the function that is going to
be called for each distributed job, and each element in the argument list is the list of arguments
for one distributed job. An example of the function call design would look like
this:
def myFunc(args):
    # do a lot of work...
    return something

args = [[arg1], [arg2], ... , [argN]]

if not COPERNICUS:
    # Conventional way
    retValueList = []
    for arg in args:
        retValueList.append(myFunc(arg))
else:
    # The new API way
    retValueList = call_to_our_api(myFunc, args)

Listing 5.1: API Design
When a user runs the example script above through Copernicus, everything that is needed
for a Copernicus project will be created automatically, the script will be executed, the
function myFunc will be distributed to the workers, and each return value from the workers
will be stored in the retValueList variable.
The annotation design would use preprocessor directives like the OpenMP annotations in
C++, i.e. #pragma omp, where the programmer adds an annotation above the section
that needs to run in parallel[20]. If the compiler supports OpenMP, it makes
that section of code run in parallel for that specific environment and CPU architecture.
If the compiler does not support OpenMP, the annotations are ignored and the
program runs sequentially. With this approach, the user would only need to add such an
annotation right before the section of code that needs to be distributed. Python has a
similar mechanism called decorators, but decorators cannot be applied to sections of code
and are therefore limited to whole functions. This makes annotations an unreasonable
choice for the design.
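The limitation can be shown with a small sketch (the distribute decorator below is hypothetical and not part of any API): a Python decorator necessarily wraps a whole function, never an arbitrary section of code.

```python
# Hypothetical sketch of the rejected annotation-style design. A decorator
# can only mark an entire function for distribution, which is why this
# approach was abandoned for distributing arbitrary code sections.
def distribute(func):
    def wrapper(args_list):
        # A real implementation would ship `func` to the workers here;
        # locally we simply map it over the argument list.
        return [func(arg) for arg in args_list]
    return wrapper

@distribute
def my_func(arg):
    return arg * 2
```

Calling my_func([1, 2, 3]) then returns [2, 4, 6]: the whole function, not a chosen section inside it, is what gets wrapped.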
5.1.1 Considerations
There were many considerations made before the implementation started. This section
lists the most important of them.
1. The user should change her code as little as possible to make it run on Copernicus.
2. The user might call our API functions multiple times in her code, so the user script
must wait until all jobs on the workers are done before continuing to execute the rest
of the code.
3. The script should not run on the client computer, because the job might take a long
time to complete and the user might want to shut down the client computer in the
meantime.
4. The Copernicus module should be as general as possible, i.e. able to handle an
arbitrary number of input arguments, data types and executables.
5. Most likely, the server and workers will not have all the required dependencies;
these therefore need to be copied both to the server and to all workers.
5.1.2 First attempt: Module generator
This section briefly describes the module generator design. The idea is that instead of
creating all the module files manually, our API creates all the needed files and executes the
Copernicus commands by running a generated script. The user script is started normally,
just as the user would run it on a single machine. The user adds function calls to
our API wherever the distribution of a function is needed. Our API function searches for
all dependencies of that specific function and serializes and dumps the function together
with all its dependencies. The API function analyses the input data and generates all the needed
Copernicus module files: the _import.xml, the Python script, the plugin script and a bash/shell
script for:

- Creating a Copernicus project
- Importing the module
- Creating instances
- Setting the input values
There are multiple challenges with this design approach. The first problem is that when the
user script is executed, each call to the API functions generates a new Copernicus module
with its own specific name, number of inputs/outputs and their specific types. While
this actually works, it is not practical for keeping a good overview of the project. The
second problem is that the user has to manually copy the generated plugin script, which is
specific to each call to the API functions.

While this design was not general enough to handle all kinds of Copernicus workflows, it
helped us gain a lot of knowledge about the internal design of Copernicus and influenced
the final design. Some code for handling dependencies, as well as test code for creating a
Copernicus project, was also reused in the final design.
5.1.3 Final attempt: Generic module
The generic module was designed and redesigned multiple times, but the final design
turned out to be pretty simple. It has two main Copernicus module functions: one that starts
the user's script (called mainRunner) and one that is created for each call to our API
functions, i.e. for each function distribution (called subRunner). The mainRunner gets
its first inputs from the client when a project is created. The inputs are the name of the
script that calls our API functions and a tarball¹ with all its dependencies. The outputs are
the standard output and standard error streams and a tarball including all the files that were
created during the execution. The user script is started in the mainRunner as a sub-process;
this way the mainRunner will not block the whole project.
¹ An archive that contains a set of files.
Figure 5.1: Module design that shows the subRunner instance inside the mainRunner.
Multiple subRunner instances are created for each call to the API functions.
When a call to our API occurs in the script, new instances of the subRunner are
created and the execution of the script is halted. Each subRunner gets a dump of the function
that needs to be distributed, together with the specific arguments for that job. The
outputs of each subRunner are then connected to the sub-inputs of the mainRunner. In this
way the mainRunner can collect the output from each worker, reassemble the outputs into a
list and return the list back to the script, which then continues its execution. When the whole
execution of the script is done, the mainRunner collects the final outputs and makes a
tarball out of them.
Figure 5.2: A flowchart of a user script running inside Copernicus using the generic module
API. The communication server and the user script are executed in separate threads from
the mainRunner.
One thing that was deliberately left out of the description of the execution scheme above
is how the user script waits for the list of outputs from the workers. The problem is
that, while the user script runs in a separate process, there is by design no support in
Copernicus for the script to communicate with a specific project and add new jobs. The
function that adds new jobs, i.e. creates new instances, must always be called inside
the Copernicus scope. This is solved with inter-process communication (IPC). The
actual implementation is described in the next section.

When a call is made to our API from the user script, the function and the arguments are
dumped and a signal is sent through the IPC, which tells the server that there are new jobs
that need to be created. The jobCreator function, which is started from the mainRunner
and waits for a signal, receives the signal from the server and loads the dumped function
and its arguments. It then creates a list of instances of the subRunner and connects their
outputs to the sub-inputs of the mainRunner. The jobCreator function returns after
the job creation; as mentioned earlier, this is crucial for the Copernicus project not to be
blocked. After the jobCreator and mainRunner have returned, the subRunner function is
called by Copernicus, and the jobs are put on the Copernicus job queue for the workers
to fetch and run.
On the workers, the serialized function together with its list of arguments is loaded and
executed. After the job is done, the output for the given arguments is serialized. It is then
compressed into a tarball together with all the new files that might have been created by the
distributed function, and the tarball is sent back to the Copernicus server.
As mentioned in section 3.2, each time a change is made to a module function instance, that
function is executed by Copernicus. In this case, the change is the workers sending back their
results. When the results from all workers have been sent back, they are deserialized and
collected into a list. The list is then serialized and dumped, and a signal is sent back to the
API function that is currently waiting and blocking the user script.
When the blocking API function gets the signal through the IPC, it loads the serialized list
and returns it to the user script. The script then continues its execution and a new
call to our API function is possible.

When the user script is done executing, the stdout and stderr streams, together with a
tarball that includes all newly created files, are ready for the user to fetch.
5.2 API implementation
This section goes through each part of the implementation of the API design. Since
the whole Copernicus project is implemented in Python, this implementation is also in
Python.

The main challenge here is that, since the function myFunc is executed on the workers,
all its dependencies might not be available on those machines. The API must therefore
recursively inspect all the underlying dependencies and copy them to all the workers.
This is solved by using a Python tool called snakefood[21].
For the implementation of the IPC, Unix domain sockets (UDS) are used: a socket-like
API where the communication goes through a file instead of an IP address[22].
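The IPC mechanism can be illustrated with a minimal self-contained sketch (the socket path and the message contents below are made up for the example and are not those of the actual implementation):

```python
import os
import socket
import tempfile
import threading

# A Unix domain socket is addressed by a file path, not an IP address.
path = os.path.join(tempfile.mkdtemp(), "hapi.sock")

server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server.bind(path)
server.listen(1)

def serve():
    # Stands in for the communication server: wait for one signal and
    # acknowledge it.
    conn, _ = server.accept()
    msg = conn.recv(1024)
    conn.sendall(b"ack:" + msg)
    conn.close()

t = threading.Thread(target=serve)
t.start()

# Stands in for the blocking API function in the user script: send a
# "jobs created" signal and wait for the acknowledgement.
client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
client.connect(path)
client.sendall(b"jobs_created")
reply = client.recv(1024)
client.close()
t.join()
server.close()
```

The real implementation exchanges richer messages, but the pattern is the same: one side blocks on the socket until the other side signals it.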
Figure 5.3: The API function and the job creator function communicate through the
communication server using UDS.
For passing data between the user script and the Copernicus module functions, UDS
was not used. The data is instead serialized and dumped to the hard drive using the
internal Python object serialization module, called marshal[23].
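A minimal sketch of the mechanism (our illustration, not the API's actual file layout): marshal can serialize a function's code object, which is how a function can be dumped by the user script and rebuilt on the other side.

```python
import marshal
import types

def my_func(x):
    return x + 1

# Serialize the function's code object to bytes (in the real API this is
# written to a file on the hard drive).
blob = marshal.dumps(my_func.__code__)

# On the receiving side: load the code object and rebuild a callable.
restored_code = marshal.loads(blob)
restored = types.FunctionType(restored_code, globals())
```

Note that marshal only captures the code object, not closures or the function's global environment, which is one reason the dependencies must be shipped separately.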
Chapter 6
Results
The final result of this thesis is a general design of an API for the Copernicus computing
project, together with an implementation of that design in Python. The API radically
simplifies the modifications needed for a user's Python script to run on the Copernicus
computing platform. The user no longer needs to create customized Copernicus modules,
but only calls the API function, which we call hapi_map, and which handles everything
that is needed for a function to be distributed on the Copernicus platform. A full comparison
between the two APIs is given in the next section, where some further conclusions are
also drawn.
6.1 MD5 cracker with the new API
In this section the result of distributing the MD5 Cracker application on the Copernicus
platform is presented. The goal is to show how the new API is used in an application. Two
test systems are used to measure the results and to find the threshold for how small the jobs
can be while still obtaining a speedup when distributing an application with the new API.
The speedup is the time it takes to execute an application sequentially, divided by the
time it takes to execute the same application in parallel:
Speedup = T1 / TP    (6.1)
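As a concrete check of equation 6.1 against the measurements reported below (the numbers are the two-worker Opteron entry of Table 6.1):

```python
# Worked example of equation 6.1.
T1 = 2733.0  # duration in seconds with 1 worker (sequential), Table 6.1
Tp = 1771.0  # duration in seconds with 2 workers (parallel), Table 6.1
speedup = T1 / Tp  # approximately 1.54, matching Table 6.2
```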
The two systems run Ubuntu GNU/Linux 12.04 with Python 2.7.3. Both have x86-64
CPUs, but from different vendors and thus with different architecture designs. The following
code shows how to run the application in parallel by splitting the work into two jobs. The
same concept is used to split the work into 24, 48 and 96 smaller jobs. Different
plaintext lengths, 4 and 6, are also used, to be able to analyze how the API implementation
scales for small and large jobs.
import hashlib, string, itertools

def bruteforce(job):
    # code omitted
    ...

job1 = [6, "453e41d218e071ccfb2d1c99ce23906a",
        "aaaaaa", "mzzzzz"]
job2 = [6, "453e41d218e071ccfb2d1c99ce23906a",
        "naaaaa", "zzzzzz"]
jobs = [job1, job2]

# This is the new distributed way.
from cpc.lib.execPythonModule.hapiModule import hapi_map
results = hapi_map(bruteforce, jobs)

Listing 6.1: md5cracker.py
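The body of bruteforce is omitted in Listing 6.1. A minimal sketch of what such a function might look like, assuming the job layout [length, target MD5 hex digest, first candidate, last candidate] used in the listing (this reconstruction is ours, not the thesis code):

```python
import hashlib
import itertools
import string

def bruteforce(job):
    # Hypothetical reconstruction of the omitted body. A job is assumed to
    # be [length, target_md5_hex, first_candidate, last_candidate].
    length, target, start, end = job
    # itertools.product yields candidates in lexicographic order, so the
    # job's [start, end] range can be selected by plain string comparison.
    for combo in itertools.product(string.ascii_lowercase, repeat=length):
        candidate = "".join(combo)
        if candidate < start:      # not yet inside this job's range
            continue
        if candidate > end:        # past the end of this job's range
            break
        if hashlib.md5(candidate.encode()).hexdigest() == target:
            return candidate       # plaintext found
    return None                    # no match in this job's range
```

With this layout, each job covers a disjoint slice of the search space, which is what makes the workload embarrassingly parallel.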
Figure 6.1: [chart not reproduced in this copy]
Number of workers Duration (sec) for Opteron Duration (sec) for Xeon
1 2733 1707
2 1771 879
8 438 231
24 167 150
64 177 152
Table 6.1: The duration for 24 jobs and 6 characters.
Number of workers Speedup for Opteron Speedup for Xeon
1 1.00 1.00
2 1.54 1.94
8 6.24 7.39
24 16.37 11.38
64 15.44 11.23
Table 6.2: The speedup for 24 jobs and 6 characters.
Figure 6.2: [chart not reproduced in this copy]
Number of workers Duration (sec) for Opteron Duration (sec) for Xeon
1 6800 4701
2 3626 2435
8 1031 634
24 365 399
64 222 397
Table 6.3: The duration for 96 jobs and 6 characters.
Number of workers Speedup for Opteron Speedup for Xeon
1 1.00 1.00
2 1.88 1.93
8 6.60 7.41
24 18.63 11.78
64 30.63 11.84
Table 6.4: The speedup for 96 jobs and 6 characters.
Figure 6.3: [chart not reproduced in this copy]
Number of workers Duration (sec) for Opteron Duration (sec) for Xeon
1 4 3
2 21 18
8 6 5
24 4 3
64 8 5
Table 6.5: The duration for 24 jobs and 4 characters.
Number of workers Speedup for Opteron Speedup for Xeon
1 1.00 1.00
2 0.19 0.17
8 0.67 0.60
24 1.00 1.00
64 0.50 0.60
Table 6.6: The speedup for 24 jobs and 4 characters.
Figure 6.4: [chart not reproduced in this copy]
Number of workers Duration (sec) for Opteron Duration (sec) for Xeon
1 5.9 4.7
2 40 37
8 10 10
24 8 7
64 11 13
Table 6.7: The duration for 48 jobs and 4 characters.
Number of workers Speedup for Opteron Speedup for Xeon
1 1.00 1.00
2 0.15 0.13
8 0.59 0.47
24 0.74 0.67
64 0.54 0.36
Table 6.8: The speedup for 48 jobs and 4 characters.
From the results above, we can clearly see that when the computation time for a single job
is very short, the network data transfer and communication become the bottleneck. We can
also see that when the number of workers in a system is larger than the number of jobs,
not only is there no speedup, but performance sometimes even decreases. The reason for
this is that the server is overloaded by the many workers asking for jobs.

Comparing the first two charts, we can see that as long as there are enough CPUs and jobs
available for them, the speedup is close to linear.
6.2 Comparison
In this section the usage of the previous Copernicus API and the new one is compared,
with focus on the user's perspective. The MD5 Cracker application described in
section 3.3 is used:
# This function is supposed to be distributed.
def bruteforce(args):
    # Code omitted.
    ...

# Code omitted.

# Calling the brute force code.
result = bruteforce(args)

Listing 6.2: md5cracker.py
To run the application in a distributed way through Copernicus, the user first has to
define a Copernicus module. The creation of a module in Copernicus is described in
section 3.2; it requires access to the Copernicus server and several hundred lines of
code. Since the bruteforce function is meant to be distributed and Copernicus does not
support function distribution, md5cracker.py must be rewritten. The next step is to copy
the md5cracker.py file and all necessary dependencies to all workers.
In this part, the same application is used as an example to be distributed with the newly
implemented API. First of all, each section of code that is meant to be distributed must
call the new API function (hapi_map), as in the following code:
# This function is supposed to be distributed.
def bruteforce(args):
    # Code omitted.
    ...

# Code omitted.

# This tells Copernicus to distribute the bruteforce function.
results = hapi_map(bruteforce, listOfArgs)

Listing 6.3: md5cracker.py
Now the user only needs to run a single command to get the job done, i.e. to distribute
the bruteforce function.
Besides freeing the user from the burden of writing several hundred lines of code to create
a module, the new API brings more functionality to Copernicus:

- Copernicus is now able to distribute sections of code and not only entire applications.
This means that the sequential sections of the code can run locally on the server, and
only the sections that are meant to run on the workers are distributed.

- There is no longer any need to install or copy any files to the workers manually; all
of that is handled by the API functions. With the previous Copernicus API, the user
was required to copy the application and its dependencies to all workers.

- The user is no longer required to modify the Copernicus server to create Copernicus
modules. The main advantage of this is that the system administrator no longer has
to grant users access to the server, which in turn makes Copernicus more secure.

- There is no need for any prior knowledge about Copernicus, how it works or its
internal design. A user only needs to learn a single new, but simple, function call.
Chapter 7
Discussion
During the implementation of the different designs, many unpredictable problems were
encountered. Some of them could not be solved without changing the Copernicus design,
while other problems needed more consideration than they were initially given. For
example, for the first design attempt we did not know that when Copernicus calls a
module function, that function must return before any other function inside that project
can be called by Copernicus. Another challenge was to decide where to run the user
script and how to handle the communication between the user script and Copernicus. Since
Copernicus distributes the jobs to the workers, and the results from the workers need to
be returned to the user script, the user script must be blocked inside the API function.
If the script were executed on the client machine, a result-fetching function would be
needed on the client side of Copernicus. This design would add overhead to the
workflow of Copernicus and was therefore abandoned.
Most of our shortcomings during the design and implementation were due to the lack of
documentation on how to use Copernicus and on its internal design. If it wasn't for all the
fruitful discussions with Iman Pouya, Sander Pronk and Patrik Falkman, this project would
probably not have been possible in the given time frame.
7.1 Future work
While the current design is general enough for a range of different domains, it lacks two
fundamental functionalities: distributing data files efficiently and distributing executable
binaries. These two functionalities are so common in distributed computing that their
absence would make the API unusable in practice. For such an implementation in the
future, the designer might need to consider how to handle binaries that are compiled for
different architectures. This is an important point, since one big advantage of Copernicus
compared to other distribution platforms is that it is cross-platform.
For efficient file distribution, a simple solution would be that the user gives the name/path of
a file or a folder as an argument when running the Copernicus exec-py command. A
fundamentally different approach would be to design something more like how Hadoop
handles the distribution of files, that is, a large cluster of computers with a distributed file
system with support for redundancy and network localization. Since this second approach
is too complex, the first approach would be a good starting point.
One of the requirements for this project was that the design should be general enough to
be implemented in multiple languages. While the current implementation only supports
Python code, the design is fully portable to other, similar dynamic programming
languages. For other common languages like C/C++ and Java, the user will have to either
make binaries of each function and use a Python wrapper to call each of them, or compile
the code as a shared/dynamic library and call its functions through a Python wrapper.
In both cases the support for running executable binaries, explained above, must first be
implemented.
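As an illustration of the shared-library route (this ctypes sketch is ours, not part of the thesis code; the system C math library stands in for a user-compiled .so file):

```python
import ctypes
import ctypes.util

# Load a C function from a shared library and expose it as an ordinary
# Python callable, which could then be handed to hapi_map like any Python
# function. cos() from the system math library is only a stand-in here.
libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")
libm.cos.argtypes = [ctypes.c_double]
libm.cos.restype = ctypes.c_double

def cos_wrapper(x):
    # The wrapper is what would be serialized and distributed; the shared
    # library itself would still have to be shipped to the workers.
    return libm.cos(x)
```

The comment in the wrapper is the crux: the Python-side wrapper is easy, but shipping the compiled library to workers with possibly different architectures is the missing piece discussed above.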
Another future addition that would be of interest to a Copernicus user is so-called
reducer functionality. At the moment there is only one API function, called hapi_map;
it is a mapper function that takes two arguments: the function that the user wants to run
on each worker and a list of arguments, one for each worker. Support for reducer
functionality could either be added as a collection of standard reducers built into the
API, or as user-defined reducer functions. The user would simply add a function pointer,
or an identifier for one of the standard reducers, as a third argument to the hapi_map
function.
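A local sketch of the proposed extension could look as follows (hapi_map_sketch and its reducer argument are hypothetical; only hapi_map itself exists in the implementation, and real distribution is faked here with a plain loop):

```python
def hapi_map_sketch(func, args_list, reducer=None):
    # The loop stands in for distributing func over the workers.
    results = [func(args) for args in args_list]
    if reducer is not None:
        # Proposed third argument: a standard reducer (e.g. sum, max) or a
        # user-defined function that collapses the per-worker results.
        return reducer(results)
    return results
```

For example, hapi_map_sketch(lambda x: x * x, [1, 2, 3], sum) would return 14 instead of the list [1, 4, 9], so the reduction happens before the result reaches the user script.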
7.2 Conclusion
With the new world of cloud computing and the birth of quantum computing[24], we can
see that the demand for computationally intensive tasks is rapidly increasing.

In this master thesis we have managed to design and implement an easy-to-use API that
consists of a single function call. It not only simplifies the usage of the Copernicus API but
also adds new functionality: users can now distribute sections of their code with a single
function call. We have also shown where the minimum useful job size lies in Copernicus
when using our API. Finally, we have listed many improvements that can be made both to
this API design and to the Copernicus project as a whole. By giving Copernicus this boost,
we believe that the whole field can benefit from it.
Bibliography

[1] Wikipedia. Computer simulation. http://en.wikipedia.org/wiki/Simulation#Computer_simulation, May 2013.
[2] Wikipedia. Message Passing Interface. http://en.wikipedia.org/wiki/Message_Passing_Interface, May 2013.
[3] Wikipedia. Remote procedure call. http://en.wikipedia.org/wiki/Remote_procedure_call, May 2013.
[4] Wikipedia. Java remote method invocation. http://en.wikipedia.org/wiki/Java_remote_method_invocation, May 2013.
[5] Sander Pronk, Per Larsson, Iman Pouya, Gregory R. Bowman, Imran S. Haque, Kyle Beauchamp, Berk Hess, Vijay S. Pande, Peter M. Kasson, and Erik Lindahl. Copernicus: a new paradigm for parallel adaptive molecular dynamics. In Proceedings of the 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, SC '11, pages 60:1–60:10, New York, NY, USA, 2011. ACM.
[6] Intel. Why parallel processing? Why now? What about my legacy code? http://software.intel.com/en-us/blogs/2009/08/31/why-parallel-processing-why-now-what-about-my-legacy-code, August 2009.
[7] Wikipedia. Distributed computing. http://en.wikipedia.org/wiki/Distributed_computing, May 2013.
[8] Jeffrey Dean and Sanjay Ghemawat. MapReduce: simplified data processing on large clusters. Commun. ACM, 51(1):107–113, January 2008.
[9] Google Research. Simplified data processing on large clusters. http://research.google.com/archive/mapreduce.html, May 2013.
[10] Adam L. Beberg, Daniel L. Ensign, Guha Jayachandran, Siraj Khaliq, and Vijay S. Pande. Folding@home: Lessons from eight years of volunteer distributed computing.
In Proceedings of the 2009 IEEE International Symposium on Parallel & Distributed Processing, IPDPS '09, pages 1–8, Washington, DC, USA, 2009. IEEE Computer Society.
[11] Wikipedia. Dictionary attack, a technique for defeating a cipher or a hash to determine its decryption. http://en.wikipedia.org/wiki/Dictionary_attack, May 2013.
[12] Wikipedia. Rainbow tables, a precomputed table for reversing cryptographic hash functions. http://en.wikipedia.org/wiki/Rainbow_table, May 2013.
[13] Stanford University. Protein folding simulation software. http://folding.stanford.edu/, May 2013.
[14] Wikipedia. Protein folding. http://en.wikipedia.org/wiki/Protein_folding, May 2013.
[15] Wikipedia. Copernicus and the Folding@home Markov state model. http://en.wikipedia.org/wiki/Folding@home#Biomedical_research, May 2013.
[16] PiCloud. A distributed computing service. http://www.picloud.com/, May 2013.
[17] Wikipedia. Apache Hadoop, a software framework that supports data-intensive distributed applications. http://en.wikipedia.org/wiki/Apache_Hadoop, May 2013.
[18] Tevfik Kosar. Data Intensive Distributed Computing: Challenges and Solutions for Large-scale Information Management. Information Science Reference, an imprint of IGI Publishing, Hershey, PA, 1st edition, 2011.
[19] Wikipedia. Techila, a distributed computing platform. http://en.wikipedia.org/wiki/Techila_Grid, May 2013.
[20] Wikipedia. OpenMP. http://en.wikipedia.org/wiki/OpenMP, May 2013.
[21] Martin Blais. Snakefood, a Python dependency analyzer. http://furius.ca/snakefood/, May 2013.
[22] Wikipedia. Unix domain socket. http://en.wikipedia.org/wiki/Unix_domain_socket, May 2013.
[23] Python. Read and write Python values in a binary format. http://docs.python.org/2/library/marshal.html, May 2013.
[24] Ars Technica. Google buys a D-Wave quantum optimizer. http://arstechnica.com/science/2013/05/google-buys-a-d-wave-quantum-optimizer/, May 2013.
[25] Wikipedia. Embarrassingly parallel workload. http://en.wikipedia.org/wiki/Embarrassingly_parallel, May 2013.
[26] Xavier Vigouroux. What designs for coming supercomputers? In Proceedings of the Conference on Design, Automation and Test in Europe, DATE '13, page 469, San Jose, CA, USA, 2013. EDA Consortium.
[27] Wikipedia. Peer-to-peer, a distributed application architecture that partitions tasks or workloads between peers. http://en.wikipedia.org/wiki/Peer-to-peer, May 2013.
Appendix A
Link to source code
Link to the Copernicus computing project: http://copernicus-computing.org/