pp assignment
8/2/2019 PP Assignment
What are multiprocessors? What are the different types of multiprocessors?
Multiprocessors are computing systems in which all programs share a single address space. This may be achieved by use of a single memory or a collection of memory modules that are closely connected and addressable as a single unit. All programs running on such a system communicate via shared variables in memory.
Types of Multiprocessors:
There are two major variants of multiprocessors: UMA and NUMA.
In UMA (Uniform Memory Access) multiprocessors, often called SMP
(Symmetric Multiprocessors), each processor takes the same amount of
time to access every memory location. This property may be enforced by
use of memory delays.
In NUMA (Non-Uniform Memory Access) multiprocessors, some memory
accesses are faster than others. This model presents interesting
challenges to the programmer, because access time depends on where data
is placed and race conditions remain a real possibility, but it offers
increased performance.
What are PRAM algorithms? What are the different types of
PRAM algorithms?
PRAM (Parallel Random Access Machine) algorithms are parallel algorithms
designed for a model in which many processors share one memory. The model
applies when the memory resource is single but many processors are trying
to access it concurrently.
There are four types of PRAM algorithms:
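EREW (Exclusive Read, Exclusive Write)
ERCW (Exclusive Read, Concurrent Write)
CREW (Concurrent Read, Exclusive Write)
CRCW (Concurrent Read, Concurrent Write)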
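As a rough illustration of what separates the four variants, the sketch below labels a single step of memory accesses by checking whether any address is read or written by more than one processor. The function name `classify_step` and the list-of-addresses input format are assumptions made for this example only.

```python
from collections import Counter

def classify_step(reads, writes):
    """Name the PRAM variant needed for one step of memory accesses.

    reads and writes are lists of memory addresses, one entry per
    processor access made during the same step (hypothetical format).
    """
    # A concurrent access means the same address appears more than once.
    concurrent_read = any(count > 1 for count in Counter(reads).values())
    concurrent_write = any(count > 1 for count in Counter(writes).values())
    read_part = "CR" if concurrent_read else "ER"
    write_part = "CW" if concurrent_write else "EW"
    return read_part + write_part
```

For instance, two processors reading address 0 while writing to different addresses would be labelled CREW.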
Explain Berkeley's algorithm for clock synchronization.
The Berkeley algorithm, developed by Gusella and Zatti in 1989,
does not assume that any machine has an accurate time source with
which to synchronize.
Instead, it opts for obtaining an average time from the participating
computers and synchronizing all machines to that average.
The machines involved in the synchronization each run a time
daemon process that is responsible for implementing the protocol.
One of these machines is elected (or designated) to be the master.
The others are slaves. The master polls each machine periodically,
asking it for the time. The time at each machine may be estimated by
using Cristian's method to account for network delays. When all the
results are in, the master computes the average time (including its
own time in the calculation). The hope is that the average cancels out
the individual clocks' tendencies to run fast or slow. The master then
sends each machine the offset by which its clock should be adjusted.
The algorithm also has provisions to ignore readings from clocks
whose skew is too great. The master may compute a fault-tolerant
average by averaging only the values from machines whose clocks have
not drifted by more than a certain amount. If the master machine fails,
any slave can be elected to take over.
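The averaging step can be sketched as follows. This is a minimal illustration, not the original implementation: the function name `berkeley_adjustments`, the `max_skew` parameter, and the choice to measure skew relative to the master's own clock are all assumptions, and the readings are assumed to be already corrected for network delay.

```python
def berkeley_adjustments(master_time, slave_times, max_skew):
    """Return (average, offsets) for one synchronization round.

    offsets[0] is the adjustment for the master and offsets[i + 1]
    the adjustment for slave i. Times are plain numbers, for
    example minutes since midnight.
    """
    readings = [master_time] + list(slave_times)
    # Assumption: a reading is ignored when it differs from the
    # master's clock by more than max_skew.
    trusted = [t for t in readings if abs(t - master_time) <= max_skew]
    average = sum(trusted) / len(trusted)
    # Every machine, trusted or not, is told how far to move its clock.
    offsets = [average - t for t in readings]
    return average, offsets
```

With a master at 180 and slaves at 205 and 170 (minutes), the average is 185, so the master advances by 5, the first slave goes back by 20 and the second advances by 15.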
Explain the pipeline architecture. How can speedup be
calculated for a pipeline architecture? How are the different
pipeline architectures classified? Show an example of each.
Pipelining is one form of embedding parallelism or concurrency in a
computer system. It refers to a segmentation of a computational
process (say, an instruction) into several sub processes which are
executed by dedicated autonomous units (facilities, pipelining
segments). Successive processes (instructions) can be carried out in
an overlapped mode analogous to an industrial assembly line. So,
very loosely, pipelining can be defined as the technique of
decomposing a repeated sequential process into sub processes,
each of which can be executed efficiently on a special dedicated
autonomous module that operates concurrently with the others.
Speedup calculation:
Time to process 1 segment = t
Number of segments = n
Number of sub-blocks for each segment = k
Time to process each sub-block = t/k
Time without pipeline = n*t
Time with pipeline = (n+k-1)*t/k
For example, if n = 4, t = 5, k = 5:
Non-pipeline = 4*5 = 20
Pipeline = (4+5-1)*5/5 = 8
Ratio = 20/8 = 2.5 factor improvement.
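The same arithmetic can be checked with a short sketch. It assumes that each of the k sub-blocks takes t/k, so the pipelined time is (n+k-1)*t/k; the function name `pipeline_speedup` is made up for this example.

```python
def pipeline_speedup(n, t, k):
    """Return (non_pipelined_time, pipelined_time, speedup).

    n: number of segments, t: time per segment without pipelining,
    k: number of sub-blocks (stages) per segment.
    """
    non_pipelined = n * t
    stage_time = t / k  # assumption: all sub-blocks take equal time
    # The first segment needs k stage times; each later one adds one more.
    pipelined = (n + k - 1) * stage_time
    return non_pipelined, pipelined, non_pipelined / pipelined

print(pipeline_speedup(4, 5, 5))  # (20, 8.0, 2.5)
```

Under this model the speedup equals n*k/(n+k-1), which approaches k as the number of segments n grows.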
Different types of pipelines:
There are mainly two types of pipeline architecture.
Linear architecture
Two types: Synchronous (ex: conventional microprocessors)
and Asynchronous (ex: AMULET microprocessor)
Non-linear architecture
What are AVL trees? Explain Ellis's parallel algorithm for
searching and insertion in an AVL tree. Is Ellis's algorithm
control parallel or data parallel?
An AVL tree is a self-balancing binary search tree. Named after
their inventors, Adelson-Velskii and Landis, AVL trees were the first
dynamically balanced trees to be proposed. Like red-black trees, they
are not perfectly balanced, but the two sub-trees of any node differ in
height by at most 1, maintaining an O(log n) search time. Insertion and
deletion operations also take O(log n) time.
An AVL tree is a binary search tree which has the following
properties:
The sub-trees of every node differ in height by at most one.
Every sub-tree is an AVL tree.
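The two properties above can be checked recursively. The sketch below verifies only the height condition and assumes the keys already satisfy the binary search tree ordering; the `Node` class and the function name `check_avl` are illustrative, not taken from the assignment.

```python
class Node:
    """A minimal binary tree node (hypothetical helper type)."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def check_avl(node):
    """Return (is_balanced, height); an empty tree has height 0."""
    if node is None:
        return True, 0
    left_ok, left_height = check_avl(node.left)
    right_ok, right_height = check_avl(node.right)
    # Both sub-trees must be AVL and differ in height by at most one.
    balanced = left_ok and right_ok and abs(left_height - right_height) <= 1
    return balanced, 1 + max(left_height, right_height)
```

A three-node tree with a root and two leaves passes the check, while a chain of three right children fails because the root's sub-tree heights are 0 and 2.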
Ellis's algorithm:
Input: Array A[n] of keys (integers)
Step 1: Sort the array by constructing a binary search tree out of all the
integers.
Step 2: Convert the binary search tree from step 1 to an AVL tree.
Step 3: Perform the various dictionary operations like
Search
Insert
Delete
All the processors have access to the shared data structure, that
is, the AVL tree constructed in step 2.
The operations that need to be performed are partitioned and
executed in parallel.
Ellis's algorithm is a control parallel algorithm.
Differentiate between SIMD, SPMD and MIMD programming?
SIMD: Here every processor does exactly the same thing, because the
same instruction is given to each processor. The data are divided into
segments and sent to the processors, and each processor does exactly
the same type of work on its own data. Only simple hardware is needed
to implement this.
MIMD: Here traditional parallel processing is done. Each
processor may get different control information, that is, the
processors may receive different instructions on different sets of
data. So all N processors do their work individually. More complex
hardware is needed.
SPMD: SPMD (single program, multiple data) is a technique
employed to achieve parallelism; it is a subcategory of MIMD. Tasks
are split up and run simultaneously on multiple processors with
different input in order to obtain results faster. SPMD is the most
common style of parallel programming.
Here multiple autonomous processors simultaneously execute the
same program at independent points, each on different data, rather
than in the lockstep that SIMD imposes. With SPMD, tasks can be executed
on general purpose CPUs; SIMD requires vector processors to
manipulate data streams.
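The difference in structure can be sketched in a few lines. This is only an illustration of the programming styles, not of real hardware: `simd_add`, `spmd_worker` and `spmd_sum` are made-up names, the SIMD case is modelled as one operation applied across all elements, and the SPMD case uses a thread pool in which every worker runs the same function on the slice selected by its rank.

```python
from concurrent.futures import ThreadPoolExecutor

def simd_add(xs, ys):
    """SIMD style: one instruction (add) applied to every data element."""
    return [x + y for x, y in zip(xs, ys)]

def spmd_worker(rank, size, data):
    """SPMD style: the same program runs in every worker, but each
    worker proceeds independently on the slice chosen by its rank."""
    chunk = data[rank::size]
    return sum(chunk)

def spmd_sum(data, size=4):
    """Launch `size` copies of spmd_worker and combine their results."""
    with ThreadPoolExecutor(max_workers=size) as pool:
        partials = list(pool.map(lambda r: spmd_worker(r, size, data),
                                 range(size)))
    return sum(partials)
```

Python threads are used here only to show the shape of an SPMD program; they are not meant as a performance example.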
Explain the parallel algorithm for Odd-Even reduction algorithm
to solve a system of linear equations.
In parallel computing, an odd-even sort or odd-even transposition sort (also known as brick sort) is a relatively simple sorting algorithm, developed originally for use on parallel processors with local interconnections. It functions by comparing all (odd, even)-indexed pairs of adjacent elements in the list and, if a pair is in the wrong order (the first is larger than the second), switching the elements. The next step repeats this for (even, odd)-indexed pairs of adjacent elements. It then alternates between (odd, even) and (even, odd) steps until the list is sorted.
On parallel processors, with one value per processor and only local left-right neighbour connections, the processors all concurrently do a compare-exchange operation with their neighbours, alternating between odd-even and even-odd pairings.
The algorithm extends efficiently to the case of multiple items per processor. In the Baudet-Stevenson odd-even merge-splitting algorithm, each processor sorts its own sublist at each step, using any efficient sort algorithm, and then performs a merge-splitting, or transposition-merge, operation with its neighbour, with neighbour pairing alternating between odd-even and even-odd on each step.
Algorithm
The single-processor algorithm, like bubble sort, is simple but not very efficient. Here a zero-based index is assumed:
Assume a[] is an array of values to be sorted.
var sorted = false;
while (!sorted) {
    sorted = true;
    // compare pairs starting at odd indices
    for (var i = 1; i < a.length - 1; i += 2) {
        if (a[i] > a[i+1]) {
            swap(a, i, i+1);
            sorted = false;
        }
    }
    // compare pairs starting at even indices
    for (var i = 0; i < a.length - 1; i += 2) {
        if (a[i] > a[i+1]) {
            swap(a, i, i+1);
            sorted = false;
        }
    }
}
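The parallel version can be simulated phase by phase. In the sketch below, all compare-exchanges inside one phase touch disjoint pairs, so on a real machine each pair could be handled by its own processor at the same time; here they simply run in a loop. The function name is an assumption, and the code relies on the known result that n phases are enough to sort n values.

```python
def odd_even_transposition_sort(values):
    """Sort by alternating even-odd and odd-even compare-exchange phases."""
    a = list(values)
    n = len(a)
    for phase in range(n):
        # Even phases pair (0,1), (2,3), ...; odd phases pair (1,2), (3,4), ...
        start = phase % 2
        # The pairs in this loop are independent of one another, which is
        # what allows them to be executed concurrently.
        for i in range(start, n - 1, 2):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a
```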
Write parallel algorithms for the following:
(i) P-Depth search
(ii) Breadth first search
(i) P-Depth search
visit a Vertex from [Vertex|Other_vertices] = Unvisited_neighbours,
add this Vertex to the current path;
send {self(), wait} to the 'collector' process;
run *dfs_mod* for Unvisited neighbours of the current Vertex in a new
process;
continue running *dfs_mod* with the rest of the provided vertices
(Other_vertices);
when there are no more vertices to visit - send {self(), done} to the
collector process and terminate;
(ii) Breadth first search
for each vertex u ∈ V(G) \ {v0}
    u.dist = ∞
v0.dist = 0
Q = {v0}
while Q ≠ ∅
    u = DEQUEUE(Q)
    for each v ∈ V such that (u, v) ∈ E(G)
        if v.dist == ∞
            v.dist = u.dist + 1
            ENQUEUE(Q, v)
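The pseudocode above is queue based. One common way to expose parallelism, shown here as a sketch and not as the assignment's own method, is to process a whole frontier (all vertices at the same distance) in one step, since the vertices of a frontier can be expanded independently. The function name `bfs_levels` and the adjacency-dictionary input are assumptions; the loop over the frontier is executed sequentially here.

```python
def bfs_levels(adj, source):
    """Level-synchronous BFS: return the distance of every vertex from source.

    adj maps each vertex to a list of its neighbours.
    """
    INF = float("inf")
    dist = {v: INF for v in adj}
    dist[source] = 0
    frontier = [source]
    level = 0
    while frontier:
        level += 1
        next_frontier = []
        # Each vertex of the frontier could be given to a different processor.
        for u in frontier:
            for v in adj[u]:
                if dist[v] == INF:
                    dist[v] = level
                    next_frontier.append(v)
        frontier = next_frontier
    return dist
```

A truly parallel version would also need to make the test-and-set of `dist[v]` safe when two processors reach the same vertex at once.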
Describe the parallel version of Sollin's algorithm for Minimum
cost Spanning tree construction. Do a complexity analysis of the
algorithm.
Sollin's algorithm:
It is a graph algorithm for finding the Minimum Spanning Tree in
parallel. Here we take pairs of nodes and assign them to P processors.
We join a pair of nodes to form a tree. We have to remember,
though, that once two nodes are joined they become a tree,
so from then on we connect a tree to another tree, not a node to
another node. We choose the minimum edge to make each connection. We
stop when all the nodes are connected, making sure that no cycle is
created. After joining all the nodes we get the Minimum Spanning Tree.
This algorithm keeps a forest of minimum spanning trees which it
continuously connects via the least cost arc leaving each tree. To
begin, each node is made its own minimum spanning tree. From here
the shortest arc leaving each tree (which doesn't connect to a node
already belonging to the current tree) is added, along with the minimum
spanning tree it connects to. This continues until there exists a single
spanning tree of the entire graph.
The complexity of the algorithm is O(m log n), where m is the number of
edges and n the number of nodes: each round at least halves the number
of trees, so there are at most log n rounds, and each round examines up
to m edges.
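The description above can be sketched sequentially as follows. This is an illustrative version under stated assumptions, not the parallel implementation: vertices are numbered 0 to n-1, edges are `(weight, u, v)` tuples with distinct weights, and a simple union-find structure (an implementation choice of this sketch) tracks which tree each vertex belongs to. In a parallel setting, the search for the cheapest edge leaving each tree is the part that would be divided among the processors.

```python
def sollin_mst(n, edges):
    """Return the MST edges of a connected graph as a sorted list."""
    parent = list(range(n))

    def find(x):
        # Follow parent links to the representative of x's tree.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    components = n
    while components > 1:
        # Cheapest edge leaving each tree, keyed by tree representative.
        cheapest = {}
        for w, u, v in edges:
            ru, rv = find(u), find(v)
            if ru == rv:
                continue  # edge stays inside one tree
            for r in (ru, rv):
                if r not in cheapest or w < cheapest[r][0]:
                    cheapest[r] = (w, u, v)
        if not cheapest:
            break  # graph is not connected
        for w, u, v in cheapest.values():
            ru, rv = find(u), find(v)
            if ru != rv:  # skip edges already used to merge these trees
                parent[ru] = rv
                tree.append((w, u, v))
                components -= 1
    return sorted(tree)
```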
Algorithm
Begin with a connected graph G containing edges of distinct weights,
and an empty set of edges T
While the vertices of G connected by T are disjoint:
    Begin with an empty set of edges E
    For each component:
        Begin with an empty set of edges S
        For each vertex in the component:
between algorithms that are merely waiting for a very unlikely set of
circumstances to occur and algorithms that will never finish because
of deadlock.
On the basis of matrix multiplication, we propose a deadlock
detection algorithm. The basic idea in this algorithm is iteratively
reducing the matrix by dividing the matrix in smaller parts.