On Parallel Computation of exp(x) Based on Master-Worker Paradigm

Post on 21-Jan-2016



On Parallel Computation of exp(x) based on Master-Worker Paradigm

Keiichi Shiraishi (Kagawa N.C.T.)

Yoshiro Imai (Kagawa University)

Welcome to Japan

Welcome to Takuma Campus

I’m looking forward to discussing Computer Science.

BACKGROUNDS

Processors: higher clock rates --> multi-core

Multi-core processors are not expensive: Core i7, Cell, GPU, etc.

PC clusters, supercomputers and cloud computing are all forms of parallel computing.

Using multi-core processors effectively is important.

Teaching materials for parallel computing

And… my research areas: Computer Algebra System, e-Learning and Instructional Design

CONTENTS

Parallel computation of exp(x)

Embarrassingly parallel computation and master-worker paradigm

How to compute/parallelize exp(x)

Algorithms

Experiments

Discussions

Conclusion

EMBARRASSINGLY PARALLEL COMPUTATIONS

Embarrassingly parallel computations: no dependency or communication exists between parallel tasks.

The master-worker paradigm is suitable.

[Figure: master-worker diagram with the labels "Tasks", "Results" and "Computing/Processing"]

NUMERICAL APPROXIMATION OF exp(x)

Equation (3) is divided into M groups of terms, and each group is allocated to one worker process.

[Figure: the terms are split into groups assigned to Worker 1, Worker 2, Worker 3, ..., Worker M]

NUMERICAL APPROXIMATION OF exp(x)

In these groups, the last term of each group appears in the coefficients of the groups that follow it; e.g. x^(L-1)/(L-1)! appears in all the following groups.
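The grouping can be written out as follows. The slides' own equations did not survive in this transcript, so this is a reconstruction under assumptions: equation (3) is taken to be the Taylor series of exp(x) truncated to N terms, and N = ML so that each of the M groups G_k (a name introduced here) holds L consecutive terms.

```latex
\exp(x) \approx \sum_{n=0}^{N-1} \frac{x^n}{n!}
        = \sum_{k=0}^{M-1} G_k,
\qquad
G_k = \sum_{j=0}^{L-1} \frac{x^{kL+j}}{(kL+j)!},
\qquad N = ML.

% For k >= 1, factor out the last term of group k-1:
G_k = \frac{x^{kL-1}}{(kL-1)!}
      \sum_{j=0}^{L-1} \frac{x^{j+1}}{kL\,(kL+1)\cdots(kL+j)}.
```

The inner sum depends only on k, L and x, so a worker can compute it without waiting for the other groups. Only the leading factor ties the groups together.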

PARALLEL COMPUTATION OF exp(x) (MASTER)

PARALLEL COMPUTATION OF exp(x) (WORKER)
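The master and worker listings from these two slides are not present in this transcript. The following Python sketch is therefore not the authors' Risa/Asir code; it only illustrates one way the split could work, under these assumptions: the series has N terms (n = 0 to N-1), N is a multiple of M, threads stand in for worker processes, and the shared prefix is taken as x^(kL)/(kL)! so that the first group needs no special case. The names `worker` and `master` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from fractions import Fraction


def worker(x, k, L):
    """Handle group k, i.e. terms n = k*L .. k*L + L - 1 of sum x**n / n!.

    Returns (s, p): s is the group's sum with the prefix x**(kL)/(kL)!
    factored out, and p = x**L / ((kL+1)...(kL+L)) carries the prefix
    forward to the next group.
    """
    term = Fraction(1)  # x**j / ((kL+1)...(kL+j)) for j = 0
    s = Fraction(0)
    for j in range(L):
        s += term
        term = term * x / (k * L + j + 1)
    return s, term


def master(x, N, M):
    """Approximate exp(x) with N series terms split over M workers."""
    assert N % M == 0, "this sketch assumes N is a multiple of M"
    L = N // M
    x = Fraction(x)
    # Threads are a stand-in for the worker processes of the slides.
    with ThreadPoolExecutor(max_workers=M) as pool:
        parts = list(pool.map(lambda k: worker(x, k, L), range(M)))
    total = Fraction(0)
    prefix = Fraction(1)  # x**(kL) / (kL)! for the current group
    for s, p in parts:
        total += prefix * s
        prefix *= p
    return total


if __name__ == "__main__":
    print(float(master(1, 20, 4)))
```

Each worker returns exact rationals whose size depends only on its own group, which is consistent with the slides' observation that the returned numerators and denominators shrink as more workers are used.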

TEST-BED OF EXPERIMENTS

PC

Processor: Intel(R) Core(TM) i7 860 2.80GHz, 4 cores (8 HT)

Memory: 3GB

OS: FreeBSD/i386 8.0-RELEASE

Computer Algebra System: Risa/Asir 20070806

Number of workers: 1 to 8

Computation uses multiple-precision integer/rational numbers (very slow).

ELAPSED TIME (exp(1), N=1000)

[Figure: elapsed time of the sequential and parallel computations against the number of workers]

As the number of workers increases, the number of digits of the return value decreases.

Communication time is needed.

MAXIMUM NUMBER OF DIGITS OF NUMERATOR/DENOMINATOR OF RETURN VALUE FROM WORKERS

If the number of digits doubles, a multiplication needs about 4 times as much work.

Number of worker processes    Maximum number of digits of numerator/denominator of return value
1                             2565
2                             1437
3                             969
4                             735
5                             594
6                             492
7                             425
8                             375
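To make the "twice the digits, 4 times the work" remark concrete, the sketch below estimates the relative cost of one multiplication for each row of the table. It assumes cost grows with the square of the digit count, as the slide states; whether Risa/Asir actually multiplies this way at these sizes is not confirmed by the source. The function name is hypothetical.

```python
# Maximum digits of the numerator/denominator returned by a worker,
# copied from the table above.
max_digits = {1: 2565, 2: 1437, 3: 969, 4: 735, 5: 594, 6: 492, 7: 425, 8: 375}


def relative_multiplication_cost(digits, baseline):
    """Cost of one multiplication relative to the baseline operand size,
    assuming cost is proportional to the square of the digit count."""
    return (digits / baseline) ** 2


for workers, digits in max_digits.items():
    cost = relative_multiplication_cost(digits, max_digits[1])
    print(f"{workers} worker(s): {digits} digits, relative cost {cost:.3f}")
```

Under this assumption, shorter operands alone make each worker's arithmetic cheaper, in addition to each worker handling fewer terms.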

DISCUSSIONS

From the viewpoint of numerical computation, a computation with N=1000 isn’t needed, because about N=10 already gives sufficiently accurate results.

There are other numerical approaches, e.g. the C standard library’s exp(x) and Stirling's approximation for n!.

As teaching material, this approach would be good because the tasks allocated to the worker processes have some dependencies. It is more difficult than computing the circle ratio (π).
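The remark about N can be checked with a short experiment. This sketch sums the first `n_terms` terms of the series exactly and compares the result with Python's `math.exp`, which here plays the role of the C standard library's exp(x). The function name `exp_series` and the chosen term counts are illustrative assumptions.

```python
import math
from fractions import Fraction


def exp_series(x, n_terms):
    """Sum the first n_terms terms (n = 0 .. n_terms-1) of sum x**n / n!."""
    x = Fraction(x)
    term = Fraction(1)
    total = Fraction(0)
    for n in range(n_terms):
        total += term
        term = term * x / (n + 1)
    return total


for n_terms in (5, 10, 15, 20):
    error = abs(float(exp_series(1, n_terms)) - math.exp(1))
    print(f"N={n_terms}: absolute error {error:.3e}")
```

For x = 1 the truncation error after 10 terms is on the order of 1/10!, so a few more terms are enough to reach double precision, far short of N=1000.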

CONCLUSIONS

Parallel computation of exp(x) is illustrated.

As the number of worker processes increases, the completion time decreases.

As the number of digits of the numerator/denominator of the return value from the worker processes decreases, the completion time decreases.

Future work: evaluating the problem as teaching material

exp(x) by C STANDARD LIBRARY

SEQUENTIAL COMPUTATION OF exp(x)

NUMERICAL APPROXIMATION OF THE CIRCLE RATIO

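\[ \pi = \int_0^1 \frac{4}{1+x^2}\,dx \approx \frac{4}{N} \sum_{i=0}^{N-1} \frac{1}{1+((i+0.5)/N)^2} \]

\[ \frac{4}{100} \sum_{i=0}^{99} \frac{1}{1+((i+0.5)/100)^2} = \frac{4}{100} \left( \sum_{i=0}^{49} \frac{1}{1+((i+0.5)/100)^2} + \sum_{i=50}^{99} \frac{1}{1+((i+0.5)/100)^2} \right) \]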

N=100, # of workers is 2. The tasks are independent of each other, so they can be allocated to 2 workers.
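The independence of the tasks can be seen in a small Python sketch of the midpoint-rule sum for π = ∫ 4/(1+x²) dx on [0, 1]. The function names are hypothetical, and the partial sums are evaluated one after another here; each `partial_sum` call needs nothing from the others, so in a real run each could be sent to a separate worker.

```python
import math


def partial_sum(n, start, stop):
    """Sum 1 / (1 + ((i + 0.5) / n)**2) for i = start .. stop-1."""
    return sum(1.0 / (1.0 + ((i + 0.5) / n) ** 2) for i in range(start, stop))


def approximate_pi(n, workers):
    """Split the n terms into contiguous ranges, one per worker, and combine."""
    bounds = [n * w // workers for w in range(workers + 1)]
    parts = [partial_sum(n, bounds[w], bounds[w + 1]) for w in range(workers)]
    return 4.0 / n * sum(parts)


if __name__ == "__main__":
    print(approximate_pi(100, 2), math.pi)
```

With N=100 and 2 workers the ranges are i = 0..49 and i = 50..99, matching the split on the slide. Unlike the exp(x) groups, no factor has to be passed from one range to the next.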

TEST-BED OF EXPERIMENTS

PC

Processor: Xeon X3210 2.13GHz, 4 cores

Memory: 3GB

OS: FreeBSD/amd64 7.1-RELEASE

CAS: Risa/Asir 20070806

PLAYSTATION3

Processor: Cell B.E. 3.2GHz, PPE, 7 SPEs, SPE Local Store 256KB, EIB 307.2GB/s

Memory: 256MB

OS: Yellow Dog Linux 6.2

Library: libspe-2.2.80-132

ELAPSED TIME (N=50,000,000)

Elapsed Time (Risa/Asir)

# of workers       1        2        3        4
Elapsed time[s]    37.902   19.627   13.787   11.056
Speedup ratio*     1        1.931    2.749    3.428

Elapsed Time (SPE Library)

# of workers       1        2        3        4        5        6
Elapsed time[s]    6.037    3.023    2.019    1.518    1.217    1.017
Speedup ratio*     1        1.997    2.990    3.978    4.959    5.934

* Speedup ratio is the ratio of the elapsed time with 1 worker to that with n workers.
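The speedup ratios can be recomputed from the elapsed times with a few lines of Python. The sketch assumes the 4-worker series belongs to Risa/Asir and the 6-worker series to the SPE library, following the order of the captions. The efficiency column (speedup divided by number of workers) is an addition for illustration and does not appear in the slides.

```python
def speedup(times):
    """Speedup of each run relative to the 1-worker run (first entry)."""
    base = times[0]
    return [base / t for t in times]


# Elapsed times in seconds for 1, 2, 3, ... workers, copied from the tables.
asir_times = [37.902, 19.627, 13.787, 11.056]
spe_times = [6.037, 3.023, 2.019, 1.518, 1.217, 1.017]

for name, times in (("Risa/Asir", asir_times), ("SPE library", spe_times)):
    for n, s in enumerate(speedup(times), start=1):
        print(f"{name}: {n} worker(s), speedup {s:.3f}, efficiency {s / n:.3f}")
```

Because the tabulated times are rounded to milliseconds, the recomputed ratios can differ from the slide's values in the last digit.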
