Two allocations of a 16 × 16 array to 16 processes: (a) 2-dimensional blocks; (b) rows

Copyright © 2009 Pearson Education, Inc. Publishing as Pearson Addison-Wesley

Upload: ciara-bowen

Post on 30-Dec-2015



Two allocations of a 16 × 16 array to 16 processes: (a) 2-dimensional blocks; (b) rows.
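
The two allocations can be compared with a small sketch. It assumes a 4 × 4 process grid numbered in row-major order for (a), one row per process for (b), and a 4-point nearest-neighbor stencil as the communication pattern; the function names are hypothetical.

```python
N = 16   # the array is N x N
P = 16   # number of processes

def owner_2d_blocks(i, j, n=N, p_side=4):
    """Owner of element (i, j) when a p_side x p_side process grid holds square blocks."""
    b = n // p_side                      # block edge length: 16 // 4 = 4
    return (i // b) * p_side + (j // b)  # row-major process numbering (assumption)

def owner_rows(i, j, n=N, p=P):
    """Owner of element (i, j) when each process holds n // p whole rows."""
    return i // (n // p)

def nonlocal_values(owner, proc, n=N):
    """Count distinct elements owned by other processes that proc's 4-point stencil reads."""
    needed = set()
    for i in range(n):
        for j in range(n):
            if owner(i, j) != proc:
                continue
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                a, b = i + di, j + dj
                if 0 <= a < n and 0 <= b < n and owner(a, b) != proc:
                    needed.add((a, b))
    return len(needed)

# Process 5 is an interior process in both allocations.
print(nonlocal_values(owner_2d_blocks, 5))
print(nonlocal_values(owner_rows, 5))
```

Under these assumptions an interior block needs fewer non-local values than an interior row, because a square block has a shorter boundary for the same number of elements.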

Overlap regions

• If values from one processor must be communicated to another, then those values are “duplicated” on each processor.

• It is as if that region exists on both processors.
  – Hence, “overlap.”

Overlap regions (gray) show the non-local values; once the overlap regions are filled, the stencil computation is local.
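
A minimal sketch of the idea, simulated in a single Python process. It assumes a row allocation, a 4-point averaging stencil, and that p divides the number of rows; the list copy stands in for the communication that fills the overlap rows, and all names are hypothetical.

```python
def stencil(grid):
    """Replace every interior point by the average of its four neighbors; edges are copied."""
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            out[i][j] = (grid[i - 1][j] + grid[i + 1][j]
                         + grid[i][j - 1] + grid[i][j + 1]) / 4
    return out

def split_with_overlap(grid, p):
    """Give each of p simulated processes its rows plus one overlap row per interior side."""
    n = len(grid)
    rows = n // p                       # assumes p divides n
    pieces = []
    for r in range(p):
        lo, hi = r * rows, (r + 1) * rows
        halo_lo, halo_hi = max(lo - 1, 0), min(hi + 1, n)
        # This copy stands in for the messages that fill the overlap region.
        block = [grid[i][:] for i in range(halo_lo, halo_hi)]
        pieces.append((lo - halo_lo, rows, block))
    return pieces

def local_stencil(pieces):
    """Run the stencil on each piece independently and keep only the owned rows."""
    result = []
    for offset, rows, block in pieces:
        out = stencil(block)
        result.extend(out[offset:offset + rows])
    return result

grid = [[float(i * 8 + j * j) for j in range(8)] for i in range(8)]
print(local_stencil(split_with_overlap(grid, 4)) == stencil(grid))
```

Once each piece holds its overlap rows, no piece reads another piece's data during the stencil step, and the stitched result matches the computation on the whole array.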

Cyclic and Block Allocations

• With trivial data mappings, some algorithms cause some processors to finish before others.

• For example, Gaussian elimination
  – After the first pass, the first column and row are done, and so on.
  – A row, column, or block assignment will leave some processors idle while just a few are working at the end of the process.
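
The idle-processor effect can be seen in a small sketch. It assumes a 16-row matrix on 4 processes and models elimination step k as updating rows k+1 through n-1; the owner functions are hypothetical illustrations of a block-of-rows mapping and a cyclic row mapping.

```python
N, P = 16, 4   # matrix rows and number of processes (illustrative values)

def block_row_owner(i, n=N, p=P):
    """Contiguous blocks of n // p rows per process."""
    return i // (n // p)

def cyclic_row_owner(i, n=N, p=P):
    """Rows dealt out round-robin."""
    return i % p

def busy(owner, k, n=N):
    """Processes owning at least one row that step k still updates (rows k+1 .. n-1)."""
    return sorted({owner(i) for i in range(k + 1, n)})

for k in (0, 4, 8, 12):
    print(k, busy(block_row_owner, k), busy(cyclic_row_owner, k))
```

With the block mapping the set of busy processes shrinks steadily as k grows, while the cyclic mapping keeps most processes busy until the last few steps.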

(a) LU decomposition algorithm; (b) 16 processes arranged in a grid; (c) the allocation of the array elements to processes.

Illustration of a cyclic distribution of an 8 × 8 array onto five processes.
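
One way such a distribution could be computed is sketched below. It assumes elements are dealt to processes round-robin in row-major order, which may differ from the scheme in the figure; `cyclic_owner` is a hypothetical name.

```python
from collections import Counter

N, P = 8, 5   # 8 x 8 array, five processes

def cyclic_owner(i, j, n=N, p=P):
    """Deal elements out round-robin in row-major order (assumed scheme)."""
    return (i * n + j) % p

grid = [[cyclic_owner(i, j) for j in range(N)] for i in range(N)]
for row in grid:
    print(" ".join(str(v) for v in row))

# Number of elements each process receives.
counts = Counter(v for row in grid for v in row)
print(sorted(counts.items()))
```

Because 64 is not a multiple of 5, the processes cannot receive exactly equal shares, but under this scheme the counts differ by at most one element.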

Block-cyclic allocation of 3 × 2 blocks to a 14 × 14 array distributed to four processes (colors).
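
A sketch of a block-cyclic owner function for this case. It assumes the blocks are 3 rows by 2 columns and that the four processes form a 2 × 2 grid numbered in row-major order; neither detail is stated in the caption, and the names are hypothetical.

```python
from collections import Counter

N = 14            # the array is 14 x 14
BR, BC = 3, 2     # block shape: 3 rows by 2 columns (assumed orientation)
PR, PC = 2, 2     # four processes as a 2 x 2 grid (assumption)

def block_cyclic_owner(i, j):
    """Blocks are dealt cyclically along each dimension of the process grid."""
    return ((i // BR) % PR) * PC + ((j // BC) % PC)

for i in range(N):
    print(" ".join(str(block_cyclic_owner(i, j)) for j in range(N)))

# Number of elements each process receives.
counts = Counter(block_cyclic_owner(i, j) for i in range(N) for j in range(N))
print(sorted(counts.items()))
```

Since 14 is not a multiple of the block dimensions, the blocks on the bottom and right edges are truncated and the per-process totals are unequal.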

The block-cyclic allocation midway through the computation; the blocks to the right summarize the active values for each process.
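
The per-process summary can be approximated with a short sketch. It assumes the active region midway through is the lower-right 7 × 7 corner of the 14 × 14 array, 3 × 2 blocks, and a 2 × 2 process grid; a plain block allocation is included only for contrast, and all names are hypothetical.

```python
from collections import Counter

N = 14

def block_cyclic_owner(i, j, br=3, bc=2, pr=2, pc=2):
    """3 x 2 blocks dealt cyclically over a 2 x 2 process grid (assumed layout)."""
    return ((i // br) % pr) * pc + ((j // bc) % pc)

def block_owner(i, j, n=N, pr=2, pc=2):
    """One contiguous 7 x 7 block per process, shown for comparison."""
    return (i // (n // pr)) * pc + (j // (n // pc))

def active_counts(owner, k, n=N):
    """Active elements per process when rows and columns k .. n-1 remain."""
    c = Counter(owner(i, j) for i in range(k, n) for j in range(k, n))
    return [c[p] for p in range(4)]

print(active_counts(block_cyclic_owner, 7))
print(active_counts(block_owner, 7))
```

Under these assumptions every process still holds active values with the block-cyclic allocation, whereas the plain block allocation leaves all remaining work on a single process.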

Example of an unstructured grid representing the pressure distribution on two airfoils. Image from http://fun3d.larc.nasa.gov/example-24.html.

Cap allocation for a binary tree on P = 8 processes. Each process is allocated one of the leaf subtrees, along with a copy of the cap (shaded).
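
A sketch of how a node could be mapped to its owner under a cap allocation. It assumes a complete binary tree stored with heap numbering (root 1, children 2i and 2i + 1) and P a power of two, so the cap is nodes 1 to P - 1 and the subtree roots are nodes P to 2P - 1; the numbering and the function name are assumptions.

```python
def cap_owner(node, p=8):
    """Return None for a cap node (copied to every process), else the owning process."""
    if node < p:
        return None                 # nodes 1 .. p-1 form the replicated cap
    while node >= 2 * p:
        node //= 2                  # climb to the ancestor at the subtree-root level
    return node - p                 # subtree roots p .. 2p-1 map to processes 0 .. p-1

for node in (1, 7, 8, 15, 19, 31):
    print(node, cap_owner(node))
```

Each process can walk from the root through its own copy of the cap and then continue into its subtree without contacting another process.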

Logical tree representations: (a) a binary tree where P = 8; (b) a binary tree where P = 6. This is a useful solution when the tree is known at the beginning of the computation.

Enumerating the Tic-Tac-Toe game tree; a process is assigned to search the games beginning with each of the four initial move sequences. Symmetric positions are redundant.
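
The redundancy from symmetry can be illustrated with a sketch that reduces each board to a canonical form under the eight rotations and reflections of the square. The board encoding and function names are hypothetical, and win detection is omitted, so the count is only meaningful for the first few moves.

```python
def rotate(b):
    """Rotate the board 90 degrees: new cell (r, c) takes old cell (2 - c, r)."""
    return tuple(b[(2 - c) * 3 + r] for r in range(3) for c in range(3))

def reflect(b):
    """Mirror the board left to right."""
    return tuple(b[r * 3 + (2 - c)] for r in range(3) for c in range(3))

def canonical(b):
    """Smallest of the 8 symmetric variants, used as the representative position."""
    variants = []
    for _ in range(4):
        b = rotate(b)
        variants.append(b)
        variants.append(reflect(b))
    return min(variants)

def distinct_positions(moves):
    """Count positions after the given number of moves, up to symmetry (no win checks)."""
    frontier = {canonical((" ",) * 9)}
    for m in range(moves):
        mark = "X" if m % 2 == 0 else "O"
        nxt = set()
        for b in frontier:
            for i in range(9):
                if b[i] == " ":
                    nxt.add(canonical(b[:i] + (mark,) + b[i + 1:]))
        frontier = nxt
    return len(frontier)

print(distinct_positions(1), distinct_positions(2))
```

Searching only one representative per symmetry class avoids assigning processes to subtrees that would repeat the same work.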