Christopher Brown and Kevin Hammond School of Computer Science, University of St. Andrews July 2010 Ever Decreasing Circles: Implementing a skeleton for general Parallel Orbit calculations for use in SymGrid-Par.


Page 2: A General Orbit Calculation

Explore a solution space given:

An initial set of values;

A set of generators.

Used in computational algebra:

Symmetry of solutions: chemistry, quantum physics, etc.

Rubik’s Cube (permutations).

Sequential implementations already exist, but there are concerns about their performance.

Page 3: The Orbit Calculation

Starting set: {1}

    f :: Int -> Int
    f x = (x+1) `mod` 4

Page 4: The Orbit Calculation

f 1 gives 2. Accumulating set: {1, 2}

Page 5: The Orbit Calculation

f 2 gives 3. Accumulating set: {1, 2, 3}

Page 6: The Orbit Calculation

f 3 gives 0. Accumulating set: {1, 2, 3, 0}

Page 7: The Orbit Calculation

f 0 gives 1, which is already in the set, so nothing new is added. Accumulating set: {1, 2, 3, 0}
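The walk-through on pages 3 to 7 can be reproduced with a few lines of Haskell. This is a minimal sketch for a single generator only; the name `orbitOf` is illustrative and does not appear in the slides.

```haskell
-- Sketch only: `orbitOf` is a hypothetical helper, not code from the talk.
-- It repeatedly applies one generator until the image is already known.
orbitOf :: Eq a => (a -> a) -> a -> [a]
orbitOf g start = go [start] start
  where
    -- `seen` is the accumulating set, kept in discovery order.
    go seen x
      | img `elem` seen = seen                  -- image already known: stop
      | otherwise       = go (seen ++ [img]) img
      where
        img = g x

-- The generator used on the slides.
f :: Int -> Int
f x = (x + 1) `mod` 4
```

Evaluating `orbitOf f 1` yields `[1,2,3,0]`, matching the accumulating set on page 7.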

Page 8: State-of-the-art

A sequential version already exists in GAP.

But we need to be able to compute orbits that take millions of iterations.

A parallel version exists in C using hash tables:

But it is fine-tuned to a very specific problem (direct condensation).

It may not be scalable?

There is also a new parallel implementation in GAP (Shpectorov):

tuple-based implementation;

uses SCSCP and dedicated hash-table servers.

We need a general skeleton that can be used for arbitrary orbits.

Page 9: SymGrid-Par

Page 10: SymGrid-Par

Page 11: The Orbit - Sequential Version

To our knowledge, this is the first version ever implemented in Haskell.

    orbitMul :: (Ord a, Eq a) => [ a -> a ] -> [a] -> [a] -> [a]
    orbitMul gens [] set = set
    orbitMul gens (t:ts) set = orbitMul gens (ts++new) set'
      where (new, set') = applyGens gens [t] set []
            applyGens = ...
            img = ...

The three arguments are the generators (gens), the queue of tasks, and the accumulating set of results (set).

Page 12: The Orbit - Sequential

    genimg :: Eq a => (a->a) -> [a] -> [a] -> ([a], [a])
    genimg g queue@(t:ts) set =
      if img `elem` set
        then ([] , set )
        -- add img to the new task queue, and add img to the set of results
        else (img : queue , img : set )
      where img = g t

img represents the generator applied to the task (the image of the generator application).

Need to check for membership in the result set.

Page 13: The Orbit - Sequential

    applyGens :: Eq a => [ a -> a ] -> [a] -> [a] -> [a] -> ([a],[a])
    applyGens [] q s q' = (q', s)
    applyGens (g:gs) queue set q' = applyGens gs queue set' (q'++queue')
      where (queue', set') = genimg g queue set

Recurse over the list of generators.

Pass the result of an img into the next generator application.
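For reference, the three sequential functions from pages 11 to 13 can be assembled into one loadable module. This is a sketch under two assumptions: the elided `applyGens = ...` on page 11 refers to the top-level `applyGens` of page 13, and the image in `genimg` is computed from the head of the queue. An extra clause for the empty queue is added so that `genimg` is total.

```haskell
-- Sketch assembled from the slide code; see the assumptions stated above.

-- Apply one generator to the task at the head of the queue.
genimg :: Eq a => (a -> a) -> [a] -> [a] -> ([a], [a])
genimg _ [] set = ([], set)                 -- added clause: nothing to do
genimg g queue@(t : _) set =
  if img `elem` set
    then ([], set)                          -- duplicate: no new task
    else (img : queue, img : set)           -- new image: enqueue and record it
  where
    img = g t

-- Thread the result set through every generator, collecting new tasks.
applyGens :: Eq a => [a -> a] -> [a] -> [a] -> [a] -> ([a], [a])
applyGens []       _     s   q' = (q', s)
applyGens (g : gs) queue set q' = applyGens gs queue set' (q' ++ queue')
  where
    (queue', set') = genimg g queue set

-- Process the task queue until it is empty; the result is the orbit.
orbitMul :: Eq a => [a -> a] -> [a] -> [a] -> [a]
orbitMul _    []       set = set
orbitMul gens (t : ts) set = orbitMul gens (ts ++ new) set'
  where
    (new, set') = applyGens gens [t] set []
```

With the generator from page 3, `orbitMul [\x -> (x + 1) `mod` 4] [1] [1]` returns `[0,3,2,1]`, the same four elements as the walk-through, newest first. Note that `genimg` as written re-enqueues the current task alongside its image; the loop still terminates because all images of a revisited task are already in the set.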

Page 14: Parallel Orbit

Need a queue to express tasks waiting to be processed.

We need to distribute the queue over the available PEs.

We use a Task Farm (master/worker) approach.

Page 15: Task Farm (Master/Worker)

Page 16: Extending to a Parallel Orbit

The orbit is not quite a true farm, however.

Results from workers must be accumulated and checked for duplicates… a set?

Non-duplicates are released as new tasks.

Moreover, we must be sure that the orbit will terminate!

Page 17: Parallel Orbit

Page 18: Parallel Orbit Calculation

    orbitPar :: ([ a->a ] -> [a] -> [ [a] ]) -> [a->a] -> [a] -> [a]
    orbitPar orbitfun gens init = …

        -- process abstraction
        workerProcs = [ process (concat . (Data.List.map (orbitfun gens))) | n <- [1..noPe] ]

        toWorker tasks = unshuffle noPe tasks
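The slide uses `unshuffle` without showing it. Assuming it deals tasks out round-robin (which the later remark about static round-robin distribution suggests), a plain-Haskell stand-in could look like the sketch below. The name `unshuffleSketch` is hypothetical and this is not claimed to be Eden's own definition.

```haskell
-- Sketch only: a round-robin splitter standing in for `unshuffle`.
-- Worker i (counting from 0) receives elements i, i+n, i+2n, ...
unshuffleSketch :: Int -> [a] -> [[a]]
unshuffleSketch n xs = [takeEach (drop i xs) | i <- [0 .. n - 1]]
  where
    -- Keep the head, then skip the n-1 elements meant for other workers.
    takeEach []       = []
    takeEach (y : ys) = y : takeEach (drop (n - 1) ys)
```

For example, `unshuffleSketch 3 [1..7]` gives `[[1,4,7],[2,5],[3,6]]`. Because each sublist is produced lazily, the same function also works on a task stream that is still being extended.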

Page 19: Parallel Orbit Calculation

    orbitPar :: ([ a->a ] -> [a] -> [ [a] ]) -> [a->a] -> [a] -> [a]
    orbitPar orbitfun gens init = …

        -- c is the count of potential tasks
        addNewTask set (t:ts) c
          | not (t `member` set) = t : addNewTask (Data.Set.insert t set) ts ((c-1)+nGens)
          | c <= 1 = []
          | otherwise = addNewTask set ts (c - 1)

        workerProcs = …

        toWorker tasks = …
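The counter is what guarantees termination: every task released to the workers will eventually send back one result per generator, so the master knows how many results are still outstanding. The sketch below imitates that feedback loop in plain, sequential Haskell, with the worker processes replaced by a lazy `concatMap`. The name `orbitKnot` and the explicit `c <= 0` check are assumptions made for this illustration, not part of the slide code.

```haskell
import qualified Data.Set as Set

-- Sketch only: a sequential imitation of the master's feedback loop.
orbitKnot :: Ord a => [a -> a] -> [a] -> [a]
orbitKnot gens start = dat
  where
    nGens = length gens

    -- Stand-in for the workers: each released task yields one image
    -- per generator, in order.
    newTasks = concatMap (\t -> map ($ t) gens) dat

    -- Released tasks; defined in terms of its own results (lazy knot).
    dat = addNewTask Set.empty (start ++ newTasks) (length start)

    -- c counts results that are still expected on the stream.
    addNewTask set stream c
      | c <= 0    = []              -- nothing outstanding: do not read on
      | otherwise = case stream of
          []       -> []
          (t : ts)
            | t `Set.member` set -> addNewTask set ts (c - 1)
            | otherwise          ->
                t : addNewTask (Set.insert t set) ts (c - 1 + nGens)
```

With the generator from page 3 and a start set of `[1]`, `orbitKnot` returns `[1,2,3,0]`. Checking the counter before inspecting the stream matters: reading past the last expected result would make the definition depend on itself and never finish.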

Page 20: Simple Test Case

Test case that gives similar (tunable) granularities.

Delivers a wide range of result values.

Change the size of the result set with setSize.

All tests are seeded with 1.

    f1 s n = (fib ((n `mod` 20) + 10) + n) `mod` setSize

    f2 s n = (fib ((n `mod` 10) + 20) + n) `mod` setSize

    f3 s n = (fib ((n `mod` 19) + 10) + n - 1) `mod` setSize

    orbitOnList [] _ = []

    orbitOnList (g:gens) list = map g list : orbitOnList gens list
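The slide does not show `fib` or where `setSize` comes from. The following sketch makes the test case loadable under two assumptions: `fib` is the naive doubly recursive Fibonacci function (which is what makes task cost tunable), and `setSize` is passed as the first argument. The names `gen1` to `gen3` are hypothetical renamings of `f1` to `f3`.

```haskell
-- Assumption: naive Fibonacci, deliberately expensive for larger inputs.
fib :: Int -> Int
fib n
  | n < 2     = n
  | otherwise = fib (n - 1) + fib (n - 2)

-- Generators shaped like f1..f3, with setSize as an explicit parameter
-- (an assumption; on the slide setSize is a free variable).
gen1, gen2, gen3 :: Int -> Int -> Int
gen1 setSize n = (fib ((n `mod` 20) + 10) + n) `mod` setSize
gen2 setSize n = (fib ((n `mod` 10) + 20) + n) `mod` setSize
gen3 setSize n = (fib ((n `mod` 19) + 10) + n - 1) `mod` setSize

-- Apply every generator to every element of a task list.
orbitOnList :: [a -> a] -> [a] -> [[a]]
orbitOnList []         _    = []
orbitOnList (g : gens) list = map g list : orbitOnList gens list
```

Because every generator ends in `mod setSize`, all results fall in the range 0 to setSize - 1, which bounds the size of the orbit.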

Page 21: Measurement Framework

Executed on an 8-core machine running at 2.66GHz.

4 GB of RAM.

Compiled with GHC 6.82 -O2.

Runtimes are given as an average over 10 runs.

Performance of the parallel version is measured against the parallel version running on a single core.
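Read this way, the reported figures are relative speedups. As a sketch of the definition, assuming T(p) denotes the mean runtime of the parallel program on p cores (this notation is not from the slides):

```latex
\[
  S(p) = \frac{T(1)}{T(p)}, \qquad p = 1, \dots, 8
\]
```

By construction S(1) = 1, which is why the charts are labelled as speedup against "Par 1".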

Page 22: Farm Speedup Against Par 1

[Chart: speedup (scale 0 to 9) for par 1 to par 8, at setSize 750, 1000, 2000, 4000, 8000, 16000, 32000 and 64000.]

Page 23: Farm - Trace (64000)

Page 24: Evaluation of the Task Farm

Good for regular and well-balanced tasks.

Static round-robin distribution.

May suffer from load imbalance.

Does not distribute tasks in a request-driven way.

Page 25: A Workpool Approach

Distributes tasks in a request-driven way:

when a task completes, its processor is added to the queue of idle processors.

Better for irregular and unbalanced tasks.

Automatically deals with load imbalance.

Still limited by the master/worker ratio.
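Request-driven distribution can be pictured as pairing a stream of tasks with a stream of worker identifiers, where each identifier is a request from a worker that has become idle. The sketch below shows only that pairing step in plain Haskell. The name `distributeSketch` and its argument order are assumptions made for illustration, not the skeleton's actual helper.

```haskell
-- Sketch only: route the k-th task to the worker named by the k-th request.
-- Worker identifiers are assumed to run from 1 to nWorkers.
distributeSketch :: Int -> [Int] -> [a] -> [[a]]
distributeSketch nWorkers requests tasks =
  [ [t | (r, t) <- zip requests tasks, r == w] | w <- [1 .. nWorkers] ]
```

For example, `distributeSketch 3 [1,2,3,1,1,2] "abcdef"` gives `["ade","bf","c"]`: worker 1 asked for work three times and so received three tasks, while worker 3 received only one.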

Page 26: Workpool

Page 27: Workpool Speedup Against Par 1

[Chart: speedup (scale 0 to 9) for par 1 to par 8, at setSize 750, 1000, 2000, 4000, 8000, 16000, 32000 and 64000.]

Page 28: Workpool - Trace (64000)

Page 29: Conclusions

Speedup appears almost linear, up to a factor of 8.29 on 8 cores for a set size of 64000.

Workpool is more efficient, and gives better speedups for larger set sizes.

Workpool may incur a slight overhead, noticeable for small set sizes.

Workpool is more balanced for larger set sizes.

Page 30: Work in Progress

Integrating the orbit skeleton into SymGrid-Par:

Use GAP to compute the computational algebra…

… Haskell to exploit parallelism.

Application to larger problems, e.g. the braid orbit.

Develop tool support to aid parallel development, e.g. using refactoring.

First of a series of domain-specific parallel skeletons: duplicate elimination, completion algorithm, chain reduction, …

Page 31

http://www.symbolic-computation.org

Page 32: Future Work

Complete SGP integration.

Solve some real symbolic computing problems.

Tool support for sequential -> parallel transformations?

Implement more parallel skeletons:

Parallel nub?

Page 33: Parallel Orbit Calculation

    orbitPar :: ([ a->a ] -> [a] -> [ [a] ]) -> [a->a] -> [a] -> [a]
    orbitPar orbitfun gens init = dat
      where
        newTasks = merge (zipWith (#) workerProcs (toWorker dat))

        dat = (addNewTask empty (init' ++ newTasks) (length init'))
        init' = take noPe (cycle init)

        addNewTask set (t:ts) c
          | not (t `member` set) = t : addNewTask (Data.Set.insert t set) ts c'
          | c <= 1 = []
          | otherwise = addNewTask set ts (c - 1)
          where c' = (c-1) + nGens

        workerProcs = [ process (concat . (Data.List.map (orbitfun gens))) | n <- [1..noPe] ]

        toWorker tasks = unshuffle noPe tasks

Page 34: Eden

Semi-explicit model of parallelism.

Explicit process creation.

Implicit thread creation:

    (unzip . streamf) :: Num a => [a] -> ([a],[a])

    uncurry zip (process (unzip . streamf) # [1..10])
      where streamf args = map worker args
            worker x = (factorial x, fibonacci x)
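To see what value the Eden expression computes, it helps to drop the process abstraction and evaluate the same pipeline sequentially. The sketch below does that in plain Haskell; the definitions of `factorial` and `fibonacci` are assumptions, since the slide does not show them, and `sequentialResult` is a hypothetical name.

```haskell
-- Assumed definitions; the slide only names these functions.
factorial :: Integer -> Integer
factorial n = product [1 .. n]

fibonacci :: Integer -> Integer
fibonacci n = fst (iterate (\(a, b) -> (b, a + b)) (0, 1) !! fromIntegral n)

worker :: Integer -> (Integer, Integer)
worker x = (factorial x, fibonacci x)

streamf :: [Integer] -> [(Integer, Integer)]
streamf args = map worker args

-- Same shape as the slide's expression, without `process` and `#`.
sequentialResult :: [(Integer, Integer)]
sequentialResult = uncurry zip ((unzip . streamf) [1 .. 10])
```

The function passed to `process` returns a pair of lists, and `uncurry zip` recombines the two lists into a list of pairs on the caller's side.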

Page 35: Questions?

http://www.symbolic-computation.org/The_SCIEnce_Project

Page 36: Workpool

    orbitPar :: ([ a->a ] -> [a] -> [ [a] ]) -> [a->a] -> [a] -> [a]
    orbitPar orbitfun gens init = dat
      where
        (newReqs, newTasks) = (unzip . merge) (zipWith (#) workerProcs (toWorker dat))

        dat = (addNewTask empty (init' ++ newTasks) (length init'))
        init' = take noPe (cycle init)

        addNewTask set (t:ts) c
          | not (t `member` set) = t : addNewTask (Data.Set.insert t set) ts c'
          | c <= 1 = []
          | otherwise = addNewTask set ts (c - 1)
          where c' = (c-1) + nGens

        workerProcs = [ process (zip [n,n..] . (concat . (Data.List.map (orbitfun gens)))) | n <- [1..noPe] ]

        toWorker tasks = distribute tasks requests

        requests = initialReqs ++ newReqs