Christopher Brown and Kevin Hammond
School of Computer Science, University of St. Andrews
July 2010
Ever Decreasing Circles:
Implementing a skeleton for general Parallel Orbit calculations for use in SymGrid-Par.
A General Orbit Calculation
Explore a solution space given:
  An initial set of values;
  A set of generators.
Used in computational algebra:
  Symmetry of solutions: chemistry, quantum physics, etc.
  Rubik’s Cube (permutations).
Sequential implementations already exist, but there are concerns about performance.
The Orbit Calculation
f :: Int -> Int
f x = (x+1) `mod` 4
Starting set: 1
f 1 = 2, accumulating set: 1, 2
f 2 = 3, accumulating set: 1, 2, 3
f 3 = 0, accumulating set: 1, 2, 3, 0
f 0 = 1, which is already in the set, so the accumulating set stays at 1, 2, 3, 0
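The walk-through above can be reproduced with a few lines of Haskell. This is a minimal sketch for the single-generator case only; the helper name orbitFrom is hypothetical and does not appear in the slides.

```haskell
-- Generator from the slides.
f :: Int -> Int
f x = (x + 1) `mod` 4

-- Hypothetical helper: follow one generator from a seed until a value repeats.
-- The list 'seen' is kept in reverse order of discovery.
orbitFrom :: Eq a => (a -> a) -> a -> [a]
orbitFrom g seed = go [seed] (g seed)
  where
    go seen x
      | x `elem` seen = reverse seen
      | otherwise     = go (x : seen) (g x)
```

Under these assumptions, orbitFrom f 1 evaluates to [1,2,3,0], the same accumulating set as in the steps above.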
State-of-the-art
A sequential version already exists in GAP.
  But we need to be able to compute orbits that take millions of iterations.
A parallel version exists in C using hash tables:
  But it is fine-tuned to a very specific problem (direct condensation).
  May not be scalable?
There is also a new parallel implementation in GAP (Shpectorov):
  tuple-based implementation;
  uses SCSCP and dedicated hash-table servers.
We need a general skeleton that can be used for arbitrary orbits.
SymGrid-Par
The Orbit - Sequential Version
To our knowledge, this is the first version ever implemented in Haskell.
orbitMul :: (Ord a, Eq a) => [ a -> a ] -> [a] -> [a] -> [a]
orbitMul gens [] set = set
orbitMul gens (t:ts) set = orbitMul gens (ts++new) set'
  where (new, set') = applyGens gens [t] set []
        applyGens = ...
        img = ...
Arguments, in order: generators, queue of tasks, accumulating set of results.
The Orbit - Sequential
genimg :: Eq a => (a->a) -> [a] -> [a] -> ([a], [a])
genimg g queue@(t:ts) set =
  if img `elem` set
    then ([] , set )                 -- img is already known: nothing new
    else (img : queue , img : set )  -- add img to new task queue and to set of results
  where img = g t
img represents the generator applied to the task (the image of the generator application).
Need to check for membership in the result set.
The Orbit - Sequential
applyGens :: Eq a => [ a -> a ] -> [a] -> [a] -> [a] -> ([a],[a])
applyGens [] q s q' = (q', s)
applyGens (g:gs) queue set q' = applyGens gs queue set' (q'++queue')
  where (queue', set') = genimg g queue set
Recurse over the list of generators.
Pass the result of an img into the next generator application.
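The sequential scheme on these slides can be condensed into one self-contained function. The following is a sketch under stated assumptions, not the authors' code: it uses Data.Set for the membership test, queues only the new image (rather than the image plus the current task), and the names orbitSeq and orbitOf are hypothetical.

```haskell
import qualified Data.Set as Set

-- Sketch of a sequential orbit: the first argument holds the generators,
-- the second the queue of tasks, the third the accumulating set of results.
orbitSeq :: Ord a => [a -> a] -> [a] -> Set.Set a -> Set.Set a
orbitSeq _    []       set = set
orbitSeq gens (t : ts) set = orbitSeq gens (ts ++ new) set'
  where
    (new, set') = foldl step ([], set) gens
    -- Apply one generator to the task; keep the image only if it is unseen.
    step (q, s) g
      | img `Set.member` s = (q, s)
      | otherwise          = (q ++ [img], Set.insert img s)
      where img = g t

-- Hypothetical convenience wrapper: seed the queue and the set together.
orbitOf :: Ord a => [a -> a] -> [a] -> [a]
orbitOf gens seeds = Set.toList (orbitSeq gens seeds (Set.fromList seeds))
```

With the generator f x = (x+1) `mod` 4 and the seed 1, orbitOf returns [0,1,2,3]; the result is sorted because it is read back from the set.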
Parallel Orbit
Need a queue to express tasks waiting to be processed.
We need to distribute the queue over the available PEs.
We use a Task Farm (master/worker) approach.
Task Farm (Master/Worker)
Extending to a Parallel Orbit
The orbit is not quite a true farm, however.
Results from workers must be accumulated and checked for duplicates… a set?
Non-duplicates are released as new tasks.
Moreover, we must be sure that the orbit will terminate!
Parallel Orbit
Parallel Orbit Calculation
orbitPar :: ([ a->a ] -> [a] -> [ [a] ]) -> [a->a] -> [a] -> [a]
orbitPar orbitfun gens init = …
  workerProcs = [ process (concat . (Data.List.map (orbitfun gens))) | n <- [1..noPe] ]
  toWorker tasks = unshuffle noPe tasks
Annotation on process: process abstraction.
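toWorker relies on unshuffle to deal the task stream out to the workers. A minimal sketch of such a round-robin splitter follows, assuming it sends element i of the stream to sublist i mod n; roundRobin is a hypothetical stand-in written for illustration, not the library definition.

```haskell
-- Hypothetical stand-in for a round-robin splitter: sublist i receives
-- elements i, i+n, i+2n, ... of the input. It works lazily, so it can be
-- applied to a stream that is still being produced.
roundRobin :: Int -> [a] -> [[a]]
roundRobin n xs = [ takeEach (drop i xs) | i <- [0 .. n - 1] ]
  where
    takeEach []       = []
    takeEach (y : ys) = y : takeEach (drop (n - 1) ys)
```

For example, roundRobin 3 [1..10] yields [[1,4,7,10],[2,5,8],[3,6,9]]. The split is fixed in advance and takes no account of how long each task runs.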
Parallel Orbit Calculation
orbitPar :: ([ a->a ] -> [a] -> [ [a] ]) -> [a->a] -> [a] -> [a]
orbitPar orbitfun gens init = …
  addNewTask set (t:ts) c
    | not (t `member` set) = t : addNewTask (Data.Set.insert t set) ts ((c-1)+nGens)
    | c <= 1 = []
    | otherwise = addNewTask set ts (c - 1)
  workerProcs = …
  toWorker tasks = …
Annotation on c: count of potential tasks.
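The role of the counter can be seen in a purely sequential simulation of the feedback loop, in which the "workers" are an ordinary concatMap. This is a sketch under assumptions, not the skeleton itself: orbitFeedback and results are hypothetical names, and the stop test is moved in front of the pattern match so that the stream is never forced once nothing is pending.

```haskell
import qualified Data.Set as Set

-- Sequential simulation of the feedback loop: results are fed back into
-- the same lazy stream that produces the tasks.
orbitFeedback :: Ord a => [a -> a] -> [a] -> [a]
orbitFeedback gens seeds = tasks
  where
    nGens   = length gens
    -- Stand-in for the workers: every task yields one result per generator.
    results = concatMap (\t -> map ($ t) gens) tasks
    tasks   = addNewTask Set.empty (seeds ++ results) (length seeds)

    -- c counts the stream elements that are still expected.
    addNewTask set xs c
      | c <= 0    = []          -- nothing pending: terminate
      | otherwise = case xs of
          [] -> []
          (t : ts)
            | t `Set.member` set -> addNewTask set ts (c - 1)
            | otherwise          -> t : addNewTask (Set.insert t set) ts ((c - 1) + nGens)
```

Each released task adds nGens expected results and each consumed element removes one, so the count reaches zero exactly when every result has been checked. With f x = (x+1) `mod` 4 and the seed 1, the function returns [1,2,3,0].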
Simple Test Case
Test case that gives similar (tunable) granularities.
Delivers a wide range of result values.
Change the size of the result set by setSize.
All tests seeded with 1.
f1 s n = (fib ((n `mod` 20) + 10) + n) `mod` setSize
f2 s n = (fib ((n `mod` 10) + 20) + n) `mod` setSize
f3 s n = (fib ((n `mod` 19) + 10) + n - 1) `mod` setSize
orbitOnList [] _ = []
orbitOnList (g:gens) list = map g list : orbitOnList gens list
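A self-contained version of the test generators is sketched below. It makes two assumptions that the slide does not state: fib is the naive recursive Fibonacci function, and the first argument of each generator plays the role of setSize.

```haskell
-- Naive Fibonacci (an assumption): its cost grows quickly with n,
-- which is what would make the task granularity tunable.
fib :: Int -> Int
fib n
  | n < 2     = n
  | otherwise = fib (n - 1) + fib (n - 2)

-- Generators from the slide, with setSize passed explicitly.
f1, f2, f3 :: Int -> Int -> Int
f1 setSize n = (fib ((n `mod` 20) + 10) + n) `mod` setSize
f2 setSize n = (fib ((n `mod` 10) + 20) + n) `mod` setSize
f3 setSize n = (fib ((n `mod` 19) + 10) + n - 1) `mod` setSize

-- Apply every generator to every element of the list.
orbitOnList :: [a -> a] -> [a] -> [[a]]
orbitOnList []         _    = []
orbitOnList (g : gens) list = map g list : orbitOnList gens list
```

With these definitions, orbitOnList [f1 1000, f2 1000, f3 1000] [1] evaluates to [[90],[947],[89]], one sublist per generator.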
Measurement Framework
Executed on an 8-core machine running at 2.66GHz.
4 GB of RAM.
Compiled with GHC 6.8.2 -O2.
Runtimes are given as an average over 10 runs.
Performance of the parallel version is measured against the parallel version running on a single core.
Farm Speedup Against Par 1
[Chart: speedup (scale 0 to 9) for par 1 to par 8; setSize 750, 1000, 2000, 4000, 8000, 16000, 32000 and 64000.]
Farm - Trace (64000)
Evaluation of the Task Farm
Good for regular and well-balanced tasks.
Static round-robin distribution.
May suffer from load imbalance.
Does not distribute tasks in a request-driven way.
A Workpool Approach
Distributes tasks in a request-driven way:
  when a task completes, its processor is added to the queue of idle processors.
Better for irregular and unbalanced tasks.
Automatically deals with load imbalance.
Still limited by the master/worker ratio.
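The request-driven idea can be sketched as a pure function: each request names an idle worker, and the next task in the queue goes to that worker. The name distributeByRequest and its explicit worker-count argument are hypothetical, chosen for illustration only.

```haskell
-- Hypothetical sketch: the i-th task is given to the worker named by the
-- i-th request. Workers are numbered from 1 to nWorkers.
distributeByRequest :: Int -> [t] -> [Int] -> [[t]]
distributeByRequest nWorkers tasks requests =
  [ [ t | (r, t) <- zip requests tasks, r == w ] | w <- [1 .. nWorkers] ]
```

For example, distributeByRequest 2 "abcd" [1,2,2,1] gives ["ad","bc"]: worker 2 asked twice in a row, so it received two consecutive tasks. A worker that finishes quickly sends more requests and therefore receives more tasks.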
Workpool
Workpool Speedup Against Par 1
[Chart: speedup (scale 0 to 9) for par 1 to par 8; setSize 750, 1000, 2000, 4000, 8000, 16000, 32000 and 64000.]
Workpool - Trace (64000)
Conclusions
Speedup appears almost linear, up to a factor of 8.29 on 8 cores for a set size of 64000.
Workpool is more efficient, and gives better speedups for larger set sizes.
Workpool may incur a slight overhead, noticeable for small set sizes.
Workpool is more balanced for larger set sizes.
Work in Progress
Integrating the orbit skeleton into SymGrid-Par.
  Use GAP to compute the computational algebra…
  … Haskell to exploit parallelism.
Application to larger problems
  e.g. the braid orbit
Develop tool support to aid parallel development
  e.g. using refactoring
First of a series of domain-specific parallel skeletons
  duplicate elimination, completion algorithm, chain reduction, …
http://www.symbolic-computation.org
Future Work
Complete SGP integration.
Solve some real symbolic computing problems.
Tool support for sequential -> parallel transformations?
Implement more parallel skeletons:
  Parallel nub?
Parallel Orbit Calculation
orbitPar :: ([ a->a ] -> [a] -> [ [a] ]) -> [a->a] -> [a] -> [a]
orbitPar orbitfun gens init = dat
  where
    newTasks = merge (zipWith (#) workerProcs (toWorker dat))
    dat = (addNewTask empty (init' ++ newTasks) (length init'))
    init' = take noPe (cycle init)
    addNewTask set (t:ts) c
      | not (t `member` set) = t : addNewTask (Data.Set.insert t set) ts c'
      | c <= 1 = []
      | otherwise = addNewTask set ts (c - 1)
      where c' = (c-1) + nGens
    workerProcs = [ process (concat . (Data.List.map (orbitfun gens))) | n <- [1..noPe] ]
    toWorker tasks = unshuffle noPe tasks
Eden
Semi-explicit model of parallelism.
Explicit process creation.
Implicit thread creation:
(unzip . streamf) :: Num a => [a] -> ([a],[a])
uncurry zip (process (unzip . streamf) # [1..10])
  where streamf args = map worker args
        worker x = (factorial x, fibonacci x)
Questions?
http://www.symbolic-computation.org/The_SCIEnce_Project
Workpool
orbitPar :: ([ a->a ] -> [a] -> [ [a] ]) -> [a->a] -> [a] -> [a]
orbitPar orbitfun gens init = dat
  where
    (newReqs, newTasks) = (unzip . merge) (zipWith (#) workerProcs (toWorker dat))
    dat = (addNewTask empty (init' ++ newTasks) (length init'))
    init' = take noPe (cycle init)
    addNewTask set (t:ts) c
      | not (t `member` set) = t : addNewTask (Data.Set.insert t set) ts c'
      | c <= 1 = []
      | otherwise = addNewTask set ts (c - 1)
      where c' = (c-1) + nGens
    workerProcs = [ process (zip [n,n..] . (concat . (Data.List.map (orbitfun gens)))) | n <- [1..noPe] ]
    toWorker tasks = distribute tasks requests
    requests = initialReqs ++ newReqs