
Scala Parallel Collections

Aleksandar Prokopec, EPFL

Scala collections

for {
  s <- surnames
  n <- names
  if s endsWith n
} yield (n, s)

McDonald

Scala collections

for {
  s <- surnames
  n <- names
  if s endsWith n
} yield (n, s)

1040 ms

Scala parallel collections

for {
  s <- surnames.par
  n <- names.par
  if s endsWith n
} yield (n, s)

2 cores: 575 ms
4 cores: 305 ms
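
As a rough illustration of how such a comparison can be run (this is not the talk's benchmark): a minimal timing sketch, where the sample data and the time helper are assumptions, and the import is what Scala 2.13+ needs for the separate scala-parallel-collections module:

import scala.collection.parallel.CollectionConverters._

def time[A](label: String)(body: => A): A = {
  val start = System.nanoTime()
  val res = body
  println(s"$label: ${(System.nanoTime() - start) / 1000000} ms")
  res
}

val surnames = Vector("McDonald", "McKinley", "Donaldson")  // sample data
val names    = Vector("Don", "Donald", "Kinley")

time("sequential") {
  for (s <- surnames; n <- names; if s endsWith n) yield (n, s)
}
time("parallel") {
  for (s <- surnames.par; n <- names.par; if s endsWith n) yield (n, s)
}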

for comprehensions: nested parallelized bulk operations

surnames.par.flatMap { s =>
  names.par
    .filter(n => s endsWith n)
    .map(n => (n, s))
}

Nested parallelism: parallel within parallel

composition:

surnames.par.flatMap { s =>
  surnameToCollection(s) // may invoke parallel ops
}

Nested parallelism: going recursive

recursive algorithms

def vowel(c: Char): Boolean = ...

def gen(n: Int, acc: Seq[String]): Seq[String] =
  if (n == 0) acc
  else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield
    if (s.length == 0) s + c
    else if (vowel(s.last) && !vowel(c)) s + c
    else if (!vowel(s.last) && vowel(c)) s + c
    else s

gen(5, Array(""))

1545 ms

Nested parallelism: going recursive

def vowel(c: Char): Boolean = ...

def gen(n: Int, acc: ParSeq[String]): ParSeq[String] =
  if (n == 0) acc
  else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield
    if (s.length == 0) s + c
    else if (vowel(s.last) && !vowel(c)) s + c
    else if (!vowel(s.last) && vowel(c)) s + c
    else s

gen(5, ParArray(""))

1 core:  1575 ms
2 cores:  809 ms
4 cores:  530 ms
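
For reference, the parallel version assembled into one runnable snippet; the body of vowel is an assumption (the talk only shows its signature), and the imports reflect the scala-parallel-collections package layout:

import scala.collection.parallel.ParSeq
import scala.collection.parallel.mutable.ParArray

def vowel(c: Char): Boolean = "aeiou".contains(c)  // assumed definition

def gen(n: Int, acc: ParSeq[String]): ParSeq[String] =
  if (n == 0) acc
  else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield
    if (s.length == 0) s + c
    else if (vowel(s.last) && !vowel(c)) s + c
    else if (!vowel(s.last) && vowel(c)) s + c
    else s

gen(5, ParArray(""))  // each recursive level runs its bulk operation in parallel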

So, I just use par and I’m home free?

How to think parallel

Character count: use case for foldLeft

val txt: String = ...

txt.foldLeft(0) {
  case (a, ' ') => a
  case (a, c)   => a + 1
}

[diagram: the accumulator counts up 0, 1, 2, … 6 as the characters are folded in one by one]

Character count: use case for foldLeft

txt.foldLeft(0) {
  case (a, ' ') => a
  case (a, c)   => a + 1
}

going left to right – not parallelizable!

[diagram: the elements A B C D E F are consumed strictly one after another with _ + 1]

Character count: use case for foldLeft

txt.foldLeft(0) {
  case (a, ' ') => a
  case (a, c)   => a + 1
}

going left to right – not really necessary

[diagram: A B C and D E F are counted independently with _ + 1, and the two partial counts are combined with _ + _ to give 6]

Character count: in parallel

txt.fold(0) {
  case (a, ' ') => a
  case (a, c)   => a + 1
}

[diagram: within each partition, elements are folded into the running count with an operation of type (Int, Char) => Int]

Character count: fold not applicable

txt.fold(0) {
  case (a, ' ') => a
  case (a, c)   => a + 1
}

[diagram: merging two partial counts needs an operation of type (Int, Int) => Int, but fold only takes a single operator, so it cannot both consume characters and merge counts]
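
To spell out the mismatch, here are the two signatures side by side, simplified from the parallel collections API (a sketch, not the exact declarations):

// def fold[U >: Char](z: U)(op: (U, U) => U): U
// def aggregate[S](z: => S)(seqop: (S, Char) => S, combop: (S, S) => S): S
//
// fold uses one operator both within and across partitions, and U must be a
// supertype of the element type, so a counting function of type (Int, Char) => Int
// cannot fit. aggregate separates the element step (seqop) from the merge step
// (combop), which is exactly what the character count needs.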

Character count: use case for aggregate

txt.aggregate(0)({
  case (a, ' ') => a
  case (a, c)   => a + 1
}, _ + _)

[diagram: the first argument folds an element into a partition's running count (aggregation with an element, of type (Int, Char) => Int); the second argument _ + _ merges two partial counts (aggregation with an aggregation, of type (Int, Int) => Int)]
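
Put together, a minimal runnable version of the character count; the sample text, the toVector step and the import are assumptions for Scala 2.13+, where parallel collections come from the separate scala-parallel-collections module:

import scala.collection.parallel.CollectionConverters._

val txt = "Folding me softly."  // sample text
val nonSpaces = txt.toVector.par.aggregate(0)(
  { case (a, ' ') => a; case (a, c) => a + 1 },  // fold one character into a partition's count
  _ + _                                          // merge two partial counts
)
// nonSpaces == 16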

Word count: another use case for foldLeft

txt.foldLeft((0, true)) {
  case ((wc, _), ' ')   => (wc, true)
  case ((wc, true), x)  => (wc + 1, false)
  case ((wc, false), x) => (wc, false)
}

Walking through "Folding me softly.":

initial accumulator (0, true): 0 words so far, and the last character counts as a space
a space: remember that the last seen character is a space
a non-space when the last seen character was a space: a new word
a non-space when the last seen character wasn't a space: no new word
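
As a quick sanity check, the sequential fold run on the sentence from the slides (runnable as-is in any recent Scala):

val txt = "Folding me softly."
val (words, _) = txt.foldLeft((0, true)) {
  case ((wc, _), ' ')   => (wc, true)      // a space: remember it
  case ((wc, true), x)  => (wc + 1, false) // previous character was a space: a new word starts
  case ((wc, false), x) => (wc, false)     // inside a word: nothing changes
}
// words == 3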

Word count: in parallel

[diagram: the text is split into "Folding me " (processor P1) and "softly." (processor P2); P1 computes wc = 2 with a trailing space (rs = 1), P2 computes wc = 1 with no leading space (ls = 0), and merging gives wc = 3]

Word count: must assume arbitrary partitions

[diagram: with the split "Foldin" (P1) and "g me softly." (P2), P1 computes wc = 1, rs = 0 and P2 computes wc = 3, ls = 0; the word "Folding" straddles the boundary, so the merged count is wc = 1 + 3 - 1 = 3]

Word count: initial aggregation

txt.par.aggregate((0, 0, 0))

the triple tracks: # spaces on the left, # words, # spaces on the right

the zero element corresponds to the empty string ""

Word count: aggregation with aggregation

...}, {
  case ((0, 0, 0), res) => res
  case (res, (0, 0, 0)) => res
  case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs)
  case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs)
})

an empty partition ("") leaves the other result unchanged
no space at the boundary (e.g. "Folding m" + "e softly."): the two halves of one word were each counted, so subtract one
otherwise (e.g. "Folding me" + " softly."): just add the word counts

Word count: aggregation with an element

txt.par.aggregate((0, 0, 0))({
  case ((ls, 0, _), ' ')   => (ls + 1, 0, ls + 1)
  case ((ls, 0, _), c)     => (ls, 1, 0)
  case ((ls, wc, rs), ' ') => (ls, wc, rs + 1)
  case ((ls, wc, 0), c)    => (ls, wc, 0)
  case ((ls, wc, rs), c)   => (ls, wc + 1, 0)
}, ...)

"_": 0 words and a space – add one more space on each side
" m": 0 words and a non-space – one word, no spaces on the right side
" me_": nonzero words and a space – one more space on the right side
" me sof": nonzero words, last non-space and current non-space – no change
" me s": nonzero words, last space and current non-space – one more word

Word count: in parallel

txt.par.aggregate((0, 0, 0))({
  case ((ls, 0, _), ' ')   => (ls + 1, 0, ls + 1)
  case ((ls, 0, _), c)     => (ls, 1, 0)
  case ((ls, wc, rs), ' ') => (ls, wc, rs + 1)
  case ((ls, wc, 0), c)    => (ls, wc, 0)
  case ((ls, wc, rs), c)   => (ls, wc + 1, 0)
}, {
  case ((0, 0, 0), res) => res
  case (res, (0, 0, 0)) => res
  case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs)
  case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs)
})
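
To see how the result is consumed, here is the same call assembled into a runnable snippet; the sample text, the toVector step and the import are assumptions for Scala 2.13+, where parallel collections live in the separate scala-parallel-collections module. The word count is the middle component of the triple:

import scala.collection.parallel.CollectionConverters._

val txt = "Folding me softly."  // sample text
val (_, words, _) = txt.toVector.par.aggregate((0, 0, 0))({
  case ((ls, 0, _), ' ')   => (ls + 1, 0, ls + 1)
  case ((ls, 0, _), c)     => (ls, 1, 0)
  case ((ls, wc, rs), ' ') => (ls, wc, rs + 1)
  case ((ls, wc, 0), c)    => (ls, wc, 0)
  case ((ls, wc, rs), c)   => (ls, wc + 1, 0)
}, {
  case ((0, 0, 0), res) => res
  case (res, (0, 0, 0)) => res
  case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs)
  case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs)
})
// words == 3, whichever way the runtime partitions the characters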

Word count: using parallel strings?

txt.par.aggregate((0, 0, 0))({ ... }, { ... })   // the same seqop and combop as above

Word count: string not really parallelizable

scala> (txt: String).par
collection.parallel.ParSeq[Char] = ParArray(…)

different internal representation: ParArray!
par has to copy the string contents into an array

Conversions: going parallel

// `par` is efficient for...
mutable.{Array, ArrayBuffer, ArraySeq}
mutable.{HashMap, HashSet}
immutable.{Vector, Range}
immutable.{HashMap, HashSet}

most other collections construct a new parallel collection!

Conversions: going parallel

sequential                      parallel
Array, ArrayBuffer, ArraySeq    mutable.ParArray
mutable.HashMap                 mutable.ParHashMap
mutable.HashSet                 mutable.ParHashSet
immutable.Vector                immutable.ParVector
immutable.Range                 immutable.ParRange
immutable.HashMap               immutable.ParHashMap
immutable.HashSet               immutable.ParHashSet

Conversions: going parallel

// `seq` is always efficient
ParArray(1, 2, 3).seq
List(1, 2, 3, 4).seq
ParHashMap(1 -> 2, 3 -> 4).seq
"abcd".seq

// `par` may not be...
"abcd".par
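
A small sketch of what these conversions cost; the collections are sample data and the import is the Scala 2.13+ way to get par on the standard collections:

import scala.collection.parallel.CollectionConverters._

val v  = Vector(1, 2, 3)
val pv = v.par   // cheap: Vector has a parallel counterpart (ParVector)
val v2 = pv.seq  // cheap: seq is always efficient

val l  = List(1, 2, 3)
val pl = l.par   // not cheap: List has no parallel counterpart,
                 // so a new parallel collection is built from its elements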

Custom collections

Custom collection

class ParString(val str: String)
extends parallel.immutable.ParSeq[Char] {
  def apply(i: Int) = str.charAt(i)
  def length = str.length
  def seq = new WrappedString(str)
  def splitter = new ParStringSplitter(0, str.length)

Custom collection: splitter definition

splitters are iterators:

class ParStringSplitter(var i: Int, len: Int)
extends Splitter[Char] {
  def hasNext = i < len
  def next = {
    val r = str.charAt(i)
    i += 1
    r
  }

splitters must be duplicated:

  def dup = new ParStringSplitter(i, len)

splitters know how many elements remain:

  def remaining = len - i

splitters can be split:

  def psplit(sizes: Int*): Seq[ParStringSplitter] = {
    val splitted = new ArrayBuffer[ParStringSplitter]
    for (sz <- sizes) {
      val next = (i + sz) min len
      splitted += new ParStringSplitter(i, next)
      i = next
    }
    splitted
  }
}
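
Assembled into one piece, with the splitter nested inside ParString so that it can see str, the custom collection looks roughly like this. This is a sketch following the talk's API; exact supertypes and required members differ slightly across Scala versions, so treat it as a template rather than a definitive implementation:

import scala.collection.immutable.WrappedString
import scala.collection.mutable.ArrayBuffer
import scala.collection.parallel.SeqSplitter
import scala.collection.parallel.immutable.ParSeq

class ParString(val str: String) extends ParSeq[Char] {
  def apply(i: Int) = str.charAt(i)
  def length = str.length
  def seq = new WrappedString(str)
  def splitter = new ParStringSplitter(0, str.length)

  class ParStringSplitter(var i: Int, len: Int) extends SeqSplitter[Char] {
    def hasNext = i < len
    def next() = { val r = str.charAt(i); i += 1; r }
    def dup = new ParStringSplitter(i, len)
    def remaining = len - i
    // split in half by default; psplit handles arbitrary partition sizes
    def split =
      if (remaining >= 2) psplit(remaining / 2, remaining - remaining / 2)
      else Seq(this)
    def psplit(sizes: Int*): Seq[ParStringSplitter] = {
      val splitted = new ArrayBuffer[ParStringSplitter]
      for (sz <- sizes) {
        val next = (i + sz) min len
        splitted += new ParStringSplitter(i, next)
        i = next
      }
      splitted.toSeq
    }
  }
}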

Word count: now with parallel strings

new ParString(txt).aggregate((0, 0, 0))({
  case ((ls, 0, _), ' ')   => (ls + 1, 0, ls + 1)
  case ((ls, 0, _), c)     => (ls, 1, 0)
  case ((ls, wc, rs), ' ') => (ls, wc, rs + 1)
  case ((ls, wc, 0), c)    => (ls, wc, 0)
  case ((ls, wc, rs), c)   => (ls, wc + 1, 0)
}, {
  case ((0, 0, 0), res) => res
  case (res, (0, 0, 0)) => res
  case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs)
  case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs)
})

Word count: performance

txt.foldLeft((0, true)) {
  case ((wc, _), ' ')   => (wc, true)
  case ((wc, true), x)  => (wc + 1, false)
  case ((wc, false), x) => (wc, false)
}

100 ms

new ParString(txt).aggregate((0, 0, 0))({ ... }, { ... })  // as above

cores: 1       2      4
time:  137 ms  70 ms  35 ms

Hierarchy

[diagram: GenTraversable <- GenIterable <- GenSeq; the sequential Traversable, Iterable and Seq and the parallel ParIterable and ParSeq all extend the corresponding Gen* traits]

Hierarchy

def nonEmpty(sq: Seq[String]) = {
  val res = new mutable.ArrayBuffer[String]()
  for (s <- sq) {
    if (s.nonEmpty) res += s
  }
  res
}

Hierarchy

def nonEmpty(sq: ParSeq[String]) = {
  val res = new mutable.ArrayBuffer[String]()
  for (s <- sq) {
    if (s.nonEmpty) res += s
  }
  res
}

side-effects! ArrayBuffer is not synchronized!

[diagram: the method should accept both Seq and ParSeq]

Hierarchy

def nonEmpty(sq: GenSeq[String]) = { val res = new mutable.ArrayBuffer[String]()for (s <- sq) {

if (s.nonEmpty) res.synchronized { res += s } } res}
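
With the GenSeq signature the same method accepts both flavours. A usage sketch (note this relies on the shared Gen* hierarchy of Scala 2.12 and earlier; in 2.13 the parallel collections moved to a separate module):

val strings = List("a", "", "b", "", "c")  // sample data

nonEmpty(strings)      // a sequential Seq[String]: single-threaded, no races
nonEmpty(strings.par)  // a ParSeq[String]: several threads append to res,
                       // which is why res.synchronized is needed above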

Accessors vs. transformers: some methods need more than just splitters

accessors: foreach, reduce, find, sameElements, indexOf, corresponds, forall, exists, max, min, sum, count, …

transformers: map, flatMap, filter, partition, ++, take, drop, span, zip, patch, padTo, …

These return collections!
Sequential collections – builders
Parallel collections – combiners
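
A small illustration of the split; the import is what Scala 2.13+ needs, and the range is sample data:

import scala.collection.parallel.CollectionConverters._

val xs = (1 to 1000).par

val total = xs.sum                 // accessor: only needs splitters, returns a single value
val evens = xs.filter(_ % 2 == 0)  // transformer: needs a combiner to build the result collection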

Builders: building a sequential collection

[diagram: from the list 1 2 3 4 5 6 7, the elements 2, 4, 6 are appended to a ListBuilder with +=, and result produces the new list]
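
In code, the sequential builder protocol sketched above (List.newBuilder and the even-number predicate are illustrative choices, not part of the talk):

val b = List.newBuilder[Int]
for (x <- List(1, 2, 3, 4, 5, 6, 7))
  if (x % 2 == 0) b += x   // += appends one element to the builder
val result = b.result()    // result() produces List(2, 4, 6)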

How to build parallel?

Combiners: building parallel collections

trait Combiner[-Elem, +To]
extends Builder[Elem, To] {
  def combine[N <: Elem, NewTo >: To]
    (other: Combiner[N, NewTo]): Combiner[N, NewTo]
}

[diagram: two Combiners are combined into a single Combiner]

combine should be efficient – O(log n) worst case

How to implement this combine?

Parallel arrays

[diagram: the chunks 1, 2, 3, 4 | 5, 6, 7, 8 | 3, 1, 8, 0 | 2, 2, 1, 9 produce the partial results 2, 4 | 6, 8 | 8, 0 | 2, 2 (e.g. keeping the even elements); the partial results are merged pairwise, the final array is allocated, and the elements 2 4 6 8 8 0 2 2 are copied into it]

Parallel hash tables

[diagram sequence: a ParHashMap with keys 0 1 2 4 5 7 8 9 is transformed, e.g. by calling filter, which keeps 0 1 4 5 7 9; each worker adds its surviving elements to its own ParHashCombiner, one holding 0, 1, 4 and the other 5, 7, 9]

How to merge two ParHashCombiners?

buckets! The combiners store elements in buckets keyed by a prefix of the hashcode (e.g. 0 = 0000₂, 1 = 0001₂, 4 = 0100₂), so combine can simply concatenate the matching buckets of the two ParHashCombiners (no copying!), and result then builds the final ParHashMap from the buckets.

Custom combiners: for methods returning custom collections

new ParString(txt).filter(_ != ' ')

What is the return type here? It creates a ParVector!

class ParString(val str: String)
extends parallel.immutable.ParSeq[Char] {
  def apply(i: Int) = str.charAt(i)
  ...

Custom combiners: for methods returning custom collections

class ParString(val str: String)
extends immutable.ParSeq[Char]
   with ParSeqLike[Char, ParString, WrappedString] {
  def apply(i: Int) = str.charAt(i)
  ...
  protected[this] override def newCombiner = new ParStringCombiner
}

Custom combiners: for methods returning custom collections

class ParStringCombiner
extends Combiner[Char, ParString] {
  var size = 0
  val chunks = ArrayBuffer(new StringBuilder)
  var lastc = chunks.last

  def +=(elem: Char) = {
    lastc += elem
    size += 1
    this
  }

[diagram: the combiner tracks a size counter, a sequence of chunks (StringBuilders) and a reference lastc to the last chunk; += appends the element to lastc and increments size]

Custom combiners: for methods returning custom collections

  ...
  def combine[U <: Char, NewTo >: ParString]
      (other: Combiner[U, NewTo]) = other match {
    case psc: ParStringCombiner =>
      size += psc.size
      chunks ++= psc.chunks
      lastc = chunks.last
      this
  }

Custom combiners: for methods returning custom collections

[diagram: combine concatenates the chunks of the two combiners and points lastc at the last chunk of the merged sequence]

Custom combiners: for methods returning custom collections

  ...
  def result = {
    val rsb = new StringBuilder
    for (sb <- chunks) rsb.append(sb)
    new ParString(rsb.toString)
  }
  ...

Custom combiners: for methods returning custom collections

[diagram: result appends all chunks into a single StringBuilder and wraps it in a new ParString]

Custom combiners: for methods expecting implicit builder factories

// only for big boys
... with GenericParTemplate[T, ParColl] ...

object ParColl extends ParFactory[ParColl] {
  implicit def canCombineFrom[T] =
    new GenericCanCombineFrom[T]
  ...

Custom combiners: performance measurement

txt.filter(_ != ' ')

106 ms

new ParString(txt).filter(_ != ' ')

1 core:  125 ms
2 cores:  81 ms
4 cores:  56 ms

[plot: time (t/ms) against the number of processors – 125 ms, 81 ms and 56 ms at 1, 2 and 4 cores; the speedup is sub-linear because def result is not parallelized]

Custom combiners: tricky!

• two-step evaluation – parallelize the result method in combiners
• efficient merge operation – binomial heaps, ropes, etc.
• concurrent data structures – non-blocking scalable insertion operation – we're working on this

Future work: coming up

• concurrent data structures
• more efficient vectors
• custom task pools
• user defined scheduling
• parallel bulk in-place modifications

Thank you!

Examples at: git://github.com/axel22/sd.git
