Scala Parallel Collections
Aleksandar Prokopec, EPFL
-
Scala collections

for {
  s <- surnames
  n <- names
  if s endsWith n
} yield (n, s)
-
Scala parallel collections

for {
  s <- surnames.par
  n <- names.par
  if s endsWith n
} yield (n, s)
-
for comprehensions
nested parallelized bulk operations

surnames.par.flatMap { s =>
  names.par
    .filter(n => s endsWith n)
    .map(n => (n, s))
}
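The desugaring shown on the slide can be checked sequentially. The sketch below uses invented sample data (`names`, `surnames` are assumptions) and plain `List`s; the `.par` versions on the slide compute the same pairs, just in parallel.

```scala
// Pair each surname with the names it ends with: the for comprehension
// and the explicit flatMap/filter/map chain produce the same result.
val names    = List("an", "na")
val surnames = List("Johnan", "Hanna")

val viaFor = for (s <- surnames; n <- names; if s endsWith n) yield (n, s)

val viaOps = surnames.flatMap { s =>
  names.filter(n => s endsWith n).map(n => (n, s))
}

println(viaFor) // List((an,Johnan), (na,Hanna)) - equal to viaOps
```

Swapping `List` for a parallel sequence changes only where the work runs, not the result.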
-
Nested parallelism
-
Nested parallelism
parallel within parallel: composition

surnames.par.flatMap { s =>
  surnameToCollection(s) // may invoke parallel ops
}
-
Nested parallelism
going recursive

def vowel(c: Char): Boolean = ...

def gen(n: Int, acc: Seq[String]): Seq[String] =
  if (n == 0) acc
  else for (s <- ...) yield ...
-
Nested parallelism
going recursive

def vowel(c: Char): Boolean = ...

def gen(n: Int, acc: ParSeq[String]): ParSeq[String] =
  if (n == 0) acc
  else for (s <- ...) yield ...
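The body of `gen` is cut off in the transcript; the sketch below is a hypothetical reconstruction of its shape only: each recursion level extends every accumulated string by each character of a tiny alphabet. With a `ParSeq` accumulator (as on the slide), the bulk operation at every level would itself run in parallel, which is the nested-parallelism point being made.

```scala
// Recursive generation: each level multiplies the accumulator by the
// alphabet size. Shown here with a sequential Seq so it runs anywhere;
// the slide's ParSeq version has the same structure.
def gen(n: Int, acc: Seq[String]): Seq[String] =
  if (n == 0) acc
  else gen(n - 1, for (s <- acc; c <- 'a' to 'b') yield s + c)

println(gen(3, Seq("")).size) // 2^3 = 8 strings of length 3
```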
-
So, I just use par and I'm home free?
-
How to think parallel
-
Character count
use case for foldLeft

val txt: String = ...

txt.foldLeft(0) {
  case (a, ' ') => a
  case (a, c)   => a + 1
}
-
Character count
use case for foldLeft

txt.foldLeft(0) {
  case (a, ' ') => a
  case (a, c)   => a + 1
}

[figure: folding "ABCDEF" accumulates 0 to 6, one element at a time]
going left to right - not parallelizable!
-
Character count
use case for foldLeft

txt.foldLeft(0) {
  case (a, ' ') => a
  case (a, c)   => a + 1
}

[figure: "ABC" and "DEF" each fold to 3 independently, then _ + _ gives 6]
going left to right not really necessary
-
Character count
in parallel

txt.fold(0) {
  case (a, ' ') => a
  case (a, c)   => a + 1
}
-
Character count
in parallel

txt.fold(0) {
  case (a, ' ') => a
  case (a, c)   => a + 1
}

[figure: folding "ABC" with the operator, which has type (Int, Char) => Int]
-
Character count
fold not applicable

txt.fold(0) {
  case (a, ' ') => a
  case (a, c)   => a + 1
}

[figure: merging two partial counts requires (Int, Int) => Int]
fold takes a single operator - it cannot have both (Int, Char) => Int and (Int, Int) => Int
-
Character count
use case for aggregate

txt.aggregate(0)({
  case (a, ' ') => a
  case (a, c)   => a + 1
}, _ + _)

[figure: each chunk is folded with the element operator, of type (Int, Char) => Int;
the partial counts are then merged with the aggregation operator _ + _, of type (Int, Int) => Int]
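The contract of `aggregate` can be simulated by hand on a sequential `String`: fold each chunk with the element operator, then merge the per-chunk counts with `_ + _`. The split point is arbitrary because addition is associative, which is exactly what makes the parallel version correct. (`txt` and the cut position are made-up sample data.)

```scala
// Count non-space characters chunk by chunk, then combine.
val txt = "AB CDE F"

def countChunk(chunk: String): Int =
  chunk.foldLeft(0) {
    case (a, ' ') => a  // a space contributes nothing
    case (a, c)   => a + 1
  }

val (left, right) = txt.splitAt(3) // "AB " | "CDE F"
val total = countChunk(left) + countChunk(right)

println(total) // 6, the same as countChunk(txt)
```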
-
Word count
another use case for foldLeft

txt.foldLeft((0, true)) {
  case ((wc, _), ' ')   => (wc, true)       // a space: remember it
  case ((wc, true), x)  => (wc + 1, false)  // last char was a space: a new word begins
  case ((wc, false), x) => (wc, false)      // last char wasn't a space: no new word
}

initial accumulation: (0 words so far, last character was a space)
running example: "Folding me softly."
-
Word count
in parallel

split: "Folding me " | "softly."
P1: wc = 2; rs = 1    P2: wc = 1; ls = 0
combined: wc = 3
-
Word count
must assume arbitrary partitions

split: "Foldin" | "g me softly."
P1: wc = 1; rs = 0    P2: wc = 3; ls = 0
combined: wc = 3
-
Word count
initial aggregation

txt.par.aggregate((0, 0, 0))

(# spaces on the left, # words, # spaces on the right)
-
Word count
aggregation + aggregation

...}, {
  case ((0, 0, 0), res) => res                                       // "Folding me" | "softly."
  case (res, (0, 0, 0)) => res
  case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs)   // "Folding m" | "e softly." - a word is split by the cut
  case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs)       // "Folding " | "me softly." - no word split by the cut
})
-
Word count
aggregation + element

txt.par.aggregate((0, 0, 0))({
  case ((ls, 0, _), ' ')   => (ls + 1, 0, ls + 1) // 0 words and a space: one more space on each side
  case ((ls, 0, _), c)     => (ls, 1, 0)          // 0 words and a non-space: one word, no spaces on the right
  case ((ls, wc, rs), ' ') => (ls, wc, rs + 1)    // nonzero words and a space: one more space on the right
  case ((ls, wc, 0), c)    => (ls, wc, 0)         // last non-space, current non-space: no change
  case ((ls, wc, rs), c)   => (ls, wc + 1, 0)     // last space, current non-space: one more word
-
Word count
in parallel

txt.par.aggregate((0, 0, 0))({
  case ((ls, 0, _), ' ')   => (ls + 1, 0, ls + 1)
  case ((ls, 0, _), c)     => (ls, 1, 0)
  case ((ls, wc, rs), ' ') => (ls, wc, rs + 1)
  case ((ls, wc, 0), c)    => (ls, wc, 0)
  case ((ls, wc, rs), c)   => (ls, wc + 1, 0)
}, {
  case ((0, 0, 0), res) => res
  case (res, (0, 0, 0)) => res
  case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs)
  case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs)
})
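The two operators above can be exercised sequentially to see why arbitrary partitions work: fold each chunk with the element operator, then merge the chunk states with the aggregation operator, which fuses a word split across the cut. This is a plain-Scala simulation (the `seqop`/`combop` names and the cut position are assumptions), not the parallel run itself.

```scala
// State: (# spaces on the left, # words, # spaces on the right).
type S = (Int, Int, Int)

val seqop: (S, Char) => S = {
  case ((ls, 0, _), ' ')   => (ls + 1, 0, ls + 1)
  case ((ls, 0, _), c)     => (ls, 1, 0)
  case ((ls, wc, rs), ' ') => (ls, wc, rs + 1)
  case ((ls, wc, 0), c)    => (ls, wc, 0)
  case ((ls, wc, rs), c)   => (ls, wc + 1, 0)
}

val combop: (S, S) => S = {
  case ((0, 0, 0), res) => res
  case (res, (0, 0, 0)) => res
  case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs)
  case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs)
}

val txt = "Folding me softly."
// Cut in the middle of a word: "Foldin" | "g me softly."
val (a, b) = txt.splitAt(6)
val (_, wc, _) = combop(a.foldLeft((0, 0, 0))(seqop), b.foldLeft((0, 0, 0))(seqop))

println(wc) // 3 words, regardless of where the string is cut
```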
-
Word count
using parallel strings?

txt.par.aggregate((0, 0, 0))({ ... }, { ... }) // same operators as before
-
Word count
string not really parallelizable

scala> (txt: String).par
collection.parallel.ParSeq[Char] = ParArray(...)

different internal representation!
ParArray: copies the string contents into an array
-
Conversions
going parallel

// `par` is efficient for...
mutable.{Array, ArrayBuffer, ArraySeq}
mutable.{HashMap, HashSet}
immutable.{Vector, Range}
immutable.{HashMap, HashSet}

most other collections construct a new parallel collection!
-
Conversions
going parallel

sequential                     parallel
Array, ArrayBuffer, ArraySeq   mutable.ParArray
mutable.HashMap                mutable.ParHashMap
mutable.HashSet                mutable.ParHashSet
immutable.Vector               immutable.ParVector
immutable.Range                immutable.ParRange
immutable.HashMap              immutable.ParHashMap
immutable.HashSet              immutable.ParHashSet
-
Conversions
going parallel

// `seq` is always efficient
ParArray(1, 2, 3).seq
List(1, 2, 3, 4).seq
ParHashMap(1 -> 2, 3 -> 4).seq
"abcd".seq

// `par` may not be...
"abcd".par
-
Custom collections
-
Custom collection

class ParString(val str: String)
extends parallel.immutable.ParSeq[Char] {

  def apply(i: Int) = str.charAt(i)

  def length = str.length

  def seq = new WrappedString(str)

  def splitter = new ParStringSplitter(0, str.length)

}
-
Custom collection
splitter definition - splitters are iterators

class ParStringSplitter(var i: Int, len: Int)
extends Splitter[Char] {

  def hasNext = i < len

  def next = {
    val r = str.charAt(i)
    i += 1
    r
  }
-
Custom collection
splitters must be duplicated, know how many elements remain, and can be split

  ...
  def dup = new ParStringSplitter(i, len)

  def remaining = len - i

  def psplit(sizes: Int*): Seq[ParStringSplitter] = {
    val splitted = new ArrayBuffer[ParStringSplitter]
    for (sz <- sizes) {
      val next = (i + sz) min len
      splitted += new ParStringSplitter(i, next)
      i = next
    }
    splitted
  }
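The splitter contract can be tried out as a plain class, without extending the real `Splitter` trait. `StringSplitter` below is a hypothetical stand-in mirroring the slide's `ParStringSplitter`: an iterator over a string range that reports its remaining size, duplicates itself, and splits its remainder into pieces of requested sizes.

```scala
import scala.collection.mutable.ArrayBuffer

// A toy splitter over a string range [i, len).
class StringSplitter(val str: String, var i: Int, val len: Int) {
  def hasNext = i < len
  def next() = { val r = str.charAt(i); i += 1; r }
  def remaining = len - i                       // how much work is left
  def dup = new StringSplitter(str, i, len)     // independent copy
  def psplit(sizes: Int*): Seq[StringSplitter] = {
    val splitted = new ArrayBuffer[StringSplitter]
    for (sz <- sizes) {
      val next = (i + sz) min len
      splitted += new StringSplitter(str, i, next)
      i = next
    }
    splitted.toSeq
  }
}

val s = new StringSplitter("ABCDEF", 0, 6)
val Seq(p1, p2) = s.psplit(2, 4)
println((p1.remaining, p2.remaining)) // (2,4)
```

Each piece then iterates its own slice, which is what lets worker threads process disjoint parts of the collection.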
-
Word count
now with parallel strings

new ParString(txt).aggregate((0, 0, 0))({
  case ((ls, 0, _), ' ')   => (ls + 1, 0, ls + 1)
  case ((ls, 0, _), c)     => (ls, 1, 0)
  case ((ls, wc, rs), ' ') => (ls, wc, rs + 1)
  case ((ls, wc, 0), c)    => (ls, wc, 0)
  case ((ls, wc, rs), c)   => (ls, wc + 1, 0)
}, {
  case ((0, 0, 0), res) => res
  case (res, (0, 0, 0)) => res
  case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs)
  case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs)
})
-
Word count
performance

txt.foldLeft((0, true)) { ... }              100 ms

new ParString(txt).aggregate((0, 0, 0))({ ... }, { ... })
cores:  1       2      4
time:   137 ms  70 ms  35 ms
-
Hierarchy
GenTraversable
  GenIterable
    GenSeq

sequential: Traversable, Iterable, Seq
parallel:   ParIterable, ParSeq
-
Hierarchy
def nonEmpty(sq: Seq[String]) = {
  val res = new mutable.ArrayBuffer[String]()
  for (s <- sq) if (s.nonEmpty) res += s
  res
}
-
Hierarchy
def nonEmpty(sq: ParSeq[String]) = {
  val res = new mutable.ArrayBuffer[String]()
  for (s <- sq) if (s.nonEmpty) res += s // side effect on a shared buffer - not thread-safe!
  res
}
-
Hierarchy
def nonEmpty(sq: GenSeq[String]) = {
  val res = new mutable.ArrayBuffer[String]()
  for (s <- sq) if (s.nonEmpty) res += s
  res
}
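The reason these slides matter: appending to a shared `ArrayBuffer` from a parallel loop is a data race, whereas a pure transformation is safe for sequential and parallel sequences alike. A minimal sketch of the safe formulation (sequential here so it runs anywhere; the generic `GenSeq` signature on the slide admits both):

```scala
// No shared mutable state: let the collection build the result itself.
def nonEmpty(sq: Seq[String]): Seq[String] =
  sq.filter(s => s.nonEmpty)

println(nonEmpty(Seq("a", "", "b"))) // List(a, b)
```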
-
Accessors vs. transformers
some methods need more than just splitters

accessors: foreach, reduce, find, sameElements, indexOf, corresponds, forall, exists, max, min, sum, count, ...

transformers: map, flatMap, filter, partition, ++, take, drop, span, zip, patch, padTo, ...
these return collections!

sequential collections use builders
parallel collections use combiners
-
Builders
building a sequential collection

[figure: elements 2, 4, 6 are fed to a ListBuilder with +=, and result produces the list]
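The builder protocol from the figure is directly runnable with the standard library: elements arrive one at a time through `+=`, and `result()` assembles the final sequential collection.

```scala
// Build a List imperatively, the way bulk operations do internally.
val b = List.newBuilder[Int]
b += 2
b += 4
b += 6
println(b.result()) // List(2, 4, 6)
```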
-
How to build parallel?
-
Combiners
building parallel collections

trait Combiner[-Elem, +To]
extends Builder[Elem, To] {
  def combine[N <: Elem, NewTo >: To]
    (other: Combiner[N, NewTo]): Combiner[N, NewTo]
}

combine should be efficient - O(log n) worst case

How to implement this combine?
-
Parallel arrays

[figure: per-worker chunks such as (1, 2, 3, 4) and (5, 6, 7, 8) are merged pairwise
(merge, merge, merge), then the result array is allocated and the elements are copied into it]
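A toy combiner in the spirit of the figure (`ChunkCombiner` is an invented name, not the real `ParArrayCombiner`): `+=` appends to the current chunk, `combine` merely concatenates chunk lists, cheap and independent of element count, and only `result` performs the allocate-and-copy step.

```scala
import scala.collection.mutable.ArrayBuffer

class ChunkCombiner[T] {
  // Each worker accumulates into its own chunk(s).
  val chunks = ArrayBuffer(ArrayBuffer.empty[T])
  def +=(elem: T): this.type = { chunks.last += elem; this }
  def combine(that: ChunkCombiner[T]): this.type = {
    chunks ++= that.chunks // O(#chunks): no element copying here
    this
  }
  def result: Vector[T] = chunks.flatten.toVector // the only full copy
}

val left  = new ChunkCombiner[Int]; left += 1; left += 2
val right = new ChunkCombiner[Int]; right += 3; right += 4
println(left.combine(right).result) // Vector(1, 2, 3, 4)
```

The real ParArray combiner additionally performs the final copy in parallel, since each chunk's target offset is known in advance.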
-
Parallel hash tables
ParHashMap
-
Parallel hash tables

e.g. calling filter: the table is split, and each worker adds its surviving
elements to its own ParHashCombiner
-
Parallel hash tables

How to merge two ParHashCombiners?
-
Parallel hash tables
buckets!

combiners keep elements in buckets, indexed by the first bits of the hash code
(0 = 0000₂, 1 = 0001₂, 4 = 0100₂, ...)
-
Parallel hash tables

combine concatenates the matching buckets - no copying!
result then populates the ParHashMap from the buckets in parallel
-
Custom combiners
for methods returning custom collections

new ParString(txt).filter(_ != ' ')

What is the return type here? It creates a ParVector!

class ParString(val str: String)
extends parallel.immutable.ParSeq[Char] {
  def apply(i: Int) = str.charAt(i)
  ...
-
Custom combiners
for methods returning custom collections

class ParString(val str: String)
extends immutable.ParSeq[Char]
   with ParSeqLike[Char, ParString, WrappedString] {

  def apply(i: Int) = str.charAt(i)
  ...
  protected[this] override def newCombiner = new ParStringCombiner
-
Custom combiners
for methods returning custom collections

class ParStringCombiner
extends Combiner[Char, ParString] {

  var size = 0

  val chunks = ArrayBuffer(new StringBuilder)

  var lastc = chunks.last

  def +=(elem: Char) = {
    lastc += elem
    size += 1
    this
  }
-
Custom combiners
for methods returning custom collections

  ...
  def combine[U <: Char, NewTo >: ParString]
    (other: Combiner[U, NewTo]) = other match {
    case that: ParStringCombiner =>
      size += that.size
      chunks ++= that.chunks
      lastc = chunks.last
      this
  }
-
Custom combiners
for methods returning custom collections

  ...
  def result = {
    val rsb = new StringBuilder
    for (sb <- chunks) rsb.append(sb)
    new ParString(rsb.toString)
  }
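The combiner bookkeeping above can be exercised as a plain class (`StringCombiner` is a hypothetical stand-in that returns a `String` instead of a `ParString`, so it runs without the parallel framework): chunks of `StringBuilder`s, `+=` into the last chunk, `combine` by concatenating chunk lists, and a sequential `result` that appends all chunks, the very step the performance slides point out is not parallelized.

```scala
import scala.collection.mutable.ArrayBuffer

class StringCombiner {
  var size = 0
  val chunks = ArrayBuffer(new StringBuilder)
  var lastc = chunks.last

  def +=(elem: Char): this.type = { lastc += elem; size += 1; this }

  def combine(that: StringCombiner): this.type = {
    size += that.size
    chunks ++= that.chunks // cheap: chunk lists are concatenated, not copied
    lastc = chunks.last
    this
  }

  def result: String = {
    val rsb = new StringBuilder
    for (sb <- chunks) rsb.append(sb) // sequential final copy
    rsb.toString
  }
}

val l = new StringCombiner; "Fold".foreach(l += _)
val r = new StringCombiner; "ing".foreach(r += _)
println(l.combine(r).result) // "Folding"
```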
-
Custom combiners
for methods expecting implicit builder factories

// only for big boys
... with GenericParTemplate[T, ParColl] ...

object ParColl extends ParFactory[ParColl] {
  implicit def canCombineFrom[T] =
    new GenericCanCombineFrom[T]
  ...
-
Custom combiners
performance measurement

txt.filter(_ != ' ')                        106 ms

new ParString(txt).filter(_ != ' ')
1 core: 125 ms   2 cores: 81 ms   4 cores: 56 ms

speedup is sublinear - def result is not parallelized
-
Custom combiners
tricky!

two-step evaluation: parallelize the result method in combiners
efficient merge operation: binomial heaps, ropes, etc.
concurrent data structures: non-blocking scalable insertion operation (we're working on this)
-
Future work
coming up

concurrent data structures
more efficient vectors
custom task pools
user-defined scheduling
parallel bulk in-place modifications
-
Thank you!

Examples at: git://github.com/axel22/sd.git