Lightning Talks & Integrations Track - Running Apache Spark Libraries on Apache Apex @ abdw17,...

Post on 15-Apr-2017

Category:

Technology

TRANSCRIPT

2

• Motivation
• Apex Processing Model
• Spark Processing Model
• Translation from Spark to Apex
• Parallelism in Apex
• I/O Performance Enhancement
• Roadmap

7

val parsed = sc.textFile(path, minPartitions)
  .map(_.trim)
  .filter(line => !(line.isEmpty || line.startsWith("#")))
  .map(training_record)

val d = parsed.reduce(math.max) + 1

parsed.map(_ + d).collect()
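The pipeline on this slide can be traced without a cluster. The following is a minimal sketch using plain Scala collections in place of an RDD. It assumes that each input line holds a single integer; `trainingRecord` is a hypothetical stand-in for the slide's `training_record`, whose real definition is not shown in the talk. The object and method names are likewise illustrative only.

```scala
object ParsePipelineSketch {
  // Hypothetical stand-in for the slide's `training_record`.
  // Assumption: every non-comment line contains one integer.
  def trainingRecord(line: String): Int = line.toInt

  def run(lines: Seq[String]): Seq[Int] = {
    // Same three steps as the slide: trim, drop blanks and comments, parse.
    val parsed = lines
      .map(_.trim)
      .filter(line => !(line.isEmpty || line.startsWith("#")))
      .map(trainingRecord)

    // Largest parsed value plus one, mirroring the reduce step.
    // Note: reduce throws on an empty collection.
    val d = parsed.reduce((a, b) => math.max(a, b)) + 1

    // Shift every record by d. On an RDD, collect() would bring
    // these results back to the driver at this point.
    parsed.map(_ + d)
  }

  def main(args: Array[String]): Unit = {
    val input = Seq("# comment", " 3 ", "", "7", "1")
    println(run(input).mkString(", ")) // prints "11, 15, 9"
  }
}
```

With the sample input, the parsed values are 3, 7 and 1, so `d` is 8 and each record is shifted by 8.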

8

val parsed = sc.textFile(path, minPartitions)
  .map(_.trim)
  .filter(line => !(line.isEmpty || line.startsWith("#")))
  .map(training_record)

Apex RDD

parsed

9

val d = parsed.reduce(math.max) + 1

val d = nParsed

Apex RDD

10

parsed.map(_ + d).collect()

Parsed (ApexRDD)

11

[Diagram: four Map operators and two Reduce operators]
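The diagram labels suggest map work spread across several parallel instances, with the results then combined by reduce steps. A minimal sketch of that general pattern in plain Scala follows. It uses no Apex or Spark APIs, and the round-robin partitioning, the per-partition partial reduce, and all names here are assumptions made for illustration rather than a description of how Apex schedules operators.

```scala
object PartitionedMapReduceSketch {
  // Split the input into at most `numPartitions` groups, round-robin.
  // Assumption for illustration only; not Apex's partitioning scheme.
  def partition[A](data: Seq[A], numPartitions: Int): Seq[Seq[A]] =
    data.zipWithIndex
      .groupBy { case (_, i) => i % numPartitions }
      .toSeq
      .sortBy(_._1)
      .map { case (_, items) => items.map(_._1) }

  // Map each partition independently, reduce within each partition,
  // then merge the partial results with a final reduce.
  // `op` should be associative and commutative; `data` must be non-empty.
  def mapReduce(data: Seq[Int], numPartitions: Int)
               (f: Int => Int)(op: (Int, Int) => Int): Int = {
    val partials = partition(data, numPartitions)
      .map(_.map(f))      // the "Map" stage, one pass per partition
      .map(_.reduce(op))  // partial "Reduce" per partition
    partials.reduce(op)   // final "Reduce" across partitions
  }

  def main(args: Array[String]): Unit = {
    val doubledMax =
      mapReduce(Seq(3, 7, 1, 5, 2), 4)(_ * 2)((a, b) => math.max(a, b))
    println(doubledMax) // prints 14
  }
}
```

Because the reduce function is associative and commutative, the result does not depend on how the records are split across partitions, which is what makes this kind of parallel execution safe.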
