Why Your Apache Spark Job Is Failing


Why Your Spark Job Is Failing
Kostas Sakellis

Me

• Software engineer at Cloudera
• Contributor to Apache Spark
• Before that, worked on Cloudera Manager


com.esotericsoftware.kryo.KryoException: Unable to find class: $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$4$$anonfun$apply$3


We go about our day ignoring manholes until…

Courtesy of: http://www.independent.co.uk/incoming/article9127706.ece/binary/original/maholev23.jpg


… something goes wrong.

Courtesy of: http://greenpointers.com/wp-content/uploads/2015/03/Manhole-Explosion1.jpg


org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 4 times, most recent failure: Lost task 1.3 in stage 0.0 (TID 6, kostas-4.vpc.cloudera.com): java.lang.NumberFormatException: For input string: "3.9166,10.2491,-4.0926,-4.4659,0"
at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1250)
at java.lang.Double.parseDouble(Double.java:540)
at scala.collection.immutable.StringLike[...]
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1203)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
[...]


Job? What now?

Courtesy of: http://calvert.lib.md.us/jobs_pic.jpg

Example

sc.textFile("hdfs://...", 4)
  .map(x => x.toInt)
  .filter(_ > 10)
  .sum()



Then what the heck is a stage?

Courtesy of: https://writinginadeadworld.files.wordpress.com/2014/03/rock1.jpeg

Partitions

sc.textFile("hdfs://...", 4)
  .map(x => x.toInt)
  .filter(_ > 10)
  .sum()

[Diagram: the HDFS file is read as four partitions, Partition 1 through Partition 4]

RDDs

sc.textFile("hdfs://...", 4)
  .map(x => x.toInt)
  .filter(_ > 10)
  .sum()

[Diagram, built over three slides: the four HDFS partitions become RDD1 (textFile); map yields RDD2 and filter yields RDD3, each keeping four partitions]

RDD Lineage

sc.textFile("hdfs://...", 4)
  .map(x => x.toInt)
  .filter(_ > 10)
  .sum()

[Diagram: HDFS blocks → RDD1 (textFile) → RDD2 (map) → RDD3 (filter) → Sum; this chain of dependencies leading to the action is the RDD's lineage]
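Lineage is also inspectable at the REPL: toDebugString prints the chain of RDDs behind a result. A minimal sketch, assuming a SparkContext named sc as in spark-shell:

// Prints one line per RDD in the lineage of `result`.
val result = sc.textFile("hdfs://...", 4)
  .map(x => x.toInt)
  .filter(_ > 10)
println(result.toDebugString)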

RDD Dependencies

[Diagram: the same lineage; each output partition depends on exactly one parent partition, so these are narrow dependencies]

• Narrow and wide dependencies

Wide Dependencies

• Sometimes records need to be grouped together
• Examples: join, groupByKey
• Stages are created at wide-dependency boundaries (see the sketch below)
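A minimal sketch of both kinds of wide operation; the data is made up for illustration:

// Assumes a SparkContext named sc. groupByKey and join must co-locate
// records that share a key, so each adds a wide (shuffle) dependency
// and therefore starts a new stage.
val left  = sc.parallelize(Seq((1, "a"), (2, "b"), (1, "c")))
val right = sc.parallelize(Seq((1, "x"), (2, "y")))

val grouped = left.groupByKey()   // wide: same-key records meet in one partition
val joined  = left.join(right)    // wide: both sides are shuffled by key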

A More Interesting Spark Job

val rdd1 = sc.textFile("hdfs://...")
  .map(someFunc)
  .filter(filterFunc)

val rdd2 = sc.hadoopFile("hdfs://...")
  .groupByKey()
  .map(someOtherFunc)

val rdd3 = rdd1.join(rdd2)
  .map(someFunc)

rdd3.collect()

Taking it one RDD at a time:

rdd1: textFile → map → filter (all narrow dependencies)
rdd2: hadoopFile → groupByKey → map (groupByKey is wide)
rdd3: join of rdd1 and rdd2, then map (join is wide)

A More Interesting Spark Job

rdd3.collect()

[Diagram: the complete DAG, textFile → map → filter and hadoopFile → groupByKey → map feeding join → map; its two wide dependencies split the job into four stages, numbered 1 through 4]


Get to the point before I stop caring!


What was the failure?

org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 4 times, most recent failure: Lost task 1.3 in stage 0.0 (TID 6, kostas-4.vpc.cloudera.com): java.lang.NumberFormatException: For input string: "3.9166,10.2491,-4.0926,-4.4659,0" [...]
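The give-away is the input string itself: it is an entire comma-separated record handed to a numeric parser in one piece. A plausible repair, assuming the lines really are comma-separated numeric fields (this sketch is mine, not the deck's):

import scala.util.Try

// Split each record into fields before parsing; drop fields that still
// fail to parse rather than killing the task.
val total = sc.textFile("hdfs://...")
  .flatMap(_.split(","))
  .flatMap(s => Try(s.trim.toDouble).toOption)
  .sum()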

What was the failure?

[Diagram, built over three slides: one stage containing four tasks; a failing task is retried until it has failed four times]

spark.task.maxFailures=4
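Each task is retried up to spark.task.maxFailures times (default 4) before the stage, and with it the job, is aborted. The limit is an ordinary Spark setting; a sketch with an illustrative value:

import org.apache.spark.{SparkConf, SparkContext}

// Allow more retries per task, e.g. on flaky hardware, before giving up.
val conf = new SparkConf()
  .setAppName("retry-example")
  .set("spark.task.maxFailures", "8")   // default is 4
val sc = new SparkContext(conf)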

The driver-side trace names the failed task and its host (kostas-4.vpc.cloudera.com). When the cause is not obvious there, look in that executor's own log, which can show a different exception entirely:

ERROR executor.Executor: Exception in task ID 2866
java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:565)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:648)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:706)
at java.io.DataInputStream.read(DataInputStream.java:100)
at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:209)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:173)
at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:206)
at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:45)
at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:164)
[...]


Spark Architecture

YARN Architecture

[Diagram: a client submits to the Resource Manager; Node Managers run containers, one hosting the Application Master and the others application processes]

Spark on YARN Architecture

[Diagram, built over two slides: in yarn-client mode the driver stays in the client process; the Application Master and the executors each run in YARN containers on the Node Managers]

spark-submit --executor-memory 2g \
  --master yarn-client \
  --num-executors 2 \
  --executor-cores 2


Container [pid=63375,containerID=container_1388158490598_0001_01_000003] is running beyond physical memory limits. Current usage: 2.2 GB of 2.1 GB physical memory used; 2.8 GB of 4.2 GB virtual memory used. Killing container. [...]


We asked for 2 GB executors (--executor-memory 2g), so where did the 2.1 GB limit come from?

Memory allocation

Within a node's YARN allocation (yarn.nodemanager.resource.memory-mb), each executor container holds:
• spark.yarn.executor.memoryOverhead: 7% of executor memory (10% in Spark 1.4), added on top of the heap
• spark.executor.memory: the executor heap, divided into
  • spark.shuffle.memoryFraction (0.4)
  • spark.storage.memoryFraction (0.6)
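These are ordinary Spark settings, so the cure for the killed container above is to grow the overhead or shrink the heap. A sketch using the Spark 1.x property names from the breakdown above; the values are illustrative, not recommendations:

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.executor.memory", "2g")                // the executor heap
  .set("spark.yarn.executor.memoryOverhead", "512")  // MB added on top of the heap
  .set("spark.shuffle.memoryFraction", "0.4")        // heap share for shuffle
  .set("spark.storage.memoryFraction", "0.6")        // heap share for cached data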


Sometimes jobs run slow or even…

Courtesy of: http://blog.sdrock.com/pastors/files/2013/06/time-clock.jpg

java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1986)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
[...]
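This error means the JVM is spending nearly all of its time in garbage collection. One common mitigation (my suggestion, not the deck's prescription) is to shrink each task's working set by reading the input into more, smaller partitions:

// More partitions mean smaller per-task working sets and less heap pressure.
val data = sc.textFile("hdfs://...", minPartitions = 400)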


GC Stalls


Too much spilling!

Courtesy of: http://tgnp.me/wp-content/uploads/2014/05/spilled-starbucks.jpg

Shuffle Boundaries

[Diagram: the earlier DAG (textFile → map → filter; hadoopFile → groupByKey → map; join → map) with the shuffle edges at groupByKey and join highlighted]


Most performance issues are in shuffles!

Inside a Task: Fetch & Aggregate

[Diagram: fetched shuffle blocks are deserialized into an ExternalAppendOnlyMap of key → values; when the map outgrows memory it is sorted and spilled to disk]

Inside a Task: Specify Partitions

rdd.reduceByKey(reduceFunc, numPartitions = 1000)
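The partition count can also be changed mid-pipeline; a brief sketch reusing the rdd from the line above:

val wider    = rdd.repartition(2000)   // full shuffle into 2000 partitions
val narrower = rdd.coalesce(200)       // shrinks the count without a full shuffle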

Why not set partitions to ∞?


Excessive parallelism

• Overwhelming scheduler overhead
• More fetches -> more disk seeks
• Driver needs to track state per task


So how to choose?

• Easy answer: keep multiplying the partition count by 1.5 and see what works


Is Spark bad?

Courtesy of: https://theferkel.files.wordpress.com/2015/04/250474-breaking-bad-quotes.jpg


Thank you
