Running Spark in Production
TRANSCRIPT

Vinay Shukla – Director, Product Management (Twitter: @neomythos)
Saisai (Jerry) Shao – Member, Technical Staff
April 13, 2016
2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Who Are We?
Vinay Shukla – Product Management. Spark for 2+ years, Hadoop for 3+ years. Recovering programmer. Blog at www.vinayshukla.com, Twitter: @neomythos. Addicted to yoga, hiking, & coffee.
Saisai (Jerry) Shao – Member of Technical Staff, Apache Spark contributor.
Spark Security: Secure Production Deployments
Context: Spark Deployment Modes
• Spark on YARN
  – Spark driver (SparkContext) runs in the YARN AM (yarn-cluster)
  – Spark driver (SparkContext) runs in the local client process (yarn-client)
• Spark Shell & Spark Thrift Server run in yarn-client mode only
[Diagram: YARN-Cluster vs YARN-Client – Client, App Master, Spark Driver, Executors]
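As a sketch, the two modes differ only in the submit flags; the class and jar names below are placeholders, not from the talk:

```shell
# yarn-cluster: driver (SparkContext) runs inside the YARN Application Master
spark-submit --master yarn --deploy-mode cluster \
  --class com.example.MyApp myApp.jar

# yarn-client: driver runs in the local client process
# (required for Spark Shell and the Spark Thrift Server)
spark-submit --master yarn --deploy-mode client \
  --class com.example.MyApp myApp.jar
```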
Spark on YARN
[Diagram: (1) user runs Spark Submit, (2) job submitted to the YARN RM, (3) Spark AM launched in a Node Manager, (4) executors launched across the Hadoop cluster, reading from HDFS]
Spark – Security – Four Pillars
Authentication, Authorization, Audit, Encryption
Spark leverages Kerberos on YARN. This presumes the network is secured, the first step of security.
Kerberos Primer
1. kinit – login and get a Ticket Granting Ticket (TGT) from the KDC
2. Client stores the TGT in its Kerberos ticket cache
3. Client gets a NameNode Service Ticket (NN-ST)
4. Client stores the NN-ST in the ticket cache
5. Read/write file given the NN-ST and file name; the NameNode returns block locations, block IDs, and Block Access Tokens if access is permitted
6. Read/write block on the DataNode given a Block Access Token and block ID
Kerberos authentication within Spark
1. John Doe runs kinit against the KDC
2. Gets a service ticket (ST) for Spark
3. Uses the Spark ST to submit the Spark job
4. Spark AM gets a NameNode (NN) service ticket
5. YARN launches Spark executors using John Doe's identity
6–7. Executors read from HDFS using John Doe's delegation token
Spark + X (Source of Data)
1. John Doe runs kinit against the KDC
2. Gets a service ticket (ST) for Spark
3. Uses the Spark ST to submit the Spark job
4. Spark AM gets an X service ticket
5. YARN launches Spark executors using John Doe's identity
6–7. Executors read from X using John Doe's delegation token
Spark – Kerberos – Example

kinit -kt /etc/security/keytabs/johndoe.keytab [email protected]

./bin/spark-submit --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster --num-executors 3 \
  --driver-memory 512m --executor-memory 512m \
  --executor-cores 1 lib/spark-examples*.jar 10
Spark – Authorization
1. John Doe gets a service ticket for Spark from the KDC
2. Uses the Spark ST to submit the Spark job; Ranger checks: can John launch this job?
3. Spark gets a NameNode (NN) service ticket
4. Executors read from HDFS; Ranger checks: can John read this file?
[Diagram: YARN cluster reading HDFS files A, B, C]
Spark – Communication Channels – YARN-Cluster Mode
[Diagram: Spark Submit → RM → AM/Driver; each NM hosts the Shuffle Service and executors Ex 1 … Ex N. Channels: Control/RPC, Shuffle Data, Shuffle Block Transfer, Read/Write Data to the Data Source, FS – Broadcast, File Download]
Spark – Channel Encryption – Example
• Control/RPC: spark.authenticate = true; leverage YARN to distribute keys
• Shuffle Data: spark.authenticate.enableSaslEncryption = true
• Shuffle Block Transfer (NM → Executor): leverages YARN-based SSL
• Read/Write Data: depends on the data source; for HDFS, RPC encryption (RC4 | 3DES), or SSL for WebHDFS
• FS – Broadcast, File Download: spark.ssl.enabled = true
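Collected in one place, these Spark-side settings could go in spark-defaults.conf; a minimal sketch (property names from Spark 1.x, data-source and NM-side encryption are configured separately):

```properties
spark.authenticate                       true
spark.authenticate.enableSaslEncryption  true
spark.ssl.enabled                        true
```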
Gotchas with Spark Security
• Client → Spark Thrift Server → Spark Executors: no identity propagation on the 2nd hop
  – Forces STS to run as the Hive user to read all data
  – Reduces security
  – Workaround: use SparkSQL via the shell or programmatic API
  – https://issues.apache.org/jira/browse/SPARK-5159
• SparkSQL: granular security unavailable
  – Ranger integration will solve this problem (tech preview coming)
  – Brings row/column-level security and masking features to SparkSQL
• Spark + HBase with Kerberos
  – Issue fixed in Spark 1.4 (SPARK-6918)
• Spark Streaming + Kafka + Kerberos + SSL
  – Issues fixed in HDP 2.4.x
• Spark jobs running > 72 hours
  – Kerberos token not renewed before Spark 1.5
Spark Performance
#1 Big or Small Executor ?
spark-submit --master yarn \
  --deploy-mode client \
  --num-executors ? \
  --executor-cores ? \
  --executor-memory ? \
  --class MySimpleApp \
  mySimpleApp.jar \
  arg1 arg2
Show the Details
Cluster resource: 72 cores + 288 GB memory (3 nodes, each with 24 cores and 96 GB memory)
Benchmark with Different Combinations

Cores per E# | Memory per E (GB) | E per node
     1       |         6         |    18
     2       |        12         |     9
     3       |        18         |     6
     6       |        36         |     3
     9       |        54         |     2
    18       |       108         |     1

#E stands for Executor (18 cores, 108 GB memory usable per node)

Except with too-small or too-big executors, performance is relatively close.
[Chart: benchmark runtimes per combination; the lower the better]
Big or Small Executor?
Avoid too-small or too-big executors.
– Small executors decrease CPU and memory efficiency.
– Big executors introduce heavy GC overhead.
Usually 3–6 cores and 10–40 GB of memory per executor is a preferable choice.
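As a rough sketch (this helper is mine, not from the talk), the per-executor guideline can be turned into a node layout:

```python
def executor_layout(cores_per_node, mem_per_node_gb, cores_per_executor=6):
    """Split a node into equal executors of a fixed core count.

    Returns (executors_per_node, memory_per_executor_gb)."""
    executors = cores_per_node // cores_per_executor
    mem_per_executor = mem_per_node_gb // executors
    return executors, mem_per_executor

# Example: the 24-core, 96 GB nodes above with 6-core executors
print(executor_layout(24, 96, 6))  # (4, 24) -> 4 executors, 24 GB each
```

Both results land inside the 3–6 core, 10–40 GB window recommended above.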
Anything Else?
Executor memory != container memory: container memory = executor memory + overhead memory (10% of executor memory). Leave some resources for other services and the system.
Enable CPU scheduling if you want to constrain CPU usage#

#CPU scheduling – http://hortonworks.com/blog/managing-cpu-resources-in-your-hadoop-yarn-clusters/
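In Spark 1.x on YARN the default overhead (spark.yarn.executor.memoryOverhead) is in fact the larger of 384 MB and 10% of executor memory; a sketch of the container-size arithmetic:

```python
def container_memory_mb(executor_memory_mb, overhead_fraction=0.10,
                        min_overhead_mb=384):
    # Overhead is the larger of a fixed floor and a fraction of executor memory
    overhead = max(min_overhead_mb, int(executor_memory_mb * overhead_fraction))
    return executor_memory_mb + overhead

print(container_memory_mb(8192))  # 8192 + 819 = 9011
print(container_memory_mb(512))   # 512 + 384 = 896 (the floor dominates)
```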
#2 Task Parallelism ?
What is the preferable number of partitions (task parallelism) ?
rdd.groupByKey(numPartitions)
rdd.reduceByKey(func, [numPartitions])
rdd.sortByKey([ascending], [numPartitions])
Task Parallelism ?
[Diagram: three map tasks shuffling to two reduce tasks]
Task Parallelism ?
Task parallelism == number of reducers
Increasing the task parallelism reduces the data to be processed by each task.
[Diagram: three map tasks shuffling to three reduce tasks]
Big or Small Task?
Small tasks
– Fine-grained partitions can mitigate data skew to some extent.
– Relatively little data per task alleviates disk spill.
– More tasks, though, somewhat increases scheduling overhead.
Big tasks
– A too-big shuffle block (> 2 GB) will crash the Spark application.
– Easily introduce data skew.
– Frequent GC problems.
Summary
– Make full use of cluster parallelism.
– Choose a task size according to the size of the shuffle input (~128 MB per task).
– Prefer small tasks over big tasks (increase the task parallelism).
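The ~128 MB-per-task guideline above can be turned into a numPartitions value; this helper is my sketch, not from the talk:

```python
def suggested_partitions(shuffle_input_bytes,
                         target_task_bytes=128 * 1024 * 1024,
                         min_partitions=1):
    # Round up so no task exceeds the target size
    n = -(-shuffle_input_bytes // target_task_bytes)  # ceiling division
    return max(min_partitions, n)

# Example: a 10 GB shuffle input
print(suggested_partitions(10 * 1024**3))  # 80 partitions
```

The result would be passed as numPartitions to groupByKey, reduceByKey, or sortByKey, as shown on the earlier slide.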
Benchmark With Different Task Parallelism
[Chart: benchmark results per parallelism setting; the lower the better]
Conclusion
Different workloads have different processing patterns; knowing the pattern points tuning in the right direction.
No single rule fits every situation; tuning a workload based on its monitored behavior saves effort and yields better results.
The more you know about the framework, the better your tuning results will be.
Thank You
Vinay Shukla (@neomythos) and Saisai (Jerry) Shao