deep dive: memory management in apache spark
TRANSCRIPT
![Page 1: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/1.jpg)
Deep Dive:Memory Management in Apache
Andrew OrJune 8th, 2016@andrewor14
![Page 2: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/2.jpg)
students.select("name").orderBy("age").cache().show()
Caching Tungsten
Off-heapMemoryContention
![Page 3: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/3.jpg)
3
Efficient memory use is critical to good performance
![Page 4: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/4.jpg)
![Page 5: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/5.jpg)
Memory contention poses three challenges for Apache Spark
5
How to arbitrate memory between execution and storage?
How to arbitrate memory across tasks running in parallel?
How to arbitrate memory across operators running within the same task?
![Page 6: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/6.jpg)
Two usages of memory in Apache Spark
6
ExecutionMemory used for shuffles, joins, sorts and aggregations
StorageMemory used to cache data that will be reused later
![Page 7: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/7.jpg)
Iterator
4, 3, 5, 1, 6, 2 Sort
4 3 5 1 6 2 Iterator
1, 2, 3, 4, 5, 6
1 2 3 4 5 6
Execution memory
Take(3)
What if I want the sorted values again?
![Page 8: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/8.jpg)
Iterator
4, 3, 5, 1, 6, 2 Sort
4 3 5 1 6 2 Iterator
1, 2, 3, 4, 5, 6
1 2 3 4 5 6 Take(3)
Iterator
4, 3, 5, 1, 6, 2 Sort
4 3 5 1 6 2 Iterator
1, 2, 3, 4, 5, 6
1 2 3 4 5 6 Take(4)
...
![Page 9: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/9.jpg)
Sort
Iterator
4, 3, 5, 1, 6, 2
4 3 5 1 6 2 Iterator
1, 2, 3, 4, 5, 6
1 2 3 4 5 6
Cache
4 3 5 1 6 21 2 3 4 5 6
Execution memory Storage memory
Take(5)Take(4)Take(3) ...
![Page 10: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/10.jpg)
Challenge #1How to arbitrate memory between
execution and storage?
![Page 11: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/11.jpg)
Easy, static assignment!
11
Total available memory
Execution Storage
Spark 1.0
May 2014
![Page 12: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/12.jpg)
Easy, static assignment!
12
Execution Storage
Spill to disk
Spark 1.0
May 2014
![Page 13: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/13.jpg)
Easy, static assignment!
13
Execution Storage
Spark 1.0
May 2014
![Page 14: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/14.jpg)
Easy, static assignment!
14
Execution Storage
Evict LRU block to disk
Spark 1.0
May 2014
![Page 15: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/15.jpg)
15
Inefficient memory use leads tobad performance
![Page 16: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/16.jpg)
Easy, static assignment!
16
Execution can only use a fraction of the memory, even when there is no storage!
Execution Storage
Spark 1.0May 2014
![Page 17: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/17.jpg)
Storage
Easy, static assignment!
17
Efficient use of memory required user tuning
Execution
Spark 1.0May 2014
![Page 18: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/18.jpg)
18
Fast forward to 2016…How could we have done better?
![Page 19: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/19.jpg)
19
Execution Storage
![Page 20: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/20.jpg)
20
Unified memory management
Spark 1.6+
Jan 2016
What happens if there is already storage?
Execution Storage
![Page 21: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/21.jpg)
21
Unified memory management
Spark 1.6+
Jan 2016
Evict LRU block to disk
Execution Storage
![Page 22: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/22.jpg)
22
Unified memory management
Spark 1.6+
Jan 2016
What about the other way round?
Execution Storage
![Page 23: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/23.jpg)
23
Unified memory management
Spark 1.6+
Jan 2016
Evict LRU block to disk
Execution Storage
![Page 24: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/24.jpg)
Design considerations
24
Why evict storage, not execution?Spilled execution data will always be read back from disk, whereas cached data may not.
What if the application relies on caching?Allow the user to specify a minimum unevictable amount of cached data (not a reservation!).
Spark 1.6+
Jan 2016
![Page 25: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/25.jpg)
Challenge #2How to arbitrate memory across
tasks running in parallel?
![Page 26: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/26.jpg)
Easy, static assignment!
Worker machine has 4 cores
Each task gets 1/4 of the total memory
Slot 1 Slot 2 Slot 3 Slot 4
![Page 27: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/27.jpg)
Alternative: Dynamic assignment
The share of each task depends on number of actively running tasks (N)
Task 1
![Page 28: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/28.jpg)
Alternative: Dynamic assignment
Now, another task comes alongso the first task will have to spill
Task 1
![Page 29: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/29.jpg)
Alternative: Dynamic assignment
Each task is now assigned 1/N ofthe memory, where N = 2
Task 1 Task 2
![Page 30: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/30.jpg)
Alternative: Dynamic assignment
Each task is now assigned 1/N ofthe memory, where N = 4
Task 1 Task 2 Task 3 Task 4
![Page 31: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/31.jpg)
Alternative: Dynamic assignment
Last remaining task gets all the memory because N = 1
Task 3
Spark 1.0+
May 2014
![Page 32: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/32.jpg)
Static vs dynamic assignment
32
Both are fair and starvation free
Static assignment is simpler
Dynamic assignment handles stragglers better
![Page 33: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/33.jpg)
Challenge #3How to arbitrate memory across
operators running within the same task?
![Page 34: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/34.jpg)
SELECT age, avg(height) FROM students GROUP BY age ORDER BY avg(height)
students.groupBy("age") .avg("height") .orderBy("avg(height)") .collect()
Scan
Project
Aggregate
Sort
![Page 35: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/35.jpg)
Worker has 6 pages of memory
Scan
Project
Aggregate
Sort
![Page 36: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/36.jpg)
Scan
Project
Aggregate
Sort
Map { // age → heights 20 → [154, 174, 175] 21 → [167, 168, 181] 22 → [155, 166, 188] 23 → [160, 168, 178, 183]}
![Page 37: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/37.jpg)
Scan
Project
Aggregate
Sort
All 6 pages were used by Aggregate, leaving no memory for Sort!
![Page 38: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/38.jpg)
Solution #1:Reserve a page for
each operator
Scan
Project
Aggregate
Sort
![Page 39: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/39.jpg)
Solution #1:Reserve a page for
each operator
Scan
Project
Aggregate
Sort
Starvation free, but still not fair…What if there were more operators?
![Page 40: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/40.jpg)
Solution #2:Cooperative spilling
Scan
Project
Aggregate
Sort
![Page 41: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/41.jpg)
Scan
Project
Aggregate
SortSolution #2:
Cooperative spilling
![Page 42: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/42.jpg)
Scan
Project
Aggregate
SortSolution #2:
Cooperative spilling
Sort forces Aggregate to spill a page to free memory
![Page 43: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/43.jpg)
Scan
Project
Aggregate
SortSolution #2:
Cooperative spilling
Sort needs more memory so it forces Aggregate to spill another page (and so on)
![Page 44: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/44.jpg)
Scan
Project
Aggregate
SortSolution #2:
Cooperative spilling
Sort finishes with 3 pages
Aggregate does not have to spill its remaining pages
Spark 1.6+
Jan 2016
![Page 45: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/45.jpg)
Recap: Three sources of contention
45
How to arbitrate memory …
● between execution and storage?● across tasks running in parallel?● across operators running within the same task?
Instead of avoid statically reserving memory in advance, deal with memory contention when it arises by forcing members to spill
![Page 46: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/46.jpg)
Project Tungsten
46
Binary in-memory data representation
Cache-aware computation
Code generation (next time)Spark 1.4+
Jun 2015
![Page 47: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/47.jpg)
“abcd”
47
• Native: 4 bytes with UTF-8 encoding• Java: 48 bytes– 12 byte header– 2 bytes per character (UTF-16 internal representation)– 20 bytes of additional overhead– 8 byte hash code
Java objects have large overheads
![Page 48: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/48.jpg)
48
Schema: (Int, String, String)
Row
Array String(“data”)
String(“bricks”)
5+ objects, high space overhead, expensive hashCode()
BoxedInteger(123)
Java objects based row format
![Page 49: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/49.jpg)
6 “bricks”
49
0x0 123 32L 48L 4 “data”
(123, “data”, “bricks”)
Null tracking bitmap
Offset to var. length data
Offset to var. length data
Tungsten row format
![Page 50: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/50.jpg)
Cache-aware computation
50
ptr key rec
ptr key rec
ptr key rec
Naive layoutPoor cache locality
ptrkey prefix rec
ptrkey prefix rec
ptrkey prefix rec
Cache-aware layoutGood cache locality
E.g. sorting a list of records
![Page 51: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/51.jpg)
Off-heap memory
51
Available for execution since Apache Spark 1.6Available for storage since Apache Spark 2.0
Very important for large heaps
Many potential advantages: memory sharing, zero copy I/O, dynamic allocation
![Page 52: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/52.jpg)
For more info...
Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metalhttps://www.youtube.com/watch?v=5ajs8EIPWGI
Spark Performance: What’s Nexthttps://www.youtube.com/watch?v=JX0CdOTWYX4
Unified Memory Managementhttps://issues.apache.org/jira/browse/SPARK-10000
![Page 53: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/53.jpg)
Databricks Community Edition
Free version of cloud based platform in beta
More than 8,000 users registered
Users created over 61,000 notebooks in different languages
http://www.databricks.com/try
53
![Page 54: Deep Dive: Memory Management in Apache Spark](https://reader031.vdocument.in/reader031/viewer/2022021422/586f79d11a28ab10258b70a5/html5/thumbnails/54.jpg)
Thank [email protected]@andrewor14