TRANSCRIPT
![Page 1: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/1.jpg)
INSTalytics: Cluster Filesystem Co-design for Big-data Analytics
Muthian Sivathanu, Midhul Vuppalapati, Bhargav S. Gulavani,
Kaushik Rajan, Jyoti Leeka, Jayashree Mohan, Piyus Kedia
Microsoft Research India
![Page 2: INSTalytics: Cluster Filesystem Co-design for Big-data Analytics](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/2.jpg)
Big-data Analytics: Motivation
• Queries to measure, understand & derive intelligence from data
• Huge business value (billion $ industry)
• Large internet companies -> massive data
• Store & process Exabytes of data per week
• Analytics as a Service offerings
• Several Frameworks
• Extensive research work over past decade
![Page 3: INSTalytics: Cluster Filesystem Co-design for Big-data Analytics](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/3.jpg)
Problem statement
• Large-scale analytics queries (100TBs - PBs)
• Very expensive to store in DRAM / on SSD
• Take several hours to execute (on 1000s of machines)
• Consume significant CPU, Disk, Network resources
• Two problems
• High latency for users
• Huge resource/machine cost for service provider
• Goal: Improve efficiency of large-scale analytics processing
![Page 6: INSTalytics: Cluster Filesystem Co-design for Big-data Analytics](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/6.jpg)
Approach at a glance
Compute-aware Storage can drive significant efficiency in analytics
• Today’s Systems: Cluster Filesystem (Read_Block, Append_Block)
• Co-Designed Cluster Filesystem: INSTalytics (Intelligent Store-powered Analytics)
• Improves Query Performance: Latency + Execution cost
• No strings attached!
![Page 7: INSTalytics: Cluster Filesystem Co-design for Big-data Analytics](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/7.jpg)
Outline
• Introduction
• Design & Evaluation
1.) Key mechanism at storage layer
2.) Efficient Query Execution
• Implementation
• Summary
![Page 16: INSTalytics: Cluster Filesystem Co-design for Big-data Analytics](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/16.jpg)
Common Techniques used today
• Partitioning (Filter Query): Retrieve all click records with domain == “cnn”
• Partitioning + Co-location (Join Query)
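A minimal sketch of why partitioning helps the filter query above. It assumes hash partitioning on the `domain` column; the talk does not say which partitioning scheme the click log uses, and `partition_of`, `NUM_PARTITIONS`, and the sample records are hypothetical names and data chosen for illustration.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # illustrative assumption, not a value from the talk


def partition_of(key: str) -> int:
    """Map a key to a partition with a deterministic hash."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def partition_records(records, column):
    """Group records into partitions by the hash of `column`."""
    parts = defaultdict(list)
    for rec in records:
        parts[partition_of(rec[column])].append(rec)
    return parts


def filter_eq(parts, column, value):
    """Scan only the one partition that can contain `value`."""
    scanned = parts.get(partition_of(value), [])
    matches = [r for r in scanned if r[column] == value]
    return matches, len(scanned)


# Hypothetical click log.
clicks = [
    {"domain": "cnn", "user": "u1"},
    {"domain": "bbc", "user": "u2"},
    {"domain": "cnn", "user": "u3"},
    {"domain": "nyt", "user": "u4"},
    {"domain": "bbc", "user": "u5"},
    {"domain": "wsj", "user": "u6"},
]

parts = partition_records(clicks, "domain")
matches, scanned = filter_eq(parts, "domain", "cnn")
print(f"matched {len(matches)} records after scanning {scanned} of {len(clicks)}")
```

The same idea extends to joins: if two files are partitioned on the join column and matching partitions are co-located, each partition pair can be joined locally without moving data across machines.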
![Page 17: INSTalytics: Cluster Filesystem Co-design for Big-data Analytics](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/17.jpg)
But, utility is limited
• Only one column can be chosen for partitioning or co-location
• Helps only the small set of queries that happen to filter/join on that column
• Queries on other columns are still slow!
• How to get multiple partitioning/co-location strategies?
• Only option: Maintain multiple copies of the file
• Prohibitive storage cost
• Cost of maintaining consistency
![Page 18: INSTalytics: Cluster Filesystem Co-design for Big-data Analytics](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/18.jpg)
Logical Replication
• Can we get multiple partition orders without extra storage cost?
• Answer: Yes!
• Key insight: Piggyback on replication done by cluster filesystem
• Today: Physical replication
• All 3 copies of a file are identical byte-wise replicas
• Logical replication: Each replica of file partitioned differently
• Benefit: 3 partition orders with no extra storage cost!
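The contrast between physical and logical replication can be sketched in a few lines. This is an illustration under stated assumptions, not the system's implementation: records are hypothetical `(C1, C2, C3, id)` tuples, "partitioned differently" is modelled simply as sorting each replica on a different column, and the function names are made up for the example.

```python
from operator import itemgetter

# Hypothetical sample records: (C1, C2, C3, record id).
records = [
    (10, 100, 200, "R1"),
    (110, 50, 50, "R2"),
    (50, 210, 250, "R3"),
    (200, 150, 300, "R4"),
    (310, 380, 80, "R5"),
    (110, 140, 330, "R6"),
]


def physical_replicas(rows, copies=3):
    """Physical replication: every copy keeps the same record order."""
    return [list(rows) for _ in range(copies)]


def logical_replicas(rows, sort_columns=(0, 1, 2)):
    """Logical replication: same records, each copy ordered on a different column."""
    return [sorted(rows, key=itemgetter(col)) for col in sort_columns]


phys = physical_replicas(records)
logi = logical_replicas(records)

for col, replica in zip(("C1", "C2", "C3"), logi):
    print(col, [r[3] for r in replica])
```

Both schemes store three copies of the same six records, so the storage footprint is identical; only the ordering inside each copy differs, which is what lets a query pick the replica ordered on the column it filters or joins on.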
![Page 20: INSTalytics: Cluster Filesystem Co-design for Big-data Analytics](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/20.jpg)
Are 3 partition orders enough?
• Analyzed one week of jobs on a production cluster
• Large input files (100GB+): How many columns used in filters / joins?
• One partition order covers only 35% of files
• 3 different partition orders cover 75% of files
[Figure: fraction of large files (y-axis, 0 to 1) vs. columns used for filters and equijoins (x-axis, 0 to 35)]
![Page 31: INSTalytics: Cluster Filesystem Co-design for Big-data Analytics](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/31.jpg)
Each cell lists C1 C2 C3 followed by the record ID.

| Extent | Physical file (un-partitioned) | Logical replica 1 (partitioned C1) | Logical replica 2 (partitioned C2) | Logical replica 3 (partitioned C3) |
|---|---|---|---|---|
| E1 | 10 100 200 R1 | 10 100 200 R1 | 80 30 40 R14 | 120 320 20 R9 |
| E1 | 110 50 50 R2 | 50 210 250 R3 | 110 50 50 R2 | 80 30 40 R14 |
| E1 | 50 210 250 R3 | 60 220 120 R10 | 150 50 320 R9 | 110 50 50 R2 |
| E1 | 200 150 300 R4 | 80 30 40 R14 | 310 80 220 R19 | 310 380 80 R5 |
| E1 | 310 380 80 R5 | 80 210 90 R13 | 180 80 220 R23 | 200 380 80 R12 |
| E1 | 110 140 330 R6 | 80 120 120 R24 | 220 80 180 R11 | 80 210 90 R13 |
| E2 | 300 320 220 R7 | 110 50 50 R2 | 10 100 200 R1 | 370 320 100 R17 |
| E2 | 240 120 320 R8 | 110 140 330 R6 | 80 120 120 R24 | 310 230 120 R20 |
| E2 | 120 320 20 R9 | 150 50 320 R9 | 240 120 320 R8 | 60 220 120 R10 |
| E2 | 60 220 120 R10 | 150 50 380 R15 | 280 120 180 R16 | 80 120 120 R24 |
| E2 | 220 80 180 R11 | 180 210 310 R18 | 110 140 330 R6 | 220 80 180 R11 |
| E2 | 200 380 80 R12 | 180 80 220 R23 | 200 150 300 R4 | 280 120 180 R16 |
| E3 | 80 210 90 R13 | 200 150 300 R4 | 80 210 90 R13 | 10 100 200 R1 |
| E3 | 80 30 40 R14 | 200 380 80 R12 | 180 210 320 R18 | 320 300 210 R21 |
| E3 | 150 50 380 R15 | 220 80 180 R11 | 50 210 250 R3 | 310 80 220 R19 |
| E3 | 280 120 180 R16 | 240 120 320 R8 | 60 220 120 R10 | 180 80 220 R23 |
| E3 | 370 320 100 R17 | 250 220 310 R22 | 250 220 310 R22 | 300 320 220 R7 |
| E3 | 180 210 310 R18 | 280 120 180 R16 | 310 230 120 R20 | 50 210 250 R3 |
| E4 | 310 80 220 R19 | 300 320 220 R7 | 320 300 210 R21 | 200 150 300 R4 |
| E4 | 310 230 120 R20 | 310 380 80 R5 | 370 320 100 R17 | 180 210 310 R18 |
| E4 | 320 300 210 R21 | 310 80 220 R19 | 120 320 20 R9 | 250 220 310 R22 |
| E4 | 250 220 310 R22 | 320 300 210 R21 | 320 320 220 R7 | 240 120 320 R8 |
| E4 | 180 80 220 R23 | 310 230 120 R20 | 320 320 80 R5 | 110 140 330 R6 |
| E4 | 80 120 120 R24 | 370 320 100 R17 | 200 380 80 R12 | 150 50 380 R15 |
Challenge: Recovery cost
• Physical Replication: Recovery copies from another replica (Extent: 250MB)
• Naïve Logical Replication: Prohibitive recovery cost!
Partition ranges: 1-100, 100-200, 200-300, 300-400
![Page 40: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/40.jpg)
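Super Extents
(Each cell lists C1 C2 C3 and the record ID.)

| Super extent | Extent | Logical replica 1 (partitioned on C1) | Logical replica 2 (partitioned on C2) | Logical replica 3 (partitioned on C3) |
|---|---|---|---|---|
| Super-Extent 1 | E1 | 10 100 200 R1 | 110 50 50 R2 | 120 320 20 R9 |
| Super-Extent 1 | E1 | 50 210 250 R3 | 220 80 180 R11 | 110 50 50 R2 |
| Super-Extent 1 | E1 | 60 220 120 R10 | 10 100 200 R1 | 310 380 80 R5 |
| Super-Extent 1 | E1 | 110 50 50 R2 | 240 120 320 R8 | 200 380 80 R12 |
| Super-Extent 1 | E1 | 110 140 330 R6 | 110 140 330 R6 | 60 220 120 R10 |
| Super-Extent 1 | E1 | 120 320 20 R9 | 200 150 300 R4 | 220 80 180 R11 |
| Super-Extent 1 | E2 | 200 380 80 R12 | 50 210 250 R3 | 10 100 200 R1 |
| Super-Extent 1 | E2 | 200 150 300 R4 | 60 220 120 R10 | 300 320 220 R7 |
| Super-Extent 1 | E2 | 220 80 180 R11 | 120 320 20 R9 | 50 210 250 R3 |
| Super-Extent 1 | E2 | 240 120 320 R8 | 300 320 220 R7 | 200 150 300 R4 |
| Super-Extent 1 | E2 | 300 320 220 R7 | 310 380 80 R5 | 240 120 320 R8 |
| Super-Extent 1 | E2 | 310 380 80 R5 | 200 380 80 R12 | 110 140 330 R6 |
| Super-Extent 2 | E3 | 80 30 40 R14 | 80 30 40 R14 | 80 30 40 R14 |
| Super-Extent 2 | E3 | 80 210 90 R13 | 150 50 380 R15 | 80 210 90 R13 |
| Super-Extent 2 | E3 | 80 120 120 R24 | 310 80 220 R19 | 370 320 100 R17 |
| Super-Extent 2 | E3 | 150 50 380 R15 | 180 80 220 R23 | 80 120 120 R24 |
| Super-Extent 2 | E3 | 180 80 220 R23 | 80 120 120 R24 | 310 230 120 R20 |
| Super-Extent 2 | E3 | 180 210 310 R18 | 280 120 180 R16 | 280 120 180 R16 |
| Super-Extent 2 | E4 | 250 220 310 R22 | 80 210 90 R13 | 320 300 210 R21 |
| Super-Extent 2 | E4 | 280 120 180 R16 | 180 210 310 R18 | 180 80 220 R23 |
| Super-Extent 2 | E4 | 310 80 220 R19 | 250 220 310 R22 | 310 80 220 R19 |
| Super-Extent 2 | E4 | 310 230 120 R20 | 310 230 120 R20 | 250 220 310 R22 |
| Super-Extent 2 | E4 | 320 300 210 R21 | 320 300 210 R21 | 180 210 310 R18 |
| Super-Extent 2 | E4 | 370 320 100 R17 | 370 320 100 R17 | 150 50 380 R15 |

• Super Extent
• Contiguous group of fixed # of extents
• Super extent size
• Re-order records at super-extent level
• Consequence
• Partial ordering vs. global ordering
• Benefits = func(super extent size)
• In practice: Super extent size = 100
Recovery cost still 100x!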
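The re-ordering described on this slide can be sketched in Python. This is an illustration under assumptions, not the authors' implementation: records are (C1, C2, C3) tuples taken from the slide's physical file, "partitioning" a replica is modeled as sorting the records of a super extent by one column and cutting them back into equal-sized extents, and the function names `build_logical_replica` and `source_extents_needed` are hypothetical. The last step counts how many extents of another replica hold the records of one lost extent.

```python
from typing import List, Sequence, Tuple

Record = Tuple[int, int, int]  # (C1, C2, C3)

# The 24 records R1..R24 of the physical file in the slide's example.
RECORDS: List[Record] = [
    (10, 100, 200), (110, 50, 50), (50, 210, 250), (200, 150, 300),
    (310, 380, 80), (110, 140, 330), (300, 320, 220), (240, 120, 320),
    (120, 320, 20), (60, 220, 120), (220, 80, 180), (200, 380, 80),
    (80, 210, 90), (80, 30, 40), (150, 50, 380), (280, 120, 180),
    (370, 320, 100), (180, 210, 310), (310, 80, 220), (310, 230, 120),
    (320, 300, 210), (250, 220, 310), (180, 80, 220), (80, 120, 120),
]


def chunk(items: Sequence, size: int) -> List[list]:
    """Cut a sequence into consecutive groups of `size` items."""
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def build_logical_replica(extents, key_col, super_extent_size):
    """Re-order records by key_col, but only within each super extent."""
    per_extent = len(extents[0])
    replica = []
    for group in chunk(extents, super_extent_size):
        records = sorted((r for e in group for r in e), key=lambda r: r[key_col])
        replica.extend(chunk(records, per_extent))
    return replica


def source_extents_needed(lost_extent, source_replica):
    """Indices of extents in source_replica that hold a record of lost_extent."""
    wanted = set(lost_extent)
    return [i for i, e in enumerate(source_replica) if wanted & set(e)]


physical = chunk(RECORDS, 6)  # extents E1..E4, six records each

# Super extent = 2 extents, as drawn on the slide (partial ordering).
by_c1 = build_logical_replica(physical, 0, 2)
by_c2 = build_logical_replica(physical, 1, 2)
partial = source_extents_needed(by_c2[0], by_c1)

# One super extent spanning the whole file (global ordering).
global_c1 = build_logical_replica(physical, 0, 4)
global_c2 = build_logical_replica(physical, 1, 4)
whole_file = source_extents_needed(global_c2[0], global_c1)

print(partial)     # extents touched with super extents of size 2
print(whole_file)  # extents touched with global ordering
```

In this toy data a lost extent's records stay inside its super extent, but they still span every extent of that super extent. Scaled to the slide's numbers, and assuming each touched extent is read in full, rebuilding one 250MB extent from a super extent of 100 extents reads up to 100 x 250MB, which is the 100x recovery cost.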
![Page 51: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/51.jpg)
Chained Intra-extent bucketing
[Figure: replica 1, replica 2 and replica 3, each with columns C1, C2, C3 and extents E1-E4]
Same recovery cost as Physical Replication (in terms of Disk & Network I/O)
• Super extent size = 100
• => Size(Intra-bucket) = 2.5MB
• Disk seek amortized over transfer
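The transcript does not describe the layout in words, so the Python sketch below is one reading of "chained intra-extent bucketing", stated as an assumption. Inside a super extent, each extent of the replica partitioned on C1 groups its records into buckets by the C2 partition they belong to, the C2 replica buckets on C3, and the C3 replica buckets on C1. Extent j of the next replica in the chain is then the union of bucket j from every extent of the previous one. The names `partition_of`, `build_replica` and `recover_extent` are hypothetical, and the data is synthetic.

```python
from typing import Dict, List, Tuple

Record = Tuple[int, int, int]     # (C1, C2, C3)
Extent = Dict[int, List[Record]]  # bucket index -> records in that bucket


def partition_of(records, col, n_parts):
    """Map each record to its partition when range-partitioned on `col`.
    Assumes unique records and len(records) divisible by n_parts."""
    ordered = sorted(records, key=lambda r: (r[col], r))
    per_part = len(ordered) // n_parts
    return {r: i // per_part for i, r in enumerate(ordered)}


def build_replica(records, part_col, bucket_col, n_extents) -> List[Extent]:
    """Extents partitioned on part_col; inside each extent, records are
    grouped into buckets by their partition on bucket_col."""
    part = partition_of(records, part_col, n_extents)
    bucket = partition_of(records, bucket_col, n_extents)
    extents = [{b: [] for b in range(n_extents)} for _ in range(n_extents)]
    for r in records:
        extents[part[r]][bucket[r]].append(r)
    return extents


def extent_records(extent: Extent) -> List[Record]:
    """All records of an extent, in a canonical order for comparison."""
    return sorted(r for b in extent.values() for r in b)


def recover_extent(source_replica: List[Extent], j: int) -> List[Record]:
    """Rebuild extent j of the next replica in the chain by reading only
    bucket j from every extent of the source replica."""
    return sorted(r for extent in source_replica for r in extent[j])


N = 4  # extents per super extent in this toy example
records = [(i * 7 % 24, i * 5 % 24, i * 11 % 24) for i in range(24)]

replica1 = build_replica(records, 0, 1, N)  # partitioned on C1, bucketed on C2
replica2 = build_replica(records, 1, 2, N)  # partitioned on C2, bucketed on C3
replica3 = build_replica(records, 2, 0, N)  # partitioned on C3, bucketed on C1

rebuilt = recover_extent(replica1, 2)       # lost extent 2 of replica 2
records_read = len(rebuilt)

# The slide's numbers: a 250MB extent split over 100 buckets.
bucket_mb = 250 / 100
```

Under this reading, recovery reads one bucket from each extent of the super extent, which adds up to one extent's worth of data, the same volume as copying a physical replica. With 100 extents per super extent each read is about 2.5MB, large enough to amortize the seek. The rebuilt extent would still have to be re-bucketed in memory on its own bucketing column.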
![Page 52: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/52.jpg)
Recovery Cost Evaluation
• Setup
• Dedicated cluster of 500 machines (20 racks x 25 machines)
• Machine configuration
• 2.4GHz Xeon processor w/ 24 H/W threads
• 128GB RAM
• 4x 5TB HDD
• 4x 500GB SSD
• Recovery Experiment
• Ingested large amount of data
• Took down 1 rack of machines
• Measured disk & network utilization
![Page 54: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/54.jpg)
Recovery cost: Disk I/O
Area under the curves is same
![Page 56: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/56.jpg)
Recovery cost: Network I/O
Area under the curves is same
![Page 57: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/57.jpg)
Other storage challenges
• Availability properties
• Fault isolation
Please refer to paper for details
![Page 58: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/58.jpg)
Outline
• Introduction
• Design & Evaluation
1.) Key mechanism at storage layer
2.) Efficient Query Execution
• Implementation
• Summary
![Page 59: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/59.jpg)
Efficient Filter Queries
Super extent 1 (100 extents)
Super extent 2 (100 extents)
Replica partitioned by A
![Page 63: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/63.jpg)
Efficient Filter Queries
Replica partitioned by A
Partition #1 Partition #2 Partition #3 Partition #100
Filter on A
1-100x Savings
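The 1-100x range can be illustrated with a small partition-pruning sketch in Python. It assumes the replica is range-partitioned on A into 100 partitions whose smallest keys are known, and that every key value lives in exactly one partition. The function name `partitions_to_scan` and the boundary values are hypothetical.

```python
from bisect import bisect_right
from typing import List


def partitions_to_scan(lower_bounds: List[int], lo: int, hi: int) -> List[int]:
    """Indices of range partitions that can hold values of A in [lo, hi].

    lower_bounds[i] is the smallest key of partition i, sorted ascending.
    """
    first = max(bisect_right(lower_bounds, lo) - 1, 0)
    last = max(bisect_right(lower_bounds, hi) - 1, 0)
    return list(range(first, last + 1))


# 100 partitions, partition i covering keys [10*i, 10*i + 9].
lower_bounds = list(range(0, 1000, 10))

point = partitions_to_scan(lower_bounds, 42, 42)       # A == 42
narrow = partitions_to_scan(lower_bounds, 250, 269)    # 250 <= A <= 269
everything = partitions_to_scan(lower_bounds, 0, 999)  # matches all keys

for name, parts in [("point", point), ("narrow", narrow), ("all", everything)]:
    print(name, len(parts), "partitions,", len(lower_bounds) // len(parts), "x less data")
```

A point filter touches one partition out of 100, while a filter that matches every key still reads all of them, which gives the two ends of the range on the slide.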
![Page 67: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/67.jpg)
Join Queries: Heterogeneous co-location
• Rack level co-location of partitions across files
Partition #1 Partition #2 Partition #3 Partition #100
File 1
File 2
File 1
File 2
File 3
File 4
Replica 2
More queries get benefits of co-location
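A minimal Python sketch of rack-level co-location, as an illustration only. It assumes both files are partitioned on the join column with the same boundaries, and that placement is a deterministic function from partition index to rack shared by all files. The mapping `rack_for`, the helper `placement` and the rack count are hypothetical, not the placement policy of the actual system.

```python
from typing import Dict, Tuple

N_RACKS = 20        # illustrative; matches the 20-rack evaluation cluster
N_PARTITIONS = 100  # partitions per file, as on the slide


def rack_for(partition_idx: int, n_racks: int = N_RACKS) -> int:
    """Hypothetical placement rule: the same partition index of every file
    maps to the same rack."""
    return partition_idx % n_racks


def placement(file_name: str, n_partitions: int = N_PARTITIONS) -> Dict[Tuple[str, int], int]:
    """Rack assignment for every partition of one file."""
    return {(file_name, p): rack_for(p) for p in range(n_partitions)}


file1 = placement("File 1")
file2 = placement("File 2")

# Partitions whose two join inputs would sit on different racks.
cross_rack = [p for p in range(N_PARTITIONS)
              if file1[("File 1", p)] != file2[("File 2", p)]]
print(len(cross_rack))
```

With a shared mapping, partition i of File 1 and partition i of File 2 sit in the same rack, so joining them needs no cross-rack transfer under the assumptions above.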
![Page 68: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/68.jpg)
Efficient Join Queries: Sliced Reads
• File 1 joined with File 2 on Column A
Partition #1 Partition #2 Partition #3 Partition #100
File 1
File 2
![Page 69: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/69.jpg)
Efficient Join Queries: Sliced Reads
• File 1 joined with File 2 on Column A
Partition #1 Partition #2 Partition #3 Partition #100
File 1
File 2
Needfiner grainedpartitioning
![Page 70: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/70.jpg)
Extent
Efficient Join Queries: Sliced Reads
Storage Node
• File 1 joined with File 2 on Column A
Partition #1 Partition #2 Partition #3 Partition #100
File 1
File 2
Needfiner grainedpartitioning
A B C
![Page 71: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/71.jpg)
Extent
Efficient Join Queries: Sliced Reads
Storage Node
• File 1 joined with File 2 on Column A
Partition #1 Partition #2 Partition #3 Partition #100
File 1
File 2
Needfiner grainedpartitioning
A B C
![Page 72: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/72.jpg)
Extent
Efficient Join Queries: Sliced Reads
Storage Node
Sliced_read(A, 1)
• File 1 joined with File 2 on Column A
Partition #1 Partition #2 Partition #3 Partition #100
File 1
File 2
Needfiner grainedpartitioning
A B C
![Page 73: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/73.jpg)
Extent
Efficient Join Queries: Sliced Reads
Storage Node
Sliced_read(A, 1)
• File 1 joined with File 2 on Column A
Partition #1 Partition #2 Partition #3 Partition #100
File 1
File 2
Needfiner grainedpartitioning
A B C
![Page 74: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/74.jpg)
Extent
Efficient Join Queries: Sliced Reads
Storage Node
Sliced_read(A, 1)
• File 1 joined with File 2 on Column A
Partition #1 Partition #2 Partition #3 Partition #100
File 1
File 2
Needfiner grainedpartitioning
A B C
![Page 75: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/75.jpg)
Extent
Efficient Join Queries: Sliced Reads
Storage Node
Sliced_read(A, 1)
• File 1 joined with File 2 on Column A
Partition #1 Partition #2 Partition #3 Partition #100
File 1
File 2
Needfiner grainedpartitioning
A B C
• Co-ordinated lazy request scheduling
• Selective Caching
![Page 76: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/76.jpg)
Extent
Efficient Join Queries: Sliced Reads
Storage Node
Sliced_read(A, 1)
Sliced_read(A, 2)
• File 1 joined with File 2 on Column A
Partition #1 Partition #2 Partition #3 Partition #100
File 1
File 2
Need finer-grained partitioning
A B C
• Co-ordinated lazy request scheduling
• Selective Caching
![Page 77: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/77.jpg)
Extent
Efficient Join Queries: Sliced Reads
Storage Node
Sliced_read(A, 1)
Sliced_read(A, 2)
Sliced_read(A, 3)
• File 1 joined with File 2 on Column A
Partition #1 Partition #2 Partition #3 Partition #100
File 1
File 2
Need finer-grained partitioning
A B C
• Co-ordinated lazy request scheduling
• Selective Caching
![Page 78: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/78.jpg)
Extent
Efficient Join Queries: Sliced Reads
Storage Node
Sliced_read(A, 1)
Sliced_read(A, 2)
Sliced_read(A, 3)
Sliced_read(A, 4)
• File 1 joined with File 2 on Column A
Partition #1 Partition #2 Partition #3 Partition #100
File 1
File 2
Need finer-grained partitioning
A B C
• Co-ordinated lazy request scheduling
• Selective Caching
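The sliced-read idea on this slide can be illustrated with a minimal Python sketch. It is not the INSTalytics API: the `StorageNode` class, its method names, the integer keys, and the modulo-based slicing rule are all assumptions made for illustration. The sketch shows a storage node returning only one finer-grained slice of an extent, and a simple form of lazy scheduling in which queued slice requests for the same extent are served from a single extent read. Selective caching is not modeled.

```python
from collections import defaultdict


class StorageNode:
    """Toy storage node. Every name here is hypothetical, not the real API."""

    def __init__(self, extents, num_slices):
        self.extents = extents            # extent_id -> list of row dicts
        self.num_slices = num_slices      # finer-grained slices per extent
        self.disk_reads = 0               # counts full-extent reads
        self.pending = defaultdict(list)  # extent_id -> queued (column, slice_id)

    def _read_extent(self, extent_id):
        # Stand-in for reading a whole extent from disk.
        self.disk_reads += 1
        return self.extents[extent_id]

    def _slice_id(self, row, column):
        # Assumption: integer keys, modulo slicing, 1-based ids as on the slide.
        return row[column] % self.num_slices + 1

    def sliced_read(self, extent_id, column, slice_id):
        # Eager variant: read the extent, return only the requested slice.
        rows = self._read_extent(extent_id)
        return [r for r in rows if self._slice_id(r, column) == slice_id]

    def enqueue(self, extent_id, column, slice_id):
        # Lazy variant: queue the request instead of serving it immediately.
        self.pending[extent_id].append((column, slice_id))

    def flush(self, extent_id):
        # Serve every queued request for this extent with one extent read.
        requests = self.pending.pop(extent_id, [])
        if not requests:
            return {}
        rows = self._read_extent(extent_id)
        results = {req: [] for req in requests}
        for row in rows:
            for column, slice_id in results:
                if self._slice_id(row, column) == slice_id:
                    results[(column, slice_id)].append(row)
        return results


node = StorageNode({0: [{"A": k, "B": k * 10} for k in range(8)]}, num_slices=4)
first = node.sliced_read(0, "A", 1)   # one extent read for one slice
for s in (1, 2, 3, 4):
    node.enqueue(0, "A", s)           # four queued slice requests
batch = node.flush(0)                 # one extent read serves all four
```

In this toy model, serving the four slices eagerly would cost four extent reads, while the batched path costs one; that is the effect the "co-ordinated lazy request scheduling" bullet is aiming for.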
![Page 80: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/80.jpg)
AMPLab Big Data Benchmark
Execution cost of queries
[Chart: normalized execution cost for Filter, Group by, and Filter + Join queries]
![Page 81: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/81.jpg)
AMPLab Big Data Benchmark
Execution cost of queries
[Chart: normalized execution cost for Filter, Group by, and Filter + Join queries]
Simultaneous benefits on multiple columns
![Page 82: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/82.jpg)
Production Queries
• Slice of production telemetry analytics workload
• Costs are in compute hours
• Latencies are in minutes
![Page 84: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/84.jpg)
Outline
• Introduction
• Design & Evaluation
1.) Key mechanism at storage layer
2.) Efficient Query Execution
• Implementation
• Summary
![Page 85: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/85.jpg)
Implementation
1.) Create Path
Master
StorageNode
StorageNode
StorageNode
StorageNode
2.) Recovery Path
![Page 86: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/86.jpg)
Logically_replicate(file, adapter)
Implementation
1.) Create Path
Master
StorageNode
StorageNode
StorageNode
StorageNode
2.) Recovery Path
CSV
![Page 87: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/87.jpg)
Logically_replicate(file, adapter)
Implementation
1.) Create Path
Master
StorageNode
StorageNode
StorageNode
StorageNode
2.) Recovery Path
Recover_extent(super-extent info)
CSV
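The create and recovery paths named on this slide can be sketched in Python under stated assumptions. The slide gives only the call names `Logically_replicate(file, adapter)` and `Recover_extent(super-extent info)`; the signatures, the CSV adapter, the CRC-based partitioning rule, and the treatment of a whole replica as one super-extent below are hypothetical. The sketch shows each replica holding the same rows partitioned on a different column, and a lost extent being rebuilt by re-partitioning rows read from another replica.

```python
import csv
import io
import zlib


def csv_adapter(text):
    # Hypothetical adapter: turns raw CSV text into a list of row dicts.
    return list(csv.DictReader(io.StringIO(text)))


def partition_of(value, num_extents):
    # Assumption: stable hash partitioning (the real scheme may differ).
    return zlib.crc32(str(value).encode("utf-8")) % num_extents


def logically_replicate(text, adapter, columns, num_extents):
    # Create path: one replica per column, same rows, different partitioning.
    rows = adapter(text)
    replicas = {}
    for col in columns:
        extents = [[] for _ in range(num_extents)]
        for row in rows:
            extents[partition_of(row[col], num_extents)].append(row)
        replicas[col] = extents
    return replicas


def recover_extent(replicas, lost_col, extent_id, source_col):
    # Recovery path: scan a surviving replica (treated here as one
    # super-extent) and keep the rows that belong to the lost extent.
    source = replicas[source_col]
    num_extents = len(source)
    return [
        row
        for extent in source
        for row in extent
        if partition_of(row[lost_col], num_extents) == extent_id
    ]


text = "A,B,C\n1,x,10\n2,y,20\n3,z,30\n4,x,40\n5,y,50\n"
replicas = logically_replicate(text, csv_adapter, ["A", "B", "C"], num_extents=3)
rebuilt = recover_extent(replicas, lost_col="A", extent_id=1, source_col="B")
```

Because every replica contains the full row set, the rebuilt extent holds the same rows as the lost one, though possibly in a different order.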
![Page 88: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/88.jpg)
Summary
• INSTalytics: Compute-aware cluster filesystem
• Logical replication: Amplifies benefits of partitioning
• Efficient processing of join queries
• Heterogeneous co-location
• Sliced Reads
• Significant performance benefits
• Recovery properties not compromised
• Co-design of Compute & Storage layers for efficient analytics at scale
![Page 89: INSTalytics: Cluster Filesystem Co-design for Big-data Analyticsjaya/slides/instalytics... · 2020. 1. 31. · •Analytics as a Service offerings •Several Frameworks •Extensive](https://reader033.vdocument.in/reader033/viewer/2022060601/6055323614777d563853c770/html5/thumbnails/89.jpg)
Thank you
Questions?