![Page 1: Heterogeneous Workflows With Spark At Netflix](https://reader031.vdocument.in/reader031/viewer/2022030317/586fde701a28ab18428b6bd3/html5/thumbnails/1.jpg)
Heterogeneous Workflows with Spark at Netflix
Antony Arokiasamy | Kedar Sadekar | Personalization Infrastructure
![Page 2: Heterogeneous Workflows With Spark At Netflix](https://reader031.vdocument.in/reader031/viewer/2022030317/586fde701a28ab18428b6bd3/html5/thumbnails/2.jpg)
Help members find content to watch and enjoy to maximize member satisfaction and retention
![Page 3: Heterogeneous Workflows With Spark At Netflix](https://reader031.vdocument.in/reader031/viewer/2022030317/586fde701a28ab18428b6bd3/html5/thumbnails/3.jpg)
Everything is a Recommendation
Recommendations are driven by Machine Learning
Ranking
Rows
![Page 4: Heterogeneous Workflows With Spark At Netflix](https://reader031.vdocument.in/reader031/viewer/2022030317/586fde701a28ab18428b6bd3/html5/thumbnails/4.jpg)
Machine Learning Pipeline
• User Selection
• Feature Generation
• Model Training
• Model Validation
• Publish Model
![Page 5: Heterogeneous Workflows With Spark At Netflix](https://reader031.vdocument.in/reader031/viewer/2022030317/586fde701a28ab18428b6bd3/html5/thumbnails/5.jpg)
Machine Learning Pipeline Challenges
• Innovation
• Heterogeneous Environments
• Spark
• Native Support
• Separate Orchestration and Execution
• Multi Tenancy
• Machine Learning Constructs
• Parameter Sweep – 30k Dockers
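A parameter sweep expands a grid of candidate values into independent tasks, each of which could run in its own container. The following is a minimal sketch of that expansion, not the speakers' implementation; the `expand_sweep` helper and the parameter names and values in `grid` are hypothetical and chosen only for illustration.

```python
from itertools import product


def expand_sweep(grid):
    """Yield one dict of parameter values per combination in the grid."""
    names = sorted(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


# Hypothetical grid: names and values are illustrative, not from the talk.
grid = {
    "learning_rate": [0.01, 0.1],
    "regularization": [0.0, 0.5, 1.0],
    "num_factors": [10, 50],
}

tasks = list(expand_sweep(grid))
print(len(tasks))  # 2 * 3 * 2 = 12 combinations
```

The number of tasks is the product of the list lengths, which is how a modest grid can grow to tens of thousands of containers.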
![Page 6: Heterogeneous Workflows With Spark At Netflix](https://reader031.vdocument.in/reader031/viewer/2022030317/586fde701a28ab18428b6bd3/html5/thumbnails/6.jpg)
Meson Workflow System
• General Purpose Workflow Orchestration and Scheduling framework
• Delegates execution to resource managers like Mesos
• Optimized for Machine Learning Pipelines and Visualization
• Check out the blog: http://bit.ly/mesonws or techblog.netflix.com
• Plan to open source soon
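Separating orchestration from execution means the workflow engine only decides the order of steps and hands each one to something else to run. The sketch below illustrates that split with Python's standard `graphlib` module (Python 3.9+). It is not Meson's API: `run_workflow` and the `executor` callable are hypothetical stand-ins for a scheduler delegating to a resource manager, and the step order is assumed from the pipeline slide.

```python
from graphlib import TopologicalSorter


def run_workflow(dependencies, executor):
    """Run steps in dependency order, delegating each one to `executor`."""
    order = []
    for step in TopologicalSorter(dependencies).static_order():
        # The orchestrator only sequences steps; the executor does the work.
        executor(step)
        order.append(step)
    return order


# Maps each step to the set of steps it depends on (assumed ordering).
dependencies = {
    "feature_generation": {"user_selection"},
    "model_training": {"feature_generation"},
    "model_validation": {"model_training"},
    "publish_model": {"model_validation"},
}

launched = []  # Stand-in executor: records the steps it was asked to run.
order = run_workflow(dependencies, launched.append)
print(order)
```

Swapping `launched.append` for a function that submits a container or a Spark job would change where a step runs without changing how the workflow is ordered.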
![Page 7: Heterogeneous Workflows With Spark At Netflix](https://reader031.vdocument.in/reader031/viewer/2022030317/586fde701a28ab18428b6bd3/html5/thumbnails/7.jpg)
Meson Architecture
![Page 8: Heterogeneous Workflows With Spark At Netflix](https://reader031.vdocument.in/reader031/viewer/2022030317/586fde701a28ab18428b6bd3/html5/thumbnails/8.jpg)
Standard and Custom Step Types
![Page 9: Heterogeneous Workflows With Spark At Netflix](https://reader031.vdocument.in/reader031/viewer/2022030317/586fde701a28ab18428b6bd3/html5/thumbnails/9.jpg)
Parameter Passing
[Diagram labels: HiveQuery, UserDataSet, RegionalDataSet, GlobalDataSet, GetUsers, WrangleData, RegionalModel, GlobalModel]
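One way to picture parameter passing is that each step reads named values produced by earlier steps and publishes its own named outputs. The sketch below reuses the labels from the diagram, but the wiring between them is an assumption, the step bodies are placeholders, and the `run` helper is hypothetical rather than part of Meson.

```python
def get_users(params):
    # Placeholder for running the Hive query; returns a fake user data set.
    return {"UserDataSet": f"users from [{params['HiveQuery']}]"}


def wrangle_data(params):
    users = params["UserDataSet"]
    return {
        "RegionalDataSet": f"regional split of {users}",
        "GlobalDataSet": f"global view of {users}",
    }


def regional_model(params):
    return {"RegionalModel": f"model trained on {params['RegionalDataSet']}"}


def global_model(params):
    return {"GlobalModel": f"model trained on {params['GlobalDataSet']}"}


def run(steps, params):
    """Run steps in order; each step's outputs become later steps' inputs."""
    params = dict(params)
    for step in steps:
        params.update(step(params))
    return params


result = run(
    [get_users, wrangle_data, regional_model, global_model],
    {"HiveQuery": "SELECT ..."},  # placeholder query text
)
print(sorted(result))
```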
![Page 10: Heterogeneous Workflows With Spark At Netflix](https://reader031.vdocument.in/reader031/viewer/2022030317/586fde701a28ab18428b6bd3/html5/thumbnails/10.jpg)
Structured Constructs
![Page 11: Heterogeneous Workflows With Spark At Netflix](https://reader031.vdocument.in/reader031/viewer/2022030317/586fde701a28ab18428b6bd3/html5/thumbnails/11.jpg)
Top Down or Bottom Up
![Page 12: Heterogeneous Workflows With Spark At Netflix](https://reader031.vdocument.in/reader031/viewer/2022030317/586fde701a28ab18428b6bd3/html5/thumbnails/12.jpg)
Two Way Communication
![Page 13: Heterogeneous Workflows With Spark At Netflix](https://reader031.vdocument.in/reader031/viewer/2022030317/586fde701a28ab18428b6bd3/html5/thumbnails/13.jpg)
Spark Step
![Page 14: Heterogeneous Workflows With Spark At Netflix](https://reader031.vdocument.in/reader031/viewer/2022030317/586fde701a28ab18428b6bd3/html5/thumbnails/14.jpg)
Artifacts
• Step outputs tracked as Artifacts
• Visualization
• Memoization
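Tracking step outputs as artifacts makes memoization possible: if a step has already run with the same inputs, its stored output can be reused instead of recomputing it. The sketch below shows the idea with an in-memory store. `ArtifactStore`, its key scheme, and the `count_users` step are hypothetical and do not describe how Meson stores artifacts.

```python
import hashlib
import json


class ArtifactStore:
    """In-memory stand-in for an artifact store keyed by step and inputs."""

    def __init__(self):
        self._artifacts = {}
        self.executions = 0

    def _key(self, step_name, inputs):
        # Identical step name and inputs always hash to the same key.
        payload = json.dumps({"step": step_name, "inputs": inputs}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def run(self, step_name, func, inputs):
        key = self._key(step_name, inputs)
        if key not in self._artifacts:
            self._artifacts[key] = func(**inputs)
            self.executions += 1
        return self._artifacts[key]


def count_users(region, day):
    # Placeholder for an expensive step.
    return f"user count for {region} on {day}"


store = ArtifactStore()
first = store.run("count_users", count_users, {"region": "US", "day": "2016-01-01"})
second = store.run("count_users", count_users, {"region": "US", "day": "2016-01-01"})
third = store.run("count_users", count_users, {"region": "EU", "day": "2016-01-01"})
print(store.executions)  # 2: the repeated US call is served from the store
```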
![Page 15: Heterogeneous Workflows With Spark At Netflix](https://reader031.vdocument.in/reader031/viewer/2022030317/586fde701a28ab18428b6bd3/html5/thumbnails/15.jpg)
Multi Tenancy
• Resource Attributes
• spark.cores.max
• spark.executor.memory
• spark.mesos.constraints
• Dynamic Resource Allocation
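The properties listed above are standard Spark settings that can be passed per job with `spark-submit --conf`. The sketch below assembles such a command line in plain Python. The `spark_submit_args` helper, the master URL, the application file name, and every property value are hypothetical examples, not the configuration used in the talk. Dynamic allocation is shown together with the external shuffle service, which it typically requires.

```python
def spark_submit_args(master, app, conf):
    """Build a spark-submit argument list from a dict of Spark properties."""
    args = ["spark-submit", "--master", master]
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    args.append(app)
    return args


# Hypothetical per-tenant limits; all values are illustrative only.
tenant_conf = {
    "spark.cores.max": "200",                   # cap on total cores for the app
    "spark.executor.memory": "8g",              # memory per executor
    "spark.mesos.constraints": "tier:ml",       # only accept matching Mesos offers
    "spark.dynamicAllocation.enabled": "true",  # grow and shrink executors on demand
    "spark.shuffle.service.enabled": "true",    # usually needed for dynamic allocation
}

args = spark_submit_args(
    "mesos://mesos-master.example.com:5050",  # hypothetical master URL
    "train_model.py",                         # hypothetical application file
    tenant_conf,
)
print(" ".join(args))
```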
![Page 16: Heterogeneous Workflows With Spark At Netflix](https://reader031.vdocument.in/reader031/viewer/2022030317/586fde701a28ab18428b6bd3/html5/thumbnails/16.jpg)
Cluster Management
• Red-Black software updates
• Scale up / Scale down
![Page 17: Heterogeneous Workflows With Spark At Netflix](https://reader031.vdocument.in/reader031/viewer/2022030317/586fde701a28ab18428b6bd3/html5/thumbnails/17.jpg)
Meson/Spark Cluster
• 100s of Concurrent Jobs
• 700 Nodes
• 5000 Cores
• 25 TB Memory
• Apps: Meson Workflow System, Spark and Dockers
• Few smaller clusters
![Page 18: Heterogeneous Workflows With Spark At Netflix](https://reader031.vdocument.in/reader031/viewer/2022030317/586fde701a28ab18428b6bd3/html5/thumbnails/18.jpg)
Antony Arokiasamy Kedar Sadekar
@aasamy
/aasamy
@kedar_sadekar
/kedar-sadekar