design patterns leveraging spark in pdi - pentahoworld … · 2017-11-06 · design patterns...
TRANSCRIPT
![Page 1: Design Patterns Leveraging Spark in PDI - PentahoWorld … · 2017-11-06 · Design Patterns Leveraging Spark in PDI Chris Skirde ... C. Fast and general purpose engine for large](https://reader031.vdocument.in/reader031/viewer/2022022522/5b2d75297f8b9a55208b55e0/html5/thumbnails/1.jpg)
Design Patterns Leveraging Spark in PDI
Chris Skirde, Pentaho Director of Sales Engineering, Hitachi Vantara
Rakesh Saha, Pentaho Senior Product Manager, Hitachi Vantara
Quiz Time!
• What is Spark?
  A. A good way to start a fire.
  B. Necessary for a well-running internal combustion engine.
  C. Fast and general purpose engine for large-scale data processing.
  D. All of the above.
• True or False, Pentaho supports Spark?
• Who is using Spark today (with or without Pentaho)?
Agenda
• Introduction to Spark
• Common design patterns
• How to leverage Spark with Pentaho
Introduction to Spark
• Why are we interested?
• What is it really?
• What’s been done?
Spark Application Architecture
[Architecture diagram; surviving labels: Daemon, PDI/Server]
What Do Those Applications Have in Common?
Common Design Patterns
• Filter/Organize
• Join
• Sum
• Transform/Enrich
• Query
• Machine Learning/Data Science
Filter/Organize
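The slide's diagram did not survive extraction. As a rough illustration of the pattern itself, here is a minimal plain-Python sketch (not Spark or PDI code) that filters rows by a predicate and then organizes the survivors by sorting. The sample rows, field names, and the `filter_rows` / `organize_rows` helpers are hypothetical, introduced only for this example.

```python
from operator import itemgetter

# Hypothetical sample rows; the field names are illustrative only.
orders = [
    {"order_id": 3, "region": "EMEA", "amount": 120.0},
    {"order_id": 1, "region": "APAC", "amount": 35.5},
    {"order_id": 2, "region": "EMEA", "amount": 80.0},
    {"order_id": 4, "region": "AMER", "amount": 15.0},
]


def filter_rows(rows, predicate):
    """Keep only the rows for which predicate(row) is true."""
    return [row for row in rows if predicate(row)]


def organize_rows(rows, *keys):
    """Sort rows by the given field names, in the order listed."""
    return sorted(rows, key=itemgetter(*keys))


# Filter: keep orders of at least 50.0; Organize: sort by region, then id.
large_orders = filter_rows(orders, lambda row: row["amount"] >= 50.0)
organized = organize_rows(large_orders, "region", "order_id")
```

In a distributed engine the filter can run independently on each partition, while a global sort generally requires moving data between workers, which is why the two halves of this pattern tend to have different costs.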
Join
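The slide's diagram did not survive extraction. To make the pattern concrete, the following is a minimal plain-Python sketch of an inner hash join (not Spark or PDI code): one input is indexed by the join key, and the other is streamed past that index. The `hash_join` helper and the sample data are hypothetical, and this is only one of several join strategies an engine might choose.

```python
from collections import defaultdict


def hash_join(left, right, key):
    """Inner-join two lists of dict rows on a shared key field."""
    # Build phase: index the right-hand rows by join key.
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)

    # Probe phase: look up each left-hand row in the index.
    joined = []
    for row in left:
        for match in index.get(row[key], []):
            # On field-name collisions the left-hand value wins.
            joined.append({**match, **row})
    return joined


# Hypothetical sample data; the field names are illustrative only.
orders = [
    {"order_id": 1, "customer_id": 10},
    {"order_id": 2, "customer_id": 11},
    {"order_id": 3, "customer_id": 99},  # no matching customer
]
customers = [
    {"customer_id": 10, "name": "Acme"},
    {"customer_id": 11, "name": "Globex"},
]

joined = hash_join(orders, customers, "customer_id")
```

Order 3 has no matching customer, so an inner join drops it; an outer join would keep it with the customer fields left empty.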
Sum (and Other Aggregations)
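The slide's diagram did not survive extraction. As an illustration of why sums parallelize well, here is a minimal plain-Python sketch (not Spark or PDI code) of a grouped sum computed in two stages: partial totals per partition, then a merge of the partials. The partitions, field names, and the `partial_sums` / `merge_sums` helpers are hypothetical.

```python
def partial_sums(partition, key, value):
    """Sum the value field per key within a single partition."""
    totals = {}
    for row in partition:
        totals[row[key]] = totals.get(row[key], 0) + row[value]
    return totals


def merge_sums(partials):
    """Combine per-partition totals into one overall result."""
    merged = {}
    for partial in partials:
        for group, subtotal in partial.items():
            merged[group] = merged.get(group, 0) + subtotal
    return merged


# Hypothetical data, already split into two partitions.
partitions = [
    [{"region": "EMEA", "amount": 100}, {"region": "APAC", "amount": 40}],
    [{"region": "EMEA", "amount": 25}, {"region": "AMER", "amount": 10}],
]

totals = merge_sums(partial_sums(p, "region", "amount") for p in partitions)
```

The two-stage shape works because addition is associative and commutative. Count, min, and max follow the same shape; an average needs a (sum, count) pair carried through the merge rather than a single number.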
Transform/Enrich
• Any step you like!
Query – Easy!
• Cloudera: use Hive-on-Spark with Hive2
• Hortonworks: use Spark SQL via Simba
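Both options above come down to sending ordinary SQL through a database driver and reading back rows. The sketch below shows that shape using Python's built-in `sqlite3` module as a local stand-in; it does not connect to Hive or Spark SQL, and the `orders` table and its columns are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for a remote SQL endpoint.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("EMEA", 100.0), ("APAC", 40.0), ("EMEA", 25.0)],
)

# The client only writes SQL; the engine decides how to execute it.
rows = conn.execute(
    "SELECT region, SUM(amount) AS total "
    "FROM orders GROUP BY region ORDER BY region"
).fetchall()
conn.close()
```

With a real cluster, the connection would instead go through the vendor's JDBC or ODBC driver, while the query text would look much the same.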
Machine Learning/Data Science
Recap
What we covered today:
• Reviewed what Spark is and why organizations are adopting it
• Discussed several common data integration design patterns
• Linked those design patterns to Pentaho features for you to try
Questions?
Next Steps
Want to learn more?
• “Meet the Experts” Matt Casters and Mark Hall!
• Adaptive Execution Layer: http://www.pentaho.com/blog/introducing-adaptive-execution-layer-spark-architecture
• SQL on Spark: http://www.pentaho.com/blog/operationalize-spark-big-data-newest-enhancements