Roadmap: Operating Pentaho at Scale Jens Bleuel Senior Product Manager, Pentaho

Post on 22-May-2018

Page 1: 6 Roadmap Operating Pentaho at Scale - … –Worker Nodes Hear about new upcoming capabilities for scaling out the Pentaho platform in large enterprise operations. This will cover

Agenda – Worker Nodes

Hear about new upcoming capabilities for scaling out the Pentaho platform in large enterprise operations. This will cover 8.0 and roadmap topics.

• Worker Nodes: Overview and Business Benefits
• How is this different from AEL/Hadoop MapReduce
• Typical Customer Scenarios
• Architecture & Capabilities including Monitoring & Logging
• Improvements in Related Areas
• Demonstration
• Availability & Roadmap

Worker Nodes – Overview

• Worker Nodes can scale work items across multiple nodes (containers) like:
  – PDI jobs and transformations (in 8.0)
  – Report executions (not in 8.0)
  – […]
• It operates easily and securely across an elastic architecture, which adds additional machine resources as they are required for processing
• Worker Nodes can operate on premise or in the cloud
• Uses popular technologies under the hood such as Docker (container platform), Chronos (scheduler) and Mesos/Marathon (container orchestration)

[Diagram: Distribute and Scale across Worker Node (a), Worker Node (b), Worker Node (c…)]

Worker Nodes – Business Benefits

Large enterprises need the ability to seamlessly and efficiently spin up resources to handle 100s+ work items at different times, with different dependencies and processing requirements. Worker Nodes addresses these needs and delivers:

• Faster time to value and reduced TCO because it enables customers to deploy their own scale-out processes without required services
• Manage changing workloads more efficiently by spinning resources up and down as needed
• Increased business agility thanks to containerization, which enables portability of processes across on-prem and cloud environments without the need to re-engineer them.
  – Even in pure on-prem, WN provides elasticity and resource optimization.

How Is This Different from AEL/Hadoop MapReduce?

SCALE OUT ON DATA

AEL/Hadoop MapReduce (simplified):
• Data is distributed across nodes
• The processing takes place at the node level
• Helps to scale out data volume

SCALE OUT ON PROCESSES (WORK ITEMS)

Worker Nodes (simplified):
• Work items like PDI jobs and PDI transformations get distributed across nodes. This is about the processing and orchestration (in contrast to distributing data)
• Helps to scale out Pentaho processes

These two architectures can also be combined: within a Worker Node, a PDI transformation can also scale out with AEL or MapReduce.
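The contrast between the two scale-out styles can be sketched in a few lines of Python. This is only an illustration: the function names, the node names, and the round-robin placement are assumptions made for the example. The slides do not describe how the orchestration layer actually places work items.

```python
from itertools import cycle


def scale_out_on_data(rows, node_count):
    """Split the rows of ONE dataset across nodes (the AEL/MapReduce idea, simplified)."""
    partitions = [[] for _ in range(node_count)]
    for index, row in enumerate(rows):
        partitions[index % node_count].append(row)
    return partitions


def scale_out_on_work_items(work_items, node_names):
    """Hand WHOLE work items to nodes, round-robin (a hypothetical placement policy)."""
    assignment = {name: [] for name in node_names}
    for item, name in zip(work_items, cycle(node_names)):
        assignment[name].append(item)
    return assignment


# One dataset, processed piecewise on two nodes.
data_partitions = scale_out_on_data(list(range(5)), 2)

# Three independent work items (file names are made up), placed on two nodes.
item_assignment = scale_out_on_work_items(
    ["load_sales.kjb", "clean_orders.ktr", "aggregate.ktr"],
    ["wn1", "wn2"],
)
print(data_partitions)
print(item_assignment)
```

In the first function each node sees part of the same data. In the second, each node runs a complete job or transformation on its own.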

Typical Customer Scenarios

Customer Type | Typical Number of Work Items | Scale-Out Need
Small | Up to 10 | No
Medium | 10 through 100 | Sometimes
Enterprise with one department | About 100 | Yes
Enterprise with multiple departments | Hundreds or thousands | Yes

Typical Customer Examples – SLAs and Time Windows

• Need to meet customer SLAs
  – Data from hundreds of sources needs to get collected and aggregated
  – This is done by hundreds of PDI jobs and transformations
  – All these jobs and transformations need to be finished within a defined time window (for example between 5 am and 7 am) so that the data is available and accurate for the target audience
• Worker Nodes provides the technology to run processes in parallel and scale out when needed, for example at peak times (end of month)
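A rough way to reason about such a time window is to estimate how many work items must run in parallel to finish in time. The sketch below is a back-of-the-envelope lower bound. The job count and average duration are hypothetical, and it assumes independent jobs of similar length with no dependencies between them.

```python
import math
from datetime import datetime


def minimum_parallel_slots(job_count, avg_minutes_per_job, window_start, window_end):
    """Lower bound on parallel execution slots needed to finish inside the window.

    Assumes independent jobs of roughly equal length that pack perfectly;
    schedules with dependencies or uneven runtimes need more headroom.
    """
    window_minutes = (window_end - window_start).total_seconds() / 60
    if window_minutes <= 0:
        raise ValueError("window_end must be later than window_start")
    total_work_minutes = job_count * avg_minutes_per_job
    return math.ceil(total_work_minutes / window_minutes)


# Hypothetical workload: 300 jobs of about 4 minutes each, window 5 am to 7 am.
start = datetime(2018, 1, 1, 5, 0)
end = datetime(2018, 1, 1, 7, 0)
slots = minimum_parallel_slots(300, 4, start, end)
print(slots)
```

With these assumed numbers, 1,200 minutes of work must fit into a 120-minute window, so at least 10 items have to run side by side.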

Typical Customer Examples – Shared Services

Example of one project:

• 800 daily batches from different departments in an enterprise
• One server with 120 GB memory and many CPUs
• This machine hosts lots of VMs in parallel

Issue: When there is too much workload, one machine is not enough

• Worker Nodes solves this by scaling out on a cluster

Typical Customer Examples – Scalable on Demand

• Need to support growing data volumes and customer requirements
• Worker Nodes provides a flexible and scalable architecture, on-premises or in the cloud, for growing demand
• This is seamless and does not require changes to the underlying architecture

[Diagram: at BASE TIMES, Distribute and Scale across Worker Node (1), (2) and (3); at PEAK TIMES, Distribute and Scale across Worker Node (1), (2), (3), (4) and (5)]
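The base-versus-peak picture can be read as a simple sizing rule: run a minimum number of nodes at all times and add nodes as the backlog grows, up to a ceiling. The sketch below illustrates that idea only. The function, the items-per-node capacity, and the thresholds are assumptions. The slides show three nodes at base times and five at peak times, but not the policy that decides when to scale.

```python
import math


def desired_node_count(pending_work_items, items_per_node, min_nodes, max_nodes):
    """Return how many nodes to run for the current backlog (hypothetical policy)."""
    if items_per_node <= 0:
        raise ValueError("items_per_node must be positive")
    needed = math.ceil(pending_work_items / items_per_node)
    # Clamp between the always-on base size and the peak ceiling.
    return max(min_nodes, min(max_nodes, needed))


# Base size 3 and peak size 5 mirror the diagram; 10 items per node is made up.
base = desired_node_count(12, 10, min_nodes=3, max_nodes=5)
busy = desired_node_count(35, 10, min_nodes=3, max_nodes=5)
peak = desired_node_count(500, 10, min_nodes=3, max_nodes=5)
print(base, busy, peak)
```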

Worker Nodes – New in 8.0

• Containerized scale-out
• Pentaho PDI “work items”

[Architecture diagram. Labels: WORKER NODES; Orchestration Framework; Orchestration (scheduler, monitoring, security, etc.); Controller; Master (Working); Master (Standby); Master (Standby); Container Framework; Pentaho Server; “Executor” containers WN 1 (e.g. KJB), WN 2 (e.g. KTR), WN … n; Pentaho Repository; Pentaho Clients]

Worker Nodes Capabilities

• Deploy consistently in physical, virtual, and cloud environments
  Adapts to customer needs (bare metal vs. virtualization vs. cloud), with no need to modify the product when the strategy changes.
• Scale and load balance services
  This helps to deal with peaks and limited time windows and to allocate the resources that are needed.
• Hybrid deployments can be used to distribute load
  Even when the on-premises resources are not sufficient, scaling out into the cloud is possible to provide more resources.

Monitoring and Logging

Monitoring – Overview

Monitoring – Worker Node Example

Improvements in Related Areas: Open and Save Dialogs

Pain Point: Save a New Job/Transformation

• Whenever you save a new transformation/job into the repository, the default folder is set to the user’s home folder.

In previous versions: The user needs to change the folder every time they save a new transformation or job.

New Save Dialog in 8.0 – Overview

• Remembers the last opened folder!
• Just enter the file name! (and/or change the folder)
• Similar to the Open Dialog with additional functionality (see next slide).

New Open Dialog in 8.0 – Overview

[Screenshot callouts: Recents; Search]

Open shows the last opened folder. This is a big time saver!

Improvements in Related Areas: Run Configurations

Pain Point: Remote Pentaho Server Execution before 8.0

To execute on the Pentaho Server before 8.0, you needed to define a Slave server and give the credentials, then execute on the selected server.

Execute on the Pentaho Server

• By selecting the Pentaho server option, you do not need to define a Slave server anymore when you want to execute remotely.
• Behind the scenes, this option executes the transformation or job via the Scheduler. This is the same as doing a “Schedule Now.”

This new functionality improves ease of use, including for Worker Nodes.

Run Configurations within Job Entries

• Run Configuration can be used in the Run dialog and also in the job entries that could execute jobs or transformations remotely and on Worker Nodes

[Screenshots: example in 7.1 and in 8.0]

Demonstration

Availability and Roadmap

Availability

• Worker Nodes is EE only
• Initially, 8.0 Worker Nodes will be Limited Availability
  – Fully supported, production deployment
  – Distribution to a limited number of customers
• Requires additional download and implementation services

Roadmap

• Pentaho Server & Repository as a Service including High Availability
• Improved Monitoring and Logging
• Extend to other Pentaho work items such as Reports
• Integrated with other Hitachi Vantara Services and Products

[Diagram labels: Container Framework; Pentaho Server; “Executor” containers WN 1 (e.g. KJB), WN 2 (e.g. KTR), WN … n; Pentaho Repository]

Summary

What we covered today:

• The upcoming capabilities for scaling out the Pentaho platform and when to use them
• How to use the new way of scaling out work items (Pentaho processes such as PDI jobs and transformations) across multiple nodes

Next Steps

Want to learn more?

• Meet-the-Expert:
  – Pedro Teixera
• Other recommended breakout sessions:
  – Matt Howard: Pentaho 8.0 and Roadmap
  – Rakesh Saha and Jens Bleuel: Roadmap: Processing Big Data
  – Matt Casters: PDI Best Architecture Practices
  – Steve Szabo: PDI Sizing Overview and Case Study
  – Jonathan Jarvis: Understanding Parallelism with PDI and Adaptive Execution with Spark
  – Mark Burnett: Understanding the Big Data Technology Ecosystem
