Deploying data streaming applications in the Fog
Valeria Cardellini
University of Rome Tor Vergata
2nd Workshop "Through the Fog" - Pisa, Italy
Data stream processing (DSP)
• A variety of low-latency and location-aware applications in diverse domains:
  – Situation-aware applications (e.g., intelligent urban transport, surveillance, and traffic congestion)
  – Social data mining
• Require
  – Continuous real-time processing of unbounded data streams generated by multiple, distributed sources
  – To extract valuable information in a timely and reliable manner
V. Cardellini - Through the Fog 2017
In a new distributed environment
• To increase scalability and availability, reduce latency, network traffic, and power consumption
  – Edge/fog computing ("the cloud close to the ground"): many micro data centers located at the network edge
Exploit distributed and near-edge computation
… that poses old and new challenges
• Network latencies are significant
• Computing and networking resources are heterogeneous (e.g., business constraints, capacity limits, …)
• Computing and network resources are not always available
• Data cannot be processed everywhere
• …
Goal of the talk
• Give a flavor of some challenges, and their possible solutions, that arise when deploying data stream processing applications in a fog/edge environment
DSP application basics
• A network of operators connected by data streams, with at least one data source and one data sink
• Represented by a directed graph
  – Graph vertices: operators
  – Graph edges: data streams
  – Usually a directed acyclic graph (DAG)
• Operator:
  – Processing element that transforms one or more input streams into another stream
  – Can be stateless or stateful
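The graph view above can be made concrete with a small sketch. It is only an illustration: the class `DspApplication`, its methods, and the operator names are hypothetical and do not come from the talk. It stores operators as vertices and data streams as directed edges, finds sources and sinks, and checks the DAG property with Kahn's algorithm.

```python
from collections import defaultdict, deque


class DspApplication:
    """Directed graph: vertices are operators, edges are data streams."""

    def __init__(self):
        self.operators = set()
        self.streams = defaultdict(set)  # operator -> downstream operators

    def add_stream(self, src, dst):
        self.operators.update((src, dst))
        self.streams[src].add(dst)

    def sources(self):
        """Operators with no incoming stream."""
        targets = {d for dsts in self.streams.values() for d in dsts}
        return sorted(self.operators - targets)

    def sinks(self):
        """Operators with no outgoing stream."""
        return sorted(op for op in self.operators if not self.streams.get(op))

    def topological_order(self):
        """Kahn's algorithm; raises ValueError if the graph has a cycle."""
        indegree = {op: 0 for op in self.operators}
        for dsts in self.streams.values():
            for d in dsts:
                indegree[d] += 1
        ready = deque(sorted(op for op, n in indegree.items() if n == 0))
        order = []
        while ready:
            op = ready.popleft()
            order.append(op)
            for d in sorted(self.streams.get(op, ())):
                indegree[d] -= 1
                if indegree[d] == 0:
                    ready.append(d)
        if len(order) != len(self.operators):
            raise ValueError("the application graph is not a DAG")
        return order


# Hypothetical application: one source, two parallel branches, one sink.
app = DspApplication()
for src, dst in [("source", "parse"), ("parse", "filter"),
                 ("parse", "count"), ("filter", "sink"), ("count", "sink")]:
    app.add_stream(src, dst)
```

A topological order like this is also a natural order in which a placement policy can visit the operators.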
Challenge 1: Operator placement
• How to assign the DSP operators to computing nodes which are distributed in a Fog environment
[Figure: example operator placement. Operators 1-6 and data streams (1,2), (2,3), (2,4), (3,5), (4,5), (4,6) of a DSP application are mapped onto computing nodes.]
The beginning: Distributed Storm
• Current DSP systems (e.g., Storm, Flink, Heron) are designed to run in single data centers
• Our initial goal: to extend Storm for a large-scale distributed and heterogeneous environment
V. Cardellini, V. Grassi, F. Lo Presti, M. Nardelli, Distributed QoS-aware scheduling in Storm. DEBS '15.
Network latency estimation
• How to provide an efficient estimation of the network delay between pairs of nodes?
• Use a network coordinates system
  – To predict latencies without performing direct measurements
    • E.g., Vivaldi network coordinates: decentralized and gossip-based scheme
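As an illustration of how a network coordinates system predicts latency, here is a simplified sketch of a Vivaldi-style update. Each node keeps a coordinate and an error estimate and nudges its coordinate after every round-trip-time sample, so that the Euclidean distance between two coordinates approximates the delay between the nodes. The class name, the 2-D space, and the constants `CE` and `CC` are assumptions made for this example; this is not the code used in the Storm extension.

```python
import math
import random

CE = 0.25  # error-smoothing constant (assumed value)
CC = 0.25  # coordinate-adjustment constant (assumed value)


class VivaldiNode:
    def __init__(self, dims=2):
        self.coord = [0.0] * dims
        self.error = 1.0  # local error estimate, starts pessimistic

    def distance(self, other_coord):
        """Predicted latency to a node located at other_coord."""
        return math.dist(self.coord, other_coord)

    def update(self, rtt, remote_coord, remote_error):
        """Adjust own coordinate after one RTT sample to a remote node."""
        dist = self.distance(remote_coord)
        # Trust the sample more when our own error is high relative to the remote one.
        w = self.error / (self.error + remote_error)
        sample_error = abs(dist - rtt) / rtt
        self.error = sample_error * CE * w + self.error * (1 - CE * w)
        # Unit vector from the remote node towards us (random if co-located).
        if dist > 0:
            direction = [(a - b) / dist for a, b in zip(self.coord, remote_coord)]
        else:
            direction = [random.uniform(-1, 1) for _ in self.coord]
            norm = math.hypot(*direction) or 1.0
            direction = [d / norm for d in direction]
        # Move away if the prediction was too short, closer if it was too long.
        step = CC * w * (rtt - dist)
        self.coord = [c + step * d for c, d in zip(self.coord, direction)]


# One update: predicted distance 5.0, measured RTT 10.0, so the node moves away.
node = VivaldiNode()
node.update(rtt=10.0, remote_coord=[3.0, 4.0], remote_error=1.0)
predicted = node.distance([3.0, 4.0])
```

In a gossip-based deployment each node would run `update` against a few peers per round, so no node needs to measure its delay to every other node.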
Operator placement policies
• Operator placement: NP-hard problem
• Several placement policies in the literature (mainly heuristics) address this problem, but
  – Different assumptions (system model, application topology, QoS attributes and metrics, …)
  – Different objectives
  – Not easily comparable
ODP: Optimal DSP Placement
• We propose ODP
  – Centralized policy for optimal placement of DSP applications
  – Formulated as an Integer Linear Programming (ILP) problem
• Our goals:
  – To compute the optimal placement (of course!)
  – To provide a unified general formulation of the placement problem for DSP applications (but not only!)
  – To consider multiple QoS attributes of applications and resources
  – To provide a benchmark for heuristics
V. Cardellini, V. Grassi, F. Lo Presti, M. Nardelli, Optimal Operator Placement for Distributed Stream Processing Applications, DEBS '16.
ODP: model
DSP application
Operators
• $C_i$: required computing resources
• $R_i$: execution time per data unit
Data streams
• $\lambda_{i,j}$: data rate from operator i to j
ODP: model
Computing and network resources
(Logical) network links
• $d_{u,v}$: network delay from u to v
• $B_{u,v}$: bandwidth from u to v
• $A_{u,v}$: link availability
Computing resources
• $C_u$: amount of resources
• $S_u$: processing speed
• $A_u$: resource availability
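ODP: model
Decision variables: where to map operators and data streams
[Figure: operators i and j of the application graph mapped onto the resource graph with nodes u, z, v, w. $x_{i,u} = 1$ places operator i on node u, $x_{j,v} = 1$ places operator j on node v, and $y_{(i,j),(u,v)} = 1$ maps data stream (i,j) onto link (u,v).]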
ODP: some QoS metrics
• Latency
  – Max end-to-end delay R between sources and destinations
• Availability
  – Prob. that all operators/links are up and running
• Latency and bandwidth
  – Inter-node traffic
  – Network usage
    • In-flight bytes: $\sum_{l \in \text{links}} \text{rate}(l)\,\text{Lat}(l)$
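The metrics above can be evaluated for a given placement with a few lines of code. The sketch below is a toy illustration, not the authors' implementation: every name and number in the instance is made up, processing time is assumed to be the execution time divided by the node speed, and availability is assumed to be a product of independent node and link availabilities, counted once per operator and once per stream.

```python
from math import prod

# Hypothetical toy instance (all values invented for illustration).
exec_time = {"src": 1.0, "op": 4.0, "sink": 1.0}       # R_i at unit speed
rate = {("src", "op"): 100.0, ("op", "sink"): 10.0}    # lambda_{i,j}
speed = {"edge": 1.0, "cloud": 4.0}                    # S_u
node_avail = {"edge": 0.95, "cloud": 0.999}            # A_u
delay = {("edge", "edge"): 0.0, ("cloud", "cloud"): 0.0,
         ("edge", "cloud"): 20.0, ("cloud", "edge"): 20.0}       # d_{u,v}
link_avail = {("edge", "edge"): 1.0, ("cloud", "cloud"): 1.0,
              ("edge", "cloud"): 0.99, ("cloud", "edge"): 0.99}  # A_{u,v}
paths = [["src", "op", "sink"]]                        # source-to-sink paths


def latency(placement):
    """Max end-to-end delay over all source-to-sink paths."""
    def path_delay(path):
        processing = sum(exec_time[i] / speed[placement[i]] for i in path)
        network = sum(delay[placement[i], placement[j]]
                      for i, j in zip(path, path[1:]))
        return processing + network
    return max(path_delay(p) for p in paths)


def availability(placement):
    """Probability that every operator's node and every stream's link is up."""
    nodes_up = prod(node_avail[placement[i]] for i in placement)
    links_up = prod(link_avail[placement[i], placement[j]] for i, j in rate)
    return nodes_up * links_up


def network_usage(placement):
    """In-flight data: sum over streams of data rate times link delay."""
    return sum(r * delay[placement[i], placement[j]]
               for (i, j), r in rate.items())


all_edge = {"src": "edge", "op": "edge", "sink": "edge"}
offload = {"src": "edge", "op": "cloud", "sink": "cloud"}
```

Comparing `all_edge` with `offload` shows the trade-off: offloading buys faster processing and more reliable nodes at the price of network delay and in-flight data.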
ODP: ILP formulation
Tunable knobs to set the optimal placement goals
[Equation image not recovered. Its annotated parts: latency; availability; network bandwidth and node capacity constraints; assignment and integer constraints.]
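For orientation, the parts listed on this slide could be written along the following lines. This is a sketch under stated assumptions, not the formulation from the DEBS '16 paper: the weights $w_R$ and $w_A$ (the "tunable knobs"), the set of source-to-sink paths $\pi$, the unnormalized objective, and the use of $R_i / S_u$ as processing time are all assumptions. The symbols $x$, $y$, $C$, $R_i$, $S$, $\lambda$, $d$, $B$, and $A$ follow the model slides.

```latex
\begin{aligned}
\min_{x,\,y,\,R}\quad
  & w_R\, R \;-\; w_A \log A \\
\text{s.t.}\quad
  & R \;\ge\; \sum_{i \in \pi} \sum_{u} \frac{R_i}{S_u}\, x_{i,u}
      \;+\; \sum_{(i,j) \in \pi} \sum_{(u,v)} d_{u,v}\, y_{(i,j),(u,v)}
      && \forall \pi \quad \text{(latency)} \\
  & \log A \;=\; \sum_{i} \sum_{u} \log(A_u)\, x_{i,u}
      \;+\; \sum_{(i,j)} \sum_{(u,v)} \log(A_{u,v})\, y_{(i,j),(u,v)}
      && \text{(availability)} \\
  & \sum_{(i,j)} \lambda_{i,j}\, y_{(i,j),(u,v)} \;\le\; B_{u,v}
      && \forall (u,v) \quad \text{(bandwidth)} \\
  & \sum_{i} C_i\, x_{i,u} \;\le\; C_u
      && \forall u \quad \text{(node capacity)} \\
  & \sum_{u} x_{i,u} \;=\; 1
      && \forall i \quad \text{(assignment)} \\
  & x_{i,u} = \sum_{v} y_{(i,j),(u,v)}, \qquad
    x_{j,v} = \sum_{u} y_{(i,j),(u,v)}
      && \forall (i,j),\, u,\, v \\
  & x_{i,u} \in \{0,1\}, \qquad y_{(i,j),(u,v)} \in \{0,1\}
      && \text{(integrality)}
\end{aligned}
```

Taking the logarithm of the availability turns a product of probabilities into a sum, which is what keeps the availability term linear in the decision variables.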
ODP and Apache Storm
• We can use ODP
  – to determine the optimal placement
  – as a benchmark to evaluate existing heuristics
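The benchmark idea can be illustrated on a toy instance: compute the optimum and compare a heuristic against it. In the sketch below an exhaustive search stands in for the ILP solver, and `greedy_placement` is a hypothetical baseline invented for this example (it is not one of the heuristics evaluated in the talk). The only metric is network usage, and all names and numbers are assumptions.

```python
from itertools import product

# Hypothetical toy instance (all values invented for illustration).
operators = ["src", "f", "g", "sink"]
streams = {("src", "f"): 10.0, ("f", "g"): 10.0, ("g", "sink"): 100.0}  # data rates
nodes = ["edge", "fog", "cloud"]
capacity = {"edge": 1, "fog": 2, "cloud": 2}   # max operators per node
pinned = {"src": "edge", "sink": "cloud"}      # fixed source and sink locations
link_delay = {("edge", "fog"): 5.0, ("fog", "cloud"): 30.0, ("edge", "cloud"): 40.0}


def d(u, v):
    """Symmetric network delay; zero within a node."""
    if u == v:
        return 0.0
    return link_delay.get((u, v), link_delay.get((v, u)))


def usage(p):
    """Network usage: sum over streams of data rate times link delay."""
    return sum(r * d(p[i], p[j]) for (i, j), r in streams.items())


def feasible(p):
    pins_ok = all(p[op] == n for op, n in pinned.items())
    caps_ok = all(sum(1 for op in p if p[op] == n) <= capacity[n] for n in nodes)
    return pins_ok and caps_ok


def optimal_placement():
    """Exhaustive search over all assignments (only viable for tiny instances)."""
    best = None
    for choice in product(nodes, repeat=len(operators)):
        p = dict(zip(operators, choice))
        if feasible(p) and (best is None or usage(p) < usage(best)):
            best = p
    return best


def greedy_placement():
    """Baseline: put each operator where its incoming streams cost the least."""
    p = dict(pinned)
    for op in operators:
        if op in p:
            continue

        def added(n):
            return sum(r * d(p[i], n)
                       for (i, j), r in streams.items() if j == op and i in p)

        free = [n for n in nodes if sum(1 for o in p if p[o] == n) < capacity[n]]
        p[op] = min(free, key=added)
    return p


gap = usage(greedy_placement()) / usage(optimal_placement())
```

The greedy baseline ignores downstream traffic, so it keeps `g` far from the sink; the ratio `gap` quantifies how far the heuristic is from the optimum on this instance.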
ODP: Benchmark for placement heuristics
Pietzuch et al.: distributed placement heuristic that minimizes network usage
P. Pietzuch et al., Network-aware operator placement for stream-processing systems, ICDE '06.
Challenge 2: placement and replication
• Exploit application-level parallelism by replicating operators
[Figure: operators A and B; operator A is replicated into three instances placed between a Split and a Merge operator.]
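The split/replicate/merge pattern in the figure can be sketched as follows. This is a minimal illustration for a stateless operator, with round-robin splitting and sequence numbers to restore order at the merge; the function names and the placeholder logic of operators A and B are assumptions. A stateful operator would typically need key-based partitioning instead, so that all tuples with the same key reach the same replica.

```python
def split(stream, n_replicas):
    """Round-robin split of a stream into n_replicas tagged sub-streams."""
    parts = [[] for _ in range(n_replicas)]
    for seq, item in enumerate(stream):
        parts[seq % n_replicas].append((seq, item))
    return parts


def replica_a(tagged):
    """One replica of operator A; here A squares its input (placeholder logic)."""
    return [(seq, x * x) for seq, x in tagged]


def merge(parts):
    """Merge replica outputs into one stream, restoring the input order."""
    return [x for _, x in sorted(t for part in parts for t in part)]


def operator_b(stream):
    """Downstream operator B; here it sums the stream (placeholder logic)."""
    return sum(stream)


# Three replicas of A between Split and Merge, followed by B.
merged = merge([replica_a(part) for part in split(range(1, 7), 3)])
total = operator_b(merged)
```

In a real deployment each replica would run on its own computing node, which is why the number of replicas and their placement are best decided together.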
Operator placement and replication
ODRP: Optimal DSP Replication and Placement
• We propose ODRP
  – Centralized policy for optimal replication and placement of DSP applications
  – Formulated as an Integer Linear Programming (ILP) problem
• Our goals:
  – To jointly determine the optimal number of replicas and their placement
  – To consider multiple QoS attributes of applications and resources
  – To provide a unified general formulation
  – To provide a benchmark for heuristics
V. Cardellini, V. Grassi, F. Lo Presti, M. Nardelli, Optimal operator replication and placement for distributed stream processing systems. ACM Perf. Eval. Rev., 2017.
ODRP performance
DSP application: DEBS 2015 Grand Challenge
[Figure: application topology with data source, parser, filterByCoordinates, metronome, computeRouteID, countByWindow, partialRank, and globalRank, together with RabbitMQ and Redis; legend: source, operator, sink.]
[Plot: response time (s, logarithmic axis from 0.001 to 1000) versus source data rate (20 to 120 tuples/s) for S-ODP_R and S-ODRP_R.]
Challenge 3: runtime deployment
• Many factors may change at runtime, e.g.,
  – Load variations, QoS attributes of resources, cost of resources (e.g., due to dynamic pricing schemes), network characteristics, node mobility, …
• How to adapt the placement and replication when changes occur?
Exploit self-adaptive deployment
Self-adaptive deployment
• MAPE (Monitor, Analyze, Plan and Execute)
• Plan phase: how to reconfigure the application deployment
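A MAPE loop can be sketched as four small functions chained in one iteration. The example below is deliberately simplistic and entirely hypothetical: the `Deployment` class, the per-replica capacity, and the threshold rule in `plan` are assumptions chosen to show the structure of the loop, not the policy presented in the talk (where the Plan phase solves an optimization problem).

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    replicas: int


def monitor(load_samples):
    """Monitor: summarize recent measurements (here, mean tuples/s)."""
    return sum(load_samples) / len(load_samples)


def analyze(load, deployment, capacity_per_replica=50.0):
    """Analyze: utilization of the current deployment."""
    return load / (deployment.replicas * capacity_per_replica)


def plan(utilization, deployment, high=0.8, low=0.3):
    """Plan: choose the new number of replicas with a simple threshold rule."""
    if utilization > high:
        return deployment.replicas + 1
    if utilization < low and deployment.replicas > 1:
        return deployment.replicas - 1
    return deployment.replicas


def execute(deployment, target_replicas):
    """Execute: apply the planned configuration."""
    deployment.replicas = target_replicas
    return deployment


def mape_iteration(load_samples, deployment):
    load = monitor(load_samples)
    utilization = analyze(load, deployment)
    return execute(deployment, plan(utilization, deployment))


deployment = Deployment(replicas=1)
mape_iteration([60.0, 40.0, 50.0], deployment)  # utilization 1.0: scale out
```

Swapping the body of `plan` for an optimization-based policy leaves the other three phases unchanged, which is the main appeal of the MAPE structure.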
Reconfiguration challenges
• Reconfiguring the deployment has a non-negligible cost!
• Can negatively affect application performance in the short term
  – Application freezing times caused by operator migration and scaling, especially for stateful operators
Perform reconfiguration only when needed
Take into account the overhead for migrating and scaling the operators
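One way to read these two guidelines is as a cost-benefit test: reconfigure only when the expected gain outweighs the freeze time it causes. The sketch below is a rough illustration under simple assumptions; the function names, the linear cost model, and the idea of estimating downtime as state transfer time plus restart time are not taken from the talk.

```python
def migration_downtime(state_bytes, bandwidth_bytes_per_s, restart_s=0.0):
    """Rough freeze time for moving an operator: state transfer plus restart."""
    return state_bytes / bandwidth_bytes_per_s + restart_s


def should_reconfigure(current_cost, new_cost, downtime_s, horizon_s, downtime_penalty):
    """Reconfigure only if the gain over the horizon outweighs the overhead."""
    gain = (current_cost - new_cost) * horizon_s
    overhead = downtime_s * downtime_penalty
    return gain > overhead


# Same cost improvement, different operator state sizes (hypothetical numbers).
stateless = migration_downtime(0, 10_000_000, restart_s=2.0)
stateful = migration_downtime(500_000_000, 10_000_000, restart_s=2.0)
move_stateless = should_reconfigure(10.0, 8.0, stateless, horizon_s=60.0, downtime_penalty=5.0)
move_stateful = should_reconfigure(10.0, 8.0, stateful, horizon_s=60.0, downtime_penalty=5.0)
```

With these numbers the stateless operator is worth moving while the stateful one is not, which mirrors the remark that stateful operators make reconfiguration more expensive.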
Elastic stateful migration in Storm
• We develop mechanisms for elastic stateful migration in Apache Storm
[Figure: architecture with Nimbus, ZooKeeper, a scheduler, MigrationNotifier, and ElasticityManager components; four Supervisors hosting worker slots and worker processes, each with a DDS; all connected through the network.]
V. Cardellini, M. Nardelli, D. Luzi, Elastic stateful stream processing in Storm, HPCS '16.
EDRP: Elastic DSP Replication and Placement
• Unified framework for the QoS-aware initial deployment and runtime elasticity management of DSP applications
• We model reconfiguration costs
  – Related to migrating or scaling in/out the operators
• Centralized policy formulated as an Integer Linear Programming (ILP) problem
V. Cardellini, F. Lo Presti, M. Nardelli, G. Russo Russo, Optimal operator deployment and replication for elastic distributed data stream processing, under review, 2017.
EDRP performance
With reconfiguration penalties: availability 95.5%, downtime < 90 s
Without reconfiguration penalties: availability 81.5%, downtime 370 s
Future work
• Study efficient heuristics to deal with large problem instances
• Deal with uncertainty: take uncertainty of parameters into account and design robust placement algorithms
• Study how to deploy multiple competing applications in the Fog
• Integrate placement decision with SDN
  – With SDN, network into the control loop
• Study cross-layer strategies that involve multiple Big data frameworks in the Fog
  – E.g., Heron + Apache Aurora + Mesos
Acknowledgments: Co-authors
Vincenzo Grassi, Francesco Lo Presti, Matteo Nardelli
Thank you! Any questions?
http://www.ce.uniroma2.it/~valeria