model-based clustering for online crisis …woodard/woodard_cornell...modeling computation and...
TRANSCRIPT
Background and Overview
Modeling
Computation and Decision Making
Simulation Study
Application to the Exchange Hosted Services
Conclusions
Model-Based Clustering for Online Crisis Identification in Distributed Computing
Dawn Woodard
Operations Research and Information Engineering
Cornell University
with Moises Goldszmidt
Microsoft Research
Outline
1 Background and Overview
2 Modeling
3 Computation and Decision Making
   Offline Computation
   Online Computation
   Decision Making
4 Simulation Study
   Offline
   Online
5 Application to the Exchange Hosted Services
   Offline
   Online
6 Conclusions
Distributed Computing
Commercial distributed computing providers:
Offer remotely-hosted computing services
E.g. Microsoft’s Exchange Hosted Services (EHS)
24/7 email processing incl. spam filtering, encryption
Distributed Computing
This processing is performed by farming out to many servers
Often, tens of thousands of servers in multiple locations
[Diagram: Client, Provider, and Server 1, Server 2, Server 3, …]
Distributed Computing
Can have occasional severe violation of performance goals (“crises”)
E.g. due to:
servers becoming overloaded in periods of high demand
performance problems in lower-level computing centers on which the servers rely (e.g. for performing authentication)
If the problem lasts for more than a few minutes, must pay cash penalties to clients, have potential loss of contracts
Distributed Computing
% of servers violating a performance goal, for a 10-day period in EHS:
[Figure: stacked time-series panels labeled KPI 1, KPI 2, Metr 1, Metr 2, Metr 3; x-axis: Time, 0 to 1000]
Exceeding the dotted line constitutes a crisis.
Distributed Computing
Need to rapidly recognize the recurrence of a problem
If an effective intervention is known for this problem, can apply it
Due to large scale and interdependence, manual problem diagnosis is difficult and slow
Have a set of status measurements for each server. E.g., for EHS:
CPU utilization
Memory utilization
For each spam filter, the length of the queue and the throughput
. . .
Distributed Computing
Goal: Match a currently occurring (i.e., incompletely observed) crisis to previous crises of mixed known and unknown cause
I.e., are any previous crises of the same type as the new crisis? Which ones?
This is an online clustering problem with:
partial labeling
incomplete data for the new crisis
We use model-based clustering based on a Dirichlet process mixture (e.g. Escobar & West 1995)
The evolution of each process is modeled as a time series
Cost-Optimal Decision Making
Wish to perform optimal (expected-cost-minimizing) decision making during a crisis...
...while accounting for uncertainty in the crisis type assignments and the parameters of those types
This requires fully Bayesian inference
Fully Bayesian Inference
We apply fully Bayesian inference (via MCMC) in the long periods between crises
Due to posterior multimodality, we combine a collapsed-space split-merge method with parallel tempering
As a new crisis begins, update rapidly using an approximation
Related Work
Ours is the first instance of fully Bayesian online clustering
Online model-based clustering was performed by Zhang, Ghahramani, and Yang (2004) for documents
Obtain a single cluster assignment based on the posterior; insufficient for optimal decision making
Fully Bayesian clustering: Bensmail, Celeux, Raftery, and Robert (1997); Pritchard, Stephens, and Donnelly (2000); Lau and Green (2007)
Many examples of fully Bayesian mixture modeling
Data
Medians of 3 metrics across servers, for a 10-day period (EHS):
[Figure: medians of the 3 metrics plotted against Time]
Data
Crises are highlighted; color indicates their known type:
[Figure: time-series plot with crises highlighted and colored by known type; x-axis: Time]
Data
The medians of the metrics are very informative as to crisis type
specifically, whether the median is low, normal, or high
We fit our models to the median values of the metrics, discretized into 1: low, 2: normal, and 3: high
Crisis Modeling
Time series model for crisis evolution:
$Y_{ilj}$: value of metric $j$ in the $l$th time period after the start of crisis $i$
Assume that metrics are independent conditional on the crisis type
For crisis type $k$, $Y_{i1j}$ is drawn from a discrete dist’n with probability vector $\gamma^{(jk)}$
...and $Y_{ilj}$ evolves according to a Markov chain with transition matrix $T^{(jk)}_{\cdot\cdot}$
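The generative model on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `simulate_crisis`, the 0-based state coding (0, 1, 2 for low, normal, high), and the argument layout are hypothetical and not taken from the talk.

```python
import random

def simulate_crisis(gamma, T, n_periods, rng):
    """Simulate one crisis of a given type.

    gamma[j]    : initial-state probability vector for metric j
    T[j][s][t]  : probability that metric j moves from state s to state t
    Returns Y with Y[l][j] = state of metric j in period l (0-based).
    """
    n_metrics = len(gamma)
    states = range(len(gamma[0]))
    # First period: each metric is drawn independently from its gamma vector.
    Y = [[rng.choices(states, weights=gamma[j])[0] for j in range(n_metrics)]]
    for _ in range(1, n_periods):
        prev = Y[-1]
        # Later periods: each metric follows its own Markov chain,
        # independently of the other metrics given the crisis type.
        Y.append([rng.choices(states, weights=T[j][prev[j]])[0]
                  for j in range(n_metrics)])
    return Y
```

Calling `simulate_crisis(gamma, T, 10, random.Random(1))` with two metrics would return a list of 10 rows of 2 states each.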
Crisis Modeling
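⇒ Complete-data likelihood fn:
$$\pi\left(D \mid \{Z_i\}_{i=1}^{I}, \{\gamma^{(jk)}, T^{(jk)}_{\cdot\cdot}\}_{j,k}\right) = \prod_{i,j,t}\left[\left(\gamma^{(jZ_i)}_{t}\right)^{\mathbf{1}(Y_{i1j}=t)} \prod_{s}\left(T^{(jZ_i)}_{st}\right)^{n_{ijst}}\right]. \tag{1}$$
conditioning on the unknown type indicators $Z_i$ of each crisis $i = 1, \ldots, I$.
$n_{ijst}$: the number of transitions of the $j$th metric from state $s$ to state $t$ during crisis $i$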
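As a rough illustration of how one crisis's factor in (1) could be evaluated, the sketch below counts transitions and sums log-probabilities. The function names and the 0-based state coding are assumptions made for this example; it also assumes every observed state and transition has nonzero probability (otherwise `math.log` raises an error).

```python
import math

def transition_counts(y, n_states):
    """n[s][t] = number of s -> t transitions in the state sequence y."""
    n = [[0] * n_states for _ in range(n_states)]
    for s, t in zip(y, y[1:]):
        n[s][t] += 1
    return n

def crisis_log_lik(Y, gamma, T):
    """Log of one crisis's factor in (1), given its type's parameters.

    Y[l][j] is the state of metric j in period l; gamma[j] and T[j] are the
    initial-state vector and transition matrix of metric j for that type.
    """
    total = 0.0
    for j in range(len(gamma)):
        y = [row[j] for row in Y]
        total += math.log(gamma[j][y[0]])          # initial-state term
        n = transition_counts(y, len(gamma[j]))
        for s, row in enumerate(n):
            for t, c in enumerate(row):
                if c:                               # skip unobserved transitions
                    total += c * math.log(T[j][s][t])
    return total
```

Summing `crisis_log_lik` over crises, each with the parameters of its assigned type $Z_i$, gives the log of (1).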
Cluster Modeling
Dirichlet process mixture (DPM) prior:
Natural for online clustering
Allows number of clusters to increase with the number of crises
Crises are exchangeable
Parameterized by
$\alpha$: controls the expected number of clusters occurring in a fixed number of crises
$G_0$: the prior $G_0(d\{\gamma^{(jk)}, T^{(jk)}_{\cdot\cdot}\}_j)$ for the parameters associated with each cluster $k$
Cluster Modeling
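The DPM prior for the cluster indicators $\{Z_i\}_{i=1}^{I}$ and the cluster parameters $\gamma^{(jk)}, T^{(jk)}_{\cdot\cdot}$:
$$\pi(\{Z_i\}_{i=1}^{I}) = \prod_{i=1}^{I} \pi(Z_i \mid \{Z_{i'}\}_{i'<i}) = \prod_{i=1}^{I}\left[\frac{\alpha}{\alpha+i-1}\,\mathbf{1}(Z_i = m_{i-1}+1) + \frac{1}{\alpha+i-1}\sum_{i'<i}\mathbf{1}(Z_i = Z_{i'})\right] \tag{2}$$
where $m_i = \max\{Z_{i'} : i' \le i\}$ for $i > 0$ and $m_0 = 0$, and
$$\pi\left(d\{\gamma^{(jk)}, T^{(jk)}_{\cdot\cdot}\}_{j,k} \mid \{Z_i\}_{i=1}^{I}\right) = \prod_{k=1}^{m_I} G_0\left(d\{\gamma^{(jk)}, T^{(jk)}_{\cdot\cdot}\}_j\right). \tag{3}$$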
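To make the sequential structure of (2) concrete, here is a small sketch that evaluates its logarithm for a given label vector. The function name `log_crp_prior` is hypothetical; the code assumes labels are numbered in order of first appearance (1, then 2, and so on), which is what the indicator $\mathbf{1}(Z_i = m_{i-1}+1)$ requires.

```python
import math

def log_crp_prior(z, alpha):
    """Log of the prior (2) for the label vector z = [Z_1, ..., Z_I].

    Returns -inf if a label is neither the next unused integer nor a label
    already seen, since (2) assigns such vectors probability zero.
    """
    logp = 0.0
    m = 0          # largest label used so far (m_{i-1} in the slides)
    counts = {}    # number of earlier crises carrying each label
    for idx, k in enumerate(z):      # idx = i - 1, so the denominator is alpha + idx
        if k == m + 1:
            numerator = alpha        # crisis i opens a new type
            m += 1
        elif k in counts:
            numerator = counts[k]    # crisis i joins an existing type
        else:
            return float("-inf")
        counts[k] = counts.get(k, 0) + 1
        logp += math.log(numerator) - math.log(alpha + idx)
    return logp
```

For example, with `alpha = 1.0` the vector `[1, 1, 2]` has prior probability $1 \cdot \tfrac{1}{2} \cdot \tfrac{1}{3} = \tfrac{1}{6}$.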
Cluster Modeling
Also called the “Chinese Restaurant Process”:
$$\pi(Z_i = k \mid \{Z_{i'}\}_{i'<i}) \propto \begin{cases} \alpha & : k \text{ is a new type} \\ \sum_{i'<i} \mathbf{1}(Z_{i'} = k) & : \text{else} \end{cases}$$
Each observation $i$ is a new guest who either sits at an occupied table with prob. proportional to the number of guests at that table, or sits at an empty table.
Cluster Modeling
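Now we can evaluate the posterior density (up to a normalizing constant):
$$\pi\left(\{Z_i\}_{i=1}^{I}, \{\gamma^{(jk)}, T^{(jk)}_{\cdot\cdot}\}_{j,k} \mid D\right) \propto \pi\left(\{Z_i\}_{i=1}^{I}\right)\, \pi\left(\{\gamma^{(jk)}, T^{(jk)}_{\cdot\cdot}\}_{j,k} \mid \{Z_i\}_{i=1}^{I}\right)\, \pi\left(D \mid \{Z_i\}_{i=1}^{I}, \{\gamma^{(jk)}, T^{(jk)}_{\cdot\cdot}\}_{j,k}\right)$$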
Cluster Modeling
Partially labeled case:
We have given the prior for the case where none of the crisis types $Z_i$ are known
If we know that $Z_i = Z_{i'}$ for some crises $i \sim i'$, multiply (2) by $\prod_{i \sim i'} \mathbf{1}(Z_i = Z_{i'})$
Cluster Modeling
G0:
Independent Dirichlet priors for $\gamma^{(jk)}$ for each $j$
Independent product Dirichlet priors for $T^{(jk)}_{\cdot\cdot}$ for each $j$
Offline Computation
The cluster parameters $\{\gamma^{(jk)}, T^{(jk)}_{\cdot\cdot}\}_{j,k}$ can be integrated analytically out of the posterior
Run a Markov chain with target dist’n $\pi(\{Z_i\}_{i=1}^{I} \mid D)$
Jain and Neal (2004) use a Gibbs sampler, with an additional split-merge move on clusters
We add parallel tempering (Geyer 1991)
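The analytic integration mentioned on this slide can be illustrated as follows. With Dirichlet priors on $\gamma^{(jk)}$ and on each row of $T^{(jk)}_{\cdot\cdot}$, integrating the likelihood (1) over the parameters of one cluster gives a product of Dirichlet-multinomial terms, which is a standard conjugacy result. The sketch below evaluates that term for one metric of one cluster; the function names and the hyperparameter arguments `a_gamma` and `a_T` are hypothetical, and the sampler itself (Gibbs, split-merge, tempering) is not shown.

```python
import math

def log_dirichlet_multinomial(counts, prior):
    """Log of the integral of prod_t p_t**counts[t] against a Dirichlet(prior) density."""
    a0, n0 = sum(prior), sum(counts)
    out = math.lgamma(a0) - math.lgamma(a0 + n0)
    for c, a in zip(counts, prior):
        out += math.lgamma(a + c) - math.lgamma(a)
    return out

def cluster_log_marginal(init_counts, trans_counts, a_gamma, a_T):
    """Collapsed log-likelihood of one cluster for one metric.

    init_counts[t]     : number of crises in the cluster that start in state t
    trans_counts[s][t] : s -> t transition counts pooled over the cluster
    a_gamma, a_T[s]    : assumed Dirichlet hyperparameters for gamma and row s of T
    """
    out = log_dirichlet_multinomial(init_counts, a_gamma)
    for s, row in enumerate(trans_counts):
        out += log_dirichlet_multinomial(row, a_T[s])
    return out
```

Summing this quantity over metrics and clusters, and adding the log of the prior (2), would give the log of the collapsed target $\pi(\{Z_i\}_{i=1}^{I} \mid D)$ up to a constant.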
Online Inference
Wish to identify a crisis in real time
Have data $D$ from previous crises and data $D_{\text{new}}$ so far for the new crisis
E.g., wish to estimate $\pi(Z_{\text{new}} = Z_i \mid D, D_{\text{new}})$ for each previous crisis $i = 1, \ldots, I$
...and $\pi(Z_{\text{new}} \neq Z_i\ \forall i \mid D, D_{\text{new}})$
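Exact Online Inference
Method 1:
Just apply the Markov chain method to the data from the $I + 1$ crises
Gives posterior sample vectors $\left(\{Z_i^{(l)}\}_{i=1}^{I}, Z_{\text{new}}^{(l)}\right)$ for $l = 1, \ldots, L$
Monte Carlo estimates of the desired probabilities:
$$\hat{\pi}(Z_{\text{new}} = Z_i \mid D, D_{\text{new}}) = \frac{1}{L}\sum_{l=1}^{L} \mathbf{1}\left(Z_{\text{new}}^{(l)} = Z_i^{(l)}\right)$$
$$\hat{\pi}(Z_{\text{new}} \neq Z_i\ \forall i \mid D, D_{\text{new}}) = \frac{1}{L}\sum_{l=1}^{L} \mathbf{1}\left(Z_{\text{new}}^{(l)} \neq Z_i^{(l)}\ \forall i\right)$$
But running the Markov chain is too slow for real-time decision making!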
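The two Monte Carlo estimates of Method 1 are simple averages of indicators over the posterior draws. A minimal sketch, assuming the draws are stored as `(z_past, z_new)` pairs (a storage layout chosen here for illustration only):

```python
def exact_online_estimates(samples):
    """Monte Carlo estimates from joint posterior draws.

    samples: list of (z_past, z_new) pairs, where z_past is the list of the
             I past-crisis labels in one draw and z_new the new crisis's label.
    Returns (p_same, p_new):
      p_same[i] estimates P(Z_new = Z_i | D, D_new)
      p_new     estimates P(Z_new != Z_i for all i | D, D_new)
    """
    L = len(samples)
    I = len(samples[0][0])
    p_same = [sum(z_new == z_past[i] for z_past, z_new in samples) / L
              for i in range(I)]
    p_new = sum(all(z_new != k for k in z_past) for z_past, z_new in samples) / L
    return p_same, p_new
```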
Approximate Online Inference
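We give a method using the approximation:
$$\begin{aligned} \pi(Z_{\text{new}} = Z_i \mid D, D_{\text{new}}) &= \sum_{\{Z_i\}_{i=1}^{I}} \pi\left(Z_{\text{new}} = Z_i \mid \{Z_i\}_{i=1}^{I}, D, D_{\text{new}}\right)\, \pi\left(\{Z_i\}_{i=1}^{I} \mid D, D_{\text{new}}\right) \\ &\approx \sum_{\{Z_i\}_{i=1}^{I}} \pi\left(Z_{\text{new}} = Z_i \mid \{Z_i\}_{i=1}^{I}, D, D_{\text{new}}\right)\, \pi\left(\{Z_i\}_{i=1}^{I} \mid D\right) \end{aligned}$$
* Assumes that $D_{\text{new}}$ does not tell us much about the past crisis types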
Approximate Online Inference
Method 2: Approximate Online Inference
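1 After the end of each crisis, rerun the Markov chain, yielding sample vectors $\{Z_i^{(l)}\}_{i=1}^{I}$ from the posterior $\pi(\{Z_i\}_{i=1}^{I} \mid D)$.
2 When a new crisis begins, use its data $D_{\text{new}}$ to calculate the Monte Carlo estimates:
$$\hat{\pi}(Z_{\text{new}} = Z_i \mid D, D_{\text{new}}) = \frac{1}{L}\sum_{l=1}^{L} \pi\left(Z_{\text{new}} = Z_i^{(l)} \mid \{Z_{i'}^{(l)}\}_{i'=1}^{I}, D, D_{\text{new}}\right)$$
$$\hat{\pi}(Z_{\text{new}} \neq Z_i\ \forall i \mid D, D_{\text{new}}) = \frac{1}{L}\sum_{l=1}^{L} \pi\left(Z_{\text{new}} \neq Z_i^{(l)}\ \forall i \mid \{Z_{i'}^{(l)}\}_{i'=1}^{I}, D, D_{\text{new}}\right).$$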
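Step 2 can be sketched as below. The slides do not spell out the conditional probability inside the sums, so this example assumes one natural form: for each stored draw, the new crisis joins an existing cluster with weight (cluster size) × (predictive density of $D_{\text{new}}$ given that cluster's crises), or forms a new type with weight $\alpha$ × (marginal density of $D_{\text{new}}$ under $G_0$). The callables `log_pred_existing` and the value `log_pred_new` are hypothetical inputs that the caller would have to supply.

```python
import math

def approx_online_estimates(samples, alpha, log_pred_existing, log_pred_new):
    """Average per-draw conditional probabilities over stored posterior draws.

    samples                    : list of L label vectors of length I, drawn from pi({Z_i} | D)
    log_pred_existing(members) : log density of D_new if it shares its type with
                                 the past crises whose indices are in `members`
    log_pred_new               : log density of D_new under a brand-new type
    Returns (p_same, p_new) as in the two estimates above.
    """
    L, I = len(samples), len(samples[0])
    p_same, p_new = [0.0] * I, 0.0
    for z in samples:
        clusters = {}
        for i, k in enumerate(z):
            clusters.setdefault(k, []).append(i)
        # Unnormalized log weights: prior weight times predictive density.
        logw = {k: math.log(len(m)) + log_pred_existing(m) for k, m in clusters.items()}
        logw_new = math.log(alpha) + log_pred_new
        top = max(max(logw.values()), logw_new)   # subtract max for numerical stability
        total = sum(math.exp(v - top) for v in logw.values()) + math.exp(logw_new - top)
        for k, members in clusters.items():
            pk = math.exp(logw[k] - top) / total
            for i in members:
                p_same[i] += pk / L
        p_new += math.exp(logw_new - top) / total / L
    return p_same, p_new
```

No Markov chain is run at this stage; the work is one pass over the stored draws, which is why the step can be fast enough for use during a crisis.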
Approximate Online Inference
Part 2 is O(LIJ), very fast
Optimal Decision Making
Want expected-cost-minimizing decision making during a crisis
The total cost of the new crisis is a function $C\left[\phi, \{Z_i^*\}_{i=1}^{I}, Z_{\text{new}}^*\right]$ of:
The intervention $\phi$
The true type $Z_{\text{new}}^*$ of the current crisis
The vector of past crisis types $\{Z_i^*\}_{i=1}^{I}$, which give the context for $Z_{\text{new}}^*$
Finding the expected cost of the crisis for intervention $\phi$ requires integrating $C$ over the posterior distribution of $\left(\{Z_i\}_{i=1}^{I}, Z_{\text{new}}\right)$
Can be done exactly using Method 1, or approximately using Method 2
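With posterior draws in hand, the integral of $C$ can be approximated by an average and the intervention chosen by minimizing it. The sketch below assumes draws stored as `(z_past, z_new)` pairs; the cost function `toy_cost` and the intervention names are invented purely for illustration and do not come from the talk.

```python
def expected_cost(cost, phi, samples):
    """Monte Carlo estimate of the posterior expected cost of intervention phi.

    samples: list of (z_past, z_new) draws from the (exact or approximate) posterior.
    """
    return sum(cost(phi, z_past, z_new) for z_past, z_new in samples) / len(samples)

def best_intervention(cost, interventions, samples):
    """Return the intervention with the smallest estimated expected cost."""
    return min(interventions, key=lambda phi: expected_cost(cost, phi, samples))

def toy_cost(phi, z_past, z_new):
    """Hypothetical cost: 'fix_k' is cheap only if the new crisis has the
    same type as past crisis k (0-based index into z_past)."""
    k = int(phi.split("_")[1])
    return 1.0 if z_past[k] == z_new else 10.0
```

For instance, `best_intervention(toy_cost, ["fix_0", "fix_1"], samples)` picks whichever past crisis's remedy has the lower average cost across the draws.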
Simulation Study
Offline:
Simulate I crises from model
Compare MBC with distance-based clustering
Simulation Study
Offline Accuracy Criteria:
1 Pairwise Sensitivity: For pairs of crises of the same type, % assigned to the same cluster
for MBC, having prob. > 0.5 of being in the same cluster.
2 Pairwise Specificity: For pairs of crises not of the same type, % assigned to different clusters
for MBC, having prob. ≤ 0.5 of being in the same cluster.
3 Error of No. Crisis Types: The % error of the estimated number of crisis types
for MBC, post. mean is used to estimate No. of types.
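The first two criteria can be computed as in the sketch below. The function names are hypothetical; `same_cluster` is any rule deciding whether a pair is clustered together, and for MBC it would be the thresholded co-clustering probability estimated from posterior draws, as described above. The code assumes the data contain at least one same-type pair and one different-type pair.

```python
from itertools import combinations

def coclustering_prob(samples, i, i2):
    """Fraction of posterior draws in which crises i and i2 share a label."""
    return sum(z[i] == z[i2] for z in samples) / len(samples)

def pairwise_accuracy(true_types, same_cluster):
    """Return (pairwise sensitivity %, pairwise specificity %).

    true_types         : list of true crisis types
    same_cluster(i, i2): True if the method puts crises i and i2 together
    """
    tp = fn = tn = fp = 0
    for i, i2 in combinations(range(len(true_types)), 2):
        together = bool(same_cluster(i, i2))
        if true_types[i] == true_types[i2]:
            tp += together
            fn += not together
        else:
            tn += not together
            fp += together
    return 100.0 * tp / (tp + fn), 100.0 * tn / (tn + fp)
```

For MBC one would pass `lambda i, i2: coclustering_prob(samples, i, i2) > 0.5`; for a hard clustering with labels `z`, `lambda i, i2: z[i] == z[i2]`.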
Simulation Study

| No. Crises | No. Metrics | Method | Pairwise Sensitivity | Pairwise Specificity | % Error No. Types |
|---|---|---|---|---|---|
| 15 | 10 | MBC | 94.6 (2.08) | 99.0 (0.50) | 9.3 (1.87) |
| | | K-Means 1 | 47.8 (4.26) | 95.3 (0.57) | – |
| | | K-Means 2 | 74.8 (5.39) | 77.9 (1.73) | – |
| 15 | 15 | MBC | 99.0 (1.00) | 99.4 (0.41) | 3.7 (0.95) |
| | | K-Means 1 | 69.6 (4.76) | 97.0 (0.54) | – |
| | | K-Means 2 | 88.3 (4.01) | 78.2 (2.13) | – |
| 25 | 10 | MBC | 91.9 (1.88) | 98.8 (0.40) | 7.4 (1.58) |
| | | K-Means 1 | 57.7 (3.19) | 95.5 (0.54) | – |
| | | K-Means 2 | 76.0 (4.01) | 82.9 (1.16) | – |
| 25 | 15 | MBC | 99.6 (0.23) | 99.9 (0.05) | 3.5 (1.13) |
| | | K-Means 1 | 56.5 (3.76) | 95.8 (0.57) | – |
| | | K-Means 2 | 82.4 (4.76) | 83.0 (1.83) | – |
| 35 | 10 | MBC | 97.6 (0.65) | 99.8 (0.08) | 6.4 (1.81) |
| | | K-Means 1 | 56.5 (3.43) | 95.9 (0.48) | – |
| | | K-Means 2 | 74.0 (3.93) | 83.9 (1.15) | – |
| 35 | 15 | MBC | 99.5 (0.24) | 99.9 (0.03) | 3.4 (0.67) |
| | | K-Means 1 | 59.3 (4.07) | 97.8 (0.27) | – |
| | | K-Means 2 | 81.1 (4.74) | 86.7 (1.48) | – |
Simulation Study
MBC does far better than K-means
More metrics ⇒ better accuracy of MBC
More crises ⇏ better accuracy of MBC
Simulation Study
Online:
Compare Method 1 (“MBC-EX”) to Method 2 (“MBC”)
![Page 45: Model-Based Clustering for Online Crisis …woodard/Woodard_Cornell...Modeling Computation and Decision Making Simulation Study Application to the Exchange Hosted Services Conclusions](https://reader033.vdocument.in/reader033/viewer/2022050102/5f4107b07b843f359f2df5d7/html5/thumbnails/45.jpg)
Simulation Study
Online Accuracy Criteria:
1 Full-data misclassification rate: % of crises with incorrect predicted type, using all of the data for the new crisis.
2 p-period misclassification rate: % of crises with incorrect predicted type, using the first p time periods of data for the new crisis.
3 Average time to correct identification: Avg. No. of time periods required to obtain the correct identification
(“correct” predicted type: $\hat{\pi}(Z_{new} \neq Z_i \ \forall i \mid D, D_{new}) > 0.5$ if $Z^*_{new} \neq Z^*_i \ \forall i$, and otherwise $\hat{\pi}(Z_{new} = Z_i \mid D, D_{new}) > 0.5$ for some $i \le I$ such that $Z^*_{new} = Z^*_i$)
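The "correct identification" rule and the time-to-identification criterion can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names, the dictionary layout for the estimated posterior probabilities (key `None` standing for "new type"), and the reading of "time to correct identification" as the first period at which the rule is satisfied are all hypothetical choices.

```python
from typing import Dict, List, Optional

# Hypothetical layout: probs[i] is the estimated posterior probability that the
# new crisis has the same type as past crisis i; probs[None] is the estimated
# probability that the new crisis is of a previously unseen type.
Posterior = Dict[Optional[int], float]


def is_correct(probs: Posterior, true_new: int, true_past: List[int]) -> bool:
    """Apply the 'correct predicted type' rule with the 0.5 threshold."""
    matches = [i for i, z in enumerate(true_past) if z == true_new]
    if not matches:
        # The crisis is truly of a new type: need P(new type) > 0.5.
        return probs.get(None, 0.0) > 0.5
    # Otherwise need P(Z_new = Z_i) > 0.5 for some past crisis i of the same true type.
    return any(probs.get(i, 0.0) > 0.5 for i in matches)


def time_to_correct_id(per_period: List[Posterior], true_new: int,
                       true_past: List[int]) -> Optional[int]:
    """First period (1-based) at which the identification is correct, else None."""
    for period, probs in enumerate(per_period, start=1):
        if is_correct(probs, true_new, true_past):
            return period
    return None


# Toy example (made-up numbers): past crises 1 and 2 share the new crisis's type.
true_past = [0, 1, 1]
per_period = [
    {None: 0.6, 0: 0.1, 1: 0.3, 2: 0.3},  # after period 1: wrongly favors "new type"
    {None: 0.1, 0: 0.1, 1: 0.8, 2: 0.8},  # after period 2: matches past crises 1 and 2
]
first_correct = time_to_correct_id(per_period, true_new=1, true_past=true_past)
```

Under the same assumptions, the p-period misclassification rate is the share of crises for which `is_correct` fails on the posterior computed from the first p periods.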
![Page 46: Model-Based Clustering for Online Crisis …woodard/Woodard_Cornell...Modeling Computation and Decision Making Simulation Study Application to the Exchange Hosted Services Conclusions](https://reader033.vdocument.in/reader033/viewer/2022050102/5f4107b07b843f359f2df5d7/html5/thumbnails/46.jpg)
Simulation Study
Online Accuracy:
| No. Crises | No. Metrics | Method | Full-data Misclassification | 3-period Misclassification | Avg. Time to Identification |
|---|---|---|---|---|---|
| 15 | 10 | MBC | 6.7 (3.0) | 10.7 (4.5) | 1.31 (0.11) |
| 15 | 10 | MBC-EX | 8 (2.5) | 10.7 (4.5) | |
| 15 | 15 | MBC | 6.7 (5.2) | 9.3 (6.2) | 1.13 (0.08) |
| 15 | 15 | MBC-EX | 5.3 (3.9) | 8.0 (4.9) | |
| 25 | 10 | MBC | 13.6 (2.7) | 15.2 (2.7) | 1.33 (0.13) |
| 25 | 10 | MBC-EX | 9.6 (2.0) | 15.2 (3.4) | |
| 25 | 15 | MBC | 2.4 (1.6) | 4.0 (1.8) | 1.15 (0.06) |
| 25 | 15 | MBC-EX | 3.2 (1.5) | 3.2 (1.5) | |
![Page 47: Model-Based Clustering for Online Crisis …woodard/Woodard_Cornell...Modeling Computation and Decision Making Simulation Study Application to the Exchange Hosted Services Conclusions](https://reader033.vdocument.in/reader033/viewer/2022050102/5f4107b07b843f359f2df5d7/html5/thumbnails/47.jpg)
Simulation Study
Classification accuracy high (> 80%) for both MBC & MBC-EX
MBC not significantly worse than MBC-EX
3-period misclassification is not much greater than full-data misclassification
Very early identification!
![Page 48: Model-Based Clustering for Online Crisis …woodard/Woodard_Cornell...Modeling Computation and Decision Making Simulation Study Application to the Exchange Hosted Services Conclusions](https://reader033.vdocument.in/reader033/viewer/2022050102/5f4107b07b843f359f2df5d7/html5/thumbnails/48.jpg)
![Page 49: Model-Based Clustering for Online Crisis …woodard/Woodard_Cornell...Modeling Computation and Decision Making Simulation Study Application to the Exchange Hosted Services Conclusions](https://reader033.vdocument.in/reader033/viewer/2022050102/5f4107b07b843f359f2df5d7/html5/thumbnails/49.jpg)
Application to EHS
27 crises in EHS during Jan-Apr 2008.
The causes of some of these were diagnosed later:
| ID | Cause | No. of known crises |
|---|---|---|
| A | overloaded front-end | 2 |
| B | overloaded back-end | 8 |
| C | database configuration error | 1 |
| D | configuration error | 1 |
| E | performance issue | 1 |
| F | middle-tier issue | 1 |
| G | whole DC turned off and on | 1 |
| H | workload spike | 1 |
| I | request routing error | 1 |
![Page 50: Model-Based Clustering for Online Crisis …woodard/Woodard_Cornell...Modeling Computation and Decision Making Simulation Study Application to the Exchange Hosted Services Conclusions](https://reader033.vdocument.in/reader033/viewer/2022050102/5f4107b07b843f359f2df5d7/html5/thumbnails/50.jpg)
![Page 51: Model-Based Clustering for Online Crisis …woodard/Woodard_Cornell...Modeling Computation and Decision Making Simulation Study Application to the Exchange Hosted Services Conclusions](https://reader033.vdocument.in/reader033/viewer/2022050102/5f4107b07b843f359f2df5d7/html5/thumbnails/51.jpg)
Offline Application to EHS
Apply the Markov chain method to the set of 27 crises without the labels
Compare to those labels
![Page 52: Model-Based Clustering for Online Crisis …woodard/Woodard_Cornell...Modeling Computation and Decision Making Simulation Study Application to the Exchange Hosted Services Conclusions](https://reader033.vdocument.in/reader033/viewer/2022050102/5f4107b07b843f359f2df5d7/html5/thumbnails/52.jpg)
Offline Application to EHS
Trace plots of parallel tempering Markov chain samples of $Z_{22}$:
[Figure: three trace plots, one per inverse temperature (beta = 1, beta = 0.4, beta = 0.2), over iterations 0 to 10000]
Geweke diag. p-value: 0.44
Gelman-Rubin scale factor: 1.01
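For readers unfamiliar with the Gelman-Rubin scale factor, a minimal sketch of the basic potential scale reduction computation follows. The talk does not say which variant of the diagnostic was used or how many chains were run, so the function below and its toy inputs are assumptions for illustration only. Values near 1 suggest that the chains agree with one another.

```python
from statistics import mean, variance
from typing import List


def gelman_rubin(chains: List[List[float]]) -> float:
    """Basic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])                             # samples per chain
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])   # W: average within-chain variance
    between = n * variance(chain_means)            # B: scaled variance of chain means
    var_hat = (n - 1) / n * within + between / n   # pooled variance estimate
    return (var_hat / within) ** 0.5


# Toy inputs (made up): two chains stuck in different regions give a large factor,
# while two identical chains give a factor slightly below 1.
r_disagree = gelman_rubin([[0, 1, 0, 1], [10, 11, 10, 11]])
r_agree = gelman_rubin([[1, 2, 3, 4], [1, 2, 3, 4]])
```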
![Page 53: Model-Based Clustering for Online Crisis …woodard/Woodard_Cornell...Modeling Computation and Decision Making Simulation Study Application to the Exchange Hosted Services Conclusions](https://reader033.vdocument.in/reader033/viewer/2022050102/5f4107b07b843f359f2df5d7/html5/thumbnails/53.jpg)
Offline Application to EHS
Post. mode cluster assignment has 58% prob.
Sizes of clusters:

| ID | Cause | No. of known crises | No. identified by MBC | No. MBC crises matching known |
|---|---|---|---|---|
| A | overloaded front-end | 2 | 3 | 2 |
| B | overloaded back-end | 8 | 14 | 8 |
| C | database configuration error | 1 | 2 | 1 |
| D | configuration error | 1 | 0 | 0 (labeled as A) |
| E | performance issue | 1 | 0 | 0 (labeled as B) |
| F | middle-tier issue | 1 | 0 | 0 (labeled as I) |
| G | whole DC turned off and on | 1 | 0 | 0 (labeled as B) |
| H | workload spike | 1 | 1 | 1 |
| I | request routing error | 1 | 6 | 1 |
![Page 54: Model-Based Clustering for Online Crisis …woodard/Woodard_Cornell...Modeling Computation and Decision Making Simulation Study Application to the Exchange Hosted Services Conclusions](https://reader033.vdocument.in/reader033/viewer/2022050102/5f4107b07b843f359f2df5d7/html5/thumbnails/54.jpg)
Offline Application to EHS
Post. mode crisis labels mostly match known clusters
The largest 5 clusters are correctly labelled
Four uncommon crisis types are clustered with more common types
Crises having different causes can have the same patterns in their metrics
Need to add metrics that distinguish these types effectively
![Page 55: Model-Based Clustering for Online Crisis …woodard/Woodard_Cornell...Modeling Computation and Decision Making Simulation Study Application to the Exchange Hosted Services Conclusions](https://reader033.vdocument.in/reader033/viewer/2022050102/5f4107b07b843f359f2df5d7/html5/thumbnails/55.jpg)
![Page 56: Model-Based Clustering for Online Crisis …woodard/Woodard_Cornell...Modeling Computation and Decision Making Simulation Study Application to the Exchange Hosted Services Conclusions](https://reader033.vdocument.in/reader033/viewer/2022050102/5f4107b07b843f359f2df5d7/html5/thumbnails/56.jpg)
Online Application to EHS
Evaluate online accuracy, treating the posterior mode from the offline context as the gold standard.
Original ordering:
1 Full-data misclassification: 7.4%
2 3-period misclassification: 14.8%
3 Avg. time to correct iden.: 1.81
Permuting the crises:
1 Full-data misclassification: 5.9% (SE = 3.4%)
2 3-period misclassification: 11.8% (SE = 3.2%)
3 Avg. time to correct iden.: 1.56 (SE = 0.07)
![Page 57: Model-Based Clustering for Online Crisis …woodard/Woodard_Cornell...Modeling Computation and Decision Making Simulation Study Application to the Exchange Hosted Services Conclusions](https://reader033.vdocument.in/reader033/viewer/2022050102/5f4107b07b843f359f2df5d7/html5/thumbnails/57.jpg)
![Page 58: Model-Based Clustering for Online Crisis …woodard/Woodard_Cornell...Modeling Computation and Decision Making Simulation Study Application to the Exchange Hosted Services Conclusions](https://reader033.vdocument.in/reader033/viewer/2022050102/5f4107b07b843f359f2df5d7/html5/thumbnails/58.jpg)
Conclusions
Gave a method for fully Bayesian real-time crisis identification in distributed computing
Described how to use this to perform rapid expected-cost-minimizing crisis intervention
Very accurate on both simulated data and data from a production computing center
A copy of this paper and seminar are available at: http://people.orie.cornell.edu/woodard
![Page 59: Model-Based Clustering for Online Crisis …woodard/Woodard_Cornell...Modeling Computation and Decision Making Simulation Study Application to the Exchange Hosted Services Conclusions](https://reader033.vdocument.in/reader033/viewer/2022050102/5f4107b07b843f359f2df5d7/html5/thumbnails/59.jpg)
References
Escobar, M. D. and West, M. (1995). Bayesian density estimation and inference using mixtures. Journal of the American Statistical Association, 90, 577-588.
Geyer, C. J. (1991). Markov chain Monte Carlo maximum likelihood. In Computing Science and Statistics, Vol. 23: Proc. of the 23rd Symp. on the Interface, ed. E. Keramidas, pp. 156-163.
Jain, S. and Neal, R. M. (2004). A split-merge Markov chain Monte Carlo procedure for the Dirichlet process mixture model. Journal of Computational and Graphical Statistics, 13, 158-182.
Lau, J. W. and Green, P. J. (2007). Bayesian model-based clustering procedures. Journal of Computational and Graphical Statistics, 16, 526-558.
Zhang, J., Ghahramani, Z., and Yang, Y. (2004). A probabilistic model for online document clustering with application to novelty detection. In Advances in Neural Information Processing Systems, ed. Y. Weiss.
![Page 60: Model-Based Clustering for Online Crisis …woodard/Woodard_Cornell...Modeling Computation and Decision Making Simulation Study Application to the Exchange Hosted Services Conclusions](https://reader033.vdocument.in/reader033/viewer/2022050102/5f4107b07b843f359f2df5d7/html5/thumbnails/60.jpg)
Prior Constants
Prior hyperparameters chosen by combining information in data with expert opinion
Reflect the fact that the server status measurements are chosen to be indicative of crisis type
Results far better than a “default” prior specification, which contradicts data and experts
![Page 61: Model-Based Clustering for Online Crisis …woodard/Woodard_Cornell...Modeling Computation and Decision Making Simulation Study Application to the Exchange Hosted Services Conclusions](https://reader033.vdocument.in/reader033/viewer/2022050102/5f4107b07b843f359f2df5d7/html5/thumbnails/61.jpg)
Prior Constants
α:
Prob. that 2 randomly chosen crises are of same type: 1/(α + 1)
EHS experts estimate as 0.1, giving α = 9
⇒ ∼13 types in 27 crises
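The arithmetic behind α = 9 and roughly 13 types can be checked with a short sketch. It assumes the crisis types follow a Dirichlet process (Chinese restaurant process) prior with concentration α, which is what the 1/(α + 1) co-clustering probability suggests; under that assumption the expected number of distinct types among n crises is the sum of α/(α + i) for i = 0, ..., n − 1. The function names are illustrative and do not come from the talk.

```python
def alpha_from_same_type_prob(p: float) -> float:
    """Invert P(two crises share a type) = 1 / (alpha + 1)."""
    return 1.0 / p - 1.0


def expected_num_types(alpha: float, n: int) -> float:
    """Expected number of distinct types among n crises under a
    Chinese restaurant process with concentration alpha (assumed prior)."""
    return sum(alpha / (alpha + i) for i in range(n))


alpha = alpha_from_same_type_prob(0.1)   # expert estimate of 0.1 gives alpha = 9
n_types = expected_num_types(alpha, 27)  # about 12.9, i.e. roughly 13 types
```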
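$\gamma^{(jk)} \sim \mathrm{Dir}(a^{(j)})$. To choose $a^{(j)}$:
Prior mean of $\gamma^{(jk)}$ taken as empirical dist’n of $Y_{i1j}$ over i and j
Substantial prob. that one of the $\gamma^{(jk)}$ is “close” to 1:
$\pi\left( (\gamma^{(jk)}_1 > .85) \ \text{OR} \ (\gamma^{(jk)}_2 > .95) \ \text{OR} \ (\gamma^{(jk)}_3 > .85) \right) = 0.5$
Analogous for $T^{(jk)}_{\cdot\cdot}$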
![Page 62: Model-Based Clustering for Online Crisis …woodard/Woodard_Cornell...Modeling Computation and Decision Making Simulation Study Application to the Exchange Hosted Services Conclusions](https://reader033.vdocument.in/reader033/viewer/2022050102/5f4107b07b843f359f2df5d7/html5/thumbnails/62.jpg)
Optimal Decision Making
Want expected-cost-minimizing decision making during a crisis
The total cost of the new crisis is a function $C\left[\phi, \{Z^*_i\}_{i=1}^{I}, Z^*_{new}\right]$ of:
The intervention $\phi$
The true type $Z^*_{new}$ of the current crisis
The vector of past crisis types $\{Z^*_i\}_{i=1}^{I}$, which give the context for $Z^*_{new}$
![Page 63: Model-Based Clustering for Online Crisis …woodard/Woodard_Cornell...Modeling Computation and Decision Making Simulation Study Application to the Exchange Hosted Services Conclusions](https://reader033.vdocument.in/reader033/viewer/2022050102/5f4107b07b843f359f2df5d7/html5/thumbnails/63.jpg)
Optimal Decision Making
If we knew C,
given posterior sample vectors $\left(\{Z^{(l)}_i\}_{i=1}^{I}, Z^{(l)}_{new}\right)$ from the exact Method 1...
...the expected cost can be estimated as:
$E(C) \approx \frac{1}{L} \sum_{l=1}^{L} C\left[\phi, \left(\{Z^{(l)}_i\}_{i=1}^{I}, Z^{(l)}_{new}\right)\right].$
Have a similar expression for approximate inferences from Method 2
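The Monte Carlo estimate of the expected cost, and the choice of the cost-minimizing intervention, can be sketched as follows. Everything specific here is a hypothetical stand-in: the intervention names, the toy cost function, and the four posterior samples are invented for illustration, since the talk does not specify C or the available interventions.

```python
from typing import Callable, List, Sequence, Tuple

# One posterior sample: (types of the I past crises, type of the new crisis).
Sample = Tuple[Tuple[int, ...], int]
CostFn = Callable[[str, Sequence[int], int], float]


def expected_cost(phi: str, samples: List[Sample], cost: CostFn) -> float:
    """Average the cost of intervention phi over the L posterior samples."""
    return sum(cost(phi, past, new) for past, new in samples) / len(samples)


def best_intervention(interventions: List[str], samples: List[Sample],
                      cost: CostFn) -> str:
    """Pick the intervention with the smallest estimated expected cost."""
    return min(interventions, key=lambda phi: expected_cost(phi, samples, cost))


def toy_cost(phi: str, past: Sequence[int], new: int) -> float:
    # Hypothetical cost: "reboot" is cheap only if the new crisis has the same
    # type as past crisis 0; "add_capacity" has a fixed moderate cost.
    if phi == "reboot":
        return 1.0 if new == past[0] else 5.0
    return 3.0


# Four made-up posterior samples; three of them match the new crisis to past crisis 0.
samples = [((0, 1), 0), ((0, 1), 0), ((0, 0), 0), ((0, 1), 1)]
reboot_cost = expected_cost("reboot", samples, toy_cost)  # (1 + 1 + 1 + 5) / 4
choice = best_intervention(["reboot", "add_capacity"], samples, toy_cost)
```

Because the average runs over joint samples of past and new crisis types, the estimate uses the full posterior and never commits to a single cluster assignment.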
![Page 64: Model-Based Clustering for Online Crisis …woodard/Woodard_Cornell...Modeling Computation and Decision Making Simulation Study Application to the Exchange Hosted Services Conclusions](https://reader033.vdocument.in/reader033/viewer/2022050102/5f4107b07b843f359f2df5d7/html5/thumbnails/64.jpg)
Optimal Decision Making
Don’t know C in practice
For interventions φ taken during previous crises can estimate C from realized costs
Otherwise can estimate C from expert knowledge
![Page 65: Model-Based Clustering for Online Crisis …woodard/Woodard_Cornell...Modeling Computation and Decision Making Simulation Study Application to the Exchange Hosted Services Conclusions](https://reader033.vdocument.in/reader033/viewer/2022050102/5f4107b07b843f359f2df5d7/html5/thumbnails/65.jpg)
Optimal Decision Making
Since the goal is optimal intervention
...and since this requires the entire posterior distribution over $\left(\{Z_i\}_{i=1}^{I}, Z_{new}\right)$...
we will avoid choosing a “best” cluster assignment
instead focusing on the accuracy of the “soft identification”, i.e. the posterior distribution over $\left(\{Z_i\}_{i=1}^{I}, Z_{new}\right)$
![Page 66: Model-Based Clustering for Online Crisis …woodard/Woodard_Cornell...Modeling Computation and Decision Making Simulation Study Application to the Exchange Hosted Services Conclusions](https://reader033.vdocument.in/reader033/viewer/2022050102/5f4107b07b843f359f2df5d7/html5/thumbnails/66.jpg)
Simulation Study
K-means:
Criteria for choosing the number of clusters do not work well in our context
So we apply K-means using the true number of clusters (“K-means 1”)
and half the true number of clusters (“K-means 2”)
This is unrealistically optimistic...
...but K-means still does terribly