Causal Models, Learning Algorithms and their Application to Performance Modeling
Jan Lemeire, Parallel Systems lab, November 15th 2006
Causal Performance Models
Overview
I. Causal Models
II. Learning Algorithms
III. Performance Modeling
IV. Extensions
I. Multivariate Analysis
Variables
Probabilistic model of joint distribution?
Relational information?
A priori unknown relations
Experimental data
A. Representation of distributions
Factorization
Reduction of factorization complexity
Bayesian Network
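Ordering 1: A, B, C, D
- independencies: C ⊥ D | B, A ⊥ C | B
Ordering 2: A, D, B, C
- independencies: C ⊥ D | B, A ⊥ C | B, A ⊥ D
P(A, B, C, D) = P(A) · P(B|A) · P(C|A, B) · P(D|A, B, C)
P(C|A, B) = P(C|B) ⇔ A ⊥ C | B
[Figure: the Bayesian networks over A, B, C and D obtained with ordering 1 and with ordering 2]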
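As a rough illustration of how the choice of ordering changes the size of the factorization, the sketch below counts free parameters for binary variables. It assumes, purely for illustration, that the underlying graph is A → B ← D, B → C (which is compatible with the independencies A ⊥ D, A ⊥ C | B and C ⊥ D | B); the function name `free_parameters` and the parent sets are not taken from the slides.

```python
def free_parameters(parents, cardinality=2):
    """Count free parameters of the factorization prod_i P(X_i | parents(X_i)).

    Each block needs (cardinality - 1) numbers per joint state of its parents.
    """
    return sum((cardinality - 1) * cardinality ** len(ps) for ps in parents.values())

# Plain chain rule, no independence exploited (ordering A, B, C, D).
full = {"A": [], "B": ["A"], "C": ["A", "B"], "D": ["A", "B", "C"]}

# Assumed graph A -> B <- D, B -> C (illustrative, see lead-in).
# Ordering A, B, C, D can only drop A from C's block and C from D's block.
ordering_1 = {"A": [], "B": ["A"], "C": ["B"], "D": ["A", "B"]}
# Ordering A, D, B, C can additionally use A ⊥ D.
ordering_2 = {"A": [], "D": [], "B": ["A", "D"], "C": ["B"]}

print(free_parameters(full), free_parameters(ordering_1), free_parameters(ordering_2))
# prints: 15 9 8
```

Under these assumptions the second ordering gives the smallest factorization, because it is the only one in which the marginal independence of A and D can be used.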
B. Representation of Independencies
Conditional independence
Qualitative property: P(rain | quality of speech) = P(rain)?
P(A|B, C) = P(A|B) ⇔ A ⊥ C | B
Markov condition in graph: a variable becomes independent of all its non-descendants by conditioning on its direct parents.
– graphical d-separation criterion
[Figure: graph over A, B, C and D]
B d-separates A from C: A ⊥ C | B
A is d-separated from D: A ⊥ D
A is not d-separated from D, given B: A ⊥̸ D | B
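The d-separation criterion can be checked mechanically. The sketch below uses the standard moralized-ancestral-graph test rather than path enumeration; the graph A → B ← D, B → C is an assumption chosen to reproduce the three statements on this slide, and the names `ancestors`, `d_separated` and `dag` are illustrative.

```python
from itertools import combinations

def ancestors(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents.get(n, ()))
    return seen

def d_separated(parents, x, y, given=()):
    """True if x and y are d-separated by `given` in the DAG described by `parents`."""
    given = set(given)
    keep = ancestors(parents, {x, y} | given)
    # Moralize the ancestral subgraph: connect child-parent pairs and marry co-parents.
    adj = {n: set() for n in keep}
    for child in keep:
        ps = list(parents.get(child, ()))
        for p in ps:
            adj[child].add(p)
            adj[p].add(child)
        for p, q in combinations(ps, 2):
            adj[p].add(q)
            adj[q].add(p)
    # x and y are d-separated iff no undirected path avoids the conditioning set.
    seen, stack = set(), [x]
    while stack:
        n = stack.pop()
        if n == y:
            return False
        if n in seen or n in given:
            continue
        seen.add(n)
        stack.extend(adj[n])
    return True

dag = {"B": ["A", "D"], "C": ["B"]}          # assumed graph: A -> B <- D, B -> C
print(d_separated(dag, "A", "C", {"B"}))     # B d-separates A from C
print(d_separated(dag, "A", "D"))            # A is d-separated from D
print(d_separated(dag, "A", "D", {"B"}))     # not d-separated once B is given
```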
Faithfulness
Faithfulness: Joint Distribution ⇔ Directed Acyclic Graph
Conditional independencies ⇔ d-separation
Theorem:
If a faithful graph exists, it is the minimal factorization.
Independence-map:
All independencies in the Bayesian network appear in the distribution
C. Representation of Causal Mechanisms
Model of the underlying physical mechanisms
Definition through interventions
causal model + Conditional Probability Distributions + Causal Markov Condition = Bayesian network
[Figure: two graphs over A and B under the intervention do(A=a); in one the distribution of B becomes P(B|A=a), in the other it stays P(B)]
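The difference between the two outcomes of do(A=a) can be reproduced with exact arithmetic on a two-variable model. This is a minimal sketch with made-up probabilities; `joint`, `prob_b1` and `do` are hypothetical helper names. It relies on the usual reading of an intervention: the block of the manipulated variable is replaced by a constant and all other blocks stay as they are.

```python
from itertools import product

def joint(network, order):
    """Joint distribution of binary variables as the product of the blocks P(X | parents(X))."""
    table = {}
    for values in product([0, 1], repeat=len(order)):
        x = dict(zip(order, values))
        p = 1.0
        for var in order:
            parents, cpt = network[var]
            p_one = cpt[tuple(x[q] for q in parents)]  # P(var = 1 | parent values)
            p *= p_one if x[var] == 1 else 1.0 - p_one
        table[values] = p
    return table

def prob_b1(table, order, a=None):
    """P(B=1), or P(B=1 | A=a) when a is given."""
    ia, ib = order.index("A"), order.index("B")
    rows = [(v, p) for v, p in table.items() if a is None or v[ia] == a]
    return sum(p for v, p in rows if v[ib] == 1) / sum(p for _, p in rows)

def do(network, var, value):
    """Intervention: replace the block of `var` by a constant; other blocks are untouched."""
    changed = dict(network)
    changed[var] = ((), {(): float(value)})
    return changed

order = ["A", "B"]
# Illustrative numbers only.
a_causes_b = {"A": ((), {(): 0.3}), "B": (("A",), {(0,): 0.2, (1,): 0.9})}
b_causes_a = {"B": ((), {(): 0.4}), "A": (("B",), {(0,): 0.1, (1,): 0.8})}

for name, net in [("A -> B", a_causes_b), ("B -> A", b_causes_a)]:
    observed = prob_b1(joint(net, order), order, a=1)             # P(B=1 | A=1)
    intervened = prob_b1(joint(do(net, "A", 1), order), order)    # P(B=1 | do(A=1))
    print(name, round(observed, 3), round(intervened, 3))
```

When A causes B, intervening and observing agree; when B causes A, the intervention leaves P(B) unchanged even though observing A=1 is informative about B.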
Reductionism
Causal modeling = reductionism
Canonical representation: unique, minimal, independent
Building block = P(Xi | parents(Xi))
Whole theory is based on this modularity
Intervention = change of block
[Figure: causal graph over X1, ..., X5 before and after the intervention do(X3=a), which replaces the block of X3 by X3 = a]
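Written out, "intervention = change of block" corresponds to the standard truncated factorization. The block below is a sketch for five variables; the parent sets pa_i are left generic because the edges of the figure are not recoverable from the transcript.

```latex
% Before the intervention: product of all building blocks.
P(x_1,\dots,x_5) \;=\; \prod_{i=1}^{5} P\bigl(x_i \mid \mathrm{pa}_i\bigr)

% After do(X_3 = a): the block of X_3 is removed, every other block is unchanged.
P\bigl(x_1,\dots,x_5 \mid do(X_3 = a)\bigr) \;=\;
\begin{cases}
  \displaystyle\prod_{i \neq 3} P\bigl(x_i \mid \mathrm{pa}_i\bigr) & \text{if } x_3 = a,\\[6pt]
  0 & \text{otherwise.}
\end{cases}
```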
Ultimate motivation for causality
Model = canonical representation, able to explain all qualitative properties (independencies), close to reality
If the causal mechanisms are unrelated, the model is faithful
II. Learning Algorithms
Two types:
- Constraint-based: based on the independencies
- Scoring-based: searches the set of all models and gives each a score of how well it represents the distribution
Step 1: Adjacency search
Property: adjacent nodes do not become independent
Algorithm:
- start with the fully connected graph
- check for marginal independencies
- check for conditional independencies
[Figure: successive graphs over A, B, C and D; edges are removed on finding A ⊥ D, A ⊥ C | B and C ⊥ D | B]
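The adjacency search can be sketched in the style of the PC algorithm, with conditioning sets of growing size drawn from the current neighbours. The independence test is replaced here by an oracle that knows exactly the three independencies of the example; in practice a statistical test on data would take its place. The names `adjacency_search`, `oracle` and `known` are illustrative.

```python
from itertools import combinations

def adjacency_search(nodes, independent):
    """Remove an edge x - y as soon as some conditioning set makes x and y independent."""
    adj = {n: set(nodes) - {n} for n in nodes}   # start with the fully connected graph
    sepset = {}
    size = 0                                      # size 0 = marginal independencies
    while any(len(adj[x]) - 1 >= size for x in nodes):
        for x in nodes:
            for y in sorted(adj[x]):
                for cond in combinations(sorted(adj[x] - {y}), size):
                    if independent(x, y, set(cond)):
                        adj[x].discard(y)
                        adj[y].discard(x)
                        sepset[frozenset((x, y))] = set(cond)
                        break
        size += 1
    return adj, sepset

# Oracle standing in for statistical tests: A ⊥ D, A ⊥ C | B, C ⊥ D | B.
known = {
    (frozenset("AD"), frozenset()),
    (frozenset("AC"), frozenset("B")),
    (frozenset("CD"), frozenset("B")),
}

def oracle(x, y, cond):
    return (frozenset((x, y)), frozenset(cond)) in known

adj, sepset = adjacency_search(["A", "B", "C", "D"], oracle)
print({n: sorted(neigh) for n, neigh in adj.items()})
```

With this oracle only the edges A - B, B - C and B - D survive, and the separating sets are recorded for the orientation step.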
Step 2: Orientation
Property: V-structure can be recognized
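Algorithm:
- look for v-structures
- derived rules
[Figure: orientation of the graph over A, B, C and D; A ⊥ D together with A ⊥̸ D | B identifies the v-structure at B. Further panels show three-node graphs over A, B and C with the (in)dependencies between A and C, marginally and given B]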
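A minimal sketch of the orientation step, assuming an undirected skeleton and separating sets are already available (they are hard-coded here for the four-variable example). V-structures are found from unshielded triples whose middle node is absent from the separating set; only one derived rule is implemented (if a → b, b - c is undirected and a, c are non-adjacent, orient b → c). The function name `orient` and the input format are assumptions.

```python
from itertools import combinations

def orient(adj, sepset):
    """Return the set of directed edges (tail, head) that can be inferred."""
    directed = set()
    # V-structures: x - z - y with x, y non-adjacent and z not in sepset(x, y).
    for z in adj:
        for x, y in combinations(sorted(adj[z]), 2):
            if y not in adj[x] and z not in sepset[frozenset((x, y))]:
                directed |= {(x, z), (y, z)}
    # Derived rule: avoid creating new v-structures.
    changed = True
    while changed:
        changed = False
        for a, b in list(directed):
            for c in adj[b] - {a}:
                undirected = (b, c) not in directed and (c, b) not in directed
                if undirected and c not in adj[a]:
                    directed.add((b, c))
                    changed = True
    return directed

# Skeleton and separating sets of the example: A - B, B - C, B - D.
skeleton = {"A": {"B"}, "B": {"A", "C", "D"}, "C": {"B"}, "D": {"B"}}
sepsets = {
    frozenset("AD"): set(),      # A ⊥ D
    frozenset("AC"): {"B"},      # A ⊥ C | B
    frozenset("CD"): {"B"},      # C ⊥ D | B
}
print(sorted(orient(skeleton, sepsets)))
# prints: [('A', 'B'), ('B', 'C'), ('D', 'B')]
```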
Assumptions
General statistical assumptions:
- No selection bias
- Random sample
- Sufficient data for correctness of statistical tests
Underlying network is faithful
Causal sufficiency
- No unknown common causes
[Figure: graph over A, B and C]
Criticism
Definition of causality?
- About predicting the effect of changes to the system
Faithfulness assumption
- E.g.: accidental cancellation
Causal Markov Condition
- “All relations are causal”
Learning algorithms are not robust
- Statistical tests make mistakes
[Figure: graph over X, Y, U and V; graph over X and Y]
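Accidental cancellation can be illustrated with a small simulation. The linear model below (X influences Y through two paths, via U and via V, with opposite signs) is an assumed example and not necessarily the structure drawn on the slide; the coefficients, the sample size and the helper `correlation` are illustrative.

```python
import math
import random

def correlation(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

rng = random.Random(0)
n = 20000
X = [rng.gauss(0, 1) for _ in range(n)]
U = [x + rng.gauss(0, 1) for x in X]                   # X -> U, coefficient +1
V = [-x + rng.gauss(0, 1) for x in X]                  # X -> V, coefficient -1
Y = [u + v + rng.gauss(0, 1) for u, v in zip(U, V)]    # U -> Y and V -> Y

r_xu = correlation(X, U)
r_xy = correlation(X, Y)
print(round(r_xu, 2), round(r_xy, 2))
```

The two paths cancel exactly, so X and Y look independent although X is a cause of Y. A constraint-based algorithm would remove the X - Y relation, which is the kind of unfaithful distribution the criticism refers to.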
Part III: Performance Analysis
High-Performance computing
1 processor
parallel system
Performance Questions:
- Performance prediction
- System-dependency?
- Parameter-dependency?
- Reasons for bad performance?
- Effect of optimizations?
Causal modeling (cf. COMO lab, VUB)
- Representation form
- Close to reality
- Learning algorithms
- TETRAD tool (open-source, Java)
PhD??
Performance Models
Aim of performance analysis
- Support the software developer
- High-performance applications
Expected properties
- offer insight into the causes of performance degradation
- prediction
- estimate the effect of optimizations
- reusable submodels
- separate application- and system-dependency
- reason under uncertainty
=> causal models
Integrated in statistical analysis
Statistical characteristics
- Regression analysis
- Probability table compression
- Outlier detection
Iterative process
1. Perform additional experiments
2. Extract additional characteristics
3. Indicate exceptions
4. Analyze the divergences of the data points with the current hypotheses
[Figure: workflow diagram with the elements Application, Experiments, Profiling, Database, Model Construction, Causal Model, Curve Fitting, Analytical Model, CPT compression, User Inspection, Divergences and Exceptions, with labels 1 to 4]
A. Model construction
Model of the computation time of the LU decomposition algorithm
- elementsize (redundant variable) is sufficient for the influence datatype -> cache misses
- regression analysis on submodels X = f(parents)
- analysis of parameters
[Figure: causal model with the variables n, datatype, elementsize, #op, #instrop, Cop, L1Mop, L2Mop, fclock and Tcomp]
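Regression on a submodel X = f(parents) can be as simple as fitting one coefficient. The sketch below fits the operation count against n³, using the textbook leading term of roughly (2/3)·n³ floating-point operations for LU decomposition. The data are synthetic and generated from that formula, not measurements from the talk; `fit_coefficient` is a hypothetical helper.

```python
def fit_coefficient(xs, ys):
    """Least-squares estimate of c in the no-intercept model y = c * x."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Synthetic data: number of operations for several matrix sizes n.
sizes = [100, 200, 400, 800]
ops = [2 * n**3 / 3 for n in sizes]

# Submodel "#op = f(n)" with the assumed form f(n) = c * n^3.
c = fit_coefficient([n**3 for n in sizes], ops)
print(round(c, 4))   # prints: 0.6667
```

The fitted parameter can then be inspected on its own, which is what the analysis of parameters refers to.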
B. Detection of unexpected dependencies
Point-to-point communication performance
background communication
C. Finding explanations for outliers
Exceptional data in communication performance measurements
Probability table compression
X     P(X=1)   Y
X0    0        Y0
X1    0        Y0
X2    1        Y1
X3    1        Y1
=> derived variable
Interesting features
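Probability table compression can be sketched as grouping the states of X that share the same probability entry into one state of a derived variable Y. The function name `compress`, the tolerance and the labelling scheme Y0, Y1, ... are assumptions made for this example.

```python
def compress(table, tol=1e-9):
    """Map each state of X to a derived state; states with equal probability share a label."""
    groups = []      # list of (representative probability, derived label)
    mapping = {}
    for state, p in sorted(table.items()):
        for rep, label in groups:
            if abs(p - rep) <= tol:
                mapping[state] = label
                break
        else:
            label = f"Y{len(groups)}"
            groups.append((p, label))
            mapping[state] = label
    return mapping

# Probability entries per state of X, as in the four-row table of the slide.
table = {"X0": 0.0, "X1": 0.0, "X2": 1.0, "X3": 1.0}
print(compress(table))
# prints: {'X0': 'Y0', 'X1': 'Y0', 'X2': 'Y1', 'X3': 'Y1'}
```

With measured probabilities the tolerance would have to be chosen with the sampling noise in mind.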
IV. Complexity of Performance Data
Mixture of discrete and continuous variables: Mutual Information & Kernel Density Estimation
Non-linear relations: Mutual Information & Kernel Density Estimation
Deterministic relations: Augmented models & Complexity criterion
Context variables: Work in progress
Context-specific independencies: Work in progress
A. Information-theoretic Dependency
Entropy of random variable X
Discretized entropy for continuous variable
Mutual Information
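The formulas of this slide did not survive in the transcript. The sketch below uses the standard definitions, H(X) = -Σ p(x) log2 p(x) and I(X;Y) = H(X) + H(Y) - H(X,Y), and a simple equal-width binning for the discretized entropy. The bin width and the function names are assumptions.

```python
import math
from collections import defaultdict

def entropy(dist):
    """Shannon entropy in bits of a distribution given as {outcome: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def mutual_information(joint):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) in bits, for a joint table {(x, y): probability}."""
    px, py = defaultdict(float), defaultdict(float)
    for (x, y), p in joint.items():
        px[x] += p
        py[y] += p
    return entropy(px) + entropy(py) - entropy(joint)

def discretize(samples, width):
    """Empirical distribution of continuous samples over bins of the given width."""
    counts = defaultdict(int)
    for s in samples:
        counts[math.floor(s / width)] += 1
    return {cell: c / len(samples) for cell, c in counts.items()}

independent = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}
identical = {(0, 0): 0.5, (1, 1): 0.5}
print(mutual_information(independent), mutual_information(identical))
print(entropy(discretize([0.1, 0.2, 1.1, 1.9], width=1.0)))
```

Mutual information is zero for the independent pair and one bit for the pair in which Y copies X.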
B. Kernel Density Estimation
See applets
Trade-off maximal entropy <> typicalness
Conclusions
- Limited number of data points needed
- Discretization of continuous data justified
- Form-free dependency measure
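For readers without access to the applets, a one-dimensional kernel density estimate with a Gaussian kernel can be sketched as follows. The sample values and the bandwidth are arbitrary; the kernel actually used in the talk is not stated in the transcript, so the Gaussian choice is an assumption.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return the density estimate f(x) = 1/(n*h) * sum_i K((x - x_i) / h) with Gaussian K."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density

samples = [0.8, 1.0, 1.2, 3.0]            # illustrative data points
density = gaussian_kde(samples, bandwidth=0.3)

# Riemann sum over a wide interval as a sanity check that the estimate is a density.
step = 0.01
total = sum(density(-5.0 + i * step) for i in range(1500)) * step
print(round(total, 3), density(1.0) > density(2.0))
```

A larger bandwidth gives a smoother estimate, a smaller one follows the individual data points more closely.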
C. Deterministic relations
Y = f(X)
- Y becomes independent of Z conditioned on X
- ~ violation of the intersection condition (Pearl ’88)
- Not faithfully describable
Solution: augmented causal model
- add regularity to model
- adapt inference algorithms
[Figure: candidate graphs over X, Y and Z with their independence statements]
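The effect of a deterministic relation on the independencies can be verified exactly on a small discrete example. The model below is an assumption made for illustration: X is uniform on {0, 1, 2, 3}, Y = X mod 2, and Z depends on Y only. The helper `cmi` computes conditional mutual information from a joint table indexed as (x, y, z).

```python
import math
from collections import defaultdict

def cmi(joint, a, b, cond=()):
    """I(A;B|C) in bits; a and b are positions in the outcome tuple, cond a tuple of positions."""
    def marginal(idx):
        m = defaultdict(float)
        for values, p in joint.items():
            m[tuple(values[i] for i in idx)] += p
        return m

    c = tuple(cond)
    p_abc, p_c = marginal((a, b) + c), marginal(c)
    p_ac, p_bc = marginal((a,) + c), marginal((b,) + c)
    total = 0.0
    for (va, vb, *vc), p in p_abc.items():
        if p > 0:
            vc = tuple(vc)
            total += p * math.log2(p * p_c[vc] / (p_ac[(va,) + vc] * p_bc[(vb,) + vc]))
    return total

# Joint over (x, y, z): Y = f(X) = X mod 2, P(Z=1 | Y=0) = 0.1, P(Z=1 | Y=1) = 0.8.
p_z1 = {0: 0.1, 1: 0.8}
joint = {}
for x in range(4):
    y = x % 2
    joint[(x, y, 1)] = 0.25 * p_z1[y]
    joint[(x, y, 0)] = 0.25 * (1 - p_z1[y])

X, Y, Z = 0, 1, 2
print(round(cmi(joint, Y, Z, (X,)), 6))   # Y ⊥ Z | X holds because Y = f(X)
print(round(cmi(joint, X, Z, (Y,)), 6))   # X ⊥ Z | Y holds as well
print(round(cmi(joint, X, Z), 3))         # yet Z clearly depends on X (and on Y)
```

Both conditional independencies hold while Z is dependent on X and Y, which is the violation of the intersection condition mentioned on the slide.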
The Complexity Criterion
Select simplest relation
X & Y contain equivalent information about Z
Complexity(Y-Z) < Complexity(X-Z)
[Figure: two candidate graphs over X, Y and Z]
Augmented causal model
Consistent models under the Complexity Increase assumption
Compl(X-Z) ≥ Compl(X-Y)
Compl(X-Z) ≥ Compl(Y-Z)
Restrict conditional independencies
Generalize d-separation
Reestablish faithfulness
[Figure: graphs over X, Y, Z and S, with an equivalence (eq) annotation]
Theory works!
[Figure: panels A and B, with the labels Deterministic and Probabilistic]
Conclusions
Benefit of the integration of statistical techniques
Causal modeling is a challenge
– wants to know the inner from the outer
More information
– http://parallel.vub.ac.be
– http://parallel.vub.ac.be/~jan