Graphical Causal Models
Clark Glymour
Carnegie Mellon University
Florida Institute for Human and Machine Cognition


Page 1: Graphical Causal Models. Clark Glymour, Carnegie Mellon University and Florida Institute for Human and Machine Cognition.


Page 2:

Outline

Part I: Goals and the Miracle of d-separation
Part II: Statistical/Machine Learning Search and Discovery Methods for Causal Relations
Part III: A Bevy of Causal Analysis Problems

Page 3:

I. Brains, Trains, and Automobiles: Cognitive Neuroscience as Reverse Auto Mechanics

Idea: Like autos, like trains, like computers, brains have parts.
The parts influence one another to produce a behavior.
The parts can have roles in multiple behaviors.
Big parts have littler parts.

Page 4:

I. Goals of the Automobile Hypothesis

Overall goals:
Identify the parts critical to behaviors of interest.
Figure out how they influence one another, in what timing sequences.

Imaging goals:
Identify relatively BIG parts (ROIs).
Figure out how they influence one another, with what timing sequences, in producing behaviors of interest.

Page 5:

I. Goal: From Data to Mechanisms

[Diagram: from a multivariate time series over A, B, C, D to causal relations among neurally localized variables X, Y, Z, W.]

Page 6:

I. Graphical Causal Models: the Abstract Structure of Influences

[Diagram: push brake → fluid level in master cylinder → fluid in caliper and fluid in wheel cylinder → friction of pads against rotor and friction of shoe against wheel → vehicle deceleration.]

This system is deterministic (we hope).

Page 7:

I. Philosophical Objections

“Cause” is a vague, metaphysical notion.
Response: Compare “probability.”

“Probability” has a mathematical structure; “causation” does not.
Response: See Spirtes et al., Causation, Prediction and Search, 1993, 2000; Pearl, Causality, 2000. Listen to Pearl’s lecture this afternoon.

The real causes are at the synaptic level, so talk of ROIs as causes is nonsense.
“…for many this rhetoric represents a category error…because causal [sic] is an attribute of the state equation.” (Friston et al., 2007, 602.)
Response: So, do you think “smoking causes cancer” is nonsense? That “human activities cause global temperature increases” is nonsense? That “turning the ignition key causes the car to start” is nonsense?

Page 8:

I. The Abstract Structure of Influences

Linear causal models (SEMs) specify a directed graphical structure:

MedFG(b) := a CING(b) + e1
STG(b) := b CING(b) + e2
IPL(b) := c STG(b) + d CING(b) + e3
e1, e2, e3 jointly independent

(S. Hanson et al., 2008. ROIs: Middle Occipital Gyrus (mog), Inferior Parietal Lobule (ipl), Middle Frontal Gyrus (mfg), and Inferior Frontal Gyrus (ifg).)

But so does any functional form of the influences:

MedFG(b) := f(CING(b)) + e1
STG(b) := g(CING(b)) + e2
IPL(b) := h(STG(b), CING(b)) + e3
e1, e2, e3 jointly independent

This system is not deterministic.
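The conditional independence that this structure implies can be checked in simulation. A minimal sketch, with made-up coefficients and standard-normal disturbances (the variable names echo the slide; nothing here is from Hanson et al.):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
a, b, c, d = 0.8, 0.5, 0.7, 0.4                # made-up coefficients

cing = rng.normal(size=n)                       # exogenous CING(b) signal
medfg = a * cing + rng.normal(size=n)           # MedFG(b) := a CING(b) + e1
stg = b * cing + rng.normal(size=n)             # STG(b)   := b CING(b) + e2
ipl = c * stg + d * cing + rng.normal(size=n)   # IPL(b)   := c STG(b) + d CING(b) + e3

def partial_corr(x, y, z):
    """Correlation of x and y after linearly regressing out z."""
    zmat = np.column_stack([z, np.ones_like(z)])
    rx = x - zmat @ np.linalg.lstsq(zmat, x, rcond=None)[0]
    ry = y - zmat @ np.linalg.lstsq(zmat, y, rcond=None)[0]
    return np.corrcoef(rx, ry)[0, 1]

# MedFG and STG are marginally associated through their common cause...
print(abs(np.corrcoef(medfg, stg)[0, 1]) > 0.2)       # True
# ...but approximately independent given CING, as the graph implies.
print(abs(partial_corr(medfg, stg, cing)) < 0.02)     # True
```

The same check with any other functional forms f, g, h would leave the conditional independence intact, which is the slide's point.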

Page 9:

I. So What?

1. The directed graph codes the conditional independence relations implied by the model:

MedFG(b) || {STG(b), IPL(b)} | CING(b)

2. (Almost) all of our tests of models are tests of implications of their conditional independence claims.

So what is the code?

Page 10:

I. d-separation Is the Code!

[Graph: X → Y → Z → W, with R a common effect of X and W (X → R ← W) and R → S.]

X || {Z, W} | Y
X || W | Z
NOT X || W | R
NOT X || W | S
NOT X || W | {Y, Z, R}
NOT X || W | {Y, Z, S}

Conditioning on a variable in a directed path between X and W blocks the association produced by that path.

Conditioning on a variable that is a common effect of X and W, or a descendant of such a common effect, creates a path that produces an association between X and W.

J. Pearl, 1988

What about systems with cycles? d-separation characterizes the conditional independence relations in all such linear systems.

P. Spirtes, 1996

Page 11:

I. How to Determine If Variables A and Z Are Independent Conditional on a Set Q of Variables

1. Consider each sequence p of edge-adjacent variables (each edge taken in either direction), without self-intersections, terminating in A and Z.
2. A collider on p is a variable N on p such that the variables M, O flanking N on p each have edges directed into N: M → N ← O.
3. Sequence (path) p creates a dependency between A and Z conditional on Q if and only if:
   1. No non-collider on p is in Q.
   2. Every collider on p is in Q or has a descendant in Q (a directed path from the collider to a member of Q).
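The three steps above translate almost line for line into code. A minimal sketch by brute-force path enumeration, fine for small graphs (the example is the graph from the d-separation slide):

```python
def descendants(graph, node):
    """All nodes reachable from `node` by a directed path."""
    seen, stack = set(), [node]
    while stack:
        for child in graph.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

def d_connected(graph, a, z, q):
    """True iff some path between a and z is active given conditioning set q:
    no non-collider on the path is in q, and every collider on it is in q
    or has a descendant in q (steps 1-3 on the slide)."""
    parents = {}
    for u, children in graph.items():
        for v in children:
            parents.setdefault(v, set()).add(u)

    def neighbors(u):
        return set(graph.get(u, set())) | parents.get(u, set())

    def active(path):
        for i in range(1, len(path) - 1):
            m, mid, o = path[i - 1], path[i], path[i + 1]
            is_collider = mid in graph.get(m, ()) and mid in graph.get(o, ())
            if is_collider:
                if mid not in q and not (descendants(graph, mid) & set(q)):
                    return False
            elif mid in q:
                return False
        return True

    def walk(path):
        if path[-1] == z:
            return active(path)
        return any(walk(path + [v])
                   for v in neighbors(path[-1]) if v not in path)

    return walk([a])

# The example graph: X -> Y -> Z -> W, X -> R <- W, R -> S.
graph = {'X': {'Y', 'R'}, 'Y': {'Z'}, 'Z': {'W'}, 'W': {'R'}, 'R': {'S'}}
print(d_connected(graph, 'X', 'W', {'Y'}))   # False: Y blocks, collider R is closed
print(d_connected(graph, 'X', 'W', {'S'}))   # True: S, a descendant of R, opens it
```

Production implementations use the linear-time "Bayes ball" reachability scheme instead of enumerating paths, but the answers agree.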

Page 12:

II. So, What Can We Do With It?

Exploit d-separation in conjunction with distribution assumptions to estimate graphical causal structure from sample data.

Understand when data analysis and measurement methods distort conditional independence relations in target systems. Wrong conditional independence relations => wrong d-separation relations => wrong causal structure.

Page 13:

II. Simple Illustration (PC)

Truth: X → Y ← Z, Y → W.

Consequences: X || Z; {X, Z} || W | Y.

Method: start with the complete undirected graph over {X, Y, Z, W}; remove each edge whose endpoints are independent conditional on some subset of the other variables; then orient the colliders and whatever further edges the pattern forces.

Spirtes, Glymour, & Scheines (1993). Causation, Prediction, and Search. Springer Lecture Notes in Statistics.
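A sketch of the skeleton and collider steps on simulated data. The coefficients, sample size, and fixed partial-correlation threshold are illustrative choices, not part of PC proper; real implementations use proper significance tests and restrict conditioning sets to current neighbors:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
n = 100_000
# Simulated truth (assumed coefficients): X -> Y <- Z, Y -> W.
X = rng.normal(size=n)
Z = rng.normal(size=n)
Y = 0.8 * X + 0.8 * Z + rng.normal(size=n)
W = 0.9 * Y + rng.normal(size=n)
data = {'X': X, 'Y': Y, 'Z': Z, 'W': W}
nodes = list(data)

def indep(a, b, cond, thresh=0.02):
    """Crude CI test: partial correlation below a fixed threshold."""
    x, y = data[a], data[b]
    if cond:
        zmat = np.column_stack([data[c] for c in cond] + [np.ones(n)])
        x = x - zmat @ np.linalg.lstsq(zmat, x, rcond=None)[0]
        y = y - zmat @ np.linalg.lstsq(zmat, y, rcond=None)[0]
    return abs(np.corrcoef(x, y)[0, 1]) < thresh

# Skeleton phase: start complete; remove any edge whose endpoints are
# independent conditional on some subset of the other variables.
adj = {frozenset(p) for p in combinations(nodes, 2)}
sepset = {}
for size in range(len(nodes) - 1):
    for pair in sorted(adj, key=sorted):
        a, b = sorted(pair)
        for cond in combinations([v for v in nodes if v not in pair], size):
            if indep(a, b, cond):
                adj.discard(pair)
                sepset[pair] = set(cond)
                break

print(sorted(sorted(p) for p in adj))   # recovers the skeleton X-Y, Y-Z, Y-W
# Collider step: Y is adjacent to both X and Z but absent from the set
# that separated them, so PC orients X -> Y <- Z.
print('Y' not in sepset[frozenset({'X', 'Z'})])   # True
```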

Page 14:

II. Bayesian Search: Greedy Equivalence Search (GES)

Truth: X → Y ← Z, Y → W.

1. Start with the empty graph.
2. Add or change the edge that most increases fit.
3. Iterate.

Data → model with the highest posterior probability.

Chickering and Meek, Uncertainty in Artificial Intelligence Proceedings, 2003.
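The "fit" in step 2 is a score such as BIC, which decomposes over nodes for linear-Gaussian models. A minimal sketch (simulated data, assumed coefficients) showing such a score preferring the true graph to a misspecified one; full GES searches over equivalence classes with forward and backward phases, which this omits:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000
# Simulated truth (assumed coefficients): X -> Y <- Z, Y -> W.
X = rng.normal(size=n)
Z = rng.normal(size=n)
Y = 0.8 * X + 0.8 * Z + rng.normal(size=n)
W = 0.9 * Y + rng.normal(size=n)
data = {'X': X, 'Y': Y, 'Z': Z, 'W': W}

def bic(dag):
    """Node-wise BIC-style score of a linear-Gaussian DAG (higher is better):
    Gaussian log-likelihood of each node given its parents, up to constants,
    minus a complexity penalty per parameter."""
    score = 0.0
    for node, pars in dag.items():
        y = data[node]
        if pars:
            pmat = np.column_stack([data[p] for p in pars] + [np.ones(n)])
            resid = y - pmat @ np.linalg.lstsq(pmat, y, rcond=None)[0]
        else:
            resid = y - y.mean()
        k = len(pars) + 2                   # slopes + intercept + variance
        score += -0.5 * n * np.log(resid.var()) - 0.5 * k * np.log(n)
    return score

truth = {'X': [], 'Z': [], 'Y': ['X', 'Z'], 'W': ['Y']}
wrong = {'X': [], 'Z': [], 'Y': ['X'], 'W': ['Z']}  # misses Z -> Y, wrong parent for W
empty = {v: [] for v in data}
print(bic(truth) > bic(wrong) > bic(empty))   # the true structure fits best
```

Because BIC is the same for all DAGs in a d-separation equivalence class, a greedy search over such scores can only recover the class, which is exactly what GES claims.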

Page 15:

II. With Unknown, Unrecorded Confounders: FCI

Truth: a graph over X, Y, Z, W in which an unrecorded variable is a common cause of two of the recorded variables.

Data → FCI output: a consistent estimator under i.i.d. sampling. (Spirtes et al., Causation, Prediction and Search.) But in other cases FCI is often uninformative.

Page 16:

II. Overlapping Databases: ION

Truth: a graph over X, W, R, Y, Z, S.

[Table: two databases, D1 and D2, each recording sample values for only an overlapping subset of the variables W, X, Y, Z, S, R.]

The ION algorithm recovers the full graph! But in other cases it often generates a number of alternative models.

Danks, Tillman and Glymour, NIPS, 2008.

Page 17:

II. Time Series (Structural VAR)

Basic idea: PC- or GES-style search on “relative” time-slices.

Example: an additive, non-linear model of climate teleconnections (5 ocean indices; 563-month series). Chu & Glymour, 2008, Journal of Machine Learning Research.

Page 18:

II. Discovering Latent Variables

Truth: latent T1 with indicators M1–M4, latent T2 with indicators M5–M8, latent T3 with indicators M9–M12.

Cluster the M’s using a heuristic or Build Pure Clusters (Silva et al., JMLR, 2006), which keeps the pure indicators (here M1–M3, M5–M6, M9–M12); then apply GES to the latents T1, T2, T3.

Applicable to time series?

Page 19:

II. Limits of PC and GES

With i.i.d. samples and correct distribution families, PC and GES give correct information almost surely in the large-sample limit, assuming no unrecorded common causes.
Works with “random effects” for linear models.
But they do not give all the information we want: often they cannot determine the directions of influences!
Can post-process with an exhaustive test of all orientations (a heuristic).
Adjacencies are more reliable than directions of edges.

X → Y → Z predicts the same independencies as the other chains and the fork over X, Y, Z: all of these are d-separation equivalent.

Page 20:

II. Breaking Down d-separation Equivalence: LiNGAM

Linear equations (reduced form):

X = e_X
Y = a_X·X + e_Y
Z = b_X·X + b_Y·Y + e_Z

Graphical representation: X → Y, X → Z, Y → Z.

Discoverable by LiNGAM (ICA + algebra)! The disturbance terms must be non-Gaussian.

Shimizu et al. (2006), Journal of Machine Learning Research.
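Why non-Gaussianity helps is visible in the two-variable case: in the true direction the regression residual is independent of the regressor, while in the reverse direction it is uncorrelated but not independent, and non-Gaussian data make the difference detectable. A sketch with uniform disturbances; the dependence score |corr(u², v²)| is a crude stand-in for the ICA and mutual-information machinery of Shimizu et al.:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000
# Truth (assumed): x -> y, with non-Gaussian (uniform) disturbances.
x = rng.uniform(-1, 1, n)
y = x + rng.uniform(-1, 1, n)

def residual(target, regressor):
    """OLS residual of target on regressor (both zero-mean here)."""
    slope = np.cov(target, regressor)[0, 1] / regressor.var()
    return target - slope * regressor

def dep(u, v):
    """Crude dependence score beyond correlation: |corr(u^2, v^2)|.
    Near zero for independent variables; a stand-in for the
    mutual-information-style measures LiNGAM variants actually use."""
    return abs(np.corrcoef(u ** 2, v ** 2)[0, 1])

fwd = dep(x, residual(y, x))   # causal direction: residual independent of x
bwd = dep(y, residual(x, y))   # reverse direction: uncorrelated but dependent
print(fwd < bwd)               # True: the asymmetry identifies x -> y
```

With Gaussian disturbances both scores would be near zero and the asymmetry would vanish, which is why the slide insists on non-Gaussianity.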

Page 21:

II. Feedback Systems

Truth: a cyclic (feedback) graph over X, W, Y, Z.

Two methods:
Modified LiNGAM: Lacerda, Spirtes, & Hoyer (2008). Discovering cyclic causal models by independent component analysis. UAI.
Conditional independencies: Richardson & Spirtes (1999). Discovery of linear cyclic models.

Page 22:

II. Missed Opportunities?

None of the machine learning/statistical methods in II have been used with imaging data. Instead:
Trial-and-error guessing and data fitting
Regression
Granger causality for time series
Exhaustive testing of all linear models

How come?
Unfamiliarity
The machine learning/statistical methods respect what it is possible to learn (in the large-sample limit), which is often less than researchers want to conclude.

Page 23:

III. Simple Possible Errors

Pooling data from different subjects: If X and Y are independent in population P1 and in population P2, but have different probability distributions in the two populations, then X and Y are usually not independent in P1 ∪ P2. (G. Yule, 1904)

Pooling data from different time points in fMRI series: If the series is not stationary, data are being pooled as above. One can remove trends, but that doesn’t guarantee stationarity.
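Yule’s point is easy to reproduce. A sketch with two hypothetical populations in which X and Y are independent within each but shifted in mean between them:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100_000
# Two hypothetical populations: X and Y independent within each,
# but both shifted upward in the second.
x1, y1 = rng.normal(0, 1, n), rng.normal(0, 1, n)
x2, y2 = rng.normal(3, 1, n), rng.normal(3, 1, n)

within = np.corrcoef(x1, y1)[0, 1]
pooled = np.corrcoef(np.concatenate([x1, x2]),
                     np.concatenate([y1, y2]))[0, 1]
print(abs(within) < 0.02)   # True: independent within each population
print(pooled > 0.5)         # True: strongly associated in the pooled sample
```

The between-population mean shift acts exactly like an unrecorded common cause, so pooling manufactures an association that exists in neither population.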

Page 24:

III. Eliminating Opportunities

Removing autocorrelation by regression interferes with discovering feedback between variables.

Data manipulations that tend to make variables Gaussian, such as spatial smoothing and variables defined by principal components or averages over ROIs, eliminate or reduce the possibility of taking advantage of LiNGAM algorithms.

Page 25:

III. Simple Limitations

Testing all models (e.g., with LISREL chi-square) is a consistent search method for linear, Gaussian models (folk theorem).

But it is not feasible except for very small numbers of variables: with 8 variables, each of the 28 variable pairs can be unconnected or joined by an edge in either direction, giving 3^28 = 22,876,792,454,961 directed graphs.

Page 26:

III. Not So Simple Possible Errors: Variables Defined on ROIs as Proxies for Latent Variables

[Graph: latent chain X → Y → Z, measured respectively by proxies A, B, C.]

X is independent of Z conditional on Y. But unless B is a perfect measure of Y, A is not independent of C conditional on B.

So if A, B, and C are taken as “proxies” for X, Y, and Z, a regression of C on A and B will find, correctly, that X has an indirect influence on Z through Y, but also, incorrectly, that X has in addition a direct influence on Z not through Y.
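A sketch of the effect (the coefficients and the measurement-error scale 0.5 are assumptions): conditioning on the noisy proxy B fails to block the chain, so the proxy-level partial correlation is clearly nonzero even though the latent one vanishes.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 200_000
# Latent chain (assumed coefficients): X -> Y -> Z.
X = rng.normal(size=n)
Y = 0.9 * X + rng.normal(size=n)
Z = 0.9 * Y + rng.normal(size=n)
# Imperfect ROI proxies: each measures its latent with error.
A = X + 0.5 * rng.normal(size=n)
B = Y + 0.5 * rng.normal(size=n)
C = Z + 0.5 * rng.normal(size=n)

def partial_corr(x, y, z):
    """Correlation of x and y after linearly regressing out z."""
    zmat = np.column_stack([z, np.ones_like(z)])
    rx = x - zmat @ np.linalg.lstsq(zmat, x, rcond=None)[0]
    ry = y - zmat @ np.linalg.lstsq(zmat, y, rcond=None)[0]
    return np.corrcoef(rx, ry)[0, 1]

print(abs(partial_corr(X, Z, Y)) < 0.02)   # True: Y blocks the latent chain
print(partial_corr(A, C, B) > 0.05)        # True: noisy B fails to block
```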

Page 27:

III. Not So Obvious Errors: Regression

Lots of forms: linear, polynomial, logistic, etc. All have the following features:

A prior separation of variables into an outcome, Y, and a set S of possible causes of Y: A, B, C, etc.
The regression estimate of the influence of A on Y is a measure of the association of A and Y conditional on all of the other variables in S.
Regression for causal effects always attempts to estimate the direct (relative to the other variables in S) influence of A on Y.

Page 28:

III. Regression to Estimate Causal Influence

Let V = {X, Y, T}, where
- Y: the measured outcome
- X = {X1, X2, …, Xn}: the measured regressors
- T = {T1, …, Tk}: latent common causes of pairs in X ∪ {Y}

Let the true causal model over V be a structural equation model in which each V ∈ V is a linear combination of its direct causes and independent, Gaussian noise.

Page 29:

III. Regression to Estimate Causal Influence

Consider the regression equation:

Y = β0 + β1X1 + β2X2 + … + βnXn

Let the OLS regression estimate β̂i be the estimated causal influence of Xi on Y. That is, hypothetically holding X\{Xi} experimentally constant, β̂i is an estimate of the change in E(Y) that would result from an intervention that changes Xi by 1 unit.

Let the real causal influence Xi → Y be bi.

When is the OLS estimate β̂i a consistent estimate of bi?

Page 30:

III. Regression Will Be “Inconsistent” When

1. There is an unrecorded common cause of Y and Xi:

[Graph: Xi ← L → Y, with Xi → Y.]

If X, Y are the only measured variables, PC, GES and FCI cannot determine whether the influence is from X to Y, from an unmeasured common cause, or both. LiNGAM can, if the disturbances are non-Gaussian.
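The resulting bias is visible in a short simulation (all coefficients are assumptions): with a unit-variance confounder L and true influence b = 0.5, the OLS slope of Y on Xi converges to b plus the confounding term cov(L, Xi)/var(Xi), here 1.0 rather than 0.5.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200_000
b_true = 0.5                     # assumed true causal influence of Xi on Y
L = rng.normal(size=n)           # unrecorded common cause
Xi = L + rng.normal(size=n)
Y = b_true * Xi + L + rng.normal(size=n)

# OLS slope of Y on Xi, with L unobserved:
b_ols = np.cov(Xi, Y)[0, 1] / Xi.var()
# b_ols converges to b_true + cov(L, Xi)/var(Xi) = 0.5 + 0.5 = 1.0,
# so the estimate doubles the true influence.
print(round(b_ols, 2))
```

No amount of data fixes this: the estimate is precise and wrong, which is what "inconsistent" means here.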

Page 31:

Regression Will Be “Inconsistent” When

2. Cause and effect are confused: Xi ← Y.

“…one region, with a long haemodynamic latency, could cause a neuronal response in another that was expressed, haemodynamically, before the source.” (Friston et al., 2007, 602.) LiNGAM does not make this error.

3. And that error can lead to others: with Xi ← Y and a third variable Xk, regression concludes that Xk is a cause of Y. FCI, etc., do not make these errors.

Page 32:

Bad Regression Example

[True model: outcome Y, regressors X1, X2, X3, and latent common causes T1, T2.]

Multiple regression result: the estimates β̂1, β̂2, β̂3 misattribute the influences, reporting no influence for some genuine causes and influence for some non-causes.

PC, GES, FCI get these kinds of cases right.

Page 33:

Regression Consistency

If
- Xi is d-separated from Y conditional on X\{Xi} in the true graph after removing the edge Xi → Y, and
- X contains no descendant of Y,
then β̂i is a consistent estimate of bi.

Page 34:

III. Granger Causality

Idea (time series): X is a Granger cause of Y iff, for stationary series, {…, Xt-1; …, Yt-1} predicts Yt better than {…, Yt-1} does.

Obvious generalizations:
Non-Gaussian time series.
Multiple time series, essentially a time-series version of multiple regression: X is a Granger cause of Y iff Yt is not independent of …, Xt-1 conditional on covariates …, Zt-1.

Less obvious generalizations:
Non-linear time series (finding conditional independence tests is touchy).

C. Granger, Econometrica, 1969.
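The definition suggests a direct implementation: compare the residual variance of predicting Yt from its own past with and without the past of X. A sketch on a simulated VAR(1) in which X drives Y (coefficients assumed; a real analysis would use proper lag selection and an F-test rather than a raw variance comparison):

```python
import numpy as np

rng = np.random.default_rng(7)
T = 50_000
x = np.zeros(T)
y = np.zeros(T)
ex, ey = rng.normal(size=T), rng.normal(size=T)
for t in range(1, T):
    x[t] = 0.5 * x[t - 1] + ex[t]                    # x evolves on its own
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + ey[t]   # past x drives y

def resid_var(target, regressors):
    """Residual variance of an OLS fit of target on the regressors."""
    zmat = np.column_stack(regressors + [np.ones(len(target))])
    beta = np.linalg.lstsq(zmat, target, rcond=None)[0]
    return (target - zmat @ beta).var()

# Does past x improve the prediction of y beyond y's own past?
restricted = resid_var(y[1:], [y[:-1]])
full = resid_var(y[1:], [y[:-1], x[:-1]])
print(full < restricted - 0.3)          # True: x Granger-causes y

# In reverse, past y adds essentially nothing for predicting x:
restricted_x = resid_var(x[1:], [x[:-1]])
full_x = resid_var(x[1:], [x[:-1], y[:-1]])
print(restricted_x - full_x < 0.01)     # True
```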

Page 35:

GC All Over the Place

Goebel, R., Roebroeck, A., Kim, D., and Formisano, E. (2003). Investigating directed cortical interactions in time-resolved fMRI data using vector autoregressive modeling and Granger causality mapping. Magnetic Resonance Imaging, 21: 125-161.

Chen, Y., Bressler, S.L., Knuth, K.H., Truccolo, W.A., and Ding, M.Z. (2006). Stochastic modeling of neurobiological time series: power, coherence, Granger causality, and separation of evoked responses from ongoing activity. Chaos 16, 26-113.

Brovelli, A., Ding, M.Z., Ledberg, A., Chen, Y.H., Nakamura, R., and Bressler, S.L. (2004). Beta oscillations in a large-scale sensorimotor cortical network: directional influences revealed by Granger causality. Proc. Natl. Acad. Sci. U.S.A. 101: 9849–9854.

Deshpande, G., Hu, Stilla, R., and Sathian, K. (2008). Effective connectivity during haptic perception: a study using Granger causality analysis of functional magnetic resonance imaging data. NeuroImage, 40: 1807-1814.

Page 36:

III. Problems with GC

• fMRI series with multiple conditions are not stationary. This may not always be serious.
• GC can produce causal errors when there is measurement error or unmeasured confounding series.
• Open research problem: find a consistent method to identify unrecorded common causes of time series, akin to Silva et al., JMLR, 2006, for equilibrium data; Glymour and Spirtes, J. of Econometrics, 1988.

Page 37:

III. If Xt records an event occurring later than Yt+1, X may be mistakenly taken to be a cause of Y. (Friston, 2007, again.)

• This is a problem for regression;
• Not a problem if PC, FCI, GES or LiNGAM are used in estimating the “structural VAR,” because they do not require a separation of variables into outcome and potential cause, or a time ordering of variables.

Page 38:

III. Granger Causality and Mechanisms

Neural signals occur faster than the fMRI sampling rate. What is going on in between?

[Diagram: the unobserved fast dynamics run over time-indexed variables X1…X4, Y1…Y4, Z1…Z4, W1…W4; the Granger causes estimated at the observed sampling rate form a graph over X, Y, Z, W that contains spurious edges.]

Page 39:

III. Analysis of Residuals

Regress and apply PC, etc., to the residuals.

[Diagram: with the fast dynamics over X1…X4, Y1…Y4, Z1…Z4, W1…W4 unobserved, regress on X1, Y1, Z1, W1 and search over the residuals for the structure over X, Y, Z, W.]

Swanson and Granger, JASA; Demiralp and Hoover (2003), Oxford Economic Bulletin.

Page 40:

Conclusion

Causal inference from imaging data is about as hard as it gets;
Conventional statistical procedures are radically insufficient tools;
There are lots of unused, potentially relevant, principled tools in the machine learning literature;
Measurement methods and data transformations can alter the probability distributions in destructive ways;
Graphical causal models are the best available tool for thinking about the statistical constraints that causal hypotheses imply.

Page 41:

Things There Aren’t:

Magic Wands
Pixie Dust

Page 42:

If You Forget Everything Else in This Talk, Remember This:

P. Spirtes et al., Causation, Prediction and Search, Springer Lecture Notes in Statistics; 2nd edition, MIT Press, 2000.
J. Pearl, Causality, Oxford, 2000.
Uncertainty in Artificial Intelligence annual conference proceedings.
Journal of Machine Learning Research.
Peter Spirtes’ webpage.
Judea Pearl’s webpage.
The TETRAD webpage.