Semantics & Factorization: Probabilistic Graphical Models, Bayesian Networks, Representation

Post on 22-Jun-2020


Page 1: Graphical Models Bayesian Networks Semantics & … spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical Models Bayesian Networks Representation

Daphne Koller

Semantics & Factorization

Probabilistic Graphical Models: Bayesian Networks

Representation

Page 2:

• Grade
• Course Difficulty
• Student Intelligence
• Student SAT
• Reference Letter

P(G,D,I,S,L)

Page 3:

[Figure: the student network. Difficulty → Grade ← Intelligence, Intelligence → SAT, Grade → Letter]

Page 4:

[Figure: the student network (Difficulty → Grade ← Intelligence, Intelligence → SAT, Grade → Letter) annotated with the CPD tables below]

P(D):
d0     d1
0.6    0.4

P(I):
i0     i1
0.7    0.3

P(G | I, D):
         g1 (A)   g2 (B)   g3 (C)
i0,d0    0.3      0.4      0.3
i0,d1    0.05     0.25     0.7
i1,d0    0.9      0.08     0.02
i1,d1    0.5      0.3      0.2

P(S | I):
      s0      s1
i0    0.95    0.05
i1    0.2     0.8

P(L | G):
      l0      l1
g1    0.1     0.9
g2    0.4     0.6
g3    0.99    0.01

Page 5:

[Figure: student network with one CPD per node: P(D), P(I), P(G|I,D), P(S|I), P(L|G)]

Chain Rule for Bayesian Networks

P(D,I,G,S,L) = P(D) P(I) P(G|I,D) P(S|I) P(L|G)

Distribution defined as a product of factors!

Page 6:

[Figure: student network with its CPD tables (same values as on Page 4)]

P(d0, i1, g3, s1, l1) =
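
The product left open on this slide can be evaluated by reading one entry from each CPD. The following is a minimal Python sketch, assuming the CPD values shown on the student-network slide; the dictionary layout and the function name `joint` are illustrative choices, not part of the lecture.

```python
# CPD entries copied from the student-network slide (the layout is illustrative).
P_D = {"d0": 0.6, "d1": 0.4}
P_I = {"i0": 0.7, "i1": 0.3}
P_G = {  # P(G | I, D), keyed by (i, d)
    ("i0", "d0"): {"g1": 0.3, "g2": 0.4, "g3": 0.3},
    ("i0", "d1"): {"g1": 0.05, "g2": 0.25, "g3": 0.7},
    ("i1", "d0"): {"g1": 0.9, "g2": 0.08, "g3": 0.02},
    ("i1", "d1"): {"g1": 0.5, "g2": 0.3, "g3": 0.2},
}
P_S = {"i0": {"s0": 0.95, "s1": 0.05}, "i1": {"s0": 0.2, "s1": 0.8}}  # P(S | I)
P_L = {  # P(L | G)
    "g1": {"l0": 0.1, "l1": 0.9},
    "g2": {"l0": 0.4, "l1": 0.6},
    "g3": {"l0": 0.99, "l1": 0.01},
}


def joint(d, i, g, s, l):
    """Chain rule for the student network: one factor per node."""
    return P_D[d] * P_I[i] * P_G[(i, d)][g] * P_S[i][s] * P_L[g][l]


# P(d0) P(i1) P(g3 | i1, d0) P(s1 | i1) P(l1 | g3) = 0.6 * 0.3 * 0.02 * 0.8 * 0.01
p = joint("d0", "i1", "g3", "s1", "l1")
print(p)
```

The five factors multiply to 2.88 × 10^-5, up to floating-point rounding.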

Page 7:

Bayesian Network

• A Bayesian network is:
  – A directed acyclic graph (DAG) G whose nodes represent the random variables X1,…,Xn
  – For each node Xi, a CPD P(Xi | ParG(Xi))
• The BN represents a joint distribution via the chain rule for Bayesian networks:

P(X1,…,Xn) = Πi P(Xi | ParG(Xi))

Page 8:

BN Is a Legal Distribution: P ≥ 0

Page 9:

BN Is a Legal Distribution: ∑ P = 1

∑D,I,G,S,L P(D,I,G,S,L)
= ∑D,I,G,S,L P(D) P(I) P(G|I,D) P(S|I) P(L|G)
= ∑D,I,G,S P(D) P(I) P(G|I,D) P(S|I) ∑L P(L|G)
= ∑D,I,G,S P(D) P(I) P(G|I,D) P(S|I)
= ∑D,I,G P(D) P(I) P(G|I,D) ∑S P(S|I)
= ∑D,I P(D) P(I) ∑G P(G|I,D)
= ∑D,I P(D) P(I)
= ∑D P(D) ∑I P(I)
= 1

Each inner sum equals 1 because every CPD is normalized over its own variable.
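
Both legality conditions (P ≥ 0 and ∑ P = 1) can also be checked numerically by brute force. This is a small sketch under the assumption that the CPD values are those of the student-network slide; the names `joint`, `values`, and `total` are illustrative.

```python
from itertools import product

# CPD entries copied from the student-network slide (the layout is illustrative).
P_D = {"d0": 0.6, "d1": 0.4}
P_I = {"i0": 0.7, "i1": 0.3}
P_G = {  # P(G | I, D), keyed by (i, d)
    ("i0", "d0"): {"g1": 0.3, "g2": 0.4, "g3": 0.3},
    ("i0", "d1"): {"g1": 0.05, "g2": 0.25, "g3": 0.7},
    ("i1", "d0"): {"g1": 0.9, "g2": 0.08, "g3": 0.02},
    ("i1", "d1"): {"g1": 0.5, "g2": 0.3, "g3": 0.2},
}
P_S = {"i0": {"s0": 0.95, "s1": 0.05}, "i1": {"s0": 0.2, "s1": 0.8}}
P_L = {
    "g1": {"l0": 0.1, "l1": 0.9},
    "g2": {"l0": 0.4, "l1": 0.6},
    "g3": {"l0": 0.99, "l1": 0.01},
}


def joint(d, i, g, s, l):
    """Product of the five CPD entries selected by one full assignment."""
    return P_D[d] * P_I[i] * P_G[(i, d)][g] * P_S[i][s] * P_L[g][l]


# Enumerate all 2 * 2 * 3 * 2 * 2 = 48 full assignments.
assignments = product(P_D, P_I, ("g1", "g2", "g3"), ("s0", "s1"), ("l0", "l1"))
values = [joint(d, i, g, s, l) for d, i, g, s, l in assignments]
total = sum(values)

print(min(values) >= 0.0, total)
```

Enumeration only works for tiny networks; the symbolic argument on the slide is what holds in general.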

Page 10:

P Factorizes over G

• Let G be a graph over X1,…,Xn.
• P factorizes over G if

P(X1,…,Xn) = Πi P(Xi | ParG(Xi))

Page 11:

Genetic Inheritance

[Figure: family tree with Clancy, Jackie, Selma, Marge, Homer, Bart, Lisa, Maggie]

Genotype: AA, AB, AO, BO, BB, OO
Phenotype: A, B, AB, O

Page 12:

BNs for Genetic Inheritance

[Figure: Bayesian network over the variables listed below]

G variables: GClancy, GJackie, GSelma, GHomer, GMarge, GBart, GLisa, GMaggie
B variables: BClancy, BJackie, BSelma, BHomer, BMarge, BBart, BLisa, BMaggie

Page 13:

Reasoning Patterns

Probabilistic Graphical Models: Bayesian Networks

Representation

Page 14:

The Student Network

[Figure: student network with its CPD tables (same values as on Page 4)]

Page 15:

Causal Reasoning

[Figure: student network]

P(l1) ≈ 0.5
P(l1 | i0) ≈ 0.39
P(l1 | i0, d0) ≈ 0.51
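
The three causal-reasoning values can be reproduced by summing out the unobserved variables. The sketch below assumes the CPD values from the student-network slide and leaves SAT out, since an unobserved leaf sums to 1 and does not affect the query; the function name `p_l1` is illustrative.

```python
from itertools import product

# CPD entries copied from the student-network slide (SAT omitted: it sums out).
P_D = {"d0": 0.6, "d1": 0.4}
P_I = {"i0": 0.7, "i1": 0.3}
P_G = {  # P(G | I, D), keyed by (i, d)
    ("i0", "d0"): {"g1": 0.3, "g2": 0.4, "g3": 0.3},
    ("i0", "d1"): {"g1": 0.05, "g2": 0.25, "g3": 0.7},
    ("i1", "d0"): {"g1": 0.9, "g2": 0.08, "g3": 0.02},
    ("i1", "d1"): {"g1": 0.5, "g2": 0.3, "g3": 0.2},
}
P_L1 = {"g1": 0.9, "g2": 0.6, "g3": 0.01}  # P(l1 | G)


def p_l1(i=None, d=None):
    """P(l1 | evidence on I and/or D), summing over everything unobserved."""
    num = den = 0.0
    for dv, iv in product(P_D, P_I):
        if (i is not None and iv != i) or (d is not None and dv != d):
            continue  # assignment contradicts the evidence
        weight = P_D[dv] * P_I[iv]
        den += weight
        num += weight * sum(P_G[(iv, dv)][g] * P_L1[g] for g in P_L1)
    return num / den


print(round(p_l1(), 2), round(p_l1(i="i0"), 2), round(p_l1(i="i0", d="d0"), 2))
```

Observing a less intelligent student lowers the probability of a strong letter, and learning that the class is easy raises it again.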

Page 16:

Evidential Reasoning

[Figure: student network; evidence: student gets a C (g3)]

P(d1) = 0.4
P(i1) = 0.3
P(d1 | g3) ≈ 0.63
P(i1 | g3) ≈ 0.08

P(G | I, D):
         g1      g2      g3
i0,d0    0.3     0.4     0.3
i0,d1    0.05    0.25    0.7
i1,d0    0.9     0.08    0.02
i1,d1    0.5     0.3     0.2

Page 17:

Intercausal Reasoning

[Figure: student network; evidence: student gets a C (g3), class is hard (d1)]

P(d1) = 0.4
P(i1) = 0.3
P(d1 | g3) ≈ 0.63
P(i1 | g3) ≈ 0.08
P(i1 | g3, d1) ≈ 0.11
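
The evidential and intercausal posteriors listed here follow from Bayes' rule over the three variables D, I, and G. A minimal sketch, assuming the CPD values from the student-network slide (SAT and Letter are unobserved descendants and drop out of these queries); the helper name `prob` is illustrative.

```python
from itertools import product

# CPD entries copied from the student-network slide.
P_D = {"d0": 0.6, "d1": 0.4}
P_I = {"i0": 0.7, "i1": 0.3}
P_G = {  # P(G | I, D), keyed by (i, d)
    ("i0", "d0"): {"g1": 0.3, "g2": 0.4, "g3": 0.3},
    ("i0", "d1"): {"g1": 0.05, "g2": 0.25, "g3": 0.7},
    ("i1", "d0"): {"g1": 0.9, "g2": 0.08, "g3": 0.02},
    ("i1", "d1"): {"g1": 0.5, "g2": 0.3, "g3": 0.2},
}


def prob(d=None, i=None, g=None):
    """Probability of a partial assignment: sum P(D) P(I) P(G | I, D) over the rest."""
    total = 0.0
    for dv, iv, gv in product(P_D, P_I, ("g1", "g2", "g3")):
        if d in (None, dv) and i in (None, iv) and g in (None, gv):
            total += P_D[dv] * P_I[iv] * P_G[(iv, dv)][gv]
    return total


# Evidential reasoning: from the grade back to its causes.
p_d1_given_g3 = prob(d="d1", g="g3") / prob(g="g3")
p_i1_given_g3 = prob(i="i1", g="g3") / prob(g="g3")
# Intercausal reasoning: a hard class partly explains away the C.
p_i1_given_g3_d1 = prob(i="i1", d="d1", g="g3") / prob(d="d1", g="g3")

print(round(p_d1_given_g3, 2), round(p_i1_given_g3, 2), round(p_i1_given_g3_d1, 2))
```

Learning that the class is hard moves the probability of high intelligence from about 0.08 back up to about 0.11, although it stays below the prior of 0.3.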

Page 18:

Intercausal Reasoning Explained

[Figure: X1 → Y ← X2, where Y is the OR of X1 and X2]

X1   X2   Y    Prob
0    0    0    0.25
0    1    1    0.25
1    0    1    0.25
1    1    1    0.25
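
The OR table makes explaining away easy to verify by hand or in code. The sketch below assumes X1 and X2 are independent and uniform, as the four equally likely rows suggest; the function name `p_x1` is illustrative, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import product

# The four rows of the table: (x1, x2, y, probability), with y = x1 OR x2.
rows = [(x1, x2, x1 | x2, Fraction(1, 4)) for x1, x2 in product((0, 1), repeat=2)]


def p_x1(x2=None, y=None):
    """P(X1 = 1 | evidence), where the evidence may fix X2 and/or Y."""
    keep = [(a, b, c, p) for a, b, c, p in rows
            if x2 in (None, b) and y in (None, c)]
    return sum(p for a, _, _, p in keep if a == 1) / sum(p for *_, p in keep)


print(p_x1())            # prior
print(p_x1(y=1))         # after observing Y = 1
print(p_x1(y=1, x2=1))   # after also observing X2 = 1
```

Observing Y = 1 raises the probability of X1 = 1 from 1/2 to 2/3; once X2 = 1 is also known, it accounts for Y and the probability returns to 1/2.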

Page 19:

Intercausal Reasoning II

[Figure: student network; evidence: student gets a B (g2), class is hard (d1)]

P(i1) = 0.3
P(i1 | g2) ≈ 0.175
P(i1 | g2, d1) ≈ 0.34
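
Because only the grade's parents are involved, the two posteriors for a B can be computed from P(D), P(I), and the g2 column of P(G | I, D). A short sketch assuming the values from the student-network slide; `p_i1_given_g2` is an illustrative name.

```python
# Priors and the g2 column of P(G | I, D), copied from the student-network slide.
P_D = {"d0": 0.6, "d1": 0.4}
P_I = {"i0": 0.7, "i1": 0.3}
P_G2 = {("i0", "d0"): 0.4, ("i0", "d1"): 0.25, ("i1", "d0"): 0.08, ("i1", "d1"): 0.3}


def p_i1_given_g2(d=None):
    """P(i1 | g2), or P(i1 | g2, d) when the difficulty d is observed."""
    d_values = [d] if d is not None else list(P_D)
    # Unnormalized posterior weight of each intelligence value.
    score = {i: sum(P_I[i] * P_D[dv] * P_G2[(i, dv)] for dv in d_values) for i in P_I}
    return score["i1"] / sum(score.values())


print(round(p_i1_given_g2(), 3), round(p_i1_given_g2(d="d1"), 2))
```

Here the intercausal effect is strong enough to push the posterior above the prior: a B in a hard class is evidence for high intelligence.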

Page 20:

Student Aces the SAT

• What happens to the posterior probability that the class is hard?

[Figure: student network; evidence: student gets a C (g3), student aces the SAT (s1)]

Page 21:

Student Aces the SAT

[Figure: student network; evidence: student gets a C (g3), student aces the SAT (s1)]

P(d1) = 0.4
P(i1) = 0.3
P(d1 | g3) ≈ 0.63
P(i1 | g3) ≈ 0.08
P(d1 | g3, s1) ≈ 0.76
P(i1 | g3, s1) ≈ 0.58
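
With both g3 and s1 observed, each (d, i) pair gets an unnormalized weight P(d) P(i) P(g3 | i, d) P(s1 | i), and the posteriors are ratios of those weights. A minimal sketch assuming the CPD values from the student-network slide; the names `w`, `z`, `p_i1`, and `p_d1` are illustrative.

```python
from itertools import product

# Entries copied from the student-network slide; only the g3 and s1 columns are needed.
P_D = {"d0": 0.6, "d1": 0.4}
P_I = {"i0": 0.7, "i1": 0.3}
P_G3 = {("i0", "d0"): 0.3, ("i0", "d1"): 0.7, ("i1", "d0"): 0.02, ("i1", "d1"): 0.2}
P_S1 = {"i0": 0.05, "i1": 0.8}  # P(s1 | I)

# Unnormalized weight of each (d, i) given the evidence g3 and s1.
w = {(d, i): P_D[d] * P_I[i] * P_G3[(i, d)] * P_S1[i] for d, i in product(P_D, P_I)}
z = sum(w.values())  # equals P(g3, s1)

p_i1 = sum(v for (d, i), v in w.items() if i == "i1") / z
p_d1 = sum(v for (d, i), v in w.items() if d == "d1") / z

print(round(p_i1, 2), round(p_d1, 2))
```

The strong SAT score raises the belief in high intelligence, which in turn makes a hard class the more likely explanation for the C.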

Page 22:

Flow of Probabilistic Influence

Probabilistic Graphical Models: Bayesian Networks

Representation

Page 23:

When can X influence Y?

• X → Y
• X ← Y
• X → W → Y
• X ← W ← Y
• X ← W → Y
• X → W ← Y

[Figure: student network]

Page 24:

Active Trails

• A trail X1 ─ … ─ Xn is active if it has no v-structures Xi-1 → Xi ← Xi+1

Page 25:

When can X influence Y given evidence about Z?

• X → Y
• X ← Y
• X → W → Y
• X ← W ← Y
• X ← W → Y
• X → W ← Y

For the trails through W, two cases are compared: W ∉ Z and W ∈ Z.

[Figure: student network]

Page 26:

When can X influence Y given evidence about Z?

• S ― I ― G ― D allows influence to flow when:

[Figure: student network]

Page 27:

Active Trails

• A trail X1 ─ … ─ Xn is active given Z if:
  – for any v-structure Xi-1 → Xi ← Xi+1, we have that Xi or one of its descendants ∈ Z
  – no other Xi is in Z
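
The two conditions in this definition can be applied mechanically to a small graph. The sketch below assumes the student-network edges D → G, I → G, I → S, G → L and enumerates every simple trail, which is fine for five nodes but is not an efficient algorithm for large graphs; all function names are illustrative.

```python
# Directed edges (parent, child) of the student network; assumed from the CPDs.
EDGES = [("D", "G"), ("I", "G"), ("I", "S"), ("G", "L")]


def neighbors(n):
    """Nodes adjacent to n, ignoring edge direction."""
    return {c for p, c in EDGES if p == n} | {p for p, c in EDGES if c == n}


def descendants(n):
    """All nodes reachable from n along directed edges."""
    found, stack = set(), [n]
    while stack:
        for child in (c for p, c in EDGES if p == stack[-1]):
            if child not in found:
                found.add(child)
                stack.append(child)
        stack.pop(0) if False else stack.remove(stack[-1])
    return found


def trails(x, y, path=None):
    """Yield every simple (non-repeating) trail from x to y."""
    path = path or [x]
    if path[-1] == y:
        yield path
        return
    for n in sorted(neighbors(path[-1])):
        if n not in path:
            yield from trails(x, y, path + [n])


def is_active(trail, z):
    """Apply the two conditions of the definition to each interior node."""
    for a, b, c in zip(trail, trail[1:], trail[2:]):
        if (a, b) in EDGES and (c, b) in EDGES:  # v-structure a -> b <- c
            if not ({b} | descendants(b)) & z:
                return False
        elif b in z:  # any other interior node must be unobserved
            return False
    return True


def d_separated(x, y, z):
    """True if no trail between x and y is active given the set z."""
    return not any(is_active(t, set(z)) for t in trails(x, y))


print(d_separated("S", "D", set()), d_separated("S", "D", {"G"}))
```

For the trail S ― I ― G ― D, the check reports that influence flows when G or its descendant L is observed and I is not.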

Page 28:

Preliminaries

Probabilistic Graphical Models: Independencies

Representation

Page 29:

Independence

• For events α, β, P ⊨ (α ⊥ β) if:
  – P(α, β) = P(α) P(β)
  – P(α | β) = P(α)
  – P(β | α) = P(β)
• For random variables X, Y, P ⊨ (X ⊥ Y) if:
  – P(X, Y) = P(X) P(Y)
  – P(X | Y) = P(X)
  – P(Y | X) = P(Y)

Page 30:

Independence

[Figure: I → G ← D]

P(I, D, G):
I    D    G    Prob.
i0   d0   g1   0.126
i0   d0   g2   0.168
i0   d0   g3   0.126
i0   d1   g1   0.009
i0   d1   g2   0.045
i0   d1   g3   0.126
i1   d0   g1   0.252
i1   d0   g2   0.0224
i1   d0   g3   0.0056
i1   d1   g1   0.06
i1   d1   g2   0.036
i1   d1   g3   0.024

P(I,D):
I    D    Prob
i0   d0   0.42
i0   d1   0.18
i1   d0   0.28
i1   d1   0.12

P(I):
I    Prob
i0   0.6
i1   0.4

P(D):
D    Prob
d0   0.7
d1   0.3
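
The independence of I and D in this table can be confirmed by comparing each joint entry with the product of its marginals. A minimal sketch using the P(I,D) values from the slide; the variable names are illustrative.

```python
# P(I, D) copied from the slide.
P_ID = {("i0", "d0"): 0.42, ("i0", "d1"): 0.18, ("i1", "d0"): 0.28, ("i1", "d1"): 0.12}

# Marginals obtained by summing out the other variable.
p_i = {i: sum(p for (iv, _), p in P_ID.items() if iv == i) for i in ("i0", "i1")}
p_d = {d: sum(p for (_, dv), p in P_ID.items() if dv == d) for d in ("d0", "d1")}

# I and D are independent if every joint entry equals the product of marginals.
independent = all(abs(p - p_i[i] * p_d[d]) < 1e-9 for (i, d), p in P_ID.items())

print(p_i, p_d, independent)
```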

Page 31:

Conditional Independence

• For (sets of) random variables X, Y, Z, P ⊨ (X ⊥ Y | Z) if:
  – P(X, Y | Z) = P(X | Z) P(Y | Z)
  – P(X | Y, Z) = P(X | Z)
  – P(Y | X, Z) = P(Y | Z)
  – P(X, Y, Z) ∝ φ1(X, Z) φ2(Y, Z)

Page 32:

Conditional Independence

[Figure: network over Coin, X1, and X2, with Coin as the parent of X1 and X2]

Page 33:

Conditional Independence

[Figure: S ← I → G]

P(I, S, G):
I    S    G    Prob.
i0   s0   g1   0.114
i0   s0   g2   0.1938
i0   s0   g3   0.2622
i0   s1   g1   0.006
i0   s1   g2   0.0102
i0   s1   g3   0.0138
i1   s0   g1   0.252
i1   s0   g2   0.0224
i1   s0   g3   0.0056
i1   s1   g1   0.108
i1   s1   g2   0.0096
i1   s1   g3   0.0024

P(S,G | i0):
S    G    Prob.
s0   g1   0.19
s0   g2   0.323
s0   g3   0.437
s1   g1   0.01
s1   g2   0.017
s1   g3   0.023

P(S | i0):
S    Prob
s0   0.95
s1   0.05

P(G | i0):
G    Prob.
g1   0.2
g2   0.34
g3   0.46
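
The table can be used to check both that S and G are independent given I and that they are dependent when I is not observed. A sketch using the twelve joint entries from the slide; the function and variable names are illustrative.

```python
from itertools import product

# P(I, S, G) copied from the slide's table.
P = {
    ("i0", "s0", "g1"): 0.114, ("i0", "s0", "g2"): 0.1938, ("i0", "s0", "g3"): 0.2622,
    ("i0", "s1", "g1"): 0.006, ("i0", "s1", "g2"): 0.0102, ("i0", "s1", "g3"): 0.0138,
    ("i1", "s0", "g1"): 0.252, ("i1", "s0", "g2"): 0.0224, ("i1", "s0", "g3"): 0.0056,
    ("i1", "s1", "g1"): 0.108, ("i1", "s1", "g2"): 0.0096, ("i1", "s1", "g3"): 0.0024,
}
I_VALS, S_VALS, G_VALS = ("i0", "i1"), ("s0", "s1"), ("g1", "g2", "g3")


def s_indep_g_given(i, tol=1e-9):
    """Check P(s, g | i) == P(s | i) P(g | i) for every s and g."""
    p_i = sum(P[i, s, g] for s, g in product(S_VALS, G_VALS))
    for s, g in product(S_VALS, G_VALS):
        p_s = sum(P[i, s, gv] for gv in G_VALS) / p_i
        p_g = sum(P[i, sv, g] for sv in S_VALS) / p_i
        if abs(P[i, s, g] / p_i - p_s * p_g) > tol:
            return False
    return True


cond_indep = all(s_indep_g_given(i) for i in I_VALS)

# Without conditioning on I, compare P(s0, g1) with P(s0) P(g1).
p_s0 = sum(P[i, "s0", g] for i, g in product(I_VALS, G_VALS))
p_g1 = sum(P[i, s, "g1"] for i, s in product(I_VALS, S_VALS))
p_s0_g1 = sum(P[i, "s0", "g1"] for i in I_VALS)
marg_indep = abs(p_s0_g1 - p_s0 * p_g1) < 1e-9

print(cond_indep, marg_indep)
```

In this table the factorization holds for both values of I, while the marginal check fails, so conditioning on I is what makes S and G independent.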

Page 34:

Conditioning can Lose Independences

P(I, D, G):
I    D    G    Prob.
i0   d0   g1   0.126
i0   d0   g2   0.168
i0   d0   g3   0.126
i0   d1   g1   0.009
i0   d1   g2   0.045
i0   d1   g3   0.126
i1   d0   g1   0.252
i1   d0   g2   0.0224
i1   d0   g3   0.0056
i1   d1   g1   0.06
i1   d1   g2   0.036
i1   d1   g3   0.024

P(I,D | g1):
I    D    Prob.
i0   d0   0.282
i0   d1   0.02
i1   d0   0.564
i1   d1   0.134
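
The conditional table follows from renormalizing the g1 rows of the joint, and the loss of independence shows up as a gap between P(i0, d0 | g1) and P(i0 | g1) P(d0 | g1). A minimal sketch using the slide's values; the variable names are illustrative.

```python
# Rows of the slide's P(I, D, G) table with G = g1.
P_g1_rows = {("i0", "d0"): 0.126, ("i0", "d1"): 0.009,
             ("i1", "d0"): 0.252, ("i1", "d1"): 0.06}

p_g1 = sum(P_g1_rows.values())                      # P(g1)
cond = {k: v / p_g1 for k, v in P_g1_rows.items()}  # P(I, D | g1)

# Conditional marginals of I and D given g1.
p_i0 = cond["i0", "d0"] + cond["i0", "d1"]
p_d0 = cond["i0", "d0"] + cond["i1", "d0"]

# A nonzero gap means I and D are dependent once G is observed.
gap = cond["i0", "d0"] - p_i0 * p_d0

print({k: round(v, 3) for k, v in cond.items()}, round(gap, 3))
```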

Page 35:

Bayesian Networks

Probabilistic Graphical Models: Independencies

Representation

Page 36:

Independence & Factorization

• Factorization of a distribution P implies independencies that hold in P
• If P factorizes over G, can we read these independencies from the structure of G?

P(X,Y) = P(X) P(Y)  ⇒  X, Y independent
P(X,Y,Z) ∝ φ1(X,Z) φ2(Y,Z)  ⇒  (X ⊥ Y | Z)

Page 37:

Flow of influence & d-separation

Definition: X and Y are d-separated in G given Z if there is no active trail in G between X and Y given Z

Notation: d-sepG(X, Y | Z)

[Figure: student network over D, I, G, S, L]

Page 38:

Factorization ⇒ Independence: BNs

Theorem: If P factorizes over G, and d-sepG(X, Y | Z), then P satisfies (X ⊥ Y | Z)

[Figure: student network over D, I, G, S, L]

Page 39:

Any node is d-separated from its non-descendants given its parents

[Figure: extended student network with nodes Coherence, Difficulty, Intelligence, Grade, SAT, Letter, Job, Happy]

If P factorizes over G, then in P, any variable is independent of its non-descendants given its parents

Page 40:

I-maps

• d-separation in G ⇒ P satisfies the corresponding independence statement

I(G) = {(X ⊥ Y | Z) : d-sepG(X, Y | Z)}

• Definition: If P satisfies I(G), we say that G is an I-map (independency map) of P

Page 41:

I-maps

[Figure: two graphs, G1 and G2, over I and D]

P1:
I    D    Prob
i0   d0   0.42
i0   d1   0.18
i1   d0   0.28
i1   d1   0.12

P2:
I    D    Prob.
i0   d0   0.282
i0   d1   0.02
i1   d0   0.564
i1   d1   0.134
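
Whether a graph is an I-map of a distribution comes down to testing each independence the graph asserts. The sketch below rests on two assumptions that the transcript does not confirm: that G1 is the graph with no edge between I and D (so I(G1) contains I ⊥ D) and G2 has an edge between them (so I(G2) is empty), and that P1 is the first table and P2 the second. All names are illustrative.

```python
# The two distributions over (I, D) from the slide (assumed labeling: P1, then P2).
P1 = {("i0", "d0"): 0.42, ("i0", "d1"): 0.18, ("i1", "d0"): 0.28, ("i1", "d1"): 0.12}
P2 = {("i0", "d0"): 0.282, ("i0", "d1"): 0.02, ("i1", "d0"): 0.564, ("i1", "d1"): 0.134}


def i_indep_d(P, tol=1e-6):
    """True if P(i, d) == P(i) P(d) for every entry, up to tol."""
    p_i = {i: sum(p for (iv, _), p in P.items() if iv == i) for i in ("i0", "i1")}
    p_d = {d: sum(p for (_, dv), p in P.items() if dv == d) for d in ("d0", "d1")}
    return all(abs(p - p_i[i] * p_d[d]) <= tol for (i, d), p in P.items())


# Independence tests asserted by each graph (assumed structures, see above).
ASSERTED = {"G1": [i_indep_d], "G2": []}


def is_imap(graph, P):
    """G is an I-map of P if P satisfies every independence in I(G)."""
    return all(test(P) for test in ASSERTED[graph])


for g in ("G1", "G2"):
    print(g, is_imap(g, P1), is_imap(g, P2))
```

Under these assumptions, the graph that asserts nothing is an I-map of both distributions, while the edgeless graph is an I-map only of the distribution in which I and D are independent.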

Page 42:

Factorization ⇒ Independence: BNs

Theorem: If P factorizes over G, then G is an I-map for P

Page 43:

Independence ⇒ Factorization

Theorem: If G is an I-map for P, then P factorizes over G

[Figure: student network over D, I, G, S, L]

Page 44:

Summary

Two equivalent views of graph structure:
• Factorization: G allows P to be represented
• I-map: Independencies encoded by G hold in P

If P factorizes over a graph G, we can read from the graph independencies that must hold in P (an independency map)

Page 45:

Naïve Bayes

Probabilistic Graphical Models: Bayesian Networks

Representation

Page 46:

Naïve Bayes Model

[Figure: Class is the parent of X1, X2, …, Xn]

(Xi ⊥ Xj | C) for all Xi, Xj

Page 47:

Naïve Bayes Classifier

[Figure: Class is the parent of X1, X2, …, Xn]

Page 48:

Bernoulli Naïve Bayes for Text

[Figure: Document Label is the parent of one binary variable per word: cat, dog, buy, …]

P(“cat” appears | Label):
             cat      dog      buy     sell      …
Financial    0.001    0.001    0.2     0.3       …
Pets         0.3      0.4      0.02    0.0001    …
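
A Bernoulli naïve Bayes classifier multiplies, for every word in the vocabulary, either P(word appears | Label) or its complement. The sketch below makes several assumptions not stated on the slide: the vocabulary is limited to the four listed words, the row assignment of the table is Financial first and Pets second, and the two labels have equal prior probability. The name `posterior` is illustrative.

```python
import math

# P(word appears | Label) for the four words shown on the slide.
P_WORD = {
    "Financial": {"cat": 0.001, "dog": 0.001, "buy": 0.2, "sell": 0.3},
    "Pets": {"cat": 0.3, "dog": 0.4, "buy": 0.02, "sell": 0.0001},
}
PRIOR = {"Financial": 0.5, "Pets": 0.5}  # assumption: the slide gives no prior


def posterior(words_present):
    """P(Label | which vocabulary words appear), computed in log space."""
    log_score = {}
    for label, cpd in P_WORD.items():
        score = math.log(PRIOR[label])
        for word, p in cpd.items():
            # Every vocabulary word contributes, whether present or absent.
            score += math.log(p if word in words_present else 1.0 - p)
        log_score[label] = score
    z = sum(math.exp(v) for v in log_score.values())
    return {label: math.exp(v) / z for label, v in log_score.items()}


print(posterior({"cat", "dog"}))
print(posterior({"buy", "sell"}))
```

Absent words matter in this model: a document without "buy" or "sell" is, by that fact alone, less likely to be financial.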

Page 49:

Multinomial Naïve Bayes for Text

[Figure: Document Label is the parent of the word-position variables W1, W2, …, Wn]

             cat      dog      buy      sell     …
Financial    0.001    0.001    0.02     0.02     …
Pets         0.02     0.03     0.003    0.001    …
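
In the multinomial variant, each word position draws from one per-label distribution over the vocabulary, so a document's likelihood is the product of the probabilities of its words. The sketch below assumes equal label priors and the row assignment Financial first, Pets second, and it only handles the four words listed on the slide; the function names are illustrative.

```python
import math
from collections import Counter

# P(W = word | Label) for the four words shown on the slide.
P_WORD = {
    "Financial": {"cat": 0.001, "dog": 0.001, "buy": 0.02, "sell": 0.02},
    "Pets": {"cat": 0.02, "dog": 0.03, "buy": 0.003, "sell": 0.001},
}


def log_likelihood(words, label):
    """log P(w1, ..., wn | label): one factor per word occurrence."""
    counts = Counter(words)
    return sum(n * math.log(P_WORD[label][w]) for w, n in counts.items())


def classify(words):
    """Most likely label under equal priors (an assumption of this sketch)."""
    return max(P_WORD, key=lambda label: log_likelihood(words, label))


print(classify(["buy", "sell", "buy"]))
print(classify(["cat", "dog", "cat"]))
```

Unlike the Bernoulli model, repeated words count each time they occur, and words that do not appear contribute nothing.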

Page 50:

Summary

• Simple approach for classification
  – Computationally efficient
  – Easy to construct
• Surprisingly effective in domains with many weakly relevant features
• Strong independence assumptions reduce performance when many features are strongly correlated

Page 51:

Application: Diagnosis

Probabilistic Graphical Models: Bayesian Networks

Representation

Page 52:

Medical Diagnosis: Pathfinder (1992)

• Help pathologist diagnose lymph node pathologies (60 different diseases)
• Pathfinder I: Rule-based system
• Pathfinder II used naïve Bayes and got superior performance

Heckerman et al.

Page 53:

Medical Diagnosis: Pathfinder (1992)

• Pathfinder III: Naïve Bayes with better knowledge engineering
• No incorrect zero probabilities
• Better calibration of conditional probabilities
  – P(finding | disease1) to P(finding | disease2)
  – Not P(finding1 | disease) to P(finding2 | disease)

Heckerman et al.

Page 54:

Medical Diagnosis: Pathfinder (1992)

• Pathfinder IV: Full Bayesian network
  – Removed incorrect independencies
  – Additional parents led to more accurate estimation of probabilities
• BN model agreed with expert panel in 50/53 cases, vs 47/53 for naïve Bayes model
• Accuracy as high as expert that designed the model

Heckerman et al.

Page 55:

CPCS

# of parameters: 2^1000 to 133,931,430 to 8254

M. Pradhan, G. Provan, B. Middleton, M. Henrion, UAI 94

Page 56:

Medical Diagnosis (Microsoft)

Thanks to: Eric Horvitz, Microsoft Research

Page 57:

Medical Diagnosis (Microsoft)

Thanks to: Eric Horvitz, Microsoft Research

Page 58:

Fault Diagnosis

• Microsoft troubleshooters

Page 59:

Fault Diagnosis

• Many examples:
  – Microsoft troubleshooters
  – Car repair
• Benefits:
  – Flexible user interface
  – Easy to design and maintain