Importance Sampling ICS 276 Fall 2007 Rina Dechter


Page 1: Importance Sampling

Importance Sampling

ICS 276, Fall 2007

Rina Dechter

Page 2: Importance Sampling

Outline

Gibbs Sampling
Advances in Gibbs sampling: Blocking, Cutset sampling (Rao-Blackwellisation)
Importance Sampling
Advances in Importance Sampling
Particle Filtering

Page 3: Importance Sampling

Importance Sampling Theory

P(E=e) = Σ_{x} P(X\E = x, E=e) = Σ_{x} Π_{i=1}^{n} P(x_i | pa(X_i)), evaluated with E instantiated to e

Let Z = X \ E, to simplify notation; then

P(E=e) = Σ_{z} P(Z=z, E=e)

Page 4: Importance Sampling

Importance Sampling Theory

Given a distribution Q, called the proposal distribution, such that P(Z=z, E=e) > 0 implies Q(Z=z) > 0:

P(E=e) = Σ_{z} P(Z=z, E=e) = Σ_{z} [ P(Z=z, E=e) / Q(Z=z) ] · Q(Z=z)

By the definition of expected value, E_Q[f(Z)] = Σ_{z} f(z) Q(Z=z), so

P(E=e) = E_Q[ P(Z, E=e) / Q(Z) ] = E_Q[ w(Z) ]

where w(Z=z) = P(Z=z, E=e) / Q(Z=z) is called the importance weight.

Page 5: Importance Sampling

Importance Sampling Theory

Given a set of N samples z¹, …, zᴺ drawn from Q:

P̂(E=e) = (1/N) Σ_{i=1}^{N} P(Z=zⁱ, E=e) / Q(Z=zⁱ) = (1/N) Σ_{i=1}^{N} w(Z=zⁱ)

As N → ∞, P̂(E=e) → P(E=e).

Underlying principle: approximate an average over a set of numbers by an average over a set of sampled numbers.
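The estimator above can be sketched in a few lines of Python. The table P_joint (an unnormalized P(Z=z, E=e)) and the proposal Q below are made-up illustrative numbers, not from the slides:

```python
import random

# Toy unnormalized table P(Z=z, E=e) -- illustrative numbers only.
P_joint = {0: 0.1, 1: 0.3, 2: 0.2}   # so the true P(E=e) = 0.6
Q = {0: 0.2, 1: 0.5, 2: 0.3}         # proposal: Q(z) > 0 wherever P(z, e) > 0

def sample_q():
    """Draw z from Q using a uniform random number."""
    r, cum = random.random(), 0.0
    for z, p in Q.items():
        cum += p
        if r < cum:
            return z
    return max(Q)   # guard against floating-point round-off

def estimate(N):
    """P-hat(E=e) = (1/N) * sum of importance weights P(z, e)/Q(z)."""
    total = 0.0
    for _ in range(N):
        z = sample_q()
        total += P_joint[z] / Q[z]   # importance weight w(z)
    return total / N
```

With N around 100,000 the estimate lands very close to the true value 0.6, since the weights here vary little across z.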

Page 6: Importance Sampling

Importance Sampling (Informally)

Express the problem as computing the average over a set of real numbers.
Sample a subset of those numbers.
Approximate the true average by the sample average.

True average: average of (0.11, 0.24, 0.55, 0.77, 0.88, 0.99) = 0.59
Sample average over 2 samples: average of (0.24, 0.77) = 0.505
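The arithmetic of this informal example can be checked directly:

```python
# The informal example above, verified: a true average over six numbers
# vs. the average over a sampled subset of two of them.
values = [0.11, 0.24, 0.55, 0.77, 0.88, 0.99]
true_avg = sum(values) / len(values)      # = 0.59

sampled = [0.24, 0.77]
sample_avg = sum(sampled) / len(sampled)  # = 0.505

print(true_avg, sample_avg)
```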

Page 7: Importance Sampling

How to generate samples from Q

Express Q in product form: Q(Z) = Q(Z1) Q(Z2|Z1) … Q(Zn|Z1,…,Zn-1)

Sample along the order Z1, …, Zn.

Example:
Q(Z1) = (0.2, 0.8)
Q(Z2|Z1=0) = (0.2, 0.8), Q(Z2|Z1=1) = (0.1, 0.9)
Q(Z3|Z1,Z2) = Q(Z3|Z1), with Q(Z3|Z1=0) = (0.5, 0.5), Q(Z3|Z1=1) = (0.3, 0.7)

The samples feed the estimator P̂(E=e) = (1/N) Σ_{i=1}^{N} P(Z=zⁱ, E=e) / Q(Z=zⁱ).

Page 8: Importance Sampling

How to sample from Q

Generate a random number r uniformly between 0 and 1.

Q(Z1) = (0.2, 0.8)
Q(Z2|Z1=0) = (0.2, 0.8), Q(Z2|Z1=1) = (0.1, 0.9)
Q(Z3|Z1,Z2) = Q(Z3|Z1): Q(Z3|Z1=0) = (0.5, 0.5), Q(Z3|Z1=1) = (0.3, 0.7)

[Figure: the unit interval from 0 to 1, split at 0.2 into regions labeled 0 and 1.]

Which value to select for Z1? The domain of each variable is {0, 1}: if r < 0.2, select Z1 = 0; otherwise select Z1 = 1.
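This selection rule can be sketched as a small helper, using the proposal from the slides (domains {0, 1}); the function names `draw` and `sample_z` are my own:

```python
import random

def draw(dist):
    """Select a value using a uniform random number r in [0, 1):
    return the index of the interval that r falls into."""
    r, cum = random.random(), 0.0
    for value, p in enumerate(dist):
        cum += p
        if r < cum:
            return value
    return len(dist) - 1   # guard against floating-point round-off

# The proposal from the slides, written out per parent value.
Q1 = [0.2, 0.8]                        # Q(Z1)
Q2 = {0: [0.2, 0.8], 1: [0.1, 0.9]}   # Q(Z2 | Z1)
Q3 = {0: [0.5, 0.5], 1: [0.3, 0.7]}   # Q(Z3 | Z1); Z3 depends only on Z1

def sample_z():
    """One full sample: Z1 from Q(Z1), Z2 from Q(Z2|Z1), Z3 from Q(Z3|Z1)."""
    z1 = draw(Q1)
    z2 = draw(Q2[z1])
    z3 = draw(Q3[z1])
    return z1, z2, z3
```

Repeating `sample_z()` N times produces the sample set z¹, …, zᴺ used by the estimator on the next slide.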

Page 9: Importance Sampling

How to sample from Q?

Each sample Z=z:
Sample Z1=z1 from Q(Z1)
Sample Z2=z2 from Q(Z2|Z1=z1)
Sample Z3=z3 from Q(Z3|Z1=z1)

Generate N such samples (z¹, …, zᴺ) and estimate

P̂(E=e) = (1/N) Σ_{i=1}^{N} P(Z=zⁱ, E=e) / Q(Z=zⁱ) = (1/N) Σ_{i=1}^{N} w(Z=zⁱ)

Page 10: Importance Sampling

Likelihood weighting

Q = the prior distribution, i.e., the product of the CPTs of the Bayesian network.

Page 11: Importance Sampling

Likelihood weighting example

[Figure: Bayesian network with nodes Smoking (S), lung Cancer (C), Bronchitis (B), X-ray (X), Dyspnoea (D), and CPTs P(S), P(C|S), P(B|S), P(X|C,S), P(D|C,B).]

P(S, C, B, X, D) = P(S) P(C|S) P(B|S) P(X|C,S) P(D|C,B)

P(X=1, B=0) = ? (where 1 = true and 0 = false)

P(X=1, B=0) = Σ_{S,C,D} P(S) P(C|S) P(B=0|S) P(X=1|C,S) P(D|C,B=0)

Page 12: Importance Sampling

Likelihood weighting example

lung Cancer

Smoking

X-ray

Bronchitis

DyspnoeaP(D|C,B)

P(B|S)

P(S)

P(X|C,S)

P(C|S)

Q=Prior

Q(S,C,D)=Q(S)*Q(C|S)*Q(D|C,B=0)

=P(S)P(C|S)P(D|C,B=0)

Sample S=s from P(S)

Sample C=c from P(C|S=s)

Sample D=d from P(D|C=c,B=0)

N

ii

i

zZQ

eEzZP

NeEP

1 )(

),(1)(

),|1()|0(

)0,|()|()(

)0,|(),|1()|0()|()(

)0,|()|()(

)0,1,,,(

)(

),()(

sScCXPsSBP

BcCdDPsScCPsSP

BcCdDPsScCXPsSBPsScCPsSP

BcCdDPsScCPsSP

BXdDcCsSP

zZQ

eEzZPzZw

i

ii

Page 13: Importance Sampling

The Algorithm

For k = 1 to N:
  w_k ← 1
  For each X_i in topological order o = (X_1, …, X_n):
    If X_i ∈ E: assign X_i ← e_i and w_k ← w_k · P(e_i | pa_i)
    Else: sample x_i from P(X_i | pa_i) and assign X_i ← x_i
P̂(e) ← (1/N) Σ_{k=1}^{N} w_k
Return P̂(e)
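The algorithm can be sketched on the lung-cancer network from the example. The CPT numbers below are invented for illustration (the slides do not give them), and the evidence {X=1, B=0} matches the earlier query:

```python
import random

# Invented CPTs for the S, C, B, X, D network (illustrative numbers only).
P_S = [0.5, 0.5]                        # P(S)
P_C = {0: [0.9, 0.1], 1: [0.7, 0.3]}   # P(C | S)
P_B = {0: [0.8, 0.2], 1: [0.4, 0.6]}   # P(B | S)
P_X = {(0, 0): [0.95, 0.05], (0, 1): [0.2, 0.8],
       (1, 0): [0.9, 0.1],   (1, 1): [0.1, 0.9]}   # P(X | C, S)
P_D = {(0, 0): [0.9, 0.1], (0, 1): [0.3, 0.7],
       (1, 0): [0.2, 0.8], (1, 1): [0.1, 0.9]}     # P(D | C, B)

def draw(dist):
    """Sample 0 or 1 from a binary distribution."""
    return 0 if random.random() < dist[0] else 1

def likelihood_weighting(evidence, N=10000):
    """Estimate P(E=e): visit variables in topological order; sample
    non-evidence variables from P(X_i | pa_i), and for evidence variables
    multiply P(e_i | pa_i) into the sample's weight."""
    order = (
        ("S", lambda a: P_S),
        ("C", lambda a: P_C[a["S"]]),
        ("B", lambda a: P_B[a["S"]]),
        ("X", lambda a: P_X[(a["C"], a["S"])]),
        ("D", lambda a: P_D[(a["C"], a["B"])]),
    )
    total = 0.0
    for _ in range(N):
        w, a = 1.0, {}
        for var, cpt in order:
            dist = cpt(a)
            if var in evidence:
                a[var] = evidence[var]
                w *= dist[evidence[var]]   # w <- w * P(e_i | pa_i)
            else:
                a[var] = draw(dist)        # sample x_i from P(X_i | pa_i)
        total += w
    return total / N                       # (1/N) * sum of weights
```

For these invented CPTs the exact value of P(X=1, B=0) works out to 0.188, and the estimate converges to it as N grows.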

Page 14: Importance Sampling

How to solve belief updating?

P(Xi=xi | E=e) = P(Xi=xi, E=e) / P(E=e)

Numerator: the evidence is Xi=xi together with E=e. Denominator: the evidence is E=e.

Estimate the numerator and denominator by importance sampling:

P̂(Xi=xi | E=e) = [ Σ_{j=1}^{N} δ(zʲ, xi) · w(zʲ) ] / [ Σ_{j=1}^{N} w(zʲ) ]

where δ(zʲ, xi) = 1 iff the sample zʲ contains Xi=xi, and 0 otherwise.
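A minimal sketch of this ratio estimator: given weighted samples (zʲ, wʲ) already drawn from Q, the posterior is a ratio of weighted sums. The sample values and weights below are illustrative, and `posterior` is my own name:

```python
# Weighted samples (assignment, importance weight) -- illustrative values.
samples = [
    ({"S": 0, "C": 0}, 0.40),
    ({"S": 1, "C": 0}, 0.10),
    ({"S": 0, "C": 1}, 0.30),
    ({"S": 1, "C": 1}, 0.20),
]

def posterior(samples, var, value):
    """Numerator: total weight of samples containing Xi=xi.
    Denominator: total weight of all samples.
    Both are importance-sampling estimates; their ratio estimates the posterior."""
    num = sum(w for z, w in samples if z[var] == value)
    den = sum(w for z, w in samples)
    return num / den

print(posterior(samples, "S", 0))   # (0.40 + 0.30) / 1.00, i.e. about 0.7
```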

Page 15: Importance Sampling

Difference between estimating P(E=e) and P(Xi=xi|E=e)

P̂(E=e) = (1/N) Σ_{i=1}^{N} w(zⁱ)

P̂(Xi=xi | E=e) = [ Σ_{j=1}^{N} δ(zʲ, xi) · w(zʲ) ] / [ Σ_{j=1}^{N} w(zʲ) ]

The first estimator is unbiased: E_Q[ P̂(E=e) ] = P(E=e).

The second is a ratio of two estimates, so it is only asymptotically unbiased:
E_Q[ P̂(Xi=xi | E=e) ] ≠ P(Xi=xi | E=e) in general, but
lim_{N→∞} E_Q[ P̂(Xi=xi | E=e) ] = P(Xi=xi | E=e).

Page 16: Importance Sampling

Proposal Distribution: Which is better?

The probability that |P̂(E=e) − P(E=e)| ≤ ε is governed by the variance of the weights:

Var_Q[ w(Z) ] = Σ_{z} [ P(Z=z, E=e)² / Q(Z=z) ] − P(E=e)²

So one should prefer a low-variance proposal distribution. If the variance is 0, then P̂(E=e) = P(E=e), and only one sample is sufficient to compute P(E=e).
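The zero-variance case can be checked on a toy table. The numbers below are illustrative; the point is that when Q(z) is proportional to P(Z=z, E=e), every weight equals P(E=e) exactly:

```python
# Illustrative check of the zero-variance claim: if Q(z) = P(z, e)/P(e),
# the weight w(z) = P(z, e)/Q(z) is the SAME for every z, the variance of
# the weights is 0, and a single sample already equals P(E=e).
P_joint = [0.10, 0.30, 0.20]           # P(Z=z, E=e) for z = 0, 1, 2 (made up)
p_e = sum(P_joint)                     # true P(E=e) = 0.6

Q_opt = [p / p_e for p in P_joint]     # zero-variance proposal
weights = [P_joint[z] / Q_opt[z] for z in range(3)]
print(weights)                         # every weight is P(E=e) = 0.6
```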

Page 17: Importance Sampling

Outline

Gibbs Sampling
Advances in Gibbs sampling: Blocking, Cutset sampling (Rao-Blackwellisation)
Importance Sampling
Advances in Importance Sampling
Particle Filtering

Page 18: Importance Sampling

Research Issues in Importance Sampling

Better proposal distributions:
Likelihood weighting (Fung and Chang, 1990; Shachter and Peot, 1990)
AIS-BN (Cheng and Druzdzel, 2000)
Iterative Belief Propagation (Changhe and Druzdzel, 2003)
Iterative Join Graph Propagation and variable ordering (Gogate and Dechter, 2005)

Page 19: Importance Sampling

Research Issues in Importance Sampling (Cheng and Druzdzel, 2000)

Adaptive Importance Sampling

Initial proposal: Q⁰(Z) = Q⁰(Z1) · Q⁰(Z2|pa(Z2)) · … · Q⁰(Zn|pa(Zn)); P̂⁰(E=e) = 0

For i = 1 to k do:
  Generate samples z¹, …, zᴺ from Qⁱ⁻¹
  P̂ⁱ(E=e) ← P̂ⁱ⁻¹(E=e) + (1/N) Σ_{j=1}^{N} wⁱ(zʲ)
  Update Qⁱ⁻¹ to obtain Qⁱ
End

Return P̂ᵏ(E=e)

Page 20: Importance Sampling

Adaptive Importance Sampling

General case: given k proposal distributions, take N samples from each distribution and approximate P(e) by

P̂(e) = (1/k) Σ_{j=1}^{k} AvgWeight(j-th proposal)

where AvgWeight(j-th proposal) is the average importance weight of the N samples drawn from the j-th proposal.

Page 21: Importance Sampling

Estimating Q'(z)

Q'(Z) = Q'(Z1) · Q'(Z2|pa(Z2)) · … · Q'(Zn|pa(Zn))

where each Q'(Zi | Z1, …, Zi-1) is itself estimated by importance sampling.

Page 22: Importance Sampling

Cutset importance sampling

Divide the set of variables into two parts: a cutset C and the remaining variables R.

Sample only the cutset C from Q(C), and handle R exactly:

P̂(E=e) = (1/N) Σ_{j=1}^{N} P(C=cʲ, E=e) / Q(C=cʲ)

where P(C=cʲ, E=e), which requires summing over the remaining variables R, is computed exactly, for instance using Elim-Bel.

(Gogate and Dechter, 2005) and (Bidyuk and Dechter, 2006)

Page 23: Importance Sampling

Outline

Gibbs Sampling Advances in Gibbs sampling

Blocking Cutset sampling (Rao-Blackwellisation)

Importance Sampling Advances in Importance Sampling Particle Filtering

Page 24: Importance Sampling

Dynamic Belief Networks (DBNs)

[Figure: a two-slice DBN showing the Bayesian network at time t (Xt, Yt) and at time t+1 (Xt+1, Yt+1), connected by transition arcs; below it, the DBN unrolled for t=0 to t=10: X0, X1, X2, …, X10 with observations Y0, Y1, Y2, …, Y10.]

Page 25: Importance Sampling

Query

Compute P(X0:t | Y0:t) or P(Xt | Y0:t). Example: P(X0:10 | Y0:10) or P(X10 | Y0:10).

This is hard over a long time period! Approximate! Sample!

Page 26: Importance Sampling

Particle Filtering (PF)

PF is also known as “condensation”, “sequential Monte Carlo”, and “survival of the fittest”.

PF can handle any type of probability distribution, non-linearity, and non-stationarity.

PFs are powerful sampling-based inference/learning algorithms for DBNs.

Page 27: Importance Sampling

Particle Filtering

On white board