
Page 1

Bayesian Essentials

Slides by Peter Rossi and David Madigan

Page 2

Distribution Theory 101

Marginal and Conditional Distributions:

[Figure: the joint distribution of (X, Y), uniform on the triangle 0 ≤ y ≤ x ≤ 1; the marginal of X is the standard triangle density on (0, 1) and Y | X is uniform on (0, X).]

Page 3

Simulating from Joint

To draw from the joint:

i. Draw from the marginal on X

ii. Condition on this draw, and draw from the conditional of Y|X

library(triangle)                  # rtriangle() gives triangle-distribution draws
NumDraws <- 10000
x <- rtriangle(NumDraws, 0, 1, 1)  # marginal of X: standard triangle on (0, 1), mode at 1
y <- runif(NumDraws, 0, x)         # conditional: Y | X ~ unif(0, X)
plot(x, y)

Page 4

Triangular Distribution

If U ~ unif(0,1), then:

sqrt(U) has the standard triangle distribution

If U1, U2 ~ unif(0,1) are independent, then:

Y = max{U1, U2} has the standard triangle distribution
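A quick check in R (not from the slides): both constructions match the standard triangle density f(x) = 2x on (0, 1).

NumDraws <- 100000
x1 <- sqrt(runif(NumDraws))                    # sqrt of a uniform
x2 <- pmax(runif(NumDraws), runif(NumDraws))   # max of two independent uniforms
hist(x1, breaks = 50, freq = FALSE); curve(2 * x, 0, 1, add = TRUE)
hist(x2, breaks = 50, freq = FALSE); curve(2 * x, 0, 1, add = TRUE)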

Page 5

Sampling Importance Resampling

[Figure: target density f and proposal density g.]

draw a big sample from g

sub-sample from that sample with probability proportional to the importance weight f/g
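A minimal SIR sketch in R (not from the slides): the target f is taken to be the standard triangle density f(x) = 2x on (0, 1) from the earlier slides, and g is a uniform(0, 1) proposal, so the resampling weights f/g are proportional to x.

NumDraws <- 100000
gdraws <- runif(NumDraws)                    # big sample from g = unif(0, 1)
w <- 2 * gdraws                              # importance weights f/g
x <- sample(gdraws, 10000, replace = TRUE, prob = w / sum(w))   # weighted sub-sample
hist(x, breaks = 50, freq = FALSE)           # should track the triangle density
curve(2 * x, 0, 1, add = TRUE)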

Page 6

Metropolis

start with current = 0.5

to get the next value: draw a “proposal” from g

keep the proposal with probability min{1, f(proposal)/f(current)}

else keep current

[Figure: target density f and proposal density g.]
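A minimal Metropolis sketch in R (not from the slides): the target f is taken to be the standard triangle density f(x) = 2x on (0, 1), and g is assumed to be a uniform(0, 1) independence proposal, so the Metropolis-Hastings ratio reduces to exactly the f(proposal)/f(current) shown above.

f <- function(x) 2 * x                        # target density: standard triangle on (0, 1)
NumDraws <- 10000
draws <- numeric(NumDraws)
current <- 0.5                                # start with current = 0.5
for (i in 1:NumDraws) {
  proposal <- runif(1)                        # draw a "proposal" from g = unif(0, 1)
  if (runif(1) < f(proposal) / f(current)) {  # keep with "probability" f(proposal)/f(current)
    current <- proposal
  }                                           # else keep current
  draws[i] <- current
}
hist(draws, breaks = 50, freq = FALSE)
curve(2 * x, 0, 1, add = TRUE)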

Page 7

The Goal of Inference

Make inferences about unknown quantities using available information.

Inference -- make probability statements

unknowns -- parameters, functions of parameters, states or latent variables, “future” outcomes, outcomes conditional on an action

Information -- data-based; non-data-based (theories of behavior, subjective views, mechanism, parameters are finite or in some range)

Page 8

Bayes theorem:

p(θ|D) ∝ p(D|θ) p(θ)

Posterior ∝ “Likelihood” × Prior

Modern Bayesian computing -- simulation methods for generating draws from the posterior distribution p(θ|D).

Page 9

Summarizing the posterior

Output from Bayesian inference: a possibly high-dimensional posterior distribution.

Summarize this object via simulation: look at marginal distributions of the parameters (and of functions of them); don’t just compute a few summary expectations.

Contrast with sampling theory: a point estimate / standard error is a summary of an irrelevant (sampling) distribution, and a bad summary unless that distribution is approximately normal; plus the limitations of asymptotics.

Page 10

Metropolis

Start somewhere with θ_current

To get the next value, generate a proposal θ_proposal

Accept with “probability” α = min{1, p(θ_proposal | D) / p(θ_current | D)} (for a symmetric proposal)

else keep θ_current

Page 11

Example

Believe these measurements (D) come from N(μ,1):

0.9072867 -0.4490744 -0.1463117 0.2525023 0.9723840 -0.8946437 -0.2529104 0.5101836 1.2289795 0.5685497

Prior for μ?

p(μ) = 2μ for μ in (0,1) (the standard triangle distribution)

Page 12

Example continued

p(D|μ)? 0.9072867 -0.4490744 -0.1463117 0.2525023 0.9723840 -0.8946437 -0.2529104 0.5101836 1.2289795 0.5685497

y1,…,y10

switch to R…

other priors? unif(0,1), norm(0,1), norm(0,100)

generating good candidates?
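A minimal sketch of the kind of R code “switch to R…” refers to (the slides' own code is not in the transcript): a Metropolis sampler for the posterior of μ under the N(μ,1) likelihood and the prior p(μ) = 2μ on (0, 1). A uniform(0, 1) independence proposal is assumed here, which is only the simplest answer to “generating good candidates?”.

y <- c(0.9072867, -0.4490744, -0.1463117, 0.2525023, 0.9723840,
       -0.8946437, -0.2529104, 0.5101836, 1.2289795, 0.5685497)
log_post <- function(mu) sum(dnorm(y, mu, 1, log = TRUE)) + log(2 * mu)   # log p(D|mu) + log p(mu)
NumDraws <- 10000
draws <- numeric(NumDraws)
current <- 0.5
for (i in 1:NumDraws) {
  proposal <- runif(1)                                                    # candidate from unif(0, 1)
  if (log(runif(1)) < log_post(proposal) - log_post(current)) current <- proposal
  draws[i] <- current                                                     # else keep current
}
hist(draws, breaks = 50, freq = FALSE, main = "posterior of mu")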

Page 13

Prediction

See D, compute the “predictive distribution” of a future observable y_f:

p(y_f | D) = ∫ p(y_f | θ) p(θ | D) dθ

Page 14

Bayes/Classical Estimators

Prior washes out (asymptotically it acts as locally uniform). Bayes is consistent unless you have a dogmatic prior.

Page 15

Bayesian Computations

Before simulation methods, Bayesians used posterior expectations of various functions as the summary of the posterior:

E[h(θ) | D] = ∫ h(θ) p(θ | D) dθ

If p(θ|D) is in a convenient form (e.g. normal), then this integral can be computed analytically for some h.

Page 16

Conjugate Families

Models with convenient analytic properties almost invariably come from conjugate families.

Why do I care now?
- conjugate models are used as building blocks
- build intuition about how Bayesian inference works

Definition: a prior is conjugate to a likelihood if the posterior is in the same class of distributions as the prior.

Basically, conjugate priors are like the posterior from some imaginary dataset with a diffuse prior.

Page 17

Beta-Binomial model

Need a prior!
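The formulas on this slide did not survive extraction; the following is a sketch of the standard Beta-Binomial setup it presumably covers, with y successes observed in n Bernoulli trials with success probability θ:

p(y \mid \theta) = \binom{n}{y}\,\theta^{\,y}(1-\theta)^{\,n-y}, \qquad 0 < \theta < 1

The likelihood depends on θ only through θ^y (1−θ)^{n−y}, which is why a prior of the same functional form (the Beta, next slide) is convenient.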

Page 18

Beta distribution
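Presumably this slide shows the Beta(a, b) density; for reference:

p(\theta) = \frac{\Gamma(a+b)}{\Gamma(a)\,\Gamma(b)}\;\theta^{\,a-1}(1-\theta)^{\,b-1}, \qquad 0<\theta<1, \qquad E[\theta] = \frac{a}{a+b}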

Page 19

Posterior
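The posterior (presumably this slide's content) follows from multiplying the binomial likelihood by the Beta(a, b) prior:

p(\theta \mid y) \;\propto\; \theta^{\,y}(1-\theta)^{\,n-y}\,\theta^{\,a-1}(1-\theta)^{\,b-1} \;=\; \theta^{\,a+y-1}(1-\theta)^{\,b+n-y-1}

so θ | y ~ Beta(a + y, b + n − y): the posterior is again a Beta, illustrating conjugacy.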

Page 20

Prediction
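For the Beta-Binomial model, the predictive probability that a single future trial ỹ is a success (a standard result, presumably what this slide shows) is the posterior mean of θ:

P(\tilde y = 1 \mid y) = \int_0^1 \theta\, p(\theta \mid y)\, d\theta = \frac{a+y}{a+b+n}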

Page 21

Regression model
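The model equations are missing from the transcript; the standard normal linear regression model presumably intended is:

y = X\beta + \varepsilon, \qquad \varepsilon \sim N(0,\ \sigma^2 I_n), \qquad
p(y \mid X, \beta, \sigma^2) = (2\pi\sigma^2)^{-n/2} \exp\!\Big\{ -\tfrac{1}{2\sigma^2}\,(y - X\beta)'(y - X\beta) \Big\}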

Page 22

Bayesian Regression

Prior:

Inverted Chi-Square:

Interpretation: as if the prior information came from another dataset.

Draw from prior?
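The prior formulas are missing from the transcript. A sketch in R, assuming the usual natural-conjugate form the slide titles suggest, β | σ² ~ N(betabar, σ² A⁻¹) with σ² ~ ν₀ s₀² / χ²_{ν₀} (an inverted chi-square); “Draw from prior?” can then be answered by simulation. The hyperparameter values below are illustrative, not the slides'.

k <- 2
betabar <- rep(0, k); A <- 0.01 * diag(k)        # assumed, illustrative values
nu0 <- 3; s0sq <- 1
NumDraws <- 1000
sigma2 <- nu0 * s0sq / rchisq(NumDraws, nu0)     # sigma^2 ~ nu0 * s0sq / chisq(nu0)
root <- chol(solve(A))                           # A^{-1} = root' root
beta <- t(sapply(sigma2, function(s2)            # beta | sigma^2 ~ N(betabar, sigma^2 * A^{-1})
  betabar + sqrt(s2) * as.vector(t(root) %*% rnorm(k))))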

Page 23

Posterior
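Under that assumed prior, the joint posterior on this slide would be likelihood times prior:

p(\beta, \sigma^2 \mid y, X) \;\propto\;
(\sigma^2)^{-n/2}\exp\!\Big\{-\tfrac{1}{2\sigma^2}(y-X\beta)'(y-X\beta)\Big\}
\;(\sigma^2)^{-k/2}\exp\!\Big\{-\tfrac{1}{2\sigma^2}(\beta-\bar\beta)'A(\beta-\bar\beta)\Big\}
\; p(\sigma^2)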

Page 24

Combining quadratic forms
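The “combining quadratic forms” step is presumably the usual completing-the-square identity:

(y-X\beta)'(y-X\beta) + (\beta-\bar\beta)'A(\beta-\bar\beta)
= (\beta-\tilde\beta)'(X'X+A)(\beta-\tilde\beta) + S,
\qquad \tilde\beta = (X'X+A)^{-1}(X'y + A\bar\beta),

with S = (y-X\tilde\beta)'(y-X\tilde\beta) + (\tilde\beta-\bar\beta)'A(\tilde\beta-\bar\beta).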

Page 25

Posterior
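Collecting terms then gives the standard conditional and marginal posteriors (still a sketch under the assumed conjugate prior):

\beta \mid \sigma^2, y, X \;\sim\; N\!\big(\tilde\beta,\ \sigma^2 (X'X+A)^{-1}\big),
\qquad
\sigma^2 \mid y, X \;\sim\; \frac{\nu_0 s_0^2 + S}{\chi^2_{\nu_0+n}}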

Page 26

IID Simulations

Scheme: [y | X, β, σ²] [β | σ²] [σ²]  →  [β, σ² | y, X] = [σ² | y, X] [β | σ², y, X]

1) Draw [σ² | y, X]

2) Draw [β | σ², y, X]

3) Repeat
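A sketch of this composition scheme in R, using the natural-conjugate prior assumed in the earlier regression sketches; the data and hyperparameters below are illustrative, not the slides', and the slides' own “IID Simulator” code is not in the transcript.

set.seed(1)
n <- 100; k <- 2
X <- cbind(1, rnorm(n)); y <- X %*% c(1, 2) + rnorm(n)          # illustrative data
betabar <- rep(0, k); A <- 0.01 * diag(k); nu0 <- 3; s0sq <- 1  # assumed prior settings

XtX <- crossprod(X)
betatilde <- solve(XtX + A, crossprod(X, y) + A %*% betabar)    # posterior mean of beta
S <- sum((y - X %*% betatilde)^2) +
     t(betatilde - betabar) %*% A %*% (betatilde - betabar)
nu1 <- nu0 + n
root <- chol(solve(XtX + A))                                    # (X'X + A)^{-1} = root' root

NumDraws <- 1000
sigma2 <- as.numeric(nu0 * s0sq + S) / rchisq(NumDraws, nu1)    # 1) draw sigma^2 | y, X
beta <- t(sapply(sigma2, function(s2)                           # 2) draw beta | sigma^2, y, X
  as.vector(betatilde) + sqrt(s2) * as.vector(t(root) %*% rnorm(k))))
colMeans(beta)                                                  # posterior means of beta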

Page 27

IID Simulator, cont.