Notes 2: Bayesian Statistics

A Summary of the Bayesian Method and Bayesian Point of View

gary simon, 2003

The formalism of the Bayesian methodology is not too hard to explain, but the philosophical points of view are very contentious.

The discussion starts with Bayes' theorem, a familiar probability result. The essential manipulation is this:

$$P(A \mid B) \;=\; \frac{P(A \cap B)}{P(B)} \;=\; \frac{P(B \mid A)\, P(A)}{P(B)}$$

This is a non-controversial formula, but you should be aware of its potential for time reversal. Suppose that event A comes earlier in time. Then P(B | A) asks about the probability of a current event, given a past event. Time is reversed in P(A | B), which asks about a past event, given a current event.
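As a concrete sketch of this time reversal, consider a diagnostic test: A is the past event (the patient has the condition) and B is the current event (the test reads positive). All numbers below are invented for illustration, not taken from the text.

```python
# Hypothetical illustration of Bayes' theorem and its "time reversal".
# A (past event) = patient has the condition; B (current event) = test is positive.
p_A = 0.01             # P(A): assumed prior prevalence
p_B_given_A = 0.95     # P(B | A): current event given past event (forward in time)
p_B_given_notA = 0.05  # P(B | not A): false-positive rate

# P(B) by the law of total probability
p_B = p_B_given_A * p_A + p_B_given_notA * (1 - p_A)

# Bayes' theorem reverses the conditioning: P(A | B) = P(B | A) P(A) / P(B)
p_A_given_B = p_B_given_A * p_A / p_B
print(round(p_A_given_B, 4))  # prints 0.161
```

Even with a sensitive test, the reversed probability P(A | B) stays small here because the prior P(A) is small.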

You often see Bayes' formula explained in terms of a partition. Let $A_1, A_2, \ldots, A_n$ be a partition. Partition means that

$$P(A_i \cap A_j) = 0 \quad \text{for } i \neq j$$

and also that

$$P(A_1 \cup A_2 \cup \cdots \cup A_n) = 1$$

Bayes' formula might now be written

$$P(A_j \mid B) \;=\; \frac{P(B \mid A_j)\, P(A_j)}{\displaystyle\sum_{i=1}^{n} P(B \mid A_i)\, P(A_i)}$$
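A minimal sketch of the partition form, using a hypothetical three-cell partition (the priors and likelihoods below are invented for illustration):

```python
# Bayes' formula over a partition A_1, ..., A_n (hypothetical numbers).
priors = [0.5, 0.3, 0.2]       # P(A_i); these must sum to 1
likelihoods = [0.9, 0.5, 0.1]  # P(B | A_i)

# Denominator: P(B) = sum_i P(B | A_i) P(A_i)
p_B = sum(l * p for l, p in zip(likelihoods, priors))

# Posterior P(A_j | B) for every cell of the partition
posteriors = [l * p / p_B for l, p in zip(likelihoods, priors)]

# The posteriors again form a probability distribution over the partition.
assert abs(sum(posteriors) - 1.0) < 1e-12
```

Note that the same denominator P(B) serves every cell, which is exactly why the posteriors sum to one.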

Now imagine that we have a standard parametric problem dealing with a likelihood $f(x \mid \theta)$. Here $x$ is a stand-in for the data that we'll see, while $\theta$ represents the parameter, which will forever remain unknown. The conventional approach to this problem (called the likelihood approach) will use the likelihood as the primary object of interest. Estimates of $\theta$ will be based on the likelihood, and the method of maximum likelihood has paramount importance. The parameter $\theta$ is considered nonrandom.

The Bayesian, however, believes that previous knowledge about $\theta$ (or even prejudices about $\theta$) can be incorporated into a prior distribution. Imagine that $\pi(\theta)$ denotes this prior distribution. Now think of $\theta$ as random with probability law $\pi(\theta)$.

One can now engage in a petty argument which distinguishes (i) situations in which there is a genuine random process (amenable to modeling) which creates $\theta$,


from (ii) situations in which $\theta$ is created exactly once (so that modeling is impossible and irrelevant). The Bayesian has no need to make this distinction. It is the state of his mental opinion about $\theta$ which is subjected to modeling.

It follows that the joint density of the random pair $(\theta, X)$ is given by $\pi(\theta)\, f(x \mid \theta)$. Integration over $\theta$ will give the marginal law of $X$. Let's denote this as $m(x)$.

$$m(x) = \int f(x \mid \theta)\, \pi(\theta)\, d\theta$$

We can now create the conditional law of $\theta$, given the data $x$. Specifically, this is

$$f(\theta \mid x) \;=\; \frac{f(x \mid \theta)\, \pi(\theta)}{\displaystyle\int f(x \mid t)\, \pi(t)\, dt} \;=\; \frac{f(x \mid \theta)\, \pi(\theta)}{m(x)}$$

In the denominator, $t$ rather than $\theta$ has been used as the dummy of integration, just to avoid confusion. More simply, we could note that

$$f(\theta \mid x) \;=\; \frac{f(x \mid \theta)\, \pi(\theta)}{\text{factor without } \theta}$$

This means that the denominator is just a fudge factor involving $x$ and some numbers (but not $\theta$), so that we can supply the denominator in whatever way is needed to make $f(\theta \mid x)$ a legitimate density. (A later example will make this point clear.)
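A minimal numerical sketch of this "supply the denominator" idea, on a grid. The setup here (a uniform prior on (0, 1) and a binomial likelihood with n = 10, x = 7) is my own choice for illustration, not the text's:

```python
import math

# Grid sketch: unnormalized posterior f(x | theta) * pi(theta), then
# recover the denominator m(x) by forcing the result to integrate to 1.
n, x = 10, 7
step = 1 / 1000
thetas = [(i + 0.5) * step for i in range(1000)]  # midpoint grid on (0, 1)
prior = [1.0 for _ in thetas]                     # uniform prior density

# Unnormalized posterior: binomial likelihood times prior
unnorm = [math.comb(n, x) * t**x * (1 - t)**(n - x) * p
          for t, p in zip(thetas, prior)]

# m(x) never involves theta, so a Riemann sum of the numerator recovers it.
m_x = sum(unnorm) * step
posterior = [u / m_x for u in unnorm]
assert abs(sum(posterior) * step - 1.0) < 1e-9  # integrates to ~1
```

For this particular setup the exact marginal is m(x) = 1/11, and the grid value lands very close to it; the point is only that the denominator can be supplied after the fact.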

The Bayesian calls $f(\theta \mid x)$ the posterior density. If another experiment is to be done, then the posterior becomes the prior for this next experiment.

The Bayesian regards $f(\theta \mid x)$ as the best summary of the experiment.

If you wanted an estimate of $\theta$, the Bayesian would supply you with a measure of location from $f(\theta \mid x)$, possibly the mean or median. If you have gone through the formality of creating a loss function, the Bayesian would minimize the expected posterior loss.
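For instance, under squared-error loss the minimizer of expected posterior loss is the posterior mean. A small numerical check, using an assumed Beta(8, 4) posterior (my example, not the text's), whose mean is $\alpha/(\alpha+\beta)$:

```python
import math

# Assumed example posterior: Beta(a, b) with a = 8, b = 4.
a, b = 8, 4
posterior_mean = a / (a + b)  # Bayes estimate under squared-error loss

def beta_pdf(t):
    # Beta(a, b) density on (0, 1)
    c = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return c * t**(a - 1) * (1 - t)**(b - 1)

# Check on a midpoint grid that the mean of the density matches a/(a+b).
step = 1e-4
grid_mean = sum(t * beta_pdf(t) * step
                for t in ((i + 0.5) * step for i in range(10000)))
assert abs(grid_mean - posterior_mean) < 1e-3
```

The median would be reported instead under absolute-error loss; the choice of summary follows the loss function.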

If you wanted a 95% confidence interval for $\theta$, the Bayesian would give you an interval $(a, b)$ with the property that

$$\int_a^b f(\theta \mid x)\, d\theta = 0.95$$

presumably choosing this interval so that $b - a$ is as short as possible.
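The shortest such interval is the highest-posterior-density interval; as a simpler stand-in, the sketch below computes the equal-tailed 95% interval (2.5% of posterior mass cut from each side) from a grid, again using an assumed Beta(8, 4) posterior as the example:

```python
import math

def beta_pdf(t, a, b):
    # Beta(a, b) density on (0, 1)
    c = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return c * t**(a - 1) * (1 - t)**(b - 1)

# Assumed example posterior: Beta(8, 4), tabulated on a midpoint grid.
step = 1e-4
thetas = [(i + 0.5) * step for i in range(10000)]
dens = [beta_pdf(t, 8, 4) for t in thetas]

# Equal-tailed 95% interval: walk the cumulative mass past 2.5% and 97.5%.
cum, lo, hi = 0.0, None, None
for t, d in zip(thetas, dens):
    cum += d * step
    if lo is None and cum >= 0.025:
        lo = t
    if hi is None and cum >= 0.975:
        hi = t
print(round(lo, 3), round(hi, 3))
```

A shortest (highest-density) interval would be found instead by thresholding the density from above; for a skewed posterior the two intervals differ slightly.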


Consider now this simple example. Suppose that $\theta$ is a binomial parameter. Suppose that you want to estimate $\theta$ based on a random variable $Y$, which is binomial $(n, \theta)$.

The maximum likelihood person uses $\hat{\theta} = \dfrac{Y}{n}$ with no further mental anguish.

The Bayesian will invoke a prior distribution $\pi(\theta)$ for $\theta$. For this problem, this will likely be an instance of the beta distribution. Specifically, he might choose

$$\pi(\theta) \;=\; \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\, \Gamma(\beta)}\, \theta^{\alpha - 1} (1 - \theta)^{\beta - 1}\, \mathrm{I}(0 < \theta < 1)$$
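To make this prior concrete, here is a small check that the density above integrates to 1 on a grid; the choice $\alpha = 2$, $\beta = 3$ is mine, for illustration only:

```python
import math

def beta_prior(theta, a, b):
    # pi(theta) = Gamma(a+b)/(Gamma(a)Gamma(b)) * theta^(a-1)(1-theta)^(b-1)
    # on (0, 1), and zero elsewhere (the indicator I(0 < theta < 1)).
    if not (0 < theta < 1):
        return 0.0
    c = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return c * theta**(a - 1) * (1 - theta)**(b - 1)

# Sanity check: the density integrates to (approximately) 1 on a midpoint grid.
step = 1e-4
total = sum(beta_prior((i + 0.5) * step, 2.0, 3.0) for i in range(10000)) * step
assert abs(total - 1.0) < 1e-3
```

A standard fact worth keeping in mind here (not yet stated in the text above): the beta prior is conjugate for the binomial likelihood, so observing $y$ successes in $n$ trials yields a Beta$(\alpha + y,\ \beta + n - y)$ posterior.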