Partition and Condition = Divide and Conquer
Tutorial 8, STAT1301 Fall 2010, 16NOV2010, MB103@HKU
By Joseph Dong

TRANSCRIPT

Page 1:

Partition and Condition

=Divide and Conquer

Tutorial 8, STAT1301 Fall 2010, 16NOV2010, MB103@HKU

By Joseph Dong

Page 2:

Any exhaustive and disjoint collection of subsets of a given set forms a partition of that set.

E.g. a set A together with its complement Aᶜ forms a trivial partition of the presumed underlying set Ω. If f : Ω → S, then the collection of pre-images of atoms of the range S, {f⁻¹({s}) : s ∈ S}, forms a partition of the domain Ω.

Recall: A Partition on a Set

Page 3:

Recall: Conditioning on a Partition

Shares the same idea with Divide and Conquer, casewise enumeration, and a tree diagram.

Formal language: Goal = find the probability of an event E, i.e. ℙ(E).

This is equivalent to finding the probability of its intersection with the sure event Ω: ℙ(E) = ℙ(E ∩ Ω).

Page 4:

Formal language (continued)

Now break the sure event Ω down into a number of manageable smaller pieces A₁, …, Aₙ; together these pieces form a partition of the sure event Ω, so that ℙ(E) = ℙ(E ∩ Ω) = Σᵢ ℙ(E ∩ Aᵢ).

If we investigate all such events E ∩ Aᵢ, then we're done.

The hard core of the problem now becomes finding each ℙ(E ∩ Aᵢ), and this is where the conditioning takes place: ℙ(E ∩ Aᵢ) = ℙ(E | Aᵢ) ℙ(Aᵢ), assuming it is a more straightforward task to find ℙ(E | Aᵢ) and ℙ(Aᵢ).

Recall: Conditioning on a Partition (cont'd)
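As a minimal numeric sketch of the partition-then-condition recipe (the two-urn setup and all numbers below are assumed for illustration, not from the handout):

```python
# Law of total probability: P(E) = sum_i P(E | A_i) * P(A_i).
# Hypothetical setup: choose one of two urns at random (the urn choice
# partitions the sure event), then draw a ball; E = {ball is red}.
from fractions import Fraction

p_urn = {"urn1": Fraction(1, 2), "urn2": Fraction(1, 2)}   # P(A_i)
p_red_given_urn = {"urn1": Fraction(3, 10),                # P(E | A_i)
                   "urn2": Fraction(7, 10)}

p_red = sum(p_red_given_urn[u] * p_urn[u] for u in p_urn)
print(p_red)  # 1/2
```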

Page 5:

Recall: What does an R.V. do to its State Space?

An r.v. cuts the state space into blocks. On each of these blocks, the r.v. sends all points there to a common atom in the sample space. Thus an r.v. induces a partition on the state space.

Conversely, given a partition on the state space, you can also define random variables on it that "conform to" the partition by taking one value on each block.

[Diagram: a random variable X induces a partition on the state space Ω]

Page 6:

Conditioning an Event on an R.V.

Since an r.v. cuts the state space into a partition, conditioning on an r.v. is just conditioning on the partition it induces on the state space.

The meaning of ℙ(E | X = x) was illustrated by a diagram on the right of the original slide.

Page 7:

ℙ(E | X) as a Random Variable

It contains the random variable X inside, making it a function of X.

It therefore has a distribution and an expectation (via LOTUS). Question: what is the meaning of its expected value?

To fix its value, fix a value of X: ℙ(E | X = x), x ∈ range(X). Every such fixed value is an ordinary conditional probability involving two events, E and {X = x}.

Page 8:

Exercise: Finding ℙ(E) from ℙ(E | X)

This is the prototypical problem of finding the probability of an event via the technique of conditioning on a random variable.

Hint: Ponder the link between the Law of Total Probability and expectation.

Ans: ℙ(E) = 𝔼[ℙ(E | X)] = Σₓ ℙ(E | X = x) ℙ(X = x).
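A sketch of this prototypical computation under an assumed toy model (X a fair die, and E occurring with probability x/10 given X = x; both assumptions are illustrative):

```python
# Sketch: P(E) as the expectation of P(E | X) over a discrete conditioner X.
from fractions import Fraction

p_x = {x: Fraction(1, 6) for x in range(1, 7)}           # pmf of X (fair die)
p_e_given_x = {x: Fraction(x, 10) for x in range(1, 7)}  # assumed P(E | X = x)

# P(E) = E[P(E | X)] = sum_x P(E | X = x) * P(X = x)
p_e = sum(p_e_given_x[x] * p_x[x] for x in p_x)
print(p_e)  # 7/20
```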

Page 9:

ℙ(Y | X)

Given ℙ(Y | X): Q1: How to find ℙ(Y)? Ans: ℙ(Y = y) = 𝔼[ℙ(Y = y | X)]. Q2: How to find 𝔼(Y)? Ans: 𝔼(Y) = 𝔼[𝔼(Y | X)]. N.B.: Both expectations are done w.r.t. the conditioner X.

It involves two r.v.'s now. ℙ(Y | X) is a function of the bivariate random vector (X, Y). Fixing X = x gives back the conditional density of Y given X = x at the fixed position.

Page 10:

Difference among 3 types of densities:

A conditional density is normalized by the marginal probability of the conditioner; it is a point divided by a row sum/integral; it is the density of Y given X = x.

A joint density is normalized by the entire joint space; it is a point divided by the sum/integral of the entire space; it is the density of the pair (X, Y).

A marginal density is also normalized by the entire space; it is a row sum divided by the sum/integral of the entire space; it is the density of X alone.

Conditional, Marginal, and Joint densities
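The three normalizations can be checked on a small assumed joint table (the pmf values below are made up for illustration):

```python
# Sketch: joint, marginal, and conditional probabilities for two discrete
# r.v.'s X in {0, 1} and Y in {0, 1, 2}, under an assumed toy joint pmf.
from fractions import Fraction

joint = {  # P(X = x, Y = y): each cell is normalized by the whole table
    (0, 0): Fraction(1, 10), (0, 1): Fraction(2, 10), (0, 2): Fraction(1, 10),
    (1, 0): Fraction(15, 100), (1, 1): Fraction(15, 100), (1, 2): Fraction(3, 10),
}

# Marginal of X: a row sum divided by the total mass (= 1 here).
marg_x = {x: sum(p for (xx, y), p in joint.items() if xx == x) for x in (0, 1)}

# Conditional of Y given X = x: a single cell divided by that row's sum.
cond_y_given_x = {(x, y): joint[(x, y)] / marg_x[x] for (x, y) in joint}

print(marg_x[0])               # 2/5
print(cond_y_given_x[(0, 1)])  # 1/2
```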

Page 11:

Handout Problem 1

Page 12:

First of all, the random variable has to be numerically valued. That's why expectation is also known as the "expected value": it is a numerical characteristic of the sample space (a subset of ℝ, or simply ℝ itself with zero density equipped at the impossible points).

The expectation is both conceptually and technically equivalent to the location of the center of probability mass of the sample space.

Expectation provides only partial information about the random variable, because it eliminates randomness by giving you back only one representative point of the sample space.

Recall: What's the Expectation of a Random Variable?

Page 13:

It is a set-valued random variable: given an outcome, it evaluates to a set. We cannot have an expected value defined for a set-valued object.

Clarification: it is not the conditioned r.v. (X | E). The latter is numerically valued, and we have previously established its expected value.

More elaboration: on the set-theory layer, it is not strictly different from the set-r.v. pair. But on the probability-theory layer, the two are normalized by different spaces.

For example:

Page 14:

(X | E) is a numerically-valued random variable. We can compute its expected value.

(X | E) vs X: their sample spaces are different.

Compute 𝔼(X | E) in the discrete case using 𝔼(X | E) = Σₓ x ℙ(X = x | E).

Compute 𝔼(X | E) in the continuous case using 𝔼(X | E) = ∫ x f_{X|E}(x) dx.

𝔼(X | E)
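A quick sketch with an assumed example (X a fair die, E the event that X is even; the setup is illustrative, not from the handout):

```python
# Sketch: E(X | E) for a fair die X, conditioning on the event E = {X even}.
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}   # P(X = x)
E = {x for x in pmf if x % 2 == 0}               # the conditioning event

p_E = sum(pmf[x] for x in E)                     # P(E) = 1/2
# E(X | E) = sum_x x * P(X = x | E), with P(X = x | E) = P(X = x) / P(E) on E
e_x_given_E = sum(x * pmf[x] / p_E for x in E)
print(e_x_given_E)  # 4
```

Note how the renormalization by ℙ(E) is exactly the "different sample space" of (X | E): mass outside E is discarded and the rest is rescaled to sum to 1.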

Page 15:

Handout Problem 2

Warm-up exercise

Page 16:

First of all, 𝔼(X | Y) is a random variable, a function of Y.

Its randomness comes from the state space of Y, but the mapping mechanism is worked out jointly by X and Y.

This expression is known as the conditional expectation of the conditionee X given the conditioner Y.

The expectation is done with respect to X; to be precise, we should say w.r.t. (X | Y = y). There are multiple (or even a continuum of) sample spaces of (X | Y = y), depending on which atom value Y takes. After fixing Y to an atom y, or equivalently to a block in the state space partitioned by Y, the expression 𝔼(X | Y = y) is just a constant.

The expectation eliminates the randomness of X given Y.

𝔼(X | Y): concepts

Page 17:

𝔼(X | Y) as an r.v.

It uses the joint state space of X and Y as its own state space.

It uses a degenerated version of the sample space of X as its own sample space. The degeneration preserves the locus of the overall center of mass.

Each point in the degenerated space is a block center of mass.

Page 18:

"Degeneration preserves overall center of mass"

Y cuts its own state space as well as the joint state space of itself and X.

This partition of the joint state space will be mapped by X to a partition of its own sample space (a numeral set).

The expression 𝔼(X | Y = y₁) then represents the locus of the center of mass of the first block of the partition.

𝔼(X | Y) represents the totality of the loci of these block centers of mass.

Page 19:

This is the prototypical problem of finding the expectation of a random variable via the technique of conditioning on another random variable.

Ans. 𝔼(X) = 𝔼[𝔼(X | Y)].

In the divide-conquer-merge paradigm: Divide is done by the conditioner Y. Conquer refers to the inner expectation 𝔼(X | Y = y), carried out at each division. Merge refers to the outer expectation, which pieces up the whole plate. This exercise addresses the merge step.

Compare with the conditional probability ℙ(E | X) and ponder the link between them.

Exercise: Finding 𝔼(X) from 𝔼(X | Y)
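The divide-conquer-merge steps can be sketched on an assumed toy joint pmf (all values below are made up for illustration):

```python
# Sketch: tower property E[X] = E[ E[X | Y] ] on an assumed toy joint pmf.
from fractions import Fraction

joint = {  # P(X = x, Y = y)
    (1, 0): Fraction(1, 4), (2, 0): Fraction(1, 4),
    (1, 1): Fraction(1, 8), (4, 1): Fraction(3, 8),
}

# Direct expectation of X
e_x = sum(x * p for (x, _), p in joint.items())

# Divide: marginal of the conditioner Y
p_y = {}
for (_, y), p in joint.items():
    p_y[y] = p_y.get(y, Fraction(0)) + p
# Conquer: E[X | Y = y], the center of mass of each block
e_x_given_y = {y: sum(x * p for (x, yy), p in joint.items() if yy == y) / p_y[y]
               for y in p_y}
# Merge: outer expectation over the conditioner
e_via_tower = sum(e_x_given_y[y] * p_y[y] for y in p_y)

assert e_via_tower == e_x
print(e_x)  # 19/8
```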

Page 20:

Conditional Variance

Finding variance by conditioning: Var(X) = 𝔼[Var(X | Y)] + Var(𝔼(X | Y)).

Pf. Write Var(X) = 𝔼(X²) − (𝔼X)². Conditioning each term on Y: 𝔼(X²) = 𝔼[𝔼(X² | Y)] = 𝔼[Var(X | Y) + (𝔼(X | Y))²], while (𝔼X)² = (𝔼[𝔼(X | Y)])². Subtracting gives Var(X) = 𝔼[Var(X | Y)] + { 𝔼[(𝔼(X | Y))²] − (𝔼[𝔼(X | Y)])² } = 𝔼[Var(X | Y)] + Var(𝔼(X | Y)). ∎

Unfortunately, the degeneration of the sample space of X does not preserve second moments. That's why there is the addendum 𝔼[Var(X | Y)] in the formula.
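The decomposition can be verified numerically on an assumed toy joint pmf (values chosen only for illustration):

```python
# Sketch: law of total variance Var(X) = E[Var(X|Y)] + Var(E[X|Y]).
from fractions import Fraction

joint = {  # assumed P(X = x, Y = y)
    (1, 0): Fraction(1, 4), (2, 0): Fraction(1, 4),
    (1, 1): Fraction(1, 8), (4, 1): Fraction(3, 8),
}

def e(f):
    """Expectation of f(x, y) under the joint pmf."""
    return sum(f(x, y) * p for (x, y), p in joint.items())

p_y = {0: Fraction(1, 2), 1: Fraction(1, 2)}  # marginal of the conditioner Y
e_x_y = {y: e(lambda x, yy: x * (yy == y)) / p_y[y] for y in p_y}   # E[X | Y=y]
var_x_y = {y: e(lambda x, yy: x**2 * (yy == y)) / p_y[y] - e_x_y[y]**2
           for y in p_y}                                            # Var(X | Y=y)

var_x = e(lambda x, y: x**2) - e(lambda x, y: x)**2      # direct Var(X)
within = sum(var_x_y[y] * p_y[y] for y in p_y)           # E[Var(X | Y)]
between = (sum(e_x_y[y]**2 * p_y[y] for y in p_y)
           - sum(e_x_y[y] * p_y[y] for y in p_y)**2)     # Var(E[X | Y])
assert var_x == within + between
print(var_x)  # 111/64
```

The "between" term is all the degenerated sample space retains; the "within" term is exactly the second-moment information the degeneration loses.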

Page 21:

The key observations are:

Obs1: To find the center of mass of a piece of material, you can divide it into a few blocks, find their centers of mass, and then find the center of mass of these block centers of mass. The initial division of the piece is quite arbitrary. This fundamental law of physics underlies the many nice properties of expectation in the calculus of probability.

Obs2: A random variable partitions its state space into a collection of atom-valued blocks. This suggests using a random variable as a general device to divide the piece mentioned in Obs1. Such a random variable is called the conditioner.

Summary: Conditional Expectation

Page 22:

Trick: Use the indicator 1_E of the set E. The indicator is a Bernoulli random variable. Reason: 𝔼(1_E) = ℙ(E), and likewise 𝔼(1_E | X) = ℙ(E | X).

Conclusion: The conditional probability of an event conditioned on a random variable (a partition) is the conditional expectation of the indicator of that event conditioned on the same random variable, in disguise.

All properties of conditional expectation therefore apply to conditional probability. For example, the Law of Total Probability ℙ(E) = 𝔼[ℙ(E | X)] is just the tower property 𝔼(1_E) = 𝔼[𝔼(1_E | X)] in disguise.

Linking ℙ(E | X) to 𝔼(1_E | X)
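The disguise can be checked directly on an assumed joint table of X and the indicator 1_E (the pmf values are illustrative):

```python
# Sketch: P(E | X = x) equals E[1_E | X = x] for every x.
from fractions import Fraction

# Assumed P(X = x, 1_E = b); the column b = 1 means E occurred.
joint = {
    (0, 0): Fraction(1, 6), (0, 1): Fraction(1, 3),
    (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 4),
}

p_x = {x: joint[(x, 0)] + joint[(x, 1)] for x in (0, 1)}   # marginal of X

for x in (0, 1):
    p_e_given_x = joint[(x, 1)] / p_x[x]                   # P(E | X = x)
    e_ind_given_x = (sum(b * joint[(x, b)] for b in (0, 1))
                     / p_x[x])                             # E[1_E | X = x]
    assert p_e_given_x == e_ind_given_x
```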

Page 23:

The art of conditioning lies in the choice of the conditioner.

Usually, if our unknown target is the r.v. X, and we know that X is a known function of a known r.v. Y, then it is natural to use Y as the conditioner for X. That is: Divide the state space of X by Y; Conquer every 𝔼(X | Y = y); Merge them into 𝔼(X) = 𝔼[𝔼(X | Y)].

Choosing Conditioner
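A sketch of this choice of conditioner on an assumed random-sum example (the customer/spend story and all numbers are illustrative, not from the handout): S = X₁ + … + X_N is a function of N, so N is the natural conditioner, and 𝔼(S) = 𝔼[𝔼(S | N)] = 𝔼(N)·𝔼(X₁), Wald's identity for N independent of the Xᵢ.

```python
# Sketch: E[S] for a random sum S = X_1 + ... + X_N via conditioning on N.
# Assumed model: N uniform on {1..5} (E[N] = 3), X_i uniform on (0, 10)
# (E[X_i] = 5), so E[S] = E[N] * E[X_i] = 15. Checked by Monte Carlo.
import random

random.seed(0)

def sample_s():
    n = random.randint(1, 5)                              # divide: realize N
    return sum(random.uniform(0, 10) for _ in range(n))   # conquer: sum given N = n

trials = 200_000
est = sum(sample_s() for _ in range(trials)) / trials     # merge: average over N
assert abs(est - 15.0) < 0.5
```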

Page 24:

Handout problem 3
Handout problem 4
Handout problem 5
Handout problem 6

Exercises