Tutorial 8, STAT1301 Fall 2010, 16 Nov 2010, MB103@HKU, by Joseph Dong
TRANSCRIPT
Partition and Condition = Divide and Conquer
Recall: A Partition on a Set
Any exhaustive and disjoint collection of subsets of a given set forms a partition of that set. E.g. {A, Aᶜ} forms a trivial partition of the presumed set Ω. If X: Ω → 𝒳, then the collection of pre-images of atoms of the range, {X⁻¹(x) : x ∈ 𝒳}, forms a partition of the domain Ω.
Recall: Conditioning on a Partition
Shares the same idea as Divide and Conquer: casewise enumeration; a tree diagram.
Formal language: Goal = find the probability of an event A, P(A). This is equivalent to finding the probability of its intersection with the sure event Ω: P(A) = P(A ∩ Ω).
Recall: Conditioning on a Partition (cont'd)
Formal language (continued): now break the sure event Ω down into a number of manageable smaller pieces B₁, …, Bₙ; together these pieces form a partition of the sure event, so A = A ∩ Ω = ⋃ᵢ (A ∩ Bᵢ).
If we investigate all such events A ∩ Bᵢ, then we're done: P(A) = Σᵢ P(A ∩ Bᵢ).
The hard core of the problem now becomes finding each P(A ∩ Bᵢ), and this is where the conditioning takes place: P(A ∩ Bᵢ) = P(A|Bᵢ) P(Bᵢ), assuming it is a more straightforward task to find P(A|Bᵢ) and P(Bᵢ).
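The divide step above can be checked numerically. A minimal sketch, with a made-up urn setup (not from the handout): two urns partition the sure event, and P(A) is assembled from the easy per-piece quantities.

```python
from fractions import Fraction as F

# Hypothetical example: B1 = "urn 1 chosen", B2 = "urn 2 chosen" form a
# partition of the sure event; A = "red ball drawn".
p_B = {"B1": F(1, 2), "B2": F(1, 2)}          # P(B_i): the partition pieces
p_A_given_B = {"B1": F(3, 5), "B2": F(1, 4)}  # P(A | B_i): easy per piece

# Law of Total Probability: P(A) = sum_i P(A | B_i) * P(B_i)
p_A = sum(p_A_given_B[b] * p_B[b] for b in p_B)
print(p_A)  # 17/40
```

Exact fractions are used so the arithmetic mirrors hand computation.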
Recall: What does an R.V. do to its State Space?
An r.v. cuts the state space into blocks. On each of these blocks, the r.v. sends all points there to a common atom in the sample space. An r.v. causes a partition on the state space. Conversely, given a partition on the state space, you can also define random variables on it that "conform to" the partition by taking one value on each block.
(Diagram: Random Variable ↔ Partition on Ω.)
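The pre-image picture can be sketched in a few lines. The two-coin-toss state space here is an assumed example, not from the handout: grouping points by the value the r.v. sends them to recovers the partition it causes.

```python
# State space of two coin tosses; X = number of heads is the r.v.
omega = ["HH", "HT", "TH", "TT"]

def X(w):
    return w.count("H")

# Each block is a pre-image X^{-1}(x): all points sent to a common atom x.
partition = {}
for w in omega:
    partition.setdefault(X(w), []).append(w)

print(partition)  # {2: ['HH'], 1: ['HT', 'TH'], 0: ['TT']}
```

The blocks are disjoint and exhaustive, as a partition must be.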
Conditioning an Event on an R.V.
Since an r.v. cuts the state space into a partition, conditioning on an r.v. is just conditioning on the partition it causes on the state space.
The meaning of P(A|X = x) was illustrated by a diagram on the right of the slide. (Diagram not reproduced in the transcript.)
P(A|X) as a Random Variable
It contains a random variable inside, making it a function of X. It has a distribution and an expectation (LOTUS). Question: what is the meaning of its expected value?
To fix its value, fix an X value: P(A|X = x). Every fixed value is now a conditional probability involving two events.
Exercise: Finding P(A) from P(A|X)
This is the prototypical problem of finding the probability of an event via the technique of conditioning on a random variable.
Hint: Ponder the link between the Law of Total Probability and expectation.
Ans: P(A) = E[P(A|X)].
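A quick numerical sketch of this answer; the distribution of X and the conditional probabilities below are invented for illustration.

```python
from fractions import Fraction as F

# Hypothetical setup: X uniform on {1, 2, 3}; P(A | X = x) known per atom.
p_X = {1: F(1, 3), 2: F(1, 3), 3: F(1, 3)}
p_A_given_X = {1: F(1, 2), 2: F(1, 4), 3: F(3, 4)}

# P(A | X) is a random variable (a function of X); its expectation is P(A).
p_A = sum(p_A_given_X[x] * p_X[x] for x in p_X)
print(p_A)  # 1/2
```

Note this is exactly the Law of Total Probability with the partition {X = x}.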
ℙ(𝑌|𝑋)
It involves two r.v.'s now. P(Y|X) is a function of the bivariate random vector (X, Y). Fixing X = x gives you back the conditional density of Y given X = x at the fixed position.
Given P(Y|X): Q1: How to find the marginal distribution of Y? Ans: P(Y = y) = E[P(Y = y|X)]. Q2: How to find E(Y)? Ans: E(Y) = E[E(Y|X)]. N.B.: Both expectations are done w.r.t. the conditioner X.
Conditional, Marginal, and Joint densities
Difference among the 3 types of densities:
A conditional density is normalized by the marginal probability of the conditioning event; it is a point divided by a row sum/integral; it is the density of Y given X = x.
A joint density is normalized by the entire joint space; it is a point divided by the sum/integral over the entire space; it is the density of (X, Y).
A marginal density is also normalized by the entire space; it is a row sum divided by the sum/integral over the entire space; it is the density of X alone.
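The three normalizations can be seen side by side on a small joint table; the joint pmf below is a made-up example.

```python
from fractions import Fraction as F

# Hypothetical joint pmf of (X, Y): rows indexed by x, columns by y.
# Entries sum to 1 over the whole space (the joint normalization).
joint = {(0, 0): F(1, 8), (0, 1): F(3, 8),
         (1, 0): F(2, 8), (1, 1): F(2, 8)}

# Marginal of X: a row sum over y (divided by the total, which is 1 here).
f_X = {x: sum(p for (xx, y), p in joint.items() if xx == x) for x in (0, 1)}

# Conditional of Y given X = 0: each point divided by its row sum.
f_Y_given_X0 = {y: joint[(0, y)] / f_X[0] for y in (0, 1)}

print(f_X)           # marginal: 1/2 and 1/2
print(f_Y_given_X0)  # conditional row: 1/4 and 3/4
```

Each density is the same table of masses, renormalized over a different space.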
Handout Problem 1
Recall: What's the Expectation of a Random Variable?
First of all, the random variable has to be numerically valued. That's why the expectation is also known as the "expected value": it is a numerical characteristic of the sample space (a subset of ℝ, or simply ℝ itself equipped with zero density at the impossible points).
The expectation is both conceptually and technically equivalent to the location of the center of probability mass of the sample space.
Expectation provides only partial information about the random variable because it eliminates randomness, giving you back only one representative point of the sample space.
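A tiny sketch of "expectation = center of probability mass": masses p(x) are placed at the numerical positions x, and the expectation is their balance point. The pmf below is an assumed example.

```python
from fractions import Fraction as F

# Hypothetical pmf: probability masses at numerical positions 1, 2, 3.
pmf = {1: F(1, 6), 2: F(2, 6), 3: F(3, 6)}

# Center of mass of the sample space = the expected value.
EX = sum(x * p for x, p in pmf.items())
print(EX)  # 7/3
```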
𝐴|𝑋
A|X is a set-valued random variable. Given ω, it evaluates to the set A ∩ {X = X(ω)} (the part of A inside ω's block). We cannot have an expected value defined for A|X.
Clarification: A|X is not P(A|X). The latter is numerically valued, as we have previously established for its expected value: E[P(A|X)] = P(A).
More elaboration: on the set-theory layer, A|X is not strictly different from the set-r.v. pair (A, X). But up on the probability-theory layer, A|X is normalized by a different space than A is.
𝔼(𝑋|𝐸)
X|E is a numerically-valued random variable. We can compute its expected value.
X vs X|E: their sample spaces are different.
Compute E(X|E) using the conditional density: E(X|E) = Σₓ x · P(X = x|E).
Compute E(X|E) using the original space: E(X|E) = E(X · 1_E) / P(E).
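Both routes to E(X|E) can be compared on a made-up example, a fair die with E = "outcome is even"; the two computations agree.

```python
from fractions import Fraction as F

# X = a fair die; E = "outcome is even". Compute E(X | E) two ways.
pmf = {x: F(1, 6) for x in range(1, 7)}
E = {2, 4, 6}
p_E = sum(pmf[x] for x in E)

# Way 1: via the conditional density f_{X|E}(x) = f_X(x) / P(E) on E.
e1 = sum(x * pmf[x] / p_E for x in E)

# Way 2: via the original space: E(X | E) = E(X * 1_E) / P(E).
e2 = sum(x * pmf[x] * (1 if x in E else 0) for x in pmf) / p_E

print(e1, e2)  # 4 4
```

Way 1 renormalizes first; Way 2 keeps the original space and divides once at the end.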
Handout Problem 2
Warm-up exercise
𝔼(𝑌|𝑋): concepts
First of all, this is a random variable, a function of X. Its randomness comes from the state space of X, but the mapping mechanism is worked out jointly by X and Y.
This expression is known as the conditional expectation of the conditionee Y given the conditioner X.
The expectation is done with respect to Y. To be precise, we should say w.r.t. Y|X = x. There are multiple (or even a continuum of) sample spaces of Y|X, depending on which atom value X takes. After fixing X to an atom, or equivalently to a block in the state space partitioned by X, the expression E(Y|X = x) is just a constant.
The expectation eliminates the randomness of Y given X.
𝔼(𝑌|𝑋) as an r.v.
It uses the joint state space of X and Y as its own state space.
It uses a degenerated version of the sample space of Y as its own sample space. The degeneration preserves the locus of the overall center of mass: each point in the degenerated space is a block center of mass.
"Degeneration preserves overall center of mass"
X cuts its own state space, as well as the joint state space of X and Y, into a partition.
This partition of the joint state space is mapped by Y to a partition of Y's own sample space (a numeral set).
Then the expression E(Y|X = x₁) represents the locus of the center of mass of the first block of that partition.
E(Y|X) represents the totality of the loci of these block centers of mass.
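The block-center-of-mass picture, sketched on an invented joint pmf (the helper name E_Y_given is mine, not from the handout): each value of E(Y|X) is the center of mass of one block.

```python
from fractions import Fraction as F

# Hypothetical joint pmf of (X, Y). E(Y | X = x) is the center of mass of
# the x-block; E(Y | X) collects these block centers into a degenerated space.
joint = {(0, 10): F(1, 4), (0, 20): F(1, 4),
         (1, 10): F(1, 8), (1, 30): F(3, 8)}

def E_Y_given(x):
    row = {y: p for (xx, y), p in joint.items() if xx == x}
    px = sum(row.values())                        # P(X = x), the block mass
    return sum(y * p for y, p in row.items()) / px

E_Y_given_X = {x: E_Y_given(x) for x in (0, 1)}   # block centers of mass
print(E_Y_given_X)  # one center per block: 15 and 25
```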
Exercise: Finding E(Y) from E(Y|X)
This is the prototypical problem of finding the expectation of a random variable via the technique of conditioning on another random variable.
Ans. E(Y) = E[E(Y|X)].
In the divide-conquer-merge paradigm: Divide is done by the conditioner X; Conquer refers to the inner expectation carried out at each division; Merge refers to the outer expectation that pieces up the whole plate. This exercise addresses the merge step.
Compare with the conditional probability P(A) = E[P(A|X)], and ponder the link between them.
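The divide-conquer-merge recipe, checked on an assumed joint pmf: conquering each block and merging reproduces the expectation computed directly on the whole space.

```python
from fractions import Fraction as F

# Hypothetical joint pmf of (X, Y).
joint = {(0, 1): F(1, 6), (0, 2): F(2, 6),
         (1, 1): F(1, 6), (1, 4): F(2, 6)}

# Divide: the conditioner X splits the space; P(X = x) is each block's mass.
p_X = {x: sum(p for (xx, _), p in joint.items() if xx == x) for x in (0, 1)}

# Conquer: inner expectation E(Y | X = x) on each block.
inner = {x: sum(y * p for (xx, y), p in joint.items() if xx == x) / p_X[x]
         for x in (0, 1)}

# Merge: outer expectation w.r.t. the conditioner, E(Y) = E[E(Y|X)].
E_Y_merged = sum(inner[x] * p_X[x] for x in p_X)

# Direct computation on the whole space, for comparison.
E_Y_direct = sum(y * p for (_, y), p in joint.items())
print(E_Y_merged, E_Y_direct)  # 7/3 7/3
```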
Conditional Variance
Finding variance by conditioning: Var(Y) = E[Var(Y|X)] + Var(E(Y|X)).
Pf. Var(Y) = E(Y²) − (E Y)² = E[E(Y²|X)] − (E[E(Y|X)])² = E[Var(Y|X) + (E(Y|X))²] − (E[E(Y|X)])² = E[Var(Y|X)] + Var(E(Y|X)).
Unfortunately, the degeneration of the sample space of Y does not preserve second moments. That's why there is the addendum E[Var(Y|X)] in the formula.
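The law of total variance can be verified on a small invented joint pmf: the within-block term E[Var(Y|X)] plus the between-block term Var(E(Y|X)) matches the variance computed directly.

```python
from fractions import Fraction as F

# Hypothetical joint pmf of (X, Y): two blocks, each uniform on two values.
joint = {(0, 0): F(1, 4), (0, 2): F(1, 4),
         (1, 1): F(1, 4), (1, 5): F(1, 4)}

p_X = {x: sum(p for (xx, _), p in joint.items() if xx == x) for x in (0, 1)}

def cond_moments(x):
    row = {y: p / p_X[x] for (xx, y), p in joint.items() if xx == x}
    m = sum(y * p for y, p in row.items())             # E(Y | X = x)
    v = sum((y - m) ** 2 * p for y, p in row.items())  # Var(Y | X = x)
    return m, v

ms, vs = zip(*(cond_moments(x) for x in (0, 1)))
E_cond_var = sum(v * p_X[x] for x, v in zip((0, 1), vs))    # E[Var(Y|X)]
E_cond_mean = sum(m * p_X[x] for x, m in zip((0, 1), ms))   # E[E(Y|X)] = E(Y)
var_cond_mean = sum((m - E_cond_mean) ** 2 * p_X[x]
                    for x, m in zip((0, 1), ms))            # Var(E(Y|X))

# Direct computation on the whole space, for comparison.
E_Y = sum(y * p for (_, y), p in joint.items())
Var_Y = sum((y - E_Y) ** 2 * p for (_, y), p in joint.items())
print(Var_Y, E_cond_var + var_cond_mean)  # 7/2 7/2
```

Dropping the within-block term and keeping only Var(E(Y|X)) would understate Var(Y), which is exactly why the addendum is needed.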
Summary: Conditional Expectation
The key observations are:
Obs1: To find the center of mass of a piece of material, you can divide it into a few blocks, find their centers of mass, and then find the center of mass of these block centers of mass. The initial division of the piece is quite arbitrary. This fundamental law of physics supports the many nice properties of expectation in the calculus of probability.
Obs2: A random variable partitions its state space into a collection of atom-valued blocks. This suggests using a random variable as a general device to divide the piece mentioned in Obs1. Such a random variable is called the conditioner.
Linking P(A|X) to E(1_A|X)
Trick: use the indicator 1_A of the set A. The indicator is a Bernoulli random variable. Reason: E(1_A) = P(A).
Conclusion: the conditional probability of an event conditioned on a random variable (a partition) is the conditional expectation of the indicator of that event conditioned on the same random variable, in disguise: P(A|X) = E(1_A|X).
All properties of conditional expectation apply to conditional probability. For instance, the Law of Total Probability is just P(A) = E[E(1_A|X)] in disguise.
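The indicator trick in action, on an assumed two-dice example: counting P(A|X = x) directly and taking the conditional expectation of the indicator give the same numbers.

```python
from fractions import Fraction as F

# Two fair dice; X = first die; A = "sum equals 7".
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]  # uniform space

def indicator(w):           # 1_A: Bernoulli r.v. for the event A
    return 1 if w[0] + w[1] == 7 else 0

def E_ind_given(x):         # E(1_A | X = x): average of 1_A over the block
    block = [w for w in omega if w[0] == x]
    return F(sum(indicator(w) for w in block), len(block))

def P_A_given(x):           # P(A | X = x): counted directly
    block = [w for w in omega if w[0] == x]
    return F(sum(1 for w in block if w[0] + w[1] == 7), len(block))

print([E_ind_given(x) for x in range(1, 7)])  # every entry equals 1/6
```

Averaging these over P(X = x) = 1/6 recovers P(A) = 1/6, the Law of Total Probability in indicator form.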
Choosing the Conditioner
The art of conditioning lies in the choice of the conditioner. Usually, if our unknown target is the r.v. Y, and we know that Y is a known function of a known r.v. X, then it is natural to use X as the conditioner for Y, that is:
Divide the state space of Y by X;
Conquer every E(Y|X = x);
Merge them into E(Y) = E[E(Y|X)].
Exercises
Handout problem 3
Handout problem 4
Handout problem 5
Handout problem 6