probtheo4
TRANSCRIPT
8/10/2019 probtheo4
4. Joint Distributions of Two Random Variables
4.1 Joint Distributions of Two Discrete Random Variables
Suppose the discrete random variables X and Y have supports S_X and S_Y, respectively.

The joint distribution of X and Y is given by the set {P(X = x, Y = y)}_{x∈S_X, y∈S_Y}. This joint distribution satisfies

P(X = x, Y = y) ≥ 0, for all x ∈ S_X, y ∈ S_Y

∑_{x∈S_X} ∑_{y∈S_Y} P(X = x, Y = y) = 1
Joint Distributions of Two Discrete Random Variables
When S_X and S_Y are small sets, the joint distribution of X and Y is usually presented in a table, e.g.

        Y = -1   Y = 0   Y = 1
X = 0     0       1/3      0
X = 1    1/3       0      1/3
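Such a table translates directly into a two-dimensional array, and the two defining conditions of a joint distribution can be checked mechanically. A minimal sketch in Python (using exact rational arithmetic via the standard fractions module; all variable names are illustrative):

```python
from fractions import Fraction as F

# Joint distribution from the table: rows are X = 0, 1; columns are Y = -1, 0, 1.
joint = [[F(0), F(1, 3), F(0)],
         [F(1, 3), F(0), F(1, 3)]]

# Condition 1: every probability is non-negative.
assert all(p >= 0 for row in joint for p in row)

# Condition 2: the probabilities sum to 1.
assert sum(p for row in joint for p in row) == 1
```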
4.1.1 Marginal Distributions of Discrete Random Variables
Note that by summing along a row (i.e. summing over all the possible values of Y for a particular realisation of X), we obtain the probability that X takes the appropriate value, i.e.

P(X = x) = ∑_{y∈S_Y} P(X = x, Y = y).

Similarly, by summing along a column (i.e. summing over all the possible values of X for a particular realisation of Y), we obtain the probability that Y takes the appropriate value, i.e.

P(Y = y) = ∑_{x∈S_X} P(X = x, Y = y).
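In code, the marginal distributions are simply the row and column sums of the joint table. A sketch for the table above (names are illustrative):

```python
from fractions import Fraction as F

joint = [[F(0), F(1, 3), F(0)],     # X = 0
         [F(1, 3), F(0), F(1, 3)]]  # X = 1

# P(X = x): sum each row over all values of Y.
p_x = [sum(row) for row in joint]
# P(Y = y): sum each column over all values of X.
p_y = [sum(col) for col in zip(*joint)]

print(p_x)  # [Fraction(1, 3), Fraction(2, 3)]
print(p_y)  # [Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)]
```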
Marginal Distributions of Discrete Random Variables
The distributions of X and Y obtained in this way are called the marginal distributions of X and Y, respectively.

These marginal distributions satisfy the standard conditions for a discrete distribution, i.e.

P(X = x) ≥ 0, for all x ∈ S_X;  ∑_{x∈S_X} P(X = x) = 1.
4.1.2 Independence of Two Discrete Random Variables
Two discrete random variables X and Y are independent if and only if

P(X = x, Y = y) = P(X = x)P(Y = y), for all x ∈ S_X, y ∈ S_Y.
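This criterion can be tested cell by cell: build the marginals and compare each joint probability with the product of the corresponding marginal probabilities. A sketch (the helper name is made up):

```python
from fractions import Fraction as F

def is_independent(joint):
    """Return True iff P(X=x, Y=y) = P(X=x) P(Y=y) for every cell."""
    p_x = [sum(row) for row in joint]
    p_y = [sum(col) for col in zip(*joint)]
    return all(joint[i][j] == p_x[i] * p_y[j]
               for i in range(len(joint))
               for j in range(len(joint[0])))

joint = [[F(0), F(1, 3), F(0)],
         [F(1, 3), F(0), F(1, 3)]]
print(is_independent(joint))  # False: e.g. P(X=0, Y=-1) = 0 but P(X=0)P(Y=-1) = 1/9
```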
4.1.3 The Expected Value of a Function of Two Discrete Random Variables
The expected value of a function, g(X, Y), of two discrete random variables is defined as

E[g(X,Y)] = ∑_{x∈S_X} ∑_{y∈S_Y} g(x, y)P(X = x, Y = y).

In particular, the expected value of X is given by

E[X] = ∑_{x∈S_X} ∑_{y∈S_Y} x P(X = x, Y = y).

It should be noted that if we have already calculated the marginal distribution of X, then it is simpler to calculate E[X] using this.
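The double sum translates into a single loop over the cells of the table. A sketch, with a generic function g (here instantiated as g(x, y) = x and g(x, y) = xy for illustration):

```python
from fractions import Fraction as F

def expectation(g, joint, x_vals, y_vals):
    """E[g(X, Y)] = sum over all cells of g(x, y) P(X=x, Y=y)."""
    return sum(g(x, y) * joint[i][j]
               for i, x in enumerate(x_vals)
               for j, y in enumerate(y_vals))

joint = [[F(0), F(1, 3), F(0)],
         [F(1, 3), F(0), F(1, 3)]]
x_vals, y_vals = [0, 1], [-1, 0, 1]

print(expectation(lambda x, y: x, joint, x_vals, y_vals))      # E[X] = 2/3
print(expectation(lambda x, y: x * y, joint, x_vals, y_vals))  # E[XY] = 0
```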
4.1.4 Covariance and the Correlation Coefficient
The covariance between X and Y, Cov(X,Y), is given by

Cov(X,Y) = E(XY) − E(X)E(Y).

Note that by definition Cov(X,X) = E(X²) − E(X)² = Var(X).

Also, for any constants a, b,

Cov(X, aY + b) = a Cov(X,Y).

The coefficient of correlation between X and Y is given by

ρ(X,Y) = Cov(X,Y) / √(Var(X)Var(Y)).
Properties of the Correlation Coefficient
1. −1 ≤ ρ(X,Y) ≤ 1.
2. If |ρ(X,Y)| = 1, then there is a deterministic linear relationship between X and Y, Y = aX + b. When ρ(X,Y) = 1, then a > 0. When ρ(X,Y) = −1, then a < 0.
3. For a > 0, ρ(X, aY + b) = ρ(X,Y), i.e. the correlation coefficient is independent of the units a variable is measured in. If a < 0, then ρ(X, aY + b) = −ρ(X,Y).
Example 4.1
Discrete random variables X and Y have the following joint distribution:

        Y = -1   Y = 0   Y = 1
X = 0     0       1/3      0
X = 1    1/3       0      1/3

a) Calculate i) P(X > Y), ii) the marginal distributions of X and Y, iii) the expected values and variances of X and Y, iv) the coefficient of correlation between X and Y.
b) Are X and Y independent?
Example 4.1
To calculate the probability of such an event, we sum over all the cells which correspond to that event. Hence,

P(X > Y) = P(X = 0, Y = −1) + P(X = 1, Y = −1) + P(X = 1, Y = 0) = 0 + 1/3 + 0 = 1/3.
Example 4.1
Summing along the rows gives the marginal distribution of X:

P(X = 0) = 0 + 1/3 + 0 = 1/3,  P(X = 1) = 1/3 + 0 + 1/3 = 2/3.

Similarly,

P(Y = −1) = ∑_{x=0}^{1} P(X = x, Y = −1) = 0 + 1/3 = 1/3

P(Y = 0) = ∑_{x=0}^{1} P(X = x, Y = 0) = 1/3 + 0 = 1/3

P(Y = 1) = 1 − P(Y = −1) − P(Y = 0) = 1/3.
Example 4.1
We can calculate the expected values and variances of X and Y from these marginal distributions.

E[X] = ∑_{x=0}^{1} x P(X = x) = 0 · 1/3 + 1 · 2/3 = 2/3

E[Y] = ∑_{y=−1}^{1} y P(Y = y) = −1 · 1/3 + 0 · 1/3 + 1 · 1/3 = 0.
Example 4.1
To calculate Var(X) and Var(Y), we use the formula Var(X) = E(X²) − E(X)².

E[X²] = ∑_{x=0}^{1} x² P(X = x) = 0² · 1/3 + 1² · 2/3 = 2/3

E[Y²] = ∑_{y=−1}^{1} y² P(Y = y) = (−1)² · 1/3 + 0² · 1/3 + 1² · 1/3 = 2/3.

Thus,

Var(X) = 2/3 − (2/3)² = 2/9

Var(Y) = 2/3 − 0² = 2/3
Example 4.1
To calculate the coefficient of correlation, we first calculate the covariance between X and Y. We have

Cov(X,Y) = E[XY] − E[X]E[Y],

where

E[XY] = ∑_{x=0}^{1} ∑_{y=−1}^{1} xy P(X = x, Y = y) = 0 · 0 · 1/3 + 1 · (−1) · 1/3 + 1 · 1 · 1/3 = 0.

Hence Cov(X,Y) = 0 − (2/3) · 0 = 0, and so ρ(X,Y) = 0.
Example 4.1
Note that if ρ(X,Y) ≠ 0, then we can immediately state that X and Y are dependent.

When ρ(X,Y) = 0, we must check whether X and Y are independent from the definition. X and Y are independent if and only if

P(X = x, Y = y) = P(X = x)P(Y = y), for all x ∈ S_X, y ∈ S_Y.

To show that X and Y are dependent it is sufficient to show that this equation does not hold for some pair of values x and y.
Example 4.1
We have

P(X = 0, Y = 1) = 0 ≠ P(X = 0)P(Y = 1) = 1/3 · 1/3 = 1/9.

It follows that X and Y are dependent. It can be seen that X = |Y|.
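The whole of Example 4.1 can be verified in a few lines of exact arithmetic. A sketch (the joint table is stored as a dictionary keyed by (x, y)):

```python
from fractions import Fraction as F

joint = {(0, -1): F(0), (0, 0): F(1, 3), (0, 1): F(0),
         (1, -1): F(1, 3), (1, 0): F(0), (1, 1): F(1, 3)}

# E[g(X, Y)] as a sum over the cells of the table.
E = lambda g: sum(g(x, y) * p for (x, y), p in joint.items())

EX, EY = E(lambda x, y: x), E(lambda x, y: y)
VarX = E(lambda x, y: x * x) - EX ** 2
VarY = E(lambda x, y: y * y) - EY ** 2
cov = E(lambda x, y: x * y) - EX * EY

print(EX, VarX)  # 2/3 2/9
print(EY, VarY)  # 0 2/3
print(cov)       # 0 -- so rho(X, Y) = 0, yet X and Y are dependent (X = |Y|)
```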
4.1.5 Conditional Distributions and Conditional Expected Values
Define the conditional distribution of X given Y = y, where y ∈ S_Y, by {P(X = x|Y = y)}_{x∈S_X}, where

P(X = x|Y = y) = P(X = x, Y = y) / P(Y = y).

Analogously, the conditional distribution of Y given X = x, where x ∈ S_X, is given by {P(Y = y|X = x)}_{y∈S_Y}, where

P(Y = y|X = x) = P(X = x, Y = y) / P(X = x).

Note: This definition is analogous to the definition of conditional probability given in Chapter 1.
Conditional Expected Value
The expected value of the function g(X) given Y = y is denoted E[g(X)|Y = y]. We have

E[g(X)|Y = y] = ∑_{x∈S_X} g(x)P(X = x|Y = y).

In particular,

E[X|Y = y] = ∑_{x∈S_X} x P(X = x|Y = y).

The expected value of the function g(Y) given X = x is defined analogously.
Probabilistic Regression
The graph of the probabilistic regression of X on Y is given by a scatter plot of the points {(y, E[X|Y = y])}_{y∈S_Y}. Note that the variable we are conditioning on appears on the x-axis.

Analogously, the graph of the probabilistic regression of Y on X is given by a scatter plot of the points {(x, E[Y|X = x])}_{x∈S_X}.

These graphs illustrate the nature of the (probabilistic) relation between the variables X and Y.
Probabilistic Regression
The graph of the probabilistic regression of Y on X, E[Y|X = x], may be

1. Monotonically increasing in x. This indicates a positive dependency between X and Y. ρ(X,Y) will be positive.
2. Monotonically decreasing in x. This indicates a negative dependency between X and Y. ρ(X,Y) will be negative.
3. Flat. This indicates that ρ(X,Y) = 0. It does not, however, indicate that X and Y are independent.
4. None of the above (e.g. oscillatory). In this case, X and Y are dependent, but this dependency cannot be described in simple terms such as positive or negative. The correlation coefficient can be 0, positive or negative in such a case (but cannot be −1 or 1).
Example 4.2
Discrete random variables X and Y have the following joint distribution:

        Y = -1   Y = 0   Y = 1
X = 0     0       1/3      0
X = 1    1/3       0      1/3

a) Calculate the conditional distributions of X given i) Y = −1, ii) Y = 0, iii) Y = 1.
b) Calculate the conditional distributions of Y given i) X = 0, ii) X = 1.
c) Draw the graph of the probabilistic regression of i) Y on X, ii) X on Y.
Example 4.2
a) i) The distribution of X conditioned on Y = −1 is given by

P(X = x|Y = −1) = P(X = x, Y = −1) / P(Y = −1), x ∈ S_X = {0, 1}.

From Example 4.1, P(Y = −1) = 1/3. Thus

P(X = 0|Y = −1) = P(X = 0, Y = −1)/(1/3) = 0

P(X = 1|Y = −1) = P(X = 1, Y = −1)/(1/3) = 1

Note that since X only takes the values 0 and 1,

P(X = 1|Y = −1) = 1 − P(X = 0|Y = −1).
Example 4.2
ii) Similarly, the distribution of X conditioned on Y = 0 is given by

P(X = 0|Y = 0) = P(X = 0, Y = 0)/P(Y = 0) = (1/3)/(1/3) = 1

P(X = 1|Y = 0) = P(X = 1, Y = 0)/P(Y = 0) = 0/(1/3) = 0

iii) The distribution of X conditioned on Y = 1 is given by

P(X = 0|Y = 1) = P(X = 0, Y = 1)/P(Y = 1) = 0/(1/3) = 0

P(X = 1|Y = 1) = P(X = 1, Y = 1)/P(Y = 1) = (1/3)/(1/3) = 1
Example 4.2
b) i) The distribution of Y conditioned on X = 0 is concentrated at Y = 0, since the only positive cell in the row X = 0 is (X = 0, Y = 0). Hence P(Y = 0|X = 0) = 1.

ii) Similarly, the distribution of Y conditioned on X = 1 is given by

P(Y = −1|X = 1) = P(X = 1, Y = −1)/P(X = 1) = (1/3)/(2/3) = 1/2

P(Y = 0|X = 1) = P(X = 1, Y = 0)/P(X = 1) = 0/(2/3) = 0

P(Y = 1|X = 1) = 1 − P(Y = −1|X = 1) − P(Y = 0|X = 1) = 1/2
Example 4.2
c) i) To plot the probabilistic regression of Y on X, we first need to calculate E[Y|X = x], x ∈ S_X = {0, 1}. We have

E[Y|X = 0] = ∑_{y∈S_Y} y P(Y = y|X = 0) = −1 · 0 + 0 · 1 + 1 · 0 = 0

E[Y|X = 1] = ∑_{y∈S_Y} y P(Y = y|X = 1) = −1 · 1/2 + 0 · 0 + 1 · 1/2 = 0

Since E[Y|X = x] is constant, it follows that ρ(X,Y) = 0.
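The conditional distributions and the regression E[Y|X = x] for this example can be computed mechanically from the joint table. A sketch (names are illustrative):

```python
from fractions import Fraction as F

joint = {(0, -1): F(0), (0, 0): F(1, 3), (0, 1): F(0),
         (1, -1): F(1, 3), (1, 0): F(0), (1, 1): F(1, 3)}
xs, ys = [0, 1], [-1, 0, 1]

# P(Y = y | X = x) = P(X = x, Y = y) / P(X = x)
p_x = {x: sum(joint[(x, y)] for y in ys) for x in xs}
cond = {(y, x): joint[(x, y)] / p_x[x] for x in xs for y in ys}

# The regression of Y on X: E[Y | X = x] for each x.
reg = {x: sum(y * cond[(y, x)] for y in ys) for x in xs}
print(reg)  # {0: Fraction(0, 1), 1: Fraction(0, 1)} -- flat, so rho(X, Y) = 0
```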
Example 4.2
[Graph: the probabilistic regression of Y on X — the points (0, 0) and (1, 0); E[Y|X = x] = 0 for both values of x.]
Example 4.2
[Graph: the probabilistic regression of X on Y — the points (−1, 1), (0, 0) and (1, 1); E[X|Y = −1] = E[X|Y = 1] = 1 and E[X|Y = 0] = 0.]
4.2 Joint Distributions of Two Continuous Random Variables
The joint distribution of two continuous random variables can be described by the joint density function f_{X,Y}(x, y).

The joint density function f_{X,Y} satisfies the following two conditions:

f_{X,Y}(x, y) ≥ 0

∫_{−∞}^{∞} ∫_{−∞}^{∞} f_{X,Y}(x, y) dx dy = 1

In practice, we integrate over the joint support of the random variables X and Y. This is the set of pairs (x, y) for which f_{X,Y}(x, y) > 0 (see Example 4.4).
Joint Distributions of Two Continuous Random Variables
The definitions used for continuous random variables are analogous to the ones used for discrete random variables. Summations are replaced by integrations.

Note that the volume under the density function is equal to 1.

The probability that (X,Y) ∈ A is given by

P[(X,Y) ∈ A] = ∫∫_A f_{X,Y}(x, y) dx dy
4.2.1 Marginal Density Functions for Continuous Random Variables
The marginal density function of X can be obtained by integrating the joint density function with respect to y:

f_X(x) = ∫_{−∞}^{∞} f_{X,Y}(x, y) dy.

Similarly, the marginal density function of Y can be obtained by integrating the joint density function with respect to x:

f_Y(y) = ∫_{−∞}^{∞} f_{X,Y}(x, y) dx.

If the joint support of X and Y is not a rectangle, some care should be taken regarding the appropriate form of the joint density function according to the variable we are integrating with respect to (see Example 4.4).
4.2.2 Independence of Two Continuous Random Variables
These marginal density functions satisfy the standard conditions for the density function of a single random variable.

The continuous random variables X and Y are independent if and only if

f_{X,Y}(x, y) = f_X(x)f_Y(y), for all x, y.
4.2.4 Covariance and Coefficient of Correlation
The definitions of the covariance and coefficient of correlation, Cov(X,Y) and ρ(X,Y), are identical to the definitions used for discrete random variables.

The properties of these measures and the interpretation of the coefficient of correlation are unchanged.
Example 4.3
The pair of random variables X and Y has a uniform joint distribution on the triangle A with apexes (0,0), (2,0) and (2,2), i.e. f_{X,Y}(x, y) = c if (x, y) ∈ A, otherwise f_{X,Y}(x, y) = 0. The density function is illustrated on the following slide.

a) Find the constant c.
b) Calculate P(X > 1, Y > 1).
c) Find the marginal densities of X and Y.
d) Find the expected values and variances of X and Y.
e) Find the coefficient of correlation, ρ(X,Y).
f) Are X and Y independent?
Example 4.3
[Graph: the triangle A with apexes (0,0), (2,0) and (2,2) in the (x, y)-plane; f_{X,Y}(x, y) = c inside A and f_{X,Y}(x, y) = 0 outside.]
Example 4.3
a) Method 1: The clever way

Note that if the density function is constant on some set A, then the volume under the density surface (which must be equal to 1) is simply the height of the density surface, c, times the area of the set A.

The set A is a triangle with base and height both of length 2. The area of this triangle is thus 0.5 · 2 · 2 = 2.

It follows that 2c = 1, i.e. c = 0.5.
Example 4.3
Method 2: By integration

The joint support of X and Y is the triangle A. Hence,

∫∫_A f_{X,Y}(x, y) dy dx = 1.

We must be careful to get the boundaries of integration correct. Note that in A, x varies from 0 to 2. Fixing x, y can vary from 0 to x. Hence,

∫∫_A f_{X,Y}(x, y) dy dx = ∫_{x=0}^{2} ( ∫_{y=0}^{x} c dy ) dx = ∫_{x=0}^{2} [cy]_{y=0}^{x} dx = ∫_0^2 cx dx = c[x²/2]_0^2 = 2c,

so 2c = 1 and c = 0.5.
Example 4.3
It is possible to do the integration in the opposite order. Note that in A, y varies from 0 to 2. Fixing y, x can vary from y to 2. Hence,

∫∫_A f_{X,Y}(x, y) dy dx = ∫_{y=0}^{2} ∫_{x=y}^{2} c dx dy.

Doing the integration this way obviously leads to the same answer, but the calculation is slightly longer.
Example 4.3
b) Method 1: The clever way

If (X,Y) has a uniform distribution over the set A, then P[(X,Y) ∈ B], where B ⊆ A, is given by

P[(X,Y) ∈ B] = |B| / |A|,

where |A| is the area of the set A.
Example 4.3
The area of A was calculated to be 2.

If both X and Y are greater than 1, then (X,Y) must belong to the interior of the triangle with apexes (1,1), (2,1) and (2,2). The area of this triangle is 0.5. Hence,

P(X > 1, Y > 1) = 0.5/2 = 1/4.
Example 4.3
Method 2: By integration

When both X and Y are greater than 1, there is only a positive density on the triangle with apexes (1,1), (2,1) and (2,2).

In this triangle, 1 ≤ x ≤ 2. Fixing x, 1 ≤ y ≤ x. Hence,

P(X > 1, Y > 1) = ∫_1^2 ∫_1^x f_{X,Y}(x, y) dy dx.
Example 4.3
Thus

P(X > 1, Y > 1) = ∫_1^2 ∫_1^x 0.5 dy dx = ∫_1^2 [0.5y]_{y=1}^{x} dx = ∫_1^2 (0.5x − 0.5) dx

= [x²/4 − x/2]_1^2 = (1 − 1) − (1/4 − 1/2) = 1/4
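Both the normalising constant and this probability can be confirmed by numerical integration, e.g. with scipy.integrate.dblquad (a sketch; note that dblquad expects the integrand as a function of the inner variable first, f(y, x)):

```python
from scipy.integrate import dblquad

c = 0.5  # density on the triangle A with apexes (0,0), (2,0), (2,2)

# Normalisation: integrate y from 0 to x (inner), x from 0 to 2 (outer).
total, _ = dblquad(lambda y, x: c, 0, 2, lambda x: 0, lambda x: x)
print(round(total, 6))  # 1.0

# P(X > 1, Y > 1): integrate y from 1 to x, x from 1 to 2.
p, _ = dblquad(lambda y, x: c, 1, 2, lambda x: 1, lambda x: x)
print(round(p, 6))  # 0.25
```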
Example 4.3
c) Note that both X and Y take values in [0, 2]. To get the marginal density function at x, f_X(x), we integrate the joint density function with respect to y, i.e. we fix x and sum over y.

Given x, we should consider for what values of y the joint density is positive. From the graph of the density function, there is a positive density as long as 0 < y < x. Hence, for 0 ≤ x ≤ 2,

f_X(x) = ∫_0^x 0.5 dy = x/2.
Example 4.3
To get the marginal density function at y, f_Y(y), we integrate the joint density function with respect to x, i.e. we fix y and sum over x.

For a given y, we should consider for what values of x the joint density is positive. From the graph of the density function, there is a positive density as long as y < x < 2. Hence, for 0 ≤ y ≤ 2,

f_Y(y) = ∫_y^2 0.5 dx = (2 − y)/2 = 1 − y/2.
Example 4.3
d) Using the marginal density f_X(x) = x/2,

E(X) = ∫_0^2 x · (x/2) dx = [x³/6]_0^2 = 4/3

E(X²) = ∫_0^2 x² · (x/2) dx = [x⁴/8]_0^2 = 2

It follows that Var(X) = E(X²) − E(X)² = 2 − (4/3)² = 2/9.
Example 4.3
Similarly,

E(Y) = ∫_0^2 y f_Y(y) dy = ∫_0^2 0.5y(2 − y) dy = ∫_0^2 (y − 0.5y²) dy = [y²/2 − y³/6]_0^2 = 2/3

E(Y²) = ∫_0^2 y² f_Y(y) dy = ∫_0^2 0.5y²(2 − y) dy = ∫_0^2 (y² − 0.5y³) dy = [y³/3 − y⁴/8]_0^2 = 2/3

It follows that Var(Y) = E(Y²) − E(Y)² = 2/3 − (2/3)² = 2/9.
Example 4.3
e) First, we find the covariance, Cov(X,Y) = E(XY) − E(X)E(Y). We have

E(XY) = ∫∫_A xy f_{X,Y}(x, y) dx dy = ∫_{x=0}^{2} ∫_{y=0}^{x} 0.5xy dy dx = ∫_{x=0}^{2} [xy²/4]_{y=0}^{x} dx = ∫_0^2 x³/4 dx = [x⁴/16]_0^2 = 1.

It follows that Cov(X,Y) = 1 − (4/3) · (2/3) = 1/9, and hence

ρ(X,Y) = Cov(X,Y)/√(Var(X)Var(Y)) = (1/9)/√((2/9)(2/9)) = 1/2.
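The moments in parts d) and e) can likewise be checked numerically over the triangle. A sketch using scipy.integrate.dblquad:

```python
from scipy.integrate import dblquad

# Uniform density c = 0.5 on the triangle 0 <= y <= x <= 2.
def E(g):
    val, _ = dblquad(lambda y, x: g(x, y) * 0.5, 0, 2, lambda x: 0, lambda x: x)
    return val

EX, EY, EXY = E(lambda x, y: x), E(lambda x, y: y), E(lambda x, y: x * y)
cov = EXY - EX * EY
var_x = E(lambda x, y: x * x) - EX ** 2
var_y = E(lambda x, y: y * y) - EY ** 2
rho = cov / (var_x * var_y) ** 0.5

print(round(EX, 4), round(EY, 4), round(EXY, 4))  # 1.3333 0.6667 1.0
print(round(cov, 4), round(rho, 4))               # 0.1111 0.5
```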
Example 4.3
f) Without calculating the correlation coefficient, we can show that X and Y are dependent from the definition.

Intuitively, X and Y are dependent, since the values Y can take depend on the realisation of X.

In such cases, it is easiest to show dependence by choosing x and y such that x ∈ S_X and y ∈ S_Y, but (x, y) is not in the joint support, i.e. f_{X,Y}(x, y) = 0 < f_X(x)f_Y(y). For example, at (x, y) = (1, 1.5) we have f_{X,Y}(1, 1.5) = 0, but f_X(1) = 1/2 > 0 and f_Y(1.5) = 1/4 > 0.
4.2.5 Conditional Density Functions and Conditional Expectations
The conditional density function of X given Y = y, where y ∈ S_Y, is given by f_{X|Y=y}, where

f_{X|Y=y}(x) = f_{X,Y}(x, y) / f_Y(y).

The conditional density function of Y given X = x, where x ∈ S_X, is given by f_{Y|X=x}, where

f_{Y|X=x}(y) = f_{X,Y}(x, y) / f_X(x).
Conditional Expectations
The conditional expectation of g(X) given Y = y, where y ∈ S_Y, is given by E[g(X)|Y = y], where

E[g(X)|Y = y] = ∫_{−∞}^{∞} g(x) f_{X|Y=y}(x) dx.

In particular,

E[X|Y = y] = ∫_{−∞}^{∞} x f_{X|Y=y}(x) dx.

In practice, we integrate over the interval where f_{X|Y=y} is positive. The conditional expected value of g(Y) given X = x is defined analogously.
Probabilistic Regression for Continuous Random Variables
In the case of discrete random variables, the probabilistic regression of Y on X is given by a scatter plot.

In the case of continuous random variables, the probabilistic regression of Y on X is given by a curve.

The x-coordinate in this plot is given by x, the y-coordinate is given by E[Y|X = x].
The probabilistic regression of X on Y is defined analogously.

Since we are conditioning on y, the x-coordinate is given by y. The y-coordinate is given by E(X|Y = y).

A probabilistic regression curve can be interpreted in an analogous way to the interpretation of probabilistic regression for discrete random variables (see Example 4.4).
Example 4.4
For the joint distribution given in Example 4.3:

i) Calculate the conditional density function of X given Y = y. Derive E(X|Y = y).
ii) Calculate the conditional density function of Y given X = x. Derive E(Y|X = x).
Example 4.4
i) We have f_Y(y) = 1 − y/2, 0 ≤ y ≤ 2. Thus S_Y = [0, 2). Hence, for y ∈ [0, 2),

f_{X|Y=y}(x) = f_{X,Y}(x, y) / f_Y(y).

Since the joint support of X and Y is not a rectangle, we must be careful with the form of f_{X,Y}(x, y). Here, we are fixing y; there is a positive joint density if y ≤ x ≤ 2. Hence, for y ≤ x ≤ 2,

f_{X|Y=y}(x) = (1/2)/(1 − y/2) = 1/(2 − y).

Otherwise, f_{X|Y=y}(x) = 0.
Example 4.4
We have f_{X|Y=y}(x) = 1/(2 − y) for y ≤ x ≤ 2. Hence,

E[X|Y = y] = ∫_y^2 x f_{X|Y=y}(x) dx = ∫_y^2 x/(2 − y) dx = [x²/(2(2 − y))]_{x=y}^{2} = (4 − y²)/(2(2 − y)) = (2 − y)(2 + y)/(2(2 − y)) = 1 + y/2

Since this regression curve is increasing, it follows that a) X and Y are dependent, b) ρ(X,Y) > 0.
Example 4.4
ii) We have f_X(x) = x/2, 0 ≤ x ≤ 2. Thus S_X = (0, 2]. Hence, for x ∈ (0, 2],

f_{Y|X=x}(y) = f_{X,Y}(x, y) / f_X(x).

Again, we must be careful with the form of f_{X,Y}(x, y). Here, we are fixing x; there is a positive joint density if 0 ≤ y ≤ x. Hence, for 0 ≤ y ≤ x,

f_{Y|X=x}(y) = (1/2)/(x/2) = 1/x.

Otherwise, f_{Y|X=x}(y) = 0.
Example 4.4
We have f_{Y|X=x}(y) = 1/x for 0 ≤ y ≤ x. Hence,

E[Y|X = x] = ∫_0^x y f_{Y|X=x}(y) dy = ∫_0^x y/x dy = [y²/(2x)]_{y=0}^{x} = x²/(2x) = x/2

Again, since this regression curve is increasing, it follows that a) X and Y are dependent, b) ρ(X,Y) > 0.
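Both regression curves can be checked numerically: for any fixed value of the conditioning variable, a one-dimensional integral of the conditional density should reproduce 1 + y/2 and x/2 respectively. A sketch using scipy.integrate.quad:

```python
from scipy.integrate import quad

# f_{X|Y=y}(x) = 1/(2 - y) on [y, 2]; f_{Y|X=x}(y) = 1/x on [0, x].
def E_X_given_Y(y):
    val, _ = quad(lambda x: x / (2 - y), y, 2)
    return val

def E_Y_given_X(x):
    val, _ = quad(lambda y: y / x, 0, x)
    return val

print(round(E_X_given_Y(1.0), 6))  # 1.5  (= 1 + y/2)
print(round(E_Y_given_X(1.0), 6))  # 0.5  (= x/2)
```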
4.3 The Multivariate Normal Distribution
Let X = (X_1, X_2, . . . , X_k) be a vector of k random variables.

Let E[X_i] = μ_i and μ = (μ_1, μ_2, . . . , μ_k), i.e. μ is the vector of expected values.

Let Σ be a k × k matrix where the entry in row i, column j, σ_{i,j}, is given by Cov(X_i, X_j).

Σ is called the covariance matrix of the random variables X_1, X_2, . . . , X_k.
Properties of the Covariance Matrix, Σ
1. Σ is symmetric [since Cov(X_i, X_j) = Cov(X_j, X_i)].
2. σ_{i,i} = Cov(X_i, X_i) = Var(X_i) = σ_i², i.e. the i-th entry on the leading diagonal is the variance of the i-th random variable.
3. Σ is positive semi-definite, i.e. if y is any 1 × k vector, yΣyᵀ ≥ 0.
4. Dividing each element σ_{i,j} by σ_iσ_j (where σ_i is the standard deviation of the i-th variable), we obtain the correlation matrix R. ρ_{i,j} (the element in row i, column j of this matrix) gives the coefficient of correlation between X_i and X_j. Note ρ_{i,i} = 1 by definition.
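These properties are easy to illustrate numerically. A sketch with a hypothetical 3 × 3 covariance matrix (the entries are made up, chosen so that the matrix is a valid covariance matrix):

```python
import numpy as np

# A hypothetical covariance matrix for k = 3 variables (illustrative values).
Sigma = np.array([[4.0, 1.2, 0.5],
                  [1.2, 9.0, -2.0],
                  [0.5, -2.0, 1.0]])

# Property 1: symmetry.
assert np.allclose(Sigma, Sigma.T)

# Property 3: y Sigma y^T >= 0 for all y, i.e. all eigenvalues are >= 0.
print(np.all(np.linalg.eigvalsh(Sigma) >= 0))  # True

# Property 4: divide sigma_{i,j} by sigma_i sigma_j to get the correlation matrix R.
sd = np.sqrt(np.diag(Sigma))
R = Sigma / np.outer(sd, sd)
print(np.diag(R))  # [1. 1. 1.]
```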
The Form of the Multivariate Distribution
X = (X_1, X_2, . . . , X_k) has a multivariate normal distribution if and only if the joint density function is given by (in matrix form)

f_X(x) = (2π)^{−k/2} |Σ|^{−1/2} exp[ −(1/2)(x − μ)Σ^{−1}(x − μ)ᵀ ]

where x = (x_1, x_2, . . . , x_k) and |Σ| is the determinant of the covariance matrix.

If X = (X_1, X_2, . . . , X_k) has a multivariate normal distribution, then any linear combination of the X_i has a (univariate) normal distribution. In particular, each X_i has a (univariate) normal distribution.
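The matrix form of the density can be checked against a library implementation, e.g. scipy.stats.multivariate_normal. A sketch with hypothetical parameter values:

```python
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([0.0, 1.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

def density(x, mu, Sigma):
    """f(x) = (2 pi)^{-k/2} |Sigma|^{-1/2} exp[-(1/2)(x - mu) Sigma^{-1} (x - mu)^T]."""
    k = len(mu)
    d = x - mu
    quad_form = d @ np.linalg.inv(Sigma) @ d
    return (2 * np.pi) ** (-k / 2) * np.linalg.det(Sigma) ** -0.5 * np.exp(-0.5 * quad_form)

x = np.array([0.5, 0.5])
print(np.isclose(density(x, mu, Sigma), multivariate_normal(mu, Sigma).pdf(x)))  # True
```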
The Bivariate Normal distribution
In this case k = 2, X = [X_1, X_2], μ = [μ_1, μ_2]. The covariance matrix is given by

Σ = [ σ_1²           Cov(X_1,X_2) ]   [ σ_1²     ρσ_1σ_2 ]
    [ Cov(X_1,X_2)   σ_2²         ] = [ ρσ_1σ_2  σ_2²    ],

where σ_i is the standard deviation of X_i and ρ is the coefficient of correlation between X_1 and X_2. The correlation matrix is given by

R = [ 1  ρ ]
    [ ρ  1 ].
The Bivariate Normal distribution
In this case, the determinant of the covariance matrix is given by

|Σ| = σ_1²σ_2² − ρ²σ_1²σ_2² = σ_1²σ_2²(1 − ρ²).

Hence, |Σ|^{1/2} = σ_1σ_2√(1 − ρ²).
The Bivariate Normal distribution
Also,

Σ^{−1} = 1/(σ_1²σ_2²(1 − ρ²)) [ σ_2²      −ρσ_1σ_2 ]
                               [ −ρσ_1σ_2  σ_1²     ]

It follows that

−(1/2)(x − μ)Σ^{−1}(x − μ)ᵀ = −(1/2)(x_1 − μ_1, x_2 − μ_2)Σ^{−1}(x_1 − μ_1, x_2 − μ_2)ᵀ

= −1/(2(1 − ρ²)) [ (x_1 − μ_1)²/σ_1² + (x_2 − μ_2)²/σ_2² − 2ρ(x_1 − μ_1)(x_2 − μ_2)/(σ_1σ_2) ]
The Bivariate Normal distribution
Hence, the joint density function of X = [X_1, X_2] is given by

f_X(x_1, x_2) = 1/(2πσ_1σ_2√(1 − ρ²)) exp{ −1/(2(1 − ρ²)) [ (x_1 − μ_1)²/σ_1² + (x_2 − μ_2)²/σ_2² − 2ρ(x_1 − μ_1)(x_2 − μ_2)/(σ_1σ_2) ] }.
The Bivariate Normal distribution
Note that if ρ = 0, then

f_X(x_1, x_2) = 1/(2πσ_1σ_2) exp{ −(1/2)[ (x_1 − μ_1)²/σ_1² + (x_2 − μ_2)²/σ_2² ] } = f_{X_1}(x_1) f_{X_2}(x_2).

Hence, it follows that if [X_1, X_2] has a bivariate normal distribution, then ρ = 0 is a necessary and sufficient condition for X_1 and X_2 to be independent. Note that, in general, in order to prove independence it is not sufficient to show that ρ = 0 (see Example 4.2).
Example 4.5
Suppose the height of males has a normal distribution with mean 170 cm and standard deviation 15 cm, and the correlation between the heights of a father and son is 0.6.

1. Derive the joint distribution of the heights of two males chosen at random.
2. Calculate the probability that the average height of two such individuals is greater than 179 cm.
3. Assuming the joint distribution of the heights of father and son is a bivariate normal distribution, derive the joint distribution of the heights of a father and son.
4. Calculate the probability that the average height of a father and son is greater than 179 cm.
Example 4.5
1. Since the individuals are picked at random, we may assume that the two heights are independent. The univariate density function of the i-th height (i = 1, 2) is

f_{X_i}(x_i) = 1/(σ_i√(2π)) exp[ −(x_i − μ_i)²/(2σ_i²) ] = 1/(15√(2π)) exp[ −(x_i − 170)²/(2 · 15²) ].

The joint density function of the two heights is the product of these density functions, i.e.

f_{X_1,X_2}(x_1, x_2) = f_{X_1}(x_1) f_{X_2}(x_2) = 1/(2π · 15²) exp[ −((x_1 − 170)² + (x_2 − 170)²)/(2 · 15²) ].
Example 4.5
2. Note that the sum of the two heights also has a normal distribution. Let S = X_1 + X_2. E[S] = E[X_1] + E[X_2] = 340.

Since X_1 and X_2 are independent,

Var(S) = Var(X_1) + Var(X_2) = 2 · 15² = 450.

We must calculate

P([X_1 + X_2]/2 > 179) = P(X_1 + X_2 > 358) = P(S > 358).
By standardising,

P(S > 358) = P(Z > (358 − 340)/√450) = P(Z > 0.85) = 0.1977.
Example 4.5
3. The joint distribution of the heights of a father and his son is given by

f_X(x_1, x_2) = 1/(2πσ_1σ_2√(1 − ρ²)) exp{ −1/(2(1 − ρ²)) [ (x_1 − μ_1)²/σ_1² + (x_2 − μ_2)²/σ_2² − 2ρ(x_1 − μ_1)(x_2 − μ_2)/(σ_1σ_2) ] }

= 1/(360π) exp{ −1/(1.28 · 225) [ (x_1 − 170)² + (x_2 − 170)² − 1.2(x_1 − 170)(x_2 − 170) ] }.
Example 4.5
4. If the joint distribution of the heights of father and son is bivariate normal, then the sum of their heights has a normal distribution. In this case, E[S] = E[X_1] + E[X_2] = 340 and

Var(S) = Var(X_1) + Var(X_2) + 2Cov(X_1, X_2).

We have

ρ = Cov(X_1, X_2)/(σ_1σ_2)  ⇒  Cov(X_1, X_2) = ρσ_1σ_2 = 0.6 · 15 · 15 = 135.
Example 4.5
It follows that

Var(S) = Var(X_1) + Var(X_2) + 2Cov(X_1, X_2) = 225 + 225 + 2 · 135 = 720.

By standardising, we obtain

P(S > 358) = P( (S − E(S))/σ_S > (358 − 340)/√720 ) = P(Z > 0.67) = 0.2514.
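Both probabilities in this example can be reproduced with scipy.stats.norm, using the exact standardised values rather than a z-value rounded to two decimal places (hence the small discrepancy with the table answer 0.2514):

```python
from math import sqrt
from scipy.stats import norm

mu, sigma, rho = 170, 15, 0.6

# Part 2: independent heights, S = X1 + X2 ~ N(340, 450).
var_indep = 2 * sigma ** 2
p2 = norm.sf(358, loc=2 * mu, scale=sqrt(var_indep))
print(round(p2, 4))  # about 0.198

# Part 4: correlated heights, Var(S) = 225 + 225 + 2 * 135 = 720.
cov = rho * sigma * sigma
var_corr = 2 * sigma ** 2 + 2 * cov
p4 = norm.sf(358, loc=2 * mu, scale=sqrt(var_corr))
print(round(p4, 4))  # about 0.2512
```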