variance and covariance

President University Erwin Sitompul PBST 5/1 Dr.-Ing. Erwin Sitompul President University Lecture 5 Probability and Statistics http://zitompul.wordpress.com

Upload: ulf

Post on 25-Feb-2016




Chapter 4.2 Variance and Covariance

The mean or expected value of a random variable X is important because it describes the center of the probability distribution.

However, the mean does not give an adequate description of the shape and variability of the distribution.

The most important measure of variability of a random variable X is obtained by letting $g(X) = (X - \mu)^2$.

This variability measure is referred to as the variance of the random variable X or the variance of the probability distribution of X. It is denoted by Var(X) or the symbol $\sigma_X^2$, or simply by $\sigma^2$.

[Figure: Distributions with equal means but different dispersions (variability)]


Let X be a random variable with probability distribution f(x) and mean μ. The variance of X is

$$\sigma^2 = E[(X - \mu)^2] = \sum_x (x - \mu)^2 f(x)$$

if X is discrete, and

$$\sigma^2 = E[(X - \mu)^2] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$$

if X is continuous.

The positive square root of the variance, σ, is called the standard deviation of X.


Let the random variable X represent the number of cars that are used for official business purposes on any given workday. The probability distributions for company A and company B are

Company A:
x       1     2     3
f(x)    0.3   0.4   0.3

Company B:
x       0     1     2     3     4
f(x)    0.2   0.1   0.3   0.3   0.1

Show that the variance of the probability distribution for company B is greater than that of company A.

Company A:

$$\mu = E(X) = (1)(0.3) + (2)(0.4) + (3)(0.3) = 2$$

$$\sigma^2 = \sum_{x=1}^{3} (x - 2)^2 f(x) = (1 - 2)^2(0.3) + (2 - 2)^2(0.4) + (3 - 2)^2(0.3) = 0.6$$

Company B:

$$\mu = E(X) = (0)(0.2) + (1)(0.1) + (2)(0.3) + (3)(0.3) + (4)(0.1) = 2$$

$$\sigma^2 = \sum_{x=0}^{4} (x - 2)^2 f(x) = (0 - 2)^2(0.2) + (1 - 2)^2(0.1) + (2 - 2)^2(0.3) + (3 - 2)^2(0.3) + (4 - 2)^2(0.1) = 1.6$$

Clearly, the variance of the number of cars that are used for official business purposes is greater for company B than for company A.
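The two variances can be checked by evaluating the definition directly. The following is a minimal Python sketch, not part of the lecture: the helper names `mean` and `variance` and the dictionary layout `{x: f(x)}` are illustrative choices, and the probabilities are the ones that appear in the E(X) computations of this example.

```python
from fractions import Fraction


def mean(pmf):
    """Expected value of a discrete random variable given as {x: f(x)}."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """Variance by definition: sum of (x - mu)^2 * f(x)."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


# Distributions of the car example, written as exact fractions.
company_a = {1: Fraction(3, 10), 2: Fraction(4, 10), 3: Fraction(3, 10)}
company_b = {0: Fraction(2, 10), 1: Fraction(1, 10), 2: Fraction(3, 10),
             3: Fraction(3, 10), 4: Fraction(1, 10)}

var_a = variance(company_a)
var_b = variance(company_b)
print(float(var_a), float(var_b))  # 0.6 1.6
```

Exact fractions are used so that the comparison with the hand computation is not blurred by floating-point rounding.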


The variance of a random variable X is also given by

$$\sigma^2 = E(X^2) - \mu^2$$

Let the random variable X represent the number of defective parts for a machine when 3 parts are sampled from a production line and tested. The following is the probability distribution of X:

x       0      1      2      3
f(x)    0.51   0.38   0.10   0.01

Calculate the variance $\sigma^2$.

$$\mu = E(X) = \sum_{x=0}^{3} x f(x) = (0)(0.51) + (1)(0.38) + (2)(0.10) + (3)(0.01) = 0.61$$

$$E(X^2) = \sum_{x=0}^{3} x^2 f(x) = (0)^2(0.51) + (1)^2(0.38) + (2)^2(0.10) + (3)^2(0.01) = 0.87$$

$$\sigma^2 = E(X^2) - \mu^2 = 0.87 - (0.61)^2 = 0.4979$$
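As a cross-check, the shortcut formula and the definition of variance can be compared on the defective-parts distribution. This is a small Python sketch under the assumption that the probabilities are 0.51, 0.38, 0.10 and 0.01 for x = 0, 1, 2, 3, as used in the computation of E(X); the variable names are illustrative.

```python
from fractions import Fraction

# Defective-parts distribution: x = 0..3 with f(x) = 0.51, 0.38, 0.10, 0.01.
pmf = {0: Fraction(51, 100), 1: Fraction(38, 100),
       2: Fraction(10, 100), 3: Fraction(1, 100)}

mu = sum(x * p for x, p in pmf.items())         # E(X)
ex2 = sum(x ** 2 * p for x, p in pmf.items())   # E(X^2)

var_shortcut = ex2 - mu ** 2                                     # E(X^2) - mu^2
var_definition = sum((x - mu) ** 2 * p for x, p in pmf.items())  # E[(X - mu)^2]

print(float(mu), float(ex2), float(var_shortcut))  # 0.61 0.87 0.4979
```

Both routes give the same value; the shortcut only needs the first two moments, which is why it is usually the quicker hand computation.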


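The weekly demand for a drinking-water product, in thousands of liters, from a local chain of efficiency stores, is a continuous random variable X having the probability density

$$f(x) = \begin{cases} 2(x - 1), & 1 < x < 2 \\ 0, & \text{elsewhere} \end{cases}$$

Find the mean and variance of X.

$$\mu = E(X) = 2\int_{1}^{2} x(x - 1)\,dx = 2\left[\frac{x^3}{3} - \frac{x^2}{2}\right]_{1}^{2} = \frac{5}{3}$$

$$E(X^2) = 2\int_{1}^{2} x^2(x - 1)\,dx = 2\left[\frac{x^4}{4} - \frac{x^3}{3}\right]_{1}^{2} = \frac{17}{6}$$

$$\sigma^2 = E(X^2) - \mu^2 = \frac{17}{6} - \left(\frac{5}{3}\right)^2 = \frac{1}{18}$$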
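Because the density in this example is a polynomial, its moments can be integrated exactly term by term. The sketch below is one possible way to do that in Python; `integrate_poly` and `moment` are hypothetical helper names introduced here, and the density is entered as the coefficient list of 2(x - 1) = -2 + 2x.

```python
from fractions import Fraction


def integrate_poly(coeffs, a, b):
    """Exact integral over [a, b] of sum(coeffs[k] * x**k)."""
    a, b = Fraction(a), Fraction(b)
    return sum(Fraction(c) * (b ** (k + 1) - a ** (k + 1)) / (k + 1)
               for k, c in enumerate(coeffs))


def moment(density, n, a, b):
    """E(X^n) for a polynomial density on [a, b]: multiply by x^n, then integrate."""
    shifted = [0] * n + list(density)
    return integrate_poly(shifted, a, b)


# f(x) = 2(x - 1) = -2 + 2x on 1 < x < 2, coefficients in increasing power.
density = [-2, 2]

total = moment(density, 0, 1, 2)   # should be 1 for a valid density
mu = moment(density, 1, 1, 2)      # E(X)
ex2 = moment(density, 2, 1, 2)     # E(X^2)
var = ex2 - mu ** 2

print(total, mu, ex2, var)  # 1 5/3 17/6 1/18
```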


Let X be a random variable with probability distribution f(x). The variance of the random variable g(X) is

$$\sigma_{g(X)}^2 = E\{[g(X) - \mu_{g(X)}]^2\} = \sum_x [g(x) - \mu_{g(X)}]^2 f(x)$$

if X is discrete, and

$$\sigma_{g(X)}^2 = E\{[g(X) - \mu_{g(X)}]^2\} = \int_{-\infty}^{\infty} [g(x) - \mu_{g(X)}]^2 f(x)\,dx$$

if X is continuous.


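Calculate the variance of g(X) = 2X + 3, where X is a random variable with probability distribution given as

x       0     1     2     3
f(x)    1/4   1/8   1/2   1/8

$$\mu_{2X+3} = E(2X + 3) = \sum_{x=0}^{3} (2x + 3) f(x) = (3)\frac{1}{4} + (5)\frac{1}{8} + (7)\frac{1}{2} + (9)\frac{1}{8} = 6$$

$$\sigma_{2X+3}^2 = E\{[(2X + 3) - \mu_{2X+3}]^2\} = E\{[(2X + 3) - 6]^2\} = E[4X^2 - 12X + 9]$$

$$= \sum_{x=0}^{3} (4x^2 - 12x + 9) f(x) = (9)\frac{1}{4} + (1)\frac{1}{8} + (1)\frac{1}{2} + (9)\frac{1}{8} = 4$$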
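The same computation can be written generically for any function g. The following Python sketch assumes the probabilities 1/4, 1/8, 1/2, 1/8 for x = 0, 1, 2, 3 that appear in the sums of this example; `variance_of` is an illustrative name, not something defined in the lecture.

```python
from fractions import Fraction


def variance_of(g, pmf):
    """Variance of g(X): E{[g(X) - mu_g]^2} for a discrete pmf {x: f(x)}."""
    mu_g = sum(g(x) * p for x, p in pmf.items())
    return sum((g(x) - mu_g) ** 2 * p for x, p in pmf.items())


pmf = {0: Fraction(1, 4), 1: Fraction(1, 8), 2: Fraction(1, 2), 3: Fraction(1, 8)}

var_g = variance_of(lambda x: 2 * x + 3, pmf)   # variance of 2X + 3
var_x = variance_of(lambda x: x, pmf)           # variance of X itself
print(var_g, var_x)  # 4 1
```

The variance of 2X + 3 comes out as four times the variance of X: the added constant 3 has no effect and the factor 2 enters squared.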


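Let X be a random variable with density function

$$f(x) = \begin{cases} \dfrac{x^2}{3}, & -1 < x < 2 \\ 0, & \text{elsewhere} \end{cases}$$

Find the variance of the random variable g(X) = 4X + 3 if it is known that the expected value of g(X) = 8.

$$\sigma_{4X+3}^2 = E\{[(4X + 3) - 8]^2\} = E[16X^2 - 40X + 25]$$

$$= \int_{-1}^{2} (16x^2 - 40x + 25) f(x)\,dx = \int_{-1}^{2} (16x^2 - 40x + 25)\frac{x^2}{3}\,dx$$

$$= \frac{1}{3}\int_{-1}^{2} (16x^4 - 40x^3 + 25x^2)\,dx = \frac{1}{3}\left[\frac{16x^5}{5} - \frac{40x^4}{4} + \frac{25x^3}{3}\right]_{-1}^{2}$$

$$= \frac{1}{3}\left[\frac{136}{15} - \left(-\frac{323}{15}\right)\right] = \frac{459}{45} = \frac{51}{5}$$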


Let X and Y be random variables with joint probability distribution f(x, y). The covariance of the random variables X and Y is

$$\sigma_{XY} = E[(X - \mu_X)(Y - \mu_Y)] = \sum_x \sum_y (x - \mu_X)(y - \mu_Y) f(x, y)$$

if X and Y are discrete, and

$$\sigma_{XY} = E[(X - \mu_X)(Y - \mu_Y)] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x - \mu_X)(y - \mu_Y) f(x, y)\,dx\,dy$$

if X and Y are continuous.

[Figure: σXY > 0, positive correlation; σXY < 0, negative correlation]


The covariance of two random variables X and Y with means μX and μY, respectively, is given by

$$\sigma_{XY} = E(XY) - \mu_X \mu_Y$$


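Referring back to the “ballpoint pens” example, find the covariance of X and Y.

$$\mu_X = E(X) = \sum_{x=0}^{2}\sum_{y=0}^{2} x f(x, y) = \sum_{x=0}^{2} x\,g(x) = (0)\frac{5}{14} + (1)\frac{15}{28} + (2)\frac{3}{28} = \frac{3}{4}$$

$$\mu_Y = E(Y) = \sum_{x=0}^{2}\sum_{y=0}^{2} y f(x, y) = \sum_{y=0}^{2} y\,h(y) = (0)\frac{15}{28} + (1)\frac{3}{7} + (2)\frac{1}{28} = \frac{1}{2}$$

$$\sigma_{XY} = E(XY) - \mu_X \mu_Y = \frac{3}{14} - \left(\frac{3}{4}\right)\left(\frac{1}{2}\right) = -\frac{9}{56}$$

See again Lecture 4.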
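The covariance formula can be evaluated mechanically from a joint table. The joint distribution of the pens example is not reproduced in this lecture, so the table in the sketch below is an assumption: its entries were chosen so that the marginals match the g(x) and h(y) values used above (5/14, 15/28, 3/28 and 15/28, 3/7, 1/28) and so that E(XY) = 3/14. The helper name `expect` is likewise illustrative.

```python
from fractions import Fraction

# ASSUMPTION: joint table of the "ballpoint pens" example; it is not shown in
# this lecture, so these values are chosen to match the marginals g(x), h(y).
joint = {
    (0, 0): Fraction(3, 28), (1, 0): Fraction(9, 28), (2, 0): Fraction(3, 28),
    (0, 1): Fraction(3, 14), (1, 1): Fraction(3, 14),
    (0, 2): Fraction(1, 28),
}


def expect(func, joint_pmf):
    """E[func(X, Y)] for a discrete joint pmf {(x, y): f(x, y)}."""
    return sum(func(x, y) * p for (x, y), p in joint_pmf.items())


mu_x = expect(lambda x, y: x, joint)
mu_y = expect(lambda x, y: y, joint)
e_xy = expect(lambda x, y: x * y, joint)
cov = e_xy - mu_x * mu_y

print(mu_x, mu_y, e_xy, cov)  # 3/4 1/2 3/14 -9/56
```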


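The fraction X of male runners and the fraction Y of female runners who compete in marathon races are described by the joint density function

$$f(x, y) = \begin{cases} 8xy, & 0 \le y \le x \le 1 \\ 0, & \text{elsewhere} \end{cases}$$

Find the covariance of X and Y.

The marginal densities are

$$g(x) = \begin{cases} 4x^3, & 0 \le x \le 1 \\ 0, & \text{elsewhere} \end{cases} \qquad h(y) = \begin{cases} 4y(1 - y^2), & 0 \le y \le 1 \\ 0, & \text{elsewhere} \end{cases}$$

$$\mu_X = E(X) = \int_{0}^{1} 4x^4\,dx = \frac{4}{5}$$

$$\mu_Y = E(Y) = \int_{0}^{1} 4y^2(1 - y^2)\,dy = \frac{8}{15}$$

$$E(XY) = \int_{0}^{1}\int_{y}^{1} 8x^2y^2\,dx\,dy = \frac{4}{9}$$

$$\sigma_{XY} = E(XY) - \mu_X \mu_Y = \frac{4}{9} - \left(\frac{4}{5}\right)\left(\frac{8}{15}\right) = \frac{4}{225}$$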
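For this density every mixed moment has a simple closed form, which gives a compact check of the three integrals. The closed form in the sketch below is derived here (it is not stated in the lecture), assuming the region of support is 0 ≤ y ≤ x ≤ 1: integrating 8 x^(a+1) y^(b+1) over y from 0 to x and then over x from 0 to 1 gives 8 / ((b + 2)(a + b + 4)). The function name `joint_moment` is illustrative.

```python
from fractions import Fraction


def joint_moment(a, b):
    """E(X^a Y^b) for f(x, y) = 8xy on 0 <= y <= x <= 1.

    Integrating y from 0 to x and then x from 0 to 1 gives the closed form
    8 / ((b + 2) * (a + b + 4)).
    """
    return Fraction(8, (b + 2) * (a + b + 4))


mu_x = joint_moment(1, 0)
mu_y = joint_moment(0, 1)
e_xy = joint_moment(1, 1)
cov = e_xy - mu_x * mu_y
print(mu_x, mu_y, e_xy, cov)  # 4/5 8/15 4/9 4/225
```

Setting a = b = 0 returns 1, which confirms that the density integrates to one over the triangular region.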


Although the covariance between two random variables does provide information regarding the nature of the relationship, the magnitude of σXY does not indicate anything regarding the strength of the relationship, since σXY is not scale free.

This means that its magnitude depends on the units in which X and Y are measured.

There is a scale-free version of the covariance, called the correlation coefficient, which is used widely in statistics.

Let X and Y be random variables with covariance σXY and standard deviations σX and σY, respectively. The correlation coefficient of X and Y is

$$\rho_{XY} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}$$
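The difference between the two measures shows up as soon as the units of X change. The Python sketch below uses a small hypothetical joint distribution (the four probabilities are made up for illustration and do not come from the lecture) and rescales X by a factor of 100, as would happen when converting meters to centimeters; `cov_and_corr` is an illustrative helper name.

```python
from fractions import Fraction
from math import sqrt

# ASSUMPTION: a small hypothetical joint pmf {(x, y): f(x, y)}, not from the lecture.
joint = {(0, 0): Fraction(1, 4), (1, 0): Fraction(1, 4),
         (1, 1): Fraction(1, 4), (2, 1): Fraction(1, 4)}


def cov_and_corr(joint_pmf, scale_x=1):
    """Return (covariance, correlation) of the pair (scale_x * X, Y)."""
    def expect(func):
        return sum(func(scale_x * x, y) * p for (x, y), p in joint_pmf.items())

    mu_x, mu_y = expect(lambda x, y: x), expect(lambda x, y: y)
    cov = expect(lambda x, y: x * y) - mu_x * mu_y
    var_x = expect(lambda x, y: x * x) - mu_x ** 2
    var_y = expect(lambda x, y: y * y) - mu_y ** 2
    rho = float(cov) / sqrt(float(var_x) * float(var_y))
    return cov, rho


cov_1, rho_1 = cov_and_corr(joint)                    # X in its original units
cov_100, rho_100 = cov_and_corr(joint, scale_x=100)   # X rescaled by 100
print(cov_1, cov_100)                                 # 1/4 25
print(round(rho_1, 4), round(rho_100, 4))
```

The covariance grows by the same factor of 100 as X, while the correlation coefficient is unchanged.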


Chapter 4.3 Means and Variances of Linear Combinations of Random Variables

Means of Linear Combinations of X

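If a and b are constants, then

$$E(aX + b) = aE(X) + b$$

Applying the theorem to the discrete random variable g(X) = 2X – 1, rework the carwash example.

$$E(2X - 1) = 2E(X) - 1$$

$$\mu_X = E(X) = \sum_{x=4}^{9} x f(x) = (4)\frac{1}{12} + (5)\frac{1}{12} + (6)\frac{1}{4} + (7)\frac{1}{4} + (8)\frac{1}{6} + (9)\frac{1}{6} = \frac{41}{6}$$

$$\mu_{2X-1} = 2\mu_X - 1 = 2\left(\frac{41}{6}\right) - 1 = \$12.67$$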
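The theorem can be confirmed on the carwash numbers by computing E(2X - 1) term by term and comparing it with 2E(X) - 1. The sketch below assumes the probabilities 1/12, 1/12, 1/4, 1/4, 1/6, 1/6 for x = 4, ..., 9 that appear in the sum for E(X); the helper name `expect` is illustrative.

```python
from fractions import Fraction

# Carwash distribution: x = 4..9 with the probabilities used in E(X) above.
pmf = {4: Fraction(1, 12), 5: Fraction(1, 12), 6: Fraction(1, 4),
       7: Fraction(1, 4), 8: Fraction(1, 6), 9: Fraction(1, 6)}


def expect(g, pmf):
    """E[g(X)] for a discrete pmf {x: f(x)}."""
    return sum(g(x) * p for x, p in pmf.items())


a, b = 2, -1
direct = expect(lambda x: a * x + b, pmf)        # E(aX + b) computed term by term
by_theorem = a * expect(lambda x: x, pmf) + b    # a E(X) + b
print(direct, by_theorem, round(float(direct), 2))  # 38/3 38/3 12.67
```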


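Let X be a random variable with density function

$$f(x) = \begin{cases} \dfrac{x^2}{3}, & -1 < x < 2 \\ 0, & \text{elsewhere} \end{cases}$$

Find the expected value of g(X) = 4X + 3 by using the theorem just presented.

$$E(4X + 3) = 4E(X) + 3$$

$$E(X) = \int_{-1}^{2} x\left(\frac{x^2}{3}\right)dx = \int_{-1}^{2} \frac{x^3}{3}\,dx = \frac{5}{4}$$

$$E(4X + 3) = 4\left(\frac{5}{4}\right) + 3 = 8$$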


The expected value of the sum or difference of two or more functions of a random variable X is the sum or difference of the expected values of the functions. That is

$$E[g(X) \pm h(X)] = E[g(X)] \pm E[h(X)]$$

Let X be a random variable with probability distribution as given next. Find the expected value of $Y = (X - 1)^2$.

x       0     1     2     3
f(x)    1/3   1/2   0     1/6

$$E[(X - 1)^2] = E[X^2 - 2X + 1] = E(X^2) - 2E(X) + E(1)$$

$$E(X) = (0)\frac{1}{3} + (1)\frac{1}{2} + (2)(0) + (3)\frac{1}{6} = 1$$

$$E(X^2) = (0)^2\frac{1}{3} + (1)^2\frac{1}{2} + (2)^2(0) + (3)^2\frac{1}{6} = 2$$

$$E[(X - 1)^2] = 2 - (2)(1) + 1 = 1$$


The weekly demand for a certain drink, in thousands of liters, at a chain of convenience stores is a continuous random variable $g(X) = X^2 + X - 2$, where X has the density function

$$f(x) = \begin{cases} 2(x - 1), & 1 < x < 2 \\ 0, & \text{elsewhere} \end{cases}$$

Find the expected value for the weekly demand of the drink.

$$E(X^2 + X - 2) = E(X^2) + E(X) - E(2)$$

$$E(X) = \int_{1}^{2} 2x(x - 1)\,dx = 2\int_{1}^{2} (x^2 - x)\,dx = \frac{5}{3}$$

$$E(X^2) = \int_{1}^{2} 2x^2(x - 1)\,dx = 2\int_{1}^{2} (x^3 - x^2)\,dx = \frac{17}{6}$$

$$E(X^2 + X - 2) = \frac{17}{6} + \frac{5}{3} - 2 = \frac{5}{2}$$


The expected value of the sum or difference of two or more functions of the random variables X and Y is the sum or difference of the expected values of the functions. That is

$$E[g(X, Y) \pm h(X, Y)] = E[g(X, Y)] \pm E[h(X, Y)]$$

Let X and Y be two independent random variables. Then

$$E(XY) = E(X)E(Y)$$


In producing gallium-arsenide microchips, it is known that the ratio between gallium and arsenide is independent of producing a high percentage of workable wafers, which are the main components of microchips. Let X denote the ratio of gallium to arsenide and Y denote the percentage of workable microwafers retrieved during a 1-hour period. X and Y are independent random variables with joint density

$$f(x, y) = \begin{cases} \dfrac{x(1 + 3y^2)}{4}, & 0 < x < 2,\ 0 < y < 1 \\ 0, & \text{elsewhere} \end{cases}$$

Illustrate that E(XY) = E(X)E(Y).

$$E(XY) = \int_{0}^{1}\int_{0}^{2} xy f(x, y)\,dx\,dy = \int_{0}^{1}\int_{0}^{2} \frac{x^2 y(1 + 3y^2)}{4}\,dx\,dy$$

$$= \int_{0}^{1} \left[\frac{x^3 y(1 + 3y^2)}{12}\right]_{x=0}^{x=2} dy = \int_{0}^{1} \frac{2y(1 + 3y^2)}{3}\,dy = \frac{5}{6}$$


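$$E(X) = \int_{0}^{1}\int_{0}^{2} x f(x, y)\,dx\,dy = \int_{0}^{1}\int_{0}^{2} \frac{x^2(1 + 3y^2)}{4}\,dx\,dy$$

$$= \int_{0}^{1} \left[\frac{x^3(1 + 3y^2)}{12}\right]_{x=0}^{x=2} dy = \int_{0}^{1} \frac{2(1 + 3y^2)}{3}\,dy = \frac{4}{3}$$

$$E(Y) = \int_{0}^{1}\int_{0}^{2} y f(x, y)\,dx\,dy = \int_{0}^{1}\int_{0}^{2} \frac{xy(1 + 3y^2)}{4}\,dx\,dy$$

$$= \int_{0}^{1} \left[\frac{x^2 y(1 + 3y^2)}{8}\right]_{x=0}^{x=2} dy = \int_{0}^{1} \frac{y(1 + 3y^2)}{2}\,dy = \frac{5}{8}$$

Hence, it is shown that

$$E(XY) = E(X)E(Y): \qquad \frac{5}{6} = \left(\frac{4}{3}\right)\left(\frac{5}{8}\right)$$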
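The three double integrals of the microchip example can be verified exactly, because the integrand always factors into a polynomial in x times a polynomial in y. The Python sketch below relies on that factorization; `integrate_poly` and `joint_moment` are hypothetical helper names, and the density is assumed to be x(1 + 3y^2)/4 on 0 < x < 2, 0 < y < 1.

```python
from fractions import Fraction


def integrate_poly(coeffs, a, b):
    """Exact integral over [a, b] of sum(coeffs[k] * x**k)."""
    a, b = Fraction(a), Fraction(b)
    return sum(Fraction(c) * (b ** (k + 1) - a ** (k + 1)) / (k + 1)
               for k, c in enumerate(coeffs))


def joint_moment(m, n):
    """E(X^m Y^n) for f(x, y) = x(1 + 3y^2)/4 on 0 < x < 2, 0 < y < 1.

    The integrand x^m y^n f(x, y) factors into an x part and a y part,
    so the double integral is a product of two single integrals.
    """
    x_part = integrate_poly([0] * (m + 1) + [1], 0, 2)    # x^(m+1)
    y_part = integrate_poly([0] * n + [1, 0, 3], 0, 1)    # y^n (1 + 3y^2)
    return Fraction(1, 4) * x_part * y_part


e_x, e_y, e_xy = joint_moment(1, 0), joint_moment(0, 1), joint_moment(1, 1)
print(e_x, e_y, e_xy, e_xy == e_x * e_y)  # 4/3 5/8 5/6 True
```

The fact that the density itself factors into a function of x times a function of y is what makes X and Y independent here, and it is the same property the code exploits.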


If a and b are constants, then

$$\sigma_{aX+b}^2 = a^2\sigma_X^2 = a^2\sigma^2$$

If X and Y are random variables with joint probability distribution f(x, y), then

$$\sigma_{aX+bY}^2 = a^2\sigma_X^2 + b^2\sigma_Y^2 + 2ab\,\sigma_{XY}$$


If X and Y are random variables with variances $\sigma_X^2 = 2$, $\sigma_Y^2 = 4$, and covariance σXY = –2, find the variance of the random variable Z = 3X – 4Y + 8.

$$\sigma_Z^2 = \sigma_{3X-4Y+8}^2 = \sigma_{3X-4Y}^2 = 9\sigma_X^2 + 16\sigma_Y^2 - 24\sigma_{XY} = (9)(2) + (16)(4) - (24)(-2) = 130$$

Let X and Y denote the amount of two different types of impurities in a batch of a certain chemical product. Suppose that X and Y are independent random variables with variances $\sigma_X^2 = 2$ and $\sigma_Y^2 = 3$. Find the variance of the random variable Z = 3X – 2Y + 5.

$$\sigma_Z^2 = \sigma_{3X-2Y+5}^2 = \sigma_{3X-2Y}^2 = 9\sigma_X^2 + 4\sigma_Y^2 = (9)(2) + (4)(3) = 30$$
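Both results follow from the same formula for the variance of aX + bY, with the covariance term dropping out in the independent case. A minimal Python sketch, with `var_linear_combination` as an illustrative function name and the covariance defaulting to zero:

```python
def var_linear_combination(a, b, var_x, var_y, cov_xy=0):
    """Variance of aX + bY + c; the additive constant c does not matter."""
    return a ** 2 * var_x + b ** 2 * var_y + 2 * a * b * cov_xy


# Z = 3X - 4Y + 8 with var_x = 2, var_y = 4, cov_xy = -2.
print(var_linear_combination(3, -4, 2, 4, -2))  # 130
# Z = 3X - 2Y + 5 with independent X, Y (covariance 0), var_x = 2, var_y = 3.
print(var_linear_combination(3, -2, 2, 3))      # 30
```

Note that the sign of b matters only through the cross term 2ab σXY: with b = -4 and a negative covariance, the cross term is positive and increases the variance.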


Chapter 4.4 Chebyshev’s Theorem

As we already discussed, the variance of a random variable tells us something about the variability of the observations about the mean.

If a variable has a small variance or standard deviation, we would expect most of the values to be grouped around the mean.

The probability that a random variable assumes a value within a certain interval about the mean is greater in this case.

If we think of probability in terms of area, we would expect a continuous distribution with a small standard deviation to have most of its area close to μ.

[Figure: Variability of continuous observations about the mean]


We can argue the same way for a discrete distribution. A more spread-out area in the probability histogram indicates a more variable distribution of measurements or outcomes.

[Figure: Variability of discrete observations about the mean]


The Russian mathematician P. L. Chebyshev discovered that the fraction of the area between any two values symmetric about the mean is related to the standard deviation.

Chebyshev’s Theorem: The probability that any random variable X will assume a value within k standard deviations of the mean is at least $1 - 1/k^2$. That is

$$P(\mu - k\sigma < X < \mu + k\sigma) \ge 1 - \frac{1}{k^2}$$

Chebyshev’s Theorem holds for any distribution of observations and, for this reason, the results are usually weak.

The value given by the theorem is a lower bound only. Exact probabilities can only be determined when the probability distribution is known.

The use of Chebyshev’s Theorem is relegated to situations where the form of the distribution is unknown.
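To see how conservative the bound can be when the distribution is actually known, it can be compared with the exact probability for a normal distribution. The Python sketch below is an illustration added here, not part of the lecture; it uses the standard identity that a normal variable lies within k standard deviations of its mean with probability erf(k/√2), and the function names are illustrative.

```python
from math import erf, sqrt


def chebyshev_lower_bound(k):
    """Chebyshev's guaranteed minimum of P(|X - mu| < k*sigma), for k > 0."""
    return 1 - 1 / k ** 2


def normal_within(k):
    """Exact P(|X - mu| < k*sigma) when X is normally distributed."""
    return erf(k / sqrt(2))


for k in (2, 3, 4):
    # k, the distribution-free bound, and the exact normal probability.
    print(k, round(chebyshev_lower_bound(k), 4), round(normal_within(k), 4))
```

For k = 2 the bound guarantees only 0.75, while the normal distribution places about 0.95 of its probability in the same interval.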


[Figure: Chebyshev’s Theorem and Normal Distribution]


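A random variable X has a mean μ = 8, a variance $\sigma^2 = 9$, and an unknown probability distribution. Find
(a) P(–4 < X < 20)
(b) P(|X – 8| ≥ 6)

(a) With σ = 3, the interval is μ ± 4σ:

$$P(-4 < X < 20) = P[8 - (4)(3) < X < 8 + (4)(3)] \ge 1 - \frac{1}{4^2} = \frac{15}{16}$$

(b) The event is the complement of the interval μ ± 2σ:

$$P(|X - 8| \ge 6) = 1 - P(|X - 8| < 6) = 1 - P(-6 < X - 8 < 6)$$

$$= 1 - P[8 - (2)(3) < X < 8 + (2)(3)] \le 1 - \left(1 - \frac{1}{2^2}\right) = \frac{1}{4}$$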
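Both parts reduce to finding k, the half-width of the interval measured in standard deviations, and then applying the bound. A short Python sketch of that recipe, with `chebyshev_lower_bound` as an illustrative name and exact fractions for the output:

```python
from fractions import Fraction


def chebyshev_lower_bound(k):
    """Lower bound on P(mu - k*sigma < X < mu + k*sigma), valid for k > 0."""
    return 1 - Fraction(1, 1) / Fraction(k) ** 2


mu, sigma = 8, 3  # variance 9 gives sigma = 3

# (a) -4 < X < 20 is mu +/- 4 sigma.
k_a = Fraction(20 - mu, sigma)
print(chebyshev_lower_bound(k_a))       # 15/16

# (b) |X - 8| >= 6 is the complement of mu +/- 2 sigma, so it is bounded above.
k_b = Fraction(6, sigma)
print(1 - chebyshev_lower_bound(k_b))   # 1/4
```

Part (a) yields a lower bound and part (b) an upper bound; neither is an exact probability, since the distribution of X is unknown.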


Homework 5

1. For the joint probability distribution of the two random variables X and Y as given in the following figure, calculate the covariance of X and Y.

(Mo.E5.27 p.0172)

2. The photoresist thickness in semiconductor manufacturing has a mean of 10 micrometers and a standard deviation of 1 micrometer. Bound the probability that the thickness is less than 6 or greater than 14 micrometers.

(Mo.S5.25 p05.15)