1 g89.2228 lect 4a g89.2228 lecture 4a f(x) of special interest: normal distribution are these...

1G89.2228 Lect 4a

G89.2228 Lecture 4a

• f(X) of special interest: Normal Distribution

• Are These Random Variables Normally Distributed?

• Probability Statements and the Normal Distribution

• Covariance: An important bivariate moment

• Covariance and correlation


A Density of Special Interest: the Normal Distribution

• The facts about expectations have been developed without specifying the exact nature of the distribution of X

» f(X) can take many different forms

» In some cases its form is not known

• There is one form of f(X) that is of special interest: the normal distribution

» The familiar bell-shaped distribution so often observed in nature

» A distribution that repeatedly emerges in mathematical statistics

• The Central Limit Theorem shows that sums (and averages) of many independent random variables are approximately normally distributed


The Normal density

• A family of distributions that are indexed by two parameters, $\mu$ and $\sigma^2$, the mean and variance

• $\mu$ is the index of location, and $\sigma^2$ is the index of spread

$$f(X) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(X-\mu)^2}{2\sigma^2}}$$

[Figure: Family of normal curves. f(X) plotted against X over the range -3 to 3 for Normal(0,1) and Normal(-.5,.25).]
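As a rough illustration of the density formula, the sketch below evaluates f(X) = exp(-(X - mu)^2 / (2 sigma^2)) / sqrt(2 pi sigma^2) at the mean of each of the two curves named in the figure. The function name `normal_pdf` and its argument names are illustrative choices, not part of the lecture.

```python
import math

def normal_pdf(x, mu, sigma2):
    """Normal density with mean mu and variance sigma2 (sigma2 must be > 0)."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

# A normal density is highest at its mean, where the exponent is zero.
peak_standard = normal_pdf(0.0, mu=0.0, sigma2=1.0)    # Normal(0, 1)
peak_narrow = normal_pdf(-0.5, mu=-0.5, sigma2=0.25)   # Normal(-.5, .25)

print(round(peak_standard, 4))  # 0.3989
print(round(peak_narrow, 4))    # 0.7979
```

The curve with the smaller variance is taller and narrower, because the area under every density must equal 1.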


Normal distributions

• Why do they appear in nature so often?

• Linear transformation of $X \sim N(\mu, \sigma^2)$ [X “distributed as” $N(\mu, \sigma^2)$] does not change form

» If $Y = a + bX$ then $Y \sim N(a + b\mu,\ b^2\sigma^2)$

» If height is normal in inches, it is normal in centimeters

» If self-esteem is normal using one scale, it will usually be normal with a highly correlated scale

• Empirical operation of the Central Limit Theorem
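The inches-to-centimeters case can be checked informally by simulation. The sketch below assumes a hypothetical height distribution (mean 66 inches, SD 3 inches; these numbers are made up for illustration) and compares the simulated mean and variance of Y = a + bX with a + b·mu and b²·sigma². The share of values within one SD is only a rough check of shape, not a formal test of normality.

```python
import random
import statistics

rng = random.Random(0)        # fixed seed so the run is repeatable
mu, sigma = 66.0, 3.0         # hypothetical height distribution, in inches
a, b = 0.0, 2.54              # Y = a + bX converts inches to centimeters

x = [rng.gauss(mu, sigma) for _ in range(100_000)]
y = [a + b * xi for xi in x]

mean_y = statistics.fmean(y)
var_y = statistics.variance(y)
sd_y = var_y ** 0.5
# Share of Y values within one SD of the mean; about .68 for a normal variable.
share_1sd = sum(abs(yi - mean_y) <= sd_y for yi in y) / len(y)

print("theory: mean =", a + b * mu, " variance =", b**2 * sigma**2)
print("sample: mean =", round(mean_y, 2), " variance =", round(var_y, 2))
print("share within 1 SD =", round(share_1sd, 3))
```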


Central Limit Theorem

• Sums of independent random variables will be approximately normally distributed as the number of things summed gets large

• If the distribution of random variables $W_i$ is symmetric, “large” may be as little as N=10

» Averages are simply linearly transformed sums: $(1/n)\sum X$

• Many processes in nature are additive

» Height is the sum of annual growths

• Many psychological measures are additive

» Educational achievement as sum of correct test responses
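A small simulation can make the "N=10" claim concrete. The sketch below assumes each summed variable is Uniform(0, 1), which is symmetric but flat (this choice is an illustration, not taken from the lecture), and checks how closely sums of 10 such variables follow the familiar normal proportions within 1 and 2 standard deviations.

```python
import random
import statistics

rng = random.Random(1)            # fixed seed so the run is repeatable
n_terms, n_sums = 10, 50_000      # sum 10 variables, repeated 50,000 times

# Each W_i is Uniform(0, 1): symmetric, but flat rather than bell shaped.
sums = [sum(rng.random() for _ in range(n_terms)) for _ in range(n_sums)]

mean = statistics.fmean(sums)
sd = statistics.stdev(sums)

def share_within(k):
    """Proportion of simulated sums within k standard deviations of the mean."""
    return sum(abs(s - mean) <= k * sd for s in sums) / n_sums

print("within 1 SD:", round(share_within(1), 3), "(normal: .683)")
print("within 2 SD:", round(share_within(2), 3), "(normal: .954)")
```

Dividing every sum by `n_terms` would give averages instead; since that is a linear transformation, the proportions would not change.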


Are these random variables normally distributed?

• Sum of five coin flips (H=1, T=0)

• Sum of fifty coin flips

• Annual salaries of professors

• For $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$, $X+Y$

• For $X \sim N(\mu, \sigma^2)$, $X^2$

• For $X \sim N(\mu_1, \sigma^2)$ and $Y \sim N(\mu_2, \sigma^2)$, $X^2+Y^2$

• For $X_i \sim N(\mu, \sigma^2)$ for all $i = 1, 2, \ldots, 500$, $\sum X_i^2$

• Reaction times to memory trials

• Errors in smell identification test

• Sum of 10 attitude strength items
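One informal way to explore the squared-variable cases is to simulate them and look at skewness, which is near 0 for a normal distribution. The sketch below assumes mu = 0 and sigma = 1 throughout (an assumption made only for illustration), and skewness is just one symptom of non-normality, so a value near 0 does not by itself establish normality.

```python
import random
import statistics

rng = random.Random(2)   # fixed seed so the run is repeatable
reps = 2_000             # simulated draws per case

def skewness(values):
    """Average cubed z-score; close to 0 for a symmetric (e.g. normal) shape."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

# Assumption for illustration: mu = 0 and sigma = 1 in every case.
x_squared = [rng.gauss(0, 1) ** 2 for _ in range(reps)]
sum_two_squares = [rng.gauss(0, 1) ** 2 + rng.gauss(0, 1) ** 2 for _ in range(reps)]
sum_500_squares = [sum(rng.gauss(0, 1) ** 2 for _ in range(500)) for _ in range(reps)]

for name, draws in [("X^2", x_squared),
                    ("X^2 + Y^2", sum_two_squares),
                    ("sum of 500 X_i^2", sum_500_squares)]:
    print(name, "skewness:", round(skewness(draws), 2))
```

A single squared normal variable is strongly right-skewed, while the sum of 500 squared variables is much closer to symmetric, in line with the Central Limit Theorem.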


Probability statements using the Normal Distribution

• The distribution of normally distributed random variables, such as sample means, is well known and often presented in tables as N(0,1).

• Tables can be used by transforming variables with other normal distributions to the form of N(0,1).

• If $X \sim N(\mu, \sigma^2)$ and if $\mu$ and $\sigma$ are known, then $Z = (X-\mu)/\sigma$ has N(0,1) distribution

• This transformation is one-to-one, allowing one to reconstruct X from Z:

$X = Z\sigma + \mu$


Computing Probabilities from N(0,1) Distribution

• Tables of N(0,1) allow us to ask the probability of sampling $Z \sim N(0,1)$ in the range (-1, 1).

» $\Pr(-1 \le Z \le 1) = .68$

• If $X \sim N(-.5, .5^2)$ and we want to ask about $\Pr(-1.5 \le X \le 0)$ we transform to Z and compute

$$\Pr(-1.5 \le X \le 0) = \Pr\left(\frac{-1.5-(-.5)}{.5} \le Z \le \frac{0-(-.5)}{.5}\right) = \Pr(-2 \le Z \le 1) = .4772 + .3413 = .8185$$
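The same calculation can be sketched in code instead of a printed table. The example below uses `statistics.NormalDist` from the Python standard library (a tool choice made here, not something the lecture uses) to standardize the endpoints for X with mean -.5 and SD .5, and then confirms that working with X directly gives the same answer.

```python
from statistics import NormalDist

z = NormalDist(0, 1)                 # standard normal, N(0, 1)
mu, sigma = -0.5, 0.5                # X has mean -.5 and SD .5
lo, hi = -1.5, 0.0                   # we want Pr(lo <= X <= hi)

z_lo = (lo - mu) / sigma             # -2.0
z_hi = (hi - mu) / sigma             # 1.0

# Probability from the standard normal, as with a printed table.
p_via_z = z.cdf(z_hi) - z.cdf(z_lo)

# Same probability computed directly from the distribution of X.
x_dist = NormalDist(mu, sigma)
p_direct = x_dist.cdf(hi) - x_dist.cdf(lo)

print(round(p_via_z, 4), round(p_direct, 4))   # 0.8186 0.8186
```

The unrounded value is about .8186; the table-based sum .4772 + .3413 = .8185 differs in the last digit only because the table entries are rounded.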


One Table Fits All

• Transformation makes it unnecessary to have all variations of normal curves tabled.

• The standard normal table describes probability in terms of number of sd's from mean.

[Figure: Family of normal curves. f(X) plotted against X over the range -3 to 3 for Normal(0,1) and Normal(-.5,.25).]


Assessing Non-independence: One More Expectation Operator

• Very often we consider two random variables together

» height and weight

» reaction time and response errors

» depression and anxiety

» Subject 1 and a yoked control

• $E[(X-\mu_X)(Y-\mu_Y)] = \mathrm{Cov}(X,Y) = \sigma_{XY}$ is called the population covariance.

• Cov(X,Y) measures linear association between the variables

• It is an expectation that depends on the joint bivariate density of X and Y, f(X,Y).

» f(X,Y) says how likely any pair of values of X and Y is
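To see how the covariance is an expectation over the joint distribution, here is a minimal sketch for a discrete case. The joint probabilities below are hypothetical and chosen only so that the arithmetic is easy to follow; with a continuous density the sums would become integrals.

```python
# Hypothetical joint probabilities f(x, y) for two 0/1 variables (they sum to 1).
joint = {
    (0, 0): 0.4,
    (0, 1): 0.1,
    (1, 0): 0.1,
    (1, 1): 0.4,
}

# Means are expectations taken over the joint distribution.
mu_x = sum(p * x for (x, y), p in joint.items())
mu_y = sum(p * y for (x, y), p in joint.items())

# Cov(X, Y) = E[(X - mu_x)(Y - mu_y)], weighting each pair by its probability.
cov_xy = sum(p * (x - mu_x) * (y - mu_y) for (x, y), p in joint.items())

print(round(mu_x, 2), round(mu_y, 2), round(cov_xy, 3))   # 0.5 0.5 0.15
```

Most of the probability sits on the pairs (0, 0) and (1, 1), where X and Y are both below or both above their means, so the covariance is positive.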


Interpreting covariance as a parameter

• When X and Y tend to increase together, Cov(X,Y)>0

• When high levels of X go with low levels of Y, Cov(X,Y)<0

• When X and Y are independent, Cov(X,Y) = 0.

• Note that there are cases when Cov(X,Y) takes the value zero even though X and Y are related nonlinearly.

[Figure: X and Y axes with the four quadrants labeled +,+ / -,- / -,+ / +,-.]
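A zero covariance with a strong nonlinear relation can be shown with a tiny example. The values below are hypothetical: X takes five equally likely values that are symmetric around zero, and Y is exactly X squared.

```python
# Hypothetical example: X takes five equally likely values and Y = X**2.
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]          # Y is completely determined by X

mu_x = sum(xs) / len(xs)           # 0.0
mu_y = sum(ys) / len(ys)           # 2.0

# Equal probabilities, so the expectation is a simple average.
cov_xy = sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)) / len(xs)

print(cov_xy)   # 0.0
```

The positive products on the right side of the plot are cancelled exactly by the negative products on the left, so the covariance is zero even though Y depends entirely on X.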


Correlation and Covariance

• Besides noticing its sign and whether it is zero, it is difficult to interpret the absolute magnitude of covariance

• Note that Cov(X,Y) is bounded by V(X) and V(Y):

$$|\mathrm{Cov}(X,Y)| \le \mathrm{Max}[V(X), V(Y)]$$

• If X and Y can be transformed so that both have variances equal to one, then the new covariance is bounded by -1 and +1

» In this case the covariance = correlation, $\rho_{XY} = \mathrm{Corr}(X,Y)$

» It has all the same properties of covariances just discussed, but is easier to interpret


Cov(X,Y) as an expectation operator

» For $k_1$ and $k_2$ as constants, there are facts closely parallel to facts for variances:

• $\mathrm{Cov}(k_1+X, k_2+Y) = \mathrm{Cov}(X,Y) = \sigma_{XY}$

• $\mathrm{Cov}(k_1X, k_2Y) = k_1 k_2\,\mathrm{Cov}(X,Y) = k_1 k_2\,\sigma_{XY}$

» Important special case:

• Let $Y^* = (1/\sigma_Y)Y$ and $X^* = (1/\sigma_X)X$

$V(X^*) = V(Y^*) = 1.0$

• $\mathrm{Cov}(X^*,Y^*) = (1/\sigma_Y)(1/\sigma_X)\,\sigma_{XY} = \rho_{XY}$

• Cov(X*,Y*) is the population correlation for the variables X and Y, $\rho_{XY}$

» Since $\rho_{XY} = (1/\sigma_Y)(1/\sigma_X)\,\sigma_{XY}$,

$\sigma_{XY} = (\sigma_Y)(\sigma_X)\,\rho_{XY}$
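These rules can be checked numerically on any small data set. The sketch below uses hypothetical paired scores and constants (all made up for illustration) and treats the four pairs as a complete population, so the covariance is a plain average of products of deviations.

```python
import math

def cov(u, v):
    """Population-style covariance: average product of deviations from the means."""
    mu_u = sum(u) / len(u)
    mu_v = sum(v) / len(v)
    return sum((a - mu_u) * (b - mu_v) for a, b in zip(u, v)) / len(u)

# Hypothetical paired scores and constants, for illustration only.
x = [1.0, 2.0, 4.0, 7.0]
y = [2.0, 1.0, 5.0, 6.0]
k1, k2 = 3.0, -2.0

sigma_xy = cov(x, y)
sigma_x = math.sqrt(cov(x, x))   # Cov(X, X) is V(X)
sigma_y = math.sqrt(cov(y, y))

# Adding constants leaves the covariance unchanged.
shifted = cov([k1 + a for a in x], [k2 + b for b in y])
# Multiplying by constants scales the covariance by k1 * k2.
scaled = cov([k1 * a for a in x], [k2 * b for b in y])
# Rescaling each variable by 1/sigma gives unit variances and the correlation.
rho_xy = cov([a / sigma_x for a in x], [b / sigma_y for b in y])

print(math.isclose(shifted, sigma_xy))                        # True
print(math.isclose(scaled, k1 * k2 * sigma_xy))               # True
print(math.isclose(rho_xy, sigma_xy / (sigma_x * sigma_y)))   # True
print(-1.0 <= rho_xy <= 1.0)                                  # True
```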


One Payoff for Studying Covariance

• We can generalize the rule for calculating the variance of a sum of two variables.

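• For any X and Y,

$$\mathrm{Var}(X+Y) = V(X) + V(Y) + 2\,\mathrm{Cov}(X,Y)$$

$$\mathrm{Var}(X-Y) = V(X) + V(Y) - 2\,\mathrm{Cov}(X,Y)$$

• More generally,

$$\mathrm{Var}(k_1X+k_2Y) = k_1^2\sigma_X^2 + k_2^2\sigma_Y^2 + 2k_1k_2\,\sigma_{XY}$$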

                X       Y     X+Y     X-Y
                3       5       8      -2
               26      18      44       8
               40      15      55      25
               21      16      37       5
                8       1       9       7
                6       6      12       0
               13      15      28      -2
               25      12      37      13
               20       9      29      11
               10      11      21      -1

Mean         17.2    10.8      28     6.4
Variance    129.1   30.18     246  72.489
SD          11.36   5.493   15.68   8.514
Cov(X,Y)    43.38
Corr(X,Y)   0.695
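The table can be reproduced with a short script. The sketch below uses the ten X and Y values from the table and an n - 1 divisor, which is the divisor that matches the tabled variances; the helper name `sample_cov` is an illustrative choice.

```python
import statistics

# Data from the table above.
x = [3, 26, 40, 21, 8, 6, 13, 25, 20, 10]
y = [5, 18, 15, 16, 1, 6, 15, 12, 9, 11]

def sample_cov(u, v):
    """Sample covariance with an n - 1 divisor; sample_cov(u, u) is the variance."""
    mu_u, mu_v = statistics.fmean(u), statistics.fmean(v)
    return sum((a - mu_u) * (b - mu_v) for a, b in zip(u, v)) / (len(u) - 1)

var_x = sample_cov(x, x)
var_y = sample_cov(y, y)
cov_xy = sample_cov(x, y)

total = [a + b for a, b in zip(x, y)]
diff = [a - b for a, b in zip(x, y)]
var_sum = sample_cov(total, total)
var_diff = sample_cov(diff, diff)

# Variance computed from the summed scores vs. the rule V(X) + V(Y) +/- 2Cov(X,Y).
print(round(var_sum, 3), round(var_x + var_y + 2 * cov_xy, 3))    # 246.0 246.0
print(round(var_diff, 3), round(var_x + var_y - 2 * cov_xy, 3))   # 72.489 72.489
print(round(cov_xy / (var_x * var_y) ** 0.5, 3))                  # 0.695
```

Because X and Y are positively correlated here, the sum varies more than V(X) + V(Y) alone would suggest, and the difference varies less.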
