G89.2228 Lecture 4a
• f(X) of special interest: Normal Distribution
• Are These Random Variables Normally Distributed?
• Probability Statements and the Normal Distribution
• Covariance: An important bivariate moment
• Covariance and correlation
A Density of Special Interest: the Normal Distribution
• The facts about expectations have been developed without specifying the exact nature of the distribution of X
» f(X) can take many different forms
» In some cases its form is not known
• There is one form of f(X) that is of special interest: the normal distribution
» The familiar bell-shaped distribution so often observed in nature
» A distribution that repeatedly emerges in mathematical statistics
• The Central Limit Theorem shows that sums (and averages) of many independent random variables are approximately normally distributed
The Normal density
• A family of distributions that are indexed by two parameters, μ and σ², the mean and variance
• μ is the index of location, and σ² is the index of spread
$$f(X) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(X-\mu)^2}{2\sigma^2}}$$
[Figure: Family of normal curves. f(X) plotted against X from -3 to 3 for Normal(0,1) and Normal(-.5,.25).]
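The normal density can be evaluated directly for the two curves named in the figure legend. The sketch below is only an illustration: the function name `normal_pdf` and the choice to evaluate each curve at its own mean are assumptions made here, not part of the lecture.

```python
import math

def normal_pdf(x, mu, sigma2):
    """Normal density f(x) for mean mu and variance sigma2:
    exp(-(x - mu)^2 / (2 * sigma2)) / sqrt(2 * pi * sigma2)."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

# The two curves from the figure legend, given as (mean, variance).
curves = {"Normal(0,1)": (0.0, 1.0), "Normal(-.5,.25)": (-0.5, 0.25)}

for name, (mu, sigma2) in curves.items():
    peak = normal_pdf(mu, mu, sigma2)  # the density is highest at the mean
    print(f"{name}: height {peak:.3f} at X = {mu}")
```

The curve with the smaller variance is taller and narrower, which is what "index of spread" means in practice.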
Normal distributions
• Why do they appear in nature so often?
• Linear transformation of X~N(μ,σ²) [X “distributed as” N(μ,σ²)] does not change form
» If Y=a+bX then Y~N[(a+bμ), (b²σ²)]
» If height is normal in inches, it is normal in centimeters
» If self-esteem is normal using one scale, it will usually be normal with a highly correlated scale
• Empirical operation of the Central Limit Theorem
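The inches-to-centimeters case can be sketched with Python's standard-library `statistics.NormalDist`. The mean of 66 inches and SD of 4 inches are hypothetical values chosen for illustration; only the rule Y ~ N(a+bμ, b²σ²) comes from the lecture.

```python
from statistics import NormalDist

# Hypothetical values: height in inches with mean 66 and SD 4.
height_in = NormalDist(mu=66.0, sigma=4.0)

# Y = a + bX with a = 0 and b = 2.54 (inches to centimeters).
a, b = 0.0, 2.54

# Build Y from the rule: mean a + b*mu, SD |b|*sigma (variance b^2 * sigma^2).
height_cm = NormalDist(mu=a + b * height_in.mean, sigma=abs(b) * height_in.stdev)

# The same person sits at the same percentile on either scale.
x = 70.0
p_in = height_in.cdf(x)          # Pr(height <= 70 inches)
p_cm = height_cm.cdf(a + b * x)  # Pr(height <= 177.8 cm)
print(p_in, p_cm)
```

Because the two probabilities agree, a table built for one scale answers questions on the other.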
Central Limit Theorem
• Sums of independent random variables will be approximately normally distributed as the number of things summed gets large
• If the distribution of random variables Wi is symmetric, “large” may be as little as N=10
» Averages are simply linearly transformed sums: (1/n)(ΣX)
• Many processes in nature are additive
» Height is the sum of annual growths
• Many psychological measures are additive
» Educational achievement as sum of correct test responses
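One way to see the theorem at work is to compare exact probabilities for a sum of fair coin flips (H=1, T=0) with a normal curve that has the same mean and variance. This is a sketch under assumptions: the helper names, the intervals, and the half-unit continuity correction are choices made here, not taken from the lecture.

```python
import math
from statistics import NormalDist

def coin_sum_prob(n, lo, hi):
    """Exact Pr(lo <= S <= hi), where S is the number of heads in n fair flips."""
    return sum(math.comb(n, k) for k in range(lo, hi + 1)) / 2 ** n

def normal_approx(n, lo, hi):
    """Normal approximation using mean n/2 and variance n/4.

    Widening the interval by half a unit (continuity correction) is an
    added refinement for discrete sums, not something stated on the slide.
    """
    dist = NormalDist(mu=n / 2, sigma=math.sqrt(n / 4))
    return dist.cdf(hi + 0.5) - dist.cdf(lo - 0.5)

for n, lo, hi in [(5, 2, 3), (10, 4, 6), (50, 20, 30)]:
    exact = coin_sum_prob(n, lo, hi)
    approx = normal_approx(n, lo, hi)
    print(f"n={n}: exact {exact:.4f}, normal approximation {approx:.4f}")
```

A fair coin is a symmetric case, so the approximation is already close for small n; skewed variables generally need more terms.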
Are these random variables normally distributed?
• Sum of five coin flips (H=1, T=0)
• Sum of fifty coin flips
• Annual salaries of professors
• For X~N(μ₁,σ₁²) and Y~N(μ₂,σ₂²), X+Y
• For X~N(μ,σ²), X²
• For X~N(μ₁,σ²) and Y~N(μ₂,σ²), X²+Y²
• For Xi~N(μ,σ²) for all i=1,2,...,500, ΣXi²
• Reaction times to memory trials
• Errors in smell identification test
• Sum of 10 attitude strength items
Probability statements using the Normal Distribution
• The distribution of normally distributed random variables, such as sample means, is well known and is usually tabled in its standard form, N(0,1).
• Tables can be used by transforming variables with other normal distributions to the form of N(0,1).
• If X~N(μ,σ²) and if μ and σ are known, then Z = (X-μ)/σ has N(0,1) distribution
• This transformation is one-to-one, allowing one to reconstruct X from Z:
X = Zσ + μ
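The round trip between X and Z can be sketched in a few lines. The function names `to_z` and `from_z` are hypothetical; the parameters are those of the Normal(-.5,.25) curve from the lecture, so σ = .5.

```python
def to_z(x, mu, sigma):
    """Standardize: Z = (X - mu) / sigma."""
    return (x - mu) / sigma

def from_z(z, mu, sigma):
    """Invert the standardization: X = Z * sigma + mu."""
    return z * sigma + mu

mu, sigma = -0.5, 0.5  # Normal(-.5, .25): variance .25 means sigma = .5

for x in (-1.5, -0.5, 0.0):
    z = to_z(x, mu, sigma)
    print(f"X = {x}  ->  Z = {z}  ->  X = {from_z(z, mu, sigma)}")
```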
Computing Probabilities from N(0,1) Distribution
• Tables of N(0,1) allow us to ask the probability of sampling Z~N(0,1) in the range (-1, 1).
» Pr(-1 ≤ Z ≤ 1) = .68
• If X~N(-.5,.5²) and we want to ask about Pr(-1.5 ≤ X ≤ 0) we transform to Z and compute
$$\Pr(-1.5 \le X \le 0) = \Pr\left(\frac{-1.5-(-.5)}{.5} \le Z \le \frac{0-(-.5)}{.5}\right) = \Pr(-2 \le Z \le 1) = .4772 + .3413 = .8185$$
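The tabled values can be checked numerically. This sketch uses `statistics.NormalDist` from the Python standard library in place of a printed table; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()                    # standard normal, N(0,1)
X = NormalDist(mu=-0.5, sigma=0.5)  # N(-.5, .5^2)

# Pr(-1 <= Z <= 1), the first example in the lecture.
p_one_sd = Z.cdf(1.0) - Z.cdf(-1.0)

# Pr(-1.5 <= X <= 0), computed by transforming the limits to Z.
z_lo = (-1.5 - X.mean) / X.stdev
z_hi = (0.0 - X.mean) / X.stdev
p_via_z = Z.cdf(z_hi) - Z.cdf(z_lo)

# The same probability computed on the original scale, as a cross-check.
p_direct = X.cdf(0.0) - X.cdf(-1.5)

print(round(p_one_sd, 4), round(p_via_z, 4), round(p_direct, 4))
```

Small differences in the fourth decimal place relative to the table come from the table's rounding of .4772 and .3413.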
One Table Fits All
• Transformation makes it unnecessary to have all variations of normal curves tabled.
• The standard normal table describes probability in terms of number of sd's from mean.
[Figure: Family of normal curves. f(X) plotted against X from -3 to 3 for Normal(0,1) and Normal(-.5,.25).]
Assessing Non-independence: One More Expectation Operator
• Very often we consider two random variables together
» height and weight
» reaction time and response errors
» depression and anxiety
» Subject 1 and a yoked control
• E[(X-μx)(Y-μy)] = Cov(X,Y) = σXY is called the population covariance.
• Cov(X,Y) measures linear association between the variables
• It is an expectation that depends on the joint bivariate density of X and Y, f(X,Y).
» f(X,Y) says how likely any pair of values of X and Y is
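For a discrete pair of variables, the expectation E[(X-μx)(Y-μy)] is a probability-weighted sum over the joint density. The joint probabilities below are hypothetical, chosen so that matching values of X and Y are more likely than mismatched ones.

```python
# Hypothetical joint density f(X, Y) for two binary variables.
# Keys are (x, y) pairs; values are probabilities that sum to 1.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def expect(g, density):
    """E[g(X, Y)]: sum of g(x, y) weighted by f(x, y)."""
    return sum(p * g(x, y) for (x, y), p in density.items())

mu_x = expect(lambda x, y: x, joint)
mu_y = expect(lambda x, y: y, joint)
cov_xy = expect(lambda x, y: (x - mu_x) * (y - mu_y), joint)

print(mu_x, mu_y, cov_xy)
```

The covariance is positive here because the pairs (0,0) and (1,1), where both deviations share a sign, carry most of the probability.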
Interpreting covariance as a parameter
• When X and Y tend to increase together, Cov(X,Y)>0
• When high levels of X go with low levels of Y, Cov(X,Y)<0
• When X and Y are independent, Cov(X,Y) = 0.
• Note that Cov(X,Y) can take the value zero even when X and Y are related nonlinearly.
[Figure: X-Y plane divided into four quadrants, labeled +,+ and -,- and -,+ and +,-.]
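A small constructed case shows how covariance can be zero even though Y is completely determined by X. The distribution is hypothetical: X is equally likely to be any of five values symmetric about zero, and Y = X².

```python
from fractions import Fraction

# Hypothetical example: X uniform on {-2, -1, 0, 1, 2} and Y = X squared.
xs = [-2, -1, 0, 1, 2]
p = Fraction(1, len(xs))              # each value of X is equally likely
joint = {(x, x * x): p for x in xs}   # f(X, Y) puts mass only where Y = X^2

mu_x = sum(prob * x for (x, y), prob in joint.items())
mu_y = sum(prob * y for (x, y), prob in joint.items())
cov_xy = sum(prob * (x - mu_x) * (y - mu_y) for (x, y), prob in joint.items())

print(mu_x, mu_y, cov_xy)
```

Exact fractions are used so the zero is exact rather than a rounding artifact. The positive and negative products of deviations cancel, so a purely linear measure misses the U-shaped relationship.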
Correlation and Covariance
• Besides noticing its sign and whether it is zero, it is difficult to interpret the absolute magnitude of covariance
• Note that Cov(X,Y) is bounded by V(X) and V(Y):
$$|\mathrm{Cov}(X,Y)| \le \mathrm{Max}[V(X), V(Y)]$$
• If X and Y can be transformed so that both have variances equal to one, then the new covariance is bounded by -1 and +1
» In this case the covariance = correlation, ρXY = Corr(X,Y)
» It has all the same properties of covariances just discussed, but is easier to interpret
Cov (X,Y) as an expectation operator
» For k1 and k2 as constants, there are facts closely parallel to facts for variances:
• Cov(k1+X, k2+Y) = Cov(X,Y) = σXY
• Cov(k1X, k2Y) = k1*k2*Cov(X,Y) = k1*k2*σXY
» Important special case:
• Let Y* = (1/σY)Y and X* = (1/σX)X, so that V(X*) = V(Y*) = 1.0
• Cov(X*,Y*) = (1/σY)(1/σX)σXY = ρXY
• Cov(X*,Y*) is the population correlation for the variables X and Y, ρXY
» Since ρXY = (1/σY)(1/σX)σXY, it follows that σXY = (σY)(σX)ρXY
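These rules can be checked on a handful of numbers. The data and constants below are hypothetical, and the four points are treated as an entire population (divisor n) so the arithmetic mirrors the population formulas; the helper names are illustrative.

```python
import math

def mean(v):
    return sum(v) / len(v)

def cov(x, y):
    """Population covariance: average product of deviations (divisor n)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

# Hypothetical data and constants.
x = [1.0, 2.0, 4.0, 7.0]
y = [2.0, 1.0, 5.0, 6.0]
k1, k2 = 3.0, -2.0

# Adding constants leaves the covariance unchanged.
shifted = cov([k1 + a for a in x], [k2 + b for b in y])

# Multiplying by constants multiplies the covariance by k1 * k2.
scaled = cov([k1 * a for a in x], [k2 * b for b in y])

# Special case: dividing each variable by its SD gives unit variances,
# and the covariance of the rescaled variables is the correlation.
sd_x, sd_y = math.sqrt(cov(x, x)), math.sqrt(cov(y, y))
x_star = [a / sd_x for a in x]
y_star = [b / sd_y for b in y]
rho = cov(x_star, y_star)

print(cov(x, y), shifted, scaled, rho)
```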
One Payoff for Studying Covariance
• We can generalize the rule for calculating the variance of a sum of two variables.
• For any X and Y,
Var(X+Y) = V(X) + V(Y) + 2Cov(X,Y)
Var(X-Y) = V(X) + V(Y) - 2Cov(X,Y)
• More generally,
Var(k1X+k2Y) = k1²σX² + k2²σY² + 2k1k2σXY
             X        Y        X+Y      X-Y
             3        5        8        -2
             26       18       44       8
             40       15       55       25
             21       16       37       5
             8        1        9        7
             6        6        12       0
             13       15       28       -2
             25       12       37       13
             20       9        29       11
             10       11       21       -1
Mean         17.2     10.8     28       6.4
Variance     129.1    30.18    246      72.489
SD           11.36    5.493    15.68    8.514
Cov(X,Y)     43.38
Corr(X,Y)    0.695
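The tabled example can be reproduced from its ten (X, Y) pairs. The n - 1 divisor is an assumption, adopted because it reproduces the tabled variances and covariance; the helper `sample_cov` is an illustrative name.

```python
from statistics import mean, variance

# The ten (X, Y) pairs from the table.
x = [3, 26, 40, 21, 8, 6, 13, 25, 20, 10]
y = [5, 18, 15, 16, 1, 6, 15, 12, 9, 11]

def sample_cov(u, v):
    """Sample covariance with divisor n - 1 (assumed to match the table)."""
    mu_u, mu_v = mean(u), mean(v)
    return sum((a - mu_u) * (b - mu_v) for a, b in zip(u, v)) / (len(u) - 1)

total = [a + b for a, b in zip(x, y)]  # X+Y column
diff = [a - b for a, b in zip(x, y)]   # X-Y column

cov_xy = sample_cov(x, y)
var_sum_rule = variance(x) + variance(y) + 2 * cov_xy
var_diff_rule = variance(x) + variance(y) - 2 * cov_xy
corr_xy = cov_xy / (variance(x) * variance(y)) ** 0.5

print(f"Var(X+Y): direct {variance(total):.3f}, from rule {var_sum_rule:.3f}")
print(f"Var(X-Y): direct {variance(diff):.3f}, from rule {var_diff_rule:.3f}")
print(f"Cov(X,Y) = {cov_xy:.2f}, Corr(X,Y) = {corr_xy:.3f}")
```

Computing the variance of the summed column directly and computing it from the rule give the same number, which is the payoff the slide describes.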