G89.2229 Lect 2M
• Examples of Correlation
• Random variables and manipulated variables
• Thinking about joint distributions
• Thinking about marginal distributions: Expectations
• Covariance as a statistical concept and tool
G89.2229 Multiple Regression in Psychology
Three examples of correlation
• All from the bar exam study discussed last week
» Anxiety and Depression from POMS on day 29 (two days before bar exam)
» Anger and Vigor from POMS on day 29 (two days before bar exam)
» Anxiety and day to exam during the week prior to the start of the exam
Anxious and Depressed Mood 2 Days Before Exam
• What do you notice about joint distribution?
• What is correlation?
[Figure: scatterplot "Anxious and Depressed Mood Day 29"; x-axis: Depressed Mood; y-axis: Anxious Mood; r = 0.64]
Anger and Vigor 2 Days Before Exam
• What do you notice about joint distribution?
• What is correlation?
[Figure: scatterplot "Vigor and Anger Day 29"; x-axis: Vigor; y-axis: Anger; r = -0.19]
Anxious Mood in Days Before the Exam
• What do you notice about joint distribution?
• What is correlation?
[Figure: scatterplot "Jiggled Anxiety by Day"; x-axis: Jiggled Day (-7 to 1); y-axis: Jiggled Anxiety; r = 0.25]
Random Variables vs. Manipulated Variables
• A random variable is a quantity that is not known exactly prior to data collection.
» E.g., anxiety and depression on any given day for a randomly selected subject
• A manipulated variable is a quantity that is determined by a sampling plan or an experimental design.
» E.g., day to exam, level of exposure, gender
• This distinction has implications for the statistical analysis of bivariate association.
Thinking about bivariate (Joint) distributions
• Suppose we sample persons and measure two behaviors.
» Both are random
» The variables might be related or independent
» The joint distribution contains information about each variable and the relation between them.
• When we ignore one of the two variables and study the other, we say we are studying the marginal distribution.
» This term simply reminds us that another variable is in the background
Expectations and Moments for Marginal Distributions
• Suppose we measure X and Y, but choose to study only X (ignoring Y).
• We can describe the marginal distribution of X using the mean, the variance, and other moments such as the coefficients of skewness and kurtosis.
• The population moments of the variable are described with expectation operators.
• Expectation operators can be used to study means and variances.
Expectation operators defined
• The population mean, $\mu = E(X)$, is the average of all elements in the population.
• It can be derived knowing only the form of the population distribution.
» Let $f(X)$ be the density function describing the likelihood of different values of X in the population.
» The population mean is the average of all values of X weighted by the likelihood of each value.
• If X takes a finite set of discrete values $x_i$, each with probability $f(x_i) = P(x_i)$, then $E(X) = \sum_i P(x_i)\, x_i$
• If X has continuous values, we write $E(X) = \int x\, f(x)\, dx$
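The two definitions of the mean can be sketched numerically. This is only an illustration: the discrete values and probabilities, the density f(x) = 2x on [0, 1], and the function names `discrete_mean` and `continuous_mean` are all assumptions made for the example, not part of the lecture.

```python
# Discrete case: E(X) = sum_i P(x_i) * x_i.
# The values and probabilities below are hypothetical, chosen only for illustration.
values = [0, 1, 2, 3, 4]
probs = [0.1, 0.2, 0.4, 0.2, 0.1]


def discrete_mean(values, probs):
    """Average of the values, weighted by the probability of each value."""
    return sum(p * x for x, p in zip(values, probs))


def continuous_mean(density, lower, upper, steps=10_000):
    """Approximate E(X) = integral of x * f(x) dx with a midpoint Riemann sum."""
    width = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * width  # midpoint of the i-th slice
        total += x * density(x) * width
    return total


mean_discrete = discrete_mean(values, probs)
# Hypothetical density f(x) = 2x on [0, 1]; its exact mean is 2/3.
mean_continuous = continuous_mean(lambda x: 2 * x, 0.0, 1.0)
print(mean_discrete, mean_continuous)
```

The discrete result should be very close to 2 (the distribution is symmetric around 2), and the numerical integral should be very close to 2/3.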
Rules for Expectation operators
• $E(X) = \mu_x$ is the first moment, the mean
• Let k represent some constant number (not random)
» $E(kX) = k\,E(X) = k\mu_x$
» $E(X+k) = E(X) + k = \mu_x + k$
• Let Y represent another random variable (perhaps related to X)
» $E(X+Y) = E(X) + E(Y) = \mu_x + \mu_y$
» $E(X-Y) = E(X) - E(Y) = \mu_x - \mu_y$
• Putting these together
» $E(\bar{X}) = E[(X_1 + X_2)/2] = (\mu_1 + \mu_2)/2$
The expected value of the average of two random variables is the average of their means.
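These rules also hold for sample means, so they can be checked on simulated data. The sketch below uses made-up data in which Y is deliberately related to X; the distributions, the constant `k`, and the helper `mean` are assumptions for illustration only.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable
n = 10_000
k = 3.0

# Hypothetical data: Y is deliberately related to X.
xs = [random.gauss(2.0, 1.0) for _ in range(n)]
ys = [0.5 * x + random.gauss(1.0, 1.0) for x in xs]


def mean(v):
    """Sample analogue of the expectation operator."""
    return sum(v) / len(v)


mean_x, mean_y = mean(xs), mean(ys)
mean_scaled = mean([k * x for x in xs])                  # compare with k * mean_x
mean_shifted = mean([x + k for x in xs])                 # compare with mean_x + k
mean_sum = mean([x + y for x, y in zip(xs, ys)])         # compare with mean_x + mean_y
mean_diff = mean([x - y for x, y in zip(xs, ys)])        # compare with mean_x - mean_y
mean_avg = mean([(x + y) / 2 for x, y in zip(xs, ys)])   # compare with (mean_x + mean_y) / 2

print(mean_scaled, k * mean_x)
print(mean_sum, mean_x + mean_y)
print(mean_avg, (mean_x + mean_y) / 2)
```

Each pair of printed numbers should agree up to floating-point rounding, even though X and Y are correlated: the rules for means do not require independence.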
Variance Operators
• Analogous to $E(Y) = \mu_y$ is $V(Y) = E[(Y-\mu_y)^2] = \int (y-\mu_y)^2 f(y)\, dy$
• $E[(X-\mu_x)^2] = V(X) = \sigma_x^2$
• Let k represent some constant
» $V(kX) = k^2 V(X) = k^2 \sigma_x^2$
» $V(X+k) = V(X) = \sigma_x^2$
• Let Y represent another random variable that is independent of X
» $V(X+Y) = V(X) + V(Y) = \sigma_x^2 + \sigma_y^2$
» $V(X-Y) = V(X) + V(Y) = \sigma_x^2 + \sigma_y^2$
• A more general form of these formulas requires the concept of covariance
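A small simulation can make the variance rules concrete. This is a sketch under assumed conditions: X and Y are drawn independently from normal distributions picked for the example, and `variance` is a hypothetical helper that divides by n (the population-style formula).

```python
import random

random.seed(1)  # fixed seed so the run is repeatable
n = 100_000
k = 3.0

# Hypothetical independent draws: V(X) is about 1 and V(Y) is about 4.
xs = [random.gauss(0.0, 1.0) for _ in range(n)]
ys = [random.gauss(5.0, 2.0) for _ in range(n)]


def variance(v):
    """Average squared deviation from the mean (divides by n)."""
    m = sum(v) / len(v)
    return sum((a - m) ** 2 for a in v) / len(v)


var_x, var_y = variance(xs), variance(ys)
var_scaled = variance([k * x for x in xs])             # equals k**2 * var_x
var_shifted = variance([x + k for x in xs])            # equals var_x
var_sum = variance([x + y for x, y in zip(xs, ys)])    # close to var_x + var_y
var_diff = variance([x - y for x, y in zip(xs, ys)])   # also close to var_x + var_y

print(var_scaled, k ** 2 * var_x)
print(var_sum, var_diff, var_x + var_y)
```

The scaling and shifting rules hold exactly in any sample. The sum and difference rules hold only approximately here, because two independently drawn samples still show a small chance covariance.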
Covariance: A Bivariate Moment
• $E[(X-\mu_x)(Y-\mu_y)] = \mathrm{Cov}(X,Y) = \sigma_{XY}$ is called the population covariance.
» It is the average product of deviations from means
» It is zero when the variables are linearly independent
• Formally it depends on the joint bivariate density of X and Y, $f(X,Y)$.
» $f(X,Y)$ says how likely any pair of values of X and Y is
» $\mathrm{Cov}(X,Y) = \iint (X-\mu_x)(Y-\mu_y)\, f(X,Y)\, dX\, dY$
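For discrete variables the double integral becomes a double sum over the joint probabilities. The sketch below uses two made-up joint distributions for a pair of binary variables; the probability tables and the function name `covariance` are assumptions for illustration.

```python
# Hypothetical joint probabilities f(x, y) for two binary variables.
# Most of the mass sits on (0, 0) and (1, 1), so X and Y tend to move together.
joint_related = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

# A second hypothetical table in which X and Y are independent.
joint_independent = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}


def covariance(joint):
    """Cov(X, Y) = sum over (x, y) of (x - mu_x) * (y - mu_y) * f(x, y)."""
    mu_x = sum(p * x for (x, _), p in joint.items())
    mu_y = sum(p * y for (_, y), p in joint.items())
    return sum(p * (x - mu_x) * (y - mu_y) for (x, y), p in joint.items())


cov_related = covariance(joint_related)
cov_independent = covariance(joint_independent)
print(cov_related, cov_independent)
```

The first table gives a positive covariance of about 0.15, and the independent table gives a covariance of zero.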
Cov (X,Y) as an expectation operator
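» For $k_1$ and $k_2$ as constants, there are facts closely parallel to facts for variances:
• $\mathrm{Cov}(k_1 + X, k_2 + Y) = \mathrm{Cov}(X,Y) = \sigma_{XY}$
• $\mathrm{Cov}(k_1 X, k_2 Y) = k_1 k_2\, \mathrm{Cov}(X,Y) = k_1 k_2\, \sigma_{XY}$
» Important special case:
• Let $Y^* = (1/\sigma_Y)\,Y$ and $X^* = (1/\sigma_X)\,X$, so that $V(X^*) = V(Y^*) = 1.0$
• $\mathrm{Cov}(X^*,Y^*) = (1/\sigma_Y)(1/\sigma_X)\, \sigma_{XY} = \rho_{XY}$
• $\mathrm{Cov}(X^*,Y^*)$ is the population correlation for the variables X and Y, $\rho_{XY}$
» Since $\rho_{XY} = (1/\sigma_Y)(1/\sigma_X)\, \sigma_{XY}$, it follows that $\sigma_{XY} = \sigma_Y\, \sigma_X\, \rho_{XY}$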
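The link between covariance and correlation can be traced on a small data set. The paired scores below are invented for illustration, and `mean` and `cov` are hypothetical helpers that use the divide-by-n formulas.

```python
import math

# Hypothetical paired scores, for illustration only.
xs = [1.0, 2.0, 2.5, 3.0, 4.5, 5.0]
ys = [0.5, 1.5, 1.0, 2.5, 2.0, 3.5]


def mean(v):
    return sum(v) / len(v)


def cov(u, v):
    """Average product of deviations from the means (divides by n)."""
    mu_u, mu_v = mean(u), mean(v)
    return sum((a - mu_u) * (b - mu_v) for a, b in zip(u, v)) / len(u)


sd_x = math.sqrt(cov(xs, xs))  # Cov(X, X) is V(X)
sd_y = math.sqrt(cov(ys, ys))

# Rescale each variable by 1 / (its standard deviation).
x_star = [x / sd_x for x in xs]
y_star = [y / sd_y for y in ys]

var_x_star = cov(x_star, x_star)                        # should be 1.0
rho_standardized = cov(x_star, y_star)                  # Cov(X*, Y*)
rho_formula = cov(xs, ys) / (sd_x * sd_y)               # sigma_XY / (sigma_X * sigma_Y)
cov_rebuilt = sd_x * sd_y * rho_formula                 # back to sigma_XY
cov_shifted = cov([x + 10 for x in xs], [y - 3 for y in ys])  # adding constants changes nothing

print(rho_standardized, rho_formula)
print(cov(xs, ys), cov_rebuilt, cov_shifted)
```

Both routes to the correlation should give the same number, and the covariance should be unchanged when constants are added to either variable.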
An important use of correlation and covariance
• We are often interested in linear functions of two random variables: aX+bY
» a=1, b=1 gives the sum
» a=.5, b=.5 gives the average
» a=1, b=-1 gives the difference
• What is the expected variance of W=aX+bY in general?
» $\mathrm{Var}(W) = V(aX+bY) = a^2 V(X) + b^2 V(Y) + 2ab\, \mathrm{Cov}(X,Y) = a^2 \sigma_x^2 + b^2 \sigma_y^2 + 2ab\, \sigma_x \sigma_y \rho_{xy}$
» This can be used to compute the expected standard error of contrasts of sample statistics.
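The general formula can be wrapped in a small function and applied to the three cases listed (sum, average, difference). The standard deviations and correlation used here are hypothetical round numbers, and the function name `var_linear_combo` is an assumption for the example.

```python
def var_linear_combo(a, b, sd_x, sd_y, rho_xy):
    """Var(aX + bY) = a^2 * sd_x^2 + b^2 * sd_y^2 + 2ab * sd_x * sd_y * rho_xy."""
    cov_xy = sd_x * sd_y * rho_xy
    return a ** 2 * sd_x ** 2 + b ** 2 * sd_y ** 2 + 2 * a * b * cov_xy


# Hypothetical population values: sd_x = 2, sd_y = 3, correlation 0.5.
sd_x, sd_y, rho_xy = 2.0, 3.0, 0.5

var_of_sum = var_linear_combo(1.0, 1.0, sd_x, sd_y, rho_xy)    # 4 + 9 + 6
var_of_avg = var_linear_combo(0.5, 0.5, sd_x, sd_y, rho_xy)    # 1 + 2.25 + 1.5
var_of_diff = var_linear_combo(1.0, -1.0, sd_x, sd_y, rho_xy)  # 4 + 9 - 6

print(var_of_sum, var_of_avg, var_of_diff)
```

With a positive correlation, the sum is more variable than it would be under independence (19 versus 13 here) and the difference is less variable (7 versus 13).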
Example
• Suppose we want to average the POMS anxious and depressed moods. What is the expected variance?
• In the sample on day 29,
» Var(Anx) = 1.129, Var(Dep) = 0.420, Corr(A,D) = 0.64
» Cov(A,D) = 0.64 × (1.129 × 0.420)^{1/2} = 0.441
» Var(.5*A + .5*D) = (.25)(1.129) + (.25)(.420) + (2)(.25)(.441) = 0.608
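The arithmetic in this example can be reproduced directly from the three sample statistics quoted for day 29. Only the variable names are invented here; the numbers are the ones given in the example.

```python
import math

# Sample statistics quoted for day 29.
var_anx = 1.129
var_dep = 0.420
corr_ad = 0.64

# Covariance from the correlation: Cov = r * sqrt(Var(A) * Var(D)).
cov_ad = corr_ad * math.sqrt(var_anx * var_dep)

# Variance of the average W = 0.5*A + 0.5*D.
a = b = 0.5
var_avg = a ** 2 * var_anx + b ** 2 * var_dep + 2 * a * b * cov_ad

print(round(cov_ad, 3), round(var_avg, 3))
```

With these inputs the covariance comes out at about 0.441 and the variance of the average at about 0.608, which is smaller than the variance of anxious mood alone.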
Final Comment
• Standard deviations and variances are particularly useful when variables are normally distributed.
• Expectation operators assume that f(X), f(Y) and f(X,Y) can be known, but they do not assume that these describe bell-shaped or normal distributions.
• Covariances and correlations can be estimated with non-normal variables, but be careful about statistical tests.