linear-correlation-1205885176993532-3
TRANSCRIPT
-
7/23/2019 linear-correlation-1205885176993532-3
1/102
Lecture 4: Survey Research & Design in Psychology
James Neill, 2012
Correlation
-
Overview
1. Purpose of correlation
2. Covariation
3. Linear correlation
4. Types of correlation
5. Interpreting correlation
6. Assumptions / limitations
7. Dealing with several correlations
-
Readings
Howell (2010)
• Ch6 Categorical Data & Chi-Square
• Ch9 Correlation & Regression
• Ch10 Alternative Correlational Techniques
  - 10.1 Point-Biserial Correlation and Phi: Pearson Correlation by Another Name
  - 10.3 Correlation Coefficients for Ranked Data
-
Purpose of correlation
-
Purpose of correlation
The underlying purpose of correlation is to help address the question:
What is the
• relationship, or
• degree of association, or
• amount of shared variance
between two variables?
-
Purpose of correlation
Other ways of expressing the underlying correlational question include:
To what extent
• do two variables covary?
• are two variables dependent or independent of one another?
• can one variable be predicted from another?
-
Covariation
-
The world is made of covariation
-
We observe covariations in the psycho-social world.
-
We observe covariations in the psycho-social world.
We can measure our observations,
e.g., depictions of violence in the environment,
e.g., psychological states such as stress and depression.
Do they tend to co-occur?
-
Linear correlation
-
Linear correlation
The extent to which two variables have a simple linear (straight-line) relationship.
Linear correlations provide the building blocks for multivariate correlational analyses, such as:
• Factor analysis
• Reliability
-
Linear correlation
Linear relations between variables are indicated by correlations:
• Direction: the correlation sign (+ / -) indicates the direction of the linear relationship
• Strength: the correlation size indicates strength (ranges from -1 to +1)
• Statistical significance: p indicates the likelihood that the observed relationship could have occurred by chance
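As a rough illustration of direction and strength, the sketch below computes a Pearson correlation by hand for two small data sets. The function name `pearson_r` and the data are hypothetical, chosen only to show one positive and one negative relationship.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: hours of study, exam marks, and number of errors
hours = [1, 2, 3, 4, 5]
marks = [52, 55, 61, 64, 70]
errors = [9, 8, 6, 5, 2]

r_pos = pearson_r(hours, marks)   # positive sign: both rise together
r_neg = pearson_r(hours, errors)  # negative sign: one rises as the other falls
print(round(r_pos, 2), round(r_neg, 2))
```

The sign of each result gives the direction and the absolute size gives the strength; statistical significance needs a separate inferential test.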
-
What is the linear correlation? Types of answers
• No relationship (independence)
• Linear relationship
  - As one variable ↑s, so does the other (+ve)
  - As one variable ↑s, the other ↓s (-ve)
• Non-linear relationship
• Pay caution due to:
  - Heteroscedasticity
  - Restricted range
  - Heterogeneous samples
-
Types of correlation
To decide which type of correlation to use, consider the levels of measurement for each variable.
-
Types of correlation
• Nominal by nominal: Phi (φ) / Cramer's V, Chi-square
• Ordinal by ordinal: Spearman's rank / Kendall's Tau b
• Dichotomous by interval/ratio: Point bi-serial r_pb
• Interval/ratio by interval/ratio: Product-moment or Pearson's r
-
Types of correlation and LOM
• Int/Ratio by Int/Ratio: Scatterplot; Product-moment correlation (r)
• Ordinal by Int/Ratio: Recode
• Ordinal by Ordinal: Scatterplot or clustered bar chart; Spearman's Rho or Kendall's Tau
• Nominal by Int/Ratio: Scatterplot, bar chart or error-bar chart; Point bi-serial correlation (r_pb)
• Nominal by Ordinal: Recode
• Nominal by Nominal: Clustered bar-chart; Chi-square, Phi (φ) or Cramer's V
-
Nominal by nominal
-
Nominal by nominal correlational approaches
• Contingency (or cross-tab) tables
  - Observed
  - Expected
  - Row and/or column %s
-
Contingency tables
• Bivariate frequency tables
• Cell frequencies (red)
-
Contingency table: Example
RED = Contingency cells
BLUE
-
Contingency table: Example
Chi-square is based on the differences between the actual and expected cell counts.
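A minimal sketch of that calculation follows, using a hypothetical 2x2 table of counts (these are not the lecture's data). Expected counts are derived from the row and column totals, and chi-square sums the squared differences between observed and expected counts, each divided by the expected count.

```python
# Hypothetical 2x2 table of observed counts
# (rows: smoker / non-smoker, columns: snores / does not snore).
observed = [[30, 20],
            [20, 30]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for each cell = (row total x column total) / N
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-square = sum over cells of (observed - expected)^2 / expected
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom = (rows - 1) x (columns - 1)
df = (len(observed) - 1) * (len(observed[0]) - 1)
print(expected, chi_square, df)
```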
-
Example
Row and/or column cell percentages may also aid interpretation, e.g., ~2/3rds of smokers snore, whereas only ~1/3rd of non-smokers snore.
-
Clustered bar graph
Bivariate bar graph of frequencies or percentages.
The category-axis bars are clustered (by colour or fill pattern) to indicate the second variable's categories.
-
~2/3rds of snorers are smokers, whereas only ~1/3rd of non-snorers are smokers.
-
Pearson chi-square test
-
Pearson chi-square test: Example
Write-up: χ²(1, 186) = 10.26, p = .001
-
Chi-square distribution: Example
-
Phi (φ) & Cramer's V
(non-parametric measures of correlation)
Phi (φ)
• Use for 2x2, 2x3, 3x2 analyses, e.g., Gender (2) & Pass/Fail (2)
Cramer's V
• Use for 3x3 or greater analyses, e.g., Favourite Season (4) x Favourite Sense (5)
-
Phi (φ) & Cramer's V: Example
χ²(1, 186) = 10.26, p = .001, φ = .24
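For a 2x2 table, the size of phi can be recovered from chi-square and the sample size as the square root of chi-square divided by N. The sketch below applies this to the figures in the example, reading them as chi-square = 10.26 and N = 186 (an assumption about the garbled write-up). The result is about .23, close to the reported value; the small gap is presumably rounding.

```python
import math

chi_square = 10.26  # chi-square value as read from the example write-up
n = 186             # sample size as read from the example write-up

# For a 2x2 table, |phi| = sqrt(chi-square / N)
phi = math.sqrt(chi_square / n)
print(round(phi, 3))
```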
-
Ordinal by ordinal
-
Ordinal by ordinal correlational approaches
• Spearman's rho (r_s)
• Kendall's tau (τ)
• Alternatively, use nominal by nominal techniques (i.e., treat as lower level of measurement)
-
Graphing ordinal by ordinal data
• Ordinal by ordinal data is difficult to visualise because it's non-parametric, yet there may be many points.
• Consider using:
  - Non-parametric approaches (e.g., clustered bar chart)
  - Parametric approaches (e.g., scatterplot with binning)
-
Spearman's rho (r_s) or Spearman's rank order correlation
• For ranked (ordinal) data
  - e.g., Olympic Placing correlated with World Ranking
• Uses product-moment correlation formula
• Interpretation is adjusted to consider the underlying ranked scales
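To make the "uses product-moment correlation formula" point concrete, here is a sketch that converts scores to ranks and then applies the ordinary Pearson formula to the ranks. The helper names (`ranks`, `pearson_r`, `spearman_rho`) and the placings are hypothetical.

```python
import math

def ranks(values):
    """Rank from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson_r(ranks(xs), ranks(ys))

# Hypothetical placings for five athletes
olympic_placing = [1, 2, 3, 4, 5]
world_ranking = [2, 1, 3, 5, 4]
rho = spearman_rho(olympic_placing, world_ranking)
print(round(rho, 2))
```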
-
Kendall's Tau (τ)
• Tau a
  - Does not take joint ranks into account
• Tau b
  - Takes joint ranks into account
  - For square tables
• Tau c
  - Takes joint ranks into account
  - For rectangular tables
-
Dichotomous by interval/ratio
-
Point-biserial correlation (r_pb)
• One dichotomous & one continuous variable
  - e.g., belief in god (yes/no) and amount of international travel
• Calculate as for Pearson's product-moment r
• Adjust interpretation to consider the underlying scales
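Because the point-biserial correlation is calculated as for Pearson's r, one way to sketch it is to code the dichotomous variable as 0/1 and reuse the product-moment formula. The data and variable names below are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: dichotomous variable coded 0 = No, 1 = Yes
believes = [0, 0, 0, 0, 1, 1, 1, 1]
countries_visited = [5, 7, 6, 8, 4, 6, 5, 7]

# Point-biserial r is Pearson's r with one 0/1-coded variable
r_pb = pearson_r(believes, countries_visited)
print(round(r_pb, 2))
```

With this coding, a negative r_pb means the group coded 1 has the lower mean on the continuous variable, which is why the interpretation has to consider the underlying scales.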
-
Point-biserial correlation (r_pb): Example
Those who report that they believe in God also report having travelled to slightly fewer countries (r_pb = -.10), but this difference could have occurred by chance (p > .05), thus H0 is not rejected.
[Figure: chart comparing the "Do not believe" and "Believe" groups]
-
Point-biserial correlation (r_pb): Example
0 = No, 1 = Yes
-
Interval/ratio by interval/ratio
-
Scatterplot
• Plot each pair of observations (X, Y)
  - x = predictor variable (independent)
  - y = criterion variable (dependent)
• By convention:
  - the IV should be plotted on the x (horizontal) axis
  - the DV on the y (vertical) axis.
-
Scatterplot showing relationship between age & cholesterol with line of best fit
-
Line of best fit
• The correlation between 2 variables is a measure of the degree to which pairs of numbers (points) cluster together around a best-fitting straight line
• Line of best fit: y = a + bx
• Check for:
  - outliers
  - linearity
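The line y = a + bx can be sketched with the usual least-squares estimates: the slope b is the sum of cross-products divided by the sum of squared deviations of x, and the intercept a makes the line pass through the point of means. The data below are hypothetical.

```python
# Hypothetical paired observations
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Slope: sum of cross-products / sum of squared deviations of x
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
b = sxy / sxx

# Intercept: the fitted line passes through (mean_x, mean_y)
a = mean_y - b * mean_x

# Residuals help when checking for outliers and non-linearity
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
print(round(a, 2), round(b, 2))
```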
-
Scatterplot example
What's wrong with this scatterplot?
The IV should be treated as X and the DV as Y, although this is not always distinct.
-
Scatterplot example: Strong positive (.81)
Q: Why is infant mortality positively linearly associated with the number of physicians (with the effects of GDP removed)?
A: Because more doctors tend to be deployed to areas with infant mortality (socio-economic status aside).
-
Scatterplot example: Weak positive (.14)
-
Scatterplot example
-
Pearson product-moment correlation (r)
The product-moment correlation is the standardised covariance.
-
Covariance
• Variance shared by 2 variables
• Covariance reflects the direction of the relationship:
  - +ve cov indicates a + relationship
  - -ve cov indicates a - relationship.
$$\mathrm{Cov}_{XY} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{N - 1}$$
The numerator is the sum of the cross-products.
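A small sketch of the covariance formula, with hypothetical data, showing how the sign of the summed cross-products gives the direction of the relationship. The function name `covariance` is illustrative only.

```python
def covariance(xs, ys):
    """Sample covariance: sum of deviation cross-products / (N - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross_products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return sum(cross_products) / (n - 1)

# Hypothetical data
xs = [1, 2, 3, 4, 5]
ys_rising = [2, 4, 5, 4, 5]
ys_falling = [5, 4, 5, 4, 2]

cov_pos = covariance(xs, ys_rising)    # +ve: positive relationship
cov_neg = covariance(xs, ys_falling)   # -ve: negative relationship
print(cov_pos, cov_neg)
```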
-
Covariance: Cross-products
[Figure: scatterplot of Y1 against X1, with regions labelled as having -ve or +ve deviation cross-products]
-
Covariance
• Dependent on the scale of measurement → Can't compare covariance across different scales of measurement (e.g., age by weight in kilos versus age by weight in grams).
• Therefore, standardise covariance (divide by the cross-product of the SDs) → correlation
• Correlation is an effect size, i.e., a standardised measure of strength of linear relationship
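The kilos-versus-grams point can be sketched as follows, with hypothetical data: rescaling weight multiplies the covariance by 1000, but dividing by the product of the two SDs yields the same correlation either way.

```python
import math

def covariance(xs, ys):
    """Sample covariance: sum of deviation cross-products / (N - 1)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def sd(xs):
    """Sample standard deviation (N - 1 in the denominator)."""
    n = len(xs)
    mean_x = sum(xs) / n
    return math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))

# Hypothetical data: age in years, weight in kilograms and in grams
age = [20, 30, 40, 50, 60]
weight_kg = [60, 72, 68, 80, 85]
weight_g = [w * 1000 for w in weight_kg]

cov_kg = covariance(age, weight_kg)
cov_g = covariance(age, weight_g)   # 1000 times larger than cov_kg

# Standardise: divide the covariance by the product of the SDs
r_kg = cov_kg / (sd(age) * sd(weight_kg))
r_g = cov_g / (sd(age) * sd(weight_g))
print(cov_kg, cov_g, round(r_kg, 3), round(r_g, 3))
```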
-
Covariance, SD, and correlation: Quiz question
For a given set of data the covariance between X and Y is 1.20. The SD of X is 2 and the SD of Y is 3. The resulting correlation is:
a. .20
b. .30
c. .40
d. 1.20
Significance of correlation
-
Interpreting correlation
-
Coefficient of Determination (r²)
• CoD = The proportion of variance or change in one variable that can be accounted for by another variable.
• e.g., r = .60, r² = .36
-
Interpreting correlation (Cohen, 1988)
A correlation is an effect size
-
Size of correlation (Cohen, 1988)
WEAK (.1 - .3)
-
Interpreting correlation (Evans, 1996)
Strength       r            r²
very weak      0 - .19      (0 to 4%)
weak           .20 - .39    (4 to 16%)
moderate       .40 - .59    (16 to 36%)
strong         .60 - .79    (36% to 64%)
very strong    .80 - 1.00   (64% to 100%)
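As an illustration, the Evans (1996) bands can be written as a small lookup on the absolute value of r, alongside r² as the proportion of shared variance. The function names are hypothetical, and treating each band's lower bound as the cut-off (for example .20 for "weak") is an assumption about how values between the printed bands should be handled.

```python
def evans_strength(r):
    """Verbal label for |r|, following the Evans (1996) bands."""
    size = abs(r)  # strength ignores the direction (sign) of r
    if size < 0.20:
        return "very weak"
    if size < 0.40:
        return "weak"
    if size < 0.60:
        return "moderate"
    if size < 0.80:
        return "strong"
    return "very strong"

def variance_explained(r):
    """Coefficient of determination: r squared."""
    return r ** 2

print(evans_strength(0.60), variance_explained(0.60))
print(evans_strength(-0.35), variance_explained(-0.35))
```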
-
Correlation of this scatterplot
[Figure: scatterplot of Y1 against X1]
Scale has no effect on correlation.
-
Correlation of this scatterplot
[Figure: scatterplot of Y1 against X1, drawn with the X1 axis running from 0 to 100]
Scale has no effect on correlation.
-
What do you estimate the correlation of this scatterplot of height and weight to be?
a. -.5
b. -1
c. 0
d. .5
e. 1
[Figure: scatterplot of height against weight]
-
What do you estimate the correlation of this scatterplot to be?
a. -.5
b. -1
c. 0
d. .5
e. 1
[Figure: scatterplot of Y against X]
-
What do you estimate the correlation of this scatterplot to be?
a. -.5
b. -1
c. 0
d. .5
e. 1
[Figure: scatterplot of Y against X]
-
Write-up: Example
"Number of children and marital satisfaction were inversely related (r (48) = -.35, p < .05), such that contentment in marriage tended to be lower for couples with more children. Number of children explained approximately 10% of the variance in marital satisfaction, a small-moderate effect (see Figure 1)."
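One way to check a reported significance level is to convert r to a t statistic, t = r√df / √(1 - r²), and compare it with a critical value. The sketch below assumes the example's values are r = -.35 with 48 degrees of freedom (reading the bracketed number as df, by the usual reporting convention), and uses an approximate two-tailed .05 critical value taken from a t table.

```python
import math

r = -0.35   # correlation as read from the example write-up (sign assumed from "inversely related")
df = 48     # assumed to be the degrees of freedom, i.e. N - 2

# t statistic for testing H0: the population correlation is zero
t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)

# Approximate two-tailed critical t for alpha = .05 and df = 48 (from a t table)
critical_t = 2.011

significant = abs(t) > critical_t
print(round(t, 2), significant)
```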
-
Assumptions and limitations
(Pearson product-moment linear correlation)
-
Assumptions and limitations
1. Levels of measurement ≥ interval
2. Correlation is not causation
3. Linearity
   1. Effects of outliers
   2. Non-linearity
4. Normality
5. Homoscedasticity
6. Range restriction
7. Heterogenous samples
-
Correlation is not causation, e.g.,
correlation between ice cream consumption and crime, but actual cause is temperature
-
Correlation is not causation, e.g.,
Stop global warming: Become a pirate
-
Causation may be in the eye of the beholder
"It's a rather interesting phenomenon. Every time I press this lever, that graduate student breathes a sigh of relief."
-
Effect of outliers
• Outliers can disproportionately increase or decrease r.
• Options:
  - compute r with & without outliers
  - get more data for outlying values
  - recode outliers as having more conservative scores
  - transformation
  - recode variable into lower level of measurement
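The first option, computing r with and without outliers, might look like the sketch below. The data are hypothetical (they are not the lecture's age and self-esteem data); the last case is deliberately extreme on both variables.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data; the final case (age 85, score 10) is an outlier
age = [20, 25, 30, 35, 40, 85]
self_esteem = [6, 4, 7, 5, 6, 10]

r_with = pearson_r(age, self_esteem)
r_without = pearson_r(age[:-1], self_esteem[:-1])  # drop the outlying case
print(round(r_with, 2), round(r_without, 2))
```

Here a single case pulls r from weak to strong, which is why reporting both values can be informative.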
-
Age & self-esteem (r = .63)
[Figure: scatterplot of SE (self-esteem) against AGE]
-
Age & self-esteem (outliers removed): r = .23
[Figure: scatterplot of SE (self-esteem) against AGE, with the outliers removed]
-
Non-linear relationships
If non-linear, consider:
• Does a linear relation help?
• Transforming variables to "create" linear relationship
• Use a non-linear mathematical function to describe the relationship between the variables
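A hypothetical illustration of why linearity matters: for a perfectly U-shaped relationship (y = x²) over a symmetric range, the linear correlation is zero even though y is completely determined by x. Correlating y with the transformed predictor x² recovers a perfect linear relationship.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical U-shaped relationship
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_raw = pearson_r(xs, ys)            # linear r misses the relationship

xs_squared = [x ** 2 for x in xs]    # transform the predictor
r_transformed = pearson_r(xs_squared, ys)
print(r_raw, r_transformed)
```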
-
Normality
• The X and Y data should be sampled from populations with normal distributions
• Do not overly rely on a single indicator of normality; use histograms, skewness and kurtosis, and inferential tests (e.g., Shapiro-Wilk)
• Note that inferential tests of normality are overly sensitive when the sample is large
-
Homoscedasticity
-
Range restriction
• Range restriction is when the sample contains a restricted (or truncated) range of scores
  - e.g., cognitive capacity and age < 18 might have a linear relationship
• If range restriction, be cautious in generalising beyond the range for which data is available
  - e.g., cognitive capacity does not continue to increase linearly with age after age 18
-
Heterogenous samples
• Sub-samples (e.g., males & females) may artificially increase or decrease overall r.
• Solution: calculate r separately for sub-samples & overall, look for differences
[Figure: scatterplot]
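The suggested solution can be sketched with two hypothetical sub-samples whose relationships run in opposite directions: each sub-sample has a perfect correlation, yet the pooled correlation is zero.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical sub-samples with opposite relationships
group_a_x, group_a_y = [1, 2, 3, 4, 5], [1, 2, 3, 4, 5]
group_b_x, group_b_y = [1, 2, 3, 4, 5], [5, 4, 3, 2, 1]

r_a = pearson_r(group_a_x, group_a_y)   # sub-sample A alone
r_b = pearson_r(group_b_x, group_b_y)   # sub-sample B alone

# Overall r, ignoring sub-sample membership
r_overall = pearson_r(group_a_x + group_b_x, group_a_y + group_b_y)
print(r_a, r_b, r_overall)
```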
-
Scatterplot of Same-sex & Opposite-sex Relations by Gender
r = .67 and r = .52 for the two genders
[Figure: scatterplot of Same Sex Relations against Opp Sex Relations, with points marked by SEX (female, male)]
-
Scatterplot of Weight and Self-esteem by Gender
r = .50 and r = .48 for the two genders
[Figure: scatterplot of SE (self-esteem) against WEIGHT, with points marked by SEX (male, female)]
-
Dealing with several correlations
-
Dealing with several correlations
Scatterplot matrices organise scatterplots and correlations amongst several variables at once.
However, they lose detail when more than about five variables are shown at a time.
-
Correlation matrix
Example of an APA Style Correlation Table
-
Scatterplot matrix
-
Summary
-
Key points
1. Covariations are the building blocks of more complex analyses, e.g., reliability analysis, factor analysis, multiple regression
2. Correlation does not prove causation; may be in opposite direction, co-causal, or due to other variables.
-
Key points
5. Consider effect size (e.g., φ, Cramer's V, r, r²) and direction of relationship
6. Conduct inferential test (if needed).
-
Key points
7. Interpret/Discuss
   • Relate back to research hypothesis
   • Describe & interpret correlation (direction, si
-
References
Evans, J. D. (1996). Straightforward statistics for the behavioral sciences. Pacific Grove, CA: Brooks/Cole Publishing.
Howell, D. C. (2007). Fundamental statistics for the behavioral sciences. Belmont, CA: Wadsworth.
Howell, D. C. (2010). Statistical methods for psychology (7th ed.). Belmont, CA: Wadsworth.
-
Open Office Impress
• This presentation was made using Open Office Impress.
• Free and open source software.
• http://www.openoffice.org/product/impress.html