correlation analysis · correlation a linear association between two random variables correlation...
TRANSCRIPT
![Page 1: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/1.jpg)
CORRELATION
ANALYSIS
NDIM
![Page 2: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/2.jpg)
Introduction
Correlation is a LINEAR association between two random variables.
Correlation analysis shows us how to determine both the nature and the strength of the relationship between two variables.
When variables are dependent on time, correlation is applied.
Correlation lies between -1 and +1.
![Page 3: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/3.jpg)
A zero correlation indicates that there is no linear
relationship between the variables
A correlation of –1 indicates a perfect negative
correlation
A correlation of +1 indicates a perfect positive
correlation
![Page 4: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/4.jpg)
Types of Correlation
There are three types of correlation
Types
Type 1 Type 2 Type 3
![Page 5: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/5.jpg)
Type 1
Positive, Negative, No, Perfect
Positive: when one of two related variables increases (decreases), the other also increases (decreases).
Negative: when one variable increases (decreases), the other decreases (increases).
No correlation: both variables are independent.
![Page 6: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/6.jpg)
Type 2
Linear, Non-linear
Linear: when plotted on a graph, the points tend to fall on a straight line.
Non-linear: when plotted on a graph, the points do not fall on a straight line.
![Page 7: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/7.jpg)
![Page 8: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/8.jpg)
Type 3
Simple, Multiple, Partial
Simple: one independent and one dependent variable.
Multiple: one dependent and more than one independent variable.
Partial: one dependent variable and more than one independent variable, but only one independent variable is considered while the other independent variables are held constant.
![Page 9: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/9.jpg)
Methods of Studying Correlation
Scatter Diagram Method
Karl Pearson's Coefficient of Correlation Method
Spearman’s Rank Correlation Method
![Page 10: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/10.jpg)
[Figure: two scatter plots of Symptom Index against dose in mg, one for Drug A and one for Drug B; panel captions: "Very good fit" and "Moderate fit".]
Correlation: Linear Relationships
Strong relationship = good linear fit
Points clustered closely around a line show a strong correlation; the line is then a good predictor (a good fit) of the data. The more spread out the points, the weaker the correlation and the poorer the fit. The line is a REGRESSION line (Y = bX + a).
![Page 11: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/11.jpg)
Coefficient of Correlation
A measure of the strength of the linear relationship between two variables, defined as the (sample) covariance of the variables divided by the product of their (sample) standard deviations
Represented by "r"
r lies between -1 and +1
r gives both the magnitude and the direction of the relationship
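The verbal definition above can be written out as a formula. This is a sketch in standard notation; the symbols $s_{xy}$ (sample covariance) and $s_x$, $s_y$ (sample standard deviations) are assumed here and do not appear on the slides. The $n-1$ divisors in the covariance and the standard deviations cancel, which is why only sums of deviations remain.

```latex
r = \frac{s_{xy}}{s_x \, s_y}
  = \frac{\sum (X - \bar{X})(Y - \bar{Y})}
         {\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}}
```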
![Page 12: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/12.jpg)
-1 ≤ r ≤ +1
The + and – signs are used for positive linear
correlations and negative linear
correlations, respectively
![Page 13: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/13.jpg)
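$$r_{xy} = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}}$$

Shared variability of the X and Y variables is on the top.
Individual variability of the X and Y variables is on the bottom.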
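As a rough illustration, the raw-sums formula for r can be evaluated in a few lines of Python. The function name `pearson_r` is hypothetical, and the data are the five (X, Y) pairs used in this deck's regression examples; this is a sketch, not the deck's own procedure.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed from raw sums (computational formula)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    # Top: shared variability of X and Y
    numerator = n * sum_xy - sum_x * sum_y
    # Bottom: individual variability of X and of Y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

X = [3, 2, 7, 4, 8]
Y = [6, 1, 8, 5, 9]
r = pearson_r(X, Y)
print(round(r, 2))  # 0.89
```

For these data the numerator is 5(168) - (24)(29) = 144 and the two bracketed terms are 134 and 194, so r = 144 / √(134 × 194).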
![Page 14: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/14.jpg)
Interpreting the Correlation Coefficient r
strong correlation: r > .70 or r < –.70
moderate correlation: r is between .30 and .70, or between –.30 and –.70
weak correlation: r is between 0 and .30, or between 0 and –.30
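The cut-offs above translate directly into a small lookup. The sketch below uses a hypothetical helper name, `correlation_strength`, and makes one assumption the slide does not settle: values of exactly .30 or .70 in magnitude are counted as moderate.

```python
def correlation_strength(r):
    """Label r as strong, moderate, or weak using the cut-offs listed above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)  # the sign only gives the direction, not the strength
    if size > 0.70:
        return "strong"
    if size >= 0.30:  # assumption: boundary values .30 and .70 count as moderate
        return "moderate"
    return "weak"

for value in (0.89, -0.45, 0.10):
    print(value, correlation_strength(value))
```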
![Page 15: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/15.jpg)
Spearman's rank correlation coefficient
Rank correlation is a method for determining correlation when the data are not available in numerical form. The values of the two variables are converted to their ranks, and the correlation is obtained from those ranks; the result is known as rank correlation.
![Page 16: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/16.jpg)
Computation of Rank Correlation
Spearman's rank correlation coefficient ρ can be calculated when:
Actual ranks are given
Ranks are not given, but grades are given and not repeated
Ranks are not given, and grades are given and repeated
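The first two cases can be sketched in Python using the standard formula ρ = 1 - 6Σd² / (n(n² - 1)), where d is the difference between paired ranks. The slides do not state this formula, so it is brought in here as an assumption, and it holds only when no rank is repeated; the repeated-grades case needs a tie correction that this sketch leaves out. The function names and the sample ranks and scores are hypothetical.

```python
def spearman_rho(rank_x, rank_y):
    """Spearman's rho from two lists of ranks, assuming no repeated ranks."""
    n = len(rank_x)
    if n != len(rank_y) or n < 2:
        raise ValueError("need two rank lists of equal length, at least 2 items")
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

def ranks_from_scores(scores):
    """Convert scores to ranks (1 = highest), assuming no repeated scores."""
    ordered = sorted(scores, reverse=True)
    return [ordered.index(s) + 1 for s in scores]

# Case 1: actual ranks given (hypothetical ranks from two judges)
judge_1 = [1, 2, 3, 4, 5]
judge_2 = [2, 1, 4, 3, 5]
print(round(spearman_rho(judge_1, judge_2), 2))  # 0.8

# Case 2: scores given, not repeated (hypothetical scores)
print(ranks_from_scores([50, 70, 60]))  # [3, 1, 2]
```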
![Page 17: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/17.jpg)
REGRESSION ANALYSIS
![Page 18: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/18.jpg)
Algebraic method
1. Least Squares Method:
The regression equation of X on Y is:
X = a + bY
Where,
X = Dependent variable and Y = Independent variable
The regression equation of Y on X is:
Y = a + bX
Where,
Y = Dependent variable
X = Independent variable
![Page 19: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/19.jpg)
Simple Linear Regression
[Figure: scatter plot with a fitted straight line; horizontal axis: Independent variable (x); vertical axis: Dependent variable (y).]
The output of a regression is a function that predicts the dependent variable based upon values of the independent variables.
Simple regression fits a straight line to the data.
y = a + bX ± є
a = y-intercept
b = slope = ∆y/∆x
є = error term
![Page 20: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/20.jpg)
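Example 1: From the following data, obtain the regression equations using the method of least squares.

X 3 2 7 4 8
Y 6 1 8 5 9

Solution:

| X | Y | XY | X² | Y² |
|---|---|----|----|----|
| 3 | 6 | 18 | 9 | 36 |
| 2 | 1 | 2 | 4 | 1 |
| 7 | 8 | 56 | 49 | 64 |
| 4 | 5 | 20 | 16 | 25 |
| 8 | 9 | 72 | 64 | 81 |
| ΣX = 24 | ΣY = 29 | ΣXY = 168 | ΣX² = 142 | ΣY² = 207 |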
![Page 21: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/21.jpg)
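The normal equations for Y on X are:

$$\sum Y = na + b\sum X$$

$$\sum XY = a\sum X + b\sum X^2$$

Substituting the values from the table, we get

29 = 5a + 24b …(i)
168 = 24a + 142b, i.e. 84 = 12a + 71b …(ii)

Multiplying equation (i) by 12 and equation (ii) by 5:

348 = 60a + 288b …(iii)
420 = 60a + 355b …(iv)

Subtracting (iii) from (iv) gives 72 = 67b, so b = 1.07; substituting b = 1.07 into (i) gives a = 0.66.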
![Page 22: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/22.jpg)
Putting the values of a and b into the regression equation of Y on X, we get

Y = 0.66 + 1.07X

Now, to find the regression equation of X on Y, the two normal equations are

$$\sum X = na + b\sum Y$$

$$\sum XY = a\sum Y + b\sum Y^2$$

Substituting the values into the equations, we get

24 = 5a + 29b …(i)
168 = 29a + 207b …(ii)

Multiplying equation (i) by 29 and equation (ii) by 5 and solving, we get

a = 0.49 and b = 0.74
![Page 23: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/23.jpg)
Substituting the values of a and b into the regression equation of X on Y:
X = 0.49 + 0.74Y
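The least squares working above can be checked with a short Python sketch that solves the two normal equations directly. The helper name `fit_line` is hypothetical. Carried without intermediate rounding, the Y-on-X intercept comes out near 0.64; the 0.66 in the worked solution follows from rounding b to 1.07 before back-substituting.

```python
def fit_line(xs, ys):
    """Solve the two normal equations for a and b in  y = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Eliminate a from:  sum_y = n*a + b*sum_x  and  sum_xy = a*sum_x + b*sum_x2
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b

X = [3, 2, 7, 4, 8]
Y = [6, 1, 8, 5, 9]

a_yx, b_yx = fit_line(X, Y)  # regression of Y on X
a_xy, b_xy = fit_line(Y, X)  # regression of X on Y (roles swapped)

print(f"Y = {a_yx:.4f} + {b_yx:.4f} X")  # Y = 0.6418 + 1.0746 X
print(f"X = {a_xy:.4f} + {b_xy:.4f} Y")  # X = 0.4948 + 0.7423 Y
```

Swapping the argument order is all that is needed to move from Y on X to X on Y, which mirrors how the two pairs of normal equations differ only in which variable is treated as dependent.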
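2. Deviation from the Arithmetic Mean method:
The calculations by the least squares method are quite cumbersome when the values of X and Y are large, so the work can be simplified by using this method. The formulas for calculating the regression equations by this method are:

Regression equation of X on Y:

$$X - \bar{X} = b_{xy}(Y - \bar{Y})$$

Regression equation of Y on X:

$$Y - \bar{Y} = b_{yx}(X - \bar{X})$$

Where $b_{xy}$ and $b_{yx}$ are the regression coefficients:

$$b_{xy} = \frac{\sum xy}{\sum y^2} \quad\text{and}\quad b_{yx} = \frac{\sum xy}{\sum x^2}$$

with $x = X - \bar{X}$ and $y = Y - \bar{Y}$, the deviations from the actual means.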
![Page 24: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/24.jpg)
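Example 2: From the previous data, obtain the regression equations by taking deviations from the actual means of the X and Y series.

X 3 2 7 4 8
Y 6 1 8 5 9

Solution:

| X | Y | x = X − X̄ | y = Y − Ȳ | x² | y² | xy |
|---|---|-----------|-----------|----|----|----|
| 3 | 6 | -1.8 | 0.2 | 3.24 | 0.04 | -0.36 |
| 2 | 1 | -2.8 | -4.8 | 7.84 | 23.04 | 13.44 |
| 7 | 8 | 2.2 | 2.2 | 4.84 | 4.84 | 4.84 |
| 4 | 5 | -0.8 | -0.8 | 0.64 | 0.64 | 0.64 |
| 8 | 9 | 3.2 | 3.2 | 10.24 | 10.24 | 10.24 |
| ΣX = 24 | ΣY = 29 | Σx = 0 | Σy = 0 | Σx² = 26.8 | Σy² = 38.8 | Σxy = 28.8 |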
![Page 25: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/25.jpg)
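Regression equation of X on Y is

$$X - \bar{X} = b_{xy}(Y - \bar{Y}), \qquad b_{xy} = \frac{\sum xy}{\sum y^2}$$

$$X - 4.8 = \frac{28.8}{38.8}(Y - 5.8)$$

$$X - 4.8 = 0.74(Y - 5.8)$$

$$X = 0.74Y + 0.49 \qquad \text{(I)}$$

Regression equation of Y on X is

$$Y - \bar{Y} = b_{yx}(X - \bar{X}), \qquad b_{yx} = \frac{\sum xy}{\sum x^2}$$

$$Y - 5.8 = \frac{28.8}{26.8}(X - 4.8)$$

$$Y - 5.8 = 1.07(X - 4.8)$$

$$Y = 1.07X + 0.66 \qquad \text{(II)}$$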
![Page 26: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/26.jpg)
It will be observed that these regression equations are the same as those obtained by the direct method.
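The deviation-from-the-mean calculation can be sketched in Python as follows. The function name `deviation_coefficients` is hypothetical; the data are the same five pairs, and the results are printed rounded to two decimals for comparison with the worked values.

```python
def deviation_coefficients(xs, ys):
    """Regression coefficients from deviations about the actual means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dev_x = [x - mean_x for x in xs]  # x = X - mean of X
    dev_y = [y - mean_y for y in ys]  # y = Y - mean of Y
    sum_xy = sum(p * q for p, q in zip(dev_x, dev_y))
    sum_x2 = sum(p * p for p in dev_x)
    sum_y2 = sum(q * q for q in dev_y)
    b_xy = sum_xy / sum_y2  # coefficient of X on Y
    b_yx = sum_xy / sum_x2  # coefficient of Y on X
    return b_xy, b_yx, mean_x, mean_y

X = [3, 2, 7, 4, 8]
Y = [6, 1, 8, 5, 9]
b_xy, b_yx, mean_x, mean_y = deviation_coefficients(X, Y)
print(round(b_xy, 2), round(b_yx, 2))  # 0.74 1.07
```

Each regression line then passes through the point of means (4.8, 5.8), which is what the forms X − X̄ = b_xy(Y − Ȳ) and Y − Ȳ = b_yx(X − X̄) express.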
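3. Deviation from Assumed Mean method:
When the actual means of the X and Y variables are fractions, the calculations can be simplified by taking the deviations from assumed means.

The regression equations keep the same form:

$$X - \bar{X} = b_{xy}(Y - \bar{Y})$$

$$Y - \bar{Y} = b_{yx}(X - \bar{X})$$

But here the values of $b_{xy}$ and $b_{yx}$ are calculated by the following formulas, where $d_x$ and $d_y$ are the deviations of X and Y from their assumed means.

For the regression equation of X on Y:

$$b_{xy} = \frac{N\sum d_x d_y - \sum d_x \sum d_y}{N\sum d_y^2 - \left(\sum d_y\right)^2}$$

For the regression equation of Y on X:

$$b_{yx} = \frac{N\sum d_x d_y - \sum d_x \sum d_y}{N\sum d_x^2 - \left(\sum d_x\right)^2}$$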
![Page 27: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/27.jpg)
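Example: From the data given in the previous example, calculate the regression equations by assuming 7 as the mean of the X series and 6 as the mean of the Y series.

Solution:

| X | Y | Deviation from assumed mean 7, d_x = X − 7 | d_x² | Deviation from assumed mean 6, d_y = Y − 6 | d_y² | d_x d_y |
|---|---|------|------|------|------|------|
| 3 | 6 | -4 | 16 | 0 | 0 | 0 |
| 2 | 1 | -5 | 25 | -5 | 25 | +25 |
| 7 | 8 | 0 | 0 | 2 | 4 | 0 |
| 4 | 5 | -3 | 9 | -1 | 1 | +3 |
| 8 | 9 | 1 | 1 | 3 | 9 | +3 |
| ΣX = 24 | ΣY = 29 | Σd_x = -11 | Σd_x² = 51 | Σd_y = -1 | Σd_y² = 39 | Σd_x d_y = 31 |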
![Page 28: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/28.jpg)
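The regression coefficient of X on Y:

$$b_{xy} = \frac{N\sum d_x d_y - \sum d_x \sum d_y}{N\sum d_y^2 - \left(\sum d_y\right)^2}$$

$$b_{xy} = \frac{5(31) - (-11)(-1)}{5(39) - (-1)^2}$$

$$b_{xy} = \frac{155 - 11}{195 - 1}$$

$$b_{xy} = \frac{144}{194}$$

$$b_{xy} = 0.74$$

The actual means are

$$\bar{X} = \frac{\sum X}{N} = \frac{24}{5} = 4.8, \qquad \bar{Y} = \frac{\sum Y}{N} = \frac{29}{5} = 5.8$$

The regression equation of X on Y:

$$X - \bar{X} = b_{xy}(Y - \bar{Y})$$

$$X - 4.8 = 0.74(Y - 5.8)$$

$$X = 0.74Y + 0.49$$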
![Page 29: CORRELATION ANALYSIS · Correlation a LINEAR association between two random variables Correlation analysis show us how to determine both the nature and strength of relationship between](https://reader034.vdocument.in/reader034/viewer/2022042605/5f56b8a92de95e7c593a0649/html5/thumbnails/29.jpg)
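The regression coefficient of Y on X:

$$b_{yx} = \frac{N\sum d_x d_y - \sum d_x \sum d_y}{N\sum d_x^2 - \left(\sum d_x\right)^2}$$

$$b_{yx} = \frac{5(31) - (-11)(-1)}{5(51) - (-11)^2}$$

$$b_{yx} = \frac{155 - 11}{255 - 121}$$

$$b_{yx} = \frac{144}{134}$$

$$b_{yx} = 1.07$$

The regression equation of Y on X:

$$Y - \bar{Y} = b_{yx}(X - \bar{X})$$

$$Y - 5.8 = 1.07(X - 4.8)$$

$$Y = 1.07X + 0.66$$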
It will be observed that these regression equations are the same as those obtained by the least squares method and the deviation from arithmetic mean method.
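A short Python sketch of the assumed-mean formulas is given below, with a hypothetical function name, `assumed_mean_coefficients`. It also tries a second pair of assumed means (0 and 0, chosen arbitrarily for illustration) to show that the coefficients do not depend on which assumed means are picked.

```python
def assumed_mean_coefficients(xs, ys, assumed_x, assumed_y):
    """Regression coefficients from deviations about assumed means."""
    n = len(xs)
    dx = [x - assumed_x for x in xs]
    dy = [y - assumed_y for y in ys]
    sum_dx, sum_dy = sum(dx), sum(dy)
    sum_dxdy = sum(p * q for p, q in zip(dx, dy))
    sum_dx2 = sum(p * p for p in dx)
    sum_dy2 = sum(q * q for q in dy)
    numerator = n * sum_dxdy - sum_dx * sum_dy
    b_xy = numerator / (n * sum_dy2 - sum_dy ** 2)  # X on Y
    b_yx = numerator / (n * sum_dx2 - sum_dx ** 2)  # Y on X
    return b_xy, b_yx

X = [3, 2, 7, 4, 8]
Y = [6, 1, 8, 5, 9]

b_xy, b_yx = assumed_mean_coefficients(X, Y, 7, 6)
print(round(b_xy, 2), round(b_yx, 2))  # 0.74 1.07

# A different (arbitrary) choice of assumed means gives the same coefficients.
print(assumed_mean_coefficients(X, Y, 0, 0) == (b_xy, b_yx))  # True
```

With assumed means 7 and 6 the numerator is 5(31) - (-11)(-1) = 144, and the denominators are 194 and 134, matching the worked solution.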