SRIT / PICM105 – SFM / Correlation and Regression
SRIT / M & H / M. Vijaya Kumar 1
SRI RAMAKRISHNA INSTITUTE OF TECHNOLOGY
(AN AUTONOMOUS INSTITUTION)
COIMBATORE- 641010
PICM105 & STATISTICS FOR MANAGEMENT
Unit V
CORRELATION AND REGRESSION
Correlation
If the change in one variable affects the change in the other variable, then the
variables are said to be correlated.
Positive correlation:
If the two variables deviate in the same direction, i.e., an increase (or decrease) in one
variable is accompanied by a corresponding increase (or decrease) in the other, the
correlation is said to be positive.
Ex: Income and expenditure
Negative correlation:
If the two variables deviate in opposite directions, i.e., an increase (or decrease) in one
variable is accompanied by a corresponding decrease (or increase) in the other, the
correlation is said to be negative.
Ex: Price and demand of a product
Rank correlation
Sometimes there doesn’t exist a marked linear relationship between two random
variables but a monotonic relation (if one increases, the other also increases or instead,
decreases) is clearly noticed. Pearson’s Correlation Coefficient evaluation, in this case,
would give us the strength and direction of the linear association only between the
variables of interest. Herein comes the advantage of the Spearman Rank Correlation
methods, which will instead, give us the strength and direction of the monotonic relation
between the connected variables. This can be a good starting point for further evaluation.
The Spearman Rank-Order Correlation Coefficient
The Spearman’s Correlation Coefficient, represented by $\rho$ or by $r_s$, is a
nonparametric measure of the strength and direction of the association that exists between
two ranked variables. It determines the degree to which a relationship is monotonic, i.e.,
whether there is a monotonic component of the association between two continuous or
ordered variables.
Monotonicity is “less restrictive” than that of a linear relationship. Although
monotonicity is not actually a requirement of Spearman’s correlation, it will not be
meaningful to pursue Spearman’s correlation to determine the strength and direction of a
monotonic relationship if we already know the relationship between the two variables is
not monotonic.
Spearman Ranking of the Data
We must rank the data under consideration before proceeding with the Spearman’s
Rank Correlation evaluation. This is necessary because we need to compare whether on
increasing one variable, the other follows a monotonic relation (increases or decreases
regularly) with respect to it or not.
Thus, at every level, we need to compare the values of the two variables. The method of
ranking assigns such ‘levels’ to each value in the dataset so that we can easily compare it.
Assign the numbers 1 to $n$ (the number of data points) to the variable
values in order from highest to lowest.
In the case of two or more values being identical, assign to them the arithmetic mean
of the ranks that they would have otherwise occupied.
For example, the Selling Price values given are: 28.2, 32.8, 19.4, 22.5, 20.0, 22.5. The
corresponding ranks are: 2, 1, 6, 3.5, 5, 3.5. The highest value 32.8 is given rank 1, 28.2 is
given rank 2, and so on. Two values are identical (22.5), and in this case the arithmetic mean of the
ranks that they would have otherwise occupied, $(3+4)/2 = 3.5$, is taken. The remaining
values 20.0 and 19.4 then receive ranks 5 and 6.
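The ranking rule described above can be sketched in Python. This is only an illustration and not part of the original notes; the function name `average_ranks` is a hypothetical choice.

```python
def average_ranks(values):
    """Rank values from highest (rank 1) to lowest; tied values share the mean rank."""
    # Indices of the values, ordered from the largest value to the smallest.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend 'end' over the run of values tied with the one at 'pos'.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Positions pos..end would occupy ranks pos+1..end+1; take their mean.
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


selling_prices = [28.2, 32.8, 19.4, 22.5, 20.0, 22.5]
print(average_ranks(selling_prices))  # [2.0, 1.0, 6.0, 3.5, 5.0, 3.5]
```

The two tied values 22.5 would have occupied ranks 3 and 4, so each receives 3.5.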
Spearman’s Rank Correlation formula
$$\rho = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)}$$
where $n$ is the number of data points of the two variables and $d_i$ is the difference in the
ranks of the $i$th element of each random variable considered. The Spearman correlation
coefficient, $\rho$, can take values from $-1$ to $+1$.
I. A value of $\rho = +1$ indicates a perfect association of ranks.
II. A value of $\rho = 0$ indicates no association between ranks.
III. A value of $\rho = -1$ indicates a perfect negative association of ranks.
IV. The closer $\rho$ is to zero, the weaker the association between the ranks.
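As a rough sketch of how the rank correlation formula is applied when there are no tied ranks, the following Python function computes the coefficient directly from two lists of ranks. The function name and the sample ranks are illustrative assumptions, not taken from the notes.

```python
def spearman_from_ranks(rank_x, rank_y):
    """Spearman's coefficient 1 - 6*sum(d^2) / (n*(n^2 - 1)) for untied ranks."""
    n = len(rank_x)
    # d is the difference between the two ranks of the same item.
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical ranks of five items under two criteria.
ranks_a = [1, 2, 3, 4, 5]
ranks_b = [2, 1, 4, 3, 5]
print(round(spearman_from_ranks(ranks_a, ranks_b), 4))  # 0.8
```

Here the squared differences sum to 4, so the coefficient is 1 - 24/120 = 0.8.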
Merits of Rank Correlation Coefficient
1. Spearman’s rank correlation coefficient can be interpreted in the same way as the Karl
Pearson’s correlation coefficient;
2. It is easy to understand and easy to calculate;
3. If we want to see the association between qualitative characteristics, rank correlation
coefficient is the only formula;
4. Rank correlation coefficient is the non-parametric version of the Karl Pearson’s product
moment correlation coefficient; and
5. It does not require the assumption of the normality of the population from which the
sample observations are taken.
Demerits of Rank Correlation Coefficient
1. Product moment correlation coefficient can be calculated for bivariate frequency
distribution but rank correlation coefficient cannot be calculated; and
2. If $n$ is large, this formula is time consuming.
Problem: 1
The following table provides data about the percentage of students who have free
university meals and their CGPA scores. Calculate the Spearman’s Rank Correlation
between the two and interpret the result.
State University    % of students having free meals    % of students scoring above 8.5 CGPA
Pune                14.4                               54
Chennai              7.2                               64
Delhi               27.5                               44
Kanpur              33.8                               32
Ahmedabad           38.0                               37
Indore              15.9                               68
Guwahati             4.9                               62
Answer:
Let us first assign the random variables to the required data:
X: % of students having free meals
Y: % of students scoring above 8.5 CGPA
Before proceeding with the calculation, we need to assign ranks to the data
corresponding to each state university (the highest value gets rank 1). We construct the
rank table as below, with $d$ = Rank in X - Rank in Y:
X       Y     Rank in X   Rank in Y    d     d^2
14.4    54    5           4            1     1
7.2     64    6           2            4     16
27.5    44    3           5           -2     4
33.8    32    2           7           -5     25
38.0    37    1           6           -5     25
15.9    68    4           1            3     9
4.9     62    7           3            4     16
$\sum d = 0$, $\sum d^2 = 96$
Rank Correlation
$$\rho = 1 - \frac{6\sum d^2}{n(n^2-1)} = 1 - \frac{6(96)}{7(7^2-1)} = 1 - \frac{576}{336} = -0.7143$$
The result shows a strong negative correlation. That is, universities with the highest
percentage of students having free meals tend to have the least successful results.
Problem: 2
Compute the coefficient of rank correlation between sales and advertisement expressed in
thousands of dollars from the following data:
Sales 90 85 68 75 82 80 95 70
Advertisement 7 6 2 3 4 5 8 1
Answer:
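Let X denote sales and Y denote advertisement, with the highest value given rank 1 and
$d$ = Rank in X - Rank in Y.
X     Rank in X   Y    Rank in Y    d     d^2
90    2           7    2            0     0
85    3           6    3            0     0
68    8           2    7            1     1
75    6           3    6            0     0
82    4           4    5           -1     1
80    5           5    4            1     1
95    1           8    1            0     0
70    7           1    8           -1     1
$\sum d = 0$, $\sum d^2 = 4$
Rank Correlation
$$\rho = 1 - \frac{6\sum d^2}{n(n^2-1)} = 1 - \frac{6(4)}{8(8^2-1)} = 1 - \frac{24}{504} = 0.9524$$
The result shows a strong positive correlation. Hence there is a very
good amount of agreement between sales and advertisement.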
Problem: 3
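Find the rank correlation coefficient from the following data.
Rank in X 1 2 3 4 5 6 7
Rank in Y 4 3 1 2 6 5 7
Answer:
The rank differences $d$ = Rank in X - Rank in Y are $-3, -1, 2, 2, -1, 1, 0$, so that
$\sum d = 0$, $\sum d^2 = 9 + 1 + 4 + 4 + 1 + 1 + 0 = 20$
Rank Correlation
$$\rho = 1 - \frac{6\sum d^2}{n(n^2-1)} = 1 - \frac{6(20)}{7(7^2-1)} = 1 - \frac{120}{336} = 0.6429$$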
Problem: 4
The ranks of some 16 students in mathematics and physics are as follows. Find the
rank correlation for the proficiency in mathematics and physics.
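Rank in Maths   1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Rank in Physics 1 10 3 4 5 7 2 6 8 11 15 9 14 12 16 13
Answer:
The squared rank differences $d^2$ are 0, 64, 0, 0, 0, 1, 25, 4, 1, 1, 16, 9, 1, 4, 1, 9, so that
$\sum d = 0$, $\sum d^2 = 136$
Rank Correlation
$$\rho = 1 - \frac{6\sum d^2}{n(n^2-1)} = 1 - \frac{6(136)}{16(16^2-1)} = 1 - \frac{816}{4080} = 0.8$$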
Problem: 5
Suppose we have ranks of 5 students in three subjects Computer, Physics and Statistics and
we want to test which two subjects have the same trend.
Rank in Computer   2 4 5 1 3
Rank in Physics    5 1 2 3 4
Rank in Statistics 2 3 5 4 1
Answer:
In this problem the ranks are directly given. Let X, Y and Z denote the ranks in Computer,
Physics and Statistics, with $d_{12} = X - Y$, $d_{23} = Y - Z$ and $d_{13} = X - Z$.
Rank in X   Rank in Y   Rank in Z   d12^2   d23^2   d13^2
2           5           2           9       9       0
4           1           3           9       4       1
5           2           5           9       9       0
1           3           4           4       1       9
3           4           1           1       9       4
$\sum d_{12}^2 = 32$, $\sum d_{23}^2 = 32$, $\sum d_{13}^2 = 14$
Rank Correlation
$$\rho = 1 - \frac{6\sum d^2}{n(n^2-1)}$$
I. Rank correlation between computer and physics:
$$\rho_{12} = 1 - \frac{6(32)}{5(5^2-1)} = 1 - \frac{192}{120} = -0.6$$
II. Rank correlation between physics and statistics:
$$\rho_{23} = 1 - \frac{6(32)}{5(5^2-1)} = 1 - \frac{192}{120} = -0.6$$
III. Rank correlation between computer and statistics:
$$\rho_{13} = 1 - \frac{6(14)}{5(5^2-1)} = 1 - \frac{84}{120} = 0.3$$
Since $\rho_{12}$ and $\rho_{23}$ are negative, Computer and Physics, and also Physics and
Statistics, have opposite trends. But $\rho_{13} > 0$ indicates that Computer and Statistics have the same
trend.
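Repeated Ranks
If a value is repeated $m$ times in either the $X$ series or the $Y$ series, then we
have to add the correction factor $\frac{m(m^2-1)}{12}$ to $\sum d^2$ in the rank correlation formula, once for each
repeated value:
$$\text{C.F.} = \sum \frac{m(m^2-1)}{12}, \qquad \rho = 1 - \frac{6\left(\sum d^2 + \text{C.F.}\right)}{n(n^2-1)}$$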
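The correction for repeated ranks can be sketched as follows. This is an illustrative Python outline under the assumption that tied values receive the mean of the ranks they would otherwise occupy; the function names and the small data set are hypothetical.

```python
from collections import Counter


def average_ranks(values):
    """Rank from highest (1) to lowest; tied values share the mean rank."""
    ranks = []
    for v in values:
        greater = sum(1 for w in values if w > v)
        equal = sum(1 for w in values if w == v)  # includes v itself
        ranks.append(greater + (equal + 1) / 2)
    return ranks


def correction_factor(values):
    """Sum of m(m^2 - 1)/12 over every value repeated m > 1 times."""
    return sum(m * (m * m - 1) / 12 for m in Counter(values).values() if m > 1)


def spearman_with_ties(x, y):
    """Rank correlation with the correction factor added to sum(d^2)."""
    n = len(x)
    rx, ry = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    cf = correction_factor(x) + correction_factor(y)
    return 1 - 6 * (sum_d2 + cf) / (n * (n * n - 1))


# Hypothetical data with one repeated value in each series.
x = [10, 20, 20, 30]
y = [1, 2, 3, 3]
print(round(spearman_with_ties(x, y), 4))  # 0.75
```

For this data the squared rank differences sum to 1.5 and each series contributes a correction of 0.5, giving 1 - 6(2.5)/60 = 0.75.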
Problem: 6
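Determine the rank correlation coefficient for the following data.
X 68 64 75 50 64 80 75 40 55 64
Y 62 58 68 45 81 60 68 48 50 70
Answer:
X     Y     Rank in X   Rank in Y    d     d^2
68    62    4           5           -1     1
64    58    6           7           -1     1
75    68    2.5         3.5         -1     1
50    45    9           10          -1     1
64    81    6           1            5     25
80    60    1           6           -5     25
75    68    2.5         3.5         -1     1
40    48    10          9            1     1
55    50    8           8            0     0
64    70    6           2            4     16
$\sum d = 0$, $\sum d^2 = 72$
To find Correction Factor:
In the X series, the value 75 is repeated two times: $\frac{2(2^2-1)}{12} = 0.5$
In the X series, the value 64 is repeated three times: $\frac{3(3^2-1)}{12} = 2$
In the Y series, the value 68 is repeated two times: $\frac{2(2^2-1)}{12} = 0.5$
C.F. $= 0.5 + 2 + 0.5 = 3$
Rank Correlation:
$$\rho = 1 - \frac{6\left(\sum d^2 + \text{C.F.}\right)}{n(n^2-1)} = 1 - \frac{6(72 + 3)}{10(10^2-1)} = 1 - \frac{450}{990} = 0.5455$$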
Problem: 7
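A sample of 12 fathers and their eldest sons gave the following data about their
heights in inches. Calculate the rank correlation coefficient.
Fathers 65 63 67 64 68 62 70 66 68 67 69 71
Sons 68 66 68 65 69 66 68 65 71 67 68 70
Answer:
Let X denote the heights of the fathers and Y the heights of the sons.
X     Y     Rank in X   Rank in Y    d      d^2
65    68    9           5.5          3.5    12.25
63    66    11          9.5          1.5    2.25
67    68    6.5         5.5          1      1
64    65    10          11.5        -1.5    2.25
68    69    4.5         3            1.5    2.25
62    66    12          9.5          2.5    6.25
70    68    2           5.5         -3.5    12.25
66    65    8           11.5        -3.5    12.25
68    71    4.5         1            3.5    12.25
67    67    6.5         8           -1.5    2.25
69    68    3           5.5         -2.5    6.25
71    70    1           2           -1      1
$\sum d = 0$, $\sum d^2 = 72.5$
To find Correction Factor:
In the X series, the value 68 is repeated two times: $\frac{2(2^2-1)}{12} = 0.5$
In the X series, the value 67 is repeated two times: $\frac{2(2^2-1)}{12} = 0.5$
In the Y series, the value 68 is repeated four times: $\frac{4(4^2-1)}{12} = 5$
In the Y series, the value 66 is repeated two times: $\frac{2(2^2-1)}{12} = 0.5$
In the Y series, the value 65 is repeated two times: $\frac{2(2^2-1)}{12} = 0.5$
C.F. $= 0.5 + 0.5 + 5 + 0.5 + 0.5 = 7$
Rank Correlation:
$$\rho = 1 - \frac{6\left(\sum d^2 + \text{C.F.}\right)}{n(n^2-1)} = 1 - \frac{6(72.5 + 7)}{12(12^2-1)} = 1 - \frac{477}{1716} = 0.7220$$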
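Karl Pearson’s Coefficient of Correlation
Karl Pearson’s coefficient of correlation is a widely used mathematical method
in which a numerical expression is used to calculate the degree and direction of the
relationship between linearly related variables.
Pearson’s method, popularly known as the Pearsonian coefficient of correlation, is the
most extensively used quantitative method in practice. The coefficient of correlation is
denoted by $r$.
If the relationship between two variables $X$ and $Y$ is to be ascertained, then the
following formula is used:
$$r(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \, \sigma_Y}$$
$$\mathrm{Cov}(X, Y) = E(XY) - E(X)\,E(Y)$$
$$E(XY) = \frac{\sum XY}{n}, \qquad E(X) = \bar{X} = \frac{\sum X}{n}, \qquad E(Y) = \bar{Y} = \frac{\sum Y}{n}$$
$$\sigma_X^2 = E(X^2) - [E(X)]^2, \qquad \sigma_Y^2 = E(Y^2) - [E(Y)]^2$$
$$E(X^2) = \frac{\sum X^2}{n}, \qquad E(Y^2) = \frac{\sum Y^2}{n}$$
Note:
If $\mathrm{Cov}(X, Y) = 0$ then $E(XY) = E(X)\,E(Y)$.
If $\mathrm{Cov}(X, Y) = 0$ (equivalently $r(X, Y) = 0$) then $X$ and $Y$ are uncorrelated.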
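A minimal Python sketch of the computation, assuming the population form of the formulas (division by $n$), is given below. The function name `pearson_r` and the sample data are illustrative and not taken from the notes.

```python
from math import sqrt


def pearson_r(x, y):
    """r = Cov(X, Y) / (sigma_X * sigma_Y), using sums divided by n."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Cov(X, Y) = E(XY) - E(X)E(Y)
    cov = sum(a * b for a, b in zip(x, y)) / n - mean_x * mean_y
    # sigma^2 = E(X^2) - [E(X)]^2
    sd_x = sqrt(sum(a * a for a in x) / n - mean_x ** 2)
    sd_y = sqrt(sum(b * b for b in y) / n - mean_y ** 2)
    return cov / (sd_x * sd_y)


# Hypothetical data for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(pearson_r(x, y), 4))  # 0.7746
```

For this data Cov(X, Y) = 1.2, the variances are 2 and 1.2, and r = 1.2 / sqrt(2.4).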
Coefficient of Determination
In statistical analysis, the coefficient of determination is used to assess how well a
model explains and predicts outcomes. It is also known as R squared ($R^2$) and serves as
a guideline for measuring the model’s accuracy. This section gives the definition and
properties of the coefficient of determination.
Definition: Coefficient of Determination
The coefficient of determination, or $R^2$, is the proportion of the
variance in the dependent variable that is predicted from the independent variable. It
indicates how much of the variation in the given data set is explained by the model.
The coefficient of determination is the square of the correlation coefficient $r$, thus it ranges
from 0 to 1.
With simple linear regression, the coefficient of determination is equal to the square of the
correlation between the $x$ and $y$ variables.
If $R^2$ is equal to 0, then the dependent variable cannot be predicted from the
independent variable.
If $R^2$ is equal to 1, then the dependent variable can be predicted from the
independent variable without any error.
If $R^2$ is between 0 and 1, then it indicates the extent to which the dependent variable is
predictable. An $R^2$ of 0.10 means that 10 per cent of the variance in the $y$ variable is
predicted from the $x$ variable; an $R^2$ of 0.20 means that 20 per cent of the variance in the
$y$ variable is predicted from the $x$ variable, and so on.
The value of $R^2$ shows whether the model would be a good fit for the given data set. What
counts as a good fit depends on the context of the analysis.
For instance, in a few fields like rocket science, $R^2$ is expected to be nearer to 100 %. At the
other extreme, the minimum theoretical value $R^2 = 0$ is rarely observed exactly, since a
least-squares linear regression fitted to real data almost always gives $R^2$ greater than 0.
The value of $R^2$ never decreases when a new predictor variable is added, even if that
predictor is not associated with the outcome. The adjusted $R^2$ carries the
same information as the original $R^2$ but penalizes the number of predictor variables in the
model. When new predictors are added to a multiple linear regression model, $R^2$
increases; only an increase in $R^2$ greater than would be expected by chance alone will
increase the adjusted $R^2$.
Properties of Coefficient of Determination
It gives the proportion of the variation in one variable that can be predicted from the
other.
If we want to check how reliably predictions can be made from the given data, we can
determine this from the coefficient of determination.
It equals Explained variation / Total variation.
It also lets us know the strength of the linear association between the variables.
If the value of $R^2$ is close to 1, the values of $y$ lie close to the regression line,
and similarly if it is close to 0, the values lie far from the regression line.
It helps in determining the strength of association between different variables.
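The relation "explained variation / total variation" can be checked numerically. The sketch below fits a least-squares line and computes $R^2 = 1 - \text{SSE}/\text{SST}$; for simple linear regression this equals $r^2$. The function name and data are illustrative assumptions.

```python
def r_squared(x, y):
    """R^2 of the least-squares line of y on x, computed as 1 - SSE/SST."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Unexplained (residual) variation around the fitted line.
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    # Total variation around the mean of y.
    sst = sum((b - mean_y) ** 2 for b in y)
    return 1 - sse / sst


# Hypothetical data for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(r_squared(x, y), 4))  # 0.6
```

Here SSE = 2.4 and SST = 6, so 60 per cent of the variance in y is explained by x.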
Problem: 8
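Find the correlation coefficient for the following data.
X 10 14 18 22 26 30
Y 18 12 24 6 30 36
Answer:
X     Y     XY     X^2    Y^2
10    18    180    100    324
14    12    168    196    144
18    24    432    324    576
22    6     132    484    36
26    30    780    676    900
30    36    1080   900    1296
$\sum X = 120$, $\sum Y = 126$, $\sum XY = 2772$, $\sum X^2 = 2680$, $\sum Y^2 = 3276$
$$\bar{X} = \frac{\sum X}{n} = \frac{120}{6} = 20, \qquad \bar{Y} = \frac{\sum Y}{n} = \frac{126}{6} = 21$$
$$\sigma_X = \sqrt{\frac{\sum X^2}{n} - \bar{X}^2} = \sqrt{\frac{2680}{6} - 20^2} = \sqrt{46.6667} = 6.8313$$
$$\sigma_Y = \sqrt{\frac{\sum Y^2}{n} - \bar{Y}^2} = \sqrt{\frac{3276}{6} - 21^2} = \sqrt{105} = 10.2470$$
$$\mathrm{Cov}(X, Y) = \frac{\sum XY}{n} - \bar{X}\bar{Y} = \frac{2772}{6} - (20)(21) = 42$$
$$r(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{42}{(6.8313)(10.2470)} = 0.6$$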
Problem: 9
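The table below shows the number of absences, $X$, in a Calculus course and the final exam
grade, $Y$, for 7 students. Find the correlation coefficient and interpret your result.
X 1 0 2 6 4 3 3
Y 95 90 90 55 70 80 85
Answer:
X    Y     X^2   Y^2    XY
1    95    1     9025   95
0    90    0     8100   0
2    90    4     8100   180
6    55    36    3025   330
4    70    16    4900   280
3    80    9     6400   240
3    85    9     7225   255
$\sum X = 19$, $\sum Y = 565$, $\sum X^2 = 75$, $\sum Y^2 = 46775$, $\sum XY = 1380$
$$\bar{X} = \frac{19}{7} = 2.7143, \qquad \bar{Y} = \frac{565}{7} = 80.7143$$
$$\sigma_X = \sqrt{\frac{\sum X^2}{n} - \bar{X}^2} = \sqrt{\frac{75}{7} - (2.7143)^2} = \sqrt{3.3469} = 1.8295$$
$$\sigma_Y = \sqrt{\frac{\sum Y^2}{n} - \bar{Y}^2} = \sqrt{\frac{46775}{7} - (80.7143)^2} = \sqrt{167.3469} = 12.9363$$
$$\mathrm{Cov}(X, Y) = \frac{\sum XY}{n} - \bar{X}\bar{Y} = \frac{1380}{7} - (2.7143)(80.7143) = -21.9388$$
$$r(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{-21.9388}{(1.8295)(12.9363)} = -0.9270$$
Interpret this result:
There is a strong negative correlation between the number of absences and the final
exam grade, since $r$ is very close to $-1$. Thus, as the number of absences increases, the final
exam grade tends to decrease.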
Problem: 10
The marks obtained by 10 students in Mathematics and Statistics are given below.
Find the correlation coefficient between the two subjects.
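Marks in Maths (X) 75 30 60 80 53 35 15 40 38 48
Marks in Stats (Y) 85 45 54 91 58 63 35 43 45 44
Answer:
X     Y     XY     X^2    Y^2
75    85    6375   5625   7225
30    45    1350   900    2025
60    54    3240   3600   2916
80    91    7280   6400   8281
53    58    3074   2809   3364
35    63    2205   1225   3969
15    35    525    225    1225
40    43    1720   1600   1849
38    45    1710   1444   2025
48    44    2112   2304   1936
$\sum X = 474$, $\sum Y = 563$, $\sum XY = 29591$, $\sum X^2 = 26132$, $\sum Y^2 = 34815$
$$\bar{X} = \frac{474}{10} = 47.4, \qquad \bar{Y} = \frac{563}{10} = 56.3$$
$$\sigma_X = \sqrt{\frac{\sum X^2}{n} - \bar{X}^2} = \sqrt{2613.2 - (47.4)^2} = \sqrt{366.44} = 19.1426$$
$$\sigma_Y = \sqrt{\frac{\sum Y^2}{n} - \bar{Y}^2} = \sqrt{3481.5 - (56.3)^2} = \sqrt{311.81} = 17.6581$$
$$\mathrm{Cov}(X, Y) = \frac{\sum XY}{n} - \bar{X}\bar{Y} = 2959.1 - (47.4)(56.3) = 290.48$$
$$r(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{290.48}{(19.1426)(17.6581)} = 0.8594$$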
Problem: 11
Compute the coefficient of correlation between X and Y using the following data:
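X 1 3 5 7 8 10
Y 8 12 15 17 18 20
Answer:
X     Y     XY    X^2   Y^2
1     8     8     1     64
3     12    36    9     144
5     15    75    25    225
7     17    119   49    289
8     18    144   64    324
10    20    200   100   400
$\sum X = 34$, $\sum Y = 90$, $\sum XY = 582$, $\sum X^2 = 248$, $\sum Y^2 = 1446$
$$\bar{X} = \frac{34}{6} = 5.6667, \qquad \bar{Y} = \frac{90}{6} = 15$$
$$\sigma_X = \sqrt{\frac{\sum X^2}{n} - \bar{X}^2} = \sqrt{\frac{248}{6} - (5.6667)^2} = \sqrt{9.2222} = 3.0368$$
$$\sigma_Y = \sqrt{\frac{\sum Y^2}{n} - \bar{Y}^2} = \sqrt{\frac{1446}{6} - 15^2} = \sqrt{16} = 4$$
$$\mathrm{Cov}(X, Y) = \frac{\sum XY}{n} - \bar{X}\bar{Y} = \frac{582}{6} - (5.6667)(15) = 12$$
$$r(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{12}{(3.0368)(4)} = 0.9879$$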
Problem: 12
Calculate the coefficient of correlation for the following data:
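X 9 8 7 6 5 4 3 2 1
Y 15 16 14 13 11 12 10 8 9
Answer:
X    Y     XY    X^2   Y^2
9    15    135   81    225
8    16    128   64    256
7    14    98    49    196
6    13    78    36    169
5    11    55    25    121
4    12    48    16    144
3    10    30    9     100
2    8     16    4     64
1    9     9     1     81
$\sum X = 45$, $\sum Y = 108$, $\sum XY = 597$, $\sum X^2 = 285$, $\sum Y^2 = 1356$
$$\bar{X} = \frac{45}{9} = 5, \qquad \bar{Y} = \frac{108}{9} = 12$$
$$\sigma_X = \sqrt{\frac{\sum X^2}{n} - \bar{X}^2} = \sqrt{\frac{285}{9} - 5^2} = \sqrt{6.6667} = 2.5820$$
$$\sigma_Y = \sqrt{\frac{\sum Y^2}{n} - \bar{Y}^2} = \sqrt{\frac{1356}{9} - 12^2} = \sqrt{6.6667} = 2.5820$$
$$\mathrm{Cov}(X, Y) = \frac{\sum XY}{n} - \bar{X}\bar{Y} = \frac{597}{9} - (5)(12) = 6.3333$$
$$r(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{6.3333}{(2.5820)(2.5820)} = 0.95$$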
Regression
Regression analysis is most often used for prediction. The goal in regression analysis
is to create a mathematical model that can be used to predict the values of a dependent
variable based upon the values of an independent variable. In other words, we use the
model to predict the value of $Y$ when we know the value of $X$. (The dependent variable is
the one to be predicted.) Correlation analysis is often used with regression analysis
because correlation analysis is used to measure the strength of association between the
two variables $X$ and $Y$.
In regression analysis involving one independent variable and one dependent
variable, the values are frequently plotted in two dimensions as a scatter plot. The scatter
plot allows us to visually inspect the data prior to running a regression analysis. Often this
step allows us to see whether the relationship between the two variables is increasing or
decreasing, but it gives only a rough idea of the relationship. The simplest relationship
between two variables is a straight-line or linear relationship. Of course the data may well
be curvilinear, and in that case we would have to use a different model to describe the
relationship (we will deal only with linear relationships for now). Simple linear regression
analysis finds the straight line that best fits the data.
Definition:
Regression is a mathematical measure of the average relationship between two or
more variables in terms of the original units of the data.
The equation of the line of regression of $Y$ on $X$ is
$$Y - \bar{Y} = r\,\frac{\sigma_Y}{\sigma_X}\,(X - \bar{X})$$
The equation of the line of regression of $X$ on $Y$ is
$$X - \bar{X} = r\,\frac{\sigma_X}{\sigma_Y}\,(Y - \bar{Y})$$
Regression coefficients: $b_{YX} = r\,\dfrac{\sigma_Y}{\sigma_X}$ and $b_{XY} = r\,\dfrac{\sigma_X}{\sigma_Y}$
Correlation coefficient: $r = \pm\sqrt{b_{YX}\, b_{XY}}$, where $r$ takes the common sign of the two regression coefficients.
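The two regression lines and the relation between the regression coefficients and $r$ can be sketched in Python as follows. The function name `regression_lines`, the returned dictionary layout and the sample data are illustrative assumptions; the formulas use $b_{YX} = \mathrm{Cov}(X,Y)/\sigma_X^2$ and $b_{XY} = \mathrm{Cov}(X,Y)/\sigma_Y^2$, which are equivalent to $r\sigma_Y/\sigma_X$ and $r\sigma_X/\sigma_Y$.

```python
from math import copysign, sqrt


def regression_lines(x, y):
    """Return slope/intercept of Y on X and of X on Y, plus r."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum(a * b for a, b in zip(x, y)) / n - mean_x * mean_y
    var_x = sum(a * a for a in x) / n - mean_x ** 2
    var_y = sum(b * b for b in y) / n - mean_y ** 2
    b_yx = cov / var_x  # regression coefficient of Y on X
    b_xy = cov / var_y  # regression coefficient of X on Y
    # r is the geometric mean of the coefficients, with their common sign.
    r = copysign(sqrt(b_yx * b_xy), cov)
    return {
        "y_on_x": (b_yx, mean_y - b_yx * mean_x),  # Y = slope * X + intercept
        "x_on_y": (b_xy, mean_x - b_xy * mean_y),  # X = slope * Y + intercept
        "r": r,
    }


# Hypothetical data for illustration.
lines = regression_lines([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(lines["y_on_x"], lines["x_on_y"], round(lines["r"], 4))
```

Both lines pass through the point of means, which is why each intercept is obtained by substituting the means into the corresponding line.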
Problem: 13
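Obtain the equations of the lines of regression from the following data:
X 1 2 3 4 5 6 7
Y 9 8 10 12 11 13 14
Answer:
X       Y     XY    X^2   Y^2
1       9     9     1     81
2       8     16    4     64
3       10    30    9     100
4       12    48    16    144
5       11    55    25    121
6       13    78    36    169
7       14    98    49    196
Total   28    77    334   140   875
(the totals are $\sum X = 28$, $\sum Y = 77$, $\sum XY = 334$, $\sum X^2 = 140$, $\sum Y^2 = 875$)
$$\bar{X} = \frac{\sum X}{n} = \frac{28}{7} = 4, \qquad \bar{Y} = \frac{\sum Y}{n} = \frac{77}{7} = 11$$
$$\sigma_X = \sqrt{\frac{\sum X^2}{n} - \bar{X}^2} = \sqrt{\frac{140}{7} - 4^2} = \sqrt{4} = 2$$
$$\sigma_Y = \sqrt{\frac{\sum Y^2}{n} - \bar{Y}^2} = \sqrt{\frac{875}{7} - 11^2} = \sqrt{4} = 2$$
$$\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y) = \frac{\sum XY}{n} - \bar{X}\bar{Y} = \frac{334}{7} - (4)(11) = 3.7143$$
Correlation coefficient
$$r(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{3.7143}{(2)(2)} = 0.9286$$
The line of regression of $Y$ on $X$ is
$$Y - 11 = 0.9286\left(\frac{2}{2}\right)(X - 4)$$
$$Y = 0.9286X + 7.2857$$
The line of regression of $X$ on $Y$ is
$$X - 4 = 0.9286\left(\frac{2}{2}\right)(Y - 11)$$
$$X = 0.9286Y - 6.2143$$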
Problem: 14
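From the following data find
(i) The two regression lines
(ii) The coefficient of correlation between the marks in economics and statistics
(iii) The most likely marks in statistics when marks in economics are 30.
Marks in Economics (X): 25 28 35 32 31 36 29 38 34 32
Marks in Statistics (Y): 43 46 49 41 36 32 31 30 33 39
Answer:
X       Y     XY      X^2     Y^2
25      43    1075    625     1849
28      46    1288    784     2116
35      49    1715    1225    2401
32      41    1312    1024    1681
31      36    1116    961     1296
36      32    1152    1296    1024
29      31    899     841     961
38      30    1140    1444    900
34      33    1122    1156    1089
32      39    1248    1024    1521
Total   320   380     12067   10380   14838
(the totals are $\sum X = 320$, $\sum Y = 380$, $\sum XY = 12067$, $\sum X^2 = 10380$, $\sum Y^2 = 14838$)
$$\bar{X} = \frac{320}{10} = 32, \qquad \bar{Y} = \frac{380}{10} = 38$$
$$\sigma_X = \sqrt{\frac{\sum X^2}{n} - \bar{X}^2} = \sqrt{1038 - 32^2} = \sqrt{14} = 3.7417$$
$$\sigma_Y = \sqrt{\frac{\sum Y^2}{n} - \bar{Y}^2} = \sqrt{1483.8 - 38^2} = \sqrt{39.8} = 6.3087$$
$$\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y) = \frac{\sum XY}{n} - \bar{X}\bar{Y} = 1206.7 - (32)(38) = -9.3$$
Correlation coefficient
$$r(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{-9.3}{(3.7417)(6.3087)} = -0.3940$$
The line of regression of $Y$ on $X$ is
$$Y - 38 = (-0.3940)\left(\frac{6.3087}{3.7417}\right)(X - 32)$$
$$Y - 38 = -0.6643\,(X - 32)$$
$$Y = -0.6643X + 59.2571$$
The line of regression of $X$ on $Y$ is
$$X - 32 = (-0.3940)\left(\frac{3.7417}{6.3087}\right)(Y - 38)$$
$$X - 32 = -0.2337\,(Y - 38)$$
$$X = -0.2337Y + 40.8794$$
The most likely marks in statistics when marks in economics are 30:
$$Y = -0.6643(30) + 59.2571 = 39.33$$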
Problem: 15
A tyre manufacturing company is interested in removing pollutants from the exhaust at the
factory, and cost is a concern. The company has collected data from other companies
concerning the amount of money spent on environmental measures and the resulting
amount of dangerous pollutants released (as a percentage of total emissions).
Money spent (Rupees in lakhs), X:       8.4 10.2 16.5 21.7 9.4 8.3 11.5 18.4 16.7 19.3 28.4 4.7 12.3
Percentage of dangerous pollutants, Y:  35.9 31.8 24.7 25.2 36.8 35.8 33.4 25.4 31.4 27.4 15.8 31.5 28.9
a) Compute the regression equation.
b) Predict the percentage of dangerous pollutants released when Rs. 20,000 is spent on
control measures.
c) Find the standard error of the estimate (regression line).
Answer:
S. No   X   Y   X^2   Y^2   XY
1 8.4 35.9 70.56 1288.8 301.56
2 10.2 31.8 104.04 1011.2 324.36
3 16.5 24.7 272.25 610.09 407.55
4 21.7 25.2 470.89 635.04 546.84
5 9.4 36.8 88.36 1354.2 345.92
6 8.3 35.8 68.89 1281.6 297.14
7 11.5 33.4 132.25 1115.6 384.1
8 18.4 25.4 338.56 645.16 467.36
9 16.7 31.4 278.89 985.96 524.38
10 19.3 27.4 372.49 750.76 528.82
11 28.4 15.8 806.56 249.64 448.72
12 4.7 31.5 22.09 992.25 148.05
13 12.3 28.9 151.29 835.21 355.47
Total ∑ 185.8 ∑ 384 ∑ 3177 ∑ 11756 ∑ 5080
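a) Regression equation
$$\bar{X} = \frac{\sum X}{n} = \frac{185.8}{13} = 14.2923, \qquad \bar{Y} = \frac{\sum Y}{n} = \frac{384}{13} = 29.5385$$
$$\sigma_X = \sqrt{\frac{\sum X^2}{n} - \bar{X}^2} = \sqrt{\frac{3177}{13} - (14.2923)^2} = \sqrt{40.1146} = 6.3336$$
$$\sigma_Y = \sqrt{\frac{\sum Y^2}{n} - \bar{Y}^2} = \sqrt{\frac{11756}{13} - (29.5385)^2} = \sqrt{31.7870} = 5.6380$$
$$\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y) = \frac{\sum XY}{n} - \bar{X}\bar{Y} = \frac{5080}{13} - (14.2923)(29.5385) = -31.4036$$
Correlation coefficient
$$r(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{-31.4036}{(6.3336)(5.6380)} = -0.8794$$
The line of regression of $Y$ on $X$ is
$$Y - 29.5385 = (-0.8794)\left(\frac{5.6380}{6.3336}\right)(X - 14.2923)$$
$$Y - 29.5385 = -0.7828\,(X - 14.2923)$$
$$Y = -0.7828X + 40.7272$$
The line of regression of $X$ on $Y$ is
$$X - 14.2923 = (-0.8794)\left(\frac{6.3336}{5.6380}\right)(Y - 29.5385)$$
$$X - 14.2923 = -0.9879\,(Y - 29.5385)$$
$$X = -0.9879Y + 43.4745$$
b) When Rs. 20,000 ($X = 0.2$ lakhs) is spent on control measures, the percentage of
dangerous pollutants released is
$$Y = -0.7828(0.2) + 40.7272 = 40.57$$
c) Standard Error of Estimate
With $\hat{Y} = -0.7828X + 40.7272$ denoting the value predicted by the regression line:
S. No   X      Y      (Y - Ŷ)^2
1       8.4    35.9   3.056
2       10.2   31.8   0.888
3       16.5   24.7   9.669
4       21.7   25.2   2.138
5       9.4    36.8   11.773
6       8.3    35.8   2.465
7       11.5   33.4   2.807
8       18.4   25.4   0.850
9       16.7   31.4   14.041
10      19.3   27.4   3.179
11      28.4   15.8   7.246
12      4.7    31.5   30.790
13      12.3   28.9   4.832
$\sum (Y - \hat{Y})^2 = 93.734$
Using the divisor $n - 2$ for the standard error of the estimate,
$$S_e = \sqrt{\frac{\sum (Y - \hat{Y})^2}{n - 2}} = \sqrt{\frac{93.734}{11}} = 2.9191$$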
Problem: 16
The quantity of a raw material purchased by a company at the specified prices during the
12 months of 1992 is given
MONTH PRICE/KG QUANTITY (KG)
Jan 96 250
Feb 110 200
Mar 100 250
Aprl 90 280
May 86 300
June 92 300
July 112 220
Aug 112 220
Sep 108 200
Oct 116 210
Nov 86 300
Dec 92 250
a) Find the regression equation based on the above data.
b) Estimate the quantity likely to be purchased if the price shoots
up to Rs 124/kg.
c) Hence or otherwise obtain the coefficient of correlation between the price prevailing
and the quantity demanded.
Answer:
Let X denote the price per kg and Y the quantity purchased (kg).
S. No   X   Y   X^2   Y^2   XY
1 96 250 9216 62500 24000
2 110 200 12100 40000 22000
3 100 250 10000 62500 25000
4 90 280 8100 78400 25200
5 86 300 7396 90000 25800
6 92 300 8464 90000 27600
7 112 220 12544 48400 24640
8 112 220 12544 48400 24640
9 108 200 11664 40000 21600
10 116 210 13456 44100 24360
11 86 300 7396 90000 25800
12 92 250 8464 62500 23000
Total ∑ 1200 ∑ 2980 ∑ 121344 ∑ 756800 ∑ 293640
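a) Regression equation
$$\bar{X} = \frac{\sum X}{n} = \frac{1200}{12} = 100, \qquad \bar{Y} = \frac{\sum Y}{n} = \frac{2980}{12} = 248.3333$$
$$\sigma_X = \sqrt{\frac{\sum X^2}{n} - \bar{X}^2} = \sqrt{\frac{121344}{12} - 100^2} = \sqrt{112} = 10.5830$$
$$\sigma_Y = \sqrt{\frac{\sum Y^2}{n} - \bar{Y}^2} = \sqrt{\frac{756800}{12} - (248.3333)^2} = \sqrt{1397.2222} = 37.3794$$
$$\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y) = \frac{\sum XY}{n} - \bar{X}\bar{Y} = \frac{293640}{12} - (100)(248.3333) = -363.3333$$
Correlation coefficient
$$r(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{-363.3333}{(10.5830)(37.3794)} = -0.9185$$
The line of regression of $Y$ on $X$ is
$$Y - 248.3333 = (-0.9185)\left(\frac{37.3794}{10.5830}\right)(X - 100)$$
$$Y - 248.3333 = -3.2440\,(X - 100)$$
$$Y = -3.2440X + 572.7381$$
The line of regression of $X$ on $Y$ is
$$X - 100 = (-0.9185)\left(\frac{10.5830}{37.3794}\right)(Y - 248.3333)$$
$$X - 100 = -0.2600\,(Y - 248.3333)$$
$$X = -0.2600Y + 164.5766$$
b) Estimation of the quantity likely to be purchased if the price shoots up to
Rs 124/kg:
Given price $X = 124$, then
$$Y = -3.2440(124) + 572.7381 = 170.48 \text{ kg}$$
c) Correlation coefficient
$$r(X, Y) = -0.9185$$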
Problem: 17
Find the standard error of the estimate from the data given below.
X 1 2 3 4 5
Y 1 2 1.3 3.75 2.25
Answer:
S. No   X   Y   X^2   Y^2   XY
1 1 1 1 1 1
2 2 2 4 4 4
3 3 1.30 9 1.69 3.90
4 4 3.75 16 14.06 15
5 5 2.25 25 5.06 11.25
Total ∑ 15 ∑ 10.3 ∑ 55 ∑ 25.82 ∑ 35.15
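a) Regression equation
$$\bar{X} = \frac{\sum X}{n} = \frac{15}{5} = 3, \qquad \bar{Y} = \frac{\sum Y}{n} = \frac{10.3}{5} = 2.06$$
$$\sigma_X = \sqrt{\frac{\sum X^2}{n} - \bar{X}^2} = \sqrt{\frac{55}{5} - 3^2} = \sqrt{2} = 1.4142$$
$$\sigma_Y = \sqrt{\frac{\sum Y^2}{n} - \bar{Y}^2} = \sqrt{\frac{25.82}{5} - (2.06)^2} = \sqrt{0.9204} = 0.9594$$
$$\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y) = \frac{\sum XY}{n} - \bar{X}\bar{Y} = \frac{35.15}{5} - (3)(2.06) = 0.85$$
Correlation coefficient
$$r(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{0.85}{(1.4142)(0.9594)} = 0.6265$$
The line of regression of $Y$ on $X$ is
$$Y - 2.06 = (0.6265)\left(\frac{0.9594}{1.4142}\right)(X - 3)$$
$$Y - 2.06 = 0.425\,(X - 3)$$
$$Y = 0.425X + 0.785$$
b) Standard Error of Estimate
S. No   X   Y      Ŷ      (Y - Ŷ)^2
1       1   1      1.21   0.04
2       2   2      1.63   0.13
3       3   1.30   2.06   0.58
4       4   3.75   2.49   1.60
5       5   2.25   2.91   0.44
$\sum (Y - \hat{Y})^2 = 2.79$
Using the divisor $n - 2$ for the standard error of the estimate,
$$S_e = \sqrt{\frac{\sum (Y - \hat{Y})^2}{n - 2}} = \sqrt{\frac{2.79}{3}} = 0.9644$$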
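The standard error calculation can be sketched in Python as below. The divisor is an assumption: the sketch uses $n - 2$ by default (the usual sample form), and the hypothetical `divisor_offset` parameter can be set to 0 to divide by $n$ instead. The data are the five observations of this problem.

```python
from math import sqrt


def standard_error_of_estimate(x, y, divisor_offset=2):
    """sqrt( sum (Y - Yhat)^2 / (n - divisor_offset) ) for the line of Y on X."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Sum of squared deviations of the observations from the fitted line.
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return sqrt(sse / (n - divisor_offset))


x = [1, 2, 3, 4, 5]
y = [1, 2, 1.3, 3.75, 2.25]
print(round(standard_error_of_estimate(x, y), 4))     # divisor n - 2
print(round(standard_error_of_estimate(x, y, 0), 4))  # divisor n
```

The two results differ only in the divisor; which one is wanted depends on the convention adopted in the course.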
Problem: 18
Find the standard error of the estimate from the data given below.
X 1 2 3 4 5 6 7
Y 2 4 7 6 5 6 5
Answer:
S. No   X   Y   X^2   Y^2   XY
1 1 2 1 4 2
2 2 4 4 16 8
3 3 7 9 49 21
4 4 6 16 36 24
5 5 5 25 25 25
6 6 6 36 36 36
7 7 5 49 25 35
Total ∑ 28 ∑ 35 ∑ 140 ∑ 191 ∑ 151
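a) Regression equation
$$\bar{X} = \frac{\sum X}{n} = \frac{28}{7} = 4, \qquad \bar{Y} = \frac{\sum Y}{n} = \frac{35}{7} = 5$$
$$\sigma_X = \sqrt{\frac{\sum X^2}{n} - \bar{X}^2} = \sqrt{\frac{140}{7} - 4^2} = \sqrt{4} = 2$$
$$\sigma_Y = \sqrt{\frac{\sum Y^2}{n} - \bar{Y}^2} = \sqrt{\frac{191}{7} - 5^2} = \sqrt{2.2857} = 1.5119$$
$$\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y) = \frac{\sum XY}{n} - \bar{X}\bar{Y} = \frac{151}{7} - (4)(5) = 1.5714$$
Correlation coefficient
$$r(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{1.5714}{(2)(1.5119)} = 0.5197$$
The line of regression of $Y$ on $X$ is
$$Y - 5 = (0.5197)\left(\frac{1.5119}{2}\right)(X - 4)$$
$$Y - 5 = 0.3929\,(X - 4)$$
$$Y = 0.3929X + 3.4286$$
b) Standard Error of Estimate
S. No   X   Y   Ŷ      (Y - Ŷ)^2
1       1   2   3.82   3.32
2       2   4   4.22   0.05
3       3   7   4.61   5.72
4       4   6   5.00   1.00
5       5   5   5.40   0.16
6       6   6   5.79   0.04
7       7   5   6.18   1.39
$\sum (Y - \hat{Y})^2 = 11.68$
Using the divisor $n - 2$ for the standard error of the estimate,
$$S_e = \sqrt{\frac{\sum (Y - \hat{Y})^2}{n - 2}} = \sqrt{\frac{11.68}{5}} = 1.5284$$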
Problem: 19
The two lines of regression are . The
variance of is 9. Find (i) The mean values of and (ii) correlation coefficient between
and .
Answer:
Since both the lines of regression passes through the mean values and .
The point ( ) must satisfy the two lines.
( )
( )
Solving ( ) and ( ), we get
The mean values of and are
Consider the line is a regression line on .
Consider the line is a regression line on .
Correlation co efficient √
√
√
Correlation co efficient
Problem: 20
The two lines of regression are . Find ( ) and
correlation coefficient between and .
Answer:
Since both the lines of regression passes through the mean values and .
The point ( ) must satisfy the two lines.
( )
( )
Solving ( ) and ( ), we get
Consider the line is a regression line on .
Consider the line is a regression line on .
Correlation co efficient √
√
√
Correlation co efficient
Problem: 21
The regression equation of and is . If the mean value of
and the correlation coefficient.
Answer:
Given the regression equation of and is
.
Since the line of regression passes through ( ), then
Also given mean value of is
( )
Hence mean value of is 48.
which is the line of regression of on .
[
]
( )
( )
.
/
Problem: 22
If and are uncorrelated random variables with variances and . Find the
correlation coefficient between and .
Answer:
Given that ( ) ( )
Let us take and
Given that both and are uncorrelated.
Now ( ) ( )
( ) ( ) , ( ) ( ) ( )-
( )
and ( ) ( )
( ) ( ) , ( ) ( ) ( )-
( )
( ) ,( )( )-
( )
( ) ( )
( ) ,( )- ( ) ( )
( ) ,( )- ( ) ( )
Now ( ) ( ) ( ) ( )
( ) ( ) * ( ) ( )+ * ( ) ( )+
( ) ( ) ,* ( )+ * ( )+ -
, ( ) * ( )+ - , ( ) * ( )+ -
( ) ( )
( ) ( )
( )
Problem: 24
If the independent random variables and have the variances 3 and
respectively, find the correlation coefficient between and .
Answer:
Given that ( ) ( )
Let us take and
Given that both and are uncorrelated.
Now ( ) ( )
( ) ( ) , ( ) ( ) ( )-
( )
√
and ( ) ( )
( ) ( ) , ( ) ( ) ( )-
( )
√
( ) ,( )( )-
( )
( ) ( )
( ) ,( )- ( ) ( )
( ) ,( )- ( ) ( )
Now ( ) ( ) ( ) ( )
( ) ( ) * ( ) ( )+ * ( ) ( )+
( ) ( ) ,* ( )+ * ( )+ -
, ( ) * ( )+ - , ( ) * ( )+ -
( ) ( )
( ) ( )
√ √
( )
Curve Fitting
The principle of least squares
Fitting a Straight line
Let $(x_i, y_i)$, $i = 1, 2, \dots, n$ be $n$ sets of observations related by the relation
$y = ax + b$. Calculate $a$ and $b$ by using the normal equations
$$\sum y = a\sum x + nb$$
$$\sum xy = a\sum x^2 + b\sum x$$
and substitute them in the equation $y = ax + b$ to get the best fitting straight line.
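Solving the two normal equations can be sketched in Python as follows, assuming the line is written as $y = ax + b$. The function name `fit_line` and the sample points are illustrative.

```python
def fit_line(x, y):
    """Solve sum(y) = a*sum(x) + n*b and sum(xy) = a*sum(x^2) + b*sum(x)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    sxy = sum(u * v for u, v in zip(x, y))
    # Eliminating b from the two normal equations gives the slope a.
    a = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    # Back-substitute into the first normal equation for the intercept b.
    b = (sy - a * sx) / n
    return a, b


# Hypothetical points lying exactly on y = 2x + 1.
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)  # 2.0 1.0
```

When the points do not lie on a line, the same two equations give the line with the smallest sum of squared vertical deviations.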
Problem: 25
By the method of least squares find the best fitting straight line to the data given below.
x 5 10 15 20 25
y 15 19 23 26 30
Answer:
Let the straight line be $y = ax + b$.
The normal equations are
$$\sum y = a\sum x + nb \qquad (1)$$
$$\sum xy = a\sum x^2 + b\sum x \qquad (2)$$
S.No    x    y     x^2    xy
1       5    15    25     75
2       10   19    100    190
3       15   23    225    345
4       20   26    400    520
5       25   30    625    750
Total   75   113   1375   1880
Therefore equations (1) and (2) become
$$75a + 5b = 113 \qquad (3)$$
$$1375a + 75b = 1880 \qquad (4)$$
Solving, we get $a = 0.74$ and $b = 11.5$.
Therefore the best fitting straight line is $y = 0.74x + 11.5$.
Problem: 26
Fit a straight line to the data also find the value of at
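x 0 1 2 3 4
y 1 1.8 3.3 4.5 6.3
Answer:
Let the straight line be $y = ax + b$.
The normal equations are
$$\sum y = a\sum x + nb \qquad (1)$$
$$\sum xy = a\sum x^2 + b\sum x \qquad (2)$$
S.No    x    y      x^2   xy
1       0    1      0     0
2       1    1.8    1     1.8
3       2    3.3    4     6.6
4       3    4.5    9     13.5
5       4    6.3    16    25.2
Total   10   16.9   30    47.1
Therefore equations (1) and (2) become
$$10a + 5b = 16.9 \qquad (3)$$
$$30a + 10b = 47.1 \qquad (4)$$
Solving, we get $a = 1.33$ and $b = 0.72$.
Therefore the best fitting straight line is $y = 1.33x + 0.72$.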
To find at :
( ) ( )
.
Problem: 27
Fit a straight line to the following data. Also estimate the value at .
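x 71 68 73 69 67 65 66 67
y 69 72 70 70 68 67 68 64
Answer:
Let the straight line be $y = ax + b$.
The normal equations are
$$\sum y = a\sum x + nb \qquad (1)$$
$$\sum xy = a\sum x^2 + b\sum x \qquad (2)$$
S.No    x     y     x^2     xy
1       71    69    5041    4899
2       68    72    4624    4896
3       73    70    5329    5110
4       69    70    4761    4830
5       67    68    4489    4556
6       65    67    4225    4355
7       66    68    4356    4488
8       67    64    4489    4288
Total   546   548   37314   37422
Therefore equations (1) and (2) become
$$546a + 8b = 548 \qquad (3)$$
$$37314a + 546b = 37422 \qquad (4)$$
Solving, we get $a = 0.4242$ and $b = 39.5455$.
Therefore the best fitting straight line is $y = 0.4242x + 39.5455$.
To find $y$ at the given value of $x$, substitute that value into the fitted line.
Fitting a Parabola
Let $(x_i, y_i)$, $i = 1, 2, \dots, n$ be $n$ sets of observations related by the relation
$y = ax^2 + bx + c$. Calculate $a$, $b$ and $c$ by using the normal equations
$$\sum y = a\sum x^2 + b\sum x + nc \qquad (1)$$
$$\sum xy = a\sum x^3 + b\sum x^2 + c\sum x \qquad (2)$$
$$\sum x^2 y = a\sum x^4 + b\sum x^3 + c\sum x^2 \qquad (3)$$
and substitute them in the equation $y = ax^2 + bx + c$ to get the best fitting parabola.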
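The three normal equations for a parabola form a 3 x 3 linear system. The sketch below, which assumes the parabola is written as $y = ax^2 + bx + c$, builds the sums and solves the system by Gaussian elimination in plain Python. The function names and the sample points are illustrative.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        # Bring the row with the largest pivot candidate into position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - known) / aug[r][r]
    return solution


def fit_parabola(x, y):
    """Return (a, b, c) of y = a*x^2 + b*x + c from the normal equations."""
    n = len(x)
    sx = sum(x)
    sx2 = sum(v ** 2 for v in x)
    sx3 = sum(v ** 3 for v in x)
    sx4 = sum(v ** 4 for v in x)
    sy = sum(y)
    sxy = sum(u * v for u, v in zip(x, y))
    sx2y = sum(u * u * v for u, v in zip(x, y))
    matrix = [[sx2, sx, n], [sx3, sx2, sx], [sx4, sx3, sx2]]
    a, b, c = solve_linear_system(matrix, [sy, sxy, sx2y])
    return a, b, c


# Hypothetical points lying exactly on y = x^2 - 2x + 3.
print([round(v, 6) for v in fit_parabola([0, 1, 2, 3, 4], [3, 2, 3, 6, 11])])
```

The rows of `matrix` follow the three normal equations in order, so the unknowns come out as $a$, $b$, $c$.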
Problem: 28
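By the method of least squares find the best fitting parabola to the data given below.
x 1 2 3 4
y 1.7 1.8 2.3 3.2
Answer:
Let the parabola be $y = ax^2 + bx + c$.
The normal equations are
$$\sum y = a\sum x^2 + b\sum x + nc \qquad (1)$$
$$\sum xy = a\sum x^3 + b\sum x^2 + c\sum x \qquad (2)$$
$$\sum x^2 y = a\sum x^4 + b\sum x^3 + c\sum x^2 \qquad (3)$$
x    y     x^2   x^3   x^4   xy     x^2 y
1    1.7   1     1     1     1.7    1.7
2    1.8   4     8     16    3.6    7.2
3    2.3   9     27    81    6.9    20.7
4    3.2   16    64    256   12.8   51.2
$\sum x = 10$, $\sum y = 9$, $\sum x^2 = 30$, $\sum x^3 = 100$, $\sum x^4 = 354$, $\sum xy = 25$, $\sum x^2 y = 80.8$
Therefore equations (1), (2) and (3) become
$$30a + 10b + 4c = 9$$
$$100a + 30b + 10c = 25$$
$$354a + 100b + 30c = 80.8$$
Solving, we get $a = 0.2$, $b = -0.5$ and $c = 2$.
Therefore the best fitting parabola is $y = 0.2x^2 - 0.5x + 2$.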
Problem: 29
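Fit a curve $y = ax^2 + bx + c$ from the data given below.
x 3 5 7 9 11 13
y 2 3 4 6 5 8
Answer:
Let the parabola be $y = ax^2 + bx + c$.
The normal equations are
$$\sum y = a\sum x^2 + b\sum x + nc \qquad (1)$$
$$\sum xy = a\sum x^3 + b\sum x^2 + c\sum x \qquad (2)$$
$$\sum x^2 y = a\sum x^4 + b\sum x^3 + c\sum x^2 \qquad (3)$$
x     y    x^2   x^3    x^4     xy    x^2 y
3     2    9     27     81      6     18
5     3    25    125    625     15    75
7     4    49    343    2401    28    196
9     6    81    729    6561    54    486
11    5    121   1331   14641   55    605
13    8    169   2197   28561   104   1352
$\sum x = 48$, $\sum y = 28$, $\sum x^2 = 454$, $\sum x^3 = 4752$, $\sum x^4 = 52870$, $\sum xy = 262$, $\sum x^2 y = 2732$
Therefore equations (1), (2) and (3) become
$$454a + 48b + 6c = 28$$
$$4752a + 454b + 48c = 262$$
$$52870a + 4752b + 454c = 2732$$
Solving, we get $a = 0.0089$, $b = 0.4$ and $c = 0.7911$.
Therefore the best fitting parabola is $y = 0.0089x^2 + 0.4x + 0.7911$.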
Problem: 30
Fit a second degree curve from the data given below.
Answer:
Let the parabola be ( )
The normal equations are
∑ ∑ ∑ ( )
∑ ∑ ∑ ∑ ( )
∑ ∑ ∑ ∑ ( )
∑ ∑ ∑ ∑ ∑ ∑ ∑
Therefore equations ( ) ( ) and ( ) becomes
( )
( )
( )
Solving, we get and
Therefore the best fit of straight line is
Curve of the form $y = ab^x$
Let $(x_i, y_i)$, $i = 1, 2, \dots, n$ be $n$ sets of observations related by the relation
$y = ab^x$.
Consider the curve $y = ab^x$.
Taking log on both sides, we get
$$\log y = \log(ab^x)$$
$$\log y = \log a + x\log b$$
Assume $Y = \log y$, $A = \log a$ and $B = \log b$.
$$Y = A + Bx$$
It is a straight line.
Curve of the form $y = ax^b$
Let $(x_i, y_i)$, $i = 1, 2, \dots, n$ be $n$ sets of observations related by the relation
$y = ax^b$.
Consider the curve $y = ax^b$.
Taking log on both sides, we get
$$\log y = \log(ax^b)$$
$$\log y = \log a + b\log x$$
Assume $Y = \log y$, $A = \log a$ and $X = \log x$.
$$Y = A + bX$$
It is a straight line.
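Both transformations reduce the problem to fitting a straight line to logged data. The Python sketch below assumes base-10 logarithms and the forms $y = ab^x$ and $y = ax^b$; the function names and sample data are illustrative.

```python
from math import log10


def _fit_intercept_slope(u, v):
    """Least-squares line v = intercept + slope * u."""
    n = len(u)
    su, sv = sum(u), sum(v)
    suu = sum(p * p for p in u)
    suv = sum(p * q for p, q in zip(u, v))
    slope = (n * suv - su * sv) / (n * suu - su ** 2)
    intercept = (sv - slope * su) / n
    return intercept, slope


def fit_exponential(x, y):
    """Fit y = a * b**x via log y = log a + x log b. Requires y > 0."""
    big_a, big_b = _fit_intercept_slope(x, [log10(v) for v in y])
    return 10 ** big_a, 10 ** big_b


def fit_power(x, y):
    """Fit y = a * x**b via log y = log a + b log x. Requires x > 0 and y > 0."""
    big_a, b = _fit_intercept_slope([log10(u) for u in x], [log10(v) for v in y])
    return 10 ** big_a, b


# Hypothetical data lying exactly on y = 2 * 3**x and on y = 3 * x**2.
print([round(v, 6) for v in fit_exponential([0, 1, 2, 3], [2, 6, 18, 54])])
print([round(v, 6) for v in fit_power([1, 2, 4, 8], [3, 12, 48, 192])])
```

Note that the fit minimises squared errors in $\log y$, not in $y$ itself, so it is the best line for the transformed data.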
Problem: 31
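Fit a curve $y = ab^x$ for the following data.
x 2 3 4 5 6
y 8.3 15.4 33.1 65.2 127.4
Answer:
Given the curve $y = ab^x$.
Taking log on both sides, we get
$$\log y = \log(ab^x)$$
$$\log y = \log a + x\log b$$
Assume $Y = \log_{10} y$, $A = \log_{10} a$ and $B = \log_{10} b$.
$$Y = A + Bx \qquad (1)$$
It is a straight line.
The normal equations are
$$\sum Y = nA + B\sum x \qquad (2)$$
$$\sum xY = A\sum x + B\sum x^2 \qquad (3)$$
x    y       Y = log y   x^2   xY
2    8.3     0.92        4     1.84
3    15.4    1.19        9     3.56
4    33.1    1.52        16    6.08
5    65.2    1.81        25    9.07
6    127.4   2.11        36    12.63
$\sum x = 20$, $\sum Y = 7.55$, $\sum x^2 = 90$, $\sum xY = 33.18$
Therefore equations (2) and (3) become
$$5A + 20B = 7.55$$
$$20A + 90B = 33.18$$
Solving, we get $A = 0.318$ and $B = 0.298$.
Since $A = \log_{10} a$, $a = 10^{0.318} = 2.0797$,
and since $B = \log_{10} b$, $b = 10^{0.298} = 1.9861$.
The required curve is
$$y = 2.0797\,(1.9861)^x$$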
Problem: 32
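Fit a curve $y = ab^x$ for the following data.
x 0 1 2 3 4 5 6 7
y 10 21 35 59 92 200 400 610
Answer:
Given the curve $y = ab^x$.
Taking log on both sides, we get
$$\log y = \log(ab^x)$$
$$\log y = \log a + x\log b$$
Assume $Y = \log_{10} y$, $A = \log_{10} a$ and $B = \log_{10} b$.
$$Y = A + Bx \qquad (1)$$
It is a straight line.
The normal equations are
$$\sum Y = nA + B\sum x \qquad (2)$$
$$\sum xY = A\sum x + B\sum x^2 \qquad (3)$$
x    y     Y = log y   x^2   xY
0    10    1.00        0     0.00
1    21    1.32        1     1.32
2    35    1.54        4     3.09
3    59    1.77        9     5.31
4    92    1.96        16    7.86
5    200   2.30        25    11.51
6    400   2.60        36    15.61
7    610   2.79        49    19.50
$\sum x = 28$, $\sum Y = 15.29$, $\sum x^2 = 140$, $\sum xY = 64.19$
Therefore equations (2) and (3) become
$$8A + 28B = 15.29$$
$$28A + 140B = 64.19$$
Solving, we get $A = 1.0217$ and $B = 0.2542$.
Since $A = \log_{10} a$, $a = 10^{1.0217} = 10.51$,
and since $B = \log_{10} b$, $b = 10^{0.2542} = 1.795$.
The required curve is
$$y = 10.51\,(1.795)^x$$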
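Fitting a straight line trend by the method of least squares:
$$y = a + bX, \qquad X = x - \bar{x}$$
where $x$ is the year and $\bar{x}$ is the average of the years, so that $\sum X = 0$ and
$$a = \frac{\sum y}{n}, \qquad b = \frac{\sum Xy}{\sum X^2}$$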
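The trend calculation can be sketched in Python as below, assuming the years are shifted by their average so that the deviations sum to zero. The function names and the sample series are illustrative assumptions.

```python
def fit_trend(years, values):
    """Return (a, b, mean_year) for the trend y = a + b * (year - mean_year)."""
    n = len(years)
    mean_year = sum(years) / n
    deviations = [yr - mean_year for yr in years]  # these sum to zero
    a = sum(values) / n
    b = sum(d * v for d, v in zip(deviations, values)) / sum(d * d for d in deviations)
    return a, b, mean_year


def forecast(year, a, b, mean_year):
    """Trend value for a given year, inside or beyond the observed range."""
    return a + b * (year - mean_year)


# Hypothetical series rising by 2 units per year.
a, b, mean_year = fit_trend([2001, 2002, 2003, 2004, 2005], [10, 12, 14, 16, 18])
print(a, b, forecast(2007, a, b, mean_year))  # 14.0 2.0 22.0
```

Because the deviations sum to zero, the two normal equations decouple and $a$ and $b$ can be computed independently.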
Problem: 33
Fit a straight line trend by the method of least squares to the following data. Also forecast the earnings for the year 2015.
Year : 2005 2006 2007 2008 2009 2010 2011 2012
Earning in lakhs : 38 40 65 72 69 60 87 95
Answer:
To simplify the calculation, subtract the average of the years, $\bar{x} = 2008.5$, from each value of $x$. The new variable $X = x - 2008.5$ is tabulated below.
Year (x)    Earning in lakhs (y)    X = x - 2008.5    X^2      Xy
2005        38                      -3.5              12.25    -133
2006        40                      -2.5              6.25     -100
2007        65                      -1.5              2.25     -97.5
2008        72                      -0.5              0.25     -36
2009        69                      0.5               0.25     34.5
2010        60                      1.5               2.25     90
2011        87                      2.5               6.25     217.5
2012        95                      3.5               12.25    332.5
            ∑y = 526                ∑X = 0            ∑X^2 = 42    ∑Xy = 308
I. Fitting a trend line:
$y = a + bX$ ... (1)
$a = \dfrac{\sum y}{n} = \dfrac{526}{8} = 65.75$
$b = \dfrac{\sum Xy}{\sum X^2} = \dfrac{308}{42} \approx 7.33$
(1) becomes
$y = 65.75 + 7.33X$, that is, $y = 65.75 + 7.33(x - 2008.5)$
II. Forecast for the year 2015
$x = 2015$ and the new $X = 2015 - 2008.5 = 6.5$
$y = 65.75 + 7.33(6.5) \approx 113.4$ lakhs
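The centred-year calculation can be sketched in a few lines of Python. This is an illustrative sketch rather than part of the notes; the names `fit_trend` and `forecast` are hypothetical helpers, and the data are the earnings figures of Problem 33.

```python
def fit_trend(years, values):
    """Least-squares trend y = a + b*X, where X = year - mean(year)."""
    n = len(years)
    mean_year = sum(years) / n
    X = [yr - mean_year for yr in years]          # centred time, sums to zero
    a = sum(values) / n                           # a = sum(y) / n
    b = sum(x * y for x, y in zip(X, values)) / sum(x * x for x in X)
    return mean_year, a, b

def forecast(year, mean_year, a, b):
    """Trend value for a given calendar year."""
    return a + b * (year - mean_year)

years = list(range(2005, 2013))
earnings = [38, 40, 65, 72, 69, 60, 87, 95]

mean_year, a, b = fit_trend(years, earnings)
print(mean_year, a, round(b, 2))                    # 2008.5 65.75 7.33
print(round(forecast(2015, mean_year, a, b), 1))    # 113.4
```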
Problem: 34
Find the line of best fit for the following time series data.
Year : 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009
Activity ratio in the XYZ Co. (%) : 2 5 5 10 12 16 17 14 20 23
Also forecast for the years 2014 and 2015.
Answer:
To simplify the calculation, subtract the average of the years, $\bar{x} = 2004.5$, from each value of $x$. The new variable $X = x - 2004.5$ is tabulated below.
Year (x)    Activity ratio % (y)    X = x - 2004.5    X^2      Xy
2000        2                       -4.5              20.25    -9
2001        5                       -3.5              12.25    -17.5
2002        5                       -2.5              6.25     -12.5
2003        10                      -1.5              2.25     -15
2004        12                      -0.5              0.25     -6
2005        16                      0.5               0.25     8
2006        17                      1.5               2.25     25.5
2007        14                      2.5               6.25     35
2008        20                      3.5               12.25    70
2009        23                      4.5               20.25    103.5
            ∑y = 124                ∑X = 0            ∑X^2 = 82.5    ∑Xy = 182
I. Fitting a trend line:
$y = a + bX$ ... (1)
$a = \dfrac{\sum y}{n} = \dfrac{124}{10} = 12.4$
$b = \dfrac{\sum Xy}{\sum X^2} = \dfrac{182}{82.5} \approx 2.206$
(1) becomes
$y = 12.4 + 2.206X$, that is, $y = 12.4 + 2.206(x - 2004.5)$
II. Forecast for the year 2014
$x = 2014$ and the new $X = 2014 - 2004.5 = 9.5$
$y = 12.4 + 2.206(9.5) \approx 33.36$
III. Forecast for the year 2015
$x = 2015$ and the new $X = 2015 - 2004.5 = 10.5$
$y = 12.4 + 2.206(10.5) \approx 35.56$
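Centring the years is only a computational convenience. The sketch below (an illustration, not part of the notes; `fit_line` is a hypothetical helper) solves the general normal equations on the raw calendar years for the Problem 34 data and checks that the slope and both forecasts agree with the centred calculation.

```python
def fit_line(xs, ys):
    """Solve the general normal equations for y = c + m*x (no centring)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    m = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    c = (sy - m * sx) / n
    return c, m

years = list(range(2000, 2010))
ratio = [2, 5, 5, 10, 12, 16, 17, 14, 20, 23]

# Raw-year fit
c, m = fit_line(years, ratio)

# Centred fit: X = year - 2004.5, a = mean of y, b = sum(Xy) / sum(X^2)
X = [yr - 2004.5 for yr in years]
a = sum(ratio) / len(ratio)
b = sum(x * y for x, y in zip(X, ratio)) / sum(x * x for x in X)

for year in (2014, 2015):
    raw = c + m * year
    centred = a + b * (year - 2004.5)
    print(year, round(raw, 2), round(centred, 2))
```

Only the intercept changes between the two forms; the slope and every forecast are the same.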
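Problem: 35
The following data give the production (in '000 units) of a commodity for the years 2006-2012. Fit a straight line trend and forecast the production for the year 2020.
Year : 2006 2007 2008 2009 2010 2011 2012
Production : 6 7 5 4 6 7 5
Answer:
To simplify the calculation, subtract the average of the years, $\bar{x} = 2009$, from each value of $x$. The new variable $X = x - 2009$ is tabulated below.
Year (x)    Production in '000 units (y)    X = x - 2009    X^2    Xy
2006        6                               -3              9      -18
2007        7                               -2              4      -14
2008        5                               -1              1      -5
2009        4                               0               0      0
2010        6                               1               1      6
2011        7                               2               4      14
2012        5                               3               9      15
            ∑y = 40                         ∑X = 0          ∑X^2 = 28    ∑Xy = -2
I. Fitting a trend line:
$y = a + bX$ ... (1)
$a = \dfrac{\sum y}{n} = \dfrac{40}{7} \approx 5.714$
$b = \dfrac{\sum Xy}{\sum X^2} = \dfrac{-2}{28} \approx -0.0714$
(1) becomes
$y = 5.714 - 0.0714X$, that is, $y = 5.714 - 0.0714(x - 2009)$
II. Forecast for the year 2020
$x = 2020$ and the new $X = 2020 - 2009 = 11$
$y = 5.714 - 0.0714(11) \approx 4.93$ ('000 units)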
Two Marks
1. Define correlation. Give one example.
Answer:
If a change in one variable affects a change in the other variable, then the
variables are said to be correlated.
Ex:
Taller people have larger shoe sizes and shorter people have smaller shoe sizes.
2. Define positive correlation. Give one example.
Answer:
If the two variables deviate in the same direction, i.e., an increase (or decrease) in one
variable results in a corresponding increase (or decrease) in the other variable, the
correlation is said to be positive.
Ex: Income and expenditure
3. Define negative correlation. Give one example.
Answer:
If the two variables deviate in opposite directions, i.e., an increase (or decrease) in one
variable results in a corresponding decrease (or increase) in the other variable, the
correlation is said to be negative.
Ex: Price and demand of a product
4. Write down Spearman's rank correlation formula.
Answer:
$\rho = 1 - \dfrac{6\sum d_i^2}{n(n^2 - 1)}$
where $n$ is the number of data points of the two variables and $d_i$ is the difference in the
ranks of the $i$th element of each random variable considered. The value of $\rho$ lies between
$-1$ and $+1$.
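Spearman's formula is easy to evaluate directly. The following is a minimal Python sketch, assuming the ranks are already assigned and contain no ties; `spearman_rho` is an illustrative name, and the ranks are those of Problem 3 in the question bank.

```python
def spearman_rho(rank_x, rank_y):
    """Spearman's rho = 1 - 6*sum(d^2) / (n*(n^2 - 1)) for untied ranks."""
    n = len(rank_x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

rank_x = [1, 2, 3, 4, 5, 6, 7]
rank_y = [4, 3, 1, 2, 6, 5, 7]
print(round(spearman_rho(rank_x, rank_y), 3))   # 0.643
```

Here $\sum d_i^2 = 20$ and $n = 7$, so $\rho = 1 - 120/336 \approx 0.643$, a moderately strong positive monotonic association.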
5. What are the merits of the rank correlation coefficient?
Answer:
Spearman's rank correlation coefficient can be interpreted in the same way as
Karl Pearson's correlation coefficient;
It is easy to understand and easy to calculate.
6. What are the demerits of the rank correlation coefficient?
Answer:
The product moment correlation coefficient can be calculated for a bivariate frequency
distribution, but the rank correlation coefficient cannot; and
If $n > 30$, this formula is time consuming.
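7. Write down Karl Pearson's coefficient of correlation.
Answer:
$r(X, Y) = \dfrac{\operatorname{Cov}(X, Y)}{\sigma_X \, \sigma_Y}$
where $\operatorname{Cov}(X, Y)$ is the covariance of $X$ and $Y$, and $\sigma_X$, $\sigma_Y$ are their standard deviations.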
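As an illustration, Karl Pearson's coefficient can be computed as covariance divided by the product of the standard deviations. The sketch below uses population (divide-by-$n$) moments; `pearson_r` is an illustrative name, and the data are those of Problem 11 in the question bank.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r = Cov(X, Y) / (sigma_X * sigma_Y), population moments."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (sd_x * sd_y)

x = [1, 3, 5, 7, 8, 10]
y = [8, 12, 15, 17, 18, 20]
print(round(pearson_r(x, y), 3))   # 0.988
```

The divisor $n$ cancels between numerator and denominator, so using $n - 1$ throughout would give the same $r$.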
8. Define the coefficient of determination.
Answer:
The coefficient of determination, or R-squared ($R^2$), is the proportion of the variance
in the dependent variable that is predictable from the independent variable. It indicates how
much of the variation in the given data set is explained by the regression.
9. Write any two properties of the coefficient of determination.
Answer:
The coefficient of determination is the square of the correlation coefficient ($r$); thus it ranges
from 0 to 1.
With linear regression, the coefficient of determination is equal to the square of the
correlation between the x and y variables.
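The link between the two descriptions can be checked numerically. The sketch below (illustrative only; `r_squared_two_ways` is a hypothetical helper) computes the proportion of variance explained by the least-squares line of y on x and, separately, the square of the correlation coefficient, using the data of Problem 18 in the question bank.

```python
def r_squared_two_ways(xs, ys):
    """Return (1 - SSE/SST, r^2) for the least-squares line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)     # total variation, SST
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Unexplained variation: squared residuals about the fitted line
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    explained = 1 - sse / syy
    r_squared = sxy ** 2 / (sxx * syy)
    return explained, r_squared

explained, r_squared = r_squared_two_ways([1, 2, 3, 4, 5, 6, 7],
                                          [2, 4, 7, 6, 5, 6, 5])
print(round(explained, 3), round(r_squared, 3))
```

Both numbers coincide for simple linear regression, which is why $R^2$ always lies between 0 and 1.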
10. Define regression.
Answer:
Regression is a mathematical measure of the average relationship between two or
more variables in terms of the original units of the data.
The equation of the line of regression of $y$ on $x$ is
$y - \bar{y} = r \dfrac{\sigma_y}{\sigma_x} (x - \bar{x})$
The equation of the line of regression of $x$ on $y$ is
$x - \bar{x} = r \dfrac{\sigma_x}{\sigma_y} (y - \bar{y})$
11. What are the differences between correlation and regression?
Answer: Correlation is used to represent the linear relationship between two variables. On the
contrary, regression is used to fit the best line and estimate one variable on the basis of
another variable. ... The goal of regression is to predict values of the random
variable on the basis of the values of the fixed variable.
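12. What is a regression coefficient?
Answer: A regression coefficient is the slope of a line of regression. The regression coefficient of $y$ on $x$ is $b_{yx} = r \dfrac{\sigma_y}{\sigma_x}$ and that of $x$ on $y$ is $b_{xy} = r \dfrac{\sigma_x}{\sigma_y}$.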
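13. Write down the two regression lines.
Answer: The equation of the line of regression of $y$ on $x$ is
$y - \bar{y} = r \dfrac{\sigma_y}{\sigma_x} (x - \bar{x})$
The equation of the line of regression of $x$ on $y$ is
$x - \bar{x} = r \dfrac{\sigma_x}{\sigma_y} (y - \bar{y})$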
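Both regression lines can be obtained from the same three sums of squares and products. The following Python sketch is illustrative (`regression_lines` is a hypothetical helper); it uses the data of Problem 13 in the question bank, assuming the first row is x and the second row is y, since the row labels are not legible in the notes.

```python
import math

def regression_lines(xs, ys):
    """Means, slopes of both regression lines, and r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b_yx = sxy / sxx                 # slope of y on x, equals r*sigma_y/sigma_x
    b_xy = sxy / syy                 # slope of x on y, equals r*sigma_x/sigma_y
    r = sxy / math.sqrt(sxx * syy)
    return mean_x, mean_y, b_yx, b_xy, r

# Assumed labelling: first data row is x, second is y
x = [1, 2, 3, 4, 5, 6, 7]
y = [9, 8, 10, 12, 11, 13, 14]
mean_x, mean_y, b_yx, b_xy, r = regression_lines(x, y)
print(f"y on x: y - {mean_y} = {b_yx:.3f} (x - {mean_x})")
print(f"x on y: x - {mean_x} = {b_xy:.3f} (y - {mean_y})")
print(f"r = {r:.3f}")
```

A useful check is that the product of the two regression coefficients equals $r^2$.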
14. The two regression equations of two random variables and are
and . Find the mean values of and .
Answer:
Since both lines of regression pass through the mean values $\bar{x}$ and $\bar{y}$,
the point $(\bar{x}, \bar{y})$ must satisfy both lines.
( )
( )
Solving ( ) and ( ), we get
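Finding the means amounts to solving two linear equations in two unknowns. The sketch below is illustrative only: the two regression equations, $3x + 2y = 26$ and $6x + y = 31$, are hypothetical values chosen for the example (the equations of this question are not legible in the notes), and `solve_2x2` is a hypothetical helper.

```python
def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("the two lines are parallel or identical")
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y

# Hypothetical regression lines: 3x + 2y = 26 and 6x + y = 31
mean_x, mean_y = solve_2x2(3, 2, 26, 6, 1, 31)
print(mean_x, mean_y)   # 4.0 7.0
```

The point of intersection is $(\bar{x}, \bar{y})$, so for these hypothetical lines the means would be $\bar{x} = 4$ and $\bar{y} = 7$.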
Problem: 1
The following table provides data about the percentage of students who have free
university meals and their CGPA scores. Calculate the Spearman’s Rank Correlation
between the two and interpret the result.
State University    % of students having free meals    % of students scoring above 8.5 CGPA
Pune                14.4                               54
Chennai             7.2                                64
Delhi               27.5                               44
Kanpur              33.8                               32
Ahmedabad           38.0                               37
Indore              15.9                               68
Guwahati            4.9                                62
Problem: 2
Compute the coefficient of rank correlation between sales and advertisement expressed in
thousands of dollars from the following data:
Sales 90 85 68 75 82 80 95 70
Advertisement 7 6 2 3 4 5 8 1
Problem: 3
Find the rank correlation coefficient from the following data.
Rank in X 1 2 3 4 5 6 7
Rank in Y 4 3 1 2 6 5 7
Problem: 4
The ranks of some 16 students in mathematics and physics are as follows. Find the
rank correlation for the proficiency in mathematics and physics.
Rank in Maths 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Rank in Physics 1 10 3 4 5 7 2 6 8 11 15 9 14 12 16 13
Problem: 5
Suppose we have ranks of 5 students in three subjects Computer, Physics and Statistics and
we want to test which two subjects have the same trend.
Rank in Computer 2 4 5 1 3
Rank in Physics 5 1 2 3 4
Rank in Statistics 2 3 5 4 1
Problem: 6
Determine the rank correlation coefficient for the following data.
68 64 75 50 64 80 75 40 55 64
62 58 68 45 81 60 68 48 50 70
Problem: 7
The sample of 12 fathers and their eldest sons have the following data about their
heights in inches.
Fathers 65 63 67 64 68 62 70 66 68 67 69 71
Sons 68 66 68 65 69 66 68 65 71 67 68 70
Problem: 8
Find the correlation coefficient for the following data.
X 10 14 18 22 26 30
Y 18 12 24 6 30 36
Problem: 9
The table below shows the number of absences, X, in a Calculus course and the final exam
grade, Y, for 7 students. Find the correlation coefficient and interpret your result.
X 1 0 2 6 4 3 3
Y 95 90 90 55 70 80 85
Problem: 10
The marks obtained by 10 students in Mathematics and Statistics are given below.
Find the correlation coefficient between the two subjects.
Marks in Maths 75 30 60 80 53 35 15 40 38 48
Marks in Stats 85 45 54 91 58 63 35 43 45 44
Problem: 11
Compute the coefficient of correlation between X and Y using the following data:
X 1 3 5 7 8 10
Y 8 12 15 17 18 20
Problem: 12
Calculate the coefficient of correlation for the following data:
9 8 7 6 5 4 3 2 1
15 16 14 13 11 12 10 8 9
Problem: 13
Obtain the equations of the lines of regression from the following data:
1 2 3 4 5 6 7
9 8 10 12 11 13 14
Problem: 14
From the following data find
(i) the two regression lines,
(ii) the coefficient of correlation between the marks in economics and statistics,
(iii) the most likely marks in statistics when marks in economics are 30.
Marks in Economics: 25 28 35 32 31 36 29 38 34 32
Marks in Statistics: 43 46 49 41 36 32 31 30 33 39
Problem: 15
A tyre manufacturing company is interested in removing pollutants from the exhaust at the
factory, and cost is a concern. The company has collected data from other companies
concerning the amount of money spent on environmental measures and the resulting
amount of dangerous pollutants released (as a percentage of total emissions)
Money spent (Rupees in lakhs) : 8.4 10.2 16.5 21.7 9.4 8.3 11.5 18.4 16.7 19.3 28.4 4.7 12.3
Percentage of dangerous pollutants : 35.9 31.8 24.7 25.2 36.8 35.8 33.4 25.4 31.4 27.4 15.8 31.5 28.9
a) Compute the regression equation.
b) Predict the percentage of dangerous pollutants released when Rs. 20,000 is spent on
control measures.
c) Find the standard error of the estimate (regression line).
Problem: 16
The quantity of a raw material purchased by a company at the specified prices during the
12 months of 1992 is given below.
MONTH    PRICE/KG    QUANTITY (KG)
Jan      96          250
Feb      110         200
Mar      100         250
Apr      90          280
May      86          300
June     92          300
July     112         220
Aug      112         220
Sep      108         200
Oct      116         210
Nov      86          300
Dec      92          250
(i) Find the regression equation based on the above data.
(ii) Can you estimate the approximate quantity likely to be purchased if the price shoots
up to Rs 124/kg?
(iii) Hence or otherwise obtain the coefficient of correlation between the price prevailing
and the quantity demanded.
Problem: 17
Find the standard error of the estimate from the data given below.
X 1 2 3 4 5
Y 1 2 1.3 3.75 2.25
Problem: 18
Find the standard error of the estimate from the data given below.
X 1 2 3 4 5 6 7
Y 2 4 7 6 5 6 5
Problem: 19
The two lines of regression are . The
variance of is 9. Find (i) The mean values of and (ii) correlation coefficient between
and .
Problem: 20
The two lines of regression are . Find ( ) and
correlation coefficient between and .
Problem: 21
The regression equation of and is . If the mean value of
and the correlation coefficient.
Problem: 22
If and are uncorrelated random variables with variances and . Find the
correlation coefficient between and .
Problem: 24
If the independent random variables and have the variances 3 and
respectively, find the correlation coefficient between and .
Problem: 25
By the method of least squares find the best fitting straight line to the data given below.
5 10 15 20 25
15 19 23 26 30
Problem: 26
Fit a straight line to the data also find the value of at
0 1 2 3 4
1 1.8 3.3 4.5 6.3
Problem: 27
Fit a straight line to the following data. Also estimate the value at .
71 68 73 69 67 65 66 67
69 72 70 70 68 67 68 64
Problem: 28
By the method of least squares find the best fitting straight line to the data given below.
15 2 3 4
1.7 1.8 2.3 3.2
Problem: 29
Fit a curve from the data given below.
3 5 7 9 11 13
2 3 4 6 5 8
Problem: 30
Fit a second degree curve from the data given below.
Problem: 31
Fit a curve for the following data.
2 3 4 5 6
8.3 15.4 33.1 65.2 127.4
Problem: 32
Fit a curve for the following data.
0 1 2 3 4 5 6 7
10 21 35 59 92 200 400 610
Problem: 33
Fit a straight line trend by the method of least squares to the following data. Also forecast
for the year 2015.
Year : 2005 2006 2007 2008 2009 2010 2011 2012
Earning in lakhs : 38 40 65 72 69 60 87 95
Problem: 34
Find the line of best fit for the following time series data.
Year : 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009
Activity ratio in the XYZ Co. (%) : 2 5 5 10 12 16 17 14 20 23
Also forecast for the years 2014 and 2015.
Problem: 35
The following data give the production (in '000 units) of a commodity for the years 2006-2012.
Fit a straight line trend and forecast the production for the year 2020.
Year : 2006 2007 2008 2009 2010 2011 2012
Production 6 7 5 4 6 7 5
“I am a slow walker, but
I never walk back.”
― Abraham Lincoln