Measures of Variability
For Ungrouped Data
The Range
•The range of a set of numerical data is the difference
between the highest and the lowest values.
• It is the simplest possible measure of spread.
• It cannot be used with grouped data and it ignores the
distribution of intermediate values.
•A single very large or very small value would give a
misleading impression of the spread of the data.
FORMULA:
•$\text{Range} = \text{HOV} - \text{LOV}$
Where:
$\text{HOV}$ = highest observed value
$\text{LOV}$ = lowest observed value
Example 6.1:
•The following data are the numbers of deaths of
army officers caused by horse kicks, for the
Prussian Army during the period 1875 to 1894.
In order of size the numbers are:
3, 4, 5, 5, 6, 6, 7, 8, 9, 9, 10,
11, 11, 11, 12, 14, 15, 15, 17, 18
What is the range?
Example 6.2:
•One year the numbers of academic staff (including part-time
staff) in the various departments of the University
of Essex (a small, friendly university) were as follows:
19.0, 15.7, 25.3, 28.0, 15.0, 10.0
12.0, 10.3, 22.0, 24.8, 13.8, 25.9,
23.0, 21.3, 12.0, 11.0, 23.0
What is the range?
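The two range questions above can be checked with a short sketch. The data are copied from Examples 6.1 and 6.2; the function name `value_range` is a hypothetical helper introduced only for illustration.

```python
# Data copied from Examples 6.1 and 6.2.
horse_kick_deaths = [3, 4, 5, 5, 6, 6, 7, 8, 9, 9, 10,
                     11, 11, 11, 12, 14, 15, 15, 17, 18]
essex_staff = [19.0, 15.7, 25.3, 28.0, 15.0, 10.0,
               12.0, 10.3, 22.0, 24.8, 13.8, 25.9,
               23.0, 21.3, 12.0, 11.0, 23.0]


def value_range(values):
    """Return HOV - LOV: highest observed value minus lowest observed value."""
    return max(values) - min(values)


print(value_range(horse_kick_deaths))  # 18 - 3
print(value_range(essex_staff))        # 28.0 - 10.0
```

Only the two extreme values enter the calculation, which is why a single unusually large or small observation can change the result so much.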
The Interquartile Range (IQR)
•It is the difference between the upper and the
lower quartiles.
•It is more useful than the range because it
concentrates on the middle portion of the
distribution.
•It is also called the H-spread, the spread between
the hinges of a box plot.
Formula:
•$IQR = Q_3 - Q_1$
•$IQR = P_{75} - P_{25}$
Example 6.3:
•The following data are the numbers of deaths of
army officers caused by horse kicks, for the
Prussian Army during the period 1875 to 1894.
In order of size the numbers are:
3, 4, 5, 5, 6, 6, 7, 8, 9, 9, 10,
11, 11, 11, 12, 14, 15, 15, 17, 18
What is the IQR?
Example 6.4:
•One year the numbers of academic staff (including part-time
staff) in the various departments of the University
of Essex (a small, friendly university) were as follows:
19.0, 15.7, 25.3, 28.0, 15.0, 10.0
12.0, 10.3, 22.0, 24.8, 13.8, 25.9,
23.0, 21.3, 12.0, 11.0, 23.0
What is the IQR?
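The IQR questions above can be worked through with the sketch below. The slides do not say which quartile convention is intended, so this is an assumption: Q1 and Q3 are taken as the medians of the lower and upper halves of the ordered data, leaving out the overall median when the number of values is odd. Other conventions (for example, interpolated percentiles) can give slightly different answers. The function names are hypothetical.

```python
from statistics import median

# Data copied from Examples 6.3 and 6.4.
horse_kick_deaths = [3, 4, 5, 5, 6, 6, 7, 8, 9, 9, 10,
                     11, 11, 11, 12, 14, 15, 15, 17, 18]
essex_staff = [19.0, 15.7, 25.3, 28.0, 15.0, 10.0,
               12.0, 10.3, 22.0, 24.8, 13.8, 25.9,
               23.0, 21.3, 12.0, 11.0, 23.0]


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: with an odd number of values, the overall median is
    excluded from both halves.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    if half == 0:
        raise ValueError("need at least two values")
    return median(ordered[:half]), median(ordered[-half:])


def interquartile_range(values):
    """Return Q3 - Q1 under the median-of-halves assumption."""
    q1, q3 = quartiles(values)
    return q3 - q1


print(quartiles(horse_kick_deaths), interquartile_range(horse_kick_deaths))
print(quartiles(essex_staff), round(interquartile_range(essex_staff), 1))
```

Under this convention the horse-kick data give Q1 = 6 and Q3 = 13, and the staff data give Q1 = 12.0 and Q3 = 23.9.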
The Semi-Interquartile Range
•It is half of the
difference between $P_{75}$ (or
$Q_3$) and $P_{25}$ (or $Q_1$) in
the distribution.
Formula:
•$SIQR = \frac{Q_3 - Q_1}{2}$
•$SIQR = \frac{P_{75} - P_{25}}{2}$
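As a small illustration of the semi-interquartile range, the sketch below applies it to the horse-kick data used in the earlier examples. It assumes the quartiles are the medians of the lower and upper halves of the ordered data, which is one convention among several; the function name is hypothetical.

```python
from statistics import median

# Horse-kick data from the earlier examples, already in order of size.
horse_kick_deaths = [3, 4, 5, 5, 6, 6, 7, 8, 9, 9, 10,
                     11, 11, 11, 12, 14, 15, 15, 17, 18]


def semi_interquartile_range(values):
    """Return (Q3 - Q1) / 2, with quartiles as medians of the two halves.

    Assumption: with an odd number of values, the overall median is
    excluded from both halves.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    if half == 0:
        raise ValueError("need at least two values")
    q1 = median(ordered[:half])
    q3 = median(ordered[-half:])
    return (q3 - q1) / 2


print(semi_interquartile_range(horse_kick_deaths))  # (13 - 6) / 2
```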
The Mean Deviation (MD)
•It is the average distance between the
mean and the scores in the distribution.
•The technique provides a reasonably
stable estimate of variation.
•It is also called “Average Deviation”.
Formula:
•$MD = \frac{\sum |X - \bar{x}|}{n}$
Example 6.5:
•The following data are the numbers of deaths of
army officers caused by horse kicks, for the
Prussian Army during the period 1875 to 1894.
In order of size the numbers are:
3, 4, 5, 5, 6, 6, 7, 8, 9, 9, 10,
11, 11, 11, 12, 14, 15, 15, 17, 18
What is the value of MD?
Steps in Calculating the MD
Step 1: Construct a frequency distribution/table.
X = individual scores, x̄ = mean
X     x̄     |X − x̄|        X     x̄     |X − x̄|
3                           10
4                           11
5                           11
5                           11
6                           12
6                           14
7                           15
8                           15
9                           17
9                           18
Step 2: Calculate the mean using the simple arithmetic mean.
$\bar{x} = \frac{196}{20} = 9.8$
X = individual scores, x̄ = mean
X     x̄     |X − x̄|        X     x̄     |X − x̄|
3     9.8                   10    9.8
4     9.8                   11    9.8
5     9.8                   11    9.8
5     9.8                   11    9.8
6     9.8                   12    9.8
6     9.8                   14    9.8
7     9.8                   15    9.8
8     9.8                   15    9.8
9     9.8                   17    9.8
9     9.8                   18    9.8
Step 3: Calculate the absolute deviation of each score from the mean.
X = individual scores, x̄ = mean
X     x̄     |X − x̄|        X     x̄     |X − x̄|
3     9.8   6.8             10    9.8   0.2
4     9.8   5.8             11    9.8   1.2
5     9.8   4.8             11    9.8   1.2
5     9.8   4.8             11    9.8   1.2
6     9.8   3.8             12    9.8   2.2
6     9.8   3.8             14    9.8   4.2
7     9.8   2.8             15    9.8   5.2
8     9.8   1.8             15    9.8   5.2
9     9.8   0.8             17    9.8   7.2
9     9.8   0.8             18    9.8   8.2
Step 4: Calculate the sum of the absolute deviations
of the scores from the mean.
The value of $\sum |X - \bar{x}|$:
•$\sum |X - \bar{x}| = 72$
Step 5: Use the formula.
Formula:
•$MD = \frac{\sum |X - \bar{x}|}{n}$
•$MD = \frac{72}{20}$
Final Answer:
•$MD = 3.6$
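The five steps above can be sketched in a few lines of code. This is a minimal illustration using the horse-kick data from Example 6.5; the function name `mean_deviation` is hypothetical.

```python
# Horse-kick data from Example 6.5.
horse_kick_deaths = [3, 4, 5, 5, 6, 6, 7, 8, 9, 9, 10,
                     11, 11, 11, 12, 14, 15, 15, 17, 18]


def mean_deviation(values):
    """Return the mean of the absolute deviations from the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n                               # Step 2
    abs_deviations = [abs(x - mean) for x in values]     # Step 3
    total = sum(abs_deviations)                          # Step 4
    return total / n                                     # Step 5


# Rounded to one decimal place to hide floating-point noise.
print(round(mean_deviation(horse_kick_deaths), 1))
```

The absolute value matters: without it, the deviations above and below the mean cancel and the sum is always zero.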
The Variance
•It measures how close the scores in the distribution
are to the middle of the distribution.
•The more variation there is in the x-values, the
larger will be the value of the variance.
•The variance is defined as the average of the squared
differences of the scores from the mean.
The Issue:
•Dividing the sum of the squared deviations
from the mean, $\sum (x_i - \bar{x})^2$, by $n$ would
seem natural, but (unfortunately)
there is a strong case for dividing
instead by $n - 1$.
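For reference, the two candidate definitions can be written side by side. The symbol $\sigma^2$ for the divisor-$n$ form follows the notes in this section; the symbol $s^2$ for the divisor-$(n-1)$ form is a common convention and an assumption here, since these slides do not name it.

```latex
\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2
\qquad
s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2
```

The numerators are identical, so the two values differ only by the factor $n / (n - 1)$, which approaches 1 as $n$ grows.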
Using the divisor n:
•This is appropriate in two
cases: (1) if the values
$x_1, \ldots, x_n$ represent an entire
population;
Note:
•If $x_1, \ldots, x_n$ represent the
entire population then $\sigma^2$ is
called the POPULATION
VARIANCE.
Using the divisor n:
•and (2) if the values $x_1, \ldots, x_n$
represent a sample from a
population and we are interested
in the variation within the sample
itself.
Note:
•If $x_1, \ldots, x_n$ represent a
sample of the data then $\sigma^2$
is called the SAMPLE
VARIANCE.
Example 6.6:
•An example of a case where the x-values
refer to the entire population
is where $x_1, \ldots, x_n$ represent the
heights of ALL the children in a
particular class in a school.
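To make the choice of divisor concrete, the sketch below computes the variance both ways. No height data are given for Example 6.6, so the horse-kick data from the earlier examples are reused purely for illustration. The helper `variance_with_divisor` is hypothetical; `statistics.pvariance` (divisor n) and `statistics.variance` (divisor n − 1) are standard-library functions used here only as a cross-check.

```python
from math import isclose
from statistics import pvariance, variance

# Horse-kick data from the earlier examples, reused for illustration only.
horse_kick_deaths = [3, 4, 5, 5, 6, 6, 7, 8, 9, 9, 10,
                     11, 11, 11, 12, 14, 15, 15, 17, 18]


def variance_with_divisor(values, divisor):
    """Return the sum of squared deviations from the mean over `divisor`."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / divisor


n = len(horse_kick_deaths)
with_n = variance_with_divisor(horse_kick_deaths, n)
with_n_minus_1 = variance_with_divisor(horse_kick_deaths, n - 1)

# Cross-check against the standard library.
assert isclose(with_n, pvariance(horse_kick_deaths))
assert isclose(with_n_minus_1, variance(horse_kick_deaths))

print(round(with_n, 2), round(with_n_minus_1, 2))
```

The divisor n − 1 always gives the larger of the two values, and the gap narrows as the number of observations increases.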