ENGINEERING STATISTIC
4th LECTURE: MEASURES OF DISPERSION AND POSITION
In statistics, to describe a data set accurately, statisticians must know more than the measures of
central tendency. Two data sets with the same mean may have completely different variation or
dispersion, so the measures that help us know about the spread of a data set are called the measures
of dispersion, such as:
1- Range.
2- Variance.
3- Standard deviation.
1- Range: The range is the simplest of the three measures. It is the highest value minus the
lowest value. The symbol R is used for the range.
Disadvantages of the range:
a- It is based on two values only, the largest and the smallest.
b- Extremely large or extremely small data values can significantly affect the range.
Exp.(1): Calculate the range for the following data set: 5, 0, 2, -7, 7, 10, 16, -9
Sol: Arrange the data in order: -9, -7, 0, 2, 5, 7, 10, 16
R = highest value - lowest value
R = 16 - (-9) = 25
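The range calculation above can be checked with a short Python sketch. The function name `data_range` is only illustrative and is not part of the lecture.

```python
def data_range(values):
    """Return the range R = highest value - lowest value."""
    if not values:
        raise ValueError("the range of an empty data set is undefined")
    return max(values) - min(values)


# Data set from Exp.(1); the order of the values does not affect the range.
data = [-9, -7, 0, 2, 5, 7, 10, 16]
print(data_range(data))  # 25

# One extreme value changes R sharply, which illustrates disadvantage (b).
print(data_range(data + [100]))  # 109
```

Because only the two extreme values enter the calculation, adding a single outlier (here the made-up value 100) moves the range from 25 to 109 while the rest of the data is unchanged.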
2- Variance and Standard Deviation.
a- Ungrouped data
Population variance: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
Sample variance: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
Population standard deviation: $\sigma = \sqrt{\sigma^2}$
Sample standard deviation: $s = \sqrt{s^2}$
Range: R = highest value - lowest value

Exp(2): Find the sample variance, standard deviation, and range for the amount of European auto
sales for a sample of 6 years shown. The data are in millions of dollars.
11.2, 11.9, 12.0, 12.8, 13.4, 14.3
Sol:
1- Variance: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
$\bar{x} = \frac{\sum x}{n} = \frac{11.2 + 11.9 + 12.0 + 12.8 + 13.4 + 14.3}{6} = \frac{75.6}{6} = 12.6$
$s^2 = \frac{(11.2-12.6)^2 + (11.9-12.6)^2 + (12.0-12.6)^2 + (12.8-12.6)^2 + (13.4-12.6)^2 + (14.3-12.6)^2}{5} = \frac{6.38}{5} = 1.276$
2- Standard deviation: $s = \sqrt{1.276} = 1.13$
3- The range: R = 14.3 - 11.2 = 3.1

b- Grouped data
Population variance: $\sigma^2 = \frac{\sum f (x_m - \mu)^2}{N}$
Sample variance: $s^2 = \frac{\sum f (x_m - \bar{x})^2}{n - 1}$
Population standard deviation: $\sigma = \sqrt{\sigma^2}$
Sample standard deviation: $s = \sqrt{s^2}$
Here $x_m$ is the class midpoint and $f$ is the class frequency.

Exp(3): Find the variance and the standard deviation for the data in this frequency distribution table. The data represent the number of miles that 20 runners ran during one week.

| Class boundaries | 5.5–10.5 | 10.5–15.5 | 15.5–20.5 | 20.5–25.5 | 25.5–30.5 | 30.5–35.5 | 35.5–40.5 |
|---|---|---|---|---|---|---|---|
| Freq. ($f_i$) | 1 | 2 | 3 | 5 | 4 | 3 | 2 |

Sol:

| i | Class boundaries | Freq. ($f_i$) | $x_m$ | $f_i \cdot x_m$ | $(x_m - \bar{x})^2$ | $f_i (x_m - \bar{x})^2$ |
|---|---|---|---|---|---|---|
| 1 | 5.5–10.5 | 1 | 8 | 8 | 272.25 | 272.25 |
| 2 | 10.5–15.5 | 2 | 13 | 26 | 132.25 | 264.5 |
| 3 | 15.5–20.5 | 3 | 18 | 54 | 42.25 | 126.75 |
| 4 | 20.5–25.5 | 5 | 23 | 115 | 2.25 | 11.25 |
| 5 | 25.5–30.5 | 4 | 28 | 112 | 12.25 | 49.00 |
| 6 | 30.5–35.5 | 3 | 33 | 99 | 72.25 | 216.75 |
| 7 | 35.5–40.5 | 2 | 38 | 76 | 182.25 | 364.5 |
| ∑ | | 20 | | 490 | | 1305 |

$\bar{x} = \frac{\sum f \cdot x_m}{n} = \frac{490}{20} = 24.5$
$s^2 = \frac{\sum f (x_m - \bar{x})^2}{n - 1} = \frac{1305}{19} = 68.68$
$s = \sqrt{68.68} = 8.29$
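As a cross-check on the two worked examples, the sample formulas for ungrouped and grouped data can be sketched in Python. The function names `sample_variance` and `grouped_sample_variance` are illustrative assumptions, and each class is represented by its midpoint, as in the solution table.

```python
import math


def sample_variance(values):
    """Sample variance: sum((x - mean)^2) / (n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def grouped_sample_variance(boundaries, freqs):
    """Sample variance from (lower, upper) class boundaries and frequencies."""
    n = sum(freqs)
    if n < 2:
        raise ValueError("need at least two observations")
    # Each class is represented by its midpoint x_m.
    midpoints = [(lo + hi) / 2 for lo, hi in boundaries]
    mean = sum(f * xm for f, xm in zip(freqs, midpoints)) / n
    return sum(f * (xm - mean) ** 2 for f, xm in zip(freqs, midpoints)) / (n - 1)


# Exp(2): European auto sales, ungrouped data.
sales = [11.2, 11.9, 12.0, 12.8, 13.4, 14.3]
s2 = sample_variance(sales)
print(round(s2, 2), round(math.sqrt(s2), 2))

# Exp(3): miles run by 20 runners, grouped data.
classes = [(5.5, 10.5), (10.5, 15.5), (15.5, 20.5), (20.5, 25.5),
           (25.5, 30.5), (30.5, 35.5), (35.5, 40.5)]
freqs = [1, 2, 3, 5, 4, 3, 2]
g2 = grouped_sample_variance(classes, freqs)
print(round(g2, 2), round(math.sqrt(g2), 2))
```

Run on the example data, this gives a variance of roughly 1.28 and a standard deviation of roughly 1.13 for the auto sales, and roughly 68.7 and 8.3 for the runners. Dividing by n - 1 rather than n is what makes these the sample versions of the formulas.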
Coefficient of Variation
The coefficient of variation, denoted by CVar, is a statistic that allows you to compare standard
deviations when the units are different. It is the standard deviation divided by the mean, and the
result is expressed as a percentage.
For samples: $CVar = \frac{s}{\bar{x}} \times 100$
For populations: $CVar = \frac{\sigma}{\mu} \times 100$
Exp: The mean of the number of sales of cars over a 3-month period is 87, and the standard deviation is 5. The mean of the commissions is 5225 dollars, and the standard deviation is 773 dollars. Compare the variations of the two.
Sol: The coefficients of variation are:
Sales: $CVar = \frac{s}{\bar{x}} \times 100 = \frac{5}{87} \times 100 = 5.75\%$
Commissions: $CVar = \frac{s}{\bar{x}} \times 100 = \frac{773}{5225} \times 100 = 14.8\%$
Since the coefficient of variation is larger for commissions, the commissions are more
variable than the sales.

Measures of Position
In addition to measures of central tendency and measures of variation, there are measures
of position or location. These measures include:
1- Standard scores.
2- Percentiles.
3- Deciles and quartiles.
They are used to locate the relative position of a data value in the data set.
z score or standard score: it represents the number of standard deviations that a data value falls above
or below the mean.
For samples, the formula is: $z = \frac{x - \bar{x}}{s}$
For populations, the formula is: $z = \frac{x - \mu}{\sigma}$
*** Note that if the z score is positive, the score is above the mean. If the z score is 0, the
score is the same as the mean. And if the z score is negative, the score is below the mean.
Exp: A student scored 65 on a calculus test that had a mean of 50 and a standard deviation
of 10; she scored 30 on a history test with a mean of 25 and a standard deviation of 5.
Compare her relative positions on the two tests.
Sol:
Calculus: $z_1 = \frac{x - \bar{x}}{s} = \frac{65 - 50}{10} = 1.5$
History: $z_2 = \frac{x - \bar{x}}{s} = \frac{30 - 25}{5} = 1$
Since the z score for calculus is larger, her relative position in the calculus class is
higher than her relative position in the history class.
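The coefficient of variation and z score formulas in this part of the lecture are simple ratios, so the two worked examples can be verified with a minimal Python sketch. The function names are illustrative only.

```python
def coefficient_of_variation(std_dev, mean):
    """CVar = (standard deviation / mean) * 100, expressed as a percentage."""
    if mean == 0:
        raise ValueError("CVar is undefined when the mean is 0")
    return std_dev / mean * 100


def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above or below the mean."""
    if std_dev <= 0:
        raise ValueError("the standard deviation must be positive")
    return (x - mean) / std_dev


# Car sales versus commissions.
print(round(coefficient_of_variation(5, 87), 2))      # 5.75
print(round(coefficient_of_variation(773, 5225), 1))  # 14.8

# Calculus versus history test scores.
print(z_score(65, 50, 10))  # 1.5
print(z_score(30, 25, 5))   # 1.0
```

Both quantities are unit-free, which is why they allow comparisons between data sets measured on different scales.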
Quartiles
As the name implies, quartiles divide the data set into four equal parts. Therefore the first quartile, Q1,
is the 25th percentile; the second quartile, Q2, is the 50th percentile (or the median); and the third
quartile, Q3, is the 75th percentile. The difference between the third and first quartiles is the
interquartile range (IQR).
For ungrouped data, the quartiles (Q1, Q2, and Q3) are calculated by:
1- Arrange the data in order from lowest to highest.
2- Find the median of the data values. This is the value for Q2.
3- Find the median of the data values that fall below Q2. This is the value for Q1.
4- Find the median of the data values that fall above Q2. This is the value for Q3.
Percentiles: percentiles divide the data set into 100 equal groups, so each data set has 99 percentiles.
The data must be ranked in increasing order to compute percentiles. The kth percentile is denoted by Pk,
where k is an integer in the range 1 to 99. For example, the 25th percentile, denoted by P25, is defined
to be the numerical value such that at most 25% of the values are smaller than it and at most 75% are
larger than it in an ordered data set.
[Figure: the ordered data from the minimum value to the maximum value, split by Q1, Q2 (the median), and Q3 into four parts, each containing 25% of the data.]
IQR = Q3 - Q1
Example : Find Q1, Q2, and Q3 for the data set 15, 13, 6, 5, 12, 50, 22, 18.
Sol:
1- Arrange the data in increasing order: 5, 6, 12, 13, 15, 18, 22, 50
Q2 is the median of all values → Q2 = (13+15)/2= 14
Q1 is the median of values ( 5, 6, 12, 13 ) → Q1 = (6+12)/2= 9.
Q3 is the median of values (15, 18, 22, 50) → Q3 = (18+22)/2= 20.
Example : the following are the ages of nine employees of an insurance company
47 28 39 51 33 37 59 24 33
a- Find the values of three quartiles
b- Where does the age 28 fall in relation to the ages of these employees?
c- Find the inter quartile range (IQR).
Sol:
Arrange the data in increasing order: 24, 28, 33, 33, 37, 39, 47, 51, 59
Q2 is the median of all values → Q2 = 37
Q1 is the median of values ( 24, 28, 33, 33) → Q1 = (28+33)/2= 30.5.
Q3 is the median of values (39, 47, 51, 59) → Q3 = (47+51)/2=49.
b- The age 28 is less than Q1 = 30.5, so it falls in the lowest 25% of the ages.
c- The inter quartile range(IQR) = Q3- Q1 = 49-30.5= 18.5 years.
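The median-of-halves procedure for ungrouped data can be sketched in Python as follows. This is a minimal illustration, not a library routine: the names `median` and `quartiles` are hypothetical, and the sketch assumes that when the number of values is odd, the middle value is left out of both halves, as in the employee-age example.

```python
def median(sorted_values):
    """Median of a list that is already sorted in increasing order."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves procedure."""
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = data[:half]
    # Assumption: for odd n the median itself belongs to neither half.
    upper = data[half + 1:] if n % 2 == 1 else data[half:]
    return median(lower), median(data), median(upper)


print(quartiles([15, 13, 6, 5, 12, 50, 22, 18]))  # (9.0, 14.0, 20.0)

q1, q2, q3 = quartiles([47, 28, 39, 51, 33, 37, 59, 24, 33])
print(q1, q2, q3, q3 - q1)  # 30.5 37 49.0 18.5
```

Library functions such as `statistics.quantiles` or `numpy.percentile` use interpolation rules of their own, so they may return slightly different quartiles for the same data.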
For ungrouped data, the percentile corresponding to a given value (x) is computed by using the formula:
$\text{Percentile} = \frac{(\text{number of values below } x) + 0.5}{\text{total number of values}} \times 100$
Finding the value corresponding to a given percentile:
1- Let p be the percentile and n the sample size.
2- Arrange the data in order.
3- Compute c = (n×p)/100.
4- If c is not a whole number, round up to the next whole number. If c is a whole number, use the
value halfway between the cth and (c+1)th values.
5- The value of c is the position, counted from the lowest value, of the required percentile.
Deciles: deciles divide the distribution into 10 groups. They are denoted by D1, D2, etc.
Note that D1 corresponds to P10; D2 corresponds to P20; etc.
Deciles can be found by using the formulas given for percentiles, since each decile, like each
quartile, corresponds to a percentile.
Example: A teacher gives a 20-point test to 10 students. Find the percentile rank of
a score of 12. Scores: 18, 15, 12, 6, 8, 2, 3, 5, 20, 10.
Sol:
Ordered set: 2, 3, 5, 6, 8, 10, 12, 15, 18, 20.
Six values (2, 3, 5, 6, 8, 10) are below 12, so:
$\text{Percentile} = \frac{6 + 0.5}{10} \times 100 = 65$, that is, the 65th percentile.
The student did better than 65% of the class.
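The percentile rank formula used in this example translates directly into a few lines of Python. The function name `percentile_rank` is an illustrative choice.

```python
def percentile_rank(values, x):
    """Percentile of x: (number of values below x + 0.5) / total * 100."""
    if not values:
        raise ValueError("the data set is empty")
    below = sum(1 for v in values if v < x)
    # Multiply before dividing to keep the arithmetic exact for this example.
    return (below + 0.5) * 100 / len(values)


scores = [18, 15, 12, 6, 8, 2, 3, 5, 20, 10]
print(percentile_rank(scores, 12))  # 65.0
```

The data do not need to be sorted first, because the formula only counts how many values lie below x.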
Example: For the following data set: 2, 3, 5, 6, 8, 10, 12, 15, 18, 20.
Find the values of the 25th and 80th percentile.
Sol:
a. n = 10, p = 25
c = (10×25)/100 = 2.5. Hence round up to c = 3.
Thus, the value of the 25th percentile is the value x = 5.
b. n = 10, p = 80
c = (10× 80)/100 = 8.
Thus the value of the 80th percentile is the average of the 8th and 9th values.
x = (15 + 18)/2 = 16.5.
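The rule for finding the value at a given percentile (round c up when it is not a whole number, otherwise average the cth and (c+1)th values) can be sketched as below. The function name `percentile_value` is hypothetical, and the sketch assumes p is an integer from 1 to 99 so that the whole-number test can be done with exact integer arithmetic.

```python
def percentile_value(values, p):
    """Value at the p-th percentile using the position c = n * p / 100."""
    if not isinstance(p, int) or not 1 <= p <= 99:
        raise ValueError("p must be an integer from 1 to 99")
    if not values:
        raise ValueError("the data set is empty")
    data = sorted(values)
    # whole = integer part of c, remainder != 0 means c is not a whole number.
    whole, remainder = divmod(len(data) * p, 100)
    if remainder:
        # Round c up; positions are 1-based, list indices are 0-based.
        return data[whole]
    # c is a whole number: halfway between the c-th and (c+1)-th values.
    return (data[whole - 1] + data[whole]) / 2


data = [2, 3, 5, 6, 8, 10, 12, 15, 18, 20]
print(percentile_value(data, 25))  # 5
print(percentile_value(data, 80))  # 16.5
```

Because D1 = P10, D2 = P20, and so on, the same function also gives deciles when called with p = 10, 20, ..., 90.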
Exp: The following are test scores for a particular math class. Find the sixth decile.
44 56 58 62 64 64 70 72 72 72
74 74 75 78 78 79 80 82 82 84
86 87 88 90 92 95 96 96 98 100
Sol: D6 = P60
n = 30, p = 60, c = (30×60)/100 = 18
Since c is a whole number, the 6th decile is the average of the 18th and 19th values: D6 = (82 + 82)/2 = 82.
Percentiles, deciles, and quartiles for grouped data: to find the value that corresponds to a
specified position i, such as the position of a percentile, quartile, or decile, in grouped data, the
following formulas must be used:
$Q_i = L + \frac{n\left(\frac{i}{4}\right) - F}{f_m} \times \Delta$
$P_i = L + \frac{n\left(\frac{i}{100}\right) - F}{f_m} \times \Delta$
$D_i = L + \frac{n\left(\frac{i}{10}\right) - F}{f_m} \times \Delta$
Where:
$Q_i$, $P_i$, $D_i$: the quartile, percentile, and decile of position i.
L: the lower class boundary of the class that contains position i.
n: the total number of data values.
F: the cumulative frequency of the class before the class that contains position i.
$f_m$: the frequency of the class that contains position i.
Δ: the class width.
Deciles are denoted by D1, D2, D3, . . . , D9,
and they correspond to P10, P20, P30, . . . , P90.
Quartiles are denoted by Q1, Q2, Q3
and they correspond to P25, P50, P75.
The median = P50 = Q2 = D5.
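Exp: The times taken by 20 workers in a factory to do a particular job were tabulated as follows. Find Q2, P70,
and D4.

| i | Class boundaries | Freq. ($f_i$) | Cumulative freq. |
|---|---|---|---|
| 1 | 7.5–10.5 | 2 | 2 |
| 2 | 10.5–13.5 | 4 | 6 |
| 3 | 13.5–16.5 | 6 | 12 |
| 4 | 16.5–19.5 | 4 | 16 |
| 5 | 19.5–22.5 | 3 | 19 |
| 6 | 22.5–25.5 | 1 | 20 |

Sol:
$Q_i = L + \frac{n\left(\frac{i}{4}\right) - F}{f_m} \times \Delta$
For Q2: $n\left(\frac{i}{4}\right) = 20 \times \frac{2}{4} = 10$
The class that contains Q2 is (13.5–16.5), the first class whose cumulative frequency reaches 10, so L = 13.5, F = 6, $f_m$ = 6, and Δ = 3.
$Q_2 = 13.5 + \frac{10 - 6}{6} \times 3 = 15.5$

$P_i = L + \frac{n\left(\frac{i}{100}\right) - F}{f_m} \times \Delta$
For P70: $n\left(\frac{i}{100}\right) = 20 \times \frac{70}{100} = 14$
The class that contains P70 is (16.5–19.5), so L = 16.5, F = 12, $f_m$ = 4, and Δ = 3.
$P_{70} = 16.5 + \frac{14 - 12}{4} \times 3 = 18$

$D_i = L + \frac{n\left(\frac{i}{10}\right) - F}{f_m} \times \Delta$
For D4: $n\left(\frac{i}{10}\right) = 20 \times \frac{4}{10} = 8$
The class that contains D4 is (13.5–16.5), so L = 13.5, F = 6, $f_m$ = 6, and Δ = 3.
$D_4 = 13.5 + \frac{8 - 6}{6} \times 3 = 14.5$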
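Since the three grouped-data formulas differ only in the divisor (4, 10, or 100), they can be sketched as one Python function. The name `grouped_quantile` and its `parts` parameter are illustrative assumptions, not notation from the lecture. The class containing the position is taken to be the first class whose cumulative frequency reaches n×i/parts.

```python
def grouped_quantile(boundaries, freqs, i, parts):
    """L + ((n * i / parts - F) / f_m) * width for grouped data.

    parts is 4 for quartiles, 10 for deciles, 100 for percentiles.
    boundaries is a list of (lower, upper) class boundaries.
    """
    n = sum(freqs)
    target = n * i / parts  # position of the required value
    cumulative = 0          # F: cumulative frequency before the current class
    for (lower, upper), f in zip(boundaries, freqs):
        if f > 0 and cumulative + f >= target:
            width = upper - lower
            return lower + (target - cumulative) * width / f
        cumulative += f
    raise ValueError("position lies outside the frequency table")


classes = [(7.5, 10.5), (10.5, 13.5), (13.5, 16.5),
           (16.5, 19.5), (19.5, 22.5), (22.5, 25.5)]
freqs = [2, 4, 6, 4, 3, 1]

print(grouped_quantile(classes, freqs, 2, 4))     # Q2  -> 15.5
print(grouped_quantile(classes, freqs, 70, 100))  # P70 -> 18.0
print(grouped_quantile(classes, freqs, 4, 10))    # D4  -> 14.5
```

The formula interpolates linearly inside the class, which amounts to assuming that the observations are spread evenly between the class boundaries.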