chapter 10 — summary...
M B 1 1 Q l d - 1 0 160 S u m m a r y s t a t i s t i c s
Exercise 10A — Measures of central tendency
1 a x̄ = 36 ÷ 5 = 7.2
b x̄ = 114 ÷ 16 = 7.125
c x̄ = 39.9 ÷ 8 = 4.9875
d x̄ = 151 ÷ 9 = 16.777... (the 7 recurs)
e x̄ = 9.7 ÷ 11 = 0.8818... (the digits 81 recur)
2 a x̄ = 6.47 ÷ 6 = 1.0783... (the 3 recurs). Not a good indicator of central tendency due to the outlier (2.3).
b x̄ = 136 ÷ 8 = 17. Yes, a good indicator of central tendency.
c x̄ = 247 ÷ 8 = 30.875. Yes, a good indicator of central tendency.
d x̄ = 109 ÷ 7 = 15.57. Not a good indicator of central tendency due to the outlier (2).
3 x̄ = 243 ÷ 20 = 12.15
The mean number attending each week is approximately 12 people.
4 Data: 21, 22, 24, 25, 26, 26, 27, 28, 28, 28, 29, 30, 31, 31, 32, 33, 34, 34, 36, 38
x̄ = 583 ÷ 20 = 29.15
The answer is D
5 Data: 4, 7, 12, 14, 15, 15, 16, 17, 18, 21, 22, 24, 27, 27, 27
x̄ = 266 ÷ 15 = 17.7
The answer is A
6 a If the distribution is positively skewed then the median would be a better indicator of central tendency.
b The mean is a good indicator of central tendency if the distribution is symmetrical.
c If the distribution has an outlier then the median is a better indicator of central tendency.
d If the distribution is negatively skewed then the median is a better indicator of central tendency.
7 a
Class interval  Frequency (f)  Midpoint of class interval (m)  f × m
0 − 9 1 4.5 4.5
10 − 19 3 14.5 43.5
20 − 29 6 24.5 147
30 − 39 17 34.5 586.5
40 − 49 12 44.5 534
50 − 59 5 54.5 272.5
Σf = 44  Σ(f × m) = 1588
x̄ = 1588 ÷ 44 = 36.09
b
Class interval  Frequency (f)  Midpoint of class interval (m)  f × m
0 − 4 2 2 4
5 − 9 5 7 35
10 − 14 7 12 84
15 − 19 13 17 221
20 − 24 8 22 176
25 − 29 6 27 162
Σf = 41  Σ(f × m) = 682
x̄ = 682 ÷ 41 = 16.63
c
Class interval  Frequency (f)  Midpoint of class interval (m)  f × m
0 − 49 2 24.5 49
50 − 99 7 74.5 521.5
100 − 149 8 124.5 996
150 − 199 14 174.5 2443
200 − 249 12 224.5 2694
250 − 299 5 274.5 1372.5
Σf = 48  Σ(f × m) = 8076
x̄ = 8076 ÷ 48 = 168.25
Chapter 10 — Summary statistics
d
Class interval  Frequency (f)  Midpoint of class interval (m)  f × m
1 − 6 14 3.5 49
7 − 12 19 9.5 180.5
13 − 18 23 15.5 356.5
19 − 24 22 21.5 473
25 − 30 20 27.5 550
31 − 36 14 33.5 469
Σf = 112  Σ(f × m) = 2078
x̄ = 2078 ÷ 112 = 18.55
8
Age  Frequency (f)  Midpoint of class interval (m)  f × m
10 − 14 5 12 60
15 − 19 5 17 85
20 − 24 7 22 154
25 − 29 4 27 108
30 − 34 3 32 96
35 − 39 2 37 74
40 − 44 2 42 84
45 − 49 1 47 47
Σf = 29  Σ(f × m) = 708
x̄ = 708 ÷ 29 = 24.41 = 24 to the nearest whole number
9 a
Median = ((n + 1) ÷ 2)th position
= ((24 + 1) ÷ 2)th position
= 12.5th position
12th term = 36, 13th term = 38
Therefore the median is 37
b Median = ((n + 1) ÷ 2)th position
= ((35 + 1) ÷ 2)th position
= 18th position
Therefore the median is 5
c Median = ((n + 1) ÷ 2)th position
= ((37 + 1) ÷ 2)th position
= 19th position
Therefore the median is 11
d Median = ((n + 1) ÷ 2)th position
= ((22 + 1) ÷ 2)th position
= 11.5th position
11th term = 42, 12th term = 43
Therefore the median is 42.5
e Median = ((n + 1) ÷ 2)th position
= ((34 + 1) ÷ 2)th position
= 17.5th position
17th term = 628, 18th term = 628
Therefore the median is 628
10 a Median position = ((5 + 1) ÷ 2)th position = 3rd position
Therefore the median is 6
b Median position = ((5 + 1) ÷ 2)th position = 3rd position
Therefore the median is 17
c Median position = ((7 + 1) ÷ 2)th position = 4th position
Therefore the median is 6
d Median position = ((8 + 1) ÷ 2)th position = 4.5th position
4th term = 8, 5th term = 12
Therefore the median is 10
e Median position = ((10 + 1) ÷ 2)th position = 5.5th position
5th term = 18, 6th term = 19
Therefore the median is 18.5
f Ordered data: 1 2 3 4 5 6 8
Median position = ((7 + 1) ÷ 2)th position = 4th position
Therefore the median is 4
g Ordered data: 11 14 15 16 19 21 23 25 28
Median position = ((9 + 1) ÷ 2)th position = 5th position
Therefore the median is 19
h Ordered data: 2 3 4 4 4 5 7 9 10 11
Median position = ((10 + 1) ÷ 2)th position = 5.5th position
5th term = 4, 6th term = 5
Therefore the median is 4.5
i Ordered data: 16 18 22 22 23 23 26 29 37
Median position = ((9 + 1) ÷ 2)th position = 5th position
Therefore the median is 23
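The position rule used in questions 9 and 10 can be checked with a short Python sketch. The function name `median_by_position` is an illustrative choice, not something from the text; the data are the ordered lists given for 10 g and 10 h.

```python
def median_by_position(data):
    """Median via the ((n + 1) / 2)th-position rule used in the solutions."""
    ordered = sorted(data)
    n = len(ordered)
    position = (n + 1) / 2              # 1-based position; ends in .5 when n is even
    lower = ordered[int(position) - 1]  # term at the whole-number part of the position
    if position.is_integer():
        return lower                    # odd n: a single middle term
    upper = ordered[int(position)]      # even n: average the two middle terms
    return (lower + upper) / 2


q10g = [11, 14, 15, 16, 19, 21, 23, 25, 28]
q10h = [2, 3, 4, 4, 4, 5, 7, 9, 10, 11]

print(median_by_position(q10g))  # 5th position
print(median_by_position(q10h))  # 5.5th position
```

For an odd number of scores the position is a whole number and one term is returned; for an even number the two neighbouring terms are averaged, as in 10 h.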
11 a The number 17 occurs three times. The mode is 17.
b Both 148 and 151 occur twice. The mode values are 148 and 151.
c There is no mode.
d The number 72 occurs four times. The mode is 72.
e The number 2.6 occurs three times. The mode is 2.6.
12 a The highest frequency is 6. The score with the highest frequency is 4, so the mode is 4.
b The highest frequency is 8. The score with the highest frequency is 8, so the mode is 8.
c The highest frequency is 6. The scores with the highest frequency are 42 and 44, so the modes are 42 and 44.
13 a The highest frequency is 46. The class with the highest frequency is 17 − 20, so the modal class is 17 − 20.
b The highest frequency is 25. The class with the highest frequency is 22 − 28, so the modal class is 22 − 28.
14 a
Age  Frequency (f)  Midpoint (x)  x × f
0 − 14 1 7.5 7.5
15 − 29 13 22.5 292.5
30 − 44 2 37.5 75
45 − 59 0 52.5 0
60 − 74 4 67.5 270
75 − 89 5 82.5 412.5
Total 25 1057.5
b Mean = 1057.5 ÷ 25 = 42.3
c Median = ((25 + 1) ÷ 2)th score = 13th score
Median class = 15 − 29 years old
d Highest frequency = 13
Modal class is 15 − 29 years old
e No. The mean (42.3) lies in the 30 − 44 class, while the median and the mode both lie in the 15 − 29 class.
f Answers will vary. Check with teacher (e.g. certain age groups tend to need emergency treatment more often: 15 − 29 year olds could be involved in car accidents more than other age groups).
15 a Player A mean:
= (28 + 30 + 31 + 34 + 41 + 42) ÷ 6
= 206 ÷ 6
≈ 34.33
Player B mean:
= (0 + 0 + 1 + 0 + 250 + 0) ÷ 6
= 251 ÷ 6
≈ 41.83
b Player B seems better on mean scores.
c Player A ordered scores: 28 30 31 34 41 42
Median = (31 + 34) ÷ 2 = 32.5
Player B ordered scores: 0 0 0 0 1 250
Median = (0 + 0) ÷ 2 = 0
d Player A is better.
e The mean is affected by extreme values, so it can be misleading; hence player A would be better to have in a cricket team.
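The contrast between mean and median in question 15 can be reproduced with Python's standard `statistics` module. This is only a sketch of the comparison; the two lists are the ordered scores given in part c (six scores per player).

```python
from statistics import mean, median

# Ordered scores from question 15 c (six innings per player).
player_a = [28, 30, 31, 34, 41, 42]
player_b = [0, 0, 0, 0, 1, 250]

for name, scores in (("Player A", player_a), ("Player B", player_b)):
    # The mean is pulled towards the single extreme score; the median is not.
    print(name, "mean:", round(mean(scores), 2), "median:", median(scores))
```

Player B's one score of 250 lifts the mean above Player A's, while the median shows that a typical innings for Player B is 0.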
Exercise 10B — Range and interquartile range
1 a Range = 63 − 7 = 56
b Range = 17 − 0 = 17
c Range = 19 − 1 = 18
d Range = 49 − 31 = 18
e Range = 674 − 602 = 72
2 a Range = 9 − 2 = 7
b Range = 21 − 12 = 9
c Range = 9 − 3 = 6
d Range = 16 − 3 = 13
e Range = 26 − 12 = 14
f Range = 8 − 1 = 7
g Range = 28 − 11 = 17
h Range = 11 − 2 = 9
i Range = 37 − 16 = 21
3 a Ordered data: 8 9 12 14 14 15 18 18 21 24 24 24 25 25 25
To find Q1, consider 8 9 12 14 14 15 18
Q1 = 14
To find Q3, consider 21 24 24 24 25 25 25
Q3 = 24
So IQR = Q3 − Q1 = 24 − 14 = 10
b Ordered data: 7 9 10 11 12 13 14 16 18 18 19 19 20 20 21
To find Q1, consider 7 9 10 11 12 13 14
Q1 = 11
To find Q3, consider 18 18 19 19 20 20 21
Q3 = 19
So IQR = Q3 − Q1 = 19 − 11 = 8
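The quartile method in question 3 (take the median of the scores below the overall median, and of those above it) can be sketched in Python as follows. The helper name `quartiles` is an assumption for illustration, and this halves method is the one used in these solutions; other software may use different quartile conventions.

```python
from statistics import median


def quartiles(data):
    """Q1 and Q3 as the medians of the lower and upper halves.

    For an odd number of scores the middle score is left out of both halves.
    """
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(upper)


q3a = [8, 9, 12, 14, 14, 15, 18, 18, 21, 24, 24, 24, 25, 25, 25]
q3b = [7, 9, 10, 11, 12, 13, 14, 16, 18, 18, 19, 19, 20, 20, 21]

for data in (q3a, q3b):
    q1, q3 = quartiles(data)
    print("Q1 =", q1, "Q3 =", q3, "IQR =", q3 - q1, "Range =", max(data) - min(data))
```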
4 An example of a set of data where n = 5, median is 6 and range is 7 is 2, 3, 6, 8 and 9.
Range = 9 − 2 = 7
There are many other sets of data with these parameters.
5 An example of a set of data where n = 8, median is 7.5 and range is 10 is 2, 5, 6, 7, 8, 10, 11 and 12.
Range = 12 − 2 = 10
There are many other sets of data with these parameters.
6 Q1 = 13, Median = 18 and Q3 = 25.
The interquartile range = 25 − 13 = 12.
The answer is C
7 a n = 20 Min x = 6 Q1 = 13.5 Median = 21 Q3 = 31.5 Max x = 51 IQR = Q3 − Q1 = 31.5 − 13.5 = 18 Range = 51 − 6 = 45
b n = 20 Min x = 19 Q1 = 22.5 Median = 27.5 Q3 = 30.5 Max x = 39 IQR = Q3 − Q1 = 30.5 − 22.5 = 8 Range = 39 − 19 = 20
c n = 19 Min x = 1.2 Q1 = 2.4 Median = 3.7 Q3 = 5.4 Max x = 7.1 IQR = Q3 − Q1 = 5.4 − 2.4 = 3 Range = 7.1 − 1.2 = 5.9
8 a n = 39 Min x = 23 Q1 = 32 Median = 42 Q3 = 53 Max x = 114 IQR = Q3 − Q1 = 53 − 32 = 21 Range = 114 − 23 = 91
b n = 22 Min x = 14 Q1 = 28 Median = 32 Q3 = 35 Max x = 44 IQR = Q3 − Q1 = 35 − 28 = 7 Range = 44 − 14 = 30
9 Show the percentiles on the graph.
a The median corresponds to the 50th percentile. From the graph, the median is 1.5.
b The lower quartile corresponds to the 25th percentile. From the graph, the lower quartile is 0.7.
c The upper quartile corresponds to the 75th percentile. From the graph, the upper quartile is 2.2.
d IQR = Q3 − Q1 = 2.2 − 0.7 = 1.5
10 a
Score  Frequency  Cumulative frequency
0 26 26
1 31 57
2 22 79
3 8 87
4 3 90
b
c The median corresponds to the 50th percentile. From the graph, the median is approximately 1.3.
d IQR = Q3 − Q1 = 2.1 − 0.5 = 1.6
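The cumulative frequency column in question 10 a is a running total of the frequencies. A minimal Python sketch, using the frequencies from that table, builds the column and the matching cumulative percentages (the percentage column is an illustrative addition; it is what the percentile axis on the graph would show).

```python
from itertools import accumulate

scores = [0, 1, 2, 3, 4]
frequencies = [26, 31, 22, 8, 3]

# Running total of the frequencies gives the cumulative frequency column.
cumulative = list(accumulate(frequencies))
total = cumulative[-1]

for score, f, cf in zip(scores, frequencies, cumulative):
    # Cumulative percentage corresponds to the percentile axis of the graph.
    print(score, f, cf, f"{100 * cf / total:.1f}%")
```

The median and quartiles in parts c and d are then read from the graph at the 50th, 25th and 75th percentiles; the sketch does not attempt that read-off.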
Exercise 10C — The standard deviation
1 a
Scores Scores − Mean Square difference
3 −1.625 2.640625
5 0.375 0.140625
8 3.375 11.390625
2 −2.625 6.890625
7 2.375 5.640625
1 −3.625 13.140625
6 1.375 1.890625
5 0.375 0.140625
TOTAL 37 41.875
Variance = 41.875 ÷ 8 = 5.234 375
Standard deviation = √variance ≈ 2.288
b
Scores  Scores − Mean  Square difference
11 0.571428571 0.326530612
8 −2.428571429 5.897959184
7 −3.428571429 11.75510204
12 1.571428571 2.469387755
10 −0.428571429 0.183673469
11 0.571428571 0.326530612
14 3.571428571 12.75510204
TOTAL 73 33.71428571
Variance = 33.71428571 ÷ 7 = 4.816 327
Standard deviation = √variance ≈ 2.195
c
Scores  Scores − Mean  Square difference
25 −11.375 129.390625
15 −21.375 456.890625
78 41.625 1732.640625
35 −1.375 1.890625
56 19.625 385.140625
41 4.625 21.390625
17 −19.375 375.390625
24 −12.375 153.140625
TOTAL 291 3255.875
Variance = 3255.875 ÷ 8 = 406.984 375
Standard deviation = √variance ≈ 20.17
d
Scores  Scores − Mean  Square difference
5.2 −1.28 1.6384
4.7 −1.78 3.1684
5.1 −1.38 1.9044
12.6 6.12 37.4544
4.8 −1.68 2.8224
TOTAL 32.4 46.988
Variance = 46.988 ÷ 5 = 9.3976
Standard deviation = √variance ≈ 3.066
2 As for question 1, but completed using the calculator's inbuilt functions.
3 a Group A
Mean = (160 + 170 × 5 + 180) ÷ 7
= 1190 ÷ 7
= 170 cm
Median = 170 cm
Highest frequency is 5
Mode = 170 cm
Group B
Mean = (110 + 160 + 170 × 3 + 180 + 230) ÷ 7
= 1190 ÷ 7
= 170 cm
Median = 170 cm
Highest frequency is 3
Mode = 170 cm
All measures of central tendency (mean, median, mode) are the same for both groups.
b Mean, median and mode are misleading on similarity between groups.
c Group B
d Group B
e Group B
f Group A
Range = 180 − 160 = 20 cm
IQR = 170 − 170 = 0
Scores  Scores − Mean  Square difference
160 −10 100
170 0 0
170 0 0
170 0 0
170 0 0
170 0 0
180 10 100
TOTAL 1190 200
Variance = 200 ÷ 7 = 28.57
Standard deviation = √variance ≈ 5.345
Group B
Range = 230 − 110 = 120 cm
IQR = 180 − 160 = 20 cm
Scores  Scores − Mean  Square difference
160 −10 100
170 0 0
170 0 0
110 −60 3600
230 60 3600
170 0 0
180 10 100
TOTAL 1190 7400
Variance = 7400 ÷ 7 = 1057.143
Standard deviation = √variance ≈ 32.51
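The Group A and Group B calculations in question 3 divide the sum of squared differences by n, which is the population variance. A short Python sketch using the standard `statistics` functions `pvariance` and `pstdev` reproduces them; the list names are illustrative.

```python
from statistics import mean, pstdev, pvariance

group_a = [160, 170, 170, 170, 170, 170, 180]
group_b = [110, 160, 170, 170, 170, 180, 230]

for name, heights in (("Group A", group_a), ("Group B", group_b)):
    # pvariance and pstdev divide by n, matching the tables above.
    print(name,
          "mean:", mean(heights),
          "variance:", round(pvariance(heights), 3),
          "standard deviation:", round(pstdev(heights), 3))
```

Both groups share a mean of 170 cm, yet the standard deviations differ by a factor of about six, which is the point of parts b to f.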
4 a Range = 140 − 80 = 60
b and c
Scores  Midpoint  Frequency  Frequency × Midpoint  Scores − Mean  Square difference  Square difference × Frequency
80 − 85 1 85 −25.48387097 649.4276795 649.4277
90 − 95 4 380 −15.48387097 239.7502601 959.001
100 − 105 11 1155 −5.483870968 30.07284079 330.8012
110 − 115 9 1035 4.516129032 20.39542144 183.5588
120 − 125 4 500 14.51612903 210.7180021 842.872
130 − 135 2 270 24.51612903 601.0405827 1202.081
TOTAL 31 3425 1751.404787 4167.742
Mean = 3425 ÷ 31 = 110.48
Variance = 4167.742 ÷ 31 = 134.4433
Standard deviation = √variance ≈ 11.59
5
Scores  Midpoint  Frequency  Frequency × Midpoint  Scores − Mean  Square difference  Square difference × Frequency
10 − 15 1 15 −19.04761905 362.8117914 362.8118
20 − 25 6 150 −9.047619048 81.85941043 491.1565
30 − 35 9 315 0.952380952 0.907029478 8.163265
40 − 45 4 180 10.95238095 119.9546485 479.8186
50 − 55 1 55 20.95238095 439.0022676 439.0023
TOTAL 21 715 1004.535147 1780.952
Variance = 1780.952 ÷ 21 = 84.807
Standard deviation = √variance = 9.209
The answer is C
6 a Range = 700 − 200 = 500 hours
b
Scores  Midpoint  Frequency  Frequency × Midpoint  Scores − Mean  Square difference  Square difference × Frequency
200 − 225 2 450 −230.2857143 53 031.5102 106 063
250 − 275 5 1375 −180.2857143 32 502.93878 162 514.7
300 − 325 12 3900 −130.2857143 16 974.36735 203 692.4
350 − 375 25 9375 −80.28571429 6445.795918 161 144.9
400 − 425 42 17 850 −30.28571429 917.2244898 38 253.43
450 − 475 38 18 050 19.71428571 388.6530612 14 768.82
500 − 525 26 13 650 69.71428571 4860.081633 126 362.1
550 − 575 15 8625 119.7142857 14 331.5102 214 972.7
600 − 625 7 4375 169.7142857 28 802.93878 201 620.6
650 − 675 3 2025 219.7142857 48 274.36735 144 823.1
TOTAL 175 79 675 206 529.3878 1 374 486
Mean = 79 675 ÷ 175 = 455.28571 = 455.29 hours
Variance = 1 374 486 ÷ 175 = 7854.21
Standard deviation = √variance = 88.62
c Mean = 455.29 hours
Standard deviation = √(1 374 486 ÷ 174) = 88.88
7 a
Maximum temperature  Number of days  cf
0 − 4 4
5 − 22 26
10 − 95 121
15 − 124 245
20 − 94 339
25 − 19 358
30 − 5 363
35 − 2 365
b
c Reading off graph:
Q3 is the 75th percentile, Q3 = 21
Q1 is the 25th percentile, Q1 = 13.5
Interquartile range = Q3 − Q1 = 21 − 13.5 = 7.5
d Reading off graph:
Median is the 50th percentile, Median ≈ 17
Scores  Midpoint  Frequency  Frequency × Midpoint  Scores − Mean  Square difference  Square difference × Frequency
0 − 2.5 4 10 − 15.05479452 226.6468381 906.5874
5 − 7.5 22 165 − 10.05479452 101.0988929 2224.176
10 − 12.5 95 1187.5 − 5.054794521 25.55094764 2427.34
15 − 17.5 124 2170 − 0.054794521 0.003002439 0.372302
20 − 22.5 94 2115 4.945205479 24.45505723 2298.775
25 − 27.5 19 522.5 9.945205479 98.90711203 1879.235
30 − 32.5 5 162.5 14.94520548 223.3591668 1116.796
35 − 37.5 2 75 19.94520548 397.8112216 795.6224
TOTAL 365 6407.5 1097.832239 11 648.9
e Mean = 6407.5 ÷ 365 ≈ 17.55
f Variance = 11 648.9 ÷ 365 = 31.9148
Standard deviation = √variance = 5.649
g Range = 40 − 0 = 40°C
8 a Number of fruit Frequency cf
30 − 4 4
40 − 9 13
50 − 9 22
60 − 3 25
70 − 4 29
80 − 1 30
b
c Reading off graph:
Median is the 50th percentile, Median = 52
d Q1 is the 25th percentile, Q1 = 45
Q3 is the 75th percentile, Q3 = 60
IQR = 60 − 45 = 15
Scores  Midpoint  Frequency  Frequency × Midpoint  Scores − Mean  Square difference  Square difference × Frequency
30 − 35 4 140 − 19 361 1444
40 − 45 9 405 − 9 81 729
50 − 55 9 495 1 1 9
60 − 65 3 195 11 121 363
70 − 75 4 300 21 441 1764
80 − 85 1 85 31 961 961
TOTAL 30 1620 1966 5270
e Mean = 1620 ÷ 30 = 54
f Variance = 5270 ÷ 30 = 175.67
Standard deviation = √variance = 13.25
g s = √(5270 ÷ (30 − 1)) ≈ 13.48
h Range = 81 − 36 = 45
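Parts e to g of question 8 work from class midpoints and frequencies, dividing by n for the standard deviation in f and by n − 1 for s in g. The following Python sketch reproduces both from the table; the variable names are illustrative only.

```python
from math import sqrt

# Midpoints and frequencies from the question 8 table.
midpoints = [35, 45, 55, 65, 75, 85]
frequencies = [4, 9, 9, 3, 4, 1]

n = sum(frequencies)
grouped_mean = sum(f * m for f, m in zip(frequencies, midpoints)) / n

# Sum of (midpoint − mean)² weighted by frequency: the last column of the table.
sum_sq = sum(f * (m - grouped_mean) ** 2 for f, m in zip(frequencies, midpoints))

population_sd = sqrt(sum_sq / n)    # part f: divide by n
sample_sd = sqrt(sum_sq / (n - 1))  # part g: divide by n − 1

print(grouped_mean, sum_sq, round(population_sd, 2), round(sample_sd, 2))
```

Dividing by n − 1 always gives a slightly larger value than dividing by n; the gap narrows as n grows.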
9 a Electric Mate has the longest life as it has a greater value for the mean.
b Electric Mate mean = 197 hours. Hot Wire mean = 185 hours.
c Electric Mate has the greatest standard deviation and therefore the greatest spread.
d The standard deviation shows that you could be lucky to get a toaster with an element that will last, but there is also a chance that you will get an element that will not last. Therefore, overall inconsistent performance by Electric Mate.
e Check with teacher. See example of answer below:
i Hot Wire had less spread of results and a mean of 185 hours. So you can be more certain of your purchase.
ii There is an element of risk due to the inconsistency of elements with Electric Mate — higher mean but greater spread of results.
10 Crinkle would be the best buy because the spread of packet weight is closer to the mean than Crunch.
Exercise 10D — Boxplots
1
a Median = ((5 + 1) ÷ 2)th score = 3rd score
Median = 13
b Q1 = 11, Q3 = 16
IQR = 16 − 11 = 5
c Range = 32 − 6 = 26
2
a Median = ((5 + 1) ÷ 2)th score = 3rd score
Median = 122
b Q1 = 119, Q3 = 125
IQR = 125 − 119 = 6
c Range = 128 − 101 = 27
3
a Median = ((5 + 1) ÷ 2)th score = 3rd score
Median = 49.0
b Q1 = 46.5, Q3 = 52.3
IQR = 52.3 − 46.5 = 5.8
c Range = 57.8 − 39.2 = 18.6
4 Reading off the boxplot:
a Greatest score = 140 points
b Least points score = 56 points
c Median points score = 90 points
d Range = 140 − 56 = 84 points
e Q1 = 76, Q3 = 102
IQR = 102 − 76 = 26 points
5 Reading off the boxplot:
a Largest number of honey bears = 58
b Smallest number of honey bears = 31
c Median number of honey bears = 43
d Range = 58 − 31 = 27
e Q1 = 39, Q3 = 46
IQR = 46 − 39 = 7
Questions 6 to 8: reading off the boxplot.
6 Median = 23
The answer is C
7 Q1 = 20, Q3 = 25
IQR = 25 − 20 = 5
The answer is C
8 "Most of the data are contained between the scores of 5 and 20": No, only the lower 25% of the scores are contained between 5 and 20.
The answer is D
9 a
Median = ((10 + 1) ÷ 2)th score = 5.5th score
Median = 35
Q1 = ((5 + 1) ÷ 2)th score = 3rd score
Q1 = 28, Q3 = 43
min = 22, max = 48
Five-number summary: 22, 28, 35, 43, 48
b
10 a Ordered data: 10 11 12 15 21 22 22 23 30 37 39 45
Median = ((12 + 1) ÷ 2)th score = 6.5th score
Median = 22
Q1 = ((6 + 1) ÷ 2)th score = 3.5th score
Q1 = (12 + 15) ÷ 2 = 13.5
Q3 = (30 + 37) ÷ 2 = 33.5
max = 45, min = 10
Five-number summary: 10, 13.5, 22, 33.5, 45
b
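A five-number summary such as the one in question 10 a can be assembled in Python from the minimum, the medians of the lower and upper halves, the overall median and the maximum. The function name `five_number_summary` is hypothetical, and the halves method for quartiles is assumed because it matches the working shown; plotting libraries may compute quartiles differently.

```python
from statistics import median


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using medians of the two halves."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # With an odd count, the middle score belongs to neither half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return ordered[0], median(lower), median(ordered), median(upper), ordered[-1]


q10 = [10, 11, 12, 15, 21, 22, 22, 23, 30, 37, 39, 45]
print(five_number_summary(q10))
```

These five values are exactly what a boxplot draws: the whisker ends, the box edges and the line inside the box.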
11 a From stem plot:
Median = ((25 + 1) ÷ 2)th score = 13th score
Median = 26 yrs
Q1 = ((12 + 1) ÷ 2)th score = 6.5th score
Q1 = 20
Q3 = (42 + 45) ÷ 2 = 43.5
min = 18, max = 74
Five-number summary: 18, 20, 26, 43.5, 74
b
c The data is positively skewed, and includes one extremely high value.
12 a From stem-and-leaf plot
Median = ((30 + 1) ÷ 2)th score = 15.5th score
Median = (347 000 + 349 000) ÷ 2 = 348 000
Q1 = ((15 + 1) ÷ 2)th score = 8th score
Q1 = 335 000, Q3 = 357 000
min = 324 000, max = 375 000
Five-number summary: 324 000, 335 000, 348 000, 357 000, 375 000
b
13 a Key: 12|1 = 121 Stem Leaf
12 1 5 6 9
13 1 2 4
14 3 4 8 8
15 0 2 2 2 5 7
16 3 5
17 2 9
18 1 1 1 2 3 7 8
b Median = ((28 + 1) ÷ 2)th score = 14.5th score
Median = 152
Q1 = ((14 + 1) ÷ 2)th score = 7.5th score
Q1 = (134 + 143) ÷ 2 = 138.5
Q3 = (179 + 181) ÷ 2 = 180
min = 121, max = 188
14 a Key: 1*|7 = 17 years Stem Leaf
1* 7 7 8 8 8 9 9
2 0 0 0 1 2 2 2 2 3 3 3 3 4 4 4
2* 5 5 8 9
3 1 2 3
3*
4
4* 8
b Median = ((30 + 1) ÷ 2)th score = 15.5th score
Median = (22 + 23) ÷ 2 = 22.5
Q1 = ((15 + 1) ÷ 2)th score = 8th score
Q1 = 20, Q3 = 25
Min = 17, Max (without extreme value) = 33 (Extreme value = 48.)
c Data is positively skewed with one high extreme value. Most mothers have their first baby between the ages of 20 and 25.
15 Small interquartile range with long whiskers. Therefore, clustered data.
The answer is C
Exercise 10E — Back-to-back stem plots
1 German 20 38 45 21 30 39 41 22 27 33 30 21 25 32 37 42 26 31 25 37
French 23 25 36 46 44 39 38 24 25 42 38 34 28 31 44 30 35 48 43 34
Highest − 48 Lowest − 20 Use a stem of 1, divided into halves Key: 2|3 = 23
German French
2 1 1 0 2 3 4
7 6 5 5 2* 5 5 8
3 2 1 0 0 3 0 1 4 4
9 8 7 7 3* 5 6 8 8 9
2 1 4 2 3 4 4
5 4* 6 8
2 Boys 3.4 5.0 4.2 3.7 4.9 3.4 3.8 4.8 3.6 4.3
Girls 3.0 2.7 3.7 3.3 4.0 3.1 2.6 3.2 3.6 3.1
Highest − 5.0 kg Lowest − 2.6 kg Use a stem of 1, divided into halves
Key: 2|7 = 2.7kg
Boys Girls
2* 6 7
4 4 3 0 1 1 2 3
8 7 6 3* 6 7
3 2 4 0
9 8 4*
0 5
3 A 11 15 20 25 12 16 21 27 16 17 17 22 23 24 B 10 15 20 25 30 35 16 31 32 21 23 26 28 29
Highest − 35 Lowest − 10 Use a stem of 1, divided into halves a Key: 2|3 = 23 trucks
A B
2 1 1 0
7 7 6 6 5 1* 5 6
4 3 2 1 0 2 0 1 3
7 5 2* 5 6 8 9
3 0 1 2
3* 5
b Statistical analysis
Supermarket A Supermarket B
Mean = 19 Mean = 24.4
sx = 4.9 sx = 7.2
Min x = 11 Min x = 10
Q1 = 16 Q1 = 20
Median = 18.5 Median = 25.5
Q3 = 23 Q3 = 30
Max x = 27 Max x = 35
IQR = 7 IQR = 10 For supermarket A the mean is 19, the median is 18.5, the standard deviation is 4.9 and the interquartile range is 7. The distribution is symmetric. For supermarket B the mean is 24.4, the median is 25.5, the standard deviation is 7.2 and the interquartile range is 10. The distribution is symmetric. The centre and spread of the distribution of supermarket B is higher than that of supermarket A. There is greater variation in the number of trucks arriving at supermarket B.
4 Females 12 13 14 14 15 15 16 17
Males 10 12 13 14 14 15 17 19
Highest − 19 Lowest − 10 Use a stem of 1, divided into fifths
a Key: 1|7 = 17 marks Females Males
1 0
3 2 1 2 3
5 5 4 4 1 4 4 5
7 6 1 7
1 9
Statistical Analysis Females Males
Mean = 14.5 Mean = 14.25 sx = 1.6 sx = 2.8 Min x = 12 Min x = 10
Q1 = 13.5 Q1 = 12.5 Median = 14.5 Median = 14 Q3 = 15.5 Q3 = 16 Max x = 17 Max x = 19 IQR = 2 IQR = 3.5
b For the marks of the females, the mean is 14.5, the median is 14.5, the standard deviation is 1.6 and the interquartile range is 2. The distribution is symmetric. For the marks of the males, the mean is 14.25, the median is 14, the standard deviation is 2.8 and the interquartile range is 3.5. The distribution is symmetric.
The centre of each distribution is about the same. The spread of marks for the boys is greater, however. This means that there is a wider variation in the abilities of the boys compared to the abilities of the girls.
5
First 30 31 35 37 39 41 41 42 43 46
Second 22 26 27 28 30 31 31 33 34 36
Highest − 46 Lowest − 22 Use stems of 1, divided into halves a Key: 2|6 = 26 marks
Leaf Stem Leaf
First year Second year
2 2
2* 6 7 8
1 0 3 0 1 1 3 4
9 7 5 3* 6
3 2 1 1 4
6 4*
Statistical Analysis First year Second year
Mean = 38.5 Mean = 29.8
sx = 5.2 sx = 4.16
Min x = 30 Min x = 22
Q1 = 35 Q1 = 27
Median = 40 Median = 30.5
Q3 = 42 Q3 = 33
Max x = 46 Max x = 36
IQR = 7 IQR = 6
b The distribution of marks for the first year and for the second year are each symmetric.
For the first year marks, the mean is 38.5, the median is 40, the standard deviation is 5.2 and the interquartile range is 7. The distribution is symmetric.
For the second year marks, the mean is 29.8, the median is 30.5, the standard deviation is 4.2 and the interquartile range is 6.
The spread of each of the distributions is much the same but the centre of each distribution is quite different with the centre of the second year distribution quite a lot lower. The work may have become a lot harder!
6
Female 23 24 25 26 27 28 30 31
Male 22 25 30 31 36 37 42 46
Highest − 46 Lowest − 22 Use stems of 1, divided into halves a Key: 2|5 = 25 years
Female Male
4 3 2 2
8 7 6 5 2* 5
1 0 3 0 1
3* 6 7
4 2
4* 6
Statistical Analysis
Female Male
Mean = 26.75 Mean = 33.63
sx = 2.82 sx = 8.19
Min x = 23 Min x = 22
Q1 = 24.5 Q1 = 27.5
Median = 26.5 Median = 33.5
Q3 = 29 Q3 = 39.5
Max x = 31 Max x = 46
IQR = 4.5 IQR = 12
b For the distribution of the females, the mean is 26.75, the median is 26.5, the standard deviation is 2.8 and the interquartile range is 4.5.
For the distribution of the males, the mean is 33.6, the median is 33.5, the standard deviation is 8.2 and the interquartile range is 12.
The centre of the distributions is very different: it is much higher for the males. The spread of the ages of the females who attend the fitness class is very small but very large for males. Older males are more likely to attend fitness classes than females.
7 Kindergarten 3 13 14 25 28 32 36 41 47 50
Prep. School 5 12 17 25 27 32 35 44 46 52
Highest − 52 Lowest − 3 Use stems of 1
a Key: 3|2 = 32 Leaf Stem Leaf
Kindergarten Prep.
3 0 5
4 3 1 2 7
8 5 2 5 7
6 2 3 2 5
7 1 4 4 6
0 5 2
Statistical Analysis Kindergarten Prep. school
Mean = 28.9 Mean = 29.5
sx = 15.4 sx = 15.3
Min x = 3 Min x = 5
Q1 = 14 Q1 = 17
Median = 30 Median = 29.5
Q3 = 41 Q3 = 44
Max x = 50 Max x = 52
IQR = 27 IQR = 27 b For the distribution of scores of the kindergarten
children, the mean is 28.9, the median is 30, the standard deviation is 15.4 and the interquartile range is 27.
For the distribution of scores for the prep. children, the mean is 29.5, the median is 29.5, the standard deviation is 15.3 and the interquartile range is 27. The distributions are very similar. There is not a lot of difference between the way the kindergarten children and the prep. children scored.
8 The pair of variables that could be displayed on a back-to-back stem plot is: the time put into completing an assignment and a pass or fail score on the assignment.
The answer is B
Why: “bivariate data, involving a numerical variable and a categorical variable with 2 categories.”
9 A back-to-back stem plot is a useful way of displaying the relationship between the age and attitude to gambling (for or against).
The answer is C
Why: age is numerical; attitude is categorical with 2 categories (for or against).
Exercise 10F — Parallel boxplots
1 Statistical analysis for boxplots
9A 10A 11A
Min x = 120 Min x = 140 Min x = 151
Q1 = 140 Q1 = 149 Q1 = 160
Median = 153 Median = 163 Median = 167
Q3 = 160 Q3 = 170 Q3 = 180
Max x = 170 Max x = 180 Max x = 199
a
b Clearly, the median height increases from Year 9 to
Year 11. There is greater variation in 9A’s distribution than in 10A’s. There is a wide range of heights in the lower 25% of the distribution of 9A’s distribution. There is a greater variation in 11A’s distribution than in 10A’s, with a wide range of heights in the top 25% of the 11A distribution.
2 Statistical analysis for boxplots
        20 − 29  30 − 39  40 − 49
Min x 2000 4000 10 000
Q1 5000 6000 12 000
Median 6350 6900 13 600
Q3 7000 9000 14 000
Max x 10 000 12 000 15 000
a
b Clearly, there is a great jump in contributions to
superannuation for people in their 40s. The spread of contributions for that age group is smaller than for people in their 20s or 30s, suggesting that a high proportion of people in their 40s are conscious of superannuation. For people in their 20s and 30s, the range is greater, indicating a range of interest in contributing to superannuation.
3 Statistical analysis for boxplots
        Vitamin A  Vitamin B  Vitamin C  Multi
Min x 5 10 8 12
Q1 7 11 9 13
Median 8 14.5 9.5 16
Q3 11 15 12 19
Max x 14 19 13 20
a
b Overall, the biggest sales were of multi-vitamins,
followed by vitamin B, then C and finally vitamin A. 4
a Company C has the greatest range of ages: 60 − 20 = 40.
The answer is C
b Company B has the greatest interquartile range: 55 − 35 = 20.
The answer is B
c Company E has the lowest median age, Median = 34.
The answer is E
d Company C has the greatest range of ages amongst their oldest 25% of employees.
The answer is C
Chapter review
1 a x̄ = 52 ÷ 10 = 5.2
b x̄ = 519 ÷ 8 = 64.875
c x̄ = 53.9 ÷ 7 = 7.7
d x̄ = 179 ÷ 5 = 35.8
2
Class  Class centre (x)  Frequency (f)
21-24 22.5 3
25-28 26.5 9
29-32 30.5 17
33-36 34.5 31
37-40 38.5 29
41-44 42.5 25
45-48 46.5 19
49-52 50.5 10
Σf = 143
Mean = Σ(x × f) ÷ Σf
= 5461.5 ÷ 143
= 38.2
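The grouped mean in question 2 uses each class centre as a stand-in for every score in that class. A brief Python sketch, assuming the class limits and frequencies from the table, derives the class centres as the average of the two limits and then forms Σ(x × f) ÷ Σf.

```python
# (lower limit, upper limit, frequency) for each class in question 2.
classes = [
    (21, 24, 3), (25, 28, 9), (29, 32, 17), (33, 36, 31),
    (37, 40, 29), (41, 44, 25), (45, 48, 19), (49, 52, 10),
]

# Class centre = average of the lower and upper class limits.
centres = [(low + high) / 2 for low, high, _ in classes]
freqs = [f for _, _, f in classes]

total_f = sum(freqs)                                  # Σf
total_xf = sum(x * f for x, f in zip(centres, freqs)) # Σ(x × f)
grouped_mean = total_xf / total_f

print(centres)
print(total_f, total_xf, round(grouped_mean, 1))
```

Because individual scores are replaced by class centres, the result is an estimate of the mean rather than the exact value.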
3 a x̄ = 311 ÷ 10 = 31.1
b x̄ = 209 ÷ 9 = 23.2
c x̄ = 4.9 ÷ 11 = 0.445
4 a x̄ = Σ(x × f) ÷ Σf
= 6300 ÷ 211
= 29.9
b x̄ = Σ(x × f) ÷ Σf
= 12 801 ÷ 484
= 26.4
c x̄ = Σ(x × f) ÷ Σf
= 3206 ÷ 172
= 18.6
5 a Middle value is the ((9 + 1) ÷ 2)th score, that is, the fifth score.
The fifth score is 27.
b Ordered data: 4, 4, 4, 5, 5, 6, 7, 8, 8, 8, 10
There are 11 data values, so the middle value is the sixth value.
Median is 6
c Ordered data: 3.0, 3.1, 3.2, 3.2, 3.2, 3.2, 3.5, 3.6
Middle value is between the fourth and fifth values.
Median is 3.2
d Ordered data: 2, 3, 4, 4, 5, 6, 7, 7, 7, 8
Middle value is between the fifth and sixth values.
Median = (5 + 6) ÷ 2 = 5.5
e Ordered data: 101, 108, 111, 121, 135, 147, 154, 165
Middle value is between the fourth and fifth values.
Median = (121 + 135) ÷ 2 = 128
6 a
Score Frequency Cumulative frequency
0 2 2
1 6 8
2 11 19
3 7 26
4 6 32
5 3 35
Median is the ((35 + 1) ÷ 2)th score, that is, the 18th score.
Median = 2
b
Score Frequency Cumulative frequency
54 2 2
55 5 7
56 14 21
57 11 32
58 6 38
59 1 39
60 1 40
Median is the ((40 + 1) ÷ 2)th score, that is, the 20.5th score.
Median = 56
c
Score Frequency Cumulative frequency
66 8 8
67 10 18
68 12 30
69 14 44
70 7 51
71 5 56
72 4 60
Median is the ((60 + 1) ÷ 2)th score, that is, the 30.5th score, halfway between 68 and 69.
Median = 68.5
7 Highest frequency is 52. The modal class is 46-49.
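For the frequency tables in question 6, the median can be located by finding where the median position falls in the cumulative frequency column. The Python sketch below does this with `itertools.accumulate` and `bisect.bisect_left`; the function name `median_from_table` is illustrative, and the data are the tables from 6 a and 6 c.

```python
from bisect import bisect_left
from itertools import accumulate


def median_from_table(scores, freqs):
    """Median of ungrouped data given as a score/frequency table."""
    cumulative = list(accumulate(freqs))
    n = cumulative[-1]

    def score_at(k):
        # The k-th ordered score lies in the first row whose
        # cumulative frequency reaches k.
        return scores[bisect_left(cumulative, k)]

    if n % 2:
        return score_at((n + 1) // 2)
    return (score_at(n // 2) + score_at(n // 2 + 1)) / 2


q6a = ([0, 1, 2, 3, 4, 5], [2, 6, 11, 7, 6, 3])
q6c = ([66, 67, 68, 69, 70, 71, 72], [8, 10, 12, 14, 7, 5, 4])

print(median_from_table(*q6a))  # 18th of 35 scores
print(median_from_table(*q6c))  # halfway between the 30th and 31st of 60 scores
```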
8 a x̄ = 278 ÷ 10 = 27.8
b Ordered data: 18, 19, 21, 22, 24, 25, 26, 28, 28, 67
Median = 24.5
c The score 28 occurs twice, so the mode is 28.
d The median is the best statistic as the mean is affected by the outlier value (67).
9 Answers will vary, but the following summary may be useful.
a Mean: The mean is appropriate when no extreme values (outliers) distort the average.
b Median: The median is appropriate when outliers are present.
c Mode: The mode is appropriate when the most common result is significant.
10 a Range = 29 − 22 = 7
b Range = 159 − 0 = 159
c Range = 1.9 − 0.5 = 1.4
11 a Ordered data: 22, 24, 24, 25, 25, 26, 27, 28, 29
i Median = 25
ii Q1 = 24
iii Q3 = 27.5
iv IQR = 27.5 − 24 = 3.5
b Ordered data: 0, 2, 43, 45, 56, 69, 72, 84, 118, 159
i Median = (56 + 69) ÷ 2 = 62.5
ii Q1 = 43
iii Q3 = 84
iv IQR = 84 − 43 = 41
c Ordered data: 0.5, 0.7, 0.8, 1.1, 1.4, 1.5, 1.9
i Median = 1.1
ii Q1 = 0.7
iii Q3 = 1.5
iv IQR = 1.5 − 0.7 = 0.8
12 a Add an axis containing percentiles to the graph. The median is the 50th percentile.
Median = 1.6
b Q1 = 25th percentile = 1
Q3 = 75th percentile = 2.5
c IQR = Q3 − Q1 = 2.5 − 1 = 1.5
13 a Median = 61
b Q1 = 54, Q3 = 72
c IQR = Q3 − Q1 = 72 − 54 = 18
14
Scores Scores − mean Square difference
1 −3.9 15.21
2 −2.9 8.41
2 −2.9 8.41
3 −1.9 3.61
5 0.1 0.01
6 1.1 1.21
6 1.1 1.21
7 2.1 4.41
8 3.1 9.61
9 4.1 16.81
49 68.9
a Mean = 49 ÷ 10 = 4.9
b The median = ((10 + 1) ÷ 2)th score = 5.5th score
Median = (5 + 6) ÷ 2 = 5.5
c The scores 2 and 6 both occur twice.
Mode = 2 and 6
d Range = 9 − 1 = 8
e Q1 = 2 and Q3 = 7
IQR = Q3 − Q1 = 7 − 2 = 5
f Variance = 68.9 ÷ 10 = 6.89
Standard deviation = √variance = 2.625
15
Scores  Midpoint  Frequency  Frequency × midpoint  Scores − mean  Square difference  Square difference × frequency
30 − 39 34.5 3 103.5 −30.78 947.4084 2842.2252
40 − 49 44.5 6 267 −20.78 431.8084 2590.8504
50 − 59 54.5 12 654 −10.78 116.2084 1394.5008
60 − 69 64.5 15 967.5 −0.78 0.6084 9.126
70 − 79 74.5 18 1341 9.22 85.0084 1530.1512
80 − 89 84.5 10 845 19.22 369.4084 3694.084
TOTAL 64 4178 12060.9376
a Mean = 4178 ÷ 64 = 65.281 25
b Median = ((64 + 1) ÷ 2)th score = 32.5th score
Median group = 60 − 69
c Highest frequency = 18
Modal group = 70 − 79
d Range = 89 − 30 = 59
e Variance = 12060.9376 ÷ 64 = 188.45215
Standard deviation = √variance = 13.728
16 a Reading from the box-and-whisker plot, Median = 43
b Range = 55 − 12 = 43
c IQR = Q3 − Q1 = 46 − 32 = 14
17 Lowest value is 1 and highest value is 18. Q1 = 8, Q2 = 14 and Q3 = 16
18 a Highest − 15 Lowest − 1 Use stems of 1, divided into fifths. Key: 1|3 = 13 hours
Full-time Volunteer
1 0
2 2 0
4 4 3 3 0
6 5 0
0 8
1 0 1 1
1 2 3 3
1 4 5
1
1 b Both distributions are symmetric with the same spread. The centre of the volunteers’ distribution is much higher than that of
the full-time firefighters’ distribution. Clearly, the volunteers needed more counselling.
19 Statistical analysis for boxplots
        Team A  Team B  Team C
Min x 98 95 114
Q1 103.5 101 115.5
Median 110.5 105 120.5
Q3 125 109.5 126.5
Max x 140 120 145
Modelling and problem solving
1 Office Workers
a Office workers distribution is negatively skewed with one outlier (76).
b Min x = 76, Max x = 134
Range = 134 − 76 = 58 beats/min
Median = ((16 + 1) ÷ 2)th position = 8.5th position
= 121.5 beats/min
Quartiles = ((8 + 1) ÷ 2)th position = 4.5th position
Q1 = 108.5, Q3 = 128
IQR = 128 − 108.5 = 19.5 beats/min
Mode = 130 beats/min
c
d Mean = 1869 ÷ 16
x̄ = 116.8125 beats/min
sx = 15.3 beats/min
Sports Instructors
Stem Leaf
6 2 4 8 8 9
7 2 2 3 5 7 9
8 2 8
9 6
10 8
Sports Instructors distribution positively skewed with one outlier (108).
Min x = 62, Max x = 108 Range = 108 − 62 = 46
Median = 73 beats/min Q3 = 82, Q1 = 68 IQR = 82 − 68 = 14 beats/min Mode = 68, 72 beats/min
Mean = 1153 ÷ 15
x̄ = 76.87 beats/min
sx = 12.43 beats/min
e Office workers: Pulse rates are generally very high,
clustered around 120 − 130 beats/min. Also, there is one person whose rate was much lower than the rest. This outlier (76) produces a large range and makes the mean slightly lower than the median. As a result the median is a more appropriate measure of the centre of the data rather than the mean. Sports instructors: Pulse rates are generally low, clustered around 60 − 70 beats/min, although there are a few people with rates much higher, which makes the mean slightly higher than the median and also produces quite a large range. As a result of the skewed distribution the median is the more appropriate measure of the centre of the data rather than the mean, although there is little difference between these values.
2 a Key: 1|3 = 13 Text B Text A
Leaf Stem Leaf
1 3
2 1 2 5 8
9 7 3 5 5
8 2 4 4 8
9 9 8 8 5 5 3 2 5 2 5 6 9
8 8 8 6 5 4 4 3 2 6 4 9
9 9 5 2 2 7 2 2 6
2 1 8 1 3 4 4
9 4 5 8
b Text A n = 25 Min Q1 Median Q3 Max
13 35 59 82 98
Text B n = 28 Min Q1 Median Q3 Max
37 55 63.5 70 82
c
d Performance on test: students who used Text A had varied results, while the students who used Text B were more consistent and had a higher median score.
Text A: range = 85, IQR = 47, Median = 59
Text B: range = 45, IQR = 15, Median = 63.5
e and f Students' individual differences (that is, studying, doing the work, ability) can affect the results. However, you would probably go with the text that produced more consistent results. Therefore, Text B.
3 a Key: 1|6 = 16
Leaf Stem Leaf
Jazz Symphony
9 9 8 6 1
9 7 4 3 0 2 0 3
9 8 4 3 0 3 0 5 9
6 5 3 2 0 4 2 5 5 7 8 8 9 9
5 0 3 4 6 8 8
2 6 0
b
        Symphony  Jazz
i Min x 20 16
ii Q1 40.5 21.5
iii Median 48 31.5
iv Q3 53.5 41
v Max x 60 62
vi Mean 45.45 32.35
vii IQR 53.5 − 40.5 = 13  41 − 21.5 = 19.5
viii Standard deviation 11.20 12.04
c All values (except for the maximum) in the five-number summary are higher for Symphony goers than for Jazz. The median for Symphony (48) is much higher than for Jazz (31.5); the mean is also much larger for Symphony (45.45 compared to 32.35). So the centre is much higher for Symphony goers. The spread is higher for the Jazz goers: the IQR is 19.5 (as opposed to 13 for Symphony) and the standard deviation is also slightly higher for Jazz. Overall it appears that people who went to the Symphony concert were older than those who went to Jazz. The spread of ages was nearly the same, but slightly higher for Jazz goers.
d Back-to-back stem plots can only be used for numerical data with 2 categories. To compare distributions for 3 different events (i.e. 3 categories) we need to use boxplots.
e Opera
Min x 12
Q1 34
Median 47.5
Q3 56.5
Max x 68
Mean 44.95
IQR 56.5 − 34 = 22.5
sx 15.55
f
g The distributions for Symphony and Opera are similar: both are skewed to the left and have approximately the same centre (means: 45.45 and 44.95; medians: 48 and 47.5 respectively), but Opera has a much bigger spread. For Jazz, the distribution is positively skewed; the centre is much lower than for both Symphony and Opera, while the spread is in the middle (slightly bigger than for Symphony, but smaller than for Opera). Overall it appears that people who attend Symphony concerts and Opera tend to be approximately the same age, and older than those who attend Jazz. The most spread out distribution is for the Opera, followed by Jazz and then by the Symphony goers. So the widest range of ages is among the Opera goers.