AshleyBriones Module3 Homework


Upload: milove4u

Post on 25-Dec-2015


Homework: Central Tendency and Dispersion

1. For the following data, calculate the mode, median, mean, and range: {7, 8, 16, 15, 16, 6, 7, 9}. Please show your work. (4 points)

Mode(s): 7 and 16 (each occurs twice)

Median: sorted data are 6, 7, 7, 8, 9, 15, 16, 16; the two middle values give (8+9)/2 = 17/2 = 8.5

Mean: (7+8+16+15+16+6+7+9)/8 = 84/8 = 10.5

Range: 16-6 = 10

2. For the following data, calculate the mode, median, mean, and range: {2, 4, 2, 9, 7, 5, 2, 1, 3, 6}. Please show your work. (4 points)

Mode: 2 (occurs three times)

Median: sorted data are 1, 2, 2, 2, 3, 4, 5, 6, 7, 9; the two middle values give (3+4)/2 = 7/2 = 3.5

Mean: (1+2+2+2+3+4+5+6+7+9)/10 = 41/10 = 4.1

Range: 9-1 = 8
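
The hand calculations for questions 1 and 2 can be double-checked with a short script. This is a minimal sketch using Python's standard `statistics` module; the helper name `summarize` and the variable names are illustrative and not part of the assignment.

```python
from statistics import mean, median, multimode

def summarize(data):
    """Return the mode(s), median, mean, and range of a list of numbers."""
    return {
        "modes": sorted(multimode(data)),  # every value tied for most frequent
        "median": median(data),            # averages the two middle values when n is even
        "mean": mean(data),
        "range": max(data) - min(data),
    }

set_1 = [7, 8, 16, 15, 16, 6, 7, 9]
set_2 = [2, 4, 2, 9, 7, 5, 2, 1, 3, 6]

print(summarize(set_1))
print(summarize(set_2))
```

`multimode` is used instead of `mode` because the first data set has two values tied for most frequent, and `multimode` returns all of them.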

3. Give a concrete example for when a) the mode, b) the median, and c) the mean would be the best central tendency to report. Why? (4 points)

a) The mode is best used with nominal data, where the numbers are only labels and adding or averaging them has no meaning. For example, if someone surveyed which U.S. states people travel to most and assigned each state a number, the number that was repeated the most (the mode) would appropriately represent which state saw the most travel.

b) The median is useful when the mean might not be appropriate, such as when reporting salary or socioeconomic class (as these numbers can be heavily skewed by outliers). For example, if we were looking at 5 people whose salaries were 15,000, 20,000, 22,000, 25,000, and 100,000, the obvious outlier would pull the mean up to 36,400, well above what most of the group earns, while the median would show a more representative number (22,000).

c) The mean is best used when there are no significant outliers to skew or misrepresent the data, that is, when the range of the data is not extreme. If one were looking for the average price of ice cream in Boston, one could assume that prices would fall between $2 and $6, which is a narrow range (it is highly unlikely to find one ice cream for $0.25 and another for $25), so the mean would be appropriate to use.
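
The salary example in part b) can be checked numerically. The sketch below uses Python's standard `statistics` module on the five salaries from the text; the variable names are illustrative.

```python
from statistics import mean, median

salaries = [15_000, 20_000, 22_000, 25_000, 100_000]

print(mean(salaries))    # pulled upward by the single 100,000 salary
print(median(salaries))  # the middle value, unaffected by the outlier

# Dropping the outlier shows how much it moved the mean.
without_outlier = [s for s in salaries if s < 100_000]
print(mean(without_outlier))
```

Comparing the mean with and without the largest salary makes the outlier's effect visible, while the median of the full set stays at the middle salary.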

4. For the following data, demonstrate that the computational formula for standard deviation and the definitional formula for standard deviation produce the same answer. Please show your work and formulas. Assume the data came from a sample. {14, 22, 12, 5, 18, 20, 22} (8 points)

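Definitional formula: $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$

Mean of x: (14+22+12+5+18+20+22)/7 = 113/7 = 16.14

Deviations (x - mean): (14-16.14), (22-16.14), (12-16.14), (5-16.14), (18-16.14), (20-16.14), (22-16.14) = -2.14, 5.86, -4.14, -11.14, 1.86, 3.86, 5.86

Squared deviations (computed from the unrounded mean 113/7): 4.59, 34.31, 17.16, 124.16, 3.45, 14.88, 34.31

$\sum (x - \bar{x})^2$ = 4.59+34.31+17.16+124.16+3.45+14.88+34.31 = 232.86

Variance: 232.86/(n-1) = 232.86/6 = 38.81

The variance is 38.81, therefore the standard deviation is $\sqrt{38.81}$ = 6.23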


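Computational formula: $s = \sqrt{\dfrac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}}$

$\sum x^2 = 14^2 + 22^2 + 12^2 + 5^2 + 18^2 + 20^2 + 22^2 = 2057$

$\frac{(\sum x)^2}{n} = \frac{(14+22+12+5+18+20+22)^2}{7} = \frac{113^2}{7} = \frac{12769}{7} = 1824.14$

$\sum x^2 - \frac{(\sum x)^2}{n} = 2057 - 1824.14 = 232.86$

$\frac{232.86}{n - 1} = \frac{232.86}{7 - 1} = \frac{232.86}{6} = 38.81$

$s = \sqrt{\frac{232.86}{6}} = \sqrt{38.81} = 6.23$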
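
The agreement between the two formulas in question 4 can also be checked numerically. The following is a minimal sketch, assuming the sample (n - 1) versions of both formulas; the function names `sd_definitional` and `sd_computational` are made up for this illustration.

```python
import math

def sd_definitional(xs):
    """Sample SD from squared deviations about the mean."""
    n = len(xs)
    x_bar = sum(xs) / n                       # keep the mean unrounded
    ss = sum((x - x_bar) ** 2 for x in xs)    # sum of squared deviations
    return math.sqrt(ss / (n - 1))

def sd_computational(xs):
    """Sample SD from the sum of squares and the squared sum."""
    n = len(xs)
    ss = sum(x * x for x in xs) - sum(xs) ** 2 / n
    return math.sqrt(ss / (n - 1))

data = [14, 22, 12, 5, 18, 20, 22]
print(round(sd_definitional(data), 2))
print(round(sd_computational(data), 2))
```

Both functions compute the same sum of squares in two algebraically equivalent ways, so any difference between them in hand work comes from rounding the mean or dropping a term, not from the formulas themselves.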

5. The following is a sample of scores: {5, 0, 9, 3, 8, 5}. Calculate three measures of dispersion. Please show your work and any formulas. Additionally, describe, in your own words what “dispersion” means when it comes to describing a set of data. (9 points)

Dispersion refers to how spread out the data are around the center of the distribution. For example, in a bell curve it would refer to how far the scores extend from the center (the mean or median) of the curve. Dispersion helps us judge how well the mean or median represents the data: when variability is high, possibly because of outliers, a single central value summarizes the scores less accurately.

Range: 9-0= 9

Interquartile range: the sorted scores split into halves (0, 3, 5) | (5, 8, 9); Q1 is 3 and Q3 is 8, so the IQR is 8-3 = 5

Standard deviation (sample formula, dividing by n - 1):

The mean is: (5+0+9+3+8+5)/6 = 30/6 = 5

Deviations (x - mean): (5-5), (0-5), (9-5), (3-5), (8-5), (5-5) = 0, -5, 4, -2, 3, 0

Squared deviations: 0, 25, 16, 4, 9, 0

Variance: (0+25+16+4+9+0)/(6-1) = 54/5 = 10.8

The variance is 10.8, while the standard deviation is 3.29 (the square root of the variance)
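
The three measures of dispersion in question 5 can be sketched in code as well. This example assumes the median-of-halves rule for quartiles (the method used in the hand calculation; other quartile conventions give different values), and the helper name `quartiles_by_halves` is illustrative. Because dividing by n - 1 (sample) and by n (population) gives different standard deviations, the sketch prints both.

```python
import math
from statistics import median

def quartiles_by_halves(xs):
    """Q1 and Q3 as the medians of the lower and upper halves of the sorted data."""
    s = sorted(xs)
    n = len(s)
    lower = s[: n // 2]          # excludes the middle value when n is odd
    upper = s[(n + 1) // 2 :]
    return median(lower), median(upper)

scores = [5, 0, 9, 3, 8, 5]

value_range = max(scores) - min(scores)
q1, q3 = quartiles_by_halves(scores)
iqr = q3 - q1

x_bar = sum(scores) / len(scores)
ss = sum((x - x_bar) ** 2 for x in scores)        # sum of squared deviations
sample_sd = math.sqrt(ss / (len(scores) - 1))     # divide by n - 1 for a sample
population_sd = math.sqrt(ss / len(scores))       # divide by n for a full population

print(value_range, iqr)
print(round(sample_sd, 2), round(population_sd, 2))
```

Since the question describes the scores as a sample, the n - 1 version is the one that applies here.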
