Chapter 2
Describing distributions with numbers
Chapter Outline
1. Measuring center: the mean
2. Measuring center: the median
3. Comparing the mean and the median
4. Measuring spread: the quartiles
5. The five-number summary and boxplots
6. Measuring spread: the standard deviation
7. Choosing measures of center and spread
Measuring center: the mean
Notation: $\bar{x}$. The mean is simply the ordinary arithmetic average.
Suppose that we have n observations (data size, number of individuals).
Observations are denoted as $x_1, x_2, x_3, \dots, x_n$.
Measuring center: the mean
How to get $\bar{x}$?
$$\bar{x} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n} = \frac{\sum x_i}{n}$$
Example 2.1 (P.33)
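As a minimal sketch of the mean formula (sum of the observations divided by n), the following Python function may help. The function name `mean` and the sample data are illustrative assumptions; the data are not taken from Example 2.1.

```python
def mean(observations):
    """Arithmetic mean: the sum of the observations divided by their count n."""
    n = len(observations)
    if n == 0:
        raise ValueError("the mean needs at least one observation")
    return sum(observations) / n


# Hypothetical data, chosen only for illustration.
sample = [2, 4, 6, 8]
print(mean(sample))  # 5.0
```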
Measuring center: the median
Notation: M. The median M is the midpoint of a distribution: half the observations are smaller than M and the other half are larger than M.
Measuring center: the median
How to find M?
– 1. Sort all observations in increasing order. (This step is important!)
– 2. If n is odd, the $\left(\frac{n+1}{2}\right)$th observation in the ordered list is M. If n is even, the average of the two center values is M.
Note that $\frac{n+1}{2}$ is the location of the median in the ordered list, not the median value.
Measuring center: the median
Examples
Case 1. 11, 21, 13, 24, 15, 26, 17
Case 2. 11, 21, 13, 24, 15, 26
Example 2.2, 2.3 (P.35)
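The two-step rule for finding M can be sketched in Python as follows. This is one possible implementation, applied to Case 1 and Case 2; the function name `median` is an assumption made for illustration.

```python
def median(observations):
    """Median by the slide's rule: sort first, then take the center value(s)."""
    ordered = sorted(observations)  # step 1: sort in increasing order
    n = len(ordered)
    if n == 0:
        raise ValueError("the median needs at least one observation")
    mid = n // 2
    if n % 2 == 1:
        # n odd: location (n + 1) / 2 in 1-based counting is index mid
        return ordered[mid]
    # n even: average the two center values
    return (ordered[mid - 1] + ordered[mid]) / 2


case_1 = [11, 21, 13, 24, 15, 26, 17]
case_2 = [11, 21, 13, 24, 15, 26]
print(median(case_1))  # 17
print(median(case_2))  # 18.0
```

For Case 1 the sorted list is 11, 13, 15, 17, 21, 24, 26, so the 4th value is M. For Case 2 the two center values are 15 and 21.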
Mean vs. Median
The median is more resistant than the mean: a few extreme observations can move the mean a great deal but barely affect the median.
The mean and median of a symmetric distribution are close together. If the distribution is exactly symmetric, the mean and median are exactly the same.
In a skewed distribution, the mean is farther out in the long tail than is the median.
Example: 1, 2, 3, 4, 5, 6, 10000
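To see the resistance of the median on the example data, here is a small sketch using Python's standard `statistics` module. Comparing against the same data without the extreme value 10000 is an added illustration, not part of the original example.

```python
from statistics import mean, median

data = [1, 2, 3, 4, 5, 6, 10000]
without_extreme = data[:-1]  # same data with 10000 dropped

# The median barely moves when the extreme value is included...
print(median(without_extreme), median(data))  # 3.5 4

# ...while the mean is pulled far out toward the long tail.
print(mean(without_extreme))  # 3.5
print(round(mean(data), 2))   # 1431.57
```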
Inference :
Strongly skewed distributions are usually reported with the median rather than the mean.
Measuring Spread: The Quartiles
The quartiles mark out the middle half of the distribution.
Calculating the Quartiles :
– Step 1. Arrange the observations in increasing order and locate the median M in the ordered list of observations.
– Step 2. The first quartile Q1 is the median of the observations whose position in the ordered list is to the left of the location of the overall median.
– Step 3. The third quartile Q3 is the median of the observations whose position in the ordered list is to the right of the location of the overall median.
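The three steps can be sketched in Python as below. This follows the rule stated in the steps (when n is odd, the overall median itself belongs to neither half); other textbooks and software use slightly different quartile conventions, so results may differ from library functions. The function names and the reuse of the earlier Case 1 and Case 2 data are assumptions for illustration.

```python
def _median_of_sorted(ordered):
    """Median of a list that is already sorted in increasing order."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(observations):
    """Return (Q1, M, Q3) using the median-of-halves rule."""
    ordered = sorted(observations)  # Step 1: sort
    n = len(ordered)
    if n < 2:
        raise ValueError("quartiles need at least two observations")
    half = n // 2
    lower = ordered[:half]             # left of the median's location
    upper = ordered[half + (n % 2):]   # right of it (skips M when n is odd)
    return (_median_of_sorted(lower),
            _median_of_sorted(ordered),
            _median_of_sorted(upper))


print(quartiles([11, 21, 13, 24, 15, 26, 17]))  # (13, 17, 24)
print(quartiles([11, 21, 13, 24, 15, 26]))      # (13, 18.0, 24)
```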
Measuring spread: the quartiles
Example 2.4 (P. 37) Example 2.5 (P. 38)
Note:
(1) It is important to sort the data before we try to find the quartiles!
(2) Quartiles are resistant.
The five-number summary and boxplots
The five-number summary:
Minimum, Q1, M, Q3, Maximum.
A boxplot is a graph of the five-number summary.
Boxplots are most useful for side-by-side comparison of several distributions.
Boxplot
1. A boxplot is a graph of the five-number summary.
2. A central box spans the quartiles.
3. A line in the box marks the median.
4. Lines extend from the box out to the minimum and maximum.
5. Range = maximum - minimum
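The numbers a boxplot draws can be computed with a short sketch like the one below. It assumes the median-of-halves quartile rule from the earlier slide; the function name `five_number_summary` and the sample data are illustrative assumptions.

```python
from statistics import median


def five_number_summary(observations):
    """Return (minimum, Q1, M, Q3, maximum) using the median-of-halves rule."""
    ordered = sorted(observations)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower = ordered[:half]            # observations left of the median's location
    upper = ordered[half + (n % 2):]  # observations right of it
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])


data = [11, 21, 13, 24, 15, 26, 17]
minimum, q1, m, q3, maximum = five_number_summary(data)
print(minimum, q1, m, q3, maximum)  # 11 13 17 24 26
print(maximum - minimum)            # range: 15
```

The box of the boxplot would span Q1 to Q3 (13 to 24 here), with a line at M and lines extending out to the minimum and maximum.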
The five-number summary and boxplot
Figure 2.2 (P.39): side-by-side boxplots comparing the distributions of earnings for two levels of education.
Inference :
A boxplot also gives an indication of the symmetry or skewness of a distribution.
-- In a symmetric distribution, Q1 and Q3 are equally distant from the median. In a right-skewed distribution, the third quartile is farther above the median than the first quartile is below it.
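The comparison of quartile distances can be checked numerically. The sketch below is only an illustration: the function name `quartile_gaps` and both data sets are hypothetical, and the quartiles use the median-of-halves rule from the earlier slide.

```python
from statistics import median


def quartile_gaps(observations):
    """Return (M - Q1, Q3 - M): how far each quartile lies from the median."""
    ordered = sorted(observations)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    m = median(ordered)
    q1 = median(ordered[:half])
    q3 = median(ordered[half + (n % 2):])
    return m - q1, q3 - m


symmetric = [1, 2, 3, 4, 5, 6, 7]            # hypothetical data
right_skewed = [1, 2, 2, 3, 3, 4, 6, 9, 15]  # hypothetical data

print(quartile_gaps(symmetric))     # (2, 2)
print(quartile_gaps(right_skewed))  # (1.0, 4.5)
```

In the right-skewed data, Q3 lies 4.5 above the median while Q1 lies only 1.0 below it.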
Measuring spread: the standard deviation
The standard deviation says how far the observations are from their mean.
The variance $s^2$ of a set of observations is an average of the squares of the deviations of the observations from their mean.
Notation: $s^2$ for variance and $s$ for standard deviation.
$$s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \dots + (x_n - \bar{x})^2}{n - 1} = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{\sum x_i^2 - n\bar{x}^2}{n - 1}$$
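The shortcut form of the variance, with numerator $\sum x_i^2 - n\bar{x}^2$, follows from expanding the square and using $\sum x_i = n\bar{x}$. A short derivation, with every sum running over $i = 1, \dots, n$:

```latex
\begin{aligned}
\sum (x_i - \bar{x})^2
  &= \sum x_i^2 - 2\bar{x} \sum x_i + n\bar{x}^2 \\
  &= \sum x_i^2 - 2\bar{x}\,(n\bar{x}) + n\bar{x}^2 \\
  &= \sum x_i^2 - n\bar{x}^2
\end{aligned}
```

Dividing both sides by $n - 1$ gives the two equivalent expressions for $s^2$.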
Why (n-1) ?
Because the sum of the deviations $(x_i - \bar{x})$ always equals 0, knowing (n-1) of them determines the last one.
--- Only (n-1) of the squared deviations can vary freely, so we average by dividing the total by (n-1).
The number (n-1) is called the degrees of freedom of the variance or standard deviation.
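A quick numerical check of the argument: the deviations from the mean sum to 0, so the last deviation can be recovered from the other n-1. The data below are hypothetical and chosen so that the mean is a whole number; with other data, floating-point rounding may leave a sum that is extremely close to 0 rather than exactly 0.

```python
# Hypothetical data with mean exactly 5.0, so the arithmetic is exact.
data = [2, 4, 4, 4, 5, 5, 7, 9]

x_bar = sum(data) / len(data)
deviations = [x - x_bar for x in data]

print(sum(deviations))  # 0.0

# The last deviation is fixed once the first n - 1 are known.
recovered_last = -sum(deviations[:-1])
print(recovered_last, deviations[-1])  # 4.0 4.0
```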
Measuring spread: the standard deviation
To find the variance and the standard deviation:
– 1. Find the mean of the data set.
– 2. Subtract the mean from each number (we call that the deviation).
– 3. Square each result.
– 4. Sum all the squares.
– 5. Divide the sum of squares by n-1, where n is the number of all observations. Now you get the variance.
– 6. The standard deviation is just the positive square root of the variance.
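The six steps translate almost line for line into Python. The sketch below is one possible implementation; the function names and the sample data are assumptions for illustration and are not taken from Example 2.6.

```python
from math import sqrt


def variance(observations):
    """Sample variance s^2, following steps 1 to 5."""
    n = len(observations)
    if n < 2:
        raise ValueError("the variance needs at least two observations")
    x_bar = sum(observations) / n                   # 1. mean
    deviations = [x - x_bar for x in observations]  # 2. deviations
    squares = [d ** 2 for d in deviations]          # 3. square each
    total = sum(squares)                            # 4. sum of squares
    return total / (n - 1)                          # 5. divide by n - 1


def standard_deviation(observations):
    """Sample standard deviation s: step 6, the positive square root."""
    return sqrt(variance(observations))


# Hypothetical data: mean 5, sum of squared deviations 32, n - 1 = 7.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(round(variance(data), 3))            # 4.571
print(round(standard_deviation(data), 3))  # 2.138
```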
Measuring spread: the standard deviation
Example 2.6 (P.42)
Properties of s2 and s
s measures spread about the mean and should be used only when the mean is chosen as the measure of center.
s ≥ 0, and s = 0 only when all the observations have the same value.
s is not resistant: a few extreme observations can make s very large.
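Two of these properties can be illustrated with `statistics.stdev`, which computes the sample standard deviation with the n-1 divisor. The data sets are hypothetical; the second one reuses the values 1 to 6 with and without an extreme observation of 10000.

```python
from statistics import stdev

# s = 0 when every observation has the same value.
identical = [5, 5, 5, 5]
print(stdev(identical))  # 0.0

# s is not resistant: one extreme observation inflates it enormously.
without_extreme = [1, 2, 3, 4, 5, 6]
with_extreme = [1, 2, 3, 4, 5, 6, 10000]
print(round(stdev(without_extreme), 2))  # 1.87
print(stdev(with_extreme) > 100 * stdev(without_extreme))  # True
```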
Choosing measures of center and spread
With a skewed distribution or a distribution with extreme outliers, the five-number summary is better.
With a symmetric distribution (without outliers), the mean and standard deviation are better.