chapter 4


Upload: wren

Post on 04-Jan-2016


DESCRIPTION

Other Descriptive Measures. Chapter 4. Chapter Goals: When you have completed this chapter, you will be able to compute and interpret the range, the mean deviation, the variance, the standard deviation, and the coefficient of variation of ungrouped data. (PowerPoint presentation)

TRANSCRIPT

Page 1: Chapter 4


Copyright © 2004 by The McGraw-Hill Companies, Inc. All rights reserved.

Page 2: Chapter 4

When you have completed this chapter, you will be able to:

1. Compute and interpret the range, the mean deviation, the variance, the standard deviation, and the coefficient of variation of ungrouped data

2. Compute and interpret the range, the variance, and the standard deviation from grouped data

3. Explain the characteristics, uses, advantages, and disadvantages of each measure

Page 3: Chapter 4

4. Understand Chebyshev’s theorem and the normal or empirical rule, as they relate to a set of observations

5. Compute and interpret percentiles, quartiles, and the interquartile range

6. Construct and interpret box plots

7. Compute and describe the coefficient of skewness and kurtosis of a data distribution

Page 4: Chapter 4

Terminology: Range

…is the difference between the largest and the smallest value.

Only two values are used in its calculation.

It is influenced by an extreme value.

It is easy to compute and understand.

Page 5: Chapter 4

Terminology: Mean Deviation

…is the arithmetic mean of the absolute values of the deviations from the arithmetic mean.

\[ MD = \frac{\sum |x - \mu|}{N} \]

All values are used in the calculation.

It is not unduly influenced by large or small values.

The absolute values are difficult to manipulate.

Page 6: Chapter 4

The weights of a sample of crates containing books for the bookstore (in kg) are: 103, 97, 101, 106, 103

Find the range and the mean deviation.

Page 7: Chapter 4

Find the range

\[ 106 - 97 = 9 \]

Find the mean weight

\[ \mu = \frac{\sum x}{N} = \frac{510}{5} = 102 \]

Find the mean deviation

\[ MD = \frac{\sum |x - \mu|}{N} = \frac{|103 - 102| + \dots + |103 - 102|}{5} = \frac{1 + 5 + 1 + 4 + 1}{5} = \frac{12}{5} = 2.4 \]
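
The crate example can be checked with a few lines of Python. This is only a sketch: the helper names `value_range` and `mean_deviation` are made up for illustration and do not come from the slides.

```python
# Crate weights in kg, taken from the example above.
weights = [103, 97, 101, 106, 103]


def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)


def mean_deviation(values):
    """Arithmetic mean of the absolute deviations from the mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


print(value_range(weights))     # 9
print(mean_deviation(weights))  # 2.4
```

Only the largest and smallest weights enter the range, while every weight enters the mean deviation.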

Page 8: Chapter 4

Terminology: Variance

…is the arithmetic mean of the squared deviations from the arithmetic mean.

All values are used in the calculation.

It is less dominated by a single extreme value than the range is, although squaring gives large deviations extra weight.

The units are awkward…the square of the original units.

Page 9: Chapter 4

Computing the Variance

Formula … for a Population

\[ \sigma^2 = \frac{\sum (x - \mu)^2}{N} \]

Formula … for a Sample

\[ s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \]

Page 10: Chapter 4

The ages of the Dunn family are: 2, 18, 34, 42

What are the population mean and variance?

\[ \mu = \frac{\sum x}{N} = \frac{96}{4} = 24 \]

\[ \sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{(2 - 24)^2 + \dots + (42 - 24)^2}{4} = \frac{944}{4} = 236 \]

Page 11: Chapter 4

Population Standard Deviation

… is the square root of the population variance

Example

From previous example…

\[ \sigma = \sqrt{\sigma^2} = \sqrt{236} = 15.36 \]
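
As a quick check of the Dunn family figures, the sketch below uses Python's standard `statistics` module, whose `pvariance` and `pstdev` functions divide by N (the population formulas). The variable names are illustrative only.

```python
import statistics

# Ages of the Dunn family, treated as a complete population.
ages = [2, 18, 34, 42]

mu = statistics.mean(ages)                  # population mean
sigma_squared = statistics.pvariance(ages)  # divides by N, not n - 1
sigma = statistics.pstdev(ages)             # square root of the population variance

print(f"mean = {mu}, variance = {sigma_squared}, sd = {sigma:.2f}")
```

The three values should match the 24, 236 and 15.36 worked out by hand.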

Page 12: Chapter 4

EXAMPLE

The hourly wages earned by a sample of five students are: $7, $5, $11, $8, $6.

Find the mean, variance, and standard deviation.

\[ \bar{x} = \frac{\sum x}{n} = \frac{37}{5} = 7.40 \]

\[ s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{(7 - 7.4)^2 + \dots + (6 - 7.4)^2}{5 - 1} = \frac{21.2}{4} = 5.30 \]

\[ s = \sqrt{s^2} = \sqrt{5.30} = 2.30 \]
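
The sample formula differs from the population one only in its divisor, n - 1. A minimal sketch of the wage example follows; `sample_variance` is a hypothetical helper written out by hand and then cross-checked against the standard library's `statistics.variance`.

```python
import math
import statistics

# Hourly wages (in dollars) of the five sampled students.
wages = [7, 5, 11, 8, 6]


def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


s2 = sample_variance(wages)
s = math.sqrt(s2)

print(f"mean = {statistics.mean(wages):.2f}")  # mean = 7.40
print(f"s^2 = {s2:.2f}, s = {s:.2f}")          # s^2 = 5.30, s = 2.30

# The hand-written formula agrees with the library's sample variance.
assert math.isclose(s2, statistics.variance(wages))
```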

Page 13: Chapter 4

The Mean of Grouped Data

From chapter 3….

A sample of ten movie theatres in a metropolitan area tallied the total number of movies showing last week. Compute the mean number of movies showing per theatre.

Page 14: Chapter 4

The Mean of Grouped Data

Continued…

\[ \bar{x} = \frac{\sum fx}{N} \]

Movies Showing    Frequency (f)    Class Midpoint (x)    (f)(x)
1 to under 3            1                  2                 2
3 to under 5            2                  4                 8
5 to under 7            3                  6                18
7 to under 9            1                  8                 8
9 to under 11           3                 10                30
Total                  10                                   66

Page 15: Chapter 4

The Mean of Grouped Data

Continued…

Movies Showing    Frequency (f)    Class Midpoint (x)    (f)(x)
Total                  10                                   66

Formula

\[ \bar{x} = \frac{\sum fx}{N} = \frac{66}{10} = 6.6 \]

Now: Compute the variance and standard deviation.

Page 16: Chapter 4

Sample Variance for Grouped Data

The formula for the sample variance for grouped data is:

\[ s^2 = \frac{\sum fx^2 - \frac{(\sum fx)^2}{n}}{n - 1} \]

where f is the class frequency and x is the class midpoint

Page 17: Chapter 4

Sample Variance for Grouped Data

Movies Showing    Frequency (f)    Class Midpoint (x)    (f)(x)    (x2)f
1 to under 3            1                  2                 2         4
3 to under 5            2                  4                 8        32
5 to under 7            3                  6                18       108
7 to under 9            1                  8                 8        64
9 to under 11           3                 10                30       300
Total                  10                                   66       508

Page 18: Chapter 4

Sample Variance for Grouped Data

Movies Showing    Frequency (f)    Class Midpoint (x)    (f)(x)    (x2)f
Total                  10                                   66       508

The variance is

\[ s^2 = \frac{\sum fx^2 - \frac{(\sum fx)^2}{n}}{n - 1} = \frac{508 - \frac{66^2}{10}}{9} = 8.04 \]

The standard deviation is

\[ s = \sqrt{8.04} \approx 2.8 \]
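
The grouped-data calculation can be sketched in Python as follows. The class limits and frequencies are those of the movie-theatre table; the function name `grouped_summary` and the tuple layout are assumptions made for this illustration.

```python
import math

# (lower limit, upper limit, frequency) for each class of "movies showing".
classes = [(1, 3, 1), (3, 5, 2), (5, 7, 3), (7, 9, 1), (9, 11, 3)]


def grouped_summary(classes):
    """Return (mean, sample variance, sample SD) using class midpoints."""
    n = sum(f for _, _, f in classes)
    midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]
    freqs = [f for _, _, f in classes]
    sum_fx = sum(f * x for f, x in zip(freqs, midpoints))        # 66
    sum_fx2 = sum(f * x ** 2 for f, x in zip(freqs, midpoints))  # 508
    mean = sum_fx / n
    variance = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
    return mean, variance, math.sqrt(variance)


mean, variance, sd = grouped_summary(classes)
print(f"mean = {mean:.1f}, variance = {variance:.2f}, sd = {sd:.2f}")
# mean = 6.6, variance = 8.04, sd = 2.84
```

To two decimals the standard deviation is 2.84, which the slide rounds to 2.8.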

Page 19: Chapter 4

Interpretation and Uses of the Standard Deviation

Chebyshev’s Theorem:

For any set of observations, the proportion of the values that lie within k standard deviations of the mean is at least:

Formula

\[ 1 - \frac{1}{k^2} \]

where k is any constant greater than 1

Page 20: Chapter 4

Suppose that a wholesale plumbing supply company has a group of 50 sales vouchers from a particular day.

The amounts of these vouchers are:

How well does this data set fit Chebyshev’s Theorem?

Page 21: Chapter 4

Solution (continued)

Step 1

Determine the mean and standard deviation of the sample

Mean = $319 SD = $101.78

Step 2

Input k = 2 into Chebyshev’s theorem

\[ 1 - \frac{1}{2^2} = 1 - \frac{1}{4} = \frac{3}{4} \]

i.e. At least .75 of the observations will fall within 2 SD of the mean.

Page 22: Chapter 4

Solution (continued)

Step 3

Using the mean and SD, find the range of data values within 2 SD of the mean

Mean = $319 SD = $101.78

\[ (\bar{x} - 2s, \bar{x} + 2s) = (319 - 2(101.78), 319 + 2(101.78)) = (115.44, 522.56) \]

Now, go back to the sample data, and see what proportion of the values fall between 115.44 and 522.56

Page 23: Chapter 4

Solution (continued)

Proportion of the values that fall between 115.44 and 522.56

We find that 48 of 50, or 96%, of the data values are in this range: certainly at least 75%, as the theorem guarantees!
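
The three steps of the Chebyshev check can be sketched in Python. The individual voucher amounts are not reproduced in this transcript, so the sketch uses only the reported summary figures (mean $319, SD $101.78, 48 of 50 values inside the interval). The function names are illustrative assumptions.

```python
def chebyshev_lower_bound(k):
    """Minimum share of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2


def interval_around_mean(mean, sd, k):
    """Interval (mean - k*sd, mean + k*sd)."""
    return (mean - k * sd, mean + k * sd)


# Summary figures reported for the 50 sales vouchers.
mean, sd, k = 319, 101.78, 2

low, high = interval_around_mean(mean, sd, k)
bound = chebyshev_lower_bound(k)
observed = 48 / 50  # share of vouchers reported inside the interval

print(f"interval = ({low:.2f}, {high:.2f})")              # interval = (115.44, 522.56)
print(f"bound = {bound:.2f}, observed = {observed:.2f}")  # bound = 0.75, observed = 0.96
print(observed >= bound)                                  # True
```

The bound holds for any data set, whatever its shape, which is why it is so much looser than the observed 96%.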

Page 24: Chapter 4

Interpretation and Uses of the Standard Deviation

Empirical Rule:

For any symmetrical, bell-shaped distribution:

…About 68% of the observations will lie within 1s of the mean

…About 95% of the observations will lie within 2s of the mean

…Virtually all the observations will be within 3s of the mean

Page 25: Chapter 4

Bell-Shaped Curve…showing the relationship between \(\sigma\) and \(\mu\)

Page 26: Chapter 4

Suppose that a wholesale plumbing supply company has a group of 50 sales vouchers from a particular day.

The amounts of these vouchers are:

How well does this data set fit the Empirical Rule?

Page 27: Chapter 4

Solution

First check if the histogram has an approximate mound shape

Not bad…so we’ll proceed!

We need to calculate the mean and standard deviation

Page 28: Chapter 4

Mean: $319 Standard Deviation: $101.78

Calculate the intervals:

\[ (\bar{x} - s, \bar{x} + s) = (319 - 101.78, 319 + 101.78) = (217.22, 420.78) \]

\[ (\bar{x} - 2s, \bar{x} + 2s) = (319 - 2(101.78), 319 + 2(101.78)) = (115.44, 522.56) \]

\[ (\bar{x} - 3s, \bar{x} + 3s) = (319 - 3(101.78), 319 + 3(101.78)) = (13.66, 624.34) \]

Interval            Empirical Rule    Actual # values    Actual percentage
(217.22, 420.78)    68%               31/50              62%
(115.44, 522.56)    95%               48/50              96%
(13.66, 624.34)     100%              49/50              98%
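
The comparison table can be rebuilt with a short loop. Again, this is a sketch based on the reported counts (31, 48 and 49 of 50), because the raw voucher amounts are not in the transcript. The 0.997 figure is the usual normal-curve value behind "virtually all", and the function name is made up for illustration.

```python
# Summary figures reported for the 50 sales vouchers.
mean, sd, n = 319, 101.78, 50

# Share expected within k SDs under the empirical rule
# (0.997 is the normal-curve value behind "virtually all").
expected = {1: 0.68, 2: 0.95, 3: 0.997}

# Number of vouchers reported inside each interval.
reported_counts = {1: 31, 2: 48, 3: 49}


def compare_with_empirical_rule(mean, sd, n, expected, counts):
    """Return rows of (k, low, high, expected share, actual share)."""
    rows = []
    for k in sorted(expected):
        low, high = mean - k * sd, mean + k * sd
        rows.append((k, round(low, 2), round(high, 2), expected[k], counts[k] / n))
    return rows


rows = compare_with_empirical_rule(mean, sd, n, expected, reported_counts)
for k, low, high, rule, actual in rows:
    print(f"{k} SD: ({low}, {high})  rule {rule:.1%}  actual {actual:.0%}")
```

The actual shares (62%, 96%, 98%) are close to, but not exactly, the rule's 68%, 95% and nearly 100%, as expected for data that are only roughly mound-shaped.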

Page 29: Chapter 4

Skewness…is the measurement of the lack of symmetry of the distribution

It is computed as follows:

\[ SK_1 = \frac{3(\text{Mean} - \text{Median})}{\sigma} \]

…The coefficient of skewness can range from -3.00 up to +3.00

…A value of 0 indicates a symmetric distribution.

Page 30: Chapter 4

Skewness

Following are the earnings per share for a sample of 15 software companies for the year 2000. The earnings per share are arranged from smallest to largest.

$0.09 0.13 0.41 0.51 1.12 1.20 1.49 3.18 3.50 6.36 7.83 8.92 10.13 12.99 16.40

Find the coefficient of skewness.

Mean = 4.95 Median = 3.18 SD = 5.22

\[ SK_1 = \frac{3(4.95 - 3.18)}{5.22} = 1.017 \]
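
The skewness coefficient for the earnings-per-share data can be recomputed from the raw values. In this sketch the function name `pearson_skewness` is an assumption, and the sample standard deviation (`statistics.stdev`) is used because the 15 companies are described as a sample.

```python
import statistics

# Earnings per share for the 15 software companies (year 2000).
eps = [0.09, 0.13, 0.41, 0.51, 1.12, 1.20, 1.49, 3.18,
       3.50, 6.36, 7.83, 8.92, 10.13, 12.99, 16.40]


def pearson_skewness(values):
    """SK1 = 3 * (mean - median) / standard deviation."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return 3 * (mean - median) / sd


sk = pearson_skewness(eps)
print(f"SK1 = {sk:.3f}")
```

The result should agree with the slide's value of about 1.017. It is positive because the mean is pulled above the median by a few large values.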

Page 31: Chapter 4

Positively Skewed Distribution

Skewed Right

Mean and Median are to the right of the Mode

Mode < Median < Mean

Page 32: Chapter 4

Negatively Skewed Distribution

Skewed Left

Mean and Median are to the left of the Mode

Mean < Median < Mode

Page 33: Chapter 4

Interquartile Range

…is the distance between the third quartile Q3 and the first quartile Q1.

This distance will include the middle 50 percent of the observations.

Interquartile Range = Q3 - Q1

Page 34: Chapter 4

Example

For a set of observations the third quartile is 24 and the first quartile is 10. What is the interquartile range?

The interquartile range is 24 - 10 = 14. Fifty percent of the observations will occur between 10 and 24.

Page 35: Chapter 4

Box Plots

A box plot…is a graphical display, based on quartiles, that helps to picture a set of data

Five pieces of data are needed to construct a box plot:

… the Minimum Value,

… the First Quartile,

… the Median,

… the Third Quartile, and

… the Maximum Value

Page 36: Chapter 4

Example

Based on a sample of 20 deliveries, Buddy’s Pizza determined the following information.

The…minimum delivery time was 13 minutes

…the maximum 30 minutes

The…first quartile was 15 minutes

…the median 18 minutes, and

… the third quartile 22 minutes

Develop a box plot for the delivery times.

Page 37: Chapter 4

Solution

[Box plot of the delivery times on an axis from 12 to 32 minutes, marking Min. = 13, Q1 = 15, Median = 18, Q3 = 22, Max. = 30]
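
The five values behind the box plot can be held in a small Python structure. This is a sketch only: the class `FiveNumberSummary` and its properties are hypothetical names. It records the Buddy’s Pizza figures and derives the interquartile range (the length of the box) and the two whisker lengths.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FiveNumberSummary:
    minimum: float
    q1: float
    median: float
    q3: float
    maximum: float

    @property
    def iqr(self):
        """Length of the box: Q3 - Q1."""
        return self.q3 - self.q1

    @property
    def whiskers(self):
        """Lengths of the left and right whiskers."""
        return (self.q1 - self.minimum, self.maximum - self.q3)


# Delivery times in minutes from the Buddy's Pizza example.
delivery = FiveNumberSummary(minimum=13, q1=15, median=18, q3=22, maximum=30)

print(delivery.iqr)       # 7
print(delivery.whiskers)  # (2, 8)
```

The box spans 15 to 22 minutes and holds the middle 50 percent of the deliveries. The right whisker (8 minutes) is much longer than the left (2 minutes).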

Page 38: Chapter 4

The following are the average rates of return for Stocks A and B over a six-year period.

In which of the following Stocks would you prefer to invest? Why?

Stock A: 7 6 8 5 7 3

Stock B: 15 -10 18 10 -5 8

Page 39: Chapter 4

Find the Mean rate of return for each of the two stocks:

Stock A: 7 6 8 5 7 3

Mean = 36/6 = 6

Stock B: 15 -10 18 10 -5 8

Mean = 36/6 = 6

Page 40: Chapter 4

Find the Range of Values of each stock:

Stock A: 7 6 8 5 7 3

8 – 3 = 5

Stock B: 15 -10 18 10 -5 8

18 – ( -10) = 28

The two stocks have the same mean return, but the returns of Stock B are spread over a much wider range.

Therefore, Stock B is riskier.
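
The stock comparison comes down to two numbers per stock, which the following sketch computes. The helper name `mean_and_range` is illustrative, not from the slides.

```python
# Average rates of return over the six-year period.
stock_a = [7, 6, 8, 5, 7, 3]
stock_b = [15, -10, 18, 10, -5, 8]


def mean_and_range(returns):
    """Return (mean, range) of a list of returns."""
    return sum(returns) / len(returns), max(returns) - min(returns)


for name, returns in (("Stock A", stock_a), ("Stock B", stock_b)):
    mean, spread = mean_and_range(returns)
    print(f"{name}: mean = {mean:.1f}, range = {spread}")
# Stock A: mean = 6.0, range = 5
# Stock B: mean = 6.0, range = 28
```

With identical means, the difference in spread is what separates the two stocks.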

Page 41: Chapter 4

Relative Dispersion

The coefficient of variation is the ratio of the standard deviation to the arithmetic mean, expressed as a percentage:

\[ CV = \frac{s}{\bar{x}} (100\%) \]

A standard deviation of 10 may be perceived as large when the mean value is 100, but only moderately large when the mean value is 500!

Page 42: Chapter 4

Example

Rates of return over the past 6 years for two mutual funds are shown below.

Fund A: 8.3, -6.0, 18.9, -5.7, 23.6, 20

Fund B: 12, -4.8, 6.4, 10.2, 25.3, 1.4

Which one has a higher level of risk?

Page 43: Chapter 4

Solution

Let us use the Excel printout that is run from the “Descriptive Statistics” sub-menu

                      Fund A    Fund B
Mean                    9.85      8.42
Standard Error          5.38      4.20
Median                 13.60      8.30
Mode                    #N/A      #N/A
Standard Deviation     13.19     10.29
Sample Variance       173.88    105.81
Kurtosis               -2.21      0.90
Skewness               -0.44      0.61
Range                  29.60      30.1
Minimum                   -6      -4.8
Maximum                 23.6      25.3
Sum                     59.1      50.5
Count                      6         6

Page 44: Chapter 4

Solution

Is Fund A riskier because its standard deviation is larger?

Page 45: Chapter 4

Solution

But the means of the two funds are different.

Fund A has a higher rate of return, but it also has a larger SD.

Therefore we need to compare the relative variability using the coefficient of variation.

Page 46: Chapter 4

Solution

\[ CV = \frac{s}{\bar{x}} (100\%) \]

Fund A: CV = 13.19 / 9.85 = 1.34, or 134%

Fund B: CV = 10.29 / 8.42 = 1.22, or 122%

So now we say that there is more variability in Fund A as compared to Fund B

Therefore, Fund A is riskier.
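
The coefficient of variation for the two funds can be recomputed directly from the listed returns instead of the Excel printout. A minimal sketch, assuming the sample standard deviation is intended (as in the printout); the function name is illustrative.

```python
import statistics

# Rates of return over the past 6 years.
fund_a = [8.3, -6.0, 18.9, -5.7, 23.6, 20]
fund_b = [12, -4.8, 6.4, 10.2, 25.3, 1.4]


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100


cv_a = coefficient_of_variation(fund_a)
cv_b = coefficient_of_variation(fund_b)

print(f"Fund A: CV = {cv_a:.0f}%")  # Fund A: CV = 134%
print(f"Fund B: CV = {cv_b:.0f}%")  # Fund B: CV = 122%
```

Because the CV is unit-free, it allows the spread of the two funds to be compared even though their means differ.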

Page 47: Chapter 4

Test your learning…

www.mcgrawhill.ca/college/lind

Click on… Online Learning Centre for quizzes, extra content, data sets, searchable glossary, access to Statistics Canada’s E-Stat data…and much more!

Page 48: Chapter 4

This completes Chapter 4