chapter 9

Slide 13 - 1 Copyright © 2009 Pearson Education, Inc. Survey of Mathematics – MM150 Unit 9 – Statistics Mr. Scott VanZuiden, Adjunct Professor Kaplan University [email protected] Welcome to seminar!

Page 1: Chapter 9

Survey of Mathematics – MM150 Unit 9 – Statistics

Mr. Scott VanZuiden, Adjunct Professor, Kaplan University

[email protected]

Welcome to seminar!

Page 2: Chapter 9

Chapter 9

Statistics

Page 3: Chapter 9

WHAT YOU WILL LEARN

• Mode, median, mean, and midrange

• Percentiles and quartiles

• Range and standard deviation

• z-scores and the normal distribution

• Correlation and regression

Page 4: Chapter 9

9.1

Measures of Central Tendency

Page 5: Chapter 9

Definitions

An average is a number that is representative of a group of data.

The arithmetic mean, or simply the mean, is symbolized by $\bar{x}$ when it is computed from a sample of a population, or by the Greek letter mu, $\mu$, when it is computed from the entire population.

Page 6: Chapter 9

Mean

The mean, $\bar{x}$, is the sum of the data divided by the number of pieces of data. The formula for calculating the mean is

$$\bar{x} = \frac{\sum x}{n}$$

where $\sum x$ represents the sum of all the data and n represents the number of pieces of data.

Page 7: Chapter 9

Example: Find the Mean

Find the mean amount of money parents spent on new school supplies and clothes if 5 parents randomly surveyed replied as follows:

$327 $465 $672 $150 $230
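The arithmetic for this example can be checked with a short Python sketch. The variable names are illustrative only and are not part of the original slides.

```python
# Amounts (in dollars) reported by the five surveyed parents.
amounts = [327, 465, 672, 150, 230]

# Mean = (sum of the data) / (number of pieces of data).
total = sum(amounts)
n = len(amounts)
mean = total / n

print(f"sum = {total}, n = {n}, mean = ${mean:.2f}")
```

The sum is 1844 and there are 5 pieces of data, so the mean is $368.80.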

Page 8: Chapter 9

Median

The median is the value in the middle of a set of ranked data.

Example: Determine the median of

$327 $465 $672 $150 $230.

Page 9: Chapter 9

Example: Median (even data)

Determine the median of the following set of data: 8, 15, 9, 3, 4, 7, 11, 12, 6, 4.
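A minimal sketch of the median procedure, applied to both median examples (the odd-sized spending data and the even-sized data set). The helper name `median` is our own; Python's standard `statistics.median` should give the same values.

```python
def median(values):
    """Return the middle value of the ranked data.

    With an even number of values, return the value halfway
    between the two middle values.
    """
    ranked = sorted(values)          # rank the data from smallest to largest
    n = len(ranked)
    mid = n // 2
    if n % 2 == 1:                   # odd count: single middle value
        return ranked[mid]
    return (ranked[mid - 1] + ranked[mid]) / 2   # even count: average the two middle values

spending = [327, 465, 672, 150, 230]
even_data = [8, 15, 9, 3, 4, 7, 11, 12, 6, 4]

print(median(spending))    # 327
print(median(even_data))   # 7.5
```

For the even-sized set, the ranked data are 3, 4, 4, 6, 7, 8, 9, 11, 12, 15, so the two middle values are 7 and 8.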

Page 10: Chapter 9

Mode

The mode is the piece of data that occurs most frequently.

Example: Determine the mode of the data set: 3, 4, 4, 6, 7, 8, 9, 11, 12, 15.
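One way to find the mode is to count how often each value occurs. The sketch below is illustrative; the function name `modes` and the choice to return an empty list when no value repeats are assumptions, not something stated on the slide.

```python
from collections import Counter

def modes(values):
    """Return the value(s) that occur most frequently.

    Assumption: if every value occurs exactly once, the data set is
    treated as having no mode and an empty list is returned.
    """
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []
    return sorted(v for v, c in counts.items() if c == highest)

data = [3, 4, 4, 6, 7, 8, 9, 11, 12, 15]
print(modes(data))   # [4]
```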

Page 11: Chapter 9

Page 12: Chapter 9

Midrange

The midrange is the value halfway between the lowest (L) and highest (H) values in a set of data.

$$\text{Midrange} = \frac{\text{lowest value} + \text{highest value}}{2}$$

Example: Find the midrange of the data set $327, $465, $672, $150, $230.
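As a quick check of the midrange example, the formula can be evaluated directly. This is only a sketch; the variable names are ours.

```python
# Spending data (in dollars) from the midrange example.
amounts = [327, 465, 672, 150, 230]

lowest = min(amounts)
highest = max(amounts)

# Midrange = (lowest value + highest value) / 2
midrange = (lowest + highest) / 2

print(f"L = {lowest}, H = {highest}, midrange = ${midrange:.2f}")
```

Here L = 150 and H = 672, so the midrange is $411.00.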

Page 13: Chapter 9

Example

The weights of eight Labrador retrievers, rounded to the nearest pound, are 85, 92, 88, 75, 94, 88, 84, and 101. Determine the

a) mean

b) median

c) mode

d) midrange

e) Rank the measures of central tendency from lowest to highest.

Page 14: Chapter 9

Example: dog weights 85, 92, 88, 75, 94, 88, 84, 101
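The worked solution for this example did not survive in the transcript, so the following Python sketch shows one way to compute all four measures with the standard `statistics` module and then rank them. The dictionary layout is our own choice.

```python
import statistics

# Weights (in pounds) of the eight Labrador retrievers.
weights = [85, 92, 88, 75, 94, 88, 84, 101]

measures = {
    "mean": statistics.mean(weights),                 # sum 707 divided by 8
    "median": statistics.median(weights),             # halfway between the two middle ranked values
    "mode": statistics.mode(weights),                 # most frequent value
    "midrange": (min(weights) + max(weights)) / 2,    # (75 + 101) / 2
}

# Rank the measures from lowest to highest.
ranked = sorted(measures.items(), key=lambda item: item[1])
for name, value in ranked:
    print(f"{name}: {value}")
```

The median, mode, and midrange all equal 88 pounds, and the mean, 88.375 pounds, is slightly higher.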

Page 15: Chapter 9


Page 16: Chapter 9

Measures of Position

Measures of position are often used to make comparisons.

Two measures of position are percentiles and quartiles.

Page 17: Chapter 9

To Find the Quartiles of a Set of Data

1. Order the data from smallest to largest.

2. Find the median, or 2nd quartile (Q2), of the set of data. If there is an odd number of pieces of data, the median is the middle value. If there is an even number of pieces of data, the median is halfway between the two middle pieces of data.

Page 18: Chapter 9

To Find the Quartiles of a Set of Data continued

3. The first quartile, Q1, is the median of the lower half of the data; that is, Q1 is the median of the data less than Q2.

4. The third quartile, Q3, is the median of the upper half of the data; that is, Q3 is the median of the data greater than Q2.

Page 19: Chapter 9

Example: Quartiles

The weekly grocery bills for 23 families are as follows. Determine Q1, Q2, and Q3.

170  210  270  270  280
330   80  170  240  270
225  225  215  310   50
 75  160  130   74   81
 95  172  190

Page 20: Chapter 9

Example: Quartiles continued

Order the data:

 50   74   75   80   81   95  130
160  170  170  172  190  210  215
225  225  240  270  270  270  280
310  330

Q2 is the median of the entire data set, which is 190.

Q1 is the median of the numbers from 50 to 172, which is 95.

Q3 is the median of the numbers from 210 to 330, which is 270.
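The four-step quartile procedure can be sketched in Python as follows. The function names are ours, and the sketch splits the ranked data by position (the values below and above the median position), which is an assumption about how to read "the data less than Q2" when values are repeated.

```python
def median(ranked):
    """Median of a list that is already ranked from smallest to largest."""
    n = len(ranked)
    mid = n // 2
    if n % 2 == 1:
        return ranked[mid]
    return (ranked[mid - 1] + ranked[mid]) / 2

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ranked = sorted(values)                       # step 1: order the data
    n = len(ranked)
    half = n // 2
    lower = ranked[:half]                         # data below the median position
    # For an odd count, the median itself is left out of both halves.
    upper = ranked[half + 1:] if n % 2 == 1 else ranked[half:]
    return median(lower), median(ranked), median(upper)

bills = [170, 210, 270, 270, 280, 330, 80, 170, 240, 270, 225, 225,
         215, 310, 50, 75, 160, 130, 74, 81, 95, 172, 190]

q1, q2, q3 = quartiles(bills)
print(q1, q2, q3)   # 95 190 270
```

Other quartile conventions exist (for example, interpolation methods used by statistical software), so results from a library function may differ slightly from this textbook method.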

Page 21: Chapter 9

9.2

Measures of Dispersion

Page 22: Chapter 9

Measures of Dispersion

Measures of dispersion are used to indicate the spread of the data.

The range is the difference between the highest and lowest values; it indicates the total spread of the data.

Range = highest value – lowest value

Page 23: Chapter 9

Example: Range

Nine employees were selected and their salaries were recorded. Find the range of the salaries.

$24,000 $32,000 $26,500

$56,000 $48,000 $27,000

$28,500 $34,500 $56,750
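A short sketch of the range calculation for these nine salaries; the variable names are illustrative.

```python
# Salaries (in dollars) of the nine selected employees.
salaries = [24000, 32000, 26500,
            56000, 48000, 27000,
            28500, 34500, 56750]

# Range = highest value - lowest value
salary_range = max(salaries) - min(salaries)

print(f"highest = ${max(salaries):,}, lowest = ${min(salaries):,}, range = ${salary_range:,}")
```

The highest salary is $56,750 and the lowest is $24,000, so the range is $32,750.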

Page 24: Chapter 9

Standard Deviation

The standard deviation measures how much the data differ from the mean. It is symbolized with s when it is calculated for a sample, and with $\sigma$ (Greek letter sigma) when it is calculated for a population.

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

Page 25: Chapter 9

To Find the Standard Deviation of a Set of Data

1. Find the mean of the set of data.

2. Make a chart having three columns: Data, Data − Mean, and (Data − Mean)².

3. List the data vertically under the column marked Data.

4. Subtract the mean from each piece of data and place the difference in the Data − Mean column.

Page 26: Chapter 9

To Find the Standard Deviation of a Set of Data continued

5. Square the values obtained in the Data − Mean column and record these values in the (Data − Mean)² column.

6. Determine the sum of the values in the (Data − Mean)² column.

7. Divide the sum obtained in step 6 by n − 1, where n is the number of pieces of data.

8. Determine the square root of the number obtained in step 7. This number is the standard deviation of the set of data.

Page 27: Chapter 9

Example

Find the standard deviation of the following prices of selected washing machines:

$280, $217, $665, $684, $939, $299

Find the mean.

Page 28: Chapter 9

Example continued, mean = 514

Data    Data − Mean    (Data − Mean)²
217     −297            88,209
280     −234            54,756
299     −215            46,225
665      151            22,801
684      170            28,900
939      425           180,625
Sum        0           421,516

Page 29: Chapter 9

Example continued, mean = 514

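Dividing the sum of the squared differences, 421,516, by n − 1 = 5 and taking the square root gives

$$s = \sqrt{\frac{421{,}516}{5}} = \sqrt{84{,}303.2} \approx 290.35$$

The standard deviation is approximately $290.35.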
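The eight steps for finding the standard deviation can be sketched in Python and applied to the washing machine prices. This is an illustrative walk-through with our own variable names, not the slide's original solution.

```python
import math

# Prices (in dollars) of the selected washing machines.
prices = [280, 217, 665, 684, 939, 299]
n = len(prices)

# Step 1: find the mean.
mean = sum(prices) / n

# Steps 2-4: Data and Data - Mean columns.
differences = [x - mean for x in prices]

# Step 5: (Data - Mean)^2 column.
squares = [d ** 2 for d in differences]

# Step 6: sum of the squared differences.
sum_of_squares = sum(squares)

# Step 7: divide by n - 1 (sample standard deviation).
quotient = sum_of_squares / (n - 1)

# Step 8: take the square root.
s = math.sqrt(quotient)

for x, d, sq in sorted(zip(prices, differences, squares)):
    print(f"{x:>5} {d:>8.0f} {sq:>10.0f}")
print(f"mean = {mean:.0f}, sum of squares = {sum_of_squares:.0f}, s = {s:.2f}")
```

The result should agree with `statistics.stdev(prices)` from the Python standard library, which also divides by n − 1.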

Page 30: Chapter 9

9.3

The Normal Curve

Page 31: Chapter 9

Types of Distributions

[Figures: rectangular distribution and J-shaped distribution, with Frequency plotted against Values]

Page 32: Chapter 9

Types of Distributions continued

[Figures: bimodal distribution; distribution skewed to the right]

Page 33: Chapter 9

Types of Distributions continued

[Figures: distribution skewed to the left; normal distribution]

Page 34: Chapter 9

Properties of a Normal Distribution

The graph of a normal distribution is called the normal curve.

The normal curve is bell shaped and symmetric about the mean.

In a normal distribution, the mean, median, and mode all have the same value and all occur at the center of the distribution.

Page 35: Chapter 9

Empirical Rule

Approximately 68% of all the data lie within one standard deviation of the mean (in both directions).

Approximately 95% of all the data lie within two standard deviations of the mean (in both directions).

Approximately 99.7% of all the data lie within three standard deviations of the mean (in both directions).
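The three percentages in the empirical rule can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (available in Python 3.8 and later); the helper name `share_within` is our own.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")
```

The computed shares round to about 68%, 95%, and 99.7%, matching the rule.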

Page 36: Chapter 9

z-Scores

z-scores determine how far, in terms of standard deviations, a given score is from the mean of the distribution.

$$z = \frac{\text{value of piece of data} - \text{mean}}{\text{standard deviation}} = \frac{x - \mu}{\sigma}$$

Page 37: Chapter 9

Example: z-scores

A normal distribution has a mean of 50 and a standard deviation of 5. Find z-scores for the following values.

a) 55 b) 60 c) 43
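The worked answers for this example are missing from the transcript. A minimal sketch of the z-score calculation, with an illustrative function name, is:

```python
def z_score(value, mean, standard_deviation):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / standard_deviation

mean, sd = 50, 5
for label, value in (("a", 55), ("b", 60), ("c", 43)):
    print(f"{label}) z = ({value} - {mean}) / {sd} = {z_score(value, mean, sd)}")
```

This gives z = 1 for 55, z = 2 for 60, and z = −1.4 for 43. A negative z-score means the value lies below the mean.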

Page 38: Chapter 9

Example: z-scores continued

Page 39: Chapter 9

To Find the Percent of Data Between any Two Values

1. Draw a diagram of the normal curve, indicating the area or percent to be determined.

2. Use the formula to convert the given values to z-scores. Indicate these z-scores on the diagram.

3. Look up the percent that corresponds to each z-score in Table 13.7.

Page 40: Chapter 9

To Find the Percent of Data Between any Two Values continued

4. a) When finding the percent of data between two z-scores on opposite sides of the mean (when one z-score is positive and the other is negative), find the sum of the individual percents.

b) When finding the percent of data between two z-scores on the same side of the mean (when both z-scores are positive or both are negative), subtract the smaller percent from the larger percent.

Page 41: Chapter 9

To Find the Percent of Data Between any Two Values continued

c) When finding the percent of data to the right of a positive z-score or to the left of a negative z-score, subtract the percent of data between 0 and z from 50%.

d) When finding the percent of data to the left of a positive z-score or to the right of a negative z-score, add the percent of data between 0 and z to 50%.
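Rules (a) through (d) can be sketched in Python. Instead of looking up percents in the table, this sketch computes the area between the mean and z with the error function `math.erf`; that substitution and all function names are our own assumptions, not part of the slides.

```python
import math

def area_mean_to_z(z):
    """Fraction of data between the mean (z = 0) and z.

    Plays the role of a table entry; computed here with the error function.
    """
    return 0.5 * math.erf(abs(z) / math.sqrt(2))

def between(z1, z2):
    """Fraction of data between two z-scores."""
    a1, a2 = area_mean_to_z(z1), area_mean_to_z(z2)
    if z1 * z2 < 0:
        return a1 + a2        # rule (a): opposite sides of the mean, add
    return abs(a1 - a2)       # rule (b): same side, subtract smaller from larger

def tail_beyond(z):
    """Rule (c): right of a positive z or left of a negative z."""
    return 0.5 - area_mean_to_z(z)

def through_mean(z):
    """Rule (d): left of a positive z or right of a negative z."""
    return 0.5 + area_mean_to_z(z)

print(f"between z = -1 and z = 2: {between(-1, 2):.2%}")
print(f"between z = 1 and z = 2:  {between(1, 2):.2%}")
print(f"right of z = 1:           {tail_beyond(1):.2%}")
print(f"left of z = 1:            {through_mean(1):.2%}")
```

Because a printed table rounds z to two decimal places, table-based answers may differ from these values in the last digit or two.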

Page 42: Chapter 9

Example

Assume that the waiting times for customers at a popular restaurant before being seated for lunch are normally distributed with a mean of 12 minutes and a standard deviation of 3 min.

a) Find the percent of customers who wait for at least 12 minutes before being seated.

b) Find the percent of customers who wait between 9 and 18 minutes before being seated.

c) Find the percent of customers who wait at least 17 minutes before being seated.

d) Find the percent of customers who wait less than 8 minutes before being seated.
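The solution slides for this example are mostly lost in the transcript. As a hedged check, the four parts can be computed with `statistics.NormalDist` (Python 3.8+), which evaluates the normal curve directly rather than using the table; answers read from a table may differ slightly because z is rounded to two decimal places there.

```python
from statistics import NormalDist

# Waiting times: normally distributed, mean 12 minutes, standard deviation 3 minutes.
waits = NormalDist(mu=12, sigma=3)

# a) at least 12 minutes: everything to the right of the mean
part_a = 1 - waits.cdf(12)

# b) between 9 and 18 minutes: z = -1 to z = 2
part_b = waits.cdf(18) - waits.cdf(9)

# c) at least 17 minutes: right of z = (17 - 12) / 3
part_c = 1 - waits.cdf(17)

# d) less than 8 minutes: left of z = (8 - 12) / 3
part_d = waits.cdf(8)

for label, p in (("a", part_a), ("b", part_b), ("c", part_c), ("d", part_d)):
    print(f"{label}) {p:.1%}")
```

Part (a) is exactly 50% because the normal curve is symmetric about the mean.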

Page 43: Chapter 9

Solution

a. wait for at least 12 minutes

b. between 9 and 18 minutes

Page 44: Chapter 9

Solution continued

c. at least 17 min d. less than 8 min

Page 45: Chapter 9

9.4

Linear Correlation and Regression

Page 46: Chapter 9

Linear Correlation

Linear correlation is used to determine whether there is a relationship between two quantities and, if so, how strong the relationship is.

Page 47: Chapter 9

Linear Correlation

The linear correlation coefficient, r, is a unitless measure that describes the strength of the linear relationship between two variables.

• If the value is positive, as one variable increases, the other increases.

• If the value is negative, as one variable increases, the other decreases.

• The variable, r, will always be a value between –1 and 1 inclusive.

Page 48: Chapter 9

Scatter Diagrams

A visual aid used with correlation is the scatter diagram, a plot of points (bivariate data).

• The independent variable, x, generally is a quantity that can be controlled.

• The dependent variable, y, is the other variable.

The value of r is a measure of how far a set of points varies from a straight line.

• The greater the spread, the weaker the correlation and the closer the r value is to 0.

• The smaller the spread, the stronger the correlation and the closer the r value is to 1 or –1.

Page 49: Chapter 9

Correlation

Page 50: Chapter 9

Correlation

Page 51: Chapter 9

Linear Correlation Coefficient

The formula to calculate the correlation coefficient (r) is as follows:

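$$r = \frac{n\left(\sum xy\right) - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\left(\sum x^2\right) - \left(\sum x\right)^2}\,\sqrt{n\left(\sum y^2\right) - \left(\sum y\right)^2}}$$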

Page 52: Chapter 9

Example: Words Per Minute versus Mistakes

There are five applicants applying for a job as a medical transcriptionist. The following table shows the results of the applicants when asked to type a chart. Determine the correlation coefficient between the words per minute typed and the number of mistakes.

Applicant   Words per Minute   Mistakes
Ellen       24                  8
George      67                 11
Phillip     53                 12
Kendra      41                 10
Nancy       34                  9

Page 53: Chapter 9

Solution

We will call the words typed per minute, x, and the mistakes, y.

List the values of x and y and calculate the necessary sums.

WPM (x)   Mistakes (y)   x²       y²    xy
24         8               576     64    192
67        11             4,489    121    737
53        12             2,809    144    636
41        10             1,681    100    410
34         9             1,156     81    306

Σx = 219, Σy = 50, Σx² = 10,711, Σy² = 510, Σxy = 2,281

Page 54: Chapter 9

Solution continued

The n in the formula represents the number of pieces of data. Here n = 5.

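$$r = \frac{n\left(\sum xy\right) - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\left(\sum x^2\right) - \left(\sum x\right)^2}\,\sqrt{n\left(\sum y^2\right) - \left(\sum y\right)^2}}$$

$$r = \frac{5(2281) - (219)(50)}{\sqrt{5(10{,}711) - (219)^2}\,\sqrt{5(510) - (50)^2}}$$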

Page 55: Chapter 9

Solution continued

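$$r = \frac{11{,}405 - 10{,}950}{\sqrt{5(10{,}711) - 47{,}961}\,\sqrt{5(510) - 2500}}$$

$$r = \frac{455}{\sqrt{53{,}555 - 47{,}961}\,\sqrt{2550 - 2500}}$$

$$r = \frac{455}{\sqrt{5594}\,\sqrt{50}} \approx 0.86$$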

Page 56: Chapter 9

Solution continued

Since 0.86 is fairly close to 1, there is a fairly strong positive correlation.

This result implies that the more words typed per minute, the more mistakes made.
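The correlation calculation for the typing example can be reproduced with a short Python sketch. The list names are ours; the (words per minute, mistakes) pairs are the five applicants' results, and their sums match the totals used in the solution.

```python
import math

# (words per minute, mistakes) for Ellen, George, Phillip, Kendra, Nancy.
wpm = [24, 67, 53, 41, 34]
mistakes = [8, 11, 12, 10, 9]
n = len(wpm)

# The five sums required by the formula.
sum_x = sum(wpm)
sum_y = sum(mistakes)
sum_xy = sum(x * y for x, y in zip(wpm, mistakes))
sum_x2 = sum(x * x for x in wpm)
sum_y2 = sum(y * y for y in mistakes)

# r = [n(Σxy) - (Σx)(Σy)] / [sqrt(n(Σx²) - (Σx)²) * sqrt(n(Σy²) - (Σy)²)]
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
r = numerator / denominator

print(f"numerator = {numerator}, r = {r:.2f}")
```

The numerator is 455 and r rounds to 0.86, in agreement with the hand calculation.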

Page 57: Chapter 9

Linear Regression

Linear regression is the process of determining the linear relationship between two variables.

The line of best fit (regression line or the least squares line) is the line such that the sum of the squares of the vertical distances from the line to the data points (on a scatter diagram) is a minimum.

Page 58: Chapter 9

The Line of Best Fit

Equation:

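$$y = mx + b, \quad \text{where} \quad m = \frac{n\left(\sum xy\right) - \left(\sum x\right)\left(\sum y\right)}{n\left(\sum x^2\right) - \left(\sum x\right)^2}, \quad \text{and} \quad b = \frac{\sum y - m\left(\sum x\right)}{n}$$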

Page 59: Chapter 9

Example

Use the data in the previous example to find the equation of the line that relates the number of words per minute and the number of mistakes made while typing a chart.

Graph the equation of the line of best fit on a scatter diagram that illustrates the set of bivariate points.

Page 60: Chapter 9

Solution

From the previous results, we know that

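$$m = \frac{n\left(\sum xy\right) - \left(\sum x\right)\left(\sum y\right)}{n\left(\sum x^2\right) - \left(\sum x\right)^2}$$

$$m = \frac{5(2{,}281) - (219)(50)}{5(10{,}711) - 219^2}$$

$$m = \frac{455}{5594} \approx 0.081$$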

Page 61: Chapter 9

Solution

Now we find the y-intercept, b.

$$b = \frac{\sum y - m\left(\sum x\right)}{n} = \frac{50 - 0.081(219)}{5} = \frac{32.261}{5} \approx 6.452$$

Therefore the line of best fit is y = 0.081x + 6.452.

Page 62: Chapter 9

Solution continued

To graph y = 0.081x + 6.452, plot at least two points and draw the graph.

x     y
10    7.262
20    8.072
30    8.882
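The slope, intercept, and plotted points can be reproduced with a Python sketch. It uses the sums from the correlation example; the variable names are ours. Note that the slides round m to 0.081 before computing b, so the sketch shows both the rounded and the unrounded intercept.

```python
# Sums from the typing example (n = 5 applicants).
n = 5
sum_x, sum_y = 219, 50
sum_xy, sum_x2 = 2281, 10711

# Slope: m = [n(Σxy) - (Σx)(Σy)] / [n(Σx²) - (Σx)²]
m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)

# Intercept with the unrounded slope: b = [Σy - m(Σx)] / n
b = (sum_y - m * sum_x) / n

# Intercept as on the slides, using m rounded to three decimal places.
m_rounded = round(m, 3)
b_rounded = (sum_y - m_rounded * sum_x) / n

print(f"m = {m:.4f}, b = {b:.3f} (unrounded slope)")
print(f"m = {m_rounded}, b = {b_rounded:.3f} (slope rounded first)")

# Points on the rounded line y = 0.081x + 6.452.
points = [(x, round(m_rounded * x + 6.452, 3)) for x in (10, 20, 30)]
print(points)
```

Carrying the unrounded slope through gives an intercept of about 6.437 rather than 6.452. The difference comes only from rounding m early and has little effect on the graph.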

Page 63: Chapter 9

Solution continued