a.p. statistics – sampling distributions and the central limit...

www.MasterMathMentor.com - 1 - Stu Schwartz

A.P. Statistics – Sampling Distributions and the Central Limit Theorem

Definitions

A parameter is a number that describes the population. A parameter always exists, but in practice we rarely know its value because of the difficulty of taking a census. Parameters always use Greek letters to describe them. For instance, μ represents the mean of a population and σ represents the standard deviation of the population. If we are talking about a percentage parameter, we use the Greek letter ρ (rho).

Example: If we wanted to compare the IQ's of all American and Asian males, it would be impossible. But it is important to realize that μ_American male and μ_Asian male exist.

Example: If we were interested in whether a greater percentage of women than men eat broccoli, we want to know whether ρ_women > ρ_men.

A statistic is a number that describes a sample. The value of a statistic can always be found when we take a sample, but it is important to realize that a statistic can change from sample to sample. Statistics use variables like x̄, p̂, and s (non-Greek). We often use statistics to estimate an unknown parameter.

Example 1: I take a random sample of 500 American males and find their IQ's. We find x̄ = 103.2. We would like to be able to say that μ = 103.2. Obviously, though, if we had taken a different 500 males, we would have gotten a different x̄. It is not clear that we can use x̄ to find μ.

Example 2: I take a random sample of 200 women and find that 40 like broccoli. Then p̂_W = .2. From a sample of 300 men, I find that 30 like broccoli. Then p̂_M = .1. We know that p̂_W > p̂_M, but that is a far cry from being able to say that ρ_W > ρ_M.

The question that should crop up is: since our samples can give different results, how can we use them to find parameters?

Imagine an archer shooting many arrows at a target. There are four situations that can occur.

a) high bias, low variability
b) low bias, high variability
c) high bias, high variability
d) low bias, low variability

Situation a) has the archer consistent but off target. Situation b) has the archer all over the place: he tends to average a bull's-eye, but each shot is far from the center. Situation c) is worse than situation a): the archer is still missing high and to the right, but not nearly as consistently as in situation a). Obviously, what the archer is attempting to achieve is situation d), low bias and low variability. Now let's relate this to sampling.




Suppose our goal was to estimate the average IQ of American males (μ_American male). We cannot take a census, so we take a sample. Suppose we took many samples. The same things we saw in the pictures can happen when we sample. Remember, μ_American male exists. We just don't know it, nor can we actually find it. We can only estimate it.

• If our many samples of IQ's of American males are consistent but higher than μ_American male, then we have situation a) high bias and low variability. (The problem is: how would we know this?)

• If our many samples of IQ's of American males are inconsistent, some well higher than μ_American male and some well lower than μ_American male, we have situation b) low bias and high variability.

• If our many samples of IQ's of American males are not close to each other but are all higher than μ_American male, we have situation c) high bias and high variability.

• Finally, if our many samples of IQ's of American males are slightly higher and slightly lower than μ_American male, we have our desired situation d): low bias and low variability.

Now here is the crux of the matter: You are not going to take many samples. You are only going to take one. We want to use it to predict μ_American male. It should be obvious that if we have situation a), b) or c), our one sample most likely will not be a good predictor. It could be, but it very well may not be. But if we are in situation d), low bias and low variability, then any of the samples will provide a good predictor for μ_American male. We already know some ways to get a good sample: using an SRS, and being very sure to have no bias when choosing your sample. In this and future chapters, we will see how to use our sample statistic to predict our parameter with a certain degree of confidence.

The Sampling Distribution of a statistic is the distribution of values taken by the statistic in all possible samples of the same size from the population. When we sample, we sample with replacement, meaning that the same value can be used over again. A sampling distribution is a sample space: it describes everything that can happen when we sample.

Example 3: Suppose our data was the numbers {1, 5, 9} and we were going to sample two pieces of data. Generate the sampling distribution:

Choice 1 Choice 2 Choice 1 Choice 2 Choice 1 Choice 2

Example 4: Suppose our data was the numbers {4, 5, 7, 8} and we were going to sample two pieces of data. Generate the sampling distribution:

Choice 1 Choice 2 Choice 1 Choice 2 Choice 1 Choice 2 Choice 1 Choice 2
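The two tables above can be checked with a short script. What follows is only a minimal Python sketch, assuming (as the text states) that we sample with replacement and that the order of the two choices matters, so a data set of 3 values gives 3 × 3 = 9 samples and a data set of 4 values gives 4 × 4 = 16. The helper name `all_samples` is hypothetical and does not come from the handout.

```python
from itertools import product

def all_samples(data, size):
    """Return every ordered sample of `size` values drawn with replacement."""
    return list(product(data, repeat=size))

samples_3 = all_samples([1, 5, 9], 2)      # the {1, 5, 9} exercise
samples_4 = all_samples([4, 5, 7, 8], 2)   # Example 4

# Print one row per sample: Choice 1, Choice 2
for choice_1, choice_2 in samples_3:
    print(choice_1, choice_2)

print(len(samples_3), "samples from {1, 5, 9};",
      len(samples_4), "samples from {4, 5, 7, 8}")
```

Listing the samples by hand first and then comparing against the printed rows is a quick way to confirm that none were missed.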


Mean and Standard Deviation of a Sample Mean

Suppose that x̄ is the mean of an SRS of size n from a large population with mean μ and standard deviation σ. Then the mean of the sampling distribution of x̄ is μ and its standard deviation is σ/√n.

Example 5: Let S = {0, 4, 8, 12}. μ = _______, σ = _______. Note that we are using Greek letters (parameters) because we are dealing with the entire population. Generate the sampling distribution of n = 2 as we did before, but for each member of the sampling distribution, find x̄.

choice one    choice two    x̄

From your data, find the mean of your sample means x̄ ______ and the s.d. σ ________. Now calculate σ/√n for your original population: σ/√n = _______. σ/√n and the standard deviation of the sample means should be the same. Here is a histogram of the sampling distribution. Note its shape.
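One way to check the blanks in Example 5 is to let a script build the whole sampling distribution. This is a minimal Python sketch, not the handout's method. It assumes ordered samples drawn with replacement, and it uses the population standard deviation (dividing by N, `statistics.pstdev`) both for S and for the list of sample means, since both lists are complete rather than samples. The variable names are illustrative.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

population = [0, 4, 8, 12]
n = 2  # sample size

mu = mean(population)        # population mean
sigma = pstdev(population)   # population standard deviation (divide by N)

# Every ordered sample of size n, drawn with replacement, reduced to its mean.
sample_means = [mean(sample) for sample in product(population, repeat=n)]

mean_of_means = mean(sample_means)
sd_of_means = pstdev(sample_means)

print("mu =", mu, " sigma =", round(sigma, 4))
print("mean of sample means =", mean_of_means)
print("s.d. of sample means =", round(sd_of_means, 4))
print("sigma / sqrt(n)      =", round(sigma / sqrt(n), 4))
```

The last two printed values should agree, which is the claim the example asks you to verify. Changing `population` and `n` lets the same sketch check the homework problems of this type.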

The Central Limit Theorem: Draw an SRS of size n from any population whatsoever (not necessarily normal) with mean μ and standard deviation σ. When n is large, the sampling distribution of the sample mean x̄ is close to the normal distribution N(μ, σ/√n) with mean μ and standard deviation σ/√n.


What the Central Limit Theorem says in words:

a) You start with some population with some mean μ and standard deviation σ. You may know the mean and standard deviation, but most likely you do not. The distribution may be normal, but it does not have to be.

Example: A pizza shop sells slices of pizza for $1.75 and sodas for $.75. People come in for lunch and pay various amounts. Some people just buy a soda. Some buy only a slice; others buy a slice and a soda. Some get 2 slices; others buy lunch for a friend and thus spend more. We have no idea what the distribution of prices looks like.

b) Decide on a sample size and call it n. Start taking samples. Find the mean of each sample.

Example: Suppose n = 10. Take a random sample of 10 people at the pizza place, record their bills, and find the average and standard deviation.

c) Now take a lot of samples of size n and find the average of the averages you just found.

Example: Let's take 500 of these samples of 10 and find the average of the average bill.

d) The CLT says three things:

1) The mean of the population (what we want to find) will be the same as the mean of your sample means.
2) The standard deviation of the sample means will be the population standard deviation divided by √n.
3) The histogram of the sample means will appear normal (bell shaped). The larger the sample size (n), the smaller the standard deviation will be and the more constricted the graph will be.
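Steps a) through d) can be tried out in a simulation. The sketch below is illustrative only: the list of lunch bills is invented (the handout's actual data set is not given), the seed is fixed just to make the run repeatable, and `simulate_sample_means` is a hypothetical helper name. It draws 500 samples for each of several sample sizes and compares the standard deviation of the sample means with σ/√n.

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical lunch bills in dollars, invented for illustration.
# The shape is deliberately lopsided rather than bell shaped.
population = (
    [0.75] * 10 + [1.75] * 5 + [2.50] * 33 + [3.50] * 20
    + [4.25] * 15 + [5.00] * 10 + [6.75] * 5 + [8.00] * 2
)

mu = mean(population)
sigma = pstdev(population)

def simulate_sample_means(sample_size, num_samples=500):
    """Draw num_samples random samples (with replacement); return their means."""
    return [mean(random.choices(population, k=sample_size))
            for _ in range(num_samples)]

print(f"population: mu={mu:.3f}  sigma={sigma:.3f}")
for n in (3, 10, 25):
    means = simulate_sample_means(n)
    print(f"n={n:2d}  mean of means={mean(means):.3f}  "
          f"s.d. of means={pstdev(means):.3f}  "
          f"sigma/sqrt(n)={sigma / sqrt(n):.3f}")
```

Because the samples are random, the simulated values only approximate μ and σ/√n; they will not match exactly, and a different seed gives slightly different numbers.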

Here is a histogram of the population, with its data summary. It represents the total amount spent at lunch. Note that you would not necessarily know this information. For instance, very few people just ordered a slice, and the most common order appeared to be a slice and a drink for $2.50. About 33 people did this.

Here are the results of choosing 500 samples of size 3 (took 3 random orders and averaged them). The distribution of sample means is certainly more normal than the population above. Note that the mean of the population (4.01) is similar to the mean of the samples (4.03), and the standard deviation of the samples (0.98) is similar to the population standard deviation 1.6667 divided by √3 (about 0.96).

Here are the results of choosing 500 samples of size 10. Note how the distribution appears normal although the population certainly was not. Note that the standard deviation of the samples (.499) is approximately the population standard deviation 1.6667 divided by √10 (about 0.53).

[Figure: Collection 1 histogram of Pay (the population). Summary: mean = 4.01, stdDev = 1.6678665]

[Figure: Measures from Sample of Collection 1, histogram of avg_pay for samples of size 3. Summary: mean = 4.0311667, stdDev = 0.98366093]

[Figure: Measures from Sample of Collection 1, histogram of avg_pay for samples of size 10. Summary: mean = 4.01395, stdDev = 0.49921948]


If we changed the sample size to 25, note how the distribution becomes more normal while the standard deviation of the samples gets smaller and smaller, approaching 1.667/√25.

Example 6: The average study time for a final exam in History is found to be 6 hours and 25 minutes with a standard deviation of 1 hour and 45 minutes. Assume the distribution is normal.

a. What is the probability that a student chosen at random spends more than 7 hours studying?
b. What is the probability that an SRS of 4 students will average more than 7 hours studying? Compared to a), why does the probability go down?
c. What is the probability that a student chosen at random spends less than 4 hours studying?
d. What is the probability that an SRS of 5 students will average less than 4 hours studying?
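Parts a and b of Example 6 differ only in the standard deviation used: σ for one student, σ/√n for the mean of n students. The following is a minimal Python sketch of that comparison, assuming times are converted to minutes and using `statistics.NormalDist` from the standard library in place of a normal table or calculator. The function name `prob_mean_above` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

MU = 6 * 60 + 25      # population mean: 6 h 25 min = 385 minutes
SIGMA = 1 * 60 + 45   # population s.d.: 1 h 45 min = 105 minutes

def prob_mean_above(threshold, n=1):
    """P(mean of n observations > threshold) for a normal population."""
    sampling_dist = NormalDist(MU, SIGMA / sqrt(n))
    return 1 - sampling_dist.cdf(threshold)

p_one_student = prob_mean_above(7 * 60)          # part a: one student
p_four_students = prob_mean_above(7 * 60, n=4)   # part b: SRS of 4 students

print("one student:   ", round(p_one_student, 4))
print("mean of four:  ", round(p_four_students, 4))
```

The second probability is smaller because the sample mean varies less than a single observation, so 7 hours sits more standard deviations above the mean. Parts c and d follow the same pattern with `cdf` alone for a "less than" probability.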

Example 7: The length of a pregnancy from conception to birth varies normally with mean 266 days and standard deviation 6 days.

a. What is the probability that a woman chosen at random has a pregnancy lasting more than 270 days?
b. What is the probability that an SRS of 16 women have pregnancies averaging more than 270 days? Compared to a), why does the probability go down?
c. What is the probability that a woman chosen at random has a pregnancy lasting less than 255 days?
d. What is the probability that an SRS of 20 women have pregnancies averaging less than 255 days?
e. What is the probability that a woman will have a pregnancy lasting between 260 and 270 days?
f. What is the probability that an SRS of 50 women have pregnancies averaging between 260 and 270 days? Compared to e), why does the probability go up?
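Parts e and f of Example 7 ask for the probability of landing inside an interval that contains the mean, which is why the answer rises as n grows. Here is a minimal Python sketch of that calculation, again assuming `statistics.NormalDist` stands in for a normal table; `prob_mean_between` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_between(low, high, mu, sigma, n=1):
    """P(low < mean of n observations < high) for a normal population."""
    sampling_dist = NormalDist(mu, sigma / sqrt(n))
    return sampling_dist.cdf(high) - sampling_dist.cdf(low)

# Part e: a single pregnancy, standard deviation 6 days.
p_one_woman = prob_mean_between(260, 270, mu=266, sigma=6)

# Part f: mean of an SRS of 50, standard deviation 6 / sqrt(50) days.
p_fifty_women = prob_mean_between(260, 270, mu=266, sigma=6, n=50)

print("one woman:    ", round(p_one_woman, 4))
print("mean of fifty:", round(p_fifty_women, 6))
```

With n = 50 the standard deviation of the sample mean is under one day, so nearly the entire sampling distribution falls between 260 and 270 days.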

[Figure: Measures from Sample of Collection 1, histogram of avg_pay for samples of size 25. Summary: mean = 4.01438, stdDev = 0.33023131]


Homework Problems

1) S = {1, 7, 10}. μ = _______, σ = _______. Generate the sampling distribution of n = 2.

choice one    choice two    x̄

From your data, find the mean of your sample means x̄ ______ and the s.d. σ ________. Now calculate σ/√n for your original population: σ/√n = _______. σ/√n and the standard deviation of the sample means should be the same.

To the right, make a histogram of the means of your sampling distribution.

2) Let S = {2, 6, 11, 13, 18}. μ = _______, σ = _______. Generate the sampling distribution of n = 2.

choice one    choice two    x̄    choice one    choice two    x̄

From your data, find the mean of your sample means x̄ ______ and the s.d. σ ________. Now calculate σ/√n for your original population: σ/√n = _______. σ/√n and the standard deviation of the sample means should be the same.

Below, make a histogram of the means of your sampling distribution.


Example: Let S = {4, 7, 16}. μ = _______, σ = _______. Generate the sampling distribution of n = 3.

choice one    choice two    choice three    x̄

From your data, find the mean of your sample means x̄ ______ and the s.d. σ ________. Now calculate σ/√n for your original population: σ/√n = _______. σ/√n and the standard deviation of the sample means should be the same.

Below, make a histogram of the means of your sampling distribution.


The Central Limit Theorem – Answer all questions. If a question cannot be answered, explain why.

1. The average amount of money spent at lunch at a school cafeteria is $3.00 with a standard deviation of 75 cents. Assume the distribution of money spent is normal.

a. What is the probability that a student spends more than $3.50 for lunch?
b. What is the probability that the average amount spent by an SRS of 10 students will be greater than $4? Compared to a), why does the probability go down?
c. What is the probability that a student spends less than $2.75 for lunch?
d. Find the probability that the average amount spent by an SRS of 25 students is less than $2.75.

2. The typical 6 ounce bag of potato chips is normally distributed by weight with a standard deviation of .15 ounces.

a. What is the probability that a bag contains less than 5.9 ounces?
b. What is the probability that the average weight of an SRS of 12 bags will be less than 5.9 ounces?
c. What is the probability that a bag will have more than 6.2 ounces or less than 5.8 ounces?
d. Find the probability that the average weight of an SRS of 5 bags is more than 6.2 ounces or less than 5.8 ounces. Compared to c), why does the probability go down?

3. The average number of years that a particular washing machine lasts is 7.45 years with a standard deviation of 2.71 years. Assume normality. The warranty for this machine is two years.

a. What is the probability that a machine will last more than 8 years?
b. What is the probability that an SRS of 100 washing machines will average a lifetime of 8 or more years?
c. What is the probability that a machine will fail within the warranty period?
d. What is the probability that an SRS of 15 machines will average failing within the warranty period?


4. The average amount of time that people spend going through airport security for planes taking off between 8 AM and 10 AM at a busy airport is 22 minutes with a standard deviation of 4.2 minutes. Assume normality.

a. What is the probability that a person has to wait more than 25 minutes?
b. What is the probability that the next 8 people in line wait an average of more than 25 minutes?
c. What is the probability that a person has to wait between 20 and 25 minutes?
d. What is the probability that an SRS of 35 people will average a wait between 20 and 25 minutes? Compared to c), why does the probability go up?

5. An SAT review course claims that it can increase SAT scores with great success. It reports that the average gain of a student is 50 points with a standard deviation of 21.2. No other information is given.

a) If a student takes the course, what is the probability that her scores will increase 55 points or more?
b) If 10 students take the course, what is the probability that the average gain of those students will be 55 points or more?
c) If 100 students take the course, what is the probability that the average gain of those students will be 55 points or more?
d) If 200 students take the course, what is the probability that the average gain of those students will be 55 points or more?
e) Which of these answers are you most confident of? Why?

6. At 8 AM on a typical weekday morning, the average number of people in a 7-11 store is 24.5 with a standard deviation of 4.4. Assume normality.

a. What is the approximate distribution of the mean number of persons x̄ in 500 randomly selected 7-11's?
b. What is the probability that 500 selected stores will have more than 12,300 people in them at 8 AM?
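Problem 6b is stated in terms of a total, but it can be restated in terms of the sample mean: 500 stores hold more than 12,300 people exactly when the mean per store exceeds 12,300/500. The sketch below is one possible way to set that up in Python, assuming the 500 stores are an SRS and using `statistics.NormalDist` in place of a normal table. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA = 24.5, 4.4   # per-store mean and s.d. given in the problem
N_STORES = 500
TOTAL = 12_300

# Part a: approximate sampling distribution of the mean count per store.
mean_dist = NormalDist(MU, SIGMA / sqrt(N_STORES))

# Part b: a total above 12,300 is the same event as a mean above 12,300 / 500.
threshold_mean = TOTAL / N_STORES
p_total_exceeds = 1 - mean_dist.cdf(threshold_mean)

print("s.d. of the sample mean:", round(mean_dist.stdev, 4))
print("threshold mean:", threshold_mean)
print("P(total > 12,300):", round(p_total_exceeds, 4))
```

Converting the total to a mean first means the same σ/√n machinery used in the earlier problems applies without change.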


Practice Quiz

1. Let S = {4, 9, 14}. μ = _______, σ = _______. Generate the sampling distribution of n = 2 and for each member of the sampling distribution, find x̄.

choice one    choice two    x̄    choice one    choice two    x̄

Explain how the Central Limit Theorem holds for this sampling distribution.

2. The amount of snow that falls in Buffalo, NY over a winter is normally distributed with a mean of 15 feet, 7 inches and a standard deviation of 6 feet, 2 inches.

a. What is the probability that Buffalo will have over 12 feet of snow next winter?
b. What is the probability that 4 random Buffalo winters will average over 10 feet of snow? Explain why this is or is not the same as the probability that the next 4 Buffalo winters will average over 10 feet of snow.
c. What is the probability that Buffalo's last 10 winters averaged between 12 and 16 feet of snow?

3. The time it takes waiting to ride a popular roller coaster at an amusement park varies normally with an average wait time of 24 minutes and a standard deviation of 7.5 minutes.

a. What is the probability that a random rider waits more than half an hour?
b. What is the probability that 5 random riders average a wait of more than half an hour?
c. Explain why the probability in b is (greater/smaller) than the probability in a.
d. What is the probability that a random rider waits between 15 and 25 minutes?
e. What is the probability that 75 random riders average a wait between 15 and 25 minutes?
f. Explain why the probability in d is (greater/smaller) than the probability in e.