1 1 slide © 2005 thomson/south-western chapter 7, part a sampling and sampling distributions...

31
1 © 2005 Thomson/South-Western Chapter 7, Part A Sampling and Sampling Distributions Sampling Distribution of Introduction to Sampling Distributions Point Estimation Simple Random Sampling x

Upload: melissa-shaw

Post on 28-Dec-2015

220 views

Category:

Documents


0 download

TRANSCRIPT

1 1 Slide

Slide

© 2005 Thomson/South-Western

Chapter 7, Part ASampling and Sampling Distributions

x Sampling Distribution of

Introduction to Sampling Distributions

Point Estimation

Simple Random Sampling

2 2 Slide

Slide

© 2005 Thomson/South-Western

The purpose of statistical inference is to obtain information about a population from information contained in a sample.

The purpose of statistical inference is to obtain information about a population from information contained in a sample.

Statistical Inference

A population is the set of all the elements of interest. A population is the set of all the elements of interest.

A sample is a subset of the population. A sample is a subset of the population.

3 3 Slide

Slide

© 2005 Thomson/South-Western

The sample results provide only estimates of the values of the population characteristics. The sample results provide only estimates of the values of the population characteristics.

A parameter is a numerical characteristic of a population. A parameter is a numerical characteristic of a population.

With proper sampling methods, the sample results can provide “good” estimates of the population characteristics.

With proper sampling methods, the sample results can provide “good” estimates of the population characteristics.

Statistical Inference

4 4 Slide

Slide

© 2005 Thomson/South-Western

Simple Random Sampling:Finite Population

Finite populations are often defined by lists such as:• Organization membership roster• Credit card account numbers• Inventory product numbers A simple random sample of size n from a

finite population of size N is a sample selected such that each possible sample of size n has the same probability of being selected.

5 5 Slide

Slide

© 2005 Thomson/South-Western

s is the point estimator of the population standard deviation . s is the point estimator of the population standard deviation .

In point estimation we use the data from the sample to compute a value of a sample statistic that serves as an estimate of a population parameter.

In point estimation we use the data from the sample to compute a value of a sample statistic that serves as an estimate of a population parameter.

Point Estimation

We refer to as the point estimator of the population mean . We refer to as the point estimator of the population mean .

x

is the point estimator of the population proportion p. is the point estimator of the population proportion p.p

6 6 Slide

Slide

© 2005 Thomson/South-Western

Sampling Error

Statistical methods can be used to make probability statements about the size of the sampling error.

Sampling error is the result of using a subset of the population (the sample), and not the entire population.

The absolute value of the difference between an unbiased point estimate and the corresponding population parameter is called the sampling error.

When the expected value of a point estimator is equal to the population parameter, the point estimator is said to be unbiased.

7 7 Slide

Slide

© 2005 Thomson/South-Western

Sampling Error

The sampling errors are:

| |p p for sample proportion

| |s for sample standard deviation

| |x for sample mean

8 8 Slide

Slide

© 2005 Thomson/South-Western

Example: St. Andrew’s

St. Andrew’s College receives900 applications annually fromprospective students. Theapplication form contains a variety of informationincluding the individual’sscholastic aptitude test (SAT) score and whether

or notthe individual desires on-campus housing.

9 9 Slide

Slide

© 2005 Thomson/South-Western

Example: St. Andrew’s

The director of admissionswould like to know thefollowing information:• the average SAT score for the 900 applicants, and• the proportion of

applicants that want to live on campus.

10 10 Slide

Slide

© 2005 Thomson/South-Western

Example: St. Andrew’s

We will now look at threealternatives for obtaining thedesired information. Conducting a census of the entire 900 applicants Selecting a sample of 30

applicants, using a random number table Selecting a sample of 30 applicants, using

Excel

11 11 Slide

Slide

© 2005 Thomson/South-Western

Conducting a Census

If the relevant data for the entire 900 applicants were in the college’s database, the population parameters of interest could be calculated using the formulas presented in Chapter 3.

We will assume for the moment that conducting a census is practical in this example.

12 12 Slide

Slide

© 2005 Thomson/South-Western

990900

ix

2( )80

900ix

Conducting a Census

648.72

900p

Population Mean SAT Score

Population Standard Deviation for SAT Score

Population Proportion Wanting On-Campus Housing

13 13 Slide

Slide

© 2005 Thomson/South-Western

as Point Estimator of x

as Point Estimator of pp

29,910997

30 30ix

x

2( ) 163,99675.2

29 29ix x

s

20 30 .68p

Point Estimation

Note: Different random numbers would haveidentified a different sample which would haveresulted in different point estimates.

s as Point Estimator of

14 14 Slide

Slide

© 2005 Thomson/South-Western

PopulationParameter

PointEstimator

PointEstimate

ParameterValue

m = Population mean SAT score

990 997

s = Population std. deviation for SAT score

80 s = Sample std. deviation for SAT score

75.2

p = Population pro- portion wanting campus housing

.72 .68

Summary of Point EstimatesObtained from a Simple Random Sample

= Sample mean SAT score x

= Sample pro- portion wanting campus housing

p

15 15 Slide

Slide

© 2005 Thomson/South-Western

Process of Statistical Inference

The value of is used tomake inferences about

the value of m.

x The sample data provide a value for

the sample mean .x

A simple random sampleof n elements is selected

from the population.

Population with mean

m = ?

Sampling Distribution of x

16 16 Slide

Slide

© 2005 Thomson/South-Western

The sampling distribution of is the probabilitydistribution of all possible values of the sample mean .

x

x

Sampling Distribution of x

where: = the population mean

E( ) = x

xExpected Value of

17 17 Slide

Slide

© 2005 Thomson/South-Western

Sampling Distribution of x

Finite Population Infinite Population

x n

N nN

( )1

x n

• is referred to as the standard error of the

mean.

x

• A finite population is treated as being infinite if n/N < .05.

• is the finite correction factor.( ) / ( )N n N 1

xStandard Deviation of

18 18 Slide

Slide

© 2005 Thomson/South-Western

Form of the Sampling Distribution of x

If we use a large (n > 30) simple random sample, thecentral limit theorem enables us to conclude that thesampling distribution of can be approximated bya normal distribution.

x

When the simple random sample is small (n < 30),the sampling distribution of can be considerednormal only if we assume the population has anormal distribution.

x

19 19 Slide

Slide

© 2005 Thomson/South-Western

8014.6

30x

n

( ) 990E x x

Sampling Distribution of for SAT Scoresx

SamplingDistribution

of x

20 20 Slide

Slide

© 2005 Thomson/South-Western

What is the probability that a simple random sampleof 30 applicants will provide an estimate of thepopulation mean SAT score that is within +/-10 ofthe actual population mean ? In other words, what is the probability that will bebetween 980 and 1000?

x

Sampling Distribution of for SAT Scoresx

21 21 Slide

Slide

© 2005 Thomson/South-Western

Step 1: Calculate the z-value at the upper endpoint of the interval.

z = (1000 - 990)/14.6= .68

P(z < .68) = .7517

Step 2: Find the area under the curve to the left of the upper endpoint.

Sampling Distribution of for SAT Scoresx

22 22 Slide

Slide

© 2005 Thomson/South-Western

Sampling Distribution of for SAT Scoresx

Cumulative Probabilities for the Standard Normal

Distributionz .00 .01 .02 .03 .04 .05 .06 .07 .08 .09

. . . . . . . . . . .

.5 .6915 .6950 .6985 .7019 .7054 .7088 .7123 .7157 .7190 .7224

.6 .7257 .7291 .7324 .7357 .7389 .7422 .7454 .7486 .7517 .7549

.7 .7580 .7611 .7642 .7673 .7704 .7734 .7764 .7794 .7823 .7852

.8 .7881 .7910 .7939 .7967 .7995 .8023 .8051 .8078 .8106 .8133

.9 .8159 .8186 .8212 .8238 .8264 .8289 .8315 .8340 .8365 .8389

. . . . . . . . . . .

23 23 Slide

Slide

© 2005 Thomson/South-Western

Sampling Distribution of for SAT Scoresx

x990

SamplingDistribution

of x14.6x

1000

Area = .7517

24 24 Slide

Slide

© 2005 Thomson/South-Western

Step 3: Calculate the z-value at the lower endpoint of the interval.

Step 4: Find the area under the curve to the left of the lower endpoint.

z = (980 - 990)/14.6= - .68

P(z < -.68) = P(z > .68)

= .2483= 1 - . 7517

= 1 - P(z < .68)

Sampling Distribution of for SAT Scoresx

25 25 Slide

Slide

© 2005 Thomson/South-Western

Sampling Distribution of for SAT Scoresx

x980 990

Area = .2483

SamplingDistribution

of x14.6x

26 26 Slide

Slide

© 2005 Thomson/South-Western

Sampling Distribution of for SAT Scoresx

Step 5: Calculate the area under the curve between the lower and upper endpoints of the interval.

P(-.68 < z < .68) = P(z < .68) - P(z < -.68)= .7517 - .2483= .5034

The probability that the sample mean SAT score willbe between 980 and 1000 is:

P(980 < < 1000) = .5034x

27 27 Slide

Slide

© 2005 Thomson/South-Western

x1000980 990

Sampling Distribution of for SAT Scoresx

Area = .5034

SamplingDistribution

of x14.6x

28 28 Slide

Slide

© 2005 Thomson/South-Western

Relationship Between the Sample Size and the Sampling Distribution of x

Suppose we select a simple random sample of 100 applicants instead of the 30 originally considered.

E( ) = m regardless of the sample size. In our

example, E( ) remains at 990.

xx

Whenever the sample size is increased, the standard error of the mean is decreased. With the increase in the sample size to n = 100, the standard error of the mean is decreased to:

x

808.0

100x

n

29 29 Slide

Slide

© 2005 Thomson/South-Western

Relationship Between the Sample Size and the Sampling Distribution of x

( ) 990E x x

14.6x With n = 30,

8x With n = 100,

30 30 Slide

Slide

© 2005 Thomson/South-Western

Recall that when n = 30, P(980 < < 1000) = .5034.x

Relationship Between the Sample Size and the Sampling Distribution of x

We follow the same steps to solve for P(980 < < 1000) when n = 100 as we showed earlier when n = 30.

x

Now, with n = 100, P(980 < < 1000) = .7888.x Because the sampling distribution with n = 100 has a smaller standard error, the values of have less variability and tend to be closer to the population mean than the values of with n = 30.

x

x

31 31 Slide

Slide

© 2005 Thomson/South-Western

Relationship Between the Sample Size and the Sampling Distribution of x

x1000980 990

Area = .7888

SamplingDistribution

of x8x