sbe10_07a
Slide 1 © 2005 Thomson/South-Western
Chapter 7, Part A
Sampling and Sampling Distributions
■ Simple Random Sampling
■ Point Estimation
■ Introduction to Sampling Distributions
■ Sampling Distribution of x̄
Slide 2
Statistical Inference
The purpose of statistical inference is to obtain information about a population from information contained in a sample.
A population is the set of all the elements of interest.
A sample is a subset of the population.
Slide 3
Statistical Inference
The sample results provide only estimates of the values of the population characteristics.
With proper sampling methods, the sample results can provide “good” estimates of the population characteristics.
A parameter is a numerical characteristic of a population.
Slide 4
Simple Random Sampling: Finite Population
■ Finite populations are often defined by lists such as:
• Organization membership roster
• Credit card account numbers
• Inventory product numbers
■ A simple random sample of size n from a finite population of size N is a sample selected such that each possible sample of size n has the same probability of being selected.
Slide 5
Simple Random Sampling: Finite Population
■ Replacing each sampled element before selecting subsequent elements is called sampling with replacement.
■ Sampling without replacement is the procedure used most often.
■ In large sampling projects, computer-generated random numbers are often used to automate the sample selection process.
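The difference between the two procedures can be sketched in a few lines of Python. This is only an illustration: the population of applicant numbers 1 to 900, the sample size of 30, and the fixed seed are assumptions chosen for the sketch, not part of the slides.

```python
import random

# Hypothetical finite population: element numbers 1..900 (N = 900).
population = list(range(1, 901))
rng = random.Random(42)  # fixed seed only so the sketch is repeatable

# Sampling without replacement: an element can be selected at most once.
without_replacement = rng.sample(population, k=30)

# Sampling with replacement: every draw is made from the full population,
# so the same element may be selected more than once.
with_replacement = rng.choices(population, k=30)

print(sorted(without_replacement))
print(sorted(with_replacement))
```

random.sample never repeats an element, while random.choices may, which is exactly the distinction drawn on the slide.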
Slide 6
Simple Random Sampling: Infinite Population
■ Infinite populations are often defined by an ongoing process whereby the elements of the population consist of items generated as though the process would operate indefinitely.
■ A simple random sample from an infinite population is a sample selected such that the following conditions are satisfied:
• Each element selected comes from the same population.
• Each element is selected independently.
Slide 7
Simple Random Sampling: Infinite Population
■ In the case of infinite populations, it is impossible to obtain a list of all elements in the population.
■ The random number selection procedure cannot be used for infinite populations.
Slide 8
Point Estimation
In point estimation we use the data from the sample to compute a value of a sample statistic that serves as an estimate of a population parameter.
We refer to x̄ as the point estimator of the population mean µ.
s is the point estimator of the population standard deviation σ.
p̄ is the point estimator of the population proportion p.
Slide 9
Sampling Error
■ When the expected value of a point estimator is equal to the population parameter, the point estimator is said to be unbiased.
■ The absolute value of the difference between an unbiased point estimate and the corresponding population parameter is called the sampling error.
■ Sampling error is the result of using a subset of the population (the sample), and not the entire population.
■ Statistical methods can be used to make probability statements about the size of the sampling error.
Slide 10
Sampling Error
■ The sampling errors are:
$|\bar{x} - \mu|$ for sample mean
$|s - \sigma|$ for sample standard deviation
$|\bar{p} - p|$ for sample proportion
Slide 11
Example: St. Andrew’s
St. Andrew’s College receives 900 applications annually from prospective students. The application form contains a variety of information including the individual’s scholastic aptitude test (SAT) score and whether or not the individual desires on-campus housing.
Slide 12
Example: St. Andrew’s
The director of admissions would like to know the following information:
• the average SAT score for the 900 applicants, and
• the proportion of applicants that want to live on campus.
Slide 13
Example: St. Andrew’s
We will now look at three alternatives for obtaining the desired information.
■ Conducting a census of the entire 900 applicants
■ Selecting a sample of 30 applicants, using a random number table
■ Selecting a sample of 30 applicants, using Excel
Slide 14
Conducting a Census
■ If the relevant data for the entire 900 applicants were in the college’s database, the population parameters of interest could be calculated using the formulas presented in Chapter 3.
■ We will assume for the moment that conducting a census is practical in this example.
Slide 15
Conducting a Census
■ Population Mean SAT Score
$\mu = \frac{\sum x_i}{900} = 990$
■ Population Standard Deviation for SAT Score
$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{900}} = 80$
■ Population Proportion Wanting On-Campus Housing
$p = \frac{648}{900} = .72$
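The three census formulas can be sketched in Python as follows. The four-applicant "population" below is hypothetical and exists only to make the arithmetic visible; it is not the St. Andrew’s data, and the variable names are assumptions.

```python
from math import sqrt

# Hypothetical miniature population of N = 4 applicants (not the real data).
sat_scores = [900, 1000, 1100, 960]
wants_housing = [True, True, False, True]

N = len(sat_scores)

# Population mean: sum of all values divided by N.
mu = sum(sat_scores) / N

# Population standard deviation: divide by N (not N - 1) because this is a census.
sigma = sqrt(sum((x - mu) ** 2 for x in sat_scores) / N)

# Population proportion: count of "yes" answers divided by N.
p = sum(wants_housing) / N

print(mu, round(sigma, 1), p)
```

With the real 900 records in place of the toy lists, the same three lines would give the slide's values of 990, 80, and .72.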
Slide 16
Simple Random Sampling
■ Now suppose that the necessary data on the current year’s applicants were not yet entered in the college’s database.
■ Furthermore, the Director of Admissions must obtain estimates of the population parameters of interest for a meeting taking place in a few hours.
■ She decides a sample of 30 applicants will be used.
■ The applicants were numbered, from 1 to 900, as their applications arrived.
Slide 17
Simple Random Sampling: Using a Random Number Table
■ Taking a Sample of 30 Applicants
• Because the finite population has 900 elements, we will need 3-digit random numbers to randomly select applicants numbered from 1 to 900.
• We will use the last three digits of the 5-digit random numbers in the third column of the textbook’s random number table, and continue into the fourth column as needed.
Slide 18
Simple Random Sampling: Using a Random Number Table
■ Taking a Sample of 30 Applicants
• The numbers we draw will be the numbers of the applicants we will sample unless
  • the random number is greater than 900 or
  • the random number has already been used.
• We will continue to draw random numbers until we have selected 30 applicants for our sample.
• (We will go through all of column 3 and part of column 4 of the random number table, encountering in the process five numbers greater than 900 and one duplicate, 835.)
Slide 19
Simple Random Sampling: Using a Random Number Table
■ Use of Random Numbers for Sampling
3-Digit Random Number   Applicant Included in Sample
744                     No. 744
436                     No. 436
865                     No. 865
790                     No. 790
835                     No. 835
902                     Number exceeds 900
190                     No. 190
836                     No. 836
. . . and so on
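The selection rule from the random number table (skip numbers above 900, skip numbers already used, stop at 30) can be sketched as below. The function name and the handling of a value such as 000 are assumptions made for this sketch; only the eight random numbers come from the slide.

```python
def select_applicants(random_numbers, population_size, sample_size):
    """Apply the table rule: skip out-of-range and already-used numbers."""
    selected = []
    for number in random_numbers:
        # Assumption: anything outside 1..population_size (e.g. 000) is skipped.
        if number < 1 or number > population_size:
            continue
        # A number that has already been used is skipped as a duplicate.
        if number in selected:
            continue
        selected.append(number)
        if len(selected) == sample_size:
            break
    return selected


# The first eight 3-digit random numbers shown on the slide.
table_numbers = [744, 436, 865, 790, 835, 902, 190, 836]
print(select_applicants(table_numbers, population_size=900, sample_size=30))
# prints [744, 436, 865, 790, 835, 190, 836]
```

Only 902 is dropped from these eight numbers, matching the "Number exceeds 900" row in the table.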
Slide 20
Simple Random Sampling: Using a Random Number Table
■ Sample Data
No.   Random Number   Applicant        SAT Score   Live On-Campus
1     744             Conrad Harris    1025        Yes
2     436             Enrique Romero   950         Yes
3     865             Fabian Avante    1090        No
4     790             Lucila Cruz      1120        Yes
5     835             Chan Chiang      930         No
.     .               .                .           .
30    498             Emily Morse      1010        No
Slide 21
Simple Random Sampling: Using a Computer
■ Taking a Sample of 30 Applicants
• Computers can be used to generate random numbers for selecting random samples.
• For example, Excel’s function = RANDBETWEEN(1,900) can be used to generate random numbers between 1 and 900.
• Then we choose the 30 applicants corresponding to the 30 smallest random numbers as our sample.
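One reading of the last bullet is that every applicant is paired with a random number and the 30 applicants with the smallest numbers form the sample. The Python sketch below follows that reading. It is an assumption-laden illustration: it uses uniform random numbers from random.random() rather than Excel's RANDBETWEEN (to avoid ties), and the function name and seed are hypothetical.

```python
import random


def sample_by_smallest_random_numbers(applicant_ids, sample_size, rng):
    """Pair each applicant with a random number; keep the smallest ones."""
    # One uniform random number in [0, 1) per applicant.
    keyed = [(rng.random(), applicant_id) for applicant_id in applicant_ids]
    # Sorting by the random number puts the applicants in random order.
    keyed.sort()
    return [applicant_id for _, applicant_id in keyed[:sample_size]]


rng = random.Random(2005)  # fixed seed only so the sketch is repeatable
sample = sample_by_smallest_random_numbers(range(1, 901), 30, rng)
print(sorted(sample))
```

Because every ordering of the applicants is equally likely, every set of 30 applicants has the same chance of being chosen, which is the definition of a simple random sample from a finite population.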
Slide 22
Point Estimation
■ x̄ as Point Estimator of µ
$\bar{x} = \frac{\sum x_i}{30} = \frac{29{,}910}{30} = 997$
■ s as Point Estimator of σ
$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{29}} = \sqrt{\frac{163{,}996}{29}} = 75.2$
■ p̄ as Point Estimator of p
$\bar{p} = 20/30 = .68$
Note: Different random numbers would have identified a different sample which would have resulted in different point estimates.
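As a quick arithmetic check, the sample mean and sample standard deviation can be recomputed from the two sums reported on the slide. The variable names are assumptions; the sums 29,910 and 163,996 and the sample size 30 are taken from the slide.

```python
from math import sqrt

n = 30                 # sample size
sum_x = 29_910         # sum of the 30 sampled SAT scores (from the slide)
sum_sq_dev = 163_996   # sum of squared deviations from the sample mean (from the slide)

# Sample mean: sum of observations divided by n.
x_bar = sum_x / n

# Sample standard deviation: divide by n - 1 = 29, not n.
s = sqrt(sum_sq_dev / (n - 1))

print(x_bar, round(s, 1))  # prints 997.0 75.2
```

Note the n - 1 divisor for s, in contrast to the divisor N used for the population standard deviation σ in the census calculation.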
Slide 23
Summary of Point Estimates Obtained from a Simple Random Sample
Population Parameter                               Parameter Value   Point Estimator                                 Point Estimate
µ = Population mean SAT score                      990               x̄ = Sample mean SAT score                       997
σ = Population std. deviation for SAT score        80                s = Sample std. deviation for SAT score         75.2
p = Population proportion wanting campus housing   .72               p̄ = Sample proportion wanting campus housing    .68
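Using the sampling error definition given earlier (the absolute difference between a point estimate and the corresponding parameter), the sampling errors for this particular sample can be sketched as follows. The dictionary layout and labels are assumptions; the numbers are the parameter values and point estimates from the summary.

```python
# Parameter values and point estimates as reported in the summary.
parameters = {
    "mean SAT score": 990,
    "std. deviation of SAT score": 80,
    "proportion wanting housing": 0.72,
}
estimates = {
    "mean SAT score": 997,
    "std. deviation of SAT score": 75.2,
    "proportion wanting housing": 0.68,
}

# Sampling error = |point estimate - population parameter|.
sampling_errors = {
    name: abs(estimates[name] - parameters[name]) for name in parameters
}

for name, error in sampling_errors.items():
    print(f"{name}: {error:.2f}")
```

In practice the parameter values are unknown, so these errors cannot be computed directly; that is why the sampling distribution is needed to make probability statements about them.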
Slide 24
Sampling Distribution of x̄
■ Process of Statistical Inference
• Population with mean µ = ?
• A simple random sample of n elements is selected from the population.
• The sample data provide a value for the sample mean x̄.
• The value of x̄ is used to make inferences about the value of µ.
Slide 25
Sampling Distribution of x̄
The sampling distribution of x̄ is the probability distribution of all possible values of the sample mean x̄.
Expected Value of x̄
$E(\bar{x}) = \mu$
where: µ = the population mean
Slide 26
Sampling Distribution of x̄
Standard Deviation of x̄
Finite Population: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$
Infinite Population: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
• A finite population is treated as being infinite if n/N ≤ .05.
• $\sqrt{(N-n)/(N-1)}$ is the finite correction factor.
• $\sigma_{\bar{x}}$ is referred to as the standard error of the mean.
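The two standard error formulas and the .05 rule can be combined in one small Python function. This is a sketch under stated assumptions: the function name, the optional N argument, and the choice to treat n/N at or below .05 as the infinite case are illustrative, not taken from the slides.

```python
from math import sqrt


def standard_error(sigma, n, N=None, threshold=0.05):
    """Standard error of the mean, with the finite correction when needed."""
    se = sigma / sqrt(n)
    # Assumption: apply the finite correction factor only when the population
    # size N is known and the sampling fraction n/N exceeds the threshold.
    if N is not None and n / N > threshold:
        se *= sqrt((N - n) / (N - 1))
    return se


# St. Andrew's: n/N = 30/900 is below .05, so no correction is applied.
print(round(standard_error(80, 30, N=900), 1))  # prints 14.6

# Hypothetical case with a large sampling fraction (n/N = 50/200 = .25).
print(round(standard_error(80, 50, N=200), 2))
```

The correction factor is always below 1, so for a finite population the corrected standard error is never larger than σ/√n.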
Slide 27
Form of the Sampling Distribution of x̄
If we use a large (n ≥ 30) simple random sample, the central limit theorem enables us to conclude that the sampling distribution of x̄ can be approximated by a normal distribution.
When the simple random sample is small (n < 30), the sampling distribution of x̄ can be considered normal only if we assume the population has a normal distribution.
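A small simulation can make the central limit theorem more concrete. The sketch below is purely illustrative: the exponential population (mean 1, standard deviation 1, strongly skewed), the 5,000 replications, and the seed are all assumptions chosen for the demonstration.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

rng = random.Random(7)   # fixed seed only so the sketch is repeatable
n = 30                   # size of each simple random sample
replications = 5000      # number of samples drawn

# Draw many samples from a skewed (exponential, rate 1) population and
# record the sample mean of each one.
sample_means = [
    fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(replications)
]

# The simulated means should center near the population mean (1) with a
# spread near sigma / sqrt(n) = 1 / sqrt(30).
print(round(fmean(sample_means), 3))
print(round(pstdev(sample_means), 3), round(1 / sqrt(n), 3))
```

Even though the population is far from normal, a histogram of sample_means would look roughly bell-shaped, with center and spread close to µ and σ/√n.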
Slide 28
Sampling Distribution of x̄ for SAT Scores
$E(\bar{x}) = 990$
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{80}{\sqrt{30}} = 14.6$
[Figure: sampling distribution of x̄, centered at E(x̄) = 990]
Slide 29
Sampling Distribution of x̄ for SAT Scores
What is the probability that a simple random sample of 30 applicants will provide an estimate of the population mean SAT score that is within +/−10 of the actual population mean µ?
In other words, what is the probability that x̄ will be between 980 and 1000?
Slide 30
Sampling Distribution of x̄ for SAT Scores
Step 1: Calculate the z-value at the upper endpoint of the interval.
z = (1000 - 990)/14.6 = .68
Step 2: Find the area under the curve to the left of the upper endpoint.
P(z < .68) = .7517
Slide 31
Sampling Distribution of x̄ for SAT Scores
Cumulative Probabilities for the Standard Normal Distribution
z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
. . . . . . . . . . .
.5 .6915 .6950 .6985 .7019 .7054 .7088 .7123 .7157 .7190 .7224
.6 .7257 .7291 .7324 .7357 .7389 .7422 .7454 .7486 .7517 .7549
.7 .7580 .7611 .7642 .7673 .7704 .7734 .7764 .7794 .7823 .7852
.8 .7881 .7910 .7939 .7967 .7995 .8023 .8051 .8078 .8106 .8133
.9 .8159 .8186 .8212 .8238 .8264 .8289 .8315 .8340 .8365 .8389
. . . . . . . . . . .
Slide 32
Sampling Distribution of x̄ for SAT Scores
[Figure: sampling distribution of x̄ centered at 990, with $\sigma_{\bar{x}} = 14.6$; the area to the left of x̄ = 1000 is .7517]
Slide 33
Sampling Distribution of x̄ for SAT Scores
Step 3: Calculate the z-value at the lower endpoint of the interval.
z = (980 - 990)/14.6 = -.68
Step 4: Find the area under the curve to the left of the lower endpoint.
P(z < -.68) = P(z > .68)
= 1 - P(z < .68)
= 1 - .7517
= .2483
Slide 34
Sampling Distribution of x̄ for SAT Scores
[Figure: sampling distribution of x̄ centered at 990, with $\sigma_{\bar{x}} = 14.6$; the area to the left of x̄ = 980 is .2483]
Slide 35
Sampling Distribution of x̄ for SAT Scores
Step 5: Calculate the area under the curve between the lower and upper endpoints of the interval.
P(-.68 < z < .68) = P(z < .68) - P(z < -.68)
= .7517 - .2483
= .5034
The probability that the sample mean SAT score will be between 980 and 1000 is:
P(980 < x̄ < 1000) = .5034
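The five steps can be reproduced without a printed table by computing the standard normal cumulative probability from the error function. The sketch below is one way to do it, assuming the z-values are rounded to two decimals as they would be for a table lookup; the helper name standard_normal_cdf is hypothetical.

```python
from math import erf, sqrt


def standard_normal_cdf(z):
    """P(Z <= z) for a standard normal variable, via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


mu, sigma, n = 990, 80, 30
standard_error = sigma / sqrt(n)  # about 14.6

# Steps 1 and 3: z-values at the endpoints, rounded as for a table lookup.
z_upper = round((1000 - mu) / standard_error, 2)
z_lower = round((980 - mu) / standard_error, 2)

# Steps 2, 4 and 5: area to the left of each endpoint, then the difference.
probability = standard_normal_cdf(z_upper) - standard_normal_cdf(z_lower)

print(z_upper, z_lower, round(probability, 3))
```

The result agrees with the table-based answer of .5034 to about three decimal places; the small remaining difference comes from the four-decimal rounding of the table entries.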
Slide 36
Sampling Distribution of x̄ for SAT Scores
[Figure: sampling distribution of x̄ centered at 990, with $\sigma_{\bar{x}} = 14.6$; the area between x̄ = 980 and x̄ = 1000 is .5034]
Slide 37
Relationship Between the Sample Size and the Sampling Distribution of x̄
■ Suppose we select a simple random sample of 100 applicants instead of the 30 originally considered.
■ E(x̄) = µ regardless of the sample size. In our example, E(x̄) remains at 990.
■ Whenever the sample size is increased, the standard error of the mean $\sigma_{\bar{x}}$ is decreased. With the increase in the sample size to n = 100, the standard error of the mean is decreased to:
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{80}{\sqrt{100}} = 8.0$
Slide 38
Relationship Between the Sample Size and the Sampling Distribution of x̄
[Figure: two sampling distributions of x̄, both centered at E(x̄) = 990; with n = 30, $\sigma_{\bar{x}} = 14.6$; with n = 100, $\sigma_{\bar{x}} = 8$]
Slide 39
Relationship Between the Sample Size and the Sampling Distribution of x̄
■ Recall that when n = 30, P(980 < x̄ < 1000) = .5034.
■ We follow the same steps to solve for P(980 < x̄ < 1000) when n = 100 as we showed earlier when n = 30.
■ Now, with n = 100, P(980 < x̄ < 1000) = .7888.
■ Because the sampling distribution with n = 100 has a smaller standard error, the values of x̄ have less variability and tend to be closer to the population mean than the values of x̄ with n = 30.
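The effect of the sample size can be sketched by wrapping the same calculation in a function of n. The sketch follows the slides in using σ/√n for both sample sizes and assumes a normal sampling distribution; the function name prob_within is hypothetical. Because it does not round z to two decimals, the n = 30 result comes out slightly above the table-based .5034.

```python
from math import erf, sqrt


def prob_within(sigma, n, margin):
    """P(|x_bar - mu| < margin) when x_bar is normal with std. error sigma/sqrt(n)."""
    standard_error = sigma / sqrt(n)
    z = margin / standard_error
    # For a standard normal Z, P(-z < Z < z) = erf(z / sqrt(2)).
    return erf(z / sqrt(2.0))


for n in (30, 100):
    print(n, round(80 / sqrt(n), 1), round(prob_within(80, n, 10), 4))
```

Raising n from 30 to 100 lowers the standard error from about 14.6 to 8 and raises the probability of landing within 10 points of µ from roughly one half to roughly .79.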
Slide 40
Relationship Between the Sample Size and the Sampling Distribution of x̄
[Figure: sampling distribution of x̄ centered at 990, with $\sigma_{\bar{x}} = 8$; the area between x̄ = 980 and x̄ = 1000 is .7888]
Slide 41
End of Chapter 7, Part A