doane chapter 08a

Upload: thomasmcarter

Post on 03-Apr-2018


  • 7/28/2019 Doane Chapter 08a

    Sampling Distributions and Estimation

    Chapter 8

    Sampling Variation

    Estimators and Sampling Distributions

    Sample Mean and the Central Limit Theorem

    Confidence Interval for a Mean (μ) with Known σ

    Confidence Interval for a Mean (μ) with Unknown σ

    Confidence Interval for a Proportion (π)

    Part 1

    Sampling Variation

    Sample statistic: a random variable whose value depends on which population items happen to be included in the random sample.

    Depending on the sample size, the sample statistic could either represent the population well or differ greatly from the population.

    This sampling variation can easily be illustrated.

    McGraw-Hill/Irwin 2007 The McGraw-Hill Companies, Inc. All rights reserved.

    Consider eight random samples of size n = 5 from a large population of GMAT scores for MBA applicants.

    The sample means ($\bar{x}_i$) tend to be close to the population mean (μ = 520.78).
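The sampling variation described above can be sketched with a small simulation. This is only an illustration under stated assumptions: the actual GMAT population is not reproduced here, so the sketch assumes a normal population with μ = 520.78 and σ = 86.8 (the σ value is the one quoted later in these slides), and the function name and seed are hypothetical.

```python
import random
import statistics

# Assumed population parameters (normality is an assumption of this sketch).
MU, SIGMA = 520.78, 86.8

def draw_sample_means(num_samples=8, n=5, seed=1):
    """Draw `num_samples` random samples of size `n` and return their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.mean(rng.gauss(MU, SIGMA) for _ in range(n))
        for _ in range(num_samples)
    ]

means = draw_sample_means()
for i, m in enumerate(means, start=1):
    print(f"sample {i}: mean = {m:.1f}")
print(f"mean of the eight sample means = {statistics.mean(means):.1f}")
```

Changing the seed gives a different set of eight means, which is exactly the sampling variation the slide refers to.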

    [Figure: dot plot of eight samples of size n = 5, and dot plot of the eight sample means]

    Estimators and Sampling Distributions

    Some Terminology

    Estimator: a statistic derived from a sample to infer the value of a population parameter.

    Estimate: the value of the estimator in a particular sample.

    Population parameters are represented by Greek letters and the corresponding statistics by Roman letters.

    Examples of Estimators

    [Table: examples of estimators]

    Sampling Distributions

    The sampling distribution of an estimator is the probability distribution of all possible values the statistic may assume when a random sample of size n is taken.

    An estimator is a random variable since samples vary.

    $\text{Sampling error} = \hat{\theta} - \theta$

    Bias

    Bias is the difference between the expected value of the estimator and the true parameter:

    $\text{Bias} = E(\hat{\theta}) - \theta$

    An estimator is unbiased if $E(\hat{\theta}) = \theta$.

    On average, an unbiased estimator neither overstates nor understates the true parameter.

    Sampling error is random whereas bias is systematic.

    An unbiased estimator avoids systematic error.
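One way to see bias as an expected value is to enumerate every possible sample and average the estimator over all of them. The sketch below is an illustration, not something taken from the slides: it assumes a small hypothetical population {0, 1, 2, 3}, samples of size 2 drawn with replacement, and compares two variance estimators (dividing by n − 1 versus dividing by n).

```python
from itertools import product
from statistics import mean, pvariance

population = [0, 1, 2, 3]        # hypothetical small population
sigma2 = pvariance(population)   # true population variance

def var_estimate(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    xbar = mean(sample)
    return sum((x - xbar) ** 2 for x in sample) / divisor

n = 2
samples = list(product(population, repeat=n))  # all equally likely samples

# Expected value of each estimator = its average over all possible samples.
expected_unbiased = mean(var_estimate(s, n - 1) for s in samples)
expected_biased = mean(var_estimate(s, n) for s in samples)

print("population variance:", sigma2)
print("E[estimate dividing by n - 1]:", expected_unbiased)
print("E[estimate dividing by n]:    ", expected_biased)
print("bias when dividing by n:", expected_biased - sigma2)
```

Individual samples still miss the true variance in either direction (sampling error), but only the divide-by-n estimator is off on average (bias).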


    Efficiency

    Efficiency refers to the variance of the estimator's sampling distribution.

    A more efficient estimator has smaller variance.

    Consistency

    A consistent estimator converges toward the parameter being estimated as the sample size increases.

    Sample Mean and the Central Limit Theorem

    The sample mean is an unbiased estimator of μ; therefore,

    $E(\bar{X}) = E(X) = \mu$

    The standard error of the mean is the standard deviation of the sampling error of $\bar{x}$:

    $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

    If the population is exactly normal, then the sample mean follows a normal distribution.

    For example, the average price, μ, of a 5 GB MP3 player is $80.00 with a standard deviation, σ, equal to $10.00. What will be the mean and standard error from a sample of 20 players?

    $E(\bar{X}) = E(X) = \mu = \$80.00$

    $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{10}{\sqrt{20}} = \$2.236$

    If the distribution of prices for these players is a normal distribution, then the sampling distribution of $\bar{x}$ is N(80.00, 2.236).
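The MP3 player arithmetic can be checked in a couple of lines. This is a minimal sketch; the function name `standard_error` is a hypothetical helper, not something from the slides.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / math.sqrt(n)

se = standard_error(10.00, 20)   # sigma = $10.00, n = 20 players
print(f"standard error = ${se:.3f}")
```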

    Central Limit Theorem (CLT) for a Mean

    If a random sample of size n is drawn from a population with mean μ and standard deviation σ, the distribution of the sample mean $\bar{x}$ approaches a normal distribution with mean μ and standard deviation $\sigma_{\bar{x}} = \sigma/\sqrt{n}$ as the sample size increases.

    If the population is normal, the distribution of the sample mean is normal regardless of sample size.


    Rule of thumb: a sample size of n > 30 is usually enough for the sample mean to be approximately normally distributed.

    A much smaller n will suffice if the population is symmetric.

    Symmetric Population: Uniform Distribution

    For example, consider a uniform population U(500, 1500).

    The central limit theorem predicts that sample means drawn from this population will have a mean of 1000 and a standard error of the mean of $\sigma_{\bar{x}} = \sigma/\sqrt{n}$:

    Predicted S.E.

    n = 1: $288.7/\sqrt{1} = 288.7$

    n = 2: $288.7/\sqrt{2} = 204.1$

    n = 4: $288.7/\sqrt{4} = 144.3$

    n = 16: $288.7/\sqrt{16} = 72.2$
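The predicted standard errors can be compared with a simulation. The sketch below rests on an assumption: the bounds 500 and 1500 are chosen because a continuous uniform population on that range has mean 1000 and σ = 1000/√12 ≈ 288.7, matching the figures quoted on the slide. The helper names, replication count and seed are hypothetical.

```python
import math
import random
import statistics

A, B = 500, 1500   # assumed bounds; a width of 1000 gives sigma of about 288.7

sigma = (B - A) / math.sqrt(12)   # standard deviation of a continuous uniform

def predicted_se(n):
    """CLT prediction for the standard error of the mean."""
    return sigma / math.sqrt(n)

def simulated_se(n, reps=2000, seed=1):
    """Standard deviation of `reps` simulated sample means of size n."""
    rng = random.Random(seed)
    means = [
        statistics.mean(rng.uniform(A, B) for _ in range(n))
        for _ in range(reps)
    ]
    return statistics.stdev(means)

for n in (1, 2, 4, 16):
    print(f"n = {n:2d}: predicted {predicted_se(n):6.1f}, "
          f"simulated {simulated_se(n):6.1f}")
```

The simulated values will not match the predictions exactly, since they are themselves estimates from a finite number of replications.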

    [Figure: histograms of sample means from the uniform population]

    Skewed Population: Waiting Time

    Consider a strongly skewed population for waiting times at airport security screening with μ = 2.983 and σ = 2.451.

    The CLT predicts that sample means drawn from this population will have a mean of 2.983 minutes and a standard error of the mean of $\sigma_{\bar{x}} = \sigma/\sqrt{n}$:

    Predicted S.E.

    n = 1: $2.451/\sqrt{1} = 2.451$

    n = 2: $2.451/\sqrt{2} = 1.733$

    n = 4: $2.451/\sqrt{4} = 1.226$

    n = 16: $2.451/\sqrt{16} = 0.613$

    [Figure: histograms of sample means from the skewed population]

    Range of Sample Means

    The CLT permits us to define a range or interval within which the sample means are expected to fall.

    If we know μ and σ, the sample means for samples of size n are predicted to fall within:

    $\mu \pm z \frac{\sigma}{\sqrt{n}}$

    where z is from the standard normal table.

    90% interval: $\mu \pm 1.645 \frac{\sigma}{\sqrt{n}}$

    95% interval: $\mu \pm 1.960 \frac{\sigma}{\sqrt{n}}$

    99% interval: $\mu \pm 2.576 \frac{\sigma}{\sqrt{n}}$
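The z values for the three intervals need not come from a printed table. As a sketch, Python's standard library can compute them with `statistics.NormalDist` (available from Python 3.8); the helper name `z_for_confidence` is hypothetical.

```python
from statistics import NormalDist

def z_for_confidence(level):
    """Two-tailed z: leaves (1 - level) / 2 in each tail of N(0, 1)."""
    return NormalDist().inv_cdf(1 - (1 - level) / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} interval: z = {z_for_confidence(level):.3f}")
```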

    Illustration: GMAT Scores

    For samples of size n = 5 applicants, within what range would GMAT means be expected to fall?

    The parameters are μ = 520.78 and σ = 86.8.

    The predicted range for 95% of the sample means is:

    $\mu \pm 1.960 \frac{\sigma}{\sqrt{n}} = 520.78 \pm 1.960 \frac{86.8}{\sqrt{5}} = 520.78 \pm 76.08$
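The GMAT calculation can be reproduced as follows. This is a minimal sketch using the slide's values; `predicted_range` is a hypothetical helper name.

```python
import math

def predicted_range(mu, sigma, n, z):
    """Return (half_width, lower, upper) for mu +/- z * sigma / sqrt(n)."""
    half_width = z * sigma / math.sqrt(n)
    return half_width, mu - half_width, mu + half_width

half_width, lower, upper = predicted_range(mu=520.78, sigma=86.8, n=5, z=1.960)
print(f"520.78 +/- {half_width:.2f}: from {lower:.2f} to {upper:.2f}")
```

With only five applicants per sample, the range is wide: roughly 445 to 597.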

    Sample Size and Standard Error

    The standard error declines as n increases, but at a decreasing rate.

    Make the interval $\mu \pm z \frac{\sigma}{\sqrt{n}}$ small by increasing n.

    The distribution of sample means collapses at the true population mean μ as n increases.

    Illustration: All Possible Samples from a Uniform Population

    Consider a discrete uniform population consisting of the integers {0, 1, 2, 3}.

    The population parameters are: μ = 1.5, σ = 1.118

    All possible samples of size n = 2, with replacement, are given below along with their means.

    [Table: the 16 possible samples of size n = 2 and their means]

    The population is uniform, yet the distribution of all possible sample means has a peaked triangular shape.

    The CLT's predictions for the mean and standard error are

    $\mu_{\bar{x}} = \mu = 1.5$ and $\sigma_{\bar{x}} = \sigma/\sqrt{n} = 1.118/\sqrt{2} = 0.7906$

    The mean of means is

    $\bar{\bar{x}} = \frac{1(0.0) + 2(0.5) + 3(1.0) + 4(1.5) + 3(2.0) + 2(2.5) + 1(3.0)}{16} = 1.5$

    The standard deviation of the means is

    $\sigma_{\bar{x}} = \sqrt{\frac{1(0.0-1.5)^2 + 2(0.5-1.5)^2 + 3(1.0-1.5)^2 + 4(1.5-1.5)^2 + 3(2.0-1.5)^2 + 2(2.5-1.5)^2 + 1(3.0-1.5)^2}{16}} = \sqrt{\frac{10}{16}} = 0.7906$
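Because the population {0, 1, 2, 3} is so small, the full sampling distribution for n = 2 can be enumerated directly. The sketch below lists all 16 samples drawn with replacement, counts how often each sample mean occurs, and computes the mean and standard deviation of the means for comparison with the CLT predictions.

```python
from collections import Counter
from itertools import product
from statistics import mean, pstdev

population = [0, 1, 2, 3]
n = 2

# Every ordered sample of size 2 drawn with replacement (4 * 4 = 16 samples).
samples = list(product(population, repeat=n))
sample_means = [sum(s) / n for s in samples]

# Frequency of each possible sample mean (the triangular shape).
frequencies = Counter(sample_means)
for value in sorted(frequencies):
    print(f"mean {value:.1f}: {'*' * frequencies[value]}")

mean_of_means = mean(sample_means)
sd_of_means = pstdev(sample_means)   # population SD: all 16 outcomes are known
print(f"mean of means = {mean_of_means}, SD of means = {sd_of_means:.4f}")
```

`pstdev` is used rather than `stdev` because the 16 means are the complete set of outcomes, not a sample of them.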

    Confidence Interval for a Mean (μ) with Known σ

    What is a Confidence Interval?

    A sample mean $\bar{x}$ is a point estimate of the population mean μ.

    A confidence interval for the mean is a range $\mu_{lower} < \mu < \mu_{upper}$.

    The confidence level is the probability that the confidence interval contains the true population mean.

    The confidence level (usually expressed as a %) is the area under the curve of the sampling distribution.

    The confidence interval for μ with known σ is:

    $\bar{x} - z \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z \frac{\sigma}{\sqrt{n}}$

    Choosing a Confidence Level

    A higher confidence level leads to a wider confidence interval.

    Greater confidence implies loss of precision.

    95% confidence is most often used.

    Interpretation

    A confidence interval either does or does not contain μ.

    The confidence level quantifies the risk.

    Out of 100 confidence intervals constructed at the 95% level, approximately 95 would contain μ, while approximately 5 would not contain μ.

    Is σ Ever Known?

    Yes, but not very often.

    In quality control applications with ongoing manufacturing processes, it is often reasonable to assume σ stays the same over time.

    In this case, confidence intervals are used to construct control charts to track the mean of a process over time.

    Confidence Interval for a Mean (μ) with Unknown σ

    Student's t Distribution

    Use the Student's t distribution instead of the normal distribution when the population is normal but the standard deviation σ is unknown and the sample size is small.

    The confidence interval for μ (unknown σ) is

    $\bar{x} \pm t \frac{s}{\sqrt{n}}$, that is, $\bar{x} - t \frac{s}{\sqrt{n}} < \mu < \bar{x} + t \frac{s}{\sqrt{n}}$

    Confidence Interval for a Proportion (π)

    When Is It Safe to Assume Normality?

    Sample size to assume normality: nπ ≥ 10 and n(1 − π) ≥ 10.

    Standard Error of the Proportion

    The standard error of the proportion $\sigma_p$ depends on π, as well as n.

    It is largest when π is near .50 and smaller when π is near 0 or 1.

    The formula for the standard error is symmetric.

    Enlarging n reduces the standard error $\sigma_p$ but at a diminishing rate.
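The three properties listed above can be checked numerically. The sketch assumes the usual formula for the standard error of a proportion, sqrt(π(1 − π)/n), which is the quantity that appears inside the confidence interval for π; the function name and the sample sizes are hypothetical choices for illustration.

```python
import math

def se_proportion(pi, n):
    """Standard error of a sample proportion: sqrt(pi * (1 - pi) / n)."""
    return math.sqrt(pi * (1 - pi) / n)

# Largest near pi = .50, and symmetric around it.
for pi in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"pi = {pi:.1f}, n = 100: SE = {se_proportion(pi, 100):.4f}")

# Quadrupling n only halves the standard error (diminishing returns).
for n in (100, 400, 1600):
    print(f"pi = 0.5, n = {n:4d}: SE = {se_proportion(0.5, n):.4f}")
```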

    Confidence Interval for π

    The confidence interval for π is

    $p \pm z \sqrt{\frac{\pi(1-\pi)}{n}}$

    Since π is unknown, the confidence interval for π using p = x/n (assuming a large sample) is

    $p \pm z \sqrt{\frac{p(1-p)}{n}}$

    where z is based on the desired confidence.

    z can be chosen for any confidence level. For example, z = 1.645 for 90% confidence, z = 1.960 for 95% confidence, and z = 2.576 for 99% confidence.

    Example: Auditing

    A sample of 75 retail in-store purchases showed that 24 were paid in cash. What is p?

    p = x/n = 24/75 = .32

    Is p normally distributed?

    np = (75)(.32) = 24

    n(1 − p) = (75)(.68) = 51

    Both are > 10, so we may conclude normality.

    The 95% confidence interval for the proportion of retail in-store purchases that are paid in cash is:

    $p \pm z \sqrt{\frac{p(1-p)}{n}} = .32 \pm 1.96 \sqrt{\frac{.32(1-.32)}{75}} = .32 \pm .106$

    .214 < π < .426

    We are 95% confident that this interval contains the true population proportion.
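The auditing example can be reproduced with a short function. This is a sketch of the large-sample (normal approximation) interval only; `proportion_ci` is a hypothetical name, and z is computed with `statistics.NormalDist` rather than read from a table.

```python
import math
from statistics import NormalDist

def proportion_ci(x, n, level=0.95):
    """Large-sample confidence interval for a proportion: p +/- z*sqrt(p(1-p)/n)."""
    p = x / n
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

low, high = proportion_ci(24, 75)   # 24 cash purchases out of 75
print(f"{low:.3f} < pi < {high:.3f}")
```

The function does not check the normality condition (both np and n(1 − p) at least 10); for small samples that check should come first.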

    Narrowing the Interval

    The width of the confidence interval for π depends on

    - the sample size
    - the confidence level
    - the sample proportion p

    To obtain a narrower interval (i.e., more precision) either

    - increase the sample size
    - reduce the confidence level

    Using Excel and MegaStat

    To find a confidence interval for a proportion in Excel, use (for example, for a 90% interval with p = 0.15 and n = 200)

    =0.15-NORMSINV(.95)*SQRT(0.15*(1-0.15)/200)

    =0.15+NORMSINV(.95)*SQRT(0.15*(1-0.15)/200)

    In MegaStat, enter p and n to obtain the confidence interval for a proportion.

    MegaStat always assumes normality.

    If the sample is small, the distribution of p may not be well approximated by the normal.

    Confidence limits around p can be constructed by using the binomial distribution.

    Polls and Margin of Error

    In polls and surveys, the half-width of the confidence interval when π = .50 is called the margin of error.

    [Table: margins of error for a 95% confidence interval assuming π = .50, by sample size]

    Each reduction in the margin of error requires a disproportionately larger sample size.
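The relationship between sample size and margin of error can be sketched as follows. The sample sizes and target margins are hypothetical values chosen for illustration, not the ones from the original table; the helper names are also assumptions.

```python
import math

Z_95 = 1.960   # z for 95% confidence

def margin_of_error(n, pi=0.50, z=Z_95):
    """Half-width of the interval: z * sqrt(pi * (1 - pi) / n)."""
    return z * math.sqrt(pi * (1 - pi) / n)

def sample_size_for(margin, pi=0.50, z=Z_95):
    """Smallest n whose margin of error does not exceed `margin`."""
    return math.ceil(z ** 2 * pi * (1 - pi) / margin ** 2)

for n in (100, 400, 1600):
    print(f"n = {n:4d}: margin of error = {margin_of_error(n):.4f}")

for margin in (0.05, 0.025):
    print(f"margin {margin}: n of at least {sample_size_for(margin)}")
```

Halving the margin of error takes roughly four times the sample size, which is the disproportionate cost the slide describes.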

    Rule of Three

    If in n independent trials, no events occur, the upper 95% confidence bound is approximately 3/n.

    Very Quick Rule

    A Very Quick Rule (VQR) for a 95% confidence interval when p is near .50 is

    $p \pm 1/\sqrt{n}$
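Both shortcuts can be compared with the calculations they approximate. In this sketch the "exact" bound for the Rule of Three is taken to be the largest π for which zero events in n independent trials still has probability 0.05, that is, the solution of (1 − π)^n = 0.05; the Very Quick Rule is compared with the normal half-width at p = .50. Function names and the choice of n are hypothetical.

```python
import math

def exact_upper_bound(n, level=0.95):
    """Solve (1 - pi)**n = 1 - level for pi (zero events seen in n trials)."""
    return 1 - (1 - level) ** (1 / n)

def rule_of_three(n):
    return 3 / n

def vqr_half_width(n):
    return 1 / math.sqrt(n)

def normal_half_width(n, p=0.50, z=1.960):
    return z * math.sqrt(p * (1 - p) / n)

for n in (50, 100, 400):
    print(f"n = {n:3d}: exact bound {exact_upper_bound(n):.4f}, "
          f"3/n = {rule_of_three(n):.4f}; "
          f"normal half-width {normal_half_width(n):.4f}, "
          f"1/sqrt(n) = {vqr_half_width(n):.4f}")
```

The Very Quick Rule works because 1.960 × sqrt(.25) = 0.98, which is close to 1.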

    Applied Statistics in Business and Economics

    End of Part 1 of Chapter 8