10.1: Confidence Intervals – The Basics

Upload: olivia-little

Post on 16-Jan-2016



Introduction

Is caffeine dependence real? What proportion of college students engage in binge drinking? How do we answer these questions?

Statistical inference provides methods for drawing conclusions about a population from sample data.

When using statistical inference, we are acting as if the data are a random sample or come from a randomized experiment.


Ex 1: IQ and Admissions

Harvard’s admissions director proposes using the IQ scores of current students as a marketing tool. The director gives the IQ test to an SRS of 50 of Harvard’s 5000 freshmen. The mean IQ score is x-bar = 112. What can the director say about the mean score μ of the population of all 5000 freshmen?


The mean of the sampling distribution of x-bar is the same as the unknown mean μ of the population.

The standard deviation of x-bar for an SRS of 50 freshmen is σ / √50. If σ = 15, then the standard deviation of x-bar is 15 / √50 ≈ 2.1.

The central limit theorem tells us that the mean x-bar of 50 scores has a distribution that is close to Normal.


These facts give us the reasoning of statistical estimation in a nutshell…

1. To estimate the unknown population mean μ, use the mean x-bar of our random sample.

2. Although x-bar is an unbiased estimate of μ, it will rarely be exactly equal to μ, so our estimate has some error.

3. In repeated samples, the values of x-bar follow an approximately Normal distribution with mean μ and standard deviation 2.1.

4. Whenever x-bar is within 4.2 points of μ, μ is within 4.2 points of x-bar. Because 4.2 is two standard deviations of x-bar, the 68-95-99.7 rule says this happens in about 95% of all possible samples.

The BIG IDEA is that the sampling distribution of x-bar tells us how big the error is likely to be when we use x-bar to estimate μ.


Ex 2: Estimation in Pictures

Imagine taking many SRSs of 50 freshmen. Suppose three of these samples give x-bar values of 112, 109, and 114. The recipe x-bar ± 4.2 gives an interval based on each sample; 95% of these intervals capture the unknown population mean μ.

The language of statistical inference uses many samples to express our confidence in the results of any one sample.
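The claim that about 95% of the intervals x-bar ± 4.2 capture μ can be checked with a small simulation. The sketch below is illustrative only: it assumes a hypothetical true mean of 110 (the slides never state μ), Normally distributed scores with σ = 15, and independent draws rather than sampling without replacement from 5000 freshmen.

```python
import random
import statistics

def coverage(true_mu=110.0, sigma=15.0, n=50, margin=4.2, trials=10_000, seed=1):
    """Fraction of simulated samples whose interval x-bar ± margin contains true_mu.

    true_mu is a made-up value used only for the simulation.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw one sample of n scores and compute its mean.
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        xbar = statistics.fmean(sample)
        # The interval x-bar ± margin captures mu exactly when |x-bar - mu| <= margin.
        if abs(xbar - true_mu) <= margin:
            hits += 1
    return hits / trials

print(f"Share of intervals capturing mu: {coverage():.3f}")
```

Under these assumptions the printed share should land close to 0.95. Any single interval either contains μ or does not; the 95% describes the method over many samples.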


Ex 3: IQ Conclusion

Our sample of 50 freshmen gave x-bar = 112. The resulting interval is 112 ± 4.2, which can be written as (107.8, 116.2). We say that we are 95% confident that the unknown mean IQ score for all Harvard freshmen is between 107.8 and 116.2.

Our confidence rests on the fact that there are only two possibilities:

1. The interval between 107.8 and 116.2 contains the true μ.

2. Our SRS was one of the few samples for which x-bar is not within 4.2 points of the true μ. Only 5% of all samples give such inaccurate results.


The interval of numbers x-bar ± 4.2 is called a 95% confidence interval. The confidence level is 95%. This is a 95% confidence interval because the method catches the unknown μ in 95% of all possible samples.

The margin of error ± 4.2 shows how accurate we believe our guess is, based on the variability of the estimate. (Estimate ± Margin of Error)


Confidence Interval & Level

We typically select a confidence level (C) of 90% or higher.

Next: 25 samples from the same population gave these 95% confidence intervals (figure not reproduced in this transcript). In the long run, 95% of all samples give an interval that contains the population mean μ.


Conditions for Constructing a Confidence Interval for μ

The data must come from an SRS from the population of interest.

The sampling distribution of x-bar is approximately Normal.

Individual observations are independent; when sampling without replacement, the population size N is at least 10 times the sample size n. (Independence).


Ex 4: Finding z (Using Table A)

To construct an 80% confidence interval, we must catch the central 80% of the Normal sampling distribution of x-bar.

We must leave out 10% in each tail of the distribution. Therefore, z is the point with area 0.1 to its right (and 0.9 to its left) under the standard Normal curve.

The closest entry in Table A is z = 1.28. Therefore, there is area 0.8 under the standard Normal curve between -1.28 and 1.28.
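The Table A lookup can also be done in software. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `critical_value` is an assumption made for illustration and is not from the slides.

```python
from statistics import NormalDist

def critical_value(confidence):
    """Return z with area (1 - confidence) / 2 to its right under the standard Normal curve."""
    tail = (1 - confidence) / 2
    # inv_cdf takes the area to the LEFT of the point, so pass 1 - tail.
    return NormalDist().inv_cdf(1 - tail)

for level in (0.80, 0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {critical_value(level):.3f}")
```

For the 80% level this gives z of about 1.28, agreeing with the closest Table A entry.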


Most Common Confidence Levels

Confidence Level    Tail Area    z
90%                 0.05         1.645
95%                 0.025        1.960
99%                 0.005        2.576


Critical Values


Confidence Interval for a Population Mean (σ Known)

When choosing an SRS of size n from a population (having unknown μ and known σ), the level C confidence interval for μ is:

x-bar ± z · σ / √n

In other words,

= Estimate ± Margin of Error

= Estimate ± (Critical Value of z) (Standard Error)
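As a rough sketch, the recipe "estimate ± (critical value) × σ / √n" translates into a few lines of Python. The function name `z_interval` is hypothetical, and the code assumes the conditions for the procedure (SRS, Normality, independence) already hold.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence):
    """Level-C interval for mu when sigma is known: xbar ± z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    margin = z * sigma / sqrt(n)                         # margin of error
    return xbar - margin, xbar + margin

# IQ example from the slides: x-bar = 112, sigma = 15, n = 50, 95% confidence.
low, high = z_interval(112, 15, 50, 0.95)
print(f"({low:.1f}, {high:.1f})")
```

Here the critical value is about 1.96 rather than the rounded value 2 behind the margin of 4.2, but the interval still rounds to (107.8, 116.2).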


Ex 5: Video Screen Tension

A manufacturer of high-resolution video terminals must control the tension on the mesh of fine wires that lies behind the surface of the viewing screen. Careful study has shown that when the process is operating properly, the standard deviation of the tension readings is σ = 43 mV. Here are the tension readings from an SRS of 20 screens from a single day’s production:

269.5 297.0 269.6 283.3 304.8 280.4 233.5 257.4 317.5 327.4

264.7 307.7 310.0 343.3 328.1 342.6 338.8 340.1 374.6 336.1


Construct and interpret a 90% confidence interval for the mean tension μ of all the screens produced on this day.

Step 1: Parameter: Identify the population of interest and the parameter you want to draw conclusions about.

The population of interest is all of the video terminals produced on the day in question. We want to estimate μ, the mean tension for all of these screens.


Step 2: Conditions: Choose the appropriate inference procedure. Verify the conditions for using it. (We must check that the three conditions are met.)

SRS? Yes.

Normality? Past experience tells us that these samples are approximately Normal. Because the sample size is too small to use the central limit theorem, we can explore the data in other ways (Boxplot & Normal Probability Plot).

Boxplot: No outliers or strong skewness; appears approximately Normal.

Normal probability plot: Looks linear; approximately Normal.


Independence? We must assume that at least (10)(20) = 200 video terminals were produced on this day.

Step 3: Calculations: If the conditions are met, carry out the inference procedure.

x-bar ± z · σ / √n = 306.3 ± 1.645 · 43 / √20 = 306.3 ± 15.8 = (290.5, 322.1)
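The Step 3 arithmetic can be reproduced directly from the 20 tension readings. The variable names below are illustrative; σ = 43 mV and the 90% level come from the problem statement.

```python
from math import sqrt
from statistics import NormalDist, fmean

# The 20 tension readings (mV) from the SRS of screens.
tension_mv = [
    269.5, 297.0, 269.6, 283.3, 304.8, 280.4, 233.5, 257.4, 317.5, 327.4,
    264.7, 307.7, 310.0, 343.3, 328.1, 342.6, 338.8, 340.1, 374.6, 336.1,
]
sigma = 43.0        # known process standard deviation (mV)
confidence = 0.90

xbar = fmean(tension_mv)                              # sample mean
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)    # critical value, about 1.645
margin = z * sigma / sqrt(len(tension_mv))            # margin of error

print(f"x-bar = {xbar:.1f}, margin = {margin:.1f}")
print(f"interval = ({xbar - margin:.1f}, {xbar + margin:.1f})")
```

This gives a sample mean of 306.3 mV and a margin of error of 15.8 mV, matching the interval (290.5, 322.1).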


Step 4: Interpretation: Interpret your results in the context of the problem.

We are 90% confident that the true mean tension in the entire batch of video terminals produced that day is between 290.5 and 322.1 mV.