the practice of statisticsygwstatistics.weebly.com/.../chapter10-part1.pdf · • the 95 part of...

The Practice of Statistics, Third Edition

Chapter 10: Estimating with Confidence

Chapter 10

• Estimating with confidence

– Confidence Intervals are used to estimate

unknown population parameters.

• Three Big Ideas in this Chapter.

Big Idea #1

• When we use a sample statistic to estimate

the unknown value of a population

parameter, we want not just an estimate but

also a statement of how accurate the

estimate is.

– Confidence intervals do that for us.

Big Idea #2

• Most confidence intervals have the form:

estimate ± margin of error.

• We can see both the basic (“point”) estimate

and the margin of error that tells us how

accurate the point estimate is.

Big Idea #3

• A confidence interval cannot guarantee that it will

capture the population parameter.

• The Confidence level tells us how often we capture

the parameter in very many uses of the confidence

interval method.

• The most common confidence level is 95%. If we use

95% confidence intervals very many times, about 95% of

them will capture the parameter and about 5% will not.

Confidence Intervals

• How long does an AA battery last?

• Not practical to determine life of every battery.

• Select a sample to represent the population.

• Collect data from the sample.

• We want to infer from the sample data some

conclusion about the population.

We cannot be certain our conclusion is correct.

Statistical Inference uses probability to express the strength

of our conclusions.

This is the goal of Statistical Inference

This is the largest part of the AP Exam (30 – 40 percent).

Where Are We Headed?

• Two common types of statistical inference

– Chapter 10 – Confidence Intervals for

estimating the value of a population parameter.

– Chapter 11 – Significance Tests – Assess the

evidence for a claim about a population.

• Both report probabilities that state what

would happen if we used the inference

method many times.

Cautions!

• Formal inference requires long-run regular behavior.

• Inference is most reliable when the data are produced by a randomized design.

• If not true, your conclusions may be open to challenge.

• Inference cannot fix basic flaws in producing data.

• GIGO (garbage in, garbage out)

Assumptions for Right Now

• We are going to pretend the world is simpler than it is.

• Pretend that the population standard deviation, σ, is known,

even though we don’t know μ, the population mean.

• We will start with an overly simple technique for

estimating a population mean.

• Then we will modify our approach to make it more

useful.

The Basics –

IQ and College Admissions

• Administer an IQ test to an SRS of 50 of the 5,000

incoming college freshmen.

• x̄ = 112

• What can you say about µ for the entire class?

• Is µ for the population really 112?

– Probably not, but how close to 112 is μ?

How do we Determine?

• Question to answer: How would x̄ vary if we took many samples of 50 freshmen from the same population?

• Recall from Chapter 9:

– The mean of the sampling distribution of x̄ is the same as µ.

– Standard deviation of x-bar is σ/√n.

– Suppose for this test σ = 15

– So the standard deviation of x-bar = 15/√50 ≈ 2.1

– The CLT tells us that the x̄ of 50 scores is approximately normally distributed.
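The arithmetic behind the 2.1 figure can be checked with a short sketch. The function name `std_dev_of_sample_mean` is made up for this illustration; the values σ = 15 and n = 50 come from the slide.

```python
import math


def std_dev_of_sample_mean(sigma: float, n: int) -> float:
    """Standard deviation of x-bar for an SRS of size n: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return sigma / math.sqrt(n)


# Values from the slide: sigma = 15, n = 50 freshmen.
sd_xbar = std_dev_of_sample_mean(sigma=15, n=50)
print(round(sd_xbar, 2))  # 2.12, which the slides round to 2.1
```

Note that quadrupling the sample size only halves this standard deviation, because n sits under a square root.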

Freshman Class

Take every possible combination

of 50 students and get sampling

distribution with mean equal to

unknown μ and std. dev. = 2.1

• To estimate µ, use x̄ of the random sample.

• x̄ is an unbiased estimator, but it will rarely equal µ exactly, so the estimate has some error.

• The values of x̄ follow an approximately normal distribution with mean µ and standard deviation 2.1.

• The 95 part of the 68-95-99.7 rule says that in about 95% of all samples, x̄ will be within two standard deviations of µ, that is, within 2 × 2.1 = 4.2 points.

• So in about 95% of all samples, µ lies between x̄ − 4.2 and x̄ + 4.2.

• That means we estimate that µ is somewhere between 112 – 4.2 = 107.8 and 112 + 4.2 = 116.2.

• The interval is 107.8 to 116.2.

• This method captures the true µ in about 95% of all samples.
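As a rough sketch, the interval calculation above can be written out step by step. The helper name `two_sd_interval` is hypothetical; the inputs (x̄ = 112, σ = 15, n = 50) are the slide's values, and the multiplier 2 is the 68-95-99.7 rule's approximation.

```python
import math


def two_sd_interval(xbar: float, sigma: float, n: int) -> tuple[float, float]:
    """Return (low, high) for the interval xbar +/- 2 * sigma / sqrt(n)."""
    sd_xbar = sigma / math.sqrt(n)   # standard deviation of x-bar
    margin = 2 * sd_xbar             # "two standard deviations" from the rule
    return xbar - margin, xbar + margin


low, high = two_sd_interval(xbar=112, sigma=15, n=50)
print(f"{low:.1f} to {high:.1f}")  # 107.8 to 116.2
```

The slides round the standard deviation to 2.1 before doubling it; this sketch rounds only at the end, and the result agrees to one decimal place.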

This is the same curve as the previous

slide. In 95% of all samples, x̄ lies within 2

standard deviations of the population

mean, μ.

That is the same thing as saying the

interval x̄ ± 4.2 (2 standard deviations

in either direction) captures μ in 95%

of all samples.
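The "95" in the 68-95-99.7 rule is itself a rounded figure. A small check using the standard library's `NormalDist` shows the share of a normal distribution within 1, 2, and 3 standard deviations of its mean; the helper name `share_within` is made up for this illustration.

```python
from statistics import NormalDist


def share_within(k: float) -> float:
    """Share of a normal distribution within k standard deviations of its mean."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, f"{share_within(k):.1%}")
# 1 68.3%
# 2 95.4%
# 3 99.7%
```

So "within 2 standard deviations" corresponds to slightly more than 95% of samples, which is why the slides say "about 95%".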

Starting with the population, imagine taking all of the possible SRSs of 50 freshmen.

The recipe x̄ ± 4.2 gives us an interval based on each sample; 95% of these intervals

capture the unknown population mean μ.
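We cannot take every possible SRS, but a simulation can imitate the idea by drawing many samples. The sketch below rests on assumptions that are not in the slides: the population is modeled as normal, the "true" mean is set to a hypothetical µ = 110 so that captures can be counted, and 5,000 repetitions stand in for "all possible samples".

```python
import math
import random


def coverage_rate(mu: float, sigma: float, n: int,
                  repetitions: int, seed: int = 0) -> float:
    """Fraction of simulated samples whose x-bar +/- 2 SD interval captures mu."""
    rng = random.Random(seed)            # fixed seed so the run is repeatable
    margin = 2 * sigma / math.sqrt(n)    # about 4.2 when sigma = 15 and n = 50
    captures = 0
    for _ in range(repetitions):
        # Assumption: scores are drawn from a normal population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        if xbar - margin <= mu <= xbar + margin:
            captures += 1
    return captures / repetitions


# mu = 110 is a made-up "true" mean; in practice mu is unknown.
rate = coverage_rate(mu=110, sigma=15, n=50, repetitions=5000)
print(f"{rate:.1%} of the intervals captured mu")
```

The printed percentage should land near 95%. Any single interval either captures µ or misses it; the 95% describes the method over many samples.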

The language of statistical inference uses this fact about what

would happen in many samples to express our confidence in the

results in any one sample.

Interpreting a Confidence Interval

• x̄ = 112

• Interval = 112 ± 4.2, or (107.8, 116.2)

• We are 95% confident that the unknown

mean IQ for the freshmen is between 107.8

and 116.2.

There are TWO Possibilities

• The interval between 107.8 and 116.2 contains the true µ.

• Our SRS was one of the 5% of samples in which x̄ was not within 4.2 of the true µ.

• We cannot say which one our sample is.

• What we are saying is that: “we got these numbers using a method that gives correct results 95% of the time.”

Vocabulary

• The interval x̄ ± 4.2 is called the 95%

CONFIDENCE INTERVAL (C.I.).

• C.I. = point estimate ± margin of error

• x̄ is our point estimate, and the margin of error shows

how accurate we believe our estimate to be.

• The CONFIDENCE LEVEL is 95%

• This is our confidence level because it catches the

unknown µ in 95% of all the possible samples.

Here is the formal description

We can choose the confidence level, usually 90% or greater,

because we want to be quite sure of our conclusion.

C will stand for Confidence Level in decimal form.

A 95% confidence level corresponds to C = 0.95.
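To see how the choice of C affects the interval, the sketch below converts a confidence level in decimal form into the number of standard deviations to go out on either side of x̄. This step is not worked out in these slides, which use the rule-of-thumb value 2 for 95%; the function name `multiplier_for` is hypothetical, and the calculation assumes the sampling distribution of x̄ is normal.

```python
from statistics import NormalDist


def multiplier_for(confidence_level: float) -> float:
    """Standard deviations on each side of x-bar that enclose the central area C."""
    if not 0 < confidence_level < 1:
        raise ValueError("confidence level must be a decimal between 0 and 1")
    # The central area C leaves (1 - C) / 2 in each tail of the normal curve.
    upper_cutoff = (1 + confidence_level) / 2
    return NormalDist().inv_cdf(upper_cutoff)


for c in (0.90, 0.95, 0.99):
    print(c, f"{multiplier_for(c):.3f}")
# 0.9 1.645
# 0.95 1.960
# 0.99 2.576
```

A higher confidence level needs a larger multiplier, so the interval gets wider. The value for C = 0.95 is close to the 2 used in the 68-95-99.7 rule.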

The red dot is x̄ in each of these 25 samples. How many of the samples have an x̄ ± 2 SD interval that misses µ? Just this one.

Assignment

• Read Pages 626 – 632

• Exercises 10.1, 10.2, 10.5, 10.6

• Play with the Confidence Interval Applet at: http://digitalfirst.bfwpub.com/stats_applet/stats_applet_4_ci.html

and do activity 10B with it. (Our applet is a bit different but you

should be able to figure out how to use it.)

• Watch: www.learner.org/courses/againstallodds/unitpages/unit24.html
