27 October 2003

6.1 Estimating with Confidence

Sampling

We have a known population. We ask “what would happen if I drew lots and lots of random samples from this population?”

Inference

We have a known sample. We ask “what kind of population might this sample have been drawn from?”

Looking ahead

Chapter 6 Estimating mu from sample data

Chapter 7 Estimating mu and sigma from sample data

Chapter 8 Estimating a population proportion from sample data

The Central Limit Theorem

If you draw simple random samples of size n from a population with mean mu and variance sigma^2, then

the expected mean of x-bar is mu

the expected variance of x-bar is sigma^2 / n

the expected histogram of x-bar is approximately normal
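The three claims above can be checked by simulation. The sketch below is only an illustration: the exponential population, the sample size of 50, and the helper name `simulate_sample_means` are assumptions chosen for the demonstration, not part of the lecture.

```python
import random
import statistics

def simulate_sample_means(population_draw, n, num_samples, seed=0):
    """Draw num_samples simple random samples of size n and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(population_draw(rng) for _ in range(n))
        for _ in range(num_samples)
    ]

# Hypothetical, deliberately non-normal population: exponential with mean 2.
# For this population mu = 2 and sigma^2 = 4.
mu, sigma_sq, n = 2.0, 4.0, 50
means = simulate_sample_means(lambda rng: rng.expovariate(1 / mu), n, 5000)

# The average of the sample means should be close to mu,
# and their variance should be close to sigma^2 / n.
print("mean of x-bar:    ", round(statistics.fmean(means), 3), "theory:", mu)
print("variance of x-bar:", round(statistics.pvariance(means), 4), "theory:", sigma_sq / n)
```

A histogram of `means` would look roughly bell-shaped even though the population itself is strongly skewed.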

Estimating mu from sample data

estimated mu = sample mean

Why? Because the Central Limit Theorem tells us that, if we drew lots and lots of samples, the sample means would average out to mu.

(The sample mean is an unbiased estimator of mu.)


Is this true?

mu = sample mean

Why not? Because the Central Limit Theorem tells us that, if we drew lots and lots of samples, the sample means would vary. Some would be bigger than mu and others would be smaller than mu.


What about this?

mu = somewhere in the neighborhood of the sample mean

But how do we define neighborhood?

Example 6.1

We have a sample of 500 high-school seniors, selected at random from the population of all high-school seniors in California. For the 500 kids in the sample, their average score on the math section of the SAT is 461.

Known: sample mean is 461

Unknown: population mean

Assumed: population sigma is 100

The Central Limit Theorem

If you draw simple random samples of size 500 from a population with mean mu and standard deviation of 100, then

the expected mean of x-bar is mu

the expected standard deviation of x-bar is about 4.5

the expected histogram of x-bar is approximately normal
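The figure of about 4.5 comes from dividing the population standard deviation by the square root of the sample size. A minimal sketch of that arithmetic, using the numbers from Example 6.1 (the variable names are illustrative):

```python
import math

sigma = 100   # assumed population standard deviation (Example 6.1)
n = 500       # sample size (Example 6.1)

# Standard deviation of x-bar = sigma / sqrt(n)
sd_of_xbar = sigma / math.sqrt(n)
print(round(sd_of_xbar, 2))  # 4.47, which rounds to "about 4.5"
```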

Table A tells us...

...about 68% of sample means should fall within 4.5 points of mu

...about 95% of sample means should fall within 9 points of mu

...about 99.7% of sample means should fall within 13.5 points of mu
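These percentages are the areas under the normal curve within 1, 2, and 3 standard deviations of the mean. As a rough check that does not need the printed table, the sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `prob_within` is an illustrative choice.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

def prob_within(k):
    """Probability that a normal value falls within k standard deviations of its mean."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(k, "standard deviations:", round(100 * prob_within(k), 2), "%")
```

With a standard deviation of about 4.5 for x-bar, 1, 2, and 3 standard deviations correspond to 4.5, 9, and 13.5 points.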

[Figure: distribution of the sample means, with mu-9, mu, and mu+9 marked on the horizontal axis]

About 95% of sample means should fall within 9 points of mu

[Figure: distribution of the sample means if mu is 452; horizontal axis "Sample Means", from 435 to 485]

[Figure: distribution of the sample means if mu is 470; horizontal axis "Sample Means", from 435 to 485]

The 95% Confidence Interval

If mu is any number less than 452, then our sample mean would be surprisingly large.

If mu is any number greater than 470, then our sample mean would be surprisingly small.

Therefore, the 95% confidence interval for mu is the range from 452 to 470.

If mu is inside this range, then our sample is not unusual (according to the 95% rule).
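The endpoints 452 and 470 are the sample mean minus and plus 2 standard deviations of x-bar. A short sketch of that calculation with the Example 6.1 numbers (variable names are illustrative):

```python
import math

sample_mean = 461   # from Example 6.1
sigma = 100         # assumed population standard deviation
n = 500             # sample size

sd_of_xbar = sigma / math.sqrt(n)   # about 4.5
margin = 2 * sd_of_xbar             # 95% rule: within 2 standard deviations

lower, upper = sample_mean - margin, sample_mean + margin
print(round(lower), "to", round(upper))  # 452 to 470
```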

Other confidence intervals

If we suppose that the sample mean is within 1.645 standard deviations of mu, then we get a 90% confidence interval.

If we suppose that the sample mean is within 2.576 standard deviations of mu, then we get a 99% confidence interval.
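The multipliers 1.645 and 2.576 are critical values of the standard normal distribution. The sketch below looks them up with `statistics.NormalDist.inv_cdf` and applies them to Example 6.1; the function name `z_interval` is an assumption for illustration. Note that the exact 95% multiplier is about 1.96 rather than 2, so the 95% interval printed here is slightly narrower than 452 to 470.

```python
import math
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, confidence):
    """Confidence interval for mu when the population sigma is assumed known."""
    # Critical value z*: the central `confidence` area of the standard
    # normal curve lies between -z* and +z*.
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z_star * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

for confidence in (0.90, 0.95, 0.99):
    low, high = z_interval(461, 100, 500, confidence)
    print(f"{confidence:.0%}: {low:.1f} to {high:.1f}")
```

Higher confidence requires a larger multiplier, so the 99% interval is the widest of the three.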

Effect of sample size on the confidence interval

As n gets larger, the expected variability of the sample means gets smaller.

Larger sample sizes produce narrower confidence intervals (other things equal).

Smaller sample sizes produce wider confidence intervals (other things equal).
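To see the effect of n concretely, the sketch below computes the half-width of the 95% interval (2 standard deviations of x-bar) for a few sample sizes. The sample sizes 125 and 2000 are illustrative choices, and sigma = 100 is carried over from Example 6.1.

```python
import math

sigma = 100  # assumed population standard deviation, as in Example 6.1

def margin_of_error(n, z_star=2):
    """Half-width of the interval; z_star=2 follows the 95% rule."""
    return z_star * sigma / math.sqrt(n)

for n in (125, 500, 2000):
    print(n, round(margin_of_error(n), 1))
```

Because n appears under a square root, quadrupling the sample size only halves the width of the interval.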

Some cautions

The data must be a simple random sample from the population.

The sample mean, and therefore the confidence interval, may be too heavily influenced by one or more outliers.

If the sample size is small and the population is not approximately normal, then the CLT doesn’t promise an approximately normal distribution for the sample means.
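The caution about outliers is easy to illustrate. The scores below are made up for this sketch (they are not data from Example 6.1); the point is only that a single extreme value can move the sample mean, and with it the center of the confidence interval.

```python
import statistics

# Made-up scores for illustration only (not data from Example 6.1).
scores = [440, 450, 455, 460, 465, 470, 480]
with_outlier = scores + [800]   # one unusually high value added

print(statistics.fmean(scores))        # 460.0
print(statistics.fmean(with_outlier))  # 502.5
```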

One more caution

It is tempting to say: “There is a 95% chance that mu lies in the confidence interval.”

In Example 6.1:

P(452 < mu < 470) = .95 ×

This is not correct. mu is a fixed number, so it either lies between 452 and 470 or it does not. The 95% describes the method: about 95% of the intervals built this way from random samples will contain mu.
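One way to see what the 95% refers to is to fix a value of mu, draw many samples, build an interval from each one, and count how often the interval contains mu. The sketch below does this under stated assumptions: a normal population, a hypothetical true mu of 460, and the illustrative function name `coverage`.

```python
import math
import random
import statistics

def coverage(mu, sigma, n, num_samples, z_star=2, seed=1):
    """Fraction of intervals x-bar +/- z_star*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)
    margin = z_star * sigma / math.sqrt(n)
    hits = 0
    for _ in range(num_samples):
        xbar = statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - margin < mu < xbar + margin:
            hits += 1
    return hits / num_samples

# Hypothetical "true" mu chosen for the simulation; in practice mu is unknown.
rate = coverage(mu=460, sigma=100, n=500, num_samples=2000)
print(rate)  # close to 0.95
```

Here mu never changes; it is the interval that varies from sample to sample, and roughly 95% of those intervals capture mu.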
