14. introduction to inference

14. Introduction to inference

Objectives (PSLS Chapter 14)

Introduction to inference

REVIEW: Uncertainty and confidence

REVIEW: Confidence intervals

REVIEW: Confidence interval for a Normal population mean (σ

known)

Significance tests

Null and alternative hypotheses

The P-value

Test for a Normal population mean (σ known)

Uncertainty and confidence

If you picked different samples from a population, you would probably

get different sample means ( x̅ ) and virtually none of them would

actually equal the true population mean, .

n

Blue dot: mean value of a given random sample

n

x ̅

~95% of all intervals computed with this

method capture the parameter μ.

Based on the ~68-95-99.7% rule, we

can expect that:

Red arrow: Interval of size plus or minus 2σ/√n

When we take a random sample, we

can compute the sample mean and an

interval of size plus-or-minus 2σ/√n

about the mean.

The confidence level C

(in %) represents an area

of corresponding size C

under the sampling

distribution.

mm

The margin of error, m

A confidence interval (“CI”) can be expressed as:

a center ± a margin of error m: within x̅ ± m

an interval: within (x̅ − m) to (x ̅ + m)

Someone makes a claim about the unknown value of a population

parameter. We check whether or not this claim makes sense in light of

the “evidence” gathered (sample data).

A test of statistical significance tests a specific hypothesis using

sample data to decide on the validity of the hypothesis.

You are in charge of quality control and sample randomly 4 packs of cherry

tomatoes, each labeled 1/2 lb (227 g).

The average weight from the 4 boxes is 222 g.

Is the somewhat smaller weight simply due to chance variation?

Is it evidence that the packing machine that sorts cherry

tomatoes into packs needs revision?

Significance tests

The null hypothesis, H0, is a very specific statement about a

parameter of the population(s).

The alternative hypothesis, Ha, is a more general statement that

complements yet is mutually exclusive with the null hypothesis.

Weight of cherry tomato packs:

H0: µ = 227 g (µ is the average weight of the population of packs)

Ha: µ ≠ 227 g (µ is either larger or smaller than as stated in H0)

Null and alternative hypotheses

Distinguish between the biological hypotheses and the statistical hypotheses.

They should be linked, but are unlikely to be the same.

Biological and statistical hypotheses

State:

What is the practical question, in the context of a real-world setting?

Plan:

What specific statistical operations does this problem call for?

Solve:

Make the graphs and carry out the calculations needed for this problem.

Conclude:

Give your practical conclusion in the real-world setting.

One-sided vs. two-sided alternatives

A two-tail or two-sided alternative is symmetric:

Ha: µ [a specific value or another parameter]

A one-tail or one-sided alternative is asymmetric and specific:

Ha: µ < [a specific value or another parameter] OR

Ha: µ > [a specific value or another parameter]

The Probability-value (P-value)

The packaging process is Normal with standard deviation = 5 g.

H0: µ = 227 g versus Ha: µ ≠ 227 g

The average weight from your 4 random boxes is 222 g.

What is the probability of drawing a random sample such as yours, or even more

extreme, if H0 is true?

P-value: The probability, if H0 were true, of obtaining a sample

statistic as extreme as the one obtained or more extreme.

This probability is obtained from the sampling distribution of the statistic.

http://bcs.whfreeman.com/psls2e/#616957__662349__

What are the parameters of our working sampling distribution?

Interpreting a P-value

Could random variation alone account for the difference between H0

and observations from a random sample? The answer is yes! You

must decide if that is a reasonable working explanation.

Small P-values are strong evidence AGAINST H0 and we reject H0.

The findings are “statistically significant.”

Large P-values don’t give enough evidence against H0 and we fail to

reject H0. Beware: This is not evidence for “proving H0” is true.

0 1

--------------------------- Not significant --------------------------

Possible P-values

Range of P-values

P-values are probabilities, so they are always a number between 0 and 1.

The order of magnitude of the P-value matters more than its exact

numerical value.

The significance level α

SN: The use of alpha is an entrenched custom, which is criticized by many statisticians. The significance

level, α, is the largest P-value tolerated for rejecting H0 (how much evidence against H0 we require). This

value is determined arbitrarily, subjectively, or based on convention before conducting the test.

When P-value ≤ α, they reject H0.

When P-value > α, they fail to reject H0.

Industry standards require a significance level α of 5%.

Does the packaging machine need revision?

What if the the P-value is 4.56%.

What if the P-value is 5.01%

Test for a population mean (σ known)

n

xz

0

To test H0: µ = µ0 using a random sample of size n from a Normal

population with known standard deviation σ, we use the null sampling

distribution N(µ0, σ√n).

The P-value is the area under N(µ0, σ√n) for values of x ̅ at least as

extreme in the direction of Ha as that of our random sample.

Calculate the z-value

then use Table B or C.

Or use technology.

P-value in one-sided and two-sided tests

To calculate the P-value for a two-sided test, use the symmetry of the

normal curve. Find the P-value for a one-sided test and double it.

One-sided

(one-tailed) test

Two-sided

(two-tailed) test

H0: µ = 227 g versus Ha: µ ≠ 227 g

What is the probability of drawing a random sample such as this one or

worse, if H0 is true?

245

227222

n

xz

From Table B, area left of z is 0.0228.

P-value = 2*0.0228 = 4.56%.

2.28%2.28%

217 222 227 232 237

Sampling distribution if H0 were true

σ/√n = 2.5g

µ (H0)2

,z

x

What should we conclude about

the tomato sorting machine?

From Table C, two-sided P-value

is between 4% and 5% (use |z|).

222g 5g 4 is normally distributedx n x

Are our assumptions about the

sampling distribution questionable?

Confidence intervals to test hypothesesBecause a two-sided test is symmetric, you can easily use a confidence

interval to test a two-sided hypothesis.

α /2 α /2

In a two-sided test,

C = 1 – α.

C confidence level

α significance level

Packs of cherry tomatoes (σ= 5 g): H0: µ = 227 g versus Ha: µ ≠ 227 g

Sample average 222 g. 95% CI for µ = 222 ± 1.96*5/√4 = 222 g ± 4.9 g

227 g does not belong to the 95% CI (217.1 to 226.9 g). Thus, we reject H0.

A sample gives a 99% confidence interval of .

With 99% confidence, could samples be from populations with µ =0.86? µ =0.85?

x m 0.84 0.0101

99% C.I.

Logic of confidence interval test

x

Cannot rejectH0: = 0.85

Reject H0: = 0.86

The Bad: A confidence interval doesn’t precisely gauge the level of significance.

Reject or don’t reject H0 (Are significance tests really that much better?)

The Great: It estimates a range of likely values for the true population mean µ.

A P-value quantifies how strong the evidence is against the H0, but doesn’t provide any

information about the true population mean µ.

CI vs. test of significance

Confidence intervals quantify the

uncertainty in an estimate of a

population parameter.

Tests of significance evaluate

statistically whether a particular

statement is reasonable.

Only small % p of samples would give x¯ so far from under H0

The national average birth weight is 120 oz: N(natl =120, = 24 oz).

An SRS of 100 poor mothers gave an average birth weight x ̅ = 115 oz.

Is significantly lower than the national average?

Statistical Hypotheses: H0: poor = 120 oz (no difference with natl )

Ha: poor ≠ 120 oz

083.210024

1201150

n

xz

We calculate the z score for x ̅:

In Table B, z = –2.08 gives area beyond = 0.0188 P-value 3.7%. In Table C, we find that the P-value is between 2% and 4%.

2-sid

ed

14. introduction to inference

Documents