test about a population proportionlzhang/teaching/3070summer2009... · test about a population...

Test about a Population Proportion

Summary for large-sample tests for population proportion p

Null hypothesis: H0 : p = p0

Test statistic value: $z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$, where $\hat{p}$ is the sample proportion.

Alternative Hypothesis    Rejection Region
Ha : p > p0               z ≥ zα (upper-tailed)
Ha : p < p0               z ≤ −zα (lower-tailed)
Ha : p ≠ p0               either z ≥ zα/2 or z ≤ −zα/2 (two-tailed)

Remark: These test procedures are valid provided that np0 ≥ 10 and n(1 − p0) ≥ 10.
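As a rough illustration of the summary above, the following Python sketch computes the z statistic and checks the rejection region. The function name `proportion_z_test`, its parameters, and the data (60 successes in 100 trials with p0 = 0.5) are hypothetical and do not come from these notes; only the Python standard library is assumed.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(x, n, p0, alpha=0.05, tail="two"):
    """Large-sample z test of H0: p = p0 (hypothetical helper).

    x is the observed number of successes in n trials.
    tail is "upper", "lower", or "two".
    Returns the z statistic and whether H0 is rejected at level alpha.
    """
    # Validity condition from the remark above.
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("need n*p0 >= 10 and n*(1 - p0) >= 10")

    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

    std = NormalDist()  # standard normal distribution
    if tail == "upper":
        reject = z >= std.inv_cdf(1 - alpha)
    elif tail == "lower":
        reject = z <= -std.inv_cdf(1 - alpha)
    else:
        reject = abs(z) >= std.inv_cdf(1 - alpha / 2)
    return z, reject


# Hypothetical data: 60 successes out of 100, testing H0: p = 0.5.
z, reject = proportion_z_test(x=60, n=100, p0=0.5, alpha=0.05, tail="two")
print(round(z, 3), reject)
```

With these made-up numbers the statistic is z = 2, which lies just beyond z0.025 ≈ 1.96, so the two-tailed test rejects H0 at level 0.05.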


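Type II Error β for large-sample tests

Alternative Hypothesis and β(p′):

Ha : p > p0

$$\beta(p') = \Phi\left[\frac{p_0 - p' + z_{\alpha}\sqrt{p_0(1-p_0)/n}}{\sqrt{p'(1-p')/n}}\right]$$

Ha : p < p0

$$\beta(p') = 1 - \Phi\left[\frac{p_0 - p' - z_{\alpha}\sqrt{p_0(1-p_0)/n}}{\sqrt{p'(1-p')/n}}\right]$$

Ha : p ≠ p0

$$\beta(p') = \Phi\left[\frac{p_0 - p' + z_{\alpha/2}\sqrt{p_0(1-p_0)/n}}{\sqrt{p'(1-p')/n}}\right] - \Phi\left[\frac{p_0 - p' - z_{\alpha/2}\sqrt{p_0(1-p_0)/n}}{\sqrt{p'(1-p')/n}}\right]$$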
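The type II error formulas for the three alternatives can be evaluated numerically. The sketch below is one possible implementation; the name `beta_proportion` and the example values (p0 = 0.5, p′ = 0.6, n = 100, α = 0.05) are assumptions chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def beta_proportion(p0, p_alt, n, alpha, tail):
    """Approximate type II error probability beta(p') for the
    large-sample test of H0: p = p0 (hypothetical helper).

    tail is "upper" (Ha: p > p0), "lower" (Ha: p < p0) or "two".
    """
    std = NormalDist()
    s0 = sqrt(p0 * (1 - p0) / n)        # sd of p-hat under H0
    s1 = sqrt(p_alt * (1 - p_alt) / n)  # sd of p-hat when p = p'

    if tail == "upper":
        z = std.inv_cdf(1 - alpha)
        return std.cdf((p0 - p_alt + z * s0) / s1)
    if tail == "lower":
        z = std.inv_cdf(1 - alpha)
        return 1 - std.cdf((p0 - p_alt - z * s0) / s1)
    z = std.inv_cdf(1 - alpha / 2)
    return (std.cdf((p0 - p_alt + z * s0) / s1)
            - std.cdf((p0 - p_alt - z * s0) / s1))


# Hypothetical example: upper-tailed test, true proportion p' = 0.6.
beta_upper = beta_proportion(0.5, 0.6, 100, 0.05, "upper")
print(round(beta_upper, 3))
```

A quick sanity check on any such implementation: when p′ = p0, each formula reduces to 1 − α, the probability of not rejecting a true H0.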

Small-Sample Tests

When the sample size n is small (n ≤ 30), we test the hypotheses based directly on the binomial distribution.

For example, if the null hypothesis is H0 : p = p0 and the alternative hypothesis is Ha : p > p0, then the rejection region is of the form X ≥ c, where X ∼ Bin(n, p).

P(type I error) = P(reject H0 | H0) = P(X ≥ c | p = p0)

= 1− P(X < c | p = p0) = 1− P(X ≤ c − 1 | p = p0)

= 1− B(c − 1; n, p0)

And

P(type II error) = P(fail to reject H0 | p = p′) = P(X < c | p = p′)

= P(X ≤ c − 1 | p = p′) = B(c − 1; n, p′)


Small-Sample Tests

Remark: in the small-sample case, it is usually not possible to find a value c for which P(type I error) is exactly the desired significance level α. Therefore we choose the largest rejection region satisfying

P(type I error) ≤ α.
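For an upper-tailed small-sample test, the cutoff c and the two error probabilities can be found by direct summation of binomial probabilities. The following sketch assumes the convention that the type I error probability may equal α; the helper names and the example (n = 10, p0 = 0.5, p′ = 0.8, α = 0.05) are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """B(k; n, p) = P(X <= k) for X ~ Bin(n, p)."""
    if k < 0:
        return 0.0
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(min(k, n) + 1))


def upper_tail_cutoff(n, p0, alpha):
    """Smallest c with P(X >= c | p = p0) <= alpha, which gives the
    largest rejection region {c, ..., n} at level alpha."""
    for c in range(n + 2):
        if 1 - binom_cdf(c - 1, n, p0) <= alpha:
            return c


# Hypothetical example: n = 10, H0: p = 0.5 versus Ha: p > 0.5.
n, p0, alpha = 10, 0.5, 0.05
c = upper_tail_cutoff(n, p0, alpha)
type1 = 1 - binom_cdf(c - 1, n, p0)  # P(type I error) = 1 - B(c-1; n, p0)
beta = binom_cdf(c - 1, n, 0.8)      # P(type II error) at p' = 0.8
print(c, round(type1, 4), round(beta, 4))
```

Here the cutoff is c = 9 with P(type I error) = 11/1024 ≈ 0.0107, well below 0.05, because the next larger region X ≥ 8 has probability 56/1024 ≈ 0.0547 under H0.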

P-value

Definition

The P-value (or observed significance level) is the smallest level of significance at which H0 would be rejected when a specified test procedure is used on a given data set. Once the P-value has been determined, the conclusion at any particular level α results from comparing the P-value to α:

1. P-value ≤ α ⇒ reject H0 at level α.

2. P-value > α ⇒ fail to reject H0 at level α.

Convention: it is customary to call the data significant when H0 is rejected and not significant otherwise.

P-value

P-value for z Tests

$$P = \begin{cases} 1 - \Phi(z) & \text{for an upper-tailed test} \\ \Phi(z) & \text{for a lower-tailed test} \\ 2[1 - \Phi(|z|)] & \text{for a two-tailed test} \end{cases}$$

where Φ(z) is the cdf of a standard normal rv.
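The three cases translate directly into code. This is a minimal sketch using the standard library's `statistics.NormalDist` for Φ; the function name `z_p_value` and the value z = 1.96 are assumptions for illustration.

```python
from statistics import NormalDist


def z_p_value(z, tail):
    """P-value of a z test; tail is "upper", "lower" or "two"."""
    phi = NormalDist().cdf  # standard normal cdf
    if tail == "upper":
        return 1 - phi(z)
    if tail == "lower":
        return phi(z)
    return 2 * (1 - phi(abs(z)))


# Hypothetical observed statistic z = 1.96.
for tail in ("upper", "lower", "two"):
    print(tail, round(z_p_value(1.96, tail), 4))
```

For z = 1.96 the two-tailed P-value is close to 0.05, so H0 would be rejected at any level α at or above roughly 0.05 and not rejected at smaller levels.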

Statistical Inference Based on Two Samples

Basic Assumptions

1. X1, X2, . . . , Xm is a random sample from a population with mean µ1 and variance $\sigma_1^2$.

2. Y1, Y2, . . . , Yn is a random sample from a population with mean µ2 and variance $\sigma_2^2$.

3. The two samples are independent of one another.

Proposition

The expected value of $\bar{X} - \bar{Y}$ is µ1 − µ2 and the standard deviation of $\bar{X} - \bar{Y}$ is

$$\sigma_{\bar{X}-\bar{Y}} = \sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}}$$

Samples from Normal Populations with Known Variances

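If the two samples X1, X2, . . . , Xm and Y1, Y2, . . . , Yn are from normal populations, then we have

$$\bar{X} - \bar{Y} \sim N\left(\mu_1 - \mu_2,\ \frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}\right)$$

Therefore,

$$Z = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}}$$

is a standard normal rv.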


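If the population variances are known to be $\sigma_1^2$ and $\sigma_2^2$, then the two-sided confidence interval for the difference of the population means µ1 − µ2 with confidence level 1 − α is

$$\left( (\bar{X} - \bar{Y}) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}},\ \ (\bar{X} - \bar{Y}) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}} \right)$$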


In the case of known population variances, the procedure for testing hypotheses about the difference of the population means µ1 − µ2 is similar to the one-sample test for the population mean:

Null hypothesis H0 : µ1 − µ2 = ∆0

Test statistic value

$$z = \frac{(\bar{x} - \bar{y}) - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}}$$

Alternative Hypothesis    Rejection Region for Level α Test
Ha : µ1 − µ2 > ∆0         z ≥ zα (upper-tailed)
Ha : µ1 − µ2 < ∆0         z ≤ −zα (lower-tailed)
Ha : µ1 − µ2 ≠ ∆0         z ≥ zα/2 or z ≤ −zα/2 (two-tailed)


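The type II error when µ1 − µ2 = ∆′ is calculated similarly to the one-sample case.

Alternative Hypothesis and Type II Error Probability β(∆′) for Level α Test:

Ha : µ1 − µ2 > ∆0

$$\beta(\Delta') = \Phi\left(z_{\alpha} + \frac{\Delta_0 - \Delta'}{\sigma}\right)$$

Ha : µ1 − µ2 < ∆0

$$\beta(\Delta') = 1 - \Phi\left(-z_{\alpha} + \frac{\Delta_0 - \Delta'}{\sigma}\right)$$

Ha : µ1 − µ2 ≠ ∆0

$$\beta(\Delta') = \Phi\left(z_{\alpha/2} + \frac{\Delta_0 - \Delta'}{\sigma}\right) - \Phi\left(-z_{\alpha/2} + \frac{\Delta_0 - \Delta'}{\sigma}\right)$$

where

$$\sigma = \sigma_{\bar{X}-\bar{Y}} = \sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}}.$$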
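The following sketch evaluates β(∆′) for the two-sample z test with known variances. It writes the two-tailed case as Φ(zα/2 + d) − Φ(−zα/2 + d) with d = (∆0 − ∆′)/σ, which is what results from requiring −zα/2 < Z < zα/2 when the true difference is ∆′. The function name and the example values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def beta_two_sample(delta0, delta_alt, sigma1, m, sigma2, n, alpha, tail):
    """Type II error probability of the two-sample z test when the
    true difference mu1 - mu2 equals delta_alt (hypothetical helper)."""
    std = NormalDist()
    sigma = sqrt(sigma1**2 / m + sigma2**2 / n)  # sd of X-bar minus Y-bar
    d = (delta0 - delta_alt) / sigma

    if tail == "upper":
        return std.cdf(std.inv_cdf(1 - alpha) + d)
    if tail == "lower":
        return 1 - std.cdf(-std.inv_cdf(1 - alpha) + d)
    z = std.inv_cdf(1 - alpha / 2)
    return std.cdf(z + d) - std.cdf(-z + d)


# Hypothetical design: sigma1 = 4, m = 20, sigma2 = 5, n = 25, alpha = 0.05,
# testing H0: mu1 - mu2 = 0 when the true difference is 1.
beta_two = beta_two_sample(0.0, 1.0, 4.0, 20, 5.0, 25, 0.05, "two")
print(round(beta_two, 3))
```

For the two-tailed test, β should be the same for true differences of +1 and −1 when ∆0 = 0, which is a convenient check on the signs.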

Large Size Samples

Example 9.1

Analysis of a random sample consisting of m = 20 specimens of cold-rolled steel to determine yield strengths resulted in a sample average strength of $\bar{x}$ = 29.8 ksi. A second random sample of n = 25 two-sided galvanized steel specimens gave a sample average strength of $\bar{y}$ = 34.7 ksi. Assuming that the two yield-strength distributions are normal with σ1 = 4.0 and σ2 = 5.0, does the data indicate that the corresponding true average yield strengths µ1 and µ2 are different?
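One way to work Example 9.1 is as a two-tailed z test of H0 : µ1 − µ2 = 0 with known variances, since the question asks whether the means differ. The sketch below uses only the numbers stated in the example; the variable names are assumptions.

```python
from math import sqrt
from statistics import NormalDist

# Summary values stated in Example 9.1.
x_bar, sigma1, m = 29.8, 4.0, 20   # cold-rolled steel
y_bar, sigma2, n = 34.7, 5.0, 25   # two-sided galvanized steel
delta0 = 0.0                       # H0: mu1 - mu2 = 0

se = sqrt(sigma1**2 / m + sigma2**2 / n)   # sd of X-bar minus Y-bar
z = (x_bar - y_bar - delta0) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-tailed P-value

print(f"z = {z:.2f}, two-tailed P-value = {p_value:.5f}")
```

The statistic comes out near z = −3.65, which is beyond −z0.005 ≈ −2.58, so under these assumptions H0 would be rejected even at level 0.01.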

Large Size Samples

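When the sample sizes are large, both $\bar{X}$ and $\bar{Y}$ are approximately normally distributed, and

$$Z = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{m} + \dfrac{S_2^2}{n}}}$$

is approximately a standard normal rv, where $S_1^2$ and $S_2^2$ are the sample variances.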

Large Size Samples

In case both m and n are large (m, n > 30), the procedures for constructing confidence intervals and testing hypotheses for the difference of two population means are similar to the one-sample case.

The two-sided confidence interval for the difference of the population means µ1 − µ2 with confidence level 1 − α is

$$\left( (\bar{X} - \bar{Y}) - z_{\alpha/2}\sqrt{\frac{S_1^2}{m} + \frac{S_2^2}{n}},\ \ (\bar{X} - \bar{Y}) + z_{\alpha/2}\sqrt{\frac{S_1^2}{m} + \frac{S_2^2}{n}} \right)$$

Large Size Samples

In case both m and n are large (m, n > 30), the procedure for testing hypotheses about the difference of the population means µ1 − µ2 is:

Null hypothesis H0 : µ1 − µ2 = ∆0

Test statistic value

$$z = \frac{(\bar{x} - \bar{y}) - \Delta_0}{\sqrt{\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}}}$$

Alternative Hypothesis    Rejection Region for Level α Test
Ha : µ1 − µ2 > ∆0         z ≥ zα (upper-tailed)
Ha : µ1 − µ2 < ∆0         z ≤ −zα (lower-tailed)
Ha : µ1 − µ2 ≠ ∆0         z ≥ zα/2 or z ≤ −zα/2 (two-tailed)


Example Problem 7

Are male college students more easily bored than their female counterparts? This question was examined in the article “Boredom in Young Adults – Gender and Cultural Comparisons” (J. of Cross-Cultural Psych., 1991: 209-223). The authors administered a scale called the Boredom Proneness Scale to 97 male and 148 female U.S. college students. Does the accompanying data support the research hypothesis that the mean Boredom Proneness Rating is higher for men than for women?

Gender    Sample Size    Sample Mean    Sample SD
Male      97             10.40          4.83
Female    148            9.26           4.68
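Since both samples are large, this problem can be approached with the large-sample z test, taking µ1 as the male mean and µ2 as the female mean and testing H0 : µ1 − µ2 = 0 against Ha : µ1 − µ2 > 0. The sketch below uses the tabulated summary values; the variable names and the choice of labels are assumptions.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics from the table (males first, females second).
x_bar, s1, m = 10.40, 4.83, 97
y_bar, s2, n = 9.26, 4.68, 148
delta0 = 0.0  # H0: mu1 - mu2 = 0, Ha: mu1 - mu2 > 0

se = sqrt(s1**2 / m + s2**2 / n)      # estimated sd of X-bar minus Y-bar
z = (x_bar - y_bar - delta0) / se
p_value = 1 - NormalDist().cdf(z)     # upper-tailed P-value

print(f"z = {z:.2f}, upper-tailed P-value = {p_value:.4f}")
```

The statistic is about z = 1.83 with a P-value of roughly 0.034, so the data would support the research hypothesis at level 0.05 but not at level 0.01.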

Two-Sample t Test and C.I.

Assumptions: Both populations are normal, so that X1, X2, . . . , Xm is a random sample from a normal distribution and so is Y1, Y2, . . . , Yn (with the X's and Y's independent of one another).

The plausibility of these assumptions can be judged by constructing a normal probability plot of the xi's and another of the yi's.



Theorem

When the population distributions are both normal, the standardized variable

$$T = \frac{(\bar{X} - \bar{Y}) - (\mu_X - \mu_Y)}{\sqrt{\dfrac{S_X^2}{m} + \dfrac{S_Y^2}{n}}}$$

has approximately a t distribution with df ν estimated from the data by

$$\nu = \frac{\left(\dfrac{s_X^2}{m} + \dfrac{s_Y^2}{n}\right)^2}{\dfrac{(s_X^2/m)^2}{m-1} + \dfrac{(s_Y^2/n)^2}{n-1}}$$

(Round ν down to the nearest integer.)


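Remark: the df of the r.v. T can also be estimated from the sample data by

$$\nu = \frac{\left[(se_X)^2 + (se_Y)^2\right]^2}{\dfrac{(se_X)^4}{m-1} + \dfrac{(se_Y)^4}{n-1}}$$

where

$$se_X = \frac{s_X}{\sqrt{m}}, \qquad se_Y = \frac{s_Y}{\sqrt{n}}$$

(Round ν down to the nearest integer.)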
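The variance form and the standard-error form of ν are algebraically the same quantity, since (se)² = s²/(sample size). A small sketch can confirm this numerically; the function names and the summary statistics used (sX = 2.0, m = 10, sY = 3.0, n = 15) are made up for illustration.

```python
from math import floor, sqrt


def welch_df(s_x, m, s_y, n):
    """Estimated df nu written in terms of the sample variances."""
    vx, vy = s_x**2 / m, s_y**2 / n
    return (vx + vy)**2 / (vx**2 / (m - 1) + vy**2 / (n - 1))


def welch_df_from_se(s_x, m, s_y, n):
    """The same nu written in terms of the standard errors."""
    se_x, se_y = s_x / sqrt(m), s_y / sqrt(n)
    return (se_x**2 + se_y**2)**2 / (se_x**4 / (m - 1) + se_y**4 / (n - 1))


# Hypothetical summary statistics.
nu = welch_df(2.0, 10, 3.0, 15)
nu_se = welch_df_from_se(2.0, 10, 3.0, 15)
df = floor(nu)  # round nu down to the nearest integer

print(round(nu, 2), round(nu_se, 2), df)
```

For these inputs ν is just under 23, so the df used for table lookup would be 22.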


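The two-sample t confidence interval for µX − µY with confidence level 100(1 − α)% is given by

$$\left( (\bar{x} - \bar{y}) - t_{\alpha/2,\nu}\sqrt{\frac{s_X^2}{m} + \frac{s_Y^2}{n}},\ \ (\bar{x} - \bar{y}) + t_{\alpha/2,\nu}\sqrt{\frac{s_X^2}{m} + \frac{s_Y^2}{n}} \right)$$

A one-sided confidence bound can be obtained by replacing $t_{\alpha/2,\nu}$ with $t_{\alpha,\nu}$.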


The two-sample t test for testing H0 : µX − µY = ∆0 is as follows:

Test statistic value:

$$t = \frac{(\bar{x} - \bar{y}) - \Delta_0}{\sqrt{\dfrac{s_X^2}{m} + \dfrac{s_Y^2}{n}}}$$

Alternative Hypothesis    Rejection Region for Approximate Level α Test
Ha : µX − µY > ∆0         t ≥ tα,ν (upper-tailed)
Ha : µX − µY < ∆0         t ≤ −tα,ν (lower-tailed)
Ha : µX − µY ≠ ∆0         t ≥ tα/2,ν or t ≤ −tα/2,ν (two-tailed)


Example: (Problem 23)

Fusible interlinings are being used with increasing frequency to support outer fabrics and improve the shape and drape of various pieces of clothing. The article “Compatibility of Outer and Fusible Interlining Fabrics in Tailored Garments” (Textile Res. J. 1997: 137-142) gave the accompanying data on extensibility (%) at 100 gm/cm for both high-quality fabric (H) and poor-quality fabric (P) specimens.

H    1.2 0.9 0.7 1.0 1.7 1.7 1.1 0.9 1.7
     1.9 1.3 2.1 1.6 1.8 1.4 1.3 1.9 1.6
     0.8 2.0 1.7 1.6 2.3 2.0

P    1.6 1.5 1.1 2.1 1.5 1.3 1.0 2.6

The sample mean and standard deviation for the high-quality sample are 1.508 and 0.444, respectively, and those for the poor-quality sample are 1.588 and 0.530.

Construct a 95% C.I. for the difference of the true average extensibility between high-quality fabric and poor-quality fabric. Decide whether the true average extensibility differs for the two types.


[Figure: Quantile-Quantile plot for sample H]

[Figure: Quantile-Quantile plot for sample P]


Degrees of freedom calculated from the samples:

$$\nu = \frac{\left(0.444^2/24 + 0.530^2/8\right)^2}{\dfrac{(0.444^2/24)^2}{24-1} + \dfrac{(0.530^2/8)^2}{8-1}} = 10.5 \approx 10$$

α = 0.05, $t_{\alpha/2,\nu} = t_{0.025,10} = 2.228$.

Therefore the 95% C.I. for the difference of the true average extensibility for the two types of fabric is given by

$$(1.508 - 1.588) \pm 2.228 \cdot \sqrt{\frac{0.444^2}{24} + \frac{0.530^2}{8}}$$

which is (−0.544, 0.384).
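The interval can be reproduced from the summary statistics given in the example. The sketch below takes the critical value t0.025,10 = 2.228 as a table value quoted in these notes rather than computing it; the variable names are assumptions.

```python
from math import floor, sqrt

# Summary statistics stated in the example.
x_bar, s_x, m = 1.508, 0.444, 24   # high-quality fabric (H)
y_bar, s_y, n = 1.588, 0.530, 8    # poor-quality fabric (P)

vx, vy = s_x**2 / m, s_y**2 / n
nu = (vx + vy)**2 / (vx**2 / (m - 1) + vy**2 / (n - 1))
df = floor(nu)                      # round nu down

t_crit = 2.228                      # t_{0.025,10}, taken from a t table
se = sqrt(vx + vy)                  # estimated sd of X-bar minus Y-bar
lower = (x_bar - y_bar) - t_crit * se
upper = (x_bar - y_bar) + t_crit * se

print(round(nu, 1), df)
print(round(lower, 3), round(upper, 3))
```

Because the resulting interval contains 0, the data do not give evidence at the 5% level that the true average extensibility differs between the two fabric types.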


1. Let µ1 be the true average extensibility for the high-quality fabric and µ2 for the poor-quality fabric.

2. Hypotheses: H0 : µ1 − µ2 = 0 vs. Ha : µ1 − µ2 < 0

3. Test statistic:

$$T = \frac{(\bar{X} - \bar{Y}) - 0}{\sqrt{\dfrac{S_1^2}{m} + \dfrac{S_2^2}{n}}},$$

and the value of the test statistic is

$$t = \frac{1.508 - 1.588}{\sqrt{\dfrac{0.444^2}{24} + \dfrac{0.530^2}{8}}} = \frac{-0.080}{0.208} \approx -0.384,$$

and df is 10.

4. The P-value for a lower-tailed t test in this case is approximately 0.35.

5. Using significance level 0.01, we will not reject H0.
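Recomputing the test statistic from the stated summary values (means 1.508 and 1.588, standard deviations 0.444 and 0.530, sample sizes 24 and 8) gives a value near −0.38, consistent with the 95% interval (−0.544, 0.384) containing 0. The sketch below also approximates the lower-tailed P-value by integrating the t density with Simpson's rule, since the Python standard library has no t distribution; `t_pdf` and `t_cdf` are hypothetical helpers written for this illustration.

```python
from math import gamma, pi, sqrt


def t_pdf(t, df):
    """Density of the t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """P(T <= t), using Simpson's rule on [0, |t|] and symmetry.
    steps must be even."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area


# Summary statistics stated in the example; df = 10 as computed there.
x_bar, s_x, m = 1.508, 0.444, 24
y_bar, s_y, n = 1.588, 0.530, 8
df = 10

t_stat = (x_bar - y_bar) / sqrt(s_x**2 / m + s_y**2 / n)
p_value = t_cdf(t_stat, df)  # lower-tailed P-value

print(f"t = {t_stat:.3f}, lower-tailed P-value = {p_value:.3f}")
```

The P-value of about 0.35 is far above 0.01 (and above 0.05), so H0 is not rejected.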
